Similar literature
1.
Traditional machine learning algorithms do not generalize satisfactorily on noisy, imbalanced, and small-sample training sets. In this work, a novel virtual sample generation (VSG) method based on the Gaussian distribution is proposed. First, the method determines the mean and the standard error of the Gaussian distribution. Then, virtual samples are generated from this distribution. Finally, a new training set is constructed by adding the virtual samples to the original training set. The work shows that training on the new training set is equivalent to a form of regularization for small-sample problems, or to cost-sensitive learning for imbalanced-sample problems. Experiments show that, given a suitable number of virtual sample replicates, classifiers trained on the new training sets can generalize better than those trained on the original training sets.
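The three steps in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's procedure: it assumes one independent Gaussian per feature and per class, fitted with the sample mean and sample standard deviation of the original data, and the names `generate_virtual_samples` and `n_virtual` are hypothetical.

```python
import random
import statistics


def generate_virtual_samples(samples, labels, n_virtual, seed=0):
    """Return (augmented_samples, augmented_labels).

    Assumption (not from the abstract): each class is modelled with one
    independent Gaussian per feature, fitted to that class's samples.
    """
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)

    new_samples, new_labels = list(samples), list(labels)
    for y, rows in by_class.items():
        columns = list(zip(*rows))
        # Step 1: determine the mean and spread of each feature.
        means = [statistics.fmean(col) for col in columns]
        stds = [statistics.pstdev(col) for col in columns]
        # Step 2: draw virtual samples from those Gaussians.
        for _ in range(n_virtual):
            new_samples.append([rng.gauss(m, s) for m, s in zip(means, stds)])
            new_labels.append(y)
    # Step 3: the new training set is the original plus the virtual samples.
    return new_samples, new_labels
```

Generating the same number of virtual samples for every class is a simplification; for an imbalanced set one would presumably generate more for the minority class.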

2.
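In practical applications, the class of a rolling-bearing fault signal is often unknown, and collecting enough test data takes a great deal of time and may even damage the machinery. Exploiting the fast training and strong generalization ability of the extreme learning machine (ELM), a rolling-bearing fault diagnosis method based on a semi-supervised ELM is proposed. Given only a small amount of labeled bearing fault data, the method trains on labeled historical data together with some newly collected unlabeled data to obtain an optimal diagnosis model. First, the original one-dimensional signal is mapped into a high-dimensional phase space by phase-space reconstruction, and an initial bearing feature set is extracted in that space; the feature set is then fed to the semi-supervised ELM for training and testing. Experimental results show that the diagnosis model based on the semi-supervised algorithm is simple, retains good generalization ability even with few neurons, and has practical value.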

3.
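Machine learning, an important approach to realizing artificial intelligence, is widely applied in data mining, computer vision, natural language processing, and other fields. As its applications spread, its security and privacy problems are receiving growing attention. First, the adversary model is described in terms of the general machine learning process. Then, common security threats to machine learning, such as poisoning attacks, adversarial attacks, and query attacks, are summarized together with the corresponding defenses, such as regularization, adversarial training, and defensive distillation. Next, common privacy threats, such as training data theft, inversion attacks, and membership inference attacks, are summarized, along with the corresponding privacy-preserving techniques, such as homomorphic encryption and differential privacy. Finally, open problems and future directions are presented.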

4.
Yoram Gat. Machine Learning, 2001, 42(3): 233-239
A classifier is said to have good generalization ability if it performs on test data almost as well as it does on the training data. The main result of this paper provides a sufficient condition for a learning algorithm to have good finite-sample generalization ability. This criterion applies in some cases where the set of all possible classifiers has infinite VC dimension. The result is applied to prove the good generalization ability of support vector machines by exploiting a sparse-representation property.

5.
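The online sequential extreme learning machine (OS-ELM) has an unstable hidden-layer output, easily produces singular matrices, and ignores the timeliness of training samples during online sequential updating. To address these problems, a regularized adaptive forgetting factor algorithm based on kernel function mapping (FFOS-RKELM) is proposed. The algorithm replaces the hidden layer with a kernel function, which yields stable outputs. In the initial phase, regularization is added, and the generalization ability of the model is improved by constructing a nonsingular matrix; in the sequential update phase, the forgetting factor is updated automatically from newly arriving data. FFOS-RKELM is applied to chaotic time series prediction and inlet nitrogen oxide time series prediction; compared with OS-ELM, FFOS-RELM, and OS-RKELM, it effectively improves prediction accuracy and generalization ability.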

6.
Recently, meta-learning frameworks have achieved great success in few-shot learning. Regularization is a powerful technique that is widely used to improve machine learning algorithms. However, little research has focused on designing appropriate meta-regularizations to further improve the generalization of meta-learning models in few-shot learning. In this paper, we propose a novel meta-contrastive loss that can be regarded as a regularization to fill this gap. Our method is motivated by the observation that the limited data in few-shot learning are only a small sample from the whole data distribution, and that different samples can yield differently biased representations of that distribution. Thus, the models trained on the few training data (support set) and on the test data (query set) might be misaligned in the model space, so that the model learned on the support set cannot generalize well to the query data. The proposed meta-contrastive loss is designed to align the models of the support and query sets to overcome this problem, thereby improving the performance of the meta-learning model in few-shot learning. Extensive experiments demonstrate that our method improves the performance of different gradient-based meta-learning models on various learning problems, e.g., few-shot regression and classification.

7.
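Research on Virtual Sample Generation Techniques   Total citations: 1, self-citations: 0, citations by others: 1
Virtual sample generation techniques study how to construct auxiliary samples by combining prior knowledge of the domain under study with the existing training samples, so as to enlarge the training set and improve the generalization ability of the learner. As a way of introducing prior knowledge into machine learning, virtual sample generation has become one of the main means of improving generalization in small-sample learning problems and has been widely studied at home and abroad. The paper first introduces the concept of virtual samples, gives two metrics for evaluating the performance of virtual sample generation techniques, and discusses the effect of these techniques on the learner's generalization ability. It then divides the techniques into three categories according to their nature, discusses several typical techniques in each category, and points out some shortcomings of existing techniques. Finally, it summarizes the field and offers the authors' views on its further development.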

8.
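Soft Sensor Modeling Based on SVM   Total citations: 30, self-citations: 2, citations by others: 30
The support vector machine (SVM) is a new type of learning machine based on statistical learning theory. This paper proposes building soft sensor models with SVMs. Theoretical analysis and simulation studies show that the method learns quickly, tracks well, generalizes strongly, depends little on the samples, and generalizes better than soft sensor modeling based on RBF neural networks.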

9.
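Anomaly Detection Based on Robust Support Vector Machines   Total citations: 1, self-citations: 0, citations by others: 1
Machine learning methods used for anomaly detection, such as neural networks and support vector machines, are very sensitive to noise in the training samples, which degrades generalization ability and classification accuracy. To solve this problem, the paper proposes a new method based on robust support vector machines (RSVM). RSVM is first compared with the standard SVM, and the 1998 DARPA BSM data are then used for evaluation. Experiments show that the method performs well on several measures, including intrusion detection accuracy, false alarm rate, generalization ability under noise, and running time.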

10.
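Engine Gas-Path Fault Diagnosis Method Based on Convolutional Neural Networks   Total citations: 1, self-citations: 0, citations by others: 1
Deep learning is a new machine learning approach based on feature representation. A deep learning model contains multiple hidden layers and can obtain the feature information in those layers by learning automatically from the input data. Compared with traditional diagnosis methods, deep learning can extract richer features from raw information; it has therefore become a new direction in machine-learning-based fault diagnosis research and offers new ideas for diagnosing faults in complex systems such as engine gas paths. Combining the characteristics of engine gas-path test data with the strengths of deep learning, a fault diagnosis method based on convolutional neural networks is proposed, covering preprocessing, model training, and optimization, and a fault diagnosis and prediction algorithm platform for complex systems is implemented. Validation on simulated gas-path test data from an engine shows that the proposed method is feasible and effective, makes full use of the advantages of deep learning, and identifies the health state of the engine gas path more accurately.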

11.
Metric-Based Methods for Adaptive Model Selection and Regularization   Total citations: 3, self-citations: 0, citations by others: 3
We present a general approach to model selection and regularization that exploits unlabeled data to adaptively control hypothesis complexity in supervised learning tasks. The idea is to impose a metric structure on hypotheses by determining the discrepancy between their predictions across the distribution of unlabeled data. We show how this metric can be used to detect untrustworthy training error estimates, and devise novel model selection strategies that exhibit theoretical guarantees against over-fitting (while still avoiding under-fitting). We then extend the approach to derive a general training criterion for supervised learning, yielding an adaptive regularization method that uses unlabeled data to automatically set regularization parameters. This new criterion adjusts its regularization level to the specific set of training data received, and performs well on a variety of regression and conditional density estimation tasks. The only proviso for these methods is that sufficient unlabeled training data be available.
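The central idea, measuring how far apart two hypotheses are by comparing their predictions on unlabeled inputs, can be illustrated with a small sketch. The abstract does not give formulas, so everything here is an assumption: a regression setting, root-mean-square discrepancy as the metric, and a triangle-inequality style test as one plausible way to flag untrustworthy training errors. All function names are hypothetical.

```python
import math


def rms_distance(f, g, xs):
    """Root-mean-square discrepancy between the predictions of f and g on xs."""
    return math.sqrt(sum((f(x) - g(x)) ** 2 for x in xs) / len(xs))


def rms_training_error(f, xs, ys):
    """Root-mean-square error of f on the labeled pairs (xs, ys)."""
    return math.sqrt(sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def training_errors_look_trustworthy(f, g, labeled_x, labeled_y, unlabeled_x):
    """Assumed check: a true metric satisfies d(f, g) <= d(f, target) + d(g, target).

    If the distance between f and g on unlabeled data exceeds the sum of
    their training errors, at least one training error is too optimistic.
    """
    gap = rms_distance(f, g, unlabeled_x)
    bound = (rms_training_error(f, labeled_x, labeled_y)
             + rms_training_error(g, labeled_x, labeled_y))
    return gap <= bound
```

For example, a straight line and a wiggly polynomial that both pass through every labeled point have zero training error, yet they can disagree strongly between the labeled points; the check then returns `False`, signalling that at least one of the zero training errors should not be trusted.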

12.
Online learning algorithms are preferred in many applications because they can learn from sequentially arriving data. One effective algorithm recently proposed for training single hidden-layer feedforward neural networks (SLFNs) is the online sequential extreme learning machine (OS-ELM), which can learn data one-by-one or chunk-by-chunk with fixed or varying chunk sizes. It is based on the ideas of the extreme learning machine (ELM), in which the input weights and hidden-layer biases are chosen randomly and the output weights are then determined by a pseudo-inverse operation. The learning speed of this algorithm is extremely high. However, it does not generalize well on noisy data, and its parameters are difficult to initialize in a way that avoids singular and ill-posed problems. In this paper, we propose an improvement of OS-ELM based on a bi-objective optimization approach that minimizes the empirical error while keeping the norm of the network weight vector small. Singular and ill-posed problems are overcome by using Tikhonov regularization. The approach can also learn data one-by-one or chunk-by-chunk. Experimental results on benchmark datasets show that the proposed approach generalizes better.
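For reference, the batch form of a Tikhonov-regularized least-squares problem of the kind described above can be written as follows. The symbols are assumptions, since the abstract does not define them: $H$ is the hidden-layer output matrix, $T$ the target matrix, $\beta$ the output weights, and $\lambda > 0$ the regularization parameter. The sequential update used by the paper is not reproduced here.

```latex
\min_{\beta}\; \lVert H\beta - T \rVert_2^2 + \lambda \lVert \beta \rVert_2^2
\quad\Longrightarrow\quad
\beta = \left(H^{\top}H + \lambda I\right)^{-1} H^{\top} T
```

Because $H^{\top}H + \lambda I$ is positive definite for any $\lambda > 0$, the inverse always exists, which is how this form of regularization sidesteps the singular case that a plain pseudo-inverse solution can run into.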

13.
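The standard support vector machine must store, compute, and process the kernel matrix, so its learning efficiency is low and it cannot handle large-scale data mining effectively. To address this, an SVM method based on neighbor edge detection (ED_SVM) is proposed. The method introduces neighbor edge detection into SVM training: the data are first partitioned and the mixed-class samples are selected; edge detection then extracts from them the samples that lie near the approximately optimal classification boundary and carry much of the important support vector information. These samples form a new, small training set, which compresses the training set while preserving the distribution of the original support vector information. A standard SVM is then trained on the new training set, improving learning efficiency while achieving excellent generalization performance. Experimental results show that the proposed ED_SVM method achieves both high test accuracy and high learning efficiency.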

14.
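A Weighted Support Vector Machine Algorithm with Automatic Parameter Selection   Total citations: 7, self-citations: 0, citations by others: 7
When the numbers of samples in different classes are imbalanced, the classification errors of the C-SVM algorithm during training tend to fall on the class with fewer samples. In addition, duplicate samples in the sample set are recomputed as new samples, which increases training time. The causes of these two problems are analyzed, and a weighted support vector machine algorithm is proposed that compensates for the adverse effect of class differences and speeds up decisions on duplicate samples. To improve generalization performance, a genetic algorithm is introduced during model training to automatically select two parameters, the penalty factor and the kernel width. Experimental results show that the algorithm effectively handles class imbalance and duplicate samples, and that the trained model generalizes well.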
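The two ideas named in this abstract, compensating for class imbalance and avoiding repeated work on duplicate samples, can be illustrated with a small preprocessing sketch. The abstract gives no formulas, so the weighting scheme below is an assumption: class weights inversely proportional to class frequency, and a sample that occurs k times kept once with its weight multiplied by k. The function name is hypothetical, and the genetic-algorithm parameter search is not shown.

```python
from collections import Counter


def weighted_training_set(samples, labels):
    """Collapse duplicate samples and attach a per-sample penalty weight.

    Assumptions (not from the abstract):
    class weight = n_total / (n_classes * n_class), and a sample seen
    k times is kept once with its class weight multiplied by k.
    """
    class_counts = Counter(labels)
    n_total, n_classes = len(labels), len(class_counts)
    class_weight = {c: n_total / (n_classes * k) for c, k in class_counts.items()}

    # Count identical (sample, label) pairs; Counter keeps first-seen order.
    duplicates = Counter((tuple(x), y) for x, y in zip(samples, labels))
    return [(list(x), y, class_weight[y] * k) for (x, y), k in duplicates.items()]
```

The resulting weights would then scale the penalty term of each sample in a weighted SVM solver; with this scheme the weights of each class sum to the same total, so the minority class is not outvoted during training.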

15.
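Person re-identification retrieves a specific pedestrian again under different environments and has received wide attention from researchers in recent years. Current person re-identification algorithms mostly combine local and global features and achieve very good results when trained and tested on a single dataset, but their results in cross-domain tests are unsatisfactory, which indicates poor generalization. A cross-domain person re-identification method based on a deep capsule network is proposed. Through a viewpoint classification training task, the model learns effective features of the pedestrians in images, and these features transfer directly to the person re-identification task, which alleviates its insufficient generalization. Experimental results show that the model outperforms all current unsupervised person re-identification methods and generalizes well.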

16.
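Blood-brain barrier (BBB) permeability of drugs is an important factor in new drug development. Building on the traditional stacked denoising autoencoder (SDAE), an improved SDAE method for predicting drug BBB permeability is proposed. First, principal component analysis (PCA) is used to train, without supervision, a set of weights that initialize the SDAE, which avoids the slow convergence caused by random weight initialization. Then, an extra hidden layer is added to the denoising autoencoder (DAE) to build a two-hidden-layer DAE, which improves the ability of a single DAE to extract abstract features of drug molecules. Next, the first-hidden-layer outputs of the last two DAEs in the SDAE are fused as the input to a softmax classifier, which finally yields the BBB permeability prediction. Experiments show that, compared with the traditional SDAE and the shallow machine learning model SVM, the improved model predicts drug BBB permeability better.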

17.
An adaptive learning strategy using an artificial neural network (ANN) is proposed to control the motion of a 6-DOF manipulator robot and to overcome the problems of inverse kinematics, which are mainly singularities and uncertainties in arm configurations. In this approach, a network is trained to learn a desired set of joint angle positions from a given set of end-effector positions. Experimental results show an excellent mapping over the working area of the robot. To validate the ability of the designed network to predict and generalize well for any set of data, the same network was trained again on a different data set, and experimental results show good generalization on the new data sets. The proposed control technique does not require any prior knowledge of the kinematic model of the system being controlled; the basic idea is to use the ANN to learn the characteristics of the robot system rather than to specify an explicit robot system model. Any modification to the physical set-up of the robot, such as the addition of a new tool, would only require training for a new path without any major system software modification, which is a significant advantage of neural network technology.

18.
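Traditional machine learning methods generalize poorly and need large-scale training data to fit well, so they cannot quickly learn from a small amount of data outside the training set and adapt poorly to new kinds of tasks. Meta-learning can realize strong artificial intelligence with human-like learning ability, adapts quickly to new datasets, and compensates for these shortcomings of machine learning. To address the adaptation problem in traditional machine learning, a metric-based meta-learning algorithm built on group equivariant convolutional neural networks (G-CNN) is proposed, which exploits the local rotational and mirror symmetries of sample images to improve feature extraction. A four-layer feature mapping network is built with G-CNN; using the local symmetry information in the sample images, the support-set samples are mapped into a suitable metric space, and the mean feature of each class in that space serves as its prototype. The query set is mapped into the metric space by the same network, and classification is performed according to the distance from each query sample to the prototypes. Experiments on the Omniglot and miniImageNet datasets show that, compared with traditional four-layer meta-learning algorithms such as Siamese networks, relation networks, and MAML, the algorithm has advantages in both average recognition accuracy and model complexity.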
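The prototype step described in this abstract (class prototype = mean of the class's support features, query classified by distance to the prototypes) can be sketched as below. The sketch assumes the feature vectors have already been produced by some embedding network; the G-CNN itself is not shown, Euclidean distance is an assumption, and the function names are hypothetical.

```python
import math


def class_prototypes(support_features, support_labels):
    """Prototype of each class = mean of its support-set feature vectors."""
    grouped = {}
    for feature, label in zip(support_features, support_labels):
        grouped.setdefault(label, []).append(feature)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in grouped.items()
    }


def classify(query_feature, prototypes):
    """Assign the query to the class with the nearest prototype (Euclidean)."""
    return min(prototypes, key=lambda label: math.dist(query_feature, prototypes[label]))
```

A short usage example with made-up two-dimensional features:

```python
protos = class_prototypes([[0, 0], [2, 0], [10, 10], [12, 10]], ["a", "a", "b", "b"])
print(classify([0, 1], protos))  # nearest prototype is that of class "a"
```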

19.
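To address sample labeling and concept drift in data stream classification, a data stream classification mining model based on instance transfer is proposed. First, the model uses a support vector machine as the learner, builds the source domain from the support vectors of the resulting classification model, and treats the current data chunk to be classified as the target domain. Then, using the idea of mutual nearest neighbors, it selects from the source domain the true neighbors of the target-domain samples for instance transfer, which avoids negative transfer. Finally, the target domain and the transferred samples are merged into the training set, which increases the number of labeled samples and strengthens the model's generalization ability. Theoretical analysis and experimental results show that the proposed method is feasible and classifies more accurately than other learning methods.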
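The mutual-nearest-neighbor selection step can be sketched as follows. This is a simplified reading of the abstract, not the paper's procedure: a source sample is transferred only if it is among the k nearest source samples of some target sample and that target sample is, in turn, among its k nearest target samples. Euclidean distance, the parameter `k`, and the function names are assumptions.

```python
import math


def k_nearest(point, candidates, k):
    """Indices of the k candidates closest to point (Euclidean distance)."""
    order = sorted(range(len(candidates)), key=lambda i: math.dist(point, candidates[i]))
    return set(order[:k])


def mutual_neighbor_transfer(source, target, k):
    """Indices of source samples that are mutual k-nearest neighbors of a target sample."""
    selected = set()
    for t_idx, t in enumerate(target):
        for s_idx in k_nearest(t, source, k):
            # Keep the source sample only if the relation holds in both directions.
            if t_idx in k_nearest(source[s_idx], target, k):
                selected.add(s_idx)
    return sorted(selected)
```

Requiring the neighbor relation in both directions is what filters out source samples that merely happen to be the least distant from an isolated target point, which is one way to read the abstract's remark about avoiding negative transfer.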

20.
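General learning models start from an assumed random distribution and then fit the model by training on real data. The network models are complex and the datasets are not small, so this approach amounts to solving the problem by brute force. Goodfellow holds that the right way to use data is to gain insight into the feature information of the dataset first, and only then do the work. Unsupervised learning is currently a popular topic, but also one with many difficulties. At present, unsupervised learning can be divided into two classes: deterministic autoencoder methods and probabilistic restricted Boltzmann machines, whose goal is mainly to maximize the probability of the original data when the restricted Boltzmann machine reaches a stable state. How to build models faster and more effectively, and how to run experiments and obtain sound experimental conclusions efficiently, are the focus of discussion. In this study, regularization is added to the discriminative model, convolutional layers replace pooling layers, and the output layer of the generative model uses the tanh activation function, so that the accuracy and loss rates of the final computation drop greatly and redundant components are reduced.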
