20 similar articles found.
1.
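A Small-Sample Classifier Design Algorithm Based on the Marginal Fisher Criterion and Transfer Learning  Total citations: 1, self-citations: 0, citations by others: 1
How to use a large amount of existing homogeneous labeled data (the source domain) to design a classifier for a small set of training samples (the target domain) is a research problem of considerable practical significance. Because the feature distributions of the two domains differ, classifying target-domain samples directly with source-domain data gives unsatisfactory results. To address this problem, this paper proposes a classifier design algorithm based on transfer learning. First, the marginal Fisher criterion with an inner-product metric is used to map the source-domain features, improving within-class compactness and between-class separability in the source domain. Second, to select reasonable training sample pairs, an algorithm that removes boundary outliers is proposed to pick source-domain sample points from dense regions; these are paired with the labeled sample points in the target domain to form training sample pairs. In the kernelized space, a nonlinear transformation from target-domain features to source-domain features is learned, mapping the target domain onto the source domain. Finally, a k-nearest neighbor (kNN) classifier is used to classify the mapped target-domain samples. The paper not only improves the marginal Fisher criterion method but also applies transfer learning based on adaptive sample-pair selection to classifier design for small-sample data, improving cross-domain adaptability. Experimental results on commonly used datasets show that the proposed method effectively improves classifier performance in the small-sample training domain.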
2.
3.
4.
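When existing deep learning algorithms are applied to PolSAR image classification, they rarely take into account the complex-valued nature of the image data, so the information in the complex domain is not fully exploited. Moreover, deep learning requires a large number of labeled samples for model training, whereas the labeled samples available for PolSAR images are very limited. To address these problems, a semi-supervised PolSAR image classification algorithm is proposed that combines the Tri-training algorithm with a complex-valued convolutional neural network (CV-CNN). First, a Wishart classifier and the Tri-training algorithm are used to obtain pseudo-labeled samples of high reliability; these are then added to the training samples of the CV-CNN and used for model training, and the image classification task is finally completed. Simulation experiments on four PolSAR images show that the algorithm not only effectively improves the reliability of the pseudo-labeled samples but also raises the classification accuracy of the model.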
5.
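A New Clustering-Based Multiple Classifier Fusion Algorithm  Total citations: 11, self-citations: 2, citations by others: 9
A new multiple classifier fusion algorithm is proposed. The algorithm finds the regions of the feature space in which each classifier performs well locally and uses the output of the classifier with the best local performance as the final fusion result. First, each classifier is used to classify the training samples, which divides them into two sets: correctly classified samples and misclassified samples. Next, cluster analysis is performed separately on the two sets to partition the feature space, and the performance of each classifier in each local region of the feature space is computed. At test time, the output of the classifier with the best local performance around the test sample is selected as the final fusion result. Experiments on the ELENA dataset demonstrate the effectiveness of the algorithm.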
6.
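To improve the performance of multi-view semi-supervised co-training algorithms and to address their limited range of application, a co-training method with a combined labeling rule is proposed. The algorithm combines consistent and inconsistent labeling rules: if the classifiers assign the same label, the sample is added to the corresponding sample set; if the labels differ and the difference between the two classifiers' labeling confidences exceeds a given threshold, the label from the higher-confidence classifier is adopted and the sample is added to the corresponding sample set. Unlabeled samples are thus labeled by combining a check of whether the two classifiers agree with a disagreement threshold, and the classification model is updated using a classifier-diversity criterion, making full use of the useful information in the unlabeled samples and improving classifier performance by more than 5%. Experimental results on a bridge structural health monitoring dataset and standard UCI datasets verify the effectiveness and feasibility of the algorithm for multi-view classification problems.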
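The combined labeling rule in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `combined_label`, the default `threshold` of 0.2, and the choice to skip samples that satisfy neither rule are hypothetical and do not come from the paper.

```python
def combined_label(label_a, conf_a, label_b, conf_b, threshold=0.2):
    """Return a pseudo-label for one unlabeled sample, or None to skip it.

    label_a/label_b: labels from the two view-specific classifiers.
    conf_a/conf_b:   their labeling confidences in [0, 1].
    threshold:       assumed value; the abstract only says "a given threshold".
    """
    # Consistent rule: both classifiers agree, so accept the shared label.
    if label_a == label_b:
        return label_a
    # Inconsistent rule: accept the higher-confidence label only when the
    # confidence gap exceeds the threshold.
    if abs(conf_a - conf_b) > threshold:
        return label_a if conf_a > conf_b else label_b
    # Neither rule fires: leave the sample unlabeled (an assumption).
    return None
```

For example, `combined_label(0, 0.9, 1, 0.5)` accepts label `0` from the more confident classifier, while `combined_label(0, 0.6, 1, 0.5)` leaves the sample unlabeled.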
7.
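In many applications, combining multiple classifiers can reduce the classification error rate. Based on this idea, this paper proposes a new face recognition algorithm, the boosted probabilistic reasoning model. In this algorithm, the classification task is divided among multiple sub-classifiers, each of which concentrates on samples that are hard to classify, and these sub-classifiers are then combined into a strong classifier. Experimental results show that the recognition rate of the algorithm is 1.8% higher than that of the original probabilistic reasoning model.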
8.
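Because the two-way decision TAN classifier has a high error rate when handling uncertain data, a new three-way extended TAN Bayesian classifier (3WD-TAN) is proposed. First, a TAN Bayesian classification model is built, and prior probabilities and class-conditional probabilities are used to estimate the conditional probabilities required for three-way decisions. Second, the 3WD-TAN classifier is constructed and three-way classification rules are formulated for its positive, negative, and boundary regions; by exploiting the boundary region's advantage in handling uncertain data, it corrects, to some extent, the classification errors produced by the traditional TAN Bayesian classifier. Finally, comparative experiments against the NB, TAN, and SETAN algorithms on five UCI datasets show that 3WD-TAN achieves higher accuracy and recall and is suitable for classification problems on datasets of different sizes.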
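A generic three-way decision rule of the kind this abstract refers to might look like the sketch below. The thresholds `alpha` and `beta` and their default values are assumptions for illustration; the abstract does not state how 3WD-TAN sets them, and the probability passed in would come from the TAN model, which is not reproduced here.

```python
def three_way_decide(p_positive, alpha=0.7, beta=0.3):
    """Assign a sample to the positive, negative, or boundary region.

    p_positive: estimated conditional probability of the positive class.
    alpha/beta: acceptance and rejection thresholds (assumed values).
    """
    if not 0.0 <= beta < alpha <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= beta < alpha <= 1")
    if p_positive >= alpha:
        return "positive"   # confident enough to accept
    if p_positive <= beta:
        return "negative"   # confident enough to reject
    return "boundary"       # uncertain: defer the decision
```

Samples that land in the boundary region are deferred rather than forced into a class, which is how a three-way scheme can avoid some errors of a two-way classifier.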
9.
10.
11.
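To address the low efficiency of linearly combining base classifiers and the overfitting problem in the traditional AdaBoost algorithm, an improved algorithm based on base-classifier coefficients and diversity, WD AdaBoost, is proposed. First, a new method for computing the base-classifier coefficients is given according to the base classifiers' error rates and the distribution of the sample weights, in order to improve the efficiency of combining base classifiers. Second, in its base-classifier selection strategy, WD AdaBoost introduces the double-fault measure to increase diversity among base classifiers. On five datasets from different real-world application domains, the CeffAda algorithm, which uses the new coefficient computation, reduces the test error by 1.2 percentage points on average compared with traditional AdaBoost; in addition, WD AdaBoost has a lower error rate than algorithms such as WLDF_Ada, AD_Ada, and sk_AdaBoost. The experimental results show that WD AdaBoost integrates base classifiers more efficiently, resists overfitting, and improves classification performance.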
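For orientation, the two ingredients named in this abstract can be sketched in their textbook forms. Note the assumption: the coefficient below is the classic binary AdaBoost coefficient, not the new coefficient proposed in the paper (which the abstract does not specify), and the function names are hypothetical.

```python
import math


def classic_adaboost_coefficient(error_rate):
    """Classic AdaBoost weight for a base classifier with weighted error e:
    alpha = 0.5 * ln((1 - e) / e). This is the baseline, not the paper's variant."""
    if not 0.0 < error_rate < 1.0:
        raise ValueError("error_rate must lie strictly between 0 and 1")
    return 0.5 * math.log((1.0 - error_rate) / error_rate)


def double_fault(y_true, pred_a, pred_b):
    """Double-fault measure: fraction of samples that BOTH classifiers
    misclassify. Lower values indicate a more diverse pair."""
    both_wrong = sum(
        1 for y, a, b in zip(y_true, pred_a, pred_b) if a != y and b != y
    )
    return both_wrong / len(y_true)
```

A selection strategy in this spirit would prefer candidate base classifiers whose double-fault value against the already chosen ones is low.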
12.
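In dynamic ensemble selection algorithms, the region of competence of a test sample consists of a fixed set of samples, which affects classifier selection. A DES-DCR-CIER algorithm based on a dynamic region-of-competence strategy is therefore proposed. First, heterogeneous classifiers are used to generate the pool of base classifiers, addressing the low diversity of homogeneous ensembles and the small number of classifiers in heterogeneous ensembles. Then the dynamic region of competence of the test sample is obtained in three steps (a mutually adaptive K-nearest-neighbor algorithm, approaching the distance center of the sample set, and removing samples on class boundaries), and a group of strongly complementary classifiers is selected on the basis of an overall complementarity index. Finally, the selected classifiers are combined using the ER rule. Experiments on a breast mass diagnosis dataset from eight ultrasound physicians at a Grade III Class A hospital in Hefei, Anhui, and on the public Wisconsin breast cancer diagnosis dataset from the United States show that the diagnostic model based on DES-DCR-CIER achieves better accuracy.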
13.
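Ensemble learning is a data mining approach that can effectively improve the performance of classification systems. A dynamic classifier ensemble selection algorithm is used for intelligent evaluation of the sensory quality of cigarettes. A classifier pool containing multiple base classifiers is generated; classifiers that meet the requirements are selected according to the base classifiers' performance in the neighborhood of the test sample; and the selected classifiers produce the final prediction. To verify the effectiveness of the method, comparative experiments were carried out on a historical cigarette sensory evaluation dataset provided by a tobacco company in China. The experimental results show that the method yields clearly better results than other methods.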
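The three steps described here (build a pool, select by local performance, predict with the selected classifiers) can be sketched as below. All names and defaults are assumptions for illustration: classifiers are modeled as plain callables, the neighborhood is the `k` nearest validation points by Euclidean distance, the selection criterion is a local accuracy of at least `min_accuracy`, and the final prediction is a majority vote. The abstract does not specify any of these choices.

```python
import math
from collections import Counter


def select_classifiers(pool, X_val, y_val, x, k=3, min_accuracy=0.5):
    """Keep classifiers whose accuracy on the k validation points nearest
    to x reaches min_accuracy; fall back to the whole pool if none do."""
    nearest = sorted(range(len(X_val)), key=lambda i: math.dist(X_val[i], x))[:k]
    selected = []
    for clf in pool:
        correct = sum(1 for i in nearest if clf(X_val[i]) == y_val[i])
        if correct / len(nearest) >= min_accuracy:
            selected.append(clf)
    return selected or list(pool)


def predict(pool, X_val, y_val, x, k=3, min_accuracy=0.5):
    """Majority vote of the locally selected classifiers."""
    chosen = select_classifiers(pool, X_val, y_val, x, k, min_accuracy)
    votes = Counter(clf(x) for clf in chosen)
    return votes.most_common(1)[0][0]
```

The fallback to the full pool is a design choice made here so that a prediction is always produced; other dynamic selection schemes handle the empty case differently.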
14.
15.
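Introducing the idea of ensemble learning into incremental learning can significantly improve learning performance. Most recent research on ensemble-based incremental learning combines multiple homogeneous classifiers by weighted voting and does not satisfactorily resolve the stability-plasticity dilemma in incremental learning. To address this, a heterogeneous classifier ensemble incremental learning algorithm is proposed. During training, to make the model more stable, multiple base classifiers are trained on the new data and added to the heterogeneous ensemble model, while a locality-sensitive hash table stores a sketch of the data for later lookup of the nearest neighbors of test samples. To adapt to continually changing data, the newly acquired data are also used to update the voting weights of the base classifiers in the ensemble. When predicting the class of a test sample, the data in the locality-sensitive hash table that are similar to the test sample serve as a bridge for computing each base classifier's dynamic weight for that sample, and the class is determined by combining the voting weights and dynamic weights of the base classifiers. Comparative experiments show that the incremental algorithm has relatively high stability and generalization ability.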
16.
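A Cost-Sensitive AdaBoost Algorithm for Multi-Class Classification Problems  Total citations: 8, self-citations: 2, citations by others: 6
To address the cost-merging problem that arises when multi-class cost-sensitive classification problems are converted into binary cost-sensitive classification problems, a cost-sensitive AdaBoost algorithm that can be applied directly to multi-class problems is studied and constructed. The algorithm has a procedure and an error estimate similar to those of the real (continuous-valued) AdaBoost algorithm. When all costs are equal, the algorithm becomes a new multi-class real AdaBoost algorithm. It guarantees that the training error rate decreases as the number of trained classifiers increases, without directly requiring the classifiers to be mutually independent; in other words, the independence condition can be ensured by the rules of the algorithm, whereas the derivation of existing multi-class real AdaBoost algorithms must assume that the classifiers are mutually independent. Experimental data show that the algorithm genuinely biases classification results toward the classes with lower misclassification cost. In particular, when the costs of misclassifying each class as the other classes are unbalanced but the average costs are equal, existing multi-class cost-sensitive learning algorithms fail, while the new method still attains the minimum misclassification cost. The approach offers a new line of thought for further research on ensemble learning algorithms and yields an AdaBoost algorithm for multi-label classification problems that is easy to apply and approximately minimizes the classification error rate.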
17.
J. Díez, A. Bahamonde 《Pattern recognition》2010,43(11):3795-3804
In hierarchical classification, classes are arranged in a hierarchy represented by a tree or a forest, and each example is labeled with a set of classes located on paths from roots to leaves or internal nodes. In other words, both multiple and partial paths are allowed. A straightforward approach to learn a hierarchical classifier, usually used as a baseline method, consists in learning one binary classifier for each node of the hierarchy; the hierarchical classifier is then obtained using a top-down evaluation procedure. The main drawback of this naive approach is that these binary classifiers are constructed independently, when it is clear that there are dependencies between them that are motivated by the hierarchy and the evaluation procedure employed. In this paper, we present a new decomposition method in which each node classifier is built taking into account other classifiers, its descendants, and the loss function used to measure the goodness of hierarchical classifiers. Following a bottom-up learning strategy, the idea is to optimize the loss function at every subtree assuming that all classifiers are known except the one at the root. Experimental results show that the proposed approach has accuracies comparable to state-of-the-art hierarchical algorithms and is better than the naive baseline method described above. Moreover, the benefits of our proposal include the possibility of parallel implementations, as well as the use of all available well-known techniques to tune binary classification SVMs.
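The naive baseline mentioned in this abstract (one binary classifier per node, evaluated top-down) might be sketched as follows. The data layout is an assumption: `children` maps a node to its child nodes, `node_classifiers` maps a node to a callable returning `True` or `False`, and the function name is hypothetical. This illustrates only the baseline evaluation procedure, not the bottom-up decomposition method the paper proposes.

```python
def top_down_predict(x, roots, children, node_classifiers):
    """Return the set of predicted classes for x.

    A node is visited only if its parent was predicted positive, so the
    result is a union of (possibly partial, possibly multiple) root paths.
    """
    predicted = set()
    stack = list(roots)
    while stack:
        node = stack.pop()
        if node_classifiers[node](x):
            predicted.add(node)
            # Descend only below nodes that were accepted.
            stack.extend(children.get(node, []))
    return predicted
```

Because each node classifier is consulted independently, an error near the root hides every class beneath it, which is the dependency the abstract says the naive approach ignores.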
18.
《Computer Speech and Language》2014,28(3):727-742
Automatic emotion recognition from speech signals is one of the important research areas, which adds value to machine intelligence. Pitch, duration, energy and Mel-frequency cepstral coefficients (MFCC) are the widely used features in the field of speech emotion recognition. A single classifier or a combination of classifiers is used to recognize emotions from the input features. The present work investigates the performance of the features of Autoregressive (AR) parameters, which include gain and reflection coefficients, in addition to the traditional linear prediction coefficients (LPC), to recognize emotions from speech signals. The classification performance of the features of AR parameters is studied using discriminant, k-nearest neighbor (KNN), Gaussian mixture model (GMM), back propagation artificial neural network (ANN) and support vector machine (SVM) classifiers and we find that the features of reflection coefficients recognize emotions better than the LPC. To improve the emotion recognition accuracy, we propose a class-specific multiple classifiers scheme, which is designed by multiple parallel classifiers, each of which is optimized to a class. Each classifier for an emotional class is built by a feature identified from a pool of features and a classifier identified from a pool of classifiers that optimize the recognition of the particular emotion. The outputs of the classifiers are combined by a decision level fusion technique. The experimental results show that the proposed scheme improves the emotion recognition accuracy. Further improvement in recognition accuracy is obtained when the scheme is built by including MFCC features in the pool of features.
19.
SVM based adaptive learning method for text classification from positive and unlabeled documents  Total citations: 7, self-citations: 6, citations by others: 1
Automatic text classification is one of the most important tools in Information Retrieval. This paper presents a novel text
classifier using positive and unlabeled examples. The primary challenge of this problem as compared with the classical text
classification problem is that no labeled negative documents are available in the training example set. Firstly, we identify
many more reliable negative documents by an improved 1-DNF algorithm with a very low error rate. Secondly, we build a set
of classifiers by iteratively applying the SVM algorithm on a training data set, which is augmented during iteration. Thirdly,
different from previous PU-oriented text classification works, we adopt the weighted vote of all classifiers generated in
the iteration steps to construct the final classifier instead of choosing one of the classifiers as the final classifier.
Finally, we discuss an approach to evaluate the weighted vote of all classifiers generated in the iteration steps to construct
the final classifier based on PSO (Particle Swarm Optimization), which can discover the best combination of the weights. In
addition, we built a focused crawler based on link-contexts guided by different classifiers to evaluate our method. Several
comprehensive experiments have been conducted using the Reuters data set and thousands of web pages. Experimental results
show that our method increases the performance (F1-measure) compared with PEBL, and a focused web crawler guided by our PSO-based classifier outperforms several other classifiers
both in harvest rate and target recall.
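The weighted vote over the classifiers produced during iteration can be illustrated with the sketch below. It assumes each classifier is a callable returning +1 (positive) or -1 (negative) and that the weights are already given; in the paper the weights are searched for with PSO, which is not reproduced here, and the function name is hypothetical.

```python
def weighted_vote(classifiers, weights, x):
    """Combine +1/-1 votes from several classifiers using fixed weights."""
    if len(classifiers) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    score = sum(w * clf(x) for clf, w in zip(classifiers, weights))
    # Ties are resolved toward the positive class (an arbitrary choice).
    return 1 if score >= 0 else -1
```

Compared with picking a single classifier from the iteration, the vote lets earlier and later classifiers all contribute in proportion to their weights.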
20.
In this paper, a new classifier design methodology, confidence-based classifier design, is proposed to design classifiers with controlled confidence. This methodology is under the guidance of two optimal classification theories, a new classification theory for designing optimal classifiers with controlled error rates and C.K. Chow's optimal classification theory for designing optimal classifiers with controlled conditional error. The new methodology also takes advantage of the current well-developed classifier's probability preserving and ordering properties. It calibrates the output scores of current classifiers to the conditional error or error rates. Thus, it can either classify input samples or reject them according to the output scores of classifiers. It can achieve some reasonable performance even though it is not an optimal solution. An example is presented to implement the new methodology using support vector machines (SVMs). The empirical cumulative density function method is used to estimate error rates from the output scores of a trained SVM. Furthermore, a new dynamic bin width allocation method is proposed to estimate sample conditional error and this method adapts to the underlying probabilities. The experimental results clearly demonstrate the efficacy of the suggested classifier design methodology.
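The classify-or-reject behavior described here follows the general shape of Chow's rule, which can be sketched as below. The sketch assumes calibrated class posteriors are already available as a dictionary; the function name and the default `max_conditional_error` of 0.1 are hypothetical, and the paper's calibration of SVM scores and its bin-width method are not reproduced.

```python
def classify_or_reject(posteriors, max_conditional_error=0.1):
    """Return the most probable label, or None to reject the sample.

    posteriors: dict mapping label -> estimated posterior probability.
    The estimated conditional error of the best label is 1 - max posterior;
    the sample is rejected when that error exceeds the allowed level.
    """
    label, p_best = max(posteriors.items(), key=lambda item: item[1])
    if 1.0 - p_best > max_conditional_error:
        return None
    return label
```

Lowering `max_conditional_error` trades more rejected samples for fewer errors among the accepted ones.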