
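Building Bayes-Based Ensemble Classifiers Using PCA and AdaBoost
Cite this article: CHEN Song-feng, FAN Ming. Building Bayes-Based Ensemble Classifiers Using PCA and AdaBoost[J]. Computer Science (计算机科学), 2010, 37(8): 236-239256
Authors: CHEN Song-feng  FAN Ming
Affiliation: School of Information Engineering, Zhengzhou University, Zhengzhou 450052
Funding: National Natural Science Foundation of China; National Science and Technology Support Program of the 11th Five-Year Plan
Abstract (translated from the Chinese): This paper proposes PCABoost, a new method for building an ensemble classifier from Bayes-based base classifiers. To create a training sample, the method randomly partitions the feature set into K subsets, applies PCA to obtain the principal components of each subset, forms a new feature space from them, and maps all of the training data into the new feature space to serve as a new training set. Different transformations produce different feature spaces and therefore several diverse training sets. On each new training set, AdaBoost is used to build a set of successively boosted Bayes-based classifiers (that is, one classifier group), so that several diverse classifier groups are obtained. Each classifier group then produces one prediction by weighted voting among its members, and the predictions of the groups are combined by voting to give the classification result of the ensemble, which yields an ensemble classifier with two levels of combination. Experiments were run on 30 datasets selected at random from the UCI benchmark datasets. The results show that the algorithm not only significantly improves the classification performance of Bayes-based classifiers, but also achieves higher classification accuracy on most of the datasets than ensemble methods such as Rotation Forest and AdaBoost.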
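
The feature-space construction described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `build_rotation`, the use of NumPy's SVD to obtain the principal axes, and the choice to keep every component of every subset are assumptions, and the sketch assumes K is no larger than the number of features.

```python
import numpy as np

def build_rotation(X, k, rng):
    """Build one feature-space transformation (hypothetical helper).

    Randomly splits the d features of X into k subsets, runs PCA on each
    subset, and places the principal axes of each subset in its own block
    of a d x d matrix.
    """
    n, d = X.shape
    perm = rng.permutation(d)            # random order of feature indices
    subsets = np.array_split(perm, k)    # k disjoint feature subsets
    R = np.zeros((d, d))
    col = 0
    for idx in subsets:
        # Centre the columns of this subset before extracting components.
        Xs = X[:, idx] - X[:, idx].mean(axis=0)
        # Rows of vt are the principal axes of the subset.
        _, _, vt = np.linalg.svd(Xs, full_matrices=False)
        m = vt.shape[0]
        R[idx, col:col + m] = vt.T
        col += m
    return R[:, :col]

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))             # toy data: 50 samples, 6 features
R = build_rotation(X, k=3, rng=rng)
Z = X @ R                                # training data mapped to the new space
```

Calling `build_rotation` again with a different random state gives a different partition and hence a different training set, which is the source of diversity the abstract refers to.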

Keywords: Ensemble classifier  Principal component analysis  Bayes
Received: 2009-09-14
Revised: 2009-11-23

Construct Ensembles of Bayes-based Classifiers Using PCA and AdaBoost
CHEN Song-feng, FAN Ming. Construct Ensembles of Bayes-based Classifiers Using PCA and AdaBoost[J]. Computer Science, 2010, 37(8): 236-239256
Authors: CHEN Song-feng  FAN Ming
Affiliation: School of Information Engineering, Zhengzhou University, Zhengzhou 450052, China
Abstract: We present PCABoost, a novel method for constructing ensembles of Bayes-based classifiers. To create a training set, the method randomly splits the feature set into K subsets and applies principal component analysis (PCA) to each subset to obtain its principal components. All of the principal components are then put together to form a new feature space, into which the entire original dataset is mapped to create a new training set. Different transformations generate different feature spaces and hence different training sets. On each new training set, AdaBoost is used to build a group of successively boosted classifiers, which yields several different classifier groups in several different feature spaces. In the classification phase, each classifier group first produces a prediction by weighted voting among its members, and the group predictions are then combined by voting to give the final prediction of the ensemble. Experiments were carried out on 30 benchmark datasets picked at random from the UCI Machine Learning Repository. The results indicate that the method not only significantly improves the performance of Bayes-based classifiers, but also achieves higher accuracy on most of the datasets than other ensemble methods such as Rotation Forest and AdaBoost.
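
The two-level combination in the classification phase can be illustrated with a small sketch. The function names, the `(label, weight)` input format, and the use of an unweighted majority vote across groups are assumptions made for illustration; the abstract only states that groups vote internally with weights and that the group predictions are then voted on.

```python
from collections import Counter, defaultdict

def group_predict(votes):
    """Weighted vote inside one classifier group.

    votes: list of (label, weight) pairs, one per boosted classifier.
    """
    score = defaultdict(float)
    for label, weight in votes:
        score[label] += weight
    return max(score, key=score.get)

def ensemble_predict(groups):
    """Majority vote over the predictions of all groups (assumed unweighted)."""
    group_labels = [group_predict(votes) for votes in groups]
    return Counter(group_labels).most_common(1)[0][0]

# Toy example: three groups, each holding the votes of its boosted classifiers.
groups = [
    [("a", 0.9), ("b", 0.4), ("b", 0.3)],   # a: 0.9, b: 0.7 -> "a"
    [("b", 1.0), ("a", 0.2)],               # b: 1.0, a: 0.2 -> "b"
    [("a", 0.6), ("b", 0.5)],               # a: 0.6, b: 0.5 -> "a"
]
final_label = ensemble_predict(groups)      # two of three groups say "a"
```

In this sketch a tie between groups is resolved in favour of the label that appears first, which is an arbitrary choice.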
Keywords:AdaBoost
This article is indexed in Wanfang Data and other databases.