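Classification method for interval uncertain data based on improved naive Bayes
Cite this article: LI Wenjin, XIONG Xiaofeng, MAO Yimin. Classification method for interval uncertain data based on improved naive Bayes[J]. Journal of Computer Applications, 2014, 34(11): 3268-3272. DOI: 10.11772/j.issn.1001-9081.2014.11.3268
Authors: LI Wenjin, XIONG Xiaofeng, MAO Yimin
Affiliation: 1. School of Science, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000; 2. College of Applied Science, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000
Funding: National Natural Science Foundation of China; Natural Science Foundation of Jiangxi Province
Abstract: Naive Bayes based on the Parzen window has high computational complexity and large space requirements when classifying interval uncertain data. To address this problem, an improved classification method for interval uncertain data, IU-PNBC, is proposed. First, the Parzen window is used to estimate the class-conditional probability density function (CCPDF) of the interval samples; then an approximating function for the CCPDF is obtained by algebraic interpolation; finally, the approximating interpolation function is used to compute the posterior probability of a sample, which is then used for prediction. Artificially generated simulation data and UCI standard datasets were used to verify the reasonableness of the algorithm's assumptions and to examine the effect of the number of interpolation points on the classification accuracy of IU-PNBC. The experimental results show that when the number of interpolation points exceeds 15, the classification accuracy of IU-PNBC tends to stabilize, and the more interpolation points are used, the higher the classification accuracy; the algorithm avoids the original Parzen window estimate's dependence on the training samples and effectively reduces computational complexity; and because its running time and space requirements are far lower than those of naive Bayes based on the Parzen window, it is suitable for classification problems on large volumes of interval uncertain data.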

Keywords: interval uncertain data; algebraic interpolation; naive Bayes; Parzen window estimation; classification
Received: 2014-06-05
Revised: 2014-06-30

Classification method for interval uncertain data based on improved naive Bayes
LI Wenjin, XIONG Xiaofeng, MAO Yimin. Classification method for interval uncertain data based on improved naive Bayes[J]. Journal of Computer Applications, 2014, 34(11): 3268-3272. DOI: 10.11772/j.issn.1001-9081.2014.11.3268
Authors: LI Wenjin, XIONG Xiaofeng, MAO Yimin
Affiliation: 1. School of Science, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000, China;
2. College of Applied Science, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000, China
Abstract: Naive Bayes (NB) based on Parzen Window Estimation (PWE) has high computational complexity and storage requirements, especially when classifying interval uncertain data. To address this, an improved method named IU-PNBC is proposed for classifying interval uncertain data. First, the Class-Conditional Probability Density Function (CCPDF) is estimated using PWE. Second, an approximating function for the CCPDF is obtained by algebraic interpolation. Finally, the posterior probability is computed from the approximating interpolation function and used for classification. Artificial simulation data and UCI standard datasets were used to verify the reasonableness of the algorithm's assumptions and to assess the effect of the number of interpolation points on the classification accuracy of IU-PNBC. The experimental results show that when the number of interpolation points exceeds 15, the accuracy of IU-PNBC tends to stabilize, and accuracy increases with the number of interpolation points; IU-PNBC avoids the dependence on the training samples and effectively improves computational efficiency. Thus, with lower computational complexity and storage requirements than NB based on Parzen window estimation, IU-PNBC is suitable for classifying large interval uncertain datasets.
Keywords: interval uncertain data; algebraic interpolation; Naive Bayes (NB); Parzen Window Estimation (PWE); classification
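
The three steps named in the abstract (Parzen window estimate of the CCPDF, interpolation of that estimate, posterior computed from the interpolant) can be illustrated with a minimal Python sketch. It is not the authors' IU-PNBC implementation, and several choices are assumptions made only for illustration: each interval is treated as uniformly distributed, the kernel is Gaussian with a hypothetical bandwidth `h`, piecewise-linear interpolation stands in for the paper's algebraic interpolation, and the likelihood of a test interval is taken as the mean of the interpolant at `k` points inside it. All function names are hypothetical.

```python
import math
from bisect import bisect_right


def _phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def parzen_interval_density(x, intervals, h):
    """Parzen estimate at x with a Gaussian kernel of bandwidth h.

    Assumption: each training interval [lo, hi] is uniform, so its kernel
    contribution is the Gaussian kernel averaged over the interval.
    """
    total = 0.0
    for lo, hi in intervals:
        if hi > lo:
            total += (_phi((x - lo) / h) - _phi((x - hi) / h)) / (hi - lo)
        else:  # degenerate interval: ordinary Gaussian kernel at a point
            z = (x - lo) / h
            total += math.exp(-0.5 * z * z) / (h * math.sqrt(2.0 * math.pi))
    return total / len(intervals)


def build_table(intervals, h, m):
    """Tabulate the Parzen estimate at m evenly spaced nodes."""
    lo = min(a for a, _ in intervals) - 3.0 * h
    hi = max(b for _, b in intervals) + 3.0 * h
    xs = [lo + (hi - lo) * i / (m - 1) for i in range(m)]
    ys = [parzen_interval_density(x, intervals, h) for x in xs]
    return xs, ys


def interp(table, x):
    """Piecewise-linear interpolant of the tabulated density (0 outside)."""
    xs, ys = table
    if x <= xs[0] or x >= xs[-1]:
        return 0.0
    i = bisect_right(xs, x) - 1
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])


def fit(X, y, h=0.5, m=16):
    """X: list of samples, each a list of (lo, hi) pairs, one per attribute.

    Only the class prior and one m-node table per attribute are stored;
    the training intervals are not needed at prediction time.
    """
    model = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        tables = [build_table([r[j] for r in rows], h, m)
                  for j in range(len(X[0]))]
        model[c] = (len(rows) / len(y), tables)
    return model


def predict(model, sample, k=5):
    """Return the class with the largest naive Bayes log-posterior."""
    best, best_score = None, -math.inf
    for c, (prior, tables) in model.items():
        score = math.log(prior)
        for (a, b), table in zip(sample, tables):
            # Assumption: likelihood of [a, b] is the mean interpolated
            # density at k midpoints of equal sub-intervals.
            pts = [a + (b - a) * (i + 0.5) / k for i in range(k)]
            lik = sum(interp(table, p) for p in pts) / k
            score += math.log(lik + 1e-300)  # guard against log(0)
        if score > best_score:
            best, best_score = c, score
    return best
```

In this sketch, calling `fit` on lists of per-attribute `(lo, hi)` pairs and then `predict` on a new sample evaluates each class with a table lookup per attribute, so the cost per prediction depends on the number of nodes `m` rather than on the number of training samples. The default `m=16` only echoes the abstract's remark that accuracy stabilizes above 15 interpolation points; it is not a value taken from the paper's experiments.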
This article is indexed in CNKI, Wanfang Data, and other databases.