Imputation algorithm of missing values based on EM and Bayesian network
Cite this article: LI Hong, EMMANUEL Amani, LI Ping, WU Min. Imputation algorithm of missing values based on EM and Bayesian network[J]. Computer Engineering and Applications, 2010, 46(5): 123-125.
Authors: LI Hong  EMMANUEL Amani  LI Ping  WU Min
Affiliation: School of Information Science and Engineering, Central South University, Changsha 410083, China
Funding: National Science Fund for Distinguished Young Scholars, No. 60425310
Abstract: Datasets with missing values are common in real applications, and handling missing values has become an active research topic in classification. This paper analyzes and compares several widely used missing-value imputation algorithms and proposes a new imputation algorithm based on EM (Expectation-Maximization) and Bayesian networks. The algorithm uses naive Bayes to estimate the initial values for EM, then combines EM with Bayesian network learning in an iterative procedure that determines the final parameter updater and, at the same time, yields the completed dataset. Experimental results show that, compared with classical imputation algorithms, the proposed algorithm achieves higher classification accuracy at substantially lower cost.
Keywords: missing-value imputation  parameter updater  Expectation-Maximization (EM)  Bayesian network
Received: 2008-8-19
Revised: 2008-11-3
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.
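
The abstract gives only an outline of the method, so the following Python is a minimal sketch of that outline and not the authors' implementation. It assumes small categorical data encoded as integers with None for missing entries, a fully observed class column, and a network structure fixed in advance. The function names (naive_bayes_fill, e_step, m_step, em_impute), the Laplace smoothing constant alpha, and the choice to turn the naive Bayes guess into initial parameters by fitting the network to the naively filled data are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product


def joint(row, parents, cpt):
    """Joint probability of a fully specified row under the network."""
    p = 1.0
    for v, pa in parents.items():
        p *= cpt[v][(tuple(row[u] for u in pa), row[v])]
    return p


def completions(row, card):
    """Yield every way of filling the None entries of a row."""
    holes = [i for i, x in enumerate(row) if x is None]
    for vals in product(*(range(card[i]) for i in holes)):
        full = list(row)
        for i, x in zip(holes, vals):
            full[i] = x
        yield tuple(full)


def m_step(weighted_rows, parents, card, alpha=1.0):
    """Re-estimate every CPT from (row, weight) pairs, with Laplace smoothing."""
    counts = {v: defaultdict(float) for v in parents}
    for row, w in weighted_rows:
        for v, pa in parents.items():
            counts[v][(tuple(row[u] for u in pa), row[v])] += w
    cpt = {}
    for v, pa in parents.items():
        cpt[v] = {}
        for key in product(*(range(card[u]) for u in pa)):
            total = sum(counts[v][(key, x)] for x in range(card[v])) + alpha * card[v]
            for x in range(card[v]):
                cpt[v][(key, x)] = (counts[v][(key, x)] + alpha) / total
    return cpt


def e_step(data, parents, card, cpt):
    """Expand each row into its completions, weighted by posterior probability."""
    weighted = []
    for row in data:
        cands = list(completions(row, card))
        probs = [joint(c, parents, cpt) for c in cands]
        z = sum(probs)
        weighted.extend((c, p / z) for c, p in zip(cands, probs))
    return weighted


def naive_bayes_fill(data, cls, card, alpha=1.0):
    """Initial guess: fill each hole with argmax P(x_j | class) from observed entries.

    Assumes the class column `cls` is never missing.
    """
    filled = []
    for row in data:
        full = list(row)
        for j, x in enumerate(row):
            if x is None:
                counts = [alpha] * card[j]
                for r in data:
                    if r[j] is not None and r[cls] == row[cls]:
                        counts[r[j]] += 1
                full[j] = counts.index(max(counts))
        filled.append(tuple(full))
    return filled


def em_impute(data, parents, card, cls, iters=100, tol=1e-6):
    """Naive Bayes initialisation, then EM over the fixed network structure."""
    start = naive_bayes_fill(data, cls, card)
    cpt = m_step([(r, 1.0) for r in start], parents, card)
    for _ in range(iters):
        new = m_step(e_step(data, parents, card, cpt), parents, card)
        delta = max(abs(new[v][k] - cpt[v][k]) for v in cpt for k in cpt[v])
        cpt = new
        if delta < tol:
            break
    # Fill each row with its most probable completion under the final parameters.
    completed = [max(completions(r, card), key=lambda c: joint(c, parents, cpt))
                 for r in data]
    return completed, cpt


# Hypothetical toy data: column 0 is the class, columns 1 and 2 are attributes.
data = [
    (0, 0, 0), (0, 0, 0), (0, 1, 1), (0, 1, 1),
    (1, 0, 0), (1, 1, 1), (1, 1, 1), (1, 1, 1),
    (0, None, 1),
    (1, 0, None),
]
card = [2, 2, 2]                         # number of values per column
parents = {0: (), 1: (0,), 2: (0, 1)}    # assumed structure: C -> A, C -> B, A -> B
completed, cpt = em_impute(data, parents, card, cls=0)
print(completed[8], completed[9])
```

In this toy dataset the second attribute mostly copies the first, which a naive Bayes structure cannot represent, so the initial guess and the EM result can differ for the two incomplete rows. Because the M-step is smoothed, the loop is strictly a MAP variant of EM, and enumerating all completions of a row is only practical when few values are missing per row.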