
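A New Feature Selection Method Combining Mutual Information and Rough Sets
Citation: SHI Yuepeng, ZHANG Minghui, ZHU Haodong. A new feature selection method combining mutual information and rough sets [J]. Computer Engineering and Applications, 2011, 47(16): 135-137.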
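Authors: SHI Yuepeng, ZHANG Minghui, ZHU Haodong
Affiliations: 1. Department of Information Engineering, Zhengzhou College of Animal Husbandry Engineering, Zhengzhou 450011
2. Department of Information Technology, Zhengzhou Normal University, Zhengzhou 450044
3. School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450002
Funding: Basic and Frontier Technology Research Program of Henan Province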
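Abstract: Feature selection is an important step in text classification. This paper analyzes mutual information (MI) and, to address its shortcomings, introduces rough sets (RS) and gives an attribute reduction algorithm based on the relation product. On this basis it proposes a new feature selection method suited to massive text data sets. The method uses MI for an initial selection of features, then applies the relation-product-based attribute reduction algorithm to eliminate redundant terms. Experimental results show that the method achieves relatively high micro-averaged F1 and macro-averaged F1.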

Keywords: feature selection; text classification; mutual information; rough set; attribute reduction
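For readers unfamiliar with the quantities named in the abstract, the block below writes out the definitions most commonly used in text classification. These are textbook forms and an assumption on our part; the abstract does not state which variants the paper itself uses. Here t is a term, c a class, C the set of classes, and TP_i, FP_i, FN_i the true-positive, false-positive and false-negative counts for class i.

```latex
\[
\mathrm{MI}(t, c) = \log \frac{P(t, c)}{P(t)\,P(c)}
\]
\[
F_1^{\text{micro}} = \frac{2 \sum_i TP_i}{2 \sum_i TP_i + \sum_i FP_i + \sum_i FN_i},
\qquad
F_1^{\text{macro}} = \frac{1}{|C|} \sum_{i=1}^{|C|} \frac{2\,TP_i}{2\,TP_i + FP_i + FN_i}
\]
```

Micro-averaging pools the counts over all classes before computing F1, so frequent classes dominate; macro-averaging computes F1 per class and then averages, so every class counts equally.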

New feature selection combined MI with RS
SHI Yuepeng, ZHANG Minghui, ZHU Haodong. New feature selection combined MI with RS [J]. Computer Engineering and Applications, 2011, 47(16): 135-137.
Authors: SHI Yuepeng, ZHANG Minghui, ZHU Haodong
Affiliation: 1. Department of Information Engineering, Zhengzhou College of Animal Husbandry Engineering, Zhengzhou 450011, China
2. Department of Information Technology, Zhengzhou Normal University, Zhengzhou 450044, China
3. School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450002, China
Abstract: Feature selection is an important step in text categorization. Mutual information (MI) is analyzed and, to address its deficiencies, rough set (RS) theory is introduced and an attribute reduction algorithm based on attribute union is proposed. A new feature selection method that combines MI with the proposed attribute reduction algorithm is then presented; it is suitable for massive text data sets. The method uses MI for an initial selection of features and employs the proposed attribute reduction algorithm to eliminate redundancy. The experimental results show that the new method achieves relatively high micro-average F1 and macro-average F1.
Keywords: feature selection; text categorization; Mutual Information (MI); Rough Set (RS); attribute reduction
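The two-stage idea in the abstract (MI for a first cut, then rough-set attribute reduction to drop redundant terms) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the abstract gives no details of the proposed reduction algorithm, so the second stage here substitutes a generic greedy forward search on the rough-set dependency degree (in the style of QuickReduct). The function names, the max-over-classes MI score, and the binary term-presence decision table are all our own choices.

```python
import math
from collections import Counter, defaultdict


def mi_scores(docs, labels):
    """Score each term by its maximum pointwise MI over all classes."""
    n = len(docs)
    class_count = Counter(labels)
    term_count = Counter()
    joint_count = Counter()
    for terms, label in zip(docs, labels):
        for t in set(terms):
            term_count[t] += 1
            joint_count[t, label] += 1
    scores = {}
    for (t, c), n_tc in joint_count.items():
        # log P(t, c) / (P(t) P(c)), probabilities estimated from document counts
        mi = math.log(n_tc * n / (term_count[t] * class_count[c]))
        scores[t] = max(scores.get(t, -math.inf), mi)
    return scores


def dependency(docs, labels, attrs):
    """Rough-set dependency degree: fraction of documents in the positive region."""
    blocks = defaultdict(list)
    for terms, label in zip(docs, labels):
        # Documents with the same presence pattern over attrs are indiscernible.
        key = tuple(a in terms for a in attrs)
        blocks[key].append(label)
    consistent = sum(len(b) for b in blocks.values() if len(set(b)) == 1)
    return consistent / len(docs)


def greedy_reduct(docs, labels, candidates):
    """Assumed stand-in for the paper's reduction step (QuickReduct-style)."""
    target = dependency(docs, labels, candidates)
    chosen = []
    current = dependency(docs, labels, chosen)
    while current < target:
        # Add the term that yields the largest dependency degree.
        best = max((a for a in candidates if a not in chosen),
                   key=lambda a: dependency(docs, labels, chosen + [a]))
        chosen.append(best)
        current = dependency(docs, labels, chosen)
    return chosen


def select_features(docs, labels, k):
    """Stage 1: keep the k terms with the highest MI. Stage 2: drop redundant ones."""
    docs = [set(d) for d in docs]
    scores = mi_scores(docs, labels)
    shortlist = sorted(scores, key=lambda t: (-scores[t], t))[:k]
    return greedy_reduct(docs, labels, shortlist)
```

Calling `select_features(docs, labels, k)` with `docs` as a list of token collections and `labels` as the matching class labels returns the reduced term list. The MI shortlist keeps the rough-set stage cheap, since the dependency degree is only evaluated over k candidate terms rather than the full vocabulary.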
This article is indexed in CNKI, VIP, Wanfang Data and other databases.