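An Optimized K-means Method for Text Feature Selection in a Clustering Setting
Cite this article: LIU Hai-feng, LIU Shou-sheng, ZHANG Xue-ren. 聚类模式下一种优化的K-means文本特征选择 [J]. 计算机科学 (Computer Science), 2011, 38(1): 195-197.
Authors: 刘海峰 (LIU Hai-feng), 刘守生 (LIU Shou-sheng), 张学仁 (ZHANG Xue-ren)
Affiliation: Institute of Sciences, PLA University of Science and Technology, Nanjing 210007
Funding: This work was supported by the National Natural Science Foundation of China (grant No. 70571087).
Abstract: Reducing the dimensionality of text features is a core technique in automatic text classification. K-means is a widely used partition-based clustering method. To address the algorithm's excessive sensitivity to the initial cluster centers and to isolated points (outliers), an improved K-means algorithm for text feature selection is proposed. Optimizing how the initial cluster centers are chosen and removing isolated points improves the quality of text feature clustering. Subsequent text classification experiments show that the improved K-means algorithm selects features well and yields efficient text classification.

Keywords: feature selection, clustering, K-means, text classification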

Clustering-based Improved K-means Text Feature Selection
LIU Hai-feng, LIU Shou-sheng, ZHANG Xue-ren. Clustering-based Improved K-means Text Feature Selection [J]. Computer Science, 2011, 38(1): 195-197.
Authors: LIU Hai-feng, LIU Shou-sheng, ZHANG Xue-ren
Affiliation: Institute of Sciences, PLA University of Science and Technology, Nanjing 210007, China
Abstract: Text feature reduction is a key technology in text categorization. K-means is a commonly used partition-based clustering method. Because the algorithm is overly sensitive to the initial cluster centers and to isolated points, an improved K-means algorithm is proposed for text feature selection. Text feature clustering is improved by optimizing the selection of the initial cluster centers and by eliminating isolated points. Subsequent text classification experiments show that the proposed K-means algorithm has good feature selection ability and achieves high efficiency in text categorization.
Keywords: Feature selection, Clustering, K-means, Text categorization
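
The abstract names two ideas, choosing the initial cluster centers more carefully and removing isolated points before clustering, but does not spell out how the paper implements them. The sketch below is only an illustration of those two ideas under stated assumptions: farthest-first seeding and a mean-distance outlier rule are stand-ins chosen here, not the authors' method, and every name (`drop_outliers`, `farthest_first_centers`, `kmeans`, the `factor` threshold) is hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def drop_outliers(points, factor=2.0):
    """Assumed rule: drop any point whose mean distance to the other
    points exceeds `factor` times the average of those mean distances."""
    n = len(points)
    mean_d = [sum(dist(p, q) for q in points) / (n - 1) for p in points]
    overall = sum(mean_d) / n
    return [p for p, d in zip(points, mean_d) if d <= factor * overall]


def farthest_first_centers(points, k):
    """Assumed seeding: start from the point nearest the global centroid,
    then repeatedly add the point farthest from all chosen centers."""
    centroid = [sum(col) / len(points) for col in zip(*points)]
    centers = [min(points, key=lambda p: dist(p, centroid))]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(dist(p, c) for c in centers))
        )
    return centers


def kmeans(points, k, max_iter=100):
    """Plain Lloyd iterations started from the deterministic seeds above."""
    centers = [list(c) for c in farthest_first_centers(points, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Toy feature vectors: two tight groups plus one isolated point.
data = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11),
        (100, 100)]
cleaned = drop_outliers(data)
labels, centers = kmeans(cleaned, k=2)
print(labels, centers)
```

In a feature selection setting the vectors would represent terms rather than toy 2-D points, and one representative term per cluster could then be kept; that last step is not shown because the abstract does not describe it.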
This article is indexed in CNKI, Wanfang Data (万方数据), and other databases.