
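Text Clustering Algorithm Based on Concept Vectors
Cite this article: BAI Qiuchan, JIN Chunxia, ZHOU Haiyan. Text clustering algorithm based on concept vector[J]. Computer Engineering and Applications, 2011, 35(35): 155-157.
Authors: BAI Qiuchan  JIN Chunxia  ZHOU Haiyan
Affiliations: 1. Faculty of Electronic and Electrical Engineering, Huaiyin Institute of Technology, Huai'an, Jiangsu 223003
2. Faculty of Computer Engineering, Huaiyin Institute of Technology, Huai'an, Jiangsu 223003
Funding: Jiangsu Province Science and Technology Research Project (No. BE2006357)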
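Abstract: Traditional keyword-based text clustering algorithms ignore the correlation between feature keywords, so the text vector expresses the underlying concepts imprecisely. To address this, the paper proposes TCBCV (Text Clustering Based on Concept Vector), a text clustering algorithm based on concept vectors. It uses the concept attributes in HowNet and, guided by semantic field density and the weight of each sememe in the concept tree, selects a suitable sememe as the concept of each keyword, thereby mapping keywords to concepts. This both strengthens the semantic relations between texts and reduces the vector dimensionality, and applying it to text clustering improves clustering quality. Experimental results show that the algorithm substantially improves both the precision and the recall of text clustering.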
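The keyword-to-concept mapping described in the abstract can be illustrated with a small sketch. It is not the TCBCV implementation: the toy lexicon, its weights, and the scoring rule (concept-tree weight multiplied by a simple semantic field density) are all assumptions made for illustration, and no real HowNet data or API is used.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy lexicon: keyword -> {candidate sememe: weight in concept tree}.
# Real HowNet entries and the paper's actual weights are not reproduced here.
TOY_LEXICON = {
    "doctor": {"human": 0.4, "medical": 0.9},
    "hospital": {"place": 0.5, "medical": 0.9},
    "physician": {"human": 0.4, "medical": 0.9},
    "clinic": {"place": 0.5, "medical": 0.8},
}


def field_density(sememe, keywords, lexicon):
    """Assumed density: share of the document's keywords listing this sememe."""
    hits = sum(1 for k in keywords if sememe in lexicon.get(k, {}))
    return hits / len(keywords)


def to_concept_vector(keyword_counts, lexicon):
    """Map a keyword-count vector to a concept-count vector."""
    keywords = list(keyword_counts)
    concepts = Counter()
    for word, count in keyword_counts.items():
        candidates = lexicon.get(word)
        if not candidates:
            concepts[word] += count  # unknown words stay as their own concept
            continue
        # Assumed score: concept-tree weight times semantic field density.
        best = max(
            candidates,
            key=lambda s: candidates[s] * field_density(s, keywords, lexicon),
        )
        concepts[best] += count
    return concepts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


doc1 = Counter({"doctor": 2, "hospital": 1})
doc2 = Counter({"physician": 1, "clinic": 3})
concepts1 = to_concept_vector(doc1, TOY_LEXICON)
concepts2 = to_concept_vector(doc2, TOY_LEXICON)
print(cosine(doc1, doc2), cosine(concepts1, concepts2))
```

In this toy case the two documents share no keywords, so their keyword vectors have zero similarity, while their concept vectors coincide on a single dimension. That mirrors the two effects the abstract claims: stronger semantic relations between texts and a lower vector dimensionality.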

Keywords: HowNet  concept semantic field  sememe extraction  concept vector  text clustering

Text clustering algorithm based on concept vector
BAI Qiuchan, JIN Chunxia, ZHOU Haiyan. Text clustering algorithm based on concept vector[J]. Computer Engineering and Applications, 2011, 35(35): 155-157.
Authors:BAI Qiuchan  JIN Chunxia  ZHOU Haiyan
Affiliation: BAI Qiuchan (1), JIN Chunxia (2), ZHOU Haiyan (2). 1. Faculty of Electronic and Electrical Engineering, Huaiyin Institute of Technology, Huai'an, Jiangsu 223003, China; 2. Faculty of Computer Engineering, Huaiyin Institute of Technology, Huai'an, Jiangsu 223003, China
Abstract: Traditional keyword-based text clustering algorithms do not take into account the semantic relations between keywords, so the concepts expressed by the text vector are not accurate enough. The paper proposes a text clustering algorithm based on concept vectors. The algorithm uses HowNet concept attributes, together with the density of the semantic field and the weight of each sememe in the concept tree, to select appropriate sememes as the concepts of keywords, so that the text vector is transformed from a keyword vector...
Keywords: HowNet  concept semantic field  sememe extraction  concept vector  text clustering
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.