
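An Efficient Method for Extracting Topic Words from Short Texts
Cite this article: CHANG Peng, MA Hui. An efficient method for extracting topic words from short texts[J]. Computer Engineering and Applications (计算机工程与应用), 2011, 47(20): 126-128.
Authors: CHANG Peng (常鹏), MA Hui (马辉)
Affiliations: 1. School of Management, Tianjin University, Tianjin 300072
2. Department of Management, Tianjin Institute of Urban Construction, Tianjin 300384
Abstract: To overcome problems such as topic drift and topic misjudgment in traditional topic-word extraction algorithms, this paper proposes using word co-occurrence information to improve the accuracy of topic-word extraction. Each word's weight score is adjusted according to its co-occurrence with the surrounding context words in the text, so that words with a high co-occurrence rate with the text's topic are extracted first as topic words, which improves extraction precision. Experiments show that the proposed method is more accurate than general topic-word extraction methods, and that it clearly outperforms them when the source text is short.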

Keywords: extraction; word co-occurrence; topic extraction

Efficient short texts keyword extraction method analysis
CHANG Peng, MA Hui. Efficient short texts keyword extraction method analysis[J]. Computer Engineering and Applications, 2011, 47(20): 126-128.
Authors: CHANG Peng, MA Hui
Affiliation: 1. Department of Management, Tianjin University, Tianjin 300072, China; 2. Department of Management, Tianjin Institute of Urban Construction, Tianjin 300384, China
Abstract: To overcome shortcomings of traditional subject extraction methods, such as theme drift and theme misjudgment, this paper proposes a new keyword extraction algorithm based on co-occurrence analysis. A word's weight is adjusted according to how strongly it is associated with other words. A word that co-occurs with more words has greater impact and is extracted first. The experimental results show that the improved algorithm performs better than other methods in both recall and precision.
Keywords: keyword extraction; co-occurrence; subject extraction
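
The abstract describes the idea only in outline: a word's weight is adjusted by its co-occurrence with context words, and words that co-occur strongly with the text's topic are extracted first. The abstract does not give the paper's actual formula, so the following is only a minimal sketch under stated assumptions: the input is already tokenized into sentences, co-occurrence is counted at sentence level, and the score is term frequency plus a co-occurrence bonus scaled by a hypothetical parameter `alpha`. The function name `extract_keywords` and all parameters are illustrative, not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def extract_keywords(sentences, top_k=3, alpha=0.5):
    """Rank words by term frequency plus a co-occurrence bonus.

    Illustrative scoring only (an assumption, not the paper's formula):
    score(w) = tf(w) + alpha * sum_v cooc(w, v) * tf(v) / total_tokens
    """
    # Term frequency over the whole (pre-tokenized) text.
    tf = Counter(word for sentence in sentences for word in sentence)
    total = sum(tf.values())

    # Count how often each unordered word pair shares a sentence.
    cooc = Counter()
    for sentence in sentences:
        for a, b in combinations(sorted(set(sentence)), 2):
            cooc[(a, b)] += 1

    # A word earns more bonus when it co-occurs with frequent words,
    # used here as a rough proxy for the text's topic.
    bonus = Counter()
    for (a, b), n in cooc.items():
        bonus[a] += n * tf[b]
        bonus[b] += n * tf[a]

    scores = {w: tf[w] + alpha * bonus[w] / total for w in tf}
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


if __name__ == "__main__":
    doc = [
        ["topic", "word", "extraction"],
        ["topic", "cooccurrence", "word"],
        ["short", "text", "topic"],
    ]
    print(extract_keywords(doc, top_k=2))
```

In this toy input, "extraction" and "short" each occur once, but "extraction" shares a sentence with two frequent words and so ranks higher. That is the kind of tie-breaking that matters most in short texts, where raw frequencies are nearly flat.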
This article is indexed in databases including CNKI, VIP (维普), and Wanfang Data (万方数据).