
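Automatic Summarization Algorithm Based on Keyword Extraction
Cite this article: JIANG Xiao-yu. Automatic Summarization Algorithm Based on Keyword Extraction[J]. Computer Engineering, 2012, 38(3): 183-186.
Author: JIANG Xiao-yu
Affiliation: Business School, Beijing Institute of Fashion Technology, Beijing 100029, China
Funding: Beijing Municipal Outstanding Talent Training Special Research Fund (2009D005001000005)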
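Abstract: To address the incomplete content coverage of generated summaries, this paper recognizes unknown (out-of-vocabulary) words using the co-occurrence frequency of adjacent words, and proposes an algorithm that extracts Chinese keywords and generates summaries by constructing lexical chains. It also gives a method for building lexical chains with HowNet (《知网》) as the knowledge base. Lexical chains are built by computing word-sense similarity. Keywords are then extracted and sentence importance is computed by combining the strength of the lexical chain a word belongs to, the word's information entropy, and its position of occurrence. Experimental results show that, compared with existing algorithms, the proposed algorithm improves both the recall and the precision of the generated summaries.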
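The abstract does not spell out the co-occurrence statistic used for unknown word recognition, so the following is only a minimal sketch of the general idea: count how often two adjacent tokens appear together in already-segmented text and treat frequent pairs as candidate unknown words. The function name `candidate_unknown_words`, the `min_count` threshold, and the sample tokens are assumptions made for illustration, not details from the paper.

```python
from collections import Counter

def candidate_unknown_words(sentences, min_count=2):
    """Return merged adjacent token pairs seen at least min_count times.

    `sentences` is a list of token lists (output of a word segmenter).
    A raw frequency threshold is an assumed stand-in for the paper's
    co-occurrence measure, which the abstract does not specify.
    """
    pair_counts = Counter()
    for tokens in sentences:
        # zip(tokens, tokens[1:]) yields every pair of neighboring tokens.
        pair_counts.update(zip(tokens, tokens[1:]))
    return {a + b: n for (a, b), n in pair_counts.items() if n >= min_count}

# Hypothetical segmented input in which "词汇链" was split into two tokens.
sentences = [
    ["词汇", "链", "用于", "文摘"],
    ["构建", "词汇", "链"],
    ["词汇", "链", "强度"],
]
candidates = candidate_unknown_words(sentences, min_count=3)
print(candidates)
```

In this toy input only the pair "词汇" + "链" reaches the threshold, so it is the only candidate returned. A practical system would likely normalize by the individual token frequencies rather than rely on a raw count.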

Keywords: automatic summarization; vector space model; keyword extraction; lexical chain; unknown word recognition
Received: 2011-06-03

Automatic Summarization Algorithm Based on Keyword Extraction
JIANG Xiao-yu.Automatic Summarization Algorithm Based on Keyword Extraction[J].Computer Engineering,2012,38(3):183-186.
Authors:JIANG Xiao-yu
Affiliation:JIANG Xiao-yu (Business School, Beijing Institute of Fashion Technology, Beijing 100029, China)
Abstract: To overcome the incomplete coverage of generated summaries, a new lexical chain-based keyword extraction and automatic summarization algorithm for Chinese texts is proposed, built on unknown word recognition using the co-occurrence of neighboring words. A method for constructing lexical chains based on the HowNet knowledge base is also given. Lexical chains are constructed by calculating the semantic similarity between terms; keywords are extracted and the importance of each sentence is calculated according to the strength of the lexical chains, the entropy of terms, and term position. Experimental results show that summaries generated by the proposed algorithm achieve better recall and precision than those of other methods.
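As a rough illustration of the lexical chain step, the sketch below groups words greedily into chains by a similarity threshold and scores sentences by the strength of the chains their words belong to. It rests on several assumptions: the toy similarity table stands in for a HowNet-based measure, chain strength is taken to be the summed frequency of the chain's members, and the entropy and position factors mentioned in the abstract are left out. All names and values are hypothetical.

```python
from collections import Counter

def build_chains(words, similarity, threshold=0.8):
    """Greedily attach each word to the first chain holding a similar member."""
    chains = []
    for word in words:
        for chain in chains:
            if any(similarity(word, member) >= threshold for member in chain):
                chain.append(word)
                break
        else:
            # No existing chain is similar enough, so start a new one.
            chains.append([word])
    return chains

def chain_strengths(chains, freq):
    """Map each word to the total frequency of its chain (assumed definition)."""
    strengths = {}
    for chain in chains:
        total = sum(freq[w] for w in chain)
        for w in chain:
            strengths[w] = total
    return strengths

# Toy similarity values standing in for a HowNet-based measure (hypothetical).
TOY_SIM = {
    frozenset(("car", "vehicle")): 0.9,
    frozenset(("car", "truck")): 0.85,
}

def toy_similarity(a, b):
    return 1.0 if a == b else TOY_SIM.get(frozenset((a, b)), 0.0)

sentences = [["car", "road"], ["vehicle", "truck", "car"], ["weather"]]
freq = Counter(w for s in sentences for w in s)
chains = build_chains(sorted(freq), toy_similarity)
strengths = chain_strengths(chains, freq)
sentence_scores = [sum(strengths[w] for w in s) for s in sentences]
best = max(range(len(sentences)), key=sentence_scores.__getitem__)
print(chains, sentence_scores, best)
```

Here the second sentence scores highest because all of its words fall in the strongest chain. Greedy first-match assignment is order dependent, which is why the vocabulary is sorted before chaining to keep the result reproducible.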
Keywords:automatic summarization  vector space model  keyword extraction  lexical chain  unknown word recognition
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.