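Automatic Chinese Abstractive Summarization With Topical Keywords Fusion (主题关键词信息融合的中文生成式自动摘要研究)
Cite this article: Hou Liwei, Hu Po, Cao Wenlin. Automatic Chinese Abstractive Summarization With Topical Keywords Fusion[J]. Acta Automatica Sinica, 2019, 45(3): 530-539.
Authors: Hou Liwei, Hu Po, Cao Wenlin
Affiliation: 1. School of Computer Science, Central China Normal University, Wuhan 430079
Funding: Fundamental Research Funds for the Central Universities (CCNU18TS044); National Natural Science Foundation of China (61402191); Fundamental Research Funds for the Central Universities (CCNU16JYKX15); State Language Commission "13th Five-Year Plan" Research Planning Project (WT135-11)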
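Abstract (translated from the Chinese): With the rapid development of big data and artificial intelligence, research on automatic summarization is evolving from extractive toward abstractive summarization, with the goal of generating higher-quality, natural, and fluent summaries. In recent years, deep learning has gradually been applied to abstractive summarization; attention-based sequence-to-sequence models have become among the most widely used and have achieved notable results, particularly on sentence-level summary generation tasks (such as news headline generation and sentence compression). However, the vast majority of existing neural abstractive summarization models distribute attention evenly over all content of the text and do not finely distinguish the important topical information it contains. In view of this, this paper proposes a new multi-attention sequence-to-sequence model that incorporates topical keyword information: through a joint attention mechanism, it combines information from the important keywords under the text's topics with the semantic information of the text to guide summary generation. Experimental results on the NLPCC 2017 Chinese single-document summarization evaluation dataset verify the effectiveness and advancement of the proposed method.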

Keywords: joint attention mechanism; sequence-to-sequence model; abstractive summarization; topical keywords
Received: 2017-11-07

Automatic Chinese Abstractive Summarization With Topical Keywords Fusion
Affiliation: 1. School of Computer Science, Central China Normal University, Wuhan 430079
Abstract: With the rapid development of big data and artificial intelligence, research on automatic text summarization is shifting from extractive to abstractive summarization, with the aim of generating more natural, fluent, and higher-quality summaries. In recent years, deep learning has been increasingly applied to abstractive summarization. The attention-based sequence-to-sequence model has become one of the most widely used models and has achieved remarkable results, especially on sentence-level summarization tasks such as news headline generation and sentence compression. However, most neural abstractive summarization models distribute attention evenly over all content of the source document and do not discriminate the important topical information it contains. We therefore propose a new multi-attention sequence-to-sequence model that integrates topical keyword information. Through a joint attention mechanism, the model combines multidimensional topic information with the semantic information of the text to generate the final summary. Evaluation results on the public dataset of NLPCC 2017 Shared Task 3 show that our system is competitive with state-of-the-art methods.
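
The abstract does not give the model's equations, so the following is only a minimal sketch of what one decoding step of a joint attention mechanism could look like, under stated assumptions: dot-product scoring, and fusion of the two context vectors by concatenation. The function names (`attend`, `joint_attention`), the vector sizes, and the random inputs are all hypothetical and are not taken from the paper.

```python
import numpy as np


def softmax(scores):
    """Numerically stable softmax over a 1-D score vector."""
    shifted = scores - np.max(scores)
    exp = np.exp(shifted)
    return exp / exp.sum()


def attend(query, memory):
    """Dot-product attention of one query vector over the rows of `memory`.

    Returns the attention weights (one per row) and the weighted-sum context.
    Dot-product scoring is an assumption; the paper may score differently.
    """
    weights = softmax(memory @ query)  # shape: (n_rows,)
    context = weights @ memory         # shape: (dim,)
    return weights, context


def joint_attention(decoder_state, encoder_states, keyword_vectors):
    """Attend separately to the source text and to the topical keywords,
    then fuse the two contexts by concatenation (an illustrative choice)."""
    text_weights, text_context = attend(decoder_state, encoder_states)
    kw_weights, kw_context = attend(decoder_state, keyword_vectors)
    joint_context = np.concatenate([text_context, kw_context])
    return joint_context, text_weights, kw_weights


# Toy inputs: 5 encoder states and 3 keyword vectors, all of dimension 8.
rng = np.random.default_rng(0)
dim = 8
encoder_states = rng.normal(size=(5, dim))
keyword_vectors = rng.normal(size=(3, dim))
decoder_state = rng.normal(size=dim)

joint_context, text_weights, kw_weights = joint_attention(
    decoder_state, encoder_states, keyword_vectors
)
```

In a full sequence-to-sequence decoder, `joint_context` would be fed, together with the decoder state, into the layer that predicts the next summary word. Keeping a separate attention distribution over the keywords lets them influence generation even when the attention over the source text is close to uniform, which is the weakness the abstract attributes to existing models.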