An Automatic Text Summarization Method Based on Word Order Information

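Cite this article:	REN Ji-sheng, ZHANG Chi, WANG Zuo-ying. An automatic text summarization method based on word order information [J]. Computer Engineering and Design (计算机工程与设计), 2007, 28(1): 178-181.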

Authors:	REN Ji-sheng (任纪生), ZHANG Chi (张弛), WANG Zuo-ying (王作英)

Affiliation:	Department of Electronic Engineering, Tsinghua University, Beijing 100084, China

Funding:	National High Technology Research and Development Program of China (863 Program)

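Abstract:	Automatic summarization should obtain similarity values that are as accurate as possible in order to determine the weights of sentences or paragraphs, yet the commonly used computation based on the vector space model ignores the order of words in sentences, paragraphs, and texts. This paper proposes a new similarity measure based on groups of adjacent ordered words (word-order groups) and applies it to automatic text summarization. A clustering-based method yields a vector representation of the word-order groups, which is then used to characterize sentences, paragraphs, and texts. Similarity results based on word-order groups of different lengths are combined by linear interpolation. A new weighting index for sentences or paragraphs, based on the accumulated importance of the word-order groups they contain, is also proposed. Experiments show that using word order information effectively improves summarization quality.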
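Keywords:	automatic text summarization; word order; vector space model; similarity; weight; word order information; similarity measure; weighting index; linear interpolation; vector representation; clustering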
Article ID:	1000-7024(2007)01-0178-04
Revision received:	2005-12-25
Automatic text summarization based on word order

REN Ji-sheng, ZHANG Chi, WANG Zuo-ying. Automatic text summarization based on word order [J]. Computer Engineering and Design, 2007, 28(1): 178-181.

Authors:	REN Ji-sheng, ZHANG Chi, WANG Zuo-ying

Affiliation:	Department of Electronic Engineering, Tsinghua University, Beijing 100084, China

Abstract:	Automatic text summarization needs an accurate similarity measure to determine the weight of a sentence or paragraph, but the common algorithm based on the vector space model neglects the word order in sentences, paragraphs, and texts. A new computational scheme based on combinations of neighboring words is proposed and applied to automatic text summarization. A vector representation of these neighboring-word combinations is obtained via clustering and used to characterize sentences, paragraphs, and texts. The similarity results for phrases of different lengths are integrated through linear interpolation. A new weighting index for sentences or paragraphs is also proposed, based on the aggregate significance of the word combinations they contain. Experimental results show that using word order effectively improves summarization quality.
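
The idea of combining similarities over neighboring-word groups of different lengths can be illustrated with a small Python sketch. This is not the authors' implementation: the abstract does not give the formulas, so the sketch assumes plain n-gram count vectors with cosine similarity, omits the clustering step entirely, and uses hypothetical interpolation weights. The names `ngram_counts`, `cosine`, and `interpolated_similarity` are invented for illustration.

```python
from collections import Counter
from math import sqrt


def ngram_counts(tokens, n):
    """Count contiguous word n-grams (adjacent word groups of length n)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def interpolated_similarity(tokens_a, tokens_b, weights=(0.5, 0.3, 0.2)):
    """Linearly interpolate cosine similarities over n-gram lengths 1..len(weights).

    The default weights are hypothetical (assumed to sum to 1); the paper's
    actual weights and its clustering-based vectors are not reproduced here.
    """
    return sum(
        w * cosine(ngram_counts(tokens_a, n), ngram_counts(tokens_b, n))
        for n, w in enumerate(weights, start=1)
    )


s1 = "the dog bit the man".split()
s2 = "the man bit the dog".split()
# Bag-of-words view: both sentences contain the same words, so order is invisible.
print(cosine(ngram_counts(s1, 1), ngram_counts(s2, 1)))
# Interpolated view: lower, because the bigrams and trigrams differ.
print(interpolated_similarity(s1, s2))
```

The two example sentences use exactly the same words in a different order, so a unigram vector space model cannot tell them apart, while the interpolated score drops once longer word groups are taken into account.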

Keywords:	automatic text summarization; word order; vector space model; similarity measure; weight
This article is indexed in databases including CNKI, VIP (维普), and Wanfang Data (万方数据).