
Research on an Automatic Abstracting System for Chinese Web Pages

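Cite this article:	XU Xiao-Dan. Research on an Automatic Abstracting System for Chinese Web Pages[J]. Computer and Modernization (计算机与现代化), 2006(9):120-122,126.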

Author:	XU Xiao-Dan (徐晓丹)

Affiliation:	College of Information Science and Engineering, Zhejiang Normal University, Jinhua, Zhejiang 321004

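Abstract:	Automatic abstracting is an important yet difficult branch of natural language processing and plays an important role in Web information retrieval. Modeling the way a human reader works, this paper proposes an automatic abstracting method that combines discourse structure analysis with statistics, and implements an experimental automatic abstracting system for Chinese Web pages. The method first analyzes the discourse structure of the text to obtain the positional information of paragraphs and the subheadings at every level. It then combines this structural information with statistical methods and heuristic rules to extract the document's keywords and key sentences and to generate the abstract. In the experimental evaluation, the method achieved satisfactory abstract quality and speed.
Keywords:	automatic abstracting; Chinese Web pages; discourse structure; information retrieval
Article ID:	1006-2475(2006)09-0120-03
Received:	2005-09-30
Revised:	2005-09-30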
Research on Automatic Abstracting of Chinese Web Page

XU Xiao-Dan.Research on Automatic Abstracting of Chinese Web Page[J].Computer and Modernization,2006(9):120-122,126.

Authors:	XU Xiao-Dan

Abstract:	Automatic abstracting is a practical and difficult branch of natural language processing, and it has become an important problem in domains such as Internet information retrieval. This paper describes an automatic abstracting system for Chinese Web pages that is based mainly on text structure. The method first analyzes the text structure to obtain the positional information of paragraphs and the subtitles at all levels, then uses statistical methods and heuristic rules to extract keywords and key sentences, and finally generates the abstract. Experiments show that the method generates abstracts effectively and efficiently.

Keywords:	automatic abstract; Chinese Web page; text structure; information retrieval
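
The abstract describes the pipeline only at a high level: structure analysis, then statistical and heuristic scoring, then selection of keywords and key sentences. The following Python sketch illustrates one way such a pipeline could be arranged; it is not the paper's implementation. Everything specific in it is an assumption made for illustration: the function names, the use of overlapping character bigrams as a crude substitute for Chinese word segmentation, the weights `title_weight`, `heading_weight`, and `lead_bonus`, and the input shape (a title plus a list of `(subheading, paragraphs)` pairs, as a structure analysis step might produce).

```python
import re
from collections import Counter

def bigrams(text):
    """Overlapping character bigrams (assumed stand-in for real word segmentation)."""
    chars = re.findall(r"[\u4e00-\u9fffA-Za-z0-9]", text)
    return ["".join(chars[i:i + 2]) for i in range(len(chars) - 1)]

def split_sentences(paragraph):
    """Split a paragraph on sentence-ending punctuation, keeping the punctuation."""
    return [s.strip() for s in re.findall(r"[^。！？!?]+[。！？!?]?", paragraph) if s.strip()]

def term_weights(title, sections, title_weight=3.0, heading_weight=2.0):
    """Statistical step: body term frequency, boosted for terms seen in the title or subheadings."""
    body = "".join(p for _, paras in sections for p in paras)
    tf = Counter(bigrams(body))
    title_terms = set(bigrams(title))
    heading_terms = {t for heading, _ in sections for t in bigrams(heading)}
    weights = {}
    for term, count in tf.items():
        if term in title_terms:
            weights[term] = count * title_weight
        elif term in heading_terms:
            weights[term] = count * heading_weight
        else:
            weights[term] = float(count)
    return weights

def extract_keywords(weights, n=5):
    """Return the n highest-weighted terms."""
    return [t for t, _ in sorted(weights.items(), key=lambda kv: -kv[1])[:n]]

def summarize(title, sections, max_sentences=2, lead_bonus=1.5):
    """Score sentences by mean term weight, favor paragraph-initial sentences, keep document order."""
    weights = term_weights(title, sections)
    scored = []
    for _, paras in sections:
        for para in paras:
            for pos, sent in enumerate(split_sentences(para)):
                terms = bigrams(sent)
                if not terms:
                    continue
                score = sum(weights.get(t, 0.0) for t in terms) / len(terms)
                if pos == 0:  # heuristic rule: the first sentence of a paragraph gets a bonus
                    score *= lead_bonus
                scored.append((score, len(scored), sent))
    top = sorted(scored, key=lambda x: -x[0])[:max_sentences]
    return [sent for _, _, sent in sorted(top, key=lambda x: x[1])]

# Hypothetical input, shaped like the output of a structure analysis step.
title = "自动摘要系统"
sections = [
    ("篇章结构", ["自动摘要系统分析篇章结构。今天天气很好。"]),
    ("统计方法", ["统计方法提取关键句。我们去公园散步。"]),
]
print(extract_keywords(term_weights(title, sections)))
print(summarize(title, sections))
```

In this toy input, the sentences that share terms with the title or a subheading and that open a paragraph outrank the off-topic sentences. A real system would replace the bigram step with a proper Chinese word segmenter and would tune the weights on evaluation data.
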
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.
