
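News Topic Thread Extraction Based on Keywords and Named Entity Recognition
Cite this article: Qian Zheyi, Li Fang. News Topic Thread Extraction Based on Keywords and Named Entity Recognition [J]. Computer Applications and Software (计算机应用与软件), 2011, 28(12).
Authors: Qian Zheyi (钱哲怡), Li Fang (李芳)
Affiliation: Sino-German Language Technology Laboratory, Department of Computer Science, Shanghai Jiao Tong University, Shanghai 200240
Abstract: Automatically structuring news topics, so that readers can understand a topic from different angles and aspects and cope with the overload of online news information, has become a research hotspot. This paper proposes organizing a news topic into threads: a thread extraction algorithm produces a set of keywords and named entities that serves as the theme of each thread, and news reports are then assigned to the threads as their content, which gives the topic a structure. Experimental results show that the method performs well on both the thread precision and the document partitioning evaluation measures, and that it presents the different threads of a topic fairly clearly, helping users follow how a news topic develops.

Keywords: Named entity recognition; Thread extraction; Thread document partitioning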

KEYWORD AND NAME ENTITY IDENTIFICATION BASED NEWS TOPIC THREAD EXTRACTION
Qian Zheyi,Li Fang.KEYWORD AND NAME ENTITY IDENTIFICATION BASED NEWS TOPIC THREAD EXTRACTION[J].Computer Applications and Software,2011,28(12).
Authors:Qian Zheyi  Li Fang
Affiliation: UDS-SJTU Joint Research Lab for Language Technology, Department of Computer and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
Abstract: How to automatically organize news topics and understand them from different angles and aspects, so as to solve the problem of online news information overload, is becoming a research hotspot. The article introduces a news topic threading view: a thread extraction algorithm yields sets of keywords and named entities as the theme of each thread, and news reports are then classified into the threads as their content to organize the news topic. Experiments indicate that the method proposed in the article perfor...
Keywords: Named entity identification; Thread extraction; Thread document division
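
The abstract describes two steps: obtaining a set of keywords and named entities as the theme of each thread, then placing news reports into threads. The abstract does not give the algorithm itself, so the following is only a minimal sketch of the second step under stated assumptions: thread themes are already available as plain term sets, reports are already tokenized, and a report goes to the thread whose terms it mentions most often. The function name `assign_to_threads`, the scoring rule, and the sample data are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def assign_to_threads(docs, threads):
    """Assign each report to the thread whose term set it matches most.

    docs:    dict mapping report id -> list of tokens (assumed pre-segmented).
    threads: dict mapping thread id -> set of keywords / named entities.
    Returns a dict mapping thread id -> list of report ids. Reports that
    match no thread term are collected under the key None.
    """
    result = {tid: [] for tid in threads}
    result[None] = []
    for doc_id, tokens in docs.items():
        counts = Counter(tokens)
        best_tid, best_score = None, 0
        for tid, terms in threads.items():
            # Score = total occurrences of the thread's terms in the report.
            score = sum(counts[t] for t in terms)
            if score > best_score:  # on a tie, the earlier thread is kept
                best_tid, best_score = tid, score
        result[best_tid].append(doc_id)
    return result


# Hypothetical thread themes and reports, for illustration only.
threads = {
    "rescue": {"rescue", "survivors"},
    "cause": {"investigation", "signal"},
}
docs = {
    "d1": ["rescue", "teams", "found", "survivors"],
    "d2": ["investigation", "signal", "failure"],
    "d3": ["weather", "report"],
}
assignment = assign_to_threads(docs, threads)
```

Raw term counts are used here only to keep the sketch short; a weighted score such as TF-IDF could be substituted without changing the overall structure.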
This article is indexed in CNKI, Wanfang Data, and other databases.