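A Chinese Reading-Assistance System Based on Character-Unit Analysis (基于字单元分析的中文辅助阅读系统)
Cite this article: FANG Gao-lin, YU Hao, MENG Yao, ZOU Gang. 基于字单元分析的中文辅助阅读系统[J]. Journal of Chinese Information Processing (中文信息学报), 2008, 22(2): 92-98.
Authors: FANG Gao-lin (方高林)  YU Hao (于浩)  MENG Yao (孟遥)  ZOU Gang (邹纲)
Affiliation: Fujitsu Research and Development Center Co., Ltd., Beijing 100016
Abstract (translated from the Chinese): Computer-aided Chinese learning is an important research area that is attracting growing interest in the natural language processing community. This paper proposes a reading-assistance system that takes the character as its unit of analysis and gives learners of Chinese on-the-fly translation and study support. The system first applies a Chinese lexical analysis method based on character information to segment the text of Chinese web pages into words, and then discovers new words using the structural information of their constituent characters. For new words that general-purpose dictionaries do not cover (for example, technical terms, proper nouns, and fixed phrases), the system proposes a method based on semantic prediction and feedback learning to mine idiomatic translations from the Web. For common words, the system displays translations instantly from a Chinese-English (or Chinese-Japanese) dictionary, and users can also retrieve concrete usage examples of a word from the Web through a word-usage retrieval module. The key technologies are lexical analysis based on character information, new word discovery based on the structural information of constituent characters, and translation acquisition for new words based on semantic prediction and feedback learning. All of these modules take the character as the unit of analysis, an approach that runs through the whole system. Experiments show that the system performs well in all of these respects.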

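Keywords: computer application  Chinese information processing  lexical analysis  new word discovery  term translation  Web mining  computer-aided Chinese learning
Article ID: 1003-0077(2008)02-0092-07
Received: 2007-04-10
Revised: 2007-10-22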

A Computer-Aided Chinese Reading System Based on Analysis Unit of Characters
FANG Gao-lin, YU Hao, MENG Yao, ZOU Gang. A Computer-Aided Chinese Reading System Based on Analysis Unit of Characters[J]. Journal of Chinese Information Processing, 2008, 22(2): 92-98.
Authors:FANG Gao-lin  YU Hao  MENG Yao  ZOU Gang
Affiliation: Fujitsu Research and Development Center Co., Ltd., Beijing 100016, China
Abstract: Computer-aided Chinese learning is an important research topic that is attracting growing interest in the natural language processing community. This paper proposes a computer-aided reading and learning system, based on the character as the unit of analysis, that provides reading and learning assistance to learners of Chinese. The system first applies character-based Chinese morphological analysis to segment Chinese text into words, and then finds new words with a method based on the structure information of their constituent characters. For unknown words that are not registered in the dictionary (such as technical terms, proper nouns, and fixed phrases), a method based on semantic prediction and feedback learning is proposed to mine their native translations from the Web. For frequent words, real-time translation display is implemented with a Chinese-English (or Chinese-Japanese) dictionary database, and users can also obtain typical usage examples of a word through a word usage retrieval module. The key technologies of the system are morphological analysis based on character information, new word finding based on the structure information of constituent characters, and translation acquisition for new words based on semantic prediction and feedback learning. The character as the unit of analysis is the core of all the proposed methods and runs through the whole system. Experiments show that the system performs well in all of these respects.
Keywords: computer application  Chinese information processing  morphological analysis  new word finding  term translation  Web mining  computer-aided Chinese learning
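
The abstract does not say how the character-based analysis turns per-character decisions into words. One common way to frame character-based segmentation, assumed here purely for illustration and not confirmed as the authors' method, is to label each character as B (word-initial), M (word-internal), E (word-final), or S (single-character word) and then group characters by those labels. The sketch below shows only that grouping step; the function name `tags_to_words` is hypothetical, and the tags are written by hand rather than predicted by any model.

```python
from typing import List


def tags_to_words(chars: str, tags: List[str]) -> List[str]:
    """Group characters into words from per-character B/M/E/S tags.

    Assumption: a B/M/E/S tagging scheme (not stated in the abstract).
    Malformed sequences (e.g. two B tags in a row) are tolerated by
    closing the word in progress whenever a new word starts.
    """
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")

    words: List[str] = []
    current = ""
    for ch, tag in zip(chars, tags):
        # B or S starts a new word, so flush anything still open.
        if tag in ("B", "S") and current:
            words.append(current)
            current = ""
        current += ch
        # E or S ends the word that this character belongs to.
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:
        # The sequence ended mid-word; keep the remaining characters.
        words.append(current)
    return words


if __name__ == "__main__":
    # Hand-written tags for a four-character sentence.
    print(tags_to_words("我爱北京", ["S", "S", "B", "E"]))
```

In a full system the tags would come from a trained sequence labeller, and candidate new words would then be checked against a dictionary; neither step is shown here.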
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.