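Chinese-Tibetan Computer-Aided Translation Based on Phrase-String Examples
Cite this article: XIONG Wei, WU Jian, LIU Huidan, ZHANG Liqiang. Chinese-Tibetan Computer-Aided Translation Based on Phrase-String Examples [J]. Journal of Chinese Information Processing (中文信息学报), 2013, 27(3): 84-91.
Authors: XIONG Wei  WU Jian  LIU Huidan  ZHANG Liqiang
Affiliation: 1. Institute of Software, Chinese Academy of Sciences, Beijing 100190; 2. University of Chinese Academy of Sciences, Beijing 100049
Funding: High-Tech Project of the Chinese Academy of Sciences Western Action Plan; National Major Science and Technology Special Project
Abstract: Current research on Chinese-Tibetan machine translation concentrates on rule-based methods, mainly because basic resources such as Chinese-Tibetan parallel corpora are relatively scarce, which makes large-scale statistical Chinese-Tibetan machine translation experiments impractical. Driven by the practical needs of a Chinese-Tibetan computer-aided translation project, this paper proposes a machine translation method based on phrase-string examples that supplies candidate translations for aided translation when parallel corpus resources are limited. The method mainly uses word alignment information to fully exploit the existing parallel corpus. Experimental results show that the proposed phrase-string example method outperforms traditional sentence-example-based translation and can retrieve phrase-string translation examples of arbitrary length. On the test set, compared with Moses under default parameters, the method's BLEU score is close to that of Moses, and the recall of phrase translation example strings improves by about 9.71%. On a test corpus with an average sentence length of 20 words, translation takes 0.175 s per sentence on average, which meets the real-time requirement of aided translation.

Keywords: machine translation  computer-aided translation  phrase-based machine translation  example-based machine translation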

Example Phrase Based Chinese-Tibetan Computer Aided Translation
XIONG Wei, WU Jian, LIU Huidan, ZHANG Liqiang. Example Phrase Based Chinese-Tibetan Computer Aided Translation[J]. Journal of Chinese Information Processing, 2013, 27(3): 84-91.
Authors: XIONG Wei  WU Jian  LIU Huidan  ZHANG Liqiang
Affiliation: 1. Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; 2. University of Chinese Academy of Sciences, Beijing 100049, China
Abstract: At present, research on Chinese-Tibetan machine translation focuses mainly on rule-based methods. Because parallel corpora and other basic resources for Chinese and Tibetan are scarce, large-scale statistical machine translation experiments are impractical. Driven by the practical needs of a Chinese-Tibetan computer-aided translation project, this paper proposes an example-phrase-based machine translation method that supplies candidate translations when little parallel data is available. The method uses word alignment information to make full use of the existing parallel corpus. Because it can retrieve phrase examples of arbitrary length, the approach outperforms the traditional sentence-level example-based method. On the test data, its BLEU score is close to that of Moses run with default parameters, and the recall of translated phrases improves by about 9.71% over Moses. On a test corpus averaging 20 words per sentence, translation takes about 0.175 s per sentence, which meets the real-time requirement of a computer-aided translation system.
Keywords:machine translation  computer aided translation  phrase-based translation  example-based translation  
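
The abstract states that the method mines phrase examples from the parallel corpus using word alignment information, but it does not give the procedure. As a rough illustration only, the sketch below shows the standard alignment-consistency criterion used in phrase-based translation to pull phrase pairs out of a single aligned sentence pair. It is not the authors' algorithm: the function name `extract_phrase_pairs`, the `max_len` parameter, and the placeholder tokens are assumptions made for this example.

```python
from typing import List, Set, Tuple


def extract_phrase_pairs(
    src: List[str],
    tgt: List[str],
    alignment: Set[Tuple[int, int]],
    max_len: int = 4,
) -> Set[Tuple[str, str]]:
    """Collect phrase pairs that are consistent with a word alignment.

    `alignment` holds (source_index, target_index) links. A source span is
    kept only if every target word inside the covered target span is linked
    exclusively to words inside that source span. This is a simplified
    version of the usual criterion: it does not extend spans over unaligned
    target words. `max_len` (hypothetical) caps the source span length.
    """
    pairs: Set[Tuple[str, str]] = set()
    n = len(src)
    for i1 in range(n):
        for i2 in range(i1, min(i1 + max_len, n)):
            # Target positions linked to any word in the source span.
            tgt_pos = [t for (s, t) in alignment if i1 <= s <= i2]
            if not tgt_pos:
                continue  # the span has no aligned words, so nothing to extract
            j1, j2 = min(tgt_pos), max(tgt_pos)
            # Reject the span if a target word in [j1, j2] links outside it.
            inconsistent = any(
                j1 <= t <= j2 and not (i1 <= s <= i2) for (s, t) in alignment
            )
            if inconsistent:
                continue
            pairs.add((" ".join(src[i1 : i2 + 1]), " ".join(tgt[j1 : j2 + 1])))
    return pairs


# Placeholder tokens stand in for segmented Chinese and Tibetan words.
example_src = ["s0", "s1", "s2"]
example_tgt = ["t0", "t1", "t2"]
example_alignment = {(0, 0), (1, 2), (2, 1)}
example_pairs = extract_phrase_pairs(example_src, example_tgt, example_alignment)
```

Pairs collected this way over a whole corpus could be indexed by their source side, so that a lookup returns the longest stored phrase covering part of an input sentence. How the paper actually indexes and ranks its examples is not stated in the abstract.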