Improved statistical machine translation model with topic-based paraphrase

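Cite this article:	SU Jin-song, DONG Huai-lin, CHEN Yi-dong, SHI Xiao-dong, WU Qing-qiang. Improved statistical machine translation model with topic-based paraphrase[J]. Journal of Zhejiang University (Engineering Science), 2014, 48(10): 1843-1849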

Authors:	SU Jin-song (苏劲松), DONG Huai-lin (董槐林), CHEN Yi-dong (陈毅东), SHI Xiao-dong (史晓东), WU Qing-qiang (吴清强)

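Affiliations:	1. School of Software, Xiamen University, Xiamen, Fujian 361005; 2. Department of Cognitive Science, Xiamen University, Xiamen, Fujian 361005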

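Funding:	National Key Technology R&D Program of the 12th Five-Year Plan (2012BAH14F03); National Natural Science Foundation of China (61005052, 61303082); Specialized Research Fund for the Doctoral Program of Higher Education (2012012120046); Natural Science Foundation of Fujian Province (2011J01360); Xiamen Science and Technology Program (3502Z20103001); Shenzhen Key Laboratory of High Performance Data Mining (CXB201005250021A)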

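Abstract:	Conventional methods that extract paraphrases from bilingual parallel corpora ignore document context both when paraphrases are extracted and when they are applied. To address this shortcoming, topic-model-based context information is introduced to improve paraphrase extraction, with the main focus on how to compute context-independent and context-dependent paraphrase generation probabilities. The work then studies how to incorporate these two probabilities into statistical machine translation modeling in order to improve translation performance. Experimental results on multiple test sets demonstrate the effectiveness of the method.
Keywords:	statistical machine translation; paraphrase; topic model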
Improved statistical machine translation model with topic-based paraphrase

SU Jin-song;DONG Huai-lin;CHEN Yi-dong;SHI Xiao-dong;WU Qing-qiang. Improved statistical machine translation model with topic-based paraphrase[J]. Journal of Zhejiang University(Engineering Science), 2014, 48(10): 1843-1849

Authors:	SU Jin-song DONG Huai-lin CHEN Yi-dong SHI Xiao-dong WU Qing-qiang

Affiliation:	1. School of Software, Xiamen University; 2. Department of Cognitive Science, Xiamen University

Abstract:	Conventional paraphrase extraction based on parallel corpora neglects document-level context. To address this shortcoming, paraphrase extraction and its application in statistical machine translation were improved by introducing context derived from a topic model. The work focuses on how to better learn two kinds of paraphrase probabilities: topic-insensitive and topic-sensitive. Both probabilities can be incorporated into the statistical machine translation model, each by a different method. Experimental results on various test sets demonstrated the effectiveness of the approach.

Keywords:	statistical machine translation; paraphrase; topic model
This article is indexed in CNKI and other databases.