A comparison of segmentation methods and extended lexicon models for Arabic statistical machine translation |
| |
Authors: | Saša Hasan, Saab Mansour, Hermann Ney |
| |
Affiliation: | 1. Human Language Technology and Pattern Recognition Group, Lehrstuhl für Informatik 6, RWTH Aachen University, 52062 Aachen, Germany |
| |
Abstract: | In this article, we first investigate different methodologies of Arabic segmentation for statistical machine translation by comparing a rule-based segmenter to different statistically based segmenters. We also present a method for segmentation that serves the needs of a real-time translation system without impairing the translation accuracy. Second, we report on extended lexicon models based on triplets that incorporate sentence-level context during the decoding process. Results on different translation tasks show improvements in both BLEU and TER scores. |
| |
This article is indexed in SpringerLink and other databases. |
|