EMNLP
2010

Statistical Machine Translation with a Factorized Grammar

In modern machine translation practice, a statistical phrasal or hierarchical translation system usually relies on a huge set of translation rules extracted from bi-lingual training data. This approach not only results in space and efficiency issues, but also suffers from the sparse data problem. In this paper, we propose to use factorized grammars, an idea widely accepted in the field of linguistic grammar construction, to generalize translation rules, so as to solve these two problems. We designed a method to take advantage of the XTAG English Grammar to facilitate the extraction of factorized rules. We experimented on various setups of low-resource language translation, and showed consistent significant improvement in BLEU over state-of-the-art string-to-dependency baseline systems with 200K words of bi-lingual training data.
Libin Shen, Bing Zhang, Spyros Matsoukas, Jinxi Xu, Ralph M. Weischedel
Added 11 Feb 2011
Updated 11 Feb 2011
Type Journal
Year 2010
Where EMNLP
Authors Libin Shen, Bing Zhang, Spyros Matsoukas, Jinxi Xu, Ralph M. Weischedel