Statistical Machine Translation with a Factorized Grammar

In modern machine translation practice, a statistical phrasal or hierarchical translation system usually relies on a huge set of translation rules extracted from bi-lingual training data. This approach not only results in space and efficiency issues, but also suffers from the sparse data problem. In this paper, we propose to use factorized grammars, an idea widely accepted in the field of linguistic grammar construction, to generalize translation rules, so as to solve these two problems. We designed a method to take advantage of the XTAG English Grammar to facilitate the extraction of factorized rules. We experimented on various setups of low-resource language translation, and showed consistent significant improvement in BLEU over state-of-the-art string-to-dependency baseline systems with 200K words of bi-lingual training data.
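The general idea of factorizing a rule set can be illustrated with a toy sketch. This is not the paper's extraction method (which, per the abstract, relies on the XTAG English Grammar); the rules, the `transitive` family label, the `<w>` anchor placeholder, and the function names `factorize` and `expand` are all hypothetical and chosen only for illustration. The sketch splits fully lexicalized rules into a small lexicon plus a set of shared templates, then shows that recombining them covers a rule that was never observed.

```python
from collections import defaultdict

ANCHOR = "<w>"  # hypothetical placeholder marking where the lexical item goes

# Invented toy rules: (source word, target word, family, source side, target side).
# "X1"/"X2" stand for gaps filled by sub-translations.
RULES = [
    ("likes", "aime", "transitive", ("X1", "likes", "X2"), ("X1", "aime", "X2")),
    ("likes", "aime", "transitive", ("likes", "X1"), ("aime", "X1")),
    ("sees", "voit", "transitive", ("X1", "sees", "X2"), ("X1", "voit", "X2")),
]


def factorize(rules):
    """Split lexicalized rules into a lexicon and per-family templates."""
    lexicon = set()
    templates = defaultdict(set)
    for src_word, tgt_word, family, src_side, tgt_side in rules:
        lexicon.add((src_word, tgt_word, family))
        # Replace the lexical item with the anchor to obtain a shared template.
        src_tmpl = tuple(ANCHOR if tok == src_word else tok for tok in src_side)
        tgt_tmpl = tuple(ANCHOR if tok == tgt_word else tok for tok in tgt_side)
        templates[family].add((src_tmpl, tgt_tmpl))
    return lexicon, templates


def expand(lexicon, templates):
    """Pair every lexicon entry with every template of its family."""
    out = set()
    for src_word, tgt_word, family in lexicon:
        for src_tmpl, tgt_tmpl in templates[family]:
            src_side = tuple(src_word if tok == ANCHOR else tok for tok in src_tmpl)
            tgt_side = tuple(tgt_word if tok == ANCHOR else tok for tok in tgt_tmpl)
            out.add((src_word, tgt_word, family, src_side, tgt_side))
    return out


lexicon, templates = factorize(RULES)
expanded = expand(lexicon, templates)
```

In this toy setting the three observed rules reduce to two lexicon entries and two templates, and expansion also yields a `sees`/`voit` rule with the shorter pattern, which was absent from the input. That is the sense in which factorization can address both storage and data sparsity, under the assumption that words in the same family really do share templates.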

Libin Shen, Bing Zhang, Spyros Matsoukas, Jinxi Xu, Ralph M. Weischedel

Bi-lingual Training Data | EMNLP 2010 | Modern Machine Translation | Natural Language Processing | Translation Rules |

» Grammar Comparison Study for Translational Equivalence Modeling and Statistical Machine Tr...

» Using a Grammar Checker for Evaluation and Postprocessing of Statistical Machine Translati...

» Dependency-Based Bracketing Transduction Grammar for Statistical Machine Translation

» Statistical Machine Translation by Parsing

» Constituent Reordering and Syntax Models for English-to-Japanese Statistical Machine Transla...

» Fixed Length Word Suffix for Factored Statistical Machine Translation

» Supertagged Phrase-Based Statistical Machine Translation

» Segmentation for English-to-Arabic Statistical Machine Translation

Post Info

Added	11 Feb 2011
Updated	11 Feb 2011
Type	Conference
Year	2010
Where	EMNLP
Authors	Libin Shen, Bing Zhang, Spyros Matsoukas, Jinxi Xu, Ralph M. Weischedel
