Enhancing Statistical Machine Translation with Character Alignment


The dominant practice in statistical machine translation (SMT) uses the same Chinese word segmentation specification for both the alignment and translation rule induction steps when building a Chinese-English SMT system. This can be suboptimal, because a word segmentation that is better for alignment is not necessarily better for translation. To address this, we propose a framework that uses two different segmentation specifications for alignment and translation respectively: we use the Chinese character as the basic unit for alignment, and then convert this alignment to a conventional word alignment for translation rule induction. Experimentally, our approach outperformed two baselines, a fully word-based system (using words for both alignment and translation) and a fully character-based system, in terms of alignment quality and translation performance.
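The conversion step described in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the exact conversion rule, so this example assumes the simplest one: a Chinese word is linked to an English word whenever any of its characters is linked to that English word. The function name `char_to_word_alignment` and the example sentence are hypothetical and not taken from the paper.

```python
from typing import List, Set, Tuple


def char_to_word_alignment(
    words: List[str],
    char_links: Set[Tuple[int, int]],
) -> Set[Tuple[int, int]]:
    """Project character-level alignment links onto segmented words.

    words:      the Chinese sentence as a list of segmented words.
    char_links: (chinese_char_index, english_word_index) pairs.
    Returns (chinese_word_index, english_word_index) pairs.

    Assumed rule (not confirmed by the abstract): a word inherits
    every link held by any of its characters.
    """
    # Map each character position to the index of the word containing it.
    char_to_word: List[int] = []
    for word_idx, word in enumerate(words):
        char_to_word.extend([word_idx] * len(word))

    word_links: Set[Tuple[int, int]] = set()
    for char_idx, eng_idx in char_links:
        if not 0 <= char_idx < len(char_to_word):
            raise ValueError(f"character index {char_idx} is out of range")
        word_links.add((char_to_word[char_idx], eng_idx))
    return word_links


# Hypothetical example: "机器 翻译" aligned to "machine translation".
words = ["机器", "翻译"]
char_links = {(0, 0), (1, 0), (2, 1), (3, 1)}
print(sorted(char_to_word_alignment(words, char_links)))
# prints [(0, 0), (1, 1)]
```

Because the links are collected in a set, several characters of one word pointing at the same English word collapse into a single word-level link, which is the form a conventional rule induction step expects.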

Ning Xi, Guangchao Tang, Xinyu Dai, Shujian Huang, Jiajun Chen


ACL 2012 | Computational Linguistics | Statistical Machine Translation | Word Alignment | Word Segmentation |


» Enhancing the Bilingual Concordancer TransSearch with Word-Level Alignment

» Statistical Machine Translation of German Compound Words

» A Comparison of Alignment Models for Statistical Machine Translation

» Segmentation and alignment of parallel text for statistical machine translation

» Modeling with Structures in Statistical Machine Translation

» Weighted Alignment Matrices for Statistical Machine Translation

» Given Bilingual Terminology in Statistical Machine Translation MWE-Sensitive Word Alignment ...

» Statistical Machine Translation with Word- and Sentence-Aligned Parallel Corpora

Post Info

Added	29 Sep 2012
Updated	29 Sep 2012
Type	Conference
Year	2012
Where	ACL
Authors	Ning Xi, Guangchao Tang, Xinyu Dai, Shujian Huang, Jiajun Chen


Sciweavers
