Sciweavers


ACL
2004



Statistical Machine Translation with Word- and Sentence-Aligned Parallel Corpora




The parameters of statistical translation models are typically estimated from sentence-aligned parallel corpora. We show that significant improvements in the alignment and translation quality of such models can be achieved by additionally including word-aligned data during training. Incorporating word-level alignments into the parameter estimation of the IBM models reduces alignment error rate and increases the Bleu score when compared to training the same models only on sentence-aligned data. On the Verbmobil data set, we attain a 38% reduction in the alignment error rate and a higher Bleu score with half as many training examples. We discuss how varying the ratio of word-aligned to sentence-aligned data affects the expected performance gain.
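For readers unfamiliar with the metric, the sketch below shows one common way to compute alignment error rate (AER). The abstract does not say which definition the authors use; this assumes the widely used formulation with "sure" (S) and "possible" (P) reference links, AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|). The function name, the link representation, and the toy data are illustrative assumptions, not taken from the paper.

```python
from typing import Set, Tuple

# A link pairs a source word index with a target word index.
Link = Tuple[int, int]


def alignment_error_rate(hypothesis: Set[Link],
                         sure: Set[Link],
                         possible: Set[Link]) -> float:
    """AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|).

    Assumed definition: sure links count as possible links too, so the
    possible set is widened to include them before scoring.
    """
    possible = possible | sure
    denominator = len(hypothesis) + len(sure)
    if denominator == 0:
        # No hypothesis links and no sure links: nothing to get wrong.
        return 0.0
    matched = len(hypothesis & sure) + len(hypothesis & possible)
    return 1.0 - matched / denominator


# Toy sentence pair (hypothetical data, for illustration only).
sure = {(0, 0), (1, 1)}
possible = {(0, 0), (1, 1), (2, 1)}
hypothesis = {(0, 0), (2, 1), (2, 2)}

aer = alignment_error_rate(hypothesis, sure, possible)
print(f"AER = {aer:.2f}")  # prints "AER = 0.40"
```

Lower is better: a hypothesis identical to the sure links scores 0. Under this definition, a relative reduction such as the 38% reported above would be measured between the AER of the baseline model and that of the model trained with the additional word-aligned data.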

Chris Callison-Burch, David Talbot, Miles Osborne




Related Content

» Creating Sentence-Aligned Parallel Text Corpora from a Large Archive of Potential Parallel ...

» Improving the accuracy of English-Arabic statistical sentence alignment

» Improved Sentence Alignment on Parallel Web Pages Using a Stochastic Tree Alignment Model

» Sentence Alignment of Hungarian-English Parallel Corpora Using a Hybrid Algorithm

» A Robust Cross-Style Bilingual Sentences Alignment Model

» Fast-Champollion: A Fast and Robust Sentence Alignment Algorithm

» Exploiting Parallel Treebanks to Improve Phrase-Based Statistical Machine Translation

» Word Alignment Annotation in a Japanese-Chinese Parallel Corpus

» Improved Statistical Machine Translation Using Monolingually-Derived Paraphrases


Added	30 Oct 2010
Updated	30 Oct 2010
Type	Conference
Year	2004
Where	ACL
Authors	Chris Callison-Burch, David Talbot, Miles Osborne
