LREC
2008

Improving Statistical Machine Translation Efficiency by Triangulation

Download www.lrec-conf.org

In current phrase-based statistical machine translation systems, more training data is generally better than less. However, a larger data set yields a larger model, which enlarges the decoder's search space and therefore requires more time and resources to translate. This paper describes an attempt to reduce the model size by filtering out the less probable entries, identified by testing their correlation with additional training data in an intermediate third language. The central idea behind the approach is triangulation, the process of incorporating multilingual knowledge into a single system, which makes use of parallel corpora available in more than two languages. We conducted experiments on the Europarl corpus to evaluate the approach. The model size can be reduced by up to 70% while translation quality is preserved.
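The filtering idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact criterion, so the sketch assumes one simple reading, in which a source-target phrase pair is kept only if some phrase in the intermediate language is paired with both sides. The function name `filter_by_triangulation`, the table layouts, and the toy German, French, and English entries are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def filter_by_triangulation(src_tgt, src_pivot, pivot_tgt):
    """Keep source-target entries that share at least one pivot phrase.

    src_tgt:   dict mapping (source, target) -> probability
    src_pivot: iterable of (source, pivot) phrase pairs
    pivot_tgt: iterable of (pivot, target) phrase pairs

    Assumption: "supported by the third language" means that a common
    pivot phrase exists. The paper's actual test may differ.
    """
    pivots_of_source = defaultdict(set)
    for source, pivot in src_pivot:
        pivots_of_source[source].add(pivot)

    pivots_of_target = defaultdict(set)
    for pivot, target in pivot_tgt:
        pivots_of_target[target].add(pivot)

    filtered = {}
    for (source, target), prob in src_tgt.items():
        shared = pivots_of_source.get(source, set()) & pivots_of_target.get(target, set())
        if shared:  # at least one pivot phrase links both sides
            filtered[(source, target)] = prob
    return filtered


# Toy data, invented for illustration (German -> English, French as pivot).
src_tgt = {
    ("haus", "house"): 0.7,
    ("haus", "home"): 0.2,
    ("haus", "the"): 0.1,  # noisy entry with no support through the pivot
}
src_pivot = [("haus", "maison")]
pivot_tgt = [("maison", "house"), ("maison", "home"), ("le", "the")]

filtered = filter_by_triangulation(src_tgt, src_pivot, pivot_tgt)
reduction = 1 - len(filtered) / len(src_tgt)
print(sorted(filtered))
print(f"model size reduced by {reduction:.0%}")
```

In this toy table the unsupported pair is dropped and the two supported pairs keep their original probabilities. A real system would likely use a softer, probability-based test rather than a strict existence check.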

Yu Chen, Andreas Eisele, Martin Kay

Post Info

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2008
Where	LREC
Authors	Yu Chen, Andreas Eisele, Martin Kay
