Using Reordering in Statistical Machine Translation based on Alignment Block Classification

Statistical Machine Translation (SMT) is based on alignment models which learn the word correspondences between source and target language from bilingual corpora. These models are assumed to be capable of learning reorderings. However, the difference in word order between two languages is one of the most important sources of errors in SMT. In this paper, we show that SMT can take advantage of inductive learning in order to solve reordering problems. Given a word alignment, we identify those pairs of consecutive source blocks (sequences of words) whose translation is swapped, i.e. those blocks which, if swapped, generate a correct monotone translation. Afterwards, we classify these pairs into groups, recursively following a block co-occurrence criterion, in order to infer reorderings. Inside the same group, we allow new internal combinations in order to generalize the reordering to unseen pairs of blocks. Then, we identify the pairs of blocks in the source corpora (both training and test)...
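
The block-identification step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `swapped_block_pairs`, the `max_len` limit on block length, and the representation of the alignment as a set of (source index, target index) pairs are all assumptions made for illustration. The sketch marks two adjacent source blocks as swapped when every target word linked to the second block precedes every target word linked to the first.

```python
def swapped_block_pairs(alignment, src_len, max_len=3):
    """Find adjacent source blocks whose target sides appear in reverse order.

    Hypothetical helper, not taken from the paper.
    alignment: iterable of (source_index, target_index) links, 0-based.
    Returns a list of (a_start, a_end, b_start, b_end) with inclusive bounds.
    """
    # Map each source position to the target positions it is linked to.
    links = {}
    for s, t in alignment:
        links.setdefault(s, []).append(t)

    def target_span(start, end):
        # Smallest and largest target index linked to source[start..end].
        tgt = [t for s in range(start, end + 1) for t in links.get(s, [])]
        return (min(tgt), max(tgt)) if tgt else None

    pairs = []
    for a_start in range(src_len):
        for a_end in range(a_start, min(a_start + max_len, src_len)):
            b_start = a_end + 1
            for b_end in range(b_start, min(b_start + max_len, src_len)):
                a_span = target_span(a_start, a_end)
                b_span = target_span(b_start, b_end)
                if a_span is None or b_span is None:
                    continue  # skip blocks with no alignment links
                # Swapped: all of block B's target words precede block A's.
                if b_span[1] < a_span[0]:
                    pairs.append((a_start, a_end, b_start, b_end))
    return pairs


# Illustrative example (assumed, not from the paper):
# source "la casa blanca", target "the white house".
example = {(0, 0), (1, 2), (2, 1)}
print(swapped_block_pairs(example, src_len=3))
```

In this toy example the blocks "casa" and "blanca" are reported as a swapped pair, since exchanging them on the source side would make the alignment monotone. The classification and generalization steps of the abstract are not covered by this sketch.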

Marta R. Costa-Jussà, José A. R. Fonollosa, Enric Monte

Consecutive Source Blocks | Correct Monotone Translation | Education | LREC 2008 | Statistical Machine Translation

» Segmentation and alignment of parallel text for statistical machine translation

» A Ranking-based Approach to Word Reordering for Statistical Machine Translation

» A Clustered Global Phrase Reordering Model for Statistical Machine Translation

» Syntax-based reordering for statistical machine translation

» Statistical Machine Reordering

» Handling phrase reorderings for machine translation

» Combining Phrase-Based and Template-Based Alignment Models in Statistical Translation

» A Direct Syntax-Driven Reordering Model for Phrase-Based Machine Translation

Post Info

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2008
Where	LREC
Authors	Marta R. Costa-Jussà, José A. R. Fonollosa, Enric Monte
