A Large Scale Distributed Syntactic, Semantic and Lexical Language Model for Machine Translation

This paper presents an attempt at building a large scale distributed composite language model that simultaneously accounts for local word lexical information, mid-range sentence syntactic structure, and long-span document semantic content under a directed Markov random field paradigm. The composite language model has been trained by performing a convergent N-best list approximate EM algorithm that has linear time complexity and a follow-up EM algorithm to improve word prediction power on corpora with up to a billion tokens and stored on a supercomputer. The large scale distributed composite language model gives drastic perplexity reduction over n-grams and achieves significantly better translation quality measured by the BLEU score and “readability” when applied to the task of re-ranking the N-best list from a state-of-the-art parsing-based machine translation system.
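
The two evaluation ideas named in the abstract, perplexity and N-best list re-ranking, can be illustrated with a minimal Python sketch. This is not the authors' system: the function names, the interpolation weight `lm_weight`, and the toy lookup table standing in for a language model are all assumptions made for illustration only.

```python
import math

def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

def rerank(nbest, lm_log_prob, lm_weight=0.5):
    """Sort (hypothesis, decoder_score) pairs by decoder score plus a
    weighted language-model log-probability, best first.
    lm_weight is a hypothetical tuning parameter."""
    return sorted(
        nbest,
        key=lambda hyp: hyp[1] + lm_weight * lm_log_prob(hyp[0]),
        reverse=True,
    )

# Toy stand-in for a language model: a fixed table of sentence log-probabilities.
toy_lm = {"the cat sat": -2.0, "cat the sat": -9.0}

# N-best list from a hypothetical decoder: (hypothesis, decoder log-score).
nbest = [("cat the sat", -1.0), ("the cat sat", -1.5)]

best = rerank(nbest, toy_lm.get)[0][0]
```

In this toy case the decoder alone prefers the first hypothesis, while the combined score moves the more fluent one to the top; a stronger language model (lower perplexity) would be expected to make such re-ranking decisions more reliable.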

Ming Tan, Wenli Zhou, Lei Zheng, Shaojun Wang

ACL 2011 | Composite Language | Computational Linguistics | Machine Translation System | Time Complexity |

Post Info

Added	23 Aug 2011
Updated	23 Aug 2011
Type	Journal
Year	2011
Where	ACL
Authors	Ming Tan, Wenli Zhou, Lei Zheng, Shaojun Wang
