
CICLING
2009
Springer

Enriching Statistical Translation Models Using a Domain-Independent Multilingual Lexical Knowledge Base

This paper presents a method for improving phrase-based Statistical Machine Translation systems by enriching the original translation model with information derived from a multilingual lexical knowledge base. The proposed method exploits the Multilingual Central Repository (a set of linked WordNets for different languages) as a domain-independent knowledge base to provide translation models with new candidate translations for a large set of lexical tokens. Translation probabilities for these tokens are estimated using a set of simple heuristics based on WordNet topology and local context. During decoding, these probabilities are softly integrated so that they can interact with other statistical models. We have applied this type of domain-independent translation modeling to several translation tasks, obtaining a moderate but significant improvement in translation quality that is consistent across a number of standard automatic evaluation metrics. This improvement is especially remarka...
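As an illustration of what "soft" integration during decoding could look like, the sketch below scores translation candidates with a weighted sum of log-probabilities, the log-linear combination commonly used in phrase-based systems. The abstract does not state that the paper uses exactly this form, so treat it as an assumption; the function name, feature names, weights, and probabilities are all hypothetical and chosen only for illustration.

```python
import math


def loglinear_score(features, weights):
    """Weighted sum of log-probabilities for one translation candidate."""
    return sum(weights[name] * math.log(p) for name, p in features.items())


# Hypothetical candidates for a single source token. All numbers are made up;
# "knowledge_base" stands in for a probability estimated from the lexical
# knowledge base, alongside the usual statistical models.
candidates = {
    "bank": {"phrase_table": 0.60, "language_model": 0.20, "knowledge_base": 0.10},
    "shore": {"phrase_table": 0.05, "language_model": 0.30, "knowledge_base": 0.70},
}

# Hypothetical feature weights. In a real system these would be tuned on
# held-out data rather than set by hand.
weights = {"phrase_table": 1.0, "language_model": 1.0, "knowledge_base": 0.5}

# The knowledge-base feature contributes to the score without overriding
# the other models: the highest combined score wins.
best = max(candidates, key=lambda c: loglinear_score(candidates[c], weights))
```

Because the knowledge-base estimate enters as one weighted feature among several, it can raise or lower a candidate's score without forcing a particular translation, which is one reading of the interaction with other statistical models described above.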
Added 24 Nov 2009
Updated 24 Nov 2009
Type Conference
Year 2009
Where CICLING
Authors Miguel García, Jesús Giménez, Lluís Màrquez