Sciweavers

COLING
2000

Structural Feature Selection For English-Korean Statistical Machine Translation

When aligning texts in very different languages such as Korean and English, structural features beyond the word or phrase level give useful information. In this paper, we present a method for selecting structural features of two languages, from which we construct a model that assigns conditional probabilities to corresponding tag sequences in bilingual English-Korean corpora. For tag sequence mapping between two languages, we first define a structural feature function which represents statistical properties of the empirical distribution of a set of training samples. The system, based on the maximum entropy concept, selects only features that produce high increases in the log-likelihood of the training samples. These structurally mapped features are more informative knowledge for statistical machine translation between English and Korean. Also, the information can help to reduce the parameter space of statistical alignment by eliminating syntactically unlikely alignments.
Seonho Kim, Juntae Yoon, Mansuk Song
Added 01 Nov 2010
Updated 01 Nov 2010
Type Conference
Year 2000
Where COLING
Authors Seonho Kim, Juntae Yoon, Mansuk Song
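The selection criterion named in the abstract (keep only features that yield a high increase in training log-likelihood) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simplified unconditional model over a small finite event space, binary feature functions, and the standard closed-form approximate gain for adding a single binary feature to a maximum entropy model, which equals the KL divergence between the feature's empirical and model expectations. The tag sequences, probabilities, and feature names are hypothetical.

```python
import math


def expected_value(dist, feature):
    """Expectation of a binary feature under a distribution {event: prob}."""
    return sum(prob * feature(event) for event, prob in dist.items())


def approximate_gain(p_f, q_f):
    """Log-likelihood gain from adding one binary feature to the model.

    p_f: empirical expectation of the feature (from training samples).
    q_f: expectation of the feature under the current model.
    With only the new feature's weight tuned, the optimal gain is the
    KL divergence between Bernoulli(p_f) and Bernoulli(q_f).
    """
    if not 0.0 < q_f < 1.0:
        raise ValueError("model expectation must lie strictly between 0 and 1")
    gain = 0.0
    if p_f > 0.0:
        gain += p_f * math.log(p_f / q_f)
    if p_f < 1.0:
        gain += (1.0 - p_f) * math.log((1.0 - p_f) / (1.0 - q_f))
    return gain


def rank_features(empirical, model, features):
    """Return (gain, name) pairs sorted from highest to lowest gain."""
    scored = [
        (approximate_gain(expected_value(empirical, f), expected_value(model, f)), name)
        for name, f in features.items()
    ]
    return sorted(scored, reverse=True)


# Hypothetical events: (English tag sequence, Korean tag sequence) pairs.
empirical = {
    ("DT NN", "NN"): 0.5,
    ("DT NN", "NN PP"): 0.1,
    ("VB NN", "NN VB"): 0.3,
    ("VB NN", "VB NN"): 0.1,
}
# Assumed starting model: uniform over the same four events.
model = {event: 0.25 for event in empirical}

# Hypothetical candidate structural features (binary indicators).
features = {
    "det_noun_to_noun": lambda e: 1 if e == ("DT NN", "NN") else 0,
    "verb_object_swap": lambda e: 1 if e == ("VB NN", "NN VB") else 0,
    "english_starts_with_dt": lambda e: 1 if e[0].startswith("DT") else 0,
}

ranking = rank_features(empirical, model, features)
for gain, name in ranking:
    print(f"{name}: {gain:.4f}")
```

A selection loop under these assumptions would add the top-ranked feature, refit the model, and repeat until no candidate's gain exceeds a chosen threshold. A conditional model over tag sequence pairs, as described in the abstract, would need the gain computed per conditioning context.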