ACL
2008

Unsupervised Multilingual Learning for Morphological Segmentation

For centuries, the deep connection between languages has brought about major discoveries about human communication. In this paper we investigate how this powerful source of information can be exploited for unsupervised language learning. In particular, we study the task of morphological segmentation of multiple languages. We present a nonparametric Bayesian model that jointly induces morpheme segmentations of each language under consideration and at the same time identifies cross-lingual morpheme patterns, or abstract morphemes. We apply our model to three Semitic languages (Arabic, Hebrew, and Aramaic) as well as to English. Our results demonstrate that learning morphological models in tandem reduces error by up to 24% relative to monolingual models. Furthermore, we provide evidence that our joint model achieves better performance when applied to languages from the same family.
Benjamin Snyder, Regina Barzilay
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2008
Where ACL
Authors Benjamin Snyder, Regina Barzilay