CLEF
2009
Springer

Unsupervised Word Decomposition with the Promodes Algorithm

We present Promodes, an algorithm for unsupervised word decomposition, which is based on a probabilistic generative model. The model considers segment boundaries as hidden variables and includes probabilities for letter transitions within segments. For the Morpho Challenge 2009, we demonstrate three versions of Promodes. The first one uses a simple segmentation algorithm on a subset of the data and applies maximum likelihood estimates for model parameters when decomposing words of the original language data. The second version estimates its parameters through expectation maximization (EM). A third method is a committee of unsupervised learners where learners correspond to different EM initializations. The solution is found by majority vote which decides whether to segment at a word position or not. In this paper, we describe the probabilistic model, parameter estimation and how the most likely decomposition of an input word is found. We have tested Promodes on non-vowelized and voweliz...
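The committee step described above, in which several learners vote on whether to segment at each word position, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `majority_vote_segment`, the boolean vote representation, and the rule that ties produce no boundary are all assumptions made here for illustration. The abstract does not specify these details.

```python
from typing import List, Sequence


def majority_vote_segment(word: str, votes: Sequence[Sequence[bool]]) -> List[str]:
    """Split `word` where a strict majority of learners vote for a boundary.

    votes[k][i] is True if hypothetical learner k places a boundary
    after word[i], for i in 0 .. len(word) - 2.
    """
    segments = []
    start = 0
    for i in range(len(word) - 1):
        yes = sum(1 for learner in votes if learner[i])
        # Assumption: a strict majority is required; ties yield no boundary.
        if 2 * yes > len(votes):
            segments.append(word[start:i + 1])
            start = i + 1
    # The remainder of the word forms the last segment.
    segments.append(word[start:])
    return segments


# Three hypothetical learners voting on the five positions inside "walked".
example_votes = [
    [False, False, False, True, False],   # boundary after "walk"
    [False, False, False, True, False],   # boundary after "walk"
    [False, False, True, False, False],   # boundary after "wal"
]
result = majority_vote_segment("walked", example_votes)
```

In this toy case two of the three learners agree on a boundary after "walk", so the committee splits the word there and ignores the single dissenting vote. In the setting the abstract describes, each learner would correspond to a different EM initialization.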
Added 08 Nov 2010
Updated 08 Nov 2010
Type Conference
Year 2009
Where CLEF
Authors Sebastian Spiegler, Bruno Golénia, Peter A. Flach