Sciweavers

EMNLP
2010

Crouching Dirichlet, Hidden Markov Model: Unsupervised POS Tagging with Context Local Tag Generation

13 years 2 months ago
Crouching Dirichlet, Hidden Markov Model: Unsupervised POS Tagging with Context Local Tag Generation
We define the crouching Dirichlet, hidden Markov model (CDHMM), an HMM for partof-speech tagging which draws state prior distributions for each local document context. This simple modification of the HMM takes advantage of the dichotomy in natural language between content and function words. In contrast, a standard HMM draws all prior distributions once over all states and it is known to perform poorly in unsupervised and semisupervised POS tagging. This modification significantly improves unsupervised POS tagging performance across several measures on five data sets for four languages. We also show that simply using different hyperparameter values for content and function word states in a standard HMM (which we call HMM+) is surprisingly effective.
Taesun Moon, Katrin Erk, Jason Baldridge
Added 11 Feb 2011
Updated 11 Feb 2011
Type Journal
Year 2010
Where EMNLP
Authors Taesun Moon, Katrin Erk, Jason Baldridge
Comments (0)