EMNLP
2009

The infinite HMM for unsupervised PoS tagging

We extend previous work on fully unsupervised part-of-speech tagging. Using a non-parametric version of the HMM, called the infinite HMM (iHMM), we address the problem of choosing the number of hidden states in unsupervised Markov models for PoS tagging. We experiment with two non-parametric priors, the Dirichlet and Pitman-Yor processes, on the Wall Street Journal dataset using a parallelized implementation of an iHMM inference algorithm. We evaluate the results with a variety of clustering evaluation metrics and achieve equivalent or better performance than previously reported. Building on this promising result, we evaluate the output of the unsupervised PoS tagger as a direct replacement for the output of a fully supervised PoS tagger for the task of shallow parsing and compare the two evaluations.
Jurgen Van Gael, Andreas Vlachos, Zoubin Ghahramani
Added 17 Feb 2011
Updated 17 Feb 2011
Type Journal
Year 2009
Where EMNLP
Authors Jurgen Van Gael, Andreas Vlachos, Zoubin Ghahramani