NAACL
2004

Name Tagging with Word Clusters and Discriminative Training

We present a technique for augmenting annotated training data with hierarchical word clusters that are automatically derived from a large unannotated corpus. Cluster membership is encoded in features that are incorporated in a discriminatively trained tagging model. Active learning is used to select training examples. We evaluate the technique for named-entity tagging. Compared with a state-of-the-art HMM-based name finder, the presented technique requires only 13% as much annotated data to achieve the same level of performance. Given a large annotated training set of 1,000,000 words, the technique achieves a 25% reduction in error over the state-of-the-art HMM trained on the same material.
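The abstract says cluster membership is encoded in features but does not say how. The following is a minimal sketch of one common encoding for hierarchical clusters, not necessarily the authors' own: each word is assumed to have a bit-string path in a binary cluster tree, and prefixes of that path at several lengths become features, so that coarse and fine clusters are both visible to the tagger. The table `CLUSTER_PATHS`, its bit strings, the prefix lengths, and the function name `cluster_features` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence

# Hypothetical cluster table: word -> bit-string path from the root of a
# binary cluster hierarchy. These paths are invented for illustration.
CLUSTER_PATHS: Dict[str, str] = {
    "boston": "1011001",
    "chicago": "1011000",
    "monday": "0110",
}

# Assumed prefix lengths; the abstract does not specify any.
PREFIX_LENGTHS: Sequence[int] = (2, 4, 6)


def cluster_features(
    word: str,
    paths: Dict[str, str] = CLUSTER_PATHS,
    prefix_lengths: Sequence[int] = PREFIX_LENGTHS,
) -> List[str]:
    """Return one feature string per cluster-path prefix for a word."""
    path = paths.get(word.lower())
    if path is None:
        # Word was not seen in the (hypothetical) clustering corpus.
        return ["cluster=UNK"]
    # Shorter prefixes name coarser clusters; skip lengths beyond the path.
    return [f"cluster[:{n}]={path[:n]}" for n in prefix_lengths if len(path) >= n]


if __name__ == "__main__":
    for w in ("Boston", "Chicago", "Monday", "Zamboni"):
        print(w, cluster_features(w))
```

In this sketch "Boston" and "Chicago" receive identical features at every listed prefix length because their invented paths differ only in the final bit. Sharing features in this way is how a tagger could generalize from an annotated word to an unannotated one in the same cluster.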
Scott Miller, Jethran Guinness, Alex Zamanian
Added 31 Oct 2010
Updated 31 Oct 2010
Type Conference
Year 2004
Where NAACL