Sciweavers

INTERSPEECH
2010

Semi-supervised part-of-speech tagging in speech applications

12 years 11 months ago
Semi-supervised part-of-speech tagging in speech applications
When no training or adaptation data is available, semisupervised training is a good alternative for processing new domains. We perform Bayesian training of a part-of-speech (POS) tagger from unannotated text and a dictionary of possible tags for each word. We complement that method with supervised prediction of possible tags for out-of-vocabulary words and study the impact of both semi-supervision and starting dictionary size on three representative downstream tasks (named entity tagging, semantic role labeling, ASR output postprocessing) that use POS tags as features. The outcome is no impact or a small decrease in performance compared to using a fully supervised tagger, with even potential gains in case of domain mismatch for the supervised tagger. Tasks that trust the tags completely (like ASR post-processing) are more affected by a reduction of the starting dictionary, but still yield positive outcome.
Richard Dufour, Benoît Favre
Added 18 May 2011
Updated 18 May 2011
Type Journal
Year 2010
Where INTERSPEECH
Authors Richard Dufour, Benoît Favre
Comments (0)