Sciweavers

EMNLP
2004

Comparing and Combining Generative and Posterior Probability Models: Some Advances in Sentence Boundary Detection in Speech

We compare and contrast two different models for detecting sentence-like units in continuous speech. The first approach uses hidden Markov sequence models based on N-grams and maximum likelihood estimation, and employs model interpolation to combine different representations of the data. The second approach models the posterior probabilities of the target classes; it is discriminative and integrates multiple knowledge sources in the maximum entropy (maxent) framework. Both models combine lexical, syntactic, and prosodic information. We develop a technique for integrating pretrained probability models into the maxent framework, and show that this approach can improve on an HMM-based state-of-the-art system for the sentence-boundary detection task. An even more substantial improvement is obtained by combining the posterior probabilities of the two systems.
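The abstract does not say how the posterior probabilities of the two systems are combined. As a rough illustration only, the sketch below assumes a simple linear interpolation of per-word-boundary posteriors, followed by a fixed decision threshold. The function names, the interpolation weight, the threshold, and the example scores are all hypothetical and are not taken from the paper.

```python
def combine_posteriors(p_hmm, p_maxent, weight=0.5):
    """Linearly interpolate two posterior estimates per word boundary.

    p_hmm, p_maxent: sequences of P(boundary | evidence) from each system,
    aligned so that index i refers to the same word boundary in both.
    weight: hypothetical interpolation weight given to the HMM posterior.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    if len(p_hmm) != len(p_maxent):
        raise ValueError("both systems must score the same word boundaries")
    return [weight * a + (1.0 - weight) * b for a, b in zip(p_hmm, p_maxent)]


def detect_boundaries(posteriors, threshold=0.5):
    """Mark a sentence boundary wherever the posterior reaches the threshold."""
    return [p >= threshold for p in posteriors]


# Made-up posteriors for three word boundaries (illustrative values only).
p_hmm = [0.9, 0.2, 0.6]
p_maxent = [0.7, 0.1, 0.3]

combined = combine_posteriors(p_hmm, p_maxent, weight=0.5)
print(detect_boundaries(combined))
```

In practice the weight and threshold would presumably be tuned on held-out data; the values here are placeholders.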
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2004
Where EMNLP
Authors Yang Liu, Andreas Stolcke, Elizabeth Shriberg, Mary P. Harper