Discriminatively estimated discrete, parametric and smoothed-discrete duration models for speech recognition

The durations of phonemic segments provide important cues for distinguishing words in languages such as Arabic. Recently, we proposed a discriminatively estimated joint acoustic, duration and language model for large vocabulary speech recognition [1]. In that work, we found simple discrete models to be effective for modeling duration, although they were neither smoothed nor parsimonious. These limitations are addressed here with two alternative models: parametric and smoothed-discrete models. Unlike previous work on parametric duration models, we estimate their parameters discriminatively and derive an analytical expression for estimating the parameters of a log-normal distribution using a recent approach [2]. On a large vocabulary Arabic task, we empirically evaluated different segmental units and duration models. Our results show that bigrams of clustered states modeled with smoothed-discrete duration models are more accurate and efficient than the other models considered.
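To make the three model families named in the abstract concrete, here is a minimal Python sketch. It is not the paper's method: the paper estimates parameters discriminatively, whereas this sketch uses plain relative-frequency and maximum-likelihood estimates. The additive smoothing scheme, the function names, the `max_dur` cap, and the convention that durations are positive integer frame counts are all assumptions made for illustration.

```python
import math
from collections import Counter


def fit_discrete(durations, max_dur):
    """Unsmoothed discrete model: relative frequency of each duration 1..max_dur.

    Durations longer than max_dur are clipped into the last bin (an assumption).
    """
    counts = Counter(min(d, max_dur) for d in durations)
    total = len(durations)
    return [counts.get(d, 0) / total for d in range(1, max_dur + 1)]


def fit_smoothed_discrete(durations, max_dur, alpha=1.0):
    """Discrete model with additive smoothing (assumed scheme, not the paper's).

    Every bin receives alpha pseudo-counts, so unseen durations get
    non-zero probability.
    """
    counts = Counter(min(d, max_dur) for d in durations)
    total = len(durations) + alpha * max_dur
    return [(counts.get(d, 0) + alpha) / total for d in range(1, max_dur + 1)]


def fit_lognormal(durations):
    """Maximum-likelihood log-normal fit: mean and variance of log-durations.

    This two-parameter model is the parsimonious alternative to one
    parameter per duration bin.
    """
    logs = [math.log(d) for d in durations]
    mu = sum(logs) / len(logs)
    var = sum((x - mu) ** 2 for x in logs) / len(logs)
    return mu, var


def lognormal_pdf(d, mu, var):
    """Log-normal density at duration d > 0 with log-mean mu and log-variance var."""
    z = (math.log(d) - mu) ** 2 / (2.0 * var)
    return math.exp(-z) / (d * math.sqrt(2.0 * math.pi * var))


# Hypothetical segment durations in frames.
frames = [3, 4, 4, 5, 9]
discrete = fit_discrete(frames, max_dur=10)
smoothed = fit_smoothed_discrete(frames, max_dur=10)
mu, var = fit_lognormal(frames)
```

The unsmoothed model assigns zero probability to any duration absent from the training data, which is the limitation the abstract attributes to simple discrete models. The smoothed and parametric variants both avoid it, at the cost of either pseudo-counts or a fixed distributional shape.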

Maider Lehr, Izhak Shafran

Duration Models | ICASSP 2011 | Large Vocabulary | Signal Processing | Smoothed-discrete Duration Models |

Added	21 Aug 2011
Updated	21 Aug 2011
Type	Journal
Year	2011
Where	ICASSP
Authors	Maider Lehr, Izhak Shafran
