Comparison of modulation features for phoneme recognition

13 years 4 months ago

Download www.clsp.jhu.edu

In this paper, we compare several approaches for extracting modulation frequency features from the speech signal using a phoneme recognition system. The general framework in these approaches is to decompose the speech signal into a set of sub-bands. Amplitude modulations (AM) in the sub-band signals are used to derive features for automatic speech recognition (ASR). Then, we propose a feature extraction technique which uses autoregressive (AR) models of sub-band Hilbert envelopes over relatively long segments of the speech signal. AR models of Hilbert envelopes are derived using frequency domain linear prediction (FDLP). Features are formed by converting the FDLP envelopes into static and dynamic modulation frequency components. In the phoneme recognition experiments using the TIMIT database, the FDLP-based modulation frequency features provide significant improvements compared to other techniques (average relative improvement of 7.5% over the baseline features). Furthermore, a detaile...
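The FDLP step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `fdlp_envelope`, the model order, and the full-band (no sub-band decomposition) setup are assumptions made for illustration. The idea is that linear prediction applied to the DCT of a signal segment yields an AR model whose power response, read along the frequency axis, approximates the squared Hilbert (temporal) envelope of that segment.

```python
import numpy as np
from scipy.fft import dct
from scipy.linalg import solve_toeplitz
from scipy.signal import freqz


def fdlp_envelope(x, order=20):
    """Approximate the squared temporal envelope of x with an AR model
    fitted to its DCT (frequency domain linear prediction).

    Illustrative sketch only; `order` is an assumed value.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)

    # Step 1: move to the frequency domain with a DCT-II.
    y = dct(x, type=2, norm="ortho")

    # Step 2: autocorrelation of the DCT sequence for lags 0..order.
    r = np.correlate(y, y, mode="full")[n - 1 : n + order].copy()
    r[0] *= 1.0 + 1e-9  # tiny regularization for numerical stability

    # Step 3: solve the Yule-Walker (normal) equations for the predictor.
    a = solve_toeplitz(r[:order], r[1 : order + 1])
    gain = r[0] - np.dot(a, r[1 : order + 1])

    # Step 4: evaluate the AR power response on n points over [0, pi).
    # The "frequency" axis of this response corresponds to time 0..n-1.
    ar_poly = np.concatenate(([1.0], -a))
    _, h = freqz([1.0], ar_poly, worN=n)
    return gain * np.abs(h) ** 2


if __name__ == "__main__":
    # Noise burst centred at sample 700 of a 1000-sample segment.
    rng = np.random.default_rng(0)
    t = np.arange(1000)
    burst = rng.standard_normal(1000) * np.exp(-0.5 * ((t - 700) / 30.0) ** 2)
    env = fdlp_envelope(burst, order=20)
    print("envelope peak near sample", int(np.argmax(env)))
```

In a setup like the one the abstract describes, a function of this kind would be applied to each sub-band separately, and the resulting envelopes would then be converted into modulation frequency features; those later stages are not shown here.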

Sriram Ganapathy, Samuel Thomas, Hynek Hermansky
