
INTERSPEECH
2010

Mask estimation in non-stationary noise environments for missing feature based robust speech recognition

In missing feature based automatic speech recognition (ASR), the role of the spectro-temporal mask in providing an accurate description of the relationship between target speech and environmental noise is critical for minimizing the degradation in ASR word accuracy (WAC) as the signal-to-noise ratio (SNR) decreases. This paper demonstrates the importance of accurate characterization of the instantaneous acoustic background for mask estimation in data imputation approaches to missing feature based ASR, especially in the presence of non-stationary background noise. Mask estimation relies on a hypothesis test designed to detect the presence of speech in time-frequency spectral bins under rapidly varying noise conditions. Masked mel-frequency filter bank energies are reconstructed using a minimum mean squared error (MMSE) based data imputation procedure. The impact of this mask estimation approach is evaluated in the context of MMSE based data imputation under multiple background conditions ov...
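As a rough illustration of the two steps the abstract names (a per-bin speech presence test, then MMSE based imputation of the masked bins), here is a minimal Python sketch. It is not the authors' method: the local SNR threshold test, the single Gaussian prior with independent channels, and every function and parameter name (`estimate_mask`, `bounded_mmse`, `impute`, `threshold_db`) are assumptions made for illustration. The noise estimate is taken as an input per frame, standing in for whatever instantaneous background estimate the paper uses.

```python
import math


def estimate_mask(noisy, noise, threshold_db=3.0):
    """Binary mask per time-frequency bin: 1 = speech judged present.

    Assumption: a simple local-SNR threshold stands in for the paper's
    hypothesis test. `noisy` and `noise` are lists of frames, each a list
    of filter bank energies (linear scale); `noise` is a per-frame
    estimate so that it can follow non-stationary backgrounds.
    """
    mask = []
    for y_frame, n_frame in zip(noisy, noise):
        row = []
        for y, n in zip(y_frame, n_frame):
            snr_db = 10.0 * math.log10(max(y, 1e-12) / max(n, 1e-12))
            row.append(1 if snr_db > threshold_db else 0)
        mask.append(row)
    return mask


def bounded_mmse(observed, mean, std):
    """E[x | x <= observed] for x ~ N(mean, std^2).

    Assumption: clean log energy cannot exceed the noisy observation,
    so the MMSE estimate is the mean of a Gaussian truncated from above.
    """
    beta = (observed - mean) / std
    pdf = math.exp(-0.5 * beta * beta) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(beta / math.sqrt(2.0)))
    estimate = mean - std * pdf / max(cdf, 1e-12)
    # Guard against numerical underflow in the far tail.
    return min(observed, estimate)


def impute(log_energy, mask, prior_mean, prior_std):
    """Keep reliable bins; replace masked bins with the bounded estimate."""
    out = []
    for frame, m_frame in zip(log_energy, mask):
        out.append([
            x if m == 1 else bounded_mmse(x, mu, sd)
            for x, m, mu, sd in zip(frame, m_frame, prior_mean, prior_std)
        ])
    return out
```

In this sketch a bin with 10 dB local SNR is kept as reliable, while a bin at 0 dB is masked and replaced by a value no larger than the observation. A practical system would typically use a mixture prior with correlated channels, which this example deliberately omits.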
Shirin Badiezadegan, Richard C. Rose
Added 18 May 2011
Updated 18 May 2011
Type Journal
Year 2010
Where INTERSPEECH
Authors Shirin Badiezadegan, Richard C. Rose