Sciweavers

INTERSPEECH
2010

Robust automatic speech recognition with decoder oriented ideal binary mask estimation

In this paper, we propose a jointly optimal method for automatic speech recognition (ASR) and ideal binary mask (IBM) estimation in the cepstral domain, using a newly derived generalized expectation maximization algorithm. First, cepstral-domain missing-feature marginalization is established using a linear transformation, after tying the mean and variance of non-existing cepstral coefficients. Second, IBM estimation is formulated as a generalized expectation maximization algorithm that directly optimizes ASR performance. Experimental results show that even in a highly non-stationary mismatched condition (dance music as background noise), the proposed method achieves absolute ASR accuracy improvements ranging from 14.69% at 0 dB SNR to 40.10% at 15 dB SNR compared with the conventional noise suppression method.
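For readers unfamiliar with the term, an ideal binary mask is usually defined as an oracle: a time-frequency unit is marked reliable (1) when the local speech-to-noise ratio exceeds a threshold, and unreliable (0) otherwise. The sketch below illustrates only that standard oracle definition. It is not the decoder-oriented estimator proposed in the paper, and the function name `ideal_binary_mask`, the parameter `lc_db`, and the toy values are assumptions made for illustration.

```python
import math


def ideal_binary_mask(target_power, noise_power, lc_db=0.0):
    """Oracle mask: 1 where the local SNR in dB exceeds lc_db, else 0.

    target_power and noise_power are equally shaped lists of rows holding
    per-unit power values (hypothetical layout: one row per frame).
    """
    mask = []
    for target_row, noise_row in zip(target_power, noise_power):
        row = []
        for t, n in zip(target_row, noise_row):
            if t <= 0.0:
                # No target energy: the unit cannot be speech dominated.
                row.append(0)
            elif n <= 0.0:
                # Target energy with no noise: treat as reliable.
                row.append(1)
            else:
                snr_db = 10.0 * math.log10(t / n)
                row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask


# Toy example: two frames, two frequency bands (made-up power values).
target = [[4.0, 0.5], [1.0, 9.0]]
noise = [[1.0, 2.0], [1.0, 3.0]]
mask = ideal_binary_mask(target, noise)
```

Computing this mask needs the clean speech and the noise separately, which are unavailable at recognition time; that is why the mask has to be estimated, as the abstract describes.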
Lae-Hoon Kim, Kyung-Tae Kim, Mark Hasegawa-Johnson
Added 18 May 2011
Updated 18 May 2011
Type Journal
Year 2010
Where INTERSPEECH