Sciweavers

TASLP
2008

Combining Spectral Representations for Large-Vocabulary Continuous Speech Recognition

13 years 4 months ago
Combining Spectral Representations for Large-Vocabulary Continuous Speech Recognition
In this paper we investigate the combination of complementary acoustic feature streams in large vocabulary continuous speech recognition (LVCSR). We have explored the use of acoustic features obtained using a pitch-synchronous analysis, STRAIGHT, in combination with conventional features such as mel frequency cepstral coefficients. Pitch-synchronous acoustic features are of particular interest when used with vocal tract length normalisation (VTLN) which is known to be affected by the fundamental frequency. We have combined these spectral representations directly at the acoustic feature level using heteroscedastic linear discriminant analysis (HLDA) and at the system level using ROVER. We evaluated this approach on three LVCSR tasks: dictated newspaper text (WSJCAM0), conversational telephone speech (CTS), and multiparty meeting transcription. The CTS and meeting transcription experiments were both evaluated using standard NIST test sets and evaluation protocols. Our results indicate th...
Giulia Garau, Steve Renals
Added 15 Dec 2010
Updated 15 Dec 2010
Type Journal
Year 2008
Where TASLP
Authors Giulia Garau, Steve Renals
Comments (0)