ICASSP
2011
IEEE

An investigation of subspace modeling for phonetic and speaker variability in automatic speech recognition

This paper investigates the impact of subspace-based techniques for acoustic modeling in automatic speech recognition (ASR). There are many well-known approaches to subspace-based speaker adaptation that represent sources of variability as a projection within a low-dimensional subspace. A new approach to acoustic modeling in ASR, referred to as the subspace-based Gaussian mixture model (SGMM), represents phonetic variability as a set of projections applied at the state level in a hidden Markov model (HMM) based acoustic model. The impact of the SGMM in modeling these intrinsic sources of variability is evaluated for a continuous speech recognition (CSR) task. The SGMM is shown to provide an 18% reduction in word error rate (WER) for speaker-independent (SI) ASR relative to the continuous density HMM (CDHMM) in the Resource Management CSR domain. The SI performance obtained from SGMM also represents a 5% reduction in WER relative to subspace-based speaker adaptation in an unsupervised s...
Richard C. Rose, Shou-Chun Yin, Yun Tang
Added 20 Aug 2011
Updated 20 Aug 2011
Type Journal
Year 2011
Where ICASSP
Authors Richard C. Rose, Shou-Chun Yin, Yun Tang