Multi-stream parameterization for structural speech recognition


Recently, a novel structural representation of speech was proposed [1, 2], in which the inevitable acoustic variations caused by nonlinguistic factors are effectively removed from speech. This structural representation captures only microphone- and speaker-invariant speech contrasts or dynamics and does not directly use absolute or static acoustic properties such as spectra. In our previous study, the new representation was applied to recognizing a sequence of isolated vowels [3]. The structural models trained with a single speaker outperformed the conventional HMMs trained with more than four thousand speakers, even in the case of noisy speech. We also applied the new models to recognizing utterances of connected vowels [4]. In the current paper, a multiple-stream structuralization method is proposed to improve the performance of the structural recognition framework. With only 8 training speakers, the proposed method shows performance comparable to that of the conventional 4,130...

Satoshi Asakawa, Nobuaki Minematsu, Keikichi Hirose


ICASSP 2008 | Signal Processing | Speaker-invariant Speech Contrasts | Structural Representation | Structural Representation Captures |


Added	30 May 2010
Updated	30 May 2010
Type	Conference
Year	2008
Where	ICASSP
Authors	Satoshi Asakawa, Nobuaki Minematsu, Keikichi Hirose
