Rapid Feature Space Speaker Adaptation for Multi-Stream HMM-Based Audio-Visual Speech Recognition

Multi-stream hidden Markov models (HMMs) have recently been very successful in audio-visual speech recognition, where the audio and visual streams are fused at the final decision level. In this paper we investigate fast feature space speaker adaptation using multi-stream HMMs for audio-visual speech recognition. In particular, we focus on studying the performance of feature-space maximum likelihood linear regression (fMLLR), a fast and effective method for estimating feature space transforms. Unlike the common speaker adaptation techniques of MAP or MLLR, fMLLR does not change the audio or visual HMM parameters, but simply applies a single transform to the testing features. We also address the problem of fast and robust on-line fMLLR adaptation using feature space maximum a posteriori linear regression (fMAPLR). Adaptation experiments are reported on the IBM infrared headset audio-visual database. On average for a 20-speaker¢ hour independent test set, the multi-stream fMLLR achieves...
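
As a rough illustration of the feature-space idea described in the abstract, the following minimal Python sketch applies one affine transform x' = A x + b to every test frame while leaving the model parameters untouched. The function name `apply_feature_transform` and the toy values of `A` and `b` are hypothetical, chosen only for illustration. Estimating the transform by maximum likelihood, as fMLLR does, is not shown here.

```python
def apply_feature_transform(frames, A, b):
    """Apply the same affine map x' = A x + b to every feature frame.

    frames: list of feature vectors (one per frame)
    A:      square matrix as a list of rows (hypothetical example values)
    b:      bias vector (hypothetical example values)
    """
    return [
        [sum(a_ij * x_j for a_ij, x_j in zip(row, frame)) + b_i
         for row, b_i in zip(A, b)]
        for frame in frames
    ]


# Toy 2-dimensional features for two frames (illustrative values only).
frames = [[1.0, 2.0], [3.0, 4.0]]
A = [[2.0, 0.0], [0.0, 0.5]]
b = [1.0, -1.0]

# The HMM parameters are not involved: only the features change.
adapted = apply_feature_transform(frames, A, b)
print(adapted)
```

In a multi-stream setting, one would presumably apply such a transform to the audio and visual feature streams separately. That is an assumption of this sketch rather than a detail stated in the abstract.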

Jing Huang, Etienne Marcheret, Karthik Visweswariah

Audio-Visual Speech Recognition | Feature Space | ICMCS 2005 | Speaker Adaptation |

Post Info

Added	24 Jun 2010
Updated	24 Jun 2010
Type	Conference
Year	2005
Where	ICMCS
Authors	Jing Huang, Etienne Marcheret, Karthik Visweswariah
