Using stacked transformations for recognizing foreign accented speech

A common problem in speech recognition for foreign accented speech is that there is not enough training data for an accent-specific or a speaker-specific recognizer. Speaker adaptation can be used to improve the accuracy of a speaker-independent recognizer, but a lot of adaptation data is needed for speakers with a strong foreign accent. In this paper we propose a rather simple and successful technique of stacked transformations, where the baseline models trained for native speakers are first adapted by using accent-specific data and then by another transformation using speaker-specific data. Because the accent-specific data can be collected offline, the first transformation can be more detailed and comprehensive, and the second one less detailed and fast. Experimental results are provided for speaker adaptation in English spoken by Finnish speakers. The evaluation results confirm that the stacked transformations are very helpful for fast speaker adaptation.
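The stacking idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which kind of transformation is used, so the sketch assumes affine transforms of Gaussian mean vectors (mean' = A·mean + b); the function names, the 2-dimensional means, and all numeric values are hypothetical and chosen only for illustration. The accent transform uses a full matrix (more detailed, estimated offline), while the speaker transform is bias-only (less detailed, cheap to estimate from little data).

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]
Transform = Tuple[Matrix, Vector]  # (A, b), applied as A @ mu + b


def affine(A: Matrix, b: Vector, mu: Vector) -> Vector:
    """Return A @ mu + b for a single mean vector."""
    return [sum(a_ij * m_j for a_ij, m_j in zip(row, mu)) + b_i
            for row, b_i in zip(A, b)]


def stack_transforms(means: List[Vector],
                     accent: Transform,
                     speaker: Transform) -> List[Vector]:
    """Apply the accent transform first, then the speaker transform."""
    A_acc, b_acc = accent
    A_spk, b_spk = speaker
    return [affine(A_spk, b_spk, affine(A_acc, b_acc, mu)) for mu in means]


# Hypothetical baseline means standing in for the native-speaker models.
baseline_means = [[1.0, 0.0], [0.0, 2.0]]

# Assumed accent transform: full matrix plus bias (detailed, estimated offline).
accent = ([[1.0, 0.5], [0.0, 1.0]], [0.5, -1.0])

# Assumed speaker transform: identity matrix plus bias (coarse, fast to estimate).
speaker = ([[1.0, 0.0], [0.0, 1.0]], [0.25, 0.25])

adapted_means = stack_transforms(baseline_means, accent, speaker)
print(adapted_means)
```

Under these assumptions the two steps compose into a single affine map, so the speaker-specific step only has to capture what the accent-specific step leaves unexplained.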

Peter Smit, Mikko Kurimo

Accent-specific Data | Foreign Accented Speech | ICASSP 2011 | Signal Processing | Speaker Adaptation

Added	21 Aug 2011
Updated	21 Aug 2011
Type	Journal
Year	2011
Where	ICASSP
Authors	Peter Smit, Mikko Kurimo
