Cross-lingual speech recognition under runtime resource constraints



This paper proposes and compares four cross-lingual and bilingual automatic speech recognition techniques under the constraints of limited memory size and CPU speed. The first three techniques fall into the category of lexicon conversion, where each phoneme sequence (PHS) in the foreign language (FL) lexicon is mapped to a native language (NL) phoneme sequence. The first technique determines the PHS mapping through international phonetic alphabet (IPA) features; the second and third techniques are data-driven. They determine the mapping by converting the PHS into the corresponding context-independent and context-dependent hidden Markov models (HMMs), respectively, and searching for the NL PHS with the least Kullback-Leibler divergence (KLD) between the HMMs. The fourth technique falls into the category of acoustic-model (AM) merging, where the FL's AM is merged into the NL's AM by mapping each senone in the FL's AM to the senone in the NL's AM with the minimum KLD. We discuss t...
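As an illustration of the minimum-KLD senone mapping described for the fourth technique, the following is a minimal sketch and not the authors' implementation. It assumes each senone is a single diagonal-covariance Gaussian (real senones are typically Gaussian mixtures, for which the KLD has no closed form and must be approximated), and it assumes the divergence is taken in the direction KL(FL || NL), which the abstract does not specify. The names `kld_diag_gauss`, `map_senones`, and the toy senone tables are hypothetical.

```python
import math


def kld_diag_gauss(mean_p, var_p, mean_q, var_q):
    """Closed-form KL(p || q) for two diagonal-covariance Gaussians."""
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        total += 0.5 * (math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0)
    return total


def map_senones(fl_senones, nl_senones):
    """Map each FL senone name to the NL senone name with the minimum KLD.

    Both arguments are dicts: name -> (mean vector, variance vector).
    Assumption: one Gaussian per senone, divergence direction KL(FL || NL).
    """
    mapping = {}
    for fl_name, (fl_mean, fl_var) in fl_senones.items():
        mapping[fl_name] = min(
            nl_senones,
            key=lambda nl_name: kld_diag_gauss(
                fl_mean, fl_var, *nl_senones[nl_name]
            ),
        )
    return mapping


# Hypothetical toy senone tables (2-dimensional features).
fl_senones = {
    "fl_a": ([0.0, 0.0], [1.0, 1.0]),
    "fl_b": ([3.0, 3.0], [1.0, 2.0]),
}
nl_senones = {
    "nl_x": ([0.1, -0.1], [1.0, 1.0]),
    "nl_y": ([2.9, 3.2], [1.5, 2.0]),
    "nl_z": ([10.0, 10.0], [1.0, 1.0]),
}

mapping = map_senones(fl_senones, nl_senones)
print(mapping)
```

Because every FL senone is redirected to an existing NL senone, a merged model built this way would add no new Gaussian parameters at runtime, which is consistent with the memory constraint the abstract emphasizes.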

Dong Yu, Li Deng, Peng Liu, Jian Wu, Yifan Gong, Alex Acero


AM Merging Technique | ICASSP 2009 | NL PHS | Phoneme Sequence | Signal Processing |


Added	21 May 2010
Updated	21 May 2010
Type	Conference
Year	2009
Where	ICASSP
Authors	Dong Yu, Li Deng, Peng Liu, Jian Wu, Yifan Gong, Alex Acero
