INTERSPEECH 2010

Learning a language model from continuous speech

This paper presents a new approach to language model construction, learning a language model not from text but directly from continuous speech. A phoneme lattice is created using acoustic model scores, and Bayesian techniques are used to robustly learn a language model from this noisy input. A novel sampling technique is devised that allows for the integrated learning of word boundaries and an n-gram language model with no prior linguistic knowledge. The proposed techniques were used to learn a language model directly from continuous, potentially large-vocabulary speech. This language model significantly reduced the ASR phoneme error rate on a separate set of test data, and the proposed lattice processing and lexical acquisition techniques were found to be important factors in this improvement.
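To make the idea of jointly sampling word boundaries and a lexicon concrete, here is a minimal sketch. It is not the authors' algorithm: it Gibbs-samples boundaries on a single 1-best phoneme string under a unigram Dirichlet-process word model, whereas the paper samples over phoneme lattices with a Bayesian n-gram model. All names and hyperparameters (ALPHA, P_END, gibbs_segment) are illustrative assumptions.

```python
# Toy sketch: joint Gibbs sampling of word boundaries and a unigram lexicon
# from an unsegmented phoneme string, with no prior linguistic knowledge.
import random
from collections import Counter

random.seed(0)

ALPHA = 1.0   # Dirichlet-process concentration (assumed value)
P_END = 0.5   # geometric word-length prior for the base distribution

def base_prob(word, n_phones):
    """P0(word): geometric length prior times uniform choice of each phoneme."""
    return P_END * (1.0 - P_END) ** (len(word) - 1) * (1.0 / n_phones) ** len(word)

def word_prob(word, counts, total, n_phones):
    """CRP-style predictive probability of `word` given current word counts."""
    return (counts[word] + ALPHA * base_prob(word, n_phones)) / (total + ALPHA)

def gibbs_segment(phonemes, n_iters=200):
    """Sample word boundaries for one phoneme string and return the segmentation."""
    n = len(phonemes)
    n_phones = len(set(phonemes))
    # boundaries[j] == True means a word break between phonemes[j] and phonemes[j+1]
    boundaries = [random.random() < 0.5 for _ in range(n - 1)]

    def words():
        out, start = [], 0
        for i, b in enumerate(boundaries, start=1):
            if b:
                out.append(phonemes[start:i])
                start = i
        out.append(phonemes[start:])
        return out

    counts = Counter(words())
    total = sum(counts.values())

    for _ in range(n_iters):
        for i in range(n - 1):
            # word span(s) governed by the candidate boundary at position i
            left = max([j + 1 for j in range(i) if boundaries[j]], default=0)
            right = min([j + 1 for j in range(i + 1, n - 1) if boundaries[j]], default=n)
            w12 = phonemes[left:right]
            w1, w2 = phonemes[left:i + 1], phonemes[i + 1:right]
            # remove the affected word(s) from the counts
            if boundaries[i]:
                counts[w1] -= 1; counts[w2] -= 1; total -= 2
            else:
                counts[w12] -= 1; total -= 1
            # predictive probabilities of the two hypotheses
            p_join = word_prob(w12, counts, total, n_phones)
            p_split = word_prob(w1, counts, total, n_phones)
            counts[w1] += 1                      # w2 is predicted after adding w1
            p_split *= word_prob(w2, counts, total + 1, n_phones)
            counts[w1] -= 1
            # sample the boundary and restore the counts
            boundaries[i] = random.random() < p_split / (p_split + p_join)
            if boundaries[i]:
                counts[w1] += 1; counts[w2] += 1; total += 2
            else:
                counts[w12] += 1; total += 1
    return words()

if __name__ == "__main__":
    # toy "utterance" given as an unsegmented phoneme string
    print(gibbs_segment("iwanttogotothestore"))
```

In the paper's setting, the 1-best string would be replaced by hypotheses drawn from the acoustic-score phoneme lattice and the unigram model by an n-gram language model; the abstract identifies exactly this lattice processing and lexical acquisition as the key factors in the error-rate reduction.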
Type Conference
Year 2010
Where INTERSPEECH
Authors Graham Neubig, Masato Mimura, Shinsuke Mori, Tatsuya Kawahara