ICASSP
2008
IEEE

An unsupervised web-based topic language model adaptation method

This paper addresses how to better adapt automatic speech recognition (ASR) systems, whose language models (LMs) are usually trained on topic-independent corpora, to new topics, particularly for broadcast news. We propose a complete, fully unsupervised technique that uses information retrieval methods to select keywords from each segment and then builds a thematically coherent adaptation corpus from the Internet. The LM used for the initial transcription is adapted with this corpus before word lattices are rescored. Experimental results validate the technique, showing a significant reduction in perplexity after LM adaptation. Word error rates also improve in some cases, though to a lesser extent.
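The pipeline in the abstract (keyword selection, adaptation corpus, LM adaptation, perplexity check) can be illustrated with a toy sketch. Everything here is an assumption made for illustration, not the authors' method: the abstract does not name the keyword criterion, the LM order, or the adaptation scheme, so this sketch swaps in tf-idf ranking, add-alpha unigram models, and linear interpolation. The web retrieval step is replaced by a hard-coded hypothetical `web_corpus`, and lattice rescoring is not shown.

```python
import math
from collections import Counter

def tfidf_keywords(segment, background_docs, k=3):
    """Rank the words of one segment by tf-idf (assumed criterion)."""
    tf = Counter(segment)
    n_docs = len(background_docs)
    scores = {}
    for word, count in tf.items():
        df = sum(1 for doc in background_docs if word in doc)
        idf = math.log((1 + n_docs) / (1 + df))  # smoothed idf
        scores[word] = (count / len(segment)) * idf
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]

def unigram_lm(tokens, vocab, alpha=1.0):
    """Add-alpha smoothed unigram model over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def interpolate(general, topic, lam=0.5):
    """Linear interpolation of two models (assumed adaptation scheme)."""
    return {w: lam * topic[w] + (1 - lam) * general[w] for w in general}

def perplexity(lm, tokens):
    """Perplexity of a unigram model on a token sequence."""
    log_prob = sum(math.log(lm[w]) for w in tokens)
    return math.exp(-log_prob / len(tokens))

# Toy data: one transcribed segment and a topic-independent background.
segment = "the central bank raised interest rates as inflation rose".split()
background = [
    "the weather was mild".split(),
    "the team won the match".split(),
    "the minister gave a speech as the session opened".split(),
]

keywords = tfidf_keywords(segment, background)

# Hypothetical stand-in for text retrieved from the web with the keywords.
web_corpus = "the bank said inflation and interest rates would rise".split()

general_corpus = [w for doc in background for w in doc]
vocab = set(general_corpus) | set(web_corpus) | set(segment)

general = unigram_lm(general_corpus, vocab)
topic = unigram_lm(web_corpus, vocab)
adapted = interpolate(general, topic, lam=0.5)

ppl_before = perplexity(general, segment)
ppl_after = perplexity(adapted, segment)
print(keywords, round(ppl_before, 1), round(ppl_after, 1))
```

On this toy data the adapted model assigns the segment a lower perplexity than the general model, which mirrors the kind of measurement the abstract reports. It says nothing about the size of the gains in the paper.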
Added 30 May 2010
Updated 30 May 2010
Type Conference
Year 2008
Where ICASSP
Authors Gwénolé Lecorvé, Guillaume Gravier, Pascale Sébillot