We are developing a cross-media information retrieval system, in which users can view specific segments of lecture videos by submitting text queries. To produce a text index, the ...
Sources of training data suitable for language modeling of conversational speech are limited. In this paper, we show how training data can be supplemented with text from the web ï...
This paper explores the relationship between discourse structure and coverbal gesture. Using the idea of gestural cohesion, we show that coherent topic segments are characterized ...
Language models used in current automatic speech recognition systems are trained on general-purpose corpora and are therefore not relevant to transcribe spoken documents dealing w...
Language models for speech recognition tend to be brittle across domains, since their performance is vulnerable to changes in the genre or topic of the text on which they are trai...