Creating conversational interfaces for children is challenging in several respects. These include acoustic modeling for automatic speech recognition (ASR), language and dialog mode...
Speech carries both linguistic content – phonemes, words, sentences – and talker information, sometimes called ‘indexical information’. While talker variability materially...
We show that a classifier based on Gaussian mixture models (GMM) can be trained discriminatively to improve accuracy. We describe a training procedure based on the extended Baum-W...
In this paper, a cross-media browsing demonstrator named InfoLink is described. InfoLink automatically links the content of Dutch broadcast news videos to related information sour...
Jeroen Morang, Roeland Ordelman, Franciska de Jong...
This article describes a method for document/speech alignment based on explicit verbal references to documents and parts of documents, in the context of multimodal meetin...