The paper describes aspects, methods and results of the development of an automatic transcription system for Catalan broadcast conversation by means of speech recognition. Emphasi...
We investigate genre effects on the task of automatic sentence segmentation, focusing on two important domains – broadcast news (BN) and broadcast conversation (BC). We employ a...
This paper describes recent advances at LIMSI in Mandarin Chinese speech-to-text transcription. A number of novel approaches were introduced in the different system components. Th...
Lori Lamel, Jean-Luc Gauvain, Viet-Bac Le, Ilya Op...
Abstract. This paper presents the final version of the Czech Broadcast Conversation Corpus released at the Linguistic Data Consortium (LDC). The corpus contains 72 recordings of a...
In language modeling for speech recognition, both the amount of training data and the match to the target task impact the goodness of the model, with the trade-off usually favorin...
Marius A. Marin, Sergey Feldman, Mari Ostendorf, M...