The Informedia Digital Library Project [Wactlar96] allows full content indexing and retrieval of text, audio and video material. Segmentation is an integral process in the Informe...
Typical broadcast material contains not only studio-recorded texts read by trained speakers, but also spontaneous and dialect speech, debates with cross-talk, voice-overs, and on-...
Doris Baum, Daniel Schneider, Rolf Bardeli, Jochen...
In this paper, we consider the extraction of speaker identity from audio records of broadcast news without a priori acoustic information about speakers. Using an automatic speech ...
Vincent Jousse, Simon Petit-Renaud, Sylvain Meigni...
This paper presents the EPAC corpus which is composed by a set of 100 hours of conversational speech manually transcribed and by the outputs of automatic tools (automatic segmenta...
We focus in this paper on the named entity recognition task in spoken data. The proposed approach investigates the use of various contexts of the words to improve recognition. Exp...