We present an approach for tracking a lecturer during the course of his speech. We use features from multiple cameras and microphones, and process them in a joint particle filter f...
Kai Nickel, Tobias Gehrig, Hazim Kemal Ekenel, Joh...
This paper presents the EPAC corpus which is composed by a set of 100 hours of conversational speech manually transcribed and by the outputs of automatic tools (automatic segmenta...
This paper is dedicated to the description and the study of a new feature extraction approach for emotion recognition. Our contribution is based on the extraction and the character...
We are developing a cross-media information retrieval system, in which users can view specific segments of lecture videos by submitting text queries. To produce a text index, the ...
This work investigates the validity and accuracy of using spatial cues with Time-Delay Estimation (TDE) as a method of segmenting multichannel recorded speech by speaker location....
Eva Cheng, Jason Lukasiak, Ian S. Burnett, David S...