The standard approach to speaker verification is to extract cepstral features from the speech spectrum and model them by generative or discriminative techniques. We propose a nov...
This paper investigates the use of speech-to-text methods for assigning an emotion class to a given speech utterance. Previous work shows that an emotion extracted from text can c...
Florian Metze, Anton Batliner, Florian Eyben, Tim ...
The VideoCLEF track, introduced in 2008, aims to develop and evaluate tasks related to analysis of and access to multilingual multimedia content. In its first year, VideoCLEF pilo...
This work investigates the validity and accuracy of using spatial cues with Time-Delay Estimation (TDE) as a method of segmenting multichannel recorded speech by speaker location....
Eva Cheng, Jason Lukasiak, Ian S. Burnett, David S...
Automatic lipreading is automatic speech recognition that uses only visual information. The relevant data in a video signal is isolated and features are extracted from it. From a s...