Data sparseness is an ever dominating problem in automatic emotion recognition. Using artificially generated speech for training or adapting models could potentially ease this: t...
Audiovisual speech recognition (AVSR) systems have been proven superior over audio-only speech recognizers in noisy environments by incorporating features of the visual modality. ...
Alexander Vorwerk, Xiaohui Wang, Dorothea Kolossa,...
Data imputation approaches for robust automatic speech recognition reconstruct noise corrupted spectral information by exploiting prior knowledge of the relationship between targe...
Multi-stream hidden Markov models (HMMs) have recently been very successful in audio-visual speech recognition, where the audio and visual streams are fused at the final decision...
This paper describes an algorithm that performs a simple form of computational auditory scene analysis to separate multiple speech signals from one another on the basis of the mod...