Abstract— Especially for tasks like automatic meeting transcription, it would be useful to automatically recognize speech also while multiple speakers are talking simultaneously....
Dorothea Kolossa, Shoko Araki, Marc Delcroix, Tomo...
Missing data techniques have been recently applied to speaker recognition to increase performance in noisy environments. The drawback of these techniques is the vulnerability of t...
Many real-world sequence learning tasks require the prediction of sequences of labels from noisy, unsegmented input data. In speech recognition, for example, an acoustic signal is...
In this paper, we propose a novel approach to feature compensation performed in the cepstral domain. We apply the linear approximation method in the cepstral domain to simplify th...
Woohyung Lim, Chang Woo Han, Jong Won Shin, Nam So...
Person identification using audio (speech) and visual (facial appearance, static or dynamic) modalities, either independently or jointly, is a thoroughly investigated problem in pa...