Usually the mel-frequency cepstral coefficients (MFCCs) are derived via Hamming windowed DFT spectrum. In this paper, we advocate to use a so-called multitaper method instead. Mul...
Tomi Kinnunen, Rahim Saeidi, Johan Sandberg, Maria...
We present a study on purely data-based recognition of animal sounds, performing evaluation on a real-world database obtained from the Humboldt-University Animal Sound Archive. As...
In this paper, a robust feature for text-independent speaker recognition is proposed, which simulate the response mode of cochlear neurons in processing acoustic signal. The featu...
—One of the biggest obstacles for constructing effective sociable virtual humans lies in the failure of machines to recognize the desires, feelings and intentions of the human us...
This paper presents a framework for speech-driven synthesis of real faces from a corpus of 3D video of a person speaking. Video-rate capture of dynamic 3D face shape and colour ap...
Ioannis A. Ypsilos, Adrian Hilton, Aseel Turkmani,...