We propose a correlogram-based time delay estimation method using signals modeled as the output of the cochlea, where the low-level signal processing happens in the human auditory...
Automatic Language Identification (LID) in music has received significantly less attention than LID in speech. Here, we study the problem of LID in music videos uploaded on YouT...
Vijay Chandrasekhar, Mehmet Emre Sargin, David A. ...
This paper presents a blind source separation method for convolutive mixtures of speech/audio sources. The method can even be applied to an underdetermined case where there are ...
The performance of a typical speaker verification system degrades significantly in reverberant environments. This degradation is partly due to the conventional feature extractio...
Sriram Ganapathy, Jason W. Pelecanos, Mohamed Kama...
Emotion recognition from speech plays an important role in developing affective and intelligent systems. This study investigates sentence-level emotion recognition. We propose to ...