Abstract--This paper presents a novel method for automatically classifying consumer video clips based on their soundtracks. We use a set of 25 overlapping semantic classes, chosen ...
A key problem in using the output of an auditory model as the input to a machine-learning system in a machine-hearing application is to find a good feature-extraction layer. For ...
In this study, the generalized parametric spectral subtraction estimator is employed in the context of a ROVER speech enhancement framework to develop a robust phoneme class selec...
We consider the task of under-determined reverberant audio source separation. We model the contribution of each source to all mixture channels in the time-frequency domain as a ze...
We consider a semi-supervised setting for domain adaptation where only unlabeled data is available for the target domain. One way to tackle this problem is to train a generative m...