The development of user interfaces based on vision and speech requires the solution of a challenging statistical inference problem: The intentions and actions of multiple individu...
Prosodic information has been successfully used for speaker recognition for more than a decade. The best-performing prosodic system to date has been one based on features extracte...
Luciana Ferrer, Nicolas Scheffer, Elizabeth Shribe...
In this paper we face the problem of partitioning the news videos into stories, and of their classification according to a predefined set of categories. In particular, we propose ...
Francesco Colace, Pasquale Foggia, Gennaro Percann...
In supervector UBM/GMM paradigm, each acoustic file is represented by the mean parameters of a GMM model. This supervector space is used as a data representation space, which has...
It is well known that frame independence assumption is a fundamental limitation of current HMM based speech recognition systems. By treating each speech frame independently, HMMs ...