Inferences from time-series data can be greatly enhanced by taking into account multiple modalities. In some cases, such as audio of speech and the corresponding video of lip gest...
Trausti T. Kristjansson, Brendan J. Frey, Thomas S...
In the `missing data' approach to improving the robustness of automatic speech recognition to added noise, an initial process identifies spectraltemporal regions which are do...
While nonnegative matrix factorization (NMF) has successfully been applied for gain-robust multi-pitch detection, a method to track pitch values over time was not provided. We emb...
We address the question of whether and how boosting and bagging can be used for speech recognition. In order to do this, we compare two different boosting schemes, one at the pho...
We propose a new optimization algorithm called Generalized Baum Welch (GBW) algorithm for discriminative training on hidden Markov model (HMM). GBW is based on Lagrange relaxation...