This paper presents a Bayesian method for temporally aligning a music score and an audio rendition. A critical problem in audio-to-score alignment is in dealing with the wide varie...
Akira Maezawa, Hiroshi G. Okuno, Tetsuya Ogata, Ma...
Inferences from time-series data can be greatly enhanced by taking into account multiple modalities. In some cases, such as audio of speech and the corresponding video of lip gest...
Trausti T. Kristjansson, Brendan J. Frey, Thomas S...
We propose a statistical framework for high-level feature extraction that uses SIFT Gaussian mixture models (GMMs) and audio models. SIFT features were extracted from all the im...
We present a system which consists of a lifelike agent animated in real-time using video and audio analysis from the user. This kind of system could be used for Instant M...
Sylvain Le Gallou, Gaspard Breton, Renaud Sé...
The Student’s-t hidden Markov model (SHMM) has recently been proposed as an outlier-robust form of conventional continuous-density hidden Markov models, trained by means of t...