We present a learning algorithm for non-parametric hidden Markov models with continuous state and observation spaces. All necessary probability densities are approximated using sa...
Eye gaze and gesture form key conversational grounding cues that are used extensively in face-to-face interaction among people. To accurately recognize visual feedback during inter...
Inferences from time-series data can be greatly enhanced by taking into account multiple modalities. In some cases, such as audio of speech and the corresponding video of lip gest...
Trausti T. Kristjansson, Brendan J. Frey, Thomas S...
Previously we have proposed different models for estimating articulatory gestures and vocal tract variable (TV) trajectories from synthetic speech. We have shown that when deploye...
Vikramjit Mitra, Hosung Nam, Carol Y. Espy-Wilson,...
Many human action recognition tasks involve data that can be factorized into multiple views such as body postures and hand shapes. These views often interact with each other over ...