People can understand complex auditory and visual information, often using one to disambiguate the other. Automated analysis, even at a low level, faces severe challenges, includin...
John W. Fisher III, Trevor Darrell, William T. Fre...
In this paper we address the problem of estimating who is speaking from automatically extracted low-resolution visual cues in group meetings. Traditionally, the task of speech/non...
An important issue in tracking is how to incorporate an appropriate degree of adaptivity into the observation model. Without any adaptivity, tracking fails when object pr...
Andrew Blake, Jaco Vermaak, Michel Gangnet, Patric...
Close-talk headset microphones have traditionally been used for speech acquisition in a number of applications, as they naturally provide a higher signal-to-noise ratio -needed fo...
Iain McCowan, Maganto Hari Krishna, Daniel Gatica-...
Head pose and gesture offer several key conversational grounding cues and are used extensively in face-to-face interaction among people. We investigate how dialog context from an ...
Louis-Philippe Morency, Candace L. Sidner, Christo...