This paper presents a multi-modal approach to locate a speaker in a scene and determine to whom he or she is speaking. We present a simple probabilistic framework that combines mu...
Michael Siracusa, Louis-Philippe Morency, Kevin Wi...
Videos of multi-player team sports provide a challenging domain for dynamic scene analysis. Player actions and interactions are complex as they are driven by many factors, such as...
Kihwan Kim, Matthias Grundmann, Ariel Shamir, Iain...
While existing studies on YouTube’s massive user-generated video content have mostly focused on the analysis of videos, their characteristics, and network properties, little att...
The objective of active recognition is to iteratively collect the next "best" measurements (e.g., camera angles or viewpoints), to maximally reduce ambiguities in recogn...
Museums increasingly deploy new technologies to enhance visitors' experience of their exhibitions. They primarily rely on touch-screen computer systems, PDAs and digital audi...
Dirk vom Lehn, Jon Hindmarsh, Paul Luff, Christian...