Classifying pictures into one of several semantic categories is a classical image understanding problem. In this paper, we present a stratified approach to both binary (outdoor-in...
We describe a method for learning classes of facial motion patterns from video of a human interacting with a computerized embodied agent. The method also learns correlations betwe...
This paper presents a bottom-up approach that combines audio and video to simultaneously locate individual speakers in the video (2-D source localization) and segment their speech ...