Abstract. We present a new approach to modeling and processing multimedia data. This approach is based on graphical models that combine audio and video variables. We demonstrate it...
This paper presents an approach for human activity recognition by representing the frames of the video sequence with the distribution of local motion features and their spatiotemp...
We present spatio-temporal feature descriptors that can be inferred from video and used as building blocks in action recognition systems. They capture the evolution of ``elementar...
In this paper we present a framework for semantic scene parsing and object recognition based on dense depth maps. Five view-independent 3D features that vary with object class are e...
We present a technique for coupling simulated fluid phenomena that interact with real dynamic scenes captured as a binocular video sequence. We first process the binocular video s...