High-level, or holistic, scene understanding involves
reasoning about objects, regions, and the 3D relationships
between them. This requires a representation above the
level of ...
Text retrieval from broadcast news video is unsatisfactory, because a transcript word frequently does not directly ‘describe’ the shot when it was spoken. Extending the retriev...
This paper presents a new algorithm for the automatic recognition of object classes from images (categorization). Compact and yet discriminative appearance-based object class mode...
We present a new data set encoding localized semantics for 1014 images and a methodology for using this kind of data for recognition evaluation. This methodology establishes protoc...
In this paper, we present a novel method for the voxel-wise extraction of rotation and gray-scale invariant features. These features are used for simultaneous segmentation and clas...