Understanding of the scene content of a video sequence is very important for content-based indexing and retrieval of multimedia databases. Research in this area in the past severa...
Video-based recognition and prediction of a temporally extended activity can benefit from a detailed description of high-level expectations about the activity. Stochastic grammars...
We describe a method to align ASL video subtitles with a closed-caption transcript. Our alignments are partial, based on spotting words within the video sequence, which consists o...
In this paper, we address two closely related visual tracking problems: 1) localizing a target's position in low or moderate resolution videos and 2) segmenting a target'...
In this paper we present a novel approach using a 4D (x,y,z,t) action feature model (4D-AFM) for recognizing actions from arbitrary views. The 4D-AFM elegantly encodes shape and m...