Learning Deformable Action Templates from Crowded Videos

14 years 9 months ago


In this paper, we present a Deformable Action Template (DAT) model that is learnable from cluttered real-world videos with weak supervision. In our generative model, an action template is a sequence of image templates, each of which consists of a set of shape and motion primitives (Gabor wavelets and optical-flow patches) at selected orientations and locations. These primitives are allowed to slightly perturb their locations and orientations to account for spatial deformations. We use a shared pursuit algorithm to automatically discover a best set of primitives and weights by maximizing the likelihood over one or more aligned training examples. Since it is extremely hard to accurately label human actions from real-world videos, we use a three-step semi-supervised learning procedure. 1) For each human action class, a template is initialized from a labeled (one bounding-box per frame) training video. 2) The template is used to detect actions from other training videos of ...
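
The idea of primitives that may slightly perturb their locations and orientations can be illustrated with a small sketch. This is not the authors' implementation: it assumes a precomputed array of filter responses indexed as `responses[orientation, y, x]`, and the function names, the `max_shift` and `max_rot` parameters, and the weighted-sum score are all hypothetical choices made for illustration. Each primitive takes the best response found within a small window of shifts and neighboring orientations, and the template score is the weighted sum over primitives.

```python
import numpy as np


def local_max_response(responses, orient, y, x, max_shift=2, max_rot=1):
    """Best response of one primitive allowed to shift and rotate slightly.

    responses: array of shape (n_orient, height, width), assumed precomputed.
    (orient, y, x): nominal orientation index and location, assumed in bounds.
    """
    n_orient, height, width = responses.shape
    # Clip the spatial search window to the image borders.
    y0, y1 = max(0, y - max_shift), min(height, y + max_shift + 1)
    x0, x1 = max(0, x - max_shift), min(width, x + max_shift + 1)
    best = -np.inf
    for d_rot in range(-max_rot, max_rot + 1):
        o = (orient + d_rot) % n_orient  # orientation indices wrap around
        best = max(best, float(responses[o, y0:y1, x0:x1].max()))
    return best


def template_score(responses, primitives, weights, max_shift=2, max_rot=1):
    """Weighted sum of locally maximized responses over all primitives."""
    return sum(
        w * local_max_response(responses, o, y, x, max_shift, max_rot)
        for (o, y, x), w in zip(primitives, weights)
    )


if __name__ == "__main__":
    # Toy response map: a single strong response one pixel and one
    # orientation away from the primitive's nominal position.
    toy = np.zeros((4, 10, 10))
    toy[1, 5, 6] = 3.0
    print(template_score(toy, [(0, 5, 5)], [2.0]))
```

With `max_shift=0` and `max_rot=0` the primitive becomes rigid and the toy response is missed, which is the contrast the deformation allowance is meant to address.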

Benjamin Yao, Song-Chun Zhu
