Discovering discriminative action parts from mid-level video representations

We describe a mid-level approach for action recognition. From an input video, we extract salient spatio-temporal structures by forming clusters of trajectories that serve as candidates for the parts of an action. The assembly of these clusters into an action class is governed by a graphical model that incorporates appearance and motion constraints for the individual parts and pairwise constraints for the spatio-temporal dependencies among them. During training, we estimate the model parameters discriminatively. During classification, we efficiently match the model to a video using discrete optimization. We validate the model’s classification ability in standard benchmark datasets and illustrate its potential to support a fine-grained analysis that not only gives a label to a video, but also identifies and localizes its constituent parts.
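
The matching step described in the abstract can be illustrated with a small sketch. It assumes a generic pairwise model in which each part has a score for every candidate cluster and each connected pair of parts has a compatibility score. The function name `match_parts`, the score layout, and the exhaustive search are illustrative assumptions, not the authors' implementation; the paper's actual features, learned parameters, and discrete optimizer are not specified on this page.

```python
from itertools import product


def match_parts(unary, pairwise, edges):
    """Assign each model part to one candidate trajectory cluster.

    Hypothetical layout (not taken from the paper):
      unary[p][c]          score of placing part p on candidate cluster c
      pairwise[(p, q)][c][d]  compatibility of part p on c with part q on d
      edges                list of (p, q) part pairs that interact

    Returns (best_score, best_assignment), where best_assignment[p] is the
    index of the cluster chosen for part p. Exhaustive enumeration is used
    only as a stand-in for an efficient discrete optimizer.
    """
    n_parts = len(unary)
    n_candidates = len(unary[0])
    best_score, best_assignment = float("-inf"), None
    for assignment in product(range(n_candidates), repeat=n_parts):
        # Per-part terms (appearance and motion in the abstract's wording).
        score = sum(unary[p][assignment[p]] for p in range(n_parts))
        # Pairwise terms (spatio-temporal dependencies between parts).
        score += sum(
            pairwise[(p, q)][assignment[p]][assignment[q]] for p, q in edges
        )
        if score > best_score:
            best_score, best_assignment = score, assignment
    return best_score, best_assignment


# Toy example: two parts, three candidate clusters, made-up scores.
toy_unary = [[1.0, 0.2, 0.0], [0.1, 0.9, 0.3]]
toy_pairwise = {
    (0, 1): [[-5.0, 0.5, 0.0], [0.0, -5.0, 0.0], [0.0, 0.0, -5.0]]
}
toy_score, toy_assignment = match_parts(toy_unary, toy_pairwise, [(0, 1)])
```

In this toy setup the large negative diagonal discourages two parts from sharing a cluster. Under the same assumptions, the highest score over a set of per-class models would give the video's label, and the winning assignment would indicate which clusters play the role of which part.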

Michalis Raptis, Iasonas Kokkinos, Stefano Soatto

Computer Vision | CVPR 2012 | Discrete Optimization | Motion Constraints | Temporal Dependencies

Added	28 Sep 2012
Updated	28 Sep 2012
Type	Conference
Year	2012
Where	CVPR
Authors	Michalis Raptis, Iasonas Kokkinos, Stefano Soatto