Activity recognition in video is dominated by low- and mid-level features, and while demonstrably capable, by nature, these features carry little semantic meaning. Inspired by the...
This work investigates the use of nonlinear dependencies in natural image sequence statistics to learn higher-order structures in natural videos. We propose a two-layer model that...
Video-based recognition and prediction of a temporally extended activity can benefit from a detailed description of high-level expectations about the activity. Stochastic grammars...
Thousands of hours of video are recorded every second across the world. Due to the fact that searching for a particular event of interest within hours of video is time consuming, ...
We introduce an epitomic representation for modeling human activities in video sequences. A video sequence is divided into segments within which the dynamics of objects is assumed...