We address the problem of multi-person tracking in a complex scene from a single camera. Although tracklet-association methods have shown impressive results in several challenging ...
Complex human activities occurring in videos can be defined in terms of temporal configurations of primitive actions. Prior work typically hand-picks the primitives, their total...
SIFT-like local feature descriptors are ubiquitously employed in such computer vision applications as content-based retrieval, video analysis, copy detection, object recognition...
Christoph Strecha, Alexander A. Bronstein, Michael...
Object detectors are typically trained on a large set of still images annotated by bounding boxes. This paper introduces an approach for learning object detectors from real-world w...
Alessandro Prest, Christian Leistner, Javier Civer...
We propose a nonparametric framework based on the beta process for discovering temporal patterns within a heterogeneous video collection. Starting from quantized local motion descr...