Abstract. We address the problem of learning good features for understanding video data. We introduce a model that learns latent representations of image sequences from pairs of su...
Abstract. In spite of over two decades of intense research, illumination and pose invariance remain prohibitively challenging aspects of face recognition for most practical applica...
Complex human activities occurring in videos can be defined in terms of temporal configurations of primitive actions. Prior work typically hand-picks the primitives, their total...
Much of recent action recognition research is based on
space-time interest points extracted from video using a Bag
of Words (BOW) representation. It mainly relies on the discrimi...
Matteo Bregonzio (Queen Mary, University of London...