This work proposes a learning method for deep architectures that takes advantage of sequential data, in particular from the temporal coherence that naturally exists in unlabeled v...
The appearance of dynamic scenes is often largely governed by a latent low-dimensional dynamic process. We show how to learn a mapping from video frames to this lowdimensional rep...
Hand-coded scripts were used in the 1970-80s as knowledge backbones that enabled inference and other NLP tasks requiring deep semantic knowledge. We propose unsupervised induction...
This paper describes a system that can build appearance models of animals automatically from a video sequence of the relevant animal with no explicit supervisory information. The ...
We study video attention by detecting a salient object sequence from video segment. We formulate salient object sequence detection as energy minimization problem in a conditional ...