Abstract. The complexity of visual representations is substantially limited by the compositional nature of our visual world which, therefore, renders learning structured object mod...
Recognizing human action in non-instrumented video is a challenging task not only because of the variability produced by general scene factors like illumination, background, occlu...
Building models of the structure in musical signals raises the question of how to evaluate and compare different modeling approaches. One possibility is to use the model to impute...
Thierry Bertin-Mahieux, Graham Grindlay, Ron J. We...
Exploding amounts of multimedia data increasingly require automatic indexing and classification, e.g. training classifiers to produce high-level features, or semantic concepts, ch...
Wei Jiang, Eric Zavesky, Shih-Fu Chang, Alexander ...
We consider the problem of predicting a sequence of real-valued multivariate states from a given measurement sequence. Its typical application in computer vision is the task of mo...