Several stochastic models provide an effective framework to identify the temporal structure of audiovisual data. Most of them need as input a first video structure, i.e. connecti...
In this paper, we present an event parsing algorithm based on Stochastic Context Sensitive Grammar (SCSG) for understanding events, inferring the goal of agents, and predicting th...
Mingtao Pei, School of Computer Science, Yunde Jia...
Unsupervised categorization of objects is a fundamental problem in computer vision. While appearance-based methods have become popular recently, other important cues like function...
One of the principal bottlenecks in applying learning techniques to classification problems is the large amount of labeled training data required. Especially for images and video, ...
Ajay J. Joshi, Fatih Porikli, Nikolaos Papanikolop...
Since the emergence of extensive multimedia data, feature fusion has been more and more important for image and video retrieval, indexing and annotation. Existing feature fusion t...
Yun Fu, Liangliang Cao, Guodong Guo, Thomas S. Hua...