Computational models of grounded language learning have been based on the premise that words and concepts are learned simultaneously. Given the mounting cognitive evidence for conc...
We present an approach to key frame extraction for structuring user generated videos on video sharing websites (e. g. YouTube). Our approach is intended to link existing image sea...
In this paper, we study the problem of social relational inference using visual concepts which serve as indicators of actors’ social interactions. While social network analysis ...
Detection of moving objects in video streams is the first relevant step of information extraction in many computer vision applications. Aside from the intrinsic usefulness of being...
Digital video applications exploit the intrinsic structure of video sequences. In order to obtain and represent this structure for video annotation and indexing tasks, the main ini...