In the real visual world, the number of categories a classifier needs to discriminate is on the order of hundreds or thousands. For example, the SUN dataset [24] contains 899 sce...
Recent work shows how to use local spatio-temporal features to learn models of realistic human actions from video. However, existing methods typically rely on a predefined spatial...
In this paper we propose to use lexical semantic networks to extend the state-of-the-art object recognition techniques. We use the semantics of image labels to integrate prior kno...
In this paper, the automatic medical annotation task of the 2007 CLEF cross-language image retrieval campaign (ImageCLEF) is described. The paper focusses on the images used, the ...
Thomas Deselaers, Thomas Martin Deserno, Henning M...