Abstract. Humans can associate vision and language modalities and thus generate mental imagery, i.e. visual images, from linguistic input in an environment of unlimited inflowing i...
Conventional methods for multimodal data retrieval use text-tag based or cross-modal approaches such as tag-image co-occurrence and canonical correlation analysis. Since there are ...
JungWoo Ha, Byoung-Hee Kim, Bado Lee, Byoung-Tak Z...
Given a query image of an object, our objective is to retrieve all instances of that object in a large (1M+) image database. We adopt the bag-of-visual-words architecture which ha...
Ondrej Chum, James Philbin, Josef Sivic, Michael I...
We present methods for improving text search retrieval of visual multimedia content by applying a set of visual models of semantic concepts from a lexicon of concepts deemed relev...
Alexander Haubold, Apostol Natsev, Milind R. Napha...
Training a 3D model classifier on a small dataset is very challenging. However, large datasets of partially classified models are now commonly available online. We use an external...