Learning Visual Compound Models from Parallel Image-Text Datasets

Abstract. In this paper, we propose a new approach to learning structured visual compound models from shape-based feature descriptions. We use captioned text to drive the grouping of boundary fragments detected in an image. In the learning framework, we transfer several techniques from computational linguistics to the visual domain and build on previous work in image annotation. A statistical translation model is used to establish links between caption words and image elements; compounds are then iteratively built up using a mutual information measure. Relations between compound elements are extracted automatically and increase the discriminability of the visual models. We show results on different synthetic and realistic datasets to validate our approach.
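The abstract sketches two computational steps: aligning caption words to image elements with a statistical translation model, and iteratively merging co-occurring elements into compounds using a mutual information measure. The paper's exact formulation is not given here, so the following is a minimal illustrative sketch assuming pointwise mutual information over co-occurrence counts of element labels across a dataset; the function names `compound_mi` and `merge_best` are hypothetical, and the translation-model step (which would restrict candidate elements to those linked to caption words) is omitted.

```python
import math
from collections import Counter
from itertools import combinations

def compound_mi(images):
    """Score co-occurring element pairs by pointwise mutual information.

    `images` is a list of sets; each set holds the element labels
    (e.g. boundary-fragment cluster IDs) detected in one image.
    Returns a dict mapping (a, b) pairs to their PMI score.
    """
    n = len(images)
    single = Counter()
    pairs = Counter()
    for elems in images:
        single.update(elems)
        # key=repr keeps the ordering deterministic even after compounds
        # (tuples) are mixed in with atomic string labels
        pairs.update(combinations(sorted(elems, key=repr), 2))
    return {
        (a, b): math.log((c / n) / ((single[a] / n) * (single[b] / n)))
        for (a, b), c in pairs.items()
    }

def merge_best(images):
    """One iteration of compound building: fuse the highest-PMI pair
    into a single new compound symbol wherever both parts occur."""
    scores = compound_mi(images)
    if not scores:
        return images, None
    best = max(scores, key=scores.get)
    merged = [
        (elems - set(best)) | {best} if set(best) <= elems else elems
        for elems in images
    ]
    return merged, best
```

Repeating `merge_best` until no pair exceeds a chosen PMI threshold would mirror the iterative build-up the abstract describes, with each merged pair becoming a candidate element in later rounds.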
Type Conference
Year 2008
Where DAGM
Publisher Springer
Authors Jan Moringen, Sven Wachsmuth, Sven J. Dickinson, Suzanne Stevenson