Topic regression multi-modal Latent Dirichlet Allocation for image annotation


We present topic-regression multi-modal Latent Dirichlet Allocation (tr-mmLDA), a novel statistical topic model for image and video annotation. At the heart of the model lies a latent variable regression approach that captures correlations between image or video features and annotation texts. Instead of sharing a single set of latent topics between the two data modalities, as in the formulation of correspondence LDA in [2], our approach introduces a regression module to correlate the two sets of topics. This captures more general forms of association and allows the number of topics in the two modalities to differ. We demonstrate the power of tr-mmLDA on two standard annotation datasets: a 5000-image subset of COREL and a 2687-image LabelMe dataset. The proposed association model shows improved performance over correspondence LDA as measured by caption perplexity.
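The abstract reports results in terms of caption perplexity. As a minimal sketch of how that metric is conventionally computed (not the authors' evaluation code), the function below takes, for each test image, the natural-log predictive probabilities of its caption words given the image and returns the exponentiated negative average log-likelihood per word. The function name `caption_perplexity` and the input layout are hypothetical, chosen only for illustration; how the per-word probabilities are obtained from a trained model is outside the scope of this sketch.

```python
import math


def caption_perplexity(log_probs_per_doc):
    """Perplexity of held-out captions (lower is better).

    log_probs_per_doc: list of lists. Each inner list holds the natural-log
    predictive probabilities log p(word | image) for the caption words of one
    test image. This input layout is an assumption made for this example.
    """
    total_log_prob = sum(sum(doc) for doc in log_probs_per_doc)
    total_words = sum(len(doc) for doc in log_probs_per_doc)
    if total_words == 0:
        raise ValueError("no caption words to evaluate")
    # exp of the negative mean log-likelihood per caption word
    return math.exp(-total_log_prob / total_words)


if __name__ == "__main__":
    # A model that spreads probability uniformly over a 4-word vocabulary
    # assigns each caption word probability 0.25.
    uniform = [[math.log(0.25)] * 3, [math.log(0.25)] * 2]
    print(caption_perplexity(uniform))
```

A model that guesses uniformly over a vocabulary of V words has perplexity V, so values well below the vocabulary size indicate that the image features are informative about the caption.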

Duangmanee Putthividhya, Hagai Thomas Attias, Srik


Annotation | Computer Vision | Correspondence LDA | CVPR 2010 | Multi-modal Latent Dirichlet


» MultiModal Hierarchical Dirichlet Process Model for Predicting Image Annotation and ImageO...

» Supervised topic model for automatic image annotation

» Automatic annotation of unique locations from video and text

Post Info

Added	06 Dec 2010
Updated	06 Dec 2010
Type	Conference
Year	2010
Where	CVPR
Authors	Duangmanee Putthividhya, Hagai Thomas Attias, Srikantan S. Nagarajan
