Sciweavers

CVPR
2010
IEEE

Topic regression multi-modal Latent Dirichlet Allocation for image annotation

We present topic-regression multi-modal Latent Dirichlet Allocation (tr-mmLDA), a novel statistical topic model for image and video annotation. At the heart of the model lies a latent variable regression approach that captures correlations between image or video features and annotation texts. Instead of sharing a single set of latent topics between the two data modalities, as in correspondence LDA [2], our approach introduces a regression module that correlates the two sets of topics. This captures more general forms of association and allows the number of topics in the two modalities to differ. We demonstrate the power of tr-mmLDA on two standard annotation datasets: a 5000-image subset of COREL and a 2687-image LabelMe dataset. The proposed association model shows improved performance over correspondence LDA as measured by caption perplexity.
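The two ideas in the abstract, a regression that links two topic sets of different sizes and evaluation by caption perplexity, can be illustrated with a minimal sketch. This is not the authors' model or inference procedure: it assumes point estimates, a plain linear map followed by a softmax, and the usual definition of perplexity as the exponential of the negative mean log-likelihood per caption word. The names `A`, `b`, `beta`, `caption_word_probs`, and `caption_perplexity` are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Map real-valued scores to a probability vector."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def caption_word_probs(image_topic_freq, A, b, beta):
    """Predict a caption-word distribution from image-topic proportions.

    image_topic_freq: length-K proportions of image topics (assumed given)
    A: L x K regression matrix, b: length-L offset (hypothetical parameters)
    beta: L x V word distributions, one row per caption topic
    K and L need not be equal.
    """
    # Linear regression from K image topics to L caption-topic scores.
    scores = [sum(a * z for a, z in zip(row, image_topic_freq)) + offset
              for row, offset in zip(A, b)]
    theta = softmax(scores)  # caption-topic proportions
    vocab_size = len(beta[0])
    # Mix the caption-topic word distributions by those proportions.
    return [sum(theta[l] * beta[l][v] for l in range(len(beta)))
            for v in range(vocab_size)]


def caption_perplexity(captions, word_probs_per_image):
    """Exponential of the negative mean log-likelihood per caption word."""
    log_lik = 0.0
    n_words = 0
    for caption, probs in zip(captions, word_probs_per_image):
        for word_id in caption:
            log_lik += math.log(probs[word_id])
            n_words += 1
    return math.exp(-log_lik / n_words)


# Toy example: K = 3 image topics, L = 2 caption topics, V = 4 caption words.
A = [[1.0, 0.0, -1.0],
     [0.5, 0.5, 0.5]]
b = [0.0, 0.1]
beta = [[0.7, 0.1, 0.1, 0.1],
        [0.1, 0.1, 0.1, 0.7]]
probs = caption_word_probs([0.2, 0.5, 0.3], A, b, beta)
perplexity = caption_perplexity([[0, 3]], [probs])
```

Lower perplexity means the model assigns higher probability to the held-out caption words. A model that predicts uniformly over V words scores exactly V, which gives a simple baseline for reading the numbers.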
Added 06 Dec 2010
Updated 06 Dec 2010
Type Conference
Year 2010
Where CVPR
Authors Duangmanee Putthividhya, Hagai Thomas Attias, Srikantan S. Nagarajan