

Discovering meaningful multimedia patterns with audio-visual concepts and associated text

This work presents the first effort to automatically annotate the semantic meanings of temporal video patterns obtained through unsupervised discovery. The problem is interesting in domains where neither the perceptual patterns nor the semantic concepts have simple structures. The patterns in video are modeled with hierarchical hidden Markov models (HHMM), with efficient algorithms for learning the parameters, the model complexity, and the relevant features; the semantic meanings are drawn from the words of the video's speech transcript. The pattern-word association is obtained via co-occurrence analysis and statistical machine translation models. Promising results are obtained through extensive experiments on more than 20 hours of TRECVID news video: video patterns associated with distinct topics such as El Niño and politics are identified, and the HHMM temporal structure model compares favorably to a non-temporal clustering algorithm.
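As a rough illustration of the co-occurrence analysis mentioned in the abstract (a minimal sketch, not the authors' implementation, and independent of the HHMM and machine-translation components), the Python snippet below scores (pattern, word) pairs with pointwise mutual information over video segments. The segment labels, transcript words, and the pmi_associations helper are all hypothetical inputs chosen for illustration.

import math
from collections import Counter
from typing import Dict, List, Tuple

def pmi_associations(
    segments: List[Tuple[int, List[str]]],  # (pattern label, transcript words) per segment
    min_count: int = 5,
) -> Dict[Tuple[int, str], float]:
    """Score each co-occurring (pattern, word) pair with pointwise mutual information."""
    pattern_counts = Counter()
    word_counts = Counter()
    pair_counts = Counter()
    n_segments = len(segments)

    for label, words in segments:
        pattern_counts[label] += 1
        for w in set(words):  # count presence/absence of a word per segment
            word_counts[w] += 1
            pair_counts[(label, w)] += 1

    scores = {}
    for (label, w), joint in pair_counts.items():
        if joint < min_count:
            continue  # skip rare pairs to reduce noise
        p_joint = joint / n_segments
        p_label = pattern_counts[label] / n_segments
        p_word = word_counts[w] / n_segments
        scores[(label, w)] = math.log(p_joint / (p_label * p_word))
    return scores

# Toy usage: pattern 0 co-occurs with weather words, pattern 1 with politics words.
if __name__ == "__main__":
    toy = [(0, ["el", "nino", "storm"]), (0, ["el", "nino", "rain"]),
           (1, ["senate", "vote"]), (1, ["senate", "bill"])]
    top = sorted(pmi_associations(toy, min_count=1).items(), key=lambda kv: -kv[1])[:5]
    print(top)

In the paper this kind of co-occurrence scoring is complemented by statistical machine translation models, which treat discovered pattern labels and transcript words as parallel streams to be aligned.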
Type: Conference
Year: 2004
Where: ICIP (IEEE)
Authors: Lexing Xie, Lyndon S. Kennedy, Shih-Fu Chang, Ajay Divakaran, Huifang Sun, Ching-Yung Lin