Audio-visual atoms for generic video concept classification


Download www.ee.columbia.edu

We investigate the challenging problem of joint audio-visual analysis of generic videos for semantic concept detection. We propose a novel representation, the Short-term Audio-Visual Atom (S-AVA), for improved concept detection. An S-AVA is defined as a short-term region track associated with regional visual features and background audio features. An effective algorithm, Short-Term Region tracking with joint Point Tracking and Region Segmentation (STR-PTRS), is developed to extract S-AVAs from generic videos under challenging conditions such as uneven lighting, clutter, occlusions, and complicated motions of both objects and camera. Discriminative audio-visual codebooks are constructed on top of S-AVAs using Multiple Instance Learning. Codebook-based features are generated for semantic concept detection. We extensively evaluate our algorithm over Kodak's consumer benchmark video set from real users. Experimental results confirm significant performance impro...
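To make the codebook-based feature step more concrete, here is a minimal sketch in Python. It is not the authors' implementation. The abstract does not say how visual and audio features are combined or how a video is mapped onto the codebook, so the concatenation in `s_ava_feature`, the nearest-atom distance in `codebook_features`, and all names and toy values are assumptions made for illustration. The codebook here is written by hand, whereas the paper learns it with Multiple Instance Learning.

```python
import math


def s_ava_feature(visual, audio):
    """Build one atom vector (assumption: simple concatenation of the
    region track's visual features and the background audio features)."""
    return list(visual) + list(audio)


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def codebook_features(bag, codebook):
    """Map a bag of atom vectors (one video) to a fixed-length vector.

    Entry i is the distance from codeword i to its nearest atom in the bag
    (assumption: a common bag-to-codebook mapping, not confirmed by the source).
    """
    return [min(euclidean(atom, word) for atom in bag) for word in codebook]


# Toy video with two atoms: 2-D visual features plus 1-D audio features.
bag = [
    s_ava_feature([0.0, 1.0], [0.5]),
    s_ava_feature([1.0, 0.0], [0.5]),
]

# Hypothetical hand-written codebook standing in for a learned one.
codebook = [
    [0.0, 1.0, 0.5],
    [1.0, 1.0, 0.5],
]

features = codebook_features(bag, codebook)
print(features)
```

The resulting vector has one entry per codeword regardless of how many atoms the video contains, which is what allows a standard classifier to be trained on videos of different lengths.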

Wei Jiang, Courtenay V. Cotton, Shih-Fu Chang, Dan Ellis, Alexander C. Loui


Audio-visual | Joint Audio-visual Analysis | Semantic Concept Detection | TOMCCAP 2010


Added	22 May 2011
Updated	22 May 2011
Type	Journal
Year	2010
Where	TOMCCAP
Authors	Wei Jiang, Courtenay V. Cotton, Shih-Fu Chang, Dan Ellis, Alexander C. Loui
