What's News, What's Not? Associating News Videos with Words

Text retrieval from broadcast news video is unsatisfactory because a transcript word frequently does not directly ‘describe’ the shot in which it was spoken. Extending the retrieved region to a window around the matching keyword provides better recall but low precision. We improve on text retrieval using the following approach. First, we segment the visual stream into coherent story-like units, using a set of visual news story delimiters. After filtering out clearly irrelevant classes of shots, we are still left with ambiguity about how words in the transcript relate to the visual content in the remaining shots of the story. Using a limited set of visual features at different semantic levels, ranging from color histograms to faces, cars, and outdoors, an association matrix captures the correlation of these visual features with specific transcript words. This matrix is then refined using an EM approach. Preliminary results show that this approach has the potential to significantly i...
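The association-and-refinement step can be pictured with a minimal sketch. The abstract does not say which EM formulation is used, so the code below assumes an IBM Model 1 style update, in which each transcript word in a story is softly assigned to the visual tokens of that story. The function name `em_association`, the toy `stories` data, and the token and word labels are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def em_association(stories, iterations=10):
    """Estimate t[v][w] = P(word w | visual token v) by EM.

    stories: list of (visual_tokens, words) pairs, one pair per story.
    Assumption: an IBM Model 1 style update, not necessarily the
    formulation used in the paper.
    """
    visuals = sorted({v for vs, _ in stories for v in vs})
    words = sorted({w for _, ws in stories for w in ws})
    # Start from a uniform association matrix.
    t = {v: {w: 1.0 / len(words) for w in words} for v in visuals}
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: share each word among the visual tokens of its story.
        for vs, ws in stories:
            for w in ws:
                norm = sum(t[v][w] for v in vs)
                for v in vs:
                    frac = t[v][w] / norm
                    count[(v, w)] += frac
                    total[v] += frac
        # M-step: renormalise the expected counts per visual token.
        for v in visuals:
            for w in words:
                t[v][w] = count[(v, w)] / total[v] if total[v] else 0.0
    return t

# Hypothetical toy data: visual tokens and transcript words per story.
stories = [
    (["face", "car"], ["clinton", "traffic"]),
    (["face"], ["clinton"]),
    (["car"], ["traffic"]),
]
table = em_association(stories)
for visual, row in table.items():
    print(visual, {w: round(p, 3) for w, p in row.items()})
```

In this toy setting the unambiguous stories pull the ambiguous one apart, so each visual token ends up strongly associated with a single word. Real shots would first need their visual features quantised into discrete tokens, a step this sketch leaves out.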

Pinar Duygulu, Alexander G. Hauptmann

CIVR 2004 | Text Retrieval | Transcript Words | Visual |

» MPEG7 contentbased analysisretrieval system and its applications

» PeopleLDA Anchoring Topics to People using Face Recognition

Post Info

Added	01 Jul 2010
Updated	01 Jul 2010
Type	Conference
Year	2004
Where	CIVR
Authors	Pinar Duygulu, Alexander G. Hauptmann
