Naming People from Dialog: Temporal Grouping and Weak Supervision

We address the character identification problem in movies and television videos: assigning names to faces on the screen. Most prior work on person recognition in video assumes some supervised data, such as a screenplay or hand-labeled faces. In this paper, our only source of ‘supervision’ is the dialog cues: first-, second- and third-person references (such as “I’m Jack”, “Hey, Jack!” and “Jack left”). While this kind of supervision is sparse and indirect, we exploit multiple modalities and their interactions (appearance, dialog, mouth movement, synchrony, continuity-editing cues) to effectively resolve identities through local temporal grouping followed by global weakly supervised recognition. We propose a novel temporal grouping model that partitions face tracks across multiple shots while respecting appearance, geometric and film-editing cues and constraints. In this model, states represent partitions of the k most recent face tracks, and transitions repr...
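To give a feel for the state space mentioned in the abstract, the following is a minimal sketch that enumerates every partition of k face tracks. It is not the authors' implementation: the function name `set_partitions`, the integer track ids, and the reading of each group as "tracks hypothesized to show the same person" are assumptions made for illustration only.

```python
from typing import Iterator, List


def set_partitions(items: List[int]) -> Iterator[List[List[int]]]:
    """Yield every partition of `items` into non-empty groups.

    Hypothetical helper: each yielded partition stands for one candidate
    state, where a group is read (by assumption) as a set of face tracks
    believed to belong to the same person.
    """
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Place `first` into each existing group in turn...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ...or open a new group that contains only `first`.
        yield [[first]] + partition


k = 3  # number of most recent face tracks (illustrative value)
states = list(set_partitions(list(range(k))))
for state in states:
    print(state)
```

The number of partitions grows quickly with k (it follows the Bell numbers: 5 for k = 3, 15 for k = 4, 52 for k = 5), which suggests one reason a model of this kind would restrict attention to only the k most recent tracks.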

Timothee Cour, Benjamin Sapp, Akash Nagle, Ben Taskar

Computer Vision | CVPR 2010 | Temporal Grouping | Weak Supervision |

Post Info

Added	08 Apr 2010
Updated	14 May 2010
Type	Conference
Year	2010
Where	CVPR
Authors	Timothee Cour, Benjamin Sapp, Akash Nagle, Ben Taskar
