WAPCV
2007
Springer

Language Label Learning for Visual Concepts Discovered from Video Sequences

Computational models of grounded language learning have been based on the premise that words and concepts are learned simultaneously. Given the mounting cognitive evidence for concept formation in infants, we argue that the availability of pre-lexical concepts (learned from image sequences) leads to considerable computational efficiency in word acquisition. Key to the process is a model of bottom-up visual attention in dynamic scenes. Background learning and foreground segmentation are used to generate robust tracking and to detect occlusion events. Trajectories are clustered to obtain motion event concepts. Object concepts (image schemas) are abstracted from the combined appearance and motion data. The set of acquired concepts under visual attentive focus is then correlated with contemporaneous commentary to learn the grounded semantics of words and multi-word phrasal concatenations from the narrative. We demonstrate that even based on a mere half hour of video (of a scene involving many obje...
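The final step of the abstract, correlating attended concepts with contemporaneous commentary, can be illustrated with a minimal sketch. The abstract does not name the association measure, so this example assumes pointwise mutual information (PMI) over co-occurrence counts. The function `associate_words`, the concept labels, and the toy commentary are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def associate_words(episodes):
    """Map each word to the concept it co-occurs with most strongly.

    episodes: list of (concepts, words) pairs, where `concepts` are the
    labels under attentive focus and `words` is the commentary uttered
    at the same time. PMI is an assumed measure, not the paper's.
    Returns {word: (best_concept, pmi)}.
    """
    n = len(episodes)
    concept_counts = Counter()
    word_counts = Counter()
    joint_counts = Counter()
    for concepts, words in episodes:
        # Count each concept and word at most once per episode.
        concepts, words = set(concepts), set(words)
        concept_counts.update(concepts)
        word_counts.update(words)
        joint_counts.update((w, c) for w in words for c in concepts)

    best = {}
    for (w, c), joint in joint_counts.items():
        # PMI = log( P(w, c) / (P(w) * P(c)) ), estimated from episode counts.
        pmi = math.log((joint * n) / (word_counts[w] * concept_counts[c]))
        if w not in best or pmi > best[w][1]:
            best[w] = (c, pmi)
    return best


# Hypothetical toy data: attended concepts paired with commentary words.
episodes = [
    ({"CAR", "MOVE_LEFT"}, ["the", "car", "goes", "left"]),
    ({"CAR", "MOVE_RIGHT"}, ["the", "car", "goes", "right"]),
    ({"PERSON", "MOVE_LEFT"}, ["the", "man", "walks", "left"]),
    ({"PERSON", "MOVE_RIGHT"}, ["the", "man", "walks", "right"]),
]
labels = associate_words(episodes)
```

In this toy data, words that occur in every episode (such as "the") score a PMI of zero against every concept, while content words score higher against the concept they consistently accompany. Because the concepts are formed before any words are seen, each word only has to be matched against a small set of candidates, which is one way to read the efficiency claim in the abstract.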
Prithwijit Guha, Amitabha Mukerjee
Added 09 Jun 2010
Updated 09 Jun 2010
Type Conference
Year 2007
Where WAPCV
Authors Prithwijit Guha, Amitabha Mukerjee