Current computational models of bottom-up and top-down components of attention are predictive of eye movements across a range of stimuli and of simple, fixed visual tasks (such a...
We investigate methods of segmenting, visualizing, and indexing presentation videos by both audio and visual data. The audio track is segmented by speaker, and augmented with key ...
Acoustic event detection (AED) aims to identify both the timestamps and the types of multiple events and has been found to be very challenging. The cues for these events often exist...
Po-Sen Huang, Xiaodan Zhuang, Mark Hasegawa-Johnso...
This paper presents techniques for multimedia annotation and their application to video summarization and translation. Our tool for annotation allows users to easily create annota...
Creating video recordings of events such as lectures or meetings is increasingly inexpensive and easy. However, reviewing the content of such video may be time-consuming and dif...