In this paper, we consider a vision-based system that can interpret a user's gestures in real time to manipulate windows and objects within a graphical user interface. A hand...
This paper presents a bottom-up approach that combines audio and video to simultaneously locate individual speakers in the video (2-D source localization) and segment their speech ...
Advances in technology are making video acquisition devices better and less costly, thereby increasing the number of applications that can effectively utilize digital video. Compare...
We describe how to use machine learning techniques to create a generative, videorealistic speech animation module. A human subject is first recorded using a video camera as he/she u...