This paper presents a bottom-up approach that combines audio and video to simultaneously locate individual speakers in the video (2-D source localization) and segment their speech ...
In the supervector UBM/GMM paradigm, each acoustic file is represented by the mean parameters of a GMM. This supervector space is used as a data representation space, which has...
In an environment where user contexts are complex and users have a very high degree of freedom in their activity, such as in daily life, several factors need to be considered for...
Hyoungnyoun Kim, Ig-Jae Kim, Hyoung-Gon Kim, Ji-Hy...
Each facial event gives rise to complex variation in facial appearance. In this paper, we propose similarity features to describe the facial appearance for video-based facial even...
We propose a fast 3D model acquisition system that aligns intensity and depth images, and reconstructs a textured 3D mesh. 3D views are registered with shape alignment based on in...
Louis-Philippe Morency, Ali Rahimi, Trevor Darrell