—This paper describes a fully automated framework to generate realistic head motion, eye gaze, and eyelid motion simultaneously based on live (or recorded) speech input. Its cent...
The TNO speaker speaker diarization system is based on a standard BIC segmentation and clustering algorithm. Since for the NIST Rich Transcription speaker dizarization evaluation m...
This article addresses the modeling of reverberant recording environments in the context of under-determined convolutive blind source separation. We model the contribution of each ...
We investigate the use of Independent Subspace Analysis (ISA) for instrument identification in musical recordings. We represent short-term log-power spectra of possibly polyphoni...
Gaussian processes have been widely used as a method for inferring the pose of articulated bodies directly from image data. While able to model complex non-linear functions, they ...