We describe a system for automatically extracting dynamics of tongue gestures from ultrasound images of the tongue using translational deep belief networks (tDBNs). In tDBNs, a jo...
This paper considers application of Deep Belief Nets (DBNs) to natural language call routing. DBNs have been successfully applied to a number of tasks, including image, audio and ...
Ruhi Sarikaya, Geoffrey E. Hinton, Bhuvana Ramabha...
While nonnegative matrix factorization (NMF) has successfully been applied for gain-robust multi-pitch detection, a method to track pitch values over time was not provided. We emb...
A framework for generating facial expressions from emotional states in daily conversation is described. The framework allows avatars to express the speaker’s state not just prot...
A real time speaker localization and detection system for videoconferencing environments is presented. In this system, a recently proposed modified Steered Response Power - Phase...