NIPS
2004

Real-Time Pitch Determination of One or More Voices by Nonnegative Matrix Factorization

An auditory "scene", composed of overlapping acoustic sources, can be viewed as a complex object whose constituent parts are the individual sources. Pitch is known to be an important cue for auditory scene analysis. In this paper, with the goal of building agents that operate in human environments, we describe a real-time system to identify the presence of one or more voices and compute their pitch. The signal processing in the front end is based on instantaneous frequency estimation, a method for tracking the partials of voiced speech, while the pattern-matching in the back end is based on nonnegative matrix factorization, an unsupervised algorithm for learning the parts of complex objects. While supporting a framework to analyze complicated auditory scenes, our system maintains real-time operability and state-of-the-art performance in clean speech.
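The back-end idea of matching a nonnegative spectrum against nonnegative parts can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the templates, frequency axis, or cost function, so everything below is an assumption made for illustration. The harmonic templates are synthetic and held fixed, the candidate pitches and bin width are invented, and the squared-error multiplicative update is the standard one for nonnegative least squares with a fixed basis. The names `harmonic_template` and `nonnegative_activations` are hypothetical.

```python
def harmonic_template(f0, n_bins, bin_hz, n_harmonics=5):
    """Toy spectrum with a peak of height 1/k at the k-th harmonic of f0."""
    template = [0.0] * n_bins
    for k in range(1, n_harmonics + 1):
        idx = round(k * f0 / bin_hz)
        if idx < n_bins:
            template[idx] += 1.0 / k
    return template


def nonnegative_activations(v, templates, n_iter=200, eps=1e-12):
    """Estimate h >= 0 so that sum_j h[j] * templates[j] approximates v.

    The templates (columns of W) stay fixed; only h is updated with the
    multiplicative rule h <- h * (W^T v) / (W^T W h), which keeps every
    entry of h nonnegative when started from a positive value.
    """
    n = len(templates)
    h = [1.0] * n
    # W^T v: correlation of each template with the observed spectrum.
    wtv = [sum(t_i * v_i for t_i, v_i in zip(t, v)) for t in templates]
    # W^T W: Gram matrix of the templates.
    gram = [[sum(a * b for a, b in zip(ta, tb)) for tb in templates]
            for ta in templates]
    for _ in range(n_iter):
        denom = [sum(gram[j][k] * h[k] for k in range(n)) for j in range(n)]
        h = [h[j] * wtv[j] / (denom[j] + eps) for j in range(n)]
    return h


# Assumed candidate pitches in Hz; 110 and 220 share harmonics (octave pair).
candidates = [110.0, 130.0, 170.0, 220.0]
bin_hz = 10.0   # assumed width of one frequency bin
n_bins = 128
templates = [harmonic_template(f, n_bins, bin_hz) for f in candidates]

# Synthetic frame with two voices: 130 Hz (gain 1.0) and 220 Hz (gain 0.5).
v = [1.0 * a + 0.5 * b for a, b in zip(templates[1], templates[3])]

h = nonnegative_activations(v, templates)
# Report candidates whose activation exceeds an arbitrary threshold.
detected = [f for f, a in zip(candidates, h) if a > 0.1]
print(detected)
```

In this toy frame the activation for 110 Hz shrinks toward zero even though its template overlaps the one for 220 Hz, because the nonnegativity constraint leaves no way to explain the remaining 220 Hz harmonics with the lower candidate. A full system would also need to learn or design realistic templates and process frames fast enough for real-time use, which this sketch does not attempt.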
Fei Sha, Lawrence K. Saul
Added: 31 Oct 2010
Updated: 31 Oct 2010
Type: Conference
Year: 2004
Where: NIPS
Authors: Fei Sha, Lawrence K. Saul