The scarcity of multi-track recordings severely constrains the training of probabilistic models for voice extraction from polyphonic music. We propose a n...
This paper studies the influence of n-gram language models on the recognition of sung phonemes and words. We train uni-, bi-, and trigram language models for phonemes and bi- and...
This paper presents a maximum likelihood approach to multiple fundamental frequency (F0) estimation for a mixture of harmonic sound sources, where the power spectrum of a time fra...