Recognition of phonemes and words in singing


Download www.cs.tut.fi

This paper studies the influence of n-gram language models in the recognition of sung phonemes and words. We train uni-, bi-, and trigram language models for phonemes and bi- and trigrams for words. The word-level language model is estimated from a textual lyrics database. In the recognition we use a hidden Markov model based phonetic recognizer adapted to singing voice. The models were tested on monophonic singing and on vocal lines separated from polyphonic music. On clean singing the phoneme recognition accuracies varied from 20% (no language model) to 39% (bigram) and on polyphonic music from 6% (no language model) to 20% (bigram). In word recognition, one fifth of the words were recognized in clean singing, the performance being lower on polyphonic music. We study the use of the recognition results in a query-by-singing application. Using the recognized words, we retrieve the songs by searching for the text in a text lyrics database. For the word recognition system having only ...
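As an illustration of what a bigram phoneme language model contributes, the following is a minimal Python sketch, not the authors' implementation. It estimates bigram probabilities from a few toy phoneme sequences and scores candidate sequences. The function names, the phoneme symbols, the toy data, and the add-alpha smoothing are all assumptions made for this example; the abstract does not say how the models were smoothed or trained.

```python
from collections import Counter
import math

START, END = "<s>", "</s>"  # hypothetical sequence boundary markers


def train_bigram(sequences, alpha=1.0):
    """Estimate add-alpha smoothed bigram probabilities (smoothing is an assumption)."""
    context_counts = Counter()  # how often each token appears as a left context
    bigram_counts = Counter()   # how often each (previous, current) pair appears
    vocab = {END}               # tokens that may follow a context
    for seq in sequences:
        tokens = [START] + list(seq) + [END]
        vocab.update(seq)
        for prev, cur in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    size = len(vocab)

    def prob(prev, cur):
        # P(cur | prev) with add-alpha smoothing over the vocabulary
        return (bigram_counts[(prev, cur)] + alpha) / (context_counts[prev] + alpha * size)

    return prob, vocab


def log_prob(prob, seq):
    """Log-probability of a whole phoneme sequence under the bigram model."""
    tokens = [START] + list(seq) + [END]
    return sum(math.log(prob(p, c)) for p, c in zip(tokens, tokens[1:]))


# Toy training data: made-up phoneme strings, not taken from the paper.
training = [["l", "ah", "v"], ["l", "ay", "f"], ["l", "ah", "k"]]
prob, vocab = train_bigram(training)

seen_order = log_prob(prob, ["l", "ah", "v"])
reversed_order = log_prob(prob, ["v", "ah", "l"])
```

In a recognizer, a score of this kind would be combined with the acoustic score from the hidden Markov models, so that phoneme sequences resembling the training data are preferred over acoustically similar but unlikely ones. Here, `seen_order` comes out higher than `reversed_order` because the first sequence follows transitions observed in the toy data.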

Annamaria Mesaros, Tuomas Virtanen


ICASSP 2010 | Language Model | Model | Polyphonic Music | Signal Processing |


Post Info

Added	06 Dec 2010
Updated	06 Dec 2010
Type	Conference
Year	2010
Where	ICASSP
Authors	Annamaria Mesaros, Tuomas Virtanen

