Sciweavers

SPEECH
2008

A comparison of grapheme and phoneme-based units for Spanish spoken term detection

The ever-increasing volume of audio data available online through the World Wide Web means that automatic methods for indexing and search are becoming essential. Hidden Markov model (HMM) keyword spotting and lattice search are the two most common approaches used by such systems. In keyword spotting, models or templates are defined for each search term before the speech is accessed and are then used to find matches. Lattice search (referred to as spoken term detection) uses a pre-indexing of the speech data in terms of word or sub-word units, which can then be searched quickly for arbitrary terms without referring to the original audio. In both cases, the search term can be modelled in terms of sub-word units, typically phonemes. For in-vocabulary words (i.e. words that appear in the pronunciation dictionary), letter-to-sound conversion systems are accepted to work well. However, for out-of-vocabulary (OOV) search terms, letter-to-sound conversion must be used to generate a pronunci...
Javier Tejedor, Dong Wang, Joe Frankel, Simon King, José Colás
Added 15 Dec 2010
Updated 15 Dec 2010
Type Journal
Year 2008
Where SPEECH
Authors Javier Tejedor, Dong Wang, Joe Frankel, Simon King, José Colás