Sciweavers

ICASSP
2011
IEEE

Efficient out-of-vocabulary term detection by n-gram array indices with distance from a syllable lattice

In spoken document retrieval, it is important to handle out-of-vocabulary (OOV) words and mis-recognized spoken words. Sub-word-unit-based recognition and retrieval methods have been proposed for this purpose. This paper describes a Japanese spoken document retrieval system that is robust to OOV words and to mis-recognition of sub-word units. We used individual syllables as the sub-word units in continuous speech recognition, and n-gram sequences of syllables taken from the recognized syllable lattice. We propose an n-gram indexing and retrieval method with distance in the syllable lattice to address OOV words, recognition errors, and high-speed retrieval. We applied this method to a 44-hour database of academic lecture presentations and detected OOV words with an F-value of 0.58 in less than 2.5 milliseconds.
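The general idea of indexing syllable n-grams and tolerating a small distance at query time can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single best syllable sequence per document instead of a full lattice, counts substitutions only, and scans all index keys linearly. The names `build_index`, `search`, and `max_dist`, and the toy syllable data, are hypothetical.

```python
from collections import defaultdict


def build_index(docs, n=3):
    """Map each syllable n-gram to the (doc_id, position) pairs where it occurs.

    docs: dict of doc_id -> list of syllable strings (assumed 1-best output,
    a simplification of the syllable lattice described in the abstract).
    """
    index = defaultdict(list)
    for doc_id, syllables in docs.items():
        for i in range(len(syllables) - n + 1):
            index[tuple(syllables[i:i + n])].append((doc_id, i))
    return index


def substitution_distance(a, b):
    """Number of positions at which two equal-length n-grams differ."""
    return sum(x != y for x, y in zip(a, b))


def search(index, query, n=3, max_dist=1):
    """Return sorted (doc_id, start) pairs where every query n-gram matches
    an indexed n-gram within max_dist substitutions at the aligned position."""
    num_grams = len(query) - n + 1
    if num_grams <= 0:
        return []
    votes = defaultdict(int)
    for k in range(num_grams):
        q_gram = tuple(query[k:k + n])
        # Linear scan over index keys: fine for a sketch, not for real speed.
        for gram, postings in index.items():
            if substitution_distance(q_gram, gram) <= max_dist:
                for doc_id, pos in postings:
                    # A match at offset k implies the term starts at pos - k.
                    votes[(doc_id, pos - k)] += 1
    return sorted(c for c, v in votes.items() if v == num_grams)


# Toy data: "lec2" contains one mis-recognized syllable ("byu" for "pyu").
docs = {
    "lec1": ["ko", "n", "pyu", "u", "ta"],
    "lec2": ["ko", "n", "byu", "u", "ta"],
    "lec3": ["a", "i", "u", "e", "o"],
}
index = build_index(docs)
hits = search(index, ["ko", "n", "pyu", "u", "ta"], max_dist=1)
```

Because the query is matched as syllable n-grams rather than as a dictionary word, a term absent from the recognizer's vocabulary can still be found, and the distance tolerance lets a document with one mis-recognized syllable remain a candidate.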
Added 20 Aug 2011
Updated 20 Aug 2011
Type Journal
Year 2011
Where ICASSP
Authors Keisuke Iwami, Yasuhisa Fujii, Kazumasa Yamamoto, Seiichi Nakagawa