Abstract. In speech recognition, phonemes have demonstrated their efficacy to model the words of a language. While they are well defined for languages, their extension to human act...
Kaustubh Kulkarni, Edmond Boyer, Radu Horaud, Amit...
Rapid and inexpensive techniques for automatic transcription of speech have the potential to dramatically expand the types of content to which information retrieval techniques can...
Accurate unsupervised learning of phonemes of a language directly from speech is demonstrated via an algorithm for joint unsupervised learning of the topology and parameters of a ...
Abstract. Identification of music instruments in polyphonic sounds is difficult and challenging, especially where heterogeneous harmonic partials are overlapping with each other....
We investigate methods of segmenting, visualizing, and indexing presentation videos by both audio and visual data. The audio track is segmented by speaker, and augmented with key ...