Face-to-face meetings usually encompass several modalities, including speech, gesture, handwriting, and person identification. Recognition and integration of each of these modaliti...
Michael Bett, Ralph Gross, Hua Yu, Xiaojin Zhu, Yu...
This paper presents a set of corpus-based text-to-speech synthesis technologies for Mandarin Chinese. A large speech corpus produced by a single speaker is used, and the speech out...
Although techniques to automatically generate metadata have been steadily refined over the past decade, archive professionals at radio broadcasters continue to use conventional au...
A collaborative framework for detecting the different sources in mixed signals is presented in this paper. The approach is based on CHiLasso, a convex collaborative hierarchical s...
This paper reports the results of experiments in identifying complex Arabic phonetic features using a rule-based system (SARPH) and modular connectionist architectures. The firs...