We propose a multimodal speaker segmentation algorithm with two main contributions: First, we suggest a hidden Markov model architecture that performs fusion of the three modaliti...
Viktor Rozgic, Kyu Jeong Han, Panayiotis G. Georgi...
CereProc® Ltd. have recently released a beta version of a commercial unit selection synthesiser featuring XML control of speech style. The system is freely available for academic ...
We present a novel algorithm for structural analysis of audio to detect repetitive patterns that are suitable for content-based audio information retrieval systems, since repetiti...
In this paper we investigate the combination of complementary acoustic feature streams in large vocabulary continuous speech recognition (LVCSR). We have explored the use of acoust...
An essential step in the generation of expressive speech synthesis is the automatic detection and classification of emotions most likely to be present in textual input. At last I...