This paper describes an initial approach to emotional speech synthesis in Catalan based on a diphone concatenation TTS system. The main goal of this work is to develop a simple pro...
The advent of statistical speech synthesis has enabled the unification of the basic techniques used in speech synthesis and recognition. Adaptation techniques that have been succ...
Lakshmi Saheer, Philip N. Garner, John Dines, Hui ...
The following paper presents a novel audio-visual approach for unsupervised speaker locationing. Using recordings from a single, low-resolution room overview camera and a single f...
This paper presents a method for automatic multimodal person authentication using speech, face and visual speech modalities. The proposed method uses the motion information to loc...
The perceived quality of a synthetic visual speech signal greatly depends on the smoothness of the presented visual articulators. This paper explains how concatenative visual spee...