Text-to-Audiovisual Speech Synthesizer

This paper describes a text-to-audiovisual speech synthesizer that incorporates head and eye movements. The face is modeled using a set of images of a human subject. Visemes, the lip images corresponding to phonemes, are extracted from a recorded video. Smooth transitions between visemes are achieved by morphing along the correspondences between them, which are obtained by optical flow. The paper also describes methods for introducing nonverbal mechanisms of visual speech communication, such as eye blinks and head nods. Eye movements use a simple mask-based approach, and head movement is generated by view morphing. A complete audiovisual sequence is constructed by concatenating the viseme transitions and synchronizing the visual stream with the audio stream. An effort has been made to integrate all these features into a single system, which takes text together with head and eye movement parameters as input and produces the audiovisual stream.
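The viseme transition and concatenation steps can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 25 fps default, nearest-neighbour sampling, and the assumption that a dense flow field `flow_ab` (per-pixel `(dx, dy)` from viseme A to viseme B) has already been computed are all hypothetical choices made for illustration.

```python
import numpy as np

def warp(image, flow, scale):
    """Backward-warp `image` by `scale * flow` (nearest-neighbour sampling).

    Assumption: `flow[..., 0]` is dx and `flow[..., 1]` is dy, and the flow
    is smooth enough that sampling it at the destination pixel is acceptable.
    """
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs - scale * flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys - scale * flow[..., 1]).astype(int), 0, h - 1)
    return image[src_y, src_x]

def morph(viseme_a, viseme_b, flow_ab, t):
    """Intermediate frame at t in [0, 1]: warp both visemes toward the
    in-between shape along the correspondence, then cross-dissolve."""
    warped_a = warp(viseme_a, flow_ab, t)            # A moved forward by t
    warped_b = warp(viseme_b, flow_ab, -(1.0 - t))   # B moved back by 1 - t
    return (1.0 - t) * warped_a + t * warped_b

def transition_frames(viseme_a, viseme_b, flow_ab, duration_s, fps=25):
    """Frames covering one viseme-to-viseme transition of `duration_s` seconds,
    so that the visual stream matches the phoneme timing of the audio."""
    n = max(1, int(round(duration_s * fps)))
    return [morph(viseme_a, viseme_b, flow_ab, i / n) for i in range(n)]

def build_sequence(visemes, flows, durations_s, fps=25):
    """Concatenate the transitions between consecutive visemes."""
    frames = []
    for a, b, flow, d in zip(visemes[:-1], visemes[1:], flows, durations_s):
        frames.extend(transition_frames(a, b, flow, d, fps))
    frames.append(visemes[-1])  # hold the final viseme
    return frames
```

With a zero flow field the morph reduces to a plain cross-dissolve; the flow term is what moves lip contours instead of merely fading between them. Deriving each duration from the phoneme timing of the synthesized audio is one way to keep the two streams aligned.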
Udit Kumar Goyal, Ashish Kapoor, Prem Kalra
Added 26 Aug 2010
Updated 26 Aug 2010
Type Conference
Year 2000
Where VW