MikeTalk: A Talking Facial Display Based on Morphing Visemes

We present MikeTalk, a text-to-audiovisual speech synthesizer which converts input text into an audiovisual speech stream. MikeTalk is built using visemes, which are a set of images spanning a large range of mouth shapes. The visemes are acquired from a recorded visual corpus of a human subject which is specifically designed to elicit one instantiation of each viseme. Using optical flow methods, correspondence from every viseme to every other viseme is computed automatically. By morphing along this correspondence, a smooth transition between viseme images may be generated. A complete visual utterance is constructed by concatenating viseme transitions. Finally, phoneme and timing information extracted from a text-to-speech synthesizer is exploited to determine which viseme transitions to use, and the rate at which the morphing process should occur. In this manner, we are able to synchronize the visual speech stream with the audio speech stream, and hence give the impression of a photor...
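The timing step described in the abstract can be illustrated with a minimal sketch. It shows how phoneme durations from a text-to-speech synthesizer might be turned into a per-frame list of viseme transitions, each with a morph parameter. The phoneme-to-viseme table, the function name `schedule_transitions`, the linear morph parameter, and the choice to spread each transition over the duration of its first phoneme are all assumptions made for illustration; the abstract does not specify these details.

```python
from typing import List, Tuple

# Hypothetical phoneme-to-viseme table (illustrative only, not taken from the paper).
PHONEME_TO_VISEME = {"m": "V_m", "ay": "V_ay", "k": "V_k", "sil": "V_sil"}


def schedule_transitions(
    phonemes: List[Tuple[str, int]], fps: int = 30
) -> List[Tuple[str, str, float]]:
    """Return one (source_viseme, target_viseme, alpha) triple per video frame.

    `phonemes` is a list of (phoneme, duration_ms) pairs, as a TTS system
    might report them. alpha runs from 0.0 (pure source viseme) towards 1.0
    (pure target viseme) and would drive the morph between the two images.
    """
    frames: List[Tuple[str, str, float]] = []
    # Walk over consecutive phoneme pairs: each pair is one viseme transition.
    for (ph_a, dur_ms), (ph_b, _) in zip(phonemes, phonemes[1:]):
        src = PHONEME_TO_VISEME[ph_a]
        dst = PHONEME_TO_VISEME[ph_b]
        # Assumption: the transition lasts as long as the first phoneme of the pair.
        n_frames = max(1, round(dur_ms * fps / 1000))
        for i in range(n_frames):
            frames.append((src, dst, i / n_frames))
    return frames


if __name__ == "__main__":
    utterance = [("m", 100), ("ay", 200), ("k", 100)]
    for src, dst, alpha in schedule_transitions(utterance):
        print(f"{src} -> {dst}  alpha={alpha:.2f}")
```

In a full system, each `alpha` would select an intermediate image along the precomputed optical-flow correspondence between the two visemes, so a longer phoneme simply yields more intermediate frames and a slower morph.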

Tony Ezzat, Tomaso Poggio

Audiovisual Speech Stream | CA 1998 | Computer Animation | Speech Stream | Text-to-audiovisual Speech Synthesizer |

Post Info

Added	04 Aug 2010
Updated	04 Aug 2010
Type	Conference
Year	1998
Where	CA
Authors	Tony Ezzat, Tomaso Poggio
