Sciweavers

» Multimodal Speech Synthesis
CSL
2002
Springer
14 years 9 months ago
Learning visually grounded words and syntax for a scene description task
A spoken language generation system has been developed that learns to describe objects in computer-generated visual scenes. The system is trained by a `show-and-tell' procedu...
Deb K. Roy
INTERSPEECH
2010
14 years 4 months ago
An HMM trajectory tiling (HTT) approach to high quality TTS
We propose an HMM Trajectory Tiling (HTT) approach to high quality TTS, which is our entry to Blizzard Challenge 2010. In HTT, first refined HMM is trained with the Minimum Genera...
Yao Qian, Zhi-Jie Yan, Yijian Wu, Frank K. Soong, ...
MM
2005
ACM
15 years 3 months ago
Unsupervised content discovery in composite audio
Automatically extracting semantic content from audio streams can be helpful in many multimedia applications. Motivated by the known limitations of traditional supervised approache...
Rui Cai, Lie Lu, Alan Hanjalic
SIGIR
2003
ACM
15 years 3 months ago
Transliteration of proper names in cross-language applications
Translation of proper names is generally recognized as a significant problem in many multi-lingual text and speech processing applications. Even when large bilingual lexicons use...
Paola Virga, Sanjeev Khudanpur
IWEC
2004
14 years 11 months ago
Development of Extemporaneous Performance by Synthetic Actors in the Rehearsal Process
Autonomous synthetic actors must invent variations of known material in order to perform given only a limited script, and to assist the director with development of the performance...
Tony A. Meyer, Chris H. Messom