A hidden Markov-model-based trainable speech synthesizer


This paper presents a new approach to speech synthesis in which a set of cross-word decision-tree state-clustered context-dependent hidden Markov models is used to define a set of subphone units to be used in a concatenation synthesizer. The models, trees, waveform segments and other parameters representing each clustered state are obtained completely automatically through training on a 1-hour single-speaker continuous-speech database. During synthesis the required utterance, specified as a string of words of known phonetic pronunciation, is generated as a sequence of these clustered states using a TD-PSOLA waveform concatenation synthesizer. The system produces speech which, though in a monotone, is both natural sounding and highly intelligible. A Modified Rhyme Test conducted to measure segmental intelligibility yielded a 5
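The synthesis step described in the abstract (phone string, then context-dependent clustered states, then concatenated waveform segments) can be illustrated with a small toy sketch. This is not the authors' implementation. Every name here (`clustered_state_id`, `states_for_utterance`, `crossfade_concat`, `synthesize`, the three-states-per-phone setting, the vowel/consonant question, and the made-up inventory) is a hypothetical assumption for illustration. A single hand-written context question stands in for the trained decision trees, and a plain linear crossfade stands in for TD-PSOLA, which additionally works pitch-synchronously and can modify pitch and duration.

```python
from typing import Dict, List

VOWELS = {"a", "e", "i", "o", "u"}
STATES_PER_PHONE = 3  # assumption: three subphone states per phone model


def clustered_state_id(left: str, phone: str, right: str, state: int) -> str:
    """Toy stand-in for a per-state decision tree.

    A trained tree would ask many learned questions about the context;
    here one vowel/consonant question per neighbour selects the leaf.
    """
    l_class = "V" if left in VOWELS else "C"
    r_class = "V" if right in VOWELS else "C"
    return f"{phone}.{state}.{l_class}{r_class}"


def states_for_utterance(phones: List[str]) -> List[str]:
    """Map a phone string to its sequence of clustered-state ids.

    The phone list spans the whole utterance, so the left and right
    contexts naturally cross word boundaries.
    """
    padded = ["sil"] + phones + ["sil"]
    ids = []
    for i in range(1, len(padded) - 1):
        for s in range(STATES_PER_PHONE):
            ids.append(clustered_state_id(padded[i - 1], padded[i], padded[i + 1], s))
    return ids


def crossfade_concat(segments: List[List[float]], overlap: int) -> List[float]:
    """Join segments with a linear crossfade (a simplification, not TD-PSOLA)."""
    out: List[float] = []
    for seg in segments:
        if not out or overlap == 0:
            out.extend(seg)
            continue
        n = min(overlap, len(out), len(seg))
        for i in range(n):
            w = (i + 1) / (n + 1)  # weight of the incoming segment
            out[-n + i] = (1 - w) * out[-n + i] + w * seg[i]
        out.extend(seg[n:])
    return out


def synthesize(phones: List[str], inventory: Dict[str, List[float]], overlap: int = 2) -> List[float]:
    """Look up one stored segment per clustered state and concatenate them."""
    segments = [inventory[sid] for sid in states_for_utterance(phones)]
    return crossfade_concat(segments, overlap)


# Usage with a made-up inventory: one constant-valued segment per state id.
phones = ["k", "a", "t"]
ids = states_for_utterance(phones)
inventory = {sid: [float(n)] * 6 for n, sid in enumerate(ids)}
wave = synthesize(phones, inventory, overlap=2)
```

In this sketch the inventory holds exactly one segment per clustered state, mirroring the abstract's statement that each clustered state is represented by its own waveform segment and parameters; how those segments are selected during training is not covered here.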

R. E. Donovan, Philip C. Woodland


Automated Reasoning | Concatenation Synthesizer | CSL 1999 | Hidden Markov Models | Waveform Concatenation Synthesizer


Post Info

Added	22 Dec 2010
Updated	22 Dec 2010
Type	Journal
Year	1999
Where	CSL
Authors	R. E. Donovan, Philip C. Woodland
