A Prosodic Diphone Database for Korean Text-to-Speech Synthesis System



This paper presents a prosodically conditioned diphone database to be used in a Korean text-to-speech (TTS) synthesis system. The diphones are prosodically conditioned in the sense that a single conventional diphone is stored as different versions taken directly from the different prosodic domains of the prosodically labeled, read sentences (following the K-ToBI prosodic labeling conventions [3]). Four levels of the Korean prosodic domains were observed in the diphone selection process, so that four different versions of each diphone were selected. A 400-sentence subset of the Korean Newswire Text Corpora [5] was converted to its pronounced form as described in [8], and its read version was prosodically labeled. The greedy algorithm [7] identified 223 sentences containing 1,853 prosodic diphones (out of the 3,977 possible prosodic diphones) that can synthesize all four hundred utterances. Although our system cannot synthesize an unlimited number of sentences at this stage, the quality o...
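
The abstract does not spell out the greedy selection step, so the following is only a minimal sketch of a generic greedy set-cover pass of the kind it alludes to, not the author's implementation. It assumes each sentence has already been reduced to the set of prosodic diphones it contains, and it models a prosodic diphone as a (diphone, domain) pair. The function name `greedy_cover`, the sentence ids, the diphone strings, and the domain labels `D1` to `D4` are all hypothetical placeholders.

```python
from typing import Dict, List, Set, Tuple

# Assumption for illustration: a prosodic diphone is a (diphone, domain) pair.
ProsodicDiphone = Tuple[str, str]


def greedy_cover(sentences: Dict[str, Set[ProsodicDiphone]]) -> List[str]:
    """Pick sentences until every prosodic diphone in the input is covered."""
    # Everything that occurs anywhere in the corpus still needs covering.
    uncovered: Set[ProsodicDiphone] = set().union(*sentences.values())
    chosen: List[str] = []
    while uncovered:
        # Take the sentence that adds the most still-uncovered units;
        # iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(sentences), key=lambda s: len(sentences[s] & uncovered))
        chosen.append(best)
        uncovered -= sentences[best]
    return chosen


# Toy data (hypothetical): three sentences, four distinct prosodic diphones.
toy = {
    "s1": {("a-b", "D1"), ("b-c", "D2")},
    "s2": {("a-b", "D1")},
    "s3": {("c-d", "D3"), ("b-c", "D2"), ("d-e", "D4")},
}
print(greedy_cover(toy))
```

On the toy data the pass selects a subset of the sentences that still contains every unit, which is the same kind of reduction the abstract reports (223 of 400 sentences retained). Greedy selection gives a small cover but is not guaranteed to give the smallest one.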

Kyuchul Yoon


CICLING 2005 | Natural Language Processing | Prosodic | Prosodic Diphones | Prosodic Domains |


Added	26 Jun 2010
Updated	26 Jun 2010
Type	Conference
Year	2005
Where	CICLING
Authors	Kyuchul Yoon
