Speech synthesis using HMM based diphone inventory encoding for low-resource devices

In this paper we describe the compression of the diphone inventories used by the acoustic synthesis stage of a concatenative speech synthesis system. The inventory compression is based on a codebook drawn from the Gaussian mean vectors of phoneme HMMs. There are two encoding/synthesis schemes: a speaker-dependent one and a speaker-independent one. The advantage of the latter is that the same HMMs can potentially be shared by a recognizer and a synthesizer. We describe the steps to encode the inventories as well as the acoustic synthesis that uses them. With the proposed method, a diphone inventory of 1175 units can be compressed down to 19 kB. We show that the synthesis quality with HMM-encoded inventories matches the quality of synthesis with AMR- or SPEEX-encoded inventories at noticeably smaller inventory sizes.
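The codebook idea in the abstract can be illustrated with a minimal sketch. It assumes a plain nearest-neighbor vector quantization with squared Euclidean distance: each feature frame of a unit is replaced by the index of the closest codebook vector, and decoding looks the vectors up again. The abstract does not state the feature type, the distance measure, or how the HMM state structure is used, so the function names, the toy two-dimensional vectors, and the distance choice below are all hypothetical and not the authors' method. The last line works out the average storage per unit from the reported figures, assuming 1 kB = 1000 bytes.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance (an assumed measure, not from the paper)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode_frames(frames: Sequence[Vector], codebook: Sequence[Vector]) -> List[int]:
    """Replace each frame by the index of its nearest codebook vector."""
    return [
        min(range(len(codebook)), key=lambda i: squared_distance(frame, codebook[i]))
        for frame in frames
    ]


def decode_indices(indices: Sequence[int], codebook: Sequence[Vector]) -> List[List[float]]:
    """Reconstruct an approximate frame sequence by codebook lookup."""
    return [list(codebook[i]) for i in indices]


# Toy codebook standing in for Gaussian mean vectors of phoneme HMMs.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]

# Toy feature frames standing in for one diphone unit.
frames = [[0.1, -0.2], [0.9, 1.2], [3.5, 0.3], [1.1, 0.8]]

indices = encode_frames(frames, codebook)
reconstructed = decode_indices(indices, codebook)

# Average size per unit from the reported figures (assuming 1 kB = 1000 bytes).
bytes_per_unit = 19_000 / 1175

print(indices)
print(round(bytes_per_unit, 1))
```

Only the index sequence has to be stored per unit in this sketch. If the codebook comes from HMMs that a recognizer on the same device already holds, as the speaker-independent scheme suggests, the codebook itself adds no extra storage.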

Guntram Strecha, Matthias Wolff

Acoustic Synthesis | Gaussian Mean Vectors | ICASSP 2011 | Inventories | Signal Processing

Added	21 Aug 2011
Updated	21 Aug 2011
Type	Conference
Year	2011
Where	ICASSP
Authors	Guntram Strecha, Matthias Wolff
