
Non-parallel training for many-to-many eigenvoice conversion

This paper presents a novel training method for an eigenvoice Gaussian mixture model (EV-GMM) that effectively uses non-parallel data sets for many-to-many eigenvoice conversion, a technique for converting an arbitrary source speaker’s voice into an arbitrary target speaker’s voice. In the proposed method, an initial EV-GMM is trained with the conventional method using parallel data sets consisting of a single reference speaker and multiple pre-stored speakers. The initial EV-GMM is then further refined using non-parallel data sets covering a larger number of pre-stored speakers, with the reference speaker’s voices treated as hidden variables. Experimental results demonstrate that the proposed method yields significant quality improvements in the converted speech by enabling the use of data from a larger number of pre-stored speakers.
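To make the eigenvoice idea concrete, the sketch below illustrates (in Python with NumPy) how an EV-GMM typically represents each mixture mean as a bias plus a weighted sum of eigenvoices, mu_m(w) = B_m w + b_m, and how a new speaker can be characterized by a low-dimensional weight vector w. This is only a simplified, hypothetical illustration: the eigenvoices are built here by PCA over pre-stored speakers' mean supervectors, and w is estimated by least squares from per-mixture mean statistics, not by the paper's EM-based non-parallel training. All function and variable names are invented for this sketch.

```python
# Minimal sketch (not the authors' implementation): each mixture mean of an
# EV-GMM is modeled as mu_m(w) = B_m @ w + b_m.  We build b and B by PCA over
# pre-stored speakers' mean supervectors and estimate w for a new speaker by
# least squares, as a simplified stand-in for EM-based adaptation.
import numpy as np

def train_eigenvoices(speaker_means, n_eigen):
    """speaker_means: (S, M, D) per-mixture target means for S pre-stored speakers.
    Returns bias supervector b (M*D,) and eigenvoice matrix B (M*D, n_eigen)."""
    S, M, D = speaker_means.shape
    supervectors = speaker_means.reshape(S, M * D)   # stack per-mixture means per speaker
    b = supervectors.mean(axis=0)                    # bias supervector
    centered = supervectors - b
    # PCA via SVD: rows of Vt are principal directions in supervector space
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    B = Vt[:n_eigen].T                               # (M*D, n_eigen) eigenvoices
    return b, B

def adapt_weights(target_means, b, B):
    """Estimate the eigenvoice weight vector w for a new target speaker by
    least squares on its per-mixture mean supervector (simplified adaptation)."""
    sv = target_means.reshape(-1)
    w, *_ = np.linalg.lstsq(B, sv - b, rcond=None)
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    S, M, D, K = 20, 8, 4, 3                         # speakers, mixtures, feature dim, eigenvoices
    speaker_means = rng.normal(size=(S, M, D))       # toy pre-stored speaker statistics
    b, B = train_eigenvoices(speaker_means, n_eigen=K)
    new_speaker_means = rng.normal(size=(M, D))      # toy statistics for a new target speaker
    w = adapt_weights(new_speaker_means, b, B)
    adapted_means = (B @ w + b).reshape(M, D)        # adapted mixture means mu_m(w)
    print("estimated eigenvoice weights:", w)
```

The key point the sketch conveys is that, once the eigenvoice space is trained, adapting to an arbitrary speaker reduces to estimating a small weight vector rather than retraining full GMM means, which is what makes many-to-many conversion with limited target data feasible.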
Type: Conference
Year: 2010
Where: ICASSP
Publisher: IEEE
Authors: Yamato Ohtani, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano