Sciweavers

INTERSPEECH
2010

Roles of the average voice in speaker-adaptive HMM-based speech synthesis

In speaker-adaptive HMM-based speech synthesis, there are a few speakers whose synthetic speech sounds worse than that of other speakers, despite having the same amount of adaptation data from within the same corpus. This paper investigates these fluctuations in quality and finds that as the mel-cepstral distance from the average voice becomes larger, the mean opinion scores (MOS) generally become worse. Although the negative correlation obtained is not particularly strong, this finding helps us improve the training and adaptation strategies for average voice models. Furthermore, we remark that this correlation is strongly linked to "vocal attractiveness."
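The relationship described in the abstract can be illustrated with a minimal sketch. It assumes the commonly used mel-cepstral distortion formula in dB and a plain Pearson correlation; the abstract does not state the exact distance definition or statistic used in the paper. The function names and all numbers below are hypothetical placeholders, not data from the paper.

```python
import math

def mel_cepstral_distance(c_a, c_b):
    """Common mel-cepstral distortion form (dB) between two mel-cepstral
    vectors of equal length. Assumption: the caller has already dropped
    the 0th (energy) coefficient, as is usually done."""
    sq = sum((a - b) ** 2 for a, b in zip(c_a, c_b))
    return (10.0 / math.log(10.0)) * math.sqrt(2.0 * sq)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# Hypothetical placeholder values for illustration only.
average_voice = [0.50, -0.20, 0.10]
speakers = {
    "spk_a": ([0.52, -0.18, 0.11], 3.6),  # (mean mel-cepstrum, MOS)
    "spk_b": ([0.70, -0.05, 0.02], 3.1),
    "spk_c": ([0.95, 0.15, -0.10], 2.7),
}

distances = [mel_cepstral_distance(average_voice, c) for c, _ in speakers.values()]
mos = [m for _, m in speakers.values()]

# A negative value means MOS tends to fall as the distance grows.
print(pearson(distances, mos))
```

In practice the per-speaker vectors would come from frame-averaged mel-cepstra of the adapted and average voice models, and the MOS values from a listening test.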
Added 18 May 2011
Updated 18 May 2011
Type Journal
Year 2010
Where INTERSPEECH
Authors Junichi Yamagishi, Oliver Watts, Simon King, Bela Usabaev