In this paper, we present our recent studies of F0 estimation from surface electromyographic (EMG) data using a Gaussian mixture model (GMM)-based voice conversion (VC) techni...
Keigo Nakamura, Matthias Janke, Michael Wand, Tanj...
In this paper we evaluate a method for generating synthetic speech at high speaking rates based on the interpolation of hidden semi-Markov models (HSMMs) trained on speech data re...
Michael Pucher, Dietmar Schabus, Junichi Yamagishi
In this paper, we propose a novel method for rapid feature space Maximum Likelihood Linear Regression (FMLLR) speaker adaptation based on bilinear models. When the amount of adapt...
Curtin University’s Talking Heads (TH) combine an MPEG-4 compliant Facial Animation Engine (FAE), a Text To Emotional Speech Synthesiser (TTES), a multi-modal Dialogue Manager (...
He Xiao, Donald Reid, Andrew Marriott, E. K. Gulla...
In this paper, we propose a robust compensation strategy to deal effectively with extraneous acoustic variations for spontaneous speech recognition. This strategy extends speaker a...