This paper presents a study of three statistical query translation models that use different units of translation. We begin with a review of a word-based translation model that us...
In speech, the unit preceding a pause tends to lengthen. This work presents the use of a dynamic Bayesian network to model the prepausal lengthening ...
Ning Ma, Chris Bartels, Jeff A. Bilmes, Phil Green
In the present work we address the problem of phone duration modeling for the needs of emotional speech synthesis. Specifically, relying on ten well-known machine learning techniqu...
Intra- and inter-speaker information, which includes acoustical, speaker style, speech rate and temporal variation, despite their critical importance for the verification of claims...
Yongxin Zhang, Adel Iskander Fahmy, Michael S. Sco...
This paper presents a new approach to speech synthesis in which a set of cross-word decision-tree state-clustered context-dependent hidden Markov models are used to define a set o...