We propose a novel utterance comparison model based on probability theory and factor analysis that computes the likelihood of two speech utterances originating from the same speak...
Our previous analysis of speaker-adaptive HMM-based speech synthesis methods suggested that there are two possible reasons why average voices can obtain higher subjective scores t...
Sandra Andraszewicz, Junichi Yamagishi, Simon King
This paper presents a Named Entity Recognition (NER) method dedicated to process speech transcriptions. The main principle behind this method is to collect in an unsupervised way ...
Natural sounds are structured on many time-scales. A typical segment of speech, for example, contains features that span four orders of magnitude: Sentences (∼1 s); phonemes (...
Voice conversion can be reduced to a problem to find a transformation function between the corresponding speech sequences of two speakers. Perhaps the most voice conversions meth...