Deep Belief Networks (DBNs) are multi-layer generative models. They can be trained to model windows of coefficients extracted from speech and they discover multiple layers of fea...
Abdel-rahman Mohamed, Tara N. Sainath, George Dahl...
Abstract. A new text-independent speaker identification method is presented. The phase spectrum of the all-pole linear prediction (LP) model is used to derive the speech features. The featur...
In this paper we revisit some basic configuration choices of HMM-based speech synthesis, such as waveform sampling rate, auditory frequency warping scale and the logarithmic scali...
The effect of additive noise in a speaker recognition system is well known to be a crucial problem in real-life applications. In a speaker recognition system, if the test utteranc...
To achieve reasonable accuracy in large-vocabulary speech recognition systems, it is important to use detailed acoustic models together with good long-span language models. For ex...
J. J. Odell, V. Valtchev, Philip C. Woodland, S. J...