MLP Internal Representation as Discriminative Features for Improved Speaker Recognition

Feature projection by non-linear discriminant analysis (NLDA) can substantially increase classification performance. In automatic speech recognition (ASR), the projection provided by the pre-squashed outputs of a one-hidden-layer multi-layer perceptron (MLP) trained to recognise speech sub-units (phonemes) has previously been shown to significantly increase ASR performance. An analogous approach cannot be applied directly to speaker recognition because there is no recognised set of "speaker sub-units" to provide a finite set of MLP target classes, and for many applications it is not practical to train an MLP with one output for each target speaker. In this paper we show that the output from the second hidden layer (compression layer) of an MLP with three hidden layers, trained to identify a subset of 100 speakers selected at random from a set of 300 training speakers in TIMIT, can provide a 77% relative error reduction for common Gaussian mixture model (GMM) based speaker iden...
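
The feature-extraction idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the layer sizes (39 inputs, 100/20/100 hidden units), the tanh squashing function, and the random untrained weights are all assumptions made for illustration, and the helper names `init_mlp` and `compression_features` are hypothetical. Returning the pre-squashing values of the compression layer is likewise an assumption borrowed from the ASR case described above; only the 100 speaker outputs and the use of the second of three hidden layers come from the abstract.

```python
import numpy as np

def init_mlp(sizes, seed=0):
    """Create random (weights, bias) pairs for consecutive layer sizes.
    Stand-in for a trained network; real weights would come from training."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def compression_features(x, params, layer=2, pre_squash=True):
    """Run the forward pass up to hidden layer `layer` and return its
    activations as features (before or after the tanh squashing)."""
    h = x
    for i, (W, b) in enumerate(params, start=1):
        a = h @ W + b              # linear (pre-squashing) activation
        if i == layer:
            return a if pre_squash else np.tanh(a)
        h = np.tanh(a)             # squash and pass to the next layer
    raise ValueError("layer index exceeds network depth")

# Hypothetical sizes: 39-dim input frames, three hidden layers
# (100, 20, 100) with a 20-unit compression layer, 100 speaker outputs.
sizes = [39, 100, 20, 100, 100]
params = init_mlp(sizes)

frames = np.random.default_rng(1).normal(size=(5, 39))  # 5 dummy frames
feats = compression_features(frames, params)
print(feats.shape)  # (5, 20)
```

In a pipeline of the kind the abstract describes, the 20-dimensional vectors in `feats` would replace or augment the original acoustic features before GMM-based speaker modelling; the layers after the compression layer are needed only while training the MLP.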

Dalei Wu, Andrew C. Morris, Jacques C. Koreman

Hidden Layers | Layer Multi-layer Perceptron | MLP Target Classes | NOLISP 2005 | Signal Processing |

Added	28 Jun 2010
Updated	28 Jun 2010
Type	Conference
Year	2005
Where	NOLISP
Authors	Dalei Wu, Andrew C. Morris, Jacques C. Koreman
