Unsupervised discovery and training of maximally dissimilar cluster models



One of the difficult problems of acoustic modeling for Automatic Speech Recognition (ASR) is how to adequately model the wide variety of acoustic conditions that may be present in the data. The problem is especially acute for tasks such as Google Search by Voice, where the amount of speech available per transaction is small and adaptation techniques start showing their limitations. However, because training data from a very large user population is available, it is possible to identify and jointly model subsets of the data with similar acoustic qualities. We describe a technique that allows us to perform this modeling at scale on large amounts of data by learning a tree-structured partition of the acoustic space, and we demonstrate that we can significantly improve recognition accuracy in various conditions through unsupervised Maximum Mutual Information (MMI) training. Being fully unsupervised, this technique scales easily to increasing numbers of conditions.
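The abstract does not say how the tree-structured partition is learned, so the following is only a rough illustration of the general idea, not the authors' method. It swaps in recursive 2-means splitting as a generic stand-in, and every name in it (`two_means`, `build_tree`, `assign`, the toy 2-D vectors) is hypothetical. Real acoustic features would be much higher-dimensional, and the MMI training step is not shown.

```python
# Illustrative sketch only: recursive 2-means splitting as a stand-in for
# "learning a tree-structured partition". Not the method from the paper.

def mean(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def two_means(points, iters=20):
    """Split points into two groups with a few rounds of 2-means.

    Initialization is deterministic: the first point, and the point
    farthest from it. Either returned group may be empty.
    """
    c0 = points[0]
    c1 = max(points, key=lambda p: sq_dist(p, c0))
    left, right = list(points), []
    for _ in range(iters):
        left = [p for p in points if sq_dist(p, c0) <= sq_dist(p, c1)]
        right = [p for p in points if sq_dist(p, c0) > sq_dist(p, c1)]
        if not left or not right:
            break
        c0, c1 = mean(left), mean(right)
    return left, right

def build_tree(points, depth, min_size=2):
    """Recursively split until depth runs out or a split gets too small."""
    node = {"centroid": mean(points), "size": len(points), "children": []}
    if depth == 0 or len(points) < 2 * min_size:
        return node
    left, right = two_means(points)
    if len(left) < min_size or len(right) < min_size:
        return node  # keep this node as a leaf
    node["children"] = [
        build_tree(left, depth - 1, min_size),
        build_tree(right, depth - 1, min_size),
    ]
    return node

def leaves(node):
    """Return all leaf nodes; each leaf is one cell of the partition."""
    if not node["children"]:
        return [node]
    return [leaf for child in node["children"] for leaf in leaves(child)]

def assign(node, point):
    """Route a vector to a leaf by following the nearest child centroid."""
    while node["children"]:
        node = min(node["children"], key=lambda c: sq_dist(point, c["centroid"]))
    return node

# Toy data: two well-separated groups standing in for two acoustic conditions.
points = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
          [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
tree = build_tree(points, depth=1)
```

In this toy setup each leaf would correspond to one subset of the data with similar characteristics, and a separate model could then be trained per leaf. How the paper actually chooses splits and trains the per-cluster models is not stated in the abstract.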

Françoise Beaufays, Vincent Vanhoucke, Brian Strope


Automatic Speech Recognition | INTERSPEECH 2010 | Maximum Mutual Information | Signal Processing | Similar Acoustic Qualities |


» Subspace outlier mining in large multimedia databases

» Unsupervised Discovery of Abnormal Activity Occurrences in Multidimensional Time Series wi...

» Towards Automatic Discovery of Object Categories

» Genetic interaction motif finding by expectation maximization a novel statistical model f...

» Unsupervised Learning of Models for Recognition

» AutoFeed an unsupervised learning system for generating webfeeds

» Clustering metagenomic sequences with interpolated Markov models

» Group-Induced Vector Spaces

Post Info

Added	18 May 2011
Updated	18 May 2011
Type	Journal
Year	2010
Where	INTERSPEECH
Authors	Françoise Beaufays, Vincent Vanhoucke, Brian Strope
