Using collective information in semi-supervised learning for speech recognition

Training accurate acoustic models typically requires a large amount of transcribed data, which can be expensive to obtain. In this paper, we describe a novel semi-supervised learning algorithm for automatic speech recognition. The algorithm decides whether a hypothesized transcription should be used in training by considering collective information from all available utterances, rather than relying solely on the confidence of that utterance itself. It estimates the expected entropy reduction that each utterance and transcription pair may cause over the whole unlabeled dataset and chooses the pairs with positive gains. We compare our algorithm with an existing confidence-based semi-supervised learning algorithm and show that the former consistently outperforms the latter when the same number of utterances is selected into the training set. We also indicate that our algorithm may determine the cutoff point in a principled way by demonstrating that the point it finds is ...
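The contrast between the two selection rules can be illustrated with a small sketch. This is not the paper's implementation: the abstract does not say how the expected entropy reduction is estimated, so the sketch assumes that the posteriors of every utterance after adding a candidate are simply given. The names `confidence_select`, `collective_select`, and `posteriors_after`, and all numbers, are hypothetical.

```python
import math


def entropy(dist):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0.0)


def confidence_select(posteriors, threshold):
    """Baseline: keep utterance i if its own top hypothesis is confident enough."""
    return [i for i, dist in enumerate(posteriors) if max(dist) >= threshold]


def collective_select(posteriors, posteriors_after):
    """Keep candidate i if adding it lowers total entropy over all utterances.

    posteriors_after[i][j] is an ASSUMED input: the posterior of utterance j
    after the model has been updated with candidate i and its hypothesized
    transcription. How to estimate it is outside the scope of this sketch.
    """
    base = sum(entropy(d) for d in posteriors)
    selected = []
    for i, updated in enumerate(posteriors_after):
        gain = base - sum(entropy(d) for d in updated)
        if gain > 0.0:  # positive gain: the whole dataset became less uncertain
            selected.append(i)
    return selected


# Toy data (made up): three utterances, two competing hypotheses each.
posteriors = [[0.9, 0.1], [0.6, 0.4], [0.5, 0.5]]
posteriors_after = [
    [[0.9, 0.1], [0.5, 0.5], [0.5, 0.5]],  # candidate 0 makes utterance 1 less certain
    [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]],  # candidate 1 sharpens utterances 1 and 2
    [[0.9, 0.1], [0.7, 0.3], [0.6, 0.4]],  # candidate 2 sharpens utterances 1 and 2
]

by_confidence = confidence_select(posteriors, threshold=0.8)
by_gain = collective_select(posteriors, posteriors_after)
print(by_confidence, by_gain)
```

In this toy setting the confidence rule keeps only the utterance that is already confident on its own, while the gain rule keeps the two candidates whose inclusion reduces uncertainty across the whole set. The zero threshold on the gain plays the role of the cutoff point mentioned in the abstract.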

Balakrishnan Varadarajan, Dong Yu, Li Deng, Alex A

Accurate Acoustic Models | Confidence-based Semi-supervised Learning | ICASSP 2009 | Semi-supervised Learning Algorithm | Signal Processing

» Semi-Supervised Sequential Labeling and Segmentation Using GigaWord Scale Unlabeled Data

» Experiments in Graph-Based Semi-Supervised Learning Methods for Class-Instance Acquisition

» Semi-Supervised Sequence Labeling with Self-Learned Features

» Semi-Supervised Classification Using Linear Neighborhood Propagation

» Semi-Supervised Fisher Linear Discriminant (SFLD)

» A segment-based audio-visual speech recognizer data collection development and initial exper...

» Enhanced Multimedia Content Access and Exploitation Using Semantic Speech Retrieval

» Active learning and semi-supervised learning for speech recognition: A unified framework usi...

Post Info

Added	21 May 2010
Updated	21 May 2010
Type	Conference
Year	2009
Where	ICASSP
Authors	Balakrishnan Varadarajan, Dong Yu, Li Deng, Alex Acero
