Multi-Class Composite N-gram Language Model for Spoken Language Processing Using Multiple Word Clusters

In this paper, a new language model, the Multi-Class Composite N-gram, is proposed to address the data-sparseness problem in spoken language, for which large training corpora are difficult to collect. The Multi-Class Composite N-gram maintains accurate word prediction and reliability on sparse data with a compact model size, based on multiple word clusters called Multi-Classes. In a Multi-Class, the statistical connectivity at each position of the N-gram is regarded as a word attribute, and a separate word cluster is created to represent each positional attribute. Furthermore, by introducing higher-order word N-grams through the grouping of frequent word successions, Multi-Class N-grams are extended to Multi-Class Composite N-grams. In experiments, the Multi-Class Composite N-grams achieve 9.5% lower perplexity and a 16% lower word error rate in speech recognition, with a 40% smaller parameter size than conventional word 3-grams.
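The core idea in the abstract — giving each word separate cluster labels for each N-gram position, then predicting a word via its class — can be sketched as a toy class-based bigram. This is a minimal illustration, not the paper's implementation: the vocabulary, cluster assignments, and corpus below are invented, and the composite step (merging frequent word successions into single tokens before counting) is only noted in a comment.

```python
from collections import defaultdict

# Hedged sketch of a Multi-Class bigram. Each word gets two cluster
# labels: a "target" class used when the word is being predicted, and a
# "context" class used when it conditions the following word. All words,
# classes, and counts here are invented toy examples.
target_class = {"the": "DET_t", "a": "DET_t", "cat": "NOUN_t", "dog": "NOUN_t"}
context_class = {"the": "DET_c", "a": "DET_c", "cat": "NOUN_c", "dog": "NOUN_c"}

# Toy corpus. In the composite extension, frequent word successions
# (e.g. "of the") would first be merged into single tokens before counting.
corpus = ["the cat", "the dog", "a cat", "a dog", "the cat"]

word_count = defaultdict(int)     # C(w)
class_count = defaultdict(int)    # C(target class)
class_bigram = defaultdict(int)   # C(context class of w_{i-1}, target class of w_i)
context_count = defaultdict(int)  # C(context class)

for sent in corpus:
    words = sent.split()
    for w in words:
        word_count[w] += 1
        class_count[target_class[w]] += 1
    for prev, cur in zip(words, words[1:]):
        class_bigram[(context_class[prev], target_class[cur])] += 1
        context_count[context_class[prev]] += 1

def prob(prev, cur):
    """P(cur | prev) approximated as
    P(cur | c_t(cur)) * P(c_t(cur) | c_c(prev))."""
    ct, cc = target_class[cur], context_class[prev]
    p_word_given_class = word_count[cur] / class_count[ct]
    p_class_transition = class_bigram[(cc, ct)] / context_count[cc]
    return p_word_given_class * p_class_transition

print(prob("the", "cat"))  # more frequent noun gets higher probability
print(prob("the", "dog"))
```

Because the class transition and the word-given-class distributions each need far fewer parameters than a full word bigram table, this factorization is what buys the compact model size the abstract reports.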
Hirofumi Yamamoto, Shuntaro Isogai, Yoshinori Sagisaka
Added 31 Oct 2010
Updated 31 Oct 2010
Type Conference
Year 2001
Where ACL
Authors Hirofumi Yamamoto, Shuntaro Isogai, Yoshinori Sagisaka