Sciweavers

TASLP
2010
12 years 11 months ago
Hierarchical Bayesian Language Models for Conversational Speech Recognition
Traditional n-gram language models are widely used in state-of-the-art large vocabulary speech recognition systems. This simple model suffers from some limitations, such as overfi...
Songfang Huang, Steve Renals
EMNLP
2009
13 years 2 months ago
Less is More: Significance-Based N-gram Selection for Smaller, Better Language Models
The recent availability of large corpora for training N-gram language models has shown the utility of models of higher order than just trigrams. In this paper, we investigate meth...
Robert C. Moore, Chris Quirk
ACL
2009
13 years 2 months ago
Improved Smoothing for N-gram Language Models Based on Ordinary Counts
Kneser-Ney (1995) smoothing and its variants are generally recognized as having the best perplexity of any known method for estimating N-gram language models. Kneser-Ney smoothing...
Robert C. Moore, Chris Quirk
EMNLP
2010
13 years 2 months ago
Storing the Web in Memory: Space Efficient Language Models with Constant Time Retrieval
We present three novel methods of compactly storing very large n-gram language models. These methods use substantially less space than all known approaches and allow n-gram probab...
David Guthrie, Mark Hepple
TSD
2010
Springer
13 years 2 months ago
Improving Automatic Image Captioning Using Text Summarization Techniques
This paper presents two different approaches to automatic captioning of geo-tagged images by summarizing multiple web-documents that contain information related to an image’s lo...
Laura Plaza, Elena Lloret, Ahmet Aker
ICGI
2010
Springer
13 years 5 months ago
Enhanced Suffix Arrays as Language Models: Virtual k-Testable Languages
In this article, we propose the use of suffix arrays to efficiently implement n-gram language models with practically unlimited size n. This approach, which is used with ...
Herman Stehouwer, Menno van Zaanen
NIPS
2008
13 years 5 months ago
A Scalable Hierarchical Distributed Language Model
Neural probabilistic language models (NPLMs) have been shown to be competitive with and occasionally superior to the widely-used n-gram language models. The main drawback of NPLMs...
Andriy Mnih, Geoffrey E. Hinton
LREC
2008
13 years 5 months ago
A Lightweight and Efficient Tool for Cleaning Web Pages
Originally conceived as a "naive" baseline experiment using traditional n-gram language models as classifiers, the NCLEANER system has turned out to be a fast and lightw...
Stefan Evert
EMNLP
2008
13 years 5 months ago
Coarse-to-Fine Syntactic Machine Translation using Language Projections
The intersection of tree transducer-based translation models with n-gram language models results in huge dynamic programs for machine translation decoding. We propose a multipass,...
Slav Petrov, Aria Haghighi, Dan Klein
CICLING
2003
Springer
13 years 9 months ago
Experiments with Linguistic Categories for Language Model Optimization
In this work we obtain robust category-based language models to be integrated into speech recognition systems. Deductive rules are used to select linguistic categories and to matc...
Arantza Casillas, Amparo Varona, Inés Torre...