Sciweavers

EMNLP
2009
13 years 2 months ago
Stream-based Randomised Language Models for SMT
Randomised techniques allow very big language models to be represented succinctly. However, being batch-based they are unsuitable for modelling an unbounded stream of language whi...
Abby Levenberg, Miles Osborne
NAACL
2010
13 years 2 months ago
Improved Extraction Assessment through Better Language Models
A variety of information extraction techniques rely on the fact that instances of the same relation are "distributionally similar," in that they tend to appear in simila...
Arun Ahuja, Doug Downey
NAACL
2010
13 years 2 months ago
Language identification of names with SVMs
The task of identifying the language of text or utterances has a number of applications in natural language processing. Language identification has traditionally been approached w...
Aditya Bhargava, Grzegorz Kondrak
ISAMI
2010
13 years 2 months ago
Employing Compact Intra-genomic Language Models to Predict Genomic Sequences and Characterize Their Entropy
Probabilistic models of languages are fundamental to understand and learn the profile of the subjacent code in order to estimate its entropy, enabling the verification and predicti...
Sérgio A. D. Deusdado, Paulo Carvalho
ACL
2010
13 years 2 months ago
Intelligent Selection of Language Model Training Data
We address the problem of selecting non-domain-specific language model training data to build auxiliary language models for use in tasks such as machine translation. Our approach i...
Robert C. Moore, William Lewis
SPIRE
2010
Springer
13 years 3 months ago
Hypergeometric Language Model and Zipf-Like Scoring Function for Web Document Similarity Retrieval
The retrieval of similar documents in the Web from a given document is different in many aspects from information retrieval based on queries generated by regular search engine use...
Felipe Bravo-Marquez, Gaston L'Huillier, Sebasti...
SIGIR
2002
ACM
13 years 4 months ago
Risk minimization and language modeling in text retrieval dissertation abstract
Dissertation Abstract. ChengXiang Zhai (Advisor: John Lafferty), Language Technologies Institute, School of Computer Science, Carnegie Mellon University. With the dramatic increase in online in...
ChengXiang Zhai
SIGIR
2002
ACM
13 years 4 months ago
Two-stage language models for information retrieval
The optimal settings of retrieval parameters often depend on both the document collection and the query, and are usually found through empirical tuning. In this paper, we propose ...
ChengXiang Zhai, John D. Lafferty
SIGIR
2002
ACM
13 years 4 months ago
Term-specific smoothing for the language modeling approach to information retrieval: the importance of a query term
This paper follows a formal approach to information retrieval based on statistical language models. By introducing some simple reformulations of the basic language modeling approa...
Djoerd Hiemstra
SIGIR
2002
ACM
13 years 4 months ago
Predicting query performance
We develop a method for predicting query performance by computing the relative entropy between a query language model and the corresponding collection language model. The resultin...
Stephen Cronen-Townsend, Yun Zhou, W. Bruce Croft