Sciweavers

NAACL
2004
A Language Modeling Approach to Predicting Reading Difficulty
We demonstrate a new research approach to the problem of predicting the reading difficulty of a text passage, by recasting readability in terms of statistical language modeling. W...
Kevyn Collins-Thompson, James P. Callan
NAACL
2004
Name Tagging with Word Clusters and Discriminative Training
We present a technique for augmenting annotated training data with hierarchical word clusters that are automatically derived from a large unannotated corpus. Cluster membership is...
Scott Miller, Jethran Guinness, Alex Zamanian
NAACL
2004
Multiple Similarity Measures and Source-Pair Information in Story Link Detection
State-of-the-art story link detection systems, that is, systems that determine whether two stories are about the same event or linked, are usually based on the cosine-similarity m...
Francine Chen, Ayman Farahat, Thorsten Brants
NAACL
2004
Robust Reading: Identification and Tracing of Ambiguous Names
A given entity, representing a person, a location or an organization, may be mentioned in text in multiple, ambiguous ways. Understanding natural language requires identifying whe...
Xin Li, Paul Morie, Dan Roth
NAACL
2004
Unsupervised Learning of Contextual Role Knowledge for Coreference Resolution
We present a coreference resolver called BABAR that uses contextual role knowledge to evaluate possible antecedents for an anaphor. BABAR uses information extraction patterns to i...
David L. Bean, Ellen Riloff
NAACL
2004
Catching the Drift: Probabilistic Content Models, with Applications to Generation and Summarization
We consider the problem of modeling the content structure of texts within a specific domain, in terms of the topics the texts address and the order in which these topics appear. W...
Regina Barzilay, Lillian Lee
NAACL
2004
Inferring Sentence-internal Temporal Relations
In this paper we propose a data intensive approach for inferring sentence-internal temporal relations, which relies on a simple probabilistic model and assumes no manual coding. W...
Mirella Lapata, Alex Lascarides
NAACL
2004
The Web as a Baseline: Evaluating the Performance of Unsupervised Web-based Models for a Range of NLP Tasks
Previous work demonstrated that web counts can be used to approximate bigram frequencies, and thus should be useful for a wide variety of NLP tasks. So far, only two generation ta...
Mirella Lapata, Frank Keller
NAACL
2004
A Probabilistic Rasch Analysis of Question Answering Evaluations
The field of Psychometrics routinely grapples with the question of what it means to measure the inherent ability of an organism to perform a given task, and for the last forty yea...
Rense Lange, Juan Moran, Warren R. Greiff, Lisa Fe...
NAACL
2004
Minimum Bayes-Risk Decoding for Statistical Machine Translation
We present Minimum Bayes-Risk (MBR) decoding for statistical machine translation. This statistical approach aims to minimize expected loss of translation errors under loss functio...
Shankar Kumar, William J. Byrne