Sciweavers

EMNLP
2008
13 years 6 months ago
Studying the History of Ideas Using Topic Models
How can the development of ideas in a scientific field be studied over time? We apply unsupervised topic modeling to the ACL Anthology to analyze historical trends in the field of...
David Hall, Daniel Jurafsky, Christopher D. Mannin...
EMNLP
2008
13 years 6 months ago
Language and Translation Model Adaptation using Comparable Corpora
Traditionally, statistical machine translation systems have relied on parallel bi-lingual data to train a translation model. While bi-lingual parallel data are expensive to genera...
Matthew G. Snover, Bonnie J. Dorr, Richard M. Schw...
EMNLP
2008
13 years 6 months ago
An Analysis of Active Learning Strategies for Sequence Labeling Tasks
Active learning is well-suited to many problems in natural language processing, where unlabeled data may be abundant but annotation is slow and expensive. This paper aims to shed ...
Burr Settles, Mark Craven
EMNLP
2008
13 years 6 months ago
Seed and Grow: Augmenting Statistically Generated Summary Sentences using Schematic Word Patterns
We examine the problem of content selection in statistical novel sentence generation. Our approach models the processes performed by professional editors when incorporating materi...
Stephen Wan, Robert Dale, Mark Dras, Cécile...
EMNLP
2008
13 years 6 months ago
Specialized Models and Ranking for Coreference Resolution
This paper investigates two strategies for improving coreference resolution: (1) training separate models that specialize in particular types of mentions (e.g., pronouns versus pr...
Pascal Denis, Jason Baldridge
EMNLP
2008
13 years 6 months ago
Decomposability of Translation Metrics for Improved Evaluation and Efficient Algorithms
BLEU is the de facto standard for evaluation and development of statistical machine translation systems. We describe three real-world situations involving comparisons between diff...
David Chiang, Steve DeNeefe, Yee Seng Chan, Hwee T...
EMNLP
2008
13 years 6 months ago
Summarizing Spoken and Written Conversations
In this paper we describe research on summarizing conversations in the meetings and emails domains. We introduce a conversation summarization system that works in multiple domains...
Gabriel Murray, Giuseppe Carenini
EMNLP
2008
13 years 6 months ago
Selecting Sentences for Answering Complex Questions
Complex questions that require inferencing and synthesizing information from multiple documents can be seen as a kind of topic-oriented, informative multi-document summarization. I...
Yllias Chali, Shafiq R. Joty
EMNLP
2008
13 years 6 months ago
An Exploration of Document Impact on Graph-Based Multi-Document Summarization
The graph-based ranking algorithm has recently been exploited for multi-document summarization by making use only of the sentence-to-sentence relationships in the documents, under...
Xiaojun Wan