Previous work on minimizing weighted finite-state automata (including transducers) is limited to particular types of weights. We present efficient new minimization algorithms th...
References included in multi-document summaries are often problematic. In this paper, we present a corpus study performed to derive a statistical model for the syntactic realizati...
In this work, we present a new semantic language modeling approach to model news stories in the Topic Detection and Tracking (TDT) task. In the new approach, we build a unigram la...
The QCS information retrieval (IR) system is presented as a tool for querying, clustering, and summarizing document sets. QCS has been developed as a modular development framework...
Daniel M. Dunlavy, John M. Conroy, Dianne P. O'Lea...
A pseudoword is a composite composed of two or more words chosen at random; the individual occurrences of the original words within a text are replaced by their conflation. Pseu...
We will demonstrate a spoken dialogue interface to a Geologist’s Field Assistant that is being developed as part of NASA’s Mobile Agents project. The assistant consists of a r...
This paper describes an application of active learning methods to the classification of phone strings recognized using unsupervised phonotactic models. The only training data req...
To boost the translation quality of EBMT based on a small bilingual corpus, we use an out-of-domain bilingual corpus and, in addition, the language model of an indo...
WordFreak is a natural language annotation tool that has been designed to be easy to extend to new domains and tasks. Specifically, a plug-in architecture has been developed whic...
Recent TREC results have demonstrated the need for deeper text understanding methods. This paper introduces the idea of automated reasoning applied to question answering and shows...
Dan I. Moldovan, Christine Clark, Sanda M. Harabag...