Traditionally, statistical machine translation systems have relied on parallel bi-lingual data to train a translation model. While bi-lingual parallel data are expensive to genera...
Matthew G. Snover, Bonnie J. Dorr, Richard M. Schw...
In this paper, we report our experiments in the TREC 2008 Relevance Feedback Track. Our main goal is to study a novel problem in feedback, i.e., optimization of the balance of the...
In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued s...
Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz
This paper addresses the problem of extending an adaptive information filtering system to make decisions about the novelty and redundancy of relevant documents. It argues that rel...
In this paper, we introduce the concept of "user policies" and its applications to the browsing of HTML documents. The objective of policies is to specify user preferenc...