This work addresses the problem of document image analysis, and more particularly the topic of document structure recognition in old, damaged and handwritten document. The goal of...
: We describe our participation in the TREC 2003 Robust and Web tracks. For the Robust track, we experimented with the impact of stemming and feedback on the worst scoring topics. ...
Jaap Kamps, Christof Monz, Maarten de Rijke, B&oum...
Ranking for multilingual information retrieval (MLIR) is a task to rank documents of different languages solely based on their relevancy to the query regardless of query’s langu...
News articles about the same event published over time have properties that challenge NLP and IR applications. A cluster of such texts typically exhibits instances of paraphrase a...
This paper demonstrates a new method for leveraging unstructured annotations to infer semantic document properties. We consider the domain of product reviews, which are often anno...
S. R. K. Branavan, Harr Chen, Jacob Eisenstein, Re...