We present a novel approach to managing redundancy in sequence databanks such as GenBank. We store clusters of near-identical sequences as a representative union-sequence and a se...
Michael Cameron, Yaniv Bernstein, Hugh E. Williams
This paper compares several indexing methods for person names extracted from text, developed for an information retrieval system with requirements for fast approximate matching of...
Most existing information retrieval (IR) systems do not take much advantage of natural language processing (NLP) techniques due to the complexity and limited observed effectivenes...
This report describes a new open-domain answer retrieval system developed at the University of Edinburgh and gives results for the TREC-12 question answering track. Phrasal answer...
Jochen L. Leidner, Johan Bos, Tiphaine Dalmas, Jam...
This work explores the problem of cross-lingual pairwise similarity, where the task is to extract similar pairs of documents across two different languages. Solutions to this pro...