The World Wide Web holds a wealth of information related to almost any text classification task. This paper presents a method for mining the web to improve text classificati...
We describe our participation in the TREC 2004 Web and Terabyte tracks. For the web track, we employ mixture language models based on document full-text, incoming anchortext, and...
It is important to automatically extract key information from sensitive text documents for intelligence analysis. Text documents are usually unstructured and information extraction...
List question answering (QA) offers a unique challenge in effectively and efficiently locating a complete set of distinct answers from huge corpora or the Web. In TREC-12, the med...
Extracting information automatically from texts for database representation requires that phrases be well grouped beforehand so that entities can be separated adequately. This pr...