Named entities (e.g., "Kofi Annan", "Coca-Cola", "Second World War") are ubiquitous in web pages and other types of document and often provide a simpl...
Felix Weigel, Klaus U. Schulz, Levin Brunner, Edua...
Web pages are usually highly structured documents. In some documents, content with different functionality is laid out in blocks, some merely supporting the main discourse. In ot...
We study the problem of answering ambiguous web queries in a setting where there exists a taxonomy of information, and that both queries and documents may belong to more than one ...
Rakesh Agrawal, Sreenivas Gollapudi, Alan Halverso...
This paper provides an overview of experiments carried out at the TREC 2004 Terabyte Track using the Indri search engine. Indri is an efficient, effective distributed search engin...
Donald Metzler, Trevor Strohman, Howard R. Turtle,...
Large collections of documents containing various types of multimedia, are made available to the WWW. Unfortunately, due to the un-structuredness of Internet environments it is ha...