Duplication of Web pages greatly hurts the perceived relevance of a search engine. Existing methods for detecting duplicated Web pages can be classified into two categories, i.e. o...
Any given Web search engine may provide higher quality results than others for certain queries. Therefore, it is in users' best interest to utilize multiple search engines. I...
Ryen W. White, Matthew Richardson, Mikhail Bilenko...
Large-scale Parallel Web Search Engines (WSEs) need to adopt a strategy for partitioning the inverted index among a set of parallel server nodes. In this paper we are interested ...
Querying XML data is a well-explored topic with powerful database-style query languages such as XPath and XQuery set to become W3C standards. An equally compelling paradigm for que...
Sihem Amer-Yahia, Laks V. S. Lakshmanan, Shashank ...
In this paper we propose a novel document retrieval model in which text queries are augmented with multi-dimensional taxonomy restrictions. These restrictions may be relaxed at a ...
Marcus Fontoura, Vanja Josifovski, Ravi Kumar, Chr...