Combating Web spam is one of the greatest challenges for Web search engines. State-of-the-art anti-spam techniques focus mainly on detecting varieties of spam strategies, such as ...
Chao Wei, Yiqun Liu, Min Zhang, Shaoping Ma, Liyun...
If the model of free-form queries, which has proved successful for HTML based search on the Web, is made available for Grid services, it will serve as a powerful tool for scientis...
The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
In this paper we propose a hierarchical clustering engine, called SnakeT, that is able to organize on-the-fly the search results drawn from 16 commodity search engines into a hier...
In this work we propose a method that retrieves a list of related queries given an initial input query. The related queries are based on the query log of previously issued queries...