Image clustering is useful in many retrieval and classification applications. The main goal of image clustering is to partition a given dataset into salient clusters such that the...
Traditional research on spelling correction in natural language processing and information retrieval literature mostly relies on pre-defined lexicons to detect spelling errors. Bu...
In this paper, we study search bot traffic from search engine query logs at a large scale. Although bots that generate search traffic aggressively can be easily detected, a large ...
Given a terabyte click log, can we build an efficient and effective click model? It is commonly believed that web search click logs are a gold mine for search business, because th...
Anitha Kannan, Chao Liu, Christos Faloutsos, ...
In order to increase retrieval precision, some new search engines provide manually verified answers to Frequently Asked Queries (FAQs). An underlying task is the identification of...