Today web search engines provide the easiest way to reach information on the web. In this scenario, more than 95% of Indian language content on the web is not searchable due to mu...
Web search behaviour studies, including eye-tracking studies of search result examination, have resulted in numerous insights to improve search result quality and presentation. Ye...
This paper expands on a 1997 study of the amount and distribution of near-duplicate pages on the World Wide Web. We downloaded a set of 150 million web pages on a weekly basis ove...
In this poster we introduce ProThes, a pilot meta-search engine (MSE) for a specific application domain. ProThes combines three approaches: meta-search, graphical user interface (...
One of the most important steps in web crawling is determining the starting points, or seed selection. This paper identifies and explores the problem of seed selection in webscal...