In this paper, we propose a new approach to clustering e-commerce search engines (ESEs) on the Web. Our approach utilizes the features available on the interface page of each ESE,...
Yiyao Lu, Hai He, Qian Peng, Weiyi Meng, Clement T...
Recent work on incremental crawling has enabled the indexed document collection of a search engine to be more synchronized with the changing World Wide Web. However, this synchron...
Lipyeow Lim, Min Wang, Sriram Padmanabhan, Jeffrey...
Nowadays the Web represents a growing collection of an enormous amount of contents where the need for better ways to find and organize the available data is becoming a fundamental...
Francesco Ronzano, Andrea Marchetti, Maurizio Tesc...
State-of-the-art Web search engines are inherently limited in their abilities to search information in Deep Web beyond portals. This paper discusses how Web services and Semantic-...
Duplication of Web pages greatly hurts the perceived relevance of a search engine. Existing methods for detecting duplicated Web pages can be classified into two categories, i.e. o...