An approach for reorganizing a Web site based on user access patterns is proposed. The Web server's log files and the Web pages on the site are first preprocessed to obtain the ac...
The presence of replicas or near-replicas of documents is very common on the Web. Documents may be replicated completely or partially for different reasons (versions, mirrors, etc...
Ernesto Di Iorio, Michelangelo Diligenti, Marco Go...
Web search systems exist to retrieve necessary information from the WWW. However, they are not accurate enough. We therefore propose a technique for using Web Page Grouping tog...
Maintaining currency of search engine indices by exhaustive crawling is rapidly becoming impossible due to the increasing size and dynamic content of the web. Focused crawlers aim...
Michelangelo Diligenti, Frans Coetzee, Steve Lawre...
Many applications now aim to detect and discover changes on the web to help users understand page updates and, more generally, web dynamics. Web archiv...