Sciweavers

28 search results - page 2 / 6
» A large-scale study of the evolution of web pages
LAWEB
2003
IEEE
13 years 11 months ago
On the Evolution of Clusters of Near-Duplicate Web Pages
This paper expands on a 1997 study of the amount and distribution of near-duplicate pages on the World Wide Web. We downloaded a set of 150 million web pages on a weekly basis ove...
Dennis Fetterly, Mark Manasse, Marc Najork
JCDL
2011
ACM
12 years 8 months ago
Archiving the web using page changes patterns: a case study
A pattern is a model or template used to summarize and describe the behavior (or trend) of data that generally exhibits recurrent events. Patterns have received a considerab...
Myriam Ben Saad, Stéphane Gançarski
WWW
2005
ACM
14 years 6 months ago
The volume and evolution of web page templates
Web pages contain a combination of unique content and template material, which is present across multiple pages and used primarily for formatting, navigation, and branding. We stu...
David Gibson, Kunal Punera, Andrew Tomkins
VLDB
2000
ACM
13 years 9 months ago
The Evolution of the Web and Implications for an Incremental Crawler
In this paper we study how to build an effective incremental crawler. The crawler selectively and incrementally updates its index and/or local collection of web pages, instead of ...
Junghoo Cho, Hector Garcia-Molina
WWW
2004
ACM
14 years 6 months ago
Impact of search engines on page popularity
Recent studies show that a majority of Web page accesses are referred by search engines. In this paper we study the widespread use of Web search engines and its impact on the ecol...
Junghoo Cho, Sourashis Roy