Sciweavers

611 search results - page 70 / 123
» Random web crawls
SIGIR
2006
ACM
15 years 3 months ago
Finding near-duplicate web pages: a large-scale evaluation of algorithms
Broder et al.’s [3] shingling algorithm and Charikar’s [4] random projection based approach are considered “state-of-the-art” algorithms for finding near-duplicate web pag...
Monika Rauch Henzinger
IM
2006
14 years 9 months ago
Protean Graphs
We propose a new random model of web graphs in which the degree of a vertex depends on its age. We characterize the degree sequence of this model and study its behaviour near the c...
Tomasz Luczak, Pawel Pralat
IPPS
2003
IEEE
15 years 3 months ago
SCIMPS: An Integrated Approach to Distributed Processing in Sensor Webs
This paper presents a new tightly coupled computation/communication design developed to support the unique operational requirements of sensor webs. A critical challenge of sensor...
David L. Andrews, Joe Evans, Venumadhav Mangipudi,...
DKE
2006
14 years 9 months ago
Sampling, information extraction and summarisation of Hidden Web databases
Hidden Web databases maintain a collection of specialised documents, which are dynamically generated in response to users' queries. The majority of these documents are genera...
Yih-Ling Hedley, Muhammad Younas, Anne E. James, M...
COMSIS
2010
14 years 7 months ago
A content-based dynamic load-balancing algorithm for heterogeneous web server cluster
Taking into account the variety of Web requests and the heterogeneity of Web servers, the paper presents a content-based load-balancing algorithm. The mechanism of this algorithm is that ...
Lin Zhang, Xiao Ping Li, Su Yuan