Sciweavers

611 search results - page 70 / 123
» Random web crawls
SIGIR
2006
ACM
15 years 5 months ago
Finding near-duplicate web pages: a large-scale evaluation of algorithms
Broder et al.’s [3] shingling algorithm and Charikar’s [4] random projection based approach are considered “state-of-the-art” algorithms for finding near-duplicate web pag...
Monika Rauch Henzinger
IM
2006
14 years 12 months ago
Protean Graphs
We propose a new random model of web graphs in which the degree of a vertex depends on its age. We characterize the degree sequence of this model and study its behaviour near the c...
Tomasz Luczak, Pawel Pralat
IPPS
2003
IEEE
15 years 5 months ago
SCIMPS: An Integrated Approach to Distributed Processing in Sensor Webs
This paper presents a new tightly coupled computation/communication design developed to support the unique operational requirements of sensor webs. A critical challenge of sensor...
David L. Andrews, Joe Evans, Venumadhav Mangipudi,...
DKE
2006
14 years 12 months ago
Sampling, information extraction and summarisation of Hidden Web databases
Hidden Web databases maintain a collection of specialised documents, which are dynamically generated in response to users' queries. The majority of these documents are genera...
Yih-Ling Hedley, Muhammad Younas, Anne E. James, M...
COMSIS
2010
14 years 9 months ago
A content-based dynamic load-balancing algorithm for heterogeneous web server cluster
According to the different requests of Web and the heterogeneity of Web server, the paper presents a content-based load-balancing algorithm. The mechanism of this algorithm is that ...
Lin Zhang, Xiao Ping Li, Su Yuan