Sciweavers

113 search results - page 3 / 23
WWW
2007
ACM
14 years 5 months ago
Detecting near-duplicates for web crawling
Near-duplicate web documents are abundant. Two such documents differ from each other in a very small portion that displays advertisements, for example. Such differences are irrele...
Gurmeet Singh Manku, Arvind Jain, Anish Das Sarma
CIKM
2008
Springer
13 years 6 months ago
A language for manipulating clustered web documents results
We propose a novel conception language for exploring the results retrieved by several internet search services (like search engines) that cluster retrieved documents. The goal is ...
Gloria Bordogna, Alessandro Campi, Giuseppe Psaila...
CLUSTER
2001
IEEE
13 years 8 months ago
Approximation Algorithms for Data Distribution with Load Balancing of Web Servers
Given the increasing traffic on the World Wide Web (Web), it is difficult for a single popular Web server to handle the demand from its many clients. By clustering a group of Web ...
Li-Chuan Chen, Hyeong-Ah Choi
ICDCS
2005
IEEE
13 years 10 months ago
Using a Layered Markov Model for Distributed Web Ranking Computation
The link structure of the Web graph is used in algorithms such as Kleinberg’s HITS and Google’s PageRank to assign authoritative weights to Web pages and thus rank them. Both ...
Jie Wu, Karl Aberer
ICDCS
1998
IEEE
13 years 8 months ago
A Framework for Consistent, Replicated Web Objects
Despite the extensive use of caching techniques, the Web is overloaded. While the caching techniques currently used help some, it would be better to use different caching and repli...
Anne-Marie Kermarrec, Ihor Kuz, Maarten van Steen,...