We present a new algorithm for finding large, dense subgraphs in massive graphs. Our algorithm is based on a recursive application of fingerprinting via shingles, and is extreme...
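As a rough illustration of the fingerprinting via shingles that the abstract above mentions, here is a minimal Python sketch of min-wise shingle sketches. It is not the paper's algorithm: it shingles a token sequence rather than a graph, it omits the recursive application, and the function names, the SHA-1 based fingerprint, and the parameters `k` and `num_hashes` are all assumptions made for this example.

```python
import hashlib


def shingles(tokens, k):
    """Return the set of contiguous k-token windows (shingles)."""
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def fingerprint(shingle, seed):
    """Map a shingle to a 64-bit integer; the seed selects one hash function.

    Assumption: SHA-1 stands in for whatever fingerprint function a real
    system would use.
    """
    data = f"{seed}:{' '.join(shingle)}".encode("utf-8")
    return int.from_bytes(hashlib.sha1(data).digest()[:8], "big")


def sketch(tokens, k=3, num_hashes=16):
    """Keep the minimum fingerprint under each of num_hashes hash functions."""
    sh = shingles(tokens, k)
    if not sh:
        return ()
    return tuple(min(fingerprint(s, seed) for s in sh)
                 for seed in range(num_hashes))


def estimated_resemblance(a, b):
    """Fraction of sketch positions that agree (estimates Jaccard similarity)."""
    if not a or not b or len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)
```

Two inputs with many shingles in common agree in most sketch positions, so comparing short sketches approximates comparing the full shingle sets.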
This paper expands on a 1997 study of the amount and distribution of near-duplicate pages on the World Wide Web. We downloaded a set of 150 million web pages on a weekly basis ove...
While zoomable user interfaces can improve the usability of applications by easing data access, a drawback is that some users tend to become lost after they have zoomed in. Previo...
The PageRank algorithm demonstrates the significance of computing a document ranking by general importance or authority in Web information retrieval. However, doi...
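For context, the basic PageRank computation this abstract refers to can be sketched as a power iteration over a link graph. This is a minimal sketch of the standard formulation, not the method proposed in the paper; the function name, the adjacency-dict input format, the damping factor of 0.85, and the uniform redistribution of rank from pages without out-links are assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=100):
    """Power-iteration PageRank.

    links: dict mapping each page to a list of pages it links to
           (hypothetical input format for this sketch).
    Returns a dict mapping each page to its rank; ranks sum to 1.
    """
    nodes = sorted(set(links) | {v for outs in links.values() for v in outs})
    n = len(nodes)
    if n == 0:
        return {}
    rank = {u: 1.0 / n for u in nodes}

    for _ in range(max_iter):
        # Rank held by pages with no out-links is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not links.get(u))
        new = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}

        # Each page passes an equal share of its rank along every out-link.
        for u, outs in links.items():
            if outs:
                share = damping * rank[u] / len(outs)
                for v in outs:
                    new[v] += share

        delta = sum(abs(new[u] - rank[u]) for u in nodes)
        rank = new
        if delta < tol:
            break
    return rank
```

Each iteration touches every link once, which hints at why the computation becomes costly on Web-scale graphs.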
Geographic Information Systems (GIS) are increasingly managing very large sets of data and hence a centralized data repository may not always provide the most scalable solution. H...