Sciweavers

70 search results - page 14 / 14
EPEW
2005
Springer
13 years 11 months ago
Hypergraph Partitioning for Faster Parallel PageRank Computation
The PageRank algorithm is used by search engines such as Google to order web pages. It uses an iterative numerical method to compute the maximal eigenvector of a transition matrix ...
Jeremy T. Bradley, Douglas V. de Jager, William J....
WSDM
2010
ACM
14 years 12 days ago
Learning URL patterns for webpage de-duplication
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
WWW
2010
ACM
14 years 16 days ago
Mind the data skew: distributed inferencing by speeddating in elastic regions
Semantic Web data exhibits very skewed frequency distributions among terms. Efficient large-scale distributed reasoning methods should maintain load-balance in the face of such hi...
Spyros Kotoulas, Eyal Oren, Frank van Harmelen
PAKDD
2011
ACM
12 years 8 months ago
Spectral Analysis for Billion-Scale Graphs: Discoveries and Implementation
Given a graph with billions of nodes and edges, how can we find patterns and anomalies? Are there nodes that participate in too many or too few triangles? Are there clos...
U. Kang, Brendan Meeder, Christos Faloutsos
KDD
2009
ACM
14 years 6 months ago
Meme-tracking and the dynamics of the news cycle
Tracking new topics, ideas, and "memes" across the Web has been an issue of considerable interest. Recent work has developed methods for tracking topic shifts over long ...
Jure Leskovec, Lars Backstrom, Jon M. Kleinberg