Counting triangles and the curse of the last reducer

The clustering coefficient of a node in a social network is a fundamental measure that quantifies how tightly knit the community is around the node. Its computation can be reduced to counting the number of triangles incident on that node in the network. When the graph is too big to fit into memory, this is a non-trivial task, and previous researchers showed how to estimate the clustering coefficient in this scenario. A different avenue of research is to perform the computation in parallel, spreading it across many machines. In recent years MapReduce has emerged as a de facto programming paradigm for parallel computation on massive data sets. The main focus of this work is to give MapReduce algorithms for counting triangles, which we use to compute clustering coefficients. Our contributions are twofold. First, we describe a sequential triangle counting algorithm and show how to adapt it to the MapReduce setting. This algorithm achieves a factor of 10-100 speedup over...
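To make the reduction from clustering coefficients to triangle counting concrete, here is a minimal in-memory sketch in Python. It is not the algorithm from the paper (the abstract does not spell that out); it simply checks every pair of neighbors of each node, and it assumes an undirected simple graph that fits in memory. The function names and the sample edge list are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops; they cannot form triangles
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def triangles_per_node(adj):
    """Count the triangles incident on each node by testing every neighbor pair."""
    counts = {node: 0 for node in adj}
    for node, neighbors in adj.items():
        for a, b in combinations(neighbors, 2):
            if b in adj[a]:  # the pair (a, b) closes a triangle with node
                counts[node] += 1
    return counts


def clustering_coefficients(adj):
    """Clustering coefficient = triangles at the node / number of neighbor pairs."""
    tri = triangles_per_node(adj)
    coeffs = {}
    for node, neighbors in adj.items():
        d = len(neighbors)
        pairs = d * (d - 1) // 2
        # Assumption: nodes with fewer than two neighbors get coefficient 0.0.
        coeffs[node] = tri[node] / pairs if pairs else 0.0
    return coeffs


# Hypothetical toy graph: one triangle (1, 2, 3) plus a pendant edge (3, 4).
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
adj = build_adjacency(edges)
triangle_counts = triangles_per_node(adj)
coefficients = clustering_coefficients(adj)
```

In this sketch the work at a node grows with the square of its degree, so a single very high-degree node can dominate the running time. That imbalance is the kind of cost a parallel formulation has to manage when the per-node work is spread across machines.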
Siddharth Suri, Sergei Vassilvitskii
Added 15 May 2011
Updated 15 May 2011
Type Journal
Year 2011
Where WWW