A Faster Parallel Algorithm and Efficient Multithreaded Implementations for Evaluating Betweenness Centrality on Massive Datasets Page: 1 of 21
This report is part of the collection entitled: Office of Scientific & Technical Information Technical Reports and was provided to UNT Digital Library by the UNT Libraries Government Documents Department.
Extracted Text
The following text was automatically extracted from the image on this page using optical character recognition software:
A Faster Parallel Algorithm and Efficient
Multithreaded Implementations for Evaluating
Betweenness Centrality on Massive Datasets
Kamesh Madduri* David Edigert Karl Jiangt
KMadduri@lbl.gov dediger@gatech.edu k.jiang@gatech.edu
David A. Badert Daniel Chavarria-Mirandat
bader@cc.gatech.edu daniel.chavarria@pnl.gov
Abstract
We present a new lock-free parallel algorithm for computing betweenness central-
ity of massive small-world networks. With minor changes to the data structures, our
algorithm also achieves better spatial cache locality compared to previous approaches.
Betweenness centrality is a key algorithm kernel in HPCS SSCA#2, a benchmark ex-
tensively used to evaluate the performance of emerging high-performance computing
architectures for graph-theoretic computations. We design optimized implementations
of betweenness centrality and the SSCA#2 benchmark for two hardware multithreaded
systems: a Cray XMT system with the Threadstorm processor, and a single-socket Sun
multicore server with the UltraSPARC T2 processor. For a small-world network of 134
million vertices and 1.073 billion edges, the 16-processor XMT system and the 8-core
Sun Fire T5120 server achieve TEPS scores (an algorithmic performance count for the
SSCA#2 benchmark) of 160 million and 90 million respectively, which corresponds to
more than a 2 x performance improvement over the previous parallel implementations.
To better characterize the performance of these multithreaded systems, we correlate
the SSCA#2 performance results with data from the memory-intensive STREAM and
RandomAccess benchmarks. Finally, we demonstrate the applicability of our implemen-
tation to analyze massive real-world datasets by computing approximate betweenness
centrality for a large-scale IMDb movie-actor network.
1 Introduction
Graphs are a fundamental abstraction for representing data sets, and graph-theoretic algo-
rithms and analysis routines are pervasive in several application domains today. Computa-
tions involving sparse real-world graphs such as socio-economic interactions, the world-wide
*Computational Research Division, Lawrence Berkeley National Laboratory.
tCollege of Computing, Georgia Institute of Technology.
$High Performance Computing, Pacific Northwest National Laboratory.1
Upcoming Pages
Here’s what’s next.
Search Inside
This report can be searched. Note: Results may vary based on the legibility of text within the document.
Tools / Downloads
Get a copy of this page or view the extracted text.
Citing and Sharing
Basic information for referencing this web page. We also provide extended guidance on usage rights, references, copying or embedding.
Reference the current page of this Report.
Madduri, Kamesh; Ediger, David; Jiang, Karl; Bader, David A. & Chavarria-Miranda, Daniel. A Faster Parallel Algorithm and Efficient Multithreaded Implementations for Evaluating Betweenness Centrality on Massive Datasets, report, February 15, 2009; Berkeley, California. (https://digital.library.unt.edu/ark:/67531/metadc928449/m1/1/: accessed April 24, 2024), University of North Texas Libraries, UNT Digital Library, https://digital.library.unt.edu; crediting UNT Libraries Government Documents Department.