HEigen: Spectral Analysis for Billion-Scale Graphs

Given a graph with billions of nodes and edges, how can we find patterns and anomalies? Are there nodes that participate in too many or too few triangles? Are there close-knit near-cliques? These questions are expensive to answer unless we have the first several eigenvalues and eigenvectors of the graph adjacency matrix. However, eigensolvers suffer from subtle problems (e.g., convergence) for large sparse matrices, let alone for billion-scale ones. We address this problem with the proposed HEigen algorithm, which we carefully design to be accurate, efficient, and able to run on the highly scalable MapReduce (Hadoop) environment. This enables HEigen to handle matrices more than 1,000× larger than those which can be analyzed by existing algorithms. We implement HEigen and run it on the M45 cluster, one of the top 50 supercomputers in the world. We report important discoveries about near-cliques and triangles on several real-world graphs, including a snapshot of the Twitter social network (56 Gb, 2 billion edges) and the
Publisher
IEEE COMPUTER SOC
Issue Date
2014-02
Language
English
Article Type
Article
Citation

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, v.26, no.2, pp.350 - 362

ISSN
1041-4347
DOI
10.1109/TKDE.2012.244
URI
http://hdl.handle.net/10203/187068
Appears in Collection
CS-Journal Papers