PTE: Enumerating trillion triangles on distributed systems

Cited 31 time in webofscience Cited 0 time in scopus
  • Hit : 279
  • Download : 0
How can we enumerate triangles from an enormous graph with billions of vertices and edges? Triangle enumeration is an important task for graph data analysis with many applications including identifying suspicious users in social networks, detecting web spams, finding communities, etc. However, recent networks are so large that most of the previous algorithms fail to process them. Recently, several MapReduce algorithms have been proposed to address such large networks; however, they suffer from the massive shufled data resulting in a very long processing time. In this paper, we propose PTE (Pre-partitioned Triangle Enumeration), a new distributed algorithm for enumerating triangles in enormous graphs by resolving the structural inefficiency of the previous MapReduce algorithms. PTE enumerates trillions of triangles in a billion scale graph by decreasing three factors: the amount of shuffled data, total work, and network read. Experimental results show that PTE provides up to 47× faster performance than recent distributed algorithms on real world graphs, and succeeds in enumerating more than 3 trillion triangles on the ClueWeb12 graph with 6.3 billion vertices and 72 billion edges, which any previous triangle computation algorithm fail to process.
Publisher
ACM Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD)
Issue Date
2016-08-17
Language
English
Citation

22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2016, pp.1115 - 1124

DOI
10.1145/2939672.2939757
URI
http://hdl.handle.net/10203/224442
Appears in Collection
CS-Conference Papers(학술회의논문)
Files in This Item
There are no files associated with this item.
This item is cited by other documents in WoS
⊙ Detail Information in WoSⓡ Click to see webofscience_button
⊙ Cited 31 items in WoS Click to see citing articles in records_button

qr_code

  • mendeley

    citeulike


rss_1.0 rss_2.0 atom_1.0