I/O Efficient Structural Clustering and Maintenance of Clusters for Large-scale Graphs

In recent years, the size of graph data has increased significantly, but most existing graph clustering algorithms do not consider the case where main memory is too small to hold the graph. Exploring the entire graph for clustering then causes many random disk accesses to data that are not loaded into memory, resulting in excessive disk I/O and thrashing. To address this problem, we propose an I/O-efficient algorithm for structural clustering of a graph, called pm-SCAN. In the proposed method, if memory is insufficient, the input graph is partitioned into several subgraphs, each smaller than memory, and clustering is first performed on each subgraph. The clusters from the subgraphs are then merged based on the connectivity between clusters, so that global results are obtained with respect to the original input graph. pm-SCAN not only scales to very large graphs, i.e., when available memory is severely short, but also produces the same result as the original structural clustering algorithm SCAN. We also propose a cluster maintenance method for large-scale dynamic graphs that change over time. Instead of reclustering the whole graph, we first identify only the small set of nodes whose structural connectivity may change under a given update operation, then access only those nodes on disk and update their clusters, which reduces maintenance costs. This dynamic graph handling mechanism shows significant performance improvement over both the existing method and the baseline that performs clustering from scratch. © 2020 Elsevier Ltd
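The partition-then-merge idea in the abstract can be illustrated with a minimal in-memory sketch. This is not the authors' pm-SCAN implementation: it assumes the standard SCAN definitions (structural similarity over closed neighborhoods, a similarity threshold `eps`, and a core threshold `mu`), it keeps the whole adjacency map in memory, and it ignores disk I/O entirely. The names `structural_similarity`, `find`, and `cluster_by_partitions` are hypothetical and chosen only for this example.

```python
from math import sqrt


def structural_similarity(adj, u, v):
    """Shared closed neighbors divided by the geometric mean of the
    closed-neighborhood sizes (the usual SCAN similarity)."""
    nu = adj[u] | {u}
    nv = adj[v] | {v}
    return len(nu & nv) / sqrt(len(nu) * len(nv))


def find(parent, x):
    """Union-find lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def cluster_by_partitions(adj, eps, mu, partitions):
    """Cluster each partition locally, then merge across partition borders.

    Assumption for this sketch: similarities are computed from the full
    adjacency map, so only the grouping step is partitioned.
    """
    # Epsilon-neighborhood of every node (a node is always similar to itself).
    eps_nbrs = {
        u: {v for v in adj[u] | {u} if structural_similarity(adj, u, v) >= eps}
        for u in adj
    }
    cores = {u for u in adj if len(eps_nbrs[u]) >= mu}
    part_of = {u: i for i, part in enumerate(partitions) for u in part}
    parent = {u: u for u in adj}

    def union_core_edges(same_partition):
        for u in cores:
            for v in eps_nbrs[u] & cores:
                if (part_of[u] == part_of[v]) == same_partition:
                    parent[find(parent, u)] = find(parent, v)

    union_core_edges(True)   # phase 1: local clusters inside each partition
    union_core_edges(False)  # phase 2: merge clusters linked across partitions

    clusters = {}
    for u in cores:
        # Each core contributes itself and its epsilon-neighbors (border nodes).
        clusters.setdefault(find(parent, u), set()).update(eps_nbrs[u])
    return sorted(sorted(c) for c in clusters.values())
```

For two 4-cliques on nodes 0 to 3 and 4 to 7 joined by the single edge 3-4, calling `cluster_by_partitions` with `eps=0.7`, `mu=3`, and the partitions `{0, 1}`, `{2, 3, 4, 5}`, `{6, 7}` yields the same two clusters as a single partition holding all eight nodes, which mirrors the abstract's claim that the merged result matches unpartitioned SCAN.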
Publisher
PERGAMON-ELSEVIER SCIENCE LTD
Issue Date
2021-04
Language
English
Article Type
Article
Citation

EXPERT SYSTEMS WITH APPLICATIONS, v.168, pp.114221

ISSN
0957-4174
DOI
10.1016/j.eswa.2020.114221
URI
http://hdl.handle.net/10203/280965
Appears in Collection
CS-Journal Papers (Journal Papers)