Summarizing and Understanding Large Graphs

How can we succinctly describe a million-node graph with a few simple sentences? Given a large graph, how can we find its most "important" structures, so that we can summarize it and easily visualize it? How can we measure the "importance" of a set of discovered subgraphs in a large graph? Starting with the observation that real graphs often consist of stars, bipartite cores, cliques, and chains, our main idea is to find the most succinct description of a graph in these "vocabulary" terms. To this end, we first mine candidate subgraphs using one or more graph partitioning algorithms. Next, we identify the optimal summarization using the minimum description length (MDL) principle, picking only those subgraphs from the candidates that together yield the best lossless compression of the graph or, equivalently, that most succinctly describe its adjacency matrix. Our contributions are threefold: (i) formulation: we provide a principled encoding scheme to identify the vocabulary type of a given subgraph for six structure types prevalent in real-world graphs, (ii) algorithm: we develop VoG, an efficient method to approximate the MDL-optimal summary of a given graph in terms of local graph structures, and (iii) applicability: we report an extensive empirical evaluation on multimillion-edge real graphs, including Flickr and the Notre Dame web graph.
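To make the MDL selection step concrete, the following is a minimal toy sketch, not the encoding or the search procedure used by VoG. It assumes only two structure types (clique and star), a simplified bit cost (one type bit plus log2(n) bits per listed node, and 2*log2(n) bits per edge the summary gets wrong), and a plain greedy loop. The names edges_of, description_length, and greedy_summary are hypothetical and chosen for illustration only.

```python
from itertools import combinations
from math import log2


def edges_of(kind, nodes):
    """Edges implied by a structure (toy vocabulary: clique or star)."""
    if kind == "clique":
        return {frozenset(p) for p in combinations(nodes, 2)}
    if kind == "star":
        hub, *spokes = nodes  # assumption: the first node is the hub
        return {frozenset((hub, s)) for s in spokes}
    raise ValueError(f"unknown structure type: {kind}")


def description_length(graph_edges, n, summary):
    """Toy cost in bits: model cost plus cost of listing every wrong edge."""
    bits_per_node = log2(n)
    model_bits = 0.0
    model_edges = set()
    for kind, nodes in summary:
        model_bits += 1 + len(nodes) * bits_per_node  # 1 bit picks the type
        model_edges |= edges_of(kind, nodes)
    # Error = edges the summary misses plus edges it wrongly asserts.
    error = graph_edges ^ model_edges
    return model_bits + len(error) * 2 * bits_per_node


def greedy_summary(graph_edges, n, candidates):
    """Add the candidate that lowers the total cost most, until none helps."""
    summary, remaining = [], list(candidates)
    best = description_length(graph_edges, n, summary)
    while remaining:
        cost, i = min(
            (description_length(graph_edges, n, summary + [c]), i)
            for i, c in enumerate(remaining)
        )
        if cost >= best:
            break
        best = cost
        summary.append(remaining.pop(i))
    return summary, best


# Toy graph on 16 nodes: a 5-clique, a star with hub 5, and one stray edge.
graph = edges_of("clique", (0, 1, 2, 3, 4)) | edges_of("star", (5, 6, 7, 8, 9))
graph.add(frozenset((10, 11)))
candidates = [
    ("star", (5, 6, 7, 8, 9)),
    ("clique", (10, 11, 12, 13)),  # poor fit: asserts five edges that do not exist
    ("clique", (0, 1, 2, 3, 4)),
]
summary, bits = greedy_summary(graph, 16, candidates)
print(summary, bits)
```

In this toy setting the two well-fitting candidates are kept and the poorly fitting clique is rejected, because adding it would cost more bits in wrongly asserted edges than it saves. This mirrors the idea in the abstract that only candidates improving the lossless compression of the graph enter the summary.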
Publisher
Wiley Subscription Services
Issue Date
2015-06
Language
English
Article Type
Article
Keywords

COMPRESSION

Citation

STATISTICAL ANALYSIS AND DATA MINING, v.8, no.3, pp.183 - 202

ISSN
1932-1864
DOI
10.1002/sam.11267
URI
http://hdl.handle.net/10203/204025
Appears in Collection
CS-Journal Papers(저널논문)