Community detection from multi-layer graphs built on relationships and attributes

Community detection, also known as graph clustering, has been extensively studied in the literature. The goal of community detection is to partition the vertices of a graph into densely connected components, called communities. In recent applications, however, an entity is associated with multiple aspects of relationships and multiple attributes from multiple data sources, which brings new challenges to community detection. These multimodal data sources can be naturally modeled as a multi-layer graph composed of multiple interdependent layers and mapping functions, where each layer represents an intra-relationship and each mapping function represents an inter-relationship between two layers. Great efforts have therefore been made to tackle the problem of community detection in multi-layer graphs. In this dissertation, we propose novel frameworks for community detection from multiple data sources based on the multi-layer graph model. Among the various combinations of multiple data sources, we deal with two representative cases: (i) multiple aspects of relationships and (ii) multiple attributes. The first case deals with multiple social graphs, which consist of a set of users involved in different types of relationships. The second case deals with attributed graphs, which consist of a set of users who are involved in social relationships and are also associated with multiple attributes. Particularly, we focus on geosocial graphs, which have attracted much attention thanks to the widespread use of location-aware mobile devices. Since the locations accessed by users can be regarded as their geographic preferences or interests, a geosocial graph is a representative case of an attributed graph. In the first part of this dissertation, we propose a novel framework for differential flattening, which facilitates the analysis of pillar multi-layer graphs, and apply this framework to community detection.
Differential flattening merges multiple graphs into a single graph such that the graph structure with the maximum clustering coefficient is obtained from the single graph. It has two distinct features compared with existing approaches. First, the multiple layers are handled independently of any specific community detection algorithm, whereas previous approaches rely on a specific algorithm. Thus, any algorithm for a single graph becomes applicable to multi-layer graphs. Second, the contribution of each layer to the single graph is determined automatically so as to maximize the clustering coefficient. Since differential flattening is formulated as an optimization problem, the optimal solution is easily obtained by well-known algorithms such as interior point methods. Extensive experiments were conducted using the LFR benchmark networks as well as the DBLP, 20 Newsgroups, and MIT Reality Mining networks. The results show that differential flattening leads to the discovery of higher-quality communities than baseline approaches and state-of-the-art algorithms. In the second part of this dissertation, we propose a novel framework for geosocial co-clustering, which facilitates the analysis of attributed graphs with a focus on geosocial graphs. Geosocial co-clustering is formulated as non-negative matrix tri-factorization with dual regularizers. The existing matrix tri-factorization algorithms, however, suffer from significant computational overhead when handling the large-scale data sets found in many real-world applications. Our proposed framework takes advantage of the intrinsic properties of geosocial networks to reduce the computational overhead without compromising accuracy. First, the numbers of users and locations are effectively reduced through the coarsening step of our framework. Then, we decompose the matrix tri-factorization of a single large matrix into a series of smaller sub-matrix tri-factorizations.
To this end, the optimal split of the entire matrix is determined by crossing minimization and optimization of the minimum description length. Experiments conducted using four real-world geosocial networks show that our framework for geosocial co-clustering reduces the elapsed time by a factor of 19 to 69 while achieving an accuracy of up to 95.2% compared with the state-of-the-art co-clustering algorithm. The strength of this dissertation lies in its coverage of a wide variety of community detection settings involving multiple heterogeneous data sources. Since the two cases addressed are expected to cover a significant proportion of the cases with multiple data sources, we believe that this work will enhance the quality of community detection in social networks from multiple data sources.
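To make the differential flattening idea in the abstract concrete, the following minimal Python sketch merges two layers into one weighted graph and picks the layer weight that maximizes a clustering measure. Everything in it is an assumption made for illustration and is not taken from the dissertation: the function names (`flatten`, `weighted_transitivity`, `best_two_layer_weight`), the particular weighted generalization of transitivity, and the coarse grid search, which stands in for the interior point methods mentioned in the abstract.

```python
def flatten(layers, weights):
    """Merge adjacency matrices into one weighted matrix (weighted sum)."""
    n = len(layers[0])
    return [
        [sum(w * layer[i][j] for w, layer in zip(weights, layers))
         for j in range(n)]
        for i in range(n)
    ]


def weighted_transitivity(adj):
    """Assumed weighted analogue of global transitivity.

    For a 0/1 matrix this equals 3 * triangles / connected triples.
    """
    n = len(adj)
    closed = 0.0   # weight of triples whose end points are also linked
    triples = 0.0  # weight of all triples centred on some vertex i
    for i in range(n):
        for j in range(n):
            if j == i:
                continue
            for k in range(n):
                if k == i or k == j:
                    continue
                triples += adj[i][j] * adj[i][k]
                closed += adj[i][j] * adj[i][k] * adj[j][k]
    return closed / triples if triples > 0 else 0.0


def best_two_layer_weight(layer_a, layer_b, steps=10):
    """Grid-search w in [0, 1] for the flattening w * A + (1 - w) * B."""
    best_w, best_score = None, -1.0
    for s in range(steps + 1):
        w = s / steps
        score = weighted_transitivity(flatten([layer_a, layer_b], [w, 1 - w]))
        if score > best_score:
            best_w, best_score = w, score
    return best_w, best_score


# Hypothetical toy input: layer A is a triangle on vertices 0-2,
# layer B is a star centred on vertex 3.
triangle = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 0], [0, 0, 0, 0]]
star = [[0, 0, 0, 1], [0, 0, 0, 1], [0, 0, 0, 1], [1, 1, 1, 0]]
print(best_two_layer_weight(triangle, star))
```

In this toy input the triangle layer alone gives the highest score, so the search places all of the weight on it. Because the flattened result is an ordinary single graph, any single-graph community detection algorithm could then be run on it, which is the algorithm independence the abstract describes.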
Advisors
Lee, Jae Gil
Description
Korea Advanced Institute of Science and Technology (KAIST): Graduate School of Knowledge Service Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2017
Identifier
325007
Language
eng
Description

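Doctoral dissertation (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): Graduate School of Knowledge Service Engineering, 2017.8, [v, 83 p.]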

Keywords

community detection; multiple relationships; multiple attributes; geosocial graphs; multi-layer graphs; differential flattening; non-negative matrix tri-factorization

URI
http://hdl.handle.net/10203/242108
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=718897&flag=dissertation
Appears in Collection
KSE-Theses_Ph.D. (Doctoral Theses)
Files in This Item
There are no files associated with this item.
