Efficient and accurate eigen-decomposition of large-scale PSD matrices via sample subspace compression

The Nyström method is a sampling-based method for spectral decomposition of positive semi-definite (PSD) matrices, and is widely used in kernel-based machine learning for large-scale data sets. Since its introduction, there has been a large body of work that improves the approximation accuracy while maintaining computational efficiency. In this paper, we present novel Nyström schemes that improve both accuracy and efficiency based on a new theoretical analysis. We first prove that the One-shot Nyström Method (ONM), one of the existing Nyström methods, solves the sample-based kernel PCA problem given the sample subspace, and suggest that a subspace distance measure is important for the accuracy of Nyström methods. We then prove novel upper error bounds based on the subspace distance measure, and propose Principal Subspace Approximation (PSA) sampling, which minimizes our error bounds based on the notion of compressing sample matrices with sparse representations. By combining ONM and PSA sampling, we present our Double Nyström Method (DNM), which efficiently reduces the size of the decomposition problem in two stages. We report the results of extensive experiments that provide a detailed comparison of various sampling strategies and our PSA sampling, and show that PSA sampling is superior, in terms of both accuracy and efficiency, even to sampling strategies that use clustering algorithms. We also demonstrate that DNM is highly efficient and accurate compared to other state-of-the-art Nyström methods for large-scale data sets.

Next, we generalize DNM and present the Nested Nyström Method (NNM), a multilayer method based on a nested sequence of subsamples and multiple compressions. To compute the spectral decomposition of a PSD matrix, it compresses the sample matrices, solves a smaller-sized optimization problem, and updates the eigenspace on each layer. We prove that its upper error bound decreases as additional layers are used. Experimental results show that NNM is more accurate than DNM within the same short time.

Finally, we tackle the local triangle counting problem on graph streams by using the Nyström extension. We first derive a local triangle counting algorithm based on the Nyström method, and design MELTING-U, a memory-efficient and accurate local triangle counting algorithm on graph streams. We also propose a fast version of MELTING-U, called MELTING. By using DNM, we show that MELTING-U and MELTING are memory-efficient and more accurate than competitive algorithms on a number of real data sets.
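The methods summarized above all build on the standard single-stage Nyström approximation, which eigendecomposes a small sampled block of the PSD matrix and extends its eigenvectors to the full matrix. A minimal NumPy sketch of that baseline follows; the function name, toy matrix, and rank are illustrative, not the thesis code:

```python
import numpy as np

def nystrom(K, idx, k):
    """Rank-k Nystrom approximation K ~= U @ np.diag(s) @ U.T.

    K   : (n, n) PSD matrix
    idx : indices of the m sampled rows/columns (m << n)
    k   : target rank (k <= m)
    """
    C = K[:, idx]                          # n x m  sampled columns
    W = K[np.ix_(idx, idx)]                # m x m  intersection block
    s, V = np.linalg.eigh(W)               # eigendecompose the small block
    s, V = s[::-1][:k], V[:, ::-1][:, :k]  # keep the k largest eigenpairs
    U = (C @ V) / s                        # extend eigenvectors to all n points
    return U, s                            # U diag(s) U^T equals C W_k^+ C^T

# Toy check: for a PSD matrix of exact rank r, sampling columns whose
# intersection block has the same rank makes the approximation exact.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 5))
K = A @ A.T                                # PSD, rank 5
idx = rng.choice(50, size=10, replace=False)
U, s = nystrom(K, idx, k=5)
K_hat = (U * s) @ U.T
err = np.linalg.norm(K - K_hat) / np.linalg.norm(K)
```

The exactness in the toy check is the standard Nyström identity: when the sampled block `W` captures the full range of `K`, the reconstruction `C W_k^+ C^T` recovers `K`; on full-rank kernel matrices the same formula instead yields a low-rank approximation whose error the thesis bounds via subspace distances.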
Advisors
Bae, Doo Hwan; Park, Haesun
Description
KAIST, School of Computing
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2017
Identifier
325007
Language
eng
Description

Doctoral thesis - KAIST, School of Computing, 2017.8, [vi, 69 p.]

Keywords

Positive Semi-Definite Matrix; Eigen-decomposition; Nyström Method; Large-Scale Learning; Kernel Methods; Low-Rank Approximation

URI
http://hdl.handle.net/10203/242098
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=718889&flag=dissertation
Appears in Collection
CS-Theses_Ph.D. (Doctoral Theses)
Files in This Item
There are no files associated with this item.
