Bayesian weight decay for deep convolutional neural networks: approximation and generalization (심층 회선 신경망의 베이지언 가중치 감쇠: 근사화와 일반화)

This study determines the weight-decay parameter of a deep convolutional neural network (CNN) so that the trained network generalizes well. Although weight decay is theoretically related to generalization error, choosing its value is known to be a challenging problem. Deep CNNs are widely used in vision applications, and guaranteeing their classification accuracy on unseen data is important. Obtaining such a CNN generally requires numerical trials with different weight-decay values, but the larger the CNN architecture, the higher the computational cost of those trials. To address this problem, this study derives an analytical form for the decay parameter from a proposed objective function in conjunction with Bayesian probability distributions. For computational efficiency, a novel method is proposed that approximates this form using only a small amount of information from the Hessian matrix. Under general conditions, the approximation is guaranteed by a provable bound and is computed by a proposed algorithm operating on discretized information, with time complexity linear in the depth and width of the CNN. The bound establishes the consistency of the proposed learning scheme. The generalization error of a CNN trained by the proposed algorithm is analyzed with statistical learning theory, and the computational-complexity analysis quantifies the efficiency gain. By reducing the cost of determining the decay value, the approximation enables fast exploration of deep CNNs that yield a small generalization error. Experimental results with several deep CNNs verify that the underlying assumption holds on real-world image datasets.
In addition, the method achieves a substantial reduction in time complexity while maintaining good classification accuracy when applied to deeper classification networks, more complex training methods, and/or objective functions with high computational cost. A further advantage is that the proposed method applies to any deep classification network trained with a loss function satisfying mild conditions.
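The thesis derives its own analytical form for the decay parameter, which is not reproduced here. As a rough illustration of the general idea the abstract describes (re-estimating a weight-decay parameter from diagonal Hessian information, in time linear in the number of parameters), the sketch below implements the classic MacKay-style evidence-framework update, which is a well-known Bayesian scheme of this kind and not the thesis's algorithm; the function name and interface are hypothetical.

```python
import numpy as np

def evidence_weight_decay(weights, hessian_diag, n_iters=50, alpha0=1.0):
    """Iteratively re-estimate a weight-decay parameter alpha.

    Classic evidence-framework update (MacKay), NOT the thesis's method:
        gamma = sum_i h_i / (h_i + alpha)   # effective number of parameters
        alpha = gamma / ||w||^2
    It needs only the diagonal h_i >= 0 of the data-term Hessian, so each
    iteration costs O(number of parameters).
    """
    w2 = float(np.sum(weights ** 2))
    alpha = alpha0
    for _ in range(n_iters):
        gamma = float(np.sum(hessian_diag / (hessian_diag + alpha)))
        alpha = gamma / w2
    return alpha
```

With sharply curved directions (large h_i), gamma approaches the parameter count and the update favors a decay value near gamma / ||w||^2; flat directions contribute little to gamma and so push the decay down.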
Advisors
Jo, Sungho (조성호)
Description
Korea Advanced Institute of Science and Technology (KAIST), School of Computing
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2020
Identifier
325007
Language
eng
Description

Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology: School of Computing, 2020.2, [iv, 59 p.]

Keywords

Bayesian method; convolutional neural networks; computational complexity; inverse Hessian matrix; regularization; weight decay (베이지언 기법; 계산 복잡도; 회선 신경망; 역 헤시안 행렬; 학습 규제; 가중치 감소)

URI
http://hdl.handle.net/10203/284154
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=909372&flag=dissertation
Appears in Collection
CS-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
