DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Jo, Sungho | - |
dc.contributor.advisor | 조성호 | - |
dc.contributor.author | Park, Jung-Guk | - |
dc.date.accessioned | 2021-05-12T19:40:11Z | - |
dc.date.available | 2021-05-12T19:40:11Z | - |
dc.date.issued | 2020 | - |
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=909372&flag=dissertation | en_US |
dc.identifier.uri | http://hdl.handle.net/10203/284154 | - |
dc.description | Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST) : School of Computing, 2020.2, [iv, 59 p.] | - |
dc.description.abstract | This study determines the weight decay parameter of a deep convolutional neural network (CNN) that yields good generalization. Although weight decay is theoretically related to generalization error, choosing its value is known to be challenging. Deep CNNs are widely used in vision applications, and guaranteeing their classification accuracy on unseen data is important. Obtaining such a CNN generally requires numerical trials with different weight decay values, but the larger the CNN architecture, the higher the computational cost of these trials. To address this problem, this study derives an analytical form for the decay parameter from a proposed objective function in conjunction with Bayesian probability distributions. For computational efficiency, a novel method is proposed that approximates this form using a small amount of information from the Hessian matrix. Under general conditions, the approximation is guaranteed by a provable bound and is computed by a proposed algorithm with discretized information, whose time complexity is linear in the depth and width of the CNN. The bound establishes the consistency of the proposed learning scheme. The generalization error of a CNN trained by the proposed algorithm is analyzed with statistical learning theory, and the analysis of computational complexity quantifies the efficiency gain. By reducing the cost of determining the decay value, the approximation enables fast investigation of deep CNNs that yield small generalization error. Experimental results show that the underlying assumption, verified with different deep CNNs, holds for real-world image datasets. In addition, the method achieves a remarkable reduction in time complexity while maintaining good classification accuracy when applied to deeper classification networks, more complex training methods, and objective functions with high computational cost. A further advantage is that the proposed method can be applied to any deep classification network trained with a loss function satisfying mild conditions. | - |
dc.language | eng | - |
dc.publisher | 한국과학기술원 | - |
dc.subject | Bayesian method; convolutional neural networks; computational complexity; inverse Hessian matrix; regularization; weight decay | - |
dc.subject | 베이지언 기법; 계산 복잡도; 회선 신경망; 역 헤시안 행렬; 학습 규제; 가중치 감소 | - |
dc.title | Bayesian weight decay for deep convolutional neural networks | - |
dc.title.alternative | 심층 회선 신경망의 베이지언 가중치 감쇠 : 근사화와 일반화 | - |
dc.type | Thesis (Ph.D.) | - |
dc.identifier.CNRN | 325007 | - |
dc.description.department | 한국과학기술원 : 전산학부 | - |
dc.contributor.alternativeauthor | 박정국 | - |
dc.title.subtitle | approximation and generalization | - |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
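The abstract above describes deriving a weight-decay coefficient analytically from a Bayesian objective and approximating it with a small amount of Hessian information. A minimal sketch of that general idea, using MacKay-style evidence re-estimation with a diagonal Hessian approximation (this update rule and the function name are illustrative assumptions, not the thesis's derived algorithm):

```python
import numpy as np

def estimate_weight_decay(hessian_diag, weights, lam_init=1e-2, n_iter=50):
    """Fixed-point re-estimation of a weight-decay coefficient lambda from a
    diagonal Hessian approximation (evidence-framework style; illustrative)."""
    lam = lam_init
    for _ in range(n_iter):
        # gamma: effective number of well-determined parameters,
        # gamma = sum_i h_i / (h_i + lambda)
        gamma = np.sum(hessian_diag / (hessian_diag + lam))
        # Re-estimate lambda = gamma / ||w||^2
        lam = gamma / np.dot(weights, weights)
    return lam

# Hypothetical usage with random curvature and weight values:
rng = np.random.default_rng(0)
h = rng.uniform(0.1, 10.0, size=100)   # stand-in diagonal Hessian entries
w = rng.normal(size=100)               # stand-in trained weights
lam = estimate_weight_decay(h, w)
```

Using only the Hessian diagonal keeps the cost linear in the number of parameters, which mirrors the abstract's claim of time complexity linear in the depth and width of the network.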