DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 이지운 | - |
dc.contributor.author | Lee, Hoil | - |
dc.contributor.author | 이호일 | - |
dc.date.accessioned | 2024-08-08T19:31:17Z | - |
dc.date.available | 2024-08-08T19:31:17Z | - |
dc.date.issued | 2024 | - |
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1099288&flag=dissertation | en_US |
dc.identifier.uri | http://hdl.handle.net/10203/322069 | - |
dc.description | Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): Department of Mathematical Sciences, 2024.2, [vii, 145 p.] | - |
dc.description.abstract | Machine learning models are typically initialized with independent Gaussian weights. However, there is evidence that the weights of some pre-trained models exhibit heavy-tailed distributions and mutual dependence. This suggests that, in terms of Bayesian inference, our prior belief does not properly describe the true behavior of the network function. To alleviate this limitation, we consider a network model whose weights are initialized with possibly dependent heavy-tailed distributions. We prove that, as the network width tends to infinity, the outputs of such a network converge in distribution to stable processes or, more generally, to mixtures of Gaussian processes, where the limiting stochastic process is determined by the distribution with which the weights are initialized. We also prove that some weights do not converge to zero as the width tends to infinity, and that the corresponding nodes may represent hidden features. Additionally, we investigate the pruning error in the infinite-width limit. Finally, we analyze the optimization of our network via gradient flow, and prove that the gradient flow converges to the global minimum while learning features, unlike the earlier Neural Tangent Kernel (NTK) model. | - |
dc.language | eng | - |
dc.publisher | 한국과학기술원 | - |
dc.subject | 두터운 꼬리 분포; 노드간 의존성; 무한 너비 신경망; 혼합 가우스 과정; 표현 학습; 가지치기; Neural Tangent Kernel | - |
dc.subject | Heavy-tailed distribution; inter-node dependence; infinitely wide neural network; mixture of Gaussian processes; feature learning; pruning; Neural Tangent Kernel | - |
dc.title | Infinitely wide neural networks with heavy tails and inter-node dependence | - |
dc.title.alternative | 무한 너비 신경망의 두터운 꼬리 분포와 노드간 의존성 | - |
dc.type | Thesis (Ph.D.) | - |
dc.identifier.CNRN | 325007 | - |
dc.description.department | 한국과학기술원 : 수리과학과 | - |
dc.contributor.alternativeauthor | Lee, Ji Oon | - |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
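The abstract contrasts the usual Gaussian initialization, whose infinite-width limit is a Gaussian process, with heavy-tailed initialization, whose limit is a stable process under the appropriate scaling. A minimal sketch of that scaling difference, not taken from the thesis itself: with Gaussian output weights the sum over `n` hidden units is scaled by `n**(-1/2)` (CLT scaling), while with Cauchy weights (alpha-stable, alpha = 1) the stable scaling is `n**(-1/alpha) = 1/n`. All names and parameter choices here are illustrative assumptions.

```python
# Hypothetical sketch (not from the thesis): contrast Gaussian vs. heavy-tailed
# (Cauchy, i.e. alpha-stable with alpha = 1) output-weight initialization in a
# wide one-hidden-layer network, with the corresponding width scalings.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000          # hidden width (stand-in for the infinite-width limit)
d = 5               # input dimension
x = rng.standard_normal(d)

# First layer kept Gaussian for simplicity; only the output layer varies.
W1 = rng.standard_normal((n, d)) / np.sqrt(d)
h = np.tanh(W1 @ x)                   # bounded activations

v_gauss = rng.standard_normal(n)
y_gauss = v_gauss @ h / np.sqrt(n)    # CLT scaling -> Gaussian process limit

v_cauchy = rng.standard_cauchy(n)
y_cauchy = v_cauchy @ h / n           # stable scaling (alpha = 1) -> Cauchy limit

print(y_gauss, y_cauchy)
```

The Cauchy draws contain individual weights that are orders of magnitude larger than any Gaussian draw, which is the heavy-tailed behavior the abstract attributes to some pre-trained models; under the `1/n` scaling those few large weights dominate the sum, so the corresponding nodes survive in the limit.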