Infinitely wide neural networks with heavy tails and inter-node dependence

DC Field: Value
dc.contributor.advisor: 이지운
dc.contributor.author: Lee, Hoil
dc.contributor.author: 이호일
dc.date.accessioned: 2024-08-08T19:31:17Z
dc.date.available: 2024-08-08T19:31:17Z
dc.date.issued: 2024
dc.identifier.uri: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1099288&flag=dissertation
dc.identifier.uri: http://hdl.handle.net/10203/322069
dc.description: Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST), Department of Mathematical Sciences, 2024.2, [vii, 145 p.]
dc.description.abstract: Machine learning models are typically initialized with independent Gaussian weights. However, there is evidence that the weights of some pre-trained models exhibit heavy-tailed distributions and mutual dependence. In terms of Bayesian inference, this hints that our prior belief does not properly describe the true behavior of the network function. To alleviate this limitation, we consider a network model whose weights are initialized with possibly dependent heavy-tailed distributions. We prove that, as the network width tends to infinity, the outputs of such a network converge in distribution to stable processes or, more generally, mixtures of Gaussian processes, where the limiting stochastic process is determined by the distribution with which the weights are initialized. We also prove that some weights do not converge to zero as the width tends to infinity, and the corresponding nodes may represent hidden features. Additionally, we investigate the pruning error in the infinite-width limit. Finally, we analyze the optimization of our network via gradient flow and prove that the gradient flow converges to the global minimum while learning features, unlike the earlier Neural Tangent Kernel (NTK) model.
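The following is not from the thesis, only a minimal numerical sketch of the kind of initialization the abstract describes: hidden-to-output weights drawn from a heavy-tailed Student-t law (a stand-in for an α-stable distribution, with tail index α = 1.5 assumed here) and scaled by width^(-1/α) instead of the usual width^(-1/2). A crude tail diagnostic shows that the resulting network outputs have far heavier tails than under Gaussian initialization.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer_outputs(width, n_samples, alpha=None):
    """Outputs of a one-hidden-layer ReLU network at a fixed input.

    alpha=None -> i.i.d. Gaussian output weights scaled by width**-0.5
                  (the classical Gaussian-process limit).
    alpha=1.5  -> i.i.d. Student-t output weights with df=alpha (heavy-tailed,
                  infinite variance) scaled by width**(-1/alpha), the scaling
                  under which sums converge to an alpha-stable law.
    """
    # Hidden units: ReLU of Gaussian first-layer weights times a unit input.
    h = np.maximum(rng.normal(size=(n_samples, width)), 0.0)
    if alpha is None:
        v = rng.normal(size=(n_samples, width)) * width**-0.5
    else:
        v = rng.standard_t(df=alpha, size=(n_samples, width)) * width**(-1.0 / alpha)
    return (v * h).sum(axis=1)

gauss = layer_outputs(width=512, n_samples=4000)
heavy = layer_outputs(width=512, n_samples=4000, alpha=1.5)

# Fraction of outputs exceeding 10x the median absolute value: essentially
# zero for the Gaussian case, a few percent for the heavy-tailed case.
gauss_frac = np.mean(np.abs(gauss) > 10 * np.median(np.abs(gauss)))
heavy_frac = np.mean(np.abs(heavy) > 10 * np.median(np.abs(heavy)))
print(gauss_frac, heavy_frac)
```

The width^(-1/α) scaling is what replaces the NTK-style width^(-1/2) normalization when the weight variance is infinite; it keeps the output distribution non-degenerate as the width grows.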
dc.language: eng
dc.publisher: 한국과학기술원 (KAIST)
dc.subject: 두터운 꼬리 분포; 노드간 의존성; 무한 너비 신경망; 혼합 가우스 과정; 표현 학습; 가지치기; Neural Tangent Kernel
dc.subject: Heavy-tailed distribution; inter-node dependence; infinitely wide neural network; mixture of Gaussian processes; feature learning; pruning; Neural Tangent Kernel
dc.title: Infinitely wide neural networks with heavy tails and inter-node dependence
dc.title.alternative: 무한 너비 신경망의 두터운 꼬리 분포와 노드간 의존성
dc.type: Thesis (Ph.D.)
dc.identifier.CNRN: 325007
dc.description.department: 한국과학기술원 (KAIST), 수리과학과 (Department of Mathematical Sciences)
dc.contributor.alternativeauthor: Lee, Ji Oon
Appears in Collection
MA-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
