Design and analysis of optimization problems in deep learning

DC Field: Value
dc.contributor.advisor: Kang, Wanmo
dc.contributor.advisor: 강완모
dc.contributor.author: Lee, Cheolhyoung
dc.date.accessioned: 2021-05-12T19:43:50Z
dc.date.available: 2021-05-12T19:43:50Z
dc.date.issued: 2020
dc.identifier.uri: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=924355&flag=dissertation
dc.identifier.uri: http://hdl.handle.net/10203/284354
dc.description: Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): Department of Mathematical Sciences, 2020.8, [xii, 80 p.]
dc.description.abstract: It has recently been observed that probabilistic ideas can be useful in deep learning. For instance, stochastic gradient descent (SGD) enables a deep neural network to learn a task efficiently, and dropout prevents co-adaptation of neurons through random subnetworks. Despite their wide adoption, our understanding of their role in high-dimensional parameter spaces is limited. In this dissertation, we analyze SGD from a geometric perspective by inspecting the stochasticity of the norms and directions of minibatch gradients. We claim that the directional uniformity of minibatch gradients increases over the course of SGD. Furthermore, we show that dropout regularizes learning by penalizing deviation from the origin, with a regularization strength that adapts along the optimization trajectory. Inspired by this theoretical analysis of dropout, we propose a new regularization technique, "mixout," for transfer learning (a minimal code sketch of the idea appears after this record). Mixout greatly improves both the finetuning stability and the average performance of pretrained large-scale language models. For training from scratch, we introduce a variant of mixout that prevents the generator from forgetting, thereby avoiding mode collapse in GANs.
dc.language: eng
dc.publisher: Korea Advanced Institute of Science and Technology (KAIST)
dc.subject: deep learning; stochastic gradient descent; dropout; finetuning stability; mode collapse
dc.title: Design and analysis of optimization problems in deep learning
dc.type: Thesis (Ph.D.)
dc.identifier.CNRN: 325007
dc.description.department: Korea Advanced Institute of Science and Technology (KAIST), Department of Mathematical Sciences
dc.contributor.alternativeauthor: 이철형
Appears in Collection
MA-Theses_Ph.D. (Doctoral Theses)
Files in This Item
There are no files associated with this item.
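The abstract describes mixout only at a high level: where dropout randomly zeroes parameters, mixout randomly replaces them with their pretrained values, shifting the implicit regularization target from the origin to the pretrained model. Below is a minimal PyTorch sketch of that idea, written from the abstract's description; the function name mixout, the default p, and the expectation-preserving rescaling are illustrative assumptions, not code from the thesis.

```python
import torch

def mixout(weight: torch.Tensor, pretrained: torch.Tensor,
           p: float = 0.7, training: bool = True) -> torch.Tensor:
    """Illustrative sketch: with probability p, swap each coordinate of
    `weight` with the corresponding coordinate of `pretrained` (dropout
    is the special case pretrained = 0), then rescale so that the
    expected output equals `weight`. Assumes 0 <= p < 1.
    """
    if not training or p == 0.0:
        return weight
    # Bernoulli mask: 1 -> take the pretrained coordinate, 0 -> keep current.
    mask = torch.bernoulli(torch.full_like(weight, p))
    mixed = mask * pretrained + (1.0 - mask) * weight
    # E[mixed] = p * pretrained + (1 - p) * weight; remove the bias toward
    # the pretrained weights so that E[output] = weight.
    return (mixed - p * pretrained) / (1.0 - p)
```

Setting pretrained to zero reduces this sketch to standard inverted dropout, which matches the abstract's reading of dropout as a penalty on deviation from the origin; mixout instead penalizes deviation from the pretrained weights during finetuning.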
