(A) study on architecture search using continuous relaxation

DARTS is a simple and efficient architecture search method based on continuous relaxation. In this paper, we identify two problems in DARTS, verify them experimentally, and provide a solution for each. First, we show that DARTS has difficulty exploring large search spaces, which leads to strong dependencies between operations in the one-shot network. Through experiments, we verify that DARTS cannot effectively find optimal solutions in the search space, and that removing operations in the derivation process of DARTS can strongly affect the remaining operations because of these dependencies. Based on these observations, we propose DARTS In DARTS (DID), which is designed to break the dependencies between operations. In DID, we randomly drop a certain number of operations from the search space at every iteration, so that operations do not rely on the others but instead learn useful weight parameters by themselves. The main advantage of this method lies in its robustness to the pruning of operations, which justifies the derivation procedure used in DARTS. Second, DARTS has a critical problem of inconsistency between the relaxed model and the sampled model. We solve this problem with our method, STable ARchiTecture Search (STARTS), which directly trains the sampled models, and show improved performance through extensive experiments. STARTS also shows better stability by leveraging weight sharing, which helps maintain consistent sampling ability across various search spaces. STARTS finds an image classifier on CIFAR-10 with nearly state-of-the-art performance and a recurrent neural network for language modeling on Penn Treebank that is comparable to the state-of-the-art model.
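The abstract summarizes two mechanisms: the DARTS-style continuous relaxation, where each edge outputs a softmax-weighted mixture of candidate operations, and the DID idea of randomly dropping a subset of operations at every iteration so that no operation can lean on the others. The following is a minimal PyTorch sketch of those two ideas only; the names (MixedOp, candidate_ops, drop_prob, derive) are illustrative and not taken from the thesis, and the candidate set is a reduced stand-in for the full DARTS search space.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # A small illustrative subset of candidate operations on one edge; the
    # real DARTS space also includes separable and dilated convolutions, etc.
    def candidate_ops(channels):
        return nn.ModuleList([
            nn.Identity(),                                            # skip connection
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),  # 3x3 conv
            nn.MaxPool2d(3, stride=1, padding=1),                     # 3x3 max pool
        ])

    class MixedOp(nn.Module):
        """Continuous relaxation of one edge: its output is a softmax-weighted
        sum of all candidate operations (hypothetical sketch, not the thesis code)."""
        def __init__(self, channels):
            super().__init__()
            self.ops = candidate_ops(channels)
            # Architecture parameters: one logit per candidate operation.
            self.alpha = nn.Parameter(1e-3 * torch.randn(len(self.ops)))

        def forward(self, x, drop_prob=0.0):
            weights = F.softmax(self.alpha, dim=0)
            if drop_prob > 0.0:
                # DID-style random drop: zero out a random subset of operations
                # each iteration so the survivors must learn useful weights
                # on their own rather than relying on the others.
                mask = (torch.rand(len(self.ops)) > drop_prob).float()
                if mask.sum() == 0:                      # keep at least one op alive
                    mask[torch.randint(len(self.ops), (1,))] = 1.0
                weights = weights * mask
                weights = weights / (weights.sum() + 1e-8)  # renormalize the survivors
            return sum(w * op(x) for w, op in zip(weights, self.ops))

        def derive(self):
            # Derivation step as in DARTS: keep only the operation with the
            # largest architecture weight (the pruning that DID makes robust).
            return self.ops[int(self.alpha.argmax())]

    # Usage on a toy feature map.
    edge = MixedOp(channels=16)
    x = torch.randn(2, 16, 8, 8)
    y_relaxed = edge(x)                 # plain continuous relaxation
    y_dropped = edge(x, drop_prob=0.5)  # with random operation dropping
    print(y_relaxed.shape, y_dropped.shape, edge.derive())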
Advisors
Kim, Junmo (김준모)
Description
Korea Advanced Institute of Science and Technology: School of Electrical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2019
Identifier
325007
Language
eng
Description

Master's thesis - Korea Advanced Institute of Science and Technology: School of Electrical Engineering, 2019.8, [iv, 25 p.]

Keywords

Deep learning; architecture search; continuous relaxation; weight sharing; dropout

URI
http://hdl.handle.net/10203/283072
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=875366&flag=dissertation
Appears in Collection
EE-Theses_Master (Master's theses)
Files in This Item
There are no files associated with this item.
