Accuracy-aware efficient neural network compression for resource-limited system
(Korean title, translated: Efficient neural network compression based on accuracy parameters for resource-limited systems)

Network compression algorithms have been used to accelerate the processing of deep neural networks by reducing the number of trainable parameters. In SVD-based network compression, the rank of each decomposed convolution filter is a key parameter that determines the complexity and accuracy of a neural network. In this paper, we first introduce a combinatorial rank configuration algorithm with SVD-based filter pruning. The proposed approaches choose a rank configuration that satisfies a constraint on computational complexity or accuracy in the combinatorial search space, whereas previous works determine the rank iteratively, one layer at a time. To make the combinatorial space feasible to search, we propose bounded space generation algorithms that extract the essential rank configurations from the numerous combinations. We also propose novel accuracy metrics that represent the relationship between accuracy and complexity for a given neural network, and we use these metrics to quickly evaluate the accuracy of a rank configuration. Finally, we propose a single-shot approach that chooses a rank configuration in a few seconds. Experiments show that our method provides a better compromise between accuracy and computational complexity/memory consumption while performing compression at much higher speed. For VGG-16, our network reduces the FLOPs by 25% and improves accuracy by 0.7% compared to the baseline, while requiring only 3 minutes on a CPU to search for the right rank configuration. Previously, similar results were achieved in 4 hours with 8 GPUs. The better compromise between accuracy and complexity, together with the extremely fast speed of our method, makes it suitable for neural network compression. We expect that our approaches can be widely used to accelerate high-performance algorithms that use many neural networks, and in real-time resource management systems for resource-limited applications such as mobile devices and robotics.
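As an illustration of what choosing a rank configuration under a complexity constraint can look like, the following is a minimal sketch and not the thesis's algorithm. It assumes each layer is a flattened weight matrix of shape (rows, cols) factored as a rank-r product, so its cost becomes r * (rows + cols) instead of rows * cols. The layer shapes, the candidate rank grid, and `proxy_score` are hypothetical placeholders; in particular, `proxy_score` stands in for the accuracy metrics mentioned in the abstract, which are not specified here. The search is a plain exhaustive enumeration, which is only practical for toy sizes and does not implement the bounded space generation described above.

```python
from itertools import product

# Toy layers as (rows, cols) of the flattened weight matrix.
# These shapes are illustrative assumptions, not taken from the thesis.
LAYERS = [(64, 27), (128, 576), (256, 1152)]


def dense_cost(rows, cols):
    """Multiply-accumulates of the uncompressed layer per input position."""
    return rows * cols


def factored_cost(rows, cols, rank):
    """Multiply-accumulates after a rank-`rank` factorization W ~ U @ V."""
    return rank * (rows + cols)


def candidate_ranks(rows, cols, steps=4):
    """Hypothetical coarse rank grid: `steps` evenly spaced ranks up to full rank."""
    full = min(rows, cols)
    return sorted({max(1, full * k // steps) for k in range(1, steps + 1)})


def proxy_score(config):
    """Placeholder accuracy proxy: mean fraction of full rank kept per layer.

    This is NOT the accuracy metric proposed in the thesis.
    """
    fractions = [r / min(m, n) for r, (m, n) in zip(config, LAYERS)]
    return sum(fractions) / len(fractions)


def best_config(budget_ratio):
    """Return (score, ranks, cost) of the best configuration within the budget.

    The budget is `budget_ratio` times the total dense cost. Returns None
    when no candidate configuration fits.
    """
    budget = budget_ratio * sum(dense_cost(m, n) for m, n in LAYERS)
    best = None
    for config in product(*(candidate_ranks(m, n) for m, n in LAYERS)):
        cost = sum(factored_cost(m, n, r) for (m, n), r in zip(LAYERS, config))
        if cost > budget:
            continue  # violates the complexity constraint
        score = proxy_score(config)
        if best is None or score > best[0]:
            best = (score, config, cost)
    return best


if __name__ == "__main__":
    print(best_config(0.75))
```

Because all layers are considered jointly, a configuration can spend more of the budget on one layer and less on another, which a greedy layer-by-layer choice cannot do. The number of combinations grows multiplicatively with the number of layers, which is why a real search needs some way of bounding the space.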
Advisors
Shin, Jinwoo (신진우); Kyung, Chong-Min (경종민)
Description
Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2020
Identifier
325007
Language
eng
Description

Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering, 2020.2, [ix, 82 p.]

Keywords

Neural network compression; Fast network compression; Accuracy metric; Global optimization; Network acceleration

URI
http://hdl.handle.net/10203/284220
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=909458&flag=dissertation
Appears in Collection
EE-Theses_Ph.D. (Doctoral theses)
Files in This Item
There are no files associated with this item.
