Effective deep neural network compression based on network pruning

Abstract
As the size and computational cost of deep neural networks continue to grow, the demand for effective compression techniques is increasing, and various approaches for compressing and accelerating networks are being explored. Among them, network pruning is attracting particular attention because it reaches a good optimum efficiently by reusing pre-trained weights and achieves high compression rates with a simple procedure. In this study, we propose several network compression methods based on network pruning. First, we propose LRF, a filter pruning method that observes the linear relationships among the filters of a pre-trained network and compensates for the error introduced by pruning by modifying the values of the remaining weights. Second, by exploiting knowledge distillation, we propose IMR, a new generalized compression framework that achieves high performance even when filters are removed at random, without careful selection. Third, to further increase the practicality of pruning, we propose LASS, a layer pruning method that is particularly effective at improving the actual inference speed of various networks. The filter pruning used in LRF is effective when compressing networks with a large amount of computation, while IMR can be combined with any pruning method to improve its performance; when IMR and LRF are applied together, the computation time of a large network can be reduced substantially while maintaining high accuracy. The layer pruning method LASS, on the other hand, is effective for networks of all sizes, and its value is especially noticeable when compressing small networks. Since IMR can also be adopted with LASS, applying the two methods together significantly improves the layer pruning results for small networks as well. Through the approaches proposed in this dissertation, we contribute to the widespread use of deep neural networks by improving their actual speed in diverse environments.
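The abstract only describes these methods at a high level. As an illustration, the following is a minimal, hypothetical sketch of the LRF idea, assuming PyTorch and two consecutive convolution layers: a filter that is approximately a linear combination of the remaining filters is removed, and the combination coefficients are folded into the next layer's weights so the network's function is roughly preserved. The function name, the least-squares fit, and the two-layer setup are illustrative assumptions, not the dissertation's actual algorithm.

```python
import torch

def prune_with_compensation(w1, w2, prune_idx):
    """Remove one filter from layer i and compensate in layer i+1.

    w1: layer-i weights,     shape (C_out, C_in, k, k)
    w2: layer-(i+1) weights, shape (D_out, C_out, k2, k2)
    prune_idx: index of the layer-i filter to remove

    NOTE: this sketch ignores any normalization or nonlinearity between
    the two layers, so the compensation is only exact for a linear
    stack; it illustrates the idea, not a faithful LRF implementation.
    """
    keep = [i for i in range(w1.shape[0]) if i != prune_idx]

    # Flatten filters to vectors and fit the pruned filter as a
    # linear combination of the kept ones: target ~= basis @ coeff.
    flat = w1.reshape(w1.shape[0], -1)
    basis = flat[keep].T                    # (C_in*k*k, C_out-1)
    target = flat[prune_idx].unsqueeze(1)   # (C_in*k*k, 1)
    coeff = torch.linalg.lstsq(basis, target).solution.squeeze(1)

    # Fold the coefficients into the next layer: each kept channel j
    # absorbs coeff[j] times the weights that consumed the pruned channel.
    w2_new = w2[:, keep].clone()
    for j, c in enumerate(coeff):
        w2_new[:, j] += c * w2[:, prune_idx]

    return w1[keep], w2_new

# Tiny usage example with random weights.
w1 = torch.randn(16, 8, 3, 3)
w2 = torch.randn(32, 16, 3, 3)
w1_pruned, w2_pruned = prune_with_compensation(w1, w2, prune_idx=5)
print(w1_pruned.shape, w2_pruned.shape)  # (15, 8, 3, 3), (32, 15, 3, 3)
```

IMR is likewise described only as a distillation-based recovery framework. As a stand-in for its unspecified objective, a standard soft-target knowledge-distillation loss (Hinton et al.) is sketched below, where the pruned student mimics the unpruned teacher; the temperature T and mixing weight alpha are assumed hyperparameters.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # Soft targets: match the teacher's temperature-softened distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # rescale gradients after temperature softening
    # Hard targets: ordinary supervised loss on the labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```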
Advisors
Kim, Junmo (김준모)
Description
Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2022
Identifier
325007
Language
eng
Description

Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering, 2022.8, [vii, 70 p.]

Keywords

Computer vision; Artificial intelligence; Deep learning; Artificial neural networks; Network pruning; Network compression

URI
http://hdl.handle.net/10203/309091
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1007861&flag=dissertation
Appears in Collection
EE-Theses_Ph.D. (Doctoral Theses)
Files in This Item
There are no files associated with this item.
