Energy-efficient deep-neural-network training processor with fine-grained mixed precision

Recently, several hardware accelerators have been reported for deep-neural-network (DNN) operation; however, they focus only on inference rather than on DNN training, which is a crucial ingredient for user adaptation at the edge device as well as for transfer learning with domain-specific data. DNN training requires much heavier floating-point (FP) computation and memory access than DNN inference, so dedicated DNN training hardware is essential. In this dissertation, we present a deep-learning neural processing unit (LNPU) supporting CNN and FC-layer training as well as inference, with the following key features. First, we propose a fine-grained mixed-precision (FGMP) scheme. FGMP divides data into an FP8 group and an FP16 group at the data-element level, and dynamically adjusts the ratio between FP8 and FP16 to reduce external memory access while avoiding accuracy loss. With FGMP, external memory access is reduced by 38.9% for ResNet-18 training. Second, we design a hardware architecture to support FGMP: for high energy efficiency, we propose a DL core architecture with a configurable PE and data-path for DNN training with FGMP. As a result, the energy efficiency of the LNPU is improved by 2.08× for ResNet-18 training. Lastly, we propose a fully reconfigurable hardware architecture for the various kinds of operations in DNN training/inference, with zero-skipping. With the help of this fully reconfigurable architecture, the proposed LNPU supports all steps of DNN training while skipping the zeros that arise from FGMP, ReLU, and so on. As a result, its energy efficiency is 4.4× higher than that of the NVIDIA V100 GPU, and its normalized peak performance is 2.4× higher than that of the previous DNN training processor.
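The abstract states that FGMP splits tensor elements between FP8 and FP16 storage and tunes the group ratio to cut external memory traffic. The dissertation's actual grouping criterion is not given here; the sketch below is a minimal illustration only, assuming a hypothetical magnitude threshold (large-magnitude outliers are kept in FP16, the rest in FP8) and modeling memory traffic simply as bits stored per element relative to all-FP16:

```python
import numpy as np

def fgmp_partition(tensor, threshold):
    """Illustrative element-level FP8/FP16 grouping (not the LNPU algorithm).

    Hypothetical criterion: elements whose magnitude exceeds `threshold`
    are assumed to need FP16 dynamic range; all others go to the FP8 group.
    """
    flat = tensor.ravel()
    fp16_mask = np.abs(flat) > threshold
    fp8_group = flat[~fp16_mask]
    fp16_group = flat[fp16_mask]
    ratio_fp8 = fp8_group.size / flat.size
    # Relative external-memory traffic vs. storing everything in FP16:
    # FP8 elements cost 8 bits (0.5x), FP16 elements cost 16 bits (1.0x).
    traffic = ratio_fp8 * 0.5 + (1.0 - ratio_fp8) * 1.0
    return fp8_group, fp16_group, ratio_fp8, traffic

x = np.random.randn(1024).astype(np.float32)
fp8, fp16, r, t = fgmp_partition(x, threshold=2.0)
```

Because most activation and gradient values in trained networks cluster near zero, a high FP8 ratio is typically achievable, which is the intuition behind the reported 38.9% traffic reduction; the real scheme would also need per-group scaling/exponent handling, which this sketch omits.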
Advisors
Yoo, Hoi-Jun
Publisher
KAIST (Korea Advanced Institute of Science and Technology)
Issue Date
2020
Identifier
325007
Language
eng
Description

Doctoral dissertation (Ph.D.) - KAIST: School of Electrical Engineering, 2020.8, [vii, 113 p.]

Keywords

Deep learning; deep-neural-network; DNN training; digital processor; energy-efficient hardware; artificial intelligence; machine learning

URI
http://hdl.handle.net/10203/284457
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=924545&flag=dissertation
Appears in Collection
EE-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
