An energy-efficient multiple-DNN training processor with active input-output dual zero skipping

Recently, many deep neural network (DNN) based services have been moving to the edge for on-device intelligence, which has driven research on energy-efficient DNN accelerators. Furthermore, as artificial intelligence (AI) functionality advances, model architectures become more complex, incorporating multiple DNNs in a single AI model. A generative adversarial network (GAN) is one example that performs advanced applications through its multiple-DNN architecture.

This dissertation presents GANPU, an energy-efficient multiple-DNN training processor for GANs. It enables on-device training of GANs on performance- and battery-limited mobile devices without sending user-specific data to servers, thereby avoiding privacy concerns. Training a GAN requires a massive amount of computation, which is difficult to accelerate on a resource-constrained platform. In addition, the networks and layers within a GAN show dramatically changing operational characteristics, making it difficult to optimize the processor's core and bandwidth allocation.

For higher throughput and energy efficiency, this dissertation proposes three key features. First, adaptive spatio-temporal workload multiplexing maintains high utilization while accelerating the multiple DNNs of a single GAN model. Second, to exploit ReLU sparsity during both inference and training, a dual-sparsity exploitation architecture skips redundant computations caused by zeros in both input and output features. Third, an exponent-only ReLU speculation algorithm, together with a lightweight processing-element architecture, estimates the locations of output-feature zeros during inference with minimal hardware overhead.

Fabricated in a 65 nm process, GANPU achieves an energy efficiency of 75.68 TFLOPS/W for 16-bit floating-point computation, which is 4.85x higher than the state of the art. As a result, GANPU enables energy-efficient on-device training of GANs.
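To make the two sparsity mechanisms named in the abstract concrete, the sketch below models them in NumPy for a fully connected layer y = ReLU(Wx). This is a minimal software illustration, not the dissertation's hardware algorithm: the function names, the layer shape, and the exact speculation rule (drop the mantissa, keep only the sign and exponent of each operand) are assumptions made for illustration.

```python
import numpy as np

def sign_exponent(a):
    """Read only the sign and exponent fields of each float, emulating
    hardware that never touches the mantissa."""
    _, exp = np.frexp(a)          # a = mantissa * 2**exp, mantissa in [0.5, 1)
    return np.sign(a), exp

def speculate_output_zeros(W, x):
    """Exponent-only ReLU speculation (illustrative): approximate each
    product w*x by sign(w)*sign(x) * 2**(e_w + e_x) and predict that rows
    whose approximate pre-activation is negative will be zeroed by ReLU."""
    sw, ew = sign_exponent(W)                # W: (out, in)
    sx, ex = sign_exponent(x)                # x: (in,)
    approx = (sw * sx) * np.exp2(ew + ex)    # mantissa-free partial products
    return approx.sum(axis=1) < 0.0          # True -> output speculated zero

def relu_fc_dual_skip(W, x):
    """ReLU(W @ x) with both zero sources skipped: input zeros (elements
    of x that a previous ReLU already zeroed) and speculated output zeros
    (rows that are never computed at full precision)."""
    predicted_zero = speculate_output_zeros(W, x)
    nz = np.flatnonzero(x)                    # skip input-zero MACs
    y = np.zeros(W.shape[0], dtype=W.dtype)
    for i in np.flatnonzero(~predicted_zero): # skip speculated-zero rows
        y[i] = max(W[i, nz] @ x[nz], 0.0)     # ReLU still guards the result
    return y

# Usage: a ReLU-sparse input makes both skip paths active.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 128)).astype(np.float32)
x = np.maximum(rng.standard_normal(128), 0).astype(np.float32)
y_ref = np.maximum(W @ x, 0.0)
y = relu_fc_dual_skip(W, x)
print("rows skipped:", speculate_output_zeros(W, x).mean())
print("mispredicted positives:", np.mean((y_ref > 0) & (y == 0)))
```

Because the mantissas are dropped, each product's magnitude can be off by up to 4x, so the speculation can occasionally zero out a small positive output; the abstract only claims the speculation "estimates" zero locations, and the fabricated design presumably keeps this misprediction cost low, which the sketch simply reports rather than corrects.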
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2022
Identifier
325007
Language
eng
Description

Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology: School of Electrical Engineering, 2022.2, [vii, 125 p.]

Keywords

Deep neural network; Deep learning; Neural processing unit; DNN accelerator; On-device intelligence; Multiple-DNN acceleration; Sparsity exploitation; Generative adversarial network

URI
http://hdl.handle.net/10203/309155
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1000292&flag=dissertation
Appears in Collection
EE-Theses_Ph.D. (Doctoral Theses)
Files in This Item
There are no files associated with this item.
