GANPU: An Energy-Efficient Multi-DNN Training Processor for GANs With Speculative Dual-Sparsity Exploitation

This article presents the generative adversarial network processing unit (GANPU), an energy-efficient multiple deep neural network (DNN) training processor for GANs. It enables on-device training of GANs on performance- and battery-limited mobile devices without sending user-specific data to servers, which avoids the associated privacy concerns. Training GANs requires a massive amount of computation and is therefore difficult to accelerate on a resource-constrained platform. In addition, the networks and layers in GANs show dramatically changing operational characteristics, making it difficult to optimize the processor's core and bandwidth allocation. For higher throughput and energy efficiency, this article proposes three key features. First, adaptive spatiotemporal workload multiplexing maintains high utilization when accelerating multiple DNNs in a single GAN model. Second, to take advantage of ReLU sparsity during both inference and training, a dual-sparsity exploitation architecture skips redundant computations caused by input and output feature zeros. Third, an exponent-only ReLU speculation (EORS) algorithm, along with its lightweight processing element (PE) architecture, estimates the location of output feature zeros during inference with minimal hardware overhead. Fabricated in a 65-nm process, the GANPU achieved an energy efficiency of 75.68 TFLOPS/W for 16-bit floating-point computation, which is 4.85x higher than the state of the art. As a result, GANPU enables on-device training of GANs with high energy efficiency.
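The abstract does not give the details of EORS or of the dual-sparsity datapath, so the following is only a minimal software sketch of the general idea, under stated assumptions. It assumes that speculation replaces every operand with its sign and power-of-two exponent (mantissa dropped), that a non-positive approximate sum is taken as a predicted ReLU zero, and that mispredictions are simply tolerated. The function names `exponent_only`, `speculate_relu_zeros`, and `dual_sparse_layer` are hypothetical and do not come from the article.

```python
import numpy as np

def exponent_only(x):
    """Keep only the sign and power-of-two exponent of each value (assumed EORS-style approximation)."""
    x = np.asarray(x, dtype=np.float32)
    _, exponent = np.frexp(x)  # x = m * 2**exponent with 0.5 <= |m| < 1
    # 2**(exponent - 1) is the largest power of two not exceeding |x|; zeros stay zero via sign().
    return np.sign(x) * np.ldexp(np.float32(1.0), exponent - 1)

def speculate_relu_zeros(inputs, weights):
    """Predict which outputs ReLU will zero, using the exponent-only approximation."""
    approx = exponent_only(inputs) @ exponent_only(weights)
    return approx <= 0  # True means "predicted zero output"

def dual_sparse_layer(inputs, weights):
    """Fully connected layer with ReLU that skips zero inputs and predicted-zero outputs."""
    inputs = np.asarray(inputs, dtype=np.float32)
    weights = np.asarray(weights, dtype=np.float32)
    predicted_zero = speculate_relu_zeros(inputs, weights)
    out = np.zeros(weights.shape[1], dtype=np.float32)
    nonzero_in = np.flatnonzero(inputs)            # input sparsity: skip zero activations
    for j in np.flatnonzero(~predicted_zero):      # output sparsity: skip predicted zeros
        acc = inputs[nonzero_in] @ weights[nonzero_in, j]
        out[j] = max(0.0, acc)
    return out, predicted_zero

# Illustrative values only (not from the article).
x = np.array([1.0, 0.0, 2.0, 0.0], dtype=np.float32)
w = np.array([[1.0, -1.0, 2.0],
              [5.0, 9.0, 0.0],
              [1.0, -1.0, -0.5],
              [5.0, 9.0, 0.0]], dtype=np.float32)
y, skipped = dual_sparse_layer(x, w)
```

In this sketch a wrong prediction silently produces a zero output; how the actual processor bounds or corrects speculation error is not stated in the abstract.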
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Issue Date
2021-09
Language
English
Article Type
Article
Citation

IEEE JOURNAL OF SOLID-STATE CIRCUITS, v.56, no.9, pp.2845 - 2857

ISSN
0018-9200
DOI
10.1109/JSSC.2021.3066572
URI
http://hdl.handle.net/10203/287858
Appears in Collection
EE-Journal Papers (Journal Papers)
