Hybrid Convolution Architecture for Energy-Efficient Deep Neural Network Processing

This paper presents a convolution process and its hardware architecture for energy-efficient deep neural network (DNN) processing. A DNN generally consists of a number of convolutional layers. In a shallow layer, the number of input features involved in the convolution is larger than the number of kernels. As the layers deepen, however, the number of input features decreases while the number of kernels increases. Previous convolution architectures developed to enhance energy efficiency have tried to reduce memory accesses by increasing the reuse of data once it has been accessed from memory. However, they still require redundant memory accesses because they do not account for how the numbers of input features and kernels change from layer to layer. We propose a hybrid convolution process that selects either a kernel-stay or a feature-stay process by taking these numbers into account, together with a forwarding technique that further reduces the memory accesses needed to store and load partial sums. The proposed convolution process is effective in maximizing data reuse, leading to an energy-efficient hybrid convolution architecture. Compared to state-of-the-art architectures, the proposed architecture enhances energy efficiency by up to 2.38 times in a 65 nm CMOS process.
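The per-layer selection described in the abstract can be illustrated with a minimal sketch. This is not the paper's method or cost model: the function names, the `buffer_capacity` parameter, and the access estimate (the data that "stays" is loaded once, while the other data is streamed once per buffer-sized batch of staying data) are all assumptions made for illustration only.

```python
from math import ceil


def estimated_accesses(stay_count, stream_count, buffer_capacity):
    """Toy memory-access estimate (an assumption, not the paper's model).

    The staying data is loaded once. If it does not fit in the on-chip
    buffer, it is processed in batches, and the streamed data is re-read
    once per batch.
    """
    passes = ceil(stay_count / buffer_capacity)
    return stay_count + stream_count * passes


def select_process(num_features, num_kernels, buffer_capacity):
    """Pick the process with the lower estimated access count for one layer."""
    kernel_stay = estimated_accesses(num_kernels, num_features, buffer_capacity)
    feature_stay = estimated_accesses(num_features, num_kernels, buffer_capacity)
    return "kernel-stay" if kernel_stay <= feature_stay else "feature-stay"


# Hypothetical layer shapes: a shallow layer with many input features and
# few kernels, and a deep layer with the opposite balance.
shallow = select_process(num_features=1000, num_kernels=10, buffer_capacity=64)
deep = select_process(num_features=10, num_kernels=1000, buffer_capacity=64)
```

Under this toy model the choice flips as the balance between input features and kernels shifts across layers, which is the effect a fixed, single-dataflow architecture cannot exploit. The actual selection criterion and the partial-sum forwarding technique are defined in the paper itself.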
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Issue Date: 2021-05
Language: English
Article Type: Article
Citation: IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, v.68, no.5, pp.2017 - 2029
ISSN: 1549-8328
DOI: 10.1109/TCSI.2021.3059882
URI: http://hdl.handle.net/10203/285325
Appears in Collection: EE-Journal Papers