DSpace at KOASAS: Z-PIM: A Sparsity-Aware Processing-in-Memory Architecture With Fully Variable Weight Bit-Precision for Energy-Efficient Deep Neural Networks

DSpace at KOASAS

College of Engineering(공과대학)School of Electrical Engineering(전기및전자공학부)EE-Journal Papers(저널논문)

Z-PIM: A Sparsity-Aware Processing-in-Memory Architecture With Fully Variable Weight Bit-Precision for Energy-Efficient Deep Neural Networks

Cited 30 time in

Cited 0 time in

Hit : 586
Download : 0

Export

Kim, Ji-Hoon / Lee, Juhyoung / Lee, Jinsu / Heo, Jaehoon / Kim, Joo-Young researcher

We present an energy-efficient processing-in-memory (PIM) architecture named Z-PIM that supports both sparsity handling and fully variable bit-precision in weight data for energy-efficient deep neural networks. Z-PIM adopts the bit-serial arithmetic that performs a multiplication bit-by-bit through multiple cycles to reduce the complexity of the operation in a single cycle and to provide flexibility in bit-precision. To this end, it employs a zero-skipping convolution SRAM, which performs in-memory AND operations based on custom 8T-SRAM cells and channel-wise accumulations, and a diagonal accumulation SRAM that performs bit- and spatial-wise accumulation on the channel-wise accumulation results using diagonal logic and adders to produce the final convolution outputs. We propose the hierarchical bitline structure for energy-efficient weight bit pre-charging and computational readout by reducing the parasitic capacitances of the bitlines. Its charge reuse scheme reduces the switching rate by 95.42% for the convolution layers of VGG-16 model. In addition, Z-PIM's channel-wise data mapping enables sparsity handling by skip-reading the input channels with zero weight. Its read-operation pipelining enabled by a readsequence scheduling improves the throughput by 66.1%. The Z-PIM chip is fabricated in a 65-nm CMOS process on a 7.568-mm(2) die, while it consumes average 5.294-mW power at 1.0-V voltage and 200-MHz frequency. It achieves 0.31-49.12-TOPS/W energy efficiency for convolution operations as the weight sparsity and bit-precision vary from 0.1 to 0.9 and 1 to 16 bit, respectively. For the figure of merit considering input bit- width, weight bit-width, and energy efficiency, the Z-PIM shows more than 2.1 times improvement over the state-of-the-art PIM implementations.

Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

Issue Date: 2021-04

Language: English

Article Type: Article; Proceedings Paper

Citation: IEEE JOURNAL OF SOLID-STATE CIRCUITS, v.56, no.4, pp.1093 - 1104

ISSN: 0018-9200

DOI: 10.1109/JSSC.2020.3039206

URI: http://hdl.handle.net/10203/282546

Appears in Collection: EE-Journal Papers(저널논문)

Files in This Item: There are no files associated with this item.

This item is cited by other documents in WoS

⊙ Detail Information in WoSⓡ	Click to see
⊙ Cited 30 items in WoS	Click to see citing articles in

Display Full Item Record

qr_code

트윗하기

KOASAS

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.

KOASAS

KOASAS

Browse

Z-PIM: A Sparsity-Aware Processing-in-Memory Architecture With Fully Variable Weight Bit-Precision for Energy-Efficient Deep Neural Networks

This item is cited by other documents in WoS

KOASAS

Communities & Collections