A redundancy-aware architecture to eliminate repeated computations in quantized convolutional neural networks

Quantized Convolutional Neural Networks (QCNNs) have been proposed to reduce the memory footprint, which is one of the major bottlenecks of CNNs. Nevertheless, the massive number of computations still hinders efficient processing of deep CNNs on modern devices. To reduce the computations while preserving the reduced memory size, this work is motivated by the following two observations. First, quantization causes operands to repeat, producing a large number of redundant computations. Second, QCNNs have extremely low or no sparsity in their weights. Most previous accelerators have considered only the redundant computations caused by sparsity, not those caused by repeated data, and therefore deliver smaller performance gains on QCNNs than on high-precision CNNs because QCNNs lack sparsity. This paper introduces RQ-CNN, a novel accelerator architecture that eliminates all the redundant computations in QCNNs. RQ-CNN identifies the indispensable data, i.e., the data in the network that are both unique and non-zero. Through this identification process, the redundant computations are handled at once by a merged multiplier operating on the indispensable data. While exploiting the redundancy, RQ-CNN maintains the parallelism and reusability of CNNs by fixing the redundancy scope, a window for searching indispensable data, so that the data layout is not disturbed. On state-of-the-art QCNNs, RQ-CNN improves performance and energy by factors of $4\times$ and $2.19\times$, respectively.
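The core idea of merging computations that share a repeated quantized operand can be illustrated with a minimal sketch. Assuming a dot product whose weights are quantized to a handful of levels, activations that share the same non-zero weight value can be summed first, so that each indispensable (unique, non-zero) weight value reaches the multiplier only once; the function name `conv_dot_merged`, the dictionary-based grouping, and the 3-bit-style weight levels below are illustrative assumptions, not the actual RQ-CNN datapath.

```python
import numpy as np

def conv_dot_merged(activations, q_weights):
    """Dot product that merges computations sharing the same quantized weight.

    Activations whose weights repeat (common after quantization) are summed
    first, and each unique non-zero weight value is multiplied only once.
    Zero weights are skipped, so only the "indispensable" (unique, non-zero)
    operands reach the multiplier.
    """
    accum = {}  # unique non-zero weight value -> running sum of activations
    for a, w in zip(activations, q_weights):
        if w == 0:                            # sparsity: zero weights contribute nothing
            continue
        accum[w] = accum.get(w, 0.0) + a      # merge repeated operands
    # one multiplication per indispensable weight value
    return sum(w * s for w, s in accum.items())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    acts = rng.standard_normal(64)
    # few unique weight levels (3-bit style quantization), hence heavy repetition
    weights = rng.choice([-2, -1, 0, 1, 2], size=64)
    merged = conv_dot_merged(acts, weights)
    naive = float(np.dot(acts, weights))
    assert np.isclose(merged, naive)          # same result, far fewer multiplications
    print(merged, naive)
```

In this sketch the number of multiplications is bounded by the number of unique non-zero weight values inside the grouping window rather than by the window length, which is the kind of saving the merged multiplier exploits.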
Advisors
Kim, Lee-Sup
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2019
Identifier
325007
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering, 2019.2, [iii, 35 p.]

Keywords

convolutional neural network inference; network quantization; repeated data; redundant computation

URI
http://hdl.handle.net/10203/282843
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=870600&flag=dissertation
Appears in Collection
EE-Theses_Master (Master's theses)
Files in This Item
There are no files associated with this item.
