A redundancy-aware architecture to eliminate repeated computations in quantized convolutional neural networks

Quantized Convolutional Neural Networks (QCNNs) have been proposed to reduce the memory footprint, which is one of the major bottlenecks of CNNs. Nevertheless, the massive number of computations still hinders efficient processing of deep CNNs on modern devices. To reduce the computations while preserving the reduced memory size, this work is motivated by the following two observations. First, quantization causes operands to repeat, producing a large number of redundant computations. Second, QCNNs have extremely low or no sparsity in their weights. Most previous accelerators have considered only the redundant computations caused by sparsity, not those caused by repeated data, and therefore deliver smaller performance gains on QCNNs than on high-precision CNNs because QCNNs lack sparsity. This paper introduces RQ-CNN, a novel accelerator architecture that eliminates all the redundant computations in QCNNs. RQ-CNN identifies the indispensable data, i.e., the data in the network that are both unique and non-zero. Through this identification process, the redundant computations are handled at once by a merged multiplier operating on the indispensable data. While exploiting the redundancy, RQ-CNN maintains the parallelism and reusability of CNNs by fixing the redundancy scope, a window for searching indispensable data, so that the data layout is not disturbed. On state-of-the-art QCNNs, RQ-CNN improves performance and energy by factors of $4\times$ and $2.19\times$, respectively.
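The core idea of merging computations that share a repeated quantized operand can be illustrated with a minimal sketch. Assuming a dot product whose weights are quantized to a handful of levels, activations that share the same non-zero weight value can be summed first, so that each indispensable (unique, non-zero) weight value reaches the multiplier only once; the function name `conv_dot_merged`, the dictionary-based grouping, and the 3-bit-style weight levels below are illustrative assumptions, not the actual RQ-CNN datapath.

```python
import numpy as np

def conv_dot_merged(activations, q_weights):
    """Dot product that merges computations sharing the same quantized weight.

    Activations whose weights repeat (common after quantization) are summed
    first, and each unique non-zero weight value is multiplied only once.
    Zero weights are skipped, so only the "indispensable" (unique, non-zero)
    operands reach the multiplier.
    """
    accum = {}  # unique non-zero weight value -> running sum of activations
    for a, w in zip(activations, q_weights):
        if w == 0:                            # sparsity: zero weights contribute nothing
            continue
        accum[w] = accum.get(w, 0.0) + a      # merge repeated operands
    # one multiplication per indispensable weight value
    return sum(w * s for w, s in accum.items())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    acts = rng.standard_normal(64)
    # few unique weight levels (3-bit style quantization), hence heavy repetition
    weights = rng.choice([-2, -1, 0, 1, 2], size=64)
    merged = conv_dot_merged(acts, weights)
    naive = float(np.dot(acts, weights))
    assert np.isclose(merged, naive)          # same result, far fewer multiplications
    print(merged, naive)
```

In this sketch the number of multiplications is bounded by the number of unique non-zero weight values inside the grouping window rather than by the window length, which is the kind of saving the merged multiplier exploits.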
Advisors
Kim, Lee-Sup
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2019
Identifier
325007
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering, 2019.2, [iii, 35 p.]

Keywords

convolutional neural network inference; network quantization; repeated data; redundant computation

URI
http://hdl.handle.net/10203/282843
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=870600&flag=dissertation
Appears in Collection
EE-Theses_Master (Master's theses)
Files in This Item
There are no files associated with this item.
