(A) redundancy-aware architecture to eliminate repeated computations in quantized convolutional neural networks

DC Field: Value
dc.contributor.advisor: Kim, Lee-Sup
dc.contributor.advisor: 김이섭 (Kim, Lee-Sup)
dc.contributor.author: Park, Kangkyu
dc.date.accessioned: 2021-05-10T19:30:17Z
dc.date.available: 2021-05-10T19:30:17Z
dc.date.issued: 2019
dc.identifier.uri: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=870600&flag=dissertation (en_US)
dc.identifier.uri: http://hdl.handle.net/10203/282843
dc.description: Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering, 2019.2, [iii, 35 p.]
dc.description.abstract: Quantized Convolutional Neural Networks (QCNNs) have been proposed to reduce the memory footprint, one of the major bottlenecks of CNNs. Nevertheless, the massive number of computations still hinders efficient processing of deep CNNs on modern devices. To reduce the computations while keeping the reduced memory footprint, this work is motivated by the following two observations. First, quantization causes operand values to repeat, producing a large number of redundant computations. Second, QCNNs have extremely low or no sparsity in their weights. Most previous accelerators have considered only the redundant computations caused by sparsity, not those caused by repeated data, so they improve performance less on QCNNs than on high-precision CNNs because of the lack of sparsity. This paper introduces RQ-CNN, a novel accelerator architecture that eliminates all the redundant computations in QCNNs. RQ-CNN identifies the indispensable data, defined as the data in the network that is both unique and non-zero. Through this identification, the redundant computations are processed at once by a merged multiplier operating on the indispensable data (a minimal sketch of this idea appears after the metadata record below). While exploiting the redundancy, RQ-CNN maintains the parallelism and data reusability of CNNs by fixing the redundancy scope, a window within which indispensable data is searched, so that the data layout is not disturbed. On state-of-the-art QCNNs, RQ-CNN improves performance and energy by factors of $4\times$ and $2.19\times$, respectively.
dc.language: eng
dc.publisher: Korea Advanced Institute of Science and Technology (KAIST)
dc.subject: convolutional neural network inference; network quantization; repeated data; redundant computation
dc.title: (A) redundancy-aware architecture to eliminate repeated computations in quantized convolutional neural networks
dc.title.alternative: An architecture exploiting computation redundancy to eliminate repeated computations in quantized convolutional neural networks (양자화된 컨볼루셔널 신경망의 반복되는 연산을 제거하는 불필요한 연산 중복을 활용한 아키텍처)
dc.type: Thesis (Master's)
dc.identifier.CNRN: 325007
dc.description.department: Korea Advanced Institute of Science and Technology (KAIST), School of Electrical Engineering
dc.contributor.alternativeauthor: 박강규 (Park, Kangkyu)
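
The merged-multiplier idea from the abstract can be illustrated in software. The Python sketch below is an illustration under assumptions, not the thesis's hardware design: the function names naive_dot and merged_dot and the example values are hypothetical, and it models a single dot product rather than a full convolution or the fixed redundancy-scope window. The core trick matches the abstract: activations whose weights repeat are summed first, so each unique non-zero weight value is multiplied exactly once.

from collections import defaultdict

def naive_dot(weights, activations):
    """Baseline: one multiply per weight, including zeros and repeated values."""
    return sum(w * x for w, x in zip(weights, activations))

def merged_dot(weights, activations):
    """Redundancy-aware sketch: keep only 'indispensable' data (unique AND
    non-zero weight values) and do one multiply per unique value."""
    groups = defaultdict(int)      # weight value -> accumulated activation sum
    for w, x in zip(weights, activations):
        if w != 0:                 # sparsity: zero weights contribute nothing
            groups[w] += x         # repetition: merge operands sharing weight w
    return sum(w * s for w, s in groups.items())  # one multiply per unique weight

if __name__ == "__main__":
    # Low-bit quantized weights repeat heavily: only a few distinct values occur.
    weights     = [3, 0, 3, -2, 3, -2, 0, 1]   # hypothetical quantized weights
    activations = [5, 7, 1,  4, 2,  6, 9, 8]   # hypothetical activations
    assert naive_dot(weights, activations) == merged_dot(weights, activations)
    # naive: 8 multiplies; merged: 3 multiplies (one each for 3, -2, and 1)
    print(merged_dot(weights, activations))    # -> 12

With b-bit weights there are at most 2^b - 1 distinct non-zero values, so the multiply count per window is bounded by the quantization level rather than the window size. This is why the approach pays off exactly where sparsity-based accelerators do not: quantization removes weight sparsity but creates heavy value repetition.
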
Appears in Collection
EE-Theses_Master (Master's Theses)
Files in This Item
There are no files associated with this item.
