DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Kim, Lee-Sup | - |
dc.contributor.advisor | 김이섭 | - |
dc.contributor.author | Park, Kangkyu | - |
dc.date.accessioned | 2021-05-10T19:30:17Z | - |
dc.date.available | 2021-05-10T19:30:17Z | - |
dc.date.issued | 2019 | - |
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=870600&flag=dissertation | en_US |
dc.identifier.uri | http://hdl.handle.net/10203/282843 | - |
dc.description | Master's thesis - KAIST : School of Electrical Engineering, 2019.2, [iii, 35 p.] | - |
dc.description.abstract | Quantized Convolutional Neural Networks (QCNNs) have been proposed to reduce the memory footprint, one of the major bottlenecks of CNNs. Nevertheless, the massive number of computations still hinders efficient processing of deep CNNs on modern devices. To reduce the computations while preserving the reduced memory size, this work is motivated by the following two observations. First, quantization causes operands to repeat, producing a large number of redundant computations. Second, QCNNs have extremely low or no sparsity in their weights. Most previous accelerators have considered only the redundant computations caused by sparsity, not those caused by repeated data; because QCNNs lack sparsity, such accelerators improve performance less on QCNNs than on high-precision CNNs. This paper introduces RQ-CNN, a novel accelerator architecture that eliminates all the redundant computations in QCNNs. RQ-CNN identifies the indispensable data, i.e., the data that are both unique and non-zero in the network. Through this identification process, the redundant computations are handled at once by a merged multiplier operating on the indispensable data. While exploiting this redundancy, RQ-CNN maintains the parallelism and data reusability of CNNs by fixing the redundancy scope, a window for searching indispensable data, without disturbing the data layout. On state-of-the-art QCNNs, RQ-CNN improves performance by $4\times$ and energy efficiency by $2.19\times$. | - |
dc.language | eng | - |
dc.publisher | 한국과학기술원 | - |
dc.subject | convolutional neural network inference | - |
dc.subject | network quantization | - |
dc.subject | repeated data | - |
dc.subject | redundant computation | - |
dc.subject | 콘볼루셔널 신경망 추론 | - |
dc.subject | 네트워크 양자화 | - |
dc.subject | 데이터 중복 | - |
dc.subject | 불필요 연산 | - |
dc.title | (A) redundancy-aware architecture to eliminate repeated computations in quantized convolutional neural networks | - |
dc.title.alternative | 양자화된 컨볼루셔널 신경망의 반복되는 연산을 제거하는 불필요한 연산 중복을 활용한 아키텍처 | - |
dc.type | Thesis(Master) | - |
dc.identifier.CNRN | 325007 | - |
dc.description.department | KAIST : School of Electrical Engineering | - |
dc.contributor.alternativeauthor | 박강규 | - |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
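The merged-multiplier idea described in the abstract can be illustrated in software: because a quantized layer's weights take only a few distinct values, activations that share the same weight can be summed first and multiplied once. Below is a minimal sketch of that idea under the editor's own naming and structure (the `merged_dot` function is illustrative only, not the thesis's hardware design):

```python
def merged_dot(weights, activations):
    """Dot product that merges multiplications sharing a quantized weight.

    Illustrative sketch only: quantized weights take few distinct values,
    so many products share an operand. Accumulating activations per
    distinct non-zero weight value first lets one "merged" multiply
    replace many, while skipping zero weights exploits sparsity.
    """
    partial_sums = {}
    for w, x in zip(weights, activations):
        if w == 0:  # zero weights contribute nothing
            continue
        partial_sums[w] = partial_sums.get(w, 0.0) + x
    # One multiplication per indispensable (unique, non-zero) weight value.
    return sum(w * s for w, s in partial_sums.items())
```

With 4-bit weights, for example, a dot product of any length needs at most 15 multiplications this way, since only 15 non-zero weight values can occur.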