DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Kim, Dongjun | - |
dc.contributor.advisor | 김동준 | - |
dc.contributor.author | Tanvir, Muhammad | - |
dc.date.accessioned | 2021-05-13T19:39:48Z | - |
dc.date.available | 2021-05-13T19:39:48Z | - |
dc.date.issued | 2020 | - |
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=925245&flag=dissertation | en_US |
dc.identifier.uri | http://hdl.handle.net/10203/285081 | - |
dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering, 2020.8, [iii, 23 p.] | - |
dc.description.abstract | Developing efficient hardware solutions for processing Convolutional Neural Networks (CNNs) is an active area of research in the computer architecture community. While some model-level modifications have been proposed over the years, using a transformed convolution scheme is the only approach that guarantees performance improvement without loss of accuracy. Among the transformed convolution schemes, the Winograd Minimal Filtering algorithm guarantees up to 2.25X performance improvement by significantly reducing the overall compute intensity of the CNN. The Winograd convolution algorithm also exposes an inherent parallelism called Intra Tile Parallelism, which presents a unique opportunity to further speed up CNN processing. Our work proposes an efficient dataflow architecture that exploits this Intra Tile Parallelism to improve CNN processing performance on the ResNet model. In our experiments on the ResNet model, these improvements outperform the state-of-the-art results provided by NVIDIA's cuDNN library. We observed a speedup of up to 2.14X in CNN layer processing time and device memory bandwidth savings of up to 2.3X on a Volta V100 Graphics Processing Unit (GPU) inside NVIDIA's DGX-1 system, relative to the cuDNN library-based counterparts. | - |
dc.language | eng | - |
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | - |
dc.subject | Convolution; GPU; Performance; Optimization | - |
dc.subject | 콘볼루션; GPU; 성능; 최적화 | - |
dc.title | Improving convolutional neural network processing in Winograd domain using intra tile parallelism | - |
dc.title.alternative | Winograd 영역에서의 타일 내 병렬성을 활용한 합성곱 신경망 처리 방식 개선 | - |
dc.type | Thesis(Master) | - |
dc.identifier.CNRN | 325007 | - |
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering | - |
dc.contributor.alternativeauthor | 탄 비르무하마드 | - |