Improving convolutional neural network processing in Winograd domain using intra tile parallelism
Winograd 영역에서의 타일 내 병렬성을 활용한 합성곱 신경망 처리 방식 개선

DC Field | Value | Language
dc.contributor.advisor | Kim, Dongjun | -
dc.contributor.advisor | 김동준 | -
dc.contributor.author | Tanvir, Muhammad | -
dc.date.accessioned | 2021-05-13T19:39:48Z | -
dc.date.available | 2021-05-13T19:39:48Z | -
dc.date.issued | 2020 | -
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=925245&flag=dissertation | en_US
dc.identifier.uri | http://hdl.handle.net/10203/285081 | -
dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering, 2020.8, [iii, 23 p.] | -
dc.description.abstract | Developing efficient hardware solutions for processing Convolutional Neural Networks (CNNs) is an active area of research in the computer architecture community. While some model-level modifications have been proposed over the years, the use of a transformed convolution scheme is the only approach that guarantees a performance improvement without loss of accuracy. Among the transformed convolution schemes, the Winograd Minimal Filtering algorithm guarantees up to a 2.25X performance improvement by significantly reducing the overall compute intensity of the CNN. The Winograd convolution algorithm also comes with an inherent form of parallelism, called Intra Tile Parallelism, which presents a unique opportunity to further speed up CNN processing. Our work proposes an efficient dataflow architecture that exploits this Intra Tile Parallelism to improve CNN processing performance on the ResNet model. The performance improvements achieved in our experiments on the ResNet model exceed the state-of-the-art results provided by NVIDIA's cuDNN library. We observed a speedup of up to 2.14X in CNN layer processing time and device memory bandwidth savings of up to 2.3X on a Volta V100 Graphics Processing Unit (GPU) inside NVIDIA's DGX-1 system, relative to the cuDNN library-based counterparts. | -
dc.language | eng | -
dc.publisher | 한국과학기술원 (Korea Advanced Institute of Science and Technology, KAIST) | -
dc.subject | Convolution; GPU; Performance; Optimization | -
dc.subject | 콘볼루션; GPU; 성능; 최적화 (Korean forms of the keywords above) | -
dc.title | Improving convolutional neural network processing in Winograd domain using intra tile parallelism | -
dc.title.alternative | Winograd 영역에서의 타일 내 병렬성을 활용한 합성곱 신경망 처리 방식 개선 | -
dc.type | Thesis(Master) | -
dc.identifier.CNRN | 325007 | -
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering | -
dc.contributor.alternativeauthor | 탄 비르무하마드 | -
Appears in Collection
EE-Theses_Master(석사논문)
Files in This Item
There are no files associated with this item.
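
Illustrative note on the abstract: the "up to 2.25X" figure quoted for the Winograd Minimal Filtering algorithm matches the multiplication count of the common F(2x2, 3x3) variant, where a 2x2 output tile needs 16 element-wise multiplications instead of the 36 used by direct 3x3 convolution (36 / 16 = 2.25). The sketch below is not taken from the thesis; it uses the standard F(2x2, 3x3) transform matrices from the Winograd convolution literature, and the function names are hypothetical. It checks one tile against direct convolution in plain Python. The 16 products in the element-wise step do not depend on each other, which is presumably the kind of within-tile independence the abstract refers to, although this record does not describe the thesis's actual dataflow.

```python
# Minimal sketch of Winograd F(2x2, 3x3) for a single tile (assumed variant;
# the record does not say which tile size the thesis uses).

# Standard transform matrices for F(2x2, 3x3).
BT = [[1, 0, -1, 0],
      [0, 1, 1, 0],
      [0, -1, 1, 0],
      [0, 1, 0, -1]]
G = [[1, 0, 0],
     [0.5, 0.5, 0.5],
     [0.5, -0.5, 0.5],
     [0, 0, 1]]
AT = [[1, 1, 1, 0],
      [0, 1, -1, -1]]

DIRECT_MULS = 2 * 2 * 3 * 3   # 4 outputs x 9 filter taps = 36
WINOGRAD_MULS = 4 * 4         # one multiply per element of the 4x4 tile = 16


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def winograd_tile(d, g):
    """Compute a 2x2 output tile from a 4x4 input tile d and a 3x3 filter g."""
    u = matmul(matmul(G, g), transpose(G))     # transformed filter, 4x4
    v = matmul(matmul(BT, d), transpose(BT))   # transformed input, 4x4
    # Element-wise product: 16 multiplications with no dependencies between them.
    m = [[u[i][j] * v[i][j] for j in range(4)] for i in range(4)]
    return matmul(matmul(AT, m), transpose(AT))  # inverse transform, 2x2


def direct_tile(d, g):
    """Reference: direct 3x3 convolution (CNN-style correlation) on the same tile."""
    return [[sum(d[i + r][j + c] * g[r][c] for r in range(3) for c in range(3))
             for j in range(2)]
            for i in range(2)]


if __name__ == "__main__":
    d = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    g = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
    print(winograd_tile(d, g))
    print(direct_tile(d, g))
    print(DIRECT_MULS / WINOGRAD_MULS)
```

The ratio counts only the multiplications in the element-wise step; the input, filter, and output transforms add extra additions, which is one reason measured speedups can differ from 2.25X.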
