DC Field | Value | Language |
---|---|---|
dc.contributor.author | Kim, Sangyeob | ko |
dc.contributor.author | Lee, Juhyoung | ko |
dc.contributor.author | Kang, Sanghoon | ko |
dc.contributor.author | Han, Donghyeon | ko |
dc.contributor.author | Jo, Wooyoung | ko |
dc.contributor.author | Yoo, Hoi-Jun | ko |
dc.date.accessioned | 2022-04-14T06:41:08Z | - |
dc.date.available | 2022-04-14T06:41:08Z | - |
dc.date.created | 2022-01-18 | - |
dc.date.issued | 2022-04 | - |
dc.identifier.citation | IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, v.69, no.4, pp.1494 - 1506 | - |
dc.identifier.issn | 1549-8328 | - |
dc.identifier.uri | http://hdl.handle.net/10203/292743 | - |
dc.description.abstract | This article proposes TSUNAMI, which supports energy-efficient deep-neural-network training. TSUNAMI supports multi-modal iterative pruning to generate zeros in both activations and weights. A tile-based dynamic activation pruning unit and a weight-memory-shared pruning unit eliminate additional memory accesses. A coarse-zero skipping controller skips multiple unnecessary multiply-and-accumulate (MAC) operations at once, and a fine-zero skipping controller skips randomly located unnecessary MAC operations. A weight sparsity balancer resolves the utilization degradation caused by weight sparsity imbalance, and the workload of each convolution core is allocated by a random channel allocator. TSUNAMI achieves an energy efficiency of 3.42 TFLOPS/W at 0.78 V and 50 MHz with floating-point 8-bit activations and weights, and 405.96 TFLOPS/W at a 90% sparsity condition. | - |
dc.language | English | - |
dc.publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC | - |
dc.title | TSUNAMI: Triple Sparsity-Aware Ultra Energy-Efficient Neural Network Training Accelerator With Multi-Modal Iterative Pruning | - |
dc.type | Article | - |
dc.identifier.wosid | 000740068900001 | - |
dc.identifier.scopusid | 2-s2.0-85122567552 | - |
dc.type.rims | ART | - |
dc.citation.volume | 69 | - |
dc.citation.issue | 4 | - |
dc.citation.beginningpage | 1494 | - |
dc.citation.endingpage | 1506 | - |
dc.citation.publicationname | IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS | - |
dc.identifier.doi | 10.1109/TCSI.2021.3138092 | - |
dc.contributor.localauthor | Yoo, Hoi-Jun | - |
dc.contributor.nonIdAuthor | Jo, Wooyoung | - |
dc.description.isOpenAccess | N | - |
dc.type.journalArticle | Article | - |
dc.subject.keywordAuthor | Training | - |
dc.subject.keywordAuthor | IP networks | - |
dc.subject.keywordAuthor | Tsunami | - |
dc.subject.keywordAuthor | Iterative methods | - |
dc.subject.keywordAuthor | Hardware | - |
dc.subject.keywordAuthor | Memory management | - |
dc.subject.keywordAuthor | Degradation | - |
dc.subject.keywordAuthor | DNN training accelerator | - |
dc.subject.keywordAuthor | stochastic coarse-fine level pruning | - |
dc.subject.keywordAuthor | tile-based dynamic activation pruning | - |
dc.subject.keywordAuthor | weight sparsity balancing | - |
dc.subject.keywordAuthor | adaptive triple-zero skipping | - |
dc.subject.keywordPlus | FACE RECOGNITION | - |
dc.subject.keywordPlus | CNN ACCELERATOR | - |
dc.subject.keywordPlus | PROCESSOR | - |
dc.subject.keywordPlus | HARDWARE | - |
dc.subject.keywordPlus | POWER | - |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
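The abstract describes zero-skipping controllers that avoid MAC operations whose result is known to be zero. As a rough software illustration only (the paper describes dedicated hardware, not this code; the function name and counters below are hypothetical), the fine-grained idea can be sketched as skipping any multiply-accumulate where either operand is zero, so sparsity in activations or weights directly reduces the number of operations performed:

```python
# Hypothetical sketch of fine-grained zero skipping: a MAC is performed
# only when both operands are nonzero, so activation or weight sparsity
# cuts the operation count proportionally.

def sparse_mac(activations, weights):
    """Dot product that also counts how many MACs were actually performed."""
    assert len(activations) == len(weights)
    acc = 0.0
    performed = 0
    for a, w in zip(activations, weights):
        if a == 0.0 or w == 0.0:  # zero operand: result is zero, skip the MAC
            continue
        acc += a * w
        performed += 1
    return acc, performed

# At high sparsity most MACs are skipped: here only 1 of 5 pairs
# has two nonzero operands.
result, macs = sparse_mac([0.0, 0.5, 0.0, 0.0, 2.0],
                          [1.0, 2.0, 3.0, 0.0, 0.0])
# result == 1.0, macs == 1
```

The coarse-zero skipping described in the abstract extends this idea to whole tiles or blocks of zeros, skipping many such MACs with a single control decision.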