HNPU: An Adaptive DNN Training Processor Utilizing Stochastic Dynamic Fixed-Point and Active Bit-Precision Searching

Cited 32 times in Web of Science, 0 times in Scopus
  • Hit: 862
  • Download: 63
DC Field | Value | Language
dc.contributor.author | Han, Donghyeon | ko
dc.contributor.author | Im, Dongseok | ko
dc.contributor.author | Park, Gwangtae | ko
dc.contributor.author | Kim, Youngwoo | ko
dc.contributor.author | Song, Seokchan | ko
dc.contributor.author | Lee, Juhyoung | ko
dc.contributor.author | Yoo, Hoi-Jun | ko
dc.date.accessioned | 2021-09-26T01:50:11Z | -
dc.date.available | 2021-09-26T01:50:11Z | -
dc.date.created | 2021-09-24 | -
dc.date.issued | 2021-09 | -
dc.identifier.citation | IEEE JOURNAL OF SOLID-STATE CIRCUITS, v.56, no.9, pp.2858 - 2869 | -
dc.identifier.issn | 0018-9200 | -
dc.identifier.uri | http://hdl.handle.net/10203/287867 | -
dc.description.abstract | This article presents HNPU, an energy-efficient deep neural network (DNN) training processor designed through algorithm-hardware co-design. The HNPU supports stochastic dynamic fixed-point representation and a layer-wise adaptive precision searching unit for low-bit-precision training. It additionally exploits slice-level reconfigurability and sparsity to maximize its efficiency in both DNN inference and training. An adaptive bandwidth reconfigurable accumulation network enables reconfigurable DNN allocation and maintains high core utilization across various bit-precision conditions. Fabricated in a 28-nm process, the HNPU achieves at least 5.9x higher energy efficiency and 2.5x higher area efficiency in actual DNN training compared with previous state-of-the-art on-chip learning processors. | -
dc.language | English | -
dc.publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC | -
dc.title | HNPU: An Adaptive DNN Training Processor Utilizing Stochastic Dynamic Fixed-Point and Active Bit-Precision Searching | -
dc.type | Article | -
dc.identifier.wosid | 000690441300023 | -
dc.identifier.scopusid | 2-s2.0-85103259786 | -
dc.type.rims | ART | -
dc.citation.volume | 56 | -
dc.citation.issue | 9 | -
dc.citation.beginningpage | 2858 | -
dc.citation.endingpage | 2869 | -
dc.citation.publicationname | IEEE JOURNAL OF SOLID-STATE CIRCUITS | -
dc.identifier.doi | 10.1109/JSSC.2021.3066400 | -
dc.embargo.liftdate | 9999-12-31 | -
dc.embargo.terms | 9999-12-31 | -
dc.contributor.localauthor | Yoo, Hoi-Jun | -
dc.contributor.nonIdAuthor | Song, Seokchan | -
dc.description.isOpenAccess | N | -
dc.type.journalArticle | Article | -
dc.subject.keywordAuthor | Adaptive bandwidth reconfigurable accumulation network (AB-RAN) | -
dc.subject.keywordAuthor | deep neural network (DNN) | -
dc.subject.keywordAuthor | in-out slice skipping (IOSS) | -
dc.subject.keywordAuthor | layer-wise adaptive precision search | -
dc.subject.keywordAuthor | online learning | -
dc.subject.keywordAuthor | slice-level sparsity exploitation | -
dc.subject.keywordAuthor | stochastic dynamic fixed point | -
dc.subject.keywordPlus | ACCELERATOR | -
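The abstract names stochastic dynamic fixed-point representation as the core low-bit-precision training technique. The sketch below is not the paper's implementation; it is a minimal illustration of the general idea, assuming a per-tensor (dynamic) exponent chosen from the tensor's current range and unbiased stochastic rounding onto that fixed-point grid. The function name `stochastic_dynamic_fixed_point` and its parameters are hypothetical.

```python
# Minimal sketch (not the HNPU hardware datapath): stochastic rounding of a
# tensor into a dynamic fixed-point format with a shared, per-tensor exponent.
import numpy as np

def stochastic_dynamic_fixed_point(x, bits=8, rng=None):
    """Quantize x to a signed `bits`-bit fixed-point grid whose scale is
    derived from the current tensor statistics, using stochastic rounding
    so the quantization error is zero-mean on average."""
    rng = np.random.default_rng() if rng is None else rng
    qmax = 2 ** (bits - 1) - 1                       # e.g. 127 for 8 bits
    # Dynamic exponent: smallest power of two covering the tensor's range.
    exp = np.ceil(np.log2(np.max(np.abs(x)) + 1e-12))
    scale = 2.0 ** (exp - (bits - 1))                # weight of one LSB
    y = x / scale
    # Stochastic rounding: round up with probability equal to the fraction.
    floor = np.floor(y)
    y = floor + (rng.random(x.shape) < (y - floor))
    return np.clip(y, -qmax - 1, qmax) * scale

# Example: quantize a random weight tensor to 8-bit dynamic fixed point.
w = np.random.randn(4, 4).astype(np.float32)
w_q = stochastic_dynamic_fixed_point(w, bits=8)
```

Because the rounding direction is sampled rather than truncated, small gradient contributions are preserved in expectation, which is the usual motivation for stochastic rounding in low-precision training.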
Appears in Collection
EE-Journal Papers (Journal Papers)