DC Field | Value | Language |
---|---|---|
dc.contributor.author | Lee, Jinsu | ko |
dc.contributor.author | Lee, Juhyoung | ko |
dc.contributor.author | Han, Donghyeon | ko |
dc.contributor.author | Lee, Jinmook | ko |
dc.contributor.author | Park, Gwangtae | ko |
dc.contributor.author | Yoo, Hoi-Jun | ko |
dc.date.accessioned | 2019-11-28T03:20:44Z | - |
dc.date.available | 2019-11-28T03:20:44Z | - |
dc.date.created | 2019-11-27 | - |
dc.date.issued | 2019-02 | - |
dc.identifier.citation | 2019 IEEE International Solid-State Circuits Conference, ISSCC 2019, pp.142 - 144 | - |
dc.identifier.uri | http://hdl.handle.net/10203/268663 | - |
dc.description.abstract | Recently, deep neural network (DNN) hardware accelerators have been reported for energy-efficient deep learning (DL) acceleration [1-6]. In most prior DNN inference accelerators, the network is trained in the cloud using public datasets and the parameters are then downloaded to implement AI [1-5]. However, local DNN learning with domain-specific and private data is required to meet various user preferences on edge or mobile devices. Since edge and mobile devices have only limited computation capability and run on battery power, an energy-efficient DNN learning processor is necessary. Only [6] supported on-chip DNN learning, but it was not energy-efficient because it did not utilize sparsity, which accounts for 37%-61% of the inputs for various CNNs, such as VGG16, AlexNet and ResNet-18, as shown in Fig. 7.7.1. Although [3-5] utilized sparsity, they only considered the inference phase with inter-channel accumulation (Fig. 7.7.1) and did not support intra-channel accumulation for the weight-gradient generation (WG) step of the learning phase. Also, [6] adopted FP16, which is not energy-optimal because FP8 is sufficient for many input operands and consumes 4× less energy than FP16. | - |
dc.language | English | - |
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | - |
dc.title | 7.7 LNPU: A 25.3TFLOPS/W Sparse Deep-Neural-Network Learning Processor with Fine-Grained Mixed Precision of FP8-FP16 | - |
dc.type | Conference | - |
dc.identifier.wosid | 000463153600043 | - |
dc.identifier.scopusid | 2-s2.0-85063504226 | - |
dc.type.rims | CONF | - |
dc.citation.beginningpage | 142 | - |
dc.citation.endingpage | 144 | - |
dc.citation.publicationname | 2019 IEEE International Solid-State Circuits Conference, ISSCC 2019 | - |
dc.identifier.conferencecountry | US | - |
dc.identifier.conferencelocation | San Francisco, CA | - |
dc.identifier.doi | 10.1109/ISSCC.2019.8662302 | - |
dc.contributor.localauthor | Yoo, Hoi-Jun | - |
dc.contributor.nonIdAuthor | Lee, Jinsu | - |
dc.contributor.nonIdAuthor | Lee, Juhyoung | - |
dc.contributor.nonIdAuthor | Han, Donghyeon | - |
dc.contributor.nonIdAuthor | Lee, Jinmook | - |
dc.contributor.nonIdAuthor | Park, Gwangtae | - |
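
The abstract contrasts the inter-channel accumulation used by inference-only accelerators with the intra-channel accumulation needed for the weight-gradient generation (WG) step of learning, and notes that zero activations (37%-61% of inputs) can be skipped. The NumPy sketch below is a minimal, illustrative model of that WG step with zero-skipping only; the function name, the stride-1/no-padding layout, and the array shapes are assumptions made for illustration and do not describe the LNPU datapath or its FP8-FP16 fine-grained mixed-precision arithmetic.

```python
import numpy as np

def weight_gradient_sparse(act, dout, K):
    """Illustrative weight-gradient generation (WG) for one conv layer.

    act  : (C_in, H, W) post-ReLU input activations (many zeros)
    dout : (C_out, H-K+1, W-K+1) gradient w.r.t. the layer output
    K    : square kernel size (stride 1, no padding assumed)
    Returns dW of shape (C_out, C_in, K, K).

    Accumulation is intra-channel: each partial product is summed over the
    spatial positions of a single (c_out, c_in) pair, and zero activations
    are skipped entirely (sparsity exploitation).
    """
    C_in, H, W = act.shape
    C_out, OH, OW = dout.shape
    dW = np.zeros((C_out, C_in, K, K), dtype=np.float32)

    for ci in range(C_in):
        # Only non-zero activations are visited; zeros contribute nothing.
        nz_y, nz_x = np.nonzero(act[ci])
        for y, x in zip(nz_y, nz_x):
            a = act[ci, y, x]
            # One non-zero activation updates a K x K window of dW for
            # every output channel, accumulated within the channel pair.
            for kh in range(K):
                oh = y - kh
                if oh < 0 or oh >= OH:
                    continue
                for kw in range(K):
                    ow = x - kw
                    if ow < 0 or ow >= OW:
                        continue
                    dW[:, ci, kh, kw] += a * dout[:, oh, ow]
    return dW

# Toy usage with hypothetical sizes: 4 input channels, 8x8 maps, 3x3 kernel.
rng = np.random.default_rng(0)
act = np.maximum(rng.standard_normal((4, 8, 8)).astype(np.float32), 0.0)  # ~50% zeros
dout = rng.standard_normal((6, 6, 6)).astype(np.float32)                  # 6 output channels
dW = weight_gradient_sparse(act, dout, K=3)
print(dW.shape)  # (6, 4, 3, 3)
```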