DC Field | Value | Language |
---|---|---|
dc.contributor.author | Lee, Jinsu | ko |
dc.contributor.author | Lee, Juhyoung | ko |
dc.contributor.author | Han, Donghyeon | ko |
dc.contributor.author | Lee, Jinmook | ko |
dc.contributor.author | Park, Gwangtae | ko |
dc.contributor.author | Yoo, Hoi-Jun | ko |
dc.date.accessioned | 2019-11-28T03:20:44Z | - |
dc.date.available | 2019-11-28T03:20:44Z | - |
dc.date.created | 2019-11-27 | - |
dc.date.issued | 2019-02 | - |
dc.identifier.citation | 2019 IEEE International Solid-State Circuits Conference, ISSCC 2019, pp.142 - 144 | - |
dc.identifier.uri | http://hdl.handle.net/10203/268663 | - |
dc.description.abstract | Recently, deep neural network (DNN) hardware accelerators have been reported for energy-efficient deep learning (DL) acceleration [1-6]. In most prior DNN inference accelerators, the network is trained in the cloud using public datasets and the parameters are then downloaded to implement AI [1-5]. However, local DNN learning with domain-specific and private data is required to meet various user preferences on edge or mobile devices. Since edge and mobile devices have only limited computation capability and run on battery power, an energy-efficient DNN learning processor is necessary. Only [6] supported on-chip DNN learning, but it was not energy-efficient because it did not utilize sparsity, which accounts for 37%-61% of the inputs for various CNNs, such as VGG16, AlexNet and ResNet-18, as shown in Fig. 7.7.1. Although [3-5] utilized sparsity, they only considered the inference phase with inter-channel accumulation (Fig. 7.7.1) and did not support intra-channel accumulation for the weight-gradient generation (WG) step of the learning phase. Also, [6] adopted FP16, which is not energy-optimal because FP8 is sufficient for many input operands and consumes 4× less energy than FP16. | - |
dc.language | English | - |
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | - |
dc.title | 7.7 LNPU: A 25.3TFLOPS/W Sparse Deep-Neural-Network Learning Processor with Fine-Grained Mixed Precision of FP8-FP16 | - |
dc.type | Conference | - |
dc.identifier.wosid | 000463153600043 | - |
dc.identifier.scopusid | 2-s2.0-85063504226 | - |
dc.type.rims | CONF | - |
dc.citation.beginningpage | 142 | - |
dc.citation.endingpage | 144 | - |
dc.citation.publicationname | 2019 IEEE International Solid-State Circuits Conference, ISSCC 2019 | - |
dc.identifier.conferencecountry | US | - |
dc.identifier.conferencelocation | San Francisco, CA | - |
dc.identifier.doi | 10.1109/ISSCC.2019.8662302 | - |
dc.contributor.localauthor | Yoo, Hoi-Jun | - |
dc.contributor.nonIdAuthor | Lee, Jinsu | - |
dc.contributor.nonIdAuthor | Lee, Juhyoung | - |
dc.contributor.nonIdAuthor | Han, Donghyeon | - |
dc.contributor.nonIdAuthor | Lee, Jinmook | - |
dc.contributor.nonIdAuthor | Park, Gwangtae | - |
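
The abstract contrasts the inter-channel accumulation used by inference-only accelerators with the intra-channel accumulation needed for the weight-gradient generation (WG) step of learning, and notes that zero activations (37%-61% of inputs) can be skipped. The NumPy sketch below is a minimal, illustrative model of that WG step with zero-skipping only; the function name, the stride-1/no-padding layout, and the array shapes are assumptions made for illustration and do not describe the LNPU datapath or its FP8-FP16 fine-grained mixed-precision arithmetic.

```python
import numpy as np

def weight_gradient_sparse(act, dout, K):
    """Illustrative weight-gradient generation (WG) for one conv layer.

    act  : (C_in, H, W) post-ReLU input activations (many zeros)
    dout : (C_out, H-K+1, W-K+1) gradient w.r.t. the layer output
    K    : square kernel size (stride 1, no padding assumed)
    Returns dW of shape (C_out, C_in, K, K).

    Accumulation is intra-channel: each partial product is summed over the
    spatial positions of a single (c_out, c_in) pair, and zero activations
    are skipped entirely (sparsity exploitation).
    """
    C_in, H, W = act.shape
    C_out, OH, OW = dout.shape
    dW = np.zeros((C_out, C_in, K, K), dtype=np.float32)

    for ci in range(C_in):
        # Only non-zero activations are visited; zeros contribute nothing.
        nz_y, nz_x = np.nonzero(act[ci])
        for y, x in zip(nz_y, nz_x):
            a = act[ci, y, x]
            # One non-zero activation updates a K x K window of dW for
            # every output channel, accumulated within the channel pair.
            for kh in range(K):
                oh = y - kh
                if oh < 0 or oh >= OH:
                    continue
                for kw in range(K):
                    ow = x - kw
                    if ow < 0 or ow >= OW:
                        continue
                    dW[:, ci, kh, kw] += a * dout[:, oh, ow]
    return dW

# Toy usage with hypothetical sizes: 4 input channels, 8x8 maps, 3x3 kernel.
rng = np.random.default_rng(0)
act = np.maximum(rng.standard_normal((4, 8, 8)).astype(np.float32), 0.0)  # ~50% zeros
dout = rng.standard_normal((6, 6, 6)).astype(np.float32)                  # 6 output channels
dW = weight_gradient_sparse(act, dout, K=3)
print(dW.shape)  # (6, 4, 3, 3)
```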