An optimized design technique of low-bit neural network training for personalization on IoT devices

Cited 11 times in Web of Science; cited 10 times in Scopus
Abstract
Personalization through incremental learning has become essential for IoT devices to improve the performance of deep learning models trained on global datasets. To avoid massive transmission traffic over the network, on-device learning is necessary. We propose a software/hardware co-design technique that builds an energy-efficient, low-bit trainable system: (1) software optimizations, namely local low-bit quantization and computation freezing, that minimize the on-chip storage requirement and computational complexity, and (2) a hardware design with a bit-flexible multiply-and-accumulate (MAC) array that shares the same resources between inference and training. Our scheme reduces on-chip buffer storage by 99.2% and achieves 12.8x higher peak energy efficiency than previous trainable accelerators.
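The abstract does not detail the paper's specific quantization scheme; as a general illustration of the low-bit quantization idea it builds on, a minimal sketch of uniform symmetric quantization (a common technique, not necessarily the authors' exact method) looks like this:

```python
# Hypothetical sketch of uniform symmetric low-bit quantization.
# This illustrates the general technique behind "local low-bit
# quantization"; the paper's actual scheme is not specified here.

def quantize(values, bits=4):
    """Map floats to signed integers representable in `bits` bits."""
    qmax = 2 ** (bits - 1) - 1              # e.g. 7 for 4-bit signed
    scale = max(abs(v) for v in values) / qmax or 1.0
    return [round(v / scale) for v in values], scale

def dequantize(q_values, scale):
    """Recover approximate floats from the quantized integers."""
    return [q * scale for q in q_values]

weights = [0.42, -0.17, 0.91, -0.63]
q, s = quantize(weights, bits=4)
approx = dequantize(q, s)

# Every quantized value fits in 4 signed bits, and the reconstruction
# error is bounded by half a quantization step (scale / 2).
assert all(-8 <= v <= 7 for v in q)
assert all(abs(a - w) <= s / 2 for a, w in zip(approx, weights))
```

Storing 4-bit integers plus one scale per tensor, instead of 32-bit floats, is what drives the kind of on-chip buffer reduction the abstract reports.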
Publisher
Institute of Electrical and Electronics Engineers Inc.
Issue Date
2019-06-06
Language
English
Citation
56th ACM/EDAC/IEEE Design Automation Conference (DAC)
DOI
10.1145/3316781.3317769
URI
http://hdl.handle.net/10203/264260
Appears in Collection
EE-Conference Papers (Conference Papers)
Files in This Item
There are no files associated with this item.