DC Field | Value | Language |
---|---|---|
dc.contributor.author | Kim, Suchang | ko |
dc.contributor.author | Na, Seungho | ko |
dc.contributor.author | Kong, Byeong Yong | ko |
dc.contributor.author | Choi, Jae Woong | ko |
dc.contributor.author | Park, In-Cheol | ko |
dc.date.accessioned | 2021-06-15T02:10:08Z | - |
dc.date.available | 2021-06-15T02:10:08Z | - |
dc.date.created | 2021-06-09 | - |
dc.date.issued | 2021-06 | - |
dc.identifier.citation | IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, v.29, no.6, pp.1192 - 1205 | - |
dc.identifier.issn | 1063-8210 | - |
dc.identifier.uri | http://hdl.handle.net/10203/285908 | - |
dc.description.abstract | Deep neural network (DNN)-based object detection has been investigated and applied to various real-time applications. However, it is hard to employ the DNNs in embedded systems due to their high computational complexity and deep-layered structure. Although several field-programmable gate array (FPGA) implementations have been presented recently for real-time object detection, they suffer from either low throughput or low detection accuracy. In this article, we propose an efficient computing system for real-time SSDLite object detection on FPGA devices, which includes novel hardware architecture and system optimization techniques. In the proposed hardware architecture, a neural processing unit (NPU) that consists of heterogeneous units, such as band processing, scaling, and accumulating, and data fetching and formatting units is designed to accelerate the DNNs efficiently. In addition, system optimization techniques are presented to improve the throughput further. A task control unit is employed to balance the workload and increase the utilization of heterogeneous units in the NPU, and the object detection algorithm is refined accordingly. The proposed architecture is realized on an Intel Arria 10 FPGA and enhances the throughput by up to 13.6× compared to the state-of-the-art FPGA implementation. | - |
dc.language | English | - |
dc.publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC | - |
dc.title | Real-Time SSDLite Object Detection on FPGA | - |
dc.type | Article | - |
dc.identifier.wosid | 000658341800014 | - |
dc.identifier.scopusid | 2-s2.0-85103257176 | - |
dc.type.rims | ART | - |
dc.citation.volume | 29 | - |
dc.citation.issue | 6 | - |
dc.citation.beginningpage | 1192 | - |
dc.citation.endingpage | 1205 | - |
dc.citation.publicationname | IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS | - |
dc.identifier.doi | 10.1109/TVLSI.2021.3064639 | - |
dc.contributor.localauthor | Park, In-Cheol | - |
dc.contributor.nonIdAuthor | Na, Seungho | - |
dc.contributor.nonIdAuthor | Kong, Byeong Yong | - |
dc.description.isOpenAccess | N | - |
dc.type.journalArticle | Article | - |
dc.subject.keywordAuthor | Deep neural network (DNN) | - |
dc.subject.keywordAuthor | field-programmable gate array (FPGA) | - |
dc.subject.keywordAuthor | object detection | - |
dc.subject.keywordAuthor | real-time applications | - |
dc.subject.keywordAuthor | very-large-scale integration (VLSI) architecture | - |
dc.subject.keywordPlus | DEEP | - |
dc.subject.keywordPlus | ACCELERATOR | - |
dc.subject.keywordPlus | NETWORKS | - |