Segmenting 2K-Videos at 36.5 FPS with 24.3 GFLOPs: Accurate and Lightweight Realtime Semantic Segmentation Network

Cited 5 times in Web of Science; cited 5 times in Scopus.
| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Oh, Dokwan | ko |
| dc.contributor.author | Ji, Daehyun | ko |
| dc.contributor.author | Jang, Cheolhun | ko |
| dc.contributor.author | Hyunv, Yoonsuk | ko |
| dc.contributor.author | Bae, Hong S. | ko |
| dc.contributor.author | Hwang, Sungju | ko |
| dc.date.accessioned | 2020-12-21T09:10:23Z | - |
| dc.date.available | 2020-12-21T09:10:23Z | - |
| dc.date.created | 2020-12-03 | - |
| dc.date.issued | 2020-05-31 | - |
| dc.identifier.citation | IEEE International Conference on Robotics and Automation, ICRA 2020, pp. 3153-3160 | - |
| dc.identifier.issn | 1050-4729 | - |
| dc.identifier.uri | http://hdl.handle.net/10203/278851 | - |
| dc.description.abstract | We propose a fast and lightweight end-to-end convolutional network architecture for real-time segmentation of high-resolution videos, NfS-SegNet, that can segment 2K videos at 36.5 FPS with 24.3 GFLOPs. This speed and computational efficiency are due to the following reasons: 1) The encoder network, NfS-Net, is optimized for speed with simple building blocks and without memory-heavy operations such as depthwise convolutions, and it outperforms state-of-the-art lightweight CNN architectures such as SqueezeNet [2], MobileNet v1 [3] and v2 [4], and ShuffleNet v1 [5] and v2 [6] on image classification at significantly higher speed. 2) NfS-SegNet has an asymmetric architecture with a deep encoder and a shallow decoder, a design based on our empirical finding that the decoder is the main computational bottleneck while contributing relatively little to the final performance. 3) Our novel uncertainty-aware knowledge distillation method guides the teacher model to focus its knowledge transfer on the most difficult image regions. We validate the performance of NfS-SegNet on the Cityscapes [1] benchmark, on which it achieves state-of-the-art performance among lightweight segmentation models in terms of both accuracy and speed. | - |
| dc.language | English | - |
| dc.publisher | Institute of Electrical and Electronics Engineers Inc. | - |
| dc.title | Segmenting 2K-Videos at 36.5 FPS with 24.3 GFLOPs: Accurate and Lightweight Realtime Semantic Segmentation Network | - |
| dc.type | Conference | - |
| dc.identifier.wosid | 000712319502039 | - |
| dc.identifier.scopusid | 2-s2.0-85092716742 | - |
| dc.type.rims | CONF | - |
| dc.citation.beginningpage | 3153 | - |
| dc.citation.endingpage | 3160 | - |
| dc.citation.publicationname | IEEE International Conference on Robotics and Automation, ICRA 2020 | - |
| dc.identifier.conferencecountry | FR | - |
| dc.identifier.conferencelocation | Virtual | - |
| dc.identifier.doi | 10.1109/ICRA40945.2020.9196510 | - |
| dc.contributor.localauthor | Hwang, Sungju | - |
| dc.contributor.nonIdAuthor | Oh, Dokwan | - |
| dc.contributor.nonIdAuthor | Ji, Daehyun | - |
| dc.contributor.nonIdAuthor | Jang, Cheolhun | - |
| dc.contributor.nonIdAuthor | Hyunv, Yoonsuk | - |
| dc.contributor.nonIdAuthor | Bae, Hong S. | - |
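
The two headline figures in the abstract (36.5 FPS and 24.3 GFLOPs) can be turned into a per-frame time budget and a sustained compute rate. The short Python sketch below does that arithmetic. It assumes that the quoted 24.3 GFLOPs is the cost of one forward pass on one 2K frame, which the abstract does not state explicitly. The function and constant names are illustrative and do not come from the paper.

```python
# Figures quoted in the abstract.
FPS = 36.5               # frames per second on 2K video
GFLOPS_PER_FRAME = 24.3  # ASSUMPTION: cost of one forward pass on a single frame


def frame_budget_ms(fps: float) -> float:
    """Time available to process one frame, in milliseconds."""
    return 1000.0 / fps


def sustained_gflops_per_second(gflops_per_frame: float, fps: float) -> float:
    """Arithmetic throughput needed to keep up with the given frame rate."""
    return gflops_per_frame * fps


if __name__ == "__main__":
    budget = frame_budget_ms(FPS)
    rate = sustained_gflops_per_second(GFLOPS_PER_FRAME, FPS)
    print(f"per-frame budget: {budget:.1f} ms")
    print(f"sustained compute: {rate:.2f} GFLOP/s")
```

Under that assumption, each frame must be processed in roughly 27.4 ms, and the network performs on the order of 887 GFLOP per second of video. This is only a back-of-the-envelope reading of the abstract, not a measurement.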
Appears in Collection
AI-Conference Papers(학술대회논문)
Files in This Item
There are no files associated with this item.