Learning monocular depth estimation via selective knowledge distillation of stereo depth estimation

Monocular depth estimation has been extensively explored with deep learning, yet its accuracy and generalization ability still lag far behind those of stereo-based methods. To address this gap, a few recent studies have proposed supervising the monocular depth estimation network with disparity maps distilled as proxy ground truths, which are predicted by conventional stereo-based methods (i.e., Semi-Global Matching) or by pre-trained stereo matching networks. However, these studies distill the stereo knowledge naively, without considering the comparative advantages of stereo-based and monocular depth estimation methods. In this paper, I propose to selectively distill the disparity maps for more reliable proxy supervision. Specifically, I first design a decoder (MaskDecoder) that learns two binary masks, each trained to choose optimally, for each pixel, between the proxy disparity maps and the estimated depth maps. Each binary mask forms new disparity maps that minimize the loss functions commonly used for self-supervised monocular depth estimation (e.g., the image reconstruction loss and the edge-aware smoothness loss). The learned masks are then fed to another decoder (DepthDecoder) so that the estimated depths learn only from the masked areas of the proxy disparity maps. Additionally, a Teacher-Student module is designed to transfer the geometric knowledge of the StereoNet to the MonoNet, since the StereoNet extracts features from the stereo image pair while the MonoNet extracts features from only a single image. Ablation studies verify that the proposed methods yield more accurate estimation than a baseline model, both qualitatively and quantitatively. Furthermore, extensive experiments show that the proposed methods achieve state-of-the-art performance among self- and proxy-supervised monocular depth estimation methods on the KITTI dataset, even surpassing some semi-supervised methods.
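The per-pixel selection described in the abstract can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: it assumes a single, already-computed binary mask (the thesis learns two masks with a MaskDecoder), and the function names `compose_disparity` and `masked_proxy_l1`, the L1 penalty, and the toy values are all hypothetical choices made for illustration.

```python
import numpy as np

def compose_disparity(mask, proxy_disp, mono_disp):
    """Per pixel, take the proxy disparity where mask == 1, else the monocular estimate."""
    return np.where(mask.astype(bool), proxy_disp, mono_disp)

def masked_proxy_l1(mask, proxy_disp, mono_disp, eps=1e-8):
    """Mean absolute error between monocular and proxy disparities over masked pixels only.

    Assumption: an L1 penalty; the abstract does not state which penalty the thesis uses.
    """
    mask = mask.astype(np.float64)
    abs_err = np.abs(mono_disp - proxy_disp)
    # eps guards against division by zero when the mask selects no pixels.
    return float((mask * abs_err).sum() / (mask.sum() + eps))

# Toy 2x2 example (hypothetical values): the left column trusts the proxy.
proxy = np.array([[10.0, 12.0], [8.0, 6.0]])
mono = np.array([[9.0, 15.0], [8.5, 2.0]])
mask = np.array([[1, 0], [1, 0]])

composite = compose_disparity(mask, proxy, mono)
loss = masked_proxy_l1(mask, proxy, mono)
```

In this toy case only the two left-column pixels contribute to the proxy loss, so unreliable proxy values in the right column do not supervise the monocular estimate.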
Advisors
Yoon, Kuk-Jin (윤국진)
Description
Korea Advanced Institute of Science and Technology (KAIST): Future Vehicle Interdisciplinary Program
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2021
Identifier
325007
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): Future Vehicle Interdisciplinary Program, 2021.2, [iv, 43 p.]

Keywords

Monocular depth estimation; Stereo matching; 3D reconstruction; Self-supervised learning; Unsupervised learning; Knowledge distillation

URI
http://hdl.handle.net/10203/295222
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=948595&flag=dissertation
Appears in Collection
PD-Theses_Master(석사논문)
Files in This Item
There are no files associated with this item.
