Self-supervised 3D geometric perception in adverse real-world environments

dc.contributor.advisor: 권인소
dc.contributor.author: 신욱철
dc.contributor.author: Shin, Ukcheol
dc.date.accessioned: 2024-07-26T19:30:53Z
dc.date.available: 2024-07-26T19:30:53Z
dc.date.issued: 2023
dc.identifier.uri: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1047245&flag=dissertation
dc.identifier.uri: http://hdl.handle.net/10203/320949
dc.description: Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering, 2023.8, [xvi, 109 p.]
dc.description.abstract: This dissertation aims to equip intelligent robots with a robust geometric perception ability in various challenging real-world environments, such as rainy, snowy, foggy, dusty, over-exposed, and low-light conditions. In such harsh environments, we cannot collect ground-truth (GT) labels such as dense depth maps, precise odometry, and optical flow. To achieve robust geometric perception without GT labels, this research proposes 1) a unified monocular-stereo depth network for thermal images along with a large-scale multi-spectral dataset, 2) self-supervised 3D geometry learning methods for various sensor modalities (e.g., RGB and thermal cameras), and 3) a multi-spectrum-invariant and selectively-fusible depth estimation method.

First, we propose a unified monocular-stereo depth estimation network for thermal images and a large-scale Multi-spectral Stereo Seasonal (MS³) dataset comprising RGB, NIR, thermal, and LiDAR stereo systems. Thermal cameras are known to be robust against lighting and weather conditions. Despite this advantage, however, there has been no large-scale dataset or research on geometric perception from thermal images. Therefore, in this research, we provide 1) a large-scale multi-sensor outdoor dataset, 2) an exhaustive analysis of the performance and robustness of monocular and stereo depth estimation from thermal images in various conditions (e.g., day, night, cloudy, and rainy), and 3) a unified depth network designed for thermal images that achieves high accuracy and flexibility.

Second, we present self-supervised learning methods for depth and ego-motion estimation from thermal images. We usually cannot obtain ground-truth labels to train geometric perception networks under harsh weather, locational, and lighting conditions, such as caves, tunnels, dust, and low light, so the networks must be trained in a self-supervised manner. However, in contrast to their robustness, thermal images have properties that hinder generating self-supervision from the images themselves: low contrast, blurry edges, and temporal image inconsistency. We resolve this issue by proposing 1) a multi-spectral consistency loss from paired RGB-T images, 2) joint adversarial and self-supervised learning from unpaired RGB-T images, and 3) a temporally consistent thermal image mapping method. The former two methods generate self-supervision signals by exploiting the RGB image through a proposed differentiable forward mapping module and adversarial feature adaptation. Method 3), on the other hand, can train the whole network with self-supervision signals from thermal images alone by utilizing the proposed temporally consistent image mapping, which resolves the undesirable properties of thermal images based on an in-depth analysis of raw thermal images. Using these self-supervised learning methods, we demonstrate that the network can estimate accurate depth maps from thermal images in challenging conditions.

Lastly, we consider the geometric-perception-in-the-wild scenario, which requires both high accuracy and robustness against various challenging environments. For this purpose, a common convention is to deploy a multi-sensor system, and there are two main strategies for utilizing one: sensor-wise inference and multi-sensor fused inference. The former is flexible but memory-inefficient, unreliable, and vulnerable; multi-modal fusion provides high reliability but requires a specialized architecture. Therefore, we propose an effective solution for multi-spectrum-generalizable and selectively-fusible depth estimation by exploiting contrastive learning between sensor modalities. Based on the proposed method, a single depth network can achieve both spectral-invariant and multi-modal fused depth estimation while preserving reliability, memory efficiency, and flexibility.
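The self-supervision signals described in the abstract (multi-spectral consistency, temporally consistent thermal mapping) all build on the standard warping-based signal used in self-supervised depth learning: reproject one frame into another view using the predicted depth and relative pose, then penalize photometric disagreement. The following is a minimal NumPy sketch of that core signal only — a hypothetical illustration with nearest-neighbour sampling and no occlusion handling, not the dissertation's actual implementation:

```python
import numpy as np

def photometric_self_supervision(tgt, src, depth, K, T):
    """Warp `src` into the target view using the predicted target-frame
    `depth` and the 4x4 relative pose `T` (target -> source), then return
    the mean absolute photometric error against `tgt`.

    Hypothetical minimal sketch: nearest-neighbour sampling, single-channel
    images, no occlusion or auto-masking.
    """
    h, w = tgt.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Homogeneous pixel coordinates (3 x N) of every target pixel.
    pix = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    # Back-project to 3D using the inverse intrinsics and predicted depth.
    cam = np.linalg.inv(K) @ pix * depth.ravel()
    cam_h = np.vstack([cam, np.ones(h * w)])          # 4 x N homogeneous points
    src_cam = (T @ cam_h)[:3]                         # move points into the source frame
    proj = K @ src_cam                                # project into source image plane
    u = np.round(proj[0] / proj[2]).astype(int)
    v = np.round(proj[1] / proj[2]).astype(int)
    valid = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    # Sample the source image at the reprojected locations.
    warped = np.zeros_like(tgt)
    warped[ys.ravel()[valid], xs.ravel()[valid]] = src[v[valid], u[valid]]
    return np.abs(warped - tgt)[ys.ravel()[valid], xs.ravel()[valid]].mean()
```

With an identity pose and identical frames the warp is exact and the loss is zero; in training, this scalar would be minimized with respect to the depth and pose networks, and the thermal-specific methods above replace or augment the raw photometric term to cope with low contrast and temporal inconsistency.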
dc.language: eng
dc.publisher: Korea Advanced Institute of Science and Technology (KAIST)
dc.subject: Self-supervised learning; Geometric perception; Adverse environment; Thermal camera; Sensor fusion
dc.subject: Artificial intelligence
dc.subject: Artificial intelligence; Self-supervised learning; 3D geometry; Adverse condition; Thermal camera
dc.title: Self-supervised 3D geometric perception in adverse real-world environments
dc.title.alternative: 불리한 실환경에서의 자기 감독 학습 기반 3D 기하학적 인지 방법론
dc.type: Thesis (Ph.D.)
dc.identifier.CNRN: 325007
dc.description.department: Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering
dc.contributor.alternativeauthor: Kweon, In So
Appears in Collection: EE-Theses_Ph.D. (Ph.D. theses)
Files in This Item: There are no files associated with this item.
