Long-term visual localization via image graph feature matching and adaptive generalized depth-pose optimization

Accurate pose estimation is critical for many applications, including robotics, self-driving cars, and augmented reality. However, current approaches often struggle to maintain accuracy over long periods, particularly when the environment changes permanently. This is a significant limitation: real-world environments change constantly, and reliable applications need poses estimated accurately over long periods. This thesis proposes a novel approach to long-term visual localization (LTVL) that estimates a centimeter-accurate pose with respect to a previously mapped environment. To build a robust solution, we address the challenges of LTVL with multiple techniques, including end-to-end learning, depth estimation, and relative pose estimation, and we explore their combination in a single framework. Our deep model is inspired by traditional iteration-based approaches and is built within a Deep Equilibrium (DEQ) framework, which allows for greater generalizability, accuracy, and efficiency. The main contributions of this work are as follows.
1. We propose a novel end-to-end model for long-term pose alignment using Transformer-based GRU updates within a DEQ framework. The model takes a query image and multiple reference images as input and localizes the query against the references. An image graph is constructed for effective message passing.
2. We study depth estimation in stereo and monocular systems while taking into account real-time speed, generalizability, and adaptability to environmental changes. In addition, the depth estimation module is used alongside a relative pose estimator within an iterative update framework to obtain the optimal value for both.
3. To achieve robust localization even when no landmark is present in the scene, we unify the query sequence's long-term localization and relative poses.
4. We evaluate our approach on several challenging datasets and show that it can achieve state-of-the-art accuracy in pose estimation, even in scenarios with significant environmental variations. We also demonstrate the generalizability of our approach by applying it to a dataset collected by our lab around the KAIST main campus and the Munji campus.
Overall, our work contributes to the field of long-term visual localization and has the potential to enable a wide range of applications that require accurate pose estimation over long periods.
Advisors
Kim, Soo Hyun (김수현); 김경수
Description
Korea Advanced Institute of Science and Technology (KAIST): Department of Mechanical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2024
Identifier
325007
Language
eng
Description

Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): Department of Mechanical Engineering, 2024.2, [ix, 102 p.]

Keywords

Long-term visual localization; Pose estimation; End-to-end learning; Depth estimation; Iterative alignment; Deep equilibrium model; Transformer

URI
http://hdl.handle.net/10203/321937
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1097772&flag=dissertation
Appears in Collection
ME-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
