Learning-based visual perception using semantic and geometric priors for autonomous driving

Humans and robots alike repeat cycles of perception and action control to achieve a given goal. Like humans, a robot's ability to recognize its surrounding environment depends heavily on visual perception. Visual perception is broadly divided into semantic and geometric perception according to its purpose, and both are essential for securing robustness in applications where safety is the top priority, such as autonomous driving. In this thesis, we analyze the fundamental limitations of vision-based approaches to semantic and geometric tasks, and propose techniques that imitate human perception processes to overcome these limitations. Specifically, we improve the performance of one task by learning the output of another task as prior knowledge. The contributions of this thesis are as follows.

First, we introduce a lane and road marking detection and classification technique, one of the most basic semantic tasks for autonomous driving, together with a large-scale dataset. Previous vision-based approaches suffer from degraded performance at night and in adverse weather conditions. To address this, we propose multi-task networks that detect lane and road markings and simultaneously predict the position of the vanishing point, which represents high-level structural knowledge of the driving scene. The proposed method improves lane and road marking detection under dynamically changing illumination and weather conditions, and improves vanishing point prediction as well. Furthermore, we present a novel large-scale dataset of traffic lanes, road markings, and vanishing points captured under various environmental conditions.

Second, we present a unified joint training framework that explicitly models the motion of multiple moving objects, ego-motion, and depth from a monocular camera without supervision in dynamic environments such as autonomous driving scenarios. Learning 3D structure from monocular videos is generally based on the Structure-from-Motion formulation: depth and camera ego-motion are estimated from the disparities of corresponding pixels in successive frames. This formulation assumes that all captured objects are static, and any motion other than the camera's is treated as an outlier. The problem is that, when many dynamic objects are present, it is difficult to propagate a consistent supervisory signal for learning depth and to estimate depth and multiple object motions simultaneously. The proposed technique effectively resolves this problem through object-aware semantic prior knowledge obtained from instance segmentation or object detection. Moreover, we propose an attention mechanism that explicitly disentangles the 3D motion of each dynamic object from ego-motion, and present a novel learning pipeline that interactively combines neural network training with a traditional sampling-based algorithm.

Finally, we propose a sensor fusion technique for motion-related physical sensors that measure vehicle speed and inertia. Such motion data are readily available on most vehicles and contain high-level geometric information. The proposed neural networks use this motion information as prior knowledge and can directly manipulate different viewpoints without an explicit geometric transformation. We apply this model to representation learning and show improved performance on semantic segmentation and monocular depth estimation compared to existing methods.
The algorithms and methodologies presented in this dissertation are validated and analyzed through various quantitative and qualitative experiments in comparison with existing algorithms.
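To make the Structure-from-Motion-based self-supervision described above concrete, the following is a minimal sketch of a masked photometric reprojection loss in PyTorch. It assumes hypothetical inputs (a predicted depth map, a predicted relative camera pose, camera intrinsics, and an instance mask marking dynamic objects); it illustrates the general technique and is not the dissertation's exact implementation.

    import torch
    import torch.nn.functional as F

    def backproject(depth, K_inv):
        # Lift every pixel of the target frame to a 3D point using its predicted depth.
        b, _, h, w = depth.shape
        ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
        pix = torch.stack([xs, ys, torch.ones_like(xs)]).float()      # (3, H, W)
        pix = pix.reshape(1, 3, -1).to(depth.device)                  # (1, 3, H*W)
        return (K_inv @ pix) * depth.reshape(b, 1, -1)                # (B, 3, H*W)

    def masked_photometric_loss(tgt, src, depth, T, K, static_mask):
        # tgt, src    : (B, 3, H, W) consecutive RGB frames
        # depth       : (B, 1, H, W) predicted depth of the target frame
        # T           : (B, 4, 4) predicted relative camera pose (target -> source)
        # K           : (B, 3, 3) camera intrinsics
        # static_mask : (B, 1, H, W) 1 for static pixels, 0 on dynamic-object instances
        b, _, h, w = tgt.shape
        cam = backproject(depth, torch.inverse(K))                    # (B, 3, H*W)
        ones = torch.ones(b, 1, cam.shape[-1], device=cam.device)
        proj = K @ (T[:, :3, :] @ torch.cat([cam, ones], dim=1))      # (B, 3, H*W)
        uv = proj[:, :2] / proj[:, 2:3].clamp(min=1e-6)               # pixel coordinates
        # Normalize to [-1, 1] and sample the source image at the warped locations.
        u = 2.0 * uv[:, 0] / (w - 1) - 1.0
        v = 2.0 * uv[:, 1] / (h - 1) - 1.0
        grid = torch.stack([u, v], dim=-1).reshape(b, h, w, 2)
        warped = F.grid_sample(src, grid, padding_mode="border", align_corners=True)
        # Photometric error is accumulated only on static pixels; dynamic objects
        # are excluded here and handled by separate per-object motion estimates.
        err = (warped - tgt).abs() * static_mask
        return err.sum() / static_mask.sum().clamp(min=1.0)

In the full framework summarized above, the masked-out object regions would additionally be warped with their own estimated 3D motions and contribute their own photometric terms; that extension is omitted from this sketch.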
Advisors
Kweon, In So (권인소)
Description
Korea Advanced Institute of Science and Technology : Robotics Program (interdisciplinary major)
Country
Korea Advanced Institute of Science and Technology
Issue Date
2021
Identifier
325007
Language
eng
Article Type
Thesis (Ph.D.)
URI
http://hdl.handle.net/10203/294529
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=962544&flag=dissertation
Appears in Collection
RE-Theses_Ph.D. (Doctoral Theses)
Files in This Item
There are no files associated with this item.
