(A) study on depth estimation using channel to space conversion
채널 정보의 공간변환을 이용하는 깊이추정에 관한 연구

Depth estimation from a single input image can be used in various applications such as robotics and autonomous driving. UNet-style networks with encoder/decoder structures have been widely used for monocular depth estimation based on supervised learning. Many studies have tried to reduce the amount of computation in the encoder, but research on reducing computation in the decoder is relatively lacking. In general, the decoder repeats an operation that increases the image resolution while gradually reducing the number of channels. If this processing can be performed in a single step at a high magnification, the amount of computation in the decoder can be reduced substantially. To achieve this in a monocular depth estimation network, we propose a new network structure with fewer convolution layers in the decoder, the Cocktail Glass Network (CGN). To make this structure possible, we propose a new feature data transformation method called Channel to Space Remapping (CSR), which directly moves the data accumulated along the channel direction onto the image plane. Using this method, low-resolution data with many channels can be converted into high-resolution data with few channels in a single layer. The proposed method can be implemented easily using simple reshaping operations and is therefore suitable for making depth estimation networks lighter. Experimental results on the NYU V2 and KITTI datasets demonstrate that the proposed method halves the amount of computation in the decoder while maintaining the same level of accuracy, and that it can be used in both lightweight and large-capacity networks. In the latter part of the thesis, we show that the proposed method is particularly suitable for depth estimation networks, and we further propose a method to improve performance by adding an MLP to CSR. We also suggest that CSR can be used to reduce the amount of computation not only in depth estimation networks but also in super-resolution networks.
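The abstract states that the channel-to-space transformation can be built from simple reshaping operations but does not give the exact data layout. The sketch below illustrates the general idea with a generic depth-to-space rearrangement in NumPy. It is an assumption for illustration, not the thesis's CSR implementation: the function name `channel_to_space`, the upscaling factor `r`, and the channel ordering are all hypothetical.

```python
import numpy as np


def channel_to_space(x: np.ndarray, r: int) -> np.ndarray:
    """Rearrange a (C*r*r, H, W) feature map into (C, H*r, W*r).

    Hypothetical illustration of a channel-to-space remapping; the
    channel ordering used here is an assumption, not taken from the thesis.
    """
    c_in, h, w = x.shape
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r * r")
    c_out = c_in // (r * r)

    # Split the channel axis into (c_out, r, r): two sub-pixel offsets.
    x = x.reshape(c_out, r, r, h, w)
    # Interleave each offset with its spatial axis: (c_out, h, r, w, r).
    x = x.transpose(0, 3, 1, 4, 2)
    # Merge (h, r) and (w, r) into the enlarged image plane.
    return x.reshape(c_out, h * r, w * r)
```

For example, under these assumptions a feature map of shape `(1024, 8, 10)` with `r = 32` becomes a single-channel map of shape `(1, 256, 320)` in one step. No values are computed or discarded; the elements are only rearranged, which is why such an operation adds essentially no arithmetic cost by itself.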
Advisors
Kim, Junmo (김준모)
Description
Korea Advanced Institute of Science and Technology (KAIST): Robotics Program
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2022
Identifier
325007
Language
eng
Description

Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): Robotics Program, 2022.8, [vi, 43 p.]

Keywords

Neural Network; Image Processing; Depth Estimation; Monocular Image

URI
http://hdl.handle.net/10203/307954
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1007775&flag=dissertation
Appears in Collection
RE-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
