Extreme video frame interpolation: extending VFI to video with extremely large motion

Video frame interpolation (VFI) temporally interpolates one or more intermediate frames between every pair of consecutive frames in a temporally coherent manner, so that fast motion is rendered smoothly and is more visually pleasing. However, VFI remains a long-standing and challenging task in computer vision due to several factors, such as non-linear motion, deformable object motion, large motion, and motion blur in video sequences. In this dissertation, we focus on the challenging case of fast motion with extremely large pixel displacements, which often occurs in ultra-high-definition (UHD) video sequences and results in severe motion judder. To study VFI with large motion, we present an extreme VFI network, called XVFI-Net, which is the first to handle VFI for 4K videos with large motion. Unlike previous pyramid structures with a fixed number of scale levels, XVFI-Net allows input frames to be flexibly down-scaled to any smaller size at inference time, even after training, to match the spatial resolution and motion magnitude of the input frames. This recursive multi-scale shared structure allows large motion to be captured effectively at flexibly chosen small-scale levels. To interpolate middle frames at arbitrary intermediate time instances, the bilateral optical flows are stably approximated by a novel complementary flow reversal (CFR) technique that uses the bidirectional motion. We also propose a coarse-direction-and-fine-attention (CDFA) module that extends XVFI-Net, in which multi-hypothesis flow vectors handle the complex motion boundaries of fast-moving objects. The coarse direction vector is supervised by a self-guided training scheme; the fine attention vectors then attend to local details, conditioned on the coarse direction vector, in a residual learning manner.
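The core idea behind complementary flow reversal can be illustrated with a toy sketch. This is a minimal nearest-neighbor version under simplifying assumptions: the dissertation's CFR uses distance-weighted normalization and sub-pixel splatting, and the helper names below are hypothetical. Reversing the forward flow F_{0→1} by splatting leaves holes at occlusions; the complementary candidate derived from F_{1→0} fills many of those holes, and the two are blended where both are valid.

```python
import numpy as np

def splat_reverse(flow, pos_scale, val_scale):
    """Nearest-neighbor forward splatting of a scaled flow field.

    Each source pixel's scaled flow value is splatted to the position that
    pixel occupies at the intermediate time instance; overlapping values
    are averaged. Returns the splatted flow and a validity mask.
    Flow layout: flow[y, x] = (dx, dy).
    """
    h, w, _ = flow.shape
    out = np.zeros((h, w, 2))
    count = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y, x]
            tx = int(round(x + pos_scale * dx))
            ty = int(round(y + pos_scale * dy))
            if 0 <= tx < w and 0 <= ty < h:
                out[ty, tx] += val_scale * flow[y, x]
                count[ty, tx] += 1
    valid = count > 0
    out[valid] /= count[valid][:, None]
    return out, valid

def complementary_flow_reversal(f01, f10, t):
    """Approximate the bilateral flow F_{t->0} from F_{0->1} and F_{1->0}.

    Candidate A (flow reversal): splat -t * F_{0->1}(x) to x + t * F_{0->1}(x).
    Candidate B (complementary): splat  t * F_{1->0}(y) to y + (1 - t) * F_{1->0}(y).
    Each candidate tends to fill the holes the other leaves behind.
    """
    cand_a, mask_a = splat_reverse(f01, t, -t)
    cand_b, mask_b = splat_reverse(f10, 1.0 - t, t)
    w_a = mask_a.astype(float)[..., None]
    w_b = mask_b.astype(float)[..., None]
    denom = np.maximum(w_a + w_b, 1e-8)
    return (w_a * cand_a + w_b * cand_b) / denom
```

For a uniform horizontal flow of +2 pixels between the two frames, the recovered F_{t→0} at t = 0.5 is (-1, 0) at every pixel, including border pixels that only one of the two candidates can reach.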
Extensive experimental results show that our algorithms successfully capture the essential information of objects with extremely large motion and complex textures, where state-of-the-art methods exhibit poor performance. Furthermore, the XVFI-Net framework performs comparably on a previous lower-resolution benchmark dataset, which also demonstrates the robustness of our algorithm.
Advisors
Kim, Munchurl (김문철)
Description
Korea Advanced Institute of Science and Technology: School of Electrical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2022
Identifier
325007
Language
eng
Description

Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology: School of Electrical Engineering, 2022.8, [vii, 70 p.]

Keywords

Video processing; Video restoration; Frame interpolation; Motion compensation; Frame rate up-conversion; Optical flow estimation

URI
http://hdl.handle.net/10203/309070
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1007860&flag=dissertation
Appears in Collection
EE-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
