A study on video super-resolution using a 3D convolutional neural network with spatio-temporal feature extraction
The demand for high-quality video has been increasing rapidly in recent years, and super-resolution (SR) methods are emerging as core technologies for generating high-quality visual content. SR methods fall into two main categories: single-image SR and multi-frame (video) SR. Single-image SR methods produce one high-resolution (HR) output from one low-resolution (LR) input, whereas video SR methods produce one HR output at a given time instant from a series of consecutive LR input frames. Single-image SR methods use only the spatial information in a single input image, whereas video SR methods also exploit the temporal relations between consecutive frames, which provide additional spatial information for a more accurate reconstruction of HR video frames. In this thesis, we present our research on video SR and propose a deep neural network-based HR frame generation method that accounts for scene changes when using the spatio-temporal information in the input video frames. The proposed method is based on a 3D convolutional neural network and requires neither motion estimation nor motion compensation as a pre-processing step, which other video SR methods often need. We also present a scene boundary detection module and a frame input structure that prevents performance degradation due to scene changes in the input video frames.
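To make the idea of a scene-aware frame input concrete, the following is a minimal sketch, not the method of this thesis. It assumes a hypothetical detector that flags a scene cut when the mean absolute difference between consecutive frames exceeds a threshold, and a hypothetical input rule that clamps the temporal window to the scene containing the target frame by repeating the nearest in-scene frame. The names `detect_scene_boundaries`, `window_indices`, `threshold`, and `radius` are illustrative assumptions.

```python
from typing import List, Sequence

# One frame, flattened to pixel intensities in [0, 1] (illustrative representation).
Frame = Sequence[float]


def detect_scene_boundaries(frames: List[Frame], threshold: float = 0.3) -> List[int]:
    """Return every index i at which a new scene is assumed to start.

    Hypothetical detector: a cut is flagged when the mean absolute
    difference between frame i-1 and frame i exceeds `threshold`.
    """
    boundaries = []
    for i in range(1, len(frames)):
        prev, curr = frames[i - 1], frames[i]
        mad = sum(abs(a - b) for a, b in zip(prev, curr)) / len(curr)
        if mad > threshold:
            boundaries.append(i)
    return boundaries


def window_indices(t: int, num_frames: int, radius: int,
                   boundaries: List[int]) -> List[int]:
    """Indices of the 2*radius+1 input frames centered on target frame t.

    Assumed input rule: indices that fall outside the scene containing t
    (or outside the video) are clamped to the nearest frame of that scene,
    so the window never mixes frames from two different scenes.
    """
    cuts = sorted(boundaries)
    scene_start = max(s for s in [0] + cuts if s <= t)
    later_cuts = [s for s in cuts if s > t]
    scene_end = (later_cuts[0] if later_cuts else num_frames) - 1
    return [min(max(i, scene_start), scene_end)
            for i in range(t - radius, t + radius + 1)]


# Toy video: three dark frames followed by three bright frames (4 pixels each).
frames = [[0.1] * 4] * 3 + [[0.9] * 4] * 3
cuts = detect_scene_boundaries(frames)
last_of_scene_1 = window_indices(2, len(frames), 2, cuts)
first_of_scene_2 = window_indices(3, len(frames), 2, cuts)
```

In this sketch, the frames selected by `window_indices` would be stacked along the temporal axis to form the input volume of a 3D convolutional network, so the network never sees frames from two scenes in the same window.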