Deep neural network optimization methods for efficient training and inference on image synthesis

Recently, deep neural networks have demonstrated impressive image synthesis and reconstruction performance and have been extensively studied for various computer vision tasks. Unfortunately, training and evaluating recent deep learning models such as diffusion models or NeRF (Neural Radiance Fields) consumes a great deal of time and computational resources. To address this problem and to allow training even on a single GPU, we present novel methods for training a diffusion model and a model for sparse-view 3D object reconstruction. First, we present a novel pyramidal diffusion model that can generate high-resolution images starting from much coarser-resolution images, using a single score function trained with a positional embedding. This makes the neural network much lighter and enables time-efficient image generation without compromising performance. Furthermore, we show that the proposed approach can also be used efficiently for the multi-scale super-resolution problem with a single score function. Next, inspired by NeRF, which has achieved state-of-the-art performance on novel view synthesis using an implicit neural representation, we present a novel neural support optimization method. The method estimates only the support set and updates values within that support, so that the speed of training and inference can be improved. Additionally, reconstructing a 3D object from two perpendicular views has been an intriguing research topic in tomographic reconstruction, with many applications such as X-ray baggage scanners and electron microscopy. Unfortunately, conventional reconstruction algorithms such as filtered back-projection, as well as modern deep learning methods, completely fail in this extremely sparse view acquisition setting. To overcome this problem and further enhance reconstruction quality, we propose a novel feature-domain loss using pre-trained VGGNet and DINO-ViT. We demonstrate successful 3D reconstruction from two real projection images in baggage scanning and on a STEM-EDX tomography dataset.
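
The abstract mentions a feature-domain loss built on pre-trained VGGNet and DINO-ViT features. As a rough illustration only (the thesis code is not attached to this record), the sketch below shows how such a feature-domain loss is commonly implemented with a frozen, ImageNet pre-trained VGG backbone in PyTorch; the layer cut-off, the L1 criterion, and the weighting in the usage note are assumptions, not the thesis's actual configuration.

import torch
import torch.nn as nn
from torchvision.models import vgg19

class VGGFeatureLoss(nn.Module):
    """Feature-domain (perceptual) loss: compare images in a frozen VGG feature space."""

    def __init__(self, num_layers: int = 16):
        super().__init__()
        # Hypothetical choice: keep the first 16 layers of an ImageNet pre-trained VGG-19
        # as a fixed feature extractor.
        backbone = vgg19(weights="IMAGENET1K_V1").features[:num_layers]
        for p in backbone.parameters():
            p.requires_grad_(False)
        self.backbone = backbone.eval()
        self.criterion = nn.L1Loss()

    def forward(self, reconstruction: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # Inputs: (B, 3, H, W) images normalized to the VGG input statistics.
        return self.criterion(self.backbone(reconstruction), self.backbone(target))

# Example usage (illustrative weighting): combine the feature-domain term
# with an ordinary pixel-wise reconstruction loss.
#   feature_loss = VGGFeatureLoss()
#   loss = nn.functional.mse_loss(recon, gt) + 0.1 * feature_loss(recon, gt)
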
Advisors
류도훈
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2023
Identifier
325007
Language
eng
Description

Master's thesis - Korea Advanced Institute of Science and Technology (KAIST): Department of Bio and Brain Engineering, 2023.2, [v, 39 p.]

Keywords

Diffusion model; Image generation; Super resolution; Tomographic reconstruction; 3D reconstruction; Novel view synthesis; Deep learning

URI
http://hdl.handle.net/10203/308717
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1032732&flag=dissertation
Appears in Collection
BiS-Theses_Master (Master's theses)
Files in This Item
There are no files associated with this item.
