A study on the score-based diffusion model for improved training, flexible inference, and efficient sampling

Learning a data distribution and sampling from it are key to creative generation. In past decades, however, human-level generation in a high-dimensional space was out of reach for two reasons. First, computational resources were lacking. Second, no generative model scaled to high dimensions. As a result, models that seemed to conquer the MNIST dataset failed to generate recognizable natural images, such as CIFAR-10. In this thesis, we introduce recent developments in score-based diffusion models, which have emerged as a strong candidate to replace previous modeling frameworks. A diffusion model has three components [4]: the forward-time data diffusion process [5], the reverse-time generative diffusion process [6], and the score training objective [7]. A few works [8, 9, 10] provide a deep understanding of these components, and we aim to understand each component more deeply by answering fundamental questions that arise from the nature of diffusion models in three chapters.

First, we observe that the previous training objective exhibits a trade-off between sample quality and model likelihood evaluation. We explain this trade-off through the contribution of the diffusion loss at each time: the large-time diffusion loss accounts for only an extremely minor portion of the model log-likelihood. Owing to this imbalanced contribution between small and large times, log-likelihood training leaves the score estimation at large times inaccurate, and this inaccuracy deteriorates sample quality. We introduce Soft Truncation, which successfully mitigates the trade-off. Soft Truncation eases the truncation bound at every mini-batch from a hyper-parameter $\epsilon$ to a random variable $\tau$, and trains the score network for the batch on $[\tau, T]$ instead of $[\epsilon, T]$. This forces batch updates with large $\tau$ to focus on the range of large diffusion times, so the large-time score is well trained with Soft Truncation.
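The Soft Truncation training step described above can be sketched as follows. This is only an illustration under assumed conventions, not the thesis's code: it uses a toy VE-style perturbation $x_t = x_0 + \sigma(t)z$ with $\sigma(t) = t$, a small MLP as the score network, and a log-uniform draw for the per-batch truncation bound $\tau$; the names `ScoreNet` and `soft_truncation_loss` are hypothetical.

```python
import torch
import torch.nn as nn

T, EPS = 1.0, 1e-3  # diffusion horizon and the original truncation hyper-parameter

class ScoreNet(nn.Module):
    """Toy score network s_theta(x, t) for 2-D data (illustrative stand-in)."""
    def __init__(self, dim=2):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 64), nn.SiLU(), nn.Linear(64, dim))

    def forward(self, x, t):
        return self.net(torch.cat([x, t[:, None]], dim=1))

def soft_truncation_loss(score_net, x0):
    # Soft Truncation: draw ONE random truncation bound tau per mini-batch
    # (log-uniform on [EPS, T] as an assumed prior), then sample diffusion
    # times on [tau, T] instead of the fixed [EPS, T].
    tau = EPS * (T / EPS) ** torch.rand(())
    t = tau + (T - tau) * torch.rand(x0.shape[0])
    sigma = t[:, None]                 # sigma(t) = t for this toy SDE
    z = torch.randn_like(x0)
    x_t = x0 + sigma * z               # perturbed sample
    # Denoising score matching: target is -z / sigma; weighting by sigma^2
    # gives the residual (s * sigma + z).
    return ((score_net(x_t, t) * sigma + z) ** 2).sum(dim=1).mean()

torch.manual_seed(0)
net = ScoreNet()
loss = soft_truncation_loss(net, torch.randn(128, 2))
```

Batches that happen to draw a large $\tau$ spend their entire gradient signal on large diffusion times, which is exactly the regime that plain log-likelihood training under-weights.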
Second, we extend the scope of the forward-time data diffusion process from linear SDEs to nonlinear SDEs. So far, the forward-time data diffusion process has been fixed throughout the training procedure so as to constrain the final density to a Gaussian distribution. Intuitively, however, there should be diffusion patterns, adaptive to the given data distribution, that train diffusion models more efficiently. Therefore, we introduce Implicit Nonlinear Diffusion Models (INDM), which model the nonlinearity in an implicit way. We find that explicit nonlinearity modeling is unsuccessful because of its intractable transition probability, and we introduce a normalizing flow to detour the intractability issue.

Third, we aim to adjust the score estimation to improve sample quality. This work is motivated by the difference between a local optimum and the global optimum. At the global optimum of the training objective, the score network perfectly estimates the data score, but global optimality is hardly achieved in practice. Instead, the score network (at a local optimum) is merely an approximation of the data score, so there is a gap between the estimate and the true data score. We introduce a neural estimator of this gap, trained as a discriminator. After training, we augment the original generative process with the gap estimate to adjust the score part. Throughout the chapters, we validate our works on vision-oriented datasets, such as CIFAR-10.
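The third chapter's score adjustment can be sketched in a few lines. The snippet below is an assumption-laden illustration, not the thesis's implementation: it takes a trained discriminator logit $d(x, t)$ and adds the gradient of the log density ratio, $\nabla_x \log\frac{d}{1-d}$, to the learned score as the gap estimate; the names `MLP` and `adjusted_score` are hypothetical.

```python
import torch
import torch.nn as nn

class MLP(nn.Module):
    """Toy time-conditioned MLP, reused for both the score net and the discriminator."""
    def __init__(self, dim=2, out_dim=2):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 64), nn.SiLU(), nn.Linear(64, out_dim))

    def forward(self, x, t):
        return self.net(torch.cat([x, t[:, None]], dim=1))

def adjusted_score(score_net, disc, x, t):
    # The gap between the learned score and the true data score is estimated
    # as grad_x log(d / (1 - d)), where d is the discriminator's probability
    # that x came from the data rather than from the model.
    x = x.detach().requires_grad_(True)
    d = torch.sigmoid(disc(x, t)).clamp(1e-6, 1 - 1e-6)
    log_ratio = (torch.log(d) - torch.log1p(-d)).sum()
    gap = torch.autograd.grad(log_ratio, x)[0]
    # Augment the generative process: adjusted score = s_theta + gap estimate.
    return score_net(x, t) + gap

torch.manual_seed(0)
score_net, disc = MLP(2, 2), MLP(2, 1)
x, t = torch.randn(8, 2), torch.rand(8)
s_adj = adjusted_score(score_net, disc, x, t)
```

At the discriminator's optimum, $d/(1-d)$ equals the density ratio between data and model samples, so the gradient of its log is exactly the score gap the chapter aims to estimate.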
Advisors
문일철 (Il-Chul Moon), researcher
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2023
Identifier
325007
Language
eng
Description

Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology: Department of Industrial and Systems Engineering, 2023.8, [x, 158 p.]

Keywords

Generative models; Diffusion models; Score-based models; Generative adversarial networks; Normalizing flows

URI
http://hdl.handle.net/10203/320867
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1046817&flag=dissertation
Appears in Collection
IE-Theses_Ph.D. (doctoral theses)
Files in This Item
There are no files associated with this item.
