Pathwise gradient estimators for various probability distributions in deep generative models

Estimating the gradients of stochastic nodes is a central research question in the deep generative modeling community, since it is what allows model parameters to be optimized by gradient descent. This dissertation develops two pathwise gradient estimators: one for the Dirichlet distribution, and the other for generic discrete distributions.

In the first work, we propose the Dirichlet Variational Autoencoder (DirVAE), which uses a Dirichlet prior. To infer the parameters of DirVAE, we derive a pathwise gradient estimator by approximating the inverse cumulative distribution function of the Gamma distribution, a building block of the Dirichlet distribution. This new prior prompted an investigation of component collapsing, and DirVAE revealed that component collapsing originates from two problem sources: decoder weight collapsing and latent value collapsing. By resolving the component collapsing problem with the Dirichlet prior, we show that DirVAE produces disentangled latent representations, which leads to significant performance gains.

Compared with the continuous case, the gradient estimation problem becomes more complex when the stochastic nodes are discrete, because pathwise derivative techniques cannot be applied; gradient estimation then requires score function methods or a continuous relaxation of the discrete random variables. In the second work, we propose a generalized version of the Gumbel-Softmax estimator with continuous relaxation, which can relax a broader class of discrete distributions than current practice. Specifically, we combine a truncation of the discrete random variable with the Gumbel-Softmax trick and a linear transformation. The proposed approach makes the relaxed discrete random variable reparameterizable, so gradients can backpropagate through large-scale stochastic neural networks.
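The Dirichlet construction in the first work rests on two standard facts: a Dirichlet sample can be obtained by normalizing independent Gamma samples, and a Gamma sample can be drawn by pushing a uniform variate through the (approximated) inverse CDF. The abstract does not specify which approximation the dissertation uses, so the sketch below assumes the common small-shape approximation F^{-1}(u; alpha) ~ (u * alpha * Gamma(alpha))^{1/alpha}; all function names are illustrative, not the dissertation's code.

    import torch

    def gamma_icdf_approx(u, alpha, beta=1.0):
        # Small-shape approximation of the Gamma inverse CDF:
        #   F^{-1}(u; alpha, beta) ~= (u * alpha * Gamma(alpha))**(1/alpha) / beta
        # Differentiable in alpha, so the pathwise gradient flows through the sample.
        log_x = (torch.log(u) + torch.log(alpha) + torch.lgamma(alpha)) / alpha
        return torch.exp(log_x) / beta

    def sample_dirichlet_pathwise(alpha):
        # A Dirichlet(alpha) sample is a vector of independent Gamma(alpha_k)
        # samples, normalized to sum to one.
        u = torch.rand_like(alpha)
        g = gamma_icdf_approx(u, alpha)
        return g / g.sum(dim=-1, keepdim=True)

    alpha = torch.tensor([0.3, 0.5, 0.2], requires_grad=True)
    z = sample_dirichlet_pathwise(alpha)
    z[0].backward()   # pathwise gradient w.r.t. the Dirichlet parameters
    print(z.detach(), alpha.grad)

Because the sample is an explicit differentiable function of alpha, no score-function (REINFORCE) estimator is needed to train the encoder that outputs the Dirichlet parameters.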
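For the discrete case, the dissertation's estimator generalizes the Gumbel-Softmax trick with a truncation and a linear transformation; those extensions are not detailed in the abstract, so the sketch below shows only the standard Gumbel-Softmax building block: perturb the logits with Gumbel(0,1) noise and apply a temperature-controlled softmax, giving a differentiable relaxation of a one-hot categorical sample. (PyTorch also ships this as torch.nn.functional.gumbel_softmax.)

    import torch
    import torch.nn.functional as F

    def gumbel_softmax_sample(logits, tau=0.5):
        # Gumbel(0,1) noise via inverse-CDF sampling: -log(-log(U)).
        gumbels = -torch.log(-torch.log(torch.rand_like(logits)))
        # Temperature-controlled softmax; as tau -> 0 the output
        # approaches a one-hot categorical sample.
        return F.softmax((logits + gumbels) / tau, dim=-1)

    logits = torch.randn(4, 10, requires_grad=True)
    y = gumbel_softmax_sample(logits)
    y.sum().backward()              # gradients flow back to the logits
    print(logits.grad is not None)  # True: the sample is reparameterized

The relaxation makes the stochastic node reparameterizable, which is what lets the estimator scale to large stochastic neural networks as described above.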
Advisors
Moon, Il-Chul
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2020
Identifier
325007
Language
eng
Description

Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology: Department of Industrial and Systems Engineering, 2020.8, [vii, 64 p.]

Keywords

Deep Generative Model; Variational Autoencoder; Pathwise Gradient Estimator; Reparameterization Trick; Representation Learning

URI
http://hdl.handle.net/10203/284294
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=924242&flag=dissertation
Appears in Collection
IE-Theses_Ph.D. (Doctoral Theses)
Files in This Item
There are no files associated with this item.
