Unconditional image-text pair generation with multimodal cross quantizer

Though deep generative models have received a great deal of attention, most existing work targets unimodal generation tasks. In this paper, we explore a new method for unconditional image-text pair generation. We propose MXQ-VAE, a vector quantization method for multimodal image-text representation. MXQ-VAE accepts a paired image and text as input and learns a joint quantized representation space, so that the image-text pair can be converted to a sequence of unified indices. An autoregressive generative model can then be trained over this joint image-text representation and can even perform unconditional image-text pair generation. Extensive experimental results demonstrate that our approach generates semantically consistent image-text pairs and also strengthens the alignment between image and text.
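The abstract describes a two-stage pipeline: a quantizer maps an image-text pair to one unified sequence of codebook indices, and an autoregressive model is trained over that sequence. The sketch below illustrates only the first stage under loose assumptions; the module names, the patch-based image encoder, the concatenation fusion, and all sizes are hypothetical simplifications for illustration, not the thesis's actual MXQ-VAE architecture.

```python
# Illustrative sketch only: encode an image-text pair into one fused feature
# sequence, quantize it against a single shared codebook, and emit one flat
# sequence of discrete indices covering both modalities. All names and sizes
# here are assumptions, not the MXQ-VAE implementation from the thesis.
import torch
import torch.nn as nn


class SharedQuantizer(nn.Module):
    """Nearest-neighbour vector quantization against one shared codebook."""

    def __init__(self, num_codes: int = 512, dim: int = 64):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)
        self.codebook.weight.data.uniform_(-1.0 / num_codes, 1.0 / num_codes)

    def forward(self, z: torch.Tensor):
        # z: (batch, seq_len, dim) fused image-text features.
        w = self.codebook.weight                       # (num_codes, dim)
        # Squared Euclidean distance from every feature to every code.
        dists = (z.pow(2).sum(-1, keepdim=True)
                 - 2 * z @ w.t()
                 + w.pow(2).sum(-1))                   # (B, L, num_codes)
        indices = dists.argmin(dim=-1)                 # unified index sequence
        z_q = self.codebook(indices)
        # Straight-through estimator: copy gradients past the argmin.
        z_q = z + (z_q - z).detach()
        return z_q, indices


class ToyJointEncoder(nn.Module):
    """Projects image patches and text tokens into one joint feature sequence."""

    def __init__(self, vocab_size: int = 1000, dim: int = 64):
        super().__init__()
        self.patch_proj = nn.Linear(3 * 8 * 8, dim)  # flattened 8x8 RGB patches
        self.token_emb = nn.Embedding(vocab_size, dim)

    def forward(self, image: torch.Tensor, tokens: torch.Tensor):
        # image: (B, 3, 32, 32) -> 16 non-overlapping 8x8 patches; tokens: (B, T).
        b = image.size(0)
        patches = image.unfold(2, 8, 8).unfold(3, 8, 8)  # (B, 3, 4, 4, 8, 8)
        patches = patches.permute(0, 2, 3, 1, 4, 5).reshape(b, 16, -1)
        img_feats = self.patch_proj(patches)             # (B, 16, dim)
        txt_feats = self.token_emb(tokens)               # (B, T, dim)
        # Concatenate along the sequence axis to form the joint representation.
        return torch.cat([img_feats, txt_feats], dim=1)


if __name__ == "__main__":
    enc, vq = ToyJointEncoder(), SharedQuantizer()
    image = torch.randn(2, 3, 32, 32)
    tokens = torch.randint(0, 1000, (2, 12))
    z_q, indices = vq(enc(image, tokens))
    # One flat index sequence covers both modalities: 16 image + 12 text codes.
    print(indices.shape)  # torch.Size([2, 28])
```

Training such a quantizer would typically add a reconstruction loss plus the usual codebook and commitment losses (as in VQ-VAE); once it converges, the unified index sequences become the vocabulary for the second-stage autoregressive model, and sampling that model then yields a paired image and text at once.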
Advisors
Choi, Edward (최윤재)
Description
Korea Advanced Institute of Science and Technology (KAIST) : Kim Jaechul Graduate School of AI
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2022
Identifier
325007
Language
eng
Description

Master's thesis - Korea Advanced Institute of Science and Technology : Kim Jaechul Graduate School of AI, 2022.8, [iii, 13 p.]

Keywords

Multimodal Representation Learning; Vector Quantization; Unconditional Multimodal Generation

URI
http://hdl.handle.net/10203/308221
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1008214&flag=dissertation
Appears in Collection
AI-Theses_Master (Master's theses)
Files in This Item
There are no files associated with this item.
