Image-text multi-modal representation learning by adversarial backpropagation

We present a novel method for image-text multi-modal representation learning. To our knowledge, this work is the first to apply the concept of adversarial learning to multi-modal learning without exploiting image-text pair information to learn multi-modal features. We use only category information, in contrast to most previous methods, which rely on image-text pair information for multi-modal embedding. In this paper, we show that a multi-modal feature can be learned without image-text pair information, and that our method brings the image and text distributions in the multi-modal feature space closer together than other methods that do use image-text pair information. We also show that our multi-modal feature carries general semantic information, even though it was trained only for category prediction. Our model is trained end-to-end by backpropagation, is intuitive, and is easily extended to other multi-modal learning tasks.
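The adversarial setup described in the abstract can be pictured as a domain-adversarial (gradient-reversal) network: an image encoder and a text encoder feed a shared category classifier, while a modality discriminator, connected through a gradient-reversal layer, pushes the two feature distributions to become indistinguishable. The following PyTorch sketch illustrates this reading only; all layer sizes, module names, and the DANN-style formulation are illustrative assumptions, not the thesis's actual architecture.

    # Minimal sketch: category-supervised encoders + adversarial modality confusion.
    # All dimensions and names below are assumptions for illustration.
    import torch
    import torch.nn as nn

    class GradReverse(torch.autograd.Function):
        """Identity in the forward pass, negated gradient in the backward pass."""
        @staticmethod
        def forward(ctx, x):
            return x
        @staticmethod
        def backward(ctx, grad_output):
            return -grad_output

    def grad_reverse(x):
        return GradReverse.apply(x)

    class MultiModalNet(nn.Module):
        def __init__(self, img_dim=2048, txt_dim=300, feat_dim=256, num_classes=20):
            super().__init__()
            self.img_enc = nn.Sequential(nn.Linear(img_dim, feat_dim), nn.ReLU())
            self.txt_enc = nn.Sequential(nn.Linear(txt_dim, feat_dim), nn.ReLU())
            # shared category classifier: the only supervised signal (no pairs needed)
            self.classifier = nn.Linear(feat_dim, num_classes)
            # modality discriminator (image vs. text), trained adversarially
            self.discriminator = nn.Sequential(
                nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 2))

        def forward(self, img, txt):
            feats = torch.cat([self.img_enc(img), self.txt_enc(txt)], dim=0)
            cls_logits = self.classifier(feats)
            # gradient reversal makes the encoders try to fool the discriminator,
            # aligning the image and text feature distributions
            dom_logits = self.discriminator(grad_reverse(feats))
            return cls_logits, dom_logits

    # One hypothetical training step: category loss + modality-confusion loss
    model = MultiModalNet()
    img, txt = torch.randn(8, 2048), torch.randn(8, 300)
    labels = torch.randint(0, 20, (16,))   # category labels for the 8 images, then the 8 texts
    modality = torch.cat([torch.zeros(8), torch.ones(8)]).long()  # 0 = image, 1 = text
    cls_logits, dom_logits = model(img, txt)
    loss = nn.functional.cross_entropy(cls_logits, labels) \
         + nn.functional.cross_entropy(dom_logits, modality)
    loss.backward()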
Advisors
Yang, Hyun Seung (양현승)
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2017
Identifier
325007
Language
eng
Description

Master's thesis - Korea Advanced Institute of Science and Technology: School of Computing, 2017.8, [iii, 16 p.]

Keywords

Multi-Modal Representation; Generative Adversarial Network; Domain Adaptation; Adversarial Learning

URI
http://hdl.handle.net/10203/243445
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=718719&flag=dissertation
Appears in Collection
CS-Theses_Master (Master's Theses)
Files in This Item
There are no files associated with this item.
