InstaFormer++: Multi-Domain Instance-Aware Image-to-Image Translation with Transformer

DC Field | Value | Language
dc.contributor.author | Kim, Soohyun | ko
dc.contributor.author | Baek, Jongbeom | ko
dc.contributor.author | Park, Jihye | ko
dc.contributor.author | Ha, Eunjae | ko
dc.contributor.author | Jung, Homin | ko
dc.contributor.author | Lee, Taeyoung | ko
dc.contributor.author | Kim, Seungryong | ko
dc.date.accessioned | 2024-08-16T02:00:05Z | -
dc.date.available | 2024-08-16T02:00:05Z | -
dc.date.created | 2024-08-16 | -
dc.date.issued | 2024-04 | -
dc.identifier.citation | INTERNATIONAL JOURNAL OF COMPUTER VISION, v.132, no.4, pp.1167 - 1186 | -
dc.identifier.issn | 0920-5691 | -
dc.identifier.uri | http://hdl.handle.net/10203/322305 | -
dc.description.abstract | We present a novel Transformer-based network architecture for instance-aware image-to-image translation, dubbed InstaFormer, to effectively integrate global- and instance-level information. By treating content features extracted from an image as visual tokens, our model discovers a global consensus among content features by considering context information through the self-attention modules of Transformers. By augmenting such tokens with instance-level features extracted from the content features with respect to bounding box information, our framework learns an interaction between object instances and the global image, thus boosting instance-awareness. We replace layer normalization (LayerNorm) in standard Transformers with adaptive instance normalization (AdaIN) to enable multi-modal translation with style codes. In addition, to improve instance-awareness and translation quality at object regions, we present an instance-level content contrastive loss defined between the input and translated images. Although InstaFormer attains competitive performance, it faces some limitations, namely limited scalability in handling multiple domains and reliance on domain annotations. To overcome these, we propose InstaFormer++, an extension of InstaFormer that enables multi-domain, instance-aware image translation for the first time. We propose to obtain pseudo domain labels by leveraging a list of candidate domain labels in text format together with a pretrained vision-language model. We conduct experiments to demonstrate the effectiveness of our methods over the latest methods and provide extensive ablation studies. | -
dc.language | English | -
dc.publisher | SPRINGER | -
dc.title | InstaFormer++: Multi-Domain Instance-Aware Image-to-Image Translation with Transformer | -
dc.type | Article | -
dc.identifier.wosid | 001091935600001 | -
dc.identifier.scopusid | 2-s2.0-85175203558 | -
dc.type.rims | ART | -
dc.citation.volume | 132 | -
dc.citation.issue | 4 | -
dc.citation.beginningpage | 1167 | -
dc.citation.endingpage | 1186 | -
dc.citation.publicationname | INTERNATIONAL JOURNAL OF COMPUTER VISION | -
dc.identifier.doi | 10.1007/s11263-023-01866-y | -
dc.contributor.localauthor | Kim, Seungryong | -
dc.contributor.nonIdAuthor | Kim, Soohyun | -
dc.contributor.nonIdAuthor | Baek, Jongbeom | -
dc.contributor.nonIdAuthor | Park, Jihye | -
dc.contributor.nonIdAuthor | Ha, Eunjae | -
dc.contributor.nonIdAuthor | Jung, Homin | -
dc.contributor.nonIdAuthor | Lee, Taeyoung | -
dc.description.isOpenAccess | N | -
dc.type.journalArticle | Article | -
dc.subject.keywordAuthor | GANs | -
dc.subject.keywordAuthor | Instance-aware image-to-image translation | -
dc.subject.keywordAuthor | Vision and language | -
dc.subject.keywordAuthor | Image-to-image translation | -
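Two of the mechanisms described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: in the paper, `gamma` and `beta` are produced by an MLP from a sampled style code (here they are passed in directly), and the image/text embeddings for pseudo-labeling are assumed to come from a pretrained vision-language model such as CLIP (not computed here).

```python
import numpy as np

def adain(tokens, gamma, beta, eps=1e-5):
    """AdaIN over visual tokens, replacing LayerNorm's fixed affine.

    tokens: (N, C) visual tokens; gamma, beta: (C,) style-derived
    scale and shift. Because gamma/beta vary with the style code
    (unlike LayerNorm's learned constants), different style codes
    yield different outputs, enabling multi-modal translation.
    """
    mu = tokens.mean(axis=0, keepdims=True)     # per-channel mean over tokens
    sigma = tokens.std(axis=0, keepdims=True)   # per-channel std over tokens
    return gamma * (tokens - mu) / (sigma + eps) + beta

def pseudo_domain_label(image_emb, text_embs, labels):
    """Zero-shot pseudo domain labeling via cosine similarity.

    Picks the candidate domain label (given as text) whose text
    embedding is most similar to the image embedding.
    """
    image_emb = image_emb / np.linalg.norm(image_emb)
    text_embs = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    sims = text_embs @ image_emb                # cosine similarities
    return labels[int(np.argmax(sims))]
```

After `adain`, each channel of the tokens has (approximately) mean `beta` and standard deviation `gamma`, so the style code directly controls the token statistics; `pseudo_domain_label` removes the need for ground-truth domain annotations by scoring a text list of candidate domains.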
Appears in Collection
AI-Journal Papers (Journal Papers)
Files in This Item
There are no files associated with this item.
