While the Contrastive Language-Image Pre-training (CLIP) model has significantly advanced text-to-image generation, we uncover two notable issues in its application to diffusion models, particularly in the use of local embeddings. First, the model disproportionately focuses on word embeddings that convey less information about the input prompt. Second, local embeddings disrupt the image geometry established by global embeddings at the initial timesteps, risking misalignment with the original prompt. To mitigate these issues, we introduce two adjustments to cross-attention: sequence-dependent and time-dependent attention calibration. Our method relies on simple numerical operations, for which we provide the values, ensuring easy implementation. In the sequence-dependent attention calibration, constants are added to the logits in the cross-attention layer to counterbalance the diminishing attention across the word sequence. The time-dependent attention adjustment strengthens attention to the global embeddings in the initial stages, facilitating better geometry formation. Our experiments on various datasets show that this simple method significantly improves the performance of Stable Diffusion, yielding images that more accurately depict the input prompts.
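The two calibrations can be sketched as follows. This is a minimal illustration, not the paper's implementation: the offset scale `c`, the global-embedding weight `w_global`, the linearly increasing form of the sequence offsets, and the linear decay of the time-dependent boost are all assumptions chosen for clarity; the paper specifies its own values.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def calibrated_cross_attention(logits, t, T, c=0.1, w_global=2.0, global_idx=0):
    """Illustrative sketch of the two calibrations on cross-attention logits.

    logits: (num_queries, seq_len) raw attention scores (Q K^T / sqrt(d)).
    t, T:   current diffusion timestep and total timesteps (t = T at the start).
    """
    seq_len = logits.shape[-1]
    # Sequence-dependent calibration: add constants that grow with token
    # position to counterbalance the attention decay across the word sequence
    # (linear form and scale c are assumptions, not the paper's values).
    calibrated = logits + c * np.arange(seq_len)
    # Time-dependent calibration: boost the logit of the global embedding
    # early in sampling (large t/T) so it can establish image geometry;
    # the boost fades as denoising proceeds.
    calibrated = calibrated.copy()
    calibrated[..., global_idx] += w_global * (t / T)
    return softmax(calibrated, axis=-1)
```

With zero logits, the global token receives more attention at the first timestep (`t = T`) than at the last (`t = 0`), and later word positions receive progressively more attention than earlier ones, mirroring the two effects described above.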