ReMixer: object-aware mixing layer for vision transformers

Abstract

Vision Transformers (ViTs) have shown impressive results on various visual recognition tasks, displacing classic convolutional networks. While the initial ViTs treated all patches equally, recent studies reveal that incorporating inductive biases such as spatiality benefits the learned representations. However, most prior works focused solely on the location of patches, overlooking the scene structure of images. This paper aims to further guide the interaction of patches using object information. Specifically, we propose ReMixer, which reweights the patch mixing layers of ViT based on patch-wise object labels obtained in an unsupervised or weakly-supervised manner, i.e., no additional human annotation cost is necessary. Using the object labels, we compute a reweighting mask with a learnable scale parameter that calibrates the patch interactions, e.g., the attention map of self-attention. We demonstrate that ReMixer improves ViTs on various downstream tasks, including classification, multi-object recognition, and background robustness. Finally, we show that our idea also works for MLP-Mixer and ConvMixer, implying its generic applicability to patch-based models.
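The reweighting idea in the abstract can be sketched in a few lines. This is a minimal single-head sketch, not the thesis code: the function name `remixer_attention`, the additive form of the mask, and the fixed `scale` argument are illustrative assumptions — in the paper the scale is a learnable parameter and the patch-wise object labels come from an unsupervised or weakly-supervised segmenter.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def remixer_attention(q, k, v, object_labels, scale=1.0):
    """Calibrate a self-attention map with patch-wise object labels.

    q, k, v: (N, d) per-patch query/key/value matrices.
    object_labels: (N,) object id per patch (assumed given here; in the
    paper they are obtained without extra human annotation).
    scale: stand-in for the learnable scale parameter.
    """
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d)                        # (N, N) patch-interaction logits
    same_object = object_labels[:, None] == object_labels[None, :]
    mask = np.where(same_object, scale, 0.0)             # reweighting mask from object labels
    attn = softmax(logits + mask, axis=-1)               # calibrated attention map
    return attn @ v, attn
```

With `scale > 0`, attention mass shifts toward patches that belong to the same object; with `scale = 0` the layer reduces to plain self-attention, so the mask only biases, never hard-restricts, the patch interactions.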
Advisors
Shin, Jinwoo (신진우)
Description
Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2022
Identifier
325007
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology: School of Electrical Engineering, 2022.8, [iv, 23 p.]

Keywords

Object-centric; Inductive bias; Vision transformers; Patch-based models

URI
http://hdl.handle.net/10203/309947
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1008387&flag=dissertation
Appears in Collection
EE-Theses_Master (Master's theses)
Files in This Item
There are no files associated with this item.
