DSpace at KOASAS: Learning to generate inversion-resistant model explanations (모델 전도 공격에 안전한 모델 설명 생성에 관한 연구)


Learning to generate inversion-resistant model explanations (모델 전도 공격에 안전한 모델 설명 생성에 관한 연구)

Author: 정호용

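As artificial intelligence is widely adopted for high-stakes decisions, the need for interpretable AI that explains its own decisions has grown. Unfortunately, recent work has shown that releasing model explanations facilitates information leakage and leaves AI models vulnerable to model inversion attacks. With access to the released explanations, an adversary can reconstruct the original images more accurately, which can expose sensitive information. This thesis presents GNIME (Generative Noise Injector for Model Explanations), a new defense framework that perturbs model explanations to minimize the risk of inversion attacks while preserving their interpretability. To learn a model that injects optimal noise, GNIME trains a noise generator (the defender) and an inversion model (the attacker) against each other in a minimax setting. Experiments confirm that GNIME preserves the function of the original model explanations while substantially reducing information leakage, lowering transferable classification accuracy on a face recognition model by up to 84.8%.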
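
The defender-versus-attacker training described in the abstract can be illustrated with a deliberately tiny toy, which is not the thesis's implementation. The sketch below assumes a scalar secret x ~ N(0, 1), a released "explanation" e = x + s·z with z ~ N(0, 1), a linear attacker x_hat = w·e, and a utility penalty lam·s² on the noise; all of these names and modeling choices are hypothetical and chosen only so the expected losses have closed forms. GNIME itself works with neural networks over images and explanation maps.

```python
def train_toy_minimax(lam=0.25, lr=0.05, steps=2000):
    """Alternating gradient play on a scalar toy problem (illustrative only).

    Attacker (stand-in for the inversion model): x_hat = w * e,
    minimizes the expected squared error
        MSE(w, s) = (w - 1)**2 + w**2 * s**2.
    Defender (stand-in for the noise generator): chooses the noise scale s,
    maximizes MSE(w, s) - lam * s**2, where lam * s**2 is an assumed
    penalty for distorting the explanation.
    """
    w, s = 1.0, 0.1  # attacker weight, defender noise scale
    for _ in range(steps):
        # Attacker step: gradient descent on the reconstruction error.
        grad_w = 2.0 * (w - 1.0) + 2.0 * w * s * s
        w -= lr * grad_w
        # Defender step: gradient ascent on attacker error minus penalty.
        grad_s = 2.0 * s * (w * w - lam)
        s += lr * grad_s
    mse = (w - 1.0) ** 2 + (w * s) ** 2
    return w, s, mse


if __name__ == "__main__":
    w, s, mse = train_toy_minimax()
    print(f"attacker weight={w:.3f}, noise scale={s:.3f}, attacker MSE={mse:.3f}")
```

In this toy the stationary point satisfies w = 1/(1 + s²) and w² = lam, so with lam = 0.25 the noise scale settles near s = 1 and the attacker's error near 0.5. A larger lam (a stronger preference for faithful explanations) yields less noise and a more accurate attacker, which mirrors the interpretability-versus-leakage trade-off the abstract describes.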

Advisors: Son, Sooel (손수엘)

Description: Korea Advanced Institute of Science and Technology (KAIST), Graduate School of Information Security

Publisher: Korea Advanced Institute of Science and Technology (KAIST)

Issue Date: 2023

Identifier: 325007

Language: kor

Description: Master's thesis - Korea Advanced Institute of Science and Technology (KAIST), Graduate School of Information Security, 2023.2, [iv, 26 p.]

Keywords: model inversion attack; explainable AI; model explanation

URI: http://hdl.handle.net/10203/309618

Link: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1032999&flag=dissertation

Appears in Collection: IS-Theses_Master(석사논문)

Files in This Item: There are no files associated with this item.

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
