Besra: Self-correction for hallucination mitigation in large vision-language models

Large Vision-Language Models (LVLMs) have revolutionized the field of computer vision by unifying various computer vision tasks through their ability to comprehend visual information. However, they often suffer from hallucination, generating descriptions that are inconsistent with the input images. This paper introduces Besra, a Large Vision-Language Model designed to address hallucination by incorporating a self-correction task. Besra leverages its iterative refinement capability to improve the consistency of generated sentences with the provided images. The model iteratively refines descriptions by re-feeding them alongside the corresponding images, which facilitates a detailed examination of specific image regions. Besra-Self-Correction-30K, a proposed dataset, trains Besra's self-correction ability by inducing corrections based on predictions from a baseline LVLM. The approach aims to mitigate hallucination, enabling Besra to generate more accurate and contextually relevant descriptions through active scrutiny of the image. We evaluate Besra on the POPE and MME benchmarks and show that the self-correction task is effective for hallucination mitigation.
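The full thesis text is not part of this record, but the iterative refinement loop described in the abstract might look roughly like the sketch below. Here `lvlm_generate` is a hypothetical stand-in for Besra's actual inference interface, and the prompt wording, iteration count, and stopping criterion are illustrative assumptions rather than the author's exact method.

```python
# Minimal sketch of the self-correction loop described in the abstract.
# `lvlm_generate` is a hypothetical (image, prompt) -> text interface;
# prompts and the number of rounds are illustrative assumptions.

from typing import Callable


def self_correct(
    image,                                           # input image (format depends on the model)
    lvlm_generate: Callable[[object, str], str],     # hypothetical LVLM inference call
    num_rounds: int = 2,
) -> str:
    """Re-feed the current description with the image so the model can
    re-examine specific regions and revise claims not grounded in the image."""
    # Initial description generated from the image alone.
    description = lvlm_generate(image, "Describe this image in detail.")

    for _ in range(num_rounds):
        # Re-feed the previous description alongside the image and ask the
        # model to correct statements that are not supported by the image.
        correction_prompt = (
            "Here is a candidate description of the image:\n"
            f"{description}\n"
            "Check each claim against the image and rewrite the description, "
            "removing or fixing anything that is not actually visible."
        )
        description = lvlm_generate(image, correction_prompt)

    return description
```

In this reading, training on Besra-Self-Correction-30K would teach the model to handle the second prompt style, where a possibly hallucinated description from a baseline LVLM is given and the corrected description is the target output.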
Advisors
노용만 (Yong Man Ro)
Description
Korea Advanced Institute of Science and Technology (KAIST), School of Electrical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2024
Identifier
325007
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology, School of Electrical Engineering, 2024.2, [iii, 22 p.]

Keywords

Large vision-language model; Hallucination; Self-correction; Besra; Besra-self-correction-30K

URI
http://hdl.handle.net/10203/321570
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1096788&flag=dissertation
Appears in Collection
EE-Theses_Master (Master's theses)
Files in This Item
There are no files associated with this item.
