Upper confidence reinforcement learning (UCRL) algorithms have proven highly effective for online reinforcement learning problems in which the environment is a Markov decision process (MDP) with unknown reward distributions and unknown state transition probabilities. Analogously to the upper confidence bound (UCB) algorithm, a UCRL algorithm constructs a set of plausible MDPs that contains the true MDP with high probability, and derives an exploration policy from an optimistic interpretation of this confidence set. To achieve an optimal balance between exploration and exploitation, it is crucial to make the set of plausible MDPs as tight as possible.
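For concreteness, a standard Hoeffding-style construction (our notation; the radius and constants vary across UCRL variants) constrains the transition probabilities of each plausible MDP around their empirical estimates:

$$\mathcal{M}_t = \left\{ \tilde{M} : \left\| \tilde{p}(\cdot \mid s,a) - \hat{p}_t(\cdot \mid s,a) \right\|_1 \le \sqrt{\frac{c \, S \log(t/\delta)}{\max\{1,\, N_t(s,a)\}}} \ \text{ for all } (s,a) \right\},$$

where $\hat{p}_t$ is the empirical transition distribution, $N_t(s,a)$ the visit count, $S$ the number of states, $\delta$ the confidence level, and $c$ an algorithm-dependent constant; the exploration policy is then chosen to maximize value over $\mathcal{M}_t$.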
We introduce bootstrap techniques into the construction of the set of plausible MDPs, complementing the concentration inequalities, such as Hoeffding's inequality and the empirical Bernstein inequality, used in previous UCRL algorithms. This allows us to exploit the full empirical distribution of the observed data, making the set of plausible MDPs tighter while preserving worst-case theoretical guarantees. We demonstrate through experiments that our proposed bootstrapped UCRL algorithms improve on existing UCRL algorithms by 5%-30% in terms of cumulative regret, and we provide theoretical analysis showing that this improvement is achieved without degrading their performance guarantees.
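As a rough illustration of the idea (a minimal sketch, not the paper's exact procedure; all function names, constants, and the combination rule below are our assumptions), one can bootstrap the observed next-state samples for each state-action pair, take a quantile of the resampled L1 deviations as the confidence radius, and cap it by a Hoeffding-style radius so the worst-case guarantee is retained:

```python
import numpy as np

def bootstrap_radius(next_states, n_states, n_boot=200, alpha=0.05, seed=None):
    """Bootstrap a (1 - alpha)-quantile L1 confidence radius for one (s, a) pair."""
    rng = np.random.default_rng(seed)
    n = len(next_states)
    if n == 0:
        return 2.0  # maximum possible L1 distance between two distributions
    # Empirical next-state distribution from the observed samples.
    p_hat = np.bincount(next_states, minlength=n_states) / n
    dists = np.empty(n_boot)
    for b in range(n_boot):
        # Resample the observed transitions with replacement.
        resample = rng.choice(next_states, size=n, replace=True)
        p_boot = np.bincount(resample, minlength=n_states) / n
        dists[b] = np.abs(p_boot - p_hat).sum()
    return np.quantile(dists, 1.0 - alpha)

def plausible_radius(next_states, n_states, t, delta=0.05):
    """Combine the bootstrap radius with a Hoeffding-style cap so the
    resulting set is never looser than the concentration bound."""
    n = max(1, len(next_states))
    # Illustrative Hoeffding-style radius; exact constants vary by algorithm.
    hoeffding = np.sqrt(14 * n_states * np.log(2 * t / delta) / n)
    return min(bootstrap_radius(next_states, n_states), hoeffding)
```

Taking the minimum of the two radii is one simple way to let the data-driven radius tighten the set without ever enlarging it beyond the one backed by the concentration inequality.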