DSpace at KOASAS: Rewards Prediction Based Credit Assignment for Reinforcement Learning


Rewards Prediction Based Credit Assignment for Reinforcement Learning


SEO, MINAH

In many reinforcement learning problems, the reward for an action is not given immediately after the action; this is called delayed reward. When rewards are sparse and binary, that is, given only when the agent succeeds in achieving a goal, success signals appear infrequently, so learning becomes slower and more difficult. This paper proposes a method that performs credit assignment and improves sample efficiency by selecting, from a series of actions, the key-action that contributed to receiving the reward. Actions that precede the key-action are given a smaller reward than the key-action, which alleviates the scarcity of success signals. The key-action is determined from the predicted value of the rewards to be received, given the previous information in the episode. Traditional reward shaping is one kind of credit assignment method, but it requires prior knowledge of the environment and is likely to involve the designer's bias. The proposed method can have a dynamic reward shaping effect by using a reward function that is modified according to the agent's experience, while still using a sparse binary reward that requires no prior knowledge. Key-action detection is tested in a slide task, in which a robot hits a puck and sends it to a goal point, and the performance of the proposed method is shown in a push task, a slide task, and a maze-solving task. The first experiment confirms that the robot detects a proper key-action, namely the moment just before the robot hits the object. In the other experiments, all cases using the proposed method show a higher success rate or marginally improved performance compared with the cases without it.
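The relabeling idea in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not state how the key-action is chosen or how much reward earlier actions receive. The "largest jump in predicted reward" rule, the geometric `decay`, the `key_reward` value, and the function names `find_key_action` and `shape_rewards` are all assumptions made for illustration.

```python
from typing import List, Sequence


def find_key_action(predictions: Sequence[float]) -> int:
    """Return the index of the action followed by the largest rise in predicted reward.

    predictions[t] is the predicted episode reward *before* action t, and
    predictions[-1] is the prediction after the last action.
    ASSUMPTION: the "largest jump" rule is hypothetical; the abstract only
    says the key-action is based on predicted rewards.
    """
    jumps = [predictions[t + 1] - predictions[t] for t in range(len(predictions) - 1)]
    return max(range(len(jumps)), key=jumps.__getitem__)


def shape_rewards(
    rewards: Sequence[float],
    predictions: Sequence[float],
    key_reward: float = 0.5,   # ASSUMPTION: illustrative value
    decay: float = 0.5,        # ASSUMPTION: geometric decay is illustrative
) -> List[float]:
    """Relabel a successful sparse-binary-reward episode.

    The key-action receives `key_reward`; each earlier action receives a
    smaller, geometrically decayed reward. Failed episodes are unchanged.
    """
    if len(predictions) != len(rewards) + 1:
        raise ValueError("predictions must have one more entry than rewards")
    shaped = list(rewards)
    if not rewards or rewards[-1] <= 0:
        return shaped  # no success signal, nothing to assign credit for
    key = find_key_action(predictions)
    for t in range(key + 1):
        # max() keeps the original success reward if the key-action is the last step
        shaped[t] = max(shaped[t], key_reward * decay ** (key - t))
    return shaped


if __name__ == "__main__":
    episode_rewards = [0, 0, 0, 0, 1]                   # success only at the end
    predicted = [0.1, 0.1, 0.2, 0.9, 0.9, 1.0]          # made-up predictor outputs
    print(shape_rewards(episode_rewards, predicted))
```

In this made-up episode the prediction rises most after the third action (index 2), so that action is treated as the key-action and the two actions before it receive progressively smaller rewards, while the original success reward at the final step is left in place.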

Advisors: Har, Dong Soo

Description: Korea Advanced Institute of Science and Technology (KAIST): Cho Chun Shik Graduate School for Green Transportation

Publisher: Korea Advanced Institute of Science and Technology (KAIST)

Issue Date: 2019

Identifier: 325007

Language: eng

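Description: Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): Cho Chun Shik Graduate School for Green Transportation, 2019.8, [iii, 46 p.]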

Keywords: Credit Assignment; Reward Shaping; Reinforcement Learning; Delayed Reward; Sparse Binary Reward

URI: http://hdl.handle.net/10203/285192

Link: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=927180&flag=dissertation

Appears in Collection: GT-Theses_Master (Master's Theses)

Files in This Item: There are no files associated with this item.


KOASAS

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
