DSpace at KOASAS: POfD-BC : policy optimization from demonstrations with behavior cloning for robot hand manipulation

POfD-BC: policy optimization from demonstrations with behavior cloning for robot hand manipulation

Choi, Yun-Seon

Five-fingered robot hands have been under development for a long time, and commercial models are now available. Controlling such a hand, however, remains difficult: generating its motion was virtually impossible, even when captured human hand motion was leveraged. With advances in deep learning, deep reinforcement learning (deep RL) has shown remarkable results in several areas and has become a viable approach to controlling the robot hand despite the high complexity of the problem. A drawback of RL is that the reward function must be designed manually from human knowledge. The easiest reward to design is a sparse reward, which only indicates whether certain subgoals have been accomplished. This thesis studies robot hand manipulation with RL under this sparse-reward condition. The existing algorithm POfD, which utilizes human demonstrations, has been successful in sparse-reward environments. We first applied POfD to robot hand manipulation tasks and analyzed the results: it did not solve all of the tasks, and the motions it generated were erratic. In terms of both performance and practicality, POfD therefore has limitations. We propose POfD-BC, which adapts POfD to imitation tasks such as robot hand manipulation. The new method tries to mimic human hand motions, which makes the resulting motions far more natural. Furthermore, we transfer the learned behaviors to new environments. Four new task environments, each related to the original tasks, were constructed for transfer learning, on the expectation that behaviors produced by the pre-trained parameters share common parts with the actions the new tasks require. The experiments show that the new tasks cannot be completed without this prior knowledge. POfD-BC successfully solves the contact-rich robot hand manipulation tasks and produces practical motions. With the knowledge gained in the previous step, the robot hand readily learns to manipulate objects in more complex situations.
Imitation tasks, such as those modeled on a human hand, need behavior cloning for reliable and practical learning.

Advisors: Kim, Jong-Hwan

Description: Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering

Publisher: Korea Advanced Institute of Science and Technology (KAIST)

Issue Date: 2020

Identifier: 325007

Language: eng

Description: Master's thesis - Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering, 2020.2, [iv, 29 p.]

Keywords: robotics; robot hand; reinforcement learning; Behavior Cloning (BC); motion generation

URI: http://hdl.handle.net/10203/284785

Link: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=911415&flag=dissertation

Appears in Collection: EE-Theses_Master (Master's Theses)

Files in This Item: There are no files associated with this item.

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
