POfD-BC: policy optimization from demonstrations with behavior cloning for robot hand manipulation

Five-fingered robot hands have been under development for a long time, and commercial models are now available. Controlling such a hand, however, remains difficult in many ways. Generating motion for a robot hand has been virtually impossible, even when captured human hand motion is leveraged. As deep learning has advanced, deep reinforcement learning (deep RL) has achieved remarkable results in several areas, and it has become a viable approach to controlling the robot hand despite the high complexity of the problem. One problem with RL is that the reward function must be designed by hand from human knowledge. The easiest reward function to construct is a sparse reward, which only indicates whether certain subgoals have been accomplished. This work studies robot hand manipulation with RL under this sparse-reward condition. The existing algorithm POfD, which utilizes human demonstrations, has been successful in sparse-reward environments. We first applied POfD to robot hand manipulation tasks and analyzed the results: it did not solve all of the tasks, and the motions it generated were erratic. In terms of both performance and practicality, POfD therefore has limitations. We propose POfD-BC to adapt POfD to imitation tasks such as robot hand manipulation. The new method tries to mimic human hand motions, which makes the resulting motions far more natural. Furthermore, we transfer the learned behaviors to new environments. Four new task environments, related to the previous tasks, were constructed for transfer learning. Behaviors produced by the pre-trained parameters are expected to share common parts with the new actions. The experiments show that the new tasks cannot be completed without prior knowledge. POfD-BC successfully solves the contact-rich robot hand manipulation tasks and produces practical motions. With the knowledge from the previous step, the robot hand easily learns how to perform manipulation in more complex situations.
Imitation tasks, such as those modeled on a human hand, need behavior cloning for reliable and practical learning.
Advisors
Kim, Jong-Hwan (김종환)
Description
Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2020
Identifier
325007
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering, 2020.2, [iv, 29 p.]

Keywords

robotics; robot hand; reinforcement learning; Behavior Cloning (BC); motion generation

URI
http://hdl.handle.net/10203/284785
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=911415&flag=dissertation
Appears in Collection
EE-Theses_Master(석사논문)
Files in This Item
There are no files associated with this item.
