Dynamic value gradient control using Gaussian process reinforcement learning with hyper-parameter optimization

When a system that handles dangerous materials or operates in a hazardous area develops a fault, it cannot be repaired immediately. A fault-tolerant design is therefore very useful for ensuring the system's reliability. This thesis presents an optimal control algorithm that can control a system with a fault. A fault can be detected as the difference between the model output and the real output, and from these differences a new model of the system can be generated with an online learning algorithm. With the new model, a controller can be generated in the form of reinforcement learning over continuous states and actions. First, we use online sparse Gaussian process (GP) regression for system modeling. With this regression algorithm, the system can be modeled in real-time experiments. However, it is hard to choose the hyper-parameters of current GP methods. We propose a new optimization algorithm based on an information perspective, with which the bias-variance trade-off can be handled. Second, using a model-based value gradient control scheme with GP reinforcement learning (RL), we obtain an optimal control algorithm that is less time-consuming. We use a dynamic framework that makes full use of both simulation from the learned model and real experiments in the given environment. Using the BEB algorithm, we obtain a stricter algorithm that addresses the exploration-exploitation trade-off. Simulation results show that the proposed algorithm outperforms the others. Our study of a learning method for unknown systems is expected to stimulate research on fault-tolerant design of intelligent robots.
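The fault-detection idea in the abstract, comparing model output against real output, can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes a plain (non-sparse, batch) GP with a squared-exponential kernel, hand-picked hyper-parameters, and a hypothetical 3-sigma residual threshold. The names `GPModel`, `is_fault`, and `n_sigma`, and the sine-wave "system", are illustrative assumptions only.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between the rows of a and b (assumed kernel choice)."""
    sq_dist = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


class GPModel:
    """Plain batch GP regressor with fixed hyper-parameters (not the online sparse variant)."""

    def __init__(self, length_scale=1.0, signal_var=1.0, noise_var=1e-2):
        self.length_scale = length_scale
        self.signal_var = signal_var
        self.noise_var = noise_var

    def fit(self, x, y):
        self.x = x
        k = rbf_kernel(x, x, self.length_scale, self.signal_var)
        self.chol = np.linalg.cholesky(k + self.noise_var * np.eye(len(x)))
        # alpha = (K + noise_var * I)^{-1} y, computed with two solves on the Cholesky factor
        self.alpha = np.linalg.solve(self.chol.T, np.linalg.solve(self.chol, y))
        return self

    def predict(self, x_new):
        """Return the predictive mean and variance (including observation noise)."""
        k_star = rbf_kernel(x_new, self.x, self.length_scale, self.signal_var)
        mean = k_star @ self.alpha
        v = np.linalg.solve(self.chol, k_star.T)
        var = self.signal_var - np.sum(v**2, axis=0) + self.noise_var
        return mean, np.maximum(var, 1e-12)


def is_fault(model, x_new, y_observed, n_sigma=3.0):
    """Flag points whose residual exceeds n_sigma predictive standard deviations.

    The threshold rule is an assumption made for this sketch.
    """
    mean, var = model.predict(x_new)
    return np.abs(y_observed - mean) > n_sigma * np.sqrt(var)


# Toy "healthy" system: y = sin(x) plus small measurement noise.
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 2.0 * np.pi, 40)[:, None]
y_train = np.sin(x_train[:, 0]) + 0.05 * rng.standard_normal(40)
model = GPModel(length_scale=1.0, noise_var=0.05**2).fit(x_train, y_train)

x_test = np.array([[1.0], [2.0]])
healthy_output = np.sin(x_test[:, 0])
faulty_output = 0.2 * np.sin(x_test[:, 0])  # hypothetical fault: output gain drops by 80%

print(is_fault(model, x_test, healthy_output))
print(is_fault(model, x_test, faulty_output))
```

In an online setting, samples flagged this way would be the ones used to relearn the model. How the thesis performs that update and selects hyper-parameters is not specified in this record.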
Advisors
Lee, Ju-Jang
Description
Korea Advanced Institute of Science and Technology (KAIST): Department of Electrical and Electronic Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2011
Identifier
467859/325007  / 020093262
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): Department of Electrical and Electronic Engineering, 2011.2, [vi, 43 p.]

Keywords

Machine Learning; Reinforcement Learning; Gaussian Process; Fault Tolerant Design

URI
http://hdl.handle.net/10203/180761
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=467859&flag=dissertation
Appears in Collection
EE-Theses_Master (Master's Theses)
Files in This Item
There are no files associated with this item.
