Dynamic value gradient control using Gaussian process reinforcement learning with hyper-parameter optimization = 상위 모수 최적화된 가우시안 과정 강화 학습을 이용한 동적 가치 경사 제어

DC Field | Value | Language
dc.contributor.advisor | Lee, Ju-Jang | -
dc.contributor.advisor | 이주장 | -
dc.contributor.author | Shin, Seung-Yong | -
dc.contributor.author | 신승용 | -
dc.date.accessioned | 2011-12-14T01:37:32Z | -
dc.date.available | 2011-12-14T01:37:32Z | -
dc.date.issued | 2011 | -
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=467859&flag=dissertation | -
dc.identifier.uri | http://hdl.handle.net/10203/36763 | -
dc.description | Master's thesis - KAIST, Department of Electrical and Electronic Engineering, 2011.2, [vi, 43 p.] | -
dc.description.abstract | When a system that handles dangerous materials or operates in a hazardous area develops a fault, it cannot be repaired immediately; a fault-tolerant design is therefore essential for system reliability. This thesis presents an optimal control algorithm for systems with faults. A fault is detected as the difference between the model output and the real output, and from this difference a new system model is generated by an online learning algorithm. With the new model, a controller is obtained through reinforcement learning over continuous states and actions. First, online sparse Gaussian process (GP) regression is used for system modeling, which allows the system to be modeled during real-time experiments. Because the hyper-parameters of the GP are hard to choose, a new information-based optimization algorithm is proposed that handles the bias-variance trade-off. Second, a model-based value gradient control scheme with GP reinforcement learning (RL) yields an optimal controller with reduced computation time, using a dynamic framework that exploits both simulation from the learned model and real experiments in the given environment. The BEB algorithm is applied to obtain a more principled treatment of the exploration-exploitation trade-off. Simulation results show that the proposed algorithm outperforms existing methods. This study of learning methods for unknown systems is expected to stimulate research on the fault-tolerant design of intelligent robots. | eng
dc.language | eng | -
dc.publisher | KAIST (한국과학기술원) | -
dc.subject | Gaussian Process | -
dc.subject | Reinforcement Learning | -
dc.subject | Machine Learning | -
dc.subject | Fault Tolerant Design | -
dc.subject | 내고장 설계 (fault-tolerant design) | -
dc.subject | 가우시안 과정 (Gaussian process) | -
dc.subject | 강화 학습 (reinforcement learning) | -
dc.subject | 기계 학습 (machine learning) | -
dc.title | Dynamic value gradient control using Gaussian process reinforcement learning with hyper-parameter optimization = 상위 모수 최적화된 가우시안 과정 강화 학습을 이용한 동적 가치 경사 제어 | -
dc.type | Thesis (Master's) | -
dc.identifier.CNRN | 467859/325007 | -
dc.description.department | KAIST, Department of Electrical and Electronic Engineering | -
dc.identifier.uid | 020093262 | -
dc.contributor.localauthor | Lee, Ju-Jang | -
dc.contributor.localauthor | 이주장 | -
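The abstract describes modeling the system with online sparse GP regression and selecting hyper-parameters to manage the bias-variance trade-off. The thesis text itself is not available in this record, so the following is only a generic sketch of GP regression with evidence-based hyper-parameter selection, not the thesis's algorithm; all function names, kernel choices, and constants here are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale, signal_var):
    """Squared-exponential kernel, a common default for GP regression."""
    d2 = (np.sum(X1**2, axis=1)[:, None]
          + np.sum(X2**2, axis=1)[None, :]
          - 2.0 * X1 @ X2.T)
    return signal_var * np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(X, y, Xs, lengthscale=1.0, signal_var=1.0, noise_var=0.1):
    """Posterior mean and variance of a GP at test inputs Xs."""
    K = rbf_kernel(X, X, lengthscale, signal_var) + noise_var * np.eye(len(X))
    Ks = rbf_kernel(X, Xs, lengthscale, signal_var)
    Kss = rbf_kernel(Xs, Xs, lengthscale, signal_var)
    L = np.linalg.cholesky(K)                      # stable inversion via Cholesky
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.diag(Kss - v.T @ v)
    return mean, var

def log_marginal_likelihood(X, y, lengthscale, signal_var, noise_var):
    """Evidence of the data under the GP; maximizing it over the
    hyper-parameters is the standard way to balance bias and variance."""
    K = rbf_kernel(X, X, lengthscale, signal_var) + noise_var * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return float(-0.5 * y @ alpha
                 - np.sum(np.log(np.diag(L)))
                 - 0.5 * len(y) * np.log(2.0 * np.pi))
```

In practice one would maximize `log_marginal_likelihood` over a grid or by gradient ascent to pick the lengthscale and variances; a short lengthscale fits the data closely (low bias, high variance), a long one smooths it (high bias, low variance).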
Appears in Collection
EE-Theses_Master(석사논문)
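The abstract also cites the BEB algorithm for the exploration-exploitation trade-off. BEB (Bayesian Exploration Bonus, Kolter and Ng) adds to each state-action value a bonus that decays with its visit count, so rarely tried actions are preferred early and the greedy policy takes over as counts grow. The sketch below shows only that bonus shape on a toy two-armed bandit; the environment, learning rate, and constants are our illustrative assumptions, not the thesis's setup.

```python
import random
from collections import defaultdict

def beb_bonus(count, beta=1.0):
    """BEB-style exploration bonus: beta / (1 + visit count)."""
    return beta / (1.0 + count)

def run_bandit(steps=2000, seed=0):
    """Greedy action selection on value-plus-bonus for a 2-armed bandit."""
    rng = random.Random(seed)
    q = defaultdict(float)              # running value estimate per arm
    n = defaultdict(int)                # visit count per arm
    payout = {0: 0.3, 1: 0.7}           # true mean reward of each arm
    alpha = 0.1                         # learning rate for the value update
    for _ in range(steps):
        # act greedily on estimated value plus the decaying bonus
        a = max(payout, key=lambda arm: q[arm] + beb_bonus(n[arm]))
        r = 1.0 if rng.random() < payout[a] else 0.0
        n[a] += 1
        q[a] += alpha * (r - q[a])
    return q, n

q, n = run_bandit()
```

Because the bonus shrinks as `1/(1 + n)`, exploration is front-loaded: both arms are sampled at first, and the better arm ends up with almost all the pulls.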