Sample-efficient and safe deep reinforcement learning via reset deep ensemble agents

Deep reinforcement learning (RL) has achieved remarkable success in solving complex tasks through its integration with deep neural networks (DNNs) as function approximators. However, the reliance on DNNs has introduced a new challenge called primacy bias, whereby these function approximators tend to prioritize early experiences, leading to overfitting. To mitigate this primacy bias, a reset method has been proposed, which periodically re-initializes a portion or the entirety of a deep RL agent while preserving the replay buffer. However, the reset method can cause a performance collapse immediately after each reset, which is detrimental from the perspective of safe RL and regret minimization. In this paper, we propose a new reset-based method that leverages deep ensemble learning to address the limitations of the vanilla reset method and enhance sample efficiency. The proposed method is evaluated through various experiments, including those in the domain of safe RL. Numerical results demonstrate its effectiveness in terms of both sample efficiency and safety.
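The following is a minimal, illustrative Python sketch of the two ideas the abstract refers to, not the thesis's exact algorithm: (1) the vanilla reset method, which periodically re-initializes the agent's parameters while keeping the replay buffer, and (2) an ensemble variant in which members are reset in a staggered, round-robin fashion so that a freshly reset member does not immediately drive behavior. The names make_agent, Agent.act, Agent.update, and replay_buffer.sample are hypothetical placeholders assumed for this sketch.

    class ResetEnsemble:
        """Ensemble of agents with staggered periodic resets and a shared replay buffer."""

        def __init__(self, make_agent, n_agents=3, reset_period=100_000):
            self.make_agent = make_agent                    # factory returning a freshly initialized agent
            self.agents = [make_agent() for _ in range(n_agents)]
            self.reset_period = reset_period                # environment steps between resets
            self.next_to_reset = 0                          # round-robin reset index
            self.steps_since_reset = [0] * n_agents

        def maybe_reset(self, total_steps):
            # Vanilla reset would re-initialize the single agent here; the ensemble
            # variant re-initializes only one member, so the others retain their
            # learned behavior and the post-reset performance collapse is softened.
            if total_steps > 0 and total_steps % self.reset_period == 0:
                i = self.next_to_reset
                self.agents[i] = self.make_agent()          # fresh parameters, buffer untouched
                self.steps_since_reset[i] = 0
                self.next_to_reset = (i + 1) % len(self.agents)

        def act(self, obs):
            # Simplifying assumption for this sketch: act with the member that has
            # trained longest since its last reset, so a just-reset member does not
            # control behavior right away.
            i = max(range(len(self.agents)), key=lambda k: self.steps_since_reset[k])
            return self.agents[i].act(obs)

        def update(self, replay_buffer, batch_size=256):
            # All members, including freshly reset ones, train on the same preserved
            # replay buffer, which is what lets a reset member recover quickly.
            for i, agent in enumerate(self.agents):
                agent.update(replay_buffer.sample(batch_size))
                self.steps_since_reset[i] += 1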
Advisors
성영철
Description
한국과학기술원 (KAIST): 전기및전자공학부 (School of Electrical Engineering)
Publisher
한국과학기술원 (KAIST)
Issue Date
2024
Identifier
325007
Language
eng
Description

Master's thesis - 한국과학기술원 (KAIST): 전기및전자공학부 (School of Electrical Engineering), 2024.2, [iv, 31 p.]

Keywords

Reinforcement learning; Ensemble learning; Safe reinforcement learning; Sample efficiency

URI
http://hdl.handle.net/10203/321581
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1096799&flag=dissertation
Appears in Collection
EE-Theses_Master (Master's Theses)
Files in This Item
There are no files associated with this item.
