Self-supervised exploration for cooperative multi-agent reinforcement learning

Learning in sparse-reward environments remains challenging for standard cooperative multi-agent reinforcement learning (MARL) algorithms. Because extrinsic rewards are sparse, agents receive little signal to motivate or direct their exploration of the environment. An effective approach for encouraging exploration in the single-agent setting is to give the agent the prediction error of a novelty module as an intrinsic reward. This novelty module is trained to predict the agent’s next state given its current state and action, so rewarding its prediction error motivates the agent to explore parts of the environment that are novel to it. In this work, we extend this self-supervised exploration method to cooperative MARL. Unlike in single-agent environments, exploration in cooperative multi-agent environments is more efficient when agents coordinate how they explore. We propose a new novelty module architecture and intrinsic reward formulation that encourage coordinated exploration. In particular, we design a two-headed novelty module that learns to predict both the agent’s next state and the joint next state of all agents. Each agent then receives as intrinsic reward the sum of the individual prediction error and the joint prediction error of this two-headed novelty module. We demonstrate in two sparse-reward cooperative navigation scenarios that, among the variants compared, the combination of our novelty module architecture and intrinsic reward formulation improves the performance of standard cooperative MARL algorithms the most.
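The intrinsic reward described in the abstract can be illustrated with a minimal NumPy sketch. This is not the thesis implementation. The class name `TwoHeadedNoveltyModule`, the shared tanh trunk, the layer sizes, the use of mean squared error, and the choice to feed only the agent's own state and action as input are all assumptions made for illustration. For brevity, the hypothetical `update_heads` step trains only the two linear heads and leaves the trunk fixed.

```python
import numpy as np


class TwoHeadedNoveltyModule:
    """Illustrative two-headed predictor (hypothetical names and sizes).

    Head 1 predicts the agent's own next state.
    Head 2 predicts the joint next state of all agents.
    """

    def __init__(self, state_dim, action_dim, n_agents, hidden_dim=32, seed=0):
        rng = np.random.default_rng(seed)
        in_dim = state_dim + action_dim
        self.w_trunk = rng.normal(0.0, 0.1, (in_dim, hidden_dim))
        self.w_own = rng.normal(0.0, 0.1, (hidden_dim, state_dim))
        self.w_joint = rng.normal(0.0, 0.1, (hidden_dim, n_agents * state_dim))

    def _hidden(self, state, action):
        # Shared features computed from the agent's own state and action (assumption).
        return np.tanh(np.concatenate([state, action]) @ self.w_trunk)

    def predict(self, state, action):
        h = self._hidden(state, action)
        return h @ self.w_own, h @ self.w_joint

    def intrinsic_reward(self, state, action, next_state, joint_next_state):
        # Sum of the individual and joint prediction errors (MSE is an assumption).
        pred_own, pred_joint = self.predict(state, action)
        own_error = float(np.mean((pred_own - next_state) ** 2))
        joint_error = float(np.mean((pred_joint - joint_next_state) ** 2))
        return own_error + joint_error

    def update_heads(self, state, action, next_state, joint_next_state, lr=0.05):
        # One gradient step on the two heads only; the trunk stays fixed for brevity.
        h = self._hidden(state, action)
        own_err = h @ self.w_own - next_state
        joint_err = h @ self.w_joint - joint_next_state
        # Gradient of the mean squared error with respect to each head's weights.
        self.w_own -= lr * np.outer(h, 2.0 * own_err / own_err.size)
        self.w_joint -= lr * np.outer(h, 2.0 * joint_err / joint_err.size)
```

In this sketch, calling `update_heads` repeatedly on the same transition lowers the value returned by `intrinsic_reward` for that transition, which mirrors the idea that familiar transitions become less rewarding while unfamiliar ones keep a high prediction error.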
Advisors
Yi, Yung (이융)
Description
Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2020
Identifier
325007
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering, 2020.8, [iii, 21 p.]

Keywords

Deep learning; multi-agent reinforcement learning; sparse reward; exploration; self-supervised learning

URI
http://hdl.handle.net/10203/285075
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=925239&flag=dissertation
Appears in Collection
EE-Theses_Master (Master's theses)
Files in This Item
There are no files associated with this item.
