Inducing cooperation by learning to reshape rewards in semi-cooperative multi-agent reinforcement learning
(Korean title: 부분적 학습 상황에서 협력유도를 위한 보상 구조 학습 기법을 통한 다중 에이전트 강화학습)

DC Field | Value | Language
dc.contributor.advisor | Yi, Yung | -
dc.contributor.advisor | 이융 | -
dc.contributor.author | Hostallero, David Earl | -
dc.date.accessioned | 2019-09-04T02:43:50Z | -
dc.date.available | 2019-09-04T02:43:50Z | -
dc.date.issued | 2019 | -
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=843438&flag=dissertation | en_US
dc.identifier.uri | http://hdl.handle.net/10203/266899 | -
dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : School of Electrical Engineering, 2019.2, [iv, 30 p.] | -
dc.description.abstract | We propose a deep reinforcement learning algorithm for semi-cooperative multi-agent tasks, in which agents have separate reward functions yet are willing to cooperate. In these semi-cooperative scenarios, popular centralized-training-with-decentralized-execution methods for inducing cooperation and removing the non-stationarity problem do not work well, owing to the lack of a common shared reward and the poor scalability of centralized training. Our algorithm, called Peer Evaluation-based Dual DQN (PED-DQN), has agents give peer evaluation signals to the agents they observe, quantifying how they "feel" about a certain transition. Over time, this exchange of peer evaluations leads agents to gradually reshape their reward functions so that their myopic best-response action choices tend to result in a good joint action with high cooperation. This evaluation-based method also allows flexible and scalable training because it does not assume knowledge of the number of other agents or of their observation and action spaces. We evaluate PED-DQN on scenarios ranging from a simple two-person prisoner's dilemma to more complex semi-cooperative multi-agent tasks. In the special case where agents share a common reward function, as in centralized training methods, we show that inter-agent evaluation leads to better performance. | -
dc.language | eng | -
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | -
dc.subject | Machine learning; reinforcement learning; multi-agent systems | -
dc.subject | 기계 학습; 강화 학습; 다중 에이전트 시스템 (Korean-language form of the subject terms above) | -
dc.title | Inducing cooperation by learning to reshape rewards in semi-cooperative multi-agent reinforcement learning | -
dc.title.alternative | 부분적 학습 상황에서 협력유도를 위한 보상 구조 학습 기법을 통한 다중 에이전트 강화학습 (Multi-agent reinforcement learning through a reward-structure learning technique for inducing cooperation in partial learning situations) | -
dc.type | Thesis(Master) | -
dc.identifier.CNRN | 325007 | -
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST) : School of Electrical Engineering | -
dc.contributor.alternativeauthor | 호스탈레로, 다비드 얼 | -
Appears in Collection
EE-Theses_Master (Master's theses)
Files in This Item
There are no files associated with this item.
