Learning to factorize with regularization for cooperative multi-agent reinforcement learning

Abstract
Multi-agent reinforcement learning tasks require that agents learn in a stable and scalable manner. To this end, we explore solutions in the recently popularized centralized training with decentralized execution (CTDE) regime and focus on value-based methods. VDN and QMIX are representative examples that employ centralized training to resolve instability and non-stationarity issues, and decentralized execution to render the algorithm scalable. While appropriately factorizing the joint value function into individual ones is key to decentralized execution, we find that the existing methods of value function factorization address only a fraction of game-theoretically modelable MARL tasks. We propose QREG, which takes a new approach to value function factorization: regularizing the joint value function. This approach amounts to relaxing the conditions previously assumed about the nature of the value functions. Upon relaxing those assumptions, we show that QREG covers every game satisfying a set of relatively mild conditions, and thus a wider class of games than existing factorization methods. Our simulations indicate superior performance in a variety of settings, with especially large margins in games whose payoffs penalize non-cooperative behavior more harshly.
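No files are attached to this record, so the following is only a minimal PyTorch sketch of the contrast the abstract draws: a VDN-style additive factorization of the joint value versus a joint value table that is merely regularized toward a factorizable form. The names (per_agent_q, joint_q, lambda_reg), the payoff matrix, and the exact loss shape are illustrative assumptions, not the thesis's actual QREG objective.

    import torch
    import torch.nn as nn

    # Illustrative assumption: a stateless 2-agent, 3-action cooperative matrix game.
    n_agents, n_actions = 2, 3

    # Per-agent utilities Q_i(a_i), one learnable row per agent (VDN-style components).
    per_agent_q = nn.Parameter(torch.zeros(n_agents, n_actions))

    # Joint value Q_jt(a_1, a_2): a full table, regularized toward the additive form
    # instead of being constrained to equal it.
    joint_q = nn.Parameter(torch.zeros(n_actions, n_actions))

    # Assumed payoff that punishes miscoordination harshly (not from the thesis).
    payoff = torch.tensor([[  8., -12., -12.],
                           [-12.,   0.,   0.],
                           [-12.,   0.,   0.]])

    opt = torch.optim.Adam([per_agent_q, joint_q], lr=0.05)
    lambda_reg = 0.1  # assumed regularization weight

    for step in range(2000):
        # VDN-style additive total: Q_tot(a_1, a_2) = Q_1(a_1) + Q_2(a_2).
        additive = per_agent_q[0].unsqueeze(1) + per_agent_q[1].unsqueeze(0)
        td_loss = ((joint_q - payoff) ** 2).mean()     # fit the joint value to returns
        reg_loss = ((joint_q - additive) ** 2).mean()  # pull it toward a factorizable form
        loss = td_loss + lambda_reg * reg_loss
        opt.zero_grad()
        loss.backward()
        opt.step()

    print("payoff:\n", payoff)
    print("learned joint value:\n", joint_q.detach().round())
    print("best additive approximation:\n",
          (per_agent_q[0].unsqueeze(1) + per_agent_q[1].unsqueeze(0)).detach().round())

The point of the sketch is the design choice described in the abstract: a regularizer penalizes, rather than forbids, deviation from the factorizable form, so the joint value can still represent payoffs that no purely additive decomposition matches.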
Advisors
Yi, Yung (이융)
Description
Korea Advanced Institute of Science and Technology: School of Electrical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2019
Identifier
325007
Language
eng
Description

Master's thesis - Korea Advanced Institute of Science and Technology: School of Electrical Engineering, 2019.2, [iv, 21 p.]

Keywords

Machine learning; deep learning; reinforcement learning; multi-agent reinforcement learning

URI
http://hdl.handle.net/10203/266898
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=843399&flag=dissertation
Appears in Collection
EE-Theses_Master (Master's theses)
Files in This Item
There are no files associated with this item.
