Inducing Cooperation through Reward Reshaping based on Peer Evaluations in Deep Multi-Agent Reinforcement Learning

We propose a deep reinforcement learning algorithm for semi-cooperative multi-agent tasks, in which agents have separate reward functions yet some willingness to cooperate. Intuitively, defining and directly maximizing a global reward function induces cooperation, since agents then have no notion of selfishness. However, this may not be the best way to induce cooperation, due to the problems that arise when training multiple agents with a single reward (e.g., credit assignment). Moreover, agents may be given separate reward functions intentionally, to induce task prioritization, whereas a global reward function may be difficult to define without diluting the effects of the different tasks and causing some of their reward factors to be disregarded. In our algorithm, called Peer Evaluation-based Dual DQN (PED-DQN), agents send peer evaluation signals to the agents they observe, quantifying how they strategically value a given transition. This exchange of peer evaluations over time leads agents to gradually reshape their reward functions, so that the action each chooses as its myopic best response tends to yield a more cooperative joint action.
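A minimal sketch of the reshaping idea described above (not the authors' code): we assume each observing agent emits an evaluation of a transition, here taken to be its TD error, and the evaluated agent mixes the mean peer signal into its own reward with a weight `beta`. Both the choice of TD error as the evaluation signal and the name `beta` are illustrative assumptions.

```python
def peer_evaluation(reward, value_next, value_now, gamma=0.99):
    """One observing agent's evaluation of a transition: its TD error,
    i.e., how strategically (un)favorable the transition looks to it.
    (Using the TD error here is an assumption for illustration.)"""
    return reward + gamma * value_next - value_now

def reshape_reward(own_reward, peer_evals, beta=0.5):
    """Mix the mean peer evaluation into the agent's own reward,
    nudging its myopic best response toward cooperative joint actions."""
    if not peer_evals:
        return own_reward
    return own_reward + beta * sum(peer_evals) / len(peer_evals)

# Example: an agent's selfish reward is 1.0, but two peers judge the
# transition negatively (-0.4 and -0.8), pulling the reshaped reward down.
reshaped = reshape_reward(1.0, [-0.4, -0.8], beta=0.5)  # 1.0 + 0.5 * (-0.6) = 0.7
```

Over many transitions, repeatedly folding such peer signals into each agent's reward gradually reshapes what each agent optimizes, which is the mechanism the abstract describes.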
Publisher
Association for Computing Machinery, Inc
Issue Date
2020-05-11
Language
English
Citation

ACM AAMAS '20: International Conference on Autonomous Agents and Multiagent Systems, pp.520 - 528

URI
http://hdl.handle.net/10203/277764
Appears in Collection
EE-Conference Papers (학술회의논문: conference papers)