Multiagent deep reinforcement learning research project

Deep Multi-Agent Reinforcement Learning in a Common-Pool Resource System

⚠️ Attention: this repository uses TensorFlow v1 APIs, which have been deprecated in favor of TensorFlow v2. As a result, the existing source code is not expected to run as-is on current TensorFlow releases.

This project contains the source code for the paper "Deep Multi-agent Reinforcement Learning in a Common-Pool Resource System", published at IEEE CEC 2019.

Introduction

In complex social-ecological systems, multiple agents with diverse objectives take actions that affect the long-term dynamics of the system. Common pool resources are a subset of such systems, where property rights are typically poorly defined and dynamics are unknown a priori, creating a social dilemma reflected by the well-known tragedy of the commons. In this paper, we investigated the efficacy of deep reinforcement learning in a multi-agent setting of a common pool resource system. We used an abstract mathematical model of the system, represented as a partially-observable general-sum Markov game. In the first set of experiments, the independent agents used a Deep Q-Network with discrete action spaces to guide decision-making. However, significant shortcomings were evident. Consequently, in a second set of experiments, a Deep Deterministic Policy Gradient learning model with continuous state and action spaces guided agent learning. Simulation results show that agents performed significantly better in terms of both sustainability and economic goals when using the second deep learning model. Although agents have neither perfect foresight nor an understanding of the implications of their "harvesting" efforts, deep reinforcement learning can be used effectively to "learn in the commons".

A demonstration of a simplified CPR system [Hauser, Oliver P., et al.]:

Prerequisite

Make sure you have Python 3.11 installed on your machine.

From the root directory, run the following command to install the package:

pip install .

Run the demo

./run_demo

Models

You can choose between two models: a Deep Q-Network (DQN) and Deep Deterministic Policy Gradient (DDPG).

The agent-environment interaction

(Figure: agent-environment interaction diagram)
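
The agent-environment interaction can be sketched in plain Python. This is an illustrative toy, not the model from the paper: the `ToyCommons` class, the logistic regrowth rule, the equal-slice harvest rule, and every parameter value are assumptions made for this example. Each independent agent observes only the normalized stock level (a partial observation), submits a harvesting effort in [0, 1], and receives its own harvest as reward.

```python
class ToyCommons:
    """Hypothetical common-pool resource with logistic regrowth (an assumption)."""

    def __init__(self, n_agents, capacity=100.0, growth=0.3, stock=50.0):
        self.n_agents = n_agents
        self.capacity = capacity
        self.growth = growth
        self.stock = stock

    def observe(self):
        # Every agent sees only the normalized stock, not the others' efforts.
        return [self.stock / self.capacity] * self.n_agents

    def step(self, efforts):
        # Clip efforts to [0, 1]; full effort claims an equal slice of the stock.
        efforts = [min(max(e, 0.0), 1.0) for e in efforts]
        harvests = [e * self.stock / self.n_agents for e in efforts]
        remaining = self.stock - sum(harvests)
        # Logistic regrowth of whatever is left after harvesting.
        regrowth = self.growth * remaining * (1.0 - remaining / self.capacity)
        self.stock = max(remaining + regrowth, 0.0)
        # The per-agent harvest doubles as the per-agent reward.
        return self.observe(), harvests


def run_episode(policy, n_agents=4, steps=50):
    """Run independent agents that all follow `policy(observation) -> effort`."""
    env = ToyCommons(n_agents)
    totals = [0.0] * n_agents
    observations = env.observe()
    for _ in range(steps):
        efforts = [policy(obs) for obs in observations]
        observations, rewards = env.step(efforts)
        totals = [t + r for t, r in zip(totals, rewards)]
    return env.stock, totals


greedy_stock, greedy_totals = run_episode(lambda obs: 1.0)
modest_stock, modest_totals = run_episode(lambda obs: 0.15)
```

Under these assumed parameters, the all-out policy exhausts the stock in the first step, while the restrained policy keeps the stock near a steady level and collects more reward over the episode. A learned policy would replace the fixed lambdas.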

DQN Architecture

(Figure: DQN network architecture)
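
A DQN produces one Q-value per discrete action, so a continuous harvesting effort has to be discretized before the agent can choose it. The following is a minimal sketch of epsilon-greedy selection and the one-step Q-learning target. The five effort levels, the function names, and the `epsilon` and `gamma` values are assumptions for illustration, and the Q-values are written by hand rather than produced by a trained network.

```python
import random

# Hypothetical discretization of harvesting effort into five levels.
EFFORT_LEVELS = [0.0, 0.25, 0.5, 0.75, 1.0]


def select_action(q_values, epsilon, rng=random):
    """Return an action index: random with probability epsilon, else argmax Q."""
    if len(q_values) != len(EFFORT_LEVELS):
        raise ValueError("expected one Q-value per effort level")
    if rng.random() < epsilon:
        return rng.randrange(len(EFFORT_LEVELS))
    return max(range(len(q_values)), key=lambda i: q_values[i])


def td_target(reward, next_q_values, gamma=0.99, done=False):
    """One-step Q-learning target: r + gamma * max over next-state Q-values."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


# Hand-written Q-values standing in for a network's output.
q = [0.1, 0.4, 0.9, 0.3, 0.2]
greedy_index = select_action(q, epsilon=0.0)
greedy_effort = EFFORT_LEVELS[greedy_index]
target = td_target(reward=1.0, next_q_values=q, gamma=0.5)
```

The grid resolution limits how finely an agent can tune its effort, which is one plausible reason a discrete action space can be restrictive in this setting.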

DDPG Architecture

(Figure: DDPG network architecture)
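
In DDPG the actor maps a state directly to a continuous action, so effort does not need to be discretized. Two mechanics commonly associated with DDPG are sketched here: exploration noise added to the actor's deterministic output, and soft (Polyak) updates of the target network weights. Gaussian noise is an assumption chosen for simplicity, as are the function names and the `sigma` and `tau` values; weights are plain lists of floats rather than network tensors.

```python
import random


def noisy_action(deterministic_action, sigma=0.1, low=0.0, high=1.0, rng=random):
    """Add Gaussian exploration noise to the actor's output and clip to bounds."""
    noisy = deterministic_action + rng.gauss(0.0, sigma)
    return min(max(noisy, low), high)


def soft_update(target_weights, online_weights, tau=0.01):
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return [
        tau * w_online + (1.0 - tau) * w_target
        for w_target, w_online in zip(target_weights, online_weights)
    ]


# Seeded generator so the exploration samples are reproducible.
rng = random.Random(0)
actions = [noisy_action(0.95, sigma=0.2, rng=rng) for _ in range(100)]

# Move the target weights one tenth of the way toward the online weights.
new_target = soft_update([0.0, 1.0], [1.0, 1.0], tau=0.1)
```

Clipping keeps the noisy effort inside the valid range, and a small `tau` makes the target network trail the online network slowly, which is the usual motivation for soft updates.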

References

  1. von der Osten, F. B., Kirley, M., & Miller, T. (2017). Sustainability is possible despite greed - Exploring the nexus between profitability and sustainability in common pool resource systems. Scientific Reports, 7(1), 2307.
  2. Hausknecht, M., & Stone, P. (2015). Deep recurrent Q-learning for partially observable MDPs. CoRR, abs/1507.06527.
  3. Hauser, O. P., Rand, D. G., Peysakhovich, A., & Nowak, M. A. (2014). Cooperating with the future. Nature, 511(7508), 220.
  4. Mnih, V., Kavukcuoglu, K., Silver, D., et al. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529.
  5. Kulkarni, T. D., Narasimhan, K., Saeedi, A., et al. (2016). Hierarchical deep reinforcement learning: Integrating temporal abstraction and intrinsic motivation. In Advances in Neural Information Processing Systems (pp. 3675-3683).
  6. https://github.com/Ceruleanacg/Reinforcement-Learning