Deep Reinforcement Learning: Multi-Agent Continuous Control

Introduction

[Animation: a trained agent in the Reacher environment]

The files in this repository implement a DDPG (Deep Deterministic Policy Gradient) agent acting in the Reacher environment. The basic idea is that the agent controls a double-jointed arm with the goal of keeping its hand at a moving target location.

This implementation works with two versions of this environment:

  • The first version contains a single agent. It is considered solved when the agent achieves an average score of +30 over 100 consecutive episodes.
  • The second version contains 20 identical agents, each with its own copy of the environment. It is considered solved when the agents achieve an average score of +30 over 100 consecutive episodes, averaged over all agents (a sketch of this check follows the list).
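
In case it helps, here is a minimal sketch of how that moving-average criterion can be checked. The variable and function names are illustrative assumptions, not taken from this repository:

    from collections import deque
    import numpy as np

    scores_window = deque(maxlen=100)  # scores of the last 100 episodes

    def is_solved(episode_score):
        # episode_score: this episode's score, already averaged over all agents.
        scores_window.append(episode_score)
        return len(scores_window) == 100 and np.mean(scores_window) >= 30.0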

The shared mechanics of this environment are as follows:

  • Rewards: +0.1 for each time step that the agent's hand spends in the goal location.
  • State space: 33 variables, including the position, rotation, velocity, and angular velocities of the arm.
  • Action space: a vector of 4 numbers corresponding to the torques applied to the two joints. Each value in this vector must be between -1 and 1 (a short interaction sketch follows this list).
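
To make these spaces concrete, the loop below drives the environment with random actions clipped to [-1, 1]. It is a sketch, not code from this repository; it assumes the unityagents package and a macOS environment file named Reacher.app (adjust the path to your download):

    import numpy as np
    from unityagents import UnityEnvironment

    env = UnityEnvironment(file_name="Reacher.app")  # path is an assumption
    brain_name = env.brain_names[0]

    env_info = env.reset(train_mode=False)[brain_name]
    num_agents = len(env_info.agents)  # 1 or 20, depending on the version
    scores = np.zeros(num_agents)

    while True:
        # One random 4-dimensional torque vector per agent, clipped to [-1, 1].
        actions = np.clip(np.random.randn(num_agents, 4), -1, 1)
        env_info = env.step(actions)[brain_name]
        scores += env_info.rewards  # +0.1 per step with the hand on target
        if np.any(env_info.local_done):
            break

    print("Random-policy score:", np.mean(scores))
    env.close()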

The Continuous_Control.ipynb notebook solves the second (20-agent) version of this environment.

Getting Started

  1. Clone this repository.

  2. Download the environment from one of the links below. You need only select the environment that matches your operating system:

  3. Place the downloaded file(s) in the folder you cloned this repo to and unzip (or decompress) the file.

  4. Create a Python environment for this project.

  5. Activate that environment and install dependencies:

    pip install -r requirements.txt
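
     After installing, a quick import check can confirm that the core dependencies are available. The exact package set is an assumption based on a typical DDPG project (PyTorch, NumPy, and unityagents):

    import numpy as np
    import torch
    from unityagents import UnityEnvironment  # Unity ML-Agents interface

    print("NumPy:", np.__version__)
    print("PyTorch:", torch.__version__)
    print("CUDA available:", torch.cuda.is_available())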
    

Instructions

  1. Open the Continuous_Control.ipynb notebook and adjust the path to the environment file to match its name and the location where you placed it.
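
     The cell that instantiates the environment will look something like the sketch below; the file name is an assumption and depends on your operating system and which version you downloaded:

    from unityagents import UnityEnvironment

    # Example file names (assumptions): Reacher.app (macOS),
    # Reacher_Linux/Reacher.x86_64 (Linux),
    # Reacher_Windows_x86_64/Reacher.exe (Windows).
    env = UnityEnvironment(file_name="Reacher.app")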

  2. You are ready to start interacting with the environment.

    • Use the cells in sections 1, 2, and 3 to initialize and explore the environment.
    • Run the cells in section 4 to train the agent. Feel free to change the hyperparameters in ddpg_agent.py to see if you can improve training (an illustrative hyperparameter block follows this list).
    • Run the cells in section 5 to test the trained agent.
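
For orientation, DDPG implementations of this kind commonly define hyperparameters like the following near the top of ddpg_agent.py. The names and values below are an illustrative assumption, not a quote from this repository's file:

    # Illustrative DDPG hyperparameters (assumed names and values).
    BUFFER_SIZE = int(1e6)   # replay buffer size
    BATCH_SIZE = 128         # minibatch size
    GAMMA = 0.99             # discount factor
    TAU = 1e-3               # soft-update rate for the target networks
    LR_ACTOR = 1e-4          # actor learning rate
    LR_CRITIC = 1e-3         # critic learning rate
    WEIGHT_DECAY = 0         # L2 weight decay for the critic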