The files in this repository implement a DDPG agent acting in the Reacher environment. The agent controls a double-jointed arm, with the goal of keeping its hand at a target location that moves over time.
This implementation works with two versions of this environment:
- The first version contains a single agent. The environment is considered solved when the agent achieves an average score of +30 over 100 consecutive episodes.
- The second version contains 20 identical agents, each with its own copy of the environment. This version is considered solved when the agents achieve an average score of +30 over 100 consecutive episodes, averaged over all agents (see the sketch below).
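To make the 20-agent criterion concrete, here is a small sketch of the computation. The `scores` array is made up for illustration; in practice it would hold each agent's total reward per episode.

```python
import numpy as np

# Hypothetical per-episode scores: shape (n_episodes, n_agents), where
# scores[i, j] is agent j's total reward in episode i.
scores = np.random.uniform(25, 35, size=(150, 20))

# Average over all agents first, giving one score per episode...
episode_means = scores.mean(axis=1)

# ...then check whether the trailing 100-episode window averages >= +30.
window = episode_means[-100:]
solved = len(window) == 100 and window.mean() >= 30.0
print(f"100-episode average: {window.mean():.2f}, solved: {solved}")
```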
The shared mechanics of this environment are as follows:
- Rewards: +0.1 for each time step where the agent's hand is in the goal location.
- State space: 33 variables (position, rotation, velocity, and angular velocity of the arm)
- Action space: a vector of 4 numbers corresponding to the torque applied to the two joints. Each value in this vector must be between -1 and 1 (see the sketch below).
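For example, a valid action for a single agent can be produced as follows. The `policy` function here is a hypothetical stand-in for a trained actor network:

```python
import numpy as np

def policy(state: np.ndarray) -> np.ndarray:
    """Hypothetical actor: maps a 33-dim state to 4 raw torque values."""
    return np.random.randn(4)  # placeholder: random torques

state = np.zeros(33)                            # a 33-dimensional observation
action = np.clip(policy(state), -1.0, 1.0)      # each torque clipped to [-1, 1]
assert action.shape == (4,)
```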
The contents of the `Continuous_Control.ipynb` notebook solve the second version of this environment.
- Clone this repository.
- Download the environment from one of the links below. You need only select the environment that matches your operating system:
  - Version 1: One (1) Agent
    - Linux: click here
    - Mac OSX: click here
    - Windows (32-bit): click here
    - Windows (64-bit): click here
  - Version 2: Twenty (20) Agents
    - Linux: click here
    - Mac OSX: click here
    - Windows (32-bit): click here
    - Windows (64-bit): click here
- Place the downloaded file(s) in the folder you cloned this repo to and unzip (or decompress) the file.
- Create a Python environment for this project.
- Activate that environment and install the dependencies: `pip install -r requirements.txt`
- Open the `Continuous_Control.ipynb` notebook and adjust the path to your desired environment file based on its name and where you placed it (a loading sketch follows this list).
- You are ready to start interacting with the environment:
  - Use the cells in sections 1, 2, and 3 to initialize and explore the environment.
  - Run the cells in section 4 to train the agent. Feel free to change the hyperparameters in `ddpg_agent.py` to see if you can improve training (a training-loop sketch also follows this list).
  - Run the cells in section 5 to test the agent.
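As a reference for the path-adjustment step above, here is a minimal sketch of loading the environment, assuming the `unityagents` package from `requirements.txt`; the `file_name` value is an example for the 20-agent Linux build and should match your own download.

```python
from unityagents import UnityEnvironment

# Example path for the 20-agent Linux build; adjust to your OS and to
# wherever you unzipped the environment.
env = UnityEnvironment(file_name='Reacher_Linux/Reacher.x86_64')

# Unity environments expose one or more "brains"; states are read from
# and actions are sent to the default (first) brain.
brain_name = env.brain_names[0]
env_info = env.reset(train_mode=True)[brain_name]

print('Number of agents:', len(env_info.agents))              # 20 for version 2
print('State size:', env_info.vector_observations.shape[1])   # 33
```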
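And here is a rough sketch of what the section-4 training loop does, continuing from the snippet above. The `Agent` constructor arguments and the `act`/`step`/`reset` methods are assumptions based on a conventional DDPG agent layout; check `ddpg_agent.py` for the actual interface.

```python
import numpy as np
from ddpg_agent import Agent  # the tunable hyperparameters live in this module

# Assumed constructor signature; verify against ddpg_agent.py.
agent = Agent(state_size=33, action_size=4, random_seed=0)

for episode in range(1, 201):                        # episode budget is illustrative
    env_info = env.reset(train_mode=True)[brain_name]
    states = env_info.vector_observations            # one row per agent
    scores = np.zeros(len(env_info.agents))
    agent.reset()                                    # e.g. reset exploration noise
    while True:
        actions = agent.act(states)                  # torques clipped to [-1, 1]
        env_info = env.step(actions)[brain_name]
        # Store the experience tuples and learn from replayed minibatches.
        agent.step(states, actions, env_info.rewards,
                   env_info.vector_observations, env_info.local_done)
        scores += env_info.rewards
        states = env_info.vector_observations
        if np.any(env_info.local_done):
            break
    print(f'Episode {episode}: mean score {scores.mean():.2f}')
```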