This project implements the third project of the Udacity Deep Reinforcement Learning Nanodegree.
It makes use of Unity ML Agents to set up, run, and learn to play the Tennis environment. In this environment the goal is for two agents controlling paddles to learn to "rally" in a tennis-like setting.
The environment is episodic, and continues to run until the ball hits the ground or goes out of bounds. Both the paddles and the ball move along a 2-dimensional plane.
Here is a gif of the agents playing with the weights learned during training (these are the included `.pth` files in the project directory):
The environment state for each agent is described by 3 vector observations of size 8. For practical purposes these states are used exactly as returned by the environment, as a 24-dimensional NumPy array. An example starting state for a single agent looks like this:
[ 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0.
0. 0. 0. 0. -6.63803244 -1.5
-0. 0. 6.00063038 6. -0. 0. ]
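As a rough illustration of that layout, the sketch below splits the 24 values into three frames of eight. It assumes the observations are concatenated in order with the newest frame last, which the leading zeros in the example suggest but which is not confirmed here. `split_frames` is a hypothetical helper, not part of the project code.

```python
# Example starting state for one agent, copied from the array above.
state = [
    0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
    0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
    0.0, 0.0, 0.0, 0.0, -6.63803244, -1.5,
    -0.0, 0.0, 6.00063038, 6.0, -0.0, 0.0,
]

FRAME_SIZE = 8  # each vector observation has 8 values


def split_frames(flat_state, frame_size=FRAME_SIZE):
    """Split a flat state into consecutive frames of `frame_size` values."""
    if len(flat_state) % frame_size != 0:
        raise ValueError("state length must be a multiple of frame_size")
    return [
        list(flat_state[i:i + frame_size])
        for i in range(0, len(flat_state), frame_size)
    ]


frames = split_frames(state)
for index, frame in enumerate(frames):
    print(index, frame)
```

With a NumPy array, `state.reshape(3, 8)` produces the same grouping.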
The action space is a 2-dimensional vector for each agent. Each element in the vector is a continuous value representing movement along each dimension.
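For illustration, the sketch below draws a random action vector for each of the two agents and clips every element to a fixed range. The [-1, 1] bounds are an assumption (the description above only says the values are continuous), and `clip` and `random_actions` are hypothetical helpers rather than project code.

```python
import random

NUM_AGENTS = 2          # two paddles
ACTION_SIZE = 2         # one continuous value per movement dimension
LOW, HIGH = -1.0, 1.0   # assumed bounds, not stated in this README


def clip(value, low=LOW, high=HIGH):
    """Clamp a single action element to [low, high]."""
    return max(low, min(high, value))


def random_actions(rng, num_agents=NUM_AGENTS, action_size=ACTION_SIZE):
    """Return one clipped random action vector per agent."""
    return [
        [clip(rng.gauss(0.0, 1.0)) for _ in range(action_size)]
        for _ in range(num_agents)
    ]


actions = random_actions(random.Random(0))
print(actions)
```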
The Tennis environment is considered "solved" when the average per-episode score over 100 episodes exceeds 0.5. The per-episode score is defined as the maximum of the two agents' individual episode scores.
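The criterion can be sketched as a moving average over the last 100 per-episode scores. This is a minimal sketch, not the notebook's code: it assumes the window must be full before the environment counts as solved, and both `first_solved_episode` and the score history are made up for the example.

```python
from collections import deque

WINDOW = 100   # number of episodes in the moving average
TARGET = 0.5   # average score that must be exceeded


def episode_score(agent_scores):
    """Per-episode score: the maximum of the agents' individual scores."""
    return max(agent_scores)


def first_solved_episode(per_agent_scores, window=WINDOW, target=TARGET):
    """Return the 1-based episode where the average first exceeds target, else None."""
    recent = deque(maxlen=window)  # old scores drop out automatically
    for episode, agent_scores in enumerate(per_agent_scores, start=1):
        recent.append(episode_score(agent_scores))
        if len(recent) == window and sum(recent) / window > target:
            return episode
    return None


# Hypothetical history: 100 weak episodes followed by 100 strong ones.
history = [(0.0, 0.1)] * 100 + [(1.0, 0.9)] * 100
print(first_solved_episode(history))
```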
These instructions assume a recent version of macOS and were tested on Mojave (v10.14).
- Ensure Python 3.6 is installed. This can be checked by running `python --version` from the command line (or occasionally `python3 --version`). If not installed, it can be retrieved here.
- Ensure "Tennis.app" (included in the repo) opens correctly. Double-clicking it in Finder should display a blank environment.
- Install the Python runtime dependencies listed in `requirements.txt` by running `pip install -r requirements.txt` from the top level of this repo.
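The version check in the first step can also be done from inside Python, for example at the top of a notebook. The snippet below is a small sketch. `is_supported` is a hypothetical helper, and treating only 3.6 as supported is an assumption based on the step above.

```python
import sys

REQUIRED = (3, 6)  # major, minor version named in the setup steps


def is_supported(version_info=None, required=REQUIRED):
    """Return True if the interpreter's major.minor version matches `required`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == required


if not is_supported():
    found = tuple(sys.version_info[:2])
    print("Warning: expected Python %d.%d, found %d.%d" % (REQUIRED + found))
```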
For ease of navigation and visibility, all of the relevant classes and code to train the agent from scratch are implemented in the IPython notebook `final-maddpg.ipynb`. To begin training, simply load the notebook and select "Cell -> Run All". If the notebook execution appears to hang, ensure that all Unity environments have been properly closed and Python kernels restarted before trying again.