This is the third project in the Udacity Deep Reinforcement Learning Nanodegree.
This project works with the Tennis environment.
In this environment, two agents control rackets to bounce a ball over a net. If an agent hits the ball over the net, it receives a reward of +0.1. If an agent lets a ball hit the ground or hits the ball out of bounds, it receives a reward of -0.01. Thus, the goal of each agent is to keep the ball in play.
The observation space consists of 8 variables corresponding to the position and velocity of the ball and racket. Each agent receives its own, local observation. Two continuous actions are available, corresponding to movement toward (or away from) the net, and jumping.
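To make the shapes concrete, here is a minimal sketch of what one step of actions for both agents could look like. It uses the sizes stated above (two agents, 8 observation variables, 2 continuous actions). The clipping range of [-1, 1] and the helper name `random_actions` are assumptions for illustration, not taken from this repository.

```python
import numpy as np

NUM_AGENTS = 2   # two rackets, one agent each
STATE_SIZE = 8   # observation variables per agent, as described above
ACTION_SIZE = 2  # movement toward/away from the net, and jumping


def random_actions(rng, num_agents=NUM_AGENTS, action_size=ACTION_SIZE):
    """Return one random continuous action vector per agent.

    Assumption: each action entry is expected to lie in [-1, 1],
    so the sampled values are clipped to that range.
    """
    actions = rng.standard_normal((num_agents, action_size))
    return np.clip(actions, -1.0, 1.0)


rng = np.random.default_rng(0)
actions = random_actions(rng)
print(actions.shape)  # one row per agent, one column per action
```

A trained DDPG actor would replace the random sampling, mapping each agent's local observation to its own row of this array.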
It is recommended to follow the Udacity DRL ND dependencies instructions here
This project uses Unity ML-Agents, NumPy and PyTorch.
A prebuilt simulator must be installed. Select the environment that matches your operating system:
Linux: click here
Mac OSX: click here
Windows (64-bit): click here
The file needs to be placed in the root directory of the repository and unzipped.
Next, before running the notebook code cell that starts the environment from the prebuilt Udacity app, change the file_name parameter to match the location of the Unity environment that you downloaded.
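As a rough sketch of setting file_name, the snippet below picks a path based on the operating system. The three paths are assumptions about what the unzipped download looks like in the repository root, so adjust them to your actual files. The `unityagents` import is likewise assumed to be the package installed by the Udacity dependencies, and the helper names are hypothetical.

```python
import platform

# Assumed locations of the unzipped simulator, relative to the repository
# root. These are illustrative; check the names of your extracted files.
ENV_PATHS = {
    "Linux": "Tennis_Linux/Tennis.x86_64",
    "Darwin": "Tennis.app",
    "Windows": "Tennis_Windows_x86_64/Tennis.exe",
}


def tennis_env_path(system=None):
    """Return the assumed simulator path for the given OS name."""
    system = system or platform.system()
    if system not in ENV_PATHS:
        raise ValueError("No Tennis build listed for system: " + system)
    return ENV_PATHS[system]


def make_env(file_name=None):
    """Start the Unity environment (requires the simulator on disk)."""
    # Imported lazily so the path helper works without Unity installed.
    from unityagents import UnityEnvironment
    return UnityEnvironment(file_name=file_name or tennis_env_path())
```

Calling `make_env()` launches the simulator window, so it should only be run once per kernel session.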
Then run the Tennis.ipynb notebook using the drlnd kernel to train the DDPG agent.
Once trained, the model weights are saved in the same directory in the files checkpoint1_actor0.pth, checkpoint1_actor1.pth and checkpoint1_critic.pth.
The Trained Agent.ipynb notebook loads the model weights to run the trained agents against the simulator.
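The save-and-reload cycle for the actor checkpoints can be sketched with PyTorch's state_dict mechanism. The network in `make_actor` is a hypothetical stand-in, not the architecture used in the notebooks, and the helper names are illustrative. Only the checkpoint file names come from this README. The critic could be handled the same way with checkpoint1_critic.pth.

```python
import os

import torch
import torch.nn as nn


def make_actor(state_size=8, action_size=2):
    """Hypothetical actor: maps an observation to actions in [-1, 1]."""
    return nn.Sequential(
        nn.Linear(state_size, 64),
        nn.ReLU(),
        nn.Linear(64, action_size),
        nn.Tanh(),
    )


def save_actors(actors, directory="."):
    """Save each actor's weights as checkpoint1_actor<i>.pth."""
    paths = []
    for i, actor in enumerate(actors):
        path = os.path.join(directory, "checkpoint1_actor{}.pth".format(i))
        torch.save(actor.state_dict(), path)
        paths.append(path)
    return paths


def load_actor(path, state_size=8, action_size=2):
    """Rebuild the same architecture and load the saved weights into it."""
    actor = make_actor(state_size, action_size)
    actor.load_state_dict(torch.load(path, map_location="cpu"))
    actor.eval()  # inference mode for running against the simulator
    return actor
```

Loading only works if the rebuilt network has the same layer shapes as the one that was saved, so the architecture definition must match between the training and evaluation notebooks.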