Udacity Deep Reinforcement Learning Nanodegree - Practice 1
This is a Deep Reinforcement Learning exercise. The environment is a 3D space. The state space has 37 dimensions and the action space has 4 discrete actions. The Unity environment returns each state as a vector of 37 values.
States look like: [1. 0. 0. 0. 0.84408134 0.
0. 1. 0. 0.0748472 0. 1.
0. 0. 0.25755 1. 0. 0.
0. 0.74177343 0. 1. 0. 0.
0.25854847 0. 0. 1. 0. 0.09355672
0. 1. 0. 0. 0.31969345 0.
0. ]
The agent can take 4 actions to move around and has to collect the yellow bananas (+1 score each). If the agent gets a blue banana, it scores -1. The actions are:
0 - walk forward
1 - walk backward
2 - turn left
3 - turn right
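The action indices above can be sketched as a small enum, together with one common way of choosing among them. This is only an illustration: epsilon-greedy selection is an assumption (the README does not say how the agent explores), and the names Action, epsilon_greedy and q_values are hypothetical and do not come from the notebook.

```python
import random
from enum import IntEnum


class Action(IntEnum):
    """The four discrete actions listed in this README."""
    FORWARD = 0
    BACKWARD = 1
    TURN_LEFT = 2
    TURN_RIGHT = 3


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one.

    q_values is a hypothetical sequence of 4 estimated action values,
    for example the output of a Q-network for a single state.
    """
    if len(q_values) != len(Action):
        raise ValueError("expected one value per action")
    if rng.random() < epsilon:
        # Explore: any of the four actions, uniformly at random.
        return Action(rng.randrange(len(Action)))
    # Exploit: index of the largest estimated value (first one on ties).
    best = max(range(len(q_values)), key=lambda i: q_values[i])
    return Action(best)


if __name__ == "__main__":
    print(epsilon_greedy([0.1, 0.5, 0.2, 0.0], epsilon=0.0).name)
```

With epsilon set to 0 the choice is purely greedy; with epsilon set to 1 it is purely random.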
This is an episodic problem with a continuous state space. The objective of the exercise is to use a DRL algorithm that reaches an average score of 15.0 in fewer than 1800 episodes.
To install and use the environment, follow these steps:
1 Install the Anaconda suite and create (and activate) a new Python environment
conda create --name drlnd python=3.6
2 Install OpenAI Gym
pip install gym
3 Install the Unity Environment App (not Unity SDK!)
- Linux: click here
- Mac OSX: click here
- Windows (32-bit): click here
- Windows (64-bit): click here
4 Clone this repository and install other dependencies:
pip install ./python
5 Run Navigation.ipynb (Jupyter Notebook) and watch how it trains a new model and later uses it to play on its own.
To train a new agent, open the Navigation.ipynb Jupyter Notebook file.
Follow the steps described in each cell.
You will find a dqn function that trains until np.mean(scores_window)>=15.0
This means that training stops once the average score over the window of recent episodes is at least 15.
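The stopping rule can be sketched in plain Python as below. This is a minimal sketch, not the notebook's dqn function: run_episode is a hypothetical callable standing in for one training episode, and the window length of 100 is an assumption, since the README does not say how many episodes scores_window holds.

```python
from collections import deque

TARGET_SCORE = 15.0  # threshold stated in this README
WINDOW_SIZE = 100    # assumption: the window length is not stated in the README


def train_until_solved(run_episode, max_episodes=1800, window_size=WINDOW_SIZE):
    """Run episodes until the windowed mean score reaches TARGET_SCORE.

    run_episode is a hypothetical callable that takes the episode index
    and returns that episode's score.
    Returns (episode at which the target was reached or None, all scores).
    """
    scores = []
    scores_window = deque(maxlen=window_size)  # keeps only the latest scores
    for i_episode in range(1, max_episodes + 1):
        score = run_episode(i_episode)
        scores.append(score)
        scores_window.append(score)
        # Same value as np.mean(scores_window), computed without NumPy.
        mean_score = sum(scores_window) / len(scores_window)
        if mean_score >= TARGET_SCORE:
            return i_episode, scores
    return None, scores


if __name__ == "__main__":
    # Toy stand-in for training: the score simply grows with the episode index.
    solved_at, history = train_until_solved(lambda i: float(i), window_size=3)
    print(solved_at, len(history))
```

Note that the mean is taken over however many scores the window currently holds, so the check can pass before the window is full.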
Then it loads the saved agent from the file 'checkpoint.pth' and runs 3 episodes that you can enjoy in real time :-)