This is the solution for the Udacity DeepRL Unity project.
The solution is mostly a refactoring of a previously studied DQN implementation.
The goal of the agent is to collect as many yellow bananas as possible while avoiding blue bananas. The minimum requirement for success is an average score of at least 13.0 over a window of 100 consecutive episodes.
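The success criterion above can be checked with a rolling window over the per-episode scores. The following is a minimal sketch, not the code used in this repository; the function name `first_solved_episode` and its parameters are assumptions made for illustration.

```python
from collections import deque


def first_solved_episode(scores, target=13.0, window=100):
    """Return the 1-based episode at which the mean of the last `window`
    scores first reaches `target`, or None if it never does.

    Hypothetical helper for illustration; not part of this repository.
    """
    recent = deque(maxlen=window)  # keeps only the most recent `window` scores
    for episode, score in enumerate(scores, start=1):
        recent.append(score)
        # Only test once a full window of episodes is available.
        if len(recent) == window and sum(recent) / window >= target:
            return episode
    return None
```

Calling it with the list of scores collected during training tells you at which episode, if any, the environment counts as solved.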
The agent runs on Python 3.6 + PyTorch.
State space: a feature vector of 37 floats that describes the 3D world.
Action space: 4 discrete actions.
Reward: +1 for collecting a yellow banana and -1 for collecting a blue banana.
Training is episodic, with 300 steps per episode.
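To make the interaction loop concrete, here is a rough sketch of one episode using the sizes listed above (37 state features, 4 actions, 300 steps). `StubBananaEnv` is a hypothetical stand-in that emits random states and rewards; it is not the Unity simulator and does not reproduce its API, and `run_episode` is likewise an illustrative name.

```python
import random

STATE_SIZE = 37    # length of the state feature vector
ACTION_SIZE = 4    # number of discrete actions
MAX_STEPS = 300    # steps per episode


class StubBananaEnv:
    """Hypothetical stand-in for the simulator: random states and rewards."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.t = 0

    def _state(self):
        return [self.rng.random() for _ in range(STATE_SIZE)]

    def reset(self):
        self.t = 0
        return self._state()

    def step(self, action):
        if not 0 <= action < ACTION_SIZE:
            raise ValueError("action out of range")
        self.t += 1
        # Reward is +1, -1, or 0 (most steps collect nothing).
        reward = self.rng.choice([-1, 0, 0, 0, 1])
        done = self.t >= MAX_STEPS
        return self._state(), reward, done


def run_episode(env, policy):
    """Play one episode and return (total score, number of steps)."""
    state = env.reset()
    score, steps, done = 0, 0, False
    while not done:
        action = policy(state)
        state, reward, done = env.step(action)
        score += reward
        steps += 1
    return score, steps


def random_policy(state):
    """Placeholder policy; a trained DQN agent would pick actions here."""
    return random.randrange(ACTION_SIZE)
```

In the real project the policy would be the trained agent and the environment would be the Unity simulator; only the shape of the loop carries over.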
To set up a Python environment to run the code in this repository, please follow the instructions below:
Linux:
conda create --name drlnd python=3.6
source activate drlnd
Windows:
conda create --name drlnd python=3.6
conda activate drlnd
conda install pytorch=0.4.0 -c pytorch
git clone https://github.com/yossico/DeepRLUnity.git
cd DeepRLUnity
pip install .
python -m ipykernel install --user --name drlnd --display-name "drlnd"
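After the setup commands, a quick sanity check can confirm that the packages installed above are visible to the active interpreter. This is an optional sketch, not part of the repository; `missing_packages` is a hypothetical helper, and the check only covers `torch` (the import name of the `pytorch` conda package) and `ipykernel`.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the names that cannot be found by the current interpreter."""
    # find_spec locates a top-level package without importing it.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", ".".join(map(str, sys.version_info[:3])))
    missing = missing_packages(["torch", "ipykernel"])
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

Run it inside the activated drlnd environment; an empty result suggests the install steps completed.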
Download the Unity environment that matches your operating system:
Linux: click here
Mac OSX: click here
Windows (32-bit): click here
Windows (64-bit): click here
To run the environment and test the Unity simulator, activate the drlnd environment created above:
Linux: source activate drlnd
Windows: activate drlnd
Run Navigation.ipynb using:
jupyter notebook Navigation.ipynb
A report on the learning algorithm is in Report.md.