Deep Q-Learning Agent mastering the Unity Banana Collector environment! Through deep reinforcement learning, an agent learns to collect bananas in a Unity environment to maximize the score 🍌
Languages: Python 3.6 and PyTorch
Environment: Unity ML-Agents Toolkit
The agent can take four discrete actions:
- 0 - move forward
- 1 - move backward
- 2 - turn left
- 3 - turn right
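As an illustration of how these four action indices might be used by a Deep Q-Learning agent, here is a minimal epsilon-greedy sketch. It is not taken from the repository: `ACTION_NAMES` and `epsilon_greedy` are hypothetical names, and the Q-values are assumed to come from the agent's network.

```python
import random

# Action indices as listed above; the names are labels for readability only.
ACTION_NAMES = {0: "move forward", 1: "move backward", 2: "turn left", 3: "turn right"}


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the highest-valued one.

    q_values: sequence of 4 floats, one estimated value per action
    (assumed to be produced by the agent's Q-network).
    """
    if len(q_values) != len(ACTION_NAMES):
        raise ValueError("expected one Q-value per action")
    if rng.random() < epsilon:
        return rng.randrange(len(ACTION_NAMES))  # explore: uniform random action
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit: argmax


# With epsilon=0.0 the choice is purely greedy, so this prints "2 turn left".
action = epsilon_greedy([0.1, -0.3, 0.7, 0.2], epsilon=0.0)
print(action, ACTION_NAMES[action])
```

During training, epsilon would typically start high and decay so the agent explores early and exploits later; the exact schedule used in this project is not stated here.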
The state space has 37 dimensions, including the agent's velocity and ray-based perception of objects in front of it.
The agent is given a reward of +1 if it collects a yellow banana, and a reward of -1 if it collects a blue banana. The environment is considered solved when the agent achieves an average score of at least +13 over 100 consecutive episodes.
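A minimal sketch of checking the solved criterion, reading it as an average score of at least +13 over the most recent 100 episodes. The function name `first_solved_episode` is hypothetical, and the per-episode scores are assumed to be collected by the training loop.

```python
from collections import deque

SOLVED_SCORE = 13.0  # target average score from the text above
WINDOW = 100         # number of most recent episodes to average over


def first_solved_episode(episode_scores):
    """Return the 1-based episode at which the rolling average first reaches
    SOLVED_SCORE over a full window, or None if it never does."""
    recent = deque(maxlen=WINDOW)  # automatically drops scores older than WINDOW
    for episode, score in enumerate(episode_scores, start=1):
        recent.append(score)
        if len(recent) == WINDOW and sum(recent) / WINDOW >= SOLVED_SCORE:
            return episode
    return None
```

For example, an agent that scores 0 for 50 episodes and then 14 every episode afterwards would be reported as solved at episode 143, the first point where the last 100 scores average at least 13.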
- Install Anaconda if you don't have it already.
- Open Anaconda Prompt/command line/terminal
- Create a new environment (named banana-env):
conda create --name banana-env python=3.6
- Activate environment:
conda activate banana-env
- Navigate to desired directory to download project file:
cd path/to/desired/directory
- Clone the repository:
git clone https://github.com/albertlai431/banana-collector
- Go to dependencies directory:
cd banana-collector/python
- Install dependencies (may take a while):
pip install .
- Install PyTorch 0.4.0 with conda:
conda install pytorch=0.4.0 -c pytorch
- Create kernel with environment:
python -m ipykernel install --user --name banana-env --display-name "banana-env"
- Launch Jupyter Notebook and navigate to the cloned repository directory
- Open train.ipynb and run the code if you would like to train the agent 💪
- Open test.ipynb and run the code if you would like to observe a fully trained agent! 😃
- Important: Before running any code in either of the ipynb files, click Kernel on the top bar, then Change kernel > banana-env
- Remember to deactivate the environment in the Anaconda Prompt/command line/terminal after you are done:
conda deactivate
- The folder Banana_Windows_x86_64 may not always work; if you are getting a UnityTimeOutException, please go to this link and replace Banana_Windows_x86_64 with the correct folder for your system. You may also need to modify the env declaration.
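One way to avoid hard-coding the Windows build is to choose the environment path by operating system. The sketch below is only an illustration under stated assumptions: apart from the folder name Banana_Windows_x86_64, every path in `ENV_PATHS` is a hypothetical placeholder to be replaced with whatever your downloaded build contains, and the commented-out `UnityEnvironment(file_name=...)` call assumes the `unityagents` package installed from the repository's python directory.

```python
import platform

# Hypothetical mapping from platform.system() to the environment build path.
# Only the folder "Banana_Windows_x86_64" comes from this README; the file
# names and the other entries are placeholders to adjust for your download.
ENV_PATHS = {
    "Windows": "Banana_Windows_x86_64/Banana.exe",
    "Linux": "Banana_Linux/Banana.x86_64",
    "Darwin": "Banana.app",
}


def env_file_name(system=None):
    """Return the build path for the given OS name (defaults to the current OS)."""
    system = system or platform.system()
    try:
        return ENV_PATHS[system]
    except KeyError:
        raise RuntimeError(f"no environment build configured for {system!r}") from None


# Assumed usage in the notebooks (requires the unityagents package):
# from unityagents import UnityEnvironment
# env = UnityEnvironment(file_name=env_file_name())
```

If the path is wrong for your system, the helper fails early with a clear message instead of waiting for the Unity timeout.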