For this project, I am using a slightly modified version of the Unity Reacher environment. The animations below show a trained agent in comparison to a "dumb" random agent.
Trained Agent:
Random Agent:
In this environment, a double-jointed arm can move to target locations. A reward of +0.1 is provided for each step that the agent's hand is in the goal location. Thus, the goal is to keep the arm's hand at the target location for as many time steps as possible.
The observation space consists of 33 variables corresponding to the position, rotation, velocity, and angular velocities of the arm. Each action is a vector of four numbers, corresponding to the torque applied to the two joints. Every entry in the action vector should be a number between -1 and 1. The task is considered solved when the average cumulative reward over the last 100 episodes is greater than or equal to 30.
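The action bounds and the solve criterion above can be sketched in a few lines of plain Python. This is only an illustration, not the project's code: the helper names `clip_action`, `random_action`, and `is_solved` are hypothetical, uniform sampling is just one way a random agent could pick actions, and requiring a full window of 100 episodes before declaring success is an assumption.

```python
import random
from collections import deque

ACTION_SIZE = 4       # four numbers per action (from the description above)
SOLVED_SCORE = 30.0   # required average cumulative reward
WINDOW = 100          # number of most recent episodes to average over


def clip_action(action):
    """Clamp every entry of an action vector to the valid range [-1, 1]."""
    return [max(-1.0, min(1.0, a)) for a in action]


def random_action(rng=random):
    """Sample each entry uniformly from [-1, 1] (assumed random-agent behaviour)."""
    return [rng.uniform(-1.0, 1.0) for _ in range(ACTION_SIZE)]


def is_solved(recent_scores):
    """True once a full window of episode scores averages at least 30."""
    if len(recent_scores) < WINDOW:
        return False  # assumption: do not judge before 100 episodes are recorded
    return sum(recent_scores) / len(recent_scores) >= SOLVED_SCORE


# A deque with maxlen keeps only the last 100 episode scores;
# older scores are dropped automatically as new ones are appended.
recent_scores = deque(maxlen=WINDOW)
```

At +0.1 per step, an episode score of 30 corresponds to the hand being in the goal location for 300 steps of that episode.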
Note: This guide was only tested on macOS with pyenv for Python version management.
Please follow these steps to be able to run this project:
- Install build tools (such as a C++ compiler) by installing Xcode and then the Xcode command-line tools, following one of the various guides.
- Install dependencies. It is highly recommended to install all dependencies in a virtual environment (see guide).
- Install the Unity ML-Agents Toolkit by following the instructions on this page (the official GitHub of the Unity ML-Agents Toolkit). Most likely you will only need to install the `mlagents` and `unityagents` packages with the following command: `pip install mlagents unityagents`. It is highly recommended to use a Python version no higher than 3.7, because TensorFlow (one of the dependencies of `mlagents`) is only compatible with Python up to 3.7.
- Install PyTorch with `pip install torch torchvision`. Please see the official installation guide for more information.
Alternatively, you can install the required dependencies from `requirements.txt`. To do that, just run the following command in your terminal (preferably in the project's virtual environment): `pip install -r requirements.txt`. Please note: this method is somewhat inefficient, because the file includes some packages that are not actually used in this project. It describes an "R&D" environment for experiments and testing.
- Download the environment from one of the links below. You only need to select the environment that matches your operating system:
- Linux: click here
- Mac OSX: click here
- Windows (32-bit): click here
- Windows (64-bit): click here
- Place the file in the root of the project repository.
- Run `main.py` from the terminal with `python main.py`, or simply run `main.py` in your IDE.
- Run `visualize.py` to see the intelligent agent with `python visualize.py`, or, again, simply run this file in your IDE.
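Before running `main.py`, it can help to confirm that the setup steps above actually took effect. The following is a minimal preflight sketch under stated assumptions, not part of the project: the function names are hypothetical, and `ENV_FILE_NAME` is a placeholder that must be replaced with the name of the environment file you downloaded.

```python
import importlib.util
import sys
from pathlib import Path

REQUIRED_PACKAGES = ("mlagents", "unityagents", "torch")
MAX_PYTHON = (3, 7)            # highest Python version recommended in the steps above
ENV_FILE_NAME = "Reacher.app"  # hypothetical placeholder; use your downloaded file's name


def python_version_ok(version=None):
    """Check that the (major, minor) Python version is not higher than 3.7."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) <= MAX_PYTHON


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the top-level packages that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def environment_present(root, file_name=ENV_FILE_NAME):
    """Check that the downloaded environment sits in the project root."""
    return (Path(root) / file_name).exists()


if __name__ == "__main__":
    print("Python version OK:", python_version_ok())
    print("Missing packages:", missing_packages() or "none")
    print("Environment file found:", environment_present(Path.cwd()))
```

Run it from the project root inside the project's virtual environment, so that the package lookup and the file check both refer to the same setup that `main.py` will use.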
Check out the technical report for implementation details.