Remember the sad Marvin from "The Hitchhiker's Guide to the Galaxy"? In this project we train him to walk from scratch using only pure Python with NumPy!
To install the Marvin environment and its dependencies, run the following shell script:
> ./install.sh
It installs Python 3.7 with all dependencies inside a virtualenv named venv.
To run Marvin, use one of the following commands:
> ./marvin.py <-r>
or
> source venv/bin/activate
> python3 marvin.py <-r>
The -r flag runs the trained Marvin (by default, the pretrained model is used).
The main model is a fully connected network trained with the REINFORCE algorithm, a policy gradient method.
The model predicts only the mean of the action distribution, without a standard deviation, so the loss function depends only on the predicted mean.
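One way to read this setup is as a Gaussian policy with a fixed standard deviation: the network outputs the mean, actions are sampled around it, and the REINFORCE loss becomes a return-weighted squared distance between the taken action and the predicted mean. The sketch below illustrates that reading in NumPy. The layer sizes, the tanh activations, the value of SIGMA, the discount factor and all function names are assumptions made for illustration, not taken from marvin.py.

```python
import numpy as np

SIGMA = 0.5  # assumed fixed exploration noise; not a value from the project


def init_params(obs_dim, hidden_dim, act_dim, rng):
    """Two-layer fully connected network (sizes are illustrative)."""
    return {
        "W1": rng.normal(0.0, 0.1, (obs_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "W2": rng.normal(0.0, 0.1, (hidden_dim, act_dim)),
        "b2": np.zeros(act_dim),
    }


def forward(params, obs):
    """Return the predicted action mean and the hidden activations."""
    hidden = np.tanh(obs @ params["W1"] + params["b1"])
    mean = np.tanh(hidden @ params["W2"] + params["b2"])
    return mean, hidden


def sample_action(params, obs, rng):
    """Sample around the predicted mean with the fixed standard deviation."""
    mean, _ = forward(params, obs)
    return mean + SIGMA * rng.normal(size=mean.shape)


def discounted_returns(rewards, gamma=0.99):
    """G_t = r_t + gamma * G_{t+1}, computed backwards over one episode."""
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def reinforce_loss_and_grads(params, obs, actions, returns):
    """Loss = mean_t G_t * ||a_t - mu_t||^2 / (2 * SIGMA^2), with manual backprop."""
    mean, hidden = forward(params, obs)
    diff = actions - mean
    n = len(obs)
    loss = np.sum(returns[:, None] * diff ** 2) / (2 * SIGMA ** 2 * n)
    # Gradient of the loss with respect to the predicted mean.
    d_mean = -returns[:, None] * diff / (SIGMA ** 2 * n)
    d_pre2 = d_mean * (1 - mean ** 2)        # back through the output tanh
    d_hidden = d_pre2 @ params["W2"].T
    d_pre1 = d_hidden * (1 - hidden ** 2)    # back through the hidden tanh
    grads = {
        "W2": hidden.T @ d_pre2,
        "b2": d_pre2.sum(axis=0),
        "W1": obs.T @ d_pre1,
        "b1": d_pre1.sum(axis=0),
    }
    return loss, grads
```

A training step under these assumptions would collect one episode with sample_action, compute discounted_returns from its rewards, and subtract a small multiple of each gradient from the matching parameter. Because SIGMA is a constant, it only scales the gradient and never appears as a learned quantity, which is why the loss involves the predicted mean alone.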
Marvin starts from the following state:
The pretrained Marvin walks as follows:
In the trained state he moves at the following speed: