Create a new virtual environment, activate it, and install the requirements:
python3 -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip install -r requirements
This repo uses click to provide a command-line interface to reinforcement learning algorithms written on top of TensorFlow.
To see pretrained examples, run python main.py example with the options --target, --algorithm and --num_steps.
--target is the OpenAI Gym environment; the choices are cart-pole and luner-lander.
--algorithm is the algorithm used to train the solution; the choices are pg for Policy Gradient, dqn for Deep Q Network and ac for Actor Critic.
--num_steps is the number of iterations of the trained example solution to run.
For example:
python main.py example --target='luner-lander' --algorithm='pg'
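To try several pretrained examples in one go, the command above can be generated from Python. The snippet below is only a rough sketch: the helper names build_example_command and all_example_commands are made up for illustration and are not part of this repo, and it assumes (without checking) that every target/algorithm pair has a pretrained example. The option names and choices are copied from the description above.

```python
import itertools
import shlex

# Choices copied from the option description above (spelling kept as the CLI expects it).
TARGETS = ("cart-pole", "luner-lander")
ALGORITHMS = ("pg", "dqn", "ac")


def build_example_command(target, algorithm, num_steps=None):
    """Hypothetical helper: build the argv list for one `python main.py example` call."""
    if target not in TARGETS:
        raise ValueError(f"unknown target {target!r}; expected one of {TARGETS}")
    if algorithm not in ALGORITHMS:
        raise ValueError(f"unknown algorithm {algorithm!r}; expected one of {ALGORITHMS}")
    argv = ["python", "main.py", "example",
            f"--target={target}", f"--algorithm={algorithm}"]
    # --num_steps is only passed when given, so the CLI's own default applies otherwise.
    if num_steps is not None:
        argv.append(f"--num_steps={int(num_steps)}")
    return argv


def all_example_commands(num_steps=None):
    """Hypothetical helper: one command per target/algorithm combination."""
    return [build_example_command(t, a, num_steps)
            for t, a in itertools.product(TARGETS, ALGORITHMS)]


if __name__ == "__main__":
    # Print the commands instead of running them; copy the ones you want into a shell.
    for argv in all_example_commands():
        print(shlex.join(argv))
```

To actually run a command rather than print it, pass the list to subprocess.run(argv, check=True) from the repo root with the virtual environment active.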
- evo is experimental and I'm not sure it should be technically classed as an evolutionary algorithm.
- Add policy delay to TD3