gym-tictactoe
OpenAI Gym Style Tic-Tac-Toe Environment.
 |O|X
-----
 |O|
-----
O|X|X
O's turn.
Enter location[1-9], q for quit:
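The prompt asks for a location from 1 to 9. As a minimal sketch, assuming locations are numbered left to right, top to bottom (the README does not state the numbering; the -n/--show-number option of the play command displays it), a location can be mapped to a row and column and the board drawn as follows. The names loc_to_rowcol and render are hypothetical and are not part of the package.

```python
def loc_to_rowcol(loc):
    """Map a 1-9 location to a zero-based (row, col), assuming row-major numbering."""
    if not 1 <= loc <= 9:
        raise ValueError("location must be in 1..9")
    return divmod(loc - 1, 3)


def render(board):
    """Draw a 9-cell board (list of 'O', 'X' or ' ') as three rows of text."""
    rows = ["|".join(board[r * 3:r * 3 + 3]) for r in range(3)]
    return "\n-----\n".join(rows)


# The position shown in the sample screen above.
board = [" ", "O", "X",
         " ", "O", " ",
         "O", "X", "X"]
print(render(board))
print(loc_to_rowcol(5))  # centre cell under the assumed numbering
```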
Requirement
Python >= 3.5
Install
git clone https://github.com/haje01/gym-tictactoe.git
cd gym-tictactoe/
pip install -e .
Try example agents
cd examples/
python human_agent.py
python base_agent.py
python td_agent.py
Temporal Difference Agent Commands
Learn
Usage: td_agent.py learn [OPTIONS]
Learn and save the model.
Options:
  -p, --episode INTEGER  Episode count.  [default: 17000]
  -e, --epsilon FLOAT    Exploring factor.  [default: 0.08]
  -a, --alpha FLOAT      Step size.  [default: 0.4]
  -f, --save-file TEXT   Save model data as file name.  [default: td_agent.dat]
  --help                 Show this message and exit.
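The epsilon and alpha options correspond to the usual exploration rate and step size in temporal difference learning. The following is a minimal sketch of those two ideas, not the code in td_agent.py: it assumes a tabular state-value function with the TD(0) update V(s) <- V(s) + alpha * (V(s') - V(s)) and epsilon-greedy selection among candidate next states. The function names and the dictionary-based value table are hypothetical.

```python
import random


def td_update(values, state, next_state, alpha=0.4, default=0.0):
    """Move V(state) toward V(next_state) by a fraction alpha (TD(0) backup)."""
    v = values.get(state, default)
    target = values.get(next_state, default)
    values[state] = v + alpha * (target - v)
    return values[state]


def epsilon_greedy(candidates, values, epsilon=0.08, rng=random, default=0.0):
    """With probability epsilon pick a random candidate, else the best-valued one."""
    if rng.random() < epsilon:
        return rng.choice(candidates)
    return max(candidates, key=lambda s: values.get(s, default))


values = {"win": 1.0}
td_update(values, "s1", "win")  # V(s1) moves from 0.0 to 0.4
td_update(values, "s1", "win")  # V(s1) moves on to about 0.64
print(values["s1"])
print(epsilon_greedy(["s0", "s1"], values, epsilon=0.0))  # epsilon=0 is purely greedy
```

A larger alpha makes each backup move the estimate further toward its target, and a larger epsilon makes the agent try non-greedy moves more often.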
Bench
Usage: td_agent.py bench [OPTIONS]
Benchmark agent with base agent.
Options:
  -p, --episode INTEGER  Episode count.  [default: 3000]
  -f, --model-file TEXT  Model data file name.  [default: td_agent.dat]
  --help                 Show this message and exit.
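A benchmark of this kind plays many episodes between two agents and tallies the outcomes. The sketch below illustrates that loop only; it is not the bench command itself. Both agents are random stand-ins rather than the TD agent and base agent, and alternating the starting mark between episodes is an assumption. All names here are hypothetical.

```python
import random

# Index triples that form a row, column or diagonal on a 3x3 board.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'O' or 'X' if that mark has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def play_episode(policies, start_mark):
    """Play one game; policies maps a mark to a function (board, mark) -> cell index."""
    board = [" "] * 9
    mark = start_mark
    while True:
        board[policies[mark](board, mark)] = mark
        won = winner(board)
        if won or " " not in board:
            return won or "draw"
        mark = "X" if mark == "O" else "O"


def bench(policies, episodes=3000):
    """Tally results over many episodes, alternating which mark moves first."""
    tally = {"O": 0, "X": 0, "draw": 0}
    for i in range(episodes):
        tally[play_episode(policies, "O" if i % 2 == 0 else "X")] += 1
    return tally


rng = random.Random(0)


def random_policy(board, mark):
    """Stand-in agent: choose any empty cell uniformly at random."""
    return rng.choice([i for i, c in enumerate(board) if c == " "])


print(bench({"O": random_policy, "X": random_policy}, episodes=1000))
```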
Grid search
Usage: td_agent.py gridsearch [OPTIONS]
Grid search hyper-parameters.
Options:
  -q, --quality [high|mid|low]  Grid search quality.  [default: mid]
  -r, --reproduce-test INTEGER  Reproducibility test count.  [default: 3]
  --help                        Show this message and exit.
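Grid search tries every combination of hyper-parameter values and keeps the best-scoring one. The sketch below shows the general pattern for epsilon and alpha; it is not the gridsearch command's implementation. The evaluate function is a hypothetical toy stand-in for "learn, then bench", and reading the reproducibility test count as a number of repeated runs per setting is an assumption.

```python
import itertools
import random
import statistics


def evaluate(epsilon, alpha, seed):
    """Hypothetical stand-in for 'learn, then bench': returns a noisy score.

    The score peaks at epsilon=0.08, alpha=0.4 only because this toy
    function is constructed that way.
    """
    rng = random.Random(seed)
    return -abs(epsilon - 0.08) - abs(alpha - 0.4) + rng.gauss(0, 0.001)


def grid_search(epsilons, alphas, repeats=3):
    """Score every (epsilon, alpha) pair, averaging over repeated runs."""
    results = []
    for eps, alpha in itertools.product(epsilons, alphas):
        scores = [evaluate(eps, alpha, seed) for seed in range(repeats)]
        results.append((statistics.mean(scores), statistics.pstdev(scores), eps, alpha))
    # Highest mean score first.
    return sorted(results, reverse=True)


best = grid_search([0.02, 0.08, 0.2], [0.1, 0.4, 0.7])[0]
print("best epsilon=%s alpha=%s" % (best[2], best[3]))
```

Recording the spread across repeats alongside the mean helps show whether a setting's score is stable or depends on the random seed.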
Play
Usage: td_agent.py play [OPTIONS]
Play with human.
Options:
  -f, --load-file TEXT  Load file name.  [default: td_agent.dat]
  -n, --show-number     Show location number when play.  [default: False]
  --help                Show this message and exit.