MuZero - tic-tac-toe

Primary language: Python. License: MIT.

tic-tac-toe-zero

Implementation of MuZero for Tic-Tac-Toe.

It can play optimally 65-70% of the time if you train it long enough and if you are lucky.

RL is hard (ToT)

Example Usage

git clone https://github.com/souvikshanku/tic-tac-toe-zero.git
cd tic-tac-toe-zero

python3 -m venv .venv
source .venv/bin/activate
pip3 install -r requirements.txt

# End-to-end training
python3 self_play.py

# Play against 'random' agent
python3 check_accuracy.py
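
# Sketch: measuring "optimal play"

The "plays optimally" figure above can be made concrete in several ways. The following is a minimal sketch of one way to do it, and is not necessarily what check_accuracy.py does: compare each move an agent makes against the set of minimax-optimal moves while it plays a random opponent. The board encoding (a 9-character string) and all function names here (value, optimal_moves, random_agent, optimal_fraction) are assumptions made for illustration and do not come from the repository.

```python
import random
from functools import lru_cache

# All eight winning lines on a 3x3 board indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None


def place(board, move, player):
    """Return a new board string with `player` placed on square `move`."""
    return board[:move] + player + board[move + 1:]


@lru_cache(maxsize=None)
def value(board, player):
    """Minimax value for the side to move: +1 win, 0 draw, -1 loss."""
    w = winner(board)
    if w is not None:
        return 1 if w == player else -1
    if ' ' not in board:
        return 0
    other = 'O' if player == 'X' else 'X'
    return max(-value(place(board, i, player), other)
               for i, c in enumerate(board) if c == ' ')


def optimal_moves(board, player):
    """Set of empty squares whose minimax value is the best available."""
    other = 'O' if player == 'X' else 'X'
    scores = {i: -value(place(board, i, player), other)
              for i, c in enumerate(board) if c == ' '}
    best = max(scores.values())
    return {i for i, s in scores.items() if s == best}


def random_agent(board, player):
    """Hypothetical baseline agent: pick any empty square uniformly."""
    return random.choice([i for i, c in enumerate(board) if c == ' '])


def optimal_fraction(agent, games=200, seed=0):
    """Fraction of the agent's moves (playing X) that are minimax-optimal
    while it plays against a random O."""
    random.seed(seed)
    good = total = 0
    for _ in range(games):
        board, player = ' ' * 9, 'X'
        while winner(board) is None and ' ' in board:
            if player == 'X':
                move = agent(board, player)
                good += move in optimal_moves(board, player)
                total += 1
            else:
                move = random_agent(board, player)
            board = place(board, move, player)
            player = 'O' if player == 'X' else 'X'
    return good / total
```

To score a trained model this way, one would wrap it in a function with the same (board, player) -> move signature as random_agent and pass it to optimal_fraction. Counting per-move optimality is only one possible definition; counting games won or drawn is another, and the two will generally give different numbers.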