/tictactoe3d-deep-rl

Uses neural networks as function approximators for temporal-difference learning on 3D tic-tac-toe (4x4x4 environment). Also includes implementations of value iteration for 3x3 tic-tac-toe and Q-learning for 4x4 tic-tac-toe.
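For orientation, here is a minimal sketch of the kind of tabular Q-learning update that the 4x4 variant would rely on. It is not taken from the repository's notebooks: the function name `q_learning_update`, the `alpha` and `gamma` defaults, and the string state labels are all assumptions made for illustration, and the opponent's move is assumed to be folded into the environment transition.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, next_actions,
                      alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    Hypothetical helper: `q` maps (state, action) pairs to estimated returns.
    """
    # Value of the best follow-up action; 0.0 when next_state is terminal
    # (that is, when next_actions is empty).
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    td_target = reward + gamma * best_next
    td_error = td_target - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return q[(state, action)]


# Unseen (state, action) pairs start at 0.0.
q = defaultdict(float)

# A winning move from state "s1" ends the game with reward 1.
q_learning_update(q, "s1", 5, reward=1.0, next_state="terminal", next_actions=[])

# The move that led to "s1" earns no immediate reward, but bootstraps
# from the value just learned for ("s1", 5).
q_learning_update(q, "s0", 2, reward=0.0, next_state="s1", next_actions=[5])
```

In a real board game the states would be hashable board encodings (for example, tuples of cell values) rather than string labels. The neural-network variant replaces the table `q` with a learned approximator, since a table over all 4x4x4 positions is impractical.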

Primary Language: Jupyter Notebook
