mdanyalmalik/tictactoe3d-deep-rl
Neural-network function approximation for temporal-difference learning on 3D tic-tac-toe (4x4x4 board). Also includes implementations of value iteration for 3x3 tic-tac-toe and Q-learning for 4x4 tic-tac-toe.
Jupyter Notebook
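
For readers unfamiliar with the Q-learning mentioned in the description, the following is a minimal sketch of a single tabular Q-learning update. It is not taken from the repository's notebooks: the function name `q_learning_update`, the `(state, action)` dictionary layout, and the default `alpha` and `gamma` values are all assumptions chosen for illustration, and it uses the generic single-agent form of the update rather than anything specific to a two-player game.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, next_actions,
                      alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    All names and defaults here are hypothetical, chosen for illustration.
    """
    # A terminal state has no legal moves, so its bootstrap value is 0.
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    # Temporal-difference target and error for the (state, action) pair.
    td_target = reward + gamma * best_next
    td_error = td_target - q[(state, action)]
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * td_error
    return q[(state, action)]


# Q-table that returns 0.0 for unseen (state, action) pairs.
q = defaultdict(float)
new_value = q_learning_update(q, state="s0", action=5, reward=1.0,
                              next_state="s1", next_actions=[])
```

A dictionary keyed by `(state, action)` keeps the sketch short. A table like this grows with the number of reachable positions, which is the usual motivation for replacing it with a neural network on larger boards such as 4x4x4.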