To get my sig from Tad, I'm making a reinforcement learning AI to play Connect Four. Let's go!
The board keeps track of the current state of the game. Players are numbered 1 and 2 on the board (0 means an empty cell).
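Here is a minimal sketch of what such a board could look like. The 6x7 size is the standard Connect Four grid (an assumption, since nothing above fixes it), and the names `new_board`, `legal_moves`, and `drop_piece` are made up for illustration, not taken from the actual project code.

```python
ROWS, COLS = 6, 7  # standard Connect Four size; assumed, not stated in the notes


def new_board():
    """Return an empty board as a list of rows: 0 = empty, 1 and 2 = players."""
    return [[0] * COLS for _ in range(ROWS)]


def legal_moves(board):
    """Columns that still have room. Row 0 is treated as the top row."""
    return [c for c in range(COLS) if board[0][c] == 0]


def drop_piece(board, col, player):
    """Drop a piece for `player` (1 or 2) into `col` and return the landing row."""
    for row in range(ROWS - 1, -1, -1):  # scan from the bottom row upward
        if board[row][col] == 0:
            board[row][col] = player
            return row
    raise ValueError(f"column {col} is full")
```

Storing plain integers keeps the state easy to turn into a network input later, whatever library ends up being used.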
When making an AI to play Connect Four, I think it needs to evaluate each move available to it and pick the one that maximizes its chance of winning. Simple enough. Maybe not simple.
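The "evaluate every move, pick the best" idea can be sketched roughly like this. The `evaluate` argument stands in for whatever ends up estimating the chance of winning (eventually the trained model); `center_bias` is only a toy placeholder so the example runs, and every name here is hypothetical.

```python
import copy


def legal_moves(board):
    """Columns whose top cell (row 0) is still empty."""
    return [c for c in range(len(board[0])) if board[0][c] == 0]


def apply_move(board, col, player):
    """Return a copy of `board` with `player`'s piece dropped into `col`."""
    nxt = copy.deepcopy(board)
    for row in reversed(nxt):  # bottom row first
        if row[col] == 0:
            row[col] = player
            return nxt
    raise ValueError(f"column {col} is full")


def choose_move(board, player, evaluate):
    """Pick the legal column whose resulting position scores highest."""
    return max(
        legal_moves(board),
        key=lambda col: evaluate(apply_move(board, col, player), player),
    )


def center_bias(board, player):
    """Toy evaluator (placeholder, not a real win estimate): prefer central columns."""
    mid = (len(board[0]) - 1) / 2
    return -sum(
        abs(c - mid)
        for row in board
        for c, cell in enumerate(row)
        if cell == player
    )


empty = [[0] * 7 for _ in range(6)]
best = choose_move(empty, 1, center_bias)  # picks the middle column on an empty board
```

Always taking the maximum is purely greedy. During training some exploration (for example, occasionally playing a random legal move) is usually needed, which this sketch leaves out.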
Let
- A GitHub repo
- Some Wikipedia page
- Probably save the entire model, rather than just the weights (easier to reload when I want to play it)
- Something to try: change the reward from binary (1 if win, -1 if loss) to a numerical scale based on how quickly you win or how slowly you lose.
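One possible shape for that scaled reward, as a sketch only: the 0.5 floor, the linear scaling, and the name `shaped_reward` are all arbitrary choices made for illustration, and the 42-move cap assumes a standard 6x7 board.

```python
MAX_MOVES = 42  # total cells on an assumed 6x7 board


def shaped_reward(outcome, moves_played, max_moves=MAX_MOVES):
    """Scale the final reward by game length.

    outcome: +1 for a win, -1 for a loss, 0 for a draw.
    moves_played: total number of pieces on the board when the game ended.
    Returns a value in [-1, 1].
    """
    if outcome == 0:
        return 0.0
    # Fraction of the game left unplayed: close to 1 for short games, 0 for a full board.
    speed = 1.0 - moves_played / max_moves
    magnitude = 0.5 + 0.5 * speed  # always at least 0.5, so a win never looks like a draw
    return magnitude if outcome > 0 else -magnitude
```

With this shape a quick win scores close to 1 and a slow win close to 0.5, while a quick loss scores close to -1 and a drawn-out loss close to -0.5, so the agent is nudged to win fast and, when losing, to hold out as long as possible.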