tjwei/2048-NN

Suggestion: Training outputs

Torrenal opened this issue · 1 comment

I'm assuming that when the AI was trained, it had only 4 outputs (north, south, east, west). Apologies if that's a faulty assumption.

For training the network, you might consider adding a few extra outputs. You don't need them to play the game, but having to produce them would force the network to pick up more of the game mechanics:
4 outputs (N/S/E/W) indicating whether a move in that direction is possible (0 if not).
4 outputs (N/S/E/W) indicating whether a move in that direction will merge tiles.
1 output for a complexity score of the tiles that remain after whatever merges happen for the requested move.
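
To make the idea concrete, here is a rough sketch of how those 9 extra targets might be computed from a board position. Everything in it is an assumption on my part rather than anything taken from the repo: the function names are made up, the board is a plain 4x4 list of lists with 0 for an empty cell, N/W are taken to mean up/left, the randomly spawned tile is ignored, and the "complexity score" is simply stood in for by the number of tiles left after the move, since I haven't pinned down a real definition.

```python
# Hypothetical helpers; none of these names come from the 2048-NN repo.
# A board is a 4x4 list of lists; 0 is an empty cell, other values are tiles.

def slide_row_left(row):
    """Slide one row toward index 0. Returns (new_row, merged_flag)."""
    tiles = [t for t in row if t != 0]
    out, merged, i = [], False, 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            out.append(tiles[i] * 2)  # equal neighbours merge once
            merged = True
            i += 2
        else:
            out.append(tiles[i])
            i += 1
    return out + [0] * (len(row) - len(out)), merged


def transpose(board):
    return [list(col) for col in zip(*board)]


def move(board, direction):
    """Apply a move ('N', 'S', 'E' or 'W') without spawning a new tile.

    Returns (new_board, merged), where merged is True if any tiles merged.
    """
    vertical = direction in ("N", "S")  # assumption: N is up, S is down
    reverse = direction in ("S", "E")   # assumption: E is right, W is left
    rows = transpose(board) if vertical else [list(r) for r in board]
    new_rows, merged = [], False
    for row in rows:
        if reverse:
            row = row[::-1]
        new_row, row_merged = slide_row_left(row)
        if reverse:
            new_row = new_row[::-1]
        new_rows.append(new_row)
        merged = merged or row_merged
    return (transpose(new_rows) if vertical else new_rows), merged


def auxiliary_targets(board, chosen):
    """Build the 9 extra targets: 4 'move possible', 4 'move merges', 1 complexity."""
    possible, merges = [], []
    for d in "NSEW":
        new_board, merged = move(board, d)
        possible.append(1.0 if new_board != board else 0.0)
        merges.append(1.0 if merged else 0.0)
    after, _ = move(board, chosen)
    # Placeholder complexity score (an assumption): count of tiles left on the board.
    complexity = float(sum(1 for row in after for t in row if t != 0))
    return possible + merges + [complexity]


board = [
    [2, 2, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
print(auxiliary_targets(board, "W"))
```

These values would be trained alongside the 4 move outputs and simply ignored when actually playing.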

The 'human' analog of training for these outputs would be 'learning the rules of the game': it'll know when tiles merge and when moves are invalid, and it may internalize some of that logic in its decision process for moves.

tjwei commented