Battleships with Monte Carlo Tree Search

Intelligente Systeme, Praktikum 1, Enes Kaya, Jasper Wolny

Battleships

Rules of game:

Each player first sets up their ships in the game before the actual game begins
Available ships are:
- Carrier, which has 5 holes
- Battleship, which has 4 holes
- Cruiser, which has 3 holes
- Destroyer, which has 2 holes
For the sake of simplicity, in the actual implementation the "war field" dimensions are 10x10 and only one instance of each ship can be placed on the grid, without overlapping each other

MCTS for games of imperfect information

MCTS algorithms like UCB1/UCT are mostly designed for games like Chess and Go, which are games of perfect information. Assuming two players, each player at any given time in the game can see the complete state of the game and, potentially, all legal moves and combinations which lead to a win. The game we have chosen, Battleships, is a game of imperfect information. One player can only see their current state of the board (location of ships and the hits and misses of opponent) and to win the player can only make (educated) guesses about the opponents ships locations. So, the game has random elements and hidden information, that need to be determinized at some point for MCTS algorithms to work.

Information Set MCTS

Information Set MCTS is a family of algorithms that are suitable for games of imperfect information:

Information sets [1]:

Information sets are collections of states, which appear in a game when one player has knowledge about the state that another player does not. For example, in a card game each player hides his own cards from his opponents. In this example, the information set contains all states which correspond to all possible permutations of opponent cards. A player knows which information set they are in, but not which state within that information set.

There are many possibilites to approach the uncertainty. We will use the Determinization UTC technique first, and maybe explore the Information Set UTC (both described in the referenced papers).

The general idea is as follows (opponent is always the human player, by We we mean the AI):

The grid size and number of ships and their length is known to both players
After shooting at the grid in each round the opponent has to reveal wether we HIT or MISS
We can use the opponent grid information to sample a set of possible grids (game state)
We run the MCTS algorithm for each of the sampled grids
The next moves suggested by each MCTS run are averaged, and the highest ranked is chosen
Let opponent do his turn
Repeat at 3.

Some questions

Does it make sense to use MCTS for the game of Battleships?

References

[0] Daniel Whitehouse, Edward J. Powley and Peter I. Cowling. Determinization and Information Set Monte Carlo Tree Search for the card game Dou Di Zhu. Proceedings of IEEE Conference on Computational Intelligence in Games (CIG), 87–94, 2011.

[1] Daniel Whitehouse, Edward J. Powley and Peter I. Cowling. Information Set Monte Carlo Tree Search IEEE Transactions on Computational Intelligence and AI in Games

[2] http://www.aifactory.co.uk/newsletter/2013_01_reduce_burden.htm

Sample Output

COMPUTER IS MAKING GUESS...

6.0772095   6.267555    6.3350234   6.5314364   5.996564 
5.2217007   5.2408843   6.349465    0.0         5.7553244 
5.872676    5.131521    5.367996    7.2089844   5.997716 <---- C4 is rated highest win chance
5.6596212   5.3406906   0.0         5.4332376   6.7247863 
6.13833     6.43858     6.6647773   6.170787    5.063571

COMP HIT AT C4

YOUR BOARD...PRESS ENTER TO CONTINUE...

  1 2 3 4 5 
A - - D - - 
B - - D X - <--- That's the hit in the round before
C - - - X D <--- The current move (C4)
D - - O C D 
E - - - - - 
PRESS ENTER TO CONTINUE...

Lessons Learned

If you have a program in which some state will change very often (e.g. simulating from a given state), maybe don't use Java. It's harder to clone Objects, especially if they're complex. Functional programming would've made it a lot easier.

eneskaya/Battleship-MCTS-Java