/AI

GamePlayingAI

Primary LanguagePython

AI

cs4701 COMS W4701 - Artificial Intelligence - Programming Homework 2

Game Playing / Adversarial Search

Due: 11:59pm on Monday, October 16th Total Points: 100

You are welcome to discuss the problems with other students but you must turn in your own work. Please review the academic honesty policy for this course (at the end of the syllabus page).

Create a .zip or .tgz archive containing any of the files you submit. Upload that single file to Courseworks. The file you submit should use the followign naming convention:

YOURUNI_programming2.[zip | tgz]. For example, my uni is db2711, so my file should be named db2711_programming2.zip or db2711_programming2.tgz.

As a reminder, any assignments submitted late will incur a 20 point penalty. No submissions will be accepted later than 4 days after the submission deadline.

The instructions below are fairly specific and it is okay to deviate from implementation details. However: You will be graded based on the functionality of each function. Make sure the function signatures (function names, parameter and return types/data structures) match exactly the description in this assignment.

Introduction

In this assignment you will implement an AI player for the game Reversi, also known as Othello. For the rules of this game see here: https://en.wikipedia.org/wiki/Reversi (Links to an external site.)Links to an external site.. Our version of the game differs from these rules in one minor point: The game ends as soon as one of the players has no legal moves left.

First, download the archive othello.zipView in a new window, which contains the following 5 files:

othello_gui.py - Contains a simple graphical user interface (GUI) for Othello. othello_game.py - Contains the game "manager", which stores the current game state and communicates with different player AIs. othello_shared.py - Functions for computing legal moves, captured disks, and successor game states. These are shared between the game manager, the GUI and the AI players. randy_ai.py - Randy is an "AI" player that randomly selects a legal move. YOURUNI_ai.py - This is the scaffolding code you will use to build your AI player. Rename this file by replacing YOURUNI with your actual UNI. In addition to completing this assignment, you can submit an AI player to the Othello competition in which your AI will play against other AIs. Participation in the competition is optional and will not affect your grade. We will run the competition using anonymous aliases/names for your AIs (see below). The quarter-finals, semi-finals, and finals of the competition will be played out live during the lab sessions. The winners will receive a certificate.

Game State Representation

Each game state contains two pieces of information: The current player and the current disks on the board. Throughout our implementation, Player 1 (dark) is represented using the integer 1, and Player 2 (light) is represented using the integer 2.

The board is represented as a a tuple of tuples. The inner tuple represents each row of the board. Each entry in the rows is either an empty square (integer 0), a dark disk (integer 1) or a light disk (integer 2). For example, the 8x8 initial state looks like this:

((0, 0, 0, 0, 0, 0, 0, 0), (0, 0, 0, 0, 0, 0, 0, 0), (0, 0, 0, 0, 0, 0, 0, 0), (0, 0, 0, 2, 1, 0, 0, 0), (0, 0, 0, 1, 2, 0, 0, 0), (0, 0, 0, 0, 0, 0, 0, 0), (0, 0, 0, 0, 0, 0, 0, 0), (0, 0, 0, 0, 0, 0, 0, 0)) Getting Started

Run the Othello GUI, python othello_gui.py, which should bring up a game window. By default, both players are human players, taking turn.

The GUI can take one or two AI programs as command line parameters. When only one of them is specified, you will play against the AI (you will be player 1, dark): $python othello_gui.py randy_ai.py Play against Randy to develop a better understanding of how the game works and what strategies can give you an advantage.

When two AIs are specified you can watch them play against each other: $python othello_gui.py randy_ai.py randy_ai.py.

The GUI is rather minimalistic, so you need to close the window and then restart to play a new game.

(On some systems, you will get an error message when the game is trying to run randy_ai. In that case, your system is probably configured to use the command "python3" to call python 3. To fix this problem change the line subprocess.Popen(["python", filename]... to subprocess.Popen(["python3", filename]...) in othello_game.py.)

Communication between the Game Host and the AI

This is a technical detail that you can skip if you are not interested. Functions for communicating with the game manager are provided as part of the scaffolding code.

The AI and the Game Manager / GUI will run in different Python interpreters. The Game Manager / GUI will spawn a child process for each AI player. This makes it easier for the game manager to let the AI process time out and also makes sure that, if the AI crashes, the game manager can keep running. To communicate with the child process, the game manager uses pipes. Essentially, the game manager reads from the AI's standard output and writes to the AI's standard input. The two programs follow a precise protocol to communicate:

The AI sends a string to identify itself. For example, randy_ai sends "Randy". You should come up with a fun name for your AI. The game manager sends back "1" or "2", indicating if the AI plays dark or light. Then the AI sits and waits for input form the game manager. When it is the AI's turn, the game manager will send two lines: The current score, for example "SCORE 2 2" and the game board (a Python tuple converted to string). The game manager then waits for the AI to respond with a move, for example "4 3". At the end of the game, the game master sends the final score, for example "FINAL 33 31". Time Constraints

Your AI player will be expected to make a move within 10 seconds. If no move has been selected, the AI loses the game. This time constraint does not apply for human players.

You may change the time constraint by editing line 34 in othello_game.py: TIMEOUT = 10

However, we will run your AI with a timeout of 10 seconds during grading and for the Othello competition.

Part I. Minimax (25 pts)

First, change line 139 in othello_gui.py game = OthelloGameManager(dimension=8)to game = OthelloGameManager(dimension=4). This will run Othello on a 4x4 board. This restriction makes the game somewhat trivial: it is easy even for human players to think ahead to the end of the game. When both players play optimally, the player who goes second always wins. However, the default Minimax algorithm, without a depth limit, takes too long even on a 6x6 board.

Write the function compute_utility(board, color) that computes the utility of a final game board state (in the format described above). The utility is the number of disks of player color minus the number of disks of the opponent. Hint: The function get_score(board) returns a tuple (number of dark disks, number of light disks).

Then, implement the method select_move_minimax(board, color) that selects the action that leads to the state with the highest minimax value. The parameters of the function are the current board (in the format described above) and the color for the AI player, integer 1 for dark and 2 for light. The return value should be a (column, row) tuple, representing the move. Hints:

Implement Minimax recursively by writing two functions minimax_max_node(board, color) and minimax_min_node(board, color). Use the get_possible_moves(board, color) function in othello_shared.py, which returns a list of (column, row) tuples representing the legal moves for player color. Use the play_move(board, color, move) function in othello_shared.py, which computes the successor board state that results from player color playing move (a (column, row) tuple). Pay attention to which player should make a move for min nodes and max nodes. You should now be able to play against your AI on a 4x4 board or have your AI play against the random AI.

Part II. α-β Pruning (25 pts)

The simple Minimax approach becomes infeasible for boards larger than 4x4. Write the function select_move_alphabeta(board, color) which computes the best move using α-β-pruning. The parameters and return values are the same as for minimax.

Write recursive functions alphabeta_min_node(board, color, alpha, beta) and alphabeta_max_node(board,color, alpha, beta).

In the main function of YOURUNI_ai.py change the line

movei, movej = select_move_minimax(board, color)

to

movei, movej = select_move_alphabeta(board, color).

This should speed up decisions for the AI, but still not enough to be able to play larger boards than 4x4.

Part III. Caching states (12 pts)

Create a dictionary in your AI player (this can just be stored in a global variable on the top level of the file) that maps board states to their minimax value. Modify your minimax and α-β-pruning functions to store states in that dictionary after their minimax value is known. Then check, at the beginning of each function, if a state is already in the dictionary and do not explore it again.

Part IV. Node ordering heuristic (18 pts)

α-β-pruning works better if nodes that lead to a better utility are explored first. In the α-β-pruning functions, order successor states according to the following heuristic: nodes for which the #AI player's disks - #opponent's disk is highest should be explored first. Note that this is the same as the utility function, and it is okay to call the ultity function to compute this value. This should provide another small speed-up.

Part V. Depth Limit (20 pts)

So far we could only play on a trivial 4x4 board. Modify your α-β-pruning functions so that the algorithm only explores a limited number of levels, rather than the entire search tree. The function signatures should be alphabeta_min_node(board, color, alpha, beta, level, limit) and alphabeta_max_node(board, color, alpha, beta, level, limit), where level is the current ply and limit is the total number of plies that should be explored. Instead of using the true utility at the final board states, we will use a heuristic evaluation function that is defined in the same way as the utility (i.e. you can simply call the utility function as before to estimate the quality of a state).

Now switch back to an 8x8 board and experiment with the depth limit. What is the maximum number of plies you can safely evaluate in 10 seconds?

Taken together, these steps should give you an AI player that is challenging to play against.

What you need to Submit

Implementing the 5 steps listed above will give you full credit on this assignment. You only need to submit YOURUNI_ai.py (with YOURUNI replaced with your actual UNI). Do not submit any of the other .py files. To participate in the competition, submit a second file YOURUNI_ai_compete.py.

4701 Othello AI Competition

The steps outline above give you a good AI player, but we have only scratched the surface. There are many possible improvements that would create an even better AI.

In the Othello Competition, we will have your AI players compete with each other. The finalists and winner of the competition will receive certificates. We are planning to run most of the competition after the homework assignment is due, and then play out the quarter-finals, semi-finals and finals during the lab sessions.

To submit an AI to the competition, simply include a file YOURUNI_ai_compete.py with your homework submission. You should give your AI a funny/informative/unique name (i.e. the first string that is sent to the game manager). We will run the competition anonymously and will only announce the finalists and winners by real name. However, we will release the full tournament results using your AI's name.

It is okay if your AI for the competition does not follow the framework described above, as long as your AI communicates with the game manager using the correct protocol.

Here are some ideas for improvements to get you started:

Come up with a better node-ordering function. Come up with a better evaluation function. Instead of a fixed depth limit, use iterative deepening and a timer to maximize the number of levels you can explore in 10 seconds. Use a different strategy in the opening, mid-game, and end-game. Use a pattern database of common board states (you may include one data file of at most 500kb with your entry).