Minimax Algorithm in Python

I will implement a minimax algorithm with alpha-beta pruning and cutoff in this tutorial. This algorithm will be implemented in Python and I am creating an AI that is playing Tic-Tac-Toe and similar games against you or another human.

The minimax algorithm is used to solve adversial search problems in which goals of agents are in conflict, this is the case for most games. In a game with two players, the algorithm assumes that one player wants to maximize (MAX) the score and that the other player wants to minimize (MIN) the score. The algorithm can be extended to include more than two players.

The minimax algorithm evalutes all possible plays from the current game state down to the end of each game and then backes up the values to the current state. The algorithm assumes that both players plays optimal, the MAX player will chose the play with the highest score and the MIN player will chose the play with the lowest score. In a Tic-Tac-Toe game with 3×3 squares, it is 362 880 (9!) different states to examine when the game begins.

Alpha-Beta pruning and cutoff is used to speed up the search time for minimax. Alpha-Beta pruning means that we can ignore searching through nodes that never will be chosen given the information we already have, both players is assumed to play optimal. Cutoff means that we can use a fast heuristic evaluation algorithm when we have reached a certain depth in the minimax search. Cutoff is used in the early phases of a game when the number of possible plays is very high.

Tic-Tac-Toe and similar games

This code can be used for Tic-Tac-Toe, Connect Four and similar games. The AI-player uses a minimax algorithm to take decisions on each turn, a human player will get a recommendation before each move. An evaluation function (get_best_move) is used at cutoff, the evaluation function uses winning positions and heuristics to take decisions. A Connect Four game would be impossible to play without the evaluation function.

# Import libraries
import sys
import random

# This class represent a tic tac to game
class TicTacToeGame:

    # Create a new game
    def __init__(self, rows:int, columns:int, goal:int, max_depth:int=4):
        
        # Create the game state
        self.state = []
        self.tiles = {}
        self.inverted_tiles = {}
        tile = 0
        for y in range(rows):
            row = []
            for x in range(columns):
                row += '.'
                tile += 1
                self.tiles[tile] = (y, x)
                self.inverted_tiles[(y, x)] = tile
            self.state.append(row)

        # Set the number of noughts and crosses in a row that is needed to win the game
        self.goal = goal

        # Create vectors
        self.vectors = [(1,0), (0,1), (1,1), (-1,1)]

        # Set lengths
        self.rows = rows
        self.columns = columns
        self.max_row_index = rows - 1
        self.max_columns_index = columns - 1
        self.max_depth = max_depth

        # Heuristics for cutoff
        self.winning_positions = []
        self.get_winning_positions()

        # Set the starting player at random
        #self.player = 'O'
        self.player = random.choice(['X', 'O'])

    # Get winning positions
    def get_winning_positions(self):

        # Loop the board
        for y in range(self.rows):
            for x in range(self.columns):
                
                # Loop vectors
                for vector in self.vectors:
                    
                    # Get the start position
                    sy, sx = (y, x)

                    # Get vector deltas
                    dy, dx = vector

                    # Create a counter
                    counter = 0

                    # Loop until we are outside the board
                    positions = []
                    while True:

                        # Add the position
                        positions.append(self.inverted_tiles.get((sy, sx)))

                        # Check if we have a winning position
                        if (len(positions) == self.goal):

                            # Add winning positions
                            self.winning_positions.append(positions)

                            # Break out from the loop
                            break

                        # Update the position
                        sy += dy
                        sx += dx
                
                        # Check if the loop should terminate
                        if(sy < 0 or abs(sy) > self.max_row_index or sx < 0 or abs(sx) > self.max_columns_index):
                            break

    # Play the game
    def play(self):

        # Variables
        result = None

        # Create an infinite loop
        print('Starting board')
        while True:

            # Draw the state
            self.print_state()

            # Get a move from a player
            if (self.player == 'X'): # AI

                # Print AI move
                print('Player X moving (AI) ...')

                # Get the best move
                max, py, px, depth = self.max(-sys.maxsize, sys.maxsize)

                # Get a heuristic move at cutoff
                print('Depth: {0}'.format(depth))
                if(depth > self.max_depth):
                    py, px = self.get_best_move()

                # Make a move
                self.state[py][px] = 'X'

                # Check if the game has ended, break out from the loop in that case
                result = self.game_ended()
                if(result != None):
                    break

                # Change turn
                self.player = 'O'

            elif (self.player == 'O'): # Human player
                
                # Print turn
                print('Player O moving (Human) ...')

                # Get a recommended move
                min, py, px, depth = self.min(-sys.maxsize, sys.maxsize)

                # Get a heuristic move at cutoff
                print('Depth: {0}'.format(depth))
                if(depth > self.max_depth):
                    py, px = self.get_best_move()

                # Print a recommendation
                print('Recommendation: {0}'.format(self.inverted_tiles.get((py, px))))

                # Get input
                number = int(input('Make a move (tile number): '))
                tile = self.tiles.get(number)

                # Check if the move is legal
                if(tile != None):
                    
                    # Make a move
                    py, px = tile
                    self.state[py][px] = 'O'

                    # Check if the game has ended, break out from the loop in that case
                    result = self.game_ended()
                    if(result != None):
                        break

                    # Change turn
                    self.player = 'X'

                else:
                    print('Move is not legal, try again.')

        # Print result
        self.print_state()
        print('Winner is player: {0}'.format(result))

    # An evaluation function to get the best move based on heuristics
    def get_best_move(self):

        # Create an heuristic dictionary
        heuristics = {}

        # Get all empty cells
        empty_cells = []
        for y in range(self.rows):
            for x in range(self.columns):
                if (self.state[y][x] == '.'):
                    empty_cells.append((y, x))

        # Loop empty positions
        for empty in empty_cells:

            # Get numbered position
            number = self.inverted_tiles.get(empty)

            # Loop winning positions
            for win in self.winning_positions:

                # Check if number is in a winning position
                if(number in win):

                    # Calculate the number of X:s and O:s in the winning position
                    player_x = 0
                    player_o = 0
                    start_score = 1
                    for box in win:

                        # Get the position
                        y, x = self.tiles[box]

                        # Count X:s and O:s
                        if(self.state[y][x] == 'X'):
                            player_x += start_score if self.player == 'X' else start_score * 2
                            start_score *= 10
                        elif (self.state[y][x] == 'O'):
                            player_o += start_score if self.player == 'O' else start_score * 2
                            start_score *= 10

                    # Save heuristic
                    if(player_x == 0 or player_o == 0):

                        # Calculate a score
                        score = max(player_x, player_o) + start_score

                        # Update the score
                        if(heuristics.get(number) != None):
                            heuristics[number] += score
                        else:
                            heuristics[number] = score

        # Get the best move from the heuristic dictionary
        best_move = random.choice(empty_cells)
        best_count = -sys.maxsize
        for key, value in heuristics.items():
            if(value > best_count):
                best_move = self.tiles.get(key)
                best_count = value

        # Return the best move
        return best_move

    # Check if the game has ended
    def game_ended(self) -> str:

        # Check if a player has won
        result = self.player_has_won()
        if(result != None):
            return result

        # Check if the board is full
        for y in range(self.rows):
            for x in range(self.columns):
                if (self.state[y][x] == '.'):
                    return None

        # Return a tie
        return 'It is a tie!'
       
    # Check if a player has won
    def player_has_won(self) -> str:
        
        # Loop the board
        for y in range(self.rows):
            for x in range(self.columns):
                
                # Loop vectors
                for vector in self.vectors:
                    
                    # Get the start position
                    sy, sx = (y, x)

                    # Get vector deltas
                    dy, dx = vector

                    # Create counters
                    steps = 0
                    player_x = 0
                    player_o = 0

                    # Loop until we are outside the board or have moved the number of steps in the goal
                    while steps < self.goal:

                        # Add steps
                        steps += 1

                        # Check if a player has a piece in the tile
                        if(self.state[sy][sx] == 'X'):
                            player_x += 1
                        elif(self.state[sy][sx] == 'O'):
                            player_o += 1

                        # Update the position
                        sy += dy
                        sx += dx
                
                        # Check if the loop should terminate
                        if(sy < 0 or abs(sy) > self.max_row_index or sx < 0 or abs(sx) > self.max_columns_index):
                            break

                    # Check if we have a winner
                    if(player_x >= self.goal):
                        return 'X'
                    elif(player_o >= self.goal):
                        return 'O'

        # Return None if no winner is found
        return None

    # Get a min value (O)
    def min(self, alpha:int=-sys.maxsize, beta:int=sys.maxsize, depth:int=0):
        
        # Variables
        min_value = sys.maxsize
        by = None
        bx = None
        
        # Check if the game has ended
        result = self.game_ended()
        if(result != None):
            if result == 'X':
                return 1, 0, 0, depth
            elif result == 'O':
                return -1, 0, 0, depth
            elif result == 'It is a tie!':
                return 0, 0, 0, depth
        elif(depth > self.max_depth):
            return 0, 0, 0, depth

        # Loop the board
        for y in range(self.rows):
            for x in range(self.columns):

                # Check if the tile is empty
                if (self.state[y][x] == '.'):

                    # Make a move
                    self.state[y][x] = 'O'

                    # Get max value
                    max, max_y, max_x, depth = self.max(alpha, beta, depth + 1)
                    
                    # Set min value to max value if it is lower than curren min value
                    if (max < min_value):
                        min_value = max
                        by = y
                        bx = x
                        
                    # Reset the tile
                    self.state[y][x] = '.'

                    # Do an alpha test
                    if (min_value <= alpha):
                        return min_value, bx, by, depth

                    # Do a beta test
                    if (min_value < beta):
                        beta = min_value

        # Return min value
        return min_value, by, bx, depth

    # Get max value (X)
    def max(self, alpha:int=-sys.maxsize, beta:int=sys.maxsize, depth:int=0):

        # Variables
        max_value = -sys.maxsize
        by = None
        bx = None

        # Check if the game has ended
        result = self.game_ended()
        if(result != None):
            if result == 'X':
                return 1, 0, 0, depth
            elif result == 'O':
                return -1, 0, 0, depth
            elif result == 'It is a tie!':
                return 0, 0, 0, depth
        elif(depth > self.max_depth):
            return 0, 0, 0, depth

        # Loop the board
        for y in range(self.rows):
            for x in range(self.columns):

                # Check if the current tile is empty
                if (self.state[y][x] == '.'):
                    
                    # Add a piece to the board
                    self.state[y][x] = 'X'

                    # Set max value to min value if min value is greater than current max value
                    min, min_y, min_x, depth = self.min(alpha, beta, depth + 1)

                    # Adjust the max value
                    if (min > max_value):
                        max_value = min
                        by = y
                        bx = x

                    # Reset the tile
                    self.state[y][x] = '.'

                    # Do a beta test
                    if (max_value >= beta):
                        return max_value, bx, by, depth

                    # Do an alpha test
                    if (max_value > alpha):
                        alpha = max_value

        # Return max value
        return max_value, by, bx, depth

    # Print the current game state
    def print_state(self):
        for y in range(self.rows):
            print('| ', end='')
            for x in range(self.columns):
                if (self.state[y][x] != '.'):
                    print(' {0}  | '.format(self.state[y][x]), end='')
                else:
                    digit = str(self.inverted_tiles.get((y,x))) if len(str(self.inverted_tiles.get((y,x)))) > 1 else ' ' + str(self.inverted_tiles.get((y,x)))
                    print('{0}  | '.format(digit), end='')
            print()
        print()

# The main entry point for this module
def main():

    # Create a game
    #game = TicTacToeGame(7, 6, 4, 1000)
    game = TicTacToeGame(3, 3, 3, 1000)

    # Play the game
    game.play()

# Tell python to run main method
if __name__ == "__main__": main()

Output

Starting board
|  1  |  2  |  3  |
|  4  |  5  |  6  |
|  7  |  8  |  9  |

Player X moving (AI) ...
Depth: 1029
|  1  |  2  |  3  |
|  4  |  X  |  6  |
|  7  |  8  |  9  |

Player O moving (Human) ...
Depth: 1012
Recommendation: 1
Make a move (tile number): 1
|  O  |  2  |  3  |
|  4  |  X  |  6  |
|  7  |  8  |  9  |

Player X moving (AI) ...
Depth: 702
|  O  |  X  |  3  |
|  4  |  X  |  6  |
|  7  |  8  |  9  |

Player O moving (Human) ...
Depth: 268
Recommendation: 8
Make a move (tile number): 8
|  O  |  X  |  3  |
|  4  |  X  |  6  |
|  7  |  O  |  9  |

Player X moving (AI) ...
Depth: 104
|  O  |  X  |  3  |
|  X  |  X  |  6  |
|  7  |  O  |  9  |

Player O moving (Human) ...
Depth: 32
Recommendation: 6
Make a move (tile number): 6
|  O  |  X  |  3  |
|  X  |  X  |  O  |
|  7  |  O  |  9  |

Player X moving (AI) ...
Depth: 11
|  O  |  X  |  X  |
|  X  |  X  |  O  |
|  7  |  O  |  9  |

Player O moving (Human) ...
Depth: 4
Recommendation: 7
Make a move (tile number): 7
|  O  |  X  |  X  |
|  X  |  X  |  O  |
|  O  |  O  |  9  |

Player X moving (AI) ...
Depth: 1
|  O  |  X  |  X  |
|  X  |  X  |  O  |
|  O  |  O  |  X  |

Winner is player: It is a tie!

Leave a Reply

Your email address will not be published. Required fields are marked *