Adversarial Search and Games PDF

CHAPTER 4 ADVERSARIAL SEARCH AND GAMES Based on “Artificial Intelligence: A Modern Approach” by Stuart Russell and Peter Norvig Introduction In this chapter we cover multi-agent, competitive environments in which two or more agents have conflicting goals, giving rise to adversarial search problems (also known as games) We will concentrate on games, such as chess or NIM The state of these games is easy to represent, and agents are usually restricted to a small number of actions whose effects are defined by precise rules. Introduction In games, other agents:  Want our agent to lose  Introduce uncertainty, don’t know what will they do The unpredictability of other agents can introduce many possible contingencies We will consider a specific type:  Deterministic  Fully observable  Two agents with alternating actions Introduction The main problem with games is that the search space is too huge:  A search tree for chess has an average branching factor of 35  An average chess game lasts for 50 moves per player (ply)  The average search tree has 35100 nodes. Of course, there is no enough time to search the whole tree and calculate the exact consequences of every move If the optimal decision is not feasible (due to time or memory constraints), some decision has to be taken. Introduction We will investigate:  What is an optimal move and how to find it  Pruning the search tree  How to choose a good move when time is limited  How to define a heuristic evaluation function to approximate the utility of a state. Minimax Algorithm First thing to consider is how to make an optimal move Certain assumptions exist:  A two player game,  The play is sequential (agents taking turns)  The opponent is playing perfectly The two players are referred to as MAX and MIN:  MAX wants to maximize the utility  MIN wants to minimize the utility Minimax Algorithm Theproblem of finding the best move is formulated as a search problem:  Initial state: The current board configuration and the player to move  Action: A legal move that takes you from one state to the next  Goal test: Has a terminal state been reached (is the game over?)  Cost: A utility value is assigned to leaf nodes (terminal states) This defines a game tree to be traversed. Minimax Algorithm MAX is the first player to move, starting at the root node The utility values at the leaf nodes are from MAX’s point of view Theoptimal strategy is found by examining the minimax value of each node. Minimax Algorithm In a normal search tree, a solution is to find a path from the initial state to the goal state, which is a wining state for MAX However, since MIN affects the search, a contingent strategy should be found The strategy defines:  MAX’s move at the initial state,  MAX’s move at all states generated by all possible moves by MIN ... and so on This means that MAX needs a winning strategy independent of what MIN does. Minimax Algorithm MAX will choose a move that maximizes its utility value, MIN will choose a move that minimizes the utility value of MAX Thisresults in MAX (at the root) choosing its best move given available information Available information is via look ahead (the search down the tree) Minimax Algorithm Minimax Algorithm 3 3 0 2 3 1 0 5 2 7 0 Minimax Algorithm Consider the game of NIM:  7 matches placed in a pile,  Each player divides a pile of matches into 2 non-empty piles with a different number of matches  The player who cannot make a move is the loser The utility is +1 (MAX wins) or -1 (MAX loses) The value at each node represents the best value of the best terminal state the current player can hope to achieve. (-1) 7 MAX (-1) 6-1 (-1) 5-2 (-1) 4-3 MIN (+1) 5-1-1 (-1) 4-2-1 (+1) 3-2-2 (-1) 3-3-1 MAX (+1) 4-1-1-1 (-1) 3-2-1-1 2-2-2-1 MIN MAX wins (+1) MAX (+1) 3-1-1-1-1 2-2-1-1-1 MAX loses (-1) MIN 2-1-1-1-1-1 MAX wins (+1) Minimax Algorithm Complete if the tree is finite Optimal against an optimal opponent, otherwise? Time complexity = Space complexity =. Pruning Game trees are too huge to be solved to optimality Minimax has to generate the entire tree, compute the utility values at the leaf nodes, and then propagate these values up the tree We want a way to do less work by removing unnecessary branches thus exploring less nodes However, we want to guarantee the same decision as Minimax We use the pruning. Pruning Pruning 𝛼 =3 MAX 𝛼=− ∞ MIN 𝛽=∞ 𝛽=∞ 𝛽=∞ 𝛽=3 𝛽=2 𝛽=2 𝛽=14 Same solution as Minimax Pruning is the maximum value MAX is assured so far off the current path Think: If is worse than (from MAX’s point of view), MAX will avoid it, dashed branches will be pruned Happens when , referred to as -cutoff. Pruning is the minimum value MIN is assured so far off the current path Think: If is worse than (from MIN’s point of view), MIN will avoid it, dashed branches will be pruned, Happens when referred to as -cutoff. Pruning In the previous example, pruning explored 14 nodes instead of 21 nodes explored by Minimax, pruning also reached the same decision reached by Minimax. Pruning MAX 𝛼=2 MIN 𝛽=3 𝛽=2 𝛽=2 These nodes will be pruned Note that we have changed the order of Pruning In the previous example, changing the order of nodes increase the efficiency of pruning Efficiency dependent on the ordering of children: Checks each of MAX’s children until finding one with a value higher than Checks each of MIN’s children until finding one with a value lower than. Pruning Can use heuristics to order the nodes to check: Check the highest-value children first for MAX, Check the lowest-value children first for MIN. Good ordering can reduce time complexity to , random ordering gives roughly , remember that Minimax is. Pruning How to reach a good decision when time is limited? We can search the tree up to a certain depth instead of reaching a leaf node Hence, we have to use a heuristic function to evaluate the desirability of a certain state instead of the utility value Suppose we have 100 seconds and we explore nodes/second: We can explore nodes per move can reach a depth of 8, which is a pretty good chess program. Evaluation Functions Heuristic evaluation functions are functions to estimate the expected utility of a game from a non-leaf node A good function should: Order the terminal states as the true utility values do Should not take too much time to compute For non-leaf nodes, it should be strongly correlated with the actual chances of winning. Mostly work by calculating various features of the state. Evaluation Functions A simple version is to set it up by calculating a weighted average of the numerical values of all the features For Chess, we can use a linear weighted sum of features: +…+ e.g., with = (number of white queens) - (number of white queens) Uncertainty In some types of games, there are many unpredictable external events that could affect the course of the game Backgammon is a typical game that combines luck and skill One player’s roll of dice determine the legal moves Although a player would know his/her legal moves he/she cannot know the legal moves of the other player until the other player rolls the dice as well This means that a complete game tree cannot be constructed. Uncertainty In backgammon, a game tree includes chance nodes The branches out of a chance node signify the possible dice rolls Each branch is labeled with the probability of its occurrence. In Practice Checkers: Chinook ended 40-year-reign of human world champion Marion Tinsley in 1994 Used a pre-computed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions. Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997 Deep Blue searches 200 million positions per second Uses very sophisticated evaluation, and undisclosed methods for extending some lines of search up to 40 ply. In Practice Othello: Human champions refuse to compete against computers, who are too good. Go: Human champions refuse to compete against computers, who are too bad. In Go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.

Adversarial Search and Games PDF

Document Details

Tags

Related

Summary

Full Transcript