
Artificial Intelligence:
State Space Search for Game Playing

Russell & Norvig: Sections 5.1 & 5.4
Many slides from: robotics.stanford.edu/~latombe/cs121/2003/home.htm

Motivation

GO, chess, tic-tac-toe
2

Today

State Space Search for Game Playing
  - MiniMax
  - Alpha-beta pruning
Where we are today

State Space Search for Game Playing

Classical application for heuristic search
  - simple games: exhaustively searchable
  - complex games: only partial search possible
  - additional problem: playing against an opponent

Type of game: 2-person adversarial games

  - Perfect information
      both players know the state of the game and all possible moves
  - No chance involved
      the outcome of the game depends only on the players' moves
  - Zero-sum game
      win, lose or tie
      if the total gains of one player are added up, and the total losses are subtracted, they sum to zero
      a gain by one player must be matched by a loss by the other player

Examples: chess, GO, tic-tac-toe


4

Today

State Space Search for Game Playing
  - MiniMax
  - Alpha-beta pruning
Where we are today

MiniMax Search

Game between two opponents, MIN and MAX
  - MAX tries to win, and MIN tries to minimize MAX's score
  - assumption: both players have the same knowledge

Existing heuristic search methods do not work
  - they would require a helpful opponent
  - need to incorporate hostile moves into the search strategy

MiniMax procedure:
  - label each level according to the player to move
  - leaves get a value of 1 or 0 (win for MAX or MIN)
  - propagate this value up:
      if parent = MAX, give it the max value of its children
      if parent = MIN, give it the min value of its children
6

Example: Game of Nim

Rules
  - 2 players start with a pile of tokens
  - move: split any existing pile into two non-empty, differently-sized piles
  - game ends when no pile can be unevenly split
  - the player who cannot move loses

(A move-generation sketch follows below.)
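As a rough sketch (not from the slides), the rules above can be turned into a move generator in Python; the tuple-of-pile-sizes state representation and the name nim_successors are assumptions made here for illustration.

# Hypothetical helper, not from the slides: a Nim state is a sorted tuple of
# pile sizes, and a move splits one pile into two non-empty, unequal piles.

def nim_successors(state):
    """Return the set of states reachable in one move from `state`."""
    successors = set()
    for i, pile in enumerate(state):
        rest = state[:i] + state[i + 1:]
        # split `pile` into (a, pile - a) with 1 <= a < pile - a (unequal, non-empty)
        for a in range(1, (pile + 1) // 2):
            successors.add(tuple(sorted(rest + (a, pile - a))))
    return successors

print(sorted(nim_successors((7,))))   # [(1, 6), (2, 5), (3, 4)]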

State Space of Game Nim

source: G. Luger (2005)

  - start with one pile of tokens
  - each step has to divide one pile of tokens into 2 non-empty piles of different sizes
  - the player without a move left loses the game

MiniMax Algorithm

  - Label nodes as MIN or MAX, alternating for each level
  - Define a utility function (payoff function) to determine the outcome of a game
      e.g., (0, 1) or (-1, 0, 1)
  - Do a full search on the tree (expand all nodes until the game is over for each branch)
  - Label leaves according to the outcome
  - Propagate the result up the tree with
      M(n) = max( child nodes ) for a MAX node
      M(n) = min( child nodes ) for a MIN node
  - Best next move for MAX is the one leading to the child with the highest value (and vice versa for MIN); a code sketch follows below
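A minimal sketch of this procedure for Nim, reusing the hypothetical nim_successors() helper from the earlier sketch and (-1, +1) outcome labels; this is an illustration, not the slides' own code.

# Exhaustive minimax for Nim (assumes nim_successors() from the earlier sketch).
# Leaves are labelled +1 (win for MAX) or -1 (win for MIN), and values are
# backed up with max() at MAX nodes and min() at MIN nodes.

def minimax_value(state, max_to_move):
    children = nim_successors(state)
    if not children:                       # no legal move: the player to move loses
        return -1 if max_to_move else +1
    values = [minimax_value(child, not max_to_move) for child in children]
    return max(values) if max_to_move else min(values)

def best_move_for_max(state):
    """Best next move for MAX: the child with the highest backed-up value."""
    return max(nim_successors(state), key=lambda c: minimax_value(c, False))

# With 7 tokens and MIN moving first, MAX can force a win (as in the Nim figure):
print(minimax_value((7,), max_to_move=False))   # +1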

Exhaustive MiniMax for Nim

Bold lines indicate a forced win for MAX

source: G. Luger (2005)

10

MiniMax with Heuristic

  - Exhaustive search for interesting games is rarely feasible
  - Search only to a predefined level (no exhaustive search)
      called n-ply look-ahead, where n is the number of levels
      nodes are evaluated with a heuristic, not with win/loss values
      the heuristic indicates the best state that can be reached
      horizon effect: what lies just beyond the search depth is not seen
  - Games with an opponent
      simple strategy: try to maximize the difference between the players using a heuristic function e(n); a depth-limited sketch follows below
11
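A minimal sketch of n-ply look-ahead, assuming game-specific successors() and evaluate() helpers (placeholder names, not from the slides):

# Depth-limited minimax: search n plies deep, then apply the heuristic e(n)
# instead of a true win/loss label. `successors` and `evaluate` are assumed
# to be supplied by the game at hand.

def minimax_nply(state, depth, max_to_move, successors, evaluate):
    children = successors(state)
    if depth == 0 or not children:         # horizon reached, or game over
        return evaluate(state)             # heuristic estimate of the best reachable state
    values = [minimax_nply(c, depth - 1, not max_to_move, successors, evaluate)
              for c in children]
    return max(values) if max_to_move else min(values)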

Heuristic Function

e(n) is a heuristic that estimates how favorable a node n is for MAX
  - e(n) > 0 --> n is favorable to MAX
  - e(n) < 0 --> n is favorable to MIN
  - e(n) = 0 --> n is neutral

12

MiniMax with Fixed Ply Depth

Leaf nodes show the actual heuristic value e(n)
Internal nodes show the backed-up heuristic value

source: G. Luger (2005)

13

Choosing an Evaluation Function e(n)




14

Example: e(n) for Tic-Tac-Toe

Possible e(n):

e(n) = (number of rows, columns, and diagonals open for MAX)
     - (number of rows, columns, and diagonals open for MIN)
e(n) = +∞, if n is a forced win for MAX
e(n) = -∞, if n is a forced win for MIN

e(n) = 8 - 8 = 0
e(n) = 6 - 4 = 2
e(n) = 3 - 3 = 0

15
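A small sketch of this e(n), assuming a 3x3 list-of-lists board with 'X' for MAX and 'O' for MIN; the forced-win (±∞) cases are left out:

# e(n) for tic-tac-toe: open lines for MAX minus open lines for MIN.
# A line (row, column, or diagonal) is "open" for a player if it contains
# no opponent mark.

def open_lines(board, opponent):
    lines = [[board[r][c] for c in range(3)] for r in range(3)]      # rows
    lines += [[board[r][c] for r in range(3)] for c in range(3)]     # columns
    lines += [[board[i][i] for i in range(3)],
              [board[i][2 - i] for i in range(3)]]                   # diagonals
    return sum(1 for line in lines if opponent not in line)

def e(board):
    return open_lines(board, opponent='O') - open_lines(board, opponent='X')

board = [[' ', ' ', ' '],
         [' ', 'X', ' '],
         [' ', ' ', ' ']]
print(e(board))   # 8 - 4 = 4: all 8 lines open for X, only the 4 lines avoiding the centre open for O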

More examples

source: G. Luger (2005)

16

Two-ply MiniMax for Opening Move


Tic-Tac-Toe tree
at horizon = 2

source: G. Luger (2005)

17

Two-ply MiniMax: MAX's possible 2nd moves

source: G. Luger (2005)

18

Two-ply MiniMax: MAX's move at the end

source: G. Luger (2005)

19

Today

State Space Search for Game Playing
  - MiniMax
  - Alpha-beta pruning
Where we are today

20

Alpha-Beta Pruning

An optimization over minimax that:
  - ignores (cuts off, prunes) branches of the tree that cannot possibly lead to a better solution
  - reduces the branching factor
  - allows deeper search with the same effort

21

Alpha-Beta Pruning: Example 1

With minimax, we look at all nodes down to the n-ply depth.
With α-β pruning, we ignore branches that could not possibly contribute to the final decision.

  - B will be >= 5, so we can ignore B's right branch, because A must be 3
  - D will be <= 0, but C will be >= 3, so we can ignore D's right branch
  - E will be <= 2, so we can ignore E's right branch, because C will be 3

source: G. Luger (2005)

22

A closer look

max level: node A
min level: node C

Pruning: node C will not contribute its value to node A, so this part of the tree can't have any effect on the value that will be backed up to node A.
23

Alpha-Beta Pruning Algorithm

  - Start with depth-first search down to the n-ply depth
      apply the heuristic e(n) to a state and its siblings
  - Select min or max and propagate the result to the parent, as in minimax
  - Offer the result to the grand-parent node as a potential cutoff
  - If it is a:
      MAX node, you keep a value called alpha
        best (highest) choice for MAX on the path (never decreases)
      MIN node, you keep a value called beta
        best (lowest) choice for MIN on the path (never increases)
  - Prune the search tree
      alpha cutoff: prune the tree below a MIN node if its value <= alpha
      beta cutoff: prune the tree below a MAX node if its value >= beta
  - The amount of pruning is order sensitive (see the sketch below)

24
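A minimal sketch of the algorithm above, again assuming game-specific successors() and evaluate() helpers; this is an illustration, not the slides' own pseudocode:

# Depth-limited alpha-beta search.
#   alpha: best (highest) value found so far for MAX on the path -- never decreases.
#   beta:  best (lowest) value found so far for MIN on the path -- never increases.
# How much gets pruned is order sensitive: examining strong moves first prunes more.

def alphabeta(state, depth, alpha, beta, max_to_move, successors, evaluate):
    children = successors(state)
    if depth == 0 or not children:
        return evaluate(state)
    if max_to_move:
        value = float('-inf')
        for child in children:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, successors, evaluate))
            alpha = max(alpha, value)
            if value >= beta:              # beta cutoff: the MIN ancestor will never allow this
                break
        return value
    else:
        value = float('inf')
        for child in children:
            value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                         True, successors, evaluate))
            beta = min(beta, value)
            if value <= alpha:             # alpha cutoff: the MAX ancestor will never allow this
                break
        return value

# Top-level call: alphabeta(start, n, float('-inf'), float('inf'), True, successors, evaluate)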

Example with tic-tac-toe


max level

min level

25

Example with tic-tac-toe

max level

min level

β = 2
(value ≤ 2)

The beta value of a MIN node is an upper bound on the final backed-up value. It can never increase.

e(n) = 2
26

Example with tic-tac-toe

max level

min level

e(n) = 2

The beta value of a MIN node is an upper bound on the final backed-up value. It can never increase.

β = 2 → 1
(value ≤ 2 → value ≤ 1)

e(n) = 1
27

Example with tic-tac-toe

The alpha value of a MAX node is a lower bound on the final backed-up value. It can never decrease.

min level

e(n) = 2

α = 1
(value ≥ 1)

β = 1
(value = 1)

e(n) = 1
28

Example with tic-tac-toe

max level

min level

e(n) = 2

α = 1
(value ≥ 1)

β = 1
(value = 1)

e(n) = 1

β = -1
(value ≤ -1)

e(n) = -1
29

Example with tic-tac-toe

incompatible: value ≥ 1 and value ≤ -1,
so stop searching the right branch; the value cannot come from there!

Search can be discontinued below any MIN node whose beta value is less than or equal to the alpha value of one of its MAX ancestors.

α = 1
(value ≥ 1)

e(n) = 2
e(n) = 1

β = -1
(value ≤ -1)

e(n) = -1

30

Alpha-Beta Pruning: Example 2

[Slides 31-57: step-by-step α-β pruning of a game tree, one cutoff at a time, with leaf values
0 5 -3 3 3 -3 0 2 -2 3 5 2 5 -5 0 1 5 1 -3 0 -5 5 -3 3 2]

Alpha-Beta Pruning: Example 3

source: http://en.wikipedia.org/wiki/File:AB_pruning.svg

58

Today

State Space Search for Game Playing
  - MiniMax
  - Alpha-beta pruning
Where we are today

59

Checkers: Tinsley vs. Chinook

Marion Tinsley
World champion for over 40 years

VS

Chinook
Developed by Jonathan Schaeffer, professor at the U. of Alberta

1994: Chinook won the world championship; in 2007, checkers was solved

60

Chess: Kasparov vs. Deep Blue

Garry Kasparov
50 billion neurons
2 positions/sec

VS

Deep Blue
32 RISC processors + 256 VLSI chess engines
200,000,000 positions/sec

1997: Deep Blue wins the match with 2 wins, 1 loss, and 3 draws

61

Chess: Kasparov vs. Deep Junior

Garry Kasparov
still 50 billion neurons
still 2 positions/sec

VS

Deep Junior
8 CPUs, 8 GB RAM, Win 2000
2,000,000 positions/sec
Available for $100

2003: Match ends in a 3-3 tie!


62

Othello: Murakami vs. Logistello

Takeshi Murakami
World Othello champion

VS

Logistello
developed by Michael Buro
runs on a standard PC
https://skatgame.net/mburo/log.html (including source code)

1997: Logistello beat Murakami by 6 games to 0

63

Go: Goemate

?
VS
Goemate
(the best Go program available today)
Developed by Chen Zhixing

  - Go has too high a branching factor for existing search techniques
  - Programs must rely on huge databases and pattern-recognition techniques

64

Secrets

Many game programs are based on:
  alpha-beta pruning +
  iterative deepening +
  huge databases + ...

For instance, Chinook searched all checkers configurations with 8 pieces or less and created an endgame database of 444 billion board configurations.

(An iterative-deepening sketch follows below.)

65
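As a rough illustration of how iterative deepening wraps alpha-beta (assuming the alphabeta() sketch above; real programs also reuse move orderings and transposition tables between iterations):

# Iterative deepening on top of alpha-beta: search depth 1, 2, 3, ... until the
# time budget runs out, keeping the move chosen at the deepest completed depth.
# (Simplification: the iteration running when time expires is not cut short.)

import time

def iterative_deepening(state, successors, evaluate, time_budget=1.0):
    deadline = time.monotonic() + time_budget
    best, depth = None, 1
    while time.monotonic() < deadline:
        best = max(successors(state),
                   key=lambda c: alphabeta(c, depth - 1, float('-inf'), float('inf'),
                                           False, successors, evaluate))
        depth += 1
    return best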

Perspective on Games: Con and Pro

"Chess is the Drosophila of artificial intelligence. However, computer chess has developed much as genetics might have if the geneticists had concentrated their efforts starting in 1910 on breeding racing Drosophila. We would have some science, but mainly we would have very fast fruit flies."
  John McCarthy

"Saying Deep Blue doesn't really think about chess is like saying an airplane doesn't really fly because it doesn't flap its wings."
  Drew McDermott

66

Today

State Space Search for Game Playing
  - MiniMax
  - Alpha-beta pruning
Where we are today

67
