
Artificial Intelligence:
State Space Search for Game Playing

Russell & Norvig: Sections 5.1 & 5.4
Many slides from: robotics.stanford.edu/~latombe/cs121/2003/home.htm

Motivation

GO, chess, tic-tac-toe
2

Today

State Space Search for Game Playing
  - MiniMax
  - Alpha-beta pruning
Where we are today

State Space Search for Game Playing

Classical application for heuristic search
  - simple games: exhaustively searchable
  - complex games: only partial search possible
  - additional problem: playing against an opponent

Type of game: 2-person adversarial games

  - Perfect information
      both players know the state of the game and all possible moves
  - No chance involved
      the outcome of the game depends only on the players' moves
  - Zero-sum game
      win, lose or tie
      if the total gains of one player are added up, and the total losses are subtracted, they sum to zero
      a gain by one player must be matched by a loss by the other player

Examples: chess, GO, tic-tac-toe


4

Today

State Space Search for Game Playing
  - MiniMax
  - Alpha-beta pruning
Where we are today

MiniMax Search

Game between two opponents, MIN and MAX
  - MAX tries to win, and MIN tries to minimize MAX's score
  - assumption: both players have the same knowledge

Existing heuristic search methods do not work
  - they would require a helpful opponent
  - need to incorporate hostile moves into the search strategy

MiniMax procedure:
  - label each level according to the player to move
  - leaves get a value of 1 or 0 (win for MAX or MIN)
  - propagate this value up:
      if parent = MAX, give it the max value of its children
      if parent = MIN, give it the min value of its children
6

Example: Game of Nim

Rules
  - 2 players start with a pile of tokens
  - move: split any existing pile into two non-empty, differently-sized piles
  - game ends when no pile can be unevenly split
  - the player who cannot move loses

(A move-generation sketch follows below.)
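As a rough sketch (not from the slides), the rules above can be turned into a move generator in Python; the tuple-of-pile-sizes state representation and the name nim_successors are assumptions made here for illustration.

# Hypothetical helper, not from the slides: a Nim state is a sorted tuple of
# pile sizes, and a move splits one pile into two non-empty, unequal piles.

def nim_successors(state):
    """Return the set of states reachable in one move from `state`."""
    successors = set()
    for i, pile in enumerate(state):
        rest = state[:i] + state[i + 1:]
        # split `pile` into (a, pile - a) with 1 <= a < pile - a (unequal, non-empty)
        for a in range(1, (pile + 1) // 2):
            successors.add(tuple(sorted(rest + (a, pile - a))))
    return successors

print(sorted(nim_successors((7,))))   # [(1, 6), (2, 5), (3, 4)]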

State Space of Game Nim

source: G. Luger (2005)

  - start with one pile of tokens
  - each step has to divide one pile of tokens into 2 non-empty piles of different sizes
  - the player without a move left loses the game

MiniMax Algorithm

  - Label nodes as MIN or MAX, alternating for each level
  - Define a utility function (payoff function) to determine the outcome of a game
      e.g., (0, 1) or (-1, 0, 1)
  - Do a full search on the tree (expand all nodes until the game is over for each branch)
  - Label leaves according to the outcome
  - Propagate the result up the tree with
      M(n) = max( child nodes ) for a MAX node
      M(n) = min( child nodes ) for a MIN node
  - Best next move for MAX is the one leading to the child with the highest value (and vice versa for MIN); a code sketch follows below
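A minimal sketch of this procedure for Nim, reusing the hypothetical nim_successors() helper from the earlier sketch and (-1, +1) outcome labels; this is an illustration, not the slides' own code.

# Exhaustive minimax for Nim (assumes nim_successors() from the earlier sketch).
# Leaves are labelled +1 (win for MAX) or -1 (win for MIN), and values are
# backed up with max() at MAX nodes and min() at MIN nodes.

def minimax_value(state, max_to_move):
    children = nim_successors(state)
    if not children:                       # no legal move: the player to move loses
        return -1 if max_to_move else +1
    values = [minimax_value(child, not max_to_move) for child in children]
    return max(values) if max_to_move else min(values)

def best_move_for_max(state):
    """Best next move for MAX: the child with the highest backed-up value."""
    return max(nim_successors(state), key=lambda c: minimax_value(c, False))

# With 7 tokens and MIN moving first, MAX can force a win (as in the Nim figure):
print(minimax_value((7,), max_to_move=False))   # +1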

Exhaustive MiniMax for Nim

Bold lines indicate a forced win for MAX

source: G. Luger (2005)

10

MiniMax with Heuristic

  - Exhaustive search for interesting games is rarely feasible
  - Search only to a predefined level (no exhaustive search)
      called n-ply look-ahead, where n is the number of levels
      nodes are evaluated with a heuristic, not with win/loss values
      the heuristic indicates the best state that can be reached
      horizon effect: what lies just beyond the search depth is not seen
  - Games with an opponent
      simple strategy: try to maximize the difference between the players using a heuristic function e(n); a depth-limited sketch follows below
11
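A minimal sketch of n-ply look-ahead, assuming game-specific successors() and evaluate() helpers (placeholder names, not from the slides):

# Depth-limited minimax: search n plies deep, then apply the heuristic e(n)
# instead of a true win/loss label. `successors` and `evaluate` are assumed
# to be supplied by the game at hand.

def minimax_nply(state, depth, max_to_move, successors, evaluate):
    children = successors(state)
    if depth == 0 or not children:         # horizon reached, or game over
        return evaluate(state)             # heuristic estimate of the best reachable state
    values = [minimax_nply(c, depth - 1, not max_to_move, successors, evaluate)
              for c in children]
    return max(values) if max_to_move else min(values)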

Heuristic Function

e(n) is a heuristic that estimates how favorable a node n is for MAX
  - e(n) > 0 --> n is favorable to MAX
  - e(n) < 0 --> n is favorable to MIN
  - e(n) = 0 --> n is neutral

12

MiniMax with Fixed Ply Depth

Leaf nodes show the actual heuristic value e(n)
Internal nodes show the backed-up heuristic value

source: G. Luger (2005)

13

Choosing an Evaluation Function e(n)




14

Example: e(n) for Tic-Tac-Toe

Possible e(n):

e(n) = (number of rows, columns, and diagonals open for MAX)
     - (number of rows, columns, and diagonals open for MIN)
e(n) = +∞, if n is a forced win for MAX
e(n) = -∞, if n is a forced win for MIN

e(n) = 8 - 8 = 0
e(n) = 6 - 4 = 2
e(n) = 3 - 3 = 0

15
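A small sketch of this e(n), assuming a 3x3 list-of-lists board with 'X' for MAX and 'O' for MIN; the forced-win (±∞) cases are left out:

# e(n) for tic-tac-toe: open lines for MAX minus open lines for MIN.
# A line (row, column, or diagonal) is "open" for a player if it contains
# no opponent mark.

def open_lines(board, opponent):
    lines = [[board[r][c] for c in range(3)] for r in range(3)]      # rows
    lines += [[board[r][c] for r in range(3)] for c in range(3)]     # columns
    lines += [[board[i][i] for i in range(3)],
              [board[i][2 - i] for i in range(3)]]                   # diagonals
    return sum(1 for line in lines if opponent not in line)

def e(board):
    return open_lines(board, opponent='O') - open_lines(board, opponent='X')

board = [[' ', ' ', ' '],
         [' ', 'X', ' '],
         [' ', ' ', ' ']]
print(e(board))   # 8 - 4 = 4: all 8 lines open for X, only the 4 lines avoiding the centre open for O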

More examples

source: G. Luger (2005)

16

Two-ply MiniMax for Opening Move


Tic-Tac-Toe tree
at horizon = 2

source: G. Luger (2005)

17

Two-ply MiniMax: MAX's possible 2nd moves

source: G. Luger (2005)

18

Two-ply MiniMax: MAX's move at the end

source: G. Luger (2005)

19

Today

State Space Search for Game Playing
  - MiniMax
  - Alpha-beta pruning
Where we are today

20

Alpha-Beta Pruning

An optimization over minimax that:
  - ignores (cuts off, prunes) branches of the tree that cannot possibly lead to a better solution
  - reduces the branching factor
  - allows deeper search with the same effort

21

Alpha-Beta Pruning: Example 1

With minimax, we look at all nodes down to the n-ply depth.
With α-β pruning, we ignore branches that could not possibly contribute to the final decision.

  - B will be >= 5, so we can ignore B's right branch, because A must be 3
  - D will be <= 0, but C will be >= 3, so we can ignore D's right branch
  - E will be <= 2, so we can ignore E's right branch, because C will be 3

source: G. Luger (2005)

22

A closer look

max level: node A
min level: node C

Pruning: node C will not contribute its value to node A, so this part of the tree can't have any effect on the value that will be backed up to node A.
23

Alpha-Beta Pruning Algorithm

  - Start with depth-first search down to the n-ply depth
      apply the heuristic e(n) to a state and its siblings
  - Select min or max and propagate the result to the parent, as in minimax
  - Offer the result to the grand-parent node as a potential cutoff
  - If it is a:
      MAX node, you keep a value called alpha
        best (highest) choice for MAX on the path (never decreases)
      MIN node, you keep a value called beta
        best (lowest) choice for MIN on the path (never increases)
  - Prune the search tree
      alpha cutoff: prune the tree below a MIN node if its value <= alpha
      beta cutoff: prune the tree below a MAX node if its value >= beta
  - The amount of pruning is order sensitive (see the sketch below)

24
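A minimal sketch of the algorithm above, again assuming game-specific successors() and evaluate() helpers; this is an illustration, not the slides' own pseudocode:

# Depth-limited alpha-beta search.
#   alpha: best (highest) value found so far for MAX on the path -- never decreases.
#   beta:  best (lowest) value found so far for MIN on the path -- never increases.
# How much gets pruned is order sensitive: examining strong moves first prunes more.

def alphabeta(state, depth, alpha, beta, max_to_move, successors, evaluate):
    children = successors(state)
    if depth == 0 or not children:
        return evaluate(state)
    if max_to_move:
        value = float('-inf')
        for child in children:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, successors, evaluate))
            alpha = max(alpha, value)
            if value >= beta:              # beta cutoff: the MIN ancestor will never allow this
                break
        return value
    else:
        value = float('inf')
        for child in children:
            value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                         True, successors, evaluate))
            beta = min(beta, value)
            if value <= alpha:             # alpha cutoff: the MAX ancestor will never allow this
                break
        return value

# Top-level call: alphabeta(start, n, float('-inf'), float('inf'), True, successors, evaluate)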

Example with tic-tac-toe


max level

min level

25

Example with tic-tac-toe

max level

min level

β = 2
(value ≤ 2)

The beta value of a MIN node is an upper bound on the final backed-up value. It can never increase.

e(n) = 2
26

Example with tic-tac-toe

max level

min level

e(n) = 2

The beta value of a MIN node is an upper bound on the final backed-up value. It can never increase.

β = 2 → 1
(value ≤ 2 → value ≤ 1)

e(n) = 1
27

Example with tic-tac-toe

The alpha value of a MAX node is a lower bound on the final backed-up value. It can never decrease.

min level

e(n) = 2

α = 1
(value ≥ 1)

β = 1
(value = 1)

e(n) = 1
28

Example with tic-tac-toe

max level

min level

e(n) = 2

α = 1
(value ≥ 1)

β = 1
(value = 1)

e(n) = 1

β = -1
(value ≤ -1)

e(n) = -1
29

Example with tic-tac-toe

incompatible: value ≥ 1 and value ≤ -1,
so stop searching the right branch; the value cannot come from there!

Search can be discontinued below any MIN node whose beta value is less than or equal to the alpha value of one of its MAX ancestors.

α = 1
(value ≥ 1)

e(n) = 2
e(n) = 1

β = -1
(value ≤ -1)

e(n) = -1

30

Alpha-Beta Pruning: Example 2

[Slides 31-57: step-by-step α-β pruning of a game tree, one cutoff at a time, with leaf values
0 5 -3 3 3 -3 0 2 -2 3 5 2 5 -5 0 1 5 1 -3 0 -5 5 -3 3 2]

Alpha-Beta Pruning: Example 3

source: http://en.wikipedia.org/wiki/File:AB_pruning.svg

58

Today

State Space Search for Game Playing
  - MiniMax
  - Alpha-beta pruning
Where we are today

59

Checkers: Tinsley vs. Chinook

Marion Tinsley
World champion for over 40 years

VS

Chinook
Developed by Jonathan Schaeffer, professor at the U. of Alberta

1994: Chinook won the world championship; in 2007, checkers was solved

60

Chess: Kasparov vs. Deep Blue

Garry Kasparov
50 billion neurons
2 positions/sec

VS

Deep Blue
32 RISC processors + 256 VLSI chess engines
200,000,000 positions/sec

1997: Deep Blue wins the match with 2 wins, 1 loss, and 3 draws

61

Chess: Kasparov vs. Deep Junior

Garry Kasparov
still 50 billion neurons
still 2 positions/sec

VS

Deep Junior
8 CPUs, 8 GB RAM, Win 2000
2,000,000 positions/sec
Available for $100

2003: Match ends in a 3-3 tie!


62

Othello: Murakami vs. Logistello

Takeshi Murakami
World Othello champion

VS

Logistello
developed by Michael Buro
runs on a standard PC
https://skatgame.net/mburo/log.html (including source code)

1997: Logistello beat Murakami by 6 games to 0

63

Go: Goemate

?
VS
Goemate
(the best Go program available today)
Developed by Chen Zhixing

  - Go has too high a branching factor for existing search techniques
  - Programs must rely on huge databases and pattern-recognition techniques

64

Secrets

Many game programs are based on:
  alpha-beta pruning +
  iterative deepening +
  huge databases + ...

For instance, Chinook searched all checkers configurations with 8 pieces or less and created an endgame database of 444 billion board configurations.

(An iterative-deepening sketch follows below.)

65
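As a rough illustration of how iterative deepening wraps alpha-beta (assuming the alphabeta() sketch above; real programs also reuse move orderings and transposition tables between iterations):

# Iterative deepening on top of alpha-beta: search depth 1, 2, 3, ... until the
# time budget runs out, keeping the move chosen at the deepest completed depth.
# (Simplification: the iteration running when time expires is not cut short.)

import time

def iterative_deepening(state, successors, evaluate, time_budget=1.0):
    deadline = time.monotonic() + time_budget
    best, depth = None, 1
    while time.monotonic() < deadline:
        best = max(successors(state),
                   key=lambda c: alphabeta(c, depth - 1, float('-inf'), float('inf'),
                                           False, successors, evaluate))
        depth += 1
    return best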

Perspective on Games: Con and Pro

"Chess is the Drosophila of artificial intelligence. However, computer chess has developed much as genetics might have if the geneticists had concentrated their efforts starting in 1910 on breeding racing Drosophila. We would have some science, but mainly we would have very fast fruit flies."
  John McCarthy

"Saying Deep Blue doesn't really think about chess is like saying an airplane doesn't really fly because it doesn't flap its wings."
  Drew McDermott

66

Today

State Space Search for Game Playing
  - MiniMax
  - Alpha-beta pruning
Where we are today

67
