
Math 464: Linear Optimization and Game Theory

Haijun Li

Department of Mathematics
Washington State University

Spring 2013
Game Theory

Game theory (GT) is a theory of rational behavior of people


with nonidentical (self-) interests.

Common Features:
1 There is a set of at least two players (or entities);
2 all players follow the same set of rules;
3 different players have different, self-interested objectives.
Game Theory

• Game theory can be defined as the theory of mathematical


models of conflict and cooperation between intelligent,
rational decision-makers.
• Game theory is applicable whenever at least two
individuals - people, species, companies, political parties,
or nations - confront situations where the outcome for each
depends on the behavior of all.
• Game theory proposes solution concepts, defining rational
outcomes of games. Solution concepts may be hard to
compute ...
Early History ...

• Modern game theory began with the work of Ernst Zermelo


(1913, Well-Ordering Theorem, Axiom of Choice), Émile
Borel (1921, symmetric two-player zero-sum matrix game),
John von Neumann (1928, two-player matrix game).
• The early results are summarized in the great seminal
book “Theory of Games and Economic Behavior” of von
Neumann and Oskar Morgenstern (1944).
In all GT models the basic entity is a player.

Once we have defined the set of players, we may distinguish between two types of models.

Types of Games
- Non-cooperative game: primitives are the sets of possible actions of individual players.
- Cooperative game: primitives are the sets of possible joint actions of groups of players.

Game Theory

Noncooperative GT (models of type I):
• Games in Strategic Form
• Games in Extensive Form (EFG)
  – EFG with Perfect Information
  – EFG with Imperfect Information

Cooperative GT (models of type II)
Strategic-Form Games or Games in
Normal Form

Basic ingredients:
• N = {1, . . . , n}, n ≥ 2, is a set of players.
• Si is a nonempty set of possible strategies (or pure
strategies) of player i. Each player i must choose some
si ∈ Si .
• S = {(s1 , . . . , sn ) : si ∈ Si }, the set of all possible outcomes
(or pure strategy profiles).
• ui : S → R, a utility function of player i; that is, ui (s) = payoff
of player i if the outcome is s ∈ S.

Definition
A strategic-form game is Γ = (N, {Si }, {ui }).
John Nash Equilibrium (1950)
• Observe that a player’s utility depends not just on his/her
action, but on actions of other players.
• For player i, finding the best action involves deliberating
about what others would do.

Definition
1 All players in N would be happy to find an outcome s∗ ∈ S
such that
ui (s) ≤ ui (s∗ ), ∀ i ∈ N, s ∈ S.
2 An outcome s∗ = (s∗1 , . . . , s∗n ) ∈ S is a Nash equilibrium if for
all i ∈ N,

ui (s∗1 , . . . , s∗i−1 , ti , s∗i+1 , . . . , s∗n ) ≤ ui (s∗ ), ∀ti ∈ Si .


Example: Prisoners’ Dilemma (RAND,
1950; Albert Tucker, 1950)

• Two suspects (A and B) committed a crime.


• Court does not have enough evidence to convict them of
the crime, but can convict them of a minor offense (1 year
in prison each).
• If one suspect confesses (acts as an informer), he walks
free, and the other suspect gets 20 years.
• If both confess, each gets 5 years.
• Suspects have no way of communicating or making
binding agreements.
Prisoners’ Dilemma: A Matrix Game
Rationality =⇒ Not the Best Solution

Suspect A’s reasoning:


• If B stays quiet, I should confess;
• if B confesses, I should confess too.

Suspect B does a similar thing.

Unique Nash Equilibrium at (5, 5):


Both confess and each gets 5 years in prison.
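The equilibrium reasoning above can be checked by brute force. The following is a minimal sketch (not from the slides): it enumerates all outcomes and keeps those where no player gains by a unilateral deviation, taking utilities to be minus the prison years.

```python
# Enumerate outcomes; keep those where no player gains by a unilateral deviation.
from itertools import product

def pure_nash_equilibria(strategy_sets, utilities):
    """strategy_sets: one list of strategies per player;
    utilities: dict mapping each outcome tuple to (u_1, ..., u_n)."""
    equilibria = []
    for s in product(*strategy_sets):
        if all(utilities[s[:i] + (t,) + s[i + 1:]][i] <= utilities[s][i]
               for i, S_i in enumerate(strategy_sets) for t in S_i):
            equilibria.append(s)
    return equilibria

# Prisoners' Dilemma, utilities = minus years in prison:
years = {('quiet', 'quiet'): (-1, -1), ('quiet', 'confess'): (-20, 0),
         ('confess', 'quiet'): (0, -20), ('confess', 'confess'): (-5, -5)}
print(pure_nash_equilibria([['quiet', 'confess']] * 2, years))
# -> [('confess', 'confess')]
```

The check recovers the unique equilibrium: both confess, even though (quiet, quiet) is better for both.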
Two-player Matrix Games
A matrix game is a two-player game in which each player has a finite set of strategies.
• N = {1, 2}.
• S1 = X = {x1 , . . . , xn }, S2 = Y = {y1 , . . . , ym }.
• aij = u1 (xi , yj ), bij = u2 (xi , yj ).

        y1           …    ym
x1   (a11 , b11 )    …    (a1m , b1m )
 ⋮        ⋮          ⋱        ⋮
xn   (an1 , bn1 )    …    (anm , bnm )

Figure : Row Player = player 1, Column Player = player 2


Example: Hawk-Dove
• Two animals are fighting over some prey. Each can behave
like a dove or like a hawk.
• The reasonable outcome for each animal is that in which it
acts like a hawk while the other acts like a dove.
• The worst outcome is that in which both animals act like
hawks.
• Each animal prefers to be hawkish if its opponent is dovish
and dovish if its opponent is hawkish.
• The game has two Nash equilibria, (dove, hawk) and
(hawk, dove), corresponding to two conventions about the
player who yields.

dove hawk
dove 3,3 1,4
hawk 4,1 0,0
Example: Matching Pennies
• Each of two people chooses either Head or Tail.
• If the choices differ, person 1 pays person 2 $1; if they are
the same, person 2 pays person 1 $1.
• Each person cares only about the amount of money that
he receives.
• The game has no Nash equilibrium.

head tail
head 1,-1 -1,1
tail -1,1 1,-1

Figure : No Nash equilibrium
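The claim can be verified by enumerating all four outcomes; a minimal self-contained sketch (not from the slides):

```python
# Matching Pennies: verify that no pure outcome satisfies the Nash condition.
from itertools import product

u1 = {('H', 'H'): 1, ('H', 'T'): -1, ('T', 'H'): -1, ('T', 'T'): 1}  # u2 = -u1

def is_pure_nash(r, c):
    row_ok = all(u1[(t, c)] <= u1[(r, c)] for t in 'HT')    # row cannot gain by deviating
    col_ok = all(-u1[(r, t)] <= -u1[(r, c)] for t in 'HT')  # column cannot gain either
    return row_ok and col_ok

equilibria = [s for s in product('HT', repeat=2) if is_pure_nash(*s)]
print(equilibria)  # -> []
```

At every outcome, the losing player would rather switch, so the list of pure equilibria is empty.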


Strictly Competitive Games

Definition
A strategic game Γ = ({1, 2}, {S1 , S2 }, {u1 , u2 }) is strictly
competitive if for any outcome (s1 , s2 ) ∈ S, we have
u2 (s1 , s2 ) = −u1 (s1 , s2 ) (Zero-Sum).

Remark
1 If u1 (s1 , s2 ) = gain for player 1, then u1 (s1 , s2 ) = loss for
player 2.
2 If an outcome (s∗1 , s∗2 ) is a Nash equilibrium, then

u1 (s1 , s∗2 ) ≤ u1 (s∗1 , s∗2 ) ≤ u1 (s∗1 , s2 ), ∀ s1 ∈ S1 , s2 ∈ S2 .

That is, a Nash equilibrium is a saddle point.


1 Player 1 maximizes gain, whereas player 2 minimizes loss.

min_{y∈S2} u1 (s1 , y) ≤ max_{x∈S1} u1 (x, s2 ), ∀ s1 ∈ S1 , s2 ∈ S2 .

2 In other words, player 1 maximizes player 2’s loss,


whereas player 2 minimizes player 1’s gain.

max_{x∈S1} min_{y∈S2} u1 (x, y) ≤ min_{y∈S2} max_{x∈S1} u1 (x, y).

3 A best guaranteed outcome for player 1 would be x∗ with

min_{y∈S2} u1 (x∗ , y) ≥ min_{y∈S2} u1 (x, y), ∀ x ∈ S1 .

4 A best guaranteed outcome for player 2 would be y∗ with

max_{x∈S1} u1 (x, y∗ ) ≤ max_{x∈S1} u1 (x, y), ∀ y ∈ S2 .

max_{x∈S1} min_{y∈S2} u1 (x, y) ≤ min_{y∈S2} u1 (x∗ , y) ≤ max_{x∈S1} u1 (x, y∗ ) ≤ min_{y∈S2} max_{x∈S1} u1 (x, y).
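These two security levels are easy to compute for any payoff matrix; a small Python sketch (the 2×2 matrix is illustrative):

```python
# maximin = max_i min_j a_ij (row player's guaranteed gain);
# minimax = min_j max_i a_ij (column player's guaranteed cap);
# maximin <= minimax holds for every matrix.
A = [[1, -1],
     [-1, 1]]  # illustrative zero-sum payoff matrix (row player's gains)

maximin = max(min(row) for row in A)
minimax = min(max(col) for col in zip(*A))
print(maximin, minimax)  # -> -1 1  (strict gap: no saddle point)
```

When the two values coincide, the common value is achieved at a saddle point; a strict gap, as here, means no pure-strategy equilibrium exists.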
MiniMax Theorem (Borel, 1921; von Neumann, 1928)
An outcome (s∗1 , s∗2 ) is a Nash equilibrium in a strictly
competitive game Γ = ({1, 2}, {S1 , S2 }, {u1 , −u1 }) if and only if

max_{x∈S1} min_{y∈S2} u1 (x, y) = u1 (s∗1 , s∗2 ) = min_{y∈S2} max_{x∈S1} u1 (x, y) =: game value,

where s∗1 is a best outcome for player 1 and s∗2 is a best outcome for player 2.
Every strictly competitive strategic game admits a simple and convenient representation in matrix form.
Two-player Zero-Sum Matrix Games
{x1 , . . . ,•xnN},=Y{1,
=2} {y1 , . . . , ym },
• S = {x , . . . , x }, S2 = {y1 , . . . , ym }.
= u1 (xi , yj ), 1u2 (xi ,1 yj ) = n−u1 (x , yj ) = −aij .
• aij = u1 (xi , yj ), −aij = iu2 (x i , yj ).


a11  a12  . . .  a1m   → row min
a21  a22  . . .  a2m   → row min
 .    .   . . .   .
an1  an2  . . .  anm   → row min   =⇒ maximin = max of the row minima
 ↓    ↓   . . .   ↓
max  max  . . .  max               =⇒ minimax = min of the column maxima
Two-Player Constant-Sum Games

• There are two players: player 1 is called the row player and
player 2 is called the column player.
• The row player must choose 1 of n strategies, and the
column player must choose 1 of m strategies.
• If the row player chooses the i-th strategy and the column
player chooses the j-th strategy, then the row player
receives a reward of aij and the column player receives a
reward of c − aij .
• If c = 0, then we have a two-player zero-sum game.
Example: Competing Networks
• Network 1 and Network 2 are competing for an audience of
100 million viewers at a certain time slot.
• The networks must simultaneously announce the type of
show they will air in that time slot: Western, soap opera, or
comedy.
• If network 1 has aij million viewers, then network 2 will
have 100 − aij million viewers.
Game of Odds and Evens (or Matching
Pennies, again)
• Two players (Odd and Even) simultaneously choose the
number of fingers (1 or 2) to put out.
• If the sum of the fingers is odd, then Odd wins $1 from
Even.
• If the sum of the fingers is even, then Even wins $1 from
Odd.
• This game has no saddle point.
We Need More Strategies!
To analyze games without a saddle point, we introduce
randomized strategies: each player chooses a strategy according
to a probability distribution.
• x1 = probability that Odd puts out one finger
• x2 = probability that Odd puts out two fingers
• y1 = probability that Even puts out one finger
• y2 = probability that Even puts out two fingers

where x1 + x2 = 1 and y1 + y2 = 1, x1 ≥ 0, x2 ≥ 0, y1 ≥ 0, y2 ≥ 0.

Odd tosses a loaded coin (with P(Head) = x1 , P(Tail) = x2 ) to


choose a strategy. Even does a similar thing.

If x1 = 1 or x2 = 1 (y1 = 1 or y2 = 1), then Odd (Even) chooses a


pure strategy.
Randomized Strategies

Let (x1 , . . . , xm ) and (y1 , . . . , yn ) be two probability vectors (i.e.,


entries are all non-negative and add up to 1 for each vector).
• There are two players: player 1 is called the row player and
player 2 is called the column player.
• The row player must choose 1 of m strategies, and the
column player must choose 1 of n strategies.
• If the row player chooses the i-th strategy with probability xi
and the column player chooses the j-th strategy with
probability yj , then the row player receives a reward of aij
and the column player receives a reward of −aij .
Given that one player fixes a strategy, how do we calculate the
average reward of the other player?
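With independent randomized strategies x and y, the row player's average reward is the expectation Σ_i Σ_j x_i a_ij y_j. A small sketch, using the Odds-and-Evens payoffs from Odd's viewpoint (rows = Odd's finger count, columns = Even's):

```python
def expected_reward(A, x, y):
    """Row player's expected reward: sum_i sum_j x_i * a_ij * y_j."""
    return sum(x[i] * A[i][j] * y[j]
               for i in range(len(A)) for j in range(len(A[0])))

# Odds and Evens, Odd's rewards: Odd wins $1 when the finger sum is odd.
A = [[-1, 1],
     [1, -1]]

# If Even always puts out one finger (y = (1, 0)), Odd's reward is 1 - 2*x1:
x1 = 0.25
print(expected_reward(A, [x1, 1 - x1], [1, 0]))  # -> 0.5, i.e. 1 - 2*0.25
```

When one player plays a pure strategy, the expectation collapses to a single weighted column (or row), which is exactly how the 1 − 2x1 and 2x1 − 1 lines below arise.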
Odd’s Optimal Strategy

Odd needs to minimize his loss (or find a loss floor).


• If Even puts out one finger, then Odd’s average reward is

Odd’s expected reward = (−1)x1 + (+1)(1 − x1 ) = 1 − 2x1 .

• If Even puts out two fingers, then Odd’s average reward is

Odd’s expected reward = (+1)x1 + (−1)(1 − x1 ) = 2x1 − 1.


Figure : Odd’s Reward
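Odd's best floor sits where the two reward lines cross, i.e. where 1 − 2x1 = 2x1 − 1, giving x1 = 1/2 and floor value 0. A quick sketch confirming this (the grid search is only for illustration):

```python
# Odd's guaranteed (floor) reward at x1 is the lower of the two lines;
# Odd maximizes this floor over x1 in [0, 1].
def floor_reward(x1):
    return min(1 - 2 * x1, 2 * x1 - 1)

grid = [i / 100 for i in range(101)]
best = max(grid, key=floor_reward)
print(best, floor_reward(best))  # -> 0.5 0.0
```

Away from x1 = 1/2 one of the two lines dips below zero, so the maximum of the floor is attained exactly at the crossing point.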
Even’s Optimal Strategy

Even needs to maximize his reward (or find a reward ceiling).


• If Odd puts out one finger, then Even’s average reward is

Even’s expected reward = (+1)y1 + (−1)(1 − y1 ) = 2y1 − 1.

• If Odd puts out two fingers, then Even’s average reward is

Even’s expected reward = (−1)y1 + (+1)(1 − y1 ) = 1 − 2y1 .


Figure : Even’s Reward
Analysis

Figure : Value of Game with Randomized Strategies


Value of Game with Randomized
Strategies

• In the game of Odds and Evens, Odd’s loss floor equals
Even’s reward ceiling when they use the randomized
strategy (1/2, 1/2).
• The common value of floor and ceiling is called the value of
the game.
• The strategy that corresponds to the value of the game is
called an optimal strategy.
• This optimal randomized strategy (1/2, 1/2) can be obtained
via the duality theorem.
Randomized Strategies

• Γ = (N, {Si }, {ui }) is a strategic game.


• A randomized strategy of player i is a probability
distribution Pi over the set Si of its pure strategies.
• Pi (si ) = probability that player i chooses strategy si ∈ Si .
• We assume that randomized strategies of different players
are independent.

Definition
For any i ∈ N, the expected utility of player i, given that each player j, j ≠ i, chooses strategy sj ∈ Sj , is

E(Pi ) := Σ_{si ∈Si} ui (s1 , . . . , si−1 , si , si+1 , . . . , sn ) Pi (si ).
Randomized Strategy Nash Equilibrium

• Γ = (N, {Si }, {ui }) is a strategic game.


• A randomized strategy of player i is a probability
distribution Pi over the set Si of its pure strategies.
• Pi (si ) = probability that player i chooses strategy si ∈ Si .
• We assume that randomized strategies of different players
are independent.

Theorem (Nash, 1950)


Every finite strategic game has a randomized strategy Nash
equilibrium.

Remark
For two-player matrix games this result was obtained by von
Neumann in 1928.
Example: Stone, Paper, and Scissors
• The two players (row and column players) must choose 1
of three strategies: Stone, Paper, and Scissors.
• If both players use the same strategy, the game is a draw.
• Otherwise, one player wins $1 from the other according to
the following rule:
scissors cut paper, paper covers stone, stone breaks scissors.
Randomized Strategies

• x1 = probability that row player chooses stone


• x2 = probability that row player chooses paper
• x3 = probability that row player chooses scissors
• y1 = probability that column player chooses stone
• y2 = probability that column player chooses paper
• y3 = probability that column player chooses scissors

where x1 + x2 + x3 = 1 and y1 + y2 + y3 = 1, x1 , x2 , x3 , y1 , y2 , y3
are all non-negative.

The row player chooses a randomized strategy (x1 , x2 , x3 ).

The column player chooses a randomized strategy (y1 , y2 , y3 ).


Row Player’s LP for Max. Reward v

max z = v
v ≤ x2 − x3
v ≤ −x1 + x3
v ≤ x1 − x2
x1 + x2 + x3 = 1
x1 , x2 , x3 ≥ 0, v urs.
Column Player’s LP for Min. Loss w

min z = w
w ≥ −y2 + y3
w ≥ y1 − y3
w ≥ −y1 + y2
y1 + y2 + y3 = 1
y1 , y2 , y3 ≥ 0, w urs.
Dual of Row’s LP = Column LP

The optimal strategy for both players is (1/3, 1/3, 1/3).
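The row player's LP above can be handed to any LP solver. A sketch using SciPy's `linprog` (assuming SciPy is available; the variable ordering (x1, x2, x3, v) is my choice, not from the slides):

```python
import numpy as np
from scipy.optimize import linprog

# Row player's payoffs for Stone, Paper, Scissors (rows = row player's choice).
A = np.array([[0, -1, 1],
              [1, 0, -1],
              [-1, 1, 0]])

# Variables z = (x1, x2, x3, v); linprog minimizes, so minimize -v.
c = np.array([0.0, 0.0, 0.0, -1.0])
A_ub = np.hstack([-A.T, np.ones((3, 1))])   # v - (A^T x)_j <= 0, one row per column j
b_ub = np.zeros(3)
A_eq = np.array([[1.0, 1.0, 1.0, 0.0]])     # x1 + x2 + x3 = 1
b_eq = np.array([1.0])
bounds = [(0, None)] * 3 + [(None, None)]   # x >= 0, v unrestricted in sign

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
x_opt, v_opt = res.x[:3], res.x[3]
print(x_opt.round(3), round(v_opt, 3))  # optimal strategy ~ (1/3, 1/3, 1/3), value ~ 0
```

By LP duality, the column player's LP is exactly the dual of this program, so both share the same optimal value: the game value, here 0.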

Figure : Dual of Row’s LP = Column LP


Proof Idea of Nash’s Theorem via
Duality
• Given that the column player chooses his strategy,
maximize the row player’s expected reward under
randomized strategy (x1 , . . . , xm ).
• Given that the row player chooses his strategy, minimize
the column player’s expected loss under randomized
strategy (y1 , . . . , yn ).
Figure : Dual of Row’s LP = Column LP
• Γ = ({1, 2}, {S1 , S2 }, {u1 , u2 }) is a strategic game.
• A randomized strategy of player i is a probability
distribution Pi over the set Si of its pure strategies.
• Es2 (P1 ) = expected utility of player 1 given that player 2
chooses strategy s2 ∈ S2 .
• Es1 (P2 ) = expected utility of player 2 given that player 1
chooses strategy s1 ∈ S1 .
Primal LP

max z = v
v ≤ min_{s2 ∈S2} E_{s2} (P1 ), v urs.

Dual LP

min z = w
w ≥ max_{s1 ∈S1} E_{s1} (P2 ), w urs.
Duality, Again

• An optimal solution exists such that

max_{P1} min_{s2 ∈S2} E_{s2} (P1 ) = min_{P2} max_{s1 ∈S1} E_{s1} (P2 ).

• The common value is known as the value of the game.


• Nash’s original proof (in his thesis) used Brouwer’s fixed
point theorem.
• When Nash made this point to John von Neumann in 1949,
von Neumann famously dismissed it with the words,
“That’s trivial, you know. That’s just a fixed point theorem.”
(Nasar, 1998)
Significance of Probabilistic Methods

• Probabilistic methods are often used to incorporate


uncertainty. In contrast, the probabilistic method is used
here to enlarge the solution set so that a Nash equilibrium
can be achieved using randomized strategies.
• Probabilistic methods are increasingly used to prove the
existence of certain rare objects in mathematical
constructs.
