
Mathematical foundations of microeconomic theory:

Preference, utility, choice


Mark Voorneveld
September 6, 2010

Contents

Preface

1 Preference
  1.1 Preference relations
  1.2 Preference over commodity bundles

2 Utility
  2.1 Utility functions
  2.2 From preference to utility: finite or countable sets
  2.3 Preference, but no utility
  2.4 In no-man's-land: A necessary and sufficient condition for utility representation
  2.5 Continuous utility
  2.6 Some special functional forms

3 Choice
  3.1 Existence of most preferred elements
  3.2 Revealed preference
  3.3 Exercises

4 Choices of a consumer: classical demand theory
  4.1 The preference/utility maximization problem
  4.2 Properties of the demand correspondence and indirect utility
  4.3 The expenditure minimization problem
  4.4 Relations between UMP and EMP
  4.5 Welfare analysis for the consumer
  4.6 Welfare and Hicksian demand

5 Choices of a producer: classical supply theory
  5.1 Production sets
  5.2 Properties of production sets
  5.3 The profit maximization problem
  5.4 Solving the PMP
  5.5 The cost minimization problem
  5.6 Linking the PMP and the CMP
  5.7 Efficiency

6 General equilibrium
  6.1 What is an equilibrium?
  6.2 Pure exchange economies
  6.3 Welfare analysis
  6.4 Private ownership economies
  6.5 Exercises

7 Expected utility theory
  7.1 Simple and compound gambles
  7.2 Preferences over gambles
  7.3 von Neumann-Morgenstern utility functions
  7.4 Exercises

8 Risk attitudes
  8.1 In for a gamble?
  8.2 Certainty equivalent and risk premium
  8.3 Arrow-Pratt measure of absolute risk aversion
  8.4 A derivation of the Arrow-Pratt measure

9 Some critique on expected utility theory
  9.1 Problems with unbounded utility: a variant of the St. Petersburg paradox
  9.2 Allais' paradox
  9.3 Probability matching
  9.4 Rabin's calibration theorem

10 Time preference
  10.1 Stationarity and exponential discounting
  10.2 Preference reversal and hyperbolic discounting
  10.3 Limit-of-means and overtaking
  10.4 Better may be worse

11 Probabilistic choice
  11.1 The Luce model
  11.2 The logit model
  11.3 The linear probability model
  11.4 Exercises

Full circle: overview
Notation
References
Suggested solutions

Preface
Overview
The purpose of these notes is to introduce you to some mathematical foundations of economic
theory. These are building blocks of economics that hopefully contribute to your understanding of
formal modeling in your other courses and in the research papers you will read and, eventually,
write.
The typical model of the behavior of an economic agent requires careful answers to the
following questions:

(Q1) What can the agent choose from, i.e., what is the set of feasible alternatives?
(Q2) What does the agent like, i.e., what are the preferences over alternatives?
(Q3) How are the former two combined to make a choice, i.e., to select among alternatives?

Although we make some brief excursions into bounded rationality, the main building block of
traditional economics is rational choice: choose from your set of feasible alternatives a most
preferred one. This raises important related questions:

(Q4) When do most preferred elements exist?
(Q5) How are they affected when the agent's environment changes?

The fourth question is extremely important: you'd be surprised about how many people simply
skip over the existence issue and write papers about how solutions to economic problems are
affected by parameter changes, without ever wondering whether there even is a solution. The
fifth question concerns things like how a consumer's demand is affected by price changes, wage
increases, etc.
Try to keep this in mind, because this is what will occupy us most of the time and constitutes
the common thread of the course: regardless of the setting, we first have to answer (Q1) to (Q3)
to provide a meaningful microfounded model of an economic agent's behavior. Sections 1 to 3
provide a general framework for modeling preferences over and choice from a feasible set of
alternatives. This general framework is then applied to a number of specific cases: traditional
models of consumer choice (Section 4), producer choice (Section 5), choice over outcomes that are
no longer deterministic, but occur with certain probabilities (Section 7), choice over outcomes
occurring over time (Section 10), and even the modeling of seemingly suboptimal choices
(Section 11).

Special features
Every course reflects some of the teacher's own preferences. Although the material covered here
is pretty standard for a first PhD course in microeconomic theory, what distinguishes these notes
from other graduate texts is:

Focus on preferences: The notes have a relatively strong focus on preferences, rather than
utility functions. Utility functions are practical in the sense that they allow you to use standard
calculus tools, but this tends to blur the picture by making economics into an exercise in advanced
differentiation. I try to avoid this. Although people make statements like "I like coffee more than
tea", you hardly ever see them in a supermarket with a calculator and their utility function
written on a piece of paper.
This allows us to give a much more general answer, with a remarkably simple proof, to
the question when most preferred elements exist; see Proposition 3.1.

From preferences to utility:

• Not all preferences can be represented by means of a utility function. Graduate texts
  typically give exactly one example, lexicographic preferences, as if it concerns an exotic
  phenomenon. These notes try to give some counterweight by providing several economically
  relevant examples, all arising from the same general principle; see Section 2.3.

• So the question remains, when does a utility function exist? Section 2.4 provides necessary
  and sufficient conditions.

• As an important special case, when does a continuous utility function exist? Proposition 2.6
  provides a detailed proof. Remarkably, not even Fishburn (1970a), the standard reference
  on utility theory, contains such a proof, and neither does any of the standard textbooks in
  microeconomic theory.
  I don't actually expect you to know the proof, I just wanted to fill a gap and make sure
  you have access to it.

Miscellanea: Other things not commonly found in standard texts include:

• An existence result for Walrasian equilibria in terms of excess demand correspondences,
  rather than excess demand functions; see Proposition 6.5.

• Some excursions into the realm of bounded rationality, with brief discussions of hyperbolic
  discounting (Section 10.2), probabilistic choice (Section 11), and some exotic preferences
  (Section 3.3).

Solutions manual: Like any textbook, these notes contain exercises. They also contain
solutions to all [1] of the exercises, in the hope of facilitating self-study: if you have time to
do some exercises, you can immediately check your solutions. If you're pressed for time, you
can treat the worked exercises as a collection of a few dozen (cleverly disguised) examples and
applications.

Recommended reading
The lecture notes are the reading material for the course. You may omit the proofs of Propositions
2.5 and 2.6, as well as the more mathematical exercises in Section 10.3. For the interested reader,
the following table refers to related material in Mas-Colell, Whinston, and Green (1995, MWG),
which is by no means obligatory.

Lecture notes                   See also MWG
1. Preference                   1.A-B, 2.A-C, 3.B
2. Utility                      3.C
3. Choice                       1.C-D, 2.D
4. Choices of a consumer        2.E, 3.D-E, G, I
5. Choices of a producer        5.A-C, F-G
6. General equilibrium          15.A-C, 16.A-D, 17.A-C, 18.A-B
7. Expected utility theory      6.A-B
8. Risk attitudes               6.C
9. Some critique                6.B
10. Time preference             20.A-B
11. Probabilistic choice        none

[1] Well, almost all exercises, as a couple of them will be used as this year's home assignments...

Terminology
In economics, there is little consensus on terminology. For instance, following Arrow (1959)
and Fishburn (1970b), I refer to a complete transitive binary relation that models an economic
agent's preferences as a 'weak order'. Other names include 'rational preference relation'
(Mas-Colell et al., 1995), a very loaded term, simply 'preference relation' (Rubinstein, 2006),
'complete preordering' (Debreu, 1959), 'complete weak order' (Fishburn, 1979), and 'complete
ordering' (Debreu, 1954). The Micro I course and its exam use the definitions from these lecture
notes.

1. Preference

1.1. Preference relations

Rational choice essentially means choosing from a set of feasible options a most preferred
alternative. Let $X$ be a set of alternatives. A preference relation $\succsim$ is a binary
relation on $X$, allowing the comparison between pairs of alternatives. For each $x, y \in X$,
read $x \succsim y$ as "$x$ is at least as good as/weakly preferred to/weakly better than $y$".

A binary relation $\succsim$ is a weak order if it satisfies:

Completeness: for all $x, y \in X$, $x \succsim y$ or $y \succsim x$ (or both).
Transitivity: for all $x, y, z \in X$, if $x \succsim y$ and $y \succsim z$, then $x \succsim z$.
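On a finite set, the two defining properties of a weak order can be checked by brute force. The following is a minimal sketch, not part of the original notes: the function names and the two sample relations (ranking integers by absolute value, and comparing pairs coordinate by coordinate) are illustrative assumptions.

```python
from itertools import product

def is_complete(X, weakly_prefers):
    """True if for all x, y in X: x is weakly preferred to y, or y to x (or both)."""
    return all(weakly_prefers(x, y) or weakly_prefers(y, x)
               for x, y in product(X, repeat=2))

def is_transitive(X, weakly_prefers):
    """True if x >= y and y >= z always imply x >= z in the given relation."""
    return all(weakly_prefers(x, z)
               for x, y, z in product(X, repeat=3)
               if weakly_prefers(x, y) and weakly_prefers(y, z))

def is_weak_order(X, weakly_prefers):
    return is_complete(X, weakly_prefers) and is_transitive(X, weakly_prefers)

# Hypothetical example 1: integers ranked by absolute value (a weak order).
numbers = [-2, -1, 0, 1, 2]
def by_abs(x, y):
    return abs(x) >= abs(y)

# Hypothetical example 2: pairs compared coordinate by coordinate
# (transitive, but (1, 0) and (0, 1) are incomparable).
bundles = [(0, 0), (1, 0), (0, 1), (1, 1)]
def coordinatewise(x, y):
    return x[0] >= y[0] and x[1] >= y[1]

print(is_weak_order(numbers, by_abs))                  # True
print(is_complete(bundles, coordinatewise),
      is_transitive(bundles, coordinatewise))          # False True
```

The check enumerates all pairs and triples, so it is only practical for small sets; it illustrates the definitions rather than replacing a proof.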
Exercise 1.1 Are the following binary relations $\succsim$ necessarily complete, transitive?
(a) $X$ consists of the items in an English dictionary, $\succsim$ is the alphabetical order in
    which they are listed.
(b) $X$ is a group of people and for $x, y \in X$: $x \succsim y$ if and only if $x$ knows $y$.

From preference relation $\succsim$, one can derive two other binary relations:

Strict preference: $x \succ y$ if $x \succsim y$, but not $y \succsim x$ ("$x$ is better
than/strictly preferred to $y$").

Indifference: $x \sim y$ if $x \succsim y$ and $y \succsim x$ ("$x$ and $y$ are equally
good/equivalent").

We sometimes write $y \precsim x$ instead of $x \succsim y$ and $y \prec x$ instead of
$x \succ y$.

Economic theory relies heavily on preferences. You should be aware of some hidden assumptions:

• Preferences are deterministic: they are not susceptible to a change of mind or mood shocks.
  Statements like "I like coffee more than tea at any time, but today I prefer a cup of tea."
  are ruled out.

• Preferences are ordinal: the intensity of preferences, as in "I'm rather fond of the 6
  o'clock news, but detest soap operas.", plays no role.

• Preference is a binary relation: it compares pairs of alternatives, independently of external
  factors. Conditional statements like "If there are twenty types of coffee to choose from, I
  prefer tea to any type of coffee. Otherwise, I take an espresso." are ruled out.

Also completeness and transitivity deserve scrutiny. Completeness rules out the existence of
incomparable alternatives. Transitivity is violated in a number of plausible situations:

Majority rule voting: Consider three agents with strict preferences over three alternatives
$a, b, c$ as follows:

\[ a \succ_1 b \succ_1 c \quad \text{and} \quad c \succ_2 a \succ_2 b \quad \text{and} \quad b \succ_3 c \succ_3 a. \]

This involves a slight but common abuse of notation: although this was not stated explicitly,
the notation above is taken to suggest, for instance, that also $a \succ_1 c$. Define a new
preference relation $\succ$ via majority rule voting: $a \succ b$, because a majority (namely the
agents 1 and 2) strictly prefers $a$ over $b$. Similarly, $b \succ c$ and $c \succ a$, in
violation of transitivity. This example is sometimes referred to as the Condorcet paradox.
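The cycle in the majority rule example can be verified mechanically. This is a small sketch under the assumption that each agent's ranking is stored as a list with the best alternative first; the names `rankings` and `majority_prefers` are illustrative and not taken from the notes.

```python
# Each agent's strict ranking, best alternative first (as in the example above).
rankings = {
    1: ["a", "b", "c"],
    2: ["c", "a", "b"],
    3: ["b", "c", "a"],
}

def majority_prefers(x, y):
    """True if a strict majority of the agents ranks x above y."""
    votes = sum(r.index(x) < r.index(y) for r in rankings.values())
    return votes > len(rankings) / 2

# The three pairwise majority comparisons form a cycle.
for x, y in [("a", "b"), ("b", "c"), ("c", "a")]:
    print(x, "beats", y, ":", majority_prefers(x, y))   # True each time

# Transitivity would require that a beats c, which fails.
print(majority_prefers("a", "c"))                       # False
```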

Nonperceivable differences and similarity: The human body cannot perceive differences
in stimuli unless they exceed a certain threshold. For instance, you will typically not sense the
difference between a cup of tea with $n \in \mathbb{N}$ grains of sugar and $n + 1$ grains of
sugar. Therefore, you will be indifferent between them. If preferences are transitive, you will be
indifferent between a cup of tea with 1 grain of sugar, 2 grains of sugar, 3 grains of sugar, ...,
one kilo of sugar. Are you? This example is related to the more general issue of similarity:
nearby alternatives may be perceived similar and therefore equally good. But with a long chain of
nearby alternatives, you can create a huge change between alternatives, so that you may no longer
be indifferent between them.
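The sugar example can be made concrete with a perception threshold. The sketch below assumes, purely for illustration, that two cups are indistinguishable when they differ by at most one grain; the constant `THRESHOLD` and the function `indifferent` are hypothetical names.

```python
THRESHOLD = 1  # assumption: differences of at most one grain go unnoticed

def indifferent(n, m):
    """Cups with n and m grains of sugar are indistinguishable to this agent."""
    return abs(n - m) <= THRESHOLD

# Every neighbouring pair in a long chain is indistinguishable ...
print(all(indifferent(n, n + 1) for n in range(1, 1001)))   # True

# ... but the two ends of the chain are not, so this indifference
# relation is not transitive.
print(indifferent(1, 1001))                                 # False
```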
Properties of $\succsim$ imply some properties of the indifference relation $\sim$ and the strict
preference relation $\succ$. The proofs involve only simple manipulations of the definitions of
$\sim$ and $\succ$; check that you can do this. I only prove part (d).

Proposition 1.1 Let $\succsim$ be a weak order on $X$.

(a) The indifference relation $\sim$ is an equivalence relation, i.e., it satisfies:
    reflexivity: $\forall x \in X$: $x \sim x$.
    symmetry: $\forall x, y \in X$: if $x \sim y$, then $y \sim x$.
    transitivity: $\forall x, y, z \in X$: if $x \sim y$ and $y \sim z$, then $x \sim z$.

(b) The strict preference relation $\succ$ satisfies:
    irreflexivity: $\forall x \in X$: not $x \succ x$.
    asymmetry: $\forall x, y \in X$: if $x \succ y$, then not $y \succ x$.
    transitivity: $\forall x, y, z \in X$: if $x \succ y$ and $y \succ z$, then $x \succ z$.

(c) $\forall x, y, z \in X$: if $x \sim y$ and $y \succsim z$, then $x \succsim z$.

(d) $\forall x, y, z \in X$: if $x \succ y$ and $y \succsim z$, then $x \succ z$.

Proof. (d): Let $x, y, z \in X$ with $x \succ y$ and $y \succsim z$. By definition of $\succ$,
$x \succ y$ implies $x \succsim y$. With $y \succsim z$ and transitivity of $\succsim$, this
implies $x \succsim z$. It is not true that $z \succsim x$: if it were, it would imply with
$y \succsim z$ and transitivity of $\succsim$ that $y \succsim x$, contradicting that
$x \succ y$. Since $x \succsim z$, but not $z \succsim x$: $x \succ z$.

Exercise 1.2 Complete the proof of the proposition.

1.2. Preference over commodity bundles

In the standard microeconomic model of consumer choice, the set of alternatives $X$ is usually
taken to be $\mathbb{R}^L$ or $\mathbb{R}^L_+$ for some $L \in \mathbb{N}$. The interpretation is
that there are $L$ commodities, in the latter case to be consumed in nonnegative amounts. An
element $x = (x_1, \ldots, x_L) \in X$ is called a (commodity) bundle; its $k$-th coordinate $x_k$
indicates the quantity of commodity $k$.

The additional structure obtained this way allows us to introduce a number of new properties;
throughout this subsection, assume therefore that $X$ equals $\mathbb{R}^L_+$ or $\mathbb{R}^L$.
These properties are typically illustrated using indifference curves. The indifference curve
containing $x \in X$ is the set $\{y \in X : x \sim y\}$ of points equivalent with $x$.

Recall that the (Euclidean) distance between vectors $x, y \in \mathbb{R}^L$ is defined as

\[ \|x - y\| = \sqrt{\sum_{\ell=1}^{L} (x_\ell - y_\ell)^2}. \]

The preference relation $\succsim$ over $X$ satisfies local nonsatiation if, for every alternative
$x$, there is an alternative arbitrarily close to $x$ that is better: for each $x \in X$ and each
$\varepsilon > 0$ there is a $y \in X$ with $\|x - y\| < \varepsilon$ and $y \succ x$.

Monotonicity properties come in different varieties, all reflecting the intuition that "more is
better". Let $k \in \{1, \ldots, L\}$ and let $e_k \in \mathbb{R}^L$ denote the $k$-th standard
basis vector with $k$-th coordinate equal to one and all other coordinates equal to zero. For
$x, y \in \mathbb{R}^L$, write

    $x \geq y$ if $x_i \geq y_i$ for all coordinates $i = 1, \ldots, L$,
    $x > y$ if $x_i > y_i$ for all coordinates $i = 1, \ldots, L$.

The preference relation $\succsim$ is:

• strongly monotonic in coordinate $k$ if increasing this coordinate gives better
  alternatives: for each $x \in X$ and each $\alpha > 0$: $x + \alpha e_k \succ x$.

• strongly monotonic if an increase in at least one coordinate gives better alternatives: for all
  $x, y \in X$, if $x \geq y$ and $x \neq y$, then $x \succ y$.

• monotonic if, for all $x, y \in X$: $x \geq y$ implies $x \succsim y$, and $x > y$ implies
  $x \succ y$.

For instance, a strongly monotonic preference relation $\succsim$ is strongly monotonic in each
coordinate. The converse holds if $\succsim$ is transitive.

Exercise 1.3
(a) Prove the previous two sentences.
(b) Give an example of a preference relation over $\mathbb{R}^2_+$ that is strongly monotonic in
    coordinate $k$, both for $k = 1$ and $k = 2$, but not strongly monotonic.
(c) Let $X = \mathbb{R}^L_+$ and assume that according to the preference relation $\succsim$,
    less is better (think of the coordinates as measures of pollution, unhealthy commodities,
    etc.) in the sense that $x \leq y$ and $x \neq y$ imply that $x \succ y$. Is this preference
    relation locally nonsatiated?
(d) Answer the same question as in (c), but with $X = \mathbb{R}^L$.

Each of the three monotonicity properties implies local nonsatiation. On the other hand, local
nonsatiation has no implications for monotonicity: the preference relation $\succsim$ on
$\mathbb{R}^2_+$ with

\[ (x_1, x_2) \succsim (y_1, y_2) \iff (x_1 - x_2)^2 + x_1 \geq (y_1 - y_2)^2 + y_1 \]

is locally nonsatiated, but satisfies none of the monotonicity properties. Figure 1 contains three
indifference curves of this preference relation (with the better ones further away from the
origin) and shows that small increases in one or both of the coordinates may lead into areas
with strictly worse alternatives.

[Figure 1: Local nonsatiation has no implications for monotonicity. Three indifference curves in
the $(x_1, x_2)$ plane.]
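The failure of monotonicity in this example can be seen from a few hand-picked bundles. The sketch below evaluates the index $(x_1 - x_2)^2 + x_1$ that defines the preference relation; the function name `u` and the sample points are illustrative choices, not taken from the notes.

```python
def u(x1, x2):
    """Index whose ordering defines the example preference relation on R^2_+."""
    return (x1 - x2) ** 2 + x1

# Starting from (1, 3), where the index equals 5:
print(u(1.1, 3.0) < u(1.0, 3.0))    # True: more of good 1 alone is worse
print(u(1.2, 3.1) < u(1.0, 3.0))    # True: more of both goods can be worse

# Starting from (3, 1), where the index equals 7:
print(u(3.0, 1.1) < u(3.0, 1.0))    # True: more of good 2 alone is worse

# Yet a small move in the direction (1, 1) is an improvement, in line
# with local nonsatiation.
print(u(1.01, 3.01) > u(1.0, 3.0))  # True
```

A handful of points cannot prove local nonsatiation everywhere; they only illustrate why none of the three monotonicity properties can hold.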
Figure 2 summarizes the relations between the three monotonicity properties and local
nonsatiation. An arrow from strong monotonicity to monotonicity means that the former implies the
latter; the absence of an arrow in the opposite direction means that the converse is not true.

[Figure 2: Relation between monotonicity properties and local nonsatiation on $\mathbb{R}^L$ or
$\mathbb{R}^L_+$. Diagram nodes: strongly monotonic in coordinate $k$; strongly monotonic;
monotonic; locally nonsatiated.]

A preference relation $\succsim$ is continuous if for every $y \in X$, the set
$\{x \in X : x \succsim y\}$ of alternatives weakly better than $y$ and the set
$\{x \in X : x \precsim y\}$ of alternatives weakly worse than $y$ are closed. The literature
contains some alternative definitions as well:

Proposition 1.2 Let $\succsim$ be a weak order on $X$. The following properties are equivalent:

(a) $\succsim$ is continuous, i.e., for every $y \in X$, the sets $\{x \in X : x \succsim y\}$
    and $\{x \in X : x \precsim y\}$ are closed.
(b) For every $y \in X$, the sets $\{x \in X : x \succ y\}$ and $\{x \in X : x \prec y\}$ are
    open.
(c) The graph $\{(x, y) \in X \times X : x \succsim y\}$ of $\succsim$ is closed.
(d) For all sequences $(x_n)_{n \in \mathbb{N}}$ and $(y_n)_{n \in \mathbb{N}}$ in $X$, if
    $x_n \to x$, $y_n \to y$, and $x_n \succsim y_n$ for all $n \in \mathbb{N}$, then also
    $x \succsim y$.
(e) For all $x, y \in X$, if $x \succ y$, then there is a neighborhood $U_x$ of $x$ (i.e., an
    open set $U_x$ containing $x$) and a neighborhood $U_y$ of $y$ such that $x' \succ y'$ for
    all $x' \in U_x$, $y' \in U_y$.

Proof. Statements (a) and (b) are equivalent, since the complement of an open set is closed,
and vice versa. Also the equivalence of (c) and (d) is a matter of definition:
$(x_n, y_n) \in X \times X$ is an element of the graph of $\succsim$ if and only if
$x_n \succsim y_n$. Proving three implications suffices to close the circle and make sure that all
five statements are equivalent:

[(b) implies (e):] Assume (b) holds. Let $x, y \in X$ with $x \succ y$. Distinguish two cases:

Case 1: There is an $m \in X$ with $x \succ m \succ y$.
Define $U_x = \{z \in X : z \succ m\}$ and $U_y = \{z \in X : m \succ z\}$. These sets are open
by (b). Moreover, $x \in U_x$ and $y \in U_y$ by assumption. Let $x' \in U_x$, $y' \in U_y$. Then
$x' \succ m$ and $m \succ y'$. By Proposition 1.1, $\succ$ is transitive, so $x' \succ y'$, as we
had to show.

Case 2: There is no $m \in X$ with $x \succ m \succ y$.
Define $U_x = \{z \in X : z \succ y\}$ and $U_y = \{z \in X : x \succ z\}$. These sets are open
by (b). Moreover, $x \in U_x$ and $y \in U_y$ by assumption. Let $x' \in U_x$, $y' \in U_y$. Then
$x' \succ y$. It cannot be that $x \succ x'$, otherwise we would have $x \succ x' \succ y$. By
completeness, $x' \succsim x$. Similarly, $y \succsim y'$. So
$x' \succsim x \succ y \succsim y'$. By Proposition 1.1, $x' \succ y'$, as we had to show.

Conclude from cases 1 and 2 that (e) holds.

[(e) implies (c):] Assume (e) holds. To establish (c), we need to show that the complement of
the graph $\{(x, y) \in X \times X : x \succsim y\}$ is open. By completeness of $\succsim$, this
complement is the set

\[ S = \{(x, y) \in X \times X : x \prec y\}. \]

For each $(x, y) \in S$, fix, using (e), neighborhoods $U_x$ of $x$ and $U_y$ of $y$ such that
$x' \prec y'$ for all $x' \in U_x$, $y' \in U_y$. Conclude that

\[ \forall (x, y) \in S : \quad (x, y) \in U_x \times U_y \subseteq S. \]

Taking the union over all $(x, y) \in S$, one obtains

\[ S = \bigcup_{(x, y) \in S} U_x \times U_y. \]

As the union of open sets, $S$ is open, as we had to show.

[(c) implies (a)]: Assume (c) holds. Let $y \in X$. We show that the set
$\{x \in X : x \succsim y\}$ is closed; establishing that also the set
$\{x \in X : x \precsim y\}$ is closed is analogous. [2]
Let $(x_n)_{n \in \mathbb{N}}$ be a sequence in $\{x \in X : x \succsim y\}$ with limit $x$. We
need to show that $x$ also lies in this set. By definition, $(x_n, y)_{n \in \mathbb{N}}$ is a
sequence in the graph of $\succsim$, which is closed by assumption. Therefore, it contains the
limit $(x, y)$, i.e., $x \succsim y$, as we had to show.

[2] The following proof is more general, but requires some knowledge of topology. In case of
emergency, don't worry. Simply forget that this footnote even exists!
[(c) implies (b)]: Assume (c) holds, i.e., the set $S$ defined above is open. Let $y \in X$. We
show that $L(y) = \{x \in X : x \prec y\}$ is open; establishing that also the set
$\{x \in X : x \succ y\}$ is open is analogous.
Let $x \in L(y)$. Then $(x, y) \in S$. Since $S$ is open in the product topology generated by
Cartesian products of open sets in $X$, we can fix neighborhoods $U_x$ of $x$ and $U_y$ of $y$
such that $U_x \times U_y \subseteq S$. In particular, for each $x' \in U_x$, it follows that
$(x', y) \in U_x \times U_y$, so $x' \prec y$. Conclude that

\[ \forall x \in L(y) : \quad x \in U_x \subseteq L(y). \]

Taking the union over all $x \in L(y)$, one obtains

\[ L(y) = \bigcup_{x \in L(y)} U_x. \]

As the union of open sets, $L(y)$ is open, as we had to show.

Very roughly speaking, continuity of preferences requires that the strict preference relation is
unaffected by small changes in the alternatives: if $x$ is better than $y$, the same holds for
nearby alternatives.

A subtlety about open sets: Continuity properties are typically defined in terms of open
subsets of the feasible set $X$. [3] We often consider commodity spaces like
$X = \mathbb{R}^L_+$. Open sets are defined using the usual distance between vectors $x$ and $y$:

\[ \|x - y\| = \sqrt{\sum_{\ell=1}^{L} (x_\ell - y_\ell)^2}. \]

A subset $Y \subseteq X$ is open if each $y \in Y$ is an interior point of $Y$, i.e., if for each
$y \in Y$, all points $x \in X$ sufficiently close to $y$ lie in $Y$ as well:

\[ \text{there is an } \varepsilon > 0 \text{ such that for all } x \in X \text{ with } \|x - y\| < \varepsilon : \quad x \in Y. \tag{1} \]

Many people overlook a slight subtlety, namely the statement "... for all $x \in X$ ..." in (1).
This looks innocuous: if you want to define whether a subset of $X$ is open, then obviously you're
not interested in stuff that is outside of $X$. But it does matter in identifying open subsets!
Notice, for instance, that as subsets of $X = \mathbb{R}^2_+$ (but not as subsets of
$X = \mathbb{R}^2$) sets like

\[ Y^1 = \mathbb{R}^2_+, \qquad Y^2 = \{y \in \mathbb{R}^2_+ : y_1 < 1\}, \qquad Y^3 = \{y \in \mathbb{R}^2_+ : y_1 + 2 y_2 < 4\} \]

are open. You might want to draw their pictures. In topological language, $X = \mathbb{R}^L_+$ is
endowed with the relative topology that it inherits from the larger set $\mathbb{R}^L$: a set
$Y \subseteq X$ is open if and only if $Y = X \cap O$, where $O$ is an open set in the larger
space $\mathbb{R}^L$. This provides quick proofs that the sets $Y^1, Y^2, Y^3$ are open subsets of
$X = \mathbb{R}^2_+$:

\[ Y^1 = X \cap \mathbb{R}^2, \qquad Y^2 = X \cap \{y \in \mathbb{R}^2 : y_1 < 1\}, \qquad Y^3 = X \cap \{y \in \mathbb{R}^2 : y_1 + 2 y_2 < 4\}, \]

and the sets

\[ \mathbb{R}^2, \qquad \{y \in \mathbb{R}^2 : y_1 < 1\}, \qquad \{y \in \mathbb{R}^2 : y_1 + 2 y_2 < 4\} \]

are open in $\mathbb{R}^2$.

The next two properties are related to other changes, namely shifts in or rescaling of the
coordinates. The preference relation $\succsim$ is:

• quasilinear in coordinate $k$ if, for all $x, y \in X$ and all $\alpha > 0$, $x \succsim y$
  implies that $x + \alpha e_k \succsim y + \alpha e_k$: the preference relation is insensitive
  to parallel shifts in the sense that adding the same positive amount of commodity $k$ to both
  alternatives does not affect the preference over them.

• homothetic if rescaling the coordinates does not affect the preferences: for all
  $x, y \in X$ and all $\alpha > 0$, if $x \succsim y$, then $\alpha x \succsim \alpha y$.

[3] Of course, this requires knowing which subsets of $X$ are open. In general, as you will recall
from the math course, this requires $X$ to be a topological space, i.e., it comes equipped with a
definition of open sets, subject to three restrictions: (1) the empty set and $X$ are open,
(2) unions of open sets are open, (3) intersections of finitely many open sets are open.

For instance, any preference relation where only the difference between the first coordinates
matters, like

\[ (x_1, x_2) \succsim (y_1, y_2) \iff 3 x_1 + \exp x_2 \geq 3 y_1 + \exp y_2, \]

is quasilinear in the first coordinate. Often, such a coordinate is referred to as "numeraire"
or "money" and the economic idea is that not the exact amounts of money associated with
two alternatives matter, but the difference between them. A simple example of homothetic
preferences arises in most linear production processes: let alternatives $x$ and $y$ denote
vectors of ingredients and let $x$ be weakly preferable to $y$ if the ingredients of $x$ suffice
to make at least as much of your favorite cake as $y$. Then also $\alpha x$ yields at least as
much cake as $\alpha y$. More generally, any preference relation defined in terms of a homogeneous
function is homothetic. Recall that a function $f : \mathbb{R}^L_+ \to \mathbb{R}$ is homogeneous
of degree $k \in \mathbb{R}$ if for each $x \in \mathbb{R}^L_+$ and each $\alpha > 0$:
$f(\alpha x) = \alpha^k f(x)$. Suppose that $x \succsim y$ if and only if $f(x) \geq f(y)$. Then
$\succsim$ is homothetic:

\[ x \succsim y \iff f(x) \geq f(y) \iff f(\alpha x) = \alpha^k f(x) \geq \alpha^k f(y) = f(\alpha y) \iff \alpha x \succsim \alpha y. \]

Therefore, functions defined by $f(x_1, x_2) = \min\{x_1, x_2\}$ and $f(x_1, x_2) = x_1 x_2^3$
generate homothetic preferences.
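The homotheticity argument can be spot-checked numerically. The sketch below samples bundles with integer coordinates and rational scaling factors, so that all comparisons for the two homogeneous functions are exact; the helper names (`scale`, `scaling_preserves_order`) and the sampling ranges are assumptions made for illustration. It also shows, with one hand-picked pair, that the quasilinear example $3x_1 + \exp x_2$ from the text is not homothetic.

```python
import math
import random
from fractions import Fraction

def f_min(x):
    return min(x)                        # homogeneous of degree 1

def f_prod(x):
    return x[0] * x[1] ** 3              # homogeneous of degree 4

def f_quasi(x):
    return 3 * x[0] + math.exp(x[1])     # quasilinear example, not homogeneous

def scale(alpha, x):
    return tuple(alpha * c for c in x)

def scaling_preserves_order(f, trials=500, seed=0):
    """Sample x, y and alpha > 0; check f(x) >= f(y) implies f(alpha x) >= f(alpha y)."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = (rng.randint(0, 9), rng.randint(0, 9))
        y = (rng.randint(0, 9), rng.randint(0, 9))
        alpha = Fraction(rng.randint(1, 9), rng.randint(1, 9))
        if f(x) >= f(y) and not f(scale(alpha, x)) >= f(scale(alpha, y)):
            return False
    return True

print(scaling_preserves_order(f_min), scaling_preserves_order(f_prod))  # True True

# One hand-picked pair: x is weakly preferred to y, but not after doubling both.
x, y = (1.0, 0.0), (0.0, 1.3)
print(f_quasi(x) >= f_quasi(y))                        # True
print(f_quasi(scale(2, x)) >= f_quasi(scale(2, y)))    # False
```

Random sampling can only fail to find a counterexample; the proof in the text is what establishes homotheticity for every homogeneous function.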

Exercise 1.4 Give an example of a weak order $\succsim$ on $\mathbb{R}^2_+$ that satisfies:

(a) strong monotonicity in coordinate 1, but not quasilinearity in coordinate 1.
(b) quasilinearity in coordinate 1, but not strong monotonicity in coordinate 1.
(c) homotheticity, but none of the three monotonicity properties.
(d) all three monotonicity properties, but not homotheticity.

Exercise 1.5 Consider a weak order $\succsim$ on $X = \mathbb{R}^L_+$ with $x \succ y$ if
$x > y$.

(a) Prove: if $\succsim$ is continuous, then $\succsim$ is monotonic. That is, also
    $x \succsim y$ if $x \geq y$.

(b) Not a drop too much: Your favorite drink requires mixing its two ingredients in the same
    amount: if $x_1, x_2 \geq 0$ indicate the two amounts, you can mix $\min\{x_1, x_2\}$ of your
    drink, whereas $\max\{x_1, x_2\} - \min\{x_1, x_2\}$ goes to waste. If you are primarily
    concerned about the amount of drink, but also feel it is unfortunate to waste ingredients,
    the following weak order $\succsim$ on $\mathbb{R}^2_+$ may reflect your preferences: for all
    $x, y \in \mathbb{R}^2_+$, $x \succsim y$ if and only if

    • $x$ yields more of the drink than $y$: $\min\{x_1, x_2\} > \min\{y_1, y_2\}$, or
    • $x$ gives the same amount of the drink as $y$, but not more waste:
      $\min\{x_1, x_2\} = \min\{y_1, y_2\}$, but
      $\max\{x_1, x_2\} - \min\{x_1, x_2\} \leq \max\{y_1, y_2\} - \min\{y_1, y_2\}$.

    Show that $x \succ y$ whenever $x > y$, but not necessarily $x \succsim y$ if $x \geq y$.

A preference relation $\succsim$ is convex if for each $y \in X$, the set
$\{x \in X : x \succsim y\}$ of weakly better alternatives is convex.

Proposition 1.3 Let $\succsim$ be a weak order on $X$. Then $\succsim$ is convex if and only if
for all $x, y \in X$ with $x \succsim y$ and all $\lambda \in [0, 1]$, also
$\lambda x + (1 - \lambda) y \succsim y$. Informally, if $x$ is at least as good as $y$, just
walking part of the way from $y$ to $x$ is a weak improvement.

Exercise 1.6
(a) Prove this proposition.
(b) Give an example to show that the proposition is false if $\succsim$ is not a weak order.

A somewhat stronger version: a preference relation $\succsim$ is strictly convex if for all
$x, y \in X$ with $x \neq y$ and $x \succsim y$ and all $\lambda \in (0, 1)$, it holds that
$\lambda x + (1 - \lambda) y \succ y$.
This property implies that if you are indifferent between two distinct alternatives
$x, y \in X$, you can still improve upon them: by strict convexity, the alternative
$\frac{1}{2} x + \frac{1}{2} y$ is strictly better.

2. Utility

2.1. Utility functions

In many cases, preferences over alternatives can be evaluated by some numerical assessment: "I
prefer the alternative with the higher percentage of alcohol" or "I prefer the alternative
yielding the higher profit". In that case, we say that these functions (in the latter case the
function assigning to each alternative its associated profit) represent the decision maker's
preferences. Formally, a function $u : X \to \mathbb{R}$ is a utility function representing
$\succsim$ if for all $x, y \in X$:

\[ x \succsim y \iff u(x) \geq u(y). \tag{2} \]

One often uses the following simple result to verify that $u$ represents a complete preference
relation $\succsim$.

Proposition 2.1 Let $\succsim$ be a complete preference relation on a set $X$ and let
$u : X \to \mathbb{R}$ be a function. The following two claims are equivalent:

(a) $u$ represents $\succsim$;
(b) For all $x, y \in X$:
    if $x \succ y$, then $u(x) > u(y)$,
    if $x \sim y$, then $u(x) = u(y)$.

Proof. (a) $\Rightarrow$ (b): Assume (a) holds. Let $x, y \in X$. If $x \succ y$, by definition
of $\succ$: $x \succsim y$ and not $y \succsim x$. Hence, by definition of a utility function,
$u(x) \geq u(y)$ and not $u(y) \geq u(x)$. Conclude that $u(x) > u(y)$. Similarly, if
$x \sim y$, $u(x) = u(y)$.
(b) $\Rightarrow$ (a): Assume (b) holds. Let $x, y \in X$. To show:

\[ x \succsim y \iff u(x) \geq u(y). \]

One direction is easy: if $x \succsim y$, then $x \succ y$ or $x \sim y$, so by (b), either
$u(x) > u(y)$ or $u(x) = u(y)$. Hence $u(x) \geq u(y)$. Conversely, assume that
$u(x) \geq u(y)$. By completeness, $x \succsim y$ or $y \succsim x$. Suppose $x \succsim y$ is
not true. Then $y \succ x$, so by (b), $u(y) > u(x)$, a contradiction.

Exercise 2.1 The completeness condition in Proposition 2.1 cannot be omitted. Indeed, consider the
preference relation $\succsim$ on $\mathbb{R}$ with

\[ \forall x, y \in \mathbb{R} : \quad x \succsim y \iff x \geq y + 1 \]

and the function $u : \mathbb{R} \to \mathbb{R}$ with $u(x) = x$ for all $x \in \mathbb{R}$.
Show that:
(a) $\succsim$ is transitive, but not complete.
(b) $u$ satisfies Proposition 2.1(b), but not Proposition 2.1(a).

If one function represents a preference relation, then many others do as well: if preferences
are represented by a profit function, then also twice the profit or profit to the power three
represent the same preference relation. In general:

Proposition 2.2 If $u : X \to \mathbb{R}$ represents $\succsim$ and
$f : \mathbb{R} \to \mathbb{R}$ is strictly increasing, then also the function
$v : X \to \mathbb{R}$ defined by $v(x) = f(u(x))$ represents $\succsim$.

Proof. By (2) and the definition of strictly increasing, we find for all $x, y \in X$:

\[ x \succsim y \iff u(x) \geq u(y) \iff v(x) = f(u(x)) \geq f(u(y)) = v(y), \]

so $v$ represents $\succsim$.

Since the $\geq$ ordering of the real numbers is complete and transitive, a preference relation
that can be represented by a utility function is necessarily complete and transitive: it must be
a weak order. But is being a weak order enough to guarantee the existence of a utility function?
The answer is positive for finite or countable sets.
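Proposition 2.2 can be illustrated on a small data set: doubling or cubing a profit function leaves the induced ranking unchanged, whereas squaring, which is not strictly increasing on all of $\mathbb{R}$, can change it. The profit figures and the helper `ranking` below are hypothetical and serve only as an illustration.

```python
# Hypothetical profits of four alternatives.
profits = {"A": 1.5, "B": -2.0, "C": 4.0, "D": 1.5}

def ranking(utility):
    """All ordered pairs (x, y) with utility[x] >= utility[y]."""
    return {(x, y) for x in utility for y in utility if utility[x] >= utility[y]}

doubled = {k: 2 * v for k, v in profits.items()}    # f(t) = 2t, strictly increasing
cubed = {k: v ** 3 for k, v in profits.items()}     # f(t) = t^3, strictly increasing
squared = {k: v ** 2 for k, v in profits.items()}   # f(t) = t^2, NOT strictly increasing

print(ranking(doubled) == ranking(profits))   # True
print(ranking(cubed) == ranking(profits))     # True
print(ranking(squared) == ranking(profits))   # False: B now ranks above A
```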

2.2. From preference to utility: finite or countable sets

Representing a weak order on a finite set by means of a utility function is easy: the more preferred an alternative $x \in X$ is, the larger is the set of elements weakly worse than $x$. Therefore, counting how many elements are weakly worse than $x$ measures its utility.

Proposition 2.3 Assume:
• $X$ is finite,
• $\succsim$ is a weak order on $X$.
Then there is a utility function representing $\succsim$.

Proof. For each $x \in X$, define $u(x) = |\{z \in X : x \succsim z\}|$. Then $u : X \to \mathbb{R}$ represents $\succsim$: let $x, y \in X$. If $x \sim y$, then for each $z \in X$ with $y \succsim z$, Proposition 1.1(c) gives that $x \succsim z$. So $\{z \in X : y \succsim z\} \subseteq \{z \in X : x \succsim z\}$. Similarly, the converse inclusion holds, so

$\{z \in X : x \succsim z\} = \{z \in X : y \succsim z\}$.    (3)

Hence $u(x) = u(y)$. If $x \succ y$, Proposition 1.1(d) and the fact that $x$ lies in the former set, but not in the latter, imply:

$\{z \in X : x \succsim z\} \supsetneq \{z \in X : y \succsim z\}$.    (4)

Hence $u(x) > u(y)$.
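The counting construction in the proof of Proposition 2.3 translates directly into code. The following is a minimal sketch under assumed data: the fruit names and the `tier` ranking are hypothetical and serve only to define some weak order on a finite set.

```python
def counting_utility(alternatives, weakly_prefers):
    """u(x) = number of elements z such that x is weakly preferred to z."""
    return {
        x: sum(1 for z in alternatives if weakly_prefers(x, z))
        for x in alternatives
    }


# Hypothetical weak order: fruit in a higher tier is better,
# fruit in the same tier is equivalent.
tier = {"apple": 2, "pear": 2, "fig": 1, "kiwi": 3}
fruits = list(tier)


def weakly_prefers(x, y):
    return tier[x] >= tier[y]


u = counting_utility(fruits, weakly_prefers)
print(u)
```

Equivalent alternatives (apple and pear) receive the same count, and a strictly better alternative receives a strictly larger one, in line with (3) and (4).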

If $X$ is countable, simply counting the number of weakly worse alternatives does not work: there may be infinitely many of them. But we can give each element a positive weight, make sure that the weights have a well-defined sum even if we add infinitely many of them, and use the total weight of the elements weakly worse than $x$ as a measure of the utility of $x$. For instance, label $X = \{x_1, x_2, \ldots\}$ and divide a bar of chocolate by giving half (weight $2^{-1}$) to $x_1$, then half of the remainder (weight $2^{-2}$) to $x_2$, then half of the remainder (weight $2^{-3}$) to $x_3$, and so on.

Proposition 2.4 Assume:
• $X$ is countable;
• $\succsim$ is a weak order on $X$.
Then there is a utility function representing $\succsim$.

Proof. Since $X$ is countable, there is an injective function $n : X \to \mathbb{N}$. For each $x \in X$, define

$u(x) = \sum_{z \in X : x \succsim z} 2^{-n(z)}$.

The sequence $(2^{-n})_{n \in \mathbb{N}}$ has a finite sum $\sum_{n \in \mathbb{N}} 2^{-n} = 1$, so $u$ is well-defined. To see that $u$ represents $\succsim$, let $x, y \in X$. If $x \sim y$, (3) holds, so $u(x) = u(y)$. If $x \succ y$, (4) holds, so $u(x) - u(y) \geq 2^{-n(x)} > 0$.
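The weighting idea of Proposition 2.4 can be illustrated on a finite prefix of an enumeration. The sketch below is an assumption-laden illustration, not the proof itself: it uses the integers, enumerated as 0, 1, -1, 2, -2, ..., with "larger is better" as the weak order, and it truncates the enumeration after seven elements, so the computed values are partial sums of the infinite series. Exact fractions avoid rounding issues.

```python
from fractions import Fraction


def weighted_utility(enumeration, weakly_prefers):
    """u(x) = sum of 2**-n(z) over all listed z that are weakly worse than x.

    `enumeration` is a finite prefix x_1, x_2, ... of a countable set;
    n(z) is the 1-based position of z in that list.
    """
    weight = {z: Fraction(1, 2 ** n) for n, z in enumerate(enumeration, start=1)}
    return {
        x: sum((weight[z] for z in enumeration if weakly_prefers(x, z)), Fraction(0))
        for x in enumeration
    }


# Hypothetical example: a prefix of the enumeration 0, 1, -1, 2, -2, ...
prefix = [0, 1, -1, 2, -2, 3, -3]
u = weighted_utility(prefix, lambda x, y: x >= y)

for x in sorted(prefix):
    print(x, u[x])
```

Because every weight is positive, a strictly better element always collects strictly more total weight, which is the inequality $u(x) - u(y) \geq 2^{-n(x)} > 0$ from the proof.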

2.3. Preference, but no utility

Not all preference relations, not even weak orders, can be represented by means of a utility function. Graduate textbooks usually give exactly one example (lexicographic preferences), as if it concerns an exotic phenomenon. This section gives some counterweight by providing several economically relevant examples, all arising from the following general principle.

Fix a set of alternatives $X$. Suppose you can associate with each number $z$ in some uncountable set $I \subseteq \mathbb{R}$ one bad alternative $b(z) \in X$ and one good alternative $g(z) \in X$ with the following two properties. Firstly, for each $z \in I$, the good alternative is strictly preferred to the bad one:

for all $z \in I$: $g(z) \succ b(z)$.    (5)

Secondly, if $z < z'$, then the good alternative associated with $z$ is worse than the bad alternative associated with $z'$:

for all $z, z' \in I$: $z < z' \Rightarrow b(z') \succ g(z)$.    (6)

Combining (5) and (6), representing such preferences by a utility function requires, for $z < z'$:

$u(b(z)) < u(g(z)) < u(b(z')) < u(g(z'))$.

So for each $z \in I$, the interval $[u(b(z)), u(g(z))]$ has positive length and if $z, z' \in I$ have $z \neq z'$, the intervals $[u(b(z)), u(g(z))]$ and $[u(b(z')), u(g(z'))]$ are disjoint: one of them lies entirely to the left of the other on the real axis. So uncountably many intervals $[u(b(z)), u(g(z))]$ of positive length must somehow be placed on the real line without any two of them intersecting. This is impossible: we simply run out of space! Formally, each interval $[u(b(z)), u(g(z))]$ contains a rational number $r(z) \in \mathbb{Q}$. Since the intervals associated with different values of $z$ are disjoint: $z \neq z'$ implies $r(z) \neq r(z')$, i.e., the function $r : I \to \mathbb{Q}$ is injective. But $I$ is uncountable and $\mathbb{Q}$ is countable, a contradiction. Some examples:

Lexicographic preferences. (Debreu, 1954) Let $X = \mathbb{R}^2$. Define $\succsim$ as follows:

$(x_1, x_2) \succsim (y_1, y_2) \Leftrightarrow x_1 > y_1$ or ($x_1 = y_1$ and $x_2 \geq y_2$).

Alternatives are compared according to their first coordinates; if these happen to be equal, they are compared according to their second coordinates. Think of the way words are ordered in a dictionary.

For each $z \in \mathbb{R}$, let $b(z) = (z, 0)$ and $g(z) = (z, 1)$. Then $g(z) \succ b(z)$ and, if $z, z' \in \mathbb{R}$, $z < z'$, then $g(z) = (z, 1) \prec (z', 0) = b(z')$. So (5) and (6) hold: this preference relation cannot be represented by a utility function.
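Properties (5) and (6) for the lexicographic order can be spot-checked on a few thresholds. The sketch below is illustrative only: the function names and the sample of thresholds are hypothetical, and checking finitely many values of $z$ obviously does not replace the argument over uncountably many.

```python
def lex_weakly_prefers(x, y):
    """(x1, x2) is weakly preferred to (y1, y2) in the lexicographic order."""
    return x[0] > y[0] or (x[0] == y[0] and x[1] >= y[1])


def lex_strictly_prefers(x, y):
    return lex_weakly_prefers(x, y) and not lex_weakly_prefers(y, x)


def b(z):
    """Bad alternative associated with threshold z."""
    return (z, 0)


def g(z):
    """Good alternative associated with threshold z."""
    return (z, 1)


thresholds = [-1.5, 0.0, 0.25, 2.0]

# Property (5): g(z) is strictly better than b(z).
property_5 = all(lex_strictly_prefers(g(z), b(z)) for z in thresholds)

# Property (6): if z < z', then b(z') is strictly better than g(z).
property_6 = all(
    lex_strictly_prefers(b(z2), g(z1))
    for z1 in thresholds
    for z2 in thresholds
    if z1 < z2
)

print(property_5, property_6)
```

Python's built-in tuple comparison happens to be lexicographic as well, so `x >= y` on pairs agrees with `lex_weakly_prefers(x, y)`.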

Preferences over information. (Dubra and Echenique, 2001) It is common in economics to model information by means of partitions of a state space. Let $z \in \mathbb{R}$ be a certain threshold. Suppose you get the following information about a number $x \in \mathbb{R}$: you are told the exact value of $x$ if $x < z$, otherwise you are told that $x$ lies in the interval $[z, \infty)$. That means you can perfectly distinguish between all real numbers $x$ with $x < z$, but cannot distinguish between the numbers in the interval $[z, \infty)$. Therefore, information is summarized by the partition

$b(z) = \{\{x\} : x < z\} \cup \{[z, \infty)\}$

of $\mathbb{R}$. Similarly, define the information partition

$g(z) = \{\{x\} : x \leq z\} \cup \{(z, \infty)\}$

that arises if you are told the exact value of $x$ also in the case where $x = z$: all numbers $x \leq z$ can be perfectly distinguished, but larger ones not. Assume it is preferable to have more precise information, i.e., finer information partitions (partition $\mathcal{P}$ is finer than partition $\mathcal{Q}$ if every set from $\mathcal{P}$ is contained in a set from $\mathcal{Q}$). Partition $g(z)$ is finer than partition $b(z)$, so $g(z) \succ b(z)$. Also if $z < z'$, partition $b(z')$ is finer than partition $g(z)$, so $b(z') \succ g(z)$. So (5) and (6) hold: this preference relation cannot be represented by a utility function.

Preferences over utility flows. At every moment in time $t \in [0, \infty)$, an agent receives payoff zero or one: an alternative $x$ is simply a function $x : [0, \infty) \to \{0, 1\}$. Suppose preferences satisfy the following monotonicity condition: if $x(t) \geq y(t)$ at all times $t$, with strict inequality for at least one time period, then $x \succ y$. Define, for each $z \in [0, \infty)$, the alternative $b(z)$ giving payoff one before time $z$ and payoff zero afterwards:

$b(z)(t) = 1$ if $t < z$, and $b(z)(t) = 0$ otherwise.

Similarly, alternative $g(z)$ gives payoff one at/before time $z$ and payoff zero afterwards:

$g(z)(t) = 1$ if $t \leq z$, and $g(z)(t) = 0$ otherwise.

By the monotonicity requirement, $g(z) \succ b(z)$ and if $z < z'$: $b(z') \succ g(z)$. So (5) and (6) hold: this preference relation cannot be represented by a utility function.

2.4. In no-man's-land: A necessary and sufficient condition for utility representation

We saw above that preference relations where there are uncountably many disjoint intervals between bad and good alternatives cannot be represented by means of a utility function. On the other hand, complete and transitive preferences on a countable set do have a utility representation. Is there something in between these two cases that allows uncountably many alternatives, but still has enough of a countable character that it allows a utility representation?

Let $\succsim$ be a complete, transitive preference relation over a set $X$. The pair $(X, \succsim)$ (or, with a minor abuse of notation, the set $X$) is Jaffray order-separable if there is a countable subset $C \subseteq X$ such that for all $x, y \in X$:

if $x \succ y$, then there exist $c_1, c_2 \in C$ s.t. $x \succsim c_1 \succ c_2 \succsim y$.

The condition roughly says that countably many alternatives suffice to keep all pairs $x, y \in X$ with $x \succ y$ apart: $x$ lies on one side of the no-man's-land between $c_1$ and $c_2$, whereas $y$ lies on the other. This condition is both necessary and sufficient for the existence of a utility representation:

Proposition 2.5 Let $\succsim$ be a weak order on a set $X$. There is a utility function representing $\succsim$ if and only if $X$ is Jaffray order-separable.

Exercise 2.2 This exercise guides you through the steps of the proof. Assume that $u$ represents $\succsim$. Let $U = \{u(x) : x \in X\}$ be the range of $u$. A jump in $U$ is a pair $(u_1, u_2) \in U \times U$ where $u_1 < u_2$ and the open interval $(u_1, u_2)$ contains no elements of $U$: $(u_1, u_2) \cap U = \emptyset$.

(a) Prove that $U$ contains at most countably many jumps. (Suppose not. Use the idea behind (5) and (6) to find a contradiction.)

For each jump $(u_1, u_2)$, fix a point $x(u_1, u_2)$ with utility $u_1$ and a point $y(u_1, u_2)$ with utility $u_2$. Let $J = \bigcup \{x(u_1, u_2), y(u_1, u_2)\}$ be the union (over all jumps $(u_1, u_2)$) of these points. By (a), $J$ is countable. Next, for each pair of rational numbers $r_1, r_2 \in \mathbb{Q}$ with $r_1 < r_2$ and $(r_1, r_2) \cap U \neq \emptyset$, fix an element $x(r_1, r_2) \in X$ with utility in $(r_1, r_2) \cap U$. Let $R$ be the union of all such points $x(r_1, r_2)$. Since there are only countably many pairs $(r_1, r_2)$ as above, $R$ is countable. Let $C = J \cup R$.

(b) Show that $C$ makes $X$ Jaffray order-separable.

Conversely, assume $X$ is Jaffray order-separable via the set $C$. Let $n : C \to \mathbb{N}$ be injective. Define $u : X \to \mathbb{R}$ by

$u(x) = \sum_{c \in C : c \precsim x} 2^{-n(c)}$.

(c) Show that $u$ represents $\succsim$.

For finite or countable sets $X$, simply let $C = X$ to show that $X$ is Jaffray order-separable. For preferences over uncountable sets, additional restrictions are required. We will see in Proposition 2.8, for instance, that on $\mathbb{R}^L_+$, adding continuity to our list of requirements works.

2.5. Continuous utility

Economists usually work with continuous utility functions. Establishing existence of a continuous utility function is troublesome: not even Fishburn (1970a), the standard reference in the field, bothers to give the proof. A well-known continuity result is often wrongly attributed to Debreu (1954). However, his proof is flawed (Debreu, 1964) and a more general continuity result was already known from much older research on order types in the classical theory of sets, due to Georg Cantor. See, for instance, Kamke (1950). The proof of Proposition 2.6 is not obligatory reading; it follows Jaffray (1975).

Proposition 2.6 Assume:
• $\succsim$ is a weak order on $X$;
• $X$ is Jaffray order-separable;
• $X$ is endowed with a topology where, for all $y \in X$, the sets $\{x \in X : x \succ y\}$ and $\{x \in X : x \prec y\}$ are open, i.e., $\succsim$ is continuous.
Then there exists a continuous utility function representing $\succsim$.

Proof. Let $C \subseteq X$ make $X$ Jaffray order-separable. Omitting redundant elements from $C$ if necessary, one may assume that no two distinct elements of $C$ are equivalent: for all $c, c' \in C$ with $c \neq c'$, either $c \succ c'$ or $c' \succ c$.

[Define utility on $C$:] Since $C$ is countable, label $C = \{c_1, c_2, \ldots\}$. Since the set $Q = (0, 1) \cap \mathbb{Q}$ of rationals in $(0, 1)$ is countable, label $Q = \{q_1, q_2, \ldots\}$. Define a utility function $f : C \to Q$ by induction: $f(c_1) := q_1$. Let $n \in \mathbb{N}$, $n \geq 2$, and assume $f$ was defined on $\{c_1, \ldots, c_{n-1}\}$. To extend the utility function to $\{c_1, \ldots, c_n\}$, define $f(c_n)$ to be the first element of $Q$ (see footnote 4: the element $q_\ell \in Q$ with smallest index $\ell$) among those elements $q_\ell$ that give the desired extension:

for all $k \in \{1, \ldots, n-1\}$: $q_\ell > f(c_k) \Leftrightarrow c_n \succ c_k$.    (7)

A useful implication: let $a, b \in C$ with $a \prec b$. If the set of points in $C$ between $a$ and $b$, $(a, b) = \{c \in C : a \prec c \prec b\}$, is nonempty, it has a first element (Why?), say $c_m$. By construction, $c_m$ is the first element in $(a, b)$ to be assigned its value by $f$ and therefore its image $f(c_m)$ is the first element in $(f(a), f(b)) \cap Q$.

[Extend utility to $X$:] For each $x \in X$, define $u(x) = \sup \{f(c) : c \in C, c \precsim x\}$. The set over which the supremum is taken is nonempty (it contains $x$) and bounded from above (by 1), so this supremum exists. Moreover, $u$ represents $\succsim$. Let $x, y \in X$. If $x \sim y$, the supremum is taken over the same set, so $u(x) = u(y)$. If $x \succ y$, there exist, by Jaffray order-separability, elements $a, b \in C$ with $x \succsim a \succ b \succsim y$, so that $u(x) \geq f(a) > f(b) \geq u(y)$.

[Establish continuity of utility:] The usual topology on $\mathbb{R}$ is generated by the intervals $(-\infty, r)$ and $(r, \infty)$, with $r$ rational. Therefore, it suffices to prove that $u^{-1}((-\infty, r))$ and $u^{-1}((r, \infty))$ are open for all $r \in \mathbb{Q}$. Let's do the former; the latter is similar.

Now $u^{-1}((-\infty, r))$ equals (i) $\emptyset$ if $r \leq \inf f(C)$, (ii) $X$ if $r > \sup f(C)$ or if $r = \sup f(C)$ and $r \notin f(C)$, (iii) $\{x \in X : x \prec f^{-1}(r)\}$ if $r \in f(C)$. By assumption, all these sets are open.

The only remaining case is when $r \notin f(C)$ and $\inf f(C) < r < \sup f(C)$. We show that $r$ belongs to a jump of $f(C)$. Recall from Exercise 2.2 that a jump in $f(C)$ is a pair of points $(f_1, f_2) \in f(C) \times f(C)$ with $f_1 < f_2$ and $(f_1, f_2) \cap f(C) = \emptyset$.

Suppose not. Since $\inf f(C) < r < \sup f(C)$, there exist $a, b \in C$ with $f(a) < r < f(b)$. Let $m \in \mathbb{N}$ be the maximum of the indices of $f(a), r, f(b) \in Q$. Then $\{q_1, \ldots, q_m\}$ contains $r$ and elements $p, p' \in f(C)$ with $p < r < p'$. Let $n \in \mathbb{N}$ be the smallest index for which $\{q_1, \ldots, q_n\}$ has this property. Let

$p_1 = \max f(C) \cap \{q_1, \ldots, q_n\} \cap (-\infty, r)$, so $(p_1, r) \cap \{q_1, \ldots, q_n\} = \emptyset$,
$p_2 = \min f(C) \cap \{q_1, \ldots, q_n\} \cap (r, \infty)$, so $(r, p_2) \cap \{q_1, \ldots, q_n\} = \emptyset$.

So $r$ is the first element of $(p_1, p_2)$. Since it contains $r$, the interval $(p_1, p_2)$ cannot be a jump, i.e., it contains elements from $f(C)$. We show that this yields a contradiction.

Since $p_1, p_2 \in f(C)$, there exist $b_1, b_2 \in C$ with $p_1 = f(b_1)$, $p_2 = f(b_2)$. Since $(p_1, p_2) \cap f(C) \neq \emptyset$, there is a $p \in C$ with $p_1 < f(p) < p_2$, i.e., the set $(b_1, b_2)$ of points in $C$ between $b_1$ and $b_2$ is nonempty. Let $b^*$ be its first element. By the implication following (7), its image $f(b^*)$ must be the first element of $(p_1, p_2)$, which was $r$. But $r \notin f(C)$, a contradiction.

This shows that $r$ belongs to a jump $(f_1, f_2)$ of $f(C)$. But then $u^{-1}((-\infty, r)) = \{x \in X : x \prec f^{-1}(f_2)\}$, which is open by assumption.

Footnote 4. Caveat: `first element' is defined in terms of the chosen enumerations of $C$ and $Q$. This allows us to speak, for instance, of the first element in $(0, 1)$, which makes absolutely no sense if one (mistakenly) were to believe it was defined in terms of the usual order on $\mathbb{R}$.

Let us apply this result to show that continuous weak orders on $\mathbb{R}^L_+$ can be represented by a continuous utility function. We first establish an auxiliary result that is of interest in its own right whenever we want to find alternatives in between two others.

Proposition 2.7 Intermediate Value Theorem for preferences: Assume:
• $X = \mathbb{R}^L_+$ for some $L \in \mathbb{N}$;
• $\succsim$ is a continuous weak order on $X$;
• $Y$ is a connected subset of $X$.
The following two results hold:

(a) If $x \in X$ and $y, y' \in Y$ are such that $y \succsim x \succsim y'$, then there is a $y'' \in Y$ with $x \sim y''$.

(b) If $y, y' \in Y$ are such that $y \succ y'$, then there is a $y'' \in Y$ with $y \succ y'' \succ y'$.

Proof. (a): Suppose not: all elements of $Y$ are strictly better/worse than $x$. That is, each element of $Y$ belongs to exactly one of the sets $A = \{z \in X : z \prec x\}$ and $B = \{z \in X : z \succ x\}$. The former contains $y'$, the latter $y$. As $A$ and $B$ are open by continuity, they separate the connected set $Y$, a contradiction.

(b): Suppose not. Then each element of $Y$ belongs to exactly one of the sets $A = \{z \in X : y \succ z\}$ and $B = \{z \in X : z \succ y'\}$. The former contains $y'$, the latter $y$. As $A$ and $B$ are open by continuity, they separate the connected set $Y$, a contradiction.

In typical applications of this proposition, one takes $Y$ to be equal to the entire set $X$, as in Proposition 2.8, or to a suitably chosen convex set like the diagonal $\{x \in \mathbb{R}^L_+ : x_1 = \cdots = x_L\}$ in Proposition 2.9.

Proposition 2.8 Assume:
• $X = \mathbb{R}^L_+$ for some $L \in \mathbb{N}$;
• $\succsim$ is a continuous weak order on $X$.
Then there is a continuous utility function representing $\succsim$.

Proof. The countable set $C = \mathbb{Q}^L_+$ makes $X$ Jaffray order-separable: let $x, y \in X$ with $x \succ y$. By Proposition 2.7, there is a $z \in X$ with $x \succ z \succ y$. By continuity, the set

$\{a \in X : x \succ a \succ z\} = \{a \in X : x \succ a\} \cap \{a \in X : a \succ z\}$

is the intersection of two open sets, hence open itself. It is nonempty by Proposition 2.7. The set $C$ is dense in $X$: every nonempty, open set in $X$ has a nonempty intersection with $C$. Hence, there is a $c_1 \in C$ with $x \succ c_1 \succ z$. Similarly, there is a $c_2 \in C$ with $z \succ c_2 \succ y$. Conclude that $x \succ c_1 \succ c_2 \succ y$, in correspondence with the requirement for Jaffray order-separability. Now all conditions of Proposition 2.6 are satisfied.

Below we present a special case of Proposition 2.8 with a particularly simple proof.

Proposition 2.9 Assume:
• $X = \mathbb{R}^L_+$ for some $L \in \mathbb{N}$;
• $\succsim$ is a continuous, monotonic weak order on $X$.
Then there is a continuous utility function representing $\succsim$.

Proof. Let $e = (1, \ldots, 1) \in \mathbb{R}^L_+$ denote the vector of ones.

Step 1: For each $x \in X$, there is a unique $\alpha_x \geq 0$ with $x \sim \alpha_x e$.
Let $x \in X$. Choose $\alpha \geq \max\{x_1, \ldots, x_L\}$. By monotonicity, $\alpha e \succsim x \succsim 0e$. By Proposition 2.7, the diagonal $\{x \in \mathbb{R}^L_+ : x_1 = \cdots = x_L\}$, being connected, contains an element equivalent to $x$: there is an $\alpha_x \geq 0$ with $x \sim \alpha_x e$. Unicity follows from monotonicity: increasing $\alpha$ gives better alternatives, decreasing worse.

Step 2: Define $u(x) = \alpha_x$. Then $u$ represents $\succsim$.
Let $x, y \in X$. Then $x \succsim y \Leftrightarrow \alpha_x e \succsim \alpha_y e \Leftrightarrow u(x) = \alpha_x \geq \alpha_y = u(y)$.

Step 3: $u$ is continuous.
It suffices to show that the preimage $u^{-1}((\alpha, \beta))$ of every open interval $(\alpha, \beta)$ is open. Now

$u^{-1}((\alpha, \beta)) = \{x \in X : x \succ \alpha e\} \cap \{x \in X : x \prec \beta e\}$

is the intersection of two open sets by continuity, and therefore open.
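The number $\alpha_x$ from Step 1 can be approximated by bisection along the diagonal, using nothing but preference comparisons. The sketch below rests on assumptions that are not in the text: the preferences are generated by the hypothetical function `v(x) = min(x1, 2*x2)`, which is continuous and monotonic, and the names `alpha_utility` and `tol` are illustrative.

```python
def alpha_utility(x, weakly_prefers, tol=1e-9):
    """Approximate the alpha >= 0 with x ~ alpha*e by bisection on the diagonal."""
    L = len(x)
    # By monotonicity, max(x)*e is weakly better than x and x is weakly better than 0*e.
    lo, hi = 0.0, max(x)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if weakly_prefers(tuple([mid] * L), x):
            hi = mid   # mid*e is weakly better than x, so alpha_x <= mid
        else:
            lo = mid   # mid*e is strictly worse than x, so alpha_x > mid
    return (lo + hi) / 2


# Hypothetical continuous, monotonic preferences (an assumption for illustration).
def v(x):
    return min(x[0], 2 * x[1])


def weakly_prefers(x, y):
    return v(x) >= v(y)


print(alpha_utility((3.0, 1.0), weakly_prefers))
print(alpha_utility((1.0, 4.0), weakly_prefers))
```

On the diagonal, `v` equals $\alpha$, so in this example the bisection recovers `v(x)` itself; for other preferences it would produce a different, but ordinally equivalent, utility.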

As a simple application, suppose that preferences are also homothetic. Then, for each $\lambda > 0$, $x \sim \alpha_x e$ implies that $\lambda x \sim \lambda \alpha_x e$, so $u(\lambda x) = \lambda \alpha_x = \lambda u(x)$. This proves:

Corollary 2.10 If, in addition to the assumptions in Proposition 2.9, the preference relation $\succsim$ is homothetic, there is a utility function homogeneous of degree one representing $\succsim$.

The next exercise studies the connection between continuous preferences and continuous utility. The fact that statement (a) in that exercise is true is useful: you will have relatively little trouble recognizing continuous functions, and continuous utility implies continuous preferences!

Exercise 2.3 Consider a weak order $\succsim$ on topological space $X$ represented by utility function $u : X \to \mathbb{R}$. Are the following claims true or false?

(a) If $u$ is continuous, then $\succsim$ is continuous.

(b) If $\succsim$ is continuous, then $u$ is continuous.

2.6. Some special functional forms

Recall that if a preference relation over commodity bundles is quasilinear in some coordinate, this
coordinate is often referred to by economists as `money' or a `numeraire'. Under mild additional
assumptions, such quasilinear preferences can be represented by means of a utility function of
the form `money plus whatever utility I get from the other commodities'.

Proposition 2.11 Assume:
• $X = \mathbb{R}^L_+$ for some $L \in \mathbb{N}$;
• $\succsim$ is a weak order on $X$;
• $\succsim$ is quasilinear and strongly monotonic in the first coordinate;
• Getting something is at least as good as getting nothing: $x \succsim (0, \ldots, 0)$ for every $x \in X$;
• Any difference can be compensated for by money: for all $x, y \in X$: if $x \succsim y$, there is a $v \geq 0$ s.t. $x \sim (y_1 + v, y_2, \ldots, y_L)$.
Then there is a utility function of the form $u(x) = x_1 + v(x_2, \ldots, x_L)$ representing $\succsim$.

Proof. Let $x \in X$. By assumption: $(0, x_2, \ldots, x_L) \succsim (0, \ldots, 0)$. Hence there is a number $v(x_2, \ldots, x_L) \geq 0$ s.t.

$(0, x_2, \ldots, x_L) \sim (v(x_2, \ldots, x_L), 0, \ldots, 0)$.

This number is unique, since $\succsim$ is strongly monotonic in the first coordinate. Adding $x_1 \geq 0$ to the first coordinate, quasilinearity implies that

$(x_1, x_2, \ldots, x_L) \sim (x_1 + v(x_2, \ldots, x_L), 0, \ldots, 0)$.

The utility function $u : X \to \mathbb{R}$ with $u(x) = x_1 + v(x_2, \ldots, x_L)$ represents $\succsim$:

for all $x, y \in X$: $x \succsim y \Leftrightarrow (x_1 + v(x_2, \ldots, x_L), 0, \ldots, 0) \succsim (y_1 + v(y_2, \ldots, y_L), 0, \ldots, 0)$
$\Leftrightarrow x_1 + v(x_2, \ldots, x_L) \geq y_1 + v(y_2, \ldots, y_L)$,

where the second equivalence follows from strong monotonicity of $\succsim$ in the first coordinate.

The proof establishes that each alternative is equivalent with receiving a sufficiently large amount of just the first commodity: utility can be measured in units of commodity 1. This explains the frequent use of quasilinear preferences: only if they are measured on the same scale can one do meaningful comparisons between, say, your utility and mine.

Exercise 2.4 Is the final property

for all $x, y \in X$: if $x \succsim y$, there is a $v \geq 0$ s.t. $x \sim y + v e_1$    (8)

in Proposition 2.11 implied by the others?

Exercise 2.5 Preferences with money (Kaneko, 1976): Let $A$ be a nonempty set. Let $X = A \times \mathbb{R}_+$, where an element $(a, m) \in X$ is interpreted as receiving $a \in A$ and an amount of money $m \in \mathbb{R}_+$. A decision maker has a weak order $\succsim$ on $X$ with the following three properties:
• strict preference can be compensated for by money: for all alternatives $(a, m)$ and $(a', m')$ in $X$: if $(a, m) \succ (a', m')$, there is a number $m'' \geq 0$ such that $(a, m) \sim (a', m'')$.
• $\succsim$ is strongly monotonic in money: for all $a \in A$ and $m, m' \in \mathbb{R}_+$: if $m > m'$, then $(a, m) \succ (a, m')$.
• indifference is insensitive to shifts in money: for all alternatives $(a, m)$ and $(a', m')$ in $X$ and all $c \geq 0$: if $(a, m) \sim (a', m')$, then $(a, m + c) \sim (a', m' + c)$.
We construct a utility function assigning to each $(a, m) \in X$ a utility of the form `money plus utility from $a$'.

(a) Let $a, a' \in A$. Show that there exist amounts of money $m, m' \in \mathbb{R}_+$ such that $(a, m) \sim (a', m')$.

(b) Let $a, a' \in A$ and $m, m', w, w' \in \mathbb{R}_+$ satisfy $(a, m) \sim (a', m')$ and $(a, w) \sim (a', w')$. Show that $m - m' = w - w'$.

Fix an arbitrary element $a^* \in A$. Define the function $v : A \to \mathbb{R}$ by taking, for each $a \in A$:

$v(a) = m^* - m$, where $m^*, m$ are chosen such that $(a^*, m^*) \sim (a, m)$.

Such $m^*, m$ exist by (a) and the function is independent of the particular choices of $m^*, m$ by (b), so this function is well-defined.

(c) Show that the function $u : X \to \mathbb{R}$ with $u(a, m) = v(a) + m$ is a utility function representing $\succsim$.

Also convexity and strict convexity of preferences have implications for the form of the utility function. Recall that a real-valued function $u$ on a convex domain $X$ (Why convex?) is
• quasiconcave if for all $x, y \in X$ and all $\lambda \in (0, 1)$: $u(\lambda x + (1 - \lambda) y) \geq \min\{u(x), u(y)\}$.
• strictly quasiconcave if for all $x, y \in X$ with $x \neq y$ and all $\lambda \in (0, 1)$: $u(\lambda x + (1 - \lambda) y) > \min\{u(x), u(y)\}$.

Proposition 2.12 Assume:
• $X = \mathbb{R}^L_+$ for some $L \in \mathbb{N}$;
• $\succsim$ is a convex weak order on $X$;
• $u : X \to \mathbb{R}$ represents $\succsim$.
Then $u$ is quasiconcave. If $\succsim$ is strictly convex, $u$ is strictly quasiconcave.

Proof. Let $x, y \in X$ and $\lambda \in (0, 1)$. Assume without loss of generality that $x \succsim y$. Then $u(x) \geq u(y)$, so $\min\{u(x), u(y)\} = u(y)$. By convexity of $\succsim$: $\lambda x + (1 - \lambda) y \succsim y$, so

$u(\lambda x + (1 - \lambda) y) \geq u(y) = \min\{u(x), u(y)\}$,

as we had to show. The proof for strict quasiconcavity is analogous.
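The quasiconcavity inequality is easy to test numerically on a finite sample of points. The sketch below is a necessary check only, not a proof: passing on a grid does not establish quasiconcavity, while a single violation refutes it. The grid, the mixing weights, and the two example functions are hypothetical choices for illustration.

```python
from itertools import product


def is_quasiconcave_on(points, u, lambdas=(0.25, 0.5, 0.75), tol=1e-12):
    """Check u(lam*x + (1-lam)*y) >= min(u(x), u(y)) on all sampled pairs."""
    for x, y in product(points, repeat=2):
        for lam in lambdas:
            mix = tuple(lam * xi + (1 - lam) * yi for xi, yi in zip(x, y))
            if u(mix) < min(u(x), u(y)) - tol:
                return False
    return True


# Grid of bundles in the nonnegative quadrant with coordinates 0, 0.5, ..., 2.
grid = [(a / 2, b / 2) for a in range(5) for b in range(5)]


def cobb_douglas(x):
    return (x[0] * x[1]) ** 0.5


def sum_of_squares(x):
    return x[0] ** 2 + x[1] ** 2


print(is_quasiconcave_on(grid, cobb_douglas))
print(is_quasiconcave_on(grid, sum_of_squares))
```

The sum of squares fails because mixing the bundles (2, 0) and (0, 2) yields (1, 1), which it values strictly below both endpoints.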

Exercise 2.6

(a) An equivalent way of defining a quasiconcave function $u$ on a convex domain $X$ is that for all $r \in \mathbb{R}$, the upper contour set $X_u(r) = \{x \in X : u(x) \geq r\}$ is convex. Provide a second proof of Proposition 2.12, using this definition.

(b) As a converse to Proposition 2.12, prove that if $u : X \to \mathbb{R}$ is a (strictly) quasiconcave utility function on a convex set $X$, the corresponding preference relation is (strictly) convex.

(c) Give an example of a convex weak order that can be represented by a utility function, but not by a concave one.

Next, we provide conditions for a weak order to be representable by a linear utility function.
Although we go into more detail, the proof follows Diecidue and Wakker (2002). A convenient
mathematical tool is treated in the following exercise.

Exercise 2.7 Cauchy's functional equation: On two domains, we show that, under mild assumptions, additive functions are linear. Let $f : \mathbb{R} \to \mathbb{R}$ be additive: $f(x + y) = f(x) + f(y)$ for all $x, y \in \mathbb{R}$.

(a) Let $u \in \mathbb{R}$. Show that $f(xu) = x f(u)$ for all rational $x$. Hint: First establish the claim for $x \in \mathbb{N}$, then for $x \in \mathbb{Z}$, then for $x \in \mathbb{Q}$.

Setting $u = 1$ and $c = f(1)$, it follows that $f(x) = cx$ for all rational $x$, i.e., $f$ is linear on the field $\mathbb{Q}$. Approximating real numbers by rational ones and taking limits, it follows that continuous additive functions $f : \mathbb{R} \to \mathbb{R}$ are linear. But much weaker conditions than continuity suffice:

(b) Suppose $f$ is not linear on $\mathbb{R}$. Show that its graph $\{(x, y) \in \mathbb{R}^2 \mid y = f(x)\}$ is dense.

So any assumption that prevents the graph of $f$ being dense implies that $f$ must be linear! Such conditions include continuity in a single point, boundedness/sign restrictions on small intervals, monotonicity, etc.

We now extend the domain to $n$-dimensional real vectors. Let $F : \mathbb{R}^n \to \mathbb{R}$ be additive: $F(x + y) = F(x) + F(y)$ for all $x, y \in \mathbb{R}^n$.

(c) Reduce this to the previously solved case by showing that there exist additive functions $f_i : \mathbb{R} \to \mathbb{R}$ for $i = 1, \ldots, n$ such that, for all $x \in \mathbb{R}^n$, $F(x) = f_1(x_1) + \cdots + f_n(x_n)$.

With this tool in our baggage, we can prove the linear representation result:

Proposition 2.13 Assume:
• $X = \mathbb{R}^L$ for some $L \in \mathbb{N}$;
• $\succsim$ is a weak order on $X$;
• $\succsim$ is strongly monotonic;
• $\succsim$ is additive: for all $x, y, z \in X$, if $x \succsim y$, then $x + z \succsim y + z$;
• For each $x \in X$ there is a constant $\alpha \in \mathbb{R}$ such that $x \sim \alpha (1, \ldots, 1)$.
Then there are $\alpha_1, \ldots, \alpha_L \in \mathbb{R}_{++}$ such that the function $u : X \to \mathbb{R}$ with $u(x) = \alpha_1 x_1 + \cdots + \alpha_L x_L$ represents $\succsim$.

Proof. Let $e = (1, \ldots, 1)$. By assumption, there is, for each $x \in X$, a number $u(x) \in \mathbb{R}$ such that $x \sim u(x) e$. By strong monotonicity, this number is unique. So the function $u : \mathbb{R}^L \to \mathbb{R}$ is well-defined and represents preferences $\succsim$.

Moreover, $u$ is additive. Let $x, y \in X$. Using additivity of $\succsim$ twice (for $\succsim$ and $\precsim$), $x \sim u(x) e$ implies that $x + y \sim u(x) e + y$. Similarly, $y \sim u(y) e$ implies that $u(x) e + y \sim u(x) e + u(y) e = (u(x) + u(y)) e$. By transitivity, $x + y \sim (u(x) + u(y)) e$. Hence $u(x + y) = u(x) + u(y)$.

As $u : \mathbb{R}^L \to \mathbb{R}$ satisfies Cauchy's functional equation, Exercise 2.7 implies that there are additive functions $u_i : \mathbb{R} \to \mathbb{R}$ ($i = 1, \ldots, L$) with $u(x) = \sum_{i=1}^L u_i(x_i)$. By strong monotonicity, each $u_i$ is strictly increasing: its graph cannot be dense. Hence, each $u_i$ is linear: there are $\alpha_1, \ldots, \alpha_L \in \mathbb{R}$ such that $u(x) = \sum_{i=1}^L \alpha_i x_i$. The constants $\alpha_1, \ldots, \alpha_L$ are positive by strong monotonicity.

Most assumptions are familiar. Strong monotonicity assures that all the $\alpha_i$ are positive; with milder monotonicity requirements, one can only assure that some of them are. If you don't like the final assumption, recall from Proposition 2.9 that it can be replaced by continuity. Additivity of preferences is obviously the key assumption. It essentially states that in evaluating two alternatives $x, y \in X$, only their difference $x - y$ matters: preferences are insensitive to translations.

With later applications in mind (see Proposition 2.14), there is no nonnegativity assumption on the vectors over which preferences were defined: $X = \mathbb{R}^L$, not $\mathbb{R}^L_+$. If this makes you nervous, notice that the proof hinges on the linearity of the function satisfying Cauchy's functional equation. Fortunately, linearity can be derived even if additivity holds only on the nonnegative orthant.
The remainder of this section is based on Voorneveld (2008), which contains more general results. Due to its analytical tractability, the Cobb-Douglas utility function $u : \mathbb{R}^L_+ \to \mathbb{R}$ with

$u(x) = x_1^{a_1} \cdots x_L^{a_L} = \prod_{i=1}^L x_i^{a_i}$    ($L \in \mathbb{N}$, $a_1, \ldots, a_L > 0$)

is among the most commonly used in economics; see also Exercise which? . Its name credits Cobb and Douglas (1928), who used it in the context of production theory. What properties of an agent's preferences assure that they can be represented by a Cobb-Douglas utility function? Part of the trick is in exploiting the fact that this function also goes under the name of log-linear utility: taking logarithms, we have that for all $x, y \in \mathbb{R}^L_{++}$:

$x \succsim y \Leftrightarrow \sum_{i=1}^L a_i \ln x_i \geq \sum_{i=1}^L a_i \ln y_i$.

This reduces preferences to a linear utility function in the logarithm of the variables, allowing us to exploit Proposition 2.13. Of course, this trick goes only part of the way, as one cannot take logarithms on the boundary of $\mathbb{R}^L_+$, where some coordinates equal zero.

Proposition 2.14 Assume:
• $X = \mathbb{R}^L_+$ for some $L \in \mathbb{N}$;
• $\succsim$ is a weak order on $X$;
• $\succsim$ is strongly monotonic;
• $\succsim$ is homothetic in each coordinate: for each $i \in \{1, \ldots, L\}$, all $x, y \in X$, and each $t > 0$: if $x \succsim y$, then $(x_1, \ldots, x_{i-1}, t x_i, x_{i+1}, \ldots, x_L) \succsim (y_1, \ldots, y_{i-1}, t y_i, y_{i+1}, \ldots, y_L)$.
• For each $x \in X$ there is a constant $\alpha \in \mathbb{R}_+$ such that $x \sim \alpha (1, \ldots, 1)$.
Then $\succsim$ can be represented by a Cobb-Douglas utility function.

Proof.


We use Proposition 2.13 to show that $\succsim$ can be represented by a Cobb-Douglas utility function on $\mathbb{R}^L_{++}$. The domain is then extended to $\mathbb{R}^L_+$.

Step 1, domain $\mathbb{R}^L_{++}$: Define $f : \mathbb{R}^L \to \mathbb{R}^L_{++}$ for each $x \in \mathbb{R}^L$ by $f(x) = (\exp x_1, \ldots, \exp x_L)$. Notice that $f$ and its inverse $f^{-1} : \mathbb{R}^L_{++} \to \mathbb{R}^L$ with $f^{-1}(y) = (\ln y_1, \ldots, \ln y_L)$ are continuous. Given the weak order $\succsim$ on $\mathbb{R}^L_{++}$, define a weak order $\succsim_f$ on $\mathbb{R}^L$ as follows:

for all $x, y \in \mathbb{R}^L$: $x \succsim_f y \Leftrightarrow f(x) \succsim f(y)$.    (9)

The exponential function is strictly increasing, so by substitution in (9), properties imposed on $\succsim$ carry over in a straightforward way to properties of $\succsim_f$: one easily verifies that it is a weak order satisfying strong monotonicity, and there exists, for each $x \in \mathbb{R}^L$, a scalar $\alpha$ such that $x \sim_f \alpha (1, \ldots, 1)$. Applying coordinatewise homotheticity $L$ times, it follows that

for all $x, y, t \in \mathbb{R}^L_{++}$: $x \succsim y \Rightarrow (t_1 x_1, \ldots, t_L x_L) \succsim (t_1 y_1, \ldots, t_L y_L)$.

Hence, by definition (9), $(\ln x_1, \ldots, \ln x_L) \succsim_f (\ln y_1, \ldots, \ln y_L)$ implies that

$(\ln x_1, \ldots, \ln x_L) + (\ln t_1, \ldots, \ln t_L) \succsim_f (\ln y_1, \ldots, \ln y_L) + (\ln t_1, \ldots, \ln t_L)$.

As $f$ is bijective, it follows that $\succsim_f$ is additive. Conclude that $\succsim_f$ on $\mathbb{R}^L$ satisfies all assumptions of Proposition 2.13: there are $a_1, \ldots, a_L > 0$ such that $\succsim_f$ is represented by the utility function $x \mapsto \sum_{i=1}^L a_i x_i$. By (9), for all $x, y \in \mathbb{R}^L_{++}$:

$x \succsim y \Leftrightarrow (\ln x_1, \ldots, \ln x_L) \succsim_f (\ln y_1, \ldots, \ln y_L) \Leftrightarrow \sum_{i=1}^L a_i \ln x_i \geq \sum_{i=1}^L a_i \ln y_i$.

Taking exponentials, $\succsim$ is represented by utility function $u$ with $u(x) = \prod_{i=1}^L x_i^{a_i}$ on $\mathbb{R}^L_{++}$.

Step 2, domain $\mathbb{R}^L_+$: To see that $u$ represents $\succsim$ on the entire domain $\mathbb{R}^L_+$, we must establish that $x \sim (0, \ldots, 0)$ for each $x \in \mathbb{R}^L_+$ with some, but not all, coordinates equal to zero. Pick such an $x$. As $x + (1/n) e \in \mathbb{R}^L_{++}$ for each $n \in \mathbb{N}$, strong monotonicity implies $(0, \ldots, 0) \prec x + (1/n) e$. Hence, there is an $\alpha_n > 0$ with $x + (1/n) e \sim \alpha_n e$. As at least one coordinate of $x + (1/n) e$ goes to zero:

$0 = \lim_{n \to \infty} u(x + (1/n) e) = \lim_{n \to \infty} u(\alpha_n e) = \lim_{n \to \infty} \alpha_n^{a_1 + \cdots + a_L}$.

As $a_1 + \cdots + a_L > 0$, it follows that $\lim_{n \to \infty} \alpha_n = 0$. By assumption, $x \sim \alpha e$ for some $\alpha \geq 0$. Positive $\alpha$ are ruled out: $x \prec x + (1/n) e \sim \alpha_n e$ for all $n \in \mathbb{N}$ and $\lim_{n \to \infty} \alpha_n = 0$. So $\alpha$ must be zero.

Again, most assumptions are familiar. The homotheticity requirement says that rescaling of specific coordinates does not affect preferences.
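The log-linear trick can be illustrated numerically: on strictly positive bundles, the Cobb-Douglas function and the weighted sum of logarithms produce the same ranking, while on the boundary only the Cobb-Douglas form is defined and assigns utility zero. The sketch below uses hypothetical exponents and bundles chosen purely for illustration.

```python
import math


def cobb_douglas(x, a):
    """u(x) = product of x_i ** a_i."""
    return math.prod(xi ** ai for xi, ai in zip(x, a))


def log_linear(x, a):
    """sum of a_i * ln(x_i); only defined for strictly positive bundles."""
    return sum(ai * math.log(xi) for xi, ai in zip(x, a))


# Hypothetical exponents and bundles.
a = (0.3, 0.7)
bundles = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0), (0.5, 8.0)]

rank_cd = sorted(bundles, key=lambda x: cobb_douglas(x, a))
rank_log = sorted(bundles, key=lambda x: log_linear(x, a))

print(rank_cd == rank_log)
print(cobb_douglas((0.0, 5.0), a))   # boundary bundle: utility zero, like the origin
```

This mirrors Step 2 of the proof: bundles with a zero coordinate must all be equivalent to the origin for the Cobb-Douglas representation to extend to the boundary.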

3. Choice

3.1. Existence of most preferred elements

Hitherto, we discussed how microeconomists usually model what economic agents want. The obvious next step is to consider what they actually do. The rationality paradigm underlying the classical microeconomic theory requires that given (1) a set of mutually exclusive alternatives and (2) a nicely behaved preference relation/utility function over the alternatives, the agent will choose a most preferred alternative. This sounds pretty obvious, but an abundance of economic terminology sometimes blurs the picture: most of traditional microeconomics is plain and simple constrained optimization.

This raises the question: when do most preferred alternatives exist? This is not straightforward: if you have strongly monotonic preferences over apples and face no consumption constraints whatsoever, there is no optimal amount of apples. Here is a very general existence result:

Proposition 3.1 Assume:
• $\succsim$ is a weak order on a set $X$;
• $\succsim$ is upper semicontinuous: for all $x \in X$, the lower contour set $L(x) = \{y \in X \mid y \prec x\}$ is open;
• $Y$ is a nonempty, compact subset of $X$.
Then $Y$ contains a most preferred element:

there is a $y^* \in Y$ with $y^* \succsim y$ for all $y \in Y$.

Proof. Suppose not: for every $y \in Y$ there is a $y' \in Y$ with $y' \succ y$. Then the lower contour sets $\{L(y) : y \in Y\}$ are an open covering of the compact set $Y$. By compactness, there is a finite subcovering, i.e., a finite subset $Y' \subseteq Y$ such that $\{L(y') : y' \in Y'\}$ covers $Y$. Since $Y'$ is finite, it contains a most preferred element $y^*$. But then $L(y^*)$ covers $Y$, i.e., $y^*$ is a best element of $Y$, contradicting our assumption.

Application to consumer model: Let $X = \mathbb{R}^L_+$. Suppose a consumer has a continuous (or upper semicontinuous) weak order $\succsim$ on $X$ reflecting his preferences and an amount of money $w > 0$ in his pocket ($w$ for wealth). Suppose the price vector is $p \in \mathbb{R}^L_{++}$. The budget set $B(p, w)$ at prices $p$ and wealth $w$ consists of all affordable, feasible commodity bundles:

$$B(p, w) = \{x \in \mathbb{R}^L_+ \mid p \cdot x \le w\}.$$

This set is:
• nonempty: it contains the zero vector,
• closed: it is the intersection of finitely many closed halfspaces:⁵
$$B(p, w) = \bigcap_{i=1}^{L} \{x \in \mathbb{R}^L \mid x_i \ge 0\} \cap \{x \in \mathbb{R}^L \mid p \cdot x \le w\}. \tag{10}$$
• bounded: $0 \le x_i \le w/p_i$ for all commodities $i$,
• compact: it is a closed and bounded subset of $\mathbb{R}^L$ and therefore compact by the Heine-Borel theorem,
• convex: by (10), it is the intersection of convex halfspaces.

Since $B(p, w)$ is nonempty and compact and $\succsim$ is assumed to be an upper semicontinuous weak order, the budget set contains at least one most preferred alternative.

⁵ Recall that a halfspace in $\mathbb{R}^L$ is a set of the type $\{x \in \mathbb{R}^L : a \cdot x \le c\}$ or $\{x \in \mathbb{R}^L : a \cdot x \ge c\}$, where $a \in \mathbb{R}^L$, $a \ne 0$, and $c \in \mathbb{R}$.
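The existence argument can be made concrete with a small numerical sketch. The Python code below checks membership of the budget set and searches a finite grid inside it for a best bundle. The utility function $u(x) = x_1 x_2$, the grid resolution, and all function names are illustrative assumptions, not part of the text; a grid search only approximates the maximizer whose existence the result above guarantees.

```python
import itertools

def in_budget_set(x, p, w, tol=1e-12):
    """True if x is nonnegative and affordable: x >= 0 and p.x <= w."""
    spending = sum(pi * xi for pi, xi in zip(p, x))
    return all(xi >= 0 for xi in x) and spending <= w + tol

def best_on_grid(u, p, w, steps=200):
    """Approximate a most preferred bundle by brute force on a grid.

    Coordinate i ranges over [0, w / p_i], the bounds that make B(p, w) bounded.
    """
    axes = [[k * (w / pi) / steps for k in range(steps + 1)] for pi in p]
    feasible = (x for x in itertools.product(*axes) if in_budget_set(x, p, w))
    return max(feasible, key=u)

# Hypothetical example: two goods, prices (1, 2), wealth 4, utility x1 * x2.
p, w = (1.0, 2.0), 4.0
best = best_on_grid(lambda x: x[0] * x[1], p, w)
print(best)
```

With strongly monotonic preferences and no budget constraint, the same search would simply return the largest grid point, however large the grid is made; this is the numerical face of the apples example.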

Exercise 3.1 A decision maker has lexicographic preferences $\succsim$ over $\mathbb{R}^2$:

$$(x_1, x_2) \succsim (y_1, y_2) \iff x_1 > y_1 \text{ or } (x_1 = y_1 \text{ and } x_2 \ge y_2).$$

(a) Is $\succsim$ upper semicontinuous?
(b) Does each nonempty, compact subset $Y \subseteq \mathbb{R}^2$ contain a most preferred element?

3.2. Revealed preference

Rather than going from preferences to choices, this subsection, based on Arrow (1959), tries to move in the opposite direction: can we, under suitable assumptions, explain observed choices by constructing a preference relation that makes such choices rational?

Formally, a choice structure is a tuple $(X, \mathcal{B}, C)$, where
• $X$ is a nonempty set of alternatives.
• $\mathcal{B}$ is a nonempty collection of choice sets. Each element of $\mathcal{B}$ is a nonempty subset $B \subseteq X$, interpreted as a potential problem for a decision-maker: 'Please choose from $B$.'
• $C$ is a choice rule, assigning to each choice set $B \in \mathcal{B}$ a nonempty set $C(B) \subseteq B$, interpreted as those elements from $B$ that the decision maker finds acceptable.
The choice structure $(X, \mathcal{B}, C)$ is rationalizable if there is a weak order $\succsim$ on $X$ such that for each choice set $B \in \mathcal{B}$, the associated choices $C(B)$ are the most preferred ones under $\succsim$:

$$\forall B \in \mathcal{B}: \quad C(B) = \{x \in B \mid x \succsim y \text{ for all } y \in B\}. \tag{11}$$

Consider two properties one might expect from revealed preferences:

Weak axiom of revealed preference (WARP) The choice structure $(X, \mathcal{B}, C)$ satisfies WARP if

$$\forall A, B \in \mathcal{B}, \ \forall x, y \in A \cap B: \quad \text{if } x \in C(A),\ y \in C(B), \text{ then } x \in C(B).$$

The idea behind WARP is this: in both choice problems $A$ and $B$, alternatives $x$ and $y$ are available. If $x \in C(A)$, this reveals $x$ to be at least as good as $y$; otherwise $x$ wouldn't be acceptable. Similarly, if $y \in C(B)$, then $y$ must be at least as good as $x$. But then $x$ and $y$ ought to be equivalent and you should find $x$ acceptable also in $B$.

Independence of irrelevant alternatives (IIA) The choice structure $(X, \mathcal{B}, C)$ satisfies IIA if

$$\forall A, B \in \mathcal{B}: \quad \text{if } A \subseteq B \text{ and } C(B) \cap A \ne \emptyset, \text{ then } C(A) = C(B) \cap A.$$

Intuitively, suppose that some items on menu $B$ are not feasible after all and choice is restricted to $A$. If $A$ still contains some acceptable elements from $B$, then choice should remain unaffected: an element is acceptable in the smaller set $A$ if and only if it was acceptable in the larger set $B$.
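For a finite choice structure, both properties can be checked mechanically. The following Python sketch represents a choice rule as a dictionary from choice sets to chosen sets; the representation, the function names, and the two example choice rules are assumptions made for illustration only.

```python
from itertools import combinations

def nonempty_subsets(X):
    """All nonempty subsets of the finite set X, as frozensets."""
    items = sorted(X)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def satisfies_warp(choice):
    """choice maps each choice set B (a frozenset) to C(B) (a frozenset)."""
    for A, CA in choice.items():
        for B, CB in choice.items():
            common = A & B
            # If some y in A∩B is chosen from B, then every x in A∩B that is
            # chosen from A must be chosen from B as well.
            if (common & CB) and not ((common & CA) <= CB):
                return False
    return True

def satisfies_iia(choice):
    """If A ⊆ B and C(B) meets A, then C(A) must equal C(B) ∩ A."""
    for A, CA in choice.items():
        for B, CB in choice.items():
            if A <= B and (CB & A) and CA != (CB & A):
                return False
    return True

# Hypothetical example 1: always choose the largest number.
subsets = nonempty_subsets({1, 2, 3})
rational = {B: frozenset({max(B)}) for B in subsets}

# Hypothetical example 2: the same rule, except that 2 is chosen from {1, 2, 3}.
odd = dict(rational)
odd[frozenset({1, 2, 3})] = frozenset({2})

print(satisfies_warp(rational), satisfies_iia(rational))
print(satisfies_warp(odd), satisfies_iia(odd))
```

In the second example, 3 is chosen over 2 from the menu {2, 3} but 2 is chosen when 1 is added to the menu, which is the kind of reversal both properties rule out.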

Proposition 3.2 Consider a choice structure $(X, \mathcal{B}, C)$.
(a) If it satisfies WARP, then it satisfies IIA.
(b) If it satisfies IIA and all choice sets with at most three elements are contained in $\mathcal{B}$, then $(X, \mathcal{B}, C)$ is rationalizable.

Proof. (a): Assume WARP holds. Let $A, B$ be as in the definition of IIA. Let $a \in C(A)$ and $b \in C(B) \cap A$. To show: $a \in C(B) \cap A$, $b \in C(A)$. Since $C(A) \subseteq A \subseteq B$, we have $a, b \in A \cap B$, $a \in C(A)$, $b \in C(B)$. By WARP, $a \in C(B)$, $b \in C(A)$.
(b): For all $x, y \in X$, the set $\{x, y\}$ lies in $\mathcal{B}$ by the assumption on $\mathcal{B}$. Hence, we may define $x \succsim y$ if $x \in C(\{x, y\})$. We need to check three things:
[$\succsim$ is complete:] Let $x, y \in X$. By nonemptiness, either $x \in C(\{x, y\})$ or $y \in C(\{x, y\})$, i.e., $x \succsim y$ or $y \succsim x$.
[$\succsim$ is transitive:] Let $x, y, z \in X$ and assume that $x \succsim y$ and $y \succsim z$. By definition of $\succsim$: $x \in C(\{x, y\})$ and $y \in C(\{y, z\})$. To show: $x \succsim z$, i.e., $x \in C(\{x, z\})$.
If $x = y$ or $y = z$, this follows immediately. If $x = z$, then $x \succsim z$ is the same as $x \succsim x$, which follows from completeness. So let $x, y, z$ be distinct and consider the set $\{x, y, z\} \in \mathcal{B}$. It suffices to show that $x \in C(\{x, y, z\})$, because then $x \in C(\{x, z\})$ by IIA.
Suppose, to the contrary, that $x \notin C(\{x, y, z\})$. By nonemptiness of $C$, $C(\{x, y, z\}) \cap \{y, z\} \ne \emptyset$. By IIA and $y \succsim z$: $y \in C(\{y, z\}) = C(\{x, y, z\}) \cap \{y, z\}$. So $C(\{x, y, z\}) \cap \{x, y\} \ne \emptyset$. By IIA and $x \succsim y$: $x \in C(\{x, y\}) = C(\{x, y, z\}) \cap \{x, y\}$, contradicting the assumption that $x \notin C(\{x, y, z\})$.
[$\succsim$ rationalizes $(X, \mathcal{B}, C)$:] To show that (11) holds, let $B \in \mathcal{B}$.
Firstly, let $z \in C(B)$. To show: $z \succsim y$ for all $y \in B$. So let $y \in B$. Then $\{y, z\} \subseteq B$, $\{y, z\} \in \mathcal{B}$, and $z \in C(B) \cap \{y, z\} \ne \emptyset$. By IIA, $z \in C(\{y, z\})$. So $z \succsim y$.
Secondly, let $z \in B$ satisfy $z \succsim y$ for all $y \in B$. To show: $z \in C(B)$. By nonemptiness, there is a $y \in C(B)$. Then $\{y, z\} \subseteq B$, $\{y, z\} \in \mathcal{B}$, and $y \in C(B) \cap \{y, z\} \ne \emptyset$. By $z \succsim y$ and IIA: $z \in C(\{y, z\}) = C(B) \cap \{y, z\}$, so $z \in C(B)$.

Exercise 3.3 investigates the other relations between rationalizability, WARP, and IIA.
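The construction in the proof of part (b) is easy to replay on a finite example: read off the relation from the choices on one- and two-element sets, then test condition (11) on every choice set. The Python sketch below does this; the alternatives, the value function used to generate the choices, and the function names are hypothetical and serve only as an illustration.

```python
from itertools import combinations

def revealed_preference(X, choice):
    """Pairs (x, y) meaning x is weakly preferred to y: x is in C({x, y})."""
    return {(x, y) for x in X for y in X
            if x in choice[frozenset({x, y})]}

def rationalizes(relation, choice):
    """Condition (11): C(B) = {x in B : x weakly preferred to every y in B}."""
    for B, CB in choice.items():
        best = frozenset(x for x in B if all((x, y) in relation for y in B))
        if best != CB:
            return False
    return True

# Hypothetical example: alternatives 2 and 3 are tied, both beat 1.
X = {1, 2, 3}
value = {1: 0, 2: 1, 3: 1}
subsets = [frozenset(c) for r in (1, 2, 3) for c in combinations(sorted(X), r)]
choice = {B: frozenset(x for x in B if value[x] == max(value[y] for y in B))
          for B in subsets}

relation = revealed_preference(X, choice)
print(rationalizes(relation, choice))
```

Note that the relation is built from the small choice sets only, yet it is tested against all of them, which mirrors the role of the assumption on choice sets with at most three elements.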

3.3. Exercises

Exercise 3.2 Weierstrass' Maximum Theorem: Use Proposition 3.1 to prove that a continuous function $f : X \to \mathbb{R}$ on a nonempty, compact set $X$ achieves a maximum and a minimum.

Exercise 3.3
(a) Show that if $(X, \mathcal{B}, C)$ is rationalizable, it satisfies WARP.
(b) Does IIA imply WARP?
(c) Can the restriction on $\mathcal{B}$ in Proposition 3.2 be omitted?
(d) Does WARP imply rationalizability?

Exercise 3.4 Let $X = \{1, 2, \dots, n\}$ for some $n \in \mathbb{N}$, $n \ge 3$, and let $\mathcal{B}$ consist of all nonempty subsets of $X$. For each of the following choice rules $C$, prove whether the choice structure $(X, \mathcal{B}, C)$ satisfies WARP and/or IIA. If possible, construct a weak order $\succsim$ rationalizing it.

(a) Satisficing (Simon, 1955): A function $v : X \to \mathbb{R}$ assigns to each alternative $x \in X$ a value $v(x) \in \mathbb{R}$. Those with a value at/above a given threshold $r \in \mathbb{R}$ are deemed 'satisfactory'. For each $B \in \mathcal{B}$, the choice $C(B)$ is defined as follows: go through the elements of $B$ in increasing order and choose the first satisfactory one. If no such element exists, choose the final (i.e., largest) element of $B$.

(b) Madly in love: Assume your partner has a weak order $\succsim$ on $X$ in which no two distinct elements are equivalent. For each choice set $B \in \mathcal{B}$ with two/more elements, you politely abstain from choosing your partner's favorite: $C(B) = \{x \in B \mid \exists y \in B : y \succ x\}$.

Exercise 3.5 A taste for precious metals: A consumer faces two luxury goods, the first is gold, the second platinum, and spends the entire wealth on the good with the highest price. If prices are equal, half of the wealth is spent on each good. To investigate the rationality of such behavior, consider a choice structure $(X, \mathcal{B}, C)$, where $X = \mathbb{R}^2_+$, the commodity space, and $\mathcal{B}$ consists of two choice sets: $B_1 = B((2, 1), 2)$, the budget set at prices $p = (2, 1)$ and wealth $w = 2$, and $B_2 = B((1, 2), 2)$.

(a) Draw the choice sets $B_1$ and $B_2$ in the same figure. Given the assumptions above, find $C(B_1)$ and $C(B_2)$ and also draw these in your figure.
(b) Does the choice structure $(X, \mathcal{B}, C)$ satisfy IIA?
(c) Does the choice structure $(X, \mathcal{B}, C)$ satisfy WARP?
(d) Is the choice structure $(X, \mathcal{B}, C)$ rationalizable?

Economic models of luxury goods often allow price-dependent preferences.

(e) Give an example of a utility function depending both on the commodity bundle $x$ and the price vector $p$, denoted $u(x, p)$, that makes the consumer's behavior utility maximizing for every $(p, w) \in \mathbb{R}^3_{++}$.

4. Choices of a consumer: classical demand theory

4.1. The preference/utility maximization problem

Section 3.1 set the stage for the classical model of consumer behavior. This model consists of a specification of: (i) what the consumer wants: a preference relation or utility function; (ii) what the consumer finds feasible: a budget set indicating the commodity bundles that he can choose from; (iii) what the consumer, putting these two together, finds the most preferable commodity bundles. Formally:

• there are $L \in \mathbb{N}$ commodities that can be consumed in nonnegative quantities, so the commodity space is $X = \mathbb{R}^L_+$;
• a price vector $p \in \mathbb{R}^L_{++}$ assigns to each commodity $i \in \{1, \dots, L\}$ a price $p_i > 0$;
• the consumer has a given income/'wealth' $w > 0$, i.e., an amount of money to spend on buying a commodity bundle;
• the consumer has a preference relation $\succsim$ on $X$ or even a utility function $u : X \to \mathbb{R}$ representing these preferences.

Typically, no additional restrictions are imposed on consumption, so the budget set

$$B(p, w) = \{x \in \mathbb{R}^L_+ : p \cdot x \le w\}$$

specifies the commodity bundles the consumer can afford. At this stage, it would be a good idea to look back at Section 3.1 to recapitulate some properties of this budget set. The consumer solves the following preference maximization problem ($\succsim$-MP):

$\succsim$-MP: Find the set of most preferable commodity bundles according to $\succsim$ in the budget set $B(p, w)$.

Given utility function $u$, this yields the utility maximization problem (UMP):

UMP: Solve $\max u(x)$ s.t. $x \in B(p, w)$.

It is common economic practice to assign special names to the set of solutions and, in case a utility function is given, the corresponding optimal value of such optimization problems. The (Walrasian) demand correspondence assigns to each price vector $p \in \mathbb{R}^L_{++}$ and wealth $w > 0$ the associated set $x(p, w)$ of optimal commodity bundles:

$$x(p, w) = \{x \in B(p, w) : x \succsim y \text{ for all } y \in B(p, w)\} = \{x \in B(p, w) : u(x) = \max_{y \in B(p, w)} u(y)\}.$$

Given a utility function $u$, the indirect utility function $v : \mathbb{R}^{L+1}_{++} \to \mathbb{R}$ assigns to each price vector $p \in \mathbb{R}^L_{++}$ and wealth $w > 0$ the maximal utility the consumer can achieve. To compute it is easy:

$$v(p, w) = u(x^*), \quad \text{where } x^* \in x(p, w),$$

is the utility of an arbitrary vector in the demand at $(p, w)$. This is independent of the particular choice of $x^* \in x(p, w)$: since all such vectors are utility maximizers, their utility is the same.

Remark 4.1 If the utility function $u$ is a $C^1$-function (its partial derivatives exist and are continuous on an open set containing $X$), the UMP

$$\max u(x) \quad \text{s.t.} \quad p \cdot x \le w, \ x_1 \ge 0, \ \dots, \ x_L \ge 0,$$

is usually solved using the associated Kuhn-Tucker conditions.

Remark 4.2 If the Walrasian demand correspondence is single-valued, i.e., if $x(p, w)$ consists of a single element for each $(p, w) \in \mathbb{R}^{L+1}_{++}$, it is common to treat demand as a function, rather than a correspondence.

Let us conclude this subsection with an example involving a well-known type of utility function.

Leontiev utility: Baking your favorite cake requires fixed proportions of its $L \ge 2$ ingredients: one unit of cake takes a vector $(a_1, \dots, a_L) \in \mathbb{R}^L_{++}$ of ingredients. Given ingredient vector $x \in \mathbb{R}^L_+$, how much cake can you produce? Well, looking at the $i$-th ingredient, your guess will be at most $x_i/a_i$ units. What constrains you are those ingredients where this fraction is the smallest. Therefore, a suitable utility function would be

$$u(x) = \min\{x_1/a_1, \dots, x_L/a_L\}, \tag{12}$$

specifying how many units of cake you can make from $x$. This utility function is not differentiable, so the Kuhn-Tucker conditions are not applicable.

Exercise 4.1

Check that the associated preference relation is continuous, monotonic (but not strongly),

convex (but not strictly), and homothetic.

Let prices and wealth be $(p, w) \in \mathbb{R}^{L+1}_{++}$. Since preferences are continuous and the budget set $B(p, w)$ nonempty and compact, there is at least one solution to the UMP (see Section 3.1): $x(p, w) \ne \emptyset$. Let's compute it. Firstly, if $x^*$ solves the UMP, it must be that

$$x^*_1/a_1 = \dots = x^*_L/a_L. \tag{13}$$

Why? Well, suppose this were not true:

$$\min\{x^*_1/a_1, \dots, x^*_L/a_L\} < \max\{x^*_1/a_1, \dots, x^*_L/a_L\}.$$

Then you're using the ingredients in the wrong proportions: you can only make $u(x^*) = \min\{x^*_1/a_1, \dots, x^*_L/a_L\}$ units of cake, but there are commodities $i$ where you have enough for $x^*_i/a_i = \max\{x^*_1/a_1, \dots, x^*_L/a_L\}$ units, an utter waste. If you were to trade a small amount of these wasted ingredients for the non-wasted ones, you would still be in your budget set, but able to make more cake. Hurray!
Secondly, preferences are monotonic, so you will use your entire budget on ingredients: $p \cdot x^* = w$. Combining this with (13) gives us that there is a unique solution to the UMP at $(p, w)$, namely

$$x^* = \left( \frac{a_1 w}{\sum_{i=1}^L a_i p_i}, \dots, \frac{a_L w}{\sum_{i=1}^L a_i p_i} \right).$$

By Remark 4.2, it is common to write this result down as a demand function:

$$\forall (p, w) \in \mathbb{R}^{L+1}_{++}: \quad x(p, w) = \left( \frac{a_1 w}{\sum_{i=1}^L a_i p_i}, \dots, \frac{a_L w}{\sum_{i=1}^L a_i p_i} \right)$$

instead of a single-valued demand correspondence:

$$\forall (p, w) \in \mathbb{R}^{L+1}_{++}: \quad x(p, w) = \left\{ \left( \frac{a_1 w}{\sum_{i=1}^L a_i p_i}, \dots, \frac{a_L w}{\sum_{i=1}^L a_i p_i} \right) \right\}.$$
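As a quick sanity check on the closed form, the following Python sketch evaluates the Leontiev demand at one set of numbers and confirms the two conditions used in its derivation: equal ratios $x_i/a_i$ and full spending of the budget. The recipe, prices, wealth, and function names are hypothetical choices for illustration.

```python
def leontief_utility(x, a):
    """u(x) = min_i x_i / a_i, as in (12)."""
    return min(xi / ai for xi, ai in zip(x, a))

def leontief_demand(p, w, a):
    """Walrasian demand with coordinates a_i * w / sum_j a_j * p_j."""
    unit_cost = sum(ai * pi for ai, pi in zip(a, p))  # cost of one cake
    return tuple(ai * w / unit_cost for ai in a)

# Hypothetical numbers: recipe a, prices p, wealth w.
a, p, w = (2.0, 1.0), (1.0, 3.0), 10.0
x = leontief_demand(p, w, a)
spending = sum(pi * xi for pi, xi in zip(p, x))
ratios = [xi / ai for xi, ai in zip(x, a)]
print(x, spending, ratios, leontief_utility(x, a))

# Another bundle that also exhausts the budget, but in the wrong proportions.
wasteful = (1.0, 3.0)
print(leontief_utility(wasteful, a))
```

The denominator is simply the cost of the ingredients for one unit of cake, so demand amounts to buying as many complete recipes as wealth allows.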

Substituting the demand vector in the utility function, we find the indirect utility function:

$$\forall (p, w) \in \mathbb{R}^{L+1}_{++}: \quad v(p, w) = u\left( \frac{a_1 w}{\sum_{i=1}^L a_i p_i}, \dots, \frac{a_L w}{\sum_{i=1}^L a_i p_i} \right) = \frac{w}{\sum_{i=1}^L a_i p_i}.$$

Exercise 4.2 Our definition of the budget set is standard, but other realistic restrictions can be modeled just as easily. In the commodity space $X = \mathbb{R}^2_+$, let the price vector be $p = (8, 4)$. The consumer has wealth $w = 40$ and an upper semicontinuous weak order $\succsim$ on $X$. In each of the following cases separately, specify the budget set given the additional information. Does the new budget set necessarily contain at least one most preferred bundle?
(a) Indivisibilities: The commodities cannot be cut into ever smaller pieces. Only integer quantities are feasible.
(b) Rationing: The consumer is not allowed to buy more than three units of the first commodity.
(c) Rebates 1: If the consumer buys more than five units of the second commodity, these additional units in excess of the first five have a lower price, namely two.
(d) Rebates 2: If the consumer buys more than five units of the second commodity, the price of this commodity (also the first five units) is decreased to two.
(e) Initial endowment: Instead of having wealth $w$, suppose the consumer has an initial endowment $\omega = (1, 1)$ of one unit of both commodities. He can sell (parts of) his initial endowment to generate income to purchase other commodity bundles.
(f) Package deal: The consumer has to buy the same quantity of both commodities.
(g) Gift certificate: The consumer has received a gift certificate of one monetary unit, which he can spend in its entirety on commodity one.

4.2. Properties of the demand correspondence and indirect utility

Section 1 listed a lot of properties that can be imposed on the consumer's preferences. The next result indicates the consequences of such restrictions on the demand correspondence.

Proposition 4.3 Let $X = \mathbb{R}^L_+$ for some $L \in \mathbb{N}$ and let $\succsim$ be a weak order on $X$. The Walrasian demand correspondence has the following properties:
(a) If $\succsim$ is upper semicontinuous, then $x(p, w)$ is nonempty for all $(p, w) \in \mathbb{R}^{L+1}_{++}$.
(b) If $\succsim$ is continuous, the Walrasian demand correspondence has a closed graph: for each sequence $(p^n, w^n, x^n)_{n \in \mathbb{N}}$ in $\mathbb{R}^{L+1}_{++} \times X$ with limit $(p, w, x) \in \mathbb{R}^{L+1}_{++} \times X$: if $x^n \in x(p^n, w^n)$ for all $n \in \mathbb{N}$, then also $x \in x(p, w)$.
(c) Homogeneity of degree zero: $\forall (p, w) \in \mathbb{R}^{L+1}_{++}, \forall \lambda > 0 : x(\lambda p, \lambda w) = x(p, w)$.
(d) If $\succsim$ is convex, or equivalently, if $u$ is quasiconcave, then $x(p, w)$ is a convex set for all $(p, w) \in \mathbb{R}^{L+1}_{++}$.
(e) If $\succsim$ is strictly convex, or equivalently, if $u$ is strictly quasiconcave, then $x(p, w)$ contains at most one element for all $(p, w) \in \mathbb{R}^{L+1}_{++}$.
(f) Walras' law: All money is spent: If $(p, w) \in \mathbb{R}^{L+1}_{++}$ and $\succsim$ is locally nonsatiated, then $p \cdot x = w$ for all $x \in x(p, w)$.

Proof. (a): See Section 3.1.
(b): Let $(p^n, w^n, x^n)_{n \in \mathbb{N}}$ be a sequence in $\mathbb{R}^{L+1}_{++} \times X$ with limit $(p, w, x) \in \mathbb{R}^{L+1}_{++} \times X$. Assume that $x^n \in x(p^n, w^n)$ for all $n \in \mathbb{N}$. To show: $x \in x(p, w)$.
Firstly, $x^n \in X$ and $p^n \cdot x^n \le w^n$, so taking limits: $p \cdot x \le w$. Conclude that $x \in B(p, w)$.
Suppose that $x \notin x(p, w)$: there is a $y \in B(p, w)$ with $y \succ x$. By continuity of $\succsim$ and Proposition 1.2, there are neighborhoods $U_x$ of $x$ and $U_y$ of $y$ such that $y' \succ x'$ for all $(x', y') \in U_x \times U_y$.
Choose $y' \in U_y$ with $p \cdot y' < w$. This is possible: $y \in B(p, w)$ implies that $p \cdot y \le w$. In case of strict inequality, take $y' = y$. In case of equality, small decreases in the positive coordinates of $y$ will give the desired $y'$.
As $(p^n, w^n) \to (p, w)$, it follows that $p^n \cdot y' \le w^n$ for $n$ sufficiently large, so $y' \in B(p^n, w^n) \cap U_y$. As $x^n \to x$, $x^n \in U_x$ for $n$ sufficiently large. Hence, for large $n$, $x^n \in U_x$ and $y' \in B(p^n, w^n) \cap U_y$. But then $y' \succ x^n$, contradicting that $x^n$ was optimal at prices $p^n$ and wealth $w^n$.
(c): Since $B(\lambda p, \lambda w) = \{x \in \mathbb{R}^L_+ : (\lambda p) \cdot x \le \lambda w\} = \{x \in \mathbb{R}^L_+ : p \cdot x \le w\} = B(p, w)$, the $\succsim$-MP has the same domain before and after rescaling and therefore the same set of solutions.
(d): Assume $\succsim$ is convex. If $x(p, w) = \emptyset$, it is convex. If $x(p, w) \ne \emptyset$, let $x^* \in x(p, w)$. Then $x(p, w) = B(p, w) \cap \{x \in X : x \succsim x^*\}$ is the intersection of two convex sets and therefore convex.
(e): Assume $\succsim$ is strictly convex. Suppose there are $x, y \in x(p, w)$, $x \ne y$. Then $\frac{1}{2} x + \frac{1}{2} y$ lies in $B(p, w)$ by convexity of $B(p, w)$. By strict convexity of $\succsim$, this bundle is strictly better than $x$ and $y$, contradicting that these were most preferred bundles in $B(p, w)$.
(f): Assume $\succsim$ is locally nonsatiated. Let $x \in x(p, w)$. Then $p \cdot x \le w$, since $x \in B(p, w)$. Suppose $p \cdot x < w$. For $\varepsilon > 0$ sufficiently small, the entire neighborhood $\{y \in X : \|x - y\| < \varepsilon\}$ is contained in the budget set. By local nonsatiation, this neighborhood contains a point $y$ with $y \succ x$, contradicting that $x$ is a most preferred bundle in the budget set.

An important consequence of the closed-graph property is that if Walrasian demand is single-valued, the Walrasian demand function is continuous!
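Homogeneity of degree zero, convexity of the demand set, and Walras' law can be observed in a toy computation. The Python sketch below is only an illustration under strong assumptions: the utility function $u(x) = x_1 + x_2$ is a hypothetical example, and bundles are restricted to a small integer grid rather than all of $\mathbb{R}^2_+$, so the computed set is the intersection of demand with that grid.

```python
from itertools import product

def demand_on_grid(u, p, w, grid):
    """Utility maximizers among the grid points that lie in the budget set."""
    feasible = [x for x in grid
                if sum(pi * xi for pi, xi in zip(p, x)) <= w]
    top = max(u(x) for x in feasible)
    return {x for x in feasible if u(x) == top}

# Hypothetical example: u(x) = x1 + x2, equal prices, integer bundles only.
grid = list(product(range(6), repeat=2))
u = lambda x: x[0] + x[1]
p, w = (1, 1), 4

demand = demand_on_grid(u, p, w, grid)
# Rescale prices and wealth by the same factor 2.5.
scaled = demand_on_grid(u, tuple(2.5 * pi for pi in p), 2.5 * w, grid)

print(sorted(demand))
print(demand == scaled)
print(all(sum(pi * xi for pi, xi in zip(p, x)) == w for x in demand))
```

Here demand is multi-valued: every grid point on the budget line is optimal, which is why the general theory works with correspondences rather than functions.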

Exercise 4.3 If $\succsim$ is homothetic, then... what can you conclude about Walrasian demand?

To formulate properties of indirect utility, we will need to assume (Surprise!) that preferences
are represented by means of a utility function and that the demand correspondence is non-empty
valued: otherwise, indirect utility is undened.

Proposition 4.4 Assume:
• $X = \mathbb{R}^L_+$ for some $L \in \mathbb{N}$;
• The consumer's preference relation $\succsim$ is represented by utility function $u : X \to \mathbb{R}$;
• Walrasian demand is nonempty-valued: $\forall (p, w) \in \mathbb{R}^{L+1}_{++}$, $x(p, w) \ne \emptyset$.
Then the indirect utility function has the following properties:
(a) Homogeneity of degree zero: $\forall (p, w) \in \mathbb{R}^{L+1}_{++}, \forall \lambda > 0$: $v(\lambda p, \lambda w) = v(p, w)$.
(b) For each commodity $i$, $v$ is nonincreasing in the price of $i$ (higher prices cannot make you better off).
(c) $v$ is nondecreasing in wealth; if $\succsim$ is locally nonsatiated, $v$ is even strictly increasing in wealth.
(d) $v$ is quasiconvex: $\forall r \in \mathbb{R} : \{(p, w) \in \mathbb{R}^{L+1}_{++} : v(p, w) \le r\}$ is a convex set.
(e) If $\succsim$ is represented by a continuous utility function $u$, then $v$ is continuous.

Proof. (a): Follows from Proposition 4.3(c).
(b): Let $(p, w) \in \mathbb{R}^{L+1}_{++}$ and let $i \in \{1, \dots, L\}$ be a commodity. Let $p'$ be obtained from $p$ by a strict increase in the price of commodity $i$. Then $B(p', w) \subseteq B(p, w)$, so

$$v(p', w) = \max_{y \in B(p', w)} u(y) \le \max_{y \in B(p, w)} u(y) = v(p, w),$$

since the second maximum is taken over a larger set.
(c): The nondecreasing part is similar to (b), so we will only do the strictly-increasing part. Assume $\succsim$ is locally nonsatiated. Let $p \in \mathbb{R}^L_{++}$ and $0 < w < w'$. To show: $v(p, w) < v(p, w')$.
Let $x \in x(p, w)$. Then $x \in B(p, w)$, so $p \cdot x \le w < w'$. Since $p \cdot x < w'$, for $\varepsilon > 0$ sufficiently small, the entire neighborhood $\{y \in X : \|x - y\| < \varepsilon\}$ is contained in the budget set $B(p, w')$. By local nonsatiation, this neighborhood contains a point $y$ with $y \succ x$. Conclude that

$$v(p, w) = u(x) < u(y) \le \max_{z \in B(p, w')} u(z) = v(p, w').$$

(d): Let $r \in \mathbb{R}$. If $\{(p, w) \in \mathbb{R}^{L+1}_{++} : v(p, w) \le r\} = \emptyset$, it is convex. If it is nonempty, let $(p, w), (p', w')$ lie in this set and let $\lambda \in [0, 1]$. Write $(p'', w'') = \lambda (p, w) + (1 - \lambda)(p', w')$. To show: $v(p'', w'') \le r$, i.e., $u(x) \le r$ for all $x \in B(p'', w'')$.
Let $x \in B(p'', w'')$. Then $x \in \mathbb{R}^L_+$ and $\lambda (p \cdot x) + (1 - \lambda)(p' \cdot x) \le \lambda w + (1 - \lambda) w'$. Therefore, $p \cdot x \le w$ or $p' \cdot x \le w'$ (or both). W.l.o.g., $p \cdot x \le w$. Then $x \in B(p, w)$, so $u(x) \le v(p, w) \le r$.
(e): Follows from Proposition 4.3(b).
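Quasiconvexity is the least intuitive of these properties, so a numerical spot check may help. The Python sketch below uses the Leontiev indirect utility $v(p, w) = w / \sum_i a_i p_i$ derived earlier and tests, on randomly drawn pairs of price-wealth situations, that $v$ at a convex combination never exceeds the larger of the two endpoint values. The recipe, sampling ranges, seed, and tolerance are arbitrary assumptions; a random test can expose a violation but does not prove the property.

```python
import random

def indirect_utility(p, w, a):
    """Leontiev indirect utility v(p, w) = w / sum_i a_i * p_i."""
    return w / sum(ai * pi for ai, pi in zip(a, p))

random.seed(0)
a = (2.0, 1.0, 3.0)  # hypothetical recipe
violations = 0
for _ in range(1000):
    p1 = [random.uniform(0.1, 5.0) for _ in a]
    p2 = [random.uniform(0.1, 5.0) for _ in a]
    w1, w2 = random.uniform(0.1, 50.0), random.uniform(0.1, 50.0)
    lam = random.random()
    # Convex combination of the two price-wealth situations.
    p_mix = [lam * x + (1 - lam) * y for x, y in zip(p1, p2)]
    w_mix = lam * w1 + (1 - lam) * w2
    bound = max(indirect_utility(p1, w1, a), indirect_utility(p2, w2, a))
    if indirect_utility(p_mix, w_mix, a) > bound + 1e-9:
        violations += 1
print(violations)
```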

Exercise 4.4
(a) Proposition 4.4(c) might suggest that also (b) can be strengthened a bit: 'If $\succsim$ is locally nonsatiated, indirect utility is strictly decreasing in the price of commodity $i$.' But this is wrong. Why?
(b) Write out the proof of Proposition 4.4(e) in detail.
(c) Why not just write 'If $\succsim$ is continuous, $v$ is continuous'?

4.3. The expenditure minimization problem

Consider a consumer with utility function $u : \mathbb{R}^L_+ \to \mathbb{R}$, prices $p \in \mathbb{R}^L_{++}$, and a utility level $u \in \mathbb{R}$. What is the minimal amount the consumer has to pay, i.e., the minimal level of wealth needed to reach utility level $u$? The answer is given by the expenditure minimization problem (EMP):

$$\min p \cdot x \quad \text{s.t.} \quad x \in \mathbb{R}^L_+, \ u(x) \ge u.$$

The Hicksian or compensated demand correspondence assigns to each price vector $p \in \mathbb{R}^L_{++}$ and each utility level $u$ the associated set $h(p, u)$ of solutions to the EMP:

$$h(p, u) = \{x \in \mathbb{R}^L_+ : u(x) \ge u \text{ and } p \cdot x \le p \cdot y \text{ for all } y \in \mathbb{R}^L_+ \text{ with } u(y) \ge u\}.$$

The Hicksian demand correspondence specifies the set of consumption bundles solving the EMP, the expenditure function $e(p, u)$ indicates its value:

$$e(p, u) = \min_{x \in \mathbb{R}^L_+,\, u(x) \ge u} p \cdot x = p \cdot x^* \quad \text{for all } x^* \in h(p, u).$$

Similar to our earlier approach to Walrasian demand and indirect utility, one can derive properties of Hicksian demand and the expenditure function. To make the proposition at all sensible, one needs to restrict attention to utility levels that are actually reachable; therefore, let $U = \{u(x) : x \in \mathbb{R}^L_+\}$ be the range of the utility function $u$.

Proposition 4.5 Let $X = \mathbb{R}^L_+$ for some $L \in \mathbb{N}$ and let $u : X \to \mathbb{R}$ represent a consumer's weak order $\succsim$. The Hicksian demand correspondence has the following properties:
(a) If $\succsim$ is upper semicontinuous, then $h(p, u)$ is nonempty for all $(p, u) \in \mathbb{R}^L_{++} \times U$.
(b) Homogeneity of degree zero in prices: $\forall (p, u) \in \mathbb{R}^L_{++} \times U, \forall \lambda > 0 : h(\lambda p, u) = h(p, u)$.
(c) If $\succsim$ is convex, or equivalently, if utility is quasiconcave, then $h(p, u)$ is a convex set for all $(p, u) \in \mathbb{R}^L_{++} \times U$.
(d) If utility is continuous and $\succsim$ is strictly convex, or equivalently, if utility is strictly quasiconcave, then $h(p, u)$ contains at most one element for all $(p, u) \in \mathbb{R}^L_{++} \times U$.
(e) No excess utility: If utility is continuous, then $u(x) = u$ for all $(p, u) \in \mathbb{R}^L_{++} \times U$ with $u \ge u(0, \dots, 0)$⁶ and all $x \in h(p, u)$.
(f) Compensated law of demand: let $p', p'' \in \mathbb{R}^L_{++}$ and $u \in U$. If $x' \in h(p', u)$ and $x'' \in h(p'', u)$, then $(p' - p'') \cdot (x' - x'') \le 0$.

⁶ Why this restriction? Well, suppose that $u < u(0, \dots, 0)$. Since $p \cdot x \ge 0$ for all $x \in \mathbb{R}^L_+$, it follows that $h(p, u) = \{(0, \dots, 0)\}$: expenditure is not minimal at utility $u$, because the zero vector, with higher utility, is the cheapest option. Under suitable monotonicity restrictions, however, this will turn out to be an exotic case: the zero vector will often give you the lowest utility in $\mathbb{R}^L_+$, so that this footnote becomes irrelevant.

Proof. (a): Let $(p, u) \in \mathbb{R}^L_{++} \times U$. By feasibility, $u(y) = u$ for some $y \in X$. By upper semicontinuity of preferences, the set $\{x \in X : u(x) \ge u\} = \{x \in X : x \succsim y\}$ is closed. Therefore, the solution of the EMP lies in the nonempty set $\{x \in \mathbb{R}^L_+ : u(x) \ge u\} \cap \{x \in \mathbb{R}^L_+ : p \cdot x \le p \cdot y\}$, which is the intersection of a closed and a compact set and therefore compact. The goal function $x \mapsto p \cdot x$ is continuous. A continuous function on a nonempty, compact set achieves a minimum; see Section 3.1.
(b): Minimizing $x \mapsto (\lambda p) \cdot x$ gives the same solutions as minimizing $x \mapsto p \cdot x$.
(c): Let $(p, u) \in \mathbb{R}^L_{++} \times U$. If $h(p, u) = \emptyset$, it is convex. If $h(p, u) \ne \emptyset$, let $y \in h(p, u)$. By definition,

$$h(p, u) = \{x \in \mathbb{R}^L_+ : u(x) \ge u\} \cap \{x \in \mathbb{R}^L_+ : p \cdot x \le p \cdot y\}$$

is the intersection of convex sets, hence convex.
(d): The result is true if $u \le u(0, \dots, 0)$, as $h(p, u) = \{(0, \dots, 0)\}$ in those cases. So let $u > u(0, \dots, 0)$. Suppose $h(p, u)$ contains two distinct alternatives, $x, x'$. By strict convexity, $(x + x')/2$ is strictly better, yet causes the same expenses. As $(x + x')/2 \succ x \succ (0, \dots, 0)$, $(x + x')/2 \ne (0, \dots, 0)$: some of its coordinates are positive. By continuity, slight decreases in these coordinates still yield alternatives at least as good as $x$, i.e., they remain feasible in the EMP at $(p, u)$, but cheaper than $x$, a contradiction.
(e): Assume the utility function is continuous. Let $(p, u) \in \mathbb{R}^L_{++} \times U$ with $u \ge u(0, \dots, 0)$ and $x \in h(p, u)$.
If $u = u(0, \dots, 0)$, then $h(p, u) = \{(0, \dots, 0)\}$, so the result is true: $u(x) = u(0, \dots, 0) = u$.
Next, let $u > u(0, \dots, 0)$. Suppose $u(x) > u$. Then $x \ne (0, \dots, 0)$, so that at least some coordinates of $x$ exceed zero. By continuity, $u(y) > u$ for all $y$ in a neighborhood of $x$. By continuity, $\lim_{\lambda \to 1} u(\lambda x) = u(x) > u$, so $u(\lambda x) > u$ for $\lambda \in (0, 1)$ close to one. But $p \cdot (\lambda x) = \lambda (p \cdot x) < p \cdot x$, contradicting that $x \in h(p, u)$.
(f): Since $x'$ is optimal and $x''$ feasible in the EMP at $(p', u)$, it follows that $p' \cdot x' \le p' \cdot x''$. Similarly, $p'' \cdot x'' \le p'' \cdot x'$. Adding these inequalities and rewriting gives the compensated law of demand.

If $h$ is single-valued, we will treat it as a function, rather than a correspondence, just as we did for Walrasian demand (see Remark 4.2). The compensated law of demand implies that if you raise the price of one of the goods, then the Hicksian demand for this good will not increase.
The next proposition states some properties of the expenditure function. Given the similarity with earlier results, proofs are left as an exercise.

Proposition 4.6 Assume:
• $X = \mathbb{R}^L_+$ for some $L \in \mathbb{N}$;
• The consumer's preference relation $\succsim$ is represented by utility function $u : X \to \mathbb{R}$;
• Hicksian demand is nonempty-valued: $\forall (p, u) \in \mathbb{R}^L_{++} \times U : h(p, u) \ne \emptyset$.
Then the expenditure function $e : \mathbb{R}^L_{++} \times U \to \mathbb{R}$ has the following properties:
(a) Homogeneity of degree one in prices: $\forall (p, u) \in \mathbb{R}^L_{++} \times U, \forall \lambda > 0 : e(\lambda p, u) = \lambda e(p, u)$.
(b) Monotonicity in $u$: If utility is continuous, then for all $p \in \mathbb{R}^L_{++}$ and all $u', u'' \in U$ with $u(0, \dots, 0) \le u' < u''$: $e(p, u') < e(p, u'')$.
(c) For each commodity $i$, expenditure is nondecreasing in the price of $i$.
(d) For all $u \in U$, $e(\cdot, u)$ is concave in $p$.

Exercise 4.5 Prove this proposition.

Remark 4.7 Establishing continuity properties for Hicksian demand and expenditure is less straightforward than for Walrasian demand and indirect utility. Concave functions are continuous, so Proposition 4.6(d) implies that expenditure is continuous in prices. The utility function $u : \mathbb{R}_+ \to \mathbb{R}$ with $u(x) = \max\{0, x - 1\}$ shows that expenditure is not necessarily continuous in utility levels. Letting $p > 0$ be the price of the only commodity, one finds

$$e(p, u) = \begin{cases} 0 & \text{if } u = 0, \\ p(u + 1) & \text{if } u > 0. \end{cases}$$

Since $p > 0$, $e(p, \cdot)$ has a discontinuity at $u = 0$. However, if the utility function is both continuous and locally nonsatiated, continuity of the expenditure function $e : \mathbb{R}^L_{++} \times U \to \mathbb{R}$ can be established using a result known as Berge's Maximum Theorem. Contrary to what most textbooks (which do not provide the proof) suggest, the proof is not straightforward. To establish continuity at an arbitrary $(p', u') \in \mathbb{R}^L_{++} \times U$, local nonsatiation is used to establish existence of a $y \in \mathbb{R}^L_+$ with $u(y) > u'$. Next, on a neighborhood of $(p', u')$, the EMP reduces to minimizing $p \cdot x$ subject to $x \in \{z \in \mathbb{R}^L_+ : u(z) \ge u, \ p \cdot z \le p \cdot y\}$. This final condition assures that the conditions of the Maximum Theorem are satisfied.
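The jump in the one-commodity example can be seen numerically. The Python sketch below solves the EMP for $u(x) = \max\{0, x - 1\}$ by brute force over a grid of quantities and compares the result with the closed form stated in the remark. The price, the grid bounds, and the function names are assumptions chosen for illustration; the grid answer can exceed the exact value by up to one grid step times the price.

```python
def utility(x):
    """u(x) = max{0, x - 1} for a single commodity x >= 0."""
    return max(0.0, x - 1.0)

def expenditure_closed_form(p, u_level):
    """e(p, u) as stated in the remark, for utility levels u >= 0."""
    return 0.0 if u_level == 0 else p * (u_level + 1.0)

def expenditure_on_grid(p, u_level, x_max=10.0, steps=10_000):
    """Brute-force the EMP over the grid {0, x_max/steps, ..., x_max}."""
    grid = (k * x_max / steps for k in range(steps + 1))
    return min(p * x for x in grid if utility(x) >= u_level)

p = 2.0  # hypothetical price
for u_level in (0.0, 0.0005, 0.25, 1.0):
    print(u_level,
          expenditure_on_grid(p, u_level),
          expenditure_closed_form(p, u_level))
```

Reaching utility zero is free, but any positive utility level, however small, costs at least $p$: the consumer must first buy the whole unit that yields no utility.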


Let us proceed with the example on Leontiev utility functions.

Leontiev utility (Continued): The Leontiev utility function in (12) has range $U = \mathbb{R}_+$. In order not to waste resources, a solution $x^*$ to the EMP at $(p, u) \in \mathbb{R}^L_{++} \times U$ must satisfy (13). Moreover, by continuity, it satisfies $u(x^*) = u$. Combining these two conditions gives us that there is a unique solution to the EMP at $(p, u)$, namely $x^* = (a_1 u, \dots, a_L u)$. Since the solution is unique, it is common to write the result as a Hicksian demand function, rather than a correspondence:

$$h(p, u) = (a_1 u, \dots, a_L u) \quad \text{and} \quad e(p, u) = p \cdot (a_1 u, \dots, a_L u) = u \sum_{i=1}^L a_i p_i.$$

The following result gives a relation between $h(p, u)$ and $e(p, u)$ in a particularly simple case.

Proposition 4.8 Assume the utility function $u : \mathbb{R}^L_+ \to \mathbb{R}$ is continuous and represents locally nonsatiated, strictly convex preferences. Then for all $p \in \mathbb{R}^L_{++}$ and all $u > u(0, \dots, 0)$, Hicksian demand for each good $\ell = 1, \dots, L$ can be found by differentiating the expenditure function with respect to the price $p_\ell$:

$$\forall \ell = 1, \dots, L : \quad h_\ell(p, u) = \frac{\partial e(p, u)}{\partial p_\ell}. \tag{14}$$

⁷ Requires some knowledge of topology. Can be omitted.
Proof. We will not prove that the expenditure function is differentiable.⁸ The remainder of the proof proceeds as follows. By strict convexity of preferences, Hicksian demand is single-valued, so we treat $h(\cdot)$ as a function. Fix $p \in \mathbb{R}^L_{++}$ and $u \in U$ and let $x = h(p, u)$ denote Hicksian demand at prices $p$ and utility level $u$. For every price vector $p' \in \mathbb{R}^L_{++}$,

$$e(p', u) = \min_{x' \in \mathbb{R}^L_+,\, u(x') \ge u} p' \cdot x' \le p' \cdot x,$$

with equality if $p' = p$. Hence, the function $f : \mathbb{R}^L_{++} \to \mathbb{R}$ with $f(p') = e(p', u) - p' \cdot x$ is maximized at $p' = p$. By the first order conditions, its partial derivatives at $p$ must be zero:

$$\forall \ell = 1, \dots, L : \quad \frac{\partial f(p)}{\partial p_\ell} = \frac{\partial e(p, u)}{\partial p_\ell} - x_\ell = \frac{\partial e(p, u)}{\partial p_\ell} - h_\ell(p, u) = 0,$$

proving the result.
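Relation (14) can be checked numerically on the Leontiev closed forms $e(p, u) = u \sum_i a_i p_i$ and $h(p, u) = (a_1 u, \dots, a_L u)$. The Python sketch below approximates the price derivatives of the expenditure function by central finite differences and compares them with Hicksian demand. Two caveats: Leontiev preferences are convex but not strictly convex, so they serve here only as a convenient test case with single-valued demand, and the numbers, step size, and function names are assumptions for illustration.

```python
def expenditure(p, u_level, a):
    """Leontiev expenditure function e(p, u) = u * sum_i a_i * p_i."""
    return u_level * sum(ai * pi for ai, pi in zip(a, p))

def hicksian_demand(p, u_level, a):
    """Leontiev Hicksian demand h(p, u) = (a_1 u, ..., a_L u)."""
    return tuple(ai * u_level for ai in a)

def price_gradient(p, u_level, a, step=1e-6):
    """Central finite differences of e(., u) with respect to each price."""
    grad = []
    for l in range(len(p)):
        up = list(p)
        up[l] += step
        down = list(p)
        down[l] -= step
        grad.append((expenditure(up, u_level, a)
                     - expenditure(down, u_level, a)) / (2 * step))
    return tuple(grad)

a, p, u_level = (2.0, 1.0, 3.0), (1.0, 3.0, 0.5), 4.0  # hypothetical numbers
gradient = price_gradient(p, u_level, a)
demand = hicksian_demand(p, u_level, a)
print(gradient)
print(demand)
```

Up to rounding, the two printed vectors coincide: raising the price of good $\ell$ by a small amount raises minimal expenditure by roughly that amount times the quantity of good $\ell$ already being bought.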

Exercise 4.6 Roy's identity: Similarly, one can prove: Assume the utility function $u : \mathbb{R}^L_+ \to \mathbb{R}$ is continuous and represents locally nonsatiated, strictly convex preferences. Assume that the indirect utility function $v(\cdot)$ is differentiable at a point $(p, w)$ with $p \in \mathbb{R}^L_{++}$ and $w > 0$. Then the Walrasian demand for each good $\ell = 1, \dots, L$ can be found as follows:

$$\forall \ell = 1, \dots, L : \quad x_\ell(p, w) = - \frac{\partial v(p, w) / \partial p_\ell}{\partial v(p, w) / \partial w}.$$

Do this by showing that the function $f : \mathbb{R}^L_{++} \to \mathbb{R}$ with $f(p') = v(p', p' \cdot x)$, where $x = x(p, w)$, achieves its minimum at $p' = p$.

4.4. Relations between UMP and EMP

Proposition 4.9 Assume the utility function $u : \mathbb{R}^L_+ \to \mathbb{R}$ is continuous and represents locally nonsatiated preferences. Fix a price vector $p \in \mathbb{R}^L_{++}$. Then:
(a) If $x^*$ is optimal in the UMP with wealth $w > 0$, then $x^*$ is optimal in the EMP with utility level $u = u(x^*)$. Moreover, the expenditure level in this EMP is exactly $p \cdot x^* = w$:

$$x^* \in x(p, w) \ \Rightarrow \ x^* \in h(p, u(x^*)) \text{ and } e(p, u(x^*)) = w.$$

(b) If $x^*$ is optimal in the EMP with utility level $u \in U$, $u > u(0, \dots, 0)$, then $x^*$ is optimal in the UMP with wealth $w = p \cdot x^*$. Moreover, the indirect utility level in this UMP is exactly $u$:

$$x^* \in h(p, u) \ \Rightarrow \ x^* \in x(p, p \cdot x^*) \text{ and } v(p, p \cdot x^*) = u.$$

Proof. (a): Let $x^* \in x(p, w)$. By Walras' law, $p \cdot x^* = w$. Bundle $x^*$ is feasible in the EMP with prices $p$ and utility level $u(x^*)$. Let $x \in h(p, u(x^*))$. By definition,

$$e(p, u(x^*)) = p \cdot x \le p \cdot x^* = w \quad \text{and} \quad u(x) \ge u(x^*).$$

The first inequality means that $x \in B(p, w)$. But then its utility cannot exceed that of the utility maximizing bundle $x^* \in x(p, w)$. So $u(x) = u(x^*)$ and by Walras' law:

$$e(p, u(x^*)) = p \cdot x = p \cdot x^* = w.$$

Conclude that $x^* \in h(p, u(x^*))$ and $e(p, u(x^*)) = w$.
(b): Let $x^* \in h(p, u)$. By Proposition 4.5(e), $u(x^*) = u$. Bundle $x^*$ is feasible in the UMP at prices $p$ and wealth $p \cdot x^*$. Let $x \in x(p, p \cdot x^*)$. By definition,

$$v(p, p \cdot x^*) = u(x) \ge u(x^*) = u \quad \text{and} \quad p \cdot x \le p \cdot x^*.$$

The first claim shows that $x$ is feasible in the EMP at $(p, u)$. But then the inequality in the second claim cannot be strict: $p \cdot x = p \cdot x^*$. By Proposition 4.5(e),

$$v(p, p \cdot x^*) = u(x) = u(x^*) = u.$$

Conclude that $x^* \in x(p, p \cdot x^*)$ and $v(p, p \cdot x^*) = u$.

⁸ It follows from a duality result in convex analysis: for fixed $u$, $e(\cdot, u)$ is the support function of the strictly convex set $\{x \in X : u(x) \ge u\}$.

Under the assumptions above, we obtain important relations between the UMP and EMP:

$$e(p, v(p, w)) = w \tag{15}$$
$$v(p, e(p, u)) = u \tag{16}$$
$$x(p, w) = h(p, v(p, w)) \tag{17}$$
$$h(p, u) = x(p, e(p, u)) \tag{18}$$

Proof. (15): Let $x^* \in x(p, w)$. By definition, $v(p, w) = u(x^*)$. By Proposition 4.9(a), $e(p, v(p, w)) = e(p, u(x^*)) = w$.
(17): We first show that $x(p, w) \subseteq h(p, v(p, w))$. Let $x \in x(p, w)$. Then $u(x) = v(p, w)$, so $x \in h(p, u(x)) = h(p, v(p, w))$ by Proposition 4.9(a). Secondly, we show that $h(p, v(p, w)) \subseteq x(p, w)$. Let $x \in h(p, v(p, w))$. By Proposition 4.9(b), $x \in x(p, p \cdot x)$. Moreover, $x \in h(p, v(p, w))$ and (15) imply that $p \cdot x = e(p, v(p, w)) = w$. Conclude that $x \in x(p, p \cdot x) = x(p, w)$.
(16), (18): Similar.

These results give convenient ways to find solutions to the UMP from those of the EMP and vice versa. Let us illustrate this in our Leontiev example.

Leontiev utility (Continued):

v(p, w) = PL

i=1 ai pi

By (16), expenditure solves

Recall that

and

x(p, w) =

u = v(p, e(p, u)) =

a1 w
PL

aL w

i=1 ai pi

e(p,u)
PL
, so
i=1 ai pi

, . . . , PL

i=1 ai pi

e(p, u) = u

!
.

PL

i=1 ai pi , exactly (Good

news, isn't it!) as we saw before. Hicksian demand can now be found in dierent ways. Firstly,
using Proposition 4.8:

` = 1, . . . , L :
and, secondly, using (18):

h(p, u)

h` (p, u) =

e(p, u)
= a` u,
p`

solves

h(p, u) = x(p, e(p, u)) =

a1 e(p, u)
aL e(p, u)
, . . . , PL
PL
i=1 ai pi
i=1 ai pi
35

!
= (a1 u, . . . , aL u).
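The identities (15) to (18) can also be checked numerically for the Leontiev closed forms recalled above. The following is only a minimal sketch: the function names and the numbers chosen for `a`, `p`, `w` and `u` are illustrative assumptions, not part of the notes.

```python
# Closed forms of the Leontiev example, as recalled in the text:
#   v(p, w) = w / sum_i a_i p_i        x(p, w) = (a_1 v, ..., a_L v)
#   e(p, u) = u * sum_i a_i p_i        h(p, u) = (a_1 u, ..., a_L u)
# All names and numbers below are illustrative assumptions.

def indirect_utility(p, w, a):
    return w / sum(ai * pi for ai, pi in zip(a, p))

def walrasian_demand(p, w, a):
    v = indirect_utility(p, w, a)
    return [ai * v for ai in a]

def expenditure(p, u, a):
    return u * sum(ai * pi for ai, pi in zip(a, p))

def hicksian_demand(p, u, a):
    return [ai * u for ai in a]

a = [1.0, 2.0, 0.5]   # hypothetical Leontiev coefficients
p = [3.0, 1.0, 4.0]   # hypothetical price vector
w = 21.0              # hypothetical wealth
u = 2.5               # hypothetical utility level

# (15): e(p, v(p, w)) = w
gap_15 = abs(expenditure(p, indirect_utility(p, w, a), a) - w)
# (16): v(p, e(p, u)) = u
gap_16 = abs(indirect_utility(p, expenditure(p, u, a), a) - u)
# (17): x(p, w) = h(p, v(p, w))
gap_17 = max(abs(xi - hi) for xi, hi in zip(
    walrasian_demand(p, w, a),
    hicksian_demand(p, indirect_utility(p, w, a), a)))
# (18): h(p, u) = x(p, e(p, u))
gap_18 = max(abs(hi - xi) for hi, xi in zip(
    hicksian_demand(p, u, a),
    walrasian_demand(p, expenditure(p, u, a), a)))
```

All four gaps should vanish up to floating-point rounding; a nonzero gap would point to a mistake in one of the closed forms.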

Exercise 4.7 For Leontiev utility, use (15) and (17) to find Walrasian demand and indirect utility from the solutions of the EMP.

Exercise 4.8 Slutsky equation: The so-called Slutsky equation provides a relation between the sensitivity to price changes of the Walrasian and Hicksian demand functions.

Assume the utility function $u : \mathbb{R}^L_+ \to \mathbb{R}$ is continuous and represents locally nonsatiated, strictly convex preferences. We know that in this case there are unique solutions to the UMP and EMP: we can consider Walrasian and Hicksian demand functions. If these functions are differentiable, the following holds. Fix $(p, w) \in \mathbb{R}^{L+1}_{++}$ and utility level $u = v(p, w) > u(0, \ldots, 0)$.^9 Then for all commodities $k, \ell \in \{1, \ldots, L\}$:

\[ \frac{\partial h_\ell(p, u)}{\partial p_k} = \frac{\partial x_\ell(p, w)}{\partial p_k} + \frac{\partial x_\ell(p, w)}{\partial w} \, x_k(p, w). \tag{19} \]

Prove (19) as follows: You know from (18) that $h_\ell(p, u) = x_\ell(p, e(p, u))$. Differentiate this equation w.r.t. $p_k$, using the Chain rule. Continue by substituting (14), (15), and (18).
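Before proving (19), it can be reassuring to see it hold numerically. The sketch below is not a proof and makes several assumptions that are not in the notes: Cobb-Douglas preferences with exponents `A` summing to one, the standard closed forms for their demand, indirect utility and expenditure functions, and central finite differences in place of exact derivatives.

```python
import math

A = [0.3, 0.7]  # assumed Cobb-Douglas exponents, summing to one

def x(p, w):
    """Walrasian demand x_l(p, w) = a_l * w / p_l (assumed closed form)."""
    return [a * w / pl for a, pl in zip(A, p)]

def v(p, w):
    """Indirect utility of the assumed Cobb-Douglas utility."""
    return w * math.prod((a / pl) ** a for a, pl in zip(A, p))

def e(p, u):
    """Expenditure function: the inverse of v in its wealth argument."""
    return u / math.prod((a / pl) ** a for a, pl in zip(A, p))

def h(p, u):
    """Hicksian demand obtained through relation (18)."""
    return x(p, e(p, u))

def bump(p, k, d):
    """Copy of p with coordinate k shifted by d."""
    q = list(p)
    q[k] += d
    return q

def slutsky_gap(p, w, k, l, d=1e-5):
    """Left-hand side minus right-hand side of (19), by central differences."""
    u = v(p, w)
    dh_dpk = (h(bump(p, k, d), u)[l] - h(bump(p, k, -d), u)[l]) / (2 * d)
    dx_dpk = (x(bump(p, k, d), w)[l] - x(bump(p, k, -d), w)[l]) / (2 * d)
    dx_dw = (x(p, w + d)[l] - x(p, w - d)[l]) / (2 * d)
    return dh_dpk - (dx_dpk + dx_dw * x(p, w)[k])

p, w = [2.0, 5.0], 10.0  # hypothetical prices and wealth
gaps = [slutsky_gap(p, w, k, l) for k in range(2) for l in range(2)]
```

Every entry of `gaps` should be zero up to discretization error, for own-price effects (`k == l`) as well as cross-price effects.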

4.5.

Welfare analysis for the consumer

Welfare analysis studies how changes in the consumer's environment (in our case: the budget set) affect his well-being. Let $B^0$ be the budget set before, and $B^1$ the budget set after the change. Assuming that optimal bundles exist, the consumer is better off after the change if and only if whatever is optimal in $B^1$ is strictly preferred to whatever is optimal in $B^0$. This is welfare analysis in a nutshell. Some obvious ways of detecting changes that are (at least weakly) welfare improving are:

• The budget set has grown: $B^0 \subseteq B^1$.
• An optimal bundle in $B^0$ remains feasible in $B^1$.

Exercise 4.9 How is the consumer's welfare affected by the changes described in Exercise 4.2?

Whereas the above describes the idea behind welfare analysis in its full generality and simplicity, economic textbooks tend to restrict attention to changes only in prices and wealth. The initial vector of prices and wealth is denoted $(p^0, w^0) \in \mathbb{R}^{L+1}_{++}$ and the vector of prices and wealth after the change is denoted $(p^1, w^1) \in \mathbb{R}^{L+1}_{++}$. This allows changes in prices only, keeping wealth constant ($p^0 \neq p^1$, $w^0 = w^1$), changes in wealth only, keeping prices constant ($p^0 = p^1$, $w^0 \neq w^1$), or simultaneous changes in prices and wealth ($p^0 \neq p^1$, $w^0 \neq w^1$).

Exercise 4.10 Let $\succsim$ be a locally nonsatiated weak order on $\mathbb{R}^L_+$. Consider a change from $(p^0, w^0)$ to $(p^1, w^1)$. Let $x^0 \in x(p^0, w^0)$. Show that if $p^1 \cdot x^0 < w^1$, the consumer is strictly better off under $(p^1, w^1)$ than under $(p^0, w^0)$.

9 This inequality holds because the zero vector cannot solve the utility maximization problem: by local nonsatiation and strict positivity of prices and wealth, there is an affordable bundle preferred to the zero vector.

Assume that the consumer's continuous, locally nonsatiated preference relation $\succsim$ can be represented by means of a utility function. We can derive the consumer's indirect utility function and conclude that the consumer is better off after the change if and only if $v(p^1, w^1) > v(p^0, w^0)$. However, since the indirect utility function depends on which utility function is chosen to represent $\succsim$, this does not tell us how much better off the consumer is. To express welfare changes unambiguously in monetary units, one constructs a so-called money metric indirect utility function using the expenditure function. Fix an arbitrary price vector $\bar{p} \in \mathbb{R}^L_{++}$. Consider the real-valued function $e(\bar{p}, \cdot)$. By Proposition 4.6, this function is strictly increasing, so

\[ e(\bar{p}, v(p^1, w^1)) > e(\bar{p}, v(p^0, w^0)) \iff v(p^1, w^1) > v(p^0, w^0). \]

Moreover, since the expenditure function is expressed in monetary units,

\[ e(\bar{p}, v(p^1, w^1)) - e(\bar{p}, v(p^0, w^0)) \tag{20} \]

can be used as a monetary measure of welfare change: if it is positive, the welfare of the consumer increases as a consequence of the change from $(p^0, w^0)$ to $(p^1, w^1)$, if it is negative, the welfare of the consumer has decreased. It remains to prove that this money metric does not depend on the choice of utility function representing the consumer's preferences. This follows from the fact that expenditure can be expressed in a form independent of the utility function: for all $(p, u) \in \mathbb{R}^L_{++} \times U$, there is a $y \in \mathbb{R}^L_+$ with $u(y) = u$, so

\[ e(p, u) = \min \{ p \cdot x : x \in \mathbb{R}^L_+, \; u(x) \geq u \} = \min \{ p \cdot x : x \in \mathbb{R}^L_+, \; x \succsim y \}. \]
In (20), two natural choices for $\bar{p}$ would be the initial vector of prices $p^0$ and the new vector of prices $p^1$. These choices give rise to two well-known measures of welfare change: equivalent variation (EV) and compensating variation (CV). Let $u^0 = v(p^0, w^0)$ and $u^1 = v(p^1, w^1)$. Notice that $e(p^0, u^0) = w^0$ and $e(p^1, u^1) = w^1$ by local nonsatiation. We define

\[ EV((p^0, w^0), (p^1, w^1)) = e(p^0, u^1) - e(p^0, u^0) = e(p^0, u^1) - w^0, \]
\[ CV((p^0, w^0), (p^1, w^1)) = e(p^1, u^1) - e(p^1, u^0) = w^1 - e(p^1, u^0). \]
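The two definitions translate directly into code once closed forms for $v$ and $e$ are available. The sketch below plugs in the Leontiev closed forms from the running example, $v(p, w) = w / \sum_i a_i p_i$ and $e(p, u) = u \sum_i a_i p_i$; the function names and the numerical scenario are illustrative assumptions.

```python
def price_index(p, a):
    """sum_i a_i p_i, the cost of one 'unit of utility' under Leontiev utility."""
    return sum(ai * pi for ai, pi in zip(a, p))

def indirect_utility(p, w, a):
    return w / price_index(p, a)

def expenditure(p, u, a):
    return u * price_index(p, a)

def equivalent_variation(p0, w0, p1, w1, a):
    """EV = e(p0, u1) - w0, measured at the initial prices p0."""
    return expenditure(p0, indirect_utility(p1, w1, a), a) - w0

def compensating_variation(p0, w0, p1, w1, a):
    """CV = w1 - e(p1, u0), measured at the new prices p1."""
    return w1 - expenditure(p1, indirect_utility(p0, w0, a), a)

a = [1.0, 1.0]                    # hypothetical Leontiev coefficients
p0, w0 = [1.0, 1.0], 10.0         # hypothetical initial situation

# Scenario 1: the price of the first good doubles, wealth is unchanged.
ev_price = equivalent_variation(p0, w0, [2.0, 1.0], 10.0, a)
cv_price = compensating_variation(p0, w0, [2.0, 1.0], 10.0, a)

# Scenario 2: wealth falls by 3 at unchanged prices.
ev_wealth = equivalent_variation(p0, w0, p0, 7.0, a)
cv_wealth = compensating_variation(p0, w0, p0, 7.0, a)
```

In the first scenario both measures are negative but differ (about -3.33 versus -5), because they value the same utility loss at different price vectors. In the second scenario prices do not change, so both measures equal the change in wealth, -3.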
There is no obvious way to say that one of the measures is better than the other, although the equivalent variation has an advantage when comparing alternative changes: suppose $(p^0, w^0)$ changes either to $(p^1, w^1)$ or $(p^2, w^2)$. Both $EV((p^0, w^0), (p^1, w^1))$ and $EV((p^0, w^0), (p^2, w^2))$ are expressed in terms of wealth at prices $p^0$ and can consequently be compared. However, $CV((p^0, w^0), (p^1, w^1))$ is expressed in wealth at prices $p^1$ and $CV((p^0, w^0), (p^2, w^2))$ in wealth at prices $p^2$, so they are incomparable.

Leontiev utility (Continued): The equivalent and compensating variation for Leontiev utility follow immediately from the indirect utility function and expenditure function computed earlier:

\[ u^0 = v(p^0, w^0) = \frac{w^0}{\sum_{i=1}^{L} a_i p^0_i} \quad \text{and} \quad u^1 = v(p^1, w^1) = \frac{w^1}{\sum_{i=1}^{L} a_i p^1_i}, \]

so

\[ EV((p^0, w^0), (p^1, w^1)) = e(p^0, u^1) - e(p^0, u^0) = w^1 \, \frac{\sum_{i=1}^{L} a_i p^0_i}{\sum_{i=1}^{L} a_i p^1_i} - w^0, \]
\[ CV((p^0, w^0), (p^1, w^1)) = e(p^1, u^1) - e(p^1, u^0) = w^1 - w^0 \, \frac{\sum_{i=1}^{L} a_i p^1_i}{\sum_{i=1}^{L} a_i p^0_i}. \]

Lump-sum tax: Given initial prices and wealth $(p^0, w^0)$, suppose that the government levies a lump-sum tax $T \in (0, w^0)$ on the consumer's wealth, keeping prices unchanged. Then $(p^1, w^1) = (p^0, w^0 - T)$. Hence $e(p^0, u^0) = e(p^1, u^0) = w^0$ and $e(p^1, u^1) = e(p^0, u^1) = w^1 = w^0 - T$, so

\[ EV((p^0, w^0), (p^1, w^1)) = CV((p^0, w^0), (p^1, w^1)) = -T. \]

This is intuitive: since the prices remain unchanged, the monetary measure of welfare change as a consequence of a decrease of $T$ in the consumer's wealth should equal $-T$.

Deadweight loss:

Let the preference relation $\succsim$ be a continuous, locally nonsatiated, strictly convex weak order on $\mathbb{R}^L_+$. Fix a price vector $p^0 \in \mathbb{R}^L_{++}$ and wealth $w > 0$. Suppose the government levies a commodity tax $t > 0$ on the price of good $\ell$. Thus, the new price vector is $p^1 = p^0 + t e_\ell$, where $e_\ell = (0, \ldots, 0, 1, 0, \ldots, 0)$ is the $\ell$-th standard basis vector of $\mathbb{R}^L$ with $\ell$-th coordinate 1 and all other coordinates 0. The total tax revenue is $T = t x_\ell(p^1, w)$ and

\[ EV((p^0, w), (p^1, w)) = e(p^0, u^1) - w \leq 0, \]

where $u^1 = v(p^1, w)$ as before. Alternatively, to raise the same amount, the government can levy a lump-sum tax $T$ directly on the wealth of the consumer, keeping prices fixed, yielding an equivalent variation $-T$.

The consumer is at least weakly better off under lump-sum taxation. Let $x^*$ solve the UMP under commodity taxation. Then $x^* \in B(p^0 + t e_\ell, w)$, so $p^0 \cdot x^* + t x^*_\ell \leq w$, i.e., $p^0 \cdot x^* \leq w - t x^*_\ell = w - T$. So $x^* \in B(p^0, w - T)$, i.e., $x^*$ is feasible in the UMP under lump-sum taxation: the consumer cannot be worse off under lump-sum taxation than under commodity taxation. Therefore, $e(p^0, u^1) \leq w - T$. The difference

\[ w - T - e(p^0, u^1) \geq 0 \]

is called the deadweight loss of commodity taxation.
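A small numerical illustration may help to see that the deadweight loss can be strictly positive. The sketch assumes a two-good Cobb-Douglas utility $u(x) = \sqrt{x_1 x_2}$ together with its standard closed forms $x_i(p, w) = w / (2 p_i)$, $v(p, w) = w / (2\sqrt{p_1 p_2})$ and $e(p, u) = 2u\sqrt{p_1 p_2}$; this utility function, the function names and all numbers are assumptions made only for this example.

```python
import math

def demand(p, w):
    """Walrasian demand for u(x) = sqrt(x1 * x2): half of wealth on each good."""
    return [w / (2.0 * p[0]), w / (2.0 * p[1])]

def indirect_utility(p, w):
    return w / (2.0 * math.sqrt(p[0] * p[1]))

def expenditure(p, u):
    return 2.0 * u * math.sqrt(p[0] * p[1])

p0 = [1.0, 1.0]   # hypothetical pre-tax prices
w = 100.0         # hypothetical wealth
t = 1.0           # hypothetical commodity tax on good 1 (index 0)

p1 = [p0[0] + t, p0[1]]                     # p1 = p0 + t * e_1
tax_revenue = t * demand(p1, w)[0]          # T = t * x_1(p1, w)
u1 = indirect_utility(p1, w)                # utility under the commodity tax

ev_commodity_tax = expenditure(p0, u1) - w  # EV of the commodity tax
ev_lump_sum = -tax_revenue                  # EV of a lump-sum tax raising T
deadweight_loss = w - tax_revenue - expenditure(p0, u1)
```

Here the commodity tax raises $T = 25$ but costs the consumer about 29.29 in equivalent-variation terms, so roughly 4.29 is lost relative to a lump-sum tax that raises the same revenue.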

Exercise 4.11 Cobb-Douglas utility: In Section 4, the Leontiev utility function was used as a running example to illustrate all definitions. Go through the same steps, now using the Cobb-Douglas utility function $u : \mathbb{R}^L_+ \to \mathbb{R}$ defined for all $x \in \mathbb{R}^L_+$ by $u(x) = x_1^{a_1} x_2^{a_2} \cdots x_L^{a_L}$, where $a_1, \ldots, a_L > 0$.

4.6.

Welfare and Hicksian demand

Assume that the preferences of the consumer are continuous, locally nonsatiated and strictly convex. If the only change is in the price of a single good, equivalent and compensating variation can simply be expressed in terms of the Hicksian demand function. To somewhat simplify notation, we denote for an arbitrary price vector $p^0$ and an arbitrary $p_\ell > 0$ the price vector obtained from $p^0$ by changing the price of good $\ell$ to $p_\ell$ by $(p_\ell, p^0_{-\ell})$.

So given initial prices and wealth $(p^0, w)$, suppose that only the price of good $\ell \in \{1, \ldots, L\}$ is changed to $p^1_\ell \neq p^0_\ell$, giving rise to $(p^1, w) = ((p^1_\ell, p^0_{-\ell}), w)$. Recall that

\[ \frac{\partial e(p, u)}{\partial p_\ell} = h_\ell(p, u) \quad \text{and} \quad e(p^1, u^1) = w. \]

Hence

\[ EV((p^0, w), (p^1, w)) = e(p^0, u^1) - w = e(p^0, u^1) - e(p^1, u^1) = \int_{p^1_\ell}^{p^0_\ell} \frac{\partial e(p_\ell, p^0_{-\ell}, u^1)}{\partial p_\ell} \, dp_\ell = \int_{p^1_\ell}^{p^0_\ell} h_\ell(p_\ell, p^0_{-\ell}, u^1) \, dp_\ell. \tag{21} \]

Similarly,

\[ CV((p^0, w), (p^1, w)) = \int_{p^1_\ell}^{p^0_\ell} h_\ell(p_\ell, p^0_{-\ell}, u^0) \, dp_\ell. \tag{22} \]

This means that the equivalent and compensating variation due to such a simple price change can be represented by areas "to the left of" the Hicksian demand curve.

Normal goods: Suppose good $\ell$ is a normal good (i.e., its Walrasian demand is weakly increasing in income) and that its price is decreased: $p^0_\ell > p^1_\ell$. We claim that $EV((p^0, w), (p^1, w)) \geq CV((p^0, w), (p^1, w))$. To see this, write $u^0 = v(p^0, w)$ and $u^1 = v(p^1, w)$. Since $v$ is nonincreasing in $p_\ell$, $u^0 \leq u^1$. Since $e$ is increasing in $u$, this implies that $e(p_\ell, p^0_{-\ell}, u^0) \leq e(p_\ell, p^0_{-\ell}, u^1)$ for all $p_\ell > 0$. Since good $\ell$ is normal and $x_\ell(p, e(p, u)) = h_\ell(p, u)$, it follows that

\[ h_\ell(p_\ell, p^0_{-\ell}, u^0) = x_\ell(p_\ell, p^0_{-\ell}, e(p_\ell, p^0_{-\ell}, u^0)) \leq x_\ell(p_\ell, p^0_{-\ell}, e(p_\ell, p^0_{-\ell}, u^1)) = h_\ell(p_\ell, p^0_{-\ell}, u^1) \]

for all $p_\ell > 0$. Combining this with (21) and (22), it follows that

\[ EV((p^0, w), (p^1, w)) - CV((p^0, w), (p^1, w)) = \int_{p^1_\ell}^{p^0_\ell} h_\ell(p_\ell, p^0_{-\ell}, u^1) \, dp_\ell - \int_{p^1_\ell}^{p^0_\ell} h_\ell(p_\ell, p^0_{-\ell}, u^0) \, dp_\ell = \int_{p^1_\ell}^{p^0_\ell} \left[ h_\ell(p_\ell, p^0_{-\ell}, u^1) - h_\ell(p_\ell, p^0_{-\ell}, u^0) \right] dp_\ell \geq 0. \]
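The area representations (21) and (22) and the resulting ranking of EV and CV can be checked numerically. The sketch again assumes the two-good Cobb-Douglas utility $u(x) = \sqrt{x_1 x_2}$ (for which both goods are normal), with $e(p, u) = 2u\sqrt{p_1 p_2}$, $v(p, w) = w/(2\sqrt{p_1 p_2})$ and $h_1(p, u) = u\sqrt{p_2/p_1}$; these closed forms, the midpoint-rule integrator and all numbers are assumptions of the example.

```python
import math

def expenditure(p1, p2, u):
    return 2.0 * u * math.sqrt(p1 * p2)

def indirect_utility(p1, p2, w):
    return w / (2.0 * math.sqrt(p1 * p2))

def hicksian_good1(p1, p2, u):
    """h_1(p, u): derivative of the expenditure function w.r.t. p1."""
    return u * math.sqrt(p2 / p1)

def integrate(g, lo, hi, n=20_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(g(lo + (i + 0.5) * step) for i in range(n))

# The price of good 1 falls from 2 to 1; the other price and wealth are fixed.
old_price, new_price, p2, w = 2.0, 1.0, 1.0, 100.0
u0 = indirect_utility(old_price, p2, w)
u1 = indirect_utility(new_price, p2, w)

# EV and CV from their definitions ...
ev_direct = expenditure(old_price, p2, u1) - w
cv_direct = w - expenditure(new_price, p2, u0)

# ... and as areas under the Hicksian demand curves, as in (21) and (22).
ev_area = integrate(lambda q: hicksian_good1(q, p2, u1), new_price, old_price)
cv_area = integrate(lambda q: hicksian_good1(q, p2, u0), new_price, old_price)
```

Both routes give an EV of about 41.42 and a CV of about 29.29, so EV exceeds CV for this price decrease of a normal good.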

5.

Choices of a producer: classical supply theory

5.1.

Production sets

Having treated the demand side of the economy in detail, we now turn to the supply side. The supply side consists of firms that use a technology to convert one set of commodities (inputs) to another (outputs). Just as for consumers, it is assumed that firms take prices as given and that all commodities are traded at the market at publicly quoted prices. Consider an economy with $L \in \mathbb{N}$ commodities. The firm's production can be described by a production plan or production vector $y = (y_1, \ldots, y_L) \in \mathbb{R}^L$ which gives the net amount produced of each of the $L$ commodities. If $y_\ell < 0$, we say that good $\ell$ is used as an input in the production plan $y$, if $y_\ell > 0$, we say that good $\ell$ is used as an output in $y$. For instance, if $L = 2$, the production plan $y = (-2, 6)$ indicates that two units of the first commodity are used as an input to produce an output of 6 units of the second commodity. The production set of technologically feasible production vectors is denoted by $Y \subseteq \mathbb{R}^L$. This general description allows that a commodity is used as an input in some production vectors, but as an output in others.


You may come across the following special cases:

Transformation functions: Sometimes the production set can conveniently be described using a function $F : \mathbb{R}^L \to \mathbb{R}$ called the transformation function as follows:

\[ Y = \{ y \in \mathbb{R}^L : F(y) \leq 0 \} \quad \text{and} \quad F(y) = 0 \text{ if } y \text{ lies on the boundary of } Y. \]

The set of boundary points $\{ y \in \mathbb{R}^L : F(y) = 0 \}$ is called the transformation frontier.

Single-output technologies: In many examples, one of the goods, say good $L$, is an output that is produced using the remaining goods, say $1, \ldots, L-1$, as inputs. These are single-output technologies, typically summarized using a production function $f : \mathbb{R}^{L-1}_+ \to \mathbb{R}$ that assigns to each vector of input quantities $z$ the maximal amount $f(z)$ of output that can be produced from it. One can then write

\[ Y = \{ (-z_1, \ldots, -z_{L-1}, q) \in \mathbb{R}^L : q \leq f(z) \text{ and } z \in \mathbb{R}^{L-1}_+ \}. \]

Consider, for instance, a Cobb-Douglas production function $f : \mathbb{R}^2_+ \to \mathbb{R}$ given by $f(z) = z_1^{\alpha} z_2^{\beta}$, where $\alpha, \beta > 0$. Then

\[ Y = \{ (-z_1, -z_2, q) \in \mathbb{R}^3 : q \leq z_1^{\alpha} z_2^{\beta} \text{ and } z_1, z_2 \geq 0 \}. \]

5.2.

Properties of production sets

Properties that are often imposed on the production set $Y \subseteq \mathbb{R}^L$ include:

• $Y$ is nonempty: there is at least one feasible production vector.

• Possibility of inaction: $0 \in Y$. It is possible to do nothing, i.e., produce zero outputs from zero inputs.

• $Y$ is closed. This assumption is mainly for mathematical convenience.

• No free lunch: if $y \in Y \cap \mathbb{R}^L_+$, then $y = 0$. It is not possible to produce positive amounts of output without using inputs.

• Free disposal: if $y \in Y$ and $y' \leq y$, then $y' \in Y$. If $y$ is feasible and $y'$ uses at least as much of each input, yet gives no more of the outputs, then also $y'$ is feasible.

• Irreversibility: if $y \in Y$ and $y \neq 0$, then $-y \notin Y$. It is impossible to reverse a feasible production vector, i.e., to turn the outputs into the same amount of inputs used to produce it.

• Nonincreasing returns to scale: if $y \in Y$ and $\lambda \in [0, 1]$, then $\lambda y \in Y$. This means that feasible production plans can be scaled down.

• Nondecreasing returns to scale: if $y \in Y$ and $\lambda \geq 1$, then $\lambda y \in Y$. This means that feasible production plans can be scaled up.

• Constant returns to scale (CRS): if $y \in Y$ and $\lambda \geq 0$, then $\lambda y \in Y$. This is the conjunction of the previous two properties.

• Additivity/free entry: if $y, y' \in Y$, then $y + y' \in Y$. If both $y$ and $y'$ are feasible, then it is feasible to set up two independent plants, one producing $y$, the other $y'$, together yielding $y + y'$.

• $Y$ is convex: if $y, y' \in Y$ and $\lambda \in [0, 1]$, then $\lambda y + (1 - \lambda) y' \in Y$.

• $Y$ is a convex cone: if $y, y' \in Y$ and $\lambda, \mu \geq 0$, then $\lambda y + \mu y' \in Y$.

One easily establishes relations between these properties. Possibility of inaction implies nonemptiness. Nondecreasing and nonincreasing returns to scale imply constant returns to scale. Some
less trivial ones are:

Proposition 5.1 Let $Y \subseteq \mathbb{R}^L$ be a production set.

(a) If $Y$ is convex and $0 \in Y$, then $Y$ has nonincreasing returns to scale.

(b) $Y$ is a convex cone if and only if $Y$ is convex and has constant returns to scale.

(c) $Y$ is a convex cone if and only if $Y$ is additive and has nonincreasing returns to scale.

(d) If $Y$ satisfies no free lunch and for all $x, y \in Y$ and $\lambda \in (0, 1)$, there is a $z \in Y$ with $z \geq \lambda x + (1 - \lambda) y$, $z \neq \lambda x + (1 - \lambda) y$, then $Y$ satisfies irreversibility.

Proof. (a): Let $y \in Y$ and $\lambda \in [0, 1]$. By convexity, $\lambda y + (1 - \lambda) 0 = \lambda y \in Y$.

(b): Assume $Y$ is a convex cone. Then $Y$ is trivially convex. To establish CRS, let $y \in Y$ and $\lambda \geq 0$. Since $Y$ is a convex cone, $\lambda y = \frac{1}{2} \lambda y + \frac{1}{2} \lambda y \in Y$. Conversely, assume $Y$ is convex and has CRS. To show that $Y$ is a convex cone, let $y, y' \in Y$ and $\lambda, \mu \geq 0$. By CRS, $2 \lambda y \in Y$ and $2 \mu y' \in Y$. By convexity, $\frac{1}{2}(2 \lambda y) + \frac{1}{2}(2 \mu y') = \lambda y + \mu y' \in Y$.

(c): If $Y$ is a convex cone, it is additive (take $\lambda = \mu = 1$) and has nonincreasing returns to scale (similar to the proof of CRS above). Conversely, assume that $Y$ is additive and has nonincreasing returns to scale. Let $y, y' \in Y$ and $\lambda, \mu \geq 0$. By additivity, $k y \in Y$ and $k y' \in Y$ for all $k \in \mathbb{N}$. Choose $k \in \mathbb{N}$ such that $\lambda / k \leq 1$ and $\mu / k \leq 1$. Since $Y$ has nonincreasing returns to scale, $(\lambda / k) y \in Y$ and $(\mu / k) y' \in Y$. By additivity: $\lambda y = k (\lambda / k) y \in Y$ and $\mu y' \in Y$. Again by additivity $\lambda y + \mu y' \in Y$.

(d): Let $y \in Y$, $y \neq 0$, and suppose $-y \in Y$. By assumption, there is a $z \in Y$ such that $z \geq \frac{1}{2} y + \frac{1}{2} (-y) = 0$, $z \neq 0$, contradicting no free lunch.

In the special case of a production function, properties of the production set are related to
properties of the production function. For instance:

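Proposition 5.2 Consider a single-output technology with production function $f : \mathbb{R}^{L-1}_+ \to \mathbb{R}$ with $f(0, \ldots, 0) = 0$, so that

\[ Y = \{ (-z, q) \in \mathbb{R}^L : q \leq f(z) \text{ and } z \in \mathbb{R}^{L-1}_+ \}. \]

(a) $Y$ has constant returns to scale if and only if $f$ is homogeneous of degree one.

(b) $Y$ is convex if and only if $f$ is concave.

Proof. (a): First, assume that $Y$ satisfies constant returns to scale.

We show that for each $z \in \mathbb{R}^{L-1}_+$ and $\lambda > 0$: $f(\lambda z) \geq \lambda f(z)$. So let $z \in \mathbb{R}^{L-1}_+$ and $\lambda > 0$. By definition of the production set $Y$, $(-z, f(z)) \in Y$. By CRS, this implies $(-\lambda z, \lambda f(z)) \in Y$. By definition of the production set $Y$, $(-\lambda z, \lambda f(z)) \in Y$ means that $\lambda f(z) \leq f(\lambda z)$.

Next, we show that for each $z \in \mathbb{R}^{L-1}_+$ and $\lambda > 0$: $f(\lambda z) \leq \lambda f(z)$. So let $z \in \mathbb{R}^{L-1}_+$ and $\lambda > 0$. Fix $z' = \lambda z \in \mathbb{R}^{L-1}_+$ and $\lambda' = 1/\lambda > 0$. Apply the result above to $z'$ and $\lambda'$: $\lambda' f(z') \leq f(\lambda' z')$. Substitute $z' = \lambda z$ and $\lambda' = 1/\lambda$: $(1/\lambda) f(\lambda z) \leq f((1/\lambda) \lambda z) = f(z)$. Multiply both sides with $\lambda$: $f(\lambda z) \leq \lambda f(z)$.

Using the above, it follows that if $Y$ has CRS, then for each $z \in \mathbb{R}^{L-1}_+$ and each $\lambda > 0$: $f(\lambda z) = \lambda f(z)$. So $f$ is homogeneous of degree one.

Conversely, assume that $f$ is homogeneous of degree one: for each $z \in \mathbb{R}^{L-1}_+$ and each $\lambda > 0$: $f(\lambda z) = \lambda f(z)$. To show: $Y$ has CRS, i.e., if $(-z, q) \in Y$ and $\lambda \geq 0$, then $(-\lambda z, \lambda q) \in Y$. This follows from the assumption that $f(0, \ldots, 0) = 0$ if $\lambda = 0$. So let $(-z, q) \in Y$ and $\lambda > 0$. By definition of the production set $Y$, $q \leq f(z)$. Multiply both sides with $\lambda > 0$: $\lambda q \leq \lambda f(z)$. Since $f$ is homogeneous of degree one, $\lambda f(z) = f(\lambda z)$, so $\lambda q \leq \lambda f(z) = f(\lambda z)$. By definition of the production set $Y$, $\lambda q \leq f(\lambda z)$ together with $\lambda z \in \mathbb{R}^{L-1}_+$ implies that $(-\lambda z, \lambda q) \in Y$.

(b): The function $f : \mathbb{R}^{L-1}_+ \to \mathbb{R}$ is concave if and only if its subgraph

\[ \{ (z, q) \in \mathbb{R}^{L-1}_+ \times \mathbb{R} : q \leq f(z) \} = \{ (z, q) \in \mathbb{R}^L : q \leq f(z) \text{ and } z \in \mathbb{R}^{L-1}_+ \} \]

is convex. Multiplying the first $L - 1$ coordinates with $-1$ maintains convexity, so this is equivalent with

\[ \{ (-z, q) \in \mathbb{R}^L : q \leq f(z) \text{ and } z \in \mathbb{R}^{L-1}_+ \} = Y \]

being convex.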
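Part (a) of Proposition 5.2 can be illustrated with the Cobb-Douglas production function $f(z) = z_1^{\alpha} z_2^{\beta}$, which is homogeneous of degree $\alpha + \beta$. The sketch below tests whether scaling a production plan on the boundary of $Y$ keeps it feasible; the helper names, the tolerance and the exponents are illustrative assumptions.

```python
def cobb_douglas(alpha, beta):
    """Production function f(z) = z1**alpha * z2**beta."""
    return lambda z: z[0] ** alpha * z[1] ** beta

def in_production_set(f, z, q, tol=1e-9):
    """(-z, q) lies in Y iff all inputs are nonnegative and q <= f(z)."""
    return all(zi >= 0 for zi in z) and q <= f(z) + tol

def scaled_plan_feasible(f, z, q, lam):
    """Is lam * (-z, q) = (-lam*z, lam*q) still in Y?"""
    return in_production_set(f, [lam * zi for zi in z], lam * q)

f_degree_one = cobb_douglas(0.4, 0.6)   # alpha + beta = 1
f_degree_low = cobb_douglas(0.3, 0.3)   # alpha + beta < 1

z = [4.0, 9.0]                          # hypothetical input vector
scales = [0.0, 0.5, 1.0, 3.0]

# Start from the boundary plan (-z, f(z)) and scale it.
crs_results = [scaled_plan_feasible(f_degree_one, z, f_degree_one(z), s)
               for s in scales]
low_results = [scaled_plan_feasible(f_degree_low, z, f_degree_low(z), s)
               for s in scales]
```

With exponents summing to one, every scaled plan stays in $Y$, as constant returns to scale require. With exponents summing to less than one, scaling down keeps the plan feasible but scaling up by a factor 3 does not: that production set has nonincreasing but not constant returns to scale.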

5.3.

The profit maximization problem

The production set $Y$ specifies a firm's set of feasible options. To make the choice problem of the firm complete, we have to endow it with preferences. These preferences are particularly simple. It is assumed that firms maximize profits given the commodity prices and the firm's production set: given production set $Y \subseteq \mathbb{R}^L$ and a price vector $p \in \mathbb{R}^L_{++}$, the profit maximization problem (PMP) is

\[ \max \; p \cdot y \quad \text{s.t.} \quad y \in Y. \]

The profit function $\pi$ assigns to every price vector $p \in \mathbb{R}^L_{++}$ the maximal profit

\[ \pi(p) = \max \{ p \cdot y : y \in Y \}. \]

The supply correspondence $y(\cdot)$ assigns to every price vector $p \in \mathbb{R}^L_{++}$ the set of profit-maximizing production vectors:

\[ y(p) = \{ y \in Y : p \cdot y = \pi(p) \}. \]

As opposed to the utility maximization problem, which has a solution under mild conditions (like continuity of the utility function), there may not be a solution to the PMP: profits may be unbounded. In that case, we set $\pi(p) = +\infty$. Indeed, we may have the following:

Proposition 5.3 Let $Y \subseteq \mathbb{R}^L$ be nonempty and satisfy nondecreasing returns to scale. For each price vector $p \in \mathbb{R}^L_{++}$, either $p \cdot y \leq 0$ for all $y \in Y$, which means that no positive profit can be made, or $\pi(p) = +\infty$.

Proof. Consider a price vector $p \in \mathbb{R}^L_{++}$. Suppose that $p \cdot y > 0$ for some $y \in Y$. Since $Y$ has nondecreasing returns to scale, $\lambda y \in Y$ for all $\lambda \geq 1$, so $p \cdot (\lambda y) = \lambda (p \cdot y)$ can be made arbitrarily large by letting $\lambda$ go to infinity.

This makes the existence of solutions to the PMP a nontrivial issue. The following two results provide sufficient conditions.
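The dichotomy in Proposition 5.3 is easy to see in a toy technology. The sketch assumes a hypothetical linear single-input technology in which one unit of input yields two units of output, so that $Y = \{(-z, q) : q \leq 2z, \; z \geq 0\}$ has constant (hence nondecreasing) returns to scale; the technology, the price vectors and the names are assumptions of the example.

```python
def profit(p, y):
    """Inner product p . y of a price vector and a production vector."""
    return sum(pi * yi for pi, yi in zip(p, y))

def scale(y, lam):
    return [lam * yi for yi in y]

# Boundary plan of the assumed technology: one unit of input, two of output.
y = [-1.0, 2.0]
scales = [1, 10, 100, 1000]

# Cheap input: p = (1, 1). The plan y earns a positive profit, and scaling
# it up (feasible by nondecreasing returns) multiplies that profit.
profits_cheap_input = [profit([1.0, 1.0], scale(y, lam)) for lam in scales]

# Dear input: p = (3, 1). Every boundary plan (-z, 2z) loses money, and
# plans below the boundary do even worse, so the best profit is zero.
best_dear_input = max(profit([3.0, 1.0], [-z / 10.0, 2.0 * z / 10.0])
                      for z in range(0, 101))
```

At prices $(1, 1)$ the profits are 1, 10, 100, 1000 and grow without bound, so $\pi(p) = +\infty$; at prices $(3, 1)$ no plan earns a positive profit and inaction is optimal.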

Proposition 5.4 Assume that the production set $Y \subseteq \mathbb{R}^L$ is:

• nonempty,
• closed,
• bounded above: there is an $r \in \mathbb{R}$ such that $y_\ell \leq r$ for all $y \in Y$ and all $\ell \in \{1, \ldots, L\}$.

Then the profit maximization problem has at least one solution for each price vector $p \in \mathbb{R}^L_{++}$.

Proof. Let $p \in \mathbb{R}^L_{++}$. By nonemptiness, there is a $y' \in Y$. A solution to the PMP must lie in the set $P = Y \cap \{ y \in \mathbb{R}^L : p \cdot y \geq p \cdot y' \}$.

$P$ is closed: $Y$ is closed by assumption and the second set in the intersection is closed, since it is the upper contour set of a continuous function. The intersection of two closed sets is closed.

$P$ is bounded: By assumption, the coordinates of vectors in $P$ are bounded above by $r$. Moreover, all coordinates are bounded from below as well: let $y \in P$ and consider an arbitrary coordinate $\ell \in \{1, \ldots, L\}$. Since $p \cdot y \geq p \cdot y'$, it follows that

\[ p_\ell y_\ell \geq p \cdot y' - \sum_{k \neq \ell} p_k y_k \geq p \cdot y' - \sum_{k \neq \ell} p_k r, \]

so $y_\ell$ is bounded from below by $\left( p \cdot y' - \sum_{k \neq \ell} p_k r \right) / p_\ell$.

Hence, $P$ is compact. Since we maximize a continuous profit function over a compact set $P \subseteq Y$, there is at least one solution.

The following result establishes existence of solutions to the profit maximization problem under resource constraints.

Proposition 5.5 Assume that the production set $Y \subseteq \mathbb{R}^L$:

• satisfies possibility of inaction,
• satisfies no free lunch,
• is closed,
• is convex,
• has a resource constraint: there is a nonzero vector $\omega \in \mathbb{R}^L_+$ of inputs restricting feasible production to vectors $y \in Y$ with $y \geq -\omega$.

Then the profit maximization problem has at least one solution for each price vector $p \in \mathbb{R}^L_{++}$.

Exercise 5.1 This exercise guides you through the proof of Proposition 5.5.

(a) Show that $Y' = Y \cap \{ y \in \mathbb{R}^L : y \geq -\omega \}$ is nonempty and closed.

To show that $Y' = Y \cap \{ y \in \mathbb{R}^L : y \geq -\omega \}$ is bounded, suppose it were not: there is a sequence $(y_n)_{n \in \mathbb{N}}$ of vectors in $Y'$ whose increasing length $\| y_n \|$ diverges to infinity. Define $z_n = y_n / \| y_n \|$.

(b) Show that for $n \in \mathbb{N}$ large enough, $z_n$ lies in $Y$ and satisfies $z_n + \omega / \| y_n \| \geq 0$.

(c) Show that $(z_n)_{n \in \mathbb{N}}$ has a convergent subsequence with limit $z \neq 0$ in $Y$.

(d) Combine this with (b) to derive a contradiction.

As $Y'$ is nonempty and compact and the profit function is continuous, a maximum exists!

Thus, whenever we talk about properties of the profit function and the supply correspondence, we implicitly assume that the PMP has a solution, so that $y(p) \neq \emptyset$ and $\pi(p) < \infty$.

Proposition 5.6 Consider a firm with production set $Y \subseteq \mathbb{R}^L$.

(a) The profit function is homogeneous of degree one, the supply correspondence is homogeneous of degree zero.

(b) The profit function is convex.

(c) If $Y$ is convex, $y(p)$ is a convex set for all $p \in \mathbb{R}^L_{++}$.

(d) Hotelling's lemma: Let $p \in \mathbb{R}^L_{++}$. If $y(p)$ consists of a single point $y^*$, then the profit function is differentiable at $p$ and $\partial \pi(p) / \partial p_\ell = y^*_\ell$ for all goods $\ell = 1, \ldots, L$.

(e) Law of supply: for all $p, p' \in \mathbb{R}^L_{++}$ and all $y \in y(p)$ and $y' \in y(p')$:

\[ (p - p') \cdot (y - y') \geq 0. \]

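Proof. (a): Do this yourself.

(b): We give two proofs.

First proof: we show that the epigraph $\mathrm{epi}(\pi) = \{ (p, v) \in \mathbb{R}^L_{++} \times \mathbb{R} : v \geq \pi(p) \}$ is a convex set. Let $(p^1, v^1), (p^2, v^2) \in \mathrm{epi}(\pi)$ and $\lambda \in [0, 1]$. To show: $\lambda v^1 + (1 - \lambda) v^2 \geq \pi(\lambda p^1 + (1 - \lambda) p^2)$. So let $y \in y(\lambda p^1 + (1 - \lambda) p^2)$. Then $p^i \cdot y \leq \pi(p^i) \leq v^i$ for both $i = 1, 2$, so

\[ \pi(\lambda p^1 + (1 - \lambda) p^2) = \lambda p^1 \cdot y + (1 - \lambda) p^2 \cdot y \leq \lambda v^1 + (1 - \lambda) v^2. \]

Second proof: we show that for all $p^1, p^2 \in \mathbb{R}^L_{++}$ and all $\lambda \in [0, 1]$: $\pi(\lambda p^1 + (1 - \lambda) p^2) \leq \lambda \pi(p^1) + (1 - \lambda) \pi(p^2)$. So let $p^1, p^2 \in \mathbb{R}^L_{++}$ and $\lambda \in [0, 1]$. Let $y \in y(\lambda p^1 + (1 - \lambda) p^2)$. Then $p^i \cdot y \leq \pi(p^i)$ for both $i = 1, 2$, so

\[ \pi(\lambda p^1 + (1 - \lambda) p^2) = \lambda p^1 \cdot y + (1 - \lambda) p^2 \cdot y \leq \lambda \pi(p^1) + (1 - \lambda) \pi(p^2). \]

(c): Let $p \in \mathbb{R}^L_{++}$. Then $y(p) = Y \cap \{ y \in \mathbb{R}^L : p \cdot y = \pi(p) \}$ is the intersection of $Y$ and a hyperplane. Since both are convex, so is $y(p)$.

(d): We prove Hotelling's lemma, assuming that $\pi$ is differentiable at $p$. By definition of the profit function we know that for all $p' \in \mathbb{R}^L_{++}$: $p' \cdot y^* \leq \pi(p')$, with equality if $p' = p$. So the function $h : \mathbb{R}^L_{++} \to \mathbb{R}$ with $h(p') = \pi(p') - p' \cdot y^*$ achieves its minimum at $p$. But then its partial derivatives at $p$ must be zero:

\[ \forall \ell = 1, \ldots, L: \quad \frac{\partial h(p)}{\partial p_\ell} = \frac{\partial \pi(p)}{\partial p_\ell} - y^*_\ell = 0, \]

proving Hotelling's lemma.

(e): Notice that

\[ (p - p') \cdot (y - y') = (p \cdot y - p \cdot y') + (p' \cdot y' - p' \cdot y) \geq 0, \]

where the inequality follows from the definition of profit maximizers: $p \cdot y = \pi(p) \geq p \cdot y'$ and $p' \cdot y' = \pi(p') \geq p' \cdot y$.

5.4.

Solving the PMP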

Just like in the utility maximization problem UMP, the Kuhn-Tucker conditions can be used to find necessary first order conditions for the profit maximization problem PMP: if the production set is

\[ Y = \{ y \in \mathbb{R}^L : F(y) \leq 0 \}, \]

where $F$ is continuously differentiable and the price vector is $p \in \mathbb{R}^L_{++}$, a necessary first order condition for $y^*$ to be a solution to the PMP

\[ \max \; p \cdot y \quad \text{s.t.} \quad F(y) \leq 0 \]

is that there exists a Lagrange multiplier $\lambda \geq 0$ such that for each good $\ell = 1, \ldots, L$:

\[ p_\ell = \lambda \frac{\partial F(y^*)}{\partial y_\ell}. \tag{23} \]

If we divide the first order condition for good $\ell$ with that for good $k$, we find that for all pairs of goods $\ell, k$:

\[ \frac{p_\ell}{p_k} = \frac{\partial F(y^*) / \partial y_\ell}{\partial F(y^*) / \partial y_k}, \]

i.e., in an optimal production plan $y^*$, the price ratio between two goods equals its so-called marginal rate of transformation. If the set $Y$ is convex, the first order conditions in (23) are also sufficient for a solution to the PMP.


In the single-output case, assume the production function $f$ is differentiable and that the price of input $\ell = 1, \ldots, L - 1$ equals $w_\ell > 0$ and the price of the output equals $p > 0$.

Remark 5.7 I don't know the reason for this sudden change of notation from a price vector $p$ to an output-input price vector $(p, w)$. Do not confuse the vector of input prices $w$ with the wealth level $w$ of the consumer. This choice of notation is unfortunate, but widespread in economics.

The PMP can be rewritten as

\[ \max \; p f(z) - w \cdot z \quad \text{s.t.} \quad z \in \mathbb{R}^{L-1}_+. \]

If $z^*$ is optimal, the Kuhn-Tucker conditions imply the existence of Lagrange multipliers $\mu_\ell \geq 0$ for each of the conditions $z_\ell \geq 0$ such that for all inputs $\ell = 1, \ldots, L - 1$:

\[ p \frac{\partial f(z^*)}{\partial z_\ell} - w_\ell = -\mu_\ell \quad \text{and} \quad \mu_\ell z^*_\ell = 0. \]

Assuming an interior solution ($z^*_\ell > 0$ for all $\ell$), this implies that $\mu_\ell = 0$ for all $\ell$, so the first order conditions become

\[ \forall \ell = 1, \ldots, L - 1: \quad p \frac{\partial f(z^*)}{\partial z_\ell} = w_\ell, \tag{24} \]

so that for all inputs $\ell, k$:

\[ \frac{w_\ell}{w_k} = \frac{\partial f(z^*) / \partial z_\ell}{\partial f(z^*) / \partial z_k}, \tag{25} \]

which has the interpretation that the price ratio between two goods has to equal their so-called marginal rate of technical substitution. Again, if the set $Y$ is convex, the first order conditions in (24) are also sufficient for a solution to the PMP.

5.5.

The cost minimization problem

In a profit maximizing production plan, there is no way to produce the same amount of outputs at a lower total input cost. This motivates a study of the cost minimization problem (CMP), which we consider only in the single-output case. Assume the production function is $f : \mathbb{R}^{L-1}_+ \to \mathbb{R}$ and the input price vector is $w \in \mathbb{R}^{L-1}_{++}$. We want to produce at least an amount $q$ of the output. What is the minimal amount we have to spend on inputs to achieve this? The answer is given by the CMP:

\[ \min \; w \cdot z \quad \text{s.t.} \quad z \in \mathbb{R}^{L-1}_+, \; f(z) \geq q. \]

The conditional factor demand correspondence assigns to each vector $w \in \mathbb{R}^{L-1}_{++}$ of input prices and each output level $q$ the associated set $z(w, q)$ of solutions to the CMP:

\[ z(w, q) = \{ z \in \mathbb{R}^{L-1}_+ : f(z) \geq q \text{ and } w \cdot z \leq w \cdot z' \text{ for all } z' \in \mathbb{R}^{L-1}_+ \text{ with } f(z') \geq q \}. \]

The conditional factor demand correspondence specifies the set of input vectors solving the CMP, the cost function $c(w, q)$ indicates its value:

\[ c(w, q) = \min_{z \in \mathbb{R}^{L-1}_+, \; f(z) \geq q} w \cdot z = w \cdot z^* \quad \text{for all } z^* \in z(w, q). \]

The cost minimization problem and the expenditure minimization problem

\[ \min \; p \cdot x \quad \text{s.t.} \quad x \in \mathbb{R}^L_+, \; u(x) \geq u \]

are identical, up to a relabeling of the involved functions. Therefore, rewriting Propositions 4.5, 4.6, and 4.8 provides a long list of properties for conditional factor demand and the cost function.

If the production function $f$ is continuously differentiable, the Kuhn-Tucker conditions can be used to show that at a solution $z^*$ of the CMP, there must be a Lagrange multiplier $\lambda \geq 0$ associated with the condition $q - f(z) \leq 0$ and Lagrange multipliers $\mu_\ell \geq 0$ associated with the conditions $z_\ell \geq 0$ such that for all $\ell = 1, \ldots, L - 1$:

\[ w_\ell = \lambda \frac{\partial f(z^*)}{\partial z_\ell} + \mu_\ell \quad \text{and} \quad \mu_\ell z^*_\ell = 0. \]

If the solution uses positive amounts of all inputs ($z^*_\ell > 0$ for all $\ell$), this implies that $\mu_\ell = 0$ for all $\ell$, so

\[ w_\ell = \lambda \frac{\partial f(z^*)}{\partial z_\ell} \quad \text{for all } \ell \]

and consequently

\[ \frac{w_\ell}{w_k} = \frac{\partial f(z^*) / \partial z_\ell}{\partial f(z^*) / \partial z_k}, \]

as in (25)!

5.6.

Linking the PMP and the CMP

In the case of a single-output economy with production function $f : \mathbb{R}^{L-1}_+ \to \mathbb{R}_+$, input price vector $w \in \mathbb{R}^{L-1}_{++}$, and output price $p > 0$, the PMP becomes

\[ \max \; p q - w \cdot z \quad \text{s.t.} \quad q \leq f(z), \; z \in \mathbb{R}^{L-1}_+. \]

The set of solutions is commonly denoted as $y(p, w)$ and the maximal profit as $\pi(p, w)$. In a solution $(z, q)$, positivity of the output price ($p > 0$) implies that $q = f(z)$, otherwise the profit can be increased:

\[ p q - w \cdot z < p f(z) - w \cdot z. \]

Consequently, the PMP simplifies to

\[ \max_{z \in \mathbb{R}^{L-1}_+} p f(z) - w \cdot z. \]

Moreover, production has to be as cheap as possible, so there is a link with the CMP:

Proposition 5.8 Consider a production function $f : \mathbb{R}^{L-1}_+ \to \mathbb{R}_+$ such that for each $q \geq 0$, the set $\{ z \in \mathbb{R}^{L-1}_+ : f(z) \geq q \}$ is nonempty and closed. Let $w \in \mathbb{R}^{L-1}_{++}$ be the vector of input prices, $p > 0$ the output price. Consider the optimization problems

(P1) $\max_{z \in \mathbb{R}^{L-1}_+} p f(z) - w \cdot z$,

(P2) $\max_{q \geq 0} p q - c(w, q)$.

The following claims are true:

(a) For each $z \in \mathbb{R}^{L-1}_+$, there is a $q_z \geq 0$ with $p f(z) - w \cdot z \leq p q_z - c(w, q_z)$.

(b) For each $q \geq 0$, there is a $z_q \in \mathbb{R}^{L-1}_+$ with $p f(z_q) - w \cdot z_q \geq p q - c(w, q)$.

(c) If one of the problems (P1) and (P2) has a solution, so does the other and the corresponding maximum values coincide:

\[ \max_{z \in \mathbb{R}^{L-1}_+} p f(z) - w \cdot z = \max_{q \geq 0} p q - c(w, q). \]

Exercise 5.2 Prove Proposition 5.8.

The PMP as formulated in (P2) is particularly easy: given the cost function, the PMP reduces to a single-variable maximization problem. In practice, this is often the easiest way to solve the PMP. Under suitable differentiability assumptions, the necessary Kuhn-Tucker condition at an optimum $q^* \geq 0$ is that there exists a Lagrange multiplier $\mu \geq 0$ associated with the condition $q \geq 0$ such that:

\[ p - \frac{\partial c(w, q^*)}{\partial q} = -\mu \quad \text{and} \quad \mu q^* = 0. \]

Assuming $q^* > 0$, this means that $\mu = 0$ and hence that price equals marginal costs at a profit maximizing quantity. If the cost function is convex in $q$, this condition is also sufficient.
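The equivalence of (P1) and (P2) and the rule "price equals marginal cost" can be checked by brute force for a simple technology. The sketch assumes a single input with production function $f(z) = \sqrt{z}$, whose cost function is $c(w, q) = w q^2$; the grid search, the function names and the prices are illustrative assumptions, not a recommended solution method.

```python
import math

def profit_p1(z, p, w):
    """Objective of (P1): choose the input level z."""
    return p * math.sqrt(z) - w * z

def cost(w, q):
    """Cost function of the assumed technology f(z) = sqrt(z)."""
    return w * q * q

def profit_p2(q, p, w):
    """Objective of (P2): choose the output level q."""
    return p * q - cost(w, q)

def grid_argmax(g, upper, n=10_000):
    """Maximize g over an evenly spaced grid on [0, upper]."""
    points = [upper * i / n for i in range(n + 1)]
    best = max(points, key=g)
    return best, g(best)

p, w = 6.0, 1.5   # hypothetical output and input price

z_star, value_p1 = grid_argmax(lambda z: profit_p1(z, p, w), upper=10.0)
q_star, value_p2 = grid_argmax(lambda q: profit_p2(q, p, w), upper=10.0)

marginal_cost = 2.0 * w * q_star   # derivative of w * q**2 at q_star
```

Both problems attain the same maximal profit (6 at these prices), the optimal output satisfies `q_star == sqrt(z_star)`, and marginal cost at `q_star` equals the output price `p`.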

Example: some calculations in a single-output economy: Consider a technology using a single input to produce a single output via the production function $f : \mathbb{R}_+ \to \mathbb{R}$ with $f(z) = \sqrt{z}$ for all $z \geq 0$. The production set is

\[ Y = \{ (-z, q) \in \mathbb{R}^2 : q \leq f(z), \; z \geq 0 \} = \{ y \in \mathbb{R}^2 : y_1 \leq 0, \; y_2 \leq \sqrt{-y_1} \}. \]

Assume that the input price is $w > 0$ and the output price is $p > 0$. The profit maximization problem (P1) becomes

\[ \max_{z \geq 0} \; p \sqrt{z} - w z. \]

At $z = 0$, the profit is zero. At an interior solution $z^* > 0$, the following first order condition must be satisfied:

\[ \frac{p}{2 \sqrt{z^*}} - w = 0, \]

so $z^* = \left( \frac{p}{2w} \right)^2$, yielding output $\sqrt{z^*} = \frac{p}{2w}$ and profit $\frac{p^2}{2w} - \frac{p^2}{4w} = \frac{p^2}{4w} > 0$. Conclude that the supply function is

\[ y(p, w) = \left( -\left( \frac{p}{2w} \right)^2, \frac{p}{2w} \right) \in Y \tag{26} \]

and the profit function $\pi(p, w) = \frac{p^2}{4w}$. The cost minimization problem for production level $q$ is

\[ \min \; w z \quad \text{s.t.} \quad z \geq 0, \; \sqrt{z} \geq q. \]

At an optimum $z^*$, it is clear that $\sqrt{z^*} = q$: no inputs are wasted. Hence the conditional factor demand is $z^* = z(w, q) = q^2$ and the cost function is $c(w, q) = w q^2$. This allows us to rewrite the profit maximization problem as in (P2):

\[ \max_{q \geq 0} \; p q - c(w, q) = \max_{q \geq 0} \; p q - w q^2. \]

Solving this optimization problem yields an optimal output quantity $q^* = \frac{p}{2w}$ as in (26).

5.7.

Efficiency

A production plan $y \in Y$ is efficient if there is no $y' \in Y$ with $y' \geq y$ and $y' \neq y$. In words, there is no different production plan producing at least as much output while using at most as much input. There is a close connection between profit maximization and efficiency:

Proposition 5.9 Consider a production set $Y \subseteq \mathbb{R}^L$.

(a) If $y \in Y$ maximizes profits at prices $p \in \mathbb{R}^L_{++}$, i.e., if $y \in y(p)$, then $y$ is efficient.

(b) If $Y$ is convex, then for every efficient $y^* \in Y$ there is a nonzero price vector $p \in \mathbb{R}^L_+$ such that $y^*$ is profit maximizing at prices $p$.

Proof. (a): Suppose $y$ is not efficient: there is a $y' \in Y$ with $y' \geq y$, $y' \neq y$. Then $p \cdot y' > p \cdot y$: the profit from $y'$ exceeds that from the profit-maximizing $y$, a contradiction.

(b): Let $Z = \{ y' \in \mathbb{R}^L : y' > y^* \}$. Since $y^*$ is efficient: $Z \cap Y = \emptyset$. By the separating hyperplane theorem, there is a vector $p \in \mathbb{R}^L$, $p \neq 0$ such that $p \cdot y' \geq p \cdot y$ for all $y' \in Z$ and $y \in Y$. Two things remain to be shown:

Firstly, that $p \in \mathbb{R}^L_+$. Suppose, to the contrary, that $p_\ell < 0$ for some coordinate $\ell$. Then $p \cdot y' < p \cdot y^*$ for some $y' \in Z$ with $y'_\ell - y^*_\ell > 0$ sufficiently large. A contradiction.

Secondly, that $y^*$ is profit maximizing at prices $p$. Let $y \in Y$. To show: $p \cdot y^* \geq p \cdot y$. For each $n \in \mathbb{N}$, define the vector $y^n = (y^*_1 + 1/n, \ldots, y^*_L + 1/n) \in Z$. Then $p \cdot y^n \geq p \cdot y$. Since $y^n \to y^*$, it follows that also in the limit $p \cdot y^* \geq p \cdot y$.

Exercise 5.3 This exercise investigates the need for the different assumptions in Proposition 5.9.

(a) Give an example of a production set $Y \subseteq \mathbb{R}^2$, a point $y \in Y$ and price vector $p \in \mathbb{R}^2_+$, $p \neq (0, 0)$, such that $y$ maximizes profits at prices $p$, but $y$ is not efficient.

(b) Give an example of a convex production set $Y \subseteq \mathbb{R}^2$ and a point $y \in Y$ which is efficient but not profit maximizing for any price vector $p \in \mathbb{R}^2_{++}$.

(c) Give an example of a production set $Y \subseteq \mathbb{R}^2$ which is not convex and a point $y \in Y$ which is efficient, but not profit maximizing for any nonzero price vector $p \in \mathbb{R}^2_+$.

6. General equilibrium

6.1. What is an equilibrium?

Earlier, we studied how consumers choose optimal consumption bundles given their preferences, wealth, and the price vector and how firms choose optimal production plans given their technology and the price vector. Are there price vectors where all these optimal choices are actually feasible? You don't, for instance, want people demanding ten apples if there only are five. Such a price vector and the corresponding demand and supply constitute a Walrasian equilibrium. Its definition follows the central idea behind any economic equilibrium concept with decent microfoundations. It is a description of:

• something feasible, where
• each involved agent, taking as given those things beyond his control, makes a choice that makes him as happy as possible.

Notice, in particular, that it involves no statements like "markets clear" or "supply equals demand". Economic agents, quite frankly, couldn't care less: they have their preferences, some constraints, and all they wish for is to choose optimally. Nevertheless, some people become very nervous when one doesn't assume that markets clear (excess demand equal to zero) in equilibrium. I want to take this concern seriously, so let me briefly explain this.

• Market clearing is an assumption about aggregate behavior that is not in line with the microeconomic idea behind equilibrium that combines feasibility with optimal behavior of individual agents; Kreps (1990, p. 6), for instance, states:

  "Generally speaking, an equilibrium is a situation in which each individual agent is doing as well as it can for itself, given the array of actions taken by others and given the institutional framework that defines the options of individuals and links their actions."

• Sometimes, it is downright silly to insist on market clearing. Suppose agents in an economy are endowed with a positive quantity of a commodity that is undesirable and of no use whatsoever as an input. Why would you insist on supply and demand for this commodity being equal? What are you going to do? Stuff the good down people's throat? Or what if agents only want to consume gloves in matching pairs? If there happen to be more left- than right-hand gloves, simply leave excess gloves to gather dust somewhere.

• Consequently, market clearing is often not a part of the definition of equilibrium. See, for instance, Arrow and Hahn (1971, p. 107), Kreps (1990, p. 190), Mas-Colell (1985, p. 169), and Varian (1992, p. 316).

• Market clearing in equilibrium, however, turns out to be a consequence of commonly imposed restrictions. You may find Exercise 6.2(c) helpful.

To illustrate the main ideas behind general equilibrium analysis, we start by studying a pure exchange economy where there is no production, but where consumers are initially endowed with certain amounts of the different goods. This entails no real loss of generality: our main tool will be to study excess demand, regardless of whether it involves producers or not. Walrasian equilibrium is defined and shown to exist in a particularly simple case. Also, we study some of its welfare properties. After introducing producers into the model, a more general existence result is provided in Section 6.4.

6.2. Pure exchange economies

A pure exchange economy is a tuple $E = (\succsim_h, \omega^h)_{h \in H}$, where:

• $H$ is a nonempty, finite set of consumers/households,

and each consumer $h \in H$ has

• a weak order $\succsim_h$ over $\mathbb{R}^L_+$, where $L \in \mathbb{N}$,
• an initial endowment $\omega^h \in \mathbb{R}^L_+$ of the $L$ commodities.

The total endowment is denoted $\omega = \sum_{h \in H} \omega^h$. An allocation $x = (x^h)_{h \in H}$ assigns to each consumer $h \in H$ a commodity bundle $x^h \in \mathbb{R}^L_+$. Allocation $x$ is:

• feasible if $\sum_{h \in H} x^h \leq \omega$,
• nonwasteful if $\sum_{h \in H} x^h = \omega$.

If the price vector is $p$, the initial endowment of consumer $h \in H$ is worth $p \cdot \omega^h$, so consumer $h$ can afford bundles $x^h \in \mathbb{R}^L_+$ with $p \cdot x^h \leq p \cdot \omega^h$, i.e., consumer $h$'s budget set is $B^h(p, p \cdot \omega^h)$. Let $x^h(\cdot)$ denote this consumer's demand correspondence.

The basic idea behind equilibria (feasibility and optimal choices) leads to the following definition. A Walrasian equilibrium of a pure exchange economy $E = (\succsim_h, \omega^h)_{h \in H}$ is a pair $(p, x)$, where:

• $p \in \mathbb{R}^L_+$, $p \neq 0$, is a price vector,
• $x = (x^h)_{h \in H}$ is a feasible allocation,
• for each consumer $h \in H$, $x^h$ is a most preferred bundle at prices $p$, i.e., $x^h \in x^h(p, p \cdot \omega^h)$.

Properties of Walrasian equilibrium are often studied using the excess demand correspondence $z$ assigning to each price vector $p$ the difference between total demand for and the total availability of the commodities:
\[
z(p) = \sum_{h \in H} \left( x^h(p, p \cdot \omega^h) - \{\omega^h\} \right) = \sum_{h \in H} x^h(p, p \cdot \omega^h) - \{\omega\}.
\]
By definition of Walrasian equilibrium, $p$ is an equilibrium price vector if and only if there is a corresponding excess demand vector $z \in z(p)$ where no commodity has positive excess demand, i.e., a $z \in z(p) \cap \mathbb{R}^L_-$.

Budget sets are homogeneous of degree zero in prices:
\[
\forall h \in H, \; \forall p \in \mathbb{R}^L_+, \; \forall \lambda > 0 : \quad B^h(p, p \cdot \omega^h) = B^h(\lambda p, (\lambda p) \cdot \omega^h).
\]
Therefore, if $p$ is an equilibrium price vector, then so is $\lambda p$ for all $\lambda > 0$. In the computation of Walrasian equilibria, this allows some simplifications, for instance by assuming that the equilibrium price of one of the goods is equal to one, or that the sum of the prices is equal to one, i.e., they lie in the unit simplex $\Delta = \{p \in \mathbb{R}^L_+ : \sum_{\ell=1}^L p_\ell = 1\}$ (also denoted $\Delta_L$ if we want to stress the dimension of the vectors).


To illustrate the idea behind existence proofs of Walrasian equilibria, the next result makes a lot of simplifications.

Proposition 6.1 Assume that excess demand $z$:

• is a well-defined function (rather than a correspondence) $z : \Delta \to \mathbb{R}^L$,
• is continuous,
• satisfies Walras' Law: $p \cdot z(p) = 0$ for all $p \in \Delta$.

Then there is a price vector $p \in \Delta$ with $z(p) \leq 0$.

Proof. The idea is to change prices by making goods in excess demand relatively more expensive and hope that demand for them goes down. If there are no more changes, there is no excess demand, and we found an equilibrium price vector. Define $f : \Delta \to \Delta$ by
\[
f(p) = \left( \frac{p_i + \max\{z_i(p), 0\}}{1 + \sum_{j=1}^L \max\{z_j(p), 0\}} \right)_{i = 1, \ldots, L}.
\]
Function $f$ increases the price of commodities for which excess demand is positive and then rescales the resulting price vector so that its coordinates add up to one. As the composition of continuous functions, $f$ is continuous. By Brouwer's fixed point theorem, there is a $p \in \Delta$ with $f(p) = p$. We show that $z(p) \leq 0$. By Walras' Law:
\[
0 = p \cdot z(p) = f(p) \cdot z(p) = \frac{1}{1 + \sum_{j=1}^L \max\{z_j(p), 0\}} \Big[ \underbrace{p \cdot z(p)}_{=0} + \sum_{i=1}^L \max\{z_i(p), 0\} \, z_i(p) \Big].
\]
Therefore,
\[
\sum_{i=1}^L \max\{z_i(p), 0\} \, z_i(p) = 0. \qquad (27)
\]
Notice:
\[
\max\{z_i(p), 0\} \, z_i(p) = \begin{cases} 0 & \text{if } z_i(p) \leq 0, \\ z_i(p)^2 > 0 & \text{if } z_i(p) > 0. \end{cases}
\]
So (27) is the sum of nonnegative terms. The only way in which it can be zero is if all its terms are zero, i.e., if $z_i(p) \leq 0$ for all $i$, as we had to show.

The price vector $p \in \Delta$ with $z(p) \leq 0$ together with the allocation $x = (x^h(p, p \cdot \omega^h))_{h \in H}$ is a Walrasian equilibrium. Using $z(p) \leq 0$ and Walras' Law ($p \cdot z(p) = 0$), it follows that excess demand is zero for commodities $i$ with $p_i > 0$: a good can be in excess supply in equilibrium, but only if its price equals zero. The desired properties of excess demand are usually derived from conditions on consumer preferences, using Proposition 4.3.
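The price adjustment map from the proof of Proposition 6.1 can be tried out numerically. The following is a minimal sketch under assumptions that are not part of the text: a hypothetical economy with two consumers and three goods, Cobb-Douglas demand (so excess demand is only defined at strictly positive prices), and illustrative arrays `alpha` and `omega`. The equilibrium price is found by a simple power iteration that exploits the Cobb-Douglas structure; this is a convenience for the example, not the argument used in the proof.

```python
import numpy as np

# Hypothetical economy (assumption for illustration only).
# alpha[h] are Cobb-Douglas expenditure shares of consumer h (each row sums to one),
# omega[h] is the initial endowment of consumer h.
alpha = np.array([[0.5, 0.3, 0.2],
                  [0.2, 0.2, 0.6]])
omega = np.array([[1.0, 2.0, 0.5],
                  [2.0, 0.5, 1.5]])


def excess_demand(p):
    """z(p): total Cobb-Douglas demand minus total endowment, for p > 0."""
    wealth = omega @ p                      # p . omega^h for each consumer h
    demand = alpha * wealth[:, None] / p    # x^h_i = alpha^h_i * wealth^h / p_i
    return demand.sum(axis=0) - omega.sum(axis=0)


def f(p):
    """Price adjustment map from the proof of Proposition 6.1."""
    gain = np.maximum(excess_demand(p), 0.0)
    return (p + gain) / (1.0 + gain.sum())


def equilibrium_price(iterations=2000):
    """Market clearing for Cobb-Douglas demand reads M p = p; iterate p <- M p."""
    total = omega.sum(axis=0)
    M = (alpha.T @ omega) / total[:, None]
    p = np.full(len(total), 1.0 / len(total))
    for _ in range(iterations):
        p = M @ p
        p = p / p.sum()                     # stay in the unit simplex
    return p


p_star = equilibrium_price()
q = np.array([0.2, 0.3, 0.5])               # an arbitrary non-equilibrium price

print("Walras' Law at q:", q @ excess_demand(q))
print("excess demand at p*:", excess_demand(p_star))
print("f(p*) - p*:", f(p_star) - p_star)
```

At the computed price vector, excess demand vanishes up to rounding and `f` leaves the price unchanged, while at the arbitrary price `q` Walras' Law still holds but `f(q)` differs from `q`.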

6.3. Welfare analysis

A feasible allocation $x$ is:

• Pareto dominated if there is another feasible allocation $\tilde{x}$ with $\tilde{x}^h \succsim_h x^h$ for all $h \in H$ and $\tilde{x}^h \succ_h x^h$ for some $h \in H$, i.e., if all consumers are at least as well off in $\tilde{x}$ as in $x$ and at least one of them is strictly better off.
• Pareto optimal if it is not Pareto dominated.

Call a nonempty collection $S \subseteq H$ of consumers a coalition. Coalition $S$ can improve upon a feasible allocation $x$ if there are commodity bundles $\tilde{x}^h$ for all $h \in S$ such that

• these bundles simply redistribute initial endowments: $\sum_{h \in S} \tilde{x}^h = \sum_{h \in S} \omega^h$,
• all members of $S$ are better off: $\tilde{x}^h \succ_h x^h$ for all $h \in S$.

The core of $E$ is the set of feasible allocations that no coalition can improve upon. The requirement that no one-agent coalition can improve upon allocation $x$ simply requires that $x^h \succsim_h \omega^h$ for all $h \in H$. This condition is often referred to as individual rationality.

Proposition 6.2 If $(p, x)$ is a Walrasian equilibrium of $E$, then $x$ lies in the core.

Proof. Suppose coalition $S \subseteq H$ can improve upon $x$ via commodity bundles $(\tilde{x}^h)_{h \in S}$. Then $\tilde{x}^h \succ_h x^h$ for each $h \in S$. By definition, $x^h$ is a most preferred bundle at prices $p$, so $\tilde{x}^h$ cannot lie in the budget set $B^h(p, p \cdot \omega^h)$, i.e., $p \cdot \tilde{x}^h > p \cdot \omega^h$. Summing over all $h \in S$ gives $p \cdot \sum_{h \in S} \tilde{x}^h > p \cdot \sum_{h \in S} \omega^h$. This contradicts that $(\tilde{x}^h)_{h \in S}$ redistributes initial endowments: $\sum_{h \in S} \tilde{x}^h = \sum_{h \in S} \omega^h$.

Under weak assumptions, Walrasian equilibrium allocations are Pareto optimal:

Proposition 6.3 First fundamental welfare theorem: If $(p, x)$ is a Walrasian equilibrium of $E$ and consumers have locally nonsatiated preferences, then $x$ is Pareto optimal.

Proof. Suppose $x$ is Pareto dominated by feasible allocation $\tilde{x}$: $\tilde{x}^h \succsim_h x^h$ for all $h \in H$ and $\tilde{x}^k \succ_k x^k$ for some $k \in H$. By local nonsatiation, $p \cdot \tilde{x}^h \geq p \cdot x^h = p \cdot \omega^h$ for all $h \in H$ and $p \cdot \tilde{x}^k > p \cdot x^k = p \cdot \omega^k$. So, $p \cdot \sum_{h \in H} \tilde{x}^h > p \cdot \sum_{h \in H} \omega^h$, contradicting feasibility of $\tilde{x}$: $\sum_{h \in H} \tilde{x}^h \leq \sum_{h \in H} \omega^h$.

As a partial converse to the previous result, some additional assumptions guarantee that anything that is Pareto optimal can be sustained as a Walrasian equilibrium allocation, at least if initial endowments can somehow be redistributed.

Proposition 6.4 Second fundamental welfare theorem: Assume:

• for each redistribution of initial endowments in the pure exchange economy $E$, a Walrasian equilibrium exists,
• consumers have strictly convex preferences.

If $x$ is a Pareto optimal allocation, redistribute initial endowments such that $\omega^h = x^h$ for all $h \in H$. Then $x$ is a Walrasian equilibrium allocation for the resulting pure exchange economy.

Proof. By assumption, the resulting pure exchange economy has a Walrasian equilibrium $(\tilde{p}, \tilde{x})$. For each $h \in H$, $\tilde{x}^h$ is optimal and $x^h$ is feasible in the budget set $B^h(\tilde{p}, \tilde{p} \cdot x^h)$, so $\tilde{x}^h \succsim_h x^h$. By Pareto optimality of $x$, none of these preferences can be strict, so $\tilde{x}^h \sim_h x^h$ for all $h \in H$. To see that $\tilde{x}^h = x^h$ for all $h \in H$, suppose there is an $h \in H$ with $\tilde{x}^h \neq x^h$. Consumer $h$ can afford $(\tilde{x}^h + x^h)/2$. By strict convexity of preferences, this bundle is strictly preferred to $\tilde{x}^h$, contradicting that $\tilde{x}^h$ is an optimal bundle for the consumer in the Walrasian equilibrium.

6.4. Private ownership economies

Let us extend the pure exchange economy by adding firms, owned by the households: each household is entitled to a share (possibly zero) of each firm's profit. Formally, a private ownership economy is a tuple
\[
E = \left( (\succsim_h, \omega^h)_{h \in H}, (Y^f)_{f \in F}, (\theta^{hf})_{h \in H, f \in F} \right),
\]
where:

• $H$ is a nonempty, finite set of consumers/households, $F$ a nonempty, finite set of firms,
• each firm $f \in F$ has a production set $Y^f \subseteq \mathbb{R}^L$, where $L \in \mathbb{N}$,

and each consumer $h \in H$ has

• a weak order $\succsim_h$ over $\mathbb{R}^L_+$,
• an initial endowment $\omega^h \in \mathbb{R}^L_+$ of the commodities,
• a claim to a share $\theta^{hf} \in [0, 1]$ of the profit of firm $f \in F$ (where $\sum_{h \in H} \theta^{hf} = 1$ for all $f \in F$).

An allocation $(x, y) = ((x^h)_{h \in H}, (y^f)_{f \in F})$ assigns to each consumer $h \in H$ a commodity bundle $x^h \in \mathbb{R}^L_+$ and to each firm $f \in F$ a production plan $y^f \in Y^f$. Allocation $(x, y)$ is feasible if
\[
\sum_{h \in H} x^h \leq \sum_{h \in H} \omega^h + \sum_{f \in F} y^f.
\]
If the price vector is $p$ and firms decide on production plans $(y^f)_{f \in F}$, consumer $h \in H$ has budget set
\[
\Big\{ x \in \mathbb{R}^L_+ : p \cdot x \leq p \cdot \omega^h + \sum_{f \in F} \theta^{hf} \, p \cdot y^f \Big\},
\]
because the initial endowment is worth $p \cdot \omega^h$ and $h$ receives share $\theta^{hf}$ of the profit $p \cdot y^f$ of firm $f \in F$.

Let $x^h(\cdot)$ denote the demand correspondence of consumer $h \in H$, $y^f(\cdot)$ the supply correspondence of firm $f \in F$, and $\pi^f(\cdot)$ its profit function. The basic idea behind equilibria (feasibility and optimal choices) leads to the following definition. A Walrasian equilibrium of a private ownership economy is a triple $(p, x, y)$, where

• $p \in \mathbb{R}^L_+$, $p \neq 0$, is a price vector,
• $(x, y) = ((x^h)_{h \in H}, (y^f)_{f \in F})$ is a feasible allocation,
• for each consumer $h \in H$, $x^h$ is a most preferred bundle at prices $p$:
\[
x^h \in x^h\Big( p, \; p \cdot \omega^h + \sum_{f \in F} \theta^{hf} \, p \cdot y^f \Big),
\]
• for each firm $f \in F$, $y^f$ maximizes profits at prices $p$: $y^f \in y^f(p)$ and $\pi^f(p) = p \cdot y^f$.

Once again, existence of Walrasian equilibrium is usually established by looking at the excess demand correspondence $z$ assigning to each price vector $p$ the difference between total demand for and total availability of the commodities:
\[
z(p) = \sum_{h \in H} x^h\Big( p, \; p \cdot \omega^h + \sum_{f \in F} \theta^{hf} \pi^f(p) \Big) - \sum_{f \in F} y^f(p) - \{\omega\},
\]
and the interest is in finding a price vector $p$ where $z(p) \cap \mathbb{R}^L_- \neq \emptyset$. The following result (Debreu, 1959, Section 5.6) establishes existence of such a price vector; as before, one may restrict attention to prices in the unit simplex $\Delta$.

Proposition 6.5 Assume that excess demand $z$:

• achieves values in some convex, compact set $Z \subseteq \mathbb{R}^L$: $z(p) \subseteq Z$ for all $p \in \Delta$,
• is nonempty-valued: $z(p) \neq \emptyset$ for all $p \in \Delta$,
• is convex-valued: $z(p)$ is a convex set for all $p \in \Delta$,
• has a closed graph: $\{(p, z) \in \Delta \times Z : z \in z(p)\}$ is a closed set,
• satisfies a weak form of Walras' Law: $p \cdot z \leq 0$ for all $p \in \Delta$ and all $z \in z(p)$.

Then there is a price vector $p \in \Delta$ with $z(p) \cap \mathbb{R}^L_- \neq \emptyset$.

Proof. Once again, the idea is to make goods with large excess demand expensive in the hope of decreasing it. This is achieved by maximizing, for a given excess demand vector $z$, the expression $p \cdot z$, which requires putting all weight of $p$ on the largest coordinate(s) of $z$. Define the correspondence $\mu$ from $Z$ to $\Delta$ by
\[
\mu(z) = \{ p \in \Delta : p \cdot z = \max_{p' \in \Delta} p' \cdot z \}.
\]
As it maximizes a continuous function $p \mapsto p \cdot z$ over a nonempty, compact set $\Delta$, $\mu$ is nonempty-valued. Let $z \in Z$ and $p' \in \mu(z)$. Then $\mu(z) = \Delta \cap \{p \in \mathbb{R}^L : p \cdot z = p' \cdot z\}$ is the intersection of convex sets, so $\mu$ is convex-valued. A standard continuity argument shows that $\mu$ has a closed graph.

The correspondence $\varphi$ from and to $\Delta \times Z$ with $\varphi(p, z) = \mu(z) \times z(p)$ is nonempty-valued, convex-valued, and has a closed graph because $\mu$ and $z$ have these properties. By Kakutani's fixed point theorem, there is a $(p, z) \in \Delta \times Z$ with $(p, z) \in \varphi(p, z) = \mu(z) \times z(p)$. As $z \in z(p)$, the weak Walras' Law implies that $p \cdot z \leq 0$. As $p \in \mu(z)$, $p \cdot z \geq p' \cdot z$ for all $p' \in \Delta$. For each $\ell \in \{1, \ldots, L\}$, taking $p' = e_\ell$ gives that $z_\ell = p' \cdot z \leq p \cdot z \leq 0$, so $z \leq 0$.
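The price-setting correspondence in the proof puts all price weight on the goods with the largest excess demand. As a small sketch (the function name `mu` and the tolerance `tol` are choices made for this example, not notation fixed by the text), its values can be described by the unit vectors of the maximal coordinates; the set of maximizing prices is the convex hull of these vertices.

```python
import numpy as np


def mu(z, tol=1e-12):
    """Return the simplex vertices e_l that maximize p . z over the unit simplex.

    Every convex combination of the returned rows also maximizes p . z.
    """
    z = np.asarray(z, dtype=float)
    best = np.flatnonzero(z >= z.max() - tol)   # indices of the largest coordinates
    return np.eye(len(z))[best]


z = np.array([1.0, 3.0, 3.0])
vertices = mu(z)
p = 0.25 * vertices[0] + 0.75 * vertices[1]     # some point in the convex hull
print(vertices)
print(p @ z)
```

Here goods 2 and 3 tie for the largest excess demand, so any price vector that splits all weight between them attains the maximal value of `p @ z`.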

The trick, of course, is to derive the desired properties of the excess demand correspondence by imposing properties on the components of the private ownership economy $E$. Given the results of Sections 4 and 5, most of them should not come as a surprise. Only the first is somewhat complicated: what allows us to restrict attention to such a convex, compact set $Z$? Convexity of $Z$ is not the issue: if you can find a compact set containing all the images $z(p)$, they also lie in a sufficiently large (convex) ball. Without going into details, compactness of $Z$ is established by realizing that the relevant production plans, by feasibility, must satisfy $\sum_{f \in F} y^f + \omega \geq 0$. Following the lines of Proposition 5.5, this set of attainable production plans can be shown to be compact.

Appropriate modifications of the fundamental welfare theorems continue to hold for private ownership economies. As this section was meant only as a short introduction to the topic, the interested reader is referred to Debreu (1959) for a more comprehensive treatment. Textbooks on general equilibrium theory include Hildenbrand and Kirman (1988) and Starr (1997).

6.5. Exercises

Exercise 6.1
(a) What is wrong with the following argument: "Proposition 6.2 implies Proposition 6.3: if $x$ lies in the core, the coalition $S = H$ of all consumers cannot improve upon it. So $x$ is Pareto optimal."
(b) Give an example of a pure exchange economy $E$ and a Walrasian equilibrium $(p, x)$ such that $x$ lies in the core, but is not Pareto optimal.

Exercise 6.2 Market clearing: Consider a (pure exchange/private ownership) economy where Walras' Law holds: $p \cdot z = 0$ for all price vectors $p$ and all $z \in z(p)$. Prove:

(a) In equilibrium, markets with a positive price clear:
\[
\forall p \in \mathbb{R}^L_+, \; \forall z \in z(p) \cap \mathbb{R}^L_-, \; \forall \ell \in \{1, \ldots, L\} : \quad \text{if } p_\ell > 0, \text{ then } z_\ell = 0.
\]

(b) If prices are positive and $L - 1$ markets clear, then so does the final one:
\[
\forall p \in \mathbb{R}^L_{++}, \; \forall z \in z(p), \; \forall \ell \in \{1, \ldots, L\} : \quad \text{if } z_k = 0 \text{ for all } k \neq \ell, \text{ then } z_\ell = 0.
\]

(c) Markets clear in most standard applications: Consider an equilibrium. Suppose (c1) or (c2) is true for at least one consumer $h \in H$:

(c1) $\succsim_h$ is strongly monotonic on $X = \mathbb{R}^L_+$.
(c2) $h$ has a positive amount of money to spend, $h$'s least preferred alternatives are on the axes:
\[
\forall x, y \in \mathbb{R}^L_+ : \quad x \notin \mathbb{R}^L_{++}, \; y \in \mathbb{R}^L_{++} \; \Rightarrow \; x \prec_h y,
\]
and $\succsim_h$ is strongly monotonic on $X = \mathbb{R}^L_{++}$.

Prove that all markets clear.

Cobb-Douglas preferences, for instance, satisfy the requirements in (c2), not those in (c1).

Exercise 6.3 Consider a private ownership economy $E$. A feasible allocation $(x, y)$ is

• Pareto dominated if there is another feasible allocation $(\tilde{x}, \tilde{y})$ with $\tilde{x}^h \succsim_h x^h$ for all $h \in H$ and $\tilde{x}^h \succ_h x^h$ for some $h \in H$.
• Pareto optimal if it is not Pareto dominated.

(a) Why do you think Pareto dominance is defined in terms of consumer preferences, ignoring those of producers?
(b) Prove the First fundamental welfare theorem: If $(p, x, y)$ is a Walrasian equilibrium of $E$ and consumers have locally nonsatiated preferences, then $(x, y)$ is Pareto optimal.

Exercise 6.4 Restricting attention to prices in the unit simplex $\Delta$ (to avoid trivialities), give an example of a pure exchange economy $E$ with two consumers, two commodities, and

(a) no Walrasian equilibrium.
(b) exactly one Walrasian equilibrium.
(c) exactly two Walrasian equilibria.
(d) infinitely many Walrasian equilibria.

Answer the same question for a private ownership economy by adding 714 producers (yes, seven hundred and fourteen... You don't seriously believe I'd ask this if the answer weren't trivial, do you?).

Exercise 6.5 King Solomon's problem: In a well-known parable, king Solomon settles a dispute between two women, each claiming that a certain baby is hers, by suggesting to cut it in two with his sword: the true mother is revealed as she is willing to give up her child to the liar, rather than have it killed. Swords make babies divisible commodities, so consider a pure exchange economy with two consumers (the two women), one commodity (the baby). Let $x \in [0, 1]$ be a share of a baby. The true mother has utility function $u_T : [0, 1] \to \mathbb{R}$ with $u_T(x) = x$ if $x \in \{0, 1\}$ and $u_T(x) = -1$ otherwise. The liar has utility function $u_L : [0, 1] \to \mathbb{R}$ with $u_L(x) = x$. Determine for each initial allocation $(\omega^T, \omega^L) \in \{z \in \mathbb{R}^2_+ : z_1 + z_2 = 1\}$ the set of feasible allocations, the set of Pareto optimal allocations, the core, and the set of Walrasian equilibria.

7. Expected utility theory

Hitherto, we assumed that decision makers act in a world of absolute certainty; typically, however,
the consequences of decisions entail some stochastic elements. This section treats the development of expected utility theory, using the axiomatic approach of von Neumann and Morgenstern.

7.1. Simple and compound gambles

We maintain the notion of preferences, but instead of assuming that a decision maker (DM) has preferences over certain outcomes, we consider preferences over lotteries or gambles, which are probability distributions over outcomes. Formally, let $A = \{a_1, \ldots, a_n\}$ be a nonempty, finite set of (deterministic) outcomes. A simple gamble assigns a probability $p_i$ to each outcome $a_i \in A$. We denote a simple gamble by
\[
g = (p_1 \circ a_1, \ldots, p_n \circ a_n).
\]
Probabilities should be nonnegative and add up to one, so the set of simple gambles is
\[
G_1 = \Big\{ (p_1 \circ a_1, \ldots, p_n \circ a_n) : p_1, \ldots, p_n \geq 0, \; \sum_{i=1}^n p_i = 1 \Big\}. \qquad (28)
\]
For instance, when tossing a coin, the outcome will be heads $H$ or tails $T$, so $A = \{H, T\}$. A fair coin corresponds with the simple gamble $(\tfrac{1}{2} \circ H, \tfrac{1}{2} \circ T)$.

Some notational conventions:

• one often omits outcomes with probability zero from the notation of a simple gamble: $(\tfrac{1}{2} \circ a_1, \tfrac{1}{2} \circ a_n)$ is an abbreviation for the simple gamble
\[
\left( \tfrac{1}{2} \circ a_1, \; 0 \circ a_2, \; \ldots, \; 0 \circ a_{n-1}, \; \tfrac{1}{2} \circ a_n \right).
\]
• one often writes $a_i$ for the simple gamble $(1 \circ a_i)$ whose outcome is $a_i$ with probability one.

Not all gambles are simple. Perhaps you decided to bet one dollar on your favorite number in a roulette game, but toss a coin to decide which of two roulette wheels you want to play in a casino: the outcome of the first gamble (the coin toss) is another gamble (the roulette game). This is an example of a compound gamble. In principle, we can have any level of compound gambles. For convenience, we will assume that a compound gamble ends in a deterministic outcome after only finitely many steps. Formally, the set of compound gambles is defined as follows. Let $G_0 = A$ and, inductively, for each $m \in \mathbb{N}$, let $G_m$ be the set of gambles whose outcomes are gambles from the lower levels $G_0, \ldots, G_{m-1}$:
\[
G_m = \Big\{ (p_1 \circ g_1, \ldots, p_k \circ g_k) : k \in \mathbb{N}, \; p_1, \ldots, p_k \geq 0, \; \sum_{i=1}^k p_i = 1, \text{ and } g_1, \ldots, g_k \in \bigcup_{\ell=0}^{m-1} G_\ell \Big\}.
\]
The set of compound gambles is $G = \bigcup_{m=0}^{\infty} G_m$.

Associated with each compound gamble is a simple one, specifying the effective probabilities with which the outcomes in $A$ occur. For instance, suppose that $A = \{a_1, a_2\}$ and consider the compound gamble $g$ yielding $a_1$ with probability $\alpha$ and a lottery ticket with probability $1 - \alpha$. The lottery ticket is a simple gamble yielding $a_1$ with probability $\beta$ and $a_2$ with probability $1 - \beta$. Eventually, this implies that $a_1$ occurs with probability $\alpha + (1 - \alpha)\beta$ and $a_2$ occurs with probability $(1 - \alpha)(1 - \beta)$. Thus, $g$ gives rise to the simple gamble
\[
\big( (\alpha + (1 - \alpha)\beta) \circ a_1, \; (1 - \alpha)(1 - \beta) \circ a_2 \big).
\]
Similarly, for every gamble $g \in G$, let $p_i$ be the effective probability assigned to $a_i \in A$ by $g$. We say that $g$ induces the simple gamble $(p_1 \circ a_1, \ldots, p_n \circ a_n) \in G_1$ or that the latter is the reduced simple gamble associated with $g$. Notice that this reduced simple gamble is unique.
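The reduction of a compound gamble to its simple gamble is a short recursion over the levels of the gamble. The sketch below uses a representation chosen only for this example: a gamble is a Python list of `(probability, gamble_or_outcome)` pairs and anything that is not a list counts as a deterministic outcome. Exact fractions avoid rounding issues.

```python
from collections import defaultdict
from fractions import Fraction


def reduce_gamble(g):
    """Return the reduced simple gamble of g as {outcome: effective probability}."""
    if not isinstance(g, list):              # a deterministic outcome in A
        return {g: Fraction(1)}
    reduced = defaultdict(Fraction)          # missing outcomes start at probability 0
    for prob, sub in g:
        # Multiply the probability of reaching `sub` with the effective
        # probabilities inside `sub`, and add up per outcome.
        for outcome, q in reduce_gamble(sub).items():
            reduced[outcome] += Fraction(prob) * q
    return dict(reduced)


# The example from the text with alpha = 1/3 and beta = 1/4 (values chosen here).
alpha, beta = Fraction(1, 3), Fraction(1, 4)
ticket = [(beta, "a1"), (1 - beta, "a2")]
g = [(alpha, "a1"), (1 - alpha, ticket)]

print(reduce_gamble(g))
```

With these values, $a_1$ gets effective probability $\alpha + (1 - \alpha)\beta = 1/2$ and $a_2$ gets $(1 - \alpha)(1 - \beta) = 1/2$.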

We say that

7.2. Preferences over gambles

Assume the DM has a preference relation $\succsim$ over the set $G$ of compound gambles. Impose the following properties:

(G1) $\succsim$ is a weak order.

Given the set of deterministic outcomes $A = \{a_1, \ldots, a_n\}$, every simple gamble $g \in G_1$ is fully described by its vector $(p_1, \ldots, p_n) \in \mathbb{R}^n$ of probabilities, i.e., we can interpret $G_1$ simply as the unit simplex $\Delta_n = \{p \in \mathbb{R}^n_+ : \sum_i p_i = 1\}$. And in $\mathbb{R}^n$, we know what continuity means, so we can state:

(G2) Continuity on $G_1$: The preference relation $\succsim$ restricted to $G_1$ is continuous.

Continuous weak orders have played an extensive role also in our earlier sections; the following properties explicitly exploit the specific structure of our gambling framework. Our next property requires that in considering a gamble, the DM cares only about the effective probabilities assigned to each outcome in $A$: it suffices to restrict attention to simple gambles:

(G3) Reduction to simple gambles: for each $g \in G$, if $(p_1 \circ a_1, \ldots, p_n \circ a_n)$ is the simple gamble induced by $g$, then $g \sim (p_1 \circ a_1, \ldots, p_n \circ a_n)$.

This is a strong assumption. It rules out, for instance, any preference relation that takes into account the complexity of compound gambles: a DM may strictly prefer the associated reduced simple gamble to some $g \in G_{2562}$, since it involves a much less intricate chain of events leading to eventual deterministic outcomes.

Our next property, independence, says that if we mix two gambles $g$ and $g'$ with a third one, $g''$, then the preference between the mixtures should be independent of the particular choice of the third gamble. It essentially requires some form of independence of irrelevant alternatives: in the two gambles
\[
(\alpha \circ g, (1 - \alpha) \circ g'') \quad \text{and} \quad (\alpha \circ g', (1 - \alpha) \circ g''),
\]
the gamble $g''$ occurs with the same probability $1 - \alpha$. According to independence, this means that the preference should depend only on the part where the two gambles are different, i.e., on gambles $g$ and $g'$.

(G4) Independence: for all $g, g', g'' \in G$ and all $\alpha \in (0, 1)$:
\[
g \succsim g' \; \Leftrightarrow \; (\alpha \circ g, (1 - \alpha) \circ g'') \succsim (\alpha \circ g', (1 - \alpha) \circ g'').
\]

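These four properties have a number of intuitive consequences:

Proposition 7.1 Assume the preference relation $\succsim$ on $G$ satisfies (G1) to (G4).

(a) There is a best element $\bar{g}$ and a worst element $\underline{g}$ in $G_1$, i.e., for all $g \in G_1$: $\bar{g} \succsim g \succsim \underline{g}$.

(b) For each $g \in G$, there is a number $\alpha_g \in [0, 1]$ such that $g \sim (\alpha_g \circ \bar{g}, (1 - \alpha_g) \circ \underline{g})$.

(c) Substitution: let $k \in \mathbb{N}$ and let $p_1, \ldots, p_k > 0$ add up to one. Let $g_1, \ldots, g_k, h_1, \ldots, h_k \in G$ be such that $g_i \sim h_i$ for all $i = 1, \ldots, k$. Then
\[
(p_1 \circ g_1, \ldots, p_k \circ g_k) \sim (p_1 \circ h_1, \ldots, p_k \circ h_k).
\]

Finally, let us assume that $\bar{g} \succ \underline{g}$, to avoid trivial cases.

(d) Monotonicity: for all $\alpha, \beta \in [0, 1]$, if $\alpha > \beta$, then
\[
(\alpha \circ \bar{g}, (1 - \alpha) \circ \underline{g}) \succ (\beta \circ \bar{g}, (1 - \beta) \circ \underline{g}).
\]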

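Proof. (a): Immediate from continuity (G2) of the weak order (G1) $\succsim$ on the compact unit simplex $\Delta_n$.

(b): Let $g \in G$ and let $g_s \in G_1$ be its reduced simple gamble. Since $g \sim g_s$ by (G3) and $\bar{g} \succsim g_s \succsim \underline{g}$, it follows from transitivity (G1) that $\bar{g} \succsim g \succsim \underline{g}$. Let $\bar{p}, \underline{p} \in \Delta_n$ be the associated probabilities of $\bar{g}$ and $\underline{g}$. By connectedness of the set of convex combinations of these best and worst gambles in the unit simplex, Proposition 2.7 implies that there is a gamble with probabilities $\alpha_g \bar{p} + (1 - \alpha_g) \underline{p}$ equivalent with $g$. By reduction to simple gambles (G3), this means
\[
g \sim (\alpha_g \circ \bar{g}, (1 - \alpha_g) \circ \underline{g}).
\]

(c): By induction on $k \in \mathbb{N}$. The claim is trivially true if $k = 1$. Let $k \in \mathbb{N}$, $k \geq 2$, and suppose the claim is true for mixtures of less than $k$ gambles. To prove the case with mixtures of $k$ gambles, notice that
\[
\begin{aligned}
(p_1 \circ g_1, \ldots, p_k \circ g_k) &\sim \Big( p_1 \circ g_1, \; (1 - p_1) \circ \big( \tfrac{p_2}{1 - p_1} \circ g_2, \ldots, \tfrac{p_k}{1 - p_1} \circ g_k \big) \Big) && \text{by (G1) and (G3)} \\
&\sim \Big( p_1 \circ h_1, \; (1 - p_1) \circ \big( \tfrac{p_2}{1 - p_1} \circ h_2, \ldots, \tfrac{p_k}{1 - p_1} \circ h_k \big) \Big) && \text{by induction} \\
&\sim (p_1 \circ h_1, \ldots, p_k \circ h_k) && \text{by (G1) and (G3)},
\end{aligned}
\]
so the claim holds by transitivity of $\sim$.

(d): Assume $\bar{g} \succ \underline{g}$ and let $\alpha, \beta \in [0, 1]$ satisfy $\alpha > \beta$. If $\alpha = 1$ or $\beta = 0$, the result follows easily from reduction (G3) and independence (G4), so assume that $1 > \alpha > \beta > 0$. Then
\[
\begin{aligned}
(\alpha \circ \bar{g}, (1 - \alpha) \circ \underline{g}) &\succ (\alpha \circ \underline{g}, (1 - \alpha) \circ \underline{g}) && \text{by (G4)} \\
&\sim \underline{g} && \text{by (G1) and (G3)}.
\end{aligned}
\]
Since $\succsim$ is a weak order (G1):
\[
(\alpha \circ \bar{g}, (1 - \alpha) \circ \underline{g}) \succ \underline{g}.
\]
Denote the left gamble by $g$ and write $\gamma = \beta / \alpha \in (0, 1)$. Then
\[
\begin{aligned}
(\alpha \circ \bar{g}, (1 - \alpha) \circ \underline{g}) = g &\sim ((1 - \gamma) \circ g, \; \gamma \circ g) && \text{by (G1) and (G3)} \\
&\succ ((1 - \gamma) \circ \underline{g}, \; \gamma \circ g) && \text{by (G4)} \\
&\sim (\beta \circ \bar{g}, (1 - \beta) \circ \underline{g}) && \text{by (G1) and (G3)}.
\end{aligned}
\]
The last step uses that the gamble $((1 - \gamma) \circ \underline{g}, \gamma \circ g)$ assigns effective weight $\gamma \alpha = \beta$ to $\bar{g}$. Since $\succsim$ is a weak order (G1):
\[
(\alpha \circ \bar{g}, (1 - \alpha) \circ \underline{g}) \succ (\beta \circ \bar{g}, (1 - \beta) \circ \underline{g}),
\]
as we had to show.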

7.3. von Neumann-Morgenstern utility functions

Equipped with these results, one can show that properties (G1) to (G4) imply the existence of a utility function $u : G \to \mathbb{R}$ that is linear in the effective probabilities over the outcomes. Formally, a von Neumann-Morgenstern (vNM) utility function is a function $u : G \to \mathbb{R}$ that

• represents the preference relation $\succsim$ on $G$:
\[
\forall g, h \in G : \quad g \succsim h \; \Leftrightarrow \; u(g) \geq u(h),
\]
• and does so in a way that for every gamble $g \in G$:
\[
u(g) = \sum_{i=1}^n p_i u(a_i),
\]
where $(p_1 \circ a_1, \ldots, p_n \circ a_n)$ is the simple gamble induced by $g$.

In words: a vNM utility function represents the preferences of the DM and the utility assigned to a gamble equals the expected utility of the induced simple gamble.

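Proposition 7.2 If $\succsim$ is a preference relation over $G$ satisfying (G1) to (G4), there exists a vNM utility function representing $\succsim$.

Proof. By Proposition 7.1(a), there exists a best gamble $\bar{g}$ and a worst gamble $\underline{g}$ in $G_1$. In the trivial case where $\bar{g} \sim \underline{g}$, any constant function is a vNM utility function. So assume, w.l.o.g., that $\bar{g} \succ \underline{g}$. For each $g \in G$, Proposition 7.1 implies the existence of a unique number $\alpha_g \in [0, 1]$ such that $g \sim (\alpha_g \circ \bar{g}, (1 - \alpha_g) \circ \underline{g})$. Define
\[
u(g) = \alpha_g. \qquad (29)
\]
This utility function represents $\succsim$: let $g, h \in G$. Then
\[
\begin{aligned}
g \succsim h \; &\Leftrightarrow \; (\alpha_g \circ \bar{g}, (1 - \alpha_g) \circ \underline{g}) \succsim (\alpha_h \circ \bar{g}, (1 - \alpha_h) \circ \underline{g}) \\
&\Leftrightarrow \; u(g) = \alpha_g \geq \alpha_h = u(h),
\end{aligned}
\]
where the first equivalence follows from transitivity (G1) of $\succsim$ and the second equivalence from monotonicity and the definition of $u$.

To obtain the expected utility expression, let $g \in G$ and let $g_s = (p_1 \circ a_1, \ldots, p_n \circ a_n)$ be the simple gamble induced by $g$. By (G3), $g \sim g_s$, so $u(g) = u(g_s)$. For each $a_i \in A$, we know from Proposition 7.1 and the definition of $u(a_i)$ that
\[
a_i \sim (u(a_i) \circ \bar{g}, (1 - u(a_i)) \circ \underline{g}).
\]
For each $i = 1, \ldots, n$, define $h_i = (u(a_i) \circ \bar{g}, (1 - u(a_i)) \circ \underline{g})$. By substitution:
\[
g_s = (p_1 \circ a_1, \ldots, p_n \circ a_n) \sim (p_1 \circ h_1, \ldots, p_n \circ h_n).
\]
Notice that $h_1, \ldots, h_n$ are gambles over the best and worst gambles only. By computing the probability for the best gamble and using reduction to simple gambles (G3), one finds that $(p_1 \circ h_1, \ldots, p_n \circ h_n)$ is equivalent with
\[
\Big( \Big( \sum_{i=1}^n p_i u(a_i) \Big) \circ \bar{g}, \; \Big( 1 - \sum_{i=1}^n p_i u(a_i) \Big) \circ \underline{g} \Big).
\]
Combining the above with transitivity of $\sim$ we find:
\[
g \sim g_s \sim (p_1 \circ h_1, \ldots, p_n \circ h_n) \sim \Big( \Big( \sum_{i=1}^n p_i u(a_i) \Big) \circ \bar{g}, \; \Big( 1 - \sum_{i=1}^n p_i u(a_i) \Big) \circ \underline{g} \Big). \qquad (30)
\]
By definition, $u(g)$ is the unique number in $[0, 1]$ satisfying $g \sim (u(g) \circ \bar{g}, (1 - u(g)) \circ \underline{g})$. Combining this with (30) yields $u(g) = \sum_{i=1}^n p_i u(a_i)$.

Remark 7.3 Conversely, it is straightforward to verify that if a preference relation $\succsim$ on $G$ can be represented by a vNM utility function, it must satisfy properties (G1) to (G4).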

The linearity requirement on vNM utility implies that the earlier result from utility theory (any strictly increasing transformation of the utility function of the consumer still represents the same preferences) no longer holds. Indeed, the only transformations of a vNM utility function that remain vNM utility functions are positive affine transformations:

Proposition 7.4 Consider the vNM utility function $u : G \to \mathbb{R}$ defined in (29). For all $a, b \in \mathbb{R}$ with $a > 0$, also $au + b$ is a vNM utility function representing $\succsim$. Conversely, if $v : G \to \mathbb{R}$ is a vNM utility function representing $\succsim$ on $G$, there exist $a, b \in \mathbb{R}$ with $a > 0$ such that $v = au + b$.

Proof. To avoid trivialities, assume that $\bar{g} \succ \underline{g}$. The first claim is simple. To establish the second claim, let $a > 0$ and $b$ be the unique solution (do you understand why a solution exists and why it is unique?) to
\[
\begin{aligned}
v(\bar{g}) &= a u(\bar{g}) + b, \\
v(\underline{g}) &= a u(\underline{g}) + b.
\end{aligned}
\]
Let $g \in G$. By construction, see (29), $g \sim (u(g) \circ \bar{g}, (1 - u(g)) \circ \underline{g})$, so
\[
u(g) = u(g) u(\bar{g}) + (1 - u(g)) u(\underline{g}), \qquad (31)
\]
and, similarly,
\[
\begin{aligned}
v(g) &= u(g) v(\bar{g}) + (1 - u(g)) v(\underline{g}) \\
&= u(g) [a u(\bar{g}) + b] + (1 - u(g)) [a u(\underline{g}) + b] \\
&= a [u(g) u(\bar{g}) + (1 - u(g)) u(\underline{g})] + b \\
&= a u(g) + b,
\end{aligned}
\]
where the last equation follows from (31).
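A small numerical illustration of why only positive affine transformations preserve the expected utility form. The outcome utilities and the two gambles below are made up for this example; gambles are given directly as reduced simple gambles `{outcome: probability}`.

```python
import math


def expected_utility(gamble, u):
    """Expected utility of a simple gamble {outcome: probability} under index u."""
    return sum(p * u[a] for a, p in gamble.items())


# Hypothetical utilities over three outcomes (worst = 0, best = 1).
u = {"a1": 0.0, "a2": 0.6, "a3": 1.0}

g = {"a2": 1.0}                  # a2 for sure
h = {"a1": 0.5, "a3": 0.5}       # fair coin between worst and best outcome

v = {a: 2.0 * x + 3.0 for a, x in u.items()}   # positive affine transformation
w = {a: x ** 3 for a, x in u.items()}          # strictly increasing, not affine

# Under u: g has expected utility 0.6, h has 0.5, so g is preferred.
print(expected_utility(g, u), expected_utility(h, u))
# Under v the ranking is unchanged and E_v = 2 * E_u + 3.
print(expected_utility(g, v), expected_utility(h, v))
# Using w as a vNM index reverses the ranking of g and h.
print(expected_utility(g, w), expected_utility(h, w))
```

The cube of `u` is still an ordinal representation on outcomes, but taking expectations of it ranks `h` above `g`, so it describes different preferences over gambles. This matches the proposition: among transformations of a vNM utility function, only the positive affine ones remain vNM utility functions for the same preference relation.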


Our development of vNM utilities involved a finite set $A$ of deterministic outcomes and compound gambles of finite length. These assumptions can be relaxed, but at the cost of increased topological and measure-theoretic complexity.

7.4. Exercises

Exercise 7.1 Throughout this exercise, let $G = \bigcup_{n=0}^{\infty} G_n$ be the set of compound gambles over a finite set $\{a_1, \ldots, a_k\} \subset \mathbb{R}$ of $k \geq 2$ different deterministic outcomes. Recall: $G_n$ is the set of $n$-th level gambles. For each of the preference relations $\succsim$ over $G$ defined below, answer the following questions:

• If possible, find the best and the worst elements of $G$.
• For each of the four properties (G1) to (G4) guaranteeing the existence of a vNM utility function, check whether $\succsim$ satisfies it.
• If (G1) to (G4) are satisfied, find a vNM utility function representing $\succsim$.

(a) Most likely outcomes: A decision maker bases preferences on the average of the deterministic outcomes that are most likely to occur. Let $g \in G$ and let $(p_1 \circ a_1, \ldots, p_k \circ a_k)$ be its induced simple gamble. Let
\[
L(g) = \{a_i : p_i \geq p_j \text{ for all } j = 1, \ldots, k\}
\]
be the set of most likely deterministic outcomes and $|L(g)|$ its number of elements. The preference relation $\succsim$ on $G$ is defined as follows: for all $g, h \in G$:
\[
g \succsim h \; \Leftrightarrow \; \frac{1}{|L(g)|} \sum_{a_i \in L(g)} a_i \geq \frac{1}{|L(h)|} \sum_{a_i \in L(h)} a_i.
\]

(b) Keeping it simple: A decision maker dislikes complex alternatives and has preferences $\succsim$ over $G$ represented by the following utility function: for each $g \in G$, there is a unique $n$ with $g \in G_n$. Let $(p_1 \circ a_1, \ldots, p_k \circ a_k)$ be its induced simple gamble. Then $u(g) = \sum_{m=1}^k p_m a_m - n$.

(c) Satisficing: A decision maker is content with all deterministic outcomes larger than 5. The preference relation $\succsim$ on $G$ is represented by the following utility function: for each $g \in G$, let $(p_1 \circ a_1, \ldots, p_k \circ a_k)$ be its induced simple gamble. Then $u(g) = \sum_{i : a_i > 5} p_i$.

8. Risk attitudes

8.1. In for a gamble?

Let us confine attention to cases where the outcomes of the gambles are amounts of money: $A$ is a convex set in $\mathbb{R}$. Despite the fact that we now allow an infinite set of outcomes, we will assume that every gamble assigns positive probability to only finitely many outcomes. The existence theorem of vNM utility functions can be adjusted to this case by modifying the properties (G1) to (G4) to infinite sets. We assume that the vNM utility function $u$ is increasing in money and investigate the relation between this function and the DM's attitude towards risk.

Consider a nontrivial (i.e., at least two different deterministic outcomes have positive probability) simple gamble $g = (p_1 \circ w_1, \ldots, p_n \circ w_n)$ and suppose the DM is offered two scenarios:

1. Accept the gamble; this yields utility $u(g) = \sum_{i=1}^n p_i u(w_i)$.

2. Accept the outcome that gives the expected value of the gamble with certainty (this is where we need convexity of $A$!). The expected value of the gamble is equal to $E(g) = \sum_{i=1}^n p_i w_i$. This alternative has utility $u(E(g)) = u(\sum_{i=1}^n p_i w_i)$.

The DM is said to be:

• risk averse at $g$ if $u(g) < u(E(g))$,
• risk neutral at $g$ if $u(g) = u(E(g))$,
• risk loving at $g$ if $u(g) > u(E(g))$.

The DM is said to be risk averse (on $G$) if he is risk averse at every nontrivial simple gamble $g$ over outcomes in $A$. Risk neutral and risk loving behavior are defined analogously. These risk attitudes directly translate to properties of the associated vNM utility function over money:

Proposition 8.1

Let $A \subseteq \mathbb{R}$ be nonempty and convex. Assume the DM has a vNM utility function $u$. Then the DM is:

(a) risk averse if and only if $u$ is strictly concave on $A$,
(b) risk neutral if and only if $u$ is linear on $A$,
(c) risk loving if and only if $u$ is strictly convex on $A$.

Proof. We only prove the first claim; the others are similar. Risk aversion means that for every nontrivial gamble $g = (p_1 \circ w_1, \ldots, p_n \circ w_n)$,

\[ u(p_1 \circ w_1, \ldots, p_n \circ w_n) = \sum_{i=1}^{n} p_i u(w_i) < u(E(g)) = u\left(\sum_{i=1}^{n} p_i w_i\right). \]

But this is equivalent with strict concavity: by induction it follows that the function $u$ is strictly concave on $A$ if and only if for all different $w_1, \ldots, w_n \in A$ and all $p_1, \ldots, p_n > 0$ with $\sum_{i=1}^{n} p_i = 1$:

\[ \sum_{i=1}^{n} p_i u(w_i) < u\left(\sum_{i=1}^{n} p_i w_i\right). \]

Although we can always check whether a DM is risk averse/neutral/loving at a specific gamble $g$, he does not have to be risk averse/neutral/loving over the entire collection of lotteries. It may well be, for instance, that he is risk averse at high-stake lotteries and risk loving at low-stake lotteries.
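The definitions above can be checked numerically at a specific gamble. The following is a minimal sketch, not part of the original notes: the function names, the tolerance `tol`, and the three sample utility functions are illustrative assumptions. It compares $u(g)$ with $u(E(g))$ for one nontrivial gamble.

```python
import math

def expected_utility(gamble, u):
    """Expected utility of a simple gamble given as (probability, outcome) pairs."""
    return sum(p * u(w) for p, w in gamble)

def expected_value(gamble):
    """Expected monetary value E(g) of the gamble."""
    return sum(p * w for p, w in gamble)

def risk_attitude(gamble, u, tol=1e-12):
    """Classify the attitude at this gamble by comparing u(g) with u(E(g))."""
    ug = expected_utility(gamble, u)
    ue = u(expected_value(gamble))
    if ug < ue - tol:
        return "risk averse"
    if ug > ue + tol:
        return "risk loving"
    return "risk neutral"

# Hypothetical gamble: 50-50 odds of ending up with 50 or 150.
g = [(0.5, 50.0), (0.5, 150.0)]
print(risk_attitude(g, math.sqrt))          # strictly concave utility
print(risk_attitude(g, lambda w: 2 * w))    # linear utility
print(risk_attitude(g, lambda w: w ** 2))   # strictly convex utility
```

The check is local: it says something about one gamble only, in line with the remark that a DM need not have the same attitude at every gamble.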

8.2. Certainty equivalent and risk premium

The certainty equivalent of a simple gamble $g$ is an amount of money $CE(g)$ offered with certainty such that the DM is indifferent between the gamble $g$ and accepting $CE(g)$:

\[ u(g) = u(CE(g)). \]

Remark 8.2

For topologists (can be omitted): generalizing the continuity requirement G2 to the case of an infinite set $A \subseteq \mathbb{R}$ of deterministic outcomes entails in particular that preferences $\succsim$ on $G$ are continuous. So for each simple gamble $g$, there is a $\overline{w} \in A$ (say, weight one on the best deterministic outcome in $g$) with $\overline{w} \succsim g$ and a $\underline{w} \in A$ with $g \succsim \underline{w}$. By the Intermediate value theorem for preferences, Proposition 2.7, there is a $CE(g) \in A$ with $g \sim CE(g)$. By monotonicity of preferences in money, $CE(g)$ is unique: the certainty equivalent is a well-defined notion.
The risk premium of a simple gamble $g$ is an amount of money $P(g)$ such that $u(g) = u(E(g) - P(g))$. Clearly, $P(g) = E(g) - CE(g)$.

Intuitively, a risk averse DM prefers $E(g)$ with certainty over the gamble $g$. But there will be some amount that makes him indifferent between accepting that amount with certainty and accepting the gamble $g$. This amount is called the certainty equivalent. It is easy to show (see below) that for a risk averse DM who strictly prefers more money to less, the certainty equivalent is less than the expected value $E(g)$ of the gamble: a risk averse person is willing to pay a positive amount of money to avoid the gamble's inherent risk. This willingness to pay is the risk premium.

Proposition 8.3

Consider a DM with vNM utility function $u$ which is increasing in wealth. The following three statements are equivalent:

1. DM is risk averse,
2. $CE(g) < E(g)$ for all nontrivial gambles $g \in S$,
3. $P(g) > 0$ for all nontrivial gambles $g \in S$.

Proof. Since $P(g) = E(g) - CE(g)$, statements 2 and 3 are equivalent, so it suffices to show that statements 1 and 2 are equivalent. The DM is risk averse if and only if for every nontrivial $g \in S$, $u(g) < u(E(g))$. By definition of $CE(g)$, this is equivalent with $u(CE(g)) < u(E(g))$, which is equivalent with $CE(g) < E(g)$, since $u$ is increasing.

As a simple exercise, try to formulate similar characterizations of risk neutral and risk loving behavior.
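When no closed form is available, the certainty equivalent can be approximated numerically. The sketch below is an illustration under stated assumptions rather than a method from the notes: it assumes $u$ is continuous and strictly increasing, uses plain bisection between the smallest and largest outcome of the gamble, and the names `certainty_equivalent` and `risk_premium` as well as the sample gamble are hypothetical.

```python
import math

def certainty_equivalent(gamble, u, iters=200):
    """Approximate CE(g), the solution of u(CE) = sum_i p_i u(w_i), by bisection.

    Assumes u is continuous and strictly increasing, so the solution lies
    between the smallest and the largest outcome of the gamble.
    """
    target = sum(p * u(w) for p, w in gamble)
    lo = min(w for _, w in gamble)
    hi = max(w for _, w in gamble)
    for _ in range(iters):
        mid = (lo + hi) / 2
        if u(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def risk_premium(gamble, u):
    """P(g) = E(g) - CE(g)."""
    expected = sum(p * w for p, w in gamble)
    return expected - certainty_equivalent(gamble, u)

# Hypothetical gamble and utility: 50-50 odds of 64 or 144, u(w) = sqrt(w).
g = [(0.5, 64.0), (0.5, 144.0)]
print(certainty_equivalent(g, math.sqrt))  # close to 100
print(risk_premium(g, math.sqrt))          # close to 4
```

Here $u(g) = \frac{1}{2} \cdot 8 + \frac{1}{2} \cdot 12 = 10$, so $CE(g) = 100 < 104 = E(g)$ and the premium is positive, as Proposition 8.3 predicts for a strictly concave utility function.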

Example. Take $A = \mathbb{R}_{++}$ and assume that $u(w) = \ln(w)$ for all $w \in A$. This DM is risk averse, since $u$ is strictly concave. Assume DM's initial wealth is $w_0$ and DM faces a gamble $g$ offering 50-50 odds of winning or losing an amount $h \in (0, w_0)$:

\[ g = \left(\tfrac{1}{2} \circ (w_0 - h), \tfrac{1}{2} \circ (w_0 + h)\right). \]

Hence $E(g) = \tfrac{1}{2}(w_0 - h) + \tfrac{1}{2}(w_0 + h) = w_0$. The certainty equivalent $CE(g)$ must satisfy

\[ u(CE(g)) = u(g) = \tfrac{1}{2}\ln(w_0 - h) + \tfrac{1}{2}\ln(w_0 + h) = \ln\sqrt{w_0^2 - h^2}, \]

where the final equation follows from the properties of the natural logarithm. Hence

\[ CE(g) = \sqrt{w_0^2 - h^2} < w_0 = E(g) \quad \text{and} \quad P(g) = w_0 - \sqrt{w_0^2 - h^2} > 0. \]

8.3. Arrow-Pratt measure of absolute risk aversion

Arrow and Pratt considered the problem of measuring the extent of risk aversion. They assumed that the vNM utility function $u$ is an increasing, strictly concave function of wealth levels that is twice differentiable. In particular, they assume:

\[ \forall w: \quad u'(w) > 0 \quad \text{and} \quad u''(w) < 0. \tag{32} \]

Using this, the Arrow-Pratt measure of absolute risk aversion at wealth $w$ is defined as

\[ R_a(w) = -\frac{u''(w)}{u'(w)}. \]

Why is this a sensible measure of risk aversion? A heuristic derivation is provided in the next
subsection. The intuition is as follows: the more risk averse a DM is, the more he is willing to pay
to avoid certain gambles. Thus, the size of the risk premium in some way measures risk aversion.
It turns out that the Arrow-Pratt measure of absolute risk aversion is roughly proportional to
the risk premium the DM is willing to pay to avoid actuarially fair bets (a bet is actuarially
fair if its expected value equals initial wealth: the expected loss/gain is zero). Thus, if DM 1 is
more risk averse than DM 2, his risk premium for every nontrivial gamble exceeds that of DM 2,
so the same should hold (due to proportionality) for the Arrow-Pratt measures of absolute risk
aversion. The actual proof is somewhat more complicated; we omit it.

Proposition 8.4

Consider two DMs with vNM utility functions $u$ and $v$ respectively, both satisfying (32). The following two claims are equivalent:

1. $R_a^1(w) = -\frac{u''(w)}{u'(w)} > -\frac{v''(w)}{v'(w)} = R_a^2(w)$ for all wealth levels $w$,
2. The risk premium $P^1(g)$ of the DM with utility function $u$ is strictly larger than the risk premium $P^2(g)$ of the DM with utility function $v$ for every nontrivial gamble $g \in S$.

Notice that positive affine transformations of the utility functions do not affect $R_a(w)$: it does not depend on the choice of vNM utility function.

It is common in the literature on, for instance, portfolio choice to assume that risk aversion decreases with wealth. This is the DARA assumption (Decreasing Absolute Risk Aversion): $R_a(\cdot)$ is a decreasing function of $w$.

8.4. A derivation of the Arrow-Pratt measure

The argument in this section is due to Pratt (1964). Assume (32) and let the DM's initial wealth be $w_0$. Consider the gamble with 50-50 odds of winning or losing an amount $h$:

\[ g = \left(\tfrac{1}{2} \circ (w_0 - h), \tfrac{1}{2} \circ (w_0 + h)\right). \]

The gamble is fair: $E(g) = w_0$. Let $P = P(g) > 0$ be the risk premium of $g$:

\[ u(g) = \tfrac{1}{2}u(w_0 - h) + \tfrac{1}{2}u(w_0 + h) = u(E(g) - P) = u(w_0 - P). \tag{33} \]

Take a first order Taylor approximation of $u(w_0 - P)$ around $w_0$:

\[ u(w_0 - P) \approx u(w_0) - u'(w_0)P. \tag{34} \]

Take a second order Taylor approximation of $u(w_0 - h)$ and $u(w_0 + h)$ around $w_0$:

\[ u(w_0 - h) \approx u(w_0) - u'(w_0)h + \tfrac{1}{2}u''(w_0)h^2, \]
\[ u(w_0 + h) \approx u(w_0) + u'(w_0)h + \tfrac{1}{2}u''(w_0)h^2. \]

Consequently,

\[ \tfrac{1}{2}u(w_0 - h) + \tfrac{1}{2}u(w_0 + h) \approx u(w_0) + \tfrac{1}{2}u''(w_0)h^2. \tag{35} \]

Using (33), (34), and (35), it follows that

\[ u(w_0) + \tfrac{1}{2}u''(w_0)h^2 \approx u(w_0) - u'(w_0)P. \]

Rearranging terms, one finds

\[ P \approx -\tfrac{1}{2}h^2 \frac{u''(w_0)}{u'(w_0)}. \]

Conclude that the Arrow-Pratt measure of absolute risk aversion is approximately proportional to the risk premium $P$, the willingness to pay in order to avoid the 50-50 odds of winning or losing an amount $h$.
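The quality of this approximation can be inspected for the logarithmic utility function of the example in Section 8.2, where the risk premium is known in closed form. The following sketch is only an illustration: the function names and the chosen values of $w_0$ and $h$ are assumptions. It compares the exact premium $w_0 - \sqrt{w_0^2 - h^2}$ with the approximation $\frac{1}{2}h^2 R_a(w_0)$, using $R_a(w) = 1/w$ for $u = \ln$.

```python
import math

def exact_premium_log(w0, h):
    """Exact risk premium for u = ln: P = w0 - sqrt(w0^2 - h^2)."""
    return w0 - math.sqrt(w0 ** 2 - h ** 2)

def approx_premium_log(w0, h):
    """Approximation P ~ (1/2) h^2 R_a(w0), with R_a(w) = 1/w for u = ln."""
    return 0.5 * h ** 2 / w0

# Hypothetical initial wealth of 100 and stakes of increasing size.
for h in (1.0, 10.0, 50.0):
    print(h, exact_premium_log(100.0, h), approx_premium_log(100.0, h))
```

For small stakes the two numbers nearly coincide; for large stakes the Taylor approximation becomes visibly less accurate, which is why the derivation is only heuristic.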

9. Some critique on expected utility theory

Expected utility theory is the main tool in economic models involving uncertainty. Nevertheless, expected utility theory has been under constant attack from behavioral economists and psychologists who show that subjects in experiments or real-life situations systematically violate the properties (G1) to (G4) or that mindless application of expected utility theory leads to counterintuitive conclusions. For this reason, many alternative models for decision making under risk and uncertainty have been developed. Perhaps the most well-known (especially since Daniel Kahneman was awarded the 2002 Nobel Prize in economics) is Kahneman and Tversky's prospect theory (Kahneman and Tversky, 1964). Although we lack time to go into such alternative models, we pause for a while and consider a number of blows to the expected utility model.

9.1. Problems with unbounded utility: a variant of the St. Petersburg paradox

Nothing in the development of our expected utility model required the utility function to be bounded. Unbounded utility functions, however, make decision-makers susceptible to cunning exploitation. Suppose a DM with initial wealth $w_0 > 0$ has a vNM utility function $u$ over money which is not bounded from above. By assumption, there is some wealth $w_1$ with $u(w_0) < \tfrac{1}{2}(u(0) + u(w_1))$. Smile and offer your victim the gamble $(\tfrac{1}{2} \circ 0, \tfrac{1}{2} \circ w_1)$, which he will accept by construction.

If he loses, he ends up with wealth zero. If he wins, hand him $w_1$ and just before he takes the money from your hand, retract it, turn your smile back on, and offer him a gamble $(\tfrac{1}{2} \circ 0, \tfrac{1}{2} \circ w_2)$, where $w_2$ is chosen such that $u(w_1) < \tfrac{1}{2}(u(0) + u(w_2))$. Again, by construction, the DM will accept.

As long as the DM goes on winning, keep offering such 50-50 odds gambles... The DM will end up with wealth zero with probability one!
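The construction can be traced in a few lines of code. The sketch below is illustrative only: it assumes the particular unbounded utility function $u(w) = \sqrt{w}$, finds each next prize by repeated doubling (any sufficiently large prize would do), and the names `next_prize` and `exploitation_path` are hypothetical.

```python
import math

def next_prize(u, w):
    """Return a prize w' with u(w) < (u(0) + u(w')) / 2, found by doubling.

    The loop terminates whenever u is not bounded from above.
    """
    prize = max(2.0 * w, 1.0)
    while u(w) >= 0.5 * (u(0.0) + u(prize)):
        prize *= 2.0
    return prize

def exploitation_path(u, w0, rounds):
    """Prizes w1, w2, ... of the successive 50-50 gambles the DM accepts."""
    prizes, w = [], w0
    for _ in range(rounds):
        w = next_prize(u, w)
        prizes.append(w)
    return prizes

# Assumed utility u(w) = sqrt(w) and initial wealth 1.
prizes = exploitation_path(math.sqrt, 1.0, 10)
survival = 0.5 ** len(prizes)  # probability of winning all ten gambles
print(prizes)
print(survival)
```

The DM still holds positive wealth after $n$ rounds only if he wins every gamble, which happens with probability $2^{-n}$; this probability tends to zero, so he ends up with wealth zero with probability one.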

9.2. Allais' paradox

Consider the following four simple gambles:

\[ g_1 = (1 \circ \$1{,}000{,}000), \]
\[ g_2 = (0.10 \circ \$5{,}000{,}000, \; 0.89 \circ \$1{,}000{,}000, \; 0.01 \circ \$0), \]
\[ g_3 = (0.11 \circ \$1{,}000{,}000, \; 0.89 \circ \$0), \]
\[ g_4 = (0.10 \circ \$5{,}000{,}000, \; 0.90 \circ \$0). \]

It turns out that in different experiments, most people prefer $g_1$ to $g_2$, but $g_4$ to $g_3$. This violates expected utility theory. Suppose a DM has vNM utility function $u$. Then

\[ g_1 \succ g_2 \iff u(\$1{,}000{,}000) > 0.10\,u(\$5{,}000{,}000) + 0.89\,u(\$1{,}000{,}000) + 0.01\,u(\$0). \]

Rearranging terms, we find

\[ g_1 \succ g_2 \iff 0.11\,u(\$1{,}000{,}000) > 0.10\,u(\$5{,}000{,}000) + 0.01\,u(\$0) \]
\[ \iff 0.11\,u(\$1{,}000{,}000) + 0.89\,u(\$0) > 0.10\,u(\$5{,}000{,}000) + 0.90\,u(\$0) \]
\[ \iff g_3 \succ g_4, \]

where the last equivalence follows from computing the expected utility of $g_3$ and $g_4$.
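The equivalence can be verified numerically: for any utility function, the expected utility difference between $g_1$ and $g_2$ equals the difference between $g_3$ and $g_4$, so an expected utility maximizer cannot rank $g_1$ above $g_2$ and $g_4$ above $g_3$. The sketch below is illustrative; the three utility functions are arbitrary assumptions chosen only to show both possible rankings.

```python
import math

def expected_utility(gamble, u):
    """Expected utility of a simple gamble given as (probability, outcome) pairs."""
    return sum(p * u(w) for p, w in gamble)

g1 = [(1.00, 1_000_000)]
g2 = [(0.10, 5_000_000), (0.89, 1_000_000), (0.01, 0)]
g3 = [(0.11, 1_000_000), (0.89, 0)]
g4 = [(0.10, 5_000_000), (0.90, 0)]

# Assumed sample utility functions (not from the notes).
utilities = {
    "square root": math.sqrt,
    "exponential": lambda w: -math.exp(-w / 100_000),
    "linear": lambda w: float(w),
}

def gaps(u):
    """Return (EU(g1) - EU(g2), EU(g3) - EU(g4)); the two coincide up to rounding."""
    return (expected_utility(g1, u) - expected_utility(g2, u),
            expected_utility(g3, u) - expected_utility(g4, u))

for name, u in utilities.items():
    d12, d34 = gaps(u)
    print(name, d12, d34)
```

With the square root and linear utilities both gaps are negative ($g_2$ and $g_4$ are chosen); with the strongly risk averse exponential utility both are positive ($g_1$ and $g_3$ are chosen). The mixed pattern observed in experiments never occurs.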

9.3. Probability matching

You are paid \$1 each time you guess correctly whether a red or a green light will flash. The lights flash randomly, but the red is set to turn on three times as often as the green. It has been found that many subjects in experiments of this type try to imitate the chance mechanism: they choose red about three quarters of the time and green one quarter. Obviously it would be more profitable to always choose red. Formally, the compound lottery of choosing red with probability $3/4$ gives you a one dollar payoff with probability

\[ (3/4)^2 + (1/4)^2 = 10/16, \]

corresponding with the simple gamble

\[ ((10/16) \circ \$1, \; (6/16) \circ \$0), \]

while choosing red with probability one corresponds with the simple gamble

\[ ((3/4) \circ \$1, \; (1/4) \circ \$0). \]

Since $3/4 > 10/16$, the second gamble should be strictly preferred over the first.
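The comparison generalizes to any guessing frequency. The short sketch below is an illustration; it assumes that guesses are made independently of the lights, and the function name `success_probability` is hypothetical. It uses exact fractions to compute the chance of a correct guess when red is guessed with probability $q$.

```python
from fractions import Fraction

def success_probability(q, p_red=Fraction(3, 4)):
    """Chance of a correct guess when 'red' is guessed with probability q.

    Assumes the guess is independent of the light, which is red with
    probability p_red.
    """
    return q * p_red + (1 - q) * (1 - p_red)

matching = success_probability(Fraction(3, 4))   # imitate the chance mechanism
always_red = success_probability(Fraction(1))    # always guess red
print(matching, always_red)
```

Since the success probability $\frac{1}{4} + \frac{1}{2}q$ is increasing in $q$, always guessing red ($q = 1$) is the best of all guessing frequencies.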

This type of matching behavior has been frequently observed in real life, as well as laboratory experiments, using both humans and animals as subjects. In an experiment with animals, for instance, foraging behavior of pigeons was studied, using two food patches (call them red and green, as above) with food being dispatched at the red location three quarters of the time and at the green location one quarter of the time. The pigeons tried to match this probability distribution.

A small personal anecdote: jointly with two colleagues, I published two papers on a game theoretic model of bounded rationality in which players are assumed to display matching behavior. To explain the type of behavior to laymen and motivate that it is observed in real life, we used different examples, among them the pigeon example mentioned above. This led the Dutch Foundation for Mathematical Research, which at that time was financing my work, to publish a press statement proudly proclaiming: "People behave like pigeons when dealing with probability", a press statement that gave us extensive media coverage but in which we desperately tried to qualify our employer's overzealous interpretation. So in case you sometimes wonder what you are doing... you may just be behaving like a pigeon!

9.4. Rabin's calibration theorem

Matthew Rabin, one of the world's leading behavioral economists, published a remarkable article (Rabin, 2000) on the consequences of risk aversion with respect to small-stake gambles. Let us start with an example to illustrate the result. Consider a risk averse DM who for each initial wealth level rejects a 50-50 odds gamble of winning 11 dollars or losing 10 dollars: certainly a rather unremarkable level of risk aversion. What does this imply about his preferences for other gambles? Consider, for instance, the following statements:

1. For each level of wealth, the DM will reject the lottery with a 50 percent chance of losing 100 dollars and a 50 percent chance of gaining 150 dollars.

2. For each level of wealth, the DM will reject the lottery with a 50 percent chance of losing 100 dollars and a 50 percent chance of gaining 1,500 dollars.

3. For each level of wealth, the DM will reject the lottery with a 50 percent chance of losing 100 dollars and a 50 percent chance of gaining 1,000,000 dollars.

4. For each level of wealth, the DM will reject the lottery with a 50 percent chance of losing 100 dollars and a 50 percent chance of gaining 1,000,000,000,000,000,000 dollars.

5. You can proceed with gains $G$ as high as you want, but the DM will always reject the lottery with a 50 percent chance of losing 100 dollars and a 50 percent chance of gaining $G$.

Which of these statements are true? The first and the second may perhaps not be so surprising and I probably wouldn't be asking you if the question was trivial, so even the third could be true. On the other hand, one would certainly doubt the sanity of a DM rejecting the bet in the fourth claim and lingering doubt turns to certainty in the fifth case. Yet this is exactly what the DM will do: no amount of money in the world will make him accept a gamble with a 50 percent chance of losing 100 dollars. Clearly, such behavior is absurd.

Let us try to establish some intuition. The fact that the DM at each wealth level $w$ rejects the gamble

\[ \left(\tfrac{1}{2} \circ (w - 10), \tfrac{1}{2} \circ (w + 11)\right) \]

implies that

\[ \forall w: \quad \tfrac{1}{2}u(w - 10) + \tfrac{1}{2}u(w + 11) < u(w), \]

or, rewriting the expression, that

\[ \forall w: \quad u(w + 11) - u(w) < u(w) - u(w - 10). \]

Hence, on average, the DM values each dollar between $w$ and $w + 11$ by at most $10/11$ times as much as he, on average, values each dollar between $w - 10$ and $w$:

\[ \forall w: \quad \frac{u(w + 11) - u(w)}{11} < \frac{10}{11} \cdot \frac{u(w) - u(w - 10)}{10}. \]

By concavity of the utility function, this means that the marginal utility of the $(w + 11)$-th dollar is at most $10/11$ times the marginal utility of the $(w - 10)$-th dollar:

\[ \forall w: \quad u'(w + 11) < \frac{10}{11} u'(w - 10). \tag{36} \]

Repeated application of (36) implies an enormous decrease in marginal utility of money: the marginal utility of dollar $w + 32$ is at most $10/11$ times the marginal utility of dollar $w + 11$, which is at most $10/11$ times the marginal utility of dollar $w - 10$, so the marginal utility of dollar $w + 32$ is at most $(10/11)^2 \approx 0.83$ times the value of dollar $w - 10$. Similarly, the DM values dollar $w + 53$ by at most $(10/11)^3 \approx 0.75$ times the value of dollar $w - 10$. More generally, the DM values dollar $w + 11 + 21k$, where $k \in \mathbb{N}$, by at most $(10/11)^{k+1}$ times the value of dollar $w - 10$, which is an extremely high rate of deterioration for the value of money.
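To get a feeling for how fast this bound shrinks, one can simply tabulate it. The sketch below is illustrative; the function name `marginal_utility_bound` and the selected values of $k$ are assumptions. It evaluates the bound $(10/11)^{k+1}$ on the marginal utility of dollar $w + 11 + 21k$ relative to dollar $w - 10$.

```python
def marginal_utility_bound(k):
    """Upper bound on u'(w + 11 + 21k) / u'(w - 10) from repeated use of (36)."""
    return (10 / 11) ** (k + 1)

for k in (0, 1, 2, 10, 50, 100):
    dollar = 11 + 21 * k
    bound = marginal_utility_bound(k)
    print(f"dollar w+{dollar}: at most {bound:.6f} times the value of dollar w-10")
```

About two thousand dollars above the current wealth level ($k = 100$), a dollar is already worth less than one ten-thousandth of dollar $w - 10$.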

10. Time preference

Discounting essentially means that a given benefit is valued higher when it is received immediately than when it is received with a delay. A common economic motivation for discounting is that, say, one dollar today is worth more than one dollar next year, as the immediate reward can be put into a bank at an annual interest rate $r > 0$, making the dollar today worth $1 + r$ dollars next year. Another motivation, common in evolutionary models, is the risk that a delayed benefit may not be realized: you may die before receiving it (or be interrupted in achieving it, or be cheated in the promise of receiving it).

In addition to the question how to model discounting in an appropriate way, decision theory in the presence of time involves a number of careful considerations:

• Choice of horizon: should one look finitely or infinitely far into the future? Keynes' famous quote "In the long run, we are all dead" could be an argument in favor of a finite horizon. Many economic models involve just two time periods as an abstraction of "now" and "the future". On the other hand, many decisions have no clearly defined final period: you (or, in an evolutionary sense as in overlapping generations models, your genes) may live to see another day. In such cases, an infinite horizon makes sense.

• Choice of time as a discrete or continuous variable: also here, common sense, the appropriate level of abstraction, and (not rarely) the modeler's choice of mathematical tools are decisive.

Unless specified otherwise, this section takes time as being discrete and uses an infinite horizon. We derive the standard exponential discounting model from a stationarity assumption on preferences and briefly discuss a violation of stationarity and hyperbolic discounting. Section 10.3, based on Osborne and Rubinstein (1994, Sec. 8.3), considers two criteria for evaluating outcomes over time without discounting. The final section, based on Voorneveld (2007), illustrates the somewhat paradoxical statement that a sequence of utility-maximizing choices can minimize utility.

10.1. Stationarity and exponential discounting

The standard model of preferences over time assumes that:

• the set of alternatives consists of sequences of outcomes $c = (c_0, c_1, \ldots) = (c_t)_{t=0}^{\infty}$ in some arbitrary set, where $c_t$ denotes the outcome at time $t$.

• preferences $\succsim$ over such sequences are represented by a utility function $U$ of the form

\[ U(c) = \delta(0)u(c_0) + \delta(1)u(c_1) + \delta(2)u(c_2) + \cdots = \sum_{t=0}^{\infty} \delta(t)u(c_t), \tag{37} \]

with $\delta(0) = 1$.

The function in (37) is often interpreted as a sum of discounted instantaneous utilities: the outcome $c_t$ at time $t \in \mathbb{N}$ gives utility $u(c_t)$, but is discounted by a factor $\delta(t) \in (0, 1)$ as it lies in the future. The discount factor $\delta(0) = 1$ for current outcomes is mostly cosmetic, facilitating the notation involving an infinite sum.

Exercise 10.1

The expression in (37) involves an infinite sum, which may not be well-defined.

(a) Give an example to show this.

(b) Prove that the sum is well-defined if the sequence of discount factors is summable ($\sum_{t=0}^{\infty} \delta(t) < \infty$) and the instantaneous utility function $u$ is bounded.

The most common form of (37) involves exponential discounting: there is a $\delta \in (0, 1)$ such that $\delta(t) = \delta^t$ for all $t$, turning utility into

\[ U(c) = \sum_{t=0}^{\infty} \delta^t u(c_t). \tag{38} \]

Recall the earlier motivation for discounting of money: given a fixed interest rate $r > 0$ per period, one dollar tomorrow is worth only $(1 + r)^{-1}$ dollars today, so it makes sense to discount future money by powers of $\delta = (1 + r)^{-1}$. Following Koopmans (1960), exponential discounting can also be derived by imposing a stationarity requirement on preferences.

Preferences $\succsim$ satisfy stationarity if they are not affected if a common first outcome is dropped, and the timing of all other outcomes is advanced by one period. By repeated application, it implies that for a comparison between two sequences all initial periods with common outcomes can be dropped, and the first period of different outcomes can be taken as the initial period. Formally, the preference relation $\succsim$ is stationary if for all pairs $(c_t)_{t=0}^{\infty}$ and $(d_t)_{t=0}^{\infty}$ with $c_0 = d_0$:

\[ (c_0, c_1, c_2, \ldots) \succsim (d_0, d_1, d_2, \ldots) \iff (c_1, c_2, \ldots) \succsim (d_1, d_2, \ldots). \]

Deriving exponential discounting usually proceeds along the following lines:

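Proposition 10.1

For notational convenience, let 0 be a feasible outcome. Assume that:

• preferences $\succsim$ can be represented by a utility function as in (37),
• $\succsim$ satisfy stationarity,
• the decision-maker is indifferent between:

option 1: getting $\alpha$ today and $\alpha'$ tomorrow (i.e., the sequence $(\alpha, \alpha', 0, 0, \ldots)$),
option 2: getting $\beta$ today and $\beta'$ tomorrow,

where $u(\alpha') \neq u(\beta')$.

Then the discount factor is exponential:

\[ \delta(t) = \left(\frac{u(\alpha) - u(\beta)}{u(\beta') - u(\alpha')}\right)^{t} \quad \text{for all } t. \]

Proof. By induction on $t$. The result is trivial if $t = 0$. Let $t \in \mathbb{N}$ and assume that

\[ \delta(\tau) = \left(\frac{u(\alpha) - u(\beta)}{u(\beta') - u(\alpha')}\right)^{\tau} \quad \text{for all } \tau < t. \]

Repeated application of stationarity implies that

\[ (\underbrace{0, \ldots, 0}_{t-1 \text{ times}}, \alpha, \alpha', 0, 0, \ldots) \sim (\underbrace{0, \ldots, 0}_{t-1 \text{ times}}, \beta, \beta', 0, 0, \ldots), \]

i.e., their utility must be the same. Substitution in (37) gives

\[ \delta(t - 1)(u(\alpha) - u(\beta)) + \delta(t)(u(\alpha') - u(\beta')) = 0, \]

so

\[ \delta(t) = \delta(t - 1)\,\frac{u(\alpha) - u(\beta)}{u(\beta') - u(\alpha')} = \left(\frac{u(\alpha) - u(\beta)}{u(\beta') - u(\alpha')}\right)^{t}, \]

where the final equality uses the induction hypothesis.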
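That exponential discounting does satisfy stationarity is easy to see numerically: if two streams share their first outcome, the utility gap between them is exactly the discount factor times the gap between the streams with that first outcome dropped, so the ranking is unchanged. The sketch below is an illustration under assumptions: streams are finite and all later outcomes are taken to contribute zero utility, the instantaneous utility is the identity, and the sample streams, the value 0.9 for the discount factor, and the function name are hypothetical.

```python
import math

def discounted_utility(stream, delta, u=lambda c: c):
    """U(c) = sum_t delta^t u(c_t) for a finite stream.

    Assumes every outcome after the end of the stream contributes zero utility.
    """
    return sum(delta ** t * u(c) for t, c in enumerate(stream))

delta = 0.9
c = [3.0, 1.0, 4.0, 0.0]
d = [3.0, 2.0, 1.0, 1.0]  # same first outcome as c

full_gap = discounted_utility(c, delta) - discounted_utility(d, delta)
tail_gap = discounted_utility(c[1:], delta) - discounted_utility(d[1:], delta)
print(full_gap, delta * tail_gap)  # equal up to rounding, hence the same sign
```

The identity `full_gap == delta * tail_gap` holds for every pair of streams with a common first outcome, because the common term cancels and every remaining term carries one extra factor of the discount factor.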


Exercise 10.2

Rational suicide: A decision maker (DM) lives for at most two periods, $t = 0$ and $t = 1$. At each time $t \in \{0, 1\}$ that he is alive, he must decide, depending on his mood, whether or not to commit suicide. Regardless of his initial mood, at time $t = 1$ he will be depressed with probability $1/2$ or happy with probability $1/2$. His instantaneous utility is state-dependent, i.e., it depends not only on his action but also on the state of the world at time $t$. The set of states is $S = \{h, d, D\}$, where $h$ is "alive and happy", $d$ is "alive and depressed", and $D$ is "dead". The set of actions is $A = \{k, \ell\}$, where $k$ is "commit suicide" and $\ell$ is "go on living". The instantaneous utility function $u : S \times A \to \mathbb{R}$ is defined as follows:

\[ u(s, a) = \begin{cases} 1 & \text{if } (s, a) = (h, \ell), \\ -1 & \text{if } (s, a) = (h, k), \\ -\gamma & \text{if } (s, a) = (d, \ell), \\ 0 & \text{otherwise,} \end{cases} \]

where $\gamma > 0$ is the intensity of the depression. Thus, given that you're happy, killing yourself appears silly, but if you're depressed, it may seem less so. State $D$ is irreversible: should the DM decide to kill himself at time $t = 0$, then he receives utility 0 at time $t = 1$. The DM discounts the future exponentially at rate $0 < \delta < 1$, and maximizes expected lifetime utility (of the standard additive form). We solve the decision problem by backward induction, starting with optimal behavior in the final period $t = 1$.

Assume the DM is alive at time $t = 1$.

(a) What is the optimal action and the resulting instantaneous utility if the DM at $t = 1$ is (a1) happy? (a2) depressed?

Now consider the initial period: assume the DM is depressed at time $t = 0$.

(b) Assuming optimal behavior at time $t = 1$, what is the optimal action at time $t = 0$? Note: (i) if the DM does not kill himself immediately, there is uncertainty about his mood at time $t = 1$; (ii) the answer depends on $\delta$ and $\gamma$.

(c) A psychologist claims that the option of future suicide might prevent depressed people from killing themselves straight away. Explain this claim using the answers above.

10.2. Preference reversal and hyperbolic discounting

Stationarity requires that if you prefer one apple today over two apples tomorrow, then shifting this choice by one year (one apple next year versus two apples one year and a day from now) doesn't change that you'd still rather have the single apple. On the other hand, empirical evidence (Thaler, 1981) seems to suggest that people are much more sensitive to a waiting time of one day when it occurs right now than to a waiting time in the far future: if you anyway have to wait an entire year for a lousy apple, you might as well wait one day more and double the booty. Different attempts to capture such a preference reversal go under the heading of hyperbolic discounting. It simply involves discount factors that are not exponential. Arguably the simplest approach is the so-called $(\beta, \delta)$-model of Phelps and Pollak (1968). They define, for $\beta, \delta \in (0, 1)$, the discount factors as $\delta(0) = 1$, $\delta(1) = \beta\delta$, $\delta(2) = \beta\delta^2$, $\delta(3) = \beta\delta^3$, ..., turning the utility function (37) into:

\[ U(c) = u(c_0) + \beta \sum_{t=1}^{\infty} \delta^t u(c_t). \]

To see that this model can explain the preference reversal for the apples, assume that utility satisfies $u(0) = 0$ and is strictly increasing in apples. Preferring one apple today over two apples tomorrow means that

\[ u(1) > \beta\delta\, u(2). \tag{39} \]

Preferring two apples one year and a day from now to one apple a year from now (and assuming we're not in a leap year) means that

\[ \beta\delta^{366} u(2) > \beta\delta^{365} u(1). \tag{40} \]

For (39) and (40) to hold simultaneously, we simply need $(\beta, \delta) \in (0, 1) \times (0, 1)$ to satisfy

\[ \beta\delta < \frac{u(1)}{u(2)} < \delta. \]

Taking $\delta$ sufficiently close to one and $\beta$ sufficiently close to zero will do the trick.
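A concrete parameter choice makes the reversal visible. The sketch below is only an illustration: it assumes linear utility in apples, so that $u(1)/u(2) = 1/2$, and the hypothetical values $\beta = 0.5$ and $\delta = 0.99$, which satisfy $\beta\delta < 1/2 < \delta$; the function names are likewise assumptions.

```python
def beta_delta_weight(t, beta, delta):
    """Discount factor of the (beta, delta)-model: 1 at t = 0, beta * delta^t for t >= 1."""
    return 1.0 if t == 0 else beta * delta ** t

def prefers_one_apple_sooner(t, beta, delta, u=lambda apples: apples):
    """True if 1 apple at time t is strictly preferred to 2 apples at time t + 1."""
    sooner = beta_delta_weight(t, beta, delta) * u(1)
    later = beta_delta_weight(t + 1, beta, delta) * u(2)
    return sooner > later

beta, delta = 0.5, 0.99  # assumed values with beta * delta < 1/2 < delta
print(prefers_one_apple_sooner(0, beta, delta))    # today versus tomorrow
print(prefers_one_apple_sooner(365, beta, delta))  # in a year versus a year and a day
```

The first comparison favors the single apple, the second favors waiting the extra day: exactly the preference reversal that exponential discounting rules out.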

Exercise 10.3 (Loewenstein and Prelec, 1992)

Discount factors $\delta(t) = (1 + \alpha t)^{-\beta/\alpha}$, with $\alpha, \beta > 0$, fit experimental data well. Show that also this model captures the preference reversal described above.

Exercise 10.4 (Wärneryd, 2007)

Sex and time preference: In some evolutionary models of intertemporal consumption, time periods represent generations and people care about future consumption to the extent that it is exercised by their offspring (children, grandchildren, etc.). To simplify matters, assume that (i) a DM cares about consumption of its offspring only if it has a specific gene; (ii) mates are selected at random and have the relevant gene with probability $\rho \in [0, 1]$, (iii) offspring gets in expectation half of its genes from each parent, (iv) we consider one unit of offspring per time period.

(a) Let $t \in \mathbb{N}$. Show, for instance by conditioning on the giver of the gene, that the probability $p_t$ of the DM's $t$-th period offspring carrying the gene satisfies the recurrence relation $p_t = \tfrac{1}{2} p_{t-1} + \tfrac{1}{2}\rho$.

(b) Set $p_0 = 1$ and show that the solution to the recurrence relation is $p_t = \rho + (1 - \rho)\left(\tfrac{1}{2}\right)^t$ for all $t = 0, 1, 2, \ldots$

With these kinship parameters $(p_t)_{t=0}^{\infty}$ in the place of discount factors, the standard separable utility function becomes $U(c) = \sum_{t=0}^{\infty} p_t u(c_t)$. Assume that consumption is in units of apples, $u$ is strictly increasing, and $u(0) = 0$. Let's investigate the opportunity of preference reversal.

(c) Let $\rho$ and $u$ be such that the DM prefers 1 apple now ($t = 0$) to 2 apples next generation ($t = 1$). Prove: for $T \in \mathbb{N}$ sufficiently large, the DM prefers 2 apples at time $T + 1$ to 1 apple at time $T$.

10.3. Limit-of-means and overtaking

By discounting, less weight is assigned to future utilities. This section introduces two other ways of evaluating sequences of utilities, attaching equal weight to all periods. To save on notation, we will denote a sequence of utilities simply by a sequence $(x_t)_{t=0}^{\infty}$ of real numbers, rather than using the more elaborate $(u(c_t))_{t=0}^{\infty}$. Probably the first thing that comes to mind is to value a sequence of utilities $(x_t)_{t=0}^{\infty}$ using the long-term average of the utilities:

\[ \lim_{T \to \infty} \frac{x_0 + x_1 + \cdots + x_{T-1}}{T}. \]

However, even if the sequence is bounded, this limit may not exist: the average may continue to oscillate. We verify this statement with a binary (zero-one) sequence. The idea is to append enough ones to increase the average until it achieves a fixed high value, then to append enough zeroes to decrease the average until it reaches a fixed low value, and continue this process.

An oscillating average: Consider the binary sequence

\[ (0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, \ldots) \]

obtained by starting with a zero and two ones, and then after each block of zeroes or ones, double the length of the sequence obtained so far with a block of the other number: after the first block of ones, we have three coordinates, so we double the length to six coordinates by appending some zeroes. Then we double the length to twelve coordinates by adding some ones, etc. A simple inductive proof shows that after the $k$-th block of ones, the sequence has $3 \cdot 2^{2k-2}$ coordinates, $2^{2k-1}$ of them equal to one, and therefore an average of $2/3$. Doubling the length to $3 \cdot 2^{2k-1}$ coordinates by appending zeroes decreases the average by a factor $1/2$ to $1/3$. As appending zeroes decreases, and appending ones increases the average, it follows that the average continues to oscillate between $1/3$ and $2/3$.
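The construction is easy to replay on a computer. The following sketch is illustrative (the function name and the number of appended blocks are assumptions): it builds an initial segment of the sequence and reports the average at the end of each block, using exact fractions.

```python
from fractions import Fraction

def oscillating_sequence(blocks):
    """Start with (0, 1, 1); then alternately append zeroes and ones,
    doubling the length of the sequence with each appended block."""
    seq = [0, 1, 1]
    lengths = [len(seq)]
    for i in range(blocks):
        value = 0 if i % 2 == 0 else 1
        seq.extend([value] * len(seq))
        lengths.append(len(seq))
    return seq, lengths

seq, lengths = oscillating_sequence(6)
averages = [Fraction(sum(seq[:n]), n) for n in lengths]
print(lengths)   # lengths at the end of each block
print(averages)  # alternates between 2/3 and 1/3
```

The averages at the block ends alternate between $2/3$ and $1/3$ no matter how many blocks are appended, so the sequence of averages has no limit.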

Taking, instead, a pessimistic view of how the average utility changes over time will give us a well-defined criterion. This requires some mathematical preliminaries. Consider a bounded sequence $(x_t)_{t=0}^{\infty}$ of real numbers. For each $t$, $s_t = \inf\{x_s : s \geq t\}$ indicates the infimum (in somewhat colloquial terms, the worst value) of the tail of the sequence from time $t$ onwards. This infimum is well-defined, as the sequence is bounded. Notice also that the sequence $(s_t)_{t=0}^{\infty}$ is weakly increasing: increasing $t$ implies taking the infimum over a smaller set. As $(s_t)_{t=0}^{\infty}$ is a monotonic, bounded sequence, it converges. Its limit is called the lower limit or limes inferior (liminf) of the original sequence $(x_t)_{t=0}^{\infty}$:

\[ \liminf_{t \to \infty} x_t = \lim_{t \to \infty} \left(\inf\{x_s : s \geq t\}\right). \]

By convention, $\liminf_{t \to \infty} x_t = -\infty$ if $(x_t)_{t=0}^{\infty}$ is not bounded from below. If it is bounded from below, but not from above, the sequence of infima may diverge, in which case one sets $\liminf_{t \to \infty} x_t = +\infty$.

The following characterization of the lower limit may come in handy. Let $(x_t)_{t=0}^{\infty}$ be a sequence and let $c \in \mathbb{R}$. Then $\liminf_{t \to \infty} x_t = c$ if and only if:

[L1] for each $\varepsilon > 0$, there is a $T \in \mathbb{N}$ such that $c - \varepsilon < x_t$ for all $t \geq T$,
[L2] for each $\varepsilon > 0$ and each $T' \in \mathbb{N}$, there is a $t \geq T'$ with $x_t < c + \varepsilon$.

In words, the sequence eventually remains above $c - \varepsilon$, but dives below $c + \varepsilon$ infinitely often, no matter how small $\varepsilon > 0$.

Exercise 10.5

Prove this.

The limit-of-means criterion evaluates utility streams by means of the lower limit of the average utility:

Limit of means: Let $x = (x_t)_{t=0}^{\infty}$ and $y = (y_t)_{t=0}^{\infty}$ be sequences in $\mathbb{R}$. Then $x$ is preferred to $y$ by the limit-of-means criterion, denoted $x \succ_L y$, if and only if

\[ \liminf_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} (x_t - y_t) > 0. \tag{41} \]

Inequality (41) is equivalent with the statement that for some $\varepsilon > 0$, the average difference between sequences $x$ and $y$ eventually exceeds $\varepsilon$:

\[ \frac{1}{T} \sum_{t=0}^{T-1} (x_t - y_t) > \varepsilon \quad \text{for all but finitely many periods } T. \]

Exercise 10.6

Prove this.

Changes in a single coordinate of a sequence become negligible once the average is taken over a long time, so under the limit-of-means criterion, changes in any finite number of periods do not matter. In particular, these preferences are stationary.

Exercise 10.7

Some authors refer to the limit-of-means criterion as the preference relation represented by the utility function assigning to each bounded sequence $x = (x_t)_{t=0}^{\infty}$ the number $U(x) = \liminf_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} x_t$.

(a) Why must the sequences be bounded?
(b) Aside from this, are the two definitions really the same?

The following criterion also assigns equal weight to periods, but remains sensitive to changes in
single coordinates:

Overtaking:
to

Let

x = (xt )
t=0

y = (yt )
t=0 be sequences in R.
denoted x O y , if and only if

and

by the overtaking criterion,

lim inf
T

T
X

Then

x is preferred

(xt yt ) > 0.

t=0

Let us compare exponential discounting, and the limit-of-means and overtaking criteria.

The

latter were dened in terms of strict preferences. Dene the corresponding indierence relation

as follows:

x L y

if neither

x L y

nor

y L x.

Of course,

is dened similarly.

Comparison:

• The sequence $(1, -1, 0, 0, \ldots)$ is preferred to the sequence $(0, 0, \ldots)$ under exponential discounting for all $\delta \in (0, 1)$. Under the other two criteria, they are equivalent.

• The sequence $(1, 2, 0, 0, \ldots)$ is preferred to the sequence $(0, 0, \ldots)$ under the overtaking criterion. Under the limit-of-means criterion, they are equivalent.

• For every $n \in \mathbb{N}$, the sequence $(\underbrace{0, \ldots, 0}_{n \text{ times}}, 1, 1, \ldots)$ is preferred to $(1, 0, 0, \ldots)$ under the limit-of-means criterion. However, for each $\delta \in (0, 1)$, a large enough delay in a constant stream of ones makes the instant gratification of getting 1 immediately the preferable option.
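The comparison can be checked mechanically for sequences that are eventually constant. The following is a minimal sketch, not part of the original notes: it assumes a sequence is encoded as a pair `(prefix, tail)`, meaning the finite list `prefix` followed by the constant `tail` forever, and all function names are hypothetical. For such sequences the average of the differences converges to the tail difference, and the partial sums either diverge or settle at the sum of the finitely many nonzero differences, so both lower limits can be read off exactly.

```python
from fractions import Fraction

def difference(x, y):
    """Coordinatewise x - y for eventually-constant sequences (prefix, tail)."""
    (px, tx), (py, ty) = x, y
    n = max(len(px), len(py))
    pad_x = list(px) + [tx] * (n - len(px))
    pad_y = list(py) + [ty] * (n - len(py))
    return [a - b for a, b in zip(pad_x, pad_y)], tx - ty

def discounted_utility(x, delta):
    """Sum of delta**t * x_t, using the geometric series for the constant tail."""
    prefix, tail = x
    head = sum(delta**t * v for t, v in enumerate(prefix))
    return head + tail * delta**len(prefix) / (1 - delta)

def prefers_limit_of_means(x, y):
    # The average of the differences converges to the tail difference.
    _, tail = difference(x, y)
    return tail > 0

def prefers_overtaking(x, y):
    d, tail = difference(x, y)
    if tail != 0:
        return tail > 0   # partial sums diverge to plus or minus infinity
    return sum(d) > 0     # partial sums settle at sum(d)

zero = ([], 0)
a = ([1, -1], 0)          # the sequence (1, -1, 0, 0, ...)
delta = Fraction(1, 2)
print(discounted_utility(a, delta) > discounted_utility(zero, delta))
print(prefers_limit_of_means(a, zero), prefers_overtaking(a, zero))
```

Exact rational arithmetic (`Fraction`) is used so that strict inequalities are not blurred by rounding.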

10.4. Better may be worse

Consider an alcoholic who has to decide at each moment in discrete time whether to take a drink (action 1) or not (action 0). Given his uncertain life-length, common modeling practice is to treat this as an infinite horizon problem, discounting the impact of future decisions if so desired.

A drinking pattern is a sequence $x = (x_t)_{t=0}^{\infty}$ of zeroes and ones, with $x_t = 1$ if the alcoholic takes a drink at time $t$ and $x_t = 0$ otherwise. With a minor abuse of notation, $(0, x_{-t})$ denotes the drinking pattern obtained from $x$ by not drinking at time $t$. The pattern $(1, x_{-t})$ is defined likewise.

The philosophy of Alcoholics Anonymous is to fight the temptations of alcohol by forgetting about the past or the future and concentrate exclusively on the present: stay away from a drink one day at a time. Let us investigate the possibility of having a utility function $U$ that simultaneously models:

• Temptation: at any given day, the alcoholic is at least as well off (and sometimes better) by choosing to drink:

\[ U(1, x_{-t}) \geq U(0, x_{-t}) \text{ for all } t, x_{-t}, \qquad U(1, x_{-t}) > U(0, x_{-t}) \text{ for some } t, x_{-t}. \]

• Health concerns: nevertheless, the best thing is never to drink and the worst thing is to drink at all times: $(0, 0, \ldots)$ maximizes, and $(1, 1, \ldots)$ minimizes $U$.

This sounds paradoxical and is indeed impossible under a finite horizon: suppose there are only $T \in \mathbb{N}$ periods. Start with an arbitrary drinking pattern and switch, one period at a time, any abstention (0) to drinking (1). By temptation, each such switch weakly increases the utility function. So drinking at all times maximizes utility, in conflict with health concerns, which would require that all these weak increases in utility eventually lead to a plunge in utility: it is like climbing a stairway, but ending up lower than before (Figure 3).

Figure 3: An impossible stairway

The next example shows that temptation and health concerns can be reconciled under an infinite horizon.

Drinking paradox: Define the utility for each drinking pattern $x$ as follows:

\[ U(x) = \begin{cases} 3 & \text{if } x_t = 1 \text{ for only finitely many } t \text{ (a rare drinker)}, \\ 0 & \text{if } x_t = 0 \text{ for only finitely many } t \text{ (a heavy addict)}, \\ \sum_t x_t 2^{-t} & \text{otherwise.} \end{cases} \]

As a switch from 0 to 1 at time $t$ leaves the utility unaffected in the first two cases and increases it by $2^{-t} > 0$ otherwise, the temptation assumption is satisfied. However,

\[ U(0, 0, \ldots) = 3 = \max_x U(x) > \min_x U(x) = 0 = U(1, 1, \ldots), \]

in conformance with health concerns.
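The utility function of the drinking paradox can be evaluated exactly for patterns that are eventually periodic. The sketch below is an illustration under that assumption: a pattern is encoded as a finite `prefix` followed by a non-empty `cycle` that repeats forever, and the names `utility` and `with_action` are hypothetical. It confirms that a switch from 0 to 1 at time $t$ raises utility by $2^{-t}$ in the mixed case and leaves it unchanged for a rare drinker.

```python
from fractions import Fraction

def utility(prefix, cycle):
    """U(x) for the pattern x = prefix followed by cycle repeated forever."""
    if not any(cycle):
        return Fraction(3)   # finitely many drinks: a rare drinker
    if all(cycle):
        return Fraction(0)   # finitely many abstentions: a heavy addict
    n, m = len(prefix), len(cycle)
    head = sum(Fraction(x, 2**t) for t, x in enumerate(prefix))
    one_cycle = sum(Fraction(c, 2**k) for k, c in enumerate(cycle))
    # Geometric series over the repetitions of the cycle, starting at time n.
    tail = one_cycle / (1 - Fraction(1, 2**m)) / 2**n
    return head + tail

def with_action(prefix, cycle, t, action):
    """Return the pattern with the action at time t replaced."""
    prefix = list(prefix)
    while len(prefix) <= t:
        prefix.extend(cycle)  # unroll whole cycles so the tail stays aligned
    prefix[t] = action
    return prefix, cycle

# Alternating pattern (0, 1, 0, 1, ...): gain from drinking at time t.
for t in range(4):
    gain = utility(*with_action([], [0, 1], t, 1)) - utility(*with_action([], [0, 1], t, 0))
    print(t, gain)
```

Because the mixed case sums to strictly less than $\sum_t 2^{-t} = 2$, no pattern in that case can reach the value 3 reserved for the rare drinker.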


11. Probabilistic choice

Consider a DM with a finite set $A$ of alternatives. Earlier, we saw that if the DM has a weak order $\succsim$ over these alternatives, there is a utility function $u : A \to \mathbb{R}$ representing these preferences and making an optimal choice reduces to choosing an alternative $a \in \arg\max_{b \in A} u(b)$, a utility maximizing alternative. However, in numerous experiments, it turns out that DMs:

• do not always make the same choice under seemingly identical circumstances,
• sometimes choose seemingly suboptimal alternatives.

Such apparently irrational behavior has led to the development of so-called probabilistic choice models, where the main idea is that:

• each alternative is chosen with some probability,
• if $a$ and $b$ are feasible choices and $a \succsim b$, then the probability of choosing $a$ should be at least as large as the probability of choosing $b$.

This section gives a very short introduction to three probabilistic choice models: the Luce model, the logit model, and the linear probability model. Often, probabilistic choice models are derived in a random utility framework, where the true utility of each alternative consists of a deterministic component plus a random component. Depending on the realization of the random utility component, a feasible choice will look good under some circumstances and bad under others, thus motivating that observed choice is probabilistic: an alternative is only chosen in circumstances where it looks optimal. We will not consider such random utility models: they are (or should be) treated in detail in the econometrics courses. The development of these models was one of the main causes for awarding Daniel McFadden the Nobel Prize in 2000. Instead, we derive the models either axiomatically or via the introduction of control costs: DMs want to choose optimally, but incur costs to precisely implement their choices.

A good introduction to probabilistic choice models can be found in Anderson et al. (1992, Ch. 2) and Ben-Akiva and Lerman (1985, Ch. 3). On the content of this section: The Luce model is due to Luce (1959). The derivation of the logit choice probabilities using the entropy cost function can be found in Mattsson and Weibull (2002). The derivation of the linear probability model using the Euclidean distance as cost function is due to Voorneveld (2006). It is based on an early contribution to the literature on bounded rationality in games by Rosenthal (1989).

11.1. The Luce model

Consider a finite set $A$ of alternatives. Some notation:

• In the remainder of this section, we assume that the DM has to choose from a subset of alternatives in $A$ containing at least two elements: choosing from a set with only one alternative is trivial. We will typically denote such sets by $S \subseteq A$ or $T \subseteq A$.

• If the DM has to make a choice from a set $S \subseteq A$, we denote the probability that the DM chooses $a \in S$ by $P_S(a) \in [0, 1]$. Obviously, we require that $\sum_{a \in S} P_S(a) = 1$.

• If $S \subseteq T \subseteq A$, we denote the probability that an element from $S$ is chosen when the choice set is $T$ by $P_T(S) = \sum_{a \in S} P_T(a)$. By the assumption above: $P_T(T) = 1$ for all $T \subseteq A$.

• The set obtained from $S$ by removing an element $a \in A$ is denoted by $S \setminus \{a\}$.

With this notation, the following two properties should be intuitive. The first property states that if some alternative $a \in T$ is never chosen in a pairwise comparison with some other $b \in T$, i.e., $P_{\{a,b\}}(a) = 0$, then $a$ can be deleted from $T$ without affecting the choice probabilities of the remaining alternatives:

(L1) Let $T \subseteq A$ and $a \in T$. If there exists a $b \in T$ with $P_{\{a,b\}}(a) = 0$, then

\[ P_T(S) = P_{T \setminus \{a\}}(S \setminus \{a\}) \quad \text{for all } S \subseteq T. \]

Taking $S = T \setminus \{a\}$ in (L1), we get $P_T(T \setminus \{a\}) = P_{T \setminus \{a\}}(T \setminus \{a\}) = 1$, so $P_T(a) = 0$.

What about cases where no alternative $a$ is always rejected in pairwise comparisons? In that case it is reasonable to assume the following path independence condition: if $a \in S \subseteq T$, then the probability of choosing $a$ from $T$ should be equal to the probability of (i) first selecting the subset $S$ and (ii) from $S$ choosing the element $a$. Formally:

(L2) Let $S \subseteq T \subseteq A$ and $a \in S$. If $P_{\{a,b\}}(a) \notin \{0, 1\}$ for all $b \in T$, then

\[ P_T(a) = P_T(S) P_S(a). \]

When making a choice from a set $T \subseteq A$, (L1) allows us to restrict attention to the alternatives for which there is imperfect discriminatory power: $P_{\{a,b\}}(a) \notin \{0, 1\}$ for all $a, b \in T$, $a \neq b$. The path independence condition then yields the following result:

Proposition 11.1 Assume that $P_{\{a,b\}}(a) \notin \{0, 1\}$ for all different $a, b \in A$. Path independence (L2) holds if and only if there is a function $u : A \to \mathbb{R}_{++}$ such that

\[ P_S(a) = \frac{u(a)}{\sum_{b \in S} u(b)} \tag{42} \]

for every $S \subseteq A$. Moreover, the function $u$ is unique up to multiplication by a positive scalar.

Proof.
Step 1: Assume path independence (L2) holds. We first prove that $P_A(a) > 0$ for all $a \in A$. Suppose, to the contrary, that $P_A(a) = 0$ for some $a \in A$. By (L2), we know that for every $b \in A \setminus \{a\}$:

\[ 0 = P_A(a) = P_A(\{a, b\}) P_{\{a,b\}}(a). \]

Since $P_{\{a,b\}}(a) \neq 0$, it follows that $P_A(\{a, b\}) = P_A(a) + P_A(b) = 0$ for all $b \in A \setminus \{a\}$. Probabilities are nonnegative, so it must be that $P_A(b) = 0$ for all $b \in A$, contradicting $\sum_{b \in A} P_A(b) = 1$.

Having shown that $P_A(a) > 0$ for all $a \in A$, define $u(a) = P_A(a)$. Path independence (L2) implies that for every $a \in S \subseteq A$:

\[ P_S(a) = \frac{P_A(a)}{P_A(S)} = \frac{P_A(a)}{\sum_{b \in S} P_A(b)} = \frac{u(a)}{\sum_{b \in S} u(b)}. \]

Step 2: Conversely, suppose that there is a function $u : A \to \mathbb{R}_{++}$ such that

\[ P_S(a) = \frac{u(a)}{\sum_{b \in S} u(b)} \]

for every $S \subseteq A$. To show: (L2) holds. So let $S \subseteq T \subseteq A$ and $a \in S$. Then

\[ P_T(a) = \frac{u(a)}{\sum_{b \in T} u(b)} = \frac{\sum_{b \in S} u(b)}{\sum_{b \in T} u(b)} \cdot \frac{u(a)}{\sum_{b \in S} u(b)} = P_T(S) P_S(a). \]

Step 3: To show that the function $u$ in (42) is unique up to multiplication with a positive constant, suppose there are two such functions $u$ and $u'$. It follows that for every $a \in A$:

\[ P_A(a) = \frac{u(a)}{\sum_{b \in A} u(b)} = \frac{u'(a)}{\sum_{b \in A} u'(b)}. \]

Hence $u(a) = \alpha u'(a)$, where $\alpha = \sum_{b \in A} u(b) / \sum_{b \in A} u'(b) > 0$.

In words: In Luce's choice model, each alternative can be assigned a positive value such that the
probability of choosing a given alternative from a choice set is proportional to its value.
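Formula (42) and the path independence condition (L2) are easy to check by brute force on a small example. The sketch below is illustrative only: the alternatives and their positive integer values in `u` are hypothetical, and the function names are not from the notes. It enumerates all nested pairs of choice sets and verifies the product rule, and it also shows that rescaling `u` leaves the choice probabilities unchanged.

```python
from fractions import Fraction
from itertools import combinations

def luce(u, S):
    """Choice probabilities P_S(a) = u[a] / (sum of u[b] over b in S)."""
    total = sum(u[b] for b in S)
    return {a: Fraction(u[a], total) for a in S}

def prob_of_subset(u, T, S):
    """P_T(S): probability that the choice from T lands in the subset S."""
    p = luce(u, T)
    return sum(p[a] for a in S)

def path_independent(u):
    """Check P_T(a) = P_T(S) * P_S(a) for all a in S, S subset of T, |S| >= 2."""
    A = sorted(u)
    for k in range(2, len(A) + 1):
        for T in combinations(A, k):
            for j in range(2, k + 1):
                for S in combinations(T, j):
                    for a in S:
                        lhs = luce(u, T)[a]
                        rhs = prob_of_subset(u, T, S) * luce(u, S)[a]
                        if lhs != rhs:
                            return False
    return True

u = {"a": 1, "b": 2, "c": 5}            # hypothetical positive integer values
print(luce(u, ["a", "b", "c"]))
print(path_independent(u))
scaled = {k: 3 * v for k, v in u.items()}
print(luce(scaled, ["a", "b"]) == luce(u, ["a", "b"]))
```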
Debreu (1960) showed that path independence, although reasonable at first sight, can
lead to counterintuitive conclusions. Consider, for instance, the following well-known variant of
Debreu's argument:

The blue bus/red bus paradox. A DM has to make a traveling mode decision: he can either go to his destination by car or by bus. Assume the DM assigns the same probability to both alternatives:

\[ P_{\{\text{car, bus}\}}(\text{car}) = P_{\{\text{car, bus}\}}(\text{bus}) = 1/2. \tag{43} \]

Suppose now that two buses can be used, which are completely identical, except in their colors: one of them is red, the other is blue. So the choice set is $A = \{\text{car, blue bus, red bus}\}$. Assume that the DM pays no attention to color:

\[ P_{\{\text{blue bus, red bus}\}}(\text{blue bus}) = P_{\{\text{blue bus, red bus}\}}(\text{red bus}). \tag{44} \]

Intuitively, since the DM according to (43) doesn't seem to care whether he goes by car or by bus, it would seem reasonable to expect that he will choose to go by car with probability 1/2, and to go by bus with probability 1/2, choosing randomly between the blue and the red bus:

\[ P_A(\text{car}) = 1/2 \quad \text{and} \quad P_A(\text{blue bus}) = P_A(\text{red bus}) = 1/4, \]

or, at least, that the probability of taking the car should be larger than the probability of taking any of the two buses. However, path independence (L2) implies

\[ P_A(\text{car}) = P_A(\text{blue bus}) = P_A(\text{red bus}) = 1/3. \]

To see this, notice that

\[ P_A(\text{car}) \overset{\text{(L2)}}{=} P_A(\{\text{blue bus, car}\}) P_{\{\text{blue bus, car}\}}(\text{car}) \overset{(43)}{=} \tfrac{1}{2} P_A(\{\text{blue bus, car}\}) \overset{\text{def}}{=} \tfrac{1}{2} P_A(\text{blue bus}) + \tfrac{1}{2} P_A(\text{car}), \]

so $P_A(\text{car}) = P_A(\text{blue bus})$ and, similarly, $P_A(\text{car}) = P_A(\text{red bus})$. As the probabilities must add up to one:

\[ P_A(\text{car}) = P_A(\text{blue bus}) = P_A(\text{red bus}) = 1/3. \]

So: in the choice problem with only one bus, the DM will choose to go by car or by bus with equal probability, but when faced with the choice between going by car or going by bus in case there are two virtually identical buses, the probability of choosing the car decreases from 1/2 to 1/3.
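The paradox can be reproduced numerically with Luce's formula. In the sketch below, the equal values assigned to the three alternatives are an assumption made for illustration: they are what Luce's representation requires once every pairwise choice between the car and a bus, and between the two buses, is made with probability 1/2.

```python
from fractions import Fraction

def luce(u, S):
    """Luce choice probabilities on the choice set S for positive values u."""
    total = sum(u[b] for b in S)
    return {a: Fraction(u[a], total) for a in S}

# Equal values: each pairwise choice is then made with probability 1/2.
u = {"car": 1, "blue bus": 1, "red bus": 1}

one_bus = luce(u, ["car", "blue bus"])
two_buses = luce(u, ["car", "blue bus", "red bus"])
print(one_bus["car"], two_buses["car"])
```

Adding a duplicate of the bus lowers the probability of the car, although nothing about the car or about bus travel has changed.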
11.2. The logit model

Again, consider a choice set $A = \{1, \ldots, n\}$ with at least two distinct elements. Assume that each alternative $i \in A$ gives some utility or payoff $\pi(i)$. In the logit model with parameter $\lambda > 0$, the probability of choosing alternative $i$ from $A$ is equal to

\[ P_A(i) = \frac{e^{\pi(i)/\lambda}}{\sum_{j \in A} e^{\pi(j)/\lambda}} = \frac{\exp(\pi(i)/\lambda)}{\sum_{j \in A} \exp(\pi(j)/\lambda)}. \tag{45} \]

Notice from (42) that this is just a special case of Luce's model, where the utility assigned to each alternative $i \in A$ is equal to $u(i) = \exp(\pi(i)/\lambda) > 0$.
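To get a feeling for formula (45), the logit probabilities can be evaluated for a few values of the parameter. The following sketch uses hypothetical payoffs and calls the parameter `lam`; subtracting the maximal payoff before exponentiating is a standard numerical precaution that leaves all probability ratios, and hence the probabilities, unchanged.

```python
import math

def logit_probabilities(payoffs, lam):
    """Logit choice probabilities (45) for a list of payoffs and parameter lam > 0."""
    m = max(payoffs)
    # Subtracting the maximum leaves the ratios unchanged and avoids overflow.
    weights = [math.exp((x - m) / lam) for x in payoffs]
    total = sum(weights)
    return [w / total for w in weights]

payoffs = [0.0, 1.0, 3.0]   # hypothetical payoffs of three alternatives
for lam in (0.01, 1.0, 10.0, 1000.0):
    print(lam, [round(p, 4) for p in logit_probabilities(payoffs, lam)])
```

For small values of the parameter almost all probability sits on the alternative with the highest payoff, while for large values the probabilities are close to uniform.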

Our goal will be two-fold:

1. motivating these choice probabilities by introducing control costs,
2. studying the role of the parameter $\lambda > 0$.

Control costs. We allow the DM to choose each of the alternatives with a certain probability, so the DM chooses a probability distribution from

\[ \Delta_n = \Big\{ p \in \mathbb{R}^n_+ : \sum_{i=1}^n p_i = 1 \Big\}. \]

Of course, if the DM is faced with choice set $A$ and has preferences $\succsim$ over the outcomes such that $i \succsim j$ if and only if $\pi(i) \geq \pi(j)$, the optimal thing to do is to choose only elements from the set $\arg\max_{i \in A} \pi(i)$ with positive probability. In most real-life situations, the DM cannot guarantee the exact implementation of his choices: a careless driver may drive off the road, an absentminded shopper may by mistake buy the wrong item. To model this, we assume that it requires effort to implement choices: associated with each choice $p \in \Delta_n$ will be a disutility or control cost $c(p) \in \mathbb{R}$.

The (expected total) utility associated with each choice $p \in \Delta_n$ is defined as the difference between the expected payoff $\sum_{i=1}^n p_i \pi(i)$ and $\lambda > 0$ times the control cost $c(p)$, where $\lambda$ is a positive scalar representing the relative weight assigned to the effort of implementing choice $p$. Hence, the DM aims to solve

\[ \max_{p \in \Delta_n} \sum_{i=1}^n p_i \pi(i) - \lambda c(p). \]

Different cost functions give rise to different choice probabilities. A common control cost function that appears in many branches of science (physics, chemistry, information science, to name but a few) is the following entropy function:

\[ c(p) = \sum_{i=1}^n p_i \ln(p_i), \tag{46} \]

where we use the convention that $0 \ln 0 = 0$. One can show (we will not do so) that this is a strictly convex function achieving its minimum at the vector $(1/n, \ldots, 1/n)$, where all alternatives are chosen with equal probability.

Proposition 11.2 The optimization problem

\[ \max_{p \in \Delta_n} \sum_{i=1}^n p_i \pi(i) - \lambda c(p), \tag{47} \]

with the control cost function from (46) has a unique maximum location with

\[ \forall i \in A : \quad p_i = \frac{\exp(\pi(i)/\lambda)}{\sum_{j \in A} \exp(\pi(j)/\lambda)}, \]

the logit choice probabilities from (45).
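Before turning to the proof, the statement of Proposition 11.2 can be sanity-checked numerically. The sketch below is an illustration with hypothetical payoffs and parameter value (called `lam`): it compares the objective in (47) at the logit probabilities with its value at randomly drawn probability vectors. A random search proves nothing, of course; it only makes the claim plausible.

```python
import math
import random

def entropy_cost(p):
    """Control cost (46): sum of p_i ln p_i, with the convention 0 ln 0 = 0."""
    return sum(q * math.log(q) for q in p if q > 0)

def objective(p, payoffs, lam):
    """Expected payoff minus lam times the control cost, as in (47)."""
    return sum(q * x for q, x in zip(p, payoffs)) - lam * entropy_cost(p)

def logit_probabilities(payoffs, lam):
    m = max(payoffs)
    weights = [math.exp((x - m) / lam) for x in payoffs]
    total = sum(weights)
    return [w / total for w in weights]

def random_distribution(n, rng):
    """A random point of the simplex: normalized exponential draws."""
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]

rng = random.Random(0)
payoffs, lam = [0.0, 1.0, 3.0], 0.7   # hypothetical payoffs and parameter
best = objective(logit_probabilities(payoffs, lam), payoffs, lam)
others = [objective(random_distribution(3, rng), payoffs, lam) for _ in range(1000)]
print(best >= max(others))
```

Substituting the logit probabilities into the objective shows that the maximal value equals `lam * log(sum(exp(x / lam) for x in payoffs))`, which gives an independent check of the computation.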

Proof. The cost function $c$ is strictly convex, so the function $p \mapsto \sum_{i=1}^n p_i \pi(i) - \lambda c(p)$ is strictly concave. Since we maximize a strictly concave, continuous function over a compact set, a maximum exists and is unique. Since the feasible set is entirely defined by linear (in)equalities, the Kuhn-Tucker conditions give necessary and sufficient conditions for a solution to be a maximum. The condition for an interior solution $p \in \Delta_n$, i.e., a solution where $p_i > 0$ for all $i$, is that there exists a Lagrange multiplier $\mu$ associated with the constraint $\sum_{i=1}^n p_i = 1$, such that

\[ \forall i = 1, \ldots, n : \quad \pi(i) - \lambda (\ln p_i + 1) + \mu = 0, \tag{48} \]

since the gradient at $p$ of the goal function $\sum_{i=1}^n p_i \pi(i) - \lambda c(p)$ has $i$-th coordinate

\[ \pi(i) - \lambda \frac{\partial c(p)}{\partial p_i} = \pi(i) - \lambda (\ln p_i + 1). \]

Rewriting (48) gives, for each $i = 1, \ldots, n$:

\[ p_i = c \exp(\pi(i)/\lambda), \]

with $c = \exp((\mu - \lambda)/\lambda)$ a constant. As $\sum_{j=1}^n p_j = 1$, it follows that

\[ \forall i = 1, \ldots, n : \quad p_i = \frac{\exp(\pi(i)/\lambda)}{\sum_{j \in A} \exp(\pi(j)/\lambda)}, \]

as we had to show.

The role of $\lambda$. Let us investigate what happens with the logit choice probabilities in (45) as $\lambda \to \infty$. Consider two alternatives $i, j \in A$, $i \neq j$. Notice that the ratio of their logit choice probabilities equals

\[ \frac{P_A(i)}{P_A(j)} = \frac{\exp(\pi(i)/\lambda)}{\exp(\pi(j)/\lambda)} = \exp\left( \frac{\pi(i) - \pi(j)}{\lambda} \right), \tag{49} \]

which converges to one as $\lambda \to \infty$. But if the ratios of any two choice probabilities converge to one, their limits must be equal; together with the fact that probabilities add up to one, we conclude that the choice probabilities converge to $1/n$ as $\lambda \to \infty$.

To consider the limit behavior as $\lambda \to 0$, suppose that $\pi(i) > \pi(j)$. But then ratio (49) goes to infinity as $\lambda \to 0$. Since we are dealing with probabilities here, which are bounded below by zero and above by one, it must be that $P_A(j) \to 0$. If we let $i$ be the alternative with maximal payoff $\pi(i)$, it follows that the probability of choosing an alternative with less than maximal payoff converges to zero. So in the limit, all probability is restricted to optimal alternatives and it is clear from the definition of the choice probabilities that all of these will be chosen with equal probability.

In summary, the parameter $\lambda$ can be interpreted as a measure of irrationality of the DM: for large values of $\lambda$ the DM chooses by more or less blindly picking any of the alternatives, while for small values of $\lambda$ the choice of the DM is more or less optimal.

11.3. The linear probability model

The idea behind the linear probability model is the same as behind Luce's model and the logit model: the probability of choosing an alternative should be (weakly) increasing in the payoff associated to the alternative:

\[ \pi(i) \geq \pi(j) \;\Rightarrow\; P_A(i) \geq P_A(j). \tag{50} \]

The adjective "linear" indicates that the difference between these two probabilities should be linear in the payoff difference: for a parameter $\lambda > 0$, we require that

\[ P_A(i) - P_A(j) = \lambda (\pi(i) - \pi(j)). \tag{51} \]

Unfortunately, it is not always possible to combine these two properties for large values of $\lambda$. Let's consider a simple example with two alternatives: $A = \{1, 2\}$ and respective payoffs $\pi(1) = 4$, $\pi(2) = 0$. By (50), we want $P_A(1) \geq P_A(2)$ and by (51), we want $P_A(1) - P_A(2) = \lambda(\pi(1) - \pi(2)) = 4\lambda$. If we take $\lambda = 1/8$, this gives $P_A(1) - P_A(2) = 1/2$. The probabilities have to add up to one, so the unique solution is that $P_A(1) = 3/4$ and $P_A(2) = 1/4$. So far, so good. Now take $\lambda = 100$: $P_A(1) - P_A(2) = 4\lambda = 400$. Since $P_A(1)$ and $P_A(2)$ are probabilities between zero and one, making their difference equal to 400 (or, for that matter, any number larger than 1) is simply impossible.

So we have to relax our requirements (50) and (51) somewhat. Unwilling to change (50), let us adapt (51). Indeed, we require the linearity condition whenever possible, but when we run into problems like the one in the example above, we simply require that alternatives with low payoff are chosen with probability zero. Formally, choice probabilities $P_A(i)$ for all alternatives $i \in A$ satisfy the linear probability model with parameter $\lambda > 0$ if the following holds:

\[ \text{if } P_A(i) > 0, \text{ then } P_A(i) - P_A(j) \leq \lambda (\pi(i) - \pi(j)) \text{ for all } j \in A. \tag{52} \]

Let us check to see that (52) gives us what we want:

• If both $i$ and $j$ are chosen with positive probability, we find from (52) that

\[ P_A(i) - P_A(j) \leq \lambda (\pi(i) - \pi(j)) \quad \text{and} \quad P_A(j) - P_A(i) \leq \lambda (\pi(j) - \pi(i)). \]

This implies $P_A(i) - P_A(j) = \lambda (\pi(i) - \pi(j))$, in correspondence with the linearity requirement (51).

• The choice probabilities also satisfy (50): take $i, j \in A$ with $\pi(i) \geq \pi(j)$. We need to show that the choice probabilities in the linear probability model satisfy $P_A(i) \geq P_A(j)$. Discern two cases. First, if $P_A(j) = 0$, it automatically follows that $P_A(i) \geq 0 = P_A(j)$. If $P_A(j) > 0$, application of (52) yields

\[ P_A(j) - P_A(i) \leq \lambda (\pi(j) - \pi(i)) \leq 0, \]

since $\pi(i) \geq \pi(j)$ and $\lambda > 0$.

• Combining the two points above, we see that the choice probabilities are weakly increasing in the associated payoffs. By necessity, we had to set the probability of choosing low-payoff alternatives equal to zero, but those that are chosen with positive probability still satisfy the linearity requirement.
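Condition (52) is straightforward to test for a given vector of choice probabilities. The sketch below is a minimal checker, with hypothetical names and the parameter called `lam`, applied to the two-alternative example with payoffs 4 and 0.

```python
from fractions import Fraction

def satisfies_linear_model(P, payoffs, lam):
    """Check condition (52) for probabilities P, a list of payoffs and parameter lam."""
    n = len(P)
    return all(
        P[i] - P[j] <= lam * (payoffs[i] - payoffs[j])
        for i in range(n) if P[i] > 0   # (52) only restricts alternatives chosen
        for j in range(n)               # with positive probability
    )

payoffs = [4, 0]
print(satisfies_linear_model([Fraction(3, 4), Fraction(1, 4)], payoffs, Fraction(1, 8)))
print(satisfies_linear_model([Fraction(1), Fraction(0)], payoffs, 100))
print(satisfies_linear_model([Fraction(3, 4), Fraction(1, 4)], payoffs, 100))
```

The last call fails because, for the large parameter value, the low-payoff alternative is still chosen with positive probability, which (52) rules out.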

Control costs. The choice probabilities can be derived in the same way as before by making a clever choice of the cost function. Consider the cost function that assigns to every probability vector $p \in \Delta_n$ the squared Euclidean distance to the vector $(1/n, \ldots, 1/n)$ that chooses each of the $n$ alternatives with equal probability:

\[ c(p) = \sum_{i=1}^n \left( p_i - \frac{1}{n} \right)^2. \tag{53} \]

So choosing all alternatives with equal probability gives zero costs and costs increase the further away you go from the vector $(1/n, \ldots, 1/n)$. As in the proof of Proposition 11.2, it follows that:

Proposition 11.3 For each $\lambda > 0$, there is a unique solution to the maximization problem

\[ \max_{p \in \Delta_n} \sum_{i=1}^n p_i \pi(i) - \frac{1}{2\lambda} c(p) \tag{54} \]

with the cost function given in (53). The solution coincides with the choice probabilities in the linear probability model with parameter $\lambda$.

The role of $\lambda$. Comparing the parameter $\lambda$ in the two optimization problems with control costs in (47) and (54), you will notice that they switched roles: large values of $\lambda$ correspond with a large weight assigned to the control cost function in the logit model, but with a small weight assigned to the control cost function in the linear probability model. This change was necessary because I wanted to follow the standard definition of the linear probability model in (52). But the intuition remains the same: $\lambda$ measures (ir)rationality. In the case of the linear probability model: for large values, (52) indicates that the difference in the probability of choosing an optimal alternative (highest $\pi(i)$) and a suboptimal alternative must be large. In the limit, this forces the probability of choosing suboptimal alternatives to zero.

Conversely, for small values of $\lambda$, (52) indicates that the difference in the probability of choosing any two alternatives must be small. Combining this with the fact that probabilities add up to one, this implies that in the limit, all alternatives will be chosen with equal probability.
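The choice probabilities of the linear probability model can also be computed directly. The sketch below rests on an observation that is not spelled out in the notes and should be read as an assumption of this example: completing the square shows that the objective in (54) equals a constant minus $\frac{1}{2\lambda}$ times the squared distance between $p$ and the vector with coordinates $1/n + \lambda \pi(i)$, so by Proposition 11.3 the choice probabilities are the Euclidean projection of that vector onto the simplex. The projection is computed with a standard sort-based routine; the function names and the parameter name `lam` are hypothetical.

```python
from fractions import Fraction

def project_onto_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0, 0
    for k, value in enumerate(u, start=1):
        cumulative += value
        candidate = (cumulative - 1) / Fraction(k)
        if value - candidate > 0:
            theta = candidate      # keep the shift of the largest valid support size
    return [max(x - theta, 0) for x in v]

def linear_probabilities(payoffs, lam):
    """Choice probabilities of the linear probability model with parameter lam."""
    n = len(payoffs)
    return project_onto_simplex([Fraction(1, n) + lam * x for x in payoffs])

# The two-alternative example with payoffs 4 and 0:
print(linear_probabilities([4, 0], Fraction(1, 8)))
print(linear_probabilities([4, 0], 100))
```

For the small parameter value both alternatives keep positive probability and the difference is linear in the payoff difference; for the large value the low-payoff alternative is dropped altogether.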


11.4. Exercises

Exercise 11.1 Prove Proposition 11.3.

Exercise 11.2 Let $A = \{1, 2\}$, $\pi(1) = 4$, $\pi(2) = 0$.

(a) Compute for every $\lambda > 0$ the choice probabilities satisfying the linear probability model.

(b) What happens with the choice probabilities as $\lambda \to 0$? Interpret.

(c) What happens with the choice probabilities as $\lambda \to \infty$? Interpret.

Exercise 11.3 Let $A = \{1, 2, 3\}$, $\pi(1) = 0$, $\pi(2) = 2$, $\pi(3) = 8$.

(a) Compute for each $\lambda > 0$ the choice probabilities in the logit model. Do these choice probabilities, for each $\lambda > 0$, satisfy path independence? What happens with the choice probabilities as $\lambda \to \infty$?

(b) Answer the same questions for the linear probability model.

Exercise 11.4 The penalty function approach: Two of the probabilistic choice models considered above could be rationalized using control cost functions giving a penalty to deviations from uniform randomization. This exercise gives the general argument behind such rationalizations.

A penalty function on $\mathbb{R}^n$ is a function $c : \mathbb{R}^n \to \mathbb{R}_+$. A symmetric penalty function is independent of rearranging the coordinates: for each bijection $r : \{1, \ldots, n\} \to \{1, \ldots, n\}$ and each $x \in \mathbb{R}^n$, it follows that

\[ c(x_1, \ldots, x_n) = c(x_{r(1)}, \ldots, x_{r(n)}). \]

Consider a probabilistic choice model over a finite set $A = \{1, \ldots, n\}$ with $n \geq 2$ elements and payoff function $\pi : A \to \mathbb{R}$. Suppose a decision maker's choice probabilities can be rationalized using a symmetric penalty function: given parameter $\lambda \geq 0$, they solve the problem

\[ P(\lambda) : \quad \max_{p \in \Delta_n} \sum_{i=1}^n p_i \pi(i) - \lambda c(p - (1/n, \ldots, 1/n)). \]

Show that the resulting choice probabilities satisfy the desired monotonicity requirement: if $p$ solves $P(\lambda)$ and $\pi(i) > \pi(j)$, then $p_i \geq p_j$.

Full circle
To make sure you get the big picture, let us, at the end of this course, turn back to where we
started: the overview of the course goals in the preface, and briefly summarize how we achieved
them.

The general framework


A meaningful microfounded model in any branch of economics derives its conclusions from
assumptions about the behavior of individual economic agents. It requires careful answers to the
following questions:

(Q1) What can the agent choose from, i.e., what is the set of feasible alternatives?
(Q2) What does the agent like, i.e., what are the preferences over alternatives?
(Q3) How are the former two combined to make a choice, i.e., to select among alternatives?
We mostly stuck to rational choice: choose from your set of feasible alternatives a most preferred
one.
Sections 1 to 3 provided a general framework for modeling preferences over and choice from
arbitrary sets of alternatives. Important stops along the way included:

Utility theory: utility functions are convenient tools to summarize an agent's preferences. Nevertheless, in relevant cases, no utility function exists (Section 2.3). We provided an exact answer to when preferences can be represented by a utility function (Section 2.4). Moreover, we provided conditions under which utility functions had some additional nice structure. For instance, continuity was studied in Section 2.5, cases where preferences could be expressed in terms of a numeraire in Section 2.6.

Existence of solutions: Proposition 3.1 gave a general answer to a fourth central question:

(Q4) When do most preferred elements exist?

If the weak order reflecting the agent's preferences is upper semicontinuous, the agent can find a most preferred alternative in any nonempty, compact set of options. We regularly appealed to this result to establish that problems faced by economic agents actually have a solution; sometimes (as in Propositions 4.3(a) and 7.1(a)) the result could be applied immediately, but sometimes (as in Propositions 4.5(a), 5.4, and 5.5) a little more caution was needed.

Applications of the general framework

In many of the remaining sections, this general framework was applied to specific economic problems. This required giving the set of alternatives as well as the preferences a specific meaning that seems relevant to the problem under consideration. Moreover, this allowed us to study a fifth central question:

(Q5) How are most preferred elements affected by changes in the agent's environment?

Below, I will go through these applications, summarize how feasible sets and preferences were defined, and, if applicable, indicate where we studied the answer to (Q5).

Application 1: consumer facing budget constraint.

• Feasible alternatives: commodity bundles $x \in \mathbb{R}^L_+$ in a budget set $B(p, w)$.
• Preferences: an arbitrary weak order $\succsim$ over the commodity space $X = \mathbb{R}^L_+$.
• Changes in agent's environment: see Sections 4.2 and 4.5.

Application 2: consumer minimizing expenditure.

• Feasible alternatives: commodity bundles $x \in \mathbb{R}^L_+$ achieving a desired utility level.
• Preferences: defined in terms of the expenses $p \cdot x$ at price vector $p \in \mathbb{R}^L_{++}$.
• Changes in agent's environment: see Section 4.3.

Application 3: producer maximizing profit.

• Feasible alternatives: production plans $y$ in a production set $Y \subseteq \mathbb{R}^L$.
• Preferences: defined in terms of the profit $p \cdot y$ at price vector $p \in \mathbb{R}^L_{++}$.
• Changes in agent's environment: see Section 5.3.

Application 4: producer minimizing costs.

• Feasible alternatives: input vectors $z \in \mathbb{R}^{L-1}_+$ achieving a desired output level.
• Preferences: defined in terms of the costs $w \cdot z$ at input price vector $w \in \mathbb{R}^{L-1}_{++}$.
• Changes in agent's environment: see Section 5.5.

Application 5: expected utility theory.

• Feasible alternatives: compound gambles over a set of deterministic outcomes.
• Preferences: an arbitrary weak order $\succsim$ over the set of compound gambles $G$, under some assumptions resulting in a von Neumann-Morgenstern utility function.
• Changes in agent's environment: see Section 8 on risk attitudes.

Application 6: time preference.

• Feasible alternatives: sequences $c = (c_t)_{t=0}^{\infty}$ of outcomes occurring over time $t$.
• Preferences: come in different forms, for instance:
  1. represented by a utility function of the form $U(c) = \sum_{t=0}^{\infty} \delta(t) u(c_t)$,
  2. in terms of the limit of means criterion,
  3. in terms of the overtaking criterion.

Application 7: probabilistic choice. Although slightly outside the general framework, in some probabilistic choice models like the logit and linear probability model, agents choose probabilities as if they maximize expected payoffs subject to implementation costs:

• Feasible alternatives: choice probabilities assigned to a finite set $A$ of alternatives.
• Preferences: represented by a utility function of the form "expected payoff minus control costs"; see Propositions 11.2 and 11.3.

Beyond these notes

Applications of the general framework abound also in other branches of economics. In macroeconomics, a government may evaluate alternative policies in terms of some social welfare function summarizing the well-being of its citizens. In game theory (the mathematical toolbox used to study interaction between agents, used in many branches of microeconomics, industrial organization, and political economics), players have different strategies to choose from and evaluate them in terms of a preference relation that incorporates the uncertainty they face about, for instance, the choices of the other players.

And what if we leave the realm of rational decision making? Parts of these notes (see, for instance, Exercises 3.4, 3.5, and Section 11) illustrate that as long as we can write down formal postulates about agents' behavior, our mathematical tools allow us to study their consequences in a rigorous and consistent way. This is just the right amount of rationality we need:

"Behavior is procedurally rational when it is the outcome of appropriate deliberation. Its procedural rationality depends on the process that generated it." (Simon, 1976, p. 131)

Behavior is procedurally rational if there is a procedure (a recipe, if you wish) that translates a decision problem to a well-defined choice. Procedurally rational decision makers are not wild maniacs choosing without any logic whatsoever. Paraphrasing Shakespeare:

"Though this be madnesse/Yet there is Method in't." (Hamlet, 1603, Act 2, Sc. 2)

I hope that the tools you acquired during this course will help you to address also other economic problems in a structured way.

Notation
If $X$ is a finite set, $|X|$ denotes its cardinality, i.e., its number of elements.
Weak set inclusion (each element of $A$ is also an element of $B$): $A \subseteq B$.
Strict/proper set inclusion ($A \subseteq B$, but $A \neq B$): $A \subset B$.
Set of positive integers: $\mathbb{N} = \{1, 2, 3, \ldots\}$.
Set of integers: $\mathbb{Z} = \{\ldots, -2, -1, 0, 1, 2, \ldots\}$.
Set of rational numbers: $\mathbb{Q} = \{p/q : p, q \in \mathbb{Z}, q \neq 0\}$.
Set of real numbers: $\mathbb{R}$.
For arbitrary $L \in \mathbb{N}$:
Set of vectors in $\mathbb{R}^L$ with nonnegative coordinates: $\mathbb{R}^L_+ = \{x \in \mathbb{R}^L : x_1, \ldots, x_L \geq 0\}$.
Set of vectors in $\mathbb{R}^L$ with positive coordinates: $\mathbb{R}^L_{++} = \{x \in \mathbb{R}^L : x_1, \ldots, x_L > 0\}$.
Sets like $\mathbb{Q}^L_{++}$ are defined analogously.
For two vectors $x, y \in \mathbb{R}^L$, their inner product is denoted by $x \cdot y = x_1 y_1 + \cdots + x_L y_L$.
Moreover, write $x \geq y$ if $x_i \geq y_i$ for all coordinates $i = 1, \ldots, L$, and $x > y$ if $x_i > y_i$ for all coordinates $i = 1, \ldots, L$. Relations $\leq$ and $<$ are defined analogously.
For $k \in \{1, \ldots, L\}$, $e_k \in \mathbb{R}^L$ denotes the $k$-th standard basis vector with $k$-th coordinate equal to one and all other coordinates equal to zero: $e_k = (0, \ldots, 0, 1, 0, \ldots, 0)$, with the 1 in the $k$-th coordinate.
The vector of ones is denoted by $e = (1, \ldots, 1) \in \mathbb{R}^L$.

References
Anderson, S.P., de Palma, A., Thisse, J.-F., 1992. Discrete choice theory of product differentiation. MIT Press.
Arrow, K.J., 1959. Rational choice functions and orderings. Economica 26, 121-126.
Arrow, K.J., Hahn, F.J., 1971. General competitive analysis. Amsterdam: North-Holland.
Ben-Akiva, M., Lerman, S.R., 1985. Discrete choice analysis. MIT Press.
Cobb, C.W., Douglas, P.H., 1928. A theory of production. American Economic Review (supplement) 18, 139-165.
Debreu, G., 1954. Representation of a preference ordering by a numerical function. In: Decision
Processes. Thrall, Davis, Coombs (eds.), John Wiley, pp. 159-165.
Debreu, G., 1959. Theory of value. Yale University Press.
Debreu, G., 1960. Review of R.D. Luce, Individual Choice Behavior: A Theoretical Analysis.
American Economic Review 50, 186-188.
Debreu, G., 1964. Continuity properties of Paretian utility. International Economic Review 5,
285-293.
Diecidue, E., Wakker, P.P., 2002. Dutch books: avoiding strategic and dynamic complications,
and a comonotonic extension. Mathematical Social Sciences 43, 135-149.
Dubra, J., Echenique, F., 2001. Monotone preferences over information. Topics in Theoretical
Economics 1, article 1.

http://www.bepress.com/bejte/topics/vol1/iss1/art1

Fishburn, P.C., 1970a. Utility theory for decision making. New York: John Wiley & Sons.
Fishburn, P.C., 1970b. Intransitive individual indifference and transitive majorities. Econometrica 38, 482-489.
Fishburn, P.C., 1979. Transitivity. Review of Economic Studies 46, 163-173.
Hildenbrand, W., Kirman, A.P., 1988. Equilibrium analysis. North-Holland.
Jaffray, J.-Y., 1975. Existence of a continuous utility function: An elementary proof. Econometrica 43, 981-983.
Kahneman, D., Tversky, A., 1964. Prospect theory: an analysis of decision under risk. Econometrica 47, 263-291.
Kamke, E., 1950. Theory of sets. New York: Dover Publications.
Kaneko, M., 1976. Note on transferable utility. International Journal of Game Theory 5, 183-185.
Koopmans, T.C., 1960. Stationary ordinal utility and impatience. Econometrica 28, 287-309.
Kreps, D.M., 1990. A course in microeconomic theory. Hertfordshire: Harvester Wheatsheaf.
Loewenstein, G., Prelec, D., 1992. Anomalies in intertemporal choice: evidence and interpretation. Quarterly Journal of Economics 107, 573-597.
Luce, R.D., 1959. Individual choice behavior: A theoretical analysis. Wiley.
Mas-Colell, A., 1985. The theory of general economic equilibrium: A differentiable approach. Cambridge: Cambridge University Press.
Mas-Colell, A., Whinston, M.D., Green, J.R., 1995. Microeconomic theory. Oxford: Oxford University Press.
Mattsson, L.-G., Weibull, J.W., 2002. Probabilistic choice and procedurally bounded rationality.
Games and Economic Behavior 41, 61-78.
Osborne, M.J., Rubinstein, A., 1994. A course in game theory. Cambridge, MA: MIT Press.
Phelps, E.S., Pollak, R.A., 1968. On second-best national saving and game-equilibrium growth.
Review of Economic Studies 35, 201-208.
Pratt, J.W., 1964. Risk aversion in the small and in the large. Econometrica 32, 122-136.

Rabin, M., 2000. Risk aversion and expected-utility theory: a calibration theorem. Econometrica
68, 1281-1292.
Rosenthal, R.W., 1989. A bounded-rationality approach to the study of noncooperative games.
International Journal of Game Theory 18, 273-292.
Rubinstein, A., 2006. Lecture notes in microeconomic theory. Princeton NJ: Princeton University Press. http://arielrubinstein.tau.ac.il/Rubinstein2007.pdf
Simon, H.A., 1955. A behavioral model of rational choice. Quarterly Journal of Economics 69, 99-118.
Simon, H.A., 1976. From substantive to procedural rationality. In: Method and Appraisal in
Economics. Latsis, S.J. (ed.), Cambridge University Press, pp. 129-146.
Starr, R.M., 1997. General equilibrium theory. Cambridge University Press.
Thaler, R., 1981. Some empirical evidence on dynamic inconsistency. Economics Letters 8, 201-207.
Varian, H.R., 1992. Microeconomic analysis. New York: W.W. Norton & Company, 3rd edition.
Voorneveld, M., 2006. Probabilistic choice in games: properties of Rosenthal's t-solutions. International Journal of Game Theory 34, 105-121.
Voorneveld, M., 2007. The possibility of impossible stairways: Tail events and countable player
sets. To appear in Games and Economic Behavior.
Voorneveld, M., 2008. From preferences to Cobb-Douglas utility. SSE/EFI Working Paper Series
in Economics and Finance, No. 701.
Wärneryd, K., 2007. Sexual reproduction and time-inconsistent preferences. Economics Letters
95, 14-16.

Suggested solutions
These are (sometimes short) solutions to most exercises in the lecture notes. In solutions to the
home assignments and exam questions, you are expected to start from relevant definitions, and
clearly deduce and motivate your answers. Suggestions for improvements (and corrections of
potential mistakes?) are welcome!

Section 1
Exercise 1.1
(a): Each pair of words can be arranged in alphabetical order, so $\succsim$ is complete. Moreover, if word $a$ is found before or at the same place as (in case the words are identical) word $b$ in the dictionary, and word $b$ is found before or at the same place as word $c$ in the dictionary, then word $a$ is found before or at the same place as word $c$ in the dictionary. Conclude that $\succsim$ is transitive.

(b): The binary relation defined by "knows" is not necessarily complete or transitive. A violation of completeness occurs if there exist people who are unfamiliar with each other. Also violations of transitivity are common: I know my wife, my wife knows her boss, but I do not know my wife's boss.
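
On a finite set, completeness and transitivity can be checked mechanically. The following is a minimal sketch, assuming a hypothetical three-person "knows" relation and a hypothetical three-word dictionary; the helper names `is_complete` and `is_transitive` are illustrative and not part of the notes.

```python
from itertools import product

def is_complete(X, R):
    """R is complete if, for all x, y in X, (x, y) or (y, x) lies in R."""
    return all((x, y) in R or (y, x) in R for x, y in product(X, repeat=2))

def is_transitive(X, R):
    """R is transitive if (x, y) in R and (y, z) in R imply (x, z) in R."""
    return all((x, z) in R
               for x, y, z in product(X, repeat=3)
               if (x, y) in R and (y, z) in R)

# Hypothetical "knows" relation: everyone knows themselves,
# I know my wife (and vice versa), my wife knows her boss (and vice versa).
people = ["me", "wife", "boss"]
knows = {(p, p) for p in people} | {("me", "wife"), ("wife", "me"),
                                    ("wife", "boss"), ("boss", "wife")}

# Alphabetical order on a few hypothetical words, as in part (a).
words = ["apple", "bread", "cider"]
alphabetical = {(a, b) for a, b in product(words, repeat=2) if a <= b}
```

In this sketch the alphabetical order passes both checks, while the "knows" relation fails both: "me" and "boss" are unrelated, and the chain me, wife, boss is not closed.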

Exercise 1.2
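(a): [Reflexivity of $\sim$] Let $x \in X$. By completeness of $\succsim$: $x \succsim x$ and (simply changing the order of writing) $x \precsim x$. By definition of $\sim$: $x \sim x$. Conclude that $\sim$ is reflexive.
[Symmetry of $\sim$] Let $x, y \in X$ with $x \sim y$. By definition of $\sim$, $x \succsim y$ and $y \succsim x$. But this is also the definition of $y \sim x$. Conclude that $\sim$ is symmetric.
[Transitivity of $\sim$] Let $x, y, z \in X$ have $x \sim y$ and $y \sim z$. By definition of $\sim$, this means that $x \succsim y$, $y \succsim x$, $y \succsim z$, $z \succsim y$. By transitivity of $\succsim$, $x \succsim y$ and $y \succsim z$ give $x \succsim z$. Similarly, $z \succsim y$ and $y \succsim x$ give $z \succsim x$. Since $x \succsim z$ and $z \succsim x$: $x \sim z$. Conclude that $\sim$ is transitive.
(b): [Irreflexivity of $\succ$] Let $x \in X$. By definition of $\succ$, $x \succ x$ would require that $x \succsim x$ but not $x \succsim x$, a contradiction. Conclude that $\succ$ is irreflexive.
[Asymmetry of $\succ$] Let $x, y \in X$ with $x \succ y$. By definition of $\succ$, $x \succsim y$ but not $y \succsim x$. By definition of $\succ$, not $y \succ x$. Conclude that $\succ$ is asymmetric.
[Transitivity of $\succ$] Let $x, y, z \in X$ have $x \succ y$ and $y \succ z$. By definition of $\succ$, this means that $x \succsim y$ but not $y \succsim x$ and that $y \succsim z$, but not $z \succsim y$. By transitivity of $\succsim$, $x \succsim y$ and $y \succsim z$ give $x \succsim z$. It is not true that $z \succsim x$. If it were, transitivity of $\succsim$ with $z \succsim x$ and $x \succsim y$ would imply $z \succsim y$, contradicting $y \succ z$. Since $x \succsim z$, but not $z \succsim x$: $x \succ z$. Conclude that $\succ$ is transitive.
(c): Let $x, y, z \in X$ have $x \sim y$ and $y \succsim z$. By definition of $\sim$, this implies that $x \succsim y$. As $x \succsim y$ and $y \succsim z$, transitivity of $\succsim$ gives $x \succsim z$.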

Exercise 1.3
(a): Assume $\succsim$ is strongly monotonic. Let $k \in \{1, \ldots, L\}$ be one of the coordinates, let $x \in X$, and $\varepsilon > 0$. Then $x + \varepsilon e_k \geq x$ and $x + \varepsilon e_k \neq x$, so by strong monotonicity, $x + \varepsilon e_k \succ x$. Conclude that $\succsim$ is strongly monotonic in coordinate $k$.
Now assume that $\succsim$ is strongly monotonic in each of its coordinates and transitive. Let $x, y \in X$ with $x \geq y$ and $x \neq y$. To show: $x \succ y$.
Starting with $x$, change the coordinates one by one to those of $y$. Formally, let $z(0) = x$ and, for each $k \in \{1, \ldots, L\}$, define $z(k) = x + \sum_{\ell=1}^{k} (y_\ell - x_\ell) e_\ell$. Then either $z(k) = z(k-1)$ if the $k$-th coordinates of $x$ and $y$ are the same, or $z(k-1) \succ z(k)$ by strong monotonicity in the $k$-th coordinate. Since $x \neq y$, the latter case occurs at least once. By transitivity, we find that $x = z(0) \succ z(L) = y$.

(b): The preference relation $\succsim$ on $\mathbb{R}^2_+$ with

for all $x, y \in \mathbb{R}^2_+$: $x \succsim y \Leftrightarrow x_k > y_k$ for exactly one coordinate $k \in \{1, 2\}$

is strongly monotonic in coordinate $k$ for both $k = 1$ and $k = 2$, but not strongly monotonic: $(2, 2)$ is not strictly preferred to $(1, 1)$. Notice: in line with (a), relation $\succsim$ is not transitive.

(c): No. The point $(0, \ldots, 0) \in \mathbb{R}^L_+$ cannot be improved upon: since less is better, $(0, \ldots, 0) \succ x$ for every $x \in \mathbb{R}^L_+$ with $x \neq (0, \ldots, 0)$.

(d): Yes. Notice that the issue above, that improvements beyond the zero vector are impossible if one is constrained to vectors with nonnegative coordinates, disappears. Let $x \in \mathbb{R}^L$ and $\varepsilon > 0$. Define $y = x - \frac{\varepsilon}{2} e_1 \in \mathbb{R}^L$. Then $\|x - y\| = \varepsilon/2 < \varepsilon$ and $y \leq x$, with strict inequality in the first coordinate. Since less is better, $y \succ x$. Conclude that $\succsim$ is locally nonsatiated.
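
The relation in (b) is easy to experiment with. The sketch below encodes "exactly one coordinate is strictly larger" and evaluates it at a few hypothetical bundles; the function names are illustrative only.

```python
def weakly_prefers(x, y):
    """Relation from (b): x is weakly preferred to y iff x_k > y_k
    for exactly one coordinate k."""
    return sum(xk > yk for xk, yk in zip(x, y)) == 1

def strictly_prefers(x, y):
    """Strict part: x weakly preferred to y, but not the other way around."""
    return weakly_prefers(x, y) and not weakly_prefers(y, x)

# Raising a single coordinate is a strict improvement ...
x = (1.0, 1.0)
improved_first = (1.5, 1.0)
improved_second = (1.0, 1.5)

# ... but raising both coordinates is not, and transitivity fails:
# a is weakly preferred to b, b to c, but a is not weakly preferred to c.
a, b, c = (2, 2), (1, 3), (1, 1)
```

The triple `a, b, c` is one concrete violation of transitivity, which is why part (a) does not apply to this relation.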

Exercise 1.4
(a): The preference relation $\succsim$ on $\mathbb{R}^2_+$ with

$x \succsim y \Leftrightarrow (x_1 + 1)(x_2 + 1) \geq (y_1 + 1)(y_2 + 1)$

is strongly monotonic in coordinate 1, but not quasilinear in coordinate 1: let $x = (1, 2)$ and $y = (2, 1)$. Then

$(x_1 + 1)(x_2 + 1) = (y_1 + 1)(y_2 + 1) = 6$,

so $x \sim y$. Increase the first coordinate of $x$ and $y$ by $\varepsilon > 0$. Then

$(x_1 + \varepsilon + 1)(x_2 + 1) = 3(2 + \varepsilon) > 2(3 + \varepsilon) = (y_1 + \varepsilon + 1)(y_2 + 1)$,

so $x + \varepsilon e_1 \succ y + \varepsilon e_1$. Quasilinearity would require that the indifference remains unaffected.

(b): The preference relation $\succsim$ on $\mathbb{R}^2_+$ where all alternatives are equivalent with each other ($x \succsim y$ for all $x, y$, represented by a constant utility function) is trivially quasilinear but not strongly monotonic in coordinate 1.

(c): Same preference relation as in (b).

(d): The preference relation on $\mathbb{R}^2_+$ with

$x \succsim y \Leftrightarrow 4x_1 + 3x_2^2 \geq 4y_1 + 3y_2^2$

satisfies all three monotonicity properties, but is not homothetic. For instance, $(1, 0) \succ (0, 1)$, as $4 \cdot 1 + 3 \cdot 0^2 > 4 \cdot 0 + 3 \cdot 1^2$, but $2(1, 0) \prec 2(0, 1)$, as $4 \cdot 2 + 3 \cdot 0^2 < 4 \cdot 0 + 3 \cdot 2^2$.

Exercise 1.5
(a): Let $x, y \in \mathbb{R}^L_+$ have $x \geq y$. For each $n \in \mathbb{N}$, $x^n = x + (1/n, \ldots, 1/n) \in \mathbb{R}^L_+$ satisfies $x^n > y$, so $x^n \succsim y$ (in fact, even $x^n \succ y$). Letting $n \to \infty$, continuity implies that $\lim_{n \to \infty} x^n = x \succsim y$.
(b): Let $x, y \in \mathbb{R}^L_+$ have $x > y$. Then $\min\{x_1, x_2\} > \min\{y_1, y_2\}$, so $x \succsim y$, but not $y \succsim x$, i.e., $x \succ y$. Let $x = (2, 1)$ and $y = (1, 1)$. In both cases, you can only mix one unit of drink, but $x$ wastes one unit of the first ingredient, so even though $x \geq y$, $x \sim y$.
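
The counterexamples in Exercises 1.4(a), 1.4(d) and 1.5(b) come down to a few lines of arithmetic. A minimal sketch, using the utility functions $(x_1+1)(x_2+1)$, $4x_1 + 3x_2^2$ and $\min\{x_1, x_2\}$ that represent these preferences; the helper names and the choice $\varepsilon = 0.5$ in the usage below are illustrative assumptions.

```python
def u_a(x):
    """Utility behind 1.4(a): (x1 + 1)(x2 + 1)."""
    return (x[0] + 1) * (x[1] + 1)

def u_d(x):
    """Utility behind 1.4(d): 4*x1 + 3*x2**2."""
    return 4 * x[0] + 3 * x[1] ** 2

def u_mix(x):
    """Utility behind 1.5(b): min{x1, x2}."""
    return min(x)

def scale(t, x):
    """Scalar multiple t*x of a bundle x."""
    return tuple(t * xi for xi in x)

def add_to_first(x, eps):
    """Bundle x with eps added to its first coordinate."""
    return (x[0] + eps,) + tuple(x[1:])
```

For example, `u_a((1, 2))` and `u_a((2, 1))` coincide, but after `add_to_first(..., 0.5)` the first bundle is strictly better; and `u_d` ranks `(1, 0)` above `(0, 1)` while ranking `scale(2, (1, 0))` below `scale(2, (0, 1))`.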

Exercise 1.6
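(a): Assume the first definition of convexity holds. Let $y \in X$. To show: $\{x \in X : x \succsim y\}$ is a convex set.
Let $z, z' \in \{x \in X : x \succsim y\}$ and $\lambda \in [0, 1]$. Using completeness of $\succsim$, we may assume w.l.o.g. that $z \succsim z'$. By convexity, $\lambda z + (1 - \lambda) z' \succsim z' \succsim y$, so $\lambda z + (1 - \lambda) z' \succsim y$ by transitivity of $\succsim$.
Conversely, assume the second definition of convexity holds. Let $x, y \in X$ with $x \succsim y$ and $\lambda \in [0, 1]$. To show: $\lambda x + (1 - \lambda) y \succsim y$.
Elements $x$ and (by completeness) $y$ both lie in the set $\{x' \in X : x' \succsim y\}$, which is convex by assumption, so it also contains $\lambda x + (1 - \lambda) y$. Conclude that $\lambda x + (1 - \lambda) y \succsim y$.
(b): Consider the preference relation $\succsim$ on $\mathbb{R}$ with

for all $x, y \in \mathbb{R}$: $x \succsim y \Leftrightarrow x \geq 0 > y$.

For each $y \in \mathbb{R}$:

\[
\{x \in \mathbb{R} : x \succsim y\} =
\begin{cases}
\emptyset & \text{if } y \geq 0, \\
\mathbb{R}_+ & \text{if } 0 > y,
\end{cases}
\]

is convex. Therefore, it satisfies the second convexity condition (convex upper contour sets). However, if $x = 1$, $y = -3$, $\lambda = 1/2$, then $x \succsim y$, but not $\lambda x + (1 - \lambda) y \succsim y$, in violation of the first convexity definition.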
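
The numbers in (b) can be verified directly. A small sketch, taking $x = 1$, $y = -3$ and $\lambda = 1/2$, with the relation "$x \succsim y$ iff $x \geq 0 > y$"; the function name is illustrative.

```python
def weakly_prefers(x, y):
    """Relation from (b): x is weakly preferred to y iff x >= 0 > y."""
    return x >= 0 > y

x, y, lam = 1.0, -3.0, 0.5
mixture = lam * x + (1 - lam) * y  # convex combination of x and y
```

Here `x` is weakly preferred to `y`, but the mixture is negative and therefore not weakly preferred to anything.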

Section 2
Exercise 2.1
(a): [Transitivity] Let $x, y, z \in \mathbb{R}$ satisfy $x \succsim y$, $y \succsim z$. By definition, $x \geq y + 1$ and $y \geq z + 1$, so $x \geq y + 1 \geq z + 2 \geq z + 1$, so $x \succsim z$.

[Violation of completeness] Completeness requires in particular that for each $x \in \mathbb{R}$: $x \succsim x$, i.e., that $x \geq x + 1$. Clearly, this is not true.

(b): [Prop. 2.1(b) satisfied] Let $x, y \in \mathbb{R}$. If $x \succ y$, then $x \succsim y$, so $x \geq y + 1$. Therefore, $u(x) = x \geq y + 1 > y = u(y)$. Moreover, there are no $x, y \in X$ with $x \sim y$ (as this would require $x \geq y + 1$ and $y \geq x + 1$), so the second condition is vacuous.
[Prop. 2.1(a) violated] $u$ does not represent $\succsim$, since $\succsim$ is not complete and the order induced by $u$ is.
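
The claims about the relation "$x \succsim y$ iff $x \geq y + 1$" can be spot-checked on a finite grid. This is only a sanity check on hypothetical sample points, not a proof.

```python
from itertools import product

def weakly_prefers(x, y):
    """Relation from Exercise 2.1: x is weakly preferred to y iff x >= y + 1."""
    return x >= y + 1

grid = [k / 2 for k in range(-6, 7)]  # sample points -3.0, -2.5, ..., 3.0

# Transitivity holds for every triple on the grid.
transitive_on_grid = all(
    weakly_prefers(x, z)
    for x, y, z in product(grid, repeat=3)
    if weakly_prefers(x, y) and weakly_prefers(y, z)
)

# Completeness fails already for pairs (x, x).
reflexive_somewhere = any(weakly_prefers(x, x) for x in grid)

# u(x) = x increases along the relation, as in Prop. 2.1(b).
utility_respects_strict = all(
    x > y for x, y in product(grid, repeat=2) if weakly_prefers(x, y)
)
```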

Exercise 2.2
(a): Suppose the collection of jumps in $U$ is uncountable. Consider two distinct jumps $(u_1, u_2)$ and $(v_1, v_2)$. The intervals $(u_1, u_2)$ and $(v_1, v_2)$ are disjoint by definition of a jump. Moreover, each such interval contains a rational number, necessarily distinct from the one in the other interval, since these intervals are disjoint. Therefore, there is an injective function from the uncountable set of jumps to the countable set of rational numbers, a contradiction.

(b): $C$ is the union of two countable sets $J$ and $R$ and therefore countable itself. Let $x, y \in X$ with $x \succ y$. To show: there are $c_1, c_2 \in C$ with $x \succsim c_1 \succ c_2 \succsim y$.
Case 1: $(u(y), u(x))$ is a jump in $U$. By definition of $J$, there are points $c_1, c_2 \in J \subseteq C$ with utility $u(c_1) = u(x)$, $u(c_2) = u(y)$. Hence $x \sim c_1 \succ c_2 \sim y$, as in the requirement for Jaffray order-separability.
Case 2: $(u(y), u(x))$ is not a jump in $U$. Then $(u(y), u(x)) \cap U \neq \emptyset$. By definition of $R$, there is a $c \in R \subseteq C$ with $u(c) \in (u(y), u(x))$. Now apply the reasoning so far to $(u(c), u(x))$. If it is a jump in $U$, Case 1 says that there are $c_1, c_2 \in C$ with $x \sim c_1 \succ c_2 \sim c \succ y$, as in the requirement for Jaffray order-separability. If it is not a jump, repeating the construction of Case 2 says that there is a $c' \in C$ with $u(c') \in (u(c), u(x))$, so that $x \succ c' \succ c \succ y$, as in the requirement for Jaffray order-separability.

(c): Let $x, y \in X$. If $x \succ y$, there exist, by Jaffray order-separability, $c_1, c_2 \in C$ with $x \succsim c_1 \succ c_2 \succsim y$. Therefore, $\{c \in C : c \precsim x\} \supsetneq \{c \in C : c \precsim y\}$, as the former set includes $c_1$, whereas the latter doesn't. Conclude that $u(x) - u(y) \geq 2^{-n(c_1)} > 0$. If $x \sim y$, then $\{c \in C : c \precsim x\} = \{c \in C : c \precsim y\}$, so $u(x) = u(y)$.

Exercise 2.3
(a): True. By definition of a continuous function, pre-images of open sets are open sets. Consequently, for each $x \in X$, the sets

\[
\{y \in X : y \prec x\} = u^{-1}(\underbrace{(-\infty, u(x))}_{\text{open}})
\quad \text{and} \quad
\{y \in X : y \succ x\} = u^{-1}(\underbrace{(u(x), \infty)}_{\text{open}})
\]

are open sets.

(b): False. The usual "greater than or equal to" order $\geq$ on $\mathbb{R}$ is represented by the continuous utility function $u : \mathbb{R} \to \mathbb{R}$ with $u(x) = x$ and hence, by (a), continuous. However, any strictly increasing function $u : \mathbb{R} \to \mathbb{R}$ represents $\geq$, including the discontinuous function

\[
u(x) =
\begin{cases}
x & \text{if } x < 0, \\
x + 1 & \text{if } x \geq 0.
\end{cases}
\]

Exercise 2.4
- No. Lexicographic preferences (modified in such a way that you start comparing the second coordinates, then the first) on $\mathbb{R}^2_+$ constitute an example where preferences cannot even be represented by a utility function. Let $x, y \in \mathbb{R}^2_+$ have $x_2 > y_2$. The modified lexicographic preference starts by looking at these second coordinates, so no matter how much money you add to the first coordinate of $y$, you will strictly prefer $x$.

- Here is an example where preferences can be represented by a utility function. It makes having a second coordinate below one so bad, that you can never compensate this with money and make it look as nice as an alternative whose second coordinate is at least one. The preference relation $\succsim$ on $\mathbb{R}^2_+$ represented by the utility function

\[
u(x) =
\begin{cases}
\Phi(x_1) + 1 & \text{if } x_2 \geq 1, \\
\Phi(x_1) & \text{if } x_2 < 1,
\end{cases}
\]

where $\Phi : \mathbb{R} \to (0, 1)$ is strictly increasing (like the cdf of a standard normal distribution), satisfies all properties in Proposition 2.11, except (8).

- Under additional assumptions (like continuity, monotonicity), the answer is yes. See Rubinstein (2006, Lecture 4).

Exercise 2.5
(a): Consider $(a, 0)$ and $(a', 0)$ in $X$. Either $(a, 0) \sim (a', 0)$, in which case we take $m = m' = 0$, or one of the alternatives is strictly preferred over the other, w.l.o.g. $(a, 0) \succ (a', 0)$. In the latter case, invoke the first property to conclude that there is an amount of money $m^*$ such that $(a, 0) \sim (a', m^*)$. Take $m = 0$, $m' = m^*$.
(b): W.l.o.g., $m \leq w$. By the third property with $c = w - m$:

$(a', w') \sim (a, w) = (a, m + (w - m)) \sim (a', m' + (w - m))$,

so $(a', w') \sim (a', m' + (w - m))$ by transitivity of $\sim$. But then $w' = m' + (w - m)$ by strong monotonicity in money.

(c): Let $(a, m), (a', m') \in X$. To show: $(a, m) \succsim (a', m')$ if and only if $u(a, m) \geq u(a', m')$.
By the first two properties, there are unique amounts of money $m_1, m_2 \geq 0$ such that $(a, m) \sim (a^*, m_1)$ and $(a', m') \sim (a^*, m_2)$. By definition of $v$, we find that

$u(a, m) = (m_1 - m) + m = m_1$ and, similarly, that $u(a', m') = m_2$.    (55)

Therefore,

\[
(a, m) \succsim (a', m') \Leftrightarrow (a^*, m_1) \succsim (a^*, m_2)
\Leftrightarrow m_1 \geq m_2
\Leftrightarrow u(a, m) \geq u(a', m'),
\]

where the first equivalence follows from the fact that $(a, m) \sim (a^*, m_1)$ and $(a', m') \sim (a^*, m_2)$, the second equivalence from strong monotonicity in money, and the final one from (55).

Exercise 2.6
(a): Let $r \in \mathbb{R}$. If $X_u(r)$ contains at most one element, it is convex. If it contains two or more, let $x, y \in X_u(r)$ and let $\lambda \in (0, 1)$. To show: $\lambda x + (1 - \lambda) y \in X_u(r)$.
Without loss of generality, assume that $x \succsim y$, so that $u(x) \geq u(y) \geq r$. By convexity of $\succsim$: $\lambda x + (1 - \lambda) y \succsim y$, so $u(\lambda x + (1 - \lambda) y) \geq u(y) \geq r$, i.e., $\lambda x + (1 - \lambda) y \in X_u(r)$.
(b): Let's do the quasiconcavity part; strict quasiconcavity proceeds similarly. Assume $u : X \to \mathbb{R}$ is quasiconcave. Let $y \in X$. To show: $\{x \in X : x \succsim y\}$ is a convex set.
By definition, $\{x \in X : x \succsim y\} = \{x \in X : u(x) \geq u(y)\} = X_u(r)$, with $r = u(y)$. The latter set is convex by the definition of a quasiconcave function under (a).

(c): A function $u$ on a convex domain $X$ is concave if its subgraph

$\mathrm{subgraph}(u) = \{(x, y) \in X \times \mathbb{R} : y \leq u(x)\}$

is a convex set. Consider the weak order $\succsim$ on $X = \mathbb{R}$ represented by the utility function

\[
u(x) =
\begin{cases}
0 & \text{if } x \leq 0, \\
1 & \text{if } x > 0.
\end{cases}
\]

This preference relation is convex, as, for each $y \in X$, the upper contour sets are convex:

\[
\{x \in X : x \succsim y\} =
\begin{cases}
\mathbb{R} & \text{if } y \leq 0, \\
(0, \infty) & \text{if } y > 0.
\end{cases}
\]

Suppose $v : X \to \mathbb{R}$ were a concave utility function representing $\succsim$. By definition, $(-1, v(-1))$ and $(1, v(1))$ are elements of $\mathrm{subgraph}(v)$. Take $\lambda = 1/2$ and consider the convex combination

\[
\tfrac{1}{2}(-1, v(-1)) + \tfrac{1}{2}(1, v(1)) = \left(0, \tfrac{1}{2} v(-1) + \tfrac{1}{2} v(1)\right).
\]

Since $v(-1) < v(1)$, this point does not lie in the subgraph of $v$:

\[
v(0) = v(-1) < \tfrac{1}{2} v(-1) + \tfrac{1}{2} v(1).
\]
Exercise 2.7
(a):
- For all $n \in \mathbb{N}$, $f(nu) = n f(u)$ by additivity and induction on $n$.
- $f(0) = f(0 + 0) = f(0) + f(0)$, so $f(0) = 0$. Hence $f(0u) = 0 f(u)$.
- For all $n \in \mathbb{N}$, $f(-nu) = -n f(u)$: indeed, $0 = f(0) = f(nu + (-nu)) = f(nu) + f(-nu)$, so $f(-nu) = -f(nu) = -n f(u)$.
- So $f(xu) = x f(u)$ for all $x \in \mathbb{Z}$.
- For $x \in \mathbb{Q}$, write $x = p/q$ for some $p, q \in \mathbb{Z}$, $q \neq 0$. Rewriting $xu = (p/q)u$ gives $q(xu) = pu$. Hence $f(q(xu)) = f(pu)$. By the above, $q f(xu) = p f(u)$, so $f(xu) = (p/q) f(u) = x f(u)$.

(b): If $f$ is not linear, there are $x, y \in \mathbb{R} \setminus \{0\}$ with $f(x)/x \neq f(y)/y$. Hence, vectors $a = (x, f(x))$ and $b = (y, f(y))$ are linearly independent: vectors $\alpha a + \beta b$ with $\alpha, \beta \in \mathbb{R}$ span $\mathbb{R}^2$. So vectors $\alpha a + \beta b$ with $\alpha, \beta \in \mathbb{Q}$ are dense in $\mathbb{R}^2$. The latter vectors are in the graph of $f$: for $\alpha, \beta \in \mathbb{Q}$, (a) implies that

\[
(\alpha x + \beta y, f(\alpha x + \beta y)) = (\alpha x + \beta y, f(\alpha x) + f(\beta y)) = (\alpha x + \beta y, \alpha f(x) + \beta f(y)) = \alpha a + \beta b.
\]

(c):
- For each $i \in \{1, \ldots, n\}$, define $f_i : \mathbb{R} \to \mathbb{R}$ as follows: for all $x_i \in \mathbb{R}$: $f_i(x_i) = F(x_i e_i)$.
- Applying additivity of $F$ $(n - 1)$ times gives, for each $x \in \mathbb{R}^n$, that
\[
F(x) = F\left(\sum_{i=1}^n x_i e_i\right) = \sum_{i=1}^n F(x_i e_i) = \sum_{i=1}^n f_i(x_i).
\]
- To see that each $f_i$ must be additive, let $x_i, y_i \in \mathbb{R}$. By additivity of $F$:
\[
f_i(x_i + y_i) = F(x_i e_i + y_i e_i) = F(x_i e_i) + F(y_i e_i) = f_i(x_i) + f_i(y_i).
\]

Section 3
Exercise 3.2
By continuity of $f$, the weak order $\succsim$ on $X$ with

for all $x, y \in X$: $x \succsim y \Leftrightarrow f(x) \geq f(y)$

has open lower contour sets: for each $x \in X$,

$L(x) = \{y \in X : y \prec x\} = \{y \in X : f(y) < f(x)\} = f^{-1}((-\infty, f(x)))$

is the pre-image of an open interval. By Proposition 3.1, $X$ contains a best element. By definition of $\succsim$, this best element is a maximum of $f$. Existence of a minimum can be established by applying the proposition to the weak order $\succsim'$ with

for all $x, y \in X$: $x \succsim' y \Leftrightarrow f(x) \leq f(y)$.

Exercise 3.3
(a): Assume $(X, \mathcal{B}, C)$ is rationalizable by the weak order $\succsim$ on $X$. Let $A, B \in \mathcal{B}$, $x, y \in A \cap B$, $x \in C(A)$, $y \in C(B)$. To show: $x \in C(B) = \{z \in B : z \succsim z' \text{ for all } z' \in B\}$.
Since $y \in A$ and $x \in C(A) = \{z \in A : z \succsim z' \text{ for all } z' \in A\}$: $x \succsim y$. Let $z' \in B$. Since $y \in C(B)$: $y \succsim z'$. Using $x \succsim y$ and transitivity of $\succsim$: $x \succsim z'$. So $x \succsim z'$ for all $z' \in B$, i.e., $x \in C(B)$.

(b): No. Consider the choice structure with

$X = \{a, b, c, d\}$, $\mathcal{B} = \{\{a, b, c\}, \{b, c, d\}\}$, $C(\{a, b, c\}) = \{b\}$, $C(\{b, c, d\}) = \{c\}$.

It trivially satisfies IIA: there are no distinct sets $A, B \in \mathcal{B}$ with $A \subseteq B$. It does not satisfy WARP: in the first problem, $b$ is revealed at least as good as $c$, in the second $c$ is revealed at least as good as $b$. So $b$ should have been contained in $C(\{b, c, d\})$.

(c): No. The choice structure in (b) satisfies IIA, but is not rationalizable. Suppose, to the contrary, that $\succsim$ rationalizes it. Since $C(\{a, b, c\}) = \{b\}$, we must have that $b \succsim c$ and $b \succsim a$. Since $C(\{b, c, d\}) = \{c\}$, we must have that $c \succsim b$ and $c \succsim d$. But then $b \sim c$, so $c \sim b \succsim a$ implies $c \succsim a$. But then $c \succsim y$ for all $y \in \{a, b, c\}$, so $c$ should have been included in $C(\{a, b, c\})$.
(d): No. Consider the choice structure with $X = \{a, b, c\}$, $\mathcal{B} = \{\{a, b\}, \{b, c\}, \{a, c\}\}$, $C(\{a, b\}) = \{a\}$, $C(\{b, c\}) = \{b\}$, $C(\{a, c\}) = \{c\}$. As distinct choice sets have only one point in common, WARP is trivially satisfied. It is not rationalizable, as a rationalizing $\succsim$ should satisfy $a \succ b$, $b \succ c$, $c \succ a$, in violation of transitivity.
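
For small choice structures, WARP, IIA and rationalizability can be checked by brute force. The sketch below encodes the two choice structures from (b) and (d). It assumes IIA in the form "if $A \subseteq B$ and $C(B) \cap A \neq \emptyset$, then $C(A) = C(B) \cap A$"; all function names and the helper `structure` are illustrative and not part of the notes.

```python
from itertools import product

def satisfies_warp(budgets, C):
    """WARP: if x, y lie in A and B, x in C(A) and y in C(B), then x in C(B)."""
    for A, B in product(budgets, repeat=2):
        for x, y in product(A & B, repeat=2):
            if x in C[A] and y in C[B] and x not in C[B]:
                return False
    return True

def satisfies_iia(budgets, C):
    """IIA: if A is a subset of B and C(B) meets A, then C(A) = C(B) & A."""
    for A, B in product(budgets, repeat=2):
        if A <= B and C[B] & A and C[A] != C[B] & A:
            return False
    return True

def is_rationalizable(X, budgets, C):
    """Brute force: try every ranking u: X -> {0, ..., |X|-1}. Each weak
    order on a finite X is induced by at least one such ranking."""
    X = sorted(X)
    for ranks in product(range(len(X)), repeat=len(X)):
        u = dict(zip(X, ranks))
        if all(C[B] == frozenset(z for z in B if u[z] == max(u[w] for w in B))
               for B in budgets):
            return True
    return False

def structure(choices):
    """Helper: turn {budget string: chosen string} into frozenset form."""
    C = {frozenset(B): frozenset(c) for B, c in choices.items()}
    return list(C), C

budgets_b, C_b = structure({"abc": "b", "bcd": "c"})           # part (b)
budgets_d, C_d = structure({"ab": "a", "bc": "b", "ac": "c"})  # part (d)
```

With these definitions, the structure from (b) passes the IIA check but fails WARP and is not rationalizable, while the structure from (d) passes WARP yet is not rationalizable either. Enumerating rankings is exponential in $|X|$, so this approach is only meant for toy examples.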

Exercise 3.4
(a): [WARP satisfied] Let $A, B \in \mathcal{B}$, $x, y \in A \cap B$, $x \in C(A)$, and $y \in C(B)$. To show: $x \in C(B)$.
We will simply show that $x = y$. By definition of $C$:

for all $B \in \mathcal{B}$: if $B$ contains a satisfactory alternative, $C(B)$ selects one of them.    (56)

Distinguish two cases:

Case 1: $v(x) < r$. Then also $v(y) < r$ by (56). Now $x \in C(A)$ implies that $x$ is the largest element of $A$. In particular, since $y \in A$: $y \leq x$. Similarly, $y \in C(B)$ implies that $x \leq y$. So $x = y \in C(B)$.
Case 2: $v(x) \geq r$. Then also $v(y) \geq r$ by (56). Now $x \in C(A)$ implies that $x$ is the smallest satisfactory element of $A$. In particular, since $y \in A$: $x \leq y$. Similarly, $y \in C(B)$ implies that $y \leq x$. So $x = y \in C(B)$.

[IIA satisfied] WARP implies IIA.

[A rationalizing weak order] Some conditions need to be satisfied: a satisfactory element is always preferred to a nonsatisfactory one. Among nonsatisfactory alternatives, the largest is chosen, so there having a high index is preferable. Among satisfactory alternatives, the smallest is chosen, so there having a low index is preferable. One weak order (verify!) rationalizing the choice structure is obtained by writing down (from worst to best) all nonsatisfactory alternatives from smallest to largest, then all satisfactory alternatives from largest to smallest.

(b): [IIA violated] For each $B \in \mathcal{B}$, let $x^*(B)$ be your partner's most preferred element of $B$. By definition of $C$: $C(B) = B \setminus \{x^*(B)\}$ for each $B \in \mathcal{B}$ with more than one element. Take $B = X$, $A = C(B)$. Both sets lie in $\mathcal{B}$ and $A \subseteq B$. Moreover, $C(B) \cap A = C(B) \neq \emptyset$. IIA would imply that $C(A) = C(B) \cap A = C(B)$, but $C(A) = A \setminus \{x^*(A)\} \subsetneq A = C(B)$, a contradiction.

[WARP violated] WARP implies IIA and is therefore violated as well.

[Rationalizability] As WARP is violated, the choice structure is not rationalizable.

Exercise 3.5
(a): In $B_1$, the first commodity has the highest price $p_1 = 2$, so spending $w = 2$ wealth on the first commodity gives $C(B_1) = \{(1, 0)\}$. Similarly, $C(B_2) = \{(0, 1)\}$.

(b): Yes, there is no set-inclusion between the two choice sets, so IIA holds vacuously.

(c): No, bundles $x = (1, 0)$ and $y = (0, 1)$ lie in $B_1 \cap B_2$. Since $x \in C(B_1)$ and $y \in C(B_2)$, WARP would require $x \in C(B_2)$.

(d): No: $C(B_1) = \{x\}$ would require $x \succ y$, whereas $C(B_2) = \{y\}$ would require $y \succ x$.

(e): For instance:

\[
u(x, p) =
\begin{cases}
x_1 & \text{if } p_1 > p_2, \\
x_2 & \text{if } p_2 > p_1, \\
x_1 x_2 & \text{if } p_1 = p_2.
\end{cases}
\]

Section 4
Exercise 4.1
[Continuity:] As $\succsim$ is represented by the continuous utility function $u$, it is continuous. Formally, for each $y \in X$,

$\{x \in X : x \succsim y\} = \{x \in X : u(x) \geq u(y)\} = u^{-1}([u(y), \infty))$

is the preimage of a closed set under the continuous function $u$ and therefore closed. Similarly, the set $\{x \in X : x \precsim y\}$ is closed.

[Monotonicity, but not strong:] Take $x, y \in \mathbb{R}^L_+$ with $x \geq y$. There is an $i \in \{1, \ldots, L\}$ such that $u(x) = \min\{x_1/a_1, \ldots, x_L/a_L\} = x_i/a_i$. As $x \geq y$, it follows that

$u(x) = x_i/a_i \geq y_i/a_i \geq \min\{y_1/a_1, \ldots, y_L/a_L\} = u(y)$,

so $x \succsim y$. Similarly, if $x > y$, then $x \succ y$. For a violation of strong monotonicity, notice that

$u(0, \ldots, 0) = u(1, 0, \ldots, 0) = 0$,

i.e., if you start with nothing, but get one unit of the first ingredient, you still cannot bake a cake due to lack of all the other ingredients!

[Convexity, but not strict:] Let $y \in \mathbb{R}^L_+$ and let $u(y) = \alpha$. Then

\[
\{x \in \mathbb{R}^L_+ : x \succsim y\} = \{x \in \mathbb{R}^L_+ : \min\{x_1/a_1, \ldots, x_L/a_L\} \geq \alpha\}
= \bigcap_{\ell=1}^{L} \{x \in \mathbb{R}^L_+ : x_\ell/a_\ell \geq \alpha\}
\]

is the intersection of convex halfspaces and therefore convex. For a violation of strict convexity, take $x = (a_1 + 1, a_2, \ldots, a_L)$, $y = (a_1, \ldots, a_L)$. Both vectors (and any convex combination) suffice to make one cake: for each $\lambda \in (0, 1)$:

$x \sim y \sim \lambda x + (1 - \lambda) y$,

in contradiction with strict convexity.

[Homotheticity:] $u$ is homogeneous of degree one.
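
The properties of the Leontief utility function $u(x) = \min\{x_1/a_1, \ldots, x_L/a_L\}$ used in this solution can be illustrated numerically. The recipe `a` and the bundles below are hypothetical values chosen for illustration.

```python
def leontief(x, a):
    """Leontief utility u(x) = min_i x_i / a_i, with all a_i > 0."""
    return min(xi / ai for xi, ai in zip(x, a))

a = (2.0, 1.0, 4.0)               # hypothetical recipe for one cake
origin = (0.0, 0.0, 0.0)
one_ingredient = (1.0, 0.0, 0.0)  # one unit of the first ingredient only

x = (3.0, 1.0, 4.0)               # a_1 + 1 units of ingredient 1
y = (2.0, 1.0, 4.0)               # exactly the recipe
mix = tuple(0.5 * xi + 0.5 * yi for xi, yi in zip(x, y))
```

Here `origin` and `one_ingredient` both yield zero cakes (no strong monotonicity), `x`, `y` and `mix` all yield exactly one cake (no strict convexity), and tripling `x` triples the utility (homogeneity of degree one).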


Exercise 4.2
With the additional restrictions, the budget sets become:

Indivisibilities: $B(p, w) \cap \mathbb{Z}^2_+$.
Rationing: $B(p, w) \cap \{x \in \mathbb{R}^2_+ : x_1 \leq 3\}$.
Rebates 1: $\{x \in \mathbb{R}^2_+ : p_1 x_1 + 4 \min\{x_2, 5\} + 2 \max\{x_2 - 5, 0\} \leq w\}$, as the first five units of commodity two cost $p_2 = 4$ and any additional ones only 2.
Rebates 2: $\{x \in \mathbb{R}^2_+ : x_2 \leq 5,\ p \cdot x \leq w\} \cup \{x \in \mathbb{R}^2_+ : x_2 > 5,\ 8x_1 + 2x_2 \leq 40\}$.
Initial endowment: $\{x \in \mathbb{R}^2_+ : p \cdot x \leq p \cdot \omega\}$.
Package deal: $B(p, w) \cap \{x \in \mathbb{R}^2_+ : x_1 = x_2\}$.
Gift certificate: $B(p, w) \cup \{x \in \mathbb{R}^2_+ : x_1 \geq 1/p_1,\ p_1(x_1 - 1/p_1) + p_2 x_2 \leq w\}$, the first set being the budget set if he does not use the gift certificate, the second one if he does and therefore acquires $1/p_1$ units of the first commodity without needing to address his budget.

Except for Rebates 2, the budget sets are nonempty, compact, so a most preferred bundle exists. Under Rebates 2, the budget set is not closed (it doesn't contain the boundary point $(30/8, 5)$) and a most preferred bundle need not exist. For instance, if the utility function of the consumer is $u(x) = \min\{4x_1, 3x_2\}$, there is no optimal bundle in the budget set. Drawing the budget set and some indifference curves will help you to verify this.
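
The failure of existence under Rebates 2 can be made concrete. The sketch below assumes prices $p = (8, 4)$ and wealth $w = 40$ (read off from the constraint $8x_1 + 2x_2 \leq 40$ and the price $p_2 = 4$; treat them as assumptions) and follows bundles with $x_2 = 5 + \delta$ that spend the whole budget at the rebated price. The function names are illustrative.

```python
def u(x1, x2):
    """Utility from the example: u(x) = min{4 x1, 3 x2}."""
    return min(4 * x1, 3 * x2)

def in_budget_rebates2(x1, x2, p1=8.0, p2=4.0, w=40.0, tol=1e-12):
    """Rebates 2 budget set, assuming p = (8, 4) and w = 40: the price of
    good 2 is halved for the whole purchase once x2 exceeds 5."""
    if x1 < 0 or x2 < 0:
        return False
    if x2 <= 5:
        return p1 * x1 + p2 * x2 <= w + tol
    return p1 * x1 + (p2 / 2) * x2 <= w + tol

def bundle(delta):
    """Spend the whole budget at the rebated price with x2 = 5 + delta."""
    x2 = 5 + delta
    return ((40 - 2 * x2) / 8, x2)

# Utility along a sequence of feasible bundles approaching (30/8, 5).
utilities = [u(*bundle(d)) for d in (1.0, 0.1, 0.01, 0.001)]
```

Along this sequence utility equals $15 - \delta$: it increases towards 15, the utility of the excluded boundary point $(30/8, 5)$, without ever reaching it.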

Exercise 4.3

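Walrasian demand is homogeneous of degree one in wealth: for all $(p, w) \in \mathbb{R}^{L+1}_{++}$ and all $\lambda > 0$, if $x \in x(p, w)$, then $\lambda x \in x(p, \lambda w)$.
Proof. Suppose not: there is a $z \in B(p, \lambda w)$ with $z \succ \lambda x$. Then $y := (1/\lambda) z \in B(p, w)$. As $x \in x(p, w)$, $x \succsim y$. As $\succsim$ is homothetic, also $\lambda x \succsim \lambda y = z$, contradicting that $z \succ \lambda x$.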

Exercise 4.4
(a): Consider a consumer with utility function $u(x) = x_1 + x_2$. Local nonsatiation is obvious. If $p_1 > p_2$, then the consumer spends the entire income on the second commodity, so

$v(p, w) = w/p_2$ if $p_1 > p_2$.

Increasing $p_1$ even further does not affect indirect utility, i.e., indirect utility is not strictly decreasing in the price of commodity 1.

(b): To show: for each sequence $(p^n, w^n)_{n \in \mathbb{N}}$ in $\mathbb{R}^{L+1}_{++}$ with limit $(p, w) \in \mathbb{R}^{L+1}_{++}$, $v(p^n, w^n) \to v(p, w)$.

Proof. For each $n \in \mathbb{N}$, let $x^n \in x(p^n, w^n)$, which is possible by the assumptions in Proposition 4.4. As $x^n \in B(p^n, w^n)$ for all $n$ and $(p^n, w^n) \to (p, w)$, the sequence $(x^n)_{n \in \mathbb{N}}$ eventually lies in the slightly enhanced budget set $B(p, w + 1)$, which is compact: taking a subsequence if necessary, we may assume w.l.o.g. that the sequence $(x^n)_{n \in \mathbb{N}}$ is convergent, with limit $x \in X$. The sequence $(p^n, w^n, x^n)_{n \in \mathbb{N}}$ satisfies the properties of Proposition 4.3(b). In particular, $x \in x(p, w)$, so $\lim_{n \to \infty} v(p^n, w^n) = \lim_{n \to \infty} u(x^n) = u(x) = v(p, w)$, where the second equality holds by continuity of $u$.

(c): Roughly speaking, because continuous preferences may be represented by discontinuous utility functions, which may cause jumps in the indirect utility function as well.
For instance, suppose a consumer has continuous utility function $U : \mathbb{R}_+ \to \mathbb{R}$ with $U(x) = \min\{x, 1\}$ and hence continuous preferences. These preferences can also be represented by the discontinuous utility function $u : \mathbb{R}_+ \to \mathbb{R}$ with

\[
u(x) =
\begin{cases}
x & \text{if } x < 1, \\
2 & \text{if } x \geq 1.
\end{cases}
\]

Notice that

\[
x(p, w) =
\begin{cases}
\{w/p\} & \text{if } w \leq p, \\
[1, w/p] & \text{if } w > p.
\end{cases}
\]

The indirect utility function given $u$ is

\[
v(p, w) =
\begin{cases}
w/p & \text{if } w < p, \\
2 & \text{if } w \geq p,
\end{cases}
\]

with discontinuities at all points where $p = w$.
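
Both examples in this exercise have closed-form indirect utility functions that are easy to evaluate. The sketch below takes $u(x) = x_1 + x_2$ for part (a) and, for part (c), assumes the representation $u(x) = x$ for $x < 1$ and $u(x) = 2$ for $x \geq 1$ (so that all quantities $x \geq 1$ are equally good, as under $U(x) = \min\{x, 1\}$). The function names and the numbers in the usage note are illustrative.

```python
def v_linear(p1, p2, w):
    """Indirect utility for u(x) = x1 + x2: buy only the cheapest good."""
    return w / min(p1, p2)

def v_jump(p, w):
    """Indirect utility in (c), assuming u(x) = x if x < 1 and 2 if x >= 1:
    the consumer reaches utility 2 as soon as one unit is affordable."""
    return w / p if w < p else 2.0
```

For instance, `v_linear(3, 1, 2)` and `v_linear(10, 1, 2)` coincide, so raising $p_1$ further has no effect, while `v_jump(1, w)` stays below 1 for $w < 1$ and equals 2 once $w \geq 1$.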

Exercise 4.5
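(a): Follows since

\[
e(\lambda p, u) = \min_{x \in \mathbb{R}^L_+,\ u(x) \geq u} (\lambda p) \cdot x
= \lambda \min_{x \in \mathbb{R}^L_+,\ u(x) \geq u} p \cdot x
= \lambda e(p, u).
\]

(b): Let $p \in \mathbb{R}^L_{++}$. Suppose there are $u', u'' \in U$ with $u(0, \ldots, 0) \leq u' < u''$ and $e(p, u') \geq e(p, u'')$. Let $x' \in h(p, u')$ and $x'' \in h(p, u'')$. Then $x'' \neq (0, \ldots, 0)$ and $p \cdot x' \geq p \cdot x''$. By continuity, $\lim_{\lambda \uparrow 1} u(\lambda x'') = u(x'') \geq u'' > u'$, so $u(\lambda x'') > u'$ for $\lambda \in (0, 1)$ close to one. But then $p \cdot (\lambda x'') = \lambda (p \cdot x'') \leq \lambda (p \cdot x') < p \cdot x'$, contradicting that $x' \in h(p, u')$.
(c): Let $(p, u) \in \mathbb{R}^L_{++} \times U$, $i \in \{1, \ldots, L\}$, and $\varepsilon > 0$. For each $x \in \mathbb{R}^L_+$ with $u(x) \geq u$, $(p + \varepsilon e_i) \cdot x \geq p \cdot x$, so $e(p + \varepsilon e_i, u) \geq e(p, u)$.
(d): Let $u \in U$. To show: the set $\{(p, r) \in \mathbb{R}^L_{++} \times \mathbb{R} : r \leq e(p, u)\}$ is convex.
Let $(p^1, r^1), (p^2, r^2)$ lie in this set, let $\lambda \in [0, 1]$, and define $(p, r) = \lambda (p^1, r^1) + (1 - \lambda)(p^2, r^2)$. Let $x \in h(p, u)$. Then $x$ is feasible in the EMP at $(p^1, u)$ and at $(p^2, u)$, so

\[
\begin{aligned}
e(p, u) &= p \cdot x \\
&= \lambda (p^1 \cdot x) + (1 - \lambda)(p^2 \cdot x) \\
&\geq \lambda e(p^1, u) + (1 - \lambda) e(p^2, u) \\
&\geq \lambda r^1 + (1 - \lambda) r^2 \\
&= r.
\end{aligned}
\]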

Exercise 4.6
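Let $(p, w) \in \mathbb{R}^{L+1}_{++}$ and $x = x(p, w)$. By Walras' Law, $p \cdot x = w$. For each $p' \in \mathbb{R}^L_{++}$, $x$ is feasible in the UMP at prices $p'$ and wealth $p' \cdot x$:

$v(p', p' \cdot x) \geq u(x) = v(p, w) = v(p, p \cdot x)$.

So the function $f : \mathbb{R}^L_{++} \to \mathbb{R}$ with $f(p') = v(p', p' \cdot x)$ achieves its minimum at $p' = p$. By the first order conditions, its partial derivatives must be zero at $p$:

\[
\text{for all } \ell = 1, \ldots, L: \quad
\frac{\partial f(p)}{\partial p_\ell} = \frac{\partial v(p, p \cdot x)}{\partial p_\ell} + \frac{\partial v(p, p \cdot x)}{\partial w} x_\ell = 0.
\]

As $x = x(p, w)$ and $p \cdot x = w$, the result now follows.

Exercise 4.7
By (15), indirect utility solves

\[
w = e(p, v(p, w)) = v(p, w) \sum_{i=1}^{L} a_i p_i, \quad \text{so} \quad v(p, w) = w \Big/ \sum_{i=1}^{L} a_i p_i.
\]

By (17),

\[
x(p, w) = h(p, v(p, w)) = (a_1 v(p, w), \ldots, a_L v(p, w))
= \left( \frac{a_1 w}{\sum_{i=1}^{L} a_i p_i}, \ldots, \frac{a_L w}{\sum_{i=1}^{L} a_i p_i} \right).
\]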
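
The Leontief formulas $v(p, w) = w / \sum_i a_i p_i$ and $x(p, w) = (a_1 v(p, w), \ldots, a_L v(p, w))$ can be evaluated at sample values to confirm that demand exhausts the budget. The recipe, prices and wealth below are hypothetical numbers, and the function names are illustrative.

```python
def leontief_indirect_utility(p, w, a):
    """v(p, w) = w / sum_i a_i p_i for u(x) = min_i x_i / a_i."""
    return w / sum(ai * pi for ai, pi in zip(a, p))

def leontief_demand(p, w, a):
    """x(p, w) = (a_1 v(p, w), ..., a_L v(p, w))."""
    v = leontief_indirect_utility(p, w, a)
    return tuple(ai * v for ai in a)

a = (2.0, 1.0, 4.0)     # hypothetical recipe
p = (1.0, 3.0, 0.5)     # hypothetical prices
w = 14.0                # hypothetical wealth

x = leontief_demand(p, w, a)
spending = sum(pi * xi for pi, xi in zip(p, x))
```

With these numbers one "cake" costs 7, so the consumer buys two cakes' worth of ingredients and spends exactly $w = 14$.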

Exercise 4.8
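We know from (18) that

$h_\ell(p, u) = x_\ell(p, e(p, u))$.

Differentiating this equation w.r.t. $p_k$ and using the chain rule gives

\[
\frac{\partial h_\ell(p, u)}{\partial p_k} = \frac{\partial x_\ell(p, e(p, u))}{\partial p_k} + \frac{\partial x_\ell(p, e(p, u))}{\partial w} \frac{\partial e(p, u)}{\partial p_k}.
\]

Recall from (14) that $\frac{\partial e(p, u)}{\partial p_k} = h_k(p, u)$:

\[
\frac{\partial h_\ell(p, u)}{\partial p_k} = \frac{\partial x_\ell(p, e(p, u))}{\partial p_k} + \frac{\partial x_\ell(p, e(p, u))}{\partial w} h_k(p, u).
\]

It follows from (15) and $u = v(p, w)$ that $e(p, u) = e(p, v(p, w)) = w$ and it follows from (18) and $u = v(p, w)$ that $h(p, u) = h(p, v(p, w)) = x(p, w)$, so:

\[
\frac{\partial h_\ell(p, u)}{\partial p_k} = \frac{\partial x_\ell(p, w)}{\partial p_k} + \frac{\partial x_\ell(p, w)}{\partial w} x_k(p, w).
\]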
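
The Slutsky equation derived here can be sanity-checked numerically. The sketch below assumes a two-good Cobb-Douglas consumer with exponents summing to one, for which $x_\ell(p, w) = a_\ell w / p_\ell$ and $e(p, u) = u \prod_i (p_i/a_i)^{a_i}$, and replaces the partial derivatives by central finite differences. The exponents, prices, wealth, step size and function names are all illustrative assumptions.

```python
import math

a = (0.3, 0.7)  # hypothetical Cobb-Douglas exponents, summing to one

def demand(p, w):
    """Walrasian demand x_l(p, w) = a_l w / p_l."""
    return tuple(ai * w / pi for ai, pi in zip(a, p))

def expenditure(p, u):
    """Expenditure function e(p, u) = u * prod_i (p_i / a_i)^{a_i}."""
    return u * math.prod((pi / ai) ** ai for ai, pi in zip(a, p))

def hicksian(p, u):
    """Hicksian demand via h(p, u) = x(p, e(p, u))."""
    return demand(p, expenditure(p, u))

def indirect_utility(p, w):
    """Indirect utility v(p, w) = w * prod_i (a_i / p_i)^{a_i}."""
    return w * math.prod((ai / pi) ** ai for ai, pi in zip(a, p))

def bump(p, k, step):
    """Price vector p with `step` added to its k-th coordinate."""
    return tuple(pi + step if i == k else pi for i, pi in enumerate(p))

def slutsky_gap(p, w, l, k, step=1e-5):
    """Left-hand side minus right-hand side of the Slutsky equation,
    with derivatives replaced by central finite differences."""
    u = indirect_utility(p, w)
    dh = (hicksian(bump(p, k, step), u)[l]
          - hicksian(bump(p, k, -step), u)[l]) / (2 * step)
    dx_dp = (demand(bump(p, k, step), w)[l]
             - demand(bump(p, k, -step), w)[l]) / (2 * step)
    dx_dw = (demand(p, w + step)[l] - demand(p, w - step)[l]) / (2 * step)
    return dh - (dx_dp + dx_dw * demand(p, w)[k])

gaps = [slutsky_gap((2.0, 5.0), 10.0, l, k) for l in range(2) for k in range(2)]
```

All four entries of `gaps` should be zero up to finite-difference error; a noticeably nonzero gap would point to a mistake in one of the closed-form expressions.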

Exercise 4.9
Indivisibilities, rationing, package deals, as well as the specific initial endowment $\omega = (1, 1)$ imply smaller budget sets and therefore a (weakly) lower welfare. Rebates 1 and 2 and the gift certificate imply a larger budget set and therefore a (weakly) higher welfare.

Exercise 4.10

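As $p^1 \cdot x^0 < w^1$, $x^0 \in B(p^1, w^1)$. As $x^0$ does not exhaust the budget, $p^1 \cdot y \leq w^1$ for all $y$ with $\|x^0 - y\|$ sufficiently close to zero. By local nonsatiation, this neighborhood contains a $y$ strictly preferred to $x^0$.

Exercise 4.11
Write $A = \sum_{i=1}^{L} a_i$. Standard calculations give:

\[
\begin{aligned}
x(p, w) &= \left( \frac{a_1 w}{A p_1}, \ldots, \frac{a_L w}{A p_L} \right), \\
v(p, w) &= \left( \frac{w}{A} \right)^{A} \prod_{i=1}^{L} \left( \frac{a_i}{p_i} \right)^{a_i}, \\
h(p, u) &= u^{1/A} \prod_{i=1}^{L} \left( \frac{p_i}{a_i} \right)^{a_i/A} \left( \frac{a_1}{p_1}, \ldots, \frac{a_L}{p_L} \right), \\
e(p, u) &= A u^{1/A} \prod_{i=1}^{L} \left( \frac{p_i}{a_i} \right)^{a_i/A}, \\
EV((p^0, w^0), (p^1, w^1)) &= A (u^1)^{1/A} \prod_{i=1}^{L} \left( \frac{p_i^0}{a_i} \right)^{a_i/A} - w^0, \\
CV((p^0, w^0), (p^1, w^1)) &= w^1 - A (u^0)^{1/A} \prod_{i=1}^{L} \left( \frac{p_i^1}{a_i} \right)^{a_i/A}.
\end{aligned}
\]

It is commonly assumed (w.l.o.g., as this is just a monotonic transformation of the utility) that $A = 1$, which yields slightly more sympathetic expressions.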
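
The Cobb-Douglas expressions can be cross-checked against each other through the duality identities $e(p, v(p, w)) = w$ and $h(p, v(p, w)) = x(p, w)$. The sketch below assumes the utility function $u(x) = \prod_i x_i^{a_i}$ with $A = \sum_i a_i$; the exponents, prices and wealth levels are hypothetical, and the function names are illustrative.

```python
import math

def cd_demand(p, w, a):
    """x_i(p, w) = a_i w / (A p_i)."""
    A = sum(a)
    return tuple(ai * w / (A * pi) for ai, pi in zip(a, p))

def cd_indirect_utility(p, w, a):
    """v(p, w) = (w / A)^A * prod_i (a_i / p_i)^{a_i}."""
    A = sum(a)
    return (w / A) ** A * math.prod((ai / pi) ** ai for ai, pi in zip(a, p))

def cd_expenditure(p, u, a):
    """e(p, u) = A u^{1/A} * prod_i (p_i / a_i)^{a_i / A}."""
    A = sum(a)
    return A * u ** (1 / A) * math.prod((pi / ai) ** (ai / A) for ai, pi in zip(a, p))

def cd_hicksian(p, u, a):
    """h_i(p, u) = u^{1/A} * prod_j (p_j / a_j)^{a_j / A} * a_i / p_i."""
    A = sum(a)
    scale = u ** (1 / A) * math.prod((pi / ai) ** (ai / A) for ai, pi in zip(a, p))
    return tuple(scale * ai / pi for ai, pi in zip(a, p))

def cd_utility(x, a):
    """Assumed utility function u(x) = prod_i x_i^{a_i}."""
    return math.prod(xi ** ai for xi, ai in zip(x, a))

a = (1.0, 2.0)                  # hypothetical exponents, A = 3
p0, w0 = (1.0, 1.0), 6.0        # hypothetical initial prices and wealth
p1, w1 = (2.0, 1.0), 6.0        # the price of good 1 doubles

u0 = cd_indirect_utility(p0, w0, a)
u1 = cd_indirect_utility(p1, w1, a)
EV = cd_expenditure(p0, u1, a) - w0
CV = w1 - cd_expenditure(p1, u0, a)
```

In this example the price increase makes the consumer worse off, so both the equivalent and the compensating variation come out negative.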

Section 5
Exercise 5.1
(a): $Y \cap \{y \in \mathbb{R}^L : y \geq -\omega\}$ is the intersection of closed sets, hence closed. It contains the zero vector $0$.

(b): As the length of the vectors $(y_n)_{n \in \mathbb{N}}$ diverges to infinity, $\|y_n\| \geq 1$ for $n$ sufficiently large. By convexity and possibility of inaction,

\[
z_n = \frac{1}{\|y_n\|} y_n = \frac{1}{\|y_n\|} y_n + \left( 1 - \frac{1}{\|y_n\|} \right) 0 \in Y.
\]

By assumption, $y_n + \omega \geq 0$, so dividing by $\|y_n\|$ gives $z_n + \omega/\|y_n\| \geq 0$.

(c): All vectors $z_n$ have length one. A bounded sequence contains a convergent subsequence. Let $z$ be its limit. Firstly, $z \neq 0$, as it is the limit of a sequence of vectors of length one. Secondly, as $z_n$ lies in $Y$ for $n$ large, and $Y$ is closed, also the limit $z$ lies in $Y$.
(d): Letting $n \to \infty$, and realizing that $\omega/\|y_n\| \to 0$, (b) implies that $z \geq 0$. As $z \neq 0$, this contradicts no free lunch.

Exercise 5.2

Reasoning as in the EMP, the assumptions on $f$ guarantee that the CMP is solvable.

(a): Define $q_z = f(z) \geq 0$. The CMP at $(w, q_z)$ has a solution and $z$ is feasible in this CMP, so $c(w, q_z) \leq w \cdot z$. Conclude that $p f(z) - w \cdot z \leq p q_z - c(w, q_z)$.
(b): Let $z_q$ solve the CMP at $(w, q)$, i.e., $z_q \in \mathbb{R}^{L-1}_+$, $f(z_q) \geq q$, and $c(w, q) = w \cdot z_q$. Conclude that $p f(z_q) - w \cdot z_q \geq p q - c(w, q)$.
(c): Assume (P1) has a solution $z$ (the case where (P2) has a solution is similar). By (a), there is a feasible $q_z$ in (P2) with equal or higher profit. It cannot be higher. Otherwise, by (b), there is a feasible $z_{q_z}$ in (P1) yielding a higher profit than $q_z$ and therefore higher than the profit maximizing $z$, a contradiction. Conclude that $q_z$ solves (P2) and yields the same profit as $z$ in (P1).

Exercise 5.3
(a), (b): Consider the convex production set $Y = \{y \in \mathbb{R}^2 : y_1 \leq 0,\ y_2 \leq \sqrt{-y_1}\}$. The point $(0, -1) \in Y$ maximizes profit at price vector $p = (1, 0) \in \mathbb{R}^2_+$, but is not efficient, as also $(0, 0) \in Y$. The point $(0, 0) \in Y$ is efficient, but does not maximize profit at strictly positive prices.

(c): Consider the production set $Y = \{y \in \mathbb{R}^2 : y \leq (1, 1),\ (y_1 - 1)^2 + (y_2 - 1)^2 \geq 2\}$. The point $(0, 0) \in Y$ is efficient, but not profit maximizing for any nonzero vector $p \in \mathbb{R}^2_+$: if $p_1 \geq p_2$, then $(1, 1 - \sqrt{2}) \in Y$ yields a positive profit, and if $p_1 \leq p_2$, then $(1 - \sqrt{2}, 1) \in Y$ yields a positive profit, whereas $(0, 0) \in Y$ yields only zero profit.
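
The argument in (c) can be illustrated numerically. The sketch below assumes the production set $Y = \{y \in \mathbb{R}^2 : y \leq (1, 1),\ (y_1 - 1)^2 + (y_2 - 1)^2 \geq 2\}$ and evaluates profits at a handful of hypothetical nonzero price vectors; the function names and the tolerance are illustrative.

```python
import math

def in_Y(y, tol=1e-12):
    """Production set from (c): y <= (1, 1) and outside the open disc
    of radius sqrt(2) around (1, 1), up to a small numerical tolerance."""
    return (y[0] <= 1 and y[1] <= 1
            and (y[0] - 1) ** 2 + (y[1] - 1) ** 2 >= 2 - tol)

def profit(p, y):
    """Profit p . y of production plan y at prices p."""
    return p[0] * y[0] + p[1] * y[1]

def better_plan(p):
    """A plan in Y with positive profit at nonzero prices p >= 0."""
    if p[0] >= p[1]:
        return (1.0, 1.0 - math.sqrt(2))
    return (1.0 - math.sqrt(2), 1.0)

# Hypothetical nonzero price vectors in the nonnegative orthant.
prices = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (3.0, 2.0), (0.2, 0.9)]
```

At each of these prices `better_plan(p)` lies in $Y$ and earns strictly more than the zero profit of $(0, 0)$.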

Section 6
Exercise 6.1
(a): Look at the definitions of improvements and Pareto optimality: the fact that the coalition $S = H$ of all consumers cannot improve upon an allocation means that there is nothing feasible that makes everybody better off. But there may still be room for improvement for some if not all consumers: it may still be Pareto dominated.

(b): Consider a pure exchange economy with two consumers and two commodities. The first consumer's preferences are represented by the utility function $u_1(x) = x_1 x_2$, the second consumer's preferences by a constant utility function: he is indifferent between all commodity bundles. If $\omega^1 = \omega^2 = (1, 1)$, then $(p, x) = ((1, 1), (1, 1), (1, 1))$ (i.e., prices are equal and each consumer sticks to the initial endowment) is a Walrasian equilibrium. By Proposition 6.2, the allocation lies in the core. But the allocation is not Pareto optimal: giving the total endowment to the first consumer makes him better off, while not affecting the happiness of the second consumer.

Exercise 6.2
(a): Let $p \in \mathbb{R}^L_+$, $z \in z(p) \cap \mathbb{R}^L_-$. By Walras' Law, $p \cdot z = \sum_{k : p_k > 0} p_k z_k = 0$. As the sum of nonpositive terms, it can be zero only if $z_\ell = 0$ whenever $p_\ell > 0$.
(b): Let $p, z, \ell$ be as in the statement of the exercise. As $z_k = 0$ for $k \neq \ell$, Walras' Law implies $p \cdot z = p_\ell z_\ell = 0$. As $p_\ell > 0$, this implies $z_\ell = 0$.
(c): If in equilibrium the market for good $\ell \in \{1, \ldots, L\}$ does not clear, its price is zero by (a). So consumer $h$ is not constrained in his consumption of $\ell$. In equilibrium, $h$ must choose a most preferred bundle from the budget set, but there is none: under (c1), each bundle can be improved upon by adding more of good $\ell$; under (c2), a most preferred bundle can't lie on the axes, as $h$ can afford a better alternative in $\mathbb{R}^L_{++}$; the latter can be improved upon by adding more of good $\ell$.

Exercise 6.3
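(a): Pareto dominance tries to compare allocations regardless of prices. Preferences of firms (profit) are functions of prices.

(b): Let $(p, x, y)$ be a Walrasian equilibrium of $E$. Suppose there is a feasible allocation $(\bar{x}, \bar{y})$ Pareto dominating $(x, y)$. Local nonsatiation implies
\[ \forall h \in H: \quad \bar{x}^h \succsim^h x^h \;\Rightarrow\; p \cdot \bar{x}^h \ge p \cdot x^h = p \cdot \omega^h + \sum_{f \in F} \theta^{hf}\, p \cdot y^f, \]
\[ \forall h \in H: \quad \bar{x}^h \succ^h x^h \;\Rightarrow\; p \cdot \bar{x}^h > p \cdot x^h = p \cdot \omega^h + \sum_{f \in F} \theta^{hf}\, p \cdot y^f. \]
By Pareto dominance, such a weak preference holds for all, and strict preference for some $h \in H$. Summing over $h \in H$ and using that equilibrium production plans $(y^f)_{f \in F}$ are profit maximizing at prices $p$ gives
\[ p \cdot \sum_{h \in H} \bar{x}^h > p \cdot \sum_{h \in H} x^h = \sum_{h \in H} \Big( p \cdot \omega^h + \sum_{f \in F} \theta^{hf}\, p \cdot y^f \Big) = p \cdot \omega + p \cdot \sum_{f \in F} y^f \ge p \cdot \omega + p \cdot \sum_{f \in F} \bar{y}^f. \]
But $p \cdot \sum_{h \in H} \bar{x}^h > p \cdot \omega + p \cdot \sum_{f \in F} \bar{y}^f$ contradicts feasibility of $(\bar{x}, \bar{y})$.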

Exercise 6.4
Pure exchange economies: You may verify that the following pure exchange economies $E = (\succsim^1, \succsim^2, \omega^1, \omega^2)$ have the desired property:

(a): Let $\succsim^1$ and $\succsim^2$ be lexicographic preferences over $\mathbb{R}^2_+$, $\omega^1 = (1, 0)$, and $\omega^2 = (0, 1)$. There is no Walrasian equilibrium:

• if $p$ has both prices positive, then consumer 1 demands $\omega^1$ and consumer 2 demands $(p_2/p_1, 0)$, so there is excess demand for the first commodity;

• if one of the commodities has price zero, demand for this commodity is unbounded.

(b): The standard Cobb-Douglas case.

(c): Let $\succsim^1$ be represented by the utility function $u_1(x) = \max\{\min\{2x_1, x_2\}, \min\{x_1, 2x_2\}\}$ and $\succsim^2$ by the utility function $u_2(x) = x_1 + x_2$. Let $\omega^1 = \omega^2 = (1, 1)$.

• if one of the commodities has price zero, demand for this commodity is unbounded: there are no Walrasian equilibria at such prices;

• if both prices are positive and $p_1 > p_2$, the first consumer demands a bundle with $2x_1 = x_2$, i.e., the bundle $\big(p \cdot \omega^1/(p_1 + 2p_2),\ 2\, p \cdot \omega^1/(p_1 + 2p_2)\big)$, and the second consumer spends the entire income on the second commodity, i.e., demands the bundle $(0, p \cdot \omega^2/p_2)$. In particular, demand for the second commodity is at least twice the demand for the first commodity. As the total endowment of both commodities is equal, not both markets can clear at the same time, contradicting the fact that (given local nonsatiation) markets with a positive price must clear. There are no Walrasian equilibria at such prices;

• similarly, Walrasian equilibria with positive prices and $p_2 > p_1$ are ruled out;

• if both prices are positive and equal, i.e., $p = (1/2, 1/2)$, the first consumer's demand is $\{(2/3, 4/3), (4/3, 2/3)\}$ and the second consumer's demand is $\{x \in \mathbb{R}^2_+ : x_1 + x_2 = 2\}$. There are two (equilibrium/market clearing) allocations: $((2/3, 4/3), (4/3, 2/3))$ and $((4/3, 2/3), (2/3, 4/3))$.

(d): Preferences $\succsim^1, \succsim^2$ are such that the consumers are indifferent between all commodity bundles; $\omega^1 = \omega^2 = (1, 1)$. Every $(p, x)$ with price vector $p$ and $x = (x^1, x^2) \in \mathbb{R}^2_+ \times \mathbb{R}^2_+$ with $x^h \in B^h(p, p \cdot \omega^h)$ for both $h = 1, 2$ is a Walrasian equilibrium.

Private ownership economies: Take the examples above and give the producers the trivial production set $\{0\}$ consisting of the remarkable feat of producing absolutely nothing using absolutely nothing. If you prefer slightly larger production sets, you may want to choose them equal to $-\mathbb{R}^2_+$, containing all production plans producing absolutely nothing, possibly using something.
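The two equilibrium allocations in part (c) of the pure exchange case can be recomputed with exact arithmetic. The following is a minimal sketch; the function names are hypothetical, and it relies on the observation (an assumption spelled out here, not a statement from the notes) that with positive prices consumer 1's optimum lies at one of the two kinks of $u_1$.

```python
from fractions import Fraction as F

def demand_1(p1, p2, m):
    """Demand set of consumer 1 with u1(x) = max{min{2*x1, x2}, min{x1, 2*x2}}
    at positive prices (p1, p2) and income m."""
    a = (m / (p1 + 2 * p2), 2 * m / (p1 + 2 * p2))   # kink with 2*x1 = x2
    b = (2 * m / (2 * p1 + p2), m / (2 * p1 + p2))   # kink with x1 = 2*x2
    ua, ub = 2 * a[0], 2 * b[1]                      # utility at each kink
    if ua > ub:
        return [a]
    if ub > ua:
        return [b]
    return [a, b]

def equilibria_at_equal_prices():
    """Market-clearing allocations at p = (1/2, 1/2) with endowments (1, 1) each."""
    p1 = p2 = F(1, 2)
    m = p1 * 1 + p2 * 1          # income of either consumer
    total = (F(2), F(2))         # total endowment
    result = []
    for x1 in demand_1(p1, p2, m):
        x2 = (total[0] - x1[0], total[1] - x1[1])
        # Consumer 2 (u2 = x1 + x2) accepts any nonnegative bundle on the budget line.
        if min(x2) >= 0 and p1 * x2[0] + p2 * x2[1] == m:
            result.append((x1, x2))
    return result
```

Calling `equilibria_at_equal_prices()` returns the two allocations listed in the solution; with unequal positive prices `demand_1` returns a single bundle, which is the situation in which both markets cannot clear at once.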

Exercise 6.5

Feasible allocations: $\{(x^T, x^L) \in \mathbb{R}^2_+ : x^T + x^L \le 1\}$.

Pareto optimal allocations: Must be nonwasteful, otherwise the remainder can be given to the liar, who becomes happier, while the true mother is not harmed. Moreover, $x^T \notin (0, 1)$: otherwise the true mother can be made happier by giving her 0, while not harming the liar. Only allocations $x = (0, 1)$ and $(1, 0)$ are Pareto optimal.

Core: The core depends on the initial allocation $(\omega^T, \omega^L)$. Denote an allocation by a vector $(x^T, x^L)$.

• The liar can improve upon any allocation with $x^L < \omega^L$, so $x^L \ge \omega^L$ in the core.

• For the true mother:
if $\omega^T = 0$, individual rationality and feasibility require that $x^T \in \{0, 1\}$,
if $\omega^T \in (0, 1)$, individual rationality has no bite: everything is at least as good as her initial allocation,
if $\omega^T = 1$, individual rationality and feasibility require that $x^T = 1$.

• The coalition of both women can improve upon any feasible allocation with $x^T \in (0, 1)$ by giving the liar the entire baby, so $x^T \in \{0, 1\}$ in the core.

• Combining the above gives that the core is
$\{(1, 0)\}$ if the initial endowment is $(\omega^T, \omega^L) = (1, 0)$,
$\{(0, x) : \omega^L \le x \le 1\}$ if the initial endowment has $\omega^T \in (0, 1)$,
$\{(0, 1)\}$ if the initial endowment is $(\omega^T, \omega^L) = (0, 1)$.

• Notice that if $\omega^T \in (0, 1)$, there are wasteful core allocations.

Walrasian equilibria: The Walrasian equilibria depend on the initial allocation $(\omega^T, \omega^L)$. As equilibrium involves a nonzero price vector, we may assume w.l.o.g. that the equilibrium price is $p > 0$.

• The true mother demands 0 if $\omega^T \in [0, 1)$ and 1 if $\omega^T = 1$.

• The liar demands $\omega^L$.

• Therefore, the set of Walrasian equilibria is
$\{(p, x^T, x^L) \in \mathbb{R}^3 : p > 0,\ x^T = 0,\ x^L = \omega^L\}$ if the initial endowment has $\omega^T \in [0, 1)$,
$\{(p, x^T, x^L) \in \mathbb{R}^3 : p > 0,\ x^T = 1,\ x^L = 0\}$ if $\omega^T = 1$.

Section 7
Exercise 7.1
(a):
• Best elements of $G$: those whose reduced simple gambles put largest probability on $\max\{a_1, \ldots, a_k\}$. Worst elements of $G$: those whose reduced simple gambles put largest probability on $\min\{a_1, \ldots, a_k\}$.

• (G1) satisfied: preferences represented by utility function $u(g) = \frac{1}{|L(g)|} \sum_{a_i \in L(g)} a_i$.

(G2) violated: assume w.l.o.g. that $a_1 > a_2$ and consider the gambles $a_1$ and $(p \circ a_1, (1-p) \circ a_2)$. If $p > 1/2$, $a_1$ is the most likely outcome in both gambles, so the DM is indifferent between them. Continuity would require $a_1 \sim (\tfrac12 \circ a_1, \tfrac12 \circ a_2)$. However, at $p = 1/2$, the DM assigns value $\frac{a_1 + a_2}{2} < a_1$ to the second gamble, so he strictly prefers the gamble giving $a_1$ for sure.

(G3) satisfied: preferences are defined in terms of reduced simple gambles: $u(g) = u(g_s)$.

(G4) violated: assume w.l.o.g. that $a_1 > a_2$. Then the DM strictly prefers $g = a_1$ to $g' = a_2$. Independence requires that also
\[ (\alpha \circ g, (1-\alpha) \circ a_1) \succ (\alpha \circ g', (1-\alpha) \circ a_1) \]
for all $\alpha \in (0, 1)$. However, for $\alpha$ close to zero, $a_1$ is the most likely outcome in both gambles, so the DM is indifferent between them.

• As (G2) and (G4) are violated, Remark 7.3 implies that $\succsim$ cannot be represented by a vNM utility function.

(b):
• Best element of $G$: deterministic outcome $\max\{a_1, \ldots, a_k\}$. Worst elements of $G$ do not exist: for each $g \in G$, the gamble $(\tfrac12 \circ g, \tfrac12 \circ g)$ has higher complexity and is therefore worse than $g$.

• (G1) satisfied: preferences represented by a utility function.

(G2) satisfied: on $G_1$, the DM's utility function $u(g) = \sum_{m=1}^k p_m a_m - 1$ is continuous.

(G3) violated: the gambles $a_1 \in G_0$ and $(\tfrac12 \circ a_1, \tfrac12 \circ a_1) \in G_1$ both have reduced simple gamble $(1 \circ a_1)$, yet the former lies in $G_0$ and is therefore strictly preferred to the latter in $G_1$.

(G4) violated: Let
\[ g = a_1 \in G_0, \qquad g' = (\tfrac12 \circ a_1, \tfrac12 \circ a_1) \in G_1, \qquad g'' = (\tfrac12 \circ g', \tfrac12 \circ g') \in G_2. \]
Let $\alpha \in (0, 1)$. By construction,
\[ (\alpha \circ g, (1-\alpha) \circ g''),\ (\alpha \circ g', (1-\alpha) \circ g'') \in G_3. \]
Hence
\[ u(g) = a_1 - 0, \quad u(g') = a_1 - 1, \quad u(\alpha \circ g, (1-\alpha) \circ g'') = a_1 - 3, \quad u(\alpha \circ g', (1-\alpha) \circ g'') = a_1 - 3, \]
in violation of (G4).

• As (G3) and (G4) are violated, Remark 7.3 implies that $\succsim$ cannot be represented by a vNM utility function.
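The continuity failure in part (a) is easy to see numerically. The sketch below assumes, as in that part, that the DM values a reduced simple gamble at the average of its most likely outcomes; the outcomes 10 and 4 and the function name are hypothetical choices for illustration.

```python
from fractions import Fraction as F

def u_most_likely(gamble):
    """Average of the most likely outcomes of a reduced simple gamble,
    given as a dict mapping outcome -> probability."""
    top = max(gamble.values())
    modes = [a for a, p in gamble.items() if p == top]
    return F(sum(modes), len(modes))

a1, a2 = 10, 4
sure_thing = {a1: F(1)}
for p in (F(9, 10), F(3, 5), F(51, 100)):
    # For every p > 1/2 the mixture is valued exactly like a1 for sure.
    assert u_most_likely({a1: p, a2: 1 - p}) == u_most_likely(sure_thing)

# At p = 1/2 the value drops to (a1 + a2) / 2, which is the jump that breaks (G2).
value_at_half = u_most_likely({a1: F(1, 2), a2: F(1, 2)})
```

The utility is constant at $a_1$ for all $p > 1/2$ and then jumps at $p = 1/2$, so indifference along the sequence does not carry over to the limit.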

(c):
• To characterize the best and worst elements of $G$, distinguish two cases:

1. $\min\{a_1, \ldots, a_k\} \le 5 < \max\{a_1, \ldots, a_k\}$.
Best elements of $G$: those putting probability one on outcomes $a_m > 5$ (utility equal to its maximum, one).
Worst elements of $G$: those putting probability one on outcomes $a_m \le 5$ (utility equal to its minimum, zero).

2. Otherwise, if all $a_k$ exceed 5 or all $a_k$ are at most five, the utility function is constant (one in the former case, zero in the latter), so all gambles are equivalent (and hence both best and worst elements of $G$).

• Shortcut: for each $i = 1, \ldots, k$, define $u(a_i) = 0$ if $a_i \le 5$ and $u(a_i) = 1$ otherwise. Then for every $g \in G$ with reduced simple gamble $(p_1 \circ a_1, \ldots, p_k \circ a_k)$, we have $u(g) = \sum_{i : a_i > 5} p_i = \sum_{i=1}^k p_i u(a_i)$, i.e., this defines a vNM utility function. By Remark 7.3, $\succsim$ must satisfy (G1) to (G4).
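The shortcut in part (c) says that the utility of a gamble is the expectation of a 0/1 utility over outcomes. A small sketch makes the expected-utility form concrete and checks linearity in mixtures; the helper names and the sample gambles are hypothetical.

```python
from fractions import Fraction as F

def u_outcome(a):
    """Utility of a sure outcome: 1 if it exceeds 5, else 0."""
    return 1 if a > 5 else 0

def u_gamble(simple_gamble):
    """Expected utility of a reduced simple gamble (dict outcome -> probability)."""
    return sum(p * u_outcome(a) for a, p in simple_gamble.items())

def mix(alpha, g, g2):
    """Reduced simple gamble of the compound gamble (alpha o g, (1 - alpha) o g2)."""
    outcomes = set(g) | set(g2)
    return {a: alpha * g.get(a, 0) + (1 - alpha) * g2.get(a, 0) for a in outcomes}

g = {3: F(1, 4), 7: F(1, 2), 9: F(1, 4)}
g2 = {3: F(1)}
alpha = F(1, 3)
lhs = u_gamble(mix(alpha, g, g2))
rhs = alpha * u_gamble(g) + (1 - alpha) * u_gamble(g2)
```

Here `lhs` and `rhs` coincide, which is the linearity in probabilities that characterizes a vNM utility function.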


Section 10
Exercise 10.1
(a): If $u$ has no upper bound, construct a sequence of instantaneous utilities $(u(c_t))_{t=0}^\infty$ with $u(c_t) > 1/\delta(t)$ for each time $t$. Then $\delta(t) u(c_t) > 1$ at each time $t$ and $\sum_{t=0}^\infty \delta(t) u(c_t)$ diverges.
(b): Let $u$ be bounded by $B \in \mathbb{R}$ and let $c = (c_t)_{t=0}^\infty$ be an arbitrary stream of choices. For each $t$, $|\delta(t) u(c_t)| \le B \delta(t)$ and $\sum_{t=0}^\infty B \delta(t) = B \sum_{t=0}^\infty \delta(t)$ converges. By the comparison test for summable sequences, also $\sum_{t=0}^\infty \delta(t) u(c_t)$ converges.

Exercise 10.2
(a1) $k$ gives instantaneous utility $u(h, k) = -1$, $\ell$ gives instantaneous utility $u(h, \ell) = 1$, so the optimal action is $\ell$ with instantaneous utility 1.
(a2) $k$ gives instantaneous utility $u(d, k) = 0$, $\ell$ gives instantaneous utility $u(d, \ell) = -\sigma$, so the optimal action is $k$ with instantaneous utility 0.
(b) $k$ gives expected discounted utility $u(d, k) + 0 = 0$, $\ell$ gives expected discounted utility
\[ u(d, \ell) + \tfrac12 \big( u(h, \ell) + u(d, k) \big) = -\sigma + \tfrac12 (1 + 0) = \tfrac12 - \sigma, \]
so the optimal action is
$k$ if $\tfrac12 - \sigma < 0$,
$k$ and $\ell$ if $\tfrac12 - \sigma = 0$,
$\ell$ if $\tfrac12 - \sigma > 0$.

(c) If the severity of the depression is relatively small ($\tfrac12 - \sigma > 0$), an initially depressed person may decide not to take his life in the hope of becoming happy later while still having the option of suicide in case of continued depression.

Exercise 10.3
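Preferring one apple today over two apples tomorrow means that
\[ u(1) > (1 + \alpha)^{-\beta/\alpha} u(2). \]
Preferring two apples one year and a day from now to one apple a year from now (and assuming we're not in a leap year) means that
\[ (1 + 366\alpha)^{-\beta/\alpha} u(2) > (1 + 365\alpha)^{-\beta/\alpha} u(1). \]
These two inequalities hold simultaneously if
\[ \left( \frac{1}{1 + \alpha} \right)^{\beta/\alpha} < \frac{u(1)}{u(2)} < \left( \frac{1 + 365\alpha}{1 + 366\alpha} \right)^{\beta/\alpha}. \]
Given $\alpha$, it remains possible to choose the exponent $\beta/\alpha$ arbitrarily: having it equal to $\gamma$ simply means choosing $\beta = \alpha\gamma$. So we can simplify the problem and show that there are $\alpha, \gamma > 0$ solving
\[ \left( \frac{1}{1 + \alpha} \right)^{\gamma} < \frac{u(1)}{u(2)} < \left( \frac{1 + 365\alpha}{1 + 366\alpha} \right)^{\gamma} \]
or similarly
\[ \frac{1}{1 + \alpha} < \left( \frac{u(1)}{u(2)} \right)^{1/\gamma} < \frac{1 + 365\alpha}{1 + 366\alpha}. \]
Notice that $\alpha > 0$ implies that
\[ 0 < \frac{1}{1 + \alpha} < \frac{1 + 365\alpha}{1 + 366\alpha} < 1. \]
The expression $(u(1)/u(2))^{1/\gamma}$ is a continuous function of $\gamma > 0$. As $u(1)/u(2) \in (0, 1)$, it goes to zero as $\gamma \to 0$ and to one as $\gamma \to \infty$. By the Intermediate Value Theorem, there exists, for each $\alpha > 0$, a $\gamma > 0$ such that $(u(1)/u(2))^{1/\gamma}$ lies between the two desired bounds.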
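The existence argument can be made constructive. The sketch below assumes the discount function $(1 + \alpha t)^{-\beta/\alpha}$ with $t$ measured in days; the utility values $u(1) = 1$, $u(2) = 1.5$, the choice $\alpha = 1$ and the function names are hypothetical. Instead of invoking the Intermediate Value Theorem, it aims $(u(1)/u(2))^{1/\gamma}$ at the midpoint of the two bounds and solves for the exponent in closed form.

```python
import math

def discount(t, alpha, beta):
    """Discount factor (1 + alpha*t)^(-beta/alpha) applied to a payoff t days ahead."""
    return (1.0 + alpha * t) ** (-beta / alpha)

def exponent_for(ratio, alpha):
    """Return an exponent gamma > 0 such that ratio**(1/gamma) lies strictly between
    1/(1+alpha) and (1+365*alpha)/(1+366*alpha), where ratio = u(1)/u(2) in (0, 1)."""
    lower = 1.0 / (1.0 + alpha)
    upper = (1.0 + 365.0 * alpha) / (1.0 + 366.0 * alpha)
    target = 0.5 * (lower + upper)          # any point strictly between the bounds works
    # ratio**(1/gamma) = target  <=>  gamma = log(ratio) / log(target)
    return math.log(ratio) / math.log(target)

u1, u2 = 1.0, 1.5
alpha = 1.0
gamma = exponent_for(u1 / u2, alpha)
beta = alpha * gamma

one_apple_today = u1 > discount(1, alpha, beta) * u2
two_apples_later = discount(366, alpha, beta) * u2 > discount(365, alpha, beta) * u1
```

Both `one_apple_today` and `two_apples_later` evaluate to true for these parameters, reproducing the preference reversal described in the solution.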

Exercise 10.5

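$\liminf_{t \to \infty} x_t = c$ implies [L1] and [L2]: Let $\varepsilon > 0$. As $\lim_{t \to \infty} \inf\{x_s : s \ge t\} = c$, there is a $T \in \mathbb{N}$ such that
\[ c - \varepsilon < \inf\{x_s : s \ge t\} < c + \varepsilon \]
for all $t \ge T$. Apply the first inequality in the special case of $t = T$:
\[ c - \varepsilon < \inf\{x_s : s \ge T\}, \]
so $c - \varepsilon < x_t$ for all $t \ge T$, proving [L1]. Let $T' \in \mathbb{N}$ and apply the second inequality to the special case of $t = \max\{T, T'\}$:
\[ \inf\{x_s : s \ge \max\{T, T'\}\} < c + \varepsilon, \]
i.e., there is a $t \ge T'$ with $x_t < c + \varepsilon$, proving [L2].

[L1] and [L2] imply $\liminf_{t \to \infty} x_t = c$: Let $\varepsilon > 0$. By [L1] there is a $T \in \mathbb{N}$ such that
\[ c - \varepsilon/2 < x_t \]
for all $t \ge T$. Hence,
\[ c - \varepsilon < c - \varepsilon/2 \le \inf\{x_s : s \ge T\}. \]
As the infimum increases weakly if the bound does, it follows that, for each $t \ge T$:
\[ c - \varepsilon < \inf\{x_s : s \ge t\}. \tag{57} \]
By [L2] applied to an arbitrary $t \ge T$, there is an $s \ge t$ such that
\[ x_s < c + \varepsilon/2 < c + \varepsilon, \]
i.e.,
\[ \inf\{x_s : s \ge t\} \le c + \varepsilon/2 < c + \varepsilon. \tag{58} \]
Combining (57) and (58) gives that for each $\varepsilon > 0$ there is a $T \in \mathbb{N}$ such that
\[ c - \varepsilon < \inf\{x_s : s \ge t\} < c + \varepsilon \]
for all $t \ge T$, i.e., $\liminf_{t \to \infty} x_t = c$.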

Exercise 10.6
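It suffices to show, for an arbitrary sequence $(x_t)_{t=0}^\infty$:
\[ \liminf_{t \to \infty} x_t > 0 \iff \exists \varepsilon > 0 : x_t > \varepsilon \text{ for all but finitely many } t. \]
($\Rightarrow$): Assume $\liminf_{t \to \infty} x_t > 0$. If the liminf is infinite, the weakly increasing sequence of infima $\inf\{x_s : s \ge t\}$ diverges, so there is a $T \in \mathbb{N}$ with $\inf\{x_s : s \ge T\} \ge 1$. In particular, $x_t \ge 1$ for all $t \ge T$. If the liminf is finite, say equal to $c > 0$, [L1] with $\varepsilon = c/2$ implies that there is a $T \in \mathbb{N}$ with $x_t > c - \varepsilon = c/2$ for all $t \ge T$.
($\Leftarrow$): Assume there is an $\varepsilon > 0$ such that $x_t > \varepsilon$ for all but finitely many $t$: there is a $T \in \mathbb{N}$ such that $x_t > \varepsilon$ for $t \ge T$. Then $\inf\{x_s : s \ge t\} \ge \varepsilon$ for $t \ge T$, so also the limit of the infima is at least $\varepsilon$: it must be positive!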

Exercise 10.7
(a): If a sequence is unbounded, the liminf of average payoffs need not be finite. For instance, the unbounded sequence $x = (x_t)_{t=0}^\infty$ defined recursively by $x_0 = 1$ and, for all $t \in \mathbb{N}$, $x_t = (t + 1)^2 - \sum_{k=0}^{t-1} x_k$, has time average $\frac{1}{T} \sum_{t=0}^{T-1} x_t = T$, so its liminf diverges to infinity.

(b): Let $x = (x_t)_{t=0}^\infty$ and $y = (y_t)_{t=0}^\infty$ be two bounded sequences. We need to investigate whether
\[ \liminf_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} (x_t - y_t) > 0 \iff \liminf_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} x_t > \liminf_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} y_t. \tag{59} \]
To see that this is not the case, let $x = (0, 0, \ldots)$ be the zero sequence. Substitution in (59) and using, for any sequence $z = (z_t)_{t=0}^\infty$, that $\liminf_{t \to \infty} (-z_t) = -\limsup_{t \to \infty} z_t$ (where the limes superior is defined analogously to liminf as $\limsup_{t \to \infty} z_t = \lim_{t \to \infty} (\sup\{z_s : s \ge t\})$) yields
\[ \limsup_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} y_t < 0 \iff \liminf_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} y_t < 0. \]
This is obviously false. For an explicit example, take the sequence from page 73 with the oscillating average and subtract $1/2$ from each entry to obtain a sequence of averages with liminf equal to $1/3 - 1/2 = -1/6 < 0$, but limsup equal to $2/3 - 1/2 = 1/6 > 0$.
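Both parts can be illustrated with a few lines of exact arithmetic. The first function builds the recursive sequence from part (a). For part (b), the sequence from page 73 is not reproduced here, so the sketch substitutes a stand-in that is purely an assumption: alternating blocks of zeros and ones whose lengths double, a construction whose running averages oscillate between roughly $1/3$ and $2/3$.

```python
from fractions import Fraction as F

def unbounded_sequence(n):
    """First n terms of x_0 = 1, x_t = (t + 1)^2 - (x_0 + ... + x_{t-1})."""
    xs = []
    for t in range(n):
        xs.append((t + 1) ** 2 - sum(xs))
    return xs

def running_averages(xs):
    """Time averages (1/T) * (x_0 + ... + x_{T-1}) for T = 1, ..., len(xs)."""
    out, total = [], 0
    for T, x in enumerate(xs, start=1):
        total += x
        out.append(F(total, T))
    return out

def block_sequence(num_blocks):
    """Stand-in for the oscillating example: block k has length 2^k and value k mod 2."""
    xs = []
    for k in range(num_blocks):
        xs.extend([k % 2] * 2 ** k)
    return xs

# Part (a): the time average over the first T terms equals T.
averages_a = running_averages(unbounded_sequence(50))

# Part (b): subtract 1/2 from every entry, which shifts every average by 1/2.
shifted = [a - F(1, 2) for a in running_averages(block_sequence(14))]
tail = shifted[len(shifted) // 2:]
```

In the tail of `shifted` the averages still range from about $-1/6$ up to $1/6$, so their limsup is positive while their liminf is negative, which is the configuration that falsifies the equivalence.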

Section 11
Exercise 11.1
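The cost function $c$ is strictly convex, so the function $\sum_{i=1}^n p_i \pi(i) - \frac{1}{2\lambda} c(p)$ is strictly concave. Since we maximize a strictly concave, continuous function over a compact set, a maximum exists and is unique. Notice that the gradient of the goal function has $i$-th coordinate
\[ \pi(i) - \frac{1}{2\lambda} \frac{\partial c(p)}{\partial p_i} = \pi(i) - \frac{1}{2\lambda} \cdot 2 \left( p_i - \frac{1}{n} \right) = \pi(i) - \frac{1}{\lambda} \left( p_i - \frac{1}{n} \right). \]
Since the feasible set is entirely defined by linear (in)equalities, the Kuhn-Tucker conditions give necessary and sufficient conditions for a solution to be a maximum. So $p$ solves the maximization problem if and only if there are Lagrange multipliers $\mu_i \ge 0$ associated with the inequality constraints $p_i \ge 0$ and $\nu \in \mathbb{R}$ associated with the equality constraint $\sum_{i=1}^n p_i = 1$ such that for each $i = 1, \ldots, n$:
\[ \pi(i) - \frac{1}{\lambda} \left( p_i - \frac{1}{n} \right) + \mu_i + \nu = 0 \quad \text{and} \quad \mu_i p_i = 0. \tag{60} \]
Rewriting we find
\[ \forall i = 1, \ldots, n : \quad p_i = \lambda \pi(i) + \lambda (\mu_i + \nu) + \frac{1}{n}. \]
Assume that $p$ solves the maximization problem. We check that it satisfies the linear probability model with parameter $\lambda$. If $p_i > 0$, then $\mu_i = 0$ by complementary slackness. Hence for every $j \in A$, we find, using (60):
\[ p_i - p_j = \left( \lambda \pi(i) + \lambda \nu + \frac{1}{n} \right) - \left( \lambda \pi(j) + \lambda (\mu_j + \nu) + \frac{1}{n} \right) = \lambda (\pi(i) - \pi(j)) - \lambda \mu_j \le \lambda (\pi(i) - \pi(j)), \]
where the inequality follows from the fact that $\lambda > 0$ and $\mu_j \ge 0$. This is exactly requirement (52).

Conversely, if $p$ satisfies requirement (52), one can easily show that it satisfies the Kuhn-Tucker conditions. Recall that if $p_i > 0$ and $p_j > 0$, then
\[ p_i - p_j = \lambda (\pi(i) - \pi(j)), \]
so
\[ \frac{1}{\lambda} \left( p_i - \frac{1}{n} \right) - \pi(i) = \frac{1}{\lambda} \left( p_j - \frac{1}{n} \right) - \pi(j). \tag{61} \]
Hence if we choose $i \in \{1, \ldots, n\}$ with $p_i > 0$ and define
\[ \nu = \frac{1}{\lambda} \left( p_i - \frac{1}{n} \right) - \pi(i) \in \mathbb{R}, \]
we have from (61) that
\[ \nu = \frac{1}{\lambda} \left( p_j - \frac{1}{n} \right) - \pi(j) \]
for all $j$ with $p_j > 0$. Now define for each $k$:
\[ \mu_k = \begin{cases} 0 & \text{if } p_k > 0, \\ \frac{1}{\lambda} \left( p_k - \frac{1}{n} \right) - \pi(k) - \nu & \text{if } p_k = 0. \end{cases} \]
To see that $\mu_k \ge 0$ if $p_k = 0$, choose an alternative $j$ with $p_j > 0$. By definition of the linear probability model,
\[ p_j - p_k \le \lambda (\pi(j) - \pi(k)), \]
which implies
\[ (\pi(j) - \pi(k)) - \frac{1}{\lambda} (p_j - p_k) \ge 0. \]
Hence
\[ \mu_k = \frac{1}{\lambda} \left( p_k - \frac{1}{n} \right) - \pi(k) - \nu = \frac{1}{\lambda} \left( p_k - \frac{1}{n} \right) - \pi(k) - \frac{1}{\lambda} \left( p_j - \frac{1}{n} \right) + \pi(j) = (\pi(j) - \pi(k)) - \frac{1}{\lambda} (p_j - p_k) \ge 0, \]
as we had to show. Substituting the definition of the Lagrange multipliers in (60) shows that the Kuhn-Tucker conditions are satisfied.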
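The maximization problem can also be solved directly. The sketch below assumes the quadratic control cost $c(p) = \sum_i (p_i - 1/n)^2$ suggested by the gradient above; under that assumption the maximizer is the Euclidean projection of the vector with entries $1/n + \lambda \pi(i)$ onto the probability simplex, computed here with the standard sort-based projection. The function names are hypothetical, and the payoffs $(0, 2, 8)$ are those of Exercise 11.3.

```python
from fractions import Fraction as F

def linear_choice_probabilities(payoffs, lam):
    """Maximize sum_i p_i*payoffs[i] - (1/(2*lam)) * sum_i (p_i - 1/n)^2 over the simplex,
    by projecting the unconstrained maximizer 1/n + lam*payoffs onto the simplex."""
    n = len(payoffs)
    v = [F(1, n) + lam * x for x in payoffs]
    u = sorted(v, reverse=True)
    cumulative, theta = F(0), F(0)
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - 1) / k
        if u_k - candidate > 0:      # holds exactly for the alternatives kept in the support
            theta = candidate
    return [max(x - theta, F(0)) for x in v]

def satisfies_linear_model(p, payoffs, lam):
    """Requirement (52): p_i > 0 implies p_i - p_j <= lam*(payoffs[i] - payoffs[j]) for all j."""
    n = len(p)
    return all(p[i] - p[j] <= lam * (payoffs[i] - payoffs[j])
               for i in range(n) if p[i] > 0 for j in range(n))

payoffs = [0, 2, 8]
interior = linear_choice_probabilities(payoffs, F(1, 20))
boundary = linear_choice_probabilities(payoffs, F(1, 8))
corner = linear_choice_probabilities(payoffs, F(1, 4))
```

The shift `theta` plays the role of $-\lambda\nu$ in (60), and alternatives that are clipped to zero are exactly those with a positive multiplier $\mu_k$.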

Exercise 11.2
(a): Choice probabilities are weakly increasing in payoffs, so the probability of choosing 1 must be positive. If also the probability of choosing 2 is positive, the linearity requirement implies
\[ P_A(1) - P_A(2) = \lambda (\pi(1) - \pi(2)) = 4\lambda. \]
Together with $P_A(1) + P_A(2) = 1$, this gives
\[ P_A(1) = \frac{4\lambda + 1}{2}, \qquad P_A(2) = \frac{1 - 4\lambda}{2}. \tag{62} \]
Obviously, this is possible if and only if both these probabilities are nonnegative, i.e., if and only if $\lambda \le 1/4$. So for $\lambda \in (0, 1/4]$, the choice probabilities in (62) satisfy the linear probability model and we know that for every $\lambda$ there is only one such vector of choice probabilities. For $\lambda > 1/4$, we find
\[ P_A(1) = 1, \qquad P_A(2) = 0. \tag{63} \]
(b), (c): Answered in the notes on "The role of $\lambda$".

Exercise 11.3
(a):
• In the logit model with parameter $\lambda > 0$, the choice probability for each alternative $i \in A$ is
\[ P_A(i) = \frac{\exp(\pi(i)/\lambda)}{\sum_{j \in A} \exp(\pi(j)/\lambda)}. \tag{64} \]
• Substituting the payoffs, we find:
\[ P_A(1) = \frac{\exp(0/\lambda)}{\exp(0/\lambda) + \exp(2/\lambda) + \exp(8/\lambda)} = \frac{1}{1 + \exp(2/\lambda) + \exp(8/\lambda)}, \]
\[ P_A(2) = \frac{\exp(2/\lambda)}{1 + \exp(2/\lambda) + \exp(8/\lambda)}, \]
\[ P_A(3) = \frac{\exp(8/\lambda)}{1 + \exp(2/\lambda) + \exp(8/\lambda)}. \]
Since the exponential function takes strictly positive values, all choice probabilities lie in $(0, 1)$.

• The logit model is a special case of Luce's choice model (see (42) and (45)), which satisfies path independence. Hence the logit model satisfies path independence.

• As $\lambda \to \infty$ the choice probabilities converge to $1/3$. See the motivation in Section 11.2.
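A short numerical sketch of formula (64) for the payoffs $(0, 2, 8)$ follows. The function name is hypothetical, and subtracting the largest payoff before exponentiating is an implementation choice made here to avoid overflow for small $\lambda$; it cancels in the ratio and leaves the probabilities unchanged.

```python
import math

def logit(payoffs, lam):
    """Logit choice probabilities exp(pi(i)/lam) / sum_j exp(pi(j)/lam)."""
    top = max(payoffs)
    weights = [math.exp((x - top) / lam) for x in payoffs]   # shift cancels in the ratio
    total = sum(weights)
    return [w / total for w in weights]

payoffs = [0.0, 2.0, 8.0]

# Large lam: probabilities approach the uniform distribution (1/3 each).
nearly_uniform = logit(payoffs, 1e6)

# Path independence at lam = 2: P_A(1) = P_A({1,2}) * P_{1,2}(1).
p_full = logit(payoffs, 2.0)
p_pair = logit(payoffs[:2], 2.0)
lhs = p_full[0]
rhs = (p_full[0] + p_full[1]) * p_pair[0]
```

Up to floating-point rounding `lhs` equals `rhs`, in line with the logit model being a special case of Luce's choice model.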

(b):
• Choice probabilities $P_A(i)$ for all alternatives $i \in A$ satisfy the linear probability model with parameter $\lambda > 0$ if the following holds:
\[ \text{if } P_A(i) > 0, \text{ then } P_A(i) - P_A(j) \le \lambda (\pi(i) - \pi(j)) \text{ for all } j \in A. \tag{65} \]
• Since choice probabilities are weakly increasing in payoffs and $\pi(3) > \pi(2) > \pi(1)$, there are three cases to consider:

Case 1: $P_A(i) > 0$ for all $i \in A$.

Case 2: $P_A(3), P_A(2) > 0$, $P_A(1) = 0$.

Case 3: $P_A(3) > 0$, $P_A(2) = P_A(1) = 0$, or equivalently, $P_A(3) = 1$.

• Using (65), the first case requires:
\[ P_A(3) - P_A(2) = \lambda (\pi(3) - \pi(2)) = 6\lambda, \]
\[ P_A(3) - P_A(1) = \lambda (\pi(3) - \pi(1)) = 8\lambda, \]
\[ P_A(1) + P_A(2) + P_A(3) = 1. \]
So:
\[ P_A(2) = P_A(3) - 6\lambda, \qquad P_A(1) = P_A(3) - 8\lambda, \qquad 3 P_A(3) - 14\lambda = 1. \]
Conclude that
\[ P_A(3) = \frac{1 + 14\lambda}{3}, \qquad P_A(2) = \frac{1 + 14\lambda}{3} - 6\lambda = \frac{1 - 4\lambda}{3}, \qquad P_A(1) = \frac{1 + 14\lambda}{3} - 8\lambda = \frac{1 - 10\lambda}{3}. \tag{66} \]
To make sure that all probabilities are positive, this requires that $\lambda \in (0, 1/10)$. So the probabilities in (66) satisfy the linear probability model for $\lambda \in (0, 1/10)$.

• Using (65), the second case requires:
\[ P_A(3) - P_A(2) = \lambda (\pi(3) - \pi(2)) = 6\lambda, \]
\[ P_A(3) - P_A(1) = P_A(3) \le \lambda (\pi(3) - \pi(1)) = 8\lambda, \]
\[ P_A(1) = 0, \]
\[ P_A(1) + P_A(2) + P_A(3) = 1. \]
Rewrite:
\[ P_A(2) = P_A(3) - 6\lambda, \qquad P_A(1) = 0, \qquad P_A(3) \le 8\lambda, \qquad 2 P_A(3) - 6\lambda = 1. \]
Conclude that
\[ P_A(3) = \frac{1 + 6\lambda}{2}, \qquad P_A(2) = \frac{1 + 6\lambda}{2} - 6\lambda = \frac{1 - 6\lambda}{2}, \qquad P_A(1) = 0. \tag{67} \]
To make sure that $P_A(2)$ and $P_A(3)$ are positive and
\[ P_A(3) = \frac{1 + 6\lambda}{2} \le 8\lambda, \]
this requires that $\lambda \in [1/10, 1/6)$. Conclude that the choice probabilities in (67) satisfy the linear probability model for $\lambda \in [1/10, 1/6)$.

• Using (65), the third case requires:
\[ P_A(3) - P_A(2) = 1 \le \lambda (\pi(3) - \pi(2)) = 6\lambda, \]
\[ P_A(3) - P_A(1) = 1 \le \lambda (\pi(3) - \pi(1)) = 8\lambda. \]
So choice probabilities $P_A(1) = P_A(2) = 0$, $P_A(3) = 1$ satisfy the linear probability model as long as $\lambda \ge 1/6$.

• The linear probability model does not satisfy path independence for every $\lambda > 0$. In particular, we will show that for a specific value of $\lambda > 0$, $P_A(1) \neq P_A(\{1, 2\}) P_{\{1,2\}}(1)$. This means that we have to consider choice probabilities in the smaller problem with only alternatives 1 and 2. Let us assume that both $P_{\{1,2\}}(1)$ and $P_{\{1,2\}}(2)$ are positive. This requires that
\[ P_{\{1,2\}}(2) - P_{\{1,2\}}(1) = \lambda (\pi(2) - \pi(1)) = 2\lambda, \]
\[ P_{\{1,2\}}(1) + P_{\{1,2\}}(2) = 1, \]
so
\[ P_{\{1,2\}}(1) = \frac{1 - 2\lambda}{2}, \qquad P_{\{1,2\}}(2) = \frac{1 + 2\lambda}{2}. \]
These choice probabilities satisfy the linear probability model as long as $\lambda \in (0, 1/2)$. Now let us choose $\lambda = 1/20$. Then
\[ P_A(1) = \frac{1 - 10\lambda}{3} = \frac{1}{6} \]
but
\[ P_A(\{1, 2\}) P_{\{1,2\}}(1) = \left( \frac{1 - 10\lambda}{3} + \frac{1 - 4\lambda}{3} \right) \frac{1 - 2\lambda}{2} = \frac{2 - 14\lambda}{3} \cdot \frac{1 - 2\lambda}{2} = \frac{(1 - 7\lambda)(1 - 2\lambda)}{3} = \frac{39}{200} \neq \frac{1}{6}. \]
• As $\lambda \to \infty$ it follows from our earlier analysis that Case 3 is the only feasible one: the decision maker rationally chooses alternative 3 with probability one.
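The arithmetic behind the failure of path independence can be replayed with exact fractions. The sketch below simply encodes the Case 1 formulas for the three-alternative problem and the two-alternative formulas from the text; the function names are hypothetical.

```python
from fractions import Fraction as F

def case1_probabilities(lam):
    """(P_A(1), P_A(2), P_A(3)) when all three alternatives are chosen, valid for 0 < lam < 1/10."""
    return ((1 - 10 * lam) / 3, (1 - 4 * lam) / 3, (1 + 14 * lam) / 3)

def pair_probabilities(lam):
    """(P_{1,2}(1), P_{1,2}(2)) in the problem with alternatives 1 and 2 only, valid for 0 < lam < 1/2."""
    return ((1 - 2 * lam) / 2, (1 + 2 * lam) / 2)

lam = F(1, 20)
p1, p2, p3 = case1_probabilities(lam)
q1, q2 = pair_probabilities(lam)

direct = p1                  # P_A(1)
via_subset = (p1 + p2) * q1  # P_A({1,2}) * P_{1,2}(1)
```

With $\lambda = 1/20$, `direct` is $1/6$ while `via_subset` is $39/200$, so the two routes to alternative 1 disagree.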

Exercise 11.4
Suppose $\pi(i) > \pi(j)$, but $p_i < p_j$. Exchange the probabilities assigned to the $i$-th and $j$-th alternative to obtain a vector $p'$. By construction, $\sum_{k=1}^n p'_k \pi(k) > \sum_{k=1}^n p_k \pi(k)$, and by symmetry, the control cost term is unaffected, contradicting that $p$ solves $P(\lambda)$.
