Uploaded by Syed Asim Bukhari

ECON509 Introduction to Mathematical Economics I - Lecture Notes


University of Virginia

Lecture notes based on Chiang and Wainwright, Fundamental Methods of Mathematical Economics.

1 Mathematical economics

Why describe the world with mathematical models, rather than use verbal theory and logic? After all, this was the state of economics until not too long ago (say, the 1950s).

1. Math is a concise, parsimonious language, so we can describe a lot using fewer words.

2. Math contains many tools and theorems that help us make general statements.

3. Math forces us to state all assumptions explicitly, and helps prevent us from failing to acknowledge implicit assumptions.

Math has become a common language for most economists. It facilitates communication between economists. Warning: despite its usefulness, if math is the only language for economists, then we restrict not only communication among us but, more importantly, our understanding of the world.

Mathematical models make strong assumptions and use theorems to deliver insightful conclusions. But remember the A-A'-C-C' Theorem:

Let C be the set of conclusions that follow from the set of assumptions A. Let A' be a small perturbation of A. There exists such an A' that delivers a set of conclusions C' that is disjoint from C. Thus, the insightfulness of C depends critically on the plausibility of A.

The plausibility of A depends on empirical validity, which needs to be established, usually using econometrics. On the other hand, theory sometimes informs us on how to look at existing data, how to collect new data, and which tools to use in its analysis. Thus, there is a constant discourse between theory and empirics. Neither can do without the other (see the inductivism vs. deductivism debate).

Theory is an abstraction of the world. You focus on the relationships that you consider, a priori, most important for understanding some phenomenon. This may yield an economic model.

2 Economic models

Some useful notation: $\forall$ for all, $\exists$ exists, $\exists!$ exists and is unique. If we cross out any of these, or prefix it by a negation sign such as $\neg$, then it means "not".

2.1 Ingredients of mathematical models

1. Equations:

Definitions: $\pi = R - C$
: $Y = C + I + G + E - M$
: $K_{t+1} = (1 - \delta) K_t + I_t$

Behavioral/Optimization: $q^d = \alpha - \beta p$
: $MC = MR$
: $MC = P$

Equilibrium: $q^d = q^s$

Parameters and functional forms govern the equations, which determine the relationships between variables. Thus, any complete mathematical model can be written as

$$F(\theta, Y, X) = 0,$$

where $F$ is a set of functions (say, demand and supply), $\theta$ is a set of parameters (say, elasticities), $Y$ are endogenous variables (price and quantity) and $X$ are exogenous, predetermined variables (income, weather). Some models will not have explicit $X$ variables.

One general definition of a model's equilibrium is "a constellation of selected, interrelated variables so adjusted to one another that no inherent tendency to change prevails in the model which they constitute".

Selected: there may be other variables. This implies a choice of what is endogenous and what is exogenous, but also of the overall set of variables that are explicitly considered in the model. Changing the set of variables under discussion, or the partition into exogenous and endogenous, will likely change the equilibrium.

Interrelated: all variables must be simultaneously in a state of rest, i.e. constant. And the value of each variable must be consistent with the values of all other variables.

Inherent: this means that only the relationships within the model are setting the equilibrium. It implies that the exogenous variables and parameters are all fixed.

Since all variables are at rest, an equilibrium is often called a static. Comparing equilibria is therefore called comparative statics.

An equilibrium can be defined as a $Y$ that solves

$$F(\theta, Y, X) = 0$$

for given $\theta$ and $X$. This is one example of the usefulness of mathematics for economists: see how much is described by so little notation.

We are interested in finding an equilibrium for $F(\theta, Y, X) = 0$. Sometimes there will be no solution. Sometimes it will be unique and sometimes there will be multiple equilibria. Each of these situations is interesting in some context. In most cases, especially when policy is involved, we want a model to have a unique equilibrium, because it implies a function from $(\theta, X)$ to $Y$ (the implicit function theorem). But this does not necessarily mean that reality follows a unique equilibrium; that is only a feature of a model. Warning: models with a unique equilibrium are useful for many theoretical purposes, but it takes a leap of faith to go from model to reality, as if the unique equilibrium pertains to reality.

Students should familiarize themselves with the rest of chapter 3 on their own.

2.3 Numbers

Natural, $\mathbb{N}$: $0, 1, 2, \ldots$ or sometimes $1, 2, 3, \ldots$

Integers, $\mathbb{Z}$: $\ldots, -2, -1, 0, 1, 2, \ldots$

Rational, $\mathbb{Q}$: $n/d$ where both $n$ and $d$ are integers and $d$ is not zero. $n$ is the numerator and $d$ is the denominator.

Irrational numbers: cannot be written as rational numbers, e.g., $\pi$, $e$, $\sqrt{2}$.

Real, $\mathbb{R}$: rational and irrational. The real line: $(-\infty, \infty)$. This is a special set: it is dense (between any two real numbers there is another real number), and there are just as many real numbers between 0 and 1 (or any other two real numbers) as on the entire real line.

Complex: an extension of the real numbers, where there is an additional dimension in which we add imaginary numbers to the real numbers: $x + iy$, where $i = \sqrt{-1}$.

2.4 Sets

We already described some sets above ($\mathbb{N}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{Z}$). A set $S$ contains elements $e$:

$$S = \{e_1, e_2, e_3, e_4\},$$

where $e_i$ may be numbers or objects (say: car, bus, bike, etc.). We can think of sets in terms of the number of elements that they contain:

Finite: $S = \{e_1, e_2, e_3, e_4\}$.

Countable: there is a one-to-one mapping between the set and $\mathbb{N}$ (or a subset of it). Trivially, a finite set is countable.

Infinite and countable: $\mathbb{Q}$. Despite containing infinitely many elements, it is countable.

Proper subset: $S_1 \subset S_2$ is defined as $\forall e \in S_1,\ e \in S_2$ and $\exists e \in S_2,\ e \notin S_1$.

Equal: $S_1 = S_2$: $\forall e \in S_1,\ e \in S_2$ and $\forall e \in S_2,\ e \in S_1$.

The null set, $\emptyset$, is a subset of any set, including itself, because it does not contain any element that is not in the other set (it is empty).

Disjoint sets: $S_1$ and $S_2$ are disjoint if they do not share common elements, i.e. if there does not exist an element $e$ such that $e \in S_1$ and $e \in S_2$.

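Operations on sets:

Complement: $\bar{A} = \{e \mid e \notin A\}$.

Minus: $A \setminus B = \{e \mid e \in A \text{ and } e \notin B\}$. E.g., $\bar{A} = U \setminus A$, where $U$ is the universal set.

Rules:

Commutative:

$$A \cup B = B \cup A$$
$$A \cap B = B \cap A$$

Associative:

$$(A \cup B) \cup C = A \cup (B \cup C)$$
$$(A \cap B) \cap C = A \cap (B \cap C)$$

Distributive:

$$A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$$
$$A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$$

Do Venn diagrams.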

Ordered pairs: whereas $\{x, y\} = \{y, x\}$ because these are sets, which are unordered, $(x, y) \neq (y, x)$ unless $x = y$ (think of the two dimensional plane $\mathbb{R}^2$).

Let $X$ and $Y$ be two sets. Then the Cartesian product of $X$ and $Y$ is the set

$$X \times Y = \{(x, y) \mid x \in X,\ y \in Y\},$$

so that the set $Y$ is related to the set $X$. Any subset of a Cartesian product also has this trait. Note that each $x \in X$ may have more than one $y \in Y$ that is related to it.

If

$$\forall x \in X,\ \exists! y \in Y \text{ such that } (x, y) \in S \subset X \times Y,$$

then $y$ is a function of $x$, written

$$y = f(x)$$

or

$$f : X \to Y.$$

The second form is also called a mapping, or transformation. Note that although for $y$ to be a function of $x$ we must have $\forall x \in X,\ \exists! y \in Y$, it is not necessarily true that $\forall y \in Y,\ \exists! x \in X$. In fact, there need not exist any such $x$ at all. For example, $y = a + x^2$, $a > 0$: no $x$ maps to any $y < a$.

In $y = f(x)$, $y$ is the value or dependent variable; $x$ is the argument or independent variable. The set of all permissible values of $x$ is called the domain. For $y = f(x)$, $y$ is the image of $x$. The set of all possible images is called the range.

2.6 Functional forms

Students should familiarize themselves with polynomials, exponents, logarithms, "rectangular hyperbolic" functions (unit elasticity).

$z = f(x, y)$ is a function from a plane in $\mathbb{R}^2$ to $\mathbb{R}$ or a subset of it. $y = f(x_1, x_2, \ldots, x_n)$ is a function from the $\mathbb{R}^n$ hyperplane or hypersurface to $\mathbb{R}$ or a subset of it.

3 Equilibrium analysis

Students cover independently. Conceptual points reported above in 2.2.

4 Matrix algebra

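4.1 Definitions

Matrix:

$$A_{m \times n} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & & a_{2n} \\ \vdots & & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} = [a_{ij}], \quad i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, n.$$

Notation: usually matrices are denoted in upper case. $m$ and $n$ are called the dimensions.

Vector:

$$x_{m \times 1} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix}.$$

Notation: usually lowercase. Sometimes called a column vector. A row vector is

$$x' = \begin{bmatrix} x_1 & x_2 & \cdots & x_m \end{bmatrix}.$$

Equality: $A = B$ iff $a_{ij} = b_{ij}\ \forall i, j$. Clearly, the dimensions of $A$ and $B$ must be equal.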

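Matrix multiplication: Let $A_{m \times n}$ and $B_{k \times l}$ be matrices. The product $AB$ is defined only if $n = k$; each row of $A$ is multiplied into each column of $B$:

$$A_{m \times n} B_{n \times l} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & & a_{2n} \\ \vdots & & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1l} \\ b_{21} & b_{22} & & b_{2l} \\ \vdots & & \ddots & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{nl} \end{bmatrix} = \left[ c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} \right], \quad i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, l.$$

– $(A')' = A$

– $(A + B)' = A' + B'$

– $(AB)' = B'A'$

Operation rules

– Commutative addition: $A + B = B + A$.

Identity matrix:

$$I_n = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}.$$

$AI = IA = A$ (of course, dimensions must conform).

Zero matrix: all elements are zero. $0 + A = A$, $0A = A0 = 0$ (of course, dimensions must conform).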

Example: the linear regression model is $y_{n \times 1} = X_{n \times k} \beta_{k \times 1} + \varepsilon_{n \times 1}$ and the model estimated by OLS is $y = Xb + e$, where $b = (X'X)^{-1} X'y$. Therefore we have $\hat{y} = Xb = X (X'X)^{-1} X'y$ and

$$e = y - \hat{y} = y - Xb = y - X (X'X)^{-1} X'y = \left[ I - X (X'X)^{-1} X' \right] y.$$

We can define the projection matrix as $P = X (X'X)^{-1} X'$ and the residual generating matrix as $R = [I - P]$. Both $P$ and $R$ are idempotent. What does it mean that $P$ is idempotent? And $R$?
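The idempotency of the projection and residual generating matrices can be checked numerically. The sketch below is only an illustration: the data matrix `X` (three observations, a constant and one regressor) is hypothetical, and the helper functions `matmul`, `transpose` and `inverse_2x2` are written here for the example rather than taken from the notes. Exact fractions are used so that equality holds without rounding error.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix via the adjoint divided by the determinant."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical regressors: a constant and one variable, three observations.
X = [[Fraction(1), Fraction(0)],
     [Fraction(1), Fraction(1)],
     [Fraction(1), Fraction(2)]]
Xt = transpose(X)

# P = X (X'X)^{-1} X' and R = I - P
P = matmul(matmul(X, inverse_2x2(matmul(Xt, X))), Xt)
R = [[(1 if i == j else 0) - P[i][j] for j in range(3)] for i in range(3)]

print(matmul(P, P) == P)  # projecting twice equals projecting once
print(matmul(R, R) == R)
```

Idempotent means that applying the matrix a second time changes nothing: the fitted values of the fitted values are the fitted values themselves, and likewise for residuals.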

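For example, $AB = 0$ does not imply $A = 0$ or $B = 0$:

$$A = \begin{bmatrix} 2 & 4 \\ 1 & 2 \end{bmatrix}, \quad B = \begin{bmatrix} -2 & 4 \\ 1 & -2 \end{bmatrix}.$$

Likewise, $CD = CE$ does NOT imply $D = E$. E.g.,

$$C = \begin{bmatrix} 2 & 3 \\ 6 & 9 \end{bmatrix}, \quad D = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}, \quad E = \begin{bmatrix} -2 & 1 \\ 3 & 2 \end{bmatrix}.$$

This is because $A$, $B$ and $C$ are singular: there is one (or more) row or column that is a linear combination of the other rows or columns, respectively.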
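A quick numerical check of these two cancellation failures is sketched below. The minus signs in `B` and `E` are an assumption: they are the values for which the stated products work out, and the helper names `matmul` and `det2` are introduced only for this illustration.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

A = [[2, 4], [1, 2]]
B = [[-2, 4], [1, -2]]   # signs assumed, see lead-in
C = [[2, 3], [6, 9]]
D = [[1, 1], [1, 2]]
E = [[-2, 1], [3, 2]]    # signs assumed, see lead-in

AB = matmul(A, B)
CD = matmul(C, D)
CE = matmul(C, E)

print(AB)                           # zero matrix although A and B are nonzero
print(CD == CE, D == E)             # equal products, different factors
print(det2(A), det2(B), det2(C))    # all zero: the matrices are singular
```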

Scalar multiplication: Let $x_{m \times 1}$ be a vector. Then the scalar product $cx$ is

$$c x_{m \times 1} = \begin{bmatrix} c x_1 \\ c x_2 \\ \vdots \\ c x_m \end{bmatrix}.$$

Inner product: Let $x_{m \times 1}$ and $y_{m \times 1}$ be vectors. Then the inner product is a scalar

$$x'y = \sum_{i=1}^{m} x_i y_i.$$

Outer product: Let $x_{m \times 1}$ and $y_{n \times 1}$ be vectors. Then the outer product is a matrix

$$xy' = \begin{bmatrix} x_1 y_1 & x_1 y_2 & \cdots & x_1 y_n \\ x_2 y_1 & x_2 y_2 & & x_2 y_n \\ \vdots & & \ddots & \vdots \\ x_m y_1 & x_m y_2 & \cdots & x_m y_n \end{bmatrix}_{m \times n}.$$

– Scalar multiplication.

– Vector addition.

– Vector subtraction.

– Inner product and orthogonality ($x'y = 0$ means $x \perp y$).

4.4 Linear dependence

Definition 1: a set of $k$ vectors $x_1, x_2, \ldots, x_k$ is linearly independent iff none of them can be expressed as a linear combination of all or some of the others. Otherwise, they are linearly dependent.

Definition 2: a set of $k$ vectors $x_1, x_2, \ldots, x_k$ is linearly independent iff $\neg \exists$ a set of scalars $c_1, c_2, \ldots, c_k$, not all zero, such that $\sum_{i=1}^{k} c_i x_i = 0$. Otherwise, they are linearly dependent.

Consider $\mathbb{R}^2$.

All vectors that are multiples of one another are linearly dependent. If two vectors cannot be expressed as multiples of one another then they are linearly independent.

If two vectors are linearly independent, then any third vector can be expressed as a linear combination of the two.

The complete set of vectors of $n$ dimensions is a space, or vector space. If all elements of these vectors are real numbers ($\in \mathbb{R}$), then this space is $\mathbb{R}^n$.

A base spans the space to which it pertains. This means that any vector in $\mathbb{R}^n$ can be expressed as a linear combination of the base (it is spanned by the base).

Bases are minimal: they contain the smallest number of vectors that span the space.

$$e_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad e_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \quad e_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$

is a base.

Distance metric: Let $x, y \in S$, some set. Define the distance between $x$ and $y$ by a function $d$: $d = d(x, y)$, which has the following properties:

– $d(x, y) = 0 \Leftrightarrow x = y$.

A metric space is given by a vector space + distance metric. The Euclidean space is given by $\mathbb{R}^n$ + the following distance function

$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} = \sqrt{(x - y)'(x - y)}.$$

But you can imagine other metrics that give rise to other, different metric spaces.

Definition: if for some square ($n \times n$) matrix $A$ there exists a matrix $B$ such that $AB = I_n$, then $B$ is the inverse of $A$, and is denoted $A^{-1}$, i.e. $AA^{-1} = I$.

Properties:

Not all square matrices have an inverse. If $A^{-1}$ does not exist, then $A$ is singular. Otherwise, $A$ is nonsingular.

$A$ is the inverse of $A^{-1}$ and vice versa.

The inverse, if it exists, is unique. Proof: suppose not, i.e. $AB = I$ and $B \neq A^{-1}$. Then $A^{-1}AB = A^{-1}I$, so $IB = B = A^{-1}$, a contradiction.

Operation rules:

– $(A^{-1})^{-1} = A$

– $(AB)^{-1} = B^{-1}A^{-1}$. Proof: Let $(AB)^{-1} = C$. Then $(AB)^{-1}(AB) = I = C(AB) = CAB \Rightarrow CABB^{-1} = CA = IB^{-1} = B^{-1} \Rightarrow CAA^{-1} = C = B^{-1}A^{-1}$.

– $(A')^{-1} = (A^{-1})'$. Proof: Let $(A')^{-1} = B$. Then $(A')^{-1}A' = I = BA' \Rightarrow (BA')' = AB' = I' = I \Rightarrow A^{-1}AB' = A^{-1}I \Rightarrow B' = A^{-1} \Rightarrow B = (A^{-1})'$.

Given a square matrix, a sufficient condition is that the rows or columns are linearly independent. It does not matter whether we use the row or column criterion because the matrix is square.

$$\underbrace{A \text{ is square} + \text{linear independence}}_{\text{necessary and sufficient conditions}} \Leftrightarrow A \text{ is nonsingular} \Leftrightarrow \exists A^{-1}$$

How do we find the inverse matrix? Soon... Why do we care? See next section.

4.7 Solving systems of linear equations

We seek a solution $x$ to the system $Ax = c$:

$$A_{n \times n} x_{n \times 1} = c_{n \times 1} \Rightarrow x = A^{-1}c,$$

where $A$ is a nonsingular matrix and $c$ is a vector. Each row of $A$ gives coefficients to the elements of $x$:

$$\text{row 1}: \sum_{i=1}^{n} a_{1i} x_i = c_1$$
$$\text{row 2}: \sum_{i=1}^{n} a_{2i} x_i = c_2$$

Many linear models can be solved this way. We will learn clever ways to compute the solution to this system. We care about singularity of $A$ because (given $c$) it tells us something about the solution $x$.

We introduce this through an example. Let $x$ denote a vector of employment and unemployment rates: $x' = \begin{bmatrix} e & u \end{bmatrix}$, where $e + u = 1$. Define the matrix $P$ as a transition matrix that gives the conditional probabilities for transition from the state today to a state next period,

$$P = \begin{bmatrix} p_{ee} & p_{eu} \\ p_{ue} & p_{uu} \end{bmatrix},$$

where $P_{ij} = \Pr(\text{state } j \text{ tomorrow} \mid \text{state } i \text{ today})$. Clearly, $p_{ee} + p_{eu} = 1$ and $p_{ue} + p_{uu} = 1$. Now add a time dimension to $x$: $x_t' = \begin{bmatrix} e_t & u_t \end{bmatrix}$.

We ask: what are the employment and unemployment rates going to be in $t + 1$ given $x_t$? Answer:

$$x_{t+1}' = x_t' P = \begin{bmatrix} e_t & u_t \end{bmatrix} \begin{bmatrix} p_{ee} & p_{eu} \\ p_{ue} & p_{uu} \end{bmatrix} = \begin{bmatrix} e_t p_{ee} + u_t p_{ue} & e_t p_{eu} + u_t p_{uu} \end{bmatrix}.$$

What will they be in $t + 2$? Answer: $x_{t+2}' = x_t' P^2$. More generally, $x_{t_0 + k}' = x_{t_0}' P^k$.

A transition matrix, sometimes called a stochastic matrix, is defined as a square matrix whose elements are nonnegative and all of whose rows sum to 1. This gives you conditional transition probabilities starting from each state, where each row is a starting state and each column is the state in the next period.

Steady state: a situation in which the distribution over the states is not changing over time. How do we find such a state, if it exists?

Method 1: Start with some initial condition $x_0$ and iterate forward $x_k' = x_0' P^k$, taking $k \to \infty$.
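Method 1 can be sketched in a few lines. The transition probabilities below are hypothetical numbers chosen only for illustration, and `step` and `iterate` are helper names introduced here. For a two-state chain the limit can be cross-checked against the closed form $e = p_{ue} / (p_{eu} + p_{ue})$, which follows from $e = e\,p_{ee} + u\,p_{ue}$ together with $e + u = 1$.

```python
def step(x, P):
    """One period ahead: the row vector x' times the transition matrix P."""
    n = len(P)
    return [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]

def iterate(x0, P, k):
    """Apply the transition k times, i.e. compute x0' P^k."""
    x = list(x0)
    for _ in range(k):
        x = step(x, P)
    return x

# Hypothetical transition probabilities; each row sums to 1.
P = [[0.95, 0.05],   # employed today   -> (employed, unemployed) tomorrow
     [0.40, 0.60]]   # unemployed today -> (employed, unemployed) tomorrow

x_limit = iterate([0.5, 0.5], P, 200)

# Closed-form steady state for the two-state case.
e_star = P[1][0] / (P[0][1] + P[1][0])

print(x_limit, e_star)
```

Starting from a different initial condition, say `[1.0, 0.0]`, leads to the same limit for this particular matrix, which is what makes the limit a steady state rather than a property of the starting point.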


5 Matrix algebra continued and linear models

5.1 Rank

Definition: The number of linearly independent rows (or, equivalently, columns) of a matrix $A$ is the rank of $A$: $r = \text{rank}(A)$.

Multiplying a matrix $A$ by another matrix $B$ that is square and of full rank does not change the rank of $A$.

In general, if $\text{rank}(A) = r_A$ and $\text{rank}(B) = r_B$, then $\text{rank}(AB) \leq \min\{r_A, r_B\}$.

Finding the rank: the echelon matrix method. First define elementary operations:

3. Interchanging rows: $R_i \leftrightarrow R_j$.

All these operations alter the matrix, but do not change its rank (in fact, they can all be expressed as multiplication by matrices that are all of full rank).

Define: echelon matrix.

3. The first nonzero element of each row (which is 1) appears to the left of the first nonzero element of the row directly below it.

The number of nonzero rows in the echelon matrix is the rank.

We use the elementary operations in order to change the subject matrix into an echelon matrix, which has as many zeros as possible. A good way to start the process is to concentrate zeros at the bottom. Example:

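$$A = \begin{bmatrix} 0 & -11 & -4 \\ 2 & 6 & 2 \\ 4 & 1 & 0 \end{bmatrix} \quad R_1 \leftrightarrow R_3: \begin{bmatrix} 4 & 1 & 0 \\ 2 & 6 & 2 \\ 0 & -11 & -4 \end{bmatrix} \quad \tfrac{1}{4} R_1: \begin{bmatrix} 1 & \tfrac{1}{4} & 0 \\ 2 & 6 & 2 \\ 0 & -11 & -4 \end{bmatrix}$$

$$R_2 - 2R_1: \begin{bmatrix} 1 & \tfrac{1}{4} & 0 \\ 0 & 5\tfrac{1}{2} & 2 \\ 0 & -11 & -4 \end{bmatrix} \quad R_3 + 2R_2: \begin{bmatrix} 1 & \tfrac{1}{4} & 0 \\ 0 & 5\tfrac{1}{2} & 2 \\ 0 & 0 & 0 \end{bmatrix} \quad \tfrac{2}{11} R_2: \begin{bmatrix} 1 & \tfrac{1}{4} & 0 \\ 0 & 1 & 4/11 \\ 0 & 0 & 0 \end{bmatrix}$$

There is a row of zeros: $\text{rank}(A) = 2$. So $A$ is singular.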
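The echelon procedure is mechanical enough to automate. The following is a minimal sketch, not the only way to order the operations: the function name `rank` and its pivoting order are choices made for this illustration, and exact fractions avoid rounding problems when testing for zero. The matrix entries, including the minus signs, are assumed to be those of the example, chosen so that the listed row operations go through.

```python
from fractions import Fraction

def rank(matrix):
    """Rank via elementary row operations, reducing to echelon form."""
    m = [[Fraction(v) for v in row] for row in matrix]
    rows, cols = len(m), len(m[0])
    r = 0  # index of the next pivot row
    for c in range(cols):
        # Find a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]          # interchange rows
        lead = m[r][c]
        m[r] = [v / lead for v in m[r]]          # scale so the leading entry is 1
        for i in range(r + 1, rows):             # subtract multiples of the pivot row
            factor = m[i][c]
            m[i] = [vi - factor * vr for vi, vr in zip(m[i], m[r])]
        r += 1
        if r == rows:
            break
    return r  # number of nonzero rows in the echelon matrix

A = [[0, -11, -4],
     [2, 6, 2],
     [4, 1, 0]]

print(rank(A))
```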


5.2 Determinants and nonsingularity

Denote the determinant of a square matrix as $|A_{n \times n}|$. This is not absolute value. If the determinant is zero then the matrix is singular.

1. $|A_{1 \times 1}| = a_{11}$.

3. Determinants for higher order matrices. Let $A_{k \times k}$ be a square matrix. The $i$-$j$ minor $|M_{ij}|$ is the determinant of the matrix given by erasing row $i$ and column $j$ from $A$. Example:

$$A = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}, \quad |M_{11}| = \begin{vmatrix} e & f \\ h & i \end{vmatrix}.$$

The Laplace expansion of row $i$ gives the determinant of $A$:

$$|A_{k \times k}| = \sum_{j=1}^{k} a_{ij} (-1)^{i+j} |M_{ij}| = \sum_{j=1}^{k} a_{ij} C_{ij},$$

where $C_{ij} = (-1)^{i+j} |M_{ij}|$ is called the cofactor. Example: expansion by row 1

$$\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = aC_{11} + bC_{12} + cC_{13}$$
$$= a|M_{11}| - b|M_{12}| + c|M_{13}|$$
$$= a \begin{vmatrix} e & f \\ h & i \end{vmatrix} - b \begin{vmatrix} d & f \\ g & i \end{vmatrix} + c \begin{vmatrix} d & e \\ g & h \end{vmatrix}$$
$$= a(ei - fh) - b(di - fg) + c(dh - eg).$$

In doing this, it is useful to choose the expansion with the row that has the most zeros.
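The Laplace expansion translates directly into a recursive function. The sketch below always expands along the first row for simplicity, rather than searching for the row with the most zeros; `minor` and `det` are names chosen for this illustration.

```python
def minor(m, i, j):
    """The matrix left after erasing row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    # With 0-based indices the sign of the (0, j) cofactor is (-1)**j.
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j))
               for j in range(len(m)))

print(det([[1, 2], [3, 4]]))
print(det([[0, -11, -4], [2, 6, 2], [4, 1, 0]]))  # zero for a singular matrix
```

The recursion performs on the order of $k!$ multiplications for a $k \times k$ matrix, so it is useful for understanding and for small matrices, not for large ones.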

Properties of determinants

1. $|A'| = |A|$

5. If a row or a column is a multiple of another row or column, respectively, then the determinant is zero: linear dependence.

6. Replacing the minors in the Laplace expansion by alien minors gives zero, i.e.

$$\sum_{j=1}^{k} a_{ij} (-1)^{i+j} |M_{nj}| = 0, \quad i \neq n.$$

Determinants and singularity: $|A| \neq 0$

$\Leftrightarrow$ $A$ is nonsingular

$\Leftrightarrow$ columns and rows are linearly independent

$\Leftrightarrow$ $\exists A^{-1}$

$\Leftrightarrow$ for $Ax = c$, $\exists! x = A^{-1}c$

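Let $A$ be a nonsingular matrix,

$$A_{n \times n} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & & a_{2n} \\ \vdots & & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}.$$

The cofactor matrix of $A$ is $C_A$:

$$C_A = \begin{bmatrix} C_{11} & C_{12} & \cdots & C_{1n} \\ C_{21} & C_{22} & & C_{2n} \\ \vdots & & \ddots & \vdots \\ C_{n1} & C_{n2} & \cdots & C_{nn} \end{bmatrix},$$

where $C_{ij} = (-1)^{i+j} |M_{ij}|$. The adjoint matrix of $A$ is $\text{adj}A = C_A'$:

$$\text{adj}A = C_A' = \begin{bmatrix} C_{11} & C_{21} & \cdots & C_{n1} \\ C_{12} & C_{22} & & C_{n2} \\ \vdots & & \ddots & \vdots \\ C_{1n} & C_{2n} & \cdots & C_{nn} \end{bmatrix}.$$

Consider $AC_A'$:

$$AC_A' = \begin{bmatrix} \sum_{j=1}^{n} a_{1j} C_{1j} & \sum_{j=1}^{n} a_{1j} C_{2j} & \cdots & \sum_{j=1}^{n} a_{1j} C_{nj} \\ \sum_{j=1}^{n} a_{2j} C_{1j} & \sum_{j=1}^{n} a_{2j} C_{2j} & & \sum_{j=1}^{n} a_{2j} C_{nj} \\ \vdots & & \ddots & \vdots \\ \sum_{j=1}^{n} a_{nj} C_{1j} & \sum_{j=1}^{n} a_{nj} C_{2j} & \cdots & \sum_{j=1}^{n} a_{nj} C_{nj} \end{bmatrix}$$

$$= \begin{bmatrix} \sum_{j=1}^{n} a_{1j} C_{1j} & 0 & \cdots & 0 \\ 0 & \sum_{j=1}^{n} a_{2j} C_{2j} & & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \sum_{j=1}^{n} a_{nj} C_{nj} \end{bmatrix} = \begin{bmatrix} |A| & 0 & \cdots & 0 \\ 0 & |A| & & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & |A| \end{bmatrix} = |A| I,$$

where the off diagonal elements are zero due to alien cofactors. It follows that

$$AC_A' = |A| I$$
$$AC_A' \frac{1}{|A|} = I$$
$$A^{-1} = C_A' \frac{1}{|A|} = \frac{\text{adj}A}{|A|}.$$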

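Example:

$$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad C_A = \begin{bmatrix} 4 & -3 \\ -2 & 1 \end{bmatrix}, \quad C_A' = \begin{bmatrix} 4 & -2 \\ -3 & 1 \end{bmatrix}, \quad |A| = -2, \quad A^{-1} = \begin{bmatrix} -2 & 1 \\ \tfrac{3}{2} & -\tfrac{1}{2} \end{bmatrix}.$$

For the system $Ax = c$ and nonsingular $A$, we have

$$x = A^{-1}c = \frac{\text{adj}A}{|A|} c.$$

Denote by $A_j$ the matrix $A$ with column $j$ replaced by $c$. Then it turns out that

$$x_j = \frac{|A_j|}{|A|}.$$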
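The rule $x_j = |A_j| / |A|$ (Cramer's rule) can be tried on the $2 \times 2$ example matrix. In the sketch below the right-hand side `c` is a hypothetical vector chosen for illustration, and `det` and `cramer` are helper names introduced here; the determinant uses the Laplace expansion along the first row.

```python
from fractions import Fraction

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j]
               * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def cramer(A, c):
    """Solve Ax = c for nonsingular A using x_j = |A_j| / |A|."""
    d = det(A)
    if d == 0:
        raise ValueError("A is singular: no unique solution")
    solution = []
    for j in range(len(A)):
        # A_j is A with column j replaced by c.
        A_j = [row[:j] + [c[i]] + row[j + 1:] for i, row in enumerate(A)]
        solution.append(Fraction(det(A_j), d))
    return solution

A = [[1, 2], [3, 4]]
c = [5, 6]          # hypothetical right-hand side
x = cramer(A, c)

print(x)
```

Substituting back, each row of $A$ times `x` reproduces the corresponding element of `c`, which is an easy sanity check on any solution.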

Let the system of equations be homogeneous: $Ax = 0$. If $A$ is nonsingular, then only $x = 0$ is a solution. If $A$ is singular, then there are infinitely many solutions, including $x = 0$.

For nonsingular $A$:

1. $c \neq 0 \Rightarrow \exists! x \neq 0$

2. $c = 0 \Rightarrow \exists! x = 0$

For singular $A$:

If there is inconsistency (linear dependency in $A$, but the elements of $c$ do not follow the same linear combination), then there is no solution.

One can think of the system $Ax = c$ as defining a relation between $c$ and $x$. If $A$ is nonsingular, then there is a function (mapping/transformation) between $c$ and $x$. In fact, when $A$ is nonsingular, this transformation is invertible.

5.7 Leontief input/output model

We are interested in computing the level of output required from each industry in an economy in order to satisfy final demand. This is not a trivial question, because the outputs of some industries are inputs for other industries, while also being consumed in final demand. These relationships constitute input/output linkages.

Assume

This gives rise to the Leontief (fixed proportions) production function. The second assumption can be relaxed, depending on the interpretation of the model. If you only want to use the framework for accounting purposes, then this is not critical.

Define $a_{io}$ as the unit requirement of inputs from industry $i$ used in the production of output $o$. I.e., in order to produce one unit of output $o$ you need $a_{io}$ units of $i$. For $n$ industries $A_{n \times n} = [a_{io}]$ is a technology matrix. Each column tells you how much of each input is required to produce one unit of output $o$. If some industry $i$ does not require its own output for production, then $a_{ii} = 0$.

If all industries were used as inputs as well as output, then there would be no primary inputs, i.e. labor, entrepreneurial talent, land, natural resources. To accommodate primary inputs, we add an open sector. If the $a_{io}$ are denominated in monetary values, i.e., in order to produce \$1 of output $o$ you need \$$a_{io}$ of input $i$, then we must have $\sum_{i=1}^{n} a_{io} \leq 1$. And if there is an open sector, then we must have $\sum_{i=1}^{n} a_{io} < 1$. This simply means that the cost of the intermediate inputs used to produce \$1 of output is less than \$1. By CRS and a competitive economy, we have the zero profit condition, which means that all revenue is paid out to inputs. So primary inputs receive $\left(1 - \sum_{i=1}^{n} a_{io}\right)$ from each industry $o$.

Equilibrium implies

supply = demand
= demand for intermediate inputs + final demand.

$$x_o = \sum_{i=1}^{n} a_{oi} x_i + d_o = \underbrace{a_{o1} x_1 + a_{o2} x_2 + \ldots + a_{on} x_n}_{\text{intermediate demand}} + \underbrace{d_o}_{\text{final}}.$$

Here $a_{oi} x_i$ is the amount of good $o$ that industry $i$ uses as an input. This implies

$$-a_{o1} x_1 - a_{o2} x_2 - \ldots + (1 - a_{oo}) x_o - a_{o,o+1} x_{o+1} - \ldots - a_{on} x_n = d_o.$$

In matrix notation

$$\begin{bmatrix} (1 - a_{11}) & -a_{12} & -a_{13} & \cdots & -a_{1n} \\ -a_{21} & (1 - a_{22}) & -a_{23} & \cdots & -a_{2n} \\ -a_{31} & -a_{32} & (1 - a_{33}) & \cdots & -a_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -a_{n1} & -a_{n2} & -a_{n3} & \cdots & (1 - a_{nn}) \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} d_1 \\ d_2 \\ d_3 \\ \vdots \\ d_n \end{bmatrix}.$$

Or

$$(I - A) x = d.$$

$(I - A)$ is the Leontief matrix. This implies that you need more $x$ than just final demand because some $x$ is used as intermediate inputs ("$I - A < I$").

$$x = (I - A)^{-1} d.$$

You need a nonsingular $(I - A)$. But even then the solution for $x$ might not be positive. We need to find conditions for this.

Consider

$$A = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}.$$

Define

Principal minors: the minors that arise from deleting the $i$-th row and $i$-th column. E.g.

$$|M_{11}| = \begin{vmatrix} e & f \\ h & i \end{vmatrix}, \quad |M_{22}| = \begin{vmatrix} a & c \\ g & i \end{vmatrix}, \quad |M_{33}| = \begin{vmatrix} a & b \\ d & e \end{vmatrix}.$$

$k$-th order principal minor: a principal minor that arises from a matrix of dimensions $k \times k$. If the dimensions of the original matrix are $n \times n$, then a $k$-th order principal minor is obtained after deleting the same $n - k$ rows and columns. E.g., the 1-st order principal minors of $A$ are $|a|$, $|e|$ and $|i|$. The 2-nd order principal minors are $|M_{11}|$, $|M_{22}|$ and $|M_{33}|$ given above.

Leading principal minors: these are the 1-st, 2-nd, 3-rd (etc.) order principal minors, where we keep the upper left corner of the original matrix in each one. E.g.

$$|M_1| = |a|, \quad |M_2| = \begin{vmatrix} a & b \\ d & e \end{vmatrix}, \quad |M_3| = \begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix}.$$

Simon-Hawkins Condition (Theorem): consider the system of equations $Bx = d$. If (1) all off-diagonal elements of $B_{n \times n}$ are nonpositive, i.e. $b_{ij} \leq 0,\ \forall i \neq j$; and (2) all elements of $d_{n \times 1}$ are nonnegative, i.e. $d_i \geq 0,\ \forall i$; then $\exists x \geq 0$ such that $Bx = d$ iff (3) all leading principal minors are strictly positive, i.e. $|M_i| > 0,\ \forall i$. In our case, $B = I - A$, the Leontief matrix.

Economic meaning of SHC. To illustrate, use a $2 \times 2$ example:

$$I - A = \begin{bmatrix} 1 - a_{11} & -a_{12} \\ -a_{21} & 1 - a_{22} \end{bmatrix}.$$

From (3) we have $|M_1| = |1 - a_{11}| = 1 - a_{11} > 0$, i.e. $a_{11} < 1$. This means that less than the total output of $x_1$ is used to produce $x_1$, i.e. viability. Next, we have

$$|M_2| = |I - A| = (1 - a_{11})(1 - a_{22}) - a_{12} a_{21} = 1 - a_{11} - a_{22} + a_{11} a_{22} - a_{12} a_{21} > 0.$$

It follows that

$$\underbrace{(1 - a_{11}) a_{22}}_{> 0} + \underbrace{a_{11}}_{\text{direct use}} + \underbrace{a_{12} a_{21}}_{\text{indirect use}} < 1.$$

This means that the total amount of $x_1$ demanded (for production of $x_1$ and for production of $x_2$) is less than the amount produced (= 1), i.e. the resource constraint is kept.
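A two-industry open model makes the condition concrete. Everything numerical in the sketch below is hypothetical: the technology matrix `A` and final demand `d` are made-up values, and the $2 \times 2$ system is solved with the adjoint formula rather than a general routine.

```python
from fractions import Fraction as F

# Hypothetical technology matrix and final demand.
A = [[F(1, 5), F(3, 10)],
     [F(2, 5), F(1, 10)]]
d = [F(10), F(20)]

# Leontief matrix B = I - A.
B = [[(1 if i == j else 0) - A[i][j] for j in range(2)] for i in range(2)]

# Leading principal minors of B.
m1 = B[0][0]
m2 = B[0][0] * B[1][1] - B[0][1] * B[1][0]
condition_holds = m1 > 0 and m2 > 0

# x = (I - A)^{-1} d, using the 2x2 adjoint divided by the determinant.
x = [(B[1][1] * d[0] - B[0][1] * d[1]) / m2,
     (B[0][0] * d[1] - B[1][0] * d[0]) / m2]

print(condition_holds, x)
```

Both outputs exceed the corresponding final demands, which is the sense in which more $x$ is needed than final demand alone: part of gross output is absorbed as intermediate input.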

The closed model version treats the primary sector as any industry. Suppose that there is only one primary input: labor. Then one interpretation is that the value of consumption of each good is in fixed proportions (these preferences can be represented by a Cobb-Douglas utility function).

In this model final demand, as defined above, must equal zero. Since income accrues to primary inputs (think of labor) and this income is captured in $x$, it follows that the $d$ vector must be equal to zero. We know that final demand equals income. If final demand were positive, then we would have to have an open sector to pay for that demand (from its income). I.e. we have a homogeneous system:

$$(I - A) x = 0$$

$$\begin{bmatrix} (1 - a_{00}) & -a_{01} & -a_{02} \\ -a_{10} & (1 - a_{11}) & -a_{12} \\ -a_{20} & -a_{21} & (1 - a_{22}) \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},$$

where 0 denotes the primary sector (there could be more than one).

Each column in the original $A$ matrix must sum to 1, i.e. $a_{0o} + a_{1o} + \ldots + a_{no} = 1,\ \forall o$, because all of the input is exhausted in production. But then each column of $I - A$ sums to zero, so each row of $I - A$ can be expressed as minus the sum of all other rows. It follows that $I - A$ is singular, and therefore $x$ is not unique! It follows that you can scale the economy up or down with no effect. In fact, this is a general property of CRS economies with no outside sector or endowment. One way to pin down the economy is to set some $x_i$ to some level, as an endowment.

Teaching assistant covers.

7.1 Differentiation rules

1. If $y = f(x) = c$, a constant, then $\frac{dy}{dx} = 0$

2. $\frac{d}{dx} a x^n = a n x^{n-1}$

3. $\frac{d}{dx} \ln x = \frac{1}{x}$

4. $\frac{d}{dx} [f(x) \pm g(x)] = f'(x) \pm g'(x)$

5. $\frac{d}{dx} [f(x) g(x)] = f'(x) g(x) + f(x) g'(x)$

6. $\frac{d}{dx} \left[ \frac{f(x)}{g(x)} \right] = \frac{f'(x) g(x) - f(x) g'(x)}{[g(x)]^2} = \frac{f'(x)}{g(x)} - \frac{f(x)}{g(x)} \frac{g'(x)}{g(x)}$

7. $\frac{d}{dx} f[g(x)] = \frac{df}{dg} \frac{dg}{dx}$ (chain rule)

8. Inverse functions. Let $y = f(x)$ be strictly monotone. Then an inverse function, $x = f^{-1}(y)$, exists and

$$\frac{dx}{dy} = \frac{df^{-1}(y)}{dy} = \frac{1}{dy/dx} = \frac{1}{df(x)/dx},$$

where $x$ and $y$ map one into the other, i.e. $y = f(x)$ and $x = f^{-1}(y)$.

Strictly monotone means that $x_1 > x_2 \Rightarrow f(x_1) > f(x_2)$ (strictly increasing) or $f(x_1) < f(x_2)$ (strictly decreasing). It implies that there is an inverse function $x = f^{-1}(y)$ because $\forall y \in \text{Range}\ \exists! x \in \text{Domain}$ (recall: $\forall x \in \text{Domain}\ \exists! y \in \text{Range}$ defines $f(x)$).

Let $y = f(x_1, x_2, \ldots, x_n)$. Define the partial derivative of $f$ with respect to $x_i$:

$$\frac{\partial y}{\partial x_i} = \lim_{\Delta x_i \to 0} \frac{f(x_i + \Delta x_i, x_{-i}) - f(x_i, x_{-i})}{\Delta x_i}.$$

Operationally, you compute $\partial y / \partial x_i$ just as you would compute $dy/dx_i$, while treating all other $x_{-i}$ as constants.

Example. Consider the following production function

$$y = z \left[ \alpha k^{\varphi} + (1 - \alpha) l^{\varphi} \right]^{1/\varphi}, \quad \varphi \leq 1.$$

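Define the elasticity of substitution as the percent change in relative factor intensity ($k/l$) in response to a 1 percent change in the relative factor returns ($r/w$). What is the elasticity of substitution? If factors are paid their marginal product, then

$$y_k = \frac{1}{\varphi} z [\cdot]^{\frac{1}{\varphi} - 1} \varphi \alpha k^{\varphi - 1} = r$$
$$y_l = \frac{1}{\varphi} z [\cdot]^{\frac{1}{\varphi} - 1} \varphi (1 - \alpha) l^{\varphi - 1} = w.$$

Thus

$$\frac{r}{w} = \frac{\alpha}{1 - \alpha} \left( \frac{k}{l} \right)^{\varphi - 1}$$

and then

$$\frac{k}{l} = \left( \frac{\alpha}{1 - \alpha} \right)^{\frac{1}{1 - \varphi}} \left( \frac{r}{w} \right)^{-\frac{1}{1 - \varphi}}.$$

The elasticity of substitution, in absolute value, is $\sigma = \frac{1}{1 - \varphi}$. It is constant. This production function exhibits constant elasticity of substitution, and is denoted a CES production function.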
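The constant elasticity can be confirmed numerically from the marginal products. In the sketch below the parameter values (`z`, `alpha`, `phi`) and the two capital-labor ratios are hypothetical, and the function names are introduced only for this illustration. The elasticity is computed as the change in $\ln(k/l)$ divided by the change in $\ln(r/w)$.

```python
import math

def marginal_products(k, l, z=1.0, alpha=0.3, phi=0.5):
    """Marginal products of k and l for y = z [alpha k^phi + (1-alpha) l^phi]^(1/phi)."""
    inner = alpha * k ** phi + (1 - alpha) * l ** phi
    common = z * inner ** (1 / phi - 1)
    y_k = common * alpha * k ** (phi - 1)
    y_l = common * (1 - alpha) * l ** (phi - 1)
    return y_k, y_l

def log_price_ratio(k, l, phi):
    """ln(r/w) when factors are paid their marginal products."""
    r, w = marginal_products(k, l, phi=phi)
    return math.log(r / w)

phi = 0.5
# Two hypothetical factor intensities: k/l = 1.0 and k/l = 1.1.
low = log_price_ratio(1.0, 1.0, phi)
high = log_price_ratio(1.1, 1.0, phi)

elasticity = (math.log(1.1) - math.log(1.0)) / (high - low)

print(elasticity, -1 / (1 - phi))
```

The sign is negative because a higher relative return to capital goes with a lower capital-labor ratio; the magnitude is $1/(1 - \varphi)$ regardless of which two ratios are compared.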

7.3 Gradients

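$$y = f(x_1, x_2, \ldots, x_n)$$

$$\nabla f = (f_1, f_2, \ldots, f_n),$$

where

$$f_i = \frac{\partial f}{\partial x_i}.$$

We can use this in first order approximations:

$$\Delta f|_{x^0} \approx \nabla f(x^0) \Delta x$$

$$f(x) - f(x^0) \approx (f_1, f_2, \ldots, f_n)|_{x^0} \left( \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} - \begin{bmatrix} x_1^0 \\ \vdots \\ x_n^0 \end{bmatrix} \right).$$

Application to open input/output model:

$$(I - A) x = d$$
$$x = (I - A)^{-1} d = Vd$$

$$\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} v_{11} & \cdots & v_{1n} \\ \vdots & \ddots & \vdots \\ v_{n1} & \cdots & v_{nn} \end{bmatrix} \begin{bmatrix} d_1 \\ \vdots \\ d_n \end{bmatrix}.$$

Think of $x$ as a function of $d$:

$$\nabla x_1 = \begin{bmatrix} v_{11} & v_{12} & \cdots & v_{1n} \end{bmatrix}$$

$$v_{ij} = \frac{\partial x_i}{\partial d_j}.$$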

7.4 Jacobian and functional dependence

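Let there be two functions

$$y_1 = f(x_1, x_2)$$
$$y_2 = g(x_1, x_2)$$

$$|J| = \left| \frac{\partial y}{\partial x'} \right| = \left| \frac{\partial (y_1, y_2)}{\partial (x_1, x_2)} \right| = \begin{vmatrix} \frac{\partial y_1}{\partial x_1} & \frac{\partial y_1}{\partial x_2} \\ \frac{\partial y_2}{\partial x_1} & \frac{\partial y_2}{\partial x_2} \end{vmatrix}.$$

Example: $y_1 = x_1 x_2$ and $y_2 = \ln x_1 + \ln x_2$.

$$|J| = \begin{vmatrix} x_2 & x_1 \\ \frac{1}{x_1} & \frac{1}{x_2} \end{vmatrix} = 0.$$

Another example: $y_1 = x_1 + 2x_2^2$ and $y_2 = \ln(x_1 + 2x_2^2)$.

$$|J| = \begin{vmatrix} 1 & 4x_2 \\ \frac{1}{x_1 + 2x_2^2} & \frac{4x_2}{x_1 + 2x_2^2} \end{vmatrix} = 0.$$

Another example: $x = Vd$,

$$\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} v_{11} & \cdots & v_{1n} \\ \vdots & \ddots & \vdots \\ v_{n1} & \cdots & v_{nn} \end{bmatrix} \begin{bmatrix} d_1 \\ \vdots \\ d_n \end{bmatrix} = \begin{bmatrix} \sum v_{1i} d_i \\ \vdots \\ \sum v_{ni} d_i \end{bmatrix}.$$

So $|J| = |V|$. It follows that linear dependence is equivalent to functional dependence for a system of linear equations. If $|V| = 0$ then there are $\infty$ solutions for $x$ and the relationship between $d$ and $x$ cannot be inverted.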

orem

8.1 Total derivative

Often we are interested in the total rate of change in some variable in response to a change in some other variable or some parameter. If there are indirect effects, as well as direct ones, you want to take this into account. Sometimes the indirect effects are due to general equilibrium constraints and can be very important.

Example: consider the utility function $u(x, y)$ and the budget constraint $p_x x + p_y y = I$. Then the total effect of a small change in $x$ on utility is

$$\frac{du}{dx} = \frac{\partial u}{\partial x} + \frac{\partial u}{\partial y} \frac{dy}{dx}.$$

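More generally: $F(x_1, \ldots, x_n)$

$$\frac{dF}{dx_i} = \sum_{j=1}^{n} \frac{\partial F}{\partial x_j} \frac{dx_j}{dx_i}.$$

Example: $z = f(x, y, u, v)$, where $x = x(u, v)$ and $y = y(u, v)$.

$$\frac{dz}{du} = \frac{\partial f}{\partial x} \left( \frac{\partial x}{\partial u} + \frac{\partial x}{\partial v} \frac{dv}{du} \right) + \frac{\partial f}{\partial y} \left( \frac{\partial y}{\partial u} + \frac{\partial y}{\partial v} \frac{dv}{du} \right) + \frac{\partial f}{\partial u} + \frac{\partial f}{\partial v} \frac{dv}{du}.$$

If we want to impose that $v$ is constant, then this is denoted as $\left. \frac{dz}{du} \right|_v$ and then all terms that involve $dv/du$ are zero:

$$\left. \frac{dz}{du} \right|_v = \frac{\partial f}{\partial x} \frac{\partial x}{\partial u} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial u} + \frac{\partial f}{\partial u}.$$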

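Now we are interested in the change (not rate of...) in some variable or function if all its arguments change a bit, i.e. are perturbed. For example, if the saving function for the economy is $S = S(y, r)$, then

$$dS = \frac{\partial S}{\partial y} dy + \frac{\partial S}{\partial r} dr.$$

More generally, $y = F(x_1, \ldots, x_n)$

$$dy = \sum_{j=1}^{n} \frac{\partial F}{\partial x_j} dx_j.$$

One can view the total differential as a linearization of the function around a specific point.

The same rules that apply to derivatives apply to differentials; simply add $dx$ after each partial derivative:

1. $dc = 0$ for constant $c$.

2. $d(cu^n) = cnu^{n-1} du = \frac{\partial (cu^n)}{\partial u} du$.

3. $d(u \pm v) = du \pm dv = \frac{\partial (u \pm v)}{\partial u} du + \frac{\partial (u \pm v)}{\partial v} dv$.

$d(u \pm v \pm w) = du \pm dv \pm dw = \frac{\partial (u \pm v \pm w)}{\partial u} du + \frac{\partial (u \pm v \pm w)}{\partial v} dv + \frac{\partial (u \pm v \pm w)}{\partial w} dw$.

4. $d(uv) = v\,du + u\,dv = \frac{\partial (uv)}{\partial u} du + \frac{\partial (uv)}{\partial v} dv$.

$d(uvw) = vw\,du + uw\,dv + uv\,dw = \frac{\partial (uvw)}{\partial u} du + \frac{\partial (uvw)}{\partial v} dv + \frac{\partial (uvw)}{\partial w} dw$.

5. $d(u/v) = \frac{v\,du - u\,dv}{v^2} = \frac{\partial (u/v)}{\partial u} du + \frac{\partial (u/v)}{\partial v} dv$.

Example: suppose that you want to know how much utility, $u(x, y)$, changes if $x$ and $y$ are perturbed. Then

$$du = \frac{\partial u}{\partial x} dx + \frac{\partial u}{\partial y} dy.$$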

Now, if you imposed that utility is not changing, i.e. you are interested in an isoquant, then this implies that $du = 0$ and then

$$du = \frac{\partial u}{\partial x} dx + \frac{\partial u}{\partial y} dy = 0$$

and hence

$$\frac{dy}{dx} = -\frac{\partial u / \partial x}{\partial u / \partial y}.$$

This should not be understood as a derivative, but rather as a ratio of perturbations. We will soon see conditions under which this is actually a derivative.

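Log linearization. Suppose that you want to log-linearize $z = f(x, y)$ around some point, say $(x^*, y^*, z^*)$. This means finding the percent change in $z$ in response to a percent change in $x$ and $y$. We have

$$dz = \frac{\partial z}{\partial x} dx + \frac{\partial z}{\partial y} dy.$$

$$\frac{dz}{z} = \frac{x}{z} \frac{\partial z}{\partial x} \frac{dx}{x} + \frac{y}{z} \frac{\partial z}{\partial y} \frac{dy}{y}$$

$$\hat{z} = \frac{x}{z} \frac{\partial z}{\partial x} \hat{x} + \frac{y}{z} \frac{\partial z}{\partial y} \hat{y},$$

where

$$\hat{z} = \frac{dz}{z} = d \ln z$$

is approximately the percent change.

Another example:

$$Y = C + I + G$$
$$dY = dC + dI + dG$$
$$\frac{dY}{Y} = \frac{C}{Y} \frac{dC}{C} + \frac{I}{Y} \frac{dI}{I} + \frac{G}{Y} \frac{dG}{G}$$
$$\hat{Y} = \frac{C}{Y} \hat{C} + \frac{I}{Y} \hat{I} + \frac{G}{Y} \hat{G}.$$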
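The share-weighted formula for output growth can be checked against a direct recomputation. All numbers in the sketch below are hypothetical: baseline levels of consumption, investment and government spending, and small percentage perturbations to each.

```python
import math

# Hypothetical baseline levels.
C, I, G = 70.0, 20.0, 10.0
Y = C + I + G

# Hypothetical percent changes (as fractions): +1%, +2%, -1%.
c_hat, i_hat, g_hat = 0.01, 0.02, -0.01

# Log-linearized change: shares times percent changes.
y_hat = (C / Y) * c_hat + (I / Y) * i_hat + (G / Y) * g_hat

# Direct recomputation of output after the perturbation.
Y_new = C * (1 + c_hat) + I * (1 + i_hat) + G * (1 + g_hat)
exact_pct = (Y_new - Y) / Y
log_change = math.log(Y_new) - math.log(Y)

print(y_hat, exact_pct, log_change)
```

Because $Y$ is linear in its components, the share-weighted sum matches the exact percent change; the log difference is close but not identical, which is the sense in which $\hat{Y}$ only approximates the percent change for non-infinitesimal perturbations.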

This is a useful tool to study the behavior of an equilibrium in response to a change in an exogenous variable.

Consider

$$F(x, y) = 0 .$$

We are interested in characterizing the implicit function between $x$ and $y$, if it exists. We already saw one implicit function when we computed the utility isoquant. In that case, we had

$$u(x, y) = \bar{u}$$

for some constant level $\bar{u}$. This can be rewritten in the form above as

$$u(x, y) - \bar{u} = 0 .$$

From this we derived a $dy/dx$ slope. But this can be more general and constitute a function.

Another example: what is the slope of a tangent line at any point on a circle?

$$x^2 + y^2 = r^2$$

$$x^2 + y^2 - r^2 = 0$$

$$F(x, y) = 0$$

$$F_x dx + F_y dy = 2x\,dx + 2y\,dy = 0$$

$$\frac{dy}{dx} = -\frac{x}{y} , \quad y \neq 0 .$$

For example, the slope at $\left( r/\sqrt{2}, \; r/\sqrt{2} \right)$ is $-1$.
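The circle slope can be verified numerically. In this sketch the radius and the helper name `implicit_slope` are hypothetical; the explicit upper half-circle is used only as an independent cross-check.

```python
import math

def implicit_slope(Fx, Fy, x, y):
    """Slope dy/dx = -F_x / F_y of the curve F(x, y) = 0; requires F_y != 0."""
    fy = Fy(x, y)
    if fy == 0:
        raise ValueError("F_y = 0: the slope formula does not apply here")
    return -Fx(x, y) / fy

r = 2.0  # hypothetical radius
Fx = lambda x, y: 2 * x   # partial of x^2 + y^2 - r^2 with respect to x
Fy = lambda x, y: 2 * y   # partial with respect to y

x0 = y0 = r / math.sqrt(2)
slope = implicit_slope(Fx, Fy, x0, y0)

# Cross-check against the explicit upper half-circle y = sqrt(r^2 - x^2).
h = 1e-6
upper = lambda x: math.sqrt(r * r - x * x)
numeric = (upper(x0 + h) - upper(x0 - h)) / (2 * h)
```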

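The implicit function theorem: Let the function $F(x, y) \in C^1$ on some open set, with $F(x, y) = 0$ and $F_y \neq 0$. Then there exists an (implicit) function $y = f(x) \in C^1$ that satisfies $F(x, f(x)) = 0$, such that

$$\frac{dy}{dx} = -\frac{F_x}{F_y}$$

on this open set. More generally, if $F(y, x_1, x_2, \ldots, x_n) \in C^1$ on some open set, $F(y, x_1, x_2, \ldots, x_n) = 0$ and $F_y \neq 0$, then there exists an (implicit) function $y = f(x_1, x_2, \ldots, x_n) \in C^1$ that satisfies $F(f(x), x) = 0$, such that

$$dy = \sum_{i=1}^{n} f_i \, dx_i .$$

If we allow only one specific $x_i$ to be perturbed, then $f_i = \frac{\partial y}{\partial x_i} = -F_{x_i} / F_y$. From $F(y, x_1, x_2, \ldots, x_n) = 0$ and $y = f(x_1, x_2, \ldots, x_n)$ we have

$$\frac{\partial F}{\partial y} dy + \frac{\partial F}{\partial x_1} dx_1 + \ldots + \frac{\partial F}{\partial x_n} dx_n = 0$$

$$dy = f_1 dx_1 + \ldots + f_n dx_n$$

so that

$$\frac{\partial F}{\partial y} \left( f_1 dx_1 + \ldots + f_n dx_n \right) + F_{x_1} dx_1 + \ldots + F_{x_n} dx_n = \left( F_{x_1} + F_y f_1 \right) dx_1 + \ldots + \left( F_{x_n} + F_y f_n \right) dx_n = 0 .$$

Since this must hold for arbitrary perturbations $dx_i$, each coefficient must equal zero, which gives $f_i = -F_{x_i} / F_y$.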

8.3.1 A more general version of the implicit function theorem

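Let $F(x, y) = 0$ be a system of $n$ equations, where $x$ is $m \times 1$ (exogenous) and $y$ is $n \times 1$ (endogenous). If

1. $F \in C^1$ and

2. $|J| = \left| \frac{\partial F}{\partial y'} \right| \neq 0$ at some point $(x_0, y_0)$ (no functional dependence),

then $\exists\, y = f(x)$, a set of $n$ functions in a neighborhood of $(x_0, y_0)$ such that $f \in C^1$ and $F(x, f(x)) = 0$ in that neighborhood of $(x_0, y_0)$.

We further develop this. From $F(x, y) = 0$ we have

$$\left[ \frac{\partial F}{\partial y'} \right]_{n \times n} dy_{n \times 1} + \left[ \frac{\partial F}{\partial x'} \right]_{n \times m} dx_{m \times 1} = 0 \;\Rightarrow\; \left[ \frac{\partial F}{\partial y'} \right] dy = -\left[ \frac{\partial F}{\partial x'} \right] dx .$$

Since $y = f(x)$ we also have

$$dy_{n \times 1} = \left[ \frac{\partial y}{\partial x'} \right]_{n \times m} dx_{m \times 1} .$$

Combining we get

$$\left[ \frac{\partial F}{\partial y'} \right]_{n \times n} \left[ \frac{\partial y}{\partial x'} \right]_{n \times m} dx_{m \times 1} = -\left[ \frac{\partial F}{\partial x'} \right]_{n \times m} dx_{m \times 1} .$$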

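Now suppose that only $x_1$ is perturbed, so that $dx' = [\, dx_1 \;\; 0 \;\; \cdots \;\; 0 \,]$. Then we get only the first column in the set of equations above:

$$\left( \frac{\partial F^1}{\partial y_1} \frac{\partial y_1}{\partial x_1} + \frac{\partial F^1}{\partial y_2} \frac{\partial y_2}{\partial x_1} + \ldots + \frac{\partial F^1}{\partial y_n} \frac{\partial y_n}{\partial x_1} \right) dx_1 = -\frac{\partial F^1}{\partial x_1} dx_1$$

$$\vdots$$

$$\left( \frac{\partial F^n}{\partial y_1} \frac{\partial y_1}{\partial x_1} + \frac{\partial F^n}{\partial y_2} \frac{\partial y_2}{\partial x_1} + \ldots + \frac{\partial F^n}{\partial y_n} \frac{\partial y_n}{\partial x_1} \right) dx_1 = -\frac{\partial F^n}{\partial x_1} dx_1$$

Dividing by $dx_1$:

$$\frac{\partial F^1}{\partial y_1} \frac{\partial y_1}{\partial x_1} + \frac{\partial F^1}{\partial y_2} \frac{\partial y_2}{\partial x_1} + \ldots + \frac{\partial F^1}{\partial y_n} \frac{\partial y_n}{\partial x_1} = -\frac{\partial F^1}{\partial x_1}$$

$$\vdots$$

$$\frac{\partial F^n}{\partial y_1} \frac{\partial y_1}{\partial x_1} + \frac{\partial F^n}{\partial y_2} \frac{\partial y_2}{\partial x_1} + \ldots + \frac{\partial F^n}{\partial y_n} \frac{\partial y_n}{\partial x_1} = -\frac{\partial F^n}{\partial x_1}$$

and thus

$$\left[ \frac{\partial F}{\partial y'} \right]_{n \times n} \left[ \frac{\partial y}{\partial x_1} \right]_{n \times 1} = -\left[ \frac{\partial F}{\partial x_1} \right]_{n \times 1} .$$

Since we required $|J| = \left| \frac{\partial F}{\partial y'} \right| \neq 0$ it follows that the $\left[ \frac{\partial F}{\partial y'} \right]_{n \times n}$ matrix is nonsingular, and thus $\exists!\, \left[ \frac{\partial y}{\partial x_1} \right]_{n \times 1}$, a solution to the system. This can be obtained by Cramer's rule:

$$\frac{\partial y_j}{\partial x_1} = \frac{|J_j|}{|J|} ,$$

where $|J_j|$ is obtained by replacing the $j$-th column in $|J|$ by $-\left[ \frac{\partial F}{\partial x_1} \right]$.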

Why is this useful? We are often interested in how a model behaves around some point, usually an

equilibrium or perhaps a steady state. But models are typically nonlinear and the behavior is hard to

characterize without implicit functions.

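$$\begin{bmatrix} \frac{\partial F^1}{\partial y_1} & \frac{\partial F^1}{\partial y_2} & \cdots & \frac{\partial F^1}{\partial y_n} \\ \frac{\partial F^2}{\partial y_1} & \frac{\partial F^2}{\partial y_2} & \cdots & \frac{\partial F^2}{\partial y_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial F^n}{\partial y_1} & \frac{\partial F^n}{\partial y_2} & \cdots & \frac{\partial F^n}{\partial y_n} \end{bmatrix} \begin{bmatrix} dy_1 \\ dy_2 \\ \vdots \\ dy_n \end{bmatrix} + \begin{bmatrix} \frac{\partial F^1}{\partial x_1} & \frac{\partial F^1}{\partial x_2} & \cdots & \frac{\partial F^1}{\partial x_m} \\ \frac{\partial F^2}{\partial x_1} & \frac{\partial F^2}{\partial x_2} & \cdots & \frac{\partial F^2}{\partial x_m} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial F^n}{\partial x_1} & \frac{\partial F^n}{\partial x_2} & \cdots & \frac{\partial F^n}{\partial x_m} \end{bmatrix} \begin{bmatrix} dx_1 \\ dx_2 \\ \vdots \\ dx_m \end{bmatrix} = 0$$

$$\begin{bmatrix} \frac{\partial F^1}{\partial y_1} & \cdots & \frac{\partial F^1}{\partial y_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial F^n}{\partial y_1} & \cdots & \frac{\partial F^n}{\partial y_n} \end{bmatrix} \begin{bmatrix} dy_1 \\ \vdots \\ dy_n \end{bmatrix} = -\begin{bmatrix} \frac{\partial F^1}{\partial x_1} & \cdots & \frac{\partial F^1}{\partial x_m} \\ \vdots & \ddots & \vdots \\ \frac{\partial F^n}{\partial x_1} & \cdots & \frac{\partial F^n}{\partial x_m} \end{bmatrix} \begin{bmatrix} dx_1 \\ \vdots \\ dx_m \end{bmatrix}$$

$$\begin{bmatrix} dy_1 \\ dy_2 \\ \vdots \\ dy_n \end{bmatrix} = \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \frac{\partial y_1}{\partial x_2} & \cdots & \frac{\partial y_1}{\partial x_m} \\ \frac{\partial y_2}{\partial x_1} & \frac{\partial y_2}{\partial x_2} & \cdots & \frac{\partial y_2}{\partial x_m} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial y_n}{\partial x_1} & \frac{\partial y_n}{\partial x_2} & \cdots & \frac{\partial y_n}{\partial x_m} \end{bmatrix} \begin{bmatrix} dx_1 \\ dx_2 \\ \vdots \\ dx_m \end{bmatrix}$$

$$\begin{bmatrix} \frac{\partial F^1}{\partial y_1} & \cdots & \frac{\partial F^1}{\partial y_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial F^n}{\partial y_1} & \cdots & \frac{\partial F^n}{\partial y_n} \end{bmatrix} \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_1}{\partial x_m} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_n}{\partial x_1} & \cdots & \frac{\partial y_n}{\partial x_m} \end{bmatrix} \begin{bmatrix} dx_1 \\ \vdots \\ dx_m \end{bmatrix} = -\begin{bmatrix} \frac{\partial F^1}{\partial x_1} & \cdots & \frac{\partial F^1}{\partial x_m} \\ \vdots & \ddots & \vdots \\ \frac{\partial F^n}{\partial x_1} & \cdots & \frac{\partial F^n}{\partial x_m} \end{bmatrix} \begin{bmatrix} dx_1 \\ \vdots \\ dx_m \end{bmatrix}$$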

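$$\text{demand}: \; q^d = d(p, y) , \quad d_p < 0 , \; d_y > 0$$

$$\text{supply}: \; q^s = s(p) , \quad s_p > 0$$

$$\text{equilibrium}: \; q^d = q^s .$$

Substituting into the equilibrium condition:

$$s(p) - d(p, y) = 0 ,$$

$$F(p, y) = 0 .$$

We are interested in how the endogenous price responds to income. By the implicit function theorem $\exists\, p = p(y)$ such that

$$\frac{dp}{dy} = -\frac{F_y}{F_p} = -\frac{-d_y}{s_p - d_p} = \frac{d_y}{s_p - d_p} > 0$$

because $d_p < 0$. An increase in income unambiguously increases the price.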

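To find how quantity changes we apply the total derivative approach to the demand function:

$$\frac{dq}{dy} = \underbrace{\frac{\partial d}{\partial p} \frac{dp}{dy}}_{\text{"substitution" effect} < 0} + \underbrace{\frac{\partial d}{\partial y}}_{\text{income effect} > 0}$$

so the sign here is ambiguous. But we can show that it is positive by using the supply side:

$$\frac{dq}{dy} = \frac{\partial s}{\partial p} \frac{dp}{dy} > 0 .$$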

Draw demand-supply system.

This example is simple, but the technique is very powerful, especially in nonlinear general equilibrium

models.

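$$F^1(p, q, y) = d(p, y) - q = 0$$

$$F^2(p, q, y) = s(p) - q = 0 .$$

Apply the general theorem. Check for functional dependence in the endogenous variables:

$$|J| = \left| \frac{\partial F}{\partial (p, q)} \right| = \begin{vmatrix} d_p & -1 \\ s_p & -1 \end{vmatrix} = -d_p + s_p > 0 .$$

So there is no functional dependence. Thus $\exists\, p = p(y)$ and $\exists\, q = q(y)$. We now wish to compute the derivatives with respect to the exogenous argument $y$. Since $dF = 0$ we have

$$\frac{\partial F^1}{\partial p} dp + \frac{\partial F^1}{\partial q} dq + \frac{\partial F^1}{\partial y} dy = 0$$

$$\frac{\partial F^2}{\partial p} dp + \frac{\partial F^2}{\partial q} dq + \frac{\partial F^2}{\partial y} dy = 0$$

Thus

$$\begin{bmatrix} \frac{\partial F^1}{\partial p} & \frac{\partial F^1}{\partial q} \\ \frac{\partial F^2}{\partial p} & \frac{\partial F^2}{\partial q} \end{bmatrix} \begin{bmatrix} dp \\ dq \end{bmatrix} = -\begin{bmatrix} \frac{\partial F^1}{\partial y} dy \\ \frac{\partial F^2}{\partial y} dy \end{bmatrix}$$

Use

$$dp = \frac{\partial p}{\partial y} dy$$

$$dq = \frac{\partial q}{\partial y} dy$$

to get

$$\begin{bmatrix} \frac{\partial F^1}{\partial p} & \frac{\partial F^1}{\partial q} \\ \frac{\partial F^2}{\partial p} & \frac{\partial F^2}{\partial q} \end{bmatrix} \begin{bmatrix} \frac{\partial p}{\partial y} dy \\ \frac{\partial q}{\partial y} dy \end{bmatrix} = -\begin{bmatrix} \frac{\partial F^1}{\partial y} dy \\ \frac{\partial F^2}{\partial y} dy \end{bmatrix}$$

$$\begin{bmatrix} \frac{\partial F^1}{\partial p} & \frac{\partial F^1}{\partial q} \\ \frac{\partial F^2}{\partial p} & \frac{\partial F^2}{\partial q} \end{bmatrix} \begin{bmatrix} \frac{\partial p}{\partial y} \\ \frac{\partial q}{\partial y} \end{bmatrix} = -\begin{bmatrix} \frac{\partial F^1}{\partial y} \\ \frac{\partial F^2}{\partial y} \end{bmatrix}$$

$$\begin{bmatrix} \frac{\partial d}{\partial p} & -1 \\ \frac{\partial s}{\partial p} & -1 \end{bmatrix} \begin{bmatrix} \frac{\partial p}{\partial y} \\ \frac{\partial q}{\partial y} \end{bmatrix} = \begin{bmatrix} -\frac{\partial d}{\partial y} \\ 0 \end{bmatrix} .$$

We seek a solution for $\frac{\partial p}{\partial y}$ and $\frac{\partial q}{\partial y}$. This is a system of equations, which we solve using Cramer's rule:

$$\frac{\partial p}{\partial y} = \frac{|J_1|}{|J|} = \frac{\begin{vmatrix} -\frac{\partial d}{\partial y} & -1 \\ 0 & -1 \end{vmatrix}}{|J|} = \frac{\frac{\partial d}{\partial y}}{|J|} > 0$$

and

$$\frac{\partial q}{\partial y} = \frac{|J_2|}{|J|} = \frac{\begin{vmatrix} \frac{\partial d}{\partial p} & -\frac{\partial d}{\partial y} \\ \frac{\partial s}{\partial p} & 0 \end{vmatrix}}{|J|} = \frac{\frac{\partial d}{\partial y} \frac{\partial s}{\partial p}}{|J|} > 0 .$$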
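The Cramer's rule step can be sketched in a few lines. The slope values below are hypothetical (demand falling in price and rising in income, supply rising in price), and `comparative_statics` is an illustrative helper, not part of the notes.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def comparative_statics(d_p, d_y, s_p):
    """Solve J [dp/dy, dq/dy]' = [-d_y, 0]' by Cramer's rule,
    where F1 = d(p, y) - q and F2 = s(p) - q."""
    J = [[d_p, -1.0], [s_p, -1.0]]
    rhs = [-d_y, 0.0]
    detJ = det2(J)
    if detJ == 0:
        raise ValueError("singular Jacobian: functional dependence")
    # Replace the first (second) column of J by the right-hand side.
    J1 = [[rhs[0], J[0][1]], [rhs[1], J[1][1]]]
    J2 = [[J[0][0], rhs[0]], [J[1][0], rhs[1]]]
    return det2(J1) / detJ, det2(J2) / detJ

# Hypothetical slopes, for illustration only.
dp_dy, dq_dy = comparative_statics(d_p=-2.0, d_y=0.5, s_p=3.0)
```

With these numbers $|J| = s_p - d_p = 5$, so the price response is $d_y / |J|$ and the quantity response is $d_y s_p / |J|$, both positive as derived above.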

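$$s(p) - d(p, y) = 0 .$$

$$\frac{\partial s}{\partial p} \frac{dp}{dy} - \frac{\partial d}{\partial p} \frac{dp}{dy} - \frac{\partial d}{\partial y} = 0$$

Thus

$$\frac{dp}{dy} \left[ \frac{\partial s}{\partial p} - \frac{\partial d}{\partial p} \right] = \frac{\partial d}{\partial y}$$

and so

$$\frac{dp}{dy} = \frac{\frac{\partial d}{\partial y}}{\frac{\partial s}{\partial p} - \frac{\partial d}{\partial p}} > 0 .$$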

A function may have many local minima and maxima. A function can have only one global minimum and one global maximum, if they exist.

9.1 Local maximum, minimum

First order necessary conditions (FONC): Let $f \in C^1$ on some open convex set (will be defined properly later) around $x_0$. If $f'(x_0) = 0$, then $x_0$ is a critical point, i.e. it could be either a maximum or minimum.

Second order sufficient conditions (SOC): Let $f \in C^2$ on some open convex set around $x_0$. If $f'(x_0) = 0$ (FONC satisfied) then:

1. If $f''(x_0) < 0$, then $x_0$ is a local maximum.

2. If $f''(x_0) > 0$, then $x_0$ is a local minimum.

Extrema at the boundaries: if the domain of $f(x)$ is bounded, then the boundaries may be extrema without satisfying any of the conditions above.

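Example:

$$y = x^3 - 12x^2 + 36x + 8$$

FONC:

$$f'(x) = 3x^2 - 24x + 36 = 0$$

$$x^2 - 8x + 12 = 0$$

$$x^2 - 2x - 6x + 12 = 0$$

$$x(x - 2) - 6(x - 2) = 0$$

$$(x - 6)(x - 2) = 0$$

so the critical points are $x = 2$ and $x = 6$. SOC:

$$f''(x) = 6x - 24$$

$$f''(2) = -12 \Rightarrow \text{maximum}$$

$$f''(6) = 12 \Rightarrow \text{minimum}$$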
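As a quick check of this example, the sketch below applies the FONC and the second order sufficient condition to the same cubic. The helper name `classify` is an illustrative choice.

```python
def f(x):
    return x**3 - 12 * x**2 + 36 * x + 8

def f1(x):
    # First derivative of f.
    return 3 * x**2 - 24 * x + 36

def f2(x):
    # Second derivative of f.
    return 6 * x - 24

def classify(x0):
    """Apply the FONC and the second order sufficient condition at x0."""
    if f1(x0) != 0:
        return "not critical"
    if f2(x0) < 0:
        return "local maximum"
    if f2(x0) > 0:
        return "local minimum"
    return "inconclusive"

results = {x0: classify(x0) for x0 in (2, 6)}
values = {x0: f(x0) for x0 in (2, 6)}
```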


9.2 The N -th derivative test

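If $f'(x_0) = 0$ and the first non zero derivative at $x_0$ is of order $n$, $f^{(n)}(x_0) \neq 0$, then

1. If $n$ is even and $f^{(n)}(x_0) < 0$, then $x_0$ is a local maximum.

2. If $n$ is even and $f^{(n)}(x_0) > 0$, then $x_0$ is a local minimum.

3. If $n$ is odd, then $x_0$ is not an extremum (it is an inflection point).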

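Example:

$$f(x) = (7 - x)^4 .$$

$$f'(x) = -4(7 - x)^3 , \quad f'(7) = 0$$

$$f''(x) = 12(7 - x)^2 , \quad f''(7) = 0$$

$$f'''(x) = -24(7 - x) , \quad f'''(7) = 0$$

$$f^{(4)}(x) = 24 > 0$$

so $x = 7$ is a minimum: $f^{(4)}$ is the first non zero derivative, 4 is even, and $f^{(4)} > 0$.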
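The test is mechanical enough to write down as a small routine. This is a minimal sketch that assumes the caller supplies the derivatives already evaluated at $x_0$; the function name and return strings are illustrative.

```python
def nth_derivative_test(derivs_at_x0):
    """derivs_at_x0[k] holds f^(k+1)(x0). Returns the verdict of the test."""
    for order, value in enumerate(derivs_at_x0, start=1):
        if value != 0:
            if order == 1:
                return "not a critical point"
            if order % 2 == 1:
                return "inflection point"
            return "minimum" if value > 0 else "maximum"
    return "inconclusive"

# f(x) = (7 - x)^4 at x0 = 7: f' = f'' = f''' = 0 and f'''' = 24.
verdict = nth_derivative_test([0, 0, 0, 24])
# f(x) = x^3 at x0 = 0: the first non zero derivative is the third one.
cubic = nth_derivative_test([0, 0, 6])
```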

Understanding the $N$-th derivative test is based on the Maclaurin expansion and the Taylor expansion.

Terms of art:

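$$f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \ldots + a_n x^n$$

$$f^{(1)}(x) = a_1 + 2a_2 x + 3a_3 x^2 + \ldots + n a_n x^{n-1}$$

$$\vdots$$

$$f^{(n)}(x) = 1 \cdot 2 \cdots (n - 1) \cdot n \, a_n .$$

Evaluate at $x = 0$:

$$f(0) = a_0$$

$$f^{(1)}(0) = a_1$$

$$f^{(2)}(0) = 2a_2$$

$$\vdots$$

$$f^{(n)}(0) = 1 \cdot 2 \cdots (n - 1) \cdot n \, a_n = n!\, a_n .$$

Therefore the Maclaurin expansion is

$$f(x)|_{x=0} = \frac{f(0)}{0!} + \frac{f^{(1)}(0)}{1!} x + \frac{f^{(2)}(0)}{2!} x^2 + \frac{f^{(3)}(0)}{3!} x^3 + \ldots + \frac{f^{(n)}(0)}{n!} x^n .$$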

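Example: quadratic equation.

$$f(x) = a_0 + a_1 x + a_2 x^2 .$$

Define $x = x_0 + \delta$, where we fix $x_0$ as an anchor and allow $\delta$ to vary. This is essentially relocating the origin to $(x_0, f(x_0))$.

$$g(\delta) \equiv a_0 + a_1 (x_0 + \delta) + a_2 (x_0 + \delta)^2 = f(x) .$$

Note that

$$g(\delta) = f(x) .$$

Taking derivatives

$$g'(\delta) = a_1 + 2a_2 (x_0 + \delta)$$

$$g''(\delta) = 2a_2 .$$

The Maclaurin expansion of $g$ is

$$g(\delta)|_{\delta=0} = \frac{g(0)}{0!} + \frac{g^{(1)}(0)}{1!} \delta + \frac{g^{(2)}(0)}{2!} \delta^2 .$$

Using $\delta = x - x_0$ and the fact that $x = x_0$ when $\delta = 0$, we get a Maclaurin expansion for $f(x)$ around $x = x_0$:

$$f(x)|_{x=x_0} = \frac{f(x_0)}{0!} + \frac{f^{(1)}(x_0)}{1!} (x - x_0) + \frac{f^{(2)}(x_0)}{2!} (x - x_0)^2 .$$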

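More generally, we have the Taylor expansion for an arbitrary $C^n$ function:

$$f(x)|_{x=x_0} = \frac{f(x_0)}{0!} + \frac{f^{(1)}(x_0)}{1!} (x - x_0) + \frac{f^{(2)}(x_0)}{2!} (x - x_0)^2 + \ldots + \frac{f^{(n)}(x_0)}{n!} (x - x_0)^n + R_n$$

$$= P_n + R_n ,$$

where $P_n$ is the polynomial approximation and $R_n$ is the remainder. For well-behaved functions, as we choose higher $n$, $R_n$ becomes smaller and in the limit vanishes.

The Lagrange form of $R_n$: for some point $p \in [x_0, x]$ (if $x > x_0$) or $p \in [x, x_0]$ (if $x < x_0$) we have

$$R_n = \frac{1}{(n + 1)!} f^{(n+1)}(p) (x - x_0)^{n+1} .$$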

Example: for $n = 0$ we have

$$f(x)|_{x=x_0} = \frac{f(x_0)}{0!} + R_n = f(x_0) + R_n = f(x_0) + f'(p)(x - x_0) .$$

Rearranging this we get

$$f(x) - f(x_0) = f'(p)(x - x_0)$$

for some point $p \in [x_0, x]$ (if $x > x_0$) or $p \in [x, x_0]$ (if $x < x_0$). This is the Mean Value Theorem.

Define: $x_0$ is a maximum (minimum) of $f(x)$ if the change in the function, $\Delta f \equiv f(x) - f(x_0)$, is negative (positive) in a neighborhood of $x_0$, both on the right and on the left of $x_0$.

The Taylor expansion helps determine this.

$$\Delta f = f^{(1)}(x_0)(x - x_0) + \frac{f^{(2)}(x_0)}{2} (x - x_0)^2 + \ldots + \frac{f^{(n)}(x_0)}{n!} (x - x_0)^n + \underbrace{\frac{1}{(n + 1)!} f^{(n+1)}(p) (x - x_0)^{n+1}}_{\text{remainder}} .$$

1. Consider the case that $f'(x_0) \neq 0$, i.e. the first non zero derivative at $x_0$ is of order 1. Choose $n = 0$, so that the remainder will be of the same order as the first non zero derivative and evaluate

$$\Delta f = f'(p)(x - x_0) .$$

Using the fact that $p$ is very close to $x_0$, so close that $f'(p) \neq 0$, we have that $\Delta f$ changes signs around $x_0$.

2. Consider the case of $f'(x_0) = 0$ and $f''(x_0) \neq 0$. Choose $n = 1$, so that the remainder will be of the same order as the first non zero derivative (2) and evaluate

$$\Delta f = f'(x_0)(x - x_0) + \frac{f''(p)}{2} (x - x_0)^2 = \frac{1}{2} f''(p) (x - x_0)^2 .$$

Since $(x - x_0)^2 > 0$ always and $f''(p) \neq 0$ we get that $\Delta f$ is either positive (minimum) or negative (maximum) around $x_0$.

3. Consider the case of $f'(x_0) = 0$, $f''(x_0) = 0$ and $f'''(x_0) \neq 0$. Choose $n = 2$, so that the remainder will be of the same order as the first non zero derivative (3) and evaluate

$$\Delta f = f'(x_0)(x - x_0) + \frac{f''(x_0)}{2} (x - x_0)^2 + \frac{f'''(p)}{6} (x - x_0)^3 = \frac{1}{6} f'''(p) (x - x_0)^3 .$$

Since $(x - x_0)^3$ changes signs around $x_0$ and $f'''(p) \neq 0$ we get that $\Delta f$ is changing signs and therefore $x_0$ is not an extremum.

4. In the general case $f'(x_0) = 0$, $f''(x_0) = 0$, ..., $f^{(n-1)}(x_0) = 0$ and $f^{(n)}(x_0) \neq 0$. Choose $n - 1$, so that the remainder will be of the same order as the first non zero derivative ($n$) and evaluate

$$\Delta f = f^{(1)}(x_0)(x - x_0) + \frac{f^{(2)}(x_0)}{2} (x - x_0)^2 + \ldots + \frac{f^{(n-1)}(x_0)}{(n - 1)!} (x - x_0)^{n-1} + \frac{1}{n!} f^{(n)}(p) (x - x_0)^n$$

$$= \frac{1}{n!} f^{(n)}(p) (x - x_0)^n .$$

If $n$ is odd, then $(x - x_0)^n$ changes signs around $x_0$, so $\Delta f$ changes signs and therefore $x_0$ is not an extremum.

If $n$ is even, then $(x - x_0)^n > 0$ always and $\Delta f$ is either positive (minimum) or negative (maximum).

$$f(x) = \begin{cases} e^{-\frac{1}{2} x^{-2}} & x \neq 0 \\ 0 & x = 0 \end{cases}$$

is not C 1 at 0.

These are used a lot in economics due to their useful properties, some of which have economic interpretations,

in particular in dynamic problems that involve time.

$$y = f(t) = b^t , \quad b > 1 .$$

f (t) 2 C 1 .

$f(t) > 0 \;\; \forall t \in \mathbb{R}$.

(y) = logb y, where y 2 R++ .

Any $y > 0$ can be expressed as an exponent of many bases. Make sure you know how to convert bases:

$$\log_b y = \frac{\log_a y}{\log_a b} .$$

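$$y = Ae^{rt}$$

$$\frac{d}{dt} e^t = e^t$$

$$\frac{d}{dt} Ae^{rt} = rAe^{rt} .$$

$$\lim_{m \to \infty} \left( 1 + \frac{1}{m} \right)^m = \lim_{n \to 0} (1 + n)^{1/n} = e = 2.71828\ldots$$

Use a Taylor expansion of $e^x$ around zero:

$$e^x = e^0 + \frac{1}{1!} (e^x)'\big|_{x=0} (x - 0) + \frac{1}{2!} (e^x)''\big|_{x=0} (x - 0)^2 + \frac{1}{3!} (e^x)'''\big|_{x=0} (x - 0)^3 + \ldots$$

$$= 1 + x + \frac{1}{2!} x^2 + \frac{1}{3!} x^3 + \ldots$$

Evaluate this at $x = 1$:

$$e^1 = e = 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \ldots = 2.71828\ldots$$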
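Both characterizations of $e$ can be checked numerically. The sketch below sums the first few terms of the series and evaluates the limit expression at a large but finite $m$; the number of terms and the value of $m$ are arbitrary choices.

```python
import math

def exp_series(x, terms):
    """Partial sum of the Taylor series of e^x around zero."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term          # term equals x^k / k!
        term *= x / (k + 1)    # move on to x^(k+1) / (k+1)!
    return total

e_series = exp_series(1.0, 15)
# The limit definition (1 + 1/m)^m evaluated at a large but finite m.
e_limit = (1 + 1 / 1_000_000) ** 1_000_000
```

The series converges much faster than the limit expression: fifteen terms already agree with `math.e` to many digits, while a million compounding periods still leave a small gap.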

10.3 Examples

10.3.1 Interest compounding

Suppose that you are offered an interest rate $r$ on your savings after a year. Then the return after one year is $1 + r$. If you invested $A$, then at the end of the year you have

$$A(1 + r) .$$

Now suppose that an interest of $\frac{r}{m}$ is offered for $1/m$ of a year. In that case you get to compound $\frac{r}{m}$ $m$ times throughout the year. In that case an investment of $A$ will be worth at the end of the year

$$A \left( 1 + \frac{r}{m} \right)^m = A \left[ \left( 1 + \frac{r}{m} \right)^{m/r} \right]^r .$$

Now suppose that you get an instant rate of interest $\frac{r}{m}$ where $m \to \infty$ ($n \to 0$), compounded $m \to \infty$ times throughout the year. In that case an investment of $A$ will be worth at the end of the year

$$\lim_{m \to \infty} A \left( 1 + \frac{r}{m} \right)^m = \lim_{m \to \infty} A \left[ \left( 1 + \frac{r}{m} \right)^{m/r} \right]^r = A \left[ \lim_{m \to \infty} \left( 1 + \frac{r}{m} \right)^{m/r} \right]^r = A \left[ \lim_{u = r/m \to 0} (1 + u)^{1/u} \right]^r = Ae^r .$$

Suppose that we are interested in an arbitrary period of time, $t$, where, say, $t = 1$ is a year (but this is arbitrary). Then the same kind of math will lead us to find the value of an investment $A$ after $t$ time to be

$$A \left( 1 + \frac{r}{m} \right)^{mt} = A \left[ \left( 1 + \frac{r}{m} \right)^{m/r} \right]^{rt}$$

if $m$ is finite, and

$$Ae^{rt}$$

under continuous compounding ($m \to \infty$).
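The convergence of discrete compounding to $Ae^{rt}$ is easy to see numerically. The principal and the rate below are hypothetical, and the function names are illustrative.

```python
import math

def compound(A, r, m, t=1.0):
    """Value of A after t years at annual rate r, compounded m times per year."""
    return A * (1 + r / m) ** (m * t)

def compound_continuous(A, r, t=1.0):
    """Limit of compound() as m grows without bound: A * e^(r t)."""
    return A * math.exp(r * t)

A, r = 100.0, 0.05   # hypothetical principal and interest rate
yearly = compound(A, r, 1)
monthly = compound(A, r, 12)
daily = compound(A, r, 365)
limit = compound_continuous(A, r)
```

More frequent compounding raises the end-of-year value, but the gain is bounded above by the continuous-compounding value.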

The interest rate example tells you how much the investment is worth when it grows at a constant, instantaneous rate:

$$\text{growth rate} = \frac{dV/dt}{V} = \frac{rAe^{rt}}{Ae^{rt}} = r \text{ per instant } (dt) .$$

Any discrete growth rate can be described by a continuous growth rate:

$$A(1 + i)^t = Ae^{rt} ,$$

where

$$(1 + i) = e^r .$$

10.3.3 Discounting

$$NPV = \frac{X}{(1 + i)^t} ,$$

where $1/(1 + i)^t$ is the discount factor. This can also be represented by continuous discounting

$$NPV = \frac{X}{(1 + i)^t} = Xe^{-rt} ,$$

where the same discount factor is $1/(1 + i)^t = (1 + i)^{-t} = e^{-rt}$.

10.4 Logarithms

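Log is the inverse function of the exponent.

$$y = b^t \;\Leftrightarrow\; t = \log_b y .$$

E.g.,

$$16 = 2^4 \;\Leftrightarrow\; 4 = \log_2 16 .$$

Also,

$$y = b^{\log_b y} .$$

Convention:

$$\log_e x = \ln x .$$

Rules:

$$\ln(uv) = \ln u + \ln v$$

$$\ln(u/v) = \ln u - \ln v$$

$$\ln\left( au^b \right) = \ln a + b \ln u$$

$$\log_b x = \frac{\log_a x}{\log_a b} , \text{ where } a, b, x > 0$$

– Corollary: $\log_b e = \frac{\ln e}{\ln b} = \frac{1}{\ln b}$

Log differences approximate growth rates:

$$\ln X_2 - \ln X_1 = \ln \frac{X_2}{X_1} = \ln \left( \frac{X_2}{X_1} - 1 + 1 \right) = \ln \left( 1 + \frac{X_2 - X_1}{X_1} \right) = \ln(1 + x) ,$$

where $x$ is the growth rate of $X$. Take a first order Taylor approximation of $\ln(1 + x)$ around $\ln(1)$:

$$\ln(1 + x) \approx \ln(1) + (\ln(1))' (1 + x - 1) = x .$$

So we have

$$\ln X_2 - \ln X_1 \approx x .$$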
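The quality of this approximation depends on the size of the growth rate. The following sketch, with hypothetical levels of $X$, compares the log difference with the exact growth rate for a small and a large change.

```python
import math

def log_growth(x1, x2):
    """Log difference ln(x2) - ln(x1), an approximation of the growth rate."""
    return math.log(x2) - math.log(x1)

def growth_rate(x1, x2):
    """Exact growth rate (x2 - x1) / x1."""
    return (x2 - x1) / x1

# (exact, approximate) pairs for a 2% change and a 50% change.
small = (growth_rate(100.0, 102.0), log_growth(100.0, 102.0))
large = (growth_rate(100.0, 150.0), log_growth(100.0, 150.0))
```

For the 2% change the two measures nearly coincide, while for the 50% change the log difference understates the growth rate noticeably, as expected from a first order approximation.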

2. Logs "bend down": their image relative to the argument lies below the 45 degree line. Exponents do the opposite.

4. Nevertheless, $\lim_{x \to \infty} \log_b x = \infty$. Also, $\lim_{x \to 0} \log_b x = -\infty$. Therefore the range is $\mathbb{R}$.

Solving $y = Ae^{rt}$ for $t$ gives

$$t = \frac{\ln y - \ln A}{r} .$$

This answers the question: how long will it take to grow from $A$ to $y$, if growth is at an instantaneous rate of $r$.

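$$\frac{d}{dt} \ln t = \frac{1}{t}$$

$$\frac{d}{dt} \log_b t = \frac{d}{dt} \frac{\ln t}{\ln b} = \frac{1}{t \ln b}$$

$$\frac{d}{dt} e^t = e^t$$

Let $y = e^t$, so that $t = \ln y$:

$$\frac{d}{dt} e^t = \frac{d}{dt} y = \frac{1}{dt/dy} = \frac{1}{1/y} = y = e^t .$$

By chain rule:

$$\frac{d}{dt} e^u = e^u \frac{du}{dt}$$

$$\frac{d}{dt} \ln u = \frac{du/dt}{u}$$

Higher derivatives:

$$\frac{d^n}{(dt)^n} e^t = e^t$$

$$\frac{d}{dt} \ln t = \frac{1}{t} , \quad \frac{d^2}{(dt)^2} \ln t = -\frac{1}{t^2} , \quad \frac{d^3}{(dt)^3} \ln t = \frac{2}{t^3} \; \ldots$$

$$\frac{d}{dt} b^t = b^t \ln b , \quad \frac{d^2}{(dt)^2} b^t = b^t (\ln b)^2 , \quad \frac{d^3}{(dt)^3} b^t = b^t (\ln b)^3 \; \ldots$$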

The value of $k$ bottles of wine is given by

$$V(t) = ke^{\sqrt{t}} .$$

Discounting: $D(t) = e^{-rt}$. The present value of $V(t)$ today is

$$PV = D(t) V(t) = e^{-rt} ke^{\sqrt{t}} = ke^{\sqrt{t} - rt} .$$

Choosing $t$ to maximize $PV = ke^{\sqrt{t} - rt}$ is equivalent to choosing $t$ to maximize $\ln PV = \ln k + \sqrt{t} - rt$.

FONC:

$$0.5 t^{-0.5} - r = 0$$

$$0.5 t^{-0.5} = r$$

Marginal benefit to wait one more instant = marginal cost of waiting one more instant. Solving, $t^* = 1/(4r^2)$. SOC:

$$-0.25 t^{-1.5} < 0$$

so $t^*$ is a maximum.
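The closed form $t^* = 1/(4r^2)$ can be cross-checked with a crude grid search. The discount rate, the value of $k$ and the grid are hypothetical choices made for illustration.

```python
import math

def log_pv(t, r, k=1.0):
    """ln PV = ln k + sqrt(t) - r t, the log of the discounted wine value."""
    return math.log(k) + math.sqrt(t) - r * t

def optimal_time(r):
    """Closed form implied by the FONC 0.5 * t**-0.5 = r."""
    return 1.0 / (4.0 * r * r)

r = 0.10                      # hypothetical discount rate
t_star = optimal_time(r)
# Grid search over t in (0, 100] as an independent check of the closed form.
grid = [i / 100 for i in range(1, 10001)]
t_grid = max(grid, key=lambda t: log_pv(t, r))
```

A lower discount rate makes waiting cheaper, so the optimal storage time rises quickly as $r$ falls.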

Denote

$$\frac{d}{dt} x = \dot{x} .$$

So the growth rate at some point in time is

$$\frac{dx/dt}{x} = \frac{\dot{x}}{x} .$$

So in the case $x = Ae^{rt}$, we have

$$\frac{\dot{x}}{x} = r .$$

And since $x(0) = Ae^{r \cdot 0} = A$, we can write without loss of generality $x(t) = x_0 e^{rt}$.

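Growth rates of combinations:

1. For $y(t) = u(t) v(t)$ we have

$$\frac{\dot{y}}{y} = \frac{\dot{u}}{u} + \frac{\dot{v}}{v}$$

$$g_y = g_u + g_v$$

Proof:

$$\ln y(t) = \ln u(t) + \ln v(t)$$

$$\frac{d}{dt} \ln y(t) = \frac{d}{dt} \ln u(t) + \frac{d}{dt} \ln v(t)$$

$$\frac{1}{y(t)} \frac{dy}{dt} = \frac{1}{u(t)} \frac{du}{dt} + \frac{1}{v(t)} \frac{dv}{dt}$$

2. For $y(t) = u(t)/v(t)$ we have

$$\frac{\dot{y}}{y} = \frac{\dot{u}}{u} - \frac{\dot{v}}{v}$$

$$g_y = g_u - g_v$$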

For $y(t) = u(t) \pm v(t)$ we have

$$g_y = \frac{u}{u \pm v} g_u \pm \frac{v}{u \pm v} g_v$$

10.8 Elasticities

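An elasticity of $y$ with respect to $x$ is defined as

$$\varepsilon_{y,x} = \frac{dy/y}{dx/x} = \frac{dy}{dx} \frac{x}{y} .$$

Since

$$d \ln x = \frac{\partial \ln x}{\partial x} dx = \frac{dx}{x}$$

we get

$$\varepsilon_{y,x} = \frac{d \ln y}{d \ln x} .$$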

11.1 The differential version of optimization with one variable

This helps developing concepts for what follows. Let z = f (x) 2 C 1 , x 2 R. Then

$$dz = f'(x) dx .$$

SOC:

$$d^2 z = d[dz] = d[f'(x) dx] = f''(x) dx^2 .$$

A maximum occurs when $f''(x) < 0$ or equivalently when $d^2 z < 0$. A minimum occurs when $f''(x) > 0$ or equivalently when $d^2 z > 0$.

Let z = f (x; y) 2 C 1 , x; y 2 R. Then

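$$dz = f_x dx + f_y dy .$$

FONC: $dz = 0$ for arbitrary values of $dx$ and $dy$, not both equal to zero. A necessary condition that gives this is

$$f_x = 0 \text{ and } f_y = 0 .$$

As before, this is not a sufficient condition for an extremum, not only because of inflection points, but also due to saddle points.

$$dz = \frac{\partial f}{\partial (x, y)} \begin{bmatrix} dx \\ dy \end{bmatrix} = \nabla f \begin{bmatrix} dx \\ dy \end{bmatrix} = \begin{bmatrix} f_x & f_y \end{bmatrix} \begin{bmatrix} dx \\ dy \end{bmatrix} = f_x dx + f_y dy .$$

If $x \in \mathbb{R}^n$ then

$$dz = \frac{\partial f}{\partial x'} dx = \nabla f \, dx = \begin{bmatrix} f_1 & \cdots & f_n \end{bmatrix} \begin{bmatrix} dx_1 \\ \vdots \\ dx_n \end{bmatrix} = \sum_{i=1}^{n} f_i \, dx_i .$$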

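Define

$$f_{xx} = \frac{\partial^2 f}{\partial x^2}$$

$$f_{yy} = \frac{\partial^2 f}{\partial y^2}$$

$$f_{xy} = \frac{\partial^2 f}{\partial x \partial y}$$

$$f_{yx} = \frac{\partial^2 f}{\partial y \partial x}$$

Young's Theorem: If both $f_{xy}$ and $f_{yx}$ are continuous, then $f_{xy} = f_{yx}$.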

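Now we apply this. In matrix notation

$$d^2 z = \begin{bmatrix} dx & dy \end{bmatrix} \begin{bmatrix} f_{xx} & f_{xy} \\ f_{xy} & f_{yy} \end{bmatrix} \begin{bmatrix} dx \\ dy \end{bmatrix} .$$

And more generally, if $x \in \mathbb{R}^n$ then

$$d^2 z = dx' \underbrace{\left[ \frac{\partial^2 f}{\partial x \partial x'} \right]}_{\text{Hessian}} dx .$$

SONC (second order necessary conditions): for arbitrary values of $dx$ and $dy$

$d^2 z \leq 0$ gives a maximum.

$d^2 z \geq 0$ gives a minimum.

SOSC (second order sufficient conditions): for arbitrary values of $dx$ and $dy$

$d^2 z < 0$ iff $f_{xx} < 0$, $f_{yy} < 0$ and $f_{xx} f_{yy} > f_{xy}^2$.

$d^2 z > 0$ iff $f_{xx} > 0$, $f_{yy} > 0$ and $f_{xx} f_{yy} > f_{xy}^2$.

Comments:

SONC are necessary but not sufficient, while SOSC are sufficient but not necessary.

If $f_{xx} f_{yy} = f_{xy}^2$ a point can be an extremum nonetheless.

If $f_{xx} f_{yy} < f_{xy}^2$ then this is a saddle point.

If $f_{xy}^2 > 0$, then $f_{xx} f_{yy} > f_{xy}^2 > 0$ implies $\text{sign}(f_{xx}) = \text{sign}(f_{yy})$.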

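11.3 Quadratic form and sign definiteness

This is a tool to help analyze SOCs. Relabel terms for convenience:

$$z = f(x_1, x_2)$$

$$d^2 z = q , \quad dx_1 = d_1 , \quad dx_2 = d_2$$

$$f_{11} = a , \quad f_{22} = b , \quad f_{12} = h$$

Then

$$q = a d_1^2 + 2h d_1 d_2 + b d_2^2 = \begin{bmatrix} d_1 & d_2 \end{bmatrix} \begin{bmatrix} a & h \\ h & b \end{bmatrix} \begin{bmatrix} d_1 \\ d_2 \end{bmatrix} .$$

Note: $d_1$ and $d_2$ are variables, not constants, as in the FONC. We require the SOCs to hold $\forall d_1, d_2$, and in particular $\forall d_1, d_2 \neq 0$.

$$H = \left[ \frac{\partial^2 f}{\partial x \partial x'} \right]$$

The quadratic form is

$$q = d' H d$$

Define

$$q \text{ is } \left\{ \begin{array}{l} \text{positive definite} \\ \text{positive semidefinite} \\ \text{negative semidefinite} \\ \text{negative definite} \end{array} \right\} \text{ if } q \text{ is invariably } \left\{ \begin{array}{l} > 0 \\ \geq 0 \\ \leq 0 \\ < 0 \end{array} \right\} ,$$

regardless of values of $d$. Otherwise, $q$ is indefinite.

Consider the determinant of $H$, $|H|$, which we call here the discriminant of $H$:

$$q \text{ is } \left\{ \begin{array}{l} \text{positive definite} \\ \text{negative definite} \end{array} \right\} \text{ iff } \left\{ \begin{array}{l} |a| > 0 \\ |a| < 0 \end{array} \right\} \text{ and } |H| > 0 .$$

$|a|$ is (the determinant of) the first ordered minor of $H$. In the simple two variable case, $|H|$ is (the determinant of) the second ordered minor of $H$. In that case

$$|H| = ab - h^2 .$$

If $|H| > 0$, then $a$ and $b$ must have the same sign, since $ab > h^2 > 0$.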

11.4 Quadratic form for n variables and sign definiteness

$$q = d' H d = \sum_{i=1}^{n} \sum_{j=1}^{n} h_{ij} d_i d_j .$$

$q$ is positive definite iff all (determinants of) the principal minors are positive

$$|H_1| = |h_{11}| > 0 , \quad |H_2| = \begin{vmatrix} h_{11} & h_{12} \\ h_{21} & h_{22} \end{vmatrix} > 0 , \quad \ldots \quad |H_n| = |H| > 0 .$$

$q$ is negative definite iff (determinants of) the odd principal minors are negative and the even ones are positive:

$$|H_1| < 0 , \quad |H_2| > 0 , \quad |H_3| < 0 , \; \ldots$$

Consider some $n \times n$ matrix $H_{n \times n}$. We look for a characteristic root $r$ (scalar) and characteristic vector $x_{n \times 1}$ ($n \times 1$) such that

$$Hx = rx .$$

$$Hx = rIx \;\Rightarrow\; (H - rI) x = 0 .$$

$$(H - rI) = \begin{bmatrix} h_{11} - r & h_{12} & \cdots & h_{1n} \\ h_{21} & h_{22} - r & \cdots & h_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ h_{n1} & h_{n2} & \cdots & h_{nn} - r \end{bmatrix}$$

If $(H - rI) x = 0$ has a non trivial solution ($x \neq 0$), then $(H - rI)$ must be singular, so that $|H - rI| = 0$. This is an equation that we can solve for $r$. The equation $|H - rI| = 0$ is the characteristic equation, and is an $n$ degree polynomial in $r$, with $n$ non trivial solutions (some of the solutions can be equal). Some properties:

If $H$ is symmetric, then we will have $r \in \mathbb{R}$. This is useful, because most applications in economics will deal with symmetric matrices, like Hessians and variance-covariance matrices.

For each characteristic root that solves $|H - rI| = 0$ there are many characteristic vectors $x$ such that $Hx = rx$. Therefore we normalize: $x'x = 1$. Denote the normalized characteristic vectors as $v$. Denote the characteristic vectors (eigenvectors) of the characteristic roots (eigenvalues) as $v_i$ and $r_i$.

The set of eigenvectors is orthonormal, i.e. orthogonal and normalized: $v_i' v_j = 0 \;\forall i \neq j$ and $v_i' v_i = 1$.

11.5.1 Application to quadratic form

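Let $V = (v_1, v_2, \ldots, v_n)$ be the set of eigenvectors of the matrix $H$. Define the vector $y$ that solves $d = Vy$. We use this in the quadratic form

$$q = d' H d = y' V' H V y = y' R y ,$$

where $V' H V = R$ and

$$R = \begin{bmatrix} r_1 & 0 & \cdots & 0 \\ 0 & r_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & r_n \end{bmatrix}$$

Here is why:

$$V' H V = V' \begin{bmatrix} Hv_1 & Hv_2 & \cdots & Hv_n \end{bmatrix} = \begin{bmatrix} v_1' \\ v_2' \\ \vdots \\ v_n' \end{bmatrix} \begin{bmatrix} r_1 v_1 & r_2 v_2 & \cdots & r_n v_n \end{bmatrix} = \begin{bmatrix} r_1 v_1' v_1 & r_2 v_1' v_2 & \cdots & r_n v_1' v_n \\ r_1 v_2' v_1 & r_2 v_2' v_2 & \cdots & r_n v_2' v_n \\ \vdots & \vdots & \ddots & \vdots \\ r_1 v_n' v_1 & r_2 v_n' v_2 & \cdots & r_n v_n' v_n \end{bmatrix} = R ,$$

where the last equality follows from $v_i' v_j = 0 \;\forall i \neq j$ and $v_i' v_i = 1$. It follows that $\text{sign}(q)$ depends only on the characteristic roots: $q = y' R y = \sum_{i=1}^{n} r_i y_i^2$.

$$q \text{ is } \left\{ \begin{array}{l} \text{positive definite} \\ \text{positive semidefinite} \\ \text{negative semidefinite} \\ \text{negative definite} \end{array} \right\} \text{ iff all } r_i \left\{ \begin{array}{l} > 0 \\ \geq 0 \\ \leq 0 \\ < 0 \end{array} \right\} .$$

Otherwise, $q$ is indefinite.

When $n$ is large, finding the roots can be hard, because it involves finding the roots of a polynomial of degree $n$. But the computer can do it for us.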
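For the two variable case the characteristic equation is a quadratic and can be solved in closed form, which gives a compact sketch of the eigenvalue test. The function names and example matrices are illustrative; for larger $n$ one would normally rely on a numerical linear algebra library instead.

```python
import math

def eigenvalues_sym2(a, h, b):
    """Characteristic roots of the symmetric matrix [[a, h], [h, b]],
    from |H - rI| = r^2 - (a + b) r + (ab - h^2) = 0."""
    mean = (a + b) / 2.0
    radius = math.sqrt(((a - b) / 2.0) ** 2 + h * h)
    return mean - radius, mean + radius

def definiteness(a, h, b):
    """Classify q = d'Hd by the signs of the characteristic roots.
    Exact comparisons with zero are fragile in floating point;
    this is acceptable for a sketch but not for production use."""
    r_min, r_max = eigenvalues_sym2(a, h, b)
    if r_min > 0:
        return "positive definite"
    if r_max < 0:
        return "negative definite"
    if r_min >= 0:
        return "positive semidefinite"
    if r_max <= 0:
        return "negative semidefinite"
    return "indefinite"

concave_case = definiteness(-2.0, 1.0, -3.0)
saddle_case = definiteness(1.0, 2.0, 1.0)
convex_case = definiteness(2.0, 1.0, 2.0)
```

The verdicts agree with the principal minor test: for instance, in the first case $a = -2 < 0$ and $|H| = ab - h^2 = 5 > 0$.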

We seek conditions for a global maximum or minimum. If a function has a "hill shape" over its entire domain, then we do not need to worry about boundary conditions and the local extremum will be a global extremum. Although the global maximum can be found at the boundary of the domain, it will not be detected by the FONC.

Concave, but not strictly: this allows for flat regions, so the global maximum may not be unique.

Let $z = f(x) \in C^2$, $x \in \mathbb{R}^n$.

$$\text{If } d^2 z \text{ is } \left\{ \begin{array}{l} \text{positive definite} \\ \text{positive semidefinite} \\ \text{negative semidefinite} \\ \text{negative definite} \end{array} \right\} \;\forall x \text{ in the domain, then } f \text{ is } \left\{ \begin{array}{l} \text{strictly convex} \\ \text{convex} \\ \text{concave} \\ \text{strictly concave} \end{array} \right\} .$$

When an objective function is general, then we must assume convexity or concavity. If a specific functional form is used, we can check whether it is convex or concave.

Definition 1: A function $f$ is concave iff $\forall x, y \in$ graph of $f$ the line between $x$ and $y$ lies on or below the graph.

If $\forall x \neq y$ the line lies strictly below the graph, then $f$ is strictly concave.

Definition 2: A function $f$ is concave iff $\forall x, y \in$ domain of $f$, which is assumed to be a convex set, and $\forall \theta \in (0, 1)$ we have

$$\theta f(x) + (1 - \theta) f(y) \leq f[\theta x + (1 - \theta) y] .$$

For strict concavity replace "$\leq$" with "$<$" and add $\forall x \neq y$.

For convexity replace "$\leq$" with "$\geq$" and "$<$" with "$>$".

Draw figures.

The term $\theta x + (1 - \theta) y$, $\theta \in (0, 1)$, is called a convex combination.

Properties:

1. If $f$ is concave, then $-f$ is convex. To see this, multiply the concavity condition by $(-1)$:

$$\theta f(x) + (1 - \theta) f(y) \leq f[\theta x + (1 - \theta) y] \quad / \cdot (-1)$$

$$\theta [-f(x)] + (1 - \theta) [-f(y)] \geq -f[\theta x + (1 - \theta) y]$$

3. If $f$ and $g$ are concave functions, then $f + g$ is also concave. If one of the functions is strictly concave, then $f + g$ is strictly concave.

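Proof: $f$ and $g$ are concave, therefore

$$\theta f(x) + (1 - \theta) f(y) \leq f[\theta x + (1 - \theta) y]$$

$$\theta g(x) + (1 - \theta) g(y) \leq g[\theta x + (1 - \theta) y]$$

$$\theta [f(x) + g(x)] + (1 - \theta) [f(y) + g(y)] \leq f[\theta x + (1 - \theta) y] + g[\theta x + (1 - \theta) y]$$

$$\theta [(f + g)(x)] + (1 - \theta) [(f + g)(y)] \leq (f + g)[\theta x + (1 - \theta) y]$$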

11.6.2 Example

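Is $f(x, y) = x^2 + y^2$ concave or convex? Consider a convex combination of two points in the domain:

$$(i): \; \theta f(x_1, y_1) + (1 - \theta) f(x_2, y_2) = \theta \left( x_1^2 + y_1^2 \right) + (1 - \theta) \left( x_2^2 + y_2^2 \right) .$$

$$(ii): \; f[\theta x_1 + (1 - \theta) x_2, \; \theta y_1 + (1 - \theta) y_2] = [\theta x_1 + (1 - \theta) x_2]^2 + [\theta y_1 + (1 - \theta) y_2]^2$$

$$= \theta^2 \left( x_1^2 + y_1^2 \right) + (1 - \theta)^2 \left( x_2^2 + y_2^2 \right) + 2\theta (1 - \theta) (x_1 x_2 + y_1 y_2) .$$

Subtracting,

$$(i) - (ii) = \theta (1 - \theta) \left[ x_1^2 + y_1^2 + x_2^2 + y_2^2 \right] - 2\theta (1 - \theta) (x_1 x_2 + y_1 y_2) = \theta (1 - \theta) \left[ (x_1 - x_2)^2 + (y_1 - y_2)^2 \right] \geq 0 .$$

So this is a convex function. Moreover, it is strictly convex, since $\forall x_1 \neq x_2$ and $\forall y_1 \neq y_2$ we have $(i) - (ii) > 0$.

Clearly, $-x^2 - y^2$ is strictly concave.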

11.6.3 Example

Is $f(x, y) = (x + y)^2$ concave or convex? Use the same procedure from above.

$$(i): \; \theta f(x_1, y_1) + (1 - \theta) f(x_2, y_2) = \theta (x_1 + y_1)^2 + (1 - \theta) (x_2 + y_2)^2 .$$

Now consider

$$(ii): \; f[\theta x_1 + (1 - \theta) x_2, \; \theta y_1 + (1 - \theta) y_2] = [\theta x_1 + (1 - \theta) x_2 + \theta y_1 + (1 - \theta) y_2]^2$$

$$= [\theta (x_1 + y_1) + (1 - \theta) (x_2 + y_2)]^2$$

$$= \theta^2 (x_1 + y_1)^2 + 2\theta (1 - \theta) (x_1 + y_1)(x_2 + y_2) + (1 - \theta)^2 (x_2 + y_2)^2 .$$

Subtracting,

$$(i) - (ii) = \theta (1 - \theta) \left[ (x_1 + y_1)^2 + (x_2 + y_2)^2 \right] - 2\theta (1 - \theta) (x_1 + y_1)(x_2 + y_2)$$

$$= \theta (1 - \theta) \left[ (x_1 + y_1) - (x_2 + y_2) \right]^2 \geq 0 .$$

So convex but not strictly. Why not strict? Because when $x + y = 0$, i.e. when $y = -x$, we get $f(x, y) = 0$. The shape of this function is a hammock, with the bottom at $y = -x$.

11.7 Differentiable functions, convexity and concavity

Let f (x) 2 C 1 and x 2 R. Then f is concave if 8x1 ; x2 2domain of f

f x1 f x2

f 0 x1 ;

x1 x2

i.e. the slope is smaller than the derivative at x1 . Think of x1 as the point of reference and x2 as a target

point. For convex replace " " with " ".

f x1 f x2 + f 0 x1 x1 x2 :

f x1 f x2 + rf x1 x1 x2 :

f rf x1 x:

semidefinite. If $d^2 z$ is negative definite, then $f$ is strictly concave (not only if). Replace "negative" with "positive" for convexity.

This is related, but distinct from convex and concave functions.

Define: convex set in $\mathbb{R}^n$. Let the set $S \subset \mathbb{R}^n$. If $\forall x, y \in S$ and $\forall \theta \in [0, 1]$ we have

$$\theta x + (1 - \theta) y \in S$$

then $S$ is a convex set. (This definition actually holds in other spaces too.) Essentially, a set is convex if it has no "holes" (no doughnuts) and the boundary is not "dented" (no bananas).

11.8.1 Relation to convex functions 1

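The definition of a convex function,

$$\theta f(x) + (1 - \theta) f(y) \geq f[\theta x + (1 - \theta) y] ,$$

requires that

$$\theta x + (1 - \theta) y \in \text{domain of } f ,$$

i.e. that the domain of $f$ is a convex set.

If $f$ is a convex function, then the lower contour set

$$S = \{ x : f(x) \leq k \} , \quad k \in \mathbb{R}$$

is a convex set (but not only if, i.e. this is a necessary condition, not sufficient).

If $f$ is a concave function, then the upper contour set

$$S = \{ x : f(x) \geq k \} , \quad k \in \mathbb{R}$$

is a convex set (but not only if, i.e. this is a necessary condition, not sufficient).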

This is why there is an intimate relationship between convex preferences and concave utility functions.

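$$\pi = R - C = pq - wl - rk .$$

Let $p$, $w$, $r$ be given, i.e. the firm is a price taker in a competitive economy. To simplify, let output, $q$, be the numeraire, so that $p = 1$ and everything is then denominated in units of output:

$$\pi = q - wl - rk .$$

With the production function $q = k^\alpha l^\alpha$, $\alpha < 1/2$, the FONC are:

$$\frac{\partial \pi}{\partial k} = \alpha k^{\alpha - 1} l^\alpha - r = 0$$

$$\frac{\partial \pi}{\partial l} = \alpha k^\alpha l^{\alpha - 1} - w = 0 .$$

SOC:

$$|H| = \left| \frac{\partial^2 \pi}{\partial \binom{k}{l} \partial (k \;\; l)} \right| = \begin{vmatrix} \alpha (\alpha - 1) k^{\alpha - 2} l^\alpha & \alpha^2 k^{\alpha - 1} l^{\alpha - 1} \\ \alpha^2 k^{\alpha - 1} l^{\alpha - 1} & \alpha (\alpha - 1) k^\alpha l^{\alpha - 2} \end{vmatrix} .$$

$|H_1| = \alpha (\alpha - 1) k^{\alpha - 2} l^\alpha < 0 \;\forall k, l > 0$. $|H_2| = |H| = \alpha^2 (1 - 2\alpha) k^{2(\alpha - 1)} l^{2(\alpha - 1)} > 0 \;\forall k, l > 0$. Therefore $\pi$ is a concave function and the extremum will be a maximum.

From the FONC:

$$\alpha k^{\alpha - 1} l^\alpha = \alpha \frac{q}{k} = r$$

$$\alpha k^\alpha l^{\alpha - 1} = \alpha \frac{q}{l} = w$$

so that $rk = wl = \alpha q$. Thus

$$k = \frac{\alpha q}{r}$$

$$l = \frac{\alpha q}{w} .$$

Using this in the production function:

$$q = k^\alpha l^\alpha = \left( \frac{\alpha q}{r} \right)^\alpha \left( \frac{\alpha q}{w} \right)^\alpha = \alpha^{2\alpha} q^{2\alpha} \left( \frac{1}{rw} \right)^\alpha = \alpha^{\frac{2\alpha}{1 - 2\alpha}} \left( \frac{1}{rw} \right)^{\frac{\alpha}{1 - 2\alpha}} ,$$

so that

$$k^* = \alpha^{\frac{1}{1 - 2\alpha}} \left( \frac{1}{r} \right)^{\frac{1 - \alpha}{1 - 2\alpha}} \left( \frac{1}{w} \right)^{\frac{\alpha}{1 - 2\alpha}}$$

$$l^* = \alpha^{\frac{1}{1 - 2\alpha}} \left( \frac{1}{r} \right)^{\frac{\alpha}{1 - 2\alpha}} \left( \frac{1}{w} \right)^{\frac{1 - \alpha}{1 - 2\alpha}} .$$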
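The closed form factor demands can be checked against the FONC. This sketch assumes the production function $q = k^\alpha l^\alpha$ with $0 < \alpha < 1/2$ and $p = 1$; the parameter values and the function name are hypothetical.

```python
def factor_demands(alpha, r, w):
    """Profit-maximizing k, l and output q for q = k**alpha * l**alpha, p = 1.
    Requires 0 < alpha < 1/2 so that the SOC holds."""
    if not 0 < alpha < 0.5:
        raise ValueError("need 0 < alpha < 1/2")
    e = 1.0 / (1.0 - 2.0 * alpha)
    k = alpha ** e * r ** (-(1 - alpha) * e) * w ** (-alpha * e)
    l = alpha ** e * r ** (-alpha * e) * w ** (-(1 - alpha) * e)
    return k, l, k ** alpha * l ** alpha

alpha, r, w = 0.25, 0.5, 0.5   # hypothetical technology and factor prices
k, l, q = factor_demands(alpha, r, w)
# FONC residuals: marginal products minus factor prices (should be zero).
foc_k = alpha * k ** (alpha - 1) * l ** alpha - r
foc_l = alpha * k ** alpha * l ** (alpha - 1) - w
```

With these numbers $k^* = l^* = 0.25$ and $q = 0.5$, so that $rk = wl = \alpha q = 0.125$, as the FONC require.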

12.1 Example: the consumer problem

Constraint(s): s.t. $(x, y) \in B = \{ (x, y) : x, y \geq 0, \; x p_x + y p_y \leq I \}$

(draw the budget set, $B$; $I$ denotes income). Under some conditions, which we will explore soon, we will get the result that the consumer chooses a point on the budget line, s.t. $x p_x + y p_y = I$, and that $x, y \geq 0$ is trivially satisfied (nonsatiation and strict convexity of $u$). So we state a simpler problem:

Constraint(s): s.t. $x p_x + y p_y = I$.

The optimum will be denoted $(x^*, y^*)$. The value of the problem is thus $u(x^*, y^*)$. Constraints can only hurt the unconstrained value (although they may not). This will happen when the unconstrained optimum point is not in the constraint set. E.g., consider an objective that has an unconstrained maximum at $(x^*, y^*) = (1/2, 1/2)$: this point is not on the line $x + y = 2$, so applying this constraint will move us away from the unconstrained optimum and hurt the objective.

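Let $f, g \in C^1$. Suppose that $(x^*, y^*)$ is the solution to

$$\max_{x, y} f(x, y) \quad \text{s.t.} \quad g(x, y) = c$$

and that $(x^*, y^*)$ is not a critical point of $g(x, y) = c$, i.e. both $g_x \neq 0$ and $g_y \neq 0$ at $(x^*, y^*)$. Then there exists a number $\lambda^*$ such that $(x^*, y^*, \lambda^*)$ is a critical point of

$$L = f(x, y) + \lambda [c - g(x, y)] ,$$

i.e.

$$\frac{\partial L}{\partial \lambda} = c - g(x, y) = 0$$

$$\frac{\partial L}{\partial x} = f_x - \lambda g_x = 0$$

$$\frac{\partial L}{\partial y} = f_y - \lambda g_y = 0 .$$

From this it follows that at $(x^*, y^*, \lambda^*)$

$$g(x^*, y^*) = c$$

$$\lambda^* = f_x / g_x$$

$$\lambda^* = f_y / g_y .$$

The last equations make it clear why we must check the constraint qualifications, that both $g_x \neq 0$ and $g_y \neq 0$ at $(x^*, y^*)$, i.e. check that $(x^*, y^*)$ is not a critical point of $g(x, y) = c$. For linear constraints this will be automatically satisfied.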

$$dz = f_x dx + f_y dy = 0 ,$$

and thus

$$\frac{dy}{dx} = -\frac{f_x}{f_y}$$

But even in the constrained problem this still holds, as we will see below, except that now $dx$ and $dy$ are not arbitrary: they must satisfy the constraint, i.e.

$$g_x dx + g_y dy = 0 .$$

Thus

$$\frac{dy}{dx} = -\frac{g_x}{g_y} .$$

From both of these we obtain

$$\frac{g_x}{g_y} = \frac{f_x}{f_y} ,$$

i.e. the objective and the constraint are tangent. This follows from

$$\frac{f_y}{g_y} = \lambda = \frac{f_x}{g_x} .$$

A graphic interpretation. Think of the gradient as a vector that points in a particular direction. This direction is where to move in order to increase the function the most, and is perpendicular to the isoquant of the function. Notice that we have

$$\nabla f(x^*, y^*) = \lambda^* \nabla g(x^*, y^*)$$

$$(f_x, f_y) = \lambda^* (g_x, g_y) .$$

This means that the constraint and the isoquant of the objective at the optimal value are parallel. They may point in the same direction if $\lambda > 0$ or in opposite directions if $\lambda < 0$.

$\lambda$ tells you how much $f$ would increase if we relax the constraint by one unit, i.e. increase or decrease $c$ by 1 (for equality constraints, it will be either-or). For example, if the objective is utility and the constraint is your budget in money, \$, then $\lambda$ is in terms of utils/\$. It tells you how many more utils you would get if you had one more dollar.

Write the system of equations that define the optimum as identities

$$F^1(\lambda, x, y) = c - g(x, y) = 0$$

$$F^2(\lambda, x, y) = f_x - \lambda g_x = 0$$

$$F^3(\lambda, x, y) = f_y - \lambda g_y = 0 .$$

This is a system of functions of the form $F(\lambda, x, y; c) = 0$. If all these functions are continuous and $|J| \neq 0$ at $(\lambda^*, x^*, y^*)$, where

$$|J| = \left| \frac{\partial F}{\partial (\lambda \; x \; y)} \right| = \begin{vmatrix} 0 & -g_x & -g_y \\ -g_x & f_{xx} - \lambda g_{xx} & f_{xy} - \lambda g_{xy} \\ -g_y & f_{xy} - \lambda g_{xy} & f_{yy} - \lambda g_{yy} \end{vmatrix} ,$$

then by the implicit function theorem we have $\lambda^* = \lambda(c)$, $x^* = x(c)$ and $y^* = y(c)$ with well defined derivatives that can be found as we did above. The point is that such functions exist and that they are differentiable. It follows that there is a sense in which $d\lambda^*/dc$ is meaningful.

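Now consider the value of the Lagrangian

$$L^* = L(\lambda^*, x^*, y^*) = f(x^*, y^*) + \lambda^* [c - g(x^*, y^*)] .$$

Differentiating with respect to $c$:

$$\frac{dL^*}{dc} = f_x \frac{dx^*}{dc} + f_y \frac{dy^*}{dc} + \frac{d\lambda^*}{dc} [c - g(x^*, y^*)] + \lambda^* \left[ 1 - g_x \frac{dx^*}{dc} - g_y \frac{dy^*}{dc} \right]$$

$$= [f_x - \lambda^* g_x] \frac{dx^*}{dc} + [f_y - \lambda^* g_y] \frac{dy^*}{dc} + \frac{d\lambda^*}{dc} [c - g(x^*, y^*)] + \lambda^*$$

$$= \lambda^* .$$

Therefore

$$\frac{dL^*}{dc} = \frac{\partial L^*}{\partial c} = \lambda^* .$$

This is a manifestation of the envelope theorem (see below). But we also know that at the optimum we have

$$c - g(x^*, y^*) = 0 .$$

So at the optimum

$$L^*(x^*, y^*, \lambda^*) = f(x^*, y^*) ,$$

and therefore

$$\frac{dL^*}{dc} = \frac{df^*}{dc} = \lambda^* .$$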
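The shadow price interpretation can be illustrated with a problem that has a closed form solution. The example problem, maximize $xy$ subject to $x + y = c$, is an assumption chosen for illustration and does not appear in the notes; its Lagrange conditions $y = \lambda$, $x = \lambda$, $x + y = c$ give $x^* = y^* = \lambda^* = c/2$.

```python
def solve(c):
    """Solve max x*y s.t. x + y = c (a hypothetical example problem).
    The Lagrange conditions y = lam, x = lam, x + y = c give x = y = lam = c/2."""
    x = y = lam = c / 2.0
    return x, y, lam

def value(c):
    """Optimal value f(x*, y*) as a function of the constraint level c."""
    x, y, _ = solve(c)
    return x * y

c = 4.0
_, _, lam = solve(c)
# Numerical derivative of the optimal value with respect to c.
h = 1e-6
dV_dc = (value(c + h) - value(c - h)) / (2 * h)
```

Relaxing the constraint by a small amount raises the optimal value at the rate given by the multiplier, which is the content of $df^*/dc = \lambda^*$.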

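Let $x^*$ be a critical point of $f(x, \theta)$. Then

$$\frac{df(x^*, \theta)}{d\theta} = \frac{\partial f(x^*, \theta)}{\partial \theta} .$$

Proof: since at $x^*$ we have $f_x(x^*, \theta) = 0$, we have

$$\frac{df(x^*, \theta)}{d\theta} = \frac{\partial f(x^*, \theta)}{\partial x} \frac{dx^*}{d\theta} + \frac{\partial f(x^*, \theta)}{\partial \theta} = \frac{\partial f(x^*, \theta)}{\partial \theta}$$

Drawing of an "envelope" of functions and optima for $f(x^*, \theta_1)$, $f(x^*, \theta_2)$, ...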

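Let $f(x), g(x) \in C^1$ and $x \in \mathbb{R}^n$. Suppose that $x^*$ is the solution to

$$\max_x f(x) \quad \text{s.t.} \quad g(x) = c$$

and that $x^*$ is not a critical point of $g(x) = c$. Then there exists a number $\lambda^*$ such that $(x^*, \lambda^*)$ is a critical point of

$$L = f(x) + \lambda [c - g(x)] ,$$

i.e.

$$\frac{\partial L}{\partial \lambda} = c - g(x) = 0$$

$$\frac{\partial L}{\partial x_i} = f_i - \lambda g_i = 0 , \quad i = 1, 2, \ldots n .$$

The constraint qualification is similar to above:

$$\nabla g^* = (g_1(x^*), g_2(x^*), \ldots g_n(x^*)) \neq 0 .$$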

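Let $f(x), g^j(x) \in C^1$, $j = 1, 2, \ldots m$, and $x \in \mathbb{R}^n$. Suppose that $x^*$ is the solution to

$$\max_x f(x) \quad \text{s.t.} \quad g^j(x) = c_j , \quad j = 1, 2, \ldots m$$

and that $x^*$ satisfies the constraint qualifications. Then there exist $m$ numbers $\lambda_1^*, \lambda_2^*, \ldots \lambda_m^*$ such that $(x^*, \lambda^*)$ is a critical point of

$$L = f(x) + \sum_{j=1}^{m} \lambda_j \left[ c_j - g^j(x) \right] ,$$

i.e.

$$\frac{\partial L}{\partial \lambda_j} = c_j - g^j(x) = 0 , \quad j = 1, 2, \ldots m$$

$$\frac{\partial L}{\partial x_i} = f_i - \sum_{j=1}^{m} \lambda_j g_i^j = 0 , \quad i = 1, 2, \ldots n .$$

The constraint qualification now requires that

$$\text{rank} \left[ \frac{\partial g}{\partial x'} \right]_{m \times n} = m ,$$

which is as large as it can possibly be. This means that we must have $m \leq n$, because otherwise the maximal rank would be $n < m$. This constraint qualification, as all the others, means that there exists an $n - m$ dimensional tangent hyperplane (a $\mathbb{R}^{n - m}$ vector space). Loosely speaking, it ensures that we can construct tangencies freely enough.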

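This example shows that when the constraint qualification is not met, the Lagrange method does not work.

Choose $\{x, y\}$ to maximize $-x$, s.t. $x^3 - y^2 = 0$.

The constraint set is given by

$$y^2 = x^3 \Rightarrow y = \pm x^{3/2} \text{ for } x \ge 0,$$

i.e.

$$C = \left\{(x, y) : x \ge 0,\ y = x^{3/2} \text{ or } y = -x^{3/2}\right\}.$$

Notice that $(0, 0)$ is the maximum point. Let us check the constraint qualification:

$$\nabla g = \left(3x^2, \ -2y\right)$$

$$\nabla g(0, 0) = (0, 0).$$

This violates the constraint qualifications, since $(0, 0)$ is a critical point of $g(x, y)$.

Now check the Lagrangian

$$L = -x + \lambda\left(x^3 - y^2\right)$$

$$L_\lambda = x^3 - y^2 = 0$$

$$L_x = -1 + 3\lambda x^2 = 0 \Rightarrow \lambda = 1/(3x^2)$$

$$L_y = -2\lambda y = 0 \Rightarrow \text{either } \lambda = 0 \text{ or } y = 0.$$

Suppose $x \neq 0$. Then $\lambda > 0$ and thus $y = 0$. But then from the constraint set $x = 0$, a contradiction. If instead $x = 0$, then $L_x = -1 \neq 0$, so the FONCs have no solution at all.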
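A small check of why the method breaks down at the true maximizer. This sketch assumes the problem is "maximize $-x$ s.t. $x^3 - y^2 = 0$" with $L = -x + \lambda(x^3 - y^2)$. The sign convention is our reading of the garbled source. It evaluates the constraint gradient at $(0,0)$ and shows that $L_x$ cannot be set to zero there for any multiplier.

```python
def grad_g(x, y):
    """Gradient of g(x, y) = x^3 - y^2."""
    return (3 * x**2, -2 * y)

def L_x(x, lam):
    """Derivative of L = -x + lam * (x^3 - y^2) with respect to x."""
    return -1 + 3 * lam * x**2

# The constraint qualification fails: the gradient vanishes at the optimum.
print(grad_g(0.0, 0.0))

# At x = 0 the first-order condition L_x = 0 fails for every multiplier tried.
trial_multipliers = (-10.0, 0.0, 1.0, 1e6)
print([L_x(0.0, lam) for lam in trial_multipliers])
```

Every entry of the second list equals $-1$, so no choice of $\lambda$ makes $(0,0)$ a critical point of the Lagrangian.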

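Let $f(x), g(x) \in C^1$, $x \in \mathbb{R}^n$. Suppose that $x^*$ is the solution to

Choose $x$ to maximize $f(x)$, s.t. $g(x) = c$.

Then there exist two numbers $\lambda_0^*$ and $\lambda_1^*$ such that $(\lambda_1^*, x^*)$ is a critical point of

$$L = \lambda_0 f(x) + \lambda_1 [c - g(x)],$$

i.e.

$$\frac{\partial L}{\partial\lambda_1} = c - g(x) = 0$$

$$\frac{\partial L}{\partial x_i} = \lambda_0 f_i - \lambda_1 g_i = 0, \quad i = 1, 2, \dots, n$$

and

$$\lambda_0^* \in \{0, 1\}$$

$$\{\lambda_0^*, \lambda_1^*\} \neq (0, 0).$$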

We want to know whether $d^2z$ is negative or positive definite on the constraint set. Using the Lagrange method we find a critical point $(x^*, \lambda^*)$ of the problem

$$L = f(x) + \lambda[c - g(x)].$$

But this is not a maximum of the $L$ problem. In fact, $(x^*, \lambda^*)$ is a saddle point: perturbations of $x$ around $x^*$ will hurt the objective, while perturbations of $\lambda$ around $\lambda^*$ will increase the objective. If $(x^*, \lambda^*)$ is a critical point of the $L$ problem, then: holding $\lambda^*$ constant, $x^*$ maximizes the problem; and holding $x^*$ constant, $\lambda^*$ minimizes the problem. This makes sense: lowering the shadow cost of the constraint as much as possible at the point that maximizes the value.

This complicates characterizing the second order conditions, to distinguish maxima from minima. We want to know whether $d^2z$ is negative or positive definite on the constraint set. Consider the two variables case

$$dz = f_x dx + f_y dy$$

$$g_x dx + g_y dy = 0 \Rightarrow dy = -\frac{g_x}{g_y}dx,$$

i.e. $dy$ is not arbitrary. We can treat $dy$ as a function of $x$ and $y$ when we differentiate $dz$ the second time:

$$\begin{aligned}
d^2z = d(dz) &= \frac{\partial(dz)}{\partial x}dx + \frac{\partial(dz)}{\partial y}dy \\
&= \frac{\partial}{\partial x}\left[f_x dx + f_y dy\right]dx + \frac{\partial}{\partial y}\left[f_x dx + f_y dy\right]dy \\
&= \left[f_{xx}dx + f_{yx}dy + f_y\frac{\partial(dy)}{\partial x}\right]dx + \left[f_{xy}dx + f_{yy}dy + f_y\frac{\partial(dy)}{\partial y}\right]dy \\
&= f_{xx}dx^2 + 2f_{xy}dxdy + f_{yy}dy^2 + f_y d^2y,
\end{aligned}$$

where we use

$$d^2y = d(dy) = \frac{\partial(dy)}{\partial x}dx + \frac{\partial(dy)}{\partial y}dy.$$

This is not a quadratic form, but we use $g(x, y) = c$ again to transform it into one, by eliminating $d^2y$. Differentiate

$$dg = g_x dx + g_y dy = 0,$$


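using $dy$ as a function of $x$ and $y$ again:

$$\begin{aligned}
d^2g = d(dg) &= \frac{\partial(dg)}{\partial x}dx + \frac{\partial(dg)}{\partial y}dy \\
&= \frac{\partial}{\partial x}\left[g_x dx + g_y dy\right]dx + \frac{\partial}{\partial y}\left[g_x dx + g_y dy\right]dy \\
&= \left[g_{xx}dx + g_{yx}dy + g_y\frac{\partial(dy)}{\partial x}\right]dx + \left[g_{xy}dx + g_{yy}dy + g_y\frac{\partial(dy)}{\partial y}\right]dy \\
&= g_{xx}dx^2 + 2g_{xy}dxdy + g_{yy}dy^2 + g_y d^2y = 0.
\end{aligned}$$

Thus

$$d^2y = -\frac{1}{g_y}\left[g_{xx}dx^2 + 2g_{xy}dxdy + g_{yy}dy^2\right].$$

Use this in the expression for $d^2z$ to get

$$d^2z = \left(f_{xx} - f_y\frac{g_{xx}}{g_y}\right)dx^2 + 2\left(f_{xy} - f_y\frac{g_{xy}}{g_y}\right)dxdy + \left(f_{yy} - f_y\frac{g_{yy}}{g_y}\right)dy^2.$$

From the FONCs we have

$$\lambda = \frac{f_y}{g_y}$$

and by differentiating the FONCs we get

$$L_{xx} = f_{xx} - \lambda g_{xx}$$

$$L_{yy} = f_{yy} - \lambda g_{yy}$$

$$L_{xy} = f_{xy} - \lambda g_{xy}.$$

Substituting these gives

$$d^2z = L_{xx}dx^2 + 2L_{xy}dxdy + L_{yy}dy^2.$$

This is a quadratic form, but not a standard one, because $dx$ and $dy$ are not arbitrary. As before, we want to know the sign of $d^2z$, but unlike the unconstrained case, $dx$ and $dy$ must satisfy $dg = g_x dx + g_y dy = 0$. Thus, we have second order necessary conditions (SONC):

For a maximum: $d^2z$ is negative semidefinite s.t. $dg = 0$.

For a minimum: $d^2z$ is positive semidefinite s.t. $dg = 0$.

These are less stringent conditions relative to unconstrained optimization, where we required conditions on $d^2z$ for all values of $dx$ and $dy$. Here we consider only a subset of those values, so the requirement is less stringent, although slightly harder to characterize.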


12.10 Bordered Hessian and constrained optimization

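Using the notations we used before for a Hessian,

$$H = \begin{bmatrix} a & h \\ h & b \end{bmatrix}$$

we can write

$$d^2z = L_{xx}dx^2 + 2L_{xy}dxdy + L_{yy}dy^2$$

as

$$d^2z = a\,dx^2 + 2h\,dxdy + b\,dy^2.$$

We also rewrite

$$g_x dx + g_y dy = 0$$

as

$$\alpha\,dx + \beta\,dy = 0.$$

The second order conditions involve the sign of $d^2z$ s.t. $0 = \alpha\,dx + \beta\,dy$. Eliminate $dy$ using

$$dy = -\frac{\alpha}{\beta}dx$$

to get

$$d^2z = \left[a\beta^2 - 2h\alpha\beta + b\alpha^2\right]\frac{dx^2}{\beta^2}.$$

The sign of $d^2z$ depends on the square brackets. For a maximum we need it to be negative. It turns out that

$$a\beta^2 - 2h\alpha\beta + b\alpha^2 = -\begin{vmatrix} 0 & \alpha & \beta \\ \alpha & a & h \\ \beta & h & b \end{vmatrix} \equiv -\left|\bar H\right|.$$

Notice that $\bar H$ contains the Hessian, and is bordered by the gradient of the constraint. Thus, the term "bordered Hessian".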

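Let $f(x), g(x) \in C^2$, $x \in \mathbb{R}^n$. Suppose that $x^*$ is a critical point of the Lagrangian problem. Let

$$H_{n\times n} = \left[\frac{\partial^2 L}{\partial x\,\partial x'}\right]$$

be the Hessian of $L$ evaluated at $x^*$. Let $\nabla g$ be the set of linear constraints on $d_{n\times 1}$ ($= dx_{n\times 1}$), evaluated at $x^*$:

$$\nabla g\, d = 0.$$

We want to know what is the sign of

$$d^2z = q = d'Hd$$

such that

$$\nabla g\, d = 0.$$

The sign definiteness of the quadratic form $q$ depends on the following bordered Hessian

$$\bar H_{(n+1)\times(n+1)} = \begin{bmatrix} 0 & \nabla g_{1\times n} \\ \nabla g'_{n\times 1} & H_{n\times n} \end{bmatrix}.$$

Recall that sign definiteness of a matrix depends on the signs of the determinants of the leading principal minors. Therefore

$$d^2z \text{ is } \left\{\begin{array}{l} \text{positive definite (min)} \\ \text{negative definite (max)} \end{array}\right\} \text{ s.t. } dg = 0 \text{ iff } \left\{\begin{array}{l} |\bar H_3|, |\bar H_4|, \dots, |\bar H_{n+1}| < 0 \\ |\bar H_3| > 0,\ |\bar H_4| < 0,\ |\bar H_5| > 0, \dots \end{array}\right.$$

Note that in the text they start from $|\bar H_2|$, which they define as the third leading principal minor; this is an abuse of notation. We have one consistent way to define leading principal minors of a matrix and we should stick to that.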
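As a worked instance of the bordered Hessian test with one constraint and $n = 2$, the sketch below uses an illustrative problem of our own: maximize $xy$ s.t. $x + y = 2$, whose critical point is $x^* = y^* = 1$ with $\lambda^* = 1$. It checks both the sign condition $|\bar H_3| > 0$ for a maximum and the two-variable identity $a\beta^2 - 2h\alpha\beta + b\alpha^2 = -|\bar H|$.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (r1a, r1b, r1c), (r2a, r2b, r2c), (r3a, r3b, r3c) = m
    return (r1a * (r2b * r3c - r2c * r3b)
            - r1b * (r2a * r3c - r2c * r3a)
            + r1c * (r2a * r3b - r2b * r3a))

# Illustrative problem (an assumption): maximize x*y s.t. x + y = 2.
alpha, beta = 1.0, 1.0      # g_x, g_y at the critical point
a, h, b = 0.0, 1.0, 0.0     # L_xx, L_xy, L_yy at the critical point

bordered = [[0.0, alpha, beta],
            [alpha, a, h],
            [beta, h, b]]
det_bordered = det3(bordered)

# Square-bracket term from eliminating dy in d2z.
bracket = a * beta**2 - 2 * h * alpha * beta + b * alpha**2
print(det_bordered, bracket)
```

Here the determinant is positive and the bracket is its negative, so $d^2z < 0$ on the constraint and the critical point is a maximum.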

Let $f(x), g^j(x) \in C^2$, $j = 1, 2, \dots, m$, and $x \in \mathbb{R}^n$. Suppose that $x^*$ is a critical point of the Lagrangian problem. Let

$$H_{n\times n} = \left[\frac{\partial^2 L}{\partial x\,\partial x'}\right]$$

be the Hessian of $L$ evaluated at $x^*$. Let

$$A_{m\times n} = \left[\frac{\partial g}{\partial x'}\right]$$

be the set of linear constraints on $d_{n\times 1}$ ($= dx_{n\times 1}$), evaluated at $x^*$:

$$Ad = 0.$$

We want to know the sign of

$$d^2z = q = d'Hd$$

such that

$$Ad = 0.$$

The sign definiteness of the quadratic form $q$ depends on the bordered Hessian

$$\bar H_{(m+n)\times(m+n)} = \begin{bmatrix} 0_{m\times m} & A_{m\times n} \\ A'_{n\times m} & H_{n\times n} \end{bmatrix}.$$

The sign definiteness of $q$ depends on the signs of the determinants of the leading principal minors of $\bar H$.

For a maximum ($d^2z$ negative definite) we require that $|\bar H_{2m}|, |\bar H_{2m+1}|, \dots, |\bar H_{m+n}|$ alternate signs, where $\operatorname{sign}|\bar H_{2m}| = (-1)^m$ (Dixit).

An alternative formulation for a maximum ($d^2z$ negative definite) requires that $|\bar H_{n+m}|, |\bar H_{n+m-1}|, \dots$ alternate signs, where $\operatorname{sign}|\bar H_{n+m}| = (-1)^n$ (Simon and Blume).

For a minimum...? We know that searching for a minimum of $f$ is like searching for a maximum of $-f$. So one could set up the problem that way and just treat it like a maximization problem.

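This is a less restrictive condition on the objective function.

Definition: a function $f$ is quasiconcave if $\forall x^1, x^2 \in$ domain of $f$, which is assumed to be a convex set, and $\forall\lambda \in (0, 1)$ we have

$$f(x^2) \ge f(x^1) \Rightarrow f\left(\lambda x^1 + (1-\lambda)x^2\right) \ge f(x^1),$$

or, equivalently,

$$f\left(\lambda x^1 + (1-\lambda)x^2\right) \ge \min\left\{f(x^2), f(x^1)\right\}.$$

In words: the image of the convex combination is at least as large as the lower of the two images.

Definition: a function $f$ is quasiconvex if $\forall x^1, x^2 \in$ domain of $f$, which is assumed to be a convex set, and $\forall\lambda \in (0, 1)$ we have

$$f(x^2) \ge f(x^1) \Rightarrow f\left(\lambda x^1 + (1-\lambda)x^2\right) \le f(x^2),$$

or, equivalently,

$$f\left(\lambda x^1 + (1-\lambda)x^2\right) \le \max\left\{f(x^2), f(x^1)\right\}.$$

In words: the image of the convex combination is no larger than the higher of the two images.

For strict quasiconcavity and quasiconvexity replace the second inequality with a strict inequality, but not the first. Strict quasiconcavity or quasiconvexity rule out flat segments. Here $\lambda \notin \{0, 1\}$.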


Due to the flat segment, the function on the left is not strictly quasiconcave. Note that neither of these functions is convex or concave. Thus, this is a weaker restriction. The following function, while neither convex nor concave, is both quasiconcave and quasiconvex.

Properties:

Note that unlike concave functions, the sum of two quasiconcave functions is NOT necessarily quasiconcave.
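The last property can be illustrated with a counterexample of our own choosing: $x^3$ and $-x$ are both monotone, hence quasiconcave, but their sum $x^3 - x$ is not. The sketch below tests the definition $f(\lambda x^1 + (1-\lambda)x^2) \ge \min\{f(x^1), f(x^2)\}$ on a grid. This can only refute quasiconcavity, not prove it, so a `True` result means "no violation found on this grid".

```python
def is_quasiconcave_on_grid(func, grid, lambdas, tol=1e-12):
    """Return False if the quasiconcavity inequality fails for some grid pair."""
    for x1 in grid:
        for x2 in grid:
            for lam in lambdas:
                mid = lam * x1 + (1 - lam) * x2
                if func(mid) < min(func(x1), func(x2)) - tol:
                    return False
    return True

grid = [i / 10 for i in range(-20, 21)]   # points in [-2, 2]
lambdas = [0.25, 0.5, 0.75]

f1 = lambda x: x**3          # increasing, hence quasiconcave
f2 = lambda x: -x            # decreasing, hence quasiconcave
f_sum = lambda x: x**3 - x   # the sum of the two

result_f1 = is_quasiconcave_on_grid(f1, grid, lambdas)
result_f2 = is_quasiconcave_on_grid(f2, grid, lambdas)
result_sum = is_quasiconcave_on_grid(f_sum, grid, lambdas)
print(result_f1, result_f2, result_sum)
```

For example, $x^1 = -0.6$ and $x^2 = 1.2$ both give positive values of $x^3 - x$, while their midpoint $0.3$ gives a negative value, which violates the definition.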


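Alternative definitions: Let $x \in \mathbb{R}^n$.

$f$ is quasiconcave iff the upper contour set $S = \{x : f(x) \ge k\}$ is convex $\forall k \in \mathbb{R}$.

$f$ is quasiconvex iff the lower contour set $S = \{x : f(x) \le k\}$ is convex $\forall k \in \mathbb{R}$.

These may be easier to verify. Recall that for concavity and convexity the conditions above were necessary, but not sufficient.

Consider a continuously differentiable function $f(x) \in C^1$ and $x \in \mathbb{R}^n$. Then $f$ is

quasiconcave iff $\forall x^1, x^2 \in$ domain of $f$ we have

$$f(x^2) \ge f(x^1) \Rightarrow \nabla f(x^1)\left(x^2 - x^1\right) \ge 0.$$

In words: the function does not change the sign of the slope (more than once).

quasiconvex iff $\forall x^1, x^2 \in$ domain of $f$ we have

$$f(x^2) \ge f(x^1) \Rightarrow \nabla f(x^2)\left(x^2 - x^1\right) \ge 0.$$

In words: the function does not change the sign of the slope (more than once).

For strictness, change the last inequality to a strict one, which rules out flat regions.

Consider a twice continuously differentiable function $f(x) \in C^2$ and $x \in \mathbb{R}^n$. As usual, the Hessian is denoted $H$ and the gradient as $\nabla f$. Define

$$B = \begin{bmatrix} 0_{1\times 1} & \nabla f_{1\times n} \\ \nabla f'_{n\times 1} & H_{n\times n} \end{bmatrix}_{(n+1)\times(n+1)}.$$

Conditions for quasiconcavity and quasiconvexity in the positive orthant, $x \in \mathbb{R}^n_+$, involve the leading principal minors of $B$.

Necessary condition: $f$ is quasiconcave on $\mathbb{R}^n_+$ only if $\forall x \in \mathbb{R}^n_+$ the leading principal minors of $B$ follow this pattern

$$|B_2| \le 0,\ |B_3| \ge 0,\ |B_4| \le 0, \dots$$

Sufficient condition: $f$ is quasiconcave on $\mathbb{R}^n_+$ if (but not only if) $\forall x \in \mathbb{R}^n_+$ the leading principal minors of $B$ follow this pattern

$$|B_2| < 0,\ |B_3| > 0,\ |B_4| < 0, \dots$$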


Definition: a function $f$ is explicitly quasiconcave if $\forall x^1, x^2 \in$ domain of $f$, which is assumed to be a convex set, and $\forall\lambda \in (0, 1)$ we have

$$f(x^2) > f(x^1) \Rightarrow f\left(\lambda x^1 + (1-\lambda)x^2\right) > f(x^1).$$

This rules out flat segments, except at the top of the hill.

1. strictly quasiconcave

$$f(x^2) \ge f(x^1) \Rightarrow f\left(\lambda x^1 + (1-\lambda)x^2\right) > f(x^1).$$

2. explicitly quasiconcave

$$f(x^2) > f(x^1) \Rightarrow f\left(\lambda x^1 + (1-\lambda)x^2\right) > f(x^1).$$

3. quasiconcave

$$f(x^2) \ge f(x^1) \Rightarrow f\left(\lambda x^1 + (1-\lambda)x^2\right) \ge f(x^1).$$

Quasi-concavity is important because it allows arbitrary cardinality in the utility function, while maintaining ordinality. Concavity imposes decreasing marginal utility, which is not necessary for characterization of convex preferences and convex indifference sets. Only when dealing with risk do we need to impose concavity. We do not need concavity for global extrema.

Suppose that x is the solution to

If

2. f is explicitly quasiconcave,

If in addition f is strictly quasiconcave, then this global maximum is unique.

We like apples ($a$) and bananas ($b$), but want to reduce the cost of any $(a, b)$ bundle for a given level of utility $U(a, b)$, with $U_a > 0$ and $U_b > 0$ (or fruit salad, if we want to use a production metaphor).


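Set up the appropriate Lagrangian

$$L = a\,p_a + b\,p_b + \lambda\left[u - U(a, b)\right].$$

Here $\lambda$ is in units of dollars per util: it tells you how much an additional util will cost.

FONC:

$$\frac{\partial L}{\partial\lambda} = u - U(a, b) = 0$$

$$\frac{\partial L}{\partial a} = p_a - \lambda U_a = 0 \Rightarrow p_a/U_a = \lambda > 0$$

$$\frac{\partial L}{\partial b} = p_b - \lambda U_b = 0 \Rightarrow p_b/U_b = \lambda > 0.$$

Thus

$$MRS = \frac{U_a}{U_b} = \frac{p_a}{p_b}.$$

So we have tangency. Let the value of the problem be

$$C^* = a^* p_a + b^* p_b.$$

Take a total differential at the optimum to get

$$dC = p_a\,da + p_b\,db = 0 \Rightarrow \frac{db}{da} = -\frac{p_a}{p_b} < 0.$$

We could also obtain this result from the implicit function theorem, since $C(a, b), U(a, b) \in C^1$ and $|J| \neq 0$. Yet another way to get this is to see that since $U(a, b) = u$, a constant,

$$dU(a, b) = U_a\,da + U_b\,db = 0 \Rightarrow \frac{db}{da} = -\frac{U_a}{U_b} < 0.$$

At this stage all we know is that the isoquant for utility slopes downward, and that it is tangent to the isocost line at the optimum.

SOC: recall

$$\bar H = \begin{bmatrix} 0 & U_a & U_b \\ U_a & -\lambda U_{aa} & -\lambda U_{ab} \\ U_b & -\lambda U_{ab} & -\lambda U_{bb} \end{bmatrix}.$$

We need positive definiteness of $d^2C^*$ for a minimum, so that we need $|\bar H_2| < 0$ and $|\bar H_3| = |\bar H| < 0$.

$$|\bar H_2| = \begin{vmatrix} 0 & U_a \\ U_a & -\lambda U_{aa} \end{vmatrix} = -U_a^2 < 0,$$

so this is good. But

$$\begin{aligned}
|\bar H_3| &= 0 - U_a\left[-\lambda U_a U_{bb} + \lambda U_{ab}U_b\right] + U_b\left[-\lambda U_a U_{ab} + \lambda U_{aa}U_b\right] \\
&= \lambda\left(U_a^2 U_{bb} - U_a U_{ab} U_b - U_b U_a U_{ab} + U_b^2 U_{aa}\right).
\end{aligned}$$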


Without further conditions on $U$, we do not know whether the expression in the parentheses is negative or not ($\lambda > 0$).

The curvature of the utility isoquant is given by

$$\begin{aligned}
\frac{d}{da}\left(\frac{db}{da}\right) = \frac{d^2b}{da^2} &= \frac{d}{da}\left(-\frac{U_a}{U_b}\right) = -\frac{d}{da}\left(\frac{U_a(a, b)}{U_b(a, b)}\right) \\
&= -\frac{\left(U_{aa} + U_{ab}\frac{db}{da}\right)U_b - U_a\left(U_{bb}\frac{db}{da} + U_{ab}\right)}{U_b^2} \\
&= -\frac{\left[U_{aa} - U_{ab}\frac{U_a}{U_b}\right]U_b - U_a\left[-U_{bb}\frac{U_a}{U_b} + U_{ab}\right]}{U_b^2} \\
&= -\frac{U_{aa}U_b - U_a U_{ab} + U_a^2 U_{bb}/U_b - U_a U_{ab}}{U_b^2} \\
&= -\frac{U_{aa}U_b^2 - 2U_a U_{ab}U_b + U_a^2 U_{bb}}{U_b^3} \\
&= -\frac{1}{U_b^3}\left(U_{aa}U_b^2 - 2U_a U_{ab}U_b + U_a^2 U_{bb}\right).
\end{aligned}$$

This involves the same expression in the parentheses. If the indifference curve is convex, then $\frac{d^2b}{da^2} \ge 0$ and thus the expression in the parentheses must be nonpositive. This coincides with the positive semi-definiteness of $d^2C^*$. Thus, convex isoquants and existence of a global minimum in this case come together. This would ensure a global minimum, although not a unique one. If $\frac{d^2b}{da^2} > 0$, then the isoquant is strictly convex and the global minimum is unique, as $d^2C^*$ is positive definite.

If $U$ is strictly quasiconcave, then indeed the isoquant is strictly convex and the global minimum is unique.
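The tangency and second order conditions can be checked on a concrete case. The sketch below assumes $U(a, b) = ab$ and hypothetical prices and promised utility (none of these come from the notes). It verifies $MRS = p_a/p_b$, evaluates the parenthesized expression that determines the sign of $|\bar H_3|$, and compares the curvature formula with the direct second derivative of the isoquant $b = u/a$.

```python
import math

# Hypothetical inputs (assumptions for illustration only).
p_a, p_b, u_bar = 2.0, 8.0, 4.0

# With U(a, b) = a*b the FONCs are p_a = lam*b, p_b = lam*a, a*b = u_bar.
a_star = math.sqrt(u_bar * p_b / p_a)
b_star = math.sqrt(u_bar * p_a / p_b)
lam = p_a / b_star

# Derivatives of U(a, b) = a*b at the optimum.
U_a, U_b = b_star, a_star
U_aa, U_ab, U_bb = 0.0, 1.0, 0.0

mrs = U_a / U_b
paren = U_a**2 * U_bb - 2 * U_a * U_ab * U_b + U_b**2 * U_aa
det_H3 = lam * paren                 # third leading principal minor
curvature = -paren / U_b**3          # d2b/da2 from the formula above
direct_curvature = 2 * u_bar / a_star**3   # second derivative of b = u_bar / a

print(a_star, b_star, lam, mrs, det_H3, curvature)
```

With these numbers the determinant is negative and the isoquant is strictly convex, which is the case in which the cost minimum is unique.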

Expansion paths.

Homotheticity.

Elasticity of substitution.

13.1 One inequality constraint

Let $f(x), g(x) \in C^1$, $x \in \mathbb{R}^n$. The problem is

Choose $x$ to maximize $f(x)$, s.t. $g(x) \le c$.

Write the constraint in a "standard way"

$$g(x) - c \le 0.$$

Suppose that $x^*$ is the solution to this problem and that $x^*$ is not a critical point of $g(x) = c$, if this constraint binds. Write down the Lagrangian function

$$L = f(x) + \lambda[c - g(x)].$$

Then there exists a number $\lambda^*$ such that

$$(1): \quad \frac{\partial L}{\partial x_i} = f_i - \lambda g_i = 0, \quad i = 1, 2, \dots, n$$

$$(2): \quad \lambda[c - g(x)] = 0$$

$$(3): \quad \lambda \ge 0$$

$$(4): \quad g(x) \le c.$$

The standard way: write $g(x) - c \le 0$ and then flip it in the Lagrangian function, $\lambda[c - g(x)]$.

Conditions 2 and 3 are called complementary slackness conditions. If the constraint is not binding, then changing $c$ a bit will not affect the value of the problem; in that case $\lambda = 0$.

The constraint qualifications are that if the constraint binds, i.e. $g(x^*) = c$, then $\nabla g(x^*) \neq 0$.

Conditions 1-4 in the text are written differently, although they are an equivalent representation:

$$(i): \quad \frac{\partial L}{\partial x_i} = f_i - \lambda g_i = 0, \quad i = 1, 2, \dots, n$$

$$(ii): \quad \frac{\partial L}{\partial\lambda} = [c - g(x)] \ge 0$$

$$(iii): \quad \lambda \ge 0$$

$$(iv): \quad \lambda[c - g(x)] = 0.$$

Notice that from (ii) we get $g(x) \le c$. If $g(x) < c$, then $L_\lambda > 0$.
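Complementary slackness is easiest to see by solving the same problem for two values of $c$. The sketch below assumes an illustrative one-variable problem, maximize $-(x-2)^2$ s.t. $x \le c$ (our choice), and applies the usual case split: first try $\lambda = 0$, and if the unconstrained optimum is infeasible, let the constraint bind and recover $\lambda$ from condition (1).

```python
def solve_kuhn_tucker(c):
    """Solve: maximize -(x - 2)^2 s.t. x <= c. Returns (x_star, lambda_star)."""
    # Case 1: constraint slack, lambda = 0, so f'(x) = -2(x - 2) = 0.
    x = 2.0
    if x <= c:
        return x, 0.0
    # Case 2: constraint binds, x = c, and f'(x) - lambda * g'(x) = 0 with g' = 1.
    x = c
    lam = -2.0 * (x - 2.0)
    return x, lam

binding = solve_kuhn_tucker(1.0)   # unconstrained optimum x = 2 is infeasible
slack = solve_kuhn_tucker(3.0)     # unconstrained optimum x = 2 is feasible
print(binding, slack)
```

In the binding case $\lambda^* > 0$ and $c - g(x^*) = 0$; in the slack case $\lambda^* = 0$ and $c - g(x^*) > 0$. Either way the product in condition (2) is zero.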

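There is really nothing special about this problem, but it is worthwhile setting it up, for practice. Let $f(x), g(x) \in C^1$, $x \in \mathbb{R}^n$. The problem is

Choose $x$ to maximize $f(x)$, s.t. $g(x) \le c$, $x \ge 0$.

I rewrite this as

Choose $x$ to maximize $f(x)$, s.t. $g(x) \le c$, $-x \le 0$.

Suppose that $x^*$ is the solution to this problem and that $x^*$ is not a critical point of the constraint set (to be defined below). Write down the Lagrangian function

$$L = f(x) + \lambda[c - g(x)] + \varphi' x.$$

Then there exist numbers $\lambda^*$ and $\varphi^*$ such that

$$(1): \quad \frac{\partial L}{\partial x_i} = f_i - \lambda g_i + \varphi_i = 0, \quad i = 1, 2, \dots, n$$

$$(2): \quad \lambda[c - g(x)] = 0$$

$$(3): \quad \lambda \ge 0$$

$$(4): \quad g(x) \le c$$

$$(5): \quad \varphi_i x_i = 0, \quad i = 1, 2, \dots, n$$

$$(6): \quad \varphi \ge 0$$

$$(7): \quad -x \le 0 \Leftrightarrow x \ge 0.$$

The constraint qualification is that $x^*$ is not a critical point of the constraints that bind. If only $g(x) = c$ binds, then we require $\nabla g(x^*) \neq 0$. See the general case below.

The text gives again a different (and I argue less intuitive) formulation. The Lagrangian is set up without explicitly mentioning the non-negativity constraints,

$$Z = f(x) + \varphi[c - g(x)],$$

and the conditions are

$$(i): \quad \frac{\partial Z}{\partial x_i} = f_i - \varphi g_i \le 0$$

$$(ii): \quad x_i \ge 0$$

$$(iii): \quad x_i\frac{\partial Z}{\partial x_i} = 0, \quad i = 1, 2, \dots, n$$

$$(iv): \quad \frac{\partial Z}{\partial\varphi} = [c - g(x)] \ge 0$$

$$(v): \quad \varphi \ge 0$$

$$(vi): \quad \varphi\frac{\partial Z}{\partial\varphi} = 0.$$

The unequal treatment of different constraints is confusing. My method treats all constraints consistently. A non-negativity constraint is just like any other.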


13.3 The general case

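Let $f(x), g^j(x) \in C^1$, $x \in \mathbb{R}^n$, $j = 1, 2, \dots, m$. The problem is

Choose $x$ to maximize $f(x)$, s.t. $g^j(x) \le c_j$, $j = 1, 2, \dots, m$.

Write down the Lagrangian function

$$L = f(x) + \sum_{j=1}^{m}\lambda_j\left[c_j - g^j(x)\right].$$

Suppose that $x^*$ is the solution to the problem above and that $x^*$ does not violate the constraint qualifications (see below). Then there exist $m$ numbers $\lambda_j^*$, $j = 1, 2, \dots, m$, such that

$$(1): \quad \frac{\partial L}{\partial x_i} = f_i - \sum_{j=1}^{m}\lambda_j g^j_i(x) = 0, \quad i = 1, 2, \dots, n$$

$$(2): \quad \lambda_j\left[c_j - g^j(x)\right] = 0$$

$$(3): \quad \lambda_j \ge 0$$

$$(4): \quad g^j(x) \le c_j, \quad j = 1, 2, \dots, m.$$

The constraint qualifications are as follows. Consider all the binding constraints. Count them by $j_b = 1, 2, \dots, m_b$. Then we must have that the rank of

$$\left[\frac{\partial g^B(x^*)}{\partial x'}\right]_{m_b\times n} = \begin{bmatrix} \partial g^1(x^*)/\partial x' \\ \partial g^2(x^*)/\partial x' \\ \vdots \\ \partial g^{m_b}(x^*)/\partial x' \end{bmatrix}$$

is $m_b$, as large as possible.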

13.4 Minimization

It is worthwhile to consider minimization separately, although minimization of $f$ is just like maximization of $-f$. We compare to maximization.

Let $f(x), g(x) \in C^1$, $x \in \mathbb{R}^n$. The problem is

Choose $x$ to maximize $f(x)$, s.t. $g(x) \le c$.

Rewrite as

Choose $x$ to maximize $f(x)$, s.t. $g(x) - c \le 0$.

Write down the Lagrangian function

$$L = f(x) + \lambda[c - g(x)].$$

FONCs

$$(1): \quad \frac{\partial L}{\partial x_i} = f_i - \lambda g_i = 0, \quad i = 1, 2, \dots, n$$

$$(2): \quad \lambda[c - g(x)] = 0$$

$$(3): \quad \lambda \ge 0$$

$$(4): \quad g(x) \le c.$$

Compare this to

Choose $x$ to minimize $h(x)$, s.t. $g(x) \ge c$.

Rewrite as

Choose $x$ to minimize $h(x)$, s.t. $g(x) - c \ge 0$.

Write down the Lagrangian function

$$L = h(x) + \lambda[c - g(x)].$$

FONCs

$$(1): \quad \frac{\partial L}{\partial x_i} = h_i - \lambda g_i = 0, \quad i = 1, 2, \dots, n$$

$$(2): \quad \lambda[c - g(x)] = 0$$

$$(3): \quad \lambda \ge 0$$

$$(4): \quad g(x) \ge c.$$

Everything is the same. Just pay attention to setting up the inequality correctly. This is equivalent to the following maximization problem:

Choose $x$ to maximize $-h(x)$, s.t. $g(x) \ge c$.

Rewrite as

Choose $x$ to maximize $-h(x)$, s.t. $c - g(x) \le 0$.

$$L = -h(x) + \lambda[g(x) - c].$$


13.5 Example

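Choose $\{x, y\}$ to maximize $\min\{ax, by\}$, s.t. $xp_x + yp_y \le I$,

where $a, b, p_x, p_y > 0$. Convert this to the following problem:

Choose $\{x, y\}$ to maximize $ax$, s.t. $ax \le by$, $xp_x + yp_y - I \le 0$.

This is equivalent, because given a level of $y$, we will never choose $ax > by$, nor can the objective exceed $by$ by construction.

Set up the Lagrangian

$$L = ax + \lambda\left[I - xp_x - yp_y\right] + \varphi\left[by - ax\right].$$

FONC:

$$L_x = a - \lambda p_x - a\varphi = 0$$

$$L_y = -\lambda p_y + b\varphi = 0$$

$$\lambda\left[I - xp_x - yp_y\right] = 0$$

$$\lambda \ge 0$$

$$xp_x + yp_y \le I$$

$$\varphi\left[by - ax\right] = 0$$

$$\varphi \ge 0$$

$$ax \le by.$$

The solution process is a trial and error process. The best way is to start by checking which constraints do not bind.

1. Suppose $\varphi = 0$. Then from $L_y = 0$ we get $\lambda p_y = 0 \Rightarrow \lambda = 0$, and then from $L_x = 0$ we get $a = 0$, a contradiction. Therefore $\varphi > 0$ must hold. Then $ax = by \Rightarrow y = ax/b$.

2. Suppose $\lambda = 0$. Then $b\varphi = 0 \Rightarrow \varphi = 0$, a contradiction (even without $\varphi > 0$, we would reach another contradiction: $a = 0$). Therefore $\lambda > 0$. Then $xp_x + yp_y = I \Rightarrow xp_x + axp_y/b = I \Rightarrow x(p_x + ap_y/b) = I \Rightarrow x^* = I/(p_x + ap_y/b)$, $y^* = aI/(bp_x + ap_y)$.

Solving for the multipliers (which is an integral part of the solution) involves solving

$$\lambda p_x + a\varphi = a$$

$$\lambda p_y - b\varphi = 0.$$

This can be written in matrix notation

$$\begin{bmatrix} p_x & a \\ p_y & -b \end{bmatrix}\begin{bmatrix} \lambda \\ \varphi \end{bmatrix} = \begin{bmatrix} a \\ 0 \end{bmatrix}.$$

The determinant of the matrix is

$$\begin{vmatrix} p_x & a \\ p_y & -b \end{vmatrix} = -bp_x - ap_y < 0,$$

so that by Cramer's rule

$$\lambda^* = \frac{\begin{vmatrix} a & a \\ 0 & -b \end{vmatrix}}{-bp_x - ap_y} = \frac{ab}{bp_x + ap_y} > 0$$

$$\varphi^* = \frac{\begin{vmatrix} p_x & a \\ p_y & 0 \end{vmatrix}}{-bp_x - ap_y} = \frac{ap_y}{bp_x + ap_y} > 0.$$

Finally, we check the constraint qualifications. Since both constraints bind ($\lambda^* > 0$, $\varphi^* > 0$), we must have a rank of two for the matrix

$$\frac{\partial\begin{bmatrix} xp_x + yp_y \\ ax - by \end{bmatrix}}{\partial[x \ \ y]} = \begin{bmatrix} p_x & p_y \\ a & -b \end{bmatrix}.$$

In this case we can verify that the rank is two by the determinant, since this is a square $2\times 2$ matrix:

$$\begin{vmatrix} p_x & p_y \\ a & -b \end{vmatrix} = -bp_x - ap_y < 0.$$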
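The closed-form solution and multipliers of this example can be verified with exact arithmetic. The sketch below plugs hypothetical parameter values (our assumption) into the formulas for $x^*$, $y^*$, $\lambda^*$ and $\varphi^*$, with the multipliers computed by Cramer's rule, and then checks that the first-order conditions and both binding constraints hold.

```python
from fractions import Fraction as F

# Hypothetical parameters (assumptions for illustration only).
a, b, p_x, p_y, income = F(2), F(1), F(1), F(2), F(10)

# Demands when both constraints bind: ax = by and the budget holds with equality.
x = income / (p_x + a * p_y / b)
y = a * x / b

# Multipliers by Cramer's rule on [[p_x, a], [p_y, -b]] [lam, phi]' = [a, 0]'.
det = p_x * (-b) - a * p_y
lam = (a * (-b) - a * 0) / det
phi = (p_x * 0 - a * p_y) / det

# First-order conditions of L = ax + lam*(I - x p_x - y p_y) + phi*(by - ax).
L_x = a - lam * p_x - a * phi
L_y = -lam * p_y + b * phi

print(x, y, lam, phi, L_x, L_y)
```

Using `Fraction` avoids rounding, so the residuals of the first-order conditions are exactly zero rather than merely small.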

13.6 Example

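Choose $\{x, y\}$ to maximize $x^2 + x + 4y^2$, s.t. $2x + 2y \le 1$; $x, y \ge 0$.

Rewrite as

Choose $\{x, y\}$ to maximize $x^2 + x + 4y^2$, s.t. $2x + 2y - 1 \le 0$, $-x \le 0$, $-y \le 0$.

Consider the Jacobian of the constraints

$$\frac{\partial\begin{bmatrix} 2x + 2y \\ -x \\ -y \end{bmatrix}}{\partial[x \ \ y]} = \begin{bmatrix} 2 & 2 \\ -1 & 0 \\ 0 & -1 \end{bmatrix}.$$

This has rank 2 $\forall (x, y) \in \mathbb{R}^2$, so the constraint qualifications are never violated. The constraint set is a triangle and all the constraints are linear, so the constraint qualification will not fail.

Set up the Lagrangian function

$$L = x^2 + x + 4y^2 + \lambda\left[1 - 2x - 2y\right] + \varphi x + \mu y.$$

FONCs

$$L_x = 2x + 1 - 2\lambda + \varphi = 0$$

$$L_y = 8y - 2\lambda + \mu = 0$$

$$\lambda\left[1 - 2x - 2y\right] = 0, \quad \lambda \ge 0, \quad 2x + 2y \le 1$$

$$\varphi x = 0, \quad \varphi \ge 0, \quad x \ge 0$$

$$\mu y = 0, \quad \mu \ge 0, \quad y \ge 0.$$

1. From $L_x = 0$ we have

$$2x + 1 + \varphi = 2\lambda > 0$$

with strict inequality, because $x \ge 0$ and $\varphi \ge 0$. Thus $\lambda > 0$ and the constraint

$$2x + 2y = 1$$

binds, so that

$$y = 1/2 - x \quad \text{or} \quad x = 1/2 - y.$$

2. Suppose $\varphi > 0$. Then $x = 0$ and thus $y = 1/2$. A candidate solution is $(x^*, y^*) = (0, 1/2)$.

3. Suppose $\varphi = 0$. Then from $L_x = 0$ we have

$$2x + 1 = 2\lambda.$$

From $L_y = 0$ we have

$$8y + \mu = 2\lambda.$$

Combining the two,

$$2x + 1 = 8y + \mu$$

$$2(1/2 - y) + 1 = 8y + \mu$$

$$2 - 2y = 8y + \mu$$

$$10y + \mu = 2.$$

The last result tells us that we cannot have both $\mu = 0$ and $y = 0$, because we would get $0 = 2$, a contradiction (also because then we would get $\lambda = 0$ from $L_y = 0$). So either $\mu = 0$ or $y = 0$ but not both.

If $y = 0$, then $\mu = 2$ and $x = 1/2$. A candidate solution is $(x^*, y^*) = (1/2, 0)$.

If $\mu = 0$, then $y = 0.2$ and $x = 0.3$. A candidate solution is $(x^*, y^*) = (0.3, 0.2)$.

Eventually, we need to evaluate the objective function with each candidate to see which is the global maximizer.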
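The final comparison step can be sketched as follows. The candidate list contains $(0, 1/2)$ and $(0.3, 0.2)$ from the cases above, plus $(1/2, 0)$, which we obtain from the case $y = 0$ with the constraint $2x + 2y = 1$ binding (our own derivation from the same first-order conditions). The code checks feasibility and evaluates the objective at each candidate.

```python
def objective(x, y):
    """Objective of the example: x^2 + x + 4y^2."""
    return x**2 + x + 4 * y**2

def feasible(x, y, tol=1e-12):
    """Check 2x + 2y <= 1 and non-negativity, allowing for rounding error."""
    return x >= -tol and y >= -tol and 2 * x + 2 * y <= 1 + tol

# Candidates that satisfy the first-order conditions.
candidates = [(0.0, 0.5), (0.5, 0.0), (0.3, 0.2)]

values = {c: objective(*c) for c in candidates if feasible(*c)}
best = max(values, key=values.get)
print(values)
print(best, values[best])
```

The objective is convex, so the interior-of-the-edge candidate $(0.3, 0.2)$ is the worst of the three, and the maximum sits at a corner of the triangle.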


13.7 The Kuhn-Tucker su¢ ciency theorem

Let $f(x), g^j(x) \in C^1$, $j = 1, 2, \dots, m$. The problem is

Choose $x \in \mathbb{R}^n$ to maximize $f(x)$, s.t. $x \ge 0$ and $g^j(x) \le c_j$, $j = 1, 2, \dots, m$.

Theorem: if

1. $f$ is concave on $\mathbb{R}^n$,

2. $g^j$ are convex on $\mathbb{R}^n$,

3. $x^*$ satisfies the FONCs of the Lagrangian,

then $x^*$ is a global maximum, not necessarily unique.

We know: convex $g^j(x) \le c_j$ gives a convex set. One can show that the intersection of convex sets is also a convex set, so that the constraint set is also convex. So the theorem actually says that trying to maximize a concave function on a convex set gives a global maximum, if it exists. It could exist on the border or not; the FONCs will detect it.

But these are strong conditions on our objective and constraint functions. The next theorem relaxes things quite a bit.

Let $f(x), g^j(x) \in C^1$, $j = 1, 2, \dots, m$. The problem is

Choose $x \in \mathbb{R}^n$ to maximize $f(x)$, s.t. $x \ge 0$ and $g^j(x) \le c_j$, $j = 1, 2, \dots, m$.

Theorem: if

(b) $\exists i$ such that $f_i(x^*) > 0$ and $x_i^* > 0$ (i.e. it does not violate the constraints).

(c) $\nabla f(x^*) \neq 0$ and $f \in C^2$ around $x^*$.

(d) $f(x)$ is concave.

If

(b) $\partial g(x)/\partial x' \neq 0$ $\forall x$ in the constraint set.

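Recall the envelope theorem for unconstrained optimization: if $x^*$ is a critical point of $f(x;\theta)$, then

$$\frac{df(x^*;\theta)}{d\theta} = \frac{\partial f(x^*;\theta)}{\partial\theta}.$$

This was due to $\frac{\partial f(x^*;\theta)}{\partial x} = 0$.

Now we face a more complicated problem:

Choose $x$ to maximize $f(x;\theta)$, s.t. $g(x;\theta) = c$.

For a problem with inequality constraints we simply use only those constraints that bind. We will consider small perturbations of $\theta$, so small that they will not affect which constraint binds. Set up the Lagrangian function

$$L = f(x;\theta) + \lambda\left[c - g(x;\theta)\right].$$

FONCs

$$L_\lambda = c - g(x;\theta) = 0$$

$$L_{x_i} = f_i(x;\theta) - \lambda g_i(x;\theta) = 0, \quad i = 1, 2, \dots, n.$$

We apply the implicit function theorem to this set of equations to get $\exists\, x^* = x^*(\theta)$ and $\lambda^* = \lambda^*(\theta)$ for which there are well defined derivatives around $(\lambda^*, x^*)$. We know that at the optimum we have that the value of the problem is the value of the Lagrangian function

$$\begin{aligned}
f(x^*;\theta) = L^* &= f(x^*;\theta) + \lambda^*\left[c - g(x^*;\theta)\right] \\
&= f(x^*(\theta);\theta) + \lambda^*(\theta)\left[c - g(x^*(\theta);\theta)\right].
\end{aligned}$$

Define the value of the problem as

$$v(\theta) = f(x^*;\theta) = f(x^*(\theta);\theta).$$

Take the derivative with respect to $\theta$ to get

$$\begin{aligned}
\frac{dv(\theta)}{d\theta} = \frac{dL^*}{d\theta} &= f_x\frac{dx^*}{d\theta} + f_\theta + \frac{d\lambda^*}{d\theta}\left[c - g(x^*(\theta);\theta)\right] - \lambda^*\left[g_x\frac{dx^*}{d\theta} + g_\theta\right] \\
&= \left[f_x - \lambda^* g_x\right]\frac{dx^*}{d\theta} + \frac{d\lambda^*}{d\theta}\left[c - g(x^*(\theta);\theta)\right] + f_\theta - \lambda^* g_\theta \\
&= f_\theta - \lambda^* g_\theta.
\end{aligned}$$

Of course, we could have just applied this directly using the envelope theorem:

$$\frac{dv(\theta)}{d\theta} = \frac{dL^*}{d\theta} = \frac{\partial L^*}{\partial\theta} = f_\theta - \lambda^* g_\theta.$$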
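A numerical check of the constrained envelope result. The sketch assumes an illustrative problem of our own: maximize $xy$ s.t. $x + \theta y = c$, whose solution is $x^* = c/2$, $y^* = c/(2\theta)$, $\lambda^* = y^*$. Since $f$ does not depend on $\theta$ directly, the envelope formula reduces to $dv/d\theta = -\lambda^* g_\theta = -\lambda^* y^*$.

```python
c = 4.0  # hypothetical constraint level

def solution(theta):
    """Solve: maximize x*y s.t. x + theta*y = c (illustrative problem)."""
    x = c / 2.0
    y = c / (2.0 * theta)
    lam = y   # from the FONC f_x - lam * g_x = y - lam = 0
    return x, y, lam

def value(theta):
    """Value function v(theta) = f(x*(theta), y*(theta))."""
    x, y, _ = solution(theta)
    return x * y

theta0, h = 2.0, 1e-6
# Total derivative of the value function by finite differences.
dv_numeric = (value(theta0 + h) - value(theta0 - h)) / (2 * h)

# Envelope formula: f_theta - lam * g_theta, with f_theta = 0 and g_theta = y.
x0, y0, lam0 = solution(theta0)
dv_envelope = 0.0 - lam0 * y0
print(dv_numeric, dv_envelope)
```

The two numbers should coincide up to finite-difference error, even though $x^*$, $y^*$ and $\lambda^*$ all move with $\theta$.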

13.10 Duality

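We will demonstrate the duality of utility maximization and cost minimization. But the principles here are more general than consumer theory.

The utility maximization problem is

Choose $x \in \mathbb{R}^n$ to maximize $u(x)$ s.t. $p'x = I$

(this should be stated with $\le$ but we focus on preferences with nonsatiation and strict convexity so the solution lies on the budget line and $x > 0$ is satisfied). The Lagrangian function is

$$L = u(x) + \lambda\left[I - p'x\right].$$

FONCs:

$$L_\lambda = I - p'x = 0$$

$$L_{x_i} = u_i - \lambda p_i = 0, \quad i = 1, 2, \dots, n.$$

We apply the implicit function theorem to this set of equations to get Marshallian demand

$$x_i^m = x_i^m(p, I)$$

$$\lambda^m = \lambda^m(p, I)$$

for which there are well defined derivatives around $(\lambda^*, x^*)$. We define the indirect utility function

$$v(p, I) = u\left(x^m(p, I)\right).$$

The dual problem is

Choose $x \in \mathbb{R}^n$ to minimize $p'x$ s.t. $u(x) = \bar u$,

where $\bar u$ is a level of promised utility. The Lagrangian function is

$$Z = p'x + \varphi\left[\bar u - u(x)\right].$$

FONCs:

$$Z_\varphi = \bar u - u(x) = 0$$

$$Z_{x_i} = p_i - \varphi u_i = 0, \quad i = 1, 2, \dots, n.$$

We apply the implicit function theorem to this set of equations to get Hicksian demand

$$x_i^h = x_i^h(p, \bar u)$$

$$\varphi^h = \varphi^h(p, \bar u)$$

for which there are well defined derivatives around $(\varphi^*, x^*)$. We define the expenditure function

$$e(p, \bar u) = p'x^h(p, \bar u).$$

The FONCs of both problems imply

$$\frac{u_i}{u_j} = \frac{p_i}{p_j}.$$

Thus, at the optimum

$$x_i^m(p, I) = x_i^h(p, \bar u)$$

$$e(p, \bar u) = I$$

$$v(p, I) = \bar u.$$

Moreover,

$$\varphi = \frac{1}{\lambda}.$$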

Duality relies on unique global extrema. We need to have all the preconditions for that.

Make drawing.

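$$v(p, I) = u(x^m) + \lambda\left(I - p'x^m\right).$$

Taking the derivative with respect to a price,

$$\begin{aligned}
\frac{\partial v}{\partial p_i} &= \sum_{j=1}^{n} u_j\frac{\partial x_j^m}{\partial p_i} + \frac{\partial\lambda}{\partial p_i}\left(I - p'x^m\right) - \lambda\left[\sum_{j=1}^{n} p_j\frac{\partial x_j^m}{\partial p_i} + x_i^m\right] \\
&= \sum_{j=1}^{n}\left(u_j - \lambda p_j\right)\frac{\partial x_j^m}{\partial p_i} + \frac{\partial\lambda}{\partial p_i}\left(I - p'x^m\right) - \lambda x_i^m \\
&= -\lambda x_i^m.
\end{aligned}$$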


An increase in $p_i$ lowers the value of the problem as if income had decreased by $x_i^m$, valued at $\lambda$ utils per dollar of lost income. In other words, income is now worth $x_i^m$ less, and this taxes the objective by $\lambda x_i^m$. Taking the derivative with respect to income,

$$\begin{aligned}
\frac{\partial v}{\partial I} &= \sum_{j=1}^{n} u_j\frac{\partial x_j^m}{\partial I} + \frac{\partial\lambda}{\partial I}\left(I - p'x^m\right) + \lambda\left[1 - \sum_{j=1}^{n} p_j\frac{\partial x_j^m}{\partial I}\right] \\
&= \sum_{j=1}^{n}\left(u_j - \lambda p_j\right)\frac{\partial x_j^m}{\partial I} + \frac{\partial\lambda}{\partial I}\left(I - p'x^m\right) + \lambda \\
&= \lambda.
\end{aligned}$$

An increase in income will increase our utility by $\lambda$, which is the standard result.

In fact, we could get these results applying the envelope theorem directly:

$$\frac{\partial v}{\partial p_i} = -\lambda x_i^m$$

$$\frac{\partial v}{\partial I} = \lambda.$$

Dividing one by the other gives Roy's identity:

$$-\frac{\partial v/\partial p_i}{\partial v/\partial I} = x_i^m.$$

Why is this interesting? Because this is the amount of income needed to compensate consumers for (that will leave them indifferent to) an increase in the price of some good $x_i$. To see this, first consider

$$v(p, I) = \bar u,$$

where $\bar u$ is a level of promised utility (as in the dual problem). By the implicit function theorem $\exists\, I = I(p_i)$ in a neighbourhood of $x^m$, which has a well defined derivative $dI/dp_i$. This function is defined at the optimal bundle $x^m$. Now consider the total differential of $v$, evaluated at the optimal bundle $x^m$:

$$v_{p_i}dp_i + v_I dI = 0.$$

This differential does not involve other partial derivatives because it is evaluated at the optimal bundle $x^m$ (i.e. the envelope theorem once again). And we set this differential to zero, because we are considering keeping the consumer exactly indifferent, i.e. her promised utility and optimal bundle remain unchanged. Then we have

$$\frac{dI}{dp_i} = -\frac{v_{p_i}}{v_I} = -\frac{\partial v/\partial p_i}{\partial v/\partial I} = x_i^m.$$

This result tells you that if you are to keep the consumer indifferent to a small change in the price of good $i$, i.e. not changing the optimally chosen bundle, then you must compensate the consumer by $x_i^m$ units of income. We will see this again in the dual problem, using Shephard's lemma, where keeping utility fixed is explicit. We will see that $\frac{\partial e}{\partial p_i} = x_i^h = x_i^m$ is exactly the change in expenditure that results from keeping utility fixed.

To see this graphically, consider a level curve of utility. The slope of the curve at $(p, I)$ (more generally, the gradient) is $x^m$.
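Roy's identity can be checked numerically on a standard functional form. The sketch below assumes Cobb-Douglas utility $u = x_1^{\alpha}x_2^{1-\alpha}$ with hypothetical parameter values (our assumption), for which Marshallian demand is $x_1^m = \alpha I/p_1$. It differentiates the indirect utility function by finite differences and compares $-v_{p_1}/v_I$ with $x_1^m$.

```python
alpha = 0.3   # hypothetical Cobb-Douglas share parameter

def marshallian(p1, p2, income):
    """Marshallian demands for u = x1^alpha * x2^(1 - alpha)."""
    return alpha * income / p1, (1 - alpha) * income / p2

def indirect_utility(p1, p2, income):
    """v(p, I): utility evaluated at the Marshallian bundle."""
    x1, x2 = marshallian(p1, p2, income)
    return x1**alpha * x2**(1 - alpha)

p1, p2, income, h = 2.0, 5.0, 100.0, 1e-5

# Central finite differences for the two partial derivatives of v.
dv_dp1 = (indirect_utility(p1 + h, p2, income)
          - indirect_utility(p1 - h, p2, income)) / (2 * h)
dv_dI = (indirect_utility(p1, p2, income + h)
         - indirect_utility(p1, p2, income - h)) / (2 * h)

roy_demand = -dv_dp1 / dv_dI
x1_m, _ = marshallian(p1, p2, income)
print(roy_demand, x1_m)
```

The ratio recovers the demand for good 1 without solving the consumer's problem again, which is the practical content of the identity.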

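$$e(p, \bar u) = p'x^h + \varphi\left[\bar u - u\left(x^h\right)\right].$$

Taking the derivative with respect to a price,

$$\begin{aligned}
\frac{\partial e}{\partial p_i} &= x_i^h + \sum_{j=1}^{n} p_j\frac{\partial x_j^h}{\partial p_i} + \frac{\partial\varphi}{\partial p_i}\left[\bar u - u\left(x^h\right)\right] - \varphi\sum_{j=1}^{n} u_j\frac{\partial x_j^h}{\partial p_i} \\
&= \sum_{j=1}^{n}\left(p_j - \varphi u_j\right)\frac{\partial x_j^h}{\partial p_i} + \frac{\partial\varphi}{\partial p_i}\left[\bar u - u\left(x^h\right)\right] + x_i^h \\
&= x_i^h.
\end{aligned}$$

An increase in $p_i$ will increase cost by $x_i^h$ while keeping utility fixed at $\bar u$ (remember that this is a minimization problem so increasing the value of the problem is "bad"). Note that this is exactly the result of Roy's Identity. Taking the derivative with respect to promised utility,

$$\begin{aligned}
\frac{\partial e}{\partial\bar u} &= \sum_{j=1}^{n} p_j\frac{\partial x_j^h}{\partial\bar u} + \frac{\partial\varphi}{\partial\bar u}\left[\bar u - u\left(x^h\right)\right] + \varphi\left[1 - \sum_{j=1}^{n} u_j\frac{\partial x_j^h}{\partial\bar u}\right] \\
&= \sum_{j=1}^{n}\left(p_j - \varphi u_j\right)\frac{\partial x_j^h}{\partial\bar u} + \frac{\partial\varphi}{\partial\bar u}\left[\bar u - u\left(x^h\right)\right] + \varphi \\
&= \varphi.
\end{aligned}$$

An increase in promised utility will increase expenditure by $\varphi$, which is the standard result.

In fact, we could get these results applying the envelope theorem directly:

$$\frac{\partial e}{\partial p_i} = x_i^h$$

$$\frac{\partial e}{\partial\bar u} = \varphi.$$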

14 Integration

14.1 Preliminaries

Consider a general function

$$x = x(t)$$

and its derivative with respect to time,

$$\dot x \equiv \frac{dx}{dt}.$$

This is how much $x$ changes during a very short period $dt$. Suppose that you know $\dot x$ at any point in time. We can write down how much $x$ changed from some initial point in time, say $t = 0$, until period $t$ as follows:

$$\int_0^t \dot x\,dt.$$

This is the sum of all changes in $x$ from period 0 to $t$. The term of art is integration, i.e. we are integrating all the increments. But you cannot say what $x(t)$ is, unless you have the value of $x$ at the initial point. This is the same as saying that you know what the growth rate of GDP is, but you do not know the level. But given $x_0 = x(0)$ we can tell what $x(t)$ is:

$$x(t) = x_0 + \int_0^t \dot x\,dt.$$

E.g.

$$\dot x = t^2$$

$$\int_0^t \dot x\,dt = \int_0^t u^2\,du = \frac{1}{3}t^3 + c.$$

The constant $c$ is arbitrary and captures the fact that we do not know the level.

Suppose that the instant growth rate of $y$ is a constant $r$, i.e.

$$\frac{\dot y}{y} = r.$$

This can be written as

$$\dot y - ry = 0.$$

The function $y = e^{rt}$ satisfies $\dot y/y = r$. But so does $y = ke^{rt}$. So once again, without having additional information about the value of $y$ at some initial point, we cannot say what $y(t)$ is.


14.2 Inde…nite integrals

Denote

$$f(x) = \frac{dF(x)}{dx}.$$

Therefore,

$$dF(x) = f(x)\,dx.$$

Summing over all small increments we get

$$\int dF(x) = \int f(x)\,dx = F(x) + c,$$

where the constant of integration, $c$, denotes that the integral is correct up to an indeterminate constant. This is so because knowing the sum of increments does not tell you the level. Another way to see this is

$$\frac{d}{dx}F(x) = \frac{d}{dx}\left[F(x) + c\right].$$

Integration is the opposite operation of differentiation. Instead of looking at small perturbations, or increments, we look for the sum of all increments.

Commonly used integrals

1. $\int x^n dx = \frac{x^{n+1}}{n+1} + c$

2. $\int f'(x)e^{f(x)}dx = e^{f(x)} + c$ ; $\int e^x dx = e^x + c$

3. $\int \frac{f'(x)}{f(x)}dx = \ln[f(x)] + c$ ; $\int \frac{1}{x}dx = \int \frac{dx}{x} = \ln x + c$

Operation rules

1. Sum: $\int [f(x) + g(x)]\,dx = \int f(x)\,dx + \int g(x)\,dx$

2. Scalar multiplication: $k\int f(x)\,dx = \int kf(x)\,dx$

3. Substitution:

$$\int f(u)\frac{du}{dx}dx = \int f(u)u'\,dx = \int f(u)\,du = F(u) + c.$$

E.g.

$$\int 2x\left(x^2 + 1\right)dx = 2\int\left(x^3 + x\right)dx = 2\int x^3 dx + 2\int x\,dx = \frac{1}{2}x^4 + x^2 + c.$$

Alternatively, define $u = x^2 + 1$, hence $u' = 2x$, and so

$$\begin{aligned}
\int 2x\left(x^2 + 1\right)dx &= \int \frac{du}{dx}u\,dx = \int u\,du = \frac{1}{2}u^2 + c' \\
&= \frac{1}{2}\left(x^2 + 1\right)^2 + c' = \frac{1}{2}\left(x^4 + 2x^2 + 1\right) + c' \\
&= \frac{1}{2}x^4 + x^2 + \frac{1}{2} + c' = \frac{1}{2}x^4 + x^2 + c.
\end{aligned}$$
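The substitution example can be sanity-checked by comparing the antiderivative $F(x) = \frac{1}{2}x^4 + x^2$ with a numerical integral over a finite interval. The sketch below uses a simple midpoint rule. The interval $[0, 1]$ and the number of subintervals are arbitrary choices of ours.

```python
def integrand(x):
    """The integrand 2x(x^2 + 1) from the example."""
    return 2 * x * (x**2 + 1)

def antiderivative(x):
    """F(x) = x^4/2 + x^2, with the constant of integration set to zero."""
    return 0.5 * x**4 + x**2

def midpoint_integral(func, lo, hi, n=10000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    step = (hi - lo) / n
    return sum(func(lo + (k + 0.5) * step) for k in range(n)) * step

numeric = midpoint_integral(integrand, 0.0, 1.0)
exact = antiderivative(1.0) - antiderivative(0.0)
print(numeric, exact)
```

The constant of integration drops out of the difference $F(1) - F(0)$, which is why either $c$ or $c'$ gives the same answer.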


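4. Integration by parts: Since

$$d(uv) = u\,dv + v\,du$$

we have

$$\int d(uv) = uv = \int u\,dv + \int v\,du.$$

Thus the integration by parts formula is

$$\int u\,dv = uv - \int v\,du.$$

To reduce confusion denote

$$\int U(x)\,dV(x) = U(x)V(x) - \int V(x)\,dU(x)$$

$$\int U(x)v(x)\,dx = U(x)V(x) - \int V(x)u(x)\,dx.$$

E.g., let $f(x) = \varphi e^{-\varphi x}$. Then

$$\int x\varphi e^{-\varphi x}dx = -xe^{-\varphi x} - \int -e^{-\varphi x}dx,$$

where $U = x$, $v = \varphi e^{-\varphi x}$, $V = -e^{-\varphi x}$ and $u = 1$.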

The area under the $f$ curve, i.e. between the $f$ curve and the horizontal axis, from $a$ to $b$ is

$$\int_a^b f(x)\,dx = F(b) - F(a).$$

The Riemann Integral: create $n$ rectangles that lie under the curve, that take the minimum of the heights: $r_i$, $i = 1, 2, \dots, n$. Then create $n$ rectangles with height the maximum of the heights: $R_i$, $i = 1, 2, \dots, n$. As the number of these rectangles increases, the sums of the rectangles may converge. If they do, then we say that $f$ is Riemann-integrable. I.e. if

$$\lim_{n\to\infty}\sum_{i=1}^{n} r_i = \lim_{n\to\infty}\sum_{i=1}^{n} R_i$$

then

$$\int_a^b f(x)\,dx$$

is well defined.

Properties:

1. Minus/switching the integration limits: $\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx = F(b) - F(a) = -[F(a) - F(b)]$

2. Zero: $\int_a^a f(x)\,dx = F(a) - F(a) = 0$

3. Partition: for $a < b < c$,

$$\int_a^c f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx.$$

4. Scalar multiplication: $\int_a^b kf(x)\,dx = k\int_a^b f(x)\,dx$ ; $\forall k \in \mathbb{R}$

5. Sum: $\int_a^b [f(x) + g(x)]\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$

6. By parts: $\int_a^b Uv\,dx = UV\big|_a^b - \int_a^b uV\,dx = U(b)V(b) - U(a)V(a) - \int_a^b uV\,dx$

Suppose that we wish to integrate a function from some initial point $x_0$ until some indefinite point $x$. Then

$$\int_{x_0}^{x} f(t)\,dt = F(x) - F(x_0)$$

and so

$$F(x) = F(x_0) + \int_{x_0}^{x} f(t)\,dt.$$

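Leibniz's Rule

Let $f \in C^1$ (i.e. $F \in C^2$). Then

$$\frac{\partial}{\partial\theta}\int_{a(\theta)}^{b(\theta)} f(x,\theta)\,dx = f(b(\theta),\theta)\frac{\partial b(\theta)}{\partial\theta} - f(a(\theta),\theta)\frac{\partial a(\theta)}{\partial\theta} + \int_{a(\theta)}^{b(\theta)}\frac{\partial}{\partial\theta}f(x,\theta)\,dx.$$

Proof: let $f(x,\theta) = dF(x,\theta)/dx$. Then

$$\begin{aligned}
\frac{\partial}{\partial\theta}\int_{a(\theta)}^{b(\theta)} f(x,\theta)\,dx &= \frac{\partial}{\partial\theta}\Big[F(x,\theta)\Big|_{a(\theta)}^{b(\theta)} \\
&= \frac{\partial}{\partial\theta}\left[F(b(\theta),\theta) - F(a(\theta),\theta)\right] \\
&= F_x(b(\theta),\theta)\frac{\partial b(\theta)}{\partial\theta} + F_\theta(b(\theta),\theta) - F_x(a(\theta),\theta)\frac{\partial a(\theta)}{\partial\theta} - F_\theta(a(\theta),\theta) \\
&= f(b(\theta),\theta)\frac{\partial b(\theta)}{\partial\theta} - f(a(\theta),\theta)\frac{\partial a(\theta)}{\partial\theta} + \left[F_\theta(b(\theta),\theta) - F_\theta(a(\theta),\theta)\right] \\
&= f(b(\theta),\theta)\frac{\partial b(\theta)}{\partial\theta} - f(a(\theta),\theta)\frac{\partial a(\theta)}{\partial\theta} + \int_{a(\theta)}^{b(\theta)}\frac{d}{dx}F_\theta(x,\theta)\,dx \\
&= f(b(\theta),\theta)\frac{\partial b(\theta)}{\partial\theta} - f(a(\theta),\theta)\frac{\partial a(\theta)}{\partial\theta} + \int_{a(\theta)}^{b(\theta)}\frac{\partial}{\partial\theta}f(x,\theta)\,dx.
\end{aligned}$$

The last line follows by Young's Theorem. Clearly, if the integration limits do not depend on $\theta$, then

$$\frac{\partial}{\partial\theta}\int_a^b f(x,\theta)\,dx = \int_a^b \frac{\partial}{\partial\theta}f(x,\theta)\,dx,$$

and if $f$ does not depend on $\theta$, then

$$\frac{\partial}{\partial\theta}\int_{a(\theta)}^{b(\theta)} f(x)\,dx = f(b(\theta))\frac{\partial b(\theta)}{\partial\theta} - f(a(\theta))\frac{\partial a(\theta)}{\partial\theta}.$$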
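A numerical check of Leibniz's rule, under illustrative assumptions of our own: integrand $f(x,\theta) = \theta x$, lower limit $a(\theta) = 0$ and upper limit $b(\theta) = \theta$, so that the integral equals $\theta^3/2$. The left-hand side is computed by finite differences of the integral, and the right-hand side by the three terms of the rule.

```python
def integrate(func, lo, hi, n=2000):
    """Midpoint-rule approximation of a definite integral."""
    step = (hi - lo) / n
    return sum(func(lo + (k + 0.5) * step) for k in range(n)) * step

def f(x, theta):
    """Illustrative integrand (an assumption)."""
    return theta * x

def a(theta):
    """Lower limit, constant in theta, so its derivative is 0."""
    return 0.0

def b(theta):
    """Upper limit b(theta) = theta, so its derivative is 1."""
    return theta

def integral(theta):
    return integrate(lambda x: f(x, theta), a(theta), b(theta))

theta0, h = 1.2, 1e-5
lhs = (integral(theta0 + h) - integral(theta0 - h)) / (2 * h)

# Right-hand side: boundary terms plus the integral of df/dtheta = x.
rhs = (f(b(theta0), theta0) * 1.0
       - f(a(theta0), theta0) * 0.0
       + integrate(lambda x: x, a(theta0), b(theta0)))
print(lhs, rhs)
```

Both sides should equal $\frac{3}{2}\theta^2$: the boundary term contributes $\theta^2$ and the integral of $\partial f/\partial\theta$ contributes $\theta^2/2$.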

14.4.1 In…nite integration limits

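$$\int_a^\infty f(x)\,dx = \lim_{b\to\infty}\int_a^b f(x)\,dx = F(\infty) - F(a).$$

E.g., $X \sim \exp(\varphi)$: $F(x) = 1 - e^{-\varphi x}$, $f(x) = \varphi e^{-\varphi x}$ for $x \ge 0$.

$$\int_0^\infty \varphi e^{-\varphi x}dx = \lim_{b\to\infty}\int_0^b \varphi e^{-\varphi x}dx = \lim_{b\to\infty}\left[-e^{-\varphi b} + e^{-\varphi 0}\right] = 1.$$

Also

$$\begin{aligned}
E(x) = \int_0^\infty xf(x)\,dx &= \int_0^\infty x\varphi e^{-\varphi x}dx = \left[-xe^{-\varphi x}\right]_0^\infty - \int_0^\infty -e^{-\varphi x}dx \\
&= \text{"}-\infty e^{-\varphi\infty}\text{"} + 0e^{-\varphi 0} + \left[-\frac{1}{\varphi}e^{-\varphi x}\right]_0^\infty = 0 - \frac{1}{\varphi}e^{-\varphi\infty} + \frac{1}{\varphi}e^{-\varphi 0} \\
&= \frac{1}{\varphi}.
\end{aligned}$$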


14.4.2 In…nite integrand

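E.g., sometimes the integral is divergent, even though the integration limits are finite:

$$\int_1^\infty \frac{1}{x}dx = \lim_{b\to\infty}\int_1^b \frac{1}{x}dx = \Big[\ln(x)\Big|_1^\infty = \ln(\infty) - \ln(1) = \infty - 0 = \infty$$

$$\int_0^1 \frac{1}{x}dx = \lim_{b\to 0}\int_b^1 \frac{1}{x}dx = \Big[\ln(x)\Big|_0^1 = \ln(1) - \ln(0) = 0 + \infty = \infty.$$

Suppose that for some point $p \in (a, b)$

$$\lim_{x\to p} f(x) = \infty.$$

Then the integral from $a$ to $b$ is convergent iff the partitions are also convergent:

$$\int_a^b f(x)\,dx = \int_a^p f(x)\,dx + \int_p^b f(x)\,dx.$$

E.g.

$$\lim_{x\to 0}\frac{1}{x^3} = \infty.$$

Therefore, the integral

$$\int_{-1}^{1}\frac{1}{x^3}dx = \int_{-1}^{0}\frac{1}{x^3}dx + \int_0^1 \frac{1}{x^3}dx = \left[-\frac{1}{2x^2}\right]_{-1}^{0} + \left[-\frac{1}{2x^2}\right]_0^1$$

does not exist, because neither of the two partitions converges.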

In discrete time we have the capital accumulation equation

$$K_{t+1} = (1 - \delta)K_t + I_t,$$

where $I_t$ is gross investment at period $t$. Rewrite as

$$K_{t+1} - K_t = I_t - \delta K_t.$$

We want to rewrite this in continuous time. In this context, gross investment, $I_t$, is instantaneous and capital depreciates at an instantaneous rate of $\delta$. Consider a period of length $\Delta$. The accumulation equation is

$$K_{t+\Delta} - K_t = \Delta I_t - \Delta\delta K_t.$$

Divide by $\Delta$ to get

$$\frac{K_{t+\Delta} - K_t}{\Delta} = I_t - \delta K_t.$$

Now take $\Delta \to 0$ to get

$$\dot K_t = I_t - \delta K_t,$$

where it is understood that $I_t$ is instantaneous investment at time $t$, and $K_t$ is the amount of capital available at that time. $\delta K_t$ is the amount of capital that vanishes due to depreciation. Write

$$\dot K_t = I_t^n,$$

where $I_t^n$ is net investment. Given a functional form for $I_t^n$ we can tell how much capital is around at time $t$, given an initial amount at time 0, $K_0$.

Let $I_t^n = t^a$. Then

$$K_t - K_0 = \int_0^t \dot K\,dt = \int_0^t I_u^n\,du = \int_0^t u^a\,du = \left[\frac{u^{a+1}}{a+1}\right]_0^t = \frac{t^{a+1}}{a+1}.$$
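The closed form for the capital stock can be compared with a direct accumulation of net investment over many short periods, which mirrors the discrete-to-continuous argument above. The sketch assumes $I_t^n = t^a$ with hypothetical values of $a$, $K_0$ and the horizon (our choices), and adds up investment over small intervals of length `step`.

```python
def net_investment(t, a=2.0):
    """Net investment path I_t^n = t^a (functional form from the example)."""
    return t**a

def capital(t_end, k0, a=2.0, n=3000):
    """Accumulate K over n short periods, evaluating investment at midpoints."""
    step = t_end / n
    k = k0
    for i in range(n):
        k += net_investment((i + 0.5) * step, a) * step
    return k

# Hypothetical values (assumptions for illustration only).
k0, t_end, a = 10.0, 3.0, 2.0

k_numeric = capital(t_end, k0, a)
k_closed = k0 + t_end**(a + 1) / (a + 1)
print(k_numeric, k_closed)
```

As the period length shrinks, the accumulated stock approaches $K_0 + t^{a+1}/(a+1)$.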

Domar was interested in answering: what must investment be in order to satisfy the equilibrium condition

at all times.

Structure of the model:

1

Y_ = I_ ;

s

t = Kt ;

therefore

_ = K_ = I :

3. Long run equilibrium is given when potential output is equal to actual output

=Y ;

therefore

_ = Y_ :

1

(i) output demand : Y_ = I_

s

(ii) potential output : _ = I

(iii) equilibrium : _ = Y_ :

84

Use (iii) in (ii) to get

I = Y_

1_

I= I ;

s

which gives

I_

= s:

I

Now integrate in order to …nd the level of investment at any given time:

Z _ Z

I

dt = sdt

I

ln I = st + c

It = e( s)t+c

= e( s)t c

e = I0 e( s)t

:

The larger is productivity, , and the higher the saving rate, s, the more investment is required. This is the

amount of investment needed for output to keep output in check with potential output.

Now suppose that output is not equal to its potential, i.e. $\kappa \neq Y$. This could happen if investment is not growing at the correct rate of $\rho s$. Suppose that investment is growing at rate $a$, i.e.

\[ I_t = I_0 e^{at} . \]

Define the utilization rate

\[ u = \lim_{t \to \infty} \frac{Y_t}{\kappa_t} . \]

Compute what the capital stock is at any moment:

\[ K_t - K_0 = \int_0^t \dot{K}\,d\tau = \int_0^t I_\tau\,d\tau = \int_0^t I_0 e^{a\tau}\,d\tau = \frac{1}{a} I_0 e^{at} \]

(the constant of integration is absorbed in $K_0$). Now compute

\[ u = \lim_{t \to \infty} \frac{Y_t}{\kappa_t} = \lim_{t \to \infty} \frac{\frac{1}{s} I_t}{\rho K_t} = \frac{1}{\rho s} \lim_{t \to \infty} \frac{I_t}{K_t} = \frac{1}{\rho s} \lim_{t \to \infty} \frac{I_0 e^{at}}{\frac{1}{a} I_0 e^{at} + K_0} = \frac{a}{\rho s} \lim_{t \to \infty} \frac{I_0 e^{at}}{I_0 e^{at} + a K_0} = \frac{a}{\rho s} . \]

The last equality follows from using L'Hopital's rule. If $a > \rho s$ then $u > 1$: there is a shortage of capacity, excess demand. If $a < \rho s$ then $u < 1$: there is an excess of capacity, excess supply. Thus, in order to keep output demand equal to output potential we must have $a = \rho s$ and thus $u = 1$.

In fact, this holds at any point in time:

\[ \dot{I} = \frac{d}{dt} I_0 e^{at} = a I_0 e^{at} . \]

Therefore

\[ \dot{Y} = \frac{1}{s}\,\dot{I} = \frac{a}{s} I_0 e^{at} \]

\[ \dot{\kappa} = \rho I = \rho I_0 e^{at} . \]

So

\[ \frac{\dot{Y}}{\dot{\kappa}} = \frac{\frac{a}{s} I_0 e^{at}}{\rho I_0 e^{at}} = \frac{a}{\rho s} = u . \]

If the utilization rate is too high, $u > 1$, then demand growth outstrips supply, $\dot{Y} > \dot{\kappa}$. If the utilization rate is too low, $u < 1$, then demand growth lags behind supply, $\dot{Y} < \dot{\kappa}$.

Thus, the razor edge condition: only $a = \rho s$ keeps us at a sustainable equilibrium path:

If $u > 1$, i.e. $a > \rho s$, there is excess demand, investment is too high. Entrepreneurs will try to invest even more to increase supply, but this implies an even larger gap between the two.

If $u < 1$, i.e. $a < \rho s$, there is excess supply, investment is too low. Entrepreneurs will try to cut investment to lower demand, but this implies an even larger gap between the two.
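The razor edge can be illustrated numerically. The sketch below is only an illustration under assumed parameter values: the names `rho` (productivity of capital), `s`, `I0` and `K0` are labels chosen here, and the capital stock uses the exact integral $K_t = K_0 + I_0 (e^{at} - 1)/a$ rather than absorbing the constant into $K_0$.

```python
import math

def growth_ratio(a, rho, s, t, I0=1.0):
    """Ratio of demand growth to capacity growth, Y'/kappa', at time t."""
    I = I0 * math.exp(a * t)
    dY = a * I / s        # Y' = (1/s) * I'
    dkappa = rho * I      # kappa' = rho * I
    return dY / dkappa

def level_ratio(a, rho, s, t, I0=1.0, K0=10.0):
    """Y_t / kappa_t with Y_t = I_t / s and K_t = K0 + I0*(exp(a*t) - 1)/a."""
    I = I0 * math.exp(a * t)
    K = K0 + I0 * (math.exp(a * t) - 1.0) / a
    return (I / s) / (rho * K)

rho, s = 0.3, 0.2         # assumed values, so rho*s = 0.06
for a in (0.04, 0.06, 0.08):
    # Both ratios approach a / (rho*s): below, equal to, and above one.
    print(a, growth_ratio(a, rho, s, t=5.0), level_ratio(a, rho, s, t=400.0))
```

Only the middle case, where investment grows at exactly `rho * s`, gives a ratio of one.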

We deal with equations that involve $\dot{y}$. The general form is

First order means $\frac{dy}{dt}$, not $\frac{d^2 y}{dt^2}$.

In principle, we can have $d^n y / dt^n$, where $n$ is the order of the differential equation. In the next chapter we will deal with up to $d^2 y / dt^2$.

15.1.1 Homogenous case

\[ \dot{y} + ay = 0 \]

\[ \frac{\dot{y}}{y} = -a , \]

which has solution

\[ y(t) = y_0 e^{-at} . \]


15.1.2 Non homogenous case

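\[ \dot{y} + ay = b , \]

where $b \neq 0$. The solution method involves splitting the solution into two, $y(t) = y_c(t) + y_p(t)$. The complementary function $y_c(t)$ solves the homogenous equation

\[ \dot{y} + ay = 0 , \]

so that

\[ y_c(t) = A e^{-at} . \]

$y_p(t)$ solves the original equation for a stationary solution, i.e. $\dot{y} = 0$, which implies that $y$ is constant and thus $y = b/a$, where $a \neq 0$. The solution is thus

\[ y = y_c + y_p = A e^{-at} + \frac{b}{a} . \]

Given an initial condition $y(0) = y_0$,

\[ y_0 = A e^{-a \cdot 0} + \frac{b}{a} = A + \frac{b}{a} \;\Rightarrow\; A = y_0 - \frac{b}{a} , \]

so that

\[ y(t) = \left( y_0 - \frac{b}{a} \right) e^{-at} + \frac{b}{a} = y_0 e^{-at} + \frac{b}{a} \left( 1 - e^{-at} \right) . \]

One way to think of the solution is a linear combination of two points: the initial condition $y_0$ and the particular, stationary solution $b/a$. (If $a > 0$, then for $t \geq 0$ we have $0 \leq e^{-at} \leq 1$, which yields a convex combination.) Verify this solution:

\[ \dot{y} = -a \left( y_0 - \frac{b}{a} \right) e^{-at} = -a \Big[ \underbrace{\left( y_0 - \frac{b}{a} \right) e^{-at} + \frac{b}{a}}_{y} - \frac{b}{a} \Big] = -ay + b \]

\[ \Rightarrow\; \dot{y} + ay = b . \]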

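We can also write the solution as

\[ y(t) = \left( y_0 - \frac{b}{a} \right) e^{-at} + \frac{b}{a} = k e^{-at} + \frac{b}{a} , \]

for some arbitrary point $k$. In this case

\[ \dot{y} = -a k e^{-at} , \]

and we have

\[ \dot{y} + ay = -a k e^{-at} + a \left( k e^{-at} + \frac{b}{a} \right) = b . \]

When $a = 0$, we get

\[ \dot{y} = b \]

so

\[ y = y_0 + bt . \]

This follows directly from integrating both sides:

\[ \int \dot{y}\,dt = \int b\,dt \]

\[ y = bt + c , \]

where $c = y_0$. We can also solve this using the same technique as above. $y_c$ solves $\dot{y} = 0$, so that this is a constant, $y_c = A$. $y_p$ should solve $0 = b$, but this does not work unless $b = 0$. So try a different particular solution, $y_p = kt$, which requires $k = b$, because then $\dot{y}_p = k = b$. So the general solution is

\[ y = y_c + y_p = A + bt . \]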

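E.g.

\[ \dot{y} + 2y = 6 . \]

$y_c$ solves $\dot{y} + 2y = 0$, so

\[ y_c = A e^{-2t} . \]

A stationary solution requires $2y = 6$, so

\[ y_p = 3 . \]

Thus

\[ y = y_c + y_p = A e^{-2t} + 3 . \]

Together with $y_0 = 10$ we get $10 = A e^{-2 \cdot 0} + 3$, so that $A = 7$. This completes the solution:

\[ y = 7 e^{-2t} + 3 . \]

Verify:

\[ \dot{y} = -14 e^{-2t} \]

and

\[ \dot{y} + 2y = -14 e^{-2t} + 2 \left( 7 e^{-2t} + 3 \right) = 6 . \]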
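As a cross-check of the example, the closed form can be compared with a crude numerical integration. This is a minimal sketch: the function names are chosen here for illustration, and the forward Euler scheme is used only because it is the simplest possible integrator.

```python
import math

def closed_form(t, a, b, y0):
    """y(t) = (y0 - b/a)*exp(-a*t) + b/a, valid for a != 0."""
    return (y0 - b / a) * math.exp(-a * t) + b / a

def euler(a, b, y0, t_end, steps):
    """Integrate y' = b - a*y with the forward Euler method."""
    h = t_end / steps
    y = y0
    for _ in range(steps):
        y += h * (b - a * y)
    return y

a, b, y0 = 2.0, 6.0, 10.0                 # the example: y' + 2y = 6, y(0) = 10
exact = closed_form(1.0, a, b, y0)        # equals 7*exp(-2) + 3
approx = euler(a, b, y0, 1.0, 100_000)
print(exact, approx)
```

The two numbers agree up to the discretization error of the Euler scheme, which shrinks as `steps` grows.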

15.2 Variable coefficients

The general form is

\[ \dot{y} + u(t)\,y(t) = w(t) . \]

$w(t) = 0$:

\[ \dot{y} + u(t)\,y(t) = 0 \;\Rightarrow\; \frac{\dot{y}}{y} = -u(t) . \]

Integrate both sides to get

\[ \int \frac{\dot{y}}{y}\,dt = \int -u(t)\,dt \]

\[ \ln y + c = -\int u(t)\,dt \]

\[ y = e^{-c - \int u(t)\,dt} = A e^{-\int u(t)\,dt} , \]

where $A = e^{-c}$. Thus, the general solution is

\[ y = A e^{-\int u(t)\,dt} . \]

Together with a value for $y_0$ and a functional form for $u(t)$ we can solve explicitly.

E.g.

\[ \dot{y} + 3t^2 y = 0 . \]

Thus

\[ \frac{\dot{y}}{y} = -3t^2 \]

\[ \int \frac{\dot{y}}{y}\,dt = \int -3t^2\,dt \]

\[ \ln y + c = -\int 3t^2\,dt \]

\[ y = e^{-c - \int 3t^2\,dt} = A e^{-t^3} . \]

$w(t) \neq 0$:

\[ \dot{y} + u(t)\,y(t) = w(t) . \]

The solution is

\[ y = e^{-\int u(t)\,dt} \left( A + \int w(t)\,e^{\int u(t)\,dt}\,dt \right) . \]

Obtaining this solution requires some elaborate footwork, which we will do. But first, see that this works: e.g.,

\[ \dot{y} + t^2 y = t^2 \;\Rightarrow\; u(t) = t^2 ,\; w(t) = t^2 . \]

\[ \int u(t)\,dt = \int t^2\,dt = \frac{1}{3} t^3 \]

\[ \int w(t)\,e^{\int u(t)\,dt}\,dt = \int t^2 e^{t^3/3}\,dt = e^{t^3/3} , \]

since

\[ \int f'(y)\,e^{f(y)}\,dy = e^{f(y)} . \]

Thus

\[ y = e^{-t^3/3} \left[ A + e^{t^3/3} \right] = A e^{-t^3/3} + 1 . \]

Verify:

\[ \dot{y} = -t^2 A e^{-t^3/3} \]

so

\[ \dot{y} + u(t)\,y(t) = -t^2 A e^{-t^3/3} + t^2 \left[ A e^{-t^3/3} + 1 \right] \]

\[ = -t^2 A e^{-t^3/3} + t^2 A e^{-t^3/3} + t^2 \]

\[ = t^2 \]

\[ = w(t) . \]
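The same verification can be done numerically. The sketch below integrates $\dot{y} = t^2 - t^2 y$ with a classical Runge-Kutta scheme and compares the result with $A e^{-t^3/3} + 1$; the initial value `y0 = 3` (so that $A = 2$) and the helper names are assumptions made for the illustration.

```python
import math

def rk4(f, t0, y0, t_end, steps):
    """Classical fourth-order Runge-Kutta for y' = f(t, y)."""
    h = (t_end - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

def rhs(t, y):
    # y' = w(t) - u(t)*y with u(t) = w(t) = t**2
    return t**2 - t**2 * y

y0 = 3.0                                   # assumed initial value, so A = y0 - 1 = 2
numeric = rk4(rhs, 0.0, y0, 2.0, 2000)
closed = (y0 - 1.0) * math.exp(-(2.0**3) / 3.0) + 1.0
print(numeric, closed)
```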

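Suppose that the primitive differential equation can be written as

\[ F(y, t) = c \]

so that

\[ dF = F_y\,dy + F_t\,dt = 0 . \]

We use the latter total differential to obtain $F(y, t)$, from which we obtain $y(t)$. We set $F(y, t) = c$ to get initial conditions.

Definition: the differential equation

\[ M\,dy + N\,dt = 0 \]

is an exact differential equation if and only if there exists a function $F(y, t)$ such that $M = F_y$ and $N = F_t$. In that case, by Young's theorem, we have

\[ \frac{\partial M}{\partial t} = \frac{\partial^2 F}{\partial t\,\partial y} = \frac{\partial N}{\partial y} . \]

The latter is what we will be checking in practice.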

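E.g., let $F(y, t) = y^2 t = c$. Then

\[ dF = F_y\,dy + F_t\,dt = 2yt\,dy + y^2\,dt = 0 . \]

Set

\[ M = 2yt ,\quad N = y^2 . \]

Check:

\[ \frac{\partial^2 F}{\partial t\,\partial y} = \frac{\partial M}{\partial t} = 2y \]

\[ \frac{\partial^2 F}{\partial t\,\partial y} = \frac{\partial N}{\partial y} = 2y \]

Before solving, one must always check that the equation is indeed exact.

Step 1: Since

\[ dF = F_y\,dy + F_t\,dt \]

we have

\[ F(y, t) = \int F_y\,dy + \varphi(t) = \int M\,dy + \varphi(t) , \]

where $\varphi(t)$ is a function of $t$ only.

Example:

\[ \underbrace{2yt}_{M}\,dy + \underbrace{y^2}_{N}\,dt = 0 . \]

Step 1:

\[ F(y, t) = \int M\,dy + \varphi(t) = \int 2yt\,dy + \varphi(t) = y^2 t + \varphi(t) . \]

Step 2:

\[ \frac{\partial F(y, t)}{\partial t} = \frac{\partial}{\partial t} \left[ y^2 t + \varphi(t) \right] = y^2 + \varphi'(t) . \]

Since $N = y^2$ we must have $\varphi'(t) = 0$, i.e. $\varphi(t)$ is a constant function, $\varphi(t) = k$, for some $k$. Thus

\[ F(y, t) = y^2 t + k = c , \]

so we can ignore the constant $k$ and write

\[ F(y, t) = y^2 t = c . \]

Solving for $y$,

\[ y(t) = \pm \left( \frac{c}{t} \right)^{1/2} . \]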

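Example:

\[ (t + 2y)\,dy + \left( y + 3t^2 \right) dt = 0 . \]

So that

\[ M = t + 2y \]

\[ N = y + 3t^2 . \]

Check:

\[ \frac{\partial M}{\partial t} = 1 = \frac{\partial N}{\partial y} , \]

so this is indeed an exact differential equation.

Step 1:

\[ F(y, t) = \int M\,dy + \varphi(t) = \int (t + 2y)\,dy + \varphi(t) = ty + y^2 + \varphi(t) . \]

Step 2:

\[ \frac{\partial F(y, t)}{\partial t} = \frac{\partial}{\partial t} \left[ ty + y^2 + \varphi(t) \right] = y + \varphi'(t) = N = y + 3t^2 , \]

so that

\[ \varphi'(t) = 3t^2 \]

and

\[ \varphi(t) = \int \varphi'(t)\,dt = \int 3t^2\,dt = t^3 . \]

Thus

\[ F(y, t) = ty + y^2 + \varphi(t) = ty + y^2 + t^3 . \]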

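Step 3: setting $F(y, t) = c$ defines $y(t)$ only implicitly. There is no tidy closed form in general (here one could apply the quadratic formula in $y$), but using the implicit function theorem, we can

characterize it.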
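One way to see what the implicit solution means is to trace $y(t)$ numerically and confirm that $F$ stays constant along the path. The sketch below is an illustration under assumptions made here: the starting point $(t, y) = (0, 1)$, the horizon $0.8$, and the Runge-Kutta helper are all choices for the example, not part of the notes.

```python
def F(y, t):
    """Potential function F(y, t) = t*y + y**2 + t**3 from the example."""
    return t * y + y**2 + t**3

def slope(t, y):
    # From M dy + N dt = 0 with M = t + 2y and N = y + 3t**2: dy/dt = -N/M
    return -(y + 3 * t**2) / (t + 2 * y)

def rk4_path(f, t0, y0, t_end, steps):
    """Runge-Kutta path of y' = f(t, y), returned as a list of (t, y) pairs."""
    h = (t_end - t0) / steps
    t, y = t0, y0
    path = [(t, y)]
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
        path.append((t, y))
    return path

path = rk4_path(slope, 0.0, 1.0, 0.8, 800)        # starts where F(1, 0) = 1, so c = 1
drift = max(abs(F(y, t) - 1.0) for t, y in path)  # should be numerically zero
print(drift)
```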

Example: Let $T \sim F(t)$ be the time until some event occurs, $T \geq 0$. Define the hazard rate as

\[ h(t) = \frac{f(t)}{1 - F(t)} , \]

which is the "probability" that the event occurs at time $t$, given that it has not occurred by time $t$.

We can write

\[ h(t) = -\frac{R'(t)}{R(t)} , \]

where $R(t) = 1 - F(t)$. We know how to solve such differential equations:

\[ R'(t) + h(t)\,R(t) = 0 . \]

\[ R(t) = A e^{-\int^t h(s)\,ds} . \]

Since $R(0) = 1$ (with probability one the event has not occurred before time 0), we have $A = 1$:

\[ R(t) = e^{-\int^t h(s)\,ds} . \]

It follows that

\[ f(t) = -R'(t) = -e^{-\int^t h(s)\,ds}\,\frac{\partial}{\partial t} \left[ -\int^t h(s)\,ds \right] = -e^{-\int^t h(s)\,ds} \left[ -h(t) \right] = h(t)\,e^{-\int^t h(s)\,ds} . \]

Suppose that the hazard rate is constant:

\[ h(t) = \alpha . \]

In that case

\[ f(t) = \alpha e^{-\int^t \alpha\,ds} = \alpha e^{-\alpha t} , \]

which is the p.d.f. of the exponential distribution.

Now suppose that the hazard rate is not constant, but

\[ h(t) = \alpha \beta t^{\beta - 1} . \]

In that case

\[ f(t) = \alpha \beta t^{\beta - 1} e^{-\int^t \alpha \beta s^{\beta - 1}\,ds} = \alpha \beta t^{\beta - 1} e^{-\alpha t^{\beta}} , \]

which is the p.d.f. of the Weibull distribution. This is useful if you want to model an increasing hazard ($\beta > 1$) or decreasing hazard ($\beta < 1$). When $\beta = 1$ we get the exponential distribution.
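A quick numerical sanity check of the Weibull case: the density built from hazard times survival should integrate to one. In this sketch the parameter names `alpha` and `beta`, their values, and the midpoint-rule integrator are assumptions made for the illustration.

```python
import math

def hazard(t, alpha, beta):
    """h(t) = alpha * beta * t**(beta - 1)."""
    return alpha * beta * t ** (beta - 1)

def survival(t, alpha, beta):
    """R(t) = exp(-integral_0^t h(s) ds) = exp(-alpha * t**beta)."""
    return math.exp(-alpha * t**beta)

def density(t, alpha, beta):
    """f(t) = h(t) * R(t)."""
    return hazard(t, alpha, beta) * survival(t, alpha, beta)

def integrate(f, lo, hi, n=20000):
    """Composite midpoint rule."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

alpha, beta = 0.5, 2.0            # assumed parameters (increasing hazard)
mass = integrate(lambda t: density(t, alpha, beta), 0.0, 10.0)
print(mass)                       # close to one: almost all mass lies below t = 10
```

Setting `beta = 1.0` reduces `density` to the exponential case `alpha * exp(-alpha * t)`.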

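Sometimes we can turn a non exact differential equation into an exact one. For example,

\[ 2t\,dy + y\,dt = 0 \]

is not exact:

\[ M = 2t \]

\[ N = y \]

and

\[ M_t = 2 \neq N_y = 1 . \]

But multiplying the equation by $y$ gives

\[ 2yt\,dy + y^2\,dt = 0 , \]

which we have already seen is exact.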

15.4.1 Integrating factor

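\[ \dot{y} + uy = w , \]

where all variables are functions of $t$ and we wish to solve for $y(t)$. Write the equation above as

\[ \frac{dy}{dt} + uy = w \]

\[ dy + uy\,dt = w\,dt \]

\[ dy + (uy - w)\,dt = 0 . \]

The integrating factor is

\[ e^{\int^t u(s)\,ds} . \]

Multiplying the equation by this factor gives

\[ e^{\int^t u(s)\,ds}\,dy + e^{\int^t u(s)\,ds} (uy - w)\,dt = 0 . \]

To check that this is exact, set

\[ M = e^{\int^t u(s)\,ds} \]

\[ N = e^{\int^t u(s)\,ds} (uy - w) \]

and

\[ \frac{\partial M}{\partial t} = \frac{\partial}{\partial t} e^{\int^t u(s)\,ds} = e^{\int^t u(s)\,ds}\,u(t) \]

\[ \frac{\partial N}{\partial y} = \frac{\partial}{\partial y} \left[ e^{\int^t u(s)\,ds} (uy - w) \right] = e^{\int^t u(s)\,ds}\,u(t) . \]

So $\partial M / \partial t = \partial N / \partial y$.

This form can be recovered from the method of undetermined coefficients. We seek some $A$ such that

\[ \underbrace{A}_{M}\,dy + \underbrace{A (uy - w)}_{N}\,dt = 0 \]

and

\[ \frac{\partial M}{\partial t} = \frac{\partial A}{\partial t} = \dot{A} \]

\[ \frac{\partial N}{\partial y} = \frac{\partial}{\partial y} \left[ A (uy - w) \right] = Au \]

are equal. This means

\[ \dot{A} = Au \]

\[ \dot{A} / A = u \]

\[ A = e^{\int^t u(s)\,ds} . \]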


15.4.2 The general solution

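\[ \dot{y} + uy = w . \]

Rewrite as

\[ dy + (uy - w)\,dt = 0 . \]

Multiply by the integrating factor:

\[ \underbrace{e^{\int^t u(s)\,ds}}_{M}\,dy + \underbrace{e^{\int^t u(s)\,ds} (uy - w)}_{N}\,dt = 0 . \]

Step 1:

\[ F(y, t) = \int M\,dy + \varphi(t) = \int e^{\int^t u(s)\,ds}\,dy + \varphi(t) = y e^{\int^t u(s)\,ds} + \varphi(t) . \]

Step 2:

\[ \frac{\partial F}{\partial t} = \frac{\partial}{\partial t} \left[ y e^{\int^t u(s)\,ds} + \varphi(t) \right] = y e^{\int^t u(s)\,ds}\,u(t) + \varphi'(t) = N . \]

Using $N$ from above we get

\[ N = y e^{\int^t u(s)\,ds}\,u(t) + \varphi'(t) = e^{\int^t u(s)\,ds} (uy - w) , \]

so that

\[ \varphi'(t) = -e^{\int^t u(s)\,ds}\,w \]

and so

\[ \varphi(t) = -\int e^{\int^t u(s)\,ds}\,w\,dt . \]

Thus

\[ F(y, t) = y e^{\int^t u(s)\,ds} - \int e^{\int^t u(s)\,ds}\,w\,dt = c \]

and solving for $y$ gives

\[ y = e^{-\int^t u(s)\,ds} \left( c + \int e^{\int^t u(s)\,ds}\,w\,dt \right) . \]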

In general,

\[ \dot{y} = h(y, t) \]

can be written as

\[ f(y, t)\,dy + g(y, t)\,dt = 0 . \]

First order means $\dot{y}$, not $y^{(n)}$.

First degree means $\dot{y}$, not $(\dot{y})^n$.

15.5.1 Exact differential equations

See above.

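When the variables can be separated, the equation takes the form

\[ f(y)\,dy + g(t)\,dt = 0 , \]

so that

\[ f(y)\,dy = -g(t)\,dt \]

and we can integrate each side separately.

Example:

\[ 3y^2\,dy - t\,dt = 0 \]

\[ \int 3y^2\,dy = \int t\,dt \]

\[ y^3 = \frac{1}{2} t^2 + c \]

\[ y(t) = \left( \frac{1}{2} t^2 + c \right)^{1/3} . \]

Example:

\[ 2t\,dy - y\,dt = 0 \]

\[ \frac{dy}{y} = \frac{dt}{2t} \]

\[ \int \frac{dy}{y} = \int \frac{dt}{2t} \]

\[ \ln y = \frac{1}{2} \ln t + c \]

\[ y = e^{\frac{1}{2} \ln t + c} = e^{\ln t^{1/2}} e^{c} = e^{c} t^{1/2} . \]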

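Suppose that

\[ \dot{y} = h(y, t) \]

can be written as

\[ \dot{y} + Ry = T y^m , \]

where

\[ R = R(t) \]

\[ T = T(t) \]

are functions only of $t$ and

\[ m \neq 0, 1 . \]

This is a Bernoulli equation, which can be reduced to a linear equation and solved as such. Here's how:

\[ \dot{y} + Ry = T y^m \]

\[ \frac{1}{y^m}\,\dot{y} + R y^{1-m} = T . \]

Use a change of variables

\[ z = y^{1-m} \]

so that

\[ \dot{z} = (1 - m)\,y^{-m}\,\dot{y} \]

\[ \frac{\dot{y}}{y^m} = \frac{\dot{z}}{1 - m} . \]

Plug this in the equation to get

\[ \frac{\dot{z}}{1 - m} + Rz = T \]

\[ dz + \Big[ \underbrace{(1 - m) R}_{u}\,z - \underbrace{(1 - m) T}_{w} \Big] dt = 0 \]

\[ dz + [uz - w]\,dt = 0 . \]

This is a linear differential equation in $z$, with solution

\[ z(t) = e^{-\int^t u(s)\,ds} \left( A + \int e^{\int^t u(s)\,ds}\,w\,dt \right) . \]

Finally,

\[ y(t) = z(t)^{\frac{1}{1-m}} . \]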

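Example:

\[ \dot{y} + ty = 3t y^2 \]

In this case

\[ R = t \]

\[ T = 3t \]

\[ m = 2 . \]

Divide by $y^2$:

\[ y^{-2}\,\dot{y} + t y^{-1} - 3t = 0 . \]

Change variables

\[ z = y^{-1} \]

\[ \dot{z} = -y^{-2}\,\dot{y} \]

so that we get

\[ -\dot{z} + tz - 3t = 0 \]

\[ dz + (-tz + 3t)\,dt = 0 , \]

so that we set

\[ u = -t \]

\[ w = -3t . \]

Then

\[ z(t) = e^{-\int^t u(s)\,ds} \left[ A + \int e^{\int^t u(s)\,ds}\,w\,dt \right] \]

\[ = e^{\int^t s\,ds} \left[ A - 3 \int e^{-\int^t s\,ds}\,t\,dt \right] \]

\[ = e^{t^2/2} \left[ A - 3 \int e^{-t^2/2}\,t\,dt \right] \]

\[ = e^{t^2/2} \left[ A + 3 e^{-t^2/2} \right] \]

\[ = A e^{t^2/2} + 3 . \]

So that

\[ y(t) = \frac{1}{z} = \left( A e^{t^2/2} + 3 \right)^{-1} . \]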
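The solution of this Bernoulli example can be checked by plugging it back into the equation. The sketch below does so numerically, using a central difference for $\dot{y}$; the constant `A = 2` and the grid of test points are arbitrary choices for the illustration.

```python
import math

def y(t, A):
    """Candidate solution y(t) = 1 / (A*exp(t**2/2) + 3)."""
    return 1.0 / (A * math.exp(t**2 / 2) + 3.0)

def residual(t, A, h=1e-5):
    """y' + t*y - 3*t*y**2, with y' from a central difference."""
    dy = (y(t + h, A) - y(t - h, A)) / (2 * h)
    return dy + t * y(t, A) - 3 * t * y(t, A) ** 2

A = 2.0                                          # assumed constant of integration
worst = max(abs(residual(0.1 * k, A)) for k in range(21))
print(worst)                                     # numerically zero on [0, 2]
```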

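Example:

\[ \dot{y} + y/t = y^3 . \]

In this case

\[ R = 1/t \]

\[ T = 1 \]

\[ m = 3 . \]

Divide by $y^3$:

\[ y^{-3}\,\dot{y} + t^{-1} y^{-2} - 1 = 0 . \]

Change variables

\[ z = y^{-2} \]

\[ \dot{z} = -2 y^{-3}\,\dot{y} \]

so that we get

\[ -\frac{\dot{z}}{2} + \frac{z}{t} - 1 = 0 \]

\[ \dot{z} - 2\,\frac{z}{t} + 2 = 0 \]

\[ dz + \left( -2\,\frac{z}{t} + 2 \right) dt = 0 , \]

so that we set

\[ u = -2/t \]

\[ w = -2 . \]

Then

\[ z(t) = e^{-\int^t u(s)\,ds} \left[ A + \int e^{\int^t u(s)\,ds}\,w\,dt \right] \]

\[ = e^{2 \int^t \frac{1}{s}\,ds} \left[ A - 2 \int e^{-2 \int^t \frac{1}{s}\,ds}\,dt \right] \]

\[ = e^{2 \ln t} \left[ A - 2 \int e^{-2 \ln t}\,dt \right] \]

\[ = t^2 \left[ A - 2 \int t^{-2}\,dt \right] = t^2 \left[ A + 2 t^{-1} \right] \]

\[ = A t^2 + 2t . \]

So that, since $z = y^{-2}$,

\[ y(t) = z^{-1/2} = \left( A t^2 + 2t \right)^{-1/2} . \]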

Given

\[ \dot{y} = f(y) \]

we can plot $\dot{y}$ as a function of $y$. This is called a phase diagram. This is an autonomous differential equation, since $t$ does not appear explicitly as an argument. We have three cases:

1. $\dot{y} > 0$: $y$ is increasing.

2. $\dot{y} < 0$: $y$ is decreasing.

3. $\dot{y} = 0$: $y$ is stationary, an equilibrium.

System A is dynamically stable: the $\dot{y}$ curve is downward sloping; any movement away from the stationary point $y^*$ will bring us back there.

System B is dynamically unstable: the $\dot{y}$ curve is upward sloping; any movement away from the stationary point $y^*$ takes us farther away.

For example, consider

\[ \dot{y} + ay = b \]

with solution

\[ y(t) = \left( y_0 - \frac{b}{a} \right) e^{-at} + \frac{b}{a} = e^{-at}\,y_0 + \left( 1 - e^{-at} \right) \frac{b}{a} . \]

System A happens when $a > 0$: $\lim_{t \to \infty} e^{-at} = 0$, so that $\lim_{t \to \infty} y(t) = b/a = y^*$.

System B happens when $a < 0$: $\lim_{t \to \infty} e^{-at} = \infty$, so that $\lim_{t \to \infty} y(t) = \pm\infty$ (unless $y_0 = b/a$).

15.7 The Solow growth model (no long run growth version)

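1. CRS production function

\[ Y = F(K, L) \]

\[ y = f(k) \]

where $y = Y/L$, $k = K/L$. Given $F_K > 0$ and $F_{KK} < 0$ we have $f' > 0$ and $f'' < 0$.

2. Constant saving rate: $I = sY$, so that $\dot{K} = sY - \delta K$.

3. Constant labor force growth: $\dot{L}/L = n$.

Dividing the capital accumulation equation by $L$ gives

\[ \frac{\dot{K}}{L} = s f(k) - \delta k . \]

Since

\[ \dot{k} = \frac{d}{dt} \left( \frac{K}{L} \right) = \frac{\dot{K} L - K \dot{L}}{L^2} = \frac{\dot{K}}{L} - \frac{K}{L} \frac{\dot{L}}{L} = \frac{\dot{K}}{L} - kn , \]

we get

\[ \dot{k} = s f(k) - (n + \delta)\,k . \]

Since $f' > 0$ and $f'' < 0$ we know that $\exists k$ such that $s f(k) < (n + \delta) k$. And given the Inada conditions, $f'(0) = \infty$ and $f(0) = 0$, $\exists k$ such that $s f(k) > (n + \delta) k$. Therefore, $\dot{k} > 0$ for low levels of $k$, and $\dot{k} < 0$ for high levels of $k$. Given the continuity of $f$ we know that $\exists k^*$ such that $\dot{k} = 0$, i.e. the system is stable.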
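To make the stability claim concrete, the sketch below assumes a Cobb-Douglas technology $f(k) = k^{\alpha}$ (a functional form not specified in the notes) and illustrative parameter values. It computes the stationary point in closed form and checks that paths starting below and above it both converge there.

```python
def solow_steady_state(s, n, delta, alpha):
    """k* solving s*k**alpha = (n + delta)*k when f(k) = k**alpha."""
    return (s / (n + delta)) ** (1.0 / (1.0 - alpha))

def simulate(k0, s, n, delta, alpha, t_end=400.0, steps=40000):
    """Forward Euler integration of k' = s*f(k) - (n + delta)*k."""
    h = t_end / steps
    k = k0
    for _ in range(steps):
        k += h * (s * k**alpha - (n + delta) * k)
    return k

s, n, delta, alpha = 0.2, 0.01, 0.05, 0.3     # assumed parameter values
k_star = solow_steady_state(s, n, delta, alpha)
low = simulate(0.1 * k_star, s, n, delta, alpha)    # starts where k' > 0
high = simulate(3.0 * k_star, s, n, delta, alpha)   # starts where k' < 0
print(k_star, low, high)
```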

16 Higher order differential equations

We will discuss here only second order, since it is very rare to find higher order differential equations in

economics. The methods introduced here can be extended to higher order differential equations.

\[ y'' + a_1 y' + a_2 y = b , \]

where

\[ y = y(t) \]

\[ y' = dy/dt \]

\[ y'' = d^2 y / dt^2 , \]

and $a_1$, $a_2$, and $b$ are constants. The solution will take the form

\[ y = y_p + y_c , \]

where the particular solution, $y_p$, characterizes a stable point and the complementary function, $y_c$, characterizes dynamics/transitions.

The particular solution. We start with the simplest solution possible; if this fails, we move up in the degree of complexity.

If $a_2 \neq 0$, then $y_p = \frac{b}{a_2}$.

If $a_2 = 0$ and $a_1 \neq 0$, then $y_p = \frac{b}{a_1} t$.

If $a_2 = 0$ and $a_1 = 0$, then $y_p = \frac{b}{2} t^2$.

In the latter solutions, the "stable point" is moving. Recall that this solution is too restrictive, because it constrains the coefficients in the differential equation.

The complementary function solves the homogenous equation

\[ y'' + a_1 y' + a_2 y = 0 . \]

We "guess"

\[ y = A e^{rt} \]

which implies

\[ y' = r A e^{rt} \]

\[ y'' = r^2 A e^{rt} \]

and thus

\[ y'' + a_1 y' + a_2 y = A \left( r^2 + a_1 r + a_2 \right) e^{rt} = 0 . \]

Unless $A = 0$, we must have

\[ r^2 + a_1 r + a_2 = 0 . \]

The roots are

\[ r_{1,2} = \frac{-a_1 \pm \sqrt{a_1^2 - 4 a_2}}{2} . \]

For each root $r_i$ there is a potentially different $A_i$ coefficient. So there are two possible solutions,

\[ y_1 = A_1 e^{r_1 t} \]

\[ y_2 = A_2 e^{r_2 t} . \]

But we cannot just choose one solution, because this will restrict the coefficients in the original differential equation. Thus, we have

\[ y_c = A_1 e^{r_1 t} + A_2 e^{r_2 t} . \]

Given two conditions on $y$ (i.e. two values of either one of $y$, $y'$ or $y''$ at some point in time) we can pin down $A_1$ and $A_2$.

There are three options for the composition of the roots:

Two distinct real roots: $r_1, r_2 \in \mathbb{R}$ and $r_1 \neq r_2$. This will give us values for $A_1$ and $A_2$, given two conditions on $y$.

\[ y_c = A_1 e^{r_1 t} + A_2 e^{r_2 t} . \]

Repeated real root: $r_1 = r_2 \in \mathbb{R}$, $r = -a_1 / 2$. It might seem that we can just add up the solutions as before, but this will restrict the coefficients in the original differential equation. This is so because in $y_c = (A_1 + A_2) e^{rt}$ we cannot separately identify $A_1$ from $A_2$. We guess again:

\[ y_1 = A_1 e^{rt} \]

\[ y_2 = A_2\,t\,e^{rt} . \]

This turns out to work, because both solve the homogenous equation. You can check this. Thus for a repeated real root the complementary function is

\[ y_c = A_1 e^{rt} + A_2\,t\,e^{rt} . \]

Complex roots, $r_{1,2} = r \pm bi$, $i = \sqrt{-1}$, $a_1^2 < 4 a_2$. This gives rise to oscillating dynamics,

\[ y_c = e^{rt} \left( A_1 \cos bt + A_2 \sin bt \right) . \]

Stability: does $y_c \to 0$?

$r_1 \neq r_2 \in \mathbb{R}$: need both $r_1 < 0$ and $r_2 < 0$.

$r_1 = r_2 = r \in \mathbb{R}$: need $r < 0$.

Complex roots $r \pm bi$: need the real part $r < 0$.
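The classification of roots and the stability check can be sketched in a few lines. The function names below are chosen for the illustration; stability is tested through the sign of the real parts of both roots, which covers the real and complex cases at once.

```python
import cmath

def characteristic_roots(a1, a2):
    """Roots of r**2 + a1*r + a2 = 0 (returned as complex numbers)."""
    disc = cmath.sqrt(a1 * a1 - 4 * a2)
    return (-a1 + disc) / 2, (-a1 - disc) / 2

def classify(a1, a2):
    """Return (case, stable) for y'' + a1*y' + a2*y = 0."""
    r1, r2 = characteristic_roots(a1, a2)
    disc = a1 * a1 - 4 * a2
    if disc > 0:
        case = "distinct real"
    elif disc == 0:
        case = "repeated real"
    else:
        case = "complex"
    stable = r1.real < 0 and r2.real < 0      # y_c -> 0 needs negative real parts
    return case, stable

print(classify(3, 2))     # roots -1 and -2
print(classify(2, 1))     # repeated root -1
print(classify(0, 4))     # roots +2i and -2i: oscillations that never die out
print(classify(3, -4))    # roots 1 and -4
```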

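16.2 Differential equations with moving constant

\[ y'' + a_1 y' + a_2 y = b(t) , \]

where $a_1$ and $a_2$ are constants. We require that $b(t)$ takes a form that combines a finite number of "elementary functions", e.g. $k t^n$, $e^{kt}$, etc. We find $y_c$ in the same way as above, because we consider the homogenous equation where $b(t) = 0$. We find $y_p$ by using some educated guess and verify our guess by using the method of undetermined coefficients. There is no general solution procedure for any type of $b(t)$.

Example:

\[ y'' + 5y' + 3y = 6t^2 - t - 1 . \]

Guess:

\[ y_p = \varphi_2 t^2 + \varphi_1 t + \varphi_0 . \]

This implies

\[ y_p' = 2 \varphi_2 t + \varphi_1 \]

\[ y_p'' = 2 \varphi_2 . \]

Plugging in,

\[ y_p'' + 5 y_p' + 3 y_p = 3 \varphi_2 t^2 + (10 \varphi_2 + 3 \varphi_1)\,t + (2 \varphi_2 + 5 \varphi_1 + 3 \varphi_0) , \]

so we need to solve

\[ 3 \varphi_2 = 6 \]

\[ 10 \varphi_2 + 3 \varphi_1 = -1 \]

\[ 2 \varphi_2 + 5 \varphi_1 + 3 \varphi_0 = -1 . \]

This gives $\varphi_2 = 2$, $\varphi_1 = -7$, $\varphi_0 = 10$, so that

\[ y_p = 2t^2 - 7t + 10 . \]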
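Matching powers of $t$ gives a triangular system that can be solved from the highest power down. The sketch below does this with exact rational arithmetic; the function name and argument layout are choices made here, and the routine only covers a quadratic right-hand side with $a_2 \neq 0$.

```python
from fractions import Fraction

def particular_quadratic(a1, a2, rhs):
    """Coefficients (p2, p1, p0) of y_p = p2*t**2 + p1*t + p0.

    Solves y'' + a1*y' + a2*y = rhs[0]*t**2 + rhs[1]*t + rhs[2]
    by matching powers of t; requires a2 != 0.
    """
    c2, c1, c0 = (Fraction(c) for c in rhs)
    a1, a2 = Fraction(a1), Fraction(a2)
    p2 = c2 / a2                          # t**2 terms: a2*p2 = c2
    p1 = (c1 - 2 * a1 * p2) / a2          # t terms:    2*a1*p2 + a2*p1 = c1
    p0 = (c0 - 2 * p2 - a1 * p1) / a2     # constants:  2*p2 + a1*p1 + a2*p0 = c0
    return p2, p1, p0

# The example: y'' + 5y' + 3y = 6t^2 - t - 1
print(particular_quadratic(5, 3, (6, -1, -1)))
```

With `a2 = 0` the division fails, which mirrors the fact that a plain quadratic guess no longer works when the $y$ term is missing.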

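The method does not always work. For example, let

\[ y'' + a_1 y' + a_2 y = t^{-1} . \]

Then no guess of the type $y_p = \varphi t^{-1}$ or $y_p = \varphi \ln t$ will work.

If the $y$ term is missing, e.g.

\[ y'' + 5y' = 6t^2 - t - 1 , \]

the former type of guess,

\[ y_p = \varphi_2 t^2 + \varphi_1 t + \varphi_0 , \]

will not work, because $\varphi_0$ never shows up in the equation and no $t^2$ term is generated on the left hand side, so the coefficients cannot be recovered. Instead, try

\[ y_p = t \left( \varphi_2 t^2 + \varphi_1 t + \varphi_0 \right) , \]

and so on.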

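\[ y'' + a_1 y' + a_2 y = B e^{rt} . \]

Guess:

\[ y_p = A\,t\,e^{rt} \]

with the same $r$ and look for solutions for $A$. The guess $y_p = A e^{rt}$ will not work when $r$ is a root of the characteristic equation, as in the following example. E.g.

\[ y'' + 3y' - 4y = 2 e^{-4t} . \]

Guess:

\[ y_p = A\,t\,e^{-4t} \]

\[ y_p' = A e^{-4t} - 4 A t e^{-4t} = A e^{-4t} (1 - 4t) \]

\[ y_p'' = -4 A e^{-4t} (1 - 4t) - 4 A e^{-4t} = A e^{-4t} (-8 + 16t) . \]

Plugging in,

\[ y_p'' + 3 y_p' - 4 y_p = A e^{-4t} (-8 + 16t) + 3 A e^{-4t} (1 - 4t) - 4 A t e^{-4t} \]

\[ = A e^{-4t} (-8 + 16t + 3 - 12t - 4t) \]

\[ = -5 A e^{-4t} . \]

We need to solve

\[ -5 A e^{-4t} = 2 e^{-4t} \]

so $A = -0.4$ and

\[ y_p = -0.4\,t\,e^{-4t} . \]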

17 First order difference equations

\[ y_{t+1} + a y_t = c . \]

As with differential equations, we wish to trace out a path for some variable $y$ over time, i.e. we seek $y(t)$. But now time is discrete, which gives rise to some peculiarities.

Define

\[ \Delta y_t \equiv y_{t+1} - y_t , \]

which corresponds to

\[ \frac{\Delta y_t}{\Delta t} = \frac{y_{t + \Delta t} - y_t}{\Delta t} , \]

where $\Delta t = 1$.

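1. $y_{t+1} - y_t = c$.

\[ y_1 = y_0 + c \]

\[ y_2 = y_1 + c = y_0 + c + c = y_0 + 2c \]

\[ y_3 = y_2 + c = y_0 + 2c + c = y_0 + 3c \]

\[ \vdots \]

\[ y_t = y_0 + ct . \]

2. $y_{t+1} = k y_t$.

\[ y_1 = k y_0 \]

\[ y_2 = k y_1 = k^2 y_0 \]

\[ \vdots \]

\[ y_t = k^t y_0 . \]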

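\[ y_{t+1} + a y_t = c , \]

where $a \neq 0$. The solution method involves splitting the solution into two, $y(t) = y_c(t) + y_p(t)$.

$y_c(t)$ solves the homogenous equation

\[ y_{t+1} + a y_t = 0 . \]

Guess

\[ y_t = A b^t \]

so that

\[ y_{t+1} + a y_t = 0 \]

implies

\[ A b^{t+1} + a A b^t = 0 \]

\[ b + a = 0 \]

\[ b = -a . \]

Thus

\[ y_c(t) = A (-a)^t , \]

where $a \neq 0$.

$a \neq -1$. $y_p(t)$ solves the original equation for a stationary solution, $y_t = k$, a constant. This implies

\[ k + ak = c \]

\[ k = \frac{c}{1 + a} . \]

So that

\[ y_p = \frac{c}{1 + a} ,\quad a \neq -1 . \]

$a = -1$. Guess $y_p(t) = kt$. This implies

\[ k (t + 1) - kt = c \]

\[ k = c . \]

So that

\[ y_p = ct ,\quad a = -1 . \]

The general solution is

\[ y_t = y_c(t) + y_p(t) = \begin{cases} A (-a)^t + \frac{c}{1+a} & \text{if } a \neq -1 \\ A + ct & \text{if } a = -1 \end{cases} . \]

Given an initial condition $y(0) = y_0$, for $a \neq -1$

\[ y_0 = A + \frac{c}{1 + a} \;\Rightarrow\; A = y_0 - \frac{c}{1 + a} , \]

and for $a = -1$

\[ y_0 = A . \]

The solution is therefore

\[ y_t = \begin{cases} \left[ y_0 - \frac{c}{1+a} \right] (-a)^t + \frac{c}{1+a} & \text{if } a \neq -1 \\ y_0 + ct & \text{if } a = -1 \end{cases} . \]

For $a \neq -1$ we have

\[ y_t = y_0 (-a)^t + \left[ 1 - (-a)^t \right] \frac{c}{1 + a} , \]

which is a linear combination of the initial point and the stationary point $\frac{c}{1+a}$. And if $a \in (-1, 1)$, then this process is stable. Otherwise it is not. For $a = -1$ and $c \neq 0$ the process is never stable.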

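Example:

\[ y_{t+1} - 5 y_t = 1 . \]

$y_c$ solves

\[ y_{t+1} - 5 y_t = 0 . \]

Guessing $y_t = A b^t$,

\[ A b^{t+1} - 5 A b^t = 0 \]

\[ A b^t (b - 5) = 0 \]

\[ b = 5 , \]

so that

\[ y_c(t) = A\,5^t . \]

$y_p = k$ solves

\[ k - 5k = 1 \]

\[ k = -1/4 , \]

so that $y_p = -1/4$.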
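Because a difference equation can simply be iterated, the general solution is easy to check against brute force. In the sketch below the function names and the initial value `y0 = 2` are assumptions for the illustration; the example equation corresponds to $a = -5$, $c = 1$.

```python
def iterate(a, c, y0, periods):
    """Path of y_{t+1} = c - a*y_t starting from y0."""
    path = [y0]
    for _ in range(periods):
        path.append(c - a * path[-1])
    return path

def closed_form(a, c, y0, t):
    """General solution of y_{t+1} + a*y_t = c."""
    if a == -1:
        return y0 + c * t
    star = c / (1 + a)                    # stationary point c/(1+a)
    return (y0 - star) * (-a) ** t + star

# y_{t+1} - 5*y_t = 1 with an assumed initial value y0 = 2
path = iterate(-5, 1, 2, 6)
formula = [closed_form(-5, 1, 2, t) for t in range(7)]
print(path)
print(formula)
```

Since $|{-a}| = 5 > 1$ here, both sequences explode away from the stationary point $-1/4$.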


17.3 Dynamic stability

Given

\[ y_t = \left[ y_0 - \frac{c}{1 + a} \right] b^t + \frac{c}{1 + a} , \]

the dynamics are governed by $b$ $(= -a)$.

$-1 < b < 0$: oscillations diminish over time. In the limit we converge on the stationary point $\frac{c}{1+a}$.

$b = -1$: constant oscillations.

$b = 0$ means $a = 0$, so $y_t = c$.

$b = 1$ means $a = -1$, so $y_t = y_0 + ct$.

3. $0 < b < 1$ gives convergence to the stationary point $\frac{c}{1+a}$.

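This is an early model of agriculture markets. Farmers determined supply last year based on the prevailing price at that time. Consumers determine demand based on current prices. Thus, three equations complete the description of this model

\[ \text{supply}: \quad q^s_{t+1} = s(p_t) = -\gamma + \delta p_t \]

\[ \text{demand}: \quad q^d_{t+1} = d(p_{t+1}) = \alpha - \beta p_{t+1} \]

\[ \text{equilibrium}: \quad q^s_{t+1} = q^d_{t+1} , \]

where $\alpha, \beta, \gamma, \delta > 0$. Imposing equilibrium,

\[ -\gamma + \delta p_t = \alpha - \beta p_{t+1} \]

\[ p_{t+1} + \underbrace{\frac{\delta}{\beta}}_{a}\,p_t = \underbrace{\frac{\alpha + \gamma}{\beta}}_{c} . \]

The solution to this difference equation is

\[ p_t = \left[ p_0 - \frac{\alpha + \gamma}{\beta + \delta} \right] \left( -\frac{\delta}{\beta} \right)^t + \frac{\alpha + \gamma}{\beta + \delta} . \]

The process is convergent (stable) iff $|\delta| < |\beta|$. Since both are positive, we need $\delta < \beta$.

Interpretation: what are $\beta$ and $\delta$? These are the slopes of the demand and supply curves, respectively. It follows that if the slope of the supply curve is lower than that of the demand curve, then the process is convergent. I.e., as long as the farmers do not "overreact" to current prices next year, the market will converge on a happy stable equilibrium price and quantity. Conversely, as long as consumers are not "insensitive" to prices, then...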
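The stability condition can be seen by simulating the price path directly from the market clearing condition. In this sketch the parameter names follow the usual textbook cobweb notation (demand $\alpha - \beta p$, supply $-\gamma + \delta p$), which is an assumption here, and all numerical values are made up for the illustration.

```python
def cobweb_prices(alpha, beta, gamma, delta, p0, periods):
    """Prices solving -gamma + delta*p_t = alpha - beta*p_{t+1} period by period."""
    prices = [p0]
    for _ in range(periods):
        prices.append((alpha + gamma - delta * prices[-1]) / beta)
    return prices

def stationary_price(alpha, beta, gamma, delta):
    """Price at which supply equals demand in every period."""
    return (alpha + gamma) / (beta + delta)

# delta < beta: damped oscillations around the stationary price
stable = cobweb_prices(alpha=10, beta=2.0, gamma=2, delta=1.0, p0=1.0, periods=60)
# delta > beta: explosive oscillations, even when starting close to it
unstable = cobweb_prices(alpha=10, beta=1.0, gamma=2, delta=2.0, p0=3.9, periods=20)
p_star = stationary_price(10, 2.0, 2, 1.0)
print(stable[-1], p_star, unstable[-1])
```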

We will use only a qualitative/graphic approach and restrict to autonomous equations, in which $t$ is not explicit. Let

\[ y_{t+1} = \varphi(y_t) . \]

Draw a phase diagram with $y_{t+1}$ on the vertical axis and $y_t$ on the horizontal axis and the 45 degree ray starting from the origin. For simplicity, $y > 0$. A stationary point satisfies $y = \varphi(y)$. But sometimes the stationary point is not stable. If $|\varphi'(y)| < 1$ at the stationary point, then the process is stable. More generally, as long as $|\varphi'(y_t)| < 1$ the process is stable, i.e. it will converge to some stationary point. When $|\varphi'(y_t)| \geq 1$ the process will diverge.
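A small numerical illustration of the stability criterion, using a hypothetical transition function $\varphi(y) = 2\sqrt{y}$ that is not taken from the notes. Its positive stationary point is $y = 4$, where the slope is $1/2$ in absolute value, so iterates from either side should converge.

```python
import math

def phi(y):
    """Hypothetical transition function y_{t+1} = 2*sqrt(y_t)."""
    return 2.0 * math.sqrt(y)

def dphi(y):
    """Derivative of phi."""
    return 1.0 / math.sqrt(y)

def iterate(y0, periods):
    """Apply phi repeatedly, starting from y0."""
    y = y0
    for _ in range(periods):
        y = phi(y)
    return y

y_star = 4.0                        # solves y = phi(y) for y > 0
print(dphi(y_star))                 # slope at the stationary point
print(iterate(0.5, 60), iterate(30.0, 60))
```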

Example: Galor and Zeira (1993), REStud.

18 Phase diagrams with two variables (19.5)

We now analyze a system of two autonomous differential equations:

\[ \dot{x} = F(x, y) \]

\[ \dot{y} = G(x, y) . \]

The stationary loci are given by

\[ F(x, y) = 0 \]

\[ G(x, y) = 0 . \]

Apply the implicit function theorem separately to the above, which gives rise to two (separate) functions:

\[ \dot{x} = 0: \quad y = f_{\dot{x}=0}(x) \]

\[ \dot{y} = 0: \quad y = g_{\dot{y}=0}(x) , \]

where

\[ f' = -\frac{F_x}{F_y} \]

\[ g' = -\frac{G_x}{G_y} . \]

Now suppose that we have enough information about $F$ and $G$ to characterize $f$ and $g$. And suppose that $f$ and $g$ intersect, which is the interesting case. This gives rise to a stationary point, in which both $x$ and $y$ are constant:

\[ f_{\dot{x}=0}(x^*) = g_{\dot{y}=0}(x^*) \;\Rightarrow\; y^* . \]

There are two interesting cases, although you can characterize the other ones, once you do this.

\[ F_x < 0 ,\; F_y > 0 \]

\[ G_x > 0 ,\; G_y < 0 . \]

Both $f$ and $g$ are upward sloping. Notice that, since $F_y > 0$ and $G_y < 0$,

in all points above $f_{\dot{x}=0}$ we have $\dot{x} > 0$ and in all points below $f_{\dot{x}=0}$ we have $\dot{x} < 0$;

in all points above $g_{\dot{y}=0}$ we have $\dot{y} < 0$ and in all points below $g_{\dot{y}=0}$ we have $\dot{y} > 0$.

Given an intersection, this gives rise to four regions in the $(x, y)$ space:

1. Below $f_{\dot{x}=0}$ and above $g_{\dot{y}=0}$: $\dot{x} < 0$ and $\dot{y} < 0$.

2. Above $f_{\dot{x}=0}$ and above $g_{\dot{y}=0}$: $\dot{x} > 0$ and $\dot{y} < 0$.

3. Above $f_{\dot{x}=0}$ and below $g_{\dot{y}=0}$: $\dot{x} > 0$ and $\dot{y} > 0$.

4. Below $f_{\dot{x}=0}$ and below $g_{\dot{y}=0}$: $\dot{x} < 0$ and $\dot{y} > 0$.

This gives rise to a stable system. From any point in the $(x, y)$ space we converge to $(x^*, y^*)$.

Given the values that $\dot{x}$ and $\dot{y}$ take (given the direction in which the arrows point in the figure), we can draw trajectories. In this case, all trajectories will eventually arrive at the stationary point at the intersection of $\dot{x} = 0$ and $\dot{y} = 0$.

Notice that at the point in which we cross the $\dot{x} = 0$ locus the trajectory is vertical. Similarly, at the point in which we cross the $\dot{y} = 0$ locus the trajectory is horizontal.

\[ F_x > 0 ,\; F_y < 0 \]

\[ G_x < 0 ,\; G_y > 0 . \]

Both $f$ and $g$ are still upward sloping, but now the pattern is different, because $g_{\dot{y}=0}$ crosses $f_{\dot{x}=0}$ at a steeper slope. Notice that

in all points above $f_{\dot{x}=0}$ we have $\dot{x} < 0$ and in all points below $f_{\dot{x}=0}$ we have $\dot{x} > 0$;

in all points above $g_{\dot{y}=0}$ we have $\dot{y} > 0$ and in all points below $g_{\dot{y}=0}$ we have $\dot{y} < 0$.

Given an intersection, this gives rise to four regions in the $(x, y)$ space:

1. Below $f_{\dot{x}=0}$ and above $g_{\dot{y}=0}$: $\dot{x} > 0$ and $\dot{y} > 0$.

2. Above $f_{\dot{x}=0}$ and above $g_{\dot{y}=0}$: $\dot{x} < 0$ and $\dot{y} > 0$.

3. Above $f_{\dot{x}=0}$ and below $g_{\dot{y}=0}$: $\dot{x} < 0$ and $\dot{y} < 0$.

4. Below $f_{\dot{x}=0}$ and below $g_{\dot{y}=0}$: $\dot{x} > 0$ and $\dot{y} < 0$.

This gives rise to an unstable system. However, there is a stationary point at the intersection, $(x^*, y^*)$. But in order to converge to $(x^*, y^*)$ there are only two trajectories that bring us there, one from the region above $f_{\dot{x}=0}$ and below $g_{\dot{y}=0}$, the other from the region below $f_{\dot{x}=0}$ and above $g_{\dot{y}=0}$. These trajectories are called stable branches. If we are not on those trajectories, then we are on unstable branches. Note that being in either region does not ensure that we are on a stable branch, as the figure illustrates.

19 Optimal control

Like in static optimization problems, we want to maximize (or minimize) an objective function. The difference is that the objective is the sum of a path of values at any instant in time; therefore, we must choose an entire path as a maximizer.¹

The problem is generally stated as follows:

\[ \text{Choose } u(t) \text{ to maximize } \int_0^T F(y, u, t)\,dt \]

s.t.

\[ \text{Law of motion}: \quad \dot{y} = g(y, u, t) \]

\[ \text{Initial condition}: \quad y(0) = y_0 \]

\[ \text{Transversality condition}: \quad y(T)\,e^{-rT} \geq 0 , \]

where $r$ is some average discount rate that is relevant to the problem. To this we need to sometimes add

\[ \text{Terminal condition}: \quad y(T) = y_T \]

\[ \text{Constraints on control}: \quad u(t) \in U \]

¹ The theory behind this builds on the "calculus of variations"; the optimal control version was developed in the U.S.S.R. in the 1950s, motivated in part by computing trajectories of missiles (to the moon and elsewhere).

The function $y(t)$ is called the state variable. The function $u(t)$ is called the control variable. It is useful to think of the state as a stock (like capital) and the control as a flow (like investment or consumption). Usually we will have $F, g \in C^1$, but in principle we could do without differentiability with respect to $u$. I.e., we only need that the functions $F$ and $g$ are continuously differentiable with respect to $y$ and $t$.

The transversality condition immediately implies that $y(T) \geq 0$, but also something more. It tells you that if $y(T) > 0$, then its value at the end of the problem, $y(T)\,e^{-rT}$, must be zero. This will become clearer below, when we discuss the Lagrangian approach.

If there is no law of motion for $y$, then we can solve the problem separately at any instant as a static problem. The value would just be the sum of those static values.

There is no uncertainty here. To deal with uncertainty, wait for your next course in math.

2. Investment/consumption: $I = Y - C = F(K, L) - C$.

3. Capital accumulation: $\dot{K} = I - \delta K$.

We want to maximize the present value of instantaneous utility from now (at $t = 0$) till we die (at some distant time $T$). The problem is stated as

\[ \text{Choose } C(t) \text{ to maximize } \int_0^T e^{-\rho t}\,U[C(t)]\,dt \]

s.t.

\[ \dot{K} = I - \delta K \]

\[ K(0) = K_0 \]

\[ K(T) = K_T . \]

19.1 Pontryagin’s maximum principle and the Hamiltonian function

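Define the Hamiltonian function:

\[ H(y, u, t, \lambda) = F(y, u, t) + \lambda\,g(y, u, t) . \]

The function $\lambda(t)$ is called the co-state function and also has a law of motion. Finding $\lambda$ is part of the solution. The FONCs of this problem ensure that we maximize $H$ at every point in time, and as a whole. If $u^*$ is a maximizing plan then

\[ \text{(i)}: \quad H(y, u^*, t, \lambda) \geq H(y, u, t, \lambda) \;\; \forall u \in U \]

\[ \text{or}: \quad \frac{\partial H}{\partial u} = 0 \;\text{ if } F, g \in C^1 \]

\[ \text{State equation (ii)}: \quad \frac{\partial H}{\partial \lambda} = \dot{y} \;\Rightarrow\; \dot{y} = g(y, u, t) \]

\[ \text{Costate equation (iii)}: \quad \frac{\partial H}{\partial y} = -\dot{\lambda} \;\Rightarrow\; \dot{\lambda} + F_y + \lambda g_y = 0 \]

\[ \text{Transversality condition (iv)}: \quad \lambda(T) = 0 . \]

Conditions (ii)+(iii) are a system of first order differential equations that can be solved explicitly if we have functional forms and two conditions: $y(0) = y_0$ and $\lambda(T) = 0$. Note that $\lambda$ has the same interpretation as the Lagrange multiplier: it is the shadow cost of the constraint at any instant.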

We adopt the convention that $y(0) = y_0$ is always given. There are a few ways to introduce terminal conditions, which gives the following taxonomy

1. When $T$ is fixed,

(a) $\lambda(T) = 0$, $y(T)$ free.

(b) $y(T) = y_T$, $\lambda(T)$ free.

(c) $y(T) \geq y_{\min}$ (or $y(T) \leq y_{\max}$), $\lambda(T)$ free. Add the following complementary slackness conditions:

\[ y(T) \geq y_{\min} \]

\[ \lambda(T) \geq 0 \]

\[ \lambda(T) \left( y(T) - y_{\min} \right) = 0 \]

3. $T \leq T_{\max}$ (or $T \geq T_{\min}$) and $y(T) = y_T$. Add the following complementary slackness conditions:

\[ H(T) \geq 0 \]

\[ T \leq T_{\max} \]

\[ H(T) \left( T - T_{\max} \right) = 0 \]

19.2 The Lagrangian approach

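The problem is

\[ \text{Choose } u(t) \text{ to maximize } \int_0^T F(y, u, t)\,dt \]

s.t.

\[ \dot{y} = g(y, u, t) \]

\[ y(T)\,e^{-rT} \geq 0 \]

\[ y(0) = y_0 . \]

You can think of $\dot{y} = g(y, u, t)$ as an inequality $\dot{y} \leq g(y, u, t)$. We can write this up as a Lagrangian. For this we need Lagrange multipliers for the law of motion constraint at every point in time, as well as an additional multiplier for the transversality condition:

\[ \mathcal{L} = \int_0^T F(y, u, t)\,dt + \int_0^T \lambda(t) \left[ g(y, u, t) - \dot{y} \right] dt + \theta\,y(T)\,e^{-rT} \]

\[ = \int_0^T \left[ F(y, u, t) + \lambda(t)\,g(y, u, t) \right] dt - \int_0^T \lambda(t)\,\dot{y}(t)\,dt + \theta\,y(T)\,e^{-rT} . \]

Using integration by parts we have

\[ -\int \lambda \dot{y}\,dt = -\lambda y + \int \dot{\lambda} y\,dt \]

so that

\[ \mathcal{L} = \int_0^T \left[ F(y, u, t) + \lambda(t)\,g(y, u, t) \right] dt - \left[ \lambda(t)\,y(t) \right]_0^T + \int_0^T \dot{\lambda}(t)\,y(t)\,dt + \theta\,y(T)\,e^{-rT} \]

\[ = \int_0^T \left[ F(y, u, t) + \lambda(t)\,g(y, u, t) \right] dt - \lambda(T)\,y(T) + \lambda(0)\,y(0) + \int_0^T \dot{\lambda}(t)\,y(t)\,dt + \theta\,y(T)\,e^{-rT} . \]

Before writing down the FONCs for the Lagrangian, recall that $H(y, u, t, \lambda) = F(y, u, t) + \lambda\,g(y, u, t)$. The FONCs for the Lagrangian are

\[ \text{(i)}: \quad \mathcal{L}_u = F_u + \lambda g_u = 0 \]

\[ \text{(iii)}: \quad \mathcal{L}_y = F_y + \lambda g_y + \dot{\lambda} = 0 . \]

These are consistent with the FONCs of the Hamiltonian:

\[ \text{(i)}: \quad H_u = F_u + \lambda g_u = 0 \]

\[ \text{(ii)}: \quad H_\lambda = g = \dot{y} \]

\[ \text{(iii)}: \quad H_y = F_y + \lambda g_y = -\dot{\lambda} . \]

The requirement that $y(0) = y_0$ can also be captured in the usual way, as well as $y(T) = y_T$, if it is required. The transversality condition is captured by the complementary slackness conditions

\[ y(T)\,e^{-rT} \geq 0 \]

\[ \theta \geq 0 \]

\[ \theta\,y(T)\,e^{-rT} = 0 . \]

We see here that if $y(T)\,e^{-rT} > 0$, then its value, $\theta$, must be zero.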

In these problems $t$ is not an explicit argument.

\[ \text{Choose } u \text{ to maximize } \int_0^T F(y, u)\,dt \quad \text{s.t.} \quad \dot{y} = g(y, u) \]

These problems are easier to solve and are amenable to analysis by phase diagrams.

Objective: You want to eat your cake in an optimal way, maximizing your satisfaction from eating it, starting now ($t = 0$) and finishing before bedtime, at $T$.

When you eat cake, the size diminishes by the amount that you ate: $\dot{S} = -C$.

You like cake, but less so when you eat more: $U'(C) > 0$, $U''(C) < 0$.

The problem is

\[ \text{Choose } C \text{ to maximize } \int_0^T U(C)\,dt \quad \text{s.t.} \]

\[ \dot{S} = -C \]

\[ S(0) = S_0 \]

\[ S(T) \geq 0 . \]

The Hamiltonian is

\[ H(C, S, \lambda) = U(C) + \lambda \left[ -C \right] . \]

FONCs:

\[ \text{(i)}: \quad \frac{\partial H}{\partial C} = U'(C) - \lambda = 0 \]

\[ \text{(ii)}: \quad \frac{\partial H}{\partial \lambda} = -C = \dot{S} \]

\[ \text{(iii)}: \quad \frac{\partial H}{\partial S} = 0 = -\dot{\lambda} \]

\[ \text{(iv)}: \quad S(T) \geq 0 ,\; \lambda(T) \geq 0 ,\; S(T)\,\lambda(T) = 0 . \]

From (iii) it follows that $\lambda$ is constant. From (i) we have $U'(C) = \lambda$, and since $\lambda$ is constant, $C$ is constant too. Then given a constant $C$ we get from (ii) that

\[ S = A - Ct . \]

Together with $S(0) = S_0$ this gives

\[ S = S_0 - Ct . \]

But we still do not know what $C$ is, except that it is constant. So we turn to the complementary slackness conditions, i.e., will we leave leftovers?

Suppose $\lambda > 0$. Then $S(T) = 0$. Therefore

\[ 0 = S_0 - CT , \]

which gives

\[ C = \frac{S_0}{T} . \]

Suppose $\lambda = 0$. Then it is possible to have $S(T) > 0$. But then we get $U' = 0$, a contradiction.

The solution is thus

\[ C(t) = S_0 / T \]

\[ S(t) = S_0 - (S_0 / T)\,t \]

\[ \lambda(t) = U'(S_0 / T) . \]

If we allowed a flat part in the utility function after some satiation point, then we could have a solution with leftovers $S(T) > 0$. In that case we would have more than one optimal path: all would be global because with one flat part $U$ is still quasi concave.
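The optimality of the constant plan can be illustrated by comparing it with another feasible plan that eats the same total amount. In the sketch below the utility function $U(C) = \sqrt{C}$, the cake size, the horizon, and the front-loaded alternative plan are all assumptions chosen for the illustration.

```python
import math

def total_utility(plan, dt, U=math.sqrt):
    """Riemann-sum approximation of the integral of U(C(t)) over [0, T]."""
    return sum(U(c) for c in plan) * dt

S0, T, steps = 1.0, 4.0, 4000         # assumed cake size, horizon, and grid
dt = T / steps

constant = [S0 / T] * steps           # the plan C(t) = S0 / T

# A hypothetical front-loaded plan, rescaled to eat exactly S0 in total.
tilted_raw = [math.exp(-0.5 * (i + 0.5) * dt) for i in range(steps)]
scale = S0 / (sum(tilted_raw) * dt)
tilted = [scale * c for c in tilted_raw]

print(total_utility(constant, dt), total_utility(tilted, dt))
```

With a strictly concave $U$, smoothing consumption over time does better than any plan that shifts cake between instants.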

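We demonstrate that on the optimal path the value of the Hamiltonian function is constant.

Differentiate $H(y, u, t, \lambda)$ with respect to time:

\[ \frac{dH}{dt} = H_u \dot{u} + H_y \dot{y} + H_\lambda \dot{\lambda} + H_t . \]

The FONCs were

\[ H_u = 0 \]

\[ H_y = -\dot{\lambda} \]

\[ H_\lambda = \dot{y} . \]

Plugging these in gives

\[ \frac{dH}{dt} = \frac{\partial H}{\partial t} . \]

This is in fact a consequence of the envelope theorem, although not in a straightforward way. If time is not explicit in the problem, then $\frac{\partial H}{\partial t} = 0$, which implies the statement above.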

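Many problems in economics involve discounting (as we saw above), so the problem is not autonomous. However, usually the only place that time is explicit is in the discount factor,

\[ \int_0^T F(y, u, t)\,dt = \int_0^T e^{-rt}\,G(y, u)\,dt . \]

You can try to solve those problems "as-is", but an easier way (especially if the costate is of no particular interest) is to use the current value Hamiltonian:

\[ \tilde{H} = e^{rt} H = G(y, u) + \varphi\,g(y, u) , \]

where

\[ \varphi = \lambda e^{rt} . \]

The FONCs are

\[ \text{(i)}: \quad \tilde{H}(y, u^*, \varphi) \geq \tilde{H}(y, u, \varphi) \;\; \forall u \in U \]

\[ \text{or}: \quad \frac{\partial \tilde{H}}{\partial u} = 0 \;\text{ if } \tilde{H}, g \in C^1 \]

\[ \text{State equation (ii)}: \quad \frac{\partial \tilde{H}}{\partial \varphi} = \dot{y} \;\Rightarrow\; \dot{y} = g(y, u) \]

\[ \text{Costate equation (iii)}: \quad \frac{\partial \tilde{H}}{\partial y} = -\dot{\varphi} + r \varphi \;\Rightarrow\; \dot{\varphi} - r \varphi + G_y + \varphi g_y = 0 \]

\[ \text{Transversality condition (iv)}: \quad \varphi(T) = 0 \text{ or } \tilde{H}(T) = 0 \text{ or other.} \]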

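We now need to choose a functional form for the instantaneous utility function. The problem is

\[ \text{Choose } C \text{ to maximize } \int_0^T e^{-rt} \ln(C)\,dt \quad \text{s.t.} \]

\[ \dot{S} = -C \]

\[ S(0) = S_0 \]

\[ S(T) \geq 0 . \]

The current value Hamiltonian is

\[ \tilde{H} = \ln C + \varphi \left[ -C \right] . \]

FONCs:

\[ \text{(i)}: \quad \frac{\partial \tilde{H}}{\partial C} = \frac{1}{C} - \varphi = 0 \]

\[ \text{(ii)}: \quad \frac{\partial \tilde{H}}{\partial \varphi} = -C = \dot{S} \]

\[ \text{(iii)}: \quad \frac{\partial \tilde{H}}{\partial S} = 0 = -\dot{\varphi} + r \varphi \]

\[ \text{(iv)}: \quad S(T) \geq 0 ,\; \varphi(T) \geq 0 ,\; S(T)\,\varphi(T) = 0 . \]

From (iii) we have

\[ \frac{\dot{\varphi}}{\varphi} = r , \]

hence

\[ \varphi = B e^{rt} , \]

for some $B$. From (i) we have

\[ C = \frac{1}{\varphi} = \frac{1}{B} e^{-rt} . \]

From (ii) we have

\[ \dot{S} = -C \]

\[ \int_0^t \dot{S}\,dz = \int_0^t -C\,dz \]

\[ S(t) = A + \int_0^t -C\,dz , \]

which together with $S(0) = S_0$ implies

\[ S(t) = S_0 - \int_0^t C\,dz , \]

which makes sense. Now, using $C = B^{-1} e^{-rt}$ we get

\[ S(t) = S_0 - \int_0^t B^{-1} e^{-rz}\,dz \]

\[ = S_0 - B^{-1} \left[ -\frac{1}{r} e^{-rz} \right]_0^t \]

\[ = S_0 - B^{-1} \left[ -\frac{1}{r} e^{-rt} + \frac{1}{r} e^{-r \cdot 0} \right] \]

\[ = S_0 - \frac{1}{rB} \left( 1 - e^{-rt} \right) . \]

Suppose $\varphi(T) = 0$. Then $B = 0$ and $C(T) = \infty$, which is not possible. So $\varphi(T) > 0$, which implies $S(T) = 0$. Therefore

\[ 0 = S_0 - \frac{1}{rB} \left( 1 - e^{-rT} \right) \]

\[ B = \frac{1 - e^{-rT}}{r S_0} . \]

Therefore

\[ C = \frac{r S_0}{1 - e^{-rT}}\,e^{-rt} , \]

which is decreasing, and

\[ \varphi = \frac{1 - e^{-rT}}{r S_0}\,e^{rt} , \]

which is increasing. And finally

\[ S(t) = S_0 \left[ 1 - \frac{1 - e^{-rt}}{1 - e^{-rT}} \right] . \]

This completes the characterization of the problem.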
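The path for the cake stock can be checked against the conditions of the problem: it should start at $S_0$, end at zero, and imply consumption that decays at rate $r$. The sketch below recovers consumption from the stock by a central difference; the parameter values are arbitrary choices for the illustration.

```python
import math

def stock(t, S0, r, T):
    """S(t) = S0 * (1 - (1 - exp(-r*t)) / (1 - exp(-r*T)))."""
    return S0 * (1.0 - (1.0 - math.exp(-r * t)) / (1.0 - math.exp(-r * T)))

def consumption(t, S0, r, T, h=1e-6):
    """C(t) = -S'(t), approximated by a central difference."""
    return -(stock(t + h, S0, r, T) - stock(t - h, S0, r, T)) / (2 * h)

S0, r, T = 1.0, 0.05, 10.0            # assumed values
# With log utility the current value costate is 1/C and grows at rate r,
# so C(t)*exp(r*t) should not depend on t.
scaled = [consumption(t, S0, r, T) * math.exp(r * t) for t in (1.0, 5.0, 9.0)]
print(stock(0.0, S0, r, T), stock(T, S0, r, T), scaled)
```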

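When the problem's horizon is infinite, i.e. never ends, we need to modify the transversality condition. These are

\[ \lim_{T \to \infty} \lambda(T)\,y(T) = 0 \]

for the present value Hamiltonian, and

\[ \lim_{T \to \infty} \varphi(T)\,e^{-rT}\,k(T) = 0 \]

for the current value Hamiltonian.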

1. Preferences: $u(C)$, $u' > 0$, $u'' < 0$. Inada conditions: $u(0) = 0$, $u'(0) = \infty$, $u'(C) \to 0$ as $C \to \infty$.

2. Aggregate production function: $Y = F(K, L)$, CRS, $F_i > 0$, $F_{ii} < 0$. Given this we can write the per-worker version $y = f(k)$, where $f' > 0$, $f'' < 0$ and $y = Y/L$, $k = K/L$. Inada conditions: $f(0) = 0$, $f'(0) = \infty$, $f'(k) \to 0$ as $k \to \infty$.

3. Capital accumulation: $\dot{K} = I - \delta K = Y - C - \delta K$. As we saw in the Solow model, we can write this in per worker terms $\dot{k} = f(k) - c - (n + \delta) k$, where $n$ is the constant growth rate of labor force.

4. There cannot be negative consumption, but also, once output is converted into capital, we cannot eat it. This can be summarized in $0 \leq C \leq F(K, L)$. This is an example of a restriction on the control variable.

5. A social planner chooses a consumption plan to maximize everyone's welfare, in equal weights. The objective function is

\[ V = \int_0^\infty L_0 e^{nt} e^{-\rho t}\,U(c)\,dt = \int_0^\infty e^{-rt}\,U(c)\,dt , \]

where we normalize $L_0 = 1$ and we set $r = \rho - n > 0$, which ensures integrability. Notice that everyone gets the average level of consumption $c = C/L$.

The problem is

Choose $c$ to maximize $V$ s.t.

$$\begin{aligned}
\dot k &= f(k) - c - (n + \delta) k \\
0 &\le c \le f(k) \\
k(0) &= k_0
\end{aligned}$$

FONCs:

$$\begin{aligned}
H_c &= u'(c) - \varphi = 0 \\
H_\varphi &= [f(k) - c - (n + \delta) k] = \dot k \\
H_k &= \varphi [f'(k) - (n + \delta)] = r\varphi - \dot\varphi \\
\lim_{T \to \infty} & \; \varphi(T) \, e^{-rT} k(T) = 0
\end{aligned}$$

Ignore for now $0 \le c \le f(k)$. The transversality condition here is a sufficient condition for a maximum, although in general this specific condition is not necessary. If this were a present value Hamiltonian the same transversality condition would be $\lim_{T \to \infty} \lambda(T) k(T) = 0$, which just means that the value of an additional unit of capital in the limit is zero.

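From $H_c$ we have $u'(c) = \varphi$. From $H_k$ we have

$$\frac{\dot\varphi}{\varphi} = -[f'(k) - (n + \delta + r)] .$$

We want to characterize the solution qualitatively using a phase diagram. To do this, we need two equations: one for the state, $k$, and one for the control, $c$. Notice that differentiating $u'(c) = \varphi$ with respect to time gives

$$\dot\varphi = u''(c) \, \dot c ,$$

so

$$\frac{u''(c) \, \dot c}{u'(c)} = -[f'(k) - (n + \delta + r)] .$$

Rearrange to get

$$\frac{\dot c}{c} = -\frac{u'(c)}{c \, u''(c)} [f'(k) - (n + \delta + r)] .$$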

Notice that

$$-\frac{c \, u''(c)}{u'(c)}$$

is the coefficient of relative risk aversion. Let

$$u(c) = \frac{c^{1-\sigma}}{1 - \sigma} .$$

This is a class of constant relative risk aversion, or CRRA, utility functions, with coefficient of RRA $= \sigma$.
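As a quick check of the CRRA claim, one can differentiate the utility function directly. The lines below write $\sigma$ for the curvature parameter and assume $\sigma > 0$, $\sigma \neq 1$:

$$\begin{aligned}
u'(c) &= c^{-\sigma} , \\
u''(c) &= -\sigma c^{-\sigma - 1} , \\
-\frac{c \, u''(c)}{u'(c)} &= \frac{\sigma c^{-\sigma}}{c^{-\sigma}} = \sigma , \\
-\frac{u'(c)}{c \, u''(c)} &= \frac{1}{\sigma} .
\end{aligned}$$

The last line is the factor that multiplies the bracketed term in the equation for $\dot c / c$.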

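Finally, our two equations are

$$\begin{aligned}
\dot k &= f(k) - c - (n + \delta) k \\
\frac{\dot c}{c} &= \frac{1}{\sigma} [f'(k) - (n + \delta + r)] .
\end{aligned}$$

The corresponding loci are

$$\begin{aligned}
\dot k = 0 &: \quad c = f(k) - (n + \delta) k \\
\dot c = 0 &: \quad f'(k) = n + \delta + r .
\end{aligned}$$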

The $\dot c = 0$ locus is a vertical line in the $(k, c)$ space. Given the Inada conditions and diminishing returns to capital, we have that the $\dot k = 0$ locus is hump shaped. Since $r > 0$, the peak of the hump is to the right of the vertical $\dot c = 0$ locus.
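To make the position of the two loci concrete, here is a minimal sketch that assumes a Cobb-Douglas per-worker technology $f(k) = k^{\alpha}$. That functional form, the function name and the parameter values are illustrative assumptions, not part of the notes. It solves $f'(k) = n + \delta + r$ for the stationary capital stock and $f'(k) = n + \delta$ for the peak of the hump.

```python
def steady_state(alpha, n, delta, r):
    """Stationary point and hump peak for the assumed technology f(k) = k**alpha."""
    # c_dot = 0 locus: f'(k) = alpha * k**(alpha - 1) = n + delta + r
    k_star = (alpha / (n + delta + r)) ** (1.0 / (1.0 - alpha))
    # k_dot = 0 locus evaluated at k_star: c = f(k) - (n + delta) * k
    c_star = k_star ** alpha - (n + delta) * k_star
    # Peak of the k_dot = 0 locus: f'(k) = n + delta
    k_peak = (alpha / (n + delta)) ** (1.0 / (1.0 - alpha))
    return k_star, c_star, k_peak

if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    k_star, c_star, k_peak = steady_state(alpha=0.3, n=0.01, delta=0.05, r=0.03)
    print("k* =", k_star, "c* =", c_star, "peak of hump at k =", k_peak)
```

Because $r > 0$ raises the required marginal product, the computed `k_star` is smaller than `k_peak`, which is the statement that the peak lies to the right of the vertical locus.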

The phase diagram features a saddle point, with two stable branches. If $k$ is to the right of the $\dot c = 0$ locus, then $\dot c < 0$ and vice versa for $k$ to the left of the $\dot c = 0$ locus. For $c$ above the $\dot k = 0$ locus we have $\dot k < 0$ and vice versa for $c$ below the $\dot k = 0$ locus. See textbook for figure.

Define the stationary point as $(k^*, c^*)$. Suppose that we start with $k_0 < k^*$. Then the optimal path for consumption must be on the stable branch, i.e. $c_0$ is on the stable branch, and will eventually go to $c^*$.

The reason is that any other choice is not optimal. Higher consumption will eventually lead to depletion of the capital stock, which eventually leads to no output and therefore no consumption (U.S.A.). Too little consumption will lead first to an increase in the capital stock and an increase in output, but eventually this is not sustainable as the plan requires more and more consumption forgone to keep up with effective depreciation $(n + \delta)$ and eventually leads to zero consumption as well (U.S.S.R.).
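The two failure modes can be illustrated with a rough simulation. The sketch below assumes $f(k) = k^{\alpha}$ and CRRA utility with curvature $\sigma$, uses a simple Euler step, and ignores the constraint $0 \le c \le f(k)$ as the text does. The functional forms, parameter values, step size and function name are all hypothetical choices made for illustration.

```python
def simulate(c0, k0, alpha=0.3, n=0.01, delta=0.05, r=0.03, sigma=2.0,
             dt=0.01, horizon=200.0):
    """Euler-integrate (k_dot, c_dot) from (k0, c0); return final (k, c, depleted)."""
    k, c = k0, c0
    for _ in range(int(horizon / dt)):
        k_dot = k ** alpha - c - (n + delta) * k
        c_dot = (c / sigma) * (alpha * k ** (alpha - 1.0) - (n + delta + r))
        k, c = k + dt * k_dot, c + dt * c_dot
        if k <= 0.0:                 # capital stock exhausted: stop here
            return 0.0, c, True
    return k, c, False

if __name__ == "__main__":
    k_star = (0.3 / 0.09) ** (1.0 / 0.7)   # stationary k for the defaults above
    k0 = 0.5 * k_star                      # start below the stationary point
    f_k0 = k0 ** 0.3
    print("too much consumption:", simulate(c0=f_k0, k0=k0))
    print("too little consumption:", simulate(c0=0.1 * f_k0, k0=k0))
```

Starting above the $\dot k = 0$ locus, capital is run down to zero in finite time. Starting far below the stable branch, capital overshoots $k^*$ and consumption decays toward zero.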

One can do more than just analyze the phase diagram. First, given functional forms we can compute the exact paths for all dynamic variables. Second, we could linearize (a first order Taylor expansion) the system of differential equations around the saddle point to compute dynamics around that point.
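The linearization step can be sketched as follows, again assuming $f(k) = k^{\alpha}$ and CRRA utility with curvature $\sigma$. These forms, the function name and the parameter values are illustrative assumptions. At the stationary point the Jacobian of $(\dot k, \dot c)$ with respect to $(k, c)$ has entries $f'(k^*) - (n + \delta) = r$, $-1$, $c^* f''(k^*)/\sigma$ and $0$, so its determinant is negative and the eigenvalues have opposite signs.

```python
import math

def saddle_eigenvalues(alpha, n, delta, r, sigma):
    """Eigenvalues of the system linearized at (k*, c*), for f(k) = k**alpha."""
    k_star = (alpha / (n + delta + r)) ** (1.0 / (1.0 - alpha))
    c_star = k_star ** alpha - (n + delta) * k_star
    f2 = alpha * (alpha - 1.0) * k_star ** (alpha - 2.0)   # f''(k*) < 0
    # Jacobian at the stationary point:
    #   [ r                     -1 ]
    #   [ c* f''(k*) / sigma     0 ]
    trace = r
    det = c_star * f2 / sigma            # negative, so the point is a saddle
    disc = math.sqrt(trace ** 2 - 4.0 * det)
    return (trace - disc) / 2.0, (trace + disc) / 2.0

if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    stable, unstable = saddle_eigenvalues(alpha=0.3, n=0.01, delta=0.05,
                                          r=0.03, sigma=2.0)
    print("stable root:", stable, "unstable root:", unstable)
```

The negative root governs the speed of convergence along the stable branch near the stationary point.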
