Miron
Outline
1. Derivatives
2. Optimization of a Function of a Single Variable
3. Partial Derivatives
4. Optimization of a Function of Several Variables
5. Optimization Subject to Constraints
Derivatives
The basic tool we need to review is derivatives. The basic, intuitive definition of a derivative is that it is the rate of change of a function in response to a change in its argument. Let's take an example and look at it more slowly.
Say we have some variable y that is a function of another variable x, e.g.,
y = f (x)
For example, we could have
y = x^2
or
y = 7x + 3
or
y = ln x
Graphically, I am just assuming that we have something that looks like the following:
[Figure: graph of a smooth function y = f(x).]
Now say that we are interested in knowing how y will change if we change x.
Let's say that y is test scores, and x is hours of studying.
Assume we are initially at some amount of x, e.g., you have been in the habit of
studying 20 hours per week. You want to know how much higher your test scores
would be at some other amount of x, x + h.
One thing you could do, if you know the formula, is take this alternate x + h, and
compute f(x) as well as f(x + h). You could then look at the difference:

f(x + h) - f(x)
This would be the change in y. For some purposes, that might be exactly what you
care about.
In other instances, however, you might care not just about how much of a change there would be, but about how much per unit of change in x, i.e., per h. That is also easy to calculate:

[f(x + h) - f(x)] / h
[Figure: the secant line through (x, f(x)) and (x + h, f(x + h)); the vertical leg of the triangle is f(x + h) - f(x) and the horizontal leg is h.]
As you can see, we are just calculating the ratio of two legs of a triangle; that
ratio is the slope of the line that connects the two points, as seen above.
The problem is that this calculation would have a different answer if we calculated it at a different point:
[Figure: the same construction with a different step h', giving a secant line with a different slope; vertical leg f(x + h') - f(x), horizontal leg h'.]
So, let's think about the limiting case of this. Say we examine

lim_{h→0} [f(x + h) - f(x)] / h
At one level, this thing might seem a bit confusing or ill-defined. The numerator
obviously goes to zero as h gets small. The denominator also goes to zero. So,
why should we expect the limit to converge to anything?
The proof is outside this course. But, looking at the graph, we can see that it seems plausible that as we let h go to zero, the ratio should approach the slope of the line that is tangent to the function.
This is indeed the case, and it can be proven, but we will just accept it as reasonable.
To summarize, we have shown that the rate of change of a function at a given point (assuming it has a well-defined rate of change) is equal to the slope of a line that is tangent to the curve at that point.
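To see the limit at work, here is a small numerical sketch (my illustration, not from the notes) using y = x^2, whose tangent slope at x = 3 is 2x = 6:

```python
# Difference quotient [f(x+h) - f(x)] / h for shrinking h.
def f(x):
    return x ** 2  # one of the example functions above

def diff_quotient(f, x, h):
    # slope of the secant line from x to x + h
    return (f(x + h) - f(x)) / h

x = 3.0  # true tangent slope of x^2 at x = 3 is 2x = 6
for h in [1.0, 0.1, 0.01, 0.001]:
    print(h, diff_quotient(f, x, h))
```

As h shrinks, the printed values approach 6, the slope of the tangent line.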
The key thing to keep in your head is that the derivative is both:
1) the rate of change of the function at that point, and
2) the slope of the tangent line at that point.
Here are a few additional things to consider:
1) The derivative is usually different at different points.
2) Some functions do not have derivatives at all points:
[Figure: a function with a kink at x', where the derivative is not defined.]
[Figure: y = (x + 3)/x + 4, which is undefined at x' = 0, so the derivative does not exist there.]
So far we have talked about the idea that the change in a variable y that depends on a variable x, per unit of x, might be a useful thing to measure in some settings. And we have seen that the derivative we have defined (the change in y per unit of x, for small changes in x) seems to measure that concept.
But we have not been that explicit about why derivatives are useful in economics. We'll take a step in that direction now.
Optimization of a Function of a Single Variable
So, imagine that we have some y that depends on x, and we control x. We know that different values of x lead to different values of y, and we want to choose the x that gives us the highest y.
For example, assume y is a measure of happiness, and x is the number of pints of Ben and Jerry's that a consumer eats each night. You might think that for small values of x, y increases with x. But at some point, as x increases, happiness decreases (because you can feel your arteries clogging as you eat your 8th pint that night).
Graphically, we have
[Figure: y rises with x, reaches a peak, then falls; at the peak the tangent line is horizontal.]
At the peak, the slope of the tangent line is zero, so the first-order condition (FOC) for a maximum is f'(x) = 0.
The phrase "first-order" is important; it suggests that this is not the whole story, and that there may be second-order things we have to worry about. Let's leave that aside for a second.
Intuitively, it seems clear (and one can prove rigorously under some assumptions) that the x that satisfies this condition is the x at which the maximum y occurs.
There are some caveats, but ignore them for a moment and look at an example.
Let's say

y = f(x) = -x^2 + 6x + 4.

Then f'(x) = -2x + 6, and setting f'(x) = 0 gives x = 3.
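As a sketch of the mechanics (assuming the intended example is the concave quadratic f(x) = -x^2 + 6x + 4, since only a concave function has an interior maximum here):

```python
# Assumed example function: the concave quadratic f(x) = -x^2 + 6x + 4.
def f(x):
    return -x ** 2 + 6 * x + 4

def fprime(x):
    # derivative of f, computed by hand
    return -2 * x + 6

# FOC: f'(x) = 0  =>  -2x + 6 = 0  =>  x = 3
x_star = 3.0
assert fprime(x_star) == 0.0

# The candidate beats nearby points, consistent with a maximum.
print(f(x_star), f(x_star - 0.5), f(x_star + 0.5))
```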
2.1 Caveats
The graph that I drew, and the example I considered, had "nice" features:
1. exactly one peak
2. definitely had a max
3. everywhere differentiable
This is not true for all functions:
[Figures: a straight line with positive slope, which has no max or min; and a constant function, which has infinitely many points where max = min.]
So, the condition we have stated, the FOC, is not sufficient for a point to be a maximum.
Indeed, it is not even necessary, if we allow for functions that are not differentiable.
There is a standard approach to dealing with these weird cases for differentiable functions. This method is known as the second-order conditions. It basically says that the second derivative has to be negative for a maximum.
What is a second derivative? It's just a derivative of a derivative. And you probably remember, or can at least see intuitively, why this makes sense: if the second derivative is negative, the derivative is getting smaller.
Don't worry about this for now. I will review it again in a few examples where it is relevant later.
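A minimal numerical sketch of this point (mine, not from the notes): all three functions below satisfy the FOC at x = 0, but only the one with a negative second derivative there is at a maximum.

```python
# Central-difference approximation of the second derivative f''(x).
def second_derivative(f, x, h=1e-4):
    return (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2

cases = {
    "max":          lambda x: -x ** 2,  # f'' = -2 < 0: a maximum
    "min":          lambda x: x ** 2,   # f'' =  2 > 0: a minimum
    "inconclusive": lambda x: x ** 3,   # f'' =  0: the SOC is silent
}
for name, g in cases.items():
    # every g satisfies g'(0) = 0, but the SOC separates the cases
    print(name, second_derivative(g, 0.0))
```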
Most, although not all, of the problems we examine are "nice." For now, I want you to be aware of the fact that some problems are not "nice." We will see some examples where it is relevant later. But it's not the key thing to focus on now; just be sure to understand the intuition and mechanics of the FOC.
To be clear, it is very important that you be aware that the FOC is not a sufficient condition; there are special cases where the point that satisfies the FOC is not the maximizing point. But we're not going to worry about the details yet, or to a significant degree in this course overall.
NB: everything I've said is applicable for finding minima instead of maxima. That is one reason we have to check the SOCs. But again, in most applications that we will consider, this will take care of itself.
Partial Derivatives
The next, and basically last, calculus topic that we need is partial derivatives.
The reason is that many interesting economics examples relate one variable, say y, to two (or more) other variables, say k and l. A common example can be found in a production function:

y = f(k, l)

or in a utility function:

u = u(x1, x2)

So, the standard calculus of one variable is not sufficient.
Imagine that we have a function of two variables, e.g.,
y = f(x, z)

Now, this is a bit more of a pain graphically.

[Figure: surface plot of y = f(x, z).]

The partial derivative of y with respect to x, written ∂y/∂x, is

∂y/∂x = lim_{h→0} [f(x + h, z) - f(x, z)] / h

Now, this might look messy. But it simply treats z as a constant, and then takes a standard derivative.
This is easiest to see by considering examples. Assume

y = xz

Then

∂y/∂x = z.

Why? Because if we treat z as a constant, then y equals just a constant times x, and we know how to take that derivative.
What exactly is this partial telling us? It is telling us the rate at which y changes as we change x, holding z constant.
Furthermore, it makes sense that this depends on the value of z. Take z = 0: then changing x has no effect on y.
Of course, we could also think about the effect of z on y. To calculate that, we take the derivative of y with respect to z, treating x as a constant:

∂y/∂z = x.
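A quick numerical sketch (mine, not from the notes) of the partial ∂y/∂x for y = xz, showing that its value is z, and in particular is 0 when z = 0:

```python
# y = x*z, the example from the text.
def y(x, z):
    return x * z

def partial_x(f, x, z, h=1e-6):
    # vary x only; z is held constant throughout
    return (f(x + h, z) - f(x, z)) / h

print(partial_x(y, 2.0, 5.0))  # close to z = 5
print(partial_x(y, 2.0, 0.0))  # with z = 0, changing x has no effect
```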
So, if we have a function

y = f(x1, x2, ..., xn),

i.e., a function of n variables, there will be exactly n partial derivatives.
More examples: let

y = ax + bz + cq

Then

∂y/∂x = a,
∂y/∂z = b,
∂y/∂q = c.
Now say

y = x^2 z^3

Then

∂y/∂x = 2x z^3,
∂y/∂z = 3x^2 z^2.
Or, let

u(x1, x2) = x1^2 x2

Then

∂u(x1, x2)/∂x1 = 2 x1 x2,
∂u(x1, x2)/∂x2 = x1^2.
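These hand-computed partials can be checked numerically; here is a sketch (mine, not from the notes) for the example y = x^2 z^3, using central differences:

```python
# y = x^2 * z^3, one of the worked examples above.
def y(x, z):
    return x ** 2 * z ** 3

def partial(f, i, point, h=1e-6):
    # central difference in coordinate i, holding the other coordinate fixed
    lo, hi = list(point), list(point)
    lo[i] -= h
    hi[i] += h
    return (f(*hi) - f(*lo)) / (2 * h)

x, z = 2.0, 3.0
print(partial(y, 0, (x, z)))  # dy/dx: near 2*x*z**3 = 108
print(partial(y, 1, (x, z)))  # dy/dz: near 3*x**2*z**2 = 108
```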
3.1 Discussion
In words, the partial of a function with respect to one argument is the rate of change in the function in response to a small change in that argument, holding the other arguments fixed.
This is different from adjusting both arguments.
For example, increasing a consumer's consumption of goods 1 and 2 is normally going to have a different effect on utility than just increasing, say, good 1.
As a second example, increasing both K and L will have a different effect than, say, increasing L and holding K constant.
We'll see this in practice soon.
Optimization of a Function of Several Variables
The last topic we need to consider is how to find the maximizing values for functions of several variables.
Indeed, this is the case of real interest, since key examples in economics are of
this variety.
That is what creates all the tension about how much math to use in intermediate
courses.
Everyone agrees that it's nice to be able to use calculus. But it turns out that we need just a little bit of multivariate calculus.
Virtually all basic calculus courses, however, focus only on univariate, rather than multivariate, calculus; in particular, they do not teach partial derivatives. Thus, in most sequences, you do linear algebra and then multivariate calculus. This makes sense, since you need linear algebra (but only a tiny amount) for some parts of multivariate calculus. But this standard approach makes life difficult.
So, the key tool we need to do micro theory with calculus is partial derivatives. That means that if we cannot use partials, the benefits of using calculus are not large; that's why most books put it in an appendix, or skip it entirely.
That's also why many departments do not require calculus for an econ major; Harvard did not until 10 or 15 years ago.
But this seems nutty to me: for good students who have had some introduction to basic calculus, learning a partial derivative is not a big deal; it's really just a baby step away from what you already know. Indeed, if you think about it the right way, you already know what a partial is, as we have seen.
Now we can see why it is useful.
Let's first consider an abstract example, because there's one small wrinkle, which comes in when we get to the economics examples, that I want to leave aside for the moment.
Say we have
y = f(x1, x2)

We know that if f is a "smooth" function, it could look something like:

[Figure: surface plot of a smooth function y = f(x1, x2).]
We also know we could think about this in one of the two dimensions at a time. Then this would look like:

[Figure: a cross-section of the surface, a single-peaked curve in one variable.]
So, intuitively, we want to make sure we're at a peak from either angle. Well, looking from either angle is like holding one of x1 or x2 fixed.
So, say we do the following: calculate

∂y/∂x1

and

∂y/∂x2,

set both to zero, and find the combination of x1 and x2 that simultaneously solves the two equations.
I am assuming you can see intuitively that this is analogous to the univariate
case.
In words, if we are at the combination of x1 and x2 that produces the maximum y, then two things must be true: ∂y/∂x1 = 0 and ∂y/∂x2 = 0.
Example: let

y = -3x1^2 - 2x2^2 + 5x1 x2 + x1 + x2.

Then

∂y/∂x1 = -6x1 + 5x2 + 1 = 0
∂y/∂x2 = -4x2 + 5x1 + 1 = 0
This is just two linear equations in two unknowns. We can easily solve for x1 and
x2 .
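As a sketch (assuming the two FOCs are -6x1 + 5x2 + 1 = 0 and 5x1 - 4x2 + 1 = 0; the signs are my reading of the example), the system can be solved by Cramer's rule:

```python
# Two linear FOCs in two unknowns (coefficients assumed from the example):
#   -6*x1 + 5*x2 = -1
#    5*x1 - 4*x2 = -1
a11, a12, b1 = -6.0, 5.0, -1.0
a21, a22, b2 = 5.0, -4.0, -1.0

det = a11 * a22 - a12 * a21       # determinant of the coefficient matrix
x1 = (b1 * a22 - a12 * b2) / det  # Cramer's rule, first unknown
x2 = (a11 * b2 - b1 * a21) / det  # Cramer's rule, second unknown
print(x1, x2)

# Both FOCs hold at the solution.
assert abs(-6 * x1 + 5 * x2 + 1) < 1e-9
assert abs(5 * x1 - 4 * x2 + 1) < 1e-9
```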
Of course, these are just first-order conditions. As with the univariate case, we have to worry about whether we're getting a max or a min; we also have to worry about kinks and boundaries, etc.
Ignore all this for the time being, but be aware that there could be an issue. We'll look more carefully at some particular cases as needed.
Constraints
So far, we have talked about the multivariate case without allowing for the possibility of constraints.
We are going to finesse that issue for the most part, in ways you will see shortly. So, it is again something we will have to worry about a bit, but it's best handled case-by-case with specific examples, rather than with general theory.