
University of Tartu

Anneli Kaasa

MATHEMATICS FOR ECONOMICS


WITH ADDITIONAL TOPICS

Tartu 2014
PREFACE

Economics describes and analyses the economic processes around us. In doing so, it often uses mathematical models that are analysed with mathematical methods. Understanding what is written in textbooks or scientific articles therefore often requires knowledge of (higher) mathematics. Likewise, when conducting research, empirical analysis is mostly based on a theoretical framework that is often expressed with the help of mathematics.
The following material is intended to help those who need to restore or improve their mathematical knowledge for economics. After mathematical methods are introduced, examples are given of possible uses of those methods in different fields of economics. The examples illustrate only the use of a particular mathematical method in economics and do not attempt to interpret fully the model and results used as an example. That is left to microeconomics, macroeconomics and other fields of economics, which will hopefully be easier to understand with the help of this material.

CONTENTS
1. Mathematical models in economics
1.1. Expression of mathematical models
1.2. Elements of mathematical models
1.3. Analysis of mathematical models
2. Differentiation
2.1. Derivative, difference and differential
2.2. Applications of derivative
2.3. Partial derivatives
2.4. Total differential and total derivative
2.5. Derivative of an implicit function
3. Integration
3.1. Indefinite integral
3.2. Definite integral
3.3. Relationship between total and marginal function
4. Matrix algebra
4.1. Matrices and determinants
4.2. Linear equation systems
4.3. Input-output models
4.4. Leontief's model
4.5. Calculations in Leontief's model
5. Optimization
5.1. Extremums of functions of one variable
5.2. Extremums of a function of two variables
5.3. Optimization of a function of n variables
6. Optimization with constraints
6.1. Extremums in an interval
6.2. Optimizing the function of two variables with one constraint
6.3. Lagrange method
6.4. Optimization of a function of n variables with many constraints
7. Comparative statics
7.1. Qualitative and quantitative analysis
7.2. Using Jacobians in more complex models
8. Difference equations
8.1. Dynamic analysis, difference and differential equations
8.2. Solving a difference equation
8.3. Assessing the stability of equilibrium
8.4. Cobweb model
8.5. Phase diagrams
9. Differential equations
9.1. Solving a differential equation
9.2. Assessing the stability of equilibrium
9.3. Phase diagrams
10. Exponential functions
10.1. Specificity of exponential functions
10.2. Growth rate
10.3. Financial mathematics
10.4. Optimal timing
Literature
Appendices
Appendix 1
Appendix 2
Appendix 3
Appendix 4

MATHEMATICAL MODELS IN ECONOMICS Anneli Kaasa

1. MATHEMATICAL MODELS IN ECONOMICS

1.1. Expression of mathematical models


In order to describe and analyse reality, it is sometimes useful to construct a model. A model has to describe reality as well as possible, but at the same time it has to be simplified and generalized so that it is useful for drawing conclusions. It is impossible to cover all of reality with one model; hence the researcher chooses the relationships that are of interest and describes only those characteristics of objects that are important in the context of the particular model.
A model can be expressed verbally, with words and sentences. Mathematical symbols make it possible to describe things more briefly, while graphs are good for visualizing relationships. The method for expressing a model has to be chosen in accordance with the aim. If, for example, the aim is to write a model down for a thorough analysis, a mathematical description of the model is justified and often necessary. However, if the aim is to introduce a model to a wider audience, a verbal description may be a good choice. An appropriate visual illustration may be useful in both cases. Next, the possibilities for describing a model are illustrated using the market equilibrium model.
The market equilibrium model describes the relationships between the price of a good and its demanded and supplied quantity. It can logically be expected that the lower the price, the more of the good consumers want to buy. This relationship is referred to as demand. It can also be assumed that the higher the price that can be obtained when selling the good, the more of it producers want to sell. This relationship is referred to as supply. Equilibrium is reached in a market if a price is found at which the quantity demanded is equal to the quantity supplied. This price is called the equilibrium price, and the quantity that is bought and sold at that price is called the equilibrium quantity. This was a brief verbal description of the market equilibrium model. Next, we will try to express it with mathematical methods.
Consumers are willing to buy a certain quantity of the good under discussion at a particular price: for every price there is a corresponding quantity of the good. Hence, pairs of values of price and quantity are formed. Demand can be viewed as the set of those pairs that are acceptable to consumers or, equivalently, as the set of those pairs that follow a certain rule relating the values of price to the values of quantity. This rule is called the demand function. The demand function describes how the quantity demanded depends on price. The pairs of values of price and quantity can be depicted in a figure with price on one axis and quantity on the other. If we mark in this figure all the pairs that are acceptable to consumers, a demand curve is formed (the graph of the demand function). The relationship between price and quantity for producers, known as supply, can be described analogously.
Let us look at an example of describing the market of a hypothetical good. First, using functions, let the demand function be
$$q^D = 24 - 3p$$
and the supply function
$$q^S = -4 + 4p,$$
where:
$q^D$: quantity demanded;
$q^S$: quantity supplied;
$p$: price.


As can be seen, the relationships correspond to those described verbally: as price increases, quantity demanded decreases and quantity supplied increases.
The market equilibrium is reached if the quantity demanded is equal to the quantity supplied. Hence, the equilibrium condition, which is the third equation of this model, is:
$$q^D = q^S.$$
As the rules that determine the pairs corresponding to demand and supply are known, the same relationships can also be expressed, for example, using set notation:
$$D = \{(p, q^D) \mid 3p + q^D = 24\},$$
$$S = \{(p, q^S) \mid 4p - q^S = 4\}.$$

Demand D is the set of those pairs of values of price and (demanded) quantity (ordered pairs) that follow the rule $3p + q^D = 24$. Supply S is the set of those pairs of values of price and (supplied) quantity that follow the rule $4p - q^S = 4$. The market equilibrium is a pair (E) of values of price and quantity that belongs to both the set D and the set S:
$$E = D \cap S.$$
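As a rough illustration of the set notation, the following Python sketch builds D and S on a grid of integer prices and takes their intersection. The restriction to integer prices from 0 to 8 is an assumption made here only to keep the sets finite; it is not part of the model.

```python
# Demand and supply as sets of (price, quantity) pairs.
# Assumption: prices are restricted to the integers 0..8 so that the sets are finite.
prices = range(0, 9)

D = {(p, 24 - 3 * p) for p in prices}   # pairs that satisfy 3p + q = 24
S = {(p, -4 + 4 * p) for p in prices}   # pairs that satisfy 4p - q = 4

E = D & S                               # pairs that belong to both sets
print(E)                                # {(4, 12)}
```

The only pair common to both sets is the one with price 4 and quantity 12.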
These relationships can also be described graphically. Demanded and supplied quantities are often
depicted on the same axis in order to find the equilibrium of demand and supply. Hence, it is
possible to describe both demand and supply using only two dimensions.
The shapes of the demand and supply curves are determined by the demand and supply functions. In our example both are linear functions, so the graphs are straight lines and only two points are needed to draw each graph. The intersection points with the axes (vertical and horizontal intercepts) can be used, for example, but another widely used option is to draw a graph on the basis of the vertical (or horizontal) intercept and the slope.
Usually, the values of the function (dependent variable y) are depicted on the vertical axis and
values of the argument (independent variable x) on the horizontal axis. According to that, quantity
should be depicted on the vertical axis and price on the horizontal axis. However, according to the
tradition in economics, price is usually depicted on the vertical and quantity on the horizontal axis.
When using the intercepts, this is not a problem. However, when using the vertical intercept and
slope method, one has to use the inverse demand function and the inverse supply function. The
curves, however, are still called demand and supply curves and not inverse demand curve or inverse
supply curve.
Solving both functions for p gives:
$$p = 8 - \frac{1}{3} q^D \quad \text{and} \quad p = 1 + \frac{1}{4} q^S.$$
If price is 0, then the demanded quantity is $q^D = 24 - 3 \cdot 0 = 24$. For the demanded quantity to be 0, price has to be $p = 8 - \frac{1}{3} \cdot 0 = 8$. Hence, the vertical intercept of the demand curve is at the point (0 ; 8) and the horizontal intercept at (24 ; 0). Analogically, the intercepts can be found for the supply curve: (0 ; 1) and (−4 ; 0), respectively. The situation is depicted in Figure 1.1.
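The intercept calculation can be sketched in a few lines of Python. The helper name `intercepts` and the use of exact fractions are choices made for this illustration only; points are written as (quantity, price), as in the text.

```python
from fractions import Fraction

def intercepts(p0, slope):
    """Intercepts of the line p = p0 + slope * q, with price on the vertical axis.

    Returns two (q, p) points: the vertical and the horizontal intercept.
    """
    vertical = (0, p0)                # q = 0
    horizontal = (-p0 / slope, 0)     # p = 0
    return vertical, horizontal

demand = intercepts(Fraction(8), Fraction(-1, 3))   # p = 8 - q/3
supply = intercepts(Fraction(1), Fraction(1, 4))    # p = 1 + q/4

print(demand[1][0], supply[1][0])   # 24 -4
```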


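[Figure: price p on the vertical axis and quantity q on the horizontal axis, with all four quadrants (I to IV) shown. The demand curve has intercepts p = 8 and q = 24; the supply curve S has intercepts p = 1 and q = −4.]

Figure 1.1. Demand and supply curves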


It is hard to imagine a negative price or a negative quantity. Hence, there is no need to depict negative values in the figure. The same is true for many variables used in economics: the interest rate, the exchange rate, the number of workers, and so on. Thus, in economics mostly only the first quadrant (where both variables are non-negative) is shown in figures. For some variables negative values are also possible: negative profit, negative growth etc. Then, of course, the negative values of that variable are also depicted in the figure.
The scale of each axis is chosen according to the model and the values that are present in the model. The figure has to be compact, but at the same time it has to convey the proportions that are important in the context of the model. Variables with different measurement units are often depicted on different axes, hence the scales of the axes also have to differ for the figure to be compact.
A figure drawn in the way that is common in economics can be seen in Figure 1.2.

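[Figure: first quadrant only, quantity q on the horizontal axis. Demand curve D from 8 on the vertical axis to q = 24, supply curve S starting at 1 on the vertical axis, and their intersection point E.]

Figure 1.2. Demand and supply curves as they are usually depicted in economics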
The equilibrium point E lies at the intersection of the two curves.
When describing economic phenomena graphically, sketching is often used: a figure has to be only as precise as the analysis requires. Sometimes the only important question is which of the two lines is steeper. In that case it is reasonable to use the vertical (or horizontal) intercepts and slopes method for sketching. Let us look at the functions $p = 8 - \frac{1}{3} q^D$ and $p = 1 + \frac{1}{4} q^S$. First, the vertical intercepts are 8 for the inverse demand function and 1 for the inverse supply function. Next, the slopes are $-\frac{1}{3}$ for the inverse demand function and $\frac{1}{4}$ for the inverse supply function.
The slope of a straight line shows how fast the variable on the vertical axis increases (or decreases, if the slope is negative) as the variable on the horizontal axis increases. In our case the demand curve is negatively sloped and the supply curve positively sloped. Comparing the absolute values of the slopes, $\frac{1}{3} > \frac{1}{4}$, we can conclude that in this example the demand curve decreases faster than the supply curve increases: the demand curve is steeper than the supply curve, as can be seen in Figure 1.2. It has to be pointed out here that if one compares not the slopes of the inverse functions but the slopes of the demand and supply functions themselves, the result is the opposite: $3 < 4$. Hence, it is important to pay attention to the labelling of the axes: which variable is on which axis.
Knowing the intercepts and how the slopes relate to each other, we can sketch Figure 1.3, which describes the same situation as the previous figures: although the scales are different, the demand curve is still steeper than the supply curve.

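[Figure: sketch with quantity q on the horizontal axis. Demand curve D starting at 8 and supply curve S starting at 1 on the vertical axis, intersecting at point E; the axis scales differ from those of the previous figures.]

Figure 1.3. Demand and supply curves in the case of different scales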

1.2. Elements of mathematical models


In mathematical models, relationships between variables are expressed with the help of equations. When describing changing economic phenomena, a variable takes on different values (for example for different persons, at different times, in different places and so on). Therefore a variable is denoted by a symbol. Widely used variables in economics are, for example, price, quantity of a good, revenue, cost, profit, consumption, investment, exports, imports, total income etc. Most variables have a traditional notation (symbol), for example p for price or q for quantity. Of course, other notations can be used as well, but then they have to be defined first.
In a particular model, variables are divided into dependent and independent variables. For example, in the market equilibrium model the quantity demanded (dependent variable) depends on price (independent variable). A variable can be dependent in one model and independent in another.
Variables appear in mathematical models together with parameters. While a variable is something that could be measured empirically, parameters help to describe the relationships between variables. A parameter is a magnitude that does not change for one object or situation: it has a constant value. For different objects or situations, however, it can have different values. So, if the value is not known or a generalization is made, parameters can also be denoted by symbols (for example a, b, c or $\alpha$, $\beta$, $\gamma$). For example, the portion of income that is used for consumption is constant for one person, but for another person it may have a different value.
The general form of the market equilibrium model introduced before is:
$$q^S = -a + bp,$$
$$q^D = c - dp,$$
$$q^D = q^S,$$
where $a, b, c, d > 0$.

$q^D$, $q^S$ and $p$ are variables in this model and a, b, c and d are parameters.


If one (dependent) variable varies according to some rule as other variable(s) vary, this rule is expressed by a function. A function is a rule that determines the values of one variable that correspond to the values of another variable. A function can be denoted as:
$$y = f(x),$$
where f denotes the functional dependence of variable y on variable x. In economics, the symbol of the dependent variable is often used instead of f to denote the functional dependence. For example, a cost function that describes how costs C depend on the produced quantity q can be expressed as:
$$C = f(q),$$
but also as:
$$C = C(q).$$
Analogically, a supply function can be expressed as $q^S = q^S(p)$ and a demand function as $q^D = q^D(p)$.
In mathematical economics, relationships are mostly expressed with the help of equations. An equation states the equality of the mathematical expressions on the two sides of the equality sign. There are different types of equations.
A definitional equation sets up an identity between the two expressions on the different sides of the equation. For example, profit is equal to the difference between revenue and costs:
$$\pi = R - C,$$
where:
$\pi$: profit,
$R$: revenue,
$C$: costs.
In these cases the identical-equality sign can be used: $\pi \equiv R - C$.
A behavioural equation describes how one variable behaves in response to changes in another variable, for example the demand and supply equations, which describe the behaviour of consumers and producers. Another example could be the dependence of consumption on income:
$$C = a + bY, \quad 0 \le b \le 1.$$
The third type of equation is the equilibrium condition. Such equations describe a relationship that has to hold for equilibrium to be reached. In the market equilibrium model the equilibrium condition is $q^S = q^D$; another example, from macroeconomics, is the equilibrium of investment and savings:
$$I = S.$$


Definitional equations are usually rules that always apply. Behavioural equations describe a rule from which deviations can be found in empirical research. Profit is always calculated in one way, but people's behaviour does not strictly follow a rule. The equilibrium conditions, in turn, are satisfied only at certain (equilibrium) values of the variables.

1.3. Analysis of mathematical models


Mathematical models are constructed not only for describing a situation, but they are also used to
analyse the situation, to make conclusions and to offer solutions. Here, the advantage of
mathematical models is that they can be analysed with mathematical methods. Before introducing
these methods we can try to draw conclusions from the market equilibrium model introduced before
with the help of some simple tools.
When analysing a market equilibrium model, the first task is usually to find the equilibrium price, at which the quantity demanded is equal to the quantity supplied. The equilibrium condition states that those two quantities have to be equal:
$$q^S = q^D.$$
As both quantities depend on price,
$$q^D = 24 - 3p \quad \text{and} \quad q^S = -4 + 4p,$$
replacing the quantities in the equilibrium condition with these expressions in price gives an equation in one variable, price:
$$-4 + 4p = 24 - 3p.$$
Solving this equation, we find the equilibrium value of price, and by substituting it into the demand or supply equation we find the equilibrium quantity. In other words, we have to solve the equation system:
$$\begin{cases} q^D = 24 - 3p, \\ q^S = -4 + 4p, \\ q^D = q^S. \end{cases}$$

In our case, collecting terms gives $7p = 28$, so the equilibrium price is $p^* = 4$ and the equilibrium quantity is $q^* = 24 - 3 \cdot 4 = 12$. The asterisk (*) is often used in economics to denote an equilibrium or optimal value.
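The same calculation can be sketched in Python for the general linear model with supply $q^S = -a + bp$ and demand $q^D = c - dp$. The function name `equilibrium` is hypothetical, and the closed-form expression for the price is simply what setting the two quantities equal gives; integer parameters are assumed so that exact fractions can be used.

```python
from fractions import Fraction

def equilibrium(a, b, c, d):
    """Equilibrium of q_S = -a + b*p and q_D = c - d*p (integers a, b, c, d > 0)."""
    # Setting q_S = q_D gives -a + b*p = c - d*p, so p* = (a + c) / (b + d).
    p_star = Fraction(a + c, b + d)
    q_star = c - d * p_star          # substitute p* into the demand function
    return p_star, q_star

p_star, q_star = equilibrium(a=4, b=4, c=24, d=3)
print(p_star, q_star)   # 4 12
```

Substituting the equilibrium price into the supply function instead gives the same quantity, which is a quick check on the result.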
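Now we can also depict the equilibrium point in a figure (see Figure 1.4).

[Figure: price p on the vertical axis and quantity q on the horizontal axis. Demand curve D from p = 8 to q = 24 and supply curve S intersect at the equilibrium point E, where p = 4 and q = 12.]

Figure 1.4. Equilibrium of demand and quantity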


Sometimes, when needed, new aspects are added to a simple model. For example, it is possible to analyse the influence of an excise tax on the market equilibrium. Assume that an excise tax of T per unit is imposed on the good under discussion. T is a parameter whose value is not known at the moment.
Now the price paid by consumers is divided into two parts: one part (T) goes to the government as tax revenue and the other part goes to producers. This can be expressed by a definitional equation:
$$p_T^D = p_T^S + T,$$
where:
$p_T^S$: price for suppliers (producers) after the excise tax is imposed,
$p_T^D$: price for demanders (consumers) after the excise tax is imposed.

Both consumers and producers behave in the way they did before the tax was imposed. Hence, the quantity demanded after the tax is imposed, $q_T^D$, depends on the price paid by consumers, $p_T^D$, according to the equation $q_T^D = 24 - 3p_T^D$. Likewise, the quantity supplied after the tax is imposed, $q_T^S$, depends on the price that reaches producers, $p_T^S$, as before: $q_T^S = -4 + 4p_T^S$.
Thus, there are now four variables (T is a parameter) and four equations in the model:
$$\begin{cases} q_T^D = 24 - 3p_T^D, \\ q_T^S = -4 + 4p_T^S, \\ q_T^D = q_T^S, \\ p_T^D = p_T^S + T. \end{cases}$$

This system can be solved by replacing $p_T^D$ with $p_T^S + T$. This way we get a system of three variables and three equations:
$$\begin{cases} q_T^D = 24 - 3(p_T^S + T), \\ q_T^S = -4 + 4p_T^S, \\ q_T^D = q_T^S. \end{cases}$$

The rest of the solution is analogous to the one used before, and the result is
$$q_T^D = q_T^S = 12 - \frac{12}{7}T,$$
$$p_T^D = 4 + \frac{4}{7}T,$$
$$p_T^S = 4 - \frac{3}{7}T.$$

From here it can be concluded that, since T is positive, imposing the excise tax decreases the equilibrium quantity regardless of the size of the tax. The market price (the price for consumers) increases, while the price for producers decreases. We can also see how the tax burden is divided between consumers and producers: $\frac{4}{7}$ of the tax is added to the price for consumers and $\frac{3}{7}$ of the tax is taken away from the price for producers. Since the producers now have to ask a higher price (in order to still get the price they accept after they have paid the tax), the supply curve in the figure is shifted up by T (see Figure 1.5).
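To see the result with a concrete number, the sketch below solves the model with the tax for one value of T. The value T = 7/4 is a hypothetical choice made only for this illustration, and the function name `tax_equilibrium` is likewise an assumption.

```python
from fractions import Fraction

def tax_equilibrium(T):
    """Solve q = 24 - 3*(pS + T) and q = -4 + 4*pS for a per-unit tax T."""
    T = Fraction(T)
    p_s = (28 - 3 * T) / 7      # from -4 + 4*pS = 24 - 3*(pS + T)
    p_d = p_s + T               # definitional equation pD = pS + T
    q = -4 + 4 * p_s            # quantity from the supply function
    return p_d, p_s, q

T = Fraction(7, 4)              # hypothetical tax per unit
p_d, p_s, q = tax_equilibrium(T)
print(p_d, p_s, q)              # 5 13/4 9

# Shares of the tax borne by consumers and producers (p* = 4 without the tax).
print((p_d - 4) / T, (4 - p_s) / T)   # 4/7 3/7
```

With T = 0 the function returns the original equilibrium, price 4 and quantity 12.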

The intersection point of the new supply curve and the demand curve gives us the new equilibrium price (for consumers). The price for the producers is lower by T. As the vertical distance between the two supply curves is exactly T, the new price for the producers can be found by moving down from the new equilibrium point to the old supply curve. The horizontal line that corresponds to the initial equilibrium price p* divides the vertical line segment between the two supply curves into two parts that show how the tax burden is divided between consumers and producers: $p_T^D - p^*$ shows the price change for the consumers and $p_T^S - p^*$ the price change for the producers. As can also be seen from Figure 1.5, in this example consumers have to pay the larger part of the tax.

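[Figure: price p on the vertical axis and quantity q on the horizontal axis. The demand curve D, the original supply curve S and the supply curve S' shifted up by T are shown, with the initial equilibrium E at ($q^*$, $p^*$) and the new equilibrium E' at ($q_T$, $p_T^D$). The vertical segment of length T between the two supply curves is split by the line $p = p^*$ into $\frac{4}{7}T$ above and $\frac{3}{7}T$ below, ending at $p_T^S$.]

Figure 1.5. Influence of excise tax on the market equilibrium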


It can be seen from Figure 1.5 that if the supply curve became flatter or the demand curve steeper, the share of the tax burden borne by consumers would increase. Hence, in the case of an excise tax, the division of the tax burden between consumers and producers depends on the slopes of the demand and supply curves.
The type of analysis in which equilibrium values are found is called static analysis. Besides that, comparative statics and dynamic analysis are used in economics; those types are introduced later.

2. DIFFERENTIATION

2.1. Derivative, difference and differential


In economics one has to deal with changing phenomena measured by variables. When describing changes in mathematics, terms like difference, differential and derivative are used.
The change in the value of a variable, the difference, is denoted by the Greek letter $\Delta$. The difference of the variable x per time unit can be found as the value of the variable at time point t minus the value at the previous time point $t - 1$:
$$\Delta x = x_t - x_{t-1}.$$
Analogically, the difference of the variable y is:
$$\Delta y = y_t - y_{t-1}.$$
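A minimal Python sketch of taking differences of a series observed over time follows. The numbers in `x` are hypothetical values invented for this illustration.

```python
# Hypothetical values of a variable x observed at time points t = 0, 1, 2, 3, 4.
x = [100, 104, 103, 110, 118]

# Difference at time t: x[t] - x[t-1]; it is not defined for the first observation.
dx = [x[t] - x[t - 1] for t in range(1, len(x))]
print(dx)   # [4, -1, 7, 8]
```

A negative difference, as between the second and third observation here, simply means that the variable decreased.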
In economics the influence of one variable on another is often analysed, for example how demand will change as the price rises by some amount or what happens to production costs as the quantity produced increases by one unit. This can be described with the help of a function. For example, the influence of x on y is expressed by $y = f(x)$. The impact of a change in the argument x on the value of the function y is expressed by the quotient of the change in y and the change in x: $\frac{\Delta y}{\Delta x}$. $\frac{\Delta y}{\Delta x}$ shows the change in y as x increases by one unit ($\Delta x = 1$). It is actually the average change of y. If we know $\Delta x$, then to find $\Delta y$ we have to find the value of y before and after the change according to the rule $y = f(x)$ and then find the difference: $\Delta y = f(x + \Delta x) - f(x)$. For more complex functions this can be quite troublesome. If the changes are relatively small, then in order to approximate $\frac{\Delta y}{\Delta x}$ it is rational to find the limit of $\frac{\Delta y}{\Delta x}$ as $\Delta x$ approaches 0:
$$\lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}.$$

Such a limit is called the derivative and can be denoted as $y'$, $f'(x)$ or $\frac{dy}{dx}$ (the last notation has to be viewed as one symbol, not a quotient). Hence:
$$y' = f'(x) = \frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}.$$
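How the difference quotient approaches the derivative can be illustrated numerically. The function $f(x) = x^2$ and the point $x = 3$ below are hypothetical choices that do not come from the text; for this function the quotient equals $6 + \Delta x$, so it tends to the derivative 6 as $\Delta x$ shrinks.

```python
from fractions import Fraction

def f(x):
    return x ** 2            # hypothetical example function, not from the text

def difference_quotient(f, x, dx):
    """Average change of f per unit of x over the interval from x to x + dx."""
    return (f(x + dx) - f(x)) / dx

x = Fraction(3)
# Shrink the change in the argument: 1, 1/10, 1/100, 1/1000.
quotients = [difference_quotient(f, x, Fraction(1, 10 ** k)) for k in range(4)]
print([str(q) for q in quotients])   # ['7', '61/10', '601/100', '6001/1000']
```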

The first two notations indicate that the derivative (which is a function as well) is derived from the initial function $y = f(x)$. The last notation refers to the interpretation of the derivative: a derivative gives an approximate estimate of the quotient of absolute changes $\frac{\Delta y}{\Delta x}$ (the smaller $\Delta x$, the more accurate the estimate). The derivative $\frac{dy}{dx}$ shows approximately the change in the value of the function ($\Delta y$) that comes with a unit (one-unit) increase in the argument ($\Delta x = 1$): $\frac{dy}{dx} \approx \frac{\Delta y}{\Delta x}$.
The geometrical interpretation of the derivative is the slope of the tangent line to the function's graph (see Figure 2.1). By finding the value of the derivative at a particular point, we get the slope of the function's graph at that point. At the same time it is also the slope of the function itself at that point. Hence, the derivative helps in drawing the graph of a function.


$dy$ and $dx$ are called the differentials of the variables y and x, respectively. For an argument, the differential is assumed to be equal to the difference: $dx = \Delta x$. The differential of a function is an approximate estimate of the actual change in the value of the function, $\Delta y$, that comes with the change in the argument ($\Delta x$): $dy \approx \Delta y$. It can be seen from Figure 2.1 that the smaller the change in the argument, the less the differential and the actual change (difference) of the function differ from each other. An estimate of the actual change in the value of a function can be found by multiplying the derivative (the approximate change of the function for a unit change in the argument) by the change in the argument $dx = \Delta x$:
$$dy = f'(x)\,\Delta x.$$
If we use the other notation for the derivative and take into account that $dx = \Delta x$, we can write $dy = \frac{dy}{dx}\,dx$, which is definitely true. For example, let there be a function $y = 3x^3 + 4x^2 - 10$ and a given time point where $x = 1$ and $\Delta x = 0{,}02$. The estimate of the actual change $\Delta y$ can be found as:
$$\Delta y \approx dy = f'(x)\,\Delta x = (9x^2 + 8x)\,\Delta x = (9 \cdot 1 + 8 \cdot 1) \cdot 0{,}02 = 0{,}34.$$
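The quality of this estimate can be checked by comparing the differential with the actual change of the function from the example above. This is only a sketch; the helper names `f` and `f_prime` are chosen here, and Python writes decimals with a point rather than a comma.

```python
def f(x):
    return 3 * x**3 + 4 * x**2 - 10    # the function from the example

def f_prime(x):
    return 9 * x**2 + 8 * x            # its derivative

x, dx = 1, 0.02
dy = f_prime(x) * dx                   # differential: 17 * 0.02
delta_y = f(x + dx) - f(x)             # actual change of the function

print(round(dy, 6), round(delta_y, 6))   # 0.34 0.345224
```

The differential understates the actual change by about 0.005 here; with a smaller change in the argument the gap would shrink further.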

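[Figure: graph of $y = f(x)$ with x on the horizontal and y on the vertical axis. For a change $\Delta x = dx$ in the argument, the figure marks the actual change $\Delta y$ and the differential $dy$.]

Figure 2.1. Geometrical interpretation of derivative, differences and differentials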


The smaller the change $dx = \Delta x$ given to the argument, the more precisely the actual change in the value of the function, $\Delta y$, is estimated by the differential. Hence, it is sensible to use the derivative in the case of relatively small changes in the argument. Figure 2.1 presents relatively large changes only to give a better view of the relationships between differences and differentials.
Not all functions are differentiable. The concept of the derivative assumes very small changes, and thus arbitrarily small changes have to be possible. Hence, a function has to be continuous: it should be possible to find the value of the function for any value of the argument. In addition, a function cannot have sharp corners (kinks), where a tangent line cannot be determined. Hence, for differentiability a function has to be continuous and smooth.
The rules of differentiation can be found in handbooks of mathematics. Here are the rules for
differentiating the functions mostly used in economics:
$$c' = (\text{const})' = 0 ,$$
$$\left(a x^b\right)' = a b\, x^{b-1} ,$$
$$\left(e^x\right)' = e^x ,$$
$$\left(a^x\right)' = a^x \ln a = \frac{a^x}{\log_a e} ,$$
$$(\ln x)' = \frac{1}{x} ,$$
$$(\log_a x)' = \frac{1}{x \ln a} = \frac{1}{x} \log_a e .$$
In the case of a sum, difference, product or quotient of functions, the following rules can be used. Let $u$ and $v$ be functions of $x$: $u = f(x)$ and $v = g(x)$.
If $y = u \pm v$, then $y' = u' \pm v'$;
if $y = uv$, then $y' = u'v + uv'$;
if $y = \frac{u}{v}$, then $y' = \frac{u'v - uv'}{v^2}$.
In the case of a composite function (function of a function) the chain rule can be used. Let us assume that $y = f(u)$, $u = u(v)$ and $v = v(x)$.
If $y = f(u(v(x)))$, then $y'(x) = y'(u) \cdot u'(v) \cdot v'(x)$.
It is also useful to know that:
$$x'(y) = \frac{1}{y'(x)} .$$
If a derivative is taken of a derivative, a second-order derivative is found. Continuing to take derivatives gives even higher-order derivatives. A second-order derivative can be denoted in the following ways: $y'' = f''(x) = \frac{d^2y}{dx^2}$. The higher-order derivatives are denoted analogously, but starting from the fourth- or fifth-order derivative the order is described by a number in brackets: $y^{(5)} = f^{(5)}(x) = \frac{d^5y}{dx^5}$.
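The chain rule and the second-order derivative can be checked numerically. The sketch below is only an illustration: the composite function $y = (2x + 1)^3$ and the helper names are assumptions made here, and the derivatives are approximated by finite differences rather than computed exactly.

```python
def central_diff(f, x, h=1e-6):
    """Finite-difference approximation of the first derivative."""
    return (f(x + h) - f(x - h)) / (2 * h)


def second_diff(f, x, h=1e-4):
    """Finite-difference approximation of the second derivative."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2


def v(x):
    """Inner function v = 2x + 1 (hypothetical example)."""
    return 2 * x + 1


def y(x):
    """Composite function y = v(x)^3."""
    return v(x) ** 3


x0 = 1.0
chain_rule = 3 * v(x0) ** 2 * 2      # y'(v) * v'(x) = 3v^2 * 2
second_exact = 6 * v(x0) * 2 * 2     # derivative of 6v^2 is 12v * v'(x) = 24v

print(chain_rule, central_diff(y, x0))
print(second_exact, second_diff(y, x0))
```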

Differentiation basically means taking derivatives. In the case of a function $y = f(x)$, where $y$ is a function and $x$ an argument, the derivative $y' = \frac{dy}{dx}$ is interpreted as an approximate estimation of $\frac{\Delta y}{\Delta x}$ (it means: $\frac{dy}{dx} \approx \frac{\Delta y}{\Delta x}$). $\Delta y$ and $\Delta x$ are the differences of $y$ and $x$, $dy$ and $dx$ are the differentials of $y$ and $x$. We assume that $\Delta x = dx$, but $dy \approx \Delta y$, hence $\frac{dy}{dx} \approx \frac{\Delta y}{\Delta x}$.
!!! $\frac{dy}{dx}$ means that as $x$ increases by 1 unit, $y$ (depending on the sign of $\frac{dy}{dx}$ being positive or negative, respectively) increases/decreases approximately by $\left|\frac{dy}{dx}\right|$ units. !!! NB! If we are dealing with economic-related problems, then we should not use $x$ and $y$, but the variables in our problem instead.
If $\Delta x = dx \to 0$, then $\frac{\Delta y}{\Delta x} \to \frac{dy}{dx}$ (the estimation gets more accurate).


2.2. Applications of derivative


In economics often marginal indicators and marginal functions are used. A marginal indicator
describes the change of one (dependent) variable that comes with the unitary change of another
(independent) variable. For example, marginal cost is the additional cost of producing one more unit
of product; marginal revenue is the additional revenue that comes from selling one additional unit.
The relationship between these dependent and independent variables y = f ( x ) is often called total
function or initial function in this context. If the total function is a linear function, then a particular
change of argument causes always (for all values of the argument or all possible points on a
function) the same change in the value of a function, but in the case of more complex functions the
change of the value of a function depends not only on the change $\Delta x$, but also on the value of the
argument $x$ at the moment. Hence, a marginal indicator can be found with the help of a marginal
function that is also a function of x. A marginal function is a function that describes how the
change in y (in the case of unitary change of argument x) depends on the argument x. That is the
same that is described by a derivative function. Hence, a marginal function can be found as a
derivative of the initial function.
A marginal indicator is often denoted with the same symbol as the initial indicator, with M added before this symbol. For example, the derivative of the revenue function $R = R(q)$ gives marginal revenue $MR$, and the derivative of the utility function $u = u(q)$ gives marginal utility $MU$. In general, we can say that a derivative of a total function $y = f(x)$ gives us the marginal function:
$$MY = y'(x) .$$
For example, if we need to find a marginal function from a cost function $C = C(q)$, we take a derivative of it: $MC = C'(q)$. In the case of the cost function $C = 3q^3 - 2q - 5$ the marginal cost function is $MC = C'(q) = 9q^2 - 2$.

Taking a derivative of $y = f(x)$ gives a marginal function of $y$: $MY = y' = \frac{dy}{dx}$. Interpretation: the interpretation of a derivative.
Derivatives can be used for sketching graphs. In a domain where the derivative of a function y with
respect to x (describing the change of y brought about by the change of x) is positive, the function is
increasing (y increases when x increases) and the graph positively sloping. In a domain where the
derivative is negative, the value of a function is decreasing. A derivative describes the velocity of
growth (change) in a sense. In the case of negative derivative, growth is also negative and the value
of the function is decreasing. The higher is the value of a positive derivative, the faster the value of
a function is increasing and vice versa. Also, the higher is the absolute value of a negative
derivative, the faster is the value of a function decreasing and vice versa.
For example, revenue is equal to the quantity sold multiplied by price: $R = q \cdot p$. Here a firm has to take the demand into account: at different prices different quantities can be sold according to the demand function. Let the demand function be $q = 20 - 2p$. To find how the revenue depends on the quantity, we need to find the inverse demand function $p = 10 - 0,5q$ and then substitute: $R = q(10 - 0,5q) = 10q - 0,5q^2$. Marginal revenue is then $MR = R'(q) = 10 - q$.
We know that the revenue function is a quadratic function with a parabolic graph. Marginal revenue is positive if $MR = 10 - q > 0$ or $q < 10$. If $q > 10$, marginal revenue is negative. Hence, the revenue function increases until the point where $q = 10$ (maximum point) and starts to decrease after that. Hence, the parabolic graph has to be concave (this is also confirmed by the second derivative being negative: $R''(q) = -1$). We also know the maximum point and that if the quantity is 0, the revenue is also 0. Hence, we can sketch the revenue function as shown in Figure 2.2.
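The reasoning about the sign of the marginal revenue can be traced in a few lines of Python. This is a minimal sketch using the revenue function of the example; the function names and the chosen quantities are assumptions made for illustration.

```python
def revenue(q):
    """Revenue R = 10q - 0.5q^2 from the example."""
    return 10 * q - 0.5 * q**2


def marginal_revenue(q):
    """Marginal revenue MR = R'(q) = 10 - q."""
    return 10 - q


for q in (0, 5, 10, 15, 20):
    mr = marginal_revenue(q)
    if mr > 0:
        trend = "increasing"
    elif mr < 0:
        trend = "decreasing"
    else:
        trend = "maximum"
    print(f"q = {q:2d}  R = {revenue(q):5.1f}  MR = {mr:3d}  revenue is {trend}")
```

The output shows revenue rising while MR is positive, peaking at q = 10 and falling afterwards, which is the shape sketched for the revenue curve.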

[Figure: graphs of the revenue function and of MR against q, with the value 10 marked on the axes]

Figure 2.2. Graphs of the revenue and marginal revenue functions

If we are drawing a graph of $y = f(x)$, $y' = \frac{dy}{dx} = MY$ shows whether $y = f(x)$ is increasing or decreasing. $y'' = \frac{d^2y}{dx^2} = MY'$, in turn, shows whether $y' = \frac{dy}{dx} = MY$ is increasing or decreasing and at the same time it shows whether $y = f(x)$ is increasing/decreasing at an accelerating or decelerating rate.

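| | $y'' = \frac{d^2y}{dx^2} = MY' > 0$ | $y'' = \frac{d^2y}{dx^2} = MY' < 0$ |
|---|---|---|
| $y' = \frac{dy}{dx} = MY > 0$ | $y$ increases, $MY$ increases: accelerated increase | $y$ increases, $MY$ decreases: decelerated increase |
| $y' = \frac{dy}{dx} = MY < 0$ | $y$ decreases, $MY$ increases (its absolute value decreases): decelerated decrease | $y$ decreases, $MY$ decreases (its absolute value increases): accelerated decrease |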

A concept widely used in economics is the concept of elasticity of a function. Elasticity of a function shows how sensitive the value of a function is to changes of the argument. More precisely: elasticity of a function with respect to the argument indicates the relative change (in per cent) of the value of the function, if the argument increases by 1%. Elasticity of a function is denoted by the Greek letter $\varepsilon$ and calculated as the quotient of the relative change of the value of the function and the relative change of the argument.

Let us use a demand function $q = q^D(p)$ as an example. The price elasticity of demand can be denoted as $\varepsilon_p^D$, where $D$ stands for the dependent variable (demanded quantity) and $p$ stands for the independent variable (price). The absolute change of the demanded quantity is $\Delta q = q_t - q_{t-1}$. The relative change shows how big the absolute change is relative to the value of this variable at the moment. This is usually found in the following way: $\frac{\Delta q}{q_{t-1}} = \frac{q_t - q_{t-1}}{q_{t-1}}$. However, that would mean that when looking at two absolute changes with different signs but with the same absolute value, the corresponding relative changes would be different. Therefore, when calculating elasticities, the absolute change is divided by the arithmetical average of the first (time $t-1$) and second (time $t$) value:
$$\frac{\Delta q}{\bar{q}} = \frac{q_t - q_{t-1}}{\bar{q}} , \quad \text{where } \bar{q} = \frac{q_t + q_{t-1}}{2} .$$
For the price elasticity of demand $\varepsilon_p^D$ we also need the relative change of price:
$$\frac{\Delta p}{\bar{p}} = \frac{p_t - p_{t-1}}{\bar{p}} , \quad \text{where } \bar{p} = \frac{p_t + p_{t-1}}{2} .$$
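Elasticity is then calculated as the quotient of the relative change of the demanded quantity and the relative change of price:
$$\varepsilon_p^D = \frac{\Delta q / \bar{q}}{\Delta p / \bar{p}} ,$$
which can be rearranged as $\varepsilon_p^D = \frac{\Delta q}{\bar{q}} : \frac{\Delta p}{\bar{p}}$ or:
$$\varepsilon_p^D = \frac{\Delta q}{\Delta p} \cdot \frac{\bar{p}}{\bar{q}} .$$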
This is the typical form of the formula used for calculating elasticities. As this formula allows us to calculate elasticity on the arc between two observed points, it is called arc elasticity.
For example, if the price of a good increases from 40 to 50 per kg, then a consumer now buys 0,5 kg in a week instead of the previous 1 kg. The change of the quantity is then $\Delta q = -0,5$ and the average quantity $\bar{q} = 0,75$. The change of the price is $\Delta p = 10$ and the average price $\bar{p} = 45$.
Elasticity: $\varepsilon_p^D = \frac{\Delta q}{\Delta p} \cdot \frac{\bar{p}}{\bar{q}} = \frac{-0,5}{10} \cdot \frac{45}{0,75} = -3$.
If the relative change of the value of the function is larger than the relative change of the argument, then $|\varepsilon| > 1$ and the function is known as elastic. In the opposite case $|\varepsilon| < 1$ and the function is inelastic. If $|\varepsilon| = 1$, it is called unitary elasticity (a unit elastic function). In our example the function is elastic, meaning that the demand is relatively sensitive to changes in price.
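The arc elasticity calculation and the classification into elastic, unit elastic and inelastic can be sketched as follows. The function names `arc_elasticity` and `classify` are hypothetical, introduced here only for illustration.

```python
import math


def arc_elasticity(x_old, x_new, y_old, y_new):
    """Arc elasticity of y with respect to x between two observed points."""
    dx = x_new - x_old
    dy = y_new - y_old
    x_bar = (x_old + x_new) / 2      # arithmetic average of the argument
    y_bar = (y_old + y_new) / 2      # arithmetic average of the function
    return (dy / dx) * (x_bar / y_bar)


def classify(elasticity):
    """Qualitative assessment based on the absolute value of the elasticity."""
    size = abs(elasticity)
    if math.isclose(size, 1.0):
        return "unit elastic"
    return "elastic" if size > 1 else "inelastic"


# Price rises from 40 to 50, weekly quantity falls from 1 kg to 0.5 kg.
eps = arc_elasticity(40, 50, 1.0, 0.5)
print(round(eps, 6), classify(eps))
```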
The formula of arc elasticity gives the so-called average elasticity on an arc, while the elasticities at different points on this arc may differ from each other. A more precise result is obtained by the formula of point elasticity. Calculation of a point elasticity assumes infinitely small changes in the argument. In that case the changes can be replaced by the differentials and $\frac{\Delta q}{\Delta p}$ replaced by $\frac{dq}{dp}$ (derivative). As there is only one point under consideration, there is no need for calculating averages and the formula contains the values of the function and the argument at a particular point instead:
$$\varepsilon_p^D = \frac{dq}{dp} \cdot \frac{p}{q} .$$
$\frac{dq}{dp}$ should be treated here as one symbol referring to a derivative.
For example, let us find the price elasticity of a demand function $q = 20 - 2p$ at a point where the price is 4. If the price is 4, then the quantity is $q = 20 - 2 \cdot 4 = 12$. The derivative is $q' = -2$ and the elasticity:
$$\varepsilon_p^D = -2 \cdot \frac{4}{12} = -\frac{2}{3} ,$$
hence, this demand function is inelastic at the price 4.
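The same point elasticity can be computed for several prices to see how it changes along the demand curve. The following is a minimal sketch for the demand function of the example; the function names are assumptions made for illustration.

```python
def demand(p):
    """Demand function q = 20 - 2p."""
    return 20 - 2 * p


def demand_slope(p):
    """Derivative dq/dp, constant for a linear demand function."""
    return -2.0


def point_elasticity(p):
    """Point price elasticity of demand: (dq/dp) * (p / q)."""
    return demand_slope(p) * p / demand(p)


for price in (2, 4, 5, 8):
    print(price, demand(price), round(point_elasticity(price), 4))
```

At the price 4 the sketch reproduces the value $-\frac{2}{3}$; at the price 5 (half of the price 10 at which demand falls to zero) the elasticity equals $-1$.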
Other types of elasticities can be calculated analogously: income elasticity of demand, price elasticity of supply, input elasticities of output etc.
In the case of a linear function the arc elasticity on a segment of the line is equal to the point elasticity at the centre point of that segment. In the case of a curve, arc elasticity is equal to the point elasticity at a point where the tangent line is parallel to the line drawn between the two points used in calculating the arc elasticity. In Figure 2.3 the arc elasticity between points A and B is equal to the point elasticity at point C, where the tangent line is parallel to the line drawn between A and B.

[Figure: demand curve in q and p axes with points A (at p_{t-1}, q_{t-1}), C (at the averages of p and q) and B (at p_t, q_t)]

Figure 2.3. Geometrical relationship between arc and point elasticities.


The way of calculating an elasticity depends on the data known. If we know two points ($(x_1, y_1)$ and $(x_2, y_2)$) on a curve, we can use arc elasticity: $\varepsilon_x^y = \frac{\Delta y}{\Delta x} \cdot \frac{\bar{x}}{\bar{y}}$ ($\bar{x}$ and $\bar{y}$ are the arithmetic averages of $x$ and $y$, $\Delta x = (x_2 - x_1)$ and $\Delta y = (y_2 - y_1)$). If we know the function and one point of it, we can use point elasticity: $\varepsilon_x^y = \frac{dy}{dx} \cdot \frac{x}{y} = y'(x) \frac{x}{y}$ ($x$ and $y$ being the values at that point). While the derivative estimates the quotient of the absolute changes of $y$ and $x$, elasticity estimates the quotient of the relative changes of $y$ and $x$.
dy x
!!! $\varepsilon_x^y = \frac{dy}{dx} \cdot \frac{x}{y}$ means that as $x$ increases by 1 %, $y$ (depending on the sign of $\varepsilon_x^y$ being positive or negative, respectively) increases/decreases approximately by $|\varepsilon_x^y|$ %. !!! NB! If we are dealing with economic-related problems, then we should not use $x$ and $y$, but the variables in our problem instead.
If $|\varepsilon_x^y| > 1$, we say that $y$ is elastic with respect to $x$, and if $|\varepsilon_x^y| < 1$, inelastic.

In the case of a linear demand curve $p = a - bq$ the price elasticity of demand is:

$$\varepsilon_p^q = \frac{dq}{dp} \cdot \frac{p}{q} = -\frac{1}{b} \cdot \frac{p}{(a - p)/b} = -\frac{p}{a - p} .$$

Hence, if the price is half of the price ($a$) at which the consumers stop consuming, then $\varepsilon_p^q = -\frac{0,5a}{a - 0,5a} = -1$. If the price is higher, then $|\varepsilon_p^q| > 1$, if lower, then $|\varepsilon_p^q| < 1$ (see also Figure A1.1).
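In the case of a hyperbolic demand curve $p = \frac{a}{q}$ (so that $q = \frac{a}{p}$) the price elasticity of demand is:
$$\varepsilon_p^q = \frac{dq}{dp} \cdot \frac{p}{q} = -\frac{a}{p^2} \cdot \frac{p}{q} = -\frac{a}{pq} = -1 .$$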

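[Figure: two panels in q and p axes. Left: linear demand curve with vertical intercept a, elastic above the price 0,5a, unit elastic at 0,5a and inelastic below it. Right: hyperbolic demand curve, unit elastic at every point]

Figure A1.1. Elasticity of a linear and hyperbolic demand curve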


A graphical method can be used in order to assess the elasticity qualitatively. When rewriting the formula of elasticity we can see that the elasticity actually is a quotient of the marginal and average functions:
$$\varepsilon_x^y = \frac{dy}{dx} \cdot \frac{x}{y} = \frac{dy}{dx} : \frac{y}{x} = \frac{MY}{AY} .$$
From here a simple graphical method for a qualitative assessment of the elasticity of a function can be derived. The qualitative assessment means here: finding an answer to the question whether the function is elastic ($|\varepsilon| > 1$), unit elastic ($|\varepsilon| = 1$) or inelastic ($|\varepsilon| < 1$) at a particular point.

If $|\varepsilon| > 1$, then $|MY| > AY$;

if $|\varepsilon| = 1$, then $|MY| = AY$;

if $|\varepsilon| < 1$, then $|MY| < AY$.


The geometrical interpretation of a derivative (marginal function) is the slope of the tangent line at a particular point where the value of the argument is $x_0$:
$$MY(x_0) = \frac{dy}{dx}(x_0) .$$
The value of an average function for $x_0$:
$$AY(x_0) = \frac{y_0}{x_0}$$
can be graphically interpreted as the slope of a straight line that connects the point $(x_0, y_0)$ and the origin (see Figure A1.2). This line is called the radius vector.
For a function to be elastic ($|\varepsilon| > 1$) at a particular point, $|MY| > AY$ should be true and thus the absolute value of the slope of the tangent line should be larger than the slope of the radius vector of that point (the slope of the radius vector is always positive in the first quadrant). Hence, the tangent line has to be steeper than the radius vector. For unit elasticity ($|\varepsilon| = 1$) the absolute values of the slopes of the tangent line and radius vector should be the same (they either coincide or cross). If the tangent line is flatter than the radius vector, the function is inelastic ($|\varepsilon| < 1$).
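The comparison of the marginal and average functions can also be carried out numerically. The sketch below is an illustration under stated assumptions: the three test functions are hypothetical, the point is assumed to lie in the first quadrant, and the slope of the tangent line is approximated by a finite difference.

```python
import math


def marginal(f, x0, h=1e-6):
    """Approximate slope of the tangent line (MY) at x0."""
    return (f(x0 + h) - f(x0 - h)) / (2 * h)


def average(f, x0):
    """Slope of the radius vector (AY = y0 / x0)."""
    return f(x0) / x0


def assess(f, x0, rel_tol=1e-6):
    """Compare |MY| with AY at a point in the first quadrant."""
    my = abs(marginal(f, x0))
    ay = average(f, x0)
    if math.isclose(my, ay, rel_tol=rel_tol):
        return "unit elastic"
    return "elastic" if my > ay else "inelastic"


print(assess(lambda x: x**2, 2.0))     # tangent steeper than radius vector
print(assess(math.sqrt, 4.0))          # tangent flatter than radius vector
print(assess(lambda x: 3 * x, 2.0))    # tangent coincides with radius vector
```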

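[Figure: two panels in x and y axes, each showing a function with the point (x0, y0), its tangent line and its radius vector]

Figure A1.2. Assessing the elasticity of a function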


In Figure A1.2, the left panel shows a function that is inelastic at the particular point and the right panel shows a function that is elastic at the particular point.
From here, a rule for the linear demand function can also be derived. However, in the case of the
demand function one has always to keep in mind that on the traditional figure (as Figure A1.3) axes
are switched: the value of the function is on the horizontal axis and the value of the argument on the
vertical axis. In the case of a linear demand function the tangent line coincides with the demand
curve. It can be seen from Figure A1.3 that if the price and quantity are both equal to a half of their
maximum values (intercepts), then the absolute value of the elasticity is 1 (the absolute value of the
slope of the demand curve is equal to the slope of the radius vector). For higher prices the relative
change of quantity is larger than the relative price change and the absolute value of the elasticity is
higher than 1 (the demand curve is flatter than the radius vector). For lower prices the function is
inelastic (the demand curve is steeper than the radius vector).


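[Figure: two panels in q and p axes. Left: linear demand curve D with intercepts pmax and qmax, elastic above 0,5pmax, unit elastic at the point (0,5qmax; 0,5pmax) and inelastic below it. Right: hyperbolic demand curve D with |e| = 1 at every point]

Figure A1.3. Qualitative assessment of the elasticity of a linear and hyperbolic demand curve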
In the case of a hyperbolic demand function the absolute values of the slopes in all points are equal
to the slope of a corresponding radius vector. This is in accordance with the constant unit elasticity.

Differentiation makes it possible to prove the well-known relationship from microeconomics between the graphs of the marginal and average functions of a variable (cost, revenue, product etc. functions): the curve of a marginal function $MY$ intersects the curve of an average function $AY$ at the extremum (maximum or minimum) point of the average function $AY$.

The average function is a quotient of the value of the function and the value of the argument:

$$AY = \frac{y(x)}{x} .$$

For finding the extremum of the average function $AY$ the derivative of the average function has to be equal to 0 (see Chapter 5.1 for optimization). According to the rule of differentiating a quotient of functions:

$$\frac{d(AY)}{dx} = \frac{d\left(\frac{y(x)}{x}\right)}{dx} = \frac{y'(x) \cdot x - y(x) \cdot 1}{x^2} = 0 .$$

Hence:

$$\frac{1}{x}\left(y'(x) - \frac{y(x)}{x}\right) = 0$$

or if $x > 0$, then $y'(x) - \frac{y(x)}{x} = 0$ and $y'(x) = \frac{y(x)}{x}$ or

$$MY = AY .$$

Thus, at the extremum point of $AY$ it holds that $MY = AY$.
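The result can be illustrated with a cost function. The following sketch assumes a hypothetical total cost function $C = q^3 - 4q^2 + 10q$ (not taken from the text) and searches a grid of quantities for the minimum of the average cost.

```python
def total_cost(q):
    """Hypothetical total cost function C = q^3 - 4q^2 + 10q."""
    return q**3 - 4 * q**2 + 10 * q


def average_cost(q):
    """Average cost AC = C / q."""
    return total_cost(q) / q


def marginal_cost(q):
    """Marginal cost MC = C'(q) = 3q^2 - 8q + 10."""
    return 3 * q**2 - 8 * q + 10


# Search a grid of quantities 0.01, 0.02, ..., 5.00 for the minimum of AC.
grid = [i / 100 for i in range(1, 501)]
q_star = min(grid, key=average_cost)

print(q_star, average_cost(q_star), marginal_cost(q_star))
```

At the quantity that minimises the average cost, the marginal cost and the average cost take the same value, as the derivation predicts.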


Also, it is possible to assess how the difference between the marginal indicator (or function) and the average indicator (or function) depends on the value of the argument. A marginal function can be found as a derivative of a total function that can be written as a product of the average function and the argument: $y = x \cdot AY(x)$. Thus, according to the rule for differentiating a product of functions:
$$MY(x) = y'(x) = x \cdot AY'(x) + AY(x)$$
and hence, the difference between the marginal indicator and the average indicator is always:
$$x \cdot AY'(x) .$$
This relationship has an interesting application, for example to the revenue function $R(q) = pq$. Here, a monopolist and a firm that operates in perfect competition have to be distinguished. For a monopolist, price is related to the quantity according to the consumers' demand curve $p = p(q)$, thus the revenue function is actually $R(q) = p(q) \cdot q$. The average revenue curve then coincides with the demand curve:
$$AR(q) = \frac{R(q)}{q} = \frac{p(q) \cdot q}{q} = p(q) .$$
Since mostly a negatively sloping demand curve is assumed, $AR$ is negatively sloping as well. Hence, $AR'(q) < 0$. We know that $MR(q) = q \cdot AR'(q) + AR(q)$. Since quantity can also be assumed to be mostly positive, then $MR(q) - AR(q) = q \cdot AR'(q) < 0$ and always
$$MR < AR .$$
In the case of a linear demand curve $p\ (= AR) = a - bq$, the marginal revenue curve starts from the same vertical intercept as the demand curve, but decreases two times faster:
$$MR = q \cdot (-b) + (a - bq) = a - 2bq .$$
In the case of a constantly unit elastic demand curve $p\ (= AR) = \frac{a}{q}$ (hyperbolic graph) the marginal revenue is 0 and thus the revenue is constant (all points on the demand curve give the same revenue for the monopolist):
$$MR = q \cdot \left(-\frac{a}{q^2}\right) + \frac{a}{q} = 0 .$$
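Both results can be verified numerically. The sketch below assumes the hypothetical parameter values $a = 10$ and $b = 0,5$ and approximates the derivative of the revenue function by a finite difference; all names are chosen for illustration.

```python
import math

A, B = 10.0, 0.5   # hypothetical demand parameters a and b


def average_revenue(q):
    """Linear demand curve p = AR = a - bq."""
    return A - B * q


def revenue(q):
    """Revenue R = p(q) * q."""
    return average_revenue(q) * q


def marginal_revenue(q):
    """MR = q * AR'(q) + AR(q) = a - 2bq."""
    return q * (-B) + average_revenue(q)


def numeric_mr(q, h=1e-6):
    """Finite-difference approximation of R'(q)."""
    return (revenue(q + h) - revenue(q - h)) / (2 * h)


def hyperbolic_revenue(q):
    """Revenue on the demand curve p = a / q."""
    return (A / q) * q


for q in (2.0, 4.0, 8.0):
    print(q, average_revenue(q), marginal_revenue(q), round(numeric_mr(q), 6))
print([hyperbolic_revenue(q) for q in (1.0, 2.0, 5.0)])
```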
In general, if the shape of the demand (and average revenue) curve is known, it is possible to derive the shape of the marginal revenue curve using the relationship derived before. Namely, at a particular value of the argument $n$ the difference between the average and marginal revenue is $AR(n) - MR(n) = -n \cdot AR'(n)$. The derivative of the average revenue is the slope of the average revenue curve at the quantity $n$ that can be found as $\frac{dAR}{dq}(n) = -\frac{a}{n}$ (see Figure A1.4). The reasoning behind this quotient can be explained as follows. In the case of the quantity $n$ the slope of an average revenue curve $\frac{dAR}{dq}(n)$ is equal to the slope of a tangent line at the point corresponding to the quantity $n$. The tangent line is a straight line and in the case of a straight line its slope always shows exactly (no estimation errors) the change in the value of a function per unit of change in the argument (in general form $\frac{dy}{dx} = \frac{\Delta y}{\Delta x}$). Because of that, for determining the slope of a straight line one can always choose two points on that line, for example the quantities 0 and $n$, and find the changes that occur when moving from the first point to the second. If quantity increases from 0 to $n$: $\Delta q = n$, then the corresponding change on the vertical axis is $\Delta AR = -a$. Hence, the formula $\frac{dAR}{dq}(n) = -\frac{a}{n}$ can be used here.


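[Figure: AR (demand D) curve in q and AR, MR, p axes, with the tangent line at the quantity n, the points AR(n) and MR(n) and the vertical segment a between them]

Figure A1.4. Finding the points of a marginal revenue curve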


Thus:
$$AR(n) - MR(n) = -n \cdot \left(-\frac{a}{n}\right) = a$$
or: in order to find a point that shows the marginal revenue at the quantity $n$, we should start from the point that shows the average revenue at that quantity. From there we should move down by $a$. Basically, the difference between $AR$ and $MR$ at the quantity $n$ is equal to the segment of the vertical axis between the average revenue at the quantity $n$, $AR(n)$, and the intersection point of the vertical axis and the tangent line (of the average revenue curve at the quantity $n$).
In the case of a linear demand curve the tangent line of the average revenue curve (that is also a
demand curve) coincides with the curve itself and the difference between AR and MR (at a
particular quantity) is equal to the difference between the vertical intercept of AR and the price at
this quantity (See Figure A1.5). In the case of a hyperbolic demand function the segment of the
tangent line between the axes is always divided into halves by the point where this tangent line is
drawn. This means that in that case the marginal revenue curve coincides with the q-axis (See
Figure A1.5).
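[Figure: two panels in q and AR, MR, p axes. Left: linear AR (demand D) curve with the MR curve below it and the segments a and b marked. Right: hyperbolic AR (demand D) curve with MR on the q-axis]

Figure A1.5. Finding the marginal revenue curve in the case of a linear and a constantly unit elastic demand curve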


If a firm operates in perfect competition, the market price is fixed for the firm, $p = p_0$, and the average revenue is equal to that price: $AR = p_0$. At the same time the marginal revenue is also equal to the price (price does not depend on the quantity):
$$MR = q \cdot \frac{dp_0}{dq} + p_0 = q \cdot 0 + p_0 = p_0 .$$
In that case both the marginal revenue and average revenue curves coincide with a horizontal straight line at the level $p = p_0$.
One important application of derivatives is optimization, which is discussed separately.

2.3. Partial derivatives


In economics often a variable is influenced by more than one variable. In that case multivariable
functions are used, where there is more than one argument:
$$z = f(x_1, x_2, \ldots, x_n) .$$
In analysing the impacts, often one is interested in the separate impact of a particular variable
(argument) on the dependent variable while other arguments are held constant. The latter is known
as the ceteris paribus (meaning "with other things the same") principle. For example, one may want to
know, what happens to the demand as price increases, but income does not change; or what happens
to the produced quantity as one worker is added, but no additional machines are used. In
mathematical analysis partial derivatives are used to apply ceteris paribus principle.
Let us assume that a variable $z$ depends on $x$ and $y$: $z = f(x, y)$. The impact of $x$ on $z$ holding $y$ constant can be expressed as $\left.\frac{\Delta z}{\Delta x}\right|_{\Delta y = 0}$. If we know $\Delta x$, then $\Delta z$ can be found as $\Delta z = f(x + \Delta x, y) - f(x, y)$. Again, in the case of more complex functions and relatively small changes it is sensible to find the limit of $\frac{\Delta z}{\Delta x}$ as $\Delta x$ approaches 0 (assuming that $\Delta y = 0$):
$$\lim_{\Delta x \to 0} \frac{\Delta z}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x, y) - f(x, y)}{\Delta x} .$$
This limit is called the partial derivative of function $z = f(x, y)$ with respect to $x$. Similarly to the simple derivative, a partial derivative can also be denoted in different ways. Instead of using a prime ($y'$), in the case of a partial derivative the variable with respect to which the derivative was taken is noted as a subscript: $z_x$. While $f'(x)$ is used for denoting a simple derivative, the corresponding notation for a partial derivative is $f_x$. The notation that is analogous to the notation $\frac{dy}{dx}$ for a simple derivative uses the rounded letter $\partial$ instead of $d$ for differentials: $\frac{\partial z}{\partial x}$.
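The partial derivative of $z$ with respect to $y$ is found and denoted analogously to the partial derivative with respect to $x$:
$$z_x = f_x = \frac{\partial z}{\partial x} = \lim_{\Delta x \to 0} \frac{\Delta z}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x, y) - f(x, y)}{\Delta x} ,$$
$$z_y = f_y = \frac{\partial z}{\partial y} = \lim_{\Delta y \to 0} \frac{\Delta z}{\Delta y} = \lim_{\Delta y \to 0} \frac{f(x, y + \Delta y) - f(x, y)}{\Delta y} .$$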


A partial derivative $\frac{\partial z}{\partial x}$ shows approximately the change of the value of the function ($\Delta z$) that comes with the unitary growth of the argument ($\Delta x = 1$) holding other arguments constant (in our case $\Delta y = 0$): $\frac{\partial z}{\partial x} \approx \left.\frac{\Delta z}{\Delta x}\right|_{\Delta y = 0}$. Also, similarly to the simple derivative, the smaller the change $dx = \Delta x$ given to the argument, the more precisely the actual change in the value of a function $\Delta z$ is estimated by the differential.
The partial derivative assumes holding other variables constant. Hence, when finding partial
derivatives all other variables can be viewed as constants. When finding a partial derivative of
z = f ( x, y ) with respect to x all usual rules of differentiation apply and at that y has to be treated as
a constant (similarly to the numbers possibly included into a function).
For example, given the function $z = 3x^2y + 3x^3 + y - 10$, the partial derivative of $z$ with respect to $x$ is $z_x = 6xy + 9x^2 + 0 - 0$, and with respect to $y$: $z_y = 3x^2 + 0 + 1 - 0$.
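The idea of treating the other argument as a constant can be mirrored in code: only one coordinate of the point is changed at a time. This is a minimal sketch; the helper `partial` is a hypothetical name and uses a finite-difference approximation rather than exact differentiation.

```python
def z(x, y):
    """Example function z = 3x^2*y + 3x^3 + y - 10."""
    return 3 * x**2 * y + 3 * x**3 + y - 10


def z_x(x, y):
    """Partial derivative with respect to x: 6xy + 9x^2."""
    return 6 * x * y + 9 * x**2


def z_y(x, y):
    """Partial derivative with respect to y: 3x^2 + 1."""
    return 3 * x**2 + 1


def partial(f, point, index, h=1e-6):
    """Change only the argument at `index`, holding the others constant."""
    up = list(point)
    down = list(point)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2 * h)


point = (1.0, 1.0)
print(z_x(*point), partial(z, point, 0))
print(z_y(*point), partial(z, point, 1))
```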
Regarding the geometrical interpretation of a partial derivative, the geometrical expression of a
function with two arguments is a surface in a three-dimensional space. For example, one can
imagine a room, where a vertical z-axis is going up from one lower corner and the x- and y-axes are
going from the same corner to the two sides along the edges of the floor. Let us call the walls above
these edges x- and y-walls. A surface has a tangent plane in every point (one can think of a piece of
cardboard touching a ball). Partial derivative of z with respect to x shows the slope of the tangent
plane with respect to x-axis. In imagination, the plane can be divided into straight lines (very thin
strips of cardboard) that are very close to each other (creating thus a plane) and parallel to the x-
wall. If we project one of these lines to the x-wall, the slope of this line on the set of xz-plane (on
the x-wall) is shown by the partial derivative z x . One can imagine looking at one strip of cardboard
(parallel to x-wall) when facing straight to the x-wall; in that case the slope z x can be seen on the
background of the x-wall. Looking straight, the third dimension (y-wall) has no influence, that is in
accordance with the presumption of the partial derivative that at the same time dy = 0 .
Analogically, the partial derivative with respect to y shows the slope of the tangent plane with
respect to y-axis.
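Higher-order partial derivatives are found analogously, for example a second-order partial derivative is found by taking the derivative twice with respect to a particular variable:
$$z_{xx} = f_{xx} = \frac{\partial^2 z}{\partial x^2} = \frac{\partial\left(\frac{\partial z}{\partial x}\right)}{\partial x} ,$$
$$z_{yy} = f_{yy} = \frac{\partial^2 z}{\partial y^2} = \frac{\partial\left(\frac{\partial z}{\partial y}\right)}{\partial y} .$$
Mixed (or cross) partial derivatives are found by differentiating first with respect to one variable and then with respect to another variable:
$$z_{xy} = f_{xy} = \frac{\partial^2 z}{\partial x \partial y} = \frac{\partial\left(\frac{\partial z}{\partial x}\right)}{\partial y} ,$$
$$z_{yx} = f_{yx} = \frac{\partial^2 z}{\partial y \partial x} = \frac{\partial\left(\frac{\partial z}{\partial y}\right)}{\partial x} .$$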


According to Young's theorem, for functions used in economics the result does not depend on the order of differentiation:
$$\frac{\partial\left(\frac{\partial z}{\partial x}\right)}{\partial y} = \frac{\partial\left(\frac{\partial z}{\partial y}\right)}{\partial x} .$$
Given $z = 3x^2y + 3x^3 + y - 10$, for finding $z_{xx}$ a derivative with respect to $x$ is taken again of the first-order derivative with respect to $x$: $z_x = 6xy + 9x^2$, resulting in $z_{xx} = 6y + 18x$. If we take a derivative of $z_y = 3x^2 + 1$ again with respect to $y$, we get $z_{yy} = 0$. The mixed derivative can be found by taking the derivative of $z_x = 6xy + 9x^2$ with respect to $y$ or by differentiating $z_y = 3x^2 + 1$ with respect to $x$. Both methods result in $z_{xy} = z_{yx} = 6x$.
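The independence of the order of differentiation can be checked numerically for this example. The sketch below differentiates the two first-order partial derivatives once more by finite differences; the helper names `diff_x` and `diff_y` are assumptions made for illustration.

```python
def z_x(x, y):
    """First-order partial derivative with respect to x: 6xy + 9x^2."""
    return 6 * x * y + 9 * x**2


def z_y(x, y):
    """First-order partial derivative with respect to y: 3x^2 + 1."""
    return 3 * x**2 + 1


def diff_x(g, x, y, h=1e-6):
    """Finite-difference derivative of g with respect to x, y held constant."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)


def diff_y(g, x, y, h=1e-6):
    """Finite-difference derivative of g with respect to y, x held constant."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)


x0, y0 = 1.0, 1.0
z_xx = diff_x(z_x, x0, y0)   # expected 6y + 18x = 24
z_yy = diff_y(z_y, x0, y0)   # expected 0
z_xy = diff_y(z_x, x0, y0)   # first x, then y: expected 6x = 6
z_yx = diff_x(z_y, x0, y0)   # first y, then x: expected 6x = 6

print(z_xx, z_yy, z_xy, z_yx)
```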
A partial derivative is used in finding marginal functions from multivariable initial functions. For example, let us assume that the quantity $q$ produced in a firm in a period depends both on the number of workers $L$ (labour) and on the number of machines used in production that can be thought of as capital $K$. Let the production function be:
$$q = q(K, L) = 0,5K^2 - 2KL + L^2 .$$
The marginal product of labour $MPL$ shows how the quantity produced changes as the quantity of labour is increased by one unit. $MPL$ can be found by taking the derivative of the production function with respect to $L$:
$$MPL = q_L = -2K + 2L .$$
The marginal product of capital $MPK$ can be found analogously:
$$MPK = q_K = K - 2L .$$
Often, it is possible to create new indicators by combining various marginal indicators. For example, when multiplying the marginal revenue $MR = \frac{dR}{dq}$ by the marginal product of an input (denoted with $x$) $MP_x = \frac{dq}{dx}$, we get the marginal revenue of this input: $MR_x = \frac{dR}{dq} \cdot \frac{dq}{dx} = \frac{dR}{dx}$.
Partial derivatives are also used in calculating elasticities. Let us assume that there are two goods, A and B, and we know that the demanded quantity of A in a period depends on the price of this good $p_A$, the price of the other good $p_B$ and the income $m$ (money):
$$q_A^D = 12 - 4p_A + 0,04m + 0,5p_B .$$
Elasticity of the demand for A with respect to $p_A$ is calculated using a partial derivative in the following way:
$$\varepsilon_A^D = \frac{\partial q_A^D}{\partial p_A} \cdot \frac{p_A}{q_A^D} = -4 \cdot \frac{p_A}{q_A^D} .$$
This elasticity shows the relative change (in percents) in the demanded quantity of A as the price of
this good increases by 1% and the price of another good and income do not change.
The elasticity of the demand for A with respect to the price of the other good is calculated as:
$$\varepsilon_B^D = \frac{\partial q_A^D}{\partial p_B} \cdot \frac{p_B}{q_A^D} = 0,5 \cdot \frac{p_B}{q_A^D} ,$$
and the income elasticity of demand as:
$$\varepsilon_m^D = \frac{\partial q_A^D}{\partial m} \cdot \frac{m}{q_A^D} = 0,04 \cdot \frac{m}{q_A^D} .$$
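The three elasticities can be evaluated at a concrete point. The sketch below is only an illustration: the values $p_A = 2$, $p_B = 4$ and $m = 100$ are hypothetical and are not given in the text.

```python
def demand_a(p_a, p_b, m):
    """Demanded quantity of good A: 12 - 4*p_A + 0.04*m + 0.5*p_B."""
    return 12 - 4 * p_a + 0.04 * m + 0.5 * p_b


def elasticities(p_a, p_b, m):
    """Own-price, cross-price and income elasticities of demand for A."""
    q = demand_a(p_a, p_b, m)
    return {
        "own_price": -4 * p_a / q,       # partial derivative w.r.t. p_A is -4
        "cross_price": 0.5 * p_b / q,    # partial derivative w.r.t. p_B is 0.5
        "income": 0.04 * m / q,          # partial derivative w.r.t. m is 0.04
    }


# Hypothetical values chosen for illustration only.
result = elasticities(p_a=2.0, p_b=4.0, m=100.0)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

With these assumed values the demanded quantity is 10, so a 1% rise in the own price lowers demand by about 0,8%, while a 1% rise in the other price or in income raises it by about 0,2% and 0,4%, respectively.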
Different elasticities are often related to each other. For example, let us look at the elasticity of scale that describes the relative change in the production as all inputs are increased by 1%:
$\varepsilon_n^q = \frac{dq}{dn} \cdot \frac{n}{q}$, where $n$ stands for the scale by which all inputs increase ($\frac{dn}{n} = \frac{dK}{K} = \frac{dL}{L}$).
When $\varepsilon_n^q > 1$, there are increasing returns to scale, when $\varepsilon_n^q < 1$, decreasing returns to scale and when $\varepsilon_n^q = 1$, constant returns to scale. For example, in the case of a Cobb-Douglas type production function $q = K^\alpha L^\beta$, we can see that:

$$q(nK, nL) = (nK)^\alpha (nL)^\beta = n^{\alpha + \beta} K^\alpha L^\beta = n^{\alpha + \beta} q(K, L)$$

and hence, if $\alpha + \beta > 1$, there are increasing returns to scale and if $\alpha + \beta < 1$, decreasing returns to scale. It can be said that this function is homogeneous of degree $\alpha + \beta$.
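In the case of a production function in general, $q = q(K, L)$, the differential of the product is $dq = \frac{\partial q}{\partial K} dK + \frac{\partial q}{\partial L} dL$. As $\frac{dn}{n} = \frac{dK}{K} = \frac{dL}{L}$, then $dK = K \frac{dn}{n}$ and $dL = L \frac{dn}{n}$ and
$$dq = \frac{\partial q}{\partial K} K \frac{dn}{n} + \frac{\partial q}{\partial L} L \frac{dn}{n} .$$
When multiplying both sides by the expression $\frac{1}{dn} \cdot \frac{n}{q}$, we get the elasticity of scale on the left side and the sum of production elasticities with respect to both inputs on the right side:
$$\varepsilon_n^q = \frac{dq}{dn} \cdot \frac{n}{q} = \frac{\partial q}{\partial K} \cdot \frac{dn}{n} \cdot \frac{1}{dn} \cdot \frac{nK}{q} + \frac{\partial q}{\partial L} \cdot \frac{dn}{n} \cdot \frac{1}{dn} \cdot \frac{nL}{q} = \frac{\partial q}{\partial K} \cdot \frac{K}{q} + \frac{\partial q}{\partial L} \cdot \frac{L}{q} = \varepsilon_K^q + \varepsilon_L^q .$$
The latter means that the elasticity of scale can be calculated as the sum of production elasticities with respect to all inputs.
In our example: $\varepsilon_n^q = \varepsilon_K^q + \varepsilon_L^q = \frac{\partial q}{\partial K} \cdot \frac{K}{q} + \frac{\partial q}{\partial L} \cdot \frac{L}{q} = \alpha K^{\alpha - 1} L^\beta \cdot \frac{K}{K^\alpha L^\beta} + \beta K^\alpha L^{\beta - 1} \cdot \frac{L}{K^\alpha L^\beta} = \alpha + \beta$.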
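The result for the Cobb-Douglas function can be illustrated numerically. The sketch below assumes the hypothetical exponents $\alpha = 0,3$ and $\beta = 0,5$ and arbitrary input quantities; none of these values come from the text.

```python
import math

ALPHA, BETA = 0.3, 0.5   # hypothetical exponents of the production function


def output(K, L):
    """Cobb-Douglas production function q = K^alpha * L^beta."""
    return K**ALPHA * L**BETA


def elasticity_capital(K, L):
    """Production elasticity with respect to K: (dq/dK) * K / q."""
    mpk = ALPHA * K ** (ALPHA - 1) * L**BETA
    return mpk * K / output(K, L)


def elasticity_labour(K, L):
    """Production elasticity with respect to L: (dq/dL) * L / q."""
    mpl = BETA * K**ALPHA * L ** (BETA - 1)
    return mpl * L / output(K, L)


K, L, n = 4.0, 9.0, 2.0
scale_elasticity = elasticity_capital(K, L) + elasticity_labour(K, L)

print(scale_elasticity)                      # approximately alpha + beta
print(output(n * K, n * L) / output(K, L))   # approximately n**(alpha + beta)
```

Since the sum of the exponents is below 1 here, the sketch describes decreasing returns to scale: doubling both inputs less than doubles the output.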
K q L q K L K L

If a function has two or more arguments, for example $z = f(x, y)$ or $z = f(x_1, x_2, \ldots, x_n)$, partial derivatives can be found instead of simple derivatives. In the case of a function $z = f(x, y)$, where $z$ is a function and $x$ and $y$ arguments, the partial derivative $z_x = \frac{\partial z}{\partial x}$ is interpreted as an approximate estimation of $\frac{\Delta z}{\Delta x}$ as $\Delta y = 0$ (ceteris paribus principle) (it means: $\frac{\partial z}{\partial x} \approx \left.\frac{\Delta z}{\Delta x}\right|_{\Delta y = 0}$). Here we assume that for arguments $\Delta x = dx$ and $\Delta y = dy$ (they are arguments now), but $dz \approx \Delta z$.
The calculation of partial derivatives is based on the ceteris paribus principle: everything else remains the same. It means that if we are taking a partial derivative of $z$ with respect to $x$, $z_x = \frac{\partial z}{\partial x}$, every other variable (for example $y$) is treated as a constant and vice versa.
Applications of partial derivatives are the same as of simple derivatives.


Again, one important application of partial derivatives is optimization that is discussed separately.

2.4. Total differential and total derivative


If the change in a dependent variable brought about by the changes in all influencing variables is
under consideration, a total differential has to be used.
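Given $z = f(x, y)$, if y does not change ($\Delta y = dy = 0$) and x changes by $\Delta x = dx$, then the change in the value of the function z can be estimated approximately, by analogy with functions of only one variable:
$$dz = \frac{\partial z}{\partial x} dx.$$
If y changes by $\Delta y = dy$ and x does not change ($\Delta x = dx = 0$), the differential of z is:
$$dz = \frac{\partial z}{\partial y} dy.$$
Hence, when both arguments change, the total differential of z is:
$$dz = \frac{\partial z}{\partial x} dx + \frac{\partial z}{\partial y} dy \quad \text{or} \quad dz = \frac{\partial z}{\partial x} \Delta x + \frac{\partial z}{\partial y} \Delta y.$$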
The total differential shows approximately the total change brought about by changes in all
arguments.
For example, given $z = 3x^2 y + 3x^3 + y - 10$ and $x = 1$, $y = 1$, $\Delta x = 0,01$ and $\Delta y = 0,02$, the total differential can be found as follows:
$$\Delta z \approx dz = z_x \Delta x + z_y \Delta y = (6xy + 9x^2) \Delta x + (3x^2 + 1) \Delta y = (6 \cdot 1 \cdot 1 + 9 \cdot 1) \cdot 0,01 + (3 \cdot 1 + 1) \cdot 0,02 = 0,23.$$
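To see how close the total differential is to the actual change, the calculation can be repeated numerically. The sketch below is only an illustration in Python; it assumes that the constant term of the example function is −10 (the constant does not affect the change in z in any case).

```python
def z(x, y):
    """Example function z = 3x^2*y + 3x^3 + y - 10."""
    return 3 * x**2 * y + 3 * x**3 + y - 10


def total_differential(x, y, dx, dy):
    """dz = z_x * dx + z_y * dy with the partial derivatives found by hand."""
    z_x = 6 * x * y + 9 * x**2
    z_y = 3 * x**2 + 1
    return z_x * dx + z_y * dy


x0, y0, dx, dy = 1.0, 1.0, 0.01, 0.02

dz = total_differential(x0, y0, dx, dy)        # approximate change
actual = z(x0 + dx, y0 + dy) - z(x0, y0)       # exact change of the function

print(dz, actual)
```

The two numbers differ slightly, because the total differential is a linear approximation that ignores the curvature of the function.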
In the case of a function with n arguments $z = f(x_1, x_2, \dots, x_n)$ the total differential is found in the following way:
$$dz = \frac{\partial z}{\partial x_1} dx_1 + \frac{\partial z}{\partial x_2} dx_2 + \dots + \frac{\partial z}{\partial x_n} dx_n.$$
Sometimes it may happen that one or more arguments, in turn, depend on one or more other arguments. For example, in addition to the function $z = f(x, y)$, the variable y depends on the variable x: $y = g(x)$. In this case x influences z directly according to the function $z = f(x, y)$ and indirectly through y.
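Dividing the total differential $dz$ by $dx$, we get:
$$\frac{dz}{dx} = \frac{\partial z}{\partial x} \frac{dx}{dx} + \frac{\partial z}{\partial y} \frac{dy}{dx}$$
and since $\frac{dx}{dx} = 1$:
$$\frac{dz}{dx} = \frac{\partial z}{\partial x} + \frac{\partial z}{\partial y} \frac{dy}{dx}.$$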
This result is the total derivative of the function z with respect to x. It shows the change in z that comes with a unit change in x (while no initial change is given to y), taking both the direct and the indirect impact into account. The first term in the formula shows the direct impact of x on z and the second term the indirect impact: the impact of x on y is multiplied by the impact of y on z.
Let us look at one example from microeconomics. Let us assume that the demand for a good depends on the price $p$ and real income $m_r$:
$$q = f(m_r, p),$$
where it is assumed that:
$\frac{\partial q}{\partial m_r} > 0$, since the demanded quantity increases as the (real) income increases;
$\frac{\partial q}{\partial p} < 0$, since the demanded quantity decreases as the price increases (this good is substituted with another);
and the real income depends on the price:
$$m_r = g(p),$$
where $\frac{dm_r}{dp} < 0$, since the real income (goods that can be consumed) decreases as the price increases.
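To find the total impact of the change in price we find the total differential of q:
$$dq = \frac{\partial q}{\partial p} dp + \frac{\partial q}{\partial m_r} dm_r$$
and from that the total impact of p on q:
$$\frac{dq}{dp} = \frac{\partial q}{\partial p} + \frac{\partial q}{\partial m_r} \frac{dm_r}{dp}.$$
As we know the direction (sign) of the relationships between variables, we can analyse whether the total impact is positive or negative:
$$\frac{dq}{dp} = \underbrace{\frac{\partial q}{\partial p}}_{(-)} + \underbrace{\frac{\partial q}{\partial m_r}}_{(+)} \underbrace{\frac{dm_r}{dp}}_{(-)} < 0.$$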

It can be concluded that in this case, taking all relationships into account, when the price increases, the demanded quantity decreases. In microeconomics, the direct impact of price is called the substitution effect and the indirect impact through income is called the income effect.
In more complex cases the total derivative can be found analogically using the total differential by
substituting all derivatives that correspond to non-existing relationships with 0. If there is more than
one independent variable in the system, the total derivative describing their total impact is called
partial total derivative.
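For example, if $z = f(x, y, u, v, w)$, $u = g(x, y)$, $v = h(x)$ and $w = j(y)$, then the total differential of z is:
$$dz = \frac{\partial z}{\partial x} dx + \frac{\partial z}{\partial y} dy + \frac{\partial z}{\partial u} du + \frac{\partial z}{\partial v} dv + \frac{\partial z}{\partial w} dw$$
and dividing by $dx$ gives us the partial total derivative of z with respect to x (while no initial change is given to y):
$$\left.\frac{dz}{dx}\right|_{dy=0} = \frac{\partial z}{\partial x} \frac{dx}{dx} + \frac{\partial z}{\partial y} \frac{dy}{dx} + \frac{\partial z}{\partial u} \frac{du}{dx} + \frac{\partial z}{\partial v} \frac{dv}{dx} + \frac{\partial z}{\partial w} \frac{dw}{dx} = \frac{\partial z}{\partial x} \cdot 1 + 0 + \frac{\partial z}{\partial u} \frac{du}{dx} + \frac{\partial z}{\partial v} \frac{dv}{dx} + 0 = \frac{\partial z}{\partial x} + \frac{\partial z}{\partial u} \frac{du}{dx} + \frac{\partial z}{\partial v} \frac{dv}{dx}.$$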
This partial total derivative can also be denoted as $\frac{\S z}{\S x}$.
As another example, let the demanded quantity $q_1$ depend on the price $p_1$, the price of another good $p_2$ and the real income $m_r$:
$$q_1 = f(m_r, p_1, p_2),$$
where it is assumed that:
$\frac{\partial q_1}{\partial m_r} > 0$, since the demanded quantity increases as the (real) income increases;
$\frac{\partial q_1}{\partial p_1} < 0$, since the demanded quantity decreases as the price increases (this good is substituted with another);
$\frac{\partial q_1}{\partial p_2} > 0$, since the demanded quantity of the other good decreases as its price increases and the other good is substituted with the first good;
and the real income depends on the prices:
$$m_r = g(p_1, p_2),$$
where $\frac{dm_r}{dp_1} < 0$ and $\frac{dm_r}{dp_2} < 0$, since the real income (goods that can be consumed) decreases as the prices increase.
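To find the total impact of the price of the other good on the demanded quantity, $\frac{\S q_1}{\S p_2}$, we find the total differential: $dq_1 = \frac{\partial q_1}{\partial p_1} dp_1 + \frac{\partial q_1}{\partial p_2} dp_2 + \frac{\partial q_1}{\partial m_r} dm_r$. The partial total derivative with respect to $p_2$ is:
$$\frac{\S q_1}{\S p_2} = \frac{\partial q_1}{\partial p_1} \frac{dp_1}{dp_2} + \frac{\partial q_1}{\partial p_2} \frac{dp_2}{dp_2} + \frac{\partial q_1}{\partial m_r} \frac{dm_r}{dp_2} = \frac{\partial q_1}{\partial p_1} \cdot 0 + \frac{\partial q_1}{\partial p_2} \cdot 1 + \frac{\partial q_1}{\partial m_r} \frac{dm_r}{dp_2}$$
or:
$$\frac{\S q_1}{\S p_2} = \underbrace{\frac{\partial q_1}{\partial p_2}}_{(+),\ SE} + \underbrace{\frac{\partial q_1}{\partial m_r} \frac{dm_r}{dp_2}}_{(+)(-),\ IE},$$
where SE denotes the substitution effect and IE the income effect.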

We can see that this impact depends on whether the substitution or the income effect dominates:


if $SE > IE$, then $\frac{\S q_1}{\S p_2} > 0$; as $p_2$ increases, $q_1$ increases and this means that these goods are substitutes for this consumer;
if $SE = IE$, then $\frac{\S q_1}{\S p_2} = 0$; as $p_2$ increases, $q_1$ does not change and these goods are independent for this consumer;
if $SE < IE$, then $\frac{\S q_1}{\S p_2} < 0$; as $p_2$ increases, $q_1$ decreases and this means that these goods are complements for this consumer.
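If the chains of indirect impact are longer, consisting of three or more impacts, then the number of the multiplied derivatives in the terms must be larger, respectively. The procedure can be viewed as a decomposition of the total differential. For example, let us look at the system $z = f(x, y, u)$, $x = g(y)$ and $y = h(u)$. Here, the total impact of u on z can be found as follows:
$$dz = \frac{\partial z}{\partial x} dx + \frac{\partial z}{\partial y} dy + \frac{\partial z}{\partial u} du = \frac{\partial z}{\partial x} \frac{dx}{dy} dy + \frac{\partial z}{\partial y} dy + \frac{\partial z}{\partial u} du,$$
from here:
$$\frac{dz}{du} = \frac{\partial z}{\partial x} \frac{dx}{dy} \frac{dy}{du} + \frac{\partial z}{\partial y} \frac{dy}{du} + \frac{\partial z}{\partial u} \frac{du}{du} = \underbrace{\frac{\partial z}{\partial x} \frac{dx}{dy} \frac{dy}{du} + \frac{\partial z}{\partial y} \frac{dy}{du}}_{\text{indirect}} + \underbrace{\frac{\partial z}{\partial u}}_{\text{direct}},$$
where the first term describes the impact of u on z through x and y, the second term describes the impact of u on z through y (in this case the chain goes directly to z from y and not through x) and the last term describes the direct impact of u on z.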
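The decomposition into direct and indirect impacts can be illustrated with concrete functions. In the Python sketch below the functions f, g and h are hypothetical (chosen only so that the derivatives are easy to write down by hand); the total derivative assembled from the three terms is compared with a numerical derivative of z with respect to u.

```python
def h(u):
    """Hypothetical y = h(u)."""
    return 3 * u


def g(y):
    """Hypothetical x = g(y)."""
    return y ** 2


def f(x, y, u):
    """Hypothetical z = f(x, y, u)."""
    return x * y + u ** 2


def z_of_u(u):
    """z as a function of u only, after following both chains."""
    y = h(u)
    x = g(y)
    return f(x, y, u)


def total_derivative(u):
    """dz/du = z_x * dx/dy * dy/du + z_y * dy/du + z_u."""
    y = h(u)
    x = g(y)
    z_x, z_y, z_u = y, x, 2 * u   # partial derivatives of f
    dx_dy = 2 * y                 # derivative of g
    dy_du = 3                     # derivative of h
    indirect = z_x * dx_dy * dy_du + z_y * dy_du
    direct = z_u
    return indirect + direct


u0 = 2.0
analytic = total_derivative(u0)

step = 1e-6
numeric = (z_of_u(u0 + step) - z_of_u(u0 - step)) / (2 * step)

print(analytic, numeric)
```

For these assumed functions z reduces to 27u³ + u², so the total derivative is 81u² + 2u, which the sum of the three terms reproduces.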

2.5. Derivative of an implicit function


Sometimes functions may be given in an implicit form: the relationship between variables is
described with an equation so that an expression of those variables is on one side and a constant on
the other side of equation: F ( x, y ) = c . In many cases an implicit function can be rewritten as an
explicit function where one variable is solved for others, for example y = f ( x ) . However, in some
cases this is either impossible or inconvenient (for example, $x^2 + y^2 = 9$). In those cases it is still possible to find the derivative $\frac{dy}{dx}$ with the help of a formula that can be derived using the total differential in the following way. First, we differentiate both sides of the implicit function $F(x, y) = c$ using the total differential formula for the left side (the change of a constant is always 0):
$$\frac{\partial F}{\partial x} dx + \frac{\partial F}{\partial y} dy = 0.$$
This can be rewritten as $\frac{\partial F}{\partial y} dy = -\frac{\partial F}{\partial x} dx$ and hence,
$$\frac{dy}{dx} = -\frac{\partial F / \partial x}{\partial F / \partial y} = -\frac{F_x}{F_y}.$$
Here it must be noted that the denominator cannot be 0: $\frac{\partial F}{\partial y} = F_y \neq 0$. If a derivative function has been calculated using this formula, it can include both variables (unlike the usual case where only arguments can be seen in a derivative function).
For example, given $x^2 + y^2 = 9$ the derivative can be found as follows:
$$\frac{dy}{dx} = -\frac{F_x}{F_y} = -\frac{2x}{2y} = -\frac{x}{y}, \quad y \neq 0.$$

In order to calculate a derivative $\frac{dy}{dx}$ of an implicit function $F(x, y) = c$ (if it is complicated to solve it into a form $y = f(x)$), the following formula can be used:
$$\frac{dy}{dx} = -\frac{\partial F / \partial x}{\partial F / \partial y} = -\frac{F_x}{F_y}.$$
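For the circle $x^2 + y^2 = 9$ the implicit-function formula can be compared with the slope of the explicit upper branch $y = \sqrt{9 - x^2}$. The following Python sketch is only an illustration; the evaluation point x = 1 is an arbitrary choice.

```python
import math


def implicit_slope(x, y):
    """dy/dx = -F_x / F_y for F(x, y) = x^2 + y^2."""
    F_x = 2 * x
    F_y = 2 * y
    if F_y == 0:
        raise ValueError("F_y = 0: the formula cannot be used at this point")
    return -F_x / F_y


def upper_branch(x):
    """Explicit form of the upper half of the circle x^2 + y^2 = 9."""
    return math.sqrt(9 - x ** 2)


x0 = 1.0                      # arbitrary point on the x-axis
y0 = upper_branch(x0)         # corresponding point on the circle

slope = implicit_slope(x0, y0)

step = 1e-6
explicit = (upper_branch(x0 + step) - upper_branch(x0 - step)) / (2 * step)

print(slope, explicit)
```

Both values should agree with $-x/y$ evaluated at the chosen point.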
In economics, the derivative of an implicit function is used, for example, in the case of indifference curves. For example, a utility function describes how the utility of a consumer depends on the consumed quantities of two goods $q_1$ and $q_2$: $u = u(q_1, q_2)$. Different consumption bundles with different amounts of goods in them give different utility and those bundles can be depicted in a figure where the quantity of one good is on one axis and the quantity of the other on the other axis (see Figure 2.4). It may happen that, although their composition differs, some bundles provide the same amount of utility. The consumption bundles giving the same utility are all on the same indifference curve. If, for example, the utility function is $u = 12 q_1^{0,6} q_2^{0,7}$, then the equation of the indifference curve corresponding to the utility level 200 is $12 q_1^{0,6} q_2^{0,7} = 200$ and $12 q_1^{0,6} q_2^{0,7} = 300$ corresponds to the utility level 300.

Figure 2.4. Indifference curves (axes: $q_1$ horizontal, $q_2$ vertical; curves $12 q_1^{0,6} q_2^{0,7} = 200$ and $12 q_1^{0,6} q_2^{0,7} = 300$)


In economics, sometimes the slope of an indifference curve at a particular point is of interest. This can be found as the slope of a tangent line or derivative. In our case with $q_2$ on the vertical axis and $q_1$ on the horizontal axis, the derivative $\frac{dq_2}{dq_1}$ is needed. Using the formula of the derivative of an implicit function gives:
$$\frac{dq_2}{dq_1} = -\frac{\partial u / \partial q_1}{\partial u / \partial q_2} = -\frac{MU_1}{MU_2},$$
where $MU_1$ stands for the marginal utility of the first good and $MU_2$ stands for the marginal utility of the second good. In our case:
$$\frac{dq_2}{dq_1} = -\frac{0,6\, q_1^{-0,4} q_2^{0,7}}{0,7\, q_1^{0,6} q_2^{-0,3}} = -\frac{6 q_2}{7 q_1}.$$
The slope of an indifference curve also gives the marginal rate of substitution. For example, in the case of the function $z = f(x, y)$ the marginal rate of substitution shows the change in y ($\Delta y$) that is needed as x increases by one unit ($\Delta x = 1$) in order to hold z constant. The value of z does not change if the total differential is equal to 0:
$$dz = \frac{\partial z}{\partial x} dx + \frac{\partial z}{\partial y} dy = 0.$$
Hence, we can find the change in y needed to hold z constant for a unit change in x:
$$\frac{\Delta y}{\Delta x} \approx \frac{dy}{dx} = -\frac{z_x}{z_y}.$$

Given $u = 12 q_1^{0,6} q_2^{0,7}$, the change in the quantity of the second good needed to hold utility constant when the quantity of the first good is increased by one unit is:
$$\frac{\Delta q_2}{\Delta q_1} \approx -\frac{MU_1}{MU_2} = -\frac{6 q_2}{7 q_1}.$$
If the consumer has 10 units of both goods and needs two additional units of the first good, then what has to happen to the quantity of the second good to keep the same utility level? This can be found in the following way:
$$\Delta q_2 \approx -\frac{6 q_2}{7 q_1} \Delta q_1 = -\frac{6 \cdot 10}{7 \cdot 10} \cdot 2 \approx -1,71.$$
Hence, the consumer has to give up approximately 1,71 units of the second good.
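Because the marginal rate of substitution is a slope at a point, the result for a change of two units is a linear approximation. The Python sketch below (an illustration only, with helper names chosen here) compares it with the exact change obtained by solving the indifference curve equation for $q_2$ at the new value of $q_1$.

```python
def utility(q1, q2):
    """u = 12 * q1^0.6 * q2^0.7 from the example."""
    return 12 * q1 ** 0.6 * q2 ** 0.7


def mrs_slope(q1, q2):
    """Slope of the indifference curve, dq2/dq1 = -6*q2 / (7*q1)."""
    return -(6 * q2) / (7 * q1)


def q2_on_curve(q1, level):
    """Solve 12 * q1^0.6 * q2^0.7 = level for q2."""
    return (level / (12 * q1 ** 0.6)) ** (1 / 0.7)


q1, q2, dq1 = 10.0, 10.0, 2.0

linear_change = mrs_slope(q1, q2) * dq1            # approximation from the slope
level = utility(q1, q2)                            # utility level to be kept
exact_change = q2_on_curve(q1 + dq1, level) - q2   # movement along the curve

print(linear_change, exact_change)
```

Since the indifference curve is convex, the exact amount of the second good that has to be given up is somewhat smaller than the linear approximation suggests.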
A function of the form $z = c\, x^a y^b$, where a, b and c are positive parameters, is called a Cobb-Douglas type function and is often used in economics as a utility, production or other function.
The formation of the indifference curves can be described as follows. In general, in economics, a
situation can be described with the help of a function of two arguments z = f ( x, y ) . The value of
the function z can take on different values and here, the relationship between x and y is of interest at
a fixed value of z. In every case of a fixed value of z, we have an implicit function F ( x, y ) = c that
determines the relationship between x and y. All relationships corresponding to different constant
values of z can be depicted on the xy-plane as the indifference curves. The formation of these
curves can be imagined as follows.
The geometrical representation of the function $z = f(x, y)$ in three-dimensional space is often a mountain or a valley. A well-known example is the utility mountain. In the case of the mountain depicted in Figure A1.6, every value of z corresponds to a horizontal cut (parallel to the xy-plane): the higher the cut, the larger the value of z. When the intersection lines (circles) of these planes and the surface $z = f(x, y)$ are projected (dropped vertically) onto the xy-plane, the situation is depicted on the xy-plane (only two dimensions are needed).

Figure A1.6. The horizontal cuts of a surface and their projections on the xy-plane
Now, there are curves on the xy-plane (circles in our case) and every curve incorporates those bundles of the values of x and y that give a certain value of z. Those curves are called indifference curves. The slope of these indifference curves on the xy-plane can be calculated with the help of the implicit function derivative rule: $\frac{dy}{dx} = -\frac{F_x}{F_y} = -\frac{z_x}{z_y}$.
Often only a part of the indifference curves is actually shown in traditional figures in economics. For example, in the case of the utility function $u = u(q_1, q_2)$, it is assumed that households do not operate in the areas (quadrants) where additional consumption decreases utility. That means that always $MU_1, MU_2 > 0$ and thus on the $q_1 q_2$-plane the slope of the indifference curve is always negative:
$$\frac{dq_2}{dq_1} = -\frac{\overbrace{MU_1}^{(+)}}{\underbrace{MU_2}_{(+)}} < 0.$$

This condition is satisfied only in two quadrants of the $q_1 q_2$-plane (upper right and lower left in Figure A1.7); in one of them the curves are convex and in the other concave. Why one of them is not suitable can be seen by looking at the mountain from the sky. When moving up in the figure, the utility is first increasing and then decreasing after the dotted line. When moving to the right in the figure, again the utility is first increasing and then decreasing. Hence, the only part where both marginal utilities are positive is on the lower left, where the indifference curves are negatively sloping and convex.

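Figure A1.7. The utility mountain from the sky (axes: $q_1$ horizontal, $q_2$ vertical; upper left: $MU_1 > 0$, $MU_2 < 0$; upper right: $MU_1 < 0$, $MU_2 < 0$; lower left: $MU_1 > 0$, $MU_2 > 0$; lower right: $MU_1 < 0$, $MU_2 > 0$)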


Here, it has to be noted that in the case of Cobb-Douglas type functions $z = x^{\alpha} y^{\beta}$ the geometrical representation is not a mountain in the sense used before, but rather an infinitely increasing surface (maximization is possible only with constraints). In that case one can imagine a quarter of the base of the mountain described before (without any maximum point). The intersections of the horizontal cuts with this surface then appear as the arcs that we are used to seeing in figures in microeconomics (see Figure A1.8).

Figure A1.8. The horizontal cuts of the infinitely increasing surface and their projections on the xy-plane

3. INTEGRATION

3.1. Indefinite integral


Integration is the opposite operation of differentiation. Let us consider a function $y = F(x) + C$ (where C is the constant part of the function and $F(x)$ the part that depends on x) and denote its derivative as $y' = F'(x) = f(x)$. The opposite operation, taking an integral of the derivative $f(x)$, is mathematically formulated in the following way:
$$\int f(x)\, dx.$$
There is an integral sign before the function that is integrated and the notation also contains the differential of the variable with respect to which the function is integrated. Integration should bring us back to the initial function $y = F(x) + C$. If there is a constant in the initial function, taking the derivative of it gives 0, which means that the constant of the initial function cannot be determined on the basis of the derivative function alone. Therefore, in mathematics the constant of integration C is always added to represent the constant of the initial function (if the value of this constant is not known):
$$\int f(x)\, dx = F(x) + C.$$
Knowing that $f(x)\, dx = dy = dF(x)$, it can also be written that:
$$\int dF(x) = F(x) + C,$$
confirming that we are dealing with opposite operations.
The rules of integration can be derived from the rules of differentiation. Here are the rules for integrating the functions usually used in economics:
$$\int 0\, dx = C,$$
$$\int a x^b\, dx = a \frac{x^{b+1}}{b+1} + C \quad (b \neq -1),$$
a special case of that is $\int dx = \int 1\, dx = \int x^0\, dx = x + C$,
$$\int e^x\, dx = e^x + C,$$
$$\int a^x\, dx = \frac{a^x}{\ln a} + C = a^x \log_a e + C,$$
$$\int \frac{1}{x}\, dx = \ln x + C.$$
Integration allows us to find the initial or total function from a marginal function. If a marginal function is given, then knowing that a marginal function has been found by differentiating the initial function, this initial function can be found by using the opposite operation. Whereas in mathematics a constant of integration is added in the general form C, in economics the value of this constant is often known or there is a notation (symbol) that can be used to indicate the interpretation of the constant.
Let the marginal revenue function be $MR = 100 - 8q$, the marginal cost function $MC = 9q^2 - 36q + 40$ and fixed costs equal to 200. Finding the total cost function:
$$C = \int MC\, dq = \int (9q^2 - 36q + 40)\, dq = 3q^3 - 18q^2 + 40q + const.$$
Since we know that fixed costs are equal to 200, we can rewrite:
$$C = 3q^3 - 18q^2 + 40q + 200.$$


If the size of fixed costs is not known, they can be denoted with FC as follows (using the general notation C instead would be especially confusing here):
$$C = 3q^3 - 18q^2 + 40q + FC.$$
We can also find the total revenue function:
$$R = \int MR\, dq = \int (100 - 8q)\, dq = 100q - 4q^2 + const.$$
If no units are sold ($q = 0$), there will be no revenue either; hence, the constant of this function is 0:
$$R = 100q - 4q^2.$$
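Integrating a polynomial marginal function term by term is mechanical, so it can be sketched in a few lines of Python. The helper functions below are hypothetical names introduced only for this illustration; a polynomial is stored as a list of coefficients in increasing powers of q, and the known constant (fixed costs, or zero revenue at zero sales) is supplied separately.

```python
def integrate_polynomial(coeffs, constant=0.0):
    """Antiderivative of sum(coeffs[k] * q**k), with the given constant term."""
    return [constant] + [c / (k + 1) for k, c in enumerate(coeffs)]


def evaluate(coeffs, q):
    """Value of the polynomial sum(coeffs[k] * q**k) at q."""
    return sum(c * q ** k for k, c in enumerate(coeffs))


marginal_cost = [40.0, -36.0, 9.0]     # MC = 9q^2 - 36q + 40
marginal_revenue = [100.0, -8.0]       # MR = 100 - 8q

# Fixed costs of 200 become the constant of the total cost function;
# revenue is 0 when nothing is sold, so its constant is 0.
total_cost = integrate_polynomial(marginal_cost, constant=200.0)
total_revenue = integrate_polynomial(marginal_revenue, constant=0.0)

print(total_cost)      # coefficients of C in increasing powers of q
print(total_revenue)   # coefficients of R in increasing powers of q
print(evaluate(total_cost, 2), evaluate(total_revenue, 2))
```

Read in increasing powers of q, the resulting coefficient lists correspond to $C = 200 + 40q - 18q^2 + 3q^3$ and $R = 100q - 4q^2$.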
Integration is the opposite operation of taking derivatives. Assume we have a function $y = F(x) + C$ (we separate the constant C from the remaining function). This is our starting or initial function. If we take a derivative of it, we get $y' = F'(x) + 0$. Let us assume $F'(x) = f(x)$. Then: $y' = F'(x) + 0 = f(x) + 0$. If we now integrate this function, we have to get back the initial function: $\int f(x)\, dx = \int F'(x)\, dx = F(x) + C$. Hence, the constant C represents the constant of the initial function.
If we know the marginal function $MY = y' = F'(x) + 0 = f(x) + 0$, we can find the total function by integrating the marginal function: $y = \int MY\, dx = \int f(x)\, dx = \int F'(x)\, dx = F(x) + C$.

In the case of functions used in economics the value of the constant of integration (constant of the initial function) mostly shows the value of the initial function if the argument is equal to 0. For example, if the initial function is $y = ax^2 + bx + c$, its derivative is $y' = 2ax + b$ and integrating this gives us $\int y'\, dx = \int (2ax + b)\, dx = ax^2 + bx + C$. The constant of integration (and the constant of the initial function) $C = c$ is equal to the value of the initial function for the value of the argument equal to 0: $y(0) = a \cdot 0^2 + b \cdot 0 + c = c$.

That is not always the case, however. For example, if the initial function is $y = ae^x + b$, its derivative is $y' = ae^x$ and integrating this gives us $\int y'\, dx = \int ae^x\, dx = ae^x + C$. In that case the constant of integration (and the constant of the initial function) $C = b$ is not equal to the value of the initial function for the value of the argument equal to 0: $y(0) = ae^0 + b = a + b$.
A relationship similar to that between the total and marginal functions can be found for other variables as well. Although the names of these variables do not refer so obviously to their relationship, the nature of the relationship is similar: one variable describes the change in the other, that is, the amount added to it. For example, the value of capital is formed with the help of investment: the investment of a particular period is equal to the capital that is added in that period:
$$I(t) = \frac{dK}{dt} = \dot{K} \quad \text{or} \quad K(t) = \int I(t)\, dt$$
(the dot over the symbol K denotes the derivative with respect to time; see the chapter about the growth rate).


3.2. Definite integral


It appears that the concept of integral can be used to calculate the area under a graph. This can be
done with the help of a definite integral. The area under the graph (curved trapezoid) is determined
as the area between the graph of a function and the horizontal axis and also between the lower and
upper limits (a and b respectively, see Figure 3.1).

Figure 3.1. Area under a graph (the area between the graph of $f(x)$ and the x-axis from $a$ to $b$)


In the notation of a definite integral the lower and upper limits are placed below and above the integral sign, respectively. A definite integral always has some numeric or parametric value that can be found using the following formula:
$$\int_a^b f(x)\, dx = F(x) \Big|_a^b = F(b) - F(a).$$
First, the function is integrated. Next, the values of the resulting function at the lower limit and at the upper limit are calculated and then the former is subtracted from the latter. This difference can also be denoted with a vertical line with the lower and upper limits: $F(x) \Big|_a^b$.

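A definite integral is defined by the lower and upper limits: $\int_a^b f(x)\, dx = F(x) \Big|_a^b = F(b) - F(a)$ (see the upper figure) and its geometrical representation is an area between the curve and an axis between the limits (lower figure).

[Figure, upper panel: the graph of $y = F(x) + C$ with the vertical distance $F(b) - F(a)$ between $a$ and $b$; lower panel: the graph of $y' = F'(x) = f(x)$ with the area $F(b) - F(a)$ between $a$ and $b$]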


As $F(b) - F(a) = F(b) + C - (F(a) + C)$, the constant C is usually left out of the calculations.


A definite integral can be used, for example, for estimating the welfare in microeconomics, more precisely for estimating the consumers' surplus and producers' surplus that together give an estimation of the welfare arising from the market of a particular good. In the market equilibrium model (see Figure 3.2), all consumers can buy at the equilibrium price $p^*$. Consumers who were willing to pay more experience a surplus. The sum of the surpluses of all consumers can be geometrically depicted as the area between the demand curve and the horizontal line at the equilibrium price level, between 0 and the equilibrium quantity as the upper limit.

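Figure 3.2. Consumers' and producers' surpluses (axes: $q$ horizontal, $p$ vertical; the demand curve D starting from $p_1$ and the supply curve S starting from $p_0$ intersect at the equilibrium E with price $p^*$ and quantity $q^*$; the consumers' surplus lies above and the producers' surplus below the price line $p^*$)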


At the same time, producers supply larger quantities at higher prices according to the supply function. A higher price enables producers with higher production costs to participate as well. However, all units are sold for the equilibrium price and hence, for some producers the price is higher than the marginal cost of production. This difference is called producers' surplus. The sum of the surpluses of all producers can be geometrically depicted as the area between the supply curve and the horizontal line at the equilibrium price level, between 0 and the equilibrium quantity as the upper limit.
If the demand and supply functions are not linear, a definite integral has to be used for calculating the surpluses.
In most cases there are two options. For example, for calculating consumers' surplus, the area under the demand curve can be calculated first and then the area of the rectangle representing the expenditures of consumers can be subtracted from it:
$$S_{consumer} = \int_0^{q^*} p_D(q)\, dq - p^* q^*.$$
For that we need to find a definite integral of the inverse demand function (p expressed in terms of q), as p is on the vertical axis and the limits are determined on the q-axis (this makes it possible to insert the limit values, which are actually values of q, into the integrated function). The demand curve shows the price that consumers are willing to pay at a particular quantity, and this price can be viewed as an estimation of marginal utility in monetary units. Hence, this method subtracts consumers' expenditures from the utility in monetary units.
Another option is to calculate directly the area of interest (between the demand curve and the vertical axis) by integrating the demand function (q expressed in terms of p) and determining the limits on the p-axis ($p^*$ and $p_1$):
$$S_{consumer} = \int_{p^*}^{p_1} q_D(p)\, dp.$$


There are analogous options for the producers' surplus. According to the first option, the area under the supply curve can be subtracted from the rectangle representing the producers' revenue $R = p^* q^*$. For that we need to find a definite integral between 0 and $q^*$ of the inverse supply function (p expressed in terms of q):
$$S_{producer} = p^* q^* - \int_0^{q^*} p_S(q)\, dq.$$
The supply curve shows the price the producers are willing to accept at a particular quantity, which comes from the marginal costs (a supply curve is also a marginal cost curve in most cases). Hence, this method subtracts the variable costs (fixed costs do not affect the marginal cost) from the revenue.
Another option is to calculate directly the area between the supply curve and the vertical axis by integrating the supply function (q expressed in terms of p) and determining the limits on the p-axis ($p_0$ and $p^*$):
$$S_{producer} = \int_{p_0}^{p^*} q_S(p)\, dp.$$

For example, let the inverse demand function be $p_D = 40 - 3q$ and the inverse supply function $p_S = 0,5q^2 - q + 10$. In order to find the equilibrium point we can use the condition:
$p_D = p_S$, hence $40 - 3q = 0,5q^2 - q + 10$ and $0,5q^2 + 2q - 30 = 0$.
The solutions of this quadratic equation are $q_1 = -10$ and $q_2 = 6$; however, the equilibrium quantity cannot be negative. The price corresponding to $q^* = 6$ is $p^* = 40 - 3 \cdot 6 = 22$.
The situation is depicted in Figure 3.3. The demand curve is a negatively sloping straight line with the slope $-3$ and the p-intercept 40. The p-intercept of the parabolic graph of the supply curve can be found by taking the quantity equal to 0: $p = 0,5 \cdot 0^2 - 0 + 10 = 10$. For the value $q = 1$ the price is smaller: $p = 0,5 \cdot 1^2 - 1 + 10 = 9,5$. Hence, the supply curve is first negatively and then positively sloping.

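Figure 3.3. Consumers' and producers' surpluses in the example (demand curve D with the p-intercept 40, supply curve S with the p-intercept 10, equilibrium E at $q = 6$ and $p = 22$)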


Consumers' surplus can be found as the area of a triangle:
$$S_{consumer} = \frac{(40 - 22) \cdot 6}{2} = 54.$$
When using the integral, the monetary utility is:
$$\int_0^6 (40 - 3q)\, dq = \left(40q - \frac{3q^2}{2}\right) \Big|_0^6 = 40 \cdot 6 - \frac{3 \cdot 6^2}{2} - \left(40 \cdot 0 - \frac{3 \cdot 0^2}{2}\right) = 186,$$
consumers' expenditures are $p^* q^* = 22 \cdot 6 = 132$, and consumers' surplus is:
$$S_{consumer} = 186 - 132 = 54.$$
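For the other option, the demand function should be found: $q = \frac{40}{3} - \frac{1}{3} p$. Then the consumers' surplus is:
$$S_{consumer} = \int_{22}^{40} \left(\frac{40}{3} - \frac{1}{3} p\right) dp = \left(\frac{40p}{3} - \frac{p^2}{6}\right) \Big|_{22}^{40} = \frac{40 \cdot 40}{3} - \frac{40^2}{6} - \left(\frac{40 \cdot 22}{3} - \frac{22^2}{6}\right) = 54.$$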
The producers' surplus can only be calculated with the help of integrals. As the inverse supply function is a quadratic function, it is rather inconvenient to find the supply function (solve it for q). Hence, it is sensible to calculate the area under the supply curve first:
$$\int_0^6 (0,5q^2 - q + 10)\, dq = \left(\frac{q^3}{6} - \frac{q^2}{2} + 10q\right) \Big|_0^6 = \frac{6^3}{6} - \frac{6^2}{2} + 10 \cdot 6 - \left(\frac{0^3}{6} - \frac{0^2}{2} + 10 \cdot 0\right) = 78,$$
and then to subtract it from the revenue $R = p^* q^* = 22 \cdot 6 = 132$ to get the producers' surplus:
$$S_{producer} = 132 - 78 = 54.$$
It is worth noting here that the producers' surplus is not equal to the profit, but is larger because of the fixed costs: $\pi = R - VC - FC = S_{producer} - FC$.
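The surplus calculations of this example can be reproduced numerically. The Python sketch below is only an illustration: it solves the equilibrium condition with the quadratic formula and approximates the definite integrals with a simple midpoint rule (the function `integrate` and the number of steps are choices made here, not part of the method described above).

```python
import math


def inverse_demand(q):
    """p_D = 40 - 3q."""
    return 40 - 3 * q


def inverse_supply(q):
    """p_S = 0.5q^2 - q + 10."""
    return 0.5 * q ** 2 - q + 10


def integrate(func, lower, upper, steps=10000):
    """Midpoint-rule approximation of the definite integral of func."""
    width = (upper - lower) / steps
    return sum(func(lower + (i + 0.5) * width) for i in range(steps)) * width


# Equilibrium condition p_D = p_S rearranged to 0.5q^2 + 2q - 30 = 0
a, b, c = 0.5, 2.0, -30.0
disc = b * b - 4 * a * c
roots = [(-b - math.sqrt(disc)) / (2 * a), (-b + math.sqrt(disc)) / (2 * a)]
q_star = max(roots)                 # the negative root is not meaningful
p_star = inverse_demand(q_star)

revenue = p_star * q_star
consumer_surplus = integrate(inverse_demand, 0, q_star) - revenue
producer_surplus = revenue - integrate(inverse_supply, 0, q_star)

print(q_star, p_star, consumer_surplus, producer_surplus)
```

The numerical values should match the results obtained above by hand up to the small error of the midpoint rule.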
Also, if, for example, the demand curve has no intersection point with the price axis (for example, a hyperbolic demand curve $p = \frac{a}{q}$ or $q = \frac{a}{p}$), then the area that estimates the consumers' surplus is not a finite number and the consumers' surplus cannot be calculated in this case. For the hyperbolic function, using the concept of limits (the logarithm of 0 or of infinity is not defined) gives us:
$$S_{consumer} = \int_0^{q^*} \frac{a}{q}\, dq - p^* q^* = a \ln q \Big|_0^{q^*} - p^* q^* = a(\ln q^* - \ln 0) - p^* q^* = \lim_{b \to 0} a(\ln q^* - \ln b) - p^* q^* = a(\ln q^* - (-\infty)) - p^* q^* = \infty$$
or
$$S_{consumer} = \int_{p^*}^{\infty} \frac{a}{p}\, dp = a \ln p \Big|_{p^*}^{\infty} = a(\ln \infty - \ln p^*) = \lim_{b \to \infty} a(\ln b - \ln p^*) = a(\infty - \ln p^*) = \infty.$$

The procedure depends on which axis, together with the curve, bounds the area to be calculated.

[Figure, left panel: the area between a curve and the x-axis from $a$ to $b$; right panel: the area between a curve and the y-axis from $c$ to $d$]

If we are calculating the area between the curve and the horizontal (x-)axis, then the limits are defined on the horizontal (x-)axis, the function has to be a function of x, and we integrate with respect to x:
$$\int_a^b y(x)\, dx = \dots \Big|_a^b = \dots$$
If we are calculating the area between the curve and the vertical (y-)axis, then the limits are defined on the vertical (y-)axis, the function has to be a function of y, and we integrate with respect to y:
$$\int_c^d x(y)\, dy = \dots \Big|_c^d = \dots$$

As a different example, the definite integral makes it possible to find the size of the capital accumulated in a time period as a result of the investments. For example, the capital accumulated between the periods $t_1$ and $t_2$ can be found as follows:
$$\int_{t_1}^{t_2} I(t)\, dt = K(t) \Big|_{t_1}^{t_2} = K(t_2) - K(t_1).$$
A definite integral between 0 and a time point n gives the size of the capital that has been accumulated in n periods:
$$\int_0^n I(t)\, dt = K(t) \Big|_0^n = K(n) - K(0)$$
and hence, the total value of the capital at a particular time point n is:
$$K(n) = \int_0^n I(t)\, dt + K(0).$$

If, for example, the initial capital is 100 and the investments depend on time as follows: $I = 30t^{0,5}$, then the value of the capital after 10 periods is:
$$K(10) = \int_0^{10} 30t^{0,5}\, dt + 100 = 20t^{1,5} \Big|_0^{10} + 100 = 20 \cdot 10^{1,5} - 0 + 100 \approx 20 \cdot 31,623 + 100 \approx 732,5.$$
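The capital accumulation example can be checked both with the antiderivative and with a direct numerical summation of the investment flow. The Python sketch below is only illustrative; the midpoint rule and the number of steps are assumptions made here for the check.

```python
def investment(t):
    """Investment flow I(t) = 30 * t^0.5."""
    return 30 * t ** 0.5


def capital(n, initial_capital=100.0):
    """K(n) = 20 * n^1.5 + K(0), using the antiderivative of I(t)."""
    return 20 * n ** 1.5 + initial_capital


def capital_numeric(n, initial_capital=100.0, steps=100000):
    """K(n) from a midpoint-rule approximation of the integral of I(t)."""
    width = n / steps
    accumulated = sum(investment((i + 0.5) * width) for i in range(steps)) * width
    return accumulated + initial_capital


k_exact = capital(10)
k_numeric = capital_numeric(10)

print(k_exact, k_numeric)
```

Both values should be close to the result of about 732,5 obtained above.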

3.3. Relationship between total and marginal function


A definite integral makes it possible to clarify the relationship between the total and marginal functions. Let us examine a function, for example the utility function $u = u(q)$, as depicted in Figure 3.4. At a particular value $q = n$ the value of the utility function $u = u(n)$ shows the total utility received from consuming the amount $q = n$. A decreasing linear marginal utility function $MU = MU(q)$ corresponds to this utility function. At a particular value $q = n$ the value of the marginal utility function $MU_n = MU(n)$ shows the utility received from consuming the n-th (last) unit. If we sum all marginal utilities obtained from all additional units starting from the first unit until the last, n-th unit, the total utility of n units can be found as a result:
$$MU_1 + MU_2 + \dots + MU_n = \sum_{i=1}^{n} MU_i = u(n).$$

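Figure 3.4. Graphs of total and marginal utility (upper panel: the total utility curve with the height $u(n)$ at $q = n$; lower panel: the marginal utility curve MU with the value $MU(n)$ at $q = n$ and the area $u(n)$ under the curve between 0 and $n$)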


In the case of a continuous utility function the additional units are infinitely small and in that case the sum $\sum_{i=1}^{n}$ is replaced by the definite integral $\int_0^n$ (we measured the quantities in whole numbers before, so the segment on the q-axis from 0 to 1 represented the first unit; hence, in the case of a continuous scale of quantity, the corresponding domain starts from 0). Hence:
$$\int_0^n MU(q)\, dq = u(n).$$
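The step from the sum to the integral can be illustrated numerically. The Python sketch below uses a hypothetical utility function $u(q) = 10q - 0,5q^2$ with the linear, decreasing marginal utility $MU(q) = 10 - q$ (both are assumptions made only for this illustration). The marginal utility of the i-th whole unit is taken as the increase in utility from that unit, and the integral is approximated with a midpoint rule.

```python
def utility(q):
    """Hypothetical total utility u(q) = 10q - 0.5q^2, with u(0) = 0."""
    return 10 * q - 0.5 * q ** 2


def marginal_utility(q):
    """Derivative of the hypothetical utility function: MU(q) = 10 - q."""
    return 10 - q


n = 6

# Discrete case: utility added by each whole unit, from the first to the n-th
discrete_sum = sum(utility(i) - utility(i - 1) for i in range(1, n + 1))

# Continuous case: area under the marginal utility curve between 0 and n
steps = 1000
width = n / steps
integral = sum(marginal_utility((k + 0.5) * width) for k in range(steps)) * width

print(discrete_sum, integral, utility(n))
```

Under these assumptions both the sum over whole units and the area under the marginal utility curve reproduce the total utility of n units.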

So, the area under the marginal utility curve between 0 and n gives us the total utility of n units. That is because the utility of 0 units is assumed to be 0. Analogically, summing the marginal revenues obtained from all additional units sold, starting from the first unit until the last, n-th unit, gives us the total revenue of the quantity n. Since the revenue is 0 when nothing is sold, there is no constant in the revenue function and the area under the marginal revenue curve between 0 and n gives us directly the total revenue of n units. Summing the marginal costs needed for all additional units from the first unit until the last, n-th unit, gives us a function that describes the quantity-dependent part of the cost function, that is, the variable costs for the quantity n. The fixed costs are described by the constant of the total cost function and hence, they are not reflected in the marginal cost function. So, the area under the marginal cost curve (mostly coinciding with the supply curve) between 0 and n gives us not the total cost of n units, but the variable cost of n units. Summing the marginal profits of all additional production units from the first unit until the n-th unit gives us the quantity-dependent part of the profit function (the difference between the revenue and costs) that is actually equal to the producers' surplus. So the area under the marginal profit curve between 0 and n gives us not the total profit of n units, but the producers' surplus (revenue less variable costs) of n units.
In general: the definite integral of a marginal function between 0 and a particular value of the
argument x0 gives us the value of the initial or total function at x0 , but without the constant of the
initial function (in the case of the utility function constant is usually 0, as no utility is obtained when
nothing is consumed):
\[
\int_0^{x_0} f(x)\,dx = F(x_0), \quad \text{where } F'(x) = f(x).
\]
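As a numerical illustration, here is a minimal Python sketch. The marginal cost function MC(q) = 3q² + 2 and the fixed cost of 50 are hypothetical, chosen only for this example; the matching total cost function is TC(q) = q³ + 2q + 50. The sketch approximates the area under the marginal cost curve with the midpoint rule and compares it with the variable cost.

```python
def marginal_cost(q):
    # Hypothetical marginal cost function (an assumption for illustration).
    return 3 * q**2 + 2

def total_cost(q, fixed_cost=50):
    # Total cost whose derivative is marginal_cost; fixed_cost is the constant.
    return q**3 + 2 * q + fixed_cost

def area_under(f, a, b, steps=100_000):
    # Midpoint-rule approximation of the definite integral of f from a to b.
    width = (b - a) / steps
    return sum(f(a + (k + 0.5) * width) for k in range(steps)) * width

n = 10
variable_cost = area_under(marginal_cost, 0, n)
print(round(variable_cost, 4))        # area under MC between 0 and n
print(total_cost(n) - total_cost(0))  # variable cost: TC(n) minus the fixed cost
```

The two printed values agree, while total_cost(n) itself is larger by the fixed cost, which the marginal function does not reflect.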

In graphical context that can be described as follows for the example of utility. A marginal utility of
a unit of good is represented as a vertical line from the q-axis to the marginal utility curve. The
marginal utilities of all infinitely small quantities added can be imagined as lines closely next to
each other that are creating an area. Summing marginal utilities of all infinitely small quantities
added from 0 to n gives an area between the q-axis and the marginal utility curve corresponding to
the total utility of n units (lower part of Figure 3.4) that also can be depicted as a vertical line
segment between the q-axis and the total utility curve (upper part of Figure 3.4). Hence, if the
figure is drawn to scale, the area under the marginal utility curve has to be equal to the height of the
total utility function at n.
If the argument is time t, then
\[
F(t) = \int_0^t f(t)\,dt + F(0).
\]

Thus, the value of F (t ) at a time point where t = n , is equal to the sum of the initial value and the
value accumulated during n periods.
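A short worked example may help here. The functions and numbers are hypothetical and serve only as an illustration: let net investment be $I(t) = 3t^{1/2}$ and let the initial capital stock be $K(0) = 10$. Then the capital stock at time t is

\[
K(t) = \int_0^t 3t^{1/2}\,dt + K(0) = 2t^{3/2} + 10,
\]

so that, for instance, $K(4) = 2 \cdot 8 + 10 = 26$: the initial value 10 plus the 16 units accumulated during four periods.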
In economics it is useful to know that
\[
\int_0^{x_0} MY\,dx = F(x_0)
\]
(the non-constant part of the initial function, as $y(x_0) = F(x_0) + C$).

When finding a definite integral of a marginal function f ( x ) so that we replace the upper limit
with the variable x, we get:
\[
\int_a^x f(x)\,dx = F(x)\Big|_a^x = F(x) - F(a).
\]

$F(x)$ here is actually the total function (without the constant) and the second term is a constant that
is one possible value of the constant of integration. We can denote this constant as $-F(a) = C$.
Hence, a definite integral in this form can also be viewed as an indefinite integral:
\[
\int_a^x f(x)\,dx = F(x)\Big|_a^x = F(x) - F(a) = F(x) + C = \int f(x)\,dx,
\]
that is:
\[
\int_a^x f(x)\,dx = \int f(x)\,dx.
\]

4. MATRIX ALGEBRA

4.1. Matrices and determinants


Matrix algebra is widely used in economics as it simplifies the calculations in the case of large
systems. Matrix algebra is used for solving the equation systems, for confirming the results in
optimization, in comparative statics and so on.
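A matrix is a rectangular array of numbers, functions or other elements. An $m \times n$ matrix has m
rows and n columns:
\[
A = [a_{ij}] =
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}.
\]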
A matrix can be denoted in different ways:
\[
A = [a_{ij}] = (a_{ij}) = \|a_{ij}\|.
\]

The first subscript of a matrix element shows, in which row it is and the second subscript, in which
column this element is. A matrix with only one row or column is called a vector (row vector or
column vector, respectively). In economics mostly square matrices are used ($m = n$).
A matrix whose columns are the same as the rows of the initial matrix (and rows the same as the
columns of the initial matrix) is called the transposed matrix and denoted as $A'$ or $A^T$. Matrix B is
equal to the transposed matrix $A^T$ if:
\[
[b_{ij}] = [a_{ji}].
\]

In the case of square matrix, the transposed matrix is a reflection of the initial matrix over its
principal diagonal (from upper left to lower right corner).
Next, the basic operations with matrices are introduced. When adding matrices, the corresponding
elements (that are in the same place) in both matrices will be summed:
\[
[a_{ij}] + [b_{ij}] = [a_{ij} + b_{ij}].
\]
When subtracting, analogically the elements of one matrix are subtracted from the corresponding
elements of the other matrix:
\[
[a_{ij}] - [b_{ij}] = [a_{ij} - b_{ij}].
\]
For adding and subtracting, the matrices have to have the same number of rows and columns.
When multiplying matrices, in order to get a particular element (in i-th row and j-th column) the
elements of i-th row of the first matrix and the elements of j-th column of the second matrix are
multiplied with each other and these products are then summed:
\[
[a_{ij}][b_{ij}] = [a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj}].
\]
As it can be seen from the formula, the number of columns in the first matrix has to be equal to the
number of rows in the second matrix (in the case of square matrices with the same number of rows
and columns this is always so). Differently from the multiplication of numbers, in the case of
matrices the order of matrices is important when multiplying.
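The multiplication rule and the importance of the order can be sketched in a few lines of Python. The function name `mat_mul` and the two example matrices are hypothetical and chosen only for illustration.

```python
def mat_mul(A, B):
    # Row-by-column rule: element (i, j) is the sum of products of the
    # i-th row of A and the j-th column of B.
    if len(A[0]) != len(B):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]  # hypothetical example matrices
B = [[0, 1], [1, 0]]

AB = mat_mul(A, B)
BA = mat_mul(B, A)
print(AB)  # product A times B
print(BA)  # product B times A: a different matrix
```

Here multiplying by B from the right swaps the columns of A, while multiplying from the left swaps its rows, so the two products differ.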
When a matrix is multiplied by a number, all elements of the matrix are multiplied by this number:
\[
k[a_{ij}] = [k a_{ij}].
\]


An analogue to the division in matrix algebra is multiplying a matrix with an inverse of the other
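matrix. An inverse matrix $A^{-1}$ is a matrix that, when multiplied with the initial matrix (both from
left and right), gives the identity (unit) matrix E as a result:
\[
A A^{-1} = A^{-1} A = E =
\begin{bmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix}.
\]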
Hence, only square matrices can have an inverse matrix. There are many ways to find an inverse
matrix, one of them is introduced later.
When dealing with matrices, determinant is also an important notion. A determinant is a certain
numerical value that corresponds to a particular matrix and is calculated according to certain rules.
A determinant (here a determinant of matrix A) can be denoted in different ways:
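\[
|A| = |a_{ij}| =
\begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{vmatrix}
= D_A.
\]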
When calculating the value of a determinant, all different sets of elements are found that have only
one element from every row and one element from every column. In the case of an n-th order square
matrix every set has n elements. Then the products of the elements of each set are found and summed,
multiplying by $-1$ the products of all those sets that are parallel with the secondary diagonal.
For example, the determinant of a second-order matrix is calculated as:
\[
|A| =
\begin{vmatrix}
a_{11} & a_{12} \\
a_{21} & a_{22}
\end{vmatrix}
= a_{11}a_{22} - a_{21}a_{12}.
\]
Finding higher-order determinants can be simplified by finding minors corresponding to every
element of one row, for example the first. For finding a minor, the row and the column where the
corresponding element is (for element $a_{ij}$ the i-th row and j-th column) are omitted and then the
determinant of the remaining matrix is calculated. When a minor is multiplied by the number
$(-1)^{i+j}$, we get a subdeterminant that is called a cofactor ($A_{ij}$). For the first element in the first row
the sum $i + j$ is 2, for the second element in the first row 3 etc. Hence, when the first row is chosen,
every other minor has to be multiplied by $-1$. The sum of the products of elements of the chosen
row and corresponding cofactors is equal to the determinant of the whole matrix. For example, the
determinant of the third-order matrix can be found as follows:
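\[
|A| =
\begin{vmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33}
\end{vmatrix}
= a_{11}
\begin{vmatrix}
a_{22} & a_{23} \\
a_{32} & a_{33}
\end{vmatrix}
- a_{12}
\begin{vmatrix}
a_{21} & a_{23} \\
a_{31} & a_{33}
\end{vmatrix}
+ a_{13}
\begin{vmatrix}
a_{21} & a_{22} \\
a_{31} & a_{32}
\end{vmatrix}.
\]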
In the case of higher-order determinants analogical decomposition is used.
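The decomposition by the first row can be sketched recursively in Python. The function names `minor` and `det` and the example matrix are hypothetical; the sketch is meant for small matrices only, as the number of operations grows very fast with the order of the determinant.

```python
def minor(M, i, j):
    # Matrix that remains after omitting row i and column j (0-based indices).
    return [row[:j] + row[j + 1:] for r, row in enumerate(M) if r != i]

def det(M):
    # Cofactor expansion along the first row. With 0-based indices the sign
    # (-1)**j alternates exactly like (-1)**(i + j) does for the first row.
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j))
               for j in range(len(M)))

example = [[1, 2, 3],
           [0, 1, 4],
           [5, 6, 0]]  # hypothetical third-order matrix
print(det(example))
```

For this matrix the expansion is 1·(0 − 24) − 2·(0 − 20) + 3·(0 − 5), which can be checked by hand against the printed result.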


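A matrix is a rectangular array of elements (constants, parameters, variables) with m rows and n
columns, for example:
\[
A = [a_{ij}] =
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}.
\]

Various brackets can be used to denote a matrix: $[\;] = (\;) = \|\;\|$.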


A determinant is a number associated with a matrix and determinants can only be calculated for
square matrices.
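The determinant is denoted with this kind of brackets:
\[
|A| =
\begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{vmatrix}
\quad \text{(NB! } m = n\text{)}.
\]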
The determinant of a 1x1 matrix is equal to its only element.
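The determinant of a 2x2 matrix is calculated as follows:
\[
|A| =
\begin{vmatrix}
a_{11} & a_{12} \\
a_{21} & a_{22}
\end{vmatrix}
= a_{11}a_{22} - a_{21}a_{12}.
\]

This can be pictured as a cross: the product along the principal diagonal is taken with a plus sign and
the product along the secondary diagonal with a minus sign.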


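The determinant of a 3x3 matrix is calculated as follows:
\[
|A| =
\begin{vmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33}
\end{vmatrix}
= a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{31}a_{22}a_{13} - a_{32}a_{23}a_{11} - a_{33}a_{21}a_{12}.
\]

This can be pictured as three products parallel to the principal diagonal (taken with a plus sign) and
three products parallel to the secondary diagonal (taken with a minus sign), or, if using helping
columns, like this:
\[
\begin{array}{ccc|cc}
a_{11} & a_{12} & a_{13} & a_{11} & a_{12} \\
a_{21} & a_{22} & a_{23} & a_{21} & a_{22} \\
a_{31} & a_{32} & a_{33} & a_{31} & a_{32}
\end{array}
\]
where the three full diagonals running from upper left to lower right are added and the three running
from lower left to upper right are subtracted.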
Changing the signs of one row or one column does change the sign of a determinant, but not its
absolute value.

4.2. Linear equation systems


Matrices make it possible to express linear equation systems in a more compact way and to solve them
more effectively. For example, the simple market equilibrium model consists of three equations:
\[
\begin{aligned}
q_S &= -a + bp, \\
q_D &= c - dp, \\
q_D &= q_S,
\end{aligned}
\]
where $a, b, c, d > 0$.


After denoting $q_D = q_S = q$, the model can be reduced to two equations and two unknowns:
\[
\begin{cases}
bp - q = a, \\
dp + q = c.
\end{cases}
\]
Often the models in economics have many more equations and unknowns. The market equilibrium
model can also be extended by adding equations describing more aspects (for example adding
excise tax to the model). In the case of more complex models it is rational to use matrix algebra.
Let there be a linear equation system that intends to analyse a problem in economics (it is sensible
to have the number of unknowns equal to the number of equations, let this number be n):
\[
\begin{cases}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = b_1, \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = b_2, \\
\qquad \vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n = b_n.
\end{cases}
\]

The same relationships can be expressed with the help of matrices and vectors. At first, this
equation system can be rewritten as an equality of two vectors:
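\[
\begin{bmatrix}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\
\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n
\end{bmatrix}
=
\begin{bmatrix}
b_1 \\ b_2 \\ \vdots \\ b_n
\end{bmatrix}.
\]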
The vector on the left side can be rewritten as a product of a matrix and a vector based on the rules
of multiplying matrices. For example, the first element in the left-side vector is actually a first
element of a vector that could be obtained by multiplying the matrix of coefficients A and vector of
unknowns X:
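\[
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}
\begin{bmatrix}
x_1 \\ x_2 \\ \vdots \\ x_n
\end{bmatrix}
=
\begin{bmatrix}
b_1 \\ b_2 \\ \vdots \\ b_n
\end{bmatrix}
\]
or
\[
AX = B.
\]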
Examples of using matrix equations can be seen in comparative statics, for example, but one simple
example is provided here as well. For example, let there be a chocolate factory that produces three
types of chocolates: dark (D), milk (M) and white (W) chocolate. For producing one box of dark
chocolate 4 units of cocoa, 1 unit of sugar and 6 units of milk powder are needed. For producing one
box of milk chocolate 2 units of cocoa, 4 units of sugar and 5 units of milk powder are needed. For
producing one box of white chocolate 5 units of sugar and 6 units of milk powder are needed. There
are 100 units of cocoa, 230 units of sugar and 330 units of milk powder in the storeroom. What has
to be the production plan to finish all stocks?
This can be answered with the help of a system, where every equation describes using one input.
The quantity of one type of chocolate multiplied by the quantity of the input that is needed for one
unit of that chocolate gives a need for that input for that type of chocolate. By adding the needed
quantities of that type of input for all types of chocolate we can set it equal to the quantity of this
input in the storeroom. Three equations describing using the different types of inputs are:


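\[
\begin{cases}
4D + 2M + 0W = 100, \\
1D + 4M + 5W = 230, \\
6D + 5M + 6W = 330,
\end{cases}
\]

and as a matrix equation:

\[
\begin{bmatrix}
4 & 2 & 0 \\
1 & 4 & 5 \\
6 & 5 & 6
\end{bmatrix}
\begin{bmatrix}
D \\ M \\ W
\end{bmatrix}
=
\begin{bmatrix}
100 \\ 230 \\ 330
\end{bmatrix}.
\]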

There are many ways to solve a matrix equation. One possibility is to use Cramer's rule. According
to that, $x_i$ is equal to a quotient, where the determinant of the coefficient matrix A serves as the
denominator and the same determinant, but with the i-th column replaced by the vector of constants
B (let us denote this determinant as $|A_i|$), stands in the place of the numerator.
In our case:
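\[
D = \frac{|A_1|}{|A|} = \frac{\begin{vmatrix} 100 & 2 & 0 \\ 230 & 4 & 5 \\ 330 & 5 & 6 \end{vmatrix}}{44} = \frac{440}{44} = 10,
\]
\[
M = \frac{|A_2|}{|A|} = \frac{\begin{vmatrix} 4 & 100 & 0 \\ 1 & 230 & 5 \\ 6 & 330 & 6 \end{vmatrix}}{44} = \frac{1320}{44} = 30,
\]
\[
W = \frac{|A_3|}{|A|} = \frac{\begin{vmatrix} 4 & 2 & 100 \\ 1 & 4 & 230 \\ 6 & 5 & 330 \end{vmatrix}}{44} = \frac{880}{44} = 20.
\]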
Hence, the production plan that spends all inventories includes 10 boxes of dark chocolate, 30
boxes of milk and 20 boxes of white chocolate. In this case, the same solution could be also found
with usual methods of solving an equation system. In the case of more complex systems matrix
algebra may provide a possibility to find the solution in a more effective way, although in more
complex cases calculating determinants becomes more time-consuming as well. Hence, let the
choice of method be a decision of the solver.
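The calculation above can be reproduced with a short Python sketch of Cramer's rule. The function names `det` and `cramer` are hypothetical; the determinant is found by cofactor expansion along the first row, which is adequate for small systems like this one.

```python
def det(M):
    # Determinant by cofactor expansion along the first row.
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j in range(len(M)):
        sub = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(sub)
    return total

def cramer(A, B):
    # x_i = |A_i| / |A|, where A_i is A with its i-th column replaced by B.
    d = det(A)
    if d == 0:
        raise ValueError("no unique solution: the determinant is 0")
    solution = []
    for i in range(len(A)):
        A_i = [row[:i] + [B[r]] + row[i + 1:] for r, row in enumerate(A)]
        solution.append(det(A_i) / d)
    return solution

A = [[4, 2, 0],   # cocoa used per box of dark, milk, white chocolate
     [1, 4, 5],   # sugar
     [6, 5, 6]]   # milk powder
B = [100, 230, 330]  # stocks in the storeroom

D, M, W = cramer(A, B)
print(D, M, W)  # boxes of dark, milk and white chocolate
```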

4.3. Input-output models


Matrix algebra is used in economics for input-output models; the best known of them is
Leontief's model. In order to explain the logic behind those models, let us consider one simple
example.
Let there be an economy where only two goods are produced and for producing one the second is
needed as an input and vice versa. Let us assume that to produce one unit of the first product s units
of the second product are needed, and to produce one unit of the second product t units of the first
product are needed. If we denote the total output of the industries as $x_1$ and $x_2$ and the part of the output
that reaches the consumers (can be called final demand or external demand) as $y_1$ and $y_2$, it can be
written:
\[
\begin{cases}
x_1 - t x_2 = y_1, \\
-s x_1 + x_2 = y_2.
\end{cases}
\]
The output of the first industry minus the output of this industry that is consumed (used) in the
second industry gives what is left for the final demand (of the first industry consumed by the
consumers). The second equation describes the same for the second industry.
As a matrix equation:
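\[
\begin{bmatrix}
1 & -t \\
-s & 1
\end{bmatrix}
\begin{bmatrix}
x_1 \\ x_2
\end{bmatrix}
=
\begin{bmatrix}
y_1 \\ y_2
\end{bmatrix}.
\]
Using Cramer's rule gives us formulas for finding the need for output when the final demand is
known:
\[
x_1 = \frac{\begin{vmatrix} y_1 & -t \\ y_2 & 1 \end{vmatrix}}{\begin{vmatrix} 1 & -t \\ -s & 1 \end{vmatrix}} = \frac{y_1 + t y_2}{1 - st},
\]
\[
x_2 = \frac{\begin{vmatrix} 1 & y_1 \\ -s & y_2 \end{vmatrix}}{\begin{vmatrix} 1 & -t \\ -s & 1 \end{vmatrix}} = \frac{y_2 + s y_1}{1 - st}.
\]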
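These two formulas can be tried out with a minimal Python sketch. The function name `two_good_outputs` and the values s = 0,5, t = 0,25, y1 = 30 and y2 = 20 are hypothetical. The check that st < 1 is an added assumption of the sketch: only then is the denominator positive and are the outputs meaningful.

```python
def two_good_outputs(s, t, y1, y2):
    # Total outputs from Cramer's rule for the two-good input-output model.
    denominator = 1 - s * t
    if denominator <= 0:
        raise ValueError("the sketch assumes s * t < 1")
    x1 = (y1 + t * y2) / denominator
    x2 = (y2 + s * y1) / denominator
    return x1, x2

s, t = 0.5, 0.25   # hypothetical input requirements
y1, y2 = 30, 20    # hypothetical final demand

x1, x2 = two_good_outputs(s, t, y1, y2)
print(x1, x2)
# Check against the original equations of the model:
print(x1 - t * x2, -s * x1 + x2)  # should reproduce y1 and y2
```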

4.4. Leontief's model


Leontief's model is an input-output model that describes the relationships between inputs and
outputs at the economy level. The whole economy is divided into n industries (every industry
producing only one good) that are connected to each other in the way that they need each other's
production as an input. A constant technology is assumed: the amount of the output of one
industry needed for producing one unit of another industry's output is assumed to be constant. The
movements of products between industries are described by an input-output table. This table can be
viewed from two aspects.
When looking at an industry as a producing industry, it can be said that besides final demand or
final consumption (used for example in households, government institutions) the output of one
industry is used in many other industries as input. This is usually presented in the rows of input-
output tables and that can be expressed as a general equation of an i-th row:
\[
\sum_{j=1}^{n} x_{ij} + y_i = x_i,
\]

where:
$x_{ij}$ - output of the i-th industry that is used in the j-th industry,
$x_i$ - total output of the i-th industry,
$y_i$ - final demand of the i-th industry, that is, the output consumed outside the industries.


The equation of the first row, for example, is then:
\[
\sum_{j=1}^{n} x_{1j} + y_1 = x_1.
\]

When looking at an industry as a consuming industry, it can be said that to produce the output of
one industry inputs from many other industries and besides that also the value added (can be called
also the primary input, denoted here with z; for example wages of workers, taxes, etc) are needed.
This is usually presented in the columns of input-output tables and that can be expressed as a
general equation of a j-th column:
\[
\sum_{i=1}^{n} x_{ij} + z_j = x_j,
\]

where:
$z_j$ - value added in the j-th industry,
$x_j$ - total output in the j-th industry.
The equation of the first column, for example, is then:
\[
\sum_{i=1}^{n} x_{i1} + z_1 = x_1.
\]

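The input-output table looks like this:

\[
\begin{array}{c|ccccc|c|c}
\text{industry} & 1 & \cdots & j & \cdots & n & \text{final demand} & \text{total output} \\ \hline
1 & x_{11} & \cdots & x_{1j} & \cdots & x_{1n} & y_1 & x_1 \\
\vdots & \vdots & & \vdots & & \vdots & \vdots & \vdots \\
i & x_{i1} & \cdots & x_{ij} & \cdots & x_{in} & y_i & x_i \\
\vdots & \vdots & & \vdots & & \vdots & \vdots & \vdots \\
n & x_{n1} & \cdots & x_{nj} & \cdots & x_{nn} & y_n & x_n \\ \hline
\text{value added} & z_1 & \cdots & z_j & \cdots & z_n & \sum_i y_i = \sum_j z_j & \\
\text{total output} & x_1 & \cdots & x_j & \cdots & x_n & & \sum_i x_i = \sum_j x_j
\end{array}
\]
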

The rows show how the output is consumed in different industries and the columns how it is
produced in different industries.
It may be that sometimes the final demand changes. Then a new plan for total output may be
needed. In some other case, it could be that a new plan of total output is known and the quantities of
final demand that will be available at this new plan are of interest. For solving these problems,
matrix algebra can be used.
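The row equations from the input-output table can be expressed as a matrix equation:
\[
\begin{bmatrix}
x_{11} + \cdots + x_{1n} \\
\vdots \\
x_{n1} + \cdots + x_{nn}
\end{bmatrix}
+
\begin{bmatrix}
y_1 \\ \vdots \\ y_n
\end{bmatrix}
=
\begin{bmatrix}
x_1 \\ \vdots \\ x_n
\end{bmatrix}.
\]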


If the technology remains constant, then the coefficients that show the need of one product for
producing one unit in another industry are constant. Input coefficient aij shows how much of the
output of the i-th industry is spent in the j-th industry for producing one unit:
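\[
a_{ij} = \frac{x_{ij}}{x_j}.
\]
All input coefficients constitute a matrix A:
\[
A =
\begin{bmatrix}
a_{11} & \cdots & a_{1n} \\
\vdots & \ddots & \vdots \\
a_{n1} & \cdots & a_{nn}
\end{bmatrix}.
\]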

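As $x_{ij} = a_{ij}x_j$, we can substitute $x_{ij}$ with $a_{ij}x_j$:

\[
\begin{bmatrix}
a_{11}x_1 + \cdots + a_{1n}x_n \\
\vdots \\
a_{n1}x_1 + \cdots + a_{nn}x_n
\end{bmatrix}
+
\begin{bmatrix}
y_1 \\ \vdots \\ y_n
\end{bmatrix}
=
\begin{bmatrix}
x_1 \\ \vdots \\ x_n
\end{bmatrix}.
\]

According to the rules of multiplication we can rewrite:

\[
\begin{bmatrix}
a_{11} & \cdots & a_{1n} \\
\vdots & \ddots & \vdots \\
a_{n1} & \cdots & a_{nn}
\end{bmatrix}
\begin{bmatrix}
x_1 \\ \vdots \\ x_n
\end{bmatrix}
+
\begin{bmatrix}
y_1 \\ \vdots \\ y_n
\end{bmatrix}
=
\begin{bmatrix}
x_1 \\ \vdots \\ x_n
\end{bmatrix}
\]
or
\[
AX + Y = X.
\]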
If the input coefficients and total output are known, the vector of final demand can be found as
follows:
\[
Y = X - AX = (E - A)X.
\]
This is actually one possibility to express Leontief's model; sometimes this model is set up as:
\[
\begin{bmatrix}
1 - a_{11} & \cdots & -a_{1n} \\
\vdots & \ddots & \vdots \\
-a_{n1} & \cdots & 1 - a_{nn}
\end{bmatrix}
\begin{bmatrix}
x_1 \\ \vdots \\ x_n
\end{bmatrix}
=
\begin{bmatrix}
y_1 \\ \vdots \\ y_n
\end{bmatrix}.
\]

It needs a little more effort to find total output when input coefficients and final demand are known.
Let us solve the equation $(E - A)X = Y$ for vector X. For that we may multiply this equation by the
matrix $(E - A)^{-1}$ from the left side:
\[
(E - A)^{-1}(E - A)X = (E - A)^{-1}Y
\]
or
\[
X = (E - A)^{-1}Y.
\]
The elements of matrix $(E - A)^{-1}$ show the need for the output of one industry for producing one
unit in another industry taking into account all intermediate demands, including the need for the
output of the first industry in other industries (that produce inputs for the second industry).
The solution of an input-output model (solution of the matrix equation) gives such a structure of the
output of the economy that makes it possible to satisfy the final demand without any deficits or
surpluses. As it is about aiming at the equality of demand and supply, it can be viewed as aiming at
an equilibrium as well.


It is possible to transform this model into a closed model, where the households and government
sector (demanding, that is consuming the final demand and providing the value added) are viewed
as one industry. The inputs for this industry are actually the final demand and the output is the value
added. In that case the equations of the rows and columns are simpler:
\[
\sum_{j=1}^{n} x_{ij} = x_i \quad \text{and} \quad \sum_{i=1}^{n} x_{ij} = x_j.
\]

In that case the sum of the input coefficients of a column is always equal to 1 (because the
coefficients are calculated by dividing the intermediate demands of one industry for the quantities
from all industries by the quantity produced in this industry and in the input-output table the sum of
all previous elements in one column gives the last element in that column):
\[
\sum_{i=1}^{n} a_{ij} = 1 \quad \left(\text{analogically to } \sum_{i=1}^{n} a_{ij} + \frac{z_j}{x_j} = 1 \text{ in an open model}\right).
\]

The matrix equation is then:


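\[
\begin{bmatrix}
x_{11} + \cdots + x_{1n} \\
\vdots \\
x_{n1} + \cdots + x_{nn}
\end{bmatrix}
=
\begin{bmatrix}
x_1 \\ \vdots \\ x_n
\end{bmatrix}
\quad \text{or} \quad
\begin{bmatrix}
a_{11} & \cdots & a_{1n} \\
\vdots & \ddots & \vdots \\
a_{n1} & \cdots & a_{nn}
\end{bmatrix}
\begin{bmatrix}
x_1 \\ \vdots \\ x_n
\end{bmatrix}
=
\begin{bmatrix}
x_1 \\ \vdots \\ x_n
\end{bmatrix}
\quad \text{or} \quad AX = X.
\]

From here:
\[
0 = X - AX = (E - A)X.
\]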
In this case it is not possible to determine a unique solution of the total output (the system has an
infinite number of solutions).
This can be seen also from the fact that the sums of the elements in the matrix A by columns are
equal to 1 and in the matrix
\[
E - A =
\begin{bmatrix}
1 - a_{11} & \cdots & -a_{1n} \\
\vdots & \ddots & \vdots \\
-a_{n1} & \cdots & 1 - a_{nn}
\end{bmatrix}
\]
these sums are equal to 0. The rows of the matrix $E - A$ are therefore linearly dependent: any row of
$E - A$ can be found by subtracting all other rows from a row where all elements are equal to 0.
Hence, the determinant of the matrix $E - A$ is equal to 0 and the matrix equation has no unique solution.
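A small numerical check of this property, as a Python sketch. The 2x2 coefficient matrix is hypothetical, chosen only so that both of its columns sum to 1; exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction as F

# Hypothetical input coefficients of a closed model: each column sums to 1.
A = [[F(2, 5), F(3, 10)],
     [F(3, 5), F(7, 10)]]

n = len(A)
column_sums = [sum(A[i][j] for i in range(n)) for j in range(n)]

# E - A, built element by element.
E_minus_A = [[(1 if i == j else 0) - A[i][j] for j in range(n)]
             for i in range(n)]
determinant = (E_minus_A[0][0] * E_minus_A[1][1]
               - E_minus_A[1][0] * E_minus_A[0][1])

def residual(X):
    # Left side of (E - A)X = 0 for a candidate output vector X.
    return [sum(E_minus_A[i][j] * X[j] for j in range(n)) for i in range(n)]

print(column_sums)          # both sums are 1
print(determinant)          # the determinant of E - A
print(residual([1, 2]))     # one solution ...
print(residual([10, 20]))   # ... and any multiple of it is a solution too
```

Only the proportions of the outputs are determined here (1 : 2 in this example), not their levels, which is what the infinite number of solutions means.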

4.5. Calculations in Leontief's model


For example, let there be an economy whose fragment including two industries can be described
with the following input-output table:
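\[
\begin{array}{c|cc|c|c}
\text{industry} & 1 & 2 & y & x \\ \hline
1 & 200 & 120 & 680 & 1000 \\
2 & 500 & 480 & 220 & 1200 \\ \hline
z & 300 & 600 & 900 & \\
x & 1000 & 1200 & & 2200
\end{array}
\]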

If the new planned outputs are 1100 for the first industry and 1400 for the other, then assuming that
the technology is constant, we can find the amounts left for final demand of these industries. First,
the matrix of the input coefficients can be found by dividing the intermediate demands with the total
output in a particular column:

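\[
A =
\begin{bmatrix}
0{,}2 & 0{,}1 \\
0{,}5 & 0{,}4
\end{bmatrix}.
\]
The matrix $E - A$ can be found:
\[
E - A =
\begin{bmatrix}
1 - 0{,}2 & 0 - 0{,}1 \\
0 - 0{,}5 & 1 - 0{,}4
\end{bmatrix}
=
\begin{bmatrix}
0{,}8 & -0{,}1 \\
-0{,}5 & 0{,}6
\end{bmatrix}
\]
and multiplying it by the vector of total output gives the vector of final demand:
\[
Y = (E - A)X =
\begin{bmatrix}
0{,}8 & -0{,}1 \\
-0{,}5 & 0{,}6
\end{bmatrix}
\begin{bmatrix}
1100 \\ 1400
\end{bmatrix}
=
\begin{bmatrix}
0{,}8 \cdot 1100 - 0{,}1 \cdot 1400 \\
-0{,}5 \cdot 1100 + 0{,}6 \cdot 1400
\end{bmatrix}
=
\begin{bmatrix}
740 \\ 290
\end{bmatrix}.
\]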
Hence, the mentioned plan leaves 740 and 290 units, respectively, for final demand.
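The same steps can be sketched in plain Python: the input coefficients are computed from the input-output table and the final demand is found as Y = X − AX. The names `flows`, `totals` and `final_demand` are hypothetical.

```python
# Intermediate flows x_ij and total outputs x_j from the input-output table.
flows = [[200, 120],
         [500, 480]]
totals = [1000, 1200]

n = len(totals)
# Input coefficients a_ij = x_ij / x_j (each column divided by its total output).
A = [[flows[i][j] / totals[j] for j in range(n)] for i in range(n)]

def final_demand(A, X):
    # Y = X - AX: what is left of each output after intermediate use.
    return [X[i] - sum(A[i][j] * X[j] for j in range(len(X)))
            for i in range(len(X))]

X_new = [1100, 1400]  # the planned total outputs
Y_new = final_demand(A, X_new)
print(A)
print([round(y, 2) for y in Y_new])
```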
If this plan seems inappropriate, then we can start with determining the final demand, for example
800 for the first industry and 300 for the second. The vector of total output can be found as follows:
$X = (E - A)^{-1}Y$. For that we need the matrix $(E - A)^{-1}$, the inverse matrix of
\[
E - A =
\begin{bmatrix}
0{,}8 & -0{,}1 \\
-0{,}5 & 0{,}6
\end{bmatrix}.
\]
An inverse matrix can be found using the following formula:
\[
A^{-1} = \frac{1}{|A|}\,\mathrm{adj}\,A,
\]
where $|A|$ is the determinant of A and $\mathrm{adj}\,A$ (also denoted as $\tilde{A}$) the adjoint (adjugate) matrix of A. The latter
can be found by replacing every element $a_{ij}$ in the initial matrix by the corresponding cofactor $A_{ij}$
and by transposing the result.
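If
\[
A =
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix},
\]
then
\[
\mathrm{adj}\,A = \tilde{A} =
\begin{bmatrix}
A_{11} & A_{21} & \cdots & A_{n1} \\
A_{12} & A_{22} & \cdots & A_{n2} \\
\vdots & \vdots & \ddots & \vdots \\
A_{1n} & A_{2n} & \cdots & A_{nn}
\end{bmatrix}.
\]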

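
In our case every minor consists of only one element. From the matrix
\[
E - A =
\begin{bmatrix}
0{,}8 & -0{,}1 \\
-0{,}5 & 0{,}6
\end{bmatrix},
\]
first the matrix of minors is:
\[
[M_{ij}] =
\begin{bmatrix}
0{,}6 & -0{,}5 \\
-0{,}1 & 0{,}8
\end{bmatrix},
\]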

the matrix of cofactors:


\[
[A_{ij}] =
\begin{bmatrix}
0{,}6 & 0{,}5 \\
0{,}1 & 0{,}8
\end{bmatrix}
\]

and by transposing it we find:
\[
\mathrm{adj}\,[E - A] =
\begin{bmatrix}
0{,}6 & 0{,}1 \\
0{,}5 & 0{,}8
\end{bmatrix}.
\]
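The determinant of $E - A$:
\[
|E - A| = 0{,}8 \cdot 0{,}6 - (-0{,}5) \cdot (-0{,}1) = 0{,}48 - 0{,}05 = 0{,}43
\]
and the inverse matrix:
\[
(E - A)^{-1} = \frac{1}{|E - A|}\,\mathrm{adj}\,[E - A] = \frac{1}{0{,}43}
\begin{bmatrix}
0{,}6 & 0{,}1 \\
0{,}5 & 0{,}8
\end{bmatrix}
=
\begin{bmatrix}
1{,}40 & 0{,}23 \\
1{,}16 & 1{,}86
\end{bmatrix}.
\]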

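If the new final demand vector is (800, 300), we get the new total outputs:
\[
X = (E - A)^{-1}Y =
\begin{bmatrix}
1{,}40 & 0{,}23 \\
1{,}16 & 1{,}86
\end{bmatrix}
\begin{bmatrix}
800 \\ 300
\end{bmatrix}
=
\begin{bmatrix}
1{,}40 \cdot 800 + 0{,}23 \cdot 300 \\
1{,}16 \cdot 800 + 1{,}86 \cdot 300
\end{bmatrix}
=
\begin{bmatrix}
1189 \\ 1486
\end{bmatrix}.
\]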
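The new input-output table is then:
\[
\begin{array}{c|cc|c|c}
\text{industry} & 1 & 2 & y & x \\ \hline
1 & 237{,}8 & 148{,}6 & 800 & 1189 \\
2 & 594{,}5 & 594{,}4 & 300 & 1486 \\ \hline
z & 356{,}7 & 743 & 1100 & \\
x & 1189 & 1486 & & 2675
\end{array}
\]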

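As an alternative, the new total outputs can also be found from the equation $(E - A)X = Y$ or
\[
\begin{bmatrix}
0{,}8 & -0{,}1 \\
-0{,}5 & 0{,}6
\end{bmatrix}
\begin{bmatrix}
x_1 \\ x_2
\end{bmatrix}
=
\begin{bmatrix}
800 \\ 300
\end{bmatrix}
\]
using Cramer's rule as follows (differences are caused by rounding):

\[
x_1 = \frac{\begin{vmatrix} 800 & -0{,}1 \\ 300 & 0{,}6 \end{vmatrix}}{\begin{vmatrix} 0{,}8 & -0{,}1 \\ -0{,}5 & 0{,}6 \end{vmatrix}} = \frac{510}{0{,}43} \approx 1186
\quad \text{and} \quad
x_2 = \frac{\begin{vmatrix} 0{,}8 & 800 \\ -0{,}5 & 300 \end{vmatrix}}{\begin{vmatrix} 0{,}8 & -0{,}1 \\ -0{,}5 & 0{,}6 \end{vmatrix}} = \frac{640}{0{,}43} \approx 1488.
\]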
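The effect of rounding can be checked with a minimal Python sketch that inverts the 2x2 matrix E − A by the adjoint formula without rounding the inverse. The function name `inverse_2x2` is hypothetical.

```python
def inverse_2x2(M):
    # Inverse via the adjoint matrix: for a 2x2 matrix the adjoint is obtained
    # by swapping the diagonal elements and changing the sign of the others.
    (a, b), (c, d) = M
    determinant = a * d - b * c
    if determinant == 0:
        raise ValueError("the matrix has no inverse")
    return [[d / determinant, -b / determinant],
            [-c / determinant, a / determinant]]

E_minus_A = [[0.8, -0.1],
             [-0.5, 0.6]]
Y = [800, 300]  # the new final demand

B = inverse_2x2(E_minus_A)
X = [sum(B[i][j] * Y[j] for j in range(2)) for i in range(2)]

print([[round(v, 4) for v in row] for row in B])  # unrounded inverse
print([round(x, 2) for x in X])                   # total outputs
```

With the unrounded inverse the total outputs are about 1186 and 1488; the values 1189 and 1486 arise when the inverse is first rounded to two decimal places.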

The inverse matrix $(E - A)^{-1}$ can also be estimated with the help of an approximation using the
formula: $(E - A)^{-1} = E + A + A^2 + A^3 + \cdots$.
Let us study this formula. When multiplying a matrix and its inverse matrix we get a unit matrix:
\[
(E - A)(E - A)^{-1} = E.
\]
Thus, in order for $E + A + A^2 + A^3 + \cdots + A^m$ to be an inverse of $E - A$ the following must be true:

\[
(E - A)(E + A + A^2 + A^3 + \cdots + A^m) = E
\]
or when rearranging:
\[
E + A + A^2 + A^3 + \cdots + A^m - A - A^2 - A^3 - \cdots - A^{m+1} = E - A^{m+1} = E.
\]
Hence, the more similar the matrix $A^{m+1}$ is to the null matrix (all zeros), the better is the estimation
that $E + A + A^2 + A^3 + \cdots + A^m$ gives of the inverse matrix $(E - A)^{-1}$.
As the input coefficients are all positive and smaller than 1 (and, in an open model with positive value
added, every column of A sums to less than 1), then when multiplying the matrix A
continuously by itself the elements will approach 0 and as $m \to \infty$, the matrix $A^{m+1}$ approaches
the null matrix ($A^{m+1} \to [0]$) and $E + A + A^2 + A^3 + \cdots + A^m \to (E - A)^{-1}$.
Thus, the multiplication of the matrix A with itself has to be stopped at the moment when the
elements seem to be close enough to 0 depending on the admissible error.
Let us try to find an approximation $(E - A)^{-1} \approx E + A + A^2 + A^3$ for the example used before:

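\[
E = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad
A = \begin{bmatrix} 0{,}2 & 0{,}1 \\ 0{,}5 & 0{,}4 \end{bmatrix}, \quad
A^2 = \begin{bmatrix} 0{,}09 & 0{,}06 \\ 0{,}3 & 0{,}21 \end{bmatrix}, \quad
A^3 = \begin{bmatrix} 0{,}048 & 0{,}033 \\ 0{,}165 & 0{,}114 \end{bmatrix}.
\]
\[
(E - A)^{-1} \approx
\begin{bmatrix}
1 + 0{,}2 + 0{,}09 + 0{,}048 & 0 + 0{,}1 + 0{,}06 + 0{,}033 \\
0 + 0{,}5 + 0{,}3 + 0{,}165 & 1 + 0{,}4 + 0{,}21 + 0{,}114
\end{bmatrix}
=
\begin{bmatrix}
1{,}338 & 0{,}193 \\
0{,}965 & 1{,}724
\end{bmatrix}.
\]
The actual matrix for comparison:
\[
(E - A)^{-1} =
\begin{bmatrix}
1{,}40 & 0{,}23 \\
1{,}16 & 1{,}86
\end{bmatrix}.
\]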
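The approximation can be sketched in Python so that the effect of the number of terms m is easy to see. The helper names (`identity`, `mat_mul`, `mat_add`, `leontief_inverse_approx`) are hypothetical.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_add(A, B):
    return [[A[i][j] + B[i][j] for j in range(len(A[0]))]
            for i in range(len(A))]

def leontief_inverse_approx(A, m):
    # E + A + A^2 + ... + A^m, accumulated one power at a time.
    total = identity(len(A))
    power = identity(len(A))
    for _ in range(m):
        power = mat_mul(power, A)
        total = mat_add(total, power)
    return total

A = [[0.2, 0.1],
     [0.5, 0.4]]

approx_3 = leontief_inverse_approx(A, 3)    # the precision level used in the text
approx_50 = leontief_inverse_approx(A, 50)  # many more terms
print([[round(v, 3) for v in row] for row in approx_3])
print([[round(v, 3) for v in row] for row in approx_50])
```

With m = 3 the result is still visibly below the actual inverse; with many more terms the sum approaches it closely.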
The approximation method is related to another method for finding the solution in the input-output
model. This method is not used widely because it is labour-intensive, but it helps to
clarify the relationships shown before.
First let us denote the matrix $(E - A)^{-1}$ by B. The elements $b_{ij}$ of $(E - A)^{-1} = B$ show how much
of the output of the i-th industry is needed in order to produce one unit in the j-th industry
(including also all amounts of the i-th product that are used in other industries for producing the
inputs for the j-th industry). Let us try to find the coefficients $b_{11}$ (the need of the product of the first
industry for producing one unit in the first industry) and $b_{21}$ (the need of the product of the second
industry for producing one unit in the first industry).
One unit of the first product has to be produced. We can see from the matrix of input coefficients
\[
A =
\begin{bmatrix}
0{,}2 & 0{,}1 \\
0{,}5 & 0{,}4
\end{bmatrix}
\]
that for producing one unit in the first industry 0,2 units of the first product and 0,5 units of the
second product are needed. Now, for producing 0,2 units of the first product $0{,}2 \cdot 0{,}2 = 0{,}04$ units of
the first product and $0{,}5 \cdot 0{,}2 = 0{,}1$ units of the second product are needed. When continuing this we
can construct the following scheme (H1 and H2 in brackets specify which product is meant:
0,5 (H1) means that 0,5 units of the first product are needed):


1 (H1)
    0,2 (H1)
        0,2 · 0,2 = 0,04 (H1)
            0,2 · 0,2 · 0,2 = 0,008 (H1)
            0,5 · 0,2 · 0,2 = 0,02 (H2)
        0,5 · 0,2 = 0,1 (H2)
            0,1 · 0,5 · 0,2 = 0,01 (H1)
            0,4 · 0,5 · 0,2 = 0,04 (H2)
    0,5 (H2)
        0,1 · 0,5 = 0,05 (H1)
            0,2 · 0,1 · 0,5 = 0,01 (H1)
            0,5 · 0,1 · 0,5 = 0,025 (H2)
        0,4 · 0,5 = 0,2 (H2)
            0,1 · 0,4 · 0,5 = 0,02 (H1)
            0,4 · 0,4 · 0,5 = 0,08 (H2)
By summing all amounts of the first product that are needed, we get the approximation of $b_{11}$; by
summing all amounts of the second product that are needed, we get the approximation of $b_{21}$:
\[
b_{11} \approx 1 + 0{,}2 + 0{,}04 + 0{,}05 + 0{,}008 + 0{,}01 + 0{,}01 + 0{,}02 = 1{,}338,
\]
\[
b_{21} \approx 0{,}5 + 0{,}1 + 0{,}2 + 0{,}02 + 0{,}04 + 0{,}025 + 0{,}08 = 0{,}965.
\]
The difference from the actual coefficients comes from the fact that not enough addends were
included in the formula of approximation (the admissible error was relatively large). If this scheme
were expanded, the results would be more precise. However, the coefficients found here are equal
to those found before with the help of the approximation method at the precision level $m = 3$.
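The scheme can also be followed round by round in a short Python sketch: the inputs needed in each round become the quantities that have to be produced in the next one. The function name `requirement_rounds` is hypothetical.

```python
A = [[0.2, 0.1],
     [0.5, 0.4]]

def requirement_rounds(A, product, rounds):
    # Start from one unit of the chosen product and add up, round by round,
    # the inputs needed to produce the quantities of the previous round.
    n = len(A)
    need = [1.0 if i == product else 0.0 for i in range(n)]
    total = need[:]
    for _ in range(rounds):
        need = [sum(A[i][j] * need[j] for j in range(n)) for i in range(n)]
        total = [t + d for t, d in zip(total, need)]
    return total

# Three rounds for one unit of the first product (index 0).
b11, b21 = requirement_rounds(A, 0, 3)
print(round(b11, 3), round(b21, 3))
```

Each round multiplies the previous requirements by A, which is why the scheme gives the same numbers as the first column of E + A + A² + A³.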

5. OPTIMIZATION

5.1. Extremums of functions of one variable


It often happens that there are alternative ways to achieve a target and some alternatives are better
than others according to some criterion. Finding the best alternative according to a criterion is called
optimizing. Optimizing can be either finding the largest (maximizing) or smallest (minimizing)
value of some variable at certain conditions. For example a person tries to find the possibility to get
from one town to another at minimum costs or a firm wants to find, which quantity produced gives
the largest profit. A general term covering both maximum and minimum is extremum (extremal
values of a function are looked for). Although an optimization problem usually specifies
maximizing or minimizing, usually first all critical points are found that may be extremums and
after that as a next step the minimum or maximum are verified.
When optimizing, using certain conditions (that will be introduced below) those values of
argument(s) are found at which the value of the function is maximal or minimal. In the case of a
function of one argument $y = f(x)$, the value of x is looked for that gives the highest or smallest value
of y: the maximum or minimum. The extremum points (the minimum and maximum points) are
the points (determined with all coordinates: $(x_0, y_0)$) where y has a minimal or maximal value.
It must be pointed out that the commonly used conditions for finding extremums that are
introduced here later have to be viewed as the conditions for local or relative extremums. In a
local maximum point the value of the function is larger than in any of the neighbouring points, but
may not be larger than in all other points of the whole domain of this function. Analogically, there may
exist a point where the value of a function is even smaller than in a local minimum point. For
example on Figure 5.1, y in the local minimum point A is larger than y in point B, hence point A
cannot be a global (absolute) minimum point, but it is a local minimum point. Hence a local
extremum may be, but also may not be a global extremum.
The conditions that have to be satisfied for a minimum or maximum to exist at the certain point of a
function are based on the first- and second-order derivatives of that function. Hence, these
conditions are often called first- and second-order conditions. First, let us look at these conditions
for the function of one argument.
The first-order condition is used to find the critical points (also called stationary points), where
there may or may not be an extremum. A critical point is a point where the function neither increases nor
decreases. This point may appear to be a maximum or minimum, but also an inflection point, where
the function changes from concave to convex or vice versa (as on Figure 5.1): the increase or
decrease just stops in such an inflection point. While the derivative has different signs before and after
extremum points, the sign of the derivative is the same before and after the inflection point.
If the function neither increases nor decreases in a critical point, then the differential of y must be 0
in that point. More precisely, if we give an infinitely small change to x ($dx \neq 0$), then the
approximate estimate of the change of y or the differential dy has to be 0. Hence:
\[
dy = f'(x_0)\,dx = 0.
\]
Since $dx \neq 0$, the following holds:
\[
f'(x_0) = 0.
\]
This is the first-order condition: if the first-order derivative in a point is equal to zero, it is a critical
point: there may be an extremum in that point. It can be seen from Figure 5.1 that the tangent line is
a horizontal line with a slope equal to 0 in a critical point. As the slope of a tangent line at a certain
point is equal to the derivative at the same point, this confirms that the condition $f'(x_0) = 0$
provides critical points.


The first-order condition brings out all points where there may be an extremum, but this condition
itself does not guarantee the existence of an extremum. Hence, it is a necessary, but not a sufficient
condition. Next, a sufficient condition is needed.

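[Figure: upper panel shows the function y = f(x) with its critical points (y' = 0) and the points A and B;
lower panel shows the derivative y' = f'(x), which is equal to 0 at the critical points.]

Figure 5.1. Critical points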


Second-order condition is used for finding extremum points from critical points. In extremum
points the sign of dy has to change: in a maximum point the increase is replaced by the decrease
and in minimum point the decrease is replaced by the increase. A function has to be increasing
before and decreasing after a maximum point. Hence, the differential of a function has to be
positive ( dy > 0 ) before and negative after ( dy < 0 ) the maximum point. Thus, in a maximum point
dy decreases: changes from positive to negative.
The change of a differential dy is estimated by its differential (differential of a differential) or
second-order differential $d^2y$. Thus, we can say that in a maximum point dy has to decrease and
hence, the second-order differential has to be negative:
\[
d^2y < 0.
\]
Analogically, a function has to be decreasing ($dy < 0$) before and increasing ($dy > 0$) after a
minimum point. Thus, in a minimum point dy has to increase and the second-order differential has
to be positive:
\[
d^2y > 0.
\]
As these conditions are rather troublesome to use, we can restate them in a form using derivatives in the
following way. First, the second-order differential can be found as follows:
\[
d^2y = d(dy) = d[f'(x)\,dx] = d[f'(x)]\,dx = [f''(x)\,dx]\,dx = f''(x)\,dx^2.
\]
The condition for a maximum point can be rewritten:
\[
d^2y = f''(x_0)\,dx^2 < 0.
\]


If we give an infinitely small change to x ($dx \neq 0$), then the second-order differential has to be
negative. Since the square of $dx \neq 0$ is always positive, it has to be that:
\[
f''(x_0) < 0.
\]
The condition for a minimum point can be rewritten:
\[
d^2y = f''(x_0)\,dx^2 > 0
\]
and analogically:
\[
f''(x_0) > 0.
\]
We can see from Figure 5.1 that the derivative of a function is positive before and negative after a
maximum point; hence at the maximum point the derivative is decreasing and thus the second-order
derivative (the derivative of the derivative) indeed has to be negative. Analogously, at a minimum point
the derivative is increasing and thus the second-order derivative has to be positive.
Taking the necessary and sufficient conditions together:
if $f'(x_0) = 0$ and $f''(x_0) < 0$, it is a local maximum point;

if $f'(x_0) = 0$ and $f''(x_0) > 0$, it is a local minimum point.


Here, it must be pointed out that a sufficient condition is not necessary. If the second-order
condition is satisfied at a critical point, we can conclude that there is a maximum or
minimum at that point. However, if the second-order condition is not satisfied at a critical point, we
cannot conclude that there is no minimum or maximum at that point. A critical point may
turn out to be a minimum or maximum point even if $f''(x_0) = 0$. Hence, such a point needs further
investigation.
Optimization means finding the extremums: the local minimums and/or maximums. Two
conditions are used for that.
Optimizing the function $y = f(x)$:
First-order condition, also known as the necessary condition:
$y' = 0$: finding the derivative and setting it equal to 0 gives an equation from which we can find a
critical value of $x$, that is, a critical point where there may be a minimum or maximum, but also an inflection
point.
Second-order condition, also known as the sufficient condition:
if $y'' > 0$, a minimum can be confirmed,
if $y'' < 0$, a maximum can be confirmed,
if $y'' = 0$, nothing can be confirmed and further investigation is needed: there may still be a minimum, a
maximum or an inflection point. Hence, the necessary condition is not sufficient and the sufficient condition is
not necessary.
If $y''$ includes variables (symbols), they have to be replaced with their values at the critical point
under consideration.
One possibility for such further investigation is to examine the graph of the function (does the sign of the derivative change
when passing through the critical point), but this method may not be easily applicable. Another option
is to use the n-th derivative test (or successive derivative test), which is carried out in the following
way. If the second-order derivative at a critical point is equal to 0, higher-order derivatives are
found successively until the first higher-order derivative is found that is not equal to 0.
Hence, we look for an n-th order derivative for which it holds that
$f^{(n)}(x_0) \neq 0$,
$f^{(n-1)}(x_0) = 0$.
The order $n$ determines the character of the critical point. If $n$ is an odd number, it is an
inflection point. If $n$ is an even number, it is an extremum. The character of the extremum is found
analogously to the second-order condition:
if $f'(x_0) = 0$ and $f^{(n)}(x_0) < 0$, it is a local maximum point;

if $f'(x_0) = 0$ and $f^{(n)}(x_0) > 0$, it is a local minimum point.

For example, given $y = (3 - x)^4$, the critical points can be found by setting the first-order derivative
equal to 0:
$y' = 4(3 - x)^3 \cdot (-1) = -4(3 - x)^3 = 0$, from here:
$(3 - x)^3 = 0$ and
$x_0 = 3$.
Hence, there is one critical point with $x$ equal to 3. Second-order derivative:
$y'' = -4 \cdot 3(3 - x)^2 \cdot (-1) = 12(3 - x)^2$.
If $x_0 = 3$, then $y''(3) = 12(3 - 3)^2 = 0$. Hence, the n-th order derivative test has to be used. The
value of the third-order derivative
$y''' = 12 \cdot 2(3 - x) \cdot (-1) = -24(3 - x)$
is also 0, if $x_0 = 3$:
$y'''(3) = -24(3 - 3) = 0$.
The value of the fourth-order derivative is different from 0:
$y^{(4)} = -24 \cdot (-1) = 24 \neq 0$, $y^{(4)}(3) = 24$.
Since $n = 4$ is an even number, this is an extremum point. Since $y^{(4)}(3) = 24 > 0$, we can
conclude that it is a minimum point.
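
The n-th derivative test can also be traced mechanically. The following is only a minimal sketch in Python, restricted to polynomials; the coefficient-list representation and the function names (derivative, evaluate, nth_derivative_test) are assumptions made for this illustration and are not part of the text above.

```python
# Polynomials are stored as coefficient lists, lowest power first:
# [c0, c1, c2, ...] stands for c0 + c1*x + c2*x**2 + ...

def derivative(coeffs):
    """Return the coefficient list of the derivative polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:] or [0]

def evaluate(coeffs, x):
    """Evaluate the polynomial at x using Horner's scheme."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def nth_derivative_test(coeffs, x0):
    """Return (n, value of the n-th derivative at x0, character of x0)."""
    d, n = derivative(coeffs), 1
    if evaluate(d, x0) != 0:
        raise ValueError("x0 is not a critical point")
    # Differentiate until the first derivative that is non-zero at x0.
    while evaluate(d, x0) == 0:
        if not any(d):                      # constant function: nothing to find
            return n, 0, "undetermined"
        d, n = derivative(d), n + 1
    value = evaluate(d, x0)
    if n % 2 == 1:
        kind = "inflection point"
    else:
        kind = "local minimum" if value > 0 else "local maximum"
    return n, value, kind

# y = (3 - x)**4 expanded: 81 - 108x + 54x^2 - 12x^3 + x^4
result = nth_derivative_test([81, -108, 54, -12, 1], 3)
```

For the example above the sketch stops at the fourth derivative, whose value at $x_0 = 3$ is 24, and reports a local minimum.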
As another example, let us find the quantity produced that guarantees the
maximum total revenue for a firm that faces the demand function $q = 20 - 2p$ (inverse demand
function: $p = 10 - 0{,}5q$).
First, the total revenue function can be found by multiplying price and quantity:
$R = p \cdot q = (10 - 0{,}5q) \cdot q = 10q - 0{,}5q^2$.
The necessary condition is that the derivative of the revenue function, also known as marginal
revenue, has to be equal to 0 (the exclamation mark indicates that the two sides of the equation have been
set equal):
$R' = MR = 10 - q \overset{!}{=} 0$,
and the candidate for the optimal quantity is $q^* = 10$. In order to test whether it really is a maximum point,
we find the second-order derivative:
$R'' = MR' = -1$.


As it is negative, the quantity $q^* = 10$ really ensures the maximum revenue, which is:
$R_{max} = 10 \cdot 10 - 0{,}5 \cdot 10^2 = 50$.
The situation is presented in Figure 5.2 with quantity on the horizontal axis. On the vertical axis
three variables are depicted: price, total revenue and marginal revenue. The vertical intercept of
the demand and marginal revenue functions is the same, but the marginal revenue curve decreases twice
as fast. If no units are sold, then there is no revenue either. Hence, the total revenue curve starts
from the origin. At the same time, if the price is 0, the revenue is 0 as well, so the total revenue curve
intersects the horizontal axis at the same point where the demand curve intersects the horizontal axis.
The maximum point of total revenue is at the same quantity where the marginal revenue curve
intersects the horizontal axis (marginal revenue is 0 there).
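
The revenue example can be checked with exact arithmetic. This is a minimal sketch, assuming a linear inverse demand function $p = a - bq$ with $b > 0$; the function name max_revenue_linear_demand is hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def max_revenue_linear_demand(a, b):
    """Revenue R = a*q - b*q**2 for inverse demand p = a - b*q (b > 0)."""
    a, b = Fraction(a), Fraction(b)
    q_star = a / (2 * b)                 # first-order condition: MR = a - 2*b*q = 0
    second_derivative = -2 * b           # R'' = MR' = -2*b, negative when b > 0
    revenue = a * q_star - b * q_star ** 2
    return q_star, second_derivative, revenue

# The example from the text: p = 10 - 0.5*q
q_star, r_second, r_max = max_revenue_linear_demand(10, Fraction(1, 2))
```

With $a = 10$ and $b = 0{,}5$ the sketch gives $q^* = 10$, a second-order derivative of $-1$ and a maximum revenue of 50, in line with the calculation above.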

Figure 5.2. Demand, marginal revenue and total revenue (horizontal axis: $q$; vertical axis: $p$, $R$, $MR$; curves: $MR = 10 - q$, $p_D = 10 - 0{,}5q$, $R = 10q - 0{,}5q^2$)


As a remark, in the case of a linear demand curve, the vertical intercepts of the demand and marginal
revenue functions always coincide and the marginal revenue curve always decreases twice
as fast. That can be easily shown. Let the inverse demand function be $p = a - bq$. The total revenue
function is then
$R = p \cdot q = (a - bq) \cdot q = aq - bq^2$
and the marginal revenue function
$MR = R' = a - 2bq$.
One simple example of optimization is maximizing the government's tax revenue $R_G$. In the case
of an excise tax, the tax revenue is $R_G = q_T \cdot T$, where the quantity bought and sold depends on the size
of the tax: $q_T = f(T)$. Assume that $q_T = 12 - \frac{12}{7}T$. The tax revenue is then:
$R_G = q_T \cdot T = 12T - \frac{12}{7}T^2$.
The revenue-maximizing tax size can be found from $R_G' = 12 - \frac{24}{7}T = 0$
and is $T^* = 3{,}5$. Since $R_G'' = -\frac{24}{7} < 0$, it is really a maximum.
Interesting results can be obtained when maximizing the revenue of a firm. In the case of a linear
demand function $p = a - bq$ the revenue is $R = pq = aq - bq^2$ and the revenue is maximized when
$R' = MR = a - 2bq = 0$, that is when $q^* = \frac{1}{2}\frac{a}{b}$. If the good were given away for free, the quantity
consumed would be (from the equation $0 = a - bq$) $q(0) = \frac{a}{b}$. Thus, the revenue is the largest when
half of the maximum demand is satisfied. The price should then be $p = a - b \cdot \frac{1}{2}\frac{a}{b} = a - \frac{a}{2} = \frac{a}{2}$, which
in turn is half of the price at which the consumers stop consuming, $p(0) = a - b \cdot 0 = a$. This result
is related to the fact that if the sum of two variables is fixed, then the smaller the difference between
these variables, the larger their product.
In the case of a constantly unit elastic demand curve $p = \frac{a}{q}$, the revenue is constant,
$R = pq = \frac{a}{q} \cdot q = a$, and maximization is impossible (recall that in that case $MR = 0$ for all
quantities).
One very basic example of simple optimization is profit maximization in microeconomics.
The profit $\pi = R - C$ is maximized by the quantity at which $\pi' = R' - C' = MR - MC = 0$. This is a
basic formula in microeconomics and usually no second-order condition is discussed. When we
look at the second-order condition, a maximum can be confirmed when
$\pi'' = R'' - C'' = MR' - MC' < 0$ or $MR' < MC'$.
Since according to common assumptions in microeconomics the marginal revenue is decreasing
($MR' < 0$) and the marginal cost is increasing ($MC' > 0$), the condition $MR' < MC'$ is always
satisfied under these assumptions and there is no need to inspect it every time.

5.2. Extremums of a function of two variables


Often functions of more than one variable have to be optimized, for example if a firm that produces
more than one good wants to find a production plan that maximises its total profit. Next,
the optimization of a function of two arguments is introduced.
When optimizing the function $z = f(x, y)$, values of $x$ and $y$ are sought that give the maximum
or minimum value of $z$. The geometrical representation of a function of two arguments is a surface in
three-dimensional space. If $z$ is on the vertical axis, then the local maximum is at the highest point of
a concave surface and the local minimum at the lowest point of a convex surface (see Figure
5.3).
Analogously to the function of one argument, first- and second-order conditions are used here as
well.
The first-order condition should give the critical points where the value of $z$ neither increases nor
decreases with respect to either argument. Hence, the differential of $z$ must be 0 at a critical point.
More precisely, if we give infinitely small changes to $x$ ($dx \to 0$) and $y$ ($dy \to 0$), then the
approximate estimate of the change of $z$, the differential $dz$, has to be 0. Hence:
$dz = f_x(x_0, y_0)\,dx + f_y(x_0, y_0)\,dy = 0$.
As $dx \neq 0$ and $dy \neq 0$, the first-order condition takes the following form:
$f_x(x_0, y_0) = f_y(x_0, y_0) = 0$,
meaning that the derivatives with respect to both arguments have to be equal to 0 at the critical
points.


Figure 5.3. Maximum and minimum points of a function with two arguments (two panels, each with axes $x$, $y$ and $z$)

It can be seen from Figure 5.3 that at the extremum points the tangent plane is parallel to the xy-plane.
The partial derivative $f_x$ gives the slope of the tangent line at the particular point that is
parallel to the xz-plane. As mentioned in the geometrical interpretation of partial derivatives, when a
straight line that lies on the tangent plane and is parallel to the xz-plane is projected onto the xz-plane, the slope of
this projection is $f_x$. At a critical point this straight line is horizontal with a slope equal
to 0 and the partial derivative $f_x$ has to be 0. Analogously, $f_y$ also has to be equal to 0.
The first-order condition is necessary for the extremums, but again not sufficient. A critical point
may turn out to be a saddle point, where the value of $z$ is maximal with respect to one argument, but
minimal with respect to the other argument. It may also be an inflection point with respect to one or
both arguments. Hence, again a second-order condition is needed.

Figure 5.4. Saddle and inflection points (two panels, each with axes $x$, $y$ and $z$)


For a maximum, the function has to be increasing ($dz > 0$) with respect to both arguments before
the critical point and decreasing ($dz < 0$) with respect to both arguments after the critical point.
Hence, the second-order differential $d^2z$ has to be negative ($dz$ changes from positive to negative)
at this critical point:
$d^2z < 0$.
Analogously, the second-order differential $d^2z$ has to be positive for a minimum:
$d^2z > 0$.
The second-order differential can be expressed as follows (see Appendix 1):
$d^2z = f_{xx}\,dx^2 + 2 f_{xy}\,dx\,dy + f_{yy}\,dy^2$.
For a maximum it has to be negative, for a minimum positive.
It can be shown (see Appendix 1) that $f_{xx}\,dx^2 + 2 f_{xy}\,dx\,dy + f_{yy}\,dy^2$ is negative for all
differentials of the arguments (not both equal to 0) if the following holds:
$f_{xx} < 0$, $f_{yy} < 0$ and $f_{xx} f_{yy} > f_{xy}^2$.


Then $d^2z < 0$ and a maximum can be concluded.

The expression $f_{xx}\,dx^2 + 2 f_{xy}\,dx\,dy + f_{yy}\,dy^2$ is positive if the following holds:

$f_{xx} > 0$, $f_{yy} > 0$ and $f_{xx} f_{yy} > f_{xy}^2$.

Then $d^2z > 0$ and a minimum can be concluded.
Hence, as a second-order condition, the second-order derivatives with respect to both arguments
have to be negative for confirming a maximum and positive for confirming a minimum. In
addition, both for confirming a maximum and for confirming a minimum, the product of these
partial derivatives has to be larger than the squared mixed partial derivative.
Taking the necessary and sufficient conditions together (for a better overview, $(x_0, y_0)$ is left out):

if $f_x = f_y = 0$ and $f_{xx} < 0$, $f_{yy} < 0$ and $f_{xx} f_{yy} > f_{xy}^2$, it is a local maximum point;

if $f_x = f_y = 0$ and $f_{xx} > 0$, $f_{yy} > 0$ and $f_{xx} f_{yy} > f_{xy}^2$, it is a local minimum point.


Here, it must be pointed out again that a sufficient condition is not necessary. A critical point may
turn out to be a minimum or maximum after further investigation (that is not discussed here).
If there are even more arguments in a function, the second-order condition expressed
analogously would be very long and unclear. Therefore, the second-order conditions are often
expressed in matrix form.
As the second-order condition includes different second-order partial derivatives, they are collected
into one matrix that is called the Hessian (matrix). In the case of a function with two
arguments, the Hessian is a $2 \times 2$ matrix:
$H = \begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{bmatrix}$.
In order to determine the character of the critical points, all principal minors (more precisely, the
leading principal minors) are found. The first-order principal minor is the first element of the matrix. For the
second-order principal minor a submatrix is found that includes the first two elements of the principal diagonal.
The determinant of this submatrix gives the second-order principal minor. The third-order principal minor is
based on the submatrix including the first three elements of the principal diagonal, and so on.
Let us denote by $|H_i|$ the principal minor that includes the elements from the first $i$ rows and the first $i$
columns. In the case of a function of two arguments, the first- and second-order principal minors can
be calculated:
$|H_1| = |f_{xx}| = f_{xx}$,
$|H_2| = \begin{vmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{vmatrix} = f_{xx} f_{yy} - f_{xy}^2$.

As can be seen, instead of $f_{xx}$ we can write $|H_1|$. If $f_{xx} f_{yy} > f_{xy}^2$, then $f_{xx} f_{yy} - f_{xy}^2 > 0$ and
$|H_2| > 0$. Thus, the first- and second-order conditions can be rewritten as:

if $f_x = f_y = 0$ and $|H_1| < 0$ and $|H_2| > 0$, it is a local maximum point;

if $f_x = f_y = 0$ and $|H_1| > 0$ and $|H_2| > 0$, it is a local minimum point.


Optimizing the function $z = f(x, y)$:

First-order condition, also known as the necessary condition:
$z_x = z_y = 0$: finding the derivatives and setting them equal to 0 gives a system of two equations in
two variables; solving it gives the critical values of $x$ and $y$, that is, a critical point.
Second-order condition, also known as the sufficient condition:
if $z_{xx} > 0$ and $z_{xx} z_{yy} - z_{xy} z_{yx} > 0$, a minimum can be confirmed,

if $z_{xx} < 0$ and $z_{xx} z_{yy} - z_{xy} z_{yx} > 0$, a maximum can be confirmed.

It can also be expressed with the help of the Hessian determinant:
$|H| = \begin{vmatrix} z_{xx} & z_{xy} \\ z_{yx} & z_{yy} \end{vmatrix}$

by finding all principal minors (starting from the upper left corner):

$|H_1| = |z_{xx}| = z_{xx}$, $|H_2| = \begin{vmatrix} z_{xx} & z_{xy} \\ z_{yx} & z_{yy} \end{vmatrix} = z_{xx} z_{yy} - z_{xy}^2$.

For a minimum to be confirmed: $|H_1| > 0$ and $|H_2| > 0$,

for a maximum to be confirmed: $|H_1| < 0$ and $|H_2| > 0$,

otherwise nothing can be confirmed (it needs further investigation).
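
The second-order condition for two arguments is easy to turn into a small check. The following is a minimal sketch; the function name classify_critical_point and the sample values are assumptions made for illustration, and it presumes that the first-order condition already holds at the point and that $z_{xy} = z_{yx}$.

```python
def classify_critical_point(z_xx, z_xy, z_yy):
    """Second-order condition at a critical point (z_x = z_y = 0 assumed)."""
    h1 = z_xx                        # first-order principal minor
    h2 = z_xx * z_yy - z_xy ** 2     # second-order principal minor (z_xy = z_yx)
    if h1 > 0 and h2 > 0:
        return "local minimum"
    if h1 < 0 and h2 > 0:
        return "local maximum"
    return "not confirmed"           # needs further investigation

# Illustrative values: z = x**2 + y**2 has z_xx = z_yy = 2 and z_xy = 0.
bowl = classify_critical_point(2, 0, 2)
# Illustrative values with a negative second-order principal minor.
unclear = classify_critical_point(1, 2, 1)
```

The sketch deliberately returns "not confirmed" in every other case, mirroring the rule above that nothing can be concluded without further investigation.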


As an example, let us look at a firm that produces two goods ($q_1$ and $q_2$) and wants to maximize
its profit. Let the inverse demand functions be $p_1 = 5 - 0{,}5q_1$ and $p_2 = 24 - q_2$. Let us assume that
the total costs depend on the produced quantities as follows: $C(q_1, q_2) = 5 + q_1^2 - q_1 q_2 + q_2^2$. The
profit function can be found as:
$\pi = R - C = p_1 q_1 + p_2 q_2 - C(q_1, q_2)$
and more specifically:
$\pi = (5 - 0{,}5q_1) q_1 + (24 - q_2) q_2 - (5 + q_1^2 - q_1 q_2 + q_2^2) =$
$= -1{,}5q_1^2 + 5q_1 + q_1 q_2 + 24q_2 - 2q_2^2 - 5$.
For finding critical points, both partial derivatives are set equal to 0:
$\frac{\partial \pi}{\partial q_1} = -3q_1 + 5 + q_2 \overset{!}{=} 0$,
$\frac{\partial \pi}{\partial q_2} = q_1 + 24 - 4q_2 \overset{!}{=} 0$.

Solving the equation system gives one critical point, where $q_1^* = 4$ and $q_2^* = 7$. The profit would
then be:
$\pi = -1{,}5 \cdot 4^2 + 5 \cdot 4 + 4 \cdot 7 + 24 \cdot 7 - 2 \cdot 7^2 - 5 = 89$.
Whether it is a maximum can be tested with the help of the Hessian:
$H = \begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{bmatrix} = \begin{bmatrix} -3 & 1 \\ 1 & -4 \end{bmatrix}$.

The principal minors are:

$|H_1| = -3$,
$|H_2| = (-3)(-4) - 1 \cdot 1 = 11$.
The positive second-order principal minor shows that it is an extremum and the negative first-order
principal minor indicates that it really is a maximum.
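
The two-good example can be reproduced step by step. This is a minimal sketch with exact fractions: the first-order conditions are written as a linear system and solved with Cramer's rule, which is a choice made for this illustration and not a method prescribed by the text; the variable names are likewise assumptions.

```python
from fractions import Fraction

def profit(q1, q2):
    """Profit of the two-good firm: revenue minus cost."""
    revenue = (5 - Fraction(1, 2) * q1) * q1 + (24 - q2) * q2
    cost = 5 + q1 ** 2 - q1 * q2 + q2 ** 2
    return revenue - cost

# First-order conditions rearranged as a linear system:
#   -3*q1 +   q2 = -5
#      q1 - 4*q2 = -24
a11, a12, b1 = -3, 1, -5
a21, a22, b2 = 1, -4, -24

# The coefficient matrix equals the Hessian, so its determinant is |H_2|.
h1 = a11
h2 = a11 * a22 - a12 * a21

# Cramer's rule for the critical point.
q1_star = Fraction(b1 * a22 - a12 * b2, h2)
q2_star = Fraction(a11 * b2 - b1 * a21, h2)
max_profit = profit(q1_star, q2_star)
is_maximum = h1 < 0 and h2 > 0
```

Because the profit function is quadratic, its first-order conditions are linear and the coefficient matrix of the system coincides with the Hessian, so one determinant serves both steps.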

5.3. Optimization of a function of n variables


The conditions for extremums in the case of a function of $n$ arguments $z = f(x_1, x_2, \ldots, x_n)$ are
analogous to those used for functions with two arguments.
According to the first-order condition (necessary condition), a point is a critical point if the partial
derivatives with respect to all arguments are equal to 0 at that point:
$f_1 = f_2 = \ldots = f_n = 0$,
where $f_i = \frac{\partial z}{\partial x_i}$.
For the second-order condition a Hessian is used that includes all possible second-order partial
derivatives:
$H = \begin{bmatrix} f_{11} & f_{12} & \cdots & f_{1n} \\ f_{21} & f_{22} & \cdots & f_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ f_{n1} & f_{n2} & \cdots & f_{nn} \end{bmatrix}$.
From that it is possible to find all principal minors $|H_i|$, where $i = 1, \ldots, n$.
In the case of principal minors, only their sign is important, so where possible one can just
determine the sign without completing complex calculations. Depending on the signs of the
principal minors:
if $f_1 = f_2 = \ldots = f_n = 0$ and $|H_1| < 0$, $|H_2| > 0$, $|H_3| < 0$ etc., that is, $(-1)^i |H_i| > 0$, it is a local maximum
point;
if $f_1 = f_2 = \ldots = f_n = 0$ and $|H_1| > 0$, $|H_2| > 0$, $|H_3| > 0$ etc., that is, $|H_i| > 0$, it is a local minimum
point.
Hence, if the necessary condition is satisfied and all principal minors are positive, one can conclude
a local minimum. If the necessary condition is satisfied and the sign of the principal minors alternates
(starting with a negative one), a local maximum can be concluded. In other cases the extremum cannot be
confirmed, but this possibility cannot be ruled out either, because further analysis (that is not discussed
here) may still confirm the minimum or maximum.
Optimizing the function $z = f(x_1, x_2, \ldots, x_n)$:
First-order condition, also known as the necessary condition:
$z_1 = z_2 = \ldots = z_n = 0$ (where $z_i = \frac{\partial z}{\partial x_i}$): finding the derivatives and setting them equal to 0 gives a
system of $n$ equations in $n$ variables; solving it gives the critical values of $x_1, x_2, \ldots, x_n$, that is, a critical point
or points.

Second-order condition, also known as the sufficient condition:

can be expressed with the help of the Hessian determinant:
$|H| = \begin{vmatrix} z_{11} & z_{12} & \cdots & z_{1n} \\ z_{21} & z_{22} & \cdots & z_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ z_{n1} & z_{n2} & \cdots & z_{nn} \end{vmatrix}$
by finding all principal minors (starting from the upper left corner).
For a minimum to be confirmed:
$|H_1| > 0$, $|H_2| > 0$, $|H_3| > 0$, $|H_4| > 0$ and so on (the sign stays the same);
for a maximum to be confirmed:
$|H_1| < 0$, $|H_2| > 0$, $|H_3| < 0$, $|H_4| > 0$ and so on (the sign keeps alternating);
otherwise nothing can be confirmed (it needs further investigation).
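
The sign pattern of the principal minors can be checked for a Hessian of any size. The following is a minimal sketch with exact fractions; the function names and the sample matrices are assumptions made for illustration, and the determinant is computed by Gaussian elimination, which is one of several possible ways to obtain it.

```python
from fractions import Fraction

def determinant(matrix):
    """Determinant by Gaussian elimination with exact fractions."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n, det = len(m), Fraction(1)
    for col in range(n):
        # Find a row with a non-zero entry in this column to use as pivot.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det                       # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def principal_minors(hessian):
    """Minors |H_1|, ..., |H_n| taken from the upper left corner."""
    return [determinant([row[:i] for row in hessian[:i]])
            for i in range(1, len(hessian) + 1)]

def classify(hessian):
    """Second-order condition at a critical point of n arguments."""
    minors = principal_minors(hessian)
    if all(m > 0 for m in minors):
        return "local minimum"
    if all((-1) ** i * m > 0 for i, m in enumerate(minors, start=1)):
        return "local maximum"
    return "not confirmed"

# Illustrative 3 x 3 Hessian with constant elements (hypothetical values).
sample = [[-2, 1, 0], [1, -2, 1], [0, 1, -2]]
sample_minors = principal_minors(sample)
```

For the sample matrix the minors are $-2$, $3$ and $-4$, so the signs alternate starting from a negative one and the sketch reports a local maximum.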
When constructing the Hessian, the equation system of the first-order conditions can be used as a
basis: these conditions can be differentiated again with respect to the same variables (the
order of the variables has to be the same). In this way the first row of the Hessian is obtained
from the first condition, the second row from the second condition and so on.
As a remark, when the second-order derivative found when testing the second-order
condition is a constant, the extremum is not only a relative, but also an absolute (global) extremum. For
example, if the second-order derivative of the function $y = f(x)$ is a positive or negative number
(no variables), then the function is convex or concave in the whole domain and the extremum is
global. Analogously, an extremum is global when the elements of the Hessian are all constants.
Let us look at an example of price discrimination: a firm produces one good and sells it in
different markets (to different consumers) at different prices. The total product is then the sum of the
products sold in the different markets: $q = q_1 + \ldots + q_n$. Since the technology is the same, the cost
function is a function of the total product:
$c = c(q) = c(q_1 + \ldots + q_n)$.
The revenue functions, however, are different for each market:
$R = R_1(q_1) + \ldots + R_n(q_n)$.
The profit can be found as:
$\pi = R_1(q_1) + \ldots + R_n(q_n) - c(q_1 + \ldots + q_n) = R_1(q_1) + \ldots + R_n(q_n) - c(q)$.
When maximizing the profit:
$\frac{d\pi}{dq_1} = R_1'(q_1) - c'(q)\frac{\partial q}{\partial q_1} = R_1'(q_1) - c'(q) = MR_1 - MC = 0$,
$\ldots$
$\frac{d\pi}{dq_n} = R_n'(q_n) - c'(q)\frac{\partial q}{\partial q_n} = R_n'(q_n) - c'(q) = MR_n - MC = 0$.
Hence, the profit is maximized if:
$MR_1 = \ldots = MR_n = MC$.
From here, usually a condition is derived that includes the elasticity. For that, all marginal revenue
functions are transformed in the following way ($p(q)$ is an inverse demand function):

$MR = \frac{dR}{dq} = \frac{d(q \cdot p(q))}{dq} = 1 \cdot p(q) + q\frac{dp}{dq} = p(q)\left[1 + \frac{q}{p(q)}\frac{dp}{dq}\right] = p\left[1 + \frac{q}{p}\frac{dp}{dq}\right] = p\left(1 + \frac{1}{\varepsilon}\right)$,
where $\varepsilon$ is the price elasticity of demand that shows how sensitive the consumers are with respect
to changes in price. (It has to be kept in mind here that in the case of traditional demand
functions $\varepsilon$ is always negative.)
Hence, the profit-maximizing condition is:
$p_1\left(1 + \frac{1}{\varepsilon_1}\right) = \ldots = p_n\left(1 + \frac{1}{\varepsilon_n}\right) = MC$.
Assuming that the marginal cost is positive, the marginal revenues should be positive as well. For
that (since the price is positive) the expression $1 + \frac{1}{\varepsilon}$ should be positive. Since $\varepsilon$ is negative,
it must hold that $\left|\frac{1}{\varepsilon}\right| < 1$ and for that it must hold that $|\varepsilon| > 1$.

Hence, in the case of a linear demand function, a monopolist chooses among prices that are
higher than half of the maximum price (at which consumers stop buying), because demand is elastic
($|\varepsilon| > 1$) only on that part of a linear demand curve. This is logical:
if the revenue started to decrease when moving further, there would be no point in expanding
production.
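
The relation between marginal revenue and elasticity can be verified numerically for a linear demand curve. This is a minimal sketch, assuming the inverse demand function $p = 10 - 0{,}5q$ from the revenue example in section 5.1; the function names and the chosen quantities are assumptions made for illustration.

```python
from fractions import Fraction

A, B = Fraction(10), Fraction(1, 2)     # inverse demand p = A - B*q

def price(q):
    return A - B * q

def marginal_revenue(q):
    return A - 2 * B * q                # derivative of R = A*q - B*q**2

def elasticity(q):
    # epsilon = (dq/dp) * (p/q); for linear demand dq/dp = -1/B
    return (-1 / B) * price(q) / q

def mr_from_elasticity(q):
    return price(q) * (1 + 1 / elasticity(q))

# Compare both expressions for marginal revenue at a few quantities.
checks = [(q, marginal_revenue(q), mr_from_elasticity(q)) for q in (4, 10, 15)]
```

At $q = 4$ the price is 8, above half of the maximum price, the elasticity is $-4$ and marginal revenue is positive; at $q = 15$ the price is below half of the maximum price, $|\varepsilon| < 1$ and marginal revenue is negative.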
Figure A5.1. The domain where a monopolist operates (vertical axis: $p$, with $a$ and $0{,}5a$ marked; curves: $R$, $MR$ and $D$)


It is worth noting that in the case of a hyperbolic demand curve $p = \frac{a}{q}$ ($MR = 0$), there are no
possibilities for a monopolist. In this case the revenue is constant and thus the profit is maximized
when the cost is minimized. Hence, there is no point for a monopolist to produce anything.
The sensitivity of the consumers determines the price that can be asked in a certain market. The
more sensitive the consumers are, the higher is the absolute value of the elasticity $|\varepsilon|$ and the
smaller is $\frac{1}{|\varepsilon|}$, and thus the larger is the expression $1 + \frac{1}{\varepsilon} = 1 - \frac{1}{|\varepsilon|}$ (negative elasticity!). Since it
must hold that $p\left(1 + \frac{1}{\varepsilon}\right) = MC$, the larger the expression in brackets, the lower the price has to be.
In the case of more sensitive consumers a lower price has to be asked.

6. OPTIMIZATION WITH CONSTRAINTS

6.1. Extremums in an interval


Sometimes the domain of a function is constrained: the argument can have values only in a certain
interval. For example, there may be a lower or upper limit for price or the productive capacity of a
firm may be limited etc. This is called optimization in an interval.

Figure 6.1. Example of an interval (graph of $y = f(x)$ with a local maximum and a local minimum; the interval from $a$ to $b$ is marked on the $x$-axis)


An interval can be understood as a part of the function's domain that can be written as
$x \in [a, b]$ or $a \le x \le b$. In an interval it is (almost) always possible to find the absolute minimum
and maximum. It can be seen from Figure 6.1 that a local minimum or maximum may happen to
lie outside the interval. In that case the minimum or maximum of the interval can be found at the
limit points (the endpoints of the interval). It may also happen that although there is a local maximum (or minimum) in the
interval, the interval includes points where the value of the function is even higher (lower) than at
this local maximum (minimum).
Hence, when looking for the minimum and maximum in an interval, the limit points have to be
considered in addition to the local extremums. First, the critical points are found using the first-order
condition (their character is not important in this context). Only those critical points
that lie in the interval are taken into account, and the values of $y$ at those points are found. Second,
the values of $y$ at the limit points are found. Last, the values of $y$ found are compared: the largest
and smallest values are the maximum and minimum, respectively, in this interval.
For example, to find the maximum and minimum of the function $y = x^3 - 3x^2 - 24x + 100$ in the
interval $x \in [0, 10]$, first we find the critical points:
$y' = 3x^2 - 6x - 24 \overset{!}{=} 0$,
and from that $x_1 = -2$ and $x_2 = 4$. The point $x_1 = -2$ is not in $x \in [0, 10]$ and is left out.
Then we calculate the values of $y$ at the other critical point and at the limit points:
$y(4) = 4^3 - 3 \cdot 4^2 - 24 \cdot 4 + 100 = 20$,
$y(0) = 0^3 - 3 \cdot 0^2 - 24 \cdot 0 + 100 = 100$,
$y(10) = 10^3 - 3 \cdot 10^2 - 24 \cdot 10 + 100 = 560$.
Comparison gives us:
$\max(20; 100; 560) = 560$,
$\min(20; 100; 560) = 20$.
Hence, in the interval $x \in [0, 10]$, $(10; 560)$ is the maximum point and $(4; 20)$ is the minimum point.
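
The procedure of comparing critical points and limit points can be sketched in a few lines. This is only an illustration for the cubic function of the example; the function names are assumptions, and the critical points are obtained with the quadratic formula because the derivative here is a quadratic.

```python
import math

def f(x):
    return x ** 3 - 3 * x ** 2 - 24 * x + 100

def extrema_on_interval(lower, upper):
    """Return ((x_max, y_max), (x_min, y_min)) of f on [lower, upper]."""
    # Critical points from y' = 3x^2 - 6x - 24 = 0 (quadratic formula).
    a, b, c = 3, -6, -24
    root = math.sqrt(b * b - 4 * a * c)
    critical = [(-b - root) / (2 * a), (-b + root) / (2 * a)]
    # Keep only critical points inside the interval, then add the limit points.
    candidates = [x for x in critical if lower <= x <= upper] + [lower, upper]
    values = {x: f(x) for x in candidates}
    x_max = max(values, key=values.get)
    x_min = min(values, key=values.get)
    return (x_max, values[x_max]), (x_min, values[x_min])

maximum, minimum = extrema_on_interval(0, 10)
```

For the interval from 0 to 10 the candidates are $x = 4$, $x = 0$ and $x = 10$; the sketch evaluates the function at each of them and picks the largest and smallest values.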

As an example from economics, let us find the quantity that has to be produced to maximize the profit
of a firm with the cost function $C = 100 + 10q$ and a productive capacity of 150 units, if the firm
faces the inverse demand function $p = 50 - 0{,}1q$. First, we can derive the revenue function from the
demand:
$R = q \cdot p = q \cdot (50 - 0{,}1q) = 50q - 0{,}1q^2$,
and then the profit function as the difference between the revenue and the costs:
$\pi = R - C = (50q - 0{,}1q^2) - (100 + 10q)$ or
$\pi = -0{,}1q^2 + 40q - 100$.
The local extremums can be found with the help of the derivative:
$\frac{d\pi}{dq} = -0{,}2q + 40 \overset{!}{=} 0$.
The critical point is then $q^* = 200$, but it is outside the interval $q \in [0; 150]$. Hence, we have to
consider the limit points: if nothing is produced, then:
$\pi(0) = -0{,}1 \cdot 0^2 + 40 \cdot 0 - 100 = -100$,
if the maximum capacity is used, then:
$\pi(150) = -0{,}1 \cdot 150^2 + 40 \cdot 150 - 100 = 3650$.
Hence, it is optimal for this firm to produce as much as possible, 150 units. The situation is also
depicted in Figure 6.2.
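
The effect of the capacity limit can be made visible by varying it. This is a minimal sketch with exact fractions; the function names are assumptions, and the capacity of 250 units used for comparison is a hypothetical value that does not appear in the example.

```python
from fractions import Fraction

def profit(q):
    """Profit function of the example: -0.1*q**2 + 40*q - 100."""
    return -Fraction(1, 10) * q ** 2 + 40 * q - 100

def best_quantity(capacity):
    """Profit-maximizing quantity on the interval [0, capacity]."""
    unconstrained = Fraction(200)       # from d(profit)/dq = -0.2*q + 40 = 0
    candidates = [Fraction(0), Fraction(capacity)]
    if 0 <= unconstrained <= capacity:
        candidates.append(unconstrained)
    return max(candidates, key=profit)

q_limited = best_quantity(150)          # capacity from the example
q_roomy = best_quantity(250)            # hypothetical larger capacity
```

With a capacity of 150 units the limit point is chosen; with the hypothetical capacity of 250 units the local maximum at 200 units lies inside the interval and is chosen instead.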

Figure 6.2. Maximising profit with a capacity constraint (horizontal axis: $q$, with 150 and 200 marked; vertical axis: profit, with $-100$ and 3650 marked)


In the case of a simple function $y = f(x)$, optimization with constraints means that the range of $x$
is limited, $a \le x \le b$, and we are optimizing in an interval.
In an interval the candidate points for a maximum or minimum are:
first, the local extremums that happen to be in the interval (usually 0 to 2 values of $y$),
second, the limit points (2 values of $y$),
hence when optimizing we have to:
find the highest and lowest values of $y$ (among these 2 to 4 values of $y$).

6.2. Optimizing the function of two variables with one constraint


In economics there are many problems that involve optimizing with constraints. There are two
options: aiming at a maximum result at fixed costs or aiming at minimum costs for a fixed result. For
example, consumers aim to maximize their utility subject to a budget constraint, or firms aim
to minimize the costs of producing a certain quantity. When optimizing with constraints, the domain of the
function in which the optimal solution can be found is limited.
Sometimes the function has a form that makes optimizing without any constraints impossible. For
example, in the case of a simple Cobb-Douglas type utility function $u = q_1 q_2^2$, as the quantities
increase, the utility increases and that can continue infinitely. However, in reality there is a budget
constraint and so the highest utility level can be found for a given budget constraint.
Optimizing a function of more than one variable with constraints is in some sense similar to
optimization in an interval. When optimizing with constraints, the extremums are sought in a
situation where the arguments of the function have to satisfy one or more equations that relate the
arguments to each other. These equations are called constraints. The function that is optimized is
called the objective function. In the case of optimizing a function of two arguments with one
constraint, the objective function is:
$z = f(x, y)$
and the constraint may be expressed as:
$g(x, y) = b$,
where $g(x, y)$ is the constraint function and $b$ can be called the constraint constant.
The optimization problem then is to find the minimum or maximum of the objective function
$z = f(x, y)$ when the arguments $x$ and $y$ have to satisfy the constraint $g(x, y) = b$.
The extremums subject to constraints may not coincide with the free extremums (found without any
constraints): a maximum with constraints can be equal to or smaller than the free maximum, and a
minimum with constraints can be equal to or larger than the free minimum.

Figure 6.3. Geometrical interpretation of the optimization with constraints (axes $x$ and $y$)


The number of constraints has to be smaller than the number of arguments in the objective
function. That can be seen from Figure 6.3. If $z$ is placed on the vertical axis, then a free maximum
of a function with two arguments (that corresponds to a concave surface) is the highest point of the
surface. If the constraint states a linear relationship between $x$ and $y$, its geometrical interpretation
is a vertical plane (parallel to the z-axis). Hence, the possible solutions for maximizing are
now on the curve that forms the intersection of the plane and the concave surface. The highest point
on that curve is the maximum subject to that constraint. If we added one more constraint, it would restrict
the possible solutions to one point (A) and optimization would become meaningless. Hence in the case
of two arguments only one constraint can be used.
There are many methods for finding extremums with constraints. One of them is the substitution
method: the constraint is solved for one argument and the resulting expression is substituted into the
objective function in place of that argument. This can be done for several constraints. Thus, the
number of arguments in the objective function decreases by the number of constraints. Then, the
objective function is optimized with respect to the arguments that are still in the objective function
and the values of the other arguments can be found from the constraint equations.
Let us try this method on an example from economics. Let the utility function that is to be
maximized be $u = q_1 q_2^2$. The sum of expenditures on both goods (the price $p_i$ multiplied by the
quantity $q_i$) has to be equal to the consumer's income $m$:
$p_1 q_1 + p_2 q_2 = m$.
Let the prices be 4 and 2, respectively, and the income 60. Hence, the budget constraint is:
$4q_1 + 2q_2 = 60$.
First, we can solve the constraint for the first quantity, $q_1 = 15 - 0{,}5q_2$, and substitute it into the
objective function:
$u = (15 - 0{,}5q_2) q_2^2 = 15q_2^2 - 0{,}5q_2^3$.
Now, we can take the derivative and set it equal to 0:
$\frac{du}{dq_2} = 30q_2 - 1{,}5q_2^2 \overset{!}{=} 0$, from here
$(30 - 1{,}5q_2) q_2 = 0$.
There are two solutions: $q_2 = 0$ and $q_2 = 20$. In order to determine the quantity that gives the
maximum utility, we determine the sign of the second-order derivative for both solutions:
$\frac{d^2u}{dq_2^2}(0) = 30 - 3q_2 = 30 - 3 \cdot 0 = 30 > 0$,
$\frac{d^2u}{dq_2^2}(20) = 30 - 3q_2 = 30 - 3 \cdot 20 = -30 < 0$.
Hence, $q_2 = 0$ gives the minimum and $q_2 = 20$ the maximum utility.
If $q_2^* = 20$, the corresponding quantity of the first good is $q_1^* = 15 - 0{,}5 \cdot 20 = 5$ and the maximum
utility is $u^* = 5 \cdot 20^2 = 2000$.
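
The steps of the substitution method can be retraced with exact arithmetic. This is a minimal sketch of the example above; the function names are assumptions made for illustration, and the derivatives are entered by hand rather than computed symbolically.

```python
from fractions import Fraction

def q1_from_budget(q2):
    """Budget constraint 4*q1 + 2*q2 = 60 solved for q1."""
    return 15 - Fraction(1, 2) * q2

def utility(q2):
    """Objective after substitution: u = (15 - 0.5*q2) * q2**2."""
    return q1_from_budget(q2) * q2 ** 2

def d_utility(q2):
    return 30 * q2 - Fraction(3, 2) * q2 ** 2    # first-order derivative

def d2_utility(q2):
    return 30 - 3 * q2                           # second-order derivative

critical_points = [0, 20]                        # roots of (30 - 1.5*q2) * q2 = 0
# The maximum is the critical point with a negative second-order derivative.
best_q2 = next(q for q in critical_points if d2_utility(q) < 0)
best_q1 = q1_from_budget(best_q2)
best_u = utility(best_q2)
```

The sketch confirms that both critical points satisfy the first-order condition and that only $q_2 = 20$ passes the second-order test for a maximum.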
Although this method seems simple, the replacements and transformations may become quite
complex in the case of more arguments and constraints.

6.3. Lagrange method


The most widely used method for optimizing with constraints is the Lagrange method. This method
consists in constructing a new function, the Lagrangian function. All constraints are inserted into the
objective function without losing any arguments from the objective function, which is one
advantage of this method.
The Lagrangian function (or Lagrangian) is constructed in the following way. First, the constraint(s) have
to be in a form where the constant $b$ and the constraint function are both on the same side of the
equation, leaving 0 on the other side:
$b - g(x, y) = 0$.
Next, each constraint is multiplied by a coefficient called a Lagrange multiplier and these products
are then added to the objective function. The Lagrange multiplier (denoted by $\lambda$) is an additional
(auxiliary) variable that always has an interpretation in economic problems.

In the case of an objective function of two arguments and one constraint, the new function (L stands
for Lagrangian function) gets the following form:
L = f ( x, y ) + [b g ( x, y )].
It can be proved that the free extremums of Lagrangian function (with respect to all variables,
including Lagrange multiplier(s)) coincide with the extremums of the objective function at the
constraints that were inserted to the Lagrangian function. Hence, to optimize the objective function
with constraints, the extremums of Lagrangian function can be found.
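In the case of an objective function of two arguments and one constraint, according to the first-order condition the derivatives of the Lagrangian function with respect to all variables ($x$, $y$ and $\lambda$) have to be equal to 0 in the critical points:
$$\begin{cases} \dfrac{\partial L}{\partial x} = L_x = f_x - \lambda g_x \overset{!}{=} 0, \\[4pt] \dfrac{\partial L}{\partial y} = L_y = f_y - \lambda g_y \overset{!}{=} 0, \\[4pt] \dfrac{\partial L}{\partial \lambda} = L_\lambda = b - g(x, y) \overset{!}{=} 0. \end{cases}$$
From the first two equations we get $\lambda = \frac{f_x}{g_x}$ and $\lambda = \frac{f_y}{g_y}$, hence $\frac{f_x}{g_x} = \frac{f_y}{g_y}$. The last equation represents the constraint: $g(x, y) = b$. Hence, the constraint and the condition $\frac{f_x}{g_x} = \frac{f_y}{g_y}$ serve as the conditions for the extremums of the Lagrangian function.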
Let us recall that when optimizing the function $z = f(x, y)$ without constraints, the condition for the critical points was that the value of the function neither increases nor decreases ($dz = 0$):
$$dz = f_x\,dx + f_y\,dy = 0.$$
Here, the values of the arguments $x$ and $y$ have to satisfy the constraint $g(x, y) = b$. To hold the value of the function $g(x, y)$ constant ($b$), the total differential of $g(x, y)$ has to be 0:
$$dg = g_x\,dx + g_y\,dy = 0.$$
Hence, when optimizing with a constraint, two conditions have to be satisfied in a critical point:
$$\begin{cases} f_x\,dx + f_y\,dy = 0, \\ g_x\,dx + g_y\,dy = 0. \end{cases}$$
When solving both equations for $\frac{dy}{dx}$, we get $\frac{dy}{dx} = -\frac{f_x}{f_y}$ and $\frac{dy}{dx} = -\frac{g_x}{g_y}$. From that:
$$\frac{f_x}{f_y} = \frac{g_x}{g_y} \quad\text{or}\quad \frac{f_x}{g_x} = \frac{f_y}{g_y}.$$
Hence, indeed, the critical points (when optimising the objective function $z = f(x, y)$ with the constraint $g(x, y) = b$) have to satisfy the condition $\frac{f_x}{g_x} = \frac{f_y}{g_y}$ and the constraint itself, $g(x, y) = b$, so using a Lagrangian function for optimizing with a constraint is appropriate.


In addition, if the constraint $g(x, y) = b$ is satisfied, then the expression $b - g(x, y)$ is equal to 0 and
then the Lagrangian function coincides with the initial objective function. Hence, it can be said that
if the constraint is satisfied, the extremum points of the Lagrangian function coincide with the
extremum points of the objective function at the constraint.
Hence, the first-order condition for finding extremums of a function of two arguments with one constraint says that in the critical points the derivatives of the Lagrangian function with respect to all variables have to be equal to 0:
$$L_x = L_y = L_\lambda = 0.$$
For determining the character of the critical points (whether they are extremum points or not) the second-order condition is used. Analogically to the optimization without constraints, matrix form can be used. In the case of constraints, a matrix called the bordered Hessian (denoted with $\bar{H}$) is constructed. If there are two arguments in the objective function and one constraint, the bordered Hessian includes the second-order partial derivatives of the Lagrangian function with respect to the arguments $x$ and $y$, which are bordered (from the left and upper sides) with the partial derivatives of the constraint function with respect to $x$, $y$ and $\lambda$ as follows (the first element, the derivative of the constraint function with respect to $\lambda$, is always 0):
$$\bar{H} = \begin{pmatrix} 0 & g_x & g_y \\ g_x & L_{xx} & L_{xy} \\ g_y & L_{yx} & L_{yy} \end{pmatrix}.$$
For investigating the critical point, the determinant of this bordered Hessian is calculated (see Appendix 2):
$$|\bar{H}| = 2L_{xy}\,g_x g_y - L_{xx}\,g_y^2 - L_{yy}\,g_x^2.$$
It appears that the second-order condition for a maximum is a positive determinant of the bordered Hessian ($|\bar{H}| > 0$) and for a minimum a negative determinant ($|\bar{H}| < 0$).

Again, the sufficient condition is not a necessary one: if the sufficient condition is not satisfied, one cannot rule out the possibility that there still is an extremum in a critical point.
Taking together the necessary and sufficient conditions:
if $L_x = L_y = L_\lambda = 0$ and $|\bar{H}| > 0$, it is a local maximum point;
if $L_x = L_y = L_\lambda = 0$ and $|\bar{H}| < 0$, it is a local minimum point.
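The sign rule for two arguments and one constraint can be sketched in a few lines of Python. The two test problems below are hypothetical illustrations that do not appear in the text, and the function names are assumptions as well.

```python
def bordered_hessian_det(gx, gy, Lxx, Lxy, Lyy):
    """Determinant 2*Lxy*gx*gy - Lxx*gy**2 - Lyy*gx**2 of the bordered Hessian."""
    return 2 * Lxy * gx * gy - Lxx * gy ** 2 - Lyy * gx ** 2

def classify(det):
    """Second-order condition: positive means maximum, negative means minimum."""
    if det > 0:
        return "local maximum"
    if det < 0:
        return "local minimum"
    return "inconclusive"

# Hypothetical problem 1: z = x*y subject to x + y = 10.
# At the critical point (5, 5): gx = gy = 1, Lxx = Lyy = 0, Lxy = 1.
det_product = bordered_hessian_det(gx=1, gy=1, Lxx=0, Lxy=1, Lyy=0)

# Hypothetical problem 2: z = x**2 + y**2 subject to x + y = 2.
# At the critical point (1, 1): gx = gy = 1, Lxx = Lyy = 2, Lxy = 0.
det_squares = bordered_hessian_det(gx=1, gy=1, Lxx=2, Lxy=0, Lyy=2)

print(det_product, classify(det_product))
print(det_squares, classify(det_squares))
```

A zero determinant is reported as inconclusive, in line with the remark that the sufficient condition is not a necessary one.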

As mentioned, the Lagrange multiplier always has an interpretation in economics problems.
From the first-order condition $L_x = f_x - \lambda g_x = 0$ we can write: $\lambda = \frac{f_x}{g_x}$. When expressing the partial derivatives with the help of differentials, we can rewrite:
$$\lambda = \frac{f_x}{g_x} = \frac{\partial z / \partial x}{\partial g / \partial x} = \frac{dz}{dg}.$$
If the constraint equation is satisfied, then $dg = db$ and we can rewrite:
$$\lambda = \frac{dz}{db}.$$


Hence, the Lagrange multiplier shows, what happens to the optimal value of objective function as
the constraint constant b is increased by one unit.
From the Lagrangian function $L = f(x, y) + \lambda[b - g(x, y)]$ we can see that $\lambda$ is equal to the derivative of the Lagrangian function with respect to $b$: $\frac{dL}{db} = \lambda$. If the constraint is satisfied, the Lagrangian function and the objective function coincide and hence, the derivatives of the Lagrangian and objective functions are equal:
$$\lambda = \frac{dL}{db} = \frac{dz}{db}.$$
If the constraint is included in the Lagrangian function in the form $g(x, y) - b$ instead of $b - g(x, y)$, then the extremums remain the same, but the Lagrange multiplier has the opposite sign, which has to be taken into account when interpreting the results. The same happens (opposite sign of $\lambda$) if we put a minus before $\lambda$ in the Lagrangian function: $L = f(x, y) - \lambda[b - g(x, y)]$. If, however, both changes are made, $L = f(x, y) - \lambda[g(x, y) - b]$, nothing changes.

When optimizing the objective function $z = z(x, y)$ subject to a constraint $g(x, y) = b$:

For solving this kind of problem it is common to use the Lagrangian method by composing a new (Lagrangian) function:
$$L = z(x, y) + \lambda[b - g(x, y)].$$
Assuming that $z$ is optimized given that $g(x, y) = b$ is satisfied, then $\lambda[b - g(x, y)] = 0$ and $L = z$. Hence, from now on, the Lagrangian function will be optimized and $\lambda$ can be viewed as a new (helping) variable.
First-order condition, also known as the necessary condition:
$L_\lambda = L_x = L_y = 0$; finding the derivatives and setting them equal to 0 gives a system of 3 equations in 3 variables (including $\lambda$), and solving it gives the critical values of $x$, $y$ and $\lambda$, that is, a critical point or points.
Second-order condition, also known as the sufficient condition:
can be expressed with the help of the bordered Hessian determinant
$$|\bar{H}| = \begin{vmatrix} 0 & g_x & g_y \\ g_x & L_{xx} & L_{xy} \\ g_y & L_{yx} & L_{yy} \end{vmatrix} = \begin{vmatrix} 0 & -g_x & -g_y \\ -g_x & L_{xx} & L_{xy} \\ -g_y & L_{yx} & L_{yy} \end{vmatrix} = \begin{vmatrix} L_{\lambda\lambda} & L_{\lambda x} & L_{\lambda y} \\ L_{x\lambda} & L_{xx} & L_{xy} \\ L_{y\lambda} & L_{yx} & L_{yy} \end{vmatrix}$$
by finding only the sign of this determinant.
If $|\bar{H}| > 0$, a maximum can be confirmed,
if $|\bar{H}| < 0$, a minimum can be confirmed,
otherwise nothing can be confirmed (further investigation is needed).


When constructing the bordered Hessian the equation system of the first-order conditions can be used as a basis: the derivatives of these conditions can be taken again with respect to the same variables (the order of the variables has to be the same). In this way the first row of the Hessian can be obtained from the first condition, the second from the second and so on.
$\lambda$ can be interpreted as $\lambda = \frac{dL}{db} = \frac{\partial L}{\partial b} = \frac{\partial z_{opt}}{\partial b}$, where $z_{opt}$ can be either $z_{max}$ or $z_{min}$. $\lambda$ means that as $b$ increases by 1 unit, $z_{opt}$ increases/decreases (depending on the sign of $\lambda$) approximately by $\lambda$ units.
Let us solve with the Lagrangian method the example used before for illustrating the substitution method. The aim is to maximize the utility function $u = q_1 q_2^2$ (objective function) subject to the budget constraint $4q_1 + 2q_2 = 60$. The constraint function is then $g = f(q_1, q_2) = 4q_1 + 2q_2$. The Lagrangian function is in this case:
$$L = q_1 q_2^2 + \lambda(60 - 4q_1 - 2q_2).$$
Next, we can find the derivatives with respect to all variables (the derivative of the Lagrangian function with respect to $q_i$ is denoted as $L_i$):
$$\begin{cases} L_1 = q_2^2 - 4\lambda \overset{!}{=} 0, \\ L_2 = 2q_1 q_2 - 2\lambda \overset{!}{=} 0, \\ L_\lambda = 60 - 4q_1 - 2q_2 \overset{!}{=} 0. \end{cases}$$
One possibility to solve the equation system is to solve the first two equations for $\lambda$:
$$\lambda = \frac{q_2^2}{4} = \frac{2q_1 q_2}{2} \quad\text{or}\quad q_2 = 4q_1.$$
Now we can replace $q_2$ in the third equation:
$$4q_1 + 2\cdot 4q_1 = 60$$
and find the optimal quantity of the first good: $q_1^* = 5$. The optimal quantity of the second good is then $q_2^* = 4\cdot 5 = 20$. The maximum utility that is achieved is $u^* = 5\cdot 20^2 = 2000$.
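In order to test whether it really is a maximum point, the following bordered Hessian can be constructed:
$$\bar{H} = \begin{pmatrix} 0 & g_1 & g_2 \\ g_1 & L_{11} & L_{12} \\ g_2 & L_{21} & L_{22} \end{pmatrix} = \begin{pmatrix} 0 & 4 & 2 \\ 4 & 0 & 2q_2 \\ 2 & 2q_2 & 2q_1 \end{pmatrix}.$$
In the critical point, the determinant is:
$$|\bar{H}| = \begin{vmatrix} 0 & 4 & 2 \\ 4 & 0 & 40 \\ 2 & 40 & 10 \end{vmatrix} = 0 + 320 + 320 - 0 - 160 - 0 = 480 > 0.$$
Hence, it really is a maximum point.
$\lambda$ is equal to the derivative of the objective function (utility function $u$) with respect to the constraint constant (income $m$): $\lambda = \frac{du}{dm}$. Hence, $\lambda$ shows the approximate change in utility as income increases by one unit. The value of $\lambda$ can be calculated from either the first or the second equation:
$$\lambda = \frac{q_2^2}{4} = \frac{20^2}{4} = 100 \quad\text{or}\quad \lambda = \frac{2q_1 q_2}{2} = \frac{2\cdot 5\cdot 20}{2} = 100.$$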

Hence, if a consumer gets one additional unit of income (euro, for example, if the income and prices
are measured in euros), his or her maximum utility increases approximately by 100 units.
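The whole utility example can be re-computed with exact fractions. The sketch below (helper names are assumptions) solves the first-order conditions, evaluates the bordered Hessian determinant, and compares the multiplier with the actual utility gain from one extra unit of income.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve(income):
    """Critical point of L = q1*q2**2 + lam*(income - 4*q1 - 2*q2).

    The first two conditions give q2 = 4*q1; the budget constraint then
    becomes 4*q1 + 2*(4*q1) = income.
    """
    q1 = Fraction(income) / 12
    q2 = 4 * q1
    lam = q2 ** 2 / 4
    return q1, q2, lam, q1 * q2 ** 2

q1, q2, lam, u_max = solve(60)
bordered_hessian = [[0, 4, 2], [4, 0, 2 * q2], [2, 2 * q2, 2 * q1]]
hessian_det = det3(bordered_hessian)

# Actual change in the maximum utility when income rises from 60 to 61.
extra_utility = solve(61)[3] - u_max
print(q1, q2, lam, u_max, hessian_det, float(extra_utility))
```

The exact gain from the additional unit of income is slightly above 100, which is why the multiplier is described as an approximate change: it is a marginal value at the optimum.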
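Interesting results can be found when solving the problems in a general form. For example, let us maximize a Cobb-Douglas type utility function $u = q_1^{\alpha} q_2^{\beta}$ subject to a budget constraint $p_1 q_1 + p_2 q_2 = m$.
The Lagrangian function is then:
$$L = q_1^{\alpha} q_2^{\beta} + \lambda(m - p_1 q_1 - p_2 q_2).$$
When taking the derivatives with respect to $q_1$, $q_2$ and $\lambda$ and setting them equal to 0, we can solve the first two equations for $\lambda$:
$$\lambda = \frac{\alpha q_1^{\alpha-1} q_2^{\beta}}{p_1} = \frac{\beta q_1^{\alpha} q_2^{\beta-1}}{p_2} \quad\text{or}\quad \frac{\alpha q_2}{\beta q_1} = \frac{p_1}{p_2} \quad\text{or}\quad q_2 = \frac{\beta p_1}{\alpha p_2}\, q_1.$$
This equation determines the optimal proportion of the two goods in a consumption bundle. Substituting into the last equation (the budget constraint) gives:
$$p_1 q_1 + p_2 \frac{\beta p_1}{\alpha p_2}\, q_1 = p_1 q_1 \frac{\alpha+\beta}{\alpha} = m$$
and solving this for $q_1$ gives the demand function of the first good:
$$q_1 = \frac{\alpha}{\alpha+\beta}\cdot\frac{m}{p_1}.$$
Analogically, the demand function of the second good can be obtained:
$$q_2 = \frac{\beta}{\alpha+\beta}\cdot\frac{m}{p_2}.$$
These results indicate that the share of $\alpha$ in $\alpha+\beta$ determines how large the optimum demand of the first good is relative to the amount $\frac{m}{p_1}$ that could be consumed if all the income were spent on the first good. Analogically, the share of $\beta$ in $\alpha+\beta$ determines the optimum $q_2$ relative to the maximum possible amount $\frac{m}{p_2}$.
We can calculate the value of $\lambda$, for example, from the first equation:
$$\lambda = \frac{\alpha q_1^{\alpha-1} q_2^{\beta}}{p_1} = \frac{\alpha}{p_1}\left(\frac{\alpha m}{(\alpha+\beta)p_1}\right)^{\alpha-1}\left(\frac{\beta m}{(\alpha+\beta)p_2}\right)^{\beta} = \frac{\alpha^{\alpha}\beta^{\beta}}{p_1^{\alpha} p_2^{\beta}}\left(\frac{m}{\alpha+\beta}\right)^{\alpha+\beta}\frac{\alpha+\beta}{m}.$$
The maximum utility is then:
$$u_{max} = \left(\frac{\alpha m}{(\alpha+\beta)p_1}\right)^{\alpha}\left(\frac{\beta m}{(\alpha+\beta)p_2}\right)^{\beta} = \frac{\alpha^{\alpha}\beta^{\beta}}{p_1^{\alpha} p_2^{\beta}}\left(\frac{m}{\alpha+\beta}\right)^{\alpha+\beta}.$$
Hence $\frac{\lambda}{u_{max}} = \frac{du/dm}{u_{max}} = \frac{\alpha+\beta}{m}$. This expression is an analogue of a growth rate (see Chapter 10.2) and shows the relative increase in utility as one unit is added to the budget constraint.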
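The general demand functions can be checked against the earlier numerical example, which corresponds to exponents 1 and 2, prices 4 and 2, and income 60. This is a minimal sketch; the function name and argument names are assumptions.

```python
import math

def cobb_douglas_optimum(alpha, beta, p1, p2, m):
    """Demands, multiplier and maximum utility for u = q1**alpha * q2**beta."""
    q1 = alpha * m / ((alpha + beta) * p1)
    q2 = beta * m / ((alpha + beta) * p2)
    lam = alpha * q1 ** (alpha - 1) * q2 ** beta / p1
    u_max = q1 ** alpha * q2 ** beta
    return q1, q2, lam, u_max

# The earlier example u = q1 * q2**2 with 4*q1 + 2*q2 = 60.
q1, q2, lam, u_max = cobb_douglas_optimum(alpha=1, beta=2, p1=4, p2=2, m=60)
print(q1, q2, lam, u_max, lam / u_max)
```

The ratio of the multiplier to the maximum utility equals (alpha + beta) / m, here 3/60.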


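When the same preferences are described by a transformed utility function obtained with the help of logarithms, $v = \alpha\ln q_1 + \beta\ln q_2$ (see Chapter 10.1 for this transformation), then the Lagrangian function is:
$$L_v = \alpha\ln q_1 + \beta\ln q_2 + \lambda_v(m - p_1 q_1 - p_2 q_2).$$
When taking the derivatives with respect to $q_1$, $q_2$ and $\lambda_v$ and setting them equal to 0, we can solve the first two equations for $\lambda_v$:
$$\lambda_v = \frac{\alpha}{p_1 q_1} = \frac{\beta}{p_2 q_2},$$
and from that the optimal proportion $q_2 = \frac{\beta p_1}{\alpha p_2}\, q_1$ can be obtained, which appears to be the same as the proportion found for the untransformed form of the Cobb-Douglas type utility function. Since the budget constraint has also remained the same, the optimal bundle and the demand functions are the same as well.
The maximized utility and the value of the Lagrange multiplier are of course different. Since $v = \ln u$, the Lagrange multipliers from the two problems are related as follows:
$$\frac{\lambda_v}{\lambda} = \frac{dv/dm}{du/dm} = \frac{dv}{du} = \frac{d\ln u}{du} = \frac{1}{u}.$$
In order to confirm this, we can calculate $\lambda_v$ from the first equation:
$$\lambda_v = \frac{\alpha}{p_1 q_1} = \frac{\alpha}{p_1\dfrac{\alpha m}{(\alpha+\beta)p_1}} = \frac{\alpha+\beta}{m}.$$
Hence
$$\frac{\lambda_v}{\lambda} = \frac{\dfrac{\alpha+\beta}{m}}{\dfrac{\alpha^{\alpha}\beta^{\beta}}{p_1^{\alpha} p_2^{\beta}}\left(\dfrac{m}{\alpha+\beta}\right)^{\alpha+\beta}\dfrac{\alpha+\beta}{m}} = \frac{1}{u_{max}}.$$
The maximum transformed utility is then
$$v_{max} = \alpha\ln\frac{\alpha m}{p_1(\alpha+\beta)} + \beta\ln\frac{\beta m}{p_2(\alpha+\beta)} = \ln\left[\left(\frac{\alpha m}{p_1(\alpha+\beta)}\right)^{\alpha}\left(\frac{\beta m}{p_2(\alpha+\beta)}\right)^{\beta}\right] = \ln\left[\frac{\alpha^{\alpha}\beta^{\beta}}{p_1^{\alpha} p_2^{\beta}}\left(\frac{m}{\alpha+\beta}\right)^{\alpha+\beta}\right].$$
We can see that indeed, $v_{max} = \ln u_{max}$.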
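The relation between the two multipliers can be confirmed numerically. The parameter values below are hypothetical (they reuse the numbers of the earlier utility example), and the variable names are assumptions.

```python
import math

# Hypothetical parameter values for illustration.
alpha, beta, p1, p2, m = 1, 2, 4, 2, 60

# Optimal bundle, identical for the original and the transformed problem.
q1 = alpha * m / ((alpha + beta) * p1)
q2 = beta * m / ((alpha + beta) * p2)

# Untransformed problem u = q1**alpha * q2**beta.
u_max = q1 ** alpha * q2 ** beta
lam = alpha * q1 ** (alpha - 1) * q2 ** beta / p1

# Transformed problem v = alpha*ln(q1) + beta*ln(q2).
v_max = alpha * math.log(q1) + beta * math.log(q2)
lam_v = alpha / (p1 * q1)

print(lam_v, lam_v / lam, 1 / u_max, v_max, math.log(u_max))
```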


Another interesting example that relates the optimizing with a constraint with a free optimizing
can be drawn from microeconomics, where both finding an optimal input bundle (optimization with
a constraint) and maximizing the profit (optimization without any constraints) are widely discussed
aims. Let us look how these two problems are related.
Given a production function $q = q(K, L)$ and the cost function as a function of the inputs, $c = rK + wL$, both maximizing the production and minimizing the costs give analogical results. In both cases it is an optimization with a constraint. In the first case the amount of production is maximized at fixed costs and the Lagrangian function (since labour is denoted with $L$, in order to avoid confusion the Lagrangian function is denoted here with $\mathcal{L}$) can be written as follows:
$$\mathcal{L} = q(K, L) + \lambda(c - rK - wL).$$
When taking the derivatives with respect to $K$, $L$ and $\lambda$ and setting them equal to 0, we get from the first two equations:
$$MP_K - \lambda r = 0 \ \text{ and } \ MP_L - \lambda w = 0 \quad\text{or}\quad \lambda = \frac{MP_K}{r} = \frac{MP_L}{w}.$$
Since $\lambda = \frac{\partial q}{\partial c}$, the Lagrange multiplier shows here the marginal product of money: the amount of the product that is added when an additional money unit is spent on the input. Hence, the marginal products of money have to be equal: it should make no difference whether the additional money unit is spent on one or the other input. From the last equation we get the equation of the isocost line.
In the second case the costs are minimized for producing a certain amount of production:
$$\mathcal{L} = rK + wL + \lambda(q - q(K, L)).$$
When taking the derivatives with respect to $K$, $L$ and $\lambda$ and setting them equal to 0, we get from the first two equations:
$$r - \lambda MP_K = 0 \ \text{ and } \ w - \lambda MP_L = 0 \quad\text{or}\quad \lambda = \frac{r}{MP_K} = \frac{w}{MP_L}.$$
Since $\lambda = \frac{\partial c}{\partial q}$, the Lagrange multiplier shows here the marginal cost. Hence, the marginal costs for different inputs have to be equal: it should make no difference whether the unitary increase in the output is achieved by increasing the amount of one or the other input. From the last equation we get the equation of the isoquant curve.
The optimal proportion of the inputs is the same however, both when the production is maximized
at fixed costs or when the costs are minimized for producing a certain amount of production. As a
remark: the same applies in the household theory: the optimal proportion of goods in a
consumption bundle is the same both when the utility is maximized at fixed costs or when the costs
are minimized for achieving a certain amount of utility.
In the case of maximizing the profit with respect to the inputs, a free optimization without any constraints is used. When maximizing the profit (assuming perfect competition and thus a fixed price):
$$\pi = R - c = p\,q(K, L) - rK - wL$$
we take the derivatives with respect to $K$ and $L$ and set them equal to 0:
$$p\,MP_K - r = 0 \ \text{ and } \ p\,MP_L - w = 0 \quad\text{or}\quad p = \frac{r}{MP_K} = \frac{w}{MP_L}.$$
When denoting the Lagrange multiplier in the production-maximizing problem as $\lambda_q$ and in the cost-minimizing problem as $\lambda_c$ we can write:
$$p = \lambda_c = \frac{1}{\lambda_q}.$$
The Lagrange multiplier of the cost-minimizing problem $\lambda_c$ gives the price at which the production-maximizing input bundle maximizes the profit as well. In other words: when we maximize the profit knowing the price, the marginal cost of the optimal bundle has to be equal to the price. This result confirms the well-known condition that for the maximum profit the price has to be equal to the marginal cost, $\lambda_c = \frac{\partial c}{\partial q} = MC$, and the equation of the supply curve thus has to be $p = MC$.
We can also see that the condition for the optimal input bundle $\frac{MP_K}{r} = \frac{MP_L}{w}$ (that gives the optimal proportion of the inputs) ensures the maximum profit as well. The optimal values of the inputs $K$ and $L$ in an optimal input bundle are found from the equation of the isocost line or the isoquant curve. When maximizing the profit, the optimal values of the inputs $K$ and $L$ are determined by the price $p$.
The condition $\frac{MP_K}{r} = \frac{MP_L}{w}$ has a geometrical interpretation as well. As mentioned, this condition gives a proportion for the optimal values of the inputs (in the household theory the analogical condition gives the proportion for the optimal quantities of goods in a consumption bundle). In the case of the Cobb-Douglas type functions this proportion is always expressed by a linear relationship, for example $K = nL$, where $n$ is a constant that describes the relationship.
This proportion can be depicted as a curve on a traditional figure that describes the optimization problem (see Figure A6.1). For our example, this curve is the straight line $K = nL$. Every point of that curve is a possible optimal solution that is optimal at a particular level of costs or production (depending on whether we are minimizing the costs or maximizing the product). This curve is often called an expansion path.

[Figure: isoquants, isocost lines and the line $K = nL$ (the expansion path), with $K$ on the vertical axis]

Figure A6.1. The expansion path in firm theory


When the ratio of the input prices changes (one input gets more expensive relative to the other), the optimal proportion changes as well (the amount of the input that is now more expensive becomes relatively smaller). The change of the proportion relative to the change in prices is described by the substitution elasticity:
$$e = \frac{\text{relative change of } K^*/L^*}{\text{relative change of } w/r} = \frac{d\left(\frac{K^*}{L^*}\right)\Big/\frac{K^*}{L^*}}{d\left(\frac{w}{r}\right)\Big/\frac{w}{r}} = \frac{d\left(\frac{K^*}{L^*}\right)}{d\left(\frac{w}{r}\right)}\cdot\frac{w/r}{K^*/L^*}.$$
For example, let us look at the Cobb-Douglas type production function $q = K^{\alpha} L^{\beta}$. The condition $\frac{MP_L}{MP_K} = \frac{w}{r}$ takes the form $\frac{\beta K}{\alpha L} = \frac{w}{r}$, hence the optimal proportion is $\frac{K^*}{L^*} = \frac{\alpha}{\beta}\cdot\frac{w}{r}$. Hence
$$\frac{d\left(\frac{K^*}{L^*}\right)}{d\left(\frac{w}{r}\right)} = \frac{\alpha}{\beta}$$
and the elasticity is
$$e = \frac{\alpha}{\beta}\cdot\frac{w/r}{\frac{\alpha}{\beta}\cdot\frac{w}{r}} = 1.$$
Thus, in the case of this type of production function, as the ratio of prices $\frac{w}{r}$ increases by 1% (for example the price of labour increases by 1%), the ratio $\frac{K^*}{L^*}$ also increases by 1% (the amount of capital is increased and/or the amount of labour decreased).
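The unit elasticity can be illustrated with a small numerical experiment. The exponents and the price ratio below are hypothetical values, and the function names are assumptions.

```python
def optimal_ratio(price_ratio, alpha, beta):
    """Optimal K*/L* = (alpha/beta) * (w/r) for q = K**alpha * L**beta."""
    return alpha / beta * price_ratio

def arc_elasticity(price_ratio, alpha, beta, step=0.01):
    """Relative change of K*/L* divided by the relative change of w/r."""
    old = optimal_ratio(price_ratio, alpha, beta)
    new = optimal_ratio(price_ratio * (1 + step), alpha, beta)
    return ((new - old) / old) / step

# Hypothetical values: w/r = 2, alpha = 0.3, beta = 0.7, a 1% price change.
elasticity = arc_elasticity(price_ratio=2.0, alpha=0.3, beta=0.7)
print(elasticity)
```

Because the optimal proportion is linear in the price ratio, the result does not depend on the chosen exponents or on the size of the step.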

6.4. Optimization of a function of n variables with many constraints


If the objective function has more arguments and there is more than one constraint, analogical conditions are applied. In the general case an objective function has $n$ arguments:
$$z = f(x_1, x_2, \ldots, x_n),$$
and there are $m$ constraints:
$$\begin{cases} g^1(x_1, x_2, \ldots, x_n) = b_1, \\ g^2(x_1, x_2, \ldots, x_n) = b_2, \\ \quad\vdots \\ g^m(x_1, x_2, \ldots, x_n) = b_m. \end{cases}$$
The constraints can also be expressed as:
$$g^i(x_1, x_2, \ldots, x_n) = b_i, \quad i = 1, \ldots, m,$$
where $g^i$ denotes the i-th constraint function.
The number of constraints has to be smaller than the number of arguments:
$$n > m.$$
Using the Lagrangian method, a Lagrangian function is constructed by adding to the objective function all constraints, each in the appropriate form and multiplied by the corresponding Lagrange multiplier:
$$L = f(x_1, \ldots, x_n) + \lambda_1\left[b_1 - g^1(x_1, \ldots, x_n)\right] + \lambda_2\left[b_2 - g^2(x_1, \ldots, x_n)\right] + \ldots + \lambda_m\left[b_m - g^m(x_1, \ldots, x_n)\right].$$
According to the first-order condition, in a critical point the partial derivatives of the Lagrangian function with respect to all variables, $(x_1, \ldots, x_n)$ and $(\lambda_1, \ldots, \lambda_m)$, have to be equal to 0:
$$\frac{\partial L}{\partial x_1} = \ldots = \frac{\partial L}{\partial x_n} = \frac{\partial L}{\partial \lambda_1} = \ldots = \frac{\partial L}{\partial \lambda_m} = 0.$$
The resulting equation system makes it possible to find the critical points, and their character is investigated with the help of a bordered Hessian.
In the case of $n$ arguments and $m$ constraints the bordered Hessian is an $(m+n)\times(m+n)$ matrix that consists of four submatrices, as shown below. The $n\times n$ submatrix on the lower right consists of the second-order partial derivatives of the Lagrangian function with respect to the arguments. The two submatrices on the upper right and lower left include the first-order derivatives of the constraint functions with respect to the arguments. These two submatrices reflect each other over the principal diagonal. On the upper left there is an $m\times m$ matrix of zeros.
The bordered Hessian in the general case is:
$$\bar{H} = \begin{pmatrix}
0 & 0 & \cdots & 0 & g^1_1 & g^1_2 & \cdots & g^1_n \\
0 & 0 & \cdots & 0 & g^2_1 & g^2_2 & \cdots & g^2_n \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0 & g^m_1 & g^m_2 & \cdots & g^m_n \\
g^1_1 & g^2_1 & \cdots & g^m_1 & L_{11} & L_{12} & \cdots & L_{1n} \\
g^1_2 & g^2_2 & \cdots & g^m_2 & L_{21} & L_{22} & \cdots & L_{2n} \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
g^1_n & g^2_n & \cdots & g^m_n & L_{n1} & L_{n2} & \cdots & L_{nn}
\end{pmatrix},$$
where $g^i_j$ stands for the derivative of the i-th constraint function with respect to the j-th argument.

For the second-order condition, the principal minors are found. Let $|\bar{H}_i|$ be the principal minor that includes $i$ rows and columns from the lower right submatrix of the derivatives of the Lagrangian function. Thus, $|\bar{H}_i|$ is the determinant of an $(m+i)\times(m+i)$ matrix. For example, $|\bar{H}_1|$ includes $m+1$ rows and columns (starting from the upper left corner), $|\bar{H}_2|$ includes $m+2$ rows and columns and so on.

When investigating the critical points, those principal minors are considered for which $i$ is $m+1$ or higher, the last one being the determinant of the whole bordered Hessian:
$$|\bar{H}_i|, \ \text{where } i = m+1, \ldots, n.$$

According to the second-order condition:
if the considered principal minors alternate in sign starting from the sign of $(-1)^{m+1}$, that is $(-1)^{i}\,|\bar{H}_i| > 0$, it is a local maximum point;
if the considered principal minors all have the same sign as $(-1)^{m}$, that is $(-1)^{m}\,|\bar{H}_i| > 0$, it is a local minimum point.
Hence it depends on the number of constraints whether for the minimum all principal minors have to be positive or negative. If there is an odd number of constraints, then $(-1)^{m} < 0$ and all principal minors have to be negative; in the case of an even number of constraints $(-1)^{m} > 0$ and all principal minors have to be positive. It also depends on the number of constraints from which sign the alternation starts for the maximum. If there is an odd number of constraints, then $(-1)^{m+1} > 0$ and the alternation starts from a positive principal minor; in the case of an even number of constraints $(-1)^{m+1} < 0$ and the alternation starts from a negative principal minor.
In the case of two arguments and one constraint ($m = 1$) the first principal minor considered is the one with $i = m + 1$, that is $|\bar{H}_2|$. For a minimum all principal minors (in that case only $|\bar{H}_2|$, which is equal to $|\bar{H}|$) have to be negative, since $(-1)^{m} = (-1)^{1} < 0$. For a maximum the sign of the principal minors has to alternate starting from $(-1)^{m+1} = (-1)^{2} > 0$; in this case $|\bar{H}_2|$ has to be positive. This is in accordance with the conditions introduced before.
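The general sign rules can be sketched in Python with exact arithmetic. The sketch assumes that the minors considered are those with $i = m+1, \ldots, n$, so that the last one is the determinant of the whole bordered Hessian. The function names and the three-variable test problem are hypothetical.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination with exact fractions."""
    a = [[Fraction(v) for v in row] for row in matrix]
    size = len(a)
    result = Fraction(1)
    for col in range(size):
        # Find a row with a non-zero entry in this column to use as pivot.
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap changes the sign
        result *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for k in range(col, size):
                a[r][k] -= factor * a[col][k]
    return result

def classify(bordered_hessian, m):
    """Apply the sign rules to the minors with i = m+1, ..., n."""
    n = len(bordered_hessian) - m
    if n <= m:
        raise ValueError("the number of constraints must be smaller than n")
    minors = {
        i: det([row[:m + i] for row in bordered_hessian[:m + i]])
        for i in range(m + 1, n + 1)
    }
    if all((-1) ** i * value > 0 for i, value in minors.items()):
        return "local maximum"
    if all((-1) ** m * value > 0 for value in minors.values()):
        return "local minimum"
    return "inconclusive"

# Bordered Hessian of the utility example (m = 1 constraint, n = 2 goods).
utility_case = classify([[0, 4, 2], [4, 0, 40], [2, 40, 10]], m=1)

# Hypothetical problem: x1*x2*x3 subject to x1 + x2 + x3 = 3 at (1, 1, 1).
product_case = classify(
    [[0, 1, 1, 1], [1, 0, 1, 1], [1, 1, 0, 1], [1, 1, 1, 0]], m=1
)
print(utility_case, product_case)
```

In the three-variable case two minors are checked, of orders 3 and 4, and they have to alternate in sign for a maximum.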

7. COMPARATIVE STATICS

7.1. Qualitative and quantitative analysis


Comparative statics compares the equilibrium values that are reached at different values of
variables and parameters. Comparative statics may be conducted as a qualitative or quantitative
analysis. In the case of the qualitative analysis, it is examined, in which direction the equilibrium
value changes as a variable or parameter is increasing. In the case of the quantitative analysis, the
extent of the change is also important. Comparative statics does not analyse, how the system moves
from one equilibrium to another, this is the field of dynamic analysis.
After changing a parameter in a model, the conditions change and hence, the equilibrium values also change. Let us look at a market equilibrium model, where the supplied quantity depends on the price:
$$q^S = -a + bp,$$
and the demanded quantity on price and income:
$$q^D = c - dp + em,$$
where all parameters are positive: $a, b, c, d, e > 0$.
The equilibrium condition is:
$$q^S = q^D.$$
Hence, we can write:
$$-a + bp = c - dp + em,$$
and find the equilibrium price:
$$p^* = \frac{c + a + em}{b + d},$$
which depends on many parameters and income.
A qualitative analysis examines, for example, what happens to the equilibrium price (whether it increases or decreases) as income increases. For that, the derivative of $p^*$ with respect to $m$ is taken and, knowing the signs of the parameters, one can try to determine the sign of this derivative:
$$\frac{\partial p^*}{\partial m} = \frac{e}{b + d} > 0.$$

[Figure: two panels with price p on the vertical and quantity q on the horizontal axis, each showing the supply curve S, the demand curves D and D', and the equilibrium prices p* and p*']

Figure A7.1. Changes of the demand curve caused by an increase in income m (left) and an increase in the parameter d (right)
Hence, when income increases, the equilibrium price increases. But what happens to the
equilibrium price, if the slope of the demand curve decreases? If the slope of demand curve is
COMPARATIVE STATICS Anneli Kaasa

smaller, the parameter d (in the demand function) is larger (because usually demand curves are
drawn so that price is on vertical and quantity on horizontal axis, that is, based on an inverse
demand function). Hence, we are interested in the change caused by the increase of the parameter d.
Using the rule of the derivative of a quotient:
$$\frac{\partial p^*}{\partial d} = -\frac{\overbrace{a + c + em}^{(+)}}{\underbrace{(b + d)^2}_{(+)}} < 0.$$
As income is assumed to be positive, it can be concluded that if the demand curve becomes flatter, the equilibrium price will be smaller.
For a quantitative comparative statics we need to know the values of the parameters. Let the demand function be $q^D = 24 - 3p + 0{,}01m$ and the supply function $q^S = -4 + 4p$. The equilibrium price is then:
$$p^* = \frac{4 + 24 + 0{,}01m}{4 + 3} = 4 + \frac{m}{700}.$$
Since $\frac{\partial p^*}{\partial m} = \frac{1}{700}$, as income increases by one unit, the equilibrium price increases by $\frac{1}{700}$ unit. Or, as income increases by 1000 euros, the equilibrium price increases by $1000\cdot\frac{1}{700} \approx 1{,}43$ euros.
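The quantitative example can be reproduced with exact fractions. The sketch reads the supply function as −4 + 4p, which is what the equilibrium price formula implies. The function names are assumptions.

```python
from fractions import Fraction

def supply(p):
    """Supplied quantity, read as qS = -4 + 4p."""
    return -4 + 4 * p

def demand(p, m):
    """Demanded quantity qD = 24 - 3p + 0.01m."""
    return 24 - 3 * p + Fraction(1, 100) * m

def equilibrium_price(m):
    """p* = (4 + 24 + 0.01m) / (4 + 3) = 4 + m/700."""
    return (4 + 24 + Fraction(1, 100) * m) / (4 + 3)

# Effect of 1000 additional euros of income on the equilibrium price.
price_change = equilibrium_price(2000) - equilibrium_price(1000)
print(price_change, round(float(price_change), 2))
```

Since the model is linear in income, the change is the same whatever the starting level of income.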
Often only the general form of the functions and the signs of the partial derivatives are known. In that case, the derivative of an implicit function becomes useful. For example, if the demand and supply functions are defined as follows:
$$q^D = q^D(p, m),$$
$$q^S = q^S(p)$$
and the equilibrium condition is:
$$q^S(p) = q^D(p, m)$$
or
$$q^S(p) - q^D(p, m) = 0,$$
we can denote $f(p, m) = q^S(p) - q^D(p, m)$. If we need to know the impact of income on the equilibrium price, we can find it with the help of the derivative of the implicit function. For that we need the partial derivatives of $f(p, m)$ with respect to $p$ and $m$ ($f_p$ and $f_m$). The derivative of the implicit function is:
$$\frac{\partial p^*}{\partial m} = -\frac{f_m}{f_p} = -\frac{-q^D_m}{q^S_p - q^D_p} = \frac{q^D_m}{q^S_p - q^D_p}.$$
Knowing that the supplied quantity depends positively on the price ($q^S_p > 0$) and the demanded quantity depends negatively on the price ($q^D_p < 0$) and positively on the income ($q^D_m > 0$), we can determine the sign:
$$\frac{\partial p^*}{\partial m} = \frac{\overbrace{q^D_m}^{(+)}}{\underbrace{q^S_p}_{(+)} - \underbrace{q^D_p}_{(-)}} > 0.$$


Hence, an increasing income increases the equilibrium price.
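The implicit-function formula can be tried out on concrete functions. The supply and demand functions below are hypothetical choices with the assumed signs of the partial derivatives. They are not taken from the text, and the partial derivatives are approximated by central differences.

```python
def excess_supply(p, m):
    """f(p, m) = qS(p) - qD(p, m) with hypothetical qS = p**2 and qD = m / p."""
    return p ** 2 - m / p

def partial(func, args, index, h=1e-6):
    """Central-difference approximation of a partial derivative."""
    up = list(args)
    down = list(args)
    up[index] += h
    down[index] -= h
    return (func(*up) - func(*down)) / (2 * h)

# Equilibrium requires p**3 = m, so p* = 2 when m = 8.
p_star, income = 2.0, 8.0
f_p = partial(excess_supply, (p_star, income), 0)
f_m = partial(excess_supply, (p_star, income), 1)

# Derivative of the implicit function: dp*/dm = -f_m / f_p.
dp_dm = -f_m / f_p
print(dp_dm)
```

For these functions the equilibrium price is the cube root of income, so the exact derivative at m = 8 is 1/12, which the approximation reproduces.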


The comparative statics analysis is widely used in macroeconomics. Let us look at one example, where the national income $Y$ is defined to be equal to the total expenditures, which are divided between consumption $C$, investments $I$ and government expenditures $G$, and the last two are independent variables denoted with a subscript 0:
$$Y = C + I_0 + G_0.$$
Consumption, in turn, depends on income. From the income after taxes (income minus taxes $T$) only a certain portion (expressed by a parameter $c$) is used for consumption and in addition, there is an autonomous consumption $C_0$:
$$C = C_0 + c(Y - T), \quad 0 \le c \le 1.$$
Taxes consist of the autonomous taxes and a portion (determined by a parameter $t$) of the total income:
$$T = T_0 + tY, \quad 0 \le t \le 1.$$
The three equilibrium conditions are then:
$$\begin{cases} Y - C - I_0 - G_0 = 0, \\ C - C_0 - c(Y - T) = 0, \\ T - T_0 - tY = 0. \end{cases}$$
Solving this model, for example, for the variable $Y$ (the other dependent variables are $C$ and $T$) we get $Y = C_0 + c(Y - (T_0 + tY)) + I_0 + G_0$ or:
$$Y^* = \frac{C_0 + I_0 + G_0 - cT_0}{1 - c + ct}.$$
Now we can analyse qualitatively (a quantitative analysis needs known values of the parameters) the impact of different independent variables on the equilibrium value of $Y$:
$$\frac{\partial Y^*}{\partial C_0} = \frac{\partial Y^*}{\partial I_0} = \frac{\partial Y^*}{\partial G_0} = \frac{1}{\underbrace{1 - c + ct}_{(+)}} > 0, \qquad \frac{\partial Y^*}{\partial T_0} = \frac{-c}{\underbrace{1 - c + ct}_{(+)}} < 0.$$
We can also analyse the impact of parameters, for example the impact of the marginal propensity to consume ($c$):
$$\frac{\partial Y^*}{\partial c} = \frac{-T_0(1 - c + ct) - (C_0 + I_0 + G_0 - cT_0)(-1 + t)}{(1 - c + ct)^2} = \frac{-T_0 + (1 - t)Y^*}{1 - c + ct} = \frac{Y^* - T_0 - tY^*}{1 - c + ct} = \frac{\overbrace{Y^* - T^*}^{(+)}}{\underbrace{1 - c + ct}_{(+)}} > 0$$
and the impact of the tax rate:
$$\frac{\partial Y^*}{\partial t} = \frac{-c(C_0 + I_0 + G_0 - cT_0)}{(1 - c + ct)^2} = \frac{-cY^*}{\underbrace{1 - c + ct}_{(+)}} < 0.$$


Hence, an increase in the autonomous consumption, the government expenditures and the marginal
propensity to consume increases and an increase in the autonomous taxes and the tax rate decreases
the equilibrium national income.
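These qualitative results can be illustrated with numbers. All parameter values below are hypothetical and chosen only for illustration, and the function and variable names are assumptions.

```python
from fractions import Fraction

def equilibrium_income(C0, I0, G0, T0, c, t):
    """Y* = (C0 + I0 + G0 - c*T0) / (1 - c + c*t)."""
    return (C0 + I0 + G0 - c * T0) / (1 - c + c * t)

# Hypothetical values of the independent variables and parameters.
base = dict(C0=100, I0=200, G0=300, T0=50, c=Fraction(4, 5), t=Fraction(1, 4))
Y_star = equilibrium_income(**base)

def change(name, step):
    """Change in Y* when one variable or parameter is increased by step."""
    shifted = dict(base)
    shifted[name] = shifted[name] + step
    return equilibrium_income(**shifted) - Y_star

spending_multiplier = change("G0", 1)  # equals 1 / (1 - c + c*t)
tax_multiplier = change("T0", 1)       # equals -c / (1 - c + c*t)
print(Y_star, spending_multiplier, tax_multiplier)
print(change("c", Fraction(1, 1000)) > 0, change("t", Fraction(1, 1000)) < 0)
```

Because $Y^*$ is linear in the autonomous variables, a unit step reproduces the corresponding derivative exactly. For the parameters $c$ and $t$ only the sign of a small step is checked.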
Comparative statics analyses the change of equilibrium values caused by the change of other
variables or parameters using derivatives. In the case of quantitative analysis, the amount of change
of the equilibrium value is calculated, which means the amount of the change causing it also has to
be known. More often only qualitative analysis is performed and that means only the sign of change
is determined. The interpretation of the results is the same as the interpretation of a derivative.

7.2. Using Jacobians in more complex models


In the case of more complex models with more than one equilibrium condition it is reasonable to use matrix algebra to investigate the impact of changing variables or parameters on the equilibrium values.
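As an example, let there be a model with two equilibrium conditions, two dependent variables $y_1$ and $y_2$ and two independent variables $x_1$ and $x_2$. Both $y_1$ and $y_2$ depend on the variables $x_1$ and $x_2$. Any equilibrium condition can be transformed into a form where a function of all variables is on one side and 0 on the other side of the equation ($F^i$ denotes the i-th function):
$$F^1(y_1, y_2; x_1, x_2) = 0,$$
$$F^2(y_1, y_2; x_1, x_2) = 0.$$
If we want to find the impact of an independent variable $x_1$ on the equilibrium values of the dependent variables, we can proceed as follows. First, find the partial derivatives with respect to $x_1$:
$$\frac{\partial F^1}{\partial y_1}\frac{\partial y_1}{\partial x_1} + \frac{\partial F^1}{\partial y_2}\frac{\partial y_2}{\partial x_1} + \frac{\partial F^1}{\partial x_1} = 0,$$
$$\frac{\partial F^2}{\partial y_1}\frac{\partial y_1}{\partial x_1} + \frac{\partial F^2}{\partial y_2}\frac{\partial y_2}{\partial x_1} + \frac{\partial F^2}{\partial x_1} = 0.$$
After moving the last terms to the other side of the equations, we can rewrite the system as a matrix equation, where the vector on the left side includes the impacts we are interested in:
$$\begin{pmatrix} \dfrac{\partial F^1}{\partial y_1} & \dfrac{\partial F^1}{\partial y_2} \\[6pt] \dfrac{\partial F^2}{\partial y_1} & \dfrac{\partial F^2}{\partial y_2} \end{pmatrix} \begin{pmatrix} \dfrac{\partial y_1}{\partial x_1} \\[6pt] \dfrac{\partial y_2}{\partial x_1} \end{pmatrix} = \begin{pmatrix} -\dfrac{\partial F^1}{\partial x_1} \\[6pt] -\dfrac{\partial F^2}{\partial x_1} \end{pmatrix}.$$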

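The matrix that consists of the derivatives of the functions $F^1$ to $F^n$ with respect to the dependent variables is called the Jacobian and denoted with $J$. Here:
$$J = \begin{pmatrix} \dfrac{\partial F^1}{\partial y_1} & \dfrac{\partial F^1}{\partial y_2} \\[6pt] \dfrac{\partial F^2}{\partial y_1} & \dfrac{\partial F^2}{\partial y_2} \end{pmatrix}.$$
Using Cramer's rule we can find the impact of $x_1$ on the dependent variables $y_i$: we divide the determinant of the matrix $J_i$ (where the i-th column is replaced with the vector from the right side of the equation) by the determinant of the Jacobian $J$:
$$\frac{\partial y_i}{\partial x_1} = \frac{|J_i|}{|J|}.$$
For example,
$$\frac{\partial y_2}{\partial x_1} = \frac{|J_2|}{|J|} = \frac{\begin{vmatrix} \dfrac{\partial F^1}{\partial y_1} & -\dfrac{\partial F^1}{\partial x_1} \\[6pt] \dfrac{\partial F^2}{\partial y_1} & -\dfrac{\partial F^2}{\partial x_1} \end{vmatrix}}{|J|}.$$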

If the impact of $x_2$ is of interest, the derivatives of the functions derived from the equilibrium conditions have to be taken with respect to $x_2$. The matrix equation then includes $x_2$ instead of $x_1$, but the Jacobian remains the same, which makes analysing the impacts of different variables easier.
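In general, the number of dependent variables should be equal to the number of equilibrium conditions. Next, the main steps used in the general case are briefly shown. Let there be a model with $n$ equilibrium conditions and dependent variables and $m$ independent variables. The equilibrium conditions can be expressed as follows:
$$\begin{cases} F^1(y_1, \ldots, y_n; x_1, \ldots, x_m) = 0, \\ F^2(y_1, \ldots, y_n; x_1, \ldots, x_m) = 0, \\ \quad\vdots \\ F^n(y_1, \ldots, y_n; x_1, \ldots, x_m) = 0, \end{cases}$$
and the matrix equation for examining the impact of the i-th independent variable is:
$$\begin{pmatrix} \dfrac{\partial F^1}{\partial y_1} & \dfrac{\partial F^1}{\partial y_2} & \cdots & \dfrac{\partial F^1}{\partial y_n} \\[6pt] \dfrac{\partial F^2}{\partial y_1} & \dfrac{\partial F^2}{\partial y_2} & \cdots & \dfrac{\partial F^2}{\partial y_n} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial F^n}{\partial y_1} & \dfrac{\partial F^n}{\partial y_2} & \cdots & \dfrac{\partial F^n}{\partial y_n} \end{pmatrix} \begin{pmatrix} \dfrac{\partial y_1}{\partial x_i} \\[6pt] \dfrac{\partial y_2}{\partial x_i} \\ \vdots \\ \dfrac{\partial y_n}{\partial x_i} \end{pmatrix} = \begin{pmatrix} -\dfrac{\partial F^1}{\partial x_i} \\[6pt] -\dfrac{\partial F^2}{\partial x_i} \\ \vdots \\ -\dfrac{\partial F^n}{\partial x_i} \end{pmatrix}.$$
Now, with the help of Cramer's rule it is possible to determine the direction of the impact of the i-th independent variable (qualitative analysis) and, with the help of known values of the parameters, also the extent of the impact (quantitative analysis). In more complex analyses the fact that the same Jacobian can be used repeatedly becomes very useful.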
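As an illustration, Cramer's rule can be applied to the three equilibrium conditions of the national income model from Section 7.1, with $(Y, C, T)$ as the dependent variables and $G_0$ as the independent variable of interest. The parameter values and the helper names are hypothetical.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def cramer(jacobian, rhs):
    """Impacts on all dependent variables: |J_i| / |J| for each column i."""
    base = det3(jacobian)
    impacts = []
    for col in range(3):
        replaced = [row[:] for row in jacobian]
        for r in range(3):
            replaced[r][col] = rhs[r]
        impacts.append(det3(replaced) / base)
    return impacts

# Hypothetical parameter values: marginal propensity to consume and tax rate.
mpc, tax_rate = Fraction(4, 5), Fraction(1, 4)

# Derivatives of Y - C - I0 - G0, C - C0 - c(Y - T) and T - T0 - tY
# with respect to the dependent variables (Y, C, T).
jacobian = [[1, -1, 0], [-mpc, 1, mpc], [-tax_rate, 0, 1]]

# Right side: minus the derivatives of the three conditions with respect to G0.
rhs_G0 = [1, 0, 0]

dY_dG0, dC_dG0, dT_dG0 = cramer(jacobian, rhs_G0)
print(det3(jacobian), dY_dG0, dC_dG0, dT_dG0)
```

The Jacobian determinant equals $1 - c + ct$, the denominator of the equilibrium income found earlier. For another independent variable only the right-side vector has to be replaced.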
In general, it is always possible to transform $n$ equilibrium conditions to a form where a function of all variables is on one side and 0 on the other side of the equation:
$$\begin{cases} F^1(y_1, \ldots, y_n; x_1, \ldots, x_m) = 0, \\ \quad\vdots \\ F^n(y_1, \ldots, y_n; x_1, \ldots, x_m) = 0. \end{cases}$$
For analysing the impact of the independent variables on the equilibrium values of the dependent variables, partial total derivatives with respect to a particular independent variable can be used. According to the formula for a total differential:
$$\begin{cases} \dfrac{\partial F^1}{\partial y_1}dy_1 + \ldots + \dfrac{\partial F^1}{\partial y_n}dy_n + \dfrac{\partial F^1}{\partial x_1}dx_1 + \ldots + \dfrac{\partial F^1}{\partial x_m}dx_m = 0, \\ \quad\vdots \\ \dfrac{\partial F^n}{\partial y_1}dy_1 + \ldots + \dfrac{\partial F^n}{\partial y_n}dy_n + \dfrac{\partial F^n}{\partial x_1}dx_1 + \ldots + \dfrac{\partial F^n}{\partial x_m}dx_m = 0. \end{cases}$$

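For analysing the impact of a particular independent variable (for example $x_1$) on the equilibrium
values of the dependent variables, we divide the equations obtained by the differential of this
particular independent variable ($dx_1$ in our example). This brings us into a situation where on the
left side there are the partial total derivatives of the functions $F^1,\dots,F^n$ with respect to this
particular independent variable ($x_1$ here):
$$
\begin{aligned}
\frac{dF^1}{dx_1}&=\frac{\partial F^1}{\partial y_1}\frac{\partial y_1}{\partial x_1}+\dots+\frac{\partial F^1}{\partial y_n}\frac{\partial y_n}{\partial x_1}+\frac{\partial F^1}{\partial x_1}\overbrace{\frac{\partial x_1}{\partial x_1}}^{1}+\dots+\frac{\partial F^1}{\partial x_m}\overbrace{\frac{\partial x_m}{\partial x_1}}^{0}=0,\\
&\;\;\vdots\\
\frac{dF^n}{dx_1}&=\frac{\partial F^n}{\partial y_1}\frac{\partial y_1}{\partial x_1}+\dots+\frac{\partial F^n}{\partial y_n}\frac{\partial y_n}{\partial x_1}+\frac{\partial F^n}{\partial x_1}\underbrace{\frac{\partial x_1}{\partial x_1}}_{1}+\dots+\frac{\partial F^n}{\partial x_m}\underbrace{\frac{\partial x_m}{\partial x_1}}_{0}=0.
\end{aligned}
$$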

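Since the independent variables are not related to each other, the corresponding derivatives are
equal to 0, and we also know that $\frac{\partial x_1}{\partial x_1}=1$. Moving the remaining term
$\frac{\partial F^k}{\partial x_1}$ to the right side of each equation gives:
$$
\begin{aligned}
\frac{\partial F^1}{\partial y_1}\frac{\partial y_1}{\partial x_1}+\dots+\frac{\partial F^1}{\partial y_n}\frac{\partial y_n}{\partial x_1}&=-\frac{\partial F^1}{\partial x_1},\\
&\;\;\vdots\\
\frac{\partial F^n}{\partial y_1}\frac{\partial y_1}{\partial x_1}+\dots+\frac{\partial F^n}{\partial y_n}\frac{\partial y_n}{\partial x_1}&=-\frac{\partial F^n}{\partial x_1}.
\end{aligned}
$$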

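This system can be written as a matrix equation that makes it possible to find the impact of $x_1$ on the
equilibrium values:
$$
\begin{bmatrix}
\frac{\partial F^1}{\partial y_1} & \cdots & \frac{\partial F^1}{\partial y_n}\\
\vdots & \ddots & \vdots\\
\frac{\partial F^n}{\partial y_1} & \cdots & \frac{\partial F^n}{\partial y_n}
\end{bmatrix}
\begin{bmatrix}
\frac{\partial y_1}{\partial x_1}\\ \vdots\\ \frac{\partial y_n}{\partial x_1}
\end{bmatrix}
=-
\begin{bmatrix}
\frac{\partial F^1}{\partial x_1} & \cdots & \frac{\partial F^1}{\partial x_m}\\
\vdots & \ddots & \vdots\\
\frac{\partial F^n}{\partial x_1} & \cdots & \frac{\partial F^n}{\partial x_m}
\end{bmatrix}
\begin{bmatrix}
\frac{\partial x_1}{\partial x_1}\\ \vdots\\ \frac{\partial x_m}{\partial x_1}
\end{bmatrix}
=
\begin{bmatrix}
-\frac{\partial F^1}{\partial x_1}\\ \vdots\\ -\frac{\partial F^n}{\partial x_1}
\end{bmatrix}.
$$
Analogously, we can derive the matrix equations for analysing the impact of the other independent
variables $x_2,\dots,x_m$. In general, for an independent variable $x_i$ the matrix equation is:
$$
\begin{bmatrix}
\frac{\partial F^1}{\partial y_1} & \cdots & \frac{\partial F^1}{\partial y_n}\\
\vdots & \ddots & \vdots\\
\frac{\partial F^n}{\partial y_1} & \cdots & \frac{\partial F^n}{\partial y_n}
\end{bmatrix}
\begin{bmatrix}
\frac{\partial y_1}{\partial x_i}\\ \vdots\\ \frac{\partial y_n}{\partial x_i}
\end{bmatrix}
=
\begin{bmatrix}
-\frac{\partial F^1}{\partial x_i}\\ \vdots\\ -\frac{\partial F^n}{\partial x_i}
\end{bmatrix}.
$$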


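A matrix that consists of the derivatives of different functions with respect to the dependent
variables is called the Jacobian:
$$
J=
\begin{bmatrix}
\frac{\partial F^1}{\partial y_1} & \cdots & \frac{\partial F^1}{\partial y_n}\\
\vdots & \ddots & \vdots\\
\frac{\partial F^n}{\partial y_1} & \cdots & \frac{\partial F^n}{\partial y_n}
\end{bmatrix}.
$$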

The elements of this matrix and the value of its determinant are the same regardless of the
independent variable $x_i$ ($i=1,\dots,m$) whose impact is under consideration at the moment. Therefore,
once the value of this determinant has been found, it can be used repeatedly for analysing the
impact of any independent variable or parameter on the equilibrium values of the dependent
variables.
Using Cramer's rule, we can find the impact of a particular variable $x_i$ on the dependent
variable $y_j$: we divide the determinant of a matrix $J_j$, where the $j$-th column is replaced by the
vector of constants on the right side of the matrix equation, by the determinant of the Jacobian $J$:
$$
\frac{\partial y_j}{\partial x_i}=\frac{|J_j|}{|J|}.
$$
The vector of constants used is found by taking derivatives of the functions $F^1,\dots,F^n$ with
respect to $x_i$ and multiplying them by $-1$. Hence, the dependent variable under discussion
determines which column is replaced, and the independent variable under discussion determines
with what it is replaced.
When the impact of a parameter is of interest, the procedure is analogous and the particular parameter
takes the place of $x_i$ (the vector of constants is then found by taking derivatives of the functions
$F^1,\dots,F^n$ with respect to this parameter and multiplying them by $-1$).
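The procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the derivatives are given as numbers, the system is small, and the function names `det` and `comparative_statics` as well as the two-equation system in the usage lines are hypothetical and do not come from the text.

```python
from fractions import Fraction

def det(m):
    """Determinant by expansion along the first row (adequate for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total

def comparative_statics(jacobian, dF_dx):
    """Return [dy_1/dx_i, ..., dy_n/dx_i] by Cramer's rule.

    jacobian[k][j] holds dF^(k+1)/dy_(j+1); dF_dx[k] holds dF^(k+1)/dx_i.
    Rows must be Python lists.
    """
    det_j = det(jacobian)
    if det_j == 0:
        raise ValueError("singular Jacobian: Cramer's rule does not apply")
    rhs = [-v for v in dF_dx]  # vector of constants: derivatives multiplied by -1
    effects = []
    for j in range(len(jacobian)):
        # J_j: the j-th column of the Jacobian replaced by the vector of constants
        replaced = [row[:j] + [rhs[k]] + row[j + 1:] for k, row in enumerate(jacobian)]
        effects.append(Fraction(det(replaced)) / Fraction(det_j))
    return effects

# Hypothetical system: F^1 = y1 + 2*y2 - x = 0, F^2 = 3*y1 - y2 = 0
J = [[1, 2], [3, -1]]
effects = comparative_statics(J, [-1, 0])  # dF^1/dx = -1, dF^2/dx = 0
print(effects)
```

Solving the hypothetical system by hand gives $y_1 = x/7$ and $y_2 = 3x/7$, which matches the two impacts returned. The Jacobian determinant is computed once per call here; when many independent variables are analysed, it could be computed once and reused, as the text points out.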
Let us look at a national income model from macroeconomics, where the national income $Y$ is
equal to the total expenditures, which are divided between consumption $C$, investments $I$ and
government expenditures $G$; the last two are independent variables denoted with a subscript 0:
$$
Y=C+I_0+G_0.
$$
Consumption, in turn, depends on income. From the income after taxes (income minus taxes $T$)
only a certain portion (expressed by a parameter $c$) is used for consumption and, in addition, there is
an autonomous consumption $C_0$:
$$
C=C_0+c\,(Y-T),\quad 0<c<1.
$$
Taxes consist of the autonomous taxes and a portion (determined by a parameter $t$) of the
total income:
$$
T=T_0+tY,\quad 0<t<1.
$$
The three equilibrium conditions are then:
$$
\begin{cases}
Y-C-I_0-G_0=0,\\
C-C_0-c\,(Y-T)=0,\\
T-T_0-tY=0.
\end{cases}
$$


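First, we can find the Jacobian that remains the same:
$$
J=
\begin{bmatrix}
F^1_Y & F^1_C & F^1_T\\
F^2_Y & F^2_C & F^2_T\\
F^3_Y & F^3_C & F^3_T
\end{bmatrix}
=
\begin{bmatrix}
1 & -1 & 0\\
-c & 1 & c\\
-t & 0 & 1
\end{bmatrix}.
$$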

When constructing a matrix equation for examining the impact of a particular variable or parameter,
it is important to use the same order of dependent variables (here $Y$, $C$ and $T$) in constructing
the Jacobian and in constructing the vector of the impacts of interest on the left side of the matrix
equation. The choice of order itself is not important; it just has to be used consistently.
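For example, for the impact of investments $I_0$ on the equilibrium values of the dependent variables
the following matrix equation is used:
$$
\begin{bmatrix}
1 & -1 & 0\\
-c & 1 & c\\
-t & 0 & 1
\end{bmatrix}
\begin{bmatrix}
Y_I\\ C_I\\ T_I
\end{bmatrix}
=
\begin{bmatrix}
-F^1_I\\ -F^2_I\\ -F^3_I
\end{bmatrix}
=
\begin{bmatrix}
1\\ 0\\ 0
\end{bmatrix}
$$
and the impact of investments, for example, on the equilibrium income is:
$$
\frac{\partial Y}{\partial I_0}
=\frac{\begin{vmatrix} 1 & -1 & 0\\ 0 & 1 & c\\ 0 & 0 & 1\end{vmatrix}}
{\begin{vmatrix} 1 & -1 & 0\\ -c & 1 & c\\ -t & 0 & 1\end{vmatrix}}
=\frac{1}{1-c+tc}
=\frac{1}{\underbrace{(1-c)}_{(+)}+\underbrace{tc}_{(+)}}>0.
$$
Hence, investments have a positive impact on the equilibrium income.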
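The same method can be used for examining the impact of parameters. For example, the impact of
the tax rate $t$ on the equilibrium values of the dependent variables can be analysed with the help of
the following matrix equation:
$$
\begin{bmatrix}
1 & -1 & 0\\
-c & 1 & c\\
-t & 0 & 1
\end{bmatrix}
\begin{bmatrix}
Y_t\\ C_t\\ T_t
\end{bmatrix}
=
\begin{bmatrix}
-F^1_t\\ -F^2_t\\ -F^3_t
\end{bmatrix}
=
\begin{bmatrix}
0\\ 0\\ Y
\end{bmatrix}
$$
and the impact of the tax rate on the equilibrium income is:
$$
\frac{\partial Y}{\partial t}
=\frac{\begin{vmatrix} 0 & -1 & 0\\ 0 & 1 & c\\ Y & 0 & 1\end{vmatrix}}
{\begin{vmatrix} 1 & -1 & 0\\ -c & 1 & c\\ -t & 0 & 1\end{vmatrix}}
=\frac{-cY}{1-c+tc}
=\frac{-\overbrace{cY}^{(+)}}{\underbrace{(1-c)}_{(+)}+\underbrace{tc}_{(+)}}<0.
$$
Since $0<c<1$, then $1-c>0$ and the impact of the tax rate on the equilibrium income is negative:
as the tax rate increases, the equilibrium income decreases.
These results are in accordance with those obtained before with a qualitative analysis performed
without using matrix algebra.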
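The two results can be checked numerically. The sketch below assumes hypothetical parameter values (chosen only for illustration) and uses the reduced form obtained by substituting the consumption and tax equations into $Y=C+I_0+G_0$; the function name `equilibrium_income` is likewise an assumption, not part of the text.

```python
def equilibrium_income(C0, c, T0, t, I0, G0):
    """Solve Y = C + I0 + G0, C = C0 + c(Y - T), T = T0 + tY for Y."""
    return (C0 - c * T0 + I0 + G0) / (1 - c + t * c)

# Hypothetical parameter values, for illustration only
params = dict(C0=50.0, c=0.8, T0=20.0, t=0.25, I0=100.0, G0=80.0)
Y = equilibrium_income(**params)

det_J = 1 - params["c"] + params["t"] * params["c"]  # determinant of the Jacobian
dY_dI0 = 1 / det_J                  # impact of investments from Cramer's rule
dY_dt = -params["c"] * Y / det_J    # impact of the tax rate from Cramer's rule

# Finite-difference approximations of the same derivatives
h = 1e-6
num_dI0 = (equilibrium_income(**{**params, "I0": params["I0"] + h}) - Y) / h
num_dt = (equilibrium_income(**{**params, "t": params["t"] + h}) - Y) / h

print(dY_dI0, num_dI0)
print(dY_dt, num_dt)
```

The finite-difference values should agree with the Cramer's rule expressions up to a small numerical error, with a positive sign for investments and a negative sign for the tax rate.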


Next, let us look at the comparative statics analysis in the case of the IS-LM model covering both
the market for goods and the money market. The goods market is assumed to be described by the
following equations. The investments depend negatively on the interest rate:
$$
I=I(i),\quad I_i<0,
$$
the savings depend positively on the interest rate and on the national income:
$$
S=S(i,Y),\quad S_i>0,\quad S_Y>0;
$$
the government expenditures are exogenous:
$$
G=G_0,
$$
and the government's tax revenue depends positively on the national income:
$$
T=T(Y),\quad T_Y>0;
$$
the amount of imports depends positively on the national income:
$$
IM=IM(Y),\quad IM_Y>0,
$$
and the amount of exports is exogenous:
$$
X=X_0.
$$
The money market is assumed to be described by the following equations. The money supply is
exogenous:
$$
M=M_0,
$$
and the demand for money depends positively on the national income and negatively on the interest
rate:
$$
L=L(i,Y),\quad L_i<0,\quad L_Y>0.
$$
The first equilibrium condition states that the leakages have to be equal to the injections:
$$
S(i,Y)+T(Y)+IM(Y)=I(i)+G_0+X_0
$$
or the difference between the savings and investments has to be equal to the sum of the budget
deficit and the net exports:
$$
S(i,Y)-I(i)=G_0-T(Y)+X_0-IM(Y).
$$
The second equilibrium condition states that the money supply has to be equal to the demand for
money:
$$
M_0=L(i,Y).
$$
Hence:
$$
\begin{cases}
F^1(i,Y;G_0,X_0,M_0)=S(i,Y)+T(Y)+IM(Y)-I(i)-G_0-X_0=0,\\
F^2(i,Y;G_0,X_0,M_0)=M_0-L(i,Y)=0.
\end{cases}
$$
There are two dependent variables in this system, $i$ and $Y$, and the independent variables are $G_0$,
$X_0$ and $M_0$ ($S$, $I$, $T$, $IM$ and $L$ denote functions in this model).
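Let us find the determinant of the Jacobian:
$$
|J|=
\begin{vmatrix}
\frac{\partial F^1}{\partial i} & \frac{\partial F^1}{\partial Y}\\
\frac{\partial F^2}{\partial i} & \frac{\partial F^2}{\partial Y}
\end{vmatrix}
=
\begin{vmatrix}
S_i-I_i & S_Y+T_Y+IM_Y\\
-L_i & -L_Y
\end{vmatrix}
=-\underbrace{L_Y}_{(+)}\bigl(\underbrace{S_i}_{(+)}-\underbrace{I_i}_{(-)}\bigr)
+\underbrace{L_i}_{(-)}\bigl(\underbrace{S_Y}_{(+)}+\underbrace{T_Y}_{(+)}+\underbrace{IM_Y}_{(+)}\bigr)<0.
$$
In order to analyse the impact of the government expenditures, we can use a matrix equation:
$$
\begin{bmatrix}
S_i-I_i & S_Y+T_Y+IM_Y\\
-L_i & -L_Y
\end{bmatrix}
\begin{bmatrix}
\frac{\partial i^*}{\partial G_0}\\ \frac{\partial Y^*}{\partial G_0}
\end{bmatrix}
=
\begin{bmatrix}
-\frac{\partial F^1}{\partial G_0}\\ -\frac{\partial F^2}{\partial G_0}
\end{bmatrix}
=
\begin{bmatrix}
1\\ 0
\end{bmatrix}.
$$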
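From here, the impact of the government expenditures on the equilibrium interest rate and
equilibrium national income is:
$$
\frac{\partial i^*}{\partial G_0}=\frac{|J_1|}{|J|}
=\frac{\begin{vmatrix} 1 & S_Y+T_Y+IM_Y\\ 0 & -L_Y\end{vmatrix}}{|J|}
=\frac{-\overbrace{L_Y}^{(+)}}{\underbrace{|J|}_{(-)}}>0,
$$
$$
\frac{\partial Y^*}{\partial G_0}=\frac{|J_2|}{|J|}
=\frac{\begin{vmatrix} S_i-I_i & 1\\ -L_i & 0\end{vmatrix}}{|J|}
=\frac{\overbrace{L_i}^{(-)}}{\underbrace{|J|}_{(-)}}>0.
$$
This result is in accordance with the theoretical assumptions: an increase in the government
expenditures shifts the IS-curve to the right and thus increases the equilibrium interest rate and the
equilibrium income (see Figure A7.1).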
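The signs derived for the IS-LM model can be illustrated with numbers. In the sketch below, the values of the partial derivatives are hypothetical (only their signs follow the assumptions of the model), and the helper `det2` is an assumed name for a plain 2x2 determinant.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Hypothetical values of the partial derivatives; only the signs come from the model
S_i, S_Y = 20.0, 0.2      # savings: S_i > 0, S_Y > 0
I_i = -50.0               # investments: I_i < 0
T_Y, IM_Y = 0.25, 0.15    # taxes and imports: T_Y > 0, IM_Y > 0
L_i, L_Y = -40.0, 0.5     # money demand: L_i < 0, L_Y > 0

# Jacobian with the dependent variables ordered (i, Y)
J = [[S_i - I_i, S_Y + T_Y + IM_Y],
     [-L_i, -L_Y]]
rhs = [1.0, 0.0]  # derivatives of F^1 and F^2 with respect to G0, multiplied by -1

J1 = [[rhs[0], J[0][1]], [rhs[1], J[1][1]]]  # first column replaced
J2 = [[J[0][0], rhs[0]], [J[1][0], rhs[1]]]  # second column replaced

di_dG0 = det2(J1) / det2(J)   # impact on the equilibrium interest rate
dY_dG0 = det2(J2) / det2(J)   # impact on the equilibrium income
print(det2(J), di_dG0, dY_dG0)
```

With these assumed values the determinant of the Jacobian is negative and both impacts are positive, as in the qualitative analysis; other values with the same signs lead to the same conclusion.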

[Figure: interest rate $i$ on the vertical axis; the LM curve, the IS curve and the shifted curve IS']

Figure A7.1. IS-LM model


When using the chain rule, it is possible to investigate the impact on $S$, $I$, $T$, $IM$ and $L$ as well. For
example, the impact of the government expenditures on the investments:
$$
\frac{\partial I}{\partial G_0}=\underbrace{\frac{dI}{di}}_{(-)}\,\underbrace{\frac{\partial i}{\partial G_0}}_{(+)}<0.
$$
This negative impact is called a displacement effect.


When a phenomenon is described as a function of more than one variable, the partial total
derivative has to be used. For example, when savings are described as being dependent on both the
interest rate and the national income, the impact of the government expenditures on the savings
function can be found as follows:
$$
\frac{\partial S}{\partial G_0}
=\underbrace{\frac{\partial S}{\partial i}}_{(+)}\,\underbrace{\frac{\partial i}{\partial G_0}}_{(+)}
+\underbrace{\frac{\partial S}{\partial Y}}_{(+)}\,\underbrace{\frac{\partial Y}{\partial G_0}}_{(+)}>0.
$$

Comparative statics analysis can also be performed when optimizing. For example, the necessary
conditions for optimizing the function $z=f(x_1,x_2,\dots,x_n)$ form a system:
$$
\begin{cases}
f_1=0,\\
\quad\vdots\\
f_n=0,
\end{cases}
$$
and for that system the element $j_{ij}$ of the Jacobian is the derivative of the function $f_i$ (from
the $i$-th condition) with respect to the $j$-th variable $x_j$. It appears that this Jacobian coincides with
the Hessian that is used for investigating whether the sufficient conditions are satisfied:
$$
J=
\begin{bmatrix}
f_{11} & \cdots & f_{1n}\\
\vdots & \ddots & \vdots\\
f_{n1} & \cdots & f_{nn}
\end{bmatrix}
=H.
$$
Hence, if the sign of the determinant of a Hessian is known, the sign of the determinant of the
corresponding Jacobian is also known, since $|J|=|H|$.

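For example, when maximizing the profit $\pi=p\,q(K,L)-rK-wL$, the necessary conditions are
($K$ and $L$ are the dependent variables here):
$$
\begin{cases}
\pi_K=F^1(K,L;r,w,p)=p\,MPK-r=0,\\
\pi_L=F^2(K,L;r,w,p)=p\,MPL-w=0.
\end{cases}
$$
The Jacobian, which is the Hessian as well:
$$
J=
\begin{bmatrix}
\frac{\partial F^1}{\partial K} & \frac{\partial F^1}{\partial L}\\
\frac{\partial F^2}{\partial K} & \frac{\partial F^2}{\partial L}
\end{bmatrix}
=
\begin{bmatrix}
p\,q_{KK} & p\,q_{KL}\\
p\,q_{LK} & p\,q_{LL}
\end{bmatrix}
=
\begin{bmatrix}
\pi_{KK} & \pi_{KL}\\
\pi_{LK} & \pi_{LL}
\end{bmatrix}
=H.
$$
Since for confirming a maximum profit the determinant of the Hessian should be positive, then:
$$
|J|=|H|>0.
$$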
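We can examine, for example, the impact of the changes in price on the optimal quantities of the
inputs $K$ and $L$ by constructing a matrix equation:
$$
\begin{bmatrix}
p\,q_{KK} & p\,q_{KL}\\
p\,q_{LK} & p\,q_{LL}
\end{bmatrix}
\begin{bmatrix}
\frac{\partial K^*}{\partial p}\\ \frac{\partial L^*}{\partial p}
\end{bmatrix}
=
\begin{bmatrix}
-\frac{\partial F^1}{\partial p}\\ -\frac{\partial F^2}{\partial p}
\end{bmatrix}
=
\begin{bmatrix}
-MPK\\ -MPL
\end{bmatrix}.
$$
From here, the impact of the price on the optimal quantity of capital, for example, is:
$$
\frac{\partial K^*}{\partial p}=\frac{|J_1|}{|J|}
=\frac{\begin{vmatrix} -MPK & p\,q_{KL}\\ -MPL & p\,q_{LL}\end{vmatrix}}{|J|}
=\frac{\overbrace{p}^{(+)}\bigl(-\overbrace{MPK}^{(+)}\,\overbrace{q_{LL}}^{(-)}+\overbrace{MPL}^{(+)}\,\overbrace{q_{KL}}^{(+)}\bigr)}{\underbrace{|J|}_{(+)}}>0.
$$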


The sign of this derivative can be found in the following way. A necessary condition for maximizing
the profit is that $p\,MPK-r=0$ and $p\,MPL-w=0$. Since we assume all prices ($p$, $w$, $r$) to be
positive, $MPK$ and $MPL$ have to be positive (this is logical: production increases as the input
increases). According to the second-order condition, the first-order principal minor $\pi_{KK}=p\,q_{KK}$
should be negative and thus $q_{KK}$ should be negative. Since the order of variables is not important
when optimizing, it can also be said that the first-order principal minor should be $\pi_{LL}=p\,q_{LL}$
instead and thus $q_{LL}$ should be negative as well. For determining the sign of $q_{KL}$ it is logical to
assume that as the amount of capital increases, the marginal product of labour increases; hence $q_{KL}$
can be assumed to be positive. Taking all together, it can be concluded that an increase in the price
of a product increases the optimal quantity of capital (the derivative of the equilibrium value of
capital with respect to price is positive).
In the case of optimization with constraints, the Jacobian used for comparative statics is similar
to the bordered Hessian used for optimization, and the determinants of those two matrices are
equal. This can be easily shown by using a certain order for the optimality conditions; in
other cases, transformations that do not change the value of the determinant can be used to show
that these two matrices coincide (one can change the signs of an even number of rows or
columns of a matrix, or switch rows or columns an even number of times).
For example, when optimizing the objective function $z=f(x,y)$ subject to a constraint $g(x,y)=b$,
the optimality conditions can be found by taking derivatives of the Lagrangian function
$\mathcal{L}=f(x,y)+\lambda\,[b-g(x,y)]$ with respect to $\lambda$, $x$ and $y$:
$$
\begin{cases}
\mathcal{L}_\lambda=b-g(x,y)=0,\\
\mathcal{L}_x=f_x-\lambda g_x=0,\\
\mathcal{L}_y=f_y-\lambda g_y=0.
\end{cases}
$$

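For finding the Jacobian we take the derivatives of these functions $\mathcal{L}_\lambda$, $\mathcal{L}_x$ and $\mathcal{L}_y$ again with
respect to the variables $\lambda$, $x$ and $y$:
$$
J=
\begin{bmatrix}
\frac{\partial \mathcal{L}_\lambda}{\partial \lambda} & \frac{\partial \mathcal{L}_\lambda}{\partial x} & \frac{\partial \mathcal{L}_\lambda}{\partial y}\\
\frac{\partial \mathcal{L}_x}{\partial \lambda} & \frac{\partial \mathcal{L}_x}{\partial x} & \frac{\partial \mathcal{L}_x}{\partial y}\\
\frac{\partial \mathcal{L}_y}{\partial \lambda} & \frac{\partial \mathcal{L}_y}{\partial x} & \frac{\partial \mathcal{L}_y}{\partial y}
\end{bmatrix}
=
\begin{bmatrix}
\mathcal{L}_{\lambda\lambda} & \mathcal{L}_{\lambda x} & \mathcal{L}_{\lambda y}\\
\mathcal{L}_{x\lambda} & \mathcal{L}_{xx} & \mathcal{L}_{xy}\\
\mathcal{L}_{y\lambda} & \mathcal{L}_{yx} & \mathcal{L}_{yy}
\end{bmatrix}
=
\begin{bmatrix}
0 & -g_x & -g_y\\
-g_x & f_{xx}-\lambda g_{xx} & f_{xy}-\lambda g_{xy}\\
-g_y & f_{yx}-\lambda g_{yx} & f_{yy}-\lambda g_{yy}
\end{bmatrix}.
$$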
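The commonly known form of the bordered Hessian for this optimization problem is:
$$
\bar{H}=
\begin{bmatrix}
0 & g_x & g_y\\
g_x & \mathcal{L}_{xx} & \mathcal{L}_{xy}\\
g_y & \mathcal{L}_{yx} & \mathcal{L}_{yy}
\end{bmatrix}
=
\begin{bmatrix}
0 & g_x & g_y\\
g_x & f_{xx}-\lambda g_{xx} & f_{xy}-\lambda g_{xy}\\
g_y & f_{yx}-\lambda g_{yx} & f_{yy}-\lambda g_{yy}
\end{bmatrix};
$$
by multiplying the first row and the first column by $-1$ (this does not change the value of the
determinant) we get the Jacobian:
$$
|\bar{H}|=
\begin{vmatrix}
0 & g_x & g_y\\
g_x & f_{xx}-\lambda g_{xx} & f_{xy}-\lambda g_{xy}\\
g_y & f_{yx}-\lambda g_{yx} & f_{yy}-\lambda g_{yy}
\end{vmatrix}
=
\begin{vmatrix}
0 & -g_x & -g_y\\
-g_x & f_{xx}-\lambda g_{xx} & f_{xy}-\lambda g_{xy}\\
-g_y & f_{yx}-\lambda g_{yx} & f_{yy}-\lambda g_{yy}
\end{vmatrix}
=|J|.
$$
However, it must be noted that if the bordered Hessian is obtained by differentiating the first-order
conditions again, then it is exactly equal to the Jacobian.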
In the case of the problem of production-maximization at fixed costs (Lagrangian function
$\mathcal{L}=q(K,L)+\lambda\,(c-rK-wL)$) the optimality conditions are:
$$
\begin{cases}
\mathcal{L}_\lambda=F^1(\lambda,K,L;r,w,c)=c-rK-wL=0,\\
\mathcal{L}_K=F^2(\lambda,K,L;r,w,c)=MPK-\lambda r=0,\\
\mathcal{L}_L=F^3(\lambda,K,L;r,w,c)=MPL-\lambda w=0.
\end{cases}
$$
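The Jacobian is then:
$$
J=
\begin{bmatrix}
\frac{\partial F^1}{\partial \lambda} & \frac{\partial F^1}{\partial K} & \frac{\partial F^1}{\partial L}\\
\frac{\partial F^2}{\partial \lambda} & \frac{\partial F^2}{\partial K} & \frac{\partial F^2}{\partial L}\\
\frac{\partial F^3}{\partial \lambda} & \frac{\partial F^3}{\partial K} & \frac{\partial F^3}{\partial L}
\end{bmatrix}
=
\begin{bmatrix}
0 & -r & -w\\
-r & q_{KK} & q_{KL}\\
-w & q_{LK} & q_{LL}
\end{bmatrix}
$$
and its determinant is equal to the determinant of the bordered Hessian and is positive (see the
example):
$$
|J|=
\begin{vmatrix}
0 & -r & -w\\
-r & q_{KK} & q_{KL}\\
-w & q_{LK} & q_{LL}
\end{vmatrix}
=
\begin{vmatrix}
0 & r & w\\
r & q_{KK} & q_{KL}\\
w & q_{LK} & q_{LL}
\end{vmatrix}
=|\bar{H}|
=2rw\underbrace{q_{KL}}_{(+)}-w^2\underbrace{q_{KK}}_{(-)}-r^2\underbrace{q_{LL}}_{(-)}>0.
$$
Hence, when the assumptions about the signs that were made before hold, the solution of this
problem always maximizes the production.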
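We can examine the impact of the expenditures ($c$) using the following matrix equation:
$$
\begin{bmatrix}
0 & -r & -w\\
-r & q_{KK} & q_{KL}\\
-w & q_{LK} & q_{LL}
\end{bmatrix}
\begin{bmatrix}
\frac{\partial \lambda^*}{\partial c}\\ \frac{\partial K^*}{\partial c}\\ \frac{\partial L^*}{\partial c}
\end{bmatrix}
=
\begin{bmatrix}
-1\\ 0\\ 0
\end{bmatrix}.
$$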
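For example, the impact of the expenditures that can be made on the marginal product of money:
$$
\frac{\partial \lambda^*}{\partial c}=\frac{|J_1|}{|J|}
=\frac{\begin{vmatrix} -1 & -r & -w\\ 0 & q_{KK} & q_{KL}\\ 0 & q_{LK} & q_{LL}\end{vmatrix}}{|J|}
=\frac{-\overbrace{q_{KK}}^{(-)}\,\overbrace{q_{LL}}^{(-)}+\overbrace{q_{KL}^2}^{(+)}}{\underbrace{|J|}_{(+)}}.
$$
In the case where $-q_{KK}q_{LL}+q_{KL}^2>0$, we have $\frac{\partial \lambda^*}{\partial c}>0$ and hence, as the
expenditures increase, every additional unit of money brings about more and more product. In the
opposite case the conclusion is the opposite.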

However, we can use the connection of this problem of production-maximization at fixed costs with
the problem of profit-maximization. There, in order to maximize the profit, the determinant of the
Hessian should be positive:
$$
|H|=
\begin{vmatrix}
p\,q_{KK} & p\,q_{KL}\\
p\,q_{LK} & p\,q_{LL}
\end{vmatrix}
=p^2\left(q_{KK}q_{LL}-q_{KL}^2\right)>0
$$
and as the price is positive, it must hold that $-q_{KK}q_{LL}+q_{KL}^2<0$. Hence, we can conclude that
$\frac{\partial \lambda^*}{\partial c}<0$ and hence, as the expenditures increase, every additional unit of money
brings about less and less production. This is in accordance with the common assumption of a
decreasing marginal product.
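Comparative statics can be applied to Leontief's model as well. When we look at
Leontief's model in the form $X=(E-A)^{-1}Y$ or $X=BY$:
$$
\begin{bmatrix}
x_1\\ \vdots\\ x_n
\end{bmatrix}
=
\begin{bmatrix}
b_{11} & \cdots & b_{1n}\\
\vdots & \ddots & \vdots\\
b_{n1} & \cdots & b_{nn}
\end{bmatrix}
\begin{bmatrix}
y_1\\ \vdots\\ y_n
\end{bmatrix}
=
\begin{bmatrix}
b_{11}y_1+b_{12}y_2+\dots+b_{1n}y_n\\
\vdots\\
b_{n1}y_1+b_{n2}y_2+\dots+b_{nn}y_n
\end{bmatrix},
$$
then it can be expressed as an equation system:
$$
\begin{cases}
x_1=b_{11}y_1+b_{12}y_2+\dots+b_{1n}y_n,\\
x_2=b_{21}y_1+b_{22}y_2+\dots+b_{2n}y_n,\\
\qquad\vdots\\
x_n=b_{n1}y_1+b_{n2}y_2+\dots+b_{nn}y_n.
\end{cases}
$$
The coefficient $b_{ij}$ can then be viewed as a derivative of the total output with respect to the final
demand:
$$
b_{ij}=\frac{\partial x_i}{\partial y_j}\approx\frac{\Delta x_i}{\Delta y_j}.
$$
Hence, the coefficient $b_{ij}$ shows the increase (approximately) of the total output in the $i$-th industry
$x_i$, as the final demand of the $j$-th industry $y_j$ increases by one unit.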
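The interpretation of $b_{ij}$ can be illustrated with a small numeric sketch. The two-industry matrix of direct input coefficients `A` below is hypothetical, and the function name `total_output` is an assumption made for this illustration; exact fractions are used so that the comparison is free of rounding error.

```python
from fractions import Fraction as Fr

# Hypothetical 2-industry matrix of direct input coefficients A
A = [[Fr(1, 5), Fr(3, 10)],
     [Fr(2, 5), Fr(1, 10)]]

# E - A, where E is the identity matrix
M = [[1 - A[0][0], -A[0][1]],
     [-A[1][0], 1 - A[1][1]]]
d = M[0][0] * M[1][1] - M[0][1] * M[1][0]

# B = (E - A)^(-1), using the inverse formula for a 2x2 matrix
B = [[M[1][1] / d, -M[0][1] / d],
     [-M[1][0] / d, M[0][0] / d]]

def total_output(Y):
    """X = BY for a final demand vector Y = [y1, y2]."""
    return [B[0][0] * Y[0] + B[0][1] * Y[1],
            B[1][0] * Y[0] + B[1][1] * Y[1]]

X_before = total_output([100, 200])
X_after = total_output([100, 201])  # final demand of industry 2 rises by one unit
change = [after - before for after, before in zip(X_after, X_before)]
print(change, [B[0][1], B[1][1]])
```

The change in the total outputs equals the second column of `B`, that is, $b_{12}$ and $b_{22}$. Because the system is linear in the final demands, the agreement is exact in this example.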

8. DIFFERENCE EQUATIONS

8.1. Dynamic analysis, difference and differential equations


In dynamic analysis, economic phenomena are investigated over time: how the
variables describing those phenomena behave over time and whether they converge to certain
(equilibrium) values or diverge from them. This means time is included in the analysis as a variable.
While usually relationships between two or more economic variables are investigated, here the
behaviour of one or more variables over time is of interest. For example, the change of price or of the
national income over time may be under consideration.
Hence, time ($t$) can be viewed as an independent variable. The time path of an economic variable,
which shows how this variable changes over time, can be expressed as a function of time: $y=f(t)$.
Often the form of this function is not known, but then it may be possible to express this time path in
a different way: as a difference or differential equation. It may be possible to derive a difference or
differential equation (describing the time path of a variable) from the equations in a mathematical
model. If there is more than one dependent variable in a model, there may be more than one
difference or differential equation.
Whether the use of difference or differential equations is appropriate depends on whether time is a
discrete or continuous variable. A variable is discrete if it can have only certain values. Discrete
time means that it is measured in periods and the variable $t$ can take on only integer values
(whole numbers). If the argument of a function is discrete, then the function can have values only at
the points where the value of the argument is determined. In the case of continuous time, time can
also have non-integer values that differ from each other by a very small amount. If time is a
continuous variable, time can have infinitely small changes and there is a corresponding value of
the variable $y$ for every value of time. The change of $y$ in a time unit is then described by the
derivative of the function $y=f(t)$ with respect to time: $\frac{dy}{dt}$. The time path of the variable $y$ can
then be expressed by a differential equation that includes the variable $y$ and its derivatives of
various orders with respect to time: $\frac{dy}{dt}=y'(t)$, $\frac{d^2y}{dt^2}=y''(t)$ etc. Often, the derivatives with
respect to time are denoted with dots above the variable: $\dot{y}$, $\ddot{y}$ etc. The highest-order derivative in the
equation determines the order of the equation. One of the simplest linear first-order differential
equations is:
$$
\frac{dy}{dt}+u(t)\,y=w(t),
$$
where $u$ and $w$ are, like $y$, functions of time. If $u$ and $w$ are constants (for example $a$ and $b$,
respectively), the equation is:
$$
\frac{dy}{dt}+ay=b.
$$
When solving a differential equation, the result is a function $y=f(t)$ that describes how the variable $y$
depends on time. The solution $y(t)$ of a differential equation is thus an expression where time is
the only variable (no derivatives or differentials).
In the case of discrete time the value of $y$ changes as $t$ changes to a new integer value (see Figure
8.1). Because of this, the graph of the function is discontinuous. In that case the function is not
differentiable and the use of derivatives and differentials is impossible. Instead, differences are
analysed and the time path of $y$ is described with the help of a difference equation. In the case of
discrete time, the values of $t$ are often interpreted not as time moments but as periods: the value
of the function remains the same during the whole period between two moments. The change of $y$ in
a time unit is now described as the difference between the values in two time periods:
$$
\Delta y_t=y_t-y_{t-1}.
$$

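[Figure: two panels plotting $y=f(t)$ against $t$; the first with a discrete argument ($t=0,1,\dots,5$), the second with a continuous argument]

Figure 8.1. Graph of a function in the case of discrete and continuous argument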
If we replace in the simple differential equation $\frac{dy}{dt}+cy=b$ the derivative $\frac{dy}{dt}$ (that describes the
change over time) with the difference $\Delta y_t=y_t-y_{t-1}$ and the value of $y$ with the initial value $y_{t-1}$,
we get an equation:
$$
\Delta y_t+c\,y_{t-1}=b\quad\text{or}\quad y_t-y_{t-1}+c\,y_{t-1}=b.
$$
After rearranging:
$$
y_t=(1-c)\,y_{t-1}+b.
$$
After denoting $1-c=a$, we get a simple linear first-order difference equation:
$$
y_t=a\,y_{t-1}+b.
$$
A difference equation includes the variable $y$ in time period $t$ ($y_t$) and in previous time periods
($y_{t-1}$, $y_{t-2}$ etc.). The greatest number of periods lagged in the equation determines the order of the
equation.
When solving a difference equation, the result is a function $y=f(t)$ as well. The solution $y_t$ of a
difference equation is an expression where time is the only variable (no values of $y$ in different
time periods).

8.2. Solving a difference equation


Next, solving the simple difference equation $y_t=a\,y_{t-1}+b$ is introduced. The solution $y_t$ of a
difference equation has to give a rule that enables finding the value of $y$ that corresponds to the
time period $t$.
The solution of a difference equation usually consists of two parts: a particular solution and the
complementary function. Both have an interpretation in economics. The particular solution is the
time-independent part of the solution that shows the equilibrium value of the variable $y$. The
complementary function is a function of time that represents the deviations from the equilibrium.
We can imagine that the function $y=f(t)$ is divided into two parts, a constant and a time-dependent
part, like this: $y=c+g(t)$. The constant then gives the equilibrium value and the time-dependent
part of the function describes the difference of $y$ in a particular period from the
equilibrium value.
An equilibrium can be understood as a stable state where there is no tendency to change. Hence, in
an equilibrium there is no change over time:
$$
y^*=y_t=y_{t-1}.
$$
After replacing in the difference equation all values of $y$ in different time periods with $y^*$, we get:
$$
y^*=a\,y^*+b,
$$
and hence the formula for finding the equilibrium value is:
$$
y^*=\frac{b}{1-a},\quad\text{where } a\neq 1.
$$
The case where $a=1$ will be discussed later.
It can be proved (see Appendix 3) that the complementary function in the case under consideration
here has the following form:
$$
(y_0-y^*)\,a^t,
$$
where $y_0$ is the initial value of $y$ (the solution $y_t$ of the difference equation describes the time path
of $y$ after taking on the initial value $y_0$). The general solution of the difference equation $y_t=a\,y_{t-1}+b$
can be found as a sum of the equilibrium value and the complementary function:
$$
y_t=y^*+(y_0-y^*)\,a^t.
$$
It has to be pointed out that if the constant $b$ is 0, then the equilibrium value is $y^*=\frac{0}{1-a}=0$ and the
general solution is equal to the complementary function $y_t=y_0\,a^t$. The derivation of the
complementary function that can be seen in Appendix 3 actually consists of solving the equation
$y_t=a\,y_{t-1}$.
Since it is known that $y^*=\frac{b}{1-a}$, the formula:
$$
y_t=\frac{b}{1-a}+\left(y_0-\frac{b}{1-a}\right)a^t
$$
can be used as well.
For example, given a difference equation $y_t=3y_{t-1}-8$, the equilibrium value can be found as:
$$
y^*=\frac{-8}{1-3}=4,
$$
and the general solution is:
$$
y_t=4+(y_0-4)\,3^t.
$$
If the initial value $y_0$ is known, a definite solution can be found. For example, if $y_0=2$, then:
$$
y_t=4+(2-4)\,3^t=4+(-2)\,3^t.
$$
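The definite solution can be checked against direct iteration of the difference equation. The following is a minimal sketch; the function names `solve_closed_form` and `iterate` are assumptions made for this illustration.

```python
from fractions import Fraction

def solve_closed_form(a, b, y0, t):
    """y_t = y* + (y0 - y*) a^t with y* = b / (1 - a); for a = 1, y_t = y0 + t*b."""
    if a == 1:
        return y0 + t * b  # special case: the formula for y* cannot be used
    y_star = Fraction(b) / (1 - a)
    return y_star + (y0 - y_star) * a ** t

def iterate(a, b, y0, t):
    """Apply y_t = a*y_{t-1} + b step by step, starting from y0."""
    y = y0
    for _ in range(t):
        y = a * y + b
    return y

# The example from the text: y_t = 3*y_{t-1} - 8 with y0 = 2
closed = [solve_closed_form(3, -8, 2, t) for t in range(5)]
stepwise = [iterate(3, -8, 2, t) for t in range(5)]
print(closed == stepwise)
print(stepwise)
```

Both approaches give the same values in every period. Iteration needs all previous periods, while the solution gives $y_t$ directly for any $t$.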

In the special case where $a=1$, the formula for the equilibrium value $y^*=\frac{b}{1-a}=\frac{b}{0}$ cannot be used.
The equation is then $y_t=y_{t-1}+b$ and it can be seen that in every period the constant $b$ is added to the
value of the previous period:
$$
y_1=y_0+b,\qquad y_2=y_1+b=y_0+2b,\qquad\dots,\qquad y_t=y_0+tb.
$$
Hence, if $a=1$, the general solution is $y_t=y_0+tb$.

8.3. Assessing the stability of equilibrium


With the help of difference equations, it is possible to assess the dynamic stability of
equilibria. An equilibrium value $y^*$ is found from the equilibrium conditions with certain
assumptions and at certain values of parameters. This equilibrium holds if the assumptions and
parameters remain the same, but if something changes, the system is not in the equilibrium anymore:
the dependent variable $y$ takes on a new value (the initial value in the context of a difference
equation) and $y$ starts to change (converge, diverge, ...) according to the rule $y=f(t)$ that is given
by the solution of the difference equation. For example, a crop failure can dramatically change the
price in the market and set off a process of adjusting to the new conditions.
It may happen that the value of the variable $y$ converges over time to the new equilibrium value that
can be found on the basis of the new assumptions and parameters. In this case we call it a stable
equilibrium. However, it may also happen that the equations in the system determine that the value
of $y$ diverges over time from the equilibrium value. This case is called an unstable equilibrium.
Examining the difference equation and its solution makes it possible to determine how the value of the
dependent variable behaves over time and whether the equilibrium is stable or not.
The general solution of the difference equation $y_t=a\,y_{t-1}+b$ is:
$$
y_t=y^*+(y_0-y^*)\,a^t.
$$
It can be seen that if the starting (initial) value $y_0$ is equal to the equilibrium value $y^*$, then the
complementary function is equal to 0 and $y$ stays on the equilibrium level $y^*$. If the initial value of
$y$ is for some reason different from the equilibrium value, then the time path of $y$ depends on the
complementary function $(y_0-y^*)\,a^t$. If the absolute value of the complementary function
increases over time, then the value of $y_t$ diverges from the equilibrium value $y^*$. If the absolute
value of the complementary function decreases over time, then the difference between $y_t$ and $y^*$
decreases and $y_t$ converges to the equilibrium value.
Hence, it has to be found what determines whether the absolute value of the complementary
function is increasing or decreasing over time. It appears that this is determined by the expression
$a^t$, because whether the initial value is larger or smaller than the equilibrium value (that is, whether
$(y_0-y^*)$ is positive or negative) only changes the sign of the complementary function, but not the
behaviour of its absolute value. In the case of a known initial value,
$(y_0-y^*)$ is a constant and does not depend on time. The larger the difference $(y_0-y^*)$, the more
time it takes to converge in the case of a decreasing absolute value of the complementary function,
but this does not change the conclusion about stability.


The behaviour of $a^t$ over time is determined by the parameter $a$ (that is actually the coefficient of
$y_{t-1}$ in the difference equation). Namely:

if $|a|>1$, then $|a^t|\to\infty$ as $t\to\infty$;
if $|a|=1$, then $|a^t|=1$ as $t\to\infty$;
if $0<|a|<1$, then $|a^t|\to 0$ as $t\to\infty$;
if $a=0$, then $a^t=0$ as $t\to\infty$.
If $a=0$, the difference equation takes the form $y_t=0\cdot y_{t-1}+b=b$: the value of $y_t$ is the
same (the constant $b$) in every period and it is not possible that the system leaves the equilibrium. Hence,
this case is left out of consideration.
If $a=1$, the general solution is $y_t=y_0+tb$ and the value of $y_t$ tends to positive or
negative infinity (depending on the sign of $b$). As converging of $y_t$ to an equilibrium
value is not possible in this case either, this case is also left out.
The value of the variable converges to its equilibrium value if the complementary function
approaches 0; for that, $a^t$ has to approach 0. This is possible if the absolute value of the
parameter $a$ remains between 0 and 1: $0<|a|<1$. In that case the process that follows the event
that brought $y$ out of the previous equilibrium converges to the equilibrium value and the
equilibrium is stable. If $a$ is positive, then $y$ takes values only above (if the initial value is larger than
the equilibrium value, see Figure 8.2) or below (in the opposite case) the equilibrium value. If $a$ is
negative, the value of $y$ oscillates above and below the equilibrium value when converging to it.
If the absolute value of $a$ is larger than 1 ($|a|>1$), then $|a^t|$ approaches infinity and thus the
absolute value of the complementary function increases infinitely as well. Hence, the variable $y$ diverges
from its equilibrium value and the equilibrium is unstable. Here, too, the process is oscillatory if $a$
is negative.
Concluding:
if $a>1$, then $|y_t|\to\infty$ as $t\to\infty$: it is a diverging process and the equilibrium is unstable;
if $0<a<1$, then $y_t\to y^*$ as $t\to\infty$: it is a converging process and the equilibrium is stable;
if $-1<a<0$, then $y_t\to y^*$ as $t\to\infty$: it is an oscillatory converging process and a stable
equilibrium;
if $a<-1$, then $|y_t|\to\infty$ as $t\to\infty$: it is an oscillatory diverging process and an unstable
equilibrium.

If $a=-1$, the sign of the complementary function alternates over time while its absolute value stays
constant, $\left|(y_0-y^*)(-1)^t\right|=\text{const}$, and
the value of $y$ oscillates between two values. If $t$ is an even number, then the value of $y$ is equal to
the initial value: $y_t=y^*+(y_0-y^*)\cdot 1=y_0$; if $t$ is an odd number, then the value of $y$ is:
$y_t=y^*+(y_0-y^*)(-1)=2y^*-y_0$. Hence, if an initial value is given to $y$ that is different from the
equilibrium value ($y_0\neq y^*$), $y$ never reaches the equilibrium value and hence, the equilibrium is
unstable.
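The cases listed above can be collected into a small helper. This is only a sketch: the function name `classify` and the wording of the returned labels (including the word "monotonic" for a path that stays on one side of the equilibrium) are assumptions, not terminology from the text.

```python
def classify(a):
    """Describe the time path of y_t = a*y_{t-1} + b by the coefficient a."""
    if a == 0:
        return "constant: y_t = b in every period"
    if a == 1:
        return "y_t = y_0 + t*b: no convergence to an equilibrium value"
    if a == -1:
        return "oscillates between two values, unstable"
    pattern = "oscillatory" if a < 0 else "monotonic"
    outcome = "converging, stable" if abs(a) < 1 else "diverging, unstable"
    return f"{pattern}, {outcome}"

for a in (3, 0.5, -0.5, -3, -1):
    print(a, "->", classify(a))
```

The sign of $a$ decides whether the path oscillates, and the absolute value of $a$ decides whether it converges. The initial value does not enter the classification.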
All this can be illustrated by Figure 8.2 that presents the graphs of the function y t = f (t ) for
different values of a.

The sign of the complementary function also depends on the sign of $(y_0-y^*)$, that is, whether the
initial value is larger or smaller than the equilibrium value. The graphs in Figure 8.2 correspond to
the case where $y_0>y^*$, that is, where $(y_0-y^*)$ is positive. If the initial value is smaller than the
equilibrium value ($y_0<y^*$), then the graphs of $y=f(t)$ for different values of $a$ are reflections of
those in Figure 8.2 over the horizontal line $y_t=y^*$. If $y_0=y^*$, the equilibrium holds and the
graph of $y=f(t)$ is a horizontal line: $y_t=y^*$.
When sketching graphs like these, the following method can be used. If a sum of two functions has to
be graphed, one can first graph both functions separately and then add one to the other in the
direction of the vertical axis (the values of the function and not of the argument are summed). One can
imagine that when sketching the second function, the graph of the first function is viewed as a
replacement for the horizontal axis. Here, the function $y_t=f(t)$ is a sum of the constant $y^*$ and the
complementary function $(y_0-y^*)\,a^t$. Hence, we can first sketch the horizontal line corresponding
to the constant ($y_t=y^*$) and then sketch the complementary function using the line $y_t=y^*$ as the
replacement for the horizontal axis. The latter is an exponential function multiplied by a constant.
As the time is discrete, the graph is discontinuous (the value of $y$ changes every period).

[Figure: five panels plotting $y_t$ against $t$ around the horizontal line $y^*$, one for each case: $a>1$, $a<-1$, $0<a<1$, $-1<a<0$ and $a=-1$]

Figure 8.2. Graphs describing different processes dependent on the value of parameter a
Last, it has to be pointed out that in the case of a converging process, the value of $y$ becomes closer
and closer to the equilibrium value, but never exactly reaches it. The
difference between the actual value $y_t$ and the equilibrium value $y^*$ decreases, though, and it depends
on the researcher when he or she decides to consider this difference equal to 0. For example, if the
admissible error is 0.01, then the value of $y$ is considered as the equilibrium value if the
absolute value of the difference is smaller than 0.01: $|y_t-y^*|<0.01$. The admissible difference can
also be expressed in the following way: $y_t=y^*\pm 0.01$.
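How many periods it takes to come within the admissible error can be found by iterating the difference equation. The sketch below is illustrative only: the function name `periods_to_converge`, the cap `max_periods` and the numeric values in the usage line are assumptions.

```python
def periods_to_converge(a, b, y0, eps=0.01, max_periods=10_000):
    """Return the first period t with |y_t - y*| < eps, or None if it is not reached."""
    if abs(a) >= 1:
        return None  # the process does not converge to an equilibrium value
    y_star = b / (1 - a)
    y = y0
    for t in range(max_periods + 1):
        if abs(y - y_star) < eps:
            return t
        y = a * y + b  # move to the next period
    return None

# Hypothetical example: y_t = 0.5*y_{t-1} + 2 with y0 = 10, so y* = 4
print(periods_to_converge(0.5, 2.0, 10.0))
```

In this hypothetical example the initial difference is 6 and it halves every period, so the difference first falls below 0.01 in period 10. A larger initial difference or a value of $|a|$ closer to 1 increases the number of periods but does not change the conclusion about stability.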

8.4. Cobweb model


One well-known example of using difference equations in economics is the cobweb model. This is a
market equilibrium model where time is brought in. It is assumed that the quantity demanded
depends on the price in the same period, but the quantity supplied depends on the price in previous
period:
qtS = q S ( pt 1 ) ,
qtD = q D ( pt ) .
This is the case for many agricultural products decision about the supplied quantity has to be
made much earlier than the product gets on the market. For example, potatoes are planted in the
spring, but harvested and sold in autumn. Hence the quantity (hopefully) sold is decided on the
basis of the price in previous period. Consumers have no such limitations, therefore, for them the
quantity demanded is related to the price of the same period (according to the demand function).
The equilibrium condition states that the quantity demanded has to be equal to the quantity
supplied:
$$q_t^D = q_t^S.$$
If we assume linear demand and supply functions, the model may be written as follows:
$$q_t^S = -a + bp_{t-1},$$
$$q_t^D = c - dp_t,$$
$$q_t^D = q_t^S,$$
where $a, b, c, d > 0$.
From the equilibrium condition we can derive the difference equation for price:
$$-a + bp_{t-1} = c - dp_t \quad \text{or:} \quad p_t = \frac{a+c}{d} - \frac{b}{d}p_{t-1}.$$
It has to be mentioned that the parameter a in this model has no connection to the coefficient a in
the general form of the difference equation. Here, the coefficient that determines the behaviour of price
is $-\frac{b}{d}$.
In order to find the equilibrium price, we can make a replacement: $p_t = p_{t-1} = p^*$. Then we can
also find $q^*$. On the other hand, in the case of equilibrium, neither price nor quantity changes and
there is no need for subscripts. Hence, to find the price and quantity in the equilibrium point, we can
solve the system:
$$q^S = -a + bp,$$
$$q^D = c - dp,$$
$$q^S = q^D.$$

The result from both methods is the same:
$$p^* = \frac{a+c}{b+d} \quad \text{and} \quad q^* = \frac{bc - ad}{b+d}.$$
Next, we can analyse whether the market equilibrium is stable: if some event causes the system to
leave the equilibrium, does the system itself reach the equilibrium again after some time? Figure
8.3 shows demand and supply curves of the market of a hypothetical agricultural product. The
intersection point E shows the equilibrium ($p^*$ and $q^*$).

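[Figure: demand curve D and supply curve S in the (q, p) plane with equilibrium E at ($q^*$, $p^*$); the adjustment path runs through points A, B, C and F, with quantities $q_0$, $q_2$, $q^*$, $q_1$ marked on the horizontal axis and prices $p_0$, $p^*$, $p_1$ marked on the vertical axis.]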

Figure 8.3. Adjustments in the cobweb model in the case of stable equilibrium
Let us assume an event that rules out holding the present equilibrium: for example, there is a lot of
rain in year t = 0 and because of that it is not possible to supply the equilibrium quantity, so
the supplied quantity $q_0$ is smaller. Now the equilibrium price does not hold and the new price in
that period ($p_0$) is probably higher than the equilibrium price. If the quantity in the market is
known, then the price at which this quantity is sold can be found from the demand curve: moving
up from the quantity $q_0$ to the demand curve (point A) and then to the left leads us to the price axis,
where we read the new price $p_0$. Next spring, the producer decides the quantity for the next
year, $q_1$, on the basis of the price $p_0$ according to the supply function. Moving to the right to the
supply curve (point B) and then down leads us to the quantity axis, where we read the
quantity $q_1$. As this quantity is relatively large, consumers are less willing to pay. Moving up from $q_1$ to the
demand curve (point C) and then to the left gives us the price $p_1$. Then the producer decides $q_2$
on the basis of $p_1$ (point F on the supply curve) and so on. The name of the model comes from the
fact that the graphical representation of these adjustments is similar to a cobweb.
It has to be mentioned that if, differently from the example here, price is depicted on the horizontal
and quantity on the vertical axis, then the adjustment path consisting of arrows is going not
clockwise, but counter-clockwise.
Whether the adjustment path leads to the equilibrium point or moves away from it depends on the
slopes of the demand and supply curve. If the supply curve is steeper than the demand curve, as on
Figure 8.3, market reaches the equilibrium after some time, but if the demand curve is steeper than
the supply curve, then the process is diverging from the equilibrium and the path leads away from
the equilibrium point (Figure 8.4, left). If the absolute values of the slopes are equal (Figure 8.4
right), then price and quantity will oscillate between two values (one larger and the other smaller
than the equilibrium value).

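[Figure: two panels with demand curve D and supply curve S in the (q, p) plane intersecting at E; left: the adjustment path leads away from E; right: the path cycles between two values around E.]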

Figure 8.4. Unstable equilibrium


Hence, the market equilibrium is stable (converging process), if the supply curve is steeper than the
demand curve. As the inverse functions are depicted on the figures, the slope of the inverse supply
function has to be larger than the absolute value of the slope of the inverse demand function. If
demand and supply (and not inverse) functions are used, then the absolute value of the slope of the
demand function has to be larger than the slope of the supply function:
$$|-d| > b \quad \text{or} \quad d > b.$$

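The time path of price is described by the equation: $p_t = \frac{a+c}{d} - \frac{b}{d}p_{t-1}$. The solution of that is:
$$p_t = p^* + (p_0 - p^*)\left(-\frac{b}{d}\right)^t$$
or
$$p_t = \frac{c+a}{d+b} + \left(p_0 - \frac{c+a}{d+b}\right)\left(-\frac{b}{d}\right)^t.$$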
The first term is the equilibrium price and the second term describes the behaviour of price after
taking on the initial value $p_0$. For the price to converge to the equilibrium value over time, the
absolute value of the coefficient of $p_{t-1}$ in the difference equation has to be between 0 and 1:
$$0 < \left|-\frac{b}{d}\right| < 1 \quad \text{or} \quad d > b, \text{ as found before.}$$
In that case, the equilibrium is stable. If, however, $\left|-\frac{b}{d}\right| > 1$, the process is diverging and if $\left|-\frac{b}{d}\right| = 1$,
the process is neither converging nor diverging. In both cases the equilibrium is unstable. As the
coefficient $-\frac{b}{d}$ is always negative ($b, d > 0$), the process of adjustments is always oscillatory
in the cobweb model.
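
The relationship between the difference equation, its solution and the stability condition can be checked numerically. The sketch below is only an illustration: the function names (`cobweb_path`, `closed_form`, `classify`) and the parameter values are assumptions chosen for this example, not part of the original model description.

```python
def cobweb_path(p0, a, b, c, d, periods):
    """Iterate p_t = (a + c)/d - (b/d) * p_{t-1}, starting from p0."""
    path = [p0]
    for _ in range(periods):
        path.append((a + c) / d - (b / d) * path[-1])
    return path


def closed_form(p0, a, b, c, d, t):
    """Solution p_t = p* + (p0 - p*) * (-b/d)**t of the same equation."""
    p_star = (a + c) / (b + d)
    return p_star + (p0 - p_star) * (-b / d) ** t


def classify(b, d):
    """Stability of the cobweb equilibrium from the slopes b and d (both > 0)."""
    if b < d:
        return "converging, stable"
    if b > d:
        return "diverging, unstable"
    return "constant oscillation, unstable"


# Hypothetical parameters: q_S = -2 + 1*p_{t-1}, q_D = 10 - 2*p_t, so p* = 4.
params = dict(a=2, b=1, c=10, d=2)
path = cobweb_path(6.0, periods=6, **params)
print(classify(params["b"], params["d"]))
print(path)  # prices alternate above and below p* = 4
```

Iterating the equation period by period and evaluating the solution formula give the same prices, which is a quick way to check a hand-derived solution.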
For the previously used example, the demand and supply functions are $q_t^D = 24 - 3p_t$ and
$q_t^S = -4 + 4p_{t-1}$. From the condition $q_t^D = q_t^S$ the difference equation for price is:
$$24 - 3p_t = -4 + 4p_{t-1} \quad \text{or} \quad p_t = \frac{28}{3} - \frac{4}{3}p_{t-1}.$$

The equilibrium price is:
$$p^* = \frac{\frac{28}{3}}{1 + \frac{4}{3}} = 4$$
(the equilibrium price would be the same if time were not included in the model).
The general solution describing the time path of price is:
$$p_t = 4 + (p_0 - 4)\left(-\frac{4}{3}\right)^t.$$
When checking the stability, it appears that $\left|-\frac{b}{d}\right| = \frac{4}{3} > 1$, hence, it is a diverging process. For
example, if the price suddenly falls to $p_0 = 3$, then the prices in the next periods can be found
according to the formula:
$$p_t = 4 + (3 - 4)\left(-\frac{4}{3}\right)^t = 4 + (-1)\left(-\frac{4}{3}\right)^t.$$
For example, the price in the third period is:
$$p_3 = 4 + (-1)\left(-\frac{4}{3}\right)^3 \approx 6,37.$$
The time path of price is shown graphically on Figure 8.5.

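[Figure: time path of price p over the periods t = 0, 1, 2, 3, oscillating with growing amplitude around the equilibrium price 4, starting from $p_0 = 3$.]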

Figure 8.5. Time path of price
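
The first few prices of this example can be reproduced exactly. The following sketch uses Python's `fractions` module to iterate the difference equation from $p_0 = 3$ and to compare the result with the solution formula; the variable names are chosen for illustration only.

```python
from fractions import Fraction

# Iterate p_t = 28/3 - (4/3) * p_{t-1} from the initial price p_0 = 3.
p = Fraction(3)
path = [p]
for _ in range(3):
    p = Fraction(28, 3) - Fraction(4, 3) * p
    path.append(p)

# Solution formula p_t = 4 + (3 - 4) * (-4/3)**t for t = 0, ..., 3.
closed = [4 + (3 - 4) * Fraction(-4, 3) ** t for t in range(4)]

print(path == closed)
print([round(float(x), 2) for x in path])
```

The prices lie alternately below and above the equilibrium price 4 and move further away from it each period, as expected for a diverging oscillatory process.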

8.5. Phase diagrams


We have discussed solving a simple linear difference equation $y_t = ay_{t-1} + b$. If the difference
equation has a more complex form, solving it can be troublesome or even impossible. In that case
the dependence of variable y on time, $y_t = f(t)$, cannot be expressed explicitly. In order to get an overview
of the time path of a variable in those cases, a qualitative analysis, more precisely, drawing phase
diagrams is used.
When drawing a phase diagram, the difference equation that expresses the relationship between $y_t$
and $y_{t-1}$ is used as a basis. This relationship is depicted on a phase diagram as a phase line. Usually
$y_t$ is placed on the vertical and $y_{t-1}$ on the horizontal axis, hence the difference equation is
transformed to the form $y_t = f(y_{t-1})$. The graph of this function forms a phase line (see Figure 8.6).
Mostly only the quadrant where both variables have positive values is considered.
In the $y_{t-1}y_t$-plane, the value of the variable y is equal in consecutive periods ($y_t = y_{t-1}$) on
the 45° line. Hence, the value of y at which the phase line intersects the 45° line is the equilibrium
value. The points where the phase line intersects or touches the 45° line are the equilibrium
points. Whether the equilibrium is stable or not can be found out by drawing the time path of y on the
phase diagram. That can be explained with the help of Figure 8.6, where one hypothetical phase line
is shown. There is an equilibrium point at the intersection (E) of the phase line and the 45°
line: if the variable takes on the value $y^*$, then this value will hold until the
conditions change.

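[Figure: phase diagram in the ($y_{t-1}$, $y_t$) plane showing the phase line $y_t = f(y_{t-1})$, the 45° line, their intersection E at $y^*$, point A on the phase line above $y_0$, and the values $y_0$, $y_1$, $y_2$ approaching $y^*$.]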

Figure 8.6. Example of a phase diagram


If the shape of the phase line is known (for example as presented on Figure 8.6), then we can
choose an initial value $y_0$ and mark it on the $y_{t-1}$-axis. The value in the next period, $y_1$, can be
found with the help of the phase line, as it describes the relationship $y_t = f(y_{t-1})$. We can move up
from $y_0$ to the phase line (point A). The value on the $y_t$-axis that corresponds to the point A is
the value of $y_1$. The next value $y_2$ can be found analogously, but first we have to mark the value
$y_1$ on the horizontal axis. For that we reflect $y_1$ on the vertical axis over the 45° line to the
horizontal axis. In other words: we move from point A to the right to the 45° line and then down
to the horizontal axis. Then we can read the value $y_2$ with the help of the phase line again and so
on. This procedure is described by a phase path (a path that describes the time path of y on a phase
diagram). In the case depicted on Figure 8.6 the phase path approaches the equilibrium point E,
hence, the equilibrium is stable.
Stability of the equilibrium, that is, whether the value of y converges to or diverges from the
equilibrium over time, depends on the function $y_t = f(y_{t-1})$ and the shape of the phase line that
describes this function. Figure 8.7 shows four different shapes of phase line around the equilibrium
point.
It can be seen from Figure 8.7 that if we give a small change to the variable y, then a converging
process follows (the equilibrium is stable) if the phase line is flatter than the 45° line around the
equilibrium point. In the opposite case the process is diverging. It has to be mentioned that the
conditions must hold at least in the neighbourhood that covers the change that is given to the
variable.

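[Figure: four panels in the ($y_{t-1}$, $y_t$) plane, each showing a differently shaped phase line around its intersection with the 45° line.]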

Figure 8.7. Examples of possible phase lines


For the phase line to be flatter than the 45° line, the absolute value of the slope of the phase
line has to be smaller than 1 (as the slope of the 45° line is 1). Hence, the absolute value of the
derivative of the function $y_t = f(y_{t-1})$ has to be smaller than 1: $\left|\frac{dy_t}{dy_{t-1}}(y^*)\right| < 1$. In that case the
equilibrium is stable and a converging process takes place. In the opposite case, if $\left|\frac{dy_t}{dy_{t-1}}(y^*)\right| > 1$, a
diverging process takes place and the equilibrium is unstable.
We can also see that if the phase line is negatively sloping around the equilibrium point, then the
value of y oscillates above and below the equilibrium value. In the case of the positively sloping
phase line the value of y converges to the equilibrium value from only one side or diverges from the
equilibrium value to only one side. This can be expressed mathematically in the following way. The
phase line is negatively sloping if the derivative of the function $y_t = f(y_{t-1})$
is negative in the equilibrium point: $\frac{dy_t}{dy_{t-1}}(y^*) < 0$. In that case the process that takes place is
oscillatory.
Taking all together, the qualitative analysis of a difference equation can be conducted in the following
way. First (if possible) the equilibrium values are found. For that it is assumed that $y_t = y_{t-1} = y^*$.
At these values of y the phase line intersects or touches the 45° line. Knowing the equilibrium
points and the shape of the function $y_t = f(y_{t-1})$, the phase line is sketched. By drawing a phase path
or using the conditions shown before, conclusions can be made about the time path of y
(whether the process is converging to or diverging from the equilibrium).
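
The slope conditions above can be sketched as a small numerical check. In the following Python example the derivative of the phase line at the equilibrium is approximated by a central difference; the function name `classify_equilibrium`, the step size and the second phase line are assumptions made for illustration.

```python
def classify_equilibrium(f, y_star, h=1e-6, tol=1e-6):
    """Classify an equilibrium of y_t = f(y_{t-1}) from the slope of the phase line."""
    # Central-difference approximation of dy_t/dy_{t-1} at y*.
    slope = (f(y_star + h) - f(y_star - h)) / (2 * h)
    if abs(abs(slope) - 1) < tol:
        stability = "borderline"      # |slope| = 1: neither converging nor diverging
    elif abs(slope) < 1:
        stability = "stable"
    else:
        stability = "unstable"
    pattern = "oscillatory" if slope < 0 else "one-sided"
    return slope, stability, pattern


# Phase line of the cobweb example: p_t = 28/3 - (4/3) p_{t-1}, equilibrium p* = 4.
def cobweb(p):
    return 28 / 3 - 4 / 3 * p


print(classify_equilibrium(cobweb, 4.0))

# A hypothetical flatter phase line: y_t = 2 + 0.5 y_{t-1}, equilibrium y* = 4.
print(classify_equilibrium(lambda y: 2 + 0.5 * y, 4.0))
```

The check is local: it only describes the behaviour in a neighbourhood of the equilibrium point, in line with the remark about the neighbourhood made earlier.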


If finding the equilibrium values is not possible, then the intersection point of the phase line and the
45° line is not known. Despite that, analysing the shape of the function $y_t = f(y_{t-1})$ makes it possible to draw
some conclusions about the stability.
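For example, the phase diagram that corresponds to the difference equation for price from the
previous chapter, $p_t = \frac{a+c}{d} - \frac{b}{d}p_{t-1}$, in the case of a stable equilibrium (since this requires
$0 < \left|-\frac{b}{d}\right| < 1$, the phase line is flatter than the 45° line) is depicted on Figure A8.1:
[Figure: phase diagram in the ($p_{t-1}$, $p_t$) plane showing the negatively sloping phase line $p_t = \frac{a+c}{d} - \frac{b}{d}p_{t-1}$ with vertical intercept $\frac{a+c}{d}$, the 45° line, the equilibrium price $p^*$ and an initial price $p_0$.]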

Figure A8.1. The phase diagram of the market equilibrium model in the case of stable equilibrium
Let us look at the example of a difference equation for price from the previous chapter:
$p_t = \frac{28}{3} - \frac{4}{3}p_{t-1}$. The equilibrium price has already been found: $p^* = 4$; at that price the phase line
intersects the 45° line. The phase line of this linear difference equation is a straight line with the
slope $-\frac{4}{3}$. A negative slope refers to a negatively sloping line and as $\left|-\frac{4}{3}\right| > 1$, the phase line
is steeper than the 45° line. Hence, we can draw a phase line as shown on Figure 8.8.
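[Figure: phase diagram in the ($p_{t-1}$, $p_t$) plane showing the phase line $p_t = \frac{28}{3} - \frac{4}{3}p_{t-1}$ with vertical intercept $\frac{28}{3}$, the equilibrium price 4 and an initial price $p_0$.]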

Figure 8.8. Phase diagram for the time path of price


As a more complex example, let us look at the difference equation: $y_t = y_{t-1}^{0,5}$. We can find the
equilibrium value in the following way:
$$y^* = (y^*)^{0,5},$$
$$y^* - (y^*)^{0,5} = 0,$$
$$\sqrt{y^*}\sqrt{y^*} - \sqrt{y^*} = \sqrt{y^*}\left(\sqrt{y^*} - 1\right) = 0.$$
This gives us two equilibrium values: $y_1^* = 0$ and $y_2^* = 1$.
In order to examine, for example, the stability of $y_2^* = 1$, we can find the derivative of the function
$y_t = y_{t-1}^{0,5}$:
$$\frac{dy_t}{dy_{t-1}} = 0,5\,y_{t-1}^{-0,5},$$
and find its value in the equilibrium point:
$$\frac{dy_t}{dy_{t-1}}(1) = 0,5 \cdot 1^{-0,5} = 0,5.$$
As the derivative is positive and its absolute value is smaller than 1, the phase line is
positively sloping and flatter than the 45° line around the equilibrium value $y_2^* = 1$. In this case
the process is analogous to that depicted on Figure 8.6. The value of y converges to its equilibrium
value from one side and the equilibrium is stable.
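
The one-sided convergence to $y_2^* = 1$ can be seen by iterating the equation directly. In the sketch below the starting values 0,25 and 4 and the function name `phase_path` are assumptions chosen for illustration.

```python
def phase_path(y0, periods):
    """Iterate y_t = y_{t-1} ** 0.5 and return [y_0, ..., y_periods]."""
    path = [y0]
    for _ in range(periods):
        path.append(path[-1] ** 0.5)
    return path


from_below = phase_path(0.25, 20)   # starts below y* = 1
from_above = phase_path(4.0, 20)    # starts above y* = 1

print(from_below[:4])
print(from_above[:4])
print(abs(from_below[-1] - 1) < 1e-5, abs(from_above[-1] - 1) < 1e-5)
```

Both paths stay on their own side of the equilibrium value while approaching it, which corresponds to a positively sloping phase line that is flatter than the 45° line.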
Phase diagrams allow us to analyse more complex difference equations, but at the same time they
can add an additional view to the solutions of quite simple equations, such as the equations that
describe the change of the value of a single amount of money and the annuity over time:
$y_t = (1 + r)y_{t-1}$ and $y_t = (1 + r)y_{t-1} + B$ (see Chapter 10.3), respectively. The graphs of these
functions are both positively sloping straight lines steeper than the 45° line ($1 + r > 1$), but they
have different vertical intercepts (see Figure A8.2).

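[Figure: phase diagram in the ($y_{t-1}$, $y_t$) plane showing the 45° line and the two phase lines $y_t = (1 + r)y_{t-1}$ and $y_t = B + (1 + r)y_{t-1}$, with the initial values $y_0 = 0$ and $y_0 = A$ marked on the horizontal axis.]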

Figure A8.2. Phase diagram for a single amount of money and for the annuity
Although financial mathematics does not use terms like equilibrium or stability, this phase
diagram illustrates that the value of money increases at an accelerating rate over time.
The dynamic analysis is mostly used in macroeconomics. The number of the difference or
differential equations included in the model depends on how many variables are included in the
model with lags. As an example of a simple model with one difference equation, the national
income model can be used, which is also known as the income-expenditure model. Let us assume that
the national income (Y) is defined to be equal to the total expenditures (E), which are divided between
consumption C and investments I (government and foreign sector are left out here). Consumption
is assumed to consist of the autonomous consumption AC and a certain portion (expressed by a
parameter c) of the income in the previous period. Taking all together:
$$Y_t = E_t,$$
$$E_t = C_t + I,$$
$$C_t = AC + cY_{t-1}.$$
When substituting the equations into each other we get:
$$Y_t = AC + I + cY_{t-1}$$
and the solution is
$$Y_t = Y^* + (Y_0 - Y^*)c^t, \quad \text{where } Y^* = \frac{AC + I}{1 - c}.$$
Assuming that $0 < c < 1$ (c is the marginal propensity to consume), it can be concluded that the income
Y always converges to its equilibrium value.
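
A short simulation shows the convergence of income in this model. The parameter values below (AC = 50, I = 30, c = 0,75, $Y_0$ = 200) and the function names are hypothetical and serve only to illustrate the formulas above.

```python
def income_path(Y0, AC, I, c, periods):
    """Iterate Y_t = AC + I + c * Y_{t-1}, starting from Y0."""
    path = [Y0]
    for _ in range(periods):
        path.append(AC + I + c * path[-1])
    return path


def income_closed_form(Y0, AC, I, c, t):
    """Solution Y_t = Y* + (Y0 - Y*) * c**t with Y* = (AC + I) / (1 - c)."""
    Y_star = (AC + I) / (1 - c)
    return Y_star + (Y0 - Y_star) * c ** t


# Hypothetical values; the equilibrium income is (50 + 30) / (1 - 0.75) = 320.
AC, I, c, Y0 = 50.0, 30.0, 0.75, 200.0
incomes = income_path(Y0, AC, I, c, 100)
print(incomes[:4])
print(incomes[-1])
```

Since c is positive and smaller than 1, income approaches its equilibrium value from one side without oscillating.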
The corresponding phase diagram (see Figure A8.2) is similar to the figure that is known as the
Keynesian cross (describing the relationship between the aggregate demand and the aggregate
supply). The phase line ($Y_t = AC + I + cY_{t-1}$) is a positively sloping straight line with a positive
vertical intercept. (Since $Y_t = E_t$, the vertical axis describes not only the income in the period t, but
also the expenditures in time period t.)

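[Figure: phase diagram with $Y_{t-1}$ on the horizontal axis and $Y_t = E_t$ on the vertical axis, showing the 45° line and the phase line $Y_t = AC + I + cY_{t-1}$ with vertical intercept $AC + I$.]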

Figure A8.2. The phase diagram of the national income model


It can be seen from the phase diagram as well that the equilibrium is stable and the national income
in this model always converges to its equilibrium value.
When, for example, two variables are included with lags into the model, then the model includes
two difference equations forming an equation system. In that case the principles of constructing
phase diagrams are similar to those described in the case of continuous time (Chapter 9.3). In the
case of equations $\Delta x_t = f(x_t, y_t)$ and $\Delta y_t = g(x_t, y_t)$, the demarcation curves are drawn on the
$x_t y_t$-plane and the equations of the demarcation curves are derived from the conditions $\Delta x_t = 0$
and $\Delta y_t = 0$ (in the form: $y_t = f(x_t)$). When $\frac{\partial \Delta y_t}{\partial y_t} > 0$, then when moving in the direction of the
increase in y (moving up when $y_t$ is depicted on the vertical axis) $\Delta y_t$ should be increasing and
hence, before the line $\Delta y_t = 0$, $\Delta y_t$ is negative (y decreases over time and the direction of the phase
path is opposite to the y-axis (down)) and after the line $\Delta y_t = 0$, $\Delta y_t$ becomes positive and the phase
path goes in the same direction as the y-axis (up). Analogous reasoning applies for the case when
$\frac{\partial \Delta y_t}{\partial y_t} < 0$ and for the changes around the $\Delta x_t = 0$ curve. The demarcation curves divide the $x_t y_t$-plane
into four segments. In order to describe the direction of movements of these two variables in
these segments, similarly to the case of continuous time, small arrows describing the direction of
movements are marked. On the basis of these arrows the phase paths can be sketched starting
from different initial points in different segments. The phase path may converge to the equilibrium
point, but can also diverge from it.

9. DIFFERENTIAL EQUATIONS

9.1. Solving a differential equation


In order to give an overview of differential equations and a possibility to compare them with difference
equations, solving a simple linear first-order differential equation $\frac{dy}{dt} + ay = b$ is introduced next.
Like the solution of a difference equation, the solution of a differential equation $y(t)$ also gives a rule
for finding the value of variable y at time point t. Analogously, the solution of a differential
equation consists of two parts: the equilibrium value of y and the complementary function that
represents the deviations from the equilibrium.
We know that in the case of equilibrium the value of y does not change over time. Hence, if the
variable y takes on its equilibrium value, the derivative of the function $y = f(t)$ should be 0:
$$\frac{dy}{dt} = 0.$$
After substituting into the differential equation we get:
$$0 + ay^* = b,$$
and the equilibrium value:
$$y^* = \frac{b}{a}, \quad \text{where } a \neq 0.$$
Finding a solution if $a = 0$ is discussed later.
It can be proved (see Appendix 4) that the complementary function for this differential equation is:
$$[y(0) - y^*]e^{-at},$$
where $y(0)$ stands for the initial value given to the variable y. Hence, the general solution of the
differential equation $\frac{dy}{dt} + ay = b$ is:
$$y(t) = y^* + [y(0) - y^*]e^{-at}.$$
As it is known that $y^* = \frac{b}{a}$, it can also be written as:
$$y(t) = \frac{b}{a} + \left[y(0) - \frac{b}{a}\right]e^{-at}.$$
For example, given the simple differential equation $\frac{dy}{dt} - 3y = 12$, the equilibrium value is:
$$y^* = \frac{12}{-3} = -4$$
and the general solution can be written as:
$$y(t) = -4 + [y(0) + 4]e^{3t}.$$
If a particular initial value $y(0)$ is known, a definite solution can be found. For example, if
$y(0) = 3$, then
$$y(t) = -4 + [3 + 4]e^{3t} = -4 + 7e^{3t}.$$
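
Whether a definite solution really satisfies the differential equation can be checked numerically. The following sketch evaluates the general solution formula for the example above ($a = -3$, $b = 12$, $y(0) = 3$) and approximates the derivative by a central difference; the function names and the step size h are assumptions made for this illustration.

```python
import math


def solution(t, y0, a, b):
    """General solution y(t) = y* + [y(0) - y*] * exp(-a*t) of dy/dt + a*y = b, a != 0."""
    y_star = b / a
    return y_star + (y0 - y_star) * math.exp(-a * t)


def residual(t, y0, a, b, h=1e-5):
    """dy/dt + a*y - b at time t, with dy/dt approximated by a central difference."""
    dydt = (solution(t + h, y0, a, b) - solution(t - h, y0, a, b)) / (2 * h)
    return dydt + a * solution(t, y0, a, b) - b


# The example from the text: dy/dt - 3y = 12, so a = -3, b = 12, with y(0) = 3.
for t in (0.0, 0.5, 1.0):
    print(t, solution(t, 3.0, -3.0, 12.0), residual(t, 3.0, -3.0, 12.0))
```

The residuals are close to zero (up to the approximation error of the numerical derivative), which indicates that the solution fits the equation and the initial value.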

DIFFERENTIAL EQUATIONS Anneli Kaasa

In the special case where $a = 0$, the equation $\frac{dy}{dt} + ay = b$ can be rewritten as $\frac{dy}{dt} = b$. To solve this
equation we can integrate both sides with respect to time:
$$\int \frac{dy}{dt}\,dt = \int b\,dt.$$
After cancelling the differentials dt on the left side, an integral of 1 with respect to y remains. The
constants of integration can both be brought to one side and denoted as C. Hence:
$$y(t) = bt + C.$$
At time $t = 0$ the value of y is $y(0) = b \cdot 0 + C = C$, so the constant of integration C can
be viewed as the initial value of y. Hence, if $a = 0$, then the general solution is:
$$y(t) = bt + y(0).$$

9.2. Assessing the stability of equilibrium


Similarly to the solution of a difference equation, the solution of a differential equation gives
information about the time path: whether the value of the variable y converges to its equilibrium
value over time or diverges from it. When analysing the formula of the general solution
$$y(t) = y^* + [y(0) - y^*]e^{-at},$$
it can be seen that if the starting (initial) value $y(0)$ is equal to the equilibrium value $y^*$, then the
complementary function is equal to 0 and y stays on the equilibrium level $y(t) = y^*$. If the initial
value of y is for some reason different from the equilibrium value, then the time path of y depends
again on the complementary function $[y(0) - y^*]e^{-at}$. As the number e to any power is always
positive, the sign of the complementary function depends on whether the initial value is larger
or smaller than the equilibrium value. If the initial value is larger than the equilibrium value, then
the complementary function is positive and the variable y takes on values larger than the
equilibrium value. In the opposite case the complementary function is negative and y takes on
values smaller than the equilibrium value.
The sign of the parameter a in turn determines whether the variable y converges to the equilibrium
value or diverges from it. Namely, the sign of the parameter a determines the behaviour of the
expression $e^{-at}$ over time ($t \to \infty$) as follows:
if $a > 0$, then $e^{-at} \to 0$, as $t \to \infty$;
if $a < 0$, then $e^{-at} \to \infty$, as $t \to \infty$.
Let us analyse the conditions for stability. The value of y converges over time to its equilibrium
value if the absolute value of the complementary function decreases, that is, if $e^{-at}$ decreases. For
that (as can be seen from the conditions above), the parameter a in the equation $\frac{dy}{dt} + ay = b$ has to
be positive. In that case the process that takes place is converging and the equilibrium is stable. If
the initial value is larger than the equilibrium value ($y(0) - y^* > 0$, the complementary function is
positive), then the value of y is decreasing while converging to the equilibrium. If the initial value is
smaller than the equilibrium value, then y is increasing while converging. Again, it has to be kept in
mind that although converging, the value of y never exactly reaches the equilibrium.
The value of y diverges from its equilibrium value if the absolute value of the complementary
function increases over time. For that, the parameter a has to be negative. Then the process can be
described as diverging and the equilibrium as unstable. If the initial value is larger than the
equilibrium value, then the complementary function is positive and the value of y diverges from the
equilibrium value in the direction of positive infinity ($y_t \to +\infty$). In the opposite case, the
complementary function is negative and y approaches negative infinity ($y_t \to -\infty$).
Concluding:
if $a > 0$, then $y_t \to y^*$, as $t \to \infty$, and it is a converging process and the equilibrium is stable;

if $a < 0$, then $y_t \to \pm\infty$, as $t \to \infty$, and it is a diverging process and the equilibrium is unstable.

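[Figure: two panels showing the time path of y(t) relative to $y^*$ over time t: converging for $a > 0$ and diverging for $a < 0$.]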

Figure 9.1. Time path of a variable for different values of parameter a


Figure 9.1 shows graphs that describe the time path of y if the initial value is larger than the
equilibrium value. In the opposite case the graphs are reflections of those on Figure 9.1 over the
horizontal line $y(t) = y^*$. As time is continuous here, the graphs are continuous as well.
As an example, let the demand and supply functions be:
$$q^S = -a + bp,$$
$$q^D = c - dp,$$
where $a, b, c, d > 0$.
Assume that the change of price over time is linearly dependent on the excess demand (demand
minus supply):
$$\frac{dp}{dt} = k\left(q^D - q^S\right),$$
where k is a parameter with no limitations.
In the case of market equilibrium, the quantity demanded is equal to the quantity supplied
($q^D = q^S$). The excess demand is then 0 and price does not change over time ($\frac{dp}{dt} = 0$). As
$q^D = q^S$, we can write $-a + bp = c - dp$ and find the equilibrium price $p^* = \frac{a+c}{b+d}$. This
price holds until some conditions change. If price gets a new value (the initial value in the context of
the differential equation), the process of adjustments starts.

The differential equation that describes this process can be derived from the three equations:
$$q^S = -a + bp,$$
$$q^D = c - dp,$$
$$\frac{dp}{dt} = k\left(q^D - q^S\right),$$
where $a, b, c, d > 0$.
When replacing the demand and supply in the last equation with the expressions from the first two
equations, we get:
$$\frac{dp}{dt} = k(c - dp + a - bp).$$
Regrouping gives us a more familiar form:
$$\frac{dp}{dt} + k(b + d)p = k(a + c).$$
The equilibrium value can be found using the formula or by replacing $\frac{dp}{dt} = 0$ and $p = p^*$. The
result is the same as found before:
$$p^* = \frac{k(a + c)}{k(b + d)} = \frac{a + c}{b + d}.$$
The general solution then is:
$$p(t) = p^* + [p(0) - p^*]e^{-k(b + d)t}.$$
Next, the question arises about the values of parameters that are needed for the price to converge to
the equilibrium price. As we know already, for the equilibrium to be stable, the parameter a in
the equation $\frac{dy}{dt} + ay = b$ has to be positive. Hence, in our example, the stability condition is
$k(b + d) > 0$: in that case the absolute value of the complementary function decreases
($e^{-k(b + d)t} \to 0$). Since $b, d > 0$, the process that takes place is converging and the equilibrium is
stable if the parameter k is positive.
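
The role of the sign of k can be illustrated numerically. The sketch below compares the general solution with a simple step-by-step (explicit Euler) integration of $\frac{dp}{dt} = k(q^D - q^S)$; the parameter values, the initial price 5, the step size and the function names are hypothetical choices for this illustration.

```python
import math


def price_exact(t, p0, a, b, c, d, k):
    """General solution p(t) = p* + [p(0) - p*] * exp(-k*(b + d)*t)."""
    p_star = (a + c) / (b + d)
    return p_star + (p0 - p_star) * math.exp(-k * (b + d) * t)


def price_euler(t_end, p0, a, b, c, d, k, dt=0.001):
    """Integrate dp/dt = k * (q_D - q_S) with explicit Euler steps."""
    p = p0
    for _ in range(round(t_end / dt)):
        excess_demand = (c - d * p) - (-a + b * p)   # q_D - q_S
        p += dt * k * excess_demand
    return p


# Hypothetical parameters: q_S = -4 + 4p, q_D = 24 - 3p, so p* = 4.
params = dict(a=4, b=4, c=24, d=3)
for k in (0.5, -0.5):
    print(k, price_exact(2.0, 5.0, k=k, **params), price_euler(2.0, 5.0, k=k, **params))
```

With the positive k the price returns close to the equilibrium price 4, while with the negative k it moves further and further away from it.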

Based on this, we can make the following conclusions. From the equation $\frac{dp}{dt} = k\left(q^D - q^S\right)$ we can see
that a positive value of k ($k > 0$) means that in the case of a positive excess demand ($q^D - q^S > 0$),
also known as a deficit, price increases over time ($\frac{dp}{dt} > 0$). Hence, the market stabilizes itself if
price starts to increase because of the deficit. As the price increases, the quantity demanded
decreases and the quantity supplied increases: the difference between the demand and supply
starts to decrease until at the equilibrium price they are equal (see Figure 9.2).

[Figure: demand curve D and supply curve S in the (q, p) plane with equilibrium ($q^*$, $p^*$); at the price $p_0$ below $p^*$ there is excess demand.]

Figure 9.2. Possibilities for stabilizing the market when price increases

9.3. Phase diagrams


Analogously to the difference equations, in the case of differential equations the time path of a
variable can be analysed qualitatively, with the help of phase diagrams. Here, on the vertical axis
is the derivative of the function $y = f(t)$ with respect to time, $\frac{dy}{dt}$, and on the horizontal axis the
variable y itself (see Figure 9.3). The differential equation that describes the relationship between
these two is transformed to the form $\frac{dy}{dt} = f(y)$ and sketched as a phase line. Mostly, only positive
values of y are considered, but the changes of y can be both positive and negative, hence both
positive and negative values of $\frac{dy}{dt}$ are included into the phase diagram.
In the case of the equilibrium value, there is no tendency to change: y does not change over time,
$\frac{dy}{dt} = 0$. That condition is satisfied in all points of the horizontal axis. Hence, the value of y at
which the phase line intersects or touches the horizontal or y-axis is the equilibrium value of y.
These points can be viewed as the equilibrium points. The stability of the equilibrium can be
assessed as follows.
In the case of those values of y at which the phase line is above the horizontal axis, $\frac{dy}{dt} > 0$ and thus
y increases over time. Hence, the time path goes in the direction of increasing values on the y-axis
(to the right). If the phase line is below the horizontal axis, then $\frac{dy}{dt} < 0$ and y decreases over time.
The time path goes to the left then.

[Figure: two panels with y on the horizontal axis and $\frac{dy}{dt}$ on the vertical axis, each showing a phase line.]

Figure 9.3. Phase lines


The movements to the right or to the left can be shown with the help of arrows on the phase line, as
shown on Figure 9.3. If the arrows lead to the equilibrium point, then y converges to the equilibrium
over time; in the opposite case y diverges from its equilibrium.
We can see that it is a stable equilibrium (arrows lead to the equilibrium point) if the phase line is
negatively sloping around the equilibrium point, and the equilibrium is unstable if the phase line is
positively sloping. The phase line is negatively sloping in the equilibrium point if the slope of the phase
line is negative in that point. Hence, for stability and a converging process the derivative of
$\frac{dy}{dt} = f(y)$ with respect to y has to be negative around the equilibrium point:
$$\frac{d\left(\frac{dy}{dt}\right)}{dy}(y^*) < 0.$$
In the opposite case, if the slope of the phase line in the equilibrium point is positive:
$$\frac{d\left(\frac{dy}{dt}\right)}{dy}(y^*) > 0,$$
the phase line is positively sloping and the equilibrium is unstable.
Let us continue the example of the market equilibrium model with the differential equation:
$$\frac{dp}{dt} + k(b + d)p = k(a + c).$$
First, we have to transform the equation to the form:
$$\frac{dp}{dt} = -k(b + d)p + k(a + c).$$
As it is a linear equation, the phase line is a straight line. We know that $b, d > 0$, so the slope of the phase line
is determined by the sign of the parameter k. Previously we concluded that the market equilibrium
is stable if k is positive. In the case of positive k the slope of the phase line is negative:
$-k(b + d) < 0$ and the phase line is thus a negatively sloping straight line with positive vertical
intercept $k(a + c)$ (we also know that $a, c > 0$). This phase line is depicted on Figure 9.4. We can
see that in that case the process indeed converges and the equilibrium is stable.

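[Figure: phase diagram with p on the horizontal axis and $\frac{dp}{dt}$ on the vertical axis, showing the negatively sloping phase line $\frac{dp}{dt} = k(a + c) - k(b + d)p$ with vertical intercept $k(a + c)$, crossing the horizontal axis at $p^*$.]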

Figure 9.4. Phase diagram showing the time path of price ( k > 0 )
As another example, given the differential equation $\frac{dy}{dt} = 8y - 2y^2$, the equilibrium values can be
found by replacing $y = y^*$ and $\frac{dy}{dt} = 0$ (and dividing both sides by 2):
$$4y^* - (y^*)^2 = 0,$$
$$y^*(4 - y^*) = 0,$$
resulting in two values: $y_1^* = 0$ and $y_2^* = 4$.
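For analysing stability, we take the derivative:
$$\frac{d\left(\frac{dy}{dt}\right)}{dy} = \frac{d\left(8y - 2y^2\right)}{dy} = 8 - 4y,$$
and find its values in the equilibrium points:
$$\frac{d\left(\frac{dy}{dt}\right)}{dy}(0) = 8 - 4 \cdot 0 = 8 > 0,$$
$$\frac{d\left(\frac{dy}{dt}\right)}{dy}(4) = 8 - 4 \cdot 4 = -8 < 0.$$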
Since the derivative for the equilibrium value $y_1^* = 0$ is positive, the phase line is positively
sloping in that point and the equilibrium is thus unstable. For the other equilibrium value $y_2^* = 4$ the
derivative is negative and the phase line thus negatively sloping. Hence, $y_2^* = 4$ is an equilibrium
value that is stable: the value of y converges to that equilibrium over time. These conclusions
are illustrated by the phase diagram on Figure 9.5.

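[Figure: phase diagram with y on the horizontal axis and $\frac{dy}{dt}$ on the vertical axis; the phase line crosses the horizontal axis at y = 0 and y = 4.]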

Figure 9.5. Example of a phase diagram
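
The conclusions for $\frac{dy}{dt} = 8y - 2y^2$ can also be illustrated numerically. The sketch below evaluates the slope of the phase line at both equilibrium values and follows the time path with simple explicit Euler steps; the starting values, the step size and the function names are assumptions made for this illustration.

```python
def f(y):
    """Right-hand side of dy/dt = 8y - 2y^2 (the phase line)."""
    return 8 * y - 2 * y ** 2


def slope(y):
    """Derivative of the phase line with respect to y: 8 - 4y."""
    return 8 - 4 * y


def simulate(y0, t_end=5.0, dt=0.001):
    """Follow the time path of y from y0 with explicit Euler steps."""
    y = y0
    for _ in range(round(t_end / dt)):
        y += dt * f(y)
    return y


for y_star in (0, 4):
    print(y_star, f(y_star), slope(y_star))   # f is 0 at both equilibrium values

for y0 in (0.5, 6.0):
    print(y0, simulate(y0))                   # both paths end close to 4
```

Starting values on either side of 4 lead back to 4, whereas a small positive starting value moves away from 0, in line with the signs of the derivative found above.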

Again, although the phase diagrams allow us to analyse more complex differential equations, at the
same time they can add an additional view to the solutions of quite simple equations, such as the
differential equations that describe the change of the value of a single amount of money and the
annuity over time: $\frac{dy}{dt} = ry$ and $\frac{dy}{dt} = ry + B$ (see Chapter 10.3), respectively. The corresponding
phase lines can be seen on Figure A9.1.

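[Figure: phase diagram with y on the horizontal axis and $\frac{dy}{dt}$ on the vertical axis, showing the phase line $\frac{dy}{dt} = ry$ through the origin and the phase line $\frac{dy}{dt} = B + ry$ with vertical intercept B.]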

Figure A9.1. Phase diagram for a single amount of money and for the annuity in the case of
continuous time
Again, although financial mathematics does not use terms like equilibrium or stability, this
phase diagram illustrates that the value of money increases at an accelerating rate over time.
The dynamic analysis is mostly used in macroeconomics. The number of the difference or
differential equations included in the model depends on how many variables are included in the
model as changing over time. When, for example, two variables changing over time are included
into the model, then the model includes two differential equations forming an equation system. In
this case the principles of drawing the phase diagrams are somewhat different.
In the case of two time-dependent variables (x and y) and two differential equations ($\dot{x} = f(x, y)$
and $\dot{y} = g(x, y)$) the phase diagram is depicted on an xy-plane (see Figure A9.2 in the following
example). First, the demarcation curves are drawn. A demarcation curve $\dot{x} = 0$ includes all pairs of
the values of x and y at which x has no tendency to change and the other demarcation curve $\dot{y} = 0$
includes all those pairs at which y has no tendency to change. Hence, a demarcation curve covers all
possible optimal solutions for a particular variable. The intersection point of these demarcation
curves can then be viewed as the equilibrium point of the model. For drawing, the demarcation
curves are expressed in the form $y = f(x)$.
When crossing the demarcation curve, the sign of $\dot{x}$ or $\dot{y}$, respectively, should change. This can be marked on the figure with the help of plus and minus signs. If $\frac{\partial \dot{y}}{\partial y} > 0$, then when moving in the direction of the increase in y (moving up when y is depicted on the vertical axis) $\dot{y}$ should be increasing. Hence, before the line $\dot{y} = 0$, $\dot{y}$ is negative (y decreases over time and the direction of the phase path is opposite to the y-axis (down)) and after the line $\dot{y} = 0$, $\dot{y}$ becomes positive and the phase path goes in the same direction as the y-axis (up). If $\frac{\partial \dot{y}}{\partial y} < 0$, the opposite is true.
Analogically, the direction of the phase path can be determined relative to the x-axis. If, for example, $\frac{\partial \dot{x}}{\partial x} > 0$, then when moving in the direction of the increase in x (moving to the right when x is depicted on the horizontal axis), before the line $\dot{x} = 0$ the direction of the phase path is opposite to the x-axis (to the left) and after the line $\dot{x} = 0$ it is the same as the direction of the x-axis (to the right). If $\frac{\partial \dot{x}}{\partial x} < 0$, the opposite is true again.
The demarcation lines divide the xy-plane into four segments. In order to describe the direction of
movements of the two variables in these segments, small arrows describing the direction of
movements are marked. On the basis of these arrows now the phase paths can be sketched starting
from different initial points in different segments. The phase path may converge to the equilibrium
point, but can also diverge from it.
Let us look at an example, where $\frac{\partial \dot{y}}{\partial y} > 0$ and $\frac{\partial \dot{x}}{\partial x} > 0$. Then before the line $\dot{y} = 0$, $\dot{y}$ is negative (y decreases over time, minus signs below this line) and after this line $\dot{y}$ is positive (plus signs above this line). Before the line $\dot{x} = 0$, $\dot{x}$ is negative (minus signs to the left of this line) and after this line positive (plus signs to the right of this line). Then the arrows and possible time paths in each segment can be sketched as shown in Figure A9.2. In this case whether the equilibrium is reached depends on the starting point.
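Figure A9.2. An example of a phase diagram in the case of two differential equations (vertical axis: y(t); horizontal axis: x(t); demarcation curves $\dot{y} = 0$ and $\dot{x} = 0$ intersecting at the equilibrium point $(x^*, y^*)$, with plus and minus signs marked on both sides of each curve)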
Let us look at an example of the IS-LM model for a closed economy covering both the market for goods and the money market. Let the goods and capital market be described by the following equations. In equilibrium the national income is equal to the total expenditures:
$$Y(t) = E(t),$$
which, in turn, are divided between consumption C, investments I and government expenditures G (the foreign sector is left out here):
$$E(t) = C(t) + I(t) + G.$$
Consumption is assumed to consist of the autonomous consumption AC and a certain portion (expressed by the propensity to consume c) of the income after taxes (g is the tax rate):
$$C(t) = AC + c(1 - g)Y(t), \quad 0 < c, g < 1, \text{ thus } 0 < c(1 - g) < 1.$$
Investments are assumed to consist of the autonomous investments AI and an interest-rate dependent part (which depends negatively on the interest rate r through the parameter i):
$$I(t) = AI + i\,r(t), \quad i < 0.$$

Let the money market be described by the following equations. In equilibrium the demand for money is equal to the supply of money:
$$L(t) = M,$$
where the money supply is exogenous (time-independent) and the demand for money consists of the transactional demand that depends positively on the national income and the speculative demand that depends negatively on the interest rate:
$$L(t) = yY(t) + kr(t), \quad y > 0, \quad k < 0.$$
Let us assume that the change of the national income over time depends positively on the excess demand on the goods and capital market and the change of the interest rate over time depends positively on the excess demand on the money market:
$$\dot{Y} = a(E(t) - Y(t)), \quad a > 0,$$
$$\dot{r} = b(L(t) - M), \quad b > 0,$$
or
$$\dot{Y} = a[C(t) + I(t) + G - Y(t)] = a[AC + c(1 - g)Y(t) + AI + ir(t) + G - Y(t)],$$
$$\dot{r} = b[yY(t) + kr(t) - M].$$
When solving this system the equilibrium values of the national income and the interest rate can be found. However, since the solutions are expressions of a large number of parameters, the interpretation could be quite troublesome, and an alternative method here is to depict the situation in a phase diagram.
Here, the phase diagram is drawn on the Yr-plane. First, the demarcation curves $\dot{Y} = 0$ and $\dot{r} = 0$ are drawn. (When these curves have a common point, it is the equilibrium point of this system: both variables have reached their equilibrium values.) For that we set $\dot{Y}$ and $\dot{r}$ equal to 0:
$$\dot{Y} = a[AC + c(1 - g)Y(t) + AI + ir(t) + G - Y(t)] = 0 \quad \text{(IS curve)},$$
$$\dot{r} = b[yY(t) + kr(t) - M] = 0 \quad \text{(LM curve)}.$$
Basically these equations set the equilibrium conditions for the goods and capital market and the money market, since for these conditions to be true, in both cases the demand has to be equal to the supply (excess demand is equal to 0). The curve $\dot{Y} = 0$ is also known as the IS curve and $\dot{r} = 0$ as the LM curve.
Since r is usually depicted on the vertical axis and Y on the horizontal axis, for drawing these conditions have to be transformed into the form $r = f(Y)$.
Solving both conditions for the interest rate gives:
$$r(t) = \frac{a\left[-AC - AI - G - [c(1 - g) - 1]Y(t)\right]}{ai} = \frac{-AC - AI - G}{i} - \frac{c(1 - g) - 1}{i}\,Y(t),$$
$$r(t) = \frac{b[M - yY(t)]}{bk} = \frac{M}{k} - \frac{y}{k}\,Y(t).$$
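The two straight lines can be evaluated for concrete numbers to locate the equilibrium. The following Python sketch uses hypothetical parameter values that are not taken from the text; they were chosen only so that the sign assumptions of the model hold. It computes the intersection of the IS and LM curves and allows checking that both time derivatives vanish there.

```python
# Hypothetical parameter values chosen only to respect the sign assumptions
# of the model (0 < c, g < 1; i < 0; y > 0; k < 0); they are not from the text.
AC, AI, G, M = 50.0, 30.0, 40.0, 100.0
c, g = 0.8, 0.25
i, y, k = -10.0, 0.5, -5.0
a, b = 0.5, 0.1


def is_curve(Y):
    """Interest rate on the IS curve (Y-dot = 0) for a given income Y."""
    return (-AC - AI - G) / i - (c * (1 - g) - 1) / i * Y


def lm_curve(Y):
    """Interest rate on the LM curve (r-dot = 0) for a given income Y."""
    return M / k - y / k * Y


def Y_dot(Y, r):
    """Change of the national income over time."""
    return a * (AC + c * (1 - g) * Y + AI + i * r + G - Y)


def r_dot(Y, r):
    """Change of the interest rate over time."""
    return b * (y * Y + k * r - M)


# Both curves are straight lines r = intercept + slope * Y, so the
# intersection point follows from equating the two lines.
is_intercept, is_slope = is_curve(0.0), is_curve(1.0) - is_curve(0.0)
lm_intercept, lm_slope = lm_curve(0.0), lm_curve(1.0) - lm_curve(0.0)

Y_star = (is_intercept - lm_intercept) / (lm_slope - is_slope)
r_star = is_curve(Y_star)

print("IS:", is_intercept, is_slope)
print("LM:", lm_intercept, lm_slope)
print("equilibrium:", Y_star, r_star)
print("time derivatives at equilibrium:", Y_dot(Y_star, r_star), r_dot(Y_star, r_star))
```

With other parameter values only the numbers change; the signs of the intercepts and slopes follow from the sign assumptions alone.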


Knowing the signs of the parameters and independent variables (AC, AI, G and M can be expected to be positive) we can conclude that $\dot{Y} = 0$, or the IS-curve, is a negatively sloping straight line with a positive vertical intercept and $\dot{r} = 0$, or the LM-curve, is a positively sloping straight line with a negative vertical intercept (see Figure A9.3).
Now we can find the direction of the change in r and Y on both sides of the demarcation curves. The derivatives are:
$$\frac{\partial \dot{Y}}{\partial Y} = a[c(1 - g) - 1] < 0 \quad \text{and} \quad \frac{\partial \dot{r}}{\partial r} = bk < 0.$$
Hence, for both variables, as the value of the variable increases, the derivative with respect to time changes from positive to negative. We mark plus signs ($\dot{r} > 0$) below the $\dot{r} = 0$ curve and minus signs ($\dot{r} < 0$) above this curve. Also, we mark plus signs ($\dot{Y} > 0$) to the left of the $\dot{Y} = 0$ curve and minus signs ($\dot{Y} < 0$) to the right of this curve. In addition, we can draw small arrows describing the direction of movements in different segments.
After that we can sketch a hypothetical phase path starting from a hypothetical initial point A, in
order to determine whether the equilibrium can be reached or not. In the case depicted on Figure
A9.3 the phase path is going in circles counter-clockwise. When in doubt, one can first sketch the
diagonal (dotted) lines parallel to the average direction of the arrows in each segment.

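Figure A9.3. Phase diagram of an IS-LM model (vertical axis: r(t); horizontal axis: Y(t); demarcation curves $\dot{Y} = 0$ (IS) and $\dot{r} = 0$ (LM) intersecting at the equilibrium point $(Y^*, r^*)$; plus and minus signs on both sides of each curve; initial point A)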


Now we can see that in order for the phase path to lead to the equilibrium point, the LM-curve has to be steeper than the IS-curve. That means that the absolute value of the slope of the $\dot{r} = 0$ curve has to be larger than the absolute value of the slope of the $\dot{Y} = 0$ curve:
$$\left|-\frac{y}{k}\right| > \left|-\frac{c(1 - g) - 1}{i}\right|.$$
Knowing that on the left side we have the absolute value of a positive expression and on the right side that of a negative expression, we can simplify:
$$-\frac{y}{k} > \frac{c(1 - g) - 1}{i}.$$
This condition is the condition for a stable equilibrium in this example.


When, for example, the government decides to decrease the money supply, the decrease in M also decreases the absolute value of $\frac{M}{k}$. Since the vertical intercept of the LM-curve, $\frac{M}{k}$, is negative, the vertical intercept is shifted upwards. As the change in M does not change the slope of the LM-curve, the change can be depicted as a parallel shift upwards (or to the left, see Figure A9.4).
In comparative terms, the new equilibrium is at the intersection point of the new LM-curve and the IS-curve. How this point is reached is described by the phase path. For this, the new LM-curve has to be viewed as the demarcation curve $\dot{r} = 0$. The former equilibrium is now just the initial point.

Figure A9.4. The IS-LM model after the change in the money supply (vertical axis: r(t); horizontal axis: Y(t); curves IS, LM and the shifted curve LM'; points (1) and (2))
Analogically, the impact of other independent variables or parameters can be investigated by finding the corresponding changes in the curves. If the independent variable or parameter that changes is included in the expression for the slope, the curve has to be rotated accordingly (keeping the vertical intercept in place); if the expression of the vertical intercept changes, there has to be a parallel shift. (It may also happen that both changes occur.) The new intersection point gives the new equilibrium values of Y and r.
It has to be noted that when no parameter or independent variable has changed, the equilibrium point cannot change either. In the case of temporary deviations (such as the point A in Figure A9.3) the question is about converging to or diverging from the initial equilibrium point.

10. EXPONENTIAL FUNCTIONS

10.1. Specificity of exponential functions


In mathematical economics exponential and logarithmic functions are widely used. An exponential function is a function where a variable (argument) can be found in the exponent:
$$y = a^x,$$
with the additional condition that:
$$a > 0 \text{ and } a \neq 1.$$
If the opposite operation is used, it gives the logarithmic function with the general form as follows:
$$y = \log_a x, \text{ where } a > 0 \text{ and } a \neq 1.$$
The value of a logarithmic function y is the power that, when a is raised to it, gives x as a result. Solving for x, we get an exponential function:
$$x = a^y.$$
The limitations for the parameter a have the following reasons: a cannot be negative, because if the power is, for example, $\frac{1}{2}$, we would have to take a square root of a negative number. Raising to any power does not change the numbers 0 and 1; thus, if $a = 0$ or $a = 1$, the domain of the logarithmic function would be limited to one value, $x = 0$ or $x = 1$, respectively.
In economics, sometimes a simplifying assumption that $a > 1$ is made. One can always transform an exponential function where $0 < a < 1$ to a form where the base is larger than 1. Namely, if $0 < a < 1$, then a can be expressed as the inverse of a number b that is larger than 1: $a = \frac{1}{b}$, and it can be written:
$$y = a^x = \left(\frac{1}{b}\right)^x = b^{-x} = b^z, \text{ where } z = -x.$$
In economics the natural exponential function is often used, where the base is the number e:
$$y = e^x.$$
Its wide use is explained by the simplicity of taking the derivative. The derivative of an exponential function is:
$$\left(a^x\right)' = a^x \ln a.$$
If the base is e, then:
$$\left(e^x\right)' = e^x \ln e = e^x.$$
That makes many transformations and calculations much simpler.
Let us consider the base of the natural exponential function. The value of the number e is related to the function $f(n) = \left(1 + \frac{1}{n}\right)^n$. If n increases, the value of this function also increases, but at a decelerating pace:
if $n = 1$, then $f(1) = \left(1 + \frac{1}{1}\right)^1 = 2$;

EXPONENTIAL FUNCTIONS Anneli Kaasa

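if $n = 2$, then $f(2) = \left(1 + \frac{1}{2}\right)^2 = 2{,}25$;
if $n = 3$, then $f(3) = \left(1 + \frac{1}{3}\right)^3 \approx 2{,}37$ etc.
It is known that as $n \to \infty$, the value of this function approaches the number e:
$$\lim_{n \to \infty}\left(1 + \frac{1}{n}\right)^n = e \approx 2{,}7183.$$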
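The decelerating increase towards e can be observed by evaluating the function for growing values of n. The following is a small Python sketch; the selection of n values is arbitrary.

```python
import math


def f(n):
    """Value of (1 + 1/n)^n."""
    return (1 + 1 / n) ** n


for n in (1, 2, 3, 10, 1000, 1_000_000):
    # The gap to e shrinks as n grows
    print(n, f(n), math.e - f(n))
```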
In economics the natural logarithmic function $y = \ln x$ is also used widely. Often the aim is to bring a non-linear function to a linear form in order to use linear (matrix) algebra. For example, the Cobb-Douglas type function is non-linear:
$$z = c\,x^a y^b.$$
When taking the natural logarithm of both sides:
$$\ln z = \ln c + a \ln x + b \ln y,$$
the resulting function is linear with respect to the logarithms $\ln x$ and $\ln y$, or log-linear.
In microeconomics, the utility function is often transformed when certain characteristics of the utility function are desired (this can be done because there are no units for measuring utility on an absolute scale and utility is measured on a relative scale: only the order of consumption bundles is important and not the absolute value of utility). For example, in the case of the Cobb-Douglas type utility function $u = q_1 q_2$ the transformation often used is $v = \ln u = \ln q_1 + \ln q_2$.
Logarithmic transformations are also often used in econometrics, because estimating linear relationships is much simpler than estimating non-linear relationships. In addition, often the elasticity (of the variable y with respect to the variable x) is of interest: the aim is to find the relative change in y brought about by a 1% change in x. This can be found by finding a ratio that describes how the natural logarithms of the variables ($\ln x$ and $\ln y$) relate to each other. The derivative $\frac{d(\ln y)}{d(\ln x)}$ estimates the absolute change in $\ln y$ when $\ln x$ increases by one unit. At the same time, it gives the elasticity mentioned before, because we can make the following transformations in the formula for this elasticity:
$$\frac{dy}{dx}\frac{x}{y} = x\,\frac{dy}{dx}\frac{1}{y} = \frac{\frac{dy}{dx}\frac{1}{y}}{\frac{1}{x}} = \frac{\frac{dy}{dx}\frac{d(\ln y)}{dy}}{\frac{d(\ln x)}{dx}} = \frac{dy}{dx}\frac{d(\ln y)}{dy}\frac{dx}{d(\ln x)} = \frac{d(\ln y)}{d(\ln x)}.$$
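The equality of the two expressions can be illustrated numerically. The following Python sketch assumes a hypothetical relationship $y = 3x^2$ (not from the text), whose elasticity equals 2 at every point, and approximates both derivatives by central differences.

```python
import math


def y(x):
    """Hypothetical relationship y = 3 * x^2, whose elasticity is 2 everywhere."""
    return 3 * x ** 2


def elasticity_direct(x, h=1e-6):
    """(dy/dx) * (x/y), with dy/dx approximated by a central difference."""
    dydx = (y(x + h) - y(x - h)) / (2 * h)
    return dydx * x / y(x)


def elasticity_log(x, h=1e-6):
    """d(ln y)/d(ln x), approximated by a central difference in ln x."""
    lx = math.log(x)
    upper = math.log(y(math.exp(lx + h)))
    lower = math.log(y(math.exp(lx - h)))
    return (upper - lower) / (2 * h)


print(elasticity_direct(5.0), elasticity_log(5.0))
```

Both printed values should be close to 2, up to the error of the numerical approximation.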
When a relationship $y = ax$ is estimated in econometrics, the coefficient a estimates the quotient $\frac{\Delta y}{\Delta x}$ that describes how y changes (in units) as x increases by one unit. Analogically, when a relationship $\ln y = b \ln x$ is estimated, the coefficient b estimates the quotient $\frac{\Delta(\ln y)}{\Delta(\ln x)}$ that describes how $\ln y$ changes (in units) as $\ln x$ increases by one unit. At the same time b estimates the elasticity $\frac{\Delta y}{\Delta x} : \frac{y}{x}$: it describes how y changes (in percent) as x increases by 1%.


10.2. Growth rate


Exponential functions are often used to describe the behaviour of an economic variable over time, for example:
$$y = f(t) = a\,b^{g(t)}.$$
Often the number e serves as the base and the function g can be a simple linear function, for example $g(t) = ct$:
$$y = a\,e^{ct}.$$
The growth rate of a function describes the relative change of a variable over time. The growth rate is denoted by $r_y$, where r stands for rate and the subscript shows the variable whose growth rate it is. The absolute change of y in one time unit ($\Delta t = 1$) can be described by $\Delta y_t = y_t - y_{t-1}$, and the corresponding relative change can be found as usual:
$$r_y = \frac{y_t - y_{t-1}}{y_{t-1}}.$$
This approach can be used in the case of discrete time (time is measured in periods and the variable t can only have integer values (whole numbers)).
In the case of continuous time, infinitely small changes can be assumed and the concept of the derivative can be used. The absolute change of y in one time unit can be found as the derivative $\frac{dy}{dt}$. This derivative gives the change in y as the argument t increases by one unit. In order to find the growth rate of y this derivative has to be divided by the function y itself:
$$r_y = \frac{\frac{dy}{dt}}{y}.$$
The growth rate can be different at different points of time: both $\frac{dy}{dt}$ and y are functions of t and thus the growth rate also depends on the point of time under consideration.
In the case of the function $y = a\,e^{ct}$, the growth rate is the constant c. According to the chain rule,
$$\frac{dy}{dt} = a\,e^{ct} c,$$
and the growth rate is:
$$r_y = \frac{\frac{dy}{dt}}{y} = \frac{a\,e^{ct} c}{a\,e^{ct}} = c.$$
If c is negative ($c < 0$), there is negative growth: the value of the function decreases over time.
For another example, let us assume that the value of a wine collection v depends on time as follows:
$$v = 1000e^{\sqrt{t}}.$$
The growth rate is then:
$$r_v = \frac{\frac{dv}{dt}}{v} = \frac{1000e^{\sqrt{t}} \cdot \frac{1}{2\sqrt{t}}}{1000e^{\sqrt{t}}} = \frac{1}{2\sqrt{t}}.$$
Hence, the growth rate depends on time (is different at different points of time) and as $t \to \infty$, $\frac{1}{2\sqrt{t}} \to 0$, so the growth is decelerating.
In order to simplify calculations, sometimes the following method is used: first the natural logarithm of the function is taken and then the derivative of the result is taken with respect to time. This is justified by the following conversion:
$$r_y = \frac{\frac{dy}{dt}}{y} = \frac{1}{y}\frac{dy}{dt}.$$
The expression $\frac{1}{y}$ can be viewed as the derivative of $\ln y$ with respect to y. Hence, it can be written:
$$r_y = \frac{1}{y}\frac{dy}{dt} = \frac{d(\ln y)}{dy}\frac{dy}{dt} = \frac{d(\ln y)}{dt}.$$
Hence, for calculating the growth rate the formula $r_y = \frac{d(\ln y)}{dt}$ can be used as well.
For example, in the case of $y = a\,e^{ct}$ the logarithm is
$$\ln y = \ln a + ct \ln e = \ln a + ct \quad (\ln e = 1)$$
and
$$r_y = \frac{d(\ln y)}{dt} = 0 + c = c.$$
Given $v = 1000e^{\sqrt{t}}$, the logarithm is
$$\ln v = \ln 1000 + \sqrt{t} \ln e = \ln 1000 + \sqrt{t}$$
and
$$r_v = \frac{d(\ln v)}{dt} = \frac{1}{2\sqrt{t}}.$$
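The logarithmic formula also lends itself to a numerical check. The following Python sketch approximates $\frac{d(\ln y)}{dt}$ by a central difference; the values $a = 5$ and $c = 0{,}03$ are hypothetical, while the wine function is the one used in the example.

```python
import math


def growth_rate(f, t, h=1e-6):
    """Growth rate d(ln f)/dt at time t, using a central difference."""
    return (math.log(f(t + h)) - math.log(f(t - h))) / (2 * h)


def y(t, a=5.0, c=0.03):
    """y = a * e^(c t); the values of a and c are hypothetical."""
    return a * math.exp(c * t)


def v(t):
    """Value of the wine collection, v = 1000 * e^(sqrt(t))."""
    return 1000 * math.exp(math.sqrt(t))


print(growth_rate(y, 10.0))   # should be close to c = 0.03 at any t
print(growth_rate(v, 25.0))   # should be close to 1 / (2 * sqrt(25)) = 0.1
print(growth_rate(v, 100.0))  # smaller than at t = 25: the growth decelerates
```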
Growth rate is used in the case of functions like $y = f(t)$ (t is time) and it describes the relative change of y per time unit (absolute change of time!). The growth rate can be calculated as $r_y = \frac{\frac{dy}{dt}}{y} = \frac{dy}{dt}\frac{1}{y}$. We can also first find $\ln y$ and then take its derivative with respect to t, because: $r_y = \frac{1}{y}\frac{dy}{dt} = \frac{d(\ln y)}{dy}\frac{dy}{dt} = \frac{d(\ln y)}{dt}$.
$r_y = \frac{dy}{dt}\frac{1}{y}$ means that as t increases by 1 unit, y increases or decreases (depending on the sign of $r_y$ being positive or negative, respectively) approximately by $r_y \cdot 100\%$. NB! If we are dealing with economic problems, we should not use x and y, but the variables of our problem instead.


From the formula $r_y = \frac{dy}{dt}\frac{1}{y}$ or, if time t as an argument is denoted with x, $r_y = \frac{dy}{dx}\frac{1}{y}$, we can see that in a sense the formula for the growth rate is placed between the formulas of the derivative $\frac{dy}{dx}$ and the elasticity $\frac{dy}{dx}\frac{x}{y}$. While the derivative $\frac{dy}{dx}$ describes the change in y in units per one unit change in x and the elasticity $\frac{dy}{dx} : \frac{y}{x}$ describes the relative change in y in percent per 1% change in x, the growth rate $\frac{dy}{dx} : y$ describes the relative change in y in percent per one unit change in x. The interpretation of these three indicators is similar in a sense, but there are also important differences that are shown in Figure A10.1.

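indicator   | formula                                                                            | shows that as x increases by | then y increases or decreases (sign) by
derivative  | $y' = \frac{dy}{dx} \approx \frac{\Delta y}{\Delta x}$                             | 1 unit                       | $y'$ units
elasticity  | $e = \frac{dy}{dx}\frac{x}{y} \approx \frac{\Delta y}{\Delta x} : \frac{y}{x}$     | 1%                           | $e$ %
growth rate | $r_y = \frac{dy}{dx}\frac{1}{y} \approx \frac{\Delta y}{\Delta x} : y$             | 1 unit                       | $r_y \cdot 100$ %

Figure A10.1. Comparison of the derivative, elasticity and growth rate.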

It is worth noting that if the growth rate $\frac{dy}{dx}\frac{1}{y}$ is multiplied by the value of the argument x, we get the elasticity $\frac{dy}{dx}\frac{x}{y}$:
$$r_y \cdot x = \frac{dy}{dx}\frac{x}{y},$$
and if the growth rate $\frac{dy}{dx}\frac{1}{y}$ is multiplied by the value of the function y, we get the derivative $\frac{dy}{dx}$:
$$r_y \cdot y = y'.$$
We know that the derivative gives us the marginal function MY and the elasticity is equal to the quotient of the marginal and average functions: $\frac{dy}{dx}\frac{x}{y} = \frac{dy}{dx} : \frac{y}{x} = \frac{MY}{AY}$. Continuing the same sequence, the growth rate is equal to the quotient of the marginal function and the function itself:
$$r_y = \frac{dy}{dx}\frac{1}{y} = \frac{dy}{dx} : y = \frac{MY}{y}.$$


Sometimes the growth rate of a product of functions has to be found. For example, let $y = u \cdot v$, where $u = f(t)$ and $v = g(t)$. To find $r_y$ we take the logarithm, $\ln y = \ln u + \ln v$, and find the growth rate:
$$r_y = r_{uv} = \frac{1}{y}\frac{dy}{dt} = \frac{d(\ln y)}{dt} = \frac{d(\ln u)}{dt} + \frac{d(\ln v)}{dt} = \frac{1}{u}\frac{du}{dt} + \frac{1}{v}\frac{dv}{dt} = r_u + r_v,$$
hence, the growth rate of a product of functions is the sum of the growth rates of these functions.
In the case of a quotient of functions $y = \frac{u}{v}$, the logarithm is $\ln y = \ln u - \ln v$ and the growth rate is:
$$r_y = r_{u/v} = \frac{1}{y}\frac{dy}{dt} = \frac{d(\ln y)}{dt} = \frac{d(\ln u)}{dt} - \frac{d(\ln v)}{dt} = \frac{1}{u}\frac{du}{dt} - \frac{1}{v}\frac{dv}{dt} = r_u - r_v,$$
hence, the growth rate of a quotient of functions is the difference between the growth rates of the dividend and divisor functions.
A formula for the growth rate of a sum or difference of functions ($y = u \pm v$) can be derived as follows:
$$r_y = r_{u \pm v} = \frac{1}{y}\frac{dy}{dt} = \frac{d(\ln y)}{dt} = \frac{d(\ln(u \pm v))}{dt} = \frac{1}{u \pm v}\frac{d(u \pm v)}{dt} =$$
$$= \frac{1}{u \pm v}\frac{du}{dt} \pm \frac{1}{u \pm v}\frac{dv}{dt} = \frac{u}{u \pm v}\frac{1}{u}\frac{du}{dt} \pm \frac{v}{u \pm v}\frac{1}{v}\frac{dv}{dt} = \frac{u}{u \pm v}\,r_u \pm \frac{v}{u \pm v}\,r_v,$$
hence, in finding the growth rate of a sum or difference of functions the shares of these functions in the sum or difference are taken into account.
For example, let us assume that the shop of a bread factory sold white bread for 1000 euros and brown bread for 500 euros in February. The sales of white bread increased by 12% and the sales of brown bread by 9% in February. Hence, the growth rate of total sales is:
$$r_{total} = \frac{1000}{1000 + 500} \cdot 0{,}12 + \frac{500}{1000 + 500} \cdot 0{,}09 = 0{,}11$$
and the total sales increased by 11% in February.
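The share-weighted formula can be written as a short function. The following Python sketch reproduces the bread example; the function name is illustrative, and the cross-check at the end simply compares the result with the growth of the summed sales.

```python
def growth_rate_of_sum(values, rates):
    """Share-weighted growth rate of a sum of components."""
    total = sum(values)
    return sum(value / total * rate for value, rate in zip(values, rates))


sales = [1000.0, 500.0]   # white and brown bread, in euros
rates = [0.12, 0.09]      # growth rates of the two components

r_total = growth_rate_of_sum(sales, rates)

# Cross-check: let each component grow and compare the totals directly.
new_total = sum(s * (1 + r) for s, r in zip(sales, rates))
r_check = new_total / sum(sales) - 1

print(round(r_total, 4), round(r_check, 4))
```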

10.3. Financial mathematics


Exponential functions are widely used in financial mathematics. The main assumption of financial mathematics is that the value of money or another form of capital changes over time. When depositing money into a bank one expects the sum to increase by the interest; similarly, revenue is expected from investments and so on. It is possible to derive formulas that explain this change over time.
Let there be an initial sum of money $y_0$ that is deposited in a bank with the interest rate r. The interest rate r shows which portion of the total sum will be added as interest. When multiplying the interest rate r by 100, we get the interest rate in percent. For example, if $r = 0{,}12$, it can be said that the interest rate is 12%.
In the case of simple interest, in every period the sum $r y_0$ is added to the initial sum and the result after t periods is:
$$y_t = y_0 + t r y_0.$$
However, compound interest is used more often. In that case the sum that is already accumulated serves as the basis for interest calculations. Thus, if the sum increases by $r y_0$ in the first period:
$$y_1 = y_0 + r y_0 = (1 + r) y_0,$$
then the interest in the second period is calculated based on the final sum of the first period: $r y_1$. The final sum of the second period is then $y_2 = (1 + r) y_1$. Hence, the time path of the value of money is described by a simple difference equation:
$$y_t = (1 + r) y_{t-1}.$$
The solution can be found as:
$$y^* = \frac{0}{1 - (1 + r)} = 0 \quad \text{and} \quad y_t = 0 + (y_0 - 0)(1 + r)^t = y_0 (1 + r)^t.$$
If we denote the initial sum ($y_0$) with A and its future value ($y_t$) with V, we can write the formula for finding the future value as:
$$V = A(1 + r)^t.$$
Often time is measured in years and then t stands for the number of years; the interest rate r is then the annual interest rate. Sometimes, however, compounding can take place many times a year, for example quarterly. If the interest is calculated m times a year, then the interest rate of one calculation must be $\frac{r}{m}$ and there are m times t periods:
$$V = A\left(1 + \frac{r}{m}\right)^{mt}.$$
If we increase the frequency of compounding to infinity ($m \to \infty$), it is called continuous compounding. The formula for this case can be derived as follows. First, we can multiply the power mt by the number 1 in the form $\frac{r}{r}$:
$$V = A\left(1 + \frac{r}{m}\right)^{\frac{mtr}{r}}.$$
After replacing $\frac{r}{m} = \frac{1}{\frac{m}{r}}$ and knowing that when exponentiating an exponential function the powers are multiplied, we can write:
$$V = A\left[\left(1 + \frac{1}{\frac{m}{r}}\right)^{\frac{m}{r}}\right]^{rt}.$$

In the case of continuous compounding $m \to \infty$; hence, the future value in the case of continuous compounding is the limit value of this expression when $m \to \infty$:
$$V = \lim_{m \to \infty} A\left[\left(1 + \frac{1}{\frac{m}{r}}\right)^{\frac{m}{r}}\right]^{rt}.$$
If we replace $\frac{m}{r} = n$, we can replace the expression $\left(1 + \frac{1}{\frac{m}{r}}\right)^{\frac{m}{r}}$ with $\left(1 + \frac{1}{n}\right)^n$. As mentioned before, the limit value of the latter as $n \to \infty$ is the number e. Here, the assumption is $m \to \infty$, but as r can be viewed as a positive constant, we can say that if $m \to \infty$, then $\frac{m}{r} \to \infty$ and thus $n \to \infty$. Hence, the formula can be written as:
$$V = \lim_{n \to \infty} A\left[\left(1 + \frac{1}{n}\right)^n\right]^{rt} = Ae^{rt}.$$
Thus, the formula for the future value in the case of continuous compounding is $V = Ae^{rt}$, where r is the growth rate of the value of money.
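The limit can be made visible by letting the number of compoundings per period grow. The following Python sketch uses illustrative values for A, r and t and prints how the gap between $A\left(1 + \frac{r}{m}\right)^{mt}$ and $Ae^{rt}$ shrinks as m increases.

```python
import math

A, r, t = 10_000.0, 0.1, 5   # illustrative deposit, interest rate and number of periods


def future_value(A, r, t, m):
    """Future value with compounding m times per period."""
    return A * (1 + r / m) ** (m * t)


continuous = A * math.exp(r * t)
gaps = []
for m in (1, 4, 12, 365, 100_000):
    gap = continuous - future_value(A, r, t, m)
    gaps.append(gap)
    print(m, future_value(A, r, t, m), gap)
```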
This formula can also be derived using the concept of derivatives (assuming infinitely small changes) and a differential equation. Since in the case of continuous compounding the compounding takes place not every period, but after every infinitely small change in time, we can use the concept of derivatives. The derivative of the value of money with respect to time estimates the change in the value of money brought about by a unitary change in time. In the case of continuous compounding this change per time unit is equal to the interest (value of money times the interest rate):
$$\frac{dy}{dt} = ry.$$
This is the differential equation that describes the situation under consideration here. As a remark: when rearranging the difference equation used in the case of discrete time, we can see the similarity: $\Delta y_t = y_t - y_{t-1} = r y_{t-1}$. Here, $\frac{dy}{dt}$ estimates this change: $\frac{dy}{dt} \approx \Delta y_t$.
After transforming the differential equation to the general form:
$$\frac{dy}{dt} - ry = 0,$$
we can find the equilibrium value:
$$y^* = \frac{0}{-r} = 0$$
and the solution:
$$y(t) = 0 + [y(0) - 0]\,e^{rt} \quad \text{or} \quad y(t) = y(0)\,e^{rt}.$$
When replacing the notations, we get the formula obtained before:
$$V = Ae^{rt}.$$


For example, let the deposited sum be 10 000. If the interest rate is 10% or $r = 0{,}1$, then in the case of simple interest, the future value after five years is:
$$V = A + trA = A(1 + tr) = 10\,000(1 + 5 \cdot 0{,}1) = 15\,000.$$
In the case of compound interest with compounding once a year, the future value after five years is:
$$V = A(1 + r)^t = 10\,000(1 + 0{,}1)^5 \approx 16\,105,$$
but if the compounding takes place quarterly, then:
$$V = A\left(1 + \frac{r}{m}\right)^{mt} = 10\,000\left(1 + \frac{0{,}1}{4}\right)^{4 \cdot 5} \approx 16\,386$$
and continuous compounding gives:
$$V = Ae^{rt} = 10\,000\,e^{0{,}1 \cdot 5} \approx 16\,487.$$
Hence, the more frequently interest is calculated, the larger is the future value.
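The four results of this example can be reproduced with a few lines of code. The following is a minimal Python sketch; the variable names are illustrative.

```python
import math

A, r, t = 10_000, 0.1, 5   # deposited sum, interest rate and number of years

simple = A * (1 + t * r)
annual = A * (1 + r) ** t
quarterly = A * (1 + r / 4) ** (4 * t)
continuous = A * math.exp(r * t)

for name, value in [("simple", simple), ("annual", annual),
                    ("quarterly", quarterly), ("continuous", continuous)]:
    print(f"{name:>10}: {value:.0f}")
```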
If r denotes the interest rate and t denotes the number of periods, the future value (V) and the present value (A) are related as:
$$V = A(1 + r)^t.$$
If interest is calculated m times a period:
$$V = A\left(1 + \frac{r}{m}\right)^{mt}.$$
If continuous compounding is used:
$$V = Ae^{rt}.$$
One economic interpretation of the number e can be as follows. It is the sum that will be obtained when one unit of money is deposited for one year and the yearly interest rate is 100%. In this case the number describing how many times a year the compounding takes place (m) approaches infinity and:
$$V = \lim_{m \to \infty} A\left(1 + \frac{r}{m}\right)^{tm} = \lim_{m \to \infty} 1 \cdot \left(1 + \frac{1}{m}\right)^{1 \cdot m} = e.$$
The opposite of finding the future value is finding the present value. If, for example, the desired sum in the future, after t periods, is known and the interest rate r is also known, then it is possible to deposit exactly the sum that is needed, taking into account the interest that will be added. For that the present value of the desired sum has to be found. The present value is also useful when one has to choose between a sum at present and another sum in the future. In that case the sum defined as a future value has to be made comparable with present values, because a sum received at present can be deposited and hence its future value is larger. Finding the present value of some amount that is to be paid or received in the future is called discounting.
In the formula $V = A(1 + r)^t$ the future value V is then known and the present value A unknown. Solving for A gives us the formula of the present value:
$$A = \frac{V}{(1 + r)^t} = V(1 + r)^{-t}.$$
Analogically, for example in the case of continuous compounding, the present value is:
$$A = Ve^{-rt}.$$
Sometimes compound interest is combined with regular (constant) payments, for example when paying back a loan, while interest is calculated every period. When both the number of periods that are needed to finish the payments and the final sum with all the interest are known, it is possible to find the constant sum that has to be paid every period (that can be a year, but other types of periods are also possible). This kind of series of equal payments is called an annuity.
Let there be a fixed amount of payments B and the initial value $y_0 = 0$; one example can be a savings deposit, where every period a fixed sum is added to the deposit. In addition to the fixed sum, the interest is also added to the value of the previous period:
$$y_1 = (1 + r) y_0 + B = B,$$
$$y_2 = (1 + r) y_1 + B,$$
$$y_3 = (1 + r) y_2 + B \text{ etc.}$$
In general:
$$y_t = (1 + r) y_{t-1} + B.$$
Solving this difference equation gives:
$$y^* = \frac{B}{1 - (1 + r)} = -\frac{B}{r} \quad \text{and} \quad y_t = -\frac{B}{r} + \left(y_0 + \frac{B}{r}\right)(1 + r)^t.$$
Knowing that the initial value is 0, we can find the formula for the future value of the annuity:
$$V = \frac{B}{r}\left[(1 + r)^t - 1\right].$$
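The closed-form solution can be compared with a direct iteration of the difference equation. The following Python sketch does so for hypothetical values of B, r and t (they are not from the text).

```python
def annuity_future_value(B, r, t):
    """Closed form V = B/r * ((1 + r)^t - 1)."""
    return B / r * ((1 + r) ** t - 1)


def annuity_by_iteration(B, r, t):
    """Iterate y_t = (1 + r) * y_(t-1) + B starting from y_0 = 0."""
    y = 0.0
    for _ in range(t):
        y = (1 + r) * y + B
    return y


B, r, t = 100.0, 0.05, 10   # hypothetical payment, interest rate and number of periods
print(annuity_future_value(B, r, t), annuity_by_iteration(B, r, t))
```

Both values should agree up to rounding error, since the closed form is the solution of the iterated equation.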
Sometimes the present value of the annuity is of interest. For example, one needs a loan and knows the sum that can be paid every period. A formula for the present value of the annuity can be derived analogically, with the help of a difference equation, but we can also discount the future value of the annuity. We know that $A = \frac{V}{(1 + r)^t}$, hence:
$$A = \frac{B}{r}\,\frac{(1 + r)^t - 1}{(1 + r)^t} \quad \text{or} \quad A = \frac{B}{r}\left[1 - \frac{1}{(1 + r)^t}\right].$$
A formula for calculating the size of the payments B can be found by solving one or the other formula for B:
$$B = Vr\,\frac{1}{(1 + r)^t - 1} \quad \text{and} \quad B = Ar\,\frac{(1 + r)^t}{(1 + r)^t - 1}.$$
For example, let us assume that a loan of 100 000 euros is desired. The bank gives the loan for 10 years with an annual interest rate of 15%. If an annuity is used, then the sum that has to be paid every year is
$$B = 100\,000 \cdot 0{,}15 \cdot \frac{(1 + 0{,}15)^{10}}{(1 + 0{,}15)^{10} - 1} \approx 19\,900 \text{ euros.}$$
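The payment of this loan example can be computed and verified with a simple repayment schedule. The following Python sketch assumes that each payment is made at the end of the year, after that year's interest has been added; the unrounded payment is about 19 925 euros, which corresponds to the rounded figure in the text.

```python
def annuity_payment(A, r, t):
    """Payment B = A * r * (1 + r)^t / ((1 + r)^t - 1)."""
    growth = (1 + r) ** t
    return A * r * growth / (growth - 1)


A, r, t = 100_000, 0.15, 10   # loan, annual interest rate and number of years
B = annuity_payment(A, r, t)
print(round(B, 2))

# Check: add the yearly interest, then subtract the payment, for t years.
balance = float(A)
for _ in range(t):
    balance = balance * (1 + r) - B
print(round(balance, 6))   # the remaining debt should be (numerically) zero
```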

When the interest is compounded continuously for the annuity, then analogically to the derivation of the formula for continuous compounding, we can write
$$\frac{dy}{dt} = ry + B,$$
and rearrange to get the general form of the differential equation:
$$\frac{dy}{dt} - ry = B.$$
The equilibrium value and the solution are:
$$y^* = -\frac{B}{r} \quad \text{and} \quad y(t) = -\frac{B}{r} + \left[y(0) + \frac{B}{r}\right]e^{rt}.$$
Hence, knowing that the initial value $y(0)$ is 0 and replacing the notation of $y(t)$ we get:
$$V = \frac{B}{r}\left[e^{rt} - 1\right].$$

10.4. Optimal timing


In the case of goods whose value is increasing or decreasing over time (goods appreciating or depreciating over time), the question arises when it is useful to sell the goods. When selling at present, the received sum of money can be deposited to earn interest. Hence, the sum that would be received when selling later has to be discounted. This logic is the basis for maximizing the value of goods or, in other words, finding the optimal time to sell the goods.
Using the previous example about wine, let us assume that the bank interest rate is 10% and continuous compounding is used. To find the present value $v_n$ from the future value v of the wine, we can use the formula for the present value $A = Ve^{-rt}$ or in our example:
$$v_n = ve^{-0{,}1t}.$$
The value of the wine depends on time according to the function $v = 1000e^{\sqrt{t}}$. The present value of the possible gain from selling the wine in the future then depends on time as follows:
$$v_n = 1000e^{\sqrt{t}}e^{-0{,}1t} = 1000e^{\sqrt{t} - 0{,}1t}.$$
For finding the time when this value is maximal, we have to take the derivative and set it equal to 0:
$$\frac{dv_n}{dt} = 1000\left(\frac{1}{2\sqrt{t}} - 0{,}1\right)e^{\sqrt{t} - 0{,}1t} \overset{!}{=} 0.$$
Since (always) $e^x > 0$ and $1000 \neq 0$, it has to be that:
$$\frac{1}{2\sqrt{t}} - 0{,}1 = 0,$$
from here $t = 25$. Hence, it is optimal to sell the wine after 25 years.
When looking at the problem from another side, we can say that it is useful to keep the goods if the growth rate of their value is higher than the growth rate of money (bank interest rate). If they become equal, it is optimal to sell the goods and deposit the gains. Hence, in our example the condition for the optimal time is:
$$r_v = 0{,}1 \quad \text{or} \quad \frac{1}{2\sqrt{t}} = 0{,}1.$$
As expected, this leads to the same result as before.
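The optimal selling time can also be located numerically, without taking the derivative. The following Python sketch evaluates the present value on a grid of selling times; the grid range and step are arbitrary choices made for illustration.

```python
import math


def present_value(t, r=0.1):
    """Present value of selling the wine after t years: 1000 * e^(sqrt(t) - r t)."""
    return 1000 * math.exp(math.sqrt(t) - r * t)


# Coarse grid search over selling times from 0.01 to 100 years in steps of 0.01.
grid = [n / 100 for n in range(1, 10_001)]
t_best = max(grid, key=present_value)

print(t_best, present_value(t_best))
# At the optimum the growth rate of the wine's value equals the interest rate.
print(1 / (2 * math.sqrt(t_best)))
```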


APPENDICES Anneli Kaasa

APPENDICES

Appendix 1
The second-order differential $d^2z$ of the function $z = f(x, y)$ can be rewritten as follows. First we
find the total differential of the first-order differential $dz$ according to the formula of the total
differential:
$$d^2z = d(dz) = \frac{\partial(dz)}{\partial x}dx + \frac{\partial(dz)}{\partial y}dy.$$
Then we replace the first-order differential $dz$ with its expression from the formula of the total differential:
$$d^2z = \frac{\partial(f_x dx + f_y dy)}{\partial x}dx + \frac{\partial(f_x dx + f_y dy)}{\partial y}dy$$
and take derivatives of the expressions in brackets with respect to $x$ and $y$, respectively ($dx$ and $dy$
are constants in this context):
$$\begin{aligned}
d^2z &= (f_{xx}dx + f_{yx}dy)dx + (f_{xy}dx + f_{yy}dy)dy \\
&= f_{xx}dx^2 + 2f_{xy}dx\,dy + f_{yy}dy^2.
\end{aligned}$$

Next, we make transformations of $f_{xx}dx^2 + 2f_{xy}dx\,dy + f_{yy}dy^2$ to find conditions that determine
the sign of this expression. We add 0 to the expression in the form of
$\frac{f_{xy}^2}{f_{xx}}dy^2 - \frac{f_{xy}^2}{f_{xx}}dy^2$ and regroup:
$$\begin{aligned}
f_{xx}dx^2 + 2f_{xy}dx\,dy + f_{yy}dy^2
&= f_{xx}dx^2 + 2f_{xy}dx\,dy + \frac{f_{xy}^2}{f_{xx}}dy^2 + f_{yy}dy^2 - \frac{f_{xy}^2}{f_{xx}}dy^2 \\
&= f_{xx}\left(dx^2 + 2\frac{f_{xy}}{f_{xx}}dx\,dy + \frac{f_{xy}^2}{f_{xx}^2}dy^2\right) + \left(f_{yy} - \frac{f_{xy}^2}{f_{xx}}\right)dy^2 \\
&= f_{xx}\left(dx + \frac{f_{xy}}{f_{xx}}dy\right)^2 + \frac{f_{yy}f_{xx} - f_{xy}^2}{f_{xx}}dy^2.
\end{aligned}$$
Since a square is never negative, for any values of $dx$ and $dy$ (unless both are equal to 0):

the expression is assuredly positive if $f_{xx} > 0$ and $f_{yy}f_{xx} - f_{xy}^2 > 0$, and

the expression is assuredly negative if $f_{xx} < 0$ and $f_{yy}f_{xx} - f_{xy}^2 > 0$.

If $f_{yy}f_{xx} - f_{xy}^2 < 0$ or $f_{yy}f_{xx} - f_{xy}^2 = 0$, the sign of the expression is not clear.
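
The completing-the-square step and the resulting sign conditions can be spot-checked numerically. The sketch below is illustrative only: the function names and the randomly drawn values of the partial derivatives are assumptions, and values of $f_{xx}$ close to 0 are skipped because the transformation divides by $f_{xx}$.

```python
import random

def d2z(fxx, fxy, fyy, dx, dy):
    """Second-order differential f_xx dx^2 + 2 f_xy dx dy + f_yy dy^2."""
    return fxx * dx**2 + 2 * fxy * dx * dy + fyy * dy**2

def d2z_completed_square(fxx, fxy, fyy, dx, dy):
    """The same expression after completing the square (needs f_xx != 0)."""
    square_term = fxx * (dx + fxy / fxx * dy) ** 2
    remainder = (fyy * fxx - fxy**2) / fxx * dy**2
    return square_term + remainder

def sign_of_d2z(fxx, fxy, fyy):
    """Apply the sign conditions from the text to the partial derivatives."""
    if fyy * fxx - fxy**2 > 0:
        return "positive" if fxx > 0 else "negative"
    return "not clear"

# Compare both forms on random (made-up) values; skip f_xx close to 0.
random.seed(0)
max_gap = 0.0
for _ in range(1000):
    fxx, fxy, fyy, dx, dy = (random.uniform(-5, 5) for _ in range(5))
    if abs(fxx) < 0.1:
        continue
    gap = abs(d2z(fxx, fxy, fyy, dx, dy)
              - d2z_completed_square(fxx, fxy, fyy, dx, dy))
    max_gap = max(max_gap, gap)

print(max_gap)  # largest difference between the two forms (rounding error only)
print(sign_of_d2z(2, 1, 3), sign_of_d2z(-2, 1, -3), sign_of_d2z(1, 2, 1))
```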

Appendix 2
Analogously to the optimization of a function of two arguments without a constraint, in the case of
a constraint the function has to be increasing with respect to both arguments before the maximum
point and decreasing after it. Before the minimum point, the function should be decreasing, and
increasing after it. Hence, at the maximum point the differential of the function has to change
from positive to negative, and thus the second-order differential has to be negative: $d^2z < 0$. At the
minimum point it has to be positive: $d^2z > 0$. Here it has to be taken into account that the value of
the constraint function has to be constant: $dg = 0$.

Because of the constraint, $dx$ and $dy$ are related, and thus $d^2z$ and the second-order conditions are
different. Let us make some transformations to see how $d^2z$ is now expressed with the help of
partial derivatives.
For the constraint to be satisfied, $dx$ and $dy$ have to be related in the following way:
$$dg = g_x dx + g_y dy = 0.$$

While without any constraint $dx$ and $dy$ can take any values, now, once some value of
$dx$ is chosen, there is a corresponding $dy$:
$$dy = -\frac{g_x}{g_y}dx.$$
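Earlier, when looking at the case without a constraint, the formula for the second-order differential
was:
$$\begin{aligned}
d^2z = d(dz) &= \frac{\partial(dz)}{\partial x}dx + \frac{\partial(dz)}{\partial y}dy \\
&= \frac{\partial(f_x dx + f_y dy)}{\partial x}dx + \frac{\partial(f_x dx + f_y dy)}{\partial y}dy.
\end{aligned}$$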
Now, when taking derivatives with respect to $x$ and $y$, $dy$ also has to be viewed as a variable that
depends on $g_x$ and $g_y$ and hence, through them, also on $x$ and $y$. Using the rule for the
derivative of a product and regrouping, we get:
$$\begin{aligned}
d^2z &= \left(f_{xx}dx + f_{yx}dy + f_y\frac{\partial(dy)}{\partial x}\right)dx + \left(f_{xy}dx + f_{yy}dy + f_y\frac{\partial(dy)}{\partial y}\right)dy \\
&= f_{xx}dx^2 + 2f_{xy}dx\,dy + f_{yy}dy^2 + f_y\left(\frac{\partial(dy)}{\partial x}dx + \frac{\partial(dy)}{\partial y}dy\right).
\end{aligned}$$
The difference from the case without a constraint lies in the last term. The expression in brackets is
the second-order differential of $y$ ($d^2y$), hence:
$$d^2z = f_{xx}dx^2 + 2f_{xy}dx\,dy + f_{yy}dy^2 + f_y d^2y.$$
With the help of analogous transformations we can get the second-order differential of the
constraint function (this second-order differential has to be equal to 0, as the second-order
differential of the constant $b$ is 0):
$$d^2g = g_{xx}dx^2 + 2g_{xy}dx\,dy + g_{yy}dy^2 + g_y d^2y = 0.$$

Now we can solve it for $d^2y$:

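$$d^2y = -\frac{g_{xx}}{g_y}dx^2 - 2\frac{g_{xy}}{g_y}dx\,dy - \frac{g_{yy}}{g_y}dy^2$$
and substitute the result into the formula of $d^2z$ and regroup:
$$\begin{aligned}
d^2z &= f_{xx}dx^2 + 2f_{xy}dx\,dy + f_{yy}dy^2 + f_y\left(-\frac{g_{xx}}{g_y}dx^2 - 2\frac{g_{xy}}{g_y}dx\,dy - \frac{g_{yy}}{g_y}dy^2\right) \\
&= \left(f_{xx} - \frac{f_y}{g_y}g_{xx}\right)dx^2 + 2\left(f_{xy} - \frac{f_y}{g_y}g_{xy}\right)dx\,dy + \left(f_{yy} - \frac{f_y}{g_y}g_{yy}\right)dy^2.
\end{aligned}$$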
gy

The quotient $\frac{f_y}{g_y}$ is actually the Lagrange multiplier $\lambda$. Hence, the expressions in brackets turn out to
be the second-order derivatives of the Lagrangian function and we can write more briefly:
$$d^2z = L_{xx}dx^2 + 2L_{xy}dx\,dy + L_{yy}dy^2.$$

For a maximum point, this $d^2z$ has to be negative and for a minimum point positive. Let us make some
transformations to get the conditions that determine the sign of $d^2z$. If we replace $dy$ with the
expression found before, $-\frac{g_x}{g_y}dx$, and regroup, we get:
$$\begin{aligned}
d^2z &= L_{xx}dx^2 + 2L_{xy}dx\left(-\frac{g_x}{g_y}dx\right) + L_{yy}\left(-\frac{g_x}{g_y}dx\right)^2 \\
&= L_{xx}dx^2 - 2L_{xy}\frac{g_x}{g_y}dx^2 + L_{yy}\frac{g_x^2}{g_y^2}dx^2 \\
&= \left(L_{xx}g_y^2 - 2L_{xy}g_x g_y + L_{yy}g_x^2\right)\frac{dx^2}{g_y^2}.
\end{aligned}$$
g y2

As $\frac{dx^2}{g_y^2}$ is positive (a square of a nonzero number is always positive), the sign of $d^2z$ is determined by the sign of
$$L_{xx}g_y^2 - 2L_{xy}g_x g_y + L_{yy}g_x^2.$$
It turns out that this expression, when multiplied by $-1$, is equal to the following determinant:
$$\begin{vmatrix} 0 & g_x & g_y \\ g_x & L_{xx} & L_{xy} \\ g_y & L_{yx} & L_{yy} \end{vmatrix} = 2L_{xy}g_x g_y - L_{xx}g_y^2 - L_{yy}g_x^2.$$
Hence, if this determinant is positive, the point is a maximum, and if it is negative, a minimum.
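
The determinant rule can be illustrated with a small numerical sketch. The two objective functions and the linear constraint used here are hypothetical examples chosen for illustration (they do not come from the text), and the function names are assumptions as well.

```python
def bordered_determinant(gx, gy, lxx, lxy, lyy):
    """Determinant of [[0, gx, gy], [gx, lxx, lxy], [gy, lxy, lyy]],
    expanded along the first row (L_yx = L_xy is assumed)."""
    return -gx * (gx * lyy - lxy * gy) + gy * (gx * lxy - lxx * gy)

def d2z_on_constraint(gx, gy, lxx, lxy, lyy, dx=1.0):
    """d2z with dy = -(gx / gy) dx substituted (needs gy != 0)."""
    dy = -gx / gy * dx
    return lxx * dx**2 + 2 * lxy * dx * dy + lyy * dy**2

def classify(det):
    """Second-order condition from the text; the zero case is not covered there."""
    if det > 0:
        return "maximum"
    if det < 0:
        return "minimum"
    return None

# Hypothetical example 1: f = x * y with the constraint x + y = b.
# Then g_x = g_y = 1, L_xx = 0, L_xy = 1, L_yy = 0.
det_max = bordered_determinant(1, 1, 0, 1, 0)

# Hypothetical example 2: f = x^2 + y^2 with the same constraint.
# Then g_x = g_y = 1, L_xx = 2, L_xy = 0, L_yy = 2.
det_min = bordered_determinant(1, 1, 2, 0, 2)

print(det_max, classify(det_max))
print(det_min, classify(det_min))
```

With $dx = 1$ and $g_y = 1$, `d2z_on_constraint` returns the determinant with the opposite sign, which is the relation derived above.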


Appendix 3
Here, the formula for finding the time-dependent part of the general solution of a first-order linear
difference equation (the complementary function) is derived. If we denote the complementary
function by $z_t$, we can express the general solution of the difference equation as the sum of the
equilibrium value and the complementary function: $y_t = y^* + z_t$. In period $t-1$ the general
solution is then $y_{t-1} = y^* + z_{t-1}$. Now we can replace $y_t$ and $y_{t-1}$ with these expressions in the
difference equation $y_t = ay_{t-1} + b$:
$$y^* + z_t = a(y^* + z_{t-1}) + b.$$
Next, we can subtract $y^*$ from both sides. On the right side we subtract $y^*$ in the form
of $ay^* + b$ (we know that $y^* = ay^* + b$). The result shows how the complementary function
depends on its value in the previous period:
$$z_t = az_{t-1}.$$
Moving back in time, we can also write $z_{t-1} = az_{t-2}$, $z_{t-2} = az_{t-3}$ and so on. Hence, the
complementary function can be expressed through its value two periods ago:
$$z_t = az_{t-1} = a(az_{t-2}) = a^2z_{t-2},$$
and thus:
$$z_t = az_{t-1} = a^2z_{t-2} = \dots = a^tz_{t-t}$$
or
$$z_t = a^tz_0.$$
In period 0 the general solution is $y_0 = y^* + z_0$, so $z_0$ can be found as the difference between the
initial value and the equilibrium value of $y$: $z_0 = y_0 - y^*$. Hence, the formula for the
complementary function is:
$$z_t = (y_0 - y^*)a^t.$$
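
The formula can be compared with a direct period-by-period iteration of the difference equation. This is a minimal sketch: the values $a = 0.5$, $b = 10$ and $y_0 = 4$ are made up for illustration, and the equilibrium value is computed as $y^* = b/(1-a)$, which follows from $y^* = ay^* + b$ when $a \neq 1$.

```python
import math

def iterate(a, b, y0, periods):
    """Apply y_t = a * y_(t-1) + b step by step, starting from y_0."""
    path = [y0]
    for _ in range(periods):
        path.append(a * path[-1] + b)
    return path

def general_solution(a, b, y0, t):
    """y_t = y* + (y_0 - y*) a^t with y* = b / (1 - a) (needs a != 1)."""
    y_star = b / (1 - a)
    return y_star + (y0 - y_star) * a**t

# Hypothetical parameter values chosen only for illustration.
A_COEF, B_COEF, Y0 = 0.5, 10.0, 4.0
path = iterate(A_COEF, B_COEF, Y0, 10)
formula = [general_solution(A_COEF, B_COEF, Y0, t) for t in range(11)]

matches = all(math.isclose(p, f) for p, f in zip(path, formula))
print(matches)
```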

Appendix 4
Here, the formula for finding the complementary function of the general solution of a first-order
linear differential equation is derived. If we denote the complementary function by $z(t)$, we can express the
general solution of a differential equation as:
$$y(t) = y^* + z(t).$$
The derivative $\frac{dz}{dt}$, which describes how the complementary function behaves over time, also
describes how $y$ behaves over time. That can be shown by differentiating both sides of the equation
$y(t) = y^* + z(t)$ (the derivative of the constant $y^*$ is 0):
$$\frac{dy}{dt} = 0 + \frac{dz}{dt}.$$
In order to bring out the time-dependent part of the solution, we can take derivatives of both sides of the
differential equation $\frac{dy}{dt} + ay = b$ with respect to time:

$$\frac{d^2y}{dt^2} + a\frac{dy}{dt} = 0.$$
As we know that $\frac{dz}{dt} = \frac{dy}{dt}$, we can write:
$$\frac{d^2z}{dt^2} + a\frac{dz}{dt} = 0.$$
To find $z(t)$, we integrate this equation with respect to time:
$$\int\frac{d^2z}{dt^2}\,dt + a\int\frac{dz}{dt}\,dt = \int 0\,dt.$$
The result is (as we want to find the time-dependent part of the solution, the constants of integration are
left out):
$$\frac{dz}{dt} + az = 0.$$
After transforming we get:
$$\frac{1}{z}\frac{dz}{dt} = -a.$$
Next, we can integrate this equation with respect to time: $\int\frac{1}{z}\frac{dz}{dt}\,dt = -\int a\,dt$. The left-side integral
can also be written as $\int\frac{1}{z}\,dz$. According to the rules of integration:
$$\ln z = -at + C.$$
To find $z$, we can use the inverse operation of taking the logarithm: $e^{(\cdot)}$. To make the result
clearer, we can make some transformations. We can replace the constant of integration $C$ with $\ln A$
($A$ is a constant as well). Without changing the essence of the equation, the expression $at$ can be
multiplied by the number 1 in the form of $\ln e$. The equation then takes the form:
$$\ln z = \ln A - at\ln e.$$
According to the rules of logarithms:
$$\ln z = \ln\frac{A}{e^{at}}.$$
Hence:
$$z(t) = \frac{A}{e^{at}} = Ae^{-at}.$$
If $t = 0$, then the complementary function is $z(0) = Ae^{-a \cdot 0} = A \cdot 1 = A$ and the general solution is
then $y(0) = y^* + z(0) = y^* + A$. We can see that the constant $A$ can be viewed as the difference
between the initial and equilibrium values: $A = y(0) - y^*$.
Hence, the formula for the complementary function is:
$$z(t) = z(0)e^{-at}.$$
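
As a rough check, the general solution $y(t) = y^* + z(0)e^{-at}$ can be compared with a simple numerical integration of the differential equation. The sketch below rests on several assumptions: the values $a = 0.5$, $b = 10$ and $y(0) = 4$ are made up, the equilibrium value is taken as $y^* = b/a$ (obtained by setting $\frac{dy}{dt} = 0$), and the explicit Euler method with a small step is used only as an approximate reference.

```python
import math

def general_solution(a, b, y0, t):
    """y(t) = y* + (y(0) - y*) e^(-a t), with y* = b / a (needs a != 0)."""
    y_star = b / a
    return y_star + (y0 - y_star) * math.exp(-a * t)

def euler_endpoint(a, b, y0, t_end, steps):
    """Integrate dy/dt = b - a * y with the explicit Euler method."""
    h = t_end / steps
    y = y0
    for _ in range(steps):
        y += h * (b - a * y)
    return y

# Hypothetical parameter values chosen only for illustration.
A_COEF, B_COEF, Y0 = 0.5, 10.0, 4.0
numeric = euler_endpoint(A_COEF, B_COEF, Y0, t_end=2.0, steps=20000)
exact = general_solution(A_COEF, B_COEF, Y0, 2.0)

print(numeric, exact)  # the two values should agree up to the Euler step error
```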
