ECON 1202 & ECON 2291

Quantitative Analysis for Business and Economics
Lecture Book

2013

School of Economics
The University of NSW

Based on Quantitative Methods A: Lecture Book, first edition written and produced by Simon Angus, School of Economics, University of NSW, February-May 2006.
Revisions and additional materials prepared by Christopher Bidner, Loretti Isabella Dobrescu, Kevin Fox and Judith Watson.
Typeset in LaTeX 2e, a document preparation language, using typeface Computer Modern. Significant packages supporting this document include beamer and pstricks.
© 2010 by the School of Economics, UNSW; Sydney, Australia. All rights reserved. No part of this document may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without written permission from the School.

Contents

1 Introducing QABE
1.1 Introduction
1.2 Learning in QABE
1.3 Miscellany
1.4 Functions of One Variable
1.5 What is a Function?
1.6 Functions
1.7 Special Functions

2 Time Value of Money
2.1 Introduction
2.2 A Problem
2.3 Option 1: Simple Interest
2.4 Option 2: Compound Interest
2.5 Option 3: Continuously Compounded Interest
2.6 Summary
2.7 Present Value

3 Evaluating Time-Money Choices
3.1 Introduction
3.2 Equations of value
3.3 Net Present Value
3.4 Internal Rate of Return
3.5 Summary

4 Geometric Progressions and Annuities
4.1 Introduction
4.2 Geometric Progressions
4.3 Annuities
4.4 Annuities Due

5 Matrices I: Maths by Arrangement
5.1 Introduction
5.2 Terminology
5.3 Operations
5.4 Matrix Multiplication

6 Matrices II: The Inverse & Determinant in Small Matrices
6.1 Introduction
6.2 The Inverse
6.3 Determinant Excursus
6.4 The Inverse Really
6.5 On Linear Equations

7 Matrices III: Matrix Algebra & Automatic-Matrices!
7.1 Introduction
7.2 More on Matrix Algebra
7.3 Matrices on Computers

8 Probability I: Permutations and Combinations
8.1 Introduction
8.2 Why Counting?
8.3 Basic Counting
8.4 Permutations
8.5 Combinations
8.6 Summary

9 Probability II: Probability in action
9.1 Introduction
9.2 Probability Trees
9.3 Rules of Probability
9.4 Bayes Formula

10 Markov Chains
10.1 Introduction
10.2 The Basics
10.3 Transitions, Regularity and States
10.4 Markov Chains in Game Theory

11 Linear Programming I: Solving problems in a world of constraints
11.1 Introduction
11.2 The Business Headache
11.3 Introduction to Linear programming
11.4 Linear Programming

12 Linear Programming II: Dealing with a Changing World
12.1 Introduction
12.2 Linear programming usual suspects
12.3 Variations in the LP problem

13 Linear Programming III: Using Solver
13.1 Introduction
13.2 The Problem
13.3 Using Solver
13.4 Changing the Objective Function: Multiple Solutions
13.5 Summary

14 Differentiation: Responding to Change
14.1 Introduction
14.2 Limits
14.3 Rates of change
14.4 Differentiation

15 Differentiation II: Tricks and Extensions
15.1 Introduction
15.2 Implicit Differentiation
15.3 Logs and Exponentials
15.4 Higher Order Derivatives
15.5 Elasticity of Demand

16 Differentiation III: Optimization in one Variable
16.1 Introduction
16.2 Extrema of Functions
16.3 Applied to the Problem

17 Integral Calculus: Unlocking Economic Dynamics
17.1 Introduction
17.2 Why Integration?
17.3 The Indefinite Integral
17.4 The Definite Integral

18 Differential Equations & Growth I
18.1 Introduction
18.2 Differential Equations
18.3 Topics in Growth

19 Differential Equations & Growth II
19.1 Introduction
19.2 Limited Growth
19.3 Logistic Growth
19.4 Appendix: An Integration Workout

20 Multivariable Calculus: The Partial Derivative
20.1 Introduction
20.2 Functions of two-variables
20.3 The Partial Derivative Method
20.4 Methods of Partial Differentiation

21 Multi-variable Optimisation
21.1 Introduction
21.2 Unconstrained optimisation
21.3 Constrained optimisation

22 Applications of Constrained Optimisation
22.1 Introduction
22.2 Economic Applications
22.3 Interpreting Lagrange Multipliers
22.4 The Power of the Method: A Preview

Lecture 1

Introducing QABE

1.1 Introduction

Welcome to Quantitative Analysis for Business and Economics (QABE). QABE deals with the fundamentals of mathematics for business and economics. It replaces the old course Quantitative Methods A (QMA), and reflects our goal of continual course improvement. For those of you who have studied QMA in the past, there are similarities but also differences between the two courses.
You may be wondering if all the material taught in QABE will be applicable to you. The short answer is that all of it will be applicable, but not always immediately. That is, the course material has been chosen to reflect the core mathematical skills that you will need for further study in the quantitative courses, particularly those taught by the School of Economics. We have students studying marketing and hospitality, through to economics and econometrics. This spectrum of disciplines requires a range of mathematical tools. For this reason, we will try to point to applications of our mathematics for your further studies wherever possible, drawing on business and economic scenarios and problems to bring out the relevance of the techniques we'll be learning.
Finally, so that we can keep this course improving all the time, we'd appreciate your feedback. So if you have an idea for an improvement, please send an email to the lecturer-in-charge to let us know.
All the best with your studies. We hope that you enjoy the course!
Agenda
1. Introductions;
2. How do I learn in QABE?;
3. Assessment;
4. Further help;
5. A few notes on studying at university;
6. Functions of one variable.

Introductions
1. Who is your lecturer?
2. Who is the lecturer-in-charge?
3. Who is your tutor?

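1.2 Learning in QABE

An overview

[Figure: overview diagram connecting Lectures, Textbooks, Tutorials and the assessments (Quizzes, Assignment, Tests, Final) to Understanding, with the links labelled introducing, clarifying, discussing, deepening, correcting and testing.]

1.2.1 Tuition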

lectures (2 hr per week): Introduce and emphasise key points from the course, see worked examples, ask one or two questions; prepare by reading the lecture notes and reading over the reference chapters;
tutorials (1 hr per week from week 1): Core place of learning, developing understanding, making mistakes, asking many questions;
pitstop (many hrs per week from week 3): Back-up for tutorials, further explanation, further inquiry;
consultation (3 hrs per week): Clarifying lecture material, discussing course-program related issues (LIC), focussed tuition.

Computer Labs
No in-lab computing assessment;
Labs are for:
Practicing/learning Excel;
Working on your Assignments;
Completing online-quizzes
Labs are booked for use by QABE students: see the course website for details.

1.2.2 Materials

online Go to
http://lms-blackboard.telt.unsw.edu.au
... click on ECON1202/2291 - Quantitative Analysis for Business and Economics. Lecture notes, course-outline, past exams, contact information;
textbooks
1. (required) Haeussler, Paul & Wood, Introductory Mathematical Analysis: for Business, Economics and the Life and Social Sciences, Addison Wesley, 12th Edition, 2008. (HPW);
2. (strongly recommended) Knox, Zima & Brown, Mathematics of Finance, McGraw-Hill Book Company, 2nd Edition, 1999. (KZB);
3. (other) see course outline.

More help?
pass (many hrs per week from week 3) Peer Assistance Support Scheme: Peer
assisted study groups, run by second and third year students;
education development unit (edu) (Australian School of Business) Learning and
language support; workshops etc., Room G07, Ground Floor, ASB Building,
West Lobby;
the learning centre (UNSW) Free and confidential learning support for students;
iLecture Podcasts in .mp3 or live-streaming (QuickTime, Windows Media Player).

Is this for you?

Assumed knowledge A level of knowledge equivalent to achieving a mark of at least 60 in HSC Mathematics. Students who have taken General Mathematics will not have achieved the level of knowledge which is assumed for this course.
From the (intro) Calculus lectures...
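[Sample expressions from the calculus lectures: derivatives with respect to x of expressions involving square roots and exponentials such as $e^{ax}$ and $e^{2x}$, a definite integral from a to b, and an indefinite integral, stated for $x \neq 0$.]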

Refresher Resources
See revision text in the Reserve section of the library (also available at the UNSW
bookshop):
Managing Mathematics: A Refresher Course for Economics and
Commerce Students by Judith Watson, 2nd Edition, 2002.

1.3 Miscellany

1.3.1 On the lecture materials

Using lecture resources

Lecture notes are available from the website. They will be available shortly in book form, which can be downloaded or purchased from the UNSW Bookshop;
In the notes look for chapter references in the margins (marked like this: text ref. here!);
At the end look for key words of interest that you should revise;
Note special text like,
Definition | The fundamental theorem of first-year
The amount of work w undertaken by a student is inversely related to the difference between the total session time T and time elapsed in the session t,
$w(t) \propto \frac{1}{T - t}$ (1.1)
Examples appear in the notes with a box for working,
Example:
The world is experiencing exponential growth in population, but declining
economic stocks of energy, fresh water and food. Solve.

Or words of caution to make sure you don't fall into common traps,

The Fundamental Theorem of First-year is fundamental for a reason.

1.3.2 This is NOT the Course Outline

Read the Course Outline!

Check (and re-check) the course-outline for information provided and more (see list below).
Special consideration (e.g. illness);
Student misconduct and plagiarism policy;
Contact details of key people;
Syllabus: what we'll be studying, with chapter references.

1.3.3 Studying at university

Some advice
1. Attend classes: the 'turning up' philosophy to education;
2. Use a diary/palm-pilot/organiser/calendar/scraps-of-paper: write in assessments, put down reminders;
3. Make a habit of opening the textbook and reading it weekly;
4. Ask questions, lots of them;
5. Introduce yourself to someone else in a tutorial or lab.

1.4 Functions of One Variable

To begin with, we will go back over some fundamental concepts and terminology of the vast world of functions. After that, we will meet some particular kinds of functions that will keep cropping up in this course, and most likely, in the rest of your studies.
Note: Notation We will use a particular notational convention for functions, such as in the following example:
$f(x) = x^2 + 4$.
It should be noted that there is nothing special about f (or x for that matter), they are just labels. As we note below, we could just as correctly have chosen to name all of our representative functions 'blah' with input 'words' (giving, say, blah(words), which would be read, 'the function blah of words'). However, this might get confusing, and so we will follow the very established convention of using the function title f(x), or perhaps y(x).

Agenda
1. Function review;
2. Special functions;
3. Exponential and Logarithms;
4. Limits.

1.5 What is a Function?

Definition | Function
A function is a rule that assigns to each input number exactly one output number.

HPW 2.1

Example: A linear function


Consider what is meant by the simple linear function f (x) = 1 + 0.5x.
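
One way to see the 'rule' idea at work is to write the linear function as a short program. This is only a sketch in Python (the language is our choice for illustration, not part of the course materials): each input goes in, and exactly one output comes back.

```python
def f(x):
    """The linear rule from the example: f(x) = 1 + 0.5x."""
    return 1 + 0.5 * x

# Each input number is assigned exactly one output number.
for x in [-2, 0, 2, 4]:
    print(f"f({x}) = {f(x)}")
```

Changing the name f, or the label x, would change nothing about the rule itself.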

1.5.1 Some Definitions

1. The name of the function is irrelevant. Consider,
$\mathrm{fish}(\mathrm{shrimp}) = \frac{\mathrm{shrimp} - 1}{2}$;
... still a valid function!

2. Often we talk in terms of dependent and independent variables, or alternatively, in terms of the value and argument respectively:
Example: Dependent, Independent
Identify the dependent and independent terms, and the value and argument of the function $H(a, b) = a^2 + 2b + 3$.

1.5.2 Functions are not functions!

1. Functions are part of a broader class called relations. Functions are the special case in which each input value gives exactly one output value.

2. For this reason, they are also called a mapping, or a transformation.

One of these is a function, one isn't!

[Figure: two panels on axes from -3 to 3, the first labelled $f(x) = x^2$ and the second labelled $f(x) =$ ...]

1.5.3 Domain and Range

Definition | Domain and Range
The domain of a function is the set of all x values over which the function makes sense (works!). The range of a function is the set of all possible f(x) values, given the domain.

Example:
Find the domain of the function, $y(x) = \frac{2}{x^2 + 3x - 4}$.
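
A quick numerical check of this kind of domain question can be sketched as follows. We assume the denominator reads $x^2 + 3x - 4$ (the minus sign is our reading of the example), and the helper name denominator_roots is ours, purely for illustration. The function 'works' everywhere except where the denominator is zero.

```python
import math

def denominator_roots(a, b, c):
    """Real roots of a*x**2 + b*x + c = 0, via the quadratic formula."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []  # no real roots: the denominator never vanishes
    root = math.sqrt(disc)
    return sorted({(-b - root) / (2 * a), (-b + root) / (2 * a)})

def y(x):
    # Assumed form of the example: y(x) = 2 / (x^2 + 3x - 4)
    return 2 / (x**2 + 3 * x - 4)

excluded = denominator_roots(1, 3, -4)
print("x values excluded from the domain:", excluded)
print("y(0) =", y(0))
```

Under that reading, the domain is every real number other than the excluded values printed above.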

1.6 Functions

1.6.1 Common Functions

HPW 2.2

Definition | The Constant Function
A constant function is of the form:
$f(x) = c$
where c is a constant.

[Figure: plot of $f(x) = 2$ on axes from -3 to 3.]

Definition | The Polynomial Function
A polynomial function is of the form:
$f(x) = c_n x^n + c_{n-1} x^{n-1} + \cdots + c_1 x^1 + c_0$
where $c_n, \ldots, c_0$ are constants.

[Figure: plot of $f(x) = x^2$ on axes from -3 to 3.]

Definition | The Rational Function
A rational function is of the form:
$f(x) = \frac{p_1(x)}{p_2(x)}$
where $p_1$ and $p_2$ are polynomial functions.

[Figure: plot of $f(x) = \frac{x^2 - 6}{x + 6}$ on axes from -3 to 3.]

Definition | The Absolute Function
An absolute function is of the form:
$f(x) = |g(x)|$
where g(x) is some function and $|\cdot|$ indicates the absolute (non-negative) value.

[Figure: plot of $f(x) = |x|$ on axes from -3 to 3.]

1.6.2 Combining Functions

Suppose we have two functions, $f(x) = 3x + 2$ and $p(x) = x^3 - 3$; then we will be interested to solve: $f(x) + p(x)$, or $f(x) - p(x)$, or $f(x) \cdot p(x)$, or even $\frac{f(x)}{p(x)}$.
Definition | Function Combination
In general, we have,
sum: $(f + g)(x) = f(x) + g(x)$,
difference: $(f - g)(x) = f(x) - g(x)$,
product: $(fg)(x) = f(x) \cdot g(x)$,
quotient: $\left(\frac{f}{g}\right)(x) = \frac{f(x)}{g(x)}$ for $g(x) \neq 0$.
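
The four combinations can be tried out directly. The sketch below assumes the two functions above read f(x) = 3x + 2 and p(x) = x^3 - 3 (the minus sign is our reading), and the helper name combine is ours for illustration only.

```python
def f(x):
    return 3 * x + 2

def p(x):
    return x**3 - 3  # assumed reading of p(x) = x^3 - 3

def combine(f, g):
    """Return the sum, difference, product and quotient functions of f and g."""
    return {
        "sum": lambda x: f(x) + g(x),
        "difference": lambda x: f(x) - g(x),
        "product": lambda x: f(x) * g(x),
        "quotient": lambda x: f(x) / g(x),  # only defined where g(x) != 0
    }

ops = combine(f, p)
x = 2
print({name: op(x) for name, op in ops.items()})
```

Note that each combination is itself a function of x: we evaluate f and p at the same input and then combine the two outputs.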

Example: Combining functions
Suppose $f(x) = 2x^2 - 3x - 2$ and $g(x) = x - 2$, and let $h(x) = \frac{f}{g}(x)$, then show that $(h - g)(x) = x + 3$.

1.6.3 Composite Functions

HPW 2.3

Now suppose we don't want a combination, but we want to construct a process of more than one function,
$x \to f(x) \to g(y) \to z$
that is,
$x \to g(f(x))$

Definition | Composite Function
If f and g are functions, the composite function of f and g is the function $f \circ g$,
$(f \circ g)(x) = f(g(x))$,
and the domain of $f \circ g$ is the set of values of x in the domain of g such that g(x) is in the domain of f.

Example: Composite functions
Let $p(x) = x^2 - 2$, and $h(x) = 5x + 1$ (for $x \ge 0$). Find $(p \circ h)(2)$.
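
Composition is easy to experiment with on a computer. The sketch below assumes the example reads p(x) = x^2 - 2 and h(x) = 5x + 1 for x >= 0 (our reading of the signs and the inequality); the helper name compose is ours, not from the texts.

```python
def compose(f, g):
    """Return the composite f o g, i.e. the function x -> f(g(x))."""
    return lambda x: f(g(x))

def p(x):
    return x**2 - 2  # assumed reading of p(x) = x^2 - 2

def h(x):
    if x < 0:
        raise ValueError("h is only defined for x >= 0")
    return 5 * x + 1

p_of_h = compose(p, h)  # apply h first, then p
h_of_p = compose(h, p)  # apply p first, then h

print("(p o h)(2) =", p_of_h(2))
print("(h o p)(2) =", h_of_p(2))
```

The two results differ, which is a useful reminder that the order of composition matters.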

1.7 Special Functions

1.7.1 Inverse

HPW 2.4

Now suppose that instead of,
$x \to g(f(x))$
we want to go back the other way, that is, from the output back to x,
or in other words, if
$(f \circ g)(x) = f(g(x)) = z$,
then what we are after is the function,
$(g^{-1} \circ f^{-1})(z) = g^{-1}(f^{-1}(z)) = x$,
where $g^{-1}$ is the inverse function of g.

Example:
Suppose $f(x) = \frac{x^2 + 1}{5}$, find $f^{-1}(x)$.

Let's try that out, suppose x = 2:
$f(x) = f(2) = \frac{(2)^2 + 1}{5} = 1$
... and the other way around,
$f^{-1}(x) = f^{-1}(1) = \pm\sqrt{(5)(1) - 1} = \pm 2$
???!!! we received two answers back: +2, or -2

A function has an inverse if and only if it is a one-to-one function.

Definition | One-to-one Function
A function is one-to-one if for all a and b, if $a \neq b$, then $f(a) \neq f(b)$.
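
The trouble met in the example can be reproduced numerically. In this sketch, f(x) = (x^2 + 1)/5 as in the example; the restriction to x >= 0 and the helper names are our own assumptions, used only to show how a one-to-one piece of f does have an inverse. The spot-check looks at a handful of points only, so it can expose a failure but cannot prove that a function is one-to-one.

```python
import math

def f(x):
    return (x**2 + 1) / 5

def f_inverse_nonneg(y):
    """Inverse of f restricted to x >= 0 (an assumed restriction)."""
    return math.sqrt(5 * y - 1)

def is_one_to_one_on(func, points):
    """True if no two of the sample points share an output."""
    outputs = [func(x) for x in points]
    return len(set(outputs)) == len(outputs)

print(f(2), f(-2))                             # two inputs, one output
print(is_one_to_one_on(f, [-2, -1, 0, 1, 2]))  # fails on the whole line
print(is_one_to_one_on(f, [0, 1, 2]))          # passes on x >= 0
print(f_inverse_nonneg(f(2)))                  # recovers the input 2
```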
Note: Inverse or reciprocal? You'll have noted that the way that we represent the inverse of a function, $f^{-1}(x)$, looks a lot like how we might represent the reciprocal of a number, $x^{-1}$. So the question is, how do I know what is being talked about?
The context will be most helpful, and the way that the inverse is presented should give some indication. For instance, to represent the reciprocal of the function f(x) we would normally write,
$[f(x)]^{-1} = \frac{1}{f(x)}$,
whereas when we just want the inverse we would write,
$f^{-1}(x)$,
see the difference? However, suppose you were confronted with,
$z(x) = x^2 + 2x + 4f^{-1}x$,
what would you understand this to mean? It is clearly a bit ambiguous, with the ambiguity due to the lack of any indication of whether f is a function (which has been written in the equation without its input value), or if it is just another variable that is taken reciprocally ($\frac{1}{f}$).
To avoid such ambiguity, good practice is always to make the inputs to functions very clear (write them in), unless there are many inputs, in which case, make it clear that you are just going to write the function name ('for convenience, we shall write f(a, b, c, d, e, f) as just f') and be sure not to confuse things in the expression.


1.7.2 Exponential & Logarithmic

HPW 4.1, 4.2

We will have much more to say about exponential functions in coming weeks, since
they provide an easy way to talk about various kinds of time-dependent processes.
Be sure to do a number of exercises in exponential and logarithmic functions since
it is quite likely that either you have forgotten the rules associated with them, or
are meeting them for the first time. It will be of great benefit if the rules and
manipulation of these types of functions come quickly to hand.

Definition | Exponential
$f(x) = a^x$
(A selection of) Important rules:
$a^m a^n = a^{m+n}$
$\frac{a^m}{a^n} = a^{m-n}$
$(a^m)^n = a^{mn}$

[Figure: plot of $f(x) = 2^x$ on axes from -3 to 3.]

Definition | Logarithmic
Where b is the base,
$f(x) = \log_b x$
(A selection of) Important rules:
$\log_b(mn) = \log_b m + \log_b n$
$\log_b\left(\frac{m}{n}\right) = \log_b m - \log_b n$
$\log_b(m^r) = r \log_b m$
$\log_b 1 = 0$

[Figure: plot of $f(x) = \log_{10} x$ on axes from -3 to 3.]

Note: $\log_b x$ is like saying, 'what power must I raise b to, to obtain x?'
The connection between logarithms and exponentials...

Definition | A very nice rule
$\log_b x = y$ corresponds to $b^y = x$
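
If the rules for exponentials and logarithms feel unfamiliar, it can help to check them on actual numbers. The sketch below does this with arbitrary sample values (a = 2, b = 10, m = 3, n = 5, r = 4, all chosen by us); the helper log_b is ours, built from the change-of-base identity. Comparisons use a small tolerance because computers store decimals approximately.

```python
import math

def log_b(x, b):
    """Logarithm of x to base b, via the change-of-base identity."""
    return math.log(x) / math.log(b)

a, b = 2.0, 10.0
m, n, r = 3.0, 5.0, 4.0

exponential_rules = {
    "a^m * a^n = a^(m+n)": math.isclose(a**m * a**n, a ** (m + n)),
    "a^m / a^n = a^(m-n)": math.isclose(a**m / a**n, a ** (m - n)),
    "(a^m)^n = a^(mn)": math.isclose((a**m) ** n, a ** (m * n)),
}
logarithm_rules = {
    "log(mn) = log m + log n": math.isclose(log_b(m * n, b), log_b(m, b) + log_b(n, b)),
    "log(m/n) = log m - log n": math.isclose(log_b(m / n, b), log_b(m, b) - log_b(n, b)),
    "log(m^r) = r log m": math.isclose(log_b(m**r, b), r * log_b(m, b)),
    "log 1 = 0": log_b(1.0, b) == 0.0,
}
for rule, holds in {**exponential_rules, **logarithm_rules}.items():
    print(f"{rule}: {holds}")

# The 'very nice rule': log_b x = y corresponds to b^y = x.
x = 1000.0
y = log_b(x, b)
print(y, b**y)
```

A numerical check on one set of values is not a proof, but it is a quick way to catch a misremembered rule.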

Revise!

1. Go over the lecture notes, chapter refs, tutorial problems, be sure you can do them (not just read them).
2. Go over the lecture notes, chapter refs, tutorial problems, be sure you can do them (not just read them).
3. Go over the lecture notes, chapter refs, tutorial problems, be sure you can do them (not just read them).


Lecture 2

Time Value of Money

2.1 Introduction

This week we begin a section of the course looking at the time value of money. Put simply, this is the maths that we need when dealing with money. The reason we need to develop an understanding of money mathematics (as opposed to, say, just numbers-mathematics) is that numbers representing money have the special property that they are actually representing value. The trouble is, as opposed to (say) the number 3, which represents the same information yesterday, today, and tomorrow, if we were talking about 3 dollars of some currency, we would have to be very careful about when the number was quoted; that is, the value of the representation (3 dollars) is defined in part by the time that it is quoted. Three dollars today wouldn't get you even a couple of cans of Coke; three dollars twenty years ago would have obtained you three cans of Coke.
An obvious question is, why does money have this peculiar time-value property? For now, it is enough to raise it as a question. We start our inquiry into the realm of time-value-of-money by looking at the different ways that interest can be calculated. That is, the value of a deposit (some sum of money) can grow over time in the bank's hands. Part of this is to do with simple time-value considerations, and part of it has to do with various investments the bank has made. More on this later!
Agenda
1. Simple interest
2. Compound interest
3. Continuous compounding
4. Present value

2.2 A Problem

The Problem
We're going overseas (O/S) in 2 years and we need to make as much money as possible before then through the bank, as we don't have any time for work (we're studying). We only have $1,500 currently, but the trip will cost $3,200.

The parameters:
Money in the hand: $1,500
Time available: 2 years
Three banking options:
1. Option 1: Simple Interest
2. Option 2: Compound Interest
3. Option 3: Continuously Compounded Interest

2.3 Option 1: Simple Interest

2.3.1 Terminology

Definition | Principal, Savings, Rate & Interest

KZB 1.1

The amount of money I have at the beginning (to be invested) is called the principal, or P;
The value of the money after some time of investment is the final value, or S;
The rate at which value is added in a time period is the interest rate, or r;
The total value gained due to time-value-of-money is the interest, or I, due to me.
So in our case, we have,
P = $1,500
S = $3,200 (our aim)
r = t.b.d.
I = t.b.d.

Note: On S, P and A There is no real reason why the final value is given the pronumeral S; my best guess is that it is to do with 'savings' (i.e. the amount that is in the account after some time), but this is pretty tenuous. In any case, it is the convention of the financial maths texts (both KZB and HPW use S for the final, or 'saved', value), so we'll use it here.
For the initial value, when we are just doing investment calculations, generally we talk about the Principal P being invested. However, when we are dealing with annuities (later), we talk about the present-value of the Annuity A (not P). In this course, we'll try to follow the texts as best as possible: P for Principal, S for Savings (or future value), and A for present-value of an Annuity.


2.3.2 Working it out

Simple Interest

Definition | Simple Interest
Simple interest is used when the interest is calculated once at the end of the term, that is,
$I = Prt$ (2.1)
where t is the time-period of the calculation.

Example: Simple Interest


Suppose we invest (as in our case), P = $1500, r = 5% p.a. and t = 2 yrs.
How much interest (calculated simply) would we gain? What would be the
final value of our investment?

Example: Reverse simple interest


With the same interest rate as in the previous example, how many years
would it take to gain our target ($3,200)?
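
Both simple-interest questions can be checked with a few lines of code. This is a sketch under the definitions above, taking the final value to be S = P + Prt; the function names are ours and purely illustrative.

```python
def simple_interest(P, r, t):
    """Interest earned under simple interest: I = P * r * t."""
    return P * r * t

def years_to_target_simple(P, S, r):
    """Solve S = P + P*r*t for t."""
    return (S - P) / (P * r)

P, r = 1500.0, 0.05

I = simple_interest(P, r, 2)
final_value = P + I
print(f"interest after 2 years: {I:.2f}, final value: {final_value:.2f}")

t = years_to_target_simple(P, 3200.0, r)
print(f"years needed to reach $3,200: {t:.2f}")
```

The second result makes the point of the problem clear: at 5% simple interest, two years is nowhere near enough.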

2.4 Option 2: Compound Interest

KZB 2.1, HPW 5.1

2.4.1 Terminology

Compound Interest
Suppose that instead of I being paid once at the end of the investment period, the calculation (and payment) of I is made periodically within the period: such a scenario is that of compound interest.


Definition | Compound Interest
Let P be the original value of the investment, then after n interest periods, at periodic rate of interest r, the final value of the investment will be,
$S = P(1 + r)^n$ (2.2)

Nominal rate vs. periodic rate Be careful of the terminology here: if the nominal interest rate is 5% (that is, per annum), but the interest is to be compounded quarterly, then r in (2.2) is actually 0.05/4 = 0.0125 (since the calculation period is quarterly).

Example: Compound Interest


Find the final value of a $1000 investment, invested for 5 years at the
nominal rate of 8% compounded quarterly.
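
The nominal-to-periodic conversion is where most slips happen, so here is a sketch that makes each step explicit for this example. The function and parameter names are ours, chosen for illustration.

```python
def compound_value(P, nominal_rate, periods_per_year, years):
    """Final value S = P * (1 + r)**n under periodic compounding."""
    r = nominal_rate / periods_per_year  # periodic rate
    n = periods_per_year * years         # number of interest periods
    return P * (1 + r) ** n

# $1000 for 5 years at a nominal 8%, compounded quarterly:
# r = 0.08 / 4 = 0.02 per quarter, n = 4 * 5 = 20 quarters.
S = compound_value(1000.0, 0.08, 4, 5)
print(f"S = {S:.2f}")
```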

2.4.2 Working it out
Example: Option 2: reverse compound interest
Returning to our problem, how long would it take to obtain $3,200 with an
initial investment of $1,500, if interest is compounded every two months
at a nominal rate of 5%?
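
To go 'in reverse' we take logarithms of S = P(1 + r)^n, which gives n = log(S/P) / log(1 + r). The sketch below applies this with six compounding periods per year (every two months); the function name is ours, and rounding n up to a whole period is our own assumption that interest is only credited at the end of each period.

```python
import math

def periods_to_target(P, S, r):
    """Solve S = P * (1 + r)**n for n (n need not be a whole number)."""
    return math.log(S / P) / math.log(1 + r)

P, S = 1500.0, 3200.0
periods_per_year = 6          # compounding every two months
r = 0.05 / periods_per_year   # periodic rate from the 5% nominal rate

n = periods_to_target(P, S, r)
whole_periods = math.ceil(n)  # first whole period at which S is reached
print(f"n = {n:.2f} periods, i.e. about {n / periods_per_year:.1f} years")
print(f"first whole period reaching the target: {whole_periods}")
```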


2.5 Option 3: Continuously Compounded Interest

HPW 10.3; KZB App. D

2.5.1 The exponential constant

Clearly, it is better to have interest compounded (that is, calculated and added in) more than once a year. However, what would happen if we asked for compounding to take place every week? every day? every second? continuously??!!!
Suppose we have our normal compound formula,
$S = P(1 + r)^n$
However, to be clear about it, let's define $r_a$ to be the annual rate of interest (the nominal rate), k to be the number of times a year compounding will occur, and t to be the number of years,
$S = P\left(1 + \frac{r_a}{k}\right)^{kt}$
Now, what we are interested in is making k approach $\infty$ (continuous compounding), that is,
$S = \lim_{k \to \infty} P\left(1 + \frac{r_a}{k}\right)^{kt} = P\left[\lim_{k \to \infty}\left(1 + \frac{r_a}{k}\right)^{k/r_a}\right]^{r_a t}$
Let $x = \frac{r_a}{k}$ for simplicity, meaning the limit is now $x \to 0$,
$S = P\left[\lim_{x \to 0}(1 + x)^{1/x}\right]^{r_a t}$
But notice that the limit is actually e, since $e = \lim_{x \to 0}(1 + x)^{1/x}$, that is,
$S = Pe^{r_a t}$
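
The limit can also be watched numerically: as the number of compounding periods per year grows, the compound formula settles down to the exponential one. The sketch below uses the figures from our problem (P = 1500, a nominal rate of 0.05, t = 2 years); the function name and the particular values of k are our own choices.

```python
import math

def compound(P, ra, k, t):
    """S = P * (1 + ra/k)**(k*t): nominal rate ra, k periods a year, t years."""
    return P * (1 + ra / k) ** (k * t)

P, ra, t = 1500.0, 0.05, 2

frequencies = [1, 4, 12, 365, 1_000_000]
values = [compound(P, ra, k, t) for k in frequencies]
for k, S in zip(frequencies, values):
    print(f"k = {k:>9}: S = {S:.4f}")

continuous = P * math.exp(ra * t)
print(f"continuous : S = {continuous:.4f}")
```

Notice how little is gained beyond daily compounding: the values creep up towards the continuous figure but never pass it.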

2.5.2 Working it out

Continuously compounded interest

Which leads to the following definition,
Definition | Continuously compounded interest
If the initial investment is P and the final value of the investment is S, then under continuous compounding at an annual interest rate of r for t years,
$S = Pe^{rt}$ (2.3)

Example: Option 3: Continuous compounding
With P = $1500 and r = 0.05, after two years, what will be the value of
the investment under continuous compounding?

Example: for the record...


With continuous compounding, initial investment of $1,500 and annual interest rate of 0.05, how long would it take to obtain a total investment of $3,200?
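
Both continuous-compounding questions follow from S = Pe^{rt}: the first by direct substitution, the second by taking natural logarithms to get t = ln(S/P) / r. A short sketch, with function names of our own choosing:

```python
import math

def continuous_value(P, r, t):
    """Final value under continuous compounding: S = P * e**(r*t)."""
    return P * math.exp(r * t)

def years_to_target_continuous(P, S, r):
    """Solve S = P * e**(r*t) for t."""
    return math.log(S / P) / r

S_two_years = continuous_value(1500.0, 0.05, 2)
t_needed = years_to_target_continuous(1500.0, 3200.0, 0.05)
print(f"value after 2 years: {S_two_years:.2f}")
print(f"years needed to reach $3,200: {t_needed:.2f}")
```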

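2.6 Summary

To summarize...
[Figure: final value S ($, axis from 0 to 5000) against t (years) for the three options: $S = Pe^{rt}$ (continuous), $S = P\left(1 + \frac{r}{n}\right)^{nt}$ (compound) and $S = P + Prt$ (simple).]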

2.7 Present Value

The principal, P, can be thought of as the present value of a future value S. A simple reorganisation of our equations for the future value of P yields the following.

Definition | Present Value (periodic)
To obtain a compound amount of value S which has been maturing at the periodic rate of r for n periods, one needs to invest the starting amount, or principal,
$P = S(1 + r)^{-n}$ (2.4)
otherwise called the present value of S.
Definition | Present Value (continuous)
To obtain a compound amount of value S which has been maturing continuously at the nominal rate of r for t years, one needs to invest the starting amount, or principal,
$P = Se^{-rt}$ (2.5)
otherwise called the present value of S.
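
As a quick illustration of the two present-value formulas, the sketch below asks how much we would need today to have our $3,200 in two years. The figures are assumptions for illustration only (5% compounded once a year for the periodic case, 5% nominal for the continuous case), and the function names are ours.

```python
import math

def present_value_periodic(S, r, n):
    """P = S * (1 + r)**(-n): periodic rate r, n periods."""
    return S * (1 + r) ** (-n)

def present_value_continuous(S, r, t):
    """P = S * e**(-r*t): nominal rate r, t years."""
    return S * math.exp(-r * t)

S = 3200.0
pv_periodic = present_value_periodic(S, 0.05, 2)
pv_continuous = present_value_continuous(S, 0.05, 2)
print(f"present value, annual compounding:     {pv_periodic:.2f}")
print(f"present value, continuous compounding: {pv_continuous:.2f}")

# Growing each present value forward again should return S.
print(pv_periodic * (1 + 0.05) ** 2, pv_continuous * math.exp(0.05 * 2))
```

The continuous figure is the smaller of the two, because money grows a little faster under continuous compounding and so less is needed at the start.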


Lecture 3

Evaluating Time-Money Choices

3.1 Introduction

We spend a little more time looking at how the value of money is actually dependent on the time at which it is in our hand. That is, as before, we note that there exists a time value of money; would you prefer $100 in your hand today, or $101 tomorrow?
There are various explanations for this fact. As you might have reflected above, 'I'd prefer the money today, since I'm not sure what will happen tomorrow'; that is the uncertainty of our lives kicking in. It is just one of the factors that contribute to the apparent value of money changing through time.
We then move on to applying some of our new-found skills in time-value-of-money to one of the most common problems faced in the commercial world, namely, the problem of deciding whether or not to go ahead with a project, or, if there is a choice between projects, deciding which of the projects to fund.
The reason it is so common (if not already obvious) is that in practically every business situation, one has to spend money to make money; that is, make an initial investment (called capital) and, after time, begin to yield some kind of income stream from the business. The big competition that then ensues is between the money invested in your project and the money invested in the bank at some going rate: whichever gives the best return will decide how you will proceed. However, the question of whether one project is actually worth it on its own, or in comparison to another, is not always immediately obvious. Not to worry: by applying what we have learnt up till now, we have the tools to become experts on these issues!
The methods we will develop are two of the most common: the net present value of the project on the one hand, and the internal rate of return on the other. The two methods are closely connected, but their interpretation requires some care.
Agenda
1. Equations of value;
(a) Under simple interest;
(b) Under compound interest;
2. A campus investment conundrum;
3. Looking at the numbers I (the NPV);
4. Looking at the numbers II (the IRR);
5. Calculating with a computer;
6. Conclusions.

3.2

Equations of value

KZB 1.4, 2.6

Scenario
You owe your parents some money. At present, you owe them $500 to be paid in 6 months and $350 in 9 months. You don't want to do installments; you'd prefer to pay $100 now and the rest in 12 months' time. In negotiations, they agree to consider either simple or compounded (quarterly) interest. What will you do?

3.2.1

Simple Interest

Solution technique:
1. Work out the timings;
2. Using the focal date bring all the payments and debts to it;
3. Set up the equation of value;
4. Solve.
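[Timeline figure: the debts of $500 and $350 and the payments of $100 now and the balance at 12 months, laid out on a t (months) axis, with focal dates marked at now and at 12 months.]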

Example:
Using a focal date of now or in 12 months, and simple interest at the
nominal value of 7%, what would be the single sum you owe?

Checking the two payments (now = $715.63, 12 months = $766.63), the value now of the 12 month payment is,
\[ P = 766.63(1 + 0.07)^{-1} = \$716.48 , \]
which does not match the $715.63 found with the focal date at now.

When using simple interest in equations of value, the focal date must be agreed beforehand, since it will affect the total value exchanged.
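The four solution steps can be sketched in code for both focal dates. This is a minimal illustration, assuming the 7% nominal rate applies as simple interest over fractions of a year (6 months = 0.5, 9 months = 0.75); the helper names are hypothetical.

```python
def simple_accumulate(amount, rate, years):
    """Move an amount forward in time under simple interest."""
    return amount * (1 + rate * years)


def simple_discount(amount, rate, years):
    """Move an amount backward in time under simple interest."""
    return amount / (1 + rate * years)


rate = 0.07

# Focal date = now: bring both debts back to t = 0, then take off the $100 paid now.
debts_at_now = simple_discount(500, rate, 0.5) + simple_discount(350, rate, 0.75)
owed_focal_now = debts_at_now - 100

# Focal date = 12 months: carry the debts and the $100 payment forward to t = 12 months.
debts_at_12 = simple_accumulate(500, rate, 0.5) + simple_accumulate(350, rate, 0.25)
owed_focal_12 = debts_at_12 - simple_accumulate(100, rate, 1.0)

# Bring the 12-month figure back to now for comparison with the first answer.
owed_focal_12_at_now = simple_discount(owed_focal_12, rate, 1.0)

print(f"Focal date now:        ${owed_focal_now:,.2f}")
print(f"Focal date 12 months:  ${owed_focal_12:,.2f}")
print(f"...discounted to now:  ${owed_focal_12_at_now:,.2f}")
```

The last two figures differ from the first by under a dollar once discounted, which is the focal-date effect the warning above describes. The final line prints $716.47 rather than $716.48 only because it discounts the unrounded 12-month amount.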

3.2.2

Compound Interest

1. Try the compound interest version yourself;


2. Check if the value of the future (12 month) payment is the same as the current
payment.
3. Which repayment method would you pick?
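One way to check your own working for item 2 is the sketch below. It assumes the same 7% nominal rate, compounded quarterly (so 1.75% per quarter); the names are illustrative. It only tests whether the two focal dates agree, and leaves the comparison with simple interest to you.

```python
def compound_factor(nominal_rate, periods_per_year, years):
    """Accumulation factor (1 + r/n)^(n t) for periodic compounding."""
    return (1 + nominal_rate / periods_per_year) ** (periods_per_year * years)


rate, per_year = 0.07, 4

# Focal date = now: discount the debts, then take off the $100 paid now.
owed_now_compound = (
    500 / compound_factor(rate, per_year, 0.5)
    + 350 / compound_factor(rate, per_year, 0.75)
    - 100
)

# Focal date = 12 months: accumulate the debts and the $100 payment.
owed_12_compound = (
    500 * compound_factor(rate, per_year, 0.5)
    + 350 * compound_factor(rate, per_year, 0.25)
    - 100 * compound_factor(rate, per_year, 1.0)
)

# Discount the 12-month amount back to now and compare.
owed_12_at_now = owed_12_compound / compound_factor(rate, per_year, 1.0)
focal_dates_agree = abs(owed_12_at_now - owed_now_compound) < 1e-9
print(focal_dates_agree)
```

Under compounding the accumulation factors multiply consistently, so moving the focal date does not change the value exchanged.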

3.3

Net Present Value

3.3.1

The Scenario

Scenario: The Bean House, a Micro-coffee Roasting House for the Eastern Suburbs
You've recently read about the micro-coffee roasting craze that is hitting Sydney. It seems like the perfect business proposition: value pricing (as opposed to cost-pricing), a legally addicted market (both to the bean, and to the boutique theme), and, with a tiny amount of skill, an easy market to get a foot in (most drinkers don't know the difference). You and a friend are talking one night and it turns out your friend has already got a plan together. Knowing you are a student of QABE, she turns to you to run the numbers. Will it work out?
The numbers ...
Upon further inquiry, the numbers (according to your friend) look like this:
Year ending   Costs    Income   Cash flow   Note
0             50,000        0    -50,000    Set-up costs
1             34,000   25,000     -9,000    Two wages @ $17,000
2             34,000   45,000     11,000
3             44,000   60,000     16,000    Third wage @ $10,000
4             44,000   70,000     26,000
5             44,000   75,000     31,000

3.3.2

Working it out

Steps to a decision...
1. Adjust the cash-flow numbers to turn them into present values (based on the going alternative rate of return);
2. Sum the cash-flow values (in today's terms) to get the net present value;
3. Make a decision:
   If NPV > 0, worthwhile;
   If NPV < 0, not worth it!
Example:
Suppose your friend has access to a bank which has a long-term savings account (yearly compounded) offering a nominal rate of 12%. Should The Bean House get off the ground?
Year ending   Costs    Income   I - C      (1 + 0.12)^{-t}   PV
0             50,000        0   -50,000
1             34,000   25,000    -9,000
2             34,000   45,000    11,000
3             44,000   60,000    16,000
4             44,000   70,000    26,000
5             44,000   75,000    31,000

                                           NPV =

Definition | Net Present Value (NPV)
If F_t is the estimated cash flow for a T period project at the end of period t, and the yearly compounded rate of interest (the cost of capital) is r, then the net present value is the sum of present values, assuming cash flows arrive at the end of the year,
\[ NPV = F_0 + F_1(1 + r)^{-1} + \cdots + F_T(1 + r)^{-T} . \tag{3.1} \]
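The blank columns of the worksheet can be filled in with a short script. This is a sketch of equation (3.1) applied to the Bean House cash flows at the 12% cost of capital; the variable names are illustrative.

```python
# Net cash flows (Income - Costs) for years 0..5, taken from the table above.
cash_flows = [-50_000, -9_000, 11_000, 16_000, 26_000, 31_000]
cost_of_capital = 0.12


def present_values(flows, rate):
    """Discount each end-of-year cash flow F_t by (1 + r)^(-t)."""
    return [flow * (1 + rate) ** -year for year, flow in enumerate(flows)]


pv_by_year = present_values(cash_flows, cost_of_capital)
npv_at_12 = sum(pv_by_year)

print("Year   Cash flow   Factor           PV")
for year, (flow, pv) in enumerate(zip(cash_flows, pv_by_year)):
    factor = (1 + cost_of_capital) ** -year
    print(f"{year:>4} {flow:>11,} {factor:>8.4f} {pv:>12,.2f}")
print(f"NPV at {cost_of_capital:.0%}: {npv_at_12:,.2f}")
```

The NPV comes out at roughly -$3,764, so by the decision rule the project is not worthwhile when the bank offers 12%.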

Interpretation - what does it mean?
As we noted before, if the NPV > 0 then the project is worthwhile;
The reason is that the NPV calculation is really doing an in-built comparison, or play-off, between the project at hand and the interest rate available at the bank;
If you are "beating the bank" by gaining more value through the project than if your money were invested with the bank alone, then we deem the project to be worthwhile;
Challenge: do the same calculation as in the example above, but at 8% cost of capital. Is the project now worthwhile?

3.3.3

The Importance of the Interest Rate

Suppose we do the calculation of the NPV for a range of interest rates ...
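[Figure: NPV ($'000, from -20 to 20) plotted against the cost of capital r (0.05, 0.10, 0.15) for The Bean House; the rate at which the curve crosses NPV = 0 is labelled IRR.]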
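The same calculation over a range of rates can be sketched as follows. The particular rates are chosen for illustration only (they are not read off the original figure), and the function name is hypothetical.

```python
def npv(flows, rate):
    """Net present value of end-of-year cash flows, with flows[0] arriving now."""
    return sum(flow / (1 + rate) ** year for year, flow in enumerate(flows))


bean_house = [-50_000, -9_000, 11_000, 16_000, 26_000, 31_000]
rates = [0.05, 0.08, 0.10, 0.12, 0.15]

npv_by_rate = {rate: npv(bean_house, rate) for rate in rates}

for rate, value in npv_by_rate.items():
    verdict = "worthwhile" if value > 0 else "not worth it"
    print(f"r = {rate:.2f}  NPV = {value:>10,.0f}  ({verdict})")
```

The NPV falls as the cost of capital rises. It is positive at 8%, close to zero at 10% and negative at 12%, so the decision depends on the rate chosen.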

3.4

Internal Rate of Return

The internal rate of return asks the question,


What cost of capital would mean an NPV = 0, the break-even
point?
Which means that if,

r = IRR

then the return on the project (over its life) is exactly the same as if we put
our money into the bank! (the competition is a dead-heat!)
Definition | Internal Rate of Return (IRR)
The internal rate of return indicates the equivalent interest rate offered by a financial institution (compounded yearly) that would give the same outcome for my investment over the project life-time, as my project itself.
It is found by setting the NPV to zero, and solving for r, assuming cash flows arrive at the end of the year,
\[ NPV = F_0 + F_1(1 + r)^{-1} + \cdots + F_T(1 + r)^{-T} = 0 . \tag{3.2} \]

Care with the IRR: When interpreting the IRR, notice that it is (by definition) independent of the current cost of capital (what is actually offered by the banks). It is tempting to think that IRR somehow depends on this value. It doesn't! (But we compare to it.)

3.4.1

Working it out
Example: IRR by hand
Suppose a project requires an initial investment of $20,000 and returns
$7,000 and $16,000 at the end of the first and second years respectively.
Find the IRR of the project assuming yearly compounding.
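A sketch of the by-hand route for this example: writing x = 1 + r and multiplying the NPV equation through by x^2 gives the quadratic 20000x^2 - 7000x - 16000 = 0. The code below solves it with the quadratic formula and keeps the positive root; the variable names are illustrative.

```python
import math

initial_investment = 20_000
return_year_1 = 7_000
return_year_2 = 16_000

# NPV = -20000 + 7000/x + 16000/x^2 = 0, with x = 1 + r.
# Multiplying by x^2 and rearranging: 20000 x^2 - 7000 x - 16000 = 0.
a, b, c = initial_investment, -return_year_1, -return_year_2

discriminant = b * b - 4 * a * c
growth_factor = (-b + math.sqrt(discriminant)) / (2 * a)  # positive root only
irr = growth_factor - 1

# Substitute back into the NPV equation as a check.
npv_check = (
    -initial_investment
    + return_year_1 / (1 + irr)
    + return_year_2 / (1 + irr) ** 2
)

print(f"IRR = {irr:.4f}")
```

The negative root of the quadratic would give a growth factor below zero, which has no meaning as an interest rate, so it is discarded.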

There are simple cases (esp. when t ≤ 2) that can be solved by hand using the quadratic equation;
Obviously, it is difficult to solve for r in most cases (other than by trial-and-error approximation), so we often use a software package (such as Microsoft Excel, OpenOffice or Gnumeric);
Be careful how you use the software however...
Example:
Using a computer program of your choice, find the IRR and NPV, at 3% interest, of the following stream of net profits: (-45, -25, -2, 12, 27, 30, 31).
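As one possible "program of your choice", here is a sketch that evaluates the NPV from equation (3.1) and finds the IRR by bisection, a systematic form of trial and error. It assumes the first value, -45, arrives now (t = 0), which is the convention of the NPV definition in this lecture. Spreadsheet NPV functions such as Excel's instead treat the first value as arriving at the end of period 1, which may be the kind of pitfall the warning above has in mind. The function names and search bounds are assumptions made for this illustration.

```python
def npv(flows, rate):
    """Net present value with flows[0] arriving now and the rest at year ends."""
    return sum(flow / (1 + rate) ** year for year, flow in enumerate(flows))


def irr_bisection(flows, low=0.0, high=1.0, tolerance=1e-10):
    """Find a rate in [low, high] where NPV = 0 by repeatedly halving the interval."""
    npv_low = npv(flows, low)
    if npv_low * npv(flows, high) > 0:
        raise ValueError("NPV does not change sign between low and high")
    while high - low > tolerance:
        mid = (low + high) / 2
        npv_mid = npv(flows, mid)
        if npv_low * npv_mid <= 0:
            high = mid  # the sign change lies in the lower half
        else:
            low, npv_low = mid, npv_mid  # the sign change lies in the upper half
    return (low + high) / 2


net_profits = [-45, -25, -2, 12, 27, 30, 31]

npv_at_3_percent = npv(net_profits, 0.03)
stream_irr = irr_bisection(net_profits)

print(f"NPV at 3%: {npv_at_3_percent:.2f}")
print(f"IRR: {stream_irr:.4f}")
```

With these conventions the NPV at 3% is positive and the IRR lies between 7% and 8%, so the stream beats a 3% cost of capital.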

The interpretation here needs care! Note that the cost of capital is so-called since it is the gain foregone (given up) when we use the investment money (the capital) in our project. For this reason, if we can't do better than what the bank is offering, we might as well put our backers' money in the bank!
It is in this sense that having capital (investment money) outside of the bank is costly: unless we are putting it to good use, it is literally costing us in interest we could have been getting! (That's no-risk return too...)

3.5

Summary

Does The Bean House go ahead?
Clearly, at a cost of capital of 12%, The Bean House isn't worth it; we'd do better by putting our money in the bank at the offered rate of 12%;
However, by calculating the IRR, we could see that for any cost of capital less than 10%, The Bean House is a good idea! ... we'd beat the best interest going at the bank!
Finally, suppose that we had two different projects, the coffee roaster being one, the other being a simple bakery. If they both have a positive NPV, that alone doesn't tell us how to pick between them.
However, if you can't do both, choosing the project with the higher NPV will give the highest return.


Lecture 4

Geometric Progressions and Annuities


4.1

Introduction

Depending on your studies, you may never have heard of an annuity. However, chances are, you have actually been thinking about, or seeing the work of, annuities in everyday life for years. For example, when you are repaying your mobile phone in a number of fixed installments you are paying an annuity. When you take out a loan for a car, or some other purchase, and then begin repaying the loan, you are dealing with an annuity. If you have an older friend who is receiving a regular pension amount from a superannuation benefit such that there will be nothing left by their life's end, they are getting money through an annuity. Finally, if you have a savings account which you regularly transfer money into (say, monthly) and don't intend to touch for a number of years (as in the film, The Bank), you have an annuity. Enough said.
You see, annuities do live something of a secret life: we all use many financial services which have their own specific names, but if we look a little deeper, we are just dealing with an annuity. To understand these creatures better, we have to start with a bit of maths to refresh ourselves on the nature of geometric progressions and series, which will then enable us to investigate the many and varied forms of an annuity. If you can get a handle on annuities, then doubtless you'll become a favoured expert amongst friends and family as you crunch the numbers on their various financial options!
Agenda
1. Background: the geometric progression and series;
2. Annuities: present and future value;
3. Annuities due.

4.2

Geometric Progressions

Consider the following number sequences:


2, -2, 2, -2, 2, -2, 2, -2, 2, -2, 2, -2   (r = -1)
1.00, 0.60, 0.36, 0.22, 0.13, 0.08, 0.05   (r = 0.6)
0.5, 1.3, 3.1, 7.8, 19.5, 48.8, 122.1, 305.2   (r = 2.5)
... in each, we have,
An initial value, a; and
A ratio of terms, r,
\[ r = \frac{x_{i+1}}{x_i} \]
Definition | Geometric Progression
A (finite) geometric progression (or sequence) is a list of numbers where the first number a is chosen, and subsequent numbers are given by multiplying the preceding term by a constant factor r,
\[ a,\ ar,\ ar^2,\ ar^3,\ \ldots,\ ar^{n-2},\ ar^{n-1} \tag{4.1} \]
Of interest is what the numbers (or terms) do over time: do they get larger? smaller? stay the same?
Further, if they were added together, how will the sum of terms behave?
Definition | Geometric Series
A geometric series is simply an additive equation (sum) of the terms of a geometric progression, i.e.,
\[ a + ar + ar^2 + ar^3 + \cdots + ar^{n-2} + ar^{n-1} \tag{4.2} \]
The sum of these n terms is given by (for r ≠ 1),
\[ s = \frac{a(1 - r^n)}{1 - r} \tag{4.3} \]
Example:
Find the sum of the series,
s = 2 + 4 + 8 + 16 + 32 + 64 + 128 + 256 + 512
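A quick way to check the example is to compare the closed-form sum with adding the terms directly. In this sketch n counts the number of terms (here a = 2, r = 2, n = 9); the function name is illustrative.

```python
def geometric_sum(first_term, ratio, number_of_terms):
    """Sum of a + ar + ... + ar^(n-1), using s = a(1 - r^n) / (1 - r)."""
    if ratio == 1:
        return first_term * number_of_terms  # the formula divides by zero when r = 1
    return first_term * (1 - ratio ** number_of_terms) / (1 - ratio)


# The series 2 + 4 + 8 + ... + 512 has a = 2, r = 2 and n = 9 terms.
terms = [2 * 2 ** k for k in range(9)]
direct_total = sum(terms)
formula_total = geometric_sum(2, 2, 9)

print(terms)
print(direct_total, formula_total)
```

Both routes give 1022, which confirms that n in the formula is the number of terms and not the highest power.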

4.3

Annuities

HPW 5.4, KZB 3.1, 3.2

Problem: choosing mobile-phone contracts
You have recently gone onto a plan for a mobile phone which includes a new phone. In the contract, the repayments required for the phone alone are $17 per month (calls are extra). The contract says you've got to pay this for two years. You wonder to yourself, "What is this phone worth anyhow? Should I just buy a phone now outright?"

4.3.1

Terminology

Definition | Annuity
An annuity is a sequence of agreed payments made at fixed intervals,
called the payment period, over a given length of time, called the term
of the annuity.
Example sequence: repay a loan at $250 with a payment period of 2 months, for a term of 12 months:
[Timeline figure: six deposits of $250, one at the end of each 2-month payment period, on a t (months) axis running to 12.]

Question then: how is this different to one lump-sum payoff at the outset? At the
end?

4.3.2

Present Value, A
Example: Present Value of Annuity
Show that the present value A of an annuity of agreed payments R, paid at the end of each of n periods, with r interest rate (per period), is given by (hint: use the geometric series result in (4.3)),
\[ A = R \, \frac{1 - (1 + r)^{-n}}{r} . \]

KZB 3.3

The previous example leads to the following definition,


Definition | Present Value of an Annuity
The present value of an ordinary annuity having regular payments R at the end of each payment period for n payments with an interest rate of r per period is given by,
\[ A = R \, \frac{1 - (1 + r)^{-n}}{r} \tag{4.4} \]

Payments are assumed to be made at the end of each payment period;


Likewise, interest is assumed to be calculated at the end of each payment
period.

Example: Back to the Mobile-phone


In our scenario, you are making $17 payments on your mobile phone every
month for two years. What is the real cost (present value) of the phone?
(Assume the company charges 8% interest.)
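A sketch of the calculation with equation (4.4). The example does not say how the 8% is compounded. The code assumes it is a nominal annual rate charged monthly, so r = 0.08/12 per period over n = 24 periods, and that assumption should be checked against the contract. The function name is illustrative.

```python
def annuity_present_value(payment, rate_per_period, periods):
    """Present value of an ordinary annuity: A = R * (1 - (1 + r)^(-n)) / r."""
    return payment * (1 - (1 + rate_per_period) ** -periods) / rate_per_period


monthly_payment = 17
months = 24
monthly_rate = 0.08 / 12  # assumption: 8% nominal per year, charged monthly

phone_present_value = annuity_present_value(monthly_payment, monthly_rate, months)
total_face_value = monthly_payment * months

print(f"Present value of the repayments: ${phone_present_value:,.2f}")
print(f"Face value of the repayments:    ${total_face_value:,.2f}")
```

Under that assumption the repayments are worth about $376 today, against a face value of $408; the gap is the interest built into the contract.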

4.3.3

Spreadsheeting it
Example:
Calculate the previous example with a spreadsheeting program, and check
the result. Set it up so that you can alter the interest rate. What happens?
(Why?)

Example: Car loan
Suppose you are considering purchasing a car. The model you want will
cost you $13,450. You have access to a loan through your bank, who would
charge 9.50% interest. What would the monthly repayments be if the term
of the loan was 5 years?
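For the car loan, equation (4.4) is rearranged to give the payment R from a known present value A. The sketch assumes 9.50% is a nominal annual rate charged monthly, with repayments at the end of each month for 60 months; the function name is illustrative.

```python
def annuity_payment(present_value, rate_per_period, periods):
    """Rearranged (4.4): R = A * r / (1 - (1 + r)^(-n))."""
    return present_value * rate_per_period / (1 - (1 + rate_per_period) ** -periods)


car_price = 13_450
monthly_rate = 0.095 / 12  # assumption: 9.50% nominal per year, charged monthly
months = 5 * 12

monthly_repayment = annuity_payment(car_price, monthly_rate, months)
total_repaid = monthly_repayment * months

print(f"Monthly repayment: ${monthly_repayment:,.2f}")
print(f"Total repaid over the term: ${total_repaid:,.2f}")
```

The repayment comes to roughly $282 per month, so the total repaid over five years is well above the $13,450 borrowed.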

Interpretation
In each case, the repayments are fixed before the term of the annuity: it is a contract;
However, due to the time-value-of-money, these apparently constant payments have changing value;
In fact, the agreed repayments include the interest component that the institution is charging;
It is for this reason that we need to work out the real value of what is being paid out.
[Timeline figure: six deposits of $250 at 2-month intervals over 12 months, with the focal date marked.]

Note: On Annuities. You may be getting quite confused here. Questions such as "whose interest rate?" and "what does an annuity actually mean?" are quite natural.
When we calculate the present value of an annuity, we are really asking the question, "what is the amount that must be paid now to purchase the value of the payments I'm planning to make at various times throughout the future?"
You see then, that by fixing the payments schedule, the bank (say) has already included the interest components into those repayments. Which means, yes, as time goes by (consider a 30 year home loan) the first payment of (say) $100 is worth quite a lot, but by the 30th year, paying the face value of $100 is actually paying them a very small amount of value (in terms of the value of $100 30 years from now).
Annuities cover a lot of financial services because they are the typical form of loan repayments. A schedule of fixed amounts is set up to begin with, based on the interest rate being charged. This includes retail agreements (like mobile phone repayments), home loans (mortgages), and even investments used by pensioners to withdraw a fixed amount over time (in this case, it is as if the bank has borrowed the money from you and is paying you back at a fixed amount for a number of years).
In fact, the word mortgage comes from two roots, "mort" meaning death and "gage" meaning pledge, so together we get a mortgage being "to kill off the pledge" or agreement between you and the bank!

4.3.4

Future Value, S

Notice, we can of course work out the future value of the loan just as easily as the present value,
[Timeline figure: six deposits of $250 at 2-month intervals over 12 months, with the focal date at the end of the term (12 months).]
Notice: the final payment is made on the focal date itself, so requires no adjustment;
Thus, only the previous n - 1 payments need adjustment, the earliest being carried forward n - 1 periods,
\[ S = R + R(1 + r) + R(1 + r)^2 + \cdots + R(1 + r)^{n-1} \]
Another geometric series. (Try proving the following result...)
Definition | Future value of an annuity
The future value of an annuity (ordinary) with payments of R per payment period for n periods at interest rate r per period is,
\[ S = R \, \frac{(1 + r)^n - 1}{r} \tag{4.5} \]

4.4

Annuities Due

KZB 3.4

Alternatively, rather than an ordinary annuity (payments are at the end of each
payment period), suppose the first payment is at the start of the period (like a
magazine subscription, say).
[Timeline figure: seven deposits of $50, one at the start of each month, over a 7-month term on a t (months) axis.]
This situation is called an annuity due and changes our definitions slightly...
Definition | Present value of annuity due
The PV of an annuity due is the same as the value of an ordinary annuity one period before the present, brought to the present. So the PV of an annuity due is just the PV of an ordinary annuity, with one period's interest added (accumulated),
\[ A = (1 + r) \, R \, \frac{1 - (1 + r)^{-n}}{r} \tag{4.6} \]
[Figure: timeline of the $50 payments showing the ordinary annuity term and the factor (1 + r) that carries the ordinary-annuity present value forward one period to PV_0.]
Definition | Future value of annuity due
As with the PV of an annuity due, the FV is the same as the future value of an ordinary annuity which starts one period earlier, carried forward one period. So the FV of an annuity due is just the FV of the ordinary annuity, with one period's interest added (accumulated),
\[ S = (1 + r) \, R \, \frac{(1 + r)^n - 1}{r} \tag{4.7} \]
[Figure: timeline of the $50 payments at t = 0, 1, ..., 6 months showing the ordinary annuity term and the factor (1 + r) that carries the ordinary-annuity future value forward one period.]
Example:
Suppose you put $100 on the first of each month into a savings account which pays 5.5% accumulated monthly. What would be the size of the account after 45 years?
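Because the deposits are made at the start of each month, this is an annuity due, so equation (4.7) applies. The sketch assumes 5.5% is a nominal annual rate credited monthly (r = 0.055/12, n = 540), and cross-checks the formula against a month-by-month simulation; the function names are illustrative.

```python
def annuity_future_value(payment, rate_per_period, periods):
    """Ordinary annuity, equation (4.5): S = R * ((1 + r)^n - 1) / r."""
    return payment * ((1 + rate_per_period) ** periods - 1) / rate_per_period


def annuity_due_future_value(payment, rate_per_period, periods):
    """Annuity due, equation (4.7): one extra period of interest on (4.5)."""
    return (1 + rate_per_period) * annuity_future_value(payment, rate_per_period, periods)


monthly_rate = 0.055 / 12  # assumption: 5.5% nominal per year, credited monthly
months = 45 * 12

balance_due = annuity_due_future_value(100, monthly_rate, months)
balance_ordinary = annuity_future_value(100, monthly_rate, months)

# Cross-check: deposit at the start of each month, then earn that month's interest.
simulated_balance = 0.0
for _ in range(months):
    simulated_balance = (simulated_balance + 100) * (1 + monthly_rate)

print(f"Annuity due (start-of-month deposits): ${balance_due:,.2f}")
print(f"Ordinary annuity (end-of-month):       ${balance_ordinary:,.2f}")
```

Under these assumptions the account grows to roughly $237,000, compared with $54,000 actually deposited. Depositing at the start rather than the end of each month is worth one extra month of interest on the whole balance.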


Lecture 5

Matrices I: Maths by Arrangement


5.1

Introduction

We now move to a few studies in the world of matrices.[1] Think about going on a trip to a very special island, like the Galápagos Islands off the coast of South America. The place is so special and different that it may even have different laws of nature operating on it. Now this might be a very scary thought. Chances are, the scariness comes directly from the fact that the world might operate so differently from the one that we are familiar with, that we won't know what to do, how to behave, what to expect. A natural reaction is to try to find ways of doing things on the special island that we are used to doing back home.
This kind of journey is exactly what we'll be embarking on in this lecture. We'll venture into the world of Matrices, and immediately start looking for mathematical elements that we are used to in our normal (scalar) world, so as to feel more comfortable in the new place. The important implication of this journey is firstly, that we shouldn't expect to be able to do our old maths in the new world, and secondly, that we must recognise which world we are working in! Or else, we may try to apply a rule to the wrong world, which predictably will give rise to all kinds of unintended consequences.
For some people, this area can cause a lot of trouble, since in a number of cases, the way that we perform operations (e.g. subtraction, addition, multiplication) on matrices gives rise to different outcomes than with just, well, numbers on their own. In times such as these, just remember that matrices are simply a re-arrangement of the information we are used to dealing with. In most cases, they simply represent a collection of lists (e.g. shopping, products) which we describe by the number of rows and columns in the list. The trick with matrices is that we will begin to do strange things with them, like multiplying two lists together! Sometimes this will be exactly what we are after, and matrix algebra (or linear algebra as it is sometimes known) will prove an enormous help to us.
The special rules of matrices are there for a good reason and with a little practice shouldn't be so foreign to you. If any anxieties remain, as ever, practice, practice, practice! There are plenty of examples available in your text-book, on-line, or in the library (look for textbooks on linear algebra or matrices). For the moment, we will simply have to refresh our memory on this strange world, which will for the most part mean leaving our economic applications at the door. But as we move forward, we'll see how matrices are of great help to us in understanding many commercial/economic activities.

[1] Note: in the preparation of this section on Matrices, reference has been made to the excellent text Linear Algebra (3rd Ed.) by Fraleigh and Beauregard (FB) (1995). By noting this, you may feel tempted to find this text, which is a fine response, but note that in many areas, FB goes well beyond what is required of us in our course. Beware!
Agenda
1. Introducing matrix terms;
2. Basic operations;
3. Matrix multiplication.

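5.2

Terminology

HPW 6.1

Maths by arrangement
Consider the following examples:

Information in loose form:
\[ \begin{aligned} 6x_1 + 3x_2 + x_3 &= 22 \\ x_1 + 4x_2 - 2x_3 &= 12 \\ 4x_1 - x_2 + 5x_3 &= 10 \end{aligned} \]

          Mon   Tues   Wed
Income     54     60    58
Costs      51     45    50

Information in matrix form:
\[ \begin{bmatrix} 6 & 3 & 1 \\ 1 & 4 & -2 \\ 4 & -1 & 5 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 22 \\ 12 \\ 10 \end{bmatrix} \]
\[ \begin{bmatrix} 54 & 60 & 58 \\ 51 & 45 & 50 \end{bmatrix} \]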

A matrix is just a special arrangement of information into an array of numbers;
Linear equations, tabular data, image information, database entries: almost all kinds of information can be represented in matrix form;
Some terminology (rows run across, columns run down):
\[ A = \begin{bmatrix} 8 & 1 & 6 \\ 3 & 5 & 7 \\ 4 & 9 & 2 \end{bmatrix} \]
A has 3 rows and 3 columns, so its size is expressed as 3 × 3 (r × c);
A is thus a square matrix since rows(A) = cols(A);
Further, since A is square, we can refer to the main diagonal or principal diagonal of A;
If rows(A) = 1 or cols(A) = 1, then we refer to a row vector or column vector respectively, e.g.,
\[ \begin{bmatrix} 8 & 1 & 6 \end{bmatrix} \qquad \begin{bmatrix} 1 \\ 5 \\ 9 \end{bmatrix} \]
We refer to a single position in the matrix as a matrix element or matrix entry, normally by referring to the row and column positions,
\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \end{bmatrix} \]
where (say) a_{23} indicates the entry in row 2, column 3.

5.2.1

Special matrices

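Some often-used special matrices:
The zero matrix or null matrix,
\[ 0 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \quad \text{or, in general,} \quad 0 = \begin{bmatrix} 0 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 0 \end{bmatrix} \]
which has the special property,
\[ 0A = A0 = 0 . \]
The identity matrix, e.g.,
\[ I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]
which has the special property that,
\[ AI = A . \]
A diagonal matrix is like the identity matrix, but the elements on the diagonal need not equal 1:
\[ D = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 0.32 & 0 \\ 0 & 0 & 345.1 \end{bmatrix} \]
A triangular matrix comes in one of two forms:
1. An upper triangular matrix has the form:
\[ A_{ij} = \begin{cases} a_{ij} & \text{for } i \le j \\ 0 & \text{for } i > j \end{cases} \qquad \text{e.g.} \quad U = \begin{bmatrix} 1 & 2 & 2 \\ 0 & 0 & 3 \\ 0 & 0 & 345.1 \end{bmatrix} \]
2. A lower triangular matrix has the form:
\[ A_{ij} = \begin{cases} a_{ij} & \text{for } i \ge j \\ 0 & \text{for } i < j \end{cases} \qquad \text{e.g.} \quad L = \begin{bmatrix} 1 & 0 & 0 \\ 5 & 3 & 0 \\ 1 & 0 & 345.1 \end{bmatrix} \]

5.3

Operations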

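Definition | Matrix equality
Two matrices A and B are said to be equal if they are of the same size and a_{ij} = b_{ij} for each i and j.

HPW 6.2

Example:
Are the matrices
\[ \begin{bmatrix} 4 & 3 \\ 2 & 0 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 2 & 0 \\ 4 & 3 \end{bmatrix} \]
equal?

Definition | Transpose of a matrix
For some matrix A of size m × n, the transpose of A, denoted A^T, is of size n × m with each ith column taking the values of the ith row of A.
\[ (A^T)^T = A \]
\[ (A + B)^T = A^T + B^T \]
\[ (AB)^T = B^T A^T \]
\[ A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \end{bmatrix} \qquad A^T = \begin{bmatrix} 1 & 5 \\ 2 & 6 \\ 3 & 7 \\ 4 & 8 \end{bmatrix} \]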

5.3.1

Addition & subtraction

Matrix addition and subtraction are done in an element-wise fashion.
Example:
Determine the solution to,
\[ \begin{bmatrix} 2 & 0 \\ 0 & 7 \end{bmatrix} + \begin{bmatrix} 4 & 9 \\ 2 & 1 \end{bmatrix} \]
Example:
Determine the solution to,
\[ \begin{bmatrix} 4 & 9 & 3 \\ 2 & 1 & 2 \end{bmatrix} + \begin{bmatrix} 2 & 0 \\ 0 & 7 \end{bmatrix} \]
Matrix addition and subtraction require the inputs to be of equal size, or else the result is undefined.

5.3.2

Scalar multiplication

To perform scalar multiplication on some matrix A is to simply multiply every element of A by the given scalar.
Example:
Let A be given by,
\[ A = \begin{bmatrix} 3 & 1 \\ 0 & 5 \end{bmatrix} \]
determine 7A.

5.4

Matrix Multiplication

HPW 6.3

5.4.1

Working it out

Matrix multiplication has a special definition, requiring some care. Suppose we have two matrices,
\[ A = \begin{bmatrix} 1 & 3 \\ 2 & 8 \\ 4 & 0 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 5 \\ 9 \end{bmatrix} \]
and we wish to find C = AB (A times B), we must do the following:
1. Check that the column dimension of A is equal to the row dimension of B;
2. Determine the size of C: rows(A) × cols(B);
3. For each entry c_{ij} in C, multiply the elements of the ith row of A by the corresponding elements of the jth column of B, and sum the products.
Example:
Determine the value of C = AB for A and B as defined above.
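The three steps can be sketched in plain Python, with matrices stored as lists of rows. The entries of A and B are taken as printed in the example, and the function name `matmul` is an illustrative choice rather than a library call.

```python
def matmul(left, right):
    """Multiply two matrices stored as lists of rows, following the three steps."""
    inner = len(right)  # row dimension of the right-hand matrix
    # Step 1: the column dimension of `left` must equal the row dimension of `right`.
    if any(len(row) != inner for row in left):
        raise ValueError("columns of left matrix must equal rows of right matrix")
    # Step 2: the result has rows(left) x cols(right) entries.
    rows, cols = len(left), len(right[0])
    # Step 3: entry (i, j) is the sum of products of row i of left and column j of right.
    return [
        [sum(left[i][k] * right[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


A = [[1, 3], [2, 8], [4, 0]]  # 3 x 2
B = [[5], [9]]                # 2 x 1

C = matmul(A, B)              # 3 x 1
print(C)

# The product in the other order is not defined: B is 2 x 1 but A has 3 rows.
try:
    matmul(B, A)
    reverse_defined = True
except ValueError:
    reverse_defined = False
print(reverse_defined)
```

With these entries C is the 3 × 1 column (32, 82, 20), while BA fails the check in step 1, an early hint that the order of multiplication matters.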

For practice, calculate AB and BA, where,
\[ A = \begin{bmatrix} 0 & 2 \\ 3 & 5 \end{bmatrix} \qquad B = \begin{bmatrix} 0 & 1 \\ 2 & 5 \end{bmatrix} \]
Matrix multiplication is not commutative; that is, for two square matrices of the same size A and B, in general,
\[ AB \neq BA . \]

Definition | Properties of matrix multiplication


Pay careful attention to the following matrix-specific rules:
A(BC) = (AB)C
A(B + C) = AB + AC
(A + B)C = AC + BC
IA = A
BI = B

Example:
Under what conditions would the following equation be true?
\[ (A + B)^2 = A^2 + 2AB + B^2 \]

Lecture 6

Matrices II: The Inverse & Determinant in Small Matrices
6.1

Introduction

This lecture is a significant step up in difficulty for our matrix algebra, but it is also a significant improvement in the power of our matrix toolbox. We begin by coming up with an equivalent operation to division in arithmetic, known as the matrix inverse. It does just the same job (a matrix multiplied by its inverse equals the identity matrix, the stand-in for 1 in matrix algebra), but like other matrix operations, has its own specific rules and properties.
Following from this, we are caused to wonder how we might actually come up with the inverse. After all, it isn't much use unless we can find out what its value is. Whilst there are a few techniques for obtaining the inverse (e.g. row reduction) we will consider just one, the adjoint method. Now this is itself a little bit tricky, even on small matrices, but it will introduce us to a very important property of any square matrix, known as the determinant, and the adjoint method will give us an idea about why sometimes we can get an inverse, and sometimes we can't. More on this in the next lecture.
Although all of what we do in this lecture is applicable to much larger matrices (and is where these techniques have real power), we won't be doing them on anything bigger than a two-by-two or three-by-three. Next lecture we'll unleash the power of the computer on our matrix world, and so tackle some really big problems.
Agenda
1. Matrix division? ... The inverse of a matrix;
2. Finding the inverse: the determinant and the adjoint method;
3. Application to systems of linear equations;

6.2

The inverse of a matrix

HPW 6.6

A problem?
Up till now, we have only seen matrix addition, subtraction and multiplication (both scalar and matrix);
What about solving something like,
\[ Ax = b \]
for x?
Normally, we'd divide both sides by A and all would be well.
But we can't do matrix division like that!

6.2.1

Defined

Instead, we use the inverse of a matrix which has the following property,
Definition | Inverse of a matrix
If a square matrix A has an inverse, written A^{-1}, then A^{-1} has the following property:
\[ AA^{-1} = I , \quad \text{and} \quad A^{-1}A = I . \]
Don't be tempted to think,
\[ A^{-1} = \frac{1}{A} \quad \text{(not true!!)} \]
Now we can solve our problem, Ax = b for x: if A has an inverse (!), then we can simply solve,
\[ Ax = b \]
\[ A^{-1}(Ax) = A^{-1}b \]
\[ (A^{-1}A)x = A^{-1}b \]
\[ Ix = A^{-1}b \]
\[ x = A^{-1}b . \]

6.3

A Primer to the Inverse: The Determinant

JS 4.6, 4.7, 4.9

6.3.1

Small Determinants

Determinants of order 2
Definition | The determinant
The determinant of a square, two-by-two matrix
\[ A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \]
is written |A| or
\[ \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} \]
and is given by,
\[ |A| = a_{11}a_{22} - a_{21}a_{12} . \tag{6.1} \]
Example:
Calculate the determinant of the matrix
\[ \begin{bmatrix} 1 & 3 \\ 5 & 2 \end{bmatrix} . \]

6.3.2

Determinants of higher orders

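What about 3 x 3?
What if we need to find the inverse (and so check the determinant) of a 3 by 3 matrix?
... it is possible, but just needs care!
\[ |A| = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = +a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} \]
Example:
Find the determinant of the matrix
\[ \begin{bmatrix} 2 & 1 & 2 \\ 5 & 1 & 3 \\ 10 & 2 & 4 \end{bmatrix} . \]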

Definition | The determinant of order n
Let A be a square matrix, and A_{ij} be the (square) matrix remaining after eliminating row i and column j, then define the cofactor of element a_{ij} in A as,
\[ a'_{ij} = (-1)^{i+j} \det A_{ij} , \tag{6.2} \]
and further define the determinant of A to be,
\[ \det A = a_{11}a'_{11} + a_{12}a'_{12} + \cdots + a_{1n}a'_{1n} . \tag{6.3} \]

6.3.3

Cofactors
Example: Cofactors & Determinants
Find the cofactor of the entry 3 in the matrix,
\[ A = \begin{bmatrix} 2 & 1 & 0 & 1 \\ 3 & 2 & 1 & 2 \\ 4 & 0 & 1 & 4 \\ 1 & 0 & 2 & 1 \end{bmatrix} \]
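The cofactor and determinant definitions translate almost directly into a recursive routine. The sketch below is illustrative only: the function names are hypothetical, and it is tried on the 3 × 3 matrix A = [8 1 6; 3 5 7; 4 9 2] from the terminology section of Lecture 5 rather than on the example above. Cofactor expansion is fine for small matrices but becomes very slow as n grows.

```python
def minor_matrix(matrix, row, col):
    """The matrix remaining after eliminating the given row and column."""
    return [r[:col] + r[col + 1:] for i, r in enumerate(matrix) if i != row]


def determinant(matrix):
    """Determinant by cofactor expansion along the first row."""
    if len(matrix) == 1:
        return matrix[0][0]
    return sum(matrix[0][j] * cofactor(matrix, 0, j) for j in range(len(matrix)))


def cofactor(matrix, row, col):
    """Cofactor of the entry at (row, col), counting rows and columns from 0."""
    return (-1) ** (row + col) * determinant(minor_matrix(matrix, row, col))


A = [[8, 1, 6], [3, 5, 7], [4, 9, 2]]

det_A = determinant(A)
cofactor_of_8 = cofactor(A, 0, 0)  # cofactor of the top-left entry

# Property 1 of determinants: the transpose has the same determinant.
A_transposed = [list(column) for column in zip(*A)]
det_A_transposed = determinant(A_transposed)

print(det_A, cofactor_of_8, det_A_transposed)
```

Note that the code counts rows and columns from 0, so the book's a'_{11} is `cofactor(A, 0, 0)`. Because (-1)^{i+j} depends only on whether i + j is even or odd, the sign pattern is the same under either indexing.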

Properties of Determinants
Let A be a square matrix, then,
1. We have det(A) = det(A^T), where A^T is the transpose of A;
2. Exchanging two rows of A causes the determinant of the resulting matrix to be -det(A);
3. If two rows of A are equal then det(A) = 0;
4. If a single row of A is multiplied by a scalar r, the determinant of the resulting matrix is r det(A);
5. Addition of a scalar multiple of one row of A to another row of A leaves the determinant unchanged;
6. A is invertible (non-singular) if and only if det(A) ≠ 0. (Alternatively, A is singular if and only if det(A) = 0.)

6.4

The Inverse Really

Back to the inverse


Recall that we started our journey on determinants because we wished to know if a particular matrix had an inverse or not.
In fact, we can use determinants to go straight to the inverse!
The method introduces us to the adjoint of a matrix. To find the adjoint:
1. Check the determinant is non-zero;
2. Find the cofactors for each entry, and call the resulting matrix of cofactors A';
3. The adjoint of A is then (A')^T, the transpose of A'.

6.4.1

The Adjoint of a Matrix


Example: Finding the adjoint
Find the adjoint of the matrix,
\[ A = \begin{bmatrix} 4 & 0 & 1 \\ 2 & 2 & 0 \\ 3 & 1 & 1 \end{bmatrix} . \]

6.4.2

The Inverse by the Adjoint

Definition | Inverse by the adjoint-method
Let A be an n × n matrix with det(A) ≠ 0. Then A is invertible, and
\[ A^{-1} = \frac{1}{\det(A)} \, \mathrm{adj}(A) , \]
where adj(A) = (A')^T, the transposed matrix of cofactors.
Example: The inverse by the adjoint method
Find the inverse of
\[ A = \begin{bmatrix} 4 & 0 & 1 \\ 2 & 2 & 0 \\ 3 & 1 & 1 \end{bmatrix} \]
by the adjoint method.
The solution method
1. Calculate |A|. If |A| ≠ 0, proceed to (2);
2. Find co-factors of A;
3. Construct matrix of co-factors, A' = [a'_{ij}];
4. Transpose A' to obtain adj(A); and
5. Divide adj(A) by |A| to obtain inverse.

6.4.3

A Useful Check!

Do we have an inverse?

Definition | Non-singularity
For the square matrix A, if |A| is not equal to zero then A is non-singular and thus has an inverse; if this is not the case, and |A| = 0, we say that A is singular and will not have an inverse.
Note, then, that we have two checks to make of some matrix A, to determine if it has an inverse or not:
1. Is A square?
2. Is |A| ≠ 0?
If the answer is yes to both criteria, we can be sure that A has an inverse.
Example:
Determine whether the matrix
\[ A = \begin{bmatrix} 1 & 0 \\ 9 & 2 \end{bmatrix} \]
has an inverse.

6.5

Solving systems of linear equations

6.5.1

The problem

A messy problem?
Suppose we wanted to solve,
\[ p_2 - 3p_3 = -5 \tag{6.4} \]
\[ 2p_1 + 3p_2 - p_3 = 7 \tag{6.5} \]
\[ 4p_1 + 5p_2 - 2p_3 = 10 \tag{6.6} \]
... how would we do it?? Rearrange (6.4):
\[ p_2 = -5 + 3p_3 \tag{6.7} \]
Substitute (6.7) into (6.5),
\[ 2p_1 + 3(3p_3 - 5) - p_3 = 7 \]
\[ 2p_1 + 9p_3 - 15 - p_3 = 7 \]
\[ 2p_1 + 8p_3 - 15 = 7 \]
Solving by hand...
Giving,
\[ p_1 = \tfrac{1}{2}(7 + 15 - 8p_3) = 11 - 4p_3 \tag{6.8} \]
Now substitute (6.7) and (6.8) into (6.6),
\[ 4(11 - 4p_3) + 5(3p_3 - 5) - 2p_3 = 10 \]
\[ 44 - 16p_3 + 15p_3 - 25 - 2p_3 = 10 \]
\[ -3p_3 = -9 \]
\[ p_3 = 3 \]
Giving, p_1 = 11 - 4(3) = -1, and p_2 = -5 + 3(3) = 4, that is,
(p_1, p_2, p_3) = (-1, 4, 3) (phew!).

By the inverse
Example:
Solve the following system of linear equations by the Adjoint Method:
$$p_2 - 3p_3 = -5$$
$$2p_1 + 3p_2 - p_3 = 7$$
$$4p_1 + 5p_2 - 2p_3 = 10$$
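
As a hedged sketch of how the adjoint method might play out for this system, write it as $Ap = b$ with $p_2 - 3p_3 = -5$, $2p_1 + 3p_2 - p_3 = 7$ and $4p_1 + 5p_2 - 2p_3 = 10$. The symbols $A$, $p$ and $b$ are introduced here only for illustration.

$$A = \begin{pmatrix} 0 & 1 & -3 \\ 2 & 3 & -1 \\ 4 & 5 & -2 \end{pmatrix}, \qquad p = \begin{pmatrix} p_1 \\ p_2 \\ p_3 \end{pmatrix}, \qquad b = \begin{pmatrix} -5 \\ 7 \\ 10 \end{pmatrix}$$
$$|A| = 0 \cdot (-1) + 1 \cdot 0 + (-3)(-2) = 6, \qquad \mathrm{adj}(A) = \begin{pmatrix} -1 & -13 & 8 \\ 0 & 12 & -6 \\ -2 & 4 & -2 \end{pmatrix}$$
$$p = A^{-1}b = \frac{1}{6}\,\mathrm{adj}(A)\,b = \frac{1}{6}\begin{pmatrix} -6 \\ 24 \\ 18 \end{pmatrix} = \begin{pmatrix} -1 \\ 4 \\ 3 \end{pmatrix}$$

This agrees with the answer obtained by substitution, $(p_1, p_2, p_3) = (-1, 4, 3)$.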

To summarize...
We can't do normal division on matrices;
We can do the same job with the inverse;
But, if the matrix has a determinant equal to zero, we have no inverse (why?);
We can use the inverse to solve linear systems of equations!

Lecture 7

Matrices III: Matrix Algebra & Automatic-Matrices!

7.1 Introduction

Having learnt a bit about what is going on with matrices, we just have a few more
loose ends to tie up. Specifically, we'll go over the algebra of matrices. This is
just like what we would normally do with scalar algebra: rearranging equations,
simplifying, making a certain pro-numeral the subject of the equation. However,
with matrices, there are some very important differences in the rules and how to
apply them. Pay attention.
Following this, we'll cover a couple of important pieces of terminology: 'singular'
and 'non-singular' on the one hand, and 'consistent' and 'inconsistent' on the other.
These are really just labels for concepts that we are already familiar with, but they
have especial relevance when we move to using computers. I say this because,
although Microsoft Excel (which we'll be using for the moment) doesn't have
very elaborate error messages, other programs do. For instance, Matlab might
report (when trying to obtain $A^{-1}$),
>> inv(A)
Warning: Matrix is close to singular or badly scaled.
Results may be inaccurate.
...
what does this mean? See below.
The main purpose of the second part of this lecture is to introduce the way
matrices are handled on computers, specifically using Microsoft Excel. As with most
computer packages, Excel is very fast at doing lots of things, but fast doesn't always
equal correct! It's for this reason that we have gone through matrices by hand to
a point to begin with, so that we can actually uncover when the computer is telling
us fibs!
Agenda
1. A little more on Matrix Algebra;
2. Using computers to do Matrix Maths.
Small operations;
Linear equations;
Big operations.

7.2 More on Matrix Algebra

HPW 6.1-6.3

Definition | Properties of the inverse
Some useful properties of the inverse:
If the inverse of the square matrix $A$ exists, we call $A$ non-singular,
otherwise we call $A$ singular;
If $A^{-1}$ exists, it is unique;
Other properties,
$$(AB)^{-1} = B^{-1}A^{-1}$$
$$(A^{-1})^{-1} = A$$
$$(A^{T})^{-1} = (A^{-1})^{T}$$
$$(A + B)^{-1} \neq A^{-1} + B^{-1} \quad \text{(normally)}$$

Example:
Suppose $A$, $B$ and $C$ are all invertible matrices, and
$$[C^{-1}A + X(A^{-1}B)^{-1}]^{-1} = C,$$
express $X$ in terms of $A$, $B$ and $C$.
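
One possible route through this example, offered as a sketch rather than the only way to do it, uses nothing beyond the properties of the inverse listed above. It also assumes that pre- and post-multiplication are kept strictly in order, since matrix multiplication does not commute.

$$C^{-1}A + X(A^{-1}B)^{-1} = C^{-1} \qquad \text{(invert both sides)}$$
$$C^{-1}A + XB^{-1}A = C^{-1} \qquad \text{(since } (A^{-1}B)^{-1} = B^{-1}A\text{)}$$
$$XB^{-1}A = C^{-1} - C^{-1}A = C^{-1}(I - A)$$
$$X = C^{-1}(I - A)A^{-1}B = C^{-1}(A^{-1} - I)B \qquad \text{(post-multiply by } A^{-1}B\text{)}$$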

7.2.1 Consistency

What if there isn't a solution?

We have taken for granted so far that our systems of linear equations will be
solvable, however ...
For now, we notice that,
Definition | Consistency
A linear system having no solutions is said to be inconsistent, whereas
a system having one or more solutions is said to be consistent.
Example:
Determine whether the following linear system is consistent,
$$[A|b] = \left(\begin{array}{ccc|c} 1 & 3 & 5 & 3 \\ 0 & 1 & 2 & 2 \\ 0 & 0 & 0 & 1 \end{array}\right).$$

A relationship to the Determinant??

Example:
Determine whether the following linear system is consistent by checking
the determinant of $A$,
$$[A|b] = \left(\begin{array}{ccc|c} 1 & 3 & 5 & 3 \\ 0 & 1 & 2 & 2 \\ 0 & 0 & 0 & 1 \end{array}\right).$$
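
Both checks can be sketched in a few lines of Python. This is only an illustration under two assumptions: the augmented matrix is already in triangular (row-reduced) form, as it is in the example, and the entries are as printed. The function names are hypothetical.

```python
def contradictory_rows(augmented):
    """Indices of rows that read 0 = c with c != 0 (a sign of inconsistency).

    Only meaningful once the augmented matrix has been row-reduced.
    """
    return [i for i, row in enumerate(augmented)
            if all(c == 0 for c in row[:-1]) and row[-1] != 0]

def triangular_det(a):
    """Determinant of a triangular matrix: the product of its diagonal."""
    d = 1
    for i in range(len(a)):
        d *= a[i][i]
    return d

augmented = [[1, 3, 5, 3],
             [0, 1, 2, 2],
             [0, 0, 0, 1]]
A = [row[:-1] for row in augmented]   # coefficient matrix without b

print(contradictory_rows(augmented))  # [2]: the last row says 0 = 1
print(triangular_det(A))              # 0: A is singular
```

Note that a zero determinant on its own only says there is no unique solution; it is the contradictory row that shows this particular system has no solution at all.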

Independence, Matrix style

Whenever we have the case that one row or column is not independent of
another row or column (respectively) in our matrix, we will find the inverse
(and so, a solution) hard to come by;
This is the principle of linear independence... if for some reason you find
that a matrix is singular, then you should look at the relationship between
the rows, or between the columns, or for zero rows or columns.
What is happening? ... you have less information than you need to solve the
system! e.g. 3 (independent) equations and 4 unknowns ... no unique solution!

7.3 Matrices on Computers

An easier way?
In reality, many matrix manipulations and computations are hard work!
However, computers don't get tired like humans do!
There are some tricks, however ... we'll look at a few.

Terminology
In Microsoft Excel, matrices are known as arrays. There are some special
functions that can be used on them:
+, - : Add and subtract matrices as normal;
MMULT : Matrix multiplication;
TRANSPOSE : Transpose a matrix;
MDETERM : Determinant of a matrix;
MINVERSE : Inverse of a matrix.
Additionally, functions can be nested:
=MMULT(TRANSPOSE(A1:C3),MINVERSE(E1:G3))
Although...
Functions that would be nice, but we don't get to use:
A function to quickly create I, identity matrices (on some systems there is a
function called MUNIT);
A function to construct cofactor matrices;
A function for the adjoint;
etc.
The Golden Rule of Matrices on a Computer
There is always a special way to tell the program that we are dealing with an
array:
Each of the numbers is related to the others;
They should be considered together!
Definition | The Golden Rule of Array Computing
After entering a formula that intends to treat any row- or column- referenced
block of numbers as an array, rather than just hitting return or enter you
must hit:
ctrl + shift + enter
this is the golden rule of array computing!
Examples...
... to the software!
Note: Lecture notes from here... A clean version of the spreadsheet used in these lecture examples will be placed on the web. You are encouraged to use and play with
it.

Lecture 8

Probability I: Permutations and Combinations

8.1 Introduction

We now take a couple of lectures to think about probability. This is partly because
it is another key feature of various economic problems (especially situations to do
with uncertainty) but also partly because it helps as a primer for further thinking
in probability and its cousin, statistics.
For this lecture, we will focus on ways of counting things up. This may not seem
related to probability at first, but the fact is, if we cannot count the number of
possible occurrences, then we cannot determine the likelihood or probability of
seeing the one we have before us. More on this application in the following lecture.
Before we begin, one more note on 'probability'. This is a very common word
in the popular media and discussions in general. For instance, the weather-man
talks about a '25% probability of rain tomorrow', our family doctor talks about
a 'one-in-1,500 chance' that we might inherit a particular disease trait, or even the
government might say there is a 'very low probability' of any further interest rate rise
in the next two years. Do they all mean the same thing by the word 'probability'?
Sort of; however, sometimes we slip into using the word a bit informally.
The first two examples are quite correct in that the weather-man and the
doctor are asserting that of all the cases we know about in the past that are like
this one, x% of them had y characteristic. To translate for the weather-man, he is
saying, 'of all the days that are meteorologically similar to what we expect tomorrow
to be, 25% of them experienced rainfall'. For the doctor this is, 'of all the people we
know of who have your heritage, an average of 1 in 1,500 of them had this particular
disease'. Notice that in both cases, each professional is asserting that they know
about the population (all days like tomorrow, or all people with the same heritage
as you), and therefore can say something about the instance in front of them. This
is what probability is about: knowing information about a whole population gives
rise to estimations of likelihoods about a particular event. Statistics, on the other
hand, goes the other way (sample to population). More on this presently.
Agenda
1. Why counting?
2. Permutations: counting arrangements;
3. Combinations: counting selections;
4. Relationships in counting.

8.2 Why Counting?

Scenario: Petrol Theft!

A spate of petrol thefts is hitting Sydney. You have been asked to assist the exasperated NSW Police department. They have (blurry) surveillance footage of a few
number-plates but can't figure out how to track the owners down, or even how
many cars might have the pieces of the plate they can see... can you help?

High prices fuel increase in drivers doing a runner

Eamonn Duff, August 21, 2005, The Sun-Herald
Rising petrol prices have sparked a dramatic increase in the number of motorists
who drive away from service stations without paying for fuel. ... Petrol prices
hit a record high last week of 126.9 cents a litre and industry voices said it was
no coincidence that service stations were experiencing a high number of incidents
involving motorists who fill, then flee. ... A NSW Police spokesman told The Sun-Herald ... "There are those who commit these crimes using stolen cars or number
plates. Then there are those who obscure their number plates and quite often our
investigations run into a dead end."

Are we doing Statistics or Probability?


The Petrol Theft scenario is a situation where we want to know how many
number plates exist in the population, that is, the probability or likelihood
that we would see these plates on a car;
The trick is, we can do this for number plates because we can calculate (by
counting) every possible number plate combination;
This is not always the case...
Another situation may be that we only have one (or a few) samples, and an
unknown (or unknowable) population, and we would like to know what this
sample tells us about the population.

Statistics (inference): We have a sample, and infer about an unknown population.

Probability (likelihood): We can know the population, and ask the probability of getting the sample.
Note: Statistics and BES. We won't be studying any statistics in QABE. The sister
quantitative course, Business and Economic Statistics, does tackle a range of topics
in statistics (and a few probability topics that we don't touch on). The distinction
between probability and statistics should help to orient your understanding in
both areas.

8.3 Basic Counting

Back to Counting the Population...

HPW 8.1

We'll be looking at probability for now...
Hence, we need to come up with ways of counting up the number of members
of a particular population (so that we can say something about the probability
of obtaining our particular sample).

8.3.1 By Boxes

Example: The number-plate population
The first step in dealing with our scenario is to work out the total number
of number plates possible of the kind {LL ## LL}, where L stands for
any capital letter from A-Z and # stands for any digit from 1-9.

Example:
NSW recently changed its plates from the kind {LLL ###} to {LL
## LL}. What's the difference in the number of plates available to the
RTA due to the change?

8.3.2 By Trees

Or in other words...
Another way to think about the boxes method, is to translate the problem into
a tree diagram.
Tree Diagrams
1. Are made of an initial node;
2. Links show possible paths of construction;
3. Each successive bank of nodes (a choice) is called a level.
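[Figure: tree diagram for building a number plate, starting from an initial node, with successive banks of nodes (letter choices A, B, C, ... and digit choices 1, 2, 3, ...) labelled Level 1 to Level 6.]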

By another name
Seen in this way (as a tree), the question becomes, in how many ways can I
construct a path through the tree?
Definition | Basic Counting Principle
Given some procedure having k levels. If $n_1$ is the number of choices for the procedure at the first level, $n_2$ at the second, and so on, then the total number of
different ways the procedure can occur (by the basic counting principle)
is simply,
$$n_1 \cdot n_2 \cdot n_3 \cdots n_{k-1} \cdot n_k. \qquad (8.1)$$
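
As an illustration of the principle, the two number-plate examples can be counted with a short Python sketch. It assumes, as the example states, 26 capital letters and the digits 1-9 (so 0 is excluded); the function name `count_ways` is hypothetical.

```python
from math import prod

LETTERS = 26   # capital letters A-Z
DIGITS = 9     # digits 1-9, as stated in the example (assumption: 0 excluded)

def count_ways(choices_per_level):
    """Basic counting principle: multiply the number of choices at each level."""
    return prod(choices_per_level)

# {LL ## LL}: two letters, two digits, two letters
new_style = count_ways([LETTERS, LETTERS, DIGITS, DIGITS, LETTERS, LETTERS])
# {LLL ###}: three letters, three digits
old_style = count_ways([LETTERS, LETTERS, LETTERS, DIGITS, DIGITS, DIGITS])

print(new_style)              # 37015056
print(old_style)              # 12812904
print(new_style - old_style)  # 24202152
```

If the digit 0 were allowed as well, `DIGITS` would be 10 and all three numbers would change accordingly.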

Example: For the record...
The police have a photo from the Kingsford BP petrol station that shows
AA 3 K. How many number plates could have this configuration? What
proportion of the total number do they comprise?

8.4 Permutations

When the supply is limited...

Up till now, we have been allowed to have as many (say) A's as we wanted in
our letter boxes ... suppose that is not true; suppose we are only allowed to
choose from a set number of distinct objects?

For example, suppose the police know that a thief has bought exactly (and
only) six decals to put onto a blank number-plate: how many possible number
plates could he make?

This question asks about permutations:

Definition | Permutations
Suppose a basket contains exactly n different (distinct) objects. If an
arrangement (an ordering) of r of these were taken from the basket, we would
have a permutation of n objects taken r at a time. The total number of
ways of doing this, the number of permutations, is given by the symbol
${}_{n}P_{r}$ or ${}^{n}P_{r}$.

Example:
Suppose that number plates are of the (old) form LLL ###
and the thief mentioned above is known to have the letters
{B, D, E, F, J, L, M, P, R, S, U, V } and the numbers {2, 5, 6, 7, 8, 9}. How
many different rear number plates could he come up with?

8.4.1 By the formula

Is there a rule?
This is OK when n and r are small, but would be a pain for larger factors...
Actually, we can make some progress on the maths by using factorials (!).
Writing out what we just did (in the previous example) we have,
$${}_{12}P_{3} = 12(12-1)(12-2)$$
Suppose we choose 6 instead,
$${}_{12}P_{6} = 12(12-1)(12-2)(12-3)(12-4)(12-5)$$
It is a series, but when should it stop? Try...
$${}_{n}P_{r} = n(n-1)(n-2)(n-3)\cdots(n-r+1)$$
One more trick (see text) leads to the following definition:

Definition | The Permutation Formula
Given n distinct objects, the number of permutations, taking r at a time, is
given by the factorial fraction,
$${}_{n}P_{r} = \frac{n!}{(n-r)!} \qquad (8.2)$$
In fact, your calculator knows how to do this already with the ${}_{n}P_{r}$ key.
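
The formula is easy to check away from the calculator too. The sketch below uses Python's standard `math` module; `permutations_count` is a hypothetical helper written to mirror formula (8.2). The final line assumes each decal can be used only once, with three letters chosen from the thief's 12 and three digits from his 6.

```python
from math import factorial, perm

def permutations_count(n, r):
    """Number of permutations of n distinct objects taken r at a time: n!/(n-r)!"""
    return factorial(n) // factorial(n - r)

print(permutations_count(12, 3))   # 1320
print(permutations_count(12, 6))   # 665280
print(perm(12, 3))                 # 1320, built-in equivalent (Python 3.8+)

# Old-style plate LLL ### from 12 distinct letter decals and 6 distinct digit decals
print(permutations_count(12, 3) * permutations_count(6, 3))   # 158400
```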

Example: On Factorials
Try to do $\frac{112!}{109!}$ on your calculator. Problems? Can you work it out another
way??
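
One way around the calculator problem, sketched here as a hint, is to cancel before multiplying: every factor of $109!$ also appears in $112!$, so only three factors survive.

$$\frac{112!}{109!} = \frac{112 \cdot 111 \cdot 110 \cdot 109!}{109!} = 112 \cdot 111 \cdot 110 = 1\,367\,520$$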

8.5 Combinations

When the order doesn't matter...

HPW 8.2

Recall our thief had some letter and number decals that he was using to make
up fake number plates;
Suppose now that there is a glitch in the video recording system, and that when
they use the recognition technology, all it tells the police is which selection
of letters and numbers was used on the fake plate, not what order they were
in;
The question is then, in how many ways can I select (not order) r objects
from a total of n objects?
Counting combinations has wide application, since it is what happens whenever
we have a constraint on the feedstock (the n objects) but don't care about the
ordering, just which of the possible objects is selected.
Example: how many different soccer teams can be selected from a group of
20 school yard kids (without regard to on-field positions)?
Working it out
For starters, suppose we were forming teams of 3 players, from a total of 5
kids.
Label the kids {A,B,C,D,E}, and write down all the permutations you could
get (choosing 3 from 5):

ABC  ABD  ABE  ACD  ACE  ADE  BCD  BCE  BDE  CDE
ACB  ADB  AEB  ADC  AEC  AED  BDC  BEC  BED  CED
BAC  BAD  BAE  CAD  CAE  DAE  CBD  CBE  DBE  DCE
BCA  BDA  BEA  CDA  CEA  DEA  CDB  CEB  DEB  DEC
CAB  DAB  EAB  DAC  EAC  EAD  DBC  EBC  EBD  ECD
CBA  DBA  EBA  DCA  ECA  EDA  DCB  ECB  EDB  EDC

Look carefully at the way we have written them down ... actually each of the
columns is just a re-arrangement of three base letters:

ABC  ABD  ABE  ACD  ACE  ADE  BCD  BCE  BDE  CDE

In fact, we can use this to get to a formula for combinations, since each
column in the previous example is just a rearrangement of r objects, which gives r!
rearrangements that we want to cancel out:
Definition | The Combination Formula
Given n distinct objects, the number of combinations, that is, selecting r
at a time without regard to order, is given by the factorial fraction,
$${}_{n}C_{r} = \frac{{}_{n}P_{r}}{r!} = \frac{n!}{r!(n-r)!} \qquad (8.3)$$
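
The cancelling argument can be checked by brute force. This Python sketch lists every ordered arrangement and every unordered selection of 3 kids from 5, using only the standard library. The variable names are illustrative.

```python
from itertools import combinations, permutations
from math import comb, factorial

kids = "ABCDE"
ordered = list(permutations(kids, 3))   # ordered arrangements, e.g. ('A', 'C', 'B')
teams = list(combinations(kids, 3))     # unordered selections, e.g. ('A', 'B', 'C')

print(len(ordered))   # 60
print(len(teams))     # 10
# Each team appears r! = 6 times among the ordered arrangements
print(len(ordered) // factorial(3))   # 10
print(comb(5, 3))                     # 10, built-in equivalent of formula (8.3)
```

The same built-in gives, for instance, `comb(15, 3) == 455` selections of three decals from fifteen.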

Example: Back to the thief


Suppose now that there is a glitch in the video recording system, and that
when they use the recognition technology, all it tells the police is which
selection of letters and numbers was used on the fake plate, not what
order they were in. If the police know that a thief has 15 different lettering
decals, how many combinations of (just) the three letters will the thief be
able to make?

8.6 Summary

Case by case...

Sensitive to Order? | Limited stocks? | Case                | Formula                                | Example
yes                 | no              | Basic Counting      | $n_1 n_2 \cdots n_k$                   | Number plates
yes                 | yes             | Permutations        | ${}_{n}P_{r} = \frac{n!}{(n-r)!}$      | Letter arrangements in Scrabble
no                  | yes             | Combinations        | ${}_{n}C_{r} = \frac{n!}{r!(n-r)!}$    | Choosing an executive team
no                  | no              | Less Basic Counting |                                        | Assigning non-exclusive jobs to workers

Lecture 9

Probability II: Probability in action

9.1 Introduction

Continuing our two-part look at probability, we now get our teeth into more complicated scenarios of probability. In particular, we will be interested to consider what
makes sense when we think about how two, or three, or more events occur in time.
Along the way, we'll notice that the ordering of the events matters sometimes, but
not always; it's a question of dependence.
Most of our time will be spent drawing and understanding a very useful visualisation of probability problems known as probability trees. It should be pointed
out that there is more than one way of thinking about probability; for example,
seeing probability in terms of sets is another helpful one. However, in the interests
of time, and because trees are especially useful for introducing more complicated
topics like Bayes' Formula, we'll stick to the trees!
Agenda
1. Probability by the trees;
2. Rules of probability for independent events;
3. Conditional probability and Bayes' Formula.

9.2 Probability Trees

Re-introducing Trees
We looked at Probability Trees last time, as a diagram with levels.
[Figure: tree diagram with levels, branching to letter nodes (A, B) and digit nodes (1, 2, 3).]
Consider drawing cards from a pack: two mutually exclusive events associated with drawing a card are,
[Figure: probability tree with two branches, Pr(Black) leading to Black and Pr(Red) leading to Red.]

By the cards...

In fact, any number of mutually exclusive events can be represented with a
tree;
Consider drawing cards, but this time, focussing on the suit,
[Figure: probability tree with four branches, one per suit, each with probability 13/52.]
With a 52 card pack, there are 13 cards of each suit, giving a probability of
each event occurring (drawing a card of that suit) equal to $\frac{13}{52}$.

9.3 Rules of Probability

9.3.1 Multiplication

Definition | Multiplication Law
In general, if two events A and B can occur in sequence, then the probability
that both events occur is the probability that one event occurs, times the
probability that the other event occurs given that the first has occurred, or:
$$\Pr(A \cap B) = \Pr(A)\Pr(B|A) = \Pr(B)\Pr(A|B).$$


Example: Multiplication Law
Suppose that the administration of the University of Coogee have been
doing some surveys of the timing of students dropping out of their five-year degree programs. The results indicate that 25% drop out after first
year, and then for the subsequent three years, 10%, 8% and 4% of the
students who stay each year will drop out. Given these data, what is the
probability that a first year student will drop out after 2, 3 and 4 years?
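
One way to organise the working is sketched below in Python. It reads "drop out after k years" as staying through every earlier year and then leaving at the end of year k, which is an interpretation of the question rather than something the example states. The function name is hypothetical.

```python
from fractions import Fraction

# Conditional drop-out rates: 25% after year 1, then 10%, 8% and 4%
# of those still enrolled after years 2, 3 and 4.
dropout_rates = [Fraction(25, 100), Fraction(10, 100),
                 Fraction(8, 100), Fraction(4, 100)]

def dropout_probabilities(rates):
    """Pr(drop out after year k) for k = 1, 2, ..., by the multiplication law."""
    still_enrolled = Fraction(1)          # probability of reaching this year
    result = []
    for rate in rates:
        result.append(still_enrolled * rate)   # Pr(stayed so far) * Pr(drop | stayed)
        still_enrolled *= 1 - rate
    return result

probs = dropout_probabilities(dropout_rates)
for year, p in enumerate(probs, start=1):
    print(year, p, float(p))
# year 2: 3/40 = 0.075, year 3: 27/500 = 0.054, year 4: 621/25000 = 0.02484
```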

9.3.2 Conditional Probability

Conditional Probabilities

HPW 8.5

Often, we have a sequence of events, where the possible outcome of the second
layer depends on the first layer.

Example: Conditional Probability
Given that a card picked from a full deck is red-suited, what is the probability that it is diamond-suited?

Definition | Conditional Probability
If two events, $A_1$ and $B_1$, can occur in sequence, then the conditional
probability of event $B_1$ occurring, given that event $A_1$ has occurred, is
expressed as,
$$\Pr(B_1|A_1) = \frac{\Pr(B_1 \cap A_1)}{\Pr(A_1)}.$$
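
Applied to the card example, and writing $R$ for "red-suited" and $D$ for "diamond-suited" (labels introduced here for illustration), the definition gives the following. Every diamond is red, so $\Pr(D \cap R) = \Pr(D)$.

$$\Pr(D|R) = \frac{\Pr(D \cap R)}{\Pr(R)} = \frac{13/52}{26/52} = \frac{1}{2}$$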

[Figure: two-level probability tree. First-level branches Pr(A1) and Pr(A2) lead to nodes A1 and A2; from each node Ai, second-level branches Pr(B1|Ai) and Pr(B2|Ai) lead to nodes B1 and B2.]

9.3.3 Probability Trees

Multiple levels...
If there is more than one level, we have some kind of order to the events.

Probabilities given on Probability Trees: When drawing Probability Trees, it is conventional to give the conditional probability
of the event occurring for all branches deeper than the first level.
[Figure: probability tree with first-level branches Pr(A) and Pr(B), and second-level branches Pr(C|A), Pr(D|A), Pr(E|B) and Pr(F|B).]

Definition | Probability Tree


A probability tree shows one or more events that can occur, by using labels for each event (the nodes) and branches that indicate allowable paths
through the event tree. At each level, the events with associated probabilities, e.g.
$\Pr(A_1), \ldots, \Pr(A_k)$, must:
1. Be mutually exclusive events: no two events can happen simultaneously;
2. Be exhaustive: the sum of the probabilities given equals 1 (no possible
events are missing).
[Figure: two-level probability tree. First-level branches Pr(A1) and Pr(A2) lead to nodes A1 and A2; from each node Ai, second-level branches Pr(B1|Ai) and Pr(B2|Ai) lead to nodes B1 and B2.]

9.3.4 Independent Events

Suppose we were to draw a Probability Tree to represent the flipping of a (fair) coin
twice:
[Figure: two-level probability tree. First-level branches Pr(H1) and Pr(T1); second-level branches Pr(H2|H1), Pr(T2|H1), Pr(H2|T1) and Pr(T2|T1).]

In this case, we would actually have the scenario that,
$$\Pr(H_2|H_1) = \Pr(H_2)$$
$$\Pr(T_2|H_1) = \Pr(T_2)$$
That is, the order doesn't matter, or in other words, we are dealing with independent events:
Definition | Independent Events
Suppose A and B are events with positive probabilities. If it is true that,
$$\Pr(B|A) = \Pr(B) \quad \text{or} \quad \Pr(A|B) = \Pr(A),$$
then A and B are said to be independent events.

Special Multiplication Law: Notice that if A and B are independent events then we simply have,
$$\Pr(A \cap B) = \Pr(A)\Pr(B)$$
e.g. the probability of getting two heads in a row is just $(\frac{1}{2})(\frac{1}{2})$.
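
The independence of two coin flips can also be confirmed by listing the sample space. The following Python sketch assumes a fair coin, so that the four outcomes are equally likely. The names `pr`, `h1`, `h2` and `both` are illustrative only.

```python
from fractions import Fraction
from itertools import product

# The four equally likely outcomes of two fair flips: HH, HT, TH, TT
outcomes = list(product("HT", repeat=2))

def pr(event):
    """Probability of an event as (favourable outcomes) / (all outcomes)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def h1(o):
    return o[0] == "H"        # head on the first flip

def h2(o):
    return o[1] == "H"        # head on the second flip

def both(o):
    return h1(o) and h2(o)    # two heads in a row

pr_h2_given_h1 = pr(both) / pr(h1)     # conditional probability definition

print(pr_h2_given_h1, pr(h2))          # 1/2 1/2, so Pr(H2|H1) = Pr(H2)
print(pr(both), pr(h1) * pr(h2))       # 1/4 1/4, the special multiplication law
```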

9.4 Bayes' Formula

The two-stage problem

HPW 8.7

Scenario: To study, or not to study...

Suppose that a large study is conducted of first-year students studying QABE. The
purpose of the study is to investigate if the final examination is a good test of
whether students undertake consistent study in the course (that is, studying at
least 4 hours outside university hours, every week of session). If results from a post-course questionnaire indicate that 85% of students undertook consistent study, of
whom 81% pass, whilst for the rest (those who don't undertake consistent study)
only 54% pass, then what is the probability that a student selected at random who
did pass actually studied consistently?
This scenario is one where two stages are at play;
More than that, we have a contingent probability as before, but this time,
the unknown is in the first stage.
[Figure: probability tree. First level: Pr(S) = 0.85 leading to S and Pr(N) = 0.15 leading to N. Second level: Pr(P|S) = 0.81 and Pr(F|S) = 0.19 from S; Pr(P|N) = 0.54 and Pr(F|N) = 0.46 from N.]

We want:
$$\Pr(S|P) = \frac{\text{Prob.(Study and Pass)}}{\text{Prob.(Pass)}}$$
or, as a contingent probability,
$$\Pr(S|P) = \frac{\Pr(S \cap P)}{\Pr(P)}$$
with numbers,
$$\Pr(S|P) = \frac{(0.85)(0.81)}{(0.85)(0.81) + (0.15)(0.54)}$$
Notice that this question asks things the other way around to our normal
thinking for conditional probability;
Here, the sequence is backwards:
1. Rather than: Prob. of layer-2 event, given layer-1 event;
2. We have: Prob. of layer-1 event, given layer-2 event!
This has a special name in probability theory ...
Definition | Bayes' Formula
Suppose $A_1, \ldots, A_n$ is an exhaustive list of n mutually exclusive events that
can occur for a given population S, and B is any event in S such that
$\Pr(B) > 0$. Then the conditional probability of some $A_i$, given that event B
has occurred, is given by,
$$\Pr(A_i|B) = \frac{\Pr(A_i)\Pr(B|A_i)}{\Pr(A_1)\Pr(B|A_1) + \cdots + \Pr(A_n)\Pr(B|A_n)}, \qquad (9.1)$$
which is the general form of Bayes' Formula.
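
Formula (9.1) translates almost line for line into code. The sketch below is a minimal Python illustration; `bayes_posterior` is a hypothetical helper name, and the inputs are the study scenario's figures (85% study consistently, with pass rates of 81% and 54%).

```python
from fractions import Fraction as F

def bayes_posterior(priors, likelihoods, i):
    """Pr(A_i | B), given priors Pr(A_1..n) and likelihoods Pr(B | A_1..n).

    Assumes the A's are mutually exclusive and exhaustive (priors sum to 1).
    """
    # Denominator: total probability of B, summed over every path to B
    total = sum(p * l for p, l in zip(priors, likelihoods))
    return priors[i] * likelihoods[i] / total

# A_1 = studied consistently, A_2 = did not; B = passed
priors = [F(85, 100), F(15, 100)]
pass_rates = [F(81, 100), F(54, 100)]

pr_study_given_pass = bayes_posterior(priors, pass_rates, 0)
print(pr_study_given_pass, float(pr_study_given_pass))   # 17/19, about 0.895
```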

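Or, in other words...
For Probability Trees, Bayes' Formula is asking,
$$\Pr(A_i|B_j) = \frac{\text{The Prob. of a path through } A_i \text{ to } B_j}{\text{The sum of all Prob.s for paths to } B_j}$$
[Figure: two-level probability tree. First-level branches Pr(A1) and Pr(A2) lead to nodes A1 and A2; from each node Ai, second-level branches Pr(B1|Ai) and Pr(B2|Ai) lead to nodes B1 and B2.]

In two-layer problems, this reduces to (for example, for $A_1$ given $B_2$):
$$\Pr(A_1|B_2) = \frac{\Pr(A_1)\Pr(B_2|A_1)}{\Pr(A_1)\Pr(B_2|A_1) + \Pr(A_2)\Pr(B_2|A_2)}$$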

Example: Applying Bayes Formula


A car manufacturing plant has three car chassis production machines, M1 ,
M2 and M3 . For historical reasons, a car rig on the production line has a
0.6, 0.3 and 0.1 probability of going to each machine respectively. If the
three machines add a chassis to a rig without fault with 0.55, 0.60 and 0.30
probabilities respectively, what is the probability that a rig with a chassis
fault came from M2 ?
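
A sketch of the working, writing $F$ for "the rig has a chassis fault" (a label introduced here): the fault probabilities are the complements of the fault-free ones, so $\Pr(F|M_1) = 0.45$, $\Pr(F|M_2) = 0.40$ and $\Pr(F|M_3) = 0.70$. Bayes' Formula then gives

$$\Pr(M_2|F) = \frac{(0.3)(0.40)}{(0.6)(0.45) + (0.3)(0.40) + (0.1)(0.70)} = \frac{0.12}{0.46} = \frac{6}{23} \approx 0.261$$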

Example: More Bayes' Formula
Suppose that the plant from the previous example restructures its car
chassis production line. Now they have just two machines doing 50% of
the work each: M1, which always adds the chassis to the rig without fault,
and (the old) M2, which has had a mid-operation check added to it which
is rejecting 5% of the chassis due to early faults; the other 95% go on to
the final stage, where 90% of them are faultless. Given that a complete
chassis is faultless, what's the probability it came from M2 now?

Lecture 10

Markov Chains

10.1 Introduction

An important part of the study of probabilities refers to independent trials processes.


These processes form the basis of classical probability theory and much of statistics.
We know that when a sequence of experiments forms an independent trials process,
the possible outcomes for each experiment are the same and occur with the same
probability. Moreover, knowing the outcomes of previous experiments has no effect
on our predictions for the outcomes of the next experiment. Modern probability
theory studies experiments for which knowing the previous outcomes has a direct
impact on the predictions for future experiments. In principle, for a given sequence
of experiments, all of the past outcomes could influence the predictions for the next
experiment. For example, this should be the case in predicting a student's grades
on a sequence of exams in a course.
Agenda
1. Markov chain, definition and characteristics.
2. Markov chain and Game Theory.

10.2 The Basics

In 1907, Andrei Markov started studying a very important new type of experiment.
In these processes, the outcome of a given experiment can affect the outcome of the
next experiment. This type of stochastic process is called a Markov chain.

Definition | Markov Chain


A Markov chain is a sequence of trials of an experiment in which the
possible outcomes of each trial, which are called states, remain the same
from trial to trial, are finite in number, and have probabilities that depend
only upon the outcome of the previous trial.
HPW 9.3

10.2.1 Employment Example

Example
Economically active people are either employed, unemployed or self-employed.
Suppose they are never employed for two periods in a row.
If they are employed in one period, they are just as likely to be unemployed
as self-employed the next period.
If they are unemployed or self-employed, they have an even chance of being in
the same job state the next period.
If there is a change from unemployment or self-employment, only half of the
time is this a change to an employed status.
Example: The Employment Problem
Form a Markov chain with this information. First, identify the system and
states.

Example: The Employment Problem (cont.)


Identify the conditional probabilities.

If we collect all this information in a matrix we obtain the transition matrix
T, which is the same at every stage of the sequence of observations:
$$T = \begin{array}{l|ccc} \text{Next State} \backslash \text{Current State} & \text{Self Emp} & \text{Emp} & \text{Unemp} \\ \hline \text{Self Emp} & 1/2 & 1/2 & 1/4 \\ \text{Emp} & 1/4 & 0 & 1/4 \\ \text{Unemp} & 1/4 & 1/2 & 1/2 \end{array}$$
where Self Emp = Self-Employed, Emp = Employed and Unemp = Unemployed.
Formally:
We have a set of states, $S = \{s_1, s_2, \ldots, s_k\}$.
The process starts in one of these states and moves successively from one state
to another.
Each move is called a step or trial.

10.3 Transitions, Regularity and States

10.3.1 Transition Probabilities

Definition | Transition Probabilities
If the chain is currently in state $s_j$, then it moves to state $s_i$ at the next
step with a probability denoted by $t_{ij}$. This probability does not depend upon
which states the chain was in before the current state. The probabilities $t_{ij}$
are called transition probabilities and are all nonnegative. The process
can also remain in the state it is in, and this occurs with probability $t_{jj}$.

Definition | Transition Matrix
A transition matrix for a k-state Markov chain is a $k \times k$ matrix $T = [t_{ij}]$
in which the entry $t_{ij}$ is the probability, from one trial to the next, of moving
to state i from state j. All entries of the transition matrix are nonnegative
and the sum of all entries in each column must be 1 since, for each current
state, the probabilities account for all possible transitions:
$$t_{ij} = P(\text{next state is } i \mid \text{current state is } j).$$

The entries in the first column of the transition matrix T in the employment
example above represent the probabilities for the various kinds of employment condition following a self-employment state. Similarly, the entries in the second and
third column represent the probabilities for the various kinds of employment conditions following an employed or unemployed state, respectively.
Example:
If the person is currently self-employed then what is the probability that he
is unemployed two periods from now?

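Hint: The equation should remind you of a product of two vectors; with the convention
$t_{ij} = P(\text{next state is } i \mid \text{current state is } j)$, we are multiplying the third row of T
(moving to Unemployed) with the first column of T (starting from Self-Employed). This is
just what is done in obtaining the (3, 1)-entry of the product of T with itself.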
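
As a sketch of the calculation, using the convention $t_{ij} = P(\text{next state is } i \mid \text{current state is } j)$ and numbering the states 1 = Self-Employed, 2 = Employed, 3 = Unemployed, the required probability sums over the three possible intermediate states:

$$t_{31}t_{11} + t_{32}t_{21} + t_{33}t_{31} = \tfrac{1}{4}\cdot\tfrac{1}{2} + \tfrac{1}{2}\cdot\tfrac{1}{4} + \tfrac{1}{2}\cdot\tfrac{1}{4} = \tfrac{3}{8}$$

For this particular T the (1, 3)-entry of the product of T with itself happens to take the same value, $\tfrac{3}{8}$.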

Definition |
Let T be the transition matrix of a Markov chain. The ij-th entry $t^{n}_{ij}$ of the
matrix $T^n$ gives the probability that the Markov chain, starting in state $s_j$,
will be in state $s_i$ after n steps, where, for n = 2,
$$t^{2}_{ij} = \sum_{r=1}^{k} t_{ir}t_{rj},$$
for k states.

10.3.2 Regular Markov Chain

Example: The Employment Problem again
Consider again the employment example. We know that the powers of
the transition matrix give us interesting information about the process as
it evolves. We are particularly interested in the state of the chain after a
large number of steps (say 4 in our case). Calculate $T^4$.

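Example: The Employment Problem again, continued...

Let's continue calculating until $T^6$.
$$T^5 = \begin{array}{l|ccc} \text{Next State} \backslash \text{Current State} & \text{Self Emp} & \text{Emp} & \text{Unemp} \\ \hline \text{Self Emp} & 0.400 & 0.400 & 0.399 \\ \text{Emp} & 0.200 & 0.199 & 0.200 \\ \text{Unemp} & 0.399 & 0.400 & 0.400 \end{array}$$
$$T^6 = \begin{array}{l|ccc} \text{Next State} \backslash \text{Current State} & \text{Self Emp} & \text{Emp} & \text{Unemp} \\ \hline \text{Self Emp} & 0.400 & 0.400 & 0.400 \\ \text{Emp} & 0.200 & 0.200 & 0.200 \\ \text{Unemp} & 0.400 & 0.400 & 0.400 \end{array}$$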

Note that after six periods our employment condition predictions are, to three-decimal-place accuracy, independent of the current employment status.
The probabilities for the three types of employment condition, Self-employed,
Employed and Unemployed, are 0.4, 0.2, and 0.4 no matter where the chain
started.
This is an example of a regular Markov chain. For this type of chain,
long-range predictions are independent of the starting state.
Definition | Regular Markov Chain
A transition matrix T is regular if there exists an integer power of T for
which all entries are strictly positive. A regular Markov chain is a
Markov chain whose transition matrix is regular.

10.3.3 State Vector

We now consider the long-term behaviour of a Markov chain when it starts in a
certain state. Usually this is done by specifying a particular state as the starting
state. An initial probability distribution, defined on S, specifies the starting state.


State Vector
Suppose that initially all three employment states have the same probability of
occurring (1/3). These probabilities are called the initial state probabilities and
are collectively known as the initial distribution. They can be represented by a
column vector, called the initial state vector, denoted \(X_0\):
\[X_0 = \begin{pmatrix} 1/3 \\ 1/3 \\ 1/3 \end{pmatrix}\]
Definition | State Vector
The state vector \(X_n\) for a \(k\)-state Markov chain is a \(k\)-entry column
vector in which the \(i\)th entry is the probability of being in state \(s_i\) after the \(n\)th
trial or step. Moreover, its entries are non-negative and sum to 1.
If \(X_0\) is a probability vector which represents the initial state of a Markov chain,
then we think of the \(i\)th component of \(X_0\) as representing the probability that the
chain starts in state \(s_i\).
We consider the question of determining the probability that, given the chain is
in state \(s_j\) today, it will be in state \(s_i\) \(n\) steps from now. We denote this probability
by \(t^{(n)}_{ij}\); collectively these probabilities form \(T^n\).
Definition |
Let T be the transition matrix of a Markov chain, and let \(X_0\) be the probability vector which represents the starting distribution. Then the probability
that the chain is in state \(s_i\) after \(n\) steps is the \(i\)th entry in the vector
\[X_n = T X_{n-1} = T^n X_0.\]

Note: If we want to examine the behaviour of the chain under the assumption
that it starts in a certain state \(s_i\), we simply choose \(X_0\) to be the probability vector
with the \(i\)th entry equal to 1 and all other entries equal to 0.
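The recursion in the definition above can be sketched in a few lines by applying T to the state vector one step at a time. This is an illustration only: the function names are hypothetical, and T is assumed to be the employment transition matrix used in this lecture (columns: current state; rows: next state).

```python
def mat_vec(t, x):
    """Multiply matrix t (list of rows) by column vector x."""
    return [sum(t[i][j] * x[j] for j in range(len(x))) for i in range(len(t))]


def state_after(t, x0, n):
    """Apply X_n = T X_{n-1} n times, starting from X_0."""
    x = list(x0)
    for _ in range(n):
        x = mat_vec(t, x)
    return x


# Assumed employment matrix, states ordered Self Emp, Emp, Unemp.
T = [[0.50, 0.50, 0.25],
     [0.25, 0.00, 0.25],
     [0.25, 0.50, 0.50]]

X0 = [1 / 3, 1 / 3, 1 / 3]      # all three states equally likely at the start
for n in (3, 6):
    print(n, [round(p, 3) for p in state_after(T, X0, n)])

# Starting in a certain state: put probability 1 on that state.
print([round(p, 3) for p in state_after(T, [1, 0, 0], 6)])
```

Because only repeated matrix-vector products are used, the same sketch works for any starting distribution without ever forming \(T^n\) explicitly.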
Example:
In our employment example, let initially all three types of employment be
equally possible, \(X_0 = (1/3, 1/3, 1/3)^T\). Calculate the distribution
of the states after three periods.



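Let's continue calculating until \(X_6\).

\[X_4 = T X_3 = T^4 X_0 = \begin{pmatrix} 0.402 & 0.398 & 0.398 \\ 0.199 & 0.203 & 0.199 \\ 0.398 & 0.398 & 0.402 \end{pmatrix} \begin{pmatrix} 1/3 \\ 1/3 \\ 1/3 \end{pmatrix} = \begin{pmatrix} 0.399 \\ 0.200 \\ 0.399 \end{pmatrix}\]

\[X_6 = T X_5 = T^6 X_0 = \begin{pmatrix} 0.400 & 0.400 & 0.400 \\ 0.200 & 0.200 & 0.200 \\ 0.400 & 0.400 & 0.400 \end{pmatrix} \begin{pmatrix} 1/3 \\ 1/3 \\ 1/3 \end{pmatrix} = \begin{pmatrix} 0.400 \\ 0.200 \\ 0.400 \end{pmatrix} = Q\]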

We can see that, as the number of trials increases, the entries in the state vector
get closer and closer to the corresponding entries in the vector Q.
Remember that
\[Q = \begin{pmatrix} 0.400 \\ 0.200 \\ 0.400 \end{pmatrix}\]
remains unchanged from trial to trial, and
\[TQ = \begin{pmatrix} 0.500 & 0.500 & 0.250 \\ 0.250 & 0 & 0.250 \\ 0.250 & 0.500 & 0.500 \end{pmatrix} \begin{pmatrix} 0.4 \\ 0.2 \\ 0.4 \end{pmatrix} = \begin{pmatrix} 0.4 \\ 0.2 \\ 0.4 \end{pmatrix} = Q.\]

Q is called a steady-state vector for T. It is unique and depends only on T (but
not on the initial state vector \(X_0\)).
Example: Finding Q
How can we quickly find Q in our example?

Hint: Use
\[TQ = Q = IQ\]
\[TQ - IQ = 0\]
\[(T - I)Q = 0.\]
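One way to act on the hint, sketched here under stated assumptions rather than as the lecture's own method: the rows of \((T - I)Q = 0\) are linearly dependent, so one of them is replaced by the requirement that the entries of Q sum to 1, and the resulting system is solved by Gauss-Jordan elimination. The function name `steady_state` is hypothetical, exact fractions are used to avoid rounding, and the chain is assumed to be regular so that a unique solution exists.

```python
from fractions import Fraction


def steady_state(t):
    """Solve (T - I) Q = 0 together with sum(Q) = 1 by Gauss-Jordan elimination."""
    k = len(t)
    # First k-1 rows of (T - I) with right-hand side 0 ...
    rows = [[Fraction(t[i][j]) - (1 if i == j else 0) for j in range(k)] + [Fraction(0)]
            for i in range(k - 1)]
    # ... and the normalisation q_1 + ... + q_k = 1 in place of the last row.
    rows.append([Fraction(1)] * k + [Fraction(1)])
    for col in range(k):
        pivot = next(r for r in range(col, k) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(k):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][k] for i in range(k)]


# Employment matrix written as exact fractions (columns: current state).
T = [["1/2", "1/2", "1/4"],
     ["1/4", "0", "1/4"],
     ["1/4", "1/2", "1/2"]]

Q = steady_state(T)
print([str(q) for q in Q])
```

For the employment matrix this gives the fractions 2/5, 1/5 and 2/5, which agree with the vector Q = (0.4, 0.2, 0.4) found by iteration.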


Example: Finding Q continued

Steady State Vector: Not all Markov chains have a steady-state
vector, but if the Markov chain is regular, there exists a unique
steady-state vector associated with that Markov chain.

10.4

Markov Chains in Game Theory

There are many cases in which economic decisions are made in situations of conflict,
where one party's actions induce a reaction from others. For example:
Wage bargaining between employers and unions, or duopoly.
Deciding how much money to invest in research and development: if one firm
invests in R&D, can a rival firm decide not to follow?
The mathematical theory of games has been applied to economics to help elucidate
problems of this kind.
So, game theory studies strategic interaction in competitive and cooperative
environments because
"The best games are not those in which all goes smoothly and steadily
toward a certain conclusion, but those in which the outcome is always in
doubt." (George B. Leonard)
Doubt is the main factor in a famous game called the Prisoner's Dilemma: two
suspects of a crime are held in separate rooms, are interrogated and cannot communicate with each other. Each entry in the table gives the years in prison for (Player 1, Player 2).

                          Player 2
                      Confess    Deny
Player 1   Confess    (4, 4)     (1, 5)
           Deny       (5, 1)     (2, 2)

If both deny, each gets two years in prison.
But if Player 2 confesses while Player 1 denies, Player 2 gets only one year in prison.
Player 1 would then get five years in prison.
If Player 1 also confesses, he gets four years in prison rather than five, so it is better to confess!
Both then confess and get four years in prison, rather than the two they would get if both denied the
crime!

10.4.1

Repeated Games

What happens if this game is played repeatedly? In an Iterative Prisoner's Dilemma,
what one player chooses to do is based on what happened in the previous round.
Tit-for-Tat is a common strategy, in which one player does what the other one did
in the previous round. So, one player is prepared to cooperate, but doesn't allow
the other one to exploit him!
Iterative Prisoner's Dilemma
An Iterative Prisoner's Dilemma can be seen as a Markov chain with four states:
1. Both players 1 and 2 confess.
2. Player 1 confesses but player 2 denies.
3. Player 1 denies and player 2 confesses.
4. Both players deny.
If they engage in a Tit-for-Tat strategy, the transition matrix (columns: current state 1 to 4; rows: next state 1 to 4) is:
\[T_1 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}\]
Example: Iterative Game
Suppose we start with the state 2 (player 1 confesses but player 2 denies).
The next round, state 3 will be observed (player 2 confesses but player 1
denies). The next round, they will play state 2 again (player 1 confesses
but player 2 denies) and so on. What are the state vectors?
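The state vectors asked for in this example can be generated by repeatedly applying the Tit-for-Tat matrix to the starting vector. A small sketch, with a hypothetical helper `step` and the four states ordered as in the list above:

```python
def step(t, x):
    """One transition: multiply matrix t (list of rows) by column vector x."""
    return [sum(t[i][j] * x[j] for j in range(len(x))) for i in range(len(t))]


# Tit-for-Tat transition matrix (columns: current state, rows: next state).
T1 = [[1, 0, 0, 0],
      [0, 0, 1, 0],
      [0, 1, 0, 0],
      [0, 0, 0, 1]]

x = [0, 1, 0, 0]            # start in state 2 with certainty
history = [x]
for _ in range(4):
    x = step(T1, x)
    history.append(x)

for n, vec in enumerate(history):
    print(n, vec)
```

The printed vectors alternate between state 2 and state 3, so they never settle down to a single steady-state vector from this starting point.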

How can one get out of the trap? The solution is a modified version of Tit-for-Tat
that gives each player, say, a 10% chance of cooperating after the other one has
deviated. The transition matrix (columns: current state 1 to 4; rows: next state 1 to 4) then becomes:
\[T_1 = \begin{pmatrix} 1 & 0.1 & 0.1 & 0.01 \\ 0 & 0 & 0.9 & 0.09 \\ 0 & 0.9 & 0 & 0.09 \\ 0 & 0 & 0 & 0.81 \end{pmatrix}\]

If both deny (state 4), next round they will keep denying with \(0.9 \times 0.9 = 0.81\)
probability,
or player 1 confesses and player 2 denies with \(0.9 \times 0.1 = 0.09\) probability,
or player 2 confesses and player 1 denies with \(0.1 \times 0.9 = 0.09\) probability,
or both confess with \(0.1 \times 0.1 = 0.01\) probability.
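To see what the modified matrix does to the alternating pattern, the chain can be iterated from state 2 as before. This is a sketch only; the name `T_MODIFIED` and the choice of 200 rounds are illustrative assumptions. Since the first column of the matrix is (1, 0, 0, 0), state 1 is never left once it is reached.

```python
def step(t, x):
    """One transition: multiply matrix t (list of rows) by column vector x."""
    return [sum(t[i][j] * x[j] for j in range(len(x))) for i in range(len(t))]


# Modified Tit-for-Tat matrix (columns: current state, rows: next state).
T_MODIFIED = [[1, 0.1, 0.1, 0.01],
              [0, 0.0, 0.9, 0.09],
              [0, 0.9, 0.0, 0.09],
              [0, 0.0, 0.0, 0.81]]

# Each column is a probability distribution over next states.
column_sums = [sum(row[j] for row in T_MODIFIED) for j in range(4)]

x = [0, 1, 0, 0]            # start in state 2, as in the example above
for _ in range(200):        # 200 rounds, chosen only for illustration
    x = step(T_MODIFIED, x)

print([round(s, 10) for s in column_sums])
print([round(p, 6) for p in x])
```

After many rounds almost all of the probability sits in state 1, so the endless alternation between states 2 and 3 is broken.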


Lecture

11

Linear Programming I: Solving problems in a world of constraints

11.1

Introduction

We now move away from the world of matrix algebra and into the world of linear
programming. It is natural to think that this would require us to do some kind of
computer programming(!), but this is not the case. In fact, the word 'programming'
is an artefact of the historical background of the techniques we'll be studying, rather
than saying something about our method.
You see, we will be dealing with the very common problem that anyone faces
when they must attempt to maximize (or minimize) some quantity, subject to a
number of constraints. These constraints might include the minimum production
level that must be attained, or the maximum number of days that can be worked, or
the limit of what an individual can carry at one time.¹ The reason why this process
is called linear programming is that, for a great many problems, the constraint
and success equations are linear in nature, and the 'programming' bit comes from
the names given to various war-time schedules of activity that were the outcome
of solving just these kinds of problems during World War II. They called them
'programmes'.
Before jumping straight into the solution method for these problems, we need to
spend some time considering equations where the left- and right-hand sides don't
necessarily equal each other, but instead must be greater-than, less-than, or some
combination of these with equal-to. These are known as inequalities. We look at
these because they are generally how our constraints will be given. Knowing how to
deal with such constraints, we'll be well-placed to solve bigger linear programming
problems.
Agenda
1. The business headache!
2. Introduction to linear programming;
3. Application to The Bean House.

¹ An account by Chris Bonington, expedition leader of the first ascent of the South West Face of
Mt Everest, describes how he had to solve exactly our kind of linear programming problem; he had
to get an amount of equipment, provisions and oxygen up the mountain to support his climbers
with all kinds of constraints, one of which was the amount that any one climber could carry in a
single load.

11.2

The Business Headache

Why business isn't easy!
Consider some simple business questions:
How many items should I produce for sale?
How many inputs should I order?
How many people do I need to complete the production?
How much do I need to pay them?
Can I afford it??? How much money will I make from sales?
What if things change???
... ?????!!!!
Making some progress
We don't need to think of all the questions at once, individually;
We recognise that the answer to one question will be the input to another
question ...
We are dealing with a system of connected constraints
The way to deal with this complicated system is to write down each question
in terms of a mathematical equation and to put them down as constraints
on our production process;
Then, by looking at all of the constraints together, we will know where our
answer must be (the feasible region);
Then we can choose a 'legal' answer that best suits our needs.
... easy!

HPW 7.1

11.3

Introduction to Linear programming

11.3.1

Equations of two variables

But first...
We need to learn some useful techniques...
Recall, we are quite familiar with linear equations,
\[y = mx + c\]
where \(y\) is the dependent variable and \(x\) is the independent variable
(\(m\) and \(c\) are the gradient and constant respectively);
However, suppose that both \(y\) and \(x\) were independent variables (let's call
them \(x_1\) and \(x_2\) for ease), that is,
\[C = m_1 x_1 + m_2 x_2\]
Graphically...
Suppose we deal with the equation,
\[10x_1 + 4x_2 = 20\]
We can graph it by rearranging to,
\[x_2 = 5 - \frac{10}{4} x_1\]
However, suppose we are dealing with the inequality
\[10x_1 + 4x_2 \le 20\]
Then we would graph the region,
\[x_2 \le 5 - \frac{10}{4} x_1\]
Or, what about,
\[10x_1 + 4x_2 \ge 20\ ?\]
Or, in cases where we have greater than or less than (but not equal to), e.g.
\[10x_1 + 4x_2 > 20\]
we use a dotted line to represent the boundary of the region.

11.3.2

Linear inequalities

Definition |
A linear inequality involving variables \(x\) and \(y\) is of the form,
\(ax + by + c < 0\) or,
\(ax + by + c \le 0\) or,
\(ax + by + c > 0\) or,
\(ax + by + c \ge 0\)
where \(a\), \(b\), and \(c\) are constants and \(a\) and \(b\) can't both be zero.
Graphically, there will be a whole region of (x, y) points that satisfy a linear
inequality;
However, by requiring that several linear inequalities must be satisfied, it is
possible to have,
1. No solution;
2. A bounded region of solutions; or
3. Infinite solutions (unbounded).

11.3.3

Systems of linear inequalities

It is possible to require a number of linear inequalities to be satisfied at the
same time, each contributing a region of points that satisfy a given inequality;
For example, consider the system,
10x1 + 4x2 20
5x1 + 5x2 20

How do we find the region that solves both inequalities?

11.3.4

Graphing the system

Step 1: Rearrange each inequality into y(x) form:


x2 5 (10/4)x1
x2 4 x1

Step 2: Draw the boundary lines as if they were equalities;
Step 3: Identify the region where each is satisfied;
Step 4: Find the intersection of those regions (if it exists).

HPW 7.2

11.4

Linear Programming

11.4.1

Terminology

Linear programming
The equations that we have looked at can be thought of as problem constraints: they determine the region within which we are allowed to draw a
pair of numbers for \(x_1\) and \(x_2\);
Often, the problem has some kind of objective function:
Definition | Objective function
An objective function is a function of some (e.g. \(f(x_1, x_3, x_4)\))
or all (e.g. \(f(x_1, x_2, \ldots, x_n)\)) of the problem inputs, that determines the
overall value of a particular set of chosen input values. In economics, it
is usually a measurement of the net profit.
Linear programming is the process we use to find a set of inputs that
fulfils some problem criteria (e.g. maximize net profits, minimize total mistakes);
The name comes from schedules ('programs') used in WWII.
The solving process...
[Figure: flowchart of the solving process. Start; choose inputs \(x_1\), \(x_2\), \(x_3\); apply the Constraints; evaluate the Objective function \(f(x_1, x_2, x_3)\); "The best solution?"; Yes: Stop; No: try again.]

11.4.2

Application

Back to the business: The Bean House
Recall, we have already looked at the feasibility of building The Bean House,
in terms of NPV and IRR;
Having decided that the project is actually feasible, you and your friend have decided
to start the project, but...
You have a new headache, and your friend has turned to you to figure it out...
The problem...
The Bean House Scenario
You decide to produce two products: House Pack and Boutique Blend. However,
you need to work out how many of each to make each night. Three things seem to
affect the decision:
1. Because of council restrictions, you only have a total of 800 KW hours of
Electricity available each night;
2. Your Raw Beans supplier can only provide 60 kg of Raw Beans each day;
3. You have orders for at least 10 House Packs per day already, due to the demand
of other businesses in the area.
How many House Packs and Boutique Blends should you make each night?
Step 1: write down equations
Let:
\(x_1\) = number of Boutique Blends
\(x_2\) = number of House Packs
(Electricity) It takes 5 KW per Boutique Blend and 14 KW per House
Pack (total) to roast them, so:
\[800 \ge 5x_1 + 14x_2 \qquad (11.1)\]
(Raw Beans) It takes 200 g of Raw Beans per Boutique Blend and 1 kg of
Raw Beans per House Pack to make them, so:
\[60 \ge 0.2x_1 + 1.0x_2 \qquad (11.2)\]
(Minimum House Pack) we must make at least 10 House Packs, so:
\[x_2 \ge 10 \qquad (11.3)\]
Step 2: see it graphically
1. Electricity
2. Raw Beans
3. Minimum House Pack
4. Feasible region...
Step 3: apply the objective function
Objective function is our net profit:
Electricity costs $0.20 per KW;
Raw Beans cost $1.70 per kg;
Boutique Blends sell for $8.50, House Packs for $18.00;
So our net profit (income - costs) is:
\[\pi(x_1, x_2) = (8.50x_1 + 18.00x_2) - 0.20\,(5x_1 + 14x_2) - 1.70\,(0.2x_1 + 1.00x_2)\]
Step 3b: back to the plot
We need to define a profit to draw the line. Try $1,400, $1,300 ...
What is the best solution???
1. Must be in feasible region that...
2. Intersects with the best possible objective function line.
Step 4: solve the maths
Maximum profit occurs where our Electricity and Min House Pack constraints intersect;
So, to find the \((x_1, x_2)\) that represents the point of intersection, solve:
(Electricity) \(800 = 5x_1 + 14x_2\)
(min House Pack) \(x_2 = 10\)
Gives:
Boutique Blend \((x_1) = \frac{800 - (14)(10)}{5} = 132\)
House Pack \((x_2) = 10\)
Net Profit \((\pi)\) = $1,080.12
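The graphical argument can be cross-checked by listing the corner points of the feasible region and evaluating the profit at each one. The sketch below is not the lecture's method but a small illustration of it: the function names are hypothetical, each constraint is written in the form a*x1 + b*x2 <= c, and a non-negativity condition x1 >= 0 is assumed even though the scenario does not state it explicitly.

```python
from itertools import combinations


def corner_points(constraints, tol=1e-9):
    """Feasible intersections of constraint boundaries.

    Each constraint is a tuple (a, b, c) meaning a*x1 + b*x2 <= c.
    """
    corners = []
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < tol:              # parallel boundary lines never meet
            continue
        x1 = (c1 * b2 - c2 * b1) / det  # Cramer's rule
        x2 = (a1 * c2 - a2 * c1) / det
        if all(a * x1 + b * x2 <= c + tol for a, b, c in constraints):
            corners.append((x1, x2))
    return corners


def profit(x1, x2):
    """Net profit: income minus electricity and raw-bean costs."""
    income = 8.50 * x1 + 18.00 * x2
    electricity = 5 * x1 + 14 * x2      # KW used
    beans = 0.2 * x1 + 1.0 * x2         # kg used
    return income - 0.20 * electricity - 1.70 * beans


constraints = [
    (5, 14, 800),     # (11.1) electricity
    (0.2, 1.0, 60),   # (11.2) raw beans
    (0, -1, -10),     # (11.3) x2 >= 10, rewritten as -x2 <= -10
    (-1, 0, 0),       # assumed: x1 >= 0, rewritten as -x1 <= 0
]

best = max(corner_points(constraints), key=lambda p: profit(*p))
x1, x2 = best
print("optimum:", best, "profit:", round(profit(x1, x2), 2))
print("electricity used (KW):", 5 * x1 + 14 * x2)
print("raw beans used (kg):", round(0.2 * x1 + x2, 1))
```

The last two lines also report how much electricity and how many kilograms of raw beans the chosen production plan uses, which shows at a glance which constraints are binding.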




Summing up
How many KW of Electricity will we use? How many kgs of Raw Beans will we
consume (per day)?

[Figure: the Feasible Area plotted with \(x_1\) (Boutique Blend) on the horizontal axis and \(x_2\) (House Pack) on the vertical axis, showing the Constant Profit Lines and the Optimal Production Point.]


Lecture

12

Linear Programming II: Dealing with a Changing World

12.1

Introduction

Last lecture, we introduced the method of linear programming as a useful method
of analysing the business headache. That is, it made sense of seemingly complicated
sets of constraints and profit functions to tell us how we should set things in our
business to achieve the two-fold aim of 1) being feasible, something that is possible
given the available technology; and 2) being the best solution to the problem at
hand.
This time, we'll look closer at all this LP work. For instance, we'll look at cases
where we have not just one 'best' solution, but multiple 'best' solutions. Then,
we'll consider what happens to our business conundrum when things are changed
around us, either because of a change in a constraint due to some kind of supplier
shortage (for example), or perhaps because the item that we are taking to market experiences
a change in its price. The latter of these effects comes under the 'changes to the
objective function' title and, as we'll see, needs much care to ensure we don't miss
a (better) solution for lack of perspective.
Agenda
1. Linear programming's usual suspects: kinds of solutions;
2. The problem of moving solutions;
3. Looking at the margins: what does a change in an input do to our objective
analysis?

12.2

Linear programming usual suspects

HPW 7.2, 7.3

Last time, we dealt with constraints and the objective function, and put them
together to solve The Bean House problem with linear programming;
However, we have different types of solutions that can arise:
1. Bounded;
2. Unbounded;
3. No feasible solution;
4. A feasible solution;
5. Multiple solutions.
Unbounded case: feasible region infinite;
Bounded case: feasible region finite;
Multiple solutions: objective function parallel to a line of constraint;
No solution: feasible region empty;
Reprise on multiple solutions
How do I know when I have multiple solutions? What are they???
1. Proceed as normal for the problem (as we saw last week, and we'll do more of
today);
2. If, through graphing, or through trial and error of corner solutions, you find
that you have
An apparently parallel objective function to a constraint line; or
Two of your corner solutions give rise to the same objective function
result; then
3. You more than likely have a case of multiple solutions,
4. And, any point along the line joining the two corner solutions will
itself be an optimal solution!

12.3

Variations in the LP problem

Recall our The Bean House example from last time. We had the optimum outputs
at (132, 10), yielding a profit of $1,080.12.
What if things change???
We'll consider two types of changes:
1. The effect of changing inputs on our profits;
2. When the objective function itself changes, forcing us to reconsider our
optimal production point.

12.3.1

A Change in the Constraints

A New Power Supply?




Example:
A new supplier of power has entered the market who is willing to supply you
with up to 1000 KW per night (instead of 800 KW per night). What
happens to your profits? What would you be willing to pay in addition to
the usual price for the extra power?

We find that the new optimum point is the solution to the system:
\[1000 \ge 5x_1 + 14x_2\]
\[10 \le x_2\]
Solving we find \(x_1 = 172\) and \(x_2 = 10\).
New profit? ... \(\pi(172, 10)\) = $1,366.52;
Notice that the change in the Electricity constraint was +200 KW and our
profit has gone up by $286.40.
How much would we be willing to pay for that extra 200 KW of Electricity???
200 KW × price/KW + gain in profits = WTP
\[200 \times 0.20 + 286.40 = 326.40\]
Which leads us to the following definition,
Definition | Marginal value of a constraint
The marginal value (also called the shadow price, accounting price,
or scarcity value) is the change in the optimal objective (e.g. the profit)
that would result from a change in the capacity of the constraint by one unit,
all else being left unchanged.
In our case above, we found that the marginal value of one KW of electricity
is $286.40/200 = $1.43.
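The marginal value can be reproduced by re-solving the problem at both electricity capacities and dividing the change in profit by the change in capacity. The sketch below is an illustration with hypothetical function names; it assumes, as in the example, that the Electricity and minimum House Pack constraints are the binding ones at both capacity levels.

```python
def profit(x1, x2):
    """Net profit: income minus electricity and raw-bean costs."""
    income = 8.50 * x1 + 18.00 * x2
    electricity = 5 * x1 + 14 * x2
    beans = 0.2 * x1 + 1.0 * x2
    return income - 0.20 * electricity - 1.70 * beans


def optimum(electricity_cap, min_house_pack=10):
    """Corner where the electricity and minimum-House-Pack constraints bind."""
    x2 = min_house_pack
    x1 = (electricity_cap - 14 * x2) / 5
    return x1, x2


base = profit(*optimum(800))
new = profit(*optimum(1000))
gain = new - base
marginal_value = gain / 200     # change in optimal profit per extra KW

print(round(base, 2), round(new, 2), round(gain, 2), round(marginal_value, 3))
```

If a change in capacity were large enough to make a different constraint binding, the shortcut in `optimum` would no longer apply and the corner points would have to be compared again.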

12.3.2

A Change in the Objective Function

The Boutique Coffee Crash...


Example:
Suppose (with our new Electricity restriction of 1000 KW) that the price of
Boutique Blend falls from $8.50 to $7.50, and then continues
to fall progressively by $1 each month, finally bottoming out at $4.50.
What would be the optimum production point \((x_1, x_2)\) and profit at each
step along the way (8.50, 7.50, 6.50, 5.50, 4.50)?

Recall, we must always check that we haven't changed the character
of the problem, that is, that we are still working on the correct
binding condition.

[Figure: the feasible region with \(x_1\) (Boutique Blend) on the horizontal axis and \(x_2\) (House Pack) on the vertical axis, drawn for a Boutique Blend price of \(p_1\) = $4.50.]

In this case, the binding constraint changed from the Electricity constraint
line, to the Raw Coffee Bean constraint line;
If we plotted the Profit at each production point, we would have found:
[Figure: two plots of Profit ($) against \(p_1\) (Boutique Blend Price). The first shows the profit at Profit point 1 (172,10) and at Profit point 2 (72,45), with the price $6.06 marked; the second shows the Optimal Profit Path.]

12.3.3

Solution Technique

Solving it
1. Write down the equation that has changed;
2. Re-plot the equation, or objective function (to see the new slope);
3. Decide if the change causes:
No change to optimal conditions;
New optimal conditions;
No more feasible solution.
4. If a change has occurred, recalculate (solve) for the new optimal conditions.
5. Check (especially for objective function changes) by trying out each relevant
corner solution and making sure you have the best one.
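Step 5 can be sketched for the falling Boutique Blend price by evaluating the profit at every corner of the feasible region for each price. This is an illustration under stated assumptions: the corner labels and function names are hypothetical, the electricity limit is 1000 KW, x1 >= 0 is assumed, and the Electricity and Raw Beans lines are intersected exactly at (800/11, 500/11), which is close to the point (72, 45) marked in the lecture's plot.

```python
def profit(x1, x2, p1, p2=18.00):
    """Net profit when Boutique Blend sells for p1 and House Pack for p2."""
    income = p1 * x1 + p2 * x2
    return income - 0.20 * (5 * x1 + 14 * x2) - 1.70 * (0.2 * x1 + x2)


# Corners of the feasible region with 1000 KW of electricity available.
corners = {
    "electricity & min House Pack": (172.0, 10.0),
    "electricity & Raw Beans": (800 / 11, 500 / 11),
    "Raw Beans & x1 = 0": (0.0, 60.0),
    "min House Pack & x1 = 0": (0.0, 10.0),
}


def best_corner(p1):
    """Return (label, point, profit) of the most profitable corner at price p1."""
    name = max(corners, key=lambda k: profit(*corners[k], p1))
    return name, corners[name], profit(*corners[name], p1)


for p1 in (8.50, 7.50, 6.50, 5.50, 4.50):
    name, point, value = best_corner(p1)
    print(p1, name, tuple(round(v, 1) for v in point), round(value, 2))
```

Comparing all corners at every price, rather than re-using the previous optimum, is what guards against missing a switch in the binding constraint.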
LP summary
1. As ever, a picture is very helpful!
2. We can also proceed by identifying corner-solutions, and trying these in each
case to find the optimum;
3. Practice!


Lecture

13

Linear Programming III: Using Solver


13.1

Introduction

In the past two lectures we have seen how to set up a linear programming problem
and solve it using the graphical method. The Microsoft Excel program offers us an
alternative method where the problem can be very quickly solved and if necessary
altered and sensitivity analysis performed. To illustrate the use of Solver we will
use a question from a past exam paper in ECON1202. (It has been altered slightly
as it was originally required to be done using the graphical method.)

Agenda

1. Setting up linear program problems using Solver;

2. Interpreting the results;

3. Understanding multiple solutions.


13.2

The Problem
Example: REVCO Motor Company
REVCO motor company has two engine-manufacturing plants in Sydney,
plants A and B, producing the 2-litre clean burning engine used in their
new car model. The maximum production capacity of plants A and B are
50 engines and 55 engines per month respectively.
The car engines are sent by road to the 2 car assembly plants of
the company, one in Adelaide and one in Melbourne. The transport
costs per engine from plant A in Sydney to Adelaide and Melbourne
are $100 and $60 respectively while the transport costs per engine
from plant B in Sydney to Adelaide and Melbourne are $120 and $70
respectively.
In a given month, the Adelaide car assembly plant requires 40 engines while the Melbourne car assembly plant requires 35 engines.
In satisfying the engine requirements of the assembly plants in Adelaide and Melbourne, the objective of the company is to minimise the
transport costs of the engines from Sydney to the 2 assembly plants
in Adelaide and Melbourne. How many engines should be sent to
each plant if cost is to be minimised?

You can try working this out by hand if you like, but let's try another way. First,
we need to set the problem up.

13.2.1

Setting up the Problem

Define the Variables
How do we set up this problem using the least number of variables?
x = number sent from Plant A to Adelaide.
y = number sent from Plant A to Melbourne.
If 40 engines are required in Adelaide then \(40 - x\) engines must come from
Plant B.
If 35 engines are required in Melbourne then \(35 - y\) engines must come from
Plant B.


Objective Function and Constraints
The objective function is:
Transport Cost
\[C = 100x + 120(40 - x) + 60y + 70(35 - y)\]
\[= 100x + 4800 - 120x + 60y + 2450 - 70y\]
\[= -20x - 10y + 7250\]
The constraints are:
Capacity of Plant A: \(x + y \le 50\)
Capacity of Plant B:
\[(40 - x) + (35 - y) \le 55\]
\[-x - y \le -20\]
\[x + y \ge 20\]
And: \(x \ge 0\), \(y \ge 0\).
Also: \(x \le 40\), \(y \le 35\).
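Before turning to Solver, the set-up can be sanity-checked by hand-style corner enumeration: intersect every pair of constraint boundaries, keep the feasible points, and pick the cheapest. The following sketch is an illustration rather than part of the Solver workflow; the function names are hypothetical and every constraint is rewritten in the form a*x + b*y <= c.

```python
from itertools import combinations


def corner_points(constraints, tol=1e-9):
    """Feasible intersections of boundaries of constraints a*x + b*y <= c."""
    corners = []
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < tol:              # parallel boundaries never meet
            continue
        x = (c1 * b2 - c2 * b1) / det   # Cramer's rule
        y = (a1 * c2 - a2 * c1) / det
        if all(a * x + b * y <= c + tol for a, b, c in constraints):
            corners.append((x, y))
    return corners


def cost(x, y):
    """Transport cost with x, y engines sent from Plant A to Adelaide, Melbourne."""
    return 100 * x + 120 * (40 - x) + 60 * y + 70 * (35 - y)


constraints = [
    (1, 1, 50),      # capacity of Plant A: x + y <= 50
    (-1, -1, -20),   # capacity of Plant B: x + y >= 20
    (-1, 0, 0),      # x >= 0
    (0, -1, 0),      # y >= 0
    (1, 0, 40),      # x <= 40
    (0, 1, 35),      # y <= 35
]

corners = corner_points(constraints)
best = min(corners, key=lambda p: cost(*p))
print(best, cost(*best))
```

The starting guess x = 20, y = 20 used in the spreadsheet can be checked with the same `cost` function.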

13.3

Using Solver

Linear Programming by Computer Power


Before setting up the objective function and constraints in Solver, we first need to
make sure that the Solver Add-in is operating. Here's the process for the Microsoft
Office 2007 version of Excel. (In earlier Excel versions the steps are slightly different.)
1. Click on the Office button in the top left corner of a new spreadsheet then
select Excel Options > Add-Ins > Solver Add-in > Go.
2. From the dialog box select the Solver Add-in and OK.
3. If you have never used Add-ins before a configuration process might be necessary. Just follow the prompts.
4. Now check that under the Data tab you can see Solver in the Analysis section.
1. The first step in setting up the problem is to guess some start-up values for
the variables x and y. This is similar to choosing a guess value of the objective
function in the graphical method.
2. We will choose x = 20, y = 20.
3. We enter the labels x and y in cells A1 and B1 and the values 20 in A2:B2.
4. In C1 we enter the label Cost, and in C2 the equation for the cost from the iso-objective function. This equation is = -20*A2 - 10*B2 + 7250.
The result is shown in Screenshot 1.



Screenshot 1:

Note that 6650 is the initial value of Cost resulting from our guess.
Now we will enter the constraints.
1. First enter the labels Constraint 1 . . . Constraint 6 in the range A4:A9.
2. Then enter the expressions on the left hand side of each constraint in formula
form using appropriate cell addresses for x and y.
3. For the first two constraints we need x + y so we enter the formula = A2 + B2
The resulting value for the constraint is derived using the initial guess values of
20 and 20.
Screenshot 2:

Continue until all constraints have their appropriate values in the range B4:B9.
We will set up the remaining parts of each inequality within the Solver.
Click on Solver at the far right of the Excel ribbon section, with the Data tab
selected. We need to enter the target (cost) cell and the cells we are trying to change
as shown in Screenshot 3 and make sure we are minimising cost by selecting Min.



Screenshot 3:

Once this is done click on Add to add the remaining parts of the constraints.
Screenshot 4 shows how Constraint 2 (\(x + y \ge 20\)) is added. Note the correct sign
>= has been chosen and the value 20 entered. Press Add between each addition
and OK once all are completed.
Screenshot 4:

Screenshot 5 shows the Solver Parameters dialog box with all information for
the problem added and the problem ready to be solved.
Screenshot 5:



When you select Solve a new dialog box will open (see Screenshot 6). You need
to click on the words Answer, Sensitivity and Limits to get these three reports. In
this case we will also choose to keep the solver solution.
Screenshot 6:

Once you press OK you will find that the solution is shown in your original
spreadsheet. See Screenshot 7.
Screenshot 7:

Notice that the values of x and y on Sheet 1 are now 40 and 10 and that the
corresponding cost is 6350.
This represents the optimal solution, i.e. minimum cost of shipping engines
that satisfies all the constraints.
We will now look at the information provided in the three reports.
The Answer Report, shown in Screenshot 8, shows the original guess values
for x, y and cost and the final ones.
It also shows which two constraints are binding. In graphical terms this would
mean which lines are intersecting at the optimal corner point of the feasible
region.



Screenshot 8:

Notice that there are no slack values for the binding constraints. All the
available amounts have been used. For Constraint 1 the Slack Value = 0. This
means that all the capacity of Plant A is being utilised.
Looking at Constraint 2 we see that Plant B has 30 units slack or 30 units not
utilised.
The second report is called the Sensitivity Report. See Screenshot 9.
Screenshot 9:



It shows the marginal values of the two binding constraints as being equal to
-10.
Note that on this report they are listed as Lagrange Multipliers. As we will use
that term in another context later in the session, it's best to continue calling
them marginal values here as we did in Lecture 12.
How can we interpret the marginal values?
The marginal value of -10 for Constraint 1 means that if the number of engines
produced by Plant A was changed from 50 to 51 the minimum cost would
decrease by $10, all else being equal.
On the computer it is very easy to check this. Return to the spreadsheet and
run Solver again. This time set the value for Constraint 1 to equal 51 instead
of 50.
Screenshot 10:

When the solution is found we see that the minimum cost on Screenshot 10 has
indeed gone down by 10, from a cost of $6350 using x = 40 and y = 10 to $6340
using x = 40 and y = 11.
Now we also need to know how much x and y can vary, with the other variable
held constant, and still meet all the original constraints.
Screenshot 11:

The third answer report, the Limits Report, shown in Screenshot 11 helps
us to see this.


We see here that x can vary between x = 10 and x = 40, with corresponding
costs of $6950 and $6350. We also see that y can vary between y = 0 and
y = 10.

13.4

Changing the Objective Function: Multiple Solutions

Multiple Solutions
We have just seen that it is easy to change the value of a constraint and run
Solver again to conduct sensitivity analysis. It's also easy to re-solve the problem
with a new objective function. Imagine that the cost of transport to Adelaide
changes. From Plant A it now costs $110 and from Plant B $120 for each engine to
be shipped, plus an extra $200 fixed cost.
This changes the objective function to:
\[\text{Cost} = 110x + 120(40 - x) + 60y + 70(35 - y) + 200\]
\[= 110x + 4800 - 120x + 60y + 2450 - 70y + 200\]
\[= -10x - 10y + 7450\]
We will run Solver again with the original constraints and guess values of x = 20,
y = 20. We also will change the equation in cell C2. What should the new equation
be?
= -10*A2 - 10*B2 + 7450
After running Solver, if we look at the spreadsheet it seems that the minimum cost
has changed, as might be expected.
Screenshot 12:

In Screenshot 12 there is a new minimum cost of $6950 with x = 25 and y = 25.
However, we need to be careful in interpreting this solution.


Screenshot 13:

When we delve a little deeper we discover that the solution is correct but not
unique.
The answer report (Screenshot 13) now shows that only Constraint 1 is binding.
In a graphical sense we can explain this. The iso-objective line is parallel to this
binding constraint so we have multiple optimal solutions lying along this line. There
are many combinations of x and y values that give the minimum cost of $6950.
Screenshot 14:

By looking at the Sensitivity Report in Screenshot 14 we also see that only
Constraint 1 has a Lagrange Multiplier or marginal value.
This means that any other constraint is free to move by one unit without
affecting the minimum cost value, but cost decreases for each extra unit of
capacity in Plant A.
How can we tell which other combinations of x and y give the minimum
cost? We don't get much help from the Limits Report in Screenshot 15.
Screenshot 15:

From this we can only see that cost is higher at x = 0 and at y = 0.


However by trying some other values in the spreadsheet as guess values and
observing the value of cell C2 we should be able to find some of the other
solutions which give the minimum cost.
For example, these include x = 35, y = 15 at one end of the scale to x = 40, y = 10
at the other. This is one area where a graph is better than Solver as it shows more
information about where the multiple solutions lie.
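
A quick way to check candidate solutions, instead of retyping guess values into the spreadsheet, is to evaluate the objective function directly. The sketch below is only illustrative: the name `cost` is our own, and it checks the objective value only, not the constraints.

```python
def cost(x, y):
    """Objective function from this section: Cost = -10x - 10y + 7450."""
    return -10 * x - 10 * y + 7450

# Points named in the text as giving the minimum cost.
candidates = [(25, 25), (35, 15), (40, 10)]
costs = [cost(x, y) for x, y in candidates]
print(costs)
```

All three points give the same cost because each has x + y = 50, and the objective depends only on that sum.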

Multiple Solutions and Memory Problems: Note that in this situation it is
also possible for Solver to encounter memory problems and fail to produce an
answer report, while still providing one solution. You should also be aware of
other error messages that may be generated if the target cell values do not
converge or the conditions are not linear.

13.5 Summary

Solver Wrap Up
We have learned how to use a computer technique which can solve many linear
programming problems efficiently and quickly.
This method also allows the problem to be modified easily so that different
scenarios can be explored.
It is important to generate Solver reports and read them carefully.
We should also be conscious of the limitations that exist, for example where
there is no feasible solution or multiple ones.

Lecture 14. Differentiation: Responding to Change

14.1 Introduction

So far we have concerned ourselves mostly with functions as mappings (telling us,
for a given input or set of inputs, what the output will be): drawing them, solving
them, and finding feasible solutions by using them as boundaries. We haven't necessarily
been interested in the way that these outputs change in response to changes in
the inputs. This is, therefore, a natural topic for us to consider now.
Indeed, change in the real world is an ever-present phenomenon.
Whether it be changes in fish populations due to over-fishing or pollution, or changes
in the birth-rate, or changes in the supply of certain raw materials or finished goods,
change is all around us. The trick is that many of the elements that change are
actually inputs to our neatly constructed equations, which aim to represent
certain processes occurring in the real world. Hence, it becomes necessary to consider
what a change in an input might do to an output.
Of course, this subject is not at all confined to economics and business inquiry.
We owe a great debt on this score to our friends in physics, astronomy and mathematics, who for many millennia (OK, about two), have been interested in thinking
about the cause-effect nexus of change in the world.
In approaching this very large field of inquiry, normally taught as one of
the twin pillars of calculus (integration being the other), we shall first ask
what we mean by a derivative, or rather, what we mean by working out the rate
of change of a function at a given point in the function's domain. Then, once
we're happy about the concept of differentiation, we'll apply some very useful rules
to a number of functions, with a particular focus on some common economic and
business applications. (As always, the last slide/page is well worth taking note of.)

14.2 Limits

HPW 10.1-10.2

[Figure: two plots of f(x) against x, with axes running from -3 to 3.]

Definition | Limit
If f(x) is arbitrarily close to L for x very near, but not exactly equal to, a,
then the limit L of f(x) as x approaches a is given by,
\[ L = \lim_{x \to a} f(x). \tag{14.1} \]
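
The definition can be explored numerically by evaluating a function at points closer and closer to a, from both sides. The sketch below is illustrative only: the function `f` is an assumed example (it is undefined at x = 3 but still has a limit there), not one taken from these notes.

```python
def f(x):
    # Illustrative function (an assumption): undefined at x = 3, but with a limit there.
    return (x**2 - 9) / (x - 3)

a = 3.0
steps = [10.0**(-k) for k in range(1, 7)]     # 0.1, 0.01, ..., 1e-06
from_right = [f(a + h) for h in steps]         # approach a from above
from_left = [f(a - h) for h in steps]          # approach a from below
print(from_left[-1], from_right[-1])           # both values are close to 6
```

When the two sequences settle on the same finite number, that number is the candidate for the limit L.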

Some limits do not exist;
Rule of thumb: approach from \( a^+ \) and \( a^- \).
Limits must be finite.
Example:
Find the limit at 3 of the function,
\[ f(x) = \frac{4}{x^2 - 9}. \]

14.3 Rates of change

14.3.1 The problem

The nature of change

HPW 11.1

A function describes some output value (the dependent variable), for a given
input value(s) (the independent variable(s));
We often would like to know by how much the output is affected by a change
in the input;
\[ \Delta x \rightarrow\ ???\ \Delta f(x) \]
This can be represented by the following,
\[ \Delta x \rightarrow \Delta f(x) \]
where \( \Delta \) just means 'change in'.

The linear case
For the linear case, this isn't difficult;
For some given \( \Delta x \) we simply measure the \( \Delta f(x) \) and take the ratio of the
two,
\[ \frac{\Delta f(x)}{\Delta x} = \frac{\text{rise}}{\text{run}} \]
Importantly, this will be the same, regardless of where we sample from on the
function.

[Figure: plot of a linear function f(x).]

But...

For non-linear functions the ratio of output change to input change


is different: it depends on where we take our measurements!
[Figure: plot of a non-linear function f(x), including a point where \( \Delta f(x) = 0 \).]
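
The contrast between the linear and non-linear cases can be checked with a few lines of code. This is a minimal sketch: the two functions `linear` and `curved` are assumed examples chosen for illustration.

```python
def ratio(f, x, dx=1.0):
    """Change in output divided by change in input, starting from x."""
    return (f(x + dx) - f(x)) / dx

def linear(x):
    return 2 * x + 1        # illustrative straight line (assumed)

def curved(x):
    return x ** 2           # illustrative non-linear function (assumed)

linear_ratios = [ratio(linear, x) for x in (0, 3, 7)]
curved_ratios = [ratio(curved, x) for x in (0, 3, 7)]
print(linear_ratios)   # the same value at every starting point
print(curved_ratios)   # the value changes with the starting point
```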

It would therefore be useful to come up with some way of finding the rate
of change of a function, for a given change in an input, at a point on the
function;
For example: how will my appetite for packets of chips change if I finish eating
my first packet of chips? What about my 12th packet of chips?
This is the aim of differentiation:

Definition | The derivative in words
For a given function f(x), the derivative of f(x), expressed as
\[ \frac{\partial f}{\partial x}, \quad \frac{df}{dx}, \quad f'(x), \quad \dot{f}(x), \]
will enable us to compute the rate of change of f(x) with respect to a
change in x at a given point in the domain of f(x).

14.3.2

Approximating the solution

Suppose we wanted to find the rate of change at the point marked;
This is the same as trying to find a line that has the same slope right at
that point: the tangent;
Suppose we approximate this line by calculating slopes we can compute,
by:
1. Choosing another point away from ours;
2. Drawing a secant between them;
3. Measuring the slope.
To get a better approximation we choose another point closer to our point
of interest;
And continue!
Eventually, by making our \( \Delta x \) so small (approaching 0) we will have a very
good approximation of the slope:
\[ f'(x_0) = \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} \]
[Figure: plot of f(x).]

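Definition | The derivative defined
The derivative of a function f(x) at any point x is defined as the limit,
\[ f'(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}. \tag{14.2} \]

[Figure: secant through the points \( (x, f(x)) \) and \( (x + \Delta x, f(x + \Delta x)) \), with rise \( f(x + \Delta x) - f(x) \) and run \( \Delta x \).]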

14.3.3

Applying the approximation

Example:
Show that the derivative of \( f(x) = x^2 + x - 4 \) is \( 2x + 1 \).

Example:
Find the tangent to the function \( f(x) = x^2 + x - 4 \) at the point x = 3.
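
One way to build intuition for these two examples is to compute secant slopes for shrinking values of Δx and watch them settle. The sketch below is a numerical illustration, not a substitute for the algebra; the helper name `secant_slope` is our own.

```python
def f(x):
    return x**2 + x - 4

def secant_slope(x, dx):
    """Slope of the secant through (x, f(x)) and (x + dx, f(x + dx))."""
    return (f(x + dx) - f(x)) / dx

x0 = 3
slopes = [secant_slope(x0, dx) for dx in (1, 0.1, 0.01, 0.001)]
print(slopes)                       # approaches 2*x0 + 1 = 7

# Tangent at x0: y = f(x0) + f'(x0) * (x - x0), using f'(x) = 2x + 1.
slope = 2 * x0 + 1
intercept = f(x0) - slope * x0      # tangent is y = slope * x + intercept
print(slope, intercept)
```

For this function the secant slope works out to 2x + 1 + Δx, which is why the values approach 2x + 1 as Δx shrinks.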

14.4 Differentiation

HPW 11.2

14.4.1 Rules on One Function

Definition |
Given the following functions of x, the derivative \( \frac{d}{dx} \) is the following:
\[ f(x) = k \;\Rightarrow\; f'(x) = 0 \qquad \text{(k constant)} \]
\[ f(x) = x^n \;\Rightarrow\; f'(x) = nx^{n-1} \qquad \text{(power-function rule)} \]
which implies,
\[ f(x) = x^0, \quad f'(x) = 0 \]
\[ f(x) = cx^n, \quad f'(x) = cnx^{n-1} \]

Example:
Find the derivative with respect to x of the following functions,
\[ f(x) = 6x^2, \qquad g(x) = 2(\sqrt{x})^3. \]

14.4.2 Rules on Multiple Functions

HPW 11.5

Given two functions, f(x) and g(x) we have,

Sum-difference rule
\[ \frac{d}{dx}\left[f(x) \pm g(x)\right] = f'(x) \pm g'(x) \]
Product rule
\[ \frac{d}{dx}\left[f(x)g(x)\right] = f(x)g'(x) + g(x)f'(x) \]
Quotient rule
\[ \frac{d}{dx}\left[\frac{f(x)}{g(x)}\right] = \frac{f'(x)g(x) - g'(x)f(x)}{g^2(x)} \]
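
The product and quotient rules can be sanity-checked against a numerical derivative. In this sketch the pair f(x) = 2x + 3 and g(x) = 3x^2 is an illustrative choice, and `derivative` is a hypothetical helper using a central difference.

```python
def derivative(fn, x, h=1e-6):
    """Central-difference estimate of fn'(x)."""
    return (fn(x + h) - fn(x - h)) / (2 * h)

f = lambda x: 2 * x + 3          # f'(x) = 2
g = lambda x: 3 * x**2           # g'(x) = 6x
df = lambda x: 2.0
dg = lambda x: 6 * x

x = 1.5
product_rule = f(x) * dg(x) + g(x) * df(x)
quotient_rule = (df(x) * g(x) - dg(x) * f(x)) / g(x)**2
numeric_product = derivative(lambda t: f(t) * g(t), x)
numeric_quotient = derivative(lambda t: f(t) / g(t), x)
print(product_rule, numeric_product)
print(quotient_rule, numeric_quotient)
```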
Differentiation in the field...
HPW 11.3

The derivative has so far been talked of in terms of:


a gradient;
a tangent; and
a slope.
We can also think of it as a rate of change:
By how much will the output change, for a given change in the
input?
We can then define two important concepts in business economics since the
rate of change analysis is perfect for their calculation...
Definition | Marginal cost
If C(q) is the total cost to produce q units of a good, then the marginal cost
is the change in total cost due to changing production levels by one unit,
and is given by,
\[ MC = \frac{dC}{dq}. \tag{14.3} \]

Definition | Marginal revenue
If R(q) is the total revenue gained from producing q units of a good, then
the marginal revenue is the change in total revenue due to changing
production levels by one unit, and is given by,
\[ MR = \frac{dR}{dq}. \tag{14.4} \]

Example: Sum-difference rule
Given the short-run total-cost function,
\[ C = Q^3 - 4Q^2 + 10Q + 75 \]
find the marginal cost function, that is, the limit of the quotient, \( \frac{dC}{dQ} \).
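
Because the cost function is a polynomial, the sum-difference and power-function rules can be applied term by term. The sketch below stores the polynomial as a `{power: coefficient}` dictionary; that representation and the helper names are our own assumptions.

```python
def poly_derivative(coeffs):
    """Differentiate a polynomial given as {power: coefficient}, term by term."""
    return {n - 1: c * n for n, c in coeffs.items() if n != 0}

def evaluate(coeffs, q):
    return sum(c * q**n for n, c in coeffs.items())

# C(Q) = Q^3 - 4Q^2 + 10Q + 75
total_cost = {3: 1, 2: -4, 1: 10, 0: 75}
marginal_cost = poly_derivative(total_cost)
print(marginal_cost)                # coefficients of 3Q^2 - 8Q + 10
print(evaluate(marginal_cost, 2))   # marginal cost at Q = 2
```

Note that the fixed cost of 75 drops out: it does not affect marginal cost.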


Example: Product rule
If \( y = (2x + 3)(3x^2) \), find the derivative \( y'(x) \).

Example: Quotient rule
Find the derivative of \( y(x) = \frac{2x - 3}{x + 1} \).

14.4.3 Rules on Functions of Multiple Variables

HPW 11.6

What if we have multiple functions, which are actually nested functions...


Chain rule: If we have a function z = f(y), where y is itself a function of another
variable, y = g(x), then,
\[ \frac{dz}{dx} = \frac{dz}{dy}\frac{dy}{dx} \]

Example: Chain rule (easy)
Given that \( z = 3y^2 \) and \( y = 2x + 5 \), find \( \frac{dz}{dx} \).

Example: Chain rule (harder)
(Challenge) Find the solution to \( \frac{dz}{dx} \) if
\[ z(x) = (x^2 + 3x - 2)^{17}. \]
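
Both chain rule examples can be checked numerically once the rule has been applied by hand. The sketch below compares the chain-rule expressions with a central-difference estimate; the helper `derivative` and the evaluation point x = 1 are our own choices.

```python
def derivative(fn, x, h=1e-6):
    """Central-difference estimate of fn'(x)."""
    return (fn(x + h) - fn(x - h)) / (2 * h)

# Easy example: z = 3y^2 with y = 2x + 5, so dz/dx = (6y)(2) = 12(2x + 5).
def dz_dx_easy(x):
    y = 2 * x + 5
    return 6 * y * 2

# Harder example: z = u^17 with u = x^2 + 3x - 2, so dz/dx = 17 u^16 (2x + 3).
def dz_dx_hard(x):
    u = x**2 + 3 * x - 2
    return 17 * u**16 * (2 * x + 3)

x = 1.0
numeric_easy = derivative(lambda t: 3 * (2 * t + 5)**2, x)
numeric_hard = derivative(lambda t: (t**2 + 3 * t - 2)**17, x)
print(dz_dx_easy(x), numeric_easy)
print(dz_dx_hard(x), numeric_hard)
```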

The new realm...


You are now able to calculate (given the relevant functions):
The marginal cost (\( MC = \frac{dC}{dq} \));
The marginal revenue (\( MR = \frac{dR}{dq} \));
The marginal propensity to save (\( MPS = \frac{dS}{dI} \));
The marginal propensity to consume (\( MPC = 1 - MPS \));

Practice!

Lecture 15. Differentiation II: Tricks and Extensions

15.1 Introduction

We continue our introduction to the nature of change. Last time, we looked at a
variety of functions and how we might compute their derivative: an expression
for the rate of change of the dependent variable in terms of the independent variable at a given point. Recall that we arrived at these expressions by applying our
understanding of limits to the problem. We took a secant on the function we
were dealing with, and drew it closer and closer to the point of interest, until the
distance between our point of interest and the secant's other intersection with
the function was so small that the line we were drawing
made a very good approximation to the tangent to the function at that
point, which is just what we were after.
This time, we will need this apparatus to look at three possibly tricky problems
in differentiation. The first, differentiation of implicit functions, forces us to consider
at a deeper level what we are actually doing with our small 'dee-y, dee-x' fraction
symbol. In particular, we'll see that we can treat it like another variable in our
equation, and therefore, solve for it!
In the second problem, we consider the slightly different logarithmic and exponential functions in terms of their derivatives (and actually find them to be highly
related). Third, we touch on the seemingly difficult, but actually quite simple, area
of multiple derivatives: that is, doing our differentiation work more than once
on the same function. For now, we'll just see the mechanics of the process, but in
the next lecture, we'll need this to help us identify the 'best' or 'worst' point on a
function.
Finally, we look at a very useful (and common) application of differentiation in
economics, that of the elasticity of demand. We can calculate various elasticities
(of supply, for example), but we'll start with demand as it is by far the most
common. We'll see that all of this differentiation work is mightily helpful in cracking
the problem of knowing by how much the demand for a given good will change in
response to a change in price. You might like to read over this section and then
think about goods in real life that are examples of each case. To start you off, think
about the (highly topical) demand for automotive fuel (e.g. unleaded petrol): is it
elastic, inelastic, completely inelastic, or unit elastic in price?
Agenda
1. Three problems in differentiation:
   problem 1: Implicit functions;
   problem 2: Logs and exponentials;
   problem 3: Double (triple?) differentiation;
2. Differentiation applied: elasticity.

HPW 12.4

15.2 Implicit Differentiation

15.2.1 Implicit functions

Problem I: when it's not explicit!

So far we have strictly dealt with functions of the form (say),
\[ y = f(x) = x^3 + 1 \]
where the variable y is explicitly expressed in terms of x;

However, we may not know the explicit form of y; instead we just have a
function such as,
\[ y(x, a, b) - x^3 + 1 = 0 \]
... of course, we can rearrange this, but we will only obtain an implicit
function for y (things could be lurking in y's explicit form that we don't know
about...)

Definition | An Implicit Function
When an equation is given with one or more variables, where a variable does
not appear on the left hand side by itself, e.g.
\[ y + y^3 - x = 7 \]
we cannot assume the full form of the equation of that variable; instead,
we can only know this form implicitly from the function. The resultant
function for y in terms of x is an implicit function.
Example:
Determine whether y is defined explicitly or implicitly in each of the following functions:
\[ x^2 y^3 + 2 = 0, \qquad \ln a - (2 - x)^2 = y, \qquad (y + x)^2 - 2y = 4 \]

15.2.2 Differentiation of implicit functions

Differentiation on an implicit function?

Because we can't assume what the explicit form of an equation is, we can't
just rearrange and perform normal differentiation: we may be missing some
internal workings of the variable;
Instead, we perform the following steps:
1. Differentiate both sides w.r.t. x;
2. Bring \( \frac{dy}{dx} \) terms to one side;
3. Factorise out the \( \frac{dy}{dx} \) part;
4. Rearrange to solve for \( \frac{dy}{dx} \).

Example:
Find \( \frac{dy}{dx} \) given that,
\[ y^3 + 3x^2 y = 13. \]

Be careful when applying \( \frac{d}{dx} \) to implicit functions:

d 3
dy 2
y 6=
y
dx
dx
rather, we must apply the chain rule, say on z = f(y):
\[ \frac{dz}{dx} = \frac{dz}{dy}\frac{dy}{dx} \]
that is, if \( z = [y(x)]^3 \),
\[ \frac{dz}{dx} = \frac{dz}{dy}\frac{dy}{dx} = 3y^2 \frac{dy}{dx}. \]
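
As a check on the four steps, the sketch below applies them to the example equation \( y^3 + 3x^2 y = 13 \) and compares the resulting formula with a numerical slope. It is an illustration under our own assumptions: `solve_y` is a hypothetical helper that finds y for a given x by bisection, and the test point (2, 1) was chosen because it satisfies the equation.

```python
def solve_y(x):
    """Find y with y^3 + 3x^2 y = 13 by bisection (the left side is increasing in y)."""
    lo, hi = -10.0, 10.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if mid**3 + 3 * x**2 * mid - 13 < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def dy_dx(x, y):
    # Steps 1-4: 3y^2 y' + 6xy + 3x^2 y' = 0  =>  y' = -6xy / (3y^2 + 3x^2)
    return -6 * x * y / (3 * y**2 + 3 * x**2)

x0 = 2.0
y0 = solve_y(x0)                     # the point (2, 1) lies on the curve
h = 1e-6
numeric = (solve_y(x0 + h) - solve_y(x0 - h)) / (2 * h)
print(y0, dy_dx(x0, y0), numeric)
```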

15.3 Logs and Exponentials

HPW 12.1, 12.2, 12.5

Problem II: logs & exponentials

What do we do with functions that don't fit our normal rules?
For example,
\[ f(x) = e^x; \]
is not a normal power function (\( x^a \));
Or, what about,
\[ f(x) = \ln x; \]
which is not a normal function at all?
Answer: Proceed as normal, with our definition of the derivative...
Example:
Show that \( f'(x) = e^x \) when \( f(x) = e^x \).

Definition | Derivative of \( e^x \)
If \( f(x) = e^x \) then f(x) has the remarkable property that,
\[ f'(x) = f(x) = e^x, \tag{15.1} \]
more generally, if \( f(x) = e^{g(x)} \), then,
\[ f'(x) = g'(x)e^{g(x)}. \tag{15.2} \]

The derivative of \( f(x) = e^x \) IS NOT \( xe^{x-1} \)!
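
A numerical check makes the warning concrete. The sketch below compares a central-difference estimate of the derivative of \( e^x \) with rule (15.1) and with the misapplied power rule; the choice \( g(x) = 3x \) used to illustrate rule (15.2) is an assumption made for the example.

```python
import math

def derivative(fn, x, h=1e-6):
    """Central-difference estimate of fn'(x)."""
    return (fn(x + h) - fn(x - h)) / (2 * h)

x = 2.0
numeric = derivative(math.exp, x)
correct = math.exp(x)                 # rule (15.1): f'(x) = e^x
wrong = x * math.exp(x - 1)           # misapplied power rule

# Rule (15.2) with the illustrative choice g(x) = 3x: f'(x) = 3 e^(3x)
numeric_general = derivative(lambda t: math.exp(3 * t), x)
rule_general = 3 * math.exp(3 * x)
print(numeric, correct, wrong)
print(numeric_general, rule_general)
```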


Example:
Differentiate the following:
\[ y_a = e^{-x}, \qquad y_b = x^p e^{ax}, \qquad y_c = \sqrt{e^{2x} + x}. \]

Example:
Show that if \( f(x) = \ln x \), then \( f'(x) = \frac{1}{x} \), using the result that \( e^{\ln x} = x \).

Definition | The derivative of ln x
If \( f(x) = \ln x \) then,
\[ f'(x) = \frac{1}{x}, \tag{15.3} \]
and, more generally, if \( f(x) = \ln g(x) \), then,
\[ f'(x) = \frac{g'(x)}{g(x)}. \tag{15.4} \]
Further, if \( f(x) = \log_a x \) then,
\[ f'(x) = \frac{1}{\ln a}\,\frac{1}{x}. \tag{15.5} \]

Example:
Find dy/dt given that \( y = t^3 \ln t^2 \).

Example:
Find the derivative w.r.t. x of \( y = \ln\left(\frac{3x}{1+x}\right) \).

15.4 Higher Order Derivatives

HPW 12.7

Problem III: a derivative of a derivative?

Sounds tricky, but...
Finding the derivative of the derivative of f(x) is the same as,
\[ f''(x) = \frac{d^2 y}{dx^2} = \ddot{f}(x) \]
and can be simply found by:
1. Take the first derivative,
\[ f'(x) = \frac{d}{dx} f(x) \]
2. Take the next derivative,
\[ f''(x) = \frac{d}{dx} f'(x) \]
3. And so on...
But what does \( f''(x) \) mean? ... next lecture (!)

Example:
If \( f(t) = e^{2t^2 + 4} \), find \( f''(t) \).
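
The two-step procedure can be sketched in code for this example, reading the function as \( f(t) = e^{2t^2+4} \) (the exponent is our reading of the source). The first derivative uses rule (15.2), the second uses the product rule, and a second-difference estimate serves as a rough check; the helper name and test point are our own.

```python
import math

def f(t):
    return math.exp(2 * t**2 + 4)

def f1(t):
    # First derivative by rule (15.2): g(t) = 2t^2 + 4, g'(t) = 4t
    return 4 * t * f(t)

def f2(t):
    # Differentiate f1 with the product rule: 4 f(t) + 4t f'(t)
    return 4 * f(t) + 4 * t * f1(t)

def numeric_second(fn, t, h=1e-4):
    """Central second-difference estimate of fn''(t)."""
    return (fn(t + h) - 2 * fn(t) + fn(t - h)) / h**2

t0 = 0.5
print(f2(t0), numeric_second(f, t0))
```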

15.5 Elasticity of Demand

Application: elasticity

HPW 12.3

We often would like to know, if the price of x changes by $1, then by how
much will the demand for x change?
Trouble is that the units of measurement of price and quantity can vary,
so it is hard to get a comparative picture;
Solution: Calculate the percentage change that results from a 1% change in
the inputs...
Definition | Point Elasticity of f at x
If f(x) is some function, differentiable at x, and \( f(x) \neq 0 \), then we define
the elasticity of f w.r.t. x as,
\[ \eta = \frac{x}{f(x)} f'(x) \tag{15.6} \]

Normally, we are interested in the elasticity of demand: the percentage
change in the quantity demanded (q) for a 1% change in the price (p);
Definition | Point elasticity of demand at p
If the demand in terms of the price for some good is given by q(p), then the
elasticity of demand at point (q, p) is given by,
\[ \eta(q) = \frac{p}{q}\frac{dq}{dp}. \tag{15.7} \]

Notes on \( \eta(q) \):
1. If \( |\eta(q)| > 1 \), then q is elastic at p;
2. If \( |\eta(q)| = 1 \), then q is unit elastic at p;
3. If \( |\eta(q)| < 1 \), then q is inelastic at p;
4. If \( |\eta(q)| = 0 \), then q is completely inelastic at p;

Example:
For a demand function q(p), explain in words what is meant by the definitions of elastic, unit elastic, inelastic and completely inelastic at some
point (q, p).

Example:
If the demand for coffee is given by the demand equation \( q(p) = 8000p^{-1.5} \),
find an expression for the elasticity of demand, and so approximate the
change in quantity demanded for a 1% increase in price at p = 4.
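
The coffee example can be worked through numerically, taking the demand equation as \( q(p) = 8000p^{-1.5} \) (a downward-sloping curve; the negative exponent is our reading of the source). The sketch below is illustrative, and the names `elasticity`, `approx_change` and `exact_change` are our own.

```python
def q(p):
    """Demand for coffee: q(p) = 8000 p^(-1.5)."""
    return 8000 * p ** -1.5

def dq_dp(p):
    return -1.5 * 8000 * p ** -2.5

def elasticity(p):
    """Point elasticity of demand: (p / q) * dq/dp."""
    return p / q(p) * dq_dp(p)

p0 = 4.0
eta = elasticity(p0)                         # the same at every price for this curve
approx_change = eta * 0.01 * q(p0)           # approximate effect of a 1% price rise
exact_change = q(p0 * 1.01) - q(p0)
print(eta, approx_change, exact_change)
```

The elasticity comes out at -1.5 regardless of p, so demand is elastic: a 1% price rise lowers quantity demanded by roughly 1.5%.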

Lecture 16. Differentiation III: Optimization in one Variable

16.1 Introduction

After the past two weeks of introducing the various techniques of differentiation
(or going over old ground for some) you may be keen to see your new knowledge
applied to the real world. Not only is differentiation of interest because it gives us
a way to measure rates of change (e.g. the speed or acceleration of a car), but with
some insight we can get at a similarly powerful concept: that of optimization.
Optimization is (literally) the process of finding the optimum: the highest,
the lowest; the maximum, the minimum; the top, the bottom; the ... you get the
picture. It turns out that the work we have done in finding the slope of a function,
which is equivalent to, say, finding the gradient of a hill, helps us to identify when
the ground beneath our feet (figuratively speaking) is flat. When it is flat, if
you think about it for a moment, you know that you have reached a point that is a
little higher, or a little lower, than the surrounding ground. In some cases, it will
be the highest or lowest point for miles into the distance. That is exactly the same
logic we apply to find maxima and minima of functions. However, we need to take
some care over exactly what we have found when we use our new-fangled optimization
methods: have we found the highest point, or perhaps just a bump along the
way?
To get our thinking on the practical track, we'll start and finish with our nascent
micro-coffee roasting business, The Bean House, to see if we can help our business
partner to find the optimum quantity of Boutique Blend packs to produce in order
to yield maximum profits. This is similar in flavour to the work we did in linear
programming a few weeks back, but as you'll see, optimization by differentiation
drills down on the functional form of the problem itself. For now, to the hills!
Agenda
1. A problem of optimization;
2. What can the first-derivative tell us?
3. What can the second-derivative add?
4. (Excursus on points of inflection);
5. Solving The Bean House's profit maximization problem.

Back to The Bean House


Maximum profits?
Times have changed just a little for The Bean House and they have decided to
make just Boutique Blend packs. Your business partner has turned to you to find
the maximum profit conditions. He tells you that there is a major competitor in
the micro-coffee roasting market on campus, giving a price equation (as a function
of the number of Boutique Blend packs you sell in a day) as follows,
\[ p(q) = 130 - 2q. \]
Furthermore, he tells you that there are fixed daily costs of $148 (rent, power,
water etc.). The question is, how many Boutique Blend packs should The Bean
House make in a day to maximize profits?

16.2 Extrema of Functions

HPW 13.1

Definition | Critical Value
If f(x) is continuous and there exists some point \( x_0 \) in the domain of f(x)
where \( f'(x_0) = 0 \), then \( x_0 \) is said to be a critical value and the point
\( (x_0, f(x_0)) \) is a critical point.

Definition | Stationary Value
Given that \( x_0 \) is a critical value in the domain of the function f(x), then
\( f(x_0) \) is a stationary value of the function f.

[Figure: plot of y = f(x).]

Example:
Find all the critical points of the function,
\[ c(t) = \frac{t}{t^2 + 4}. \]

16.2.1 Classification by Observation

What kind of critical point?


Definition | Local Maxima (Minima)
Given some function f(x) and a point \( x_0 \) in the domain of f(x), then if
\( f(x_0) \geq (\leq)\, f(x) \) for all possible values of x in some interval about \( x_0 \), then
f(x) is said to have a local maximum (minimum) or relative maximum
(minimum) around the point \( x_0 \).

Which leads to a rule for identifying whether a point is a local minimum or
maximum:
Definition | First-derivative test for relative extrema
If the first derivative of a function f(x) at \( x = x_0 \) is \( f'(x_0) = 0 \), then the
value of the function at \( (x_0, f(x_0)) \) will be
1. A relative maximum if \( f'(x) \) changes sign from positive to negative travelling from the left to the right very near \( x_0 \);
2. A relative minimum if \( f'(x) \) changes sign from negative to positive
travelling from the left to the right very near \( x_0 \);
3. An inflection point if \( f'(x) \) has the same sign travelling from the
left to the right very near \( x_0 \).

Example:
Find the nature of the critical values of the function,
\[ y = f(x) = x^3 - 12x^2 + 36x + 8. \]
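
The first-derivative test lends itself to a short numerical sketch for this example. The code below finds the critical values from \( f'(x) = 3x^2 - 24x + 36 \) with the quadratic formula and then inspects the sign of \( f' \) just either side of each one; the step size of 0.01 and the helper name `classify` are our own choices.

```python
import math

def f(x):
    return x**3 - 12 * x**2 + 36 * x + 8

def f1(x):
    return 3 * x**2 - 24 * x + 36      # first derivative

# Critical values: roots of 3x^2 - 24x + 36 = 0 via the quadratic formula.
a, b, c = 3, -24, 36
disc = math.sqrt(b**2 - 4 * a * c)
critical = sorted([(-b - disc) / (2 * a), (-b + disc) / (2 * a)])

def classify(x0, step=0.01):
    """First-derivative test: compare the sign of f' just left and right of x0."""
    left, right = f1(x0 - step), f1(x0 + step)
    if left > 0 > right:
        return "relative maximum"
    if left < 0 < right:
        return "relative minimum"
    return "inflection point"

results = {x0: (classify(x0), f(x0)) for x0 in critical}
print(results)
```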

16.2.2 Classification by the Second Derivative

HPW 13.4

To classify relative extrema takes some time with certain functions: seeing
whether the function is larger or smaller to the left and right of a critical value;
However, we can do the characterisation much faster by noticing that:
By checking the 'less than' or 'greater than' of the function to each side of the
point where the slope is zero, we are asking;
Does the slope of the function change positively or negatively as
it goes through the critical point?
Definition | The Second Derivative Test
If f(x) is a twice differentiable function around some critical value \( x_0 \), then:
If \( f''(x_0) < 0 \), \( x_0 \) is a local maximum point;
If \( f''(x_0) > 0 \), \( x_0 \) is a local minimum point; and
If \( f''(x_0) = 0 \), \( x_0 \) is ?
Consider the function \( f(x) = x^3 + 2x^2 \). To find any maxima and minima we
would:
1. Derive \( f'(x) \);
2. Solve \( f'(x) = 0 \) for x to obtain critical values;
3. Derive \( f''(x) \);
4. Find the sign of \( f''(x) \) at each critical value.
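
The four steps above can be sketched as follows, reading the function as \( f(x) = x^3 + 2x^2 \) (our reading of the source). The derivatives are worked out by hand and entered directly; the function names and the "inconclusive" label are our own.

```python
def f1(x):
    return 3 * x**2 + 4 * x        # step 1: f'(x)

def f2(x):
    return 6 * x + 4               # step 3: f''(x)

# Step 2: f'(x) = x(3x + 4) = 0 gives the critical values.
critical = [-4 / 3, 0.0]

def second_derivative_test(x0):
    """Step 4: classify a critical value by the sign of f''(x0)."""
    if f2(x0) < 0:
        return "local maximum"
    if f2(x0) > 0:
        return "local minimum"
    return "inconclusive"

labels = [second_derivative_test(x0) for x0 in critical]
print(labels)
```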
Example:
Classify the stationary points of the function \( f(x) = \frac{1}{9}x^3 - \frac{1}{6}x^2 - \frac{2}{3}x + 1 \)
by using the second derivative test.

! Local vs. global

Notice that a relative or local maximum can also be the absolute or global maximum for the function;
We distinguish between them...

16.2.3 Concavity & Convexity

You will quite often hear this classification of local extrema talked about in
terms of two special words: concavity and convexity... what do they mean?
Definition | Convex & Concave Functions
If \( f''(x) > 0 \) for all values of a < x < b then f is said to be convex on the
domain (a, b). On the other hand, if \( f''(x) < 0 \) for all values in the domain,
f is said to be concave on the domain (a, b).

[Figure: two panels, labelled convex and concave.]

16.2.4 Points of Inflection

When \( f''(x_0) = 0 \) ...

What about when \( f''(x) = 0 \)? ... Before, we said ???!
Now we can classify such a situation as follows:

Definition | Inflection Point
If f(x) is continuous and twice differentiable about some point \( x_0 \), then:
If \( x_0 \) is a (known) inflection point, \( f''(x_0) = 0 \);
If \( f''(x_0) = 0 \) and \( f'' \) changes sign at \( x_0 \) then \( x_0 \) is an inflection point
for f.

What does this mean???
... \( f''(x_0) = 0 \) is a necessary condition for an inflection point, but it is not a
sufficient one.

Note: On Points of Inflection. What exactly does the point of inflection mean? You will
notice that we got there by the necessary condition that \( f''(x_0) = 0 \). So what is this
saying? Recall that to find the maximum or minimum value of some primitive
function f(x) we looked at where the derivative was equal to zero. That is, by solving
the equation \( \frac{d}{dx} f(x) = 0 \), we found the values of x where f reached an extreme value. In
the same way, by applying \( \frac{d}{dx} \) to \( f'(x) \) and equating to zero, we are asking, 'when does
the function \( f'(x) \) reach an extreme value?'. That is, when does the derivative
itself attain a maximum or minimum?
In plain terms then, if there is a point where \( f''(x) = 0 \) and \( f'' \) changes sign, then this is saying that at that
point, the slope of the primitive function f(x) has reached a maximum or minimum.
So there are two types of inflection points as shown below:

[Figure: two pairs of panels. Left: \( f(x) = x^3 \) with its derivative \( f'(x) = 3x^2 \). Right: \( g(x) = \frac{1}{1+e^{-x}} \) with its derivative \( g'(x) = \frac{e^{-x}}{(1+e^{-x})^2} \).]

On the left, we have the more normal type, where the point of inflection coincides
with a stationary point (\( f'(x) = 0 \)). Whereas on the right, we have the situation where
the point of inflection is telling us that the slope of the primitive function (g(x)) has
reached its peak: to the left and right of \( x_0 = 0 \) the slope of g(x) is smaller than at
the inflection point (\( x_0 \)).

Example:
Determine whether the function \( f(x) = x^6 \) has an inflection point at \( x_0 = 0 \).
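
The sufficiency condition (a sign change in \( f'' \)) is easy to test numerically. The sketch below checks \( f(x) = x^6 \) and, for comparison, \( f(x) = x^3 \); the helper `changes_sign` and its step size are our own assumptions.

```python
def changes_sign(second_derivative, x0, step=0.01):
    """True if f'' has opposite signs just left and right of x0."""
    return second_derivative(x0 - step) * second_derivative(x0 + step) < 0

f2_x6 = lambda x: 30 * x**4      # second derivative of x^6
f2_x3 = lambda x: 6 * x          # second derivative of x^3, for comparison

x6_inflects = changes_sign(f2_x6, 0.0)
x3_inflects = changes_sign(f2_x3, 0.0)
print(x6_inflects, x3_inflects)
```

Both second derivatives are zero at 0, but only the one for \( x^3 \) changes sign there, which is why the necessary condition alone is not enough.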

16.3 Applied to the Problem

Maximum profits for The Bean House?

HPW 13.6

We have a price equation,
\[ p(q) = 130 - 2q. \]
So total revenue R is,
\[ R(q) = p \cdot q = (130 - 2q)q = 130q - 2q^2. \]
We have fixed costs of $148 per day, plus our costs per Boutique Blend
pack from last time (electricity, raw beans), giving a total cost of,
\[ C(q) = FC + VC = 148 + ((0.20)(5) + (1.70)(0.200))q = 148 + 1.340q. \]
So ... total profit (per day) will be,
\[ \pi(q) = R - C = 130q - 2q^2 - (148 + 1.340q) \]
\[ \pi(q) \approx 128.7q - 2q^2 - 148. \]

Example:
Given The Bean House data as above, find the profit maximizing
output of Boutique Blend packs, and the profits obtained at this point.
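
The optimization can be sketched in a few lines using the revenue and cost functions above. This is an illustration under stated assumptions: it uses the unrounded variable cost of 1.34 per pack (so the linear coefficient is 128.66 rather than the rounded 128.7), and the comparison of whole-number outputs is our own addition.

```python
def profit(q):
    """Daily profit: revenue (130 - 2q)q minus cost 148 + 1.34q."""
    return (130 - 2 * q) * q - (148 + 1.34 * q)

# First-order condition: profit'(q) = 128.66 - 4q = 0.
q_star = 128.66 / 4
# Second-order condition: profit''(q) = -4 < 0, so q_star is a maximum.
second_derivative = -4

# Packs come in whole units, so compare the neighbouring integers too.
best_whole = max((32, 33), key=profit)
print(q_star, profit(q_star), best_whole, profit(best_whole))
```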

Lecture 17. Integral Calculus: Unlocking Economic Dynamics

17.1 Introduction

In the previous two lectures, we have been interested in finding the rate of change of
some function, and we have used this to (for example) identify where some output
is at a maximum or minimum. While this is very useful for a variety of applications,
it seems natural to expect that there will be times where rather than having access
to some primitive function describing how an output is affected by an input, we will
instead have observed the rate of change itself, and so wish to find the primitive
function.
In fact, this is the other side of the coin of our differentiation work of the past
two lectures, and it involves, appropriately enough, finding the antiderivative of
a function, or as it is commonly known, the integral. In particular, integration
(the process of going from the derivative to its primitive) has a natural application
when we deal with quantities that change over time. That is, as in our first example
below, we might observe the birth and death rate of a nation's population, and
wish to determine what the population will actually be at some time point in the
future. This is exactly the kind of step that integration can deliver: if we can put
an equation down for the rate of change of the population, then we should be able
to go back one step to an equation of population at a given time.
Notice the word 'should' in the previous sentence. Two things are against us.
First, as with differentiation, there are integration rules for some functions, but not
all! Second, even if we do have a rule that will give us the primitive function we
are after, we still need one more piece of information: a single measurement of the
primitive function at a point in time. For our population example, this will be the
population of the country in some year. By using this, we can actually go from a
general description of the primitive function to a specific one, which is useful for
further work. All of which, we will aim to cover presently.
Agenda
1. Why integration?
2. Some rules of integration;
3. Incorporating initial conditions;
4. Integration as an area.

17.2 Why Integration?

HPW 14.2-14.5

Economic Dynamics?
Suppose you had collected data from the registry of births and deaths for one
area in Sydney over a year; it suggests that the population had a rate of change
expressed by
\[ \frac{dP}{dt} = 0.8t^{-\frac{1}{2}}. \]
However, you would like to know the actual population of Australia in
any given year since 2000;
We have information in terms of a rate of change, but we want information
in terms of the time path of a variable (the population of Australia);
This sounds like differentiation should be involved but ...
Integral calculus to the rescue...
Previously, we have been dealing with this situation:
\[ y(x) \xrightarrow{\ \text{differentiation}\ } \frac{dy}{dx}\,? \]
However, now (in terms of our population change example) we are asking the
reverse question:
\[ \frac{dP}{dt} \xrightarrow{\ \text{integration}\ } P(t)\,? \]
Example:
Suppose as in our example, we have,
\[ \frac{dP}{dt} = 0.8t^{-\frac{1}{2}}, \]
find an expression for P(t), the population of Australia each year, where
t = 0 is the year 2000.

17.3 The Indefinite Integral

17.3.1 Notation

For the moment, we note that what we are doing is actually finding an antiderivative of p(t):
Definition | Antiderivative, or Primitive Function
The antiderivative of some function f(x) is some function whose derivative is f(x). Normally, we use the notation,
\[ F'(x) = f(x), \tag{17.1} \]
where F(x) is the antiderivative of the function f(x).


For any given function, f(x), there can exist a number of primitives.
Consider the primitives of \( f(x) = 3x^2 \). We can have \( x^3 \), \( x^3 + 2 \), \( x^3 - 10 \), ....
In general notation, we can write all the primitives of \( f(x) = 3x^2 \) as:
\[ F(x) = x^3 + c \tag{17.2} \]
where c is any constant.


Definition | Indefinite Integral
If F(x) is a primitive of f(x), the indefinite integral of f(x) is denoted:
\[ \int f(x)\,dx = F(x) + c \tag{17.3} \]
where c is any constant.
Definition | The Integral Sign
Suppose we have some function f(x) and we wish to find the antiderivative
(or integral) of f(x), then we would write,
\[ \underbrace{\int}_{\text{integral sign}} \underbrace{f(x)}_{\text{integrand}}\; \underbrace{dx}_{\text{variable of integration}} = F(x) + \underbrace{c}_{\text{constant of integration}} \]
where we use the special notation \( \int \ldots\, dx \) to indicate the antiderivative procedure, with f(x) being the integrand and, in this case, x the variable
of integration, yielding the function F(x), plus some constant of integration c.

17.3.2 The Constant of Integration

What's c about again?

Recall our population example, we had a strange result when we tried to use
real numbers;
This arises because of the fact that say,
\[ F(x) = 4x + 3 \quad \ldots \quad F'(x) = f(x) = 4 \]
... the constant (3) disappears!
So when we go back from the derivative to the primitive function F(x) by
integration, we don't know the value of any constant, or if it was there at
all...
We can get at the value of c, however, if we have one known point (x, F(x)),
which we can use to solve for c.
So when we write down an integral, we add the c out of good practice, to be
found later, if possible.
Example:
Using the same population example as before, find P(t) if \( p(t) = 0.8t^{-\frac{1}{2}} \)
and P(0) = 19.6.
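
The role of the initial condition can be sketched in code. The example below assumes the rate of change is \( 0.8t^{-1/2} \); the sign of the exponent is our reading of the source, so it is left as a parameter. The function name `population` and its arguments are hypothetical.

```python
def population(t, rate_coeff=0.8, exponent=-0.5, p_initial=19.6):
    """P(t) from dP/dt = rate_coeff * t**exponent with P(0) = p_initial.

    Power rule: the antiderivative of t**a is t**(a + 1) / (a + 1), valid for a != -1.
    Setting t = 0 gives c = P(0), provided a + 1 > 0.
    """
    general = rate_coeff / (exponent + 1) * t ** (exponent + 1)
    c = p_initial
    return general + c

print(population(0))   # the initial condition is recovered
print(population(9))   # 1.6 * sqrt(9) + 19.6
```

Without the known point P(0) = 19.6, only the general form \( 1.6t^{1/2} + c \) would be available.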

Note: On finding a value for c. In the example above we used an initial value, that is,
the value when t = 0, to find our value for c. This is the easiest way to go, if such an initial
point is available, because it directly gives you the value for c, as above. However, any
known (x, F(x)) pair will enable you to solve for c.

17.3.3

Tools of the Trade

Basic rules
Definition | Integration results
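Following from corresponding rules of differentiation, we have,
$$\int x^a\,dx = \frac{1}{a+1}x^{a+1} + c \qquad (a \neq -1)$$
$$\int \frac{1}{x}\,dx = \ln|x| + c$$
$$\int e^{ax}\,dx = \frac{1}{a}e^{ax} + c \qquad (a \neq 0)$$
$$\int a^x\,dx = \frac{1}{\ln a}a^x + c \qquad (a > 0,\ a \neq 1)$$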
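One way to convince yourself of these results is to differentiate each right-hand side numerically and compare it with the integrand. The Python sketch below does this at a single test point; the values of `a` and `x` and the helper `derivative` are assumptions chosen purely for illustration.

```python
import math

def derivative(F, x, h=1e-6):
    """Central-difference estimate of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

a = 3.0   # hypothetical exponent/base/rate used for all four checks
x = 1.7   # hypothetical test point (x > 0 so ln|x| is defined)

# (integrand f, claimed primitive F) pairs, one per rule in the table
pairs = [
    (lambda t: t ** a,          lambda t: t ** (a + 1) / (a + 1)),
    (lambda t: 1 / t,           lambda t: math.log(abs(t))),
    (lambda t: math.exp(a * t), lambda t: math.exp(a * t) / a),
    (lambda t: a ** t,          lambda t: a ** t / math.log(a)),
]

errors = [abs(derivative(F, x) - f(x)) for f, F in pairs]
print(errors)  # each entry should be close to zero
```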


17.3. THE INDEFINITE INTEGRAL


Rules of operation
Definition | Integration rules of operation
(summation)
$$\int [f(x) + g(x)]\,dx = \int f(x)\,dx + \int g(x)\,dx$$
(multiple)
$$\int kf(x)\,dx = k\int f(x)\,dx$$

... notice the similarity to the differentiation rules we've already covered.
Example:
Find $\int (x^3 - x + 1)\,dx$ and $\int 3x^2\,dx$.

Example: All the rules at once


Find
$$\int \left(5e^x - x^2 + \frac{3}{x}\right) dx \qquad (x \neq 0).$$

17.3.4

Techniques of Integration

Definition | Substitution rule


The integral of $f(u)\left(\frac{du}{dx}\right)$ with respect to the variable $x$ is the integral of $f(u)$
with respect to the variable $u$:
$$\int f(u)\frac{du}{dx}\,dx = \int f(u)\,du = F(u) + c \qquad (17.4)$$
which is useful if this situation can be recognised.
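As a rough check of the rule, the Python sketch below takes a hypothetical case, $f(u) = e^u$ with $u = x^2$, so that the integrand is $2xe^{x^2}$ and the rule suggests the primitive $e^{x^2}$. It then confirms numerically that differentiating the candidate primitive gives back the integrand. The choice of function and test points is an assumption for illustration.

```python
import math

def integrand(x):
    """f(u) * du/dx with f(u) = e^u and u = x^2 (a hypothetical choice)."""
    return math.exp(x ** 2) * (2 * x)

def candidate_primitive(x):
    """F(u) = e^u evaluated at u = x^2, as the substitution rule suggests."""
    return math.exp(x ** 2)

def derivative(F, x, h=1e-6):
    """Central-difference estimate of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

test_points = [-1.0, 0.0, 0.5, 1.2]
max_gap = max(abs(derivative(candidate_primitive, x) - integrand(x))
              for x in test_points)
print(max_gap)  # should be close to zero
```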
Example: Substitution rule
Find the integral of the function $f(x) = 2x(x^2 + 1)$.

Example: The really useful substitution rule!
Find $\int 6x^2(x^3 + 2)^{99}\,dx$.

HPW 14.7

17.4

The Definite Integral

17.4.1

Notation

Being definite!
HPW 14.8

So far, we have only considered (with the exception of our population example)
integrals of the indefinite kind: only variables were used in place of numbers;
However, we can actually calculate the value of our integrals. In particular,
we often wish to calculate the value of an integral between two values of $x$ in
the domain of $f$, say $a$ and $b$ where $b > a$;
We do this by calculating the difference between $F(b)$ and $F(a)$,
$$[F(b) + c] - [F(a) + c] = F(b) - F(a)$$
Notice, we lose the value of the constant $c$.
Definition | The Definite Integral
To find the numerical value of an integral $\int f(x)\,dx$ over the interval $x =
(a, b)$, where $b > a$, we calculate the definite integral written,
$$\int_a^b f(x)\,dx = F(x)\Big|_a^b = F(b) - F(a) \qquad (17.5)$$
where $b$ and $a$ are the upper limit of integration and lower limit of
integration respectively.
Example:
Evaluate $\int_1^5 3x^2\,dx$.

Example:
Suppose $f(x) = k(1 - e^x)$, find $\int_a^b f(x)\,dx$ ($k$ is a constant).

17.4.2

As an Area

Integration by another name...


Look closely again at the definite integral formula,
$$\int_a^b f(x)\,dx$$
Leaving the first part, $\int_a^b$, aside for the moment, consider what the $f(x)\,dx$
part is doing: the area of a rectangle is being calculated, having dimensions
$$A = f(x_0)\,dx$$
In fact, as we calculate the definite integral $\int_a^b f(x)\,dx$ we are actually calculating a sum of areas that lie between the function and the $x$-axis between
points $a$ and $b$;
That's why (if you squint!), you'll see that the $\int_a^b$ part is like an S for Sum!

The definite integral, calculating the area between the function and
the x-axis,
$$\int_a^b f(x)\,dx$$

will give a positive area for regions above the x-axis, but a negative
area for regions below the x-axis.
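The "sum of rectangles" idea can be tried out directly. The following Python sketch adds up many thin rectangles of area $f(x_0)\,dx$ (a midpoint Riemann sum) and compares the result with $F(b) - F(a)$. The integrand, the interval and the number of rectangles are all illustrative assumptions.

```python
def riemann_sum(f, a, b, n=100_000):
    """Midpoint Riemann sum: add up n rectangles of area f(x0) * dx."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) * dx for i in range(n))

def f(x):
    return 3 * x ** 2   # hypothetical integrand with primitive F(x) = x^3

def F(x):
    return x ** 3

approx = riemann_sum(f, 0.0, 2.0)
exact = F(2.0) - F(0.0)
print(approx, exact)

# A region below the x-axis contributes a negative "area".
below = riemann_sum(lambda x: -1.0, 0.0, 2.0)
print(below)
```

Increasing `n` makes the rectangles thinner and the sum closer to the exact value.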
With our area understanding of the definite integral, we have the following:
Definition | Properties of the Definite Integral
$$\int_a^a f(x)\,dx = 0$$
$$\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$$
$$\int_a^b kf(x)\,dx = k\int_a^b f(x)\,dx$$
$$\int_a^c f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx$$

In the last rule, it is assumed $a < b < c$, that is, we can chop up the area
into parts.
Example:
Suppose $f(x) = e^{x/3} - 3$, evaluate $\int_3^6 f(x)\,dx$ and compare it to the
region that lies between the $x$-axis and $f(x)$ on the interval $[3, 6]$.


Lecture

18

Differential Equations & Growth I


18.1

Introduction

Last time we introduced the concept of the anti-derivative, otherwise known as


the integral. This opened up to us the area of business dynamics, that is, where we
are likely to start with an observation concerning the way a variable changes over
time, and then move to an expression for the value of the variable at a chosen point
in time. In this lecture, we extend these ideas to nibble off a small piece of a very
large area in economic, engineering and physical mathematics: that of differential
equations.
At this level, we will only consider the first class of such equations, those of
first-order, and first-degree. There are, however, many more types of differential
equations, some linear, some non-linear. Again, like we found with our integration
rules, some very smart people have put together tables of differential equations.
Thus, we only need to recognise the type of equation and then apply the correct rule
from the table. We'll investigate the derivation of a couple of these in this lecture
but most of these derivations are beyond the scope of our course.
So where do differential equations get applied in business? The answer is almost everywhere. The class we look at towards the end of this lecture, that of
exponential growth, is so common in economic activity that its members are often referred to as natural growth equations. This partly has to do with the way that
biological populations tend to grow (hence the natural bit), but also is about the
commonness of this kind of growth. It comes about where the growth results in
more entities (people, bacteria, firms etc.) that can accomplish further growth! That
is, each cycle of growth, produces more ability for the population to grow; it builds
on itself. Not surprisingly, this kind of growth is very effective. Next time you see
some cheese go mouldy in your home seemingly overnight, you might remember the
power of exponential growth to go from even a seemingly small base to a large (and
smelly, in the case of cheese bacteria), thriving population.
Agenda
1. Integration by another name ... differential equations;
2. Solution method;
3. Application to growth equations: exponential growth.

HPW 15.5

18.2

Differential Equations

18.2.1

Notation

Terminology
Suppose you were faced with the following relationship between $x$ and $y$,
$$\left(\frac{dy}{dx}\right)^2 + y - 4 = 0,$$
Since we have an equation where a variable enters both plainly and as some
form of derivative we have a differential equation;
Definition | Differential Equation
An equation where a variable enters as a derivative is called a differential
equation,
$$\left(\frac{d^a y}{dx^a}\right)^b + ky = c \qquad (18.1)$$
where $a$ and $b$ indicate the order and degree of the equation respectively,
and $k$ and $c$ are constants, e.g. a first-order differential equation
of degree 2 is one in which $a = 1$ and $b = 2$,
$$\left(\frac{dy}{dx}\right)^2 + 2y = 4$$

Example:
Consider the equations,
(a) $\left(\frac{dy}{dx}\right)^3 + 4y = 12$

(b) $\frac{d^2y}{dt^2} - y = 0$,

what order and degree are these differential equations?

Order? Degree? In this course, we will only consider differential
equations of first-order and degree 1. That is, of the form,
$$\frac{dy}{dx} + ky = c$$

Definition | Homogeneous Case
Given some (say) first-order differential equation in general form,
$$\frac{dy}{dx} + ky = c \qquad (18.2)$$
then if $c = 0$, we say that (18.2) is an example of the homogeneous case
of differential equations.

18.2.2

Solution Technique
Example:
Find the general solution to,
$$\frac{dy}{dx} = xy \quad \text{where} \quad y > 0.$$

Separation of variables

Recall how we solved the previous example:
1. From,
$$\frac{dy}{dx} = \ldots$$
2. To,
$$\int \ldots\,dy = \int \ldots\,dx$$
That is, we took it from the $\frac{dy}{dx}$ form, into a form which could be integrated
on both sides...

This technique is called separation of variables because of the way in
which we arrange our problem to have each variable separated by the = sign
for integration.
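To illustrate the technique on something other than the class example, take the hypothetical equation $\frac{dy}{dx} = 3x^2y$ with $y > 0$. Separating gives $\int \frac{1}{y}\,dy = \int 3x^2\,dx$, so $\ln y = x^3 + c$ and $y = Ae^{x^3}$. The Python sketch below checks numerically that this candidate really satisfies the equation; the equation, the value of `A` and the test points are assumptions made for the illustration.

```python
import math

def rhs(x, y):
    """Right-hand side of the hypothetical equation dy/dx = 3x^2 * y."""
    return 3 * x ** 2 * y

def general_solution(x, A):
    """Candidate from separating variables: ln y = x^3 + c, so y = A e^{x^3}."""
    return A * math.exp(x ** 3)

def derivative(g, x, h=1e-6):
    """Central-difference estimate of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

A = 2.5  # any positive constant should work
residuals = [
    abs(derivative(lambda s: general_solution(s, A), x)
        - rhs(x, general_solution(x, A)))
    for x in (0.0, 0.4, 0.9)
]
print(residuals)  # each should be close to zero
```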
Example:
Solve the equation $\frac{dy}{dt} + 2y = 6$ by using separation of variables.

18.2.3

General and Definite Solutions

Being definite
Up till now we have left our solutions with a constant in them (either c or
A);
Such a solution is known as a general solution;
However, just like in integration of last lecture, we can use an initial condition to find a definite solution...
Example:
Given the general solution, $y = Ae^{-2t} + 3$, and initial condition $y(0) = 10$,
find the definite solution.

Note: Definite vs. Particular Solutions. You may also find discussion of a particular
solution of a differential equation. This arises because, as you will have seen, the
choice of the constant $A$ (for example) in the above equation is somewhat arbitrary.
Any choice of $A$ as a constant will give rise to a solution of the initial differential
equation; that is, the particular equation that this choice gives rise to will display
dynamics in line with the original differential equation. However, there is one special
choice of $A$ that will satisfy an initial condition when $t = 0$. This is special, since
it will uniquely describe not only the dynamics of the system being analysed, but also
the actual value at a given time of the system itself, due to solving for $A$ with the
initial conditions of the system in mind. In this way, we can talk about three kinds of
solutions: the general solution,
$$y_g(t) = Ae^{-2t} + 3,$$
a particular solution, where we just choose some $A$ arbitrarily,
$$y_p(t) = 16e^{-2t} + 3,$$
and the special case, where we choose $A$ such that the initial conditions are satisfied,
which is the definite solution,
$$y_d(t) = 7e^{-2t} + 3.$$

Example:
Show that the general solution to the non-homogeneous first-order differential equation, $\frac{dy}{dt} + ay = b$ ($y > b/a$), is $y(t) = Ae^{-at} + \frac{b}{a}$. (Harder)

Example:
Using the result of the previous example and assuming that $y(0) = y_0$,
show that the definite solution is $y(t) = \left[y_0 - \frac{b}{a}\right]e^{-at} + \frac{b}{a}$.

From the previous two examples, we can now say the following.
Definition | Solution of a Non-homogeneous Differential Equation
Given a first-order non-homogeneous differential equation of the form,
$$\frac{dy}{dt} + ay(t) = b \quad \text{and} \quad y(0) = y_0$$
where $a$ and $b$ are constants, the solution will be given by,
$$y(t) = \left(y_0 - \frac{b}{a}\right)e^{-at} + \frac{b}{a}. \qquad (18.3)$$
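The formula in (18.3) is easy to test numerically. The Python sketch below codes it up for some made-up constants (`a`, `b` and `y0` are assumptions, not values from the lecture) and checks two things: that $y(0) = y_0$, and that $\frac{dy}{dt} + ay - b$ is close to zero at a few points in time.

```python
import math

def definite_solution(t, a, b, y0):
    """y(t) = (y0 - b/a) e^{-at} + b/a, assuming a != 0."""
    return (y0 - b / a) * math.exp(-a * t) + b / a

def derivative(g, t, h=1e-6):
    """Central-difference estimate of g'(t)."""
    return (g(t + h) - g(t - h)) / (2 * h)

a, b, y0 = 0.5, 4.0, 1.0   # hypothetical constants and initial value

def y(t):
    return definite_solution(t, a, b, y0)

initial_gap = abs(y(0.0) - y0)
# Residual of dy/dt + a*y - b at a few times; should be near zero.
residuals = [abs(derivative(y, t) + a * y(t) - b) for t in (0.0, 1.0, 5.0)]
print(initial_gap, residuals)
```

Setting `b = 0.0` in the same function gives the homogeneous case.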

By realising that the homogeneous case is where b = 0 we can say the following.
Definition | Solution of a Homogeneous Differential Equation
Given a first-order homogeneous differential equation of the form,
$$\frac{dy}{dt} + ay(t) = 0 \quad \text{and} \quad y(0) = y_0$$
where $a$ is a constant, the solution will be given by,
$$y(t) = y_0e^{-at}. \qquad (18.4)$$

Example: Check
Solve (again) the differential equation $\frac{dy}{dt} + 2y = 6$ where $y(0) = 10$.

HPW 15.6

18.3

Topics in Growth

18.3.1

Exponential Growth

Differential Equations & Growth


The previous work now gives us a new insight into the nature of growth;
Consider an account that pays interest continuously: by how much does it
grow in an infinitely small time period $dt$?
The account value, say $S$, has a rate of growth that depends on the interest
rate $r$ and the amount of money in the account at the time, $S$, that is,
$$\frac{dS}{dt} = Sr$$
... familiar?
Solving,
$$\frac{dS}{dt} = Sr$$
$$\int \frac{1}{S}\,dS = \int r\,dt$$
$$\ln|S| + c_1 = rt + c_2$$
$$S = e^{rt + c} = Ae^{rt}$$
where $c = c_2 - c_1$ and $A = e^c$.

Now, in the case of a bank account, the initial condition $S(0)$ is just the initial
principal placed in the account, $P$, so:
$$S(t) = Pe^{rt}$$
... our continuously compounded interest formula!
Definition | The Exponential Growth form
A differential equation of the form,
$$\frac{dy}{dt} = ay(t), \qquad (18.5)$$
where $a$ is a constant, is an exponential growth equation, having solution,
$$y(t) = y_0e^{at} \qquad (18.6)$$
where $y_0$ is an initial value. The following cases arise:
If $a > 0$ we have exponential growth; and
If $a < 0$ we have exponential decay.
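A handy consequence of (18.6) is the doubling time: setting $y_0e^{at} = 2y_0$ gives $t = \frac{\ln 2}{a}$ when $a > 0$. The Python sketch below checks this for a made-up growth rate of 2% (the rate and the starting value of 100 are assumptions for illustration only).

```python
import math

def exponential(t, y0, a):
    """y(t) = y0 * e^{a t}."""
    return y0 * math.exp(a * t)

def doubling_time(a):
    """Solve y0 e^{a t} = 2 y0 for t; only meaningful for a > 0."""
    if a <= 0:
        raise ValueError("doubling requires a > 0")
    return math.log(2) / a

a = 0.02                  # hypothetical 2% continuous growth rate
t_double = doubling_time(a)
ratio = exponential(t_double, 100.0, a) / 100.0
print(t_double, ratio)    # ratio should be very close to 2
```

Notice that the doubling time does not depend on the initial value $y_0$.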
Growth & Decay...
Consider the equation $y(t) = 2e^{\pm\frac{1}{2}t}$. It is either in a growth mode or a decay
one as follows:
[Figure: growth and decay curves for $y$.]

Example:
The CIA fact-book$^1$ predicted Australia's population growth rate in 2006
to be 0.85%. Find a general equation for Australia's population at any
given time.

Example:
Using the equation derived from the previous example, find how long it
would take at this rate for Australia's population to double.

Example:
However, the fact-book also says that Australia's net migration is 3.85
migrants per 1000 population per year (2006). Given this fact, re-calculate
the years to double the population of Australia.


Lecture

19

Differential Equations & Growth II


19.1

Introduction

We introduced the concept of growth last lecture. However, despite all the interest
that we can gain from the exponential growth story, the reality is that it has some
not very pleasing properties. The most significant of these is its limit behaviour.
As time goes to infinity, we find the exponential curve sky-rocketing off any graph
paper in the world, itself reaching infinite heights. Now while some may think that
this is true (that populations, stocks, money can grow and grow infinitely), others
beg to differ!
The more realistic view would suggest that just like money doesn't grow on
trees, there are limits to growth. Whether in the form of capacity constraints for
manufacturing plants (we just can't get any more inputs out of the ground), or as
in-built restraint mechanisms as in yeast growth (they literally begin to die in their
own waste), things eventually reach a limit. So first up, we will study a limited
growth equation, and see how it can be used. In the second case, we will round off
one more sharp edge to get a possibly even more realistic growth equation.
Of course, we could keep going and going, creating richer and richer models of
growth, but for now, it will suffice to cover the main three models. They are relatively
simple, but despite any failings they still describe some situations very well. Thus
they are commonly used in modern commercial and economic calculations. We'll
even attempt to go back to 1788 and see if we can calculate the growth rate of the
recorded Australian population, and along the way, make a bold prediction about
the limits to Australia's population growth!
Agenda
1. Study the exponential growth curve;
2. Correct the limit behaviour: the limited growth curve;
3. A further attempt: the logistic growth curve;
4. Apply to Australian data since 1788!

HPW 15.6

19.2

Limited Growth

19.2.1

A Problem of Limits

Limits to Growth?
Last time, we considered the case of exponential growth, that is, where we
had a growth equation,
$$\frac{dy}{dt} + ay = b \quad \text{and} \quad y(0) = y_0$$
giving an equation with respect to time of,
$$y(t) = \left(y_0 - \frac{b}{a}\right)e^{-at} + \frac{b}{a}.$$
What about as $t \to \infty$?
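Example:
Show that the limit of $y(t) = \left(y_0 - \frac{b}{a}\right)e^{\alpha t} + \frac{b}{a}$ (with $\alpha = -a$) is $\frac{b}{a}$ for $\alpha < 0$ and $\infty$ for
$\alpha > 0$.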

The second case of the previous example presents us with a problem: exponential growth can be unbounded!
This might be desirable for some situations, but for most, we will want the
growth to find a natural limit, due to some carrying capacity constraint
for example:
[Figure: growth in $y$ levelling off at a carrying capacity.]

What is needed is that the growth rate begins high, but goes to zero as some
capacity level is reached;
In other words, we need the growth rate to be proportional to the deviation
from the carrying capacity:
$$\frac{dN}{dt} = k(M - N(t))$$
... where $M$ is the carrying capacity, and $k$ is a rate constant.

19.2.2

Limiting Exponential Growth

Definition | Limited Growth


If the rate of change of some quantity $N$ with time $t$ is given by,
$$\frac{dN}{dt} = k(M - N(t)), \qquad (19.1)$$
where $M$ is the growth limit and $k$ is the rate constant, then the
quantity will have the limited growth equation of time,
$$N(t) = M - Ae^{-kt} \qquad (19.2)$$
where $A$ is a constant and $A = M - N_0$ if $N(0) = N_0$ is the given initial
condition.
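To get a feel for (19.2), the Python sketch below evaluates the limited growth curve and also inverts it to find when a given fraction of the limit is reached. The values of `M`, `k` and `N0`, and the helper `time_to_reach`, are illustrative assumptions rather than anything from the lecture.

```python
import math

def limited_growth(t, M, k, N0):
    """N(t) = M - A e^{-kt} with A = M - N0."""
    A = M - N0
    return M - A * math.exp(-k * t)

def time_to_reach(fraction, M, k, N0):
    """Solve N(t) = fraction * M for t (needs N0 < fraction*M < M, k > 0)."""
    A = M - N0
    return -math.log((M - fraction * M) / A) / k

M, k, N0 = 100.0, 0.1, 10.0       # hypothetical limit, rate and start value
t_half = time_to_reach(0.5, M, k, N0)
print(t_half, limited_growth(t_half, M, k, N0))
print(limited_growth(1000.0, M, k, N0))  # approaches M for large t
```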

Example:
By the method of separation of variables, prove the previous definition for
limited growth.

19.2.3

Seeing it Graphically

Examples of limited growth:


Example:
Suppose that a government economist works out that Australia's carrying
capacity is 52 million people. Using the final constant of growth (internal
and migration) from the last lecture, how long will it take for Australia's
population to be at 90% of carrying capacity? (Assume current population
is 20 million.)

HPW 15.6

19.3

Logistic Growth

19.3.1

A Problem at the Beginning

A further refinement?
Consider again the limited growth curve:
[Figure: the limited growth curve for $y$.]

... if this were for population growth, is there still something a bit worrying
about its shape?
The curve starts with maximum slope! But a population is likely to start
off very slowly and then build in gradient, before slowing under the capacity
constraint.
Building a Better Curve
We still want the exponential growth component of the dynamics, but we'd
like the capacity constraint to stop growth when $p \to M$, and we'd also like growth to be small
for small $p$;
So... how about taking our exponential growth rate (in terms of population
dynamics),
$$\frac{dp}{dt} = kp$$
... and then multiplying by the fraction $\frac{M - p}{M}$:
$$\frac{dp}{dt} = kp\left(\frac{M - p}{M}\right)$$
The growth rate is now the product of the size of the population, and the
difference between the maximum size and the actual size of the population (as a fraction of the maximum size).

19.3.2

Logistic Growth Defined

Definition | Logistic Growth


Given an equation of dynamics for variable $N$ in terms of $t$,
$$\frac{dN}{dt} = kN\left(\frac{M - N}{M}\right) \qquad (19.3)$$
we obtain a logistic growth equation of the form,
$$N(t) = \frac{M}{1 + Ae^{-kt}} \qquad (19.4)$$
where $M$ is the capacity constraint, $k$ is the constant of growth and $A$ is
a constant.
Challenge: can you work out how to obtain (19.4) from (19.3)?
Example:
Show that for the logistic growth function,
$$p(t) = \frac{M}{1 + Ae^{-kt}},$$
if $p(0) = p_0$, then $A = \frac{M}{p_0} - 1$.
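A numerical check can make the logistic form more believable before attempting the algebra. The Python sketch below uses made-up values of `M`, `k` and `p0` (all assumptions) together with $A = \frac{M}{p_0} - 1$, and confirms that the curve starts at $p_0$ and that $\frac{dp}{dt} - kp\left(\frac{M - p}{M}\right)$ is close to zero along it.

```python
import math

def logistic(t, M, k, p0):
    """p(t) = M / (1 + A e^{-kt}) with A = M/p0 - 1."""
    A = M / p0 - 1
    return M / (1 + A * math.exp(-k * t))

def derivative(g, t, h=1e-5):
    """Central-difference estimate of g'(t)."""
    return (g(t + h) - g(t - h)) / (2 * h)

M, k, p0 = 1000.0, 0.3, 5.0   # hypothetical capacity, rate and start value

def p(t):
    return logistic(t, M, k, p0)

# Residual of dp/dt - k p (M - p)/M; should be near zero everywhere.
residuals = [abs(derivative(p, t) - k * p(t) * (M - p(t)) / M)
             for t in (0.0, 10.0, 20.0, 40.0)]
print(p(0.0), max(residuals))
```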

[Figure: examples of logistic growth in $y$, changing the rate constant.]

19.3.3

Examples of Logistic Growth

Where does logistic growth occur?


Animal growth (including human): from cell division (1,2,4,8,16,32,...) to
eventual growth slow-down and full maturity (then shrinking in older age!);
Population change;
As a distribution (more on this in session 2), especially in the Logit Distribution (used in Econometrics);

19.3.4

Applied to Australian Population Data

What's the Limit of Australia's Population?

To the data: http://www.abs.gov.au, Australian Historical Population Statistics (under Population Characteristics by title);
The data:
1. Range: 1788 to 2004;
2. Initial recorded value (1788): 859;
3. Last value (2004): 19,997,785;
We have the population equation,
$$p(t) = \frac{M}{1 + Ae^{-k(t - t_0)}}$$
where $A = \frac{M}{p_0} - 1$, and $t - t_0$ adjusts $t$ to get the start-year of our logistic curve
correct.

The population data (from the ABS) is plotted as black dots;
An attempt to fit the data follows! ($M = 30$ mil, $k = 0.03$, $p_0 = 859$, $t_0 = 1630$
(why??)) Set $t_0 = 1788$...;
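The effect of the start-year can be explored with a few lines of Python. The sketch below plugs the quoted values of $M$, $k$ and $p_0$ into the population equation and evaluates the fitted population in 2004 for each of the two values of $t_0$ mentioned. Treating these as two alternative start-years to compare is an assumption about the intent of the slide, and the function name is illustrative.

```python
import math

def logistic_population(year, M, k, p0, t0):
    """p(t) = M / (1 + A e^{-k (t - t0)}) with A = M/p0 - 1."""
    A = M / p0 - 1
    return M / (1 + A * math.exp(-k * (year - t0)))

M, k, p0 = 30e6, 0.03, 859      # parameter values quoted in the text
observed_2004 = 19_997_785      # last ABS value quoted in the text

for t0 in (1788, 1630):         # the two start-years mentioned above
    fitted = logistic_population(2004, M, k, p0, t0)
    # Print the fitted 2004 population and its ratio to the observed value.
    print(t0, round(fitted), round(fitted / observed_2004, 3))
```

Comparing the two printed ratios gives a hint towards the "(why??)" in the text.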
[Figure: ABS population data (black dots) and the fitted logistic curve; $p(t)$ ($\times 10^7$) against year, 1800 to 2400.]

Example:
With our model fitted above, find the rate of population growth in the year
1788.

Summary
Growth is important in economics and business;
Different systems will grow in different ways, we can now model three common
ones:
1. Exponential growth;
2. Limited (exponential) growth;
3. Logistic growth.
The last (logistic) seems the most natural (scarcity is our reason for study!),
but it has its critics.
Keep watching the Australian data: you've seen Australia's carrying capacity
(30 million) here first!!


19.4

Appendix: An Integration Workout

(From challenge above) Prove that the equation for logistic growth,
$$\frac{dN}{dt} = kN\left(\frac{M - N}{M}\right)$$
yields the following equation with respect to time:
$$N(t) = \frac{M}{1 + Ae^{-kt}}?$$

WARNING! Only look below, after you have tried it yourself!

We have,
$$\frac{dN}{dt} = kN\left(\frac{M - N}{M}\right) = \frac{k}{M}[N(M - N)]$$
separating variables gives,
$$\int \frac{1}{N(M - N)}\,dN = \int \frac{k}{M}\,dt$$
which after integration gives,
$$\frac{1}{M}\ln\left(\frac{N}{M - N}\right) + c_1 = \frac{k}{M}t + c_2$$
(recall, that
$$\int \frac{1}{x(ax + b)}\,dx = -\frac{1}{b}\ln\left|\frac{ax + b}{x}\right|$$
which can be rearranged (by log-laws) to give,
$$\int \frac{1}{x(ax + b)}\,dx = \frac{1}{b}\ln\left|\frac{x}{ax + b}\right|.)$$
Now, by multiplying both sides by $M$, and collecting the constants into a single
constant $c$, we have,
$$\ln\left(\frac{N}{M - N}\right) = kt + c,$$
which can be further rearranged (by log laws, again), to give,
$$e^{kt + c} = \frac{N}{M - N}$$
$$N(1 + Ae^{kt}) = MAe^{kt} \quad (\text{let } A = e^c)$$
$$N = \frac{MAe^{kt}}{1 + Ae^{kt}}.$$
Finally, we divide the numerator and denominator by $Ae^{kt}$ to obtain,
$$N = \frac{M}{\frac{1}{A}e^{-kt} + 1} = \frac{M}{1 + Be^{-kt}}$$
where $B$ is a constant standing in for $\frac{1}{A}$,
which is what we were required to prove.

Lecture

20

Multivariable Calculus: The Partial


Derivative
20.1

Introduction

Up until now you may have been wondering why we have only dealt with (mainly)
functions of just one variable (e.g. $f(x)$). The reason is that this is the simplest
case, and therefore is a good place to begin. However, as you are no doubt aware,
much of the useful quantitative work that gets done is on functions of more than
one variable: multivariable functions (e.g. $f(x, y)$). (In fact, if you can remember
back to linear programming, we were actually dealing with functions of this sort,
but only in very straightforward, but still useful, ways.)
What does a function of more than one variable look like? Given the usefulness
to business and economic problems of being able to find the maxima and minima
of functions, can we do the same in the many variable case? In regards to the first
question, if the function is just of two-variables then we can make a representation
of it in three-dimensional space. But anything more than this, and we don't really
have the means to show it graphically. For the purposes of this course, however,
we'll stick to functions of two-variables. Moreover, the techniques we develop on
two-variable functions are all applicable to any multivariable functions.
In terms of the second question, about finding maxima and minima, this
requires us as a first step to be able to deal with rates of change in the multivariable
case. Here, we do a similar thing to what we did in the single variable case, but
we have to be precise about the input/output
rate-effect that we are analysing, since in this many-variable world we have more
than one input variable that might cause the output variable to change. Therefore,
when we try to look at the slope of the function, we must look at one variable's
effect at a time. This technique is called partial differentiation.
Agenda
1. Functions of two-variables: the third dimension;
2. Analysing slope in 3D: the partial derivative;
3. Applications of the partial derivative;
4. Further analysis and methods.

20.2
HPW 17.1

Functions of two-variables

The third-dimension...
So far, we have dealt (in calculus) with functions of the form,
$$f(x), \qquad f'(x), \qquad f''(x)$$
Common to them all is the single independent variable ($x$);

What if we have a function of the form,
$$f(x_1, x_2) \qquad ??? \qquad f'(x_1, x_2)$$
... What does this kind of function look like?

... Is there such a thing as the derivative?
(We need to be careful... what does $f'()$ actually mean in this context??)

20.2.1

Seeing it Graphically

A second look at linear programming


Recall our Bean House Linear Programming example from a few lectures
ago?
We had constraints that gave a feasible area which was used with the
objective function to find the optimal solution...
... This is an example of a function:
f (x1 , x2 )

...

(x1 , x2 )

Now in three dimensions...


Constraints are actually planes;
The Objective function in this case is also a plane;
Our optimal point occurred at the corner of the two constraints;
We found the best payoff to match this point (these are level curves).
[Figure: the linear programming example in three dimensions, over the axes $x_1$ and $x_2$.]

The general case


Suppose we have some function, z = f (x, y)
Then a point will be (x0 , y0 , z0 ) = (x0 , y0 , f (x0 , y0 )),
Or, a plane will be constructed by fixing one axis value:
1. XZ plane: fix y = y0 ;
2. XY plane: fix z = z0 ;
3. YZ plane: fix x = x0 ;
[Figure: the $xz$, $yz$ and $xy$ planes on the axes $x$, $y$ and $z$.]

20.2.2

Derivatives in this Context

HPW 17.2, 17.3

What is the slope???

Can we get some kind of expression for slope in our new function of two
variables?
What would it look like?
Recall: slope is normally expressed as,
$$\frac{df}{d\ldots}$$
... that is, we differentiate with respect to a variable: we are calculating
the change in the function value, with respect to a change in an input
variable...
In our new framework, we do just the same thing...
Definition | Partial Derivative
For some function $f(x, y)$, the partial derivative of $f$ with respect to $x$ is
expressed as,
$$\frac{\partial f}{\partial x} = f_x \qquad (20.1)$$
and for the point $(x_0, y_0)$ is given by the limit,
$$f_x(x_0, y_0) = \lim_{h \to 0}\frac{f(x_0 + h, y_0) - f(x_0, y_0)}{h}. \qquad (20.2)$$
The partial derivative with respect to $y$ is given similarly
$$\frac{\partial f}{\partial y} = f_y \qquad (20.3)$$
and found in the same way.
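The limit in (20.2) can be imitated on a computer by using a small but finite $h$. The Python sketch below does exactly that for a made-up function (the function `f`, the point and the step size are assumptions for illustration); notice that only one input is nudged at a time while the other is held fixed.

```python
def partial_x(f, x0, y0, h=1e-6):
    """Approximate f_x(x0, y0): vary x only, hold y fixed at y0."""
    return (f(x0 + h, y0) - f(x0, y0)) / h

def partial_y(f, x0, y0, h=1e-6):
    """Approximate f_y(x0, y0): vary y only, hold x fixed at x0."""
    return (f(x0, y0 + h) - f(x0, y0)) / h

def f(x, y):
    return x ** 2 * y + 5 * y   # hypothetical function; f_x = 2xy, f_y = x^2 + 5

print(partial_x(f, 2.0, 3.0))   # close to 2*2*3 = 12
print(partial_y(f, 2.0, 3.0))   # close to 2^2 + 5 = 9
```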


Back to the plot
Consider our function again, z = f (x, y),
The partial derivative of f with respect to x at the point (x0 , y0 ) will be the
slope of f in the xz plane.
Similarly, the partial derivative of f with respect to y at the point (x0 , y0 )
will be the slope of f in the yz plane.
[Figure: the slopes $f_x$ and $f_y$ at the point $(x_0, y_0)$ on the surface $z = f(x, y)$.]

What does fx tell us?


The partial derivative, say $f_x$, tells us the answer to the question,
By how much does $f$ vary, for a given change in $x$, all else remaining
the same?
That is, we are analysing the slope of $f$ in one dimension only, keeping other
dimensions fixed.
This helps to untangle the influences of the many variables on the function $f$.

20.3

The Partial Derivative Method

20.3.1

First order partial derivatives


Example:
Given $y = f(x_1, x_2) = 3x_1^2 + x_1x_2 + 4x_2^2$, find $f_{x_1}$ and $f_{x_2}$.

Example:
Find $f_x$ and $f_y$ given that $f(x, y) = (2x + 4)(y - 3)$.

20.3.2

Second order partial derivatives

Higher order partial derivatives?

HPW 17.5

Recall that in differentiation, with one variable, we could calculate,
$$\frac{df}{dx} \quad \ldots \quad \frac{d^2f}{dx^2} \quad \ldots \quad \frac{d^3f}{dx^3} \quad \ldots$$
... Which was useful to determine the rate of change of the derivative of $f$,
in other words, the acceleration or deceleration of the change in $f$;
Can we do a similar thing with partial derivatives? What would it mean?
Suppose we took the partial derivatives of a function $f(x, y)$,
$$f_x \quad \text{and} \quad f_y$$
Now suppose we took the partial derivatives of each of these,
$$\frac{\partial}{\partial x}(f_x) \quad \text{and} \quad \frac{\partial}{\partial y}(f_x), \qquad \frac{\partial}{\partial y}(f_y) \quad \text{and} \quad \frac{\partial}{\partial x}(f_y)$$


We get FOUR different partial derivatives:
$$f_{xx} \qquad f_{xy} \qquad f_{yy} \qquad f_{yx}$$
What do each of the FOUR partial derivatives mean??

What does $f_{xx}$ tell us?
$f_{xx}$ tells us the rate of change of $f_x$ with respect to $x$; or
$f_{xx}$ tells us by how much the gradient (with respect to $x$) of $f$ changes,
as $x$ is varied.
What about $f_{xy}$? (a cross partial derivative or mixed partial derivative)
$f_{xy}$ tells us the rate of change of $f_x$ with respect to $y$; or
$f_{xy}$ tells us by how much the gradient (with respect to $x$) of $f$ changes,
as $y$ is varied.

20.3.3

Seeing it graphically

Higher derivatives: fyx graphically


We now know about $f_y$: the partial derivative of $f$ with respect to $y$, keeping $x$ constant;
As we take the partial derivative w.r.t. $x$ of $f_y$, we get $f_{yx}$: it asks, how does
the gradient of $f$ with respect to $y$ change as $x$ is varied? (green arrow)
[Figure: the change in the gradient $f_y$ as $x$ is varied (green arrow), on the axes $x$, $y$ and $z$.]

Example:
In the graphical example above, the function is $f(x, y) = 2 - (1 - x)^2 -
(1 - y)^2$. Show that $f_{xy} = f_{yx} = 0$. That is, that the partial derivatives
$f_x$ and $f_y$ do not change as the other variable is changed.

Example:
Find the second-order partial derivatives of the function $z = f(x, y) =
x^3e^{2y}$.

Don't be surprised if you keep finding that
$$f_{xy} = f_{yx}$$
... a theorem (Young's theorem) tells us that so long as the two
cross-partial derivatives are continuous, the order in which we calculate the cross-partial derivative doesn't matter. ... In this
course, we'll only deal with this kind of function.
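Second-order partial derivatives can also be estimated numerically by differencing twice. The Python sketch below builds $f_{xy}$ and $f_{yx}$ from central differences for a made-up smooth function (the function, the point and the step size are assumptions). Since both numerical routes end up using the same four function values, the more informative check is the comparison with the analytic cross-partial, $2y\cos x + 3x^2$ for this particular function.

```python
import math

def f(x, y):
    return math.sin(x) * y ** 2 + x ** 3 * y   # hypothetical smooth function

def f_x(x, y, h=1e-4):
    """Central-difference estimate of the partial derivative wrt x."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def f_y(x, y, h=1e-4):
    """Central-difference estimate of the partial derivative wrt y."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

def f_xy(x, y, h=1e-4):
    """Differentiate f_x with respect to y."""
    return (f_x(x, y + h) - f_x(x, y - h)) / (2 * h)

def f_yx(x, y, h=1e-4):
    """Differentiate f_y with respect to x."""
    return (f_y(x + h, y) - f_y(x - h, y)) / (2 * h)

x0, y0 = 0.7, 1.3
exact = 2 * y0 * math.cos(x0) + 3 * x0 ** 2   # analytic cross-partial
print(f_xy(x0, y0), f_yx(x0, y0), exact)
```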

20.4

Methods of Partial Differentiation

Further methods in partial differentiation


Suppose that instead of the situation we have been dealing with, $z = f(x, y)$:
[Diagram: $z$ depends on $x$ and $y$ through $f$.]

We have the following:
[Diagram: $z$ depends on $x$ and $y$ through $f$; $x$ depends on $t$ through $g$, and $y$ depends on $t$ through $h$.]

where $z = f(x, y)$ but also, $x = g(t)$ and $y = h(t)$;

Here, $z$ is affected by $t$ indirectly, but can we get an expression for $\frac{\partial z}{\partial t}$?

20.4.1
HPW 17.6

The Chain Rule

Definition | Chain-rule
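Given a function $z = f(x, y)$ where $x$ and $y$ are both functions of another
variable (or variables), e.g. $x = g(t)$, and $y = h(t)$, then if the functions $f$,
$g$ and $h$ have continuous partial derivatives, we have,
$$\frac{\partial z}{\partial t} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial t} \qquad (20.4)$$
or,
$$\frac{\partial z}{\partial t} = f_x\frac{\partial x}{\partial t} + f_y\frac{\partial y}{\partial t}. \qquad (20.5)$$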
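The chain rule can be checked numerically by computing the rate of change of $z$ with respect to $t$ in two ways: once by assembling the pieces as in (20.5), and once by differentiating the composite function directly. The Python sketch below does this for made-up functions `f`, `g` and `h` (all of which are assumptions chosen for illustration).

```python
def f(x, y):
    return x ** 2 * y          # hypothetical outer function

def g(t):
    return t ** 3              # hypothetical x = g(t)

def h(t):
    return 2 * t + 1           # hypothetical y = h(t)

def diff(fn, s, step=1e-6):
    """Central-difference derivative of a one-variable function."""
    return (fn(s + step) - fn(s - step)) / (2 * step)

t0 = 1.5
x0, y0 = g(t0), h(t0)

f_x = diff(lambda x: f(x, y0), x0)     # partial wrt x, y held fixed
f_y = diff(lambda y: f(x0, y), y0)     # partial wrt y, x held fixed
chain = f_x * diff(g, t0) + f_y * diff(h, t0)

direct = diff(lambda t: f(g(t), h(t)), t0)  # differentiate the composite
print(chain, direct)
```

The two printed numbers should agree up to a small numerical error.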

Example: Chain rule

Find $\frac{\partial z}{\partial t}$ given that $z(x, y) = \frac{x + y}{2x^2}$ and $x(t) = 2t^2 + 1$ and $y(t) = 3t$.

20.4.2

Total Differentials

Getting all the change together...


In practice, a special application of the chain rule gives us a useful piece of
information;
We ask the question, by how much does $z$ change, given an infinitesimal change
in $x$ or $y$? (given that $x$ and $y$ are a function of something else)
We can see that (by the chain rule) we have
$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}$$
as a first approximation, but we notice that we can take out the common
factor $\frac{1}{dt}$, to be left with,
$$dz = f_x\,dx + f_y\,dy$$
This is called the total differential of the function $z$ and the process by
which we got there is total differentiation. (Note: the other variable,
here $t$, has disappeared.)
Definition | The Total Differential
Given a two variable function $z = f(x, y)$, then the total change in $z$ for
infinitesimal changes in $x$ and $y$ is given by,
$$dz = f_x\,dx + f_y\,dy \qquad (20.6)$$
and is known as the total differential. In general for any number of
variables $z = f(x_1, \ldots, x_n)$ the total differential is given by,
$$dz = f_{x_1}dx_1 + f_{x_2}dx_2 + \cdots + f_{x_n}dx_n \qquad (20.7)$$
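For small (but finite) changes, the total differential gives a good approximation to the actual change in $z$. The Python sketch below compares the two for a made-up function with hand-computed partial derivatives; the function, the base point and the sizes of `dx` and `dy` are assumptions for illustration.

```python
def f(x, y):
    return x ** 2 + 3 * x * y      # hypothetical function

def f_x(x, y):
    return 2 * x + 3 * y           # its partial derivative wrt x

def f_y(x, y):
    return 3 * x                   # its partial derivative wrt y

x0, y0 = 1.0, 2.0
dx, dy = 0.001, -0.002             # small (not truly infinitesimal) changes

dz = f_x(x0, y0) * dx + f_y(x0, y0) * dy        # total differential
actual = f(x0 + dx, y0 + dy) - f(x0, y0)        # true change in z
print(dz, actual)
```

The small gap between the two numbers shrinks as `dx` and `dy` are made smaller.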

Example: Total differentiation


Find the total differential of $z = 3x^2 + xy - 2y^3$.

Total differential: reprise


So what can we see in the total differential:
$$dz = f_x\,dx + f_y\,dy$$
We have a summation of the changes to $z$ that occur for a given infinitesimal
change in $x$ or $y$;
And $f_x$ and $f_y$ are acting as converters, to make sure that the tiny changes
in $x$ and $y$ will affect $z$ in the correct quantities.


Lecture

21

Multi-variable Optimisation
21.1

Introduction

We now come to our final lecture in calculus. We met last time the partial derivative
as a way of finding the rate of change of a function of more than one variable.
Further, we could then use the partial derivative to find the total differential of a
function and so, get at how infinitesimal changes in one of the inputs would change
the output. This time, we use this knowledge to go one step further and use the
idea of multi-variable calculus to perform optimisation.
In some ways, this will look very familiar to our optimisation problems under
normal single-variable calculus, but in others, we'll need extra tools. In the first
case, the first- and second-order conditions that we apply will look very similar, but
in the second case, where we have to deal with constraints on where we are allowed
to look for maxima or minima, things will get quite a bit more complicated!
In particular, we have to somehow perform our normal optimisation, but at the
same time, ensure that we don't get off the constraint line. This sounds difficult,
but in fact, a device called the Lagrange multiplier will come in very handy. It
should be pointed out that some of the steps in between in the later cases are not
shown, and are not required in this course: we'll just take the conditions as is. For
those interested however, any undergraduate text in calculus will be a good place
to look, most probably under the title of Jacobian or Hessian matrices. For the
rest of us, we just need to be able to apply these nice results to find out when (and
if) we really are at the top (or bottom) of the hill (or depression).
Agenda
1. Extrema in two-variable functions;
2. First and second order criteria to find them;
3. Dealing with constraints;
4. Applying the technique.

21.2

Unconstrained optimisation

A natural problem

HPW 17.7

177

LECTURE 21. MULTI-VARIABLE OPTIMISATION


Recall that we had a function of two variables for which we could find the derivatives with respect to each input: the partial derivatives $f_x$ and $f_y$;
Now, our question will be: how do we find the local extrema of such functions?
[Figure: surface $z = f(x, y)$ over the $x$-$y$ plane, showing the partial derivatives $f_x$ and $f_y$ at the point $(x_0, y_0)$.]

A natural solution...
Recall that for functions of one variable, we solved
\[ f'(x) = 0 , \]
checking $\operatorname{sign}[f''(x)]$.
We will do the same here, since we have:
1. A local maximum or minimum must be a place where $f_x = 0$ and $f_y = 0$;
2. The type of extremum will be determined by the sign of $d^2 z$ (more on this later!).
The mechanics...
Recall from last lecture our work with differentials, and in particular the total differential of the function $z = f(x, y)$:
\[ dz = f_x\,dx + f_y\,dy \]
This says that an infinitesimal change in $z$ is made up of the infinitesimal changes in $x$ and $y$, each multiplied by a scaling factor ($f_x$ and $f_y$ respectively);
We want the place where, for any infinitesimal change in $x$ or $y$, $dz = 0$:
\[ dz = 0 = f_x\,dx + f_y\,dy \]
How do we solve this? Since $dx$ and $dy$ can be chosen independently, the only way is $f_x = f_y = 0$.
21.2.1 First order conditions

Definition | First order necessary condition for $f(x, y)$

Given a function of two variables, $f(x, y)$, a local extremum, say $(x_0, y_0)$, must satisfy the first order necessary conditions for relative extrema,
\[ f_x(x_0, y_0) = 0 \quad\text{and}\quad f_y(x_0, y_0) = 0 . \]

Example:
Find the extreme values of the function,
\[ z = f(x_1, x_2) = 2x_1^2 + x_1 x_2 + 4x_2^2 . \]
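A worked sketch of how the first order conditions apply to this example (one possible way of setting out the working, not necessarily the layout used in the lecture):

```latex
\begin{align*}
f_1 &= \frac{\partial z}{\partial x_1} = 4x_1 + x_2 = 0 \\
f_2 &= \frac{\partial z}{\partial x_2} = x_1 + 8x_2 = 0 \\
&\Rightarrow\; x_2 = -4x_1 \;\text{ and }\; x_1 - 32x_1 = 0
 \;\Rightarrow\; (x_1, x_2) = (0, 0), \quad z = 0
\end{align*}
```

The only candidate is therefore the origin. Whether it is a maximum or a minimum is a question for the second order conditions of Section 21.2.2; there one finds $f_{11} = 4$, $f_{22} = 8$ and $f_{12} = 1$, so that $f_{11} f_{22} - f_{12}^2 = 31 > 0$ and the point is a local minimum.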

Example:
For the plotted function used previously,
\[ z = f(x, y) = 2 - (x - 1)^2 - (y - 1)^2 , \]
find the location of the local maximum.
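The first order conditions can also be checked numerically. The following is a minimal sketch, assuming the candidate point $(1, 1)$ has been found by hand from $f_x = -2(x - 1) = 0$ and $f_y = -2(y - 1) = 0$; the helper name `partials` and the step size `h` are illustrative choices, not part of the course material.

```python
def f(x, y):
    """The plotted example surface z = 2 - (x - 1)^2 - (y - 1)^2."""
    return 2 - (x - 1) ** 2 - (y - 1) ** 2


def partials(func, x, y, h=1e-6):
    """Central-difference estimates of (f_x, f_y) at the point (x, y)."""
    fx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    fy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return fx, fy


# At the candidate point both partial derivatives should vanish.
fx, fy = partials(f, 1.0, 1.0)
print(abs(fx) < 1e-8, abs(fy) < 1e-8)

# Away from the candidate they do not: analytically f_x = -1 and f_y = 1 here.
fx2, fy2 = partials(f, 1.5, 0.5)
print(round(fx2, 6), round(fy2, 6))
```

A numerical check of this kind only confirms a candidate; it does not replace solving $f_x = f_y = 0$.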

21.2.2 Second order conditions

What kind of extremum?


Recall that in ordinary single-variable calculus, to find out the kind of extremum, we applied the second-order condition to the function.


We do the same here: we want to differentiate the total differential,
\[ dz = f_x\,dx + f_y\,dy , \]
which we differentiate by the chain rule, treating each of $(f_x, f_y)$ as a variable and each of $(dx, dy)$ as a constant (always an infinitesimal change):
\begin{align*}
d^2 z = d(dz) &= \frac{\partial (dz)}{\partial x}\,dx + \frac{\partial (dz)}{\partial y}\,dy \\
&= \frac{\partial}{\partial x}(f_x\,dx + f_y\,dy)\,dx + \frac{\partial}{\partial y}(f_x\,dx + f_y\,dy)\,dy \\
&= \dots
\end{align*}

Note: If you were wondering where the rest of the previous derivation ends up, then the following will satisfy you, though this particular derivation is not something we expect you to be able to reproduce; it is for your general understanding (if you wish!):
\begin{align*}
d^2 z = d(dz) &= \frac{\partial (dz)}{\partial x}\,dx + \frac{\partial (dz)}{\partial y}\,dy \\
&= \frac{\partial}{\partial x}(f_x\,dx + f_y\,dy)\,dx + \frac{\partial}{\partial y}(f_x\,dx + f_y\,dy)\,dy \\
&= (f_{xx}\,dx + f_{yx}\,dy)\,dx + (f_{xy}\,dx + f_{yy}\,dy)\,dy \\
&= \left(f_{xx}(dx)^2 + f_{yx}\,dy\,dx\right) + \left(f_{xy}\,dx\,dy + f_{yy}(dy)^2\right) \\
&= f_{xx}(dx)^2 + 2f_{xy}\,dx\,dy + f_{yy}(dy)^2
\end{align*}


Now, you'll notice that $d^2 z$ is a quadratic expression in $dx$ and $dy$. Like any quadratic, it keeps the same sign for every (non-zero) choice of $dx$ and $dy$ only if it has no real roots, that is, only if its discriminant is negative.¹ Here the discriminant is $(2f_{xy})^2 - 4f_{xx}f_{yy}$, so we need $f_{xx}f_{yy} - f_{xy}^2 > 0$. This quantity happens to be given by a special matrix representation called the Hessian determinant (or just the Hessian for short),
\[ |H| = \begin{vmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{vmatrix} . \]
Like other determinants, it has the value
\[ |H| = f_{xx}f_{yy} - f_{xy}f_{yx} , \]
which, for functions whose cross-partial derivatives are equal (i.e. $f_{xy} = f_{yx}$, as in our course), will look like
\[ |H| = f_{xx}f_{yy} - f_{xy}^2 . \]
This must be greater than zero for $d^2 z$ to have a definite sign, and so this is where the seemingly peculiar part of the sufficient second-order condition in the definition below comes from:
\[ |H| > 0 \quad\Longleftrightarrow\quad f_{xx}f_{yy} - f_{xy}^2 > 0 . \]

¹ Recall that for a quadratic equation of the form
\[ ax^2 + bx + c = 0 \]
the solution is given by the quadratic formula,
\[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} , \]
where the part under the square root, $b^2 - 4ac$, is known as the discriminant. Real roots exist only when the discriminant is not negative; when it is negative, the quadratic never crosses zero and so never changes sign.



Definition | Second-order conditions for $f(x, y)$
Given some function of two variables $f(x, y)$, if the first-order necessary condition is satisfied ($f_x = f_y = 0$) at the point $(x_0, y_0)$, then the second-order sufficient conditions (evaluated at that point) for a local maximum are
\[ f_{xx}, f_{yy} < 0 \quad\text{and}\quad f_{xx}f_{yy} - f_{xy}^2 > 0 , \tag{21.1} \]
or for a local minimum are
\[ f_{xx}, f_{yy} > 0 \quad\text{and}\quad f_{xx}f_{yy} - f_{xy}^2 > 0 . \tag{21.2} \]

Summary of conditions
We can give a summary of the necessary and sufficient conditions for establishing local maxima and minima of functions of two variables as follows:

Condition                 Local maximum                      Local minimum
First-order necessary     $f_x = f_y = 0$                    $f_x = f_y = 0$
Second-order sufficient   $f_{xx}, f_{yy} < 0$               $f_{xx}, f_{yy} > 0$
                          $f_{xx}f_{yy} - f_{xy}^2 > 0$      $f_{xx}f_{yy} - f_{xy}^2 > 0$

Example:
Find the extreme value(s) of the function $z = f(x, y) = 8x^3 + 2xy - 3x^2 + y^2 + 1$.
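The conditions in the summary table can be applied mechanically once the candidate points are known. The sketch below assumes the candidates have already been found by hand ($f_y = 0$ gives $y = -x$, and then $f_x = 24x^2 - 8x = 0$); the function names are illustrative, and the "saddle point" branch uses the standard result that $f_{xx}f_{yy} - f_{xy}^2 < 0$ rules out both a maximum and a minimum.

```python
from fractions import Fraction


def first_partials(x, y):
    """f_x and f_y for f = 8x^3 + 2xy - 3x^2 + y^2 + 1."""
    return 24 * x**2 + 2 * y - 6 * x, 2 * x + 2 * y


def second_partials(x, y):
    """f_xx, f_yy and f_xy for the same function."""
    return 48 * x - 6, Fraction(2), Fraction(2)


def classify(x, y):
    """Apply the second-order conditions at a point where f_x = f_y = 0."""
    fxx, fyy, fxy = second_partials(x, y)
    hessian = fxx * fyy - fxy**2
    if hessian > 0 and fxx < 0 and fyy < 0:
        return "local maximum"
    if hessian > 0 and fxx > 0 and fyy > 0:
        return "local minimum"
    if hessian < 0:
        return "saddle point"
    return "inconclusive"


# Candidates found by hand from the first-order conditions.
candidates = [(Fraction(0), Fraction(0)), (Fraction(1, 3), Fraction(-1, 3))]
for x, y in candidates:
    assert first_partials(x, y) == (0, 0)   # first-order conditions hold
    print((x, y), classify(x, y))
```

Exact fractions are used so that the test $f_x = f_y = 0$ at $(1/3, -1/3)$ is not disturbed by rounding.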

21.2.3 Saddle points

The other case ...


In the previous example, we found that the point (0,0) had $f_{xx} < 0$ and $f_{yy} > 0$. So, although the first-order necessary conditions hold there, the second-order sufficient conditions for a maximum or a minimum do not ... what is at (0,0)?
The two second-order partial derivatives having opposite signs tells us that the function (around (0,0)) is curving down in one direction and curving up in the other ... a saddle point.
[Figure: surface plot of $z = f(x, y)$ with two marked points.]

21.3 Constrained optimisation

HPW 17.8

Dealing with constraints...

Up until now, we have been allowing our search of the function to go anywhere ... it has been unconstrained;

What about the realistic situation where we face a constraint (e.g. a budget line)?

This will require us to perform constrained optimisation.


21.3.1 Dealing with the constraint I: substitution

By substitution


Example:
Using the function
\[ z = f(x, y) = 2 - (x - 1)^2 - (y - 1)^2 , \]
and the constraint $2 = y + \tfrac{2}{5}x$, find the new optimum point.
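One way the substitution could be set out is sketched below (a worked outline under the stated function and constraint, offered as a check rather than as the official solution):

```latex
\begin{align*}
y &= 2 - \tfrac{2}{5}x \\
z(x) &= 2 - (x - 1)^2 - \bigl(1 - \tfrac{2}{5}x\bigr)^2 \\
\frac{dz}{dx} &= -2(x - 1) + \tfrac{4}{5}\bigl(1 - \tfrac{2}{5}x\bigr)
  = \tfrac{14}{5} - \tfrac{58}{25}x = 0 \\
x^* &= \tfrac{35}{29} \approx 1.21, \qquad
y^* = 2 - \tfrac{2}{5}\cdot\tfrac{35}{29} = \tfrac{44}{29} \approx 1.52 \\
\frac{d^2 z}{dx^2} &= -\tfrac{58}{25} < 0
\end{align*}
```

After substitution the problem has a single variable, so the ordinary single-variable conditions apply: the negative second derivative indicates a maximum along the constraint line.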

The Lagrange Multiplier


The substitution method will work when:
1. The constraint(s) is(are) relatively simple;
2. The constraint(s) can therefore be re-arranged to perform substitution
into the objective function.
However, there will be times when this is not true...
Enter: The Lagrange Multiplier Method

21.3.2 Dealing with the constraint II: the Lagrange multiplier

Working with our function,
\[ z = f(x, y) = 2 - (x - 1)^2 - (y - 1)^2 , \]
and the constraint $2 = y + \tfrac{2}{5}x$, suppose we incorporate the constraint in a new function:
\[ L = f(x, y, \lambda) = 2 - (x - 1)^2 - (y - 1)^2 + \lambda\left[2 - \left(y + \tfrac{2}{5}x\right)\right] \]
The symbol $\lambda$ (lambda) is some number, called a Lagrange multiplier;
Notice the way we have written the constraint: if it is satisfied, then the $[\,\dots]$ term will equal zero;
Q: How do we accomplish this?
A: By treating $\lambda$ as a variable and applying our first-order conditions to $L(x, y, \lambda)$ ... by setting $L_\lambda = 0$ we will get the equation
\[ L_\lambda = 2 - \left(y + \tfrac{2}{5}x\right) = 0 \]
... just what we wanted!
21.3.3 First order conditions

Definition | The Lagrange Multiplier Method

In general, to optimize a function of two variables, $z = f(x, y)$, subject to the constraint $g(x, y) = c$, where $c$ is a constant, we may write the Lagrangian function as follows,
\[ L = f(x, y) + \lambda[c - g(x, y)] \tag{21.3} \]
which gives the necessary first-order conditions for extrema as
\begin{align*}
L_\lambda &= c - g(x, y) = 0 , \\
L_x &= f_x - \lambda g_x = 0 , \text{ and} \\
L_y &= f_y - \lambda g_y = 0 .
\end{align*}

Example:
Check the previous example, using the Lagrange Multiplier method.
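Because the objective here is quadratic and the constraint is linear, the three first-order conditions form a linear system in $(x, y, \lambda)$. The sketch below solves that system exactly; the helper functions `det3` and `cramer3` are illustrative names (Cramer's rule is just one convenient way to solve a small system), and the rearrangement of the conditions is shown in the comments.

```python
from fractions import Fraction as F


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def cramer3(A, rhs):
    """Solve A v = rhs for a 3x3 system using Cramer's rule."""
    base = det3(A)
    solution = []
    for col in range(3):
        replaced = [row[:col] + [rhs[r]] + row[col + 1:] for r, row in enumerate(A)]
        solution.append(det3(replaced) / base)
    return solution


# First-order conditions of L = 2 - (x-1)^2 - (y-1)^2 + lam*[2 - (y + (2/5)x)],
# rearranged into linear form in the unknowns (x, y, lam):
#   L_x = 0:    2x           + (2/5) lam = 2
#   L_y = 0:           2y    +       lam = 2
#   L_lam = 0:  (2/5)x + y               = 2
A = [[F(2), F(0), F(2, 5)],
     [F(0), F(2), F(1)],
     [F(2, 5), F(1), F(0)]]
rhs = [F(2), F(2), F(2)]

x_star, y_star, lam_star = cramer3(A, rhs)
print(x_star, y_star, lam_star)                          # 35/29 44/29 -30/29
print(round(float(x_star), 2), round(float(y_star), 2))  # 1.21 1.52
```

The values of $x$ and $y$ agree with the substitution method; the multiplier comes out negative here because the unconstrained maximum at $(1, 1)$ lies on the side of the line where $y + \tfrac{2}{5}x < 2$.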

Solution method: reprise


Q: Why can we be so sure that our values for $x$ and $y$ will solve the constrained problem?
A: Because the use of the Lagrangian with the first-order condition ensures that the constraint is satisfied.
Check: We had $(x^*, y^*) = (1.21, 1.52)$; substitute into the expression for $[c - g(x, y)]$,
\begin{align*}
[c - g(x, y)] &= 2 - \left(y + \tfrac{2}{5}x\right) \\
&= 2 - \left(1.52 + \tfrac{2}{5}(1.21)\right) \\
&= 2 - 2.004 \approx 0 \,!
\end{align*}
Recap: By making the special function $L$ (the Lagrangian) and taking the partial derivative with respect to $\lambda$, we ensure that, when solved along with the other first-order conditions, our constraint will be satisfied.
21.3.4 Second Order Conditions

Which extremum?
Definition | Second-order sufficient conditions, constrained case
Consider a constrained optimization problem having Lagrangian
\[ L = f(x, y) + \lambda[c - g(x, y)] , \]
and partial derivative (bordered Hessian) determinant
\[ |\bar{H}| = \begin{vmatrix} 0 & g_x & g_y \\ g_x & L_{xx} & L_{xy} \\ g_y & L_{yx} & L_{yy} \end{vmatrix} . \tag{21.4} \]
Then for each extremum point $(x^*, y^*, \lambda^*)$:
1. If $|\bar{H}| < 0$ the point is a local minimum; and
2. If $|\bar{H}| > 0$ the point is a local maximum.
Which extremum?
Example:
Check the previous example for second-order conditions.
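A sketch of this check for the running example, with $g(x, y) = y + \tfrac{2}{5}x$, so that $g_x = \tfrac{2}{5}$, $g_y = 1$, $L_{xx} = L_{yy} = -2$ and $L_{xy} = 0$ (expanding along the first row is one choice among several):

```latex
\[
|\bar{H}| =
\begin{vmatrix}
0 & \tfrac{2}{5} & 1 \\
\tfrac{2}{5} & -2 & 0 \\
1 & 0 & -2
\end{vmatrix}
= -\tfrac{2}{5}\Bigl(\tfrac{2}{5}(-2) - 0\cdot 1\Bigr)
  + 1\cdot\Bigl(\tfrac{2}{5}\cdot 0 - (-2)(1)\Bigr)
= \tfrac{8}{25} + 2 = \tfrac{58}{25} > 0
\]
```

Since the determinant is positive, the constrained point is a local maximum, which agrees with the sign of the second derivative found under the substitution method.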

Lecture 22

Applications of Constrained Optimisation

22.1 Introduction

We have looked at multi-variable optimization, and had an overview of how the Lagrange method can be used to solve optimization problems in which there is a constraint. Far from being merely a mathematical curiosity, the Lagrange method is extremely prominent in Economics. This is because many situations of interest to economists involve agents (individuals, firms, families, governments, etc.) maximizing subject to constraints. In this lecture, we look at two prominent examples: i) consumers choose what to consume in order to obtain the highest payoff subject to being able to afford their consumption bundle, and ii) managers choose the input mix that produces the greatest possible output subject to their annual allotted budget. We then explore what the value of the Lagrange multiplier can tell us. We finish up with a preview of more complicated problems that the Lagrange method can be used to analyze.
For other examples, as well as practice questions, see Section 17.8 of the HPW textbook. You may also be interested in Martin Osborne's math tutorial site that deals with Lagrange Multipliers:
www.economics.utoronto.ca/osborne/MathTutorial/ILMF.HTM
Agenda
1. Economic Applications
   - Utility maximization
   - Output maximization
2. An Economic interpretation of Lagrange Multipliers
3. The power of the method: A preview of more complex problems
   - Multiple Constraints
   - Inequality Constraints
22.2 Economic Applications

22.2.1 Consumer Behaviour: Utility Maximization

Scenario: Let's think about China

China has experienced huge economic growth recently, and we are interested in the effects of an increased income on consumption patterns.
To simplify matters, suppose that individuals only consume two goods: food and transport.
$x$ = quantity of food consumed (as measured in, for example, kilograms or calories).
$y$ = quantity of transportation consumed (as measured in, for example, kilometers traveled).
The individual obtains a particular utility (which you may think of as representing a level of satisfaction).
The amount of utility obtained is given by the function
\[ U(x, y) = a[x + y] + xy \tag{22.1} \]
where $a \ge 0$ is just some number.


Of course, individuals are constrained in what they can consume by how much money they have to spend, i.e. their budget. If the price of food is $p_x$ per unit, the price of transport is $p_y$ per unit, and the individual has an income of $M > 0$, then the budget constraint is given by:
\[ p_x x + p_y y = M. \tag{22.2} \]

In trying to predict how consumption patterns change in response to changes in income, we are going to make the assumption that individuals consume things that give them the highest utility, subject to the budget constraint. That is, the individual is assumed to face the following problem:
\[ \max_{x,y}\; a[x + y] + xy , \quad\text{subject to:}\quad p_x x + p_y y = M \]
Using previous notation:
\begin{align*}
f(x, y) &= U(x, y) \\
g(x, y) &= p_x x + p_y y \\
c &= M
\end{align*}
The Lagrangian associated with this problem is:
\begin{align*}
L &= f(x, y) + \lambda[c - g(x, y)] \\
&= a[x + y] + xy + \lambda[M - (p_x x + p_y y)]
\end{align*}



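In order to calculate optimal consumption levels for this consumer, we go to the first-order conditions:
\begin{align*}
L_x = 0 &: \quad 0x + 1y - p_x\lambda = -a \\
L_y = 0 &: \quad 1x + 0y - p_y\lambda = -a \\
L_\lambda = 0 &: \quad p_x x + p_y y + 0\lambda = M
\end{align*}
The first-order conditions represent a system of linear equations, implying that we can use our matrix-based techniques for finding the solution.
Specifically, we have:
\[
\begin{bmatrix} 0 & 1 & -p_x \\ 1 & 0 & -p_y \\ p_x & p_y & 0 \end{bmatrix}
\begin{bmatrix} x \\ y \\ \lambda \end{bmatrix}
=
\begin{bmatrix} -a \\ -a \\ M \end{bmatrix}
\]
If you solve this correctly, you will find that the solution is:
\[
\begin{bmatrix} x^* \\ y^* \end{bmatrix}
=
\begin{bmatrix} \dfrac{M + (p_y - p_x)a}{2p_x} \\[2ex] \dfrac{M + (p_x - p_y)a}{2p_y} \end{bmatrix} ,
\]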

where we will assume that a is small enough such that these are non-negative.
These expressions provide some insight into how prices and incomes influence the
demand for both products. We see that the consumption of both goods increases as
income increases, decreases as its own price increases and increases as the price of
the other good increases. Does this seem reasonable?
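The demand expressions can be checked against a direct solution of the linear system for particular numbers. The sketch below uses hypothetical values ($a = 1$, $p_x = 2$, $p_y = 4$, $M = 100$) chosen only for illustration; `det3` and `cramer3` are illustrative helper names, and Cramer's rule stands in for whichever matrix technique you prefer.

```python
from fractions import Fraction as F


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def cramer3(A, rhs):
    """Solve A v = rhs for a 3x3 system using Cramer's rule."""
    base = det3(A)
    solution = []
    for col in range(3):
        replaced = [row[:col] + [rhs[r]] + row[col + 1:] for r, row in enumerate(A)]
        solution.append(det3(replaced) / base)
    return solution


# Hypothetical parameter values, for illustration only.
a, px, py, M = F(1), F(2), F(4), F(100)

# The first-order conditions in matrix form, unknowns (x, y, lam).
A = [[F(0), F(1), -px],
     [F(1), F(0), -py],
     [px, py, F(0)]]
rhs = [-a, -a, M]
x_num, y_num, lam_num = cramer3(A, rhs)

# The closed-form demands quoted in the text.
x_formula = (M + (py - px) * a) / (2 * px)
y_formula = (M + (px - py) * a) / (2 * py)

print(x_num == x_formula, y_num == y_formula)   # the two routes should agree
print(px * x_num + py * y_num == M)             # the whole budget is spent
```

Re-running with a larger `M`, or with one price changed, gives a quick feel for the comparative statics described above.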
At this point we can ask: "What is the utility level that maximizing individuals will achieve when their income is $M$ and they face prices $p_x$ and $p_y$?" This is simply the value that their utility function takes when evaluated at the optimal (as opposed to arbitrary) consumption levels.
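By plugging in the solutions by brute force, we end up with:
\begin{align*}
u(x^*, y^*) = {} & a\left[\frac{M + (p_y - p_x)a}{2p_x} + \frac{M + (p_x - p_y)a}{2p_y}\right] \\
& + \left[\frac{M + (p_y - p_x)a}{2p_x}\right]\left[\frac{M + (p_x - p_y)a}{2p_y}\right] . \tag{22.3}
\end{align*}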
Definition | Direct and Indirect Utility Functions
Since the optimal consumption choices depend on prices and income, we can think of $u(x^*, y^*)$ as being a function of prices and income. Let this function be denoted $V(M, p_x, p_y)$. The function $V(M, p_x, p_y)$ is known as the indirect utility function, whereas the underlying function, $u(x, y)$, is known as the direct utility function.
To simplify matters, let's just focus on the case where $a = 0$. Here we have
\[ V(M, p_x, p_y) = \frac{M^2}{4 p_x p_y} . \tag{22.4} \]
Again, this tells us how the maximized level of utility that an individual obtains is related to income and prices (when $a = 0$). This function will help us later on when we discuss the interpretation of Lagrange multipliers.
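The relationship between the direct and indirect utility functions can be sketched in a few lines. The parameter values below are hypothetical, and the function names are illustrative; the code simply evaluates the direct utility function at the optimal bundle and compares the result with (22.4) when $a = 0$.

```python
from fractions import Fraction as F


def optimal_bundle(M, px, py, a):
    """Optimal consumption levels (x*, y*) from the first-order conditions."""
    x = (M + (py - px) * a) / (2 * px)
    y = (M + (px - py) * a) / (2 * py)
    return x, y


def direct_utility(x, y, a):
    """U(x, y) = a[x + y] + xy, as in (22.1)."""
    return a * (x + y) + x * y


def indirect_utility(M, px, py, a):
    """V(M, px, py): direct utility evaluated at the optimal bundle."""
    x, y = optimal_bundle(M, px, py, a)
    return direct_utility(x, y, a)


# Hypothetical income and prices, for illustration only.
M, px, py = F(100), F(2), F(4)

v_at_zero = indirect_utility(M, px, py, F(0))
closed_form = M**2 / (4 * px * py)
print(v_at_zero == closed_form)   # the a = 0 case should reproduce (22.4)
```

For $a > 0$ the same `indirect_utility` function evaluates the longer expression (22.3) without any further algebra.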


Second-order conditions
While we have found values of $x$ and $y$ that are candidates for solutions to the constrained optimization problem, we need to check the second-order conditions to ensure that the solution in fact represents a maximum (as opposed to a saddle point or a minimum).
For this, we need to verify that the determinant of the bordered Hessian matrix is positive:
\[ |\bar{H}| = \begin{vmatrix} 0 & g_x & g_y \\ g_x & L_{xx} & L_{xy} \\ g_y & L_{xy} & L_{yy} \end{vmatrix} > 0 \,? \]
To start filling in the Hessian, recall that
\[ g(x, y) = p_x x + p_y y \]
and that the Lagrangian function is
\[ L = a[x + y] + xy + \lambda[M - (p_x x + p_y y)] \]
Clearly $g_x = p_x$ and $g_y = p_y$. The others are not difficult to work out:
\[ L_{xx} = 0, \qquad L_{yy} = 0, \qquad L_{xy} = 1. \]
Therefore, we have
\[ |\bar{H}| = \begin{vmatrix} 0 & p_x & p_y \\ p_x & 0 & 1 \\ p_y & 1 & 0 \end{vmatrix} , \]
which is easily calculated as $2 p_x p_y$. Since this is greater than zero, the second-order condition for a maximum is satisfied.

22.2.2 An Interesting Observation?

We can solve the above problem for different utility functions.
For example, it is easy to try $u = xy$, since this corresponds to the case in which $a = 0$.
Now try $u = \ln x + \ln y$,
and $u = x^2 y^2$.
You should find that the solution is the same for all three utility functions. What is going on? To explore this, draw the level curves associated with each of these functions.
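Before working through the algebra, the claim can be explored numerically. The sketch below is a crude grid search along the budget line, with hypothetical prices and income; it is not a substitute for solving the first-order conditions, but it shows the three utility functions picking out the same bundle.

```python
import math

PX, PY, M = 2.0, 4.0, 100.0   # hypothetical prices and income

utilities = {
    "xy": lambda x, y: x * y,
    "ln x + ln y": lambda x, y: math.log(x) + math.log(y),
    "x^2 y^2": lambda x, y: x ** 2 * y ** 2,
}


def best_bundle(u, steps=1000):
    """Search the budget line p_x x + p_y y = M on an evenly spaced grid of x."""
    best = None
    for i in range(1, steps):
        x = (M / PX) * i / steps          # 0 < x < M / p_x
        y = (M - PX * x) / PY             # spend the rest of the budget on y
        value = u(x, y)
        if best is None or value > best[0]:
            best = (value, x, y)
    return best[1], best[2]


results = {name: best_bundle(u) for name, u in utilities.items()}
for name, bundle in results.items():
    print(name, bundle)
```

Each of the three functions is an increasing transformation of $xy$ (for positive $x$ and $y$), which is a useful hint when drawing the level curves.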
22.2.3 Managerial Behaviour: Output Maximization

Production requires capital ($K$) and labour ($L$):
\[ Y(L, K) = L^{1/2} K^{1/2} \]
Capital costs $r$ per unit, and labour costs $w$ per unit:
\[ E(L, K) = wL + rK \]
Scenario: The Manager's Problem
As a manager, you are given a budget, $B$. Your problem is to choose $(K, L)$ so as to produce the most output with this budget.
The problem:
\[ \max_{L,K}\; Y(L, K), \quad\text{subject to:}\quad E(L, K) = B \]
That is:
\[ \max_{L,K}\; L^{1/2} K^{1/2}, \quad\text{subject to:}\quad wL + rK = B \]
Using previous notation:
\[ f(L, K) = Y(L, K), \qquad g(L, K) = E(L, K), \qquad c = B \]

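The Lagrangian function (written $\mathcal{L}$ here to keep it distinct from labour, $L$) is:
\begin{align*}
\mathcal{L} &= f(L, K) + \lambda[c - g(L, K)] \\
&= L^{1/2} K^{1/2} + \lambda[B - (wL + rK)],
\end{align*}
and the first-order conditions are:
\begin{align*}
\mathcal{L}_L = 0 &: \quad \tfrac{1}{2} L^{-1/2} K^{1/2} = \lambda w \tag{22.5} \\
\mathcal{L}_K = 0 &: \quad \tfrac{1}{2} L^{1/2} K^{-1/2} = \lambda r \tag{22.6} \\
\mathcal{L}_\lambda = 0 &: \quad B = wL + rK \tag{22.7}
\end{align*}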

An important difference between this system of equations and the previous system is that this system is not linear.
That is, we cannot express the system in matrix form.
Thus, manipulating these equations will require some special attention. One way to solve this system is to use the following steps.
First, divide (22.6) by (22.5):
\[ \frac{L}{K} = \frac{r}{w} \quad\Rightarrow\quad L = \frac{r}{w} K. \tag{22.8} \]


Use this in (22.7):
\begin{align*}
B &= w\left[\frac{r}{w} K\right] + rK \\
&= 2rK.
\end{align*}
By simple re-arranging we get that
\[ K^* = \frac{B}{2r}, \]
and by using (22.8), we have
\[ L^* = \frac{B}{2w}. \]

It is always important to check that your answers make sense.


First, if the allowed budget, B, increases what should happen to the optimal
input choices?
Does this occur?
What should happen to the optimal amount of capital used when the cost of
capital, r, increases?
What should happen to the optimal amount of labour used when the cost of
labour, w, increases?
Do these occur?
To be thorough, you may also wish to check that, if we were to use the inputs in the proposed quantities, the total cost of doing so is exactly $B$. Is it?
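These sanity checks are easy to run for particular numbers. The sketch below uses hypothetical values for the wage, rental rate and budget; it confirms that the proposed input mix exhausts the budget and compares its output with other mixes that cost exactly the same.

```python
import math


def optimal_inputs(w, r, B):
    """Candidate solution from the first-order conditions: L* = B/2w, K* = B/2r."""
    return B / (2 * w), B / (2 * r)


def output(L, K):
    """Production function Y = L^(1/2) K^(1/2)."""
    return math.sqrt(L * K)


w, r, B = 4.0, 9.0, 72.0            # hypothetical wage, rental rate and budget
L_star, K_star = optimal_inputs(w, r, B)
total_cost = w * L_star + r * K_star

# Compare with other input mixes that also cost exactly B.
other_outputs = []
for i in range(1, 100):
    L = (B / w) * i / 100           # 0 < L < B / w
    K = (B - w * L) / r             # spend the remainder on capital
    other_outputs.append(output(L, K))

print(L_star, K_star, total_cost)
print(max(other_outputs) <= output(L_star, K_star) + 1e-12)
```

Calling `optimal_inputs` again with a larger `B`, or with a higher `w` or `r`, answers the comparative-statics questions posed above.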
Second-order conditions
While we have found values of capital and labour that are candidates for solutions to the constrained optimization problem, we need to check the second-order conditions to ensure that the solution in fact represents a maximum (as opposed to a saddle point or a minimum).
For this, we need to verify that the determinant of the bordered Hessian matrix is positive:
\[ |\bar{H}| = \begin{vmatrix} 0 & g_L & g_K \\ g_L & \mathcal{L}_{LL} & \mathcal{L}_{LK} \\ g_K & \mathcal{L}_{KL} & \mathcal{L}_{KK} \end{vmatrix} > 0 \,? \]
To start filling in the Hessian, recall that
\[ g(L, K) = wL + rK \]
and that the Lagrangian function is
\[ \mathcal{L} = L^{1/2} K^{1/2} + \lambda[B - (wL + rK)] \]
Clearly $g_L = w$ and $g_K = r$.


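The others:
\begin{align*}
\mathcal{L}_{LL} &= \tfrac{1}{2}\left(-\tfrac{1}{2}\right) L^{-\frac{1}{2} - 1} K^{\frac{1}{2}} = -\tfrac{1}{4} L^{-3/2} K^{1/2} \\
\mathcal{L}_{KK} &= \tfrac{1}{2}\left(-\tfrac{1}{2}\right) L^{\frac{1}{2}} K^{-\frac{1}{2} - 1} = -\tfrac{1}{4} L^{1/2} K^{-3/2} \\
\mathcal{L}_{LK} &= \tfrac{1}{2}\left(\tfrac{1}{2}\right) L^{\frac{1}{2} - 1} K^{\frac{1}{2} - 1} = \tfrac{1}{4} L^{-1/2} K^{-1/2}.
\end{align*}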

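We cannot just use these expressions in the bordered Hessian, since they depend on $L$ and $K$.
It is therefore important to remember that we want to evaluate the bordered Hessian at the proposed solution point.
Thus, we need to evaluate the above expressions at $K = K^* = \frac{B}{2r}$ and $L = L^* = \frac{B}{2w}$:
\begin{align*}
\mathcal{L}_{LL} &= -\frac{1}{4}\left(\frac{B}{2w}\right)^{-3/2}\left(\frac{B}{2r}\right)^{1/2} = -\frac{1}{2}\,\frac{w^{3/2}\, r^{-1/2}}{B} \\
\mathcal{L}_{KK} &= -\frac{1}{4}\left(\frac{B}{2w}\right)^{1/2}\left(\frac{B}{2r}\right)^{-3/2} = -\frac{1}{2}\,\frac{w^{-1/2}\, r^{3/2}}{B} \\
\mathcal{L}_{LK} &= \frac{1}{4}\left(\frac{B}{2w}\right)^{-1/2}\left(\frac{B}{2r}\right)^{-1/2} = \frac{1}{2}\,\frac{w^{1/2}\, r^{1/2}}{B} .
\end{align*}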

This allows us to fill in each element of the bordered Hessian matrix.


Since we have not imposed specific values for the parameters ($w$, $r$, and $B$), calculating the determinant of the bordered Hessian would be quite a drawn-out task.
Instead, what we could do is set up a spreadsheet in which we are able to enter
whatever parameter values we like.
This allows us to determine what the bordered Hessian matrix looks like for
any parameter values.
Once we choose some values, we simply calculate the determinant of the resulting matrix to see if it is positive. Try it!
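The same experiment can be run in a few lines of code instead of a spreadsheet. The sketch below is one possible set-up, with hypothetical parameter values; `det3` and `bordered_hessian` are illustrative names. As a cross-check, expanding the determinant by hand with the evaluated second derivatives appears to give $2(wr)^{3/2}/B$, and the code compares against that expression.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def bordered_hessian(w, r, B):
    """Bordered Hessian of the output problem, evaluated at L* = B/2w, K* = B/2r."""
    L, K = B / (2 * w), B / (2 * r)
    l_ll = -0.25 * L ** -1.5 * K ** 0.5     # second derivative w.r.t. labour
    l_kk = -0.25 * L ** 0.5 * K ** -1.5     # second derivative w.r.t. capital
    l_lk = 0.25 * L ** -0.5 * K ** -0.5     # cross-partial derivative
    return [[0.0, w, r],
            [w, l_ll, l_lk],
            [r, l_lk, l_kk]]


w, r, B = 4.0, 9.0, 72.0                    # hypothetical parameter values
determinant = det3(bordered_hessian(w, r, B))
hand_expansion = 2 * (w * r) ** 1.5 / B     # result of expanding by hand

print(determinant > 0)                      # second-order condition for a maximum
print(abs(determinant - hand_expansion) < 1e-9)
```

Trying several different positive values of `w`, `r` and `B` is a quick way to see whether the sign of the determinant ever changes.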

22.3 Interpreting Lagrange Multipliers

When we solve a constrained optimization problem, such as the output-maximization problem described above, it is natural to focus on the derived optimal values of the choice variables ($L$ and $K$ in that example). However, the system of first-order conditions involves three equations and three unknowns ($L$, $K$, and $\lambda$). Therefore, in principle, we can also calculate the value of $\lambda$ at the solution point. Why would we want to do this? After all, is $\lambda$ not just part of a construction (the Lagrangian function) that helps us derive the solution to a constrained optimization problem? No: it turns out that the value of $\lambda$ has economic meaning.


Interpretation
Definition | Economic Meaning of the Lagrange Multiplier
The value of the Lagrange multiplier tells us how much the optimal value of the objective function changes as the constraint changes.
In other words, consider the problem of choosing $x_1$ and $x_2$ so as to maximize $F(x_1, x_2)$ subject to $g(x_1, x_2) = c$. Let the solution be $x_1^*(c)$ and $x_2^*(c)$, with an associated Lagrange multiplier of $\lambda^*$. Let the (constrained) maximized value of the objective function be $F^*(c) \equiv F(x_1^*(c), x_2^*(c))$. Then, we have
\[ \lambda^* = \frac{dF^*(c)}{dc} . \]

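To show this, note that from the chain rule we have
\[ \frac{dF^*(c)}{dc} = \frac{\partial F}{\partial x_1}\frac{dx_1^*}{dc} + \frac{\partial F}{\partial x_2}\frac{dx_2^*}{dc} . \tag{22.9} \]
From the first-order conditions, we have
\begin{align*}
\frac{\partial F}{\partial x_1} &= \lambda^* \frac{\partial g}{\partial x_1} \tag{22.10} \\
\frac{\partial F}{\partial x_2} &= \lambda^* \frac{\partial g}{\partial x_2} . \tag{22.11}
\end{align*}
Substituting these in gives
\begin{align*}
\frac{dF^*(c)}{dc} &= \lambda^* \frac{\partial g}{\partial x_1}\frac{dx_1^*}{dc} + \lambda^* \frac{\partial g}{\partial x_2}\frac{dx_2^*}{dc} \tag{22.12} \\
&= \lambda^* \left[\frac{\partial g}{\partial x_1}\frac{dx_1^*}{dc} + \frac{\partial g}{\partial x_2}\frac{dx_2^*}{dc}\right] . \tag{22.13}
\end{align*}
The term in brackets (by reverse-engineering the chain rule) just describes how much $g$ changes when $c$ changes. That is, since the constraint requires
\[ g(x_1^*(c), x_2^*(c)) = c, \tag{22.14} \]
we can differentiate both sides to get that
\[ \frac{\partial}{\partial c}\left\{ g(x_1^*(c), x_2^*(c)) \right\} = 1. \tag{22.15} \]
By using the chain rule, the derivative on the left will be exactly the term in brackets above. Therefore, we are left with the conclusion that
\[ \frac{dF^*(c)}{dc} = \lambda^* [1] = \lambda^* . \tag{22.16} \]
For instance, if you use the first-order conditions to solve for $\lambda^*$ in the output maximization problem, you will find that
\[ \lambda^* = \frac{1}{2}\left(\frac{1}{wr}\right)^{1/2} . \tag{22.17} \]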


That is, if the manager is given an extra dollar of budget to work with, then they will be able to produce an extra $\lambda^*$ units of output. Notice that $\lambda^*$ depends on the input prices $w$ and $r$. Does this make sense?
The information provided by the Lagrange multiplier is useful if, for example, you were the CEO of a company in charge of overseeing managers in different divisions that each face a problem similar to the one just described. To be concrete, suppose that your company has an office in Singapore and an office in Hong Kong and each location is managed by a local manager. Although the same production technology is used, rent and wage rates differ across the two locations. As the CEO you must decide how to allocate the company's resources across these two branches. If one location had less expensive rent and wages, then your decision would be easy: give all the resources to the place that is cheaper. However, matters are not as clear if, say, wages were lower and rents were higher in Singapore. To make the comparison, you would have to compare the Lagrange multipliers arising from each manager's optimization problem and choose to divert resources to the manager with the higher multiplier. A non-obvious insight arises from (22.17): choose to give resources to the place in which the product of the wage and rent is the smallest.
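The interpretation of the multiplier as "extra output per extra dollar of budget" can be checked numerically. The sketch below uses hypothetical wages and rents for two locations (the numbers are invented purely for illustration) and compares the multiplier from (22.17) with a finite-difference estimate of how maximized output responds to the budget.

```python
import math


def max_output(w, r, B):
    """Maximized output Y* when inputs are set to L* = B/2w and K* = B/2r."""
    L_star, K_star = B / (2 * w), B / (2 * r)
    return math.sqrt(L_star * K_star)


def multiplier(w, r):
    """Lagrange multiplier from (22.17): (1/2) * (1 / (w r))^(1/2)."""
    return 0.5 * (1 / (w * r)) ** 0.5


# Location A: hypothetical wage 4, rent 9.  Location B: hypothetical wage 6, rent 5.
w_a, r_a = 4.0, 9.0
w_b, r_b = 6.0, 5.0
B = 72.0
h = 1e-4

# Finite-difference estimate of dY*/dB at location A.
numeric = (max_output(w_a, r_a, B + h) - max_output(w_a, r_a, B - h)) / (2 * h)
print(abs(numeric - multiplier(w_a, r_a)) < 1e-8)

# The location with the smaller product w * r has the larger multiplier.
print(multiplier(w_b, r_b) > multiplier(w_a, r_a))
```

In this illustration location B has the higher wage but the smaller product of wage and rent, so an extra dollar of budget produces more output there.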
Returning to the utility-maximizing consumer, $\lambda^*$ is interpreted as measuring how much maximized utility increases as income increases. When $a = 0$ we have that
\[ \lambda^* = \frac{M}{2 p_x p_y} . \tag{22.18} \]
We also worked out that
\[ V(M, p_x, p_y) = \frac{M^2}{4 p_x p_y} . \]
Notice that
\[ \frac{\partial V}{\partial M} = \frac{2M}{4 p_x p_y} = \frac{M}{2 p_x p_y} = \lambda^* . \tag{22.19} \]

22.4 The Power of the Method: A Preview

While perhaps initially daunting, using the Lagrange method to solve constrained optimization problems gets much easier with practice. Once you have invested the effort in understanding the method, it may seem that there must be easier ways of solving these problems (via substitution of the constraint, for example). In some cases this will be true; however, the method presented here is just the tip of the iceberg: the basic method that you have (hopefully) mastered can be applied to much larger problems. Covering these extensions in any detail is beyond the scope of this course; however, it is useful to get a sense of just how powerful these (and related) methods are.

22.4.1 Multiple Constraints

So far we have worked with a single constraint. You are quite right in thinking that a useful tool will be able to accommodate many constraints simultaneously; the method described above does this with minimal fuss.
Simply, one extra Lagrange multiplier is added for each additional constraint.
The value of the multiplier has the expected interpretation: the change in the optimized objective function when the associated constraint is slightly relaxed.
To keep things manageable, we have only worked with objective functions that take on two arguments (e.g. $x$ and $y$, or $K$ and $L$). What would happen if we added an extra constraint to such a problem?
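As a sketch of what "one extra multiplier per constraint" looks like, a problem with $n$ choice variables and $m$ equality constraints $g_j = c_j$ is usually written as follows (the indices $n$, $m$, $i$ and $j$ are notation introduced here for illustration, not notation used elsewhere in these notes):

```latex
\begin{align*}
\mathcal{L}(x_1, \dots, x_n, \lambda_1, \dots, \lambda_m)
  &= f(x_1, \dots, x_n) + \sum_{j=1}^{m} \lambda_j \bigl[c_j - g_j(x_1, \dots, x_n)\bigr] \\
\frac{\partial \mathcal{L}}{\partial x_i}
  &= \frac{\partial f}{\partial x_i} - \sum_{j=1}^{m} \lambda_j \frac{\partial g_j}{\partial x_i} = 0,
  \qquad i = 1, \dots, n \\
\frac{\partial \mathcal{L}}{\partial \lambda_j}
  &= c_j - g_j(x_1, \dots, x_n) = 0,
  \qquad j = 1, \dots, m
\end{align*}
```

One way to think about the question above: with only two choice variables, two independent equality constraints will typically pin down a single point, leaving nothing to optimize over, which is why problems with several constraints usually involve more choice variables.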

22.4.2 Inequality Constraints

Throughout we have only considered constraints that hold with equality. In many cases this is natural: e.g. if there exists at least one good of which it is always desirable to have greater quantities, then choosing quantities to consume to maximize utility will always require that you spend all of your available resources.¹ Similarly, if you are trying to maximize output then you will always spend your entire budget in purchasing inputs that are to be transformed into output. In these cases, it makes sense that we use an equality constraint.
However, even in these cases, the problem we really want to analyze is one in which there is an inequality constraint. For instance, we want our consumer to be able to spend any amount that they like as long as it is no greater than their available budget. We do not want to force the consumer to spend all of their resources if they find it optimal to not do so.² We may also wish to model a firm that faces various inequality constraints, such as restrictions on inputs, much like we did when studying linear programming.
Fortunately, the Lagrange method has been extended in this important direction. The conditions for a solution, known as the Kuhn-Tucker conditions, are slightly more complex and you may come across them in higher level economics classes. This type of constrained optimization is a widespread and very powerful tool used in a whole range of economic problems.
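For orientation only (this is beyond the scope of the course), one common statement of the Kuhn-Tucker conditions for maximizing $f(x, y)$ subject to $g(x, y) \le c$, using the same Lagrangian as before and ignoring any non-negativity restrictions on $x$ and $y$, is sketched below:

```latex
\begin{align*}
L &= f(x, y) + \lambda[c - g(x, y)] \\
L_x &= f_x - \lambda g_x = 0, \qquad L_y = f_y - \lambda g_y = 0 \\
c - g(x, y) &\ge 0, \qquad \lambda \ge 0, \qquad \lambda\,[c - g(x, y)] = 0
\end{align*}
```

The last condition says that either the constraint binds (and we are back in the equality case studied in this lecture) or the multiplier is zero (and the constraint plays no role at the optimum).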

¹ If it did not, then you could always increase your utility by consuming more of the good that you can't get enough of.
² This is not a statement based on the idea that some consumers wish to save part of their income. We can just think of savings as another good to be consumed. It is based on the idea that the consumer should really always have the option to burn their money if they wish. For example, suppose that there are only two goods in the economy: Beer and Family Guy DVDs. Both of these goods are enjoyable in moderate quantities, but there comes a point at which consuming greater quantities becomes painful. If we forced ourselves to use equality constraints, then we would be denying the consumer the ability to stop purchasing these goods once they become tiresome (e.g. a multi-millionaire may be worse off than a guy with only \$100 since the former would be forced to consume vast quantities of beer and tiresome DVDs).
