
Math198 Lecture Notes

Contents

1 Differential Calculus
1.1 Functions
1.1.1 Domain and Range
1.1.2 Domain and range from the graph
1.1.3 Even and odd functions
1.1.4 Composition of functions
1.1.5 Inverse Functions
1.1.6 Finding inverses, natural domains and ranges
1.1.7 Graphical representation of inverse functions
1.1.8 Limits
1.2 Limits and Differentiation
1.2.1 Fundamental Rules for Differentiation
1.2.2 Chain Rule
1.2.3 Generalised Chain Rule
1.2.4 Rates of Change
1.2.5 Product Rule
1.2.6 Quotient Rule
1.2.7 What does the derivative mean?
1.3 Graph Sketching
1.3.1 Rates of change (again)
1.3.2 Stationary Points
1.3.3 Finding Stationary Points and Inflexions
1.3.4 Local vs. Global
1.3.5 Asymptotes
1.3.6 Exponential and Logarithmic Graphs
1.3.7 Hyperbolic Functions
1.3.8 More on stationary points and points of inflexion
1.3.9 Graph Sketching Summary
1.3.10 Tangents and Normals

2 Vectors
2.1 Describing Vectors
2.1.1 Coordinates vs. Vectors
2.1.2 Position Vectors
2.1.3 Zero Vectors
2.1.4 The Standard Basis
2.1.5 Magnitude
2.2 Vector Algebra
2.2.1 Scalar Multiplication
2.2.2 Vector Addition
2.2.3 Vector Subtraction
2.2.4 Some Properties
2.2.5 Unit Vectors
2.3 Components of a Vector
2.4 The Scalar Product
2.4.1 The Standard Basis and Scalar Product
2.4.2 Properties of the Scalar Product
2.4.3 Application: Calculating the Angle Between Two Vectors
2.5 Application: Projection of vectors
2.6 Linear Combination of Vectors
2.6.1 Vector Span
2.7 Cartesian and Parametric Equations
2.7.1 Lines
2.7.2 Planes
2.7.3 Distance between a point and a plane
2.8 The Vector Product
2.8.1 Properties of the Vector Product
2.8.2 Volume of a parallelepiped
2.9 Physical Interpretations of Scalar and Vector Products
2.9.1 Work Done
2.9.2 Torque (Moments)
2.9.3 Projection and Orthogonal Components of Vectors
2.9.4 Lagrange's Identity

3 Integration
3.1 The Fundamental Theorem of Calculus
3.2 Properties of Integration
3.3 Integrating over symmetric domains
3.4 Area between a curve and the x-axis
3.4.1 The area between two curves
3.5 Integrating Certain Rational Functions
3.6 Integration by Parts
3.6.1 Unusual Examples
3.7 Integration by Substitution
3.7.1 Trigonometric Substitutions
3.8 Matrices
3.9 Partial Fractions

Chapter 1

Differential Calculus

1.1

Functions

1.1.1

Domain and Range

A function is a well-defined way of assigning to each number in a given set (called the input) exactly one
number (called the output). The set of inputs is called the domain of the function. We often describe a
function in terms of how it affects an arbitrary input $x$, but we must always be clear about what the domain
of our function is. The range of a function consists of all real numbers that are an output for at least one
input of the function.

Example 1.1.1. Define a function $f(x) = \sqrt{x}$, with domain
\[
\{x \in \mathbb{R} : x \geq 0\}.
\]
This is just another way of saying the domain consists of all non-negative real numbers. In interval notation,
we could write
\[
[0, \infty).
\]
The left-hand bracket is square since 0 belongs to the domain, and the right-hand bracket is round since
$\infty$ can never belong to any subset of the real numbers, as it is not a number itself. The range is also
$[0, \infty)$, since any positive number can be found by taking the square root of another positive number,
and $\sqrt{0} = 0$.
Note that, for example, $\sqrt{4} = 2$ and $\sqrt{4} \neq -2$. A function has a single output for each input.
Example 1.1.2. A function $g$ is defined as
\[
g(x) = x^2 + 3
\]
on the set of real numbers (this implies that the domain is the entire set of real numbers without saying so
explicitly). Since $x^2$ is never negative for any real number $x$, the smallest possible output is 3. The
range of $g$ is therefore
\[
\{x \in \mathbb{R} : x \geq 3\} = [3, \infty),
\]
i.e. the set of real numbers greater than or equal to 3.
During this course, we will meet several types of functions. The main ones are:
Polynomial: $x^4$, $x^3 - 20x + 3$, ...
Rational: one polynomial divided by another, e.g. $\frac{x+2}{x^2-1}$, ...
Trigonometric: $\sin(x)$, $\tan(x)$, ...
Exponential: $e^x$
Logarithmic: $\ln(x)$
... or combinations of the above.

1.1.2

Domain and range from the graph

The domain and range of a function can be seen from its graph. The domain is the set of real numbers on
the horizontal axis (which is usually labelled the x-axis) through which a vertical line hits the graph. The
range is the set of real numbers on the vertical axis through which a horizontal line hits the graph.

1.1.3

Even and odd functions

If $f(-x) = f(x)$ for all $x$ then $f$ is even. The graph of any even function has the y-axis as a line of
symmetry.
Example 1.1.3. $f(x) = x^2$ is even since $f(-x) = (-x)^2 = x^2 = f(x)$.
Example 1.1.4. $f(x) = \cos(x)$ is even.
If $f(-x) = -f(x)$ for all $x$ then $f$ is odd. The graph of any odd function has rotational symmetry of
order 2 about the origin.
Example 1.1.5. $f(x) = x^3 - x$ is odd, since $f(-x) = (-x)^3 - (-x) = -x^3 + x = -(x^3 - x) = -f(x)$.
Example 1.1.6. $f(x) = \sin(x)$ is odd.
Most functions you encounter will not satisfy either of these conditions.
Example 1.1.7. The following functions are neither even nor odd.
$f(x) = x^3 + 1$
$f(x) = x^{10} - x^9$
$f(x) = \sin(x) + \cos(x)$.
The following example shows that there is only one function that is both even and odd.
Example 1.1.8. If $f$ is a function that is both even and odd, then $f(x) = f(-x) = -f(x)$ for all $x$. The
only number that is the negative of itself is zero, so the only function that is both even and odd is
\[
f(x) = 0.
\]
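
The definitions of even and odd functions can be spot-checked numerically. The sketch below is only an illustration: the helper name `classify_parity`, the sample points and the tolerance are assumptions made here, and agreement on finitely many samples is evidence rather than a proof.

```python
import math

def classify_parity(f, samples, tol=1e-9):
    """Classify f as 'even', 'odd', 'both' or 'neither' on the sample points.

    Hypothetical helper: it only tests f(-x) = f(x) and f(-x) = -f(x)
    at the given samples, so it cannot prove anything for all x.
    """
    is_even = all(abs(f(-x) - f(x)) <= tol for x in samples)
    is_odd = all(abs(f(-x) + f(x)) <= tol for x in samples)
    if is_even and is_odd:
        return "both"
    if is_even:
        return "even"
    if is_odd:
        return "odd"
    return "neither"

samples = [0.1 * k for k in range(1, 31)]  # 0.1, 0.2, ..., 3.0

print(classify_parity(lambda x: x ** 2, samples))                      # Example 1.1.3
print(classify_parity(math.cos, samples))                              # Example 1.1.4
print(classify_parity(lambda x: x ** 3 - x, samples))                  # Example 1.1.5
print(classify_parity(lambda x: x ** 3 + 1, samples))                  # Example 1.1.7
print(classify_parity(lambda x: math.sin(x) + math.cos(x), samples))   # Example 1.1.7
print(classify_parity(lambda x: 0.0, samples))                         # Example 1.1.8
```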

1.1.4

Composition of functions

There are a number of ways of making functions out of simpler functions:
Sums, e.g. $x + \sin x$
Products, e.g. $x \sin x$
Quotients, e.g. $\frac{\sin x}{x}$. Here, we have to be careful where this is defined. The domain of the
quotient might not be as big as the domain of the function in the denominator, as happens in this case,
because $\frac{\sin 0}{0}$ is not defined.
Another important way of making a function from others is by composition. If $f$ and $g$ are functions and
the range of $g$ is included in the domain of $f$, then the composition $f \circ g$ is defined by
\[
(f \circ g)(x) = f(g(x)).
\]
Examples are $\sin(x^2)$, which is the composition $f \circ g$, where $f(x) = \sin x$ and $g(x) = x^2$, and
$\sin^2(x) = (\sin x)^2$, which is the composition $g \circ f$.
Another example is $e^{\sin x}$, which is the composition $h \circ f$, where $h(x) = e^x$.
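
As a small illustration of why the order of composition matters, the following sketch builds the three composites mentioned above. The helper name `compose` and the test value `x = 1.5` are choices made here for illustration only.

```python
import math

def compose(f, g):
    """Return the composition of f after g, i.e. the function x -> f(g(x))."""
    return lambda x: f(g(x))

f = math.sin                 # f(x) = sin x
g = lambda x: x ** 2         # g(x) = x^2
h = math.exp                 # h(x) = e^x

f_after_g = compose(f, g)    # sin(x^2)
g_after_f = compose(g, f)    # (sin x)^2
h_after_f = compose(h, f)    # e^(sin x)

x = 1.5
print(f_after_g(x), math.sin(x ** 2))
print(g_after_f(x), math.sin(x) ** 2)
print(h_after_f(x), math.exp(math.sin(x)))
```

The first two printed pairs differ from each other, which shows that the two orders of composition generally give different functions.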

1.1.5

Inverse Functions

A given function $f(x)$ has an inverse provided we can find some $f^{-1}(x)$ such that
\[
f(f^{-1}(x)) = f^{-1}(f(x)) = x
\]
for all possible $x$. Be careful with domains and ranges.
Example 1.1.9. Let $f(x) = x^2$ be defined on the set of real numbers. Then $f$ has no inverse. Consider,
for example, $f(2) = f(-2) = 4$. So if $f^{-1}$ did exist, would we choose $f^{-1}(4) = 2$ or
$f^{-1}(4) = -2$? We can't resolve this because, according to the definition of a function, we may only have
one output, and this output must be well defined. So we conclude that $f(x) = x^2$ has no inverse when the
domain is the whole set of real numbers.
Example 1.1.10. This time let $f(x) = x^2$, but restrict the domain to the non-negative real numbers:
\[
\{x \in \mathbb{R} : x \geq 0\} = [0, \infty).
\]
Now we avoid the trap since no two elements in the domain have the same output. We therefore have an
inverse
\[
f^{-1}(x) = \sqrt{x}.
\]
A geometric interpretation of the inverse is the reflection of the graph of the function in the line $y = x$.
Note that the inverse of $f^{-1}$ is $f$. Also, the domain of $f^{-1}$ is the range of $f$, and vice versa.

1.1.6

Finding inverses, natural domains and ranges

The largest domain we may take for a function is called the natural domain of the function, and it excludes
any and all real numbers that might cause us a problem (e.g. division by zero, square root of a negative
number, etc.). In some cases, particularly for rational functions, a shortcut for finding the range of a
function is to find the domain of its inverse. The following example demonstrates a standard recipe for
finding the inverse of a given function.
It is often useful to use the following equivalence:
\[
y = f(x) \iff f^{-1}(y) = x.
\]
Example 1.1.11. Let
\[
f(x) = \frac{x - 1}{x - 2}.
\]
The natural domain of this function is $(-\infty, 2) \cup (2, \infty)$. But what is its range? In other
words, if we were to sketch the graph of $y = f(x)$, are there any values of $y$ that this function misses, or
does it hit every possible real number? To rephrase again: is there always a solution to $y = f(x)$ for any
given value of $y$, or do some values leave us with an equation that is impossible to solve? Since we know
that the range of $f$ is the natural domain of $f^{-1}$, we first calculate the inverse, using the
equivalence stated above.
\begin{align*}
y &= \frac{x - 1}{x - 2} \\
y(x - 2) &= x - 1 \\
xy - 2y &= x - 1 \\
xy - x &= 2y - 1 \\
(y - 1)x &= 2y - 1 \\
x &= \frac{2y - 1}{y - 1}.
\end{align*}
So $f^{-1}(y) = \frac{2y - 1}{y - 1}$, which is defined for $y \neq 1$. It does not matter what the variable
is called, so we can write $f^{-1}(x) = \frac{2x - 1}{x - 1}$. The natural domain of $f^{-1}(x)$ is
$(-\infty, 1) \cup (1, \infty)$, so this is the range of $f(x)$. Let's check that $f(x) = 1$ is impossible.
We try to solve
\begin{align}
\frac{x - 1}{x - 2} &= 1 \tag{1.1} \\
x - 1 &= x - 2 \tag{1.2} \\
-1 &= -2. \tag{1.3}
\end{align}
Since this is clearly nonsense, our conclusion was correct.
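
The inverse found in Example 1.1.11 can be sanity-checked numerically. The sketch below is an illustration only; the function names `f` and `f_inv` and the chosen test values are assumptions made here. It checks that the two round trips return the starting value and that $f(x) - 1$ gets small but stays non-zero.

```python
def f(x):
    """f(x) = (x - 1) / (x - 2), defined for x != 2."""
    return (x - 1) / (x - 2)

def f_inv(y):
    """Candidate inverse (2y - 1) / (y - 1), defined for y != 1."""
    return (2 * y - 1) / (y - 1)

# Round trips should return the starting value (up to rounding error).
for x in [-3.0, 0.0, 1.0, 1.5, 2.5, 10.0]:
    assert abs(f_inv(f(x)) - x) < 1e-9
for y in [-3.0, 0.0, 0.5, 2.0, 10.0]:
    assert abs(f(f_inv(y)) - y) < 1e-9

# f(x) - 1 equals 1 / (x - 2): small for large x, but never zero.
for x in [10.0, 1e3, 1e6]:
    print(x, f(x) - 1)
```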

1.1.7

Graphical representation of inverse functions

The graph of the inverse function $f^{-1}$ is obtained from the graph of $f$ by reflecting it in the line
$y = x$.
For example, for the function $f(x) = x$, the inverse function satisfies $f^{-1}(x) = x$ for all $x$ because
\[
y = f(x) = x \iff f^{-1}(y) = x = y.
\]
So the graphs of both $f$ and $f^{-1}$ coincide with the line $y = x$, and the reflection changes nothing.
If $f(x) = x^2$, with domain $[0, \infty)$, then $f^{-1}(x) = \sqrt{x}$, also with domain $[0, \infty)$. The
graphs of $f$ and $f^{-1}$ intersect at two points, $(0, 0)$ and $(1, 1)$, which are both on the line $y = x$.

1.1.8

Limits

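If $f(x)$ gets arbitrarily close to $\ell$ as $x$ gets arbitrarily close to $a$, we say that the limit of
$f(x)$ is $\ell$, as $x$ tends to $a$.
This is written as
\[
\lim_{x \to a} f(x) = \ell.
\]
For example
\[
\lim_{x \to 1} x = 1, \qquad \lim_{x \to 1} x^2 = 1, \qquad \lim_{x \to 1} (1 + x^2) = 2, \qquad
\lim_{x \to 2} \frac{1}{x} = \frac{1}{2}.
\]
For most functions that we come across, if $f(a)$ can be defined, then $\lim_{x \to a} f(x) = f(a)$, but this
is not always true. It is true for the examples above, where we have
\begin{align*}
x^2 &= 1 \text{ if } x = 1, \\
1 + x^2 &= 2 \text{ if } x = 1, \\
\frac{1}{x} &= \frac{1}{2} \text{ if } x = 2.
\end{align*}
However, there are exceptions. This can happen if $f(a)$ is not defined. For example,
$\lim_{x \to 0} \frac{1}{x}$ does not exist, not surprisingly, because $\frac{1}{0}$ is not defined.

Another more difficult example is given by $\frac{x}{|x|}$, where $|x|$ is the modulus of $x$, that is
\[
|x| = \begin{cases} x & \text{if } x \geq 0, \\ -x & \text{if } x \leq 0. \end{cases}
\]
So
\[
\frac{x}{|x|} = \begin{cases} 1 & \text{if } x > 0, \\ -1 & \text{if } x < 0. \end{cases}
\]
The function is close to 1 just to the right of 0 and close to $-1$ just to the left of 0, so there is no
single value that it approaches. So $\lim_{x \to 0} \frac{x}{|x|}$ does not exist.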

Example 1.1.12. This is a very famous example in limit theory. It can be shown that
\[
\lim_{x \to 0} \frac{\sin(x)}{x} = 1.
\]
We will not cover any methods to show this in this module, but you can convince yourself by evaluating the
function at values of $x$ very close to 0, such as $x = 0.0000001$, on your calculator (in radians). A
convincing picture is given by comparing the length of an arc of a circle of radius 1 subtended by angle $x$,
and the length of the opposite side of the right-angled triangle with hypotenuse of length 1 and angle $x$.
The length of the circle arc is, of course, $x$, and the length of the opposite side of the right-angled
triangle is $\sin x$.
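
The calculator experiment suggested in Example 1.1.12 can be sketched in a few lines. This is only numerical evidence, not a proof; the function name `sinc_ratio` and the chosen values of $x$ are assumptions made here, and $x$ is measured in radians.

```python
import math

def sinc_ratio(x):
    """Return sin(x)/x for x != 0, with x measured in radians."""
    return math.sin(x) / x

# The ratio should creep up towards 1 as x shrinks towards 0.
for x in [1.0, 0.1, 0.01, 0.001, 0.0001]:
    print(f"x = {x:<7} sin(x)/x = {sinc_ratio(x):.10f}")
```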

1.2

Limits and Differentiation

You will recall the straight line graph
\[
y = mx + c.
\]
The value of $m$ is the gradient of the straight line, and is a measure of how steep the line is. The larger
$m$ is, the steeper the graph. If $m$ is positive, the graph is increasing; if $m$ is negative, the graph is
decreasing. When the function is anything more complicated than a straight line, the gradient (think:
steepness) of the graph varies from point to point, and we express it as a function of $x$, called the
derivative. In fact, we can write the derivative (or gradient function) for $y = mx + c$: it is denoted by
either $y'$ or $\frac{dy}{dx}$ and is the constant-valued function
\[
\frac{dy}{dx} = m.
\]
For a general function $f(x)$ we write the derivative as $f'(x)$. We now want to formally extend this concept
to an arbitrary function $f$. This procedure is sometimes called differentiation from first principles and,
although you may not use it again, it is useful to know where later methods come from in the first place.
We define the gradient of the function $f$ at $x$ to be the gradient of the tangent line to the graph at $x$.
Since we initially don't know exactly what this is, we take an estimate by connecting the two points on the
graph:
\[
(x, f(x)), \quad (x + h, f(x + h)),
\]
where $h$ is some small, positive real number. The smaller $h$ is (i.e. the closer $h$ is to zero), the better
the approximation to the gradient. The gradient may be calculated as the change in the $y$ values divided by
the change in the $x$ values. This is still only an estimate as the tangent line has only been estimated, and
the ratio is
\begin{align*}
\frac{\Delta y}{\Delta x} &= \frac{f(x + h) - f(x)}{(x + h) - x} \\
&= \frac{f(x + h) - f(x)}{h}.
\end{align*}
Even though the smaller the value of $h$ the better the estimate, we can't simply evaluate this expression
when $h = 0$, since $h = 0$ is not in its natural domain: it makes the denominator zero. However, we can take
the limit as $h$ tends to zero. This is the formal definition of the derivative. We have
\[
\frac{dy}{dx} = f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}.
\]
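Example 1.2.13. $y = mx + c$.
\begin{align*}
\frac{dy}{dx} &= \lim_{h \to 0} \frac{m(x + h) + c - (mx + c)}{h} \\
&= \lim_{h \to 0} \frac{mx + mh + c - mx - c}{h} \\
&= \lim_{h \to 0} \frac{mh}{h} \\
&= m.
\end{align*}
This was to be expected as this is the result we already knew and generalised.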
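Example 1.2.14. $y = f(x) = x^2$.
\begin{align*}
\frac{dy}{dx} &= \lim_{h \to 0} \frac{(x + h)^2 - x^2}{h} \\
&= \lim_{h \to 0} \frac{x^2 + 2hx + h^2 - x^2}{h} \\
&= \lim_{h \to 0} \frac{2hx + h^2}{h} \\
&= \lim_{h \to 0} (2x + h) \\
&= 2x.
\end{align*}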

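Example 1.2.15. $y = f(x) = \sin(x)$.
\begin{align*}
\frac{dy}{dx} &= \lim_{h \to 0} \frac{\sin(x + h) - \sin(x)}{h} \\
&= \lim_{h \to 0} \frac{\sin(x)\cos(h) + \sin(h)\cos(x) - \sin(x)}{h} \\
&= \cos(x) \lim_{h \to 0} \frac{\sin(h)}{h} + \sin(x) \lim_{h \to 0} \frac{\cos(h) - 1}{h} \\
&= \cos(x) \lim_{h \to 0} \frac{\sin(h)}{h} - \sin(x) \lim_{h \to 0} \frac{2\sin^2(h/2)}{h} \\
&= \cos(x),
\end{align*}
since
\[
\lim_{h \to 0} \frac{\sin h}{h} = 1
\]
and, using the same limit,
$\frac{2\sin^2(h/2)}{h} = \sin(h/2) \cdot \frac{\sin(h/2)}{h/2}$ tends to $0 \cdot 1 = 0$ as $h \to 0$.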
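
To see the limit in the definition at work, the sketch below evaluates the difference quotient for shrinking values of $h$ and compares it with the derivatives found in Examples 1.2.14 and 1.2.15. The helper name `difference_quotient`, the point $x = 1$ and the step sizes are assumptions chosen here for illustration.

```python
import math

def difference_quotient(f, x, h):
    """Gradient of the chord joining (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h

x = 1.0
for h in [0.1, 0.01, 0.001, 0.0001]:
    dq_square = difference_quotient(lambda t: t ** 2, x, h)
    dq_sin = difference_quotient(math.sin, x, h)
    print(f"h = {h:<7} x^2: {dq_square:.6f}  sin: {dq_sin:.6f}")

# Values predicted by the examples: 2x and cos(x).
print("exact values:", 2 * x, math.cos(x))
```

For $y = x^2$ the quotient equals $2x + h$, so the printed values should approach 2 from above as $h$ shrinks.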

1.2.1

Fundamental Rules for Differentiation

There are some standard rules for differentiating.

If $y = Af(x)$, for some constant $A$, then
\[
\frac{dy}{dx} = Af'(x).
\]
If $y = f(x) + g(x)$, then
\[
\frac{dy}{dx} = f'(x) + g'(x).
\]
If $y = x^n$, then
\[
\frac{dy}{dx} = nx^{n-1},
\]
for any real number $n$, including negative numbers.
A particular case of the previous rule: if $y = A$, a constant, then
\[
\frac{dy}{dx} = 0.
\]
If $y = \sin(x)$ then
\[
\frac{dy}{dx} = \cos(x).
\]
If $y = \cos(x)$ then
\[
\frac{dy}{dx} = -\sin(x).
\]
Notice the negative sign!
If $y = e^x$ then
\[
\frac{dy}{dx} = e^x.
\]
No change: this function is its own derivative.
If $y = \ln(x)$ then
\[
\frac{dy}{dx} = \frac{1}{x}.
\]

1.2.2

Chain Rule

Let $y = y(x)$ (i.e. let $y$ be a function that depends on the variable $x$). If we can't immediately
calculate $\frac{dy}{dx}$, it may be that we have to use the chain rule. We try to write
\[
y = y(u), \quad u = u(x),
\]
so that $y$ is a composite function of $x$, and in such a way that we can more simply evaluate
$\frac{dy}{du}$ and $\frac{du}{dx}$. Then it follows that
\[
\frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx}.
\]
When you get familiar with this method, you are not required to write down $u$ each time. On the other
hand, you are never punished for specifying $u$, so please do so whenever it helps.

Example 1.2.16. Let $y = (1 + x^{-1})^3$. To demonstrate the efficiency of the chain rule, we will first see
what happens when we just multiply out and differentiate. After some work, we get
\begin{align*}
y &= (1 + x^{-1})^3 \\
&= x^{-3} + 3x^{-2} + 3x^{-1} + 1 \\
\frac{dy}{dx} &= -3x^{-4} - 6x^{-3} - 3x^{-2}.
\end{align*}
Alternatively, given $y = (1 + x^{-1})^3$, let us write $u = 1 + x^{-1}$. Then
\begin{align*}
y &= u^3, & u &= 1 + x^{-1}, \\
\frac{dy}{du} &= 3u^2 = 3(1 + x^{-1})^2, & \frac{du}{dx} &= -x^{-2},
\end{align*}
giving
\[
\frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx} = -3x^{-2}(1 + x^{-1})^2,
\]
which is exactly the same as the previous answer with the added bonus that it is now factorised. This may
look harder in this specific case, but it is easier in other cases.
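Example 1.2.17. For $y = e^{3x}$ we set $u = 3x$. Then $y = e^u$, giving
\[
\frac{dy}{du} = e^u = e^{3x},
\]
and
\[
\frac{du}{dx} = 3,
\]
giving ultimately
\[
\frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx} = 3e^{3x}.
\]
Example 1.2.18. Now let $y = \ln(4x)$. This time, $u = 4x$ gives $y = \ln u$, and so
\[
\frac{dy}{du} = \frac{1}{u} = \frac{1}{4x}.
\]
Next,
\[
\frac{du}{dx} = 4.
\]
Finally,
\[
\frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx} = \frac{4}{4x} = \frac{1}{x}.
\]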

1.2.3

Generalised Chain Rule

Chain rules can occur within other chain rules. We will only demonstrate this with an example.
Example 1.2.19. Let $y = \cos(\sin(x^2))$. Let $u = x^2$, so that $\frac{du}{dx} = 2x$ and
$y = \cos(\sin(u))$. This only helps us a little, so we must continue by setting $v = \sin(u)$, so that
$\frac{dv}{du} = \cos(u)$ and $y = \cos(v)$. We can now differentiate $y$ to get
\[
\frac{dy}{dv} = -\sin(v) = -\sin(\sin(u)) = -\sin(\sin(x^2)).
\]
We conclude by multiplying to get
\[
\frac{dy}{dx} = \frac{dy}{dv} \frac{dv}{du} \frac{du}{dx} = -2x \sin(\sin(x^2)) \cos(x^2).
\]
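
The result of Example 1.2.19 can be compared against a numerical estimate of the gradient. In the sketch below, `central_difference` is a hypothetical helper introduced here (a symmetric variant of the difference quotient, not a method from these notes), and the step size and test points are arbitrary choices.

```python
import math

def y(x):
    """y = cos(sin(x^2)) from Example 1.2.19."""
    return math.cos(math.sin(x ** 2))

def dy_dx(x):
    """Chain-rule derivative dy/dv * dv/du * du/dx with u = x^2, v = sin(u)."""
    u = x ** 2
    v = math.sin(u)
    return -math.sin(v) * math.cos(u) * 2 * x

def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# The two columns should agree to several decimal places.
for x in [0.3, 1.0, 2.0]:
    print(x, dy_dx(x), central_difference(y, x))
```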

1.2.4

Rates of Change

If $y = f(x)$ then $\frac{dy}{dx}$ is the rate of change of $y$ with respect to $x$.

Example 1.2.20. If $y$ grows 3 times faster than $u$, and $u$ grows 5 times faster than $x$, then $y$ grows
15 times faster than $x$. In other words,
\[
\frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx}.
\]

1.2.5

Product Rule

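Recall that if $y = f(x)$ then
\[
\frac{dy}{dx} = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}.
\]
If $y$ is the product of two functions, say $y(x) = u(x)v(x)$, then this formula says that
\begin{align*}
\frac{dy}{dx} &= \lim_{h \to 0} \frac{u(x + h)v(x + h) - u(x)v(x)}{h} \\
&= \lim_{h \to 0} \frac{u(x + h)v(x + h) + [-u(x + h)v(x) + u(x + h)v(x)] - u(x)v(x)}{h}
\quad \text{(adding zero)} \\
&= \lim_{h \to 0} \frac{u(x + h)v(x + h) - u(x + h)v(x)}{h}
+ \lim_{h \to 0} \frac{u(x + h)v(x) - u(x)v(x)}{h} \\
&= \lim_{h \to 0} \left[ u(x + h) \frac{v(x + h) - v(x)}{h} \right]
+ \lim_{h \to 0} \left[ v(x) \frac{u(x + h) - u(x)}{h} \right] \\
&= u(x)v'(x) + v(x)u'(x) \\
&= u \frac{dv}{dx} + v \frac{du}{dx}.
\end{align*}
Ultimately, the product rule for $y = u(x)v(x)$ may be written
\begin{align*}
\frac{dy}{dx} &= u(x)v'(x) + v(x)u'(x) \\
&= u \frac{dv}{dx} + v \frac{du}{dx} \\
&= uv' + vu'.
\end{align*}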

= u
=

Example 1.2.21. Let $y = (\ln x)e^x$. Set $u = \ln x$, $v = e^x$. Then applying the rule gives
\begin{align*}
\frac{dy}{dx} &= (\ln x)(e^x)' + e^x(\ln x)' \\
&= (\ln x)e^x + e^x \frac{1}{x} \\
&= e^x \left( \ln x + \frac{1}{x} \right).
\end{align*}
Example 1.2.22. Let $y = x^{10} \sin(3x)$. With $u = x^{10}$, $v = \sin(3x)$ (and remembering the chain rule)
we get
\begin{align*}
\frac{dy}{dx} &= x^{10}(\sin(3x))' + \sin(3x)(x^{10})' \\
&= 3x^{10} \cos(3x) + 10x^9 \sin(3x).
\end{align*}

Example 1.2.23. If $y = x^x$, we can use logarithms to write this in a more workable form:
$y = x^x = e^{x \ln x}$. We apply first the chain rule, then the product rule to calculate:
\begin{align*}
\frac{dy}{dx} &= (x \ln x)' e^{x \ln x} = x^x (x \ln x)' \\
&= x^x (x(\ln x)' + (x)' \ln x) \\
&= x^x (1 + \ln x).
\end{align*}

1.2.6

Quotient Rule

Even though this is usually listed as a separate rule, it may be directly derived from the chain and product
rules. Let
\[
y(x) = \frac{u(x)}{v(x)}.
\]
Then
\[
y'(x) = \frac{v(x)u'(x) - u(x)v'(x)}{(v(x))^2},
\]
since
\begin{align*}
y &= \frac{u}{v} = uv^{-1} \\
\frac{dy}{dx} &= u(v^{-1})' + v^{-1}u' \\
&= u(-v^{-2})v' + v^{-1}u' \\
&= \frac{vu' - uv'}{v^2}.
\end{align*}
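Example 1.2.24. Let
\[
y = \frac{\ln x}{x^2}.
\]
Choose $u = \ln x$, $v = x^2$. Then
\begin{align*}
\frac{dy}{dx} &= \frac{vu' - uv'}{v^2} \\
&= \frac{\frac{x^2}{x} - 2x \ln x}{x^4} \\
&= \frac{1 - 2 \ln x}{x^3}.
\end{align*}
Example 1.2.25. For
\[
y = \frac{\sin x}{\sqrt{x}}
\]
set $u = \sin x$, $v = \sqrt{x} = x^{1/2}$. Then
\begin{align*}
\frac{dy}{dx} &= \frac{vu' - uv'}{v^2} \\
&= \frac{\sqrt{x} \cos(x) - \frac{1}{2} x^{-1/2} \sin(x)}{x}.
\end{align*}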

1.2.7

What does the derivative mean?

One use of the derivative is to check whether a function is increasing, decreasing or neither on a given
interval.
$f$ is an increasing function if $f(x_1) \leq f(x_2)$ whenever $x_1 \leq x_2$, for all $x_1, x_2$ in the
domain of $f$.
$f$ is strictly increasing if $f(x_1) < f(x_2)$ whenever $x_1 < x_2$, for all $x_1, x_2$ in the domain of $f$.
$f$ is a decreasing function if $f(x_1) \geq f(x_2)$ whenever $x_1 \leq x_2$, for all $x_1, x_2$ in the
domain of $f$.
$f$ is a strictly decreasing function if $f(x_1) > f(x_2)$ whenever $x_1 < x_2$, for all $x_1, x_2$ in the
domain of $f$.
We also talk about (strictly) increasing/decreasing functions on intervals $(a, b)$ or $[a, b]$.
For example, $f$ is increasing on $[a, b]$ if $f(x_1) \leq f(x_2)$ whenever $a \leq x_1 < x_2 \leq b$
(assuming that $[a, b]$ is in the domain of $f$), and similarly for the other definitions.
Now we connect this with the derivative. Suppose that $f' > 0$ on $(a, b)$, that is, $f'(x)$ exists and is
$> 0$ for all $a < x < b$. Then, for each $x$ with $a < x < b$, if $h \neq 0$ is sufficiently small so that
$a < x + h < b$, we have
\[
\frac{f(x + h) - f(x)}{h} > 0.
\]
This means that $f(x + h) > f(x)$ if $h > 0$ and $f(x + h) < f(x)$ if $h < 0$.
So $f$ is strictly increasing on $(a, b)$. Similarly, if $f'(x) < 0$ for $a < x < b$ then $f$ is strictly
decreasing on $(a, b)$.

1.3

Graph Sketching

This section is about graph sketching, not graph plotting.
Graph Plotting: Put values of $x$ into $y = f(x)$ and draw an accurate picture of a small part of the
graph by connecting the dots.
Graph Sketching: Draw a less accurate picture that contains information about all interesting features
of the entire graph.
In practice, graph sketching is much more useful than graph plotting.

1.3.1

Rates of change (again)

Example 1.3.26. We demonstrate this with $y = mx$. We know that $\frac{dy}{dx} = m$, but what this tells us
is that whenever $x$ moves 1 unit, $y$ moves $m$ units. This happens for every value of $x$, and since
$\frac{dy}{dx}$ is constant, the graph of $f$ is a straight line.

Example 1.3.27. Let us consider functions on the domain $0 \leq x \leq 1$. We'll look at the functions
\begin{align*}
y_1 &= x^2 \\
y_2 &= 2x - x^2.
\end{align*}
The derivatives are
\begin{align*}
\frac{dy_1}{dx} &= 2x \\
\frac{dy_2}{dx} &= 2 - 2x,
\end{align*}
which take only non-negative values on the given domain. Moreover, both graphs include the points $(0, 0)$
and $(1, 1)$ and are increasing on this domain. To understand why the graphs look different, we must look at
the second derivative,
\[
\frac{d^2y}{dx^2} = \frac{d}{dx} \frac{dy}{dx}.
\]
We have
\begin{align*}
\frac{d^2y_1}{dx^2} &= +2 \\
\frac{d^2y_2}{dx^2} &= -2.
\end{align*}
If $\frac{dy}{dx}$ is the rate of change of $y$, then $\frac{d^2y}{dx^2}$ is the rate of change of
$\frac{dy}{dx}$. E.g. consider the graphs of $y = x^2$ and $y = -x^2$. We have
\[
y = x^2: \quad \frac{dy}{dx} = 2x, \quad \frac{d^2y}{dx^2} = 2 > 0.
\]
[Graph of $y = x^2$]
\[
y = -x^2: \quad \frac{dy}{dx} = -2x, \quad \frac{d^2y}{dx^2} = -2 < 0.
\]
[Graph of $y = -x^2$]

We have:
$\frac{d^2y}{dx^2} > 0$: Convex. Looks like $y = x^2$.
$\frac{d^2y}{dx^2} = 0$: (something else we will look at later)
$\frac{d^2y}{dx^2} < 0$: Concave. Looks like $y = -x^2$.

1.3.2

Stationary Points

Sometimes for a function $y = f(x)$ there are values of $x$ such that $\frac{dy}{dx} = f'(x) = 0$. Such
values of $x$ are called stationary points of the function. We classify them into several types.
If $\frac{dy}{dx} = 0$ and $\frac{d^2y}{dx^2} > 0$, we have a local minimum. At this point, the graph looks
like the origin of $y = x^2$.
If $\frac{dy}{dx} = 0$ and $\frac{d^2y}{dx^2} < 0$, we have a local maximum. At this point, the graph looks
like the origin of $y = -x^2$.
[Graphs]

When $\frac{d^2y}{dx^2} = 0$, things become a little more complicated. Just as $\frac{d^2y}{dx^2}$ is the
rate of change of $\frac{dy}{dx}$, $\frac{d^3y}{dx^3}$ is the rate of change of $\frac{d^2y}{dx^2}$.
Irrespective of whether or not $\frac{dy}{dx} = 0$, if $\frac{d^2y}{dx^2} = 0$ and
$\frac{d^3y}{dx^3} \neq 0$, then the point is called an inflexion point. An inflexion point may or may not be
a stationary point.

1.3.3

Finding Stationary Points and Inflexions

A question may ask you to find and classify all stationary points and inflexions of a function $y = f(x)$.
To find stationary points, solve $f'(x) = 0$. One by one, input all of the solutions into $f''(x)$.
If $f''(x) > 0$, the stationary point is a local minimum.
If $f''(x) < 0$, the stationary point is a local maximum.
If $f''(x) = 0$:
If $f'''(x) \neq 0$, the stationary point is an inflexion.
If $f'''(x) = 0$, the stationary point is something else, i.e. too complicated for this module.
To find all inflexions, solve $f''(x) = 0$. If $f'''(x) \neq 0$, this is an inflexion point. Note: you will
already have found those inflexions that are also stationary points.
So when asked this type of question, it is a good idea to write down the first three derivatives at once.
Example 1.3.28. Let $y = x^3 - x^2$. The derivatives are:
\begin{align*}
\frac{dy}{dx} &= 3x^2 - 2x \\
\frac{d^2y}{dx^2} &= 6x - 2 \\
\frac{d^3y}{dx^3} &= 6.
\end{align*}
Solving $\frac{dy}{dx} = 0$ gives $x = 0$, $x = 2/3$ as our stationary points. At $x = 0$,
$\frac{d^2y}{dx^2} = -2 < 0$, so there is a local maximum at $(0, 0)$. At $x = 2/3$,
$\frac{d^2y}{dx^2} = 2 > 0$, so there is a local minimum at $(2/3, -4/27)$. To find inflexion points, we
solve $\frac{d^2y}{dx^2} = 0$ to get $x = 1/3$. Since $\frac{d^3y}{dx^3} = 6 \neq 0$, the point
$(1/3, -2/27)$ is an inflexion.
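
The recipe can be traced for $y = x^3 - x^2$ with exact arithmetic. The sketch below is illustrative only: the function name `classify_stationary_point` is an assumption made here, the derivatives are typed in by hand rather than computed, and `fractions.Fraction` is used so that $x = 2/3$ and $x = 1/3$ are represented exactly.

```python
from fractions import Fraction

def classify_stationary_point(d2, d3):
    """Apply the second/third derivative test at a single stationary point."""
    if d2 > 0:
        return "local minimum"
    if d2 < 0:
        return "local maximum"
    if d3 != 0:
        return "inflexion"
    return "undetermined"

# y = x^3 - x^2 and its first three derivatives, entered by hand.
y = lambda x: x ** 3 - x ** 2
dy = lambda x: 3 * x ** 2 - 2 * x
d2y = lambda x: 6 * x - 2
d3y = lambda x: 6

# Stationary points: roots of dy/dx = x(3x - 2).
for x in [Fraction(0), Fraction(2, 3)]:
    assert dy(x) == 0
    print(x, y(x), classify_stationary_point(d2y(x), d3y(x)))

# Inflexion candidate: root of d2y/dx2 = 6x - 2.
x_inflexion = Fraction(1, 3)
print(x_inflexion, y(x_inflexion), d2y(x_inflexion), d3y(x_inflexion))
```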

Example 1.3.29. Now let $y = x^3 + x^2$. Notice that this factorises to give $y = x^2(x + 1)$. This means
there is a double root at $x = 0$ and a single root at $x = -1$.
Example 1.3.30. For $y = x^4 - x^3$,
\begin{align*}
\frac{dy}{dx} &= 4x^3 - 3x^2 = x^2(4x - 3) \\
\frac{d^2y}{dx^2} &= 12x^2 - 6x = 6x(2x - 1) \\
\frac{d^3y}{dx^3} &= 24x - 6.
\end{align*}
Solving $\frac{dy}{dx} = 0$ gives stationary points at $x = 0$ and $x = 3/4$. At $x = 3/4$ we find that
$\frac{d^2y}{dx^2} > 0$, so this corresponds to a local minimum. However, when $x = 0$,
$\frac{d^2y}{dx^2} = 0$ too. So this stationary point may be a point of inflexion. Indeed it is, since
$\frac{d^3y}{dx^3} \neq 0$ when $x = 0$. There is another point of inflexion at $x = 1/2$, since at this
point $\frac{d^2y}{dx^2} = 0$ but $\frac{d^3y}{dx^3} \neq 0$.

1.3.4

Local vs. Global

Consider the graph of some function $y = f(x)$ defined on the domain $0 \leq x \leq 100$, where the domain
represents some physical constraint. We find that $\frac{dy}{dx} = 0$ at the top of every hill and the bottom
of every valley. But we only call these local maximum and minimum points, because there may be points of the
graph that are higher or lower than them. The highest and lowest points of the graph need not be stationary
points; they are called the global maximum and minimum points of the function. A function in general may
have many local maximum and minimum points, but it has at most one global maximum value and one global
minimum value.
Example 1.3.31. The function
\[
y = \sqrt{2}\,x + \cos(2x)
\]
is defined on the domain
\[
0 \leq x \leq 2\pi.
\]
We wish to calculate the global maximum and global minimum value of the function in this domain. Let's
start by finding stationary points. We have
\[
\frac{dy}{dx} = \sqrt{2} - 2\sin(2x).
\]
The derivative is 0 at a stationary point, so we are solving
\[
\sin(2x) = \frac{\sqrt{2}}{2}.
\]
Let's replace $2x = t$, and change our limits accordingly: $0 \leq t \leq 4\pi$.
[Graph]
From the graph, we see there are exactly four stationary points. They occur when
$t = \pi/4, 3\pi/4, 9\pi/4, 11\pi/4$. That is to say, when
\[
x = \pi/8, \; 3\pi/8, \; 9\pi/8, \; 11\pi/8.
\]
By considering the second derivative we could classify these as, in order: local max, local min, local max,
local min. But the point we are making here is that, even though the extreme points of the domain may not
be stationary points, their values might be larger than all other maximum values or smaller than all other
minimum values. So we compute the $y$ coordinates for all stationary points and end points of the interval.
These are:
$(0, 1)$
$(\pi/8, 1.26)$
$(3\pi/8, 0.96)$
$(9\pi/8, 5.71)$
$(11\pi/8, 5.40)$
$(2\pi, 9.89)$,
all to two decimal places. So the global minimum occurs at the stationary point when $x = 3\pi/8$, and the
global maximum occurs not at a stationary point but at the end point of the domain, $x = 2\pi$.
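
The comparison of stationary points and end points can be sketched numerically. In the code below the candidate list is typed in from the example rather than solved for, and the variable names are assumptions made here for illustration.

```python
import math

def y(x):
    """y = sqrt(2) * x + cos(2x) from Example 1.3.31."""
    return math.sqrt(2) * x + math.cos(2 * x)

# Candidates: both end points of [0, 2*pi] plus the four stationary points.
candidates = [0.0, math.pi / 8, 3 * math.pi / 8,
              9 * math.pi / 8, 11 * math.pi / 8, 2 * math.pi]

values = [(x, y(x)) for x in candidates]
for x, value in values:
    print(f"x = {x:.4f}  y = {value:.2f}")

# The global extremes are simply the smallest and largest candidate values.
x_min, y_min = min(values, key=lambda pair: pair[1])
x_max, y_max = max(values, key=lambda pair: pair[1])
print("global minimum at x =", round(x_min, 4), "with y =", round(y_min, 2))
print("global maximum at x =", round(x_max, 4), "with y =", round(y_max, 2))
```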

1.3.5

Asymptotes

Asymptotes of the graph of a function $f$ are straight lines that are approached by the graph as $x$ tends to
$a$ for some real number $a$, or as $x$ tends to $+\infty$ or $-\infty$. A vertical asymptote is a line
$x = a$ (which is of course a vertical line) such that $f(x)$ tends to $+\infty$ or $-\infty$ as $x$ tends to
$a$ on at least one side of $a$. Most of the asymptotes that we shall consider will be either horizontal or
vertical lines. A horizontal asymptote $y = b$ can occur as $x$ tends to $\pm\infty$, when
$\lim_{x \to +\infty} f(x) = b$, that is, $f(x)$ gets arbitrarily close to $b$ as $x$ is arbitrarily large and
positive (or $\lim_{x \to -\infty} f(x) = b$, that is, $f(x)$ gets arbitrarily close to $b$ as $x$ is
arbitrarily large and negative).
The classic example to demonstrate asymptotes is
y=

1
.
x

We want to know what the graph looks like, and this is complicated by the fact that $x = 0$ is a forbidden
value; that is, the function is not defined at this point. So we (informally, using our calculators) take limits
to discover what happens as $x \to 0$, $x \to \infty$, $x \to -\infty$. Let $x$ be very close to zero. Then $\frac{1}{x}$ is very large in magnitude, and has
the same sign as $x$. There is therefore a vertical asymptote at $x = 0$. As $x$ gets very large, $\frac{1}{x}$ gets very small and
has the same sign as $x$. The inverse function of $\frac{1}{x}$ is $\frac{1}{x}$ itself, so there is also a horizontal asymptote $y = 0$.
This answers the question about asymptotes, but to sketch the graph as a whole we need to go through
our other steps. The first two derivatives are:
$$\frac{dy}{dx} = -\frac{1}{x^2}, \qquad \frac{d^2y}{dx^2} = \frac{2}{x^3}.$$

Since $\frac{dy}{dx} < 0$ for all values of $x$ (recall that $x = 0$ is undefined), the graph is decreasing everywhere. Since
the sign of $\frac{d^2y}{dx^2}$ is negative when $x < 0$, the graph is concave in this region. Since $\frac{d^2y}{dx^2}$ is positive when $x > 0$,
the graph is convex in this region. As $x$ gets larger and larger (both in the positive and negative directions)
the graph gets closer and closer to the horizontal asymptote $y = 0$, without ever actually touching it.

Example 1.3.32. We will sketch the graph of
$$y = \frac{2(x - 1)}{x(x - 2)}.$$
In principle we should check stationary points etc., but since the purpose of this section is to discuss asymptotes we will concentrate only on these. We may write this function in partial fractions (recap coming
later):
$$y = \frac{1}{x} + \frac{1}{x - 2},$$
and add the graphs of the two terms to get the graph of the function. There are vertical asymptotes at
$x = 0$ and $x = 2$ and a horizontal asymptote $y = 0$ as $x \to \pm\infty$. Note also that $f(1) = 0$. The moral of this
example is that horizontal asymptotes tell us only about what happens for large values of $x$, but they are not impossible
values of the function. This graph crosses its horizontal asymptote at $x = 1$.

1.3.6

Exponential and Logarithmic Graphs

The exponential function $y = e^x$ has the full set of real numbers as its domain, and the positive real numbers
as its range. The natural logarithm function $y = \ln x$ has these the other way round (domain the positive real numbers, range all real numbers), as it is inverse to the exponential
function. That is,
$$e^{\ln x} = x = \ln(e^x).$$
Example 1.3.33. We will sketch
$$y = \frac{e^x}{x}.$$
Since exponentials grow faster than polynomials, $e^x$ is the dominant part of this function. We have
$$\lim_{x\to\infty} \frac{e^x}{x} = \infty, \qquad \lim_{x\to-\infty} \frac{e^x}{x} = 0.$$
There is a vertical asymptote at $x = 0$, and a jump in the graph. Since $e^x > 0$ for all values of $x$, the sign of $e^x/x$
is the same as the sign of $x$.
This demonstrates a general rule that, for large values of x, exponentials beat polynomials. A similar rule is
that polynomials beat logarithms.
Be careful: A zero denominator does not always mean there is an asymptote. We have already seen that
$$\lim_{x\to 0} \frac{\sin(x)}{x} = 1.$$
Here's another example.


Example 1.3.34. Let
$$y = \frac{e^x - 1}{x}.$$
Since exponentials beat polynomials,
$$\lim_{x\to\infty} \frac{e^x - 1}{x} = \infty, \qquad \lim_{x\to-\infty} \frac{e^x - 1}{x} = 0.$$
However,
$$\lim_{x\to 0} \frac{e^x - 1}{x} = 1.$$
You can convince yourself of this using your calculator and evaluating for very small positive and negative
values of $x$. So, even though the function is undefined when $x = 0$, the graph looks like it should take the
value 1 here. There is an impossibly small gap between the two branches of the graph.
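The calculator experiment for $(e^x - 1)/x$ near $x = 0$ can be sketched in a few lines of Python. This is only an illustrative check and not part of the original notes: the function name `f` and the sample values of `x` are our own choices, and `math.expm1` is used because it evaluates $e^x - 1$ accurately when $x$ is small.

```python
import math

def f(x):
    """Evaluate (e^x - 1)/x; undefined at x = 0."""
    # math.expm1(x) computes e^x - 1 without losing precision for tiny x.
    return math.expm1(x) / x

# Approach x = 0 from both sides and watch the values settle near 1.
for x in (0.1, 0.01, 0.001, -0.001, -0.01, -0.1):
    print(f"x = {x:+.3f}   (e^x - 1)/x = {f(x):.6f}")
```

The printed values get closer to 1 from above for positive $x$ and from below for negative $x$, which is consistent with the limit stated in the example.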

1.3.7

Hyperbolic Functions

The hyperbolic trigonometric functions cosh and sinh are similar to the usual elliptical trigonometric functions in that the usual trig functions define a point on a circle, while the hyperbolic trig functions define a
point on a rectangular hyperbola. The functions are defined as follows.
$$\cosh(x) = \frac{e^x + e^{-x}}{2}, \qquad \sinh(x) = \frac{e^x - e^{-x}}{2}.$$

We discuss their individual properties.
The function
$$\cosh(x) = \frac{e^x + e^{-x}}{2},$$
pronounced "kosh", is an even function (check!). We can construct its graph by adding the graph of $e^x/2$ to
$e^{-x}/2$. These are mirror images of each other in the y-axis. Just like its trigonometric cousin, $\cosh(0) = 1$.
However, as can be seen above, $\cosh(x) \ge 1$ for all real $x$.
The function
$$\sinh(x) = \frac{e^x - e^{-x}}{2},$$
pronounced "shine", is an odd function (check!). We can construct its graph by adding the graph of $e^x/2$
to $-e^{-x}/2$. One of these graphs is a rotation of the other about the origin. Just like its trigonometric cousin,
$\sinh(0) = 0$.
Since $e^x$ is an increasing function, the dominant term in the exponential expression for cosh and sinh
is $e^x/2$ for large positive $x$, and $\pm e^{-x}/2$ for large negative $x$ (with $+$ for cosh and $-$ for sinh). As $x$ gets large and positive, the two graphs
get closer and closer. As $x$ gets large and negative, the two graphs look like reflections of each other in the
x-axis.
Since
$$\frac{d}{dx} e^x = e^x,$$
you can check that
$$\frac{d}{dx}\sinh(x) = \cosh(x), \qquad \frac{d}{dx}\cosh(x) = \sinh(x).$$
Note the absence of an awkward sign change!
Since they are derivatives of each other, we can use properties of one to describe the other. Since
$$\frac{d}{dx}\sinh(x) = \cosh(x) \ge 1,$$
$\sinh(x)$ is an increasing function with no stationary points. Since
$$\frac{d^2}{dx^2}\sinh(x) = \sinh(x),$$
there is exactly one inflexion point at $x = 0$. Then since
$$\frac{d}{dx}\cosh(x) = \sinh(x),$$
there is exactly one stationary point, that is when $x = 0$. Since
$$\frac{d^2}{dx^2}\cosh(x) = \cosh(x) \ge 1,$$
there are no inflexion points.
Example 1.3.35. Sketch
$$y = \frac{x - 2}{\cosh(x)}.$$
Note first that since the denominator is never zero, this function is defined everywhere and has no vertical asymptotes.
We look for intercepts. When $x = 0$, $y = -2$. When $y = 0$, $x = 2$. These are the only intercepts. Let's find
the first derivative using the quotient rule with $u = x - 2$, $v = \cosh(x)$. Then
$$\frac{dy}{dx} = \frac{\cosh(x) - (x - 2)\sinh(x)}{\cosh^2(x)}.$$
At a stationary point we have
$$\cosh(x) = (x - 2)\sinh(x).$$
By trial and error, we can approximate the solutions as $x = -0.4$, $x = 3.0$. Let's find the second derivative by
setting $U = \cosh(x) - (x - 2)\sinh(x)$, $V = \cosh^2(x)$. Then
$$\frac{d^2y}{dx^2} = \frac{2(x - 2)\sinh^2(x) - 2\sinh(x)\cosh(x) - (x - 2)\cosh^2(x)}{\cosh^3(x)}.$$
At $x = -0.4$, $y'' = 2.3 > 0$, so $(-0.4, -2.2)$ is a local minimum. At $x = 3.0$, $y'' = -0.1 < 0$, so $(3.0, 0.1)$ is a
local maximum. Since
$$\lim_{x\to\pm\infty} \frac{x - 2}{\cosh x} = 0$$
(exponentials beat polynomials), the line $y = 0$ is an asymptote as $x \to \pm\infty$.
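The "trial and error" search for stationary points can be made systematic with the bisection method. The sketch below is one possible way to do it and is not taken from the notes: the helper names `g` and `bisect` and the bracketing intervals are our own assumptions, chosen so that the numerator of $dy/dx$ changes sign on each interval.

```python
import math

def g(x):
    # Numerator of dy/dx for y = (x - 2)/cosh(x); its zeros are the stationary points.
    return math.cosh(x) - (x - 2) * math.sinh(x)

def bisect(func, lo, hi, tol=1e-10):
    """Find a zero of func in [lo, hi], assuming func changes sign on the interval."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0:
        raise ValueError("no sign change on the interval")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid               # the zero lies in the left half
        else:
            lo, f_lo = mid, f_mid  # the zero lies in the right half
    return (lo + hi) / 2

left_root = bisect(g, -1.0, -0.1)
right_root = bisect(g, 2.5, 3.5)
print(round(left_root, 1), round(right_root, 1))
```

Each step halves the interval, so about 35 steps are enough to reach the tolerance used here; one decimal place, as quoted in the example, needs far fewer.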

1.3.8

More on stationary points and points of inflexion

Recall that inflexions need not necessarily be stationary points. A point of inflexion is a point such that
$$\frac{d^2y}{dx^2} = 0, \qquad \frac{d^3y}{dx^3} \neq 0.$$
An equivalent definition is: an inflexion is a point at which curvature changes. I.e., where we pass from
concave to convex or vice-versa.
Example 1.3.36. Take a function and calculate its derivatives:
$$y = x^3 + x, \qquad \frac{dy}{dx} = 3x^2 + 1, \qquad \frac{d^2y}{dx^2} = 6x, \qquad \frac{d^3y}{dx^3} = 6.$$
The only intercept is at $(0, 0)$, and there are no stationary points since $\frac{dy}{dx} \ge 1$. However, there is an inflexion
at $x = 0$. We can see this by considering the curvature at values either side but very close to $x = 0$. At
$x = -0.1$, $y'' = -0.6$, so the graph is concave. At $x = +0.1$, $y'' = +0.6$, so the graph is convex here.

Example 1.3.37. Same thing again but with a different function.
$$y = x^4 - x^2 - x - 1$$
$$\frac{dy}{dx} = 4x^3 - 2x - 1$$
$$\frac{d^2y}{dx^2} = 12x^2 - 2$$
$$\frac{d^3y}{dx^3} = 24x$$
There is an intercept at $(0, -1)$. Other intercepts are $(-1, 0)$ exactly, and $(1.5, 0)$ approximately.
Solving $y' = 0$ gives a stationary point at $x \approx 0.9$. At this point, $y'' \approx 7.7 > 0$, so this is a local minimum.
What can we deduce about inflexions before calculating them? Since the graph is convex at both x-intercepts,
there must be an even number of inflexions between them. Indeed, they occur at $x = \pm\sqrt{1/6}$.
Example 1.3.38. Similarly, between a local minimum and a local maximum we must have an odd number
of inflexions (provided that the function is continuous in this interval, and nothing worse than an inflexion
happens).

1.3.9

Graph Sketching Summary

Here is the graph sketching recipe for $y = f(x)$.

• Find asymptotes. These may be horizontal, vertical or oblique. Vertical asymptotes are usually found
by considering values of $x$ that make the denominator zero; the others may be found by considering
what happens to the function for large positive and negative values of $x$. For easier functions, the
horizontal asymptotes can be found by looking at the inverse function (if it exists).
• Describe limiting behaviour. This just means you should figure out what happens "off the page".
When $x$ is really large negative or positive, does the function get closer to a single value, or closer to
$\pm\infty$?
• Find intercepts. Provided the function is defined at $x = 0$, one intercept is always $(0, f(0))$. The
x-coordinates of the other intercepts are found by solving $f(x) = 0$.
• Find stationary points. Solve $f'(x) = 0$.
• Classify stationary points. Substitute the x-coordinate of each stationary point into $f''(x)$. If the
result is positive (or negative) then the stationary point is a local minimum (or maximum). If zero,
then it may be a point of inflexion.
• Find points of inflexion. Solve $f''(x) = 0$. If for any solution you have $f'''(x) \neq 0$, then you have a
point of inflexion.
• Sketch the graph. Remembering to label your coordinate axes, your asymptotes, and all the points
listed above (including the y-coordinates!), carefully sketch the graph. The positions of the points
above will give you hints about where the graph is with respect to the axes, and you should make sure
your curvature is correct (concave or convex). You do not need to plot any points accurately, or to
plot any additional points.
Example 1.3.39. Sketch $y = x^2 \sin(x)$ on the interval $-\pi \le x \le \pi$. This time the derivative is a little
harder, and we must use the product rule. We have
$$\frac{dy}{dx} = x^2 (\sin(x))' + (x^2)' \sin(x) = x^2\cos(x) + 2x\sin(x) = x\,(x\cos(x) + 2\sin(x)).$$
If we rearrange $\frac{dy}{dx} = 0$, we either find that $x = 0$ or $x$ satisfies
$$-\frac{x}{2} = \tan(x).$$
For $x \neq 0$, we will resort to trial and error.


x        tan(x) + x/2
2.0      -1.19
2.2      -0.27
2.25     -0.11
2.3       0.03
2.5       0.50

So to one decimal place, we have $x = 2.3$. By symmetry, the full set of stationary points is:
$$x = -2.3,\ 0,\ 2.3.$$
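The trial values for $\tan(x) + x/2$ can be reproduced, and the estimate refined, with a short Python sketch. The notes use trial and error; Newton's method below is our own substitution, and the names `h` and `h_prime` and the starting value 2.3 are assumptions made for illustration.

```python
import math

def h(x):
    # Stationary points with x != 0 satisfy tan(x) + x/2 = 0.
    return math.tan(x) + x / 2

def h_prime(x):
    # d/dx [tan(x) + x/2] = sec^2(x) + 1/2
    return 1 / math.cos(x) ** 2 + 0.5

# Reproduce the trial-and-error values.
for x in (2.0, 2.2, 2.25, 2.3, 2.5):
    print(f"{x:5.2f}  {h(x):6.2f}")

# Refine the estimate with a few Newton steps starting from x = 2.3.
x = 2.3
for _ in range(5):
    x -= h(x) / h_prime(x)
print(round(x, 4))
```

Newton's method needs a starting value close to the root and away from the discontinuities of $\tan(x)$, which is why the table of trial values is still a useful first step.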
Now
$$\frac{d^2y}{dx^2} = -x^2\sin(x) + 2x\cos(x) + 2x\cos(x) + 2\sin(x) = (2 - x^2)\sin(x) + 4x\cos(x).$$
For $x = 2.3$, $\frac{d^2y}{dx^2} = -8.6$, so $x = 2.3$ is a local maximum and, by symmetry, $x = -2.3$ is a local minimum. At $x = 0$, $\frac{d^2y}{dx^2} = 0$,
so we don't know what type of point it is without calculating the third derivative. We have
$$\frac{d^3y}{dx^3} = (6 - x^2)\cos(x) - 6x\sin(x).$$
When $x = 0$, $\frac{d^3y}{dx^3} = 6 \neq 0$, hence at $x = 0$ there is an inflexion.

1.3.10

Tangents and Normals

We started by approximating the tangent to a curve in order to calculate its derivative. But now that we
have powerful tools to calculate derivatives, we can go the other way and calculate tangents to curves at
given points. The tangent line to a curve $y = f(x)$ at the point $x = x_0$ is the straight line through $(x_0, f(x_0))$
with the same gradient as the curve at that point. The normal is the line through the same point that is at a
right angle to the tangent. If the tangent has gradient $m \neq 0$, the normal has gradient $-1/m$. Reminder: a
line through the point $(x_0, y_0)$ with gradient $m$ has equation
$$y - y_0 = m(x - x_0).$$
Example 1.3.40. For
$$y = x^3 - x$$
we will find the equations for the tangent and normal lines at $x = 1$. For simplicity, we write
$$y = x(x + 1)(x - 1),$$
so we know the intercepts and can easily make a quick sketch of the graph. We calculate
$$\frac{dy}{dx} = 3x^2 - 1.$$
At $x = 1$, we have $y = 0$ and $\frac{dy}{dx} = 2$. So this is the gradient of the tangent. It therefore has equation
$$y = 2(x - 1).$$
The gradient of the normal is $-\frac{1}{2}$, so the normal has equation
$$y = -\frac{1}{2}(x - 1).$$
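The tangent and normal calculation for $y = x^3 - x$ at $x = 1$ can be checked with a small Python sketch. The helper name `tangent_and_normal` and its interface are our own illustrative choices, and the derivative is supplied by hand rather than computed automatically.

```python
def tangent_and_normal(f, df, x0):
    """Return the gradient at x0 and functions for the tangent and normal lines there."""
    y0 = f(x0)
    m = df(x0)
    if m == 0:
        raise ValueError("tangent is horizontal, so the normal is vertical")

    def tangent(x):
        # y - y0 = m (x - x0)
        return y0 + m * (x - x0)

    def normal(x):
        # y - y0 = (-1/m) (x - x0)
        return y0 - (x - x0) / m

    return m, tangent, normal

f = lambda x: x**3 - x
df = lambda x: 3 * x**2 - 1

m, tangent, normal = tangent_and_normal(f, df, 1)
print(m, tangent(2), normal(3))
```

Here `tangent(2)` evaluates $2(x - 1)$ at $x = 2$ and `normal(3)` evaluates $-\frac{1}{2}(x - 1)$ at $x = 3$, so the two lines can be compared with the equations found in the example.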
Warning: You can only calculate these at differentiable points of the function. E.g., not for y = |x| when
x = 0, or y = sign(x) at x = 0, or y = 1/x at x = 0.
Example 1.3.41. We wont work through this together, but it is a interesting problem. Let y = x3 x.
Find two points such that the tangent at one is normal to the other. For what values of is this possible?


Chapter 2

Vectors
A vector is a directed line segment. I.e., it is part of a line (a line segment) which has an arrow on one end
(it is directed). It is defined by two properties:
• its direction,
• its magnitude (size/length).
Example 2.0.1. The speed of an object is not a vector; however, the velocity of an object is a vector. The
direction of the velocity is the direction of motion, and the magnitude is the speed in that direction.
Two vectors are equal iff they have the same direction and the same magnitude.
A vector has no fixed position. If two vectors have the same direction and the same magnitude, then
they are always equal even if they are based in different positions.

2.1

Describing Vectors

In print, we write a vector in bold type:
$$\mathbf{a}.$$
In written work, some people underline vectors and some people put an arrow on top:
$$\underline{a} = \vec{a}.$$
It doesn't matter which you use, but don't try to use bold type in handwritten work.
Given points $P$ and $Q$ there is a unique vector that goes from $P$ to $Q$. We can write $\vec{PQ}$ to describe this
vector, or just denote it by a letter, say $\mathbf{v}$.
We can have vectors in any dimension, and most of the algebraic methods we learn are transferrable to
any dimension.
1-D: Vectors are just numbers. The direction is either $+$ or $-$, and for a number $r$, the magnitude is
$|r|$.
2-D: If we have points in the plane with coordinates
$$P(x_1, y_1), \qquad Q(x_2, y_2)$$
then the vector $\vec{PQ} = \mathbf{v}$ may be described in terms of its components:
$$\mathbf{v} = (x_2 - x_1,\ y_2 - y_1) = (v_1, v_2).$$

3-D: Vectors work exactly the same way, but points have an extra coordinate and vectors have an extra
component. If
$$P(x_1, y_1, z_1), \qquad Q(x_2, y_2, z_2)$$
then $\vec{PQ} = \mathbf{v}$ where
$$\mathbf{v} = (x_2 - x_1,\ y_2 - y_1,\ z_2 - z_1) = (v_1, v_2, v_3).$$
pic

4-D, 5-D, 6-D, . . . : Everything¹ generalises to higher dimensions. However, we can only work in one
dimension at a time.

2.1.1

Coordinates vs. Vectors

It is important not to get confused between the coordinates of a point and the components of a vector, since the
notations are very similar. What does the notation $(p_1, p_2)$ mean?
If we write $P(p_1, p_2)$, we mean the point with coordinates $(p_1, p_2)$.
Position ✓
Direction ✗
Magnitude ✗
If we write $\mathbf{p} = (p_1, p_2)$, we mean the vector with components $(p_1, p_2)$.
Position ✗
Direction ✓
Magnitude ✓

2.1.2

Position Vectors

This is an attempt to overcome the ambiguity in the previous section. Fix an origin $O$, and fix coordinate
axes. Then, relative to $O$, the point $P(p_1, p_2)$ has position vector $\vec{OP} = (p_1, p_2)$. We will often describe the
location of a point by using a position vector rather than coordinates.

2.1.3

Zero Vectors

We use the notation $\mathbf{0}$ to denote the vector with 0 magnitude. It is the position vector of the origin. It is
not to be confused with the number 0. In 2D it is
$$\mathbf{0} = (0, 0),$$
and in 3D it is
$$\mathbf{0} = (0, 0, 0).$$
Note that 0 magnitude also implies no direction.

¹ 3-D is a little bit special.

2.1.4

The Standard Basis

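In 2D,
$$\mathbf{i} = (1, 0), \qquad \mathbf{j} = (0, 1).$$
We may write a general 2D vector as
$$\mathbf{v} = (v_1, v_2) = v_1\mathbf{i} + v_2\mathbf{j}.$$
Now in 3D,
$$\mathbf{i} = (1, 0, 0), \qquad \mathbf{j} = (0, 1, 0), \qquad \mathbf{k} = (0, 0, 1),$$
and
$$\mathbf{v} = (v_1, v_2, v_3) = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}.$$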

2.1.5

Magnitude

One of the conditions for two vectors to be equal is for their magnitudes to be equal. The magnitude $|\mathbf{a}|$ of a
vector $\mathbf{a}$ may be calculated using Pythagoras' Theorem. In 2D, let $\mathbf{a} = (a_1, a_2)$. Then
$$|\mathbf{a}| = \sqrt{a_1^2 + a_2^2}.$$
In 3D, let $\mathbf{a} = (a_1, a_2, a_3)$. Then
$$|\mathbf{a}| = \sqrt{a_1^2 + a_2^2 + a_3^2}.$$
According to this formula, the only vector with 0 magnitude is the zero vector.

2.2

Vector Algebra

Many of the tools we have for manipulating real numbers extend to vectors. There are also a few more.

2.2.1

Scalar Multiplication

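Let $k$ be a real number, and let $\mathbf{a}$ be a non-zero vector. Then $k\mathbf{a}$ scales the vector $\mathbf{a}$.
If $|k| > 1$, then $|k\mathbf{a}| > |\mathbf{a}|$. If $|k| < 1$, then $|k\mathbf{a}| < |\mathbf{a}|$.
If $k > 0$ then $k\mathbf{a}$ has the same direction as $\mathbf{a}$. If $k < 0$ then $k\mathbf{a}$ has the opposite direction to $\mathbf{a}$.
We have $k\mathbf{a} = \mathbf{a}$ iff $k = 1$.
For any vector $\mathbf{a}$, we have $k\mathbf{a} = \mathbf{0}$ iff $k = 0$ or $\mathbf{a} = \mathbf{0}$.
Vectors $\mathbf{a}$ and $\mathbf{b}$ have the same direction iff $\mathbf{a} = k\mathbf{b}$ for some $k > 0$.
In 2D, if $\mathbf{a} = (a_1, a_2)$ then $k\mathbf{a} = (ka_1, ka_2)$. In 3D, if $\mathbf{a} = (a_1, a_2, a_3)$ then $k\mathbf{a} = (ka_1, ka_2, ka_3)$.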

2.2.2

Vector Addition

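We add vectors $\mathbf{a}$ and $\mathbf{b}$ by the parallelogram rule. In 2D,
$$\mathbf{a} = (a_1, a_2), \quad \mathbf{b} = (b_1, b_2), \quad \mathbf{a} + \mathbf{b} = (a_1 + b_1,\ a_2 + b_2).$$
In 3D,
$$\mathbf{a} = (a_1, a_2, a_3), \quad \mathbf{b} = (b_1, b_2, b_3), \quad \mathbf{a} + \mathbf{b} = (a_1 + b_1,\ a_2 + b_2,\ a_3 + b_3).$$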

2.2.3

Vector Subtraction

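Given vectors $\mathbf{a}$ and $\mathbf{b}$, the vector $(\mathbf{a} - \mathbf{b})$ is defined to be the vector satisfying
$$(\mathbf{a} - \mathbf{b}) + \mathbf{b} = \mathbf{a}.$$
Think of $\mathbf{a} - \mathbf{b}$ as the vector that goes backwards along $\mathbf{b}$ then forwards along $\mathbf{a}$. The notation $-\mathbf{a}$ is for the
vector with the same magnitude as $\mathbf{a}$ but in the opposite direction.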

2.2.4

Some Properties

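Scalar multiplication and vector addition satisfy many of the same properties as regular numbers. For vectors
$\mathbf{a}, \mathbf{b}, \mathbf{c}$ and scalars $k$ and $\ell$, these include:
• $\mathbf{a} + \mathbf{b} = \mathbf{b} + \mathbf{a}$
• $(\mathbf{a} + \mathbf{b}) + \mathbf{c} = \mathbf{a} + (\mathbf{b} + \mathbf{c})$
• $\mathbf{a} + \mathbf{0} = \mathbf{0} + \mathbf{a} = \mathbf{a}$
• $\mathbf{a} + (-\mathbf{a}) = (-\mathbf{a}) + \mathbf{a} = \mathbf{0}$
• $\mathbf{a} + (-\mathbf{b}) = \mathbf{a} - \mathbf{b}$
• $k(\mathbf{a} + \mathbf{b}) = k\mathbf{a} + k\mathbf{b}$
• $(k + \ell)\mathbf{a} = k\mathbf{a} + \ell\mathbf{a}$
• $k(\ell\mathbf{a}) = (k\ell)\mathbf{a}$
• $1\mathbf{a} = \mathbf{a}$, $\ -1\mathbf{a} = -\mathbf{a}$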

2.2.5

Unit Vectors

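Let $\mathbf{a}$ be a non-zero vector. Then the unit vector $\hat{\mathbf{a}}$ is the vector with magnitude 1 in the same direction as $\mathbf{a}$. It may
be calculated as:
$$\hat{\mathbf{a}} = \frac{\mathbf{a}}{|\mathbf{a}|}.$$
Example 2.2.2. Let $\mathbf{a} = (3, 0, 4)$. Then $|\mathbf{a}| = \sqrt{3^2 + 0^2 + 4^2} = \sqrt{9 + 16} = \sqrt{25} = 5$. So
$$\hat{\mathbf{a}} = \left(\frac{3}{5},\ 0,\ \frac{4}{5}\right).$$
We can check
$$|\hat{\mathbf{a}}| = \sqrt{\left(\frac{3}{5}\right)^2 + 0^2 + \left(\frac{4}{5}\right)^2} = \sqrt{\frac{9 + 16}{25}} = 1.$$

2.3

Components of a Vector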

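We describe vectors by using their components with respect to some orthonormal basis (i.e. the standard
x, y, z directions). But these standard directions may not be the most useful in practical applications. Let
$\hat{\mathbf{a}}$ and $\hat{\mathbf{b}}$ be non-zero, non-colinear² unit vectors. Consider $\hat{\mathbf{a}}$ and $\hat{\mathbf{b}}$ in the plane they span. The signed length of the perpendicular projection of $\hat{\mathbf{b}}$ onto the line through $\hat{\mathbf{a}}$ is called
the component of $\hat{\mathbf{b}}$ in the direction of $\hat{\mathbf{a}}$, and is denoted $\hat{\mathbf{a}} \cdot \hat{\mathbf{b}}$.
By symmetry (of the picture or the algebra), $\hat{\mathbf{a}} \cdot \hat{\mathbf{b}} = \hat{\mathbf{b}} \cdot \hat{\mathbf{a}}$.
By trigonometry, $\hat{\mathbf{a}} \cdot \hat{\mathbf{b}} = \cos(\theta)$, where $\theta$ is the angle between $\hat{\mathbf{a}}$ and $\hat{\mathbf{b}}$.
Recall that $\hat{\mathbf{a}} = \mathbf{a}/|\mathbf{a}|$. We may use this to generalise to the case where we don't require unit vectors and we
define the scalar product of two arbitrary vectors:
$$\mathbf{a} \cdot \mathbf{b} = |\mathbf{a}||\mathbf{b}| \cos(\theta).$$
However, to find components in a direction, the direction vector must be of unit length. I.e., the component
of $\mathbf{b}$ in the direction of $\mathbf{a}$ is equal to $\hat{\mathbf{a}} \cdot \mathbf{b}$.

² I.e. not having the same or even opposite directions: $\hat{\mathbf{a}} \neq \pm\hat{\mathbf{b}}$.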

2.4

The Scalar Product

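For any two vectors $\mathbf{a}$ and $\mathbf{b}$, the scalar product is defined to be
$$\mathbf{a} \cdot \mathbf{b} = |\mathbf{a}||\mathbf{b}| \cos(\theta),$$
where $\theta$ is the angle between $\mathbf{a}$ and $\mathbf{b}$ (starting at the same point). It is often called the dot product since it
is denoted by $\cdot$; however, it is a good habit to call it the scalar product as it reminds you that the answer is
a number rather than a vector.
It is difficult to work with this definition as we don't generally know what the angle is between two
vectors in the first place. There is an easier form of the scalar product to work with, and we will derive this
form in the 2D case. Suppose
$$\mathbf{a} = (a_1, a_2), \qquad \mathbf{b} = (b_1, b_2),$$
the angle between $\mathbf{a}$ and the positive x-axis is $\theta_a$, and the angle between $\mathbf{b}$ and the positive x-axis is $\theta_b$.
pic
We may write
$$\mathbf{a} = (|\mathbf{a}| \cos(\theta_a),\ |\mathbf{a}| \sin(\theta_a)), \qquad \mathbf{b} = (|\mathbf{b}| \cos(\theta_b),\ |\mathbf{b}| \sin(\theta_b)).$$
Then the angle between $\mathbf{a}$ and $\mathbf{b}$ is $(\theta_b - \theta_a)$. Then
$$\begin{aligned}
\mathbf{a} \cdot \mathbf{b} &= |\mathbf{a}||\mathbf{b}| \cos(\theta_b - \theta_a) \\
&= |\mathbf{a}||\mathbf{b}| (\cos(\theta_a)\cos(\theta_b) + \sin(\theta_a)\sin(\theta_b)) \\
&= |\mathbf{a}|\cos(\theta_a)\,|\mathbf{b}|\cos(\theta_b) + |\mathbf{a}|\sin(\theta_a)\,|\mathbf{b}|\sin(\theta_b) \\
&= a_1 b_1 + a_2 b_2.
\end{aligned}$$
We won't derive it, but in 3D if $\mathbf{a} = (a_1, a_2, a_3)$, $\mathbf{b} = (b_1, b_2, b_3)$ then
$$\mathbf{a} \cdot \mathbf{b} = a_1 b_1 + a_2 b_2 + a_3 b_3.$$
This also generalises to higher dimensions.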
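The component formula for the scalar product translates directly into code. The following Python sketch is illustrative only; the function name `dot` is our own choice, and vectors are represented as plain tuples of numbers.

```python
import math

def dot(a, b):
    """Scalar product a1*b1 + a2*b2 + ... of two vectors of the same dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))

# Component form in 2D and 3D.
print(dot((1, 2), (1, 4)))
print(dot((1, 1, 0), (0, 1, 1)))

# Compare with |a||b|cos(theta) for a = (1, 0), b = (1, 1), which meet at an angle of pi/4.
a, b = (1, 0), (1, 1)
geometric = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)) * math.cos(math.pi / 4)
print(dot(a, b), geometric)
```

The last line prints the component form next to the geometric form; the two agree up to floating-point rounding.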
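Example 2.4.3. Let $\mathbf{a} = (1, 2)$, $\mathbf{b} = (1, 4)$. Then $\mathbf{a} \cdot \mathbf{b} = 1 + 8 = 9$.
Example 2.4.4. Let $\mathbf{a} = (2, 1)$, $\mathbf{b} = (-1, 2)$. Then $\mathbf{a} \cdot \mathbf{b} = -2 + 2 = 0$.
Example 2.4.5. Let $\mathbf{a} = (1, 1, 0)$, $\mathbf{b} = (0, 1, 1)$, $\mathbf{c} = (1, 0, 1)$. Then $\mathbf{a} \cdot \mathbf{b} = \mathbf{b} \cdot \mathbf{c} = \mathbf{c} \cdot \mathbf{a} = 1$.
Example 2.4.6. Let $\mathbf{a} = (1, 3, 0)$, $\mathbf{b} = (-3, 1, 0)$, $\mathbf{c} = (1, 0, 2)$. Then
$$\begin{aligned}
\mathbf{a} \cdot \mathbf{b} &= -3 + 3 + 0 = 0 \\
\mathbf{b} \cdot \mathbf{c} &= -3 + 0 + 0 = -3 \\
\mathbf{c} \cdot \mathbf{a} &= 1 + 0 + 0 = 1.
\end{aligned}$$

2.4.1

The Standard Basis and Scalar Product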

Since $\mathbf{i}, \mathbf{j}, \mathbf{k}$ are unit vectors, everything to do with magnitude of vectors etc. is preserved when we write our
vectors in terms of these standard vectors. They provide only a different notation for vectors, and it doesn't
matter which notation we use. It is often useful to remember the algebraic rules for the scalar product, as
well as the fact that they are unit vectors which are pairwise perpendicular. That is,
$$\mathbf{i} \cdot \mathbf{i} = \mathbf{j} \cdot \mathbf{j} = \mathbf{k} \cdot \mathbf{k} = 1, \qquad \mathbf{i} \cdot \mathbf{j} = \mathbf{j} \cdot \mathbf{k} = \mathbf{k} \cdot \mathbf{i} = 0.$$

2.4.2

Properties of the Scalar Product

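We have the following properties for vectors $\mathbf{a}, \mathbf{b}, \mathbf{c}$ and real numbers $k, \ell$.
• $\mathbf{a} \cdot \mathbf{b} = \mathbf{b} \cdot \mathbf{a}$
• $\mathbf{a} \cdot (k\mathbf{b} + \ell\mathbf{c}) = k(\mathbf{a} \cdot \mathbf{b}) + \ell(\mathbf{a} \cdot \mathbf{c})$
• $(k\mathbf{a}) \cdot (\ell\mathbf{b}) = k\ell(\mathbf{a} \cdot \mathbf{b})$
• $\mathbf{a} \cdot \mathbf{a} = |\mathbf{a}|^2$
• No cancellation property. That is, if $\mathbf{a} \cdot \mathbf{b} = \mathbf{a} \cdot \mathbf{c}$, we cannot assume that $\mathbf{b} = \mathbf{c}$.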
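Example 2.4.7. Let $\mathbf{a} = (-1, 3)$, $\mathbf{b} = (1, 1)$, $\mathbf{c} = (4, 2)$. Then $\mathbf{a} \cdot \mathbf{b} = 2$, $\mathbf{a} \cdot \mathbf{c} = 2$, but $\mathbf{b} \neq \mathbf{c}$.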

2.4.3

Application: Calculating the Angle Between Two Vectors

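Let $\mathbf{a}$ and $\mathbf{b}$ have fixed magnitudes, but varying directions. Recall that the scalar product depends on the
cosine of the angle between the vectors:
$$\mathbf{a} \cdot \mathbf{b} = |\mathbf{a}||\mathbf{b}| \cos(\theta).$$
• $\theta = 0$, $\mathbf{a} \cdot \mathbf{b} = |\mathbf{a}||\mathbf{b}|$. Vectors are parallel and in the same direction.
• $0 < \theta < \pi/2$, $0 < \mathbf{a} \cdot \mathbf{b} < |\mathbf{a}||\mathbf{b}|$. The angle between the vectors is acute.
• $\theta = \pi/2$, $\mathbf{a} \cdot \mathbf{b} = 0$. The vectors are at a right angle to each other.
• $\pi/2 < \theta < \pi$, $-|\mathbf{a}||\mathbf{b}| < \mathbf{a} \cdot \mathbf{b} < 0$. The angle between the vectors is obtuse.
• $\theta = \pi$, $\mathbf{a} \cdot \mathbf{b} = -|\mathbf{a}||\mathbf{b}|$. Vectors are parallel and in opposite directions.
• $\theta > \pi$. This is the reflex angle between the two vectors, which is not what we're looking for. However,
since $\cos(\theta) = \cos(2\pi - \theta)$, we don't need to worry about this case.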

If we know the components of two vectors, we can calculate the scalar product of the
vectors and then find the angle between them:
$$\cos(\theta) = \frac{\mathbf{a} \cdot \mathbf{b}}{|\mathbf{a}||\mathbf{b}|}.$$
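The angle formula can be turned into a small Python function. This is a sketch under our own naming (`angle_between` is not from the notes); the clamping step is an implementation detail added to guard against floating-point rounding pushing the cosine slightly outside $[-1, 1]$.

```python
import math

def angle_between(a, b):
    """Angle in radians (between 0 and pi) between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("the zero vector has no direction")
    cosine = dot / (norm_a * norm_b)
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding error
    return math.acos(cosine)

print(angle_between((1, 1, 0), (0, 1, 1)))  # cos(theta) = 1/2
print(angle_between((1, 0), (0, 1)))        # perpendicular vectors
print(angle_between((1, 0), (-2, 0)))       # opposite directions
```

Because `math.acos` returns values between 0 and $\pi$, the reflex angle never appears, which matches the remark that this case can be ignored.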

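Example 2.4.8. Let $\mathbf{a} = (1, 2)$, $\mathbf{b} = (1, -4)$. Then $|\mathbf{a}| = \sqrt{1^2 + 2^2} = \sqrt{5}$, $|\mathbf{b}| = \sqrt{1^2 + (-4)^2} = \sqrt{17}$,
$\mathbf{a} \cdot \mathbf{b} = 1 - 8 = -7$. So
$$\cos(\theta) = \frac{-7}{\sqrt{5}\sqrt{17}},$$
which gives $\theta \approx 2.43$ radians.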


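Example 2.4.9. Let $\mathbf{a} = (1, 3, 4)$, $\mathbf{b} = (0, 2, -1)$. Then
$$\begin{aligned}
\mathbf{a} \cdot \mathbf{b} &= (1 \times 0) + (3 \times 2) + (4 \times (-1)) = 0 + 6 - 4 = 2 \\
|\mathbf{a}| &= \sqrt{1^2 + 3^2 + 4^2} = \sqrt{26} \\
|\mathbf{b}| &= \sqrt{0^2 + 2^2 + (-1)^2} = \sqrt{5}.
\end{aligned}$$
Since $\mathbf{a} \cdot \mathbf{b} > 0$, the vectors meet at an acute angle. We find this angle:
$$\cos(\theta) = \frac{\mathbf{a} \cdot \mathbf{b}}{|\mathbf{a}||\mathbf{b}|} = \frac{2}{\sqrt{26}\sqrt{5}},$$
so $\theta \approx 1.39$ radians.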
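Example 2.4.10. Let $\mathbf{a} = (2, 1)$, $\mathbf{b} = (-1, 2)$, $\mathbf{c} = (1, -2)$. Then
$$\mathbf{a} \cdot \mathbf{b} = 0, \qquad \mathbf{a} \cdot \mathbf{c} = 0, \qquad \mathbf{b} \cdot \mathbf{c} = -5 = -|\mathbf{b}||\mathbf{c}|.$$
So both $\mathbf{b}$ and $\mathbf{c}$ are perpendicular to $\mathbf{a}$, and they point in opposite directions. In 2D, there are only ever two directions perpendicular to any given
vector.
Example 2.4.11. Let $\mathbf{a} = (0, 1, -1)$, $\mathbf{b} = (1, 1, 1)$, $\mathbf{c} = (-3, 2, 2)$. Then
$$\mathbf{a} \cdot \mathbf{b} = 0, \qquad \mathbf{a} \cdot \mathbf{c} = 0, \qquad \mathbf{b} \cdot \mathbf{c} = 1.$$
In 3D, there are many orthogonal vectors to any given vector.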

2.5

Application: Projection of vectors

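We may write any vector $\mathbf{b}$ as the sum of a vector parallel to $\mathbf{a}$ and a vector perpendicular to $\mathbf{a}$.
The vector parallel to $\mathbf{a}$ is called the projection of $\mathbf{b}$ in the direction of $\mathbf{a}$, and is denoted
$$\mathrm{proj}_{\mathbf{a}} \mathbf{b}.$$
We know that the component of $\mathbf{b}$ in the direction of $\mathbf{a}$ is given by
$$\hat{\mathbf{a}} \cdot \mathbf{b} = \frac{\mathbf{a} \cdot \mathbf{b}}{|\mathbf{a}|},$$
and the magnitude $|\mathrm{proj}_{\mathbf{a}} \mathbf{b}|$ is the absolute value of this number.
Since the direction is that of $\mathbf{a}$, we multiply this component by the unit vector in the direction of $\mathbf{a}$. We end up with the square
of the magnitude on the bottom, and $|\mathbf{a}|^2 = \mathbf{a} \cdot \mathbf{a}$, so that
$$\mathrm{proj}_{\mathbf{a}} \mathbf{b} = \frac{(\mathbf{a} \cdot \mathbf{b})}{(\mathbf{a} \cdot \mathbf{a})}\,\mathbf{a}.$$
Now, since $\mathbf{b}$ is the sum of a vector parallel to $\mathbf{a}$ and a vector perpendicular to $\mathbf{a}$, and we have calculated
the component parallel to $\mathbf{a}$, the component perpendicular must be
$$\mathbf{b} - \mathrm{proj}_{\mathbf{a}} \mathbf{b}.$$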
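The projection formula and the perpendicular remainder can be checked numerically. The Python sketch below is illustrative: the function names `dot` and `project` and the sample vectors $(3, 4)$ and $(5, 0)$ are our own choices, and exact fractions are used (so the inputs must be integers) to make the perpendicularity check exact.

```python
from fractions import Fraction

def dot(a, b):
    """Scalar product of two vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))

def project(b, a):
    """Projection of b in the direction of a: ((a.b)/(a.a)) a. Requires a != 0."""
    aa = dot(a, a)
    if aa == 0:
        raise ValueError("cannot project onto the zero vector")
    scale = Fraction(dot(a, b), aa)  # exact arithmetic for integer components
    return tuple(scale * x for x in a)

a = (3, 4)
b = (5, 0)
parallel = project(b, a)
perpendicular = tuple(y - p for y, p in zip(b, parallel))

print(parallel)               # the part of b along a
print(perpendicular)          # what is left over
print(dot(a, perpendicular))  # zero when the remainder is perpendicular to a
```

The two parts add back up to $\mathbf{b}$ by construction, so the only thing left to verify is that the remainder has zero scalar product with $\mathbf{a}$.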

2.6

Linear Combination of Vectors

Let $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ be non-zero vectors. Let $k_1, k_2, \ldots, k_n$ be real numbers. Then
$$k_1\mathbf{a}_1 + k_2\mathbf{a}_2 + \cdots + k_n\mathbf{a}_n$$
is called a linear combination of $\mathbf{a}_1, \ldots, \mathbf{a}_n$.

2.6.1

Vector Span

The span of $\mathbf{a}_1, \ldots, \mathbf{a}_n$ consists of all vectors that can be written as a linear combination of $\mathbf{a}_1, \ldots, \mathbf{a}_n$.


Example 2.6.12. Let $\mathbf{a}_1 = (1, 1)$, $\mathbf{a}_2 = (0, 1)$. The span of $\mathbf{a}_1$ and $\mathbf{a}_2$ contains all vectors of the form
$$\mathbf{v} = k_1(1, 1) + k_2(0, 1) = (k_1, k_1) + (0, k_2) = (k_1,\ k_1 + k_2).$$
This is all 2D vectors since for any $x$ and $y$ we find that the system of equations $x = k_1$, $y = k_1 + k_2$ has a
unique solution $k_1 = x$, $k_2 = y - x$.
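Example 2.6.13. This time, let $\mathbf{a}_1 = (1, 2)$, $\mathbf{a}_2 = (3, 6)$. Then
$$\begin{aligned}
k_1\mathbf{a}_1 + k_2\mathbf{a}_2 &= k_1(1, 2) + k_2(3, 6) \\
&= (k_1, 2k_1) + (3k_2, 6k_2) \\
&= (k_1 + 3k_2,\ 2k_1 + 6k_2) \\
&= (k_1 + 3k_2)(1, 2).
\end{aligned}$$
In this case the span of $\mathbf{a}_1$ and $\mathbf{a}_2$ is the same as the span of $\mathbf{a}_1$.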
In 2D, one vector always spans a line. Two vectors span either a line or a plane. Two vectors a1 and a2
span a line whenever a1 = ka2 (as seen in the previous example).
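Whether two 2D vectors span a line or the whole plane can be tested with the quantity $a_1 b_2 - a_2 b_1$, which is zero exactly when one vector is a multiple of the other. The notes do not use this test, so treat the Python sketch below as an aside; the function name `span_dimension_2d` is our own.

```python
def span_dimension_2d(a1, a2):
    """Dimension of the span of two 2D vectors: 0 (a point), 1 (a line) or 2 (the plane)."""
    det = a1[0] * a2[1] - a1[1] * a2[0]
    if det != 0:
        return 2  # neither vector is a multiple of the other
    if any(a1) or any(a2):
        return 1  # parallel vectors, at least one of them non-zero
    return 0      # both vectors are the zero vector

print(span_dimension_2d((1, 1), (0, 1)))  # vectors of Example 2.6.12
print(span_dimension_2d((1, 2), (3, 6)))  # vectors of Example 2.6.13
```

For integer components the comparison with zero is exact; with floating-point components a small tolerance would be needed instead.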

In 3D, one vector always spans a line. Two vectors span either a line or a plane. Three vectors span
either a line, a plane or the whole of three dimensional space.
Example 2.6.14. Find the span of
$$\mathbf{a}_1 = (1, 0, 2), \qquad \mathbf{a}_2 = (2, 1, 0), \qquad \mathbf{a}_3 = (0, 1, -4).$$
These span more than a line since they are not all multiples of a single vector, so they either span a plane or the
whole of three dimensional space. One way to find out which is as follows. Try to solve
$$k_1\mathbf{a}_1 + k_2\mathbf{a}_2 + k_3\mathbf{a}_3 = \mathbf{0}.$$
If the only possible solution is $k_1 = k_2 = k_3 = 0$, then the vectors span the whole of three dimensional space.
If there is another solution, this means one vector may be written as a linear combination of the others and
they therefore span a plane. We solve
$$\begin{aligned}
\mathbf{0} &= k_1\mathbf{a}_1 + k_2\mathbf{a}_2 + k_3\mathbf{a}_3 \\
&= k_1(1, 0, 2) + k_2(2, 1, 0) + k_3(0, 1, -4) \\
&= (k_1 + 2k_2,\ k_2 + k_3,\ 2k_1 - 4k_3).
\end{aligned}$$
I.e.,
$$k_1 + 2k_2 = 0, \qquad k_2 + k_3 = 0, \qquad 2k_1 - 4k_3 = 0.$$
If we can find any $k_1, k_2, k_3$ (not all zero) satisfying the above system of equations then the vectors span a
plane. A solution is
$$k_1 = 2, \qquad k_2 = -1, \qquad k_3 = 1,$$
so these three vectors span a plane.

2.7
2.7.1

Cartesian and Parametric Equations


Lines

Working in 2D or 3D, recall that the span of a vector is a line. Given any vector $\mathbf{a}$, there are many parallel
vectors.
pic
The definition of a vector is a directed line segment, i.e. a small portion of a line. If we multiply $\mathbf{a}$ by a
scalar $\lambda$, this stretches or shrinks $\mathbf{a}$ in both directions, but $\lambda\mathbf{a}$ is always parallel to $\mathbf{a}$. So if $\mathbf{a}$ is a non-zero vector, the set of all
$\lambda\mathbf{a}$ is the span of $\mathbf{a}$, i.e. it is a line. However, vectors have no fixed position. Let $\mathbf{b}$ be the position vector of
any point on the line. Then
$$\mathbf{v} = \mathbf{b} + \lambda\mathbf{a}$$
is called the parametric equation of the line: as $\lambda$ varies over the real numbers, it gives the position vector
of each point on the line.
pic


Example 2.7.15. Find a parametric equation for the line parallel to $\mathbf{a} = (1, \sqrt{3})$ through the point with
position vector $\mathbf{b} = (0, 1)$.
Solution 1: Applying the formula,
$$\mathbf{v} = (0, 1) + \lambda(1, \sqrt{3}) = (\lambda,\ 1 + \sqrt{3}\lambda).$$
Solution 2: Since we can start with any point on the line, and $(-\sqrt{3}, -2)$ is a point on the line,
$$\mathbf{u} = (-\sqrt{3}, -2) + \lambda(1, \sqrt{3}) = (-\sqrt{3} + \lambda,\ -2 + \sqrt{3}\lambda)$$
gives the same line.
pic
Since we are in two dimensions, we can also find the Cartesian equation of the line. Treat the first and
second component of $\mathbf{v}$ as the x and y coordinate (since it is a position vector). Then
$$x = \lambda, \qquad y = 1 + \sqrt{3}\lambda,$$
so that
$$y = 1 + \sqrt{3}x$$
is the Cartesian equation of the line. Eliminating $\lambda$ from $\mathbf{u}$ would give the same equation.

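Example 2.7.16. If $\mathbf{v} = (1 + 7\lambda,\ -1 + 5\lambda)$, then
$$\frac{x - 1}{7} = \lambda, \qquad \frac{y + 1}{5} = \lambda.$$
So
$$\frac{x - 1}{7} = \frac{y + 1}{5}$$
is the equation of the line, and can be rearranged into the usual form.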

Example 2.7.17. A line in 3D contains the point with position vector $(1, 9, \sqrt{2})$, and is in the direction
$(-1, 0, -1)$. Its parametric equation is
$$\mathbf{v} = (1, 9, \sqrt{2}) + \lambda(-1, 0, -1) = (1 - \lambda,\ 9,\ \sqrt{2} - \lambda).$$
This example has been carefully chosen so that we may calculate its Cartesian equation as well. Since the
second component is 9, this line is contained in the plane $y = 9$. Then setting
$$x = 1 - \lambda$$
gives
$$z = x + \sqrt{2} - 1,$$
which is itself a plane. So the line is the intersection of the two planes. This example was specifically chosen
so that this would be doable, but in general it is much harder to find these planes given a line. We will revisit
this later in the module.

2.7.2

Planes

The parametric equations work exactly like those for lines. Suppose $\mathbf{a}$ and $\mathbf{b}$ are linearly independent vectors,
and $\mathbf{p}$ is the position vector of a point on the plane. Then a general point on the plane has position vector
$$\mathbf{r} = \mathbf{p} + \lambda\mathbf{a} + \mu\mathbf{b}.$$
However, it is often more useful to know the Cartesian equation of a plane. Fortunately, there is a recipe for
calculating this.
Let $\mathbf{a}$ and $\mathbf{b}$ be vectors spanning a plane. Then there are infinitely many normal vectors to $\mathbf{a}$ and infinitely
many normal vectors to $\mathbf{b}$. The normals to $\mathbf{a}$ span a plane, the normals to $\mathbf{b}$ also span a plane, and the
intersection of two planes in general position is a line. Since vectors parallel to this line are normal to $\mathbf{a}$
and are normal to $\mathbf{b}$, they are also normal to any linear combination of $\mathbf{a}$ and $\mathbf{b}$: if $\mathbf{n}$ is such a normal,
then
$$\mathbf{n} \cdot (\lambda\mathbf{a} + \mu\mathbf{b}) = \lambda\,\mathbf{n} \cdot \mathbf{a} + \mu\,\mathbf{n} \cdot \mathbf{b} = 0.$$
We say any such $\mathbf{n}$ is normal to the plane.
We use $\mathbf{r} = (x, y, z)$ to denote a general position vector belonging to the plane we are considering,
$\mathbf{p} = (x_1, y_1, z_1)$ to denote a known point on the plane, and $\mathbf{n} = (a, b, c)$ to denote a normal to the plane. By
varying $\mathbf{r}$, any vector in the plane may be written as $\mathbf{r} - \mathbf{p}$.
pic
Since all of these vectors are normal to $\mathbf{n}$, we have $\mathbf{n} \cdot (\mathbf{r} - \mathbf{p}) = 0$, which implies that
$$\mathbf{n} \cdot \mathbf{r} = \mathbf{n} \cdot \mathbf{p}.$$
Since $\mathbf{n}$ and $\mathbf{p}$ are both known, $\mathbf{n} \cdot \mathbf{p}$ is just a constant. The plane therefore has equation
$$ax + by + cz = C,$$
where $C$ is a constant that may be found by substituting in the coordinates of any point known to be on the
plane.
Example 2.7.18. A plane has normal $\mathbf{n} = (3, 5, 6)$, and contains the point $(1, 0, -1)$. The equation of the
plane is
$$3x + 5y + 6z = C.$$
Substituting in the known point gives $C = 3 - 6 = -3$, so the plane has equation
$$3x + 5y + 6z = -3.$$

2.7.3

Distance between a point and a plane

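There is a simple formula for the distance of a point $P = (x_1, y_1, z_1)$ from a plane
$$ax + by + cz + d = 0,$$
which uses the scalar product. We know that the vector $\mathbf{n} = a\mathbf{i} + b\mathbf{j} + c\mathbf{k}$ is normal to the plane. The shortest
distance from $P$ to the plane is the perpendicular distance, that is, parallel to $\mathbf{n}$. If $(x_0, y_0, z_0)$ is any point
in the plane then this distance is
$$|\mathrm{proj}_{\mathbf{n}}(\mathbf{r}_0 - \mathbf{r}_1)|$$
where
$$\mathbf{r}_0 = x_0\mathbf{i} + y_0\mathbf{j} + z_0\mathbf{k}$$
and
$$\mathbf{r}_1 = x_1\mathbf{i} + y_1\mathbf{j} + z_1\mathbf{k}.$$
Now
$$\mathrm{proj}_{\mathbf{n}}(\mathbf{r}_0 - \mathbf{r}_1) = \frac{(\mathbf{r}_0 - \mathbf{r}_1) \cdot \mathbf{n}}{\mathbf{n} \cdot \mathbf{n}}\,\mathbf{n} = \frac{(\mathbf{r}_0 - \mathbf{r}_1) \cdot \mathbf{n}}{|\mathbf{n}|^2}\,\mathbf{n}.$$
So
$$|\mathrm{proj}_{\mathbf{n}}(\mathbf{r}_0 - \mathbf{r}_1)| = \frac{|(\mathbf{r}_0 - \mathbf{r}_1) \cdot \mathbf{n}|}{|\mathbf{n}|}.$$
Now, since $(x_0, y_0, z_0)$ lies in the plane, $ax_0 + by_0 + cz_0 = -d$, and so
$$|(\mathbf{r}_0 - \mathbf{r}_1) \cdot \mathbf{n}| = |(x_0 - x_1)a + (y_0 - y_1)b + (z_0 - z_1)c| = |ax_0 + by_0 + cz_0 - ax_1 - by_1 - cz_1| = |d + ax_1 + by_1 + cz_1|.$$
Also
$$|\mathbf{n}| = \sqrt{a^2 + b^2 + c^2}.$$
So the distance of $P$ from the plane $ax + by + cz + d = 0$ is
$$\frac{|ax_1 + by_1 + cz_1 + d|}{\sqrt{a^2 + b^2 + c^2}}.$$
Example 2.7.19. We consider the plane
$$3x + 5y + 6z = -3$$
and the point $P = (1, 2, 1)$. The plane can be written in the form $3x + 5y + 6z + 3 = 0$. So the distance of
$(1, 2, 1)$ from this plane is
$$\frac{|3 \times 1 + 5 \times 2 + 6 \times 1 + 3|}{\sqrt{3^2 + 5^2 + 6^2}} = \frac{22}{\sqrt{70}} = \frac{11\sqrt{70}}{35}.$$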
2.8

The Vector Product

The vector product is another way of combining two vectors. Even though it is often called the cross product,
we refer to it as the vector product as a reminder that its output is always a vector. We will present two
formulas for the vector product: one is easy to remember, but requires you to understand a little about
matrices (which are otherwise not on the syllabus, but you can find information in Section 3.8); the other
is harder to remember but doesn't require any advanced knowledge. Let $\mathbf{v} = (v_1, v_2, v_3)$, $\mathbf{w} = (w_1, w_2, w_3)$.
Then
$$\mathbf{v} \times \mathbf{w} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix} = (v_2 w_3 - v_3 w_2,\ -v_1 w_3 + v_3 w_1,\ v_1 w_2 - v_2 w_1).$$
We compute
$$\mathbf{v} \cdot (\mathbf{v} \times \mathbf{w}) = v_1 v_2 w_3 - v_1 v_3 w_2 - v_2 v_1 w_3 + v_2 v_3 w_1 + v_3 v_1 w_2 - v_3 v_2 w_1 = 0,$$
which demonstrates that $\mathbf{v} \times \mathbf{w}$ is normal to $\mathbf{v}$. We could similarly show that it is normal to $\mathbf{w}$, so the cross
product of vectors $\mathbf{v}$ and $\mathbf{w}$ always gives a vector that is normal to both.
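The component formula for the vector product, and the fact that the result is normal to both inputs, can be verified with a short Python sketch. The function names `cross` and `dot` and the sample vectors are our own choices for illustration.

```python
def dot(a, b):
    """Scalar product of two vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))

def cross(v, w):
    """Vector product of two 3D vectors, using the component formula."""
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2,
            -v1 * w3 + v3 * w1,
            v1 * w2 - v2 * w1)

v = (1, 2, 3)
w = (4, 5, 6)
n = cross(v, w)
print(n)                      # the vector product
print(dot(v, n), dot(w, n))   # both zero, so n is normal to v and to w
print(cross((1, 0, 0), (0, 1, 0)))  # i x j
```

With integer components the two scalar products are exactly zero, mirroring the cancellation in the algebraic check.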
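Example 2.8.20. Let's revisit our problem of calculating the Cartesian equation of a plane. Suppose a
plane is spanned by $\mathbf{a} = (1, \sqrt{2}, \sqrt{2})$, $\mathbf{b} = (\sqrt{3}, 1, 0)$. Then we may take as a normal
$$\mathbf{n} = \mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & \sqrt{2} & \sqrt{2} \\ \sqrt{3} & 1 & 0 \end{vmatrix} = (-\sqrt{2},\ \sqrt{6},\ 1 - \sqrt{6}).$$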

2.8.1

Properties of the Vector Product

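A vector is uniquely determined by two of its properties: magnitude and direction. The direction of $\mathbf{a} \times \mathbf{b}$
obeys the so-called right hand rule. With outstretched arms, if $\mathbf{a}$ is the direction of the right arm and $\mathbf{b}$ is
the direction of the left arm, then $\mathbf{a} \times \mathbf{b}$ is in the direction of your head.

There is an alternative formula for $\mathbf{a} \times \mathbf{b}$, namely
$$\mathbf{a} \times \mathbf{b} = \hat{\mathbf{n}}\,|\mathbf{a}||\mathbf{b}| \sin(\theta),$$
where $\theta$ is the angle between $\mathbf{a}$ and $\mathbf{b}$, and $\hat{\mathbf{n}}$ is the unit vector normal to both given by the right hand rule. This is reminiscent of the formula for calculating the area of a triangle.
Indeed, its magnitude is twice that area. So the magnitude of $\mathbf{a} \times \mathbf{b}$ is the area of the parallelogram spanned by $\mathbf{a}$
and $\mathbf{b}$.
Note that the two versions of the formula may be combined to calculate $\sin(\theta)$, but not $\theta$ itself without
more information, since $\sin(\theta) = \sin(\pi - \theta)$ for $0 \le \theta \le \pi$.
The vector product also satisfies some standard algebraic properties:
$$\mathbf{a} \times (\mathbf{b} + \mathbf{c}) = \mathbf{a} \times \mathbf{b} + \mathbf{a} \times \mathbf{c}$$
$$(r\mathbf{a}) \times \mathbf{b} = \mathbf{a} \times (r\mathbf{b}) = r(\mathbf{a} \times \mathbf{b}).$$
However, note that
$$\mathbf{a} \times \mathbf{b} = -\mathbf{b} \times \mathbf{a},$$
so in general $\mathbf{a} \times \mathbf{b} \neq \mathbf{b} \times \mathbf{a}$. This makes sense according to the right hand rule (imagine starting with
your left hand or standing on your head). Also, in general,
$$\mathbf{a} \times (\mathbf{b} \times \mathbf{c}) \neq (\mathbf{a} \times \mathbf{b}) \times \mathbf{c},$$
so it makes no sense to write $\mathbf{a} \times \mathbf{b} \times \mathbf{c}$.
There is also a no-cancellation property: if $\mathbf{a} \times \mathbf{b} = \mathbf{a} \times \mathbf{c}$, we can't assume that $\mathbf{b} = \mathbf{c}$.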

2.8.2

Volume of a parallelepiped

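The triple scalar product of vectors $\mathbf{a}$, $\mathbf{b}$ and $\mathbf{c}$ is defined to be
$$\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}).$$
Its absolute value is invariant under the order in which we take the vectors, and is the volume of the
parallelepiped spanned by $\mathbf{a}$, $\mathbf{b}$ and $\mathbf{c}$.
pic
It may also be calculated using the formula
$$\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}.$$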
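The volume of a parallelepiped can be computed from the triple scalar product in a few lines. The Python sketch below is illustrative; the helper names `cross`, `dot` and `volume` are our own, and the sample vectors are those of Example 2.4.5.

```python
def dot(a, b):
    """Scalar product of two vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))

def cross(v, w):
    """Vector product of two 3D vectors."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def volume(a, b, c):
    """Volume of the parallelepiped spanned by a, b, c: |a . (b x c)|."""
    return abs(dot(a, cross(b, c)))

a, b, c = (1, 1, 0), (0, 1, 1), (1, 0, 1)
print(volume(a, b, c))
print(volume(b, a, c), volume(c, b, a))  # reordering the vectors does not change the volume
```

Swapping two of the vectors changes the sign of the triple scalar product but not its absolute value, which is why `volume` takes `abs`.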

2.9 Physical Interpretations of Scalar and Vector Products

2.9.1 Work Done

Suppose a force $\mathbf{F}$ moves an object along a vector $\mathbf{d}$. Then the total work done by the force is equal to the component of $\mathbf{F}$ in the direction of $\mathbf{d}$ multiplied by the distance travelled. In other words,
\[
\text{Work done} = |\mathbf{d}||\mathbf{F}| \cos(\theta) = \mathbf{F} \cdot \mathbf{d}.
\]
Example 2.9.21. Let $\mathbf{d} = (0, 0, 4)$, $\mathbf{F} = (2, 3, 5)$.
pic
Work done is
\[
\mathbf{F} \cdot \mathbf{d} = 0 + 0 + (4 \times 5) = 20.
\]

Example 2.9.22. Let $\mathbf{d} = (-1, 1, 0)$, $\mathbf{F} = (2, 3, 5)$. Work done is
\[
\mathbf{F} \cdot \mathbf{d} = -2 + 3 = 1.
\]

Example 2.9.23. Let $\mathbf{d} = (1, -1, 0)$, $\mathbf{F} = (2, 3, 5)$. Work done is
\[
\mathbf{F} \cdot \mathbf{d} = 2 - 3 = -1.
\]
In this example, the force is working to slow the object down.
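
As a quick check of the work-done formula, here is a minimal Python sketch. It assumes the same force $\mathbf{F} = (2, 3, 5)$ throughout, and the signs of the displacement vectors are assumptions chosen to reproduce the results $20$, $1$ and $-1$; the helper name `dot` is illustrative.

```python
def dot(u, v):
    """Scalar product of two vectors of equal length."""
    return sum(ui * vi for ui, vi in zip(u, v))

F = (2, 3, 5)

# Displacements; the signs in the last two are assumed for illustration.
displacements = [(0, 0, 4), (-1, 1, 0), (1, -1, 0)]

work = [dot(F, d) for d in displacements]
for d, w in zip(displacements, work):
    # A negative value means the force opposes the motion.
    print("d =", d, " work done =", w)
```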

2.9.2 Torque (Moments)

A seesaw is stable provided $|\mathbf{a}||\mathbf{m}| - |\mathbf{b}||\mathbf{n}| = 0$.

I.e. it is stable if the total torque about the pivot is 0.
The force acting on each object in the picture above is gravity, so it acts downwards, but in real life the force $\mathbf{F}$ could be at an angle.
pic
Resolving normal to the position vector $\mathbf{a}$ gives the normal force to $\mathbf{a}$; call this $\mathbf{m}$. Then torque is calculated as
\[
|\mathbf{a}||\mathbf{m}| = |\mathbf{a}||\mathbf{F}| \sin(\theta),
\]
which can be verified by moving $\mathbf{F}$ (remember that vectors have no fixed position). But this is exactly the formula for the magnitude of the cross product. In fact, the torque vector is given by
\[
\boldsymbol{\tau} = \mathbf{r} \times \mathbf{F}.
\]
Its magnitude is the moment about the point, and its direction is normal to the motion.
Example 2.9.24. Let $\mathbf{F} = (-2, 3, 1)$ be the force applied to a point at position $\mathbf{r} = (1, 1, 0)$ relative to the pivot. Then
\[
\boldsymbol{\tau} = \mathbf{r} \times \mathbf{F} =
\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 1 & 0 \\ -2 & 3 & 1 \end{vmatrix}
= (1, -1, 5).
\]
We have $|\boldsymbol{\tau}| = \sqrt{27}$.

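Example 2.9.25. Let $\mathbf{F} = (10, 0, 10)$ be the force applied to a point at position $\mathbf{r} = (1, 0, 1)$ relative to the pivot. Then
\[
\boldsymbol{\tau} = \mathbf{r} \times \mathbf{F} =
\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 0 & 1 \\ 10 & 0 & 10 \end{vmatrix}
= (0, 0, 0).
\]
So the force $\mathbf{F}$ is pulling the point directly away from the pivot, which means there is no twisting.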
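
The torque calculation can be sketched in a few lines of Python. This is only an illustration: the helper names `cross` and `norm` are not from the notes, and the second force below is a hypothetical one added to contrast with the zero-torque case.

```python
import math

def cross(a, b):
    """Vector product of two 3D vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))

r = (1, 0, 1)

# Force parallel to r: it pulls directly away from the pivot, so no twisting.
tau_parallel = cross(r, (10, 0, 10))

# Hypothetical force perpendicular to r, for comparison.
tau_perp = cross(r, (0, 1, 0))

print(tau_parallel, norm(tau_parallel))
print(tau_perp, norm(tau_perp))
```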
Example 2.9.26. Pythagoras' Theorem. We have a right-triangular prism shaped fishbowl on a vertical frictionless flagpole and fill it until the normal force is equal to the length of each side. Newton's first law of motion states that this object will remain at rest while there are no exterior forces acting on it. The magnitude of the torque is equal to
\[
a \cdot \frac{a}{2} + b \cdot \frac{b}{2} - c \cdot \frac{c}{2} = 0.
\]
We can rearrange this to prove Pythagoras' Theorem, $a^2 + b^2 = c^2$, using only physical properties.

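2.9.3 Projection and Orthogonal Components of Vectors

Reminder: The component of $\mathbf{b}$ in the direction of $\mathbf{a}$ is equal to
\[
\mathbf{b} \cdot \hat{\mathbf{a}}.
\]
When we resolve a vector in this way, it is often convenient to write the vector $\mathbf{b}$ as the sum of a vector parallel to $\mathbf{a}$ and a vector normal to $\mathbf{a}$.
pic
We have
\[
\mathbf{b} = \left( \frac{\mathbf{a} \cdot \mathbf{b}}{\mathbf{a} \cdot \mathbf{a}} \right) \mathbf{a}
+ \left( \frac{\mathbf{a} \times \mathbf{b}}{\mathbf{a} \cdot \mathbf{a}} \right) \times \mathbf{a}.
\]

Example 2.9.27. Let $\mathbf{a} = (2, 1, 0)$, $\mathbf{b} = (1, 1, 0)$. We need the following ingredients to use the above formula:
\begin{align*}
\mathbf{a} \cdot \mathbf{b} &= 2 + 1 = 3, \\
\mathbf{a} \cdot \mathbf{a} &= 4 + 1 = 5, \\
\mathbf{a} \times \mathbf{b} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 2 & 1 & 0 \\ 1 & 1 & 0 \end{vmatrix} = (0, 0, 1), \\
(\mathbf{a} \times \mathbf{b}) \times \mathbf{a} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 0 & 0 & 1 \\ 2 & 1 & 0 \end{vmatrix} = (-1, 2, 0).
\end{align*}
So
\[
\mathbf{b} = \frac{3}{5}(2, 1, 0) + \frac{1}{5}(-1, 2, 0),
\]
and we note that $(2, 1, 0)$ is in the same direction as $\mathbf{a}$ and $(-1, 2, 0)$ is normal to this direction since $(2, 1, 0) \cdot (-1, 2, 0) = 0$.

Example 2.9.28. Do the same for $\mathbf{a} = (1, 1, 1)$, $\mathbf{b} = (3, 0, 3)$. You should find that
\[
(3, 0, 3) = (2, 2, 2) + (1, -2, 1).
\]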
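
The decomposition into parallel and normal parts can be sketched as follows. This is a minimal illustration using exact rational arithmetic; the function name `decompose` and the helpers `dot` and `cross` are assumptions made for this sketch, not notation from the notes.

```python
from fractions import Fraction

def dot(u, v):
    """Scalar product."""
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(a, b):
    """Vector product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def decompose(a, b):
    """Split b into a part parallel to a and a part normal to a."""
    aa = Fraction(dot(a, a))
    scale = Fraction(dot(a, b)) / aa
    parallel = tuple(scale * ai for ai in a)
    # ((a x b) x a) / (a . a) is the component of b normal to a.
    normal = tuple(Fraction(x) / aa for x in cross(cross(a, b), a))
    return parallel, normal

parallel, normal = decompose((1, 1, 1), (3, 0, 3))
print(parallel, normal)
```

Adding the two returned vectors recovers $\mathbf{b}$, and their scalar product is zero, which is a useful sanity check on any hand calculation.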

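2.9.4 Lagrange's Identity

Recall that:
\begin{align*}
\mathbf{a} \cdot \mathbf{b} &= |\mathbf{a}||\mathbf{b}| \cos(\theta), \\
\mathbf{a} \times \mathbf{b} &= \hat{\mathbf{n}}\, |\mathbf{a}||\mathbf{b}| \sin(\theta).
\end{align*}
Taking the sum of the squares of the magnitudes of each of these, and using the fact that $\sin^2(\theta) + \cos^2(\theta) = 1$, we get Lagrange's identity:
\[
|\mathbf{a} \times \mathbf{b}|^2 + (\mathbf{a} \cdot \mathbf{b})^2 = |\mathbf{a}|^2 |\mathbf{b}|^2 = (\mathbf{a} \cdot \mathbf{a})(\mathbf{b} \cdot \mathbf{b}).
\]
This is useful because
\[
|\mathbf{a} \times \mathbf{b}| = \sqrt{|\mathbf{a}|^2 |\mathbf{b}|^2 - (\mathbf{a} \cdot \mathbf{b})^2}.
\]

Example 2.9.29. Find the area of the parallelogram spanned by
\[
\mathbf{a} = (1, 3, 2), \qquad \mathbf{b} = (-\sqrt{2}, 0, \sqrt{2}).
\]
This is
\begin{align*}
|\mathbf{a} \times \mathbf{b}|
&= \left| \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 3 & 2 \\ -\sqrt{2} & 0 & \sqrt{2} \end{vmatrix} \right| \\
&= |(3\sqrt{2}, -3\sqrt{2}, 3\sqrt{2})| \\
&= \sqrt{3 \cdot 9 \cdot 2} \\
&= \sqrt{54} \\
&= 3\sqrt{6}.
\end{align*}
Using Lagrange's identity,
\begin{align*}
|\mathbf{a} \times \mathbf{b}|
&= \sqrt{|\mathbf{a}|^2 |\mathbf{b}|^2 - (\mathbf{a} \cdot \mathbf{b})^2} \\
&= \sqrt{(14 \cdot 4) - 2} \\
&= \sqrt{54} \\
&= 3\sqrt{6},
\end{align*}
the same as before.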
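
Lagrange's identity is easy to test numerically. The sketch below is illustrative only: the helper names are not from the notes, and the integer vectors in the first check are hypothetical values chosen so that the arithmetic is exact.

```python
import math

def dot(u, v):
    """Scalar product."""
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(a, b):
    """Vector product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def area_direct(a, b):
    """Parallelogram area as the length of the cross product."""
    n = cross(a, b)
    return math.sqrt(dot(n, n))

def area_lagrange(a, b):
    """Parallelogram area from Lagrange's identity (no cross product needed)."""
    return math.sqrt(dot(a, a) * dot(b, b) - dot(a, b) ** 2)

# Exact integer check of |a x b|^2 + (a.b)^2 = |a|^2 |b|^2.
a, b = (1, 3, 2), (2, 0, -1)  # hypothetical integer vectors
n = cross(a, b)
lhs = dot(n, n) + dot(a, b) ** 2
rhs = dot(a, a) * dot(b, b)
print(lhs, rhs)

# Floating-point check with a vector involving sqrt(2).
s = math.sqrt(2)
print(area_direct((1, 3, 2), (-s, 0, s)), area_lagrange((1, 3, 2), (-s, 0, s)))
```

One reason to prefer the second form is that it only needs scalar products, so it also works for vectors in two dimensions or in more than three.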


Chapter 3

Integration
There are essentially three ways to understand what integration is:
• the measure of a region (i.e. area in 2D, volume in 3D),
• the opposite of differentiation,
• a way of finding a potential function.
We can approximate the area under the graph of $y = f(x)$ between $x = a$ and $x = b$ by adding up the areas $A_i$ of $n$ trapeziums, each of width $\Delta x$.
pic
Then the definite integral is defined to be
\[
\int_a^b f(x)\,dx = \lim_{n \to \infty} \sum_{i=1}^{n} A_i,
\]
where $\Delta x \to dx$ as $n \to \infty$.
If we know some $F(x)$ with the property that $F'(x) = f(x)$, then we can write the indefinite integral
\[
\int f(x)\,dx = F(x) + C,
\]
where $C$ is an arbitrary constant of integration (explained later), and we can use this to evaluate the definite integral:
\[
\int_a^b f(x)\,dx = [F(x)]_a^b = F(b) - F(a).
\]
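
To see the limit of trapezium sums in action, here is a minimal Python sketch. The function name `trapezium` and the test integrand $f(x) = x^3$ on $[2, 5]$ are illustrative assumptions; the exact value $F(5) - F(2)$ with $F(x) = x^4/4$ is used for comparison.

```python
def trapezium(f, a, b, n):
    """Sum the areas of n trapeziums of equal width under y = f(x) on [a, b]."""
    width = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * width)
    return total * width

def f(x):
    return x ** 3

exact = (5 ** 4 - 2 ** 4) / 4  # F(5) - F(2) with F(x) = x^4 / 4

# The approximation should approach the exact value as n grows.
for n in (10, 100, 1000):
    approx = trapezium(f, 2, 5, n)
    print(n, approx, abs(approx - exact))
```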

A difficult theorem in maths is the Generalised Stokes Theorem, which says
\[
\int_S d\omega = \int_{\partial S} \omega.
\]
You don't have to understand what this means or even apply it directly, but on some abstract level it tells us that whatever the dimension we're integrating over, we can integrate over one dimension less and get the same result. The region $\partial S$ is the boundary of $S$, so it has dimension one lower. In particular, it tells us that we can integrate over a one dimensional region by evaluating a zero dimensional integral: a sum. A loosely related example:
\[
\frac{d}{dr} \pi r^2 = 2\pi r.
\]
Differentiating the area of a circle (a 2D property), we get the circumference (a 1D property).

3.1 The Fundamental Theorem of Calculus

Suppose $F(x)$ has the property $F'(x) = f(x)$. But what if $G(x)$ also has the property that $G'(x) = f(x)$? In this case,
\[
\frac{d}{dx}\bigl(F(x) - G(x)\bigr) = \frac{d}{dx}F(x) - \frac{d}{dx}G(x) = f(x) - f(x) = 0,
\]
and the only thing whose derivative is zero is a constant. So if a function $F(x)$ satisfies $F'(x) = f(x)$, then all functions with that property are
\[
F(x) + C,
\]
where $C$ is a constant. This is why we must include a constant when we write the indefinite integral:
\[
\int f(x)\,dx = F(x) + C,
\]
and why we don't need to write a constant for definite integrals, since
\[
\int_a^b f(x)\,dx = [F(x) + C]_a^b = (F(b) + C) - (F(a) + C) = F(b) - F(a) + C - C = F(b) - F(a).
\]

Let's prove the formula:
\[
\int_a^b f(x)\,dx = [F(x)]_a^b = F(b) - F(a).
\]

Proof. Consider a function
\[
I(x) = \int_a^x f(u)\,du,
\]
where $a$ is fixed and the upper limit $x$ is allowed to vary. If we differentiate this function from first principles, we get
\begin{align*}
\frac{d}{dx} I(x) &= \lim_{h \to 0} \frac{1}{h} \left( \int_a^{x+h} f(u)\,du - \int_a^x f(u)\,du \right) \\
&= \lim_{h \to 0} \frac{1}{h} \int_x^{x+h} f(u)\,du.
\end{align*}
Since it can be shown that
\[
\int_x^{x+h} f(u)\,du = h f(x) + h^2 (\text{other terms}),
\]
we have
\[
I'(x) = f(x).
\]
Write $F(x) = I(x) + C$. Then
\begin{align*}
F(b) - F(a) &= I(b) - I(a) \\
&= \int_a^b f(x)\,dx - \int_a^a f(x)\,dx \\
&= \int_a^b f(x)\,dx.
\end{align*}

3.2 Properties of Integration

We have the following properties for integration:
\begin{align*}
\int \bigl(f(x) + g(x)\bigr)\,dx &= \int f(x)\,dx + \int g(x)\,dx, \\
\int A f(x)\,dx &= A \int f(x)\,dx, \\
\int_a^b f(x)\,dx &= -\int_b^a f(x)\,dx, \\
\text{for } a < c < b, \quad \int_a^b f(x)\,dx &= \int_a^c f(x)\,dx + \int_c^b f(x)\,dx.
\end{align*}
Since integration can be seen as the opposite of differentiation, many of our identities can simply be reversed.
\begin{align*}
\int x^n\,dx &= \frac{1}{n+1} x^{n+1} + C, \quad n \neq -1, \\
\int \frac{1}{x}\,dx &= \ln|x| + C, \\
\int e^x\,dx &= e^x + C, \\
\int \sin(x)\,dx &= -\cos(x) + C, \\
\int \cos(x)\,dx &= \sin(x) + C, \\
\int \sinh(x)\,dx &= \cosh(x) + C, \\
\int \cosh(x)\,dx &= \sinh(x) + C.
\end{align*}
The second of these tells us what to do when the first formula doesn't work (division by zero is not allowed). We can understand why. Let $y = e^x$. On the one hand, we can take the inverse function to write $x = \ln y$. On the other hand, we may differentiate to get
\[
\frac{dy}{dx} = e^x.
\]
But since $y = e^x$, we have
\[
\frac{dy}{dx} = y.
\]
Rearranging and evaluating gives
\[
\int \frac{1}{y}\,dy = \int dx = x + C = \ln y + C,
\]
and hence our result.


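Example 3.2.1. We have
\[
\int_2^5 x^3\,dx = \left[ \frac{x^4}{4} \right]_2^5 = \frac{5^4 - 2^4}{4} = \frac{609}{4}.
\]
pic
Example 3.2.2. Next,
\[
\int_0^{\pi} \sin(x)\,dx = [-\cos(x)]_0^{\pi} = (-\cos(\pi) + \cos(0)) = 1 + 1 = 2.
\]
pic
Example 3.2.3. Now,
\begin{align*}
\int_{-2}^{2} (x^7 - x)\,dx &= \left[ \frac{x^8}{8} - \frac{x^2}{2} \right]_{-2}^{2} \\
&= \left( \frac{2^8}{8} - \frac{2^2}{2} \right) - \left( \frac{(-2)^8}{8} - \frac{(-2)^2}{2} \right) \\
&= \left( \frac{2^8}{8} - \frac{2^2}{2} \right) - \left( \frac{2^8}{8} - \frac{2^2}{2} \right) \\
&= 0.
\end{align*}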

Example 3.2.4. This example looks much more challenging than it is:
\[
\int_{-7}^{7} \sin(\sin(\sin(x^9)))\,dx.
\]
We will return to it shortly!

3.3 Integrating over symmetric domains

A domain is called symmetric if it is in the form $[-a, a]$, where this denotes the set of real $x$ such that $-a \le x \le a$. We sometimes have shortcuts when we integrate over symmetric domains. When we integrate a function $f$ over this domain we get
\begin{align*}
\int_{-a}^{a} f(x)\,dx &= \int_{-a}^{0} f(x)\,dx + \int_0^a f(x)\,dx \\
&= -\int_0^{-a} f(x)\,dx + \int_0^a f(x)\,dx \\
&= \int_0^a f(-x)\,dx + \int_0^a f(x)\,dx.
\end{align*}
Recall that if $f$ is even, then $f(-x) = f(x)$ for all $x$. In this case,
\[
\int_{-a}^{a} f(x)\,dx = 2 \int_0^a f(x)\,dx.
\]
Also, if $f$ is odd then $f(-x) = -f(x)$ for all $x$. In this case,
\[
\int_{-a}^{a} f(x)\,dx = 0.
\]

Example 3.3.5. Since the integrand is odd and the domain is symmetric, we have
\[
\int_{-7}^{7} \sin(\sin(\sin(x^9)))\,dx = 0.
\]
Note that this is only guaranteed to work if the function is even or odd and the domain is symmetric. E.g., $x^3$ is an odd function, but $[2, 5]$ is not a symmetric domain. So
\[
\int_2^5 x^3\,dx = \frac{609}{4} \neq 0,
\]
whereas, for any $a$,
\[
\int_{-a}^{a} x^3\,dx = 0.
\]
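
The symmetry shortcuts can be checked numerically. The sketch below uses composite Simpson's rule, which is a technique not covered in these notes and is used here only as a convenient numerical integrator; the function name `simpson` and the domain $[-1, 1]$ are assumptions chosen to keep the integrand well behaved.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with an even number n of strips."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def odd(x):
    return math.sin(math.sin(math.sin(x ** 9)))

def even(x):
    return x ** 2

# Odd integrand on a symmetric domain: the two halves cancel.
odd_result = simpson(odd, -1, 1)

# Even integrand: the full integral is twice the integral over [0, a].
even_full = simpson(even, -1, 1)
even_half = simpson(even, 0, 1)

print(odd_result, even_full, 2 * even_half)
```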

3.4 Area between a curve and the x-axis

The integral
\[
\int_a^b f(x)\,dx
\]
gives the area below the graph of $y = f(x)$. However, if the graph itself goes below the axis then this value is negative,
pic
and if it is over in some places and under in others, the areas can even cancel each other out. So to calculate the area between the graph of $y = f(x)$ and the x-axis for $a \le x \le b$, the formula is
\[
A = \int_a^b |f(x)|\,dx.
\]
However, this is generally not a function we can directly integrate (recall that the modulus function is not differentiable), so we use the additive property of integration and split the integral into parts.
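Example 3.4.6. Find the area between $y = x^3$ and the x-axis for $-1 \le x \le 1$. Note that this is not
\[
\int_{-1}^{1} x^3\,dx = 0.
\]
The modulus function is defined as
\[
|x| = \begin{cases} x, & x \ge 0; \\ -x, & x < 0. \end{cases}
\]
Since $x^3 \ge 0$ when $x \ge 0$, and $x^3 < 0$ when $x < 0$, we have
\begin{align*}
A &= -\int_{-1}^{0} x^3\,dx + \int_0^1 x^3\,dx \\
&= -\left[ \frac{x^4}{4} \right]_{-1}^{0} + \left[ \frac{x^4}{4} \right]_0^1 \\
&= -\left( 0 - \frac{1}{4} \right) + \left( \frac{1}{4} - 0 \right) \\
&= \frac{1}{4} + \frac{1}{4} \\
&= \frac{1}{2}.
\end{align*}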

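Example 3.4.7. Find the area between the line $y = x$ and the x-axis for $-2 \le x \le 3$. Again, we have
\begin{align*}
\int_{-2}^{3} |x|\,dx &= -\int_{-2}^{0} x\,dx + \int_0^3 x\,dx \\
&= -\left[ \frac{x^2}{2} \right]_{-2}^{0} + \left[ \frac{x^2}{2} \right]_0^3 \\
&= -(0 - 2) + \left( \frac{9}{2} - 0 \right) \\
&= \frac{13}{2}.
\end{align*}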

So our recipe to calculate the area between $f(x)$ and the x-axis over a given domain is:
• Find all the sub-domains where $f(x)$ is positive and where $f(x)$ is negative (i.e. by finding roots).
• Integrate over each sub-domain separately, and add the results with appropriate signs.
Example 3.4.8. Find the total area bounded between the x-axis and the graph
\[
y = x^3 - 2x^2 - x + 2.
\]
This has factorisation $y = (x + 1)(x - 1)(x - 2)$, so has x-intercepts at $x = -1$, $x = 1$, $x = 2$.
pic
We see from the graph that
\begin{align*}
\int_{-1}^{2} |x^3 - 2x^2 - x + 2|\,dx &= \int_{-1}^{1} (x^3 - 2x^2 - x + 2)\,dx - \int_1^2 (x^3 - 2x^2 - x + 2)\,dx \\
&= \left[ \frac{x^4}{4} - \frac{2x^3}{3} - \frac{x^2}{2} + 2x \right]_{-1}^{1} - \left[ \frac{x^4}{4} - \frac{2x^3}{3} - \frac{x^2}{2} + 2x \right]_1^2 \\
&\ \ \vdots \\
&= \frac{37}{12}.
\end{align*}
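
The recipe above can be sketched in Python using exact fractions. This is an illustration under the assumption that the roots and an antiderivative are already known; the names `G` and `roots` are chosen for this sketch only.

```python
from fractions import Fraction

def G(x):
    """An antiderivative of x^3 - 2x^2 - x + 2."""
    x = Fraction(x)
    return x ** 4 / 4 - 2 * x ** 3 / 3 - x ** 2 / 2 + 2 * x

# Roots of (x + 1)(x - 1)(x - 2), in increasing order.
roots = [-1, 1, 2]

# Integrate over each sub-domain separately. Taking the absolute value of
# each piece is the same as attaching the appropriate sign.
pieces = [G(right) - G(left) for left, right in zip(roots, roots[1:])]
area = sum(abs(p) for p in pieces)

print(pieces, area)
```

The signed pieces show which sub-domain lies above the axis and which lies below, while the final sum gives the total area.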

3.4.1 The area between two curves

Generalising the previous section, the area bounded by $y = f(x)$ and $y = g(x)$ on the domain $a \le x \le b$ is given by
\[
\int_a^b |f(x) - g(x)|\,dx.
\]
To know when $f(x) - g(x)$ is positive or negative, we must start by solving $f(x) = g(x)$ and finding all crossing points. The recipe is then exactly the same as in the previous section, and note that when $g(x) \equiv 0$ we get exactly the same result as in the previous section.
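Example 3.4.9. Calculate the area between $y = x^2$ and $y = x^4$.
pics
We first need to find the bounded region. Solving $x^2 = x^4$ gives $x = 0$ or $x = \pm 1$. Since $x^2 \ge x^4$ when $0 \le x \le 1$, the region over $0 \le x \le 1$ has area
\begin{align*}
A &= \int_0^1 (x^2 - x^4)\,dx \\
&= \left[ \frac{x^3}{3} - \frac{x^5}{5} \right]_0^1 \\
&= \frac{1}{3} - \frac{1}{5} \\
&= \frac{2}{15}.
\end{align*}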

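Example 3.4.10. Calculate the area between the graphs $y = x$ and $y = 1 - x/2$ for $0 \le x \le 1$.
pic
These graphs only intersect when $x = 2/3$, and for $0 \le x \le 2/3$, $1 - x/2 \ge x$. Now
\begin{align*}
A &= \int_0^1 \left| 1 - \frac{x}{2} - x \right| dx \\
&= \int_0^{2/3} \left( 1 - \frac{x}{2} - x \right) dx - \int_{2/3}^{1} \left( 1 - \frac{x}{2} - x \right) dx \\
&= \int_0^{2/3} \left( 1 - \frac{3x}{2} \right) dx - \int_{2/3}^{1} \left( 1 - \frac{3x}{2} \right) dx \\
&= \left[ x - \frac{3x^2}{4} \right]_0^{2/3} - \left[ x - \frac{3x^2}{4} \right]_{2/3}^{1} \\
&\ \ \vdots \\
&= \frac{5}{12}.
\end{align*}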

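Example 3.4.11. Find the area bounded by $y = \sin(x)$ and $y = 1/\sqrt{2}$ for $0 \le x \le \pi/2$.
pic
We have
\begin{align*}
A &= \int_0^{\pi/2} \left| \sin(x) - \frac{1}{\sqrt{2}} \right| dx \\
&= -\int_0^{\pi/4} \left( \sin(x) - \frac{1}{\sqrt{2}} \right) dx + \int_{\pi/4}^{\pi/2} \left( \sin(x) - \frac{1}{\sqrt{2}} \right) dx \\
&= \left[ \cos(x) + \frac{x}{\sqrt{2}} \right]_0^{\pi/4} + \left[ -\cos(x) - \frac{x}{\sqrt{2}} \right]_{\pi/4}^{\pi/2} \\
&\ \ \vdots \\
&= \sqrt{2} - 1.
\end{align*}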
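
As a numerical cross-check of the area between $y = \sin(x)$ and $y = 1/\sqrt{2}$, the sketch below splits the domain at the crossing point $x = \pi/4$ and integrates each piece. Composite Simpson's rule is used purely as a numerical tool (it is not part of these notes), and the function names are illustrative.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with an even number n of strips."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def difference(x):
    """f(x) - g(x) for f = sin and g = 1/sqrt(2)."""
    return math.sin(x) - 1 / math.sqrt(2)

# The curves cross at x = pi/4, so integrate each side separately.
crossing = math.pi / 4
left = simpson(difference, 0, crossing)             # negative: sin is below
right = simpson(difference, crossing, math.pi / 2)  # positive: sin is above

area = abs(left) + abs(right)
print(left, right, area)
```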

3.5 Integrating Certain Rational Functions

It may be shown that
\[
\int \frac{1}{x - a}\,dx = \ln|x - a| + C,
\]
and that for $n \neq 1$,
\[
\int \frac{1}{(x - a)^n}\,dx = \frac{(x - a)^{1-n}}{1 - n} + C.
\]
See Section 3.9 in the Appendix for a more detailed discussion of partial fractions.
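Example 3.5.12. We have
\begin{align*}
\int \frac{2x - 1}{x^2 - x - 2}\,dx &= \int \left( \frac{1}{x + 1} + \frac{1}{x - 2} \right) dx \\
&= \int \frac{1}{x + 1}\,dx + \int \frac{1}{x - 2}\,dx \\
&= \ln|x + 1| + \ln|x - 2| + C \\
&= \ln|(x + 1)(x - 2)| + C \\
&= \ln|x^2 - x - 2| + C.
\end{align*}

Example 3.5.13. We compute:
\begin{align*}
\int \frac{2}{x^2 - 3x}\,dx &= -\frac{2}{3} \int \frac{1}{x}\,dx + \frac{2}{3} \int \frac{1}{x - 3}\,dx \\
&= -\frac{2}{3} \ln|x| + \frac{2}{3} \ln|x - 3| + C \\
&= \frac{2}{3} \left( \ln\frac{1}{|x|} + \ln|x - 3| \right) + C \\
&= \frac{2}{3} \ln\left( \frac{|x - 3|}{|x|} \right) + C.
\end{align*}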

Example 3.5.14. Now,
\begin{align*}
\int \frac{x^2 - x + 1}{x - 1}\,dx &= \int \left( x + \frac{1}{x - 1} \right) dx \\
&= \frac{1}{2} x^2 + \ln|x - 1| + C.
\end{align*}

3.6 Integration by Parts

Recall the product rule for differentiation of a function $f(x) = u(x)v(x)$:
\[
\frac{d}{dx}(uv) = u \frac{dv}{dx} + v \frac{du}{dx}.
\]
Integrating this rule gives
\[
uv = \int u \frac{dv}{dx}\,dx + \int v \frac{du}{dx}\,dx.
\]
Rearranging gives the extremely useful integration by parts formula:
\[
\int u \frac{dv}{dx}\,dx = uv - \int v \frac{du}{dx}\,dx.
\]
Similarly,
\[
\int_a^b u \frac{dv}{dx}\,dx = [uv]_a^b - \int_a^b v \frac{du}{dx}\,dx.
\]
This formula can be used to solve integrals that are written as products. You should choose $u$ and $dv/dx$. In general, you choose $u$ to be whichever factor differentiates to something nice.
Example 3.6.15. Let's evaluate
\[
\int x \sin(x)\,dx.
\]
The problem is simplified if we choose $x$ as our thing to differentiate. So let us choose
\[
u = x, \qquad \frac{dv}{dx} = \sin(x).
\]
This gives
\[
\frac{du}{dx} = 1, \qquad v = -\cos(x).
\]
Therefore
\begin{align*}
\int x \sin(x)\,dx &= -x \cos(x) + \int \cos(x)\,dx \\
&= -x \cos(x) + \sin(x) + C.
\end{align*}
To check, let
\[
y = -x \cos(x) + \sin(x).
\]
Then
\[
\frac{dy}{dx} = (x \sin(x) - \cos(x)) + \cos(x) = x \sin(x),
\]
as expected.
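
The same check can be done numerically: differentiate the proposed antiderivative with a central difference and compare with the integrand. This is a minimal sketch; the step size `h` and the sample points are arbitrary choices made for illustration.

```python
import math

def y(x):
    """Proposed antiderivative of x*sin(x), found by integration by parts."""
    return -x * math.cos(x) + math.sin(x)

def integrand(x):
    return x * math.sin(x)

def central_difference(f, x, h=1e-5):
    """Numerical estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# The derivative of the antiderivative should match the integrand.
errors = []
for x in (0.0, 0.5, 1.0, 2.0, 3.0):
    errors.append(abs(central_difference(y, x) - integrand(x)))

print(max(errors))
```

A small maximum error supports the answer, although it is not a proof; the symbolic check in the text is the real verification.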
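Example 3.6.16. For
\[
\int x^2 e^{-x}\,dx
\]
let
\[
u = x^2, \qquad \frac{dv}{dx} = e^{-x}.
\]
Then
\[
\frac{du}{dx} = 2x, \qquad v = -e^{-x}.
\]
So
\[
\int x^2 e^{-x}\,dx = -x^2 e^{-x} + 2 \int x e^{-x}\,dx.
\]
We have to do integration by parts again. This time, let
\[
s = x, \qquad \frac{dt}{dx} = e^{-x}.
\]
Then
\[
\frac{ds}{dx} = 1, \qquad t = -e^{-x},
\]
giving
\begin{align*}
\int x^2 e^{-x}\,dx &= -x^2 e^{-x} + 2 \left( -x e^{-x} + \int e^{-x}\,dx \right) \\
&= -x^2 e^{-x} - 2x e^{-x} - 2e^{-x} + C \\
&= -e^{-x}(x^2 + 2x + 2) + C.
\end{align*}
To check, for $y = -e^{-x}(x^2 + 2x + 2) + C$ we get
\[
\frac{dy}{dx} = -e^{-x}(2x + 2) + e^{-x}(x^2 + 2x + 2) = x^2 e^{-x}.
\]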
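Example 3.6.17. Now, for
\[
\int_1^2 \frac{\ln(x)}{x^3}\,dx,
\]
since it is easier to differentiate rather than integrate $\ln(x)$, we let
\[
u = \ln(x), \qquad \frac{dv}{dx} = x^{-3}.
\]
Then
\[
\frac{du}{dx} = \frac{1}{x}, \qquad v = -\frac{1}{2} x^{-2}.
\]
Thus
\begin{align*}
\int_1^2 \frac{\ln(x)}{x^3}\,dx &= \left[ -\frac{\ln(x)}{2x^2} \right]_1^2 + \frac{1}{2} \int_1^2 \frac{1}{x^3}\,dx \\
&= -\frac{\ln 2}{8} + \frac{1}{2} \left[ -\frac{1}{2x^2} \right]_1^2 \\
&= -\frac{\ln 2}{8} + \frac{1}{2} \left( -\frac{1}{8} + \frac{1}{2} \right) \\
&= -\frac{\ln 2}{8} + \frac{3}{16} \\
&\approx 0.1.
\end{align*}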

3.6.1 Unusual Examples

Example 3.6.18. To integrate $\ln(x)$ we write $\ln(x) = 1 \cdot \ln(x)$, and integrate by parts:
\[
\int \ln(x)\,dx = \int 1 \cdot \ln(x)\,dx.
\]
Since we do not yet know the integral of $\ln(x)$, we must set
\[
u = \ln(x), \qquad \frac{dv}{dx} = 1
\]
to give
\[
\frac{du}{dx} = \frac{1}{x}, \qquad v = x.
\]
Then,
\begin{align*}
\int \ln(x)\,dx &= x \ln(x) - \int dx \\
&= x \ln(x) - x + C.
\end{align*}

Example 3.6.19. To integrate
\[
\int \sin(x) \cos(x)\,dx,
\]
we attempt to use integration by parts with
\[
u = \sin(x), \qquad \frac{dv}{dx} = \cos(x).
\]
This gives
\[
\frac{du}{dx} = \cos(x), \qquad v = \sin(x),
\]
so that
\[
\int \sin(x) \cos(x)\,dx = \sin^2(x) - \int \sin(x) \cos(x)\,dx + 2C.
\]
Although it doesn't look like we've gotten anywhere, we can add the original integral to both sides then divide by 2 to get
\[
\int \sin(x) \cos(x)\,dx = \frac{1}{2} \sin^2(x) + C.
\]
We can check that for $y = \frac{1}{2} \sin^2(x) + C$ we do indeed get $\frac{dy}{dx} = \sin(x) \cos(x)$.

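Example 3.6.20. We have
\begin{align*}
\int_0^{\pi} e^x \cos(x)\,dx &= [e^x \sin(x)]_0^{\pi} - \int_0^{\pi} e^x \sin(x)\,dx \\
&= -\int_0^{\pi} e^x \sin(x)\,dx \\
&= [e^x \cos(x)]_0^{\pi} - \int_0^{\pi} e^x \cos(x)\,dx \\
&= (e^{\pi} \cos(\pi) - e^0 \cos(0)) - \int_0^{\pi} e^x \cos(x)\,dx \\
&= -e^{\pi} - 1 - \int_0^{\pi} e^x \cos(x)\,dx.
\end{align*}
Adding the original integral to both sides gives
\[
2 \int_0^{\pi} e^x \cos(x)\,dx = -e^{\pi} - 1,
\]
so that
\[
\int_0^{\pi} e^x \cos(x)\,dx = \frac{-e^{\pi} - 1}{2} \approx -12.1.
\]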

Example 3.6.21. More last year - copy through

3.7 Integration by Substitution

This is exactly the same idea as the chain rule for differentiation, and uses the fact that
\[
\frac{dx}{dt} \frac{dt}{dx} = 1.
\]
However, the exact method to use can be wildly different for different integrals. Some standard methods exist, but the only way to get the necessary intuition to solve non-standard integrals is by doing lots and lots of practice.
Example 3.7.22. Let's calculate
\[
\int \sin(2x)\,dx.
\]
Since we know how to integrate $\sin(t)$, we will let
\[
t = 2x.
\]
This gives
\[
\frac{dt}{dx} = 2,
\]
i.e.
\[
\frac{dx}{dt} = \frac{1}{2}.
\]
Then
\begin{align*}
\int \sin(2x)\,dx &= \int \sin(t)\,dx \\
&= \int \sin(t) \frac{dx}{dt}\,dt \\
&= \frac{1}{2} \int \sin(t)\,dt \\
&= -\frac{1}{2} \cos(t) + C \\
&= -\frac{1}{2} \cos(2x) + C.
\end{align*}
The example above contained a few extra steps to explain what is really going on, but in practice it is sufficient to write it a bit more concisely, as the next example shows.
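Example 3.7.23. For
\[
\int e^{2x+3}\,dx
\]
we use the substitution
\[
t = 2x + 3.
\]
This gives
\[
\frac{dt}{dx} = 2,
\]
or
\[
\frac{1}{2}\,dt = dx.
\]
We then replace $2x + 3$ by $t$ and $dx$ by $\frac{1}{2}\,dt$ to get
\begin{align*}
\int e^{2x+3}\,dx &= \frac{1}{2} \int e^t\,dt \\
&= \frac{1}{2} e^t + C \\
&= \frac{1}{2} e^{2x+3} + C.
\end{align*}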

These examples describe a general approach that often works. If you are given
\[
\int f(ax + b)\,dx,
\]
then the first thing you should try is $t = ax + b$.

We can derive another general formula by considering the derivative of $y = \ln(f(x))$. This, according to the chain rule, is
\[
\frac{dy}{dx} = \frac{f'(x)}{f(x)}.
\]
Therefore, we get the general formula
\[
\int \frac{f'(x)}{f(x)}\,dx = \ln(f(x)) + C
\]
(valid where $f(x) > 0$; in general the right hand side is $\ln|f(x)| + C$).
We can show this works by letting $t = f(x)$. Then
\[
\frac{dt}{dx} = f'(x),
\]
which gives
\[
dt = f'(x)\,dx.
\]
Then
\[
\int \frac{f'(x)}{f(x)}\,dx = \int \frac{1}{t}\,dt = \ln(t) + C = \ln(f(x)) + C.
\]

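Example 3.7.24. To calculate
\[
\int \frac{e^x}{e^x + 1}\,dx,
\]
notice that the top is the derivative of the bottom. So let $t = e^x + 1$. Then
\[
dt = e^x\,dx,
\]
giving
\begin{align*}
\int \frac{e^x}{e^x + 1}\,dx &= \int \frac{1}{t}\,dt \\
&= \ln(t) + C \\
&= \ln(e^x + 1) + C.
\end{align*}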

Another general rule comes directly from the chain rule. Consider
\[
\int g'(x) f(g(x))\,dx.
\]
In this case, a sensible substitution would be
\[
t = g(x),
\]
since this gives
\[
dt = g'(x)\,dx,
\]
i.e.
\[
\int g'(x) f(g(x))\,dx = \int f(t)\,dt.
\]

Example 3.7.25. For
\[
\int 2x(x^2 + 3)^5\,dx,
\]
let
\[
t = x^2 + 3
\]
so that
\[
dt = 2x\,dx.
\]
Then
\begin{align*}
\int 2x(x^2 + 3)^5\,dx &= \int t^5\,dt \\
&= \frac{1}{6} t^6 + C \\
&= \frac{1}{6} (x^2 + 3)^6 + C.
\end{align*}

Example 3.7.26. For
\[
\int e^x \cos(e^x)\,dx,
\]
let
\[
t = e^x
\]
so that
\[
dt = e^x\,dx.
\]
Then
\begin{align*}
\int e^x \cos(e^x)\,dx &= \int \cos(t)\,dt \\
&= \sin(t) + C \\
&= \sin(e^x) + C.
\end{align*}

3.7.1 Trigonometric Substitutions

Most of these are based on the standard trig identity and those that follow from it:
\begin{align*}
\sin^2(\theta) + \cos^2(\theta) &= 1, \\
\tan^2(\theta) + 1 &= \sec^2(\theta), \\
1 + \cot^2(\theta) &= \csc^2(\theta).
\end{align*}
Example 3.7.27. For

1
dx
4 x2

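Appendix

3.8 Matrices

A $2 \times 2$ matrix
\[
A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}
\]
defines a map acting on 2D space. That is,
\[
\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
= \begin{pmatrix} (a, b) \cdot (x, y) \\ (c, d) \cdot (x, y) \end{pmatrix}
= \begin{pmatrix} ax + by \\ cx + dy \end{pmatrix}.
\]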
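Example 3.8.28. Let
\[
A = \begin{pmatrix} 2 & 3 \\ 0 & 1 \end{pmatrix}.
\]
Then
\begin{align*}
\begin{pmatrix} 2 & 3 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 \\ 0 \end{pmatrix} &= \begin{pmatrix} 0 \\ 0 \end{pmatrix}, &
\begin{pmatrix} 2 & 3 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} &= \begin{pmatrix} 2 \\ 0 \end{pmatrix}, \\
\begin{pmatrix} 2 & 3 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} &= \begin{pmatrix} 3 \\ 1 \end{pmatrix}, &
\begin{pmatrix} 2 & 3 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} &= \begin{pmatrix} 5 \\ 1 \end{pmatrix}.
\end{align*}
Think of the vectors as vertices of a square, and the matrix $A$ skews the square.
pic
The original area was 1, the new area is 2. The area of any shape is scaled by matrix $A$ by an amount known as the determinant of the matrix:
\[
\det A = |A| = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc.
\]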
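
The action of a $2 \times 2$ matrix on the corners of the unit square, and the determinant as an area scale factor, can be sketched as follows. The function names `apply`, `det2` and `det3` are illustrative, and matrices are assumed to be stored as tuples of rows.

```python
def apply(A, v):
    """Multiply a 2x2 matrix (tuple of rows) by a 2D vector."""
    return (A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1])

def det2(A):
    """Determinant ad - bc of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def det3(M):
    """Determinant of a 3x3 matrix by expanding along the first row."""
    a, b, c = M
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

A = ((2, 3), (0, 1))
square = [(0, 0), (1, 0), (0, 1), (1, 1)]

# Images of the corners of the unit square under A.
images = [apply(A, v) for v in square]
print(images)

# The unit square has area 1, so the image has area |det A|.
print(det2(A))
```

The function `det3` follows the same first-row expansion used for the 3D determinant and could be used to compute a triple scalar product.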

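The same can be done in 3D. The formula for the determinant is:
\begin{align*}
\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
&= a_1 \begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix}
- a_2 \begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix}
+ a_3 \begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix} \\
&= a_1 (b_2 c_3 - b_3 c_2) - a_2 (b_1 c_3 - b_3 c_1) + a_3 (b_1 c_2 - b_2 c_1).
\end{align*}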


3.9 Partial Fractions

We start with a couple of definitions. If $f(x)$ is a polynomial in $x$, its degree $\deg f$ is the highest power of $x$.

Polynomial       Degree
$12$             0
$3x$             1
$4x^3 + 7x^2$    3

If $f(x)$ has degree at least 2, it is called irreducible if it has no real roots. Polynomials of degree 1 are also called irreducible. Any polynomial can be factorised into a product of irreducible polynomials. E.g.
\begin{align*}
x^4 - 1 &= (x - 1)(x + 1)(x^2 + 1), \\
x^3 - 1 &= (x - 1)(x^2 + x + 1).
\end{align*}
A rational function is the quotient of one polynomial by another. I.e.
\[
f(x) = \frac{g(x)}{h(x)},
\]
where $g(x)$ and $h(x)$ are polynomials. The rational function $f(x)$ is called proper if $\deg h > \deg g$. A rational function $f(x)$ is said to be written in partial fractions if it is written as the sum of a polynomial and proper rational functions, where each denominator is irreducible.
We first give a method to express a proper rational function in terms of partial fractions. Given
\[
f(x) = \frac{g(x)}{h(x)},
\]
where $\deg g < \deg h$:


1. Factorise h(x) into irreducibles. The factors become denominators of our new fractions.
2. Write down the solution using constants, where each numerator is one degree lower than its denominator.
3. Find constants.
4. Write final solution.
Example 3.9.29. Write the following rational function in partial fractions:
\[
\frac{2x - 1}{x^2 - x - 2}.
\]
The denominator factorises to give
\[
x^2 - x - 2 = (x + 1)(x - 2).
\]
Our function may therefore be written as
\[
\frac{2x - 1}{x^2 - x - 2} = \frac{A}{x + 1} + \frac{B}{x - 2}.
\]
By cross multiplication, we get
\[
\frac{2x - 1}{x^2 - x - 2} = \frac{A(x - 2) + B(x + 1)}{x^2 - x - 2},
\]
or simply
\[
2x - 1 = A(x - 2) + B(x + 1).
\]
There are essentially two methods to find $A$ and $B$. We demonstrate the first in this example. When $x = -1$, this equation becomes
\[
-3 = -3A,
\]
so that $A = 1$. When $x = 2$, the equation becomes
\[
3 = 3B,
\]
so that $B = 1$. Therefore, we can write our original function in partial fractions as follows:
\[
\frac{2x - 1}{x^2 - x - 2} = \frac{1}{x + 1} + \frac{1}{x - 2}.
\]

Example 3.9.30. We will do the same for
\[
\frac{2}{x^2 - 3x}.
\]
The denominator factorises to give
\[
x^2 - 3x = x(x - 3),
\]
so the function may be written as
\[
\frac{2}{x^2 - 3x} = \frac{A}{x} + \frac{B}{x - 3}.
\]
Cross multiplication gives
\[
\frac{2}{x^2 - 3x} = \frac{A(x - 3) + Bx}{x^2 - 3x}.
\]
That is,
\[
2 = A(x - 3) + Bx = (A + B)x - 3A.
\]
This time we will compare coefficients to get
\begin{align*}
A + B &= 0, \\
-3A &= 2,
\end{align*}
which gives $A = -2/3$ and $B = 2/3$. Therefore, we get
\[
\frac{2}{x^2 - 3x} = -\frac{2}{3x} + \frac{2}{3(x - 3)}.
\]
Example 3.9.31. Now let's consider
\[
\frac{1}{x(x^2 + 1)}.
\]
This denominator is already factorised, but remember that the new numerators are not constants: they are always one degree less than their denominators. So we may write our function as
\[
\frac{1}{x(x^2 + 1)} = \frac{A}{x} + \frac{Bx + C}{x^2 + 1}.
\]
Cross multiplying gives
\[
1 = A(x^2 + 1) + (Bx + C)x = (A + B)x^2 + Cx + A.
\]
This time we have no choice but to compare coefficients:
\begin{align*}
A + B &= 0, \\
C &= 0, \\
A &= 1,
\end{align*}
giving $B = -1$. So we get
\[
\frac{1}{x(x^2 + 1)} = \frac{1}{x} - \frac{x}{x^2 + 1}.
\]
When we are dealing with improper rational functions, i.e.
\[
f(x) = \frac{g(x)}{h(x)}
\]
where $\deg g \ge \deg h$, then it is still possible to write $f(x)$ in partial fractions, but we must first find a polynomial summand that makes the fraction proper. This is achieved by rearranging the numerator, and is best explained by examples.
Example 3.9.32. Let
\[
f(x) = \frac{x^2 - x + 1}{x^2 - 1}.
\]
The top and bottom here are the same degree so this fraction is improper. We try to write the numerator as the denominator plus a lower degree polynomial to get cancellation:
\begin{align*}
\frac{x^2 - x + 1}{x^2 - 1} &= \frac{(x^2 - 1) + (-x + 2)}{x^2 - 1} \\
&= \frac{x^2 - 1}{x^2 - 1} - \frac{x - 2}{x^2 - 1} \\
&= 1 - \frac{x - 2}{x^2 - 1} \\
&= 1 + \frac{1}{2(x - 1)} - \frac{3}{2(x + 1)}.
\end{align*}

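Example 3.9.33. For
\[
f(x) = \frac{x^2 - x + 1}{x - 1},
\]
we need to write the numerator as a multiple of the denominator plus something of lower degree than the denominator. This gives
\begin{align*}
\frac{x^2 - x + 1}{x - 1} &= \frac{x(x - 1) + 1}{x - 1} \\
&= \frac{x(x - 1)}{x - 1} + \frac{1}{x - 1} \\
&= x + \frac{1}{x - 1}.
\end{align*}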
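
A partial fraction decomposition can be sanity-checked by evaluating both sides at a few points away from the roots of the denominator. The sketch below does this with exact fractions; the function name `agree` and the chosen sample points are illustrative assumptions.

```python
from fractions import Fraction

def agree(lhs, rhs, points):
    """Check that two rational functions take equal values at the given points."""
    return all(lhs(Fraction(x)) == rhs(Fraction(x)) for x in points)

# Sample points chosen to avoid the poles at x = -1, 0, 1, 2, 3.
points = [4, 5, 7, Fraction(1, 2), -3]

checks = [
    # (2x - 1)/(x^2 - x - 2) = 1/(x + 1) + 1/(x - 2)
    agree(lambda x: (2 * x - 1) / (x ** 2 - x - 2),
          lambda x: 1 / (x + 1) + 1 / (x - 2), points),
    # 2/(x^2 - 3x) = -2/(3x) + 2/(3(x - 3))
    agree(lambda x: 2 / (x ** 2 - 3 * x),
          lambda x: -2 / (3 * x) + 2 / (3 * (x - 3)), points),
    # 1/(x(x^2 + 1)) = 1/x - x/(x^2 + 1)
    agree(lambda x: 1 / (x * (x ** 2 + 1)),
          lambda x: 1 / x - x / (x ** 2 + 1), points),
    # (x^2 - x + 1)/(x^2 - 1) = 1 + 1/(2(x - 1)) - 3/(2(x + 1))
    agree(lambda x: (x ** 2 - x + 1) / (x ** 2 - 1),
          lambda x: 1 + 1 / (2 * (x - 1)) - 3 / (2 * (x + 1)), points),
]

print(checks)
```

Agreement at finitely many points is not a proof in general, but for rational functions of this low degree, agreement at five points leaves no room for a different answer.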
