
Differentiation

1.0 : Introduction
Here we cover the basic theory and application of that part of calculus called differentiation.
Differentiation is one of two fundamental topics in calculus. These notes will include :

Algebraic and Geometric Definitions of the Derivative of a Function


Here we will see how the derivative of a function can be defined in two ways : by algebra or
via geometry. In using the geometric approach we will study the basic ideas behind
gradients and tangents of a function. We will consider the process of two points on a curve
approaching each other to form a tangent to the curve, and we will formalise this process
mathematically.

Differentiation from 1st Principles


Having formalised the process of deriving the tangent we will derive, from 1st principles,
the gradients of the tangents to certain functions and define this process as differentiation.

Derivatives of Trigonometric, Exponential and Logarithmic Functions


Here we will study the basic derivatives of trigonometric functions. To obtain these
derivatives we will first need to study two important limits involving trig functions.
We will also study how to find the derivatives of exponential and logarithmic functions.

Rules of Differentiation
Here we will study the basic rules for differentiating functions containing other functions
(chain rule), functions multiplied together (product rule), and functions divided by each
other (quotient rule).

Stationary Values
Here we will study the behaviour of derivatives around certain regions of a function. We
will see that it is possible to analyse a function from the way its derivatives behave
around a point on the function called a stationary point. We will also study how we may use
such analysis to help us sketch graphs, and also how to apply it in optimisation problems.

Differentiation of Implicit and Hyperbolic Functions


Here we will differentiate different forms of functions, as well as new trig functions.

Tangents and Normals


Here we will study the connection between the tangent to a curve and the normal to the
tangent. The normal to the tangent (or to any other line) is defined as the line
perpendicular to it.

Maclaurin and Taylor Series (if time allows)


Here we will study how various non-algebraic functions such as trig functions, e^x, ln x, etc...
can be transformed into equivalent infinite series. This requires the use of differentiation.

Theoretical Study of Limits (if time allows)


The concept of a derivative relies on the limiting process, in that the gradient of a secant of a function
approaches the gradient of the tangent to the function when δx approaches 0. As such the concept of limits is
important and we shall study this in more detail.
www.ucl.ac.uk/~uczlcfe
1.1 : A Historical Introduction to Calculus

1.1.1 : The two basic concepts of calculus

1.1.2 : Who invented calculus ?

1.2 : The Derivative Of A Function

1.2.1 : An algebraic definition of the derivative of a function


We saw in the notes on algebra (in the section on derived functions) that there was a way of
evaluating f(x+δx) from a combination of the original function f(x) and its derived functions, the
general expression being :

$$f(x + \delta x) = f(x) + \delta x\, f'(x) + \frac{(\delta x)^2}{2!} f''(x) + \frac{(\delta x)^3}{3!} f'''(x) + \frac{(\delta x)^4}{4!} f^{iv}(x) + \dots \qquad (1.2.1)$$

We then saw how to obtain the derived functions f′, f″, f‴, etc... from f(x). Although we only ever
derived this expression for polynomials, let us assume that it is valid for any function f(x) (this isn’t
strictly true, but for our purposes this assumption will suffice. In fact Lagrange, a French
mathematician, thought he had proved in 1797 that such an expansion was valid for any function
f(x), but we now know that this isn’t the case since there are functions which cannot be expanded in
the form above. More on this in the section on Taylor series).

Some basic algebra on (1.2.1) gives

$$\frac{f(x + \delta x) - f(x)}{\delta x} = f'(x) + \frac{\delta x}{2!} f''(x) + \frac{(\delta x)^2}{3!} f'''(x) + \frac{(\delta x)^3}{4!} f^{iv}(x) + \dots \qquad (1.2.2)$$

Now, δx is a positive number considered to be small. What happens if we let this number become
smaller and smaller and smaller ? What if we let it become so small that it approaches the value 0
without ever becoming 0 ? In doing this we are doing something very new. We are doing
something called taking limits. We are in fact going to “take the limit” of expression (1.2.2) “as δx
approaches 0”. In symbols we express this as lim_{δx→0}. Hence taking limits in
(1.2.2) we have

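$$\lim_{\delta x \to 0} \frac{f(x + \delta x) - f(x)}{\delta x} = \lim_{\delta x \to 0} \left[ f'(x) + \frac{\delta x}{2!} f''(x) + \frac{(\delta x)^2}{3!} f'''(x) + \frac{(\delta x)^3}{4!} f^{iv}(x) + \dots \right] \qquad (1.2.3)$$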

Then applying this limit to each term on the right hand side of (1.2.3) all terms will disappear
except the first term (since the first term does not depend on δx). As such we then define the
derivative of a function to be

$$\frac{df}{dx} = f'(x) = \lim_{\delta x \to 0} \frac{f(x + \delta x) - f(x)}{\delta x} \qquad (1.2.4)$$

Now, it may seem as if we have put δx = 0 in (1.2.3) in order to obtain (1.2.4) but we haven’t. And
we know we can’t put δx = 0 since if we did we would be dividing by 0 in the left hand side of
(1.2.3). Therefore we must always let δx → 0 (“δx approach 0”) without it equalling 0. It just so
happens that when we do this limiting process all the terms on the right hand side of (1.2.3), except
the first term, disappear.

Expression (1.2.4) is therefore the general expression for the function f′(x) derived from f(x). The
name given to this particular derived function is “derivative”. Hence f′(x) is the derivative of f(x).
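
As a quick numerical illustration of (1.2.2) and (1.2.4), the sketch below takes f(x) = x³ (a function chosen here purely for illustration, as are the function names) and compares the quotient [f(x + δx) − f(x)]/δx with the series on the right hand side of (1.2.2), using the derived functions 3x², 6x and 6. As δx shrinks, both values should approach f′(x).

```python
# Illustrative sketch: f(x) = x**3 is an assumed example, not taken from the notes.

def f(x):
    return x ** 3

def difference_quotient(x, dx):
    """Left-hand side of (1.2.2): [f(x + dx) - f(x)] / dx, for non-zero dx."""
    return (f(x + dx) - f(x)) / dx

def series_rhs(x, dx):
    """Right-hand side of (1.2.2) for f(x) = x**3.

    The derived functions are f'(x) = 3x^2, f''(x) = 6x, f'''(x) = 6,
    and all higher ones are zero, so the series stops after three terms.
    """
    return 3 * x ** 2 + (dx / 2) * (6 * x) + (dx ** 2 / 6) * 6

if __name__ == "__main__":
    x = 2.0
    for dx in (1.0, 0.1, 0.01, 0.001):
        # dx is made small but is never set to 0.
        print(dx, difference_quotient(x, dx), series_rhs(x, dx))
```

Only the first term of the series, 3x², does not depend on δx, which is why it is the only term that survives the limit.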

1.2.2 : A geometric definition of the derivative of a function
In fact, the concept and the definition of a derivative were originally developed geometrically. This
approach also overcomes the flaw mentioned in the previous definition of the derivative (namely
that not all functions can be expressed in the form of (1.2.1), whereas all functions can be studied in
the manner below).

To develop the definition of the derivative geometrically consider the diagram below :

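[Figure: the straight line y = mx + c, with the points x1, x2 marked on the x axis and the corresponding values y1, y2 marked on the y axis.]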

We should be familiar with how to calculate the slope or gradient of the line above. The
slope/gradient of a line is how steep the line is. We may therefore calculate this slope or gradient in
the usual manner as
$$\text{slope/gradient} = \frac{y_2 - y_1}{x_2 - x_1}$$

The slope is therefore the increase in y divided by the increase in x. Sometimes you may see this
described in words as ‘rise over run’. In a linear equation this slope/gradient is represented by the
coefficient ‘m’ in the equation y = mx + c.

However, how do we find the gradient of a curve such as the one below ? What is the gradient of
the curve at point P ? :

Since there are infinitely many lines which pass through P, which line do we use to find out how
steep the curve is at point P ? Our 1st aim is therefore to study an important idea which will lead to
our being able to find slopes of curves at specific points.

As such consider the function f(x) = x² as shown below, and consider the secant joining points P(1,
1) and Q(2, 4) on the curve :

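[Figure: the curve y = x² with the secant through P(1, 1) and Q(2, 4). On the x axis, x = 1 and x + δx = 2; on the y axis, y = 1 and y + δy = 4.]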

(A secant is a line passing through any two points on a curve.) We may find the gradient or slope of
this secant in the usual way. If we call the horizontal distance δx (read this as ‘delta x’) and the
vertical distance δy (read this as ‘delta y’), then the gradient/slope of the secant is represented by
δy/δx (read this as ‘delta y over delta x’) :

    P                  Q                         gradient
x    y      δx      x + δx    y + δy    δy       δy/δx

1    1      1       2         4         3        3

Now, if we keep point P fixed and move point Q towards P along the curve, then we will get
different secants from which we may calculate the gradient.

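[Figure: the curve near P, with Q a short distance from P along the curve; δx and δy are the horizontal and vertical distances from P to Q.]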

As we see from the diagram above, as Q gets closer to P so δy and δx become smaller. We can then
let Q get closer and closer and closer and closer to P. In fact we will let Q get as close as possible to
P without ever landing on top of P. In terms of notation we would say that Q → P, but Q ≠ P. This is
an extremely important concept : Q approaches P forever, but Q never equals P. This is the concept
of “limits”, and is fundamental to calculus : no “limits”, no calculus. Ultimately, as Q approaches P,
the secant will become what is called a tangent at P, in other words a line which touches the curve
only at point P :

[Figure: two panels. Left: a secant cutting the curve through P. Right: the tangent touching the curve at P.]

Theoretically, since Q ≠ P there will always be a secant (however microscopically small) for us to
be able to form δy/δx. We will then be able to find the gradient of the tangent at point P.

As an example, consider the function y = x², and let us form the gradient δy/δx of the secant for
smaller and smaller values of δx and δy. Various calculations of the gradient give :

    P                  Q                               gradient
x    y      δx       x + δx    y + δy      δy          δy/δx

1    1      1        2         4           3           3
1    1      0.5      1.5       2.25        1.25        2.5
1    1      0.1      1.1       1.21        0.21        2.1
1    1      0.01     1.01      1.0201      0.0201      2.01
1    1      0.001    1.001     1.002001    0.002001    2.001
.    .      .        .         .           .           .
.    .      .        .         .           .           .
.    .      .        .         .           .           .
1    1      → 0      → 1       → 1         → 0         → 2

So it seems that as Q → P, x + δx → x and the gradient of the secant gets closer and closer to 2. In
fact this limiting value, 2, is the gradient of the tangent to the curve at point P.

In general, the closer Q gets to P the better the gradient of the secant approximates the
gradient of the tangent.
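
The tabulation for y = x² with P fixed at (1, 1) can be reproduced with a few lines of Python. This is only a minimal sketch: the function names are illustrative and not part of the notes.

```python
# Reproduces the secant gradients for y = x**2 with P fixed at (1, 1).

def secant_gradient(f, x, dx):
    """Gradient of the secant through (x, f(x)) and (x + dx, f(x + dx))."""
    return (f(x + dx) - f(x)) / dx

def square(x):
    return x ** 2

if __name__ == "__main__":
    for dx in (1, 0.5, 0.1, 0.01, 0.001):
        gradient = secant_gradient(square, 1, dx)
        # dx is never set to 0: the quotient is undefined there.
        print(f"dx = {dx:<6} gradient = {gradient:.6f}")
```

Trying ever smaller values of dx shows the gradient settling towards 2, although no finite value of dx ever gives exactly 2.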

So

as Q → P, gradient of secant PQ → gradient of tangent at P

or

$$\lim_{Q \to P} (\text{gradient of secant } PQ) = \text{gradient of tangent at } P$$

i.e.

$$\lim_{\delta x \to 0} \frac{\delta y}{\delta x} = \text{gradient of tangent at } P$$

Hence the gradient of the tangent to y = x² at P = (1, 1) is

$$\lim_{\delta x \to 0} \frac{\delta y}{\delta x} = 2$$

In order to find gradients of tangents at different points on the curve y = x² we could repeat the
above tabulation for different points P. This would however be tedious. There is in fact a more
general way of finding the gradient of the tangent at any point on the curve. Consider therefore any
point P(x, y) on the curve y = x². Then let Q(x + δx, y + δy) be any other point on the curve such
that the x coordinate of P has increased by δx and the y coordinate of P has increased by δy. Our
table of coordinates now becomes

    P                  Q                                        gradient
x    y      δx      x + δx    y + δy       δy                   δy/δx

x    x²     δx      x + δx    (x + δx)²    (x + δx)² − x²       [(x + δx)² − x²] / δx

(note that the y coordinate of Q is found by substituting x + δx into the function y = x²). And again we
may simplify the expression in the last column using basic algebra to get

$$\frac{\delta y}{\delta x} = \frac{(x + \delta x)^2 - x^2}{\delta x} = \frac{x^2 + 2x\,\delta x + (\delta x)^2 - x^2}{\delta x} = 2x + \delta x$$

Then, taking limits we have

$$\lim_{\delta x \to 0} \frac{\delta y}{\delta x} = 2x$$

We therefore see that for any value of x the gradient of the curve y = x² is 2x. Because this is true
for any value of x, it is true for any point P on the curve.

Another example
Let us now consider finding the gradient of the function y = 2x² − x. The coordinates of P and Q will
be

P: (x, y) = (x, 2x² − x)
Q: (x + δx, y + δy) = (x + δx, 2(x + δx)² − (x + δx))

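Hence the gradient of the secant PQ on y = 2x² − x is

$$\frac{\delta y}{\delta x} = \frac{(y + \delta y) - y}{\delta x} = \frac{2(x + \delta x)^2 - (x + \delta x) - [2x^2 - x]}{\delta x}$$

$$= \frac{2x^2 + 4x\,\delta x + 2(\delta x)^2 - x - \delta x - 2x^2 + x}{\delta x}$$

$$= 4x + 2\,\delta x - 1$$

Then as δx → 0

$$\lim_{\delta x \to 0} \frac{\delta y}{\delta x} = 4x - 1$$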

We therefore see that for any value of x the gradient of the curve y = 2x² − x is 4x − 1. Because
this is true for any value of x, it is true for any point P on the curve.
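
The algebra for y = 2x² − x can be checked with exact rational arithmetic. The sketch below (function names are illustrative, and x = 3 is an arbitrary sample value) confirms that the gradient of the secant equals 4x + 2δx − 1 for any non-zero δx, so that only 4x − 1 remains as δx shrinks.

```python
from fractions import Fraction

def f(x):
    """The function from the example: y = 2x^2 - x."""
    return 2 * x ** 2 - x

def secant_gradient(x, dx):
    """delta y / delta x for a non-zero increment dx."""
    return (f(x + dx) - f(x)) / dx

if __name__ == "__main__":
    x = Fraction(3)  # arbitrary sample point
    for dx in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
        # Exact rational arithmetic, so the two printed values agree exactly.
        print(dx, secant_gradient(x, dx), 4 * x + 2 * dx - 1)
```

Fractions are used instead of floating point numbers so that the comparison is not blurred by rounding error.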

Example
See example 5b, p113 of Bostock and Chandler

Exercises
Refer to exercises 5a, p111 and exercises 5b, p113 of Bostock and Chandler.

Exercise on thinking mathematically


Repeat the above analysis on the following functions, taking P to be the point on each curve where x = 1

i) y = 2x²      ii) y = x³      iii) y = 2x³
iv) y = 3x³     v) y = 4x³

(If you know how to use the formula features of Microsoft Excel then it might be faster to use these,
otherwise use a calculator)

Is there any pattern you can spot between your answer to i) and the answer of the example above ?
Is there any pattern you can spot between your answers to ii), iii), iv), and v) ? What about
comparing the answers to i) and the example above with the answers to ii), iii), iv), and v) ? Is there
any recognisable pattern or effect that you can see ?
End of Exercise

Exercise on thinking mathematically
1) Consider analysing the difference between successive gradients as a point Q approaches a
point P. For the function y = 3x − 1 let us set the x-coordinate of point P as x = 1, and the x-coordinate
of point Q as x = 2. Then y = 2 and y + δy = 5, and the gradient of the line PQ is 3 :

    P            Q                gradient
x    y      x+δx    y+δy     [(y+δy) − y] / [(x+δx) − x]

1    2      2       5        3

Let us now move point Q towards point P in constant increments of 0.2 from x=2 to x=1 (see the
x+δx column in the table below). What happens to the difference between successive gradients ?

    P            Q                gradient                          difference
x    y      x+δx    y+δy     [(y+δy) − y] / [(x+δx) − x]       between gradients

1    2      2       5        3
                                                                0
            1.8     4.4      3
                                                                0
            1.6     3.8      3
                                                                0
            1.4     3.2      3
                                                                0
            1.2     2.6      3
                                                                0
            1       2        3

Question :
What does this difference tell us about the gradient and the way the gradient changes ? How ‘fast’
does the gradient change ? What is the steepness of the gradient ?

2) We can repeat the analysis for y = 2x² + x. Starting with the same x-coordinates for P and Q as above we have :

    P            Q                           1st difference       2nd difference
x    y      x+δx    y+δy     gradient    between gradients    between gradients

1    3      2       10       7
                                         0.4
            1.8     8.28     6.6                              0
                                         0.4
            1.6     6.72     6.2                              0
                                         0.4
            1.4     5.32     5.8                              0
                                         0.4
            1.2     4.08     5.4                              0
                                         0.4
            1       3        5

Question :
i) What does the 1st difference tell us about the gradient and the way the gradient changes ?
How ‘fast’ does the gradient change ? What is the steepness of the gradient ?

ii) What does the 2nd difference tell us about the gradient and the way the gradient changes ?
How ‘fast’ does the gradient change ? What is the steepness of the gradient ?

iii) What is it about the structure of the functions in 1) and 2) above which makes these
differences occur ? What is it about the aspect of the functions which makes the steepness of the
gradients change ?

3) Repeat the above analysis for the functions :

i) y = −x³ − 2x + 1      ii) y = x⁴ − 10

finding 1st differences, 2nd differences, 3rd differences, etc... (i.e. as many differences as
appropriate).

Question :
i) What does this difference tell us about the gradient and the way the gradient changes ? How
‘fast’ does the gradient change ? What is the steepness of the gradient ?

ii) What is it about the structure of the functions in 1), 2) and 3) above which makes these
differences occur ? What is it about the aspect of the functions which makes the steepness of the
gradients change ?

iii) Is there any pattern you can spot between the type of function being studied and the number
of differences which can be calculated from it ? How many difference columns would you need to
calculate for the general polynomial a₀xⁿ + a₁xⁿ⁻¹ + a₂xⁿ⁻² + .... + aₙ₋₁x + aₙ ? Why ?
End of Exercise
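
The difference tables in parts 1) and 2) of the exercise above can be generated rather than computed by hand. The following is a minimal sketch with illustrative function names. It stops at x = 1.2 because the final row of each table (Q at P) is a limiting value and cannot be computed as the gradient of a secant.

```python
def gradients_towards_p(f, x_p, x_q_values):
    """Secant gradients from P = (x_p, f(x_p)) to each Q with x-coordinate in x_q_values."""
    return [(f(x_q) - f(x_p)) / (x_q - x_p) for x_q in x_q_values]

def successive_differences(values):
    """Differences between neighbouring entries, earlier minus later (as in the tables)."""
    return [a - b for a, b in zip(values, values[1:])]

if __name__ == "__main__":
    # Q moves from x = 2 towards P at x = 1 in steps of 0.2, stopping before Q reaches P.
    x_q_values = [2.0, 1.8, 1.6, 1.4, 1.2]
    gradients = gradients_towards_p(lambda x: 2 * x ** 2 + x, 1.0, x_q_values)
    first = successive_differences(gradients)
    second = successive_differences(first)
    print([round(g, 6) for g in gradients])
    print([round(d, 6) for d in first])
    print([round(d, 6) for d in second])
```

Passing a different function as the first argument, and applying `successive_differences` repeatedly, gives as many difference columns as are needed for part 3).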

1.3 : Differentiation - A Formal Definition


The process of finding the gradient/slope of a function at a point is known as differentiation. Then,
in finding lim_{δx→0} δy/δx we are finding what is called the derivative of a function. The derivative is
therefore a general expression which allows us to find the gradient of the function at any specific
value of x.

Then, in collecting together the main ideas presented above we may formalise the process of
differentiation as follows, in other words we may state differentiation rigorously to be :

Given any function y = f(x), let P be a point on the curve of that function
such that the coordinate of P is (x, y). Then, let Q be a point whose position
results from an increase δx in the x coordinate of P. Point Q then has
coordinate (x + δx, f(x + δx)), and the corresponding increase in the y
coordinate when moving from point P to point Q will then be δy, as shown in
the diagram below :

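[Figure: the curve y = f(x) with P at (x, y) and Q at (x + δx, y + δy); δx and δy are the horizontal and vertical distances from P to Q.]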

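The gradient of the secant PQ is then

$$\text{gradient } PQ = \frac{\delta y}{\delta x} = \frac{f(x + \delta x) - f(x)}{(x + \delta x) - x} \qquad (*)$$

Therefore the gradient at point P itself is

$$\frac{dy}{dx} = \lim_{\delta x \to 0} \frac{\delta y}{\delta x} = \lim_{\delta x \to 0} \frac{f(x + \delta x) - f(x)}{\delta x} \qquad (1.3.1)$$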

The tangent to a function f(x) at P is therefore the line passing through point P whose gradient is
given by (1.3.1). Applying the process (1.3.1) to any function (as shown in the previous examples above) is known as
differentiating from 1st principles.

An extremely important (and sometimes conceptually difficult) point to understand is that a
function f(x) can have a value “in the limit” as x “approaches” a value a, even though the function
itself may not be calculable at x = a. In (1.3.1) above it is therefore the case that δy/δx has a limiting value as
δx → 0, even though δy/δx is undefined when δx = 0. More on this later when we get to study the
function (sin x)/x.

Examples
Differentiate the following functions from 1st principles :

i) y = −x + 3      ii) f(x) = −4x + 2x²      iii) y = x(x − 1)²

iv) y = (x^(3/2) − ½x^(1/2)) / (2x)      v) f(x) = 1/x

Exercises
Refer to exercises 5b, p113 of the core textbook of Bostock and Chandler

Exercise on reading mathematically
Interpret fully expressions (*) and (1.3.1) above. What do the expressions mean as a whole and what
do separate parts of the expressions mean ? What is the similarity and difference in meaning
between the denominator of (*) and the denominator of (1.3.1) ? What do these expressions tell us
about the way the gradient and the derivative are derived/calculated ?
End of Exercise

Exercise on thinking mathematically : Activity 1


1) In developing the expression for dy/dx in (1.3.1) above, and in all the diagrams previously
shown, we saw that point Q approached point P from the right hand side. But we can also
approach point P from the left hand side, as shown in the diagram below.

So, using diagram (ii) to help you visualise the situation :

---> develop the correct notation for coordinates (a), (b), and (c).
---> write down the gradient of the line QP
---> in the limiting process as point Q approaches point P, develop the correct
expression for the derivative of f(x), at point P.

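[Figure: two panels. Diagram i) : Q approaches P from the right, with P at (x, f(x)) and Q at (x + δx, f(x + δx)). Diagram ii) : Q approaches P from the left, with P at x-coordinate x; the x-coordinate of Q is labelled (a), the y-coordinate of Q is labelled (b), and the y-coordinate of P is labelled (c).]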

2) Similarly, in developing the expression for dy/dx in (1.3.1) above, and in all the diagrams
previously shown, we saw that there was only one point Q which approached point P, and
did so from the right hand side. But we can also approach point P from both the left and the
right hand side, i.e. we can have two points Q1 and Q2 on opposite sides of point P, as shown
in the diagram below.

So, using diagram (iii) to help you visualise the situation :

---> develop the correct notation for coordinates (a), (b), (c), and (d).
---> write down the gradient of the line Q1Q2
---> in the limiting process as points Q1 and Q2 approach point P, develop the correct
expression for the derivative of f(x), at point P

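[Figure: diagram iii) : Q1 and Q2 both approach P. P is at (x, f(x)), with Q1 to its left and Q2 to its right; the x-coordinates of Q1 and Q2 are labelled (a) and (b), and their y-coordinates are labelled (c) and (d) respectively.]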

3) Confirm that your answer in 2) can be derived algebraically by combining your answer to 1)
and expression (1.3.1)

End of Exercise

Exercise on thinking mathematically : Activity 2


This activity follows on from the previous activity 1 above

1) From ex 2) in the previous activity you developed an expression for the derivative of f(x), at
point P. Consider only the gradient of this expression, i.e. use only the expression without
taking the limit as δx → 0.

Then, consider a point P(x, y) on the curve of f(x) = x³. Using the expression for the gradient
Q1Q2 with f(x) = x³, with δx = 0.1, we get

$$\text{gradient } Q_1Q_2 = \frac{f(x + 0.1) - f(x - 0.1)}{0.2}$$

i) Use a graphics calculator or graph drawing software to draw the graph of

$$y = \frac{(x + 0.1)^3 - (x - 0.1)^3}{0.2}$$

ii) What is the shape of the graph in i) ? In other words, what family of functions does
this graph belong to ?

iii) Repeat i) and ii) using smaller values of δx, such as 0.05, 0.01, 0.005, etc...

iv) As δx gets smaller and smaller find the equation of the curve which your graphs of i)
and iii) seem to be approaching.

v) Use your answer in iv) to find the derivative of y with respect to x at the points on the
curve f(x) = x³ where x = 3, x = 2, x = 1, x = 0, x = −1, x = −2

2) Repeat all of 1) for the function f(x) = x².

3) Repeat all of 1) but this time using the expression for the derivative you found in 1) of
activity 1 above.
End of Exercise

Exercise on thinking mathematically : Activity 3


This activity follows on from the previous activity 1 above.

From ex 2) in activity 1 you developed an expression for the derivative of f(x), at point P. Use this
expression to find the derivative of the following function

i) y = −x + 3      ii) f(x) = −4x + 2x²      iii) f(x) = x(x − 1)²

iv) f(x) = 1/x


End of Exercise

1.3.1 : Other notations for derivatives

1.4 : Differentiating Basic Functions From 1st Principles

1.4.1 : The derivative of basic algebraic expressions


Differentiating functions from 1st principles can be algebraically long and
complicated. We therefore need a shorter, quicker way to differentiate polynomial functions. From
the examples and exercises shown above, a pattern might be seen to develop when it
comes to differentiating polynomial functions, namely that :

given y = ax^n

then “the derivative of y w.r.t. x” is given by

$$\frac{dy}{dx} = n\,a\,x^{n-1}$$

In fact this can be shown to be the case simply by differentiating from 1st principles the expression
y = ax^n (a is a constant) as follows : the coordinates of points P and Q will be

P: (x, y) = (x, ax^n)
Q: (x + δx, y + δy) = (x + δx, a(x + δx)^n)

Hence the gradient of the secant PQ on y = ax^n is

$$\frac{\delta y}{\delta x} = \frac{(y + \delta y) - y}{\delta x} = \frac{a(x + \delta x)^n - ax^n}{\delta x}$$

We may then expand a(x + δx)^n using the binomial theorem to get

$$\frac{\delta y}{\delta x} = \frac{a\left[x^n + n\,x^{n-1}\,\delta x + \frac{n(n-1)}{2!}\,x^{n-2}\,(\delta x)^2 + \dots\right] - ax^n}{\delta x}$$

$$= a\,n\,x^{n-1} + \text{terms in } \delta x \text{ or higher order}$$

Then as δx → 0

$$\frac{dy}{dx} = \lim_{\delta x \to 0} \frac{\delta y}{\delta x} = a\,n\,x^{n-1} \qquad (1.4.1)$$

This is known as the power rule for differentiation.
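
The power rule (1.4.1) can be compared against the gradient of the secant for one sample case. In the sketch below the values a = 3, n = 5 and x = 2 are arbitrary choices made for illustration, as are the function names. Exact fractions are used so that the gap between the secant gradient and n·a·x^(n−1) is not disturbed by rounding.

```python
from fractions import Fraction

def secant_gradient(a, n, x, dx):
    """delta y / delta x for y = a * x**n with a non-zero increment dx."""
    return (a * (x + dx) ** n - a * x ** n) / dx

def power_rule(a, n, x):
    """The derivative predicted by (1.4.1): n * a * x**(n - 1)."""
    return n * a * x ** (n - 1)

if __name__ == "__main__":
    a, n, x = 3, 5, Fraction(2)  # assumed sample values
    for k in range(1, 6):
        dx = Fraction(1, 10 ** k)
        gap = secant_gradient(a, n, x, dx) - power_rule(a, n, x)
        print(f"dx = {float(dx):<8} secant - power rule = {float(gap):.8f}")
```

The gap consists exactly of the “terms in δx or higher order”, so it shrinks roughly in proportion to δx.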

1.4.2 : The derivative of trig functions


Before we can differentiate sin x and cos x from 1st principles we need to study two important
limits involving these trig functions. For sin x the relevant limit is

$$\lim_{x \to 0} \frac{\sin x}{x}$$

the graph of which is shown below :

Despite our intuition, which might suggest that f(x) has no answer at x = 0, the graph shows us that
the function approaches the value +1 and hence it has a limit at x = 0. To see this let us study the
behaviour of this limit by studying numerically the value which the function approaches when x
approaches 0 from both the left and right hand side of x = 0 :

x f(x)

-0.2 0.99335
-0.1 0.99833
-0.05 0.99958
-0.01 0.99998
-0.005 0.99999
0 ?
0.005 0.99999
0.01 0.99998
0.05 0.99958
0.1 0.99833
0.2 0.99335

From the table it therefore seems that

$$\lim_{x \to 0} \frac{\sin x}{x} = 1$$

and this is indeed the case. This result may be proved using geometry : See lecture/tutorial.
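
The table of values of (sin x)/x can be regenerated with Python's standard `math` module, which works in radians. This is a minimal sketch and the function name is illustrative. Values are printed rounded to five decimal places, so the last digit may differ slightly from a table whose entries were truncated.

```python
import math

def sin_ratio(x):
    """sin(x) / x for non-zero x measured in radians."""
    if x == 0:
        raise ValueError("sin(x)/x is not defined at x = 0")
    return math.sin(x) / x

if __name__ == "__main__":
    for x in (-0.2, -0.1, -0.05, -0.01, -0.005, 0.005, 0.01, 0.05, 0.1, 0.2):
        print(f"{x:>7} {sin_ratio(x):.5f}")
```

The function deliberately refuses x = 0: the ratio has a limit there but no value.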

1.4.3 : Limit of (cos x − 1) / x
Another important limit, this time needed for differentiating y = cos x from 1st principles, is that of

$$\lim_{x \to 0} \frac{\cos x - 1}{x}$$

the graph of which is shown below :

and again we see that the limit does exist even though algebraically it looks as if there is a division
by 0 at x = 0. Hence, again, the function has a limit at x = 0 since the values of the function
approach 0. To see this, let us study the behaviour of this limit by studying numerically the value
which the function approaches when x approaches 0 from both the left and right hand side of x = 0 :

x f(x)

-0.1 0.04996
-0.05 0.02499
-0.01 0.00499
-0.005 0.00249
0 ?
0.005 -0.00249
0.01 -0.00499
0.05 -0.02499
0.1 -0.04996

Hence we see that

$$\lim_{x \to 0} \frac{\cos x - 1}{x} = 0 \qquad (*)$$

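To prove this we can use the result that (sin x)/x → 1 as x → 0, as well as a trig identity. Using
cos 2x = 1 − 2 sin² x with x replaced by x/2, we obtain

$$\cos x = 1 - 2\sin^2(x/2) \;\Rightarrow\; \cos x - 1 = -2\sin^2(x/2)$$

Substituting into the left hand side of (*) we obtain

$$\lim_{x \to 0} \frac{\cos x - 1}{x} = \lim_{x \to 0} \frac{-2\sin^2(x/2)}{x}$$

Dividing numerator and denominator by 2, and factorising out a sin(x/2) :

$$= \lim_{x \to 0} \left[ \frac{-\sin(x/2)}{x/2} \cdot \sin(x/2) \right]$$

Taking limits

$$= \lim_{x \to 0} \frac{-\sin(x/2)}{x/2} \cdot \lim_{x \to 0} \sin(x/2) = (-1)(0) = 0$$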
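
Both the table of values and the half-angle step of the proof above can be checked numerically. The sketch below (function names are illustrative) evaluates (cos x − 1)/x directly and also in the factorised form −[sin(x/2)/(x/2)]·sin(x/2). The two should agree for every non-zero x, and both should shrink towards 0 with x.

```python
import math

def cos_ratio(x):
    """(cos(x) - 1) / x for non-zero x measured in radians."""
    if x == 0:
        raise ValueError("(cos(x) - 1)/x is not defined at x = 0")
    return (math.cos(x) - 1) / x

def cos_ratio_via_half_angle(x):
    """The same quantity written as -[sin(x/2) / (x/2)] * sin(x/2)."""
    half = x / 2
    return -(math.sin(half) / half) * math.sin(half)

if __name__ == "__main__":
    for x in (-0.1, -0.05, -0.01, -0.005, 0.005, 0.01, 0.05, 0.1):
        print(f"{x:>7} {cos_ratio(x):.6f} {cos_ratio_via_half_angle(x):.6f}")
```

For very small x the factorised form is also the safer one to compute with, since cos x − 1 subtracts two nearly equal floating point numbers.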

For most basic functions putting x = 0 into an expression and having x → 0 in the same expression will
give the same answer. However, we now see that there is a mathematical difference between putting
x = 0 into (*) or into the limit of (sin x)/x, and having x → 0 in them. The two processes do not give the same
answer. Division by 0 is impossible (or rather, it is not defined), whereas letting x forever approach 0
as closely as possible without ever equalling 0 is possible, and gives us whatever answer it gives.

There is therefore a class of functions which, although they cannot be evaluated at specific values, say x
= a, b, c, etc..., do have an answer as x approaches those values, i.e. they do have an
answer as x → a, b, c. This is the great conceptual leap which took some of the best mathematicians
(from the time Newton and Leibniz invented calculus) 200 years to understand : namely the idea
that a function can have a limiting value (i.e. can have an answer as x → a) even though it cannot be evaluated
at that value (i.e. does not have an answer when x = a). By contrast, when the function can be evaluated
at x = a and its limiting value agrees with that evaluation, the function is said to be continuous at that value, i.e.

The limit of f(x) (as x approaches the value a) is actually equal to f(a)
i.e.
$$\lim_{x \to a} f(x) = f(a)$$

1.4.4 : Derivative of sin x


Now we are in a position to differentiate f(x) = sin x from 1st principles :

$$\frac{d}{dx}(\sin x) = \lim_{\delta x \to 0} \frac{\sin(x + \delta x) - \sin(x)}{\delta x}$$

We may now use the factor formula of trig identities to factorise the numerator. In other words :

$$\sin(x + \delta x) - \sin(x) \equiv 2\cos(x + \delta x/2)\,\sin(\delta x/2)$$

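Hence our derivative becomes :

$$\frac{d}{dx}(\sin x) = \lim_{\delta x \to 0} \frac{2\cos(x + \delta x/2)\,\sin(\delta x/2)}{\delta x}$$

$$= \lim_{\delta x \to 0} \cos(x + \delta x/2) \cdot \frac{\sin(\delta x/2)}{\delta x/2}$$

Then, for x measured in radians we have

$$\lim_{\delta x \to 0} \cos(x + \delta x/2) = \cos x \quad \text{and} \quad \lim_{\delta x \to 0} \frac{\sin(\delta x/2)}{\delta x/2} = 1 \qquad (*)$$

Hence

$$\frac{d}{dx}(\sin x) = \cos x$$

Similarly we can show that

$$\frac{d}{dx}(\cos x) = -\sin x$$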

However, these derivatives are only valid if the angle x is measured in radians, since the limit
lim_{h→0} sin(h)/h = 1 used in (*) only holds if h is measured in radians. In general any limit involving a trig
function divided by a small increment (δx or h) will need to have its angle measured in radians.
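
The dependence on radians can be seen numerically. In the sketch below (function names are illustrative) the ratio sin(h)/h is evaluated once with h in radians and once with h in degrees. In degrees the ratio settles near π/180 ≈ 0.01745 rather than 1, which is why the derivative of sin x is cos x only when x is in radians.

```python
import math

def ratio_radians(h):
    """sin(h) / h with h in radians."""
    return math.sin(h) / h

def ratio_degrees(h_deg):
    """sin(h) / h with h in degrees (converted to radians only inside the sine)."""
    return math.sin(math.radians(h_deg)) / h_deg

if __name__ == "__main__":
    for h in (1.0, 0.1, 0.01, 0.001):
        print(h, ratio_radians(h), ratio_degrees(h))
    print("pi/180 =", math.pi / 180)
```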

Examples
See lecture and refer to examples 8a, p256 of Bostock and Chandler core textbook.

Exercises
Refer to exercises 8a, p256 of Bostock and Chandler core textbook.

Exercise on thinking mathematically


1) The proofs of the derivatives of sin x and cos x above used the factor formula. However, we
may use the compound angle formula for trig identities to expand sin(x + δx). Similarly we may do so for
cos(x + δx) when differentiating cos x from 1st principles. Hence, perform the expansion of
sin(x + δx) and cos(x + δx) to obtain the derivatives of sin x and cos x.

End of Exercise

1.4.5 : The exponential function


General exponential functions are ones where the power is not a constant but a variable. Hence

2^x , 10^(x+1) , 5^(−3x) + 2

etc.... Plotting y = 2^x for various values of x we get

From the graph we make the following observations about the behaviour of this function :

i) domain and range : the function is defined for all values of x and is always positive, i.e. f(x) > 0, ∀x
ii) the value of f(x) increases ever more rapidly as x increases.
iii) the function approaches 0 in the negative x direction but never actually equals 0, i.e.
f(x) → 0 as x → −∞.

We will now introduce a new and important function called the exponential function e^x. This
function is important because it is used ‘everywhere’ in mathematics and science. In applications it is
used to model growth and decay processes such as the financial calculation of compound interest, radioactive decay, the
decay of charge on the capacitor of an electrical circuit, and the damping of car suspension systems,
amongst other applications. In mathematics it is used, amongst other things, to define new
functions, related to the trigonometric functions, called hyperbolic functions.

Compared to the graphs above, the graph of ‘the’ exponential function is as shown by the solid line:

In y = a^x the value of a which gives this function is a = 2.718281828.... and is given the symbol e.

Exercise on developing mathematical thinking


Consider the following general exponential function a^(x+b) + c

i) For various values of a, b, and c plot the general exponential function.


ii) Describe the behaviour of this function for each of the different values of a, b, and c chosen.
How does the curve of the function change when these values are changed ? What is the
effect of a, b, and c on the function when any/all of these coefficients are negative ?
iii) Formalise your description of ii) above into a mathematical description (as done in
section 1.4.5 on exponentials above).
iv) Compare your ii) with your iii). Make sure that every relevant issue is described formally
and make sure that your formal description is complete.

End of Exercise

1.4.6 : The derivative of the basic exponential function


Let us consider how to differentiate exponential functions. As such, consider the general
exponential function y = a^x, a ∈ ℝ, a > 0. From the definition of the derivative of y with respect to x we
have :

$$\frac{dy}{dx} = \frac{d}{dx}(a^x) = \lim_{\delta x \to 0} \frac{a^{x + \delta x} - a^x}{\delta x} = a^x \cdot \lim_{\delta x \to 0} \frac{a^{\delta x} - 1}{\delta x}$$

Note that we can factorise out a^x since this is independent of δx. All we have to do now is to evaluate
the limit. A theoretical discussion of this is beyond the scope of our course, so instead let us look at
a numerical/conceptual argument. Let us see what happens when, for a very small value of δx, we
evaluate the quotient inside the limit for various values of a :

a      (a^δx − 1)/δx with δx = 0.00001

1      0
2      0.69315
3      1.098618
4      1.386304

The table suggests that as δx → 0, there is a value of a between 2 and 3 for which the limit

$$\lim_{\delta x \to 0} \frac{a^{\delta x} - 1}{\delta x}$$

is 1.

If we can find this value of a then the derivative of y = a^x would simply be a^x itself.
Numerically speaking we could continue trying various values of a between 2 and 3 to see which
value would give us an answer to the limit which would be closer and closer to 1.
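
The numerical search described above can be automated. The sketch below recomputes the quotient (a^δx − 1)/δx for δx = 0.00001 and then uses bisection, a method brought in here for illustration and not one used in the notes, to home in on the value of a between 2 and 3 for which the quotient equals 1. Function names are illustrative.

```python
def exponential_quotient(a, dx):
    """(a**dx - 1) / dx for a > 0 and a small non-zero dx."""
    return (a ** dx - 1) / dx

def find_base(dx, low=2.0, high=3.0, steps=60):
    """Bisection for the base a in [low, high] whose quotient equals 1.

    Assumes the quotient increases with a, as the table suggests.
    """
    for _ in range(steps):
        mid = (low + high) / 2
        if exponential_quotient(mid, dx) < 1:
            low = mid
        else:
            high = mid
    return (low + high) / 2

if __name__ == "__main__":
    dx = 0.00001
    for a in (1, 2, 3, 4):
        print(a, round(exponential_quotient(a, dx), 6))
    print("base with quotient 1:", find_base(dx))
```

Because δx is small but not 0, the base found this way is only an approximation to the limiting value.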

Thus if
(a^δx − 1)/δx ≈ 1 for small δx
then
a^δx − 1 ≈ δx for small δx
or
a^δx ≈ 1 + δx for small δx
i.e.
a ≈ (1 + δx)^(1/δx) for small δx

In the limit we have

$$a = \lim_{\delta x \to 0} (1 + \delta x)^{1/\delta x} = 2.718281828459\ldots$$

This number is in fact given the symbol e (a notation introduced by Euler, a Swiss mathematician
who studied it extensively as an important value in mathematics), and the corresponding function is
therefore f(x) = e^x.
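
The limiting value can be approached numerically by evaluating (1 + δx)^(1/δx) for a sequence of shrinking δx. This is a minimal sketch with an illustrative function name, and `math.e` is used only as a reference value for comparison.

```python
import math

def base_estimate(dx):
    """(1 + dx) ** (1 / dx): the value of a suggested by a**dx ≈ 1 + dx."""
    return (1 + dx) ** (1 / dx)

if __name__ == "__main__":
    for k in range(1, 8):
        dx = 10.0 ** -k
        estimate = base_estimate(dx)
        print(f"dx = {dx:<8g} estimate = {estimate:.9f}  error = {math.e - estimate:.2e}")
```

The error shrinks roughly in proportion to δx until floating point rounding starts to interfere, which is why the loop stops at a moderate value of k.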

Therefore when a = e we have

$$\frac{d}{dx}(e^x) = e^x$$

where e^x is ‘the’ exponential function. So the derivative of the exponential function is the
exponential function itself. We will leave the differentiation of the more general exponential
function y = a^x until later.

Examples
See lecture and refer to examples 8b, p260 of Bostock and Chandler core textbook.

Exercises
Refer to exercises 8b, p261 of Bostock and Chandler core textbook.

1.4.7 : Some history of the exponential number/function


The number e was first “discovered” through a study of compound interest. In 1683 Jacob Bernoulli
looked at the problem of compound interest and, in examining continuous compound interest, he
tried to find the limit of (1 + 1/n)^n as n tends to infinity. He used the binomial theorem to show that
the limit had to lie between 2 and 3, so we could consider this to be the first approximation found to
e. Also if we accept this as a definition of e, it is the first time that a number was defined by a
limiting process. He certainly did not recognise any connection between his work and that on
logarithms.

As far as we know the first time the number e appears in its own right is in 1690. In that year
Leibniz wrote a letter to Huygens and in this he used the notation b for what we now call e. Later,
Johann Bernoulli began the study of the calculus of the exponential function in 1697 when he
published Principia calculi exponentialium seu percurrentium. The work involves the calculation
of various exponential series and many results are achieved with term by term integration.

It is Euler who suggested the notation e. The claim which has sometimes been made, however, that
Euler used the letter e because it was the first letter of his name is not true. It is probably not even
the case that the e comes from "exponential", but it may just have been the next vowel after "a", and
Euler was already using the notation "a" in his work. Whatever the reason, the notation e made its
first appearance in a letter Euler wrote to Goldbach in 1731. He made various discoveries regarding
e in the following years, but it was not until 1748 when Euler published Introductio in Analysin
infinitorum that he gave a full treatment of the ideas surrounding e. He showed that

e = 1 + 1/1! + 1/2! + 1/3! + ...

and that e is the limit of $(1 + 1/n)^n$ as n tends to infinity. Euler gave an approximation for e to 18
decimal places,

e = 2.718281828459045235

without saying where this came from. It is likely that he calculated the value himself, but if so there
is no indication of how this was done. In fact taking about 20 terms of 1 + 1/1! + 1/2! + 1/3! + ...
will give the accuracy which Euler gave. (*Summary/excerpts taken from the website
http://www-history.mcs.st-andrews.ac.uk/HistTopics/e.html*)
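The two descriptions of e quoted in this history (the series and the limit of $(1 + 1/n)^n$) can be compared with exact rational arithmetic. The sketch below is illustrative only; the helper names and the choice of 22 terms (comfortably more than the "about 20" mentioned above) are assumptions made here.

```python
from fractions import Fraction

def e_series(terms):
    """Exact partial sum 1 + 1/1! + 1/2! + ... using the given number of terms."""
    total = Fraction(0)
    factorial = 1
    for k in range(terms):
        if k > 0:
            factorial *= k          # factorial now holds k!
        total += Fraction(1, factorial)
    return total

def truncated_decimal(value, places):
    """Render a positive Fraction truncated (not rounded) to `places` decimals."""
    scaled = value.numerator * 10**places // value.denominator
    digits = str(scaled)
    return digits[:-places] + "." + digits[-places:]

def compound(n):
    """Exact value of (1 + 1/n)**n as a Fraction."""
    return Fraction(n + 1, n) ** n

if __name__ == "__main__":
    print(truncated_decimal(e_series(22), 18))
    print(float(compound(1000)))
```

The series converges far faster than the compound-interest limit, which is one reason it is the practical route to many decimal places.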

1.4.8 : The log function


Here I will summarise the properties of logs when they are seen as functions. General log
functions are ones of the form

log x, log(2x + 1)

etc. Plotting y = a log x for various values of a we get

[Figure: graphs of y = a log x for various values of a]

From the graph we make the following observations about the behaviour of this function :

i) domain and range : the function produces unlimited values but only for positive
values of x, i.e. −∞ < f(x) < ∞, x > 0
ii) the value of f(x) increases ever more slowly as x increases.
iii) the function approaches −∞ in the negative y direction but never crosses the y axis,
i.e. f(x) → −∞ as x → 0.

We will now introduce an important variation of the log function called the natural log function and
symbolised by y = ln x. This function is important because it is also used ’everywhere’ in
mathematics and science since it defines the inverse of the exponential function, and thus helps us
solve problems involving exponentials.

Compared to the graphs above, the graph of the natural log function is

[Figure: graph of y = ln x]

In y = a log x the value of a which gives this function is a = ln 10 = 2.302585093..., since
ln x = log x / log e and log e = 0.434294481....
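The relationship between ln x and the base-10 log can be checked numerically. The short sketch below assumes that "log" in these notes means the base-10 logarithm; the names A and natural_log_from_common are chosen here for illustration.

```python
import math

# Scale factor a for which ln x = a * log10(x); it equals ln 10 = 1 / log10(e).
A = math.log(10)

def natural_log_from_common(x):
    """Rebuild ln x from the base-10 logarithm by scaling with A."""
    return A * math.log10(x)

if __name__ == "__main__":
    print("a =", A, " 1/a =", 1.0 / A)
    for x in (0.5, 2.0, 10.0):
        print(x, natural_log_from_common(x), math.log(x))
```

The reciprocal 1/a is the constant 0.434294481..., i.e. log e.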

Exercise on developing mathematical thinking


Consider the following general log function y = a log(x + b) + c

i) For various values of a, b, and c plot the general log function.


ii) Describe the behaviour of this function for each of the different values of a, b, and c chosen.
How does the curve of the function change when these values are changed ? What is the
effect of a, b, and c on the function when any/all of these coefficients are negative ?
iii) formalise your description of ii) above into a mathematical description (as done in
section 1.4.1 on exponentials above)
iv) compare your ii) with your iii). Make sure that every relevant issue is described formally
and make sure that your formal description is complete.
End of Exercise

1.4.9 : The derivative of ln x
We will now study how to differentiate y = ln x (because, of all the log functions, the derivative of
this is the easiest to obtain). We will leave the derivative of the general log function y = log x until
later (but note that, by standard log rules we can always transform ln x to log x).

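In order to obtain the derivative of y = ln x we could start from 1st principles and set up :

$$\frac{dy}{dx} = \lim_{\delta x \to 0} \frac{\ln(x + \delta x) - \ln x}{\delta x}$$

The problem with this is that we do not know how to expand $\ln(x + \delta x)$. So it seems as if we are
stuck ! But not really. Using rules of logs we can change the numerator so that the expression above
becomes

$$\begin{aligned}
\frac{dy}{dx} &= \lim_{\delta x \to 0} \frac{\ln\left(\frac{x + \delta x}{x}\right)}{\delta x} \\
&= \lim_{\delta x \to 0} \frac{1}{\delta x} \ln\left(\frac{x + \delta x}{x}\right) \\
&= \lim_{\delta x \to 0} \frac{1}{\delta x} \ln\left(1 + \frac{\delta x}{x}\right)
\end{aligned}$$

Now perform the trick of multiplying by x/x :

$$\begin{aligned}
\frac{dy}{dx} &= \frac{x}{x} \lim_{\delta x \to 0} \frac{1}{\delta x} \ln\left(1 + \frac{\delta x}{x}\right) \\
&= \frac{1}{x} \lim_{\delta x \to 0} \frac{x}{\delta x} \ln\left(1 + \frac{\delta x}{x}\right)
\end{aligned}$$

Now, the reason we multiplied by x/x was so that we could now use the rules of logs again to get

$$\begin{aligned}
\frac{dy}{dx} &= \frac{1}{x} \lim_{\delta x \to 0} \ln\left(1 + \frac{\delta x}{x}\right)^{x/\delta x} \\
&= \frac{1}{x} \ln\left[\lim_{\delta x \to 0} \left(1 + \frac{\delta x}{x}\right)^{x/\delta x}\right]
\end{aligned}$$

where the limit may be moved inside the log because ln is a continuous function. But

$$\lim_{\delta x \to 0} \left(1 + \frac{\delta x}{x}\right)^{x/\delta x} = e$$

(if you can’t see why then let u = δx/x, then we get $\lim_{u \to 0} (1 + u)^{1/u}$ which we know is e), so we have

$$\frac{dy}{dx} = \frac{1}{x} \ln e = \frac{1}{x} \qquad (1.4.2)$$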
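The result (1.4.2) can be checked numerically by evaluating the first-principles quotient for a small δx. The sketch below is an illustration under assumed names and an assumed step size of 1e-6; it is not part of the derivation.

```python
import math

def ln_difference_quotient(x, dx):
    """First-principles quotient (ln(x + dx) - ln x) / dx for a small dx."""
    return (math.log(x + dx) - math.log(x)) / dx

def inner_limit(x, dx):
    """Evaluate (1 + dx/x)**(x/dx), the expression that tends to e as dx -> 0."""
    return (1.0 + dx / x) ** (x / dx)

if __name__ == "__main__":
    for x in (0.5, 1.0, 4.0):
        print(x, ln_difference_quotient(x, 1e-6), 1.0 / x, inner_limit(x, 1e-6))
```

For each x the quotient sits close to 1/x, and the inner expression sits close to e regardless of x.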

There is another, much easier, way in which we can show that the derivative of ln x is 1/x. We do
this by converting y = ln x into exponential form. So, if

y = ln x (i)    then    $x = e^y$ (ii)

This last expression is x(y), i.e. a function of y. We can now differentiate this function w.r.t. y :

$$\frac{dx}{dy} = e^y = x \qquad \text{(iii)}$$

Do not get confused by the apparent change in the order of the variables : (i) is a function of x so
we differentiate it w.r.t. the variable x; (ii) is a function of y so we differentiate it w.r.t. the variable y. So,
both functions are still differentiated w.r.t. their own independent variable.

However, we require the derivative of y w.r.t. x (i.e. we want dy/dx), not the derivative of x w.r.t. y
(i.e. not dx/dy). So what relationship can we find between

$$\frac{dy}{dx} \quad \text{and} \quad \frac{dx}{dy} \; ?$$

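Well, from the basic definition of the derivative we have :

$$\begin{aligned}
\frac{dy}{dx} &= \lim_{\delta x \to 0} \frac{y(x + \delta x) - y(x)}{\delta x} = \lim_{\delta x \to 0} \frac{\delta y}{\delta x} \\
&= \lim_{\delta x \to 0} \frac{1}{\delta x / \delta y} \\
&= \frac{1}{\lim_{\delta y \to 0} \delta x / \delta y} \\
&= \frac{1}{dx/dy} \qquad (1.4.3)
\end{aligned}$$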

So from (1.4.3) above we see that dx/dy is the reciprocal of dy/dx. Hence (iii) becomes

$$\frac{dy}{dx} = \frac{1}{x}$$

and the derivative of y = ln x is dy/dx = 1/x (see appendix for alternative way of finding the
derivative of ln x).
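The reciprocal relationship (1.4.3) can also be checked numerically: estimate dx/dy for $x = e^y$ with a small step in y, take its reciprocal, and compare with 1/x. The function names and the step size below are assumptions made for this sketch.

```python
import math

def dx_dy(y, dy=1e-6):
    """Forward-difference estimate of dx/dy for x = e**y."""
    return (math.exp(y + dy) - math.exp(y)) / dy

def dy_dx_via_reciprocal(x):
    """Estimate dy/dx for y = ln x as the reciprocal of dx/dy, as in (1.4.3)."""
    y = math.log(x)
    return 1.0 / dx_dy(y)

if __name__ == "__main__":
    for x in (0.5, 2.0, 10.0):
        print(x, dy_dx_via_reciprocal(x), 1.0 / x)
```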

We can now see how to find the derivative of the more general log function $y = \log_a x$. From rules
of logs we have :

$$y = \log_a x = \frac{\ln x}{\ln a} = \frac{1}{\ln a} \cdot \ln x$$

Hence

$$\frac{dy}{dx} = \frac{1}{\ln a} \cdot \frac{1}{x} = \frac{1}{x \ln a}$$
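As a quick numerical check of this formula, the sketch below compares 1/(x ln a) with a first-principles quotient for a few bases. The function names, sample values and step size are illustrative assumptions.

```python
import math

def log_a_derivative(x, a):
    """Formula from the text: d/dx log_a(x) = 1 / (x ln a)."""
    return 1.0 / (x * math.log(a))

def log_a_difference_quotient(x, a, dx=1e-6):
    """First-principles quotient for log_a(x) with a small dx."""
    return (math.log(x + dx, a) - math.log(x, a)) / dx

if __name__ == "__main__":
    for x, a in ((2.0, 10.0), (5.0, 2.0), (0.5, 3.0)):
        print(x, a, log_a_derivative(x, a), log_a_difference_quotient(x, a))
```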

1.4.10 : Differentiating $y = a^x$


We may now study how to differentiate the more general exponential $y = a^x$. We may rewrite
this function in terms of the exponential as

$$y = e^{\ln a^x} \qquad \text{(iv)}$$

$$y = e^{x \ln a} \qquad \text{(v)}$$

where, for (v), we have used the standard rule of powers for logs. We now know how to differentiate
(v) since this is just the exponential function. However, because of the function of a function nature
of (v), we differentiate it by the chain rule with u = x ln a :

$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx} = e^{x \ln a} \cdot \frac{d}{dx}(x \ln a) \qquad \text{(vi)}$$

Since ln a is a constant, (vi) easily becomes :

$$\frac{dy}{dx} = e^{x \ln a} \cdot \ln a = a^x \ln a \qquad \text{(vii)}$$

Hence the derivative of the general exponential function $y = a^x$ is $dy/dx = a^x \ln a$.
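The result (vii) can be checked numerically in the same way as the earlier derivatives. The following sketch uses assumed function names, sample bases and a step size of 1e-6; it also confirms the rewriting step (v).

```python
import math

def general_exp_derivative(x, a):
    """Formula (vii) from the text: d/dx a**x = a**x * ln a."""
    return a**x * math.log(a)

def general_exp_difference_quotient(x, a, dx=1e-6):
    """First-principles quotient for a**x with a small dx."""
    return (a ** (x + dx) - a**x) / dx

if __name__ == "__main__":
    for x, a in ((0.0, 2.0), (1.5, 3.0), (-1.0, 10.0), (2.0, 0.5)):
        # Step (v): a**x and e**(x ln a) are the same number.
        print(a**x, math.exp(x * math.log(a)))
        print(general_exp_derivative(x, a), general_exp_difference_quotient(x, a))
```

Note that for 0 < a < 1 the factor ln a is negative, so the derivative is negative and the function is decreasing.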

Examples
See lecture and refer to examples 8d, p264 of Bostock and Chandler core textbook.

Exercises
Refer to exercises 8d, p264 of Bostock and Chandler core textbook.

1.4.11 : Some history of the log function


Logarithms were invented by John Napier, a Scottish mathematician, in 1614 as a way of
simplifying mathematical calculations involving powers. Thus equations involving powers could be
simplified to equations containing only multiplications, and equations involving multiplications
could be simplified to equations containing only additions.

However, the logarithms he invented differed from the common logarithms we now use today in the
sense that he defined logarithms as a ratio of two distances in a geometric form (just as the basic
definitions of sin x, cos x and tan x are ratios of two sides of a triangle), as opposed to the current
definition of logarithms which involves the manipulation of exponents (powers). The possibility of
defining logarithms as exponents was recognized by John Wallis in 1685 and by Johann Bernoulli
in 1694.

However, this way of viewing and using logs still implied that logs and exponents were
fundamentally seen as arithmetic operations used to reduce powers to multiplications and
multiplications to additions, not as functions. The idea of deducing that $t = \log_a x$ can be derived
from $x = a^t$, and therefore of thinking of logs as a function (which is the inverse of $x = a^t$), involves
a much later way of thinking about logs.

It would be fair to say that Johann Bernoulli began the study of the analysis of exponential
functions in 1697, and as such generalised the study of logs as functions. It may have been Jacob
Bernoulli who first understood the way that the log function is the inverse of exponential functions.
On the other hand the first person to make the connection between logarithms and exponents may
well have been James Gregory. In 1684 he certainly recognised the connection between logarithms
and exponents, but he may not have been the first to do so.

Consequently, the log function can be defined either as an arithmetic technique to simplify the
solution of equations, or as a function which is the inverse of the exponential function.
(*Summary/excerpts of the history of logs taken from the websites
http://www-gap.dcs.st-and.ac.uk/~history/HistTopics/e.html and http://www.sosmath.com/algebra/logs/log1/log1.html*)

Exercise on developing mathematical thinking
So far we have been able to differentiate all the functions we have encountered. However, not all
functions can be differentiated. Remember that the derivative of a function f(x) is given by

$$\frac{df}{dx} = \lim_{\delta x \to 0} \frac{f(x + \delta x) - f(x)}{\delta x}$$

and this derivative can only be found at a point x = c if the limit exists at the point c itself. If the
limit does exist then f(x) is said to be differentiable at x = c, otherwise f(x) is not differentiable at x = c.
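To see what it means for this limit to fail to exist, the quotient can be evaluated with a small positive and a small negative δx and the two results compared. The sketch below uses |x| at x = 0 and x² at x = 3 purely as illustrations; these functions, the helper names and the step size are assumptions made here and are not taken from the exercise that follows.

```python
def one_sided_quotient(f, c, dx):
    """Difference quotient (f(c + dx) - f(c)) / dx; the sign of dx picks the side."""
    return (f(c + dx) - f(c)) / dx

def one_sided_slopes(f, c, dx=1e-6):
    """Return (left, right) estimates of the slope of f at c."""
    right = one_sided_quotient(f, c, dx)
    left = one_sided_quotient(f, c, -dx)
    return left, right

if __name__ == "__main__":
    # |x| has different slopes on either side of 0, so the two-sided limit does not exist.
    print(one_sided_slopes(abs, 0.0))
    # x**2 is smooth at 3, so both sides agree (approximately 6).
    print(one_sided_slopes(lambda x: x * x, 3.0))
```

When the left and right estimates settle on different values as δx shrinks, the function is not differentiable at that point.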

1) Use the above definition of the derivative to find df/dx at x = 1 for the function $f(x) = 4x^2 + 1$,
i.e. find

$$\lim_{\delta x \to 0} \frac{f(1 + \delta x) - f(1)}{\delta x}.$$

Is f(x) differentiable at x = 1 ?

2) Let g(x) be a function defined as

$$g(x) = \begin{cases} x^2 + 3, & x < 1 \\ 2x + 2, & x \ge 1 \end{cases}$$

i) Find g(−2), g(−1), g(0), g(1), g(2), g(3)
ii) Sketch the graph of y = g(x) for −2 ≤ x ≤ 3
iii) Find g'(1) from first principles using the above definition of a derivative. Here you
will need to calculate

$$\lim_{\delta x \to 0^+} \frac{g(1 + \delta x) - g(1)}{\delta x} \quad \text{and} \quad \lim_{\delta x \to 0^-} \frac{g(1 + \delta x) - g(1)}{\delta x}$$

In other words you will need to calculate the limit of the difference quotient as δx approaches 0 from
the right hand side (i.e. δx → 0+) and separately as δx approaches 0 from the left
hand side (i.e. δx → 0−).

3) Let h(x) be a function defined as

$$h(x) = \begin{cases} x^2 + 3, & x < 1 \\ 2x + 4, & x \ge 1 \end{cases}$$

i) Find h(−2), h(−1), h(0), h(1), h(2), h(3)
ii) Sketch the graph of y = h(x) for −2 ≤ x ≤ 3
iii) Find h'(1) from first principles if it exists. Again you will need to calculate a two
sided limit as :

$$\lim_{\delta x \to 0^+} \frac{h(1 + \delta x) - h(1)}{\delta x} \quad \text{and} \quad \lim_{\delta x \to 0^-} \frac{h(1 + \delta x) - h(1)}{\delta x}$$

4) Let k(x) be a function defined as

$$k(x) = \begin{cases} x^2 + 3, & x < 1 \\ 2x + 4, & x \ge 1 \end{cases}$$

i) Find k(−2), k(−1), k(0), k(1), k(2), k(3)
ii) Sketch the graph of y = k(x) for −2 ≤ x ≤ 3
iii) Find k'(1) from first principles if it exists. Again you will need to calculate a two
sided limit.

5) Comment on how the graphs of g(x), h(x), and k(x) differ at the point x = 1. How is the
shape of the graph related to differentiability ?
End of Exercise

Exercise on thinking mathematically


It is possible to use 1st principles to define the second, third, fourth, etc. derivatives of a function. As
such consider

$$\frac{d^2 f}{dx^2} = f''(x) = \lim_{\delta x \to 0} \frac{f'(x + \delta x) - f'(x)}{\delta x}$$

i) Simplify this expression to obtain the definition of the second derivative.

There are many other definitions of the second derivative. In order to develop these one must use
alternative definitions of the first derivative such as those you can develop from the exercise on
p13-14 of these notes. Therefore, using these two alternative definitions for the first derivative

ii) Define and simplify two alternative expressions to the definition of the second
derivative.
iii) Find, and simplify, a third alternative definition to the second derivative (for this you
will need to find another alternative definition to the first derivative. You should be
able to get this simply by spotting a pattern between the three definitions you
developed from the exercise on p13-14 of these notes).
iv) Use the expressions below to obtain other definitions for the first and second
derivative :

$$f(x + \delta x) = f(x) + \delta x \, f'(x) + \frac{\delta x^2}{2!} f''(x) + \frac{\delta x^3}{3!} f'''(x) + \frac{\delta x^4}{4!} f^{iv}(x) + \ldots$$

$$f(x - \delta x) = f(x) - \delta x \, f'(x) + \frac{\delta x^2}{2!} f''(x) - \frac{\delta x^3}{3!} f'''(x) + \frac{\delta x^4}{4!} f^{iv}(x) - \ldots$$

v) From your definitions of the second derivative above deduce a generalised
definition for the second derivative
vi) Develop similar definitions for the 3rd and 4th derivatives of a function f(x).

End of Exercise

Appendix
A Comment about infinitesimals
Calculus deals fundamentally with infinitesimals, i.e. infinitely short lengths or distances. In
defining the derivative we used a ratio of two infinitesimals, i.e. δy/δx. However, we are not
restricted to forming this type of combination. We may in fact use infinitesimals in whatever
combination is mathematically legal in order to address our problem. For example, consider the
diagram below :

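[Figure: a rectangle of width x + δx and height y + δy, divided into four regions: (1) of size x by y, (2) of size x by δy above it, (3) of size δx by y to its right, and (4) of size δx by δy in the remaining corner.]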

We may then study the change in a given area A if the x and y lengths of the region are increased by
infinitely small amounts δx and δy. In this case region (1) above has area :

A(x, y) = x.y

If we then increase the area by increasing the x and y distances by infinitely small amounts, we then
have the total region (1), (2), (3), (4) :

A(x+δx, y+δy) = (x+δx).(y+δy)

We may then ‘expand’ A(x+δx, y+δy) by considering the separate areas (1), (2), (3), (4) :

A(x+δx, y+δy) = A(x, y) + A(δx, y) + A(x, δy) + A(δx, δy)
                  (1)        (3)        (2)         (4)

              = x.y + δx.y + x.δy + δx.δy

From this we see that infinitesimals are used to describe whatever relationship exists between the
variables as necessary, and that in this example we can talk about the product of infinitesimals in
order to describe an infinitely small area given by

δA = δx.δy
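The decomposition of the enlarged rectangle into regions (1) to (4) can be checked with ordinary numbers standing in for the small increments. The sketch below is illustrative only; the function names and the sample values are assumptions.

```python
def area(x, y):
    """Area of a rectangle with sides x and y."""
    return x * y

def area_parts(x, y, dx, dy):
    """Areas of regions (1) to (4) of the enlarged rectangle, keyed by region number."""
    return {
        1: x * y,     # original rectangle
        2: x * dy,    # strip above it
        3: dx * y,    # strip to its right
        4: dx * dy,   # small corner piece: a product of the two increments
    }

if __name__ == "__main__":
    parts = area_parts(3.0, 2.0, 0.5, 0.25)
    print(parts, sum(parts.values()), area(3.5, 2.25))
```

As dx and dy shrink, region (4) shrinks much faster than regions (2) and (3), which is why a product of two small increments can often be neglected next to terms containing only one.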
