εἰς ἔπαινον δόξης τῆς χάριτος αὐτοῦ
Preliminary Edition by Dr. Kenneth Rietz
All rights reserved. No part of this work may be reproduced, stored in a retrieval system, or transmitted in any form or
by any means, electronic, mechanical, photocopying, recording, or otherwise, without prior written permission from the
author, except by a reviewer who may quote brief passages in critical articles and reviews.
Contents
Laws of exponents. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Graphs of exponential functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
e (Euler’s constant) and “the” exponential function. . . . . . . . . . . . . . . . . . . . . . . . . . . 22
0.4.2 Logarithmic functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Logarithms as the inverses of exponential functions. . . . . . . . . . . . . . . . . . . . . . . . . . 23
Laws of logarithms. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Graphs of logarithmic functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Natural logarithms. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
0.4.3 Solving exponential equations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
0.5 Summary of Chapter 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1 Derivatives - I 26
1.1 Motivating the idea of derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.1.1 General introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Structure of the course. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Calculus as a foreign language. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
The mathematics of non-uniform quantities. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Motivation—driving a car. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Corresponding geometric ideas. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.2 Definitions of (1-dimensional) derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
1.2.1 Formula-defined functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Finding the slope of a secant line through two specific points. . . . . . . . . . . . . . . . . . . . . 29
Finding the slope of a general secant line through one specific point. . . . . . . . . . . . . . . . . . 29
Doing this on Maple. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
The uses of difference quotients. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Finding the slope of the tangent line at a specific point. . . . . . . . . . . . . . . . . . . . . . . . . 33
Finding the slope of the secant line between two general points. . . . . . . . . . . . . . . . . . . . 33
Magnifying the function, getting a “line.” . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Notations and terminologies for derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
The uses of derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
1.2.2 Correlations to velocity, average and instantaneous. . . . . . . . . . . . . . . . . . . . . . . . . . . 38
1.2.3 Green box functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Understanding ∆x and ∆y. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
∆y/∆x ≈ (dy/dx), for ∆x small. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
∆y ≈ (dy/dx)∆x. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
(dy/dx) is then what you multiply ∆x by to get ∆y. . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Correlate the WMF to slope of tangent lines. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
Correlate this to instantaneous velocities. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Notations, terminology, and cautions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
An example, using numbers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
1.2.4 Other definitions of functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
The uses of the wiggle magnification formula. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
1.3 Calculating derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
1.3.1 Motivation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
We want to avoid tons of messy algebra. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Patterns in derivatives became formulas. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
1.3.2 Differentiating polynomials. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Derivatives of simple functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Derivative of the sum and difference of monomials. . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Finding derivatives with Maple. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
The uses of derivative formulas. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
1.3.3 Limits, and the official definition of derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2 Finance 119
2.1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
2.1.1 Seems an unusual topic for calculus, but isn’t. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Calculus is a major portion of business finance. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Economics is a heavily quantified subject nowadays. . . . . . . . . . . . . . . . . . . . . . . . . . 119
We will be doing a few separate topics in this chapter. . . . . . . . . . . . . . . . . . . . . . . . . 120
2.2 Continuous compounding. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
2.2.1 Background. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Terminology and notation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Simple interest. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
Compound interest. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
Continuous compounding. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
2.2.2 Indeterminate forms. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
Like 0/0, 1^∞ can be anything. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
Other varieties of indeterminate forms. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
Cure: L’Hôpital’s rule. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
Working limits to infinity on Maple. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
2.2.3 Return to the problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
Solution of the problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
Investigation of exponential growth. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
Regular compounding versus continuous compounding. . . . . . . . . . . . . . . . . . . . . . . . 132
2.3 Inventory control. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
2.3.1 Statement of problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
We want to determine the order size. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
Notation and terminology. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
Equation derived. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
Now that we have it, what do we do with it? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
2.3.2 General procedures for minimizing (or maximizing) a function. . . . . . . . . . . . . . . . . . . . 136
Look at a simple picture (graph). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
How do you solve the problem? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
Maxes and mins on closed intervals. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
2.3.3 Back to the problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
Use the second procedure first. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
Use the first procedure next. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
Solution. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
2.4 Elasticity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
2.4.1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
Typical use of calculus concepts in economics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
Typically poorly explained (poor understanding of calculus). . . . . . . . . . . . . . . . . . . . . . 143
Description of the market; gauge of price levels. . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
2.4.2 Notation and terminology. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
Price increase implies decrease in demand. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
3 Derivatives - II 152
3.1 Partial derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
3.1.1 Basics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
Motivations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
Multiple-input functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
Derivatives in this case, notations and terminology. . . . . . . . . . . . . . . . . . . . . . . . . . . 153
Interpretations of derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
Total change and total differential. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
3.1.2 How to calculate partial derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
Note which variable is being wiggled; treat others as constants. . . . . . . . . . . . . . . . . . . . . 156
3.1.3 ALL THE SAME RULES APPLY, EXACTLY AS THEY DID BEFORE. . . . . . . . . . . . . . . 156
Simplifications apply here, also. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
Doing this on Maple. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
3.1.4 The chain rule. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
3.1.5 Higher-order partial derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
Notations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
Interpretations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
Equality of mixed partials, and using that information. . . . . . . . . . . . . . . . . . . . . . . . . 166
3.1.6 Implicit functions and their derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
Level sets. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
Definition of implicit functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
Formula for derivative of implicit functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
Higher-order derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
3.1.7 Constrained partial derivatives (what if you can’t wiggle just one variable at a time?) . . . . . . . . 181
Motivation—gas dynamics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
Notation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
How to calculate these. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
3.2 Summary of Chapter 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
3.3 Tests from previous years . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
Dudley
Dudley is the main character. He’s generally friendly, even a bit naive. There’s more than a bit of Charlie Brown in
Dudley. He’s adventurous enough to try anything, which often turns into a disaster. Dudley is still learning calculus, and
will often make comments or ask questions that express typical kinds of confusion. That part of Dudley is not hard to
CHAPTER 0. INTRODUCTION AND REFERENCE 2
identify with.
Dudley owns (although the reality might be that he is owned by) three pets. Fang is Dudley’s dog, and will often get
Dudley to do things for him. The fact that Fang communicates with Dudley so well says something about the intelligence
of both. Fang hates squirrels, and will always chase them. However, he has not yet caught one, and might not know what
to do with one if he had. Claw is Dudley’s cat and, in common with all cats, considers Dudley a necessary inconvenience.
Then there is Dudley’s pet duck, Bill. (Perhaps you have seen him in the movie Babe.) Bill is at least as smart as Albert
(you’ll meet him soon). On the other hand, Bill thinks he’s a rooster, and no one can make Bill believe otherwise. The
influence of the three pets will get Dudley into all kinds of bizarre situations, especially on tests.
Mugsy
Mugsy is the cynic. He can be counted on to insert the off-the-wall comment and the wisecrack. There are times he
says the right thing, but those are totally accidental. He has also tried most everything, but most of it seems to have bounced
off with only minor damage. His questions and comments on calculus are from the outsider’s perspective. He has just
enough smarts to make him dangerous (and has friends you don’t want to meet). He has little patience with high-sounding
nonsense, and only slightly more patience with Dudley. Dudley is a bit intimidated by Mugsy, for good reason.
No one is quite sure why Mugsy has decided to allow himself to be in this course. Mugsy is not the sort to do anything
he really doesn’t want to do. On the other hand, he likes (in his own way) Dudley, or else Dudley wouldn’t be in this course.
Mugsy, despite his gruff exterior, has a (very) small, tender spot.
Albert
Albert is the genius, the sort that plays chess blindfolded and works diagramless crossword puzzles in ink. He can
explain everything in calculus, and in just about everything else, too. He is not afraid to reply to Mugsy’s comments, or to
answer Dudley’s questions. The others tolerate Albert because he is so useful to have around. He’s the only one who can
figure out what’s going on.
No one is too sure why Albert is here, either. It’s clear that he already knows all this material. It might simply be that
he is responding to the desperate pleas of Dudley and Mugsy for help and reassurance.
Distance formula.
Start with
distance = rate × time
written as s = vt. This is a standard formula from physics, and it is common sense as well. It assumes, though, that the
velocity, v, is a constant.
Graphs.
Plot s versus t, and then v versus t, assuming v = 50 m.p.h. (constant).
The graph of s versus t is a slanted line through the origin, s = 50t. The graph of v versus t is a horizontal line, v = 50.
Here are the graphs.
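[Figure: two graphs side by side. Left: s versus t, the slanted line s = 50t through the origin. Right: v versus t, the horizontal line v = 50.]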
Non-constant velocity.
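In the constant case, the slope of the line s = 50t is 50, which is v, and the area under v = 50 up to time t is 50t, which
is s. The same two facts hold in the general case, even when v is not constant! Velocity is always the slope of the s versus
t graph. And distance traveled is always the area under the v versus t graph.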
Calculus enables us to go from the general s-t plot to v (a process called differentiation) and from the general v-t plot to
s (a process called integration).
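Here is a rough numerical check of those two statements for the constant-velocity example. It is only a sketch: the helper names `slope` and `area`, the step size, and the number of rectangles are assumptions made for this illustration, not anything from the course.

```python
# Constant velocity example from the text: v = 50 m.p.h., so s = 50 t.
def s(t):
    return 50.0 * t

def v(t):
    return 50.0

def slope(f, t, h=1e-3):
    """Slope of the secant line of f between t and t + h."""
    return (f(t + h) - f(t)) / h

def area(f, a, b, n=1000):
    """Area under f from a to b, using n thin rectangles (midpoint heights)."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) * width for i in range(n))

print(slope(s, 2.0))       # close to 50, the velocity
print(area(v, 0.0, 3.0))   # close to 150, the distance s(3)
```

Going from s to v used a slope, and going from v to s used an area. The same two helpers can be pointed at a non-constant velocity, where they give approximations rather than exact answers.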
The two basic operations of calculus are here: differentiation and integration. There are close connections between
slopes and velocities, and between areas and net distances. Those are two of the connections that I hope to make clear.
There’s a lot more to come! And, of course, this simplistic explanation has holes in it, which we will plug up at the proper
time.
[Figure: the green box, with a gnome inside, an input funnel, a reject spout, and an output chute.]
There are three parts: an input funnel, an output chute, and a reject spout. The input funnel is where you drop inputs of
any sort. The output chute is where the output falls out. The reject spout is for inputs that don’t produce output values.
It is sometimes not drawn. (For the moment, I am ignoring the inner workings of the green box. That will come next.)
Consistent means that whatever the box does to an input, that’s what it will always do to that same input. This is a “serious”
definition; we’ll come back to it regularly throughout this semester.
Dudley: I guess he is serious.
I even use it in more advanced courses.
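If you have done any programming, one way to picture a green box is as a function in code. In the sketch below, the argument is what goes into the input funnel, the return value is what comes out of the output chute, and returning `None` plays the role of the reject spout. The square-root box and the name `green_box` are made up for this illustration.

```python
import math

def green_box(x):
    """A hypothetical green box that takes square roots.

    Returns the output, or None when the input goes out the reject spout.
    """
    if x < 0:
        return None          # reject spout: no output for this input
    return math.sqrt(x)      # output chute

# Consistency: the same input always gives the same output.
first = green_box(9)
second = green_box(9)
print(first, second)     # 3.0 3.0
print(green_box(-4))     # None
```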
Gnome. Now we come to what happens inside green boxes. With all these ecological niches, you’d expect that there is
some creature to fill them. Of course, there is. Another definition is that a function is a gnome.
Mugsy: You really sure this guy is serious?
Albert: Hang on. He is trying to make a point.
Mugsy: It looks like he’s trying to make a fool of me. That’s not nice. The last time somebody did that to me, the
Proper list. When gnomes get old, they begin to lose their memories. To aid them, they often construct proper lists.
Mugsy: Now that makes sense!
Dudley: Now I’m worried about you.
Mugsy: Shuddup.
A proper list has two properties. The first is that there are two columns, labeled INPUT and OUTPUT. The left column
contains the valid inputs; the right column contains corresponding outputs. But that isn’t what makes a list proper. The
other property is that duplicate entries in the left (INPUT) column have identical right column (OUTPUT) entries. This is
the same as consistent, and shows that no decision needs to be made.
Example.
I’ll go over three examples in class. Here they are:
Note that unless there are repeated inputs, you can conclude immediately that the list is proper.
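The check for a proper list can be written out as a short program. This is a sketch only; the function name `is_proper` and the two lists are made up for the illustration and are not the examples from class.

```python
def is_proper(rows):
    """Return True if repeated inputs always have the same output."""
    seen = {}
    for inp, out in rows:
        if inp in seen and seen[inp] != out:
            return False     # same input, different outputs: a decision is needed
        seen[inp] = out
    return True

# Hypothetical lists of (INPUT, OUTPUT) rows.
list_a = [(1, 5), (2, 7), (1, 5), (3, 7)]   # input 1 repeats with the same output
list_b = [(1, 5), (2, 7), (1, 6)]           # input 1 repeats with different outputs

print(is_proper(list_a))   # True
print(is_proper(list_b))   # False
```

Notice that `list_a` repeats the output 7 for two different inputs. That is allowed; only repeated inputs matter for a list to be proper.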
Proper collection of ordered pairs. This is much the same as proper lists. You convert the left-right column couples into
ordered pairs. Proper means essentially what it did before. Oddly enough, though, this is the definition that will become
the most general in more advanced mathematics courses!
Example.
Converting the lists from the previous example to ordered pairs gives
Graph that passes the vertical line test. This is one definition that you might well have seen before. Plotting a proper
collection of ordered pairs (assuming that they are numbers, which is not normally required!) gives a graph. The horizontal
axis is for the input value, the vertical axis is for the output value. Vertical lines look at the possible outputs for a specific
input. The vertical line test states that a graph represents a function if no vertical line intersects the graph in more
than one point. If a specific vertical line doesn’t hit the graph at all, there is no output value for that input. That’s no
problem, since functions aren’t required to produce an output for each input. That’s the reason for the reject spout. If the
vertical line hits the graph once, you can read off the output value from the y coordinate of the intersection point (read off
the vertical axis). If it hits the graph more than once, there are several possible outputs for that input, meaning the list was
not proper, there was a decision, and the box was not consistent.
Mugsy: I think he intends that to mean that something is not good.
Albert: Precisely.
The graphs of the examples, in order, (with a lot more points filled in) are given right above. Note that the first and third
ones pass the vertical line test, while the second one does not.
Well-defined formula. This is what most of you would think of as a function. The phrase “well-defined” is a technical
term from mathematics. What it means is that any choices made along the way don’t make any difference in the end.
Mugsy: My choices don’t make a difference?
Albert: Only in the case the formula is well-defined.
For example, the formula
\[ f(x) = x \pm (1 - \sin^2 x - \cos^2 x) \]
looks like it involves a choice (Do I take the “+” or the “−”?), but whichever choice you make doesn’t affect the answer,
since the trigonometric identity $\sin^2 x + \cos^2 x = 1$ means that the term in parentheses is 0. So, this $f(x)$ is well-defined.
Not all formulas are functions. The quadratic formula is one example, since it has a ± in it, and that will affect the
values that come out of the formula (usually).
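A quick numerical comparison shows the difference between a ± that doesn't matter and one that does. This is a sketch under assumptions: the function names, the test input 0.7, and the sample quadratic $x^2 - 3x + 2$ are all chosen just for this illustration.

```python
import math

def f_plus(x):
    """The formula from the text, taking the + sign."""
    return x + (1 - math.sin(x) ** 2 - math.cos(x) ** 2)

def f_minus(x):
    """The formula from the text, taking the - sign."""
    return x - (1 - math.sin(x) ** 2 - math.cos(x) ** 2)

def quadratic_roots(a, b, c):
    """Both values of the quadratic formula (assumes b*b - 4*a*c >= 0)."""
    root = math.sqrt(b * b - 4 * a * c)
    return (-b + root) / (2 * a), (-b - root) / (2 * a)

x = 0.7
print(abs(f_plus(x) - f_minus(x)) < 1e-12)   # True: the choice of sign does not matter
print(quadratic_roots(1, -3, 2))             # (2.0, 1.0): the choice of sign matters
```

The comparison uses a small tolerance because a computer evaluates $\sin^2 x + \cos^2 x$ only approximately.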
Example.
The formulas for the lists above are not directly obvious, but are, respectively,
\[ f(x) = 1 + x^3, \qquad f(x) = 1 \pm x^3, \qquad f(x) = 1 + |x^3| \]
Thus, the first and third lists gave functions, while the second one didn’t.
More terminology.
When we get around to using functions more, some terminology will prove quite useful. One is the domain of a function,
which is the set of all “good” inputs (ones that give outputs rather than dropping out the reject spout). (Note: Most
everything that is done in serious mathematics with functions is done in terms of sets. It turns out to be most convenient.)
How do you find the domain of a function, since we have so many different ways of looking at a function? For green
boxes (and gnomes), I’ve already given it to you. You look for inputs that don’t drop out the reject spout. For proper lists or
ordered pairs, you look down the left column, or at the first coordinates. For graphs, you take the parts of the x-axis which
have points of the graph over (or under) them. (Smash the graph flat onto the x-axis). For a formula, you normally take the
domain to be all of the possible inputs that can produce a legal output. This means that you avoid these:
• Division by zero;
• taking square roots of negative numbers;
3. Exercises are worth one point per part, problems are worth two points per part, and investigations are worth three
points per part. This is how you can figure out exactly how many points any given assignment is worth.
Homework #1
Exercises.
1. Which of the following represents a function? Give a reason for your answer.
(a)  In   Out        (b)  In   Out
     1    1               1    2
     2    4               2    1
     1    1               1    3
     3    3               3    1
     1    1               1    4
     4    2               4    1
(c) [Figure: graph of a closed, oval-shaped curve]
(d) $f(x) = \sqrt[3]{|x|}$ (e) $f(x) = |\sqrt{x}|$
2. Which of the following represents a function? Give a reason for your answer.
(a)  In   Out        (b)  In   Out
     1    3               1    1
     2    2               2    2
     3    1               3    3
     4    1               3    4
     5    2               2    5
     6    3               1    6
(c) [Figure: graph of a straight line sloping downward to the right]
(d) $f(x) = \sqrt{x}$ (e) $f(x) = |x|$
3. What are the domains in exercise 1 (except part c.)?
4. What are the domains in exercise 2 (except part c.)?
5. A homework assignment starts on page 44. How many points is it worth?
The inverse might not always exist. The general criterion for deciding whether an inverse function exists is to determine
if more than one input gives any specific output. If so, it can’t have an inverse, since you can’t tell which input gave that
output. (Remember, the essence of a function is that there are no choices, ever.) The ones that don’t have two inputs ever
giving the same output are called one-to-one (or other, less obvious, things like monomorphic or injective, in advanced
mathematics).
Dudley: Does injective have anything to do with needles?
Albert: No.
Note that perfectly good functions (like $f(x) = x^2$) might not be one-to-one.
Now, let’s look at these. The first one is still a function. The second one is now a function, although you again have a
difficult time saying that something that is not a function has an inverse that is a function. (The right way to deal with
such things is to look instead at a more general critter, called a relation, which also has input and output spouts, but has no
requirement for consistency.
Dudley: You mean a relation is an inconsistent function?
Albert: Not exactly. A relation allows the possibility of being inconsistent. Inconsistency is not a requirement. It’s
better to think of it this way. A function is a consistent relation.
You can get its inverse exactly the same way as before, by interchanging columns. The question then becomes whether or
not the inverse relation is a function. The second relation is not a function, but its inverse is. The first
relation is a function, and its inverse is a function also.) The third one, which was a function, has no inverse. (That is, the
third relation is a function, but the inverse relation is not.)
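Interchanging columns and then asking "is it a function?" is easy to act out in code. The sketch below uses hypothetical lists (computed from the formulas $1 + x^3$ and $1 + |x^3|$ at a few inputs, not the lists from class), and the names `is_function` and `invert` are assumptions made for the illustration.

```python
def is_function(pairs):
    """True if no input is paired with two different outputs."""
    seen = {}
    for inp, out in pairs:
        if seen.setdefault(inp, out) != out:
            return False
    return True

def invert(pairs):
    """Interchange the input and output columns."""
    return [(out, inp) for inp, out in pairs]

# Hypothetical relations, written as (input, output) pairs.
one_to_one = [(1, 2), (2, 9), (3, 28)]        # outputs never repeat
not_one_to_one = [(-2, 9), (2, 9), (1, 2)]    # output 9 comes from two inputs

print(is_function(one_to_one), is_function(invert(one_to_one)))          # True True
print(is_function(not_one_to_one), is_function(invert(not_one_to_one)))  # True False
```

The second relation is a perfectly good function, but after the columns are interchanged the input 9 has two possible outputs, so the inverse relation is not a function.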
For formulas with $y = f(x)$, interchange x and y, and then solve for $y = f^{-1}(x)$. If there are any ±’s in solving for y,
the original $f(x)$ has no inverse. Note that interchanging x and y is precisely the same as interchanging input and output
columns, since x is the variable for the input column (traditionally) and y is the variable for the output column (again,
traditionally). Solving for the (new) y is necessary to check if the formula is a function.
Example.
Here are the three examples one more time. The first function has a formula $f(x) = 1 + x^3$. To invert that, you start with
$y = 1 + x^3$, interchange x and y to get $x = 1 + y^3$, and then try to solve for y. In this case, it isn’t hard.
\begin{align}
x &= 1 + y^3 \tag{1} \\
y^3 &= x - 1 \tag{2} \\
y &= \sqrt[3]{x - 1} \tag{3}
\end{align}
Note that there is no ± with cube roots. The ± shows up only with even (square, fourth, sixth, etc.) roots, and never with
odd (cube, fifth, seventh, etc.) roots. The inverse function’s formula is then $f^{-1}(x) = \sqrt[3]{x - 1}$, which then obviously exists.
The second function’s formula is $f(x) = 1 \pm x^3$. To invert, put the function in the form $y = 1 \pm x^3$, and interchange x
and y, giving $x = 1 \pm y^3$. Now, to solve for y gives
\begin{align}
x &= 1 \pm y^3 \tag{4} \\
\pm y^3 &= x - 1 \tag{5} \\
y^3 &= \pm(x - 1) \tag{6} \\
y &= \pm\sqrt[3]{x - 1} \tag{7}
\end{align}
This certainly looks like a non-function, but the lists we gave showed that it is a function. What’s wrong? Not much,
actually. It’s just that we were a bit too careless with throwing things around. Here’s what really happened. When we
constructed the list, we plugged only positive numbers into the formula, meaning that the input column of f (x) was always
positive. When we inverted the columns, the outputs were therefore only positive numbers. So, the ± in the formula never
really appeared in the lists, since it was chosen to make the result always positive, and there was no alternative possible in
that choice. The actual formula for the lists is really $f^{-1}(x) = |\sqrt[3]{x - 1}|$, and this is a function. So, is the function invertible
or not? Good question. If you look only at the numbers in the lists, the answer is yes. If you look at the formula, the answer
is no. If the list for $f(x)$ were expanded by the formula to include such input-output combinations as (−1, 0) and (−1, 2),
which are valid by the formula $f(x) = 1 \pm x^3$, then this expanded list would fail to have an inverse. The formula already
includes such things, and so fails to have an inverse from the start. But note something. Putting more points into the list
causes there not to be an inverse. That comment will turn out to be useful later, when we are trying to make functions have
an inverse. The key will be to remove the offending entries that cause duplicate output column entries.
Mugsy: Al, did you follow that?
Albert: Of course. Why do you ask?
Mugsy: In case I ever need to know about it.
The third function is $f(x) = 1 + |x^3|$. How do we invert this? The same procedure needs to be followed. Write the
function as $y = 1 + |x^3|$, interchange x and y, and you get $x = 1 + |y^3|$, but solving for y now looks a bit rougher. Working
with absolute values is not too familiar. So, we convert to the alternate form of absolute values. Remember that $\sqrt{a^2} = |a|$?
You were told to!
Mugsy: Hey! Don’t get personal, hear?
Here, we use it (and not for the last time, either). We get
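\begin{align}
x &= 1 + |y^3| \tag{8} \\
&= 1 + \sqrt{(y^3)^2} \tag{9} \\
&= 1 + \sqrt{y^6} \tag{10} \\
x - 1 &= \sqrt{y^6} \tag{11} \\
(x - 1)^2 &= y^6 \tag{12} \\
y^6 &= (x - 1)^2 \tag{13} \\
y &= \pm\sqrt[6]{(x - 1)^2} \tag{14} \\
&= \pm\sqrt[3]{x - 1} \tag{15}
\end{align}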
The ± showed up because we took an even (sixth) root. In this case, we can’t get rid of the ±, and again we get that there
is no inverse.
For the graph of a function, you want to see if the same output value is ever duplicated. So, we look at horizontal lines,
since each horizontal line represents a single output value of the function. (Think about that until it makes sense.)
Mugsy: Forget it. The more I think, the less sense I make out of anything.
If any horizontal line is crossed more than once, the function is not one-to-one, since each different crossing represents a
different input value that has that same output value.
Once a function has passed the horizontal line test, how do we find the graph of the inverse? By flipping about the line
y = x, since that interchanges the positive x (input) and positive y (output) axes. (This is not the same as rotating by 90°.)
This has just the same effect as interchanging columns in lists. Of course, you could just flip before you knew the function
had an inverse and apply the vertical line test to see if what you just got is a function.
Here are the graphs of the flipped versions of the three functions.
Note that the first one remains a function, the second one does not become a function, and the third one no longer is a
function. This corresponds exactly to the first one passing the horizontal and vertical line tests, the second one not passing
either, and the third one passing the vertical line test but not the horizontal line test.
Please do not get the vertical and horizontal line tests mixed up. The vertical line test determines whether a graph
represents a function. The horizontal line test determines whether the inverse of a graph represents a function.
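The horizontal line test can be imitated on a handful of sample inputs: if two sampled inputs share an output, some horizontal line crosses the graph twice. This is only a sketch. The function name and the sample inputs are assumptions, and passing on a few samples does not prove a function is one-to-one; it can only catch a failure.

```python
def passes_horizontal_line_test(f, inputs):
    """Check sampled inputs only: True if no output value is repeated."""
    outputs = [f(x) for x in inputs]
    return len(set(outputs)) == len(outputs)

samples = [-3, -2, -1, 0, 1, 2, 3]

# First example function, 1 + x^3: no repeated outputs on these samples.
print(passes_horizontal_line_test(lambda x: 1 + x**3, samples))        # True

# Third example function, 1 + |x^3|: x and -x give the same output.
print(passes_horizontal_line_test(lambda x: 1 + abs(x**3), samples))   # False
```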
Homework #2
Exercises.
1. Which of the parts of exercise 1 of the previous homework set have inverses? Give the inverse for those that have
them, and give a reason for those that don’t have inverses.
2. Which of the parts of exercise 2 of the previous homework set have inverses? Give the inverse for those that have
them, and give a reason for those that don’t have inverses.
3. Explain why the test for a function to be one-to-one is exactly the same as the test for the inverse to be a function.
Composition.
This is arguably the most important way to combine functions. It is basic to the most important rule in calculus. We will
encounter it later. The idea is simple: Take the output of one function as the input to another.
Mugsy: Skyscraper green boxes! Great!
This is in contrast to multiplying the outputs of two functions, for example. The simplest way to visualize it is with green
boxes. (See? I told you they’d reappear. And we aren’t done with them yet, by a long shot.)
We will have to be careful of notation. We can’t use x as the input (independent) variable for both functions, because
the input for the top function will not usually be the input for the bottom function. (The input to the second function is the
output of the first function.) For the same reason, we can’t use y for both of the output (dependent) variables. We will
usually use u as the intermediate variable—the output variable of the first (top) function as well as the input variable of the
second (bottom) function. That is, we will use u = f (x) and y = g(u), where f (x) is the top function and g(u) is the bottom
function.
The notation for the composition is g ◦ f (x) = g( f (x)). Note that this represents f (x) as the upper box and g(u) as the
bottom box. It looks as though the order is backwards.
Mugsy: This whole subject looks backwards.
Albert: Come now, it’s not all that bad.
Mugsy: Bet?
With g ◦ f (x), you first do f (x), get u = f (x), and then do g(u) = g( f (x)). Note also that f ◦ g and g ◦ f are very different
functions. Order is important in composition. (See the homework.)
And, from what we did before, $f(f^{-1}(x)) = f^{-1}(f(x)) = x$ is the definition of $f^{-1}(x)$. This is nothing more than saying
that the inverse function “undoes” the original function, and vice versa.
Example.
Let $f(x) = 2x^2 + 3x - 1$ and $g(x) = 4x - 3$, and find $f \circ g(x)$ and $g \circ f(x)$. To do these, work from the inside out. (Outside
in is another option. It will give the same answer but, in my opinion, in a more complicated way.)
For f ◦ g(x), we get
\begin{align}
f \circ g(x) &= f(4x - 3) \tag{16} \\
&= 2(4x - 3)^2 + 3(4x - 3) - 1 \tag{17} \\
&= 32x^2 - 36x + 8 \tag{18}
\end{align}
For g ◦ f (x), we get
\begin{align}
g \circ f(x) &= g(2x^2 + 3x - 1) \tag{19} \\
&= 4(2x^2 + 3x - 1) - 3 \tag{20} \\
&= 8x^2 + 12x - 7 \tag{21}
\end{align}
The only difficulty with these is algebraic.
Maple can be a big help here. It is good at algebra. Here is a Maple session that would work the example just given. (I
know that we haven’t covered Maple yet in the book, but often that section is covered in lab before this point. In any case,
you can come back to this once you have covered Maple.)
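A minimal sketch of such a session might look like the following. It is not the original session; it assumes that f and g are entered as Maple arrow functions and that expand is used to multiply out the results. The comments show what the example above says the answers should be.

```maple
# Sketch only: define the two functions from the example as arrow functions.
f := x -> 2*x^2 + 3*x - 1;
g := x -> 4*x - 3;

# f o g: feed the output of g into f, then multiply out.
expand(f(g(x)));    # should agree with equation (18): 32*x^2 - 36*x + 8

# g o f: feed the output of f into g, then multiply out.
expand(g(f(x)));    # should agree with equation (21): 8*x^2 + 12*x - 7
```

Maple also has a composition operator, @, so (f@g)(x) is another way to write f(g(x)).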
Homework #3
Exercises.
(a) Find and multiply out (expand) both polynomials f ◦ g(x) and g ◦ f (x). Note that they are different.
(b) Find $g^{-1}(x)$.
(c) Graph both $y = g(x)$ and $y = g^{-1}(x)$ on the same set of axes fairly accurately. Also draw in the line y = x
and note that the graphs of $g(x)$ and $g^{-1}(x)$ are reflections about the line y = x (as they should be if they are
inverses).
(d) Show that $g \circ g^{-1}(x) = x$ and $g^{-1} \circ g(x) = x$ by working out the compositions algebraically.
(e) Find two different numbers $x_1$ and $x_2$ so that $f(x_1) = f(x_2)$. (You will need to use fractions for this.) (Hint:
Find two values of x that make f (x) = 0.)
(f) Why does the answer to the previous part show that f (x) has no inverse?
5. Occasionally, functions are defined in pieces that have to be put together carefully. This problem is about how to do
that. We will be working with the function f (x) defined by
$$f(x) = \begin{cases} 8x & \text{if } x < -2, \\ a x^2 & \text{if } x \ge -2 \end{cases}$$
We want to find the value of the constant a so that the two parts of the function fit together without a break at
x = −2. This means that we want the values of the two parts at x = −2 to match.
(a) What are the values of $8x$ and $a x^2$ at x = −2?
(b) What value should we give to the constant a to make the two values in the previous part equal? (Hint: Set the
values equal, and solve for a.)
Note: Functions that fit together this way, that is that don’t have a break or gap, are called continuous. Yes, there is a
way to set up functions defined in pieces using Maple, but they don’t occur in this course. You will encounter them
in differential equations, and you will see how to define them in Maple then. Or, if you are really curious, you can
type ?piecewise in Maple and learn now.
6. In this problem, we will be working with the function f (x) defined by
$$f(x) = \begin{cases} 3x & \text{if } x < 1, \\ a x^2 & \text{if } x \ge 1 \end{cases}$$
We want to find the value of the constant a so that the two parts of the function fit together without a break at
x = 1. This means that we want the values of the two parts at x = 1 to match.
(a) What are the values of $3x$ and $a x^2$ at x = 1?
(b) What value should we give to the constant a to make the two values in the previous part equal?
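For the curious, here is a hedged sketch of how the matching idea in the last two exercises could be tried in Maple. The function below is made up for illustration (it is deliberately not one of the homework functions), and the name h is an arbitrary choice; piecewise, solve, subs, and plot are standard Maple commands.

```maple
# Sketch only: a made-up piecewise function, NOT one of the homework functions.
# The left piece is 5*x for x < 2; the right piece is a*x^2 for x >= 2.
h := x -> piecewise(x < 2, 5*x, a*x^2);

# The two pieces meet without a break when their values at x = 2 agree.
solve(5*2 = a*2^2, a);    # a = 5/2

# Plot with that value of a to see the pieces join at x = 2.
plot(subs(a = 5/2, h(x)), x = 0 .. 4);
```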
All angles for the rest of the course will be in radians. If you are unfamiliar with them, think of them as “metric
degrees,” where π radians is the same as 180 degrees.
Remember to put your calculator in radian mode! That is often done by pushing a MODE button, and usually needs
to be done each time the calculator is turned on. Radians are actually useful, but that fact is well camouflaged from high
school students. The reason appears in calculus. We will see it later.
Albert: Some calculators, like the HP’s and the better TI’s, allow you to set radian mode once, and it will stay. Most
others require you to set radian mode each time they are turned on.
Mugsy: I count on my fingers . . . all nine of them. How do I set them in radian mode?
Dudley: Albert, quit whimpering.
Double-angle formulas.
In contrast to the addition formulas, we will use the double-angle formulas on occasion. They come from the addition
formulas by setting A = B:
$$\sin(2A) = 2 \sin A \cos A$$
$$\cos(2A) = \cos^2 A - \sin^2 A = 2\cos^2 A - 1 = 1 - 2\sin^2 A$$
The variety of cosine double-angle formulas comes from the Pythagorean identity.
Half-angle formulas.
These come from the last two double-angle identities for cos(2A), changing A to A/2 and solving for the squared sine or
cosine.
$$\cos^2(A/2) = \frac{1 + \cos A}{2}$$
$$\sin^2(A/2) = \frac{1 - \cos A}{2}$$
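As a sketch of the step just described, here is the cosine case worked out. The intermediate lines are filled in here for illustration; the sine case goes the same way, starting from $\cos(2A) = 1 - 2\sin^2 A$.

```latex
\begin{align*}
\cos(2A) &= 2\cos^2 A - 1 \\
\cos A &= 2\cos^2(A/2) - 1 && \text{(change } A \text{ to } A/2\text{)} \\
\cos^2(A/2) &= \frac{1 + \cos A}{2} && \text{(solve for the squared cosine)}
\end{align*}
```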
Example.
Try this with the value 1/2. We want to solve sin θ = 1/2. The value to use to fill in is θ = π/6, since sin(π/6) = 1/2.
Thus Arcsin(1/2) = π/6.
Note that the outputs of the inverse trigonometric functions are angles, and that they are measured (as all angles in
calculus) in radians. Be careful when doing these on your calculator to keep it in radian mode.
Maple automatically works in radians.
Dudley: What’s this Maple he keeps talking about?
Albert: Hang on. The table of contents shows that it will appear at the end of this chapter.
Mugsy: Can’t he keep things in order?
All angles must be in radians, so the output of all inverse trigonometric functions is in radians. The notations for the
trigonometric and inverse trigonometric functions in Maple are:
sin(x); cos(x); tan(x); cot(x); sec(x); csc(x);
arcsin(x); arccos(x); arctan(x); arccot(x); arcsec(x); arccsc(x);
Note that Maple does not capitalize the inverse trigonometric functions.
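A short sketch of the Arcsin example in Maple, assuming nothing beyond the function names listed above and Maple's name Pi for π:

```maple
# Sketch only: exact values come back in radians.
arcsin(1/2);        # Pi/6
sin(Pi/6);          # 1/2

# A decimal approximation of the same angle, still in radians.
evalf(arcsin(1/2));
```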
Note that Maple uses these same ranges in version 6.1 (the current one).
From these, you can get between all of the inverse trigonometric functions.
The definition we are using for Arccot x would then mean that Arctan x + Arccot x = π/2 would always hold no matter
what the sign of x, but then Arctan(1/x) = Arccot x would fail for x < 0. You simply can’t win.
Mugsy: So, what’s new?
1. Draw the triangle with the inverse trigonometric function as an angle (I call the angle θ ). This will give you two
sides of the triangle.
2. Find the remaining side by the Pythagorean theorem.
3. Use the trigonometric function’s definition to get the answer.
Example.
Let’s actually work out sin(Arctan(w/2)). First, we draw a triangle, with a generic angle θ in it, and make it so that
θ = Arctan(w/2). That is, we set up the sides of the triangle so that tan θ = w/2. To do that, we set the vertical side to w,
and the horizontal side to 2.
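[Figure: right triangle with angle θ, horizontal side 2, vertical side w, and hypotenuse $\sqrt{w^2 + 4}$.]
The remaining side of the triangle (the hypotenuse in this case) can then be calculated, by the Pythagorean theorem, to be
$\sqrt{w^2 + 4}$. Then
$$\sin(\operatorname{Arctan}(w/2)) = \sin\theta = \frac{w}{\sqrt{w^2 + 4}}.$$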
Laws of exponents.
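The basic laws of exponents come directly from the idea that $a^n$ means a multiplied by itself n times (for n a positive
integer).
\begin{align}
a^x \times a^y &= a^{x+y} \tag{32} \\
\frac{a^x}{a^y} &= a^{x-y} \tag{33} \\
a^{-x} &= \frac{1}{a^x} \tag{34} \\
(a^x)^y &= a^{xy} \tag{35} \\
(ab)^x &= a^x \times b^x \tag{36} \\
\sqrt[x]{a} &= a^{1/x} \tag{37}
\end{align}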
To work properly, a must be non-zero for the ones that involve division, and a > 0 for the rest. In general, it is best to use
these with a > 0 for all of them. Certain special values must be memorized:
\begin{align}
a^0 &= 1 \tag{38} \\
a^{-1} &= 1/a \tag{39} \\
a^1 &= a \tag{40}
\end{align}
It is instructive to note that the graphs of $y = a^x$ and $y = 1/a^x$ are reflections about the y-axis. This happens because of
a property of exponents, namely that $1/a^x = a^{-x}$. So, putting a value of x into $a^x$ gives the same value as putting −x into
the function $1/a^x = a^{-x}$. Since reflecting about the y-axis changes the sign of x, we have that reflecting the points of $y = a^x$
about the y-axis will give the points on the graph of $y = a^{-x}$.
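To see the reflection, a one-line Maple plot is enough. This is only a sketch; the base 2 and the plotting window are arbitrary choices made for illustration.

```maple
# Sketch only: base 2 is an arbitrary choice for illustration.
# The two curves should be mirror images of each other across the y-axis.
plot([2^x, 2^(-x)], x = -3 .. 3);
```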
Laws of logarithms.
The laws of logarithms are just the reworking of the laws of exponents, using the fact that these are the inverse functions of
exponentials. Here are the parallels.
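$$\begin{array}{ll}
\textbf{Logarithms} & \textbf{Exponentials} \\
\log_a(rs) = \log_a r + \log_a s & a^x \times a^y = a^{x+y} \\
\log_a(r/s) = \log_a r - \log_a s & a^x / a^y = a^{x-y} \\
\log_a(r^u) = u \log_a r & (a^x)^u = a^{x \times u} \\
\log_a(\sqrt[x]{a}) = 1/x & \sqrt[x]{a} = a^{1/x}
\end{array}$$
The parallels show up when you use $r = a^x$ and $s = a^y$, so $\log_a r = x$ and $\log_a s = y$.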
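A quick numerical spot check of two of these laws can be done in Maple. This is a sketch with arbitrarily chosen values (base 2, r = 8, s = 32, u = 5), using Maple's log[b](x) notation for a base-b logarithm; each pair of lines should print matching numbers.

```maple
# Sketch only: base 2 with r = 8 and s = 32 are arbitrary illustrative values.
# Product law: log_2(r*s) should equal log_2(r) + log_2(s).
evalf(log[2](8*32));
evalf(log[2](8) + log[2](32));

# Power law: log_2(r^u) should equal u*log_2(r), here with u = 5.
evalf(log[2](8^5));
evalf(5*log[2](8));
```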
Natural logarithms.
There is a base of logarithms that is used in higher math (calculus and up) so routinely that no other logarithm ever occurs
again except in very special and isolated cases. The base is e, Euler’s constant, the base of the exponential function.
The base e logarithm is so common that it is given a new notation, ln x, which is just $\log_e x$. It is called the natural
logarithm of x. We will learn more about it as we go along.
Dudley: Why isn’t that “nl” rather than “ln?” Isn’t it Natural Logarithm, rather than Logarithm Natural?
Albert: All those abbreviations come from Latin, where the order of words is different.
Example.
Suppose the pollution P(t) in a lake is described by $P(t) = P_0 e^{-4t}$, where $P_0$ is the initial pollution level and t is measured
in years. (This actually is a reasonable assumption for the pollution level in a lake when no more pollution is entering it.
We will discuss this more in later courses.) How long will it take the pollution to drop to 1% = 0.01 of its original level?
The solution to the problem looks like this. We want to find the value of t (call it $t_1$) when $P(t_1) = 0.01\,P_0$, so we solve for
it:
\begin{align*}
P_0 e^{-4 t_1} &= 0.01\,P_0 \\
e^{-4 t_1} &= 0.01 \\
-4 t_1 &= \ln(0.01) \\
t_1 &= -\frac{\ln(0.01)}{4} \approx 1.1513
\end{align*}
So the answer is that it will take 1.1513 years. Note that the critical step that made all of this work is taking logarithms of
both sides, so that the $t_1$ could be isolated.
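The same answer can be checked in Maple. This is a sketch only: it assumes the initial level has already been canceled from both sides, and the name t1 is an arbitrary choice.

```maple
# Sketch only: P0 cancels from both sides, so only the exponential remains.
# Solve exp(-4*t) = 1/100 for t, then convert the exact answer to a decimal.
t1 := solve(exp(-4*t) = 1/100, t);
evalf(t1);    # approximately 1.1513 years
```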
2. All the other trigonometric functions can be written in terms of sin θ and cos θ:
$$\tan\theta = \frac{\sin\theta}{\cos\theta}, \quad \cot\theta = \frac{\cos\theta}{\sin\theta}, \quad \sec\theta = \frac{1}{\cos\theta}, \quad \csc\theta = \frac{1}{\sin\theta}.$$
3. The one critical trigonometric identity is $\sin^2\theta + \cos^2\theta = 1$.
4. Inverse trigonometric functions are the mechanism for stripping a trigonometric function off of an angle. So, if
sin θ = t, then θ = Arcsin t, for example.
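\begin{align*}
a^x \times a^y &= a^{x+y} \\
\frac{a^x}{a^y} &= a^{x-y} \\
a^{-x} &= \frac{1}{a^x} \\
(a^x)^y &= a^{xy} \\
(ab)^x &= a^x \times b^x \\
\sqrt[x]{a} &= a^{1/x}
\end{align*}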
Derivatives - I
CHAPTER 1. DERIVATIVES - I 27
• The algorithms of computers. What does your computer do to find a square root, or the sine of an angle, or the
arctangent of a number? It’s not simple, and leads to some of the most complicated material we’ll encounter.
• Balancing bottles and rating stereos. I’ll leave this one up to your imagination. They tie together, and are a tip of
the iceberg of a vast array of applications relating to average values. At the end, we might take a brief plunge into
probability and statistics.
Only the first two applications will appear this semester.
If you have any other suggestions for topics, please give them to me! I am always on the lookout for better ways to do
things, other ways to tie this material together. If you have an idea, pass it along. For example, I would love to put in a
section on music, but don’t have enough different topics I can tie together under that heading.
Motivation—driving a car.
In order to bring this home in some detail, let’s take an example that everyone here should be familiar with, namely driving
a car. We will be simplistic for the moment, and assume the car is being driven along a flat, straight road. To figure out the
motion of the car, we only need to know its position (that is, its mile marker reading) at all times.
Dudley: Those of you from Kansas know all about this. Those of you from Vermont will just have to take our word
for it that flat, straight roads do exist.
(Later this semester, we will deal with motion in two directions, and next semester, we will look at roller coasters that deal
with three dimensions.)
Mugsy: That’s when Vermonters get their revenge, I guess.
The variables used. The independent variable is t, which is time, and the dependent variable is s, which is position. (I’d use x for position normally,
but I have a reason for using s, beyond the fact that physics often uses s.) We will write s = s(t) to emphasize independent
and dependent variables.
Average velocity. The formula for average velocity is v = s/t, but only if s and t both start at 0. Otherwise, we need what
is called ∆-notation. In general, ∆Q is the change in Q, where Q is any quantity under consideration. It is calculated by
∆Q = Q(end) − Q(beginning). This “(end) − (beginning)” theme will recur many times in calculus, and all of math.
That being done, we now have ∆s = distance traveled, and ∆t = elapsed time. Then we have v = ∆s/∆t no matter what
the starting values of t and s.
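Here is a small sketch of the ∆-notation at work in Maple. The mile-marker readings and clock times are made up for illustration, and all of the variable names are arbitrary choices.

```maple
# Sketch only: made-up readings for a trip that does not start at 0.
s_begin := 120:  s_end := 250:    # mile-marker readings, in miles
t_begin := 2:    t_end := 4:      # clock times, in hours

delta_s := s_end - s_begin:       # distance traveled: (end) - (beginning)
delta_t := t_end - t_begin:       # elapsed time: (end) - (beginning)
v := delta_s/delta_t;             # average velocity: 130/2 = 65 miles per hour
```

Note that the naive formula s/t, using the ending values alone, would give 250/4 = 62.5, which is not the average velocity for this stretch of road.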
Instantaneous velocity (speedometer reading). The actual velocity at a given time is of greater interest, say to a
policeman, than the average velocity.
Mugsy: That’s what he said, anyway.
Can we find that? The answer, of course, is yes. (Otherwise I wouldn’t ask it.)
The key is in ∆t. How big should we make it? For highway driving, ∆t = 1 minute might be fine. I tend to drive at a
reasonably constant speed (cruise control is handy!), which won’t change too fast. However, 1 minute probably won’t work
for in-town driving. Then, ∆t = 1 second is probably quite close, but even so, won’t be exact. How do we get the exact
velocity? We basically want to take ∆t as small as we can.
Dudley: How about minus infinity? That’s mighty small.
Albert: True. But here, we mean small to be in absolute value. ∆t needs to be very near zero.
What’s the smallest elapsed time we can take? Obviously, ∆t = 0. But ∆t = 0 gives some problems. In that case, ∆s = 0
also. (After all, if we haven’t had any time to move, we haven’t moved.)
Dudley: Even Mugsy can get that one, eh?
Mugsy: Shuddup.
We’d end up with v = 0/0. That’s not good. Before we answer this problem, we take a side-trip into geometry.
Change of variables in the problem. Here, the traditional independent variable is x, not t; the dependent variable is y,
not s. We write y = f (x) rather than s = s(t) to keep things straight. The reason I wanted to use s for position shows up here.
I didn’t want to use the same letter (x) for the independent variable in one setting and the dependent variable in another.
This can be confusing enough without that kind of problem, too.
Average velocity is the same as slope of a secant line. The slope of a line in general is ∆y/∆x, according to this new
∆-notation. (You might, or might not, have seen this before.) Slope of a secant line is ∆y/∆x where ∆y is the change in
the y-coordinate between the intersection points of the secant, and ∆x is the corresponding change in the x-coordinates.
Compare this to the average velocity being ∆s/∆t. It is exactly the same, with the dependent and independent variables in
corresponding places, too! That means that the only difference between the two is a matter of interpreting the meaning of
the variables. One interpretation gives the slope of a secant line, the other is an average velocity. We can use either way of
looking at such a quantity, and will!
It will be convenient to give a term to this ∆y/∆x or ∆s/∆t quantity. It is usually called a difference quotient.
How would we then get the slope of a tangent line? For that, we want to have the line come in and graze the curve at
a single point. You can simulate that by moving the two points of intersection of the secant line closer and closer to each
other. At the moment they collide, there is a single point where the line intersects the curve, which is then the point of tangency.
What happens in the difference quotient if you slide the points together? The value of ∆x gets smaller and smaller until
it hits zero. That’s not good. When ∆x is zero, so is ∆y. That leads again to 0/0 for the slope of the tangent line.
Instantaneous velocity corresponds to the slope of a tangent line. The slope of the tangent line is obtained by using
smaller and smaller values of ∆x in the slope of the secant line, leading to 0/0. The instantaneous velocity is obtained by
using smaller and smaller values of ∆t, leading to 0/0. It is exactly parallel in concept and calculation to the idea of a
tangent line. The derivative is the name for the single idea in both of these calculations.
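The effect of shrinking the elapsed time can be watched numerically. The following is only a sketch: the position function $s(t) = t^2$ is made up for illustration (it is not one from these notes), and the names s and avgvel are arbitrary.

```maple
# Sketch only: s(t) = t^2 is a made-up position function, not one from the notes.
s := t -> t^2;

# Average velocity (difference quotient) starting at t = 1 for an elapsed time h.
avgvel := h -> (s(1 + h) - s(1))/h;

# Shrink the elapsed time: the values settle toward 2, even though
# h = 0 itself would give the meaningless 0/0.
seq(avgvel(h), h = [1.0, 0.1, 0.01, 0.001]);
```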
Usefulness of this. The more ways you have of understanding derivatives, the easier it is to understand how to read
formulas. It is something like having several different ways of translating words from a foreign language into English or
vice versa. Sometimes one word is more natural in a specific setting.
Right now, we have that the instantaneous velocity corresponds in some way to the slope of a tangent line. Both of
these are central to the understanding of derivatives. But the best understanding of derivatives is yet to come!
Finding the slope of a general secant line through one specific point.
Why would we want to find the slope of a general secant line through one specific point? Because we will want to find the
tangent line, ultimately. And we find the tangent line by sliding the secant points together. For that, we will need some
general form of the secant line. We will end up fixing one point (nailing it down at the point we want the tangent), and
sliding the other point to it. In this case, we will end up finding the tangent line at the point (1, −2), so that is the point we
will leave alone (nail down).
We will use (1, −2) as the beginning point; we will use $(x_0, y_0)$ for the ending point. Then $\Delta x = x_0 - 1$ and
$\Delta y = y_0 - (-2) = y_0 + 2$. But, according to the remark earlier, we can figure out $y_0$ from $x_0$. Specifically, $y_0 = 2x_0^2 - 3x_0 - 1$,
by plugging into the formula. Then $\Delta y = (2x_0^2 - 3x_0 - 1) + 2 = 2x_0^2 - 3x_0 + 1$.
At this point, I need to guide the process a bit. To get what we want, we convert the formula for ∆y to have ∆x in it
rather than $x_0$. This is a simple, but critical, step. It is not the sort of algebra trick that makes sense now, but shortly I will
come back and explain why, and it should make sense then.
So, we need to get a formula for $x_0$ that is in terms of ∆x. That’s simple to find. We have $\Delta x = x_0 - 1$, which we can
just solve. We get $x_0 = \Delta x + 1$. This is always an easy step. The messy step is plugging all of this into the formula for ∆y.
You get
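\begin{align}
\Delta y &= 2x_0^2 - 3x_0 + 1 \tag{1.1} \\
&= 2(\Delta x + 1)^2 - 3(\Delta x + 1) + 1 \tag{1.2} \\
&= \left(2(\Delta x)^2 + 4\,\Delta x + 2\right) + (-3\,\Delta x - 3) + 1 \tag{1.3} \\
&= 2(\Delta x)^2 + \Delta x \tag{1.4} \\
&= \Delta x + 2(\Delta x)^2 \tag{1.5}
\end{align}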
Every term of the simplified form of ∆y has a factor of ∆x in it. This should always happen (for now). If it doesn’t, check
your algebra.
Factor out the ∆x from the terms on the right hand side, and you get
∆y = ∆x (1 + 2 ∆x).
This is the critical step! Why? Because we want ultimately $m_{\text{sec}}$, which is ∆y/∆x, and we need a factor of ∆x in ∆y in order
to be able to do that nicely. This is the reason that I changed from using $x_0$ to ∆x.
So, divide this ∆y by ∆x and you get
$m_{\text{sec}} = \Delta y/\Delta x = 1 + 2\,\Delta x$.
Note that for the section right before this, where we were working with (1, −2) and (3, 8), we had ∆x = 2 and $m_{\text{sec}} = 5$.
Using the formula for the slope of the secant line we just got (now in terms of ∆x), we get $m_{\text{sec}} = 1 + 2\,\Delta x = 1 + 2(2) = 5$,
just as we got before!
Now we can find the slope of any secant line, as long as it is to the curve $y = 2x^2 - 3x - 1$, and the secant goes through
(1, −2).
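The algebra above can be double-checked in Maple with nothing but expand, factor, simplify, and subs. This is a sketch under stated assumptions: the names p, dy, and msec are arbitrary choices made here, and delta_x stands for ∆x.

```maple
# Sketch only: p is a made-up name for the function y = 2*x^2 - 3*x - 1.
p := x -> 2*x^2 - 3*x - 1;

# Change in y from the fixed point x = 1 to the sliding point x = 1 + delta_x.
dy := expand(p(1 + delta_x) - p(1));    # 2*delta_x^2 + delta_x

# Every term has a factor of delta_x, so it can be factored out ...
factor(dy);

# ... and dividing by delta_x leaves the slope of the secant line.
msec := simplify(dy/delta_x);           # 2*delta_x + 1

# Check against the earlier secant through (1, -2) and (3, 8), where delta_x = 2.
subs(delta_x = 2, msec);                # 5
```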
Mugsy: And you’re going to tell me that this stuff is actually useful?
Albert: Not yet. We need to learn quite a bit more first.
diffquo := proc(y, x, h)
local var;
var := indets(y)[1];
simplify((subs(var = x + h, y) - subs(var = x, y))/h)
end proc
Note that diffquo(); tells you the order it expects information to be given to it: First the function, then the x-value
of the point at which you want the secant line, and finally the value of ∆x. Note that you can use any letter you want for
delta_x; a typical one is h, as in the defining procedure (what prints out on the screen after you type read(diffquo);).
If you type
> y := 2*x^2 - 3*x - 1;
> diffquo( y, 1, delta_x);
> diffquo( y, 1, 2 );
Maple responds with
y := 2 x^2 − 3 x − 1
1 + 2 delta_x
5
How about that? Could life be any easier?
Mugsy: It could do my homework for me, too.
I genuinely urge you to get used to Maple for yourself. We will be using it regularly throughout the course. If you are
having difficulty with it, come see me soon, and we’ll work out the problems. Maple is definitely finicky about some things
(usually for a good reason, but not usually for an obvious reason). Let me remind you that in order to diagnose Maple’s
quirks, I will either need to see a copy of a printout of the session or you will have to come and get me while Maple is still
running. It is usually not possible to answer questions like “Why did Maple give me this wrong answer?”
Homework #4
Exercises.
1. For this exercise, we continue to use the function $y = 2x^2 - 3x - 1$ from class, with starting point (1, −2). This means
that you can use the values and formulas we just got in the notes.
(a) Find the y-value to go with x = 5, and the values of ∆x and ∆y that go with these values of x and y.
(b) Find $m_{\text{sec}}$, the slope of the secant line to the graph of the function, through the points with x = 1 and x = 5 by
dividing the value of ∆y by the value of ∆x from the previous part.
(c) In the notes, we derived a formula for $m_{\text{sec}}$ in terms of ∆x. Plug the value of ∆x from the first part of this exercise
into that formula to show that it gives the same value for $m_{\text{sec}}$ as you got by dividing ∆y by ∆x. (All I want to
show is that the formula works.)
2. For this exercise, we continue again to use the function $y = 2x^2 - 3x - 1$ from class, with starting point (1, −2).
(a) Find the y-value to go with x = 2, and the values of ∆x and ∆y that go with these values of x and y.
(b) Find $m_{\text{sec}}$, the slope of the secant line to the graph of the function, through the points with x = 1 and x = 2.
(c) Plug the value of ∆x from the first part of this exercise into the formula for $m_{\text{sec}}$ and show that it gives the same
value as you got by dividing ∆y by ∆x.
3. This exercise explores what happens if ∆x is negative. We will continue using $y = 2x^2 - 3x - 1$ with starting point
(1, −2).
(a) Find the y-value to correspond to the x-value −2, and calculate the values of ∆x and ∆y.
(b) Find the slope of the secant line through (1, −2) and (−2, value from first part) by dividing the value of ∆y by the
value of ∆x from the previous part.
(c) What does the formula $m_{\text{sec}} = 1 + 2\,\Delta x$ give for the value of ∆x from the first part of this exercise?
(d) On the basis of the preceding parts, does ∆x being negative cause errors in the formula for the slope of the
secant line? (If your answers are different, negative values of ∆x do cause problems. Hint: There shouldn’t be
problems.)
Problems.
1. In this problem, we tackle a linear interpolation. I’ll give you all the steps. We will be using our favorite function,
$y = 2x^2 - 3x - 1$, again. Of course, that is a strange thing to do, since we can work out any value we want without
interpolation. We do this in order to verify that we are getting reasonable answers; we can check that the answer we
get is fairly close to the actual, correct value.
(a) First, we need to construct a table of values for the function. In order to make sure that we don’t give too much
away, we will take values that won’t interfere with the other things we are going to do. So, for this part of the
problem, find the values of y that correspond to the values of x at x = 7 and x = 9.
(b) Next, find the values of ∆x and ∆y, and the difference quotient for this situation. That gives $m_{\text{sec}}$. Once we have
found $m_{\text{sec}}$, though, we next use new values of x and y, but that value of $m_{\text{sec}}$.
(c) We want to approximate y when x = 7.2, using this. To do that, we set $x_0 = 7.0$ and $x_1 = 7.2$, with $y_0$ being the
value that corresponds to $x_0$ (we got that in part (a)), and $y_1$ being the value that we are trying to approximate.
We make the grand assumption that the value of $m_{\text{sec}}$ for this segment equals the $m_{\text{sec}}$ that we calculated in
the previous part of this problem. Of course, it is not exactly equal, but it should be close enough. That’s the
approximation.
So, calculate $\Delta x = x_1 - x_0$ and $\Delta y = y_1 - y_0$ for these new values in this part of the problem. When you try to
find ∆y, you won’t know what value to put in for $y_1$ yet, but that’s fine, since it is what we are going to end up
solving for in a moment.
(d) Now, set the value of $m_{\text{sec}}$ from part (b) of this problem equal to the value of ∆y/∆x using the values from part
(c). You should know everything in the equation except for the value of $y_1$. Solve for $y_1$, and get a value for it.
This is the linear interpolation value for y when x = 7.2.
(e) Calculate the actual value of y using the functional equation when x = 7.2. It should be quite close to the value
of y that you got in the previous part of this problem.
2. In this problem, we tackle another linear interpolation, but this time from real life, and no other information. Look
carefully back at the previous problem. The table of values is given to us, so we don’t have to construct the table
ourselves. The rest of the problem should be just like the previous part.
The following table is taken from the 81st edition of the CRC Handbook of Chemistry and Physics. It gives the index
of refraction of air at different wavelengths. (This information is important when you are analyzing properties of the
light given off by different substances.) The column headings are irrelevant, so I will simply call them x and y. If you
want to check out their precise meaning, you can look in the Handbook.
x y
500 27896
510 27870
520 27846
530 27824
540 27803
550 27782
1 + 2 delta_x
1
That’s the slope of the tangent line.
What is the equation of the tangent line, then? We have two pieces of information (which is how many you always
need for the equation of a line): We have a point the line goes through, (1, −2), and we have the slope, 1. Use the point-slope
form of the line (which is the one always to use in calculus): $y - y_0 = m(x - x_0)$. In this problem, m = 1, and
$(x_0, y_0) = (1, -2)$, so the equation of the line tangent to $y = 2x^2 - 3x - 1$ at the point (1, −2) is y − (−2) = (1) (x − 1) or
y = x − 3 if you want to simplify it (unnecessary).
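A picture is a reasonable sanity check here. The following one-line sketch plots the curve and the line just found on the same axes; the plotting window is an arbitrary choice.

```maple
# Sketch only: draw the curve and the tangent line found above on one set of axes.
# The line y = x - 3 should just graze the parabola at the point (1, -2).
plot([2*x^2 - 3*x - 1, x - 3], x = -1 .. 3);
```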
Finding the slope of the secant line between two general points.
It is a bit restrictive to be able only to work around the point (1, −2) and only on the curve $y = 2x^2 - 3x - 1$.
Mugsy: You’d better believe it.
First, we get rid of the restriction of being only at the point (1, −2). In a moment, we will work with general curves, too.
In order to get the slope of a generic secant line, $m_{\text{sec}} = \Delta y/\Delta x$, we need two generic points. I will use the points with
x-coordinates $x_1$ and $x_2$. From there, we get the y-coordinates (remember we only need the x-coordinates) for this function,
$y_1 = 2x_1^2 - 3x_1 - 1$ and $y_2 = 2x_2^2 - 3x_2 - 1$.
The next thing to get is ∆x and ∆y. That’s not too hard: $\Delta x = x_2 - x_1$ and $\Delta y = y_2 - y_1 = (2x_2^2 - 3x_2 - 1) - (2x_1^2 - 3x_1 - 1)$.
That will require some effort to simplify.
The trick, as before, is to write ∆y in terms of $x_1$ and ∆x. This is done by solving $\Delta x = x_2 - x_1$ for $x_2$, plugging into
the equation for ∆y and “simplifying” (multiplying out and canceling terms in) the resulting mess. (And, yes, I will explain
just why we want to do this, quite soon.)
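\begin{align}
\Delta y &= (2x_2^2 - 3x_2 - 1) - (2x_1^2 - 3x_1 - 1) \tag{1.6} \\
&= \left(2(x_1 + \Delta x)^2 - 3(x_1 + \Delta x) - 1\right) - (2x_1^2 - 3x_1 - 1) \tag{1.7} \\
&= \left(2(x_1^2 + 2x_1\,\Delta x + (\Delta x)^2) - 3(x_1 + \Delta x) - 1\right) - (2x_1^2 - 3x_1 - 1) \tag{1.8} \\
&= \left(2x_1^2 + 4x_1\,\Delta x + 2(\Delta x)^2 - 3x_1 - 3\,\Delta x - 1\right) - (2x_1^2 - 3x_1 - 1) \tag{1.9} \\
&= 4x_1\,\Delta x + 2(\Delta x)^2 - 3\,\Delta x \tag{1.10}
\end{align}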
Note that each term that’s left has a factor of ∆x in it, as before. Factoring out that ∆x gives
$\Delta y = \Delta x \times (4x_1 + 2\,\Delta x - 3)$.
Dividing by ∆x then gives $m_{\text{sec}} = 4x_1 + 2\,\Delta x - 3$. As a check, if we take $x_1 = 1$, we get
$m_{\text{sec}} = 4(1) + 2\,\Delta x - 3 = 1 + 2\,\Delta x$, which checks with the formula that we had before.
This can also be done on Maple, using diffquo();. After you have gotten into Maple and read in diffquo and defined
y, you can give Maple the command
> diffquo(y, x_1, delta_x);
4 x_1 + 2 delta_x − 3
and it gives just what it should give.
The slope of the tangent line can be obtained from the formula for $m_{\text{sec}}$, by setting ∆x = 0 (note, after simplifying). You
get that the slope of the tangent line at $(x_1, y_1)$ is $4x_1 - 3$. Again, this can be done in Maple, by continuing the computation
with
> subs(delta_x=0, %);
4 x_1 − 3
In the case that $x_1 = 1$, which is what we used before, we get that the slope is 4(1) − 3 = 1, which is the result that we
got earlier.
Note: The function 4 x − 3 is not the equation of the tangent line. It is a formula for the slopes of all the tangent lines to
the function.
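To make the distinction concrete, here is a sketch of how the slope formula is used to build a tangent line at a different point. The choice $x_1 = 2$ is made here purely for illustration.

```latex
\begin{align*}
y_1 &= 2(2)^2 - 3(2) - 1 = 1 && \text{(point on the curve at } x_1 = 2\text{)} \\
m_{\text{tan}} &= 4(2) - 3 = 5 && \text{(slope from the formula } 4x_1 - 3\text{)} \\
y - 1 &= 5(x - 2) && \text{(point-slope form of the tangent line)}
\end{align*}
```

The formula supplies only the slope, 5; the line itself still has to be assembled from that slope and the point (2, 1).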
Summary of procedures:
To find $m_{\text{sec}} = \Delta y/\Delta x$, the slope of the secant line joining two generic points $(x_1, y_1)$ and $(x_2, y_2)$ of a function y = f (x),
perform the following steps.
1. Set $\Delta x = x_2 - x_1$, and calculate $\Delta y = y_2 - y_1 = f(x_2) - f(x_1)$.
2. Plug $x_2 = x_1 + \Delta x$ into ∆y and simplify the result. (This usually involves multiplying out terms and canceling whatever
you can.)
3. If f (x) is a polynomial, you should get that, after the simplification, every term has a factor of ∆x in it. Factor that ∆x
out, so that you get ∆y = ∆x × (something).
4. Divide both sides of the equation for ∆y by ∆x, and you get $m_{\text{sec}} = \Delta y/\Delta x$.
What you get is a formula for the slope of the secant line joining any two points of the curve y = f (x).
To find $m_{\text{tan}}$, the slope of the tangent line to y = f (x) at the point $(x_1, y_1)$, perform the following steps.
1. Find $m_{\text{sec}}$ by the procedure just given.
2. Set ∆x = 0 in the resulting (simplified) expression to get the slope of the tangent line. What you get is a formula for
the slope of the tangent line at any point $(x_1, y_1)$ on the curve y = f (x).
Occasionally, it is easier to use $x_1$ and ∆x rather than $x_1$ and $x_2$ right from the start. In that situation, you’d start with
$\Delta y = f(x_1 + \Delta x) - f(x_1)$, and never even refer to $x_2$. If you prefer that, you can use it.
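The steps of the summary can be followed one at a time in Maple without any special procedure. The sketch below uses a made-up polynomial, $q(x) = x^3$, chosen only for illustration; the names q, dy, msec, and mtan are arbitrary, and x_1 and delta_x stand for $x_1$ and ∆x.

```maple
# Sketch only: q(x) = x^3 is a made-up polynomial chosen for illustration.
q := x -> x^3;

# Steps 1 and 2: form delta_y with x_2 replaced by x_1 + delta_x, then multiply out.
dy := expand(q(x_1 + delta_x) - q(x_1));

# Step 3: every term should contain delta_x, so it factors out.
factor(dy);

# Step 4: divide by delta_x to get the slope of the secant line.
msec := simplify(dy/delta_x);

# Tangent slope: set delta_x = 0 in the simplified secant slope.
mtan := subs(delta_x = 0, msec);    # 3*x_1^2
```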
Mugsy: Which one is easier, if I gotta do it?
Albert: Whichever you prefer. For the functions at this level of calculus, it won’t make any significant difference. Take
your favorite one.
Mugsy: Really?! I’ll take Maple.
Dudley: I thought you didn’t like Maple!
Mugsy: I don’t. I dislike algebra more.
Homework #5
Exercises.
When a very (but not too) wiggly curve is magnified enough, it looks like a straight line. The derivative is nothing but the
slope of that line! As we magnify, we will be homing in on specific points on the curve. Different points, when magnified
around, give lines with different slopes. All that means is that the slope of the tangent line, that is, the derivative, changes
from one point to another. The derivative is a function, too!
Dudley: Why does that give me a feeling like we aren’t done yet?
It should also be mentioned that not every curve can have its wiggles magnified away. An example (the only one that
we will encounter) is the absolute value of x. The graphs of y = |x | are given for the domain [−5, 5] and [−0.0001, 0.0001].
> plot( abs(x), x=-5 .. 5, color=black, scaling=constrained);
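Assuming the same options as the plot just given, the zoomed-in graph on [−0.0001, 0.0001] would come from changing only the range:

```maple
> plot( abs(x), x=-0.0001 .. 0.0001, color=black, scaling=constrained);
```

Apart from the numbers on the axes, the two pictures should look the same.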
The corner at the origin persists. In fact, all corners will always persist. At a corner, there is no one single tangent line,
and therefore there is no single slope, which means there is no derivative.
The tangent line to y = f (x) at (x1 , y1 ) is the best linear approximation in this sense: Any other line through (x1 , y1 )
pulls away from the graph of y = f (x) much faster than the tangent line. We will learn more about this later.
When you are dealing with an (x, y)-graph of a function, the notation that is used for the derivative also has a number
of forms. One is
\[ dy/dx = \frac{dy}{dx}, \]
which is nothing more than an adaptation of the ∆y/∆x notation.
There is a bit of information that is automatic in the notation dy/dx that is not apparent at first. It implies that x is the
independent variable and y is the dependent variable. This seemingly innocent comment will come back later on both to
cause problems and to give help. We often will use the phrase “the derivative of y with respect to x,” and that carries the
same meaning about which variable is dependent and which is independent.
There are other notations for derivatives that have a different look. If you take $\frac{dy}{dx}$ and treat it as a method of converting
y to its derivative, you can view this as y becoming $\frac{dy}{dx}$, and so you will think of the derivative as $\frac{dy}{dx} = \frac{d}{dx}\,y$, by just pulling
the y down in front. In that case, $\frac{d}{dx}$ becomes the differentiation operator (starting with y, it gives $\frac{dy}{dx}$). Accordingly, $\frac{dy}{dx}$ is
often written as d/dx(y).
Probably the simplest notation is y′. The prime (superscripted dash) denotes differentiation. That is, y′ and dy/dx mean
exactly the same thing. Another, less common, notation is Dy, or occasionally, D_x y.
For f(x) notation, all the same things are used. That is, each of the following represents the derivative of f(x):
\[ \frac{d f(x)}{dx}, \quad \frac{d}{dx} f(x), \quad f'(x), \quad D f(x). \]
There is no systematic notation for the difference quotient, except perhaps for msec , but that doesn’t tell you what the
function or points were.
The notation y′ is usually credited to Lagrange (Newton’s own notation put a dot over the variable). Leibniz used dy/dx. Of the two, Leibniz’s notation works the best in the sense that
formulas are easiest to remember in his form. But both are simple and exceedingly common. You will need to be familiar
with both. I will use both in this course to make sure that you get used to both.
Homework #6
Exercises.
1. In an earlier homework exercise, we worked out the slope of the secant line to y = 2x² − 3x − 1, through x = 1 and
x = 5. We assumed then that x = 1 was the beginning point and x = 5 was the ending point. Suppose instead now
that x = 5 is the beginning point and x = 1 is the ending point.
(a) Compare the values of ∆x and ∆y then to the values of ∆x and ∆y now. That is, is there any obvious relation
between the values then and now?
(b) Calculate a new value of msec using the new values of ∆x and ∆y. Compare the new value of msec to the old
value.
(c) On the basis of the results from the previous parts of the exercise, which of ∆x, ∆y, and msec change when the
order of the points changes?
2. Also in that same homework set, we worked out the slope of the secant line to y = 2x² − 3x − 1, through x = 1 and
x = 2. We assumed then that x = 1 was the beginning point and x = 2 was the ending point. Suppose instead now
that x = 2 is the beginning point and x = 1 is the ending point.
(a) Compare the values of ∆x and ∆y now to the values then. That is, is there any obvious relation between the
values then and now?
(b) Calculate a new value of msec using the new values of ∆x and ∆y. Compare the new value of msec to the old
value.
(c) On the basis of the results from the previous parts of the exercise, which of ∆x, ∆y, and msec change when the
order of the points changes?
3. I gave a step-by-step procedure for finding msec for a function y = f (x) with generic points. Try that process on the
function y = tan x. At what step does the procedure break down? [We’ll develop a different method to deal with such
functions.]
4. Find the slopes of the tangent lines to y = 2x² − 3x − 1 at the points x = 1, x = 3, and x = 5. (You can use the work
we did in class.) Find the equations of the tangent lines to the graph at those points.
The value of ∆x is what I call the “input wiggle;” it represents how much the input changes. The value of ∆y is the “output
wiggle;” it represents how much the output changes. Note that the input wiggle causes the output wiggle.
Wiggles are based around some initial value. When you change something slightly, you are changing away from a value
for that thing. This will turn out to be important soon. Hopefully, it will also make more sense then, too.
Dudley: Is it all right if I don’t understand this remark?
Albert: Look at it this way. When we were magnifying a curve to get a “line,” the slope of that line depended on
where we started, right?
Dudley: I guess.
Albert: The wiggles will depend on where we start, too. And those starting points are what we are wiggling from.
Does that help?
Mugsy: Nope. But you did try.
∆y ≈ (dy/dx)∆x.
What we do is solve for output wiggle, ∆y. We are asking for how much the output will wiggle when the input is wiggled by
a certain amount, ∆x. We could use this to estimate (dy/dx), and we will do that, later. But first, we still need to understand
this equation.
The equation
∆y ≈ (dy/dx)∆x
says that the output wiggle is (roughly) directly proportional to the input wiggle, and the derivative is the constant of
proportionality. If we halve the input wiggle, we will halve the output wiggle, approximately. Double the input wiggle, the
output wiggle doubles, again approximately.
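To see the proportionality with actual numbers, here is a small check using f(x) = x² near x = 3 (a function picked just for illustration), comparing the exact output wiggles for ∆x = 0.2 and ∆x = 0.1:

```latex
\begin{align*}
\Delta x = 0.2: &\quad \Delta y = (3.2)^2 - 3^2 = 10.24 - 9 = 1.24 \\
\Delta x = 0.1: &\quad \Delta y = (3.1)^2 - 3^2 = 9.61 - 9 = 0.61 \\
\text{ratio}: &\quad \frac{1.24}{0.61} \approx 2.03
\end{align*}
```

Halving the input wiggle cut the output wiggle roughly, but not exactly, in half.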
∆y ≈ (dy/dx)∆x (1.11)
You should memorize this formula! It is probably the most important formula in the course!
Mugsy: Is he being serious?
Albert: Probably. It certainly is fundamental, and it is quite possible to build up a large portion of calculus from this
formula. That might be exactly what he will do.
msec × ∆x = (∆y/∆x) × ∆x = ∆y
also exactly. As the slope of the secant line changes to mtan , the formula changes to an approximation,
(mtan ) × ∆x ≈ msec × ∆x = ∆y
or
(mtan ) × ∆x ≈ ∆y
Since the derivative equals mtan = dy/dx, this is just the same as the wiggle magnification formula.
Another way of seeing it is to remember that as you zoom in on a curve, it flattens out. This zooming-in is the same as
taking ∆x small; we are only looking at values very near a point. As the curve becomes more like a straight line, values of
∆y/∆x get closer and closer to the slope of the curve at the point. That isn’t saying anything more than that the
curve begins to look a lot like a straight line. The slope of the line that the curve resembles is the slope of the tangent line,
which is dy/dx ≈ ∆y/∆x. In that case, you can find ∆y ≈ (dy/dx)∆x. We can drive the point further home by looking at
specific ranges of dy/dx. Here’s a series of pictures for the four possible cases.
Remember that ∆x positive means that the arrow points to the right (that is, x is increasing), while ∆x negative means
that the arrow points to the left. Similarly, ∆y positive means that its arrow points upward while ∆y negative means the
arrow points downward.
For dy/dx large and positive, the slope of the tangent line is large and positive. This means that the ∆y arrow will be
much longer than ∆x.
[Figure: x-y axes with a long upward ∆y arrow and a short rightward ∆x arrow.]
For dy/dx small and positive, ∆y will be much shorter than ∆x.
[Figure: x-y axes with a long rightward ∆x arrow and a short upward ∆y arrow.]
For dy/dx negative, we have to realize that ∆y and ∆x will be of opposite signs. That is, if ∆x is positive, ∆y will be negative.
For dy/dx small and negative, ∆y will be small in comparison to ∆x, and opposite sign.
[Figure: x-y axes with a gently falling line, a rightward ∆x arrow, and a short downward ∆y arrow.]
For dy/dx large and negative, ∆y will be large in comparison to ∆x, but of the opposite sign.
[Figure: x-y axes with a steeply falling line, a short rightward ∆x arrow, and a long downward ∆y arrow.]
Where dy/dx > 0, the function is said to be increasing and the graph is rising. This makes sense, if you realize that a positive
slope on the tangent line means that the function is headed uphill (toward the right). Where dy/dx < 0, the function is said
to be decreasing and the graph is falling. This also makes sense, since the tangent line is headed downhill (toward the right).
When you do graphing by hand, this information is very important.
On the other hand, with graphing calculators and programs like Maple, graphing by hand carries less motivation. This
takes away a lot of the usefulness of this approach. But for reference, we will summarize:
If dy/dx > 0, the function is increasing and the graph is rising.
If dy/dx < 0, the function is decreasing and the graph is falling.
f(x) = x² (1.12)
x1 = 3 (1.13)
∆x = 0.2 (1.14)
f′(x) = 2x (1.15)
Calculating ∆y:
x2 = x1 + ∆x = 3.2 (1.19)
y2 = f(x2) = (3.2)² = 10.24 (1.20)
∆y = y2 − y1 = 10.24 − 9 = 1.24 (1.21)
Now let’s look carefully at what we did. The derivative was 6, which says the output wiggle will be roughly 6 times the
input wiggle. The input wiggle was given to us as 0.2, so the output wiggle should be about 6 × 0.2 = 1.2. When we
actually calculated the exact output wiggle, we got 1.24, and the final line just says that 1.24 ≈ 1.2, the exact wiggle is
approximately equal to 6 times the input wiggle. It really worked!
We could have used this same approach to approximate (3.2)² if we wanted to. Here’s how. We know that 3² is 9,
and that’s at least close. But we can improve the approximation using the wiggle magnification formula. We find that the
derivative (wiggle magnification factor) of the function f(x) = x² at x = 3 has the value 6. (We will be able to do this Real
Soon Now.) So, if we want to wiggle the input by 0.2 (from 3 to 3.2), the output will wiggle by about 6 × 0.2 = 1.2. That
means that the output of (3.2)² should be close to 9 + 1.2 = 10.2. That really is pretty close, since the exact value is 10.24.
In the section on algorithms of computers, we will learn how to improve this approximation process. We shall return to
this!
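The same comparison can be sketched in Maple. The names exact_wiggle and estimate are made up for this illustration, diff(); is Maple's differentiation command, and the sketch assumes x has not been assigned a value earlier in the session.

```maple
> f := x -> x^2;
> x1 := 3;
> Delta_x := 0.2;
> exact_wiggle := f(x1 + Delta_x) - f(x1);                 # (3.2)^2 - 3^2
> estimate := subs( x = x1, diff(f(x), x) ) * Delta_x;     # derivative at x1 times the input wiggle
> f(x1) + estimate;                                        # approximation to (3.2)^2
```

These should reproduce the 1.24, 1.2, and 10.2 found by hand.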
Dudley: Impending doom again?
Albert: Only for those taking calculus next semester.
Mugsy: Great! That means I don’t have to worry about it.
Albert: Oh? You’re in next semester, too. Public demand.
(the correct term is interpolate) what the value is in terms of the values just above and below yours. (Don’t get the idea that
interpolation is out of date, though. There are many situations—chemistry comes immediately to mind—where physical
data has been gathered for a certain range of conditions, but your specific interest is for values inside that range, but not
exactly equal to any of the table values. Interpolation is the only way to go then!) Those extra numbers around the side
of the columns are there to help you interpolate. If your independent variable (the one you can control) is called x, and
the dependent variable you are measuring (that depends on x) is called y, the table lets you read off y if you know x, but
only for specific values of x. For other values of x, you use essentially the wiggle magnification formula to estimate ∆y as
(dy/dx)∆x, where ∆x is easy to figure: It is the difference between what you have and the nearest entry in the table. The
value you don’t know easily is (dy/dx), the wiggle magnification factor. But that is precisely what those extra numbers
around the sides are! Then you can find ∆y approximately using the wiggle magnification formula. Then knowing ∆y
roughly and the value of y you have in the table, you can figure out a corrected value of y that should be considerably more
accurate.
We did a problem going over linear interpolation way back in this chapter. It’s the same thing, except that you were
approximating the derivative by a secant line, obtained from two points that were very close to each other (on either side of
the point you were looking for). And as long as the tabulated values are close together, we can treat it as a highly magnified
function, which will look very much like a straight line. That means that linear interpolation is liable to be quite accurate.
(There are more accurate methods that use more points than the two on either side of the value you have, but we aren’t
going to go into those here.)
Homework #7
Exercises.
1. Calculate ∆y and f′(x1)∆x for each of the following, and show that they are approximately equal. The example of
approximating (3.2)² shows how to do this. The use of either a calculator or Maple would be extremely handy. I give
you the derivatives, which we will learn to do next.
(a) f(x) = 3x² − 2x, x1 = 2, ∆x = 0.1, f′(x) = 6x − 2
(b) f(x) = 1/√x, x1 = 9, ∆x = 0.2, f′(x) = −1/(2√(x³))
(c) f(x) = (x + 1)/x, x1 = 4, ∆x = 0.2, f′(x) = −1/x²
2. Calculate ∆y and f′(x1)∆x for each of the following, and show that they are approximately equal. Again, I give you
the derivatives.
(a) f(x) = x³ − 2x², x1 = 2, ∆x = 0.3, f′(x) = 3x² − 4x
(b) f(x) = √x, x1 = 1, ∆x = 0.1, f′(x) = 1/(2√x)
(c) f(x) = 1/x, x1 = 5, ∆x = 0.1, f′(x) = −1/x²
Problems.
1. In this problem, we show that the wiggle magnification factor formula ∆y ≈ f′(x) × ∆x is actually always exactly
equal (rather than just approximately equal) for straight lines. The equation for a line is y = f(x) = mx + b, which
has derivative f′(x) = m.
Albert: That’s the slope of the line, after all. You don’t even need to magnify it in this case!
Use this information for the rest of this problem. Leave everything in terms of letters; don’t substitute numbers here.
Essentially, you should follow the procedure given in the notes for finding msec for two generic points.
(a) Use x-coordinates x1 and x2 (leaving them as letters and not using numbers), and calculate the corresponding
y-coordinates y1 and y2 by plugging into the equation for the line.
(b) Plug those values into the equation ∆y = y2 − y1, for the exact wiggle, and simplify what you get. It should
reduce to m(x2 − x1), which is exactly the value of f′(x)∆x. This says that the exact wiggle is exactly equal to
the wiggle magnification factor approximation for a line (only).
2. The following picture shows a line with small negative slope, but both ∆x and ∆y are positive, giving a positive slope.
What’s wrong?
[Figure: x-y axes with a gently falling line, a rightward ∆x arrow, and an upward ∆y arrow.]
Investigation.
1. In the preceding exercises and problem, we discovered that the wiggle magnification formula is good, and is even
exact for straight lines. This investigation examines how far off the formula can get, and why it works
best for ∆x smallest. We will look carefully at one specific function, f(x) = 1/x with derivative f′(x) = −1/x².
For this investigation, draw on your homework paper (it might need to be sideways to fit on the page) the following
table:
∆x    x2    y2    ∆y    f′(x1)∆x    Error = |∆y − f′(x1)∆x|    Error/(∆x)²
(a) Take x1 = 0.5 and ∆x = 0.1. Compare ∆y and f′(x1)∆x the way you did in the homework exercises.
(b) Would you expect the comparison to be better or worse with ∆x = 0.01? Why?
(c) Work out the comparison in the exercises using the same f (x) and x1 but with ∆x = 0.01, ∆x = 0.001, and
∆x = 0.0001. (You will have to be careful with these if you use a calculator. I recommend you write down all
the digits your calculator gives. Don’t round off or you will spoil the problem! ) Create a table with columns
labeled at the top
∆x, x2, y2, ∆y, f′(x1)∆x, and |∆y − f′(x1)∆x|.
Leave room for one more column, to be added next. (Note that x1 = 0.5 and y1 = 2.0 will always be the same,
so separate columns for them are unnecessary.) (Also note that if you get 0 for any of the numbers in the last
column, you rounded when you weren’t supposed to. Go back and recalculate those numbers, NOW! )
(d) The wiggle magnification formula is an approximation, not an exact equality, so you should expect that ∆y and
f′(x1)∆x will be slightly different. This difference is the amount that the wiggle magnification formula is off
by, and is the value in the last column. That amount is called the error in the formula. That’s what we want
to look at very carefully. For a line, the approximation is exact, with no error. So, the last column gives how
much the function is not a line, and later on (next semester) we will see that the error should be roughly quadratic
in ∆x, that is, proportional to (∆x)². If that’s true, the error column should be nearly Error ≈ C × (∆x)², for
some constant C. Then, Error/(∆x)² ≈ C, a constant. This we can check! For each of the rows of the table, fill
in the last column, whose values are the error column divided by (∆x)². (That is, form the last column for the
row with ∆x = 0.1 by dividing the error column by (0.1)² = 0.01. Do a similar thing for the remaining rows.)
(e) Do the results in the last column seem to be roughly constant? What value (the C earlier) does it seem that the
constant is? (Note: As ∆x shrinks, the values in the last column should be getting closer to the correct value of
C. Think of it as a limit. In particular, you do not just want to average the values in the last column to get C.
What value does C seem to be getting closer to as ∆x shrinks?)
(f) Use the estimated value of the constant C from the previous part and the formula Error ≈ C (∆x)² to estimate the
error when ∆x = 10⁻¹⁵. This is likely to be smaller than your calculator is going to be able to handle, but you
can check your answer on Maple. Here’s how to set up the check. The semicolons at the end of the commands
have been replaced here by colons, so that the output is suppressed. (You have to change the ending colons to
semicolons to get the answers.)
> y := 1/x:
> y_prime := -1/x^2:
> x1 := 1/2:
> Delta_x := 10^(-15):
> x2 := x1 + Delta_x:
> y1 := subs(x=x1, y):
> y2 := subs(x=x2, y):
> Delta_y := y2 - y1:
> Wiggle_formula := subs( x=x1, y_prime) * Delta_x:
> Error := abs(Delta_y - Wiggle_formula):
> evalf(Error, 20):
This last number should be close to the value you predicted. Note that you are using exact (rational number)
arithmetic in Maple, up to the last step, so there is no error from round-off in this approach.
(g) On the basis of what is done in the preceding parts of this question, the error in the wiggle magnification formula
is roughly proportional to (∆x)². Using that, explain why you think that the wiggle magnification formula works
best when ∆x is small.
1.3.1 Motivation.
As “easy” as Maple is to work, diffquo(); is “the long way” to do derivatives. Standard calculus courses spend quite a
while on that part. I have spent enough (at least) to convince you that there has to be a better way.
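\[ \frac{d}{dx}(\text{constant}) = 0 \]
\[ \frac{d}{dx}(x) = 1 \]
\[ \frac{d}{dx}(mx) = m \]
\[ \frac{d}{dx}(x^n) = n x^{n-1} \]
\[ \frac{d}{dx}\bigl((\text{constant})\, x^n\bigr) = (\text{constant} \times n)\, x^{n-1} \]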
Note that the last two formulas do not require that n be a positive whole number, but for polynomials, it will be.
These are the building blocks of polynomials. All we need to do is figure out how to add and subtract them and we’ll
be there.
The derivative of a sum (or difference) of functions is just the sum or difference of the derivatives.
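For instance, with a polynomial made up just for illustration, the sum rule lets the building-block formulas above be applied one term at a time:

```latex
\begin{align*}
\frac{d}{dx}\left(3x^4 + 7x^2 - x\right)
  &= \frac{d}{dx}\left(3x^4\right) + \frac{d}{dx}\left(7x^2\right) - \frac{d}{dx}(x) \\
  &= (3 \times 4)\,x^3 + (7 \times 2)\,x^1 - 1 \\
  &= 12x^3 + 14x - 1
\end{align*}
```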
Here are the examples that I will go over in class. Extra space is allowed at the right of the page for you to fill in the
answers.
functions                      derivatives
x⁸
x²⁰⁰
√x
5x⁹
5/x⁹
(1/2)x¹⁰
2x⁴ + 5x⁵
5x³ − 2x² + 17x − 8
20x⁴ − 7x³ + 19x + π
15x² − 7
You need to include both parts in diff();. That is, you have to have the function and the variable. The reason is that
the function could have other variables or parameters in it, and you need to tell diff(); which one is the independent
variable as opposed to just constants. That is, if the function is a*b*c, how would Maple know what the variable is if you
didn’t say?
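A small sketch of that point, using the a*b*c example and a polynomial made up for illustration. It assumes a, b, c, and x have not been given values earlier in the session.

```maple
> diff( a*b*c, a );                # a is the variable; b and c are treated as constants
> diff( a*b*c, b );                # now b is the variable; a and c are the constants
> diff( 3*x^4 + 7*x^2 - x, x );    # an ordinary polynomial in x
```

The first two answers differ (b c versus a c) even though the function is the same, which is why diff(); needs to be told the variable.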
Note that I will assume that you can use Maple to find any derivatives you encounter. In fact, I expect you to check any
answers you aren’t sure of by using Maple! So, please, if you are having trouble with Maple, see me soon.
Homework #8
Exercises.
(c) x⁴ − x³ + x² − x + 1
(d) 1/x²
(e) 6/x³ + x³/6
(f) √x + ∛x + ∜x
3. Make up some polynomials of your own and differentiate them. Make a few of them have large exponents and/or
coefficients. You will get credit for up to three polynomials. [This type of instruction occurs several more times
while we are learning derivatives. The intention is to give you a way to get involved in the task of figuring out what’s
going on. If you can generate good problems, and get the correct answers to them, then you really understand the
ideas. Again, you can use Maple to check your answers. A handy item is that Maple will generate its own random
polynomials. Use randpoly(x); to get them. You can use that same command over and over to get multiple
different random polynomials. But be warned that you get the same ones each time you start Maple over.]
4. What is the derivative of f(x) = 2x² − 3x − 1? [Note that this is the same function and answer we got when grinding
through “the long way.” The name “the long way” is very accurate.]
5. When y = y(x), the derivative of y with respect to x is written dy/dx. How do you write the derivative of s = s(t)?
x = 1 (1.24)
x − x² = 1 − x² (1.25)
x(1 − x) = (1 + x)(1 − x) (1.26)
x = 1 + x (1.27)
0 = 1 (1.28)
Let’s examine this. First, we subtracted x2 from both sides of the equation. Then we factored both sides. Both of those
operations are always legitimate. We then divided both sides of the equation by 1 − x. That, of course, is a problem, since
we started out by assuming that x = 1. We are dividing both sides by 0, and that’s not good. In essence, we had the equation
1×0 = 2×0
which is always true, and then “canceled” the 0’s. The result was that 1 = 2, which is definitely false. The last step,
subtracting x from both sides, is fine. But by then, we have already invalidated our equality.
Division by a quantity that is elsewhere set equal to zero is algebraically shady. But that’s exactly what we did in our
definition of derivatives!
Mugsy: I knew it. Math has more holes than three tons of Swiss cheese.
Albert: Actually, more than that. But not here. That’s what we are going to investigate next, I assume.
We divided top and bottom of the difference quotient by ∆x and then set ∆x = 0 later on. And that’s just not proper.
How can we be sure that what we are doing is algebraically legitimate? Limits provide a foundation. Let’s go back to
solve a simple problem, one that is closely tied to what we want to solve, but not obviously so. Suppose that we have a
\[ \lim_{x \to 2} f(x) = 7. \]
Greek letter epsilon), and the input tolerance is normally written δ (the Greek letter delta, lower case this time). The topic
is then called an ε-δ argument.
Limits are what we need for derivatives. The slope of the secant line gets close to, but never exactly equal to, the slope
of the tangent line.
Dudley: Never?
Albert: Almost never. It will be exactly equal for lines, for example. But it is difficult to cook up any other curves
where it will be equal.
We want to look very close to ∆x = 0, but we can’t look exactly at ∆x = 0 without getting undefined results (or errors, or
whatever). We have not been allowed to plug in ∆x = 0 (at least until after simplifying).
The definition of f′(x) in strict terms is
\[ f'(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} \qquad (1.29) \]
The fraction inside the limit is precisely ∆y/∆x written out, and the limit makes you look at it when ∆x is very tiny.
Occasionally, this limit is written with h rather than ∆x.
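As a sketch of this definition in Maple, with h in place of ∆x and the function 2x² − 3x − 1 from the long-way calculations (the name f is just for illustration, and x and h are assumed not to have values yet):

```maple
> f := x -> 2*x^2 - 3*x - 1;
> limit( (f(x + h) - f(x))/h, h = 0 );    # the limit in the definition, with h for Delta_x
> diff( f(x), x );                         # Maple's built-in derivative, for comparison
```

Both should give 4x − 3.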
Mugsy: I bet this makes you feel better, Al. All this gibberish is right down your alley.
Albert: And your alleys tend to be a lot darker and less populated, right?
Mugsy: Yup.
What, then, is the procedure for finding the limits that are encountered in this definition? Essentially, we use this major
fact: All functions that are typically encountered in math are continuous where they are defined. What does that mean? The
first thing you do when evaluating a limit is plug the value of the variable into the limit. (That’s the idea behind continuous,
remember? If the value you get by evaluating the limit—that is the best prediction you can make—is equal to the actual
value of the function, the function is continuous. We’re using it somewhat backwards, since it will help us find limits now.)
If you get a value out, that’s your answer.
Dudley: Al, is this serious? First you say you can’t look straight at the value, and then you say to do just that! Why
can’t you make up your mind? Can you or can’t you?
Albert: There, there, Dudley. Quit whimpering, and I’ll explain it.
Dudley: <Sniff>
Albert: Limits are set up to answer the hassle that crops up when you want to plug in a number, and algebraically,
it’s not legal. That’s the “plug in 0 even though you want to divide by it” hassle we had. But the functions in calculus
just happen to have some very nice properties, and one of them is that you can simplify first, get rid of the hassle,
and get the correct answers when you plug in. In one sense you could ignore it, and you probably wouldn’t have even
noticed there was a problem.
Mugsy: I certainly wouldn’t.
Albert: But on the other hand, there are a couple of places later when the idea of limits will come in very handy, and
this then becomes an introduction to the concept of limits.
Dudley: I’m beginning to get it, but I’m not sure I like the idea that this is going to come back.
Mugsy: Does everything in this course return?
Albert: Nearly.
Mugsy: Augh.
How do you tell when you aren’t getting a value out? The only situation where that can occur (for now, anyway—it
gets worse later)
Mugsy: AUGH.
is getting the form “0/0.” This is not good. Anything else is fine:
“(?/??)” has a nice value when both ? and ?? are non-zero.
“(0/??)” = 0 as long as ?? ≠ 0.
“(?/0)” doesn’t exist (or is infinite) as long as ? ≠ 0.
See the homework.
When working out $\lim_{x \to c} f(x)$ by hand, here is the three-step procedure to follow:
1. Plug x = c into f (x). If you don’t get 0/0, STOP; you are done. The answer is the value you got. (See the box above.)
2. Since you must have 0/0 to get here, factor the top and bottom. There must be factors of (x − c) in both. (This is
why both the top and bottom are 0 when you plug in x = c.)
3. Cancel the factors of x − c in both top and bottom, and go back to the first step. (This step usually gets rid of the
offending factors that are causing the 0/0 hassle, so we see if we now have a friendly limit.)
Note that this procedure is precisely what I gave you as the procedure for finding slopes of tangent lines (derivatives), in
the special case of the limit that is used in derivatives. First, you always get “0/0” by the form of the equation for slopes.
Then, you worked algebraically to get a factor of ∆x in the top (it’s obvious in the bottom), which is the second step. Then
cancel the ∆x’s, which is the third step. Finally, plug in ∆x = 0 to get the slope, which is back to the first step.
This, then, is why you have to simplify the top of difference quotients, and why all the terms without a ∆x cancel. If not,
the top wouldn’t be zero when you put in ∆x = 0. This is also why the critical step in finding derivatives the long way is
that strategic cancellation of the ∆x’s on the top and bottom. After that, it’s easy!
Here’s an example of how I construct a limit problem (such as might occur on the test).
Mugsy: That’s a hint, hear?
First, I choose a nice function, like
\[ \frac{x - 4}{x + 3} \]
and pick a number, like x = 2. I then multiply top and bottom of the function by x − 2 (without canceling), giving
\[ \frac{(x - 4)(x - 2)}{(x + 3)(x - 2)} = \frac{x^2 - 6x + 8}{x^2 + x - 6} \]
and the limit is taken as x → 2 because I multiplied through by x − 2. You see, this way, I build into the fraction a 0/0. I expect
you then to factor the function, cancel the x − 2 on the top and the bottom, and get back to
\[ \frac{x - 4}{x + 3} \]
plug in x = 2, and get
\[ \frac{2 - 4}{2 + 3} = \frac{-2}{5} \]
which is the answer to the limit.
There is one other critical comment: 0/0 is never the answer to anything! In particular, it is never the answer to any
limit. You can get “0/0” as a form, but that means you must go through the procedure just outlined.
Mugsy: I have been informed to “visit” anyone who puts 0/0 as an answer. Heh, heh, heh.
Limits can also be done on Maple. To evaluate on Maple the limit I just constructed, here’s the format:
> limit( (x^2-6*x+8)/(x^2+x-6), x=2);
−2/5
The function goes first, and then the value you want the variable to approach. Be careful to get the parentheses correct.
Essentially, all the limits we’ll encounter (and then some) can be done this way using Maple.
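The three-step procedure can also be followed one step at a time in Maple, which is handy for checking work done by hand. This is only a sketch; the names p, q, and reduced are made up for the illustration, and x is assumed not to have a value yet.

```maple
> p := x^2 - 6*x + 8;
> q := x^2 + x - 6;
> subs( x = 2, p ), subs( x = 2, q );    # step 1: top and bottom are both 0, the "0/0" form
> factor(p);                              # step 2: a factor of x - 2 shows up on the top
> factor(q);                              #         and on the bottom
> reduced := normal( p/q );               # step 3: cancel the common factor
> subs( x = 2, reduced );                 # back to step 1: plug in x = 2
```

The last line should agree with the −2/5 that limit(); gives.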
The easiest way to do limits shows up later (L’Hôpital’s rule). We’ll need some easier way when the limits get harder.
The procedure we have only works for polynomials, and other things that we can factor. The curious thing is that L’Hôpital’s
rule uses derivatives!
Homework #9
Exercises.
1. Evaluate the following limits by the three-step procedure that I gave earlier. (So, you need to show your steps!) You
can use Maple to check your answers.
(a) $\lim_{x \to -2} \frac{x^2 - x - 6}{x^2 + x - 2}$
(b) $\lim_{x \to 0} \frac{x^2 - x - 6}{x^2 + x - 2}$
(c) $\lim_{x \to 1} \frac{x^2 - x - 6}{x^2 + x - 2}$
(d) $\lim_{x \to 3} \frac{x^2 - x - 6}{x^2 + x - 2}$
(e) $\lim_{r \to -1} \frac{r^2 + 2r + 1}{r^2 - 2r - 3}$
2. Evaluate the following limits by the three-step procedure that I gave earlier. Again, you can use Maple to check your
answers.
(a) $\lim_{x \to 3} \frac{x^2 - 2x - 3}{x^2 - x - 6}$
(b) $\lim_{x \to 0} \frac{x^2 - 2x - 3}{x^2 - x - 6}$
(c) $\lim_{x \to -1} \frac{x^2 - 2x - 3}{x^2 - x - 6}$
(d) $\lim_{x \to -2} \frac{x^2 - 2x - 3}{x^2 - x - 6}$
(e) $\lim_{r \to 2} \frac{r^2 - 4r + 4}{r^2 - r - 2}$
3. We want to evaluate $\lim_{x \to -3} \frac{x^2 - 2x - 15}{x^2 + 2x - 3}$ both by calculator and algebra.
(a) Plug values of x near −3. (Try −2.9, −2.99, −2.999, and −3.1, −3.01, −3.001.) This is reasonably easy on
Maple, for example, using
> f := x -> (x^2-2*x-15)/(x^2+2*x-3);
f := x → (x² − 2x − 15)/(x² + 2x − 3)
which defines the function for Maple, and then type (and again, change the colons at the end of these to
semicolons to see the output):
> f(-2.9):
> f(-2.99):
> f(-2.999):
> f(-3.1):
> f(-3.01):
> f(-3.001):
What do you think the value of the limit is from these numbers?
(b) What happens if you plug x = −3 into the function? (Or try to find f(-3); on Maple?)
(c) Factor both the top and bottom of the function and reduce it. (You can do this on Maple by typing at it
normal(f(x));. Remember that normal(); in Maple is one of the algebraic simplification routines, the one
I tend to prefer.)
(d) What do you get when you plug x = −3 into the reduced expression for the function?
(e) Which way would you rather work limits (by calculator/computer or by algebra)? Give a reason for your
answer. (There is no wrong answer to this part. I’m just curious to see what you think. The answers usually
split.)
Problems.
1. In this problem, we’ll get a method of working limits of some functions that are not rational functions (the quotients
of two polynomials). We will work $\lim_{x \to 6} \frac{\sqrt{x} - \sqrt{6}}{x - 6}$.
(a) Plug x = 6 into the function, according to the first step to finding a limit. What happens?
(b) Why does the second step in the three-step method break down in this case?
(c) We can salvage the process by forcing a factor of x − 6 on the top. The way we get it is to rationalize the
numerator (not denominator). Multiply the top and the bottom of the function by √x + √6 and multiply out
just the top. Leave the bottom in factored form.
(d) Reduce the function and plug x = 6 into the result. What is the value of the limit? You can check your answer
on Maple, but realize that it might give the answer in a different form than you got. (Subtract your answer from
Maple’s and simplify("); you should get 0 if you are correct.)
2. Use the idea from the previous problem to evaluate $\lim_{x \to 0} \dfrac{\sqrt{x + 5} - \sqrt{5}}{x}$.
Investigation.
1. Let’s look at why 0/0 is so vicious, by looking at what a/b means. We do that by converting the division into a
multiplication problem that we try to solve.
(a) Suppose a/b = c, and solve for a. (Leave the variables as variables; don’t use numbers yet.) (In case you are
worried, yes, this part is simple.) The whole idea of division, then, is to use this equation to find c. That is, plug
in the values of a and b, and try to get c that works. The remaining parts of the question refer back to this part.
References that follow in this question to “the equation” are to the equation that is the answer to this part.
(b) Suppose first a = 0 and b ≠ 0. What value of c makes the equation work? Are there any other values of c that
can make the equation work?
(c) Suppose now b = 0 and a ≠ 0. Is there any value of c that we can use to solve the equation? Why is a/0 not
defined for a ≠ 0? (Later we will see that it is sometimes convenient to say that a/0 is infinite, and occasionally
people will say just that.)
(d) Suppose a = b = 0. Is there any one specific value we can assign to c that works in the equation? Why would
0/0 be called an indeterminate form? Note particularly that 0/0 is not automatically 1, even though x/x = 1 for
any x ≠ 0.
(Note: Division by 0 causes pain no matter what the numerator (top). Either there are no solutions (when the
top is non-zero), or there are an infinite number of solutions (when the top is zero). Exactly the same situation
will be investigated later on, in a course called linear algebra, but there it will be harder to see what’s going on
because a wider variety of things can happen.)
Note that you differentiate one factor at a time. Also note that this is not the same as multiplying the derivatives. (See the
homework.)
Example: Find the derivative of
\[ (x^2 + 4x - 3)(x^3 - 2x^2 + 5) \]
We have two ways of doing this, and we will do it both ways. First, if we use the product rule, we get that the derivative is
\[ (2x + 4)(x^3 - 2x^2 + 5) + (x^2 + 4x - 3)(3x^2 - 4x) \]
On the other hand, as we indicated earlier, we can multiply out the product of two polynomials, and get another polynomial.
In this case, when you multiply the original function out, you get
\[ x^5 + 2x^4 - 11x^3 + 11x^2 + 20x - 15 \]
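If you want to see the two routes agree without trusting your algebra, here is a small sketch in Python (my choice of language for illustration; the notes themselves use Maple). Polynomials are stored as coefficient lists, and the helper names poly_mul, poly_add, and poly_diff are made up for this example. It also shows what goes wrong if you just multiply the two derivatives.

```python
# Polynomials are coefficient lists, lowest power first:
# [-3, 4, 1] means -3 + 4x + x^2.

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    """Add two polynomials, padding the shorter one with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_diff(p):
    """Differentiate term by term with the power rule."""
    return [k * c for k, c in enumerate(p)][1:]

f = [-3, 4, 1]       # x^2 + 4x - 3
g = [5, 0, -2, 1]    # x^3 - 2x^2 + 5

# Way 1: product rule, f'g + fg'
by_product_rule = poly_add(poly_mul(poly_diff(f), g), poly_mul(f, poly_diff(g)))

# Way 2: multiply out first, then differentiate the single polynomial
expanded = poly_mul(f, g)
by_expanding = poly_diff(expanded)

# The tempting mistake: multiply the two derivatives
wrong = poly_mul(poly_diff(f), poly_diff(g))

print("expanded:       ", expanded)
print("product rule:   ", by_product_rule)
print("expand, then D: ", by_expanding)
print("f' times g':    ", wrong)
```

The first two derivative lists should match coefficient for coefficient, and the last one should not even have the same degree.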
\[ (f(x)\,g(x)\,h(x))' = f'(x)\,g(x)\,h(x) + f(x)\,g'(x)\,h(x) + f(x)\,g(x)\,h'(x) \]
Again, note that you differentiate one factor at a time. The other factors are left alone. You then add up all the products.
With more than three functions, you start running out of letters to use for them, so a typical approach is to use subscripts.
The functions for a multi-factor product would probably be written $f_1(x), f_2(x), \ldots, f_n(x)$. The function would be
the product $f_1(x)\,f_2(x) \cdots f_n(x)$.
Don’t memorize the formula—memorize the pattern. This rule is not too hard. The only possible problem is to remember
to use it when you need it.
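For reference, here is the pattern written out for n factors in the subscript notation. This is a sketch of the standard n-factor product rule in LaTeX, not a formula copied from these notes: each term differentiates exactly one factor and leaves the others alone, and there are n terms in all.

```latex
\[
\bigl(f_1(x)\,f_2(x)\cdots f_n(x)\bigr)'
  = f_1'(x)\,f_2(x)\cdots f_n(x)
  + f_1(x)\,f_2'(x)\cdots f_n(x)
  + \cdots
  + f_1(x)\,f_2(x)\cdots f_n'(x)
\]
```

Setting n = 3 gives back the three-factor rule, and n = 2 gives the ordinary product rule.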
Here are some more examples that I will go over in class. Differentiate the following functions.
$(x^3 - x)(2x^2 + x - 1)$
$z^4 (9z^7 + 8z^4 - 6z^3 + 15z^2 - 10z + 9)$
$(x^6 - 7x^4 + 3x^3 - 7)(2x^5 + 5x^4 - x^2 + x - 6)$
$(x^2 + 2x)(x^4 - 3x^2)(8x^3 - 5)$
$(x + 2)(3x + 4)(5x + 6)(7x + 8)(9x + 10)$
Quotient rule.
This can be done two ways. If you want to be formal, you use this:
\[ \left(\frac{f(x)}{g(x)}\right)' = \frac{g(x)\,f'(x) - f(x)\,g'(x)}{(g(x))^2} \]
On the other hand, I keep mixing up the order of the factors on the top, and that changes the value. (See the homework.)
So, the way that I remember the quotient rule (and you will hear me mumbling this to myself when I work one out on the
board) is to call the top function hi (it’s high, that is, on top), and the bottom function ho (because that’s what I call it), and
use D for differentiation.
Mugsy: Brace yourself.
Then the quotient rule becomes (and this is not original to me, by the way!):
\[ D\left(\frac{\text{hi}}{\text{ho}}\right) = \frac{\text{ho}\;D(\text{hi}) - \text{hi}\;D(\text{ho})}{(\text{ho}\;\text{ho})} \tag{1.35} \]
This, by itself, is not hard, but when combined with the product rule (such as having a product on the top or bottom), it can
get confusing. “All” you have to do is keep from getting overwhelmed. More on that soon.
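Here is a quick numerical sanity check of the hi/ho formula, sketched in Python under my own choice of example: the quotient (2x^3 − 7x^2 + 1)/(3x^2 + 5). The function names are hypothetical. It compares the quotient rule against a plain slope estimate, and also shows what happens when the two factors on top are written in the wrong order.

```python
def hi(x):
    return 2 * x**3 - 7 * x**2 + 1

def d_hi(x):
    return 6 * x**2 - 14 * x

def ho(x):
    return 3 * x**2 + 5

def d_ho(x):
    return 6 * x

def quotient_rule(x):
    """ho D(hi) - hi D(ho), all over ho ho."""
    return (ho(x) * d_hi(x) - hi(x) * d_ho(x)) / (ho(x) * ho(x))

def swapped(x):
    """The same formula with the order on top reversed (the mistake)."""
    return (hi(x) * d_ho(x) - ho(x) * d_hi(x)) / (ho(x) * ho(x))

def numeric_slope(x, h=1e-6):
    """Slope estimate from two nearby points, no calculus rules used."""
    return (hi(x + h) / ho(x + h) - hi(x - h) / ho(x - h)) / (2 * h)

for x in (-2.0, 0.5, 1.0, 3.0):
    print(x, quotient_rule(x), swapped(x), numeric_slope(x))
```

The quotient rule column and the slope estimate should agree to several decimal places, while the swapped column has the same size but the opposite sign.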
Here are some examples that I will work in class. Differentiate the following functions.
$\dfrac{\sqrt{x}}{x^2 + x}$
$\dfrac{2x^3 - 7x^2 + 1}{3x^2 + 5}$
$\dfrac{x\,(x^3 - 1)}{x^4 + x^3 - 3}$
$\dfrac{(4x^2 + 8x - 5)(2x^7 - 8x^3 + x^2 + 2)}{(5x^3 + 8x - 2)(x^3 + x^2 - 1)}$
One topic becomes very relevant at this point, and that is the degree of algebraic simplification that I will require of you
on the homework and tests. The answer is: “As little as possible, and usually none.”
Mugsy: Wow! And I thought he was nearly as mean as I am . . . .
Let me explain. If I am grading your papers, and you decide to simplify your answers, you will then require me to check
the work you did in simplifying. That takes me time and potentially a lot of work.
It is also to your benefit not to simplify. Especially on a test, simplifying takes you time also, which you could better
spend thinking about other problems. Finally, if you don’t simplify, and you have made a calculus mistake, it is reasonably
clear what you did wrong, and partial credit is simple to award. If you then simplify, you bury your mistake in an avalanche
of algebra, and all I can tell easily is that your answer is wrong. Partial credit is harder to award. I know that your answer
is wrong, so if I can’t figure out what you did, you lose lots of credit.
Mugsy: It seems like he wants to make life easy for himself, mostly.
Albert: That’s bad?
Homework #10
Exercises.
1. Find the following derivatives using the product rule. You do not need to simplify your answers.
(a) $f(x) = (7x^3 + 5x^2)(2x^4 - 5x)$
(b) $f(x) = x\sqrt{x}$ (Remember to convert the square root to an exponent)
2. Find the following derivatives using the product rule. You do not need to simplify your answers.
(a) $f(x) = (5x^2 + 3x)(2x^5 - 3x^4)$
(b) $f(x) = x\sqrt[3]{x}$
3. Work out $f'(x)$ for the following using the quotient rule. You do not need to simplify your answers.
(a) $f(x) = \dfrac{x^2 + 3}{x}$
(b) $f(x) = \dfrac{x}{x^2 + 3}$
(c) $f(x) = \dfrac{(3x^2 - 2x + 3)(5x + 7)}{x^2 + 8}$
4. Work out $f'(x)$ for the following using the quotient rule. You do not need to simplify your answers.
(a) $f(x) = \dfrac{x}{x^2 + 1}$
(b) $f(x) = \dfrac{x^2 + 1}{x}$
(c) $f(x) = \dfrac{(5x^2 + 3x - 4)(x - 9)}{x^2 + 5}$
5. Make up three (or more) examples of the product rule and three (or more) examples of the quotient rule of your own.
Make sure that at least one quotient rule involves a product also.
Problems.
1. In this problem, we show that the method that you might want to use on products and quotients doesn’t work. Take
$f(x) = x^3$ and $g(x) = x^4$. An added moral of this problem is that if you obey the laws of algebra, you will get the
correct answers in calculus, although they might not always look the same.
(a) Combine $f(x)\,g(x)$ into a single $x^n$ and find $(f(x)\,g(x))'$.
(b) Find $f'(x)$ and $g'(x)$. Combine $f'(x) \times g'(x)$ into a single $c\,x^m$. (Note that this is not equal to $(f(x)\,g(x))'$, since
neither the exponents nor the coefficients are equal. That is, $(f(x)\,g(x))' \ne f'(x)\,g'(x)$.)
(c) Find $(f'(x)\,g(x)) + (f(x)\,g'(x))$, and combine it into a single $c\,x^m$. It should equal $(f(x)\,g(x))'$.
2. In this problem, we work through exactly what happens if you reverse the order of factors in the numerator of
the quotient rule formula. (That is, what happens if you use hi Dho − ho Dhi for the top rather than the correct
ho Dhi − hi Dho.) We begin with a specific example.
\[ \frac{\text{ho}\;D(\text{hi}) - \text{hi}\;D(\text{ho})}{(\text{ho}\;\text{ho})} \]
and multiply out the function on the top of the fraction to get a polynomial.
(b) Work out
\[ \frac{\text{hi}\;D(\text{ho}) - \text{ho}\;D(\text{hi})}{(\text{ho}\;\text{ho})} \]
which is what you get if you interchange the order of the factors in the quotient rule. Also multiply out the
function on the top of this fraction to get a polynomial.
(c) Show that the answers to the previous two parts are the negatives of each other. Do this by taking the negative
of the top of the first quotient and getting the top of the second quotient. (The bottoms are the same for both.)
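(d) Now we want to show that this same thing always happens, namely that reversing the order of the factors in the
numerator changes the sign of the answer. Do this by showing that
\[ \frac{\text{hi}\;D(\text{ho}) - \text{ho}\;D(\text{hi})}{(\text{ho}\;\text{ho})} = -\,\frac{\text{ho}\;D(\text{hi}) - \text{hi}\;D(\text{ho})}{(\text{ho}\;\text{ho})} \]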
3. In the formula in the notes for the derivative for three factors,
\[ (f(x)\,g(x)\,h(x))' = f'(x)\,g(x)\,h(x) + \cdots, \]
set $h(x) = 1$, and simplify both sides and show that you get the usual product rule for two factors, $(f(x)\,g(x))' =
f'(x)\,g(x) + \cdots$.
Dissect formulas into simpler parts. Now comes the attitude check. I give you this back-breaking quotient and product
rule to differentiate on the test, and your mind goes into zombie mode just by looking at it. What do you do?
Dudley: Is he psychic? That happens to me all the time!
Albert: He’s taught calculus a long time. It happens to lots of people. Now listen to the remedy.
When confronted with a huge derivative, the key is to avoid being overwhelmed.
Dudley: HOW?
Here’s how, Dudley. First, pick out a little part of the problem, and ask yourself if you can differentiate it. If not, pick out a
smaller part until you can. Go all the way down to x (or whatever the variable is) if you need to! Don’t even try to tackle it
all at once. You’ll get confused, and leave out something important. Keep picking the function apart until you are confident
that you can differentiate all of the little parts of the problem. Then realize that if you can differentiate each part, you can
differentiate the whole thing. The rules we’ve gotten so far (and the one to come) are there to tell you how to assemble all
those little parts. Just do what the rules say.
Re-examine product and quotient rules from this point of view. The product rule tells you how to differentiate a
product once you know how to differentiate each factor. The quotient rule tells you how to differentiate a quotient if you
can differentiate the top and the bottom. All the rules in calculus are like that.
Let’s take for an example the biggest quotient rule from before:
\[ \frac{(4x^2 + 8x - 5)(2x^7 - 8x^3 + x^2 + 2)}{(5x^3 + 8x - 2)(x^3 + x^2 - 1)} \]
How does that decompose? Well, you look at the thing, and first of all realize that you’ll need the quotient rule, because
overall it’s a quotient.
Dudley: So far, so good.
But to use the quotient rule, you need to differentiate both the top and the bottom (both D(ho) and D(hi)). But you look at
the top, and cringe.
Dudley: You got it.
But look again. The top is a mess, but what is it? It’s two polynomials multiplied together. That sounds vaguely familiar.
Dudley: Emphasis on the vaguely.
Can I differentiate that?
Dudley: I’m sure you can. But can I?
A simpler question, then. Do you think you can differentiate the first factor, the $(4x^2 + 8x - 5)$?
Dudley: That I think I can manage.
Fine. Do you think you can differentiate the other factor, $(2x^7 - 8x^3 + x^2 + 2)$?
Dudley: Yes.
Great! That means that you can differentiate the whole top! Why? Because the product rule tells you how to put the terms
and their derivatives together. Convinced?
Dudley: Not yet. Show me.
First, look at the product rule. It is
\[ (f(x)\,g(x))' = f'(x)\,g(x) + f(x)\,g'(x) \]
What do you need to know to find the derivative of $(f(x) \times g(x))$? You need $f(x)$ and $g(x)$, and their derivatives, $f'(x)$ and
$g'(x)$. That is, knowing those, you can plug into the formula, right?
Dudley: I suppose.
So, look at the top now. It’s $(4x^2 + 8x - 5)(2x^7 - 8x^3 + x^2 + 2)$. To differentiate that, all you need is the two factors (and
they are sitting right there in front of you), and the two derivatives, and you’ve already said you can do those. Therefore,
you can differentiate the top!
Dudley: Is that all?
Albert: No.
Mugsy: I knew that was too easy.
Now look at the quotient rule. It is
\[ D\left(\frac{\text{hi}}{\text{ho}}\right) = \frac{\text{ho}\;D(\text{hi}) - \text{hi}\;D(\text{ho})}{(\text{ho}\;\text{ho})} \]
Now, what do you need to use the quotient rule? To differentiate a quotient, you need the top and bottom functions (again,
that’s easy, because they are just sitting there), and their derivatives. In the example we are working, we’ve already decided
that we can do the derivative of the top, and we have the functions. All we need now is the derivative of the bottom, and
we’re almost finished!
Dudley: “All?”
Well, do you think I can convince you that you can find the derivative of the bottom, too?
Dudley: At this point, I don’t think I could argue anything.
Well, look at the bottom function. It is $(5x^3 + 8x - 2)(x^3 + x^2 - 1)$. That’s the product of two polynomials again. Now for
the question. What do you need to know in order to differentiate the product of two functions?
Dudley: The functions and their derivatives?
Exactly! You remembered! Do you know the functions?
Dudley: Yes.
Can you differentiate the first one?
Dudley: Why do I feel like I’m being sold the Brooklyn Bridge?
Albert: Because you think you can’t do it, and you are uncomfortable realizing that maybe you can. Now answer the
question.
Dudley: Yes, I can differentiate that polynomial.
And the other polynomial, can you differentiate it, too?
Dudley: Yes.
So, what rule tells you how to put two functions and their derivatives together to get the derivative of the product?
Dudley: The product rule?
Yes! Yes! Yes!
Mugsy: Cut the theatrics, kid.
Now, Dudley, what do you have? You know the top and bottom functions, and you can figure out their derivatives. And the
quotient rule tells you how to assemble these ingredients into the answer. Can you do the whole thing?
Dudley: I want to say no, but I think the answer is yes.
Go back to the answer from when I did that example in class, and see how all the pieces fit together. Notice where the
functions and derivatives fit in, and how I used both the product and quotient rules.
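The same dissection can be acted out in a few lines of Python. This is only a sketch of the idea, and the helpers poly, product, and quotient are names I made up for it: each piece is carried around as a pair (function, derivative), and the rules do nothing except assemble pairs into bigger pairs.

```python
def poly(coeffs):
    """Return (p, p') for a polynomial, coefficients listed lowest power first."""
    def value(x):
        return sum(c * x**k for k, c in enumerate(coeffs))
    def deriv(x):
        return sum(k * c * x**(k - 1) for k, c in enumerate(coeffs) if k > 0)
    return value, deriv

def product(f, g):
    """Product rule: needs only the two factors and their derivatives."""
    (fv, fd), (gv, gd) = f, g
    return (lambda x: fv(x) * gv(x),
            lambda x: fd(x) * gv(x) + fv(x) * gd(x))

def quotient(hi, ho):
    """Quotient rule: (ho D(hi) - hi D(ho)) / (ho ho)."""
    (hv, hd), (bv, bd) = hi, ho
    return (lambda x: hv(x) / bv(x),
            lambda x: (bv(x) * hd(x) - hv(x) * bd(x)) / (bv(x) * bv(x)))

# The four little pieces, each easy to differentiate on its own
a = poly([-5, 8, 4])                  # 4x^2 + 8x - 5
b = poly([2, 0, 1, -8, 0, 0, 0, 2])   # 2x^7 - 8x^3 + x^2 + 2
c = poly([-2, 8, 0, 5])               # 5x^3 + 8x - 2
d = poly([-1, 0, 1, 1])               # x^3 + x^2 - 1

top = product(a, b)                   # the top is a product
bottom = product(c, d)                # so is the bottom
whole_value, whole_deriv = quotient(top, bottom)   # overall it is a quotient

# Compare with a slope estimate at one sample point
x0 = 2.0
h = 1e-6
estimate = (whole_value(x0 + h) - whole_value(x0 - h)) / (2 * h)
print(whole_deriv(x0), estimate)
```

Notice that nothing in the code ever looks at the whole mess at once. The only actual differentiating happens inside poly, on one small polynomial at a time.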
that are divided, use the quotient rule. In other words, look at what is being done to the little pieces to assemble them
into the whole, and that tells you what formula to use. These rules are without exception. You always use the product and
quotient rules in their places.
Homework #11
Exercises.
1. Show that the short cuts given just before the homework are correct. Do this by applying the product rule and quotient
rule to the expressions on the left side of the equals signs in the expressions given, and use the fact that the derivative
of a constant is 0, and use algebra to simplify what’s on the right sides of the equals signs.
2. Show that the other possibility for a short-cut, $(C/f(x))'$, is not equal to $C/f'(x)$. Do this by picking a specific
function for f (x) and specific value for C and working out both expressions and getting different answers.
\[ \Delta u \approx g'(x)\,\Delta x \]
Then that wiggle gets fed to $f(u)$, and its output will wiggle by
\[ \Delta y \approx f'(u)\,\Delta u \approx f'(g(x))\,g'(x)\,\Delta x \]
The derivative of a composition is the derivative of the outside (leaving the inside alone) times the derivative
of the inside.
The derivative of the inside term is the one that is often forgotten in the thrill of having gotten the derivative of the outside.
Please remember to put in the whole chain rule!
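To watch the wiggles actually multiply, here is a small Python sketch with an example of my own choosing: inside function g(x) = x^2 + 1 and outside function f(u) = u^3. Both functions and the variable names are assumptions made for illustration. The last line printed also shows the answer you get if the derivative of the inside is forgotten.

```python
def g(x):            # the inside function
    return x**2 + 1

def dg(x):
    return 2 * x

def f(u):            # the outside function
    return u**3

def df(u):
    return 3 * u**2

x = 1.0
dx = 1e-4

du = g(x + dx) - g(x)            # how much the inside actually wiggles
dy = f(g(x + dx)) - f(g(x))      # how much the output actually wiggles

chain_rule = df(g(x)) * dg(x)    # outside (inside left alone) times inside
forgot_inside = df(g(x))         # the common mistake: no g'(x)

print("du:", du, " g'(x) dx:", dg(x) * dx)
print("dy:", dy, " f'(g(x)) g'(x) dx:", chain_rule * dx)
print("dy/dx:", dy / dx, " chain rule:", chain_rule, " without g'(x):", forgot_inside)
```

The measured ratio dy/dx should sit close to the chain rule value and nowhere near the value that leaves out g'(x).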
Albert: Let me add my encouragement, too. Mugsy, I give you permission to pound on anyone who forgets the chain
rule.
Mugsy: Really?! And I thought this section was going to be a drag!
Albert: Is that encouragement enough?
$(f(g(x)))' = f'(g(x))\,g'(x)$. This is the form I gave earlier, and is the one that expresses the formula algebraically the best.
$\dfrac{dy}{dx} = \dfrac{dy}{du} \times \dfrac{du}{dx}$. This form is curious. It appears as though the du’s are being canceled. In fact, that is the right way to
remember this form of the chain rule. It tends not to be as useful a form unless you happen to be given y in terms of u and
u in terms of x so that the derivatives can be worked out easily. While you are given that information occasionally, most of
the time you are given a formula to differentiate.
The big use that we will make of this is somewhat different. If you will remember, the form of the derivative contains
information. When you write dy/dx you are indicating that y is the dependent variable and x is the independent variable.
When you write dy/du, y is still the dependent variable, but now u is the independent variable. Thus, this form of the chain
rule tells you how to change the independent variable in a derivative. That is, if y (the dependent variable) can be expressed
either in terms of x or u (two possible independent variables), then dy/dx and dy/du will not be equal, but will be related
by (dy/dx) = (dy/du) (du/dx). Normally, the independent variable is obvious, but when it isn’t, great care must be taken.
This theme will occur several more times in this course:
The chain rule is the way you change independent variables in a derivative.
Let me give you an example from physics. (Relax. I provide all the formulas you need. What I am looking for is a
real-life dependent variable that can reasonably be expressed in terms of two different independent variables.) One formula
for the velocity of a free-fall object dropped from rest is $v(t) = g\,t$, and the distance dropped is $s(t) = \frac{1}{2}\,g\,t^2$. (Here, g is
the constant of gravitational acceleration. We will derive these formulas later, when we get to integration. They are also
standard from physics.) There are good reasons to ask for velocity either as a function of time or of distance. (How fast is
it dropping after 3 seconds? How fast is it dropping when it has dropped 5 feet?) This means that finding the derivative of
v could use either s or t as the independent variable. One formula we have for velocity would make one derivative easy to
find, $v(t) = g\,t$. On the other hand, velocity as a function of distance is not so obvious. We can solve $s = \frac{1}{2}\,g\,t^2$ for t, and
get $t = \sqrt{2s/g}$. (We are only concerned with t > 0, so there’s no ±.) That means that
\[ v = g\sqrt{2s/g} = \sqrt{2\,g\,s} = \sqrt{2g}\;s^{1/2} \]
Let’s now look at dv/dt and dv/ds. The first is easy: dv/dt = g, a constant. This says that the acceleration (defined to be
dv/dt) of a freely falling body is a constant, something that physics will verify. On the other hand,
\[ dv/ds = \sqrt{2g}\cdot\frac{1}{2}\,s^{-1/2} = \sqrt{\frac{g}{2s}} \]
which is hardly a constant (s is changing). What’s the contradiction here? There isn’t any! But we’ll have to look at this a
little more carefully to see that there really isn’t.
The acceleration of an object, as stated earlier, is defined as the derivative of velocity with respect to time, meaning that
we divide ∆v by ∆t, and let ∆t shrink. For a freely falling body, that value is a constant. But when we take ∆v and divide by
∆s, we should get something different.
Think of an example. If you drop a rock over a cliff, the velocity increases by a certain amount each second. That is,
if we find the change in velocity from t = 2.0 seconds to t = 2.1 seconds, we’ll get a certain value. It will be the same as
the value we’d get by finding the change in velocity from t = 5.3 seconds to t = 5.4 seconds, since they both have ∆t = 0.1
second. That’s constant acceleration. On the other hand, if we find some value for the change in velocity from s = 2.0 feet
to s = 2.1 feet, we can get some number. It won’t be the same as the number you’d get from finding the change in velocity
from s = 5.3 feet to s = 5.4 feet. Why? Because the time interval during which the rock is dropping from 5.3 feet to 5.4
feet is much smaller than the time interval for falling from 2.0 feet to 2.1 feet. (Why? Because it is moving faster, having
accelerated.) It doesn’t have as much time to increase its velocity over a certain distance. So ∆v/∆s will not be a constant.
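You can put numbers on the rock with a short Python sketch. I am assuming the round value g = 32 feet per second squared, which the notes do not state, and the function names v_of_t and v_of_s are my own. The code uses the two formulas above, v = g t and v = sqrt(2 g s), and then checks the chain rule relation dv/dt = (dv/ds)(ds/dt) at one instant, using ds/dt = v.

```python
import math

G = 32.0   # ft/s^2, an assumed round value for g

def v_of_t(t):
    """Velocity as a function of time."""
    return G * t

def v_of_s(s):
    """Velocity as a function of distance dropped."""
    return math.sqrt(2 * G * s)

# Equal time steps: the change in velocity is the same each time.
dv_t_early = v_of_t(2.1) - v_of_t(2.0)
dv_t_late = v_of_t(5.4) - v_of_t(5.3)

# Equal distance steps: the change in velocity is not the same.
dv_s_early = v_of_s(2.1) - v_of_s(2.0)
dv_s_late = v_of_s(5.4) - v_of_s(5.3)

print("change in v over 0.1 second:", dv_t_early, dv_t_late)
print("change in v over 0.1 foot:  ", dv_s_early, dv_s_late)

# Chain rule check at t = 3 seconds: dv/dt = (dv/ds)(ds/dt), and ds/dt = v.
t = 3.0
s = 0.5 * G * t**2
dv_ds = math.sqrt(G / (2 * s))
chain = dv_ds * v_of_t(t)
print("(dv/ds)(ds/dt):", chain, " dv/dt:", G)
```

The two time-step changes come out equal, the later distance-step change comes out smaller than the earlier one, and the product (dv/ds)(ds/dt) lands back on g.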
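The two derivatives (dv/dt and dv/ds) are, however, related by the chain rule: dv/dt = (dv/ds) (ds/dt). But does it
check? Of course. Watch. Since ds/dt = v, we get
\[ \frac{dv}{ds}\,\frac{ds}{dt} = \sqrt{\frac{g}{2s}}\;\sqrt{2\,g\,s} = \sqrt{g^2} = g = \frac{dv}{dt} \]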
Derivative of the outside function (leaving the inside alone) times derivative of the inside function. This is the way
to remember the chain rule for formula-based functions. It is the working, useful, approach that I think through when I am
differentiating a function!
D( f ◦ g) = ((D f ) ◦ g) × Dg. This formula is more for reference. It is the most concise statement of the chain rule, but
tends to be used in more advanced courses.
Dudley: This I can skip entirely?
Albert: Yes. For this course.
Mugsy: That’s nice.
Note that you essentially peel off one function at a time from the outside, differentiate it, and leave alone all the functions
that are inside it. I’ll do examples when we get more functions.
Homework #12
Exercises.
1. Find the derivatives of the following functions. Check your answers on Maple if you want.
(a) $f(x) = \sqrt{2 - 3x}$
(b) $g(z) = (5z^3 - 8z)^6$
(c) $h(r) = r^2 \sqrt{2r - 1}$
(d) $s(t) = \dfrac{(3t^2 - 8t + 1)^3}{(6t^2 + 9t + 5)^4}$
(e) $r(\theta) = \left(\dfrac{5\theta^5}{\theta^2 + 1}\right)^9$
2. Find the derivatives of the following functions. Check your answers on Maple if you want.
(a) $f(x) = \sqrt[3]{3 - 5x}$
(b) $g(z) = (5z^4 + 8z^3 - 1)^7$
(c) $h(r) = r^3 \sqrt{4r - 1}$
(d) $s(t) = \dfrac{(3t^3 - 4)^5}{(4t + 1)^3}$
(e) $r(\theta) = \left(\dfrac{\theta + 2}{\theta^2 + \theta}\right)^8$
3. Show that in the worked-out example right before this homework set (on changing quotients to products) the two
answers are actually algebraically equal.
4. Show that the derivative of $\dfrac{x}{(x^2 + 1)^{-1}}$ equals the derivative of $x\,(x^2 + 1)$ by algebraically simplifying the first one.
(The first of those derivatives was done in class, as an example. Check your notes.)
5. Make up three more chain rule problems. Some should also require the product rule and/or the quotient rule. Again,
you can check your answers with Maple.
Problems.
1. Let $f_1(x) = m_1 x + b_1$ and $f_2(x) = m_2 x + b_2$. These are both straight lines, with (constant) slopes $m_1$ and $m_2$. We
know their derivatives are their slopes. We want to look at the compositions of lines in some detail and show that it
fits what the chain rule says.
(a) Find the compositions $f_1 \circ f_2(x)$ and $f_2 \circ f_1(x)$. Multiply out both compositions to give polynomials. They
should both be linear functions, of the form Mx + B, with various values of M and B, so both graphs are also
lines.
(b) Note that both lines have the same slope. What is that slope? That is, both lines have the same value of M.
What is that value?
(c) Do the two compositions give the same function? Since the values of M are the same for both, all you need to
check is if the formulas for B are the same for both. If they are different, the functions will be different. If they
are the same, the functions will be the same.
(d) The chain rule says that $(f_1 \circ f_2(x))' = (f_1' \circ f_2(x))\,f_2'(x)$ and $(f_2 \circ f_1(x))' = (f_2' \circ f_1(x))\,f_1'(x)$. What are
$(f_1' \circ f_2(x))$ and $(f_2' \circ f_1(x))$? (Note that $f_1'(x)$ and $f_2'(x)$ are easy from the formulas. These, however, require some
thought, although they are also easy, once you see them.)
(e) Plug that value of $f_1'(f_2(x))$ into the (chain rule) formula for $(f_1 \circ f_2(x))'$, and also plug in the value of $f_2'(x)$.
From part (b), we already had calculated what $(f_1 \circ f_2(x))'$ should be. Do the two agree? (Hint: They had
better!)
2. Show that the quotient rule and its reformulation as a product rule give the same results. That is, show that the quotient
rule applied to $(f(x)/g(x))'$ and the product and chain rules applied to $(f(x)\,(g(x))^{-1})'$ give the same result. (Hint:
One way to show the two answers are the same is to convert the quotient rule’s result into a product (top) (bottom)$^{-1}$
and then to multiply that out. Some algebra work later, you should get the same form as the result of the product and
chain rules.)
3. Take $f(x) = \left(\dfrac{x - 2}{x + 3}\right)^2$ for this problem.
(a) Find the derivative of f (x) considering it as a chain rule with outside function being square and the inside
function being the quotient.
(b) Split f (x) into the quotient of two squares. Find its derivative now by the quotient and chain rules.
(c) See if you can make the answers from the previous parts agree algebraically. (That’s harder than you might
think! Both are correct (if you did them right), but the answers look very different. I’m trying to get you to
realize that there are many ways to do problems, and you can look for ways to make things easier on you.) (Hint
on the algebra: Factor $(x - 2)/(x + 3)^3$ out of both answers and simplify what’s left in both.)
Investigations.
1. This problem gives a (false!) rationale for $(f \circ g)' = (g \circ f)'$. Find and correct the mistake. (Note: Homework
problem 1 showed that this can sometimes be true. But most of the time it is false.)
The wiggle magnification factor WMF of a composition is, as done in class, the product of the WMF’s of
the two functions. You get that the WMF of ( f ◦ g) is (WMF of f ) times (WMF of g). The WMF of (g ◦ f )
is (WMF of g) times (WMF of f ). Since multiplication is commutative (you can multiply factors in any
order), these are obviously equal. Hence, $(f \circ g)' = (g \circ f)'$.
There must be something wrong with this reasoning, because the derivative of a composition does depend on the
order of the factors. What is wrong?
2. In this investigation, we get a formula for the derivative of the inverse of a function. This doesn’t require anything
more than the chain rule and algebra, together with some care about independent and dependent variables.
(a) Let’s first do an example; use $y = x^5$. What is dy/dx?
(b) Solve $y = x^5$ for x. Find the derivative of that equation. That will give you dx/dy.
(c) Show that $dx/dy = \dfrac{1}{dy/dx}$. To do this, plug in the formula for x into dy/dx and use the properties of exponents
on $\dfrac{1}{dy/dx}$.
(d) Now, we go back to working with a general function y = f (x). We will be doing exactly this same thing again,
but with general formulas rather than $x^5$. Solving y = f (x) for x gives $x = f^{-1}(y)$. Plugging that back into
y = f (x) gives
\[ y = f(f^{-1}(y)) \]
which doesn’t say anything more than a function and its inverse undo each other. Differentiate both sides of
this last equation with respect to y, using the chain rule on the right side. (It is less confusing to use the prime
notation for derivatives here.)
(e) Plug $x = f^{-1}(y)$ into the chain rule in the previous part (it only fits in one spot!), and then solve for $(f^{-1})'(y)$.
If you realize that $dy/dx = f'(x)$ and $dx/dy = (f^{-1})'(y)$, you should get the same as $dx/dy = \dfrac{1}{dy/dx}$.
Absolute values.
We technically already know how to do this, since we have a way of defining absolute values in terms of other things we
can differentiate. The definition of absolute value is $|x| = \sqrt{x^2}$. (At least that works for real numbers. Complex numbers
require a more delicate definition.) Then, the derivative of |x| can be done this way:
\begin{align}
\frac{d}{dx}\,|x| &= \frac{d}{dx}\sqrt{x^2} \tag{1.44} \\
&= \frac{d}{dx}\,(x^2)^{1/2} \tag{1.45} \\
&= \frac{1}{2}\,(x^2)^{-1/2}\,(2x) \tag{1.46} \\
&= \frac{x}{(x^2)^{1/2}} \tag{1.47} \\
&= \frac{x}{|x|} \tag{1.48}
\end{align}
Of course, Maple will differentiate absolute values also. Depending on the version of Maple that you are running, you
can get two different answers when you type in diff(abs(x),x);. Maple version V release 3 tends to write the derivative
upside-down from our definition. It will take diff(abs(x), x); and return abs(x)/x. This is equivalent to what we
have. (See the homework.) And, for various technical reasons, I prefer Maple’s answer, so that’s what I will give as the
formula for the derivative of the absolute value of x:
\[ \frac{d}{dx}\,(|x|) = \frac{|x|}{x} \tag{1.49} \]
It is interesting to note that the derivative of |x | is not defined when x = 0. You get division by 0 then. (Actually, you get
0/0, and that’s even worse, if you’ll remember.) Looking at the graph, it becomes clear why. The graph of y = |x | looks like
a V, with its corner at the origin, and the sides at 45◦ angles (slopes ±1). What happens if you magnify the graph around
the origin? It remains V-shaped. It never begins to flatten out to a line. That’s the problem. If it doesn’t become like a line,
it can’t have a derivative, because the derivative is the slope of the line that the graph begins to look like.
Dudley: Can’t most functions be flattened out?
Albert: Mathematicians thought for many years they could, except at a few isolated corner points, but it turns out that
“infinitely crinkly” functions—all corners—are more numerous. But for our purposes, yes, the functions we encounter
will flatten out, with the single exception of absolute values near zero.
Mugsy: “Infinitely crinkly.” Hmm. That gives me ideas.
It is also interesting to look at what |x | /x itself is. It has the value +1 for values of x > 0 (where x = |x |) and has the
value −1 for values of x < 0 (where x = − |x |, since for x negative, x and |x | will be the same “size” but opposite signs).
The derivative has no value (it doesn’t exist) at x = 0. Note that this matches the slopes of the lines that make up the V-shape
of the graph of y = |x |, just as it should. The moral of this comment is that |x | doesn’t have a derivative at x = 0, but it is
perfectly fine for all other values of x.
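A few lines of Python make the same point numerically. This is an illustration of my own, with made-up function names: it evaluates the formula |x|/x away from zero, compares it with x/|x| and with a measured slope, and then looks at the corner from each side.

```python
def abs_slope(x):
    """The formula from the text: d|x|/dx = |x|/x (no value at x = 0)."""
    return abs(x) / x

def difference_quotient(x, h):
    """Measured slope of |x| between x and x + h."""
    return (abs(x + h) - abs(x)) / h

for x in (-3.0, -0.5, 0.5, 3.0):
    print(x, abs_slope(x), x / abs(x), difference_quotient(x, 1e-6))

# At the corner, the slope depends on which side you come from.
right = difference_quotient(0.0, 1e-6)
left = difference_quotient(0.0, -1e-6)
print("from the right:", right, " from the left:", left)
```

Away from zero every column gives +1 or −1, matching the two sides of the V. At zero the two one-sided slopes disagree, and asking for abs_slope(0.0) stops with a division-by-zero error, which is Python's way of saying the derivative is not defined there.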
On the other hand, Maple 6 gives diff(abs(x),x); as abs(1, x). What is that? It is more complicated than I want
to get into here, but it deals with the idea that x could be a complex number, and, if so, then the formula for |x | needs
to be more general than |x | /x. If you want to get the previous version of the derivative of |x |, here’s what to do. First,
you need to make sure that Maple knows that x is a real number. You do that by typing in assume(x,real);. That,
however, means that any time that you use x, Maple will print it as x~, the trailing twiddle being a reminder that you have
made some assumptions about x. I find that annoying, so I go to Maple’s menu bar Options | Assumed variables | No
annotation, and that turns off the trailing twiddle. (Tilde is the more accurate term, but I like twiddle.) Then, when you
type in diff(abs(x),x);, Maple will return signum(x), which is 1 when x > 0 and −1 when x < 0 (and 0 when x = 0,
which is wrong, and the reason Maple went with the more general form). This is mostly correct, but still not the form that
we are using. One more step will do it; type in convert(%,abs);, and you will get the familiar x/ |x |.
Mugsy: Now why did he go through all that stuff with Maple when he had already given us the formula?
Albert: Probably to warn you about not being able to decipher Maple’s answers to derivatives that contain absolute
values.
Here are the examples of functions I’ll differentiate in class.
$|5x + 8|$
$5\,|x| + 8$
$x^3 (x - 5)$
$x^3\,|x - 5|$
$\dfrac{x}{2x + 3}$
$\dfrac{|x|}{|2x + 3|}$
$\sqrt{|x| + 3}$
$x^2 (4 + |x|)$
Homework #13
Exercises.
1. Find the derivatives of the following functions. Use that the derivative of |x | is |x | /x rather than the definition of |x |.
(a) $x^2 \times |x|$
(b) $x^3 - x$
(c) $\dfrac{|x|}{x}$
(d) $\dfrac{1}{|x|\sqrt{x^2 + 1}}$
2. Find the derivatives of the following functions.
(a) $x^5 \times |x|$
(b) $4x^3 - 7x^2$
(c) $\dfrac{x}{|x|}$
(d) $\dfrac{1}{|x|\sqrt{x^2 - 1}}$
3. Make up three functions of your own that include absolute values and differentiate them. You can check your answers
on Maple.
Problem.
1. In this problem, we show that $\dfrac{x}{|x|} = \dfrac{|x|}{x}$.
(a) Square both sides in the definition (the one that uses square roots) of |x| to show that $|x|^2 = x^2$.
(b) Multiply top and bottom of x/|x| by x and simplify using the result from the previous part to get |x|/x.
Note on radians versus degrees in calculus. In the rest of calculus (and all courses afterward), all angles will be measured
in radians. This will be assumed, unless specific instructions for a specific purpose are given for a specific situation.
Always put your calculator into radian mode! A common source of problems when dealing with trig functions is to
forget to change out of degree mode. Most calculators don’t allow you to change what mode the calculator stays in, so you
have to do this yourself each time. (The fancier HP and TI calculators, and probably others, allow you to set radian mode
as the default.) There really is a reason for using radians. It shows up next.
Maple uses radians for all of its angles. Its names for the trig functions are just what you’d expect: sin(x), cos(x),
etc. But don’t forget the parentheses. To Maple, sinx would be a variable and sin x is an error.
Motivate that the derivative of sine is cosine. In a previous homework problem, you showed that the algebraic
method of finding the derivative (“the long way”) will not work for trig functions. For example, we can’t factor a ∆x out of
∆y = sin(x + ∆x) − sin(x) to cancel the ∆x in the denominator. Trig identities can be used to get the derivative, but (to
my way of thinking) they obscure what is going on rather than clarify it.
Mugsy: Oh, great. Something obscure.
Albert: The alternative is to drop a formula on you, and give you no idea of why it’s correct.
Mugsy: Hey, I’m just going to memorize the thing anyway.
Albert: The point of this course seems to be to try to get away from the “I’m just going to memorize it anyway”
approach, and to give you a bit of rationale for why things work the way they do.
Mugsy: Great. Mess up my learning style, why don’t ya?
We can find the derivative geometrically, though. To avoid getting mixed up, we will use f (θ ) = sin θ , and find
\[ \frac{d}{d\theta}(\sin\theta) = \lim_{\Delta\theta \to 0} \frac{\sin(\theta + \Delta\theta) - \sin(\theta)}{\Delta\theta} \]
I want x to mean x-coordinate of a point, and not the size of the angle. Besides, θ is a typical letter to represent an angle.
The whole point of what follows is to locate and identify the numerator and denominator geometrically, and then show that
it becomes cos θ in the limit as ∆θ closes in on 0.
Begin with the unit circle (that means radius 1, centered at the origin). Draw in angles of θ and θ + ∆θ (in radians),
crossing the circle. The y-coordinates are sin θ and sin(θ + ∆θ ). The difference in the y-coordinates of the intersection of
the angles with the circle is exactly sin(θ + ∆θ ) − sin(θ ). That’s the value of ∆ sin(θ ) = ∆ f ! Keep track of that. We have
part of the fraction that we need.
The length of the arc of the circle that is cut off is exactly ∆θ, the other part of what we want the limit of. (For a central
angle α in radians in a circle of radius r, the arc cut off, or subtended, to use the technical term, has length rα. For the circle we
have, r = 1, and α = (θ + ∆θ) − (θ) = ∆θ. Putting these together, the length of the arc is 1 × ∆θ = ∆θ.) That’s the other
part of the fraction!
The reason for using radians shows up here. If you measure α in degrees, then the length of the arc is r × (α × π/180),
and that (π/180) factor will persist through all of your derivatives. Essentially, we use radians to simplify the form of the
derivatives we’ll get. This is why radians don’t really show up until you do calculus, but are the only sensible choice here.
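To see the (π/180) factor show up, here is a minimal sketch in Python (an illustration of mine, not part of the course materials). The helper names `sin_degrees` and `numeric_derivative` are hypothetical. It estimates the slope of the sine function when the angle is measured in degrees and compares it with (π/180) times the cosine.

```python
import math


def sin_degrees(a):
    """Sine of an angle a that is measured in degrees."""
    return math.sin(math.radians(a))


def numeric_derivative(f, x, h=1e-4):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


angle = 30.0  # degrees; an arbitrary sample angle
slope = numeric_derivative(sin_degrees, angle)
expected = (math.pi / 180) * math.cos(math.radians(angle))
print(slope, expected)
```

The two printed numbers should agree closely, and neither is the plain cosine of 30 degrees. Working in radians is what makes that extra factor equal to 1.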
Let’s look closely (in both senses) at the circle near the intersections, where all the action is going on.
Mugsy: Is that another pun?
Albert: I think so.
If ∆θ is very small (and remember that we are going to be taking ∆θ → 0, so it will get very small), then the circle between
the two intersections will be almost a straight line. In fact, to make things easier, I will replace the circle with its tangent
line at the intersection with the radius. The error caused by doing this will be invisible for ∆θ tiny enough. (The essence
of calculus is replacing curves by tangent lines. It needs to be done with some care, though.) That gives a right triangle
with vertical side = ∆(sin θ ), and hypotenuse = ∆θ . This one triangle contains both parts of the fraction that we need
to look at. We’re almost done. The ratio ∆(sin θ )/∆θ is then the cosine of the top angle. With a little geometry (using
parallel lines and complementary angles), you can show that the top angle is essentially θ . Thus, ∆(sin θ )/∆θ ≈ cos θ .
(The approximation is due to replacing the circle by the tangent line to the circle.) As ∆θ → 0, the approximations only
improve, and so work out to give
\[ \frac{d}{d\theta}(\sin\theta) = \cos\theta \]
This argument is not rigorous, but I hope it is convincing.
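If you want some numerical reassurance to go with the picture, the following Python sketch (my own illustration; the function name `difference_quotient` is made up) evaluates the fraction ∆(sin θ)/∆θ for shrinking values of ∆θ at one sample angle and prints cos θ next to it.

```python
import math


def difference_quotient(theta, d_theta):
    """(sin(theta + d_theta) - sin(theta)) / d_theta, the fraction in the limit."""
    return (math.sin(theta + d_theta) - math.sin(theta)) / d_theta


theta = 1.0  # radians; an arbitrary sample angle
for d_theta in (0.1, 0.01, 0.001, 0.0001):
    print(d_theta, difference_quotient(theta, d_theta), math.cos(theta))
```

The middle column should close in on the right-hand column as ∆θ shrinks. A table like this is evidence, not proof, and the logarithm homework shows how tables can mislead.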
Obtaining the derivatives of the other trig functions. In theory, we could find the derivatives of all of the other trig
functions by the same procedure. It would be messy, except for cos θ , which is essentially the same, except for one sign
switch.
First, we get the derivative of cos θ . To do that, we do pull some trig identities out of the hat:
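\[ \cos(\pi/2 - \theta) = \sin\theta \]
\[ \sin(\pi/2 - \theta) = \cos\theta \]
The sine of an angle is the cosine of its complement, and vice versa. But we can use the chain rule now! That means that
\begin{align}
\frac{d}{d\theta}(\cos\theta) &= \frac{d}{d\theta}\bigl(\sin(\pi/2 - \theta)\bigr) \tag{1.50} \\
&= \cos(\pi/2 - \theta) \times \frac{d}{d\theta}(\pi/2 - \theta) \tag{1.51} \\
&= \sin\theta \times (-1) \tag{1.52} \\
&= -\sin\theta \tag{1.53}
\end{align}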
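\begin{align*}
\frac{d}{d\theta}(\sin\theta) &= \cos\theta \\
\frac{d}{d\theta}(\cos\theta) &= -\sin\theta \\
\frac{d}{d\theta}(\tan\theta) &= \sec^2\theta \\
\frac{d}{d\theta}(\cot\theta) &= -\csc^2\theta \\
\frac{d}{d\theta}(\sec\theta) &= \sec\theta\tan\theta \\
\frac{d}{d\theta}(\csc\theta) &= -\csc\theta\cot\theta
\end{align*}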
We used that the derivative of sine is cosine (according to what we did before). We also used the chain rule to get the term
that was the derivative of π/2 − θ (the derivative of the inside). The derivative of cos θ can also be obtained by the same
geometric process that gave the derivative of sin θ. See the homework.
Now that we have the derivatives of sine and cosine, we can get the derivatives of all the others by the quotient rule. The
reason is that tan θ = (sin θ )/(cos θ ), cot θ = (cos θ )/(sin θ ), sec θ = 1/(cos θ ), and csc θ = 1/(sin θ ). Although some
trig identities need to be pulled together to do this, the accompanying table (at the top of this page) summarizes the results.
We will work some of these in the homework, so that you have at least seen them. Although I don’t encourage memorizing
all of these, you should at least memorize the derivatives of sine and cosine.
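As a sanity check on the table, here is a short Python sketch (an illustration I am adding; the names `TABLE`, `numeric_derivative`, and `max_table_error` are hypothetical). It builds the other four functions from sine and cosine using the identities above, estimates each derivative numerically, and reports the largest disagreement with the table at a sample angle.

```python
import math


def sec(t):
    return 1 / math.cos(t)


def csc(t):
    return 1 / math.sin(t)


def cot(t):
    return math.cos(t) / math.sin(t)


# Each entry pairs a function with the derivative claimed in the table.
TABLE = {
    "sin": (math.sin, math.cos),
    "cos": (math.cos, lambda t: -math.sin(t)),
    "tan": (math.tan, lambda t: sec(t) ** 2),
    "cot": (cot, lambda t: -(csc(t) ** 2)),
    "sec": (sec, lambda t: sec(t) * math.tan(t)),
    "csc": (csc, lambda t: -csc(t) * cot(t)),
}


def numeric_derivative(f, t, h=1e-6):
    """Central-difference estimate of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)


def max_table_error(t):
    """Largest gap between the numeric estimate and the table formula at t."""
    return max(abs(numeric_derivative(f, t) - df(t)) for f, df in TABLE.values())


print(max_table_error(0.7))  # 0.7 radians is an arbitrary sample angle
```

Stay away from angles where a denominator is 0 (multiples of π/2), since some of the six functions are undefined there.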
Note on keeping the signs straight. There is a pattern here that is too nice to omit. Note that the derivatives of all the
“co-” functions (the ones that start with “co”) have a negative sign as part of the formula. (That doesn’t mean that those
derivatives will always be less than 0; it depends on whether the rest of the formula gives a positive or negative number.) In
particular, this means that the derivative of cos θ is − sin θ. That negative will be a source of confusion later.
Mugsy: You aren’t helping with comments like that.
Albert: It’s only a minor confusion there, and if you remember the comment about the derivatives of “co-” functions
having a negative sign in the derivative, then even that goes away.
Additionally, note that you can get the derivative of csc θ from the derivative of sec θ (its co-function) by putting in a
minus sign and changing all the functions in the derivative to their co-functions. The same is true of getting the derivative
of cot θ from the derivative of tan θ. You change the sign of the derivative, and change the $\sec^2\theta$ to its co-function, getting
finally $-\csc^2\theta$. What that ultimately means is that you cut the amount of memorization in half. But since I’m not requiring
you to memorize any of these except the derivatives of sin θ and cos θ , it’s only slightly useful. (On the other hand, if you
are going on to teach calculus, you’ll need to remember such arcane incantations. . . .)
Combining these with the chain rule and other rules. Of course, now that we have these, we will proceed to combine
the results with the product rule, the quotient rule, and the chain rule. Let me emphasize now that our goal is to cover
rapidly all of the standard functions in calculus in one shot. It becomes important, then, to keep up with these functions.
Getting behind will cause severe difficulties in a very short time.
There is one notational hassle that we must clear up before we can go any further. The notation $\sin^2\theta$ means $(\sin\theta)^2$,
and similar things for other exponents and other trig functions. When working derivatives, this must be kept in mind, and
it would be useful to write it that way until you get used to it. For example,
\[ \frac{d}{d\theta}(\cos^5\theta) = \frac{d}{d\theta}(\cos\theta)^5 = 5(\cos\theta)^4 \times (-\sin\theta) = -5\cos^4\theta\,\sin\theta \]
Let me explain how the various steps of this example worked. First, the $\cos^5\theta$ was changed to its equivalent form $(\cos\theta)^5$.
The derivative of this was obtained by the chain rule, with the inside function being $u = \cos\theta$ and the outside function
being $u^5$. The derivative of the outside gives $5u^4$, or $5(\cos\theta)^4$. The derivative of the inside is the derivative of cos θ, which
is − sin θ. The product of these is what the chain rule says the derivative is. Finally, $(\cos\theta)^4$ is changed back to $\cos^4\theta$ to
give the answer.
This should be contrasted to the similar-looking, but very different, derivative
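\begin{align}
\frac{d}{d\theta}\cos(\theta^5) &= -\sin(\theta^5) \times 5\theta^4 \tag{1.54} \\
&= -5\theta^4 \sin(\theta^5) \tag{1.55}
\end{align}
\begin{align}
\frac{d}{d\theta}\sec(\theta^3) &= \sec(\theta^3)\tan(\theta^3)(3\theta^2) \tag{1.56} \\
&= 3\theta^2 \sec(\theta^3)\tan(\theta^3) \tag{1.57}
\end{align}
The $\theta^3$ gets put into both the sec and the tan, since the derivative of sec θ (with respect to θ) is sec θ tan θ.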
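The difference between the fifth power of cos θ and the cosine of the fifth power of θ is easy to blur on paper, so here is a small Python sketch (added for illustration; all function names are my own) that checks each derivative formula against a numerical estimate and shows that the two answers are nowhere near each other.

```python
import math


def cos_power5(t):
    """(cos t)^5: the power is outside, the cosine is inside."""
    return math.cos(t) ** 5


def cos_of_power5(t):
    """cos(t^5): the cosine is outside, the power is inside."""
    return math.cos(t ** 5)


def d_cos_power5(t):
    """Chain rule result for (cos t)^5: -5 cos^4(t) sin(t)."""
    return -5 * math.cos(t) ** 4 * math.sin(t)


def d_cos_of_power5(t):
    """Chain rule result for cos(t^5): -5 t^4 sin(t^5)."""
    return -5 * t ** 4 * math.sin(t ** 5)


def numeric_derivative(f, t, h=1e-6):
    """Central-difference estimate of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)


t = 1.2  # radians; an arbitrary sample value
print(numeric_derivative(cos_power5, t), d_cos_power5(t))
print(numeric_derivative(cos_of_power5, t), d_cos_of_power5(t))
```

Each printed pair should agree, while the first line and the second line differ noticeably from each other.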
When you differentiate trigonometric functions with Maple, you will usually have no problems comparing your answers
with its answers. However, there is one big exception to that. When you ask Maple to differentiate tan x or cot x, it writes it
in a different form (which shouldn’t be strange to you after absolute values):
> diff(tan(x),x);
1 + tan(x)^2
> diff(cot(x),x);
-1 - cot(x)^2
Trying to get Maple to rewrite these in more normal terms involves simplifying using something called side relations,
which I feel is beyond what is useful for you. (If you want to see it, check out Maple ?siderels, or see me.)
The examples that will be worked in class are given here. Differentiate the following functions.
$\sin(\sqrt{x})$
$|\cos\theta|$
$\tan^3(5x)$
$\cot^2 x\,\sin(3x)$
$\dfrac{\sec\theta + \csc\theta}{\sec\theta - \csc\theta}$
Homework #14
Exercises.
$\sin^2\theta + \cos^2\theta = 1$,
called the Pythagorean identity. As with any proof, you are not allowed to use the result in the proof itself, so pretend
you don’t know it already for this problem.
(a) Differentiate $\sin^2\theta + \cos^2\theta$ as a function of θ. (That is, don’t substitute 1 for the expression.) Simplify what
you get, and show that it turns into 0.
(b) The only functions whose derivatives are 0 are constants. That means that $\sin^2\theta + \cos^2\theta$ is some constant.
And the fact that it is a constant means that its value doesn’t change as we put in different values of θ. Plug in
the value θ = 0, and evaluate $\sin^2(0) + \cos^2(0)$ by putting in the (known) values of sin(0) and cos(0).
(c) Explain (using the result of the previous part) why $\sin^2\theta + \cos^2\theta = 1$ is valid for all angles θ.
2. In this problem, you derive the formulas for the derivatives of some of the trig functions that I didn’t derive earlier. I
wrote the remaining trig functions in terms of sin θ and cos θ. Use those identities here.
(a) Differentiate the identity for tan θ in terms of sin θ and cos θ using the quotient rule. Use the Pythagorean
identity ($\sin^2\theta + \cos^2\theta = 1$) and the identity for sec θ in terms of cos θ to make the derivative you get match
what I have.
(b) Differentiate the identity for sec θ in terms of cos θ using the quotient rule. Take the product sec θ tan θ and
convert it by the identities to sin θ’s and cos θ’s, and show algebraically that it equals the derivative of sec θ you
just got.
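\begin{align*}
\frac{d}{dx}\operatorname{Arcsin} x &= -\frac{d}{dx}\operatorname{Arccos} x = \frac{1}{\sqrt{1 - x^2}} \\
\frac{d}{dx}\operatorname{Arctan} x &= -\frac{d}{dx}\operatorname{Arccot} x = \frac{1}{1 + x^2} \\
\frac{d}{dx}\operatorname{Arcsec} x &= -\frac{d}{dx}\operatorname{Arccsc} x = \frac{1}{|x|\sqrt{x^2 - 1}}
\end{align*}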
Again, note identities and radians. Please remember that the output of the inverse trig functions will always be angles,
and therefore, will always be in radians. The domains and ranges of the inverse trig functions are given in the accompanying
table that is at the top of the next page. These definitions are typical, but not universal. Some people use different ranges
for Arccot x, Arcsec x, and/or Arccsc x. In fact, earlier versions of Maple defined Arccot x so that the range is (0, π/2] ∪
(−π/2, 0), so arccot(x); in Maple was the same as Arccot x for x > 0, but equaled (Arccot x) − π for x < 0. I am happy
to report, however, that the Maple people have “seen the light,” and now agree with me. All this is just to show you that
there are various different, valid, definitions for inverse trig functions out there.
Dudley: Doesn’t this mess up everybody?
Albert: The only inverse trig functions in common use are Arcsin x and Arctan x, and those are standard.
The derivatives of the inverse trig functions come in three pairs, with the derivative of each inverse function being the
negative of the derivative of the corresponding inverse “co-” function. The appropriately-labeled table (in the middle of
this page) gives them.
Note the domains and the effect on signs of derivatives. I would urge caution about these, especially for x < 0 (which
is the only place where disagreement occurs). The definition that Maple gives to Arccot x doesn’t change its derivative.
One reason for changing the definition of Arcsec x and Arccsc x is to eliminate the absolute values in the derivative. They
are a genuine inconvenience.
Maple and derivatives of inverse trig functions. Maple does not capitalize inverse trig functions, and you have to put
the argument (the x) in parentheses, so that Arctan x in Maple becomes arctan(x).
Maple’s derivatives of Arcsec x and Arccsc x are equivalent to what I have, but avoid the absolute value hassle by using
an awkward combination of square roots:
> diff(arcsec(x),x);
\[ \frac{1}{x^2\sqrt{1 - \dfrac{1}{x^2}}} \]
> diff(arccsc(x),x);
\[ -\frac{1}{x^2\sqrt{1 - \dfrac{1}{x^2}}} \]
These are equivalent to the formulas I gave. (See the homework.)
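Before you do the algebra, you can at least see the agreement numerically. The Python sketch below (my illustration, with made-up names `notes_form` and `maple_form`) evaluates both versions of the derivative of Arcsec x at a few points with |x| > 1, including negative ones, which is where the absolute value matters.

```python
import math


def notes_form(x):
    """1 / (|x| sqrt(x^2 - 1)), the form used in these notes."""
    return 1 / (abs(x) * math.sqrt(x * x - 1))


def maple_form(x):
    """1 / (x^2 sqrt(1 - 1/x^2)), the form Maple reports."""
    return 1 / (x * x * math.sqrt(1 - 1 / (x * x)))


# Sample points on both sides of 0, all with |x| > 1.
for x in (-5.0, -1.5, 1.5, 5.0):
    print(x, notes_form(x), maple_form(x))
```

Matching numbers at four points do not replace the homework problem, which asks you to show the agreement for every x in the domain.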
Inverse trig functions and the chain rule. Note that the x in the derivatives of Arcsec x and Arccsc x appears in two
places (inside the absolute values and inside the square root). This is essentially the same thing we encountered with the
derivatives of secant and cosecant. How do you use the chain rule with outside functions Arcsec and Arccsc, where there
are two places you could put the inside function? The answer is that you put the inside in both places, just as with absolute
values.
Combining these with everything before. Of course, we will have to combine these functions with the chain rule, the
product rule, the quotient rule and the regular trig functions. And we aren’t done adding functions. The one thing to
remember is that all the new functions that we are getting follow precisely the same patterns for products, quotients, and
composition that we have already learned to use. We just have to keep a bunch more formulas in mind. (The patterns are
the product, quotient, and chain rules.)
Mugsy: Oh, great. And I thought that this course wasn’t going to stress memorization.
Albert: It doesn’t. It turns out that the formulas will all be given to you on the tests. You really don’t have to
memorize them. On the other hand, you had better be familiar with how to use these formulas!
Again, here are the examples that I will work in class. Differentiate the following functions.
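$\operatorname{Arcsin}(5x)$
$\sin(2x)\,\operatorname{Arctan}(3x)$
$x^3 \operatorname{Arcsec}(x^2)$
$\sec(x \operatorname{Arcsin}(6x))$
$\dfrac{\operatorname{Arctan} x}{x^2 + 1}$
$\dfrac{x\cos(3x)}{\operatorname{Arctan}(5x)\,\sec(7x)}$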
Homework #15
Exercises.
1. Find the derivatives of the following functions. You do not need to simplify your answers.
(a) Arcsin(2t)
(b) $\operatorname{Arctan}(x^2 + 1)$
(c) $|\operatorname{Arcsin}(5x)|$
(d) $\tan(1 - \operatorname{Arctan} x)$
(e) $\dfrac{x \operatorname{Arcsec} x}{1 - 2x^2}$
2. Find the derivatives of the following functions. You do not need to simplify your answers.
(a) Arcsin(5t + 2)
p
(b) Arctan(4 x)
(c) $\operatorname{Arcsec}(x^2 - 1)$
1
(d) sec Arcsin x
(e) $\dfrac{\operatorname{Arctan} x \,\operatorname{Arcsin} x}{x^2 + \sin x}$
3. Make up three of your own derivatives involving inverse trig functions and check them with Maple if you want.
Problems.
1. Let me lead you through the derivation of the formula for the derivative of one inverse trig function, Arctan x. (Here
is one of the times that we use the procedure for finding the derivative of the inverse of a function. I told you we’d see
it again. Refer to the homework problems of the exercises in the chain rule if you want to see it in general.) Assume
that you don’t yet know the derivative for Arctan x, and this problem will derive it.
(a) The definition of inverse functions shows that tan(Arctan x) = x. Differentiate both sides of this identity with
respect to x. What rule did you need to use? What part of the rule required the derivative of Arctan x?
(b) Once we figure out $\sec^2(\operatorname{Arctan} x)$, we can find the derivative of Arctan x by solving. If you’ll remember, there is
a procedure for finding a formula for “trig(Arctrig x)” functions. (Refer back to the reference section on inverse
trig functions if you need a refresher.) Let’s apply that to sec(Arctan x), and then square it. Call Arctan x = θ,
or tan θ = x. Draw a generic right triangle with tan θ = x. (For example, let θ be the lower angle, let x be
the length of the vertical side, and 1 the length of the horizontal side. Find the length of the hypotenuse by the
Pythagorean theorem. It should be $\sqrt{1 + x^2}$.) Then we find that sec(Arctan x) = sec θ can be read off of the
triangle. What then is sec(Arctan x)? What is $\sec^2(\operatorname{Arctan} x)$?
(c) Plug the results from the previous part into the formula from the first part. Solve to get $d/dx(\operatorname{Arctan} x) =
1/(1 + x^2)$.
2. Show that the derivative Maple gave for Arcsec x is the same as the derivative I gave. For this, you will want to use
the definition of the absolute value of x.
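For Problem 1, a numerical check can tell you whether your triangle came out right before you finish the algebra. The Python sketch below is an illustration I am adding (the helper names are hypothetical). It computes sec²(Arctan x) directly from the angle and compares it with 1 + x², then compares a numerical slope of Arctan x with 1/(1 + x²).

```python
import math


def sec_squared_of_arctan(x):
    """sec^2(Arctan x), computed from the angle itself rather than a triangle."""
    theta = math.atan(x)
    return 1 / math.cos(theta) ** 2


def numeric_derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


for x in (-3.0, 0.0, 0.5, 2.0):
    print(x, sec_squared_of_arctan(x), 1 + x * x,
          numeric_derivative(math.atan, x), 1 / (1 + x * x))
```

In each row the second and third numbers should match, and so should the fourth and fifth.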
\[ \ln(ab) = \ln(a) + \ln(b) \qquad \ln\left(\frac{a}{b}\right) = \ln(a) - \ln(b) \qquad \ln(a^n) = n\ln(a) \tag{1.58} \]
These are often called the properties of logarithms. Memorize them. There’s also one more, coming up soon.
The lab will cover common (base 10) and natural (base e) and other-based logarithms, and their properties, and how to
work them on your calculator.
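If you want to convince yourself of the three properties before memorizing them, here is a minimal Python sketch (my addition; the sample values and the name `checks` are arbitrary). It evaluates both sides of each property for one choice of positive a and b. In Python, `math.log` with one argument is the natural logarithm.

```python
import math

a, b, n = 3.0, 7.0, 2.5  # arbitrary sample values; a and b must be positive

# Each entry holds (left-hand side, right-hand side) of one property.
checks = {
    "ln(a b) = ln(a) + ln(b)": (math.log(a * b), math.log(a) + math.log(b)),
    "ln(a/b) = ln(a) - ln(b)": (math.log(a / b), math.log(a) - math.log(b)),
    "ln(a^n) = n ln(a)": (math.log(a ** n), n * math.log(a)),
}

for name, (left, right) in checks.items():
    print(name, left, right, math.isclose(left, right))
```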
\[ \frac{d}{dx}(\ln x) = 1/x \tag{1.59} \]
\[ \frac{d}{dx}(\ln x) = \frac{1}{x} \tag{1.64} \]
Combining this with everything before. As before, I will give examples in class. Here are the ones I’ll go through.
$\ln(1 + x^2)$
$x^3 \ln x$
$\ln(\operatorname{Arcsin} x)$
$\operatorname{Arctan}(\ln x)$
$\sin(4x)\,\ln(5 + x)$
$\dfrac{x + \ln x}{\sin x \,\operatorname{Arcsec} x}$
Logarithmic differentiation, and revisited product and quotient rules. A new option occurs once we have logarithms.
It can be used to differentiate incredibly messy products, quotients, and powers.
Mugsy: Here it comes. I can hardly wait.
We can use the properties of logarithms to convert the function into a much nicer form before we have to apply the rules
for derivatives. Several examples will help.
First, take $y = x^4 \sin x$. Differentiating this would be a simple use of the product rule, but let me do it by logarithmic differentiation
as an illustration.
Mugsy: Why do we have to do this the hard way when we already have a reasonably simple way to do it? Al, you’d
better have a good answer for this!
Albert: The process of logarithmic differentiation has several steps, and they are simpler to see on an easier example,
where the algebra doesn’t get in the way so much. The steps to follow are given in the example.
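• Step 1. Take the logarithm of both sides. You start with $y = x^4 \sin x$, and you get
\[ \ln y = \ln(x^4 \sin x) \]
• Step 2. Simplify the right-hand side using the properties of logarithms.
\[ \ln y = \ln(x^4) + \ln(\sin x) = 4\ln x + \ln(\sin x) \]
• Step 3. Differentiate both sides with respect to x (or whatever the independent variable is).
\begin{align}
\ln y &= 4\ln x + \ln(\sin x) \tag{1.67} \\
\frac{d}{dx}(\ln y) &= \frac{d}{dx}\bigl(4\ln x + \ln(\sin x)\bigr) \tag{1.68} \\
\frac{1}{y} \times \frac{dy}{dx} &= 4\,\frac{1}{x} + \frac{1}{\sin x} \times (\cos x) \tag{1.69}
\end{align}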
Note the use of the chain rule on both the ln y and ln(sin x) terms.
• Step 4. Solve the resulting equation for dy/dx. (Multiply through by y.)
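\begin{align}
\frac{1}{y} \times \frac{dy}{dx} &= 4\,\frac{1}{x} + \frac{1}{\sin x} \times (\cos x) \tag{1.70} \\
\frac{dy}{dx} &= y \times \left(4\,\frac{1}{x} + \frac{1}{\sin x} \times (\cos x)\right) \tag{1.71}
\end{align}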
This is the answer. But it is sometimes useful to go back to the original problem and plug in the function for y. In
this case, you’d get
\[ \frac{dy}{dx} = (x^4 \sin x) \times \left(4\,\frac{1}{x} + \frac{1}{\sin x} \times (\cos x)\right) \]
With more complicated functions (like the next one), you won’t want to do that.
Mugsy: Hey! I didn’t want to even start this!
It hardly looks like this is the correct answer. Certainly the product rule wouldn’t give that form! But it is the same.
See the homework.
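Here is a quick numerical comparison of the two answers, as a Python sketch added for illustration (the function names `dy_log_diff` and `dy_product_rule` are my own). It evaluates the logarithmic-differentiation result and the product-rule result at a few sample points where x and sin x are both nonzero.

```python
import math


def y(x):
    """The example function, x^4 sin x."""
    return x ** 4 * math.sin(x)


def dy_log_diff(x):
    """Result of logarithmic differentiation: y * (4/x + cos x / sin x)."""
    return y(x) * (4 / x + math.cos(x) / math.sin(x))


def dy_product_rule(x):
    """Result of the product rule: 4 x^3 sin x + x^4 cos x."""
    return 4 * x ** 3 * math.sin(x) + x ** 4 * math.cos(x)


for x in (0.5, 1.3, 2.0):
    print(x, dy_log_diff(x), dy_product_rule(x))
```

The logarithmic form cannot be evaluated where x = 0 or sin x = 0, because it divides by those quantities. The product-rule form has no such restriction, which is one reason to simplify when you can.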
Let me do another example, a lot messier this time. Refer back to the previous example for the different steps I am
going to follow. Take
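\begin{align}
y &= \sqrt[7]{\frac{x^5 (\operatorname{Arctan} x)^3}{(x^4 + x)^4 \cos^8 x}} \tag{1.72} \\
&= \left(\frac{x^5 (\operatorname{Arctan} x)^3}{(x^4 + x)^4 \cos^8 x}\right)^{1/7} \tag{1.73}
\end{align}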
Checking logarithmic differentiation on Maple. Maple is not much help with logarithmic differentiation, unless you
do a lot of the work. Maple’s simplification routines simply don’t allow any systematic logarithmic differentiation to occur.
The closest you can come—and it does help with a lot of the homework questions, but not all—is to define y as the
function, take the natural log of both sides, differentiate with respect to x, and then issue the command simplify();. That
might or might not help you check your work.
Mugsy: In other words, Maple is no help here.
Albert: Close. It isn’t as much help here as it is in other places.
Homework #16
Exercises.
1. Find the derivatives of the following functions. Don’t use logarithmic differentiation for these!
(a) ln(1 + sec x)
(b) ln x × sin(x − 1/x)
x
(c) ln Arcsec x
2. Find the derivatives of the following functions. Don’t use logarithmic differentiation for these!
(a) r ln r
(b) |ln(sin x) |
(c) $\dfrac{\beta \ln\beta}{(\beta + 3)^2}$ (β = Greek letter beta)
3. I recommend logarithmic differentiation for both parts of this problem.
(a) Find the derivative of
\[ y = \sqrt[7]{\frac{(\ln x)^{25}\, x^{18}\, (x^8 - x)^{41}}{(x^2 + 1)^{39}\, (\operatorname{Arcsin}(2x))^{29}}} \]
6. Show that the derivative of y = x4 sin x obtained by logarithmic differentiation in the notes agrees with the derivative
you get by the product rule. Do this by simplifying the expression obtained from logarithmic differentiation and
comparing it to what you get by the product rule.
Problems.
1. In this problem, we will give an example of a function whose limit by calculator is way off. Let
$f(x) = 2 + 3x^2 + \dfrac{1}{\pi^4}\ln(|x - \pi|)$.
(a) Construct a table of values of f (x) for x = 3, 3.1, 3.14, 3.141, 3.1415, and 3.1416. Use a calculator or Maple.
(Note: If you use your calculator, use the π key; and if you use Maple use Pi for π. Maple uses either
log(); or ln(); for natural logarithm. If you want log10 x in Maple, you have to use log10(x);. In this
problem, we want ln();.)
(b) On the basis of that table, give a guess as to the value of $\lim_{x \to \pi} f(x)$. Try to make it accurate to one decimal
place.
(c) What happens when you substitute x = π into the equation, particularly in the log term? (This is essentially the
same thing that would occur with an indeterminate form.)
(d) Evaluate the limit as x → π of f (x) using Maple. Which value (your guess on the basis of the table or Maple’s)
do you think is correct? (Hint: Read the first line of this problem.)
(e) The function f(x) drops below 0 only for x in the moderately small interval smaller than $\pi \pm 10^{-1000}$. Do you
think your calculator would ever give a limit of $-\infty$ for this limit?
(f) What’s the moral of this problem?
Mugsy: Problems have morals? You gotta be kidding.
2. In this problem, we use logarithmic differentiation to derive the product, quotient, and power rules. Once we have
logs, these become fairly easy.
Dudley: No fair. I already know these. Why see them again?
Albert: Because the more you see them in different contexts, the easier it is to work with them.
Mugsy: I’m all for anything that makes these things easier.
(a) Suppose $f(x) = f_1(x) \times f_2(x)$. Take the (natural) logarithms of both sides, and simplify the right hand side
using properties of logarithms. Differentiate both sides using the chain rule. Solve for $f'(x)$. Replace $f(x)$ with
$f_1(x) \times f_2(x)$, and simplify to get the usual product rule.
(b) Suppose $g(x) = g_1(x)/g_2(x)$. Follow the same procedure as in the first part to derive the quotient rule. (This
will require some algebraic simplification to get it into the usual quotient rule form.)
(c) Suppose $h(x) = x^n$. Again, follow the directions in the first part to derive the formula for the derivative of $x^n$.
Note that the formula does not depend on n being a whole number, just a constant.
3. In this problem, we will try to find a coefficient, c, and an exponent, n, so that $(c\,x^n)' = 1/x = x^{-1}$.
(a) What is the formula for $(c\,x^n)'$? This is what we want to match with $x^{-1}$.
(b) What value of n causes the exponent in the derivative to be −1?
(c) Let n have the value from the previous part. What is the derivative of $x^n$ in this case? Can we find some function
$c\,x^n$ for that value of n to make the derivative of $c\,x^n$ equal to $x^{-1}$?
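For part (a) of Problem 1 above (the one about the limit that a calculator gets wrong), the table can be produced with a few lines of Python instead of Maple or a calculator. This is only a sketch that I am adding, and the function name `f` simply mirrors the problem statement. It prints the table and leaves the interpretation, which is the point of the problem, to you.

```python
import math


def f(x):
    """f(x) = 2 + 3 x^2 + (1/pi^4) ln|x - pi|, from Problem 1."""
    return 2 + 3 * x ** 2 + math.log(abs(x - math.pi)) / math.pi ** 4


# The x-values requested in part (a) of the problem.
for x in (3, 3.1, 3.14, 3.141, 3.1415, 3.1416):
    print(x, f(x))
```

Note that `f(math.pi)` itself raises an error, because the logarithm of 0 is undefined. That is the computational face of part (c).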
\[ \frac{d}{dx} e^x = e^x \tag{1.88} \]
or
\[ \frac{d}{dx}(\exp x) = \exp x \tag{1.89} \]
This formula is the key to understanding the exponential function’s important properties.
Dudley: Let me get this straight. The derivative of $e^x$ is just $e^x$, right?
Albert: Yup.
Dudley: Then all I have to do is copy the thing down whenever I want to differentiate an exponential of anything?
Albert: Wrong, on two counts. First, you have to remember that what you will be differentiating will be exponentials
and other things, so you need to use the product and quotient rules as appropriate. Also, you have to remember the chain
rule, so it is not just “copy the thing down.” You also have to multiply by the derivative of the inside, just like trying
to differentiate sin(5 x), where you get an extra factor of 5.
Dudley: What’s the “inside” of an exponential?
Albert: It’s the thing in the exponent. You’ll see soon.
Why use base e exponentials versus other bases. The exponential function is $e^x$. Other candidates, $a^x$ for $a > 0$,
$a \neq 1$ or $e$, are also called exponential functions, but are never called the exponential function.
Mugsy: Never?
Albert: Never.
There are two reasons. First, $a^x = e^{x \ln a}$, so studying $e^{kx}$ for different values of k gets all the others. Second, the derivative
of $e^{kx}$ is so nice in that form.
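Both reasons can be checked numerically. The Python sketch below is my own illustration with hypothetical helper names. It rebuilds a power of 2 from the base e exponential, and it uses the chain rule result that the derivative of e^(kx) is k e^(kx) (the factor k is the derivative of the inside) to predict the slope of 2^x.

```python
import math


def power_via_exp(a, x):
    """a^x rewritten as e^(x ln a); requires a > 0."""
    return math.exp(x * math.log(a))


def d_exp_kx(k, x):
    """Chain rule result for the derivative of e^(k x): k e^(k x)."""
    return k * math.exp(k * x)


def numeric_derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


a, x = 2.0, 3.5  # arbitrary sample base and exponent
print(power_via_exp(a, x), a ** x)

# With k = ln 2, e^(k x) is 2^x, so the two slopes should agree.
k = math.log(a)
print(numeric_derivative(lambda t: a ** t, 1.0), d_exp_kx(k, 1.0))
```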
Note the differential equation that $e^{kx}$ solves. One differential equation shows up so often, in such a wide range of
settings, that it is quite amazing. The equation is dy/dt = k y, with k a constant. This is one case where understanding what
the equation says (that is, translating it into English) helps to explain why this equation shows up so much.
Dudley: Is this an example of all that talk long ago about being able to interpret an equation?
Albert: YES! You did learn something.
Dudley: You don’t have to sound so surprised . . . .
dy/dt represents the rate at which y is changing. What dy/dt = k y says is that the rate at which y is changing is
proportional to the amount of y present. Any physical situation where that description holds can be modeled mathematically
(the term for this process) by the equation dy/dt = k y, with appropriate understandings of y and k.
The solution of the equation dy/dt = k y is $y(t) = C e^{kt}$, where C can be any constant. (See the homework.) The value
of C represents the value of y at t = 0, so it is often written $y_0$. That makes the equation look like $y(t) = y_0 e^{kt}$.
The key to the equation is k. If k > 0, then y grows (and k is called the growth constant), and if k < 0, then y decays
(and k is called the decay constant). (We always work in physical situations with y > 0. Then k > 0 makes dy/dt = k y > 0
and k < 0 makes dy/dt = k y < 0.) Additionally, the size of k determines how fast the solution grows or decays. When k is
large and positive, y grows very fast; when k is small and positive, then y grows at a much more controlled rate (but once it
starts taking off, it is virtually unstoppable); when k is small (close to 0) and negative, then y is decaying mildly; and when
k is large (far from 0) and negative, then y is dying rapidly—it goes to 0 so fast that it becomes indistinguishable from it
very rapidly.
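To watch k do its job, here is a minimal Python sketch (an illustration I am adding; the starting amount, the sample time, and the k values are arbitrary choices). For several values of k it prints y at one time, a numerical estimate of dy/dt, and k times y, so you can see both that the equation dy/dt = k y holds and how the sign and size of k control growth or decay.

```python
import math


def y(t, y0, k):
    """The solution y(t) = y0 e^(k t)."""
    return y0 * math.exp(k * t)


def numeric_rate(t, y0, k, h=1e-6):
    """Central-difference estimate of dy/dt at time t."""
    return (y(t + h, y0, k) - y(t - h, y0, k)) / (2 * h)


y0 = 100.0  # hypothetical starting amount
t = 1.5     # arbitrary sample time
for k in (2.0, 0.1, -0.1, -2.0):
    print(k, y(t, y0, k), numeric_rate(t, y0, k), k * y(t, y0, k))
```

In every row the last two columns should agree, which is what "the rate of change is proportional to the amount present" means.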
We now look briefly at four different situations where exponentials occur. One of them (population growth) will be
looked at in detail later, and one of them (continuous compounding) will be re-derived later. The other two are just to show
you the variety of situations that these equations explain.
Newton’s law of cooling. In this situation, y represents the difference between the temperature of an object and the
temperature of its surroundings (called the ambient temperature). The equation dy/dt = k y says that an object cools off (or
warms up) at a rate proportional to that difference in temperature. In other words, a very hot object loses heat much faster
than a slightly warm object.
Mugsy: Al, I need another example.
Albert: OK, Mugsy. Here’s one on your terms. After shooting a gun for a while the barrel gets real hot, right?
Mugsy: Sometimes too hot to touch. That’s why it’s called a heater. But it depends on the gun, and how much you
shoot, and the type of bullets, and . . . .
Albert: OK, OK. Suppose the barrel’s at 170 degrees and the air temperature is 70 degrees, so the gun is 100 degrees
warmer than the air. In five minutes, the barrel will still be warm, but not hot, say 110 degrees. That means that in
five minutes, the barrel cooled off by 60 degrees, or 12 degrees per minute average. In the next five minutes, the barrel
will be barely warm, say 80 degrees. So in the next five minutes, the barrel only cools off 30 degrees, or 6 degrees per
minute. The rate of cooling has dropped. The barrel isn’t as hot after five minutes, so it won’t lose heat as fast. If
it kept dropping at 12 degrees per minute, after the second five minute interval, the temperature would have dropped
another 60 degrees, and the gun would be only 50 degrees, even cooler than the air. That isn’t going to happen. Does
this make sense?
Mugsy: More than it used to. I think.
Dudley: You think? That’s news.
Mugsy: Don’t make wise cracks while holding a gun, see? Not healthy.
In this case k < 0, since y > 0 (an object warmer than its surroundings) will cause the temperature to decrease (dy/dt <
0). This equation is only roughly accurate, due to ignoring certain critical items, such as the temperature distribution within
the object. The entire object does not have just one temperature.
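Albert's gun barrel can be run through the model as a rough Python sketch (added for illustration; the names `cooling_constant` and `barrel_temperature` are mine). I take the first five minutes of his story as data, a temperature difference falling from 100 degrees to 40 degrees, solve y = y0 e^(kt) for k, and then ask the model for the temperature at ten minutes.

```python
import math

AMBIENT = 70.0  # air temperature from Albert's example
START = 170.0   # barrel temperature at t = 0 from Albert's example


def cooling_constant(diff_start, diff_later, minutes):
    """Solve diff_later = diff_start * e^(k * minutes) for k."""
    return math.log(diff_later / diff_start) / minutes


# First five minutes: the difference falls from 100 degrees to 40 degrees.
k = cooling_constant(100.0, 40.0, 5.0)


def barrel_temperature(t):
    """Model temperature after t minutes: ambient + (initial difference) e^(k t)."""
    return AMBIENT + (START - AMBIENT) * math.exp(k * t)


for t in (0, 5, 10, 15):
    print(t, barrel_temperature(t))
```

The model puts the barrel near 86 degrees at ten minutes rather than Albert's "say 80", which fits the fact that he was quoting round numbers. The qualitative point is unchanged: the barrel cools more slowly as it approaches the air temperature, and it never drops below it.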
Radioactive decay and radiometric dating. In this case, y represents the amount of a radioactive substance present,
and k represents the decay rate of the substance. The equation dy/dt = k y says that the more of a radioactive substance you
have, the faster you lose it. (If you have one 10 µg piece of radium, and one 50 µg piece of radium, the 50 µg piece will
emit radiation at 5 times the rate of the 10 µg piece. And the radiation is emitted as the radium is changing into something
else—radon gas, among other things).
We want to look more carefully at the value of k in this case. As before, it must be that k < 0, since the amount of
radioactive substance decreases. (That leads to an interesting question: How can breeder nuclear reactors produce more
fuel than they use?) When dealing with radioactive substance, one of the critical items to note is the half life—the length
of time it takes for half the radioactive substance to decay. If we write $t_{1/2}$ for the half life, this means that $\tfrac{1}{2} y_0 = y(t_{1/2})$.
(Read over that equation until it makes sense.) That says that
\begin{align}
\frac{1}{2} y_0 &= y_0 e^{k t_{1/2}} \tag{1.90} \\
\frac{1}{2} &= e^{k t_{1/2}} \tag{1.91} \\
\ln\frac{1}{2} &= \ln e^{k t_{1/2}} \tag{1.92} \\
\ln(1) - \ln(2) &= k\, t_{1/2} \tag{1.93} \\
-\ln(2) &= k\, t_{1/2} \tag{1.94} \\
k &= \frac{-\ln(2)}{t_{1/2}} \tag{1.95}
\end{align}
What this last equation says is that when the half life of the substance is large (uranium, for example), the value of k is
negative and very close to 0, and the decay is very slow. On the other hand, if the half life is small (radium, for example),
the value of k is negative and far from 0, so the decay is very fast. It makes sense. Another form of this equation,
\[ t_{1/2} = \frac{-\ln(2)}{k} \]
gives the half-life if you happen (by some weird coincidence) to have the value of k.
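The two forms of the half-life equation are easy to turn into a Python sketch (my illustration; the half life of 10 time units is a made-up value, not data for any real substance). It converts a half life into k, converts back, and confirms that the model really does leave half the material after one half life and a quarter after two.

```python
import math


def decay_constant(t_half):
    """k = -ln(2) / t_half, equation (1.95)."""
    return -math.log(2) / t_half


def half_life(k):
    """t_half = -ln(2) / k, the rearranged form."""
    return -math.log(2) / k


def amount(t, y0, k):
    """y(t) = y0 e^(k t)."""
    return y0 * math.exp(k * t)


t_half = 10.0  # hypothetical half life, in arbitrary time units
k = decay_constant(t_half)
print(k, half_life(k))
print(amount(t_half, 1.0, k), amount(2 * t_half, 1.0, k))
```

Whatever time unit the half life is measured in, k comes out in "per that unit", so keep the units of t consistent with it.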
This idea is used with radiometric dating, a source of considerable debate in some quarters.
Dudley: Radiometric dating? Like an intense Saturday evening?
Albert: Not in my quarters.
Mugsy: Oh, spare us, both of you.
The basic idea is this. Two types (isotopes) of carbon matter here: one that is radioactive (carbon-14) and one that
isn’t (carbon-12), and they occur in a fixed ratio. Carbon-14 is created by the interaction of cosmic rays with high-altitude nitrogen, and
this has been going on for sufficiently long that carbon-14 is being formed at the rate it is disintegrating, giving what is called a
steady-state. These isotopes are mixed together thoroughly. All chemical reactions treat them virtually identically. Living
tissue (that is, a plant or animal) will absorb both isotopes of carbon at once through food, and keep that constant ratio of
carbon-12 to carbon-14. However, when the tissue dies, the carbon-12 stays put, while the carbon-14 decays, turning into a
different element. If you have a piece of long-dead, once-living material, you can figure out not only how much carbon is
in it but also the amounts of the different types of carbon that are present as well by measuring the amount of (radioactive)
carbon-14 with a Geiger counter sort of device. If you make the assumption that the ratio of different isotopes of carbon
hasn’t changed significantly since the tissue was alive, you can thereby determine how much carbon-14 was present at
death, and finally, how much of the carbon-14 has disintegrated. Knowing the half-life of carbon-14 then allows you to tell
how old the sample is, that is, how long it has been since the carbon-12 and carbon-14 were in equilibrium amounts.
Mugsy: Clear as mud.
Let’s continue a bit further into this. At equilibrium concentrations of carbon-12 and carbon-14, one gram of carbon
will produce 15.3 radioactive disintegrations per minute. If you had a one-gram sample of material that produced, say, 5.7
disintegrations per minute, you could tell that the amount of carbon-14 present was only 5.7/15.3 = 0.37 of the original
amount. That would mean $0.37 = e^{k t}$, where we want to find t = the age of the object. To solve that for t, we need to know
k. For carbon-14, the value of k is well-known (to those who have to work with it) as −0.0001245 per year. (See the homework.)
Then
\begin{align}
0.37 &= e^{k t} \tag{1.96} \\
\ln(0.37) &= \ln\left(e^{-0.0001245\, t}\right) \tag{1.97} \\
&= -0.0001245\, t \tag{1.98} \\
t &= \ln(0.37)/(-0.0001245) \tag{1.99} \\
&\approx 7986 \tag{1.100}
\end{align}
or about 7,986 years old. This is the theory behind the practice of radiometric dating.
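The arithmetic in that calculation is easy to check by machine. What follows is only a sketch in Python (my choice of language, not the course's). The constants are the ones quoted in the text; the function name `radiocarbon_age` is hypothetical.

```python
import math

K_CARBON_14 = -0.0001245   # decay constant per year, as quoted in the text
EQUILIBRIUM_RATE = 15.3    # disintegrations per minute per gram at equilibrium

def radiocarbon_age(measured_rate, k=K_CARBON_14, equilibrium_rate=EQUILIBRIUM_RATE):
    """Solve (measured_rate / equilibrium_rate) = e^(k t) for t, in years."""
    fraction_left = measured_rate / equilibrium_rate
    return math.log(fraction_left) / k

# Age using the unrounded ratio 5.7/15.3, and using the ratio rounded to 0.37.
age_unrounded = radiocarbon_age(5.7)
age_rounded = math.log(0.37) / K_CARBON_14
print(f"fraction of carbon-14 left: {5.7 / 15.3:.4f}")
print(f"age from the unrounded ratio: {age_unrounded:.0f} years")
print(f"age from the ratio rounded to 0.37: {age_rounded:.0f} years")
```

Rounding the ratio to two digits before taking the logarithm shifts the answer by a few decades. That is a reminder of how sensitive the age is to the measured disintegration rate.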
Dudley: Al, how realistic is this?
Albert: The chemistry/physics of it is solid. The questions arise when dealing with anything that old. How can we
be sure that there hasn’t been contamination with more recently-alive things (bacteria, etc.)? The arguments can get
nearly violent. Depends on whose pet theory is being skewered.
Population growth. Populations tend to grow exponentially. (This was part of the discovery of Malthus.) The reason
is simple. The more people you have, the more new people arrive. The same reasoning applies to other creatures, such
as bacteria or rabbits.
Mugsy: To update P. T. Barnum, there’s a sucker born every 30 seconds.
There are limits on growth that are not included in this equation, and we will patch the equation up later when we get to
integration.
The equations say that $y = y_0 e^{k t}$, where now $y_0$ is the initial population, and k tells how fast the population is growing.
Note that in this example, k > 0, since we are talking about growth rather than decay.
Dudley: Al, is this realistic?
Albert: For short periods of time, yes. For longer periods of time, you need to throw in the limitations of needed
resources, such as food and space.
Continuous compounding. In case you think that all of these are beyond daily life, let me mention one more that
we’ll investigate later this semester. If your savings account
Dudley: Savings? This is more than a bit hypothetical.
is listed as giving continuous compounding, the formula for the amount of money in your account (if you leave it alone)
Mugsy: Real hypothetical.
is dy/dt = k y, where k is the interest rate given for the account. The more money you have, the more money you make.
Combine these with everything before. We’re getting near the end, but once again, let me mention that all we are doing
is accumulating more and more formulas to plug into, but they are not getting more complicated. All these formulas for
derivatives work precisely the same way. Once you get that down, they all fall into place. Just don’t get overwhelmed. The
examples for this section follow the next topic.
Definitions, graphs, and identities. The notation for the hyperbolic trig functions is exactly like the regular trig functions,
except you add an “h” to the name. The definitions, however, are completely different. They are defined in terms of the
exponential function, $e^x$. The accompanying table gives them. These are not the formulas for the
derivatives! They are the formulas for defining the functions in terms of exponentials and sinh x and cosh x. The formulas
for the derivatives are given in another table, and you are asked to find some in the homework.
You also need to know how to pronounce these. It is not what you might think. The pronunciations I use (and I think
these are standard) are: sinh is pronounced like “cinch,” cosh has a hard “c” and a short “o” so that it starts the same way
as “cot,” tanh is “tansch” (I know there’s no “s” in tanh, but you pronounce it anyway), coth is “coh-tansch” or “kahth” (I
use “coh-tansch”), sech is “seetch,” and csch is “coh-seetch.”
Dudley: Al. Help?
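\[
\begin{array}{lll}
\text{Function} & \text{In terms of exponentials} & \text{In terms of } \sinh x \text{ and } \cosh x \\[4pt]
\sinh x & \dfrac{e^x - e^{-x}}{2} & \\[8pt]
\cosh x & \dfrac{e^x + e^{-x}}{2} & \\[8pt]
\tanh x & \dfrac{e^x - e^{-x}}{e^x + e^{-x}} & \dfrac{\sinh x}{\cosh x} \\[8pt]
\coth x & \dfrac{e^x + e^{-x}}{e^x - e^{-x}} & \dfrac{\cosh x}{\sinh x} \\[8pt]
\operatorname{sech} x & \dfrac{2}{e^x + e^{-x}} & \dfrac{1}{\cosh x} \\[8pt]
\operatorname{csch} x & \dfrac{2}{e^x - e^{-x}} & \dfrac{1}{\sinh x}
\end{array}
\]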
Derivatives of the hyperbolic trig functions. The parallels continue with the derivatives of the hyperbolic trig functions.
Since all of them can be defined in terms of exponentials, and you know how to differentiate them, I shouldn’t have to tell
you what the derivatives of the hyperbolic functions are. In fact, in the homework, I will ask you to derive them. I will give
you this much of a start: d/dx(sinh x) = cosh x and d/dx(cosh x) = sinh x. Note that the only difference is the “missing”
negative sign in the derivative of cosh x.
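For the record, here is a sketch of where those two starting formulas come from. It uses nothing beyond the exponential definitions and the fact that the derivative of $e^{-x}$ is $-e^{-x}$ (the chain rule). The remaining four are still yours to do.

```latex
\begin{align*}
\frac{d}{dx}\sinh x
  &= \frac{d}{dx}\left(\frac{e^{x} - e^{-x}}{2}\right)
   = \frac{e^{x} - (-1)\,e^{-x}}{2}
   = \frac{e^{x} + e^{-x}}{2}
   = \cosh x \\
\frac{d}{dx}\cosh x
  &= \frac{d}{dx}\left(\frac{e^{x} + e^{-x}}{2}\right)
   = \frac{e^{x} + (-1)\,e^{-x}}{2}
   = \frac{e^{x} - e^{-x}}{2}
   = \sinh x
\end{align*}
```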
All of the hyperbolic trig functions have derivatives that match precisely the derivatives of the circular counterparts,
except for negative signs that appear or disappear. That’s a big hint.
Dudley: Are we ever going to get a list of these derivatives?
Albert: On the extra sheet that will accompany the tests, these derivatives will be listed. Of course you save your
tests, don’t you?
Mugsy: Yeah. I bronze mine.
Inverses of the hyperbolic trig functions and their derivatives. As with the circular trig functions, the hyperbolic trig
functions have inverses. But since the hyperbolic functions are defined in terms of exponentials, the inverses can be defined
in terms of the inverse of the exponential, the natural logarithm. They are listed in the accompanying table at the top of
this page. It isn’t clear why anyone would want to use these. I wondered myself, until recently when I had to work with
Arccsch, and discovered that it was a whole lot easier to work with than the messy combination of square roots and logs
that make up Arccsch.
The derivatives of the inverse hyperbolic trig functions are given in the accompanying table at the top of the next page.
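Note that these are identical to the derivatives of the inverse circular trig functions, except for changes of sign. That’s handy
to keep in mind.
\[
\begin{array}{ll}
\text{Circular} & \text{Hyperbolic} \\[4pt]
\sin(x \pm y) = \sin x \cos y \pm \cos x \sin y & \sinh(x \pm y) = \sinh x \cosh y \pm \cosh x \sinh y \\[4pt]
\cos(x \pm y) = \cos x \cos y \mp \sin x \sin y & \cosh(x \pm y) = \cosh x \cosh y \pm \sinh x \sinh y \\[4pt]
\tan(x \pm y) = \dfrac{\tan x \pm \tan y}{1 \mp \tan x \tan y} & \tanh(x \pm y) = \dfrac{\tanh x \pm \tanh y}{1 \pm \tanh x \tanh y}
\end{array}
\]
\[
\begin{array}{ll}
\text{Function} & \text{Definition} \\[4pt]
\operatorname{Arcsinh} x & \ln\left(x + \sqrt{x^2 + 1}\right) \\[6pt]
\operatorname{Arccosh} x & \ln\left(x + \sqrt{x^2 - 1}\right) \\[6pt]
\operatorname{Arctanh} x & \tfrac{1}{2} \ln\left(\dfrac{1 + x}{1 - x}\right) \text{ for } |x| < 1 \\[8pt]
\operatorname{Arccoth} x = \operatorname{Arctanh}\left(\tfrac{1}{x}\right) & \tfrac{1}{2} \ln\left(\dfrac{x + 1}{x - 1}\right) \text{ for } |x| > 1 \\[8pt]
\operatorname{Arcsech} x = \operatorname{Arccosh}\left(\tfrac{1}{x}\right) & \ln\left(\dfrac{1 + \sqrt{1 - x^2}}{x}\right) \\[8pt]
\operatorname{Arccsch} x = \operatorname{Arcsinh}\left(\tfrac{1}{x}\right) & \ln\left(\dfrac{1}{x} + \dfrac{\sqrt{1 + x^2}}{|x|}\right)
\end{array}
\]
\[
\begin{array}{ll}
\text{Function} & \text{Derivative} \\[4pt]
\operatorname{Arcsinh} x & \dfrac{1}{\sqrt{1 + x^2}} \\[8pt]
\operatorname{Arccosh} x & \dfrac{1}{\sqrt{x^2 - 1}} \\[8pt]
\operatorname{Arctanh} x & \dfrac{1}{1 - x^2} \\[8pt]
\operatorname{Arccoth} x & \dfrac{1}{1 - x^2} \\[8pt]
\operatorname{Arcsech} x & \dfrac{-1}{x \sqrt{1 - x^2}} \\[8pt]
\operatorname{Arccsch} x & \dfrac{-1}{|x| \sqrt{x^2 + 1}}
\end{array}
\]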
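If the logarithm formulas for the inverse hyperbolic functions look suspicious, they can be spot-checked numerically. The sketch below is in Python, which is my assumption (the course uses Maple). It compares a few of the log formulas against the library functions `math.asinh`, `math.acosh`, and `math.atanh`, and compares a finite-difference slope against two of the tabulated derivatives. The helper names ending in `_log`, and `slope`, are made up for this illustration.

```python
import math

def arcsinh_log(x):
    return math.log(x + math.sqrt(x * x + 1))

def arccosh_log(x):
    return math.log(x + math.sqrt(x * x - 1))               # needs x >= 1

def arctanh_log(x):
    return 0.5 * math.log((1 + x) / (1 - x))                 # needs |x| < 1

def arccsch_log(x):
    return math.log(1 / x + math.sqrt(1 + x * x) / abs(x))   # needs x != 0

def slope(func, x, h=1e-6):
    """Central-difference estimate of the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Log formulas versus the library's inverse hyperbolic functions.
print(arcsinh_log(2.0), math.asinh(2.0))
print(arccosh_log(2.0), math.acosh(2.0))
print(arctanh_log(0.5), math.atanh(0.5))
print(arccsch_log(-2.0), math.asinh(1 / -2.0))   # Arccsch x = Arcsinh(1/x)

# Estimated slopes versus the derivative formulas for Arcsinh and Arccsch.
print(slope(arcsinh_log, 2.0), 1 / math.sqrt(1 + 2.0**2))
print(slope(arccsch_log, 2.0), -1 / (abs(2.0) * math.sqrt(2.0**2 + 1)))
```

Each printed pair should agree to several decimal places. A numerical check like this proves nothing, but it catches sign errors quickly.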
Of course, we will need to go over some derivatives in class that include exponentials, hyperbolic trig functions, and
inverse hyperbolic trig functions. I won’t ask for the derivatives of any of the hyperbolic trig functions until we have the
correct formulas worked out, except that I have already given you that the derivative of sinh x is cosh x and the derivative of
cosh x is sinh x (with no minus sign). Here are the examples that will be worked in class.
$e^{\cosh x}$
$\sinh(4x + 1)$
$\operatorname{Arccoth}(6 + e^x)$
$\tanh\left(e^{x^2}\right)$
$\exp x \times \ln(5 \sin^2(x))$
$\dfrac{\sinh x \cosh x}{\sin x \cos x}$
Homework #17
Exercises.
5. Make up three more derivatives and work them. Be sure to include exercises that have polynomials, absolute values,
trig functions, inverse trig functions, and logarithms in addition to exponentials. Don’t forget products and quotients,
either. Be glad that we are now done with the new functions that we’ll encounter!
Problems.
1. Use logarithmic differentiation to get the derivative of $y = a^x$, for a > 0. Note that a is a constant, so ln a is a constant
(so it has derivative = 0). Here, x is the variable.
2. Plug $y = C e^{k t}$ into both dy/dt and k y, and show that the results are equal. (This shows that $y = C e^{k t}$ is a solution of
the differential equation dy/dt = k y.)
3. When we looked at Newton’s law of cooling, we decided that k < 0 when y > 0, because dy/dt < 0 then and we want
the equation dy/dt = k y to work. Now we want to find the sign of k when an object is warming up. Find the sign
of dy/dt in the case that y < 0 (so the object is cooler than the ambient temperature). Will k be positive or negative
when y < 0?
4. Get the derivatives of the remaining four hyperbolic trig functions, and put them into the forms that make them look
like the derivatives of the circular counterparts. You will need some of the hyperbolic trig identities to do that. But it
can be done exactly the same way that was done with the circular trig functions. Use the definitions of the hyperbolic
trig functions in terms of sinh x and cosh x.
Investigations.
1. With decay, we found that the equation relating half-life and the decay constant k was $k\, t_{1/2} = -\ln(2)$. Since k < 0,
this gave half-life a positive value. Can you think of a good interpretation for $t_2$ in the equation $k\, t_2 = \ln(2)$ when
k > 0 (exponential growth)?
2. A function f(x) is called even if f(−x) = f(x), and it is called odd if f(−x) = −f(x).
(a) Show that $x^n$ is even when n is even and odd when n is odd. Do this by plugging −x in for x in $x^n$ and treating
n even and n odd separately. (This is the source of the terminology.)
(b) Show that cosh x is even and sinh x is odd. This can be done by plugging −x into the exponential definitions
for sinh x and cosh x and simplifying. (The same pattern is true for sin x and cos x.)
(c) Show that $e^x = \cosh x + \sinh x$. Do this by again going back to the exponential definitions of sinh x and cosh x.
(d) Let f(x) be any function. Show that $g(x) = \tfrac{1}{2}(f(x) + f(-x))$ is an even function and $h(x) = \tfrac{1}{2}(f(x) - f(-x))$
is an odd function. Also show f(x) = g(x) + h(x). (Hint: Look at the previous parts of this investigation.) This
shows that any function can be written as the sum of an even function and an odd function. Also, cosh x and
sinh x are nothing more than the even and odd functions derived from $e^x$. This explains, in a small way, why
they exist.
(e) Show that an odd function f(x) must have f(0) = 0. (Hint: Show f(−0) = −f(0).)
1.4 Differentials.
The notation for derivatives due to Leibniz is dy/dx. This is not the quotient of two “things,” dy divided by dx, by what we
have done before: $dy/dx = f'(x)$ shows that very clearly. But engineers have discovered that treating dy and dx as separate
quantities leads to correct results. In keeping with the spirit of the course, I will do what the engineers do, and look at
differentials (as the dy and dx are called).
Mugsy: So, what if you’re a biologist?
Albert: The same ideas apply there. For example, population growth from before will be re-examined in this section.
The list could continue through all of the functions that we worked with.
Dudley: Al, is this as easy as it looks?
Albert: In fact, it is conceptually simpler than derivatives, but being new, it is not as familiar.
Mugsy: Does that mean “yes?”
Not specifying the independent variable is useful. Suppose we have the following, not uncommon, situation. We
have the position of an object given by x(t) and y(t). That is, you know an object’s x- and y-coordinates as a function
of time. (Such a pair is called a set of parametric equations. The main variables are both expressed in terms of a third, independent
variable. We will look at them closely next semester.) Then we can find dx and dy by differentiating: dx = (dx/dt) dt and
dy = (dy/dt) dt. Note that the dt’s have to be tacked onto the end of these derivatives, since t is the variable. Then dy/dx
can be calculated by
\[
\frac{dy}{dx} = \frac{(dy/dt)\, dt}{(dx/dt)\, dt} = \frac{dy/dt}{dx/dt}
\]
Do you recognize that? Suppose we rewrite it as
\[
\frac{dy}{dt} = \frac{dy}{dx} \times \frac{dx}{dt}
\]
It’s the chain rule! Yup, the chain rule appears here, and actually the reason that differentials work at all is that the chain
rule works. (I bet you were wondering why I omitted the most important rule in calculus from the list of rules I gave
earlier for differentials. It’s because the chain rule is built into any operation with differentials. It doesn’t appear, since it is
the assumption of the notation.)
Mugsy: Need I say that I hadn’t noticed?
Albert: No.
\begin{align}
f(x) &= 2x^3 - 4x^2 + 8x - 5 \tag{1.107} \\
f(x + dx) &= 2(x + dx)^3 - 4(x + dx)^2 + 8(x + dx) - 5 \tag{1.108} \\
&= \left(2x^3 + 6x^2 (dx) + 6x (dx)^2 + 2 (dx)^3\right) \tag{1.109} \\
&\quad - 4x^2 - 8x (dx) - 4 (dx)^2 + 8x + 8 (dx) - 5 \tag{1.110} \\
\Delta y = f(x + dx) - f(x) &= 6x^2 (dx) + 6x (dx)^2 + 2 (dx)^3 - 8x (dx) - 4 (dx)^2 + 8 (dx) \tag{1.111} \\
&= (6x^2 - 8x + 8)(dx) + (6x - 4)(dx)^2 + (2)(dx)^3 \tag{1.112} \\
dy = f'(x)\, dx &= (6x^2 - 8x + 8)(dx) \notag
\end{align}
What’s the difference? When working out ∆y, you had to keep higher powers of dx. The differential formula simply
ignores them. Therefore, using differentials allows you the luxury of ignoring powers of dx higher than 1. Simply throw
them away! And what you get by the process is the derivative.
Mugsy: Hey! I like that! You just throw the bums out. What a deal.
Think about this for a while and you will see that this is precisely what we did earlier by factoring out the ∆x from the
expression for ∆y, dividing by ∆x, and then setting ∆x = 0 in what was left. The division by the ∆x causes terms with just a
single ∆x in them to lose the factor of ∆x entirely. The terms with higher powers of ∆x lose one, but not all, of them. Setting
∆x = 0 causes them to go away at that stage. By throwing away powers of dx higher than the first, you do that earlier on in
the calculation. It also makes the computations easier by cutting down on the number of terms you have to keep around.
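To see how little is lost by throwing away the higher powers, here is a small numerical sketch for the same polynomial. It is in Python (my assumption; Maple would do equally well), and the point x = 1 and the step sizes are chosen only for illustration.

```python
def f(x):
    return 2 * x**3 - 4 * x**2 + 8 * x - 5

def f_prime(x):
    return 6 * x**2 - 8 * x + 8

def compare(x, dx):
    """Return (delta_y, dy, leftover) for the cubic above.

    delta_y is the true change in f, dy is the differential f'(x) dx, and
    leftover is the discarded part (6x - 4) dx^2 + 2 dx^3 from equation (1.112).
    """
    delta_y = f(x + dx) - f(x)
    dy = f_prime(x) * dx
    leftover = (6 * x - 4) * dx**2 + 2 * dx**3
    return delta_y, dy, leftover

for dx in (0.1, 0.01, 0.001):
    delta_y, dy, leftover = compare(1.0, dx)
    print(f"dx = {dx:<6}  delta_y = {delta_y:.9f}  dy = {dy:.9f}  discarded = {leftover:.9f}")
```

Each time dx shrinks by a factor of 10, the discarded part shrinks by a factor of about 100, so dy becomes a better and better stand-in for Δy.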
Mugsy: I’m liking these differentials more and more.
Graphically, you can think of differentials as being tiny wiggles in the variables, but the graph has been so magnified to
see them that the curve has actually been replaced by its tangent line, not just approximately, but exactly. The ratio dy/dx
is then the slope of the line—the tangent line of the curve, exactly as we had before by the limit process. The higher powers
of dx that we ignored in the differential are essentially the “bend” in the curve (like a parabola, which needs a quadratic in
it).
the cooling rate change with the temperature? How can you say that dy only depends on a single temperature difference,
just one value of y? But what saves it is that this equation is true only with differentials. So, dt is a very tiny slice of time.
And because it is so small, the only change in the temperature difference will be equivalently tiny (dy). For the next dt
change, a new value of y is used. As t changes, y changes, too. And because dt is so small, the temperature doesn’t change
during the dt interval, and you can use the single value y.
Essentially, when we are working with differential-sized quantities, the intervals are so short that we can treat everything
that looks variable as though it were a constant, except for other differentials.
This last point is sufficiently important that I want to emphasize it again. Whenever you have a differential on one
side of an equation, the other side must also include a differential (or be 0). The reason is submicroscopic wiggles can’t
be magnified enough to make them “regular-sized” numbers. This is usually stated by saying that differentials must be
balanced by differentials, called balancing differentials.
The next example was radioactive decay. In that case, y = amount of a radioactive substance. (k remains a proportionality
constant and t remains as time.) The equation dy = k y dt would end up meaning that the amount of substance that
decays in a very short time period (dy) is (=) proportional to (k) the amount of substance (y) times the length of time (dt).
Again, this works only because the length of time is a differential. Otherwise the number of decays in the equation would
be changing as the radioactive substance decayed.
The next example was population growth. In that case, y = population, and again, k and t represent a constant and time,
respectively. The equation dy = k y dt now is interpreted as follows: The increase in population (dy) is (=) proportional
to (k) the current population (y) times the length of time you wait (dt). Note again that dt must be small enough that the
increase in population doesn’t increase the population on its own, or else the equation would fail.
The final example was continuously compounded interest. In that case, y = amount of money in the account, with k =
interest rate, and t = time again. Then dy = k y dt is nothing more than the simple interest formula, interest (dy, the increase
in the money in the account) equals (=) principal (y) times rate (k) times time (dt). When the time period is as short as a
differential, the amount of money in the account doesn’t have a chance to compound, so it acts like simple (uncompounded)
interest. As t changes, though, so does y, and for the next dt amount, y has increased slightly, and the compounding takes
effect that way.
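The interest interpretation can be acted out directly: charge simple interest over one tiny slice of time, update the balance, and repeat. The sketch below does that in Python (my choice of language). The balance, rate, and time span are hypothetical numbers, and the function name is made up. The same loop would serve for decay or population growth with a different k.

```python
import math

def grow_by_small_steps(y0, k, total_time, steps):
    """Apply dy = k * y * dt over and over, using dt = total_time / steps."""
    dt = total_time / steps
    y = y0
    for _ in range(steps):
        y += k * y * dt   # simple interest over one small slice of time
    return y

y0, k, total_time = 1000.0, 0.05, 10.0     # hypothetical account: $1000 at 5% for 10 years
exact = y0 * math.exp(k * total_time)      # the solution y = y0 e^(k t)

for steps in (10, 1000, 100000):
    approx = grow_by_small_steps(y0, k, total_time, steps)
    print(f"{steps:>6} steps: {approx:.4f}   (y0 e^(k t) = {exact:.4f})")
```

With only 10 steps (yearly compounding) the balance falls short of $y_0 e^{k t}$. As the slices get thinner, the running total creeps up toward it.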
Although it may not yet be clear why this approach is useful, believe me that it is. The analysis of complex situations
simplifies enormously when you are allowed to use time intervals so small that all variables can be treated as constants. That’s
what we can do with differentials.
Again, I will do some examples of finding differentials in class. Here they are. Find the differentials of the following
functions.
$y = x \sin x$
$x = \cos^2 t$
$w = \dfrac{e^Q}{1 - Q^3}$
Homework #18
Exercises.
\[
\frac{dy}{dx} = \frac{dy/dt}{dx/dt} \tag{1.113}
\]
It looks as though the dt’s are canceling (looking at them as differentials), and even though that is a slightly strained
interpretation, it is a very handy way to remember the formula, and can even be legitimized with enough effort. Note:
Half the battle in mathematics is to come up with a notation that helps you. The Leibniz notation (differentials) does that
(the prime notation $f'(x)$ does not). The other half is to interpret the notation correctly. I’m giving you the notation, and trying to
show you the interpretation.
There will be an example in class. It is quite straightforward. It is to find the derivative dy/dx for the parametric
equations $x = e^t$, $y = \ln t$.
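As a sketch of how that example might go (the version done in class may be organized differently), differentiate each coordinate with respect to t and then divide:

```latex
\begin{align*}
\frac{dx}{dt} &= e^{t}, \qquad \frac{dy}{dt} = \frac{1}{t} \\[4pt]
\frac{dy}{dx} &= \frac{dy/dt}{dx/dt} = \frac{1/t}{e^{t}} = \frac{1}{t\, e^{t}}
\end{align*}
```

Notice that the answer comes out as a function of t, not of x. That is typical for parametric equations.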
\[
\frac{dy}{dx} = \frac{dy/dt}{dx/dt}
\]
We need to look at this a bit more carefully now, for use momentarily.
If we pull down the y’s, we get
\[
\frac{d}{dx}(y) = \frac{1}{dx/dt} \times \frac{d}{dt}(y)
\]
We can even get rid of the y’s, and get this:
\[
\frac{d}{dx} = \frac{1}{dx/dt} \times \frac{d}{dt}
\]
This tells us how a derivative with respect to x relates to a derivative with respect to t.
Let me throw in a bit of notation that fits here that would be confusing in any other setting.
Mugsy: I’m not optimistic here.
Remember that I said that physicists tend to use a prime to mean a derivative with respect to x, and a dot over a variable to
indicate a derivative with respect to time. In that notation, the last equation becomes:
\[
(\;)' = \frac{1}{dx/dt} \times \dot{(\;)} = \frac{\dot{(\;)}}{dx/dt}
\]
This is one reason that only physicists use this awkward notation. It’s handy, but you have to be careful. Fortunately, you
shouldn’t have to worry about this unless you are planning on taking a bunch of physics.
Mugsy: Hmm. Better than I feared.
There is an extension of this change-of-variables rule to multiple independent variables, but that will have to wait until
we can deal with functions that have multiple-variable inputs (partial derivatives).
Mugsy: I can hardly wait.
Homework #19
Exercises.
1. Parametric equations have graphs that are more general than the graphs of regular functions. To convince you of this,
show how to convert y = f (x) into parametric equations. [Hint: Just let t = x. What would the equation for y be, in
terms of t? Yes, it is very easy.]
2. When we were doing differentials and parametric equations, we were using very general (unspecified) functions.
In this problem, I want us to work through a specific example, using Maple if you want. (I’ll give you the Maple
commands as an incentive to use it.) Let’s start with $x = t^2 - 3t$ and $y = t^3 + 4t$. Let’s find the slope of the tangent
line at (“when” might be a better term than “at,” since t is viewed as time) t = 2. (It is possible in this case to find y
as an explicit function of x. We will do that to check our answer, but only at the end.) On Maple, this becomes
> eq1 := t^2 - 3*t:
> eq2 := t^3 + 4*t:
Be sure to change the colons at the end to semicolons so that you can see what the values are. This same comment
holds throughout the rest of this problem.
(a) Where is the object at t = 2? (That is, what are the coordinates, (x, y), of the point when t = 2?) To do this on
Maple, use
> subs( t=2, eq1 ):
> subs( t=2, eq2 ):
(b) Next we find dx and dy as functions of t and dt. (Note that dt really needs to be considered as a separate,
independent variable. The size of the wiggle is not set by any other variable.) Maple does not use differentials
(actually, it does, but in a sufficiently more sophisticated way that it is not useful for us).
> dx := diff(eq1, t) * dt:
> dy := diff(eq2, t) * dt:
Note that the dt’s cancel, so you don’t need them here.
(e) To check this answer, we need to solve the parametric equations. From $x = t^2 - 3t$, we get that
$t = \frac{3}{2} \pm \frac{1}{2}\sqrt{9 + 4x}$. You can verify this on Maple by using
> solve( x=eq1, t ):
You get two values listed, and you want the value for t that is greater than 3/2. Maple will list things in different
orders, for no clear reason. Pick the one that gives $t = \frac{3}{2} + \frac{1}{2}\sqrt{9 + 4x}$. Plug that into $y = t^3 + 4t$. What formula
do you get for y as an explicit function of x? You can do this on Maple by using (immediately after the previous
result)
> simplify( subs( t=%[1], eq2) ):
(Go back and use simplify(subs(t=%[2], eq2)); if the expression you end up with has any minus signs.
You picked the wrong one!) Note that the %[1] picks out the first of the two expressions resulting from the
solve. That’s the one we want (positive square root) this time. (Different runs of Maple might make the second
one have the positive square root. In that case, use t=%[2].) The simplify just makes the answer neater.
(f) Differentiate the function for y in the previous part, to get dy/dx as a function of x. Do this on Maple by using
(immediately after the previous result)
> simplify( diff(%, x) ):
(g) The value of x when t = 2 was determined in the first part of the problem. Plug that value of x into the formula
for dy/dx from the previous part and get the slope of the tangent line at the point where t = 2. On Maple, this
is done by using (immediately after the previous result)
> simplify( subs( x=subs(t=2,eq1), % ) ):
Compare to the answer using the parametric equations formula. Which way (by parametric equations formula
or by solving for y explicitly) would you rather do this, even if you have Maple around?
All of those are, by the way, ways of writing the second derivative. The third and higher derivatives follow in parallel with
the second derivative.
Acceleration.
Acceleration is a = v̇ = s̈, in the style of notation given earlier, where the last of these is read “s double dot.” The comment
about it being difficult to distinguish between dots and random blotches applies even more here. I prefer to use
$a = dv/dt = d^2 s/dt^2$, that is, Leibniz’s notation.
Second derivatives, then, tell you how fast the velocity (or rate of change under consideration) is changing. For example,
the fact that the world’s population is growing is not, by itself, cause for much concern. The rate of change of population
is positive. But the rate of change is itself increasing, and that could become serious. The population of the world is
accelerating.
You rarely encounter derivatives higher than the second. (We will hit a total of one in the applications that we do!
Second derivatives, on the other hand, abound.) The reason that second derivatives are typically as far as you need to go
is that Newton’s law says that F = m a = m s̈, and that only involves second derivatives. Third derivatives can occur in
quantum mechanics, but are rare even there. This is good because dots tend to blur, fade out, or run together with more
than two.
Concavity.
Just as we had that $y' > 0$ meant that the function is increasing and the graph is rising (going up to the right), and $y' < 0$
meant that the function is decreasing and the graph is falling (going down to the right), we can look at the sign of the second
derivative. When $y'' > 0$, that means that $(y')' > 0$, or that $y'$ is increasing. That means that the slope of the tangent line
is increasing. A few diagrams will convince you that this means that the tangent line is turning counterclockwise, and the
curve is above its tangent line. This situation is called concave up, or bending up.
When $y'' < 0$, that means that $(y')' < 0$, or that $y'$ is decreasing. That means that the slope of the tangent line is
decreasing. A few diagrams will convince you that this means that the tangent line is turning clockwise and that the curve
is below its tangent line. This situation is called concave down, or bending down.
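In place of a few diagrams, a few numbers will also do. The sketch below is in Python (my assumption), and the function $y = x^3$ is a hypothetical example I picked because $y'' = 6x$ changes sign at 0. It measures how far the curve sits above or below its tangent line just to either side of the point of tangency.

```python
def f(x):
    return x**3            # hypothetical example; y'' = 6x changes sign at x = 0

def f_prime(x):
    return 3 * x**2

def f_double_prime(x):
    return 6 * x

def height_above_tangent(a, x):
    """Height of the curve at x minus height of the tangent line drawn at a."""
    tangent_height = f(a) + f_prime(a) * (x - a)
    return f(x) - tangent_height

for a in (-1.0, 1.0):
    gaps = [height_above_tangent(a, a + step) for step in (-0.1, 0.1)]
    print(f"a = {a:+.1f}  y'' = {f_double_prime(a):+.1f}  curve minus tangent nearby: {gaps}")
```

Where $y''$ is positive both gaps come out positive (curve above its tangent line), and where $y''$ is negative both come out negative.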
One common, and horrid, terminology for concave up and down is “spilling water” or “holding water.” This is hope-
lessly deceptive (as a homework problem shows). Nevertheless, many calculus books have propagated this terminology.
1.6.3 Calculations.
Of course, you are not content simply to know what second (and higher) derivatives mean, you have to know how to find
them. Fortunately, it is easy. We already know.
How to do it (simple).
Finding a second or third derivative requires nothing more than successive differentiations. Keep taking derivatives until
you have taken the right number of them.
Dudley: That’s all?
Albert: Yes.
This sounds easy, and it is for polynomials and a few other simple functions. For some functions (even ones as “easy”
as tan x), this rapidly becomes a nightmare. But Maple is happy to do this for you. (I’ll show how to do this on Maple in a
moment.)
The real problem is with compositions, which require the chain rule. The first derivative causes a product (the derivative
of the outside times the derivative of the inside), and after that you have the product rule as well as the chain rule. It gets
messy fast.
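Here is a small sketch of that pattern, using the function $y = e^{x^2}$. It is a hypothetical example of mine, not one of the examples to be worked in class. The first derivative needs only the chain rule; the second needs the product rule with the chain rule inside it.

```latex
\begin{align*}
y   &= e^{x^2} \\
y'  &= e^{x^2} \cdot 2x
      && \text{(derivative of the outside times derivative of the inside)} \\
y'' &= \left(e^{x^2} \cdot 2x\right) \cdot 2x + e^{x^2} \cdot 2
      && \text{(product rule, with the chain rule on the first factor)} \\
    &= \left(4x^2 + 2\right) e^{x^2}
\end{align*}
```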
Again, and it gets harder as you proceed, “all” you need to do is not panic, but apply the rules for derivatives to
successively smaller pieces until you’ve conquered it.
\[
\frac{p(x)}{q(x)} = p(x)\, (q(x))^{-1}
\]
just as we did before. This puts you into dealing with chain rules, but for the second derivative and beyond, you will find
cancellations that make life vastly simpler in this form. Higher derivatives often produce multiple terms that are identical
(except perhaps for the coefficient). This permits simplifying the product rule for higher derivatives by combining like
terms. Do that!
Examples.
There will be numerous examples given in class, done both by hand and by Maple. Here they are. Find the specified
derivatives of these functions.
The second derivative of $x^6$
The third derivative of $x^5 - 2x^4 + 4x^3 - 8x^2 + 9 - 18$
The second derivative of $\sin(x^2)$
The sixth derivative of $x^3$
The third derivative of $\tan x$
The five-thousandth derivative of $e^x$
\[
\frac{d^2 y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right)
\]
but dy/dx as worked out from parametric equations is a function of t and not a function of x. Therefore the next derivative
being with respect to x is awkward. But (and you guessed it) the chain rule (the most important rule in calculus) comes
to the rescue. We need to change the variable of differentiation. We can find d/dt(dy/dx), since dy/dx will be given as a
function of t. To convert to a d/dx(dy/dx), we need something earlier, namely
\[
\frac{d}{dx} = \frac{1}{dx/dt}\, \frac{d}{dt}
\]
where we “apply” both sides to dy/dx. The left hand side is then the second derivative (which we want), and the right hand
side is something that we can figure out.
Let me do an example. Suppose
\[
x = t \sin t, \qquad y = t \cos t
\]
Then we find dy/dx by dy/dx = (dy/dt)/(dx/dt), which in this case gives
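\begin{align}
\frac{dy}{dx} &= \frac{dy/dt}{dx/dt} \tag{1.117} \\
&= \frac{1 \times \cos t + t \times (-\sin t)}{1 \times \sin t + t \times \cos t} \tag{1.118} \\
&= \frac{\cos t - t \sin t}{\sin t + t \cos t} \tag{1.119}
\end{align}
This is the first derivative. The second derivative is obtained by using the formula given earlier. You get
\begin{align}
\frac{d^2 y}{dx^2} &= \frac{d}{dx}\left(\frac{dy}{dx}\right) \tag{1.120} \\
&= \frac{d}{dx}\left(\frac{\cos t - t \sin t}{\sin t + t \cos t}\right) \tag{1.121} \\
&= \frac{1}{dx/dt}\, \frac{d}{dt}\left(\frac{\cos t - t \sin t}{\sin t + t \cos t}\right) \tag{1.122} \\
&= \frac{1}{\sin t + t \cos t}\, \frac{d}{dt}\left(\frac{\cos t - t \sin t}{\sin t + t \cos t}\right) \tag{1.123} \\
&= \frac{1}{\sin t + t \cos t} \times \tag{1.124} \\
&\qquad \frac{(\sin t + t \cos t)\bigl(-\sin t - (\sin t + t \cos t)\bigr) - (\cos t - t \sin t)\bigl(\cos t + (\cos t + t(-\sin t))\bigr)}{(\sin t + t \cos t)^2} \tag{1.125} \\
&\;\;\vdots \quad \text{Thank heavens for Maple\ldots} \tag{1.126} \\
&= \frac{-(t^2 + 2)}{(\sin t + t \cos t)^3} \tag{1.127}
\end{align}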
A few comments on this. First note that you don’t just differentiate dy/dx with respect to t and stop. You have to divide
by the dx/dt also in order to compensate for the fact that you are differentiating with respect to t (which is easy, due to the
variable in dy/dx) and not with respect to x (which is the derivative you want for $d^2 y/dx^2$).
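If you would rather not take Maple's word for the final answer, the recipe itself can be checked numerically: estimate d/dt of dy/dx with a small difference quotient, divide by dx/dt, and compare with the closed form. The sketch below is in Python (my assumption; the function names are made up), and t = 1 is an arbitrary test value.

```python
import math

def dx_dt(t):
    return math.sin(t) + t * math.cos(t)

def dy_dx(t):
    # First derivative from the parametric formula: (dy/dt) / (dx/dt).
    return (math.cos(t) - t * math.sin(t)) / (math.sin(t) + t * math.cos(t))

def second_derivative_numeric(t, h=1e-5):
    """Estimate d/dt (dy/dx) by a central difference, then divide by dx/dt."""
    d_dt_of_slope = (dy_dx(t + h) - dy_dx(t - h)) / (2 * h)
    return d_dt_of_slope / dx_dt(t)

def second_derivative_closed_form(t):
    return -(t * t + 2) / (math.sin(t) + t * math.cos(t)) ** 3

t = 1.0
print(second_derivative_numeric(t), second_derivative_closed_form(t))
```

Leaving out the division by dx/dt gives a noticeably different number, which is exactly the mistake the comment above warns about.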
It becomes obvious that higher-order derivatives get messy fast. The third derivative would be calculated by
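\begin{align}
\frac{d^3 y}{dx^3} &= \frac{d}{dx}\left(\frac{d^2 y}{dx^2}\right) \tag{1.128} \\
&= \frac{1}{dx/dt}\, \frac{d}{dt}\left(\frac{d^2 y}{dx^2}\right) \tag{1.129}
\end{align}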
Homework #20
Exercises.
1. In this problem, we attack the idea that concave up is the same as “holds water.”
(a) Draw a reasonably accurate graph of y = 3 x + sin x for 0 ≤ x ≤ 4π. This is most easily accomplished using
Maple; the alternative is to plot lots of points. Also, to get the angles to look right, you will have to add a
command scaling=constrained to the plot command.
(b) Calculate $y''$. Determine where it is positive and negative for 0 ≤ x ≤ 4π. (That is, give the x’s where $y'' \ge 0$
and the other x’s where $y'' \le 0$.)
(c) Can you see any reason from the graph to say that any place on the curve would “hold water?” What would
happen if you had something of that shape and poured water on it? Would it “hold water” anywhere?
2. In this problem, we investigate higher-order derivatives of polynomials.
(a) How many derivatives do you need to take of a linear function, f (x) = a x + b, before you get 0?
(b) How many derivatives do you need to take of a quadratic function, $f(x) = a x^2 + b x + c$, before you get 0?
(c) How many derivatives do you need to take of a cubic function, $f(x) = a x^3 + b x^2 + c x + d$, before you get 0?
(d) Look for a pattern in the preceding three parts, and make a guess about how many derivatives you need to take
of a general nth-degree polynomial before you get 0.
(e) What happens if you take more than the minimum number of derivatives? For example, if four derivatives of a
function give 0, what is the result of seven derivatives of that function?
3. For this problem, use $y = e^{-x} \cos(x \sqrt{7})$. Maple will be a big help.
(a) Find $dy/dx$ and $d^2 y/dx^2$. This is not a pretty second derivative.
(b) Plug these and the original function y into the expression $d^2 y/dx^2 + 2\, dy/dx + 8 y$, and show that it
reduces to 0. (The terminology is that $y = e^{-x} \cos(x \sqrt{7})$ is a solution of the differential equation
$d^2 y/dx^2 + 2\, dy/dx + 8 y = 0$.)
4. When you want to evaluate most limits, do the following steps in order:
(a) Plug in the limiting value for the variable. If you get 0/0, proceed to the next step. Otherwise you are done. If
you got a regular number, that’s the answer to the limit. If not, the limit does not exist.
(b) Factor the top and bottom of the fraction and reduce common factors. (If the limit is as x approaches a, there
will be factors of x − a in both the top and bottom. That’s the one you really need to get rid of.) Go back to the
first step.
(c) The exception to this procedure involves limits that you can’t easily factor. In that case, some ingenuity needs
to be applied. But the best solution is to wait for L’Hôpital’s rule, in the next chapter.
5. The basic rules for derivatives and the derivatives of the basic formulas are:
General patterns:
\[ (f(x) \pm g(x))' = f'(x) \pm g'(x) \qquad (f(x)\cdot g(x))' = f'(x)\cdot g(x) + f(x)\cdot g'(x) \]
\[ (c\,f(x))' = c\,f'(x) \qquad \left(\frac{f(x)}{g(x)}\right)' = \frac{g(x)\cdot f'(x) - f(x)\cdot g'(x)}{g(x)^2} \]
\[ (f(g(x)))' = f'(g(x))\cdot g'(x) \]
Specific functions:
Function Derivative Function Derivative
\(x^n\)      \(n x^{n-1}\)      \(e^x\)      \(e^x\)
\(|x|\)      \(|x|/x\)          \(\ln x\)    \(1/x\)
8. Parametric equations occur when both x and y are written in terms of a third variable, usually t. In that case, the
formula for the slope of a tangent line is \(\frac{dy}{dx} = \frac{dy/dt}{dx/dt}\). Note that it appears that the dt’s are being canceled. This is
the check that you have set up the chain rule correctly.
9. Higher-order derivatives:
(a) If the first derivative is the rate of change, the second derivative measures acceleration or how fast that rate of
change is changing. If the first derivative is slope of a tangent line, the second derivative represents concavity
(up or down).
(b) Finding higher-order derivatives involves nothing more than repeated application of the formulas for derivatives.
Suggestions for making the algebra easier:
• It is often easier to deal with quotients by converting them to products of terms with exponents.
• Whenever you can, simplify expressions before taking more derivatives. This involves combining terms,
and is especially useful when you have products of terms with exponents. In that case, factor out of the
whole expression all common factors raised to the lowest powers that occur.
• Higher-order derivatives of parametric equations always involve the chain rule, where you will have to
divide by dx/dt each time.
10. Maple commands that are relevant to this chapter:
d 2 (x + 1)2 d2y
d
ln ln ln(cos2 x)
(a) 2
(b) (c) for x = tan(1 + t) and y = sin(t + 1).
dx x−1 dx dx2
(d) \(\frac{d}{dx}\,(x^3 + 2x - 3)^{\sin x}\)
IV. (10 points) Use the Wiggle Magnification Formula to estimate f(6.3) if f(6) = 10 and f'(6) = −1.
V. (10 points) Derive the three-factor product rule by logarithmic differentiation as follows. Suppose \(f(x) = f_1(x) \times f_2(x) \times f_3(x)\).
Take the log of both sides of the equation and simplify the right-hand side. Differentiate both sides using the
chain rule. Solve for f'(x). Replace f(x) with \(f_1(x) \times f_2(x) \times f_3(x)\) and simplify to put the result in the usual
product rule form (see equation sheet). [Note: The equation sheet this time contained the formula
\((f_1(x) \times f_2(x) \times f_3(x))' = f_1'(x) \times f_2(x) \times f_3(x) + f_1(x) \times f_2'(x) \times f_3(x) + f_1(x) \times f_2(x) \times f_3'(x)\).]
VI. (10 points) Show that y = sin x + cos x satisfies the second-order differential equation y'' + y = 0 by substituting the
function into the left-hand side of the equation and showing that it is equal to zero.
IV. (10 pts) Use the Wiggle Magnification Formula to estimate f(7.1) if f(7) = 5 and f'(7) = 2.
V. Find the following derivatives.
(a) (10 pts) \(\frac{d}{dx}\,\ln(\ln(\ln(\ln(x^2 + 2x))))\)   (b) (10 pts) \(\frac{dy}{dx}\) where \(y = \pi^x\).
VI. (10 pts) Use the definition of arctanh x, \(\operatorname{Arctanh} x = \frac{1}{2}\ln\left(\frac{1+x}{1-x}\right)\), to confirm the derivative for this function given in
the equation sheet, i.e., take the derivative of this definition and do the algebra required to get it in the form on the equation
sheet.
IV. (10 pts) Use the Wiggle Magnification Formula to estimate f(4.3) if f(4) = −2 and f'(4) = 5.
V. (10 pts) Match the functions in graphs (A) – (D) with their derivatives (I) – (III). Note that two of the functions have the
same derivative.
[Figure: graphs (A) – (D) of the functions and graphs (I) – (III) of the derivatives.]
VI. (10 pts) Show that \(y(x) = C_1 + C_2 e^{3x}\) (\(C_1\) and \(C_2\) are constants) satisfies the second-order differential equation
\(\frac{d^2y}{dx^2} - 3\frac{dy}{dx} = 0\). If we’re additionally given that y(0) = 0 and y'(0) = 1, show that \(C_1 = -1/3\) and \(C_2 = 1/3\).
I. (10 points; 5 points each) Answer the following questions about the function \(f(x) = x^3 - 5x^2 + 2\).
(a) What is the equation of the secant line through x = −2 and x = 1? (b) What is the equation of the tangent line at
x = 0?
II. (20 points; 10 points each) Find the following limits.
(a) \(\lim_{x \to 4} \frac{x^2 - 2x - 8}{x^2 + 3x - 28}\)   (b) \(\lim_{x \to 0} \frac{\sqrt{x} + x}{\sqrt{x+1}}\)
III. (30 points; 10 points each) Find the following derivatives.
(a) \(\frac{d}{dx}\left(x^3\cos x\right)\)   (b) \(\frac{d}{dt}\left(\frac{\operatorname{Arctan} t}{\ln(t^2+1)}\right)\)   (c) \(\frac{dy}{dx}\) for \(y = (1 + \ln x)^{(x^2)}\)
IV. (20 points; 10 points each) Find the indicated derivatives of the following functions.
(a) \(\frac{d^3}{dx^3}(\cos(4x))\)   (b) \(\frac{d^{564}}{dx^{564}}(\sin x)\) (Hint: You are obviously going to have to find a better way to do this than the direct
way. Write out the derivatives, look for a pattern, and figure out where 564 fits into that pattern.)
V. (15 points; 5 points each part) Give the answers to the following questions about the function defined parametrically by
x = Arctan t, \(y = e^{2t}\).
(a) What is the derivative, dy/dx, as a function of t? (b) What is the equation of the tangent line to the curve at the
point when t = 0? (c) What is the second derivative, \(d^2y/dx^2\), as a function of t?
VI. (10 points) A function occasionally encountered in later mathematics courses is Si(x), called the sine integral of x. Its
derivative is
\[ \frac{d}{dx}\left(\operatorname{Si}(x)\right) = \frac{\sin x}{x}. \]
Find \(\frac{d}{dx}\left|\operatorname{Si}(x^3)\right|\). [Caution: Note the absolute values.]
VII. (10 points) Show that \(y(x) = \frac{3e^{\sqrt{2}x}}{4} + \frac{3e^{-\sqrt{2}x}}{4} - \frac{1}{2}\) satisfies the second-order differential equation \(\frac{d^2y}{dx^2} - 2y = 1\) AND
the initial conditions y(0) = 1 and y'(0) = 0.
I. (10 points) Given \(f(x) = x^3 - 3x^2 + 2x + 2\), find the equation for the tangent line to the curve at x = 1.
II. (20 points; 10 points each) Find the following limits.
(a) \(\lim_{x \to 3} \frac{x^2 - 4x + 3}{x^2 + x - 12}\)   (b) \(\lim_{z \to 5} \frac{z^2 - 4z - 5}{z^2 - z - 20}\)
III. (50 points, as noted) Find the following derivatives. (Don’t simplify your final answers.)
(a) (10 points) \(\frac{d}{dx}\left(\frac{e^{3x}\sec(4x)}{x^2+x}\right)\)   (b) (10 points) \(\frac{d^2}{dx^2}\,\ln(5x^3 - 8x)\)
(c) (10 points) \(\frac{d}{dx}\left[(\ln x)^{x^4+x}\right]\)   (d) (20 points) \(\frac{d^2y}{dx^2}\) for x = Arctan t, y = sin t
IV. (10 points) Use the Wiggle Magnification Formula to estimate f(4.9) if f(5) = 8 and f'(5) = −4.
V. (15 points) Many of you may be thinking “Hey, we’ve never developed a formula for the derivative of \(u(x)^{v(x)}\)!” For this
problem, you get to derive such a formula and then check it.
(a) (8 points) Use logarithmic differentiation to find a formula for the derivative of \(u(x)^{v(x)}\). (Hint: after you panic, set
\(y = u(x)^{v(x)}\), take the ln of both sides and simplify the right-hand side. Then differentiate both sides like you normally
do for log. diff.) (b) (7 points) Apply the formula that you got to the function \(e^{\sin(x)}\) and check your answer with the
regular exponential-plus-chain-rule approach. That is, differentiate \(e^{\sin(x)}\) by the usual chain rule and plug u(x) = e and
v(x) = sin(x) into the formula you developed in part (a) of this problem.
For these tests, the last page contained this information. You can expect to see it on all tests for the rest of the year.
Occasionally, more material might appear also; that will be decided on each test, and will vary from year to year.
\[ (f(x) \pm g(x))' = f'(x) \pm g'(x) \qquad (f(x)\cdot g(x))' = f'(x)\cdot g(x) + f(x)\cdot g'(x) \]
\[ (c\,f(x))' = c\,f'(x) \qquad \left(\frac{f(x)}{g(x)}\right)' = \frac{g(x)\cdot f'(x) - f(x)\cdot g'(x)}{g(x)^2} \]
\[ (f(g(x)))' = f'(g(x))\cdot g'(x) \]
Specific functions:
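I. (a)
\[ f'(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} = \lim_{\Delta x \to 0} \frac{\left[2(x + \Delta x)^2 - (x + \Delta x) - 1\right] - \left[2x^2 - x - 1\right]}{\Delta x} \]
\[ = \lim_{\Delta x \to 0} \frac{(2x^2 + 4x\,\Delta x + 2(\Delta x)^2) - x - \Delta x - 1 - 2x^2 + x + 1}{\Delta x} = \lim_{\Delta x \to 0} \frac{4x\,\Delta x + 2(\Delta x)^2 - \Delta x}{\Delta x} \]
\[ = \lim_{\Delta x \to 0} \frac{\Delta x\,(4x + 2\,\Delta x - 1)}{\Delta x} = \lim_{\Delta x \to 0} (4x + 2\,\Delta x - 1) = 4x - 1. \]
(b) y = 11 x − 19 or y − 14 = 11 (x − 3).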
II. (a) 6/5 (b) 4
arctan(3 x7 ) 1 d2y d2y
III. (a) − sin(3 x2 + 2 x) (6 x + 2)2 + 6 cos(3 x2 + 2 x)
(b) 7 7 )2
21 x6 (c) 2
= t 2 ; 2 = e at
arctan(3 x ) 1 + (3 x dx dx
2 1 3 cos x 1 − sin(x2 ) 2 x 1
dy 1 1
x=1 (d) =y + − −
dx 5 x 5 sin x 5 cos(x2 ) 5 arctanh(x) 1 − x2
IV. y ≈ yw = 5.2
V. (a) \(\frac{1}{\ln(\ln(\ln(x^2+2x)))}\cdot\frac{1}{\ln(\ln(x^2+2x))}\cdot\frac{1}{\ln(x^2+2x)}\cdot\frac{1}{x^2+2x}\,(2x+2)\)   (b) \(\pi^x \ln(\pi)\)
I. (a) 4. (b) 2. This is not equal to the answer in part (a) since that slope is of a secant line through (0, −1), while this
line is tangent to the curve at (0, −1).
II. (a) 6 (b) −10
III. (a) \(\frac{1}{|\sin(x^2)|}\cdot\frac{\sin(x^2)}{|\sin(x^2)|}\,\cos(x^2)\,(2x)\)   (b) \(\frac{\sqrt{3x+2}\,(4x-3) - (2x^2-3x+1)\,(1/2)\,(3x+2)^{-1/2}\,(3)}{3x+2}\)
(c) \(\frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{2\sec(2t)\tan(2t)}{2\cos(2t)} = \sec^2(2t)\tan(2t)\) since 1/cos(2t) = sec(2t).
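\[
\begin{aligned}
\frac{d^2y}{dx^2} &= \frac{1}{dx/dt}\,\frac{d}{dt}\left(\frac{dy}{dx}\right) \\
&= \frac{1}{2\cos(2t)}\left[2\sec(2t)\,(\sec(2t)\tan(2t)\,(2)) \times \tan(2t) + \sec^2(2t) \times (\sec^2(2t)\,(2))\right] \\
&= \frac{1}{2}\sec(2t)\left[4\sec^2(2t)\tan^2(2t) + 2\sec^4(2t)\right] \\
&= 2\sec^3(2t)\tan^2(2t) + \sec^5(2t) \\
&= 2\,\frac{1}{\cos^3(2t)}\,\frac{\sin^2(2t)}{\cos^2(2t)} + \frac{1}{\cos^5(2t)} \\
&= \frac{2\sin^2(2t)}{\cos^5(2t)} + \frac{1}{\cos^5(2t)} \\
&= \frac{2\sin^2(2t) + 1}{\cos^5(2t)}
\end{aligned}
\]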
At x = 1/2, the value of t is found by x = sin(2t), or 1/2 = sin(2t), or π/6 = 2t, or t = π/12. Plugging that into \(d^2y/dx^2\)
gives \(\frac{d^2y}{dx^2} = \frac{2\sin^2(\pi/6) + 1}{\cos^5(\pi/6)}\), which is positive, since all parts of the fraction are positive. Therefore, the curve is concave
up.
x2 cos(x) tanh(3 x)
(d) y0 = 1
× 2 1x + cos1 x − sin x − sinh(x
1 2 1
sech2 (3 x) 3
7
sinh(x2 ) |tanh(3 x) |
× 7 2 ) cosh(x ) (2 x) − |tanh(3 x) | |tanh(3 x) |
IV. 6.3
V.
Use a process of elimination. On the left side of the y axis, one curve is f and the other is f'. If the upper curve is f', that
would mean that f is the lower curve. But then f' > 0, while f is decreasing, so f' < 0. That can’t happen, so the upper
curve is f and the lower curve is f'. On the right side of the y axis, both curves are positive, meaning f' > 0. That means f
is increasing, and the other curve is f'.
VI.
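\[
\begin{aligned}
\frac{d}{dx}\left[\operatorname{Arctanh} x\right] &= \frac{d}{dx}\left[\frac{1}{2}\ln\left(\frac{1+x}{1-x}\right)\right] \\
&= \frac{1}{2}\cdot\frac{1-x}{1+x}\cdot\frac{(1-x)(1) - (1+x)(-1)}{(1-x)^2} \\
&= \frac{1}{2}\cdot\frac{1-x}{1+x}\cdot\frac{2}{(1-x)^2} \\
&= \frac{1}{(1+x)(1-x)} \\
&= \frac{1}{1-x^2}
\end{aligned}
\]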
I. (a) y = 8 x − 10 (b) y = 2
II. (a) 6/11 (b) 0
III. (a) \(3x^2\cos x - x^3\sin x\)   (b) \(\frac{\ln(t^2+1)/(t^2+1) - 2t\operatorname{Arctan}(t)/(t^2+1)}{(\ln(t^2+1))^2}\)
(c) \((1 + \ln x)^{x^2}\left(2x\ln(1 + \ln x) + x/(1 + \ln x)\right)\)
IV. (a) 64 sin(4 x) (b) sin x
V. (a) \(2e^{2t}(1 + t^2)\)   (b) y − 1 = 2 x   (c) \(\left[4e^{2t}(1 + t^2) + 4t\,e^{2t}\right](1 + t^2)\)
VI. \(\frac{|\operatorname{Si}(x^3)|}{\operatorname{Si}(x^3)}\cdot\frac{\sin(x^3)}{x^3}\,(3x^2)\)
I. mtan = 2 x − 3; (y − 2) = 3 (x − 3)
II. (a) 4 (b) 8
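III. (a) \(3\cosh(3x)\cot(2x^2) + \sinh(3x)\left(-\csc^2(2x^2)\,(4x)\right)\)
(b) \(\frac{\operatorname{Arcsin}(2x)\cos(\ln(x^2))\,(1/x^2)\,(2x) - \sin(\ln(x^2))\,\left(1/\sqrt{1-(2x)^2}\right)(2)}{(\operatorname{Arcsin}(2x))^2}\)
(c) \(y' = y\left[(1/x^2)\,(2x)\ln(\operatorname{Arcsec}(3x)) + \ln(x^2)\,(1/\operatorname{Arcsec}(3x))\,\frac{1}{|3x|\sqrt{(3x)^2-1}}\,(3)\right]\)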
2 2 2
IV. (a) dy/dx = et , d 2 y/dx2 = et , d 3 y/dx3 = et (b) e2
V. 3.4
VI. 1
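VII. \(dy/dx = \frac{3\sqrt{2}}{4}e^{\sqrt{2}x} - \frac{3\sqrt{2}}{4}e^{-\sqrt{2}x}\), \(d^2y/dx^2 = \frac{3}{2}e^{\sqrt{2}x} + \frac{3}{2}e^{-\sqrt{2}x}\), so
\(y'' - 2y = \left(\frac{3}{2}e^{\sqrt{2}x} + \frac{3}{2}e^{-\sqrt{2}x}\right) - 2\left(\frac{3e^{\sqrt{2}x}}{4} + \frac{3e^{-\sqrt{2}x}}{4} - \frac{1}{2}\right) = 1\),
y(0) = (3/4) + (3/4) − (1/2) = 1, and \(y'(0) = \frac{3\sqrt{2}}{4} - \frac{3\sqrt{2}}{4} = 0\).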
I. (y − 2) = (−1) (x − 1)
II. (a) 2/7 (b) 2/3
III. (a) \(\frac{(x^2+x)\left[3e^{3x}\sec(4x) + 4e^{3x}\sec(4x)\tan(4x)\right] - (2x+1)\left(e^{3x}\sec(4x)\right)}{(x^2+x)^2}\)
(b) \(\frac{(5x^3-8x)\,(30x) - (15x^2-8)\,(15x^2-8)}{(5x^3-8x)^2}\)
(c) \(\left[(\ln x)^{x^4+x}\right]\left[(4x^3+1)\ln(\ln x) + \frac{x^4+x}{x\ln x}\right]\)
(d) \(\frac{2t\cos t - (1+t^2)\sin t}{1/(1+t^2)}\)
IV. f (4.9) ≈ 8.4
V. (a) \(y = u(x)^{v(x)}\) so ln y = v(x) ln(u(x)), and then (1/y) y' = v'(x) ln(u(x)) + v(x) (1/u(x)) u'(x), or
\(y' = u(x)^{v(x)}\left(v'(x)\ln(u(x)) + v(x)\,(u'(x)/u(x))\right)\). (b) \(y' = \cos x\; e^{\sin(x)}\) both ways.
VI. y' = −6 sin(2 x) + 8 cos(2 x), y'' = −12 cos(2 x) − 16 sin(2 x). So, y'' + 4 y = [−12 cos(2 x) − 16 sin(2 x)] + 4 [3 cos(2 x) +
4 sin(2 x)] = −12 cos(2 x) − 16 sin(2 x) + 12 cos(2 x) + 16 sin(2 x) = 0. Also, y(0) = 3 cos(0) + 4 sin(0) = 3 + 0 = 3 and
y'(0) = −6 sin(0) + 8 cos(0) = 0 + 8 = 8.
d2y
1 d dy 1
Then 2
= = (− sinh(2t)).
dx dx/dt dt dx −2 sech(2t)
tanh(2t) (2) 3
(d) If \(y = (2x^4 + 4x^2)^{\tan(x)}\), then \(\frac{dy}{dx} = y\left(\sec^2 x\,\ln(2x^4 + 4x^2) + \tan x\,\frac{8x^3 + 8x}{2x^4 + 4x^2}\right)\).
IV. First, \(y' = -10e^{-5x} + 10e^{2x}\), \(y'' = 50e^{-5x} + 20e^{2x}\), and \(y''' = -250e^{-5x} + 40e^{2x}\). Then
\(y''' + 3y'' - 10y' = -250e^{-5x} + 40e^{2x} + 3(50e^{-5x} + 20e^{2x}) - 10(-10e^{-5x} + 10e^{2x})\)
\(= -250e^{-5x} + 40e^{2x} + 150e^{-5x} + 60e^{2x} + 100e^{-5x} - 100e^{2x} = (0)e^{-5x} + (0)e^{2x} = 0\).
That shows that the function is a solution of the differential equation part of the IVP. Next, we
have to show that the function also satisfies the initial conditions. Using \(e^0 = 1\), we get \(y(0) = 2e^0 + 5e^0 = 7\),
\(y'(0) = -10e^0 + 10e^0 = 0\), and \(y''(0) = 50e^0 + 20e^0 = 70\).
Chapter 2
Finance
2.1 Introduction.
2.1.1 Seems an unusual topic for calculus, but isn’t.
In this chapter, we look at several applications that can all fall under the general umbrella of finance.
Mugsy: Finance? In a calculus course? Is this a joke?
Albert: I doubt it. There are a lot of possible topics that could fit here. On the other hand, I’ve never encountered
any other calculus course that had a chapter on finance.
Remember that one of my intentions is to provide a range of different topics all of which use calculus in less-than-common
ways.
Continuous compounding, and indeterminate forms. We started continuous compounding in differentials, but we
come at it again from a different angle. Our work there on indeterminate forms will enable us to deal with messy limits (the
“0/0” type of thing) very easily using calculus (L’Hôpital’s rule).
Inventory control, and max/min problems. This is a classic application of how to minimize inventory + shipping costs.
It isn’t too hard, but the types of problems that can occur here are notoriously difficult for calculus students.
Dudley: Does that mean this is one of those chapters that we’ll cry all the way through?
Albert: Not really. Traditional calculus courses deal with this topic differently. These types of problems come at you
with no background or framework for solving them. They are stated in English sentences, and the usual challenge is
that you have to then translate the information into equations. This whole course is designed to give a big picture
for that type of translation. Hence there is no undue emphasis on such work in this section, as in a typical calculus
course.
Elasticity and relative changes. This topic is straight from microeconomics. We do it from the calculus point of view,
which helps explain several things that are awkward to say without calculus. It shows the usefulness of differentials.
Elasticity also introduces the idea of relative change (error).
Basic notation and terminology. The notation and terminology set up here will be used for all following sections in this
chapter, and even once in a later chapter.
More in each section. In each of the following sections are more formulas that are specific to that method of getting
interest.
There are a lot of formulas, but I am not going to expect you to memorize them. I will give them to you on the test.
Dudley: I was getting worried there. Too many variables, and my mind simply locks up.
Albert: The meanings of the variables will not be given to you. Just the formulas.
Simple interest.
Simple interest is, well, the simplest.
Mugsy: Duh.
In this situation, you get interest on the amount of principal only, no matter how long you lend out the money.
I = P r t. “Interest equals principal times rate times time” is a common saying. It is debatable whether it is worth
remembering.
Simple interest gathers the least amount of interest at a fixed interest rate. We will compare simple interest to others as
we get them.
FVIF = 1 + r t. To get the new amount you will have (F), you take the principal (P) and add in the interest (I = P r t), and
you get F = P + P r t = P(1 + r t). So, the FVIF is the factor that multiplies P, to give F, namely
FVIF = 1 + r t
Remember, this is for simple interest only. Different interest schemes have different FVIF’s.
Example. Depositing $2500 in a simple interest account for 9 months at 8% annual interest generates a future value of
F = P × FVIF (2.1)
= ($2500)(1 + (0.08) × 3/4) (2.2)
= $2650 (2.3)
where we used that 9 months is 3/4 of a year. (Remember that the easiest units to use are years, even if the units aren’t
given to you that way!)
You can get the interest either by I = F − P = $2650 − $2500 = $150 (the amount that the account increases is the
interest), or by I = P r t = ($2500)(0.08)(3/4) = $150.
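The arithmetic in this example can be checked with a few lines of code. What follows is a minimal sketch in Python (an assumption on my part; the course itself uses Maple), and the function name `simple_interest_fvif` is made up for illustration.

```python
def simple_interest_fvif(r, t):
    """Future value interest factor for simple interest: FVIF = 1 + r t."""
    return 1 + r * t

P = 2500.0        # principal, in dollars
r = 0.08          # annual interest rate (8%)
t = 9 / 12        # 9 months, converted to years

F = P * simple_interest_fvif(r, t)   # future value, F = P x FVIF
I = F - P                            # interest earned

print(f"F = ${F:.2f}")   # compare with equation (2.3)
print(f"I = ${I:.2f}")   # should agree with I = P r t
```

Note that the conversion of 9 months to years happens before the formula is used, just as the example recommends.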
Compound interest.
In compound interest, the interest that is accumulated gets added to the principal before the next amount of interest is
figured. This causes interest to accumulate faster by compounding than by simple interest. You are getting interest on more
money.
It is possible to treat compound interest this way: It is simple interest where the principal keeps changing at each com-
pounding period. It’s one of the things that I try to cover in the Concepts course. Another result is that if the compounding
period equals the length of time that you keep the money in the account, then compound interest is exactly the same as
simple interest. You have to have a longer period of time for the compounding effect to appear. A homework question deals
with this.
Formulas. A new set of items appears here. You have to keep these straight, or you will get overwhelmed by the number
of different variables. (Careful, Dudley.)
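Variable    Meaning
m           number of compounding periods per year
k           interest rate per compounding period (k = r/m)
n           total number of compounding periods (n = m t)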
These will usually be camouflaged by stating them in English. Quarterly compounding means m = 4; monthly compounding
means m = 12; semi-annual compounding means m = 2; daily compounding means m = 360 (banks often use a 360-day
year); annual compounding means m = 1. Then the value of k is calculated by k = r/m. Don’t use just r! And n is calculated
by n = mt, where t must be in years before you use the formulas.
Mugsy: Am I getting slow, or does this make sense and I can’t see it?
Albert: The formulas can throw you. You are probably best off memorizing what the variables mean. For example, if
you are compounding monthly, and you have money on deposit for 3 years, how many compounding periods are there?
Mugsy: Is that the same as asking how many months there are in 3 years?
Albert: Yes.
Mugsy: Oh, that’s easy. Lessee, 3 times 12 is . . . 36.
Dudley: Hey, not bad. You didn’t even use your fingers.
Albert: So, the value of n, the number of compounding periods, is 36. And if the interest rate is 5% (per year, but
that is almost never stated), then how much do you get when splitting it up into equal monthly amounts?
Mugsy: Would it be 5/12%?
Albert: Exactly.
Dudley: Wow. Two in a row.
Mugsy: That is easier. Thanks, Al. Dudley, quiet.
Looking at compound interest as a succession of accumulating simple interest problems, we can get the formula for
compound interest easily. The critical items to remember are that you get the amount at the end of an interest period by
multiplying the amount at the beginning of the period by (1 + k), and the amount at the end of one period is the amount at
the beginning of the next.
Period    Amount at beginning    Amount at end
1         \(P\)                  \(P \times (1 + k) = P(1 + k)\)
2         \(P(1 + k)\)           \(P(1 + k) \times (1 + k) = P(1 + k)^2\)
3         \(P(1 + k)^2\)         \(P(1 + k)^2 \times (1 + k) = P(1 + k)^3\)
4         \(P(1 + k)^3\)         \(P(1 + k)^3 \times (1 + k) = P(1 + k)^4\)
With a little bit of thinking, you can see that the amount at the end of the nth period is just \(P(1 + k)^n\). In that case, the FVIF
for compound interest is
\[ \text{FVIF} = (1 + k)^n. \]
Again, this is for compound interest only.
Example. Take $2500, and invest it at 9% interest compounded monthly for 4 years. Then 9% = 0.09 = r, and t = 4,
just as before. The “monthly” means m = 12, so k = r/m = 0.09/12 = 0.0075, and n = t m = (4)(12) = 48. Then the
future value interest factor is \(\text{FVIF} = (1 + k)^n = (1 + 0.0075)^{48} = 1.431405\). The future value is then F = P × FVIF =
($2500)(1.431405) = $3578.51.
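The same example can be worked by machine. This is a hedged Python sketch (Python is my choice, not the course’s), with `compound_fvif` as a hypothetical helper name; it simply turns r, m, and t into k and n and then applies the FVIF formula.

```python
def compound_fvif(r, m, t):
    """FVIF for compound interest: (1 + k)^n with k = r/m and n = m t."""
    k = r / m      # interest rate per compounding period
    n = m * t      # total number of compounding periods
    return (1 + k) ** n

P = 2500.0
fvif = compound_fvif(r=0.09, m=12, t=4)   # 9% compounded monthly for 4 years
F = P * fvif

print(f"FVIF = {fvif:.6f}")
print(f"F = ${F:.2f}")
```

Passing r and m separately, rather than k, makes it harder to commit the “don’t use just r” mistake mentioned earlier.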
There are a few comments that need to be made about the number of decimal places to keep and use. Final answers
should be in dollars rounded to two decimal places (cents). Numbers before that should have between 5 and 8 decimal
places, depending on the size of the principal and how many decimal places you need to get the future value accurately.
Be particularly careful to keep plenty (actually, all you can) of decimal places in (1 + k). The FVIF is very sensitive to
round-off errors in that term. For example, if you have 5% interest compounded monthly, you’d be much closer using
k = .05/12 = .004166666667 than k = .0042.
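To see how much the rounding of k can matter, here is a small Python sketch. The principal ($2500) and the term (10 years) are assumptions chosen only for illustration; they do not come from the text.

```python
def future_value(P, k, n):
    """Future value after n compounding periods at rate k per period."""
    return P * (1 + k) ** n

P = 2500.0          # hypothetical principal
n = 12 * 10         # hypothetical term: monthly compounding for 10 years

k_full = 0.05 / 12  # all the digits the machine can hold
k_rounded = 0.0042  # the same rate rounded to four decimal places

F_full = future_value(P, k_full, n)
F_rounded = future_value(P, k_rounded, n)

print(f"full precision k: ${F_full:.2f}")
print(f"rounded k:        ${F_rounded:.2f}")
print(f"difference:       ${F_rounded - F_full:.2f}")
```

The small error in k is raised to the nth power along with everything else, so the difference shows up in dollars and not just in cents.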
Also, depending on the number of decimal places you keep, you will get slightly different answers from me or from
others. If it is close, I will not quibble.
Dudley: Does that mean you won’t take off points? That’s what I’m interested in.
Yes, that’s what I mean, Dudley.
Continuous compounding.
In this situation, we need to take shorter and shorter interest periods, which is the same as making m bigger and bigger.
\(\text{FVIF} = \lim_{m\to\infty}(1 + k)^n\). What we want is a limit, where m goes to infinity, which is written ∞. But please be careful;
there is no number ∞, or put another way, ∞ is not a number, and you can’t treat it as such. You can’t, for example, plug ∞
into a formula and get anything realistic out. This means that we had better have another way of evaluating limits, since our
usual approach is to plug in the limiting value. Again, ∞ is a symbol that can only occur in a limit, and there it just means
that some variable is getting gigantic, and we are asking if the expression settles down to a value in the process.
The FVIF becomes \(1^\infty\). As m → ∞, k = r/m → 0 and n = mt → ∞. As the period gets shorter and shorter, the amount of
interest per period goes to zero, but the number of interest periods goes to infinity. The net result is that the FVIF looks like
\(1^\infty\).
This is a new form! It is like 0/0, an indeterminate form, but it is less obvious. We need to look at it.
Dudley: Isn’t 1 to any power equal to 1? So why is this an indeterminate form? Isn’t it equal to 1?
Albert: Yes, 1 to any power is 1. But isn’t 0/(anything) equal to 0?
Dudley: Yes.
Albert: So why isn’t 0/0 equal to 0? We’ve already seen that it can be anything.
Dudley: But this is different.
Albert: Not that much. Keep reading.
The problem is that the 1 in the \(1^\infty\) is a limit. For any specific value of n, the “1” is actually a bit larger than 1. Any number
larger than 1, when raised to large enough powers, gets large. (Try putting 1.0000001 in your calculator and then keep
squaring it. It overflows mighty fast.)
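The calculator experiment can also be run in code. This is a rough Python sketch (my choice of language, not the course’s); it relies on the fact that Python floats turn into `inf` when a multiplication overflows.

```python
x = 1.0000001
squarings = 0

# Repeated squaring produces x, x^2, x^4, x^8, ..., so the exponent doubles each time.
# A Python float becomes inf once a product is too large to represent.
while x != float("inf"):
    x = x * x
    squarings += 1

print(f"overflowed after {squarings} squarings")
```

A hand calculator has a smaller range than a Python float, so it will give up a little sooner, but the point is the same: a base only slightly larger than 1 does not stay near 1 once the exponent is allowed to grow.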
We discovered that the limit form 0/0 could be anything. The bottom tried to send the quotient off to infinity, while the
top tried to keep it at zero. This schizophrenic nature of 0/0 is why we called it an indeterminate form. The same is true
for \(1^\infty\). The 1 in the base tries to make the limit 1. The ∞ in the exponent tries to pull the limit to ∞. So, the result is a
tug-of-war characteristic of an indeterminate form. Let’s try an example with numbers first.
Example. Suppose r = 8% = 0.08, and t = 5 (years). Let’s grind out some values for \(\text{FVIF} = (1 + k)^n\) for different
values of m = number of compounding periods. The table is at the top of the next page. (This was done on Maple.
Calculators would work for the first few, but very rapidly would lose accuracy. Try it! I set Digits:=20; in order to keep
enough accuracy.)
It looks like there is a limit (even without the last line), and it is neither infinite (which the exponent tries to make it)
nor 1 (which the base would have). We need to explore this further.
m FVIF
1 1.469328
4 1.485947
12 1.489846
360 1.491758
365 1.491759
1000 1.491801
10000 1.491822
100000 1.491824
1000000 1.491825
1000000000 1.491825
∞ 1.491825
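The table can be regenerated with a short script. The following is a sketch in Python, not the Maple session the author used. Two things in it are my own assumptions: the use of `math.log1p` to limit the round-off that appears when k is tiny (this stands in for setting Digits to 20 in Maple), and the last line, which uses the standard closed form \(e^{rt}\) for the limit, a fact that the text has not derived at this point.

```python
import math

def compound_fvif(r, m, t):
    """(1 + r/m)^(m t), computed via log1p to limit round-off when r/m is tiny."""
    k = r / m
    n = m * t
    return math.exp(n * math.log1p(k))

r, t = 0.08, 5
for m in [1, 4, 12, 360, 365, 1000, 10**4, 10**5, 10**6, 10**9]:
    print(f"{m:>12d}  {compound_fvif(r, m, t):.6f}")

# Standard closed form for the m -> infinity limit (stated here, not derived).
print(f"{'limit':>12s}  {math.exp(r * t):.6f}")
```

Computing \((1 + k)^n\) directly with `**` works for the first several rows, but for very large m the sum 1 + k loses digits of k, which is the loss of accuracy the example warns about.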
Using derivatives to evaluate limits. We encountered an indeterminate form (namely 0/0) trying to define derivatives.
Having conquered derivatives, they now turn around and help us evaluate indeterminate forms.
There are a variety of ways to handle indeterminate forms, but L’Hôpital’s rule is the best. And with the \(1^\infty\) form,
L’Hôpital’s rule is basically your only hope.
If f(c) = g(c) = 0, then \(\lim_{x \to c} \frac{f(x)}{g(x)} = \lim_{x \to c} \frac{f'(x)}{g'(x)}\), provided the second limit exists.
Notice that the conditions you need to be able to use L’Hôpital’s rule are precisely the ones that you began to dread in
limits: when both the top and the bottom were zero when you plugged in the limiting value of the variable. So, factoring
the top and the bottom (the big problem that cropped up when the top or bottom wasn’t a polynomial) is not necessary. You
do have to differentiate, however.
Mugsy: Have we gained anything?
Albert: Let me put it this way. Would you rather factor a polynomial or differentiate it?
Mugsy: No question. I’d rather differentiate it.
Albert: There’s one gain. All that’s needed for L’Hôpital’s rule is differentiation. Another gain is that L’Hôpital’s rule
also works in cases that aren’t polynomials, where factoring isn’t even a possibility. Remember the messes we got into
when we were trying to find the derivatives of sin x and ln x?
Mugsy: Ugh. What a pain that was. Why didn’t the brilliant author put L’Hôpital’s rule early enough to be able to
use it then, and spare us (and him) that mess?
Albert: You need to do limits first, and then get derivatives, and then get L’Hôpital’s rule, which is what we did.
It would be hard to do derivatives first, and then do limits with L’Hôpital’s rule. Explaining derivatives without using
limit-like language–tangent lines, instantaneous velocities, and so on–means you can’t explain most of what derivatives
mean.
The rule also holds if f(x) and g(x) both go to infinity as x approaches c, and it also works in the case that c is infinity.
Zero and infinity are actually quite close.
Dudley: WHAT? Now I know he’s flipped.
Albert: He has a point. Do you want me to convince you?
Dudley: Will it hurt?
Mugsy: Can I leave until you’re done?
Albert: Not much and yes.
Mugsy: (leaving) Be back shortly.
Albert: Ok, Dudley. Suppose you have an infinite number of golf balls, numbered 1, 2, 3, 4, etc. At one hour before
noon, you put the balls numbered 1 to 10 in a row on the ground, and Mugsy takes the ball numbered 1. At a half
hour before noon, you continue the row by placing the balls numbered 11 to 20, and Mugsy takes the ball numbered
2. At one-third of an hour before noon, you continue the row by placing the balls numbered 21 to 30, and Mugsy
takes the ball numbered 3. Do you see the pattern? What’s next?
Dudley: At one-quarter of an hour before noon, I put the balls numbered 31 to 40 in the row, and Mugsy removes the
ball numbered 4.
Albert: Right. Now, as you move closer to noon, there is a major flurry of activity. When the dust settles, how many
balls are left in the row? Well, at one hour before noon, there are 9 balls, and at a half-hour before noon, there are
2 × 9 = 18 balls, and at a third of an hour before noon, there are 3 × 9 = 27 balls, and so on. You gain 9 golf balls in
the row, since you add 10 and Mugsy removes only one. How many golf balls will there be in the row once noon has
passed?
Dudley: No problem. An infinite number of them.
Albert: Really? Is the golf ball numbered 1 still in the row?
Dudley: No. Mugsy removed it at one hour before noon.
Albert: Is the golf ball numbered 2 still in the row?
Dudley: No. Mugsy removed it at a half hour before noon.
Albert: Is the golf ball numbered 3 still in the row?
Dudley: No. Mugsy removed it at a third of an hour before noon.
Albert: Is the golf ball numbered 250 still in the row?
Dudley: Hmm. I’m beginning to get this funny feeling. No. Mugsy removed that one, too, at 1/250 of an hour before
noon.
Albert: Well, if there are an infinite number of golf balls in the row, there certainly is at least one. What number does
it have?
Dudley: AUGH! Whatever number I name, Mugsy removed it at some time. You mean there aren’t any golf balls left?
Albert: Yup. That’s how close infinity is to zero.
Mugsy: (returning) You guys done yet?
Dudley: How dare you swipe all my golf balls, you cad!
Mugsy: Should I go away for a while longer?
On the other hand, L’Hôpital’s rule works only in the cases of 0/0 and ∞/∞; other indeterminate forms must be put into
one of those two forms. The methods of doing that are important, especially since \(1^\infty\) is not in the form 0/0 or ∞/∞.
You can’t use L’Hôpital’s rule unless the indeterminate form is 0/0 or ∞/∞.
Why the rule works. The easiest way to see what is going on is to use parametric equations. That means changing
independent variables from x to t, so that we are trying to evaluate
\[ \lim_{t \to c} \frac{f(t)}{g(t)} \]
We set up x = g(t), and y = f (t), which looks backwards, but isn’t. We also assume f (c) = g(c) = 0. The slope of the
secant line at the origin is
\[ \frac{\Delta y}{\Delta x} = \frac{f(t) - f(c)}{g(t) - g(c)} \tag{2.4} \]
\[ \phantom{\frac{\Delta y}{\Delta x}} = \frac{f(t)}{g(t)} \tag{2.5} \]
(This is why I wanted things "backwards". It puts the f (t) on top and the g(t) on the bottom.) The limit as t approaches c
gives two things: the limit we are looking for, and the slope of the tangent line at the origin. The slope of the tangent line
at the origin comes from the chain rule (the most important rule in calculus), and is
\[ \frac{dy}{dx} = \frac{dy/dt}{dx/dt} \tag{2.6} \]
\[ \phantom{\frac{dy}{dx}} = \frac{f'(t)}{g'(t)} \quad \text{evaluated at } t = c \tag{2.7} \]
Function               x → −∞    x → 0     x → ∞
\(x^n\), n odd         −∞        0         ∞
\(x^n\), n even        ∞         0         ∞
\(1/x^n\), n odd       0         d.n.e.    0
\(1/x^n\), n even      0         ∞         0
\(e^x\)                0         1         ∞
\(\ln x\)              d.n.e.    −∞        ∞
which is essentially how the second limit in L’Hôpital’s rule is evaluated. So, the limit we want is this quotient of derivatives.
Actually, there are some holes in this “proof,” but this conveys the ideas better than the full-blown rigorous proof. It is
intended to convey one important fact. If either f (c) or g(c) is not 0, the slope of the secant line is not f (t)/g(t) any more,
and the whole thing falls apart. Remember: you have to have 0/0 (or ∞/∞, but this explanation doesn’t show it) in order to
use L’Hôpital’s rule correctly.
Values of various functions at certain points. It is handy to have an idea of what happens to functions that give either
0, ∞ or −∞. There’s a short table at the top of the page. The notation “d.n.e.” means that the limit does not exist. For \(1/x^n\)
with n odd, the function tries to go to ∞ for values of x near, but greater than, 0, while the function tries to go to −∞ for
values of x near, but less than, 0. The “d.n.e.” in ln x is there because ln x is not defined for values of x less than or equal to
0. Pushing x towards −∞ in ln x can’t happen.
Rational functions and limits to ±∞. When you have a rational function (the quotient of two polynomials) in x (that is,
with independent variable x), there is a fast way to determine the limit as x → ∞ or x → −∞. A similar trick can be used on
other functions, and we’ll see how soon.
The basic idea is to focus only on the terms that are growing the fastest as x → ±∞. For a polynomial, those are easy to
find. Just look for the highest degree term in the numerator (top) and denominator (bottom). Throw away all the rest of the
terms in both the top and bottom, reduce what’s left, and the limit of that gives the answer to the original limit. (This needs
to be done with care sometimes, and we’ll see when and how.)
Be careful not to use this method unless the limit is to ∞ or −∞! That’s the only time you can ignore everything but the
highest-degree terms in the top and bottom. This is fast and simple. Examples will be given in class.
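The shortcut can be sketched in a few lines of Python. This is only an illustration, not a replacement for the examples in class: the function name `limit_at_infinity` and the convention of listing coefficients from the highest degree down are my own assumptions.

```python
from fractions import Fraction
import math

def limit_at_infinity(top, bottom):
    """Limit of top(x)/bottom(x) as x -> +infinity.

    top and bottom are coefficient lists, highest degree first,
    e.g. [4, -7, 5, -10] means 4x^3 - 7x^2 + 5x - 10.
    The leading coefficients are assumed to be nonzero.
    """
    deg_top, deg_bottom = len(top) - 1, len(bottom) - 1
    lead = Fraction(top[0], bottom[0])  # ratio of the highest-degree terms
    if deg_top == deg_bottom:
        return lead                     # the powers of x cancel
    if deg_top < deg_bottom:
        return Fraction(0)              # the bottom grows faster
    return math.inf if lead > 0 else -math.inf  # the top grows faster

# (4x^3 - 7x^2 + 5x - 10) / (7x^3 + 4x + 200)
print(limit_at_infinity([4, -7, 5, -10], [7, 0, 4, 200]))
# (2x^2 + 1) / (x^3 - 5), a made-up example
print(limit_at_infinity([2, 0, 1], [1, 0, 0, -5]))
```

Everything except the first entry of each list is ignored, which is exactly the "throw away all the rest of the terms" step. The sketch handles x → ∞ only; for x → −∞ the sign of an unbounded answer also depends on whether the leftover power of x is even or odd.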
Repeated uses of L’Hôpital’s rule. Sometimes after using L’Hôpital’s rule, you still wind up with 0/0 or ∞/∞.
Mugsy: Why doesn’t that surprise me?
In that case, try the rule again! You can continue to use L’Hôpital’s rule as long as you keep getting 0/0 or ∞/∞. But one
note of caution. It becomes so easy to keep applying it, that you sometimes forget to keep checking that you have the right
form for continuing. As soon as you get something else, stop! You can get the answer right there.
Dudley: You mean that people actually keep differentiating beyond the proper stopping point?
Albert: Exactly. This is an unfortunately common occurrence.
Dudley: Why?
Albert: That’s hard to say because there are a number of different reasons, I suspect.
Mugsy: Of course, you always stop at the right point, and have a hard time understanding why someone else wouldn’t.
Albert: Only partly true. It is almost always simpler to differentiate than factor a polynomial, which is why L’Hôpital’s
rule is handy then. But it is often easier to differentiate than to plug numbers in, too. In that case, people will simply
keep differentiating until it looks reasonably simple to plug things in, but by that point, they have gone too far. At
least, that’s my theory. But continued use of L’Hôpital’s rule requires alternating between two things—differentiation
and substitution—and not just differentiation alone.
The procedure I just gave for finding the limits of rational functions as x → ∞ is based on repeated applications of
L’Hôpital’s rule. This is looked at briefly in the homework.
Example:
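\begin{align}
\lim_{x \to 0} \frac{x \sin x}{1 - \cos x} &= \text{``}\frac{0}{0}\text{''} \tag{2.8} \\
&= \lim_{x \to 0} \frac{(1 \times \sin x) + x(\cos x)}{\sin x} \tag{2.9} \\
&= \text{``}\frac{0}{0}\text{''} \tag{2.10} \\
&= \lim_{x \to 0} \frac{\cos x + (1 \times \cos x + x(-\sin x))}{\cos x} \tag{2.11} \\
&= \lim_{x \to 0} \frac{2 \cos x - x \sin x}{\cos x} \tag{2.12} \\
&= 2/1 = 2 \tag{2.13}
\end{align}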
Note that each time before applying L’Hôpital’s rule, you have to check that you still have 0/0.
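A quick numerical sanity check of this example is sketched below in Python. It is not a proof, and the helper name `quotient` is made up for illustration; it simply evaluates the original quotient at values of x closer and closer to 0.

```python
import math

def quotient(x):
    """The function from the example: x sin x / (1 - cos x)."""
    return x * math.sin(x) / (1 - math.cos(x))

for x in (0.5, 0.1, 0.01, 0.001):
    print(x, quotient(x))  # the values settle toward 2 as x shrinks
```

Plugging in numbers like this is a cheap way to catch a sign error or one application of the rule too many.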
We will work the following limits in class:
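\[ \lim_{x \to -2} \frac{x^3 + x + 10}{5x^2 + 13x + 6} \]
\[ \lim_{z \to 3} \frac{10z^2 - 21z - 27}{2z^2 - z - 15} \]
\[ \lim_{x \to 1} \frac{4x^2 - 7x + 3}{3x^2 + 5x - 2} \]
\[ \lim_{y \to 0} \frac{e^{2y} - 2y - 1}{\sinh y - \operatorname{Arctan} y} \]
\[ \lim_{x \to \infty} \frac{4x^3 - 7x^2 + 5x - 10}{7x^3 + 4x + 200} \]
\[ \lim_{r \to 0} (r \ln r) \]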
1.491824698
Note the use of colons (:) to suppress the printout of values I already knew.
Homework #21
Exercises.
1. Find the following limits. You can use any (legitimate) method you want. You can check your answers on Maple (as
usual).
(a) $\lim_{y \to 3} \frac{y^2 - 5y + 6}{y^2 + y - 12}$
(b) $\lim_{y \to -4} \frac{y^2 - 5y + 6}{y^2 + y - 12}$
(c) $\lim_{y \to 2} \frac{y^2 - 5y + 6}{y^2 + y - 12}$
(d) $\lim_{y \to \infty} \frac{y^2 - 5y + 6}{y^2 + y - 12}$
(e) $\lim_{x \to \infty} \frac{4 + x^2}{1 - x^3}$
2. Find the following limits. You can use any (legitimate) method you want.
(a) $\lim_{y \to -1} \frac{2y^2 - y - 3}{3y^2 + 2y - 1}$
(b) $\lim_{y \to 1/3} \frac{2y^2 - y - 3}{3y^2 + 2y - 1}$
(c) $\lim_{y \to 3/2} \frac{2y^2 - y - 3}{3y^2 + 2y - 1}$
(d) $\lim_{y \to \infty} \frac{2y^2 - y - 3}{3y^2 + 2y - 1}$
(e) $\lim_{x \to \infty} \frac{6 + x^5}{3 - x^2}$
3. Make up your own limits, and solve them. (This is excellent practice, and will give you some real insight into what
makes a good limit problem. Include some that use L’Hôpital’s rule, and some others where the limit variable goes
to infinity.) Usual rules apply (i.e., three of them will count).
Problems.
1. In this problem, we look at various indeterminate forms, and some impostors.
(a) Show that $0 \times \infty$, $0^0$, $\infty - \infty$, and $\infty/\infty$ are indeterminate forms. Do this by showing that in each of them, one
part is trying to make the expression go to one value, while simultaneously, the other part is trying to make it
go to a different value.
(b) Show that other possible forms are not indeterminate: $\infty + \infty$, $\infty \times \infty$, $\infty^\infty$, and $0^\infty$. Do this by showing that
anything with that form really must approach a specific limit (even if it is ∞). (Caution: $\infty^\infty$ and $0^\infty$ approach
different limits. Why are they different?)
3. A blind application of L’Hôpital’s rule can occasionally get you into problems. This problem looks at that situation.
We’ll look at
\[ \lim_{x \to \infty} \frac{x + \sin x}{3x} \]
simplistically and then using some needed insight.
(a) If you try to plug in x going to ∞, you get ∞/∞. L’Hôpital’s rule applies. What do you get by applying it?
(b) Does this next limit exist, and why or why not?
(c) What are the relative sizes of x and sin x as x gets huge? Applying the logic used for rational functions, what is
the new, simplified quotient you take the limit of? What is that limit? [Moral of the problem: Apply common
sense before L’Hôpital’s rule.]
4. Try the formula in L’Hôpital’s rule twice more on
\[ \lim_{x \to 0} \frac{2 \cos x - x \sin x}{\cos x} \]
even though it should not be used. (This limit was an example in the lecture notes.) Do you get the correct answer?
What’s the moral of this problem?
So far, so good. But plugging in m → ∞ still gives the form 0 × ∞, since ln(1) = 0. That’s a start, but not 0/0 or ∞/∞.
We can convert this limit to the right form by algebra: Invert one term and divide by it (standard trick when dealing with
0 × ∞—we did it in an example before the last homework set). The term to put on the bottom is the simplest one; in this
Benjamin Franklin’s will. Benjamin Franklin put this in an appendix (technically called a codicil) to his will:
I wish to be useful after my Death, if possible, in forming and advancing other young men that may be serviceable
to their Country in both Boston and Philadelphia. To this end I devote Two thousand Pounds Sterling,
which I give, one thousand thereof to the Inhabitants of the Town of Boston in Massachusetts, and the other
thousand to the inhabitants of the City of Philadelphia, in Trust and for the Uses, Interests and Purposes hereinafter
mentioned and declared.
The money was to be lent out at 5% interest, and each borrower was supposed to repay annually both the interest to date
and one-tenth of the principal, which would then be used to loan to other borrowers. He then continues:
If this plan is executed and succeeds as projected without interruption for one hundred Years, the Sum will
be then one hundred and thirty-one thousand Pounds of which I would have the Managers of the Donation to
the Inhabitants of the Town of Boston, then lay out at their discretion one hundred thousand Pounds in Public
Works.... The remaining thirty-one thousand Pounds, I would have continued to be let out on Interest in the
manner above described for another one hundred Years.... At the end of this second term if no unfortunate
accident has prevented the operation the sum will be Four Million and Sixty-one Thousand Pounds.
The Franklin Technical Institute of Boston owes its existence to this money.
Mugsy: Try that today, and you’d get ripped off something fierce.
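Franklin's figures can be checked with a few lines of Python. This is only a sketch: it assumes the fund behaves like a single balance compounding once a year at 5%, which simplifies the lending scheme described in the codicil, and the function name `grow` is made up for illustration.

```python
def grow(principal, rate, years):
    """Balance after compounding once a year at the given rate."""
    return principal * (1 + rate) ** years

first_century = grow(1000, 0.05, 100)
print(round(first_century))            # compare with Franklin's 131,000
print(31000 * 131)                     # 4061000, Franklin's second figure
print(round(grow(31000, 0.05, 100)))   # the unrounded second-century balance
```

Under this assumption the first figure comes out a little above 131,000 pounds, and 31,000 × 131 is exactly 4,061,000, which suggests the second figure reuses the rounded growth factor of 131.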
Comparison to polynomials; orders of growth. We have already seen how a linear function (simple interest) is no
match for an exponential (compound interest). But this continues. No polynomial grows as fast as any growing exponential.
(Exponentials can also decay.)
For example, $e^x$ doesn’t catch up to $100x^{100}$ until about x = 652.72, but after that, the exponential leaves the polynomial
way behind.
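The crossover point can be located numerically. The Python sketch below is one way to do it, assuming a fixed-point iteration: taking logarithms of both sides of e^x = 100 x^100 gives x = ln 100 + 100 ln x, and that formula is applied over and over. The function name `crossover`, the starting guess, and the number of steps are my own choices.

```python
import math

def crossover(x=1000.0, steps=200):
    """Solve e^x = 100 x^100 for the large root by fixed-point iteration.

    Taking logs of both sides gives x = ln(100) + 100 ln(x).
    The starting guess of 1000 is an arbitrary choice.
    """
    for _ in range(steps):
        x = math.log(100) + 100 * math.log(x)
    return x

print(crossover())  # about 652.72
```

Working with logarithms avoids computing numbers like 650^100 directly, which would overflow ordinary floating-point arithmetic.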
There is a hierarchy of growth of functions. We’ll hit it now (since it is relevant to limits as x → ∞), and later (where
we will use it in other settings). It is used in a manner parallel to what we did for rational functions: to determine the
fastest-growing term in the top and bottom. In this case, the fastest-growing term is the one that contains the term that is
lowest in the following list:
This says, for example, that a polynomial grows faster than a logarithm, and an exponential grows faster than a polynomial.
Right now, we aren’t going to use factorials. (You don’t even have to know what a factorial is at this point. We’ll learn
later.)
How do you use this? Suppose you have a limit such as
\[ \lim_{x \to \infty} \frac{8 \ln x + 4x^3 + 3e^x}{3 \sin x + 5x^6 - 2e^x} \]
The way you evaluate it is to discard everything but the fastest-growing terms in the top and bottom, which in this case are
$3e^x$ on the top and $-2e^x$ on the bottom. (You can tell they are the fastest because they occur farther down the list than any
other terms.) So, this limit has a value equal to
\begin{align}
\lim_{x \to \infty} \frac{8 \ln x + 4x^3 + 3e^x}{3 \sin x + 5x^6 - 2e^x} &= \lim_{x \to \infty} \frac{3e^x}{-2e^x} \tag{2.27} \\
&= \lim_{x \to \infty} \frac{3}{-2} \tag{2.28} \\
&= -\frac{3}{2} \tag{2.29}
\end{align}
Again, a note of caution is in order. This procedure of discarding terms can only be used when you are taking limits to
±∞. Other limits require you to keep all the terms.
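As a rough numerical check of the example above (a sketch only; the helper name `ratio` is made up), the full quotient can be evaluated in Python at larger and larger values of x:

```python
import math

def ratio(x):
    """(8 ln x + 4x^3 + 3e^x) / (3 sin x + 5x^6 - 2e^x)."""
    top = 8 * math.log(x) + 4 * x**3 + 3 * math.exp(x)
    bottom = 3 * math.sin(x) + 5 * x**6 - 2 * math.exp(x)
    return top / bottom

for x in (10, 30, 50, 100):
    print(x, ratio(x))  # drifts toward -1.5 as x grows
```

At x = 10 the value is still positive, because the polynomial term on the bottom has not yet been overtaken by the exponential. That is a reminder that discarding terms is only justified in the limit to ±∞.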
Compounding period and its effects. The length of the compounding period has an effect on the amount of interest
paid. This can be seen by realizing that simple interest is basically compound interest with a single compounding period.
Basically, for a fixed length of time, the shorter the compounding period, the more interest accumulates. That’s why
some banks proudly display (or used to; it’s fallen out of practice) that their savings accounts compound continuously.
You can’t get more frequent than that! For a specific interest rate and time period, continuous compounding produces
the maximum interest. Semi-annual, quarterly, monthly, and daily compounding would produce increasing amounts of
interest. (You can see this effect if you go back to a table I listed earlier from Maple where I worked out the FVIF for
various different compounding periods.) An exercise in the homework asks you to work out some different numbers.
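The effect of the compounding period can be tabulated with a short Python sketch. It assumes the usual formulas, (1 + r/m)^(m t) for m compounding periods per year and e^(r t) for continuous compounding; the function name `fvif` and the sample rate and time are made up for illustration.

```python
import math

def fvif(r, t, m=None):
    """Future value interest factor for nominal rate r over t years.

    m is the number of compounding periods per year;
    m=None means continuous compounding.
    """
    if m is None:
        return math.exp(r * t)
    return (1 + r / m) ** (m * t)

r, t = 0.05, 20  # hypothetical rate and time, chosen only for illustration
for label, m in [("annual", 1), ("semi-annual", 2), ("quarterly", 4),
                 ("monthly", 12), ("daily", 365), ("continuous", None)]:
    print(f"{label:12s} {fvif(r, t, m):.6f}")
```

The factors increase down the list, and the last two agree to several decimal places.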
But one thing you will notice in the homework is that the difference between daily and continuous compounding is
virtually invisible. In real life, this still holds. But there is one catch. Many banks (including the one I use) will compound
interest daily, but only pay it monthly. That is, unless you leave the money in until the end of the month, they won’t give
you any interest on it at all!
Effective annual rate (versus nominal rate). Sometimes it is exceedingly difficult to compare varieties of interest
schemes. Is 5.2% compounded monthly better or worse than 5.1% compounded continuously? The smaller rate (5.1%
versus 5.2%) would be offset to some extent by the more frequent compounding (continuously versus monthly).
There is a standard way to compare compounding schemes, called effective annual interest rates or annual percentage
rate (APR). You will see these on credit card applications, for example. The rate that is given to you is called the nominal
rate (at least in old-time textbooks), and that’s r.
The basic idea is simple. If you use the nominal rate for one year, what is the equivalent simple interest rate? Since
simple interest rates are easy to compare (the larger, the more interest), this gives an easy way to compare different interest
and compounding schemes. If you use the formulas, it isn’t hard to come up with the effective rate. If $r_{\text{eff}}$ is the effective
rate, the FVIF of the simple interest with that rate should equal the FVIF of the compound interest with the nominal rate.
Doing this, with t = 1, that is, for one year, (so n = m) and regular compounding gives:
\[ 1 + r_{\text{eff}} = \left(1 + \frac{r}{m}\right)^{m}, \quad \text{so} \quad r_{\text{eff}} = \left(1 + \frac{r}{m}\right)^{m} - 1 \]
This makes it easy to compare different rates with different compounding periods. In the example I gave, 5.2% compounded
monthly has an effective rate of
\[ \left(1 + \frac{0.052}{12}\right)^{12} - 1 = 0.05326 = 5.326\% \]
while the 5.1% compounded continuously has an effective rate of
\[ e^{0.051} - 1 = 0.05232 = 5.232\% \]
which shows that the 5.2% compounded monthly is better (if you want more interest).
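The comparison can be reproduced with a minimal Python sketch. It assumes the standard formulas, (1 + r/m)^m − 1 for m compounding periods per year and e^r − 1 for continuous compounding; the function name `effective_rate` is made up for illustration.

```python
import math

def effective_rate(r, m=None):
    """Effective annual rate for nominal rate r.

    m is the number of compounding periods per year;
    m=None means continuous compounding.
    """
    if m is None:
        return math.exp(r) - 1
    return (1 + r / m) ** m - 1

print(f"{effective_rate(0.052, 12):.5f}")  # 5.2% compounded monthly
print(f"{effective_rate(0.051):.5f}")      # 5.1% compounded continuously
```

Whichever scheme prints the larger number pays more interest over a year.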
Homework #22
Exercises.
1. Find the new balances on an account with different interest methods. Assume a principal of $10,000, an interest rate
of 6%, and time of 15 years. What is the new balance assuming:
(a) Simple interest?
(b) Interest compounded annually?
(c) Interest compounded quarterly?
(d) Interest compounded monthly?
(e) Interest compounded continuously?
2. Find the new balances on an account with different interest methods. Assume a principal of $10,000, an interest rate
of 9%, and time of 10 years. What is the new balance assuming:
(a) Simple interest?
(b) Interest compounded annually?
(c) Interest compounded quarterly?
(d) Interest compounded monthly?
(e) Interest compounded continuously?
3. Find the effective interest on accounts with the following rates and periods.
(a) 3.8% compounded monthly
(b) 7.3% compounded continuously
4. Find the effective interest on accounts with the following rates and periods.
1. This problem works out the difference between simple and compound interest schemes for short and long time
periods. Take a nominal interest rate of 10% for both schemes and for the compound interest scheme use quarterly
compounding.
(a) Find the FVIF’s for simple and compound interest for t = 1, 2, 5, and 10 years.
(b) Find the ratio of the compound interest factors to the corresponding simple interest factors for those same time
periods. (That is, divide the compound interest factor by the simple interest factor.) Which interest scheme
would you prefer for your savings account?
(c) Find the FVIF’s for t = 50, 100, 200, and 500 years. (Don’t be surprised at very large numbers.)
(d) Find the ratio of the compound interest factors to the corresponding simple interest factors for those same time
periods. (Again, divide the compound interest factor by the simple interest factor.) Which would you prefer for
your savings account?
(“$100 placed at 7 percent interest compounded quarterly for 200 years will increase to more than $100,000,000—by
which time it will be worth nothing.” – Robert A. Heinlein)
2. The last problem showed you the drastic difference between simple and compound interest for long periods of time.
This problem shows that there is little difference for short periods of time.
(a) Find the equation of the tangent line to $y = e^{kx}$ at the point (0, 1). This is called the linearization of $e^{kx}$ around
(0, 1). Write the equation as y = (something).
(b) Since the function is closely approximated by its tangent line near the point of tangency, the function $y = e^{kx}$ and
the equation of its tangent line are approximately equal. Write out the approximation, namely $e^{kx} \approx$ (tangent
line formula from previous part).
(c) Change k’s to r’s and x’s to t’s in the previous approximation. What do you get?
(d) Convert the two sides in the approximation in the previous part to FVIF’s of different interest schemes. That is,
each of the sides in the previous equation should be FVIF’s for different interest schemes. Label each one.
(e) What does this say about the FVIF’s for simple and continuously compounded interest for small values of t?
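Variable   Meaning
n          number of hamburgers sold per year
s          shipping cost to place a single order
i          inventory (storage) cost of a hamburger
x          order size (number of hamburgers per order)
C          total cost per year (order cost plus storage cost)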
Equation derived.
Now we want to get the equation(s) that we will be working with.
Order cost per year. The order cost per year is (the number of times we order per year) × (the cost per order). This is
common sense. Think about it for a moment.
Mugsy: Before you make any comment, Dudley, I’ve already figured it out.
The number of times we order per year is (n/x), since that tells us how many orders need to be made to sell n hamburgers.
This also makes sense, but is not as obvious. Try numbers. If you sell 10,000 hamburgers and get shipments in batches
of 500 hamburgers, how many batches do you need? The answer is 20, since 10000/500 = 20.
We have already said that s is the shipping cost per order. We now have all the ingredients to get the order cost per year.
The total order cost is (n/x) × s = (n s)/x. Now let’s go after the total storage cost per year.
Inventory (storage) cost per year. The storage cost per year will be (number of hamburgers stored) × (cost per hamburger
for storage). The number of hamburgers stored is difficult, since it is not a constant. We get an order, and the number
of hamburgers stored goes up. Then as we sell them, the number goes down. One idea (and it’s a reasonable one) is to use
the average number of hamburgers. So, we will make an assumption here. Suppose we get orders of hamburgers just as we
run out, and that we sell them at a uniform rate. Then the average number of hamburgers will be half of the order size, or
x/2. We’ll use that for number of hamburgers stored.
Dudley: Al, how easy would this be to modify to take into account that no reasonable manager would rely on a
shipment showing up just as they run out? There really ought to be a number of hamburgers on hand that triggers
an order being sent, to arrive in time to allow a safe buffer of hamburgers left.
Albert: That’s certainly the right idea. It wouldn’t be hard at all to incorporate that into the equations, and only
creates a small problem. It turns out that all it does is raise the storage cost by a constant, and doesn’t affect the
final value of x at all.
The storage cost of a hamburger is i, by what we decided to call it earlier. The total storage cost is then (x/2) × i =
(i x/2). That’s the other half of what we need.
Total cost. The total cost is the sum of the order cost and the storage cost, or
\[ C(x) = \frac{n s}{x} + \frac{i x}{2} \]
In this equation, x is the variable we get to determine, so that’s the independent variable. C is the dependent variable, and
the value that we want to minimize. The remaining letters represent constants, or more accurately, parameters that we can
set to fit the situation.
With specific numbers, we can graph it. We can get the values of n, s, and i from the manager and plot the function
C(x), and look for the minimum. (After all, we are trying to make the total cost as small as possible.)
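The graphical approach amounts to tabulating the cost for many order sizes and picking the cheapest. Here is a minimal Python sketch of that idea, using the order cost (n s)/x and the storage cost (i x)/2 found above. The values of n, s, and i, the grid of order sizes, and the function name `total_cost` are all hypothetical.

```python
def total_cost(x, n, s, i):
    """Order cost (n s)/x plus storage cost (i x)/2."""
    return n * s / x + i * x / 2

# Hypothetical numbers, not from the manager in the text:
n, s, i = 10000, 50.0, 0.40

# Tabulate C(x) for order sizes in steps of 100 and pick the cheapest.
sizes = range(100, 5001, 100)
best = min(sizes, key=lambda x: total_cost(x, n, s, i))
print(best, total_cost(best, n, s, i))
```

The answer is only as fine as the grid, and the whole scan has to be redone whenever a parameter changes, which is the weakness of working purely from specific numbers.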
Can we do better? There are problems with the graphical approach. What happens if some of the values change (or
aren’t accurate)? We’d be stuck. A major branch of the “real” theory here is called sensitivity analysis. It answers the
question “How sensitive is the final answer to slight inaccuracies in the parameters?” If it turns out that the result can vary
dramatically with slight changes in n, then that is good to know. You can concentrate your efforts on getting accurate values
of n. Sensitivity analysis is difficult to do graphically. We would be better off if we handled the equations algebraically.
How about in another graph for which that approach doesn’t work?
One procedure: (most general) The central idea in this case is simple. Find the places where the graph can change from
rising to falling or vice versa. Those will be the maxes and mins.
Here’s the procedure:
1. Find the derivative of the function. You should all be pros at this by now.
Mugsy: Wow, how optimistic can you get?
2. Find out where the derivative is 0 or doesn’t exist (called Critical points); plot those points on a number line. Critical
points are well-named. They are often where the most interesting behavior of the function occurs. At other places
(called Regular points), the graph is basically dull.
The reason you plot the points on a number line (which is really the x-axis; you can add the y-axis later and plot the
function if you want) is to divide the line up into sections that are clumps of points that are all regular points.
3. Find out if the curve is rising or falling on the other points. The key to this method is here. On any given section
(clump of regular points), the graph is either entirely rising or entirely falling. (It will generally do different things on
different sections, but within one section, it’s always the same at each point.) The reason is that a graph can’t change
from rising to falling except at a critical point. If there are no critical points in an interval, it can’t change. We have
intervals (clumps) with no critical points, so it can’t change.
How do you test whether the graph is rising or falling on that section? Simple, check the value of the derivative at
some point inside the interval. (Don’t check the ends—those are the critical points!) Which point? It doesn’t matter,
since they are all the same in the sense the curve is rising for all points of a specific section or falling for all points of
that section. (Again, it can, and usually does, change from one section to the next.)
4. Locate maxes/mins from changes. Once you’ve categorized each section, locating maxes and mins is simple. A max
occurs when you move from a rising section to a falling section. A min occurs when you move from a falling section
to a rising section. For most functions, the sections will alternate between rising and falling. However, that is not
always true, so don’t rely on it. You really do have to check each section on its own.
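The four steps can be sketched in Python for f(x) = x^3 − 3x^2 − 9x + 8. This is an illustration under stated assumptions, not a general-purpose tool: the derivative and its zeros are worked out by hand, and the test points −2, 0, and 4 are arbitrary choices, one inside each section.

```python
def f_prime(x):
    """Derivative of f(x) = x^3 - 3x^2 - 9x + 8."""
    return 3 * x**2 - 6 * x - 9

critical = [-1, 3]          # where 3(x + 1)(x - 3) = 0
test_points = [-2, 0, 4]    # one point inside each section

# Step 3: is the curve rising or falling on each section?
trend = ["rising" if f_prime(p) > 0 else "falling" for p in test_points]
print(trend)

# Step 4: a change from rising to falling is a max, the reverse is a min.
for c, before, after in zip(critical, trend, trend[1:]):
    if before == "rising" and after == "falling":
        print(c, "local max")
    elif before == "falling" and after == "rising":
        print(c, "local min")
    else:
        print(c, "neither")
```

Any other test point inside the same section would give the same sign, which is the whole reason one check per section is enough.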
Other procedure: (easiest and most common) The central idea in this procedure is to look for the tops of hills (maxes)
and bottoms of valleys (mins).
1. Find the first and second derivatives of the function.
2. Find out where the first derivative is 0. The places where the first derivative is zero have the potential for being tops
of hills or bottoms of valleys.
In this approach, you can’t handle where the first derivative doesn’t exist. The second derivative won’t exist there
either, and the procedure falls apart.
3. Test the concavity at the places where the first derivative is 0. If the second derivative is positive where the first
derivative is 0, that’s a min. If the second derivative is negative where the first derivative is 0, that’s a max. If the
second derivative is 0 or doesn’t exist where the first derivative is 0, then just about anything could happen: max,
min, or neither.
Why does that work? If the curve is concave up (positive second derivative) at a point that has a horizontal tangent,
then the concave up means you are on a portion of the graph that looks like a smile, and the horizontal tangent gives
a minimum. If the curve is concave down (negative second derivative) at a point that has
a horizontal tangent, then the concave down means that you are on a portion of the graph that looks like a frown, and
the horizontal tangent gives a maximum.
This procedure is called the Second derivative test, since the second derivative is used to tell the character of the point
where the first derivative is zero. However, be careful with the second derivatives. Just because you have a positive
(or negative) second derivative at a specific point, you don’t have a minimum (or maximum) unless you also have the
first derivative equal to zero.
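A minimal Python sketch of the second derivative test follows. The function name `second_derivative_test` is made up, the points where the first derivative is zero are assumed to have been found by hand already, and the x^4 example is my own illustration of a case where the test gives no answer.

```python
def second_derivative_test(f2, points):
    """Classify points where f'(x) = 0 using the sign of f''(x)."""
    result = {}
    for c in points:
        value = f2(c)
        if value > 0:
            result[c] = "min"
        elif value < 0:
            result[c] = "max"
        else:
            result[c] = "test fails"  # max, min, or neither is possible
    return result

# f(x) = x^3 - 3x^2 - 9x + 8 has f'(x) = 0 at x = -1 and x = 3, and f''(x) = 6x - 6.
print(second_derivative_test(lambda x: 6 * x - 6, [-1, 3]))
# g(x) = x^4 has g'(0) = 0 and g''(x) = 12x^2, so the test says nothing at x = 0.
print(second_derivative_test(lambda x: 12 * x**2, [0]))
```

The sketch only looks at the sign of the second derivative, so it is meaningful only at points where the first derivative really is zero.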
Choosing between these procedures. Both procedures will work, when they can. The first (general) procedure is usually
somewhat longer, but is guaranteed to work. The second (common) procedure is faster, but might fail for either of two
reasons:
1. The first derivative fails to exist, or
2. The second derivative is 0 or fails to exist.
These are serious problems, and if either occurs, you are forced into using the first procedure.
So, my recommendation is this: Use the second procedure unless it fails. Examples of failure are given in the homework
and in class.
Note that neither of these procedures finds the values of the function at a max or a min. It isn’t necessary! All we are
doing is locating the value of x when the max or min occurs. However, in most practical problems, finding the value of the
function at the max or min is important. If the problem asks for the maximum and/or minimum value of a function, you
have to plug the x’s back into the function to evaluate it at the relevant points.
An example of this will be given when I solve the minimum cost problem. I will do it both ways. And in the next
section, I will find the critical points of a more complicated function. But I will work the following examples in class:
\[ x^3 - 3x^2 - 9x + 8 \]
\[ e^x \sin x \]
Needed to guarantee global maxes and mins. The situation that guarantees that a function will have both a global max
and a global min is if the function is continuous (no breaks in its graph) on an interval like a ≤ x ≤ b, commonly abbreviated
[a, b]. This assertion can be proved, but doing so requires more advanced mathematics than I want to deal with here. It is
sometimes called the extreme value theorem.
Mugsy: Should I be glad that the proof is omitted?
Albert: Most definitely. The proof shows up in courses called Real Analysis or Topology. These are genuinely senior-
level courses, and often put off until graduate school.
Global max and min will either be a critical point or an endpoint. There are two types of places to look for a global
max or min. The first type of place is at critical points. It is quite possible that one of the local maxes or mins is also
the global max or min. You can’t ignore them. The other type of place to look, and this is a result of working on an
interval, is at the endpoints of the interval. If you have a function like f(x) = x on the interval [a, b], then the minimum
will occur at x = a and the maximum will occur at x = b. The endpoints are important!
Finding the global max and global min. Once you know where to look, you have to search through the points for the
global max and global min. You could do this by classifying each of the points (“Is it a max or min?”), and looking among
the maxes for the global max and looking among the mins for the global min. That’s not how it is done in practice. For one
thing, it is not obvious what to do with the endpoints. (Although with a bit of thinking, you could probably come up with
the right thing.)
Dudley: Al, Mugsy and I have been conferring, and have one thing to say: HELP!
Albert: Don’t panic, either of you. Think about what’s being done. Suppose you want the global maximum and
minimum of a function f (x) on the interval [a, b]. You could find all the critical points, and classify them as local
maxes and mins. Then, to find the global max, you’d take the local maxes and the endpoints (Don’t forget them!),
and you’d be stuck. How do you tell which is the global max? There is only one way. You have to plug those values
back into the function, and look at the values of the function at all those points. The very largest value is the global
maximum, and the corresponding x is the value where the global max occurs. Then, to find the global min, you’d
do the same with all the local mins (again including the endpoints!), plugging in all those values into the function,
but this time looking for the very smallest output of the function, which is the global min of the function, and the
corresponding x is the input that gives that global min. Now, look at what you did. You plugged in all the critical
points into the function, as well as the endpoints (twice). Realize, then, that you could drop out one piece of work.
You don’t really need to know whether a point is a local max or min. It’s wasted effort. The global max is not going
to occur at a local min, and vice versa. Just plug all the points in without classifying them first if you are going only
for the global max and min. It will save you some work. Does that help?
Dudley: Definitely, but I’ll need to go over it again to be sure.
Albert: But be careful. You can skip the classification, as it is called, of the critical points (answering the question
of whether the point is a local max or min) only when you are asked only for the global max and min. If you want
the local maxes and mins, then you do have to go through the classification process (the second derivative test, for
example).
The procedure, then, is somewhat different from finding local maxes and mins. To find the global max and min of a
function on an interval, follow this procedure.
1. Find all the critical points of the function (where the derivative is either 0 or doesn’t exist), and put those in a list.
2. Throw out of the list any critical points that are not in the given interval.
3. Add to that list the endpoints of the interval.
4. Evaluate the function at all the points that are left in the list.
5. Pick the largest value to be the global max and the smallest value to be the global min.
Then what we have to do is find the critical points. The values where f′(x) = 0 come from setting the top of f′(x) to 0. (A
fraction is zero when the top is zero.) That gives −2(x + 1)(x + 4) = 0, so x = −1, −4. We throw out −1 since it is not
in the interval between −6 and −3. Next, we need the values where f′(x) doesn’t exist, and that comes from setting the
bottom of f′(x) to 0. (A fraction blows up when you divide by 0.) (This, by the way, is why I factored the bottom!) That
gives $(x - 2)^2 (x + 2)^2 = 0$, or x = 2, −2. Both of those get thrown out, because neither is between −6 and −3. So, the only
surviving critical point is x = −4. Add to that the endpoints, −3 and −6, and you end up with three x’s to find function
values for.
\[ f(-3) = -\frac{1}{5}, \quad f(-4) = -\frac{1}{4}, \quad \text{and} \quad f(-6) = -\frac{7}{32} \]
Since
\[ -\frac{1}{5} > -\frac{7}{32} > -\frac{1}{4} \]
the global max in this case is −1/5 at x = −3 and the global min is −1/4 at x = −4.
A few notes. First, you plug back into f(x) to finally decide global maxes and mins, not f′(x) or f′′(x). Also, when you
are trying to decide where a fraction is either 0 or not defined, you set the top equal to 0 and then the bottom equal to 0.
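The five-step procedure can be sketched in Python for the example just worked. One loud assumption: the function is taken to be f(x) = (2x + 5)/(x^2 − 4), which is consistent with the factors of the derivative and the three function values quoted above. Exact fractions are used so the answers can be compared directly with the ones in the text.

```python
from fractions import Fraction

def f(x):
    """Assumed function for this example: (2x + 5) / (x^2 - 4)."""
    x = Fraction(x)
    return (2 * x + 5) / (x**2 - 4)

a, b = -6, -3
critical = [-1, -4, 2, -2]                          # step 1: f' is 0 or undefined
candidates = [c for c in critical if a <= c <= b]   # step 2: keep those in [a, b]
candidates += [a, b]                                # step 3: add the endpoints
values = {x: f(x) for x in candidates}              # step 4: evaluate f, not f'
x_max = max(values, key=values.get)                 # step 5: largest and smallest
x_min = min(values, key=values.get)
print("global max", values[x_max], "at x =", x_max)
print("global min", values[x_min], "at x =", x_min)
```

Notice that no point is ever classified as a local max or min along the way; the comparison of function values does all the work.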
There is also the problem of where C′(x) is not defined, which occurs at x = 0. This gives a total of three critical points,
but only one is realizable. (What would the manager say if you told him to order a negative number of hamburgers? Or
zero hamburgers?) This can also be viewed as a sort of global max/min problem, with x > 0 being the interval, but that
can’t work easily, since there is only one endpoint. But the effect is the same, namely, drop the negative square root and the
value x = 0. We are left with only one possible value, $x = \sqrt{2ns/i}$. It had better be a min!
We plug that value of x into C′′(x), and we will certainly get a positive number, since all factors in C′′(x) are positive.
Thus, $x = \sqrt{2ns/i}$ does indeed give a minimum for C(x).
Solution.
What’s the answer? The order size should be $x = \sqrt{2ns/i}$. There is one thing yet to do, since this is a problem taken from
the real world. You should always ask if it makes sense.
Mugsy: Hey, really? Does anything in this course actually make sense?
Dudley: Come off it. Of course. What fascinates me is that it could actually possibly be used maybe a little. This is
a new concept for a math course.
If n = number of hamburgers sold per year, gets large, then the order size goes up. That certainly makes sense. If s =
shipping cost to place a single order goes up, we want to make larger orders. That also makes sense, since a larger shipping
charge means you want to pay it less often. If i = inventory cost of a hamburger goes up, then you certainly want to cut the
size of the inventory back, which means smaller orders delivered more often.
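These sanity checks can be tried with numbers. The Python sketch below uses the order size x = sqrt(2 n s / i); the parameter values and the function name `order_size` are hypothetical.

```python
import math

def order_size(n, s, i):
    """Optimal order size x = sqrt(2 n s / i)."""
    return math.sqrt(2 * n * s / i)

base = order_size(10000, 50.0, 0.40)         # hypothetical parameter values
print(base)
print(order_size(20000, 50.0, 0.40) > base)  # more sales: larger orders
print(order_size(10000, 80.0, 0.40) > base)  # costlier shipping: larger orders
print(order_size(10000, 50.0, 0.60) < base)  # costlier storage: smaller orders
```

Because of the square root, doubling the number of hamburgers sold does not double the order size; it multiplies it by the square root of 2.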
There is one other question that fits in here. How often should orders be placed? That is certainly relevant. We answer
that question in the homework right now.
Homework #23
Exercises.
1. Find and classify all the local maxes and mins of the following functions. (Classify means determine whether it’s a
max or a min or neither.)
(a) $p(s) = s^3 - 6s^2 - 36s + 18$
(b) $f(x) = x^3 - 6x^2 + 12x - 1$
2. Find and classify all the local maxes and mins of the following functions.
(a) $p(s) = s^3 - 9s^2 + 27s - 15$
(b) $f(x) = x^3 - 15x^2 + 63x - 60$
3. Find where the functions are rising and falling, and the global maximum and minimum, of the following functions
on the given intervals:
(a) $f(x) = 3x^2 + 8x - 5$ on [−3, 0].
(b) $f(x) = \frac{x^2}{x - 4}$ on [5, 10].
(c) $f(x) = x \ln x$ for 0 ≤ x ≤ 3 (Use f(0) = 0, since $\lim_{x \to 0} f(x) = 0$ for this function. We did that limit in an
example earlier.)
4. Find where the functions are rising and falling, and the global maximum and minimum, of the following functions
on the given intervals:
(a) $f(x) = 3x^2 - 7x + 2$ on [−1, 3].
(b) $f(x) = \frac{x^2}{x - 3}$ on [4, 10].
(c) $f(x) = x e^{-x}$ for 0 ≤ x ≤ 5.
Problems.
1. In this problem, we look at a familiar function that requires the first method for finding maxes or mins.
(a) The function f(x) = |x| has a minimum at x = 0. Why would the second (easy) method never find that as a
minimum?
(b) When you try the second method in this function, what tips you off that it is not going to work?
(c) What about the graph of y = |x| at x = 0 indicates that the second method for finding maxes and mins will fail?
2. In this problem, we classify all the critical points of f(x) = (2x + 5)/(x^2 − 4). I have already found all the critical points in the
notes (though some were removed due to the interval under consideration there).
(a) What is the second derivative of f (x)? (Needed for the second method of classifying critical points.)
(b) When you try to use the second method to classify the critical points, two points can be classified, but two
can’t. Classify the two that can be. (The two that can’t be classified have the awkward property that the second
derivative explodes—division by 0—when you plug the values in.)
(c) The two failed points obviously need more help, and the first method now becomes the only hope. Determine
where the function is increasing and where it is decreasing. (You always have to use all the critical points to
answer this type of question.)
(d) On the basis of the information from the previous parts, classify the critical points again. Do the two failed
points show up as maxes, mins, or neither? (Bonus: What are they graphically?)
3. In this problem, we investigate how often orders need to be placed, having determined that the optimal order size is
x = √(2ns/i).
(a) Back in the beginning of this problem, we decided that there needed to be n/x orders per year to sell n hamburgers
each year. From that, how many years are there between orders? (This looks hard, but isn’t really. Try
the same question with numbers. If you have 12 orders per year, how many years are there between orders?
How about 4 orders? 3 orders? How did you answer these questions? Apply the same logic to this problem.)
(b) Plug x = √(2ns/i) into your answer to the previous part, and simplify the result you get algebraically. (It’s only
a minor bit of simplifying.)
2.4 Elasticity.
We now want to move into a bit of microeconomics. But courage! We aren’t going to spend much time here; just enough
to pick up a feel for how differentials are used in another field, and get a few more useful things to do with them.
Mugsy: More? We had any?
2.4.1 Introduction.
Typical use of calculus concepts in economics.
The preceding section is a genuine application, but still carries a bit of a sense of triviality. In this section, we will get more
of an idea that calculus concepts can be used in a significant way in the study of economics.
These sections hardly represent the only uses of calculus in economics, but we haven’t yet had enough calculus to do
much more.
We want dR/dx = 0.
To maximize revenue, we differentiate R(x) = x p(x) with respect to x, since p is really a function of x. We then set it equal
to 0, and solve for d p/dx. Here’s what happens:
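\[ \frac{dR}{dx} = 1 \times p(x) + x \times \frac{dp}{dx} \tag{2.45} \]
\[ 0 = p + x\,\frac{dp}{dx} \tag{2.46} \]
\[ -p = x\,\frac{dp}{dx} \tag{2.47} \]
\[ -\frac{p}{x} = \frac{dp}{dx} \tag{2.48} \]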
That is, we want d p/dx to equal −p/x to make the revenue a maximum.
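To see the condition at work, here is a small sketch with a made-up linear demand curve p(x) = 100 − 2x. The curve and all the names are assumptions for illustration only; nothing here comes from real market data.

```python
def price(x):
    """Hypothetical linear demand curve p(x) = 100 - 2x (an assumption)."""
    return 100.0 - 2.0 * x

DP_DX = -2.0  # slope dp/dx of the hypothetical demand curve

def revenue(x):
    """Revenue R(x) = x * p(x)."""
    return x * price(x)

def marginal_revenue(x):
    """dR/dx = 1 * p(x) + x * dp/dx, the product rule from (2.45)."""
    return price(x) + x * DP_DX

x_star = 25.0  # demand level where the marginal revenue vanishes for this curve
print("dR/dx at x = 25:", marginal_revenue(x_star))
print("dp/dx:", DP_DX, "  -p/x:", -price(x_star) / x_star)
print("R(24), R(25), R(26):", revenue(24.0), revenue(25.0), revenue(26.0))
```

For this particular curve, dp/dx and −p/x agree at x = 25, and the revenue on either side of x = 25 is smaller, as (2.48) says it should be.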
There is some more terminology. Whenever an economist refers to the derivative of something with respect to x
(demand), he calls it the marginal of that thing. d p/dx is marginal price; dR/dx is marginal revenue. The way that this is
explained in economics class is that the marginal revenue (for example) is the change in revenue caused by selling one more
item, while marginal price is the change in price needed to increase demand by one. The idea there is that d p/dx ≈ ∆p/∆x,
where the approximation is best when ∆x is small. Since ∆x = 1 is the smallest that is physically possible, the best that can
be done is d p/dx ≈ ∆p/∆x = ∆p when ∆x = 1. The same would be true for dR/dx ≈ ∆R when ∆x = 1.
Actually, all derivatives (not just with respect to x) in economics are called Marginals. The derivative d/dx is the
Demand marginal, so d p/dx is more accurately the demand marginal of price, while dR/d p would be the price marginal
of revenue. The type of marginal is then used for specifying the independent variable. However, the demand marginal is
usually meant when there is no other indication of which marginal is referenced.
\[ \text{Elasticity} = \frac{p/x}{dp/dx} \tag{2.51} \]
The quantity on the left of the equation for the critical price is called elasticity, and is given the letter η (Greek letter,
written eta, and pronounced AY-tuh). It looks like an “n” with a tail. (But the Greek equivalent of “n” is nu or ν, looking
like a “v.” It takes some getting used to.)
The marginal price is on the bottom, as mentioned above. The item on the top is called the Average price. This also is
terminology. A numeric quantity divided by x gives the average of that quantity. So, p/x is the average price, and R/x is
the average revenue. (I would imagine that there are varieties of averages, just like there are varieties of marginals. That is,
the price average of revenue would be R/p. However, I have never encountered such terminology.)
It is worth highlighting the difference:
Note that elasticity will be negative (by the assumption that d p/dx < 0, and assuming p and x are positive). What we will
end up talking about, mostly, is the absolute value of elasticity, making it positive. Elasticity is one of the quantities that
is not defined uniformly. The definition here is the most common (as far as I can tell), but there are others. One typical
alternative is to include absolute values. This definition can be easily translated to something equivalent to what we will
be doing, except that we will have to include the absolute values. Another definition of elasticity that I have seen is that
η = (d p/dx)/(p/x), the reciprocal of what we are using. This makes life complicated, since it reverses all the inequalities
that we will get.
η is negative.
We’ve already said that earlier, but it needs to be put here with the properties of elasticity. It follows from d p/dx < 0, x > 0,
and p > 0.
Terminology.
The market (meaning the price-demand situation being investigated) with 0 < |η | < 1 is called Inelastic, with |η | = 1 is
said to have Unit elasticity, with |η | > 1 is called Elastic. Note that these inequalities flip around if you take off the absolute
values, because η is negative! Also note that the condition we had for maximum revenue, η = −1, implies unit elasticity.
That’s the reason that 1 is the separating line between inelastic and elastic markets.
Dudley: Al, how can I remember these easily?
Albert: I assume you don’t want the brute force memorization method. If you will wait until we know how elastic and
inelastic markets operate, it will be easier to explain.
This is standard terminology (except, see the homework!). We want to examine what these different terms imply about
the market.
Rewrite η as (dx/x)/(d p/p) . In the form that we had for η, it is a bit difficult to interpret, so we rearrange the terms,
and specifically, split the derivative apart into the quotient of two differentials so that we can put all the x-terms on top and
all the p-terms on the bottom.
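Written out one step at a time, the rearrangement goes like this (dividing by d p/dx is the same as multiplying by dx/d p, and then the differentials are regrouped):

```latex
\[
\eta \;=\; \frac{p/x}{dp/dx}
     \;=\; \frac{p}{x}\cdot\frac{dx}{dp}
     \;=\; \frac{dx/x}{dp/p}.
\]
```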
What are dx/x and d p/p? The meanings of dx/x and d p/p are central to understanding elasticity. In fact, dQ/Q for any
quantity Q, is a useful item in other situations, so we spend some time looking at this separately. It even has a terminology:
dQ/Q is called the relative change in Q. Elasticity, then, is the ratio of the relative change in demand to the relative change
in price. As soon as we figure out what the relative change means, we’ll come back to this, and get our final interpretation
of elasticity.
Absolute error in Q is dQ
Relative error in Q is dQ/Q
Percentage error in Q is dQ/Q × 100%
Note that 100% = 1.00, so the percentage error and relative error are different-looking ways of saying the same thing.
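A quick numeric sketch of the three quantities may help. The function name and the numbers below are made up for illustration.

```python
def describe_change(q, dq):
    """Return the absolute, relative, and percentage change for a quantity q that changes by dq."""
    absolute = dq
    relative = dq / q
    percentage = relative * 100  # the same information, read as a percent
    return absolute, relative, percentage

# Hypothetical example: a quantity of 80 units goes up by 2 units.
absolute, relative, percentage = describe_change(q=80.0, dq=2.0)
print("absolute change:  ", absolute)
print("relative change:  ", relative)
print("percentage change:", percentage, "%")
```

The same absolute change of 2 would be a much smaller relative change for a quantity of 8000, which is why the relative version is the useful one for comparing markets.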
Homework #24
Exercises.
(a) Find the revenue function R = x p as a function of x. Simplify the function you get by combining terms.
(b) Show that the derivative of R with respect to x is positive. Assume that x > 0.
(c) Repeat the previous two parts with p = 18x^(−4/3), but show that the derivative of R is negative.
3. Suppose the elasticity of a commodity at a particular production level is η = −1.3. Assume that production is the
same as demand.
(a) If the production goes up by 2%, by what percentage does the price increase or decrease?
(b) If the production goes down by 3%, by what percentage does the price increase or decrease?
4. Suppose that the elasticity of another commodity at a particular production level is η = −0.9. Assume that production
is the same as demand again.
(a) If the production goes up by 1%, by what percentage does the price increase or decrease?
(b) If the production goes down by 2%, by what percentage does the price increase or decrease?
Problems.
1. Consider the following quotation: “The average describes the past, the marginal predicts the future.” To explain
why it makes sense, work the following problem. Suppose Dudley is marketing a new, improved widget sharpener.
(Everyone agrees, thanks to his advertising blitz, that nothing is more intolerable than dull widgets.) Dudley invested
initially $850,000 in development, advertising, and setup costs. He has sold to date 25,000 widget sharpeners for
$65, with a production cost of $35 each, for a profit of $30 each.
(a) What is the total cost to date of widget sharpeners? You do this by adding initial costs and production costs.
(b) What is the average cost of a widget sharpener to date? [Remember what average means. What is the current
value of x?]
(c) What is the cost function of a widget sharpener? Find this by adding initial costs to the cost to produce x widget
sharpeners.
(d) What is the marginal cost of a widget sharpener? [Remember what marginal means. Use the cost function.]
(e) If Dudley went by average cost versus selling price, would he produce another widget sharpener?
(f) Compare marginal cost to selling price. Should Dudley continue to sell widget sharpeners at $65 each?
(g) Interpret the quotation at the beginning of this problem in the light of this problem.
2. In this problem, we simply do some algebraic manipulations with η. One fact to keep in mind is that |s | is the
distance of s from the origin, even if s is negative.
3. In this problem, we verify that revenue (R) does what we said it would with increases in price at various elasticities.
(a) Show that dR/d p = x × (η + 1). You can do this by multiplying out the right hand side using the alternate
definition of η, from the previous problem, and by using the product rule on the formula R = x p and showing
the two are the same.
(b) Show that a price increase in an inelastic market leads to an increase in revenue. (Hint: If η > −1, from the
previous problem, then η + 1 > 0.)
(c) Show that a price increase in an elastic market leads to a decrease in revenue. (Hint: If η < −1, from the
previous problem, then η + 1 < 0.)
4. The formula in the first part of the previous exercise can be rewritten as
dR = x × (η + 1) d p.
Find the changes in revenue in the situations described in the last two exercises. (Check your answers against what
you found in the previous problem.)
5. Show that the demand function x = c p^n has elasticity η = n. You will find the alternate definition of η from problem
2 (a) easiest to use here. (This demonstrates the pattern for elasticities. Unit elasticity looks essentially like x = c/p;
η = −2 looks like x = c/p^2. Remember that η is negative!)
4. Elasticity represents how a market will respond to changes in price or demand. The formulas are
\[ \eta = \frac{p/x}{dp/dx} = \frac{p/dp}{x/dx} = \frac{dx/dp}{x/p}. \]
Elasticity will be negative: η < 0. An inelastic market has |η | < 1, and is typical of necessities. An increase in price
will result in an increase in revenue. An elastic market has |η | > 1, and is typical of luxuries. An increase in price
will result in a decrease in revenue. A unit elastic market has |η | = 1, and will maximize (or minimize!) revenue for
a commodity.
5. The relative change in Q is dQ/Q or (∆Q)/Q.
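The elasticity terminology and the revenue formula dR = x × (η + 1) d p can be put side by side in a short sketch. The function names and the sample numbers are hypothetical, and η is assumed to be negative, as it is throughout this section.

```python
def classify_market(eta):
    """Classify a market by the absolute value of its elasticity eta."""
    size = abs(eta)
    if size < 1:
        return "inelastic"
    if size > 1:
        return "elastic"
    return "unit elasticity"

def revenue_change(x, eta, dp):
    """Change in revenue dR = x * (eta + 1) * dp for a small price change dp."""
    return x * (eta + 1) * dp

# Hypothetical markets, each selling x = 1000 items, with a price increase dp = 1.
for eta in (-0.5, -1.0, -2.0):
    print(eta, classify_market(eta), revenue_change(1000, eta, 1.0))
```

With these made-up numbers, the price increase raises revenue in the inelastic market, lowers it in the elastic market, and leaves it unchanged (to first order) at unit elasticity.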
Chapter 3
Derivatives - II
3.1.1 Basics.
There are a few things that we will have to cover before we can get into the calculus of multi-input functions.
Motivations.
Are there any reasons for looking at these things? Definitely. There are a number of different rationales for them.
Very few things in life depend on only one other thing. If we want to use calculus in more comprehensive situations, we
will have to deal with the reality that most items in life depend on multiple other items. That means dealing with functions
of many variables, and understanding calculus in those bigger settings.
It might be interesting to view the inventory cost control problem as a function of both x, order size, and n, annual sales
of hamburgers, so that you can see how growth of the business affects order size. And that’s just one option there.
Weather, for example. The weather is one of the most complicated of the systems that are being analyzed today. The
accuracy of weather forecasts would be greatly enhanced by a good model (a set of equations describing it) of the atmosphere.
There are too many variables! You’d need to look at latitude, longitude, length of day, season, amount of pollution (which
is difficult to describe all by itself!), geography, and many others.
Multiple-input functions
As before, one critical element of understanding derivatives is to understand functions correctly. There is one handy piece
of terminology that (as far as I know) is unique to me, but which I have learned is very useful. The number of input
variables (the number of input chutes) is called the dimension of the function. So far, all our functions (with one exception
in one part of one homework question) have been one-dimensional, by this terminology.
So, let’s look at the different definitions of functions from this point of view. We can operate multi-input green boxes as
easily as regular (that is, single-input) green boxes. You simply require that whenever all the inputs are duplicated, then the
outputs must be duplicated. That is the essence of consistency that we required then, and still require. We can also work
with gnomes that need more information, but we don’t get anything new.
We can create proper lists. Only now, we will have a number of columns for the input side of the list, one column for
each variable. A gnome has to match all the columns in order to determine the output. And that becomes the condition for
“proper-ness:” if all the input values are identical on two rows, the output values must be the same on those rows.
From there, we can create proper lists of ordered triples, ordered quadruples, or whatever number needs to be used to
express the number of input variables. The most general is called ordered n-tuples.
Graphs are more complicated. Graphs in three dimensions are difficult, and in four dimensions (and up), graphs are
unusable. But there is a bit of confusion possible here, that needs to be tackled up front. If you want to graph a one-
dimensional function (a function of a single variable, say x), you need a plane, that is, two dimensions. Why? The reason is
simple, once you see it. A one-dimensional function (single-input green box) has two parts, the input chute and the output
spout. Values from both need to be plotted. And that is exactly what happens. The horizontal axis is the one used to plot
the input variable, and the vertical axis is used to plot the output variable. That accounts for the two dimensions.
What happens in more dimensions? We will need individual axes for each input variable, but we will need one more
axis, namely the output variable’s axis. The result is that you need n + 1 dimensions to graph an n-dimensional function.
You need n of the axes for input variables, and one more for the output variable.
Formulas are basically the same as before, except that there are more variables around. On the other hand, you will
also want to know how to handle Maple functions in more variables. It turns out to be fairly straightforward. Suppose, for
example, you want to define the function f(x, y, z) = x^2 z − e^(y sin x). The way to do that in Maple is
> f := (x,y,z) -> x^2*z - exp(y*sin(x));
2 − e^(sin(1))
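For readers without Maple at hand, roughly the same multi-input function can be sketched in Python. Python is my choice for this illustration and is not part of the course. Evaluating the function at the point (1, 1, 2) gives 1^2 × 2 − e^(1 × sin 1) = 2 − e^(sin 1).

```python
import math

def f(x, y, z):
    """f(x, y, z) = x^2 * z - e^(y * sin x), a function with three input chutes."""
    return x**2 * z - math.exp(y * math.sin(x))

# All three inputs must be supplied before there is an output.
value = f(1, 1, 2)
print(value)                          # decimal value of f(1, 1, 2)
print(2 - math.exp(math.sin(1)))      # the same number, from 2 - e^(sin 1)
```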
Which variable to wiggle? If we want the derivative to be a wiggle magnification factor, which wiggle do we use? We
could wiggle any variable we want! And we will get a wiggle magnification factor for each variable. That means that we
will need to keep careful track of the notations for derivatives. There are multiple wiggle magnification factors, one for
each variable, and they need to be clearly different.
When we decide to wiggle a variable, we will have to make an assumption: no other variable is wiggling at the same
time. Otherwise the wiggle magnification factor will be thrown off by the effects of other variables wiggling at the same
time. Later on, we’ll see how to combine the total effect of multiple wiggles simultaneously, but for now, we wiggle only
one variable at a time, with all others being constant during the process.
The terminology for the wiggle magnification factor of f (x, y, z) when just x is wiggled is the partial derivative of f
with respect to x. The notation is
\[ \frac{\partial f}{\partial x} \]
using bent-over d’s. In this case, there are also partial derivatives of f with respect to y and z, and they would be written
∂f/∂y and ∂f/∂z, and they represent the wiggle magnification factors when just y or just z is wiggled.
The notation ∂f/∂x denotes that there are other variables around. There really is no difference between the partial
derivatives and regular (single-variable) derivatives, except that there are other variables occurring in the partial derivative.
But since they are keeping still (treated as constants), they don’t really affect much while the derivative is being taken.
On the other hand, they are there, and the partial derivative symbol, ∂, is a warning that other variables are around, and
need to be taken into consideration. Please keep separate the notations df/dx and ∂f/∂x when you are writing them. The
notation carries some meaning: Are other independent variables present or not?
The subscript notation is common, confusing. There is another notation that is commonly used. It is very convenient,
but needs to be used with some care. If you have f (x, y, z), the partial derivatives might also be written f_x, f_y, and f_z, meaning
the same as ∂f/∂x, ∂f/∂y, and ∂f/∂z. It’s a lot shorter to write, so it’s used often. But it’s also easy to lose track of subscripts. Use
this notation if you want, but always be careful if you do.
There is one other, less common notation. You will occasionally see D_x f for ∂f/∂x, the subscript indicating the variable
that you are wiggling. It is especially useful when you want to look at the transformation from f to ∂f/∂x. That is, D_x is a
shorthand for ∂/∂x, the way that we use d/dx to mean “take the derivative with respect to x of.”
Interpretations of derivatives.
Now that we have some idea about the notation, we need to develop the understanding of the concept.
Still a “wiggle magnification factor.” The wiggle magnification factor (WMF) now means the same as it did in a single-input
function. Suppose you have a function f (x, y, z), and you wiggle just the x-value by ∆x. The value of the function
will change by ∆f, and the ratio of those two wiggles, ∆f/∆x, will approximate ∂f/∂x, or ∆f ≈ (∂f/∂x) ∆x. Similarly, wiggling
just y gives ∆f ≈ (∂f/∂y) ∆y, and wiggling just z gives ∆f ≈ (∂f/∂z) ∆z. We will soon get what happens when we wiggle several
variables simultaneously.
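Here is a small numerical sketch of that idea. The function is made up for illustration; only x is wiggled, while y and z are held still. For this function the partial derivative with respect to x is 2xy, which equals 4 at the point (1, 2, 3).

```python
def f(x, y, z):
    """A made-up three-input function, used only for illustration."""
    return x**2 * y + z**3

def wiggle_ratio_x(x, y, z, dx):
    """Ratio (change in f) / (change in x) when only x is wiggled by dx."""
    return (f(x + dx, y, z) - f(x, y, z)) / dx

# Exact partial derivative with respect to x is 2*x*y (y and z held constant).
exact = 2 * 1.0 * 2.0
for dx in (0.1, 0.01, 0.001):
    print(dx, wiggle_ratio_x(1.0, 2.0, 3.0, dx), exact)
```

As the wiggle ∆x shrinks, the ratio ∆f/∆x settles toward the partial derivative, just as it did for single-input functions.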
We can still salvage the slope of a tangent line! This can be done, but it is a bit of a mess. For those hardy souls who
want to know, here it is in brief. The dimension of a function is the number of independent variables or the number of
inputs. The graph needs one more dimension, for the output variable.
If we try to find a partial derivative, essentially, we treat all variables but one (the one we’re wiggling) as constants.
That has the effect of slicing through the graph of the function with a plane, giving just a single curve on the plane. The
partial derivative is the slope of the tangent line to that curve in the plane.
It is still a rate, when t is the variable. When one of the variables is t = time, we will still refer to the partial derivative
of the function with respect to t as a rate of change of that function.
Contrast to parametric equations. We worked with parametric equations earlier, where there were multiple variables.
The situation now is different, and pointing out the contrasts will help to clarify what we did then, and what we are doing
now.
With parametric equations, there was a single independent variable, t, and all other variables were dependent. An
example is {x = x(t), y = y(t)}. Now there are multiple independent variables, we can wiggle one and hold the others
constant, and there is a single dependent variable. An example is w = f (x, y).
When you want the derivative in parametric equations, you differentiate with respect to t, the only independent variable,
and you will get a regular derivative. When you differentiate w = f (x, y), you have two independent variables, and you get
two partial derivatives, ∂w/∂x and ∂w/∂y.
In general, a regular derivative is used when there is a single independent variable, and a partial derivative is used
whenever there are multiple independent variables. This is equivalent to using regular derivatives whenever there is a single
input funnel, but partial derivatives whenever there are several input funnels. That remark will come in handy later on.
Net function wiggle, given different input variables wiggling. What happens when we wiggle all of the variables at
once? It turns out that the easiest way to figure that out is to wiggle the variables one at a time, and then see how to combine
them. We would end up with a succession of wiggles (changes) in f . What do we do with them? Let’s figure that out.
Take a simple example first. Suppose we have f (x, y), a function of two variables. (With multiple variables, the hard
step is always in going from one variable to two. Once you see the pattern, going from two to three, or four, or more
variables is quite simple.) Suppose we have a wiggle ∆x and a wiggle ∆y, both of them very small. The change in f for the
∆x wiggle is about (∂f/∂x) × ∆x, where the partial derivative is evaluated at the original point. If we then wiggle y from there
(which will end up giving the same net result as wiggling by both x and y when we look back at the original function), the
function changes by another (∂f/∂y) × ∆y, except that the partial derivative is now evaluated at the x-wiggled point. But, for ∆x
small, since we are approximating anyway, we can get away with evaluating both partial derivatives at the original point.
What then is the final change in f (x, y)? It will be approximately the sum of the changes:
\[ \Delta f(x, y) \approx \frac{\partial f}{\partial x}\,\Delta x + \frac{\partial f}{\partial y}\,\Delta y \tag{3.1} \]
Why the sum and not some other combination (product, for example)? Because the individual changes represent what are
added to the function, so the individual changes are consecutive additions, so the total change is the sum of those. (That’s
difficult to state; if you don’t get it, don’t worry. Just remember that you add.)
This could be confusing, but if you stay calm and look at the formula, you’ll discover that it really isn’t that complicated.
The left hand side represents the total change in f (x, y) that we are trying to approximate. The right hand side has two
terms. The first term represents the change in f due to the fact that x is changing. The partial derivative ∂f/∂x is the wiggle
magnification factor for x, and it gets multiplied by ∆x to give that part of the change in f . The second term on the right
represents the change in f due to the fact that y is changing. The partial derivative there is ∂f/∂y, the wiggle magnification
factor for y, and it gets multiplied by ∆y to give the other part of the change in f . The two added together give the total
change in f .
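A quick numerical sketch shows the two terms adding up. The function f(x, y) = x^2 y, the starting point, and the wiggles are all made up for illustration.

```python
def f(x, y):
    """A made-up two-input function, used only for illustration."""
    return x**2 * y

def f_x(x, y):
    """Partial derivative of f with respect to x (y treated as a constant)."""
    return 2 * x * y

def f_y(x, y):
    """Partial derivative of f with respect to y (x treated as a constant)."""
    return x**2

x0, y0 = 3.0, 2.0      # original point
dx, dy = 0.01, -0.02   # small wiggles in x and y

actual = f(x0 + dx, y0 + dy) - f(x0, y0)          # true total change in f
approx = f_x(x0, y0) * dx + f_y(x0, y0) * dy      # sum of the two wiggle terms

print("actual change:     ", actual)
print("approximate change:", approx)
```

Both partial derivatives are evaluated at the original point, and the sum of the two terms lands close to the true change in f.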
What happens with even more variables? Suppose we have f (x, y, z), and we wiggle all three variables: x, y, and z. The
wiggle magnification formula in this case is the direct generalization of what we had before:
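\[ \Delta f(x, y, z) \approx \frac{\partial f}{\partial x}\,\Delta x + \frac{\partial f}{\partial y}\,\Delta y + \frac{\partial f}{\partial z}\,\Delta z \tag{3.2} \]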
The same thing happens with more variables: You add in a term for each variable, and the term consists of a partial
derivative with respect to that variable times the wiggle in the same variable as the derivative. All these get added together
to give the approximate total change in f .
Dudley: AUGH! There are lots of these? One for each number of variables?!
Albert: Actually, you have it exactly correct. But they are all so similar, it is easy to remember, if you find the pattern.
Mugsy: And if you don’t?
Albert: Let’s just say that you’d be better off finding the pattern.
The formula for total differentials. When we work with differentials rather than general wiggles, the approximations
become equalities, but the interpretation remains the same. For example,
\[ df(x, y, z) = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \frac{\partial f}{\partial z}\,dz \tag{3.3} \]
This is how you find the differential for multiple-variable (multi-dimensional) functions.
3.1.3 ALL THE SAME RULES APPLY, EXACTLY AS THEY DID BEFORE.
The product rule, the quotient rule, and (of course) the chain rule operate exactly as before. Partial derivatives are deriva-
tives, after all. You try to successively simplify the derivatives using the procedures we had, just like our earlier derivatives.
Dudley: This is part of the pattern?
Albert: Actually, it is.
All the same cautions apply, too. Don’t forget the derivative of the inside with the chain rule, for example.
x^2 cosh(w) + w x^2 sinh(w)
Maple uses partial derivative notation even when there is only one variable.
So, taking derivatives on Maple is just the same as before. You have to tell it what variable is being differentiated with
respect to, and all other variables are treated as constants, which is just what partial differentiation does. For example, the
partial derivative ∂/∂w (w x^2 cosh w) would be typed into Maple as the command
> diff( w * x^2 * cosh(w), w);
x^2 cosh(w) + w x^2 sinh(w)
That’s all there is to it!
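If you want to double-check an answer like that one without Maple, a numerical wiggle will do. The sketch below, in Python (my choice for illustration, not part of the course), wiggles only w and compares the result with the formula x^2 cosh(w) + w x^2 sinh(w). The test point and the helper names are made up.

```python
import math

def f(w, x):
    """The function being differentiated: w * x^2 * cosh(w)."""
    return w * x**2 * math.cosh(w)

def claimed_partial_w(w, x):
    """The claimed partial derivative with respect to w (product rule)."""
    return x**2 * math.cosh(w) + w * x**2 * math.sinh(w)

def wiggle_partial_w(w, x, h=1e-6):
    """Symmetric wiggle of w alone; x is held constant."""
    return (f(w + h, x) - f(w - h, x)) / (2 * h)

w0, x0 = 0.5, 2.0  # arbitrary test point
print(wiggle_partial_w(w0, x0), claimed_partial_w(w0, x0))
```

The two printed numbers should agree to many decimal places, which is good evidence that the symbolic answer is right.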
Homework #25
Exercises.
1. How many dimensions are each of the following functions, and how many dimensions would it take to graph each?
(a) f (x, y, z)
(b) f (t, u, v, w, x)
2. How many dimensions are each of the following functions, and how many dimensions would it take to graph each?
(a) f (s,t, u, v)
(b) f (s,t, u, v, w, x, y, z)
3. Find the partial derivatives with respect to x and y for the following functions.
(a) 2x^3 − 5x^2 y − y^6
(b) x y^3 (3x − y^2)
(c) x Arcsin xy
4. Find the partial derivatives with respect to x and y for the following functions.
(a) 7x^2 + 9x^3 y^2 − 2y^5
(b) (x e^y)/(4x − 5y^3)
(c) y^3 sec(x/y)
7. Make up three multi-variable functions of your own and find all the first partial derivatives. One should have two
variables, one should have three variables, and one should have four variables.
Problem.
1. It really is not too easy to see that when we are figuring out the total output wiggle when all the input variables are
wiggling that we should add up all of the individual wiggles. This problem will help convince you that adding is
the appropriate thing to do. Let f (x, y) = x^3 y^2. We will be moving from (2, 1) to (1.99, 1.02), so ∆x = −0.01 and
∆y = 0.02. For these calculations, use the full precision of your calculator; don’t round off. (You might find it useful
to look back at the work we did in a single independent variable, on page 43.)
(a) Figure out the values of f (2, 1) and f (1.99, 1.02). Then calculate ∆ f from these.
(b) We will now wiggle the x-input alone. We have ∆x = −0.01. Find (∂ f /∂ x)∆x where the partial derivative is
evaluated at (2, 1). Then find f (1.99, 1) − f (2, 1). The two numbers should be close.
(c) We will now wiggle the y-input alone. We have ∆y = 0.02. Find (∂ f /∂ y)∆y where again the partial derivative
is evaluated at (2, 1). Then find f (2, 1.02) − f (2, 1). The two numbers should be close (again).
(d) Compare ∆ f from the first part of this problem with the sum (using the numbers from the other two parts)
(∂ f /∂ x)∆x + (∂ f /∂ y)∆y which is the wiggle magnification approximation to ∆ f .
[Green box diagram: x feeds both u1 and u2; u1 and u2 together feed y.]
That leads to asking what variables y is in terms of. It is possible to think of y as a function of u1 and u2 , the way that it
is given. But it is also possible to think of it as a function of x. That is, given a value for x, you can calculate values for u1
and u2 and then use those to get y.
Asking and answering that question is very relevant, since the number of variables that a function is in terms of deter-
mines whether or not the derivative is a regular or partial derivative. Since you need the values of both u1 and u2 to get y,
the derivatives of y with respect to either u1 or u2 will have to be partial derivatives. But once you have the value of x, that’s
all you need to get the value of y. Yes, you do some more calculations (specifically, you calculate u1 and u2 ), but that can
be done from a knowledge of only x. That means that the derivative of y with respect to x will be a regular derivative.
Now, how would it go? The easiest way is to use differentials. (After all, they are legal to use because of the chain rule,
so using them gets you to the formula for the chain rule very rapidly.) Taking differentials in the formulas u1 = u1 (x) and
u2 = u2 (x) gives
\[ du_1 = \frac{du_1}{dx}\,dx \quad\text{and}\quad du_2 = \frac{du_2}{dx}\,dx \]
That is, if you wiggle x, those give you how much both u1 and u2 wiggle. But you can tell how much y will wiggle from
[Green box diagram: x1 and x2 both feed u1 and u2; u1 and u2 together feed y.]
But more than that. In order to get a formula for the chain rule, we would have to get a partial derivative of y with
respect to x1 or x2 . That means that we would only want to wiggle one of them at a time, and suppose for the illustration
we pick x2 to wiggle.
The derivatives of u1 and u2 would also change to partial derivatives, since their formulas would now be u1 = u1 (x1 , x2 )
and u2 = u2 (x1 , x2 ). When we wiggle x2 , the wiggles in u1 and u2 will become
\[ du_1 = \frac{\partial u_1}{\partial x_2}\,dx_2 \quad\text{and}\quad du_2 = \frac{\partial u_2}{\partial x_2}\,dx_2. \]
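Combining that with the differential formula for dy we had (we can use it since we haven’t changed the bottom half of the
green box diagram), we get
\[
\begin{aligned}
dy &= \frac{\partial y}{\partial u_1}\,du_1 + \frac{\partial y}{\partial u_2}\,du_2 \\
   &= \frac{\partial y}{\partial u_1}\frac{\partial u_1}{\partial x_2}\,dx_2 + \frac{\partial y}{\partial u_2}\frac{\partial u_2}{\partial x_2}\,dx_2.
\end{aligned}
\]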
What happens when we divide dy by dx2 ? We get a derivative, of course, but the notation changes to reflect the
situation. We don’t just get dy/dx2 . We had already indicated that the derivative of y with respect to x2 needs to be a partial
derivative, and that’s what you get by dividing: divide dy by dx2 and you get ∂y/∂x2. Doing that, we get
\[ \frac{\partial y}{\partial x_2} = \frac{\partial y}{\partial u_1}\frac{\partial u_1}{\partial x_2} + \frac{\partial y}{\partial u_2}\frac{\partial u_2}{\partial x_2}. \]
I snuck something in there on you. When you divide two differentials, you get a derivative. That’s reasonable; it’s how
we defined differentials. But in this case, dy when divided by dx2 gave ∂y/∂x2 and not dy/dx2 . What’s going on? Plenty, and
not much at all.
Dudley: Huh?
Albert: It depends on how confused you are. Not much is happening, really.
Mugsy: Oh, all kinds of things are happening. I don’t understand this at all yet.
Albert: As I was saying....
The notation dy/dx2 would say that y depends only on x2 , and no other value of any other variable is needed to evaluate y.
Writing ∂y/∂x2 would give the derivative also, but at the same time would say that x2 is only one of several (possibly many)
other variables needed to evaluate y. The notation for differentials dy or dx2 remains the same in either case, but the notation
for derivatives is pickier. The moral of this lesson is that you can still divide differentials to get derivatives, but when you
write them, you must be careful. Specifically, watch out that you put in partial derivatives when you should, and use regular
derivatives the other times.
Again, there are two paths through the green boxes that lead from x2 down to y, and each path gives a term in the chain
rule. The only question is how to sort out what should be a regular derivative and what should be a partial derivative. That
isn’t too hard, and you might even be able to guess. You use a regular derivative when the variable you are differentiating
with respect to (that is, the variable on the bottom of the derivative) is all by itself on its input level. When x was the only
top variable, then all derivatives with respect to x were regular derivatives. When x on top became x1 and x2 , the derivatives
with respect to x2 became partial derivatives.
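To see the two-path formula in action, here is a small worked example. The particular functions are made up for illustration (they do not come from the text): take $u_1 = x_1 x_2$, $u_2 = x_1 + x_2$, and $y = u_1 u_2$, and wiggle only $x_2$.

```latex
\begin{aligned}
\frac{\partial y}{\partial x_2}
  &= \frac{\partial y}{\partial u_1}\frac{\partial u_1}{\partial x_2}
   + \frac{\partial y}{\partial u_2}\frac{\partial u_2}{\partial x_2} \\
  &= (u_2)(x_1) + (u_1)(1) \\
  &= (x_1 + x_2)\,x_1 + x_1 x_2 \\
  &= x_1^2 + 2 x_1 x_2 .
\end{aligned}
```

As a check, substituting first gives $y = x_1^2 x_2 + x_1 x_2^2$, and differentiating that directly with respect to $x_2$ (holding $x_1$ fixed) gives the same $x_1^2 + 2 x_1 x_2$.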
Let’s do one more case before we tackle the whole thing in general. What would happen if we had two top input
variables, but only one intermediate variable? What we have now is u = u(x1 , x2 ) and y = y(u), and again suppose that we
are wiggling only x2 . The green box diagram is here.
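[Green box diagram: the top inputs x1 and x2 feed the first box, whose single output u feeds the second box, which outputs y.]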
$$dy = \frac{dy}{du}\,du = \frac{dy}{du}\frac{\partial u}{\partial x_2}\,dx_2 .$$
What do we get when we divide dy by dx2 ? It will be a partial derivative, since there is another variable on the same level
as x2 , namely x1 . The final formula for the chain rule in this case is then
∂y dy ∂ u
= .
∂ x2 du ∂ x2
Let’s look at that a bit more closely. There is only one term (no additions) here, since there is only one intermediate
variable. The derivative with respect to u is a regular derivative, since there are no other variables on the same level as u,
while the derivatives with respect to x2 must be partial derivatives since x1 is on its level.
With that under control, we are now going after the gold: the multi-variable chain rule in general. Obviously, this is
going to take some feat of notation
Mugsy: Is that anything like feet of clay?
Albert: Very close.
Dudley: Or like head of brass?
Mugsy: That helps, too.
to keep everything straight. Here’s how we are going to do it. The top input variables will be x1 , x2 , . . . , xn . The outputs of
the first function will be u1 , u2 , . . . , um ; these are also the inputs to the second function, which we will call g(u1 , u2 , . . . , um ).
(I am using g rather than y because in a moment I will do an example where I use y as an independent variable. But it is still
all the same.) We need to indicate that the u j ’s are functions of the xi ’s. (I will try to keep the notation consistent in that i
will be the subscript on x, with i going from 1 to n. Then xi is the generic x variable. And j will be the subscript on u, with
j from 1 to m. Then u j will be the generic u variable.) This is done as usual, by writing u j = u j (x1 , x2 , . . . , xn ).
Here is the general green box diagram.
[Green box diagram: the top inputs x1, x2, ..., xn feed the first box, whose outputs u1, u2, ..., um−1, um feed the second box, which outputs g.]
We now have a quandary. Is g a function of the u j ’s or of the xi ’s? The answer is that it is both. And that is precisely
the reason that we need the chain rule. It tells us how to relate the (partial!) derivatives of g with respect to the xi ’s to the
(partial!) derivatives of g with respect to the u j ’s. It is again the question of how to change the variable of differentiation,
just with lots of variables now.
How do we keep all of these straight? With xi ’s and u j ’s it is bad enough, but when we are working with other sets of
variables (in the homework, for example), it is even worse. The variables are not always nicely given to you in a coherent
pattern enabling you to keep them separate mentally. The trick is this. The function g is originally given (usually as a
formula) in terms of one set of variables. These are the u j ’s. Each of those variables depends on another set of variables.
Those are the xi ’s. I will often, for my own benefit, draw a pair of green boxes with all the inputs and outputs labeled.
The upper set of inputs are the xi ’s and the middle set of variables are the u j ’s. This really helps! (Wait until we get to the
examples.)
Let’s figure out how this is going to work. We want the partial derivative of g with respect to xi , a generic one of the x’s.
(Remember, the xi ’s are the top variables.) We wiggle the xi by some dxi . This has the effect of wiggling all of the u j ’s.
(Those are the middle variables.) The amount of wiggle of each du j is
$$du_j = \frac{\partial u_j}{\partial x_i}\,dx_i$$
(I am using differentials here for the wiggles. This could be done with ∆u j , and you’d get an approximately equal (≈)
rather than an equals (=) in that last equation. The process gives exactly the same answers either way.) Each of the wiggles
in each of the du j ’s causes the value of g to wiggle, and the total wiggle in g is then
$$dg = \sum_{j=1}^{m} \frac{\partial g}{\partial u_j} \times du_j$$
But we have an expression for du j (we just got it), which we can plug into this and get
$$dg = \sum_{j=1}^{m} \frac{\partial g}{\partial u_j} \times \frac{\partial u_j}{\partial x_i}\,dx_i$$
Look at this on the green box diagram. The top input wiggles, wiggling all the middle variables, and each of those wiggles
the bottom output, and the net effect is the sum of all of those wiggles. Finally, then, we get that the wiggle magnification
factor is
$$\frac{\partial g}{\partial x_i} = \sum_{j=1}^{m} \frac{\partial g}{\partial u_j} \times \frac{\partial u_j}{\partial x_i}$$
To help you remember this another way, I have written out the multi-variable chain rule’s summation (with the single
variable version below for comparison), in the following box.
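$$\frac{\partial g}{\partial x_i} = \sum_{j=1}^{m} \frac{\partial g}{\partial u_j} \times \frac{\partial u_j}{\partial x_i}$$
$$\frac{dg}{dx} = \frac{dg}{du} \times \frac{du}{dx}$$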
Note that the pattern with the partial derivatives is the same as the single-dimensional chain rule. The terms in the partial-
derivative chain rule look as though you are canceling the ∂ u j ’s, just as it looks as though the du’s are canceling in the
single-dimensional chain rule. Now, though, the fact that these are partial derivatives clues you to the fact that there is more
than one du around, and that you have to add up multiple terms. That is why you need the summation. Also note that you
get one term in the summation of the partial derivatives for each middle variable. There is a contribution to ∂ g/∂ xi due to
u1 changing, another part due to u2 changing, . . . , and a part due to um changing. Each of these contributes one term to the
summation. These ideas, when put together, help you set up the correct expression for working the chain rule. Just be sure
to remember to add it all up!
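One way to convince yourself that the summation really works is to check it numerically. The sketch below is only an illustration: the middle functions in `us` and the bottom function `g` are made up (hypothetical), and every derivative is estimated by wiggling one variable by a small amount `h`, in the spirit of the wiggles above. It compares the direct derivative of g with respect to each top variable against the chain-rule sum over the middle variables.

```python
import math

def partial(f, args, k, h=1e-6):
    """Central-difference estimate of the partial derivative of f
    with respect to its k-th argument, at the point args."""
    up = list(args)
    dn = list(args)
    up[k] += h
    dn[k] -= h
    return (f(*up) - f(*dn)) / (2 * h)

# Hypothetical middle variables u_j(x1, x2), here m = 3 and n = 2.
us = [
    lambda x1, x2: x1 * x2,
    lambda x1, x2: x1 + x2 ** 2,
    lambda x1, x2: math.sin(x1 - x2),
]

def g(u1, u2, u3):
    """Hypothetical bottom function, given in terms of the middle variables."""
    return u1 ** 2 + u2 * u3

def g_of_x(x1, x2):
    """g viewed as a function of the top variables only."""
    return g(*(u(x1, x2) for u in us))

def chain_rule(x, i):
    """Sum over j of (dg/du_j) * (du_j/dx_i), all partial derivatives."""
    u_vals = [u(*x) for u in us]
    return sum(partial(g, u_vals, j) * partial(us[j], x, i)
               for j in range(len(us)))

x = [0.7, -1.2]
for i in range(len(x)):
    print("i =", i + 1,
          "direct:", partial(g_of_x, x, i),
          "chain rule:", chain_rule(x, i))
```

The two numbers on each printed line should agree to several decimal places; dropping any one term from the sum spoils the agreement.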
In fact, let’s work another problem that forces us to be exceedingly careful about regular versus partial derivatives. Let’s
suppose we have a function like g(x, y,t), which depends on both position in the xy-plane and time. (If you want, you can
think about g as representing temperature of a plate that is heating up. The temperature would, in that case, depend both on
the location (x, y) and the time t.) Suppose also that you have a curve that is parameterized by time, x = x(t) and y = y(t).
(If you want, you can think of this as the parametric equations for the position of a bug crawling along on the plate.) Then,
given a value of t, you can figure out the values of x and y, and from those (with the value of t still), you can find the value
of g. That is, we can also think of g as a function of only t. Suppose you want to find the derivative of g as a function of
time alone. (That would be asking how fast the plate under the bug is heating up as it is crawling around.) How would you
do that?
This one is confusing because time enters in an unusual way. To sort all of this out, we need to isolate the top and
middle variables. The middle variables are the ones that g is given in terms of; in this case x, y, and t. The top variables are
the ones that those variables are in terms of; in this case, just t. (It’s the fact that t appears twice that makes this problem
interesting.)
Mugsy: Has anyone ever told you that you have a warped sense of “interesting?”
If we set up the formula for the differential of g using just the intermediate variables, we get
$$dg = \frac{\partial g}{\partial x}\,dx + \frac{\partial g}{\partial y}\,dy + \frac{\partial g}{\partial t}\,dt$$
From the x = x(t) and y = y(t) equations, we can get dx and dy in terms of dt:
$$dx = \frac{dx}{dt}\,dt \quad\text{and}\quad dy = \frac{dy}{dt}\,dt$$
(Yes, those are not partial derivatives, but regular derivatives. I’ll explain momentarily, when we’ve finished the problem.)
Plugging the values in for dx and dy gives
$$dg = \frac{\partial g}{\partial x}\frac{dx}{dt}\,dt + \frac{\partial g}{\partial y}\frac{dy}{dt}\,dt + \frac{\partial g}{\partial t}\,dt$$
from which we can factor a dt and get
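$$dg = \left( \frac{\partial g}{\partial x}\frac{dx}{dt} + \frac{\partial g}{\partial y}\frac{dy}{dt} + \frac{\partial g}{\partial t} \right) dt$$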
Finally, dividing through by dt gives
$$\frac{dg}{dt} = \frac{\partial g}{\partial x}\frac{dx}{dt} + \frac{\partial g}{\partial y}\frac{dy}{dt} + \frac{\partial g}{\partial t}$$
Now that looks weird. We have both dg/dt and ∂ g/∂t in the same equation! What is going on? Something important to
keep straight, which is why I did this problem. When we write dg/dt, we are looking at g as a function of only t. That is,
after we plug the formulas for x(t) and y(t) into g(x, y,t), we get a formula for g with t as the only variable; we only need to
have t to find out g. The derivative of that function is dg/dt. If we take the derivative of g(x, y,t) with respect to t before we
plug in the formulas for x(t) and y(t), then when we take that derivative of g, we have other variables around, and we then
denote the derivative of g with respect to t as ∂ g/∂t. We calculate the value of ∂ g/∂t by just differentiating the formula we
are originally given for g(x, y,t).
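Perhaps an example would be of benefit. Suppose
$$g(x, y, t) = \cos x + e^{y} + \sinh t$$
(where the goofy functions will help us keep things separate). Also, suppose the parametric equations are
$$x = t^2 \quad\text{and}\quad y = t^3 .$$
The partial derivatives of g are
$$\frac{\partial g}{\partial x} = -\sin x, \qquad \frac{\partial g}{\partial y} = e^{y}, \qquad \frac{\partial g}{\partial t} = \cosh t .$$
We can also find g(t), which we get by plugging in the formulas for x(t) and y(t) into the function g(x, y, t):
$$g(t) = \cos(t^2) + e^{t^3} + \sinh t .$$
Its derivative is
$$\frac{dg}{dt} = -\sin(t^2)\,(2t) + e^{t^3}(3t^2) + \cosh t$$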
by the regular chain rule. Compare that to ∂ g/∂t, and you realize that you will miss the first two terms, the ones that came
from x(t) and y(t), if you think that dg/dt is the same as ∂ g/∂t. The correct way to find dg/dt, namely the chain rule,
gives
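\begin{align}
\frac{dg}{dt} &= \frac{\partial g}{\partial x}\frac{dx}{dt} + \frac{\partial g}{\partial y}\frac{dy}{dt} + \frac{\partial g}{\partial t}\frac{dt}{dt} \tag{3.4} \\
              &= -\sin x\,(2t) + e^{y}(3t^2) + \cosh t \tag{3.5}
\end{align}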
Note that all you have to do is plug x = t 2 and y = t 3 into this last equation to get what we had before for dg/dt. Stare at
this example until you understand where all the terms come from and why. It really will help!
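If staring is not enough, a few lines of code can make the difference between dg/dt and ∂g/∂t concrete. This is only a sketch: it assumes the example is g(x, y, t) = cos x + e^y + sinh t with x = t^2 and y = t^3 (which is what the formula for g(t) above indicates), and the function names and the sample value t0 are mine. It compares the chain-rule value of dg/dt with a direct numerical derivative of g(t), and then with ∂g/∂t alone.

```python
import math

def g(x, y, t):
    """The example function, in terms of the middle variables x, y, t."""
    return math.cos(x) + math.exp(y) + math.sinh(t)

def g_along_path(t):
    """g as a function of t alone, after plugging in x = t^2 and y = t^3."""
    return g(t ** 2, t ** 3, t)

def dg_dt_chain(t):
    """Total derivative from the chain rule: three terms."""
    x, y = t ** 2, t ** 3
    return -math.sin(x) * (2 * t) + math.exp(y) * (3 * t ** 2) + math.cosh(t)

def partial_g_partial_t(t):
    """Partial derivative with x and y held fixed: only the last term."""
    return math.cosh(t)

def numeric_derivative(f, t, h=1e-6):
    """Central-difference estimate of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)

t0 = 0.8  # arbitrary sample time
total = dg_dt_chain(t0)
numeric = numeric_derivative(g_along_path, t0)
partial_only = partial_g_partial_t(t0)

print("dg/dt by chain rule :", total)
print("dg/dt numerically   :", numeric)
print("partial g/partial t :", partial_only)
```

The first two numbers agree; the third is noticeably smaller, because it leaves out the contributions that come through x(t) and y(t).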
Was this example contrived merely to be complicated? Not at all. This same sort of confusion occurs in fluid mechanics,
for example, where instead of dg/dt, they write it as Dg/Dt, and call it the material derivative. You need the material
derivative to calculate the acceleration of the particle. (g in this case is the velocity of the particle, derived from the flow of
the fluid. Then the derivative with respect to t is the acceleration.) By itself, $\partial g/\partial t$ represents only how the fluid flow is changing at
a single point, since the other variables (x, y, and z) are being held constant. But to get the acceleration of a particle of fluid
correctly, you have to take into account that the particle is moving, too! (If this confuses you, take heart. It is genuinely
confusing. Fluid mechanics is full of this stuff, and I am just beginning to get a handle on it myself.)
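For reference, this is how the material derivative is usually written out for a scalar quantity $g(x, y, z, t)$ carried along by a particle whose position is $(x(t), y(t), z(t))$. It is the same chain rule as in equation (3.4) with one more space variable; the symbols are the customary ones and are an assumption on my part, since the text does not write the formula out.

```latex
\frac{Dg}{Dt}
  = \frac{\partial g}{\partial t}
  + \frac{dx}{dt}\frac{\partial g}{\partial x}
  + \frac{dy}{dt}\frac{\partial g}{\partial y}
  + \frac{dz}{dt}\frac{\partial g}{\partial z}
```

The first term is the change at a fixed point; the other three account for the fact that the particle is moving.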
How, then, in practice, do you tell whether to use d()/d() or ∂ ()/∂ ()? It actually is easy (despite the hassle above).
Look at the green box diagram. If you are taking a derivative with respect to a variable that is all by itself at that level, then
it is d()/d(). If there are several variables on that level, then the derivative is ∂ ()/∂ (). Compare to what we had above with
dg/dt versus ∂ g/∂t. When we were treating g as a function only of the top t, the derivative was dg/dt. When we were
treating g as a function on the level with x, y, and t, the derivative was $\partial g/\partial t$. Look at it again, and think about this until it sinks
in. Once it makes sense, you have arrived in your understanding of the difference between the regular (total) derivative
and the partial derivative. This also explains why we used dx/dt and dy/dt rather than partial derivatives when finding the
derivatives of x = x(t) and y = y(t).
Homework #26
Problem.
1. Suppose we start with a function w = w(x, y), but that we want to take the partial derivatives of w with respect to
r and θ , the polar coordinate system. (This happens often enough in applications, when you decide that the polar
coordinate system is more suited to your problem than rectangular coordinates are.) The equations relating polar
and rectangular coordinates are x = r cos θ , and y = r sin θ . Also, I will use the subscript notation here, since it is so
common, and also saves a lot of space.
(b) Use those equations to show that $(w_x)^2 + (w_y)^2 = (w_r)^2 + (1/r)^2 (w_\theta)^2$. (Note: This last equation is for a rather
important combination of partial derivatives in applications. The point of this problem was to show how to
convert the expression from one coordinate system to another. This happens all the time when dealing with
what are called partial differential equations.)
Notations.
There are two different notations for first-order partial derivatives, and they both extend to notations for higher-order partial
derivatives. There is a subtle difference, but it turns out not to be serious in any real situation.
The ∂ ()/∂ () notation extends in exact analogy to the way the notation for d()/d() extended to higher-order derivatives.
For example, if you have f (q, r, s), then the first partial derivative with respect to q is
$$\frac{\partial f}{\partial q}$$
If you want the derivative of that with respect to r, it would be
$$\frac{\partial}{\partial r}\left(\frac{\partial f}{\partial q}\right)$$
which would be compressed to
$$\frac{\partial^2 f}{\partial r\,\partial q}$$
Note several things. First, the 2 in the top indicates the total number of derivatives to take, and the order in which you
take the derivatives is indicated in the bottom of the derivative, from right to left. (The reason for the order is that in this
notation, we tack the derivatives onto the left side, so the leftmost variables are the ones that showed up last.)
A bigger example is given by f (w, x, y, z) and the derivative
$$\frac{\partial^4 f}{\partial y\,\partial z^2\,\partial w}$$
In this case, there will be a total of four partial derivatives taken, one with respect to w first, then 2 with respect to z next,
and finally one with respect to y.
The subscript notation for partial derivatives has its extension also. If the first partial of f (w, x, y, z) with respect to x
is $f_x$, then its next derivative with respect to z is $(f_x)_z = f_{xz}$. The order of taking derivatives is backwards from the other
notation. Here, we take the partial derivatives in order from left to right, since we tack on derivatives to the right-hand side.
The subscripts more to the right showed up later. There is no typical notation to abbreviate several derivatives with respect
to the same variable in succession, the way we could in
$$\frac{\partial^4 f}{\partial y\,\partial z^2\,\partial w}$$
where the two z-partial derivatives were combined. In fact, that derivative would be written as
$$f_{wzzy}$$
in subscript notation.
Interpretations.
A graphical interpretation of second-order and higher partial derivatives is seriously complicated. I’m not going even to try
to hassle you with it.
Even the algebraic interpretation is clumsy. Basically, for example,
$$\frac{\partial^2 f}{\partial r\,\partial q} = \frac{\partial}{\partial r}\left(\frac{\partial f}{\partial q}\right)$$
so the second partial derivative is the rate at which the first partial derivative is changing. It’s hard to say much more.
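For example, take a function such as $f(x, y, z) = e^{xy}\sin(2z)$, whose first partial derivatives are
$$f_x = y\,e^{xy}\sin(2z), \qquad f_y = x\,e^{xy}\sin(2z), \qquad f_z = 2\,e^{xy}\cos(2z) .$$
Differentiating each of these with respect to x, y, and z in turn gives nine second partial derivatives.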
Note that
$f_{xy} = f_{yx}$, $f_{xz} = f_{zx}$, and $f_{yz} = f_{zy}$
and that none of the others match. The matching always happens. The property is called the equality of mixed partial
derivatives, and is a very powerful result in mathematics. (It should be noted that there are conditions that need to be met
for this to happen. They are technical, and require much more than I would expect you to know for a calculus course.
However, for all of the functions you will encounter for a long while—probably forever, unless you are a mathematics
major—the conditions will hold.)
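Written out, and assuming the first partial derivatives listed above come from $f(x, y, z) = e^{xy}\sin(2z)$ (that function is inferred from them, so treat it as an assumption), the nine second partial derivatives would be as follows. Each row differentiates one of $f_x$, $f_y$, $f_z$ with respect to x, y, and z in turn.

```latex
\begin{aligned}
f_{xx} &= y^2 e^{xy}\sin(2z), & f_{xy} &= (1 + xy)\,e^{xy}\sin(2z), & f_{xz} &= 2y\,e^{xy}\cos(2z), \\
f_{yx} &= (1 + xy)\,e^{xy}\sin(2z), & f_{yy} &= x^2 e^{xy}\sin(2z), & f_{yz} &= 2x\,e^{xy}\cos(2z), \\
f_{zx} &= 2y\,e^{xy}\cos(2z), & f_{zy} &= 2x\,e^{xy}\cos(2z), & f_{zz} &= -4\,e^{xy}\sin(2z).
\end{aligned}
```

Reading across the diagonal of this array shows the three matching pairs: the entry in row a, column b equals the entry in row b, column a.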
There is an extension to the equality of mixed partial derivatives. It says that when you want to find the partial derivatives
of a function, you can do any of them in any order you want, as long as you end up taking the correct number of partial
derivatives with respect to each of the variables. This is occasionally useful, as can be demonstrated by exaggerated
examples, such as this one. Suppose we want to find $\partial^4 f/\partial x\,\partial y^3$ for the function
$$f(x, y) = \sec\!\left( \frac{\sqrt{y^3 - y^2 + 15} + \cosh(y^3)}{\ln(y^2 + e^{-y})} \right) + x^3 y^4$$
The whole point is not to try to take three derivatives of the first term with respect to y (which is what the notation is
asking for), but rather be intelligent about what derivatives to take first, and take the derivative with respect to x before the
derivatives with respect to y. Why? Because the derivative with respect to x first causes the entire first term to evaporate!
There are no x’s in it, so its derivative with respect to x is 0. Then take the three derivatives with respect to y, but you only
have to deal with the last term, which isn’t too bad.
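Carried out step by step, with the x-derivative taken first, the calculation looks like this. The steps are my own working of the example, with the first term of f dropping out at the very first step because it contains no x.

```latex
\begin{aligned}
\frac{\partial f}{\partial x} &= 0 + 3x^2 y^4 = 3x^2 y^4, \\
\frac{\partial}{\partial y}\left(3x^2 y^4\right) &= 12x^2 y^3, \\
\frac{\partial}{\partial y}\left(12x^2 y^3\right) &= 36x^2 y^2, \\
\frac{\partial}{\partial y}\left(36x^2 y^2\right) &= 72x^2 y .
\end{aligned}
```

Four easy derivatives, and the secant term never had to be touched.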
Again, you can do this with Maple, but there is no real difference between what we have done before and what we are
doing now. Maple simply takes care of all of the messy algebra for us. For example, in the horrid example right before this,
you could write
> w := sec( ( sqrt( y^3 - y^2 + 15) + cosh(y^3) ) / ln( y^2 + exp(-y) )
> ) + x^3 * y^4;
> diff( w, y$3, x );
$$w := \sec\!\left( \frac{\sqrt{y^3 - y^2 + 15} + \cosh(y^3)}{\ln(y^2 + e^{(-y)})} \right) + x^3 y^4$$
$$72\,x^2 y$$
Note that you only have to specify the variables for differentiation, not the total number of derivatives (the 4 in this
case). Maple can add. Maple can also take just diff(w,y);, but that’s too horrible to include here.
Homework #27
Exercises.
1. Find all the different (that is, potentially unequal) second partial derivatives of the following functions. So, for
example, you don’t have to list fxy and fyx , since these should be equal.
(a) $x^4 y^5$
(b) y ln x
(c) Arctan(x y)
2. Find all the different second partial derivatives of the following functions.
(a) $x^2 y^5$
(b) y Arctan x
(c) ln(x y)
3. Make up three functions (multi-variable) of your own and find all the different second partial derivatives. Usual rules
apply.
Problem.
1. Find $f_{xxxyy}$ for $f(x, y) = y\,e^{xy}$. [The order can simplify things considerably.]
Investigation.
1. We want to solve the partial differential equation
$$\frac{\partial^2 w}{\partial x^2} = \frac{1}{c^2}\,\frac{\partial^2 w}{\partial t^2}$$
This is one of the fundamental equations of physics, called the wave equation. Here, c is the speed of propagation of
the wave, and is a constant. Let f (u) and g(v) be any reasonable (differentiable) functions. Set up
$$w(x, t) = f(x + ct) + g(x - ct) .$$
This is the solution. (Don’t ask how I got it. You don’t want to know.) We can check it, though! For this, assume that
u = x + ct and v = x − ct, so w = f (u) + g(v).
(a) Find $\partial w/\partial x$ and $\partial w/\partial t$ in terms of the derivatives of f (u) and g(v). [Will the derivatives of f and g be partial
or regular derivatives?]
(b) Find $\partial^2 w/\partial x^2$ by taking $\partial/\partial x$ of $\partial w/\partial x$. (Careful doing this. You’ll need the chain rule again.) Find $\partial^2 w/\partial t^2$
by taking $\partial/\partial t$ of $\partial w/\partial t$. (Again, careful doing this.)
(c) Plug the results of $\partial^2 w/\partial x^2$ and $\partial^2 w/\partial t^2$ into the wave equation, and show that both sides are equal. This
verifies that w as given at the beginning is a solution of the wave equation.
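Before (or after) doing the algebra, a numerical sanity check can be reassuring. The sketch below does not replace the investigation; it only tests one particular case. The choices f(u) = sin u, g(v) = e^(−v²), the wave speed c = 2, and the sample point are all hypothetical, and the second derivatives are estimated with small wiggles of size h rather than computed exactly.

```python
import math

c = 2.0  # hypothetical wave speed

def f(u):
    """Hypothetical right-hand building block."""
    return math.sin(u)

def g(v):
    """Hypothetical left-hand building block."""
    return math.exp(-v ** 2)

def w(x, t):
    """Candidate solution w = f(u) + g(v) with u = x + ct, v = x - ct."""
    return f(x + c * t) + g(x - c * t)

def second_difference(func, s, h=1e-3):
    """Central-difference estimate of the second derivative of func at s."""
    return (func(s + h) - 2 * func(s) + func(s - h)) / h ** 2

x0, t0 = 0.3, 0.5  # arbitrary sample point

# Wiggle x with t held fixed, then t with x held fixed.
w_xx = second_difference(lambda x: w(x, t0), x0)
w_tt = second_difference(lambda t: w(x0, t), t0)

print("left side  (w_xx)       :", w_xx)
print("right side (w_tt / c^2) :", w_tt / c ** 2)
```

The two printed values should agree to several decimal places. Agreement at one point for one choice of f and g proves nothing in general, which is why parts (a) through (c) are still worth doing.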
Level sets.
Before we can get very far with implicit functions, we need to look carefully at the type of items defined by a slightly more
general equation, namely f (x, y) = C, where C is any constant, and not just 0. And we will be concerned initially about the
geometry of equations defined this way. Only later will we get to the calculus side of things.
Earlier in this chapter, we talked about the dimension of a function. We come back to that idea, and expand on it some.
The idea is to come up with a way to deal with two- and three-dimensional functions that doesn’t require graphs in three
or four dimensions. Here’s how. A level set of an n-dimensional function f (x1 , x2 , . . . , xn ) is the collection of points that
satisfy the equation f (x1 , x2 , . . . , xn ) = C. Obviously, then, graphs of implicit functions are just very special cases of level
sets, namely the ones where C happens to equal 0.
Justifications. Why do this? There had better be a better reason than “because it is there” or something equally useless.
The only convenient way to visualize three-dimensional functions. If you’ll remember, in order to graph a three-
dimensional function (that is, a function with three independent variables), you’d need four dimensions, three dimensions
for the input variables and a fourth dimension for the value of the output. This is discouraging, since three-dimensional
functions do occur (after all, this is a three-dimensional world), and four-dimensional graphs are difficult to work with, to
put it mildly.
On the other hand, if we work with level sets of a three-dimensional function, we don’t need to go beyond three
dimensions. All we need to do is draw (somehow) the different level sets of the function in order to convey the “shape” of
the function. That still isn’t easy—three-dimensional graphs are difficult to interpret, much less to draw—but at least it is
possible.
The only convenient way to produce topographical maps. A somewhat more everyday example is producing topographical
maps. A common topographical map shows the altitude of the locations on the map. Consider how this would
work with a graphical approach to the altitude. The function altitude is a two-dimensional function (you need to know latitude
and longitude, two variables, to get the altitude). The graph of the altitude function should then be three dimensional.
Two variables locate the point, and the third dimension is altitude. The graph is a miniature version of the locale of the
map, like you might find in a large model railroad setup. Such a map would not be very easy to carry, and just think about
trying to fold one!
But fortunately, map makers have a better way to represent the altitude than using graphs. Instead, they draw level
curves on the paper that represent different altitudes, and communicate the same information the graph of the function
would, but in a more convenient form.
Description. If you’ll notice, the idea of using level curves to represent altitudes of places on the map yields a two-
dimensional map. You don’t need the third dimension to get values this way! This is very handy.
In general, a level set of a function with n variables can be plotted in n dimensions. If you plot a bunch of the level sets
for various values of C, you can get an idea of the values of the function at all points. It really is a very big connect-the-dots
game. Each point has a function value, and you connect all the points that have a specific value by a level set.
There is one difficulty. If we plotted all of the level sets, every point would be covered, and you wouldn’t get any
information. Plotting level sets requires some common sense.
Uses. I still probably haven’t convinced you that level sets are useful in everyday life. If you aren’t a hiking advocate,
you probably haven’t encountered topographical (U.S. Geological Survey) maps. But there are a few level sets on maps that
you have seen. The weather bureau produces several. Instead of giving the barometric pressure at a bunch of points, they
will draw in isobars, level sets of barometric pressure. These are the curved lines that you see looping around the high or
low pressure regions.
And USA Today has popularized the multicolored temperature maps, where regions of roughly equal temperature are
colored the same. Those are level sets!
Relations between graphs and level surfaces. When the function has two dimensions (independent variables), the level
set of that function will be a collection of points in two dimensions, usually called a level curve of the function. When the
function has three dimensions, the level set of that function will be a collection of points in three dimensions, usually called
the level surface of the function. This is nothing more than terminology.
If you are given a set in two dimensions, there is a question that comes up. Is this the graph of a one-dimensional
function or the level set of a two-dimensional function? It is a worthwhile question to ask, because you deal with level
sets differently than you do graphs. How would you tell? You’d look at the equation that gave the set. If it is in the form
y = f (x), then it’s the graph of a one-dimensional function. If it is in the form F(x, y) = C, then it’s the level curve of a
two-dimensional function. The key is in the form of the equation. You could change the form of an equation to another, but
equivalent, form, and change whether a specific curve was a graph or a level curve. We’ll do that later, in fact.
The same question can be asked about a three-dimensional set. Is it the graph of a two-dimensional function or the level
set of a three-dimensional function? Since we don’t often work any higher than that, we won’t go any further. But again,
the form controls how you view the set. The same comments apply here, too.
The graph of a two-dimensional function and its level set. Now we come to one of the big questions we’ll have to
answer. If you are given a single function f (x, y), what is the relation between its level curves and its graph? Again, this is
of more than casual interest to hikers using a topographical map. They need to look at the level curves on the map and use
that to decide what the terrain looks like in order to locate themselves on the map. (Remember, the graph of the altitude
function is the terrain, with hills and valleys and other things.)
One problem is that the level set is two-dimensional, but the graph is three-dimensional. That creates problems connecting
them. This process requires thinking in three dimensions, and this is very difficult for some people. I will try to
make this as easy as I can. Suppose we have a two-dimensional function f (x, y). First, let’s go from the graph to the level
sets. It’s the easier of the two directions. What would be the equation of a level curve? That is an important question. The
level curves are points with all the same values of f (x, y). That means they satisfy the equation f (x, y) = C, for some value
of the constant C. This is the equation of the level curves of any (two-dimensional) function.
What is the equation of the graph of the function? It is z = f (x, y). (Notice that the extra variable, z, crept in. That
forces the graph into three dimensions, the extra dimension being necessary for the output value of the function. Also, note
that we needed no extra variable for level curves. The value of C is not so much of an output value as a parameter that we
get to choose.) How would we relate these two? We can get z = f (x, y) to match f (x, y) = C if we force an extra condition
on the graph, namely z = C.
Eliminating z between z = f (x, y) and z = C gives f (x, y) = C. But what is z = C? In three dimensions, it is a horizontal
plane with height C. What does it mean to require both z = f (x, y) and z = C? To get the solutions of both at once, we
look for the intersection of the two equations. What happens when we intersect the graph with a horizontal plane? It has
the effect of slicing through the graph at a specific height. That slice (or, depending on your way of visualizing graphs, the
edge of the slice) is a level curve of the function, almost. To be a level curve, it needs to be in the xy-plane. So, push it
down to the xy-plane, and you get a level curve of the function. Do this with a number of different C’s, and you will get a
number of different level curves. You slice the graph horizontally into little strips, and the edges of those strips are the level
curves.
On the other hand, you can also go backwards, from the level curves to the graph of the function. Normally, on a set
of level curves, the value of the function on each level curve is given to you. That means that on the graph, all those points
have the same height. Lift up the level curve to that height, at least in your imagination. Do this with all the other level
curves, and you will get a sort of wire frame for the graph of the function. Fill it in reasonably, and you will get the graph
of the function, at least if the function itself is reasonable. This is precisely what hikers have to do. They then compare the
terrain around them with the reconstruction of the terrain from the level curves on the map.
There is no substitute for an example. Take $f(x, y) = x^2 + y^2$. My suspicion is that none of you knows what the graph
of this function looks like, but that some of you will be able to identify the level curves. The surface is called a paraboloid,
the shape of mirrors in telescopes and searchlights. The level curves are $x^2 + y^2 = C$ for different values of C. These are
circles centered at the origin, at least as long as C > 0. The larger C, the larger the circle. This means that we get the wire frame
by lifting these concentric circles up. The larger the radius ($\sqrt{C}$), the higher we lift the circle. The frame fills in to
give a bowl-shaped object.
We could also work backwards. If we start with the paraboloid, we get the level curves by slicing it horizontally. The
slices will be circles, which are then pushed down to the xy-plane to give the level curves, a series of concentric circles,
$x^2 + y^2 = C$.
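The paraboloid example is easy to poke at with a few lines of code. This is a minimal sketch with made-up helper names: for each chosen value of C it places a handful of points on the circle of radius √C and confirms that the function really does take the value C at every one of them, which is all that "level curve" means.

```python
import math

def f(x, y):
    """The example function whose graph is the paraboloid."""
    return x ** 2 + y ** 2

def level_curve_points(C, count=8):
    """Evenly spaced points on the level curve f(x, y) = C, for C > 0.
    The curve is the circle of radius sqrt(C) centered at the origin."""
    r = math.sqrt(C)
    return [(r * math.cos(2 * math.pi * k / count),
             r * math.sin(2 * math.pi * k / count))
            for k in range(count)]

for C in (1, 4, 9):
    points = level_curve_points(C)
    worst = max(abs(f(x, y) - C) for x, y in points)
    print("C =", C, " radius =", math.sqrt(C), " largest error =", worst)
```

Lifting each of those circles to height C is exactly the wire frame described above: the circle of radius 1 sits at height 1, the circle of radius 2 at height 4, and the circle of radius 3 at height 9.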
Converting a graph of a two-dimensional function into a level set of a three-dimensional function. It turns out
that level sets are easier to work with than graphs. In fact, when we want to do serious work with the graph of a function,
we will convert it to a level set by changing the function. So, if we are given the graph of a two-dimensional function, how
can we convert the graph to be the level surface of a different function?
Note that the level sets of the two-dimensional function will be curves in the xy-plane. These definitely won’t equal the
graph in three dimensions. What we are trying to do here (and I’ve said it several times to get the point across forcefully) is
find a new function which will have that graph as a level surface. What does that mean about the new function? It will have
to have three independent variables, because its level surface is in three dimensions. The function that we graphed had two
dimensions. What we need is a way to put that third variable in properly.
The method of doing that is so simple that it is hard to see. The graph of the function f (x, y) will have equation
f (x, y) = z. The extra variable is z. It is the output variable, the dependent variable, or however you want to say it. How do
we convert f (x, y) = z into a level set equation? Simple: Pull all the variables to one side, and get f (x, y) − z = 0. That is the
equation of a level set! It is the level set of F(x, y, z) = f (x, y) − z corresponding to the constant C = 0. That is, F(x, y, z) = 0
is precisely the same as f (x, y) = z.
That looks (and is) easy, but something quite unusual has happened. The dependent variable z in f (x, y) = z has changed
into the independent variable z in F(x, y, z) = 0. That switch is really “all” that happened. What we did to convert the graph
of a two-dimensional function into the level set of a three-dimensional function was add in explicitly the dependent variable,
declare it to be an independent variable, and move it to the other side of the equals sign where all good independent variables
belong. That gives the equation of the function whose level set (corresponding to the constant 0) is the graph of the original
function.
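The switch from dependent to independent variable is easy to see in code. In this sketch the particular f is a made-up example (any two-variable function would do), and F is built from it exactly as described: move z to the other side. Points taken from the graph of f land on the level set F = 0, and a point off the graph does not.

```python
def f(x, y):
    """Hypothetical two-dimensional function; its graph is z = f(x, y)."""
    return x ** 2 + y ** 2

def F(x, y, z):
    """Three-dimensional function whose level set F = 0 is the graph of f.
    Here z is an independent variable, on equal footing with x and y."""
    return f(x, y) - z

# Points on the graph: z is computed from x and y.
on_graph = [(x, y, f(x, y)) for x in (-1.0, 0.5, 2.0) for y in (0.0, 1.5)]

# A point that is not on the graph: z is chosen freely.
off_graph = (1.0, 1.0, 5.0)

for point in on_graph:
    print(point, "-> F =", F(*point))
print(off_graph, "-> F =", F(*off_graph))
```

Every point from the graph gives F = 0, while the free point gives a different value and so lies on a different level surface of F.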
Homework #28
Exercises.
1. What is a function which has a level set which is the same as the graph of y = 2/x? (There are many correct answers
here. Can you come up with several functions?)
2. What is a function which has a level set which is the same as the graph of y = x + 5? (Again, there are many correct
answers.)
Problem.
1. Why can’t different level sets of a function intersect each other? (Hint: What would the value of the function be at
an intersection point?)
Contrast to explicit functions. An explicit function is quite different. In that case, there is one variable isolated all by
itself on one side of the equals sign, and all other variables occur on the other side. The isolated variable is the dependent
variable, and the others are the independent variables.
An implicit function might not be a true function. One of the difficulties is that an implicit function might or might
not be a function by the definitions we gave at the beginning of the course. For example, $x^2 + y^2 - 1 = 0$ is a perfectly good
implicit function, but its graph is the unit circle, and that is not the graph of an honest-to-goodness function, whether the
independent variable is x or y.
What would happen if you tried to solve an implicit function for some variable? That is, what happens if you try to
convert an implicit equation into an explicit one? Several possibilities can occur. You might be able to solve it, or you
might not. And even if you can solve it, you might get a function (no ±’s) or you might not. And just because you can’t
solve for one variable doesn’t mean that it is not a function. For example, $x^5 + y^5 + x + y - 1 = 0$ defines y as an implicit
function of x, and it turns out to be a genuine function also, but solving for y in terms of x with an algebraic formula is not possible.
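Even without a formula for y, the claim that this equation gives a genuine function can be checked numerically. The following is a minimal sketch in Python rather than Maple; the function names and the search bracket are assumptions made for this illustration. It relies on the fact that, for a fixed x, the left side is strictly increasing in y (its y-derivative is $5y^4 + 1 > 0$), so there is exactly one y for each x.

```python
def F(x, y):
    """Left-hand side of the implicit equation x^5 + y^5 + x + y - 1 = 0."""
    return x**5 + y**5 + x + y - 1


def solve_for_y(x, lo=-10.0, hi=10.0, tol=1e-12):
    """Find the single y with F(x, y) = 0 by bisection.

    For fixed x, F is strictly increasing in y, so there is exactly one
    root. The bracket [lo, hi] is an assumption that is wide enough for
    moderate values of x.
    """
    if F(x, lo) > 0 or F(x, hi) < 0:
        raise ValueError("root is not inside the bracket")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if F(x, mid) < 0:
            lo = mid      # root lies above mid
        else:
            hi = mid      # root lies at or below mid
    return (lo + hi) / 2


# One y for each x: print a few points on the curve.
for x in (-1.0, 0.0, 0.5, 1.0):
    y = solve_for_y(x)
    print(f"x = {x:5.2f}  ->  y = {y:.6f},  F(x, y) = {F(x, y):.1e}")
```

So the implicit equation can be evaluated as a function of x to any accuracy you like, even though no explicit formula is being used.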
difficulty is that you have to remember the procedure rather than memorize a formula. (I prefer it that way, myself, unless
the procedure is hopelessly long.)
Before I go on, what kind of derivative should that be? That is, is it a regular derivative or a partial derivative? Consider.
If we solved F(w, x, y, z) = 0 for w would there be more variables around than y? The answer is that there would be, and
so the derivative is a partial derivative, ∂ w/∂ y. What, then, does that mean? It means that when we are taking the partial
derivative, all variables except w and y will have to be treated as constants. (That will mean, in this case, that dx = 0 and
dz = 0 when we finally get around to it.)
The first step is to write down the total differential of the implicit function:
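$$dF = \frac{\partial F}{\partial w}\,dw + \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy + \frac{\partial F}{\partial z}\,dz$$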
And, since the implicit equation requires F = 0, a constant, we can set dF = 0:
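$$0 = \frac{\partial F}{\partial w}\,dw + \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy + \frac{\partial F}{\partial z}\,dz$$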
Then you look at the derivative you want. In this case, it is ∂ w/∂ y. That means that all variables except w and y are going to
be treated as constants. So, we can declare dx = 0 and dz = 0. (That’s another way to tell that the derivative you’re finding
is a partial derivative. Whenever you require that some independent variable be a constant, you get a partial derivative.)
Two terms then drop out, and we get
$$0 = \frac{\partial F}{\partial w}\,dw + \frac{\partial F}{\partial y}\,dy$$
Next, we solve for the quotient of the two differentials that we want just like we did with finding dy/dx (putting the correct
ones on the top and bottom):
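$$\begin{aligned}
0 &= \frac{\partial F}{\partial w}\,dw + \frac{\partial F}{\partial y}\,dy && (3.12)\\
-\frac{\partial F}{\partial w}\,dw &= \frac{\partial F}{\partial y}\,dy && (3.13)\\
dw &= -\frac{\partial F/\partial y}{\partial F/\partial w}\,dy && (3.14)\\
\frac{\partial w}{\partial y} &= -\frac{\partial F/\partial y}{\partial F/\partial w} && (3.15)
\end{aligned}$$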
That’s the way the procedure always goes. Once you know what the function is, you can substitute in for the partial
derivatives, and you are done! It isn’t at all difficult once you get used to it.
Note again that you get partial derivatives ∂ w/∂ y when you divide dw by dy. That’s because there were other variables
around; we set dx = dz = 0. The result must then contain partial derivatives to warn of that fact.
Again note that the minus sign is simply part of the formula. It is always present initially, although algebra might cause
it to disappear as the problem is worked.
Two examples of this on Maple are given in the next section.
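Formula (3.15) can also be sanity-checked numerically. Here is a minimal sketch in Python, not Maple; the particular F below is made up for this illustration, and all function names are assumptions. It solves F = 0 for w numerically, wiggles y while holding x and z fixed (that is, dx = dz = 0), and compares the resulting slope with −(∂F/∂y)/(∂F/∂w).

```python
def F(w, x, y, z):
    """A made-up implicit function; F = 0 defines w in terms of x, y, z."""
    return w**3 + w + x * y - z**2


def solve_for_w(x, y, z, lo=-100.0, hi=100.0, tol=1e-13):
    """Solve F(w, x, y, z) = 0 for w by bisection.

    F is strictly increasing in w (dF/dw = 3 w^2 + 1 > 0), so the root is
    unique. The bracket [lo, hi] is an assumption suited to moderate inputs.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if F(mid, x, y, z) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def dw_dy_formula(x, y, z):
    """Partial of w with respect to y from -(dF/dy)/(dF/dw)."""
    w = solve_for_w(x, y, z)
    dF_dy = x                # partial of F with respect to y
    dF_dw = 3 * w**2 + 1     # partial of F with respect to w
    return -dF_dy / dF_dw


def dw_dy_numeric(x, y, z, h=1e-5):
    """Central difference in y, holding x and z fixed (dx = dz = 0)."""
    return (solve_for_w(x, y + h, z) - solve_for_w(x, y - h, z)) / (2 * h)


x, y, z = 2.0, 1.5, 3.0
print("formula :", dw_dy_formula(x, y, z))
print("numeric :", dw_dy_numeric(x, y, z))
```

The two printed numbers should agree to several decimal places, minus sign included.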
Higher-order derivatives.
Higher-order implicit regular derivatives. Having found a simple way to calculate the first derivatives of implicit functions,
you might think that there is a nice, easy way to get higher-order derivatives as well. Not so. The process is not hard, but
it is long and tedious. It would be very easy to get confused here, so I will try to give you a single, uniform procedure to
use. Then I’ll show you how to do it on Maple, which makes the problem too easy.
Mugsy: It can’t ever be too easy.
Albert: Quite the contrary. If Maple makes it that much simpler than working the problem by hand, there is a danger
that you might simply turn to Maple rather than learning how to do it the “harder” way.
F := x^5 + y^5 + x + y - 1
This line actually finds the derivative dy/dx by the implicit function differentiation formula.
> dydx := - diff(F,x)/diff(F,y);
dydx := -(5 x^4 + 1)/(5 y^4 + 1)
This next is the tricky line. Remember that finding dy/dx implies that y is treated as a function of x. But for Maple, all
variables are completely independent. So, to tell Maple that y is a function of x, you change all the y’s to y(x)’s. That tells
Maple that y is a function of x.
-20 x^3/(5 y(x)^4 + 1) - 20 (5 x^4 + 1)^2 y(x)^3/((5 y(x)^4 + 1)^2 (5 y^4 + 1))
This line tells Maple that the variable y and the function y(x) really are the same, by converting all of the y(x)’s back to
just y’s.
> subs( y(x)=y, % );
-20 x^3/(5 y^4 + 1) - 20 (5 x^4 + 1)^2 y^3/(5 y^4 + 1)^3
This final line is unnecessary, except to compare to what I did before.
> normal(%);
-20 (25 x^3 y^8 + 10 x^3 y^4 + x^3 + 25 y^3 x^8 + 10 y^3 x^4 + y^3)/(5 y^4 + 1)^3
With the exception of the first line and the normal(%);’s, you can use this procedure for finding $d^2y/dx^2$ for any
implicitly-defined function in Maple.
There is another, slightly different, and essentially equivalent, way of doing the same thing that might be better for you
to understand. If this doesn’t help, go back to the way I just gave. Here it is.
The alias(); command in Maple basically is a high-powered substitution item. Here, it says that any time y is used,
treat it instead as y(x). That is, treat y as a function of x. This includes printing out answers. It shortens things nicely.
> alias(y=y(x));
y
The next line defines F as the implicit function. Note that we don’t even have to move everything to one side in this
approach!
> F := x^5 + y^5 + x + y = 1;
F := x^5 + y^5 + x + y = 1
Now differentiate the function with respect to x, which gives an equation that includes ∂ y/∂ x. Note that all derivatives
are partial derivatives for Maple.
> diff(F,x);
5 x^4 + 5 y^4 (∂/∂x y) + 1 + (∂/∂x y) = 0
If we solve for dy/dx, we get the answer we want.
> dydx := solve( %, diff(y,x) );
dydx := -(5 x^4 + 1)/(5 y^4 + 1)
If we differentiate again, we get the second derivative. But differentiating y means differentiating y(x), giving ∂ y/∂ x,
which Maple doesn’t automatically assume has a value.
> diff( %, x );
> subs( diff(y,x) = dydx, % );
-20 x^3/(5 y^4 + 1) + 20 (5 x^4 + 1) y^3 (∂/∂x y)/(5 y^4 + 1)^2
-20 x^3/(5 y^4 + 1) - 20 (5 x^4 + 1)^2 y^3/(5 y^4 + 1)^3
We put the formula back into standard form.
> normal(%);
-20 (25 x^3 y^8 + 10 x^3 y^4 + x^3 + 25 y^3 x^8 + 10 y^3 x^4 + y^3)/(5 y^4 + 1)^3
Now, let’s do the same problem using the built-in Maple function, implicitdiff();. First, define the function, just
as before.
> F := x^5 + y^5 + x + y = 1;
F := x^5 + y^5 + x + y = 1
Then ask for the two derivatives. The order of arguments to implicitdiff(); tells Maple what you want. The first
argument is the function. The second argument is the dependent variable. The third argument (and any beyond it, if necessary)
gives the variables to differentiate with respect to, so repeating x asks for the second derivative.
> implicitdiff(F, y, x);
> implicitdiff(F, y, x, x);
-(5 x^4 + 1)/(5 y^4 + 1)
-20 (25 x^3 y^8 + 10 x^3 y^4 + x^3 + 25 y^3 x^8 + 10 y^3 x^4 + y^3)/(125 y^12 + 75 y^8 + 15 y^4 + 1)
You can see why I said that this was too easy.
Mugsy: Hey! Even I could do it that way!
Albert: But can you do it without Maple, the way you will have to on the test?
Mugsy: Spoil sport.
Dudley: I take it that means “no.”
Mugsy: Well, can you?
Dudley: Albert didn’t ask me.
Mugsy: I take it that means “no.”
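If you want an independent check on the second-derivative formula that does not go through Maple at all, a numerical experiment works. The following is a minimal Python sketch; the function names and step size are assumptions chosen for this illustration. It solves $x^5 + y^5 + x + y = 1$ for y by bisection, estimates dy/dx and $d^2y/dx^2$ by finite differences, and compares them with the formulas found above.

```python
def solve_for_y(x, lo=-10.0, hi=10.0, tol=1e-14):
    """Solve x^5 + y^5 + x + y = 1 for y by bisection (the root is unique)."""
    def g(y):
        return x**5 + y**5 + x + y - 1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def dy_formula(x, y):
    """First derivative from implicit differentiation."""
    return -(5 * x**4 + 1) / (5 * y**4 + 1)


def d2y_formula(x, y):
    """Second derivative, in the normal() form shown in the Maple output."""
    num = 20 * (25 * x**3 * y**8 + 10 * x**3 * y**4 + x**3
                + 25 * y**3 * x**8 + 10 * y**3 * x**4 + y**3)
    return -num / (5 * y**4 + 1)**3


x0, h = 0.5, 1e-3
y0 = solve_for_y(x0)
y_plus, y_minus = solve_for_y(x0 + h), solve_for_y(x0 - h)

dy_numeric = (y_plus - y_minus) / (2 * h)            # central difference
d2y_numeric = (y_plus - 2 * y0 + y_minus) / h**2     # second difference

print("dy/dx    formula, numeric:", dy_formula(x0, y0), dy_numeric)
print("d2y/dx2  formula, numeric:", d2y_formula(x0, y0), d2y_numeric)
```

The finite-difference values are only approximations, so expect agreement to a handful of decimal places rather than exact equality.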
Higher-order implicit partial derivatives. The process for finding higher-order partial derivatives is much the same,
with an extra twist at one point. What I will do is give you an example running through it with Maple, and explaining what
each step is doing. Again, I will do it by hand, and then by Maple the long way, and then by Maple the way that even
Mugsy can do.
How do you tell if a higher-order implicit derivative is a regular or a partial derivative? Exactly the same way as the first
derivative. That is, if the first derivative is a regular derivative, so will all higher-order derivatives. If the first derivative is a
partial derivative, then all higher-order derivatives will be as well.
Dudley: And for those who can’t remember?
Albert: You look at the number of variables. If there are only two, the derivatives are regular. If there are more, the
derivatives are partial.
Mugsy: Why couldn’t he have said it that way?
The problem I will do is to find $\frac{\partial^2 y}{\partial x\,\partial z}$ and $\frac{\partial^2 y}{\partial z\,\partial x}$ (and show that they are equal) for
$$x y z + x^2 - y^2 + 4 z^2 = 10$$
Here’s how to solve the problem on Maple. First, we define the function
> F := x*y*z + x^2 - y^2 + 4*z^2 - 10;
F := x y z + x^2 - y^2 + 4 z^2 - 10
Then we find dydz and dydx, which are the variables I use for ∂ y/∂ z and ∂ y/∂ x. Note that this is how to find first-order
partial derivatives of implicit functions. We already have the formulas, namely ∂ y/∂ z is −(∂ F/∂ z)/(∂ F/∂ y) and ∂ y/∂ x
is −(∂ F/∂ x)/(∂ F/∂ y). We put these into Maple’s format.
> dydz := - diff(F,z)/diff(F,y);
> dydx := -diff(F,x)/diff(F,y);
dydz := -(x y + 8 z)/(x z - 2 y)
dydx := -(y z + 2 x)/(x z - 2 y)
Maple then gives the answers. In this case, it’s not that hard. Right now, it is not obvious that we would need both of
these, but it turns out we will. It will become clear why as we proceed. We now want to work on getting $\partial^2 y/\partial x\,\partial z$, which
we will find by this process:
$$\frac{\partial^2 y}{\partial x\,\partial z} = \frac{\partial}{\partial x}\left(\frac{\partial y}{\partial z}\right) = \left.\frac{d(\partial y/\partial z)}{dx}\right|_{dz=0}$$
but we must be careful of the partial derivative. The ∂ /∂ x means that the independent variable x is changing; y is considered
the dependent variable, and must also change. This boils down to meaning that z doesn’t change, which we enforce by
requiring dz = 0 when we take the differential on top. That’s the meaning of the little dz = 0 at the end. How do we get
Maple to do this? We have to tell it that y is a function of x only. That way, Maple will treat z as a constant. And we tell
Maple y is a function of only x by substituting y = y(x) into dydz, just as before, and then differentiate with respect to x:
> subs( y=y(x), dydz );
> diff( %, x );
-(x y(x) + 8 z)/(x z - 2 y(x))
-(y(x) + x (d/dx y(x)))/(x z - 2 y(x)) + (x y(x) + 8 z) (z - 2 (d/dx y(x)))/(x z - 2 y(x))^2
Note that Maple thought that z was a constant, but that y was a function of x. This is exactly what we needed. Now we
have to substitute for the derivative of y with respect to x that Maple used. For this, we need ∂ y/∂ x from above. This is
where we need that, and why.
> subs( diff(y(x),x) = dydx, % );
-(y(x) - x (y z + 2 x)/(x z - 2 y))/(x z - 2 y(x)) + (x y(x) + 8 z) (z + 2 (y z + 2 x)/(x z - 2 y))/(x z - 2 y(x))^2
We now want to simplify this, but Maple will again stubbornly refuse to accept that y and y(x) are the same, so we have
to get rid of the y(x)’s by another substitution. We then use the normal(%); command to put all of this into a nice form,
and store the result in q1:
> subs( y(x)=y, % );
> q1 := normal(%);
-(y - x (y z + 2 x)/(x z - 2 y))/(x z - 2 y) + (x y + 8 z) (z + 2 (y z + 2 x)/(x z - 2 y))/(x z - 2 y)^2
q1 := (2 y^2 x z - 4 y^3 + 2 x^3 z + x^2 y z^2 + 8 x z^3 + 32 x z)/(x z - 2 y)^3
At this point, that is the value of $\partial^2 y/\partial x\,\partial z$. What we want to do next is find the other mixed partial derivative,
$\partial^2 y/\partial z\,\partial x$. The process is basically the same. We will again need both ∂y/∂x and ∂y/∂z. We assume that both of these
have already been found (from above). We need to declare y as a function of z in ∂ y/∂ x for when we want to differentiate
with respect to z next:
> subs( y=y(z), dydx );
> diff( %, z );
-(y(z) z + 2 x)/(x z - 2 y(z))
-((d/dz y(z)) z + y(z))/(x z - 2 y(z)) + (y(z) z + 2 x) (x - 2 (d/dz y(z)))/(x z - 2 y(z))^2
Now we substitute for the derivative that showed up, the ∂ y/∂ z, and get
> subs( diff(y(z),z)=dydz, %);
-(-(x y + 8 z) z/(x z - 2 y) + y(z))/(x z - 2 y(z)) + (y(z) z + 2 x) (x + 2 (x y + 8 z)/(x z - 2 y))/(x z - 2 y(z))^2
Next, we substitute y(z) back to just y for the simplification
> subs( y(z)=y, % );
-(-(x y + 8 z) z/(x z - 2 y) + y)/(x z - 2 y) + (y z + 2 x) (x + 2 (x y + 8 z)/(x z - 2 y))/(x z - 2 y)^2
And finally, we simplify it and store this result in q2:
> q2 := normal( % );
q2 := (2 y^2 x z - 4 y^3 + 2 x^3 z + x^2 y z^2 + 8 x z^3 + 32 x z)/(x z - 2 y)^3
And last of all, we check that the two mixed partials are in fact equal (something that ought to at least mildly surprise
you, considering how different the means of getting to this end were):
> q1 - q2;
0
It checks!
Dudley: Just to keep my suspense level manageable, does this always happen?
Albert: Yes, as far as you are concerned.
Mugsy: And if you aren’t concerned?
Albert: You ought to be. Leave it at that.
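For this particular equation there is another way to check the result, because the equation happens to be quadratic in y and can be solved explicitly. The following is a minimal Python sketch, not part of the Maple session; the choice of the "+" root, the function names, and the test point are assumptions made for this illustration. It compares the formula stored in q1 and q2 with a finite-difference estimate of the mixed partial derivative of the explicit solution.

```python
import math


def y_of(x, z):
    """One branch of x*y*z + x^2 - y^2 + 4*z^2 = 10, solved for y.

    Rearranged, the equation is y^2 - (x z) y - (x^2 + 4 z^2 - 10) = 0,
    a quadratic in y. The '+' root is an arbitrary choice for this sketch.
    """
    disc = (x * z)**2 + 4 * (x**2 + 4 * z**2 - 10)
    return (x * z + math.sqrt(disc)) / 2


def mixed_partial_formula(x, z):
    """The common value of q1 and q2 from the Maple session."""
    y = y_of(x, z)
    num = (2 * y**2 * x * z - 4 * y**3 + 2 * x**3 * z
           + x**2 * y * z**2 + 8 * x * z**3 + 32 * x * z)
    return num / (x * z - 2 * y)**3


def mixed_partial_numeric(x, z, h=1e-3):
    """Central-difference estimate of the mixed second partial of y."""
    return (y_of(x + h, z + h) - y_of(x + h, z - h)
            - y_of(x - h, z + h) + y_of(x - h, z - h)) / (4 * h * h)


x, z = 2.0, 2.0
print("formula :", mixed_partial_formula(x, z))
print("numeric :", mixed_partial_numeric(x, z))
```

The finite-difference estimate is symmetric in the order of differentiation by construction, so it checks the value of the formula, not the equality of q1 and q2; that equality is what the Maple session itself establishes.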
Again, let me do this same problem in the spirit of the second Maple approach I gave earlier.
> alias(y=y(x,z));
y
Again, we set up the alias, with the dependent variable and independent variables as indicated in the derivative that we
want.
> F := x*y*z + x^2 - y^2 + 4*z^2 = 10;
F := x y z + x^2 - y^2 + 4 z^2 = 10
This defines the function, as before.
> diff(F,x);
y z + x (∂/∂x y) z + 2 x - 2 y (∂/∂x y) = 0
Again, we get the derivative that we want in the middle of this equation. So, we solve for it next.
> dydx := solve( %, diff(y,x) );
dydx := -(y z + 2 x)/(x z - 2 y)
We store the result in a variable called dydx, remembering that it really is a partial derivative.
The same sequence gets the other partial derivative.
> diff( F, z);
x (∂/∂z y) z + x y - 2 y (∂/∂z y) + 8 z = 0
> dydz := solve( %, diff(y,z) );
dydz := -(x y + 8 z)/(x z - 2 y)
Now we go for the whole thing.
> diff(dydz,x);
> subs(diff(y,x)=dydx, %);
> q3 := normal(%);
F := x y z + x^2 - y^2 + 4 z^2 - 10
q5 := (x^2 y z^2 + 2 z x y^2 + 8 x z^3 - 4 y^3 + 2 x^3 z + 32 x z)/(x^3 z^3 - 6 x^2 y z^2 + 12 z x y^2 - 8 y^3)
q6 := (x^2 y z^2 + 2 z x y^2 + 8 x z^3 - 4 y^3 + 2 x^3 z + 32 x z)/(x^3 z^3 - 6 x^2 y z^2 + 12 z x y^2 - 8 y^3)
0
Homework #29
Exercises.
(a) $x^2 y = 15$
(b) $3x^6 y^4 - 7x^2 y^5 = 12$
(c) $y = x^4 y^3 + 4x^6 y^7$
4. Find dy/dx for the following equations.
(a) $\exp(x y^2) - x^3 y^4 = 12$
(b) $\sin(x/y) + \cos(y/x) = 1$
(c) $y = \ln(x^2 + y^2)$
5. Calculate $d^2y/dx^2$ for $x^2 + y^2 = 1$. (You can compare this problem to one of the homework problems on parametric
equations, where we calculated the same thing “the hard way” using $x = \cos t$, $y = \sin t$.)
6. Make up three implicit equations F(x, y) = 0 of your own, and find dy/dx for them.
7. Find $\frac{d^2y}{dx^2}$ for the implicit function $2x^5 + 4x^2 y^3 - 5x y^7 = 100$. Be prepared for a frighteningly long answer.
8. Find $\frac{d^2y}{dx^2}$ for the implicit function $y e^{3x} + x^3 \sin(y^2) = 20$. Again, watch out for a many-line answer.
9. Find $\frac{\partial y}{\partial z}$ for the implicit function $9z^2 e^{xy} - 3xz \sin(y) = 4$.
10. Find $\frac{\partial x}{\partial y}$ for the implicit function $4\ln(x + yz) + 7\operatorname{Arctan}(y + xz) = 1$
Problems.
1. Find $\frac{\partial^2 x}{\partial y\,\partial z}$ for the implicit function $e^{xyz} + \cos(x y^2 + y z) = 2$.
3. In this problem, we look at a result that gets used in Physical Chemistry, although it is not clear what it means.
Albert: A whole lot of Physical Chemistry can seem that way. But the next section does more with that subject.
Dudley: Oh, I can hardly wait....
Suppose you have any implicit function of three variables, F(x, y, z) = C. Show that
$$\frac{\partial x}{\partial y} \times \frac{\partial y}{\partial z} \times \frac{\partial z}{\partial x} = -1$$
by writing out the values of these partial derivatives in terms of the partial derivatives of F and simplifying.
Investigation.
1. Answer the following questions about the graph defined by the implicit function $x^3 + y^3 + xy = \frac{1}{27}$. (Yes, the constant
1/27 must be exactly that. You’ll see why.)
(a) Show that the first derivative dy/dx of the curve is
$$-\frac{3x^2 + y}{3y^2 + x}$$
(b) Show that the second derivative $d^2y/dx^2$ of the curve is
$$-\frac{54\,x y\,(x^3 + y^3 + x y - 1/27)}{(3y^2 + x)^3}$$
(Unless you are into serious algebra, I’d recommend Maple for this one.)
(c) Plug the equation of the curve into this expression, and show that the second derivative is always equal to zero.
(This is why we need that 1/27.)
(d) Show that the lines y = mx + b have second derivative equal to zero. (A more ambitious problem would have
you show that only lines have second derivative equal to zero. I’m not asking for that.)
(e) Show that the three points $(\tfrac{1}{3}, 0)$, $(0, \tfrac{1}{3})$, and $(-\tfrac{1}{3}, -\tfrac{1}{3})$ are all on the original curve. (How do you show that a
3.1.7 Constrained partial derivatives (what if you can’t wiggle just one variable at a time?)
Up until this point, whenever we have taken partial derivatives, we have assumed that the independent variables are just
that, independent of each other. There are situations where that is not the case, as any chemistry major will discover in
physical chemistry.
Motivation—gas dynamics.
Let me give you a specific example. In thermodynamics (a part of physical chemistry), the entropy, denoted S, of a gas is
most conveniently defined in terms of three variables, p = pressure, V = volume, and T = temperature.
Mugsy: ALBERT! What’s going on here? What’s entropy?
Albert: It’s a concept from thermodynamics, and measures how disorganized a system is. One of the fundamental
laws of thermodynamics says that entropy must always increase.
Dudley: Physicists have a theory of disorganization?
Albert: That’s one way to look at it. The three laws of thermodynamics are summarized at a level that even Mugsy
can understand as:
1. You can’t win.
2. You can’t break even. (That’s entropy increasing.)
3. You can’t get out.
Mugsy: You mean it’s useless for me to try to clean my apartment?
Dudley: For you, yes.
Mugsy: Watch it, kid.
(Usually, first-year chemistry books use P = pressure, but thermodynamics texts tend to use p.) If you have had chemistry,
though, you know that p, V , and T are not independent. There is the ideal gas law, pV = nRT , where n = number of moles
(measure of quantity) of gas, and R is a constant that is the same for all gases. So, if we decide to wiggle T , for example,
we are not permitted to hold all other variables constant. The extra equation that the variables have to satisfy is called the
constraint, and this process is called taking constrained partial derivatives.
Because thermodynamics is the course where this notion is used most, I will keep explanations geared to that subject.
It will be easy enough to adapt to any other situation, or to the other functions in thermodynamics (internal energy of gases,
enthalpy, and Gibbs free energy).
Dudley: I don’t even want to know what those mean.
Notation.
We first need to set up the notation so that we can tell what is going on.
Mugsy: I think it’s going to take more than notation.
Dudley: AUGH! More notation!
The entropy S depends on p, V, and T, and of the three, any two will determine the other from the gas law. So, thermodynamics
texts will write S(p, T) or S(V, T), depending on which two are the main two under consideration at the time. But
how would you understand (∂S/∂T)? It doesn’t make sense as it stands, since the notation implies that all other variables
are being held constant, which can’t happen. Even worse, if you differentiate S(p, T) with respect to T and compare that
to the derivative of S(V, T) with respect to T, you will get two different answers. (See the homework.) This is a distinct
problem.
We have to be able to distinguish between different partial derivatives, then. This is done by a standard notation.
$(\partial S/\partial T)_p$ indicates that we are taking the derivative of S with respect to T, holding p a constant. Of course, V will have to
be changing, but that we will have to allow. It is equivalent to the derivative of the formula S(p, T) with respect to T. On
the other hand, $(\partial S/\partial T)_V$ denotes the partial derivative of S with respect to T, while holding V constant (and consequently
allowing p to vary). That would mean that we are differentiating S(V, T) with respect to T. By the way, each of those
derivatives gives a quantity that is physically measurable. And physically, they are not equal, either.
Mugsy: You mean this actually gets used?
Albert: Chemical engineers have to use this idea all the time.
What you are given. You are given the function to differentiate, the constraint, the variable to differentiate, and the
variables that are independent of the variable to differentiate. These are all part of what you must be given, in some fashion
or another. Note that you can determine what variables are dependent on the variable to differentiate by a process of
elimination.
Write out total differential of function being differentiated. When you are doing this, you should include all of the
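variables, whether or not they will be held constant. For example, with
$$S = S(p, V, T)$$
the total differential is
$$dS = \frac{\partial S}{\partial p}\,dp + \frac{\partial S}{\partial V}\,dV + \frac{\partial S}{\partial T}\,dT$$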
Set wiggles of all independent variables to 0; there should be two wiggles left. Now you look at the variables. The
notation will tell you that some of them are being held constant. In that case, you should set the appropriate differential to
0. That is, if you want to find
$$\left(\frac{\partial S}{\partial T}\right)_V$$
you will be holding V constant, so you would set dV = 0. The result in that case would be
$$dS = \frac{\partial S}{\partial p}\,dp + \frac{\partial S}{\partial T}\,dT$$
We are now closing in on the derivative we want. To get $(\partial S/\partial T)_V$, you want to factor out a dT from the right side of this
equation and divide through by it. But we have this dp that isn’t dT. What do we do? We want to eliminate the unwanted
wiggle, in this case the dp. Now is when we use the constraint. That gives us the way to relate the dT and the dp that will
enable us to eliminate the dp from this.
Find the total differential of the constraint; set all independent wiggles to 0; solve for the wiggle to be eliminated
from the other equation. The equation for the constraint will tell how the differentials between all the variables relate.
So, if we use the ideal gas law,
$$pV = nRT$$
we take the differential of the equation, and get that
$$V\,dp + p\,dV = nR\,dT$$
Note that n is almost always a constant for these problems, since you are working with a fixed quantity of gas. On the other
hand, R is always a constant. Then we again set dV = 0, since V is still not allowed to change. Then we get that
$$V\,dp = nR\,dT, \qquad\text{so}\qquad dp = \frac{nR}{V}\,dT$$
Plug in and solve for desired wiggle. Plugging that back in gives
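$$\begin{aligned}
dS &= \frac{\partial S}{\partial p}\,dp + \frac{\partial S}{\partial T}\,dT && (3.25)\\
&= \frac{\partial S}{\partial p}\,\frac{nR}{V}\,dT + \frac{\partial S}{\partial T}\,dT && (3.26)\\
&= \left(\frac{\partial S}{\partial p}\,\frac{nR}{V} + \frac{\partial S}{\partial T}\right) dT && (3.27)
\end{aligned}$$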
where we factored out the dT from both terms.
Set up the quotient of wiggles to give the partial derivative you want. Dividing by dT gives us what we want:
$$\left(\frac{\partial S}{\partial T}\right)_V = \frac{\partial S}{\partial p}\,\frac{nR}{V} + \frac{\partial S}{\partial T}$$
that such a creature is ambiguous. Besides, how would you calculate it?
Mugsy: Yeah. That’s what I was thinking, too.
Dudley: Right, Mugsy.
Albert: Good questions, and they need good answers. That must be coming next.
First, let’s look at the expression. If you will note, when we write
$$\left(\frac{\partial S}{\partial T}\right)_V = \frac{\partial S}{\partial p}\,\frac{nR}{V} + \frac{\partial S}{\partial T}$$
we are writing $\left(\frac{\partial S}{\partial T}\right)_V$ in terms of $\frac{\partial S}{\partial T}$. What would that mean? Remember that when we first did the total differential of
the function, we assumed that there were no constraints. That’s what $\frac{\partial S}{\partial T}$ represents; we assume that we can wiggle just
T without wiggling any of the other variables, and $\frac{\partial S}{\partial T}$ is just that wiggle magnification factor. How would that work?
Remember that we started out with a formula for S? That is what you would use for finding $\frac{\partial S}{\partial T}$; just take the normal partial
derivative ignoring any problem with constraints. (Of course, in thermodynamics, you often don’t have much of a formula
for these things. This is theory.) To allow that other variables must wiggle, we get an extra term in $\left(\frac{\partial S}{\partial T}\right)_V$. It takes into
account the wiggle of p as we are wiggling T and holding V constant. The term $\frac{nR}{V}\,\frac{\partial S}{\partial p}$ is just what needs to be added to
hold V constant, forcing p to move. This also answers the questions.
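To see the extra term at work with actual numbers, here is a minimal Python sketch. Everything specific in it is an assumption made for illustration: the formula for S is made up (it is not a real entropy), the value of nR assumes one mole of gas, and the function names are hypothetical. It computes $(\partial S/\partial T)_V$ from the formula above, computes $(\partial S/\partial T)_p$ by the same steps with dp = 0 instead of dV = 0, and checks each against a direct numerical wiggle of T with the gas law enforced.

```python
N_R = 8.314  # n*R for n = 1 mole, in J/K (the choice n = 1 is an assumption)


def S(p, V, T):
    """A made-up 'entropy' formula, used only to have something to differentiate."""
    return p**2 * V + T**3


# Ordinary partial derivatives of the formula, ignoring the constraint.
def S_p(p, V, T):
    return 2 * p * V


def S_V(p, V, T):
    return p**2


def S_T(p, V, T):
    return 3 * T**2


def dSdT_const_V(V, T):
    """(dS/dT)_V = S_p * nR/V + S_T, with p supplied by the gas law."""
    p = N_R * T / V
    return S_p(p, V, T) * N_R / V + S_T(p, V, T)


def dSdT_const_p(p, T):
    """(dS/dT)_p = S_V * nR/p + S_T (same steps, but with dp = 0)."""
    V = N_R * T / p
    return S_V(p, V, T) * N_R / p + S_T(p, V, T)


def numeric_const_V(V, T, h=1e-4):
    """Wiggle T with V fixed; p follows from pV = nRT."""
    f = lambda t: S(N_R * t / V, V, t)
    return (f(T + h) - f(T - h)) / (2 * h)


def numeric_const_p(p, T, h=1e-4):
    """Wiggle T with p fixed; V follows from pV = nRT."""
    f = lambda t: S(p, N_R * t / p, t)
    return (f(T + h) - f(T - h)) / (2 * h)


T, V = 300.0, 2.0
p = N_R * T / V  # the state must satisfy the constraint
print("(dS/dT)_V:", dSdT_const_V(V, T), numeric_const_V(V, T))
print("(dS/dT)_p:", dSdT_const_p(p, T), numeric_const_p(p, T))
```

Each formula should match its numerical wiggle, and the two constrained derivatives should come out different from each other, which is the whole reason the subscript notation is needed.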
Just a few concluding remarks on this subject. It would be possible (but very unappetizing) to have several constraints
on different variables, rather than just one.
Mugsy: You gotta be kidding.
The process would be the same. Take the total differential of the equation containing the dependent variable; zero the
differentials of variables being held constant; take the differentials of the constraints and solve them for the differentials of
the dependent variables to get rid of; plug them in the first differentiated equation and divide by the differential of the main
dependent variable.
Finally, note that the quotient of differentials is a derivative, just as before. But the notation automatically switches to
express the situation, just the way that the notation changed in going from regular to partial derivatives. That is, dS divided
by dT was not just dS/dT, nor even just $\frac{\partial S}{\partial T}$, but rather $\left(\frac{\partial S}{\partial T}\right)_V$, in order to accurately reflect what we had done to get the
equation relating those differentials.
Now, for the moment you have all been waiting for. How is this done on Maple?
Dudley: To tell you the truth, I’ve been waiting for this section to finish even more.
For that, it is important to realize that Maple doesn’t have the chain rule for differentials built into it, nor is it easy to get it
to do so. However, I have written a routine that takes differentials of equations or functions. That enables you to proceed
exactly as in the notes. For an example, let me do the entropy example again using Maple.
First, we read in the function. It is called, appropriately, d. (Instructions on getting the function will be given in class.
The locations of such things keep changing.)
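> d := proc( f )
>   # Total differential of an expression or an equation f.
>   local vars, i, v, tmp;
>   vars := select( type, indets(f), name );    # the variable names that appear in f
>   if type(f, equation) then
>     d( rhs(f) ) = d( lhs(f) )                 # differentiate both sides (right side is written first)
>   else
>     tmp := 0;
>     for i to nops(vars) do
>       v := vars[i];
>       tmp := tmp + diff(f,v) * d || v;        # add (partial of f w.r.t. v) times dv; d || v builds the name dv
>     od;
>     tmp;
>   fi;
> end;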
d := proc( f )
local vars, i, v, tmp;
vars := select(type, indets( f ), name) ;
if type( f , equation) then d(rhs( f )) = d(lhs( f ))
else
tmp := 0 ;
for i to nops(vars) do v := vars[i] ; tmp := tmp + diff( f , v) * d||v end do ;
tmp
end if
end proc
You are not expected to understand this!
Next, put in the constraint.
> C := p*V = n * R * T;
C := pV = n R T
Note that you can assign an equation to a variable this way. That is, C is a variable to Maple that has the value
pV = n R T .
Then we take the differential of it.
> dC := d(C);
dC := n T dR + n R dT + R T dn = p dV +V dp
Maple doesn’t know that R and n are constants, so you have to tell it. Here’s how. Note that I am putting two equations
on the same line. That’s fine, as long as both are ended with semicolons (or colons).
> dR := 0; dn := 0;
dR := 0
dn := 0
Check it out, just to be sure.
> dC;
n R dT = p dV +V dp
Looks good. Next, take the differential of the entropy function, S(p,V, T ).
> dS := d(S(p,V,T));
dS := (∂/∂T S(p, V, T)) dT + (∂/∂V S(p, V, T)) dV + (∂/∂p S(p, V, T)) dp
For the derivative we want, $\left(\frac{\partial S}{\partial T}\right)_V$, we will need to set dV = 0. We do that now. The # sign tells Maple to ignore
everything from that point to the end of the line. It is used to put comments in the session.
> dV := 0; # For this derivative
dV := 0
Now, we want to get rid of the dp, so we solve the constraint for it.
> solve(dC, dp);
n R dT/V
Then we substitute that back into the equation for dS, and get
> subs( dp=%, dS);
(∂/∂T S(p, V, T)) dT + (∂/∂p S(p, V, T)) n R dT/V
All we have to do now is divide by the dT , and we are done.
> expand(%/dT);
(∂/∂T S(p, V, T)) + (∂/∂p S(p, V, T)) n R/V
And that’s the answer we got before.
If this seems a bit long, it is. But it is exactly what you are doing when you solve these by hand. You can go back and
see how this Maple session is exactly parallel to what we did before. Only the form is different.
Homework #30
Problems.
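1. Take $w = 2x^2 + 4z + t$, with constraint $x + z - 3t = 8$. Show that $\left(\frac{\partial w}{\partial x}\right)_z$ and $\left(\frac{\partial w}{\partial x}\right)_t$ can never be equal for these
equations.
2. For $w = 3x^3 y^2 z^2 - 5x^2 + y^3 - z^4$, with constraint equation $x z^3 + 4x^2 y^3 + y^2 z^5 = 71$, find $\left(\frac{\partial w}{\partial z}\right)_y$.
3. Find $\left(\frac{\partial w}{\partial x}\right)_z$ where $w = \ln(xy) + \frac{x}{y+z}$, subject to the constraint $\operatorname{Arcsin} x + x^2 + y^2 + z^2 + z = 2$.
4. Find $\left(\frac{\partial x}{\partial y}\right)_t$ for $x = y e^{-w} + w^2 \cos t\; e^{2y}$ with constraint $w^3 + y^3 + t^3 = 1$.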
5. Make up another two problems in constrained partial differentiation and solve them.
Investigation.
1. In this investigation, we look at implicit differentiation with constraints. It really is no more difficult than regular
derivatives with constraints. Suppose we have the equation F(w, x, y, z) = 0. This defines y as an implicit function of
w, x, and z. (Other combinations are possible, but that’s the one we’ll work with on this problem.) Suppose we also
have the constraint C(w, x, y, z) = 0. That should enable you to eliminate one of the variables that y is a function of.
Suppose we eliminate w. Then y is a function of x and z (although its exact form is not obvious). Find a formula for
$\left(\frac{\partial y}{\partial x}\right)_z$ in terms of the partial derivatives of F and C, where they are taken as formulas.
13. Constrained partial derivatives are found by taking the differential of the function you are differentiating, setting the
differentials of the variable(s) being held constant to 0, and getting rid of the “non-derivative” differentials using the
constraint, and then dividing through by the “derivative” differential.
14. There were no new formulas in this chapter, since partial derivatives are found exactly the same way as single-variable
derivatives.
15. The way to define a multi-variable function in Maple is to use the old “arrow notation,” but put the variables in
parentheses. The only new Maple command that appeared was alias();, which is used in Maple to shortcut
inputting expressions. It will not be used again in this course, so it is not critical to learn it. There was also a new
function d that takes differentials of expressions, to be used in finding constrained partial derivatives, but could also
be used when finding implicit derivatives.
I. (10 points, 5 points each) Determine the future value of a deposit of $2000 at 4% for 9 months if the interest is (a)
Compounded quarterly (b) Continuously compounded
II. (15 points, 5 points each) Find each of the following limits.
x3 − x2 − x − 15 2 x50 − ln x + ex sin2 x
(a) lim 5 2 4
(b) lim 30 (c) lim
x→3 x + 3 x + 3 x − 3 x − 36 x→∞ x + sin x + e2 x x→π/2 x3 − 2 x − 2 x2 + π
π
III. (15 points; 5 points each) For the following questions, use the function $f(x) = x^4 - 2x^2 - 10$.
(a) Find all local max and min values of f (x). (b) Using a number line, describe all segments where the function is
increasing or decreasing. (c) Find the global max and min of f (x) on the interval [−0.5, 3].
IV. (35 points, as noted) Find the following derivatives. !
∂ 2 xy ∂5 z2 ∂x
(a) (5 points) (x y e ) (b) (5 points) 3 2 p + ln(x z) (c) (10 points) if x2 y z +
∂y ∂z ∂x 5
tanh(3x) cot(2 x) ∂z
√ ∂x
x y−xz = x (d) (15 points) for x = 3 z3 y + 4 tan(y z) + z w3 subject to z w y = 3
∂y z
V. (15 pts, 5 points each) Given the price/demand function $p(x) = x - x^2/30$.
(a) Find the elasticity at x = 25. (b) Is the market elastic or inelastic at x = 25? Why? (c) Does revenue increase or
decrease if the price increases (from the reference point x = 25)? Why?
VI. (10 pts) Given ln(x y) = 2, find $\frac{d^3y}{dx^3}$. Find this as an implicit function; that is, do not solve for y as a function of x
explicitly.
I. (10 points, 5 points each) Determine the future value of a deposit of $3000 at 5% for 8 months if the interest is
(a) Compounded monthly; (b) Continuously compounded
II. (15 points, 5 points each) Find each of the following limits.
(a) $\lim_{x\to 0} \frac{1 - \cos x}{x^2 + x}$ (b) $\lim_{x\to\infty} \frac{5x^3 - 2x}{7x^3 + 3}$ (c) $\lim_{t\to 0} \frac{t \sin t}{1 - \cos t}$
III. (15 points; 5 pts each) For the following questions, use the function $f(x) = x^3 - 12x - 5$.
(a) Find all local max and min values of f (x). (b) Using a number line, describe all segments where the function is
increasing or decreasing. (c) Find the global max and min of f (x) on the interval [1, 4].
IV. Find the following derivatives:
!
12 6 xz d2y
∂ ln y ∂ x e
(a) (5 pts) x y5 + (b) (5 pts) 9 3 p + (c) (10 pts) if x y + y2 = 1
∂y x ∂x ∂z arctan(3 z) arcsin(4 z2 ) z9 dx2
∂x √
(d) (15 pts) for x = 5 z2 y + 2 z y + z2 e2 w subject to y2 + z w = 31.
∂z y
V. (15 pts, 5 pts each) Given the price/demand function $p(x) = x - (x^3/50)$.
(a) Find the elasticity at x = 4.5. (b) Is the market elastic or inelastic at x = 4.5? Why? (c) By how much will
demand (x) change if price (p) increases by 10% from this point (x = 4.5 reference point)?
2 ∂f ∂f
VI. (15 pts) Given f (u, v) = arctan(u v) + 3 u v2 , and u = ex y , v = sin(x y). Find ∂x and ∂y in terms of u, v, x, and y (not
just x and y alone).
I. (10 points, 5 points each) Determine the future value of a deposit of $5000 at 3% for 9 months if the interest is
(a) Compounded quarterly (b) Simple interest
II. (20 points, 5 points each) Find each of the following limits.
(a) $\lim_{x\to 0} \dfrac{\sin x}{e^{3x} - 1}$ (b) $\lim_{x\to\infty} \dfrac{\ln x}{\sqrt{x}}$ (c) $\lim_{x\to 1} \dfrac{\ln(\ln x)}{\ln x}$ (d) $\lim_{x\to\infty} x \sin(1/x)$ (Hint: Rewrite this product into an equivalent 0/0 or ∞/∞ form.)
III. (15 points; 5 pts each) For the following questions, use the function $f(x) = x^3 - 3x^2 - 9x + 1$.
(a) Find all local max and min values of f (x). (b) Using a number line, describe all segments where the function is
increasing or decreasing. (c) Find the global max and min of f (x) on the interval [−4, 6].
IV. Find the following derivatives:
∂7 x3 d2y
∂ ln(x z)
(a) (5 pts) (3 x sin y+4 x3 y2 z) (b) (5 pts) + (c) (10 pts) 2 if sin(x y) =
∂y ∂ x ∂ 3z
4 ln(tan(3 z))
sin(4
2
z ) z3 dx
1 ∂x
2 (Solve implicitly; don’t solve for y explicitly). (d) (15 pts) for x = 5 z2 ln(y) + z y e2 w subject to sin(y) +
√ ∂y w
z w = 15. (Use implicit methods only.)
V. (11 pts) Given the price/demand function $p(x) = x - \dfrac{x^3}{60}$. (a) (5 pts) Find the elasticity at x = 5. (b) (3 pts) Is the market elastic or inelastic at x = 5? Why? (c) (3 pts) Hey! At this point (x = 5) in the market, would revenue increase or decrease if price is increased? Why?
VI. (10 pts) Given $f(x, y, z, w) = x z w \sqrt{y} + z^2 y - w^5 x^3 z$ subject to the constraint $\ln(w z x^2 y) =$ , find $\left(\dfrac{\partial f}{\partial x}\right)_{z,w}$, where the subscript indicates that both z and w are being held constant for this derivative.
VII. (10 pts) Let f(u, v) be any given function where $u = 3x^2 + 2y$ and $v = e^{xy}$. Find $\partial f/\partial y$ in terms of $\partial f/\partial u$, $\partial f/\partial v$, x and y.
VIII. (9 pts) The function f is a measure of productivity of a certain shift at a warehouse. The variable x represents the money (measured in hundreds of dollars) spent on employee comfort such as improvements in lighting and break room facilities and y represents money (measured in hundreds of dollars) spent in productivity bonuses. Give an English interpretation of the following:
(a) $f_x$ (b) $f_y$ (c) $f_{xy}$
I. (10 points, 5 points each) Determine the future value of a deposit of $3000 at 6% for 10 months if the interest is
(a) Compounded monthly (b) Simple interest
II. (20 points, 5 points each) Find each of the following limits.
cos(x) 2 x3 − 3 x2 − 11 x + 6 ln(x) − cos(x2 ) + 10x 3 x2 − 2 x + 5
(a) lim (b) lim 3
(c) lim (d) lim
x→π/2 sin(x) − 1 x→3 x − 13 x + 12 x→∞ 3 x4 − 2 x + 1 x→2 cos(π x) − 2 x + 2
III. (15 points) For the following questions, use the function $f(x) = \dfrac{x^2 - x + 4}{x - 1}$.
(a) (8 points) Find all local max and min values of f (x). (b) (7 points) Find the global max and min of f (x) on the
interval [2, 6].
IV. Find the following derivatives:
∂8 z3
∂ p 3 sin(x z) dy
(a) (5 pts) ( x y z − 2 x z3 ) (b) (5 pts) 3 5 3 − 3 z)
+ 3
(c) (10 pts) if 3 x2 y −
∂y ∂ x ∂ z Arctan(4 z z dx
2 ∂x
cosh(x y) = 2 x y . (d) (15 pts) for x = z w2 − ez y with constraint equation w3 + tan z + y2 = 12 (use implicit
∂z w
methods only)
V. (15 pts) Given the price/demand function $p(x) = x - (x^5/30)$.
(a) (5 pts) Find the elasticity at x = 2 (b) (5 pts) Is the market elastic or inelastic at x = 2? Why? (c) (5 pts) From
this point, by what percentage would demand change if price increased by 10%?
VI. (10 pts) If $x^2 y^3 = 7$, use implicit methods only to show that $\dfrac{dy}{dx} = -\dfrac{2y}{3x}$ and $\dfrac{d^2y}{dx^2} = \dfrac{10y}{9x^2}$. Finally use these results to find $\dfrac{d^3y}{dx^3}$. NO credit will be given for solving for y and differentiating explicitly!
I. (10 points; 5 points each part) Find both the new amount and the interest paid on $1700. for 9 months at 5% interest, if
the interest is
a.) Compounded monthly b.) Compounded continuously
II. (15 points; 5 points each) Find each of the following limits.
a.) $\lim_{x\to 4} \dfrac{2x^2 - 9x + 4}{x^2 + 2x - 24}$ b.) $\lim_{x\to\infty} \dfrac{2x^2 - 9x + 4}{x^2 + 2x - 24}$ c.) $\lim_{x\to\pi} \dfrac{\sin x}{x - \pi}$
III. (20 points; 10 points each) For this problem, use $f(x) = 3x^4 - 4x^3 - 36x^2 + 18$. Make sure you give both the x- and
y-coordinates of each point.
a.) Find all the local max(es) and local min(s) of f (x). b.) Find the global max and min of f (x) on the interval [−1, 2].
IV. (20 points total; 5 points each) Albert started a company that makes widgets, which grew into the E.M.C. (Enormous
Multinational Conglomerate). He hired Mugsy, and was somewhat perplexed that he wanted to work in environmental
protection, until he heard Mugsy say to a landscape worker “Nice shrub you have there. How much it worth to you that it
stay that way?” Mugsy seems happy in his new job in security. Dudley, on the other hand, started as a marketing specialist,
using his vast experience from calculus. He found that the demand function for widgets is $p = 50 - x^2$.
a.) If x = 6, what is the price of the widgets? b.) What is the elasticity η at x = 6? c.) Using the elasticity value
from part b.) and the price of a widget from part a.), estimate the percentage that the demand will go up if the price is
lowered by 3%. d.) Does lowering the price of a widget 3% from the amount in part a.) increase or decrease the
revenue from widgets for E.M.C.? Explain your answer using the previous parts of this problem.
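V. (15 points; 5 points each) Find the following partial derivatives:
a.) $\dfrac{\partial}{\partial y}\left(\dfrac{x + y + z}{x y \sin(|y|) + z^2}\right)$ b.) $\dfrac{\partial^7}{\partial a^3\,\partial b^4}\left(\dfrac{e^{a b}}{a^4}\right)$ c.) For $w = 3x^3 y^2 z^2 - 5x^2 + y^3 - z^4$, with constraint equation $x^5 z^3 + 4x^2 y^3 + y^2 z^5 = 71$, find $\left(\dfrac{\partial w}{\partial z}\right)_y$.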
VI. (15 points) Find the equation of the line tangent to $x^3 y^2 - 4x^2 y^3 - 2x = -3$ at the point (−1, 1).
VII. (10 points) If you have any function f(u, v), where $u = x^2 - y^2$ and $v = 2xy$, write ∂ f /∂ x as a formula involving
∂ f /∂ u and ∂ f /∂ v and x and y.
I. (10 points; 5 points each) Determine the future value of a deposit of $1000 at 3% for 10 months if the interest is
(a) Compounded monthly (b) Compounded continuously
II. (20 points; 5 points each) Find each of the following limits.
√ 2
cos(x) − 1 x+1−2 200 x100 − sin(x) + ex e2 x − 1
(a) lim (b) lim 3 (c) lim (d) lim
x→π sin(x) x→3 x − 7 x − 6 x→∞ 40 ln(x4 ) + x7 x→0 x
III. (15 points; as noted) For the questions in this problem, use the function f (x) = x2x+2 .
(a) (8 points) Find all local max and min values of f (x). (b) (7 points) Find the global max and min of f (x) on the
interval [0, 2].
IV. (30 points; as noted) Find the following derivatives.
∂3 z + z−1
∂ dy
sin(x2 y z) + x z3
(a) (5 points) (b) (5 points) sin(y x) + tan −1
(c) (5 points) if x y −
∂y ∂ x ∂ z ∂ y x − x dx
∂y
ln(x y) = 3 x y2 (d) (15 points) for y = cos(z w3 ) + w x2 subject to w2 x z = π. (Use implicit methods only.)
∂z w
V. (15 points; 5 points each) Given the price/demand function $p(x) = x - (x^2/10)$.
(a) Find the elasticity at x = 8. (b) Is the market elastic or inelastic at x = 8? (c) From this point, would revenue
(p x) increase or decrease if price increased by 10%?
VI. (15 points) If $\sqrt{x^2 y} = 7$, use implicit methods only to show that $\dfrac{dy}{dx} = -2y/x$, $\dfrac{d^2y}{dx^2} = 6y/x^2$, and use these results to find $\dfrac{d^3y}{dx^3}$. NO credit will be given for solving for y and differentiating explicitly.
I. (15 points; 5 points each) Find the interest earned by $2500 at 5.4% for 18 months if the interest is calculated by the
following methods:
(a) Simple interest (b) Compounded quarterly (c) Compounded continuously
II. (15 points; 5 points each) Find the following limits:
(a) $\lim_{x\to 2} \dfrac{\cos(\pi/x)}{x^2 + x - 6}$ (b) $\lim_{x\to 1} \dfrac{x^3 + 3x - 4}{2x^2 - 5x + 3}$ (c) $\lim_{x\to\infty} \dfrac{3x^4 - 5x^2 + x - 8}{7x^3 + x^2 - x + 1}$
III. (10 points; 5 points each) For this problem, use the function $f(x) = x^3 - 3x + 7$.
(a) Find the local max and min values of this function. (b) Find the global max and min of y = f (x) on the interval
[0, 3].
IV. (15 points; 5 points each) Dudley’s new employer D.I.P. (Diversified International Products), is selling wicket greasers
“for all those sticky wickets you run into”. The price/demand function for wicket greasers is $p = 95 - x - 2x^2$.
(a) What price corresponds to a demand of 4? (b) What is the elasticity when the demand is 4? (c) Using the results
of part (b), if the price is raised above the level in part (a), will the revenue for D.I.P. increase or decrease?
V. (30 points; as noted)
2 Find the following derivatives:8 x ∂2
(a) (5 points) ∂t x√y−cost (b) (5 points) ∂ x∂5 ∂ y3 y2 ee + y2 ex ex y z + x2 y4
∂
5t−3 x
(c) (5 points) ∂z∂y (d) (15
d2w
points) dx2
if w3 − x4 = −7 x w + 10
VI. (10 points) If you have a function F(u, v) and u and v (me and you?) are given by the equations u = 5 tan(x) sec(y) and v = 5 tanh(x) sech(y), give the formulas for $\partial F/\partial x$ and $\partial F/\partial y$ in terms of $\partial F/\partial u$ and $\partial F/\partial v$.
VII. (10 points) Find $\left(\dfrac{\partial w}{\partial x}\right)_z$ for $w = \ln(x y) + \dfrac{x}{y + z}$ subject to $\operatorname{Arcsin} x + x^2 + y^2 + z^2 + z = 2$ (use implicit methods only).
I. (15 points, 5 points each) Find the future value in each part.
(a) $2500 loaned out at 4% simple interest for 9 months. (b) $6500 put into an account paying 5% compounded quarterly
for 7 years. (c) $4000 compounded continuously at 6.3% for 10 years.
II. (20 points; as marked) Find each of the following limits.
(a) (5 pts) $\lim_{x\to 4} \dfrac{2x^2 - 3x - 20}{x^2 + x - 20}$ (b) (5 pts) $\lim_{x\to\infty} \dfrac{7x^3 - 5x^2 + 18}{9x^3 + 25x + 27}$ (c) (10 pts) $\lim_{x\to 0} \dfrac{\ln(\cos(3x))}{7x^2}$
III. (20 points; 10 points each) Dudley has started selling a premium weed eater called MotorGoat. The price per unit, if he
makes x units, is p(x) = 200 − 0.1 x.
(a) Find the elasticity at x = 900. Is the market elastic or inelastic at this sales level? (b) By what percentage will his
sales drop from this level if he increases the price of the MotorGoats by 10%?
IV. (10 points) This is a continuation of the saga from problem III. U C(x) = 5000 + 25 x − 0.001 x2 . How many units must
Dudley sell to maximize his profits? (Profit is revenue (x p) minus cost).
V. (15 points) Find the global extrema of $f(x) = \dfrac{x}{4 + x^2}$ on the interval [−1, 4].
VI. (30 points;
10 points
each) Find the following derivatives:
∂5
sin(x y) 2 3 ∂w
(a) y
(b) dy/dx for y = x e + y ln(x) (c) for w = 3 z2 y + 4 sin(y z) + cos(x2 ), subject
∂ x ∂ y4 x4 ∂x y
to x2 + y3 + z = 7.
I. (10 points; 5 points each) Determine how much a deposit of $1200 will earn at 6% for 9 months if the interest is
(a) compounded quarterly. (b) Compounded continuously.
II. (15 points; 5 points each) Find each of the following limits.
(a) $\lim_{x\to 3} \dfrac{x^2 - 5x + 6}{2x^2 - 4x - 6}$ (b) $\lim_{x\to\infty} \dfrac{2x^3 - 5x^2 + 6x}{3x^3 - 4x - 6}$ (c) $\lim_{x\to 1} \dfrac{x \ln(x)}{x^2 + 2x - 3}$
III. (15 points; 5 points each) For the following questions, use the function $f(x) = 2x^3 + 3x^2 - 12x + 3$.
(a) Find all local max and min values of f (x). (b) Using the number line, describe all segments where the function is
increasing or decreasing. (c) Find the global max and min of f (x) on the interval [0, 4].
IV. (35 points; as listed) Find the following derivatives.
√ !
∂ x2 sec y ∂3
x y2 z3 − ex y z
(a) (5 points) 2 2
(b) (5 points) 2
(c) (10 points) Use implicit methods only to
∂ x y cos(x ) ∂x∂ y
dx x 2 d2x
show that = if x2 e2 z = 7. Then, find 2 . Simplify your answer as much as possible. (d) (15 points) Find
dz 2z dz
∂w
for w = 3 z x + 4 cos(y z) + sin(x ) subject to x3 + y2 + z = 7.
2 2
∂x y
V. (15 points; 5 points each) Given the price/demand function $p(x) = x - \dfrac{x^2}{10}$.
(a) Find the elasticity at x = 8. (b) Is the market elastic or inelastic at x = 8? Why? (c) Does revenue increase or decrease if price increases at x = 8? Why?
√ ∂f ∂f
VI. (10 points) Suppose that f (x, y, z) = ex y z , x(u, v) = 3 u sin v, y(u, v) = 4 v2 u, and z(u, v) = u v. Find and .
∂u ∂v
I. (10 points; 5 points each) Determine the future value of a deposit of $2200 that earns 3% for 9 months if the interest is:
(a) Compounded monthly (b) Compounded continuously
II. (15 points; 5 points each) Find each of the following limits.
(a) $\lim_{x\to -3} \dfrac{3x^2 + 12x + 9}{x^2 + x - 6}$ (b) $\lim_{x\to\infty} \dfrac{2e^x - x^3 + \ln(x)}{4e^{3x} + x^{10}}$ (c) $\lim_{x\to 0} \dfrac{[\sin(x)]^2}{\tan(x)}$
III. (25 points; as marked) For the following questions, use the function $f(x) = 3x^4 - 4x^3 - 12x^2 + 5$.
(a) (10 points) Find all local max and min values of f (x). (b) (5 points) Using a number line, describe all segments
where the function is increasing or decreasing. (c) (10 points) Find the global max and min of f (x) on the interval
[1, 4].
IV. (45 points; as marked) Find the following derivatives.
√ !
∂ x2 sin( y) ∂3 ∂x
(a) (10 points) 2 2
(b) (10 points) 2
(x y2 z2 − sin(y z)) (c) (10 points) if cos(w x z2 ) −
∂ x y cos(x ) ∂x∂ y ∂z
∂w √
y3 x z = 3 z x2/3 (d) (15 points) for w = 3 z3 x2 + 4 sinh(y z) = sin(z x2 ) subject to z2 x3 + y2 z = 7
∂x z
V. (15 points; 5 points each) Given the price/demand function $p(x) = x^2 - (x^3/30)$.
(a) Find the elasticity at x = 25. (b) Is the market elastic or inelastic at x = 25? Why? (c) Does revenue increase or
decrease if price increases at x = 25? Why?
Summary sheet
Elasticity $= \eta = \dfrac{p/x}{dp/dx} = \dfrac{dx/x}{dp/p}$
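The elasticity formula on the summary sheet can be checked numerically. The following is only a sketch: the helper name `elasticity` is made up for illustration, the derivative is entered by hand, and the demand function is the one from the review problem $p(x) = x - x^2/30$ at x = 25. Exact fractions are used so that no rounding creeps in.

```python
from fractions import Fraction

def elasticity(p, dp_dx, x):
    """Elasticity eta = (p/x) / (dp/dx), evaluated at demand level x."""
    return (p(x) / x) / dp_dx(x)

# Demand function from the review problem: p(x) = x - x^2/30.
# Its derivative, worked out by hand, is dp/dx = 1 - x/15.
p = lambda x: x - x**2 / Fraction(30)
dp_dx = lambda x: 1 - x / Fraction(15)

eta = elasticity(p, dp_dx, Fraction(25))
print(eta)                                          # -1/4
print("elastic" if abs(eta) > 1 else "inelastic")   # inelastic
```

Since |η| < 1 here, the market is inelastic at that point, which is the kind of conclusion parts (b) and (c) of the elasticity problems ask for.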
V. (a) η = −1/4 (b) Inelastic, since |η | < 1 (c) Increase, since the market is inelastic at that price.
VI. −6 y/x3
V. (a) η = −7/3 (b) Elastic, since |η | > 1. (c) Decrease, since revenue moves the opposite direction from price in
an elastic market.
VI. $(z w \sqrt{y} - 3 w^5 x^2 z) - \left(\tfrac{1}{2} x z w\, y^{-1/2} + z^2\right) \times \left[\dfrac{2y}{x}\right]$
VII. $f_u\,(2) + f_v\,(x e^{xy})$
VIII. (a) The rate productivity changes per hundred dollars spent in employee comfort (b) The rate productivity changes per hundred dollars spent in productivity bonuses (c) The rate productivity changes per hundred dollars spent in employee comfort per hundred dollars spent in productivity bonuses
Integration explained
\begin{align}
\Delta y_1 + \Delta y_2 + \cdots + \Delta y_n &= (F(x_1) - F(x_0)) + (F(x_2) - F(x_1)) + \cdots + (F(x_n) - F(x_{n-1})) \tag{4.1}\\
&= F(x_n) - F(x_0) \tag{4.2}\\
&= F(b) - F(a) \tag{4.3}\\
&= \Delta y \tag{4.4}
\end{align}
In English, the total voltage drop is equal to the sum of all the individual voltage drops.
Dudley: Hey, that makes some sense to me. How about you, Mugsy?
Mugsy: I’m afraid it does to me, too.
Dudley: Afraid?
Mugsy: Yeah. Not scared, of course. But when math starts making too much sense, I know something’s got to be
seriously wrong.
This collapsing property in summations is called telescoping.
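To watch the telescoping happen with actual numbers, here is a small sketch. The function F and the values of a, b, and n are arbitrary choices made for illustration; they do not come from the text.

```python
def F(x):
    # Any function will do; this cubic is just an illustration.
    return x**3 - 2 * x

a, b, n = 1, 4, 6
# Division points x_0 = a, x_1, ..., x_n = b
xs = [a + (b - a) * i / n for i in range(n + 1)]
# Individual changes Delta y_i = F(x_i) - F(x_{i-1})
deltas = [F(xs[i + 1]) - F(xs[i]) for i in range(n)]

total = sum(deltas)
print(total, F(b) - F(a))   # the two numbers agree
```

All the intermediate values F(x_1), ..., F(x_{n-1}) cancel in pairs, so only the two end values survive, no matter how many pieces you use.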
Summations occur sufficiently often that mathematicians have developed a summation notation for them. For example,
the summation
f (1) + f (2) + · · · + f (74)
would be written out as
$$\sum_{j=1}^{74} f(j)$$
What you do is successively replace the j in f ( j) by the values starting at (as indicated by the value below the Σ sign) 1, 2,
3, . . . , up to 74, (as shown above the Σ) and evaluate f ( j) for each of those values, giving f (1), f (2), f (3), . . . , f (74). You
then add up all of the values to get
f (1) + f (2) + f (3) + · · · + f (74)
For another example,
$$\sum_{j=0}^{7} (j^2 + 3) = 3 + 4 + 7 + 12 + 19 + 28 + 39 + 52 = 164$$
We will be using summation notation extensively for the rest of the year, so it will be handy to have some feel for it.
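If it helps to think of summation notation as a loop, the following sketch spells that out. The helper name `summation` is made up for illustration; the example is the sum of $j^2 + 3$ for j from 0 to 7 worked above.

```python
def summation(f, lower, upper):
    """Add f(j) for j = lower, lower + 1, ..., upper (both limits included)."""
    return sum(f(j) for j in range(lower, upper + 1))

# The individual terms of the example, j = 0, 1, ..., 7
terms = [j**2 + 3 for j in range(0, 8)]
print(terms)                                  # [3, 4, 7, 12, 19, 28, 39, 52]
print(summation(lambda j: j**2 + 3, 0, 7))    # 164
```

Note that the upper limit is included in the sum, which is why the code uses `upper + 1` in `range`.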
Here are a few properties of summation notation:
• If c is a constant then
$$\sum_{j=1}^{n} c\,x_j = c \sum_{j=1}^{n} x_j$$
It looks as though the c simply moved outside the summation sign (as in fact it did). Written out, this is $(c x_1 + c x_2 + \cdots + c x_n) = c(x_1 + x_2 + \cdots + x_n)$, the familiar distributive law. In fact, most of these properties are nothing more than
arithmetic laws written in a way that tends to obscure the familiarity.
Mugsy: Why do I have the idea that mathematicians write things in the most obscure way possible?
Albert: Pick any academic discipline. They all tend to write things in picky and obscure ways, usually to
camouflage the fact that what they are saying is really trivial and obvious. Mathematicians are no exception.
The only trick is to learn the lingo, and that’s most of the battle.
• Another familiar law is
$$\sum_{j=1}^{n} (a_j + b_j) = \sum_{j=1}^{n} a_j + \sum_{j=1}^{n} b_j$$
• Notice, however, that there are some things that you are not allowed to do with summations.
$$\sum_{j=1}^{n} (a_j \times b_j) \neq \left(\sum_{j=1}^{n} a_j\right) \times \left(\sum_{j=1}^{n} b_j\right)$$
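The three properties above can be tried out on small lists. This is only a sketch with made-up numbers (the lists a and b and the constant c are arbitrary), but it shows the first two laws holding and the product "law" failing.

```python
a = [1, 2, 3]
b = [4, 5, 6]
c = 10

# A constant factor moves outside the summation.
assert sum(c * x for x in a) == c * sum(a)

# The sum of (a_j + b_j) splits into two sums.
assert sum(x + y for x, y in zip(a, b)) == sum(a) + sum(b)

# But the sum of products is NOT the product of the sums.
lhs = sum(x * y for x, y in zip(a, b))   # 4 + 10 + 18
rhs = sum(a) * sum(b)                    # 6 * 15
print(lhs, rhs)                          # 32 90
```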
$$\frac{(n+1)^3}{3} - \frac{(n+1)^2}{2} + \frac{19\,n}{6} + \frac{19}{6}$$
$$164$$
First, Maple calculated a formula for the summation that is valid for any value of n, and then we had it substitute n = 7
into the formula and checked that the formula does work (for that value at least).
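The formula can be checked for more than one value of n without Maple. The sketch below assumes the formula is for the earlier example, the sum of $j^2 + 3$ for j from 0 to n; the function names are made up for illustration, and exact fractions avoid any rounding.

```python
from fractions import Fraction

def closed_form(n):
    """The formula shown above, evaluated exactly."""
    n = Fraction(n)
    return (n + 1)**3 / 3 - (n + 1)**2 / 2 + 19 * n / 6 + Fraction(19, 6)

def direct(n):
    """Add up j^2 + 3 for j = 0, 1, ..., n the long way."""
    return sum(j**2 + 3 for j in range(0, n + 1))

print(closed_form(7))   # 164
# The formula and the direct sum agree for every n tried here.
assert all(closed_form(n) == direct(n) for n in range(0, 50))
```

Checking fifty values is not a proof, of course, but it is much more convincing than checking one.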
Homework #31
Exercises.
1. Find the values of the following summations. Check your answers with Maple, if you want.
(a) $\sum_{j=0}^{5} (j - 1)$
(b) $\sum_{k=2}^{7} (k^2 - 5)$
(c) $\sum_{j=0}^{3} 2^{-j}$
(d) $\sum_{l=1}^{4} (l - 1)^l$
(e) $\sum_{r=6}^{10} x^r$
(f) $\sum_{r=6}^{10} r^x$
2. Find the values of the following summations. Check your answers with Maple, if you want.
(a) $\sum_{j=0}^{5} (j - 3)$
(b) $\sum_{k=2}^{7} (k^2 + 3)$
(c) $\sum_{j=0}^{5} 2^{j}$
(d) $\sum_{l=1}^{3} l^l$
(e) $\sum_{r=7}^{12} x^r$
(f) $\sum_{r=7}^{12} r^x$
3. Make up some summations of your own. As usual, three of them will count for credit.
Problems.
1. Find $\sum_{n=0}^{87345} \cos(n\pi)$. (Obviously, the straightforward approach is not going to work well by hand. Try figuring out by hand the values of $\sum_{n=0}^{k} \cos(n\pi)$ for k = 1, 2, 3, 4, 5, and 6. See if you can find a pattern. Then figure out where 87345 fits into that pattern.) You can check your answer with Maple again.
(a) Write out the numbers in each of these summations and then add them:
$\sum_{j=1}^{7} (j - 3)$, $\sum_{k=1}^{7} (k - 3)$, $\sum_{r=1}^{7} (r - 3)$, and $\sum_{n=1}^{7} (n - 3)$
(b) What do you notice about the answers in the previous part? Is there an “obvious” reason for this? (This
observation is called the principle of ignorance: A variable doesn’t know what you call it.)
3. This problem looks at another property of summations.
(a) Write out the numbers in the summations and then add them:
$\sum_{j=1}^{7} (j - 3)$, $\sum_{j=0}^{6} (j - 2)$, $\sum_{j=4}^{10} (j - 6)$, $\sum_{j=-7}^{-1} (j + 5)$, and $\sum_{j=25}^{31} (j - 27)$
(b) What do you notice about the answers to the previous part? Is there an “obvious” reason for this? This process
is called shifting the index of summation. Notice that the limits of summation and the function change, so that
the same numbers are obtained in each sum.
(c) Give the summation that should be used to duplicate the sums in the first part of this problem if the lower limit
of summation is j = 10. The numbers in each sum should be the same. Note that when the value of j in the
sum gets larger, the limits on the summation must get smaller to give the same numbers.
(d) Change
$$\sum_{j=5}^{10} (j^2 - 15)$$
to a summation
$$\sum_{k=0}^{?} (??)$$
Do this in a way that causes all the numbers in both sums to be equal. The other limit is not too difficult to figure out, but the function is more complicated. Think of what the relation between j and k should be, and plug that into the summation.
(You should omit the x in the limits when the variable matches the differential.) How do we find F(x)? That’s the whole
key to these problems. We know that $dy = d(F(x)) = F'(x)\,dx = x^2\,dx$, so we need to find F(x) that satisfies $F'(x) = x^2$. In
this case, we can actually think about it for a minute
Dudley: For Albert, it’s about a microsecond.
and realize that we can find such a function, namely $F(x) = \frac{1}{3} x^3$. Then the change in y, ∆y, is given by
$$\frac{dP}{P} = k\,dt$$
We need a function whose differential is dP/P, and the answer is ln P. The function with differential k dt is even easier,
being d(k t). (Remember that k is a constant.)
Then dP/P = k dt becomes
d(ln P) = d(k t)
meaning that the tiny wiggles in each match. Then the larger wiggles must match, too, so
∆(ln P) = ∆(k t)
We are now getting close to the answer. (The step from the differential to the large-scale change is the hard one!)
Mugsy: Al, is that what integration is supposed to do?
Albert: Exactly. Every time.
If we have values for k and ∆t, we can find the change in P.
To make this concrete, suppose the initial population is P(0) = 50.
Dudley: That sure implies that we aren’t working with humans.
Then we can make some progress finding ∆P
But we also need more information, so suppose we take k = 0.1, and t going from 0 to 10 (the units of time might be
months). Then we can get
We can un-do the logarithms by taking the exponential of both sides and get (remember that a common notation for ex is
exp(x))
P(10) = exp(4.912) = 135.9
What’s the answer? How do you have 135.9 rabbits? Three approaches occur. One says that you don’t have another rabbit until the population hits a whole number, so there would be 135 rabbits. Another says that you should round, getting 136 rabbits. A third is that you should count the next rabbit as soon as possible, also getting 136. Either answer (135 or 136) would be considered correct, and there is never much difference between the approaches.
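The arithmetic of this example is easy to reproduce. The sketch below uses the values from the text (P(0) = 50, k = 0.1, t from 0 to 10) and the relation ∆(ln P) = ∆(k t); the variable names are made up for illustration.

```python
import math

P0, k, t0, t1 = 50, 0.1, 0, 10

# Delta(ln P) = Delta(k t)  means  ln P(t1) - ln P(t0) = k (t1 - t0)
ln_P1 = math.log(P0) + k * (t1 - t0)
P1 = math.exp(ln_P1)          # un-do the logarithm

print(round(ln_P1, 3), round(P1, 1))   # 4.912 135.9
print(math.floor(P1), round(P1))       # 135 136
```

The last line shows the two whole-number answers discussed above: truncating gives 135 and rounding gives 136.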
The value of k can be examined a little more carefully, too. It is called the specific growth rate. It represents the rate
of growth per rabbit, since it is k = (1/P)(dP/dt). Just the derivative dP/dt is the growth rate, but that changes with the
population. But (1/P)(dP/dt) is the growth rate per unit population (dividing by P gives the “per unit population” part),
and that tends to stay much closer to constant. (Later, we will improve the assumption that the specific growth rate is
constant. We will also try this on a more realistic example, namely the census data for the U.S.A. from 1790 to 1990.)
Homework #32
Exercises.
1. For this exercise, use the same values for initial population (P(0) = 50) and specific growth rate (k = 0.1) that were
used in class.
(a) What would the rabbit population be at t = 36 (3 years)?
(b) What would the rabbit population be at t = 720 (60 years)?
(c) Assume that each rabbit weighs 5 pounds. (Don’t get picky, it’s just an assumption.) How much would the total
number of rabbits in the previous part weigh?
(d) The earth weighs about $1.32 \times 10^{25}$ lb. Is the answer to the previous part realistic?
2. For this exercise, use the same values for initial population (P(0) = 50) and specific growth rate (k = 0.1) that were
used in class.
(a) What would the rabbit population be at t = 24 (2 years)?
(b) What would the rabbit population be at t = 480 (40 years)?
(c) Assume again that each rabbit weighs 5 pounds. How much would the total number of rabbits in the previous
part weigh?
(d) Is the answer to the previous part realistic?
Classical application: Archimedes’ method of exhaustion. The basic idea is to fill up (exhaust) the area underneath a
curve with some simple geometric shape. We’ll use rectangles. The limit gives the area.
Mugsy: Method of exhaustion, huh? Let me work on that.
The basic problem is to find the area under the curve y = f (x), for a ≤ x ≤ b. Slice the area up
Dudley: Chunk it through your Veg-O-Matic and not a seed out of place!
and you get a large number of strips that are roughly rectangular. Approximate the areas of the strips by rectangles, and
add these areas up to get an approximation to the area under the curve.
Here are a few examples of what it might look like.
There are several choices for the specific approximating rectangles to use, and various terms for the resulting approximation
to the area. You always use the width of the strip for the width of the rectangle. What changes is the way you determine
the height of the rectangle.
Mugsy: I just choose the height as 1. Boy, does that simplify things!
Albert: Did you wonder why you never got the right answer?
Mugsy: I usually got some partial credit, though.
Here’s a table giving several different possibilities, and the name of the resulting approximation to the area.
Moving to exhaustion. We are looking for the area, but that is not what we have found. We have only approximated
the area, and we aren’t even very sure how well we have done that. Before we can get the area, we need to deal with that
problem.
We get the actual area by slicing finer and finer. The whole problem is the fact that the rectangles don’t really fit well.
But if we take thinner and thinner rectangles, the error is proportionally less of the area.
The best way to see this is with pictures. We will take the graph $y = -x^3/4 + x^2 + 1$ for 0 ≤ x ≤ 4 and use left sums.
Here is the picture of the approximation using 20 rectangles.
There are little (almost) triangular regions at the top of each rectangle that represent the error, how far off the rectangle
is from the actual area of the strip under the graph. (The reason that the error regions are almost triangular is that as we
magnify the graph, it becomes more and more like a line. Remember?) We will focus in on two pairs of these, first the ones
on either side of x = 2 and then the ones on either side of x = 3.
Let’s first take a close up look at the top of the two that are next to x = 2. Then what we do is continue looking at that
same region as we increase the number of rectangles to 40, 80, 160, and 320. The shaded regions are the new areas added
into the approximation that were missed with 20 rectangles.
It is clear that we are filling in the area as we increase the number of slices. That is, we are exhausting (using up) the
area. That’s the reason for the term “method of exhaustion.”
In order to get the exact area, then, what do we do? We could use an infinite number of rectangles. That is exactly
what calculus does. However, Archimedes didn’t have calculus, so he used limits. Take a limit as the number of slices goes
to infinity, and you get the area.
There is one concern, though. What happens if the rectangles in the approximation are too tall? If we add to them, then
the approximation will get worse, not better. Never fear!
Dudley: Why do I always get afraid when people say that?
Albert: Probably experience.
The approximations do The Right Thing in that case, too. If the rectangles are too big, increasing the number of slices
shrinks the areas of the rectangles down to the area under the curve. Let’s look at what happens to this curve near x = 3 to
see that.
First, we take a close look at the tops of the two rectangles on either side of x = 3. Again, there are two roughly
triangular regions that represent the error in the approximation for those rectangles. Now watch what happens to that error
as we again increase the number of rectangles to 40, 80, 160, and 320. The shaded region now represents the area that has
been removed from the approximation, thus decreasing the error for those rectangles.
Again, it is clear that the rectangles are closing in on the correct area.
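The same experiment can be run with numbers instead of pictures. This is a sketch only: the function name `left_sum` is made up, and the "exact" value 28/3 is an assumption borrowed from the antiderivative $-x^4/16 + x^3/3 + x$, a technique developed later in the chapter rather than in this discussion.

```python
def f(x):
    return -x**3 / 4 + x**2 + 1

def left_sum(f, a, b, n):
    """Left-sum approximation: n strips of equal width, height taken at the left edge."""
    dx = (b - a) / n
    return sum(f(a + j * dx) * dx for j in range(n))

exact = 28 / 3   # assumption: from the antiderivative -x^4/16 + x^3/3 + x on [0, 4]
errors = []
for n in (20, 40, 80, 160, 320):
    approx = left_sum(f, 0, 4, n)
    errors.append(abs(approx - exact))
    print(n, approx, errors[-1])
```

Each time the number of rectangles doubles, the printed error shrinks, which is the numerical version of watching the shaded regions fill in.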
An example from Archimedes. Find the area underneath the curve $y = a x^2$ for 0 ≤ x ≤ b. (Assume a > 0 for convenience.) Archimedes actually did this problem (or it is attributed to him anyway), except that he would not have used the term limit.
The area of each strip will be approximately base × height, with base = ∆x. The height will be some y-coordinate (being distance above the x-axis), so it will be $a x^2$, where we will have to determine what x-coordinate to use.
Let’s use n strips (n will be a variable that we will let go to infinity to achieve the “exhaustion” of the area). Each strip
will have width ∆x = b/n, since the width of the whole area is b, and we will divide the area into n equal widths.
Dudley: Are equal widths necessary?
Albert: No. But they simplify the calculations in most cases. All that you really need is to have the width of the
widest slice go to zero.
The x-coordinates of the ends of the slices will then be 0, b/n, 2 b/n, 3 b/n, . . . , (n − 1) b/n, b.
The heights will be $a x^2$, where we need to find the x-coordinates. I am going to use the right sum approximation. What will the x-coordinates be, then? It depends on which strip you’re working with. The right end of the jth strip will have coordinate x = j b/n. (Here, j is a generic, unspecified variable, 1 ≤ j ≤ n, which will turn into the index letter we use in the summation.) The height of the jth strip is then
$$a x^2 = a \left(\frac{j b}{n}\right)^2 = \frac{a j^2 b^2}{n^2}$$
We are now faced with evaluating that last summation. If you have encountered a topic called mathematical induction, you have probably found its value. Since we don’t have time to do all of mathematics in this course, I’ll simply tell you that the value of that sum is $\frac{1}{6} n (n + 1) (2n + 1)$. (Check me on Maple, if you aren’t sure. The command is sum(j^2,j=1..n);. This is another example of an indefinite summation.) The approximation to the area is then
$$\frac{a b^3}{n^3} \cdot \frac{1}{6}\, n (n + 1) (2n + 1) = a b^3\, \frac{(n + 1) (2n + 1)}{6 n^2}$$
That is not an obvious answer.
Mugsy: Tell me...
But even worse, it is only an approximation to the correct answer (the area)! In order to get the correct area, we must take
the limit as n → ∞.
Mugsy: Archimedes did this? You sure?
Fortunately, we have already tackled that problem as well, and it turns out to be fairly easy.
Dudley: Right. (I had to beat Mugsy on that one.)
The $a b^3$ factor is just a constant that will come along without change. As n → ∞, the relevant terms are $(n + 1)(2n + 1)/(6n^2)$, and finding that limit (as the quotient of two polynomials) is easy. The answer is (after some algebra) 1/3. The final area is $\frac{1}{3} a b^3$.
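For anyone who wants to see the limit happen, here is a sketch that adds up the right-sum strips directly and compares them with the formula $a b^3 (n+1)(2n+1)/(6n^2)$. The values a = 2 and b = 3 are arbitrary illustrations, and the function names are made up; exact fractions keep the comparison honest.

```python
from fractions import Fraction

def right_sum(a, b, n):
    """Right-sum approximation to the area under y = a x^2 on [0, b] with n equal strips."""
    dx = Fraction(b, n)
    return sum(a * (j * dx)**2 * dx for j in range(1, n + 1))

def closed_form(a, b, n):
    """The formula a b^3 (n + 1)(2n + 1) / (6 n^2) from the text."""
    return Fraction(a * b**3 * (n + 1) * (2 * n + 1), 6 * n**2)

a, b = 2, 3   # illustrative values only
for n in (1, 10, 100, 1000):
    assert right_sum(a, b, n) == closed_form(a, b, n)
    print(n, float(right_sum(a, b, n)))

print(float(Fraction(a * b**3, 3)))   # 18.0, the limiting area (1/3) a b^3
```

The right sums are always a bit too big for this curve (the rectangles stick out above the parabola), and they settle down toward 18 as n grows.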
That is the answer that Archimedes got, but not the form in which he stated it.
Mugsy: Aha! I knew there was a catch!
If you look at the picture of the area under the parabola, you can draw a rectangle with lower corner at the origin, and upper corner at the tip of the parabola with coordinates $(b, a b^2)$. (How did I get the y-coordinate?) That rectangle has area (base) × (height) = $(b)(a b^2) = a b^3$. Archimedes would have given his result as the area under the parabola is 1/3 the area of the rectangle, just as we got. Here is the picture.
If you are now thinking about how you are going to do this yourself, don’t worry. I won’t make you do this.
Dudley: YAY!
The formulas for the summations get horrendous. But you did need to see it once for historical reasons.
Dudley: For the same reason we take a bunch of other courses around here....
But more than that, the method is classical and the basis of the way that most calculus courses treat areas. We will be
working much more complicated problems, but not this way!
Re-do Archimedes’ example from the point of view of differentials. We want to do the entire problem all over, from
scratch, but with the point of view that we will be using for the rest of the course. We’ll slice the region up also, but into
dx-width “rectangles.” Ours will be so thin that there will be no approximation involved—we will get the exact correct area
right off! The penalty that we pay for getting the right area is that we must add up the areas of the “rectangles” with an
integral rather than a summation. I said that the calculus way to solve this problem is with an infinite number of rectangles,
and this is it.
Dudley: Are there really an infinite number of those?
Albert: Yes, and that is why regular summations won’t work to add them up. But you are better off not asking how
infinite the number of rectangles is. Summations can handle one variety of infinite numbers of terms, and
integrals handle another. Infinite sums will show up next semester.
Dudley: Aww, that’s really mean. You mean to tell me there are different sizes of infinity?
Albert: Yes, there are. I did say that you were better off not asking.
The area of one of the “rectangles” is still (base) × (height), except that the base now has length dx and the height is
just $a x^2$.
Dudley: Why don’t we use one of those approximation things?
Albert: Remember that differentials are so small, we can treat other variables as constants? That’s what is happening
here. The dx is so small that all the different ways of finding the heights give the same results, namely $a x^2$.
The area of the differential strip is then $a x^2\,dx$. Adding these up gives $\int a x^2\,dx$. The thing that we are missing is the
equivalent of the limits of summation, called the limits of integration. They tell the values of x for which the “addition” of
the $a x^2\,dx$ will occur. In this case, we go from x = 0 to x = b. The integral is then $\int_0^b a x^2\,dx$, and this is the area exactly.
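To see the connection between ordinary rectangles and the dx-width strips, here is a small numerical sketch. It is written in Python rather than Maple (that choice, the function names `strip_sum` and `exact_area`, and the sample values a = 2, b = 3 are all assumptions made for illustration). It adds up n left-endpoint rectangles and prints how far the total falls short of $a b^3/3$ as the rectangles get thinner.

```python
from fractions import Fraction


def strip_sum(a, b, n):
    """Total area of n left-endpoint rectangles of width b/n under y = a*x**2 on [0, b]."""
    dx = Fraction(b, n)  # width of each rectangle
    return sum(a * (i * dx) ** 2 * dx for i in range(n))


def exact_area(a, b):
    """The value a*b**3/3 that the integral from 0 to b gives."""
    return Fraction(a * b ** 3, 3)


# Illustrative values only: a = 2, b = 3.
for n in (10, 100, 1000):
    shortfall = exact_area(2, 3) - strip_sum(2, 3, n)
    print(n, float(shortfall))
```

The shortfall shrinks as n grows, which is the numerical shadow of the claim that infinitely thin strips give the area exactly.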
The last thing is to evaluate that integral. So, we need to find a function F(x) whose differential is $a x^2\,dx$. In this case,
we can find one fairly easily.
$$F(x) = \frac{a x^3}{3}$$
If we find ∆F for x = 0 to x = b, we get the answer:
\begin{align}
\int_0^b a x^2\,dx &= F(b) - F(0) \tag{4.18}\\
&= \frac{a (b)^3}{3} - \frac{a (0)^3}{3} \tag{4.19}\\
&= \frac{a b^3}{3} \tag{4.20}
\end{align}
It is worth commenting that this is the same answer that we got the other way, and this way was considerably less painful.
This is the way that we will work such problems from now on.
There is a standard set of notations and terminologies for definite integrals, and we might as well get them now. The
integral $\int_0^b a x^2\,dx$ is read “The integral of $a x^2$ from x = 0 to b.” Note that the limits are read from the bottom to the top.
Reading from the top down is a sure sign of a person who is just learning calculus (just like saying that you are going to
derivate a function rather than differentiate it).
$$\frac{a b^3}{3}$$
Note how Maple uses the x=0..b to determine the variable (it is x, not a), as well as the upper and lower limits for the
integral. Since x was indicated as the variable, Maple correctly decided that a was a constant.
Mugsy: If only it would behave so nicely for me.
Homework #33
Problem.
1. In this problem, we do another example of finding areas. This time, we can verify the answer ourselves, with a bit of
algebra and geometry. We want the area underneath the curve y = a x, for b ≤ x ≤ c. We assume a > 0 and 0 ≤ b < c,
and we’ll use the differential approach for the calculus.
(a) The differential strips are dx wide. How tall are they (as a function of x)?
(b) What is the area of the differential strip, again as a function of x?
(c) What is the range of x’s to use for adding the strips together? (This question asks for the largest and smallest
values of x the problem contains, and tells us the limits to use on the integral.)
(d) Set up the integral for the area, including the limits and the function to integrate, all in terms of x.
(e) Find a function F(x) whose differential is the differential in the integral. (You’ll need a quadratic function if
you’ve done everything correctly up to this point.)
(f) Evaluate the integral by finding ∆F for the range of x’s in the integral. (This is the area of the region, as
determined by calculus.)
(g) The region we are looking at is a trapezoid. (For those of you that don’t remember this from high school
geometry, a trapezoid has four sides, two of which are parallel. In this case, the two parallel sides are vertical.)
The area of a trapezoid is equal to (the average length of the parallel sides) × (the distance between the parallel
sides). For the region we’ve been working with, what is the distance between the parallel sides?
(h) What are the lengths of the two parallel sides? (How do you find the y-coordinate of a point on y = a x?) What
is the average of those two numbers?
(i) Get the area of the trapezoid from the high school geometry formula (by multiplying together the answers to
the previous two parts) and show that it equals the area obtained by calculus.
General ideas.
We now begin our examination of a systematic procedure for finding indefinite integrals, which are used to calculate definite
integrals exactly. Soon, we will also cover a procedure for finding definite integrals approximately, for those situations in
which indefinite integrals can’t be found.
We want a function that gives a certain differential. To evaluate $\int_a^b f(x)\,dx$, we need to find a function F(x) so that
$F'(x) = f(x)$. Then $\int_a^b f(x)\,dx = F(b) - F(a)$. The reason this works is that if $F'(x) = f(x)$, then $dF = F'(x)\,dx = f(x)\,dx$,
and adding up all the differential changes dF in F(x) gives exactly ∆F = F(b) − F(a). There really is more going on here
than shows on the surface.
Mugsy: Surprise.
Need a constant of integration. The first problem that occurs is that there is no single function F(x) that satisfies
$F'(x) = f(x)$. There are lots of them! If $f(x) = 2x$, then what is F(x)? It could be $x^2$, or $x^2 + 1$, or $x^2 - 17$, or $x^2 + 3\pi^{\sqrt{217}}$. In
fact, it could be $x^2 + C$, where C is any constant. The differential of $x^2 + C$ is still $2x\,dx$, no matter what value the constant
C has. This always happens. The equation $F'(x) = f(x)$ will have solutions that look like F(x) + C. The C is called the
constant of integration.
This creates some concern. If the value of the definite integral depends on the constant C, then you’d better use the
same constant that I use (since I am always right :-).
Dudley: Hey, Albert! You’ve got competition!
But fortunately, the value chosen for C doesn’t ever affect the definite integral.
This point is best illustrated by an example. Early in definite integrals, I evaluated $\int_1^7 x^2\,dx$ to be 114. (Check that out. It
is on page 202.) There, I used the function $F(x) = \frac13 x^3$. Watch what happens if instead I use the function $F(x) = \frac13 x^3 + 17$.
Specifically, watch what happens to the 17.
\begin{align}
\int_1^7 x^2\,dx &= F(7) - F(1) \tag{4.21}\\
&= \left(\frac13 (7)^3 + 17\right) - \left(\frac13 (1)^3 + 17\right) \tag{4.22}\\
&= \frac13 \left((7)^3 - (1)^3\right) \tag{4.23}\\
&= 114 \tag{4.24}
\end{align}
If you stare at that for a while, it should become clear that the 17 in both F(7) and F(1) simply self-destructed because of
the subtraction. And with a bit more thinking, you should be able to convince yourself that no matter what constant went
in there instead of 17, it, too, would cancel out of the final answer.
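If you would rather let a computer do the staring, here is a short Python sketch of the same cancellation (Python instead of Maple, the function name `definite_integral_x_squared`, and the particular constants tried are assumptions for illustration). It evaluates F(7) − F(1) with $F(x) = \frac13 x^3 + C$ for several values of C, using exact fractions so that no rounding hides anything.

```python
from fractions import Fraction


def definite_integral_x_squared(lower, upper, c):
    """Evaluate the integral of x**2 from lower to upper using F(x) = x**3/3 + c."""
    def F(x):
        return Fraction(x ** 3, 3) + c

    return F(upper) - F(lower)


# Try several constants of integration, including the 17 used in the text.
values = [definite_integral_x_squared(1, 7, c) for c in (0, 17, -5, Fraction(22, 7))]
print(values)
```

Every entry in the list is the same number, because the constant is added once in F(7) and subtracted once in F(1).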
What this means, then, is that you can use any constant of integration that you want for evaluating definite integrals, and
the value of the definite integral will not change. What constant of integration did I use with $F(x) = \frac13 x^3$? The simplest one
possible; I used C = 0, since then $F(x) + C = \frac13 x^3$. This is the usual case. You (almost) never put the constant of integration
in when evaluating definite integrals, which is the same as using the value C = 0.
Dudley: Al, what value do you use?
Albert: Zero, of course. It’s the easiest, unless I happen to see another value that’s even easier. That’s really rare,
though.
This is sufficiently important to be highlighted:
When using indefinite integrals to evaluate definite integrals, you will appear to omit the constant of
integration. But what you are actually doing is using the fact that the value of the definite integral doesn’t
depend on the value of the constant, so you are setting it to a convenient value, namely zero.
This is exactly the equation that we used when we were introducing exponentials. But again, where is the constant of
integration? It certainly doesn’t appear (obviously, anyway).
Mugsy: You’re not convincing me.
Let’s look back at dP/dt = k P, which gave dP/P = k dt. The dP/P is the differential of ln(P), while the k dt is the
differential of k t. Then the indefinite integrals of the two differentials should also be equal, namely ln(P(t)) +C and k t +C
should be equal. Does that mean that ln(P(t)) = k t? No, and this is an important point that will occur later also. The
constants of integration that occur in the two different integrals are unrelated. It really should be that
When integrating both sides of an equation with differentials, you only need to put a constant of integration on
one of the sides.
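A sketch of the bookkeeping behind that rule, for the population equation. The labels $C_1$ and $C_2$ for the two unrelated constants are assumptions made here for illustration; $C_3$ and $P_0$ are the names used in the surrounding discussion.

```latex
\begin{align*}
\int \frac{dP}{P} &= \int k\,dt \\
\ln(P(t)) + C_1 &= k\,t + C_2 \\
\ln(P(t)) &= k\,t + (C_2 - C_1) \\
\ln(P(t)) &= k\,t + C_3, \qquad \text{where } C_3 = C_2 - C_1 \\
\text{at } t = 0:\quad \ln(P_0) &= k \times (0) + C_3 \;\Longrightarrow\; C_3 = \ln(P_0)
\end{align*}
```

Since $C_1$ and $C_2$ are both arbitrary, their difference is just one more arbitrary constant, which is why a single constant on one side is enough.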
If we compare this to the solution that we got before, we note something interesting. The $C_3$ is exactly the same as
$\ln(P_0)$. What does this say? The constant of integration needs to be there in order to accommodate different possible
starting populations. That is, the equation ln(P(t)) = k t +C describes a large number of situations. To apply it to a specific
situation, you have to determine the value of C. Without that constant of integration, you wouldn’t be able to have an initial
condition.
Albert: And you can bet that in any problem like this on a test, if you leave off the constant, you will get the wrong
answer, and lose credit.
Mugsy: Well isn’t that a sweet thing to do.
Albert: You only need to get burnt once before you remember it for a long time to come.
We now have two different ways of approaching problems that require integrals. One is adding up differentials to
get changes, using definite integrals. The other is to find functions with a specified differential, using indefinite integrals.
Either approach can be used to solve problems. The constant of integration appears only in indefinite integrals (without
limits of integration), and must be determined at the end. Definite integrals can be used to find changes, and you have to
add in the starting value to get the final value. Adding in that starting value correlates precisely to evaluating the constant
of integration. Both methods give exactly the same answers. The sequence of steps is different, but the two approaches are
completely equivalent.
$$y(x_0) = y_0$$
$$\int c\,u^n\,du = \frac{c\,u^{n+1}}{n+1} + C$$
where $n \neq -1$. This enables you to integrate any polynomial, and other things as well.
We have already been using this, but not formally. We had to figure out what the indefinite integral of $3x^2$ was, just a
bit ago. We did it more or less by guessing. This rule allows us to get the answer without having to think it through each
time.
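As a quick illustration of the rule applied to a polynomial, here is a worked example. The polynomial itself is made up for illustration, and the sketch assumes you may integrate a sum one term at a time, collecting all the constants into a single C.

```latex
\begin{align*}
\int (6u^2 - 4u + 5)\,du &= \int 6u^2\,du - \int 4u^1\,du + \int 5u^0\,du \\
&= \frac{6u^3}{3} - \frac{4u^2}{2} + \frac{5u^1}{1} + C \\
&= 2u^3 - 2u^2 + 5u + C
\end{align*}
```

You can check it the usual way: the derivative of $2u^3 - 2u^2 + 5u + C$ is $6u^2 - 4u + 5$.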
Mugsy: You mean I don’t have to think about this any more?! That’s just great!
One thing you might wonder about is why do we not allow n = −1. The easy way to convince yourself that it doesn’t
work is to try it out. You get a division by 0, which is always a sign of trouble.
Mugsy: I can tell right now that there will be times that n = −1. What do I do then?
Albert: You will find out shortly.
Dudley: My suggestion is to panic.
Albert: I can guarantee that panic is not going to help.
Using Maple to find indefinite integrals and solve initial value problems.
Maple works indefinite integrals, too. It’s even easier to write than definite integrals. You simply leave off the limits, but
don’t forget to tell it the variable. That is, to ask Maple for $\int a x^2\,dx$, simply type
int(a*x^2, x);
$$\frac{a x^3}{3}$$
Note that Maple doesn’t put in the constant of integration! That does not relieve you of the obligation to put it in. Maple
simply assumes that you know it should go there. On the other hand, I don’t assume that you know it should go there. You
$$y(x) = x^3 + 4$$
We encountered the solve(); family back at the beginning of the semester, when we did the Maple introduction during
the first lab period. The format for all that family is the same:
*solve( what to use in solving, what to solve for );
(The asterisk (*) represents some letter, such as f or d, or nothing at all, in the case of solve();.)
The dsolve(); is the command to solve a differential equation. The bracketed terms are used to group together all the
information that will go into the solution. The
diff(y(x),x)=3*x^2
is the $dy/dx = 3x^2$. Note that you have to indicate that y is a function of x in the diff(y(x),x). Otherwise, Maple will
get very confused.
Mugsy: That seems to happen an awful lot.
Albert: Well, Maple is not too smart.
Dudley: Only Albert would say something like that....
The y(-1)=3 is the initial condition to go with the equation. The comma after the } separates the information to go into
the solution from the item to solve for, in this case y(x). And note that you have to indicate that y is a function of x there,
too.
Maple can solve much more complicated differential equations, too. The command dsolve(); will be of exceedingly
great use when you get to the course called Differential Equations. (Assuming, of course, you take it.)
Homework #34
Exercises.
1. Find the solutions of these initial value problems. You will need to find the indefinite integrals by a process of trial
and error. We will remedy that situation soon. But all of these are very easy.
(a) $y' = 4x^3$, $y(1) = -4$
(b) $dw/dr = 3r^2$, $w(2) = 5$
(c) $dx/dt = \cos t$, $x(0) = 1$
(d) $y' = 0$, $y(2) = -8$
2. Find the solutions of these initial value problems. You will again need to find the indefinite integrals by a process of
trial and error.
(a) $y' = 7x^8$, $y(1) = -6$
(b) $dw/dr = -r^{-2}$, $w(2) = 2$
Problem.
1. We have already solved a problem a little more complicated than $y' = f(x)$, namely, $dP/dt = k P$. For that, we needed
to pull the P to the side with the dP. (See the discussion on page 212.) Use this same idea to solve the initial value
problem
$$\frac{dx}{dt} = x^3 t^3, \quad \text{with } x(-1) = 1$$
First get velocity, with its initial condition. The approach is direct. We assume constant acceleration. That gives us an
equation for velocity that we can solve.
The equation is dv/dt = −g, where g is the acceleration of gravity, a constant. There is an assumption here that
positive is upward. Since the force of gravity is pulling downward, the negative sign appears. This is a convention, not a
requirement. You could have good reason for declaring positive to be downward, and that would be fine. You must use
your choice consistently throughout the problem, though.
Note that the differential equation requires an initial condition, a value of v when t = 0. We’ll say that $v(0) = v_0$, just to
give it some notation. The equation dv/dt = −g with $v(0) = v_0$ is easy to solve:
\begin{align}
\frac{dv}{dt} &= -g \tag{4.29}\\
dv &= -g\,dt \tag{4.30}\\
\int dv &= \int -g\,dt \tag{4.31}\\
v &= -g\,t + C \tag{4.32}\\
(v_0) &= -g \times (0) + C \tag{4.33}\\
v_0 &= C \tag{4.34}\\
v &= -g\,t + v_0 \tag{4.35}
\end{align}
That wasn’t too bad.
Mugsy: It is if you don’t want to do it at all.
Dudley: You are in a grumpy mood.
Next get position. Velocity is a handy thing to have, but we don’t want velocity.
Mugsy: That depends.
We want position.
Mugsy: Position! Status! Fame! Wow! I’m in a better mood now.
We can get position from velocity by another differential equation, dy/dt = v. (Here, y is position, essentially height. I
don’t use x or s for position, for reasons that will appear in a moment.) We can get the velocity from the previous step, but
we still need an initial condition. Again, we simply make up some notation and say that initial position (height) is $y = y_0$.
Mugsy: I don’t think that word means what he thinks that word means.
That means that we must solve $dy/dt = -g\,t + v_0$ with $y(0) = y_0$. This is only slightly more difficult than velocity was.
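\begin{align}
\frac{dy}{dt} &= -g\,t + v_0 \tag{4.36}\\
dy &= (-g\,t + v_0)\,dt \tag{4.37}\\
\int dy &= \int (-g\,t + v_0)\,dt \tag{4.38}\\
y &= -\frac12 g\,t^2 + v_0\,t + C \tag{4.39}\\
(y_0) &= -\frac12 g \times (0)^2 + v_0 \times (0) + C \tag{4.40}\\
y_0 &= C \tag{4.41}\\
y &= -\frac12 g\,t^2 + v_0\,t + y_0 \tag{4.42}
\end{align}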
Those of you who have had physics have probably seen these equations before, but probably not derived this way.
Note that you can check the answers if you aren’t sure.
Albert: That’s a hint. It is really useful at times.
When we want to solve dv/dt = −g, with $v(0) = v_0$, and say that the answer is $v = -g\,t + v_0$, all we have to do is see if the
function given satisfies both parts of the original equation. Since it does ($\frac{d}{dt}(-g\,t + v_0) = -g$ and $v(0) = -g \times (0) + v_0 = v_0$),
it is correct. The same holds for dy/dt = v, $y(0) = y_0$.
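The same check can be done numerically. The sketch below is in Python rather than Maple, and the helper names (`free_fall`, `slope`, `satisfies_ivp`) and the sample numbers are assumptions made for illustration. It estimates the derivatives with a small central difference and confirms that the velocity and position formulas satisfy both the differential equations and the initial conditions.

```python
def free_fall(g, v0, y0):
    """Return the functions v(t) and y(t) from equations (4.35) and (4.42)."""
    def v(t):
        return -g * t + v0

    def y(t):
        return -0.5 * g * t ** 2 + v0 * t + y0

    return v, y


def slope(f, t, h=1e-5):
    """Central-difference estimate of the derivative of f at t."""
    return (f(t + h) - f(t - h)) / (2 * h)


def satisfies_ivp(g, v0, y0, times=(0.0, 0.5, 1.0, 2.0), tol=1e-6):
    """Check dv/dt = -g, dy/dt = v at the sample times, plus both initial conditions."""
    v, y = free_fall(g, v0, y0)
    equations_ok = all(
        abs(slope(v, t) + g) < tol and abs(slope(y, t) - v(t)) < tol
        for t in times
    )
    initial_ok = v(0.0) == v0 and y(0.0) == y0
    return equations_ok and initial_ok


# Made-up illustrative values (metric units).
print(satisfies_ivp(g=9.8, v0=12.0, y0=1.5))
```

A numerical check like this only samples a few times, so it can catch a wrong formula but does not replace the two-line verification done by hand.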
Also, note the strategy for solving initial value problems. You first integrate the differential equation. That gives a
constant of integration. Only then do you evaluate the constant using the initial condition. Before that, you don’t have
anything you can plug in, and you don’t have a constant yet, either. Finally, you plug that value of the constant back into
the solution of the differential equation, and you are done. Initial value problems are always solved in this order.
Note that a second-order differential equation requires two initial conditions. Ultimately, for free-fall motion, we
were working a second-order differential equation, namely, $d^2y/dt^2 = -g$. (The order of a differential equation is the
maximum order of any derivative in the equation.) In the process, we needed two initial conditions, one for y and one for
v = dy/dt. We needed two integrations to “undo” the two derivatives, so we got two constants of integration. We needed
two initial conditions to evaluate the two constants. This is generally true. You will need as many initial conditions as the
order of the differential equation.
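Here is the free-fall problem laid out as a single second-order problem, to make the count of constants visible. The labels $C_1$ and $C_2$ are assumptions for illustration; the notes used C for each constant in turn.

```latex
\begin{align*}
\frac{d^2y}{dt^2} &= -g \\
\frac{dy}{dt} &= -g\,t + C_1 && \text{(first integration)} \\
y &= -\tfrac12 g\,t^2 + C_1 t + C_2 && \text{(second integration)} \\
v(0) = v_0 &\;\Longrightarrow\; C_1 = v_0 \\
y(0) = y_0 &\;\Longrightarrow\; C_2 = y_0
\end{align*}
```

Two integrations, two constants, two initial conditions.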
Homework #35
Exercises.
1. In this exercise, we look at the units of the equations that we got. For familiarity, I will use the English (foot-pound-
second) system rather than the metric system.
(a) If the unit of length (that is, of y) is the foot (abbreviated ft) and the unit of time is the second (abbreviated sec,
or sometimes just s), what are the units of velocity? Use the definition that v = dy/dt and remember that dy/dt
will be essentially ∆y/∆t, so the units on dy/dt will be the same as the units on ∆y divided by the units on
∆t. (Often, when you are dividing by a unit, you indicate that by the word “per.” To get gasoline mileage, you
take the distance you drive (in miles) and divide by the amount of gasoline used (in gallons), and the result is
a certain number of “miles per gallon.”) This gives the units for v.
(b) What are the units of acceleration? Use a = dv/dt. This gives the units for g, since it is the acceleration due to
gravity.
(c) Show that each of the terms of $v = -g\,t + v_0$ has the same units. (That is, the units for v, −g t, and $v_0$ are
identical. This must always happen in valid equations. If the units in an equation are not the same, you can be
quite sure that something is wrong with the equation. So, the equation $v = -g\,t^2$ could not possibly be correct.
This is a handy way of checking formulas to make sure you remember them accurately.) (Note: Constant factors
do not affect the units, so the negative sign (a constant factor of −1) and the 1/2 in the next part can be ignored.) The
units of $v_0$ are the same as the units on v, since $v_0$ is a value of v at a specific time. Also, when you multiply
two terms with units, the units multiply as well.
(d) Show that the units in each term of $y = -\frac12 g\,t^2 + v_0\,t + y_0$ are the same.
Problems.
1. Dudley wanted to play catch by himself (Fang was off plotting the demise of every squirrel in the universe, his
usual preoccupation), so he tossed a ball (more-or-less) straight up, and caught it coming down. Suppose he threw
the ball upwards with a velocity of 50 ft/sec = $v_0$. Use t = 0 at the time he threw and −g = acceleration due to gravity
= −32 ft/sec$^2$. Use $y_0$ = 4 ft (the rough height of Dudley’s hand) and assume that he caught the ball also at y = 4 ft.
(a) What is the equation that determined how high the ball was? (That is, what is the equation for y under these
conditions?) (You can use the equations from the notes, and the values given in the problem to answer this quite
easily.)
(b) When (that is, for what value of t) did Dudley catch the ball? (Hint: Solve y(t) = 4 for t. You’ll get two values.
Which one do you want?)
(c) Find the value of t when dy/dt = 0 by solving that equation. This gives the time when vertical velocity is 0.
Can you give a simple description of the place in the path of the ball where the vertical velocity is 0?
(d) What is the value of $d^2y/dt^2$ at the value of t from the previous part? This gives the (vertical) acceleration.
Explain how acceleration can be non-zero when the velocity is 0. After all, acceleration is the derivative of
velocity and the derivative of 0 is 0.
2. We indicated in the notes that we ignored air resistance in the equations we got. There was a reason for that. In this
problem, we treat vertical free-fall motion with air resistance. This will require going to Maple to solve the equations
that will appear. One standard assumption is to make air resistance proportional to velocity, giving an extra term of
k v in the acceleration. (Here k is a constant determined by the shape of the object and the resistance of the air. That is, the
larger the value of k, the more the effect of air resistance enters the equation. For k = 0, there is no air resistance at
all.) The equation is then $dv/dt = -g - k\,v$. The initial condition is $v(0) = v_0$. (We are only solving for v, so the y(0)
won’t be needed.)
(a) Use Maple to find v(t). This can most easily be done using the Maple command dsolve();. The format is
dsolve( {diff(v(t),t)=-g-k*v(t), v(0)=v0}, v(t) );
Make sure you keep straight the parentheses () and the curly brackets {}.
(b) What happens as t → ∞ in this equation? (Physically, that means that you are falling for a very long time.) Find
out using Maple by typing the following commands (right after the dsolve();):
assign(%);
signum(k):=1;
limit(v(t),t=infinity);
(I’ll explain the reasons for these in lab period.) This value of v is called the terminal velocity. It is the velocity
that you will approach as you fall for long times.
Mugsy: I’ve always wondered about that. Not personally, of course. I’ve just observed.
(c) What happens to terminal velocity when k gets large? (This corresponds to taking a big parachute, to increase
wind resistance.) Why do you want a big parachute when you jump out of an airplane?
(d) What happens to v as t → ∞ without air resistance? (Use the equation for v from the notes for regular free-fall
motion here. Take the limit as t → ∞. What happens to v?)
$$x(t) = v_{0x}\,t + x_0$$
$$y(t) = -\frac12 g\,t^2 + v_{0y}\,t + y_0$$
Remember that when you are working with parametric equations, you always want to phrase everything in terms of the
parameter. In this case, the values of t = time answer the question “When . . . ?”
Homework #36
Problems.
1. A boomerang doesn’t obey the equations we got for ballistic motion. Why shouldn’t it? (I am not looking for a
description of what happens when you throw a boomerang. I am looking for an explanation of why a boomerang
doesn’t follow a nice, parabolic path the way a golf ball does.)
2. This problem leads you to answer the question “How far did the golf ball go in the air?” We will take $x_0 = 0$ and
$y_0 = 0$ just to make life simpler. It really wouldn’t matter. We’ll also assume flat ground. Leave $v_{0x}$ and $v_{0y}$ as
variables that will appear in your answers. Also, don’t replace g by 32; leave it as g.
(a) The distance the ball went in the air is essentially the difference between the two places that it was on the
ground. One place it was on the ground was at the start, at $y_0 = 0$. What equation determines when the ball
is on the ground? (Hint: What value of y does “on the ground” correspond to? Also, when we are asking for
“when” we want to solve an equation for t.)
(b) Solve for t in the equation in the previous part. There should be two values of t.
(c) What are the x-values that correspond to the two values of t?
(d) How far did the ball go? (It is the difference between the two x-values in the previous part!)
3. In this problem, we answer the question “How high does the golf ball go?” We will operate using all the assumptions
and directions from the previous problem.
(a) We want to maximize height, y. How do we do that? (Hint: You will need some ideas from the Derivatives-I
and Finance chapters.)
(b) Interpret the equation in the previous part in terms of a value of the y-velocity. Does it make sense?
(c) Solve the equation in the first part for t. In other words, answer the question “When does the ball get to its
maximum height?”
(d) To get the value of the maximum height, you need to plug the value of t from the previous part into the equation
for y. Do that. What do you get for maximum height?
4. In this problem, we find and work with the non-parametric relation between x and y in ballistic motion. Part of the
reason for this problem is to show that parametric equations make life easier. Use the boxed equations for x(t) and
y(t) for this problem.
(a) Solve the equation x(t) for t. (This is so that at the next step, we can get y, the dependent variable, in terms of
x, the independent variable.)
(b) Plug that equation for t into the equation for y(t). You should get an equation for y = y(x).
(c) Answer the question “How far does the golf ball go?” using just the equation from the previous part of this
problem. Do that by solving the equation $y(x) = y_0$. (You might want to use Maple. It gets pretty ugly here.)
(d) Answer the question “How high does the golf ball get?” using just that same equation. (This time, you will
want to solve $y'(x) = 0$. Why?)
5. Divide the value of t when the golf ball lands again (from an earlier homework question) by the value of t when the
golf ball reaches maximum height. Does your answer seem reasonable physically?
Definite integrals always have limits, and never come with a +C.
However, definite integrals are evaluated using indefinite integrals (with C chosen to be 0). Here is the standard format
for doing that. Suppose we want to evaluate $\int_3^7 x\,dx$. Since $\int x\,dx = \frac12 x^2 + C$, we get the following
\begin{align}
\int_3^7 x\,dx &= \frac12 x^2\,\Big|_3^7 \tag{4.43}\\
&= \frac12 (7)^2 - \frac12 (3)^2 \tag{4.44}\\
&= 20 \tag{4.45}
\end{align}
This is what your evaluation of definite integrals should look like. Of course, if the indefinite integral takes more effort to
find, then extra work is needed at that point, but still it should have all these steps in it.
and they need to be the values of P that correspond to the values of t at t = 0 and (generic) t. The value of P when t = 0 is
P(0) = 50; that’s the initial condition. The value of P at an unspecified (generic) t is an equally unspecified (generic) P(t),
usually written just P. The whole idea of the problem, in fact, is to find P(t), so it will have to enter somewhere. Thus,
the limits are $\int_{50}^{P} dP/P$. The solution of the problem then is usually written this way. (We will use the same properties of
logarithms and exponentials we always need for this problem.)
\begin{align}
\int_{50}^{P} \frac{dP}{P} &= k \int_0^t dt \tag{4.46}\\
\ln(P)\,\Big|_{50}^{P} &= k\,t\,\Big|_0^t \tag{4.47}\\
\ln(P) - \ln(50) &= (k\,t) - (0) \tag{4.48}\\
\ln(P/50) &= k\,t \tag{4.49}\\
P/50 &= e^{k t} \tag{4.50}\\
P(t) &= 50\,e^{k t} \tag{4.51}
\end{align}
Note that you can read the initial condition off of the lower limits of the two integrals. This is the format you should follow
when solving problems in this course (and beyond).
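To see the same format on a second problem, here is a sketch that applies it to the free-fall velocity equation dv/dt = −g with $v(0) = v_0$. Reusing that earlier problem here is a choice made for illustration. The lower limits carry the initial condition ($v = v_0$ when $t = 0$), and the upper limits are the generic v and t.

```latex
\begin{align*}
\int_{v_0}^{v} dv &= \int_0^t -g\,dt \\
v\,\Big|_{v_0}^{v} &= -g\,t\,\Big|_0^t \\
v - v_0 &= (-g\,t) - (0) \\
v &= -g\,t + v_0
\end{align*}
```

This is the same answer as equation (4.35), but no constant of integration ever appeared; adding the starting value $v_0$ at the end did the job that evaluating C did before.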
One note of caution: Some people argue that it is improper to use the variable of integration (P or t above) as a limit of
the definite integral. Technically, they are correct. Obscure problems or confusions can arise from the practice, but they are
so rare, and so inconvenient to avoid, that most practicing mathematicians and engineers ignore the warning.
Homework #37
Exercises.
1. Find the following definite integrals. You will have to do some guessing (for the moment) to find the indefinite
integrals.
(a) $\int_2^4 2x\,dx$
(b) $\int_{-5}^{-1} 1\,dx$
(c) $\int_{-3}^{1} x\,dx$
There are no product, quotient, or chain rules for integrals. This is why integration is so nasty. There is one procedure
(coming from the chain rule, so it becomes the most important rule in integration) that helps some. We will tackle that
quite soon.
Standard calculus courses spend huge amounts of time on this topic. One thing that I have done in this course is to be
realistic and say that only a very few of the methods taught are really used, and I’ll present them. Plus, I’ll point you to
Maple and integral tables, which are much more likely to be used than vague memories once you get to using integration
anywhere else.
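\begin{align*}
\int u^n\,du &= \frac{1}{n+1}\,u^{n+1} + C \quad \text{for } n \neq -1 \\
\int \frac{1}{u}\,du &= \ln|u| + C \\
\int \sin u\,du &= -\cos u + C \\
\int \cos u\,du &= \sin u + C \\
\int e^u\,du &= e^u + C \\
\int \frac{1}{\sqrt{1 - u^2}}\,du &= \operatorname{Arcsin} u + C \\
\int \frac{1}{1 + u^2}\,du &= \operatorname{Arctan} u + C \\
\int \frac{1}{|u|\sqrt{u^2 - 1}}\,du &= \operatorname{Arcsec} u + C
\end{align*}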
Standard formulas.
There are a very few integrals that occur so often that they simply must be memorized.
Dudley: Augh. I can’t stand oodles of memorization!
Albert: Stay calm. If you’ll look at the table, you’ll note that there aren’t oodles of formulas there.
They are in a table nearby, together with a couple that I want there just for reference. I’ll say which are which at the end.
Note that any of these can be checked easily by verifying that the derivative of the right-hand side is the integrand on the
left. All of these but the last three (the inverse trigonometric functions) should be memorized. But if you remember your
derivatives, that’s already done.
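The “check by differentiating” idea can also be tried numerically. The following Python sketch is an illustration only: Python instead of Maple, the names `slope`, `TABLE`, and `worst_mismatch`, the sample values of u, and writing Arcsec u as acos(1/u) are all assumptions. For each formula it compares a central-difference estimate of the derivative of the right-hand side with the integrand on the left.

```python
import math


def slope(f, u, h=1e-6):
    """Central-difference estimate of the derivative of f at u."""
    return (f(u + h) - f(u - h)) / (2 * h)


# Each entry: (right-hand side without the +C, integrand, sample value of u).
TABLE = {
    "u^n with n = 3": (lambda u: u ** 4 / 4, lambda u: u ** 3, 1.3),
    "1/u": (lambda u: math.log(abs(u)), lambda u: 1 / u, -2.0),
    "sin u": (lambda u: -math.cos(u), math.sin, 0.7),
    "cos u": (math.sin, math.cos, 0.7),
    "e^u": (math.exp, math.exp, 0.5),
    "Arcsin": (math.asin, lambda u: 1 / math.sqrt(1 - u * u), 0.4),
    "Arctan": (math.atan, lambda u: 1 / (1 + u * u), 2.0),
    # Arcsec u is computed as acos(1/u), valid for |u| > 1.
    "Arcsec": (lambda u: math.acos(1 / u),
               lambda u: 1 / (abs(u) * math.sqrt(u * u - 1)), -3.0),
}


def worst_mismatch():
    """Largest gap between d/du of a right-hand side and its integrand."""
    return max(abs(slope(F, u) - f(u)) for F, f, u in TABLE.values())


print(worst_mismatch())
```

The negative sample points for the 1/u and Arcsec rows are there on purpose: they show why the absolute value bars belong in those two formulas.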
Albert: See, it’s not hard!
Dudley: That’s simple for you to say. I still struggle with algebra.
Substitution.
The chain rule, the most important rule in calculus, is a derivative formula. Operated in reverse, it becomes substitution,
the most important rule in integration.
The goal of substitution is to convert a more complicated integral into something simpler, hopefully one of the standard
formulas just given. That’s one of the reasons that the standard formulas are so important. That’s also why I wrote the
standard formulas in terms of u rather than x, since u is the most typical substitution variable. Substitution is also called
change of variables for this reason.
The chain rule says that $(f(g(x)))' = f'(g(x)) \times g'(x)$. This means that
$$\int f'(g(x))\,g'(x)\,dx = f(g(x)) + C$$
If you let u = g(x) and realize that $u'(x)\,dx = (du/dx)\,dx = du$, it says that $\int f'(u)\,du = f(u) + C$, which is the definition of the indefinite integral.
So this is nothing that new. But it does give a major clue as to the way to use this. We substitute u = the inside of some part
of the function we are trying to integrate. (Actually, we will hit situations where this is not accurate, but for now, it is
worthwhile letting that sink in.)
Mugsy: In other words, you let u = inside of the most complicated part always—sometimes.
Albert: Yes. Always for now. Usually, but not always, later.
The whole key to substitution is to locate what function to use for u. Here are a few items to look for:
• The inside of the most complicated part.
The most complicated part of $f'(g(x))\,g'(x)$ is the $f'(g(x))$ term. Its inside is g(x), so use u = g(x) for that reason.
(Note that you shouldn’t let $u = f'(g(x))$, that is, the whole thing.) This is the method that works the most often on
questions that have been cooked up (like on homework or tests).
Albert: HINT!
• Any function whose derivative appears as a factor in the integrand.
The integrand is $f'(g(x))\,g'(x)$, and the derivative of g(x) (on the inside of $f'(g(x))$) is $g'(x)$, which appears as a
factor in the integrand. Use u = g(x) for that reason. This works only when you can integrate part of the
integrand, and can express the rest of the integrand in terms of the integral of that part.
Dudley: What?
Albert: Look at it this way. In case part of the integrand—that is the function you are integrating—can be
integrated reasonably easily, you then look to see if the rest of the integrand can be expressed easily as a
composition with the inside being the part you can integrate.
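Here is a small example of spotting u by both criteria. The integrand $2x\cos(x^2)$ is made up for illustration and does not come from the notes. The inside of the most complicated part, $\cos(x^2)$, is $x^2$, and its derivative $2x$ appears as a factor, so both items point to $u = x^2$.

```latex
\begin{align*}
\int \cos(x^2)\,(2x)\,dx
  &= \int f'(g(x))\,g'(x)\,dx
  && \text{with } f(u) = \sin u,\; g(x) = x^2 \\
  &= f(g(x)) + C \\
  &= \sin(x^2) + C
\end{align*}
```

Check: by the chain rule, the derivative of $\sin(x^2)$ is $\cos(x^2) \times 2x$, which is the integrand.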
Indefinite integrals. Now we begin the process. When confronted by an integral that is not exactly like one of the
standard formulas, here is a rough procedure to follow.
First, convert roots and divisions to exponents. This is the same as you did in differentiation. Calculus works easier
with exponents.
See if you can locate a good substitution, using the suggestions I just gave. How do you tell if a substitution is good?
That comes next. Let’s work an example. Consider
$$\int \frac{2x}{x^2 + 1}\,dx$$
The integral looks similar to, but not exactly the same as, the Arctan x standard formula. The 2 x on top ruins it. (If it were
just 2 on top, then we could pull the constant outside, and get $2\int (1/(x^2 + 1))\,dx$, which would be 2 Arctan x + C. But the x
is not a constant, so it can’t be pulled out in front.) So, we convert the integral to exponents
$$\int (2x)\,(x^2 + 1)^{-1}\,dx$$
At this point, we look for a substitution, and $u = x^2 + 1$ suggests itself immediately by both criteria. It is the inside (not the
whole!) of the most complicated part, and its derivative is 2 x, which is a factor in the integral.
Now that we have the substitution, how do we use it? There is another major principle here.
When you are making a substitution, you have to change everything to the new variable.
This includes the differential and the limits (which we will worry about when we get to definite integrals).
How do you tell if a substitution “worked?” Check if all items connected with the old variable have vanished. If so, it
is a useful change.
So, proceed with the example. We have $u = x^2 + 1$, so we have to convert all the x’s in the integral to u’s. Always
begin with the differential. How do you convert dx to du? The chain rule, of course. That tells you how to change between
differentials of related variables. The derivative of the substitution $u = x^2 + 1$ is $du/dx = 2x$, so $du = 2x\,dx$. There are two
possible directions at this point.
• Rewrite the integral to include a term that is exactly 2 x dx and then put du in for it. Or
CHAPTER 4. INTEGRATION EXPLAINED 224
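• Solve $du = 2x\,dx$ for $dx$ and put the result in for $dx$ in the integral:
$u = x^2 + 1$, so $du = 2x\,dx$, so $dx = \dfrac{du}{2x}$. Then
\begin{align}
\int (2x)\,(x^2+1)^{-1}\,dx &= \int (u)^{-1}\,(2x)\,\frac{du}{2x} \tag{4.56}\\
&= \int u^{-1}\,du \tag{4.57}\\
&= \ln|u| + C \tag{4.58}\\
&= \ln|x^2+1| + C \tag{4.59}
\end{align}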
Even though it looks longer, I recommend the second approach, reserving the first method for those situations when you
recognize what will happen.
One more point about this integral needs to be commented on. Since the original indefinite integral here was given as a
function of x, the answer should also be given as a function of x. So, whenever you substitute in order to solve an indefinite
integral, you will have to undo the substitution before you are finished. This is usually (but not always) quite simple.
As with any indefinite integral, we can check the result by differentiating it:
\begin{align}
\frac{d}{dx}\left(\ln|x^2+1| + C\right) &= \frac{1}{|x^2+1|} \times \frac{|x^2+1|}{x^2+1} \times (2x) \tag{4.60}\\
&= \frac{2x}{x^2+1} \tag{4.61}
\end{align}
This is the function that we integrated, so it checks.
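The same check can be run numerically. What follows is a minimal sketch in Python (not Maple), using only the standard library; the function names and the sample points are my own choices for illustration. It compares a difference-quotient estimate of the derivative of $\ln|x^2+1|$ with the original integrand $2x/(x^2+1)$.

```python
import math

def antiderivative(x):
    """The answer found by substitution (taking C = 0): ln|x^2 + 1|."""
    return math.log(abs(x * x + 1))

def integrand(x):
    """The function we started with: 2x / (x^2 + 1)."""
    return 2 * x / (x * x + 1)

def central_difference(f, x, h=1e-6):
    """Approximate f'(x) by the symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)

def largest_mismatch(points):
    """Largest gap between the numerical derivative and the integrand."""
    return max(abs(central_difference(antiderivative, x) - integrand(x))
               for x in points)

if __name__ == "__main__":
    # Hypothetical sample points, chosen only to cover both signs of x.
    print(largest_mismatch([-3.0, -1.0, 0.0, 0.5, 2.0, 10.0]))
```

A tiny mismatch (round-off size) is what "it checks" looks like numerically. This is a sanity check, not a proof; the differentiation above is the real argument.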
It is worth checking your indefinite integrals until you are convinced you are getting them. Over-reliance on Maple here
can be bad for your ability to work such problems on a test! A large number of homework problems follow, because you
simply have to get used to this technique, and there is no way to do that but to practice.
Finally, the part you have been waiting for—how to use Maple to do this. There is a command for Maple to change
variables in an integral, but to use it is a bit roundabout. You first have to tell Maple not to evaluate the integral you have so
that you can tell it to change the variable. That is done by using what is called the inert form of int();, which is written
Int();. That is, when you type in
Int(2*x/(x^2+1),x);
Maple simply comes back with the integral unevaluated. Now you can change variables. To get to the substitution routines
for integrals, you need to load the student package first, by typing
with(student);.
An example of how this works is given here:
> with(student):
> Int(2*x/(x^2+1), x);
> changevar(u=x^2+1, %, u);
> value(%);
Homework #38
Exercises.
In the questions in this investigation, we look at the standard (trig) substitutions. They are the method used to integrate the
last three standard formula (inverse trigonometric function) integrals. This is a major topic in standard calculus courses,
and is typically poorly understood there. I want to look at them only briefly. Maple does a good job on these, so instead of
confusing the daylights out of you, I’ll allow you to use Maple. I will not ask for these on any test.
Mugsy: Joy, joy.
\[ \int \frac{dx}{1+x^2} = \operatorname{Arctan} x + C \]
We show where this comes from, too. The methods of this question closely resemble the methods of the previous
question, so look back there if you get stuck.
(a) Substitute $x = \tan\theta$ in the integral, use a different trigonometric identity, and show that this integral also reduces
to $\int d\theta = \theta + C$, but we again need to get back to the original variable, x. For that we need to solve the
substitution $x = \tan\theta$ for $\theta$. Plug that into $dx$ and $1 + x^2$ and put that all into the integral and use a trig identity.
Then solve $x = \tan\theta$ for $\theta$ and finish the integral.
(b) We can use the same idea (again) any time we have an integral containing the form $a^2 + u^2$, and not just $1 + x^2$.
The substitution in that case is $u = a\tan\theta$. Use the method from the first part of this investigation with this
substitution to integrate
\[ \int \frac{du}{a^2+u^2} \]
This time, you will get $\int d\theta$ together with extra, constant factors. But those are easy to deal with, since they
pull outside the integral sign.
3. We continue with even more trigonometric substitutions. We also know that
\[ \int \frac{1}{|u|\sqrt{u^2-1}}\,du = \operatorname{Arcsec} u + C \]
This is the last of the trio of inverse trigonometric integrals I just gave. And again, look back to the first investigation
of this section for more detailed reasoning for the steps.
(a) Substitute $x = \sec\theta$ in the integral, use a trigonometric identity, and show that this integral also reduces to $\int d\theta$.
(You can assume that the absolute values work out. Basically, ignore them. They are a real pain to go through
in detail.) The integral is then $\theta + C$, but we now need to get back to the original variable, x. For that we need
to solve the substitution $x = \sec\theta$ for $\theta$.
(b) We can use a similar substitution any time we have an integral containing the form $u^2 - a^2$, and not just $x^2 - 1$.
The substitution in that case is $u = a\sec\theta$. Use the method from the first part of this investigation with this
substitution to integrate
\[ \int \frac{1}{|u|\sqrt{u^2-a^2}}\,du \]
(The comment from the previous investigation problem about constant factors applies here, too.)
4. The past three questions provide a framework to integrate a number of different things, but one more item is needed to
tie them all together. If you are confronted with a quadratic that contains only $x^2$ terms and constants, you can use the
methods from the previous problems. On the other hand, a quadratic can also contain a linear term (such as $x^2 - x + 1$,
where the $-x$ is the linear term). This creates serious problems. For example, if you have $\int dx/\sqrt{x^2 - x}$, you can’t
just assume that $x$ is $a^2$, a constant, since it is definitely a variable. There is a procedure for dealing with linear terms
in quadratics that is so common you should have encountered it before. It is called completing the square. We will
go over it in the lab period, if necessary. Maple also has a completesquare(); command (accessible after typing
with(student): the same way that changevar(); is). The result of completing the square in a quadratic is a form
either $a^2 - u^2$, $a^2 + u^2$, or $u^2 - a^2$, where a is a constant and the u contains the variable (usually x). (The fourth option,
$-a^2 - u^2$, can always be treated as $-(a^2 + u^2)$ wherever it occurs.) The previous three problems then show you how to
integrate these. The table at the top of the page summarizes what you do. For example, with $2x^2 - 8x + 5$, completing
the square gives $2(x - 2)^2 - 3$. Comparing that to $u^2 - a^2$ means that you would use $u^2 = 2(x - 2)^2$ and $a^2 = 3$, or
$u = \sqrt{2}\,(x - 2)$ and $a = \sqrt{3}$. And the form $u^2 - a^2$ would mean that you would want the substitution $u = a\sec\theta$,
which in this case would become $\sqrt{2}\,(x - 2) = \sqrt{3}\sec\theta$. That substitution will magically cause all the right things to
happen in the integral. In particular, you would get that $u^2 - a^2 = 2x^2 - 8x + 5$ would equal $a^2\tan^2\theta = 3\tan^2\theta$. We
now want to look at integrating a few functions using the table for trigonometric substitutions.
$x^2 + 6x + 13$
Follow the directions for the previous investigation problem to find this integral.
Definite integrals. Once we have exact procedures for indefinite integrals, getting exact procedures for definite integrals
is easy.
The procedure. Exactly the same process is used for definite integrals as for indefinite ones. The only difference is
that we need to learn what to do with the limits. Again, there are two approaches:
• Integrate the function, convert back to the original variables, and use the original limits. This is the method to use if
you already know the indefinite integral.
• Change the limits when you change the variables, and use the new limits with the new variables. This is the method
to use if you don’t already know the indefinite integral.
Note that you don’t have the option of using the new limits with the old variables. Some people always try that . . . .
How do you find the new limits? It is identical to what we had when integrating a differential equation: Make new
limits correspond to old limits. An example will help; consider
\[ \int_1^2 \frac{2x}{x^2+1}\,dx \]
The first approach is the easiest here, because we already know the indefinite integral; it is $\ln|x^2+1| + C$. The definite
integral then has the value
\begin{align}
\int_1^2 \frac{2x}{x^2+1}\,dx &= \ln|x^2+1|\,\Big|_1^2 \tag{4.62}\\
&= \ln|2^2+1| - \ln|1^2+1| \tag{4.63}\\
&= \ln(5) - \ln(2) \tag{4.64}\\
&= \ln(5/2) \tag{4.65}
\end{align}
But what if we didn’t already know the indefinite integral? We’d have to work it out. Let me redo this problem, not
assuming that we have worked out the indefinite integral. The substitution (determined earlier) is $u = x^2 + 1$. I will use the
second method for indefinite integrals:
$u = x^2 + 1$, so $du = 2x\,dx$, so $dx = \dfrac{du}{2x}$. Then
\begin{align}
\int_1^2 \frac{2x}{x^2+1}\,dx &= \int_2^5 \frac{2x}{u}\,\frac{du}{2x} \tag{4.66}\\
&= \int_2^5 \frac{du}{u} \tag{4.67}\\
&= \ln|u|\,\Big|_2^5 \tag{4.68}\\
&= \ln(5) - \ln(2) \tag{4.69}\\
&= \ln(5/2) \tag{4.70}
\end{align}
The big question is, where did the limits 2 and 5 come from in the du integral? Remember that when you want to make
two definite integrals with different variables equal, you must make the limits of the integrals correspond. That means you
put the values of u in the limits of the du integral that correspond to the limits of x = 1 and x = 2 in the dx integral. How
do you find the correspondence? It is given by the substitution equation, $u = x^2 + 1$. Then x = 1 (the lower limit from the
dx integral) corresponds to $u = (1)^2 + 1 = 2$ (the lower limit to use on the du integral), and x = 2 (the upper limit from the
dx integral) corresponds to $u = (2)^2 + 1 = 5$ (the upper limit to use on the du integral). It actually is not difficult. Much
like the chain rule (which is woven through all of this), you only need to remember to do it. But this also merits a boxed
reminder.
Mugsy: I don’t have anything to say here. But I decided to break up the monotony.
When changing variables in a definite integral, you should also change the limits. You do this by plugging the
old limits one at a time into the equation for the substitution, and getting the new limits out.
Again, the procedure changevar(); in Maple also works for definite integrals, as the following shows.
> with(student):
> Int( 2*x/(x^2+1), x = 1 .. 2 );
\[ \int_1^2 \frac{2x}{x^2+1}\,dx \]
> changevar(u=x^2+1,%,u);
\[ \int_2^5 \frac{1}{u}\,du \]
> value(%);
\[ -\ln(2) + \ln(5) \]
Note that the changevar(); routine changes the limits as it goes along. This is what you should do as well!
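If you want to see the numbers agree without Maple, here is a minimal sketch in Python (standard library only). The helper midpoint_rule and the number of rectangles are my own assumptions for illustration; any reasonable numerical integrator would do. It evaluates the dx integral with the old limits, the du integral with the new limits, and the forbidden mix of old variable with new limits.

```python
import math

def midpoint_rule(f, a, b, n=10_000):
    """Approximate the integral of f from a to b with n midpoint rectangles."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

# Original integral: x runs from 1 to 2.
x_integral = midpoint_rule(lambda x: 2 * x / (x * x + 1), 1.0, 2.0)

# After u = x^2 + 1 the limits become u = 2 and u = 5.
u_integral = midpoint_rule(lambda u: 1 / u, 2.0, 5.0)

# The mistake warned about above: old variable, new limits.
mismatched = midpoint_rule(lambda x: 2 * x / (x * x + 1), 2.0, 5.0)

exact = math.log(5 / 2)

if __name__ == "__main__":
    print(x_integral, u_integral, exact, mismatched)
```

The first two values should agree with $\ln(5/2)$ to many digits, while the mismatched one comes out near $\ln(26/5)$, a different number entirely.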
Why the procedure works. What I have just said ought to make sense (I do try). But there is a more fundamental
reason why the substitution procedure in definite integrals works: Both integrals add up the same numbers! This is hardly
obvious, but it is true.
There are two changes that go on in a substitution in a definite integral: You must change the differential (dx needs
to become du, and does so by the formula du = (du/dx) × dx), and you must change the limits (by the procedure given
earlier). These two changes work together to guarantee that the numbers going into the integral’s summation are equal.
To see this, it is easiest to work backwards, from $\int_2^5 (1/u)\,du$ to $\int_1^2 2x/(x^2+1)\,dx$, and show that everything matches up.
The areas of the differential-width slivers in the du integral are (base) × (height), with the du being the (base) and the
1/u being the (height). We won’t have du equaling dx, since the differentials of different variables must be related through
the chain rule. For that, $du = (du/dx) \times dx = (2x)\,dx$. So that explains the $2x$ in the dx integral. It is a factor needed to
multiply the width of the dx slivers to equal the value of the width of the du slivers.
The height of the du slivers is 1/u, and this “obviously” equals $1/(x^2 + 1)$, since $u = x^2 + 1$. At least it will as long as
the values plugged in for u are equal to the values of $x^2 + 1$. Note that we don’t want u to equal x, because then 1/u wouldn’t
equal $1/(x^2 + 1)$, and the heights would be off. So, how do we make the values plugged in for u equal to the values plugged
in for $x^2 + 1$? We adjust the limits of the integral so that this happens! The value u = 2 is just what you’d get from plugging
x = 1 into $u = x^2 + 1$, and the value u = 5 is just what you’d get by plugging x = 2 into $u = x^2 + 1$. So, as x goes from 1
to 2, we want the values of u to go from 2 to 5. Then the values for 1/u and $1/(x^2 + 1)$ will match sliver-for-sliver, and the
heights will be equal.
If you don’t get this explanation of why the procedure works, don’t worry too much. You should, however, know what
the procedure for changing limits is and use it when the integral calls for it.
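The sliver-for-sliver matching can be watched directly. The sketch below is in Python (standard library only) and is just one way to set it up: the function name sliver_sums and the choice of 1000 slivers are assumptions of mine. Each dx sliver on [1, 2] is paired with the du sliver whose edges are the images of its edges under $u = x^2 + 1$, and both heights are taken at the same midpoint in x.

```python
import math

def sliver_sums(n=1000):
    """Return (total of the dx slivers, total of the matching du slivers)."""
    dx = (2.0 - 1.0) / n
    dx_total = 0.0
    du_total = 0.0
    for i in range(n):
        x_left = 1.0 + i * dx
        x_right = x_left + dx
        x_mid = (x_left + x_right) / 2
        # dx sliver: height 2x/(x^2 + 1), width dx.
        dx_total += (2 * x_mid / (x_mid**2 + 1)) * dx
        # Matching du sliver: its edges are the images of the x edges,
        # so its width equals 2 * x_mid * dx (up to rounding).
        du = (x_right**2 + 1) - (x_left**2 + 1)
        # Height 1/u, with u taken at the same place, u = x_mid^2 + 1.
        du_total += (1 / (x_mid**2 + 1)) * du
    return dx_total, du_total

if __name__ == "__main__":
    print(sliver_sums(), math.log(5 / 2))
```

The two totals agree to round-off, not just approximately, because each pair of slivers has the same area: the extra factor of 2x in the dx height is exactly the stretch in width from dx to du.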
Homework #39
Exercises.
1. Find the following definite integrals. Change the limits when you use substitution to evaluate the integral on the ones
for which you don’t know the indefinite integral. (Some of the indefinite integrals showed up in the last homework
exercises.)
(a) $\int_0^2 \left(3x^3 - 5x^2 + 5x - 3\right) dx$
(b) $\int_{\pi/16}^{\pi/4} \sin(8x)\,dx$
(c) $\int_{-1}^{1} \sqrt{3t + 8}\;dt$
(d) $\int_0^1 \dfrac{\sin r}{\cos r}\,dr$
(e) $\int_1^2 \dfrac{1}{1 + x}\,dx$
2. Find the following definite integrals. Change the limits when you use substitution to evaluate the integral on the ones
for which you don’t know the indefinite integral. (Some of the indefinite integrals showed up in the last homework
exercises.)
(a) $\int_0^1 \left(3x^3 - 4x^2 - 3x + 6\right) dx$
(b) $\int_{\pi/6}^{\pi/2} \cos(3x)\,dx$
(c) $\int_1^2 \sqrt{24t - 7}\;dt$
(d) $\int_0^1 \dfrac{\sin\sqrt{s}}{\sqrt{s}}\,ds$
(e) $\int_0^1 \dfrac{1}{1 + 2x}\,dx$
Problem.
1. This problem investigates a way to evaluate some definite integrals very rapidly.
(a) Give an argument showing that $\int_a^a f(x)\,dx = 0$ for any function f(x) and any value a.
(b) Find the substitution and new limits in $\int_{-1}^{1} x/\sqrt{x^2 + 1}\;dx$.
(c) What is the value of the integral in the preceding part of this problem? (Hint: Look at the first part of this
problem.)
(d) Try that same substitution on the integral $\int_{-1}^{1} 1/\sqrt{x^2 + 1}\;dx$. What indicates that this integral might not be zero?
(In fact, it has a value roughly equal to 1.76.)
Partial fractions.
This is a method of integrating any rational function (a fancy name for the quotient of two polynomials). It works in theory
always, often in practice. There is a specific method to use.
Conversion to a proper form. When applying partial fractions, the first thing to do is to convert the integrand to
[polynomial] + [proper rational function] using polynomial division. A proper rational function is one where the degree of the
top is less than the degree of the bottom. If you don’t have this, the methods of partial fractions will fail. They always
assume a proper rational function.
Factor the denominator. All the factors should either be linear or quadratic, with the quadratic having only imaginary
roots. All polynomials (with real coefficients) can be factored this far in theory. This is the step where practice might not
agree with theory. You can tell whether a quadratic has imaginary roots by completing the square on it. If it has the form
$u^2 + a^2$, then it has imaginary roots. If it has the form $u^2 - a^2$, then it will factor into $(u - a)(u + a)$. (If this makes you
think that trigonometric substitutions fit in here, you are precisely correct.)
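The completing-the-square test is mechanical enough to write down. Here is a small sketch in Python (standard library only); the function names are mine, and the sketch assumes a positive leading coefficient. It rewrites $ax^2 + bx + c$ as $a(x - h)^2 + k$ and then reads off which form you are in from the sign of k.

```python
from fractions import Fraction

def complete_square(a, b, c):
    """Rewrite a*x^2 + b*x + c as a*(x - h)^2 + k and return (h, k)."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    h = -b / (2 * a)
    k = c - b * b / (4 * a)
    return h, k

def classify_quadratic(a, b, c):
    """Say which form the completed square takes (assumes a > 0)."""
    if a <= 0:
        raise ValueError("this sketch assumes a positive leading coefficient")
    _, k = complete_square(a, b, c)
    if k > 0:
        return "u^2 + a^2: imaginary roots, keep the quadratic"
    if k < 0:
        return "u^2 - a^2: factors as (u - a)(u + a)"
    return "u^2: a repeated linear factor"

if __name__ == "__main__":
    print(complete_square(2, -8, 5), classify_quadratic(2, -8, 5))
    print(complete_square(1, 6, 13), classify_quadratic(1, 6, 13))
```

For $2x^2 - 8x + 5$ this gives h = 2 and k = −3, that is, $2(x - 2)^2 - 3$, which is the $u^2 - a^2$ form. Keep in mind that the a in $(u - a)(u + a)$ may well be irrational ($\sqrt{3}$ in that case), which is one place where practice gets messier than theory.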
Set up the correct partial fractions form. Getting the right form takes a bit of practice, but is actually quite simple once
you get the hang of it. The basic idea is that each factor in the denominator generates one or more terms in the partial
fractions form. The number of terms it generates equals its exponent, and the denominators of those terms are the original factor
(without its exponent) raised to a succession of exponents, starting at one and working up to the exponent on the factor.
That is, the denominator of the last term equals exactly the original factor. That determines the denominators of the partial fractions form. The
numerators are much easier. The factors will either be linear or quadratic. Linear factors get constants on top in the form,
and quadratics get linear terms on top in the form. An example will help:
\[ \frac{(?)}{x^2\,(x-1)^3\,(x^2+4)^3\,(x^2+9)} = \frac{A}{x} + \frac{B}{x^2} + \frac{C}{x-1} + \frac{D}{(x-1)^2} + \frac{E}{(x-1)^3} + \frac{Fx+G}{x^2+4} + \frac{Hx+I}{(x^2+4)^2} + \frac{Jx+K}{(x^2+4)^3} + \frac{Lx+M}{x^2+9} \]
The $x^2$ in the denominator gave the first 2 terms in the expansion. The $(x - 1)^3$ gave the next 3 terms. The $(x^2 + 4)^3$ gave
the next 3 terms. And the $(x^2 + 9) = (x^2 + 9)^1$ gave the last term. Note that the linear terms to powers ($x^2$ and $(x - 1)^3$)
have constants on top, while the quadratic terms to powers ($(x^2 + 4)^3$ and $(x^2 + 9)$) have linear terms on top. Note that the
numerator (top) polynomial of the original function has nothing to do with the form as it is set up. It controls the values of
the coefficients A through M, but not the set up.
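Since the set-up is pure bookkeeping, it can be sketched as a few lines of code. This is an illustrative Python sketch (standard library only), not anything Maple does; the function name and the (base, kind, exponent) way of describing factors are my own inventions. It produces the form only, never the values of the coefficients.

```python
from string import ascii_uppercase

def partial_fraction_form(factors):
    """Build the set-up (not the coefficients) of a partial fractions form.

    factors is a list of (base, kind, exponent) tuples: base is the factor
    written without its exponent, kind is "linear" or "quadratic", and
    exponent is the power on that factor in the denominator.
    """
    letters = iter(ascii_uppercase)
    terms = []
    for base, kind, exponent in factors:
        # One term for each power from 1 up to the exponent on the factor.
        for power in range(1, exponent + 1):
            if kind == "linear":
                numerator = next(letters)
            elif kind == "quadratic":
                numerator = f"{next(letters)} x + {next(letters)}"
            else:
                raise ValueError(f"unknown kind: {kind}")
            denominator = f"({base})" if power == 1 else f"({base})^{power}"
            terms.append(f"({numerator})/{denominator}")
    return " + ".join(terms)

example = partial_fraction_form([
    ("x", "linear", 2),
    ("x - 1", "linear", 3),
    ("x^2 + 4", "quadratic", 3),
    ("x^2 + 9", "quadratic", 1),
])

if __name__ == "__main__":
    print(example)
```

Run on the denominator from the example, it uses up exactly the letters A through M, with constants over the linear factors and linear terms over the quadratic ones.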
Solve for the coefficients in the partial fractions form. This is usually an algebraic nightmare. Fortunately, Maple will
do all of that for us. In fact, Maple has a function that does the conversion to partial fractions form directly.
The Maple command to change R(x), a rational function, to its partial fractions form is convert(R(x),parfrac,x);.
You must tell Maple what function to convert (R(x)), what conversion to perform (parfrac is the partial fractions
indicator), and what the variable is (x, since there could be other variables around).
An example is
> y := (x^2+2*x+3)/(x^2*(x^2+4)^2);
\[ y := \frac{x^2 + 2x + 3}{x^2\,(x^2 + 4)^2} \]
It is possible to put the function straight into the convert(); statement, but I would encourage you not to do that. By
first defining y, and letting Maple print it out, you can look at the function and guarantee that you have the right function.
It is very easy to mistype these functions, particularly by not putting parentheses around the bottom function.
It should be noted that the numbers in this Maple example are quite tame compared to what you can get in a partial
fractions expansion.
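Whatever tool produces an expansion, it can be checked by plugging in numbers. The sketch below is in Python (standard library only). The coefficients in expansion() were worked out by hand for the function y above and are offered as an illustration; they are not a transcript of Maple output. Exact fractions are used so that "equal" really means equal.

```python
from fractions import Fraction

def original(x):
    """y = (x^2 + 2x + 3) / (x^2 (x^2 + 4)^2), evaluated exactly."""
    x = Fraction(x)
    return (x**2 + 2 * x + 3) / (x**2 * (x**2 + 4) ** 2)

def expansion(x):
    """A hand-computed partial fractions form of y (illustration only)."""
    x = Fraction(x)
    return (Fraction(1, 8) / x
            + Fraction(3, 16) / x**2
            + (-Fraction(1, 8) * x - Fraction(3, 16)) / (x**2 + 4)
            + (-Fraction(1, 2) * x + Fraction(1, 4)) / (x**2 + 4) ** 2)

def forms_agree(points):
    """True when both expressions give identical exact values at every point."""
    return all(original(x) == expansion(x) for x in points)

if __name__ == "__main__":
    # Hypothetical test points; x = 0 is excluded because y is undefined there.
    print(forms_agree([1, 2, -1, 3, Fraction(1, 2), -5, 7]))
```

Both sides are rational functions with the same denominator, so agreement at more points than the degree of the numerators is enough to conclude the two forms are the same function.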
There is one nice application that fits in here. Several times, we have looked at the population growth model, $dP/dt = kP$.
This leads to exponential growth: $P(t) = P_0 e^{kt}$. This equation works quite well, but for limited times only. Suppose,
for example, that we take the example we had before, where P(t) = number of rabbits at time t (in months), k = 0.1, and
$P_0 = 50$. Then at t = 1200 (100 years), we’d have $P(t) = 50\,e^{120} \approx 6.5 \times 10^{53}$ rabbits, weighing substantially more than the entire
solar system. Clearly, something is wrong with the equation. Shortage of food, overcrowding, lack of privacy, etc., cause a
drop in the growth rate. How do we incorporate this into the equations? The usual way is to add another factor, and get the
logistic equation
\[ \frac{dP}{dt} = kP\left(1 - \frac{P}{M}\right) \]
where M is the maximum stable population of rabbits. Then, as P gets close to M, dP/dt (the growth rate) drops, causing
slower growth. If for some reason, P should ever get larger than M, then the growth rate would become negative, and the
population would then decrease back to M. So, let’s solve this differential equation. First separate variables to get
\[ k\,dt = \frac{dP}{P\,(1 - P/M)} = \frac{M}{P\,(M - P)}\,dP \]
We want to integrate both sides, but partial fractions are needed on the last integral. Using partial fractions, you get
\[ \frac{M}{P\,(M - P)} = \frac{1}{P} + \frac{1}{M - P} \]
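The integration itself goes quickly once the fraction is split. The lines below are a sketch of that step, assuming $0 < P < M$ (so that the absolute values inside the logarithms can be dropped) and writing C for the constant of integration; the last line is what you get after using $P = P_0$ at $t = 0$ to find the constant.

```latex
\begin{aligned}
\int k\,dt &= \int \left(\frac{1}{P} + \frac{1}{M-P}\right) dP \\
k\,t + C &= \ln P - \ln(M-P) = \ln\frac{P}{M-P} \\
\frac{P}{M-P} &= e^{C}\,e^{k t} \\
\frac{P_0}{M-P_0} &= e^{C} \qquad \text{(set } t = 0,\ P = P_0\text{)} \\
\frac{P}{M-P} &= \frac{P_0}{M-P_0}\,e^{k t}
\end{aligned}
```

Note the minus sign in the second line: the integral of $1/(M - P)$ is $-\ln(M - P)$, because the derivative of the inside is $-1$.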
This is the solution, but it is not in convenient form. What follows is a brief summary of the algebra needed to convert it.
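\begin{align}
P &= \frac{P_0}{M - P_0}\,(M - P)\,e^{kt} \tag{4.76}\\
P &= M\,\frac{P_0}{M - P_0}\,e^{kt} - P\,\frac{P_0}{M - P_0}\,e^{kt} \tag{4.77}\\
P + P\,\frac{P_0}{M - P_0}\,e^{kt} &= M\,\frac{P_0}{M - P_0}\,e^{kt} \tag{4.78}\\
P\left(1 + \frac{P_0}{M - P_0}\,e^{kt}\right) &= M\,\frac{P_0}{M - P_0}\,e^{kt} \tag{4.79}\\
P &= \frac{M\left(\frac{P_0}{M - P_0}\right)e^{kt}}{1 + \frac{P_0}{M - P_0}\,e^{kt}} \tag{4.80}
\end{align}
Dividing top and bottom by the last term on the bottom gives a nicer form:
\[ P(t) = \frac{M}{1 + \frac{M - P_0}{P_0}\,e^{-kt}} \tag{4.81} \]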
This is one convenient form of the logistic equation. A last bit of adjusting is common here. Taking
\[ W = \frac{M - P_0}{P_0} \quad\text{and}\quad t_0 = \frac{\ln W}{k} \]
gives
\[ P(t) = \frac{M}{1 + e^{-k(t - t_0)}} \tag{4.87} \]
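A quick numerical sanity check on the two forms is easy to set up. This is a sketch in Python (standard library only) with names of my own choosing. The values k = 0.1 and $P_0 = 50$ are the rabbit data from the text; the limiting population M = 1000 is a made-up number used purely for illustration.

```python
import math

def logistic_first_form(t, M, k, P0):
    """P(t) = M / (1 + ((M - P0)/P0) e^{-k t})."""
    return M / (1 + (M - P0) / P0 * math.exp(-k * t))

def logistic_second_form(t, M, k, P0):
    """P(t) = M / (1 + e^{-k (t - t0)}) with W = (M - P0)/P0, t0 = ln(W)/k."""
    W = (M - P0) / P0
    t0 = math.log(W) / k
    return M / (1 + math.exp(-k * (t - t0)))

def growth_rate_gap(t, M, k, P0, h=1e-5):
    """Gap between a numerical dP/dt and k P (1 - P/M) at time t."""
    P = logistic_first_form(t, M, k, P0)
    slope = (logistic_first_form(t + h, M, k, P0)
             - logistic_first_form(t - h, M, k, P0)) / (2 * h)
    return abs(slope - k * P * (1 - P / M))

K, P_START = 0.1, 50   # rabbit data from the text
M_LIMIT = 1000         # hypothetical limiting population, illustration only

if __name__ == "__main__":
    for t in (0, 10, 50, 100, 1200):
        print(t, logistic_first_form(t, M_LIMIT, K, P_START),
              logistic_second_form(t, M_LIMIT, K, P_START))
```

The two columns should match, the value at t = 0 should be $P_0$, and at t = 1200 the population sits at M instead of running off to astronomical numbers.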
Obviously, this is a much more complicated problem, but the answer (in the form given last) is not too bad. The differential
equation we solved is called the logistic equation, and turns out to be remarkably accurate over long periods of time. It is
handy in predicting the spread of rumors, or diseases, or technology, as well as predicting populations.
The form of the logistic equation that you use is determined by the information you are given or want, much like
deciding which form of a line to use. You will usually have the data M, k, and $P_0$, and if so, you use the first boxed equation
to get the equation. On the other hand, if you then rewrite the equation into the second form using the equations for W and
$t_0$, you can also read off the value of $t_0$, which is a useful number to have. (See the homework.)
Using the logistic equation, two men named Pearl and Reed predicted in 1920 that the population of the United States
was given by
\[ P(t) = \frac{197{,}273{,}522}{1 + e^{-0.03134\,(t - 1914.32)}} \]
This fitted the population of the U.S. for the entire range of 1790 to 1910 to an accuracy of 4% (which the constants were
chosen to do),
Albert: But that was a major success. You have to realize that on the basis of three values (the populations in 1790,
1850, and 1910) the values at thirteen census dates were approximated exceedingly accurately.
but remained that accurate all the way through the 1950 census. From 1960 on, the model has failed rather badly. Why?
Probably because the maximum stable population (given by M) has increased substantially since 1950 due to technological
progress. It is possible to fit new constants (M, k, and $t_0$) to the population figures from 1960, 1970, and 1980. (Three
equations, three unknowns.) The result is (due to Maple!)
How well does it work? If you evaluate it, $P_1(1990) = 248{,}319{,}646$, for a relative error of 0.16% versus the actual 1990
census. Not bad.
Mugsy: Hey! I like that!
The 2000 census figures are still under some debate, but seem to have settled down. The actual U.S. population
according to the 2000 census was 281,421,906. The predicted number is 268,076,627, for a percentage error of 4.7%.
Why was the error so big? There are several reasons. First, there was a major push (for political, not humanitarian,
reasons, in my opinion) to legalize a large number of illegal immigrants. The Hispanic population of this country swelled
enormously, and we have not yet seen the effects of that played out.
Another reason for such a large error is that we used data points that were too close together and tried to extrapolate
too far with them. We can remedy that by taking a different set of data points to get the constants in the logistic equation.
Using the data from 1960, 1980, and 2000 fits better. (Note that this is what Pearl and Reed did as well. They chose 1790,
1850, and 1910 as the years for fitting the data, picking the years that were as far apart as possible.) If you grind through
this (again, with Maple’s help!) using 1960, 1980, and 2000, you get this equation.
The error in 1970 is 0.67% and the error in 1990 is 1.75%. Much better. However, the interpretation of that equation is a
bit strange. See the homework.
Homework #40
Exercises.
1. Find the partial fractions expansion of the following functions. (The use of Maple is virtually mandatory—unless
you are seriously masochistic. I made no attempt to end up with nice coefficients here.)
(a) $\dfrac{1}{(x + 2)(x - 3)(x + 4)(x - 5)}$
(b) $\dfrac{4x + 11}{(x - 3)(x^2 + 9)}$
(c) $\dfrac{4x + 11}{(x - 3)^2 (x^2 + 9)}$
(d) $\dfrac{4x + 11}{(x - 3)(x^2 + 9)^2}$
(e) $\dfrac{4x + 11}{(x - 3)^2 (x^2 + 9)^2}$
2. Find the partial fractions expansion of the following functions. Again, Maple is needed.
(a) $\dfrac{1}{(x + 1)(x - 2)(x - 3)(x + 4)}$
(b) $\dfrac{11x + 4}{(x - 3)(x^2 + 9)}$
(c) $\dfrac{11x + 4}{(x - 3)^2 (x^2 + 9)}$
(d) $\dfrac{11x + 4}{(x - 3)(x^2 + 9)^2}$
(e) $\dfrac{11x + 4}{(x - 3)^2 (x^2 + 9)^2}$
3. (a) Give the set-up for the partial fractions expansion of the following function.
\[ \frac{(?)}{x^2\,(x - 6)^4\,(x + 1)\,(x^2 + 4)\,(x^2 + 12)^2} \]
(b) What is the condition on the degree of the numerator (?) for this function to be a proper rational function?
4. For this exercise, use the rabbit data (k = 0.1, P0 = 50), together with a limiting population of M = 500 for logistic
equations. It would be useful to refer back to the notes for the derivation of the logistic equation for this problem.
(See the index or table of contents.)
5. In this exercise, we look at the logistic equations for the U.S. populations as derived in the notes for the years
1960–1970–1980 and for the years 1960–1980–2000. (See the boxed equation on pages 234 and 235.)
(a) What is the maximum U.S. population that the data suggest for these two equations?
(b) Why do you think that the maximum populations are so different?
(c) Compare both specific growth rates (the values of k) to each other and to the specific growth rate that Pearl and Reed
obtained.
6. Using the two logistic equations in the notes (1960–1970–1980, and 1960–1980–2000), predict the population of the
United States in the year 2010. Would it be reasonable to try to decide between those two models on the basis of how
close they were to the real data in 2010?
7. Decompose the following into partial fractions. Work these out by hand; don’t just set them up.
(a) $\dfrac{4x - 13}{2x^2 + x - 6}$
(b) $\dfrac{13x + 5}{3x^2 - 7x - 6}$
(c) $\dfrac{7x^2 - 29x + 24}{(2x - 1)(x - 2)^2}$
(d) $\dfrac{x^2 - 17x + 35}{(x^2 + 1)(x - 4)}$
8. Integrate the following (which should look familiar).
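(a) $\displaystyle\int \frac{4x - 13}{2x^2 + x - 6}\,dx$
(b) $\displaystyle\int \frac{13x + 5}{3x^2 - 7x - 6}\,dx$
(c) $\displaystyle\int \frac{7x^2 - 29x + 24}{(2x - 1)(x - 2)^2}\,dx$
(d) $\displaystyle\int \frac{x^2 - 17x + 35}{(x^2 + 1)(x - 4)}\,dx$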
Problems.
1. Use the second boxed form of the logistic equation for this problem.
(a) Show that $P(t_0) = M/2$.
(b) Show that $P''(t_0) = 0$. (This requires a bunch of algebra. Maple comes in very handy for this one.)
Integration by parts.
Even though the product rule for derivatives doesn’t give a product rule for integrals, it does give a rule, called integration
by parts, or often just “parts.” It is used on a different set of integrals than substitution or partial fractions.
\[ \int u\,dv = u\,v - \int v\,du \tag{4.89} \]
That is the differential form of integration by parts, and probably the easiest form in which to remember it.
Note one thing. It takes one integral ($\int u\,dv$) and gives another ($\int v\,du$).
Dudley: Didn’t substitution take one integral and give another also?
Albert: Very good. But that is where the similarity ends. The whole idea, then, is to choose u and v strategically so
that the second integral actually is easier. The objective for both substitution and parts is the same: Get an integral that is
closer to being able to be worked! The basic formulas we have are the only ones that give genuine answers, rather than
more integrals to be worked.
For example, let’s work
\[ \int x\,e^x\,dx \]
Look at the integration by parts formula. We start with an integral $\int u\,dv$, and we end with an integral $\int v\,du$. In going from
the first to the second, the u became a du. In other words, we will end up differentiating the u-part of the integral. Also, in
going from the first to the second, the dv becomes a v. In other words, we will end up integrating the v-part of the integral.
When we look then at the example,
\[ \int x\,e^x\,dx \]
we end up having to decide which part to integrate and which part to differentiate. The $e^x\,dx$ would be easy to integrate,
and the x would be simple to differentiate. So, let’s try it. Set $u = x$, and set $e^x\,dx = dv$. Then $du = dx$, and $v = e^x$. (All
right, it should be $v = e^x + C$, but we’ll see in the homework that we can ignore the +C at this point of integration by parts.)
The integration by parts formula gives
\[ \int x\,e^x\,dx = x\,e^x - \int e^x\,dx = x\,e^x - e^x + C \]
How to apply it. But you might argue that it is just as easy to differentiate $e^x$ as it is to integrate it, and certainly integrating
x dx is no difficulty either. Why use that particular choice of u and dv? We’ll get to that next.
It is convenient to establish three categories of functions:
1. logs and inverse trigonometric functions
2. polynomials and powers of x
3. exponentials, sines, and cosines
Every function we deal with has factors that fit into one or more of these categories.
There are a couple of advantages of remembering this table. First, when do you use integration by parts? In any of three
cases:
1. a product of functions from different categories
2. a single term from the first category
categories, integration by parts is the method. Since x is in the lower-numbered category (that is, category 2), use $u = x$,
and since $e^x$ is in the higher-numbered category (that is, category 3), use $dv = e^x\,dx$.
If the integrand is a single term from the first category, treat it as a product of 1 × the integrand, and treat the 1 as being
a polynomial. That makes it an integral from the first category times an integral from the second category, and u = original
function, and dv = 1 dx in the integration by parts formula. For example,
\[ \int \ln x\,dx \]
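Following that rule, the working for this example would go as sketched below. This is only the standard computation written out under the choices just described (u is the log, dv is 1 dx).

```latex
\begin{aligned}
u &= \ln x, \qquad dv = 1\,dx, \qquad du = \frac{1}{x}\,dx, \qquad v = x \\
\int \ln x\,dx &= x\ln x - \int x\cdot\frac{1}{x}\,dx \\
  &= x\ln x - \int 1\,dx \\
  &= x\ln x - x + C
\end{aligned}
```

You can check it by differentiating: the product rule on $x\ln x$ gives $\ln x + 1$, and the $-x$ takes the 1 away again.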
\[ x\,e^x - e^x \]
That’s the answer. But Maple still won’t put in the +C. Maple could have done the whole integral from the start,
though.
> int( x*exp(x), x);
\[ (-1 + x)\,e^x \]
At any point, you can type
value(%);
and Maple will then go ahead and work the integral out for you, as shown. And if you forget to use the inert form and use
int();,
Maple just grinds it out, also as shown.
Multiple integrations by parts. Sometimes, when you use integration by parts, you get another integral that is also a
candidate for integration by parts. In that case, go ahead and use it!
Mugsy: I thought things like this were outlawed with the Inquisition.
Straightforward application. For example, if we started with $\int x^2 e^x\,dx$, one integration by parts (with $u = x^2$ and
$dv = e^x\,dx$, so $du = 2x\,dx$ and $v = e^x$) would give
\[ \int x^2 e^x\,dx = x^2 e^x - \int 2x\,e^x\,dx \]
and the integral that’s left would also require an integration by parts. In fact, it is twice the integral we worked out earlier.
The net result would be that
\[ \int x^2 e^x\,dx = x^2 e^x - 2\,(x\,e^x - e^x) + C \]
A few words of caution: Be careful of negative signs! They crop up all over, and it’s easy to miss one. The same goes for
factors that come out of the integrals (such as the 2 in the example).
Dudley: Multiple minus signs! AUGH!
Also, avoid circular work, where everything cancels. An example of that phenomenon is given in the homework. The
way to avoid the circular work trap is always to use these guidelines about what u and v should be. Or in the most general
situation, be careful never to let u in one integration by parts be the result of integrating the dv of the previous step. Doing
so reverses the progress made in the previous integration by parts step.
Mugsy: I just hate it when that happens.
There is one handy procedure for doing integration by parts multiple times very rapidly, as long as the integrand is
of the form (polynomial) × (a single category 3 function). Essentially, the procedure is a systematized formulation of
multiple integration by parts. It is not another method, but a slick way of organizing integration by parts. The procedure
is called tabular integration by parts, and works this way. You will use two columns. The polynomial is put at the top of
one column and the rest of the integrand (not including the differential) is put at the top of the other column. (It should
be a single category 3 function, which should be easy to integrate repeatedly.) Successive derivatives of the polynomial
are put on successive rows in its column, stopping when you reach a derivative of 0. Then fill out the other column with
successive integrals of that function, keeping going down rows in parallel with the derivatives until you run out of rows on
the polynomial side. From this, you can write down the integral. Take each row in the polynomial column, and multiply it
by the entry in the other column that is one row down from it. Then add up all the products with alternating signs (plus, then
minus, then plus, then minus, etc.), starting with positive. (That is, you take the first product with whatever sign it has, take
the second product and change its sign, take the third product with its sign, the fourth product with sign changed, etc.) Add
all these up. (Why do we use one row down rather than products straight across? Because we must do one more integral
than derivative in order to evaluate the integral. Why use alternating signs? Because the integration by parts formula has
this subtraction in it, and this method is exactly the same as integration by parts, just put together to aid in calculations.)
An example will help.
Mugsy: You said it!
Let’s work
\[ \int (x^3 + 4x^2 + 6x)\cos(2x)\,dx \]
The columns are headed by $x^3 + 4x^2 + 6x$ and $\cos(2x)$. The table for the integration by parts is then
Polynomial             Other function
$x^3 + 4x^2 + 6x$      $\cos(2x)$
$3x^2 + 8x + 6$        $\frac{1}{2}\sin(2x)$
$6x + 8$               $-\frac{1}{4}\cos(2x)$
$6$                    $-\frac{1}{8}\sin(2x)$
$0$                    $\frac{1}{16}\cos(2x)$
The successive products are then (remember to use “down one!”):
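Reading straight from the table, each polynomial-column entry is multiplied by the entry one row down in the other column, with signs alternating from plus. The display below is a reconstruction from the table entries, offered as a sketch; you can confirm the collected last line by differentiating it.

```latex
\begin{align*}
&+\,(x^3 + 4x^2 + 6x)\cdot\tfrac{1}{2}\sin(2x) \\
&-\,(3x^2 + 8x + 6)\cdot\left(-\tfrac{1}{4}\cos(2x)\right) \\
&+\,(6x + 8)\cdot\left(-\tfrac{1}{8}\sin(2x)\right) \\
&-\,6\cdot\tfrac{1}{16}\cos(2x)
\end{align*}
\[
\int (x^3 + 4x^2 + 6x)\cos(2x)\,dx
 = \tfrac{1}{2}(x^3 + 4x^2 + 6x)\sin(2x) + \tfrac{1}{4}(3x^2 + 8x + 6)\cos(2x)
 - \tfrac{1}{8}(6x + 8)\sin(2x) - \tfrac{3}{8}\cos(2x) + C
\]
```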
Solving for the integral. Occasionally, the goal is to make the integral that results from integration by parts match the
one you started with, and then solve for it. One example is $\int \sec^3 x\,dx$, which can be evaluated this way. First, we split the $\sec^3 x$
into $(\sec x)(\sec^2 x)$, and use a regular integration by parts. Then a trigonometric identity allows us to recover part of a $\sec^3 x$
back. Watch.
\[ \int \sec^3 x\,dx = \int \sec x\,\sec^2 x\,dx \]
Let $u = \sec x$, $dv = \sec^2 x\,dx$, so $du = \sec x \tan x\,dx$, and $v = \tan x$, so integration by parts gives (with the trigonometric identity
$\tan^2 x = \sec^2 x - 1$)
\begin{align}
\int \sec^3 x\,dx &= \sec x \tan x - \int \sec x \tan^2 x\,dx \tag{4.93}\\
&= \sec x \tan x - \int \sec x\,(\sec^2 x - 1)\,dx \tag{4.94}\\
&= \sec x \tan x - \int (\sec^3 x - \sec x)\,dx \tag{4.95}\\
&= \sec x \tan x - \int \sec^3 x\,dx + \int \sec x\,dx \tag{4.96}\\
2\int \sec^3 x\,dx &= \sec x \tan x + \ln|\sec x + \tan x| + C \tag{4.97}\\
\int \sec^3 x\,dx &= \tfrac{1}{2}\sec x \tan x + \tfrac{1}{2}\ln|\sec x + \tan x| + C \tag{4.98}
\end{align}
I snuck in the integral $\int \sec x\,dx = \ln|\sec x + \tan x| + C$, which we did in the homework (see the substitution section of this
chapter). The Maple commands that duplicate this are:
> with(student):
> Int(sec(x)^3, x);
\[ \int \sec(x)^3\,dx \]
Don’t forget the capital “I.”
> intparts(%, sec(x));
\[ \frac{\sec(x)\sin(x)}{\cos(x)} - \int \frac{\sec(x)\tan(x)\sin(x)}{\cos(x)}\,dx \]
You differentiate one factor of sec x and integrate the other two. Next, we convert everything to sines and cosines.
> expand(simplify(%));
\[ \frac{\sin(x)}{\cos(x)^2} - \int \frac{\sin(x)^2}{\cos(x)^3}\,dx \]
Our next Maple step pulls out a trick we haven’t seen before. It is a new command powsubs();. It operates much like
subs();, except that it also works right for powers. In the example here, powers of 1/cos(x) will convert to powers of
sec(x). It is defined in the student package.
> powsubs( 1/cos(x) = sec(x), % );
\[ \sec(x)^2 \sin(x) - \int \sec(x)^3 \sin(x)^2\,dx \]
Next, we solve for the integral. Again, note the capital “I”’s.
> solve( Int(sec(x)^3,x) = %, Int(sec(x)^3,x) );
\[ \sec(x)^2 \sin(x) - \int \sec(x)^3 \sin(x)^2\,dx \]
We’re almost done. Just that last pesky integral.
> value(%);
\[ \sec(x)^2 \sin(x) - \frac{1}{2}\,\frac{\sin(x)^3}{\cos(x)^2} - \frac{1}{2}\sin(x) + \frac{1}{2}\ln(\sec(x) + \tan(x)) \]
The same process will integrate $e^{ax}\cos(bx)$ or $e^{ax}\sin(bx)$, except that two integrations by parts are needed to get back to
the original integral. To avoid cancellation in your integral, I would suggest using $u = e^{ax}$ for both integrations by parts.
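As a sketch of how that plays out, here is the cosine case worked with $u = e^{ax}$ both times. It assumes $b \neq 0$, and the letter $I$ is just a label introduced here for the integral being solved for.

```latex
\begin{align*}
I = \int e^{ax}\cos(bx)\,dx
  &= \frac{1}{b}\,e^{ax}\sin(bx) - \frac{a}{b}\int e^{ax}\sin(bx)\,dx
  && (u = e^{ax},\ dv = \cos(bx)\,dx) \\
  &= \frac{1}{b}\,e^{ax}\sin(bx) + \frac{a}{b^2}\,e^{ax}\cos(bx) - \frac{a^2}{b^2}\,I
  && (u = e^{ax},\ dv = \sin(bx)\,dx) \\
\left(1 + \frac{a^2}{b^2}\right) I
  &= \frac{e^{ax}\,\bigl(b\sin(bx) + a\cos(bx)\bigr)}{b^2} \\
I &= \frac{e^{ax}\,\bigl(a\cos(bx) + b\sin(bx)\bigr)}{a^2 + b^2} + C
\end{align*}
```

The original integral reappears after the second step with a coefficient other than 1, which is what makes solving for it possible.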
“Non-obvious” substitutions.
If you have a quadratic, look to see if completing the square will put it into the form $\pm u^2 \pm a^2$. Then a trigonometric
substitution (see a long and painful investigation a few sections back) will probably help. Beyond that, you can also try
Maple, especially more recent releases. (Maple is continually refining its integration techniques, because they are used so
much!)
Homework #41
Exercises.
1. Work the following integrals. You can check your answer on Maple, but you need to be able to do these by hand, too!
(a) $\int x^2 \sin(4x)\,dx$
(b) $\int x^8 \ln(7x)\,dx$
(c) $\int (\operatorname{Arcsin} x)^2\,dx$ (A challenge!)
(d) $\int x^2 \sin(x^3)\,dx$
(e) $\int x^2 \sin(x)\,dx$
(f) $\int \sin x\; e^{\cos x}\,dx$
(g) $\int x^8 \sqrt{x^3 + 1}\,dx$ (Hint: $x^8 = (x^2)(x^3)^2$.)
3. For each of the following integrals, give the best method to integrate it “by hand,” just the first step. (“Maple” is not
an answer here!)
Mugsy: Rats.
You do not have to evaluate them. Just say how you would do them, if you had to.
(a) $\int x^4 \cos x\,dx$
(b) $\int x^5 e^{x^3}\,dx$
(c) $\int \dfrac{x^2 - 1}{x^3 - 3x}\,dx$
(d) $\int (\ln x)^6\,dx$
(e) $\int \dfrac{3x^2 - 5x + 9}{(x + 3)^5 (x^2 + 1)^4}\,dx$
4. For each of the following integrals, give the best method to integrate it “by hand,” just the first step.
(a) $\int x^4 e^x\,dx$
(b) $\int x^5 \cos(x^3)\,dx$
(c) $\int \dfrac{x + 1}{x^2 + 2x}\,dx$
(d) $\int (\ln x)^8\,dx$
(e) $\int \dfrac{5x^2 - 3x + 4}{(x - 3)^4 (x^2 + 81)^3}\,dx$
5. How do you check if an indefinite integral is correct? Use that to check that $\int x e^x\,dx = x e^x - e^x + C$ is correct.
Problems.
1. This shows you what happens if you use the wrong assignments in an integration by parts done twice. Start with
$\int x^2 e^x\,dx = x^2 e^x - 2\int x e^x\,dx$ (obtained by integration by parts with the right assignments of $u = x^2$ and $dv = e^x\,dx$),
and then use the wrong assignments for u and dv (that is, $u = e^x$ and $dv = x\,dx$) in the right-hand side integral, and
show that you end up with $\int x^2 e^x\,dx = \int x^2 e^x\,dx$.
2. Work $\int (x^3 + 4x^2 + 6x)\cos(2x)\,dx$ by the usual integration by parts formula (repeatedly), and compare the successive
terms obtained with the terms that occurred in the tabular organization in the notes.
3. In this problem, we examine an integral that earlier versions of Maple couldn’t work. The moral of this problem is
that you still need to know how to use the methods (substitution, integration by parts, etc.) even if you do have Maple
available to you. The newer versions of Maple (like the one in the computer lab) can handle this! But for this problem,
pretend Maple can’t do it. (Simulate that by using Int(); rather than int();.) Type
Int(sqrt(1+x^4)/x,x);
on Maple in the computer lab and proceed with this problem.
(a) If we recognize that $1 + x^4$ fits into the pattern $a^2 + u^2$, with $a = 1$ and $u = x^2$, we can put these into the
substitution form $u = a\tan\theta$ and get a substitution $x^2 = \tan\theta$. To tell Maple to make the substitution, we have
to load the substitution (change-of-variables) routine, by typing with(student): Then type
changevar(x^2=tan(theta),%%,theta);.
(The doubled percent signs are needed because the
with(student):
was actually immediately before, and we want to go back to the integral.) Then type
value(%);
to get Maple to finish the integral.
(b) Note that the integral is in terms of θ , not x, which is the original variable. To get back, type
subs(tan(theta)=x^2,%);
and Maple gives the answer in terms of x.
Dudley: Hey, Al. Am I supposed to get an Arctanh?
Albert: Didn’t think you’d ever see that, huh? Try typing in convert(%,ln) to see what happens when
you use logarithms. You’ll see why Arctanh is used.
(c) Differentiate (on Maple) the answer from the previous part, simplify it with normal(%); (you might also need
factor();), and check that it is equal to the original function.
Investigations.
1. In this question, we explore the formula $\int x e^x\,dx = x e^x - e^x + C$. We almost could have “guessed” that the formula
would have to be like this. (Of course, this requires inspired guessing.) We will attempt to reconstruct the integral by
finding a function whose derivative is $x e^x$.
(a) You know that the derivative of $e^x$ is $e^x$, so suppose you tried, as a first shot, $x e^x$ for the integral. Differentiate
that, and see how close to $x e^x$ you get.
(b) There is an extra term (an $e^x$) in the derivative that isn’t in the integral. (Right?) So, let’s see if we can get rid
of it. We must subtract something from the $x e^x$ (our original guess for the integral) to get rid of this extra $e^x$ in
the derivative. What would we need to subtract from the guess so that the extra $e^x$ in the derivative cancels?
(c) Do the differentiation of the guess including that extra term, and check that you do get exactly $x e^x$ as the
derivative.
(Note: Reexamine the integration by parts procedure applied to $\int x e^x\,dx$ and you’ll see that the integral on the
right-hand side does exactly what the steps here did. That’s really all there is to integration by parts. You try
something (uv) and then subtract off what you need to (in the second integral) to make things work correctly.)
2. We now investigate what happens if we decide to put in the constant of integration when finding the v term in the
integration by parts.
(a) In the integral $\int x e^x\,dx$, with $u = x$ and $dv = e^x\,dx$, let $v = e^x + 37$. Carry out the rest of the integral, watching
what happens to the 37. Show that the answer doesn’t change.
(b) Give a reason why the v term never needs to have the +C attached to it in any integration by parts use.
(c) Evaluate $\int x \operatorname{Arctan} x\,dx$, using integration by parts with $u = \operatorname{Arctan} x$ and $dv = x\,dx$. Write down the integrals
you get by taking $v = \frac{1}{2}x^2$ and $v = \frac{1}{2}x^2 + \frac{1}{2}$. Note that the integral with $v = \frac{1}{2}x^2 + \frac{1}{2}$ is easier to work because of
a really nice, and unusual, cancellation. (Note: Integrals where this happens are exceedingly rare. This is more
of a curiosity than something to keep in mind to use regularly.)
Integral tables.
There are numerous integral tables around—so many that it is difficult to give much advice. However, here is a tiny
annotated bibliography of ones you should consider. Stop by my office if you want to see any of these or learn about others.
• Most calculus texts have a table of integrals, often on the flyleaf pages inside the covers. These are minimal, and not
too useful when confronted with complicated integrals.
• CRC Standard Math Tables has a section on integrals (as well as most of the other mathematics that you’d encounter).
It is reasonably good, not comprehensive, but likely to have what you want. I’d recommend getting this if you
are planning on going into a field that uses quite a bit of math (e.g., engineering, math, physics, or statistics).
• Handbook of Mathematical Functions With Formulas, Graphs, and Mathematical Tables, by Abramowitz and Stegun.
This book is more comprehensive than the CRC one, and can be obtained quite cheaply in paperback. It also is a
major collection of topics, not just integrals. Its treatment is more thorough than CRC’s and more geared to the
person who already knows what to do, just needs to be reminded about all the details or to look up the value of some
function. You might want to consider this if you are going to do lots of math, such as graduate work in math or
engineering. It would be appropriate for someone planning on going to graduate school in mathematics.
• Table of Integrals, Series, and Products by Gradshteyn and Ryzhik. The bulk of this hefty volume is integrals of all
sorts. It comes the closest to being the comprehensive reference for integrals. It also includes summation and product
formulas. This would be useful only for the most specialized math training.
Finding the integral you want. Locating the section containing the integral you want can be a chore. Integral tables
(especially the larger ones) contain so many integrals that some organization is critical. Usually, the best procedure for
finding the correct area is to go to the last section that mentions any factor in your integral, and keep up that process
through all the factors you have. For example, if you have $\int x^2 e^x \sin x\,dx$, you would probably find that exponentials
occurred after polynomials, and sines after exponentials, so you’d turn to sines. Then look through there until you locate
exponentials times sines, and then look for polynomials times exponentials times sines.
Quadratics often only appear as $a^2 + u^2$, $a^2 - u^2$, and $u^2 - a^2$. (The other combination, $-u^2 - a^2$, is equal to $-(u^2 + a^2)$,
and is rarely encountered.) The integrals that you get are more like
\[ \int \sqrt{x^2 + 6x - 7}\,dx \]
You can’t simply ignore the 6 x, so you must do what you always do when you don’t like the linear term in a quadratic:
You complete the square. Then proceed as we did before. (See homework investigations about trigonometric substitutions.
That’s the essential basis for all of these forms.)
Mugsy: What a pain that was! You mean it is actually used?
Albert: Very definitely.
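For the integral $\int \sqrt{x^2 + 6x - 7}\,dx$ mentioned above, completing the square might go as follows. This is a sketch; the choices $u = x + 3$ and $a = 4$ are read off from the completed square.

```latex
\begin{align*}
x^2 + 6x - 7 &= (x^2 + 6x + 9) - 9 - 7 \\
             &= (x + 3)^2 - 16 \\
\int \sqrt{x^2 + 6x - 7}\,dx &= \int \sqrt{u^2 - a^2}\,du
   \qquad \text{with } u = x + 3,\ du = dx,\ a = 4
\end{align*}
```

The result has the $u^2 - a^2$ form, so that is the entry to look for in the table.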
Adding on the constant of integration. Integral tables almost always leave off the +C, the same as Maple does. You
still shouldn’t do that! You will lose credit if you do!
Don’t rely on integral tables (or Maple) for all your integration needs. You still need to be able to use substitution,
partial fractions, and/or integration by parts to be able to get to an integral in the table. I have tried to emphasize that point
in the homework. I hope you’ve picked it up.
Dudley: Well, it’s nice of you to say so. I’m assuming that this means that those sorts of things will show up on the
tests.
Albert: You got it.
Reduction formulas. Often, tables only give you a method of extending the range of variables using what is called a
reduction formula. The value of an integral is given in terms of another integral, with some variable lowered (reduced) or
occasionally raised.
A typical example is
\[ \int \sin^n(ax)\,dx = -\frac{1}{na}\,\sin^{n-1}(ax)\cos(ax) + \frac{n-1}{n}\int \sin^{n-2}(ax)\,dx \]
In this case, the exponent ($n$) is lowered to ($n - 2$), which is considered progress. You would use this formula if you wanted
to find, for example, the value of $\int \sin^{12}(3x)\,dx$. You’d use $n = 12$ and $a = 3$ in the formula on the right side. That would
leave you with an integral $\int \sin^{10}(3x)\,dx$, which you would then need to evaluate. You do that by going back to the same
formula again, but with $n = 10$ this time (and $a = 3$ still). You’d get another integral, this time $\int \sin^8(3x)\,dx$. Again, you
use the formula. You keep doing this until you get $\int \sin^0(3x)\,dx = \int dx = x + C$, which you should know already. After
that, “all” you have to do is reassemble the pieces to the final answer.
Mugsy: I’m much better at creating little pieces than reassembling them.
As you might expect, most reduction formulas come from applying integration by parts. You’ll get a chance to try one
in the homework.
Dudley: I can hardly wait.
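The repeated use of the sine reduction formula is a natural fit for a short recursive routine. The following Python sketch is only an illustration (the text itself works in Maple): the function names `sin_power_antiderivative` and `midpoint_sum` are made up for this example, the $n = 1$ case is added so that odd exponents also terminate, and the comparison interval from 0 to 1 is an arbitrary choice.

```python
import math

def sin_power_antiderivative(n, a, x):
    """Value at x of an antiderivative of sin(a*x)**n (constant taken as 0),
    built by applying the reduction formula until the exponent is 0 or 1."""
    if n == 0:
        return x                       # integral of sin^0(ax) dx = integral of dx
    if n == 1:
        return -math.cos(a * x) / a    # integral of sin(ax) dx (added for odd n)
    # First term of the reduction formula: -(1/(n a)) sin^(n-1)(ax) cos(ax)
    first_term = -math.sin(a * x) ** (n - 1) * math.cos(a * x) / (n * a)
    # Second term: ((n-1)/n) times the integral with exponent n-2
    return first_term + (n - 1) / n * sin_power_antiderivative(n - 2, a, x)

def midpoint_sum(f, lo, hi, slices):
    """Plain midpoint Riemann sum, used here only as an independent check."""
    width = (hi - lo) / slices
    return width * sum(f(lo + (i + 0.5) * width) for i in range(slices))

n, a = 12, 3
by_formula = sin_power_antiderivative(n, a, 1.0) - sin_power_antiderivative(n, a, 0.0)
by_slicing = midpoint_sum(lambda x: math.sin(a * x) ** n, 0.0, 1.0, 2000)
print(by_formula, by_slicing)
```

The two printed numbers should agree to several decimal places, which is a reasonable sign that the pieces were reassembled correctly.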
Homework #42
Exercises.
You can check your answer using Maple, but beware of different forms of the same answer! (Take Maple’s answer,
subtract your answer, and then do a simplify(%,trig); on it. If you get 0, or any constant, you are correct.)
2. Use the reduction formula in this section to evaluate
\[ \int \sin^6(5x)\,dx. \]
\[ \int \frac{\sin x}{x}\,dx \]
\[ \int \sqrt{1 - k^2 \sin x}\,dx \]
The first of these shows up in probability and statistics. (It’s the famous bell-shaped curve function.) The second is called a
Fresnel (pronounced fruh-NELL) integral, and is used in optics. The third is generally called an elliptic integral, and shows
up in miscellaneous places in physics and math.
To find the definite integrals of these functions, we approximate them. This is a very sophisticated subject, and we’ll
just touch on the surface of it.
Mugsy: How about skipping it altogether?
Albert: Definitely not.
We will motivate the approximations using the area concept of integrals. That is, we will interpret the integrals as areas,
and come up with different ways of approximating that area. The approach is simple: Slice it up; approximate it; add it
back together. We can’t take the limit anymore, though, since all you get is the integral that we can’t evaluate. So, we stop
short of the limit, and evaluate the sum we get “by hand.” This is why the result is an approximation. It also gives a major
clue about how to improve the approximation: Increase the number of slices! More on that later; it is not as simple as it
sounds.
Mugsy: Nothing in this course is as simple as it sounds.
On the other hand, we will end up using Maple extensively in this section, because it is so numeric-intensive.
Riemann sums.
The idea of Riemann sums is to approximate the strips using rectangles. It’s a reasonable idea—basically what we did in
setting up integrals. There is only one question, and that is “What height should I use for the rectangle?” Usual calculus
courses cover two options, and Maple allows a third. You can use the heights at the right-hand endpoint of each slice
(called the rightsum(); in Maple), or the left-hand endpoint (called the leftsum(); in Maple), or the midpoint (called
the middlesum(); in Maple). To access the rightsum();, leftsum();, and middlesum(); procedures in Maple, type
with(student);
first. That defines the procedures for Maple. For example, an integral without an exact elementary function value is
\[ \int_0^2 \sqrt{1 + x^5}\,dx \]
It still has an exact value, though, which we want to approximate. We will use this single integral throughout all the
approximation procedures we get. To apply each of the procedures from above, you would use these commands:
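> with(student):
> rightsum( sqrt(1+x^5), x = 0 .. 2 );
> leftsum( sqrt(1+x^5), x = 0 .. 2 );
> middlesum( sqrt(1+x^5), x = 0 .. 2 );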
You will notice immediately that you don’t get numbers. You get messy looking summations. To get numbers, use
evalf(value(%)); on them. The Maple Riemann sum routines give results similar to what Int(); does. They don’t try
to do any simplification or evaluation. You must force them.
Mugsy: If you need more force, just call. That I’m good at.
Maple uses only 4 slices unless you tell it otherwise. To get 10 slices with middlesum();, for example, you’d use
> with(student):
> middlesum( sqrt(1+x^5), x = 0 .. 2, 10 );
> evalf(%);
\[ \frac{1}{5}\left(\sum_{i=0}^{9} \sqrt{1 + \left(\frac{i}{5} + \frac{1}{10}\right)^5}\,\right) \]
4.210170324
That is, you put the number of slices you want to use (if it’s different than 4) after the limits. The others work the same
way.
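To see what the three Riemann sums actually compute, here is a minimal Python sketch of the same arithmetic. Python is used only for illustration (the text itself works in Maple), and the name `riemann_sum` and its `offset` parameter are inventions of this example, not Maple routines.

```python
import math

def riemann_sum(f, lo, hi, slices, offset):
    """Add up rectangle areas over equal slices.
    offset picks the sample point inside each slice:
    0.0 = left endpoint, 0.5 = midpoint, 1.0 = right endpoint."""
    width = (hi - lo) / slices
    return width * sum(f(lo + (i + offset) * width) for i in range(slices))

def f(x):
    return math.sqrt(1 + x ** 5)

left = riemann_sum(f, 0.0, 2.0, 10, 0.0)
middle = riemann_sum(f, 0.0, 2.0, 10, 0.5)
right = riemann_sum(f, 0.0, 2.0, 10, 1.0)
print(left, middle, right)
```

With 10 slices the midpoint value should land close to the 4.210170324 that Maple printed above. Because this function is increasing on the interval, the left sum comes out smallest and the right sum largest.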
You can get a very nice picture of the specific rectangles that are used with rightsum();, leftsum();, and middlesum();.
The commands for the pictures are
rightbox();, leftbox();, and middlebox();
with the same types of arguments used in the approximations. For example, you get a picture of the boxes used in
middlesum(sqrt(1+x^5),x=0..2,10);
by typing
middlebox(sqrt(1+x^5),x=0..2,10);
You can actually see the boxes, with their heights set by the values of the function at the middles of their tops.
Homework #43
Exercises.
1. Calculate the decimal values of rightsum();, leftsum();, and middlesum(); for the following integrals, using
n = 10 for each. Put down all 10 digits that Maple gives.
(a) $\int_0^{\pi/4} \tan x\,dx$
(b) $\int_0^1 e^x\,dx$
(c) $\int_0^1 \dfrac{\sqrt{x}}{1 + x^4}\,dx$
2. Again, calculate the decimal value of the three Riemann sums for the following integrals, using n = 10 each time.
And again, put down all 10 digits that Maple gives.
(a) $\int_0^{\pi/3} \cos x\,dx$
(b) $\int_0^2 e^x\,dx$
(c) $\int_0^1 \dfrac{\sqrt{x}}{1 + x^5}\,dx$
Trapezoidal rule.
We can improve the approximation considerably by using trapezoids rather than rectangles to approximate the areas of the
slices.
In case you’ve forgotten your high school geometry, a trapezoid has four sides, two of them parallel. In this case, the
two parallel sides are vertical. The bottom will be horizontal, and the top will be a line of whatever slope is determined by
the function.
Of course, you want to do this on Maple, and of course, Maple will do it. To get to the Maple routines that do trapezoidal
approximation, you again need with(student); and the routine is, of course, trapezoid();. To approximate, for
example,
\[ \int_0^2 \sqrt{1 + x^5}\,dx \]
by the trapezoidal rule with 10 slices, you would use the command
> with(student):
> trapezoid( sqrt(1+x^5), x=0..2, 10);
> evalf(%);
\[ \frac{1}{10} + \frac{1}{5}\left(\sum_{i=1}^{9} \sqrt{1 + \frac{i^5}{3125}}\,\right) + \frac{\sqrt{33}}{10} \]
4.244981679
Again, you’ll get a summation, which you convert to a decimal by evalf(value(%));. This really is nothing too new.
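The summation Maple prints has a simple shape: half weight on the two end values, full weight on everything in between. A minimal Python sketch of that arithmetic follows. Python is used only for illustration, and `trapezoid_rule` is a name made up for this example.

```python
import math

def trapezoid_rule(f, lo, hi, slices):
    """Trapezoidal approximation with equal slices:
    width * (f(lo)/2 + interior values + f(hi)/2)."""
    width = (hi - lo) / slices
    interior = sum(f(lo + i * width) for i in range(1, slices))
    return width * (0.5 * f(lo) + interior + 0.5 * f(hi))

approx = trapezoid_rule(lambda x: math.sqrt(1 + x ** 5), 0.0, 2.0, 10)
print(approx)
```

The printed value should be close to the 4.244981679 shown above. The end terms here correspond to the 1/10 and $\sqrt{33}/10$ in Maple's sum.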
Homework #44
Exercises.
1. Calculate the trapezoidal approximation (in decimal form) to the following integrals. Use n = 10. (They should look
familiar.) Again, put down all 10 digits that Maple gives.
(a) $\int_0^{\pi/4} \tan x\,dx$
(b) $\int_0^1 e^x\,dx$
(c) $\int_0^1 \dfrac{\sqrt{x}}{1 + x^4}\,dx$
2. Again, calculate the decimal value of the trapezoidal approximation for the following integrals, using n = 10 for
each. Again, put down all 10 digits that Maple gives.
(a) $\int_0^{\pi/3} \cos x\,dx$
(b) $\int_0^2 e^x\,dx$
(c) $\int_0^1 \dfrac{\sqrt{x}}{1 + x^5}\,dx$
Simpson’s rule.
We can look at Riemann sums as approximating the function by a constant on each slice. The trapezoidal rule approximates
the function by a linear function on each slice. Simpson’s rule uses quadratic functions.
The difficulty is to determine enough information to fit a quadratic. To determine a constant, you only need one value
(the value of the constant), and we had several ways of determining that constant, using the value of the function either at
the left-hand or right-hand endpoint, or the midpoint. For the trapezoidal rule, you need two values, and the values used
were the values at each end of the interval. For Simpson’s rule, you will need three values (corresponding to a, b, and c in
the formula ax2 + bx + c). It is quite reasonable to use the values at both endpoints and the midpoint of the interval, and
that’s what’s done, except that it is always written differently. Instead of using both ends and the middle of a single interval,
the interval is split into two intervals, and the outside ends and the common (“middle endpoint”) point are used. The net
result is that Simpson’s rule requires that the number of intervals be even.
Of course, Maple includes Simpson’s rule. After the usual with(student); you get it by the command simpson();.
To approximate, for example,
\[ \int_0^2 \sqrt{1 + x^5}\,dx \]
by Simpson’s rule with 10 slices, you would use the command
> with(student):
> simpson( sqrt(1+x^5), x=0..2, 10);
> evalf(%);
This is exactly like what we did before.
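The usual way Simpson's rule is written out uses the weights 1, 4, 2, 4, ..., 2, 4, 1 across the slice endpoints. Here is a minimal Python sketch of that form. It is an illustration only, `simpson_rule` is a made-up name, and no claim is made that Maple's simpson(); is organized the same way internally.

```python
import math

def simpson_rule(f, lo, hi, slices):
    """Composite Simpson's rule with weights 1, 4, 2, 4, ..., 2, 4, 1.
    The slices are used in pairs, so the count must be even."""
    if slices % 2 != 0:
        raise ValueError("Simpson's rule needs an even number of slices")
    width = (hi - lo) / slices
    total = f(lo) + f(hi)
    for i in range(1, slices):
        weight = 4 if i % 2 == 1 else 2   # odd points are the "middle" points
        total += weight * f(lo + i * width)
    return total * width / 3

approx = simpson_rule(lambda x: math.sqrt(1 + x ** 5), 0.0, 2.0, 10)
cubic_check = simpson_rule(lambda x: x ** 3, 0.0, 2.0, 10)
print(approx, cubic_check)
```

The second printed number is a sanity check: Simpson's rule reproduces the integral of a cubic exactly (up to roundoff), so it should come out as 4.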
Homework #45
Exercises.
1. Calculate Simpson’s rule approximation (in decimal form) to the following integrals. Use n = 10. (They should look
familiar.) Once again, put down all 10 digits that Maple gives.
(a) $\int_0^{\pi/4} \tan x\,dx$
(b) $\int_0^1 e^x\,dx$
(c) $\int_0^1 \dfrac{\sqrt{x}}{1 + x^4}\,dx$
2. Again, calculate the decimal value of the Simpson’s rule for the following integrals, using n = 10 for each. And
again, put down all 10 digits that Maple gives.
(a) $\int_0^{\pi/3} \cos x\,dx$
(b) $\int_0^2 e^x\,dx$
(c) $\int_0^1 \dfrac{\sqrt{x}}{1 + x^5}\,dx$
Problem.
1. In this problem, we collect in one place all the information that the previous exercises have given, and then look at
various patterns in the data. For this, we create a table with the following headings
Exact Leftsum Middlesum Rightsum Trapezoid Simpson
leaving room for several more columns. There will be three rows, one for each of the three integrals in exercise 1 of
the past several sections.
(a) Fill in the table, using the numbers you collected earlier. For the Exact column, you might need to use Maple’s
internal approximation routines, and use evalf(%); on them. Again, use full 10 digit accuracy on these.
(b) Compare the Exact column with the others. Which column is consistently closer?
(c) Create a new column, labeled Avg., and fill it in with the averages of the Rightsum and Leftsum columns.
Compare these numbers to the numbers in the Trapezoid column. (The same thing should happen for each
row.) Can you explain what happened?
Accuracy considerations.
The approximation techniques that we have used here are, of course, inaccurate. That’s why they are called approximations.
Some of them are more accurate (closer to the correct value) than others. What we want to look at next is just that. How
accurate (or, how inaccurate) are these approximations?
By definition, the error is
Exact value − Approximate value
The value of the error depends on the method of approximation, the function, the interval, and the number of slices. Of
these, the method and the number of slices can be chosen for any specific problem, since the function and interval are given
to you.
Don’t just increase the number of slices to some huge number and expect better accuracy. Roundoff errors can kill your
estimate, and you’ll be spending a huge amount of time unnecessarily. This is particularly true of the approximations built
into some calculators. (I think they use Simpson’s rule, but I’ve never looked into that carefully.)
The art of balancing approximation method and the number of slices is a delicate one. I intend only to convince you of
that in the homework.
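To make the dependence of the error on the number of slices concrete, here is a small Python sketch using an integral whose exact value is known, $\int_0^\pi \sin x\,dx = 2$. The integral, the slice counts, and the name `midpoint_sum` are all choices made for this illustration, and Python stands in for Maple only for convenience.

```python
import math

def midpoint_sum(f, lo, hi, slices):
    """Midpoint Riemann sum with equal slices."""
    width = (hi - lo) / slices
    return width * sum(f(lo + (i + 0.5) * width) for i in range(slices))

exact = 2.0  # integral of sin x from 0 to pi
errors = {}
for slices in (6, 12, 24, 48):
    # Error as defined in the text: exact value minus approximate value
    errors[slices] = exact - midpoint_sum(math.sin, 0.0, math.pi, slices)
    print(slices, errors[slices], errors[slices] * slices ** 2)
```

Doubling the number of slices should cut the error to roughly a quarter, so the last printed column stays nearly constant. That pattern is what makes it possible to plan how many slices a given accuracy needs instead of simply picking a huge number.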
Homework #46
Investigations.
This entire investigation refers to the integral
\[ \int_0^{\pi/2} \frac{dx}{1 + \sin x} \]
1. (a) Use Maple to find the exact value of the integral. You will note that Maple grinds for quite a while on this one.
It is not a simple integral! But it does give a whole number answer. (If you get $\frac{1}{2}\pi + 1$, go back to your function
and type it in correctly. It is not
1/1+sin(x)! Also remember to capitalize the P in Pi for Maple.)
(b) We will set up a table for deciding how well each of the different methods approximates the integral as n =
number of slices increases. The table (it will be quite large) should look like
The rest of the investigation will help you fill it out correctly.
i. Fill in the columns labeled “Value” of Middlesum, Trapezoid, and Simpson on the integral. (Write down
the full 10 significant digits with these. Otherwise the rest of the problem won’t work.)
ii. Knowing the correct value of the integral (from the first question), you can then calculate the numbers in
the column labeled “Err” in all those approximations.
iii. After that, multiply the numbers in the column labeled “Err” by $n^2$, $n^2$, and $n^4$ as indicated. (That is, on
the first row, where $n = 6$, multiply the error in Middlesum by $36 = 6^2$, multiply the error in Trapezoid
by $36 = 6^2$, and the error in Simpson by $1296 = 6^4$. Do similar things on the other rows.) (Be careful to
multiply the value in Simpson by $n^4$ rather than $n^2$.)
iv. For each of the three methods, the numbers in “Err × $n^2$” (or “Err × $n^4$” for Simpson’s rule) should have
specific values as $n \to \infty$. (To prove that requires considerably more effort than we have time for.) Estimate
the three limits for the three methods.
v. Use the three estimated limits to estimate the values of the three errors when $n = 96$. Do this by taking the
limits from the previous part of this question, and dividing by $96^2$ or $96^4$.
vi. Finally, run each of the methods with n = 96 (this could take some time), find the Values, calculate the
Errs, and figure out the last columns again. Use these numbers to check your answers to the previous part
of this question.
(c) You might have noticed (I am being generous there) that the Errs for middlesum(); and trapezoid();
tend to be related. Specifically, the Err for trapezoid(); is usually quite close to −2 times the Err for
middlesum();. This means that we can come much closer to the correct value of the integral if we use this
fact. Suppose the actual, exact value of an integral is $X$. Since the error is the exact value minus the approximate
value, middlesum(); will give an approximation $A_m = X - E_m$ while trapezoid(); will give an approximation
$A_t = X - E_t$, where $E_m$ and $E_t$ are the errors from the middlesum(); and trapezoid(); approximations,
respectively. But if $E_t \approx -2 E_m$, then $2 E_m + E_t \approx 0$. Then
\begin{align}
2 A_m + A_t &= 2\,(X - E_m) + (X - E_t) \tag{4.99}\\
&= 2X - 2E_m + X - E_t \tag{4.100}\\
&= 3X - (2E_m + E_t) \tag{4.101}\\
&\approx 3X \tag{4.102}
\end{align}
Then $X \approx (2 A_m + A_t)/3$ should be a very good approximation. Check this out using the table of Values of
middlesum(); and trapezoid(); for n = 6, 12, and 24. Compare these to the Values of simpson(); at
n = 12, 24, and 48. (It should look very much the same!) This is another way to get Simpson’s rule.
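The relationship between the blended value $(2 A_m + A_t)/3$ and Simpson's rule can be checked numerically on any integrand. The Python sketch below uses the running example $\sqrt{1 + x^5}$ on $0 \le x \le 2$ rather than the integral from this investigation, so the table is still yours to fill in. All three function names are made up for this illustration, and Python is used only for convenience.

```python
import math

def midpoint_sum(f, lo, hi, slices):
    width = (hi - lo) / slices
    return width * sum(f(lo + (i + 0.5) * width) for i in range(slices))

def trapezoid_rule(f, lo, hi, slices):
    width = (hi - lo) / slices
    interior = sum(f(lo + i * width) for i in range(1, slices))
    return width * (0.5 * f(lo) + interior + 0.5 * f(hi))

def simpson_rule(f, lo, hi, slices):
    # slices must be even; weights run 1, 4, 2, 4, ..., 2, 4, 1
    width = (hi - lo) / slices
    total = f(lo) + f(hi)
    for i in range(1, slices):
        total += (4 if i % 2 == 1 else 2) * f(lo + i * width)
    return total * width / 3

def f(x):
    return math.sqrt(1 + x ** 5)

pairs = []
for slices in (6, 12, 24):
    blended = (2 * midpoint_sum(f, 0.0, 2.0, slices)
               + trapezoid_rule(f, 0.0, 2.0, slices)) / 3
    simpson = simpson_rule(f, 0.0, 2.0, 2 * slices)   # twice as many slices
    pairs.append((blended, simpson))
    print(slices, blended, simpson)
```

The two columns should match up to roundoff: the midpoints of n slices are exactly the odd-numbered points of 2n slices, which is where Simpson's weight of 4 comes from.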
over the same set of “applications” from the point of view of showing you how to set up integrals. I won’t pretend that you
will ever use these as they stand. However, the method that is used to set up integrals is critical to using integration. In fact,
it will become a little boring by the end. That’s the idea.
Dudley: What? Integration is supposed to get boring?
Mugsy: If he’s trying to be funny, it isn’t working.
Albert: No, he’s serious.
Mugsy: It still isn’t working, then.
Albert: The procedure for setting up integrals for applications can get boring. It’s the same thing over and over.
I will cover these faster than most calculus courses, because what I want you to learn is different. I want you to learn
the procedure behind setting up integrals, not the specific formulas.
Many more applications (a lot closer to real-life) will form the majority of this course, beginning in the previous chapter,
picking up again next chapter, and continuing all of next semester. This section is to get you warmed up.
Mugsy: I prefer jogging.
Note that $f(x) < 0$ for $0 \le x < \sqrt{3}$. This is the culprit! When $f(x) < 0$, the $f(x)\,dx$’s will be negative also, and so will
the sum (that is, the integral). When $f(x)$ turns positive again, the $f(x)\,dx$’s become positive too. For $0 \le x \le 2$, there were
more negatives than positives, and the combined sum was negative. The negative and positive terms exactly cancel as you
add up for $0 \le x \le 3$. For $0 \le x \le 4$, there were enough positive terms to make the overall sum positive.
So, the integral is doing what it is supposed to do, namely add up a bunch of differential-sized items. The only problem
is that those items can be negative because the function multiplying dx is negative, and the area ought always to be positive.
When the function is positive, the definite integral gives the area correctly. When the function is negative, the f (x) dx
gives the negative of the area of the slice, and adding those up gives the negative of the area. That’s not too hard to work
with; we just change the sign to get the area. The difficulties occur when the function is sometimes positive and sometimes
negative, as in the example. Then the positives and negatives cancel.
What do we do?
What we need is something that will change the sign of negative numbers, but leave positive numbers alone. We have
something that does that—absolute values! The area of a differential sliver is always | f (x) | dx, whether f (x) is positive or
negative.
The general formula for area “under” the graph of $y = f(x)$ for $a \le x \le b$ can be expressed as
\[ \int_a^b |f(x)|\,dx \]
“All” we have to do now is figure out how to integrate absolute values of functions. Note that we really should say that we
are finding the area between y = f (x) and the x-axis rather than the area below y = f (x), because when f (x) < 0, it is the
area above the curve that we are finding. If we use between, we cover both situations with one word.
Learning how to integrate absolute values of functions turns out to be more useful than it might first appear.
Mugsy: Here we go again. I can hardly wait for him to explain how we will use this every day for the rest of our lives.
Dudley: Depends on how long you live.
Mugsy: I can adjust yours right now, if you keep that up.
There are a number of situations where absolute values show up, mainly because
\[ \sqrt{z^2} = |z| \]
Any time (well, almost) you are integrating the square root of something, you will want to make the inside part of the square
root a perfect square and then take the square root. At that point, you must put absolute values in, because of this identity.
Then you get to proceed to the following section’s procedure.
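A short illustration of why the absolute value cannot be skipped (the integrand is chosen here just for the example):

```latex
\begin{align*}
\int_0^{2\pi} \sqrt{1 - \cos^2 x}\,dx
  &= \int_0^{2\pi} \sqrt{\sin^2 x}\,dx
   = \int_0^{2\pi} |\sin x|\,dx \\
  &= \int_0^{\pi} \sin x\,dx + \int_{\pi}^{2\pi} (-\sin x)\,dx
   = 2 + 2 = 4
\end{align*}
```

Dropping the absolute value would give $\int_0^{2\pi} \sin x\,dx = 0$, which cannot be right for an integrand that is never negative.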
The correct and general way to find areas between curves and the x-axis.
The area between the curve $y = f(x)$ and the x-axis for $a \le x \le b$ is, as stated earlier,
\[ \int_a^b |f(x)|\,dx = \text{total area between } y = f(x) \text{ and the } x\text{-axis} \tag{4.103} \]
back then.
For example, let’s find the correct area between $f(x) = 3x^2 - 9$ and the x-axis for x between 0 and 4. We need first to find where
f (x) > 0. This is done by solving $3x^2 - 9 = 0$, which gives $3x^2 = 9$, or $x = \pm\sqrt{3}$. Since $x = -\sqrt{3}$ is not in the interval
0 ≤ x ≤ 4, we can discard that, and we then have that the intervals over which f (x) doesn’t change sign are $0 < x < \sqrt{3}$ and
$\sqrt{3} < x < 4$. Integrate f (x) over these intervals and get:
\[ \int_0^{\sqrt{3}} (3x^2 - 9)\,dx = \left(x^3 - 9x\right)\Big|_0^{\sqrt{3}} = (3\sqrt{3} - 9\sqrt{3}) - (0 - 0) = -6\sqrt{3} \]
\[ \int_{\sqrt{3}}^{4} (3x^2 - 9)\,dx = \left(x^3 - 9x\right)\Big|_{\sqrt{3}}^{4} = (64 - 36) - (3\sqrt{3} - 9\sqrt{3}) = 28 + 6\sqrt{3} \]
Now what do we do? The value of the first integral is clearly negative, which is correct, since the function is negative over
the entire interval. The second integral is clearly positive, which again checks with the function being positive. How do we
find the integral of $|f(x)| = |3x^2 - 9|$ over 0 ≤ x ≤ 4? We add up the absolute values of the two integrals that we got:
\[ |-6\sqrt{3}| + |28 + 6\sqrt{3}| = 6\sqrt{3} + 28 + 6\sqrt{3} = 28 + 12\sqrt{3} \]
This is the area between the curve $f(x) = 3x^2 - 9$ and the x-axis for 0 ≤ x ≤ 4.
Three final notes.
Mugsy: At least I can count that high.
First, if you don’t take the absolute values before you add in the example, you get $(-6\sqrt{3}) + (28 + 6\sqrt{3}) = 28$, which is
the answer we obtained by just integrating $3x^2 - 9$ from 0 to 4. Think about this for a while until you are convinced that
this should always happen.
Second, be careful when you are taking absolute values of differences (subtractions). You don’t just change all the signs
to positive. For example, $|5 - \sqrt{3}|$ is $5 - \sqrt{3}$ rather than $5 + \sqrt{3}$. (For those of you who remember such things, the triangle
inequality says that |a ± b| ≤ |a| + |b|, and the ≤ can often be a <. You can’t just pull absolute values apart that way!)
On the other hand, $|1 - \sqrt{3}|$ is $-(1 - \sqrt{3}) = \sqrt{3} - 1$. You take the absolute value after you have calculated the number,
and not the absolute values of the individual terms. The way to think about it is that you first punch the quantity into your
calculator, and only at the end do you decide whether you need to change the sign to make it positive. In the examples, you
first find that $5 - \sqrt{3} \approx 3.268$, so $|5 - \sqrt{3}| \approx 3.268$, but $1 - \sqrt{3} \approx -0.732$, so $|1 - \sqrt{3}| \approx 0.732$.
Finally, there is another way this type of problem will get stated, that looks like it requires more work on your part.
Mugsy: I can see it coming now.
Sometimes, just the function is given to you, and you are supposed to find the limits by yourself. An example of such a
problem is this. Find the area between the curve $y = x^2 - x - 6$ and the x-axis. No limits are given to you! What do you
do? Simple. In cases like this, you figure out where the function intersects the x-axis, and use those for limits. (You’d have
to do this step anyway, so there really isn’t any extra work.) You then integrate over the interval(s) you get, and add up
the absolute values. Note that you don’t normally integrate over the intervals that are infinite (going to +∞ or −∞). In the
example of $y = x^2 - x - 6$, the curve intersects the x-axis at x = −2 and 3. Integrating, you get $\int_{-2}^{3} (x^2 - x - 6)\,dx = -\frac{125}{6}$. The
absolute value is $\frac{125}{6}$, which is the area.
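A minimal Maple sketch of this find-the-limits-yourself procedure, using the example just given. The names f and A are only labels chosen here, and this is one reasonable way to lay out the steps rather than the only one.

```maple
f := x^2 - x - 6;
# Where does the curve cross the x-axis? Those points become the limits.
solve(f = 0, x);
# f keeps one sign between the two crossings, so a single integral is enough.
A := int(f, x = -2 .. 3);   # negative, because the curve is below the axis here
abs(A);                     # the area between the curve and the x-axis
```

If solve had returned more than two crossings, you would integrate between each neighboring pair and add up the absolute values.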
One short technical note on that. It is possible that there are no intersection points, or just one. An example of no
intersection points would be $y = 1/(1 + x^2)$, and an example of one intersection point would be $y = xe^{-x}$. In that case, you
must use infinity (∞) and/or negative infinity (−∞) to complete the limits. For $y = 1/(1 + x^2)$, you’d use −∞ to ∞. For
$y = xe^{-x}$, you’d use 0 (the intersection point) and ∞. Why ∞ rather than −∞ is more complicated than I want to go into
here. It would take us into what are called improper integrals, and that’s another topic for next semester.
item that corresponds to y = f (x) crossing the x-axis is y = g(x) and y = h(x) crossing each other because f (x) = 0 is the
same as h(x) − g(x) = 0 or g(x) = h(x). Where f (x) > 0 and f (x) < 0 corresponds to where h(x) is above or below g(x)
again because f (x) > 0 means h(x) − g(x) > 0 or h(x) > g(x). Think about this until it makes sense.
The idea behind this is fairly important. As long as f (x) = h(x) − g(x), the values of f (x) dx will be the same as
the values of (h(x) − g(x)) dx. That means that the areas of the slivers obtained from the area under y = f (x) will match
precisely the areas of the slivers from the area below y = h(x) and above y = g(x). Adding up the slivers will then give the
same area both times. Geometrically, this says that you can slide the areas around vertically, and push the area up or down
so that one side of the area is the x-axis.
Here’s the procedure you have been waiting for.
Dudley: That’s not really what I would have said. Dreading is a little closer.
To find the area between y = g(x) and y = h(x) for a ≤ x ≤ b, you first solve g(x) = h(x), keeping only the points that are
between a and b. Those points break up the interval between a and b into sub-intervals. Integrate over each sub-interval
separately, and add together the absolute values of the results to get the final answer.
Again, if the limits aren’t given to you, use the intersection points (g(x) = h(x)) as before to get the areas.
For example, to find the area between the curves $y = 4x$ and $y = x^3$, you have to solve $4x = x^3$, giving values x = 0, −2,
and 2. You would use the limits −2 ≤ x ≤ 2, and the integral you’d want is
\[ \int_{-2}^{2} (4x - x^3)\,dx \]
(In these problems, you always use the very largest and very smallest intersection points for the limits on the integral.) To
evaluate this, you’d set up two integrals (one from −2 to 0, and one from 0 to 2), and add up the absolute values. The two
integrals are −4 and 4, so the total area is |−4| + |4| = 8. (Note that if you find $\int_{-2}^{2} (4x - x^3)\,dx$ without splitting it up, you get
0, and that can’t be right! The area is never equal to 0, for us at least.)
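The same example can be laid out as a short Maple sketch. The names g, h, A1, and A2 are labels chosen here for illustration. The comments give the values the text reports.

```maple
g := 4*x;
h := x^3;
# Intersection points break the interval into sub-intervals.
solve(g = h, x);                 # 0, 2, -2 in some order
# Integrate over each sub-interval separately.
A1 := int(g - h, x = -2 .. 0);   # the text's value: -4
A2 := int(g - h, x = 0 .. 2);    # the text's value: 4
# Add the absolute values to get the area between the curves.
abs(A1) + abs(A2);               # 8
```

Subtracting in the other order, h − g, would flip the sign of each piece but leave the final area unchanged.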
This seems reasonably simple (or at least seems as though it will be simple once you figure it out). It really is. But it
illustrates the key to all other applications: Slice up the item you want to find into differential-thickness slivers, estimate
each sliver, and reassemble with an integral.
Albert: This is the beginning of the boredom.
It might come as a shock, but Maple does not have an area-finding routine.
Mugsy: Boy, am I shocked.
You have to do that manually. Maple will find where f (x) = 0, or where g(x) = h(x). Maple will also do the integration, once
you give it the limits. On the other hand, if all you want is an approximation, you can get by with Maple’s approximation
routines. For example, if you wanted to find the approximate area between $y = 3x^2 - 9$ and the x-axis for 0 ≤ x ≤ 4, you
could just tell Maple evalf(Int(abs(3*x^2-9),x=0..4));, and you’d get an approximation. Note, however, that this is
just an approximation, and I will usually ask for the exact value in the homework. For the area between $y = 4x$ and $y = x^3$,
you could tell Maple solve(4*x=x^3,x);, and get the numbers 0, -2, 2. (More complicated problems might require
you to use fsolve();) You’d then have to tell Maple evalf(Int(abs(4*x-x^3),x=-2..2)); Using the Int(); (inert)
form of integration prevents Maple from trying to integrate symbolically first, so Maple works the integral only numerically.
This can save quite a bit of time.
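For the exact value, the manual route might look like the following sketch for the area between y = 3x^2 − 9 and the x-axis on 0 ≤ x ≤ 4. The variable names are chosen here for illustration, and the last line is only a numerical cross-check.

```maple
f := 3*x^2 - 9;
solve(f = 0, x);                     # two roots; only sqrt(3) lies between 0 and 4
A1 := int(f, x = 0 .. sqrt(3));      # exact, negative on this piece
A2 := int(f, x = sqrt(3) .. 4);      # exact, positive on this piece
exact := simplify(abs(A1) + abs(A2));
evalf(exact);                        # decimal form of the exact area
evalf(Int(abs(f), x = 0 .. 4));      # purely numerical; should agree with the line above
```

The exact result should match the hand computation earlier in this section, 28 + 12√3.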
Slicing horizontally.
It is possible to slice areas horizontally rather than vertically. (In fact, a thorough investigation of this approach leads to a
new theory of integration called Lebesgue (pronounced luh-BAYG) integration. In contrast, what we have been doing is
called Riemann integration. But that’s for graduate-level work and not for us to worry about here.)
When you slice horizontally, the sliver will be dy thick, and of some length. Integration will then force you to integrate
()dy. That means that the variable of integration will have to be y, so the function will have to get expressed in terms of y,
and the limits will have to be y-limits.
So, the length of the sliver (to multiply by the thickness dy to get the area) will have to get expressed in terms of y. Since
horizontal distances are the differences in x-coordinates, we’ll need the x-coordinates of the ends of the sliver, meaning that
we will need the left- and right-hand curves in the form x = f (y) and x = g(y).
This is important. It tells us when we should use this method rather than slicing vertically. Look at the form of the
functions given to you; if they are in the form y = f (x), slice vertically, while if they are in the form x = g(y), slice
horizontally. If the equations are given implicitly (such as x + y = 4, with no preferred variable), you get to choose.
Admittedly, slicing vertically is much more common.
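Here is a small worked sketch of horizontal slicing. The two curves are chosen just for illustration and do not come from the text: take the left-hand curve x = y^2 and the right-hand curve x = y + 2. They cross where y^2 = y + 2, that is, at y = −1 and y = 2. Each sliver has length (right) − (left) and thickness dy:

```latex
\[
\text{Area}=\int_{-1}^{2}\bigl((y+2)-y^{2}\bigr)\,dy
=\left(\frac{y^{2}}{2}+2y-\frac{y^{3}}{3}\right)\Bigg|_{-1}^{2}
=\frac{10}{3}-\left(-\frac{7}{6}\right)=\frac{9}{2}
\]
```

Slicing the same region vertically would take two integrals, because the bottom boundary switches from the lower half of the parabola to the line at x = 1.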
Homework #47
Exercises.
\[ \int_{t_1}^{t_2} |v(t)|\,dt = \text{total distance traveled} \tag{4.104} \]
The net distance traveled is much easier to find. Since v = ds/dt, we get that
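\begin{align*}
\int_{t_1}^{t_2} v(t)\,dt &= \int_{t_1}^{t_2} (ds/dt)\,dt \tag{4.105}\\
&= s(t)\Big|_{t_1}^{t_2} \tag{4.106}\\
&= s(t_2) - s(t_1) \tag{4.107}
\end{align*}
\[ \int_{t_1}^{t_2} v(t)\,dt = \text{net distance traveled} \tag{4.108} \]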
Note that net distance can be positive, negative, or zero, depending on the sign of v(t).
On analogy with net and total distances, I will sometimes refer to the area between a function and the x-axis as the total
area, while just integrating the function without the absolute values gives the net area.
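As a quick sketch of the difference between net and total distance, here is a velocity function made up for this illustration (it is not one of the homework problems): v(t) = t^2 − 4t + 3 for 0 ≤ t ≤ 4. The names netdist and totaldist are labels chosen here.

```maple
v := t^2 - 4*t + 3;
# Net distance: just integrate v.
netdist := int(v, t = 0 .. 4);
# Total distance: find where v changes sign, then add absolute values.
solve(v = 0, t);                       # t = 1 and t = 3
totaldist := abs(int(v, t = 0 .. 1))
           + abs(int(v, t = 1 .. 3))
           + abs(int(v, t = 3 .. 4));
```

Worked by hand, the three pieces are 4/3, −4/3, and 4/3, so the net distance is 4/3 while the total distance is 4.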
We will cover total distance traveled in two dimensions next section.
Homework #48
Exercises.
1. Find the net and total distances traveled for the following velocity functions, v(t).
(a) v(t) = cos t, 0 ≤ t ≤ π/2
(b) v(t) = 6t − 48, 0 ≤ t ≤ 20
2. Find the net and total distances traveled for the following velocity functions, v(t).
(a) v(t) = sin(t/4), 0 ≤ t ≤ 2 π
(b) v(t) = 4t − 16, 0 ≤ t ≤ 10
\[ ds = \sqrt{dx^2 + dy^2} \tag{4.109} \]
Drawing a little differential triangle should convince you of this. The only hassle is with the curved “hypotenuse,” but
remember that differential-sized values of dx and dy don’t allow enough room for the hypotenuse to curve enough to mess
up the equation.
Integrating adds up all the ds’s, and the result is the arc length:
\[ \text{arc length} = \int \sqrt{dx^2 + dy^2} \quad \text{(sort of)} \tag{4.110} \]
\[ \int_{x_1}^{x_2} \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx \tag{4.114} \]
A very similar thing happens in the unusual case of y as the independent variable. The formula becomes
\[ \int_{y_1}^{y_2} \sqrt{1 + \left(\frac{dx}{dy}\right)^2}\,dy \]
as you can easily check. Again you must supply limits (values of y this time, since the differential is dy), and the formula
for dx/dy (obtained by differentiation), and integrate.
A similar, but slightly different, thing happens in the case of parametric equations. There, the independent variable is t,
and you don’t get quite as much simplification. The differential for arc length is calculated as before:
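\begin{align*}
ds &= \sqrt{dx^2 + dy^2} \tag{4.115}\\
&= \sqrt{\left(\frac{dx^2}{dt^2} + \frac{dy^2}{dt^2}\right) \times (dt^2)} \tag{4.116}\\
&= \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2} \times dt \tag{4.117}
\end{align*}
\[ \int_{t_1}^{t_2} \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2} \times dt \tag{4.118} \]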
where you must supply t-limits, and plug in for both dx/dt and dy/dt and then integrate.
Let’s do some examples. Take $y = x^2$ for 0 ≤ x ≤ 3. The value of dy/dx = 2x, and the limits on the integral are 0 and
3. The arc length is
\[ \int_0^3 \sqrt{1 + (2x)^2}\,dx = \int_0^3 \sqrt{1 + 4x^2}\,dx \]
This integral can be worked exactly (try it on Maple), and the answer is
\[ \frac{3}{2}\sqrt{37} + \frac{\ln(6 + \sqrt{37})}{4} \]
which evalf’s to 9.747088758.
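One way to try this on Maple is sketched below. The names y, ds, and len are just labels chosen here, and Maple may print the logarithm in an equivalent form (for example, as an inverse hyperbolic sine) rather than exactly as written above.

```maple
y := x^2;
ds := sqrt(1 + diff(y, x)^2);     # the integrand, sqrt(1 + 4*x^2)
len := int(ds, x = 0 .. 3);       # exact arc length
evalf(len);                       # decimal value; the text reports 9.747088758
```

Because y was assigned a value here, clear it (y := 'y';) before reusing y as a plain variable.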
For another example, suppose we take $x = t^2 + t$ and $y = t^3 - t$, for 0 ≤ t ≤ 2. (We could solve x for t and plug into
y, but that would leave a set of equations that are ghastly. Then we’d have to differentiate them before squaring them. It
would be serious. See the homework where I guide you through this problem for just some of the hassles that can occur, using
Maple.)
Mugsy: What?! You mean that we’re going to have to go through a bunch of algebra that he has already described
as “ghastly”?!
Albert: Sort of. You have to get Maple to do it, actually. That makes it reasonable.
Then you get
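\begin{align*}
\int_{t_1}^{t_2} \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2} \times dt &= \int_0^2 \sqrt{(2t + 1)^2 + (3t^2 - 1)^2}\,dt \tag{4.119}\\
&= \int_0^2 \sqrt{9t^4 - 2t^2 + 4t + 2}\,dt \tag{4.120}
\end{align*}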
This function can’t be integrated in terms of elementary functions (it is called an elliptic integral).
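For the record, the step from (4.119) to (4.120) is just expanding the two squares, using dx/dt = 2t + 1 and dy/dt = 3t^2 − 1 from the given parametric equations:

```latex
\begin{align*}
(2t+1)^2 + (3t^2-1)^2 &= (4t^2 + 4t + 1) + (9t^4 - 6t^2 + 1)\\
&= 9t^4 - 2t^2 + 4t + 2
\end{align*}
```

Nothing under the square root turns into a perfect square, which is why no simplification beyond this is available.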
Homework #49
Exercises.
1. Calculus texts love to give problems that involve arc length where the square root in the integrand can be taken
exactly. This involves a very careful choice of functions y = f (x) or x = x(t), y = y(t). This question works through
some of the y = f (x) sort. The trick usually is to choose y = f (x) so that what’s inside the square root in the integral
is a perfect square. Taking the square root then leaves some reasonable function to integrate.
Dudley: My definition of reasonable doesn’t match your definition very closely.
(a) Take $y = \frac{1}{3}(x^2 + 2)^{3/2}$. Find the integrand for arc length and simplify it to some (?) dx that can be integrated
2. Find the circumference of a circle of radius R. Do this by writing the circle as x = R cos θ , y = R sin θ and finding the
length of the curve. (Hint: What are the limits on θ that go around the circle once?)
Investigations.
1. This question is to show you how much easier it is to work arc length with parametric equations when that is the form
you have. Use Maple (unless you are feeling very brave or very masochistic) to do the following parts. Remember
to change the colons at the end of these commands to semicolons so that you can see the answers.
(a) Define
> x := t^2 + t:
> y := t^3 - t:
and then find ds (without the dt).
> ds := sqrt( diff(x,t)^2 + diff(y,t)^2 ):
Then find the arc length. (You will save a lot of time by using the inert form of the integration command,
Int();. Maple grinds for a long time trying to figure it out, and ultimately generates a gigantic, and unusable,
answer.)
> Int( ds, t = 0 .. 2):
The value should be between 9 and 10. This is the correct value of the arc length.
(b) In the rest of the problem, we try to do this by converting the equations to an explicit form. Clear x by typing
this.
> x := 'x':
This gives t as a function of x. You will note that there are two solutions. This makes sense, since it is a
quadratic equation for t. Keep both solutions around by assigning a variable to the solutions. (I call it ts for
“the t’s,” that is the value of t given by the x equations.)
> ts := %:
(c) Let
> t := ts[1]:
so that t takes on the first of those solution values. Then look at
> y:
Maple will automatically use the value of t that you had from the t:=ts[1] when finding y. (At this point, you
have solved the parametric equations down to the form y = f (x), that is, an explicit equation.)
(d) Now find
> ds := sqrt( 1 + diff(y,x)^2 ):
We now want to integrate ds for a certain range of x’s. How would you find the x-limits to correspond to
0 ≤ t ≤ 2? (Hint: You have the equation!) Integrate the function by typing
> evalf( Int(ds, x = 0 .. 6) ):
You will notice that the answer is either between 9 and 10 (the correct one) or between 24 and 25 (which is
obviously not right).
(e) Repeat the instructions for the previous two parts, replacing the first step, t:=ts[1];, with
> t := ts[2]:
Now the evalf(Int(ds,x=<limits you found>)); gives the other value (whichever one you did not
get two parts ago).
(f) Why did one value of t give the wrong integral, and the other the correct integral, and how would you tell which
one is correct without evaluating the integrals? (Hint: Plug t = 1 into the original parametric equations for x.
What value of x do you get? Plug that value of x back into the ts[1] and ts[2] expressions that you got by
solving for t. Which one gives t = 1, the correct value of t?)
(Are you convinced that you should just stick with the parametric equations form of x(t) and y(t) when that’s
what is given to you? And if the equations for x(t) and y(t) weren’t so easy (!), Maple wouldn’t be able to solve
for t at all, and the whole process would be stuck from the beginning. It would eliminate the problem of which
solution of t to use, since there wouldn’t be any solutions!)
Mugsy: I’m not convinced that this guy isn’t a sadist.
Dudley: Huh? Does that mean you think he is or you think he isn’t?
Mugsy: Yes.
2. In this question, we show that the formula for arc length reduces to the formula for total distance traveled in the
previous section. Accordingly, we assume that x = x(t), and that y(t) = 0, so that the object is moving along the
x-axis, and t1 ≤ t ≤ t2 .
(a) Find ds/dt. What do you get when you take the square root? (Hint: It is not dx/dt!)
(b) Plug ds/dt into the integral for arc length with the dt differential and the correct limits on the integral. Since
v(t) = dx/dt, put v(t) into the integral, and show that it is exactly what we got before.
A differential of the curve, when it spins, produces a differential-width hoop. In order to find the area of the hoop, we
slice it across and lay it out flat, where it becomes a ribbon, essentially a long, skinny rectangle. The area will be the width
(ds) times the length. The length of the ribbon is the circumference of the hoop. (Think about that for a while until it makes
sense).
Mugsy: He’s discriminating against me again.
The circumference of the hoop is 2π× (radius of hoop). For definiteness, we’ll use the Greek letter ρ (rho, the equivalent
of “r,” and pronounced like “row”)
Dudley: ρ, ρ, ρ your boat,
Mugsy: Gently down the stream. . .
Albert: At least you’re paying attention.
for the radius of the hoop. That makes the length of the ribbon 2πρ, and the area of the ribbon 2πρ ds. Adding all of these
up (with an integral, of course) gives the formula:
\[ \text{Surface area of revolution} = \int 2\pi\rho\,ds \tag{4.121} \]
This formula presents us with all the same problems that arc length did, and then some.
Dudley: Mugsy, I’m beginning to agree with you about that “boring” thing.
We must deal with ds again, and we do that exactly the same way that we did with arc length. (This “application” always
immediately follows arc length for this reason.) So, that is something you should know how to do.
The ρ presents more of a difficulty. It came from the radius of the hoop, but how do we find that when we are confronted
with a problem? Basically, the radius of the hoop is the distance of the differential (the ds-piece) of the curve from the axis
of rotation. (Think about that for a bit until it makes sense.)
Mugsy: Hey, would you cut that out? I already feel picked on.
Dudley: I never thought I’d see the day....
So, how do we find ρ in a problem? It is the distance of the curve (which is where the differential ds piece lives) from the
axis of rotation. This needs to be highlighted:
There are two warnings. First, remember that distances between objects are always measured along a perpendicular.
Since the axis of rotation must be a line, the distance is measured perpendicular to that line. Second, remember that ρ, just
like everything else in the integral, must be expressed in terms of the integrating differential variable.
Let’s try to find surface area now for a specific example. Suppose we revolve the curve $y = x^2$ for 0 ≤ x ≤ 4 about the
y-axis. We’ll get a bowl-shaped object called a paraboloid, but picturing it is not critical. What is useful is to be able to
draw the curve and the axis of rotation in two dimensions (the xy-plane). In this case, the graph is fairly simple. The curve
is a parabola, and the axis of rotation is the y-axis. Finding ds is also fairly easy:
\begin{align*}
ds &= \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx \tag{4.122}\\
&= \sqrt{1 + (2x)^2}\,dx \tag{4.123}\\
&= \sqrt{1 + 4x^2}\,dx \tag{4.124}
\end{align*}
Note by doing this calculation, we have declared that x is the independent variable and the variable of integration for this
problem. We also know from the curve that the limits on the integral will be (x going from) 0 to 4. The remaining items are
to find ρ and assemble the information into the integral. (Note that we don’t need to know ρ or the limits to find ds.)
When we tackle ρ, we need to understand what it is, and locate ρ on the graph. It will measure the distance from
a differential chunk of the curve $y = x^2$ to the y-axis. In this case, the perpendicular to the y-axis is a horizontal line.
(Anything that passes through the point can be considered perpendicular to it, so the y-axis determines the perpendicular.)
How do you measure a horizontal distance? It is the difference in the x-coordinates, or more specifically, it is the
x-coordinate of the right-hand end minus the x-coordinate of the left-hand end. The right-hand end is the point, with x-
coordinate x (generic point). The left-hand end is the y-axis, with x-coordinate 0. (The y-axis has x-coordinate 0; the x-axis
has y-coordinate 0.) Then ρ = (x) − (0) = x.
Now let’s assemble the answer. The formula is $\int 2\pi\rho\,ds$, which in this case is
\[ \text{Area} = \int_0^4 2\pi(x)\sqrt{1 + 4x^2}\,dx = 2\pi \int_0^4 x\sqrt{1 + 4x^2}\,dx \]
This integral can be worked exactly (a simple substitution does it; see the homework), but this is somewhat unusual. Most
integrals can’t be worked exactly.
Dudley: At least he assigns ones that are possible to do.
Be careful that you realize that we are finding the distance between the individual points (actually, the differential-sized
chunks) of the curve $y = x^2$ and the y-axis, and not the distance between the whole curve and the y-axis. Those are two different,
though related, things. Each point has its own distance from the y-axis. But the curve as a whole has a single distance from
the y-axis, which in this case is 0. (The two intersect at (0,0). The distance of the curve in general is the smallest of the
distances from each of the points.)
Suppose we alter the problem slightly. Suppose we take the same curve and rotate it about the x-axis rather than the
y-axis. Then all we need to recalculate is the value of ρ. The ds portion of the integral depends only on the curve, and not
on how it is rotated. What is the value of ρ this time? The axis of rotation is now the x-axis, so ρ is the distance from the
point $(x, x^2)$ to the x-axis. What is the perpendicular to the x-axis (again, the perpendicular to the point won’t specify any
direction)? It is a vertical line segment. What is the length of a vertical segment? It is the difference in y-coordinates, or
more specifically, it is the upper point’s y-coordinate minus the lower point’s y-coordinate. In this case, the line segment
runs from the point $(x, x^2)$ to the x-axis. The y-coordinate of the upper point is then $x^2$; the y-coordinate of the lower end of
the segment is 0. The difference of these is $\rho = (x^2) - (0) = x^2$. The surface area of this is
\[ \text{Area} = \int_0^4 2\pi(x^2)\sqrt{1 + 4x^2}\,dx = 2\pi \int_0^4 x^2\sqrt{1 + 4x^2}\,dx \]
Again, this integral can be worked exactly, but the work is considerably harder. See the homework (where Maple does it!).
Note that we found the surface area without knowing anything about what the surface itself looks like.
It was useful to express the point (x, y) as $(x, x^2)$, since then we will have everything expressed in terms of the variable
we will want to have in the integral (x in this case, from the fact that we found ds in terms of dx).
What happens if you rotate about a line other than the x- or y-axis? The ds lives just on the curve, and couldn’t care
less about what line it will get rotated about. However, ρ depends on the line. The easiest lines to deal with are horizontal
(in the form y = C) or vertical (in the form x = C). In that case, you find ρ will be |y −C | or |x −C |, depending on the line.
(You’ll have to think about this for a while.)
Mugsy: That’s it. I give up.
Dudley: What are you going to do? Quit?
Mugsy: No. Stop thinking. I think.
To work with the absolute values (and note that I didn’t use them in the examples), you figure out the sign of y −C or x −C
first, and plug it into the example. For example, if we rotated the $y = x^2$ curve about the line y = 20, the value of ρ would
be |y − 20 |. But for points on the curve, y < 20, so ρ = |y − 20 | = −(y − 20) = 20 − y. That’s what you would use in the
integral.
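Putting the pieces together for this rotation about y = 20, a Maple sketch might look like the following. The names rho and ds are labels chosen here, and the limits 0 to 4 are the ones used for this curve earlier in the section.

```maple
y := x^2;
rho := 20 - y;                           # distance from the curve up to the line y = 20
ds := sqrt(1 + diff(y, x)^2);            # same ds as before; it depends only on the curve
evalf(Int(2*Pi*rho*ds, x = 0 .. 4));     # approximate surface area
```

Only the rho line changes when the axis of rotation changes, which is the point of the discussion above.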
Homework #50
Exercises.
1. Evaluate $2\pi \int_0^4 x\sqrt{1 + 4x^2}\,dx$. This can be done with a substitution. Use Maple to check your answer.
2. Evaluate $2\pi \int_0^4 x^2\sqrt{1 + 4x^2}\,dx$ using Maple.
3. Set up the integral giving the surface area obtained by rotating the curve y = x sin x for 0 ≤ x ≤ π about the line
y = −2. Use Maple to approximate the integral.
Problems.
1. The distance between the point $(x_0, y_0)$ and the line a x + b y = c is
\[ \frac{|a x_0 + b y_0 - c|}{\sqrt{a^2 + b^2}} \]
Use this formula to set up an integral giving the surface area obtained by rotating the curve y = ln x, for 1 ≤ x ≤ 4,
about the line y = 3 x + 1. (Note that “set up” means just that. You aren’t going to be able to evaluate the integral you
get.)
2. This exercise will find the surface area of a sphere of radius R. A circle is what you want to rotate to give a sphere.
The simplest way to express the curve is parametrically. Use x = R cos t, and y = R sin t. Rotate about the x-axis.
(a) What limits should be put on t? This is not an easy question! In order for the problem to work correctly, we
must pick limits on t that specify a section of the curve that, when rotated, covers the sphere completely, but
with no overlap. (Not covering the surface completely means that the answer would be too small. Overlap, on
the other hand, would count some of the surface area more than once, and the result would be too big.)
(b) What is ds? (There will be a substantial simplification using a trigonometric identity.) What is ρ? Set up
the integral and evaluate it. (Look up the surface area of a sphere in some reference book. Does your answer
check?)
Albert: Archimedes also worked out this problem, getting this answer, using his method of exhaustion. I’m
impressed, anyway.
Mugsy: Really? I thought he was strictly simple stuff.
Albert: Every.
Once we have the force on each slice, we add up the forces and get a total force. The force on each slice comes from its
area times its fluid pressure by Archimedes’ law.
This type of problem is notorious for giving you a description of a situation that doesn’t include very much information.
In order to analyze it, you must provide the framework. Although there are numerous reasonable possibilities, I encourage
you to adopt one consistent approach: Let x be depth under the liquid. That means that x increases as you go down (not
up), which is a bit strange, but it turns out to be convenient. The surface of the liquid is then given by x = 0.
The problem is typically stated thus: Find the force exerted on a plate with a certain description submerged a certain
way in a fluid of some density. As indicated earlier, you should let x be depth. The density should be a constant, δ (another
Greek letter, “lower case” delta). The shape and size of the plate (it could be anything from the end of an aquarium tank to
a submarine hatch) will have to be taken into account. It fits into the problem by letting l(x) be the length of a horizontal
strip (differential-thickness sliver) at depth x. While setting up l(x), you should at the same time decide the x-coordinates
of the top and bottom of the plate. They will turn into limits on the integral, representing the smallest and largest values
that are being added up to give the force on the plate.
How do we assemble all of this? We find the force on a strip by looking for the bit of force due to the horizontal
sliver. Since the pressure is essentially constant on the sliver (which is why we took it to be horizontal), the force is
(pressure)×(area).
We get the pressure by multiplying the force per unit volume of the liquid times the depth. That presents problems, since the density of
the liquid is not the force per unit volume. It is the mass per unit volume. However, we can get the force by multiplying by
the acceleration of gravity, usually written g.
Dudley: Al, Mugsy has given up on this. But I still want to try to understand. Can you do something?
Albert: It’s really that F = m a thing. Force is mass times acceleration. So, you multiply the mass times acceleration,
and you get force. In this case, the acceleration is the acceleration of gravity, since gravity is what is causing the force.
Pressure is then (density)× g×(depth) = δ g × x. Area is (length)×(height) = l(x) × dx. The force on the sliver is then
δ g x l(x) dx. Adding all of these up with an integral gives
\[ \text{Hydrostatic force} = \int \delta g\, x\, l(x)\,dx \tag{4.125} \]
We are in the position to make this a usable formula. The limits to be supplied are the smallest and largest values of
x for the plate. In other words, it is the depths of the top and bottom of the plate. The sticky one is l(x). That has to be
worked out with each new problem, and is dependent on the shape and depth of the plate. It is usually a matter of geometry,
and not necessarily simple stuff. The idea is to find the length of the strip as it changes with x = depth. For this reason, I
will spare you the gory details of how this works. In the somewhat unlikely event you ever need to use this, there should be
enough information here for you to work it out.
As an example, let’s find the force on a submarine porthole.
Mugsy: I didn’t think submarines even had portholes.
Albert: They don’t. I think this might be a joke.
Suppose the porthole has a radius of 1 foot, and the center of the porthole is 200 feet below the surface of the water. The
density of sea water is 64 lb/ft$^3$. (Actually, that’s the force-density rather than the mass-density, since pounds are a force. That means
that we can set δ g = 64. So, we can actually ignore one term in that integral.) The trick is with l(x), as always. We need
an equation for it. The equation of a circle of radius 1, with center at (200, 0) (remember that down is positive and x is the
distance!) is $(x - 200)^2 + y^2 = 1^2$. The distance l(x) will go from one side of the circle to the other, in other words, between
the two y-values. Since solving the equation gives $y = \pm\sqrt{1 - (x - 200)^2}$, we get that
\[ l(x) = \left(\sqrt{1 - (x - 200)^2}\right) - \left(-\sqrt{1 - (x - 200)^2}\right) = 2\sqrt{1 - (x - 200)^2} \]
Finally, we need the values for the limits on the integral. Since the largest value of x on the hatch is 201, and the smallest
value of x on the hatch is 199, the limits are 199 to 201. The integral set up is
\begin{align*}
\text{Force} &= \int \delta g\, x\, l(x)\,dx \tag{4.126}\\
&= \int_{199}^{201} 64\, x \cdot 2\sqrt{1 - (x - 200)^2}\,dx \tag{4.127}
\end{align*}
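If you want a number out of this setup, a Maple sketch along the following lines should do it. The names len and F are labels chosen here. The last line is a check that is not in the text: substituting u = x − 200 splits the integrand into an odd part, which integrates to zero, and 200 times the area of the unit circle.

```maple
len := 2*sqrt(1 - (x - 200)^2);               # length of the strip at depth x
F := evalf(Int(64*x*len, x = 199 .. 201));    # force on the porthole, in pounds
evalf(64*200*Pi);                             # symmetry check: 12800*Pi, should match F
```

In other words, the force equals the pressure at the center of the porthole times the area of the porthole, which is a handy sanity check for circular plates.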
Homework #51
Exercises.
1. Find the fluid force on Hoover Dam. Treat it as a rectangle that is 726 feet high and 1244 feet long, with the water
level with the top. Use the “density” of water 62.4 lb/ft$^3$. Convert your answer into tons (2000 lbs = 1 ton).
2. Set up an integral for the fluid force on a porthole of a submarine. Specifically, presume that the porthole is a circle
of radius R, with the center at a depth of D below the surface of the water. Also assume that D > R.
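\[ \int \frac{1}{\sqrt{1 - u^2}}\,du = \operatorname{Arcsin} u + C \]
\[ \int \frac{1}{1 + u^2}\,du = \operatorname{Arctan} u + C \]
\[ \int \frac{1}{|u|\sqrt{u^2 - 1}}\,du = \operatorname{Arcsec} u + C \]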
8. You use substitution to convert integrals to one of these standard forms. You let u be
• The inside of the most complicated part, or
• Any function whose derivative appears as a factor in the integrand, or
• Something more complicated (like a trig substitution), but don’t worry about those.
Remember to change the differential to the new variable, as well as the limits (for definite integrals).
9. You use partial fractions to integrate rational functions (the quotient of two polynomials). The steps are:
(a) Divide the integrand, if necessary, to make sure that the expression you use partial fractions on is proper (that
is, the degree of the top is less than the degree of the bottom).
(b) Factor the denominator into linear or irreducible quadratic terms.
(c) Set up the correct partial fractions form, which is to put in terms for each factor in the denominator, the number
of terms equalling the degree of the factor, and the numerators being either constants (when the factor is linear
to a power) or linear (when the factor is quadratic to a power).
(d) Solve for the coefficients in the numerators. (You will not be required to do this step by hand.)
10. My approach to integration by parts uses categories of functions. The three categories are:
Category 1. Logarithms and inverse trigonometric functions
Category 2. Polynomials and powers of x
Category 3. Exponentials, sines, and cosines
The formula for integration by parts is
\[ \int u\, dv = u v - \int v\, du \]
12. Tables of integrals are usually organized by increasingly complicated functions. To use one, pick out the most
complicated term in the integral you want to evaluate, find that section in the table. Then pick out the next most
complicated term in the integral, and locate the subsection (don’t go out of the section) with that kind of term. Keep
doing that until you find the integral you have.
13. Reduction formulas are common in integral tables. You end up using them several times, with different values of the
constants, until you get to an integral you can work directly.
14. There are numerous methods of approximating definite integrals. The ones that we covered (sums based on the right-
endpoints, left end-points, midpoints in Riemann sums, and the trapezoidal rule and Simpson’s rule) are elementary.
You will not have to know how to calculate those by hand.
15. The area between the curve y = f (x) and the x-axis for a ≤ x ≤ b is
\[ \int_a^b | f(x) |\, dx. \]
If the values of a and b are not given to you, solve f (x) = 0 and use the largest and smallest values of x.
16. To integrate the absolute value of a function over an interval a ≤ x ≤ b, you find the places where f (x) = 0, and
discard the points that are not between a and b. You integrate the function over the remaining intervals, and add the
absolute values of the answers.
17. To find the area between two curves $y = f(x)$ and $y = g(x)$ for $a \le x \le b$, calculate
\[ \int_a^b | f(x) - g(x) |\, dx. \]
If the values of a and b aren’t given to you, solve f (x) = g(x) and use the largest and smallest x values.
18. If v(t) is the velocity of an object along the x-axis, then the total distance traveled for $t_1 \le t \le t_2$ is
\[ \int_{t_1}^{t_2} | v(t) |\, dt, \]
20. You convert ds into something that you can integrate by factoring out of it the differential of the independent variable.
Note that when you do that, you end up dividing the $dx^2$ and $dy^2$ by the square of the independent variable’s
differential. The quotient of the squares of two differentials is the square of the derivative. You then take the limits
on the independent variable as the limits on the integral.
21. The formula for the surface area of revolution generated by revolving a curve about a line is $\int 2 \pi \rho\, ds$, where ρ is the
(function giving the) distance between the curve and the axis of rotation, and ds is the same as the ds for arclength.
22. The formula for hydrostatic pressure on the vertical face of a submerged object is
\[ \int \delta\, g\, x\, l(x)\, dx, \]
where x represents depth below the surface of the fluid, δ is the (mass) density of the fluid, g is the acceleration of
gravity (so that δ g = the force density of the fluid), and l(x) is the horizontal length of the object at depth x. You
also need to provide limits of integration, which are the minimum and maximum depths of the object.
23. The new Maple commands from this chapter are:
• sum(function,variable=start..end); which adds up all the values of the function replacing variable
in function successively by the values from start to end.
• int(function,variable); which finds the indefinite integral of function with respect to variable, and
int(function,variable=start..end); which finds the definite integral of function from variable=start
to variable=end.
• Int(function, variable); and Int(function, variable=start..end); do nothing more than for-
mat the integral to print it out on the screen. (These are called the inert forms of the integrals.) To get Maple to
carry out the integration on an inert integral in the previous step, type in value(");.
• convert(function,parfrac,variable); which does a partial fractions expansion of function assuming
that the variable is variable.
• The change of variables command is built into the student package, so before you can use it, you have
to tell Maple with(student):. Then the command changevar(integral, substitution_equation,
new_variable); performs a substitution (change of variable) in the integral, including changing the limits
of a definite integral.
• The integral approximation routines are built into the student package, so all the commands that follow in this
point have to begin with the command with(student):. (The colon suppresses the listing of the different
routines in student.)
rightsum(function, variable=start..end, number_of_intervals);,
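The approximation rules named in item 14 can be sketched in a few lines of Python. This is an illustrative stand-in, not the Maple student package; the function names `riemann`, `trapezoid`, and `simpson` are assumptions made for this sketch.

```python
def riemann(f, a, b, n, where="left"):
    """Riemann sum with n equal slices, sampling at the left end, midpoint, or right end."""
    h = (b - a) / n
    offset = {"left": 0.0, "mid": 0.5, "right": 1.0}[where]
    return h * sum(f(a + (i + offset) * h) for i in range(n))

def trapezoid(f, a, b, n):
    """Trapezoidal rule: the average of the left and right sums."""
    return 0.5 * (riemann(f, a, b, n, "left") + riemann(f, a, b, n, "right"))

def simpson(f, a, b, n):
    """Simpson's rule; n must be even."""
    if n % 2 != 0:
        raise ValueError("n must be even for Simpson's rule")
    h = (b - a) / n
    total = f(a) + f(b)
    total += 4 * sum(f(a + i * h) for i in range(1, n, 2))   # odd-numbered points
    total += 2 * sum(f(a + i * h) for i in range(2, n, 2))   # even interior points
    return total * h / 3

def square(x):
    return x * x

# The integral of x^2 from 0 to 3 is exactly 9.
for name, value in [("left", riemann(square, 0, 3, 6, "left")),
                    ("right", riemann(square, 0, 3, 6, "right")),
                    ("mid", riemann(square, 0, 3, 6, "mid")),
                    ("trapezoid", trapezoid(square, 0, 3, 6)),
                    ("simpson", simpson(square, 0, 3, 6))]:
    print(name, value)
```

With only six slices, the left and right sums bracket the true value, the midpoint and trapezoid values are closer, and Simpson's rule reproduces the integral of a quadratic up to rounding.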
Water Balloons
5.1 Introduction.
5.1.1 What happened?
In previous incarnations of this textbook, chapter 5 was all about how to relate the counter of cassette tapes or VCR tapes to
elapsed time. But fewer and fewer people even own cassette players or VCR tapes, having replaced them with CD-ROMs
and DVDs. So, to keep the text relevant, it has been changed around to something more of a timeless nature: water balloon
launching.
Dudley: Gee, I wonder if they will show how to maximize distance?
Albert: Maybe. It isn’t that hard.
Mugsy: Can this stuff be applied to other things than water balloons?
Dudley: It should. You have something in mind?
Mugsy: Yeah, but you probably don’t want to know. It’s someone, rather than something.
Dudley: You’re right, I don’t want to know.
Mugsy: You never heard of a human cannonball?
Dudley: Was this done voluntarily?
Mugsy: For some definition of voluntarily, yes.
\[ x(t) = v_{0x}\, t + x_0 \]
\[ y(t) = -\tfrac{1}{2} g t^2 + v_{0y}\, t + y_0 \]
Here are the meanings of the variables:
CHAPTER 5. WATER BALLOONS 273
Variable    Meaning
$t$         Time
$v_{0x}$    Initial x velocity
$x_0$       Initial x position
$g$         Acceleration of gravity
$v_{0y}$    Initial y velocity
$y_0$       Initial height
Dudley: AGGGH! Variables! Lotsa variables!
Albert: But you have seen all of them before, Dudley.
Dudley: That doesn’t mean that they didn’t terrify me then, too!
To simplify things, we will assume that the water balloon is launched from ground level, a reasonable assumption.
Dudley: So these equations don’t work if you are launching water balloons out of a dorm room window?
Mugsy: Or off of the roof of the science building?
Albert: I refuse to answer some questions that can get you into serious trouble.
That means that we will assume that y0 = 0. It simplifies the equations immensely.
For this, we are going to want to find the range of the water balloon. As with all parametric equations, we will want
to rephrase the question in terms of the parameter. What is the range of the launcher going to be? Described in terms of
the trajectory of the water balloon, it is the distance between the two points at which the balloon is on the ground (those two
points being where it is launched from and where it lands). Since the parameter is time, questions regarding the parameter
will be phrased with the word when. So, what we want to find is when the balloon is on the ground. There should be two
times. Then, once we have the two times, we have to find the two positions, and from that, find the distance between the
two positions. So, how do we find when the water balloon is on the ground? The defining characteristic of being on the
ground is $y = 0$, so what we want to do is solve $y = 0$ for $t$. Note that $y$ is a quadratic in $t$, so we will get two different values
of $t$, which is just what we want. Remember that $y_0 = 0$.
\begin{align}
0 &= y \tag{5.1} \\
&= -\tfrac{1}{2} g t^2 + v_{0y}\, t \tag{5.2} \\
&= t \left( -\tfrac{1}{2} g t + v_{0y} \right) \tag{5.3}
\end{align}
That gives two equations, $t = 0$ and $-\tfrac{1}{2} g t + v_{0y} = 0$. The first value, $t = 0$, is expected. That would represent the launch
of the balloon. The other equation solves to give $t = 2 v_{0y}/g$. That is the landing time. To get the range, we will need to
know the positions at those times. By plugging into the equation for $x(t)$, we get that at $t = 0$, $x = x_0$ and at $t = 2 v_{0y}/g$,
$x = x_0 + 2 v_{0x} v_{0y}/g$. The difference is the range, $\frac{2 v_{0x} v_{0y}}{g}$.
That is correct, as far as it goes, but isn’t in the most usable form.
Dudley: Hey! Maybe he will show how to maximize distance!
Albert: It’s beginning to look like it.
If we go back to trigonometry we get that v0x = v0 cos(θ ) and v0y = v0 sin(θ ), where θ is the launch angle.
Dudley: AAAUUGGGGH! Another variable!
If we plug those in, and use the trig identity sin(2 θ ) = 2 sin(θ ) cos(θ ), we get that the range is actually
\[ \text{Range} = \frac{(v_0)^2 \sin(2\theta)}{g}. \]
From this, we can see how to maximize distance.
Dudley: At last, what I have been waiting for!
The largest value of the right hand side occurs when sin(2 θ ) = 1, which occurs when 2 θ = π/2, or when θ = π/4.
Dudley: Lessee, π/4 radians is, uh, 45°, right?
Albert: Yes! Congratulations!
Dudley: So launching the water balloon at 45° will maximize the distance it goes?
Albert: Yes and no.
Dudley: Hey, quit being as confusing as the book. Give me an answer I can use.
Albert: If there weren’t any air resistance, then yes, a 45° launch angle will maximize distance. But if you want to
take air resistance into account, you need to lower that angle somewhat. The maximum distance for hitting a baseball
occurs at about a 40° angle.
Mugsy: OK, how would you know that?
Albert: I read a book called The Physics of Baseball.
Mugsy: And I suppose that was pleasure reading for you.
It is also important to notice here that the range is proportional to the square of the launch velocity. That is, doubling
the launch velocity multiplies the range by four. Doing what you can to increase the velocity is clearly important.
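Both claims (the best angle is 45° without air resistance, and doubling the launch velocity quadruples the range) can be checked with a short Python sketch of the range formula. The values g = 9.8 m/s² and the launch speeds 20 and 40 m/s are assumptions chosen only for illustration, as is the function name `projectile_range`.

```python
import math

def projectile_range(v0, theta_deg, g=9.8):
    """Range = v0^2 * sin(2*theta) / g, for a launch from ground level with no air resistance."""
    theta = math.radians(theta_deg)
    return v0 ** 2 * math.sin(2 * theta) / g

# Try every whole-degree launch angle and keep the one with the largest range.
best_angle = max(range(0, 91), key=lambda a: projectile_range(20.0, a))

# Compare the range at double the speed with the original range (same angle).
ratio = projectile_range(40.0, 30) / projectile_range(20.0, 30)

print(best_angle, ratio)
```

The angle search is deliberately crude (whole degrees only); the calculus argument in the text is what actually proves the maximum.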
\[ \text{Range} = \frac{k \sin(2\theta)}{m g}\, x^2. \]
The observation that is most important for us is that the range is quadratic in x, the amount that you pull back on the
launcher.
5.2.3 Data
Suppose we ran an experiment to check this out. You might come up with the following data.
Mugsy: Might come up with this data?
Albert: Correct. Dr. Coulliette can’t find his water balloon launcher to get some real data.
The launch angle is 40°. We are going to try to figure out the launch velocity.
x Range
0 0.
1 1.8
2 7.0
3 15.3
4 26.5
5 40.3
6 56.5
7 74.9
8 95.3
9 117.6
10 141.7
Here, x is the percentage of max stretch divided by 10. (So, an 80% stretch would correspond to x = 80/10 = 8.)
It is worth plotting this.
The second data point has $x = 1$ and a range of 1.8. The error would be $(1.8 - A (1)^2)$. Of course, this depends on $A$, as
it should. For different values of $A$, we get different values of the error for this term. Our work is to find the value of $A$ that
makes this error, and all the other errors as well, small. We square the error and get $(1.8 - A (1)^2)^2$. Similarly, the square
of the error for the third data point is $(7.0 - A (2)^2)^2$. Keep going, and add them up at the end. Fortunately, we have Maple
around. If you add them all up, and expand the result, you get $25333 A^2 - 74150.6 A + 54337.74$. That is the value of the
sum of the squares of the errors.
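The adding-up step can be sketched in Python instead of Maple. The sketch uses the data table as printed above and writes the sum of squared errors as a2·A² − a1·A + a0. The variable names are assumptions, and the middle and constant coefficients it prints need not match the quoted expansion digit for digit, possibly because the printed table is rounded.

```python
# Stretch values x and measured ranges, copied from the data table.
xs = list(range(11))
ranges = [0.0, 1.8, 7.0, 15.3, 26.5, 40.3, 56.5, 74.9, 95.3, 117.6, 141.7]

# SSE(A) = sum over the data of (R - A*x^2)^2 = a2*A^2 - a1*A + a0
a2 = sum(x ** 4 for x in xs)
a1 = 2 * sum(x ** 2 * r for x, r in zip(xs, ranges))
a0 = sum(r * r for r in ranges)

# A parabola in A that opens upward is smallest at its vertex, where dSSE/dA = 0.
A_best = a1 / (2 * a2)

print(a2, round(a1, 2), round(a0, 2), round(A_best, 2))
```

The leading coefficient is just the sum of the fourth powers of the x values, and the vertex lands near A = 1.46.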
\[ \frac{\partial f}{\partial x} = 0 \quad \text{and} \quad \frac{\partial f}{\partial y} = 0 \]
simultaneously for x and y. This gives points (usually more than one) which are the critical points.
Note that what we are doing is similar to the less general method of finding critical points of f (x). We aren’t going
to look at places where a derivative doesn’t exist. Functions of more than one dimension (independent variable) are much
more complicated, and we are forced back to relying on the equivalent of the second derivative test. Places for which the
first derivative is not defined won’t have second derivatives, either, and the second derivative test (the only one we have)
fails. Therefore, we ignore such cases.
Mugsy: You can’t do it so you ignore it?
Albert: That’s the general idea.
To accommodate this new possibility, the terminology needs to change. Maxes become peaks, mins become pits,
saddles become passes (from the idea of a mountain pass), and messes become problems. The possibilities are then peaks,
pits, passes, and problems, retaining our alliterative scheme.
Mugsy: Aw. How cute.
Dudley: Hey, anything that helps me remember is good.
In tribute to some ingenious individual from Fall 2013, there is an alternative way to think about the shape of a saddle
point. Consider the shape of a Pringles (Registered Trademark acknowledged here and for the rest of the chapter) potato
chip. It fits perfectly, in several senses. The shape is just right, and it also fits the alliteration: peaks, pits, Pringles, and
problems.
Mugsy: Why do they do this to me right before lunch? Now I’m hungry.
for the quantity ∆, called the discriminant. The four cases for the critical point are:
Case                              Type of critical point
$\Delta > 0$ and $f_{xx} > 0$     Relative min (pit)
$\Delta > 0$ and $f_{xx} < 0$     Relative max (peak)
$\Delta < 0$                      Saddle (pass)
$\Delta = 0$                      Mess (problem)
For example, take $f(x, y) = x^2 + y^2 + 3 x y$. The partial derivatives are easy: $f_x = 2x + 3y$ and $f_y = 2y + 3x$. Setting these
both equal to zero gives that the only critical point is $(0, 0)$. Next, $f_{xx} = 2$, $f_{yy} = 2$, and $f_{xy} = 3$,
so
\[ \Delta = (2)(2) - (3)^2 = -5 < 0 \]
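The four-case table can be written as a small Python function. This is a minimal sketch assuming the discriminant has the form Δ = f_xx·f_yy − (f_xy)², which is how the worked example computes it; the function name `classify` is an assumption.

```python
def classify(fxx, fyy, fxy):
    """Second derivative test at a critical point, using the book's alliterative names."""
    delta = fxx * fyy - fxy ** 2
    if delta > 0:
        # Same sign of curvature in every direction: bowl up (pit) or bowl down (peak).
        return "pit" if fxx > 0 else "peak"
    if delta < 0:
        return "pass"      # saddle
    return "problem"       # the test gives no information

# The example f(x, y) = x^2 + y^2 + 3xy has fxx = 2, fyy = 2, fxy = 3 at (0, 0).
print(classify(2, 2, 3))
```

For the example this prints `pass`, matching the negative discriminant found above.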
The diamonds are the points, the curve that starts lower and ends higher is the graph of the least squares value using $A x^2$,
while the other curve is the graph of the least squares value using $A x^2 + B x$. It is fairly clear that the second curve fits better.
That is, the value of $A$ is $(v_0)^2 \sin(2\theta)/(100 g)$. Once we know the value of $A$ (and we actually have two choices for it
now!), we can get a value (or two potential values) of $v_0$.
Using $A = 1.46$, we get $v_0 = \sqrt{\frac{100 A g}{\sin(2\theta)}} = 38.1$, while using $A = 1.26$ gives $v_0 = 35.45$.
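The conversion from A to launch velocity is a one-line formula, sketched here in Python. The value g = 9.8 (metric units) is an assumption that is not stated in this passage, although it does reproduce the first figure quoted; small differences from the quoted numbers can come from rounding A. The name `launch_speed` is hypothetical.

```python
import math

def launch_speed(A, theta_deg=40.0, g=9.8):
    """Solve A = v0^2 * sin(2*theta) / (100*g) for v0."""
    return math.sqrt(100.0 * A * g / math.sin(math.radians(2.0 * theta_deg)))

v_one_term = launch_speed(1.46)   # A from the fit to A*x^2
v_two_term = launch_speed(1.26)   # A from the fit to A*x^2 + B*x

print(round(v_one_term, 1), round(v_two_term, 1))
```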
Further improvements
Well, if we can make better approximations with more terms, why don’t we go ahead and try to fit the range to $A x^2 + B x + C$?
We can certainly set up the equation for SSE(A, B,C) simply enough. And to minimize it, we would set all three partial
derivatives equal to 0 and solve for A, B, and C. You would get (with Maple’s help again)
Mugsy: I’m actually getting to the point where I don’t shudder when I hear the word Maple. That program is actually
handy!
that A = 1.22, B = 2.16 and C = 1.29. The sum of the squares of the errors is now 7.3896.
Yes, this is technically more accurate, but the improvement is a lot less. Adding in B x dropped the sum of the squares of
the errors down from 77.6868 to 10.2483, an 86% drop. The next drop, from 10.2483 to 7.3896, is only 28%. The question
is whether it is worth it.
There is another consideration. The one data point that we can be absolutely sure of is (0, 0): If you have zero velocity
at the launch, the water balloon won’t go anywhere. If you have a non-zero value of C, the fitted curve won’t go through
that point. On that basis, there is some rationale for not putting in the C term.
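Setting the partial derivatives of SSE equal to zero gives a small linear system. For the two-term model $A x^2 + B x$ (the one that keeps the curve through (0, 0)), a minimal Python sketch of that step looks like this. It uses the data table as printed and solves the 2-by-2 system by Cramer's rule; the variable names are assumptions, and this is not the book's Maple session.

```python
xs = list(range(11))
ranges = [0.0, 1.8, 7.0, 15.3, 26.5, 40.3, 56.5, 74.9, 95.3, 117.6, 141.7]

# dSSE/dA = 0 and dSSE/dB = 0 reduce to:
#   A*sum(x^4) + B*sum(x^3) = sum(x^2 * R)
#   A*sum(x^3) + B*sum(x^2) = sum(x * R)
s4 = sum(x ** 4 for x in xs)
s3 = sum(x ** 3 for x in xs)
s2 = sum(x ** 2 for x in xs)
sx2r = sum(x ** 2 * r for x, r in zip(xs, ranges))
sxr = sum(x * r for x, r in zip(xs, ranges))

det = s4 * s2 - s3 * s3
A = (sx2r * s2 - s3 * sxr) / det
B = (s4 * sxr - s3 * sx2r) / det

# Sum of squared errors for the fitted curve.
sse = sum((r - (A * x * x + B * x)) ** 2 for x, r in zip(xs, ranges))

print(round(A, 2), round(B, 2), round(sse, 2))
```

Adding a constant term C would turn this into a 3-by-3 system of the same kind, which is where a computer algebra system starts to earn its keep.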
Homework #52
Exercises.
1. Find the critical points of the following functions and classify them (a max, min, saddle, or mess):
(a) $x^2 + 2y^2 + 6x + 8y + 12$
(b) $x^2 - 2y^2 + 6x + 8y + 12$
(c) $x^2 + 2y^2 - 6x + 8y + 12$
(d) $x^3 + 3y^2 - 6xy$
2. Find the critical points of the following functions and classify them (a max, min, saddle, or mess):
(a) $3x^2 + 2y^2 + 6x + 8y + 12$
(b) $3x^2 - 2y^2 + 6x + 8y + 12$
(c) $3x^2 + 2y^2 - 6x + 8y + 12$
(d) $x^3 + 3y - 12xy^2$
Problems.
1. Let $x_j$ take on the values 5, 1, 5, 4, and 5. Let $f(x) = \sum_{j=1}^{5} (x - x_j)^2$. Write out $f(x)$ and show that $f(x)$ has a
minimum when $x$ = average of the $x_j$’s. (This is true in all cases, actually.) Do this by setting $f'(x) = 0$ and solving.
2. In this problem, you will fit $y = f(x) = A x + B x^2$ to some data by hand. Use this data: $(x_j, y_j)$ is measured as (0, 0),
(1, 1), (2, 5), (3, 9). There are few enough here so that you can do the algebra with only minimal pain. Also, note
that this is a simple $y = x^2$, with a change in the value at $x = 2$.
(a) Set up SSE(A, B) for these points, and plug all the data in. You should get something like
$((0) - (A (0) + B (0)^2))^2$ + ((rest of the terms)).
(b) Differentiate SSE with respect to both A and B. Don’t forget the chain rule!
(c) Set both the derivatives to zero and solve simultaneously. (You will end up with numbers in the hundreds. Don’t
panic, but be careful.)
(d) Use those values for $A$ and $B$ to evaluate $A x + B x^2$ at $x = 1$, 2, and 3, and compare with the values in the original
data. (If you are correct, you should get numbers close, but not equal, to 1, 5, and 9.)
(e) Use the second derivative test on SSE to decide if this is a maximum, minimum or saddle.
3. The general quadratic in x and y looks like $f(x, y) = A x^2 + B x y + C y^2 + D x + E y + F$ (not all of A, B, and C are
zero). This problem asks (and answers) various questions about what its relative maxes and mins look like.
(a) Find $\partial f/\partial x$ and $\partial f/\partial y$, and set them both equal to 0 and solve for the critical point. Show that if $B^2 - 4AC \neq 0$,
there is a single critical point. (Maple is of use here in solving the equations you get. What you need to show is
that if $B^2 - 4AC \neq 0$, the equations can be solved, meaning the solution Maple gets really exists. To see that,
look at what would happen if $B^2 - 4AC = 0$.)
(b) Show that the critical point in the previous part is either a maximum, minimum or a saddle. (That is, it is not a
mess as long as $B^2 - 4AC \neq 0$.)
(c) An example of $B^2 - 4AC = 0$ is $f(x, y) = x^2 + y$. Show that it has no critical points at all.
(d) Another example of $B^2 - 4AC = 0$ is $f(x, y) = x^2 + 2xy + y^2$. Show that this has lots of critical points, and
show that they are all minimums (of a sort). [Hint: Factor $f(x, y)$.]
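\begin{align}
\frac{d\phi}{dt} &= \frac{1}{1 + (y/x)^2}\, \frac{d}{dt}\!\left( \frac{y}{x} \right) \tag{5.4} \\
&= \frac{1}{1 + (y/x)^2} \cdot \frac{x \frac{dy}{dt} - y \frac{dx}{dt}}{x^2} \tag{5.5} \\
&= \frac{1}{x^2 + y^2} \left( x \frac{dy}{dt} - y \frac{dx}{dt} \right) \tag{5.6}
\end{align}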
There are a few things that we can get from this before we plug in the equations for x and y. Note that dφ /dt is not zero
at the top (maximum height) of the balloon’s arc. How can we tell? At the top of the arc, dy/dt = 0, and plugging that in,
we get that
\[ \frac{d\phi}{dt} = \frac{1}{x^2 + y^2} \left( 0 - y \frac{dx}{dt} \right), \]
which will be negative since all the values of the variables and derivatives are positive, and there is an overall negative sign.
If you draw a picture of the balloon’s arc, you will see that the camera has to pan upwards until some point before the top
of the arc (the point is where the line from the camera to the arc is tangent to the arc), and then will pan down from there to
the balloon’s landing.
Dudley: I can see where it would have been hard to pick up on that if you had just plugged in the equations for x and
y before differentiating.
Mugsy: Not that this made it easy. Or obvious.
Albert: That’s the advantage of using the equations.
Dudley: What are you trying to do, plug calculus as useful?
Albert: Why, yes, I am.
A bit of calculation shows that the balloon is in the air from t = 0 to t = 2.04, so a plot of that range gives this graph:
When you are lucky, such as in the homework exercises (but not the problems) that follow, you are given the equation.
An example of this simpler sort of problem would be profitable.
Dudley: It would also be unusual.
Problem: Suppose $y = x^2 - 5x + 8$, and you know that $dx/dt = 4$ when $x = 1$. Find $dy/dt$ at $x = 1$.
Answer: The equation relating these rates is the chain rule, since we can find dy/dx, but need dy/dt. That is, we want to
change the variable of differentiation. The chain rule says
\[ \frac{dy}{dt} = \frac{dy}{dx} \times \frac{dx}{dt} \]
From the equation, dy/dx = 2x − 5. At x = 1, the value of dy/dx is 2(1) − 5 = −3. The value of dx/dt is 4. Therefore, at
x = 1,
\[ \frac{dy}{dt} = \frac{dy}{dx} \times \frac{dx}{dt} = -3 \times 4 = -12 \]
That’s the answer! Seem too simple? It really is (except when you have to come up with the equation yourself).
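The chain-rule answer can be sanity-checked numerically. The sketch below invents one particular motion, x(t) = 1 + 4t, purely as an assumption so that x = 1 and dx/dt = 4 at t = 0; any other motion with those two properties would do. It then compares a finite-difference estimate of dy/dt with the chain-rule product.

```python
def y(x):
    return x * x - 5 * x + 8

def x_of_t(t):
    # Hypothetical motion with x(0) = 1 and dx/dt = 4.
    return 1.0 + 4.0 * t

def dydt_numeric(t, h=1e-6):
    """Central-difference estimate of d/dt of y(x(t))."""
    return (y(x_of_t(t + h)) - y(x_of_t(t - h))) / (2 * h)

dydx_at_1 = 2 * 1 - 5          # dy/dx = 2x - 5 evaluated at x = 1
chain_rule = dydx_at_1 * 4     # times dx/dt

print(chain_rule, round(dydt_numeric(0.0), 6))
```

Both numbers should come out at −12, which is the value found above.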
Finally, let me do an example of a related rates problem stated in words. It is more difficult than any problem you are
liable to have to work.
Problem: In his never-ending search to rid the universe of squirrels, Fang had gotten Dudley to catapult zucchini squashes
at night under the street light, since he wants to learn to track falling objects by just their shadows. (Dudley has an
overabundance of zucchini, so he doesn’t mind.) The street light is 7 meters directly above the catapult. The catapult
launches zucchini with initial horizontal velocity of 2 m/s and initial vertical velocity of 10 m/s. Find how fast the shadow
of the zucchini is traveling when the zucchini reaches the top of its path.
Answer: There are all kinds of hassles in working this problem. There are a lot of equations, and very few of them are given
to you. The main thing (after wasting an enjoyable few minutes trying to get the ideal picture) is to try to get an equation
for the position of the shadow. There are also three variables, instead of the usual two.
We set things up to make them as simple as possible. We put the origin of our coordinate system right at the catapult,
so the street light is at position (0, 7). If the zucchini is at position (x, y) (which we can figure out from the equations of
ballistic motion that we had in the last chapter—but not yet), then where is the shadow of the zucchini? For that, we
have to use similar triangles. (Stare at the picture; it really can help.)
There are two triangles we use. One is formed by the light, down to the catapult, and then to the zucchini’s shadow. The
other is formed by the zucchini, down to the point immediately below it on the ground, and then to the zucchini’s shadow.
These triangles are similar. The light to the catapult is 7 m, and the catapult to the shadow is s. The zucchini down to the
ground is y and that point to the shadow is s − x.
The ratios from similar triangles give this:
\[ \frac{7}{s} = \frac{y}{s - x}. \]
Solve that for $s$ and you get $s = \frac{7x}{7 - y}$.
The rate at which the shadow is moving is ds/dt. From the solution above, we get that by the quotient rule:
\[ \frac{ds}{dt} = \frac{(7 - y) \left( 7 \frac{dx}{dt} \right) - (7x) \left( -\frac{dy}{dt} \right)}{(7 - y)^2} \]
We “only” need to fill in the values of x, y, dx/dt, and dy/dt to get the value of ds/dt.
The equations of ballistic motion give $x = 2t$ and $y = -4.9 t^2 + 10 t$. If we can find the value of $t$, we can plug in and
get everything we need. The value of t requested is “when the zucchini reaches the top of its path.” How do we find that?
If you think back
Mugsy: Or look back, in my case
you will remember that the top of the path occurs at the value of $t$ for which $dy/dt = 0$. In this case, that is $-9.8t + 10 = 0$,
or $t = 10/9.8 \approx 1.02$. For that value of $t$, $x = 2t = 2.04$, $y = 5.10$ (safely below the light), $dx/dt = 2$, and
$dy/dt = -9.8t + 10 = 0$. In that case, we simply plug it all in and get
\[ \frac{ds}{dt} = \frac{(7 - 5.10)(7 \cdot 2) - 0}{(7 - 5.10)^2} = \frac{14}{1.90} \approx 7.4 \text{ m/s}. \]
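Plugging these values into the quotient-rule expression can be checked numerically. The Python sketch below evaluates the formula at the top of the path and, as a cross-check, differentiates the shadow position s = 7x/(7 − y) directly by a finite difference. The function name `shadow` and the step size are assumptions made for illustration.

```python
g = 9.8

def shadow(t):
    """Shadow position s = 7x / (7 - y) with x = 2t and y = -4.9 t^2 + 10 t."""
    x = 2.0 * t
    y = -0.5 * g * t ** 2 + 10.0 * t
    return 7.0 * x / (7.0 - y)

t_top = 10.0 / g                       # time at which dy/dt = 0
x = 2.0 * t_top
y = -0.5 * g * t_top ** 2 + 10.0 * t_top
dxdt = 2.0
dydt = -g * t_top + 10.0               # zero, up to rounding

# Quotient-rule expression for ds/dt from the text.
ds_dt = ((7.0 - y) * (7.0 * dxdt) - (7.0 * x) * (-dydt)) / (7.0 - y) ** 2

# Independent check: central difference of s(t).
h = 1e-6
ds_dt_numeric = (shadow(t_top + h) - shadow(t_top - h)) / (2 * h)

print(round(ds_dt, 2), round(ds_dt_numeric, 2))
```

Both values come out a little under 7.4 m/s, so the shadow is moving several times faster than the zucchini's horizontal speed of 2 m/s.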
Homework #53
Exercises.
1. Dudley has just been pulled over by a police officer, who has used a radar gun to clock Dudley at 63 m.p.h. in a
55 m.p.h. zone. However, the officer was on a side road at the time, and Dudley wants to argue that his velocity
as measured by the gun (the rate at which the distance between Dudley and the gun was changing) was not the
same as the speed Dudley was actually going along the road, so Dudley shouldn’t be ticketed. The various items
of information that need to be used are these: The officer was sitting on a side road 0.2 mile from the road Dudley
was on, and the radar caught Dudley when he was 0.5 mile from the intersection of the two roads (which are straight
and meet perpendicularly). We want to figure out how fast Dudley was going using only this information. Set up
variables s = distance between Dudley and the radar gun, x = distance between Dudley and the intersection.
(a) What is the equation that relates s to x? [Hint: Draw a picture and use the Pythagorean theorem. This problem
does not use similar triangles.]
(b) What is ds/dx? What equation relates ds/dx and ds/dt?
(c) What is the value of s when x = 0.5?
(d) What is dx/dt when x = 0.5? (This is the speed Dudley was actually going.) Did Dudley “earn” the ticket or
was he right to question the value the officer had?
(e) Show that the speed that the radar gun measures in this situation is always going to be less than the actual speed.
Do this by taking the equation relating ds/dt and dx/dt before you plugged in values for s and x, and using the
fact that x < s.
2. Fang prides himself on being able to dig precisely hemispherical holes. If he can excavate at 0.3 m$^3$ per minute
(meaning that Fang is digging dirt out of the hole at exactly that rate), how fast is the radius of the hole changing
when it is 2 m? (This is a very typical related rates problem. Work first on getting the picture, then get the equations.)
3. The fact that roosters do not swim has not kept Bill from enjoying that activity. (And if people try to remind Bill that
ducks do swim, he pretends that he has water in his ears and can’t hear them.) One day, Bill landed near the center
of a pond, and the ripples from his landing spread out in a perfectly circular pattern, receding from Bill at 4 m/s.
How fast was the area of that circle increasing when the radius was 10 m? (This is also a very typical related rates
problem.)
which is the sum of the vertical displacements from the line, which are the errors in the measurements.
Note the change in character of the letters! At this point, the $x_j$ and the $y_j$ are the numbers (constants) taken from
the data. That shouldn’t be too much of a surprise since subscripts on variables like x and y often indicate that they are
constants. The variables, the things that will change as we are looking for the minimum, are m and b. That is, we are trying
to look for the “best” slope and intercept.
Remember that $y_j$ = actual y-value of a data point, while $m x_j + b$ = predicted y-value of the data point, and
$y_j - (m x_j + b)$ = difference between these, the error in the prediction.
\[ m = \frac{n \sum (x_j y_j) - (\sum x_j)(\sum y_j)}{n \sum (x_j^2) - (\sum x_j)^2} \]
\[ b = \frac{(\sum x_j^2)(\sum y_j) - (\sum x_j)(\sum x_j y_j)}{n \sum (x_j^2) - (\sum x_j)^2} \]
n = 17
Year Time
1954 239.4
1954 238.0
1957 237.2
1958 234.5
1962 234.4
1964 234.1
1965 233.6
1966 231.3
1967 231.1
1975 231.0
1975 229.4
1979 229.0
1980 228.8
1981 228.5
1981 228.4
1981 227.3
1985 226.3
\[ \sum x_j = 266, \quad \sum y_j = 263.97, \quad \sum x_j^2 = 5954, \quad \sum x_j y_j = 4172.7 \]
This gives values of $m = 0.0236$ and $b = 15.16$. The predicted equation is then
\[ v = 0.0236\, x + 15.16 \]
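The sums and the fitted line can be reproduced from the table with a short Python sketch. Two things are assumed here rather than stated in this passage: the Time column is in seconds, so the velocity in miles per hour is 3600/time, and each velocity is rounded to two decimal places before summing (that rounding appears to be what matches the quoted sums). The variable names are illustrative.

```python
years = [1954, 1954, 1957, 1958, 1962, 1964, 1965, 1966, 1967,
         1975, 1975, 1979, 1980, 1981, 1981, 1981, 1985]
times = [239.4, 238.0, 237.2, 234.5, 234.4, 234.1, 233.6, 231.3, 231.1,
         231.0, 229.4, 229.0, 228.8, 228.5, 228.4, 227.3, 226.3]

xs = [year - 1954 for year in years]            # years after 1954
ys = [round(3600.0 / t, 2) for t in times]      # assumed: mi/hr, rounded to 2 places

n = len(xs)
sx = sum(xs)
sy = sum(ys)
sxx = sum(x * x for x in xs)
sxy = sum(x * y for x, y in zip(xs, ys))

# Least-squares slope and intercept from the formulas for m and b.
den = n * sxx - sx ** 2
m = (n * sxy - sx * sy) / den
b = (sxx * sy - sx * sxy) / den

print(n, sx, round(sy, 2), sxx, round(sxy, 1))
print(round(m, 4), round(b, 2))
```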
The maximum absolute error in the prediction is 0.12, which occurs in both 1954 and 1966. The percentage error is less
than 1%, more a result of the fact that the velocity is changing very slowly than that this is a good fit.
When would this predict a three-minute mile to be run? That’s not an easy question, because we have changed things
around so much. First thing we’d need to do is find out the velocity needed for a three-minute mile. That’s not bad. It’s
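\begin{align}
v &= \frac{d}{t} \tag{5.10} \\
&= \frac{1 \text{ mile}}{3 \text{ min}} \tag{5.11} \\
&= \frac{1}{3}\ \text{mi/min} \times 60\ \frac{\text{min}}{\text{hr}} \tag{5.12} \\
&= 20 \text{ mi/hr} \tag{5.13}
\end{align}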
Then, we have to find $x$ when $v = 20$. That’s not too bad either:
20 = 0.0236x + 15.16
x = 205
Since x measures years after 1954, this would give 205 + 1954 = 2159 for the year of the first three-minute mile. That’s
not far from the year of the three-minute mile predicted from linear regression based on times. On the other hand, there is
no chance of a zero-minute mile this way, since that would mean velocity is infinite, meaning x would be infinite as well.
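The prediction step is just the fitted line solved backwards. A minimal Python sketch, using the rounded coefficients quoted in the text (the function name `year_for_speed` is an assumption):

```python
M = 0.0236     # slope of the fitted line, mi/hr per year
B = 15.16      # intercept, mi/hr (velocity in 1954)

def year_for_speed(v):
    """Year in which the fitted line v = M*x + B reaches speed v, with x in years after 1954."""
    x = (v - B) / M
    return 1954 + round(x)

# A three-minute mile means (1 mile / 3 min) * 60 min/hr = 20 mi/hr.
three_minute_speed = (1.0 / 3.0) * 60.0
print(year_for_speed(three_minute_speed))
```

Because velocity rather than time is the fitted quantity, a larger target speed always maps to a later, finite year.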
Homework #54
Exercises.
1. Find the estimated time to run 1 mile in 1975, using the equation earlier for velocity, and compare it to the times
listed for 1975.
2. Find the estimated time to run 1 mile in 1981, using the equation earlier for velocity, and compare it to the times
listed for 1981.
Problems.
1. In this problem, you will fit $y = f(x) = m x + b$ to some data by hand. Use this data: $(x_j, y_j)$ is measured as (0, 0),
(1, 1), (2, 5), (3, 9). (Note that this is the same data as used to fit a quadratic before. You can fit any data you want
to any function you want.) There are few enough here so that you can do the algebra with only minimal pain. Also,
note that this is a simple $y = x^2$, with a change in the value at $x = 2$.
(a) Find $n$, $\sum x_j$, $\sum y_j$, $\sum x_j y_j$, and $\sum x_j^2$ for the data given.
(b) Plug these values into the equations that I gave for m and b in the notes for a linear regression fit.
(c) Give the values of $m x_j + b$ for $x = 0$, 1, 2, and 3, and compare them to the data given.
2. The average values of x and y for the data are what you are used to, namely $\bar{x} = \frac{1}{n} \sum x_j$ and $\bar{y} = \frac{1}{n} \sum y_j$. Plug the
formulas for $\bar{x}$ and $m$ and $b$ into $m \bar{x} + b$ and show that it reduces algebraically to the formula for $\bar{y}$. (This means that
$\bar{y} = m \bar{x} + b$ is always true for linear regression. Since $\bar{x}$ and $\bar{y}$ are usually simple to find, this equation will often get
solved to give $b = \bar{y} - m \bar{x}$, which is used to get b once you’ve found m.)
3. Suppose there are only two data points, $(x_1, y_1)$ and $(x_2, y_2)$. Show that the slope and intercept (m and b) of the least
squares line $y = m x + b$ are the same as the slope and intercept of the line that passes through the two points. You can
use the equations I gave for m and b. You will have some algebra ahead of you. (In other words, in this case the least
squares line is the line through the two points.)
\[ m = \frac{n \sum (x_j y_j) - (\sum x_j)(\sum y_j)}{n \sum (x_j^2) - (\sum x_j)^2} \]
\[ b = \frac{(\sum x_j^2)(\sum y_j) - (\sum x_j)(\sum x_j y_j)}{n \sum (x_j^2) - (\sum x_j)^2} = \bar{y} - m \bar{x} \]
(b) SET UP the integral to find the length of the $y = x^3 + 2$ curve (the arc length) between points A and B (at the outer two
intersection points). Don’t evaluate the integral.
III. (10 points) Solve the following initial value problem: $\frac{dy}{dt} = t \cos(t^2)$, $y(0) = 1$.
V. (10 points) Suppose that the average yearly cost per item for producing x items of a business product is C(x) =
10 + (100/x). If the current production is x = 10 and production is increasing at a rate of 2 items per year, find the
rate of change of the average cost.
VI. (10 points) Given the data {(0, 0), (1, 3), (2, 7)}, find the two equations that you would solve to find the constants A and
B that would give the best least-squares fit of $y = A + B \sin x$ for this data. Recall that the least-squares error expression for
this problem is $E(A, B) = \sum_{i=1}^{3} [y_i - (A + B \sin(x_i))]^2$. DO NOT SOLVE FOR A AND B! Write your answer as a linear
system for A and B, i.e., (constant) = (constant)A + (constant)B, (constant) = (constant)A + (constant)B. (Note these constants
may have different values.)
II. (10 points) In an attempt to impress the women of Asbury with their environmental sensitivity and excellent personal
hygiene, the HR Math Men have built a clothesline by stretching a rope between two poles. Dustin the Math Ninja correctly
points out to a female admirer that the line assumes the shape of a catenary curve with equation $y = 5 (e^{x/10} + e^{-x/10})$, where
$-10 \le x \le 10$. Set up the integral that would calculate the amount of rope used in this project.
III. (10 points) If a water balloon is shot from $x = 0$ and $y = 0$ with an initial x velocity of 20 feet per second and an initial y
velocity of 64 feet per second, how far will it travel in the x direction if it lands at the same level? Use the equations ($g = 32$
feet/sec$^2$) $v_y = v_{0y} - g t$, $y = y_0 + v_{0y} t - \frac{1}{2} g t^2$, $x = x_0 + v_{0x} t$.
IV. (10 points) Solve the following initial value problem: $\frac{dy}{dx} = x \sin(x^2)$, $y(0) = 1$.
V. (40 pts, 10 points each) Evaluate the following integrals.
(a) $\int_{\pi/2}^{\pi} \frac{4 \cos t}{(\sin t + 1)^2}\, dt$ (b) $\int (x^2 + 2x + 3) \sinh(3x)\, dx$ (c) $\int \frac{1}{x^2 + x - 2}\, dx$ (d) $\int x^2 \ln x\, dx$
VI. (10 points) If $x y^2 - \ln(x y) = x$ and $\frac{dy}{dt} = 3$ at $x = 1$, $y = 1$, find $\frac{dx}{dt}$ at this same point.
VII. (10 points) Find all of the local min/max/saddle points for $f(x, y) = x^3 + y^3 - 3xy$.
VIII. (10 pts) Use integration to show that the area of a right triangle is equal to (1/2)*base*height.
I. (10 points; 5 points each) Find the values of the following summations:
−1 3
k+1
a.) ∑ 2 j+4
b.) ∑ 2
i=−5 k=−2 k − 2
II. (15 points; 5 points each) Find all the first partial derivatives of $f(x, y, z) = \frac{x y^2 z}{\sin(2xy + 3z)}$
III. (10 points) Find y if $\frac{dy}{dx} = 3x^2 - 6$ and $y(0) = 4$.
IV. (15 points; 5 points each) Find all three critical points of $f(x, y) = 2x^2 + y^2 - x^2 y - 5$, and classify all of them.
V. (10 points) Find the area between the curves $y = x^2 + x - 2$ and $y = 3x - 2$. [Note: An unlabeled sketch of the curves
was provided on the test.]
VI. (15 points; 5 points each) N3RD is going to build a trebuchet (a medieval throwing device) for launching pumpkins.
They could use your help with some of the equations. They need to know the initial speed to throw a pumpkin 300 feet. If
the launch angle is roughly 40°, the equations of motion are $x = 0.766\,v_0 t$ and $y = -16 t^2 + 0.643\,v_0 t$.
(a) Solve y = 0 for t. The positive value gives the time that the pumpkin hits the ground after launch. The value of t will
still have a v0 in it, but that is fine right now. (b) Plug that positive value of t into the equation for x. This gives the
distance that the pumpkin travels. It will still have a v0 in it. (c) Set that distance equal to 300 and solve for v0 . This is
the necessary launch speed (in feet per second).
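The three steps above can be sketched numerically as follows. This is only an illustration of the algebra, not a replacement for showing the work; the function names are hypothetical, and the coefficients 0.766, 0.643, and 16 are taken from the equations of motion in the problem.

```python
import math

COS_COEFF = 0.766  # coefficient of v0*t in the x equation (from the problem)
SIN_COEFF = 0.643  # coefficient of v0*t in the y equation (from the problem)
HALF_G = 16.0      # coefficient of t^2 in the y equation, in ft/sec^2

def flight_time(v0):
    # Step (a): -16 t^2 + 0.643 v0 t = 0 has the positive root t = 0.643 v0 / 16.
    return SIN_COEFF * v0 / HALF_G

def distance(v0):
    # Step (b): substitute that time into x = 0.766 v0 t.
    return COS_COEFF * v0 * flight_time(v0)

def launch_speed(target):
    # Step (c): solve 0.766 * 0.643 * v0^2 / 16 = target for the positive v0.
    return math.sqrt(target * HALF_G / (COS_COEFF * SIN_COEFF))

v0 = launch_speed(300.0)
print("launch speed: %.2f ft/sec" % v0)
print("check distance: %.2f ft" % distance(v0))
```

Because the distance grows like $v_0^2$, doubling the launch speed would quadruple the range in this model.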
VII. (30 points; 10 points each) Evaluate the following integrals.
(a) $\int x^3 e^{2x}\,dx$ (b) $\int \sin(\sin x) \cos x\,dx$ (c) $\int \frac{6x^2 - 3x + 6}{x\,(x + 1)\,(x - 2)}\,dx$
I. (10 points; 5 points each) Find the numeric values for the following summations:
(a) $\sum_{l=3}^{8} (l + 3)^2$ (b) $\sum_{n=1}^{6} n \cos(n\pi)$
II. (10 points) Solve the following initial value problem: $\frac{dy}{dt} = t^2 + 2t + 1$, $y(0) = 1$.
III. (20 points; 5 points each) One of Dr. C’s favorite (yet, admittedly perverse) activities is scaring students as they enter
calc class. As Jacob entered class on a sleepy Monday morning, Dr. C leaped from his hiding spot and yelled, ’Recall.’
Jacob, of course, jumped. His initial horizontal velocity was 5 ft/sec and his initial vertical velocity was 16 ft/sec. Start
t = 0 at the moment he launched. The equations for ballistic motion are $x(t) = v_{0x} t + x_0$ and $y(t) = -\frac{1}{2} g t^2 + v_{0y} t + y_0$;
use the value $g = 32$ ft/sec$^2$.
(a) When did Jacob land back on the ground? (b) How far did he travel horizontally? (c) When did he reach the
top of his jump? (d) How high did he go?
IV. (25 points, 5 points each) For each of the following four integrals, give the method that should be used to begin to
evaluate it (the first step), and the appropriate information about the method. The possible methods, with the corresponding
information are:
Method                  Information
Substitution            u = ?
Partial fractions       Setup
Integration by parts    u = ?
(a) $\int x^2 e^{x^3}\,dx$ (b) $\int \frac{\mathrm{Arctan}(\ln x)}{x}\,dx$ (c) $\int \frac{6x^2 - 15x + 22}{(x + 3)\,(x^2 + 2)^2\,(x + 7)^3}\,dx$ (d) $\int (x^2 + 3x + 2) \sinh(2x)\,dx$
(e) Pick one of these four integrals and work it out completely.
V. (20 points) Find all three critical points of $f(x, y) = 2x^2 + y^2 - x^2 y - 5$ and classify all of them.
VI. (10 points) A reactor plant operator must monitor the flow rate through the nuclear core to assure adequate cooling.
When a flow sensor fails, the engineer on watch uses an energy conservation principle to develop the following relationship
between coolant flow rate (Q) and pressure (p) at another monitoring point in the plant: $24 p^2 + 6 Q = 18$. On the next watch
(2 hours later), the operator observes that the pressure (p) is 800 pounds per square inch (psi) and is increasing at a rate of
0.01 psi per hour ($dp/dt = 0.01$). How is the coolant flow rate changing at this time (i.e., find $dQ/dt$ at this monitoring
point at this time)? Assume that all relevant units are consistent to give the rate of change.
VII. (10 points) In a typical display of caring and sensitivity, Jess offered to carry Jordan’s backpack to his classes while
he was recovering from a sprained ankle. As part of her fitness program, Jess maintained careful records of her trips. She
described her path with the following functions: x(t) = cos(t/6), y(t) = sin(t/4). SET UP the integral that would determine
the length Jess walked in the noble endeavor from t = 0 to t = 720.
VIII. (10 points Bonus) Find the area between the curves $y = x^3 - 2x$ and $y = x^2$. See the following sketch.
I. (10 points; 5 points each) Find the numeric values for the following sums.
(a) $\sum_{k=2}^{5} \frac{k^2 - 4}{k + 1}$ (b) $\sum_{j=-1}^{1} (j^2 - 3j + 2)$
II. (10 points) Solve the following IVP: $\frac{dy}{dt} = t^3 \cos(t^4)$, $y(0) = 10$.
III. (15 points; 5 points each) With the abundant free time available due to an unusually slow social season, the Asbury
math men built a catapult. After completion, they learned that the projectile would take off with a vertical velocity of 48
ft/sec and a horizontal velocity of 18 ft/sec from ground level. Start with t = 0 at launch time and use the ballistic motion
equations to answer the following questions.
(a) Find the total flight time of the projectile from launch until it hits the ground. (b) What’s the range, i.e., how far will
it go horizontally in this time? (c) On a sunny spring afternoon, our heroes hear that a group of women are studying on
the Student Center deck. They decide to impress these young ladies by launching a container filled with difficult proofs. To
maintain anonymity (and because they couldn’t push the catapult any closer), they launched from a distance of 20 ft from
the deck. They were concerned that their projectile would clear the 12 ft rail. Calculate how long it will take to reach this
distance and then calculate the altitude of the object at that time to decide if the projectile will clear the rail.
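The three parts of the catapult question can be checked numerically with a short sketch. It assumes the ballistic motion equations $x(t) = v_{0x} t + x_0$ and $y(t) = -\frac{1}{2} g t^2 + v_{0y} t + y_0$ with $g = 32$ ft/sec$^2$, as used elsewhere in these tests; the function and variable names are hypothetical.

```python
G = 32.0  # ft/sec^2, the value used throughout these tests

def height(t, v0y, y0=0.0):
    # y(t) = y0 + v0y*t - (1/2) g t^2
    return y0 + v0y * t - 0.5 * G * t * t

def flight_time(v0y):
    # For a launch and landing at ground level, y(t) = 0 gives t = 2*v0y/g.
    return 2.0 * v0y / G

V0X, V0Y = 18.0, 48.0        # launch velocities from the problem, ft/sec
t_total = flight_time(V0Y)   # part (a)
rng = V0X * t_total          # part (b)
t_rail = 20.0 / V0X          # part (c): time to cover 20 ft horizontally
y_rail = height(t_rail, V0Y)
clears_rail = y_rail > 12.0

print(t_total, rng, round(t_rail, 4), round(y_rail, 2), clears_rail)
```

The height at the rail comes out far above 12 ft, so the projectile clears it easily in this model.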
IV. (40 points; 10 points each) Find the following integrals
(a) $\int \frac{1 - \sin(x)}{x + \cos(x)}\,dx$ (b) $\int \ln(4x)\,dx$ (c) $\int \frac{2x - 1}{x^2 - 5x + 6}\,dx$ (d) $\int_0^{\pi/4} x^3 \sin(2x)\,dx$
V. (15 points) Find and classify all three critical points of $f(x, y) = x^2 + 2xy^2 + 6y^2 - 2x$
VI. (15 points) Find the total area between $y = x^4 + 2x^3 - 3x^2 - 8x - 4$ and the x-axis. Hint: $x^4 + 2x^3 - 3x^2 - 8x - 4 =
(x - 2)\,(x + 1)^2\,(x + 2)$. The following graph may be helpful:
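As a check on a hand computation, the total area can be found exactly by integrating between consecutive roots and adding absolute values. The sketch below assumes the roots $-2$, $-1$, $2$ read off from the factored form in the hint; the helper name `integrate_poly` is hypothetical.

```python
from fractions import Fraction

def integrate_poly(coeffs, a, b):
    """Exact integral from a to b of sum(coeffs[k] * x**k), using the power rule."""
    def antiderivative(x):
        return sum(Fraction(c, k + 1) * Fraction(x) ** (k + 1)
                   for k, c in enumerate(coeffs))
    return antiderivative(b) - antiderivative(a)

# x^4 + 2x^3 - 3x^2 - 8x - 4, coefficients listed from the constant term up.
coeffs = [-4, -8, -3, 2, 1]
roots = [-2, -1, 2]  # from the factorization (x - 2)(x + 1)^2 (x + 2)

# Total area: add |integral| over each interval between consecutive roots.
total_area = sum(abs(integrate_poly(coeffs, lo, hi))
                 for lo, hi in zip(roots, roots[1:]))
print(total_area)
```

Splitting at every root is a safe habit even when, as here, the curve only touches the axis at $x = -1$ without crossing it.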
I. (10 points, 5 points each) Find the numeric values for the following summations:
(a) $\sum_{l=3}^{8} (l + 1)^3$ (b) $\sum_{n=1}^{6} n \cos(n\pi)$
II. (10 points) Find the area between the curves $y = x^3 - 2x$ and $y = x^2$. Refer to the graph.
VII. (10 points, 5 points each) In an effort to impress the women of the calculus class, the calculus math men built a
trebuchet for launching flaming frozen turkeys (as a holiday celebration). After the WPD were called, one officer used his
radar gun to determine that $v_{0x} = 35$ ft/sec and $v_{0y} = 64$ ft/sec. The trebuchet launched the birds from a height of 4 ft. Use
the equations of ballistic motion
$x(t) = v_{0x} t + x_0$ and $y(t) = -\frac{1}{2} g t^2 + v_{0y} t + y_0$
to answer the following questions.
(a) How far up (maximum height) did the turkey fly? (b) How far away did the turkey land?
I. (10 points; 5 points each) Find the numeric values for the following summations.
(a) $\sum_{l=1}^{3} (l - 3)^2$ (b) $\sum_{n=1}^{4} (n^2 - 3n + 1)$
II. (10 points) Find the area between the curves $y = x \sqrt{1 - x^2}$ and $y = x/2$.
III. (10 points) Solve the following initial value problem: $\frac{dy}{dt} = t \sqrt{t^2 + 3}$, $y(0) = 1$.
IV. (40 points; 10 points each) Evaluate the following four integrals.
(a) $\int_0^{\pi/2} x^3 \sin(4x^4)\,dx$ (b) $\int (x^2 - 2x - 3)\,e^{2x}\,dx$ (c) $\int \frac{3x^2 - 4x + 5}{(x - 1)\,(x^2 + 1)}\,dx$ (d) $\int t^2 \sqrt{t + 8}\,dt$
V. (20 points; 10 points each) Given the function $f(x, y) = 4x - 3x^3 - 2xy^2$.
(a) Find the critical points of this function. (b) Categorize these points according to the second derivative test.
VI. (10 points) Given that $\frac{dx}{dt} = 0.2$ at $x = 4$, $y = 1$, and $x^2 y - 2\sqrt{x y} = 17$, find $\frac{dy}{dt}$ at this same point.
VII. (10 points; 5 points each) A projectile (OK, I’m trying to behave here. . . .) is launched from ground level (x = y = 0)
with $v_{0x} = 10$ ft/sec and $v_{0y} = 16$ ft/sec. Use the equations of ballistic motion, $x(t) = v_{0x} t + x_0$ and $y(t) = -(1/2)\,g t^2 +
v_{0y} t + y_0$, where $g = 32$ ft/sec$^2$, $v_x(t) = v_{0x}$, and $v_y(t) = v_{0y} - g t$, to answer the following questions.
(a) What is the maximum height of the projectile? (b) What is its range, i.e., how far will it go in the x direction?
I. (10 points; 5 points each) Find the numeric values for the following summations.
(a) $\sum_{l=4}^{8} (l^2 + 1)$ (b) $\sum_{n=2}^{5} n \sin\left(\frac{n\pi}{2}\right)$
II. (15 points) Find the area between the curves $y = t^3 - 2$ and $y = 2t^2 + 3t - 2$. Reference the following graph.
III. (10 points) Solve the following initial value problem: $\frac{dy}{dx} = x \cos(x^2)$, $y(0) = 1$.
IV. (24 points; 8 points each) Find the following integrals.
(a) $\int x^3 e^{x^4}\,dx$ (b) $\int \ln(5x)\,dx$ (c) $\int (x^2 + 3x + 2) \cosh(3x)\,dx$
V. (4 points) SET UP the partial fraction expansion for $\frac{x^4 - 5x^2 + 22}{(x - 3)\,(x^2 + 1)^2\,(x - 7)^3}$. DON’T determine the coefficients!
VI. (10 points) Given the function $f(x, y) = 2x^2 - y^3 - 2xy$, find all the relative max/min values.
VII. (15 points; 5 points each) One of Dr. C’s favorite integration apps is ballistic motion. (He uses it to model projectile
vomiting as part of a parenting class. Don’t ask. . . .) Due to time constraints, he was not able to cover it this year, but he
thought, ‘Hey! Why not cover it on the exam?’, so here goes. Near the surface of the earth, the acceleration of gravity, g,
is a constant (32 ft/sec$^2$ or 9.8 m/sec$^2$). Since acceleration is the time derivative of velocity or speed (v), we get another of
Dr. C’s favorite things, an IVP, i.e., $\frac{dv}{dt} = -g$, $v(0) = v_0$, where $v_0$ is a given initial speed.
(a) Solve this IVP to get $v(t) = v_0 - g t$. (b) You may be thinking to yourself, ‘Hey! Isn’t velocity the derivative of
distance?’ Yes, so it makes another IVP for distance, y: $\frac{dy}{dt} = v(t)$, with $v(t) = v_0 - g t$ from part (a), and $y(0) = y_0$, where $y_0$
is the given initial distance. OK, you know the drill. Solve this IVP for $y(t)$. (c) Suppose an object is thrown up from
ground level ($y_0 = 0$) with an initial speed of 64 ft/sec. Maximize your equation in part (b) to find how high the object will
fly, i.e., find the maximum distance y. (Use $g = 32$ ft/sec$^2$.)
VIII. (15 points) In an increasingly desperate attempt to court female attention, Team Math Man (male calculus students
who requested to remain anonymous) volunteered to help the college physical plant solve the hot water shortage in
Glide-Crawford. The engineer in charge develops an equation relating the pressure at the boiler (p) to the flow at the
Glide-Crawford (GC) header (Q) as $-p^3 + 6 Q = 18$. During the morning prime shower time, he measures the pressure at the
boiler as 800 pounds per square inch (psi) and the pressure as falling (−) at a rate of 0.1 psi per hour ($dp/dt$). How can
Team Math Man find the rate at which the flow at the GC header is changing ($dQ/dt$) at this time? Find the value of
$dQ/dt$ at this time.
The following were not problems on tests, but were problems that one student requested to work on. They are, in general,
harder than what I would put on a test, but are excellent practice. If you can work these, you will be well-prepared for the
final.
6. Find $\int x \sqrt{a x + b}\,dx$, showing your work.
7. Find $\int_{-1}^{1} \frac{3x - 1}{\sqrt{3x^2 - 2x + 3}}\,dx$.
8. Find $\frac{d}{dx} \left[ \int_{x}^{2x} \sin(t^3)\,dt \right]$.
Note that it is impossible to find the indefinite integral of $\sin(t^3)$, but you don’t need to find it to solve this problem. Show
your work.
9. Sketch the region in the first quadrant above the line $y = 3x - 2$, and below the line $y = 4$. Find the area of the region.
10. Find the length of the curve $y = \frac{1}{3} \sqrt{x}\,(3 - x)$ from $x = 0$ to $x = 3$. Your answer should be a number, which you can
leave in any form you want.
11. Find the area between the curve $y = x^3 - 6x^2 + 8x$ and the x-axis.
These are problems that were on miscellaneous other tests, and are also good practice.
12. Find and classify the critical points of the function $f(x, y) = x^3 + 8y^3 - 6x^2 - 12y^2 + 4$. (There will be four of them.)
13. (15 points total)
a.) (10 points) Find the equation of the linear regression line for the following data:
x y
0 −1
2 2
4 2
6 5
b.) (5 points) Predict the value of y at x = 5.
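A short sketch of the least-squares line for the table above, using exact fractions so the coefficients come out in closed form. It assumes the standard formulas for the slope and intercept of a regression line; the function name `regression_line` is hypothetical.

```python
from fractions import Fraction

def regression_line(points):
    """Least-squares slope and intercept for y = m*x + c, computed exactly."""
    n = len(points)
    sx = sum(Fraction(x) for x, _ in points)
    sy = sum(Fraction(y) for _, y in points)
    sxx = sum(Fraction(x) * x for x, _ in points)
    sxy = sum(Fraction(x) * y for x, y in points)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept

data = [(0, -1), (2, 2), (4, 2), (6, 5)]  # the table from the problem
slope, intercept = regression_line(data)
prediction = slope * 5 + intercept        # part b.), the value at x = 5
print(slope, intercept, prediction)
```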
Summary sheet
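Second derivative test in two variables:
For $D = \left( \frac{\partial^2 f}{\partial x\,\partial y} \right)^2 - \frac{\partial^2 f}{\partial x^2}\,\frac{\partial^2 f}{\partial y^2}$,
$D > 0$, the critical point is a saddle.
$D < 0$ and $\frac{\partial^2 f}{\partial x^2} > 0$, the critical point is a relative min.
$D < 0$ and $\frac{\partial^2 f}{\partial x^2} < 0$, the critical point is a relative max.
$D = 0$ is a mess (any of these, or worse).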
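The test on this summary sheet can be sketched as a small function. Note the sign convention: here $D$ is the mixed partial squared minus the product of the pure second partials, so a positive $D$ means a saddle. The function name `classify` is hypothetical, and the example uses $f(x, y) = x^3 + y^3 - 3xy$, whose second partials are $f_{xx} = 6x$, $f_{xy} = -3$, $f_{yy} = 6y$.

```python
def classify(fxx, fxy, fyy):
    """Second derivative test using D = fxy^2 - fxx*fyy at a critical point."""
    d = fxy ** 2 - fxx * fyy
    if d > 0:
        return "saddle"
    if d < 0:
        # d < 0 forces fxx * fyy > 0, so fxx cannot be zero here.
        return "relative min" if fxx > 0 else "relative max"
    return "inconclusive"

def second_partials(x, y):
    # Second partials of f(x, y) = x^3 + y^3 - 3xy.
    return 6 * x, -3, 6 * y

at_origin = classify(*second_partials(0, 0))
at_one_one = classify(*second_partials(1, 1))
print(at_origin, at_one_one)
```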
I. (a) $(3t^2)/(2 e^{2t})$ (b) $-(\frac{1}{2} x e^x y^{-1/2})/(2 z x + (e^x + x e^x)\,y^{1/2} + \sin x)$
II. (a) 9/2 (b) $\int_{-\sqrt{3}}^{\sqrt{3}} \sqrt{1 + 9x^4}\,dx$
III. $y = \frac{1}{2} \sin(t^2) + 1$
IV. (a) $\frac{1}{2}(x^4 + x^2 + 1) \sinh(2x) - \frac{1}{4}(4x^3 + 2x) \cosh(2x) + \frac{1}{8}(12x^2 + 2) \sinh(2x) - \frac{1}{16}(24x) \cosh(2x) + \frac{1}{32}(24) \sinh(2x) + C$
1
(b) −(3 − 2 x − x2 )1/2 +C (c) 3 ln |x − 1 | − 13 ln |x + 2 | +C (d) 2 ln 2 − 43
V. −2
VI. Solve 6 A − 20 + 2 B (sin(1) + sin(2)) = 0 and −2 (3 − A − B sin(1)) sin(1) − 2 (7 − A − B sin(2)) sin(2) = 0.
I. (a) 50 (b) 50
II. $\int_{-10}^{10} \sqrt{1 + \left[ 5 \left( \frac{1}{10} e^{x/10} - \frac{1}{10} e^{-x/10} \right) \right]^2}\,dx$
III. 80 feet
IV. $y = -\frac{1}{2} \cos(x^2) + \frac{3}{2}$
V. (a) $-2$ (b) $\frac{1}{3}(x^2 + 2x + 3) \cosh(3x) - \frac{1}{9}(2x + 2) \sinh(3x) + \frac{2}{27} \cosh(3x) + C$ (c) $-\frac{1}{3} \ln(x + 2) + \frac{1}{3} \ln(x - 1) + C$
(d) $\frac{1}{3} x^3 \ln x - \frac{1}{9} x^3 + C$
VI. 3
VII. Saddle at (0, 0), min at (1, 1)
VIII. Area $= \int_0^b \frac{h}{b}\,x\,dx = \left. \frac{h}{2b}\,x^2 \right|_0^b = \frac{h}{2b}\,(b^2 - 0) = \frac{h b^2}{2b} = \frac{1}{2}\,h b$
I. 143/20 = 7.15
II. y = (1/4) sin(t 4 ) + 10
III. (a) 3 seconds (b) 54 feet (c) t = 10/9 second; 33.5 feet
IV. (a) $\ln(x + \cos(x)) + C$ (b) $x \ln(4x) - x + C$ (c) $5 \ln(x - 3) - 3 \ln(x - 2) + C$ (d) $(3/4)\,(\pi/4)^2 - (3/8) \approx 0.08764$
V. (1, 0) is a relative min; (−3, 2) is a saddle point; (−3, −2) is another saddle point.
VI. 96/5 = 19.2
I. (a) 30 (b) 4
II. 5/24 ≈ 0.2083
III. $y = \frac{1}{3}(t^2 + 3)^{3/2} + (1 - \sqrt{3})$
IV. (a) $-\frac{1}{16} \cos(\pi^4/4) + \frac{1}{16}$ (b) $\frac{1}{2} e^{2x} (x^3 - 2x - 3) - \frac{1}{4} e^{2x} (3x^2 - 2) + \frac{1}{8} e^{2x} (6x) - \frac{6}{16} e^{2x} + C$
(c) $2 \ln|x - 1| - 3\,\mathrm{Arctan}\,x + \frac{1}{2} \ln(x^2 + 1) + C$
(d) $\frac{2}{3} t^2 (t + 8)^{3/2} - \frac{8}{15} t\,(t + 8)^{5/2} + \frac{8}{15} \cdot \frac{2}{7} (t + 8)^{7/2} + C$ or $\frac{2}{7} (t + 8)^{7/2} - \frac{2}{5} \cdot 16\,(t + 8)^{5/2} + \frac{2}{3} \cdot 64\,(t + 8)^{3/2} + C$
V. (a) The critical points are $(0, \sqrt{2})$, $(0, -\sqrt{2})$, $(2/3, 0)$, $(-2/3, 0)$. (b) $(0, \pm\sqrt{2})$ are saddle points. $(2/3, 0)$ is a
relative maximum. $(-2/3, 0)$ is a relative minimum.
VI. −3/28 ≈ −0.1071.
VII. (a) 4 feet (b) 10 feet
Miscellaneous problems
6. $\frac{2}{5a^2} (a x + b)^{5/2} - \frac{2b}{3a^2} (a x + b)^{3/2} + C$
7. $2 - 2\sqrt{2}$.
8. $2 \sin(8x^3) - \sin(x^3)$.
9. 16/3
10. $2\sqrt{3}$
11. 8
12. (0, 0) is a max; (0, 1) is a saddle; (4, 0) is a saddle; (4, 1) is a min
13. (a) $y = \frac{9}{10} x - \frac{7}{10}$ (b) 19/5
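IV. (a) $\frac{1}{4} e^{x^4} + C$ (b) $x \ln(5x) - x + C$ (c) $\frac{1}{3}(x^2 + 3x + 2) \sinh(3x) - \frac{1}{9}(2x + 3) \cosh(3x) + \frac{2}{27} \sinh(3x) + C$
V. $\frac{A}{x - 3} + \frac{B x + C}{x^2 + 1} + \frac{D x + E}{(x^2 + 1)^2} + \frac{F}{x - 7} + \frac{G}{(x - 7)^2} + \frac{H}{(x - 7)^3}$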
VI. Critical point (0, 0) is a saddle (Pringle). Critical point (−1/6, −1/3) is a relative min.
VII. (a) Integrating gives $\int dv = \int (-g)\,dt$, or $v = -g t + C$. Plug in the initial condition to get the value of C: $v_0 =
-g\,(0) + C$, or $C = v_0$. The solution to the IVP is $v(t) = -g t + v_0 = v_0 - g t$. (b) $y = y_0 + v_0 t - \frac{1}{2} g t^2$. (c) 64 feet
VIII. dQ/dt = −80/3 ≈ −26.67
Appendix A
A.1 Chapter 0.
Homework #1
Exercises.
1. (a) is a function, since an input of 1 always produces an output of 1, and there are no other repeated inputs. (b) is not
a function, since an input of 1 produces different outputs. (c) is not a function, since it does not pass the vertical line test.
(d) is a function, since it produces an unambiguous number. (e) is a function, since it produces an unambiguous number.
2. (a) is a function since there are no repeated inputs. (b) is not a function since (for example) an input of 1 could produce
1 or 6. (c) is a function, since it passes the vertical line test. (d) is a function, since it produces an unambiguous number.
(e) is a function, since it produces an unambiguous number.
3. (a) {1, 2, 3, 4}. (b) {1, 2, 3, 4}. (d) The domain is all (real) numbers. (e) The domain is all non-negative numbers.
4. (a) {1, 2, 3, 4, 5, 6} (b) {1, 2, 3} (d) The domain is all non-negative real numbers. (e) The domain is all (real)
numbers.
5. There are 6 parts to the exercises, 3 parts to the problems, and 7 parts on the investigation, for a total of 6 ∗ 1 + 3 ∗ 2 + 7 ∗
3 = 33 points.
Homework #2
Exercises.
1. (a) Has an inverse, given by {(1, 1), (2, 4), (3, 3), (4, 2)}. (b) Has no inverse, since inputs of 2, 3, and 4 all give 1. (c)
Has no inverse, since it fails the horizontal line test. (d) Has no inverse, since inputs of 1 and −1 (for example) both give
1. (e) Has an inverse, given by $x^2$.
2. (a) Has no inverse, since (for example) both 1 and 6 as inputs give an output of 3. (b) Has an inverse, given by
{(1, 1), (2, 2), (3, 3), (4, 3), (5, 2), (6, 1)}. (c) Has no inverse, since it fails the horizontal line test. (d) Has an inverse,
given by $x^2$. (e) Has no inverse, since inputs of 1 and −1 (for example) both produce an output of 1.
3. A function is one-to-one when each output comes from only one input. The inverse comes from interchanging the inputs
and outputs, so if a function is one-to-one then the inverse has the property that each input comes from only one output,
making the inverse a function.
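For functions given as finite sets of ordered pairs, the reasoning above can be sketched in a few lines. The helper names are hypothetical. The set `f` is obtained by swapping the pairs of the inverse listed in Homework #2, Exercise 1(a); the set `g` is a made-up example in which inputs 2, 3, and 4 all give 1.

```python
def is_function(pairs):
    """True when no input is paired with two different outputs."""
    seen = {}
    for x, y in pairs:
        if x in seen and seen[x] != y:
            return False
        seen[x] = y
    return True

def inverse(pairs):
    # Interchange inputs and outputs.
    return {(y, x) for x, y in pairs}

def has_inverse(pairs):
    # A function has an inverse exactly when the swapped relation is a function too.
    return is_function(pairs) and is_function(inverse(pairs))

f = {(1, 1), (4, 2), (3, 3), (2, 4)}
g = {(1, 5), (2, 1), (3, 1), (4, 1)}  # hypothetical: 2, 3, and 4 all give 1

inverse_of_f = inverse(f)
print(has_inverse(f), sorted(inverse_of_f))
print(has_inverse(g))
```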
Homework #3
Exercises.
APPENDIX A. ANSWERS TO HOMEWORK EXERCISES 306
2. (a) $f(g(x)) = 8x^2 + 18x + 9$, $g(f(x)) = 4x^2 - 6x + 3$ (b) $\frac{1}{2} x - \frac{3}{2}$ (c) [Graph omitted.] (d) $g(g^{-1}(x)) =
g(\frac{1}{2} x - \frac{3}{2}) = 2\,(\frac{1}{2} x - \frac{3}{2}) + 3 = x$, and $g^{-1}(g(x)) = g^{-1}(2x + 3) = \frac{1}{2}(2x + 3) - \frac{3}{2} = x$. (e) $0$, $\frac{3}{2}$
Note: The process of including the graphs distorts the axes. This means that angles will not look “right” in the
graphs.
3. $f(f(x)) = \frac{-1 + x}{x}$, $f(f(f(x))) = x$
−
4. f ( f (x)) = − 1−(−1−xx ) = x
1−x
5. (a) −16; 4 a (b) −4
6. (a) 3; a (b) 3
A.2 Chapter 1.
Homework #4
Exercises.
1. (a) y2 = 34; ∆x = 4, ∆y = 36 (b) msec = 9 (c) msec = 2 ∆x + 1 = 9
2. (a) y2 = 1, ∆x = 1, ∆y = 3 (b) msec = 3 (c) msec = 2 ∆x + 1 = 3
3. (a) y2 = 13, ∆x = −3, ∆y = 15 (b) msec = −5 (c) msec = 2 ∆x + 1 = −5 (d) No, the answers to parts (b) and (c)
agree, so there is no problem with ∆x being negative.
Homework #5
Exercises.
1. (a) msec = 6 x1 + 3 ∆x − 2 (b) 10
2. (a) msec = 6 x1 + 3 ∆x − 5 (b) −11
Homework #6
Exercises.
1. (a) ∆x now is 1−5 = −4, while before it was 5−1 = 4. And ∆y now is −2−34=−36, while before it was 34−(−2) = 36.
So, both ∆x and ∆y change signs. (b) Now, msec = (∆y)/(∆x) = (−36)/(−4) = 9. This is the same as the old value of
msec . (c) Both ∆x and ∆y change signs, so msec doesn’t change.
2. (a) ∆x is now 1 − 2 = −1, while before it was 2 − 1 = 1. And ∆y now is −2 − 1 = −3, while before it was 1 − (−2) = 3.
So, both ∆x and ∆y change signs. (b) Now, msec = (∆y)/(∆x) = (−3)/(−1) = 3. This is the same as the old value of msec .
Homework #12
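Exercises.
1. (a) $-\frac{3}{2}\,\frac{1}{\sqrt{2 - 3x}}$ (b) $6\,(5z^3 - 8z)^5\,(15z^2 - 8)$ (c) $2r \sqrt{2r - 1} + \frac{r^2}{\sqrt{2r - 1}}$
(d) $\frac{(6t^2 + 9t + 5)^4\,[3\,(3t^2 - 8t + 1)^2\,(6t - 8)] - (3t^2 - 8t + 1)^3\,[4\,(6t^2 + 9t + 5)^3\,(12t + 9)]}{(6t^2 + 9t + 5)^8}$
(e) $9 \left( \frac{5\theta^5}{\theta^2 + 1} \right)^8 \frac{(\theta^2 + 1)\,(25\theta^4) - (5\theta^5)\,(2\theta)}{(\theta^2 + 1)^2}$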
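2. (a) $-\frac{5}{3}\,\frac{1}{(3 - 5x)^{2/3}}$ (b) $7\,(5z^4 + 8z^3 - 1)^6\,(20z^3 + 24z^2)$ (c) $3r^2 \sqrt{4r - 1} + \frac{2r^3}{\sqrt{4r - 1}}$
(d) $\frac{(4t + 1)^3\,[5\,(3t^3 - 4)^4\,(9t^2)] - (3t^3 - 4)^5\,[3\,(4t + 1)^2\,(4)]}{(4t + 1)^6}$
(e) $8 \left( \frac{\theta + 2}{\theta^2 + \theta} \right)^7 \frac{(\theta^2 + \theta)\,(1) - (\theta + 2)\,(2\theta + 1)}{(\theta^2 + \theta)^2}$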
2 2 +1 x2 −1 x2 −1
3. 1
x2 +1
− 2 (x2x+1)2 = (x−x2 +1) 2 becomes − (x2 +1)2 = − (x2 +1)2
4. $\left( \frac{1}{x^2 + 1} + \frac{2x^2}{(x^2 + 1)^2} \right) (x^2 + 1)^2$ becomes $3x^2 + 1$.
5. Varies with the student.
Homework #13
Exercises.
|x3 −x | (3 x2 −1) x (|x |/x)−|x | (1)
1. (a) (2 x) |x | + x2 (|x | /x) = 3 x |x | (b) x3 −x
(c) =0
√ √ |x |2
|x |
(|x | x2 +1) (0)−(1) [ x x2 +1+|x | 12 (x2 +1)−1/2 (2 x)]
(d) √
2 2
[|x | x +1]
|4 x3 −7 x2 | (12 x2 −14 x) |x | (1)−x (|x |/x)
2. (a) (5 x4 ) |x | + x5 (|x | /x) = 6 x4 |x | (b) 4 x3 −7 x2
(c) =0
√ √ |x |2
|x | 1 2 −1/2
(|x | x2 −1) (0)−(1) [ x 2
(d) √x −1+|x | 2 (x −1) (2 x)]
[|x | x2 −1]2
3. Varies with the student.
Homework #14
Exercises.
sec2 (|θ |) |θ |
1. (a) |tan(θ )| 2
tan(θ ) sec (θ ) (b) θ (c) 3 sin2 (θ ) cos(θ ) (d) 3t 2 sec(4t) + 4t 3 sec(4t) tan(4t)
(α 2 +1) (− sin(5 α) (5))−(cos(5 α)) (2 α)
(e) (α 2 +1)2
2. (a) − cos(cos(θ )) sin(θ ) (b) − sin(sin(θ )) cos(θ ) (c) −5 cot4 (x) csc2 (x)
(α 2 −1) (1 tan α+α sec2 α)−(α tan α) (2 α)
(d) 4t 3 sec(t 2 ) + 2t 5 sec(t 2 ) tan(t 2 ) (e) (α 2 −1)2
3. Varies with the student.
Homework #15
Exercises. 2 (1−Arctan(x))
1. (a) √ 2 x
(b) 2 1+(1+x 2 )2 (c) |Arcsin(5 x) |
Arcsin(5 x)
√ 5
(d) − sec 1+x2
1−4t 2 1−25 x2
(1−2 x2 ) ((1) Arcsec x+x √1 )−(x Arcsec x)(−4 x)
|x | x2 −1
(e) (1−2 x2 )2
5 2 |Arcsec(x2 −1) |
2. (a) √ (b) √ (c) Arcsec(x2 −1) 2 √2 x 2 2
2
−3−25t −20t 2
Arctan(4 x) (1+16 x ) |x −1 | (x −1) −1
1 1 −2 √ 1
(d) sec Arcsin x tan Arcsin x − (Arcsin x)
1−x2
(x2 +sin x) [ 1 2 Arcsin x+Arctan x √ 1 −(Arctan x Arcsin x) [2 x+cos x]
1+x 1−x2 ]
(e) (x2 +sin x)2
3. Varies with the student.
Homework #16
Exercises.
sec(x) tan(x) sin(x− 1x ) 1 1 Arcsec x
Arcsec x(1)−x ( |x |√1x2 −1 )
1. (a) 1+sec(x) (b) x + ln(x) cos(x − x ) (1 + x2
) (c) x (Arcsec x)2
|ln(sin(x)) | cos(x) (β +3)2 [(1) ln β +β (1/β )]−(β ln β ) [2 (β +3) (1)]
2. (a) ln(r) + 1 (b) ln(sin(x)) sin(x) (c) (β +3)4
1/7
ln(x)25 x18 (x8 −x)41
41 8 x7 −1
3. (a) (1+x2 )39 Arcsin29 (2 x) 25
× ( 7 x ln(x) + 7 x + 7 x8 −x − 78
1 18 1 x
7 1+x2 − 7
58 √ 1
) (b) sin(x)ln(x) × ( ln(sin(x))
x +
1−4 x2 Arcsin(2 x)
ln(x) cos(x)
sin(x) )
21 (x) x12 (x2 −2)43
1/5
4. (a) (2sin 2 13
x +5) Arctan (3 x) 79 × ( 21 cos(x) 12 1 86 x 78 x 237
5 sin(x) + 5 x + 5 x2 −2 − 5 2 x3 +5 − 5 (1+9 x2 ) Arctan(3 x) )
1
(x) 2
(b) sin(x)cos(x) (− sin(x) ln(sin(x)) + cos
sin(x) )
5. Varies with the student.
6. $x^4 \sin(x) \left( \frac{4}{x} + \frac{\cos(x)}{\sin(x)} \right)$ becomes, when multiplied out, $4x^3 \sin(x) + x^4 \cos(x)$.
Homework #17
Exercises.
1 (cosh(x2 )) [(1) sec x+x sec x tan x]−(x sec x) [sinh(x2 ) (2 x)]
1. (a) cosh(ex ) ex (b)
(1−x2 ) Arctanh(x)
(c) cosh2 (x2 )
2. (a) sinh(sin(ex )) cos(ex ) ex (b) 4t 3 4
sinh(t) + t cosh(t)
(sin(ln x)) [e2 x (2) Arccosh(3 x)+e2 x √ 3 ]−(e2 x Arccosh(3 x)) [cos(ln x) (1/x)]
(3 x)2 −1
(c) sin2 (ln x)
1 x2 −1 x2 −1 x
3. 2 x , x2 +1 , 2 x2 +1
( 1 + 1+x ) (1−x)
1 1−x (1−x)2
4. 2 1+x becomes (x+1)1(x−1) .
5. Varies with the student.
Homework #18
Exercises.
2 3 3 2
1. (a) dy =
(15 x sin (x) + 15 x sin (x) cos(x)) dx
(tan3 x)[5 x4 ln x+x5 (1/x)]−(x5 ln x)[3 tan2 x sec2 x]
(b) dw = tan6 x
dx
2 x x x
2. (a) dy = (10 x − 2) dx (b) dz = (x ) [(e ) (ln x)+(e x)4(1/x)]−(e ln x) [2 x]
dx
|t 3 −3t | (3t 2 −3) dt
(z3 ) (sec2 z−1)−(tan z−z) (3 z2 )
3. (a) (cos(u) − u sin(u)) du (b) z6
dz (c) t 3 −3t
(1+z2 ) [−2 sin z−1]−(2 cos z−z) [2 z]
|7t−8 | dt u
4. (a) 7 7t−8 (b) (Arctan(u) + 1+u2 ) du (c) (1+z2 )2
dz
5. dx = cos(θ ) dr − r sin(θ ) dθ
6. Varies with the student.
Homework #19
Exercises.
1. (a) $\frac{b \cos(t)}{a\,(-\sin(t))}$ (b) $\frac{21t^6 + 24t^2 - 5}{8t^3 - 15t^2 + 2}$
2. (a) $\frac{\cosh(t)}{\sinh(t)}$ (b) $\frac{45t^8 - 40t^4 + 4t}{10t^4 + 9t^2}$
Homework #20
Exercises.
(1+sin2 x) [− sin x]−(cos x) [2 sin x cos x]
1. (a) 6 x ln(x) + 5 x (b) ex x−1 − 2 ex x−2 + 2 ex x−3 (c)
(1+sin2 x)2
2. (a) 12 x2 sec(x) + 8 x3 sec(x) tan(x) + x4 sec(x) tan2 (x) + x4 sec3 (x)
x x 2 x x x x x x 2
(b) ( sin x esin−e2 x cos x ) − ( sin x (e cos x+e (−(sin
sin x))−e cos x (2 sin x cos x)
2 x)2
2e
= sin 2 e cos x 2 e cos x
x − sin2 x + sin3 x
(c) (−1) (x 1 − (ln x)2 )−2 [(1) 1 − (ln x)2 + (x) (1/2) (1 − (ln x)2 )−1/2 (−2 ln x (1/x))]
p p
3. $-\frac{1}{R \sin^3(t)}$
4. $-\frac{b}{a^2 \sin^3(t)}$
5. $6 e^x + 18 x e^x + 9 x^2 e^x + x^3 e^x$
6. Varies with the student.
A.3 Chapter 2.
Homework #21
Exercises.
A.4 Chapter 3.
Homework #25
Exercises.
1. (a) Three-dimensional function, so it takes four dimensions to graph it. (b) Five-dimensional function, so it takes six
dimensions to graph it.
2. (a) Four-dimensional function, so it takes five dimensions to graph it. (b) Eight-dimensional function, so it takes nine
dimensions to graph it.
3. (a) Partial with respect to x is $6x^2 - 10xy$; partial with respect to y is $-5x^2 - 6y^5$. (b) Partial with respect to x is
$y^3 (3x - y^2) + 3x y^3$; partial with respect to y is $3x y^2 (3x - y^2) - 2x y^4$. (c) Partial with respect to x is
$\mathrm{Arcsin}(\frac{x}{y}) + \frac{x}{y \sqrt{1 - \frac{x^2}{y^2}}}$; partial with respect to y is $-\frac{x^2}{y^2 \sqrt{1 - \frac{x^2}{y^2}}}$
4. (a) Partial with respect to x is $14x + 27x^2 y^2$; partial with respect to y is $18x^3 y - 10y^4$. (b) Partial with respect to x
is $\frac{(4x - 5y^3)\,[e^y] - (x e^y)\,[4]}{(4x - 5y^3)^2}$; partial with respect to y is $\frac{(4x - 5y^3)\,[x e^y] - (x e^y)\,[-15y^2]}{(4x - 5y^3)^2}$.
(c) Partial with respect to x is $y^2 \sec(\frac{x}{y}) \tan(\frac{x}{y})$;
partial with respect to y is $3y^2 \sec(\frac{x}{y}) + y^3 \sec(\frac{x}{y}) \tan(\frac{x}{y})\,(-x/y^2)$
5. (a) Partial with respect to $\rho$ is $\ln(\mathrm{Arctan}(\theta))$; partial with respect to $\theta$ is $\frac{\rho}{(1 + \theta^2)\,\mathrm{Arctan}(\theta)}$. (b) Partial with respect to
p is V; partial with respect to V is p, partial with respect to T is $-n r$. (c) Partial with respect to x is $2xy\,(xy - z^2)^4 +
4x^2 y^2 (xy - z^2)^3$; partial with respect to y is $x^2 (xy - z^2)^4 + 4x^3 y\,(xy - z^2)^3$; partial with respect to z is $-8x^2 y\,(xy - z^2)^3 z$
6. (a) Partial with respect to $\rho$ is $\frac{\sin(\theta)}{\rho\,(1 + \ln(\rho)^2)}$; partial with respect to $\theta$ is $\mathrm{Arctan}(\ln(\rho)) \cos(\theta)$. (b) Partial with respect to
$v_1$ is $\frac{(1 + (v_1 v_2)/c^2)\,[1] - (v_1 + v_2)\,[v_2/c^2]}{(1 + (v_1 v_2)/c^2)^2}$; partial with respect to $v_2$ is
$\frac{(1 + (v_1 v_2)/c^2)\,[1] - (v_1 + v_2)\,[v_1/c^2]}{(1 + (v_1 v_2)/c^2)^2}$.
(c) Partial with respect to x is $-\sin(\frac{x z^2}{y + z})\,z^2/(y + z)$; partial with respect to y is
$-\sin(\frac{x z^2}{y + z})\,x z^2\,(-1)(y + z)^{-2}\,(1)$; partial with respect to z is $-\sin(\frac{x z^2}{y + z}) \left( \frac{(y + z)\,[2xz] - (x z^2)\,[1]}{(y + z)^2} \right)$
7. Varies with the student.
Homework #26
There are no exercises in this homework set.
Homework #27
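Exercises.
1. (a) $f_{xx} = 12x^2 y^5$, $f_{xy} = 20x^3 y^4$, $f_{yy} = 20x^4 y^3$ (b) $f_{xx} = -\frac{y}{x^2}$, $f_{xy} = \frac{1}{x}$, $f_{yy} = 0$
(c) $f_{xx} = -2\,\frac{x y^3}{(1 + x^2 y^2)^2}$, $f_{xy} = -\frac{-1 + x^2 y^2}{(1 + x^2 y^2)^2}$, $f_{yy} = -2\,\frac{x^3 y}{(1 + x^2 y^2)^2}$
2. (a) $f_{xx} = 2y^5$, $f_{xy} = 10x y^4$, $f_{yy} = 20x^2 y^3$ (b) $f_{xx} = -2\,\frac{y x}{(1 + x^2)^2}$, $f_{xy} = \frac{1}{1 + x^2}$, $f_{yy} = 0$ (c) $f_{xx} = -\frac{1}{x^2}$, $f_{xy} = 0$, $f_{yy} = -\frac{1}{y^2}$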
3. Varies with the student.
Homework #28
Exercises.
1. f (x, y) = y − 2/x has f (x, y) = 0 as a level set, which is the same as y = 2/x. Another option is f (x, y) = x y, which has
x y = 2 as a level set, which is the same as y = 2/x.
2. f (x, y) = y − x has y − x = 5 as a level set, which is the same as y = x + 5. Another function is f (x, y) = y/(x + 5) which
has y/(x + 5) = 1 as a level set, also the same as y = x + 5.
Homework #29
Exercises.
1. Implicit. The variable z on the left also occurs on the right.
2. Explicit. The variable w occurs only once in the equation, and it is by itself on the left side.
18 x5 y4 −14 x y5 −4 x3 y3 −24 x5 y7
3. (a) −2 xy (b) − 12 x6 y3 −35 x2 y4
(c) − 1−3 x4 y2 −28 x6 y6
cos( xy )
2
y2 e(x y ) −3 x2 y4 y sin( xy ) y cos( xy ) x sin( xy ) x
4. (a) − 2 (b) − + x2
− y2
− x (c) 2 (x2 +y2 ) (1−2 y
)
2 x y e(x y ) −4 x3 y3 x2 +y2
2 2
5. − x y+y
3
6. Varies
( with the student. 4 3 7 4 3 7 )
(12 x2 y2 −35 x y6 ) [40 x3 +8 y3 +24 x y2 (− 10 x 2+82x y −5 y6 )−35 y6 (− 10 x 2+82x y −5 y6 )]
12 x y −35 x y 12 x y −35 x y
7. − (12 x2 y2 −35 x y6 )2
( 4 3 7 4 3 7 )
(10 x4 +8 x y3 −5 y7 ) [(24 x y2 +24 x2 y (− 10 x 2+82x y −5 y6 ))+−35 y6 −210 x y5 (− 10 x 2+82x y −5 y6 )]
12 x y −35 x y 12 x y −35 x y
+ (12 x2 y2 −35 x y6 )2
Homework #30
There are no exercises in this homework set.
A.5 Chapter 4.
Homework #31
Exercises.
1. (a) 9 (b) 109 (c) $\frac{15}{8} = 1.875$ (d) 90 (e) $x^6 + x^7 + x^8 + x^9 + x^{10}$ (f) $6^x + 7^x + 8^x + 9^x + 10^x$
2. (a) −3 (b) 157 (c) 63 (d) 32 (e) $x^7 + x^8 + x^9 + x^{10} + x^{11} + x^{12}$ (f) $7^x + 8^x + 9^x + 10^x + 11^x + 12^x$
3. Varies with the student.
Homework #32
Exercises.
1. (a) 1829.911722 (b) $0.9293358726 \times 10^{33}$ (c) $0.4646679363 \times 10^{34}$ pounds (d) No. The answer to the previous part
indicates that the rabbits will weigh 352 million times the weight of the earth.
2. (a) 551.1588190 (b) $0.3508367956 \times 10^{23}$ (c) $0.1754183978 \times 10^{24}$ pounds (d) The answer, while less than the weight
of the earth, is still unrealistic. The rabbits won’t weigh more than 1% of the total weight of the earth.
Homework #33
There are no exercises in this homework set.
Homework #34
Exercises.
1. (a) $y(x) = x^4 - 5$ (b) $w(r) = r^3 - 3$ (c) $x(t) = \sin(t) + 1$ (d) $y(x) = -8$
2. (a) y(x) = 79 x9 − 6 79 (b) w(r) = 1r + 32 (c) x(t) = − cos(t) + 5 (d) y(x) = −5
3. (a) $(5/3)\,x^3 + C$ (b) $3u^4 + C$ (c) $-w^{-1} + C$ (d) $(2/3)\,t^{3/2} + C$
4. (a) $4x^5 + C$ (b) $(7/3)\,u^3 + C$ (c) $-4w^{-2} + C$ (d) $(3/4)\,t^{4/3} + C$
Homework #35
Exercises.
1. (a) The units of velocity are feet per second. (b) The units of acceleration (a or g) are feet per second per second, or feet
per second squared. (c) The units for v are feet per second. The units for $-g\,t$ are (feet per second per second)*second
= feet per second. The units for $v_0$ are feet per second. They all have the same units. (d) The units for y are feet. The
units of $-(1/2)\,g t^2$ are (feet per second per second)*(second)$^2$ = feet. The units of $v_0 t$ are (feet per second) times (second)
= feet. The units of $y_0$ are feet. They are all the same.
Homework #36
There are no exercises in this homework set.
Homework #37
Exercises.
1. (a) 12 (b) 4 (c) −4
2. (a) 35 (b) 2 (c) − 21
2
Homework #38
Exercises.
1.(a) 34 x4 − 35 x3 + 52 x2 − 3 x +C (b) − 18 cos(8 x) +C (c) − 5x − 6 x2/3 +C (d) 29 (3t + 8)3/2 +C (e) − ln(cos(r)) +C
√
(f) 2 e w +C (g) 21 x2 − 1x +C (h) 14 ln(x4 − 2 x2 ) +C
2. (a) 34 x4 − 43 x3 − 32 x2 + 6 x +C (b) 13 sin(3 x) +C (c) − 45 x15 + 15
4 x
4/5 +C (d) 1 (24t − 7)3/2 +C (e) ln(sin r) +C
36
√ 1 1 1 1 4 2
(f) −2 cos( s) +C (g) − x − 2 x2 +C (h) 4 ln(x + 4 x ) +C
Homework #39
Exercises. √ √
1. (a) 38 (b) −1
8 (c) 22
9 11− 10
9 5 ≈ 5.622785066 (d) − ln(cos(1)) ≈ 0.6156264703 (e) ln(3)−ln(2) ≈ 0.4054651084
47 −2
2. (a) 12 (b) 3
√ √
(c) 41 17 1
36 41 − 36 17 ≈ 5.345424947 (d) −2 cos(1) + 2 ≈ 0.919395388 (e) 2 ln(3) ≈ 0.5493061445
Homework #40
Exercises.
1 1 1 1 1 1 1 1 1 −3+23 x 1 −36+11 x
1. (a) 70 x+2 − 70 x−3 − 126 x+4 + 126 x−5 (b) 23 1
18 x−3 − 18 x2 +9
23
(c) 18 1
(x−3)2
− 11 1
54 x−3 + 54 x2 +9
23 1
(d) 324 x−3 −
23 x+3 1 −3+23 x 23 1 17 1 1 33+34 x 1 −36+11 x
324 x2 +9 − 18 (x2 +9)2 (e) 324 (x−3)2 − 486 x−3 + 972 x2 +9 + 54 (x2 +9)2
1 1 1 1 1 1 1 1 1 −87+37 x 1 −99+4 x
2. (a) 36 1+x − 18 x−2 + 28 x−3 − 126 x+4 (b) 37 1
18 x−3 − 18 x2 +9 (c) 37 1 2 1
18 (x−3)2 − 27 x−3 + 54 x2 +9
37 1
(d) 324 x−3 −
37 x+3 1 −87+37 x 37 1 41 1 1 12+41 x 1 −99+4 x
324 x2 +9 − 18 (x2 +9)2 (e) 324 (x−3)2 − 972 x−3 + 972 x2 +9 + 54 (x2 +9)2
3. (a) The set-up is $A/x + B/x^2 + C/(x - 6) + D/(x - 6)^2 + E/(x - 6)^3 + F/(x - 6)^4 + G/(x + 1) + (H x + I)/(x^2 + 4) +
(J x + K)/(x^2 + 12) + (L x + M)/(x^2 + 12)^2$. (b) The degree of the denominator (bottom) is 13, so the degree of the
numerator (top) needs to be less than 13.
M
4. (a) 9 (b) 10 ln(9) ≈ 21.97224577 (c) (−k (t−t0 ))
1+e
500 500
(d) 1+e(−1.0+1. ln(9)) ≈ 115.9846584; 135.9140914 (e) 1+e(−5.0+1. ln(9)) ≈ 471.4128093; 500; They should be close, since P(t)
approaches M as t goes to infinity. (f) 7420.657955; This is nowhere near correct, since the exponential growth is not
only unbounded, it grows very rapidly.
5. (a) M=363,447,466 for the 1960–1970–1980 data and 913,434,253 for the 1960–1980–2000 data (b) The maximums
are so different because the dramatic increase in population in 2000 meant that we weren’t close to leveling out. See the
homework problem in this section for more details. (c) For the 1960–1970–1980 data, k = 0.0265 and for the 1960–1980–
2000 data, k = 0.0150, as opposed to Pearl and Reed’s k = 0.03134. So, the specific growth rates are getting smaller.
6. 285,517,942 for the 1960–1970–1980 data, and 311,453,851 for the 1960–1980–2000 data. The difference between
them is 8.3%, which means that a population in between might not do a good job of saying which one is better. However,
if the real data is much closer to one of those two numbers than the other, that would be a fairly good reason to support that
model over the other.
7. (a) $\frac{3}{x + 2} - \frac{2}{2x - 3}$ (b) $\frac{4}{x - 3} + \frac{1}{3x + 2}$ (c) $\frac{5}{2x - 1} + \frac{1}{x - 2} - \frac{2}{(x - 2)^2}$ (d) $\frac{2x - 9}{x^2 + 1} - \frac{1}{x - 4}$
8. (a) $3 \ln(x + 2) - \ln(2x - 3) + C$ (b) $4 \ln(x - 3) + (1/3) \ln(3x + 2) + C$ (c) $(5/2) \ln(2x - 1) + \ln(x - 2) + \frac{2}{x - 2} + C$ (d)
$\ln(x^2 + 1) - 9\,\mathrm{Arctan}\,x - \ln(x - 4) + C$
Homework #41
Exercises.
1. (a) $-\frac{1}{4}x^2\cos(4x) + \frac{1}{32}\cos(4x) + \frac{1}{8}x\sin(4x) + C$ (b) $\frac{1}{9}x^9\ln(7x) - \frac{1}{81}x^9 + C$
(c) $x\,\mathrm{Arcsin}^2(x) + 2\,\mathrm{Arcsin}(x)\sqrt{1-x^2} - 2x + C$ (d) $-\frac{1}{3}\cos(x^3) + C$ (e) $-x^2\cos(x) + 2x\sin(x) + 2\cos(x) + C$ (f)
$-e^{\cos x} + C$ (g) $\left(\frac{2}{21}x^9 + \frac{2}{105}x^6 - \frac{8}{315}x^3 + \frac{16}{315}\right)\sqrt{x^3+1} + C$.
2. (a) $\frac{1}{3}x^2\sin(3x) - \frac{2}{27}\sin(3x) + \frac{2}{9}x\cos(3x) + C$ (b) $\frac{1}{6}x^6\ln(6x) - \frac{1}{36}x^6 + C$ (c) $x\,\mathrm{Arcsec}(x) - \ln(x + \sqrt{x^2-1}) + C$ (d)
$\frac{1}{3}\sin(x^3) + C$ (e) $x^2\sin(x) + 2x\cos(x) - 2\sin(x) + C$ (f) $e^{\sin(x)} + C$ (g) $\left(\frac{1}{10}x^4 - \frac{1}{15}\right)\sqrt{x^4+1} + C$
3. (a) Tabular integration or integration by parts. (b) Substitution $u = x^3$ first, then integration by parts. (c) Partial
fraction expansion. Or in this case, substitution $u = x^3 - 3x$ also works. (d) Integration by parts, $u = (\ln(x))^6$, $dv = dx$.
(e) Partial fractions, $A/(x+3) + B/(x+3)^2 + C/(x+3)^3 + D/(x+3)^4 + E/(x+3)^5 + (Fx+G)/(x^2+1) + (Hx+I)/(x^2+1)^2
+ (Jx+K)/(x^2+1)^3 + (Lx+M)/(x^2+1)^4$.
4. (a) Tabular integration or integration by parts. (b) Substitution ($u = x^3$) (and then integration by parts). (c) Partial
fractions, or in this case, substitution also works ($u = x^2 + 2x$). (d) Integration by parts ($u = (\ln(x))^8$, $dv = dx$). (e)
Partial fractions, $A/(x-3) + B/(x-3)^2 + C/(x-3)^3 + D/(x-3)^4 + (Ex+F)/(x^2+81) + (Gx+H)/(x^2+81)^2 +
(Ix+J)/(x^2+81)^3$.
5. You differentiate the result and see if you come back to the original integrand:
diff( x*exp(x)-exp(x)+C,x); which gives $x e^x$.
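The same check can be done numerically when no computer algebra system is at hand. The following is a minimal sketch, not the author's method: it approximates the derivative of the candidate antiderivative with a central difference and compares it with the integrand at a few sample points. The function names, the step size and the sample points are all illustrative assumptions, and the constant C is taken as 0 since it does not affect the derivative.

```python
import math


def F(x):
    """Candidate antiderivative from answer 5, with C = 0."""
    return x * math.exp(x) - math.exp(x)


def integrand(x):
    """Original integrand x e^x."""
    return x * math.exp(x)


def central_difference(f, x, h=1e-5):
    """Approximate f'(x) with a symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)


def max_mismatch(points):
    """Largest gap between the numerical derivative of F and the integrand."""
    return max(abs(central_difference(F, x) - integrand(x)) for x in points)


if __name__ == "__main__":
    print(max_mismatch([-2.0, -0.5, 0.0, 1.0, 2.5]))
```

A mismatch that stays near rounding level at every sample point is evidence, not proof, that the antiderivative is right; the symbolic `diff` check remains the definitive one.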
Homework #42
Exercises.
1. Maple gives $-\frac{1}{35}\sin^4(7x)\cos(7x) - \frac{4}{105}\sin^2(7x)\cos(7x) - \frac{8}{105}\cos(7x) + C$
2. Maple gives $-\frac{1}{30}\sin^5(5x)\cos(5x) - \frac{1}{24}\sin^3(5x)\cos(5x) - \frac{1}{16}\cos(5x)\sin(5x) + \frac{5}{16}x + C$
3. Maple realized that $\sin(ax)^{n-2}$ = ($\sin(ax)^n$ divided by $\sin(ax)^2$). But then it split apart the integrals of the two:
$\int \sin(ax)^{n-2}\,dx = \int \sin(ax)^n\,dx \int \frac{1}{\sin(ax)^2}\,dx$. You can only pull constants out like that.
Homework #43
Exercises.
1. (a) Rightsum = 0.3863568043, Leftsum = 0.3078169880, Middlesum = 0.3463172133 (b) Rightsum = 1.805627583,
Leftsum = 1.633799400, Middlesum = 1.717566087 (c) Rightsum = 0.5655963332, Leftsum = 0.5155963332,
Middlesum = 0.5500354202
2. (a) Rightsum = 0.8390539012, Leftsum = 0.8914137786, Middlesum = 0.8664212399 (b) Rightsum = 7.049244379,
Leftsum = 5.771433159, Middlesum = 6.378420082 (c) Rightsum = 0.5826548602, Leftsum = 0.5326548602,
Middlesum = 0.5674084923
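For readers who want to regenerate numbers of this kind without Maple, here is a minimal sketch of left, right and midpoint Riemann sums. The exercise statements are not reproduced in this answer key, so the integrand, interval and number of subintervals used in the example (e^x on [0, 1] with n = 10) are assumptions; they were chosen because they appear to reproduce the three values listed for 1(b).

```python
import math


def riemann_sums(f, a, b, n):
    """Return (left, right, middle) Riemann sums of f on [a, b]
    using n subintervals of equal width."""
    h = (b - a) / n
    left = h * sum(f(a + i * h) for i in range(n))
    right = h * sum(f(a + (i + 1) * h) for i in range(n))
    middle = h * sum(f(a + (i + 0.5) * h) for i in range(n))
    return left, right, middle


if __name__ == "__main__":
    # Assumed example: f(x) = e^x on [0, 1] with 10 subintervals.
    left, right, middle = riemann_sums(math.exp, 0.0, 1.0, 10)
    print("Leftsum =", left)
    print("Rightsum =", right)
    print("Middlesum =", middle)
```

For an increasing integrand the left sum underestimates and the right sum overestimates the integral, which matches the ordering Leftsum < Middlesum < Rightsum seen in 1(b).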
Homework #44
Exercises.
1. (a) 0.3470868961 (b) 1.719713491 (c) 0.5405963332
2. (a) 0.8652338398 (b) 6.410338769 (c) 0.5576548602
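Each value listed for Homework #44 agrees with the average of the matching Leftsum and Rightsum from Homework #43, which is what the trapezoid rule gives on the same subintervals. That Homework #44 is a trapezoid-rule exercise is an inference from these numbers, not something this answer key states. The sketch below only uses the printed values; the names `LISTED`, `trapezoid_from_sums` and `max_discrepancy` are illustrative.

```python
# (label, Leftsum, Rightsum) as printed for Homework #43,
# followed by the value printed for Homework #44.
LISTED = [
    ("1(a)", 0.3078169880, 0.3863568043, 0.3470868961),
    ("1(b)", 1.633799400, 1.805627583, 1.719713491),
    ("1(c)", 0.5155963332, 0.5655963332, 0.5405963332),
    ("2(a)", 0.8914137786, 0.8390539012, 0.8652338398),
    ("2(b)", 5.771433159, 7.049244379, 6.410338769),
    ("2(c)", 0.5326548602, 0.5826548602, 0.5576548602),
]


def trapezoid_from_sums(left, right):
    """Trapezoid estimate as the average of the left and right Riemann sums."""
    return (left + right) / 2


def max_discrepancy(rows):
    """Largest gap between the averaged sums and the printed values."""
    return max(abs(trapezoid_from_sums(l, r) - listed) for _, l, r, listed in rows)


if __name__ == "__main__":
    for label, l, r, listed in LISTED:
        print(label, trapezoid_from_sums(l, r), listed)
```

Any remaining gap is at the level of the last printed digit, consistent with rounding in the ten-digit output.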
Homework #45
Exercises.
1. (a) 0.3465764762 (b) 1.718282782 (c) 0.5452337739
2. (a) 0.8660259831 (b) 6.389112621 (c) 0.5625063505
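The values in this set are consistent with Simpson's rule applied on the same subintervals as the earlier Riemann sums, although the answer key does not say so. As a hedged illustration, the sketch below implements composite Simpson's rule and applies it to an assumed example, e^x on [0, 1] with n = 10, which appears to reproduce the value listed for 1(b). The integrand, interval and n are assumptions, since the exercise statements are not included here.

```python
import math


def simpson(f, a, b, n):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    if n <= 0 or n % 2:
        raise ValueError("n must be a positive even integer")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points alternate weights 4 (odd index) and 2 (even index).
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


if __name__ == "__main__":
    # Assumed example: f(x) = e^x on [0, 1] with 10 subintervals.
    approx = simpson(math.exp, 0.0, 1.0, 10)
    print(approx, "vs exact", math.e - 1)
```

With only ten subintervals the estimate already agrees with the exact value e − 1 to about six decimal places, noticeably better than the Riemann-sum values for the same assumed integrand.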
Homework #46
There are no exercises in this homework set.
Homework #47
Exercises.
1. 18
2. 8
3. 8/3 with graph:
4. 34 with graph:
5. 52
6. 1/20
Homework #48
Exercises.
1. (a) 1; 1 (b) 240; 624
2. (a) 4; 4 (b) 40; 104
Homework #49
Exercises. √
1. x2 + 1 (b) x3 + 14 1
x3
(c) 1
2 9x+4
2. 2 π R
Homework #50
Exercises.
1. $2\pi\left(\frac{65}{12}\sqrt{65} - \frac{1}{12}\right) \approx 273.8666398$
2. $2\pi\left(\frac{129}{8}\sqrt{65} + \frac{1}{64}\ln(-8+\sqrt{65})\right) \approx 816.5660538$
3. $2\pi\int_0^{\pi} (x\sin(x)+2)\sqrt{(\sin(x)+x\cos(x))^2+1}\,dx \approx 93.30454438$
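The decimal values in answers 1 to 3 of Homework #50 can be rechecked with a few lines of code. This is a sketch under stated assumptions: the closed forms in answers 1 and 2 are evaluated directly, while the integral in answer 3 is approximated with composite Simpson's rule using 2000 subintervals. The choice of Simpson's rule, the subinterval count and all function names are illustrative choices, not taken from the text.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson's rule with an even number n of subintervals."""
    if n <= 0 or n % 2:
        raise ValueError("n must be a positive even integer")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def integrand(x):
    """y * sqrt((y')^2 + 1) for y = x sin(x) + 2, as in answer 3."""
    y = x * math.sin(x) + 2
    dy = math.sin(x) + x * math.cos(x)
    return y * math.sqrt(dy * dy + 1)


# Closed forms from answers 1 and 2.
answer_1 = 2 * math.pi * (65 / 12 * math.sqrt(65) - 1 / 12)
answer_2 = 2 * math.pi * (129 / 8 * math.sqrt(65) + math.log(-8 + math.sqrt(65)) / 64)
# Numerical approximation of the integral in answer 3.
answer_3 = 2 * math.pi * simpson(integrand, 0.0, math.pi)

if __name__ == "__main__":
    print(answer_1, answer_2, answer_3)
```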
Homework #51
Exercises.
1. $0.1022864769 \times 10^8$ tons
A.6 Chapter 5.
Homework #52
Exercises.
1. (a) There is a relative min at the point (−3, −2), with value of −5. (b) There is a saddle point at (−3, 2), with value
11. (c) There is a relative min at (3, −2), with value −5. (d) There is a saddle point at (0, 0) with value 0, and a relative
min at (2, 2) with value −4.
2. (a) There is a relative min at (−1, −2), with value 1. (b) There is a saddle point at (−1, 2), with value 17. (c) There is
a relative min at (1, −2) with value 1. (d) There are saddle points at (1/2, 1/4) with value 1/2 and at (−1/2, −1/4) with
value −1/2.
3. (a) $c = \frac{1}{2}\,\frac{-A+\sqrt{A^2+4Bt}}{B}$, $\frac{1}{2}\,\frac{-A-\sqrt{A^2+4Bt}}{B}$ (b) You want the positive square root, because c should be a positive number.
(c) $c = -1834.446263 + 2534.659463\sqrt{0.5238063398 + 0.0007890606332\,t}$
4. Since you can tell how much (at a minimum) recording time there is on a cassette tape, once you know how much time
has elapsed, you can tell the minimum amount of time left.
Homework #53
Exercises.
1. −57
2. 3/2
Homework #54
Exercises.
1. 3.832494443 minutes, or 229.9496666 seconds. The times listed for 1975 are 231.0 and 229.4, which are fairly close.
2. 3.798141443 minutes, or 227.8884866 seconds. The times listed for 1981 are 228.5, 228.4 and 227.3, again reasonably
close.
Index
Radians, 70, 71
Radioactive decay, 84–86
Radiometric dating, 85–86
Range, see Functions, range
Rational function, 127
Reduction formulas, 246
Regular points, 137
Relation, 10
Relative change, 146
Riemann sums, 247
Second derivative, 99
Second derivative test, 138
Simpson’s rule, 250
Solution of a diff. eq., 104
Summary
Chapter 0, 25
Chapter 1, 105
Chapter 2, 150
Chapter 3, 187
Chapter 4, 268
Chapter 5, 292
Summation notation, 197
Surface areas of revolution, 262