Calculus At Work And Play
Dr. Kenneth P. Rietz
εἰς ἔπαινον δόξης τῆς χάριτος αὐτοῦ
Preliminary Edition by Dr. Kenneth Rietz
©1991-2010 by Kenneth Rietz

Printed in the United States of America
All rights reserved. No part of this work may be reproduced, stored in a retrieval system, or transmitted in any form or
by any means, electronic, mechanical, photocopying, recording, or otherwise, without prior written permission from the
author, except by a reviewer who may quote brief passages in critical articles and reviews.
Contents
0 Introduction and Reference 1

0.1 Introductory lecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
0.1.1 Introduction to the cast of characters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Dudley . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Mugsy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Albert . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Comments on the commentators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
0.1.2 A fast introduction to calculus. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Distance formula. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Graphs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Relation between the graphs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Non-constant velocity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
0.2 Functions in general. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
0.2.1 Terminology and notation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Functions defined numerous ways. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
More terminology. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
0.2.2 Inverse functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Terminology and notation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Checking for and finding inverse functions (when they exist). . . . . . . . . . . . . . . . . . . . . . 10
0.2.3 Combining functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Adding, subtracting, multiplying, dividing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Composition. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
0.3 Trigonometric functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
0.3.1 Definitions of trigonometric functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
0.3.2 Graphs of trigonometric functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Simple graphs of all six trigonometric functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Amplitude and phase of sines and cosines. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Sums of such sines and cosines. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
0.3.3 Trigonometric identities. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Pythagorean (the most basic). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Addition formulas for trigonometric functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Double-angle formulas. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Half-angle formulas. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
0.3.4 Inverse trigonometric functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Domains and ranges of inverse trigonometric functions. . . . . . . . . . . . . . . . . . . . . . . . . 19
Graphs of the inverse trig functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Relations between inverse trigonometric functions. . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Identities of the trig(arctrig) type. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
0.4 Exponential and logarithmic functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
0.4.1 Exponential functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Laws of exponents. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Graphs of exponential functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
e (Euler’s constant) and “the” exponential function. . . . . . . . . . . . . . . . . . . . . . . . . . . 22
0.4.2 Logarithmic functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Logarithms as the inverses of exponential functions. . . . . . . . . . . . . . . . . . . . . . . . . . 23
Laws of logarithms. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Graphs of logarithmic functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Natural logarithms. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
0.4.3 Solving exponential equations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
0.5 Summary of Chapter 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1 Derivatives - I 26
1.1 Motivating the idea of derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.1.1 General introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Structure of the course. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Calculus as a foreign language. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
The mathematics of non-uniform quantities. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Motivation—driving a car. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Corresponding geometric ideas. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.2 Definitions of (1-dimensional) derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
1.2.1 Formula-defined functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Finding the slope of a secant line through two specific points. . . . . . . . . . . . . . . . . . . . . 29
Finding the slope of a general secant line through one specific point. . . . . . . . . . . . . . . . . . 29
Doing this on Maple. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
The uses of difference quotients. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Finding the slope of the tangent line at a specific point. . . . . . . . . . . . . . . . . . . . . . . . . 33
Finding the slope of the secant line between two general points. . . . . . . . . . . . . . . . . . . . 33
Magnifying the function, getting a “line.” . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Notations and terminologies for derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
The uses of derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
1.2.2 Correlations to velocity, average and instantaneous. . . . . . . . . . . . . . . . . . . . . . . . . . . 38
1.2.3 Green box functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Understanding ∆x and ∆y. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
∆y/∆x ≈ (dy/dx), for ∆x small. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
∆y ≈ (dy/dx)∆x. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
(dy/dx) is then what you multiply ∆x by to get ∆y. . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Correlate the WMF to slope of tangent lines. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
Correlate this to instantaneous velocities. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Notations, terminology, and cautions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
An example, using numbers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
1.2.4 Other definitions of functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
The uses of the wiggle magnification formula. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
1.3 Calculating derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
1.3.1 Motivation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
We want to avoid tons of messy algebra. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Patterns in derivatives became formulas. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
1.3.2 Differentiating polynomials. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Derivatives of simple functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Derivative of the sum and difference of monomials. . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Finding derivatives with Maple. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
The uses of derivative formulas. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
1.3.3 Limits, and the official definition of derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
The uses of limits. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

1.3.4 Differentiating products and quotients. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
The derivative of the product of two functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
The derivative of the product of three or more functions. . . . . . . . . . . . . . . . . . . . . . . . 55
Quotient rule. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
1.3.5 Time out to gather strength. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
The idea behind derivative formulas. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
How do you tell when to use what formula? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
When can you short-cut the process? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
1.3.6 The chain rule (differentiating the composition of functions). . . . . . . . . . . . . . . . . . . . . . 61
Terminology and description of composition. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
Deriving the chain rule using the “wiggle” approach. . . . . . . . . . . . . . . . . . . . . . . . . . 62
Writing out the chain rule. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Explaining how the chain rule is used in practice. . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
How to tell when to use it! . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
Combining the chain rule with earlier rules. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
Explain how to change quotients into products. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
The uses of the chain rule. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
1.3.7 More new functions and the chain rule. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Absolute values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
The uses of absolute values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
Trig functions and their derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
The uses of the circular trigonometric functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Inverse trig functions and their derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
The uses of inverse circular trigonometric functions. . . . . . . . . . . . . . . . . . . . . . . . . . 76
Logarithms and their derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
The uses of logarithms. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Exponentials and their derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
Hyperbolic trig functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
The uses of exponentials and hyperbolic trig functions. . . . . . . . . . . . . . . . . . . . . . . . . 89
1.4 Differentials. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
1.4.1 What is a differential? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
1.4.2 The difference between a differential dx and a wiggle ∆x. . . . . . . . . . . . . . . . . . . . . . . . 92
1.4.3 Examples of how differentials are used. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
The uses of differentials. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
1.5 Parametric equations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
1.5.1 Definitions; green boxes with multiple outputs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
1.5.2 Conversion to/from implicit/explicit functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
1.5.3 Derivatives of parametric equations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
Formula for parametric equation derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
Viewing as coordinated rates of wiggles. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
Viewing as way to change variable of differentiation. . . . . . . . . . . . . . . . . . . . . . . . . . 97
The uses of parametric equations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
1.6 Higher-order derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
1.6.1 Physical interpretation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
The independent variable is t; rate of change. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
Acceleration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
1.6.2 Geometric interpretation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
The independent variable is now x; notations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
Concavity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
Mention curvature (bending of beams, etc.). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
1.6.3 Calculations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
How to do it (simple). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

Algebraic simplifications (combining terms). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
Conversion of quotients to products highly recommended here! . . . . . . . . . . . . . . . . . . . 101
Examples. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
1.6.4 Parametric equations and second (and higher-order) derivatives. . . . . . . . . . . . . . . . . . . . 102
The uses of higher-order derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
1.8 Tests from previous years . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
2 Finance 119
2.1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
2.1.1 Seems an unusual topic for calculus, but isn’t. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Calculus is a major portion of business finance. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Economics is a heavily quantified subject nowadays. . . . . . . . . . . . . . . . . . . . . . . . . . 119
We will be doing a few separate topics in this chapter. . . . . . . . . . . . . . . . . . . . . . . . . 120
2.2 Continuous compounding. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
2.2.1 Background. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Terminology and notation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Simple interest. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
Compound interest. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
Continuous compounding. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
2.2.2 Indeterminate forms. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
Like 0/0, 1^∞ can be anything. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
Other varieties of indeterminate forms. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
Cure: L’Hôpital’s rule. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
Working limits to infinity on Maple. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
2.2.3 Return to the problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
Solution of the problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
Investigation of exponential growth. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
Regular compounding versus continuous compounding. . . . . . . . . . . . . . . . . . . . . . . . 132
2.3 Inventory control. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
2.3.1 Statement of problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
We want to determine the order size. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
Notation and terminology. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
Equation derived. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
Now that we have it, what do we do with it? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
2.3.2 General procedures for minimizing (or maximizing) a function. . . . . . . . . . . . . . . . . . . . 136
Look at a simple picture (graph). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
How do you solve the problem? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
Maxes and mins on closed intervals. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
2.3.3 Back to the problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
Use the second procedure first. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
Use the first procedure next. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
Solution. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
2.4 Elasticity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
2.4.1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
Typical use of calculus concepts in economics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
Typically poorly explained (poor understanding of calculus). . . . . . . . . . . . . . . . . . . . . . 143
Description of the market; gauge of price levels. . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
2.4.2 Notation and terminology. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
Price increase implies decrease in demand. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
Let’s try to maximize revenue. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144

2.4.3 Solution of problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
Assume that p = p(x). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
We want dR/dx = 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
Critical price occurs where (p/x)/(dp/dx) = −1. . . . . . . . . . . . . . . . . . . . . . . . . . . 144
Notation and terminology. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
2.4.4 Investigation of elasticity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
η is negative. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
Terminology. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
Setting up to examine elasticity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
2.4.5 Looking at relative changes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
Absolute changes are relevant in some cases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
Relative changes are common; measurement errors. . . . . . . . . . . . . . . . . . . . . . . . . . . 146
2.4.6 Back to elasticity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
Ratio of relative change in x to relative change in p. . . . . . . . . . . . . . . . . . . . . . . . . . . 147
Why that is relevant to maximizing revenue. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
Typical examples of elastic and inelastic markets. . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
3 Derivatives - II 152
3.1 Partial derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
3.1.1 Basics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
Motivations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
Multiple-input functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
Derivatives in this case, notations and terminology. . . . . . . . . . . . . . . . . . . . . . . . . . . 153
Interpretations of derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
Total change and total differential. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
3.1.2 How to calculate partial derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
Note which variable is being wiggled; treat others as constants. . . . . . . . . . . . . . . . . . . . . 156
3.1.3 ALL THE SAME RULES APPLY, EXACTLY AS THEY DID BEFORE. . . . . . . . . . . . . . . 156
Simplifications apply here, also. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
3.1.4 The chain rule. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
3.1.5 Higher-order partial derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
Notations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
Interpretations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
Equality of mixed partials, and using that information. . . . . . . . . . . . . . . . . . . . . . . . . 166
3.1.6 Implicit functions and their derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
Level sets. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
Definition of implicit functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
Formula for derivative of implicit functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
Higher-order derivatives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
3.1.7 Constrained partial derivatives (what if you can’t wiggle just one variable at a time?) . . . . . . . . 181
Motivation—gas dynamics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
Notation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
How to calculate these. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
3.3 Tests from previous years . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
4 Integration explained 197

4.1 The concepts behind integrals. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
4.1.1 Anti-wiggle factors (anti-derivatives) = definite integrals. . . . . . . . . . . . . . . . . . . . . . . 201
Adding up wiggles (slice it up and put it back together). . . . . . . . . . . . . . . . . . . . . . . . 202
Move from submicroscopic to any size at all. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
Population growth “solved.” . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
Areas, a geometric application. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
Definite integrals on Maple. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
4.1.2 Antidifferentials = indefinite integrals. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
General ideas. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
Re-examine population growth from this point of view. . . . . . . . . . . . . . . . . . . . . . . . . 211
Constant of integration revisited. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
Looking ahead, a bit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
Using Maple to find indefinite integrals and solve initial value problems. . . . . . . . . . . . . . . . 213
Vertical free-fall motion “solved.” . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
Ballistic motion solved from vertical free-fall solution. . . . . . . . . . . . . . . . . . . . . . . . . 218
The relation between definite and indefinite integrals. . . . . . . . . . . . . . . . . . . . . . . . . . 220
Solving differential equations and parallel limits. . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
4.2 Finding integrals. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
4.2.1 Exact methods. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
Standard formulas. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
Substitution. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
Partial fractions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
Integration by parts. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
4.2.2 What to do if you must evaluate an integral exactly. . . . . . . . . . . . . . . . . . . . . . . . . . . 242
Look for “obvious” substitutions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
Decide if partial fractions or integration by parts can work. . . . . . . . . . . . . . . . . . . . . . . 242
“Non-obvious” substitutions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
Integral tables. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
4.2.3 Approximate methods. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
Riemann sums. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
Trapezoidal rule. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
Simpson’s rule. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
Additional ideas (higher-order approximations). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
Accuracy considerations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
4.3 Applying 1-dimensional integrals. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
4.3.1 Areas, again. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
What happens if f(x) becomes negative? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
What do we do? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
Definite integrals of |f(x)|. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
The correct and general way to find areas between curves and the x-axis. . . . . . . . . . . . . . . . 254
The area between two curves. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
Slicing horizontally. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
A few comments at the end. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
4.3.2 Net and total distances traveled. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
4.3.3 Arc length. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
4.3.4 Surface areas of revolution. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
4.3.5 Hydrostatic force (pressure). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
5 Water Balloons 272

5.1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
5.1.1 What happened? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
5.2 The background. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
5.2.1 The equations we will need . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
5.2.2 The launcher . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
5.2.3 Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
5.2.4 We have lots of data. Now what? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
5.2.5 Working with the Least Squares equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
Setting up the Least Squares equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
Solving the Least Squares equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
The corresponding errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
5.2.6 Now, let’s do even better . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
Getting the equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278
5.2.7 Locating critical points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278
Categorizing critical points in two dimensions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278
Telling the difference using the second derivative test. . . . . . . . . . . . . . . . . . . . . . . . . 279
What happens in more variables. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280
5.2.8 Back to the problem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280
Size of the errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280
5.2.9 Finding launch velocity from these . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
Further improvements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
The truth comes out . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282
5.3 Video angle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283
5.3.1 Getting rate of change in camera angle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
5.3.2 The moral of the story — related rates problems. . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
5.4 Linear regression as another application. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
5.4.1 Most common way to fit data to a line. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
5.4.2 The general setup. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
5.4.3 The procedure. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
5.6 Finals from previous years . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292
A Answers to Homework Exercises 305

A.1 Chapter 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
A.2 Chapter 1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306
A.3 Chapter 2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
A.4 Chapter 3. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310
A.5 Chapter 4. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
A.6 Chapter 5. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
Chapter 0
Introduction and Reference
0.1 Introductory lecture

0.1.1 Introduction to the cast of characters
This is no ordinary calculus text (in case you hadn’t noticed already). Participating in the fun are some characters who will
appear (and disappear) regularly. You need to get acquainted with them.
Dudley
Dudley is the main character. He’s generally friendly, even a bit naive. There’s more than a bit of Charlie Brown in
Dudley. He’s adventurous enough to try anything, which often turns into a disaster. Dudley is still learning calculus, and
will often make comments or ask questions that express typical kinds of confusion. That part of Dudley is not hard to
identify with.
Dudley owns (although the reality might be that he is owned by) three pets. Fang is Dudley’s dog, and will often get
Dudley to do things for him. The fact that Fang communicates with Dudley so well says something about the intelligence
of both. Fang hates squirrels, and will always chase them. However, he has not yet caught one, and might not know what
to do with one if he had. Claw is Dudley’s cat and, in common with all cats, considers Dudley a necessary inconvenience.
Then there is Dudley’s pet duck, Bill. (Perhaps you have seen him in the movie Babe.) Bill is at least as smart as Albert
(you’ll meet him soon). On the other hand, Bill thinks he’s a rooster, and no one can make Bill believe otherwise. The
influence of the three pets will get Dudley into all kinds of bizarre situations, especially on tests.
Mugsy
Mugsy is the cynic. He can be counted on to insert the off-the-wall comment and the wisecrack. There are times he
says the right thing, but those are totally accidental. He has also tried most everything, but most of it seems to have bounced
off with only minor damage. His questions and comments on calculus are from the outsider’s perspective. He has just
enough smarts to make him dangerous (and has friends you don’t want to meet). He has little patience with high-sounding
nonsense, and only slightly more patience with Dudley. Dudley is a bit intimidated by Mugsy, for good reason.
No one is quite sure why Mugsy has decided to allow himself to be in this course. Mugsy is not the sort to do anything
he really doesn’t want to do. On the other hand, he likes (in his own way) Dudley, or else Dudley wouldn’t be in this course.
Mugsy, despite his gruff exterior, has a (very) small, tender spot.
Albert
Albert is the genius, the sort that plays chess blindfolded and works diagramless crossword puzzles in ink. He can
explain everything in calculus, and in just about everything else, too. He is not afraid to reply to Mugsy’s comments, or to
answer Dudley’s questions. The others tolerate Albert because he is so useful to have around. He’s the only one who can
figure out what’s going on.
No one is too sure why Albert is here, either. It’s clear that he already knows all this material. It might simply be that
he is responding to the desperate pleas of Dudley and Mugsy for help and reassurance.
Comments on the commentators

You’ll be able to tell when they are saying something, because they talk in a different type font. Say something to these
nice people, would you?
Dudley: Hi, there, nice people.
Mugsy: Humph. They’ll have to show they’re nice first.
Albert: Enjoy the course. And ignore Mugsy.
Mugsy: Watch it, Al.
0.1.2 A fast introduction to calculus.

It would be a shame to leave your first class in calculus, and not have any idea of what the subject is about.
Distance formula.
Start with
distance = rate × time
written as s = vt. This is a standard formula from physics, and it is common sense as well. It assumes, though, that the
velocity, v, is a constant.
Graphs.
Plot s versus t, and then v versus t, assuming v = 50 m.p.h. (constant).
The graph of s versus t is a slanted line through the origin, s = 50t. The graph of v versus t is a horizontal line, v = 50.
Here are the graphs.
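[Figure: two graphs. First: s versus t, the slanted line s = 50t through the origin. Second: v versus t, the horizontal line v = 50.]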
Relation between the graphs.

We can go between these two graphs. The slope of the line s = 50t is 50, which is exactly the velocity. If you find the area
under the curve (line, really) v = 50 starting at t = 0 and going up to a generic value t, you discover that it is the area of a
rectangle, with base = t and height = 50, so the total area is 50t, which is exactly the distance.
Non-constant velocity.
This is the general case, even when v is not constant! Velocity is always the slope of the s versus t graph. And distance
traveled is always the area under the v versus t graph.
Calculus enables us to go from the general s-t plot to v (a process called differentiation) and from the general v-t plot to
s (a process called integration).
The two basic operations of calculus are here: differentiation and integration. There are close connections between
slopes and velocities, and between areas and net distances. Those are two of the connections that I hope to make clear.
There’s a lot more to come! And, of course, this simplistic explanation has holes in it, which we will plug up at the proper
time.
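As a small preview, the constant-velocity example can be tried in Maple. This is only a sketch: diff and int are Maple's built-in differentiation and integration commands, and the dummy variable tau is a name chosen here, not one used in the text.

```maple
s := 50*t;                  # distance for a constant 50 m.p.h.
diff(s, t);                 # slope of the s versus t line: 50
int(50, tau = 0 .. t);      # area under v = 50 from 0 up to t: 50 t
```

The first result recovers the velocity from the distance, and the second recovers the distance from the velocity.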
0.2 Functions in general.

We will cover a multitude of ways of looking at functions, ranging from frivolous to formal. The single, central idea needs
to be hit hard, and I do that through repetition.
0.2.1 Terminology and notation.

Half the battle in this subject is to understand what is being said. So, we begin with a rundown of both terminology and
notation.
Dudley: Do you think we can get foreign language credit for calculus?
Mugsy: Do you really want to take two years of this subject?
Functions defined numerous ways.

I will present a total of six different ways of looking at functions, some of which are useful in one setting, and others are
useful in different settings. The more ways we have of looking at them, the more likely we are to find a good one when we
need it.
Consistent green box. First, a function is a consistent green box.

Mugsy: A WHAT? Is this guy serious?
Albert: I think so. But this is a bit unusual.
Let me explain. First, you need the picture.
[Figure: the green box, with an input funnel, a gnome inside, an output chute, and a reject spout.]
There are three parts: an input funnel, an output chute, and a reject spout. The input funnel is where you drop inputs of
any sort. The output chute is where the output falls out. The reject spout is for inputs that don’t produce output values.
It is sometimes not drawn. (For the moment, I am ignoring the inner workings of the green box. That will come next.)
Consistent means that whatever the box does to an input, that’s what it will always do to that same input. This is a “serious”
definition; we’ll come back to it regularly throughout this semester.
Dudley: I guess he is serious.
I even use it in more advanced courses.
Gnome. Now we come to what happens inside green boxes. With all these ecological niches, you’d expect that there is
some creature to fill them. Of course, there is. Another definition is that a function is a gnome.
Mugsy: You really sure this guy is serious?
Albert: Hang on. He is trying to make a point.
Mugsy: It looks like he’s trying to make a fool of me. That’s not nice. The last time somebody did that to me, the
funeral was not well attended.

Gnomes (pronounced guh-NOMES) live in green boxes and operate them. The critical fact is that gnomes are incapable of
making a decision. Trying to make a gnome make a decision causes such a trauma that it falls over, turns blue and dies.
This is the origin of the saying “I just blue my calculator.”
Mugsy: Oh, please....
Not needing to make decisions corresponds exactly to the green box being consistent. Think about that until it makes sense.
Mugsy: Is this really supposed to make sense, ever?
Albert: Perhaps a better way to think about it is to realize that if the green box were not consistent, then the gnome
would have to make a decision.
Proper list. When gnomes get old, they begin to lose their memories. To aid them, they often construct proper lists.
Mugsy: Now that makes sense!
Dudley: Now I’m worried about you.
Mugsy: Shuddup.
A proper list has two properties. The first is that there are two columns, labeled INPUT and OUTPUT. The left column
contains the valid inputs; the right column contains corresponding outputs. But that isn’t what makes a list proper. The
other property is that duplicate entries in the left (INPUT) column have identical right column (OUTPUT) entries. This is
the same as consistent, and shows that no decision needs to be made.
Example.
I’ll go over three examples in class. Here they are:
     List 1             List 2             List 3
INPUT  OUTPUT      INPUT  OUTPUT      INPUT  OUTPUT
  −3     −26          3     −26         −3      28
  −2      −7          2      −7         −2       9
  −1       0          1       0         −1       2
   0       1          0       1          0       1
   1       2          1       2          1       2
   2       9          2       9          2       9
   3      28          3      28          3      28
Note that unless there are repeated inputs, you can conclude immediately that the list is proper.
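The check for a proper list can be sketched in Maple. The procedure name IsProper and the names list1, list2, list3 are made up for illustration; each list is entered as [input, output] pairs taken from the three examples.

```maple
# IsProper is a hypothetical helper, not something from the text.
# L is a list of [input, output] pairs.
IsProper := proc(L)
    local i, j;
    for i from 1 to nops(L) do
        for j from i + 1 to nops(L) do
            # Same input with a different output would force a decision.
            if L[i][1] = L[j][1] and L[i][2] <> L[j][2] then
                return false;
            end if;
        end do;
    end do;
    return true;
end proc:

list1 := [[-3,-26], [-2,-7], [-1,0], [0,1], [1,2], [2,9], [3,28]]:
list2 := [[3,-26], [2,-7], [1,0], [0,1], [1,2], [2,9], [3,28]]:
list3 := [[-3,28], [-2,9], [-1,2], [0,1], [1,2], [2,9], [3,28]]:
IsProper(list1);    # true
IsProper(list2);    # false: input 1 appears with outputs 0 and 2
IsProper(list3);    # true
```

Only the second list repeats an input with different outputs, so it is the only one that is not proper.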
Proper collection of ordered pairs. This is much the same as proper lists. You convert the left-right column couples into
ordered pairs. Proper means essentially what it did before. Oddly enough, though, this is the definition that will become
the most general in more advanced mathematics courses!
Example.
Converting the lists from the previous example to ordered pairs gives
(−3, −26)    (3, −26)    (−3, 28)
(−2, −7)     (2, −7)     (−2, 9)
(−1, 0)      (1, 0)      (−1, 2)
(0, 1)       (0, 1)      (0, 1)
(1, 2)       (1, 2)      (1, 2)
(2, 9)       (2, 9)      (2, 9)
(3, 28)      (3, 28)     (3, 28)
The process is quite easy.

Mugsy: That’s easy for you to say.
Albert: I thought you said this made sense.
Mugsy: You shuddup, too.
Graph that passes the vertical line test. This is one definition that you might well have seen before. Plotting a proper
collection of ordered pairs (assuming that they are numbers, which is not normally required!) gives a graph. The horizontal
axis is for the input value, the vertical axis is for the output value. Vertical lines look at the possible outputs for a specific
input. The vertical line test states that a graph represents a function if no vertical line intersects the graph in more than
one point. If a
specific vertical line doesn’t hit the graph at all, there is no output value for that input. That’s no problem, since functions
aren’t required to produce an output for each input. That’s the reason for the reject spout. If the vertical line hits the graph once,
you can read off the output value from the y coordinate of the intersection point (read off the vertical axis). If it hits the graph
more than once, there are several possible outputs for that input, meaning the list was not proper, there was a decision, and
the box was not consistent.
Mugsy: I think he intends that to mean that something is not good.
Albert: Precisely.
The graphs of the examples, in order (with a lot more points filled in), are given right above. Note that the first and third
ones pass the vertical line test, while the second one does not.
Well-defined formula. This is what most of you would think of as a function. The phrase “well-defined” is a technical
term from mathematics. What it means is that any choices made along the way don’t make any difference in the end.
Mugsy: My choices don’t make a difference?
Albert: Only in the case the formula is well-defined.
For example, the formula
$$f(x) = x \pm (1 - \sin^2 x - \cos^2 x)$$
looks like it involves a choice (Do I take the “+” or the “−”?), but whichever choice you make doesn’t affect the answer,
since the trigonometric identity $\sin^2 x + \cos^2 x = 1$ means that the term in parentheses is 0. So, this f(x) is well-defined.
Not all formulas are functions. The quadratic formula is one example, since it has a ± in it, and that will affect the
values that come out of the formula (usually).
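Both claims can be checked with a short Maple sketch. Nothing here comes from the text beyond the formulas themselves; the variable z and the coefficients a, b, c are placeholder names for a general quadratic.

```maple
simplify(1 - sin(x)^2 - cos(x)^2);          # the term in parentheses: 0
simplify(x + (1 - sin(x)^2 - cos(x)^2));    # choosing "+": x
simplify(x - (1 - sin(x)^2 - cos(x)^2));    # choosing "-": x
solve(a*z^2 + b*z + c = 0, z);              # quadratic formula: two values, one per sign
```

The first formula gives the same answer for either sign, so it is well-defined. The quadratic formula returns two (usually different) values, so it is not a function.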
Example.
The formulas for the lists above are not directly obvious, but are, respectively,
$$f(x) = 1 + x^3, \qquad f(x) = 1 \pm x^3, \qquad f(x) = 1 + |x^3|.$$
Thus, the first and third lists gave functions, while the second one didn’t.
More terminology.
When we get around to using functions more, some terminology will prove quite useful. One is the domain of a function,
which is the set of all “good” inputs (ones that give outputs rather than dropping out the reject spout). (Note: Most
everything that is done in serious mathematics with functions is done in terms of sets. It turns out to be most convenient.)
How do you find the domain of a function, since we have so many different ways of looking at a function? For green
boxes (and gnomes), I’ve already given it to you. You look for inputs that don’t drop out the reject spout. For proper lists or
ordered pairs, you look down the left column, or at the first coordinates. For graphs, you take the parts of the x-axis which
have points of the graph over (or under) them. (Smash the graph flat onto the x-axis.) For a formula, you normally take the
domain to be all of the possible inputs that can produce a legal output. This means that you avoid these:
• division by zero;
• taking square roots of negative numbers;
• taking logarithms of non-positive numbers;
• taking inverse trigonometric functions of numbers outside the appropriate values.
(The appropriate values depend on the inverse trigonometric function. We’ll go over logarithms and inverse trigonometric
functions quite soon in lab, if they weren’t covered during the Maple lab.)
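One way to picture the domain of a formula is as a green box with the reject spout written out explicitly. The following Maple sketch does that for the square root; the procedure name GreenBox and the string "reject" are inventions for illustration, and the procedure assumes a numeric input.

```maple
# GreenBox is a hypothetical name; it models the square root as a green box.
GreenBox := proc(x)
    if x < 0 then
        return "reject";    # negative input: out the reject spout
    end if;
    return sqrt(x);         # valid input: out the output chute
end proc:

GreenBox(9);     # 3
GreenBox(-9);    # "reject"
```

The domain is exactly the set of inputs that do not come out the reject spout, here the nonnegative numbers.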
Another term is the range, which is the set of all possible outputs of a gnome or green box. For a proper list or set of
ordered pairs, look down the right column, or the second coordinates. It is also the parts of the y-axis with points of the
graph to the left or right. (Smash the graph flat onto the y-axis.) Ranges of functions defined by formulas can be downright
tricky, and I won’t ask you to go into those. There is one thing to drill into yourself right now. We will keep hitting it
throughout this and next semester. The principal square root of a number is never negative. And we use only principal
square roots. So, $\sqrt{9} = 3$, not $\pm 3$, and in general $\sqrt{a^2} = |a|$. Remember this!
Mugsy: And if I don’t?
Albert: Then you’ll waste our time and yours as we continually remind you of it.
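Maple follows the same convention, as this short sketch suggests. The "assuming a::real" clause is an addition made here: without it Maple treats a as possibly complex, and the result is not written with an absolute value.

```maple
sqrt(9);                                # 3, never -3
sqrt((-3)^2);                           # 3, which is |-3|
simplify(sqrt(a^2)) assuming a::real;   # should display as |a|
```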
Although we won’t use range and domain much, we will work with independent variables and dependent variables
extensively through the rest of the course. The independent variable of a function is the letter given to the input spout
variable in a formula. It can have any value it wants (it is independent). The dependent variable is the letter given to the
output spout variable in a formula. It is determined by (is dependent on) the input variable.
NOTES ON THE HOMEWORK:
Dudley: Why do I have this feeling of impending doom?
1. The following is the homework assignment for this material. Look for the double lines in the handouts; they begin
and end the homework assignments. You should work the questions that follow the material before they are covered in lab.
Questions on homework assignments will be covered in the next lab sessions (Tuesdays), and the homework will be
due the class after that (Wednesday). This is the standard procedure. Special arrangements are made around tests and
the final.
2. Depending on which lab section you are in, you will do different homework questions. Listen in lectures to make
sure that you do the correct ones! (Or, you can always do them all, just to be sure that you do the right ones.)
Mugsy: Get serious.
3. Exercises are worth one point per part, problems are worth two points per part, and investigations are worth three
points per part. This is how you can figure out exactly how many points any given assignment is worth.
Homework #1
Exercises.
1. Which of the following represents a function? Give a reason for your answer.
(a) In  Out        (b) In  Out
    1   1              1   2
    2   4              2   1
    1   1              1   3
    3   3              3   1
    1   1              1   4
    4   2              4   1
(c) [Figure: a graph]
(d) $f(x) = \sqrt[3]{|x|}$        (e) $f(x) = |\sqrt{x}|$
2. Which of the following represents a function? Give a reason for your answer.
(a) In  Out        (b) In  Out
    1   3              1   1
    2   2              2   2
    3   1              3   3
    4   1              3   4
    5   2              2   5
    6   3              1   6
(c) [Figure: a graph]
(d) $f(x) = \sqrt{x}$        (e) $f(x) = |x|$
3. What are the domains in exercise 1 (except part c.)?
4. What are the domains in exercise 2 (except part c.)?
5. A homework assignment starts on page 44. How many points is it worth?
0.2.2 Inverse functions.

We will hit several examples of inverse functions in this course (logarithms versus exponentials and all the inverse
trigonometric and inverse hyperbolic trigonometric functions). So, we need to have a basic grasp of them. Particularly with the
logarithms, the inverse function properties are exceedingly important.
Mugsy: I’ve managed for a long time without them.
Albert: When was the last time you took calculus?
Mugsy: And passed?
Terminology and notation.

The general idea of an inverse function is one that “undoes” a function. The inverse function tells you what the input to the
original function must have been to give that output (of the original function).
Mugsy: Is this anything like turning one of these green boxes over and shaking real hard? That sounds fun!
The notation for the inverse function of f(x) is $f^{-1}(x)$. BE CAREFUL! This does not mean the same as the reciprocal (−1
power) of f(x)! This is critical particularly in the case of inverse trigonometric functions, where the exponent is often put
in this same spot.
The inverse might not always exist. The general criterion for deciding whether an inverse function exists is to determine
if more than one input gives any specific output. If so, it can’t have an inverse, since you can’t tell which input gave that
output. (Remember, the essence of a function is that there are no choices, ever.) The ones that don’t have two inputs ever
giving the same output are called one-to-one (or other, less obvious, things like monomorphic or injective, in advanced
mathematics).
Dudley: Does injective have anything to do with needles?
Albert: No.
Note that perfectly good functions (like $f(x) = x^2$) might not be one-to-one.
Checking for and finding inverse functions (when they exist).

For green boxes and gnomes, there is no obvious way to check for or represent the inverse function. Sorry.
Mugsy: If it means we won’t have to do it, I’m all for it!
For proper lists, you scan down the output column for duplicates, and if you find any, see if they came from the same
input. If that always happens, the function is one-to-one. The inverse is obtained by interchanging input and output columns.
Ordered pairs work the same way as lists.
Example.
Let’s take the same three examples we used earlier. Here they are:

   List 1         List 2         List 3
  −3   −26        3   −26       −3    28
  −2    −7        2    −7       −2     9
  −1     0        1     0       −1     2
   0     1        0     1        0     1
   1     2        1     2        1     2
   2     9        2     9        2     9
   3    28        3    28        3    28
When we exchange columns, we get

   List 1         List 2         List 3
 −26    −3      −26     3       28    −3
  −7    −2       −7     2        9    −2
   0    −1        0     1        2    −1
   1     0        1     0        1     0
   2     1        2     1        2     1
   9     2        9     2        9     2
  28     3       28     3       28     3
Now, let’s look at these. The first one is still a function. The second one is now a function, although you again have a
difficult time saying that something that is not a function has an inverse that is a function. (The right way to deal with
such things is to look instead at a more general critter, called a relation, which also has input and output spouts, but has no
requirement for consistency.
Dudley: You mean a relation is an inconsistent function?
Albert: Not exactly. A relation allows the possibility of being inconsistent. Inconsistency is not a requirement. It’s
better to think of it this way. A function is a consistent relation.
You can get its inverse exactly the same way as before, by interchanging columns. The question then becomes whether or
not the inverse relation is a function. The second relation is not a function, but its inverse is. The first
relation is a function, and its inverse is a function also.) The third one, which was a function, has no inverse. (That is, the
third relation is a function, but the inverse relation is not.)
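Interchanging columns and then testing for a function is mechanical enough to sketch in Maple. Invert and IsProper are hypothetical helper names, and the three lists are the examples entered as [input, output] pairs. This version of IsProper uses a counting shortcut: a list is proper exactly when it has as many distinct inputs as distinct pairs.

```maple
# Invert and IsProper are made-up names for this sketch.
Invert := L -> map(p -> [p[2], p[1]], L):     # swap the two columns
IsProper := L -> evalb(nops({op(map(p -> p[1], L))}) = nops({op(L)})):

list1 := [[-3,-26], [-2,-7], [-1,0], [0,1], [1,2], [2,9], [3,28]]:
list2 := [[3,-26], [2,-7], [1,0], [0,1], [1,2], [2,9], [3,28]]:
list3 := [[-3,28], [-2,9], [-1,2], [0,1], [1,2], [2,9], [3,28]]:

IsProper(Invert(list1));    # true:  the inverse is a function
IsProper(Invert(list2));    # true:  the inverse relation is a function
IsProper(Invert(list3));    # false: 28 would have to give both -3 and 3
```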
For formulas with $y = f(x)$, interchange x and y, and then solve for $y = f^{-1}(x)$. If there are any ±’s in solving for y,
the original f(x) has no inverse. Note that interchanging x and y is precisely the same as interchanging input and output
columns, since x is the variable for the input column (traditionally) and y is the variable for the output column (again,
traditionally). Solving for the (new) y is necessary to check if the formula is a function.
Example.
Here are the three examples one more time. The first function has a formula $f(x) = 1 + x^3$. To invert that, you start with
$y = 1 + x^3$, interchange x and y to get $x = 1 + y^3$, and then try to solve for y. In this case, it isn’t hard.
\begin{align}
x &= 1 + y^3 \tag{1}\\
y^3 &= x - 1 \tag{2}\\
y &= \sqrt[3]{x - 1} \tag{3}
\end{align}
Note that there is no ± with cube roots. The ± shows up only with even (square, fourth, sixth, etc.) roots, and never with
odd (cube, fifth, seventh, etc.) roots. The inverse function’s formula is then $f^{-1}(x) = \sqrt[3]{x - 1}$, which then obviously exists.
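A quick Maple sketch can confirm that this formula undoes the original function. The name finv and the test inputs are chosen here for illustration. surd(x, 3) is Maple's real cube root; it is used in place of x^(1/3), which Maple treats as the principal (possibly complex) root.

```maple
f := x -> 1 + x^3:
finv := x -> surd(x - 1, 3):    # real cube root of x - 1

finv(f(-2));    # f(-2) = -7, and the real cube root of -8 is -2
f(finv(9));     # the real cube root of 8 is 2, and 1 + 2^3 = 9
```

Each composition hands back the number that went in, which is what an inverse is supposed to do.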
The second function’s formula is $f(x) = 1 \pm x^3$. To invert, put the function in the form $y = 1 \pm x^3$, and interchange x
and y, giving $x = 1 \pm y^3$. Now, to solve for y gives
\begin{align}
x &= 1 \pm y^3 \tag{4}\\
\pm y^3 &= x - 1 \tag{5}\\
y^3 &= \pm(x - 1) \tag{6}\\
y &= \pm\sqrt[3]{x - 1} \tag{7}
\end{align}
This certainly looks like a non-function, but the lists we gave showed that it is a function. What’s wrong? Not much,
actually. It’s just that we were a bit too careless with throwing things around. Here’s what really happened. When we
constructed the list, we plugged only nonnegative numbers into the formula, meaning that the input column of f(x) was never
negative. When we inverted the columns, the outputs were therefore never negative. So, the ± in the formula never
really appeared in the lists, since it was chosen to make the result nonnegative, and there was no alternative possible in
that choice. The actual formula for the lists is really $f^{-1}(x) = |\sqrt[3]{x - 1}|$, and this is a function. So, is the function invertible
or not? Good question. If you look only at the numbers in the lists, the answer is yes. If you look at the formula, the answer
is no. If the list for f(x) were expanded by the formula to include such input-output combinations as (−1, 0) and (−1, 2),
which are valid by the formula $f(x) = 1 \pm x^3$, then this expanded list would fail to have an inverse. The formula already
includes such things, and so fails to have an inverse from the start. But note something. Putting more points into the list
causes there not to be an inverse. That comment will turn out to be useful later, when we are trying to make functions have
an inverse. The key will be to remove the offending entries that cause duplicate output column entries.
Mugsy: Al, did you follow that?
Albert: Of course. Why do you ask?
Mugsy: In case I ever need to know about it.
The third function is $f(x) = 1 + |x^3|$. How do we invert this? The same procedure needs to be followed. Write the
function as $y = 1 + |x^3|$, interchange x and y, and you get $x = 1 + |y^3|$, but solving for y now looks a bit rougher. Working
with absolute values is not too familiar. So, we convert to the alternate form of absolute values. Remember that $\sqrt{a^2} = |a|$?
You were told to!
Mugsy: Hey! Don’t get personal, hear?
Here, we use it (and not for the last time, either). We get
\begin{align}
x &= 1 + |y^3| \tag{8}\\
  &= 1 + \sqrt{(y^3)^2} \tag{9}\\
  &= 1 + \sqrt{y^6} \tag{10}\\
x - 1 &= \sqrt{y^6} \tag{11}\\
(x - 1)^2 &= y^6 \tag{12}\\
y^6 &= (x - 1)^2 \tag{13}\\
y &= \pm\sqrt[6]{(x - 1)^2} \tag{14}\\
  &= \pm\sqrt[3]{x - 1} \tag{15}
\end{align}
The ± showed up because we took an even (sixth) root. In this case, we can’t get rid of the ±, and again we get that there
is no inverse.
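A quick numerical check shows the same thing without any algebra. This is only a sketch, assuming the third function is entered in Maple with abs; the inputs 2 and −2 are picked from the list.

```maple
f := x -> 1 + abs(x^3):
f(2);                   # 9
f(-2);                  # 9
evalb(f(2) = f(-2));    # true: two different inputs share one output
```

Given the output 9, there is no way to tell whether the input was 2 or −2, so there is no inverse.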
For the graph of a function, you want to see if the same output value is ever duplicated. So, we look at horizontal lines,
since each horizontal line represents a single output value of the function. (Think about that until it makes sense.)
Mugsy: Forget it. The more I think, the less sense I make out of anything.
If any horizontal line is crossed more than once, the function is not one-to-one, since each different crossing represents a
different input value that has that same output value.
Once a function has passed the horizontal line test, how do we find the graph of the inverse? By flipping about the line
y = x, since that interchanges the positive x (input) and positive y (output) axes. (This is not the same as rotating by 90°.)
This has just the same effect as interchanging columns in lists. Of course, you could just flip before you knew the function
had an inverse and apply the vertical line test to see if what you just got is a function.
Here are the graphs of the flipped versions of the three functions.
Note that the first one remains a function, the second one does not become a function, and the third one no longer is a
function. This corresponds exactly to the first one passing the horizontal and vertical line tests, the second one not passing
either, and the third one passing the vertical line test but not the horizontal line test.
Please do not get the vertical and horizontal line tests mixed up. The vertical line test determines whether a graph
represents a function. The horizontal line test determines whether the inverse of a graph represents a function.
Homework #2
Exercises.
1. Which of the parts of exercise 1 of the previous homework set have inverses? Give the inverse for those that have
them, and give a reason for those that don’t have inverses.
2. Which of the parts of exercise 2 of the previous homework set have inverses? Give the inverse for those that have
them, and give a reason for those that don’t have inverses.
3. Explain why the test for a function to be one-to-one is exactly the same as the test for the inverse to be a function.
0.2.3 Combining functions.

Most functions are substantially more complicated than $x^2$ or $\sin x$. But, we can use these simple functions as building
blocks to get much more complicated functions. That’s what we do next.
Adding, subtracting, multiplying, dividing.

These are the obvious ways to combine two functions. Just do what the operations say. The only caution is to avoid division
by 0. And we will use these methods of combining functions regularly. But they are probably not the most important ways
of combining two functions.
Composition.
This is arguably the most important way to combine functions. It is basic to the most important rule in calculus. We will
encounter it later. The idea is simple: Take the output of one function as the input to another.
Mugsy: Skyscraper green boxes! Great!
This is in contrast to multiplying the outputs of two functions, for example. The simplest way to visualize it is with green
boxes. (See? I told you they’d reappear. And we aren’t done with them yet, by a long shot.)
We will have to be careful of notation. We can’t use x as the input (independent) variable for both functions, because
the input for the top function will not usually be the input for the bottom function. (The input to the second function is the
output of the first function.) For the same reason, we can’t use y for both of the output (dependent) variables. We will
usually use u as the intermediate variable—the output variable of the first (top) function as well as the input variable of the
second (bottom) function. That is, we will use u = f (x) and y = g(u), where f (x) is the top function and g(u) is the bottom
function.
The notation for the composition is g ◦ f (x) = g( f (x)). Note that this represents f (x) as the upper box and g(u) as the
bottom box. It looks as though the order is backwards.
Mugsy: This whole subject looks backwards.
Albert: Come now, it’s not all that bad.
Mugsy: Bet?
With g ◦ f (x), you first do f (x), get u = f (x), and then do g(u) = g( f (x)). Note also that f ◦ g and g ◦ f are very different
functions. Order is important in composition. (See the homework.)
And, from what we did before, $f(f^{-1}(x)) = f^{-1}(f(x)) = x$ is the definition of $f^{-1}(x)$. This is nothing more than saying
that the inverse function “undoes” the original function, and vice versa.
Example.
Let $f(x) = 2x^2 + 3x - 1$ and $g(x) = 4x - 3$, and find $f \circ g(x)$ and $g \circ f(x)$. To do these, work from the inside out. (Outside
in is another option. It will give the same answer but, in my opinion, in a more complicated way.)
For $f \circ g(x)$, we get
\begin{align}
f \circ g(x) &= f(4x - 3) \tag{16}\\
  &= 2(4x - 3)^2 + 3(4x - 3) - 1 \tag{17}\\
  &= 32x^2 - 36x + 8 \tag{18}
\end{align}
For $g \circ f(x)$, we get
\begin{align}
g \circ f(x) &= g(2x^2 + 3x - 1) \tag{19}\\
  &= 4(2x^2 + 3x - 1) - 3 \tag{20}\\
  &= 8x^2 + 12x - 7 \tag{21}
\end{align}
The only difficulty with these is algebraic.
Maple can be a big help here. It is good at algebra. Here is a Maple session that would work the example just given. (I
know that we haven’t covered Maple yet in the book, but often that section is covered in lab before this point. In any case,
you can come back to this once you have covered Maple.)
> f := x -> 2*x^2 + 3*x - 1; # Define f(x)
> g := x -> 4*x - 3; # Define g(x)
> expand( f(g(x)) ); # f composed with g
> expand( g(f(x)) ); # g composed with f
f := x → 2 x^2 + 3 x − 1
g := x → 4 x − 3
32 x^2 − 36 x + 8
8 x^2 + 12 x − 7
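The same kind of session can be used to check an inverse function. Here is a small sketch, not part of the example above: the name ginv and the formula (x + 3)/4 are my own, obtained by solving y = 4 x − 3 for x. If ginv really is the inverse of g, both compositions should come back as plain x.

```maple
> g := x -> 4*x - 3;         # g(x) from the example
> ginv := x -> (x + 3)/4;    # proposed inverse (an assumption to be checked)
> simplify( g(ginv(x)) );    # g composed with ginv; should give x
> simplify( ginv(g(x)) );    # ginv composed with g; should give x
```

Getting x from only one of the two compositions would not be enough; the definition asks for both.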
Homework #3
Exercises.
1. Use $f(x) = 2x^2 - x$ and $g(x) = 2x + 1$ for this exercise.
(a) Find and multiply out (expand) both polynomials $f \circ g(x)$ and $g \circ f(x)$. Note that they are different.
(b) Find $g^{-1}(x)$.
(c) Graph both $y = g(x)$ and $y = g^{-1}(x)$ on the same set of axes fairly accurately. Also draw in the line $y = x$
and note that the graphs of $g(x)$ and $g^{-1}(x)$ are reflections about the line $y = x$ (as they should be if they are
inverses).
(d) Show that $g \circ g^{-1}(x) = x$ and $g^{-1} \circ g(x) = x$ by working out the compositions algebraically.
(e) Find two different numbers $x_1$ and $x_2$ so that $f(x_1) = f(x_2)$. (You will need to use fractions for this.) (Hint:
Find two values of x that make $f(x) = 0$.)
(f) Why does the answer to the previous part show that $f(x)$ has no inverse?
2. Use $f(x) = 2x^2 - 3x$ and $g(x) = 2x + 3$ for this exercise.
(a) Find and multiply out (expand) both polynomials $f \circ g(x)$ and $g \circ f(x)$. Note that they are different.
(b) Find $g^{-1}(x)$.
(c) Graph both $y = g(x)$ and $y = g^{-1}(x)$ on the same set of axes fairly accurately. Also draw in the line $y = x$
and note that the graphs of $g(x)$ and $g^{-1}(x)$ are reflections about the line $y = x$ (as they should be if they are
inverses).
(d) Show that $g \circ g^{-1}(x) = x$ and $g^{-1} \circ g(x) = x$ by working out the compositions algebraically.
(e) Find two different numbers $x_1$ and $x_2$ so that $f(x_1) = f(x_2)$. (You will need to use fractions for this.)
3. For this problem, let
\[ f(x) = \frac{1}{1 - x} \]
Show that $f \circ f \circ f(x) = x$ by grinding through the compositions algebraically.
4. For this problem, let
\[ f(x) = -\frac{x}{1 - x} \]
Show that $f \circ f(x) = x$ by grinding through the compositions algebraically.
5. Occasionally, functions are defined in pieces that have to be put together carefully. This problem is about how to do
that. We will be working with the function $f(x)$ defined by
\[ f(x) = \begin{cases} 8x & \text{if } x < -2, \\ a x^2 & \text{if } x \ge -2 \end{cases} \]
We want to find the value of the constant a so that the two parts of the function fit together without a break at
$x = -2$. This means that we want the values of the two parts at $x = -2$ to match.
(a) What are the values of $8x$ and $a x^2$ at $x = -2$?
(b) What value should we give to the constant a to make the two values in the previous part equal? (Hint: Set the
values equal, and solve for a.)
Note: Functions that fit together this way, that is, that don’t have a break or gap, are called continuous. Yes, there is a
way to set up functions defined in pieces using Maple, but they don’t occur in this course. You will encounter them
in differential equations, and you will see how to define them in Maple then. Or, if you are really curious, you can
type ?piecewise in Maple and learn now.
6. In this problem, we will be working with the function $f(x)$ defined by
\[ f(x) = \begin{cases} 3x & \text{if } x < 1, \\ a x^2 & \text{if } x \ge 1 \end{cases} \]
We want to find the value of the constant a so that the two parts of the function fit together without a break at
$x = 1$. This means that we want the values of the two parts at $x = 1$ to match.
(a) What are the values of $3x$ and $a x^2$ at $x = 1$?
(b) What value should we give to the constant a to make the two values in the previous part equal?
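For the really curious, here is a rough sketch of how a function defined in pieces might be set up in Maple, using the piecewise command mentioned in the note to Exercise 5. The function h, the constant b, and the break point x = 2 are all made up for illustration; they are not taken from the exercises.

```maple
> h := x -> piecewise(x < 2, 3*x, b*x^2);  # 3 x to the left of 2, b x^2 from 2 on
> solve( 3*2 = b*2^2, b );                 # match the two pieces at x = 2
> b := 3/2;                                # the value that solve reports
> plot( h(x), x = 0..4, color=black );     # the graph should show no break at x = 2
```

The idea is the same as in the exercises: evaluate both pieces at the break point, set them equal, and solve for the constant.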
0.3 Trigonometric functions.

This section is for reference, and will be covered in lab, not during lecture. It is not designed to teach trigonometry, but
rather to refresh your memory and provide a convenient reference for various identities that will be useful. If you
have never had trigonometry before, you should see me immediately.
Mugsy: And what if you have successfully forgotten all the trigonometry you’ve ever seen?
Albert: Hang in there, then. It turns out you won’t need as much as you might think.
The good news (as if that wasn’t enough already) is that, since this is review, I will not assign any homework problems
specifically over this material.
Dudley: Hey! This is easier than I thought it would be!
0.3.1 Definitions of trigonometric functions.

The trigonometric functions were originally defined in terms of a right triangle with horizontal side x, vertical side y,
hypotenuse $r = \sqrt{x^2 + y^2}$, and base (acute) angle θ: sin θ = y/r, cos θ = x/r, tan θ = y/x, cot θ = x/y, sec θ = r/x,
csc θ = r/y. This means that all the other trigonometric functions can be written in terms of sin θ and cos θ:
\[ \tan\theta = \frac{\sin\theta}{\cos\theta}, \quad \cot\theta = \frac{\cos\theta}{\sin\theta}, \quad \sec\theta = \frac{1}{\cos\theta}, \quad \csc\theta = \frac{1}{\sin\theta}. \]
Officially, this works only for angles between 0° and 90° (that is, between 0 and π/2 radians), since those are the only angles you can get in a
right triangle. However, these definitions can be extended to any angle by measuring the angle from the positive x-axis to the ray that starts at the origin of a coordinate
system and passes through the point (x, y), with distance r (defined the same way) from the origin.
All angles for the rest of the course will be in radians. If you are unfamiliar with them, think of them as “metric
degrees,” where π radians is the same as 180 degrees.
Remember to put your calculator in radian mode! That is often done by pushing a MODE button, and usually needs
to be done each time the calculator is turned on. Radians are actually useful, but that fact is well camouflaged from high
school students. The reason appears in calculus. We will see it later.
Albert: Some calculators, like the HP’s and the better TI’s, allow you to set radian mode once, and it will stay. Most
others require you to set radian mode each time they are turned on.
Mugsy: I count on my fingers . . . all nine of them. How do I set them in radian mode?
Dudley: Albert, quit whimpering.
0.3.2 Graphs of trigonometric functions.

Simple graphs of all six trigonometric functions.
The only graphs that are really important are y = sin x and y = cos x, given below in that order.
> plot(sin(x), x=-4*Pi..4*Pi, color=black, scaling=constrained);
> plot(cos(x), x=-4*Pi..4*Pi, color=black, scaling=constrained);
Amplitude and phase of sines and cosines.

In the graphs of A sin(b x + c) and A cos(b x + c), A is the amplitude, 2π/b is the period, and −c/b is the phase shift.
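As a quick illustration, here is a sketch with made-up values A = 3, b = 2 and c = π/2 (my choice, not from the text). With these, the amplitude is 3, the period is 2π/2 = π, and the phase shift is −(π/2)/2 = −π/4. Plotting the result together with plain sin x makes the stretching and shifting visible.

```maple
> plot( {sin(x), 3*sin(2*x + Pi/2)}, x = -2*Pi..2*Pi, color=black );
```

Changing one constant at a time and replotting is a reasonable way to see what each one does.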
Sums of such sines and cosines.

Any reasonably well-behaved periodic function (such as a musical tone) can be written as a sum of sines and cosines. The branch of mathematics that
deals with this is called Fourier series, and is the basis of electronic music generators. I wish we could have gone into this
more in calculus, but I couldn’t put enough topics together at a calculus level to justify it. The right way to tackle this
subject shows up in a course called Applied Math.
Dudley: Isn’t Applied Math self-contradictory, like fresh frozen jumbo shrimp?
Albert: Not at all. Industry uses Applied Math all the time.
0.3.3 Trigonometric identities.

High school trigonometry was an endless succession of proving identities. We won’t do that here. The basic identities that
we will use are mercifully few. Here they are.
Pythagorean (the most basic).

The name Pythagorean identities comes from the fact that these are obtained directly from the Pythagorean theorem. They
are:
\begin{align*}
\sin^2\theta + \cos^2\theta &= 1 \\
1 + \cot^2\theta &= \csc^2\theta \\
\tan^2\theta + 1 &= \sec^2\theta
\end{align*}
Addition formulas for trigonometric functions.

These are here more for reference than because we will actually use them. They are important in other settings, though.
sin(A ± B) = sin A cos B ± cos A sin B
cos(A ± B) = cos A cos B ∓ sin A sin B

There is a trick that will help when B is a multiple of π/2. It involves looking at the graphs of y = sin x and y = cos x
carefully. It will be explained in detail in a later lab.
Mugsy: Is this going to be on the test?
Albert: No.
Double-angle formulas.
In contrast to the addition formulas, we will use the double-angle formulas on occasion. They come from the addition
formulas by setting A = B:
\begin{align*}
\sin(2A) &= 2 \sin A \cos A \\
\cos(2A) &= \cos^2 A - \sin^2 A = 2\cos^2 A - 1 = 1 - 2\sin^2 A
\end{align*}
The different forms of the cosine double-angle formula come from the Pythagorean identity.
Half-angle formulas.
These come from the last two double-angle identities for cos(2A), changing A to A/2 and solving for the squared sine or
cosine.
\begin{align*}
\cos^2(A/2) &= \frac{1 + \cos A}{2} \\
\sin^2(A/2) &= \frac{1 - \cos A}{2}
\end{align*}
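If you would rather not take these identities on faith, Maple can check them. What follows is only a sketch, assuming the usual behavior of expand and simplify on trigonometric expressions; the test value A = 1.3 in the last two lines is an arbitrary choice of mine.

```maple
> simplify( sin(t)^2 + cos(t)^2 );                                # Pythagorean identity; should give 1
> expand( sin(A + B) );                                           # addition formula for sine
> expand( cos(A + B) );                                           # addition formula for cosine
> simplify( expand(sin(2*A)) - 2*sin(A)*cos(A) );                 # double angle; should give 0
> simplify( expand(cos(2*A)) - (cos(A)^2 - sin(A)^2) );           # double angle; should give 0
> evalf( subs(A = 1.3, cos(A/2)^2 - (1 + cos(A))/2) );            # half angle, numerically; about 0
> evalf( subs(A = 1.3, sin(A/2)^2 - (1 - cos(A))/2) );            # half angle, numerically; about 0
```

The last two lines are numerical spot checks at one value of A, not proofs, and the results may differ from 0 by a tiny rounding error.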
0.3.4 Inverse trigonometric functions.

These show up at an unexpected, but critical, point in calculus. The idea of finding the inverse of a trigonometric function
is a bit audacious. After all, the function is periodic. There is no possible way it could be one-to-one, which is necessary
for a function to have an inverse.
Mugsy: I like doing audacious things, I think.... What’s “audacious” mean?
The idea is the same as with square roots. When you invert $y = x^2$, you get $y = \pm\sqrt{x}$, which is not a function. But if we
ignore some of the domain of the original function (this is the one place where we need to use that terminology), that is, just
consider $y = x^2$ for $x \ge 0$, we can invert it, and we get our usual $y = \sqrt{x}$. (You can think of this as lobotomizing the gnome,
so he can no longer remember that $x^2$ is defined for $x < 0$. This eliminates all the choices when trying to go backwards.)
Mugsy: Now that sounds like real fun! Can I try?
You could also review what I did with inverting the second main example in inverse functions. By putting in more
points for a function, we can mess up the existence of its inverse, because of the possibility of putting in points whose output
values duplicate the values of points we already had.
For trigonometric functions, we restrict the domain considerably more drastically than for square roots. More than that,
we have considerable latitude in choosing how we restrict the domain. For some inverse trigonometric functions, the choice
is standard. For others, it isn’t. The same general principles guide us for all of them, though:
1. Leave as much as you can without forcing decisions. (That is, remove enough inputs, but not too many.)
2. Leave values that are as near to 0 as possible.
3. Leave values that simplify identities as much as possible.
Unfortunately, the identities can’t all be simplified simultaneously (the third principle), as we will see shortly. This leaves
it up to each individual what identities are considered most important. There is no consensus. The choices that I make are
considered standard in some places, but not everywhere.
Dudley: That sounds confusing.
Albert: The situation is genuinely confusing.
Mugsy: Oh great. And you expect me to figure it out?
Let me give the general idea behind inverse trigonometric (also called inverse circular) functions. The inverse function
of y = sin x is y = Arcsin x, provided the domains and ranges cooperate. (The notations y = arcsin x and, tragically,
$y = \sin^{-1} x$ are also used. But note that in this case, the −1 is not an exponent. The same position is used for $\sin^2 A$, where
it does mean an exponent. I will always avoid the notation $\sin^{-1} x$ in this course as confusing and ambiguous.) Similar
equations give the remaining inverse trigonometric functions, Arccos x, Arctan x, Arccot x, Arcsec x, and Arccsc x. The
way to think of Arcsin x (for example) is as “the angle whose sine is x.” That is, θ = Arcsin x means that x = sin θ. Given a
value, you essentially try to fill in the blank: sin(____) = value you have of sine. The filled-in argument to sine is the value
of θ = Arcsin(value you have of sine). If you choose the angle from the correct range (in the case of sine, it’s −π/2 to
π/2), you get the Arcsine.
Example.
Try this with the value 1/2. We want to solve sin(____) = 1/2. The value to use to fill in is θ = π/6, since sin(π/6) = 1/2.
Thus Arcsin(1/2) = π/6.
Note that the outputs of the inverse trigonometric functions are angles, and that they are measured (as all angles in
calculus) in radians. Be careful when doing these on your calculator to keep it in radian mode.
Maple automatically works in radians.
Dudley: What’s this Maple he keeps talking about?
Albert: Hang on. The table of contents shows that it will appear at the end of this chapter.
Mugsy: Can’t he keep things in order?
All angles must be in radians, so the output of all inverse trigonometric functions is in radians. The notations for the
trigonometric and inverse trigonometric functions in Maple are:
sin(x); cos(x); tan(x); cot(x); sec(x); csc(x);
arcsin(x); arccos(x); arctan(x); arccot(x); arcsec(x); arccsc(x);
Note that Maple does not capitalize the inverse trigonometric functions.
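As a short sketch of how this looks in practice, here is the Arcsin(1/2) example redone in Maple. Only the commands listed above are used, plus evalf to get a decimal value.

```maple
> arcsin(1/2);           # exact answer, Pi/6
> sin(Pi/6);             # check: gives 1/2
> evalf( arcsin(1/2) );  # decimal value in radians, about 0.5236
```

Note that the decimal answer is in radians. A calculator left in degree mode would report 30 instead.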
Domains and ranges of inverse trigonometric functions.

The domains and ranges that we will use are as follows. These fit as well as possible with calculus, with the understanding
that a perfect match is not possible.
Function        Domain                  Range
Arcsine         [−1, 1]                 [−π/2, π/2]
Arccosine       [−1, 1]                 [0, π]
Arctangent      (−∞, ∞)                 (−π/2, π/2)
Arccotangent    (−∞, ∞)                 (0, π)
Arcsecant       (−∞, −1] ∪ [1, ∞)       (π/2, π] ∪ [0, π/2)
Arccosecant     (−∞, −1] ∪ [1, ∞)       [−π/2, 0) ∪ (0, π/2]
Note that Maple uses these same ranges in version 6.1 (the current one).
Graphs of the inverse trig functions.

Graphs of the three main inverse trigonometric functions are here. The most common ones are y = Arcsin x and (even
more common) y = Arctan x, with y = Arcsec x less so.
> plot(arcsin(x), x=-1..1, color=black, scaling=constrained);
> plot( {signum(x)*Pi/2, arctan(x)}, x = -10 .. 10, color=black,
> scaling=constrained );
> plot( {Pi/2, arcsec(x)}, x = -5 .. 5,
> color=black, scaling=constrained );
Relations between inverse trigonometric functions.

Part of deciding the domains is deciding what identities are fundamental. These are the ones I used.

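\begin{align*}
\operatorname{Arctan} x &= \operatorname{Arcsin}\left( \frac{x}{\sqrt{1 + x^2}} \right) \\
\pi/2 &= \operatorname{Arcsin} x + \operatorname{Arccos} x \tag{22} \\
&= \operatorname{Arctan} x + \operatorname{Arccot} x \tag{23} \\
&= \operatorname{Arcsec} x + \operatorname{Arccsc} x \tag{24} \\
\operatorname{Arcsin}(1/x) &= \operatorname{Arccsc} x \tag{25} \\
\operatorname{Arccos}(1/x) &= \operatorname{Arcsec} x \tag{26} \\
\operatorname{Arctan}(1/x) &= \operatorname{Arccot} x \quad \text{for } x > 0 \tag{27} \\
\operatorname{Arctan}(1/x) &= -\pi + \operatorname{Arccot} x \quad \text{for } x < 0 \tag{28}
\end{align*}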
From these, you can get between all of the inverse trigonometric functions.
The definition we are using for Arccot x means that Arctan x + Arccot x = π/2 always holds, no matter
what the sign of x, but the price is that Arctan(1/x) = Arccot x fails for x < 0. You simply can’t win.
Mugsy: So, what’s new?
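Here is a numerical spot check of that trade-off, as a sketch only. The test value x = −2 is an arbitrary choice of mine, and the check assumes that Maple’s arccot uses the range (0, π) given in the table above.

```maple
> evalf( arctan(-2) + arccot(-2) );      # compare with Pi/2, equation (23)
> evalf( Pi/2 );
> evalf( arctan(1/(-2)) - arccot(-2) );  # compare with -Pi, equation (28)
> evalf( -Pi );
```

If the first pair of numbers agrees and the second pair agrees, then for this negative x the sum identity holds and the reciprocal identity is off by exactly π.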
Identities of the trig(arctrig) type.

We will encounter later a need to simplify things like sin(Arctan(w/2)). Doing this is not difficult, but if you’ve never seen
it before, you wouldn’t figure it out yourself.
The basic procedure is:
1. Draw the triangle with the inverse trigonometric function as an angle (I call the angle θ ). This will give you two
sides of the triangle.
2. Find the remaining side by the Pythagorean theorem.
3. Use the trigonometric function’s definition to get the answer.
It actually is harder to write out than to do.
Example.
Let’s actually work out sin(Arctan(w/2)). First, we draw a triangle, with a generic angle θ in it, and make it so that
θ = Arctan(w/2). That is, we set up the sides of the triangle so that tan θ = w/2. To do that, we set the vertical side to w,
and the horizontal side to 2.

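[Figure: right triangle with base angle θ, horizontal side 2, vertical side w, and hypotenuse $\sqrt{w^2 + 4}$.]
The remaining side of the triangle (the hypotenuse in this case) can then be calculated, by the Pythagorean theorem, to be
$\sqrt{w^2 + 4}$. Then
\begin{align*}
\sin\theta &= \sin(\operatorname{Arctan}(w/2)) \tag{29} \\
&= \frac{\text{opposite}}{\text{hypotenuse}} \tag{30} \\
&= \frac{w}{\sqrt{w^2 + 4}} \tag{31}
\end{align*}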
This same pattern is used for all the others.
In theory, you need to be careful about w < 0, but you rarely need to in practice.
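Maple can be used to double-check a simplification like this one. The following is a sketch: the name expr and the test values w = 3 and w = −3 are mine, and the exact form Maple prints for the simplified expression may not look like equation (31) even when it is equal to it, which is why the numerical checks are included.

```maple
> expr := sin(arctan(w/2));                       # Maple may rewrite this on its own
> simplify( expr );                               # compare with w/sqrt(w^2 + 4)
> evalf( subs(w = 3, expr - w/sqrt(w^2 + 4)) );   # should be 0, up to rounding
> evalf( subs(w = -3, expr - w/sqrt(w^2 + 4)) );  # same check with a negative w
```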
In class, we will get the equations used in the Maple handout for the railroad track problem. It’s hard to do in the
handout, because it is so dependent on pictures.
0.4 Exponential and logarithmic functions.

We will use exponential and logarithmic functions routinely throughout calculus. This is a review of the properties of those
functions. It begins with exponential functions, since the properties there are somewhat more obvious. Then it moves to
logarithms, using the “obvious” properties of exponentials to get the standard properties of logs. Again, much of this is at
the level of trigonometry. We’ll need it and use it occasionally, but not as much as you might have done in high school. In
fact, it seems that more and more high schools are not teaching these subjects, for some strange reason.
0.4.1 Exponential functions.

To determine if a function is exponential, you look and see if the variable is in the exponent and the base is a constant.
Thus, $4^x$ is an exponential function, but $x^4$ is not, nor is $x^x$. (This will be something I will remind you of next semester,
where it will not be so obvious.)
Mugsy: This is obvious?
Dudley: Actually, this much is.
Laws of exponents.
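The basic laws of exponents come directly from the idea that $a^n$ means n copies of a multiplied together (for n a positive
integer).
\begin{align*}
a^x \times a^y &= a^{x+y} \tag{32} \\
\frac{a^x}{a^y} &= a^{x-y} \tag{33} \\
a^{-x} &= \frac{1}{a^x} \tag{34} \\
(a^x)^y &= a^{xy} \tag{35} \\
(ab)^x &= a^x \times b^x \tag{36} \\
\sqrt[x]{a} &= a^{1/x} \tag{37}
\end{align*}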
To work properly, a must be non-zero for the ones that involve division, and a > 0 for the rest. In general, it is best to use
these with a > 0 for all of them. Certain special values must be memorized:
\begin{align*}
a^0 &= 1 \tag{38} \\
a^{-1} &= 1/a \tag{39} \\
a^1 &= a \tag{40}
\end{align*}
Graphs of exponential functions.

The graphs of exponentials will be used to illustrate several points later on, so we need to look at them briefly. I will
describe the behavior of a whole series of graphs: $y = a^x$ for different values of the constant a.
Note first that all the graphs pass through the point (0, 1). For $a \gg 1$ (meaning a is lots bigger than 1), the curve rises
steeply to the right and falls very close to 0 very rapidly to the left. For $a \gtrsim 1$ (meaning a is only slightly bigger than 1), the curve rises slowly to the right at first,
and then takes off very rapidly, while it approaches 0 to the left slowly at first, and then flattens against the x-axis fast. For
a = 1, the graph is a horizontal line. For $1 \gtrsim a > 0$ (meaning that a is between 0 and 1, but much closer to 1), the graph
falls to the right, slowly at first, and then flattens against the x-axis further to the right, while it climbs slowly to the left,
and eventually takes off rapidly. For $1 > a \gtrsim 0$ (meaning that a is between 0 and 1, but much closer to 0), the curve flattens
out dramatically to the right, and takes off to infinity rapidly to the left.
Here are the graphs of these for a = e ≈ 2.718 (see the next section), a = 1/e ≈ 0.368 and a = 1.
It is instructive to note that the graphs of $y = a^x$ and $y = 1/a^x$ are reflections about the y-axis. This happens because of
a property of exponents, namely that $1/a^x = a^{-x}$. So, putting a value of x into $a^x$ gives the same value as putting −x into
the function $1/a^x = a^{-x}$. Since reflecting about the y-axis changes the sign of x, we have that reflecting the points of $y = a^x$
about the y-axis will give the points on the graph of $y = a^{-x}$.
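The three graphs just described can be reproduced with a command along the following lines. This is a sketch; the plotting range −2..2 is my own choice. Here exp(x) is $e^x$ and exp(-x) is $(1/e)^x$, so the reflection about the y-axis should be visible directly.

```maple
> plot( {exp(x), exp(-x), 1}, x = -2..2, color=black );
```

All three curves should cross at the point (0, 1).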
e (Euler’s constant) and “the” exponential function.

The single value of a that dominates in calculus is e = 2.71828 . . .. The reason is obscure now, but it is essentially the same
reason that we use radians rather than degrees in trigonometric functions. It will end up simplifying the formulas we will
get in calculus.
Dudley: I’m all for that!
The function $e^x$ is often written exp(x) in Maple and in most computer programming languages. The value of e itself
(2.71828 . . . ) is obtained in Maple by evaluating exp(1);. In other computer languages, exp(x) is the same function. It is built into
most systems.
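A short sketch of what that looks like in a Maple session follows. Maple keeps exp(1) exact until it is asked for a decimal value with evalf.

```maple
> exp(1);           # stays exact; Maple displays it as e
> evalf( exp(1) );  # decimal approximation, 2.718281828 at the default 10 digits
> evalf( exp(2) );  # e squared, about 7.389
```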
The constant e (as in $e^x$) is called Euler’s constant here, although it is more often called Euler’s number. Euler is pronounced “oiler,” as in the former Houston football team.
Mugsy: Is that supposed to be a joke?
Albert: Beats me.
0.4.2 Logarithmic functions.

It seems that more people have problems with logarithms than exponentials, probably because exponentials are built on
familiar concepts, while logarithms are not as concrete. But, if you think of them correctly, you will get the idea quite
rapidly.
Again, this section is intended to be a review, not to teach the subject. I will cover the material in lab, rather than class.
However, I also realize that many of you will not have seen logarithms, so I will go a bit slower in this section. But I still
will not assign any homework on this topic now. (Don’t think you’re going to escape. Homework on this will come along
after we have encountered logs in the calculus part of the course.)
Mugsy: I was hoping....
Logarithms as the inverses of exponential functions.

The very basic idea (the definition, actually) is that logarithms are the inverse functions to exponentials. Exponentials, for
a > 0 except for a = 1, are not only functions, they are one-to-one functions, and so have inverses. That means that you
can take the logarithm only of positive numbers. Anything else can lead to serious difficulties. If you get to a course called
complex analysis (MAT 482 at Asbury), you will encounter those hassles there.
Dudley: Good deal! I’m never going to take any math course with the word “complex” in its name!
Albert: Actually, the “complex” referred to comes from complex numbers, the kind of things that involve $i = \sqrt{-1}$.
Mugsy: All numbers are complex to me.
Albert: That’s the way Maple works, too.
Dudley: Gimme a break! You mean Mugsy thinks the way that Maple does?
Albert: Only in that.
For logarithms and exponentials to be inverses means that if $y = a^x$, then $\log_a y = x$, and vice versa. This gives two
identities immediately:
\begin{align*}
y &= a^{\log_a y} \quad \text{for } y > 0 \tag{41} \\
\log_a(a^x) &= x \quad \text{for any } x \tag{42}
\end{align*}
For these, as well as all logarithms, we require $a > 0$ and $a \ne 1$.

This gives one of the best ways of looking at logarithms. They will tell us what is going on in the exponents. They
provide a way to pull exponents down to the level of coefficients, where we can use standard algebra on them. Remember
this. We will use it several more times in this course!
Albert: Hear that, Mugsy?
Mugsy: Don’t remind me.
Laws of logarithms.
The laws of logarithms are just the reworking of the laws of exponents, using the fact that these are the inverse functions of
exponentials. Here are the parallels.
Logarithms                                  Exponentials
$\log_a(rs) = \log_a r + \log_a s$          $a^x \times a^y = a^{x+y}$
$\log_a(r/s) = \log_a r - \log_a s$         $a^x / a^y = a^{x-y}$
$\log_a(r^u) = u \log_a r$                  $(a^x)^u = a^{x \times u}$
$\log_a(\sqrt[x]{a}) = 1/x$                 $\sqrt[x]{a} = a^{1/x}$
The parallels show up when you use $r = a^x$ and $s = a^y$, so $\log_a r = x$ and $\log_a s = y$.
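A numerical spot check of the first three laws can be done in Maple, where log[a](x) is the notation for the base-a logarithm. This is only a sketch: the base 2 and the values r = 8, s = 32, u = 3 are made up so that the answers come out as whole numbers.

```maple
> evalf( log[2](8*32) );             # product law, left side; should be 8
> evalf( log[2](8) + log[2](32) );   # product law, right side; should match
> evalf( log[2](8/32) );             # quotient law, left side; should be -2
> evalf( log[2](8) - log[2](32) );   # quotient law, right side; should match
> evalf( log[2](8^3) );              # power law, left side; should be 9
> evalf( 3*log[2](8) );              # power law, right side; should match
```

Since evalf works in decimals, the two sides of each law may differ in the last digit or so.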
Graphs of logarithmic functions.

The graphs of these can be determined quite easily once you remember that they are the inverses of graphs of exponentials.
Here they are, with a = e and a = 1/e. Note that they are both defined only for x > 0.
Natural logarithms.
There is a base of logarithms that is used in higher math (calculus and up) so routinely that no other logarithm ever occurs
again except in very special and isolated cases. The base is e, Euler’s constant, the base of the exponential function.
The base e logarithm is so common that it is given a new notation, ln x, which is just $\log_e x$. It is called the natural
logarithm of x. We will learn more about it as we go along.
Dudley: Why isn’t that “nl” rather than “ln?” Isn’t it Natural Logarithm, rather than Logarithm Natural?
Albert: All those abbreviations come from Latin, where the order of words is different.
0.4.3 Solving exponential equations.

Many kinds of problems fit into this category. The sort that we will encounter will be fairly simple. We will
want to solve for a variable that occurs in the exponent. This occurs often when working with exponential equations. The
idea is fairly simple. Take logarithms, and use the properties of logarithms to pull variables down from the exponents and
then solve using standard algebra.
One example that we will hit later looks like this.
Example.
Suppose the pollution P(t) in a lake is described by $P(t) = P_0 e^{-4t}$, where $P_0$ is the initial pollution level and t is measured
in years. (This actually is a reasonable assumption for the pollution level in a lake when no more pollution is entering it.
We will discuss this more in later courses.) How long will it take the pollution to drop to 1% = 0.01 of its original level?
The solution to the problem looks like this. We want to find the value of t (call it $t_1$) when $P(t_1) = 0.01 P_0$, so we solve for
it:
\begin{align*}
0.01 P_0 &= P(t_1) \tag{43} \\
&= P_0 e^{-4 t_1} \tag{44} \\
0.01 &= e^{-4 t_1} \tag{45} \\
\ln(0.01) &= \ln(e^{-4 t_1}) \tag{46} \\
&= -4 t_1 \tag{47} \\
-4 t_1 &= \ln(0.01) \tag{48} \\
t_1 &= \frac{\ln(0.01)}{-4} \tag{49} \\
&= 1.1513 \tag{50}
\end{align*}
So the answer is that it will take 1.1513 years. Note that the critical step that made all of this work is taking logarithms of
both sides, so that the t1 could be isolated.
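The arithmetic in this example can be checked in Maple. The sketch below first evaluates equation (49) directly and then asks Maple to solve equation (45) on its own; the names t1 and t are just labels for the unknown time.

```maple
> t1 := ln(0.01)/(-4);            # equation (49); about 1.1513
> solve( 0.01 = exp(-4*t), t );   # let Maple solve equation (45) directly
```

Both lines should report the same time, a little over 1.15 years.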
0.5 Summary of Chapter 0

1. Functions.
(a) Definitions.
i. Consistent green boxes.
ii. Gnomes.
iii. Proper lists (INPUT and OUTPUT).
iv. Proper collections of ordered pairs.
v. Graphs that pass the vertical line test.
vi. Well-defined formulas.
(b) Domain = set of all “good” inputs
(c) Range = set of all possible outputs
(d) Inverse functions are obtained by interchanging input and output. The notation is $f^{-1}(x)$.
(e) Function composition: g ◦ f (x) = g( f (x)).
2. All the other trigonometric functions can be written in terms of sin θ and cos θ:
\[ \tan\theta = \frac{\sin\theta}{\cos\theta}, \quad \cot\theta = \frac{\cos\theta}{\sin\theta}, \quad \sec\theta = \frac{1}{\cos\theta}, \quad \csc\theta = \frac{1}{\sin\theta}. \]
3. The one critical trigonometric identity is $\sin^2\theta + \cos^2\theta = 1$.
4. Inverse trigonometric functions are the mechanism for stripping a trigonometric function off of an angle. So, if
sin θ = t, then θ = Arcsin t, for example.
5. The laws of exponents are
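\begin{align*}
a^x \times a^y &= a^{x+y} \\
\frac{a^x}{a^y} &= a^{x-y} \\
a^{-x} &= \frac{1}{a^x} \\
(a^x)^y &= a^{xy} \\
(ab)^x &= a^x \times b^x \\
\sqrt[x]{a} &= a^{1/x}
\end{align*}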
6. The laws of logarithms are

Logarithms                                  Exponentials
$\log_a(rs) = \log_a r + \log_a s$          $a^x \times a^y = a^{x+y}$
$\log_a(r/s) = \log_a r - \log_a s$         $a^x / a^y = a^{x-y}$
$\log_a(r^u) = u \log_a r$                  $(a^x)^u = a^{x \times u}$
$\log_a(\sqrt[x]{a}) = 1/x$                 $\sqrt[x]{a} = a^{1/x}$
Chapter 1
Derivatives - I
1.1 Motivating the idea of derivatives.

1.1.1 General introduction.
The argument was raging, often violent (at least as violent as mathematicians get), over who had developed calculus first.
Mugsy: Oooo, this sounds good!
The British claimed that Newton deserved that credit. Continental Europe, and particularly the French, claimed
that Leibniz was first. Both had some facts; both had even more pride. Both sides claimed that the other’s hero had plagiarized,
for example. Now (and then) mathematicians are a stubborn lot. And as a result, the development of mathematics
took two different tracks. The British followed the approach developed by Newton, a physicist more than a mathematician.
Math for them was (and still is) very intuitive and stressed understanding the concepts. The French followed Leibniz, a
philosopher. Math for them was (and still is) very logical, and rigorous proof was primary. British math was warm and furry; French
math was cold and sterile. The effects of that battle come down to today. The French won. Only recently have research
mathematicians been willing to allow that computers could do anything to help them; numbers were somehow “dirty.”
Proofs are the ultimate.
This attitude infused the teaching of calculus, especially at large research-oriented universities, where calculus texts are
typically written. The emphasis was on proving things. Applications were disjointed and often omitted in favor of theory.
Entire classes of students struggled with memorizing things that were going to be useful only to the few students who were
going to be pure math majors.
But times change. Currently, a revolution in the method of teaching calculus shakes the mathematical community.
Various approaches are being tried. These notes are my attempt at changing calculus instruction. My idea is to go back
to the British style of calculus: introduce ideas in the context of applications in order to motivate them and develop an
understanding of the foundations of the subject. This will also give you a broad set of models for applying the tools of
calculus. It may also answer the perennial question, “Where will I ever use this again?”
Structure of the course.

The applications we will cover in the next two semesters include:
• Financial applications. Continuously compounded interest, inventory control (finding the best way of keeping stock),
and elasticity, with mortgages as an investigation later on.
• Analyzing the motion of a water balloon. This is simple, until you have to start taking reality into account. We will
look at what happens when air resistance is added into the mix.
• Bumper cars and roller coasters. We’ll look at why you feel thrown around by them, and introduce vectors.
• Space travel. Escape velocity, what a black hole is, and multi-stage rockets will be detailed. We’ll also look at
gravitational assist (the slingshot effect).
• The algorithms of computers. What does your computer do to find a square root, or the sine of an angle, or the
arctangent of a number? It’s not simple, and leads to some of the most complicated material we’ll encounter.
• Balancing bottles and rating stereos. I’ll leave this one up to your imagination. They tie together, and are a tip of
the iceberg of a vast array of applications relating to average values. At the end, we might take a brief plunge into
probability and statistics.
Only the first two applications will appear this semester.
If you have any other suggestions for topics, please give them to me! I am always on the lookout for better ways to do
things, other ways to tie this material together. If you have an idea, pass it along. For example, I would love to put in a
section on music, but don’t have enough different topics I can tie together under that heading.
Calculus as a foreign language.

Calculus is a language of its own, the language of quantitative sciences.
Mugsy: I can certainly believe that calculus is a foreign language. None of it makes any sense to me.
Dudley: But you haven’t had any calculus, yet! How can you say it doesn’t make sense? Give it a shot, anyway.
Mugsy: Shooting. That I can try.
Calculus formulas can be “read,” and learning to do that is one of the goals of this course. I am being serious! I wouldn’t
go so far as to advocate calculus substituting for the foreign language requirement here, but the parallels are genuine.
The mathematics of non-uniform quantities.

In geometry, you know that the area of a rectangle is base times height. The volume of a prism (some base, but a
fixed height) is the area of the base times the height. The area of a circle is $\pi r^2$, which holds as long as the figure is a circle,
meaning that the radius is a constant. For non-constant heights or radii, these formulas fall apart. If the height changes
from point to point, how do you know which height to plug into the formula? The answer is to use calculus.
As velocity (v) changes, s = v t can’t be used any more; which v to use? For constant acceleration (a), we can still use
$s = \frac{1}{2} a t^2 + v_0 t + s_0$, which we will derive later. But this still requires constant acceleration. Unless something is constant,
we need calculus.
The central formula of Newtonian physics is F = m a, but in Einstein’s theory of relativity, that needs to be modified
because mass (m) is not constant. But even special relativity requires constant velocities (accelerations=zero). To deal with
general accelerations, you need general relativity. (Relax, we won’t be doing relativity. It will be mentioned again when
we look at black holes.) The simplest forms of the equations you have encountered require something to be constant. If
that thing isn’t constant, calculus is the only alternative you have.
Dudley: So all the formulas we’ve learned so far require something to be constant?
Albert: Exactly. But with calculus available, we can get formulas that are much more general.
Motivation—driving a car.
In order to bring this home in some detail, let’s take an example that everyone here should be familiar with, namely driving
a car. We will be simplistic for the moment, and assume the car is being driven along a flat, straight road. To figure out the
motion of the car, we only need to know its position (that is, its mile marker reading) at all times.
Dudley: Those of you from Kansas know all about this. Those of you from Vermont will just have to take our word
for it that flat, straight roads do exist.
(Later this semester, we will deal with motion in two directions, and next semester, we will look at roller coasters that deal
with three dimensions.)
Mugsy: That’s when Vermonters get their revenge, I guess.
The variables used. The independent variable is t, which is time, and the dependent variable is s, which is position. (I’d use x for position normally,
but I have a reason for using s, beyond the fact that physics often uses s.) We will write s = s(t) to emphasize independent
and dependent variables.
Average velocity. The formula for average velocity is v = s/t, but only if s and t both start at 0. Otherwise, we need what
is called ∆-notation. In general, ∆Q is the change in Q, where Q is any quantity under consideration. It is calculated by
∆Q = Q(end) − Q(beginning). This “(end) − (beginning)” theme will recur many times in calculus, and all of math.
That being done, we now have ∆s = distance traveled, and ∆t = elapsed time. Then we have v = ∆s/∆t no matter what
the starting values of t and s.
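To make the ∆-notation concrete, here is a small Maple sketch. The mile-marker readings and times are made-up numbers chosen for illustration, and the variable names are mine, not part of the course routines.

```maple
# Hypothetical readings: mile marker 120 at t = 2 hours, mile marker 185 at t = 3 hours.
s_begin := 120:  s_end := 185:
t_begin := 2:    t_end := 3:
ds := s_end - s_begin;    # (end) - (beginning): distance traveled, 65 miles
dt := t_end - t_begin;    # elapsed time, 1 hour
v_avg := ds/dt;           # average velocity, 65 miles per hour
```

Notice that the naive s/t = 185/3 would give the wrong answer here, because neither s nor t started at 0.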
Instantaneous velocity (speedometer reading). The actual velocity at a given time is of greater interest, say to a police-
man, than the average velocity.
Mugsy: That’s what he said, anyway.
Can we find that? The answer, of course, is yes. (Otherwise I wouldn’t ask it.)
The key is in ∆t. How big should we make it? For highway driving, ∆t = 1 minute might be fine. I tend to drive at a
reasonably constant speed (cruise control is handy!), which won’t change too fast. However, 1 minute probably won’t work
for in-town driving. Then, ∆t = 1 second is probably quite close, but even so, won’t be exact. How do we get the exact
velocity? We basically want to take ∆t as small as we can.
Dudley: How about minus infinity? That’s mighty small.
Albert: True. But here, we mean small to be in absolute value. ∆t needs to be very near zero.
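A quick numerical experiment shows what "very near zero" does to the average velocity. This is only a sketch: the position function s(t) = t^2 is an assumption picked because the arithmetic is easy, not a model of any real car.

```maple
# Hypothetical position function, chosen only for illustration.
s := t -> t^2:
# Average velocity over the interval from t = 1 to t = 1 + dt.
vavg := dt -> (s(1 + dt) - s(1))/dt:
# Shrink the elapsed time: dt = 1, 1/10, 1/100, 1/1000.
seq(vavg(1/10^k), k = 0 .. 3);    # 3, 21/10, 201/100, 2001/1000
```

The averages settle down toward 2 as the elapsed time shrinks, which suggests that some definite velocity is hiding at t = 1.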
What’s the smallest elapsed time we can take? Obviously, ∆t = 0. But ∆t = 0 gives some problems. In that case, ∆s = 0
also. (After all, if we haven’t had any time to move, we haven’t moved.)
Dudley: Even Mugsy can get that one, eh?
Mugsy: Shuddup.
We’d end up with v = 0/0. That’s not good. Before we answer this problem, we take a side-trip into geometry.
Corresponding geometric ideas.

The same dilemma occurs in geometry, although you probably haven’t thought about it.
Mugsy: Now that’s certainly right.
You will know the terms: secant and tangent lines. However, you probably haven’t applied them to anything but circles,
where they are easy to handle. We want to look at them for more general curves, ones where the radius is not constant!
What is a secant line to a curve? It’s a line that crosses the curve in two different points. What is a tangent line? That’s
more difficult to describe. The intuitive idea is that it comes in toward the curve, grazes it at one point, and then buzzes
away. The only problem is that you can’t control what the curve is going to do. The line that looks tangent at a specific
point might cross the curve again at some other point far away.
Change of variables in the problem. Here, the traditional independent variable is x, not t; the dependent variable is y,
not s. We write y = f (x) rather than s = s(t) to keep things straight. The reason I wanted to use s for position shows up here.
I didn’t want to use the same letter (x) for the independent variable in one setting and the dependent variable in another.
This can be confusing enough without that kind of problem, too.
Average velocity is the same as slope of a secant line. The slope of a line in general is ∆y/∆x, according to this new
∆-notation. (You might, or might not, have seen this before.) Slope of a secant line is ∆y/∆x where ∆y is the change in
the y-coordinate between the intersection points of the secant, and ∆x is the corresponding change in the x-coordinates.
Compare this to the average velocity being ∆s/∆t. It is exactly the same, with the dependent and independent variables in
corresponding places, too! That means that the only difference between the two is a matter of interpreting the meaning of
the variables. One interpretation gives the slope of a secant line, the other is an average velocity. We can use either way of
looking at such a quantity, and will!
It will be convenient to give a term to this ∆y/∆x or ∆s/∆t quantity. It is usually called a difference quotient.
How would we then get the slope of a tangent line? For that, we want to have the line come in and graze the curve at
a single point. You can simulate that by moving the two points of intersection of the secant line closer and closer to each
other. At the point they collide, there is a single point where the line intersects the curve, which is then the point of tangency.
What happens in the difference quotient if you slide the points together? The value of ∆x gets smaller and smaller until
it hits zero. That’s not good. When ∆x is zero, so is ∆y. That leads again to 0/0 for the slope of the tangent line.
Instantaneous velocity corresponds to the slope of a tangent line. The slope of the tangent line is obtained by using
smaller and smaller values of ∆x in the slope of the secant line, leading to 0/0. The instantaneous velocity is obtained by
using smaller and smaller values of ∆t, leading to 0/0. It is exactly parallel in concept and calculation to the idea of a
tangent line. The derivative is the name for the single idea in both of these calculations.
Usefulness of this. The more ways you have of understanding derivatives, the easier it is to understand how to read
formulas. It is something like having several different ways of translating words from a foreign language into English or
vice versa. Sometimes one word is more natural in a specific setting.
Right now, we have that the instantaneous velocity corresponds in some way to the slope of a tangent line. Both of
these are central to the understanding of derivatives. But the best understanding of derivatives is yet to come!
1.2 Definitions of (1-dimensional) derivatives.

We now come down to getting specific about this. We begin to look at some of the ways of finding derivatives of 1-
dimensional functions, depending on which definition of function we use: formulas, graphs, or green boxes.
Dudley: You mean all that weird stuff is coming back?!
Albert: Apparently.
We will use x and y for the independent and dependent variables for the moment. Later on, we will need to expand this
to s and t, for example.
1.2.1 Formula-defined functions.

This is where standard calculus courses spend a lot of time. We won’t. But we begin here in order to solidify some of the
concepts. We will use Maple to handle the algebra we encounter.
Finding the slope of a secant line through two specific points.

We will use y = 2 x^2 − 3 x − 1, with points (1, −2) and (3, 8) as an example of how this all works. We will use more general
curves and points soon enough. Notice that we only need the x-values, since the formula gives the y-values. That is, once
we know x = 1, we can get y = 2 (1)^2 − 3 (1) − 1 = −2, and for x = 3, we get y = 2 (3)^2 − 3 (3) − 1 = 8. Just plug the value
of x into the formula.
From the points, we can get ∆x = 3 − 1 = 2 and ∆y = 8 − (−2) = 10. This gives msec = ∆y/∆x = 10/2 = 5. This is the
slope of the secant line joining those two points. This seems mighty simple, and it is. But we are only just beginning.
Note that it is always (end)-(beginning), but be careful to keep track of negative signs!
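The same arithmetic can be checked in Maple. This is a minimal sketch; the names f, x_begin, x_end, dx, and dy are my own choices for illustration.

```maple
f := x -> 2*x^2 - 3*x - 1:           # the curve from the notes
x_begin := 1:  x_end := 3:
dx := x_end - x_begin;               # 3 - 1 = 2
dy := f(x_end) - f(x_begin);         # 8 - (-2) = 10
m_sec := dy/dx;                      # slope of the secant line: 5
```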
Finding the slope of a general secant line through one specific point.
Why would we want to find the slope of a general secant line through one specific point? Because we will want to find the
tangent line, ultimately. And we find the tangent line by sliding the secant points together. For that, we will need some
general form of the secant line. We will end up fixing one point (nailing it down at the point we want the tangent), and
sliding the other point to it. In this case, we will end up finding the tangent line at the point (1, −2), so that is the point we
will leave alone (nail down).
We will use (1, −2) as the beginning point; we will use (x0, y0) for the ending point. Then ∆x = x0 − 1 and ∆y =
y0 − (−2) = y0 + 2. But, according to the remark earlier, we can figure out y0 from x0. Specifically, y0 = 2 x0^2 − 3 x0 − 1,
by plugging into the formula. Then ∆y = (2 x0^2 − 3 x0 − 1) + 2 = 2 x0^2 − 3 x0 + 1.
At this point, I need to guide the process a bit. To get what we want, we convert the formula for ∆y to have ∆x in it
rather than x0 . This is a simple, but critical, step. It is not the sort of algebra trick that makes sense now, but shortly I will
come back and explain why, and it should make sense then.
So, we need to get a formula for x0 that is in terms of ∆x. That’s simple to find. We have ∆x = x0 − 1, which we can
just solve. We get x0 = ∆x + 1. This is always an easy step. The messy step is plugging all of this into the formula for ∆y.
You get
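\begin{align}
\Delta y &= 2x_0^2 - 3x_0 + 1 \tag{1.1}\\
         &= 2(\Delta x + 1)^2 - 3(\Delta x + 1) + 1 \tag{1.2}\\
         &= \bigl(2(\Delta x)^2 + 4\,\Delta x + 2\bigr) + (-3\,\Delta x - 3) + 1 \tag{1.3}\\
         &= 2(\Delta x)^2 + \Delta x \tag{1.4}\\
         &= \Delta x + 2(\Delta x)^2 \tag{1.5}
\end{align}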
Every term of the simplified form of ∆y has a factor of ∆x in it. This should always happen (for now). If it doesn’t, check
your algebra.
Factor out the ∆x from the terms on the right hand side, and you get
∆y = ∆x (1 + 2 ∆x).
This is the critical step! Why? Because we want ultimately msec , which is ∆y/∆x, and we need a factor of ∆x in ∆y in order
to be able to do that nicely. This is the reason that I changed from using x0 to ∆x.
So, divide this ∆y by ∆x and you get
msec = ∆y/∆x = 1 + 2 ∆x.
Note that for the section right before this, where we were working with (1, −2) and (3, 8), we had ∆x = 2 and msec = 5.
Using the formula for the slope of the secant line we just got (now in terms of ∆x), we get msec = 1 + 2 ∆x = 1 + 2 (2) = 5,
just as we got before!
Now we can find the slope of any secant line, as long as it is to the curve y = 2 x^2 − 3 x − 1, and the secant goes through
(1, −2).
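The hand algebra above can be retraced step by step with Maple's built-in expand, factor, and simplify commands. This is a sketch rather than the course routine: the name h stands in for ∆x and must not have a value assigned to it.

```maple
f := x -> 2*x^2 - 3*x - 1:
# h plays the role of delta_x; it must be an unassigned name here.
dy := expand(f(1 + h) - f(1));    # multiply out and cancel: equals 2*h^2 + h
factor(dy);                       # every term carries a factor of h: h*(2*h + 1)
msec := simplify(dy/h);           # divide out the common factor: equals 2*h + 1
subs(h = 2, msec);                # check against the secant through (1,-2) and (3,8): 5
```

Maple may print the terms in a different order than the notes do, but the expressions are the same.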
Mugsy: And you’re going to tell me that this stuff is actually useful?
Albert: Not yet. We need to learn quite a bit more first.
Doing this on Maple.

I have written a Maple procedure for finding ∆y/∆x easily when y = f (x) is a polynomial. It is called diffquo();.
Mugsy: You mean I don’t have to learn how to do this? I can use Maple for all of it?
Albert: I don’t think that’s the right approach. What happens if this process shows up on a test?
Mugsy: What usually happens. My mind goes into gridlock.
Dudley: What mind?
Mugsy: You’re asking for it, buster.
First, get the instructions for finding the routine. These will be given in class, since they have a tendency to change from
year to year, as the computer systems around here change. (There is also a chance of getting the routines directly from my
Web server, if I have it up and running by then.)
Then, get into Maple and load the routine. Here is one possible way that will work. (If you have a Linux/Unix system,
anyway. Windows or Macs have their own different strategies, and the line you type in will be different.) Alternatively, you
could type the procedure in yourself. Start with diffquo :=. Be careful to include the semicolons.
Maple prints out the procedure for you. You are not expected to understand it. Then you have to define the function.
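> diffquo := proc(y,x,h)
> # y: expression in one variable; x: where the secant starts; h: the value of delta_x
> local var;
> # find the variable that y is written in (assumes y contains exactly one)
> var := indets(y)[1];
> # (f(x+h) - f(x))/h, simplified so that the common factor of h cancels
> simplify((subs(var=x+h,y)-subs(var=x,y))/h);
> end proc;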
diffquo := proc(y, x, h)
local var;
    var := indets(y)[1]; simplify((subs(var = x + h, y) − subs(var = x, y))/h)
end proc
Note that diffquo(); tells you the order it expects information to be given to it: First the function, then the x-value
of the point at which you want the secant line, and finally the value of ∆x. Note that you can use any letter you want for
delta_x; a typical one is h, as in the defining procedure (what prints out on the screen after you type read(diffquo);).
If you type
> y := 2*x^2 - 3*x - 1;
> diffquo( y, 1, delta_x);
> diffquo( y, 1, 2 );
y := 2 x^2 − 3 x − 1
1 + 2 delta_x
5
How about that? Could life be any easier?
Mugsy: It could do my homework for me, too.
I genuinely urge you to get used to Maple for yourself. We will be using it regularly throughout the course. If you are
having difficulty with it, come see me soon, and we’ll work out the problems. Maple is definitely finicky about some things
(usually for a good reason, but not usually for an obvious reason). Let me remind you that in order to diagnose Maple’s
quirks, I will either need to see a copy of a printout of the session or you will have to come and get me while Maple is still
running. It is usually not possible to answer questions like “Why did Maple give me this wrong answer?”
The uses of difference quotients.

It seems as though difference quotients are a mathematical invention that has little, if any, application to anything else.
Mugsy: You took the words right out of my mouth.
Quite the opposite. Even at this very early point in calculus, what we are doing has widespread use. But, as always,
mathematicians are good at making that hard to see.
Mugsy: I can’t believe you said that! Mathematicians talk strange?! You admit it! Maybe I do have a chance in this
course!
Albert: Mathematicians tend to talk in such a precise manner, and with their own terminology, that people trying to
apply a mathematical concept to other areas have to do considerable translation at times.
Just where would difference quotients show up? In almost any situation that involves tabular data. A few examples
include production data in economics, steam tables in physical chemistry, and concrete stress data in engineering.
How are difference quotients used? The essence of difference quotients is to replace a section of some curve by a
straight line between two points. (We will soon see that highly magnified curves look like straight lines.) That is exactly
what is used in a process called linear interpolation. You want to evaluate a function at a value using a table that is given
to you, except the value you want isn’t one of the inputs in the table. How do you proceed? You assume that the function
is linear in between the nearest two table inputs, find the slope (that is, the difference quotient), and use that to estimate the
output value of the function at the value you want. An example problem is given in the homework, where you can see all
the steps. We will talk more about interpolation when we have more terminology and notation, quite soon.
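Here is what linear interpolation looks like as a short Maple sketch. The two table entries and the requested input are made-up numbers, and the variable names are mine.

```maple
# Made-up table entries, for illustration only.
xa := 10:  ya := 200:              # nearest table input below the value we want
xb := 20:  yb := 260:              # nearest table input above it
m := (yb - ya)/(xb - xa);          # difference quotient between the two entries: 6
xq := 14:                          # the input we want, which is not in the table
yq := ya + m*(xq - xa);            # follow the straight line from (xa, ya): 224
```

The last line is just ∆y = m ∆x solved for the unknown output, which is the same algebra the homework problem walks through.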
Homework #4
Exercises.
1. For this exercise, we continue to use the function y = 2 x^2 − 3 x − 1 from class, with starting point (1, −2). This means
that you can use the values and formulas we just got in the notes.
(a) Find the y-value to go with x = 5, and the values of ∆x and ∆y that go with these values of x and y.
(b) Find msec , the slope of the secant line to the graph of the function, through the points with x = 1 and x = 5 by
dividing the value of ∆y by the value of ∆x from the previous part.
(c) In the notes, we derived a formula for msec in terms of ∆x. Plug the value of ∆x from the first part of this exercise
into that formula to show that it gives the same value for msec as you got by dividing ∆y by ∆x. (All I want to
show is that the formula works.)
2. For this exercise, we continue again to use the function y = 2 x^2 − 3 x − 1 from class, with starting point (1, −2).
(a) Find the y-value to go with x = 2, and the values of ∆x and ∆y that go with these values of x and y.
(b) Find msec , the slope of the secant line to the graph of the function, through the points with x = 1 and x = 2.
(c) Plug the value of ∆x from the first part of this exercise into the formula for msec and show that it gives the same
value as you got by dividing ∆y by ∆x.
3. This exercise explores what happens if ∆x is negative. We will continue using y = 2 x^2 − 3 x − 1 with starting point
(1, −2).
(a) Find the y-value to correspond to the x-value −2, and calculate the values of ∆x and ∆y.
(b) Find the slope of the secant line through (1, −2) and (−2, value from first part) by dividing the value of ∆y by the
value of ∆x from the previous part.
(c) What does the formula msec = 1 + 2∆x give for the value of ∆x from the first part of this exercise?
(d) On the basis of the preceding parts, does ∆x being negative cause errors in the formula for the slope of the
secant line? (If your answers are different, negative values of ∆x do cause problems. Hint: There shouldn’t be
problems.)
Problems.
1. In this problem, we tackle a linear interpolation. I’ll give you all the steps. We will be using our favorite function,
y = 2 x^2 − 3 x − 1, again. Of course, that is a strange thing to do, since we can work out any value we want without
interpolation. We do this in order to verify that we are getting reasonable answers; we can check that the answer we
get is fairly close to the actual, correct value.
(a) First, we need to construct a table of values for the function. In order to make sure that we don’t give too much
away, we will take values that won’t interfere with the other things we are going to do. So, for this part of the
problem, find the values of y that correspond to the values of x at x = 7 and x = 9.
(b) Next, find the values of ∆x and ∆y, and the difference quotient for this situation. That gives msec . Once we have
found msec , though, we next use new values of x and y, but that value of msec .
(c) We want to approximate y when x = 7.2, using this. To do that, we set x0 = 7.0 and x1 = 7.2, with y0 being the
value that corresponds to x0 (we got that in part (a)), and y1 being the value that we are trying to approximate.
We make the grand assumption that the value of msec for this segment equals the msec that we calculated in
the previous part of this problem. Of course, it is not exactly equal, but it should be close enough. That’s the
approximation.
So, calculate ∆x = x1 − x0 and ∆y = y1 − y0 for these new values in this part of the problem. When you try to
find ∆y, you won’t know what value to put in for y1 yet, but that’s fine, since it is what we are going to end up
solving for in a moment.
(d) Now, set the value of msec from part (b) of this problem equal to the value of ∆y/∆x using the values from part
(c). You should know everything in the equation except for the value of y1 . Solve for y1 , and get a value for it.
This is the linear interpolation value for y when x = 7.2.
(e) Calculate the actual value of y using the functional equation when x = 7.2. It should be quite close to the value
of y that you got in the previous part of this problem.
2. In this problem, we tackle another linear interpolation, but this time with real-life data and no formula to check against. Look
carefully back at the previous problem. This time the table of values is given to us, so we don’t have to construct the table
ourselves. The rest of the problem should go just like the previous problem.
The following table is taken from the 81st edition of the CRC Handbook of Chemistry and Physics. It gives the index
of refraction of air at different wavelengths. (This information is important when you are analyzing properties of the
light given off by different substances.) The column headings are irrelevant, so I will simply call them x and y. If you
want to check out their precise meaning, you can look in the Handbook.
x y
500 27896
510 27870
520 27846
530 27824
540 27803
550 27782
Use this data to estimate the value of y when x = 513.
Finding the slope of the tangent line at a specific point.

We have decided that the slope of the tangent line is what you get when the ∆x gets very tiny in msec. We can do that easily
enough in the form that we have.
For the secant line through (1, −2) with y = 2 x^2 − 3 x − 1, we had that msec = 1 + 2 ∆x. As ∆x shrinks, this becomes
mtan = 1, the slope of the tangent line to y = 2 x^2 − 3 x − 1 at the point (1, −2).
Remember that we had a hassle in calculating instantaneous velocity by setting ∆t = 0? It gave ∆s/∆t = “0/0.” The
same thing happened when we tried to calculate the slope of the tangent line by setting ∆x = 0; we got a “0/0.” But, by
simplifying ∆y/∆x first, and then looking at what happened as ∆x shrank to 0, we avoided all of that. In fact, we didn’t even
notice the problem. We actually did quite a bit here!
What happens in Maple if we try to duplicate this feat? It is worth trying. After all of the preliminaries we did before,
we could type this.
> diffquo( y, 1, 0);
Error, (in diffquo) numeric exception: division by zero

You get an error, since it is illegal to put ∆x = 0 before simplifying. We’ll look at that error in some detail later on. On
the other hand, you can put ∆x = 0 after simplifying, but never before. Do it this way in Maple. First, type in
> diffquo(y,1,delta_x);
> subs(delta_x=0, %);
1 + 2 delta_x
1
That’s the slope of the tangent line.
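As a cross-check, Maple has a built-in limit command that handles the "shrink ∆x to 0" step directly. We have not discussed limits in these notes yet, so treat this as a peek ahead under that assumption; h stands in for ∆x and must be an unassigned name.

```maple
y := 2*x^2 - 3*x - 1:
# Difference quotient at x = 1, written out directly; h plays the role of delta_x.
dq := (subs(x = 1 + h, y) - subs(x = 1, y))/h:
limit(dq, h = 0);    # 1, the same slope as before
```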
What is the equation of the tangent line, then? We have two pieces of information (which is how many you always
need for the equation of a line): We have a point the line goes through, (1, −2), and we have the slope, 1. Use the point-
slope form of the line (which is the one always to use in calculus): y − y0 = m (x − x0 ). In this problem, m = 1, and
(x0, y0) = (1, −2), so the equation of the line tangent to y = 2 x^2 − 3 x − 1 at the point (1, −2) is y − (−2) = (1) (x − 1) or
y = x − 3 if you want to simplify it (unnecessary).
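If you want to see the tangent line next to the curve, the following sketch draws both. The plotting range and colors are my choices, and tanline is a name I made up for the point-slope form.

```maple
f := x -> 2*x^2 - 3*x - 1:
tanline := x -> -2 + 1*(x - 1):     # point-slope form through (1,-2) with slope 1
f(1) - tanline(1);                  # 0: the line passes through the point of tangency
plot([f(x), tanline(x)], x = -1 .. 3, color = [black, red]);
```

In the picture the line should graze the parabola at x = 1 and stay below it elsewhere.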
Finding the slope of the secant line between two general points.
It is a bit restrictive to be able only to work around the point (1, −2) and only on the curve y = 2 x^2 − 3 x − 1.
Mugsy: You’d better believe it.
First, we get rid of the restriction of being only at the point (1, −2). In a moment, we will work with general curves, too.
In order to get the slope of a generic secant line, msec = ∆y/∆x, we need two generic points. I will use the points with
x-coordinates x1 and x2 . From there, we get the y-coordinates (remember we only need the x-coordinates) for this function,
y1 = 2 x1^2 − 3 x1 − 1 and y2 = 2 x2^2 − 3 x2 − 1.
The next thing to get is ∆x and ∆y. That’s not too hard. ∆x = x2 − x1 and ∆y = y2 − y1 = (2 x2^2 − 3 x2 − 1) − (2 x1^2 −
3 x1 − 1). That will require some effort to simplify.
The trick, as before, is to write ∆y in terms of x1 and ∆x. This is done by solving ∆x = x2 − x1 for x2 , plugging into
the equation for ∆y and “simplifying” (multiplying out and canceling terms in) the resulting mess. (And, yes, I will explain
just why we want to do this, quite soon.)
When we do it, we get:
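\begin{align}
\Delta y &= (2x_2^2 - 3x_2 - 1) - (2x_1^2 - 3x_1 - 1) \tag{1.6}\\
         &= \bigl(2(x_1 + \Delta x)^2 - 3(x_1 + \Delta x) - 1\bigr) - (2x_1^2 - 3x_1 - 1) \tag{1.7}\\
         &= \bigl(2(x_1^2 + 2x_1\,\Delta x + (\Delta x)^2) - 3(x_1 + \Delta x) - 1\bigr) - (2x_1^2 - 3x_1 - 1) \tag{1.8}\\
         &= \bigl(2x_1^2 + 4x_1\,\Delta x + 2(\Delta x)^2 - 3x_1 - 3\,\Delta x - 1\bigr) - (2x_1^2 - 3x_1 - 1) \tag{1.9}\\
         &= 4x_1\,\Delta x + 2(\Delta x)^2 - 3\,\Delta x \tag{1.10}
\end{align}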
Note that each term that’s left has a factor of ∆x in it, as before. Factoring out that ∆x gives
∆y = ∆x × (4 x1 + 2 ∆x − 3) .
This is the formula in the form we want it.

The final step in getting msec is to divide this formula for ∆y by ∆x, to get
msec = ∆y/∆x = 4 x1 + 2∆x − 3.
As a check, if we take x1 = 1, we get msec = 4(1) + 2 ∆x − 3 = 1 + 2 ∆x, which checks with the formula that we had
before.
This can also be done on Maple, using diffquo();. After you have gotten into Maple and read in diffquo and defined
y, you can give Maple the command
> diffquo(y, x_1, delta_x);
4 x_1 + 2 delta_x − 3
and it gives just what it should give.
The slope of the tangent line can be obtained from the formula for msec , by setting ∆x = 0 (note, after simplifying). You
get that the slope of the tangent line at (x1 , y1 ) is 4 x1 − 3. Again, this can be done in Maple, by continuing the computation
with
> subs(delta_x=0, %);
4 x_1 − 3
In the case that x1 = 1, which is what we used before, we get that the slope is 4(1) − 3 = 1, which is the result that we
got earlier.
Note: The function 4 x − 3 is not the equation of the tangent line. It is a formula for the slopes of all the tangent lines to
the function.
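The difference between the slope formula and the tangent lines themselves can be seen in a short sketch. The names mtan and tanline are mine, and x = 2 is just a second sample point.

```maple
f := x -> 2*x^2 - 3*x - 1:
mtan := x1 -> 4*x1 - 3:                               # slope formula from the notes
tanline := (x1, x) -> f(x1) + mtan(x1)*(x - x1):      # point-slope form at (x1, f(x1))
tanline(1, x);       # the tangent line at x1 = 1; equivalent to x - 3
tanline(2, x);       # a different line, at x1 = 2; equivalent to 5*x - 9
```

One slope formula, many tangent lines: each choice of x1 feeds a different slope and a different point into the point-slope form.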
Summary of procedures:
To find msec = ∆y/∆x, the slope of the secant line joining two generic points (x1 , y1 ) and (x2 , y2 ) of a function y = f (x),
perform the following steps.
1. Set ∆x = x2 − x1 , and calculate ∆y = y2 − y1 = f (x2 ) − f (x1 ).
2. Plug x2 = x1 +∆x into ∆y and simplify the result. (This usually involves multiplying out terms and canceling whatever
you can.)
3. If f (x) is a polynomial, you should get that, after the simplification, every term has a factor of ∆x in it. Factor that ∆x
out, so that you get ∆y = ∆x × (something).
4. Divide both sides of the equation for ∆y by ∆x, and you get msec = ∆y/∆x.
What you get is a formula for the slope of the secant line joining any two points of the curve y = f (x).
To find mtan , the slope of the tangent line to y = f (x) at the point (x1 , y1 ), perform the following steps.
1. Find msec by the procedure just given.
2. Set ∆x = 0 in the resulting (simplified) expression to get the slope of the tangent line. What you get is a formula for
the slope of the tangent line at any point (x1 , y1 ) on the curve y = f (x).
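As a worked instance of both procedures, here are the same steps carried out for f(x) = x^3. That function is my choice for illustration and is not one of the examples in these notes; the symbols are the ones used in the summary.

```latex
\begin{align*}
\Delta y &= f(x_1 + \Delta x) - f(x_1) = (x_1 + \Delta x)^3 - x_1^3 \\
         &= 3x_1^2\,\Delta x + 3x_1(\Delta x)^2 + (\Delta x)^3 \\
         &= \Delta x\,\bigl(3x_1^2 + 3x_1\,\Delta x + (\Delta x)^2\bigr), \\
m_{\mathrm{sec}} &= \frac{\Delta y}{\Delta x} = 3x_1^2 + 3x_1\,\Delta x + (\Delta x)^2, \\
m_{\mathrm{tan}} &= 3x_1^2 \quad \text{(setting } \Delta x = 0 \text{ after simplifying)}.
\end{align*}
```

Every term of the expanded ∆y has a factor of ∆x, just as step 3 promises for polynomials.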
Occasionally, it is easier to use x1 and ∆x rather than x1 and x2 right from the start. In that situation, you’d start with
∆y = f (x1 + ∆x) − f (x1 ), and never even refer to x2 . If you prefer that, you can use it.
Mugsy: Which one is easier, if I gotta do it?
Albert: Whichever you prefer. For the functions at this level of calculus, it won’t make any significant difference. Take
your favorite one.
Mugsy: Really?! I’ll take Maple.
Dudley: I thought you didn’t like Maple!
Mugsy: I don’t. I dislike algebra more.
Homework #5
Exercises.
1. For this question, use the function f (x) = 3 x^2 − 2 x − 5.

(a) Find the formula for the slope of the general secant line to y = f (x) in terms of ∆x.
(b) Find the slope of the tangent line to y = f (x) at the point with x-coordinate x = 2.
2. For this question, use the function f (x) = 3 x^2 − 5 x − 4.
(a) Find the formula for the slope of the general secant line to y = f (x) in terms of ∆x.
(b) Find the slope of the tangent line to y = f (x) at the point with x-coordinate x = −1.
Magnifying the function, getting a “line.”

This is one demonstration that you will remember, partly because I will drill it into you!
Mugsy: Can I help? I’m good at drilling things into people!
We have already seen this referred to, back when we were discussing linear interpolation. For that process, you want to
replace a section of a graph by a straight line. That sounds a bit suspicious, but it is usually a very helpful thing to do. The
reason it works so often is given here.
The following graphs represent the function f (x) = 3 + (−10 sin(4 x) + 26 cos(23 x) + 15 cos(29 x))/10. The various
ranges on the graphs are designed to narrow rapidly down to x = −1. (The graphs should be read across first and then
down.) Watch how the exceedingly wiggly nature of the graph melts away. You might also notice that the slopes at x = −1
are changing, due to the scaling that Maple uses which changes the unit lengths on the axes.
When a very (but not too) wiggly curve is magnified enough, it looks like a straight line. The derivative is nothing but the
slope of that line! As we magnify, we will be homing in on specific points on the curve. Different points, when magnified
around, give lines with different slopes. All that means is that the slope of the tangent line, that is, the derivative, changes
from one point to another. The derivative is a function, too!
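You can repeat the magnification experiment yourself with a sketch along these lines. The window half-widths in the list are my own guesses, not necessarily the ranges used for the graphs in the notes.

```maple
f := x -> 3 + (-10*sin(4*x) + 26*cos(23*x) + 15*cos(29*x))/10:
# Half-widths of the viewing window around x = -1 (assumed values, for illustration).
for w in [2, 0.5, 0.1, 0.01] do
    print(plot(f(x), x = -1 - w .. -1 + w, color = black));
end do:
```

Each pass through the loop draws the same function on a narrower window, so you can watch the wiggles flatten out.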
Dudley: Why does that give me a feeling like we aren’t done yet?
It should also be mentioned that not every curve can have its wiggles magnified away. An example (the only one that
we will encounter) is the absolute value of x. The graphs of y = |x | are given for the domain [−5, 5] and [−0.0001, 0.0001].
> plot( abs(x), x=-5 .. 5, color=black, scaling=constrained);
> plot( abs(x), x=-0.0001 .. 0.0001, color=black,
> scaling=constrained);
The corner at the origin persists. In fact, all corners will always persist. At a corner, there is no one single tangent line,
and therefore there is no single slope, which means there is no derivative.
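The difference quotient shows the same failure numerically. In this sketch (the name dq is mine), the secant slopes of |x| through the origin are computed for smaller and smaller ∆x on each side.

```maple
# One-sided difference quotients of |x| at x = 0.
dq := h -> (abs(0 + h) - abs(0))/h:
seq(dq(1/10^k), k = 1 .. 4);      # from the right: 1, 1, 1, 1
seq(dq(-1/10^k), k = 1 .. 4);     # from the left: -1, -1, -1, -1
```

The slopes from the right and from the left never agree, no matter how small ∆x gets, so there is no single slope to call the derivative at 0.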
The tangent line to y = f (x) at (x1 , y1 ) is the best linear approximation in this sense: Any other line through (x1 , y1 )
pulls away from the graph of y = f (x) much faster than the tangent line. We will learn more about this later.
Notations and terminologies for derivatives.

Calculus is used in so many different situations that there are many different terminologies for it in the different areas that
use it. The formula for the slope of the tangent line is the derivative of the function. The process of finding the derivative is
called differentiation (not derivation!), and the verb form is to differentiate (again, not derivate nor derive). If you want to
look like an utterly clueless beginner, talk about derivating or deriving functions.
Mugsy: There you go, Dudley. Ask me to derive some function.
Dudley: Sorry, Mugsy. I’m not going to fall for it.
When you are dealing with an (x, y)-graph of a function, the notation that is used for the derivative also has a number
of forms. One is
\[ dy/dx = \frac{dy}{dx}, \]
which is nothing more than an adaptation of the ∆y/∆x notation.
There is a bit of information that is automatic in the notation dy/dx that is not apparent at first. It implies that x is the
independent variable and y is the dependent variable. This seemingly innocent comment will come back later on both to
cause problems and to give help. We often will use the phrase “the derivative of y with respect to x,” and that carries the
same meaning about which variable is dependent and which is independent.
There are other notations for derivatives that have a different look. If you take dy/dx and treat it as a method of converting
y to its derivative, you can view this as y becoming dy/dx, and so you will think of the derivative as dy/dx = (d/dx) y, by just pulling
the y down in front. In that case, d/dx becomes the differentiation operator (starting with y, it gives dy/dx). Accordingly, dy/dx is
often written as d/dx(y).
Probably the simplest notation is y′. The prime (superscripted dash) denotes differentiation. That is, y′ and dy/dx mean
exactly the same thing. Another, less common, notation is Dy, or occasionally, D_x y.
For f (x) notation, all the same things are used. That is, each of the following represents the derivative of f (x):
d f(x)/dx, (d/dx) f(x), f′(x), D f(x).
There is no systematic notation for the difference quotient, except perhaps for msec , but that doesn’t tell you what the
function or points were.
The notation y′ is usually credited to Lagrange (Newton’s own notation was a dot over the variable). Leibniz used dy/dx. Of the two, Leibniz’s notation works the best in the sense that
formulas are easiest to remember in his form. But both are simple and exceedingly common. You will need to be familiar
with both. I will use both in this course to make sure that you get used to both.
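Maple happens to offer both styles as well, through its built-in diff command and D operator. The sketch below is only meant to connect the notations; we have not needed either command in these notes so far.

```maple
y := 2*x^2 - 3*x - 1:
diff(y, x);          # Leibniz style, dy/dx: differentiate the expression y with respect to x
f := x -> 2*x^2 - 3*x - 1:
D(f);                # operator style, like f': returns the derivative as a function
D(f)(1);             # its value at x = 1, which is 1
```

Both give the slope formula 4 x − 3 that we found by hand; diff returns it as an expression, and D returns it as a function.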
The uses of derivatives.

It seems strange to even put this section in the notes! Derivatives are half of calculus, and calculus itself is fundamental to
many aspects of physics, engineering, economics, and (of course) mathematics. The rest of this course should show you a
few of the uses of derivatives.
Homework #6
Exercises.
1. In an earlier homework exercise, we worked out the slope of the secant line to y = 2 x^2 − 3 x − 1, through x = 1 and
x = 5. We assumed then that x = 1 was the beginning point and x = 5 was the ending point. Suppose instead now
that x = 5 is the beginning point and x = 1 is the ending point.
(a) Compare the values of ∆x and ∆y then to the values of ∆x and ∆y now. That is, is there any obvious relation
between the values then and now?
(b) Calculate a new value of msec using the new values of ∆x and ∆y. Compare the new value of msec to the old
value.
(c) On the basis of the results from the previous parts of the exercise, which of ∆x, ∆y, and msec change when the
order of the points changes?
2. Also in that same homework set, we worked out the slope of the secant line to y = 2 x2 − 3 x − 1, through x = 1 and
x = 2. We assumed then that x = 1 was the beginning point and x = 2 was the ending point. Suppose instead now
that x = 2 is the beginning point and x = 1 is the ending point.
(a) Compare the values of ∆x and ∆y now to the values then. That is, is there any obvious relation between the
values then and now?
(b) Calculate a new value of msec using the new values of ∆x and ∆y. Compare the new value of msec to the old
value.
(c) On the basis of the results from the previous parts of the exercise, which of ∆x, ∆y, and msec change when the
order of the points changes?
3. I gave a step-by-step procedure for finding msec for a function y = f (x) with generic points. Try that process on the
function y = tan x. At what step does the procedure break down? [We’ll develop a different method to deal with such
functions.]
4. Find the slopes of the tangent lines to y = 2 x2 − 3 x − 1 at the points x = 1, x = 3, and x = 5. (You can use the work
we did in class.) Find the equations of the tangent lines to the graph at those points.
1.2.2 Correlations to velocity, average and instantaneous.

We now come back to look at velocities and see how what we did with slopes was precisely the same thing.
Dudley: You mean we’re going over this stuff again?
Albert: Yes. The more different ways you can see it, the better off you will be.
Mugsy: Or the more bored . . . .
Albert: Wait until you have to use this idea. You’ll be glad that you have a number of different approaches to it.
Again, the notation gives dependent and independent variables. For average velocity, v_avg = ∆s/∆t, and the independent
variable is t, while the dependent variable is s.
Suppose we graph s versus t (position versus time). Notice the terminology, by the way. You say that you graph the
dependent variable versus the independent variable. The independent variable is plotted horizontally and the dependent
variable is plotted vertically (except in unusual circumstances, and you’d need to note that on the graph). The usual graph
is y versus x.
The slope of the secant line in that plot is ∆s/∆t, with the dependent variable on top and the independent variable on
the bottom. But ∆s/∆t is the average velocity. In other words, the average velocity is the slope of the secant line! There is
absolutely no difference mathematically. The only difference is in the interpretation of the difference quotient.
What about the instantaneous velocity? We got that before by taking ∆t smaller and smaller (going to 0) in the average
velocity. That corresponds in the graphical approach to taking ∆x smaller and smaller in the slope of the secant line,
which gives the slope of the tangent line. Again, there is absolutely no difference mathematically between the slope of a
tangent line and an instantaneous velocity. The slope of the s vs. t plot of an object at a certain value of t is precisely the
instantaneous velocity of the object at time t.
If you’ll remember back in the first lecture, I talked about the slope of the line s = 50t (coming from a constant velocity
of 50 m.p.h.), and commented that the slope of the line was 50 = velocity. We are saying exactly the same thing here,
except that we are not requiring that the velocity be constant. In fact, this really is useful exactly when the velocity is not
constant, since constant velocity is so simple to work with.
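If you want to watch the average velocity settle down to the instantaneous velocity, here is a minimal Maple sketch. The position function s = 16t^2 and the time t1 = 1 are hypothetical choices made only for illustration; they do not come from earlier in the notes. The names v_avg and h are likewise just labels picked for this sketch.

```maple
> restart:
> s := t -> 16*t^2:                          # hypothetical position function (an assumption)
> t1 := 1:                                   # time at which we want the instantaneous velocity
> v_avg := dt -> (s(t1 + dt) - s(t1))/dt:    # average velocity over a time wiggle dt (secant slope)
> seq(v_avg(10.0^(-k)), k = 1..4);           # dt = 0.1, 0.01, 0.001, 0.0001
> simplify(v_avg(h));                        # the same difference quotient with h left as a letter
```

The numbers from the `seq` line should creep toward 32, and the simplified difference quotient should come out as 32 plus a multiple of h, which is why shrinking the time wiggle leaves 32 as the slope of the tangent line at t = 1.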
1.2.3 Green box functions.

The best interpretation of derivatives comes from the most unlikely source, the “green box” definition of functions.
Dudley: Al, is this really a calculus course? This looks weird.
Albert: It certainly is not a typical calculus course. But it seems to be covering the right material. I’m reserving my
decision until later.
It provides us the most intuitive and applicable interpretation of what a derivative means. We will use the y = f (x) notation
for functions for the moment.
Understanding ∆x and ∆y.

The first thing we have to deal with is the ∆x and ∆y in the definition of derivatives. They are the amounts by which x
and y change, respectively, in the function. The intent, though, is for both ∆x and ∆y to be small, and getting smaller.
Accordingly, I term them the “wiggles” in x and y, slight changes in the variables.
Dudley: Augh! You’ve got to be kidding.
Albert: Hey, this is clever! And unique, as far as I have ever seen.
The value of ∆x is what I call the “input wiggle;” it represents how much the input changes. The value of ∆y is the “output
wiggle;” it represents how much the output changes. Note that the input wiggle causes the output wiggle.
Wiggles are based around some initial value. When you change something slightly, you are changing away from a value
for that thing. This will turn out to be important soon. Hopefully, it will also make more sense then, too.
Dudley: Is it all right if I don’t understand this remark?
Albert: Look at it this way. When we were magnifying a curve to get a “line,” the slope of that line depended on
where we started, right?
Dudley: I guess.
Albert: The wiggles will depend on where we start, too. And those starting points are what we are wiggling from.
Does that help?
Mugsy: Nope. But you did try.
∆y/∆x ≈ (dy/dx), for ∆x small.

The difference quotient gets closer and closer to the value of the derivative as ∆x gets smaller and smaller. That’s all that
this says (so far). The input and output wiggles are related this way. That’s the key observation. The ratio of the sizes of
the input and output wiggles gives the derivative, approximately.
Again, note the similarity in notation. When ∆x gets small, it effectively turns into dx, and ∆y effectively becomes dy
(both at once).
∆y ≈ (dy/dx)∆x.
What we do is solve for the output wiggle, ∆y. We are asking for how much the output will wiggle when the input is wiggled by
a certain amount, ∆x. We could use this to estimate (dy/dx), and we will do that, later. But first, we still need to understand
this equation.
The equation
∆y ≈ (dy/dx)∆x
says that the output wiggle is (roughly) directly proportional to the input wiggle, and the derivative is the constant of
proportionality. If we halve the input wiggle, we will halve the output wiggle, approximately. Double the input wiggle, the
output wiggle doubles, again approximately.
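A small Maple sketch can make the proportionality visible. Everything in it is an assumption chosen for illustration: the function x^3, the starting value 2, and the derivative value 12 at that point (taken as given, in the same way the notes hand you derivatives for now).

```maple
> restart:
> f := x -> x^3:                             # illustrative function (an assumption, not from the notes)
> x1 := 2:                                   # starting value that we wiggle away from
> wmf := 12:                                 # dy/dx at x = 2, taken as given for now
> Delta_y := dx -> f(x1 + dx) - f(x1):       # exact output wiggle caused by input wiggle dx
> seq([w, Delta_y(w), wmf*w], w = [0.2, 0.1, 0.05]);   # input wiggle, exact wiggle, formula's estimate
```

Each halving of the input wiggle should roughly halve the exact output wiggle (about 2.65, then 1.26, then 0.62), while the estimates from the formula halve exactly (2.4, 1.2, 0.6).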
(dy/dx) is then what you multiply ∆x by to get ∆y.

You multiply the input wiggle by dy/dx to get the output wiggle. That’s what the derivative means! I call this the “wiggle
magnification factor” interpretation of the derivative. I will often abbreviate it as WMF.
All of these proportions are rough, because the function is not exactly a straight line. They are exact for straight lines.
The homework deals with that.
Albert: You should learn to note when comments about the homework show up in the text. They are usually clues
about what to expect, or the result that you should get.
This really is the most important interpretation of a derivative! I will end up using it a number of times throughout the
course. For reference (and, again, I will refer to this later in the course) this is the wiggle magnification formula:
∆y ≈ (dy/dx)∆x (1.11)
You should memorize this formula! It is probably the most important formula in the course!
Mugsy: Is he being serious?
Albert: Probably. It certainly is fundamental, and it is quite possible to build up a large portion of calculus from this
formula. That might be exactly what he will do.
Correlate the WMF to slope of tangent lines.

It obviously would be important to show that this novel approach to derivatives really fits in with what we did before. Why
should the wiggle magnification factor be the same as the slope of a tangent line?
Mugsy: Actually, that’s not something I had bothered to worry about.
The slope of the secant line was exactly ∆y/∆x, so
m_sec × ∆x = (∆y/∆x) × ∆x = ∆y
also exactly. As the slope of the secant line changes to m_tan, the formula changes to an approximation,
(m_tan) × ∆x ≈ m_sec × ∆x = ∆y
or
(m_tan) × ∆x ≈ ∆y
Since the derivative equals m_tan = dy/dx, this is just the same as the wiggle magnification formula.
Another way of seeing it is to remember that as you zoom in on a curve, it flattens out. This zooming-in is the same as
taking ∆x small; we are only looking at values very near a point. As the curve becomes more like a straight line, values of
∆y/∆x come closer and closer to the slope of the curve at the point. That isn’t saying anything more than that the
curve begins to look a lot like a straight line. The slope of the line that the curve resembles is the slope of the tangent line,
which is dy/dx ≈ ∆y/∆x. In that case, you can find ∆y ≈ (dy/dx)∆x.
We can drive the point further home by looking at specific ranges of dy/dx. Here’s a series of pictures for the four possible cases.
Remember that ∆x positive means that the arrow points to the right (that is, x is increasing), while ∆x negative means
that the arrow points to the left. Similarly, ∆y positive means that its arrow points upward while ∆y negative means the
arrow points downward.
For dy/dx large and positive, the slope of the tangent line is large and positive. This means that the ∆y arrow will be
much longer than ∆x.
[Figure: x and y axes, with a short ∆x arrow pointing right and a long ∆y arrow pointing up.]
For dy/dx small and positive, ∆y will be much shorter than ∆x.
[Figure: x and y axes, with a long ∆x arrow pointing right and a short ∆y arrow pointing up.]
For dy/dx negative, we have to realize that ∆y and ∆x will be of opposite signs. That is, if ∆x is positive, ∆y will be negative.
For dy/dx small and negative, ∆y will be small in comparison to ∆x, and opposite sign.
[Figure: x and y axes, with a gently falling line, a ∆x arrow pointing right, and a short ∆y arrow pointing down.]
For dy/dx large and negative, ∆y will be large in comparison to ∆x, but of the opposite sign.
[Figure: x and y axes, with a steeply falling line, a ∆x arrow pointing right, and a long ∆y arrow pointing down.]
Where dy/dx > 0, the function is said to be increasing and the graph is rising. This makes sense, if you realize that a positive
slope on the tangent line means that the function is headed uphill (toward the right). Where dy/dx < 0, the function is said
to be decreasing and the graph is falling. This also makes sense, since the tangent line is headed downhill (toward the right).
When you do graphing by hand, this information is very important.
On the other hand, with graphing calculators and programs like Maple, graphing by hand carries less motivation. This
takes away much of the usefulness of this approach. But for reference, we will summarize:
If dy/dx > 0, the function is increasing and the graph is rising.
If dy/dx < 0, the function is decreasing and the graph is falling.
If dy/dx = 0, “interesting” things happen.

Dudley: Why do I get this ominous feeling again?
Albert: For good reason. Later, you’ll see that a place where the derivative is zero is called a critical point.
The function is not guaranteed to be either increasing or decreasing at that point. There are a number of possibilities, and
we will investigate some of them later, in connection with max-min problems.
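Here is a minimal Maple sketch of the three cases in the summary. It borrows the function x^2 and its derivative 2x, which the notes supply in a later worked example; the sample points −1, 0, and 1 are assumptions picked only to land on one case each.

```maple
> restart:
> y := x^2:                                          # function from a later example in the notes
> y_prime := 2*x:                                    # its derivative, taken as given here
> seq([a, subs(x = a, y_prime)], a = [-1, 0, 1]);    # the slope dy/dx at x = -1, 0, 1
> plot(y, x = -2..2);                                # falling to the left of 0, rising to the right
```

The slopes should come out negative, zero, and positive in that order, matching a graph that falls, flattens out for an instant, and then rises.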
Correlate this to instantaneous velocities.

For velocities, we have to change notation. We have s = s(t), and velocity is v = ds/dt. Commonly, in physics, a derivative
with respect to x is written as a prime, as in y′. However, a derivative with respect to time is written as a dot over the letter.
So, velocity will commonly be written (by physicists) as v = ṡ rather than v = s′. You say ṡ as “s-dot.” It can occasionally
be confusing when dots blur with random blotches and dust specks. But the notation is very convenient. Just use caution.
Why should the wiggle magnification factor show up as an instantaneous velocity? You have to think about it a second.
The wiggle magnification formula says that
∆s ≈ (ds/dt)∆t
Does that make sense? Of course it does! Suppose you wanted to figure out what to multiply by time to get distance.
Obviously, what you want to use is velocity. The only trick is to figure out what velocity, especially when the velocity is
changing. For ∆t small though, the velocity can’t change too much, and the formula makes sense.
Mugsy: You mean I already knew this? Are you trying to make this simple, or something?
Albert: Well, it is basically familiar. Is there anything wrong with trying to make it simple?
Mugsy: I feel cheated if it isn’t complicated enough.
Albert: Don’t worry. That comes later. In the meantime, enjoy it while it’s easy.
Distance is velocity times time, with the velocity being that (only slightly changing) value during that (very short) time.
The interpretations of increasing and decreasing for the sign of dy/dx carry over to interpretations of the sign of ds/dt.
When ds/dt > 0, the object is moving forward. If ds/dt < 0, the object is moving backward. When ds/dt = 0, the object
is stopped. This correlation might be of some help in making sense of the max-min problems we will encounter.
Dudley: That’s the second time that he has mentioned those. Al, why do they keep appearing?
Albert: Hard to say. Maybe he is just trying to get you used to hearing the term.
Mugsy: I used to call that process “softening up the client” when I used it.
Dudley: That’s not helping my apprehension.
Notations, terminology, and cautions.

I need to enter a few notational cautions here. When I write dy/dx or ds/dt, I do not intend this to be thought of as a
quotient of two “things,” in the same way that a difference quotient ∆y/∆x is the quotient of two differences. The notation
y′ for dy/dx is perhaps better in that way, since you don’t see something that looks like a quotient. On the other hand,
there is much to be gained by viewing dy/dx as the quotient of two differentials (the correct name for them). This yields
immediately the idea of wiggle magnification factor, and engineers, for example, often use differentials in ways that make
mathematicians cringe. In this course, aimed at being practical rather than rigorous, I will adopt the engineering point of
view: dy/dx is both a derivative and the quotient of two differentials. We will discuss differentials in more detail later in this
chapter.
An example, using numbers.

In order to illustrate the wiggle magnification formula, consider this: Suppose f(x) = x^2, x1 = 3, and ∆x = 0.2. Let’s work
out all the different parts of the wiggle magnification formula and see how it all ties together.
Mugsy: You mean using real numbers! Oh, wow!
I’ll have to tell you for now what the derivative is. For f(x) = x^2, it’s f′(x) = 2x. Later on, when we can find derivatives
fairly easily, you’ll have to do that yourself.
First, the information that is given to you:
f(x) = x^2 (1.12)
x1 = 3 (1.13)
∆x = 0.2 (1.14)
f′(x) = 2x (1.15)
Calculating f′(x) ∆x:
y1 = f(x1) = (3)^2 = 9 (1.16)
f′(x1) = 2(3) = 6 (1.17)
f′(x1) ∆x = 6(0.2) = 1.2 (1.18)
Calculating ∆y:
x2 = x1 + ∆x = 3.2 (1.19)
y2 = f(x2) = (3.2)^2 = 10.24 (1.20)
∆y = y2 − y1 = 10.24 − 9 = 1.24 (1.21)
Verifying that ∆y ≈ f′(x) ∆x:
∆y ≈ f′(x1)∆x (1.22)
1.24 ≈ 1.2 (1.23)
Now let’s look carefully at what we did. The derivative was 6, which says the output wiggle will be roughly 6 times the
input wiggle. The input wiggle was given to us as 0.2, so the output wiggle should be about 6 × 0.2 = 1.2. When we
actually calculated the exact output wiggle, we got 1.24, and the final line just says that 1.24 ≈ 1.2: the exact wiggle is
approximately equal to 6 times the input wiggle. It really worked!
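The same check can be run in Maple. This is only a sketch of the calculation in equations (1.16) through (1.21); the names fprime, Delta_x, Delta_y, and Wiggle_formula are labels chosen here, and the derivative is typed in by hand because we cannot find it ourselves yet.

```maple
> restart:
> f := x -> x^2:                             # the function from the example
> fprime := x -> 2*x:                        # its derivative, as given in the notes
> x1 := 3:  Delta_x := 0.2:                  # starting value and input wiggle
> Delta_y := f(x1 + Delta_x) - f(x1);        # exact output wiggle, as in (1.21)
> Wiggle_formula := fprime(x1)*Delta_x;      # derivative times input wiggle, as in (1.18)
> Delta_y - Wiggle_formula;                  # how far off the approximation is
```

The first two results should reproduce 1.24 and 1.2, and the last line shows the small gap between them.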
We could have used this same approach to approximate (3.2)^2 if we wanted to. Here’s how. We know that 3^2 is 9,
and that’s at least close. But we can improve the approximation using the wiggle magnification formula. We find that the
derivative (wiggle magnification factor) of the function f(x) = x^2 at x = 3 has the value 6. (We will be able to do this Real
Soon Now.) So, if we want to wiggle the input by 0.2 (from 3 to 3.2), the output will wiggle by about 6 × 0.2 = 1.2. That
means that the value of (3.2)^2 should be close to 9 + 1.2 = 10.2. That really is pretty close, since the exact value is 10.24.
In the section on algorithms of computers, we will learn how to improve this approximation process. We shall return to
this!
Dudley: Impending doom again?
Albert: Only for those taking calculus next semester.
Mugsy: Great! That means I don’t have to worry about it.
Albert: Oh? You’re in next semester, too. Public demand.
1.2.4 Other definitions of functions.

The gnome definition of functions doesn’t have a useful interpretation for derivatives. But the others (ordered lists or
ordered pairs) do. It is worth seeing at least once.
The uses of the wiggle magnification formula.

If you look at an old table of (for example) square roots or trig functions, there will often be some extra numbers scattered
around the sides or bottoms of the columns. These numbers are used in a process called interpolation. The situation is this.
If you have a table of values, and you want to look up a value in the table, but it isn’t given exactly, you have to estimate
(the correct term is interpolate) what the value is in terms of the values just above and below yours. (Don’t get the idea that
interpolation is out of date, though. There are many situations—chemistry comes immediately to mind—where physical
data has been gathered for a certain range of conditions, but your specific interest is for values inside that range, but not
exactly equal to any of the table values. Interpolation is the only way to go then!) Those extra numbers around the side
of the columns are there to help you interpolate. If your independent variable (the one you can control) is called x, and
the dependent variable you are measuring (that depends on x) is called y, the table lets you read off y if you know x, but
only for specific values of x. For other values of x, you use essentially the wiggle magnification formula to estimate ∆y as
(dy/dx)∆x, where ∆x is easy to figure: It is the difference between what you have and the nearest entry in the table. The
value you don’t know easily is (dy/dx), the wiggle magnification factor. But that is precisely what those extra numbers
around the sides are! Then you can find ∆y approximately using the wiggle magnification formula. Then knowing ∆y
roughly and the value of y you have in the table, you can figure out a corrected value of y that should be considerably more
accurate.
We did a problem going over linear interpolation way back in this chapter. It’s the same thing, except that you were
approximating the derivative by a secant line, obtained from two points that were very close to each other (on either side of
the point you were looking for). And as long as the tabulated values are close together, we can treat it as a highly magnified
function, which will look very much like a straight line. That means that linear interpolation is liable to be quite accurate.
(There are more accurate methods that use more points than the two on either side of the value you have, but we aren’t
going to go into those here.)
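To make the table idea concrete, here is a minimal Maple sketch that estimates the square root of 9.2 from a pretend table. The table entries (9 and 10, with the second root rounded to four places) and all the names are assumptions made up for the illustration; the derivative 1/(2√x) of √x is the one supplied in the homework below.

```maple
> restart:
> x_tab := 9:  y_tab := 3:                       # table entry: sqrt(9) = 3
> x_next := 10:  y_next := 3.1623:               # neighboring entry, rounded as a table would be
> x_want := 9.2:                                 # the value that is not in the table
> Delta_x := x_want - x_tab:                     # input wiggle away from the table entry
> wmf := 1/(2*sqrt(x_tab)):                      # dy/dx for sqrt(x) at the table entry
> y_tab + wmf*Delta_x;                           # estimate from the wiggle magnification formula
> m_sec := (y_next - y_tab)/(x_next - x_tab):    # secant slope between the two entries
> y_tab + m_sec*Delta_x;                         # estimate from linear interpolation
> sqrt(x_want);                                  # Maple's own value, for comparison
```

Both estimates should land within about a thousandth of Maple's value. With table entries closer together than 9 and 10, the secant slope and the derivative would agree even better.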
Homework #7
Exercises.
1. Calculate ∆y and f′(x1)∆x for each of the following, and show that they are approximately equal. The example of
approximating (3.2)^2 shows how to do this. The use of either a calculator or Maple would be extremely handy. I give
you the derivatives, which we will learn to do next.
(a) f(x) = 3x^2 − 2x, x1 = 2, ∆x = 0.1, f′(x) = 6x − 2
(b) f(x) = 1/√x, x1 = 9, ∆x = 0.2, f′(x) = −1/(2√(x^3))
(c) f(x) = (x + 1)/x, x1 = 4, ∆x = 0.2, f′(x) = −1/x^2
2. Calculate ∆y and f′(x1)∆x for each of the following, and show that they are approximately equal. Again, I give you
the derivatives.
(a) f(x) = x^3 − 2x^2, x1 = 2, ∆x = 0.3, f′(x) = 3x^2 − 4x
(b) f(x) = √x, x1 = 1, ∆x = 0.1, f′(x) = 1/(2√x)
(c) f(x) = 1/x, x1 = 5, ∆x = 0.1, f′(x) = −1/x^2
Problems.
1. In this problem, we show that the wiggle magnification factor formula ∆y ≈ f′(x) × ∆x is actually always exactly
equal (rather than just approximately equal) for straight lines. The equation for a line is y = f(x) = mx + b, which
has derivative f′(x) = m.
Albert: That’s the slope of the line, after all. You don’t even need to magnify it in this case!
Use this information for the rest of this problem. Leave everything in terms of letters; don’t substitute numbers here.
Essentially, you should follow the procedure given in the notes for finding m_sec for two generic points.
(a) Use x-coordinates x1 and x2 (leaving them as letters and not using numbers), and calculate the corresponding
y-coordinates y1 and y2 by plugging into the equation for the line.
(b) Plug those values into the equation ∆y = y2 − y1, for the exact wiggle, and simplify what you get. It should
reduce to m(x2 − x1), which is exactly the value of f′(x)∆x. This says that the exact wiggle is exactly equal to
the wiggle magnification factor approximation for a line (only).
2. The following picture shows a line with small negative slope, but both ∆x and ∆y are positive, giving a positive slope.
What’s wrong?
[Figure: x and y axes, with a gently falling line; the ∆x arrow points right and the ∆y arrow points up.]
Investigation.
1. In the preceding exercises and problem, we discovered that the wiggle magnification formula is good, and is even
exact for straight lines. What this investigation does is examine how far off the formula can get, and why it works
best for ∆x smallest. We will look carefully at one specific function, f(x) = 1/x with derivative f′(x) = −1/x^2.
For this investigation, draw on your homework paper (it might need to be sideways to fit on the page) the following
table:
∆x    x2    y2    ∆y    f′(x1)∆x    Error = |∆y − f′(x1)∆x|    Error/(∆x)^2
(a) Take x1 = 0.5 and ∆x = 0.1. Compare ∆y and f′(x1)∆x the way you did in the homework exercises.
(b) Would you expect the comparison to be better or worse with ∆x = 0.01? Why?
(c) Work out the comparison in the exercises using the same f(x) and x1 but with ∆x = 0.01, ∆x = 0.001, and
∆x = 0.0001. (You will have to be careful with these if you use a calculator. I recommend you write down all
the digits your calculator gives. Don’t round off or you will spoil the problem!) Create a table with columns
labeled at the top
∆x, x2, y2, ∆y, f′(x1)∆x, and |∆y − f′(x1)∆x|.
Leave room for one more column, to be added next. (Note that x1 = 0.5 and y1 = 2.0 will always be the same,
so separate columns for them are unnecessary.) (Also note that if you get 0 for any of the numbers in the last
column, you rounded when you weren’t supposed to. Go back and recalculate those numbers, NOW!)
(d) The wiggle magnification formula is an approximation, not an exact equality, so you should expect that ∆y and
f′(x)∆x will be slightly different. This difference is the amount that the wiggle magnification formula is off
by, and is the value in the last column. That amount is called the error in the formula. That’s what we want
to look at very carefully. For a line, the approximation is exact, with no error. So, the last column gives how
much the function is not a line, and later on (next semester) we will see that the error should be roughly quadratic
in ∆x, that is, proportional to (∆x)^2. If that’s true, the error column should be nearly Error ≈ C × (∆x)^2, for
some constant C. Then, Error/(∆x)^2 ≈ C, a constant. This we can check! For each of the rows of the table, fill
in the last column, whose values are the error column divided by (∆x)^2. (That is, form the last column for the
row with ∆x = 0.1 by dividing the error column by (0.1)^2 = 0.01. Do a similar thing for the remaining rows.)
(e) Do the results in the last column seem to be roughly constant? What value (the C earlier) does it seem that the
constant is? (Note: As ∆x shrinks, the values in the last column should be getting closer to the correct value of
C. Think of it as a limit. In particular, you do not just want to average the values in the last column to get C.
What value does C seem to be getting closer to as ∆x shrinks?)
(f) Use the estimated value of the constant C from the previous part and the formula Error ≈ C (∆x)^2 to estimate the
error when ∆x = 10^(−15). This is likely to be smaller than your calculator is going to be able to handle, but you
can check your answer on Maple. Here’s how to set up the check. The semicolons at the end of the commands
have been replaced here by colons, so that the output is suppressed. (You have to change the ending colons to
semicolons to get the answers.)
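> y := 1/x:                                            # the function
> y_prime := -1/x^2:                                   # its derivative, as given
> x1 := 1/2:                                           # starting value, kept as an exact fraction
> Delta_x := 10^(-15):                                 # input wiggle
> x2 := x1 + Delta_x:                                  # wiggled input
> y1 := subs(x=x1, y):                                 # output at x1
> y2 := subs(x=x2, y):                                 # output at x2
> Delta_y := y2 - y1:                                  # exact output wiggle
> Wiggle_formula := subs( x=x1, y_prime) * Delta_x:    # derivative times input wiggle
> Error := abs(Delta_y - Wiggle_formula):              # size of the error in the formula
> evalf(Error, 20):                                    # decimal value, to 20 digits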
This last number should be close to the value you predicted. Note that you are using exact (rational number)
arithmetic in Maple, up to the last step, so there is no error from round-off in this approach.
(g) On the basis of what is done in the preceding parts of this question, the error in the wiggle magnification formula
is roughly proportional to (∆x)^2. Using that, explain why you think that the wiggle magnification formula works
best when ∆x is small.
1.3 Calculating derivatives.

We are finally going to get “the right way” to calculate derivatives.
Dudley: Yay!
Mugsy: Hey! Does this mean you won’t be giving us the derivatives anymore?
Albert: Yes.
Dudley: I just changed my mind. BOOOO!
We take up in this one long section all of the rules that there are for finding derivatives. Oddly enough, there aren’t that
many different things to remember. Applying them all in the right order and in the right places can be a little trickier. But,
if you stay calm, you can sort it out fairly simply.
Mugsy: What do you mean “stay calm?”
Albert: Basically, don’t get overwhelmed by long equations. He’ll explain in more detail shortly.
Mugsy: You’re beginning to sound like him.
1.3.1 Motivation.
As “easy” as Maple is to work, diffquo(); is “the long way” to do derivatives. Standard calculus courses spend quite a
while on that part. I have spent enough (at least) to convince you that there has to be a better way.
We want to avoid tons of messy algebra.

The algebra using the approach from diffquo(); can get horrendous. This is one reason calculus has a bad reputation.
Maple is not the answer; diffquo(); is really quite limited. There is a better way. (And Maple knows how to use that
better way, too. We just haven’t covered it yet. We will.)
Patterns in derivatives became formulas.

As people worked out derivatives “the long way,” they became aware of certain patterns. If these patterns turned out to be
correct, they would make finding derivatives much easier. The correct patterns turned into rules that we will now learn.
This used to be a lot messier than the clean version we have today.
1.3.2 Differentiating polynomials.

We will now get enough rules to be able to write down the derivative of polynomials (and a few other functions) on sight.
Dudley: You mean like the derivative of x^2 is 2x from the example last section?
Albert: Precisely.
Derivatives of simple functions.

The first simple results are these. The first three can be done with just as much interpretation as we have now:
d/dx (constant) = 0
d/dx (x) = 1
d/dx (mx) = m
These formulas are actually tricky at first.

Dudley: If these are tricky, the rest should be a breeze!
Albert: It’s only tricky in the sense that you’ll tend to forget them in the collection of other formulas.
You will get used to them, though. They don’t seem to fit into the pattern of the last two formulas, even though they do.
The last two require new formulas:
d/dx (x^n) = n x^(n−1)
d/dx ((constant) x^n) = (constant × n) x^(n−1)
dx
Note that the last two formulas do not require that n be a positive whole number, but for polynomials, it will be.
These are the building blocks of polynomials. All we need to do is figure out how to add and subtract them and we’ll
be there.
Derivative of the sum and difference of monomials.

This rule is easy to remember. You do with addition and subtraction just what you’d want to do: add and subtract the
results. That’s important enough to earn a box.
The derivative of a sum (or difference) of functions is just the sum or difference of the derivatives.
Here are the examples that I will go over in class. Extra space is allowed at the right of the page for you to fill in the
answers.
functions                          derivatives
x^8
x^200
√x
5x^9
5/x^9
(1/2) x^10
2x^4 + 5x^5
5x^3 − 2x^2 + 17x − 8
20x^4 − 7x^3 + 19x + π
Finding derivatives with Maple.

Maple can work derivatives directly, and not only by fiddling with diffquo();. It has the rules I just gave (and many
more) built into it. For example, here is how you can find the derivative of 5x^3 − 7x + 10.
> diff( 5*x^3 - 7*x + 10, x );
15 x^2 − 7
You need to include both parts in diff();. That is, you have to have the function and the variable. The reason is that
the function could have other variables or parameters in it, and you need to tell diff(); which one is the independent
variable as opposed to just constants. That is, if the function is a*b*c, how would Maple know what the variable is if you
didn’t say?
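Two quick things are worth trying at this point, shown in the sketch below. The first checks the sum and difference rule on the polynomial from the example above by differentiating it one term at a time. The second uses a hypothetical expression with parameters a, b, and c (an assumption chosen for illustration) to show why the second argument of diff(); matters.

```maple
> restart:
> p := 5*x^3 - 7*x + 10:                          # the polynomial from the example above
> diff(5*x^3, x) - diff(7*x, x) + diff(10, x);    # differentiate term by term, then combine
> diff(p, x);                                     # differentiate all at once; should agree
> q := a*x^2 + b*x + c:                           # hypothetical expression with parameters
> diff(q, x);                                     # x is the variable; a, b, c act as constants
> diff(q, a);                                     # now a is the variable; x, b, c act as constants
```

The first two results should match each other, while the last two differ because Maple treats every letter other than the named variable as a constant.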
Note that I will assume that you can use Maple to find any derivatives you encounter. In fact, I expect you to check any
answers you aren’t sure of by using Maple! So, please, if you are having trouble with Maple, see me soon.
The uses of derivative formulas.

I could impress you with some interesting, but very advanced, uses of the formulas for the derivatives of polynomials. I’ll
only give you one. There is a very common approximation technique called cubic splines that splits up an interval into
pieces and tries to approximate some function on each piece by using a cubic polynomial. In almost all situations, though,
you want to make sure that the pieces of the cubic polynomials fit together “well,” meaning that they join (have the same
values where they fit together), but also that their derivatives are equal, so that they have the same slopes where they fit
together. The formulas for the derivatives of the cubic polynomials become part of the method for finding the coefficients
of the polynomials.
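As a very small taste of the idea, the sketch below checks that two cubic pieces fit together "well" at the point x = 1. Both cubics are hypothetical, invented so that the check works out; finding such pieces for real data is the job of the spline method itself, which is not shown here.

```maple
> restart:
> p1 := x^3:                                            # hypothetical cubic piece for x between 0 and 1
> p2 := 1 + 3*(x-1) + 3*(x-1)^2 - (x-1)^3:             # hypothetical cubic piece for x between 1 and 2
> subs(x = 1, p1), subs(x = 1, p2);                     # values where the pieces meet
> subs(x = 1, diff(p1, x)), subs(x = 1, diff(p2, x));   # slopes where the pieces meet
```

If the two values agree and the two slopes agree, the pieces join with no jump and no corner at x = 1.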
Homework #8
Exercises.
1. Find the derivatives of the following functions.
(a) 24x^3 + 5x^2 − 10x − 15
(b) −16t^2 + 5t + 12 (The variable doesn’t have to be x!)
(c) x^8 + x^6 + x^4 + x^2 + 1
√
(d) 3 x (First, you’ll have to write this as x to a power. The same idea is needed for the rest of these exercises.)
(e) 6√x − 6/√x
(f) 5/x^2 + 5/√x
2. Find the derivatives of the following functions.
(a) x^4 − 37x^3 + 16x − 125
(b) 5t^3 + 8t^2 − 9t + 15
(c) x^4 − x^3 + x^2 − x + 1
(d) 1/x^2
(e) 6/x^3 + x^3/6
√ √ √
(f) x + 3 x + 4 x
3. Make up some polynomials of your own and differentiate them. Make a few of them have large exponents and/or
coefficients. You will get credit for up to three polynomials. [This type of instruction occurs several more times
while we are learning derivatives. The intention is to give you a way to get involved in the task of figuring out what’s
going on. If you can generate good problems, and get the correct answers to them, then you really understand the
ideas. Again, you can use Maple to check your answers. A handy item is that Maple will generate its own random
polynomials. Use randpoly(x); to get them. You can use that same command over and over to get multiple
different random polynomials. But be warned that you get the same ones each time you start Maple over.]
4. What is the derivative of f(x) = 2x^2 − 3x − 1? [Note that this is the same function and answer we got when grinding
through “the long way.” The name “the long way” is very accurate.]
5. When y = y(x), the derivative of y with respect to x is written dy/dx. How do you write the derivative of s = s(t)?
1.3.3 Limits, and the official definition of derivatives.

We have attempted to find dy/dx by estimating ∆y/∆x for ∆x exceedingly small. That’s fine. But the way we carried
out that estimate was to transform the difference quotient algebraically to get a factor of ∆x on the top, then divide the
numerator and denominator by ∆x, and then set ∆x = 0.
There is a real problem with that procedure, but it is somewhat subtle. If we set ∆x = 0, then we have no business
dividing the top and bottom of the difference quotient by ∆x. After all, 0/0 is definitely vicious. (See the homework.)
Mugsy: All right! And I thought math was too tame!
There are a number of “0 = 1 proofs” that use this trick. We’ll hit another very inscrutable one next semester, but there are
some simple ones, such as this one. We begin by assuming x = 1 and playing algebra games.
\begin{align}
x &= 1 \tag{1.24}\\
x - x^2 &= 1 - x^2 \tag{1.25}\\
x(1 - x) &= (1 + x)(1 - x) \tag{1.26}\\
x &= 1 + x \tag{1.27}\\
0 &= 1 \tag{1.28}
\end{align}
Let’s examine this. First, we subtracted $x^2$ from both sides of the equation. Then we factored both sides. Both of those
operations are always legitimate. We then divided both sides of the equation by 1 − x. That, of course, is a problem, since
we started out by assuming that x = 1. We are dividing both sides by 0, and that’s not good. In essence, we had the equation
1×0 = 2×0
which is always true, and then “canceled” the 0’s. The result was that 1 = 2, which is definitely false. The last step,
subtracting x from both sides, is fine. But by then, we have already invalidated our equality.
Division by a quantity that is elsewhere set equal to zero is algebraically shady. But that’s exactly what we did in our
definition of derivatives!
Mugsy: I knew it. Math has more holes than three tons of Swiss cheese.
Albert: Actually, more than that. But not here. That’s what we are going to investigate next, I assume.
We divided top and bottom of the difference quotient by ∆x and then set ∆x = 0 later on. And that’s just not proper.
How can we be sure that what we are doing is algebraically legitimate? Limits provide a foundation. Let’s go back to
solve a simple problem, one that is closely tied to what we want to solve, but not obviously so. Suppose that we have a
partially amnesiac gnome.

Mugsy: Not them again! I thought they were gone for good.
Albert: This isn’t the last time they’ll appear, either!
Dudley: How do you know?
Albert: I read the book last night.
Mugsy: Some light reading in your spare time, I expect.
He can’t remember the output value for a certain input. How could we tell what that output should be? The basic idea is to
look around but not at the input we want. Then the limit is the most likely value based on those surrounding values.
An example helps.
Dudley: Amen.
Suppose the gnome can’t remember f (2). We could look at values near x = 2, and construct this table (values chosen very
strategically for the example):
x          f(x)
2.1        7.51
2.01       7.0501
2.001      7.005001
2.0001     7.00050001
⋮          ⋮
2          Can’t remember!
⋮          ⋮
1.9999     6.99950001
1.999      6.995001
1.99       6.9501
1.9        6.51
It would seem reasonably obvious that the missing value is $f(2) = 7$. This would be written as
\[ \lim_{x \to 2} f(x) = 7. \]
In general, the notation is
\[ \lim_{x \to c} f(x) = L, \]
which means that the value of $f(x)$ sneaks up to $L$ as $x$ moves closer to $c$.
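If you would like to build a table like this one yourself, here is a minimal Maple sketch. The formula is purely my assumption for illustration: the notes never say what the gnome's function is, but this one happens to reproduce the table entries and, like the gnome, has no value at x = 2.

```maple
# Assumed formula (not from the notes): matches the table, undefined at x = 2.
Digits := 20:     # extra digits so the subtraction near x = 2 stays accurate
f := x -> (x^3 - x^2 - x - 2)/(x - 2):
# Sneak up on 2 from above ...
f(2.1); f(2.01); f(2.001); f(2.0001);
# ... and from below.
f(1.9); f(1.99); f(1.999); f(1.9999);
# Maple's own opinion of the limit, done with exact arithmetic.
limit(f(x), x = 2);
```

Asking for f(2) directly makes Maple complain about division by zero, which is exactly the gnome's problem. The values on either side should still crowd in on 7.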

Several important comments must be made here. First, there is no guarantee that the value of f (2) actually was 7. All
we can say is that it should have been 7.
Next, we were not allowed to look directly at x = 2 in this process. If f (2) really did have a value, and if that value
really was 7, then f (x) would be called continuous at x = 2. In general, if the limit of f (x) at every input matches the
value of the function at that input, the function is called a continuous function. This is difficult to say, but easy to apply. We’ll return
to it momentarily.
For the record, here is a somewhat more precise definition of limits. $\lim_{x \to c} f(x) = L$ holds whenever you can keep $f(x)$
as close to L as you want just by keeping x close enough to c. Another way of looking at that is to think that you have a
certain amount of leeway in getting f (x) close to L; consider it as a tolerance (in the measurement sense of that word) in
the output values. It is usually given to you.
Mugsy: Like in a homework assignment?
Albert: Yes, but not in this course.
Mugsy: Ooo. I’m beginning to like this approach more!
The whole point is to then come up with a tolerance for the input values, the amount that you will allow x to change that still
does not cause f (x) to go more than its tolerance away from L. In some calculus courses, this becomes a heavy emphasis
of the course. It involves some rather sophisticated algebraic manipulations with inequalities. I won’t make you do that
in this course. (In a later course, however, you will need to do this, but it is only taken by people who are math majors.)
The correct terms for these tolerances give the name to these calculations. The output tolerance is normally written ε (the
Greek letter epsilon), and the input tolerance is normally written δ (the Greek letter delta, lower case this time). The topic
is then called an ε-δ argument.
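For reference only (you will not be asked to work with it in this course), the tolerance description above is usually compressed into symbols along the following lines. This is the standard textbook statement, not something developed in these notes.

```latex
\[
\lim_{x \to c} f(x) = L
\quad\text{means}\quad
\text{for every } \varepsilon > 0 \text{ there is a } \delta > 0 \text{ such that }
0 < |x - c| < \delta \ \Longrightarrow\ |f(x) - L| < \varepsilon .
\]
```

The condition $0 < |x - c|$ is the symbolic way of saying that we never look directly at x = c.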
Limits are what we need for derivatives. The slope of the secant line gets close to, but never exactly equal to, the slope
of the tangent line.
Dudley: Never?
Albert: Almost never. It will be exactly equal for lines, for example. But it is difficult to cook up any other curves
where it will be equal.
We want to look very close to ∆x = 0, but we can’t look exactly at ∆x = 0 without getting undefined results (or errors, or
whatever). We have not been allowed to plug in ∆x = 0 (at least until after simplifying).
The definition of $f'(x)$ in strict terms is
\[ f'(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} \tag{1.29} \]
The fraction inside the limit is precisely ∆y/∆x written out, and the limit makes you look at it when ∆x is very tiny.
Occasionally, this limit is written with h rather than ∆x.
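To see the definition in action, here is a small Maple sketch. The function and the name dq are my own choices for illustration, and h plays the role of ∆x.

```maple
f := x -> x^2 + 5*x:             # hypothetical example function
dq := (f(x + h) - f(x))/h:       # the difference quotient, with h in place of Delta x
normal(dq);                      # simplify first: the h on the bottom cancels
limit(dq, h = 0);                # the derivative, straight from the definition
diff(f(x), x);                   # Maple's built-in derivative, for comparison
```

The last two lines should agree. The normal(); line shows the intermediate stage, where a leftover h is still sitting in the expression waiting for the limit to remove it.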
Mugsy: I bet this makes you feel better, Al. All this gibberish is right down your alley.
Albert: And your alleys tend to be a lot darker and less populated, right?
Mugsy: Yup.
What, then, is the procedure for finding the limits that are encountered in this definition? Essentially, we use this major
fact: All functions that are typically encountered in math are continuous where they are defined. What does that mean? The
first thing you do when evaluating a limit is plug the value of the variable into the limit. (That’s the idea behind continuous,
remember? If the value you get by evaluating the limit—that is the best prediction you can make—is equal to the actual
value of the function, the function is continuous. We’re using it somewhat backwards, since it will help us find limits now.)
If you get a value out, that’s your answer.
Dudley: Al, is this serious? First you say you can’t look straight at the value, and then you say to do just that! Why
can’t you make up your mind? Can you or can’t you?
Albert: There, there, Dudley. Quit whimpering, and I’ll explain it.
Dudley: <Sniff>
Albert: Limits are set up to answer the hassle that crops up when you want to plug in a number, and algebraically,
it’s not legal. That’s the “plug in 0 even though you want to divide by it” hassle we had. But the functions in calculus
just happen to have some very nice properties, and one of them is that you can simplify first, get rid of the hassle,
and get the correct answers when you plug in. In one sense you could ignore it, and you probably wouldn’t have even
noticed there was a problem.
Mugsy: I certainly wouldn’t.
Albert: But on the other hand, there are a couple of places later when the idea of limits will come in very handy, and
this then becomes an introduction to the concept of limits.
Dudley: I’m beginning to get it, but I’m not sure I like the idea that this is going to come back.
Mugsy: Does everything in this course return?
Albert: Nearly.
Mugsy: Augh.
How do you tell when you aren’t getting a value out? The only situation where that can occur (for now, anyway—it
gets worse later)
Mugsy: AUGH.
is getting the form “0/0.” This is not good. Anything else is fine:
“(?/??)” has a nice value when both ? and ?? are non-zero.
“(0/??)” = 0 as long as ?? ≠ 0.
“(?/0)” doesn’t exist (or is infinite) as long as ? ≠ 0.
See the homework.
When working out $\lim_{x \to c} f(x)$ by hand, here is the three-step procedure to follow:
1. Plug x = c into f (x). If you don’t get 0/0, STOP; you are done. The answer is the value you got. (See the box above.)
2. Since you must have 0/0 to get here, factor the top and bottom. There must be factors of (x − c) in both. (This is
why both the top and bottom are 0 when you plug in x = c.)
3. Cancel the factors of x − c in both top and bottom, and go back to the first step. (This step usually gets rid of the
offending factors that are causing the 0/0 hassle, so we see if we now have a friendly limit.)
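The three steps can be mirrored on Maple, one command per step. This is only a sketch on a made-up example (the limit of (x^2 − 9)/(x − 3) as x → 3); the names num, den, and reduced are mine.

```maple
num := x^2 - 9:   den := x - 3:           # hypothetical example, c = 3
# Step 1: plug in x = c, top and bottom separately so nothing divides by zero.
subs(x = 3, num);  subs(x = 3, den);      # both come out 0, so this is the 0/0 form
# Step 2: factor the top and bottom; each should show a factor of (x - 3).
factor(num);  factor(den);
# Step 3: cancel the common factor, then go back to step 1.
reduced := normal(num/den);
subs(x = 3, reduced);                     # no 0/0 this time, so this value is the limit
```

I kept the top and bottom in separate names on purpose. Plugging x = 3 into the whole fraction at once makes Maple stop with a division-by-zero error instead of showing the 0/0 form.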
Note that this procedure is precisely what I gave you as the procedure for finding slopes of tangent lines (derivatives), in
the special case of the limit that is used in derivatives. First, you always get “0/0” by the form of the equation for slopes.
Then, you worked algebraically to get a factor of ∆x in the top (it’s obvious in the bottom), which is the second step. Then
cancel the ∆x’s, which is the third step. Finally, plug in ∆x = 0 to get the slope, which is back to the first step.
This, then, is why you have to simplify the top of difference quotients, and why the terms without a ∆x all cancel. If they
didn’t, the top wouldn’t be zero when you put in ∆x = 0. This is also why the critical step in finding derivatives the long way is
that strategic cancellation of the ∆x’s on the top and bottom. After that, it’s easy!
Here’s an example of how I construct a limit problem (such as might occur on the test).
Mugsy: That’s a hint, hear?
First, I choose a nice function, like
\[ \frac{x - 4}{x + 3} \]
and pick a number, like x = 2. I then multiply top and bottom of the function by x − 2 (without canceling), giving
\[ \frac{(x - 4)(x - 2)}{(x + 3)(x - 2)} = \frac{x^2 - 6x + 8}{x^2 + x - 6} \]
I then will ask for
\[ \lim_{x \to 2} \frac{x^2 - 6x + 8}{x^2 + x - 6} \]
where the x → 2 is chosen because I multiplied through by x − 2. You see, this way, I build into the fraction a 0/0. I expect
you then to factor the function, cancel the x − 2 on the top and the bottom, and get back to
\[ \frac{x - 4}{x + 3} \]
plug in x = 2, and get
\[ \frac{2 - 4}{2 + 3} = \frac{-2}{5} \]
which is the answer to the limit.
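If you want to manufacture practice problems the same way, the construction just described can be sketched on Maple. The names tp and bt are mine; the polynomials are the ones from the example.

```maple
# Build the problem: multiply top and bottom by (x - 2), expanding each
# separately so that nothing gets a chance to cancel.
tp := expand((x - 4)*(x - 2));
bt := expand((x + 3)*(x - 2));
# Take it apart again: factor each piece, then let normal() do the canceling.
factor(tp);  factor(bt);
normal(tp/bt);
subs(x = 2, (x - 4)/(x + 3));     # the value of the limit
```

Swap in your own nice function and your own number in place of 2, and you have a fresh limit problem with a 0/0 built in.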
There is one other critical comment: 0/0 is never the answer to anything! In particular, it is never the answer to any
limit. You can get “0/0” as a form, but that means you must go through the procedure just outlined.
Mugsy: I have been informed to “visit” anyone who puts 0/0 as an answer. Heh, heh, heh.
Limits can also be done on Maple. To evaluate on Maple the limit I just constructed, here’s the format:
> limit( (x^2-6*x+8)/(x^2+x-6), x=2);
                                   -2/5
The function goes first, and then the value you want the variable to approach. Be careful to get the parentheses correct.
Essentially, all the limits we’ll encounter (and then some) can be done this way using Maple.
The easiest way to do limits shows up later (L’Hôpital’s rule). We’ll need an easier way when the limits get harder, because
the procedure we have works only for quotients of polynomials and other things that we can factor. The curious thing is that L’Hôpital’s
rule uses derivatives!
The uses of limits.

How could something like limits be useful? By going backwards. The derivative is a limit, but that is difficult to calculate
in practice. So what you do is approximate the derivative for very small values of ∆x.
That is (again) the idea behind linear interpolation. You approximate the value of the derivative using the closest tabular
entries that you can. The reason that this gives good answers is that the derivative is a limit, and you are approximating the
limit as well as you can using the table.
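Here is a rough Maple sketch of that "going backwards" idea: approximate a derivative by difference quotients with smaller and smaller ∆x. The function, the point x0, and the step sizes are all my own choices for illustration.

```maple
f := x -> x^3:                 # hypothetical example function
x0 := 2:                       # the point where we want the slope
# Difference quotients with Delta x = 0.1, 0.01, 0.001, 0.0001
seq((f(x0 + 10.0^(-k)) - f(x0))/10.0^(-k), k = 1..4);
# The exact derivative at x0, for comparison
subs(x = x0, diff(f(x), x));
```

The approximations should settle down toward the exact value as ∆x shrinks. Don't push ∆x too small, though: with only a limited number of digits, the subtraction on top eventually loses its accuracy.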
Homework #9
Exercises.
1. Evaluate the following limits by the three-step procedure that I gave earlier. (So, you need to show your steps!) You
can use Maple to check your answers.
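(a) $\displaystyle \lim_{x \to -2} \frac{x^2 - x - 6}{x^2 + x - 2}$
(b) $\displaystyle \lim_{x \to 0} \frac{x^2 - x - 6}{x^2 + x - 2}$
(c) $\displaystyle \lim_{x \to 1} \frac{x^2 - x - 6}{x^2 + x - 2}$
(d) $\displaystyle \lim_{x \to 3} \frac{x^2 - x - 6}{x^2 + x - 2}$
(e) $\displaystyle \lim_{r \to -1} \frac{r^2 + 2r + 1}{r^2 - 2r - 3}$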
2. Evaluate the following limits by the three-step procedure that I gave earlier. Again, you can use Maple to check your
answers.
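(a) $\displaystyle \lim_{x \to 3} \frac{x^2 - 2x - 3}{x^2 - x - 6}$
(b) $\displaystyle \lim_{x \to 0} \frac{x^2 - 2x - 3}{x^2 - x - 6}$
(c) $\displaystyle \lim_{x \to -1} \frac{x^2 - 2x - 3}{x^2 - x - 6}$
(d) $\displaystyle \lim_{x \to -2} \frac{x^2 - 2x - 3}{x^2 - x - 6}$
(e) $\displaystyle \lim_{r \to 2} \frac{r^2 - 4r + 4}{r^2 - r - 2}$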
3. We want to evaluate $\displaystyle \lim_{x \to -3} \frac{x^2 - 2x - 15}{x^2 + 2x - 3}$ both by calculator and algebra.
(a) Plug values of x near −3. (Try −2.9, −2.99, −2.999, and −3.1, −3.01, −3.001.) This is reasonably easy on
Maple, for example, using
> f := x -> (x^2-2*x-15)/(x^2+2*x-3);
                  f := x -> (x^2 - 2*x - 15)/(x^2 + 2*x - 3)
which defines the function for Maple, and then type (and again, change the colons at the end of these to
semicolons to see the output):
> f(-2.9):
> f(-2.99):
> f(-2.999):
> f(-3.1):
> f(-3.01):
> f(-3.001):
What do you think the value of the limit is from these numbers?
(b) What happens if you plug x = −3 into the function? (Or try to find f(-3); on Maple?)
(c) Factor both the top and bottom of the function and reduce it. (You can do this on Maple by typing at it
normal(f(x));. Remember that normal(); in Maple is one of the algebraic simplification routines, the one
I tend to prefer.)
(d) What do you get when you plug x = −3 into the reduced expression for the function?
(e) Which way would you rather work limits (by calculator/computer or by algebra)? Give a reason for your
answer. (There is no wrong answer to this part. I’m just curious to see what you think. The answers usually
split.)
Problems.
1. In this problem, we’ll get a method of working limits of some functions that are not rational functions (the quotients
of two polynomials). We will work $\displaystyle \lim_{x \to 6} \frac{\sqrt{x} - \sqrt{6}}{x - 6}$.
(a) Plug x = 6 into the function, according to the first step to finding a limit. What happens?
(b) Why does the second step in the three-step method break down in this case?
(c) We can salvage the process by forcing a factor of x − 6 on the top. The way we get it is to rationalize the
numerator (not denominator). Multiply the top and the bottom of the function by $\sqrt{x} + \sqrt{6}$ and multiply out
just the top. Leave the bottom in factored form.
(d) Reduce the function and plug x = 6 into the result. What is the value of the limit? You can check your answer
on Maple, but realize that it might give the answer in a different form than you got. (Subtract your answer from
Maple’s and simplify("); you should get 0 if you are correct.)
2. Use the idea from the previous problem to evaluate $\displaystyle \lim_{x \to 0} \frac{\sqrt{x + 5} - \sqrt{5}}{x}$.
Investigation.
1. Let’s look at why 0/0 is so vicious, by looking at what a/b means. We do that by converting the division into a
multiplication problem that we try to solve.
(a) Suppose a/b = c, and solve for a. (Leave the variables as variables; don’t use numbers yet.) (In case you are
worried, yes, this part is simple.) The whole idea of division, then, is to use this equation to find c. That is, plug
in the values of a and b, and try to get c that works. The remaining parts of the question refer back to this part.
References that follow in this question to “the equation” are to the equation that is the answer to this part.
(b) Suppose first a = 0 and b ≠ 0. What value of c makes the equation work? Are there any other values of c that
can make the equation work?
(c) Suppose now b = 0 and a ≠ 0. Is there any value of c that we can use to solve the equation? Why is a/0 not
defined for a ≠ 0? (Later we will see that it is sometimes convenient to say that a/0 is infinite, and occasionally
people will say just that.)
(d) Suppose a = b = 0. Is there any one specific value we can assign to c that works in the equation? Why would
0/0 be called an indeterminate form? Note particularly that 0/0 is not automatically 1, even though x/x = 1 for
any x ≠ 0.
(Note: Division by 0 causes pain no matter what the numerator (top) is. Either there are no solutions (when the
top is non-zero), or there are an infinite number of solutions (when the top is zero). Exactly the same situation
will be investigated later on, in a course called linear algebra, but there it will be harder to see what’s going on
because a wider variety of things can happen.)
1.3.4 Differentiating products and quotients.

These are general patterns—the building blocks—to use with all derivatives. The product rule is used whenever you have
two functions multiplied together (a product) and the quotient rule is used whenever you have two functions divided (a
quotient).
If you think that the product and quotient rules behave the way adding and subtracting did, you are wrong. You have to
memorize these two rules.
The derivative of the product of two functions.

The product rule says that
\[ (f(x) \times g(x))' = (f'(x) \times g(x)) + (f(x) \times g'(x)) \tag{1.30} \]
Note that you differentiate one factor at a time. Also note that this is not the same as multiplying the derivatives. (See the
homework.)
Example: Find the derivative of
\[ (x^2 + 4x - 3)(x^3 - 2x^2 + 5) \]
We have two ways of doing this, and we will do it both ways. First, if we use the product rule, we get that the derivative is
\[ (2x + 4)(x^3 - 2x^2 + 5) + (x^2 + 4x - 3)(3x^2 - 4x) \]
On the other hand, as we indicated earlier, we can multiply out the product of two polynomials, and get another polynomial.
In this case, when you multiply the original function out, you get
\[ x^5 + 2x^4 - 11x^3 + 11x^2 + 20x - 15 \]
The derivative of this is easy.
\[ 5x^4 + 8x^3 - 33x^2 + 22x + 20 \]
When you multiply out the result from the product rule, you get exactly the same thing. Essentially, you have the option,
with polynomials, of multiplying out before or after you differentiate. The result is the same either way.
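You can have Maple confirm that the two routes agree. This is just a checking sketch; the names byrule and bymult are mine, and the polynomials are the ones from the example.

```maple
f := x^2 + 4*x - 3:
g := x^3 - 2*x^2 + 5:
byrule := diff(f, x)*g + f*diff(g, x);     # product rule, written out by hand
bymult := diff(expand(f*g), x);            # multiply out first, then differentiate
expand(byrule - bymult);                   # 0 means the two routes agree
```

Subtracting and expanding is a handy trick in general: two answers that look different are the same answer exactly when their difference simplifies to 0.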
Dudley: Why have a product rule at all? Can’t I just multiply the product out?
Albert: Not all functions are polynomials that you can multiply out, before or after. We don’t have the formulas
necessary to differentiate trig functions or exponentials or logarithms, but we will soon. And with products of those
functions, you have no choice but to use the product rule.
The derivative of the product of three or more functions.

The pattern in the product rule is actually simpler to see if you have more functions. The product of two functions doesn’t
allow enough terms to see the way that it works.
The formula for the product of three functions is
\[ (f(x)\, g(x)\, h(x))' = (f'(x)\, g(x)\, h(x)) + (f(x)\, g'(x)\, h(x)) + (f(x)\, g(x)\, h'(x)) \]
Again, note that you differentiate one factor at a time. The other factors are left alone. You then add up all the products.
With more than three functions, you start running out of letters to use for them, so a typical approach is to use subscripts.
The functions for a multi-factor product would probably be written $f_1(x), f_2(x), \ldots, f_n(x)$. The function would be
\[ f(x) = f_1(x) \times f_2(x) \times \cdots \times f_n(x) \]
The derivative would be written (it’s hard to do nicely):
\begin{align}
f'(x) = {} & f_1'(x) \times f_2(x) \times \cdots \times f_n(x) \tag{1.31}\\
& + f_1(x) \times f_2'(x) \times \cdots \times f_n(x) \tag{1.32}\\
& \quad \vdots \tag{1.33}\\
& + f_1(x) \times f_2(x) \times \cdots \times f_n'(x) \tag{1.34}
\end{align}
Don’t memorize the formula—memorize the pattern. This rule is not too hard. The only possible problem is to remember
to use it when you need it.
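Here is a quick Maple sketch of the pattern with three factors. The three functions are made up for illustration, and the names pattern and direct are mine.

```maple
f := x + 1:  g := x^2 + 2:  h := x^3 + 3:     # hypothetical factors
# Differentiate one factor at a time, leave the others alone, and add.
pattern := diff(f, x)*g*h + f*diff(g, x)*h + f*g*diff(h, x);
# Let Maple differentiate the whole product in one shot.
direct := diff(f*g*h, x);
expand(pattern - direct);                      # 0 means the pattern gives the right answer
```

Adding a fourth factor means adding a fourth term to pattern, with the new factor riding along undifferentiated in the first three.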
Here are some more examples that I will go over in class. Differentiate the following functions.
\[ (x^3 - x)(2x^2 + x - 1) \]
\[ z^4 (9z^7 + 8z^4 - 6z^3 + 15z^2 - 10z + 9) \]
\[ (x^6 - 7x^4 + 3x^3 - 7)(2x^5 + 5x^4 - x^2 + x - 6) \]
\[ (x^2 + 2x)(x^4 - 3x^2)(8x^3 - 5) \]
\[ (x + 2)(3x + 4)(5x + 6)(7x + 8)(9x + 10) \]
Quotient rule.
This can be done two ways. If you want to be formal, you use this:
\[ \left( \frac{f(x)}{g(x)} \right)' = \frac{g(x)\, f'(x) - f(x)\, g'(x)}{(g(x))^2} \]
On the other hand, I keep mixing up the order of the factors on the top, and that changes the value. (See the homework.)
So, the way that I remember the quotient rule (and you will hear me mumbling this to myself when I work one out on the
board) is to call the top function hi (it’s high, that is, on top), and the bottom function ho (because that’s what I call it), and
use D for differentiation.
Mugsy: Brace yourself.
Then the quotient rule becomes (and this is not original to me, by the way!):

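\[ D\!\left( \frac{\mathrm{hi}}{\mathrm{ho}} \right) = \frac{\mathrm{ho}\; D(\mathrm{hi}) - \mathrm{hi}\; D(\mathrm{ho})}{(\mathrm{ho}\;\mathrm{ho})} \tag{1.35} \]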
This, by itself, is not hard, but when combined with the product rule (such as having a product on the top or bottom), it can
get confusing. “All” you have to do is keep from getting overwhelmed. More on that soon.
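If you want to check a quotient rule answer on Maple, one possible sketch follows. The top and bottom functions are made up for illustration, and byrule is my own name.

```maple
hi := 3*x + 1:                 # hypothetical top
ho := x^2 + 2:                 # hypothetical bottom
# ho D(hi) - hi D(ho), all over ho ho
byrule := (ho*diff(hi, x) - hi*diff(ho, x))/(ho*ho);
normal(byrule - diff(hi/ho, x));     # 0 means the rule matches Maple's derivative
```

Notice that byrule is left unsimplified, just the way I want your answers left.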
Here are some examples that I will work in class. Differentiate the following functions.
\[ \frac{\sqrt{x}}{x^2 + x} \]
\[ \frac{2x^3 - 7x^2 + 1}{3x^2 + 5} \]
\[ \frac{x\,(x^3 - 1)}{x^4 + x^3 - 3} \]
\[ \frac{(4x^2 + 8x - 5)(2x^7 - 8x^3 + x^2 + 2)}{(5x^3 + 8x - 2)(x^3 + x^2 - 1)} \]
One topic becomes very relevant at this point, and that is the degree of algebraic simplification that I will require of you
on the homework and tests. The answer is: “As little as possible, and usually none.”
Mugsy: Wow! And I thought he was nearly as mean as I am . . . .
Let me explain. If I am grading your papers, and you decide to simplify your answers, you will then require me to check
the work you did in simplifying. That takes me time and potentially a lot of work.
It is also to your benefit not to simplify. Especially on a test, simplifying takes you time also, which you could better
spend thinking about other problems. Finally, if you don’t simplify, and you have made a calculus mistake, it is reasonably
clear what you did wrong, and partial credit is simple to award. If you then simplify, you bury your mistake in an avalanche
of algebra, and all I can tell easily is that your answer is wrong. Partial credit is harder to award. I know that your answer
is wrong, so if I can’t figure out what you did, you lose lots of credit.
Mugsy: It seems like he wants to make life easy for himself, mostly.
Albert: That’s bad?
Homework #10
Exercises.
1. Find the following derivatives using the product rule. You do not need to simplify your answers.
(a) $f(x) = (7x^3 + 5x^2)(2x^4 - 5x)$
(b) $f(x) = x\sqrt{x}$ (Remember to convert the square root to an exponent)
2. Find the following derivatives using the product rule. You do not need to simplify your answers.
(a) $f(x) = (5x^2 + 3x)(2x^5 - 3x^4)$
(b) $f(x) = x\sqrt[3]{x}$
3. Work out $f'(x)$ for the following using the quotient rule. You do not need to simplify your answers.
(a) $f(x) = \dfrac{x^2 + 3}{x}$
(b) $f(x) = \dfrac{x}{x^2 + 3}$
(c) $f(x) = \dfrac{(3x^2 - 2x + 3)(5x + 7)}{x^2 + 8}$
4. Work out $f'(x)$ for the following using the quotient rule. You do not need to simplify your answers.
(a) $f(x) = \dfrac{x}{x^2 + 1}$
(b) $f(x) = \dfrac{x^2 + 1}{x}$
(c) $f(x) = \dfrac{(5x^2 + 3x - 4)(x - 9)}{x^2 + 5}$
5. Make up three (or more) examples of the product rule and three (or more) examples of the quotient rule of your own.
Make sure that at least one quotient rule involves a product also.
Problems.
1. In this problem, we show that the method that you might want to use on products and quotients doesn’t work. Take
$f(x) = x^3$ and $g(x) = x^4$. An added moral of this problem is that if you obey the laws of algebra, you will get the
correct answers in calculus, although they might not always look the same.
(a) Combine $f(x)\, g(x)$ into a single $x^n$ and find $(f(x)\, g(x))'$.
(b) Find $f'(x)$ and $g'(x)$. Combine $f'(x) \times g'(x)$ into a single $c\,x^m$. (Note that this is not equal to $(f(x)\, g(x))'$, since
neither the exponents nor the coefficients are equal. That is, $(f(x)\, g(x))' \neq f'(x)\, g'(x)$.)
(c) Find $(f'(x)\, g(x)) + (f(x)\, g'(x))$, and combine it into a single $c\,x^m$. It should equal $(f(x)\, g(x))'$.
2. In this problem, we work through exactly what happens if you reverse the order of factors in the numerator of
the quotient rule formula. (That is, what happens if you use hi Dho − ho Dhi for the top rather than the correct
ho Dhi − hi Dho.) We begin with a specific example.
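(a) Take the function
\[ f(x) = \frac{x^2 - 1}{x + 2} \]
Work out the correct quotient rule for $f'(x)$, namely,
\[ \frac{\mathrm{ho}\; D(\mathrm{hi}) - \mathrm{hi}\; D(\mathrm{ho})}{(\mathrm{ho}\;\mathrm{ho})} \]
and multiply out the function on the top of the fraction to get a polynomial.
(b) Work out
\[ \frac{\mathrm{hi}\; D(\mathrm{ho}) - \mathrm{ho}\; D(\mathrm{hi})}{(\mathrm{ho}\;\mathrm{ho})} \]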
which is what you get if you interchange the order of the factors in the quotient rule. Also multiply out the
function on the top of this fraction to get a polynomial.
(c) Show that the answers to the previous two parts are the negatives of each other. Do this by taking the negative
of the top of the first quotient and getting the top of the second quotient. (The bottoms are the same for both.)
(d) Now we want to show that this same thing always happens, namely that reversing the order of the factors in the
numerator changes the sign of the answer. Do this by showing that
ho D(hi) − hi D(ho) = − (hi D(ho) − ho D(hi))
leaving the ho’s and hi’s as unspecified functions.

(e) Why doesn’t the order of the factors matter in the product rule when it does matter in the quotient rule? Give a
reason for your answer.
3. In the formula in the notes for the derivative for three factors,
\[ (f(x)\, g(x)\, h(x))' = f'(x)\, g(x)\, h(x) + \cdots, \]
set $h(x) = 1$, simplify both sides, and show that you get the usual product rule for two factors, $(f(x)\, g(x))' =
f'(x)\, g(x) + \cdots$.
1.3.5 Time out to gather strength.

Before we plunge on, it is good to stop a bit and solidify what we have done so far. It is absolutely critical to have this
much down pat before we continue.
Dudley: You mean this is a breather?
Albert: Basically. There’s only one more pattern to go, and it’s important enough that it shouldn’t be confused with
the product and quotient rules.
The idea behind derivative formulas.

Why have we been learning derivative formulas? It’s a good question, with good answers. First, if we can’t find derivatives
reasonably quickly and accurately (at least the easy ones), then we will not really be able to use calculus at all later. And
without derivative formulas, you are stuck going back to the diffquo(); method, and that’s a real mess.
Additionally, it is good practice for pattern-matching. Can you isolate the products, quotients, and polynomials? If
so, you can take the derivative just by plugging into the formulas we have. A good chunk of mathematics is learning to
recognize patterns. The patterns become considerably more subtle as you go along, but they are a real key in making sense
out of what would otherwise be utter chaos.
Mugsy: Are you trying to tell me that calculus is not utter chaos?
Albert: Actually, what we are doing here is very mechanical. Even something as stupid as a computer (Maple) can do
it easily.
Mugsy: Al, you’re the only one I know who calls computers stupid.
Albert: They are. But I will acknowledge that they are fast.
For example, if someone worked out a derivative the long way, you’d probably have an idea of what was going on, and
why. If you didn’t have that background, it would be totally bewildering.
Dissect formulas into simpler parts. Now comes the attitude check. I give you this back-breaking quotient and product
rule to differentiate on the test, and your mind goes into zombie mode just by looking at it. What do you do?
Dudley: Is he psychic? That happens to me all the time!
Albert: He’s taught calculus a long time. It happens to lots of people. Now listen to the remedy.
When confronted with a huge derivative, the key is to avoid being overwhelmed.
Dudley: HOW?
Here’s how, Dudley. First, pick out a little part of the problem, and ask yourself if you can differentiate it. If not, pick out a
smaller part until you can. Go all the way down to x (or whatever the variable is) if you need to! Don’t even try to tackle it
all at once. You’ll get confused, and leave out something important. Keep picking the function apart until you are confident
that you can differentiate all of the little parts of the problem. Then realize that if you can differentiate each part, you can
differentiate the whole thing. The rules we’ve gotten so far (and the one to come) are there to tell you how to assemble all
those little parts. Just do what the rules say.
Re-examine product and quotient rules from this point of view. The product rule tells you how to differentiate a
product once you know how to differentiate each factor. The quotient rule tells you how to differentiate a quotient if you
can differentiate the top and the bottom. All the rules in calculus are like that.
Let’s take for an example the biggest quotient rule from before:
\[ \frac{(4x^2 + 8x - 5)(2x^7 - 8x^3 + x^2 + 2)}{(5x^3 + 8x - 2)(x^3 + x^2 - 1)} \]
How does that decompose? Well, you look at the thing, and first of all realize that you’ll need the quotient rule, because
overall it’s a quotient.
Dudley: So far, so good.
But to use the quotient rule, you need to differentiate both the top and the bottom (both D(hi) and D(ho)). But you look at
the top, and cringe.
Dudley: You got it.
But look again. The top is a mess, but what is it? It’s two polynomials multiplied together. That sounds vaguely familiar.
Dudley: Emphasis on the vaguely.
Can I differentiate that?
Dudley: I’m sure you can. But can I?
A simpler question, then. Do you think you can differentiate the first factor, the $(4x^2 + 8x - 5)$?
Dudley: That I think I can manage.
Fine. Do you think you can differentiate the other factor, $(2x^7 - 8x^3 + x^2 + 2)$?
Dudley: Yes.
Great! That means that you can differentiate the whole top! Why? Because the product rule tells you how to put the terms
and their derivatives together. Convinced?
Dudley: Not yet. Show me.
First, look at the product rule. It is
\[ (f(x) \times g(x))' = (f'(x) \times g(x)) + (f(x) \times g'(x)) \]
What do you need to know to find the derivative of $(f(x) \times g(x))$? You need $f(x)$ and $g(x)$, and their derivatives, $f'(x)$ and
$g'(x)$. That is, knowing those, you can plug into the formula, right?
Dudley: I suppose.
So, look at the top now. It’s $(4x^2 + 8x - 5)(2x^7 - 8x^3 + x^2 + 2)$. To differentiate that, all you need is the two factors (and
they are sitting right there in front of you), and the two derivatives, and you’ve already said you can do those. Therefore,
you can differentiate the top!
Dudley: Is that all?
Albert: No.
Mugsy: I knew that was too easy.
Now look at the quotient rule. It is
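\[ D\!\left( \frac{\mathrm{hi}}{\mathrm{ho}} \right) = \frac{\mathrm{ho}\; D(\mathrm{hi}) - \mathrm{hi}\; D(\mathrm{ho})}{(\mathrm{ho}\;\mathrm{ho})} \]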
Now, what do you need to use the quotient rule? To differentiate a quotient, you need the top and bottom functions (again,
that’s easy, because they are just sitting there), and their derivatives. In the example we are working, we’ve already decided
that we can do the derivative of the top, and we have the functions. All we need now is the derivative of the bottom, and
we’re almost finished!
Dudley: “All?”
Well, do you think I can convince you that you can find the derivative of the bottom, too?
Dudley: At this point, I don’t think I could argue anything.
Well, look at the bottom function. It is $(5x^3 + 8x - 2)(x^3 + x^2 - 1)$. That’s the product of two polynomials again. Now for
the question. What do you need to know in order to differentiate the product of two functions?
Dudley: The functions and their derivatives?
Exactly! You remembered! Do you know the functions?
Dudley: Yes.
Can you differentiate the first one?
Dudley: Why do I feel like I’m being sold the Brooklyn Bridge?
Albert: Because you think you can’t do it, and you are uncomfortable realizing that maybe you can. Now answer the
question.
Dudley: Yes, I can differentiate that polynomial.
And the other polynomial, can you differentiate it, too?
Dudley: Yes.
So, what rule tells you how to put two functions and their derivatives together to get the derivative of the product?
Dudley: The product rule?
Yes! Yes! Yes!
Mugsy: Cut the theatrics, kid.
Now, Dudley, what do you have? You know the top and bottom functions, and you can figure out their derivatives. And the
quotient rule tells you how to assemble these ingredients into the answer. Can you do the whole thing?
Dudley: I want to say no, but I think the answer is yes.
Go back to the answer from when I did that example in class, and see how all the pieces fit together. Notice where the
functions and derivatives fit in, and how I used both the product and quotient rules.
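The same piece-by-piece assembly can be sketched on Maple, which makes a nice check on a hand answer. The names a, b, c, d (for the four polynomial factors), Da through Dd, Dhi, Dho, and answer are all mine, chosen only for this illustration.

```maple
# The four little pieces ...
a := 4*x^2 + 8*x - 5:          b := 2*x^7 - 8*x^3 + x^2 + 2:
c := 5*x^3 + 8*x - 2:          d := x^3 + x^2 - 1:
# ... and their derivatives, each one an easy polynomial.
Da := diff(a, x):  Db := diff(b, x):  Dc := diff(c, x):  Dd := diff(d, x):
# Product rule assembles the derivative of the top and of the bottom.
hi := a*b:   Dhi := Da*b + a*Db:
ho := c*d:   Dho := Dc*d + c*Dd:
# Quotient rule assembles the whole thing.
answer := (ho*Dhi - hi*Dho)/(ho*ho);
# Compare with Maple differentiating the quotient in one shot.
normal(answer - diff(hi/ho, x));       # 0 means the assembly is correct
```

No step in this sketch is harder than differentiating a polynomial. Everything else is following the rules for putting the pieces together.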
How do you tell when to use what formula?

This is a common question, since people who are being overwhelmed usually have no idea how to start. The answer,
fortunately, is simple. If you are dealing with things that are multiplied, use the product rule; if you are dealing with things
that are divided, use the quotient rule. In other words, look at what is being done to the little pieces to assemble them
into the whole, and that tells you what formula to use. These rules are without exception. You always use the product and
quotient rules in their places.
When can you short-cut the process?

On the other hand, there are times when the process can be shortened, namely when there is a constant factor. If C is a
constant, we get these formulas:
\[ (C\,g(x))' = C\,g'(x) \]
\[ D(\text{hi}/C) = D(\text{hi})/C \]
Remember that these work only with C a constant. These really are the product and quotient rules, applied to this special
case. See the homework.
Homework #11
Exercises.

(a) $\dfrac{(3x^2 + 5)(2x - (1/x))}{(2\sqrt{x} + x)(x + 1)}$
(b) $\dfrac{(s - 3\sqrt{s})(s^2 + 8)}{(s + \sqrt{s})(s + 4)}$

(a) $\dfrac{(4x^3 - 3x + 1)(x - (1/x))}{(x + 2\sqrt{x})(x + 8)}$
(b) $\dfrac{(u + 3\sqrt{u})(4u^2 + 7)}{(u^2 + 7)(u^3 - 9)}$
Problems.
1. Show that the short cuts given just before the homework are correct. Do this by applying the product rule and quotient
rule to the expressions on the left side of the equals signs in the expressions given, and use the fact that the derivative
of a constant is 0, and use algebra to simplify what’s on the right sides of the equals signs.
2. Show that the other possibility for a short-cut, $(C/f(x))'$, is not equal to $C/f'(x)$. Do this by picking a specific
function for $f(x)$ and specific value for $C$ and working out both expressions and getting different answers.
1.3.6 The chain rule (differentiating the composition of functions).

This is without any doubt the most important rule in calculus! It is also one of the most often forgotten.
Whenever we get a new function (and we have a stock of them coming up soon), the chain rule will be used on it to
extend it beyond the most plain vanilla version of the derivative.
Terminology and description of composition.

Refer back to the section on functions if you need to be reminded about composition, its meaning, terminology, or notations.
When we are working the chain rule in calculus, we will have a different task: “decomposition.” Decomposition is the
process of taking a function h(x) and finding the functions f and g where h(x) = f (g(x)). This is more of an art than a
science, but for the functions we encounter, it is very easy. There will be inside and outside functions. The inside function
is precisely what it says, the function that is inside, g(x). The outside function is a little more obscure; it represents what is
done to the inside function to give h(x), and is f (u), where u = g(x) is the inside function.
An example will help.
Dudley: I’d cheer, but I’m still too dazed from that quotient and product rule.
Albert: Aren’t you glad for the breather?
Suppose $h(x) = \sqrt{x^2 + 1}$. Then $g(x) = x^2 + 1$ is on the inside. If we replace $g(x) = x^2 + 1$ by $u$ in the formula for $h(x)$, we
get $f(u) = \sqrt{u}$. We have then “decomposed” $h(x)$ into $f(g(x))$ by finding $f(u)$ and $g(x)$.
Dudley: Hey, that was easy enough that even Mugsy should have gotten it.
Mugsy: You guys are not helping my mood at all. And you don’t want I should get really angry.
All of these problems really are that simple.
Deriving the chain rule using the “wiggle” approach.

What is the derivative of a composition? This can be done algebraically, but the clearest way to see what’s going on is to
use the idea that the derivative is a wiggle magnification factor.
Suppose you have a function y = f (g(x)), which is equivalent to treating it as y = f (u) where u = g(x), that is, u is the
intermediate variable. Next, you wiggle the input value by ∆x. Then u wiggles by a certain amount,
\[ \Delta u \approx g'(x)\,\Delta x \]
Then that wiggle gets fed to $f(u)$, and its output will wiggle by
\[ \Delta y \approx f'(u)\,\Delta u \approx f'(g(x))\, g'(x)\,\Delta x \]
The wiggle magnification factor for the composition is then
\[ (f(g(x)))' = f'(g(x))\, g'(x) \tag{1.36} \]
The chain rule is exactly that.
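For anyone who wants to see the wiggles in actual numbers, here is a minimal sketch in Python (used instead of Maple only so that the check is self-contained). It wiggles x a little, measures how much h(x) = sqrt(x^2 + 1) wiggles in response, and compares that ratio with f'(g(x)) g'(x). The helper name `wiggle_ratio`, the sample point x = 2, and the wiggle size are assumptions made for illustration.

```python
import math

def g(x):
    """Inside function: g(x) = x^2 + 1."""
    return x * x + 1.0

def f(u):
    """Outside function: f(u) = sqrt(u)."""
    return math.sqrt(u)

def g_prime(x):
    """Derivative of the inside function."""
    return 2.0 * x

def f_prime(u):
    """Derivative of the outside function."""
    return 0.5 / math.sqrt(u)

def h(x):
    """The composition h(x) = f(g(x))."""
    return f(g(x))

def wiggle_ratio(func, x, dx=1e-6):
    """Output wiggle divided by input wiggle, centered on the un-wiggled x."""
    return (func(x + dx) - func(x - dx)) / (2.0 * dx)

x = 2.0
measured = wiggle_ratio(h, x)            # what the composition actually does
predicted = f_prime(g(x)) * g_prime(x)   # what the chain rule says it should do
print(measured, predicted)
```

The two printed numbers should agree to many decimal places. Note that `f_prime` is handed g(x), not x: that is the "where is it evaluated" issue in action.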

Dudley: Does that seem easy, too?
Albert: You are asking? Yes, it seems easy, but the application of it can get sticky.
Mugsy: AHA! I knew there was a trick!
The process is much like what happens when you successively magnify something. If you triple the size of something,
and then double its size, the net effect is to multiply its size by a factor of 6 = 3 × 2. Each green box magnifies the wiggle
by some amount, so the composition magnifies it by the product of those amounts.
There is one thing that needs to be settled yet, or serious confusion can result (as in one homework problem).
Mugsy: HINT!
There is a significant matter of where the derivatives are evaluated. Note that on the right-hand side of the chain rule we
have $f'(g(x))$ and $g'(x)$. Why should $f'$ have the $g(x)$ inside it while $g'$ has only the $x$ inside it? One way to state the
reason is that wiggles are always wiggles about some value: you are wiggling some number by ∆x or ∆u. The wiggle
magnification factor is the derivative evaluated at that original, un-wiggled number. The un-wiggled number that is input
to $g$ is $x$. The un-wiggled number that is input to $f$ is the output from $g$, namely $g(x)$, so $f'$ gets evaluated there. This all
leads to the way that I prefer to state the chain rule. The derivative of a composition is (briefly) the derivative of the outside
times the derivative of the inside. A more expanded version (that I suggest you use until you get used to it enough that you
don’t need to be reminded any more) is:
The derivative of a composition is the derivative of the outside (leaving the inside alone) times the derivative
of the inside.
The derivative of the inside term is the one that is often forgotten in the thrill of having gotten the derivative of the outside.
Please remember to put in the whole chain rule!
Albert: Let me add my encouragement, too. Mugsy, I give you permission to pound on anyone who forgets the chain
rule.
Mugsy: Really?! And I thought this section was going to be a drag!
Albert: Is that encouragement enough?
Writing out the chain rule.

The chain rule is so important that it is written in several different-looking ways. After all, it is the most important rule in
calculus. We are not yet done with it by any means. Though these may look different, they are all exactly the same. You
get to use whichever one fits the problem the best.
$(f(g(x)))' = f'(g(x))\, g'(x)$. This is the form I gave earlier, and is the one that expresses the formula algebraically the best.
$\dfrac{dy}{dx} = \dfrac{dy}{du} \times \dfrac{du}{dx}$. This form is curious. It appears as though the du’s are being canceled. In fact, that is the right way to
remember this form of the chain rule. It tends not to be as useful a form unless you happen to be given y in terms of u and
u in terms of x so that the derivatives can be worked out easily. While you are given that information occasionally, most of
the time you are given a formula to differentiate.
The big use that we will make of this is somewhat different. If you will remember, the form of the derivative contains
information. When you write dy/dx you are indicating that y is the dependent variable and x is the independent variable.
When you write dy/du, y is still the dependent variable, but now u is the independent variable. Thus, this form of the chain
rule tells you how to change the independent variable in a derivative. That is, if y (the dependent variable) can be expressed
either in terms of x or u (two possible independent variables), then dy/dx and dy/du will not be equal, but will be related
by (dy/dx) = (dy/du) (du/dx). Normally, the independent variable is obvious, but when it isn’t, great care must be taken.
This theme will occur several more times in this course:
The chain rule is the way you change independent variables in a derivative.
Let me give you an example from physics. (Relax. I provide all the formulas you need. What I am looking for is a
real-life dependent variable that can reasonably be expressed in terms of two different independent variables.) One formula
for the velocity of a free-fall object dropped from rest is $v(t) = gt$, and the distance dropped is $s(t) = \frac{1}{2} g t^2$. (Here, g is
the constant of gravitational acceleration. We will derive these formulas later, when we get to integration. They are also
standard from physics.) There are good reasons to ask for velocity either as a function of time or of distance. (How fast is
it dropping after 3 seconds? How fast is it dropping when it has dropped 5 feet?) This means that finding the derivative of
v could use either s or t as the independent variable. One formula we have for velocity would make one derivative easy to
find, $v(t) = gt$. On the other hand, velocity as a function of distance is not so obvious. We can solve $s = \frac{1}{2} g t^2$ for $t$, and
get $t = \sqrt{2s/g}$. (We are only concerned with $t > 0$, so there’s no ±.) That means that
\[ v = g\sqrt{2s/g} = \sqrt{2gs} = \sqrt{2g}\; s^{1/2} \]
Let’s now look at dv/dt and dv/ds. The first is easy: dv/dt = g, a constant. This says that the acceleration (defined to be
dv/dt) of a freely falling body is a constant, something that physics will verify. On the other hand,
\[ dv/ds = \sqrt{2g} \cdot \tfrac{1}{2}\, s^{-1/2} = \sqrt{\frac{g}{2s}} \]
which is hardly a constant (s is changing). What’s the contradiction here? There isn’t any! But we’ll have to look at this a
little more carefully to see that there really isn’t.
The acceleration of an object, as stated earlier, is defined as the derivative of velocity with respect to time, meaning that
we divide ∆v by ∆t, and let ∆t shrink. For a freely falling body, that value is a constant. But when we take ∆v and divide by
∆s, we should get something different.
Think of an example. If you drop a rock over a cliff, the velocity increases by a certain amount each second. That is,
if we find the change in velocity from t = 2.0 seconds to t = 2.1 seconds, we’ll get a certain value. It will be the same as
the value we’d get by finding the change in velocity from t = 5.3 seconds to t = 5.4 seconds, since they both have ∆t = 0.1
second. That’s constant acceleration. On the other hand, if we find some value for the change in velocity from s = 2.0 feet
to s = 2.1 feet, we can get some number. It won’t be the same as the number you’d get from finding the change in velocity
from s = 5.3 feet to s = 5.4 feet. Why? Because the time interval during which the rock is dropping from 5.3 feet to 5.4
feet is much smaller than the time interval for falling from 2.0 feet to 2.1 feet. (Why? Because it is moving faster, having
accelerated.) It doesn’t have as much time to increase its velocity over a certain distance. So ∆v/∆s will not be a constant.
The two derivatives (dv/dt and dv/ds) are, however, related by the chain rule: dv/dt = (dv/ds) (ds/dt). But does it
check? Of course. Watch. Since ds/dt = v, we get
\begin{align}
dv/dt &= (dv/ds) \times v \tag{1.37} \\
&= \sqrt{g/(2s)} \times gt \tag{1.38} \\
&= \sqrt{g/(2s)} \times g\sqrt{2s/g} \tag{1.39} \\
&= g \tag{1.40}
\end{align}
just what we had before.
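The same check can be run with numbers. The sketch below, in Python, estimates dv/dt, dv/ds, and ds/dt for the falling rock by small wiggles and confirms that dv/dt matches (dv/ds)(ds/dt). The value g = 32 ft/s^2, the time t = 3 seconds, and the helper name `slope` are assumptions chosen for illustration only.

```python
import math

G = 32.0  # assumed round value for gravitational acceleration, in ft/s^2

def v_of_t(t):
    """Velocity as a function of time: v = g t."""
    return G * t

def s_of_t(t):
    """Distance dropped as a function of time: s = (1/2) g t^2."""
    return 0.5 * G * t * t

def v_of_s(s):
    """Velocity as a function of distance dropped: v = sqrt(2 g s)."""
    return math.sqrt(2.0 * G * s)

def slope(func, x, dx=1e-6):
    """Estimate a derivative by wiggling the input a little each way."""
    return (func(x + dx) - func(x - dx)) / (2.0 * dx)

t = 3.0
s = s_of_t(t)              # how far the rock has dropped by time t

dv_dt = slope(v_of_t, t)   # independent variable is time
dv_ds = slope(v_of_s, s)   # independent variable is distance
ds_dt = slope(s_of_t, t)   # this is just the velocity again

print(dv_dt, dv_ds * ds_dt)
```

Both printed values should come out very close to g, even though dv/ds by itself is nowhere near g. Trying a different t changes dv/ds and ds/dt, but not their product.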

If this confuses you rather than helps you, don’t worry now. But it will be useful to keep around for reference.
Dudley: You mean I don’t have to learn this?
Albert: Yes. But if you can get it, it will help later.
Mugsy: All I need to know is that I won’t be asked for this on a test.
Albert: OK, you won’t be asked for exactly this on a test. But the ideas here could show up.
Mugsy: I don’t know whether to be happy or not.
Derivative of the outside function (leaving the inside alone) times derivative of the inside function. This is the way
to remember the chain rule for formula-based functions. It is the working, useful, approach that I think through when I am
differentiating a function!
D( f ◦ g) = ((D f ) ◦ g) × Dg. This formula is more for reference. It is the most concise statement of the chain rule, but
tends to be used in more advanced courses.
Dudley: This I can skip entirely?
Albert: Yes. For this course.
Mugsy: That’s nice.
Explaining how the chain rule is used in practice.

Examples of this will be given in class. (Many more will follow as we introduce new functions (trig, inverse trig, logarithmic,
and exponential), since the chain rule allows us to differentiate combinations of functions.) Here are the examples I
will do. Differentiate the following functions.
$(6x^2 + 8x - 7x^{-4})^9$
$\sqrt[5]{3x^4 - 7x^3 + 14x^2 - 18x + 9}$
$\left((x^6 - x + 8)^3 - (x^3 - x + 8)^6\right)^4$
The multi-layered chain rule might make the pattern clear:
\begin{align}
(f_1(f_2(f_3(f_4(x)))))' = {} & f_1'(f_2(f_3(f_4(x)))) \times \tag{1.41} \\
& f_2'(f_3(f_4(x))) \times \tag{1.42} \\
& f_3'(f_4(x)) \times f_4'(x) \tag{1.43}
\end{align}
Note that you essentially peel off one function at a time from the outside, differentiate it, and leave alone all the functions
that are inside it. I’ll do examples when we get more functions.
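The peeling pattern can be written as a short loop. This is only a sketch in Python under stated assumptions: the four layers below (a square root, adding 1, squaring, and 3x − 1) are made-up examples, and the names `layers`, `compose`, `chain_rule_derivative`, and `slope` are hypothetical helpers, not anything from Maple.

```python
import math

# Layers listed from the outermost f1 down to the innermost f4.
# Each entry is (function, its derivative). These four are assumed examples.
layers = [
    (math.sqrt, lambda u: 0.5 / math.sqrt(u)),   # f1(u) = sqrt(u)
    (lambda u: u + 1.0, lambda u: 1.0),          # f2(u) = u + 1
    (lambda u: u * u, lambda u: 2.0 * u),        # f3(u) = u^2
    (lambda x: 3.0 * x - 1.0, lambda x: 3.0),    # f4(x) = 3x - 1
]

def compose(x):
    """Evaluate f1(f2(f3(f4(x)))) by working from the inside out."""
    value = x
    for func, _ in reversed(layers):
        value = func(value)
    return value

def chain_rule_derivative(x):
    """Multiply each layer's derivative, evaluated at the un-wiggled input it receives."""
    product = 1.0
    value = x
    for func, deriv in reversed(layers):
        product *= deriv(value)   # derivative of this layer, inside left alone
        value = func(value)       # pass the output on to the next layer out
    return product

def slope(func, x, dx=1e-6):
    """Estimate a derivative by wiggling the input a little each way."""
    return (func(x + dx) - func(x - dx)) / (2.0 * dx)

x = 1.0
from_chain_rule = chain_rule_derivative(x)
from_wiggling = slope(compose, x)
print(from_chain_rule, from_wiggling)
```

The loop runs from the inside out only because that is the order in which the inputs become available; the product it builds has the same factors as (1.41) through (1.43).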
How to tell when to use it!

I recommend the extremely cautious approach. If you aren’t sure whether or not to use it, go ahead and use it! It will never
be wrong, and could save you from wrong answers. Each derivative that you have to work should be approached with the
thought “Where is the chain rule in this problem?”
Essentially, what we are doing is breaking a function up into simpler parts, the way we did earlier. Only now, the parts
are being put together with function composition rather than multiplication or division.
Combining the chain rule with earlier rules.

The chain rule usually occurs in combination with the product and/or quotient rules. It is when you have to use everything
else together with the chain rule that problems get confusing. Here are the examples of untangling such messes that will be
given in class.
$x\,(x^2 + 1)^{-1}$
$(x \times (x^2 + 1))^{-1}$
$\left(\dfrac{x}{x^2 + 1}\right)^{-1}$
$\dfrac{x^{-1}}{x^2 + 1}$
$\dfrac{x}{(x^2 + 1)^{-1}}$
Explain how to change quotients into products.

The chain rule allows you to ignore the quotient rule, since division can be changed into multiplication by the reciprocal.
An example would help here. If you have (compare to the examples just before this)
\[ \frac{x}{x^2 + 1} \]
you can change it into
\[ x \times (x^2 + 1)^{-1} \]
The derivative of this would then be (using the product rule and the chain rule)
\[ (1) \times (x^2 + 1)^{-1} + (x) \times (-1)(x^2 + 1)^{-2}\,(2x) \]
This needs to be compared to the value obtained by the quotient rule on the original quotient form of the function, namely
\[ \frac{(x^2 + 1)(1) - (x)(2x)}{(x^2 + 1)^2} \]
It really is the same, but takes some work to show it. See the homework.
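Before doing the algebra, it can be reassuring to plug in a few numbers. The Python sketch below evaluates both forms of the derivative of x/(x^2 + 1) at several sample points and checks that they agree. The function names and the sample points are assumptions for illustration, and agreement at a handful of points is evidence, not a proof, so the homework problem still stands.

```python
import math

def product_form(x):
    """Derivative from the product and chain rules, left unsimplified."""
    return (1.0) * (x**2 + 1.0) ** -1 + (x) * (-1.0) * (x**2 + 1.0) ** -2 * (2.0 * x)

def quotient_form(x):
    """Derivative from the quotient rule, left unsimplified."""
    return ((x**2 + 1.0) * (1.0) - (x) * (2.0 * x)) / (x**2 + 1.0) ** 2

samples = [-3.0, -0.5, 0.0, 1.0, 2.0, 10.0]
all_agree = all(
    math.isclose(product_form(x), quotient_form(x), rel_tol=1e-12, abs_tol=1e-12)
    for x in samples
)
for x in samples:
    print(x, product_form(x), quotient_form(x))
print("agree at every sample:", all_agree)
```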
The uses of the chain rule.

Again, the chain rule is so absolutely fundamental to calculus that it is a bit strange to put in its uses. These have been
explained already in what was just said.
There is, however, one more idea that hasn’t appeared yet. The other half of calculus is called integration. And the most
important way of evaluating integrals is by a process that is exactly the chain rule.
Homework #12
Exercises.
1. Find the derivatives of the following functions. Check your answers on Maple if you want.
(a) $f(x) = \sqrt{2 - 3x}$
(b) $g(z) = (5z^3 - 8z)^6$
(c) $h(r) = r^2 \sqrt{2r - 1}$
(d) $s(t) = \dfrac{(3t^2 - 8t + 1)^3}{(6t^2 + 9t + 5)^4}$
(e) $r(\theta) = \left(\dfrac{5\theta^5}{\theta^2 + 1}\right)^9$
2. Find the derivatives of the following functions. Check your answers on Maple if you want.
(a) $f(x) = 3\sqrt{3 - 5x}$
(b) $g(z) = (5z^4 + 8z^3 - 1)^7$
(c) $h(r) = r^3 \sqrt{4r - 1}$
(d) $s(t) = \dfrac{(3t^3 - 4)^5}{(4t + 1)^3}$
(e) $r(\theta) = \left(\dfrac{\theta + 2}{\theta^2 + \theta}\right)^8$
3. Show that in the worked-out example right before this homework set (on changing quotients to products) the two
answers are actually algebraically equal.
4. Show that the derivative of $\dfrac{x}{(x^2 + 1)^{-1}}$ equals the derivative of $x\,(x^2 + 1)$ by algebraically simplifying the first one.
(The first of those derivatives was done in class, as an example. Check your notes.)
5. Make up three more chain rule problems. Some should also require the product rule and/or the quotient rule. Again,
you can check your answers with Maple.
Problems.
1. Let $f_1(x) = m_1 x + b_1$ and $f_2(x) = m_2 x + b_2$. These are both straight lines, with (constant) slopes $m_1$ and $m_2$. We
know their derivatives are their slopes. We want to look at the compositions of lines in some detail and show that it
fits what the chain rule says.
(a) Find the compositions $(f_1 \circ f_2)(x)$ and $(f_2 \circ f_1)(x)$. Multiply out both compositions to give polynomials. They
should both be linear functions, of the form $Mx + B$, with various values of $M$ and $B$, so both graphs are also
lines.
(b) Note that both lines have the same slope. What is that slope? That is, both lines have the same value of $M$.
What is that value?
(c) Do the two compositions give the same function? Since the values of $M$ are the same for both, all you need to
check is if the formulas for $B$ are the same for both. If they are different, the functions will be different. If they
are the same, the functions will be the same.
(d) The chain rule says that $(f_1 \circ f_2)'(x) = (f_1' \circ f_2)(x)\, f_2'(x)$ and $(f_2 \circ f_1)'(x) = (f_2' \circ f_1)(x)\, f_1'(x)$. What are
$(f_1' \circ f_2)(x)$ and $(f_2' \circ f_1)(x)$? (Note that $f_1'(x)$ and $f_2'(x)$ are easy from the formulas. These, however, require some
thought, although they are also easy, once you see them.)
(e) Plug that value of $f_1'(f_2(x))$ into the (chain rule) formula for $(f_1 \circ f_2)'(x)$, and also plug in the value of $f_2'(x)$.
From part (b), we already had calculated what $(f_1 \circ f_2)'(x)$ should be. Do the two agree? (Hint: They had
better!)
2. Show that the quotient rule and its reformulation as a product rule give the same results. That is, show that the quotient
rule applied to $(f(x)/g(x))'$ and the product and chain rules applied to $(f(x)\,(g(x))^{-1})'$ give the same result. (Hint:
One way to show the two answers are the same is to convert the quotient rule’s result into a product $(\text{top})\,(\text{bottom})^{-1}$
and then to multiply that out. Some algebra work later, you should get the same form as the result of the product and
chain rules.)
3. Take $f(x) = \left(\dfrac{x - 2}{x + 3}\right)^2$ for this problem.
(a) Find the derivative of f (x) considering it as a chain rule with outside function being square and the inside
function being the quotient.
(b) Split f (x) into the quotient of two squares. Find its derivative now by the quotient rule and chain rules.
(c) See if you can make the answers from the previous parts agree algebraically. (That’s harder than you might
think! Both are correct (if you did them right), but the answers look very different. I’m trying to get you to
realize that there are many ways to do problems, and you can look for ways to make things easier on you.) (Hint
on the algebra: Factor $(x - 2)/(x + 3)^3$ out of both answers and simplify what’s left in both.)
Investigations.
1. This problem gives a (false!) rationale for $(f \circ g)' = (g \circ f)'$. Find and correct the mistake. (Note: Homework
problem 1 showed that this can sometimes be true. But most of the time it is false.)
The wiggle magnification factor WMF of a composition is, as done in class, the product of the WMF’s of
the two functions. You get that the WMF of $(f \circ g)$ is (WMF of $f$) times (WMF of $g$). The WMF of $(g \circ f)$
is (WMF of $g$) times (WMF of $f$). Since multiplication is commutative (you can multiply factors in any
order), these are obviously equal. Hence, $(f \circ g)' = (g \circ f)'$.
There must be something wrong with this reasoning, because the derivative of a composition does depend on the
order of the factors. What is wrong?
2. In this investigation, we get a formula for the derivative of the inverse of a function. This doesn’t require anything
more than the chain rule and algebra, together with some care about independent and dependent variables.
(a) Let’s first do an example; use $y = x^5$. What is $dy/dx$?
(b) Solve $y = x^5$ for $x$. Find the derivative of that equation. That will give you $dx/dy$.
(c) Show that $dx/dy = \dfrac{1}{dy/dx}$. To do this, plug in the formula for $x$ into $dy/dx$ and use the properties of exponents
on $\dfrac{1}{dy/dx}$.
(d) Now, we go back to working with a general function $y = f(x)$. We will be doing exactly this same thing again,
but with general formulas rather than $x^5$. Solving $y = f(x)$ for $x$ gives $x = f^{-1}(y)$. Plugging that back into
$y = f(x)$ gives
\[ y = f(f^{-1}(y)) \]
which doesn’t say anything more than a function and its inverse undo each other. Differentiate both sides of
this last equation with respect to $y$, using the chain rule on the right side. (It is less confusing to use the prime
notation for derivatives here.)
(e) Plug $x = f^{-1}(y)$ into the chain rule in the previous part (it only fits in one spot!), and then solve for $(f^{-1})'(y)$.
If you realize that $dy/dx = f'(x)$ and $dx/dy = (f^{-1})'(y)$, you should get the same as $dx/dy = \dfrac{1}{dy/dx}$.
1.3.7 More new functions and the chain rule.

There are many other elementary functions (as they are referred to) beyond the polynomials, and one thing that any calculus
course must do is show how to work with those functions also. The other elementary functions are the trig functions (which
I hope are familiar), the inverse trig functions (which are probably less familiar), the logarithms (which should also be
somewhat familiar) and the exponentials (which are probably only vaguely familiar), and the regular and inverse hyperbolic
trig functions (which are probably completely new to you).
Mugsy: I think you are being generous, there, at least with me.
I will go over these functions only briefly, mainly to re-acquaint you with functions that you should have already studied
at least once before. If you find any of these completely new (except the hyperbolic ones), see me fast!
Mugsy: When are your office hours again?
Absolute values.
We technically already know how to do this, since we have a way of defining absolute values in terms of other things we
can differentiate. The definition of absolute value is $|x| = \sqrt{x^2}$. (At least that works for real numbers. Complex numbers
require a more delicate definition.) Then, the derivative of $|x|$ can be done this way:
\begin{align}
\frac{d}{dx}|x| &= \frac{d}{dx}\sqrt{x^2} \tag{1.44} \\
&= \frac{d}{dx}(x^2)^{1/2} \tag{1.45} \\
&= \frac{1}{2}(x^2)^{-1/2}\,(2x) \tag{1.46} \\
&= \frac{x}{(x^2)^{1/2}} \tag{1.47} \\
&= \frac{x}{|x|} \tag{1.48}
\end{align}
Of course, Maple will differentiate absolute values also. Depending on the version of Maple that you are running, you
can get two different answers when you type in diff(abs(x),x);. Maple version V release 3 tends to write the derivative
upside-down from our definition. It will take diff(abs(x), x); and return abs(x)/x. This is equivalent to what we
have. (See the homework.) And, for various technical reasons, I prefer Maple’s answer, so that’s what I will give as the
formula for the derivative of the absolute value of x:
\[ \frac{d}{dx}(|x|) = \frac{|x|}{x} \tag{1.49} \]
It is interesting to note that the derivative of |x | is not defined when x = 0. You get division by 0 then. (Actually, you get
0/0, and that’s even worse, if you’ll remember.) Looking at the graph, it becomes clear why. The graph of y = |x | looks like
a V, with its corner at the origin, and the sides at 45° angles (slopes ±1). What happens if you magnify the graph around
the origin? It remains V-shaped. It never begins to flatten out to a line. That’s the problem. If it doesn’t become like a line,
it can’t have a derivative, because the derivative is the slope of the line that the graph begins to look like.
Dudley: Can’t most functions be flattened out?
Albert: Mathematicians thought for many years they could, except at a few isolated corner points, but it turns out that
“infinitely crinkly” functions—all corners—are more numerous. But for our purposes, yes, the functions we encounter
will flatten out, with the single exception of absolute values near zero.
Mugsy: “Infinitely crinkly.” Hmm. That gives me ideas.
It is also interesting to look at what |x | /x itself is. It has the value +1 for values of x > 0 (where x = |x |) and has the
value −1 for values of x < 0 (where x = − |x |, since for x negative, x and |x | will be the same “size” but opposite signs).
The derivative has no value (it doesn’t exist) at x = 0. Note that this matches the slopes of the lines that make up the V-shape
of the graph of y = |x |, just as it should. The moral of this comment is that |x | doesn’t have a derivative at x = 0, but it is
perfectly fine for all other values of x.
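Here is a small numerical illustration, sketched in Python, of both halves of that moral. Away from 0, the formula |x|/x and a measured wiggle ratio agree. At 0, the wiggle ratio depends on which direction you wiggle, so there is no single slope to report. The function names, sample points, and wiggle size are assumptions for illustration.

```python
def formula_slope(x):
    """Slope of |x| from formula (1.49): |x| / x. Not defined at x = 0."""
    return abs(x) / x

def one_sided_quotient(x, dx):
    """Change in |x| divided by the wiggle dx (dx may be positive or negative)."""
    return (abs(x + dx) - abs(x)) / dx

# Away from the corner, the formula and the measured ratio agree.
for x in (2.0, -3.0):
    print(x, formula_slope(x), one_sided_quotient(x, 1e-6))

# At the corner, wiggling right and wiggling left give different ratios.
from_right = one_sided_quotient(0.0, 1e-6)
from_left = one_sided_quotient(0.0, -1e-6)
print(from_right, from_left)
```

Calling `formula_slope(0.0)` raises `ZeroDivisionError`, which is Python's way of complaining about the 0/0 mentioned above.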
On the other hand, Maple 6 gives diff(abs(x),x); as abs(1, x). What is that? It is more complicated than I want
to get into here, but it deals with the idea that x could be a complex number, and, if so, then the formula for |x | needs
to be more general than |x | /x. If you want to get the previous version of the derivative of |x |, here’s what to do. First,
you need to make sure that Maple knows that x is a real number. You do that by typing in assume(x,real);. That,
however, means that any time that you use x, Maple will print it as x~, the trailing twiddle being a reminder that you have
made some assumptions about x. I find that annoying, so I go to Maple’s menu bar Options | Assumed variables | No
annotation, and that turns off the trailing twiddle. (Tilde is the more accurate term, but I like twiddle). Then, when you
type in diff(abs(x),x);, Maple will return signum(x), which is 1 when x > 0 and −1 when x < 0 (and 0 when x = 0,
which is wrong, and the reason Maple went with the more general form). This is mostly correct, but still not the form that
we are using. One more step will do it; type in convert(%,abs);, and you will get the familiar x/ |x |.
Mugsy: Now why did he go through all that stuff with Maple when he had already given us the formula?
Albert: Probably to warn you about not being able to decipher Maple’s answers to derivatives that contain absolute
values.
Here are the examples of functions I’ll differentiate in class.
$|5x + 8|$
$5|x| + 8$
$|x^3 (x - 5)|$
$x^3\,|x - 5|$
$\left|\dfrac{x}{2x + 3}\right|$
$\dfrac{|x|}{|2x + 3|}$
$\sqrt{|x| + 3}$
$x^2 (4 + |x|)$
The uses of absolute values.

How are absolute values used in mathematics? Probably the most common use is with distances, where the distance between
x = a and x = b is |b − a |, where you don’t have to know then whether a is larger than b or not. In higher mathematics, the
formulas get more complicated, but often boil down to just this.
Homework #13
Exercises.
1. Find the derivatives of the following functions. Use that the derivative of $|x|$ is $|x|/x$ rather than the definition of $|x|$.
(a) $x^2 \times |x|$
(b) $|x^3 - x|$
(c) $\dfrac{|x|}{x}$
(d) $\dfrac{1}{|x|\sqrt{x^2 + 1}}$
2. (a) $x^5 \times |x|$
(b) $|4x^3 - 7x^2|$
(c) $\dfrac{x}{|x|}$
(d) $\dfrac{1}{|x|\sqrt{x^2 - 1}}$
3. Make up three functions of your own that include absolute values and differentiate them. You can check your answers
on Maple.
Problem.
1. In this problem, we show that $\dfrac{x}{|x|} = \dfrac{|x|}{x}$.
(a) Square both sides in the definition (the one that uses square roots) of $|x|$ to show that $|x|^2 = x^2$.
(b) Multiply top and bottom of $x/|x|$ by $x$ and simplify using the result from the previous part to get $|x|/x$.
Trig functions and their derivatives.

The trig functions are defined in terms of x, y, r, and θ, where
θ = angle at the origin through the point (x, y) ≠ (0, 0)
r = $\sqrt{x^2 + y^2}$ (distance from the origin)
The definitions of the trig functions and their typical abbreviations are given in Table 1.1.
Note that for some angles, any of the last four of these might not be defined, since x or y might be 0 without the other being
0. But sin θ and cos θ are defined for any angle θ.

Name        Abbreviation   Definition
sine        sin θ          y/r
cosine      cos θ          x/r
tangent     tan θ          y/x
cotangent   cot θ          x/y
secant      sec θ          r/x
cosecant    csc θ          r/y
Table 1.1: The definitions of the circular trig functions.
Dudley: Mugsy, do these look familiar to you?
Mugsy: Unfortunately, yes.
Note on radians versus degrees in calculus. In the rest of calculus (and all courses afterward), all angles will be measured
in radians. This will be assumed, unless specific instructions for a specific purpose are given for a specific situation.
Always put your calculator into radian mode! A common source of problems when dealing with trig functions is to
forget to change out of degree mode. Most calculators don’t allow you to change what mode the calculator stays in, so you
have to do this yourself each time. (The fancier HP and TI calculators, and probably others, allow you to set radian mode
as the default.) There really is a reason for using radians. It shows up next.
Maple uses radians for all of its angles. Its names for the trig functions are just what you’d expect: sin(x), cos(x),
etc. But don’t forget the parentheses. To Maple, sinx would be a variable and sin x is an error.
Motivate that the derivative of sine is cosine. From a previous homework problem, you showed that the algebraic
method of finding the derivative (“the long way”) of trig functions will not work. For example, we can’t factor a ∆x out of
∆y = sin(x + ∆x) − sin(x) to cancel the ∆x in the denominator. Trig identities can be used to get the derivative, but (to the
way I think) they obscure what is going on rather than clarify it.
Mugsy: Oh, great. Something obscure.
Albert: The alternative is to drop a formula on you, and give you no idea of why it’s correct.
Mugsy: Hey, I’m just going to memorize the thing anyway.
Albert: The point of this course seems to be to try to get away from the “I’m just going to memorize it anyway”
approach, and to give you a bit of rationale for why things work the way they do.
Mugsy: Great. Mess up my learning style, why don’t ya?
We can find the derivative geometrically, though. To avoid getting mixed up, we will use f (θ ) = sin θ , and find
\[ \frac{d}{d\theta}(\sin\theta) = \lim_{\Delta\theta \to 0} \frac{\sin(\theta + \Delta\theta) - \sin(\theta)}{\Delta\theta} \]
I want x to mean x-coordinate of a point, and not the size of the angle. Besides, θ is a typical letter to represent an angle.
The whole point of what follows is to locate and identify the numerator and denominator geometrically, and then show that
it becomes cos θ in the limit as ∆θ closes in on 0.
Begin with the unit circle (that means radius 1, centered at the origin). Draw in angles of θ and θ + ∆θ (in radians),
crossing the circle. The y-coordinates are sin θ and sin(θ + ∆θ ). The difference in the y-coordinates of the intersection of
the angles with the circle is exactly sin(θ + ∆θ ) − sin(θ ). That’s the value of ∆ sin(θ ) = ∆ f ! Keep track of that. We have
part of the fraction that we need.
The length of the arc of the circle that is cut off is exactly ∆θ, the other part of what we want the limit of. (For a central
angle α in radians in a circle of radius r, the arc cut off, or subtended to use the technical term, has length rα. For the circle we
have, r = 1, and α = (θ + ∆θ) − (θ) = ∆θ. Putting these together, the length of the arc is 1 × ∆θ = ∆θ.) That’s the other
part of the fraction!
The reason for using radians shows up here. If you measure α in degrees, then the length of the arc is r × (α × π/180),
and that (π/180) factor will persist through all of your derivatives. Essentially, we use radians to simplify the form of the
derivatives we’ll get. This is why radians don’t really show up until you do calculus, but are the only sensible choice here.
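To see that stray factor show up in practice, here is a minimal Python sketch. It measures the slope of the sine curve at 60 degrees twice: once treating the input as degrees, once as radians. The helper names `sin_of_degrees` and `slope`, the choice of 60 degrees, and the wiggle sizes are assumptions for illustration.

```python
import math

def sin_of_degrees(angle_deg):
    """Sine of an angle given in degrees."""
    return math.sin(math.radians(angle_deg))

def slope(func, x, dx):
    """Estimate a derivative by wiggling the input a little each way."""
    return (func(x + dx) - func(x - dx)) / (2.0 * dx)

angle_deg = 60.0
slope_per_degree = slope(sin_of_degrees, angle_deg, 1e-4)
slope_per_radian = slope(math.sin, math.radians(angle_deg), 1e-6)

print(slope_per_degree)                     # compare with (pi/180) * cos(60 degrees)
print(slope_per_radian)                     # compare with cos(60 degrees)
print(slope_per_degree / slope_per_radian)  # compare with pi/180
print(math.pi / 180.0)
```

In radians the slope is just the cosine. In degrees the same slope is scaled by π/180, and that factor would follow you through every trig derivative.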
Let’s look closely (in both senses) at the circle near the intersections, where all the action is going on.
Mugsy: Is that another pun?
Albert: I think so.
If ∆θ is very small (and remember that we are going to be taking ∆θ → 0, so it will get very small), then the circle between
the two intersections will be almost a straight line. In fact, to make things easier, I will replace the circle with its tangent
line at the intersection with the radius. The error caused by doing this will be invisible for ∆θ tiny enough. (The essence
of calculus is replacing curves by tangent lines. It needs to be done with some care, though.) That gives a right triangle
with vertical side = ∆(sin θ ), and hypotenuse = ∆θ . This one triangle contains both parts of the fraction that we need
to look at. We’re almost done. The ratio ∆(sin θ )/∆θ is then the cosine of the top angle. With a little geometry (using
parallel lines and complementary angles), you can show that the top angle is essentially θ . Thus, ∆(sin θ )/∆θ ≈ cos θ .
(The approximation is due to replacing the circle by the tangent line to the circle.) As ∆θ → 0, the approximations only
improve, and so work out to give
\[ \frac{d}{d\theta}(\sin\theta) = \cos\theta \]
This argument is not rigorous, but I hope it is convincing.
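If a picture is not convincing enough, numbers may help. The Python sketch below computes the ratio ∆(sin θ)/∆θ for smaller and smaller ∆θ and sets it next to cos θ. The angle θ = 1 radian and the particular list of ∆θ values are assumptions for illustration.

```python
import math

theta = 1.0   # an assumed sample angle, in radians
errors = []   # how far each ratio is from cos(theta)

for d_theta in (0.1, 0.01, 0.001, 0.0001):
    # Vertical side of the little triangle divided by its hypotenuse (the arc).
    ratio = (math.sin(theta + d_theta) - math.sin(theta)) / d_theta
    errors.append(abs(ratio - math.cos(theta)))
    print(f"{d_theta:8.4f}  {ratio:.6f}  {math.cos(theta):.6f}")
```

Each row should land closer to cos θ than the one before it, which is the "approximations only improve" claim in numerical form.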
Obtaining the derivatives of the other trig functions. In theory, we could find the derivatives of all of the other trig
functions by the same procedure. It would be messy, except for cos θ , which is essentially the same, except for one sign
switch.
First, we get the derivative of cos θ. To do that, we pull some trig identities out of the hat:
cos(π/2 − θ ) = sin θ
sin(π/2 − θ ) = cos θ
The sine of an angle is the cosine of its complement, and vice versa. But we can use the chain rule now! That means that
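\begin{align}
\frac{d}{d\theta}(\cos\theta) &= \frac{d}{d\theta}\bigl(\sin(\pi/2-\theta)\bigr) \tag{1.50}\\
&= \cos(\pi/2-\theta)\times\frac{d}{d\theta}(\pi/2-\theta) \tag{1.51}\\
&= \sin\theta\times(-1) \tag{1.52}\\
&= -\sin\theta \tag{1.53}
\end{align}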
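\begin{align*}
\frac{d}{d\theta}(\sin\theta) &= \cos\theta\\
\frac{d}{d\theta}(\cos\theta) &= -\sin\theta\\
\frac{d}{d\theta}(\tan\theta) &= \sec^2\theta\\
\frac{d}{d\theta}(\cot\theta) &= -\csc^2\theta\\
\frac{d}{d\theta}(\sec\theta) &= \sec\theta\,\tan\theta\\
\frac{d}{d\theta}(\csc\theta) &= -\csc\theta\,\cot\theta
\end{align*}
Table 1.2: Derivatives of the circular trig functions.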
In deriving the derivative of cos θ, we used the fact that the derivative of sine is cosine (according to what we did before).
We also used the chain rule to get the term that was the derivative of π/2 − θ (the derivative of the inside). The derivative
of cos θ can also be obtained by the same process that gave the derivative of sin θ. See the homework.
Now that we have the derivatives of sine and cosine, we can get the derivatives of all the others by the quotient rule. The
reason is that tan θ = (sin θ)/(cos θ), cot θ = (cos θ)/(sin θ), sec θ = 1/(cos θ), and csc θ = 1/(sin θ). Although some
trig identities need to be pulled together to do this, Table 1.2 summarizes the results.
We will work some of these in the homework, so that you have at least seen them. Although I don’t encourage memorizing
all of these, you should at least memorize the derivatives of sine and cosine.
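If you want to convince yourself that the table entries are right before working them in the homework, a numerical spot-check is easy. The following is a sketch in Python (not part of the course software); the helper names and the sample angle 0.7 are my own choices. It compares a central difference quotient of each function with the formula in the table.

```python
import math

def numerical_derivative(f, x, h=1e-6):
    """Central difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def sec(t):
    return 1 / math.cos(t)

def csc(t):
    return 1 / math.sin(t)

def cot(t):
    return math.cos(t) / math.sin(t)

# Each entry: (function, derivative formula from the table).
table = {
    "tan": (math.tan, lambda t: sec(t) ** 2),
    "cot": (cot, lambda t: -csc(t) ** 2),
    "sec": (sec, lambda t: sec(t) * math.tan(t)),
    "csc": (csc, lambda t: -csc(t) * cot(t)),
}

theta = 0.7  # an arbitrary angle (radians) where all four functions are defined
for name, (f, fprime) in table.items():
    print(name, numerical_derivative(f, theta), fprime(theta))
```

The two columns should agree to several decimal places. This is a check, not a derivation; the quotient rule is still the way to get the formulas.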
Note on keeping the signs straight. There is a pattern here that is too nice to omit. Note that the derivatives of all the
“co-” functions (the ones that start with “co”) have a negative sign as part of the formula. (That doesn’t mean that those
derivatives will always be less than 0; it depends on whether the rest of the formula gives a positive or negative number.) In
particular, this means that the derivative of cos θ is − sin θ. That negative will be a source of confusion later.
Mugsy: You aren’t helping with comments like that.
Albert: It’s only a minor confusion there, and if you remember the comment about the derivatives of “co-” functions
having a negative sign in the derivative, then even that goes away.
Additionally, note that you can get the derivative of csc θ from the derivative of sec θ (its co-function) by putting in a
minus sign and changing all the functions in the derivative to their co-functions. The same is true of getting the derivative
of cot θ from the derivative of tan θ. You change the sign of the derivative, and change the sec^2 θ to its co-function, getting
finally − csc^2 θ. What that ultimately means is that you cut the amount of memorization in half. But since I’m not requiring
you to memorize any of these except the derivatives of sin θ and cos θ, it’s only slightly useful. (On the other hand, if you
are going on to teach calculus, you’ll need to remember such arcane incantations. . . .)
Combining these with the chain rule and other rules. Of course, now that we have these, we will proceed to combine
the results with the product rule, the quotient rule, and the chain rule. Let me emphasize now that our goal is to cover
rapidly all of the standard functions in calculus in one shot. It becomes important, then, to keep up with these functions.
Getting behind will cause severe difficulties in a very short time.
There is one notational hassle that we must clear up before we can go any further. The notation sin^2 θ means (sin θ)^2,
and similarly for other exponents and other trig functions. When working derivatives, this must be kept in mind, and
it would be useful to write it that way until you get used to it. For example,
\[ \frac{d}{d\theta}(\cos^5\theta) = \frac{d}{d\theta}(\cos\theta)^5 = 5(\cos\theta)^4\times(-\sin\theta) = -5\cos^4\theta\,\sin\theta \]
Let me explain how the various steps of this example worked. First, the cos^5 θ was changed to its equivalent form (cos θ)^5.
The derivative of this was obtained by the chain rule, with the inside function being u = cos θ and the outside function
being u^5. The derivative of the outside gives 5 u^4, or 5 (cos θ)^4. The derivative of the inside is the derivative of cos θ, which
is − sin θ. The product of these is what the chain rule says the derivative is. Finally, (cos θ)^4 is changed back to cos^4 θ to
give the answer.
This should be contrasted to the similar-looking, but very different, derivative
\begin{align}
\frac{d}{d\theta}\cos(\theta^5) &= -\sin(\theta^5)\times 5\theta^4 \tag{1.54}\\
&= -5\theta^4\sin(\theta^5) \tag{1.55}
\end{align}
where the inside is u = θ^5 and the outside is cos u.
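To see just how different these two look-alikes are, here is a small numerical comparison, sketched in Python for illustration. The function names and the sample angle are hypothetical choices of mine. Each derivative formula is checked against a central difference quotient, and the two answers are printed side by side.

```python
import math

def numerical_derivative(f, x, h=1e-6):
    """Central difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def power_of_cosine(t):
    """cos^5(t), meaning (cos t)^5: inside is cos t, outside is u^5."""
    return math.cos(t) ** 5

def cosine_of_power(t):
    """cos(t^5): inside is t^5, outside is cos u."""
    return math.cos(t ** 5)

def d_power_of_cosine(t):
    return -5 * math.cos(t) ** 4 * math.sin(t)

def d_cosine_of_power(t):
    return -5 * t ** 4 * math.sin(t ** 5)

theta = 1.2  # an arbitrary sample angle, in radians
print(numerical_derivative(power_of_cosine, theta), d_power_of_cosine(theta))
print(numerical_derivative(cosine_of_power, theta), d_cosine_of_power(theta))
```

Each line should show two nearly equal numbers, but the two lines are nowhere near each other. Where the exponent sits decides which function is the inside.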

One more example needs to be given. In the derivatives of sec θ and csc θ, two θ’s appear. How do you plug in the
inside in that case? (This situation will also occur in later derivative formulas, and the answer will be the same there.) You
plug in the inside in both of them. For example,
\begin{align}
\frac{d}{d\theta}\sec(\theta^3) &= \sec(\theta^3)\tan(\theta^3)\,(3\theta^2) \tag{1.56}\\
&= 3\theta^2\sec(\theta^3)\tan(\theta^3) \tag{1.57}
\end{align}
The θ^3 gets put into both the sec and the tan, since the derivative of sec θ (with respect to θ) is sec θ tan θ.
When you differentiate trigonometric functions with Maple, you will usually have no problems comparing your answers
with its answers. However, there is one big exception to that. When you ask Maple to differentiate tan x or cot x, it writes it
in a different form (which shouldn’t be strange to you after absolute values):
> diff(tan(x),x);
1 + tan(x)^2
> diff(cot(x),x);
-1 - cot(x)^2
Trying to get Maple to rewrite these in more normal terms involves simplifying using something called side relations,
which I feel is beyond what is useful for you. (If you want to see it, check out Maple ?siderels, or see me.)
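My understanding (stated from experience with Maple, and possibly version-dependent) is that Maple reports the derivative of tan x as 1 + tan(x)^2 and the derivative of cot x as -1 - cot(x)^2. Those are the same functions as sec^2 x and − csc^2 x, by the Pythagorean identities. Rather than fight with side relations, you can spot-check the agreement numerically. Here is a sketch in Python (for illustration only); the sample points are arbitrary.

```python
import math

def sec(x):
    return 1 / math.cos(x)

def csc(x):
    return 1 / math.sin(x)

def cot(x):
    return math.cos(x) / math.sin(x)

for x in (0.3, 1.0, 2.5):
    # derivative of tan x: table form vs. the tangent-only form
    print(sec(x) ** 2, 1 + math.tan(x) ** 2)
    # derivative of cot x: table form vs. the cotangent-only form
    print(-csc(x) ** 2, -1 - cot(x) ** 2)
```

Each printed pair should match, which is all you need in order to compare your hand answer with a differently written one.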
The examples that will be worked in class are given here. Differentiate the following functions.
sin(√x)
|cos θ|
tan^3(5x)
cot^2 x sin(3x)
(sec θ + csc θ)/(sec θ − csc θ)
The uses of the circular trigonometric functions.

Who uses all these trig functions? Just about anyone who uses calculus and more advanced mathematics. For example,
any physical phenomenon that is periodic (that is, repeating) requires sines and cosines to analyze it. That might be the
alternating current in an electrical circuit, the heating due to the sun in meteorology (either on a daily basis, or as it changes
periodically over the course of a year), or the backyard swing.
Homework #14
Exercises.
1. Find the derivatives of the following functions.
(a) |tan θ|
(b) tan |θ|
(c) sin^3 θ
(d) t^3 sec(4t)
(e) cos(5α)/(α^2 + 1) (α is the Greek letter alpha)
2. Find the derivatives of the following functions.
(a) sin(cos θ)
(b) cos(sin θ)
(c) cot^5 x
(d) t^4 sec(t^2)
(e) (α tan α)/(α^2 − 1)
3. Make up three more problems yourself that include trig functions, with Maple as a check if you want one.
Problems.
1. In this problem, we derive, using calculus, the most famous trig identity,
sin^2 θ + cos^2 θ = 1,
called the Pythagorean identity. As with any proof, you are not allowed to use the result in the proof itself, so pretend
you don’t know it already for this problem.
(a) Differentiate sin^2 θ + cos^2 θ as a function of θ. (That is, don’t substitute 1 for the expression.) Simplify what
you get, and show that it turns into 0.
(b) The only functions whose derivatives are 0 are constants. That means that sin^2 θ + cos^2 θ is some constant.
And the fact that it is a constant means that its value doesn’t change as we put in different values of θ. Plug in
the value θ = 0, and evaluate sin^2(0) + cos^2(0) by putting in the (known) values of sin(0) and cos(0).
(c) Explain (using the result of the previous part) why sin^2 θ + cos^2 θ = 1 is valid for all angles θ.
2. In this problem, you derive the formulas for the derivatives of some of the trig functions that I didn’t derive earlier. I
wrote the remaining trig functions in terms of sin θ and cos θ. Use those identities here.
(a) Differentiate the identity for tan θ in terms of sin θ and cos θ using the quotient rule. Use the Pythagorean
identity (sin^2 θ + cos^2 θ = 1) and the identity for sec θ in terms of cos θ to make the derivative you get match
what I have.
(b) Differentiate the identity for sec θ in terms of cos θ using the quotient rule. Take the product sec θ tan θ and
convert it by the identities to sin θ’s and cos θ’s, and show algebraically that it equals the derivative of sec θ you
just got.
Inverse trig functions and their derivatives.

Although it isn’t obvious now why we would even bother with these functions, they turn out to be critical later. I am not
going to assume you remember (or have ever known much) about inverse trig functions from high school. The review from
the lab session should have given you enough.
Mugsy: And if not?
Albert: Ask in a review session, or during office hours.
Function        Domain                  Range
Arcsine         [−1, 1]                 [−π/2, π/2]
Arccosine       [−1, 1]                 [0, π]
Arctangent      (−∞, ∞)                 (−π/2, π/2)
Arccotangent    (−∞, ∞)                 (0, π)
Arcsecant       (−∞, −1] ∪ [1, ∞)       [0, π/2) ∪ (π/2, π]
Arccosecant     (−∞, −1] ∪ [1, ∞)       [−π/2, 0) ∪ (0, π/2]
Table 1.3: Domains of the inverse trig functions.
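\begin{align*}
\frac{d}{dx}\operatorname{Arcsin} x &= -\frac{d}{dx}\operatorname{Arccos} x = \frac{1}{\sqrt{1-x^2}}\\
\frac{d}{dx}\operatorname{Arctan} x &= -\frac{d}{dx}\operatorname{Arccot} x = \frac{1}{1+x^2}\\
\frac{d}{dx}\operatorname{Arcsec} x &= -\frac{d}{dx}\operatorname{Arccsc} x = \frac{1}{|x|\sqrt{x^2-1}}
\end{align*}
Table 1.4: Derivatives of the inverse trig functions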
Again, note identities and radians. Please remember that the output of the inverse trig functions will always be angles,
and therefore, will always be in radians. The domains and ranges of the inverse trig functions are given in Table 1.3.
These definitions are typical, but not universal. Some people use different ranges
for Arccot x, Arcsec x, and/or Arccsc x. In fact, earlier versions of Maple defined Arccot x so that its range was
(−π/2, 0) ∪ (0, π/2]. With that definition, arccot(x); in Maple was the same as Arccot x for x > 0, but equaled
(Arccot x) − π for x < 0. I am happy
to report, however, that the Maple people have “seen the light,” and now agree with me. All this is just to show you that
there are various different, valid, definitions for inverse trig functions out there.
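To make the two Arccot conventions concrete, here is a sketch in Python (used only for illustration). Both function names are hypothetical. The first uses the range (0, π) from the table, written as π/2 − Arctan x; the second uses Arctan(1/x), which is one way to get the range (−π/2, 0) ∪ (0, π/2] described above.

```python
import math

def arccot_table(x):
    """Arccot with range (0, pi), the convention used in these notes."""
    return math.pi / 2 - math.atan(x)

def arccot_alternative(x):
    """Arccot with range (-pi/2, 0) U (0, pi/2], computed as Arctan(1/x)."""
    if x == 0:
        return math.pi / 2
    return math.atan(1 / x)

def numerical_derivative(f, x, h=1e-6):
    """Central difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

for x in (2.0, -2.0):
    print(x, arccot_table(x), arccot_alternative(x),
          numerical_derivative(arccot_table, x),
          numerical_derivative(arccot_alternative, x))
```

For x = 2 the two values agree. For x = −2 they differ by exactly π, but the derivative columns still match, since shifting a function by a constant does not change its slope.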
Dudley: Doesn’t this mess up everybody?
Albert: The only inverse trig functions in common use are Arcsin x and Arctan x, and those are standard.
The derivatives of the inverse trig functions come in three pairs: the derivative of each inverse function is the
negative of the derivative of the corresponding inverse “co-” function. Table 1.4 gives them.
Note the domains and the effect on signs of derivatives. I would urge caution about these, especially for x < 0 (which
is the only place where disagreement occurs). The definition that Maple gives to Arccot x doesn’t change its derivative.
One reason for changing the definition of Arcsec x and Arccsc x is to eliminate the absolute values in the derivative. They
are a genuine inconvenience.
Maple and derivatives of inverse trig functions. Maple does not capitalize inverse trig functions, and you have to put
the argument (the x) in parentheses, so that Arctan x in Maple becomes arctan(x).
Maple’s derivatives of Arcsec x and Arccsc x are equivalent to what I have, but avoid the absolute value hassle by using
an awkward combination of square roots:
> diff(arcsec(x),x);
\[ \frac{1}{x^2\sqrt{1-\dfrac{1}{x^2}}} \]
> diff(arccsc(x),x);
\[ -\frac{1}{x^2\sqrt{1-\dfrac{1}{x^2}}} \]
These are equivalent to the formulas I gave. (See the homework.)
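The homework asks you to show the equivalence algebraically. As a sanity check beforehand, you can compare the two forms numerically. This is a sketch in Python (for illustration); the function names are my own, and the sample points are arbitrary values with |x| > 1, including negative ones, since that is where the absolute value matters.

```python
import math

def arcsec_derivative_table(x):
    """The form from the table: 1 / (|x| * sqrt(x^2 - 1))."""
    return 1 / (abs(x) * math.sqrt(x * x - 1))

def arcsec_derivative_square_roots(x):
    """The square-root form shown above: 1 / (x^2 * sqrt(1 - 1/x^2))."""
    return 1 / (x * x * math.sqrt(1 - 1 / (x * x)))

for x in (2.0, -3.0, 1.5, -1.1):
    print(x, arcsec_derivative_table(x), arcsec_derivative_square_roots(x))
```

The two columns should agree for every sample, positive or negative.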
Inverse trig functions and the chain rule. Note that the x in the derivatives of Arcsec x and Arccsc x appears in two
places (inside the absolute values and inside the square root). This is essentially the same thing we encountered with the
derivatives of secant and cosecant. How do you use the chain rule with outside functions Arcsec and Arccsc, where there
are two places you could put the inside function? The answer is that you put the inside in both places, just as with absolute
values.
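Here is a short worked illustration of putting the inside in both places. The inside function 3x is my own choice for this example, not one of the class examples.

```latex
\begin{align*}
\frac{d}{dx}\operatorname{Arcsec}(3x)
  &= \frac{1}{|3x|\sqrt{(3x)^2-1}}\times\frac{d}{dx}(3x)
     && \text{(inside $u = 3x$ goes in both places)}\\
  &= \frac{3}{|3x|\sqrt{9x^2-1}}\\
  &= \frac{1}{|x|\sqrt{9x^2-1}}
\end{align*}
```

Notice that the 3x lands inside the absolute value and inside the square root, and the chain rule still contributes the extra factor of 3 from the derivative of the inside.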
Combining these with everything before. Of course, we will have to combine these functions with the chain rule, the
product rule, the quotient rule and the regular trig functions. And we aren’t done adding functions. The one thing to
remember is that all the new functions that we are getting follow precisely the same patterns for products, quotients, and
composition that we have already learned to use. We just have to keep a bunch more formulas in mind. (The patterns are
the product, quotient, and chain rules.)
Mugsy: Oh, great. And I thought that this course wasn’t going to stress memorization.
Albert: It doesn’t. It turns out that the formulas will all be given to you on the tests. You really don’t have to
memorize them. On the other hand, you had better be familiar with how to use these formulas!
Again, here are the examples that I will work in class. Differentiate the following functions.
Arcsin(5x)
sin(2x) Arctan(3x)
x^3 Arcsec(x^2)
sec(x Arcsin(6x))
(Arctan x)/(x^2 + 1)
(x cos(3x))/(Arctan(5x) sec(7x))
The uses of inverse circular trigonometric functions.

Why do these? They don’t show up anywhere near as often as the regular circular trig functions. In this case, I will have to
agree with you, but there is a reason. Unfortunately, I can’t explain it to you yet. Put it this way, when we get to integration,
the inverse trig functions will reappear in a very unexpected way. The justification of talking about them now will have to
wait until then.
Homework #15
Exercises.
1. Find the derivatives of the following functions. You do not need to simplify your answers.
(a) Arcsin(2t)
(b) Arctan(x^2 + 1)
(c) |Arcsin(5x)|
(d) tan(1 − Arctan x)
(e) (x Arcsec x)/(1 − 2x^2)
2. Find the derivatives of the following functions. You do not need to simplify your answers.
(a) Arcsin(5t + 2)
(b) √(Arctan(4x))
(c) Arcsec(x^2 − 1)
(d) sec(1/(Arcsin x))
(e) (Arctan x Arcsin x)/(x^2 + sin x)
3. Make up three of your own derivatives involving inverse trig functions and check them with Maple if you want.
Problems.
1. Let me lead you through the derivation of the formula for the derivative of one inverse trig function, Arctan x. (Here
is one of the times that we use the procedure for finding the derivative of the inverse of a function. I told you we’d see
it again. Refer to the homework problems for the chain rule if you want to see it in general.) Assume
that you don’t yet know the derivative for Arctan x, and this problem will derive it.
(a) The definition of inverse functions shows that tan(Arctan x) = x. Differentiate both sides of this identity with
respect to x. What rule did you need to use? What part of the rule required the derivative of Arctan x?
(b) Once we figure out sec^2(Arctan x), we can find the derivative of Arctan x by solving. If you’ll remember, there is
a procedure for finding a formula for “trig(Arctrig x)” functions. (Refer back to the reference section on inverse
trig functions if you need a refresher.) Let’s apply that to sec(Arctan x), and then square it. Call Arctan x = θ,
or tan θ = x. Draw a generic right triangle with tan θ = x. (For example, let θ be the lower angle, let x be
the length of the vertical side, and 1 the length of the horizontal side. Find the length of the hypotenuse by the
Pythagorean theorem. It should be √(1 + x^2).) Then we find that sec(Arctan x) = sec θ can be read off of the
triangle. What then is sec(Arctan x)? What is sec^2(Arctan x)?
(c) Plug the results from the previous part into the formula from the first part. Solve to get d/dx(Arctan x) =
1/(1 + x^2).
2. Show that the derivative Maple gave for Arcsec x is the same as the derivative I gave. For this, you will want to use
the definition of the absolute value of x.
Logarithms and their derivatives.

I am going to go over the basics of logarithms in lab period. Again, I do not assume that you are familiar with all their
properties, but it really will help to have seen it before.
Dudley: This has me worried. Al, how much am I really going to have to know about logarithms?
Albert: Fairly little, in much the same way that you didn’t need to know everything about trig functions. In this case,
there are only three of them. Here they are:
\[ \ln(a\,b) = \ln(a) + \ln(b) \qquad \ln\left(\frac{a}{b}\right) = \ln(a) - \ln(b) \qquad \ln(a^n) = n\ln(a) \tag{1.58} \]
These are often called the properties of logarithms. Memorize them. There’s also one more, coming up soon.
The lab will cover common (base 10) and natural (base e) and other-based logarithms, and their properties, and how to
work them on your calculator.
Derivative of the natural logarithm. The formula for the derivative of ln x is
\[ \frac{d}{dx}(\ln x) = 1/x \tag{1.59} \]
Where did that come from? I’ll show you.

Albert: Dudley, this is the other one that you will have to know.
Dudley: Thanks, Al. It looks too easy, though.
We are going to have to find this derivative “the hard way” since we have nothing we can use with it. We had a procedure
for grinding these out. First, we begin with y = ln x, and find ∆y = ln(x + ∆x) − ln(x). But by the properties of logarithms
(ask Dudley), this is
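\[ \Delta y = \ln\left(\frac{x+\Delta x}{x}\right) \]
which can be simplified to
\[ \Delta y = \ln\left(1+\frac{\Delta x}{x}\right) \]
The difference quotient of this function is then
\[ \frac{\Delta y}{\Delta x} = \frac{\ln\left(1+\frac{\Delta x}{x}\right)}{\Delta x} \]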
We now want to look at this as ∆x → 0. Not nice. However, by a trick, we can do something about it. Instead of using ∆x,
we pick something more suited to the problem, like z = (∆x)/x. That would make the logarithm term a lot easier to deal
with. But it would mess up the bottom of the difference quotient, where we don’t have (∆x)/x; we just have ∆x. But—and
this is the whole idea of this trick, and why it works in this case (and maybe not in others)—we can solve z = (∆x)/x for
∆x and get ∆x = z x. Then ∆x → 0 is the same thing as z → 0. We then get that
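\begin{align}
\frac{d}{dx}\ln x &= \lim_{\Delta x\to 0}\frac{\Delta y}{\Delta x} \tag{1.60}\\
&= \lim_{\Delta x\to 0}\frac{\ln\left(1+\frac{\Delta x}{x}\right)}{\Delta x} \tag{1.61}\\
&= \lim_{z\to 0}\frac{\ln(1+z)}{z\,x} \tag{1.62}\\
&= \frac{1}{x}\left(\lim_{z\to 0}\frac{\ln(1+z)}{z}\right) \tag{1.63}
\end{align}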
Why did we do that? It certainly seems messy enough to want to avoid if at all possible. But it isn’t possible. The real
reason for it is the expression at the end. There’s the factor of 1/x, which will come in handy (as the result), and this other
messy limit. The point is this. We have separated out the variable x from the rest of the problem. That is, the value of the
limit has nothing to do with x anymore. We work it out once and then use it. Put another way, it is a constant. We
choose e as the base of our logarithms, because with that choice of base (ln x = log_e x), the constant (the limit) has the value
of 1. Other bases for logarithms give different values for the limit. Plugging in 1 for the limit gives the answer.
\[ \frac{d}{dx}(\ln x) = \frac{1}{x} \tag{1.64} \]
That’s the formula.
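The claim that the leftover limit equals 1 for base e, and something else for other bases, can be checked numerically. Here is a sketch in Python (my choice for illustration); the function name is hypothetical. It evaluates log(1 + z)/z for shrinking z, once with base e and once with base 10.

```python
import math

def log_limit_quotient(z, base=math.e):
    """log_base(1 + z) / z, the quantity whose limit as z -> 0 is the constant."""
    return math.log(1 + z, base) / z

for z in (0.1, 0.01, 0.001, 1e-6):
    print(z, log_limit_quotient(z), log_limit_quotient(z, 10))

# For comparison: the base-10 column should approach 1 / ln(10).
print(1 / math.log(10))
```

The base-e column heads toward 1, while the base-10 column heads toward 1/ln(10), roughly 0.434. That extra constant would tag along in every derivative if we used common logarithms, much like the π/180 factor with degrees.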

Dudley: That seems somewhat unusual. The derivatives of powers of x give powers of x. Shouldn’t the derivative of
some power of x be equal to 1/x = x^(−1) rather than having the derivative of ln x giving x^(−1)?
Albert: Good question! It turns out it doesn’t work, though. It’ll become a homework problem.
Combining this with everything before. As before, I will give examples in class. Here are the ones I’ll go through.
ln(1 + x^2)
x^3 ln x
ln(Arcsin x)
Arctan(ln x)
sin(4x) ln(5 + x)
(x + ln x)/(sin x Arcsec x)
Logarithmic differentiation, and revisited product and quotient rules. A new option occurs once we have logarithms.
It can be used to differentiate incredibly messy products, quotients, and powers.
Mugsy: Here it comes. I can hardly wait.
We can use the properties of logarithms to convert the function into a much nicer form before we have to apply the rules
for derivatives. Several examples will help.
First, take y = x^4 sin x. To differentiate this would be a simple product rule, but let me do it by logarithmic differentiation
as an illustration.
Mugsy: Why do we have to do this the hard way when we already have a reasonably simple way to do it? Al, you’d
better have a good answer for this!
Albert: The process of logarithmic differentiation has several steps, and they are simpler to see on an easier example,
where the algebra doesn’t get in the way so much. The steps to follow are given in the example.
• Step 1. Take the logarithm of both sides. You start with y = x^4 sin x, and you get
\[ \ln y = \ln(x^4 \sin x) \]
• Step 2. Simplify the expressions using the properties of logs.

Albert: This is where you need those properties of logarithms, Dudley.
\begin{align}
\ln y &= \ln(x^4) + \ln(\sin x) \tag{1.65}\\
&= 4\ln x + \ln(\sin x) \tag{1.66}
\end{align}
• Step 3. Differentiate both sides with respect to x (or whatever the independent variable is).
\begin{align}
\ln y &= 4\ln x + \ln(\sin x) \tag{1.67}\\
\frac{d}{dx}(\ln y) &= \frac{d}{dx}\bigl(4\ln x + \ln(\sin x)\bigr) \tag{1.68}\\
\frac{1}{y}\times\frac{dy}{dx} &= 4\left(\frac{1}{x}\right) + \frac{1}{\sin x}\times(\cos x) \tag{1.69}
\end{align}
Note the use of the chain rule on both the ln y and ln(sin x) terms.
• Step 4. Solve the resulting equation for dy/dx. (Multiply through by y.)
\begin{align}
\frac{1}{y}\times\frac{dy}{dx} &= 4\left(\frac{1}{x}\right) + \frac{1}{\sin x}\times(\cos x) \tag{1.70}\\
\frac{dy}{dx} &= y\times\left[4\left(\frac{1}{x}\right) + \frac{1}{\sin x}\times(\cos x)\right] \tag{1.71}
\end{align}
This is the answer. But it is sometimes useful to go back to the original problem and plug in the function for y. In
this case, you’d get
\[ \frac{dy}{dx} = (x^4\sin x)\times\left[4\left(\frac{1}{x}\right) + \frac{1}{\sin x}\times(\cos x)\right] \]
With more complicated functions (like the next one), you won’t want to do that.
Mugsy: Hey! I didn’t want to even start this!
It hardly looks like this is the correct answer. Certainly the product rule wouldn’t give that form! But it is the same.
See the homework.
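Before you grind through the algebra in the homework, a numerical spot-check can reassure you that the two forms really agree. The following is a sketch in Python (for illustration only); the function names are mine, and the sample points are chosen where x > 0 and sin x > 0 so that every logarithm in the steps makes sense.

```python
import math

def by_logarithmic_differentiation(x):
    """dy/dx = y * (4/x + cos x / sin x), with y = x^4 sin x."""
    y = x ** 4 * math.sin(x)
    return y * (4 / x + math.cos(x) / math.sin(x))

def by_product_rule(x):
    """dy/dx = 4 x^3 sin x + x^4 cos x."""
    return 4 * x ** 3 * math.sin(x) + x ** 4 * math.cos(x)

for x in (0.5, 1.3, 2.0):
    print(x, by_logarithmic_differentiation(x), by_product_rule(x))
```

The two columns should match at every sample point. The algebra in the homework explains why.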
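Let me do another example, a lot messier this time. Refer back to the previous example for the different steps I am
going to follow. Take
\begin{align}
y &= \sqrt[7]{\frac{x^5(\operatorname{Arctan} x)^3}{(x^4+x)^4\cos^8 x}} \tag{1.72}\\
&= \left[\frac{x^5(\operatorname{Arctan} x)^3}{(x^4+x)^4\cos^8 x}\right]^{1/7} \tag{1.73}
\end{align}
Then step 1 gives:
\[ \ln y = \ln\left\{\left[\frac{x^5(\operatorname{Arctan} x)^3}{(x^4+x)^4\cos^8 x}\right]^{1/7}\right\} \]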
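Then step 2 gives (with a major review of the properties of logarithms thrown in for free):
Mugsy: Gee, thanks.
\begin{align}
\ln y &= \ln\left\{\left[\frac{x^5(\operatorname{Arctan} x)^3}{(x^4+x)^4\cos^8 x}\right]^{1/7}\right\} \tag{1.74}\\
&= \frac{1}{7}\ln\left[\frac{x^5(\operatorname{Arctan} x)^3}{(x^4+x)^4\cos^8 x}\right] \tag{1.75}\\
&= \frac{1}{7}\Bigl\{\ln\bigl[x^5(\operatorname{Arctan} x)^3\bigr] - \ln\bigl[(x^4+x)^4\cos^8 x\bigr]\Bigr\} \tag{1.76}\\
&= \frac{1}{7}\Bigl\{\ln(x^5) + \ln\bigl((\operatorname{Arctan} x)^3\bigr) - \ln\bigl((x^4+x)^4\bigr) - \ln(\cos^8 x)\Bigr\} \tag{1.77}\\
&= \frac{1}{7}\Bigl\{5\ln x + 3\ln(\operatorname{Arctan} x) - 4\ln(x^4+x) - 8\ln(\cos x)\Bigr\} \tag{1.78}\\
&= \tfrac{5}{7}\ln x + \tfrac{3}{7}\ln(\operatorname{Arctan} x) - \tfrac{4}{7}\ln(x^4+x) - \tfrac{8}{7}\ln(\cos x) \tag{1.79}
\end{align}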
(Note that, up to this point in the problem, we have not done any calculus! All we have done is simplify the expression so
that when we do start differentiating (the next step), the expression is not as bad.) Step 3 then gives an answer that (taken
one step at a time) is not so hard!
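\begin{align}
\frac{d}{dx}\ln y &= \frac{d}{dx}\Bigl\{\tfrac{5}{7}\ln x + \tfrac{3}{7}\ln(\operatorname{Arctan} x) - \tfrac{4}{7}\ln(x^4+x) - \tfrac{8}{7}\ln(\cos x)\Bigr\} \tag{1.80}\\
\frac{1}{y}\frac{dy}{dx} &= \tfrac{5}{7}\,\frac{1}{x} + \tfrac{3}{7}\,\frac{1}{\operatorname{Arctan} x}\,\frac{1}{1+x^2} - \tfrac{4}{7}\,\frac{1}{x^4+x}\,(4x^3+1) - \tfrac{8}{7}\,\frac{1}{\cos x}\,(-\sin x) \tag{1.81}\\
&= \frac{5}{7x} + \frac{3}{7}\,\frac{1}{(\operatorname{Arctan} x)(1+x^2)} - \frac{4(4x^3+1)}{7(x^4+x)} + \frac{8}{7}\tan x \tag{1.82}
\end{align}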
Step 4 is always easy.
\[ \frac{dy}{dx} = y\left[\frac{5}{7x} + \frac{3}{7}\,\frac{1}{(\operatorname{Arctan} x)(1+x^2)} - \frac{4(4x^3+1)}{7(x^4+x)} + \frac{8}{7}\tan x\right] \]
Quite a mess, but much easier than the ghastly quantities of differentiation rules that you’d need without it.
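With an answer this messy, it is worth having some independent check. Here is a sketch in Python (for illustration; Maple would serve as well). It codes the function and the final formula for dy/dx, and compares the formula with a central difference quotient. The function names and sample points are my own; the points are taken in (0, π/2) so that every factor is positive.

```python
import math

def y(x):
    """The seventh root of x^5 (Arctan x)^3 / ((x^4 + x)^4 cos^8 x)."""
    top = x ** 5 * math.atan(x) ** 3
    bottom = (x ** 4 + x) ** 4 * math.cos(x) ** 8
    return (top / bottom) ** (1 / 7)

def dy_dx_formula(x):
    """Step 4 of the example: y times the bracketed expression."""
    bracket = (5 / (7 * x)
               + 3 / (7 * math.atan(x) * (1 + x ** 2))
               - 4 * (4 * x ** 3 + 1) / (7 * (x ** 4 + x))
               + (8 / 7) * math.tan(x))
    return y(x) * bracket

def numerical_derivative(f, x, h=1e-6):
    """Central difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

for x in (0.5, 0.8, 1.2):
    print(x, dy_dx_formula(x), numerical_derivative(y, x))
```

If a sign had been dropped in step 3 (the − sin x from the cosine is the usual culprit), the two columns would disagree noticeably.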
There are also situations where logarithmic differentiation is more than just helpful; it’s necessary. Sometimes we have
no procedures for doing the problem without logarithmic differentiation. The simplest such case is when a variable (or
function) occurs in the exponent. So, here is one more example. Find the derivative of
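\[ y = (\operatorname{Arcsec} x)^{\sin x} \]
Here are the steps:
\begin{align}
y &= (\operatorname{Arcsec} x)^{\sin x} \tag{1.83}\\
\ln y &= \ln\left[(\operatorname{Arcsec} x)^{\sin x}\right] \tag{1.84}\\
&= \sin x\times\ln(\operatorname{Arcsec} x) \tag{1.85}\\
\frac{1}{y}\frac{dy}{dx} &= \cos x\times\ln(\operatorname{Arcsec} x) + \sin x\times\frac{1}{\operatorname{Arcsec} x}\,\frac{1}{|x|\sqrt{x^2-1}} \tag{1.86}\\
\frac{dy}{dx} &= y\times\left[\cos x\times\ln(\operatorname{Arcsec} x) + \sin x\times\frac{1}{\operatorname{Arcsec} x}\,\frac{1}{|x|\sqrt{x^2-1}}\right] \tag{1.87}
\end{align}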
One caution fits here. Step 4 (multiplying back by y) is easy to forget. (You are so relieved to have finished taking
derivatives!) But if you don’t do it, you aren’t really finished.
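For a smaller illustration of the variable-in-the-exponent situation, here is the same four-step procedure applied to y = x^x (for x > 0). This function is my own choice for the illustration; it is not one of the class examples.

```latex
\begin{align*}
y &= x^x\\
\ln y &= \ln(x^x) = x\ln x
  && \text{(steps 1 and 2)}\\
\frac{1}{y}\frac{dy}{dx} &= 1\times\ln x + x\times\frac{1}{x} = \ln x + 1
  && \text{(step 3, product rule)}\\
\frac{dy}{dx} &= y\,(\ln x + 1) = x^x(\ln x + 1)
  && \text{(step 4, multiply back by $y$)}
\end{align*}
```

Neither the power rule (constant exponent) nor any rule we have for a constant base applies here, which is why the logarithm is needed to bring the exponent down.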
Checking logarithmic differentiation on Maple. Maple is not much help with logarithmic differentiation, unless you
do a lot of the work. Maple’s simplification routines simply don’t allow any systematic logarithmic differentiation to occur.
The closest you can come—and it does help with a lot of the homework questions, but not all—is to define y as the
function, take the natural log of both sides, differentiate with respect to x, and then issue the command simplify();. That
might or might not help you check your work.
Mugsy: In other words, Maple is no help here.
Albert: Close. It isn’t as much help here as it is in other places.
The uses of logarithms.

Logarithms used to be used all the time, since they provided the simplest means of solving many problems. After all,
instead of multiplying out two numbers, you could look up the logarithms and add (which is a whole lot simpler). Tables
of logarithms of numbers and even of trig functions were widely available, and regularly used.
The invasion of calculators made tables of logarithms obsolete very rapidly. That, however, doesn’t mean that logarithmic
functions have lost their usefulness! In order to solve various problems that involve exponents (such as in financial
calculations), logarithms are important. We will see several applications of the exponential function very soon, and
logarithms are needed for all of them as well.
Homework #16
Exercises.
1. Find the derivatives of the following functions. Don’t use logarithmic differentiation for these!
(a) ln(1 + sec x)
(b) ln x × sin(x − 1/x)
(c) ln(x/Arcsec x)
2. Find the derivatives of the following functions. Don’t use logarithmic differentiation for these!
(a) r ln r
(b) |ln(sin x)|
(c) (β ln β)/(β + 3)^2 (β = Greek letter beta)
3. I recommend logarithmic differentiation for both parts of this problem.
(a) Find the derivative of
\[ y = \sqrt[7]{\frac{(\ln x)^{25}\,x^{18}\,(x^8-x)^{41}}{(x^2+1)^{39}\,(\operatorname{Arcsin}(2x))^{29}}} \]
(b) Find the derivative of y = (sin x)^(ln x).

4. I recommend logarithmic differentiation for these two also.
(a) Find the derivative of
\[ y = \sqrt[5]{\frac{(\sin^{21} x)\,x^{12}\,(x^2-2)^{43}}{(2x^3+5)^{13}\,(\operatorname{Arctan}(3x))^{79}}} \]
(b) Find the derivative of y = (sin x)^(cos x).

5. Make up three more exercises to find derivatives. Include at least one logarithmic differentiation. Also, be sure to
include a mix of trig and inverse trig functions with the functions.
6. Show that the derivative of y = x^4 sin x obtained by logarithmic differentiation in the notes agrees with the derivative
you get by the product rule. Do this by simplifying the expression obtained from logarithmic differentiation and
comparing it to what you get by the product rule.
Problems.
1. In this problem, we will give an example of a function whose limit by calculator is way off. Let
\[ f(x) = 2 + 3x^2 + \frac{1}{\pi^4}\ln(|x-\pi|). \]
(a) Construct a table of values of f(x) for x = 3, 3.1, 3.14, 3.141, 3.1415, and 3.1416. Use a calculator or Maple.
(Note: If you use your calculator, use the π key; and if you use Maple, use Pi for π. Maple uses either
log(); or ln(); for natural logarithm. If you want log_10 x in Maple, you have to use log10(x);. In this
problem, we want ln();.)
(b) On the basis of that table, give a guess as to the value of lim_{x→π} f(x). Try to make it accurate to one decimal
place.
(c) What happens when you substitute x = π into the equation, particularly in the log term? (This is essentially the
same thing that would occur with an indeterminate form.)
(d) Evaluate the limit as x → π of f(x) using Maple. Which value (your guess on the basis of the table or Maple’s)
do you think is correct? (Hint: Read the first line of this problem.)
(e) The function f(x) drops below 0 only for x in the moderately small interval smaller than π ± 10^(−1000). Do you
think your calculator would ever give a limit of −∞ for this limit?
(f) What’s the moral of this problem?
Mugsy: Problems have morals? You gotta be kidding.
2. In this problem, we use logarithmic differentiation to derive the product, quotient, and power rules. Once we have
logs, these become fairly easy.
Dudley: No fair. I already know these. Why see them again?
Albert: Because the more you see them in different contexts, the easier it is to work with them.
Mugsy: I’m all for anything that makes these things easier.
(a) Suppose f(x) = f_1(x) × f_2(x). Take the (natural) logarithms of both sides, and simplify the right hand side
using properties of logarithms. Differentiate both sides using the chain rule. Solve for f′(x). Replace f(x) with
f_1(x) × f_2(x), and simplify to get the usual product rule.
(b) Suppose g(x) = g_1(x)/g_2(x). Follow the same procedure as in the first part to derive the quotient rule. (This
will require some algebraic simplification to get it into the usual quotient rule form.)
(c) Suppose h(x) = x^n. Again, follow the directions in the first part to derive the formula for the derivative of x^n.
Note that the formula does not depend on n being a whole number, just a constant.
3. In this problem, we will try to find a coefficient, c, and an exponent, n, so that (c x^n)′ = 1/x = x^(−1).
(a) What is the formula for (c x^n)′? This is what we want to match with x^(−1).
(b) What value of n causes the exponent in the derivative to be −1?
(c) Let n have the value from the previous part. What is the derivative of x^n in this case? Can we find some function
c x^n for that value of n to make the derivative of c x^n equal to x^(−1)?
Exponentials and their derivatives.

The properties of exponents are probably more familiar to you than the properties of logarithms, even though they are
exactly the same properties, only stated differently.
Mugsy: Sure. And barely remembered is more familiar than totally unknown.
Nevertheless, I will also go over the properties of exponentials (identities and graphs) in lab period, if that wasn’t covered
in the first lab period.
There is a common notation that needs to be seen. Often, e^x will be written as exp x, or exp(x). This is the notation that
Maple uses: exp(x); It is shorthand for “the exponential of x.” It is used particularly when the exponent is so bulky that
you can lose sight of the e. Computer languages (such as Basic, Pascal, and C) use the exp(x) notation for e^x, too.
The derivative of e^x is amazingly easy:
\[ \frac{d}{dx}e^x = e^x \tag{1.88} \]
or
\[ \frac{d}{dx}(\exp x) = \exp x \tag{1.89} \]
This formula is the key to understanding the exponential function’s important properties.
Dudley: Let me get this straight. The derivative of e^x is just e^x, right?
Albert: Yup.
Dudley: Then all I have to do is copy the thing down whenever I want to differentiate an exponential of anything?
Albert: Wrong, on two counts. First, you have to remember that what you will be differentiating will be exponentials
and other things, so you need to use the product and quotient rules as appropriate. Also, you have to remember the chain
rule, so it is not just “copy the thing down.” You also have to multiply by the derivative of the inside, just like trying
to differentiate sin(5x), where you get an extra factor of 5.
Dudley: What’s the “inside” of an exponential?
Albert: It’s the thing in the exponent. You’ll see soon.
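To make Albert’s two points concrete, here are three short illustrations. The particular functions are my own choices for this sketch, not the class examples.

```latex
\begin{align*}
\frac{d}{dx}\,e^{5x} &= e^{5x}\times 5 = 5e^{5x}
  && \text{(inside is $5x$)}\\
\frac{d}{dx}\,e^{x^2} &= e^{x^2}\times 2x = 2x\,e^{x^2}
  && \text{(inside is $x^2$)}\\
\frac{d}{dx}\bigl(x^2 e^{3x}\bigr) &= 2x\,e^{3x} + x^2\times 3e^{3x}
  && \text{(product rule, then chain rule)}
\end{align*}
```

In each case the exponential itself is copied down unchanged, and the extra factor is the derivative of whatever sits in the exponent.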
Why use base e exponentials versus other bases. The exponential function is e^x. Other candidates, a^x for a > 0 with
a ≠ 1 and a ≠ e, are also called exponential functions, but are never called the exponential function.
Mugsy: Never?
Albert: Never.
There are two reasons. First, a^x = e^(x ln a), so studying e^(kx) for different values of k gets all the others. Second, the derivative
of e^(kx) is so nice in that form.
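Both reasons can be checked on a calculator or a computer. Here is a sketch in Python (for illustration); the function names are hypothetical. It rebuilds 2^x from e^(kx) with k = ln 2, and compares a difference quotient of e^(kx) with k e^(kx), which is what the chain rule gives.

```python
import math

def power_via_e(a, x):
    """Compute a**x as e**(x * ln a); requires a > 0."""
    return math.exp(x * math.log(a))

def numerical_derivative(f, x, h=1e-6):
    """Central difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

k = math.log(2)  # with this k, e**(k x) is the same function as 2**x
x = 1.5
print(power_via_e(2, x), 2 ** x)
print(numerical_derivative(lambda t: math.exp(k * t), x), k * math.exp(k * x))
```

The first line should print the same number twice. The second shows that differentiating e^(kx) just multiplies it by the constant k.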
Note the differential equation that e^(kx) solves. One differential equation shows up so often, in such a wide range of
settings, that it is quite amazing. The equation is dy/dt = k y, with k a constant. This is one case where understanding what
the equation says (that is, translating it into English) helps to explain why this equation shows up so much.
Dudley: Is this an example of all that talk long ago about being able to interpret an equation?
Albert: YES! You did learn something.
Dudley: You don’t have to sound so surprised . . . .
dy/dt represents the rate at which y is changing. What dy/dt = k y says is that the rate at which y is changing is
proportional to the amount of y present. Any physical situation where that description holds can be modeled mathematically
(the term for this process) by the equation dy/dt = k y, with appropriate understandings of y and k.
The solution of the equation dy/dt = k y is y(t) = C e^(kt), where C can be any constant. (See the homework.) The value
of C represents the value of y at t = 0, so it is often written y_0. That makes the equation look like y(t) = y_0 e^(kt).
The key to the equation is k. If k > 0, then y grows (and k is called the growth constant), and if k < 0, then y decays
(and k is called the decay constant). (We always work in physical situations with y > 0. Then k > 0 makes dy/dt = k y > 0
and k < 0 makes dy/dt = k y < 0.) Additionally, the size of k determines how fast the solution grows or decays. When k is
large and positive, y grows very fast; when k is small and positive, then y grows at a much more controlled rate (but once it
starts taking off, it is virtually unstoppable); when k is small (close to 0) and negative, then y is decaying mildly; and when
k is large (far from 0) and negative, then y is dying rapidly—it goes to 0 so fast that it becomes indistinguishable from it
very rapidly.
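A rough numerical sketch of these claims, assuming an arbitrary starting amount of 3 and four sample values of k (none of these numbers come from a real problem). It prints y at t = 5 and also compares a wiggle-based estimate of dy/dt with k y, which should match if $y = y_0 e^{kt}$ really solves the equation.

```python
import math

def y(t, y0, k):
    """The solution y(t) = y0 * e^(k t)."""
    return y0 * math.exp(k * t)

def rate_of_change(t, y0, k, h=1e-6):
    """Estimate dy/dt from two nearby values of y (a tiny wiggle in t)."""
    return (y(t + h, y0, k) - y(t - h, y0, k)) / (2 * h)

Y0 = 3.0   # arbitrary starting amount
for k in (2.0, 0.1, -0.1, -2.0):
    # Columns: k, y at t = 5, estimated dy/dt at t = 5, and k*y at t = 5.
    print(k, y(5.0, Y0, k), rate_of_change(5.0, Y0, k), k * y(5.0, Y0, k))
```

The last two columns agree closely in every row, and the second column shows the ordering described above: explosive growth, mild growth, mild decay, rapid decay.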
We now look briefly at four different situations where exponentials occur. One of them (population growth) will be
looked at in detail later, and one of them (continuous compounding) will be re-derived later. The other two are just to show
you the variety of situations that these equations explain.
Newton’s law of cooling. In this situation, y represents the difference between the temperature of an object and the
temperature of its surroundings (called the ambient temperature). The equation dy/dt = k y says that an object cools off (or
warms up) at a rate proportional to that difference in temperature. In other words, a very hot object loses heat much faster
than a slightly warm object.
Mugsy: Al, I need another example.
Albert: OK, Mugsy. Here’s one on your terms. After shooting a gun for a while the barrel gets real hot, right?
Mugsy: Sometimes too hot to touch. That’s why it’s called a heater. But it depends on the gun, and how much you
shoot, and the type of bullets, and . . . .
Albert: OK, OK. Suppose the barrel’s at 170 degrees and the air temperature is 70 degrees, so the gun is 100 degrees
warmer than the air. In five minutes, the barrel will still be warm, but not hot, say 110 degrees. That means that in
five minutes, the barrel cooled off by 60 degrees, or 12 degrees per minute average. In the next five minutes, the barrel
will be barely warm, say 80 degrees. So in the next five minutes, the barrel only cools off 30 degrees, or 6 degrees per
minute. The rate of cooling has dropped. The barrel isn’t as hot after five minutes, so it won’t lose heat as fast. If
it kept dropping at 12 degrees per minute, after the second five minute interval, the temperature would have dropped
another 60 degrees, and the gun would be only 50 degrees, even cooler than the air. That isn’t going to happen. Does
this make sense?
Mugsy: More than it used to. I think.
Dudley: You think? That’s news.
Mugsy: Don’t make wise cracks while holding a gun, see? Not healthy.
In this case k < 0, since y > 0 (an object warmer than its surroundings) will cause the temperature to decrease (dy/dt <
0). This equation is only roughly accurate, due to ignoring certain critical items, such as the temperature distribution within
the object. The entire object does not have just one temperature.
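To see what the equation itself predicts for the gun barrel, here is a small sketch that takes only the first two temperatures from the conversation (170 degrees at the start, 110 degrees after five minutes, air at 70 degrees) and assumes Newton's law of cooling holds exactly. The variable and function names are invented for this illustration.

```python
import math

AMBIENT = 70.0       # air temperature, degrees
START = 170.0        # barrel temperature at t = 0
AFTER_FIVE = 110.0   # barrel temperature at t = 5 minutes

y0 = START - AMBIENT          # initial temperature difference: 100
y5 = AFTER_FIVE - AMBIENT     # difference after five minutes: 40
k = math.log(y5 / y0) / 5.0   # solve y5 = y0 * e^(5 k) for k; comes out negative

def barrel_temperature(t):
    """Temperature at time t (minutes): ambient plus the decaying difference."""
    return AMBIENT + y0 * math.exp(k * t)

print(k)
print(barrel_temperature(10.0))
```

Under this model the difference is multiplied by 0.4 every five minutes, so at ten minutes it is 16 degrees and the barrel sits at about 86 degrees. That is a little warmer than the "say 80 degrees" in the conversation, which was only meant as a ballpark figure.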
Radioactive decay and radiometric dating. In this case, y represents the amount of a radioactive substance present,
and k represents the decay rate of the substance. The equation dy/dt = k y says that the more of a radioactive substance you
have, the faster you lose it. (If you have one 10 µg piece of radium, and one 50 µg piece of radium, the 50 µg piece will
emit radiation at 5 times the rate of the 10 µg piece. And the radiation is emitted as the radium is changing into something
else—radon gas, among other things).
We want to look more carefully at the value of k in this case. As before, it must be that k < 0, since the amount of
radioactive substance decreases. (That leads to an interesting question: How can breeder nuclear reactors produce more
fuel than they use?) When dealing with a radioactive substance, one of the critical items to note is the half life: the length
of time it takes for half the radioactive substance to decay. If we write $t_{1/2}$ for the half life, this means that $\frac{1}{2} y_0 = y(t_{1/2})$.
(Read over that equation until it makes sense.) That says that
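\begin{align}
\tfrac{1}{2}\, y_0 &= y_0\, e^{k t_{1/2}} \tag{1.90} \\
\tfrac{1}{2} &= e^{k t_{1/2}} \tag{1.91} \\
\ln\tfrac{1}{2} &= \ln e^{k t_{1/2}} \tag{1.92} \\
\ln(1) - \ln(2) &= k\, t_{1/2} \tag{1.93} \\
-\ln(2) &= k\, t_{1/2} \tag{1.94} \\
k &= \frac{-\ln(2)}{t_{1/2}} \tag{1.95}
\end{align}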
What this last equation says is that when the half life of the substance is large (uranium, for example), the value of k is
negative and very close to 0, and the decay is very slow. On the other hand, if the half life is small (radium, for example),
the value of k is negative and far from 0, so the decay is very fast. It makes sense. Another form of this equation,
\[ t_{1/2} = \frac{-\ln(2)}{k} \]
gives the half-life if you happen (by some weird coincidence) to have the value of k.
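The two forms of the half-life equation can be sketched in a few lines of Python. The half lives used here (10 and 1000 years) are hypothetical round numbers, not values for any particular substance, and the function names are made up.

```python
import math

def decay_constant(half_life):
    """k = -ln(2) / t_half, as in equation (1.95)."""
    return -math.log(2) / half_life

def half_life_from(k):
    """t_half = -ln(2) / k, the rearranged form."""
    return -math.log(2) / k

def fraction_left(t, half_life):
    """Fraction y(t)/y0 remaining after time t."""
    return math.exp(decay_constant(half_life) * t)

for t_half in (10.0, 1000.0):   # hypothetical half lives, in years
    k = decay_constant(t_half)
    # Columns: half life, k, half life recovered from k, fraction left after 20 years.
    print(t_half, k, half_life_from(k), fraction_left(20.0, t_half))
```

The short half life gives a k far from 0 and leaves only a quarter of the substance after 20 years (two half lives); the long half life gives a k very close to 0 and leaves nearly all of it.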
This idea is used with radiometric dating, a source of considerable debate in some quarters.
Dudley: Radiometric dating? Like an intense Saturday evening?
Albert: Not in my quarters.
Mugsy: Oh, spare us, both of you.
The basic idea is this. There are two different types (isotopes) of carbon, one that is radioactive (carbon-14) and one that
isn’t (carbon-12), and they occur in a fixed ratio. Carbon-14 is created by the interaction of cosmic rays with high-altitude nitrogen. This
has been going on for sufficiently long that carbon-14 is being formed at the rate it is disintegrating, giving what is called a
steady state. These isotopes are mixed together thoroughly. All chemical reactions treat them virtually identically. Living
tissue (that is, a plant or animal) will absorb both isotopes of carbon at once through food, and keep that constant ratio of
carbon-12 to carbon-14. However, when the tissue dies, the carbon-12 stays put, while the carbon-14 decays, turning into a
different element. If you have a piece of long-dead, once-living material, you can figure out not only how much carbon is
in it but also the amounts of the different types of carbon that are present as well by measuring the amount of (radioactive)
carbon-14 with a Geiger counter sort of device. If you make the assumption that the ratio of different isotopes of carbon
hasn’t changed significantly since the tissue was alive, you can thereby determine how much carbon-14 was present at
death, and finally, how much of the carbon-14 has disintegrated. Knowing the half-life of carbon-14 then allows you to tell
how old the sample is, that is, how long it has been since the carbon-12 and carbon-14 were in equilibrium amounts.
Mugsy: Clear as mud.
Let’s continue a bit further into this. At equilibrium concentrations of carbon-12 and carbon-14, one gram of carbon
will produce 15.3 radioactive disintegrations per minute. If you had a one-gram sample of material that produced, say, 5.7
disintegrations per minute, you could tell that the amount of carbon-14 present was only 5.7/15.3 = 0.37 of the original
amount. That would mean $0.37 = e^{kt}$, where we want to find t = the age of the object. To solve that for t, we need to know
k. For carbon-14, the value of k is well-known (to those who have to work with it) as −0.0001245. (See the homework.)
Then
\begin{align}
0.37 &= e^{kt} \tag{1.96} \\
\ln(0.37) &= \ln(e^{-0.0001245\,t}) \tag{1.97} \\
&= -0.0001245\,t \tag{1.98} \\
t &= \ln(0.37)/(-0.0001245) \tag{1.99} \\
&\approx 7986 \tag{1.100}
\end{align}
or about 7,986 years old. This is the theory behind the practice of radiometric dating.
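The arithmetic in this example is easy to check by machine. The sketch below uses only the numbers quoted in the text (k = −0.0001245 per year, 15.3 disintegrations per minute per gram at equilibrium, 5.7 measured); the function name is made up.

```python
import math

K_CARBON_14 = -0.0001245   # per year, the value quoted in the text
EQUILIBRIUM_RATE = 15.3    # disintegrations per minute per gram of carbon

def age_in_years(measured_rate):
    """Solve measured/equilibrium = e^(k t) for t."""
    fraction = measured_rate / EQUILIBRIUM_RATE
    return math.log(fraction) / K_CARBON_14

print(age_in_years(5.7))               # using the unrounded ratio 5.7/15.3
print(math.log(0.37) / K_CARBON_14)    # using the ratio rounded to 0.37
```

Evaluating ln(0.37)/(−0.0001245) directly gives a little under 8,000 years. Keeping the unrounded ratio shifts the answer by only a few dozen years, which shows how sensitive the age is to the measured rate.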
Dudley: Al, how realistic is this?
Albert: The chemistry/physics of it is solid. The questions arise when dealing with anything that old. How can we
be sure that there hasn’t been contamination with more recently-alive things (bacteria, etc.)? The arguments can get
nearly violent. Depends on whose pet theory is being skewered.
Population growth. Populations tend to grow exponentially. (This was part of the discovery of Malthus.) The reason
is simple. The more people you have, the more new people who arrive. Or, you can do the same with other creatures, such
as bacteria or rabbits.
Mugsy: To update P. T. Barnum, there’s a sucker born every 30 seconds.
There are limits on growth that are not included in this equation, and we will patch the equation up later when we get to
integration.
The equations say that $y = y_0 e^{kt}$, where now $y_0$ is the initial population, and k tells how fast the population is growing.
Note that in this example, k > 0, since we are talking about growth rather than decay.
Dudley: Al, is this realistic?
Albert: For short periods of time, yes. For longer periods of time, you need to throw in the limitations of needed
resources, such as food and space.
Continuous compounding. In case you think that all of these are beyond daily life, let me mention one more that
we’ll investigate later this semester. If your savings account
Dudley: Savings? This is more than a bit hypothetical.
is listed as giving continuous compounding, the formula for the amount of money in your account (if you leave it alone)
Mugsy: Real hypothetical.
is dy/dt = k y, where k is the interest rate given for the account. The more money you have, the more money you make.
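Here is a hedged preview of why "continuous" compounding leads to an exponential. The sketch compares ordinary compounding n times a year with the formula $y_0 e^{kt}$, using made-up numbers (1000 dollars, 5% interest, 10 years). The re-derivation promised for later in the course is not reproduced here; this only shows the numbers settling down.

```python
import math

def compounded(principal, rate, years, periods_per_year):
    """Balance when interest is added periods_per_year times each year."""
    return principal * (1 + rate / periods_per_year) ** (periods_per_year * years)

def continuous(principal, rate, years):
    """Balance from y = y0 * e^(k t)."""
    return principal * math.exp(rate * years)

# Yearly, monthly, daily, and every-minute compounding.
for n in (1, 12, 365, 365 * 24 * 60):
    print(n, compounded(1000.0, 0.05, 10, n))
print("continuous", continuous(1000.0, 0.05, 10))
```

As the compounding gets more frequent, the balance creeps up toward the continuous value but never passes it.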
Combine these with everything before. We’re getting near the end, but once again, let me mention that all we are doing
is accumulating more and more formulas to plug into, but they are not getting more complicated. All these formulas for
derivatives work precisely the same way. Once you get that down, they all fall into place. Just don’t get overwhelmed. The
examples for this section follow the next topic.
Hyperbolic trig functions.

Along with the exponential functions comes a new set of functions, called the hyperbolic trig functions. (The regular trig
functions are sometimes called circular trig functions when distinctions must be drawn between them.) But the really good
news is that this is the end of the new functions that we will see!
You’ve probably seen the buttons on your calculator. What is sinh x ? Or, what is that HYP button? Wonder no
more! That’s what this section is about.
Dudley: I ignore the buttons on my calculator that I don’t understand.
Mugsy: That’s most of them, huh?
Definitions, graphs, and identities. The notation for the hyperbolic trig functions is exactly like the regular trig functions,
except you add an “h” to the name. The definitions, however, are completely different. They are defined in terms of the
exponential function, $e^x$. Table 1.5 gives them. These are not the formulas for the
derivatives! They are the formulas for defining the functions in terms of exponentials and sinh x and cosh x. The formulas
for the derivatives are given in another table, and you are asked to find some in the homework.
You also need to know how to pronounce these. It is not what you might think. The pronunciations I use (and I think
these are standard) are: sinh is pronounced like “cinch,” cosh has a hard “c” and a short “o” so that it starts the same way
as “cot,” tanh is “tansch” (I know there’s no “s” in tanh, but you pronounce it anyway), coth is “coh-tansch” or “kahth” (I
use “coh-tansch”), sech is “seetch,” and csch is “coh-seetch.”
Dudley: Al. Help?
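Function    Definition                             Identity
sinh x      $\frac{e^x - e^{-x}}{2}$
cosh x      $\frac{e^x + e^{-x}}{2}$
tanh x      $\frac{e^x - e^{-x}}{e^x + e^{-x}}$    $\frac{\sinh x}{\cosh x}$
coth x      $\frac{e^x + e^{-x}}{e^x - e^{-x}}$    $\frac{\cosh x}{\sinh x}$
sech x      $\frac{2}{e^x + e^{-x}}$               $\frac{1}{\cosh x}$
csch x      $\frac{2}{e^x - e^{-x}}$               $\frac{1}{\sinh x}$
Table 1.5: Definitions of the hyperbolic trig functions.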
Albert: Hang in there. These will seem almost familiar in a moment.

Mugsy: Almost?
You very well might ask how in blazes these were concocted. They look more like someone’s bad dream than something
you’d actually want to use.
Mugsy: My sentiments exactly.
It would take too long to explain (it goes heavily into complex numbers), and the topic can come up in later courses. It is
not as far-fetched as it looks.
In fact, the parallels between the circular and hyperbolic trig functions are quite amazing. Table 1.6
contains a list of identities for each. Note how they are precisely the same, except for occasional shifts
in signs.
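A quick numerical spot check of a couple of these identities, built straight from the exponential definitions. This is a sketch rather than a proof, and the sample points are arbitrary.

```python
import math

def sinh(x):
    """Hyperbolic sine from its exponential definition."""
    return (math.exp(x) - math.exp(-x)) / 2

def cosh(x):
    """Hyperbolic cosine from its exponential definition."""
    return (math.exp(x) + math.exp(-x)) / 2

y = 0.5   # arbitrary second angle for the addition formula
for x in (-1.2, 0.3, 2.0):
    print(x,
          cosh(x) ** 2 - sinh(x) ** 2,            # should be very close to 1
          math.cos(x) ** 2 + math.sin(x) ** 2,    # also very close to 1
          sinh(x + y) - (sinh(x) * cosh(y) + cosh(x) * sinh(y)))  # close to 0
```

Note the minus sign in the hyperbolic version of the first identity where the circular version has a plus; that is the kind of sign shift to watch for.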
Derivatives of the hyperbolic trig functions. The parallels continue with the derivatives of the hyperbolic trig functions.
Since all of them can be defined in terms of exponentials, and you know how to differentiate them, I shouldn’t have to tell
you what the derivatives of the hyperbolic functions are. In fact, in the homework, I will ask you to derive them. I will give
you this much of a start: d/dx(sinh x) = cosh x and d/dx(cosh x) = sinh x. Note that the only difference is the “missing”
negative sign in the derivative of cosh x.
All of the hyperbolic trig functions have derivatives that match precisely the derivatives of the circular counterparts,
except for negative signs that appear or disappear. That’s a big hint.
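If you would like some reassurance about the two derivatives given above before deriving the rest, here is a small sketch that estimates derivatives with a tiny wiggle and compares them to the claimed formulas. The test point 0.7 is arbitrary, and the helper name derivative is made up.

```python
import math

def derivative(f, x, h=1e-6):
    """Wiggle-based estimate of f'(x): change in f divided by change in x."""
    return (f(x + h) - f(x - h)) / (2 * h)

x = 0.7   # arbitrary test point
print("sinh:", derivative(math.sinh, x), " cosh x =", math.cosh(x))
print("cosh:", derivative(math.cosh, x), " sinh x =", math.sinh(x))
print("cos: ", derivative(math.cos, x), "-sin x =", -math.sin(x))
```

The last line is the circular counterpart, included to show where the "missing" negative sign would have been.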
Dudley: Are we ever going to get a list of these derivatives?
Albert: On the extra sheet that will accompany the tests, these derivatives will be listed. Of course you save your
tests, don’t you?
Mugsy: Yeah. I bronze mine.
Inverses of the hyperbolic trig functions and their derivatives. As with the circular trig functions, the hyperbolic trig
functions have inverses. But since the hyperbolic functions are defined in terms of exponentials, the inverses can be defined
in terms of the inverse of the exponential, the natural logarithm. They are listed in Table 1.7.
It isn’t clear why anyone would want to use these. I wondered myself, until recently when I had to work with
Arccsch, and discovered that it was a whole lot easier to work with than the messy combination of square roots and logs
that make up Arccsch.
The derivatives of the inverse hyperbolic trig functions are given in Table 1.8.
Note that these are identical to the derivatives of the inverse circular trig functions, except for changes of sign. That’s handy
to keep in mind.

Circular                                                              Hyperbolic
$\cos^2 x + \sin^2 x = 1$                                             $\cosh^2 x - \sinh^2 x = 1$
$1 + \tan^2 x = \sec^2 x$                                             $1 - \tanh^2 x = \operatorname{sech}^2 x$
$\cot^2 x + 1 = \csc^2 x$                                             $\coth^2 x - 1 = \operatorname{csch}^2 x$
$\sin(x \pm y) = \sin x \cos y \pm \cos x \sin y$                     $\sinh(x \pm y) = \sinh x \cosh y \pm \cosh x \sinh y$
$\cos(x \pm y) = \cos x \cos y \mp \sin x \sin y$                     $\cosh(x \pm y) = \cosh x \cosh y \pm \sinh x \sinh y$
$\tan(x \pm y) = \frac{\tan x \pm \tan y}{1 \mp \tan x \tan y}$       $\tanh(x \pm y) = \frac{\tanh x \pm \tanh y}{1 \pm \tanh x \tanh y}$
$\sin^2(x/2) = \frac{1}{2}(1 - \cos x)$                               $\sinh^2(x/2) = \frac{1}{2}(\cosh x - 1)$
$\cos^2(x/2) = \frac{1}{2}(1 + \cos x)$                               $\cosh^2(x/2) = \frac{1}{2}(\cosh x + 1)$
$\tan(x/2) = \frac{\sin x}{1 + \cos x} = \frac{1 - \cos x}{\sin x}$   $\tanh(x/2) = \frac{\sinh x}{1 + \cosh x} = -\frac{1 - \cosh x}{\sinh x}$
$\sin(2x) = 2 \sin x \cos x$                                          $\sinh(2x) = 2 \sinh x \cosh x$
Table 1.6: Circular and hyperbolic trig identities.

Function                    Definition
Arcsinh x                   $\ln(x + \sqrt{x^2 + 1})$
Arccosh x                   $\ln(x + \sqrt{x^2 - 1})$
Arctanh x                   $\frac{1}{2} \ln\frac{1 + x}{1 - x}$ for |x| < 1
Arccoth x = Arctanh(1/x)    $\frac{1}{2} \ln\frac{x + 1}{x - 1}$ for |x| > 1
Arcsech x = Arccosh(1/x)    $\ln\frac{1 + \sqrt{1 - x^2}}{x}$
Arccsch x = Arcsinh(1/x)    $\ln\left(\frac{1}{x} + \frac{\sqrt{1 + x^2}}{|x|}\right)$
Table 1.7: Inverse hyperbolic trig functions in terms of logarithms.

Function     Derivative
Arcsinh x    $\frac{1}{\sqrt{1 + x^2}}$
Arccosh x    $\frac{1}{\sqrt{x^2 - 1}}$
Arctanh x    $\frac{1}{1 - x^2}$
Arccoth x    $\frac{1}{1 - x^2}$
Arcsech x    $\frac{-1}{x \sqrt{1 - x^2}}$
Arccsch x    $\frac{-1}{|x| \sqrt{x^2 + 1}}$
Table 1.8: Derivatives of inverse hyperbolic trig functions.
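The logarithm formulas for the inverse hyperbolic functions can be spot-checked against the versions built into Python's math module (asinh, acosh, atanh). The lowercase function names below are made up for this sketch, and the test points are arbitrary values inside each function's domain.

```python
import math

def arcsinh(x):
    return math.log(x + math.sqrt(x * x + 1))

def arccosh(x):
    return math.log(x + math.sqrt(x * x - 1))      # needs x >= 1

def arctanh(x):
    return 0.5 * math.log((1 + x) / (1 - x))       # needs |x| < 1

def arccsch(x):
    return math.log(1 / x + math.sqrt(1 + x * x) / abs(x))   # needs x != 0

print(arcsinh(0.8), math.asinh(0.8))
print(arccosh(2.5), math.acosh(2.5))
print(arctanh(0.3), math.atanh(0.3))
print(arccsch(-2.0), math.asinh(1 / -2.0))   # Arccsch x = Arcsinh(1/x)
```

Each pair of numbers should agree up to rounding. The last line also shows why the |x| is needed in the Arccsch formula: without it, a negative x would put a negative number inside the logarithm.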
Of course, we will need to go over some derivatives in class that include exponentials, hyperbolic trig functions, and
inverse hyperbolic trig functions. I won’t ask for the derivatives of any of the hyperbolic trig functions until we have the
correct formulas worked out, except that I have already given you that the derivative of sinh x is cosh x and the derivative of
cosh x is sinh x (with no minus sign). Here are the examples that will be worked in class.
$e^{\cosh x}$
$\sinh(4x + 1)$
$\operatorname{Arccoth}(6 + e^x)$
$\tanh(e^{x^2})$
$\exp x \times \ln(5 \sin^2(x))$
$\frac{\sinh x \cosh x}{\sin x \cos x}$
The uses of exponentials and hyperbolic trig functions.

The usefulness of exponentials in a wide variety of situations has already been shown. But why look at hyperbolic trig
functions, or, even worse, their inverses?
It turns out that the combinations of exponentials that form the hyperbolic trig functions do occasionally occur. One
major example is the catenary, the graph of y = cosh x. The name catenary comes from the Latin word for chain. The
reason is that the shape of a flexible chain that is freely suspended (or evenly loaded along its length) is a catenary. It might look like a
parabola, and it is quite close to one, but it is different.
How would that be of any use? If you want to design a suspension bridge, where the spans are being held up by cables
(look at the Golden Gate bridge), you need to know the shape the cables will take. A cable hanging under its own weight alone
forms a catenary; once it carries a roadway whose weight is spread evenly along the horizontal, the shape shifts toward a parabola.
Another famous catenary is the Saint Louis arch, even if it is upside down. By making the arch in the form of a catenary,
the builders were able to guarantee that the gravitational forces were exactly in line with the shape of the arch. That would
keep it from bulging out or in.
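The earlier remark that a catenary is close to a parabola, but different, can be made concrete. The sketch below compares cosh x with the parabola 1 + x²/2, which I picked because it matches cosh x in height and curvature at the bottom (x = 0). That choice of parabola is an assumption for the comparison, not something from the bridge or arch designs.

```python
import math

def catenary(x):
    return math.cosh(x)

def parabola(x):
    """Parabola that agrees with cosh x in height and curvature at x = 0."""
    return 1 + x * x / 2

# Columns: x, catenary height, parabola height, and the gap between them.
for x in (0.0, 0.5, 1.0, 2.0, 3.0):
    print(x, catenary(x), parabola(x), catenary(x) - parabola(x))
```

Near the bottom the two curves are nearly indistinguishable, but the catenary climbs away from the parabola as you move outward.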
The explanation of why we need the inverse hyperbolic trig functions has to be delayed, just like the one for the inverse circular trig
functions. They will both appear together, since they solve the same sort of problem.
Homework #17
Exercises.

1. (a) $\sinh(e^x)$
   (b) $\ln(\operatorname{Arctanh} x)$
   (c) $\frac{x \sec x}{\cosh(x^2)}$
2. (a) $\cosh(\sin(e^x))$
   (b) $t^4 \sinh t$
   (c) $\frac{e^{2x} \operatorname{Arccosh}(3x)}{\sin(\ln x)}$
3. Use the definitions of the hyperbolic trig functions to simplify sinh(ln x), tanh(ln x), and sech(ln x). (Remember that
$e^{\ln x} = x$. So what would $e^{-\ln x}$ be? Hint: It is not −x.)
4. Differentiate the logarithmic formula for Arctanh x, and show that the derivative formula is correct. (Be prepared for
considerable amounts of algebra.)
5. Make up three more derivatives and work them. Be sure to include exercises that have polynomials, absolute values,
trig functions, inverse trig functions, and logarithms in addition to exponentials. Don’t forget products and quotients,
either. Be glad that we are now done with the new functions that we’ll encounter!
Problems.
1. Use logarithmic differentiation to get the derivative of $y = a^x$, for a > 0. Note that a is a constant, so ln a is a constant
(so it has derivative = 0). Here, x is the variable.
2. Plug $y = C e^{kt}$ into both dy/dt and k y, and show that the results are equal. (This shows that $y = C e^{kt}$ is a solution of
the differential equation dy/dt = k y.)
3. When we looked at Newton’s law of cooling, we decided that k < 0 when y > 0, because dy/dt < 0 then and we want
the equation dy/dt = k y to work. Now we want to find the sign of k when an object is warming up. Find the sign
of dy/dt in the case that y < 0 (so the object is cooler than the ambient temperature). Will k be positive or negative
when y < 0?
4. Get the derivatives of the remaining four hyperbolic trig functions, and put them into the forms that make them look
like the derivatives of the circular counterparts. You will need some of the hyperbolic trig identities to do that. But it
can be done exactly the same way that was done with the circular trig functions. Use the definitions of the hyperbolic
trig functions in terms of sinh x and cosh x.
Investigations.
1. With decay, we found that the equation relating half-life and the decay constant k was $k\,t_{1/2} = -\ln(2)$. Since k < 0,
this gave half-life a positive value. Can you think of a good interpretation for $t_2$ in the equation $k\,t_2 = \ln(2)$ when
k > 0 (exponential growth)?
2. A function f (x) is called even if f (−x) = f (x), and it is called odd if f (−x) = − f (x).
(a) Show that $x^n$ is even when n is even and odd when n is odd. Do this by plugging −x in for x in $x^n$ and treating
n even and n odd separately. (This is the source of the terminology.)
(b) Show that cosh x is even and sinh x is odd. This can be done by plugging −x into the exponential definitions
for sinh x and cosh x and simplifying. (The same pattern is true for sin x and cos x.)
(c) Show that $e^x = \cosh x + \sinh x$. Do this by again going back to the exponential definitions of sinh x and cosh x.
(d) Let f (x) be any function. Show that $g(x) = \frac{1}{2}(f(x) + f(-x))$ is an even function and $h(x) = \frac{1}{2}(f(x) - f(-x))$
is an odd function. Also show f (x) = g(x) + h(x). (Hint: Look at the previous parts of this investigation.) This
shows that any function can be written as the sum of an even function and an odd function. Also, cosh x and
sinh x are nothing more than the even and odd functions derived from $e^x$. This explains, in a small way, why
they exist.
(e) Show that an odd function f (x) must have f (0) = 0. (Hint: Show f (−0) = − f (0).)
1.4 Differentials.
The notation for derivatives due to Leibniz is dy/dx. This is not the quotient of two “things,” dy divided by dx, by what we
have done before: $dy/dx = f'(x)$ shows that very clearly. But engineers have discovered that treating dy and dx as separate
quantities leads to correct results. In keeping with the spirit of the course, I will do what the engineers do, and look at
differentials (as the dy and dx are called).
Mugsy: So, what if you’re a biologist?
Albert: The same ideas apply there. For example, population growth from before will be re-examined in this section.
1.4.1 What is a differential?

Mathematicians have something called a differential, but to understand it requires graduate-level mathematical training. We
won’t bother with that here.
Essentially, the way differentials are used “in the field” is that they are very tiny wiggles. And ratios of differentials
are wiggle magnification factors which in turn are derivatives. (See, that approach wasn’t as far-out as you might have
thought.)
Mugsy: You still have a way to go before I’m convinced of that.
The formula for differentials is amazingly simple. For y = f (x), the formula is designed to make $dy/dx = f'(x)$ correct:
If $y = f(x)$, then $dy = f'(x)\,dx$
Division by dx gives back the derivative formula.

Example: If $y = x + x^2$, then dy/dx = 1 + 2 x, so dy = (1 + 2 x) dx. That’s the general pattern. Take the derivative and
multiply through by the differential of the independent variable.
The formulas for differentials in more general situations are precisely the same as the formulas for derivatives, except
that you don’t divide by the differential of the independent variable. (This is important! It means you don’t have to tell
what the independent variable is. I’ll come back to that momentarily.) If u and v are functions (and it doesn’t matter of
what), then we have the formulas:
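\begin{align}
d(u \pm v) &= (du) \pm (dv) \tag{1.101} \\
d(uv) &= (du)\,v + u\,(dv) \tag{1.102} \\
d(u/v) &= \frac{v\,(du) - u\,(dv)}{v^2} \tag{1.103} \\
d(u^n) &= n\,u^{n-1}\,du \quad \text{for $n$ a constant} \tag{1.104} \\
d(\sin u) &= \cos u\,(du) \tag{1.105} \\
&\text{etc.} \tag{1.106}
\end{align}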
The list could continue through all of the functions that we worked with.
Dudley: Al, is this as easy as it looks?
Albert: In fact, it is conceptually simpler than derivatives, but being new, it is not as familiar.
Mugsy: Does that mean “yes?”
Not specifying the independent variable is useful. Suppose we have the following, not uncommon, situation. We
have the position of an object given by x(t) and y(t). That is, you know an object’s x- and y-coordinates as a function
of time. (This is called parametric equations. The main variables are both expressed in terms of a third, independent
variable. We will look at them closely next semester.) Then we can find dx and dy by differentiating: dx = (dx/dt) dt and
dy = (dy/dt) dt. Note that the dt’s have to be tacked onto the end of these derivatives, since t is the variable. Then dy/dx
can be calculated by
\[ \frac{dy}{dx} = \frac{(dy/dt)\,dt}{(dx/dt)\,dt} = \frac{dy/dt}{dx/dt} \]
Do you recognize that? Suppose we rewrite it as
\[ \frac{dy}{dt} = \frac{dy}{dx} \times \frac{dx}{dt} \]
It’s the chain rule! Yup, the chain rule appears here, and actually the reason that differentials work at all is that the chain
rule works. (I bet you were wondering why I omitted the most important rule in calculus from the list of rules I gave
earlier for differentials. It’s because the chain rule is built into any operation with differentials. It doesn’t appear, since it is
the assumption of the notation.)
Mugsy: Need I say that I hadn’t noticed?
Albert: No.
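Here is a small sketch of the parametric formula dy/dx = (dy/dt)/(dx/dt). I chose the unit circle, x = cos t and y = sin t, as the example because the slope of its tangent line is known to be −x/y, which gives something to compare against. The function names and sample times are made up for illustration.

```python
import math

def x_of(t):
    return math.cos(t)

def y_of(t):
    return math.sin(t)

def derivative(f, t, h=1e-6):
    """Wiggle-based estimate of df/dt."""
    return (f(t + h) - f(t - h)) / (2 * h)

def slope(t):
    """dy/dx computed as (dy/dt) / (dx/dt); needs dx/dt != 0."""
    return derivative(y_of, t) / derivative(x_of, t)

for t in (0.5, 1.0, 2.0):
    # On the unit circle the slope of the tangent line is -x/y.
    print(t, slope(t), -x_of(t) / y_of(t))
```

The two columns agree, even though y was never written as a function of x anywhere in the calculation.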
1.4.2 The difference between a differential dx and a wiggle ∆x.

A differential satisfies dy = (dy/dx) dx, where a wiggle satisfies ∆y ≈ (dy/dx) ∆x. The difference is the “=” versus the
“≈.” Since the approximation in the wiggle magnification formula becomes better as ∆x shrinks, the best way to think of
differentials is as submicroscopic wiggles. They are so small that the wiggle approximation formula becomes (for practical
purposes, the way engineers operate) exact. I will continue to refer to differentials as wiggles.
It is only a small exaggeration to say that most of the course will be an explanation of how to apply differentials to
real-world problems. So, let me look at how this works with formulas and graphs. It will give further insight into derivatives.
When you have a formula (and I will stick with polynomials because we can work with them algebraically much more easily),
you can find the differential by differentiation, as done earlier, but there is another way. It is longer, and is never used, but
it will further emphasize the central idea behind derivatives. I’ll show you by example. I will assume that dx is in fact a
really tiny ∆x, and calculate ∆y for a randomly chosen polynomial.
\begin{align}
f(x) &= 2x^3 - 4x^2 + 8x - 5 \tag{1.107} \\
f(x + dx) &= 2(x + dx)^3 - 4(x + dx)^2 + 8(x + dx) - 5 \tag{1.108} \\
&= (2x^3 + 6x^2\,(dx) + 6x\,(dx)^2 + 2\,(dx)^3) \tag{1.109} \\
&\quad - 4x^2 - 8x\,(dx) - 4\,(dx)^2 + 8x + 8\,(dx) - 5 \tag{1.110} \\
\Delta y = f(x + dx) - f(x) &= 6x^2\,(dx) + 6x\,(dx)^2 + 2\,(dx)^3 - 8x\,(dx) - 4\,(dx)^2 + 8\,(dx) \tag{1.111} \\
&= (6x^2 - 8x + 8)\,(dx) + (6x - 4)\,(dx)^2 + (2)\,(dx)^3 \tag{1.112}
\end{align}
Compare that to the differential formula for dy:
\[ dy = f'(x)\,dx = (6x^2 - 8x + 8)\,(dx) \]
What’s the difference? When working out ∆y, you had to keep higher powers of dx. The differential formula simply
ignores them. Therefore, using differentials allows you the luxury of ignoring powers of dx higher than 1. Simply throw
them away! And what you get by the process is the derivative.
Mugsy: Hey! I like that! You just throw the bums out. What a deal.
Think about this for a while and you will see that this is precisely what we did earlier by factoring out the ∆x from the
expression for ∆y, dividing by ∆x, and then setting ∆x = 0 in what was left. The division by the ∆x causes terms with just a
single ∆x in them to lose the factor of ∆x entirely. The terms with higher powers of ∆x lose one, but not all, of them. Setting
∆x = 0 causes them to go away at that stage. By throwing away powers of dx higher than the first, you do that earlier on in
the calculation. It also makes the computations easier by cutting down on the number of terms you have to keep around.
Mugsy: I’m liking these differentials more and more.
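To see how little is lost by throwing the higher powers away, here is a sketch that computes both ∆y and dy for the same polynomial. The sample point x = 1 and the three wiggle sizes are arbitrary choices of mine.

```python
def f(x):
    return 2 * x**3 - 4 * x**2 + 8 * x - 5

def f_prime(x):
    return 6 * x**2 - 8 * x + 8

def delta_y(x, dx):
    """Exact change in y when x changes by dx."""
    return f(x + dx) - f(x)

def dy(x, dx):
    """Differential: keep only the first power of dx."""
    return f_prime(x) * dx

x = 1.0   # arbitrary sample point
for dx in (0.1, 0.01, 0.001):
    # Columns: dx, exact change, differential, and what was thrown away.
    print(dx, delta_y(x, dx), dy(x, dx), delta_y(x, dx) - dy(x, dx))
```

At x = 1 with dx = 0.01, the exact change is 0.060202 and the differential is 0.06. The leftover 0.000202 is exactly the $(6x - 4)(dx)^2 + 2(dx)^3$ part, and it shrinks much faster than dx itself as dx gets smaller.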
Graphically, you can think of differentials as being tiny wiggles in the variables, but the graph has been so magnified to
see them that the curve has actually been replaced by its tangent line, not just approximately, but exactly. The ratio dy/dx
is then the slope of the line—the tangent line of the curve, exactly as we had before by the limit process. The higher powers
of dx that we ignored in the differential are essentially the “bend” in the curve (like a parabola, which needs a quadratic in
it).
1.4.3 Examples of how differentials are used.

We will spend most of the course answering this, but we have already worked examples that can be revisited from this
viewpoint. They were the examples that I set up in the exponential function section. I will go over all of them again as far
as setting up the equations.
Before that, note that what I am trying to do here is to teach you the “language” of mathematical formulas—to show
you how to “read” equations to understand what they are saying. This ability is invaluable!
Newton’s law of cooling said that dy/dt = k y, where y = difference in temperature between object and surroundings,
t = time, and k = constant measuring how fast the object cools. Written in differential form, this says that dy = k y dt. What
does this mean? It says that the amount that an object cools (dy) is (=) proportional to (k) the difference in temperature (y)
times the length of time that it sits (dt). You could easily argue against this interpretation of the equation. After all, doesn’t
the cooling rate change with the temperature? How can you say that dy only depends on a single temperature difference,
just one value of y? But what saves it is that this equation is true only with differentials. So, dt is a very tiny slice of time.
And because it is so small, the only change in the temperature difference will be equivalently tiny (dy). For the next dt
change, a new value of y is used. As t changes, y changes, too. And because dt is so small, the temperature doesn’t change
during the dt interval, and you can use the single value y.
Essentially, when we are working with differential-sized quantities, the intervals are so short that we can treat everything
that looks variable as though it were a constant, except for other differentials.
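The idea of using one value of y for each tiny dt, then updating y for the next dt, can be imitated on a computer by taking many small (but not truly differential) time steps. The sketch below does that for dy = k y dt with made-up numbers: a starting difference of 100, k = −0.2, and 5 units of time. It then compares the result with the exact solution $y_0 e^{kt}$.

```python
import math

def step_through(y0, k, total_time, steps):
    """Apply dy = k * y * dt repeatedly, treating y as constant within each step."""
    dt = total_time / steps
    y = y0
    for _ in range(steps):
        y += k * y * dt   # use the current y for this slice of time
    return y

exact = 100.0 * math.exp(-0.2 * 5.0)
for steps in (5, 50, 500, 5000):
    print(steps, step_through(100.0, -0.2, 5.0, steps), exact)
```

With only 5 big steps the answer is noticeably off, because y really does change during each step. As the steps shrink toward differential size, the stepped answer closes in on the exact one.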
This last point is sufficiently important that I want to emphasize it again. Whenever you have a differential on one
side of an equation, the other side must also include a differential (or be 0). The reason is submicroscopic wiggles can’t
be magnified enough to make them “regular-sized” numbers. This is usually stated by saying that differentials must be
balanced by differentials, called balancing differentials.
The next example was radioactive decay. In that case, y = amount of a radioactive substance. (k remains a proportionality
constant and t remains as time.) The equation dy = k y dt would end up meaning that the amount of substance that
decays in a very short time period (dy) is (=) proportional to (k) the amount of substance (y) times the length of time (dt).
Again, this works only because the length of time is a differential. Otherwise the amount of substance (y) in the equation would
be changing as the radioactive substance decayed.
The next example was population growth. In that case, y = population, and again, k and t represent a constant and time,
respectively. The equation dy = k y dt now is interpreted as follows: The increase in population (dy) is (=) proportional
to (k) the current population (y) times the length of time you wait (dt). Note again that dt must be small enough that the
increase in population doesn’t increase the population on its own, or else the equation would fail.
The final example was continuously compounded interest. In that case, y = amount of money in the account, with k =
interest rate, and t = time again. Then dy = k y dt is nothing more than the simple interest formula, interest (dy, the increase
in the money in the account) equals (=) principal (y) times rate (k) times time (dt). When the time period is as short as a
differential, the amount of money in the account doesn’t have a chance to compound, so it acts like simple (uncompounded)
interest. As t changes, though, so does y, and for the next dt amount, y has increased slightly, and the compounding takes
effect that way.
Although it is probably still a bit unclear why this approach is useful, believe me that it is. The analysis of complex situations
simplifies enormously when you are allowed to use time intervals so small that all variables can be treated as constants. That’s
what we can do with differentials.
Again, I will do some examples of finding differentials in class. Here they are. Find the differentials of the following
functions.
y = x sin x
$x = \cos^2 t$
$w = \frac{e^Q}{1 - Q^3}$
The uses of differentials.

Engineers use differentials all the time! That’s the reason I put this section in the notes. They love to derive equations
for quantities by letting something change by a differential amount, holding most everything else constant, and finding the
differential changes that result in something else. You end up with equations called (oddly enough) differential equations,
but that’s a completely separate course.
Homework #18
Exercises.
1. Find the differentials for the following functions.

(a) $y = 5x^3 \sin^3 x$
(b) $w = \frac{x^5 \ln x}{\tan^3 x}$
2. Find the differentials for the following functions.

(a) $y = 5x^2 - 2x + 3$
(b) $z = \frac{e^x \ln x}{x^2}$
3. The differential of a function is determined by the independent variable in the function. For example, $d(z^2) = 2z\,dz$.
Find the differentials of the following functions.
(a) u cos u
tan z − z
(b) 3
3z
(c) t − 3t

4. Find the differentials of the following functions.

(a) |7t − 8 |
(b) u Arcsin u
(c) $\dfrac{2\cos z - z}{1 + z^2}$
5. Find the differential of x = r cos θ , where r and θ are both variables. (Don’t get flustered; just use the formula for
products! We will deal with this more systematically when we deal with multi-variable functions later this semester.)
6. Make up three more functions on your own and find their differentials.
Problem.
1. Doesn’t squaring a number increase its size? In that case, how can we ignore $(dx)^2$ or higher powers of dx? This
problem answers those questions.
(a) Find the squares of 10, 5, 1, 0.5, 0.1, 0.05, 0.01, 0.005, 0.001. Do all of them get bigger? Which ones get bigger
and which get smaller?
(b) Suppose we look at the example from the notes:
\[ y = 2x^3 - 4x^2 + 8x - 5 \]
with x = 2, and with $\Delta x = 10^{-6}$, which is one-millionth, not too small, but close to 0. What is the exact (keep
all the decimal places here; you’ll need 18 by the time you’re done) value for
\[ \Delta y = 6x^2(\Delta x) + 6x(\Delta x)^2 + 2(\Delta x)^3 - 8x(\Delta x) - 4(\Delta x)^2 + 8(\Delta x) \]
(c) Suppose we take

\[ dy = (6x^2 - 8x + 8)\,dx \]
with x = 2 and with $dx = 10^{-6}$. What is dy in that case? Is the approximation obtained by ignoring the extra
higher order terms good (in your estimation)? That is, is the approximate change in y, the value of dy, close to the
exact change in y, the $\Delta y$?
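If you would rather let Maple carry the 18 decimal places, something along these lines should do it. This is only a sketch: the names f, h, Deltay and dyval are my own choices, and keeping h as an exact fraction is one way (not the only way) to avoid round-off.

```maple
Digits := 30:                        # more than enough digits for 18 decimal places
f := x -> 2*x^3 - 4*x^2 + 8*x - 5:   # the function from the notes
h := 1/1000000:                      # Delta x (and dx), kept as an exact fraction
Deltay := f(2 + h) - f(2);           # the exact change in y
dyval := D(f)(2) * h;                # the differential: f'(2) times dx
evalf(Deltay); evalf(dyval);         # decimal forms, to compare side by side
```

Look at where the two decimals first disagree. That tells you how much the ignored higher-order terms were worth.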
1.5 Parametric equations.

I have already mentioned parametric equations (back in differentials). Now we look at them a little more carefully. Later
on, we will look at them in detail.
Albert: Maybe you will get the idea that this is an important concept.
Parametric equations can describe more complicated situations than regular functions can. They are also very common in
physics.
1.5.1 Definitions; green boxes with multiple outputs.

Parametric equations never come singly. There are always multiple dependent variables, all of which depend on a single
independent variable. In the green box analogy, these are multiple-output functions, one output spout for each dependent
variable. (Actually, there can be multiple independent variables as well, which we don’t deal with in this chapter. You will
have parametric equations, though, when several output variables depend on the same input variable(s).)
Usually, these kinds of equations occur when several different quantities (such as the x- and y-coordinates of an object) all
depend on another quantity (such as t = time). The independent variable is also called the parameter. The most common
interpretation is to use t = time as the independent variable. It isn’t the only possibility, though.
Some books will use x = f (t), y = g(t) for this situation. I consider that wasteful. If I want to say that x and y are
functions of t, I write simply x = x(t) and y = y(t); I don’t want to have to remember what letter I have assigned to x or y
(“Was it f or g?”), and it looks cleaner. Remember, then, x = x(t) simply says that x is the dependent variable and depends
on the independent variable t.
1.5.2 Conversion to/from implicit/explicit functions.

If y is given as a formula in x, that is y = y(x) (for example, $y = x^4 - x$), then we say that y is an explicit function of x.
That’s the type of functions that we have been working with so far. The dependent variable is by itself on the left side of
the equation, and the independent variable occurs on the right hand side by itself. (Any other combination is not an explicit
equation.)
But, if x and y are given by a formula with x and y scrambled together (for example, $x^4 - y^4 = xy$), then it is not so
obvious what is going on. For a value of x, we can sometimes find a value of y (for example, if $x^4 - y^4 = xy$ and x = 0, then
y = 0), but it is not obvious for other values of x what the value of y is (such as if x = 1 in $x^4 - y^4 = xy$). When x and y are
scrambled together this way, the terminology is to call y an implicit function of x. Given a value for x, you’d still have to
solve for y. The terminology is a bit misleading, in that the equation might not represent a function in the usual sense. For
example, with $x^2 + y^2 = 1$, the graph is a circle, which obviously can’t represent a green-box function. It is still customary
(though slightly improper) to say $x^2 + y^2 = 1$ defines y as an implicit function of x.
Mugsy: I thought mathematicians would at least be accurate.
Albert: Unfortunately, they are common humans.
Aside from explicit and implicit functions, there are functions defined parametrically, that is from parametric equations.
Again, the terminology is a bit improper, in that the y will not always be a regular function of x, although each of x and
y will be regular functions of t, so there is some consistency. For example, $x = \cos t$ and $y = \sin t$ defines a set of parametric
equations equivalent to $x^2 + y^2 = 1$ (since $\sin^2 t + \cos^2 t = 1$), and so is not an acceptable function in the usual sense.
All that terminology being settled, it becomes of use to note when one form of an equation can be converted to another.
Briefly, it works this way: An explicit equation can convert to anything else, but that’s all you can guarantee. You can’t
always convert from parametric to explicit or from implicit to explicit or from parametric to implicit or from implicit to
parametric.
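To make the one guaranteed direction concrete, here is the explicit example from above written all three ways. (The particular rearrangements are just one convenient choice; there are others.)

```latex
\begin{align*}
\text{explicit:}   \quad & y = x^4 - x \\
\text{implicit:}   \quad & x^4 - x - y = 0 \\
\text{parametric:} \quad & x = t, \qquad y = t^4 - t
\end{align*}
```

Going the other way is where the trouble starts: try solving $x^4 - y^4 = xy$ for y and you will see why nothing is guaranteed.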
1.5.3 Derivatives of parametric equations.

Later on this semester, we will deal with finding the derivatives of implicit functions (it turns out to be quite simple). What
we want to look at now is the derivative of parametric equations.
Even though you can’t always (or even usually) solve for y as a function of x, you can still find dy/dx. (The same is true
for implicit equations.) What can we expect? As you might (or might not) guess, the derivative formula that we’ll get finds
dy/dx as a function of the parameter (independent) variable. That is, to find the slope of the tangent line at some point, the
point will have to be specified by giving a value of the parameter.
Dudley: Al, can you say that in English?
Albert: The idea is this. If you have x = x(t) and y = y(t), then you can locate a point of the curve by giving a value
of t, right?
Dudley: Sure. Just plug the value of t into x = x(t) and y = y(t) and you get the x- and y-coordinates of the point.
Albert: Well, when you find the derivative dy/dx, you find that it also is a function of t. Once you know what value
of t you intend to look at you can plug in and get the coordinates of the point, and also get the slope, dy/dx. That’s
all that he means.

Mugsy: How come when you explain it, it makes more sense?
Formula for parametric equation derivatives.

We saw the formula already (in differentials):
\[ \frac{dy}{dx} = \frac{dy/dt}{dx/dt} \tag{1.113} \]
It looks as though the dt’s are canceling (looking at them as differentials), and even though that is a slightly strained
interpretation, it is a very handy way to remember the formula, and can even be legitimized with enough effort. Note:
Half the battle in mathematics is to come up with a notation that helps you. The Leibniz notation (differentials) does that
(Newton’s $f'(x)$ does not). The other half is to interpret the notation correctly. I’m giving you the notation, and trying to
show you the interpretation.
There will be an example in class. It is quite straightforward. It is to find the derivative dy/dx for the parametric
equations $x = e^t$, $y = \ln t$.
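In case you miss class, here is a sketch of how that example works out. The choice of t = 1 as the point to evaluate at is mine, purely for illustration.

```latex
\begin{align*}
\frac{dx}{dt} &= e^t, \qquad \frac{dy}{dt} = \frac{1}{t} \\
\frac{dy}{dx} &= \frac{dy/dt}{dx/dt} = \frac{1/t}{e^t} = \frac{1}{t\,e^t} \\
\text{at } t = 1: \quad & (x, y) = (e, 0), \qquad \frac{dy}{dx} = \frac{1}{e}
\end{align*}
```

The slope comes out as a function of t, not of x, which is exactly what Albert was describing.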
Viewing as coordinated rates of wiggles.

What does dy/dx mean now that y is not a function of x explicitly? For derivatives, you wiggle the independent variable,
and watch the dependent variable wiggle; the ratio of those wiggles gives the derivative. But now, t is the independent
variable, so you can’t just wiggle x and see how y wiggles. We have to wiggle t, and watch the relative sizes of the x- and
y-wiggles. The ratios of those give dy/dx. The green box diagram in class will help explain this.
For x = x(t), dx = (dx/dt) dt and for y = y(t), dy = (dy/dt) dt. That gives the relative sizes of the x- and y-wiggles.
Then
\begin{align}
dy/dx &= (dy)/(dx) \tag{1.114} \\
&= \frac{(dy/dt)\,dt}{(dx/dt)\,dt} \tag{1.115} \\
&= \frac{(dy/dt)}{(dx/dt)} \tag{1.116}
\end{align}
This gives the formula we had earlier.

There is another way to view this formula. If we look at x and y as functions of t, then dx/dt is the rate at which x
is changing, the x-velocity, and dy/dt is the y-velocity. How does that fit here? If we look at the equations as describing
the positions of an object on the plane, then the velocities are just that. If you think about it a second, though, the x-
velocity and y-velocity must be interrelated. If the object is to follow the curve and the x-coordinate is changing at a certain
rate, the rate of change of the y-coordinate has no flexibility. The equation relating these is (of course) the chain rule:
(dy/dt) = (dy/dx)(dx/dt). Alternately, if you know the values of the x- and y-velocities, then the object must be following
a curve with slope (dy/dx) = (dy/dt)/(dx/dt), also the chain rule. The rates of change of x and y must be coordinated to
fit the curve. That’s what this formula indicates.
Mugsy: Al?
Albert: OK. Here it is again. Suppose you take a motion picture of a tennis ball after it has been hit.
Mugsy: Couldn’t we use a bullet? How about a knife?
Albert: Forget it. This is my explanation; I decide this. Consider two consecutive frames of the movie. They give
the locations of the tennis ball at two times separated by a small time interval, dt. Both x and y will have changed,
by dx and dy. The ratio of those two wiggles gives dy/dx, the slope of the path the tennis ball is following. Also,
dx = (dx/dt) dt and dy = (dy/dt) dt, so they have to be related to each other.
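Albert’s two movie frames can be imitated numerically. The following is only a sketch: the curve (x = t^2, y = t^3), the time t0 = 2, and the frame spacing dt = 0.001 are all made-up choices, and the names X, Y, wx, wy are mine.

```maple
X := t -> t^2:   Y := t -> t^3:    # a sample curve (assumed, not from the notes)
t0 := 2.0:   dt := 0.001:          # the time of the first frame and the gap to the next
wx := X(t0 + dt) - X(t0):          # the x-wiggle between the two frames
wy := Y(t0 + dt) - Y(t0):          # the y-wiggle between the two frames
wy / wx;                           # ratio of the wiggles
D(Y)(t0) / D(X)(t0);               # the formula (dy/dt)/(dx/dt) at the same time
```

Both results should come out close to 3, and the first should creep closer to the second as you make dt smaller.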
Viewing as way to change variable of differentiation.

As indicated before, one way to look at the chain rule is as a way to switch the variable of differentiation. That’s precisely
what is going on in parametric equations. We have y = y(t), so the natural derivative is dy/dt, where the independent and
dependent variables are all that show up. But if we wanted to treat y = y(x), which is what happens when we try to measure
slope, then the correct derivative is dy/dx, with a different independent variable. The chain rule gives the procedure:
\[ \frac{dy}{dx} = \frac{dy/dt}{dx/dt} \]
We need to look at this a bit more carefully now, for use momentarily.
If we pull down the y’s, we get
\[ \frac{d}{dx}(y) = \frac{1}{dx/dt} \times \frac{d}{dt}(y) \]
We can even get rid of the y’s, and get this:
\[ \frac{d}{dx} = \frac{1}{dx/dt} \times \frac{d}{dt} \]
This tells us how a derivative with respect to x relates to a derivative with respect to t.
Let me throw in a bit of notation that fits here that would be confusing in any other setting.
Mugsy: I’m not optimistic here.
Remember that I said that physicists tend to use a prime to mean a derivative with respect to x, and a dot over a variable to
indicate a derivative with respect to time. In that notation, the last equation becomes:
\[ (\;)' = \frac{1}{dx/dt} \times \dot{(\;)} = \frac{\dot{(\;)}}{dx/dt} \]
This is one reason that only physicists use this awkward notation. It’s handy, but you have to be careful. Fortunately, you
shouldn’t have to worry about this unless you are planning on taking a bunch of physics.
Mugsy: Hmm. Better than I feared.
There is an extension of this change-of-variables rule to multiple independent variables, but that will have to wait until
we can deal with functions that have multiple-variable inputs (partial derivatives).
Mugsy: I can hardly wait.
The uses of parametric equations.

As I indicated, physics and engineering use parametric equations constantly, mostly to describe the motion of particles in
a system. The chain rule is needed to help describe that motion.
Homework #19
Exercises.
1. Find dy/dx for the following parametric equations.

(a) $x = a\cos t$, $y = b\sin t$ (a and b are constants)
(b) $x = 2t^4 - 5t^3 + 2t$, $y = 3t^7 + 8t^3 - 5t$
2. Find dy/dx for the following parametric equations.

(a) $x = R\cosh t$, $y = R\sinh t$ (R is a constant)
(b) $x = 2t^5 + 3t^3 - 7$, $y = 5t^9 - 8t^5 + 2t^2$
Problems.
1. Parametric equations have graphs that are more general than the graphs of regular functions. To convince you of this,
show how to convert y = f (x) into parametric equations. [Hint: Just let t = x. What would the equation for y be, in
terms of t? Yes, it is very easy.]
2. When we were doing differentials and parametric equations, we were using very general (unspecified) functions.
In this problem, I want us to work through a specific example, using Maple if you want. (I’ll give you the Maple
commands as an incentive to use it.) Let’s start with $x = t^2 - 3t$, and $y = t^3 + 4t$. Let’s find the slope of the tangent
line at (“when” might be a better term than “at,” since t is viewed as time) t = 2. (It is possible in this case to find y
as an explicit function of x. We will do that to check our answer, but only at the end.) On Maple, this becomes
> eq1 := t^2 - 3*t:
> eq2 := t^3 + 4*t:
Be sure to change the colons at the end to semicolons so that you can see what the values are. This same comment
holds throughout the rest of this problem.
(a) Where is the object at t = 2? (That is, what are the coordinates, (x, y), of the point when t = 2?) To do this on
Maple, use
> subs( t=2, eq1 ):
> subs( t=2, eq2 ):
(b) Next we find dx and dy as functions of t and dt. (Note that dt really needs to be considered as a separate,
independent variable. The size of the wiggle is not set by any other variable.) Maple does not use differentials
(actually, it does, but in a sufficiently more sophisticated way that it is not useful for us).
> dx := diff(eq1, t) * dt:
> dy := diff(eq2, t) * dt:
Now what are dx and dy as functions of t and dt?

(c) What are dx and dy at t = 2, as functions of dt? To get this on Maple, use
> subs( t=2, dx ):
> subs( t=2, dy ):
Again, don’t forget the dt!

(d) What is dy/dx at t = 2? This is the slope of the tangent line to the curve at t = 2. To get this on Maple, use
(immediately after the previous two commands)
> % / %%:
Note that the dt’s cancel, so you don’t need them here.
(e) To check this answer, we need to solve the parametric equations. From $x = t^2 - 3t$, we get that
$t = \frac{3}{2} \pm \frac{1}{2}\sqrt{9 + 4x}$. You can verify this on Maple by using
> solve( x=eq1, t ):
You get two values listed, and you want the value for t that is greater than 3/2. Maple will list things in different
orders, for no clear reason. Pick the one that gives $t = \frac{3}{2} + \frac{1}{2}\sqrt{9 + 4x}$. Plug that into $y = t^3 + 4t$. What formula
do you get for y as an explicit function of x? You can do this on Maple by using (immediately after the previous
result)
> simplify( subs( t=%[1], eq2) ):
(Go back and use simplify(subs(t=%[2], eq2)); if the expression you end up with has any minus signs.
You picked the wrong one!) Note that the %[1] picks out the first of the two expressions resulting from the
solve. That’s the one we want (positive square root) this time. (Different runs of Maple might make the second
one have the positive square root. In that case, use t=%[2].) The simplify just makes the answer neater.
(f) Differentiate the function for y in the previous part, to get dy/dx as a function of x. Do this on Maple by using
(immediately after the previous result)
> simplify( diff(%, x) ):
(g) The value of x when t = 2 was determined in the first part of the problem. Plug that value of x into the formula
for dy/dx from the previous part and get the slope of the tangent line at the point where t = 2. On Maple, this
is done by using (immediately after the previous result)
> simplify( subs( x=subs(t=2,eq1), % ) ):
Compare to the answer using the parametric equations formula. Which way (by parametric equations formula
or by solving for y explicitly) would you rather do this, even if you have Maple around?
1.6 Higher-order derivatives.

In going from position to velocity, you take a derivative of s or x with respect to t. To go from velocity to acceleration,
you take another derivative with respect to t. Taking a derivative twice (or more in other situations) occurs often enough to
generate a set of notations and terminology.
The terminology is that you get the first derivative when you have differentiated once. Differentiate again, and you get
the second derivative; differentiate again and you get the third derivative. You can continue to get the fourth, fifth, sixth,
etc. derivatives. All derivatives after the first are called higher-order derivatives. The generic higher-order derivative is
called the nth derivative.
The notations for higher-order derivatives are as varied as for first derivatives. Here is a table of the different varieties:

First                 Second                    Third                     Fourth                    nth

$\frac{dy}{dx}$       $\frac{d^2y}{dx^2}$       $\frac{d^3y}{dx^3}$       $\frac{d^4y}{dx^4}$       $\frac{d^ny}{dx^n}$
$\frac{d}{dx}(y)$     $\frac{d^2}{dx^2}(y)$     $\frac{d^3}{dx^3}(y)$     $\frac{d^4}{dx^4}(y)$     $\frac{d^n}{dx^n}(y)$
$y'$                  $y''$                     $y'''$                    $y^{(iv)}$                $y^{(n)}$
$Dy$                  $D^2y$                    $D^3y$                    $D^4y$                    $D^ny$

For f(x) notation, all the same things are used.

You might wonder why the 2’s and 3’s are placed that way in the first line of the table.
Mugsy: Actually, this one I did notice.
Dudley: Wow. Write that down.
Mugsy: Careful, grasshopper.
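The second line explains it. The reason for $d^2y/dx^2$ is that it is obtained this way:
\[ \frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{d}{dx}\left(\frac{d}{dx}(y)\right) = \left(\frac{d}{dx}\right)^2 (y) = \frac{d^2}{dx^2}(y) = \frac{d^2y}{dx^2} \]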
All of those are, by the way, ways of writing the second derivative. The third and higher derivatives follow in parallel with
the second derivative.
1.6.1 Physical interpretation.

Acceleration is the derivative of velocity, which is the rate at which velocity is changing. This makes sense, since accel-
eration of zero means velocity isn’t changing at all, while a positive acceleration means the velocity is increasing, and a
negative acceleration means velocity is decreasing. We want to look at this more carefully, though.
The independent variable is t; rate of change.

For most physical problems involving higher derivatives, the independent variable is t, and a derivative with respect to t is
called a rate of change. That is, v = ṡ and a = v̇.
Acceleration.
Acceleration is a = v̇ = s̈, in the style of notation given earlier, where the last of these is read “s double dot.” The comment
about it being difficult to distinguish between dots and random blotches applies even more here. I prefer to use $a = dv/dt =
d^2s/dt^2$, that is, Leibniz’s notation.
Second derivatives, then, tell you how fast the velocity (or rate of change under consideration) is changing. For example,
the fact that the world’s population is growing is not, by itself, cause for much concern. The rate of change of population
is positive. But the rate of change is itself increasing, and that could become serious. The population of the world is
accelerating.
You rarely encounter derivatives higher than the second. (We will hit a total of one in the applications that we do!
Second derivatives, on the other hand, abound.) The reason that second derivatives are typically as far as you need to go
is that Newton’s law says that F = m a = m s̈, and that only involves second derivatives. Third derivatives can occur in
quantum mechanics, but are rare even there. This is good because dots tend to blur, fade out, or run together with more
than two.
1.6.2 Geometric interpretation.

Not only are there physical interpretations for second (mainly) derivatives, there are geometric (graphical) ones also.
The independent variable is now x; notations.

When dealing geometrically, the variable is usually x and not t. Geometric problems are often statics (no change with t, so
the derivative with respect to t would just be 0), while the situations with derivatives with respect to t are usually dynamics.
We can move back to the $f'(x)$ notation for derivatives here: $y' = dy/dx$. Also, keep in mind that $Q' > 0$ means that Q,
whatever it is, is increasing, and $Q' < 0$ means that Q is decreasing.
Concavity.
Just as we had that $y' > 0$ meant that the function is increasing and the graph is rising (going up to the right), and $y' < 0$
meant that the function is decreasing and the graph is falling (going down to the right), we can look at the sign of the second
derivative. When $y'' > 0$, that means that $(y')' > 0$, or that $y'$ is increasing. That means that the slope of the tangent line
is increasing. A few diagrams will convince you that this means that the tangent line is turning counterclockwise, and the
curve is above its tangent line. This situation is called concave up, or bending up.
When $y'' < 0$, that means that $(y')' < 0$, or that $y'$ is decreasing. That means that the slope of the tangent line is
decreasing. A few diagrams will convince you that this means that the tangent line is turning clockwise and that the curve
is below its tangent line. This situation is called concave down, or bending down.
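A small worked example may stand in for some of those diagrams. I am picking $y = x^3$ and the point x = 1 purely for illustration:

```latex
\begin{align*}
y = x^3, \qquad y' &= 3x^2, \qquad y'' = 6x \\
y'' > 0 \text{ for } x > 0 &\quad \text{(concave up)}, \qquad
y'' < 0 \text{ for } x < 0 \quad \text{(concave down)} \\
\text{tangent line at } x = 1: \quad & y = 3x - 2 \\
\text{curve minus tangent:} \quad & x^3 - (3x - 2) = (x - 1)^2 (x + 2) \ge 0 \quad \text{for } x \ge -2
\end{align*}
```

So near x = 1, where the second derivative is positive, the curve really does sit on or above its tangent line.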
One common, and horrid, terminology for concave up and down is “holding water” and “spilling water.” This is hopelessly
deceptive (as a homework problem shows). Nevertheless, many calculus books have propagated this terminology.
Mention curvature (bending of beams, etc.).

There are instances of derivatives beyond the second occurring in engineering. One simple example is when getting the
equations for a bending, loaded beam. In that case, you use fourth-order derivatives. The equation is $(c(x)\,u'')'' = f(x)$,
where f (x) is the load on the beam and c(x) represents the stiffness of the beam (usually a constant). You’ll end up looking
at that in applied math (if you take it). The point is that high-order derivatives really do occur.
There is one problem with second derivatives measuring the bend: It is not a uniform way of doing it. Let me explain.
A curve that has a constant “bend” is a circle. A curve that has a constant second derivative is a parabola, which is definitely
not a circle. There is a solution to this, called the curvature and written κ (Greek letter kappa), which truly measures the
bend of a curve. It contains the second derivative, of course, but other terms to adjust it. We will talk about it next semester
(in the chapter on amusement park rides).
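You can check the circle-versus-parabola claim directly. The sketch below uses the parabola $y = x^2/2$ and the top half of the unit circle; both are my choices of example.

```latex
\begin{align*}
\text{parabola:} \quad & y = \tfrac{1}{2}x^2, \qquad y'' = 1 \quad \text{(constant)} \\
\text{semicircle:} \quad & y = (1 - x^2)^{1/2}, \qquad y' = -x\,(1 - x^2)^{-1/2}, \qquad y'' = -(1 - x^2)^{-3/2} \\
& y''(0) = -1, \qquad y''(0.8) = -\frac{1}{0.216} \approx -4.63
\end{align*}
```

The circle bends the same amount everywhere, yet its second derivative changes from point to point. That mismatch is what the curvature is designed to repair.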
1.6.3 Calculations.
Of course, you are not content simply to know what second (and higher) derivatives mean; you have to know how to find
them. Fortunately, it is easy. We already know.
How to do it (simple).
Finding a second or third derivative requires nothing more than successive differentiations. Keep taking derivatives until
you have taken the right number of them.
Dudley: That’s all?
Albert: Yes.
This sounds easy, and it is for polynomials and a few other simple functions. For some functions (even ones as “easy”
as tan x), this rapidly becomes a nightmare. But Maple is happy to do this for you. (I’ll show how to do this on Maple in a
moment.)
The real problem is with compositions, which require the chain rule. The first derivative causes a product (the derivative
of the outside times the derivative of the inside), and after that you have the product rule as well as the chain rule. It gets
messy fast.
Again, and it gets harder as you proceed, “all” you need to do is not panic, but apply the rules for derivatives to
successively smaller pieces until you’ve conquered it.
Algebraic simplifications (combining terms).

There is a lot of wisdom in combining terms when you start taking higher derivatives. Each factor you can combine
eliminates that much work. Even Maple can grind for a long time if it is not asked to simplify as it is going along taking
derivatives.
Mugsy: Now, that’s scary.
Conversion of quotients to products highly recommended here!

Even if you are using Maple, and especially if you are not, you want to convert quotients into products:
\[ \frac{p(x)}{q(x)} = p(x)\,(q(x))^{-1} \]
just as we did before. This puts you into dealing with chain rules, but for the second derivative and beyond, you will find
cancellations that make life vastly simpler in this form. Higher derivatives often produce multiple terms that are identical
(except perhaps for the coefficient). This permits simplifying the product rule for higher derivatives by combining like
terms. Do that!
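Here is a small sketch of the kind of cancellation I mean, using $y = x/(1+x)$ (my choice of example) written as a product:

```latex
\begin{align*}
y &= x\,(1 + x)^{-1} \\
y' &= (1 + x)^{-1} - x\,(1 + x)^{-2}
    = (1 + x)^{-2}\,\bigl[(1 + x) - x\bigr]
    = (1 + x)^{-2} \\
y'' &= -2\,(1 + x)^{-3}
\end{align*}
```

Factoring out the lowest power after the first derivative collapses two terms into one, so the second derivative takes a single step.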

You can take two derivatives of y with respect to x by either of these commands:
> diff(y,x,x):
> diff(y,x$2):
They will give you exactly the same results.
Essentially, all the positions after the function (first position) in diff(); represent variables to differentiate the function
with respect to. The x$2 in the second one is nothing more than a shorthand for x,x.
Higher-order derivatives operate in an analogous way. The third derivative in Maple is written either as diff(y,x,x,x);
or diff(y,x$3);. I expect you can figure out what to do with even higher derivatives.
Dudley: Even Mugsy.
Mugsy: Dudley, come here. You need to learn a lesson.
Albert: I’m calling 911 now . . . .
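For something concrete to try, here is a short sketch using the commands just described. The function $x e^{2x}$ is a made-up sample, not one of the class examples.

```maple
y := x*exp(2*x):            # a sample function (assumed, for illustration only)
diff(y, x, x);              # second derivative, long form
diff(y, x$2);               # the same thing using the $ shorthand
simplify( diff(y, x$3) );   # third derivative, tidied up by simplify
```

The first two commands should print identical results. By hand, the second derivative works out to $4e^{2x} + 4x e^{2x}$, though Maple may arrange the terms differently.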
Examples.
There will be numerous examples given in class, done both by hand and by Maple. Here they are. Find the specified
derivatives of these functions.
The second derivative of $x^6$
The third derivative of $x^5 - 2x^4 + 4x^3 - 8x^2 + 9 - 18$
The second derivative of $\sin(x^2)$
The sixth derivative of $x^3$
The third derivative of $\tan x$
The five-thousandth derivative of $e^x$
1.6.4 Parametric equations and second (and higher-order) derivatives.

The difficulty with finding higher-order derivatives with parametric equations is that you want to find
\[ \frac{d^2y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right) \]
but dy/dx as worked out from parametric equations is a function of t and not a function of x. Therefore the next derivative
being with respect to x is awkward. But (and you guessed it) the chain rule (the most important rule in calculus) comes
to the rescue. We need to change the variable of differentiation. We can find d/dt(dy/dx), since dy/dx will be given as a
function of t. To convert to a d/dx(dy/dx), we need something earlier, namely
\[ \frac{d}{dx} = \frac{1}{dx/dt}\,\frac{d}{dt} \]
where we “apply” both sides to dy/dx. The left hand side is then the second derivative (which we want), and the right hand
side is something that we can figure out.
Let me do an example. Suppose
\[ x = t\sin t, \qquad y = t\cos t \]
Then we find dy/dx by dy/dx = (dy/dt)/(dx/dt), which in this case gives
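\begin{align}
\frac{dy}{dx} &= \frac{dy/dt}{dx/dt} \tag{1.117} \\
&= \frac{1 \times \cos t + t \times (-\sin t)}{1 \times \sin t + t \times \cos t} \tag{1.118} \\
&= \frac{\cos t - t\sin t}{\sin t + t\cos t} \tag{1.119}
\end{align}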
This is the first derivative. The second derivative is obtained by using the formula given earlier. You get
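\begin{align}
\frac{d^2y}{dx^2} &= \frac{d}{dx}\left(\frac{dy}{dx}\right) \tag{1.120} \\
&= \frac{d}{dx}\left(\frac{\cos t - t\sin t}{\sin t + t\cos t}\right) \tag{1.121} \\
&= \frac{1}{dx/dt}\,\frac{d}{dt}\left(\frac{\cos t - t\sin t}{\sin t + t\cos t}\right) \tag{1.122} \\
&= \frac{1}{\sin t + t\cos t}\,\frac{d}{dt}\left(\frac{\cos t - t\sin t}{\sin t + t\cos t}\right) \tag{1.123} \\
&= \frac{1}{\sin t + t\cos t} \times \tag{1.124} \\
&\qquad \frac{(\sin t + t\cos t)(-\sin t - (\sin t + t\cos t)) - (\cos t - t\sin t)(\cos t + (\cos t + t(-\sin t)))}{(\sin t + t\cos t)^2} \tag{1.125} \\
&\;\;\vdots \quad \text{Thank heavens for Maple\ldots} \tag{1.126} \\
&= \frac{-(t^2 + 2)}{(\sin t + t\cos t)^3} \tag{1.127}
\end{align}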
A few comments on this. First note that you don’t just differentiate dy/dx with respect to t and stop. You have to divide
by the dx/dt also in order to compensate for the fact that you are differentiating with respect to t (which is easy, due to the
variable in dy/dx) and not with respect to x (which is the derivative you want for $d^2y/dx^2$).
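Since I thanked Maple for the messy middle of that calculation, here is a sketch of how you might have it check the whole thing. The names xt, yt, dydx and d2ydx2 are my own; nothing about them is special.

```maple
xt := t*sin(t):   yt := t*cos(t):                   # the example curve from the notes
dydx := diff(yt, t) / diff(xt, t):                  # first derivative, as a function of t
d2ydx2 := simplify( diff(dydx, t) / diff(xt, t) );  # differentiate in t, then divide by dx/dt
```

The result should be equivalent to equation (1.127), although Maple may write it in a slightly different arrangement.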
It becomes obvious that higher-order derivatives get messy fast. The third derivative would be calculated by
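\begin{align}
\frac{d^3y}{dx^3} &= \frac{d}{dx}\left(\frac{d^2y}{dx^2}\right) \tag{1.128} \\
&= \frac{1}{dx/dt}\,\frac{d}{dt}\left(\frac{d^2y}{dx^2}\right) \tag{1.129}
\end{align}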
And on it goes for ever-higher derivatives.

There is no correspondingly nice way to do higher-order derivatives with logarithmic differentiation.
Mugsy: Break my heart.
The uses of higher-order derivatives.

For the most part, third order and higher derivatives are not used very much. I only know of one formula in physics that
uses a third derivative, and there will only be one formula in this course that uses one (and that is next semester!).
On the other hand, second derivatives are used all the time. The concept of acceleration is fundamental to physics. But
beyond that, it is important to know how a rate of change is itself changing. For example, as I noted earlier, population is
growing at an unsettling rate. But it is far worse that the rate of population growth is rising, and that is a second derivative.
Homework #20
Exercises.
1. Find $d^2y/dx^2$ for the following functions.

(a) $y = x^3 \ln x$
(b) $y = \dfrac{e^x}{x}$
(c) $y = \mathrm{Arctan}(\sin x)$
2. Find $d^2y/dx^2$ for the following functions.

(a) $y = x^4 \sec x$
(b) $y = \dfrac{e^x}{\sin x}$
(c) $y = \mathrm{Arcsin}(\ln x)$
3. Find $d^2y/dx^2$ for $x = R\cos t$, $y = R\sin t$. Here, t is the parameter, and R is a constant.
4. Find $d^2y/dx^2$ for $x = a\cos t$, $y = b\sin t$. Here, t is the parameter, and a and b are constants.
5. Find $d^3y/dx^3$ for $y = x^3 e^x$. Be careful!

6. Make up three of your own higher-order derivatives to work. You should include one that goes up to third-order
derivatives, and one that is a parametric equation with a second order derivative.
Problems.
1. In this problem, we attack the idea that concave up is the same as “holds water.”
(a) Draw a reasonably accurate graph of y = 3 x + sin x for 0 ≤ x ≤ 4π. This is most easily accomplished using
Maple; the alternative is to plot lots of points. Also, to get the angles to look right, you will have to add a
command scaling=constrained to the plot command.
(b) Calculate $y''$. Determine where it is positive and negative for 0 ≤ x ≤ 4π. (That is, give the x’s where $y'' \ge 0$
and the other x’s where $y'' \le 0$.)
(c) Can you see any reason from the graph to say that any place on the curve would “hold water?” What would
happen if you had something of that shape and poured water on it? Would it “hold water” anywhere?
2. In this problem, we investigate higher-order derivatives of polynomials.
(a) How many derivatives do you need to take of a linear function, $f(x) = ax + b$, before you get 0?
(b) How many derivatives do you need to take of a quadratic function, $f(x) = ax^2 + bx + c$, before you get 0?
(c) How many derivatives do you need to take of a cubic function, $f(x) = ax^3 + bx^2 + cx + d$, before you get 0?
(d) Look for a pattern in the preceding three parts, and make a guess about how many derivatives you need to take
of a general nth-degree polynomial before you get 0.
(e) What happens if you take more than the minimum number of derivatives? For example, if four derivatives of a
function give 0, what is the result of seven derivatives of that function?
3. For this problem, use $y = e^{-x}\cos(x\sqrt{7})$. Maple will be a big help.
(a) Find dy/dx and $d^2y/dx^2$. This is not a pretty second derivative.
(b) Plug these and the original function y into the left side of the differential equation, $d^2y/dx^2 + 2\,dy/dx + 8y$, and show that it
reduces to 0. (The terminology is that $y = e^{-x}\cos(x\sqrt{7})$ is a solution of the differential equation
$d^2y/dx^2 + 2\,dy/dx + 8y = 0$.)

1. There are three interpretations of the derivative:
(a) The formula for the slopes of all tangent lines
(b) The rates at which a function is changing
(c) Wiggle magnification factors
2. The wiggle magnification formula is
\[ \Delta y \approx \frac{dy}{dx}\,\Delta x \]
3. The formula for lines to use in calculus is $y - y_0 = m(x - x_0)$, where $(x_0, y_0)$ is the point the line goes through and m
is the slope of the line. When finding tangent lines, remember that m needs to be a number, so you have to plug the
point of tangency into the derivative before substituting in for m.
4. When you want to evaluate most limits, do the following steps in order:
(a) Plug in the limiting value for the variable. If you get 0/0, proceed to the next step. Otherwise you are done. If
you got a regular number, that’s the answer to the limit. If not, the limit does not exist.
(b) Factor the top and bottom of the fraction and reduce common factors. (If the limit is as x approaches a, there
will be factors of x − a in both the top and bottom. That’s the one you really need to get rid of.) Go back to the
first step.
(c) The exception to this procedure involves limits that you can’t easily factor. In that case, some ingenuity needs
to be applied. But the best solution is to wait for L’Hôpital’s rule, in the next chapter.
5. The basic rules for derivatives and the derivatives of the basic formulas are:
General patterns:
$(f(x) \pm g(x))' = f'(x) \pm g'(x)$
$(f(x) \cdot g(x))' = f'(x) \cdot g(x) + f(x) \cdot g'(x)$
$(c\,f(x))' = c\,f'(x)$
$\left(\dfrac{f(x)}{g(x)}\right)' = \dfrac{g(x) \cdot f'(x) - f(x) \cdot g'(x)}{g(x)^2}$
$(f(g(x)))' = f'(g(x)) \cdot g'(x)$
Specific functions:
Function             Derivative                      Function              Derivative
$x^n$                $n x^{n-1}$                     $e^x$                 $e^x$
$|x|$                $|x|/x$                         $\ln x$               $1/x$
$\sin x$             $\cos x$                        $\sinh x$             $\cosh x$
$\cos x$             $-\sin x$                       $\cosh x$             $\sinh x$
$\tan x$             $\sec^2 x$                      $\tanh x$             $\mathrm{sech}^2 x$
$\cot x$             $-\csc^2 x$                     $\coth x$             $-\mathrm{csch}^2 x$
$\sec x$             $\tan x \sec x$                 $\mathrm{sech}\, x$   $-\tanh x\, \mathrm{sech}\, x$
$\csc x$             $-\cot x \csc x$                $\mathrm{csch}\, x$   $-\coth x\, \mathrm{csch}\, x$
$\mathrm{Arcsin}\, x$   $1/\sqrt{1 - x^2}$           $\mathrm{Arcsinh}\, x$   $1/\sqrt{1 + x^2}$
$\mathrm{Arctan}\, x$   $1/(1 + x^2)$                $\mathrm{Arccosh}\, x$   $1/\sqrt{x^2 - 1}$
$\mathrm{Arcsec}\, x$   $1/(|x|\sqrt{x^2 - 1})$      $\mathrm{Arctanh}\, x$   $1/(1 - x^2)$
These will be given to you on tests, in exactly this form.
6. The chain rule is the most important formula in calculus.

7. The basic formula for a differential $dy$ in a function $y = f(x)$ is $dy = f'(x)\,dx$. Differentials are most conveniently viewed as “submicroscopic wiggles.”
8. Parametric equations occur when both $x$ and $y$ are written in terms of a third variable, usually $t$. In that case, the formula for the slope of a tangent line is $\dfrac{dy}{dx} = \dfrac{dy/dt}{dx/dt}$. Note that it appears that the $dt$'s are being canceled. This is the check that you have set up the chain rule correctly.
9. Higher-order derivatives:
(a) If the first derivative is the rate of change, the second derivative measures acceleration or how fast that rate of
change is changing. If the first derivative is slope of a tangent line, the second derivative represents concavity
(up or down).
(b) Finding higher-order derivatives involves nothing more than repeated application of the formulas for derivatives.
Suggestions for making the algebra easier:
• It is often easier to deal with quotients by converting them to products of terms with exponents.
• Whenever you can, simplify expressions before taking more derivatives. This involves combining terms,
and is especially useful when you have products of terms with exponents. In that case, factor out of the
whole expression all common factors raised to the lowest powers that occur.
• Higher-order derivatives of parametric equations always involve the chain rule, where you will have to
divide by dx/dt each time.
10. Maple commands that are relevant to this chapter:
• diffquo(function, x-coordinate, delta_x); is a procedure that I wrote; it calculates difference quotients of polynomial functions.
• limit(function, variable=value); finds the limit of the function as the variable approaches the value.
• diff(function, variable); finds the derivative of a function.
• diff(function, variable$n); finds the nth derivative of the function.
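For readers without Maple at hand, the difference quotient and the wiggle magnification formula from this summary can be sketched numerically. The Python below is only an illustration: the functions `diffquo` and `wiggle_estimate` are hypothetical stand-ins written for this sketch (they are not the Maple procedure named above), and the sample polynomial and numbers are chosen just to show the pattern.

```python
def diffquo(f, x, delta_x):
    """Difference quotient (f(x + delta_x) - f(x)) / delta_x (hypothetical helper)."""
    return (f(x + delta_x) - f(x)) / delta_x


def wiggle_estimate(f_at_x0, slope_at_x0, delta_x):
    """Wiggle magnification: f(x0 + delta_x) is roughly f(x0) + f'(x0) * delta_x."""
    return f_at_x0 + slope_at_x0 * delta_x


def f(x):
    # Sample polynomial; by the power rule its derivative is 2x - 2.
    return x**2 - 2*x + 1


# As delta_x shrinks, the difference quotient at x = -2 approaches f'(-2) = -6.
quotients = [diffquo(f, -2.0, h) for h in (0.1, 0.01, 0.001)]

# Estimate f(6.3) for some function with f(6) = 10 and f'(6) = -1.
estimate = wiggle_estimate(10.0, -1.0, 0.3)

print(quotients)
print(estimate)
```

The shrinking values of `delta_x` play the role of the limit in the definition of the derivative; the estimate is just the tangent-line value a short distance from the known point.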
1.8 Tests from previous years

At the end of chapters after which tests are scheduled, I will put copies of the tests from the previous 10 years. These are
intended for practice, and to give you an idea of what type of questions get asked on exams. There will probably also be
extra credit available for doing these.
Test #1, Fall 2004
I. (20 points; 10 points each) Given $f(x) = x^2 - 2x + 1$.
(a) Find $\dfrac{df}{dx}$ using the definition $f'(x) = \lim_{\Delta x \to 0} \dfrac{f(x + \Delta x) - f(x)}{\Delta x}$. (b) Find the equation for the tangent line to the curve at $x = -2$.
II. (10 points; 5 points each) Find the following limits.
(a) $\lim_{x \to 4} \dfrac{x^2 - x - 12}{x^2 - 2x - 8}$ (b) $\lim_{x \to 5} \dfrac{x^2 - 2x - 15}{x^2 - 3x - 5}$
III. (40 points; 10 points each) Find the following derivatives.
(a) $\dfrac{d^2}{dx^2}\left[\dfrac{(x + 1)^2}{x - 1}\right]$ (b) $\dfrac{d}{dx}\left[\ln \ln \ln(\cos^2 x)\right]$ (c) $\dfrac{d^2y}{dx^2}$ for $x = \tan(1 + t)$ and $y = \sin(t + 1)$. (d) $\dfrac{d}{dx}\left[(x^3 + 2x - 3)^{\sin x}\right]$
IV. (10 points) Use the Wiggle Magnification Formula to estimate $f(6.3)$ if $f(6) = 10$ and $f'(6) = -1$.
V. (10 points) Derive the three-factor product rule by logarithmic differentiation as follows. Suppose $f(x) = f_1(x) \times f_2(x) \times f_3(x)$. Take the log of both sides of the equation and simplify the right hand side. Differentiate both sides using the chain rule. Solve for $f'(x)$. Replace $f(x)$ with $f_1(x) \times f_2(x) \times f_3(x)$ and simplify to put the result in the usual product rule form (see equation sheet). [Note: The equation sheet this time contained the formula $(f_1(x) \times f_2(x) \times f_3(x))' = f_1'(x) \times f_2(x) \times f_3(x) + f_1(x) \times f_2'(x) \times f_3(x) + f_1(x) \times f_2(x) \times f_3'(x)$.]
VI. (10 points) Show that $y = \sin x + \cos x$ satisfies the second-order differential equation $y'' + y = 0$ by substituting the function into the left-hand side of the equation and showing that it is equal to zero.
Test #1, Fall 2005
I. (20 points, 10 points each) Given $f(x) = 2x^2 - x - 1$.
(a) Find $\dfrac{df}{dx}$ using the definition $f'(x) = \lim_{\Delta x \to 0} \dfrac{f(x + \Delta x) - f(x)}{\Delta x}$. (b) Find the equation for the tangent line to the curve at $x = 3$.
II. (10 points; 5 pts each) Find the following limits.
(a) $\lim_{x \to 2} \dfrac{x^2 + 2x - 8}{x^2 + x - 6}$ (b) $\lim_{x \to -3} \dfrac{x^2 + 2x - 3}{x^2 + 5x + 6}$
III. Find the following derivatives:
(a) (10 pts) $\dfrac{d^2}{dx^2} \sin(3x^2 + 2x)$ (b) (10 pts) $\dfrac{d}{dx} \left|\arctan(3x^7)\right|$ (c) (15 pts) $\dfrac{d^2y}{dx^2}$ for $x = \ln(t^2)$ and $y = t^2 + 3$. (Remember to simplify each step as much as possible.) Find the value of $\dfrac{d^2y}{dx^2}$ at $x = 1$. (d) (10 pts) Use logarithmic differentiation to find $\dfrac{d}{dx} \sqrt[5]{\dfrac{x^2 \sin^3(x)}{\cos(x^2)\,\mathrm{arctanh}(x)}}$
IV. (10 pts) Use the Wiggle Magnification Formula to estimate $f(7.1)$ if $f(7) = 5$ and $f'(7) = 2$.
V. Find the following derivatives
(a) (10 pts) $\dfrac{d}{dx} \ln(\ln(\ln(\ln(x^2 + 2x))))$ (b) (10 pts) $\dfrac{dy}{dx}$ where $y = \pi^x$.
Test #1, Fall 2006
I. (20 points, 10 points each) Given $f(x) = x^2 + 2x - 1$.
(a) Find $m_{\sec}$ using the definition $m_{\sec} = \dfrac{f(x + \Delta x) - f(x)}{\Delta x}$ and use your result to find the value of $m_{\sec}$ between the points $(0, -1)$ and $(2, 7)$. (b) Find the value of $m_{\tan}$ at $(0, -1)$. Why does this value differ from your answer in part (a)?
II. (10 points; 5 pts each) Find the following limits.
(a) $\lim_{x \to 3} \dfrac{x^2 - 9}{x - 3}$ (b) $\lim_{x \to -7} \dfrac{x^2 + 4x - 21}{x + 7}$
III. Find the following derivatives.
(a) (10 pts) $\dfrac{d}{dx} \ln\left(\left|\sin(x^2)\right|\right)$ (b) (10 pts) $\dfrac{d}{dx}\left[\dfrac{2x^2 - 3x + 1}{\sqrt{3x + 2}}\right]$ (c) (15 pts) Given $x = \sin(2t)$ and $y = \sec(2t)$, first show that $\dfrac{dy}{dx} = \sec^2(2t)\tan(2t)$ and then show that $\dfrac{d^2y}{dx^2} = \dfrac{2\sin^2(2t) + 1}{\cos^5(2t)}$. Finally, determine whether the $y$ vs $x$ graph is concave up or down at $x = 1/2$ and explain your conclusion. (d) (10 pts) Use logarithmic differentiation to find $\dfrac{d}{dx} \sqrt[7]{\dfrac{x^2 \cos(x)}{\sinh(x^2)\,|\tanh(3x)|}}$.
IV. (10 pts) Use the Wiggle Magnification Formula to estimate $f(9.1)$ if $f(9) = 6$ and $f'(9) = 3$.
V. (10 pts) The graphs of a function $f$ and its derivative $f'$ are shown. Label the graphs as $f$ or $f'$ and write a short paragraph stating the criteria used in making the selection.
VI. (10 pts) Use the definition of arctanh $x$, $\mathrm{Arctanh}\, x = \dfrac{1}{2} \ln\left(\dfrac{1 + x}{1 - x}\right)$, to confirm the derivative for this function given in the equation sheet, i.e., take the derivative of this definition and do the algebra required to get it in the form on the equation sheet.
Test #1, Fall 2007
I. (20 points, 10 points each) Given $f(x) = 2x^2 + 3x - 1$.
(a) Find $m_{\tan}$ using the definition $m_{\tan} = \lim_{\Delta x \to 0} \dfrac{f(x + \Delta x) - f(x)}{\Delta x}$ and use your result to find the equation for the line tangent to the curve at point $(-2, 1)$. (b) Without plotting the function (you can if you want to, but you can’t use the plot for your answer), can you determine if the function is increasing or decreasing at $(-2, 1)$? Why? What is the concavity there (up or down)? Why?
II. (10 points; 5 points each) Find the following limits.
(a) $\lim_{x \to -5} \dfrac{x^2 + 2x - 15}{x^2 + 3x - 10}$ (b) $\lim_{x \to \pi/2} \dfrac{3x^2 + 2x - 1}{\cos(x)}$
III. Find the following derivatives.
(a) (10 pts) $\dfrac{d}{dx} e^{\sinh(x^2)}$ (b) (10 pts) $\dfrac{d}{dx}\left[\dfrac{\sin(3x)\cos(2x)}{\ln(x^2 + 1)}\right]$. (c) (15 pts) Given $x = \ln(3t)$ and $y = t^3$, and using parametric differentiation only, first show that $\dfrac{dy}{dx} = 3t^3$, and then show that $\dfrac{d^2y}{dx^2} = 9t^3$. Finally, determine that $\dfrac{d^3y}{dx^3} = 27t^3$. Five point bonus: Can you find a general pattern for higher-order derivatives of these functions? (d) (10 pts) $\dfrac{d}{dx}\left[\mathrm{Arctan}(3x)\right]^{\sin(x^2)}$
IV. (10 pts) Use the Wiggle Magnification Formula to estimate $f(4.3)$ if $f(4) = -2$ and $f'(4) = 5$.
V. (10 pts) Match the functions in graphs (A) – (D) with their derivatives (I) – (III). Note that two of the functions have the same derivative.
[Figure: graphs (A) – (D) of the functions and graphs (I) – (III) of the derivatives.]
VI. (10 pts) Show that $y(x) = C_1 + C_2 e^{3x}$ ($C_1$ and $C_2$ are constants) satisfies the second order differential equation $\dfrac{d^2y}{dx^2} - 3\dfrac{dy}{dx} = 0$. If we’re additionally given that $y(0) = 0$ and $y'(0) = 1$, show that $C_1 = -1/3$ and $C_2 = 1/3$.
Test #1, Fall 2008
I. ( 10 points; 5 points each) Answer the following questions about the function f (x) = x3 − 5 x2 + 2.
(a) What is the equation of the secant line through x = −2 and x = 1? (b) What is the equation of the tangent line at
x = 0?
II. ( 20
points; 10 points
each ) Find
the√following
limits.
x2 − 2 x − 8 x+x
(a) lim 2 (b) lim √
x→4 x + 3 x − 28 x→0 x+1
III. (30 points; 10 points each) Find the following derivatives.
(a) $\dfrac{d}{dx}\left[x^3 \cos x\right]$ (b) $\dfrac{d}{dt}\left[\dfrac{\mathrm{Arctan}\, t}{\ln(t^2 + 1)}\right]$ (c) $\dfrac{dy}{dx}$ for $y = (1 + \ln x)^{(x^2)}$
IV. (20 points; 10 points each) Find the indicated derivatives of the following functions.
(a) $\dfrac{d^3}{dx^3}(\cos(4x))$ (b) $\dfrac{d^{564}}{dx^{564}}(\sin x)$ (Hint: You are obviously going to have to find a better way to do this than the direct way. Write out the derivatives, look for a pattern, and figure where 564 fits into that pattern.)
V. (15 points; 5 points each part) Give the answers to the following questions about the function defined parametrically by $x = \mathrm{Arctan}\, t$, $y = e^{2t}$.
(a) What is the derivative, $dy/dx$, as a function of $t$? (b) What is the equation of the tangent line to the curve at the point when $t = 0$? (c) What is the second derivative, $d^2y/dx^2$, as a function of $t$?
VI. (10 points) A function occasionally encountered in later mathematics courses is $\mathrm{Si}(x)$, called the sine integral of $x$. Its derivative is
\[ \frac{d}{dx}(\mathrm{Si}(x)) = \frac{\sin x}{x}. \]
Find $\dfrac{d}{dx}\left|\mathrm{Si}(x^3)\right|$. [Caution: Note the absolute values.]
Test #1, Fall 2009

I. (10 points) Given $f(x) = x^2 - 3x + 2$. Find $m_{\tan}$ using the definition $m_{\tan} = \lim_{\Delta x \to 0} \dfrac{f(x + \Delta x) - f(x)}{\Delta x}$ and use your result to find the equation for the line tangent to the curve at $x = 3$.
II. Find the following limits.
(a) $\lim_{x \to -1} \dfrac{x^2 - 2x - 3}{x^2 + x}$ (b) $\lim_{x \to -3} \dfrac{x^2 - 2x - 15}{x^2 + 5x + 6}$
III. (25 points; as noted) Find the following derivatives.
(a) (5 points) $\dfrac{d}{dx}\left[\sinh(3x)\cot(2x^2)\right]$ (b) (10 points) $\dfrac{d}{dx}\left[\dfrac{\sin(\ln(x^2))}{\mathrm{Arcsin}(2x)}\right]$ (c) (10 points) $\dfrac{d}{dx}\left[\mathrm{Arcsec}(3x)\right]^{\ln(x^2)}$
IV. (25 points; as noted) Given $x = t^2$ and $y = e^{t^2}$.
(a) (15 points) Using parametric differentiation only, find $\dfrac{dy}{dx}$, $\dfrac{d^2y}{dx^2}$, and $\dfrac{d^3y}{dx^3}$. (b) (10 points) What is the value of $\dfrac{d^3y}{dx^3}$ at $x = 2$?
V. (10 points) Use the Wiggle Magnification Formula to estimate $f(5.7)$ if $f(5.5) = 3$ and $f'(5.5) = 2$.
VI. (5 points) If $h(x) = f[g(x)]$, find $h'(1.6)$. Note that each axis tick represents 0.4 unit and the pieces of the graphs that look like straight lines really are. The graph of $f(x)$ is drawn dashed, while the graph of $g(x)$ is solid. Finally, the corners on the graphs are at $(2, 6)$ and $(4, 4)$.
VII. (10 points) Show that $y(x) = \dfrac{3 e^{\sqrt{2}\,x}}{4} + \dfrac{3 e^{-\sqrt{2}\,x}}{4} - \dfrac{1}{2}$ satisfies the second-order differential equation $\dfrac{d^2y}{dx^2} - 2y = 1$ AND the initial conditions $y(0) = 1$ and $y'(0) = 0$.
Test #1, Fall 2010
I. (10 points) Given $f(x) = x^3 - 3x^2 + 2x + 2$, find the equation for the tangent line to the curve at $x = 1$.
II. (20 points; 10 points each) Find the following limits.
(a) $\lim_{x \to 3} \dfrac{x^2 - 4x + 3}{x^2 + x - 12}$ (b) $\lim_{z \to 5} \dfrac{z^2 - 4z - 5}{z^2 - z - 20}$
III. (50 points, as noted) Find the following derivatives. (Don’t simplify your final answers.)
(a) (10 points) $\dfrac{d}{dx}\left[\dfrac{e^{3x}\sec(4x)}{x^2 + x}\right]$ (b) (10 points) $\dfrac{d^2}{dx^2} \ln(5x^3 - 8x)$ (c) (10 points) $\dfrac{d}{dx}\left[(\ln x)^{x^4 + x}\right]$ (d) (20 points) $\dfrac{d^2y}{dx^2}$ for $x = \mathrm{Arctan}\, t$, $y = \sin t$
IV. (10 points) Use the Wiggle Magnification Formula to estimate $f(4.9)$ if $f(5) = 8$ and $f'(5) = -4$.
V. (15 points) Many of you may be thinking “Hey, we’ve never developed a formula for the derivative of $u(x)^{v(x)}$!” For this problem, you get to derive such a formula and then check it.
(a) (8 points) Use logarithmic differentiation to find a formula for the derivative of $u(x)^{v(x)}$. (Hint: after you panic, set $y = u(x)^{v(x)}$, take the ln of both sides and simplify the right hand side. Then, differentiate both sides like you normally do for log. diff.) (b) (7 points) Apply the formula that you got to the function $e^{\sin(x)}$ and check your answer with the regular exponential-plus-chain-rule approach. That is, differentiate $e^{\sin(x)}$ by the usual chain rule and plug $u(x) = e$ and $v(x) = \sin(x)$ into the formula you developed in part (a) of this problem.
Test #1, Fall 2011
I. (15 points; 5 points each) Given $f(x) = x^3 - 5x^2 + 3x$.
(a) Find the slope of the secant line between $x = -1$ and $x = 3$. (b) What is the equation of the secant line in part (a)? (c) Find the equation of the tangent line at $x = -1$ (you may use the ‘shortcut’ method of finding derivatives).
II. (20 points; 10 points each) You’re probably thinking to yourself, “Hey, I know my limits; why aren’t they on this exam?” Well, here you go! Find the following limits.
(a) $\lim_{x \to -6} \dfrac{x^2 + 11x + 30}{x^2 + 5x - 6}$ (b) $\lim_{z \to 2} \dfrac{\sqrt{z} - \sqrt{2}}{z - 2}$
III. (25 points; as marked) Find the following derivatives (don’t simplify your final answers):
(a) (5 pts) $\dfrac{d}{dx}\left[\dfrac{\mathrm{Arcsin}(3x)}{9x^2 + 1}\right]$ (b) (10 pts) $\dfrac{d^2}{dx^2} \mathrm{Arctan}(e^{4x})$ (c) (10 pts) $\dfrac{d}{dx}\left[(\mathrm{Arcsec}(x))^{e^x}\right]$
IV. (25 points; as marked) Given the functions $x = \ln(t^2)$, $y = e^{t^3}$.
(a) (5 pts) Show that $\dfrac{dy}{dx} = \dfrac{3}{2} t^3 e^{t^3}$. (b) (10 pts) Show that $\dfrac{d^2y}{dx^2} = \dfrac{9}{4} e^{t^3}(t^3 + t^6)$. (c) (5 pts) Find $\dfrac{d^3y}{dx^3}$. (d) (5 pts) Find the slope of the $y$ vs $x$ graph at $x = 4$ (assume $t > 0$).
V. (10 pts) Estimate $f(-5.2)$ if $f(-5) = 2$ and $f'(-5) = -7$.
VI. (10 pts) Show that $y = e^{x/2} + e^{-2x}$ satisfies the ODE $2y'' + 3y' - 2y = 0$.
Test #1, Fall 2012
I. (15 points; 5 points each) Given $y = x^3 - 4x + 7$.
(a) For $x_1 = -2$ and $x_2 = 3$, find $\Delta x$, $\Delta y$, and $m_{\sec}$ between the two points. (b) Find the equation for the secant line between the two points in part (a). (c) Find the equation for the tangent line to the curve at $x = -2$. (Note: You can use the “shortcut” method for finding the derivative.)
II. Find the following limits.
(a) $\lim_{x \to 3} \dfrac{5x^2 - 2x + 1}{4x^3 - 7}$ (b) $\lim_{x \to 1} \dfrac{x^2 - 2x + 1}{x - 1}$ (c) $\lim_{x \to -3} \dfrac{x^2 + x - 6}{x + 3}$
III. (40 points; 10 points each) Find the following derivatives.
(a) $\dfrac{d}{dx}\left[\dfrac{x^2 \sec(\sqrt{x})}{\cos(x^2)}\right]$ (b) $\dfrac{d^2}{dt^2} \ln(t^4 - 3t)$ (c) $\dfrac{d^2y}{dx^2}$ for $x(t) = 3\,\mathrm{Arcsin}(t)$, $y(t) = e^t$ (d) $\dfrac{d}{dx}\left[(2x^3 + 4x)^{\sin x}\right]$
IV. (10 points) Use the Wiggle Magnification Formula to estimate $f(-3.2)$ if $f(-3) = 10$ and $f'(-3) = 2$.
V. (10 points) A future Asbury student will discover a new formula as part of her Nobel-prize-winning work. She will name it the Coulliette function because “it’s hard to understand and utterly useless.” The symbol for the Coulliette function is $\mathrm{Co}(x)$. It has the following derivative: $\dfrac{d}{dx} \mathrm{Co}(x) = \dfrac{1}{\sqrt{3x + x^{3/2}}}$. Use this fact to compute the following derivative:
\[ \frac{d}{dx}\left[\sin(x^2)\,\mathrm{Co}(\ln(x))\right]. \]
VI. (10 points) Show that $y = 3\cos(2x) + 4\sin(2x)$ is a solution to the following Initial Value Problem: $y'' + 4y = 0$, $y(0) = 3$, $y'(0) = 8$.
Test #1, Fall 2013
I. (15 points; 5 points each) Given $y = x^4 - 4x^2 + 7$.
(a) For $x_1 = -2$ and $x_2 = 3$, find $\Delta x$, $\Delta y$, and $m_{\sec}$ between these two points. (b) Find the equation for the secant line between the two points in part (a). (c) Find the equation for the tangent line to the curve at $x = -2$. (Note: You can use the ‘shortcut’ method for finding the derivative here.)
II. (15 points; 5 points each) Find the following limits.
(a) $\lim_{x \to 2} \dfrac{x^2 - 4}{x^2 + x - 6}$ (b) $\lim_{x \to -2} \dfrac{x + 2}{x^2 + 4}$ (c) $\lim_{x \to 5} \dfrac{\sqrt{x} - \sqrt{5}}{x - 5}$
III. (45 points; as marked) Find the following derivatives.
(a) (10 points) $\dfrac{d}{dx}\left[\dfrac{3x \sec(\sqrt[5]{x})}{\cosh(x^2)}\right]$ (b) (10 points) $\dfrac{d^2}{dt^2} \mathrm{Arctan}(t^4 - 3t)$ (c) (15 points) For $x(t) = 2\,\mathrm{sech}(2t)$, $y(t) = \ln(\cosh(2t))$, first show that $\dfrac{dy}{dx} = -\dfrac{1}{2}\cosh(2t)$ and then find $\dfrac{d^2y}{dx^2}$. (d) (10 points) $\dfrac{d}{dx}\left[(2x^4 + 4x^2)^{\tan(x)}\right]$
IV. (15 points) Show that $y(x) = 2e^{-5x} + 5e^{2x}$ satisfies the Initial Value Problem (IVP) $y''' + 3y'' - 10y' = 0$, $y(0) = 7$, $y'(0) = 0$, $y''(0) = 70$.
V. (10 points) A function used often in advanced mathematics is called the Bessel function of the first kind. Its symbol is $J_p(x)$ for $p = 0, 1, 2, \ldots$. The formula for the derivative of $J_1(x)$ is $\dfrac{d}{dx}(J_1(x)) = J_0(x) - \dfrac{J_1(x)}{x}$. Use this fact to compute the following derivative. Express your answer in terms of $J_0$, $J_1$, trig functions, and $x$ (that is, assume that we know what $J_0$ and $J_1$ are).
\[ \frac{d}{dx}\left[J_1(\sin(x^2))\right]. \]
For these tests, the last page contained this information. You can expect to see it on all tests for the rest of the year.
Occasionally, more material might appear also; that will be decided on each test, and will vary from year to year.
Summary sheet for derivatives

General patterns:
$(f(x) \pm g(x))' = f'(x) \pm g'(x)$
$(c\,f(x))' = c\,f'(x)$
$(f(x) \cdot g(x))' = f'(x) \cdot g(x) + f(x) \cdot g'(x)$
$\left(\dfrac{f(x)}{g(x)}\right)' = \dfrac{g(x) \cdot f'(x) - f(x) \cdot g'(x)}{g(x)^2}$
$(f(g(x)))' = f'(g(x)) \cdot g'(x)$
Specific functions:
Function        Derivative                 Function          Derivative
$x^n$           $n x^{n-1}$                $e^x$             $e^x$
$|x|$           $|x|/x$                    $\ln x$           $1/x$
$\sin x$        $\cos x$                   $\sinh x$         $\cosh x$
$\cos x$        $-\sin x$                  $\cosh x$         $\sinh x$
$\tan x$        $\sec^2 x$                 $\tanh x$         $\mathrm{sech}^2 x$
$\cot x$        $-\csc^2 x$                $\coth x$         $-\mathrm{csch}^2 x$
$\sec x$        $\tan x \sec x$            $\mathrm{sech}\, x$   $-\tanh x\, \mathrm{sech}\, x$
$\csc x$        $-\cot x \csc x$           $\mathrm{csch}\, x$   $-\coth x\, \mathrm{csch}\, x$
Arcsin $x$      $1/\sqrt{1 - x^2}$         Arcsinh $x$       $1/\sqrt{1 + x^2}$
Arctan $x$      $1/(1 + x^2)$              Arccosh $x$       $1/\sqrt{x^2 - 1}$
Arcsec $x$      $1/(|x|\sqrt{x^2 - 1})$    Arctanh $x$       $1/(1 - x^2)$
Test #1, Fall 2004, Answers
I. (a) Using the definition,
\[ \begin{aligned}
f'(x) &= \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}
= \lim_{\Delta x \to 0} \frac{[(x + \Delta x)^2 - 2(x + \Delta x) + 1] - [x^2 - 2x + 1]}{\Delta x} \\
&= \lim_{\Delta x \to 0} \frac{[(x^2 + 2\,\Delta x\, x + (\Delta x)^2) - 2x - 2\,\Delta x + 1] - [x^2 - 2x + 1]}{\Delta x}
= \lim_{\Delta x \to 0} \frac{2\,\Delta x\, x + (\Delta x)^2 - 2\,\Delta x}{\Delta x} \\
&= \lim_{\Delta x \to 0} \frac{\Delta x\,[2x + \Delta x - 2]}{\Delta x}
= \lim_{\Delta x \to 0} (2x + \Delta x - 2) = 2x - 2.
\end{aligned} \]
(b) The slope is $-6$, so the equation of the line is $y - 9 = -6(x + 2)$.
II. (a) 7/6 (b) 0
III. (a) $8(x - 1)^{-3}$ (b) $\dfrac{-2 \sin x}{\ln(\ln(\cos^2 x))\,(\ln(\cos^2 x))\,\cos(x)}$ (c) $-3\cos^4(t + 1)\sin(t + 1)$
(d) $(x^3 - 2x + 3)^{\sin x}\left[\cos x \ln(x^3 - 2x + 3) + \dfrac{\sin x}{x^3 - 2x + 3}(3x^2 - 2)\right]$
IV. 9.7
V. $\ln(f(x)) = \ln(f_1(x) f_2(x) f_3(x)) = \ln(f_1(x)) + \ln(f_2(x)) + \ln(f_3(x))$. Then differentiating gives $\dfrac{f'(x)}{f(x)} = \dfrac{f_1'(x)}{f_1(x)} + \dfrac{f_2'(x)}{f_2(x)} + \dfrac{f_3'(x)}{f_3(x)}$. Multiplying the left side by $f(x)$ and the right side by (the equal value) $f_1(x) f_2(x) f_3(x)$ gives $f'(x) = f_1'(x) f_2(x) f_3(x) + f_1(x) f_2'(x) f_3(x) + f_1(x) f_2(x) f_3'(x)$, which is the usual product rule.
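Test #1, Fall 2005, Answers
I. (a)
\[ \begin{aligned}
f'(x) &= \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}
= \lim_{\Delta x \to 0} \frac{[2(x + \Delta x)^2 - (x + \Delta x) - 1] - [2x^2 - x - 1]}{\Delta x} \\
&= \lim_{\Delta x \to 0} \frac{(2x^2 + 4x\,\Delta x + 2(\Delta x)^2) - x - \Delta x - 1 - 2x^2 + x + 1}{\Delta x}
= \lim_{\Delta x \to 0} \frac{4x\,\Delta x + 2(\Delta x)^2 - \Delta x}{\Delta x} \\
&= \lim_{\Delta x \to 0} \frac{\Delta x\,(4x + 2\,\Delta x - 1)}{\Delta x}
= \lim_{\Delta x \to 0} (4x + 2\,\Delta x - 1) = 4x - 1.
\end{aligned} \]
(b) $y = 11x - 19$ or $y - 14 = 11(x - 3)$.
II. (a) 6/5 (b) 4
III. (a) $-\sin(3x^2 + 2x)\,(6x + 2)^2 + 6\cos(3x^2 + 2x)$ (b) $\dfrac{|\arctan(3x^7)|}{\arctan(3x^7)} \cdot \dfrac{1}{1 + (3x^7)^2} \cdot 21x^6$ (c) $\dfrac{d^2y}{dx^2} = t^2$; $\dfrac{d^2y}{dx^2} = e$ at $x = 1$
(d) $\dfrac{dy}{dx} = y\left[\dfrac{2}{5}\,\dfrac{1}{x} + \dfrac{3}{5}\,\dfrac{\cos x}{\sin x} - \dfrac{1}{5}\,\dfrac{-\sin(x^2)\,2x}{\cos(x^2)} - \dfrac{1}{5}\,\dfrac{1}{\mathrm{arctanh}(x)}\,\dfrac{1}{1 - x^2}\right]$
IV. $y \approx y_w = 5.2$
V. (a) $\dfrac{1}{\ln(\ln(\ln(x^2 + 2x)))} \cdot \dfrac{1}{\ln(\ln(x^2 + 2x))} \cdot \dfrac{1}{\ln(x^2 + 2x)} \cdot \dfrac{1}{x^2 + 2x} \cdot (2x + 2)$ (b) $\pi^x \ln(\pi)$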
Test #1, Fall 2006, Answers
I. (a) 4. (b) 2. This is not equal to the answer in part (a) since that slope is of a secant line through $(0, -1)$, while this line is tangent to the curve at $(0, -1)$.
II. (a) 6 (b) $-10$
III. (a) $\dfrac{1}{|\sin(x^2)|} \cdot \dfrac{|\sin(x^2)|}{\sin(x^2)} \cdot \cos(x^2)\,(2x)$ (b) $\dfrac{\sqrt{3x + 2}\,(4x - 3) - (2x^2 - 3x + 1)\,(1/2)\,(3x + 2)^{-1/2}\,(3)}{3x + 2}$
(c) $\dfrac{dy}{dx} = \dfrac{dy/dt}{dx/dt} = \dfrac{2\sec(2t)\tan(2t)}{2\cos(2t)} = \sec^2(2t)\tan(2t)$ since $1/\cos(2t) = \sec(2t)$.
\[ \begin{aligned}
\frac{d^2y}{dx^2} &= \frac{1}{dx/dt}\,\frac{d}{dt}\left(\frac{dy}{dx}\right) \\
&= \frac{1}{2\cos(2t)}\left[2\sec(2t)\,(\sec(2t)\tan(2t)\,(2)) \times \tan(2t) + \sec^2(2t) \times (\sec^2(2t)\,(2))\right] \\
&= \frac{1}{2}\sec(2t)\left[4\sec^2(2t)\tan^2(2t) + 2\sec^4(2t)\right] \\
&= 2\sec^3(2t)\tan^2(2t) + \sec^5(2t) \\
&= 2\,\frac{1}{\cos^3(2t)}\,\frac{\sin^2(2t)}{\cos^2(2t)} + \frac{1}{\cos^5(2t)} \\
&= \frac{2\sin^2(2t)}{\cos^5(2t)} + \frac{1}{\cos^5(2t)} \\
&= \frac{2\sin^2(2t) + 1}{\cos^5(2t)}
\end{aligned} \]
At $x = 1/2$, the value of $t$ is found by $x = \sin(2t)$, or $1/2 = \sin(2t)$, or $\pi/6 = 2t$, or $t = \pi/12$. Plugging that into $d^2y/dx^2$ gives $\dfrac{d^2y}{dx^2} = \dfrac{2\sin^2(\pi/6) + 1}{\cos^5(\pi/6)}$, which is positive, since all parts of the fraction are positive. Therefore, the curve is concave up.
(d) $y' = \dfrac{1}{7}\,\sqrt[7]{\dfrac{x^2\cos(x)}{\sinh(x^2)\,|\tanh(3x)|}} \times \left[2\,\dfrac{1}{x} + \dfrac{1}{\cos x}(-\sin x) - \dfrac{1}{\sinh(x^2)}\cosh(x^2)\,(2x) - \dfrac{1}{|\tanh(3x)|}\,\dfrac{|\tanh(3x)|}{\tanh(3x)}\,\mathrm{sech}^2(3x)\,(3)\right]$
IV. 6.3
V. Use a process of elimination. On the left side of the $y$ axis, one curve is $f$ and the other is $f'$. If the upper curve is $f'$, that would mean that $f$ is the lower curve. But then $f' > 0$, while $f$ is decreasing, so $f' < 0$. That can’t happen, so the upper curve is $f$ and the lower curve is $f'$. On the right side of the $y$ axis, both curves are positive, meaning $f' > 0$. That means $f$ is increasing, and the other curve is $f'$.
VI.
\[ \begin{aligned}
\frac{d}{dx}\left[\mathrm{Arctanh}\, x\right] &= \frac{d}{dx}\left[\frac{1}{2}\ln\left(\frac{1 + x}{1 - x}\right)\right] \\
&= \frac{1}{2}\,\frac{1 - x}{1 + x}\,\frac{(1 - x)(1) - (1 + x)(-1)}{(1 - x)^2} \\
&= \frac{1}{2}\,\frac{1 - x}{1 + x}\,\frac{2}{(1 - x)^2} \\
&= \frac{1}{(1 + x)(1 - x)} \\
&= \frac{1}{1 - x^2}
\end{aligned} \]

Test #1, Fall 2007, Answers
I. (a) After some algebra, $\dfrac{f(x + \Delta x) - f(x)}{\Delta x} = 4x + 3 + 2\,\Delta x$, so $m_{\tan} = 4x + 3$. Then at $x = -2$, $m_{\tan} = -5$, so the equation of the tangent line is $y - 1 = -5(x + 2)$. (b) Since $f'(-2) = -5 < 0$, the graph is decreasing. From $f'(x) = 4x + 3$, you get $f''(x) = 4 > 0$, so the graph is concave up.
II. (a) 8/7 (b) Does not exist
III. (a) $e^{\sinh(x^2)}\cosh(x^2)\,(2x)$ (b) $\dfrac{\ln(x^2 + 1)\,[3\cos(3x)\cos(2x) - 2\sin(3x)\sin(2x)] - \sin(3x)\cos(2x)\,[2x/(x^2 + 1)]}{[\ln(x^2 + 1)]^2}$
(c) $dx/dt = 1/t$ and $dy/dt = 3t^2$, so $dy/dx = (dy/dt)/(dx/dt) = (3t^2)/(1/t) = 3t^3$. Then $d^2y/dx^2 = \dfrac{1}{dx/dt}\,\dfrac{d}{dt}(dy/dx) = \dfrac{1}{1/t}\,(9t^2) = 9t^3$. Bonus: The general formula is $d^n y/dx^n = 3^n t^3$.
(d) $\dfrac{dy}{dx} = y\left[2x\cos(x^2)\ln(\mathrm{Arctan}(3x)) + \dfrac{3\sin(x^2)}{\mathrm{Arctan}(3x)\,(1 + (3x)^2)}\right]$
IV. $f(4.3) \approx -0.5$
V. I pairs with B; II pairs with C; III pairs with A, D
VI. $y'(x) = 3C_2 e^{3x}$, $y''(x) = 9C_2 e^{3x}$. Then $y'' - 3y' = (9C_2 e^{3x}) - 3(3C_2 e^{3x}) = 0$ as required. If $y(0) = 0$ and $y'(0) = 1$, then $0 = y(0) = C_1 + C_2 e^0 = C_1 + C_2$ and $1 = y'(0) = 3C_2 e^0 = 3C_2$. Then $C_2 = 1/3$ and $C_1 = -C_2 = -1/3$ as required.
Test #1, Fall 2008, Answers
I. (a) $y = 8x - 10$ (b) $y = 2$
II. (a) 6/11 (b) 0
III. (a) $3x^2\cos x - x^3\sin x$ (b) $\dfrac{\ln(t^2 + 1)/(t^2 + 1) - 2t\,\mathrm{Arctan}(t)/(t^2 + 1)}{(\ln(t^2 + 1))^2}$ (c) $(1 + \ln x)^{x^2}\,(2x\ln(1 + \ln x) + x/(1 + \ln x))$
IV. (a) $64\sin(4x)$ (b) $\sin x$
V. (a) $2e^{2t}(1 + t^2)$ (b) $y - 1 = 2x$ (c) $[4e^{2t}(1 + t^2) + 4t\,e^{2t}]\,(1 + t^2)$
VI. $\dfrac{|\mathrm{Si}(x^3)|}{\mathrm{Si}(x^3)}\,\dfrac{\sin(x^3)}{x^3}\,(3x^2)$
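Test #1, Fall 2009, Answers
I. $m_{\tan} = 2x - 3$; $(y - 2) = 3(x - 3)$
II. (a) 4 (b) 8
III. (a) $3\cosh(3x)\cot(2x^2) + \sinh(3x)\,(-\csc^2(2x^2)\,(4x))$ (b) $\dfrac{\mathrm{Arcsin}(2x)\cos(\ln(x^2))\,(1/x^2)\,(2x) - \sin(\ln(x^2))\,(1/\sqrt{1 - (2x)^2})\,(2)}{(\mathrm{Arcsin}(2x))^2}$
(c) $y' = y\left[(1/x^2)\,(2x)\ln(\mathrm{Arcsec}(3x)) + \ln(x^2)\,(1/\mathrm{Arcsec}(3x))\,\dfrac{1}{|3x|\sqrt{(3x)^2 - 1}}\,(3)\right]$
IV. (a) $dy/dx = e^{t^2}$, $d^2y/dx^2 = e^{t^2}$, $d^3y/dx^3 = e^{t^2}$ (b) $e^2$
V. 3.4
VI. 1
VII. $dy/dx = \dfrac{3\sqrt{2}}{4} e^{\sqrt{2}\,x} - \dfrac{3\sqrt{2}}{4} e^{-\sqrt{2}\,x}$, $d^2y/dx^2 = \dfrac{3}{2} e^{\sqrt{2}\,x} + \dfrac{3}{2} e^{-\sqrt{2}\,x}$, so $y'' - 2y = \left(\dfrac{3}{2} e^{\sqrt{2}\,x} + \dfrac{3}{2} e^{-\sqrt{2}\,x}\right) - 2\left(\dfrac{3 e^{\sqrt{2}\,x}}{4} + \dfrac{3 e^{-\sqrt{2}\,x}}{4} - \dfrac{1}{2}\right) = 1$, $y(0) = (3/4) + (3/4) - (1/2) = 1$, and $y'(0) = \dfrac{3\sqrt{2}}{4} - \dfrac{3\sqrt{2}}{4} = 0$.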
Test #1, Fall 2010, Answers
I. $(y - 2) = (-1)(x - 1)$
II. (a) 2/7 (b) 2/3
III. (a) $\dfrac{(x^2 + x)\,[3e^{3x}\sec(4x) + 4e^{3x}\sec(4x)\tan(4x)] - (2x + 1)\,(e^{3x}\sec(4x))}{(x^2 + x)^2}$ (b) $\dfrac{(5x^3 - 8x)\,(30x) - (15x^2 - 8)\,(15x^2 - 8)}{(5x^3 - 8x)^2}$
(c) $\left[(\ln x)^{x^4 + x}\right]\left[(4x^3 + 1)\ln(\ln x) + \dfrac{x^4 + x}{x\ln x}\right]$ (d) $\dfrac{2t\cos t - (1 + t^2)\sin t}{1/(1 + t^2)}$
IV. $f(4.9) \approx 8.4$
V. (a) $y = u(x)^{v(x)}$ so $\ln y = v(x)\ln(u(x))$, and then $(1/y)\,y' = v'(x)\ln(u(x)) + v(x)\,(1/u(x))\,u'(x)$, or $y' = (u(x)^{v(x)})\,(v'(x)\ln(u(x)) + v(x)\,(u'(x)/u(x)))$. (b) $y' = \cos x\; e^{\sin(x)}$ both ways.
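Test #1, Fall 2011, Answers
I. (a) 0 (b) $y = -9$ (c) $y + 9 = 16(x + 1)$
II. (a) 1/7 (b) $1/(2\sqrt{2})$
III. (a) $\dfrac{(9x^2 + 1)\,\dfrac{1}{\sqrt{1 - (3x)^2}}\,(3) - (18x)\,\mathrm{Arcsin}(3x)}{(9x^2 + 1)^2}$ (b) $\dfrac{(1 + e^{8x})\,(16e^{4x}) - (4e^{4x})\,(8e^{8x})}{(1 + e^{8x})^2}$
(c) $(\mathrm{Arcsec}(x))^{e^x}\left[e^x\ln(\mathrm{Arcsec}(x)) + \dfrac{e^x}{\mathrm{Arcsec}(x)\,|x|\sqrt{x^2 - 1}}\right]$
IV. (a) $\dfrac{dy}{dx} = \dfrac{dy/dt}{dx/dt} = \dfrac{3t^2 e^{t^3}}{2/t} = \dfrac{3}{2}\,t^3 e^{t^3}$ (b) $\dfrac{d^2y}{dx^2} = \dfrac{1}{dx/dt}\,\dfrac{d}{dt}\left(\dfrac{dy}{dx}\right) = \dfrac{1}{2/t}\left[\dfrac{9}{2}\,t^2 e^{t^3} + \dfrac{9}{2}\,t^5 e^{t^3}\right] = \dfrac{9}{4}\,e^{t^3}(t^3 + t^6)$
(c) $\dfrac{t}{2}\left[\dfrac{27}{4}\,t^2 e^{t^3}(t^3 + t^6) + \dfrac{9}{4}\,e^{t^3}(3t^2 + 6t^5)\right]$ (d) $\dfrac{3}{2}\,e^6 e^{e^6}$
V. 3.4
VI. $y' = \dfrac{1}{2}e^{x/2} - 2e^{-2x}$, and $y'' = \dfrac{1}{4}e^{x/2} + 4e^{-2x}$. So $2y'' + 3y' - 2y = 2\left(\dfrac{1}{4}e^{x/2} + 4e^{-2x}\right) + 3\left(\dfrac{1}{2}e^{x/2} - 2e^{-2x}\right) - 2\left(e^{x/2} + e^{-2x}\right) = \dfrac{1}{2}e^{x/2} + 8e^{-2x} + \dfrac{3}{2}e^{x/2} - 6e^{-2x} - 2e^{x/2} - 2e^{-2x} = 0$.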
Test #1, Fall 2012, Answers
I. (a) $\Delta x = 5$, $\Delta y = 15$, $m_{\sec} = 3$ (b) $(y - 22) = 3(x - 3)$ or $(y - 7) = 3(x - (-2))$ (c) $(y - 7) = 8(x - (-2))$
II. (a) 40/101 (b) 0 (c) $-5$
III. (a) $\dfrac{\cos(x^2)\,[2x\sec(\sqrt{x}) + x^2\sec(\sqrt{x})\tan(\sqrt{x})\,(1/2)\,x^{-1/2}] - (x^2\sec(\sqrt{x}))\,[-\sin(x^2)\,2x]}{\cos^2(x^2)}$ (b) $\dfrac{(t^4 - 3t)\,(12t^2) - (4t^3 - 3)^2}{(t^4 - 3t)^2}$
(c) $\dfrac{(1/3)\,e^t\sqrt{1 - t^2} + (1/3)\,e^t\,(1/2)\,(1 - t^2)^{-1/2}\,(-2t)}{3/\sqrt{1 - t^2}}$ (d) $(2x^3 + 4x)^{\sin x}\left[\cos x\ln(2x^3 + 4x) + \dfrac{\sin x\,(6x^2 + 4)}{2x^3 + 4x}\right]$
IV. 9.6
V. $2x\cos(x^2)\,\mathrm{Co}(\ln x) + \sin(x^2)\left[\dfrac{1}{\sqrt{3\ln x + (\ln x)^{3/2}}}\;\dfrac{1}{x}\right]$
VI. $y' = -6\sin(2x) + 8\cos(2x)$, $y'' = -12\cos(2x) - 16\sin(2x)$. So, $y'' + 4y = [-12\cos(2x) - 16\sin(2x)] + 4\,[3\cos(2x) + 4\sin(2x)] = -12\cos(2x) - 16\sin(2x) + 12\cos(2x) + 16\sin(2x) = 0$. Also, $y(0) = 3\cos(0) + 4\sin(0) = 3 + 0 = 3$ and $y'(0) = -6\sin(0) + 8\cos(0) = 0 + 8 = 8$.
Test #1, Fall 2013, Answers
I. (a) $\Delta x = 5$, $\Delta y = 45$, $m_{\sec} = 9$. (b) $y - 52 = 9(x - 3)$ (c) $y - 7 = -16(x + 2)$
II. (a) 4/5 (b) 0 (c) $1/(2\sqrt{5})$
III. (a) $\dfrac{\cosh(x^2)\left[3\sec(\sqrt[5]{x}) + 3x\sec(\sqrt[5]{x})\tan(\sqrt[5]{x})\,(1/5)\,x^{-4/5}\right] - 3x\sec(\sqrt[5]{x})\sinh(x^2)\,(2x)}{\cosh^2(x^2)}$
(b) $\dfrac{[1 + (t^4 - 3t)^2]\,(12t^2) - (4t^3 - 3)\,[(2)\,(t^4 - 3t)^1\,(4t^3 - 3)]}{[1 + (t^4 - 3t)^2]^2}$
(c) First, $\dfrac{dy}{dx} = \dfrac{dy/dt}{dx/dt} = \dfrac{(1/\cosh(2t))\,(\sinh(2t))\,(2)}{-2\,\mathrm{sech}(2t)\tanh(2t)\,(2)} = \dfrac{-1}{2}\cosh(2t)$.
Then $\dfrac{d^2y}{dx^2} = \dfrac{1}{dx/dt}\,\dfrac{d}{dt}\left(\dfrac{dy}{dx}\right) = \dfrac{1}{-2\,\mathrm{sech}(2t)\tanh(2t)\,(2)}\,(-\sinh(2t))$.
(d) If $y = (2x^4 + 4x^2)^{\tan(x)}$, then $\dfrac{dy}{dx} = y\left(\sec^2 x\,\ln(2x^4 + 4x^2) + \tan x\,\dfrac{8x^3 + 8x}{2x^4 + 4x^2}\right)$.
IV. First, $y' = -10e^{-5x} + 10e^{2x}$, $y'' = 50e^{-5x} + 20e^{2x}$, and $y''' = -250e^{-5x} + 40e^{2x}$. Then $y''' + 3y'' - 10y' = -250e^{-5x} + 40e^{2x} + 3\,(50e^{-5x} + 20e^{2x}) - 10\,(-10e^{-5x} + 10e^{2x}) = -250e^{-5x} + 40e^{2x} + 150e^{-5x} + 60e^{2x} + 100e^{-5x} - 100e^{2x} = (0)\,e^{-5x} + (0)\,e^{2x} = 0$. That shows that the function is a solution of the differential equation part of the IVP. Next, we have to show that the function also satisfies the initial conditions. Using $e^0 = 1$, we get $y(0) = 2e^0 + 5e^0 = 7$, $y'(0) = -10e^0 + 10e^0 = 0$, and $y''(0) = 50e^0 + 20e^0 = 70$.
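A check like the last one above (substituting a function and its derivatives into a differential equation) can also be spot-checked numerically. The Python below is only a sketch under stated assumptions: the derivative formulas are typed in by hand from the answer above, the sample points are arbitrary, and the names `y`, `y1`, `y2`, `y3`, and `residual` are made up for this illustration. A small residual at a few points is evidence, not a proof; the algebra above is still the real argument.

```python
import math


def y(x):
    return 2 * math.exp(-5 * x) + 5 * math.exp(2 * x)


def y1(x):
    # First derivative, entered by hand.
    return -10 * math.exp(-5 * x) + 10 * math.exp(2 * x)


def y2(x):
    # Second derivative, entered by hand.
    return 50 * math.exp(-5 * x) + 20 * math.exp(2 * x)


def y3(x):
    # Third derivative, entered by hand.
    return -250 * math.exp(-5 * x) + 40 * math.exp(2 * x)


def residual(x):
    """Left-hand side y''' + 3y'' - 10y' of the differential equation."""
    return y3(x) + 3 * y2(x) - 10 * y1(x)


# Arbitrary sample points; each residual should be zero up to roundoff.
residuals = [residual(x) for x in (-1.0, 0.0, 0.5, 2.0)]

# Initial conditions y(0), y'(0), y''(0).
initial_values = (y(0.0), y1(0.0), y2(0.0))

print(residuals)
print(initial_values)
```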
Chapter 2
Finance
2.1 Introduction.
2.1.1 Seems an unusual topic for calculus, but isn’t.
In this chapter, we look at several applications that can all fall under the general umbrella of finance.
Mugsy: Finance? In a calculus course? Is this a joke?
Albert: I doubt it. There are a lot of possible topics that could fit here. On the other hand, I’ve never encountered
any other calculus course that had a chapter on finance.
Remember that one of my intentions is to provide a range of different topics all of which use calculus in less-than-common
ways.
Calculus is a major portion of business finance.

Most of the second semester of business math is calculus. Even so, Rich Wright, former director of placement here at
Asbury and a graduate in business, told me that he wished he had taken the regular calculus course rather than business
math calculus. He could have used more math than he took because he now wants to go into graduate work in business, and
they use all kinds of mathematical topics there. And Dr. John Charalambakis, a former economics professor here, twisted
the arms of economics majors to take both semesters of calculus if they were considering graduate work in economics.
Finally, the Financial Mathematics major here at Asbury College is living proof that the two go together.
Economics is a heavily quantified subject nowadays.

If you take any economics courses beyond the basic ones, you will find yourself floating (drowning?) in math.
Mugsy: I do have an economics joke. Want to hear it?
Dudley: Do we have a choice?
Mugsy: Economists show that they have a sense of humor by predicting the gross national product to within a tenth
of a percent. Har, har, har.
Albert: There’s a point to that saying, though. Mathematics is very good at predicting the past, but it is very poor
(at this stage of the game) at predicting the future, especially complex situations, such as a nation’s economy.
Mugsy: You can take the fun out of anything, can’t you?
Mathematics has a reputation for exactness, so using mathematical terminology to express answers gives them extra credibility. The whole field of statistics gets an incredible amount of abuse.
Beyond that, though, mathematics does have a legitimate place in economic analysis. Microeconomics uses some
complicated math, beyond what we have had so far (although by the end of this course, you will have seen the majority of
it).
We will be doing a few separate topics in this chapter.

We aren’t going to delve into microeconomics, however. We will touch on a few, reasonably everyday, subjects, though.
Continuous compounding, and indeterminate forms. We started continuous compounding in differentials, but we
come at it again from a different angle. Our work there on indeterminate forms will enable us to deal with messy limits (the
“0/0” type of thing) very easily using calculus (L’Hôpital’s rule).
Inventory control, and max/min problems. This is a classic application of how to minimize inventory + shipping costs.
It isn’t too hard, but the types of problems that can occur here are notoriously difficult for calculus students.
Dudley: Does that mean this is one of those chapters that we’ll cry all the way through?
Albert: Not really. Traditional calculus courses deal with this topic differently. These types of problems come at you
with no background or framework for solving them. They are stated in English sentences, and the usual challenge is
that you have to then translate the information into equations. This whole course is designed to give a big picture
for that type of translation. Hence there is no undue emphasis on such work in this section, as in a typical calculus
course.
Elasticity and relative changes. This topic is straight from microeconomics. We do it from the calculus point of view,
which helps explain several things that are awkward to say without calculus. It shows the usefulness of differentials.
Elasticity also introduces the idea of relative change (error).
2.2 Continuous compounding.

2.2.1 Background.
We talked a bit about continuous compounding in the section on differentials, but now we want to take a longer and harder
look at it.
Terminology and notation.

There is no single standard notation or terminology. I will try to be consistent in my presentation, but comparing my
formulas to what you find elsewhere could be frustrating.
Mugsy: Why doesn’t he just use the notation and terminology that economists use?
Albert: Because economists use all kinds of different notations and terminologies. What is presented here is used by
some rather large collection of economists. That’s as close as you can come.
I will use the idea that this is a savings account. That is, you give someone your money, and then get it back, with
interest, later.
Mugsy: I’m interested in having my money right now.
Dudley: Do you really want us to start up a pun-war?
Mugsy: NO. Sorry.
The same ideas are used when you take out a loan, except that someone gives you the money, and you have to pay it back
later, also with interest. But that’s more depressing, so I’ll stick with a savings account.
Albert: But for those of you with student loans, it certainly is a real application.
Basic notation and terminology. The notation and terminology set up here will be used for all following sections in this
chapter, and even once in a later chapter.
Term                          Variable  Meaning
Principal                     P         The initial amount deposited
Interest                      I         The amount added to the principal at the end
Nominal rate                  r         The interest rate that is stated on the account
Time                          t         Time usually measured in years
Future value                  F         Amount you will have at the end
Future value interest factor  FVIF      The quantity that multiplies the principal to give the future value
There are different formulas for FVIF, depending on the interest scheme (we’ll have several). Please remember to
multiply this. The term factor always indicates that multiplication is the thing to do.
The units (years, months, etc.) on t must match the time unit in r. That is, if r is given as an annual rate (and it always
is, unless specifically stated otherwise, and that is rare), then t must be in years. (Those are the usual units, and it would be
reasonable to keep everything in terms of years.) Also, the value of r is normally given as a percent, but is always changed
to a decimal when used in a formula.
More in each section. In each of the following sections are more formulas that are specific to that method of getting
interest.
There are a lot of formulas, but I am not going to expect you to memorize them. I will give them to you on the test.
Dudley: I was getting worried there. Too many variables, and my mind simply locks up.
Albert: The meanings of the variables will not be given to you. Just the formulas.
Simple interest.
Simple interest is, well, the simplest.
Mugsy: Duh.
In this situation, you get interest on the amount of principal only, no matter how long you lend out the money.
I = P r t. "Interest equals principal times rate times time" is a common saying. It is debatable whether it is worth
remembering.
Simple interest gathers the least amount of interest at a fixed interest rate. We will compare simple interest to others as
we get them.
FVIF = 1 + r t. To get the new amount you will have (F), you take the principal (P) and add in the interest (I = P r t), and
you get F = P + P r t = P(1 + r t). So, the FVIF is the factor that multiplies P, to give F, namely
FVIF = 1 + r t
Remember, this is for simple interest only. Different interest schemes have different FVIF’s.
Example. Depositing $2500 in a simple interest account for 9 months at 8% annual interest generates a future value of
\begin{align}
F &= P \times \mathrm{FVIF} \tag{2.1}\\
  &= (\$2500)\left(1 + (0.08) \times 3/4\right) \tag{2.2}\\
  &= \$2650 \tag{2.3}
\end{align}
where we used that 9 months is 3/4 of a year. (Remember that the easiest units to use are years, even if the units aren’t
given to you that way!)
You can get the interest either by I = F − P = $2650 − $2500 = $150 (the amount that the account increases is the
interest), or by I = P r t = ($2500)(0.08)(3/4) = $150.
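The same arithmetic can be checked by machine. Here is a minimal sketch in Python (not the Maple used elsewhere in these notes); the function name simple_fvif is just a label chosen for this illustration.

```python
def simple_fvif(r, t):
    """Future value interest factor for simple interest: 1 + r*t."""
    return 1 + r * t

P = 2500.0   # principal, in dollars
r = 0.08     # nominal annual rate, converted from 8% to a decimal
t = 9 / 12   # 9 months expressed in years, to match the annual rate

F = P * simple_fvif(r, t)   # future value
I = F - P                   # interest is the amount the account grew

print(f"F = ${F:.2f}")
print(f"I = ${I:.2f} (and P*r*t = ${P * r * t:.2f})")
```

Both ways of getting the interest should agree, just as they do by hand.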
Compound interest.
In compound interest, the interest that is accumulated gets added to the principal before the next amount of interest is
figured. This causes interest to accumulate faster by compounding than by simple interest. You are getting interest on more
money.
It is possible to treat compound interest this way: It is simple interest where the principal keeps changing at each
compounding period. It’s one of the things that I try to cover in the Concepts course. Another result is that if the compounding
period equals the length of time that you keep the money in the account, then compound interest is exactly the same as
simple interest. You have to have a longer period of time for the compounding effect to appear. A homework question deals
with this.
Formulas. A new set of items appears here. You have to keep these straight, or you will get overwhelmed by the number
of different variables. (Careful, Dudley.)
Variable  Meaning
m         Number of compounding periods per year
k         Interest rate per compounding period
n         Number of compounding periods the deposit lasts
These will usually be camouflaged by stating them in English. Quarterly compounding means m = 4; monthly compounding
means m = 12; semi-annual compounding means m = 2; daily compounding means m = 360 (banks often use a 360-day
year); annual compounding means m = 1. Then the value of k is calculated by k = r/m. Don’t use just r! And n is calculated
by n = mt, where t must be in years before you use the formulas.
Mugsy: Am I getting slow, or does this make sense and I can’t see it?
Albert: The formulas can throw you. You are probably best off memorizing what the variables mean. For example, if
you are compounding monthly, and you have money on deposit for 3 years, how many compounding periods are there?
Mugsy: Is that the same as asking how many months there are in 3 years?
Albert: Yes.
Mugsy: Oh, that’s easy. Lessee, 3 times 12 is . . . 36.
Dudley: Hey, not bad. You didn’t even use your fingers.
Albert: So, the value of n, the number of compounding periods, is 36. And if the interest rate is 5% (per year, but
that is almost never stated), then how much do you get when splitting it up into equal monthly amounts?
Mugsy: Would it be 5/12%?
Albert: Exactly.
Dudley: Wow. Two in a row.
Mugsy: That is easier. Thanks, Al. Dudley, quiet.
Looking at compound interest as a succession of accumulating simple interest problems, we can get the formula for
compound interest easily. The critical items to remember are that you get the amount at the end of an interest period by
multiplying the amount at the beginning of the period by (1 + k), and the amount at the end of one period is the amount at
the beginning of the next.
Period  Start of period  End of period
1       P                P × (1 + k) = P(1 + k)
2       P(1 + k)         P(1 + k) × (1 + k) = P(1 + k)^2
3       P(1 + k)^2       P(1 + k)^2 × (1 + k) = P(1 + k)^3
4       P(1 + k)^3       P(1 + k)^3 × (1 + k) = P(1 + k)^4
With a little bit of thinking, you can see that the amount at the end of the nth period is just P(1 + k)^n. In that case, the FVIF
for compound interest is
FVIF = (1 + k)^n.
Again, this is for compound interest only.
Example. Take $2500, and invest it at 9% interest compounded monthly for 4 years. Then 9% = 0.09 = r, and t = 4,
just as before. The “monthly” means m = 12, so k = r/m = 0.09/12 = 0.0075, and n = t m = (4)(12) = 48. Then the
future value interest factor is FVIF = (1 + k)^n = (1 + 0.0075)^48 = 1.431405. The future value is then F = P × FVIF =
($2500)(1.431405) = $3578.51.
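The bookkeeping with m, k, and n is easy to hand to a machine. The following is a minimal Python sketch of the example above; the function name compound_fvif is an assumption made for this illustration, not notation from the lecture.

```python
def compound_fvif(r, m, t):
    """FVIF for compound interest: (1 + k)**n with k = r/m and n = m*t."""
    k = r / m    # interest rate per compounding period
    n = m * t    # number of compounding periods the deposit lasts
    return (1 + k) ** n

P = 2500.0
fvif = compound_fvif(r=0.09, m=12, t=4)   # 9% compounded monthly for 4 years
F = P * fvif

print(f"FVIF = {fvif:.6f}")
print(f"F = ${F:.2f}")
```

Note that the rate goes in as a decimal and the time in years, exactly as the formulas require.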
There are a few comments that need to be made about the number of decimal places to keep and use. Final answers
should be in dollars rounded to two decimal places (cents). Numbers before that should have between 5 and 8 decimal
places, depending on the size of the principal and how many decimal places you need to get the future value accurately.
Be particularly careful to keep plenty (actually, all you can) of decimal places in (1 + k). The FVIF is very sensitive to
round-off errors in that term. For example, if you have 5% interest compounded monthly, you’d be much closer using
k = .05/12 = .004166666667 than k = .0042.
Also, depending on the number of decimal places you keep, you will get slightly different answers from me or from
others. If it is close, I will not quibble.
Dudley: Does that mean you won’t take off points? That’s what I’m interested in.
Yes, that’s what I mean, Dudley.
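To see how touchy the FVIF is about round-off in k, here is a small Python sketch comparing the two values of k mentioned above. The deposit (a principal of $10,000 left for 30 years) is a made-up example chosen only to make the effect visible.

```python
def fvif_from_k(k, n):
    """FVIF for compound interest, given the per-period rate k and n periods."""
    return (1 + k) ** n

# Hypothetical deposit: $10,000 at 5% compounded monthly for 30 years.
P = 10_000.0
n = 12 * 30

k_full = 0.05 / 12    # all the decimal places the machine carries
k_rounded = 0.0042    # k rounded to four decimal places

fvif_full = fvif_from_k(k_full, n)
fvif_rounded = fvif_from_k(k_rounded, n)

print(f"full k:    FVIF = {fvif_full:.6f}, F = ${P * fvif_full:,.2f}")
print(f"rounded k: FVIF = {fvif_rounded:.6f}, F = ${P * fvif_rounded:,.2f}")
print(f"difference in F: ${P * (fvif_rounded - fvif_full):,.2f}")
```

The difference comes entirely from the fourth decimal place of k, magnified by 360 compounding periods.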
Continuous compounding.
In this situation, we need to take shorter and shorter interest periods, which is the same as making m bigger and bigger.
FVIF = lim_{m→∞} (1 + k)^n. What we want is a limit, where m goes to infinity, which is written ∞. But please be careful;
there is no number ∞, or put another way, ∞ is not a number, and you can’t treat it as such. You can’t, for example, plug ∞
into a formula and get anything realistic out. This means that we had better have another way of evaluating limits, since our
usual approach is to plug in the limiting value. Again, ∞ is a symbol that can only occur in a limit, and there it just means
that some variable is getting gigantic, and we are asking if the expression settles down to a value in the process.
The FVIF becomes 1^∞. As m → ∞, k = r/m → 0 and n = mt → ∞. As the period gets shorter and shorter, the amount of
interest per period goes to zero, but the number of interest periods goes to infinity. The net result is that the FVIF looks like
1^∞.
This is a new form! It is like 0/0, an indeterminate form, but it is less obvious. We need to look at it.
Dudley: Isn’t 1 to any power equal to 1? So why is this an indeterminate form? Isn’t it equal to 1?
Albert: Yes, 1 to any power is 1. But isn’t 0/(anything) equal to 0?
Dudley: Yes.
Albert: So why isn’t 0/0 equal to 0? We’ve already seen that it can be anything.
Dudley: But this is different.
Albert: Not that much. Keep reading.
The problem is that the 1 in the 1^∞ is a limit. For any specific value of n, the “1” is actually a bit larger than 1. Any number
larger than 1, when raised to large enough powers, gets large. (Try putting 1.0000001 in your calculator and then keep
squaring it. It overflows mighty fast.)
We discovered that the limit form 0/0 could be anything. The bottom tried to send the quotient off to infinity, while the
top tried to keep it at zero. This schizophrenic nature of 0/0 is why we called it an indeterminate form. The same is true
for 1^∞. The 1 in the base tries to make the limit 1. The ∞ in the exponent tries to pull the limit to ∞. So, the result is a
tug-of-war characteristic of an indeterminate form. Let’s try an example with numbers first.
Example. Suppose r = 8% = 0.08, and t = 5 (years). Let’s grind out some values for FVIF = (1 + k)^n for different
values of m = number of compounding periods per year. The results are in Table 2.1. (This was done on Maple.
Calculators would work for the first few, but very rapidly would lose accuracy. Try it! I set Digits:=20; in order to keep
enough accuracy.)
It looks like there is a limit (even without the last line), and it is neither infinite (which the exponent tries to make it)
nor 1 (which the base would have). We need to explore this further.
m FVIF
1 1.469328
4 1.485947
12 1.489846
360 1.491758
365 1.491759
1000 1.491801
10000 1.491822
100000 1.491824
1000000 1.491825
1000000000 1.491825
∞ 1.491825
Table 2.1: Values of an FVIF as m changes.
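For readers without Maple, the finite rows of Table 2.1 can be reproduced with the Python sketch below. Ordinary floating point runs into exactly the accuracy trouble mentioned above when m is huge, so this sketch evaluates (1 + r/m)^(mt) as exp(mt · ln(1 + r/m)) using math.log1p; that workaround is a choice made for this illustration, not the Digits:=20 setting used in the Maple run.

```python
import math

def compound_fvif(r, m, t):
    """(1 + r/m)**(m*t), evaluated through log1p to limit round-off for huge m."""
    return math.exp(m * t * math.log1p(r / m))

r, t = 0.08, 5
periods = (1, 4, 12, 360, 365, 1000, 10**4, 10**5, 10**6, 10**9)
rows = [(m, compound_fvif(r, m, t)) for m in periods]

print(f"{'m':>12}  FVIF")
for m, fvif in rows:
    print(f"{m:>12d}  {fvif:.6f}")
```

The values climb steadily and then stall, which is the numerical hint that a finite limit is hiding behind the 1^∞.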
2.2.2 Indeterminate forms.

An indeterminate form comes from a limit whose value is not automatically determined when you plug in values.
Dudley: My indeterminate form has been greatly improved by dieting and exercise.
Mugsy: I thought you weren’t going to start the pun-war.
Dudley: You’re the one who started it.
Albert: (whining) M-O-M! He hit me first!
Like 0/0, 1^∞ can be anything.

With an indeterminate form, there is a tug-of-war between different parts of the limit. With 0/0, the top tries to make the
value 0 (since 0/anything ought to be 0), while the bottom tries to make the value infinite (since anything/0 blows up on
your calculator). A 0/0 limit can be viewed as somewhat indecisive, even schizophrenic, being torn in two directions at
once. The limit in each case represents the final balance between those two forces.
Albert: The mind boggles at the theological analogies.
Mugsy: If this is your entry into the pun-war, it’s mighty lame.
Albert: No. I’m being serious. There is the analogy of free will versus the sovereignty of God. There is the analogy of
sin in the believer versus the holiness of God. There is the analogy of the flesh versus the Spirit in the carnal Christian.
Need I go on?
Mugsy: No. I’m afraid you could.
Albert: However, fortunately for the Christian, it’s not an equal battle. God wins.
Why is 1^∞ indeterminate? Let’s look at the two different parts in isolation from each other. If we let the exponent be fixed,
and slide the base toward 1, we do indeed get 1 as the limit. On the other hand, if we fix the base at 1 + really tiny number,
and let the exponent get huge, then the value blows up to infinity. Now you can see the schizophrenic nature of 1^∞ more
clearly. That’s why 1^∞ is indeterminate.
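The tug-of-war can be watched numerically. In the Python sketch below, all three expressions have a base heading to 1 and an exponent heading to ∞, yet they end up in three different places. The three expressions are illustrations chosen here, not examples from the lecture, and the helper power is a hypothetical name.

```python
import math

def power(base_minus_one, exponent):
    """Compute (1 + base_minus_one)**exponent by way of logarithms."""
    return math.exp(exponent * math.log1p(base_minus_one))

print("   n   (1+1/n^2)^n   (1+1/n)^n   (1+1/n)^(n^2)")
for n in (10, 100, 500):
    base_wins = power(1 / n**2, n)       # base rushes to 1 faster: tends to 1
    tie = power(1 / n, n)                # evenly matched: tends to e
    exponent_wins = power(1 / n, n**2)   # exponent grows faster: blows up
    print(f"{n:4d}   {base_wins:.6f}      {tie:.6f}    {exponent_wins:.3e}")
```

Which side wins depends on how fast the base approaches 1 compared with how fast the exponent grows, and that is precisely what "indeterminate" means.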
Other varieties of indeterminate forms.

Once you have the idea, you can rummage around for other indeterminate forms. The other common ones (besides 0/0 and
1^∞) are 0 × ∞, 0^0, ∞ − ∞, and ∞/∞. In the homework, you are asked to show why these are indeterminate. There are also
some similar-looking forms that are not indeterminate. Those will also be covered in the homework.
Perhaps the best of these for illustration is 0^0. It is very clearly indeterminate: 0^x = 0 for any positive x, but x^0 = 1 for
any nonzero x. So, is 0^0 equal to 0 or to 1? It might not be either in the limit!
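A short numerical experiment shows all three outcomes. In this Python sketch, each expression has a base and an exponent that both head to 0 as x shrinks toward 0 from the right; the third expression is an illustration cooked up here to land on a value that is neither 0 nor 1.

```python
import math

def zero_to_zero_samples(x):
    """Three expressions that look like 0^0 as x -> 0 from the right."""
    fixed_base = 0.0 ** x            # base is exactly 0: the value is 0
    both_shrink = x ** x             # base and exponent shrink together: tends to 1
    tuned = math.exp(-1 / x) ** x    # base e^(-1/x) -> 0, but the power stays at 1/e
    return fixed_base, both_shrink, tuned

for x in (0.5, 0.1, 0.01):
    fixed_base, both_shrink, tuned = zero_to_zero_samples(x)
    print(f"x = {x:<5} {fixed_base:.6f}  {both_shrink:.6f}  {tuned:.6f}")
```

Do not push x much smaller in the third expression: e^(-1/x) underflows to 0 in floating point, and the machine then reports 0 for the wrong reason.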
Cure: L’Hôpital’s rule.

Indeterminate forms can be cured most easily by L’Hôpital’s rule.
Albert: Since L’Hôpital is French for “the hospital,” I think there’s been another entry into the pun-war.
However, it only works with certain indeterminate forms; the others have to be algebraically maneuvered into one of those
forms. We’ll see how.
Using derivatives to evaluate limits. We encountered an indeterminate form (namely 0/0) trying to define derivatives.
Now that we have conquered derivatives, they turn around and help us evaluate indeterminate forms.
There are a variety of ways to handle indeterminate forms, but L’Hôpital’s rule is the best. And with the 1^∞ form,
L’Hôpital’s rule is basically your only hope.
Statement of rule. L’Hôpital’s rule says this:
If f(c) = g(c) = 0, then
\[ \lim_{x\to c} \frac{f(x)}{g(x)} = \lim_{x\to c} \frac{f'(x)}{g'(x)}, \]
provided the second limit exists.
Notice that the conditions you need to be able to use L’Hôpital’s rule are precisely the ones that you began to dread in
limits: when both the top and the bottom were zero when you plugged in the limiting value of the variable. So, factoring
the top and the bottom (the big problem that cropped up when the top or bottom wasn’t a polynomial) is not necessary. You
do have to differentiate, however.
Mugsy: Have we gained anything?
Albert: Let me put it this way. Would you rather factor a polynomial or differentiate it?
Mugsy: No question. I’d rather differentiate it.
Albert: There’s one gain. All that’s needed for L’Hôpital’s rule is differentiation. Another gain is that L’Hôpital’s rule
also works in cases that aren’t polynomials, where factoring isn’t even a possibility. Remember the messes we got into
when we were trying to find the derivatives of sin x and ln x?
Mugsy: Ugh. What a pain that was. Why didn’t the brilliant author put L’Hôpital’s rule early enough to be able to
use it then, and spare us (and him) that mess?
Albert: You need to do limits first, and then get derivatives, and then get L’Hôpital’s rule, which is what we did.
It would be hard to do derivatives first, and then do limits with L’Hôpital’s rule. Explaining derivatives without using
limit-like language–tangent lines, instantaneous velocities, and so on–means you can’t explain most of what derivatives
mean.
The rule also holds if f(x) and g(x) both go to infinity as x approaches c, and it also works in the case that c is infinity.
Zero and infinity are actually quite close.
Dudley: WHAT? Now I know he’s flipped.
Albert: He has a point. Do you want me to convince you?
Dudley: Will it hurt?
Mugsy: Can I leave until you’re done?
Albert: Not much and yes.
Mugsy: (leaving) Be back shortly.
Albert: Ok, Dudley. Suppose you have an infinite number of golf balls, numbered 1, 2, 3, 4, etc. At one hour before
noon, you put the balls numbered 1 to 10 in a row on the ground, and Mugsy takes the ball numbered 1. At a half
hour before noon, you continue the row by placing the balls numbered 11 to 20, and Mugsy takes the ball numbered
2. At one-third of an hour before noon, you continue the row by placing the balls numbered 21 to 30, and Mugsy
takes the ball numbered 3. Do you see the pattern? What’s next?
Dudley: At one-quarter of an hour before noon, I put the balls numbered 31 to 40 in the row, and Mugsy removes the
ball numbered 4.
Albert: Right. Now, as you move closer to noon, there is a major flurry of activity. When the dust settles, how many
balls are left in the row? Well, at one hour before noon, there are 9 balls, and at a half-hour before noon, there are
2 × 9 = 18 balls, and at a third of an hour before noon, there are 3 × 9 = 27 balls, and so on. You gain 9 golf balls in
the row, since you add 10 and Mugsy removes only one. How many golf balls will there be in the row once noon has
passed?
Dudley: No problem. An infinite number of them.
Albert: Really? Is the golf ball numbered 1 still in the row?
Dudley: No. Mugsy removed it at one hour before noon.
Albert: Is the golf ball numbered 2 still in the row?
Dudley: No. Mugsy removed it at a half hour before noon.
Albert: Is the golf ball numbered 3 still in the row?
Dudley: No. Mugsy removed it at a third of an hour before noon.
Albert: Is the golf ball numbered 250 still in the row?
Dudley: Hmm. I’m beginning to get this funny feeling. No. Mugsy removed that one, too, at 1/250 of an hour before
noon.
Albert: Well, if there are an infinite number of golf balls in the row, there certainly is at least one. What number does
it have?
Dudley: AUGH! Whatever number I name, Mugsy removed it at some time. You mean there aren’t any golf balls left?
Albert: Yup. That’s how close infinity is to zero.
Mugsy: (returning) You guys done yet?
Dudley: How dare you swipe all my golf balls, you cad!
Mugsy: Should I go away for a while longer?
On the other hand, L’Hôpital’s rule works only in the cases of 0/0 and ∞/∞; other indeterminate forms must be put into
one of those two forms. The methods of doing that are important, especially since 1^∞ is not in the form 0/0 or ∞/∞.
You can’t use L’Hôpital’s rule unless the indeterminate form is 0/0 or ∞/∞.
Other indeterminate forms do not have anything to correspond to this.

Note that the form required is a quotient, but that you do not use the quotient rule from derivatives in L’Hôpital’s rule.
(We don’t want the derivative of the expression!) In fact, this is the combination of derivatives some of you would have
wanted instead of the quotient rule. And, please don’t mess up the quotient rule after having seen this!
Once you have differentiated the top and bottom, you treat the resulting quotient as a new limit. The first thing you do
is plug x = c into it, and see what happens. Usually, you’ll get the answer right then.
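As a one-step illustration (an example picked here for its simplicity, not one from the lecture), take f(x) = e^x − 1 and g(x) = x with c = 0. Both are zero at x = 0, so the rule applies. Differentiate the top and the bottom separately, then plug in x = 0:

```latex
\[
\lim_{x\to 0}\frac{e^{x}-1}{x}
  = \text{``}\tfrac{0}{0}\text{''}
  = \lim_{x\to 0}\frac{e^{x}}{1}
  = \frac{e^{0}}{1}
  = 1 .
\]
```

The new quotient is no longer 0/0, so the answer appears as soon as the limiting value is plugged in.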
Why the rule works. The easiest way to see what is going on is to use parametric equations. That means changing
independent variables from x to t, so that we are trying to evaluate
\[ \lim_{t\to c} \frac{f(t)}{g(t)} \]
We set up x = g(t), and y = f(t), which looks backwards, but isn’t. We also assume f(c) = g(c) = 0. The slope of the
secant line at the origin is
\begin{align}
\frac{\Delta y}{\Delta x} &= \frac{f(t) - f(c)}{g(t) - g(c)} \tag{2.4}\\
  &= \frac{f(t)}{g(t)} \tag{2.5}
\end{align}
(This is why I wanted things "backwards". It puts the f(t) on top and the g(t) on the bottom.) The limit as t approaches c
gives two things: the limit we are looking for, and the slope of the tangent line at the origin. The slope of the tangent line
at the origin comes from the chain rule (the most important rule in calculus), and is
\begin{align}
\frac{dy}{dx} &= \frac{dy/dt}{dx/dt} \tag{2.6}\\
  &= \frac{f'(t)}{g'(t)} \quad \text{evaluated at } t = c \tag{2.7}
\end{align}
which is essentially how the second limit in L’Hôpital’s rule is evaluated. So, the limit we want is this quotient of derivatives.
Actually, there are some holes in this “proof,” but this conveys the ideas better than the full-blown rigorous proof. It is
intended to convey one important fact. If either f(c) or g(c) is not 0, the slope of the secant line is not f(t)/g(t) any more,
and the whole thing falls apart. Remember: you have to have 0/0 (or ∞/∞, but this explanation doesn’t show it) in order to
use L’Hôpital’s rule correctly.
Values of various functions at certain points. It is handy to have an idea of what happens to functions that give either
0, ∞ or −∞. Table 2.2 gives a short list. The notation “d.n.e.” means that the limit does not exist. For 1/x^n
with n odd, the function tries to go to ∞ for values of x near, but greater than, 0, while the function tries to go to −∞ for
values of x near, but less than, 0. The “d.n.e.” in ln x is there because ln x is not defined for values of x less than or equal to
0. Pushing x towards −∞ in ln x can’t happen.
Function        x → −∞   x → 0    x → ∞
x^n, n odd      −∞       0        ∞
x^n, n even     ∞        0        ∞
1/x^n, n odd    0        d.n.e.   0
1/x^n, n even   0        ∞        0
e^x             0        1        ∞
ln x            d.n.e.   −∞       ∞
Table 2.2: Values of various functions at 0 and ±∞
Rational functions and limits to ±∞. When you have a rational function (the quotient of two polynomials) in x (that is,
with independent variable x), there is a fast way to determine the limit as x → ∞ or x → −∞. A similar trick can be used on
other functions, and we’ll see how soon.
The basic idea is to focus only on the terms that are growing the fastest as x → ±∞. For a polynomial, those are easy to
find. Just look for the highest degree term in the numerator (top) and denominator (bottom). Throw away all the rest of the
terms in both the top and bottom, reduce what’s left, and the limit of that gives the answer to the original limit. (This needs
to be done with care sometimes, and we’ll see when and how.)
Be careful not to use this method unless the limit is to ∞ or −∞! That’s the only time you can ignore everything but the
highest-degree terms in the top and bottom. This is fast and simple. Examples will be given in class.
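The leading-term idea is mechanical enough to write down as a short routine. The Python sketch below handles only x → +∞ and assumes each polynomial is given as a list of coefficients, highest degree first, with a nonzero leading coefficient; the function name and the sample quotients are made up for this illustration.

```python
import math

def limit_at_infinity(num, den):
    """Limit of num(x)/den(x) as x -> +infinity, using only the leading terms.

    num and den are coefficient lists, highest degree first, with a
    nonzero first entry (an assumption of this sketch).
    """
    deg_num = len(num) - 1
    deg_den = len(den) - 1
    ratio = num[0] / den[0]    # quotient of the leading coefficients
    if deg_num == deg_den:
        return ratio           # the powers of x cancel completely
    if deg_num < deg_den:
        return 0.0             # leftover powers of x sit on the bottom
    # Leftover powers of x sit on top: infinite, with the sign of the ratio.
    return math.copysign(math.inf, ratio)

print(limit_at_infinity([2, 0, 1], [5, -3, 0]))    # (2x^2 + 1)/(5x^2 - 3x)
print(limit_at_infinity([1, 1], [1, 0, 0]))        # (x + 1)/x^2
print(limit_at_infinity([1, 0, 0, 0], [-1, 1]))    # x^3/(1 - x)
```

The third case shows where the care comes in: when the top has the higher degree, the sign of the leading coefficients decides between ∞ and −∞. For x → −∞ the parity of the leftover power matters as well, which this sketch does not handle.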
Repeated uses of L’Hôpital’s rule. Sometimes after using L’Hôpital’s rule, you still wind up with 0/0 or ∞/∞.
Mugsy: Why doesn’t that surprise me?
In that case, try the rule again! You can continue to use L’Hôpital’s rule as long as you keep getting 0/0 or ∞/∞. But one
note of caution. It becomes so easy to keep applying it, that you sometimes forget to keep checking that you have the right
form for continuing. As soon as you get something else, stop! You can get the answer right there.
Dudley: You mean that people actually keep differentiating beyond the proper stopping point?
Albert: Exactly. This is an unfortunately common occurrence.
Dudley: Why?
Albert: That’s hard to say because there are a number of different reasons, I suspect.
Mugsy: Of course, you always stop at the right point, and have a hard time understanding why someone else wouldn’t.
Albert: Only partly true. It is almost always simpler to differentiate than factor a polynomial, which is why L’Hôpital’s
rule is handy then. But it is often easier to differentiate than to plug numbers in, too. In that case, people will simply
keep differentiating until it looks reasonably simple to plug things in, but by that point, they have gone too far. At
least, that’s my theory. But continued use of L’Hôpital’s rule requires alternating between two things—differentiation
and substitution—and not just differentiation alone.
The procedure I just gave for finding the limits of rational functions as x → ∞ is based on repeated applications of
L’Hôpital’s rule. This is looked at briefly in the homework.
Example:
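\begin{align}
\lim_{x\to 0} \frac{x \sin x}{1 - \cos x} &= \text{``}\tfrac{0}{0}\text{''} \tag{2.8}\\
  &= \lim_{x\to 0} \frac{(1 \times \sin x) + x(\cos x)}{\sin x} \tag{2.9}\\
  &= \text{``}\tfrac{0}{0}\text{''} \tag{2.10}\\
  &= \lim_{x\to 0} \frac{\cos x + \left(1 \times \cos x + x(-\sin x)\right)}{\cos x} \tag{2.11}\\
  &= \lim_{x\to 0} \frac{2 \cos x - x \sin x}{\cos x} \tag{2.12}\\
  &= 2/1 = 2 \tag{2.13}
\end{align}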
Note that each time before applying L’Hôpital’s rule, you have to check that you still have 0/0.
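A numerical sanity check never hurts. This Python sketch simply evaluates the original quotient at points closing in on 0; it is a check on the answer, not a replacement for the rule.

```python
import math

def quotient(x):
    """The original expression x*sin(x) / (1 - cos(x))."""
    return x * math.sin(x) / (1 - math.cos(x))

for x in (0.5, 0.1, 0.01, 0.001):
    print(f"x = {x:<6} quotient = {quotient(x):.8f}")
```

The values settle toward 2. Going to much smaller x is not advisable: 1 − cos x subtracts two nearly equal numbers, and floating point loses accuracy there.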
We will work the following limits in class:
$\lim_{x\to -2} \frac{x^3 + x + 10}{5x^2 + 13x + 6}$
$\lim_{z\to 3} \frac{10z^2 - 21z - 27}{2z^2 - z - 15}$
$\lim_{x\to 1} \frac{4x^2 - 7x + 3}{3x^2 + 5x - 2}$
$\lim_{y\to 0} \frac{e^{2y} - 2y - 1}{\sinh y - \operatorname{Arctan} y}$
$\lim_{x\to\infty} \frac{4x^3 - 7x^2 + 5x - 10}{7x^3 + 4x + 200}$
$\lim_{r\to 0} (r \ln r)$
Working limits to infinity on Maple.

I noted earlier that Maple takes limits as variables go to infinity, too. For that, you simply use infinity as the limit value.
For example, here’s how I did the limit above.
> FVIF := (1+k)^n:
> k := r/m:
> n := m*t:
> r := 0.08:
> t := 5:
> limit(FVIF, m=infinity);
1.491824698
Note the use of colons (:) to suppress the printout of values I already knew.
Homework #21
Exercises.
1. Find the following limits. You can use any (legitimate) method you want. You can check your answers on Maple (as
usual).
(a) $\lim_{y\to 3} \frac{y^2 - 5y + 6}{y^2 + y - 12}$
(b) $\lim_{y\to -4} \frac{y^2 - 5y + 6}{y^2 + y - 12}$
(c) $\lim_{y\to 2} \frac{y^2 - 5y + 6}{y^2 + y - 12}$
(d) $\lim_{y\to\infty} \frac{y^2 - 5y + 6}{y^2 + y - 12}$
(e) $\lim_{x\to\infty} \frac{4 + x^2}{1 - x^3}$
2. Find the following limits. You can use any (legitimate) method you want.
(a) $\lim_{y\to -1} \frac{2y^2 - y - 3}{3y^2 + 2y - 1}$
(b) $\lim_{y\to 1/3} \frac{2y^2 - y - 3}{3y^2 + 2y - 1}$
(c) $\lim_{y\to 3/2} \frac{2y^2 - y - 3}{3y^2 + 2y - 1}$
(d) $\lim_{y\to\infty} \frac{2y^2 - y - 3}{3y^2 + 2y - 1}$
(e) $\lim_{x\to\infty} \frac{6 + x^5}{3 - x^2}$
3. Make up your own limits, and solve them. (This is excellent practice, and will give you some real insight into what
makes a good limit problem. Include some that use L’Hôpital’s rule, and some others where the limit variable goes
to infinity.) Usual rules apply (i.e., three of them will count).
Problems.
1. In this problem, we look at various indeterminate forms, and some impostors.
(a) Show that 0 × ∞, 0^0, ∞ − ∞, and ∞/∞ are indeterminate forms. Do this by showing that in each of them, one
part is trying to make the expression go to one value, while simultaneously, the other part is trying to make it
go to a different value.
(b) Show that other possible forms are not indeterminate: ∞ + ∞, ∞ × ∞, ∞^∞, and 0^∞. Do this by showing that
anything with that form really must approach a specific limit (even if it is ∞). (Caution: ∞^∞ and 0^∞ approach
different limits. Why are they different?)
2. This problem will look at
\[ \lim_{x\to\infty} \frac{3x^4 - 5x - 7}{4x^4 + 2x^2 + 6} \]
both algebraically and numerically.
(a) Evaluate the limit using the approach from lecture for rational functions.
(b) Plug x = 10^3 into the top and bottom polynomials, and evaluate the quotient to 10 decimal places. Plug the
same value into just the highest-degree terms in the top and bottom and evaluate the quotient to 10 decimal
places. Comparing the two quotients, can you see why you can ignore all but the highest-degree terms in the
top and bottom?
(c) Show that the value of the original limit can be obtained from L’Hôpital’s rule applied four times.
3. A blind application of L’Hôpital’s rule can occasionally get you into problems. This problem looks at that situation.
We’ll look at
\[ \lim_{x\to\infty} \frac{x + \sin x}{3x} \]
simplistically and then using some needed insight.
(a) If you try to plug in x going to ∞, you get ∞/∞. L’Hôpital’s rule applies. What do you get by applying it?
(b) Does this next limit exist, and why or why not?
(c) What are the relative sizes of x and sin x as x gets huge? Applying the logic used for rational functions, what is
the new, simplified quotient you take the limit of? What is that limit? [Moral of the problem: Apply common
sense before L’Hôpital’s rule.]
4. Try the formula in L’Hôpital’s rule twice more on
\[ \lim_{x\to 0} \frac{2 \cos x - x \sin x}{\cos x} \]
even though it should not be used. (This limit was an example in the lecture notes.) Do you get the correct answer?
What’s the moral of this problem?
2.2.3 Return to the problem.

Now that we have L’Hôpital’s rule for ammunition, we come back to finding the limit of the FVIF for compound interest
when m → ∞. This is still not trivial, because 1^∞ is not in the form 0/0 or ∞/∞.
Solution of the problem.

Of course, Maple can do it, but that won’t help you understand how to do it. The trick is to use logarithms to pull the
exponent down. That is the main purpose of logarithms, remember? Here’s how.
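\begin{align}
\mathrm{FVIF} &= \lim_{m\to\infty} (1 + k)^n \tag{2.14}\\
  &= \lim_{m\to\infty} (1 + r/m)^{mt} \tag{2.15}\\
\ln(\mathrm{FVIF}) &= \lim_{m\to\infty} \ln\left[(1 + r/m)^{mt}\right] \tag{2.16}\\
  &= \lim_{m\to\infty} mt \, \ln(1 + r/m) \tag{2.17}
\end{align}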
So far, so good. But plugging in m → ∞ still gives the form 0 × ∞, since ln(1) = 0. That’s a start, but not 0/0 or ∞/∞.
We can convert this limit to the right form by algebra: Invert one term and divide by it (standard trick when dealing with
0 × ∞—we did it in an example before the last homework set). The term to put on the bottom is the simplest one; in this
case the mt. So we proceed:

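\begin{align}
\ln(\mathrm{FVIF}) &= \lim_{m\to\infty} mt \, \ln(1 + r/m) \tag{2.18}\\
  &= \lim_{m\to\infty} \frac{\ln(1 + r/m)}{1/(mt)} \tag{2.19}\\
  &= \text{``}\tfrac{0}{0}\text{''} \tag{2.20}\\
  &= \lim_{m\to\infty} \frac{\frac{1}{1 + r/m} \times (-r m^{-2})}{(1/t) \times (-m^{-2})} \tag{2.21}\\
  &= \lim_{m\to\infty} \frac{\frac{1}{1 + r/m} \times (r)}{(1/t)} \tag{2.22}\\
  &= \lim_{m\to\infty} \frac{rt}{1 + r/m} \tag{2.23}\\
  &= \frac{rt}{1} \tag{2.24}\\
  &= rt \tag{2.25}\\
\mathrm{FVIF} &= e^{rt} \tag{2.26}
\end{align}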
This, finally, is the answer. This says that the value of the money in the account with continuously compounded interest
will be P × FVIF = P e^{rt}.
Note that this is exactly the same thing that we got back in the differentials section, although we used k for the interest
rate there rather than r. Our formula there was y(t) = y_0 e^{kt}, where y_0 was the initial amount of money (what we have been
calling P, the principal), and k was the interest rate (that we are now calling r, with k now having a completely different
meaning than it had then). Translated into our terms, that means the value will be P e^{rt}. This equation is not new at all!
Mugsy: I feel cheated. We already knew the answer. Why did we go through all this mess to end up almost at the
beginning?
Albert: We have picked up some valuable information along the way, specifically L’Hôpital’s rule.
In the example earlier with 8% interest compounded continuously for 5 years, the formula here gives the FVIF =
exp(0.08 × 5) = exp(0.4) = 1.491824698, exactly what Maple got for the limit. In fact, if you omit the steps where you
give values to r and t in the Maple session I listed earlier, Maple will still take the limit, and give exp(r t) for its answer.
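The same comparison can be made outside Maple. This Python sketch evaluates e^{rt} for the 8%, 5-year example and sets it beside ordinary compounding; the function names are labels chosen for this illustration.

```python
import math

def continuous_fvif(r, t):
    """FVIF for continuous compounding: e**(r*t)."""
    return math.exp(r * t)

def compound_fvif(r, m, t):
    """FVIF for compounding m times per year: (1 + r/m)**(m*t)."""
    return (1 + r / m) ** (m * t)

r, t = 0.08, 5
limit_value = continuous_fvif(r, t)

print(f"continuous: {limit_value:.9f}")
for m in (12, 360):
    print(f"m = {m:3d}:    {compound_fvif(r, m, t):.9f}")
```

Monthly and daily compounding both fall a little short of the continuous value, with daily much the closer of the two.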
Investigation of exponential growth.

The general function e^{kt} is called an exponential, and is said to grow exponentially. We want to look at exponential growth.
Benjamin Franklin’s will. Benjamin Franklin put this in an appendix (technically called a codicil) to his will:
I wish to be useful after my Death, if possible, in forming and advancing other young men that may be serviceable
to their Country in both Boston and Philadelphia. To this end I devote Two thousand Pounds Sterling,
which I give, one thousand thereof to the Inhabitants of the Town of Boston in Massachusetts, and the other
thousand to the inhabitants of the City of Philadelphia, in Trust and for the Uses, Interests and Purposes hereinafter
mentioned and declared.
The money was to be lent out at 5% interest, and each borrower was supposed to repay annually both the interest to date
and one-tenth of the principal, which would then be used to loan to other borrowers. He then continues:
If this plan is executed and succeeds as projected without interruption for one hundred Years, the Sum will
be then one hundred and thirty-one thousand Pounds of which I would have the Managers of the Donation to
the Inhabitants of the Town of Boston, then lay out at their discretion one hundred thousand Pounds in Public
Works.... The remaining thirty-one thousand Pounds, I would have continued to be let out on Interest in the
manner above described for another one hundred Years.... At the end of this second term if no unfortunate
accident has prevented the operation the sum will be Four Million and Sixty-one Thousand Pounds.
The Franklin Technical Institute of Boston owes its existence to this money.
Mugsy: Try that today, and you’d get ripped off something fierce.
Comparison to polynomials; orders of growth. We have already seen how a linear function (simple interest) is no
match for an exponential (compound interest). But this continues. No polynomial grows as fast as any growing exponential.
(Exponentials can also decay.)
For example, e^x doesn’t catch up to 100 x^100 until about x = 652.72, but after that, the exponential leaves the polynomial
way behind.
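The crossover point can be located numerically. Both sides are far too large to compare directly in floating point, so this Python sketch compares their logarithms instead and homes in on the crossing by bisection. The starting bracket of 100 to 1000 is an assumption made for this sketch; on that interval the difference of the logarithms only increases, so there is exactly one crossing to find.

```python
import math

def log_gap(x):
    """ln(e**x) - ln(100 * x**100); positive once the exponential is ahead."""
    return x - (math.log(100) + 100 * math.log(x))

lo, hi = 100.0, 1000.0    # log_gap(lo) < 0 < log_gap(hi)
for _ in range(60):       # each pass halves the bracket
    mid = (lo + hi) / 2
    if log_gap(mid) < 0:
        lo = mid          # polynomial still ahead: crossing is to the right
    else:
        hi = mid          # exponential ahead: crossing is to the left
crossover = (lo + hi) / 2

print(f"e^x overtakes 100 x^100 near x = {crossover:.2f}")
```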
There is a hierarchy of growth of functions. We’ll hit it now (since it is relevant to limits as x → ∞), and later (where
we will use it in other settings). It is used in a manner parallel to what we did for rational functions: to determine the
fastest-growing term in the top and bottom. In this case, the fastest-growing term is the one that contains the term that is
lowest in the following list:
Slow   sine, cosine, constants
       logarithms
       polynomials (sorted by degree)
       exponentials (sorted by size of exponent)
Fast   factorials
This says, for example, that a polynomial grows faster than a logarithm, and an exponential grows faster than a polynomial.
Right now, we aren’t going to use factorials. (You don’t even have to know what a factorial is at this point. We’ll learn
later.)
How do you use this? Suppose you have a limit such as
\[ \lim_{x\to\infty} \frac{8 \ln x + 4x^3 + 3e^x}{3 \sin x + 5x^6 - 2e^x} \]
The way you evaluate it is to discard everything but the fastest-growing terms in the top and bottom, which in this case are
3e^x on the top and −2e^x on the bottom. (You can tell they are the fastest because they occur farther down the list than any
other terms.) So, this limit has a value equal to
\begin{align}
\lim_{x\to\infty} \frac{8 \ln x + 4x^3 + 3e^x}{3 \sin x + 5x^6 - 2e^x} &= \lim_{x\to\infty} \frac{3e^x}{-2e^x} \tag{2.27}\\
  &= \lim_{x\to\infty} \frac{3}{-2} \tag{2.28}\\
  &= -\frac{3}{2} \tag{2.29}
\end{align}
Again, a note of caution is in order. This procedure of discarding terms can only be used when you are taking limits to
±∞. Other limits require you to keep all the terms.
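To see the discarded terms fade away, this Python sketch evaluates the full quotient from the example at a few increasingly large values of x (the sample points are arbitrary choices made here).

```python
import math

def quotient(x):
    """The full expression, with every term kept."""
    top = 8 * math.log(x) + 4 * x**3 + 3 * math.exp(x)
    bottom = 3 * math.sin(x) + 5 * x**6 - 2 * math.exp(x)
    return top / bottom

for x in (10, 30, 50, 100):
    print(f"x = {x:3d}   quotient = {quotient(x):.10f}")
```

At x = 10 the polynomial terms still dominate and the quotient looks nothing like −3/2. By x = 100 the exponentials have taken over completely. That is why the shortcut is only good for limits to ±∞.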
Regular compounding versus continuous compounding.

Is there any real difference between regular compounding and continuous compounding? Yes, but sometimes not much.
Compounding period and its effects. The length of the compounding period has an effect on the amount of interest
paid. This can be seen by realizing that simple interest is basically compound interest with a single compounding period.
Basically, for a fixed length of time, the shorter the compounding period, the more interest accumulates. That’s why
some banks proudly display (or used to; it’s fallen out of practice) that their savings accounts compound continuously.
You can’t get more frequent than that! For a specific interest rate and time period, continuous compounding produces
the maximum interest. Semi-annual, quarterly, monthly, and daily compounding would produce increasing amounts of
interest. (You can see this effect if you go back to a table I listed earlier from Maple where I worked out the FVIF for
various different compounding periods.) An exercise in the homework asks you to work out some different numbers.
But one thing you will notice in the homework is that the difference between daily and continuous compounding is
virtually invisible. In real life, this still holds. But there is one catch. Many banks (including the one I use) will compound
interest daily, but only pay it monthly. That is, unless you leave the money in until the end of the month, they won’t give
you any interest on it at all!
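Here is a small sketch of the effect of the compounding period, using the usual factors $(1 + r/m)^{mt}$ for m compounding periods per year and $e^{rt}$ for continuous compounding. The rate of 6% and the time of 10 years are hypothetical values picked only for illustration, and the function names are mine.

```python
import math

def fvif(r, t, m):
    # Future value interest factor with m compounding periods per year
    return (1 + r / m) ** (m * t)

def fvif_continuous(r, t):
    # Future value interest factor with continuous compounding
    return math.exp(r * t)

r, t = 0.06, 10  # hypothetical nominal rate and number of years
schedules = [("annual", 1), ("semi-annual", 2), ("quarterly", 4),
             ("monthly", 12), ("daily", 365)]
factors = [fvif(r, t, m) for _, m in schedules]

for (name, _), value in zip(schedules, factors):
    print(f"{name:12s} {value:.6f}")
print(f"{'continuous':12s} {fvif_continuous(r, t):.6f}")
```

The factors rise as the period shrinks, and the gap between daily and continuous compounding shows up only around the fourth or fifth decimal place.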
Effective annual rate (versus nominal rate). Sometimes it is exceedingly difficult to compare varieties of interest
schemes. Is 5.2% compounded monthly better or worse than 5.1% compounded continuously? The smaller rate (5.1%
versus 5.2%) would be offset to some extent by the more frequent compounding (continuously versus monthly).
There is a standard way to compare compounding schemes, called effective annual interest rates or annual percentage
rate (APR). You will see these on credit card applications, for example. The rate that is given to you is called the nominal
rate (at least in old-time textbooks), and that’s r.
The basic idea is simple. If you use the nominal rate for one year, what is the equivalent simple interest rate? Since
simple interest rates are easy to compare (the larger, the more interest), this gives an easy way to compare different interest
and compounding schemes. If you use the formulas, it isn’t hard to come up with the effective rate. If $r_{\text{eff}}$ is the effective
rate, the FVIF of the simple interest with that rate should equal the FVIF of the compound interest with the nominal rate.
Doing this with $t = 1$, that is, for one year (so $n = m$), and regular compounding gives:
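\begin{align}
\text{Simple interest FVIF} &= \text{Compound interest FVIF} \tag{2.30} \\
1 + r_{\text{eff}} &= \left(1 + \frac{r}{m}\right)^{m(1)} \tag{2.31} \\
r_{\text{eff}} &= \left(1 + \frac{r}{m}\right)^{m} - 1 \tag{2.32}
\end{align}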
and for continuous compounding, you get (again, with t = 1):
\begin{align}
\text{Simple interest FVIF} &= \text{Compound interest FVIF} \tag{2.33} \\
1 + r_{\text{eff}} &= e^{r(1)} \tag{2.34} \\
r_{\text{eff}} &= e^{r} - 1 \tag{2.35}
\end{align}
This makes it easy to compare different rates with different compounding periods. In the example I gave, 5.2% compounded
monthly has an effective rate of
\[ \left(1 + \frac{0.052}{12}\right)^{12} - 1 = 0.05326 = 5.326\% \]
while the 5.1% compounded continuously has an effective rate of
\[ e^{0.051} - 1 = 0.05232 = 5.232\% \]
which shows that the 5.2% compounded monthly is better (if you want more interest).
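The same comparison can be scripted so that other rates are easy to try. This is a minimal sketch of formulas (2.32) and (2.35); the function names are my own.

```python
import math

def effective_rate(r, m):
    # Formula (2.32): nominal rate r compounded m times per year
    return (1 + r / m) ** m - 1

def effective_rate_continuous(r):
    # Formula (2.35): nominal rate r compounded continuously
    return math.exp(r) - 1

monthly = effective_rate(0.052, 12)
continuous = effective_rate_continuous(0.051)
print(f"5.2% monthly:      {monthly:.5f}")
print(f"5.1% continuously: {continuous:.5f}")
print("better for the saver:", "monthly" if monthly > continuous else "continuous")
```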
Homework #22
Exercises.
1. Find the new balances on an account with different interest methods. Assume a principal of $10,000, an interest rate
of 6%, and time of 15 years. What is the new balance assuming:
(a) Simple interest?
(b) Interest compounded annually?
(c) Interest compounded quarterly?
(d) Interest compounded monthly?
(e) Interest compounded continuously?
2. Find the new balances on an account with different interest methods. Assume a principal of $10,000, an interest rate
of 9%, and time of 10 years. What is the new balance assuming:
(a) Simple interest?
(b) Interest compounded annually?
(c) Interest compounded quarterly?
(d) Interest compounded monthly?
(e) Interest compounded continuously?
3. Find the effective interest on accounts with the following rates and periods.
(a) 3.8% compounded monthly
(b) 7.3% compounded continuously
4. Find the effective interest on accounts with the following rates and periods.
(a) 5.3% compounded quarterly

(b) 9.2% compounded continuously
Problems.
1. This problem works out the difference between simple and compound interest schemes for short and long time
periods. Take a nominal interest rate of 10% for both schemes and for the compound interest scheme use quarterly
compounding.
(a) Find the FVIF’s for simple and compound interest for t = 1, 2, 5, and 10 years.
(b) Find the ratio of the compound interest factors to the corresponding simple interest factors for those same time
periods. (That is, divide the compound interest factor by the simple interest factor.) Which interest scheme
would you prefer for your savings account?
(c) Find the FVIF’s for t = 50, 100, 200, and 500 years. (Don’t be surprised at very large numbers.)
(d) Find the ratio of the compound interest factors to the corresponding simple interest factors for those same time
periods. (Again, divide the compound interest factor by the simple interest factor.) Which would you prefer for
your savings account?
(“$100 placed at 7 percent interest compounded quarterly for 200 years will increase to more than $100,000,000—by
which time it will be worth nothing.” – Robert A. Heinlein)
2. The last problem showed you the drastic difference between simple and compound interest for long periods of time.
This problem shows that there is little difference for short periods of time.
(a) Find the equation of the tangent line to $y = e^{kx}$ at the point (0, 1). This is called the linearization of $e^{kx}$ around
(0, 1). Write the equation as y = (something).
(b) Since the function is closely approximated by its tangent line near the point of tangency, the function $y = e^{kx}$ and
the equation of its tangent line are approximately equal. Write out the approximation, namely $e^{kx} \approx$ (tangent
line formula from previous part).
(c) Change k’s to r’s and x’s to t’s in the previous approximation. What do you get?
(d) Convert the two sides in the approximation in the previous part to FVIF’s of different interest schemes. That is,
each of the sides in the previous equation should be FVIF’s for different interest schemes. Label each one.
(e) What does this say about the FVIF’s for simple and continuously compounded interest for small values of t?
2.3 Inventory control.

One not-too-contrived use of calculus in business is the matter of determining how big of an inventory to keep on hand.
There are two competing factors involved. Shipping costs go way up if you decide to keep a very small inventory because
you are constantly ordering more material. On the other hand, keeping the shipping costs way down means that you are
stockpiling your material, and that ties up your capital that might be earning more money in other places or ways. How do
you balance the two? That’s what we will explore.
2.3.1 Statement of problem.

We want to determine the order size.
We will pretend that the local McDonald’s has asked us to determine the most economical order size for hamburgers. We
have two costs to balance: storage and shipping. There has to be a balance that minimizes the sum of those costs. That’s
what we want to find. We will look for the optimum (fancy word for “best”) order size and rate.
Notation and terminology.

We need to set up what we are going to be doing. Here are the variables and what they mean.
Variable   Meaning
x          order size (in hamburgers)
n          number of hamburgers sold per year (estimated)
s          shipping cost to place one order
i          inventory cost per hamburger per year (this is not $\sqrt{-1}$)
C          total annual cost (which we want to minimize)
Equation derived.
Now we want to get the equation(s) that we will be working with.
Order cost per year. The order cost per year is (the number of times we order per year) × (the cost per order). This is
common sense. Think about it for a moment.
Mugsy: Before you make any comment, Dudley, I’ve already figured it out.
The number of times we order per year is (n/x), since that tells us how many orders need to be made to sell n hamburgers.
This also makes sense, but is not as obvious. Try numbers. If you sell 10,000 hamburgers and get shipments in batches
of 500 hamburgers, how many batches do you need? The answer is 20, since 10000/500 = 20.
We have already said that s is the shipping cost per order. We now have all the ingredients to get the order cost per year.
The total order cost is (n/x) × s = (n s)/x. Now let’s go after the total storage cost per year.
Inventory (storage) cost per year. The storage cost per year will be (number of hamburgers stored) × (cost per hamburger
for storage). The number of hamburgers stored is difficult, since it is not a constant. We get an order, and the number
of hamburgers stored goes up. Then as we sell them, the number goes down. One idea (and it’s a reasonable one) is to use
the average number of hamburgers. So, we will make an assumption here. Suppose we get orders of hamburgers just as we
run out, and that we sell them at a uniform rate. Then the average number of hamburgers will be half of the order size, or
x/2. We’ll use that for number of hamburgers stored.
Dudley: Al, how easy would this be to modify to take into account that no reasonable manager would rely on a
shipment showing up just as they run out? There really ought to be a number of hamburgers on hand that triggers
an order being sent, to arrive in time to allow a safe buffer of hamburgers left.
Albert: That’s certainly the right idea. It wouldn’t be hard at all to incorporate that into the equations, and only
creates a small problem. It turns out that all it does is raise the storage cost by a constant, and doesn’t affect the
final value of x at all.
The storage cost of a hamburger is i, by what we decided to call it earlier. The total storage cost is then (x/2) × i =
(i x/2). That’s the other half of what we need.
Total cost. The total cost is the sum of the order cost and the storage cost, or
C(x) = (n s/x) + (i x/2)
In this equation, x is the variable we get to determine, so that’s the independent variable. C is the dependent variable, and
the value that we want to minimize. The remaining letters represent constants, or more accurately, parameters that we can
set to fit the situation.
Now that we have it, what do we do with it?

We have found the equation for total cost. The response is “So? How does that help?” It’s clear that the equation is
something we should have, but it is not clear what to do with it now that we have it.
With specific numbers, we can graph it. We can get the values of n, s, and i from the manager and plot the function
C(x), and look for the minimum. (After all, we are trying to make the total cost as small as possible.)
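As a stand-in for reading the minimum off a graph, you can tabulate C(x) over a range of order sizes and pick the smallest value. The numbers below (10,000 hamburgers per year, $20 per order, $0.10 per hamburger per year) are made up for illustration, as are the function name and the grid of order sizes.

```python
def total_cost(x, n, s, i):
    # C(x) = (n s / x) + (i x / 2): yearly order cost plus yearly storage cost
    return n * s / x + i * x / 2

# Hypothetical values a manager might supply
n, s, i = 10_000, 20.0, 0.10

# Try order sizes 100, 200, ..., 10000 and keep the cheapest
order_sizes = range(100, 10_001, 100)
best_x = min(order_sizes, key=lambda x: total_cost(x, n, s, i))
print(best_x, total_cost(best_x, n, s, i))
```

Like the graph, this only answers the question for one particular set of numbers, and only to the resolution of the grid.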
Can we do better? There are problems with the graphical approach. What happens if some of the values change (or
aren’t accurate)? We’d be stuck. A major branch of the “real” theory here is called sensitivity analysis. It answers the
question “How sensitive is the final answer to slight inaccuracies in the parameters?” If it turns out that the result can vary
dramatically with slight changes in n, then that is good to know. You can concentrate your efforts on getting accurate values
of n. Sensitivity analysis is difficult to do graphically. We would be better off if we handled the equations algebraically.
2.3.2 General procedures for minimizing (or maximizing) a function.

As you can imagine, the whole idea of finding the largest and/or smallest values of a function is important in very practical
situations, and calculus is usually at the heart of such endeavors. What we want to do is give the procedures for solving this
type of problem algebraically, at least in the case of a single-variable function. (More complicated procedures are needed
for multi-variable functions. We’ll encounter those in the next application chapter, on cassette recorders.)
In all of these, we are going to be working with formula-based functions. With the other varieties (graphs, lists, etc.),
largest values (maxes) and smallest values (mins) are easy to find. (But as mentioned, there are limitations as well.)
Look at a simple picture (graph).

Audience participation questions about the following graph: (People who have had calculus before are excluded!)
(a) How do we find the maxes or mins?

(b) How do we tell the difference between a max and a min?
How about in another graph for which that approach doesn’t work?
(a)
(b)
How do you solve the problem?

We just got two different ways of solving one problem. Which do we use? That’s difficult to answer. Both approaches are
useful, but in different settings. It depends on the type of problem you have.
I’ll give details on how to solve problems both ways, as well as the advantages and disadvantages of both ways and
guidelines on when to use which procedure. I’ll first give the most general procedure, and then the most common one.
One procedure: (most general) The central idea in this case is simple. Find the places where the graph can change from
rising to falling or vice versa. Those will be the maxes and mins.
Here’s the procedure:
1. Find the derivative of the function. You should all be pros at this by now.
Mugsy: Wow, how optimistic can you get?
2. Find out where the derivative is 0 or doesn’t exist (called Critical points); plot those points on a number line. Critical
points are well-named. They are often where the most interesting behavior of the function occurs. At other places
(called Regular points), the graph is basically dull.
The reason you plot the points on a number line (which is really the x-axis; you can add the y-axis later and plot the
function if you want) is to divide the line up into sections that are clumps of points that are all regular points.
3. Find out if the curve is rising or falling on the other points. The key to this method is here. On any given section
(clump of regular points), the graph is either entirely rising or entirely falling. (It will generally do different things on
different sections, but within one section, it’s always the same at each point.) The reason is that a graph can’t change
from rising to falling except at a critical point. If there are no critical points in an interval, it can’t change. We have
intervals (clumps) with no critical points, so it can’t change.
How do you test whether the graph is rising or falling on that section? Simple, check the value of the derivative at
some point inside the interval. (Don’t check the ends—those are the critical points!) Which point? It doesn’t matter,
since they are all the same in the sense the curve is rising for all points of a specific section or falling for all points of
that section. (Again, it can, and usually does, change from one section to the next.)
4. Locate maxes/mins from changes. Once you’ve categorized each section, locating maxes and mins is simple. A max
occurs when you move from a rising section to a falling section. A min occurs when you move from a falling section
to a rising section. For most functions, the sections will alternate between rising and falling. However, that is not
always true, so don’t rely on it. You really do have to check each section on its own.
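The four steps can be sketched in code for the polynomial $x^3 - 3x^2 - 9x + 8$, whose derivative $3x^2 - 6x - 9 = 3(x+1)(x-3)$ is zero at x = −1 and x = 3. The helper below is my own illustration, not a standard routine: it assumes you have already found the critical points by hand, and it tests one point inside each section.

```python
def f_prime(x):
    # Derivative of x^3 - 3x^2 - 9x + 8
    return 3 * x**2 - 6 * x - 9

def classify_critical_points(critical, derivative):
    pts = sorted(critical)
    # One test point in each section: left of all, between neighbors, right of all
    samples = [pts[0] - 1.0]
    samples += [(a + b) / 2 for a, b in zip(pts, pts[1:])]
    samples.append(pts[-1] + 1.0)
    rising = [derivative(x) > 0 for x in samples]

    result = {}
    for point, before, after in zip(pts, rising, rising[1:]):
        if before and not after:
            result[point] = "max"      # rising, then falling
        elif after and not before:
            result[point] = "min"      # falling, then rising
        else:
            result[point] = "neither"  # no change of direction
    return result

print(classify_critical_points([-1.0, 3.0], f_prime))
```

The test points here are −2, 1, and 4, where the derivative is 15, −12, and 15, so the graph rises, falls, and rises again.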
Other procedure: (easiest and most common) The central idea in this procedure is to look for the tops of hills (maxes)
and bottoms of valleys (mins).
1. Find the first and second derivatives of the function.
2. Find out where the first derivative is 0. The places where the first derivative is zero have the potential for being tops
of hills or bottoms of valleys.
In this approach, you can’t handle where the first derivative doesn’t exist. The second derivative won’t exist there
either, and the procedure falls apart.
3. Test the concavity at the places where the first derivative is 0. If the second derivative is positive where the first
derivative is 0, that’s a min. If the second derivative is negative where the first derivative is 0, that’s a max. If the
second derivative is 0 or doesn’t exist where the first derivative is 0, then just about anything could happen: max,
min, or neither.
Why does that work? If the curve is concave up (positive second derivative) at a point that has a horizontal tangent,
then the concave up means you are on a portion of the graph that looks like a smile, and the horizontal tangent gives
a minimum. If the curve is concave down (negative second derivative) at a point that has
a horizontal tangent, then the concave down means that you are on a portion of the graph that looks like a frown, and
the horizontal tangent gives a maximum.
This procedure is called the Second derivative test, since the second derivative is used to tell the character of the point
where the first derivative is zero. However, be careful with the second derivatives. Just because you have a positive
(or negative) second derivative at a specific point, you don’t have a minimum (or maximum) unless you also have the
first derivative equal to zero.
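A sketch of the second derivative test, again for $x^3 - 3x^2 - 9x + 8$ (second derivative $6x - 6$, first derivative zero at x = −1 and x = 3). The function name and the "inconclusive" label are my own; the last line uses $x^3$, where both the first and second derivatives are zero at x = 0, to show the test giving up.

```python
def second_derivative_test(x0, f_double_prime):
    # Assumes the first derivative is already known to be 0 at x0
    curvature = f_double_prime(x0)
    if curvature > 0:
        return "min"           # concave up: bottom of a valley
    if curvature < 0:
        return "max"           # concave down: top of a hill
    return "inconclusive"      # second derivative is 0: the test says nothing

def f_double_prime(x):
    # Second derivative of x^3 - 3x^2 - 9x + 8
    return 6 * x - 6

print(second_derivative_test(-1, f_double_prime))    # max
print(second_derivative_test(3, f_double_prime))     # min
print(second_derivative_test(0, lambda x: 6 * x))    # inconclusive (x^3 at 0)
```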
Choosing between these procedures. Both procedures will work, when they can. The first (general) procedure is usually
somewhat longer, but is guaranteed to work. The second (common) procedure is faster, but might fail for either of two
reasons:
1. The first derivative fails to exist, or
2. The second derivative is 0 or fails to exist.
These are serious problems, and if either occurs, you are forced into using the first procedure.
So, my recommendation is this: Use the second procedure unless it fails. Examples of failure are given in the homework
and in class.
Note that neither of these procedures finds the values of the function at a max or a min. It isn’t necessary! All we are
doing is locating the value of x when the max or min occurs. However, in most practical problems, finding the value of the
function at the max or min is important. If the problem asks for the maximum and/or minimum value of a function, you
have to plug the x’s back into the function to evaluate it at the relevant points.
An example of this will be given when I solve the minimum cost problem. I will do it both ways. And in the next
section, I will find the critical points of a more complicated function. But I will work the following examples in class:
$x^3 - 3x^2 - 9x + 8$
$e^x \sin x$
Maxes and mins on closed intervals.

There is one problem that can crop up. It is quite possible to get functions that don’t have maxes or mins. The simplest
example is f (x) = x. It has no max or min. This is a bit disconcerting. There is, however, a way to force the existence of
maxes and mins, and that’s what we look at next.
Some terminology needs to be laid down here. The maxes and mins that we found using the procedures just given are
local maxes and mins. That means that the values are the largest or smallest nearby. But that might not be what you want in
a complicated situation. You might want the largest or smallest values anywhere. Those are called the global max and min.
There might be many local maxes and mins (for example, a very wavy function), but the global max and min are usually
unique, a term meaning that there’s only one of each. (You can have ties, such as with f (x) = sin x, but that’s rare.) It’s
obvious that a global max is a local max and a global min is a local min. (If a value at a point is larger than all other points,
it is certainly larger than all nearby points. The same holds for smaller in place of greater.)
Needed to guarantee global maxes and mins. The situation that guarantees that a function will have both a global max
and a global min is if the function is continuous (no breaks in its graph) on an interval like a ≤ x ≤ b, commonly abbreviated
[a, b]. This assertion can be proved, but doing so requires more advanced mathematics than I want to deal with here. It is
sometimes called the extreme value theorem.
Mugsy: Should I be glad that the proof is omitted?
Albert: Most definitely. The proof shows up in courses called Real Analysis or Topology. These are genuinely senior-
level courses, and often put off until graduate school.
Global max and min will either be a critical point or an endpoint. There are two types of places to look for a global
max or min. The first type of place is at critical points. It is quite possible that one of the local maxes or mins is also
the global max or min. You can’t ignore them. The other type of place to look (and this is a result of working on an
interval) is at the endpoints of the interval. If you have a function like f (x) = x on the interval [a, b], then the minimum
will occur at x = a and the maximum will occur at x = b. The endpoints are important!
Finding the global max and global min. Once you know where to look, you have to search through the points for the
global max and global min. You could do this by classifying each of the points (“Is it a max or min?”), and looking among
the maxes for the global max and looking among the mins for the global min. That’s not how it is done in practice. For one
thing, it is not obvious what to do with the endpoints. (Although with a bit of thinking, you could probably come up with
the right thing.)
Dudley: Al, Mugsy and I have been conferring, and have one thing to say: HELP!
Albert: Don’t panic, either of you. Think about what’s being done. Suppose you want the global maximum and
minimum of a function f (x) on the interval [a, b]. You could find all the critical points, and classify them as local
maxes and mins. Then, to find the global max, you’d take the local maxes and the endpoints (Don’t forget them!),
and you’d be stuck. How do you tell which is the global max? There is only one way. You have to plug those values
back into the function, and look at the values of the function at all those points. The very largest value is the global
maximum, and the corresponding x is the value where the global max occurs. Then, to find the global min, you’d
do the same with all the local mins (again including the endpoints!), plugging in all those values into the function,
but this time looking for the very smallest output of the function, which is the global min of the function, and the
corresponding x is the input that gives that global min. Now, look at what you did. You plugged in all the critical
points into the function, as well as the endpoints (twice). Realize, then, that you could drop out one piece of work.
You don’t really need to know whether a point is a local max or min. It’s wasted effort. The global max is not going
to occur at a local min, and vice versa. Just plug all the points in without classifying them first if you are going only
for the global max and min. It will save you some work. Does that help?
Dudley: Definitely, but I’ll need to go over it again to be sure.
Albert: But be careful. You can skip the classification, as it is called, of the critical points (answering the question
of whether the point is a local max or min) only when you are asked only for the global max and min. If you want
the local maxes and mins, then you do have to go through the classification process (the second derivative test, for
example).
The procedure, then, is somewhat different from finding local maxes and mins. To find the global max and min of a
function on an interval, follow this procedure.
1. Find all the critical points of the function (where the derivative is either 0 or doesn’t exist), and put those in a list.
2. Throw out of the list any critical points that are not in the given interval.
3. Add to that list the endpoints of the interval.
4. Evaluate the function at all the points that are left in the list.
5. Pick the largest value to be the global max and the smallest value to be the global min.
Example. Find the global max and min of
\[ f(x) = \frac{2x+5}{x^2-4} \]
on the interval $[-6, -3]$. The derivative is
\begin{align}
f'(x) &= \frac{(x^2-4)(2) - (2x+5)(2x)}{(x^2-4)^2} \tag{2.36} \\
&= \frac{(2x^2-8) - (4x^2+10x)}{((x-2)(x+2))^2} \tag{2.37} \\
&= \frac{-2x^2 - 10x - 8}{(x-2)^2(x+2)^2} \tag{2.38} \\
&= \frac{-2(x+1)(x+4)}{(x-2)^2(x+2)^2} \tag{2.39}
\end{align}
Then what we have to do is find the critical points. The values where $f'(x) = 0$ come from setting the top of $f'(x)$ to 0. (A
fraction is zero when the top is zero.) That gives $-2(x+1)(x+4) = 0$, so $x = -1, -4$. We throw out $-1$ since it is not
in the interval between $-6$ and $-3$. Next, we need the values where $f'(x)$ doesn’t exist, and that comes from setting the
bottom of $f'(x)$ to 0. (A fraction blows up when you divide by 0.) (This, by the way, is why I factored the bottom!) That
gives $(x-2)^2(x+2)^2 = 0$, or $x = 2, -2$. Both of those get thrown out, because neither is between $-6$ and $-3$. So, the only
surviving critical point is $x = -4$. Add to that the endpoints, $-3$ and $-6$, and you end up with three x’s to find function
values for.
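\[ f(-3) = -\frac{1}{5}, \quad f(-4) = -\frac{1}{4}, \quad \text{and} \quad f(-6) = -\frac{7}{32} \]
Since
\[ -\frac{1}{5} > -\frac{7}{32} > -\frac{1}{4} \]
the global max in this case is $-\frac{1}{5}$ at $x = -3$ and the global min is $-\frac{1}{4}$ at $x = -4$.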
A few notes. First, you plug back into $f(x)$ to finally decide global maxes and mins, not $f'(x)$ or $f''(x)$. Also, when you
are trying to decide where a fraction is either 0 or not defined, you set the top equal to 0 and then the bottom equal to 0.
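The five-step procedure is mechanical enough to script. The sketch below replays the example with exact fractions so the comparison at the end involves no rounding. The list of critical points is taken from the work above rather than computed, and the variable names are my own.

```python
from fractions import Fraction

def f(x):
    # f(x) = (2x + 5) / (x^2 - 4), evaluated exactly
    x = Fraction(x)
    return (2 * x + 5) / (x**2 - 4)

a, b = -6, -3
critical = [-1, -4, 2, -2]   # where f'(x) is 0 or undefined, from the example

# Steps 2 and 3: keep critical points inside [a, b], then add the endpoints
candidates = [c for c in critical if a <= c <= b] + [a, b]

# Steps 4 and 5: evaluate f (not f') and compare the values
values = {x: f(x) for x in candidates}
x_max = max(values, key=values.get)
x_min = min(values, key=values.get)
print("global max", values[x_max], "at x =", x_max)
print("global min", values[x_min], "at x =", x_min)
```

Notice that no classification of x = −4 was needed; the comparison of function values does all the work.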
2.3.3 Back to the problem.

What we want to do is find the minimum of $C(x) = (ns/x) + (ix/2)$. This is more conveniently written as $C(x) = n s x^{-1} + (i/2)x$.
(Remember that it is usually easier to deal with exponents than division? This is particularly true when it is so easy
to convert to an exponent.) I want to solve this problem using both the procedures listed for finding local maxes and mins
(not global here, yet).
Use the second procedure first.

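For this, I need to find $C'(x) = -n s x^{-2} + (i/2)$ and $C''(x) = 2 n s x^{-3}$. Then, I set $C'(x) = 0$ and solve.
\begin{align}
-\frac{ns}{x^2} + \frac{i}{2} &= 0 \tag{2.40} \\
\frac{i}{2} &= \frac{ns}{x^2} \tag{2.41} \\
\frac{x^2 i}{2} &= ns \tag{2.42} \\
x^2 &= \frac{2ns}{i} \tag{2.43} \\
x &= \pm\sqrt{\frac{2ns}{i}} \tag{2.44}
\end{align}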
There is also the question of where $C'(x)$ is not defined, which occurs at $x = 0$. This gives a total of three critical points,
but only one is realizable. (What would the manager say if you told him to order a negative number of hamburgers? Or
zero hamburgers?) This can also be viewed as a sort of global max/min problem, with $x > 0$ being the interval, but that
can’t work easily, since there is only one endpoint. But the effect is the same, namely, drop the negative square root and the
value $x = 0$. We are left with only one possible value, $x = \sqrt{2ns/i}$. It had better be a min!
We plug that value of $x$ into $C''(x)$, and we will certainly get a positive number, since all factors in $C''(x)$ are positive.
Thus, $x = \sqrt{2ns/i}$ does indeed give a minimum for $C(x)$.
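To see the formula at work with actual numbers, here is a short check. The values n = 10,000, s = $20, and i = $0.10 are hypothetical, chosen only so the square root comes out evenly, and the function names are my own.

```python
import math

def c_prime(x, n, s, i):
    # C'(x) = -n s x^(-2) + i/2
    return -n * s / x**2 + i / 2

def c_double_prime(x, n, s):
    # C''(x) = 2 n s x^(-3)
    return 2 * n * s / x**3

n, s, i = 10_000, 20.0, 0.10          # hypothetical values
x_star = math.sqrt(2 * n * s / i)     # the positive critical point

print("order size:", x_star)
print("C'(x*)  =", c_prime(x_star, n, s, i))        # should be essentially 0
print("C''(x*) =", c_double_prime(x_star, n, s))    # should be positive
```

With these numbers the order size is 2,000 hamburgers, which means 10000/2000 = 5 orders per year.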
Use the first procedure next.

We now start all over and use the first method of finding local maxes and mins. For this, you have to find the critical points.
$C'(x) = -n s x^{-2} + (i/2)$. This equals 0 when $x = \pm\sqrt{2ns/i}$, and $C'(x)$ is not defined for $x = 0$. This gives three critical
points. Now we are confronted with a problem. How do we plot points when they don’t have specific values? How do we
find points around them easily? These are good questions.
The idea that we could add 1 or subtract 1 to get numbers bigger or smaller loses; it’s too messy to find $C'(x)$ for those
values. The way to work is this: Whatever value $\sqrt{2ns/i}$ has, it will certainly be true that $\sqrt{ns/i}$ will have a smaller value,
and $\sqrt{3ns/i}$ will have a larger value. That is, instead of adding something, we multiply by something. So, evaluate $C'(x)$ at
$x = -\sqrt{3ns/i}$, $-\sqrt{ns/i}$, $\sqrt{ns/i}$, and $\sqrt{3ns/i}$. They give, respectively, $i/6$, $-i/2$, $-i/2$, and $i/6$. That means that the graph
of $C(x)$ is
increasing for   $-\infty < x < -\sqrt{2ns/i}$
decreasing for   $-\sqrt{2ns/i} < x < 0$
decreasing for   $0 < x < \sqrt{2ns/i}$
increasing for   $\sqrt{2ns/i} < x < \infty$
(This is easier to see on a number line.) Therefore, there is a local minimum at $x = \sqrt{2ns/i}$. This is the same answer that
the other procedure gave.
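The four test values come from direct substitution. Because x appears only as $x^2$ in $C'(x)$, the sign of the test point makes no difference, so two computations cover all four points. The worked steps below use only the symbols already defined and assume, as the problem does, that i is positive.

```latex
\begin{align*}
C'\!\left(\pm\sqrt{\tfrac{3ns}{i}}\right)
  &= -\frac{ns}{3ns/i} + \frac{i}{2}
   = -\frac{i}{3} + \frac{i}{2}
   = \frac{i}{6} > 0, \\
C'\!\left(\pm\sqrt{\tfrac{ns}{i}}\right)
  &= -\frac{ns}{ns/i} + \frac{i}{2}
   = -i + \frac{i}{2}
   = -\frac{i}{2} < 0.
\end{align*}
```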
This procedure could have been simplified by ignoring x < 0, but it is worth noting something. The function did not
change from decreasing to increasing at x = 0, one of the critical points. That’s a lesson that is worth remembering!
Solution.
What’s the answer? The order size should be $x = \sqrt{2ns/i}$. There is one thing yet to do, since this is a problem taken from
the real world. You should always ask if it makes sense.
Mugsy: Hey, really? Does anything in this course actually make sense?
Dudley: Come off it. Of course. What fascinates me is that it could actually possibly be used maybe a little. This is
a new concept for a math course.
If n, the number of hamburgers sold per year, gets large, then the order size goes up. That certainly makes sense. If s, the
shipping cost to place a single order, goes up, we want to make larger orders. That also makes sense, since a larger shipping
charge means you want to pay it less often. If i, the inventory cost of a hamburger, goes up, then you certainly want to cut the
size of the inventory back, which means smaller orders delivered more often.
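These three sanity checks can also be tried with numbers. The sketch below doubles each parameter in turn, starting from hypothetical values of n = 10,000, s = $20, and i = $0.10; the function name is my own.

```python
import math

def order_size(n, s, i):
    # Optimal order size x = sqrt(2 n s / i)
    return math.sqrt(2 * n * s / i)

base = order_size(10_000, 20.0, 0.10)               # hypothetical starting values
more_sales = order_size(20_000, 20.0, 0.10)         # n doubled
costlier_shipping = order_size(10_000, 40.0, 0.10)  # s doubled
costlier_storage = order_size(10_000, 20.0, 0.20)   # i doubled

for label, x in [("base", base), ("n doubled", more_sales),
                 ("s doubled", costlier_shipping), ("i doubled", costlier_storage)]:
    print(f"{label:10s} {x:8.1f}  ({x / base:.3f} times base)")
```

Because of the square root, doubling n or s raises the order size by a factor of only about 1.414, and doubling i cuts it by the same factor. That is a first taste of the sensitivity question raised earlier: a modest error in any one parameter produces an even more modest error in x.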
There is one other question that fits in here. How often should orders be placed? That is certainly relevant. We answer
that question in the homework right now.
Homework #23
Exercises.
1. Find and classify all the local maxes and mins of the following functions. (Classify means determine whether it’s a
max or a min or neither.)
(a) $p(s) = s^3 - 6s^2 - 36s + 18$
(b) $f(x) = x^3 - 6x^2 + 12x - 1$
2. Find and classify all the local maxes and mins of the following functions.
(a) $p(s) = s^3 - 9s^2 + 27s - 15$
(b) $f(x) = x^3 - 15x^2 + 63x - 60$
3. Find where the functions are rising and falling, and the global maximum and minimum, of the following functions
on the given intervals:
(a) $f(x) = 3x^2 + 8x - 5$ on $[-3, 0]$.
(b) $f(x) = \frac{x^2}{x-4}$ on $[5, 10]$.
(c) $f(x) = x \ln x$ for $0 \le x \le 3$. (Use $f(0) = 0$, since $\lim_{x\to 0} f(x) = 0$ for this function. We did that limit in an
example earlier.)
4. Find where the functions are rising and falling, and the global maximum and minimum, of the following functions
on the given intervals:
(a) $f(x) = 3x^2 - 7x + 2$ on $[-1, 3]$.
(b) $f(x) = \frac{x^2}{x-3}$ on $[4, 10]$.
(c) $f(x) = x e^{-x}$ for $0 \le x \le 5$.
Problems.
1. In this problem, we look at a familiar function that requires the first method for finding maxes or mins.
(a) The function $f(x) = |x|$ has a minimum at $x = 0$. Why would the second (easy) method never find that as a
minimum?
(b) When you try the second method on this function, what tips you off that it is not going to work?
(c) What about the graph of $y = |x|$ at $x = 0$ indicates that the second method for finding maxes and mins will fail?
2. In this problem, we classify all the critical points of $f(x) = \frac{2x+5}{x^2-4}$. I have already found all the critical points in the
notes (though some were removed due to the interval under consideration there).
(a) What is the second derivative of f (x)? (Needed for the second method of classifying critical points.)
(b) When you try to use the second method to classify the critical points, two points can be classified, but two
can’t. Classify the two that can be. (The two that can’t be classified have the awkward property that the second
derivative explodes—division by 0—when you plug the values in.)
(c) The two failed points obviously need more help, and the first method now becomes the only hope. Determine
where the function is increasing and where it is decreasing. (You always have to use all the critical points to
answer this type of question.)
(d) On the basis of the information from the previous parts, classify the critical points again. Do the two failed
points show up as maxes, mins, or neither? (Bonus: What are they graphically?)
3. In this
p problem, we investigate how often orders need to be placed, having determined that the optimal order size is
x = 2 n s/i.
(a) Back in the beginning of this problem, we decided that there needed to be n/x orders per year to sell n ham-
burgers each year. From that, how many years are there between orders? (This looks hard, but isn’t really. Try
the same question with numbers. If you have 12 orders per year, how many years are there between orders?
How about 4 orders? 3 orders? How did you answer these questions? Apply the same logic to this problem.)
(b) Plug $x = \sqrt{2ns/i}$ into your answer to the previous part, and simplify the result you get algebraically. (It’s only
a minor bit of simplifying.)
2.4 Elasticity.
We now want to move into a bit of microeconomics. But courage! We aren’t going to spend much time here; just enough
to pick up a feel for how differentials are used in another field, and get a few more useful things to do with them.
Mugsy: More? We had any?
2.4.1 Introduction.
Typical use of calculus concepts in economics.
The preceding section is a genuine application, but still carries a bit of a sense of triviality. In this section, we will get more
of an idea that calculus concepts can be used in a significant way in the study of economics.
These sections hardly represent the only uses of calculus in economics, but we haven’t yet had enough calculus to do
much more.
Typically poorly explained (poor understanding of calculus).

The concepts and notation of calculus enable us to understand what is being described by elasticity well. If you have to
avoid calculus due to lack of prerequisites (the situation in many economics courses), the explanations become much more
awkward. Remember, calculus is a language that is specially adapted to describing change, and economics is all about
change.
Albert: Change? I prefer folding money.
Mugsy: BOOOOOO! HISSSSSS!
We aren’t interested in learning economics as much as in seeing how calculus can be used in another setting.
Mugsy: If the alternative is between calculus and economics, I’d take calculus.
Dudley: Why?
Mugsy: I need all the brownie points I can get in this course.
The critical thing is to see how the language can be used and to pick up a few more important concepts.
Description of the market; gauge of price levels.

Elasticity is a quantity used by retailers to describe the way revenue will change with price changes. In that sense, it is a
measure of the market price of a commodity. Although the real-world situations in which elasticity is used are vastly more
complicated than can be described by a single function, elasticity is an important quantity to watch.
2.4.2 Notation and terminology.

The price of an item is written as p. The number of items sold is x. The revenue (gross income of the retailer) is then
R = x p.
Price increase implies decrease in demand.

We make a few assumptions. The most obvious one is that if you increase the price, the demand (the number of items sold)
decreases. This is, of course, debatable. But we are not going to be looking at specialty markets where psychology is as
much a factor as logic.
Mugsy: What? This makes less sense than usual, and that’s not good.
Albert: Occasionally you will hear of the phenomenon that an item wouldn’t sell until its price was raised sky high.
People then thought that the item was exotic or valuable, and it started a trend.
This translates into the language of calculus as d p/dx < 0. If d p > 0 (price increases), then dx < 0 (demand decreases),
so the quotient d p/dx is negative.
We will also make the assumptions that p > 0 and x > 0 always. These are so obvious there is no chance of violating
them.
Let’s try to maximize revenue.

We are going to be interested in maximizing revenue. You might think that you’d try to maximize profit, but if you look at
the financial pages much nowadays, you’ll realize that a large market share is considered more important than high profit.
We are going to be assuming implicitly that the commodity in question is the only item made by the manufacturer, and
that there is only one manufacturer. These are big (generally invalid) assumptions, but they serve the purpose of illustrating
what goes on in more complicated markets.
2.4.3 Solution of problem.

We want to maximize revenue, and need to find the condition that guarantees that.
Assume that p = p(x).

This seems a bit strange. Price is a function of demand? Wouldn’t you think that demand is a function of price? It goes
either way, from the retailer’s perspective. We want to use demand as the independent variable, though. The retailer
will then adjust the price to achieve a certain level of sales (demand). That’s why we called demand x; it represents the
independent variable.
We want dR/dx = 0.
To maximize revenue, we differentiate R(x) = x p(x) with respect to x, since p is really a function of x. We then set it equal
to 0, and solve for d p/dx. Here’s what happens:
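\begin{align}
\frac{dR}{dx} &= 1 \times p(x) + x \times \frac{dp}{dx} \tag{2.45}\\
0 &= p + x\,\frac{dp}{dx} \tag{2.46}\\
-p &= x\,\frac{dp}{dx} \tag{2.47}\\
-\frac{p}{x} &= \frac{dp}{dx} \tag{2.48}
\end{align}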
That is, we want d p/dx to equal −p/x to make the revenue a maximum.
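As a quick check of this condition, here is a minimal Maple sketch. The linear demand curve p = 100 − 2x is made up purely for illustration; it is an assumption, not something taken from the notes.

```maple
# Hypothetical linear demand curve (an assumption, chosen only for illustration).
p := x -> 100 - 2*x;
# Revenue R = x p(x).
R := x -> x*p(x);
# Set the marginal revenue dR/dx equal to 0 and solve for the demand x.
xstar := solve(diff(R(x), x) = 0, x);      # x = 25
# Compare the marginal price dp/dx with -p/x at that demand level.
eval(diff(p(x), x), x = xstar);            # -2
-p(xstar)/xstar;                           # -2
```

The last two values agree, which is what the condition d p/dx = −p/x says should happen at the critical demand level.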
There is some more terminology. Whenever an economist refers to the derivative of something with respect to x
(demand), he calls it the marginal of that thing. d p/dx is marginal price; dR/dx is marginal revenue. The way that this is
explained in economics class is that the marginal revenue (for example) is the change in revenue caused by selling one more
item, while marginal price is the change in price needed to increase demand by one. The idea there is that d p/dx ≈ ∆p/∆x,
where the approximation is best when ∆x is small. Since ∆x = 1 is the smallest that is physically possible, the best that can
be done is d p/dx ≈ ∆p/∆x = ∆p when ∆x = 1. The same would be true for dR/dx ≈ ∆R when ∆x = 1.
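To see how good the ∆x = 1 approximation can be, here is a small worked example. The revenue function is hypothetical (it comes from an assumed demand curve p = 100 − 2x, chosen only because the arithmetic is easy).

```latex
\begin{align*}
R(x) &= 100x - 2x^2, & \frac{dR}{dx} &= 100 - 4x,\\
\left.\frac{dR}{dx}\right|_{x=10} &= 60, & \Delta R &= R(11) - R(10) = 858 - 800 = 58.
\end{align*}
```

Selling the eleventh item actually changes revenue by 58, while the marginal revenue at x = 10 predicts 60. That is close, but not exact, as you would expect from an approximation.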
Actually, all derivatives (not just with respect to x) in economics are called Marginals. The derivative d/dx is the
Demand marginal, so d p/dx is more accurately the demand marginal of price, while dR/d p would be the price marginal
of revenue. The type of marginal is then used for specifying the independent variable. However, the demand marginal is
usually meant when there is no other indication of which marginal is referenced.
Critical price occurs where (p/x)/(d p/dx) = −1.

If we divide both sides of that equation for maximizing revenue by d p/dx, we get that
\begin{align}
-\frac{p}{x} &= \frac{dp}{dx} \tag{2.49}\\
-1 &= \frac{p/x}{dp/dx} \tag{2.50}
\end{align}
This is the crucial equation that we will look at closely.
Notation and terminology.

The quantity that we just got is called the elasticity, usually. This is how we define it.
\begin{equation}
\text{Elasticity} = \frac{p/x}{dp/dx} \tag{2.51}
\end{equation}
The quantity on the left of the equation for the critical price is called elasticity, and is given the letter η (Greek letter,
written eta, and pronounced AY-tuh). It looks like an “n” with a tail. (But the Greek equivalent of “n” is nu or ν, looking
like a “v.” It takes some getting used to.)
The marginal price is on the bottom, as mentioned above. The item on the top is called the Average price. This also is
terminology. A numeric quantity divided by x gives the average of that quantity. So, p/x is the average price, and R/x is
the average revenue. (I would imagine that there are varieties of averages, just like there are varieties of marginals. That is,
the price average of revenue would be R/p. However, I have never encountered such terminology.)
It is worth highlighting the difference:
Marginal means take the derivative with respect to x.

Average means divide by x.
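A quick illustration of the difference, using a hypothetical demand curve p = 100 − 2x (an assumption, chosen only to keep the numbers simple):

```latex
\begin{align*}
R &= x\,p = 100x - 2x^2\\
\text{average revenue} &= \frac{R}{x} = 100 - 2x & &\text{(equals 80 at } x = 10\text{)}\\
\text{marginal revenue} &= \frac{dR}{dx} = 100 - 4x & &\text{(equals 60 at } x = 10\text{)}
\end{align*}
```

Notice that the average revenue R/x is just the price p again, while the marginal revenue is something different.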
Note that elasticity will be negative (by the assumption that d p/dx < 0, and assuming p and x are positive). What we will
end up talking about, mostly, is the absolute value of elasticity, making it positive. Elasticity is one of the quantities that
is not defined uniformly. The definition here is the most common (as far as I can tell), but there are others. One typical
alternative is to include absolute values. This definition can be easily translated to something equivalent to what we will
be doing, except that we will have to include the absolute values. Another definition of elasticity that I have seen is that
η = (d p/dx)/(p/x), the reciprocal of what we are using. This makes life complicated, since it reverses all the inequalities
that we will get.
2.4.4 Investigation of elasticity.

What we want to do is interpret this odd combination of ingredients called elasticity: what it means, and what it tells us about the market.
η is negative.
We’ve already said that earlier, but it needs to be put here with the properties of elasticity. It follows from d p/dx < 0, x > 0,
and p > 0.
Terminology.
The market (meaning the price-demand situation being investigated) with 0 < |η | < 1 is called Inelastic, with |η | = 1 is
said to have Unit elasticity, with |η | > 1 is called Elastic. Note that these inequalities flip around if you take off the absolute
values, because η is negative! Also note that the condition we had for maximum revenue, η = −1, implies unit elasticity.
That’s the reason that 1 is the separating line between inelastic and elastic markets.
Dudley: Al, how can I remember these easily?
Albert: I assume you don’t want the brute force memorization method. If you will wait until we know how elastic and
inelastic markets operate, it will be easier to explain.
This is standard terminology (except, see the homework!). We want to examine what these different terms imply about
the market.
Setting up to examine elasticity.

With that in mind, we want to look at elasticity, and what it implies about market conditions, specifically what it implies
about the change in revenue as price changes.
Rewrite η as (dx/x)/(d p/p) . In the form that we had for η, it is a bit difficult to interpret, so we rearrange the terms,
and specifically, split the derivative apart into the quotient of two differentials so that we can put all the x-terms on top and
all the p-terms on the bottom.
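Written out step by step, the rearrangement goes as follows (treating d p/dx as a quotient of differentials, as described above):

```latex
\begin{align*}
\eta = \frac{p/x}{dp/dx}
     = \frac{p}{x} \cdot \frac{dx}{dp}
     = \frac{p\,dx}{x\,dp}
     = \frac{dx/x}{dp/p}.
\end{align*}
```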
What are dx/x and d p/p? The meanings of dx/x and d p/p are central to understanding elasticity. In fact, dQ/Q for any
quantity Q, is a useful item in other situations, so we spend some time looking at this separately. It even has a terminology:
dQ/Q is called the relative change in Q. Elasticity, then, is the ratio of the relative change in demand to the relative change
in price. As soon as we figure out what the relative change means, we’ll come back to this, and get our final interpretation
of elasticity.
2.4.5 Looking at relative changes.

In order to understand elasticity, and look at other situations where relative change occurs (and there are some that you will
encounter), we need to look carefully at relative change.
Absolute changes are relevant in some cases.

We have seen that dQ represents a wiggle in Q, a small change in the variable. In many settings, such an absolute change is the
appropriate measure to use. But that is not always the fairest measurement of change or error.
Relative changes are common; measurement errors.

An error of 3 kilometers is really bad, if it’s in the distance between the college and seminary, quite large if it’s the distance
between here and Lexington, but not so bad if it’s in the distance from here to Cincinnati. It’s amazingly small if it’s in the
distance between here and Mars (the allowable error for the Galileo spacecraft to continue its mission properly).
Why the difference in understanding a 3 km error? Because the seminary, Lexington, and even Cincinnati are so much
closer than Mars.
Mugsy: I’ve been to Cincinnati. Some portions are quite close to Mars. But then, portions of Chicago are a lot closer.
Larger absolute errors are generally expected in larger quantities. Consider a price increase of $1. That would be catas-
trophic for a can of Pepsi, but insignificant in the price of an automobile.
Mugsy: They tried to charge me $50 just to inflate the tires. But only once.
Again, it’s the ratio of the change to the current value that determines the reaction. For the Pepsi that ratio is about
($1.00)/($0.50) = 2 = 200%, while for the car it’s perhaps ($1.00)/($12,000) ≈ 0.00008 = 0.008%.
The relative error is what we intuitively think of as the way to determine how good a measurement is. For a quantity Q,
the relative error is dQ/Q. It is frequently stated in terms of a percentage. In fact, if you encounter a percentage error, that
is automatically a relative error. In physics labs, for example, relative error is a much more important item than absolute
error. Keep this in mind when writing up your lab reports!
Again, these things are sufficiently important to merit highlighting:
Absolute error in Q is dQ
Relative error in Q is dQ/Q
Percentage error in Q is dQ/Q × 100%
Note that 100% = 1.00, so the percentage error and relative error are different-looking ways of saying the same thing.
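If you want Maple to do this arithmetic, a minimal sketch might look like the following. The function name relchange is my own invention for this illustration; it is not a built-in Maple command.

```maple
# Relative change dQ/Q as a two-input function (relchange is a made-up name).
relchange := (dQ, Q) -> dQ/Q;
# A $1.00 increase on a $0.50 can of soda:
relchange(1.00, 0.50);             # evaluates to 2, that is, 200%
# The same $1.00 increase on a $12,000 car:
evalf(relchange(1, 12000));        # about 0.0000833, that is, about 0.008%
# The percentage change is the same number multiplied by 100:
100*evalf(relchange(1, 12000));    # about 0.00833 (percent)
```

The same absolute change gives wildly different relative changes, which is the whole point.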
2.4.6 Back to elasticity.

Let’s now apply this to elasticity.
Ratio of relative change in x to relative change in p.

Let’s look at elasticity from this perspective. What would η = (dx/x)/(dp/p) represent? It is the ratio of the relative
change in demand to the relative change in price. An elastic market has (the absolute value of) elasticity greater than 1. If
so, then a small relative change in demand accompanies an even smaller relative change in price. That is, when the price
changes by some percentage, the demand changes by a larger percentage. (Demand stretches further in an elastic market.) For example, if a
2% increase in demand accompanies a 1% decrease in price, then dx/x = 2% = 0.02, while dp/p = −1% = −0.01, so
η = (0.02)/(−0.01) = −2.
An inelastic market has (the absolute value of) elasticity less than 1. In that case, a small relative change in demand
accompanies a larger relative change in price. For example, if a 2% increase in demand accompanies a 3% decrease in
price, then dx/x = 2% = 0.02, while dp/p = −3% = −0.03, so η = (0.02)/(−0.03) = −2/3.
A unit elasticity market has (the absolute value of) elasticity equal to one. That means that the relative change in demand
equals (the negative of) the relative change in price. For example, if a 2% increase in demand accompanies a 2% decrease
in price, then dx/x = 2% = 0.02, while dp/p = −2% = −0.02, so η = (0.02)/(−0.02) = −1.
Why that is relevant to maximizing revenue.

We originally obtained η = −1 as the condition for maximizing revenue. That makes sense, once we realize what happens
in elastic and inelastic markets under price increases or decreases. (For algebraic demonstrations of these conclusions, see
the homework.)
What happens with a price increase in an elastic market? The price goes up by some percentage, but the demand drops
even more (on a percentage basis). The net result is a decrease in revenue. On the other hand, a price decrease (such as a
sale) causes the price to decrease by a certain fraction, but the demand increases by more than enough to compensate, and
the revenue actually increases.
On the other hand, suppose the market is inelastic. Then a price increase (by a certain percentage) causes a decrease in
demand, but not as big (in percentage) as the price increase, and the net effect is an increase in revenue. A price decrease
causes an increase in demand, but not as large a one, and the total revenue drops. Note that this is precisely backwards from
the reactions of an elastic market!
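Some rough numbers make this concrete. The percentages below are the same illustrative ones used earlier (η = −2 for the elastic market, η = −2/3 for the inelastic one), and they treat small finite changes as if they were differentials, so take the results as approximations only. Since R = x p, the new revenue is the old revenue times (1 + dx/x)(1 + dp/p):

```latex
\begin{align*}
\text{Elastic, price up 1\%, demand down 2\%:} \quad & (0.98)(1.01) = 0.9898 && \text{(revenue falls about 1\%)}\\
\text{Elastic, price down 1\%, demand up 2\%:} \quad & (1.02)(0.99) = 1.0098 && \text{(revenue rises about 1\%)}\\
\text{Inelastic, price up 3\%, demand down 2\%:} \quad & (0.98)(1.03) = 1.0094 && \text{(revenue rises about 1\%)}\\
\text{Inelastic, price down 3\%, demand up 2\%:} \quad & (1.02)(0.97) = 0.9894 && \text{(revenue falls about 1\%)}
\end{align*}
```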
Albert: Dudley, here’s your way to memorize elastic versus inelastic. Marketers are interested in the change in demand
of an object, that is, what is the effect of a price change on how much I sell. An elastic market is one where the relative
demand change is larger than the relative price change; the market feels “spongy” to price changes. An inelastic market
is one where the relative demand change is smaller than the relative price change; the market feels “harder.”
In a market with unit elasticity, the percentage price increase equals the percentage decrease in demand. In that case,
the revenue tends to stay constant. That’s equivalent to a horizontal tangent occurring at a maximum. The values of the
function tend to be nearly constant near a maximum.
Of course, there is no guarantee that setting dR/dx = 0 is finding the maximum; it could be finding the minimum. We’d
need to do more analysis (such as examining assumptions to tell us about $d^2R/dx^2$ or the sign of dp/dx). We aren’t going
to get into that here.
Typical examples of elastic and inelastic markets.

An elastic market will often have price markdowns and sales. That boosts revenues. Elastic markets are associated with
luxuries and/or non-essentials. New automobiles are the classic example. If the price of a new car goes up too far, you’ll
tend to hold off buying a new one (or will buy a used one) for a while, and new car revenues will drop. That’s considered
bad.
An inelastic market will usually have stable prices (due more to competition than anything else). Inelastic markets are
associated with necessities. Gasoline is a typical example. If the price of gas goes up, you might cut back on your driving,
but you will still buy. Trendy items have moved from elastic to inelastic markets. There is a major element of psychology
even here. Advertising is always trying to move items from elastic to inelastic. (Consider the stereotypical child wailing,
“If I don’t get one of those, I’ll just die!”) On the other hand, once you’ve seen what third-world cultures are like, you tend
to revise your estimates of what are essentials. Considering the state of our culture, I tend to agree with the third world’s
ideas on most of this.
Homework #24
Exercises.
1. Suppose the demand function for a commodity is $p = 25x^{-4/5}$.

(a) Find the revenue function R = x p as a function of x. Simplify the function you get by combining terms.
(b) Show that the derivative of R with respect to x is positive. Assume that x > 0.
(c) Repeat the previous two parts with $p = 25x^{-6/5}$, but show that the derivative of R is negative.
(d) In the function p = p(x), why must the exponent on x be greater than −1 to make the derivative of R with
respect to x positive? [Hint: What is the resulting exponent on x in R, and why would that make a difference?]
2. Suppose the demand function for a commodity is $p = 18x^{-2/3}$.
(a) Find the revenue function R = x p as a function of x. Simplify the function you get by combining terms.
(b) Show that the derivative of R with respect to x is positive. Assume that x > 0.
(c) Repeat the previous two parts with $p = 18x^{-4/3}$, but show that the derivative of R is negative.
3. Suppose the elasticity of a commodity at a particular production level is η = −1.3. Assume that production is the
same as demand.
(a) If the production goes up by 2%, by what percentage does the price increase or decrease?
(b) If the production goes down by 3%, by what percentage does the price increase or decrease?
4. Suppose that the elasticity of another commodity at a particular production level is η = −0.9. Assume that production
is the same as demand again.
(a) If the production goes up by 1%, by what percentage does the price increase or decrease?
(b) If the production goes down by 2%, by what percentage does the price increase or decrease?
Problems.
1. Consider the following quotation: “The average describes the past, the marginal predicts the future.” To explain
why it makes sense, work the following problem. Suppose Dudley is marketing a new, improved widget sharpener.
(Everyone agrees, thanks to his advertising blitz, that nothing is more intolerable than dull widgets.) Dudley invested
initially $850,000 in development, advertising, and setup costs. He has sold to date 25,000 widget sharpeners for
$65 each, with a production cost of $35 each, for a profit of $30 each.
(a) What is the total cost to date of widget sharpeners? You do this by adding initial costs and production costs.
(b) What is the average cost of a widget sharpener to date? [Remember what average means. What is the current
value of x?]
(c) What is the cost function of a widget sharpener? Find this by adding initial costs to the cost to produce x widget
sharpeners.
(d) What is the marginal cost of a widget sharpener? [Remember what marginal means. Use the cost function.]
(e) If Dudley went by average cost versus selling price, would he produce another widget sharpener?
(f) Compare marginal cost to selling price. Should Dudley continue to sell widget sharpeners at $65 each?
(g) Interpret the quotation at the beginning of this problem in the light of this problem.
2. In this problem, we simply do some algebraic manipulations with η. One fact to keep in mind is that |s | is the
distance of s from the origin, even if s is negative.
(a) Show that another way of writing η is (dx/d p)/(x/p).

(b) Show that an inelastic market means that η > −1, since η is negative.
(c) Show that an elastic market means that η < −1, since η is negative.
3. In this problem, we verify that revenue (R) does what we said it would with increases in price at various elasticities.
(a) Show that dR/d p = x × (η + 1). You can do this by multiplying out the right hand side using the alternate
definition of η, from the previous problem, and by using the product rule on the formula R = x p and showing
the two are the same.
(b) Show that a price increase in an inelastic market leads to an increase in revenue. (Hint: If η > −1, from the
previous problem, then η + 1 > 0.)
(c) Show that a price increase in an elastic market leads to a decrease in revenue. (Hint: If η < −1, from the
previous problem, then η + 1 < 0.)
4. The formula in the first part of the previous problem can be rewritten as
dR = x × (η + 1) d p.
Find the changes in revenue in the situations described in the last two exercises. (Check your answers against what
you found in the previous problem.)
5. Show that the demand function $x = c\,p^n$ has elasticity η = n. You will find the alternate definition of η from problem
2 (a) easiest to use here. (This demonstrates the pattern for elasticities. Unit elasticity looks essentially like x = c/p;
η = −2 looks like $x = c/p^2$. Remember that η is negative!)

1. The future value interest factors (FVIF’s) for various interest schemes are:
Interest scheme          Formula for FVIF
Simple                   $1 + rt$
Compound                 $(1 + k)^n$
Continuous compounding   $e^{rt}$
where r = stated (nominal) interest rate, t = length of time for accumulating interest (converted to years), m =
number of compounding periods per year (for compound interest only), k = r/m = interest rate per compounding
period (for compound interest only), and n = t m = number of compounding periods accumulating interest (for
compound interest only).
2. For limits that yield the indeterminate form “0/0” or “∞/∞” when you plug in the limiting value of the variable, you
can apply L’Hôpital’s rule:
If f(c) = g(c) = 0, or f(c) and g(c) are infinite, then
$$\lim_{x \to c} \frac{f(x)}{g(x)} = \lim_{x \to c} \frac{f'(x)}{g'(x)},$$
provided the second limit exists.
L’Hôpital’s rule can be applied repeatedly to a function, provided the limit continues to give “0/0” or “∞/∞.”
3. There are two possible procedures for finding the x-coordinates of the local (also called relative) maxes and mins of
a function.
(a) The simplest one is used when the function is “nice:”

i. Find the first and second derivatives of the function.
ii. Set the first derivative equal to 0 and solve for x.
iii. Plug those values of x into the second derivative. If the second derivative is positive, that point is a local
min; if the second derivative is negative, that point is a local max; if the second derivative is zero, you have
to use the other procedure.
(b) The other procedure is used for nastier functions, or in the case that the second derivative from the simpler
procedure gave zero.
i. Find the first derivative.
ii. Find all the x’s where the first derivative either equals zero or doesn’t exist. These are the critical points of
the function.
iii. Use the first derivative to determine if the function is rising or falling on the intervals whose end points
are the critical points.
iv. Wherever the function changes from rising to falling or vice versa is a local max or min, provided the
function itself is defined at that point.
(c) The procedure for finding the x-coordinates and values of the global (also called absolute) max and min of a
function on an interval [a, b]:
i. Find the first derivative of the function.
ii. Find all the x’s where the first derivative either equals zero or doesn’t exist. Discard all the x’s that are not
in the interval [a, b].
iii. Put all the remaining x’s, and the values a and b, into a table, and calculate the values of the function for all of
them.
iv. The largest function value in the table is the global max of the function; the smallest function value in the
table is the global min of the function. Ties are possible.
4. Elasticity represents how a market will respond to changes in price or demand. The formulas are
\begin{align*}
\eta &= \frac{p/x}{dp/dx}\\
     &= \frac{p/dp}{x/dx}\\
     &= \frac{dx/dp}{x/p}.
\end{align*}
Elasticity will be negative: η < 0. An inelastic market has |η | < 1, and is typical of necessities. An increase in price
will result in an increase in revenue. An elastic market has |η | > 1, and is typical of luxuries. An increase in price
will result in a decrease in revenue. A unit elastic market has |η | = 1, and will maximize (or minimize!) revenue for
a commodity.
5. The relative change in Q is dQ/Q or (∆Q)/Q.
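As an illustration of the global max and min procedure in item 3(c), here is a minimal Maple sketch. The function and the interval are hypothetical, picked only so the numbers come out cleanly, and the variable names are my own.

```maple
# Example function and interval (both hypothetical, chosen for illustration).
f := x -> x^3 - 3*x;
a := 0;  b := 2;
# Steps i and ii: solve f'(x) = 0, then discard roots outside [a, b].
# (f' exists everywhere here, so there are no "doesn't exist" points to add.)
crit := [solve(diff(f(x), x) = 0, x)];                # the roots 1 and -1
keep := select(c -> evalb(a <= c and c <= b), crit);  # only x = 1 survives
# Step iii: table of candidate x's and their function values.
pts := [a, op(keep), b];                              # [0, 1, 2]
vals := map(f, pts);                                  # [0, -2, 2]
# Step iv: largest and smallest entries of the table.
max(op(vals));                                        # 2, occurring at x = 2
min(op(vals));                                        # -2, occurring at x = 1
```

For a function whose derivative fails to exist somewhere, you would have to add those x's to the list by hand; this sketch does not look for them.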
Chapter 3
Derivatives - II
3.1 Partial derivatives.

You were informed (“warned”) about this section earlier. Now we take up the topic of multi-variable functions (multi-input
green boxes) and how derivatives work there.
3.1.1 Basics.
There are a few things that we will have to cover before we can get into the calculus of multi-input functions.
Motivations.
Are there any reasons for looking at these things? Definitely. There are a number of different rationales for them.
Very few things in life depend on only one other thing. If we want to use calculus in more comprehensive situations, we
will have to deal with the reality that most items in life depend on multiple other items. That means dealing with functions
of many variables, and understanding calculus in those bigger settings.
It might be interesting to view the inventory cost control problem as a function of both x, order size, and n, annual sales
of hamburgers, so that you can see how growth of the business affects order size. And that’s just one option there.
Weather, for example. The weather is one of the most complicated of the systems that are being analyzed today. The
accuracy of weather forecasts would be greatly enhanced by a good model of the atmosphere (a set of equations describing it).
There are too many variables! You’d need to look at latitude, longitude, length of day, season, amount of pollution (which
is difficult to describe all by itself!), geography, and many others.
Multiple-input functions
As before, one critical element of understanding derivatives is to understand functions correctly. There is one handy piece
of terminology that (as far as I know) is unique to me, but which I have learned is very useful. The number of input
variables (the number of input chutes) is called the dimension of the function. So far, all our functions (with one exception
in one part of one homework question) have been one-dimensional, by this terminology.
So, let’s look at the different definitions of functions from this point of view. We can operate multi-input green boxes as
easily as regular (that is, single-input) green boxes. You simply require that whenever all the inputs are duplicated, then the
outputs must be duplicated. That is the essence of consistency that we required then, and still require. We can also work
with gnomes that need more information, but we don’t get anything new.
We can create proper lists. Only now, we will have a number of columns for the input side of the list, one column for
each variable. A gnome has to match all the columns in order to determine the output. And that becomes the condition for
“proper-ness:” if all the input values are identical on two rows, the output values must be the same on those rows.
From there, we can create proper lists of ordered triples, ordered quadruples, or whatever number needs to be used to
express the number of input variables. The most general is called ordered n-tuples.
Graphs are more complicated. Graphs in three dimensions are difficult, and in four dimensions (and up), graphs are
unusable. But there is a bit of confusion possible here, that needs to be tackled up front. If you want to graph a one-
dimensional function (a function of a single variable, say x), you need a plane, that is, two dimensions. Why? The reason is
simple, once you see it. A one-dimensional function (single-input green box) has two parts, the input chute and the output
spout. Values from both need to be plotted. And that is exactly what happens. The horizontal axis is the one used to plot
the input variable, and the vertical axis is used to plot the output variable. That accounts for the two dimensions.
What happens in more dimensions? We will need individual axes for each input variable, but we will need one more
axis, namely the output variable’s axis. The result is that you need n + 1 dimensions to graph an n-dimensional function.
You need n of the axes for input variables, and one more for the output variable.
Formulas are basically the same as before, except that there are more variables around. On the other hand, you will
also want to know how to handle Maple functions in more variables. It turns out to be fairly straightforward. Suppose, for
example, you want to define the function $f(x, y, z) = x^2 z - e^{y \sin x}$. The way to do that in Maple is
> f := (x,y,z) -> x^2*z - exp(y*sin(x));
f := (x, y, z) → x^2 z − e^(y sin(x))

Then, working with f (x, y, z) in Maple is just like you would expect. For example, f (1, 1, 2) can be obtained by just
typing that in:
> f(1,1,2);
2 − e^(sin(1))
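Maple keeps the answer exact unless told otherwise. If you would rather see a decimal, one option is to wrap the call in evalf, as in this short sketch (the function is redefined so the snippet stands on its own):

```maple
f := (x,y,z) -> x^2*z - exp(y*sin(x));
# evalf converts the exact value 2 - exp(sin(1)) to a decimal approximation.
evalf(f(1,1,2));     # roughly -0.32
```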
Derivatives in this case, notations and terminology.

Now that we have covered functions, we tackle derivatives of multi-variable functions. Derivatives will remain wiggle
magnification factors, the best interpretation for understanding a lot of this. But various notation and terminology must
change to accommodate the multiple variables.
Which variable to wiggle? If we want the derivative to be a wiggle magnification factor, which wiggle do we use? We
could wiggle any variable we want! And we will get a wiggle magnification factor for each variable. That means that we
will need to keep careful track of the notations for derivatives. There are multiple wiggle magnification factors, one for
each variable, and they need to be clearly different.
When we decide to wiggle a variable, we will have to make an assumption: no other variable is wiggling at the same
time. Otherwise the wiggle magnification factor will be thrown off by the effects of other variables wiggling at the same
time. Later on, we’ll see how to combine the total effect of multiple wiggles simultaneously, but for now, we wiggle only
one variable at a time, with all others being constant during the process.
The terminology for the wiggle magnification factor of f (x, y, z) when just x is wiggled is the partial derivative of f
with respect to x. The notation is
$$\frac{\partial f}{\partial x}$$
using bent-over d’s. In this case, there are also partial derivatives of f with respect to y and z, and they would be written $\partial f/\partial y$
and $\partial f/\partial z$, and they represent the wiggle magnification factors when just y or just z is wiggled.
The notation $\partial f/\partial x$ denotes that there are other variables around. There really is no difference between the partial
derivatives and regular (single-variable) derivatives, except that there are other variables occurring in the partial derivative.
But since they are keeping still (treated as constants), they don’t really affect much while the derivative is being taken.
On the other hand, they are there, and the partial derivative symbol, ∂ , is a warning that other variables are around, and
need to be taken into consideration. Please keep separate the notations $df/dx$ and $\partial f/\partial x$ when you are writing them. The
notation carries some meaning: Are other independent variables present or not?
The subscript notation is common, confusing. There is another notation that is commonly used. It is very convenient,
but needs to be used with some care. If you have f (x, y, z), the partial derivatives might also be written $f_x$, $f_y$, and $f_z$, meaning
the same as $\partial f/\partial x$, $\partial f/\partial y$, and $\partial f/\partial z$. It’s a lot shorter to write, so it’s used often. But it’s also easy to lose track of subscripts. Use
this notation if you want, but always be careful if you do.
There is one other, less common notation. You will occasionally see Dx f for ∂∂ xf , the subscript indicating the variable
that you are wiggling. It is especially useful when you want to look at the transformation from f to ∂∂ xf . That is, Dx is a
shorthand for ∂∂x , the way that we use d/dx to mean “take the derivative with respect to x of.”
Interpretations of derivatives.
Now that we have some idea about the notation, we need to develop the understanding of the concept.
Still a “wiggle magnification factor.” The wiggle magnification factor (WMF) now means the same as it did in a single-input
function. Suppose you have a function $f(x, y, z)$, and you wiggle just the x-value by $\Delta x$. The value of the function
will change by $\Delta f$, and the ratio of those two wiggles, $\Delta f/\Delta x$, will approximate $\partial f/\partial x$, or $\Delta f \approx (\partial f/\partial x)\,\Delta x$. Similarly, wiggling
just $y$ gives $\Delta f \approx (\partial f/\partial y)\,\Delta y$, and wiggling just $z$ gives $\Delta f \approx (\partial f/\partial z)\,\Delta z$. We will soon see what happens when we wiggle several
variables simultaneously.
We can still salvage the slope of a tangent line! This can be done, but it is a bit of a mess. For those hardy souls who
want to know, here it is in brief. The dimension of a function is the number of independent variables or the number of
inputs. The graph needs one more dimension, for the output variable.
If we try to find a partial derivative, essentially, we treat all variables but one (the one we’re wiggling) as constants.
That has the effect of slicing through the graph of the function with a plane, giving just a single curve on the plane. The
partial derivative is the slope of the tangent line to that curve in the plane.
It is still a rate, when t is the variable. When one of the variables is t = time, we will still refer to the partial derivative
of the function with respect to t as a rate of change of that function.
Contrast to parametric equations. We worked with parametric equations earlier, where there were multiple variables.
The situation now is different, and pointing out the contrasts will help to clarify what we did then, and what we are doing
now.
With parametric equations, there was a single independent variable, t, and all other variables were dependent. An
example is {x = x(t), y = y(t)}. Now there are multiple independent variables, we can wiggle one and hold the others
constant, and there is a single dependent variable. An example is w = f (x, y).
When you want the derivative in parametric equations, you differentiate with respect to t, the only independent variable,
and you will get a regular derivative. When you differentiate w = f (x, y), you have two independent variables, and you get
two partial derivatives, $\partial w/\partial x$ and $\partial w/\partial y$.
In general, a regular derivative is used when there is a single independent variable, and a partial derivative is used
whenever there are multiple independent variables. This is equivalent to using regular derivatives whenever there is a single
input funnel, but partial derivatives whenever there are several input funnels. That remark will come in handy later on.
Total change and total differential.

We now come to the case of wiggling more than one variable at a time. This turns out to be nicer than you might imagine.
Net function wiggle, given different input variables wiggling. What happens when we wiggle all of the variables at
once? It turns out that the easiest way to figure that out is to wiggle the variables one at a time, and then see how to combine
them. We would end up with a succession of wiggles (changes) in $f$. What do we do with them? Let’s figure that out.
Take a simple example first. Suppose we have $f(x, y)$, a function of two variables. (With multiple variables, the hard
step is always in going from one variable to two. Once you see the pattern, going from two to three, or four, or more
variables is quite simple.) Suppose we have a wiggle $\Delta x$ and a wiggle $\Delta y$, both of them very small. The change in $f$ for the
$\Delta x$ wiggle is about $(\partial f/\partial x) \times \Delta x$, where the partial derivative is evaluated at the original point. If we then wiggle $y$ from there
(which will end up giving the same net result as wiggling by both $x$ and $y$ when we look back at the original function), the
function changes by another $(\partial f/\partial y) \times \Delta y$, except that the partial derivative is now evaluated at the x-wiggled point. But, for $\Delta x$
small, since we are approximating anyway, we can get away with evaluating both partial derivatives at the original point.
What then is the final change in $f(x, y)$? It will be approximately the sum of the changes:
\[ \Delta f(x, y) \approx \frac{\partial f}{\partial x}\,\Delta x + \frac{\partial f}{\partial y}\,\Delta y \tag{3.1} \]
Why the sum and not some other combination (product, for example)? Because each individual change is an amount that gets
added to the value of the function. The individual changes are consecutive additions, so the total change is their sum. (That’s
difficult to state; if you don’t get it, don’t worry. Just remember that you add.)
This could be confusing, but if you stay calm and look at the formula, you’ll discover that it really isn’t that complicated.
The left hand side represents the total change in $f(x, y)$ that we are trying to approximate. The right hand side has two
terms. The first term represents the change in $f$ due to the fact that $x$ is changing. The partial derivative $\partial f/\partial x$ is the wiggle
magnification factor for $x$, and it gets multiplied by $\Delta x$ to give that part of the change in $f$. The second term on the right
represents the change in $f$ due to the fact that $y$ is changing. The partial derivative there is $\partial f/\partial y$, the wiggle magnification
factor for $y$, and it gets multiplied by $\Delta y$ to give the other part of the change in $f$. The two added together give the total
change in $f$.
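To see the formula in action, here is a small worked check. The function $f(x, y) = x^2 y$, the starting point (3, 2), and the two wiggles are made up purely for illustration; they do not come from any problem in this text.

```latex
\begin{align*}
f(x,y) &= x^2 y, \qquad (x,y) = (3,2), \qquad \Delta x = 0.01, \quad \Delta y = -0.01 \\
\frac{\partial f}{\partial x} &= 2xy = 12, \qquad \frac{\partial f}{\partial y} = x^2 = 9
  \qquad \text{(both evaluated at } (3,2)\text{)} \\
\Delta f &\approx (12)(0.01) + (9)(-0.01) = 0.12 - 0.09 = 0.03 \\
\text{actual change: } f(3.01,\,1.99) - f(3,\,2) &= 18.029599 - 18 = 0.029599
\end{align*}
```

The two terms pull in opposite directions here, and their sum still lands close to the true change.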
What happens with even more variables? Suppose we have $f(x, y, z)$, and we wiggle all three variables: $x$, $y$, and $z$. The
wiggle magnification formula in this case is the direct generalization of what we had before:
\[ \Delta f(x, y, z) \approx \frac{\partial f}{\partial x}\,\Delta x + \frac{\partial f}{\partial y}\,\Delta y + \frac{\partial f}{\partial z}\,\Delta z \tag{3.2} \]
The same thing happens with more variables: You add in a term for each variable, and the term consists of the partial
derivative with respect to that variable times the wiggle in that same variable. All these get added together
to give the approximate total change in $f$.
Dudley: AUGH! There are lots of these? One for each number of variables?!
Albert: Actually, you have it exactly correct. But they are all so similar, it is easy to remember, if you find the pattern.
Mugsy: And if you don’t?
Albert: Let’s just say that you’d be better off finding the pattern.
The formula for total differentials. When we work with differentials rather than general wiggles, the approximations
become equalities, but the interpretation remains the same. For example,
\[ df(x, y, z) = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \frac{\partial f}{\partial z}\,dz \tag{3.3} \]
This is how you find the differential for multiple-variable (multi-dimensional) functions.
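As a quick illustration of the pattern, here is the total differential of one particular function. The function $f(x, y, z) = x y^2 z^3$ is an arbitrary choice made here, not one used elsewhere in the text.

```latex
\begin{align*}
f(x,y,z) &= x\,y^2 z^3 \\
\frac{\partial f}{\partial x} &= y^2 z^3, \qquad
\frac{\partial f}{\partial y} = 2x\,y\,z^3, \qquad
\frac{\partial f}{\partial z} = 3x\,y^2 z^2 \\
df &= y^2 z^3\,dx + 2x\,y\,z^3\,dy + 3x\,y^2 z^2\,dz
\end{align*}
```

Each term is one partial derivative times the differential of the same variable, exactly one term per input.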
3.1.2 How to calculate partial derivatives.

It’s about time that we started taking derivatives. We have enough formulas that involve them.
Dudley: You can say that again!
Mugsy: But don’t.
Note which variable is being wiggled; treat others as constants.

The partial derivative notation tells you two critical things: what function is being differentiated and which variable is being
wiggled. All variables in the function except the one being wiggled are treated as constants.
This has some interesting effects. If some term contains lots of variables but doesn’t contain the specific one being
wiggled, then the whole term is treated as a constant, with derivative zero! You learn to keep your eyes peeled just for what
you need to differentiate.
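Here is a small sketch of what that looks like in practice. The function is made up for illustration; watch which term drops out in each derivative.

```latex
\begin{align*}
f(x,y,z) &= x^2 y^3 + y \sin z \\
\frac{\partial f}{\partial x} &= 2x\,y^3 + 0
  && \text{($y \sin z$ contains no $x$, so it is a constant)} \\
\frac{\partial f}{\partial y} &= 3x^2 y^2 + \sin z
  && \text{(both terms contain $y$)} \\
\frac{\partial f}{\partial z} &= 0 + y \cos z
  && \text{($x^2 y^3$ contains no $z$, so it is a constant)}
\end{align*}
```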
3.1.3 ALL THE SAME RULES APPLY, EXACTLY AS THEY DID BEFORE.
The product rule, the quotient rule, and (of course) the chain rule operate exactly as before. Partial derivatives are
derivatives, after all. You try to successively simplify the derivatives using the procedures we had, just like our earlier derivatives.
Dudley: This is part of the pattern?
Albert: Actually, it is.
All the same cautions apply, too. Don’t forget the derivative of the inside with the chain rule, for example.
Simplifications apply here, also.

Since there will be many more opportunities for “constants” (actually, terms that contain variables that are being treated as
constants for a specific partial derivative), the simplifications for the product and quotient rules that occurred are relevant.
You might want to look those up again.
Numerous examples of working partial derivatives will be given in class.

Of course, you want to know how to do this on Maple. The surprise is that you already know. Maple notation for partial
derivatives is exactly the same as for regular derivatives. Remember that Maple makes you tell it the variable of
differentiation? That’s because Maple really only takes partial derivatives. You can see that if you ask for the derivative of
f (x) with respect to x, which would be a normal derivative. Here’s what Maple does.
> diff( f(x), x);
∂/∂x f(x)
Maple uses partial derivative notation even when there is only one variable.
So, taking derivatives on Maple is just the same as before. You have to tell it what variable is being differentiated with
respect to, and all other variables are treated as constants, which is just what partial differentiation does. For example, the
partial derivative $\frac{\partial}{\partial w}\left(w\,x^2 \cosh w\right)$ would be typed into Maple as the command
> diff( w * x^2 * cosh(w), w);
x^2 cosh(w) + w x^2 sinh(w)
That’s all there is to it!
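As a further sketch, here is how you might ask Maple for every first partial derivative of one function. The function and the name f are chosen here only for illustration, and the prompts are left off so the lines can be pasted in directly. Maple may print the terms in a different order than the comments show.

```maple
# Hypothetical example function; f is just a name picked for this sketch.
f := x^2 * y^3 + y * sin(z);

diff(f, x);   # y and z are held constant; should be equivalent to 2*x*y^3
diff(f, y);   # x and z are held constant; should be equivalent to 3*x^2*y^2 + sin(z)
diff(f, z);   # x and y are held constant; should be equivalent to y*cos(z)
```

The only thing that changes from one command to the next is the variable named after the comma.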
Homework #25
Exercises.
1. How many dimensions are each of the following functions, and how many dimensions would it take to graph each?
(a) f (x, y, z)
(b) f (t, u, v, w, x)
2. How many dimensions are each of the following functions, and how many dimensions would it take to graph each?
(a) f (s,t, u, v)
(b) f (s,t, u, v, w, x, y, z)
3. Find the partial derivatives with respect to x and y for the following functions.
(a) $2x^3 - 5x^2 y - y^6$
(b) $x\,y^3 (3x - y^2)$

(c) x Arcsin xy
4. Find the partial derivatives with respect to x and y for the following functions.
(a) $7x^2 + 9x^3 y^2 - 2y^5$
(b) $\dfrac{x\,e^y}{4x - 5y^3}$

(c) $y^3 \sec\left(\dfrac{x}{y}\right)$
5. Find all the (first) partial derivatives of the following functions.

(a) ρ ln(Arctan θ )
(b) pV − n R T (n and R are constants)
(c) $x^2 y\,(x y - z^2)^4$
6. Find all the (first) partial derivatives of the following functions.
(a) Arctan(ln ρ) sin θ
(b) $(v_1 + v_2)/(1 + v_1 v_2/c^2)$ ($c$ is a constant)
2
xz
(c) cos y+z
7. Make up three multi-variable functions of your own and find all the first partial derivatives. One should have two
variables, one should have three variables, and one should have four variables.
Problem.
1. When all the input variables are wiggling, it really is not easy to see that the total output wiggle should come from
adding up all of the individual wiggles. This problem will help convince you that adding is
the appropriate thing to do. Let $f(x, y) = x^3 y^2$. We will be moving from (2, 1) to (1.99, 1.02), so ∆x = −0.01 and
∆y = 0.02. For these calculations, use the full precision of your calculator; don’t round off. (You might find it useful
to look back at the work we did in a single independent variable, on page 43.)
(a) Figure out the values of f (2, 1) and f (1.99, 1.02). Then calculate ∆ f from these.
(b) We will now wiggle the x-input alone. We have ∆x = −0.01. Find (∂ f /∂ x)∆x where the partial derivative is
evaluated at (2, 1). Then find f (1.99, 1) − f (2, 1). The two numbers should be close.
(c) We will now wiggle the y-input alone. We have ∆y = 0.02. Find (∂ f /∂ y)∆y where again the partial derivative
is evaluated at (2, 1). Then find f (2, 1.02) − f (2, 1). The two numbers should be close (again).
(d) Compare ∆ f from the first part of this problem with the sum (using the numbers from the other two parts)
(∂ f /∂ x)∆x + (∂ f /∂ y)∆y which is the wiggle magnification approximation to ∆ f .
3.1.4 The chain rule.

Of course, you are expecting me to take off on the chain rule again. After all, it is the most important rule in calculus.
But the multiple-dimension version is substantially less important than the single-dimension version. However, I do want
to emphasize that the multiple-dimension version forces the distinction between regular and partial derivatives, and is a
great help in understanding the regular chain rule.
Mugsy: This I gotta see.
What was the idea behind the single-dimension chain rule? Ultimately, it told you how to differentiate the composition
of two functions, and expressed the derivative in terms of the derivatives of the functions being composed. It became the
reason that differentials work as well as they do. In the multiple-dimension version, I will build off of the formula for total
differentials.
Here’s the situation. Suppose we have a pair of green boxes. The first one has multiple inputs and multiple outputs. The
second has multiple inputs, but a single output, and the inputs of the second match up with the outputs of the first. We can
then compose the two functions in this sense. Put in values for the first box’s inputs, and have all its outputs drop into the
inputs of the second function. The second box then grinds out the value of the composed function. (We don’t need multiple
output functions on the bottom because we are only going to be looking at one output at a time. We would simply treat all
of them exactly as a collection of single-output functions.)
It might be possible to plunge straight in and sort this out, but working up to the general case would be easier. Let’s
take the simplest case first, where there are two intermediate variables and only one independent variable. That means that
we have y = y(u1 , u2 ) with u1 = u1 (x) and u2 = u2 (x). Here’s the green box diagram.
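[Green box diagram: the input x enters the first box, whose two output spouts u1 and u2 feed the two input funnels of the second box, which outputs y.]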
That leads to asking what variables y is given in terms of. It is possible to think of y as a function of u1 and u2 , the way that it
is given. But it is also possible to think of it as a function of x. That is, given a value for x, you can calculate values for u1
and u2 and then use those to get y.
Asking and answering that question is very relevant, since the number of variables that a function is in terms of determines
whether the derivative is a regular or partial derivative. Since you need the values of both u1 and u2 to get y,
the derivatives of y with respect to either u1 or u2 will have to be partial derivatives. But once you have the value of x, that’s
all you need to get the value of y. Yes, you do some more calculations (specifically, you calculate u1 and u2 ), but that can
be done from a knowledge of only x. That means that the derivative of y with respect to x will be a regular derivative.
Now, how would it go? The easiest way is to use differentials. (After all, they are legal to use because of the chain rule,
so using them gets you to the formula for the chain rule very rapidly.) Taking differentials in the formulas u1 = u1 (x) and
u2 = u2 (x) gives
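\[ du_1 = \frac{du_1}{dx}\,dx \quad\text{and}\quad du_2 = \frac{du_2}{dx}\,dx \]
That is, if you wiggle x, those give you how much both u1 and u2 wiggle. But you can tell how much y will wiggle from
that. The differential of the formula y = y(u1 , u2 ) gives
\[ dy = \frac{\partial y}{\partial u_1}\,du_1 + \frac{\partial y}{\partial u_2}\,du_2. \]
Combining that with the other differential formulas gives
\[ dy = \frac{\partial y}{\partial u_1}\frac{du_1}{dx}\,dx + \frac{\partial y}{\partial u_2}\frac{du_2}{dx}\,dx. \]
Dividing by dx gives the chain rule:
\[ \frac{dy}{dx} = \frac{\partial y}{\partial u_1}\frac{du_1}{dx} + \frac{\partial y}{\partial u_2}\frac{du_2}{dx}. \]
Compare that to the regular (single-variable) chain rule,
\[ \frac{dy}{dx} = \frac{dy}{du}\frac{du}{dx}. \]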
Note that both of the terms on the right side of the multi-variable chain rule look exactly like the right side of the single-
variable chain rule, with the exception of one of the derivatives being partial.
The right way to look at this is (as I already mentioned) using a green box diagram. At the top is x, which opens up
into two output spouts u1 and u2 . Those two output spouts then feed into the two input funnels for the y green box. What
happens with the differential formula is perhaps a bit clearer using this. The top input, x, is wiggled by an amount dx. This
wiggles both the “intermediate” variables, u1 and u2 , by amounts du1 and du2 . Those both wiggle the bottom variable y
by the formula for multi-variable wiggles that we had before. The ratio of the bottom wiggle to the top wiggle is then the
derivative, and its formula is (as always) the chain rule.
The way to look at this is that there are two paths through the green box that lead down from x to y, one going through
u1 and one going through u2 . Each one of these contributes a term to the chain rule, since each path contributes to the
wiggle of y.
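A concrete case may help. The functions below are made up for illustration: y is given in terms of two middle variables, and each of those depends only on x. The last line checks the answer by substituting first and differentiating with the ordinary product rule.

```latex
\begin{align*}
y &= u_1^2\,u_2, \qquad u_1 = \sin x, \qquad u_2 = e^x \\
\frac{dy}{dx} &= \frac{\partial y}{\partial u_1}\frac{du_1}{dx} + \frac{\partial y}{\partial u_2}\frac{du_2}{dx}
  = (2u_1 u_2)(\cos x) + (u_1^2)(e^x) \\
 &= 2e^x \sin x \cos x + e^x \sin^2 x \\
\text{check: } y &= e^x \sin^2 x
  \;\Longrightarrow\; \frac{dy}{dx} = e^x \sin^2 x + 2e^x \sin x \cos x
\end{align*}
```

One term came from the path through u1 and the other from the path through u2.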
What would change, then, if we had two top input variables, x1 and x2 ? Of course, the diagram changes. Here’s what it
would become.
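[Green box diagram: the two inputs x1 and x2 enter the first box, whose two output spouts u1 and u2 feed the two input funnels of the second box, which outputs y.]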
But more than that. In order to get a formula for the chain rule, we would have to get a partial derivative of y with
respect to x1 or x2 . That means that we would only want to wiggle one of them at a time, and suppose for the illustration
we pick x2 to wiggle.
The derivatives of u1 and u2 would also change to partial derivatives, since their formulas would now be u1 = u1 (x1 , x2 )
and u2 = u2 (x1 , x2 ). When we wiggle x2 , the wiggles in u1 and u2 will become
\[ du_1 = \frac{\partial u_1}{\partial x_2}\,dx_2 \quad\text{and}\quad du_2 = \frac{\partial u_2}{\partial x_2}\,dx_2. \]
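Combining that with the differential formula for dy we had (we can use it since we haven’t changed the bottom half of the
green box diagram), we get
\begin{align*}
dy &= \frac{\partial y}{\partial u_1}\,du_1 + \frac{\partial y}{\partial u_2}\,du_2 \\
   &= \frac{\partial y}{\partial u_1}\frac{\partial u_1}{\partial x_2}\,dx_2 + \frac{\partial y}{\partial u_2}\frac{\partial u_2}{\partial x_2}\,dx_2.
\end{align*}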
What happens when we divide dy by dx2 ? We get a derivative, of course, but the notation changes to reflect the
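situation. We don’t just get $dy/dx_2$. We had already indicated that the derivative of y with respect to x2 needs to be a partial
derivative, and that’s what you get by dividing: divide $dy$ by $dx_2$ and you get $\partial y/\partial x_2$. Doing that, we get
\[ \frac{\partial y}{\partial x_2} = \frac{\partial y}{\partial u_1}\frac{\partial u_1}{\partial x_2} + \frac{\partial y}{\partial u_2}\frac{\partial u_2}{\partial x_2}. \]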
I snuck something in there on you. When you divide two differentials, you get a derivative. That’s reasonable; it’s how
we defined differentials. But in this case, $dy$ when divided by $dx_2$ gave $\partial y/\partial x_2$ and not $dy/dx_2$. What’s going on? Plenty, and
not much at all.
Dudley: Huh?
Albert: It depends on how confused you are. Not much is happening, really.
Mugsy: Oh, all kinds of things are happening. I don’t understand this at all yet.
Albert: As I was saying....
The notation $dy/dx_2$ would say that y depends only on x2 , and no value of any other variable is needed to evaluate y.
Writing $\partial y/\partial x_2$ would give the derivative also, but at the same time would say that x2 is only one of several (possibly many)
variables needed to evaluate y. The notation for differentials $dy$ or $dx_2$ remains the same in either case, but the notation
for derivatives is pickier. The moral of this lesson is that you can still divide differentials to get derivatives, but when you
write them, you must be careful. Specifically, watch out that you put in partial derivatives when you should, and use regular
derivatives the other times.
Again, there are two paths through the green boxes that lead from x2 down to y, and each path gives a term in the chain
rule. The only question is how to sort out what should be a regular derivative and what should be a partial derivative. That
isn’t too hard, and you might even be able to guess. You use a regular derivative when the variable you are differentiating
with respect to (that is, the variable on the bottom of the derivative) is all by itself on its input level. When x was the only
top variable, then all derivatives with respect to x were regular derivatives. When x on top became x1 and x2 , the derivatives
with respect to x2 became partial derivatives.
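Here is a sketch of that two-on-top, two-in-the-middle situation with specific formulas. The functions are invented for illustration; the last line checks the result by substituting first.

```latex
\begin{align*}
y &= u_1 u_2, \qquad u_1 = x_1 x_2, \qquad u_2 = x_1 + x_2^2 \\
\frac{\partial y}{\partial x_2}
  &= \frac{\partial y}{\partial u_1}\frac{\partial u_1}{\partial x_2}
   + \frac{\partial y}{\partial u_2}\frac{\partial u_2}{\partial x_2}
   = (u_2)(x_1) + (u_1)(2x_2) \\
  &= (x_1 + x_2^2)\,x_1 + (x_1 x_2)(2x_2) = x_1^2 + 3x_1 x_2^2 \\
\text{check: } y &= x_1^2 x_2 + x_1 x_2^3
  \;\Longrightarrow\; \frac{\partial y}{\partial x_2} = x_1^2 + 3x_1 x_2^2
\end{align*}
```

Every derivative here is a partial derivative, because no variable is alone on its level.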
Let’s do one more case before we tackle the whole thing in general. What would happen if we had two top input
variables, but only one intermediate variable? What we have now is u = u(x1 , x2 ) and y = y(u), and again suppose that we
are wiggling only x2 . The green box diagram is here.
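[Green box diagram: the two inputs x1 and x2 enter the first box, whose single output spout u feeds the single input funnel of the second box, which outputs y.]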
The formula for du now becomes
\[ du = \frac{\partial u}{\partial x_2}\,dx_2. \]
(Since we aren’t wiggling x1 , its wiggle is 0, and so it doesn’t show up here.) And the formula for the wiggle of y in terms
of the wiggle in u is an old friend by now:
\[ dy = \frac{dy}{du}\,du \]
since there is only one variable in the formula for y. Combining those two gives
\[ dy = \frac{dy}{du}\,du = \frac{dy}{du}\frac{\partial u}{\partial x_2}\,dx_2. \]
What do we get when we divide dy by dx2 ? It will be a partial derivative, since there is another variable on the same level
as x2 , namely x1 . The final formula for the chain rule in this case is then
\[ \frac{\partial y}{\partial x_2} = \frac{dy}{du}\frac{\partial u}{\partial x_2}. \]
Let’s look at that a bit more closely. There is only one term (no additions) here, since there is only one intermediate
variable. The derivative with respect to u is a regular derivative, since there are no other variables on the same level as u,
while the derivatives with respect to x2 must be partial derivatives since x1 is on its level.
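A short illustration of this one-middle-variable case, with functions made up for the purpose:

```latex
\begin{align*}
y &= \sin u, \qquad u = x_1^2\,x_2 \\
\frac{\partial y}{\partial x_2} &= \frac{dy}{du}\,\frac{\partial u}{\partial x_2}
  = (\cos u)(x_1^2) = x_1^2 \cos\!\left(x_1^2 x_2\right)
\end{align*}
```

The derivative of y with respect to u is a regular derivative because u is alone on its level, while the derivative of u with respect to x2 is partial because x1 shares its level.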
With that under control, we are now going after the gold: the multi-variable chain rule in general. Obviously, this is
going to take some feat of notation
Mugsy: Is that anything like feet of clay?
Albert: Very close.
Dudley: Or like head of brass?
Mugsy: That helps, too.
to keep everything straight. Here’s how we are going to do it. The top input variables will be $x_1, x_2, \ldots, x_n$. The outputs of
the first function will be $u_1, u_2, \ldots, u_m$; these are also the inputs to the second function, which we will call $g(u_1, u_2, \ldots, u_m)$.
(I am using $g$ rather than $y$ because in a moment I will do an example where I use $y$ as an independent variable. But it is still
all the same.) We need to indicate that the $u_j$’s are functions of the $x_i$’s. (I will try to keep the notation consistent in that $i$
will be the subscript on $x$, with $i$ going from 1 to $n$. Then $x_i$ is the generic x variable. And $j$ will be the subscript on $u$, with
$j$ from 1 to $m$. Then $u_j$ will be the generic u variable.) This is done as usual, by writing $u_j = u_j(x_1, x_2, \ldots, x_n)$.
Here is the general green box diagram.
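[Green box diagram: the n inputs x1, x2, ..., xn enter the first box, whose m output spouts u1, u2, ..., um feed the m input funnels of the second box, which outputs g.]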
We now have a quandary. Is $g$ a function of the $u_j$’s or of the $x_i$’s? The answer is that it is both. And that is precisely
the reason that we need the chain rule. It tells us how to relate the (partial!) derivatives of $g$ with respect to the $x_i$’s to the
(partial!) derivatives of $g$ with respect to the $u_j$’s. It is again the question of how to change the variable of differentiation,
just with lots of variables now.
How do we keep all of these straight? With $x_i$’s and $u_j$’s it is bad enough, but when we are working with other sets of
variables (in the homework, for example), it is even worse. The variables are not always nicely given to you in a coherent
pattern enabling you to keep them separate mentally. The trick is this. The function $g$ is originally given (usually as a
formula) in terms of one set of variables. These are the $u_j$’s. Each of those variables depends on another set of variables.
Those are the $x_i$’s. I will often, for my own benefit, draw a pair of green boxes with all the inputs and outputs labeled.
The upper set of inputs are the $x_i$’s and the middle set of variables are the $u_j$’s. This really helps! (Wait until we get to the
examples.)
Let’s figure out how this is going to work. We want the partial derivative of $g$ with respect to $x_i$, a generic one of the x’s.
(Remember, the $x_i$’s are the top variables.) We wiggle $x_i$ by some $dx_i$. This has the effect of wiggling all of the $u_j$’s.
(Those are the middle variables.) The amount of wiggle of each $u_j$ is
\[ du_j = \frac{\partial u_j}{\partial x_i}\,dx_i \]
(I am using differentials here for the wiggles. This could be done with $\Delta u_j$, and you’d get an approximately equal (≈)
rather than an equals (=) in that last equation. The process gives exactly the same answers either way.) Each of the wiggles
$du_j$ causes the value of $g$ to wiggle, and the total wiggle in $g$ is then
\[ dg = \sum_{j=1}^{m} \frac{\partial g}{\partial u_j} \times du_j \]
But we have an expression for $du_j$ (we just got it), which we can plug into this and get
\[ dg = \sum_{j=1}^{m} \frac{\partial g}{\partial u_j} \times \frac{\partial u_j}{\partial x_i}\,dx_i \]
Look at this on the green box diagram. The top input wiggles, wiggling all the middle variables, and each of those wiggles
the bottom output, and the net effect is the sum of all of those wiggles. Finally, then, we get that the wiggle magnification
factor is
\[ \frac{\partial g}{\partial x_i} = \sum_{j=1}^{m} \frac{\partial g}{\partial u_j} \times \frac{\partial u_j}{\partial x_i} \]
To help you remember this another way, I have written out the multi-variable chain rule’s summation (with the single
variable version below for comparison), in the following box.
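\[ \frac{\partial g}{\partial x_i} = \sum_{j=1}^{m} \frac{\partial g}{\partial u_j} \times \frac{\partial u_j}{\partial x_i} \]
\[ \frac{dg}{dx} = \frac{dg}{du} \times \frac{du}{dx} \]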
Note that the pattern with the partial derivatives is the same as the single-dimensional chain rule. The terms in the
partial-derivative chain rule look as though you are canceling the $\partial u_j$’s, just as it looks as though the $du$’s are canceling in the
single-dimensional chain rule. Now, though, the fact that these are partial derivatives clues you to the fact that there is more
than one $du$ around, and that you have to add up multiple terms. That is why you need the summation. Also note that you
get one term in the summation of the partial derivatives for each middle variable. There is a contribution to $\partial g/\partial x_i$ due to
$u_1$ changing, another part due to $u_2$ changing, . . . , and a part due to $u_m$ changing. Each of these contributes one term to the
summation. These ideas, when put together, help you set up the correct expression for working the chain rule. Just be sure
to remember to add it all up!
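If the summation sign hides the pattern, it may help to write one case out in full. As an illustration only, take three middle variables (m = 3) and ask for the derivative with respect to the first top variable:

```latex
\[
\frac{\partial g}{\partial x_1}
  = \frac{\partial g}{\partial u_1}\frac{\partial u_1}{\partial x_1}
  + \frac{\partial g}{\partial u_2}\frac{\partial u_2}{\partial x_1}
  + \frac{\partial g}{\partial u_3}\frac{\partial u_3}{\partial x_1}
\]
```

There is one term per middle variable, and the same top variable sits at the bottom of the second factor in every term.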
In fact, let’s work another problem that forces us to be exceedingly careful about regular versus partial derivatives. Let’s
suppose we have a function like g(x, y,t), which depends on both position in the xy-plane and time. (If you want, you can
think about g as representing the temperature of a plate that is heating up. The temperature would, in that case, depend both on
the location (x, y) and the time t.) Suppose also that you have a curve that is parameterized by time, x = x(t) and y = y(t).
(If you want, you can think of this as the parametric equations for the position of a bug crawling along on the plate.) Then,
given a value of t, you can figure out the values of x and y, and from those (with the value of t still), you can find the value
of g. That is, we can also think of g as a function of only t. Suppose you want to find the derivative of g as a function of
time alone. (That would be asking how fast the plate under the bug is heating up as it is crawling around.) How would you
do that?
This one is confusing because time enters in an unusual way. To sort all of this out, we need to isolate the top and
middle variables. The middle variables are the ones that g is given in terms of; in this case x, y, and t. The top variables are
the ones that those variables are in terms of; in this case, just t. (It’s the fact that t appears twice that makes this problem
interesting.)
Mugsy: Has anyone ever told you that you have a warped sense of “interesting?”
If we set up the formula for the differential of g using just the intermediate variables, we get
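\[ dg = \frac{\partial g}{\partial x}\,dx + \frac{\partial g}{\partial y}\,dy + \frac{\partial g}{\partial t}\,dt \]
From the x = x(t) and y = y(t) equations, we can get dx and dy in terms of dt:
\[ dx = \frac{dx}{dt}\,dt \quad\text{and}\quad dy = \frac{dy}{dt}\,dt \]
(Yes, those are not partial derivatives, but regular derivatives. I’ll explain momentarily, when we’ve finished the problem.)
Plugging the values in for dx and dy gives
\[ dg = \frac{\partial g}{\partial x}\frac{dx}{dt}\,dt + \frac{\partial g}{\partial y}\frac{dy}{dt}\,dt + \frac{\partial g}{\partial t}\,dt \]
from which we can factor a dt and get
\[ dg = \left( \frac{\partial g}{\partial x}\frac{dx}{dt} + \frac{\partial g}{\partial y}\frac{dy}{dt} + \frac{\partial g}{\partial t} \right) dt \]
Finally, dividing through by dt gives
\[ \frac{dg}{dt} = \frac{\partial g}{\partial x}\frac{dx}{dt} + \frac{\partial g}{\partial y}\frac{dy}{dt} + \frac{\partial g}{\partial t} \]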
Now that looks weird. We have both dg/dt and ∂ g/∂t in the same equation! What is going on? Something important to
keep straight, which is why I did this problem. When we write dg/dt, we are looking at g as a function of only t. That is,
after we plug the formulas for x(t) and y(t) into g(x, y,t), we get a formula for g with t as the only variable; we only need to
have t to find out g. The derivative of that function is dg/dt. If we take the derivative of g(x, y,t) with respect to t before we
plug in the formulas for x(t) and y(t), then when we take that derivative of g, we have other variables around, and we then
denote the derivative of g with respect to t as ∂ g/∂t. We calculate the value of ∂ g/∂t by just differentiating the formula we
are originally given for g(x, y,t).
Perhaps an example would be of benefit. Suppose
\[ g(x, y, t) = \cos x + e^y + \sinh t \]
(where the goofy functions will help us keep things separate). Also, suppose the parametric equations are
\[ x(t) = t^2 \quad\text{and}\quad y(t) = t^3 \]
Then we can calculate all the partial derivatives of g:
\begin{align*}
\frac{\partial g}{\partial x} &= -\sin x \\
\frac{\partial g}{\partial y} &= e^y \\
\frac{\partial g}{\partial t} &= \cosh t
\end{align*}
We can also find g(t), which we get by plugging in the formulas for x(t) and y(t) into the function g(x, y,t):
\[ g(t) = \cos(t^2) + e^{t^3} + \sinh t \]
which we can then differentiate to get dg/dt:
\[ \frac{dg}{dt} = -\sin(t^2)\,2t + e^{t^3}(3t^2) + \cosh t \]
by the regular chain rule. Compare that to $\partial g/\partial t$, and you realize that you will miss the first two terms, the ones that came
from x(t) and y(t), if you think that $dg/dt$ is the same as $\partial g/\partial t$. The correct way to find $dg/dt$, namely the chain rule,
gives
\begin{align}
\frac{dg}{dt} &= \frac{\partial g}{\partial x}\frac{dx}{dt} + \frac{\partial g}{\partial y}\frac{dy}{dt} + \frac{\partial g}{\partial t}\frac{dt}{dt} \tag{3.4} \\
 &= -\sin x\,(2t) + e^y (3t^2) + \cosh t \tag{3.5}
\end{align}
Note that all you have to do is plug $x = t^2$ and $y = t^3$ into this last equation to get what we had before for $dg/dt$. Stare at
this example until you understand where all the terms come from and why. It really will help!
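If you would like Maple to confirm that the two routes agree, something like the following sketch should do it. The names path, direct, and chain are chosen here for illustration, and the prompts are left off so the lines can be pasted in directly.

```maple
# The names g, path, direct, and chain are illustrative choices for this sketch.
g := cos(x) + exp(y) + sinh(t):
path := {x = t^2, y = t^3}:

# Route 1: plug in the path first, then take the regular derivative in t.
direct := diff(subs(path, g), t);

# Route 2: the chain rule, built from partial derivatives of the original g.
chain := diff(g, x) * diff(t^2, t) + diff(g, y) * diff(t^3, t) + diff(g, t);

# Put the path into the chain-rule result and compare; this should give 0.
simplify(direct - subs(path, chain));
```

Notice that diff(g, t) in the second route treats x and y as constants, so on its own it produces only the cosh(t) term.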
Was this example contrived merely to be complicated? Not at all. This same sort of confusion occurs in fluid mechanics,
for example, where instead of dg/dt, they write it as Dg/Dt, and call it the material derivative. You need the material
derivative to calculate the acceleration of the particle. (g in this case is the velocity of the particle, derived from the flow of
the fluid. Then the derivative with respect to t is the acceleration.) By itself, $\partial g/\partial t$ represents only how the fluid flow is changing at
a single point, since the other variables (x, y, and z) are being held constant. But to get the acceleration of a particle of fluid
correctly, you have to take into account that the particle is moving, too! (If this confuses you, take heart. It is genuinely
confusing. Fluid mechanics is full of this stuff, and I am just beginning to get a handle on it myself.)
How, then, in practice, do you tell whether to use d()/d() or ∂ ()/∂ ()? It actually is easy (despite the hassle above).
Look at the green box diagram. If you are taking a derivative with respect to a variable that is all by itself at that level, then
it is d()/d(). If there are several variables on that level, then the derivative is ∂ ()/∂ (). Compare to what we had above with
dg/dt versus ∂ g/∂t. When we were treating g as a function only of the top t, the derivative was dg/dt. When we were
treating g as a function on the level with x, y, and t, the derivative was $\partial g/\partial t$. Look at it again, and think about this until it sinks
in. Once it makes sense, you have arrived in your understanding of the difference between the regular (total) derivative
and the partial derivative. This also explains why we used dx/dt and dy/dt rather than partial derivatives when finding the
derivatives of x = x(t) and y = y(t).
Homework #26
Problem.
1. Suppose we start with a function w = w(x, y), but that we want to take the partial derivatives of w with respect to
r and θ , the polar coordinate system. (This happens often enough in applications, when you decide that the polar
coordinate system is more suited to your problem than rectangular coordinates are.) The equations relating polar
and rectangular coordinates are x = r cos θ , and y = r sin θ . Also, I will use the subscript notation here, since it is so
common, and also saves a lot of space.
(a) Use the chain rule to show that
$w_r = w_x \cos\theta + w_y \sin\theta$ and $(1/r)\,w_\theta = -w_x \sin\theta + w_y \cos\theta$.
(b) Use those equations to show that $(w_x)^2 + (w_y)^2 = (w_r)^2 + (1/r)^2 (w_\theta)^2$. (Note: This last equation is for a rather
important combination of partial derivatives in applications. The point of this problem was to show how to
convert the expression from one coordinate system to another. This happens all the time when dealing with
what are called partial differential equations.)
3.1.5 Higher-order partial derivatives.

Just as we could take higher-order derivatives with single-variable (one-dimensional) functions, we can take higher-order
partial derivatives. The process is exactly the same: just keep differentiating until you’ve taken the right number of derivatives.
There is one complication, however. There are n first derivatives of an n-dimensional function, one for each variable.
Each of those will have n second derivatives, for a total of $n^2$ possible. Each of those has n more derivatives, etc. In general,
an n-dimensional function has up to $n^k$ possible kth derivatives. We’ll need to keep these straight. What a challenge!
Notations.
There are two different notations for first-order partial derivatives, and they both extend to notations for higher-order partial
derivatives. There is a subtle difference, but it turns out not to be serious in any real situation.
The ∂ ()/∂ () notation extends in exact analogy to the way the notation for d()/d() extended to higher-order derivatives.
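For example, if you have f (q, r, s), then the first partial derivative with respect to q is
$$\frac{\partial f}{\partial q}$$
If you want the derivative of that with respect to r, it would be
$$\frac{\partial}{\partial r}\left(\frac{\partial f}{\partial q}\right)$$
which would be compressed to
$$\frac{\partial^2 f}{\partial r\,\partial q}$$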
Note several things. First, the 2 in the top indicates the total number of derivatives to take. Second, the order in which you
take the derivatives is indicated in the bottom of the derivative, from right to left. (The reason for the order is that in this
notation, we tack the derivatives onto the left side, so the leftmost variables are the ones that showed up last.)
A bigger example is given by f (w, x, y, z) and the derivative
$$\frac{\partial^4 f}{\partial y\,\partial z^2\,\partial w}$$
In this case, there will be a total of four partial derivatives taken, one with respect to w first, then 2 with respect to z next,
and finally one with respect to y.
The subscript notation for partial derivatives has its extension also. If the first partial of f (w, x, y, z) with respect to x
is $f_x$, then its next derivative with respect to z is $(f_x)_z = f_{xz}$. The order of taking derivatives is backwards from the other
notation. Here, we take the partial derivatives in order from left to right, since we tack on derivatives to the right-hand side.
The subscripts more to the right showed up later. There is no typical notation to abbreviate several derivatives with respect
to the same variable in succession, the way we could in
$$\frac{\partial^4 f}{\partial y\,\partial z^2\,\partial w}$$
where the two z-partial derivatives were combined. In fact, that derivative would be written as
$$f_{wzzy}$$
in subscript notation.
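A short worked example may help keep the two orderings straight. The function below is a hypothetical one chosen only because its derivatives are easy to read off.

```latex
% Hypothetical example: differentiate first in x, then in y, then in z.
\begin{align*}
f(x, y, z) &= x^2 y^3 z \\
f_x &= 2x y^3 z \\
f_{xy} &= (f_x)_y = 6x y^2 z \\
f_{xyz} &= (f_{xy})_z = 6x y^2 \\
\frac{\partial^3 f}{\partial z\,\partial y\,\partial x} &= f_{xyz} = 6x y^2
\end{align*}
```

The same three derivatives appear in both notations: read the subscripts left to right, and read the bottom of the ∂ notation right to left.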
Interpretations.
A graphical interpretation of second-order and higher partial derivatives is seriously complicated. I’m not going even to try
to hassle you with it.
Even the algebraic interpretation is clumsy. Basically, for example,
$$\frac{\partial^2 f}{\partial r\,\partial q} = \frac{\partial}{\partial r}\left(\frac{\partial f}{\partial q}\right)$$
so the second partial derivative is the rate at which the first partial derivative is changing. It’s hard to say much more.
Equality of mixed partials, and using that information.

I want to do an example of higher-order partial derivatives, and make a point at the same time. Suppose we have
$f(x, y, z) = e^{xy} \sin(2z)$, and we want to find all the first- and second-order partial derivatives. That’s a lot of derivatives, but it will make
the point abundantly clear. I will use the subscript notation, because that takes up so much less room.
$$f_x = y e^{xy} \sin(2z) \qquad f_y = x e^{xy} \sin(2z) \qquad f_z = 2 e^{xy} \cos(2z)$$
$$\begin{array}{lll}
f_{xx} = y^2 e^{xy} \sin(2z) & f_{xy} = (1 + xy) e^{xy} \sin(2z) & f_{xz} = 2y e^{xy} \cos(2z) \\
f_{yx} = (1 + xy) e^{xy} \sin(2z) & f_{yy} = x^2 e^{xy} \sin(2z) & f_{yz} = 2x e^{xy} \cos(2z) \\
f_{zx} = 2y e^{xy} \cos(2z) & f_{zy} = 2x e^{xy} \cos(2z) & f_{zz} = -4 e^{xy} \sin(2z)
\end{array}$$
Note that
$$f_{xy} = f_{yx}, \quad f_{xz} = f_{zx}, \quad \text{and} \quad f_{yz} = f_{zy}$$
and that none of the others match. The matching always happens. The property is called the equality of mixed partial
derivatives, and is a very powerful result in mathematics. (It should be noted that there are conditions that need to be met
for this to happen. They are technical, and require much more than I would expect you to know for a calculus course.
However, for all of the functions you will encounter for a long while—probably forever, unless you are a mathematics
major—the conditions will hold.)
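If you would like Maple to confirm the matching, a minimal sketch along the following lines should do it. It uses only diff and simplify; giving diff two variables in a row means "differentiate with respect to the first, then the second." Each difference should simplify to 0.

```maple
# The example function from above
f := exp(x*y)*sin(2*z);

# Difference of the two orders for each mixed pair
simplify( diff(f, x, y) - diff(f, y, x) );
simplify( diff(f, x, z) - diff(f, z, x) );
simplify( diff(f, y, z) - diff(f, z, y) );
```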
There is an extension to the equality of mixed partial derivatives. It says that when you want to find the partial derivatives
of a function, you can do any of them in any order you want, as long as you end up taking the correct number of partial
derivatives with respect to each of the variables. This is occasionally useful, as can be demonstrated by exaggerated
examples, such as this one. Suppose we want to find $\partial^4 f/\partial x\,\partial y^3$ for the function
$$f(x, y) = \sec\left(\frac{\sqrt{y^3 - y^2 + 15} + \cosh(y^3)}{\ln(y^2 + e^{-y})}\right) + x^3 y^4$$
The whole point is not to try to take three derivatives of the first term with respect to y (which is what the notation is
asking for), but rather be intelligent about what derivatives to take first, and take the derivative with respect to x before the
derivatives with respect to y. Why? Because the derivative with respect to x first causes the entire first term to evaporate!
There are no x’s in it, so its derivative with respect to x is 0. Then take the three derivatives with respect to y, but you only
have to deal with the last term, which isn’t too bad.
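Written out, the steps look like this (a sketch of the hand calculation; the first term is never differentiated with respect to y at all):

```latex
\begin{align*}
\frac{\partial f}{\partial x} &= 0 + 3x^2 y^4
  && \text{(the sec term contains no $x$)} \\
\frac{\partial^2 f}{\partial y\,\partial x} &= 12 x^2 y^3 \\
\frac{\partial^3 f}{\partial y^2\,\partial x} &= 36 x^2 y^2 \\
\frac{\partial^4 f}{\partial y^3\,\partial x} &= 72 x^2 y
\end{align*}
```

By the equality of mixed partials, this is the same as the derivative the problem asked for.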
Again, you can do this with Maple, but there is no real difference between what we have done before and what we are
doing now. Maple simply takes care of all of the messy algebra for us. For example, in the horrid example right before this,
you could write
> w := sec( ( sqrt( y^3 - y^2 + 15) + cosh(y^3) ) / ln( y^2 + exp(-y) )
> ) + x^3 * y^4;
> diff( w, y$3, x );
w := sec( (sqrt(y^3 - y^2 + 15) + cosh(y^3)) / ln(y^2 + e^(-y)) ) + x^3 y^4
72 x^2 y
Note that you only have to specify the variables for differentiation, not the total number of derivatives (the 4 in this
case). Maple can add. Maple can also take just diff(w,y);, but that’s too horrible to include here.
Homework #27
Exercises.
1. Find all the different (that is, potentially unequal) second partial derivatives of the following functions. So, for
example, you don’t have to list fxy and fyx , since these should be equal.
(a) $x^4 y^5$
(b) y ln x
(c) Arctan(x y)
2. Find all the different second partial derivatives of the following functions.
(a) $x^2 y^5$
(b) y Arctan x
(c) ln(x y)
3. Make up three functions (multi-variable) of your own and find all the different second partial derivatives. Usual rules
apply.
Problem.
1. Find $f_{xxxyy}$ for $f(x, y) = y\,e^{xy}$. [The order can simplify things considerably.]
Investigation.
1. We want to solve the partial differential equation
$$\frac{\partial^2 w}{\partial x^2} = \frac{1}{c^2}\,\frac{\partial^2 w}{\partial t^2}$$
This is one of the fundamental equations of physics, called the wave equation. Here, c is the speed of propagation of
the wave, and is a constant. Let f (u) and g(v) be any reasonable (differentiable) functions. Set up
w = f (x + ct) + g(x − ct)
This is the solution. (Don’t ask how I got it. You don’t want to know.) We can check it, though! For this, assume that
u = x + ct and v = x − ct, so w = f (u) + g(v).
(a) Find ∂w/∂x and ∂w/∂t in terms of the derivatives of f (u) and g(v). [Will the derivatives of f and g be partial
or regular derivatives?]
(b) Find $\partial^2 w/\partial x^2$ by taking ∂/∂x of ∂w/∂x. (Careful doing this. You’ll need the chain rule again.) Find $\partial^2 w/\partial t^2$
by taking ∂/∂t of ∂w/∂t. (Again, careful doing this.)
(c) Plug the results of $\partial^2 w/\partial x^2$ and $\partial^2 w/\partial t^2$ into the wave equation, and show that both sides are equal. This
verifies that w as given at the beginning is a solution of the wave equation.
3.1.6 Implicit functions and their derivatives.

Implicit functions are the ones where the variables are scrambled together. Since there is no obviously preferred variable to
be dependent or independent, problems can occur. It is best to deal with implicit functions by partial derivatives, because
of the multiple-variable nature of implicit functions. We begin our study of implicit functions now.
The notation for implicit functions reflects this scrambling together. An implicit function of x and y will typically be
written as f (x, y) = 0. You can always get implicit functions into that form simply by moving all the terms to the left side
of the equation.
Even the term implicit function is a bit misleading, since the result might not be a green-box type of function. However,
the terminology is stuck, and we can’t change it now. An example of an implicit “function” is $x^2 + y^2 - 1 = 0$ (or $x^2 + y^2 = 1$
to put it in the more usual format). The graph is a circle, which is not the graph of a function.
Level sets.
Before we can get very far with implicit functions, we need to look carefully at the type of items defined by a slightly more
general equation, namely f (x, y) = C, where C is any constant, and not just 0. And we will be concerned initially about the
geometry of equations defined this way. Only later will we get to the calculus side of things.
Earlier in this chapter, we talked about the dimension of a function. We come back to that idea, and expand on it some.
The idea is to come up with a way to deal with two- and three-dimensional functions that doesn’t require graphs in three
or four dimensions. Here’s how. A level set of an n-dimensional function $f(x_1, x_2, \ldots, x_n)$ is the collection of points that
satisfy the equation $f(x_1, x_2, \ldots, x_n) = C$. Obviously, then, graphs of implicit functions are just very special cases of level
sets, namely the ones where C happens to equal 0.
Justifications. Why do this? There had better be a better reason than “because it is there” or something equally useless.
The only convenient way to visualize three-dimensional functions. If you’ll remember, in order to graph a three-
dimensional function (that is, a function with three independent variables), you’d need four dimensions, three dimensions
for the input variables and a fourth dimension for the value of the output. This is discouraging, since three-dimensional
functions do occur (after all, this is a three-dimensional world), and four-dimensional graphs are difficult to work with, to
put it mildly.
On the other hand, if we work with level sets of a three-dimensional function, we don’t need to go beyond three
dimensions. All we need to do is draw (somehow) the different level sets of the function in order to convey the “shape” of
the function. That still isn’t easy—three-dimensional graphs are difficult to interpret, much less to draw—but at least it is
possible.
The only convenient way to produce topographical maps. A somewhat more everyday example is producing to-
pographical maps. A common topographical map shows the altitude of the locations on the map. Consider how this would
work with a graphical approach to the altitude. The function altitude is a two-dimensional function (you need to know lati-
tude and longitude, two variables, to get the altitude). The graph of the altitude function should then be three dimensional.
Two variables locate the point, and the third dimension is altitude. The graph is a miniature version of the locale of the
map, like you might find in a large model railroad setup. Such a map would not be very easy to carry, and just think about
trying to fold one!
But fortunately, map makers have a better way to represent the altitude than using graphs. Instead, they draw level
curves on the paper that represent different altitudes, and communicate the same information the graph of the function
would, but in a more convenient form.
Description. If you’ll notice, the idea of using level curves to represent altitudes of places on the map yields a two-
dimensional map. You don’t need the third dimension to get values this way! This is very handy.
In general, a level set of a function with n variables can be plotted in n dimensions. If you plot a bunch of the level sets
for various values of C, you can get an idea of the values of the function at all points. It really is a very big connect-the-dots
game. Each point has a function value, and you connect all the points that have a specific value by a level set.
There is one difficulty. If we plotted all of the level sets, every point would be covered, and you wouldn’t get any
information. Plotting level sets requires some common sense.
Uses. I still probably haven’t convinced you that level sets are useful in everyday life. If you aren’t a hiking advocate,
you probably haven’t encountered topographical (U.S. Geological Survey) maps. But there are a few level sets on maps that
you have seen. The weather bureau produces several. Instead of giving the barometric pressure at a bunch of points, they
will draw in isobars, level sets of barometric pressure. These are the curved lines that you see encircling the high and
low pressure regions. (Don’t confuse them with the warm and cold fronts, which mark boundaries between air masses.)
And USA Today has popularized the multicolored temperature maps, where regions of roughly equal temperature are
colored the same. The boundaries between the colors are level sets of temperature!
Relations between graphs and level surfaces. When the function has two dimensions (independent variables), the level
set of that function will be a collection of points in two dimensions, usually called a level curve of the function. When the
function has three dimensions, the level set of that function will be a collection of points in three dimensions, usually called
the level surface of the function. This is nothing more than terminology.
If you are given a set in two dimensions, there is a question that comes up. Is this the graph of a one-dimensional
function or the level set of a two-dimensional function? It is a worthwhile question to ask, because you deal with level
sets differently than you do graphs. How would you tell? You’d look at the equation that gave the set. If it is in the form
y = f (x), then it’s the graph of a one-dimensional function. If it is in the form F(x, y) = C, then it’s the level curve of a
two-dimensional function. The key is in the form of the equation. You could change the form of an equation to another, but
equivalent, form, and change whether a specific curve was a graph or a level curve. We’ll do that later, in fact.
The same question can be asked about a three-dimensional set. Is it the graph of a two-dimensional function or the level
set of a three-dimensional function? Since we don’t often work any higher than that, we won’t go any further. But again,
the form controls how you view the set. The same comments apply here, too.
The graph of a two-dimensional function and its level set. Now we come to one of the big questions we’ll have to
answer. If you are given a single function f (x, y), what is the relation between its level curves and its graph? Again, this is
of more than casual interest to hikers using a topographical map. They need to look at the level curves on the map and use
that to decide what the terrain looks like in order to locate themselves on the map. (Remember, the graph of the altitude
function is the terrain, with hills and valleys and other things.)
One problem is that the level set is two-dimensional, but the graph is three-dimensional. That creates problems con-
necting them. This process requires thinking in three dimensions, and this is very difficult for some people. I will try to
make this as easy as I can. Suppose we have a two-dimensional function f (x, y). First, let’s go from the graph to the level
sets. It’s the easier of the two directions. What would be the equation of a level curve? That is an important question. A
level curve consists of points that all have the same value of f (x, y). That means they satisfy the equation f (x, y) = C, for some value
of the constant C. This is the equation of the level curves of any (two-dimensional) function.
What is the equation of the graph of the function? It is z = f (x, y). (Notice that the extra variable, z, crept in. That
forces the graph into three dimensions, the extra dimension being necessary for the output value of the function. Also, note
that we needed no extra variable for level curves. The value of C is not so much of an output value as a parameter that we
get to choose.) How would we relate these two? We can get z = f (x, y) to match f (x, y) = C if we force an extra condition
on the graph, namely z = C.
Eliminating z between z = f (x, y) and z = C gives f (x, y) = C. But what is z = C? In three dimensions, it is a horizontal
plane with height C. What does it mean to require both z = f (x, y) and z = C? To get the solutions of both at once, we
look for the intersection of the two equations. What happens when we intersect the graph with a horizontal plane? It has
the effect of slicing through the graph at a specific height. That slice (or, depending on your way of visualizing graphs, the
edge of the slice) is a level curve of the function, almost. To be a level curve, it needs to be in the xy-plane. So, push it
down to the xy-plane, and you get a level curve of the function. Do this with a number of different C’s, and you will get a
number of different level curves. You slice the graph horizontally into little strips, and the edges of those strips are the level
curves.
On the other hand, you can also go backwards, from the level curves to the graph of the function. Normally, on a set
of level curves, the value of the function on each level curve is given to you. That means that on the graph, all those points
have the same height. Lift up the level curve to that height, at least in your imagination. Do this with all the other level
curves, and you will get a sort of wire frame for the graph of the function. Fill it in reasonably, and you will get the graph
of the function, at least if the function itself is reasonable. This is precisely what hikers have to do. They then compare the
terrain around them with the reconstruction of the terrain from the level curves on the map.
There is no substitute for an example. Take $f(x, y) = x^2 + y^2$. My suspicion is that none of you knows what the graph
of this function looks like, but that some of you will be able to identify the level curves. The surface is called a paraboloid,
the shape of mirrors in telescopes and searchlights. The level curves are $x^2 + y^2 = C$ for different values of C. These are
circles centered at the origin, at least as long as C > 0. The larger C, the larger the circle. We get the wire frame
by lifting these concentric circles up. The larger the radius ($\sqrt{C}$), the higher we lift the circle. The frame fills in to
give a bowl-shaped object.
We could also work backwards. If we start with the paraboloid, we get the level curves by slicing it horizontally. The
slices will be circles, which are then pushed down to the xy-plane to give the level curves, a series of concentric circles,
$x^2 + y^2 = C$.
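If you want to see both pictures, a minimal Maple sketch like the following should draw them. The plotting window (from -2 to 2) and the particular levels C = 1, 2, 3, 4 are arbitrary choices of mine, not anything special.

```maple
with(plots):

# Level curves x^2 + y^2 = C for a few chosen values of C
contourplot( x^2 + y^2, x = -2..2, y = -2..2,
             contours = [1, 2, 3, 4], scaling = constrained );

# The graph z = x^2 + y^2 (the paraboloid) for comparison
plot3d( x^2 + y^2, x = -2..2, y = -2..2 );
```

The first plot is what a map maker would draw; the second is the "terrain" that the hiker has to reconstruct from it.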
Converting a graph of a two-dimensional function into a level set of a three-dimensional function. It turns out
that level sets are easier to work with than graphs. In fact, when we want to do serious work with the graph of a function,
we will convert it to a level set by changing the function. So, if we are given the graph of a two-dimensional function, how
can we convert the graph to be the level surface of a different function?
Note that the level sets of the two-dimensional function will be curves in the xy-plane. These definitely won’t equal the
graph in three dimensions. What we are trying to do here (and I’ve said it several times to get the point across forcefully) is
find a new function which will have that graph as a level surface. What does that mean about the new function? It will have
to have three independent variables, because its level surface is in three dimensions. The function that we graphed had two
dimensions. What we need is a way to put that third variable in properly.
The method of doing that is so simple that it is hard to see. The graph of the function f (x, y) will have equation
f (x, y) = z. The extra variable is z. It is the output variable, the dependent variable, or however you want to say it. How do
we convert f (x, y) = z into a level set equation? Simple: Pull all the variables to one side, and get f (x, y) − z = 0. That is the
equation of a level set! It is the level set of F(x, y, z) = f (x, y) − z corresponding to the constant C = 0. That is, F(x, y, z) = 0
is precisely the same as f (x, y) = z.
That looks (and is) easy, but something quite unusual has happened. The dependent variable z in f (x, y) = z has changed
into the independent variable z in F(x, y, z) = 0. That switch is really “all” that happened. What we did to convert the graph
of a two-dimensional function into the level set of a three-dimensional function was add in explicitly the dependent variable,
declare it to be an independent variable, and move it to the other side of the equals sign where all good independent variables
belong. That gives the equation of the function whose level set (corresponding to the constant 0) is the graph of the original
function.
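As a quick illustration, here is the conversion carried out on the paraboloid from the earlier example. The names G and H are just labels I am introducing to show that the answer is not unique.

```latex
\begin{align*}
\text{graph:} \quad & z = x^2 + y^2 \\
\text{level set:} \quad & F(x, y, z) = x^2 + y^2 - z, \qquad F(x, y, z) = 0 \\
\text{also:} \quad & G(x, y, z) = z - x^2 - y^2, \qquad G(x, y, z) = 0 \\
\text{also:} \quad & H(x, y, z) = x^2 + y^2 - z + 7, \qquad H(x, y, z) = 7
\end{align*}
```

All three level surfaces are the same set of points as the original graph.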
Homework #28
Exercises.
1. What is a function which has a level set which is the same as the graph of y = 2/x? (There are many correct answers
here. Can you come up with several functions?)
2. What is a function which has a level set which is the same as the graph of y = x + 5? (Again, there are many correct
answers.)
Problem.
1. Why can’t different level sets of a function intersect each other? (Hint: What would the value of the function be at
an intersection point?)
Definition of implicit functions.

We have been throwing around the term “implicit functions” without having been careful to say what they are really. An
implicit function is determined by an equation relating a number of variables, such as F(x, y, z) = 0. In such a situation,
there is no automatic way of telling what variables are independent and what variable is dependent. For example, if you
have the implicit function $2w^4xz + wxyz + 12 = 0$, which variable is dependent? You can’t tell. The point is, you have to
be told; that is part of the information that must be supplied with the function.
In order to be a bit more helpful, I will try to consistently use capital letters for the names of functions that are used
implicitly and lower case letters for the names of explicit functions.
Contrast to explicit functions. An explicit function is quite different. In that case, there is one variable isolated all by
itself on one side of the equals sign, and all other variables occur on the other side. The isolated variable is the dependent
variable, and the others are the independent variables.
An implicit function might not be a true function. One of the difficulties is that an implicit function might or might
not be a function by the definitions we gave at the beginning of the course. For example, $x^2 + y^2 - 1 = 0$ is a perfectly good
implicit function, but its graph is the unit circle, and that is not the graph of an honest-to-goodness function, whether the
independent variable is x or y.
What would happen if you tried to solve an implicit function for some variable? That is, what happens if you try to
convert an implicit equation into an explicit one? Several possibilities can occur. You might be able to solve it, or you
might not. And even if you can solve it, you might get a function (no ±’s) or you might not. And just because you can’t
solve for one variable doesn’t mean that it is not a function. For example, $x^5 + y^5 + x + y - 1 = 0$ defines y as an implicit
function of x, and it turns out to be a genuine function also, but solving for y in terms of x by ordinary algebra is not possible.
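Both situations can be seen in a couple of lines. The argument for the second equation is my own sketch of why it gives a genuine function, using nothing more than the fact that an increasing function takes each value exactly once.

```latex
\begin{align*}
x^2 + y^2 - 1 = 0 \quad &\Longrightarrow \quad y = \pm\sqrt{1 - x^2}
  && \text{(two values of $y$: not a function)} \\
x^5 + y^5 + x + y - 1 = 0 \quad &\Longrightarrow \quad y^5 + y = 1 - x - x^5 \\
\frac{d}{dy}\left(y^5 + y\right) = 5y^4 + 1 > 0 \quad &\Longrightarrow \quad
  \text{$y^5 + y$ is increasing, so each $x$ gives exactly one $y$}
\end{align*}
```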
Formula for derivative of implicit functions.

Can we find the derivative dy/dx of an implicit function of the two variables x and y even when we can’t solve for y
explicitly? The answer is yes. And, for example, we can find ∂ w/∂ y for an implicit function F(w, x, y, z) = 0 using nothing
more than the function itself. It turns out to be quite simple.
I need to remind you about dy/dx. There is information in the notation. By writing dy/dx, we are declaring that x is
the independent variable and y is the dependent variable. Remember that!
Let me do the simple case first. Suppose we have F(x, y) = 0, and we want to find dy/dx. How do we do it? The trick
is to work with the total differential of F,
$$dF = \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy$$
This relates how F changes when x and y are changing. If we are requiring that F(x, y) = 0, then we are requiring that F
not change. That means that we have to have dF = 0. This means that
$$0 = \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy$$
This equation says that x and y aren’t allowed to change in just any way that they want. There has to be a certain relation
between how they change in order to maintain F = 0. (That really does make sense. If both x and y could roam around
freely, we certainly couldn’t expect F = 0 to be true for all the different values that x and y take on.) But we can solve
that last equation for dy/dx reasonably easily:
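\begin{align*}
0 &= \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy \tag{3.6} \\
-\frac{\partial F}{\partial y}\,dy &= \frac{\partial F}{\partial x}\,dx \tag{3.7} \\
dy &= -\frac{\partial F/\partial x}{\partial F/\partial y}\,dx \tag{3.8} \\
\frac{dy}{dx} &= -\frac{\partial F/\partial x}{\partial F/\partial y} \tag{3.9}
\end{align*}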
This is the formula for finding the derivative dy/dx of implicit functions F(x, y) = 0. Actually, the formula holds for
F(x, y) = C just as well, since that still requires that dF = 0. So, this really is a method of finding the slope of tangent lines
to level curves, and we are using it for implicit functions because they are of that form.
Note that the negative sign is a part of the formula. It is always there! Occasionally there is a negative in one of the
partial derivatives that causes it to cancel, but it was there initially.
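A quick sanity check on a case where we can also solve for y: the unit circle. The comparison with the explicit upper half of the circle is my own addition, to show that the formula agrees with the ordinary derivative.

```latex
\begin{align*}
F(x, y) &= x^2 + y^2 - 1, \qquad
\frac{\partial F}{\partial x} = 2x, \qquad
\frac{\partial F}{\partial y} = 2y \\
\frac{dy}{dx} &= -\frac{\partial F/\partial x}{\partial F/\partial y}
  = -\frac{2x}{2y} = -\frac{x}{y} \\
\text{explicitly (upper half):} \quad y &= \sqrt{1 - x^2}, \qquad
\frac{dy}{dx} = -\frac{x}{\sqrt{1 - x^2}} = -\frac{x}{y}
\end{align*}
```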
If you have taken a standard calculus course, you might remember doing these derivatives. If so, you will remember
that working them took pages of algebra and a lot of time. By this formula, you can do implicit derivatives in a single
step. For example, if you have $x^5 + y^5 + x + y - 1 = 0$, you can find the derivative by using that formula with
$F(x, y) = x^5 + y^5 + x + y - 1$. You get
\begin{align*}
\frac{dy}{dx} &= -\frac{\partial F/\partial x}{\partial F/\partial y} \tag{3.10} \\
&= -\frac{5x^4 + 1}{5y^4 + 1} \tag{3.11}
\end{align*}
That’s all there is to it!
One comment is in order. Note that the derivative dy/dx will usually contain both x and y. You don’t often find that
the derivative of an implicit function contains only a single variable. That means that to evaluate the derivative at a point,
you have to be given both the x- and y-coordinates of the point to get the slope of the tangent line at that point. This makes
sense, because with implicit functions you can’t usually locate the point by just one x- or y-coordinate.
What about more complicated problems, like finding the derivative of w with respect to y for an implicit function
F(w, x, y, z) = 0? It is not any more difficult than finding dy/dx. In fact, virtually the same procedure solves it. The only
difficulty is that you have to remember the procedure rather than memorize a formula. (I prefer it that way, myself, unless
the procedure is hopelessly long.)
Before I go on, what kind of derivative should that be? That is, is it a regular derivative or a partial derivative? Consider.
If we solved F(w, x, y, z) = 0 for w would there be more variables around than y? The answer is that there would be, and
so the derivative is a partial derivative, ∂ w/∂ y. What, then, does that mean? It means that when we are taking the partial
derivative, all variables except w and y will have to be treated as constants. (That will mean, in this case, that dx = 0 and
dz = 0 when we finally get around to it.)
The first step is to write down the total differential of the implicit function:
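$$dF = \frac{\partial F}{\partial w}\,dw + \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy + \frac{\partial F}{\partial z}\,dz$$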
And, since the implicit equation requires F = 0, a constant, we can set dF = 0:
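$$0 = \frac{\partial F}{\partial w}\,dw + \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy + \frac{\partial F}{\partial z}\,dz$$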
Then you look at the derivative you want. In this case, it is ∂ w/∂ y. That means that all variables except w and y are going to
be treated as constants. So, we can declare dx = 0 and dz = 0. (That’s another way to tell that the derivative you’re finding
is a partial derivative. Whenever you require that some independent variable be a constant, you get a partial derivative.)
Two terms then drop out, and we get
$$0 = \frac{\partial F}{\partial w}\,dw + \frac{\partial F}{\partial y}\,dy$$
Next, we solve for the quotient of the two differentials that we want just like we did with finding dy/dx (putting the correct
ones on the top and bottom):
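\begin{align*}
0 &= \frac{\partial F}{\partial w}\,dw + \frac{\partial F}{\partial y}\,dy \tag{3.12} \\
-\frac{\partial F}{\partial w}\,dw &= \frac{\partial F}{\partial y}\,dy \tag{3.13} \\
dw &= -\frac{\partial F/\partial y}{\partial F/\partial w}\,dy \tag{3.14} \\
\frac{\partial w}{\partial y} &= -\frac{\partial F/\partial y}{\partial F/\partial w} \tag{3.15}
\end{align*}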
That’s the way the procedure always goes. Once you know what the function is, you can substitute in for the partial
derivatives, and you are done! It isn’t at all difficult once you get used to it.
Note again that you get partial derivatives ∂ w/∂ y when you divide dw by dy. That’s because there were other variables
around; we set dx = dz = 0. The result must then contain partial derivatives to warn of that fact.
Again note that the minus sign is simply part of the formula. It is always present initially, although algebra might cause
it to disappear as the problem is worked.
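Here is the procedure applied to the four-variable implicit function that was used as an example in the definition of implicit functions. Treating w as the dependent variable is my choice for this sketch; nothing in the equation itself forces it.

```latex
\begin{align*}
F(w, x, y, z) &= 2w^4 x z + w x y z + 12 \\
\frac{\partial F}{\partial y} &= w x z, \qquad
\frac{\partial F}{\partial w} = 8w^3 x z + x y z \\
\frac{\partial w}{\partial y}
  &= -\frac{\partial F/\partial y}{\partial F/\partial w}
   = -\frac{w x z}{8w^3 x z + x y z}
   = -\frac{w}{8w^3 + y}
\end{align*}
```

The cancellation of xz in the last step is legitimate, because F = 0 cannot hold if x or z is 0 (the equation would reduce to 12 = 0).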
Two examples of this on Maple are given in the next section.
Higher-order derivatives.
Higher-order implicit regular derivatives. Having found a simple way to calculate the first derivatives of implicit functions,
you might think that there was a nice, easy way to get higher-order derivatives. Not so. The process is not hard, but
it is long and tedious. It would be very easy to get confused here, so I will try to give you a single, uniform procedure to
use. Then I’ll show you how to do it on Maple, which makes the problem too easy.
Mugsy: It can’t ever be too easy.
Albert: Quite the contrary. If Maple makes it that much simpler than working the problem by hand, there is a danger
that you might simply turn to Maple rather than learning how to do it the “harder” way.
Mugsy: And that’s bad?!

Dudley: When it comes to test time, yes.
How does it work? Let me take the simpler case of finding $d^2y/dx^2$ for an implicit function F(x, y) = 0, once you have
already found dy/dx. The key to $d^2y/dx^2$ (or any other higher derivative) is to remember what it means. In this case,
\begin{align*}
\frac{d^2 y}{dx^2} &= \frac{d}{dx}\left(\frac{dy}{dx}\right) \tag{3.16} \\
&= \frac{d\left(\dfrac{dy}{dx}\right)}{dx} \tag{3.17}
\end{align*}
The trick, then, is to find the differential of dy/dx (that’s very much like finding a derivative, remember), and then divide
through by dx. After that, you need to simplify, using dx/dx = 1, and using the formula you had for dy/dx.
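For a specific example, with the implicit function $x^5 + y^5 + x + y - 1 = 0$, we found $\dfrac{dy}{dx} = -\dfrac{5x^4 + 1}{5y^4 + 1}$. To find $d^2y/dx^2$,
we do this:
\begin{align*}
\frac{d^2 y}{dx^2} &= \frac{d\left(\dfrac{dy}{dx}\right)}{dx} \tag{3.18} \\
&= \frac{d\left(-\dfrac{5x^4 + 1}{5y^4 + 1}\right)}{dx} \tag{3.19} \\
&= \frac{-1}{dx}\, d\left(\frac{5x^4 + 1}{5y^4 + 1}\right) \tag{3.20} \\
&= \frac{-1}{dx} \left[\frac{(5y^4 + 1)(20x^3\,dx) - (5x^4 + 1)(20y^3\,dy)}{(5y^4 + 1)^2}\right] \tag{3.21} \\
&= -\frac{(5y^4 + 1)\left(20x^3 \dfrac{dx}{dx}\right) - (5x^4 + 1)\left(20y^3 \dfrac{dy}{dx}\right)}{(5y^4 + 1)^2} \tag{3.22} \\
&= -\frac{(5y^4 + 1)(20x^3) - (5x^4 + 1)\left(20y^3 \left(-\dfrac{5x^4 + 1}{5y^4 + 1}\right)\right)}{(5y^4 + 1)^2} \tag{3.23} \\
&= \frac{-20\left(25x^3y^8 + 10x^3y^4 + x^3 + 25x^8y^3 + 10x^4y^3 + y^3\right)}{(5y^4 + 1)^3} \tag{3.24}
\end{align*}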
That’s the final answer.
This process can be simplified using Maple. There are two ways to do it. You can use Maple to systematize the algebra,
leading Maple through the steps you would take. Or, you can use the command implicitdiff(); to let Maple do the
whole thing for you. I’ll show both, because the first method helps you nail down the process, while the second helps you
get the answers.
First, we do the hard way, simplified by Maple. Actually, I do it two slightly different ways, since each way has some
useful parts of Maple in it. No single step of what follows is difficult, but it is very easy to get lost in the lengthy algebra.
This line does nothing more than define the implicit function.
> F := x^5 + y^5 + x + y - 1;
\[ F := x^5 + y^5 + x + y - 1 \]
This line actually finds the derivative dy/dx by the implicit function differentiation formula.
> dydx := - diff(F,x)/diff(F,y);
\[ dydx := -\frac{5x^4 + 1}{5y^4 + 1} \]
This next is the tricky line. Remember that finding dy/dx implies that y is treated as a function of x. But for Maple, all
variables are completely independent. So, to tell Maple that y is a function of x, you change all the y’s to y(x)’s. That tells
Maple that y is a function of x.
> subs( y=y(x), dydx );

\[ -\frac{5x^4 + 1}{5\,y(x)^4 + 1} \]
This next line actually differentiates the mess. In the process, note that Maple gets a dy/dx (written in its own form).
> diff( %, x );
\[ -\frac{20x^3}{5\,y(x)^4 + 1} + \frac{20\,(5x^4 + 1)\,y(x)^3\left(\frac{d}{dx}\,y(x)\right)}{(5\,y(x)^4 + 1)^2} \]
This next line tells Maple that we already have dy/dx from before, and that it should use that value.
> subs( diff(y(x),x) = dydx, %);
\[ -\frac{20x^3}{5\,y(x)^4 + 1} - \frac{20\,(5x^4 + 1)^2\,y(x)^3}{(5\,y(x)^4 + 1)^2\,(5y^4 + 1)} \]
This line tells Maple that the variable y and the function y(x) really are the same, by converting all of the y(x)’s back to
just y’s.
> subs( y(x)=y, % );
\[ -\frac{20x^3}{5y^4 + 1} - \frac{20\,(5x^4 + 1)^2\,y^3}{(5y^4 + 1)^3} \]
This final line is unnecessary, except to compare to what I did before.
> normal(%);
\[ -\frac{20\,(25x^3y^8 + 10x^3y^4 + x^3 + 25y^3x^8 + 10y^3x^4 + y^3)}{(5y^4 + 1)^3} \]
With the exception of the first line and the normal(%);’s, you can use this procedure for finding $d^2y/dx^2$ for any
implicitly-defined function in Maple.
There is another, slightly different, and essentially equivalent, way of doing the same thing that might be better for you
to understand. If this doesn’t help, go back to the way I just gave. Here it is.
The alias(); command in Maple basically is a high-powered substitution item. Here, it says that any time y is used,
treat it instead as y(x). That is, treat y as a function of x. This includes printing out answers. It shortens things nicely.
> alias(y=y(x));
y
The next line defines F as the implicit function. Note that we don’t even have to move everything to one side in this
approach!
> F := x^5 + y^5 + x + y = 1;
\[ F := x^5 + y^5 + x + y = 1 \]
Now differentiate the function with respect to x, which gives an equation that includes ∂ y/∂ x. Note that all derivatives
are partial derivatives for Maple.
> diff(F,x);
\[ 5x^4 + 5y^4\left(\frac{\partial}{\partial x}\,y\right) + 1 + \left(\frac{\partial}{\partial x}\,y\right) = 0 \]
If we solve for dy/dx, we get the answer we want.
> dydx := solve( %, diff(y,x) );
\[ dydx := -\frac{5x^4 + 1}{5y^4 + 1} \]
If we differentiate again, we get the second derivative. But differentiating y means differentiating y(x), giving ∂ y/∂ x,
which Maple doesn’t automatically assume has a value.
> diff( %, x );
> subs( diff(y,x) = dydx, % );
\[ -\frac{20x^3}{5y^4 + 1} + \frac{20\,(5x^4 + 1)\,y^3\left(\frac{\partial}{\partial x}\,y\right)}{(5y^4 + 1)^2} \]
\[ -\frac{20x^3}{5y^4 + 1} - \frac{20\,(5x^4 + 1)^2\,y^3}{(5y^4 + 1)^3} \]
We put the formula back into standard form.
> normal(%);
\[ -\frac{20\,(25x^3y^8 + 10x^3y^4 + x^3 + 25y^3x^8 + 10y^3x^4 + y^3)}{(5y^4 + 1)^3} \]
Now, let’s do the same problem using the built-in Maple function, implicitdiff();. First, define the function, just
as before.
> F := x^5 + y^5 + x + y = 1;
\[ F := x^5 + y^5 + x + y = 1 \]
Then ask for the two derivatives. The order of arguments to implicitdiff(); tells Maple what you want. The first
argument is the equation that defines the implicit function. The second argument is the dependent variable. The third
argument (and beyond, if necessary) names the variable to differentiate with respect to, one entry for each differentiation.
> implicitdiff(F, y, x);
> implicitdiff(F, y, x, x);
\[ -\frac{5x^4 + 1}{5y^4 + 1} \]
\[ -\frac{20\,(25x^3y^8 + 10x^3y^4 + x^3 + 25y^3x^8 + 10y^3x^4 + y^3)}{125y^{12} + 75y^8 + 15y^4 + 1} \]
You can see why I said that this was too easy.
Mugsy: Hey! Even I could do it that way!
Albert: But can you do it without Maple, the way you will have to on the test?
Mugsy: Spoil sport.
Dudley: I take it that means “no.”
Mugsy: Well, can you?
Dudley: Albert didn’t ask me.
Mugsy: I take it that means “no.”
Higher-order implicit partial derivatives The process for finding higher-order partial derivatives is much the same,
with an extra twist at one point. What I will do is give you an example running through it with Maple, and explaining what
each step is doing. Again, I will do it by hand, and then by Maple the long way, and then by Maple the way that even
Mugsy can do.
How do you tell if a higher-order implicit derivative is a regular or a partial derivative? Exactly the same way as for the first
derivative. That is, if the first derivative is a regular derivative, so will all higher-order derivatives be. If the first derivative is a
partial derivative, then all higher-order derivatives will be as well.
Dudley: And for those who can’t remember?
Albert: You look at the number of variables. If there are only two, the derivatives are regular. If there are more, the
derivatives are partial.
Mugsy: Why couldn’t he have said it that way?
The problem I will do is to find $\dfrac{\partial^2 y}{\partial x\,\partial z}$ and $\dfrac{\partial^2 y}{\partial z\,\partial x}$ (and show that they are equal) for
\[ x y z + x^2 - y^2 + 4z^2 = 10 \]
Here’s how to solve the problem on Maple. First, we define the function
> F := x*y*z + x^2 - y^2 + 4*z^2 - 10;
\[ F := x y z + x^2 - y^2 + 4z^2 - 10 \]
Then we find dydz and dydx, which are the variables I use for ∂ y/∂ z and ∂ y/∂ x. Note that this is how to find first-order
partial derivatives of implicit functions. We already have the formulas, namely ∂ y/∂ z is −(∂ F/∂ z)/(∂ F/∂ y) and ∂ y/∂ x
is −(∂ F/∂ x)/(∂ F/∂ y). We put these into Maple’s format.
> dydz := - diff(F,z)/diff(F,y);
> dydx := -diff(F,x)/diff(F,y);
\[ dydz := -\frac{xy + 8z}{xz - 2y} \]
\[ dydx := -\frac{yz + 2x}{xz - 2y} \]
Maple then gives the answers. In this case, it’s not that hard. Right now, it is not obvious that we would need both of
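these, but it turns out we will. It will become clear why as we proceed. We now want to work on getting $\partial^2 y/\partial x\,\partial z$, which
we will find by this process:
\[ \frac{\partial^2 y}{\partial x\,\partial z} = \frac{\partial}{\partial x}\left(\frac{\partial y}{\partial z}\right) = \left.\frac{d(\partial y/\partial z)}{dx}\right|_{dz=0} \]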
but we must be careful of the partial derivative. The ∂ /∂ x means that the independent variable x is changing; y is considered
the dependent variable, and must also change. This boils down to meaning that z doesn’t change, which we enforce by
requiring dz = 0 when we take the differential on top. That’s the meaning of the little dz = 0 at the end. How do we get
Maple to do this? We have to tell it that y is a function of x only. That way, Maple will treat z as a constant. And we tell
Maple y is a function of only x by substituting y = y(x) into dydz, just as before, and then differentiate with respect to x:
> subs( y=y(x), dydz );
> diff( %, x );
\[ -\frac{x\,y(x) + 8z}{xz - 2\,y(x)} \]
\[ -\frac{y(x) + x\left(\frac{d}{dx}\,y(x)\right)}{xz - 2\,y(x)} + \frac{(x\,y(x) + 8z)\left(z - 2\left(\frac{d}{dx}\,y(x)\right)\right)}{(xz - 2\,y(x))^2} \]
Note that Maple thought that z was a constant, but that y was a function of x. This is exactly what we needed. Now we
have to substitute for the derivative of y with respect to x that Maple used. For this, we need ∂ y/∂ x from above. This is
where we need that, and why.
> subs( diff(y(x),x) = dydx, % );
\[ -\frac{y(x) - \dfrac{x\,(yz + 2x)}{xz - 2y}}{xz - 2\,y(x)} + \frac{(x\,y(x) + 8z)\left(z + \dfrac{2\,(yz + 2x)}{xz - 2y}\right)}{(xz - 2\,y(x))^2} \]
We now want to simplify this, but Maple will again stubbornly refuse to accept that y and y(x) are the same, so we have
to get rid of the y(x)’s by another substitution. We then use the normal(%); command to put all of this into a nice form,
and store the result in q1:
> subs( y(x)=y, % );
> q1 := normal(%);
\[ -\frac{y - \dfrac{x\,(yz + 2x)}{xz - 2y}}{xz - 2y} + \frac{(xy + 8z)\left(z + \dfrac{2\,(yz + 2x)}{xz - 2y}\right)}{(xz - 2y)^2} \]
\[ q1 := \frac{2y^2xz - 4y^3 + 2x^3z + x^2yz^2 + 8xz^3 + 32xz}{(xz - 2y)^3} \]
At this point, that is the value of $\partial^2 y/\partial x\,\partial z$. What we want to do next is find the other mixed partial derivative,
$\partial^2 y/\partial z\,\partial x$. The process is basically the same. We will again need both $\partial y/\partial x$ and $\partial y/\partial z$. We assume that both of these
have already been found (from above). We need to declare y as a function of z in $\partial y/\partial x$ for when we want to differentiate
with respect to z next:
> subs( y=y(z), dydx );
> diff( %, z );
\[ -\frac{y(z)\,z + 2x}{xz - 2\,y(z)} \]
\[ -\frac{\left(\frac{d}{dz}\,y(z)\right)z + y(z)}{xz - 2\,y(z)} + \frac{(y(z)\,z + 2x)\left(x - 2\left(\frac{d}{dz}\,y(z)\right)\right)}{(xz - 2\,y(z))^2} \]
Now we substitute for the derivative that showed up, the ∂ y/∂ z, and get
> subs( diff(y(z),z)=dydz, %);
\[ -\frac{-\dfrac{(xy + 8z)\,z}{xz - 2y} + y(z)}{xz - 2\,y(z)} + \frac{(y(z)\,z + 2x)\left(x + \dfrac{2\,(xy + 8z)}{xz - 2y}\right)}{(xz - 2\,y(z))^2} \]
Next, we substitute y(z) back to just y for the simplification
> subs( y(z)=y, % );
\[ -\frac{-\dfrac{(xy + 8z)\,z}{xz - 2y} + y}{xz - 2y} + \frac{(yz + 2x)\left(x + \dfrac{2\,(xy + 8z)}{xz - 2y}\right)}{(xz - 2y)^2} \]
And finally, we simplify it and store this result in q2:
> q2 := normal( % );
\[ q2 := \frac{2y^2xz - 4y^3 + 2x^3z + x^2yz^2 + 8xz^3 + 32xz}{(xz - 2y)^3} \]
And last of all, we check that the two mixed partials are in fact equal (something that ought to at least mildly surprise
you, considering how different the means of getting to this end were):
> q1 - q2;
0
It checks!
Dudley: Just to keep my suspense level manageable, does this always happen?
Albert: Yes, as far as you are concerned.
Mugsy: And if you aren’t concerned?
Albert: You ought to be. Leave it at that.
Again, let me do this same problem in the spirit of the second Maple approach I gave earlier.
> alias(y=y(x,z));
y
Again, we set up the alias, with the dependent variable and independent variables as indicated in the derivative that we
want.
> F := x*y*z + x^2 - y^2 + 4*z^2 = 10;
\[ F := x y z + x^2 - y^2 + 4z^2 = 10 \]
This defines the function, as before.
> diff(F,x);
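\[ yz + x\left(\frac{\partial}{\partial x}\,y\right)z + 2x - 2y\left(\frac{\partial}{\partial x}\,y\right) = 0 \]
Again, we get the derivative that we want in the middle of this equation. So, we solve for it next, storing the result in a
variable called dydx and remembering that it really is a partial derivative.
> dydx := solve( %, diff(y,x) );
\[ dydx := -\frac{yz + 2x}{xz - 2y} \]
The same sequence gets the other partial derivative.
> diff( F, z);
\[ x\left(\frac{\partial}{\partial z}\,y\right)z + xy - 2y\left(\frac{\partial}{\partial z}\,y\right) + 8z = 0 \]
> dydz := solve( %, diff(y,z) );
\[ dydz := -\frac{xy + 8z}{xz - 2y} \]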
Now we go for the whole thing.
> diff(dydz,x);
> subs(diff(y,x)=dydx, %);
> q3 := normal(%);
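\[ -\frac{y + x\left(\frac{\partial}{\partial x}\,y\right)}{xz - 2y} + \frac{(xy + 8z)\left(z - 2\left(\frac{\partial}{\partial x}\,y\right)\right)}{(xz - 2y)^2} \]
\[ -\frac{y - \dfrac{x\,(yz + 2x)}{xz - 2y}}{xz - 2y} + \frac{(xy + 8z)\left(z + \dfrac{2\,(yz + 2x)}{xz - 2y}\right)}{(xz - 2y)^2} \]
\[ q3 := \frac{2y^2xz - 4y^3 + 2x^3z + x^2yz^2 + 8xz^3 + 32xz}{(xz - 2y)^3} \]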
Then the other order.
> diff(dydx, z);
> subs(diff(y,z)=dydz, %);
> q4 := normal(%);
> q3 - q4;
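\[ -\frac{\left(\frac{\partial}{\partial z}\,y\right)z + y}{xz - 2y} + \frac{(yz + 2x)\left(x - 2\left(\frac{\partial}{\partial z}\,y\right)\right)}{(xz - 2y)^2} \]
\[ -\frac{-\dfrac{(xy + 8z)\,z}{xz - 2y} + y}{xz - 2y} + \frac{(yz + 2x)\left(x + \dfrac{2\,(xy + 8z)}{xz - 2y}\right)}{(xz - 2y)^2} \]
\[ q4 := \frac{2y^2xz - 4y^3 + 2x^3z + x^2yz^2 + 8xz^3 + 32xz}{(xz - 2y)^3} \]
\[ 0 \]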
And, finally, let’s let Maple do all the work.
> F := x*y*z + x^2 - y^2 + 4*z^2 - 10;
> q5 := implicitdiff(F, y, x, z);
> q6 := implicitdiff(F, y, z, x);
> q5 - q6;
\[ F := x y z + x^2 - y^2 + 4z^2 - 10 \]
\[ q5 := \frac{x^2yz^2 + 2zxy^2 + 8xz^3 - 4y^3 + 2x^3z + 32xz}{x^3z^3 - 6x^2yz^2 + 12zxy^2 - 8y^3} \]
\[ q6 := \frac{x^2yz^2 + 2zxy^2 + 8xz^3 - 4y^3 + 2x^3z + 32xz}{x^3z^3 - 6x^2yz^2 + 12zxy^2 - 8y^3} \]
\[ 0 \]
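For an independent check that does not lean on Maple at all, note that this particular equation happens to be quadratic in y, so it can be solved outright. The sketch below (Python, with made-up helper names, an arbitrary test point (x, z) = (3, 1), and the upper branch of the quadratic chosen arbitrarily) compares a finite-difference mixed partial of that explicit solution with the formula Maple reported for q1.

```python
import math

# Numerical check of the mixed partial d^2y/(dx dz) for
# x*y*z + x^2 - y^2 + 4*z^2 = 10.  Helper names are illustrative.


def y_of(x, z):
    """Explicit solution for y (upper branch of the quadratic in y).

    The equation rearranges to y^2 - (x*z)*y - (x^2 + 4*z^2 - 10) = 0.
    """
    disc = (x * z) ** 2 + 4 * (x**2 + 4 * z**2 - 10)
    return (x * z + math.sqrt(disc)) / 2


def mixed_partial_formula(x, y, z):
    """The expression Maple stored in q1 (and q2 through q6)."""
    numerator = (2 * y**2 * x * z - 4 * y**3 + 2 * x**3 * z
                 + x**2 * y * z**2 + 8 * x * z**3 + 32 * x * z)
    return numerator / (x * z - 2 * y) ** 3


def mixed_partial_numeric(x, z, h=1e-3):
    """Central-difference estimate of d^2y/(dx dz)."""
    return (y_of(x + h, z + h) - y_of(x + h, z - h)
            - y_of(x - h, z + h) + y_of(x - h, z - h)) / (4 * h * h)


x0, z0 = 3.0, 1.0             # arbitrary test point where a real y exists
y0 = y_of(x0, z0)
print(mixed_partial_formula(x0, y0, z0), mixed_partial_numeric(x0, z0))
```

The finite-difference estimate treats x and z symmetrically, so it also illustrates why the order of differentiation should not matter here.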
Homework #29
Exercises.
1. Does $z = x^2 + y^2 + z^2$ determine an explicit or implicit function?
2. Does $w = x^2 + y^2 + z^2$ determine an explicit or implicit function?
3. Find $dy/dx$ for the following equations.
(a) $x^2 y = 15$
(b) $3x^6y^4 - 7x^2y^5 = 12$
(c) $y = x^4y^3 + 4x^6y^7$
4. Find $dy/dx$ for the following equations.
(a) $\exp(xy^2) - x^3y^4 = 12$
(b) $\sin(x/y) + \cos(y/x) = 1$
(c) $y = \ln(x^2 + y^2)$
5. Calculate $d^2y/dx^2$ for $x^2 + y^2 = 1$. (You can compare this problem to one of the homework problems on parametric
equations, where we calculated the same thing “the hard way” using $x = \cos t$, $y = \sin t$.)
6. Make up three implicit equations $F(x, y) = 0$ of your own, and find $dy/dx$ for them.
7. Find $\dfrac{d^2y}{dx^2}$ for the implicit function $2x^5 + 4x^2y^3 - 5xy^7 = 100$. Be prepared for a frighteningly long answer.
8. Find $\dfrac{d^2y}{dx^2}$ for the implicit function $y\,e^{3x} + x^3\sin(y^2) = 20$. Again, watch out for a many-line answer.
9. Find $\dfrac{\partial y}{\partial z}$ for the implicit function $9z^2e^{xy} - 3xz\sin(y) = 4$.
10. Find $\dfrac{\partial x}{\partial y}$ for the implicit function $4\ln(x + yz) + 7\operatorname{Arctan}(y + xz) = 1$.
Problems.
1. Find $\dfrac{\partial^2 x}{\partial y\,\partial z}$ for the implicit function $e^{xyz} + \cos(xy^2 + yz) = 2$.
2. For this problem, use the implicit equation $x^2y^2 - 3x^3y + 16x - 5y = 0$.

(a) Find the derivative dy/dx.
(b) Show that the point (0, 0) is on the graph and find the equation of the line tangent to the curve at the point (0, 0).
(c) Show that the point (1, 4) is on the graph and find the equation of the line tangent to the curve at the point (1, 4).
(Careful here! The derivative does something strange. What kind of lines have that slope?)
3. In this problem, we look at a result that gets used in Physical Chemistry, although it is not clear what it means.
Albert: A whole lot of Physical Chemistry can seem that way. But the next section does more with that subject.
Dudley: Oh, I can hardly wait....
Suppose you have any implicit function of three variables, $F(x, y, z) = C$. Show that
\[ \frac{\partial x}{\partial y} \times \frac{\partial y}{\partial z} \times \frac{\partial z}{\partial x} = -1 \]
by writing out the values of these partial derivatives in terms of the partial derivatives of F and simplifying.
Investigation.
1. Answer the following questions about the graph defined by the implicit function $x^3 + y^3 + xy = \frac{1}{27}$. (Yes, the constant
1/27 must be exactly that. You’ll see why.)
(a) Show that the first derivative $dy/dx$ of the curve is
\[ -\frac{3x^2 + y}{3y^2 + x} \]
(b) Show that the second derivative $d^2y/dx^2$ of the curve is
\[ -\frac{54\,xy\,(x^3 + y^3 + xy - 1/27)}{(3y^2 + x)^3} \]
(Unless you are into serious algebra, I’d recommend Maple for this one.)
(c) Plug the equation of the curve into this expression, and show that the second derivative is always equal to zero.
(This is why we need that 1/27.)
(d) Show that the lines $y = mx + b$ have second derivative equal to zero. (A more ambitious problem would have
you show that only lines have second derivative equal to zero. I’m not asking for that.)
(e) Show that the three points $\left(\frac{1}{3}, 0\right)$, $\left(0, \frac{1}{3}\right)$, and $\left(-\frac{1}{3}, -\frac{1}{3}\right)$ are all on the original curve. (How do you show that a
point is on a curve whose equation you know?)
(f) Show that the three points do not lie on a line. (The last four parts of this investigation present a considerable
difficulty. See if you can figure out why there is a problem.)
(g) Use Maple to factor $x^3 + y^3 + xy - 1/27$. (Note that one factor is linear in both x and y, so it represents a line.
The remaining equation produces a single point, at $\left(-\frac{1}{3}, -\frac{1}{3}\right)$.)
3.1.7 Constrained partial derivatives (what if you can’t wiggle just one variable at a time?)
Up until this point, whenever we have taken partial derivatives, we have assumed that the independent variables are just
that, independent of each other. There are situations where that is not the case, as any chemistry major will discover in
physical chemistry.
Motivation—gas dynamics.
Let me give you a specific example. In thermodynamics (a part of physical chemistry), the entropy, denoted S, of a gas is
most conveniently defined in terms of three variables, p = pressure, V = volume, and T = temperature.
Mugsy: ALBERT! What’s going on here? What’s entropy?
Albert: It’s a concept from thermodynamics, and measures how disorganized a system is. One of the fundamental
laws of thermodynamics says that entropy must always increase.
Dudley: Physicists have a theory of disorganization?
Albert: That’s one way to look at it. The three laws of thermodynamics are summarized at a level that even Mugsy
can understand as:
1. You can’t win.
2. You can’t break even. (That’s entropy increasing.)
3. You can’t get out.
Mugsy: You mean it’s useless for me to try to clean my apartment?
Dudley: For you, yes.
Mugsy: Watch it, kid.
(Usually, first-year chemistry books use P = pressure, but thermodynamics texts tend to use p.) If you have had chemistry,
though, you know that p, V , and T are not independent. There is the ideal gas law, pV = nRT , where n = number of moles
(measure of quantity) of gas, and R is a constant that is the same for all gases. So, if we decide to wiggle T , for example,
we are not permitted to hold all other variables constant. The extra equation that the variables have to satisfy is called the
constraint, and this process is called taking constrained partial derivatives.
Because thermodynamics is the course where this notion is used most, I will keep explanations geared to that subject.
It will be easy enough to adapt to any other situation, or to the other functions in thermodynamics (internal energy of gases,
enthalpy, and Gibbs free energy).
Dudley: I don’t even want to know what those mean.
Notation.
We first need to set up the notation so that we can tell what is going on.
Mugsy: I think it’s going to take more than notation.
Dudley: AUGH! More notation!
The entropy S depends on p, V , and T , and of the three, any two will determine the other from the gas law. So, thermody-
namics texts will write S(p, T ) or S(V, T ), depending on which two are the main two under consideration at the time. But
how would you understand (∂ S/∂ T )? It doesn’t make sense as it stands, since the notation implies that all other variables
are being held constant, which can’t happen. Even worse, if you differentiate S(p, T ) with respect to T and compare that
to the derivative of S(V, T ) with respect to T , you will get two different answers. (See the homework.) This is a distinct
problem.
We have to be able to distinguish between different partial derivatives, then. This is done by a standard notation.
$(\partial S/\partial T)_p$ indicates that we are taking the derivative of S with respect to T, holding p constant. Of course, V will have to
be changing, but that we will have to allow. It is equivalent to the derivative of the formula S(p, T) with respect to T. On
the other hand, $(\partial S/\partial T)_V$ denotes the partial derivative of S with respect to T, while holding V constant (and consequently
allowing p to vary). That would mean that we are differentiating S(V, T) with respect to T. By the way, each of those
derivatives gives a quantity that is physically measurable. And physically, they are not equal, either.
Mugsy: You mean this actually gets used?
Albert: Chemical engineers have to use this idea all the time.
How to calculate these.

This is actually not very difficult, especially after you have been through the abuse of higher-order partial derivatives of
implicit functions.
Dudley: Great. Malaria is better than the plague, too.
What you are given. You are given the function to differentiate, the constraint, the variable to differentiate, and the
variables that are independent of the variable to differentiate. These are all part of what you must be given, in some fashion
or another. Note that you can determine what variables are dependent on the variable to differentiate by a process of
elimination.
Write out total differential of function being differentiated. When you are doing this, you should include all of the
variables, whether or not they will be held constant. For example, with
\[ S = S(p, V, T) \]
you would get that
\[ dS = \frac{\partial S}{\partial p}\,dp + \frac{\partial S}{\partial V}\,dV + \frac{\partial S}{\partial T}\,dT \]
At this point, we are not taking any constraints into consideration, so subscripts on the partial derivatives are not needed.
That will come momentarily.
Set wiggles of all independent variables to 0; there should be two wiggles left. Now you look at the variables. The
notation will tell you that some of them are being held constant. In that case, you should set the appropriate differential to
0. That is, if you want to find
\[ \left(\frac{\partial S}{\partial T}\right)_V \]
you will be holding V constant, so you would set dV = 0. The result in that case would be
\[ dS = \frac{\partial S}{\partial p}\,dp + \frac{\partial S}{\partial T}\,dT \]
We are now closing in on the derivative we want. To get (∂ S/∂ T ), you want to factor out a dT from the right side of this
equation and divide through by it. But we have this d p that isn’t dT . What do we do? We want to eliminate the unwanted
wiggle, in this case the d p. Now is when we use the constraint. That gives us the way to relate the dT and the d p that will
enable us to eliminate the d p from this.
Find the total differential of the constraint; set all independent wiggles to 0; solve for the wiggle to be eliminated
from the other equation. The equation for the constraint will tell how the differentials between all the variables relate.
So, if we use the ideal gas law,
pV = n R T
we take the differential of the equation, and get that
V d p + p dV = nR dT
Note that n is almost always a constant for these problems, since you are working with a fixed quantity of gas. On the other
hand, R is always a constant. Then we again set dV = 0, since V is still not allowed to change. Then we get that
V d p = nR dT
or solving for dp (the nuisance term in dS), we get
\[ dp = \frac{nR}{V}\,dT \]
This is what we will use in the dS equation.
Plug in and solve for desired wiggle. Plugging that back in gives
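\begin{align}
dS &= \frac{\partial S}{\partial p}\,dp + \frac{\partial S}{\partial T}\,dT \tag{3.25}\\
&= \frac{\partial S}{\partial p}\,\frac{nR}{V}\,dT + \frac{\partial S}{\partial T}\,dT \tag{3.26}\\
&= \left(\frac{\partial S}{\partial p}\,\frac{nR}{V} + \frac{\partial S}{\partial T}\right)dT \tag{3.27}
\end{align}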
where we factored out the dT from both terms.
Set up the quotient of wiggles to give the partial derivative you want. Dividing by dT gives us what we want:

\[ \left(\frac{\partial S}{\partial T}\right)_V = \frac{\partial S}{\partial p}\,\frac{nR}{V} + \frac{\partial S}{\partial T} \]
This is the answer.

Dudley: FOUL! How can we have a term like $\partial S/\partial T$ on the right-hand side in the expression, when we have already said
that such a creature is ambiguous? Besides, how would you calculate it?
Mugsy: Yeah. That’s what I was thinking, too.
Dudley: Right, Mugsy.
Albert: Good questions, and they need good answers. That must be coming next.
First, let’s look at the expression. If you will note, when we write
\[ \left(\frac{\partial S}{\partial T}\right)_V = \frac{\partial S}{\partial p}\,\frac{nR}{V} + \frac{\partial S}{\partial T} \]
we are writing $\left(\frac{\partial S}{\partial T}\right)_V$ in terms of $\frac{\partial S}{\partial T}$. What would that mean? Remember that when we first did the total differential of
the function, we assumed that there were no constraints. That’s what $\frac{\partial S}{\partial T}$ represents; we assume that we can wiggle just
T without wiggling any of the other variables, and $\frac{\partial S}{\partial T}$ is just that wiggle magnification factor. How would that work?
Remember that we started out with a formula for S? That is what you would use for finding $\frac{\partial S}{\partial T}$; just take the normal partial
derivative ignoring any problem with constraints. (Of course, in thermodynamics, you often don’t have much of a formula
for these things. This is theory.) To allow that other variables must wiggle, we get an extra term in $\left(\frac{\partial S}{\partial T}\right)_V$. It takes into
account the wiggle of p as we are wiggling T and holding V constant. The term $\frac{nR}{V}\,\frac{\partial S}{\partial p}$ is just what needs to be added to
hold V constant, forcing p to move. This also answers the questions.
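To see the two kinds of derivative side by side, here is a small worked sketch. The formula for S below is made up purely for illustration (it is not a real entropy formula); only the ideal gas law and the result for $\left(\frac{\partial S}{\partial T}\right)_V$ come from the discussion above.

```latex
\begin{align*}
S(p, V, T) &= p^2 + VT && \text{(hypothetical formula)}\\
\frac{\partial S}{\partial p} &= 2p, \qquad \frac{\partial S}{\partial T} = V && \text{(ordinary partials, constraint ignored)}\\
\left(\frac{\partial S}{\partial T}\right)_V &= \frac{\partial S}{\partial p}\,\frac{nR}{V} + \frac{\partial S}{\partial T}
 = \frac{2p\,nR}{V} + V\\
\text{Check: } p = \frac{nRT}{V} &\;\Rightarrow\; S = \frac{n^2R^2T^2}{V^2} + VT,
 \qquad \frac{dS}{dT}\bigg|_{V\ \text{fixed}} = \frac{2n^2R^2T}{V^2} + V = \frac{2p\,nR}{V} + V
\end{align*}
```

The check eliminates p with the gas law first and then differentiates; it lands on the same expression, and that expression is not the same as the unconstrained $\frac{\partial S}{\partial T} = V$.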
Just a few concluding remarks on this subject. It would be possible (but very unappetizing) to have several constraints
on different variables, rather than just one.
Mugsy: You gotta be kidding.
The process would be the same. Take the total differential of the equation containing the dependent variable; zero the
differentials of variables being held constant; take the differentials of the constraints and solve them for the differentials of
the dependent variables to get rid of; plug them in the first differentiated equation and divide by the differential of the main
dependent variable.
Finally, note that the quotient of differentials is a derivative, just as before. But the notation automatically switches to
express the situation, just the way that the notation changed in going from regular to partial derivatives. That is, dS divided
by dT was not just $dS/dT$, nor even just $\frac{\partial S}{\partial T}$, but rather $\left(\frac{\partial S}{\partial T}\right)_V$, in order to accurately reflect what we had done to get the
equation relating those differentials.
Now, for the moment you have all been waiting for. How is this done on Maple?
Dudley: To tell you the truth, I’ve been waiting for this section to finish even more.
For that, it is important to realize that Maple doesn’t have the chain rule for differentials built into it, nor is it easy to get it
to do so. However, I have written a routine that takes differentials of equations or functions. That enables you to proceed
exactly as in the notes. For an example, let me do the entropy example again using Maple.
First, we read in the function. It is called, appropriately, d. (Instructions on getting the function will be given in class.
The locations of such things keep changing.)
> d := proc( f )
> local vars, i, v, tmp;
> vars := select( type, indets(f), name );
> if type(f, equation) then d( rhs(f) ) = d( lhs(f) )
> else tmp:=0; for i to nops(vars) do v := vars[i]; tmp := tmp +
> diff(f,v) * d || v; od; tmp;
> fi;
> end;
d := proc( f )
local vars, i, v, tmp;
vars := select(type, indets( f ), name) ;
if type( f , equation) then d(rhs( f )) = d(lhs( f ))
else
tmp := 0 ;
for i to nops(vars) do v := vars[i] ; tmp := tmp + diff( f , v) * d||v end do ;
tmp
end if
end proc
You are not expected to understand this!
Next, put in the constraint.
> C := p*V = n * R * T;
C := pV = n R T
Note that you can assign an equation to a variable this way. That is, C is a variable to Maple that has the value
pV = n R T .
Then we take the differential of it.
> dC := d(C);
dC := n T dR + n R dT + R T dn = p dV +V dp
Maple doesn’t know that R and n are constants, so you have to tell it. Here’s how. Note that I am putting two equations
on the same line. That’s fine, as long as both are ended with semicolons (or colons).
> dR := 0; dn := 0;
dR := 0
dn := 0
Check it out, just to be sure.
> dC;
n R dT = p dV +V dp
Looks good. Next, take the differential of the entropy function, S(p,V, T ).
> dS := d(S(p,V,T));
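\[ dS := \left(\frac{\partial}{\partial T}\,S(p, V, T)\right)dT + \left(\frac{\partial}{\partial V}\,S(p, V, T)\right)dV + \left(\frac{\partial}{\partial p}\,S(p, V, T)\right)dp \]
For the derivative we want, $\left(\frac{\partial S}{\partial T}\right)_V$, we will need to set dV = 0. We do that now. The # sign tells Maple to ignore
everything from that point on. It is used to put comments in the session.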
> dV := 0; # For this derivative
dV := 0
Now, we want to get rid of the d p, so we solve the constraint for it.
> solve(dC, dp);
\[ \frac{n\,R\,dT}{V} \]
Then we substitute that back into the equation for dS, and get
> subs( dp=%, dS);
\[ \left(\frac{\partial}{\partial T}\,S(p, V, T)\right)dT + \frac{\left(\frac{\partial}{\partial p}\,S(p, V, T)\right)n\,R\,dT}{V} \]
All we have to do now is divide by the dT , and we are done.
> expand(%/dT);
\[ \left(\frac{\partial}{\partial T}\,S(p, V, T)\right) + \frac{\left(\frac{\partial}{\partial p}\,S(p, V, T)\right)n\,R}{V} \]
And that’s the answer we got before.
If this seems a bit long, it is. But it is exactly what you are doing when you solve these by hand. You can go back and
see how this Maple session is exactly parallel to what we did before. Only the form is different.
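If you would like to watch the formula work on actual numbers, here is a rough numerical sketch in Python rather than Maple. Everything specific in it is an assumption made for illustration: the function S is invented (it is not a real entropy), and the values of n, V, and T are arbitrary. It compares the formula for the constrained derivative with a direct finite-difference wiggle of T along the constraint with V held fixed.

```python
# Constrained partial derivative (dS/dT)_V under the ideal gas law p*V = n*R*T.
# S below is a made-up function, not a physical entropy.

N_MOLES = 2.0     # hypothetical amount of gas
R = 8.314         # gas constant in J/(mol K)


def S(p, V, T):
    """Hypothetical function of all three variables."""
    return p**2 + V * T + p * T


def dS_dp(p, V, T):
    """Ordinary partial of S with respect to p (constraint ignored)."""
    return 2 * p + T


def dS_dT(p, V, T):
    """Ordinary partial of S with respect to T (constraint ignored)."""
    return V + p


def constrained_formula(V, T):
    """(dS/dT)_V = (dS/dp) * n*R/V + dS/dT, evaluated on the constraint."""
    p = N_MOLES * R * T / V
    return dS_dp(p, V, T) * N_MOLES * R / V + dS_dT(p, V, T)


def constrained_numeric(V, T, h=1e-3):
    """Wiggle T with V fixed, letting p follow the gas law."""
    def S_on_constraint(t):
        return S(N_MOLES * R * t / V, V, t)
    return (S_on_constraint(T + h) - S_on_constraint(T - h)) / (2 * h)


V0, T0 = 10.0, 300.0          # arbitrary test state
p0 = N_MOLES * R * T0 / V0
print(constrained_formula(V0, T0), constrained_numeric(V0, T0), dS_dT(p0, V0, T0))
```

The first two printed values should agree closely, while the third (the unconstrained partial) is noticeably different, which is the whole point of the subscript notation.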
Homework #30
Problems.

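1. Take $w = 2x^2 + 4z + t$, with constraint $x + z - 3t = 8$. Show that $\left(\frac{\partial w}{\partial x}\right)_z$ and $\left(\frac{\partial w}{\partial x}\right)_t$ can never be equal for these
equations.
2. For $w = 3x^3y^2z^2 - 5x^2 + y^3 - z^4$, with constraint equation $xz^3 + 4x^2y^3 + y^2z^5 = 71$, find $\left(\frac{\partial w}{\partial z}\right)_y$.
3. Find $\left(\frac{\partial w}{\partial x}\right)_z$ where $w = \ln(xy) + \dfrac{x}{y + z}$, subject to the constraint $\operatorname{Arcsin} x + x^2 + y^2 + z^2 + z = 2$.
4. Find $\left(\frac{\partial x}{\partial y}\right)_t$ for $x = y\,e^{-w} + w^2 \cos t\; e^{2y}$ with constraint $w^3 + y^3 + t^3 = 1$.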
5. Make up another two problems in constrained partial differentiation and solve them.
Investigation.
1. In this investigation, we look at implicit differentiation with constraints. It really is no more difficult than regular
derivatives with constraints. Suppose we have the equation F(w, x, y, z) = 0. This defines y as an implicit function of
w, x, and z. (Other combinations are possible, but that’s the one we’ll work with on this problem.) Suppose we also
have the constraint C(w, x, y, z) = 0. That should enable you to eliminate one of the variables that y is a function of.
Suppose we eliminate w. Then y is a function of x and z (although its exact form is not obvious). Find a formula for
$\left(\frac{\partial y}{\partial x}\right)_z$ in terms of the partial derivatives of F and C, where they are taken as formulas.

1. When the function has more than one independent variable, you must use partial derivatives. They have all the same
meanings as regular derivatives.
2. Partial derivatives are calculated by assuming that all variables except one are being held constant. That means that
any term that doesn’t contain the variable you are differentiating with respect to goes to zero when you differentiate.
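3. The wiggle magnification formula (using three variables as an example) is now
\[ \Delta f(x, y, z) \approx \frac{\partial f}{\partial x}\,\Delta x + \frac{\partial f}{\partial y}\,\Delta y + \frac{\partial f}{\partial z}\,\Delta z \]
Note that each term of the sum on the right-hand side is the approximate amount that f changes when just one
variable changes.
4. The chain rule is now
\[ \frac{\partial g}{\partial x_i} = \sum_{j=1}^{m} \frac{\partial g}{\partial u_j} \times \frac{\partial u_j}{\partial x_i} \]
where $g = g(u_1, \ldots, u_m)$ and each $u_j = u_j(x_1, \ldots, x_n)$.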

5. Higher-order partial derivatives have two common notations. If you differentiate $f(x, y, z)$ with respect to y and then
with respect to x, the notations would be $\dfrac{\partial^2 f}{\partial x\,\partial y}$ and $f_{yx}$.
6. The differential of a function $f(x, y)$ is
\[ df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy \]
If you have more variables, there is one term per variable of the function.
7. A general property called the equality of mixed partial derivatives says that $f_{xy} = f_{yx}$. This ends up meaning that you
can decide for yourself what order you want to take partial derivatives (in order to make your own life the easiest),
and as long as you end up taking the right total number of derivatives with respect to the right variables, you will get
the right answer.
8. Implicit functions are not true functions. They are determined by an equation where the independent and dependent
variables are scrambled together.
9. A level set of a function f (x, y) is the collection of all the points (x, y) for which f (x, y) has a single, specific value.
If f has two independent variables, the level set is usually called a level curve. If f has three independent variables,
the level set is called a level surface.
10. To find the slope of a tangent line to a level curve defined by $f(x, y) = C$, the formula is $\dfrac{dy}{dx} = -\dfrac{\partial f/\partial x}{\partial f/\partial y}$. Don’t forget
the negative sign!
11. If you want higher-order derivatives of implicit functions, you simply differentiate the first derivative. However,
while doing that, you have to realize that y is really a function of x (this is not a partial derivative!). You will end up
with dy/dx’s in the second derivative. Substitute the value of dy/dx that you have already found in order to get the
second derivative.
12. Constrained partial derivatives occur when the independent variables in a function are not free (that is, are constrained),
but must satisfy some extra equation (called the constraint). In that case, you have to be given which
variable(s) are still going to be held constant, the remaining ones wiggling in order to maintain the constraint. For
example, if $f(x, y, z)$ is a function, but x, y, and z are constrained to satisfy the equation $g(x, y, z) = C$, and you want
to find $\partial f/\partial z$ while holding x constant, the notation would be $\left(\frac{\partial f}{\partial z}\right)_x$.
13. Constrained partial derivatives are found by taking the differential of the function you are differentiating, setting the
differentials of the variable(s) being held constant to 0, and getting rid of the “non-derivative” differentials using the
constraint, and then dividing through by the “derivative” differential.
14. There were no new formulas in this chapter, since partial derivatives are found exactly the same way as single-variable
derivatives.
15. The way to define a multi-variable function in Maple is to use the old “arrow notation,” but put the variables in
parentheses. The only new Maple command that appeared was alias();, which is used in Maple to shortcut
inputting expressions. It will not be used again in this course, so it is not critical to learn it. There was also a new
function d that takes differentials of expressions, to be used in finding constrained partial derivatives, but could also
be used when finding implicit derivatives.
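As a small numerical illustration of the wiggle magnification formula in item 3, the following Python sketch compares the actual change in a function with the sum of its one-variable-at-a-time estimates. The function, the base point, and the wiggles are all hypothetical choices made just for this illustration.

```python
import math

# Wiggle magnification check for a made-up function f(x, y, z) = x^2*y + sin(z).


def f(x, y, z):
    return x * x * y + math.sin(z)


def partials(x, y, z):
    """Ordinary partial derivatives (df/dx, df/dy, df/dz) at a point."""
    return 2 * x * y, x * x, math.cos(z)


x0, y0, z0 = 1.0, 2.0, 0.5          # base point
dx, dy, dz = 0.01, -0.02, 0.03      # small wiggles

fx, fy, fz = partials(x0, y0, z0)
approx_change = fx * dx + fy * dy + fz * dz          # one term per variable
actual_change = f(x0 + dx, y0 + dy, z0 + dz) - f(x0, y0, z0)

print(approx_change, actual_change)
```

The two numbers agree to about three decimal places; shrinking the wiggles should make the agreement better, since the leftover error depends on products of the wiggles.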
3.3 Tests from previous years

Test #2, Fall 2004
I. (10 points, 5 points each) Determine the future value of a deposit of $2000 at 4% for 9 months if the interest is (a)
Compounded quarterly (b) Continuously compounded
II. (15 points, 5 points each) Find each of the following limits.
x3 − x2 − x − 15 2 x50 − ln x + ex sin2 x
(a) lim 5 2 4
(b) lim 30 (c) lim
x→3 x + 3 x + 3 x − 3 x − 36 x→∞ x + sin x + e2 x x→π/2 x3 − 2 x − 2 x2 + π
π
III. (15 points; 5 points each) For the following questions, use the function f (x) = x4 − 2 x2 − 10.
(a) Find all local max and min values of f (x). (b) Using a number line, describe all segments where the function is
increasing or decreasing. (c) Find the global max and min of f (x) on the interval [−0.5, 3].
IV. (35 points, as noted) Find the following derivatives. !
∂ 2 xy ∂5 z2 ∂x
(a) (5 points) (x y e ) (b) (5 points) 3 2 p + ln(x z) (c) (10 points) if x2 y z +
∂y ∂z ∂x 5
tanh(3x) cot(2 x) ∂z

√ ∂x
x y−xz = x (d) (15 points) for x = 3 z3 y + 4 tan(y z) + z w3 subject to z w y = 3
∂y z
V. (15 pts, 5 points each) Given the price/demand function p(x) = x − x2 /30.
(a) Find the elasticity at x = 25. (b) Is the market elastic or inelastic at x = 25? Why? (c) Does revenue increase or
decrease if the price increases (from the reference point x = 25)? Why?
VI. (10 pts) Given ln(x y) = 2, find $\frac{d^3y}{dx^3}$. Find this as an implicit function; that is, do not solve for y as a function of x
explicitly.
Test #2, Fall 2005
I. (10 points, 5 points each) Determine the future value of a deposit of $3000 at 5% for 8 months if the interest is
(a) Compounded monthly; (b) Continuously compounded
II. Find each of the following limits.
(a) $\lim_{x\to 0} \frac{1-\cos x}{x^2+x}$ (b) $\lim_{x\to\infty} \frac{5x^3-2x}{7x^3+3}$ (c) $\lim_{t\to 0} \frac{t\sin t}{1-\cos t}$
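III. (15 points; 5 pts each) For the following questions, use the function $f(x) = x^3 - 12x - 5$.
(a) Find all local max and min values of f(x). (b) Using a number line, describe all segments where the function is
increasing or decreasing. (c) Find the global max and min of f(x) on the interval [1, 4].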
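IV. Find the following derivatives:
(a) (5 pts) $\frac{\partial}{\partial y}\left(x y^5 + \frac{\ln y}{x}\right)$ (b) (5 pts) $\frac{\partial^{12}}{\partial x^9\,\partial z^3}\left(\frac{x^6}{\sqrt{\arctan(3z)\,\arcsin(4z^2)}} + \frac{e^{xz}}{z^9}\right)$ (c) (10 pts) $\frac{d^2y}{dx^2}$ if $xy + y^2 = 1$
(d) (15 pts) $\left(\frac{\partial x}{\partial z}\right)_y$ for $x = 5z^2y + 2z\sqrt{y} + z^2e^{2w}$ subject to $y^2 + zw = 31$.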
V. (15 pts, 5 pts each) Given the price/demand function p(x) = x − (x3 /50).
(a) Find the elasticity at x = 4.5. (b) Is the market elastic or inelastic at x = 4.5? Why? (c) By how much will
demand (x) change if price (p) increases by 10% from this point (x = 4.5 reference point)?
VI. (15 pts) Given $f(u, v) = \arctan(uv) + 3uv^2$, and $u = e^{x^2 y}$, $v = \sin(xy)$. Find $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ in terms of u, v, x, and y (not
just x and y alone).
Test #2, Fall 2006
(a) Compounded quarterly (b) Simple interest
II. (20 points, 5 points each) Find each of the following limits. (a) $\lim_{x\to 0} \frac{\sin x}{e^{3x}-1}$ (b) $\lim_{x\to\infty} \frac{\ln x}{\sqrt{x}}$ (c) $\lim_{x\to 1} \frac{\ln(\ln x)}{\ln x}$ (d)
$\lim_{x\to\infty} x\sin(1/x)$ (Hint: Rewrite this product into an equivalent 0/0 or ∞/∞ form.)
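III. (15 points; 5 pts each) For the following questions, use the function $f(x) = x^3 - 3x^2 - 9x + 1$.
(a) Find all local max and min values of f(x). (b) Using a number line, describe all segments where the function is
increasing or decreasing. (c) Find the global max and min of f(x) on the interval [−4, 6].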
∂7 x3 d2y

∂ ln(x z)
(a) (5 pts) (3 x sin y+4 x3 y2 z) (b) (5 pts) + (c) (10 pts) 2 if sin(x y) =
∂y ∂ x ∂ 3z
4 ln(tan(3 z))
sin(4
2
z ) z3 dx
1 ∂x
2 (Solve implicitly; don’t solve for y explicitly). (d) (15 pts) for x = 5 z2 ln(y) + z y e2 w subject to sin(y) +
√ ∂y w
z w = 15. (Use implicit methods only.)
V. (11 pts) Given the price/demand function $p(x) = x - \frac{x^3}{60}$. (a) (5 pts) Find the elasticity at x = 5. (b) (3 pts) Is the
market elastic or inelastic at x = 5? Why? (c) (3 pts) Hey! At this point (x = 5) in the market, would revenue increase
or decrease if price is increased? Why?

VI. (10 pts) Given $f(x, y, z, w) = xzw\sqrt{y} + z^2y - w^5x^3z$ subject to the constraint $\ln(wzx^2y) =$ , find $\left(\frac{\partial f}{\partial x}\right)_{z,w}$ where the
subscript indicates that both z and w are being held constant for this derivative.
VII. (10 pts) Let f(u, v) be any given function where $u = 3x^2 + 2y$ and $v = e^{xy}$. Find $\frac{\partial f}{\partial y}$ in terms of $\frac{\partial f}{\partial u}$, $\frac{\partial f}{\partial v}$, x and y.
VIII. (9 pts) The function f is a measure of productivity of a certain shift at a warehouse. The variable x represents the
money (measured in hundreds of dollars) spent on employee comfort such as improvements in lighting and break room
facilities and y represents money (measured in hundreds of dollars) spent in productivity bonuses. Give an English inter-
pretation of the following:
(a) fx (b) fy (c) fx y
Test #2, Fall 2007
(a) Compounded monthly (b) Simple interest
II. Find each of the following limits.
(a) $\lim_{x\to\pi/2} \frac{\cos(x)}{\sin(x)-1}$ (b) $\lim_{x\to 3} \frac{2x^3-3x^2-11x+6}{x^3-13x+12}$ (c) $\lim_{x\to\infty} \frac{\ln(x)-\cos(x^2)+10^x}{3x^4-2x+1}$ (d) $\lim_{x\to 2} \frac{3x^2-2x+5}{\cos(\pi x)-2x+2}$
III. (15 points) For the following questions, use the function $f(x) = \frac{x^2 - x + 4}{x - 1}$.
(a) (8 points) Find all local max and min values of f (x). (b) (7 points) Find the global max and min of f (x) on the
interval [2, 6].
∂8 z3

∂ p 3 sin(x z) dy
(a) (5 pts) ( x y z − 2 x z3 ) (b) (5 pts) 3 5 3 − 3 z)
+ 3
(c) (10 pts) if 3 x2 y −
∂y ∂ x ∂ z Arctan(4 z z dx
2 ∂x
cosh(x y) = 2 x y . (d) (15 pts) for x = z w2 − ez y with constraint equation w3 + tan z + y2 = 12 (use implicit
∂z w
methods only)
V. (15 pts) Given the price/demand function p(x) = x − (x5 /30).
(a) (5 pts) Find the elasticity at x = 2 (b) (5 pts) Is the market elastic or inelastic at x = 2? Why? (c) (5 pts) From
this point, by what percentage would demand change if price increased by 10%?
VI. (10 pts) If $x^2y^3 = 7$, use implicit methods only to show that $\frac{dy}{dx} = -\frac{2y}{3x}$ and $\frac{d^2y}{dx^2} = \frac{10y}{9x^2}$. Finally use these results to
find $\frac{d^3y}{dx^3}$. NO credit will be given for solving for y and differentiating explicitly!
Test #2, Fall 2008
I. (10 points; 5 points each part) Find both the new amount and the interest paid on $1700 for 9 months at 5% interest, if
the interest is
a.) Compounded monthly b.) Compounded continuously
II. (15 points; 5 points each) Find each of the following limits.
a.) $\lim_{x\to 4} \frac{2x^2-9x+4}{x^2+2x-24}$ b.) $\lim_{x\to\infty} \frac{2x^2-9x+4}{x^2+2x-24}$ c.) $\lim_{x\to\pi} \frac{\sin x}{x-\pi}$
III. (20 points; 10 points each) For this problem, use f (x) = 3 x4 − 4 x3 − 36 x2 + 18. Make sure you give both the x- and
y-coordinates of each point.
a.) Find all the local max(es) and local min(s) of f (x). b.) Find the global max and min of f (x) on the interval [−1, 2].
IV. (20 points total; 5 points each) Albert started a company that makes widgets, which grew into the E.M.C. (Enormous
Multinational Conglomerate). He hired Mugsy, and was somewhat perplexed that he wanted to work in environmental
protection, until he heard Mugsy say to a landscape worker “Nice shrub you have there. How much it worth to you that it
stay that way?” Mugsy seems happy in his new job in security. Dudley, on the other hand, started as a marketing specialist,
using his vast experience from calculus. He found that the demand function for widgets is p = 50 − x2 .
a.) If x = 6, what is the price of the widgets? b.) What is the elasticity η at x = 6? c.) Using the elasticity value
from part b.) and the price of a widget from part a.), estimate the percentage that the demand will go up if the price is
lowered by 3%. d.) Does lowering the price of a widget 3% from the amount in part a.) increase or decrease the
revenue from widgets for E.M.C.? Explain your answer using the previous parts of this problem.
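V. (15 points; 5 points each) Find the following partial derivatives:
a.) $\frac{\partial}{\partial y}\left(\frac{x+y+z}{xy\sin(|y|)+z^2}\right)$ b.) $\frac{\partial^7}{\partial a^3\,\partial b^4}\left(\frac{e^{ab}}{a^4}\right)$ c.) For $w = 3x^3y^2z^2 - 5x^2 + y^3 - z^4$, with constraint equation
$x^5z^3 + 4x^2y^3 + y^2z^5 = 71$, find $\left(\frac{\partial w}{\partial z}\right)_y$.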
VI. (15 points) Find the equation of the line tangent to x3 y2 − 4 x2 y3 − 2 x = −3 at the point (−1, 1).
VII. (10 points) If you have any function f (u, v), where u = x2 − y2 and v = 2 x y, write ∂ f /∂ x as a formula involving
∂ f /∂ u and ∂ f /∂ v and x and y.
Test #2, Fall 2009
I. (10 points; 5 points each) Determine the future value of a deposit of $1000 at 3% for 10 months if the interest is
(a) Compounded monthly (b) Compounded continuously
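II. Find each of the following limits.
(a) $\lim_{x\to\pi} \frac{\cos(x)-1}{\sin(x)}$ (b) $\lim_{x\to 3} \frac{\sqrt{x+1}-2}{x^3-7x-6}$ (c) $\lim_{x\to\infty} \frac{200x^{100}-\sin(x)+e^{x^2}}{40\ln(x^4)+x^7}$ (d) $\lim_{x\to 0} \frac{e^{2x}-1}{x}$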
III. (15 points; as noted) For the questions in this problem, use the function $f(x) = \frac{x}{x^2+2}$.
(a) (8 points) Find all local max and min values of f (x). (b) (7 points) Find the global max and min of f (x) on the
interval [0, 2].
IV. (30 points; as noted) Find the following derivatives.
∂3 z + z−1

∂ dy
sin(x2 y z) + x z3

(a) (5 points) (b) (5 points) sin(y x) + tan −1
(c) (5 points) if x y −
∂y ∂ x ∂ z ∂ y x − x dx
∂y
ln(x y) = 3 x y2 (d) (15 points) for y = cos(z w3 ) + w x2 subject to w2 x z = π. (Use implicit methods only.)
∂z w
V. (15 points; 5 points each) Given the price/demand function p(x) = x − (x2 /10).
(a) Find the elasticity at x = 8. (b) Is the market elastic or inelastic at x = 8? (c) From this point, would revenue
(p x) increase or decrease if price increased by 10%?
VI. (15 points) If $\sqrt{x^2y} = 7$, use implicit methods only to show that $\frac{dy}{dx} = -2y/x$, $\frac{d^2y}{dx^2} = 6y/x^2$, and use these results to
find $\frac{d^3y}{dx^3}$. NO credit will be given for solving for y and differentiating explicitly.
Test #2, Fall 2010
I. (15 points; 5 points each) Find the interest earned by $2500 at 5.4% for 18 months if the interest is calculated by the
following methods:
(a) Simple interest (b) Compounded quarterly (c) Compounded continuously
II. (15 points; 5 points each) Find the following limits:
(a) $\lim_{x\to 2} \frac{\cos(\pi/x)}{x^2+x-6}$ (b) $\lim_{x\to 1} \frac{x^3+3x-4}{2x^2-5x+3}$ (c) $\lim_{x\to\infty} \frac{3x^4-5x^2+x-8}{7x^3+x^2-x+1}$
III. (10 points; 5 points each) For this problem, use the function f (x) = x3 − 3 x + 7.
(a) Find the local max and min values of this function. (b) Find the global max and min of y = f (x) on the interval
[0, 3].
IV. (15 points; 5 points each) Dudley’s new employer D.I.P. (Diversified International Products), is selling wicket greasers
“for all those sticky wickets you run into”. The price/demand function for wicket greasers is p = 95 − x − 2 x2 .
(a) What price corresponds to a demand of 4? (b) What is the elasticity when the demand is 4? (c) Using the results
of part (b), if the price is raised above the level in part (a), will the revenue for D.I.P. increase or decrease?
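V. (30 points; as noted) Find the following derivatives:
(a) (5 points) $\frac{\partial}{\partial t}\left(\frac{x^2\sqrt{y}-\cos t}{\sqrt{5t-3x}}\right)$ (b) (5 points) $\frac{\partial^8}{\partial x^5\,\partial y^3}\left(y^2e^{e^x} + y^2e^x\right)$
(c) (5 points) $\frac{\partial^2}{\partial z\,\partial y}\left(e^{xyz} + x^2y^4\right)$ (d) (15 points) $\frac{d^2w}{dx^2}$ if $w^3 - x^4 = -7xw + 10$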
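VI. (10 points) If you have a function F(u, v) and u and v (me and you?) are given by the equations u = 5 tan(x) sec(y) and
v = 5 tanh(x) sech(y), give the formulas for $\frac{\partial F}{\partial x}$ and $\frac{\partial F}{\partial y}$ in terms of $\frac{\partial F}{\partial u}$ and $\frac{\partial F}{\partial v}$.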

VII. (10 points) Find $\left(\frac{\partial w}{\partial x}\right)_z$ for $w = \ln(xy) + \frac{x}{y+z}$ subject to $\operatorname{Arcsin} x + x^2 + y^2 + z^2 + z = 2$ (use implicit methods only).
Test #2, Fall 2011
I. (15 points, 5 points each) Find the future value in each part.
(a) $2500 loaned out at 4% simple interest for 9 months. (b) $6500 put into an account paying 5% compounded quar-
terly for 7 years. (c) $4000 compounded continuously at 6.3% for 10 years.
II. (20 points; as marked) Find each of the following limits.
(a) (5 pts) $\lim_{x\to 4} \frac{2x^2-3x-20}{x^2+x-20}$ (b) (5 pts) $\lim_{x\to\infty} \frac{7x^3-5x^2+18}{9x^3+25x+27}$ (c) (10 pts) $\lim_{x\to 0} \frac{\ln(\cos(3x))}{7x^2}$
III. (20 points; 10 points each) Dudley has started selling a premium weed eater called MotorGoat. The price per unit, if he
makes x units, is p(x) = 200 − 0.1 x.
(a) Find the elasticity at x = 900. Is the market elastic or inelastic at this sales level? (b) By what percentage will his
sales drop from this level if he increases the price of the MotorGoats by 10%?
IV. (10 points) This is a continuation of the saga from problem III. The cost of making x units is $C(x) = 5000 + 25x - 0.001x^2$. How many units must
Dudley sell to maximize his profits? (Profit is revenue (x p) minus cost).
V. (15 points) Find the global extrema of $f(x) = \frac{x}{4+x^2}$ on the interval [−1, 4].
VI. (30 points; 10 points each) Find the following derivatives:
(a) $\frac{\partial^5}{\partial x\,\partial y^4}\left(\frac{\sin(xy)}{x^4}\right)$ (b) dy/dx for $y = x^2e^y + y^3\ln(x)$ (c) $\left(\frac{\partial w}{\partial x}\right)_y$ for $w = 3z^2y + 4\sin(yz) + \cos(x^2)$, subject
to $x^2 + y^3 + z = 7$.
Test #2, Fall 2012
I. (10 points; 5 points each) Determine how much a deposit of $1200 will earn at 6% for 9 months if the interest is
(a) compounded quarterly. (b) Compounded continuously.
II. Find each of the following limits.
(a) $\lim_{x\to 3} \frac{x^2-5x+6}{2x^2-4x-6}$ (b) $\lim_{x\to\infty} \frac{2x^3-5x^2+6x}{3x^3-4x-6}$ (c) $\lim_{x\to 1} \frac{x\ln(x)}{x^2+2x-3}$
III. (15 points; 5 points each) For the following questions, use the function f (x) = 2 x3 + 3 x2 − 12 x + 3.
(a) Find all local max and min values of f (x). (b) Using the number line, describe all segments where the function is
increasing or decreasing. (c) Find the global max and min of f (x) on the interval [0, 4].
IV. (35 points; as listed) Find the following derivatives.
√ !
∂ x2 sec y ∂3
x y2 z3 − ex y z

(a) (5 points) 2 2
(b) (5 points) 2
(c) (10 points) Use implicit methods only to
∂ x y cos(x ) ∂x∂ y
dx x 2 d2x
show that = if x2 e2 z = 7. Then, find 2 . Simplify your answer as much as possible. (d) (15 points) Find
dz 2z dz
∂w
for w = 3 z x + 4 cos(y z) + sin(x ) subject to x3 + y2 + z = 7.
2 2
∂x y
V. (15 points; 5 points each) Given the price/demand function $p(x) = x - \frac{x^2}{10}$.
(a) Find the elasticity at x = 8. (b) Is the market elastic or inelastic at x = 8? Why? (c) Does revenue increase or
decrease if price increases at x = 8? Why?
VI. (10 points) Suppose that $f(x, y, z) = e^{xyz}$, $x(u, v) = 3u\sin v$, $y(u, v) = 4v^2u$, and $z(u, v) = \sqrt{uv}$. Find $\frac{\partial f}{\partial u}$ and $\frac{\partial f}{\partial v}$.
Test #2, Fall 2013
I. (10 points; 5 points each) Determine the future value of a deposit of $2200 that earns 3% for 9 months if the interest is:
(a) Compounded monthly (b) Compounded continuously
II. Find each of the following limits.
(a) $\lim_{x\to -3} \frac{3x^2+12x+9}{x^2+x-6}$ (b) $\lim_{x\to\infty} \frac{2e^x-x^3+\ln(x)}{4e^{3x}+x^{10}}$ (c) $\lim_{x\to 0} \frac{[\sin(x)]^2}{\tan(x)}$
III. (25 points; as marked) For the following questions, use the function f (x) = 3 x4 − 4 x3 − 12 x2 + 5
(a) (10 points) Find all local max and min values of f (x). (b) (5 points) Using a number line, describe all segments
where the function is increasing or decreasing. (c) (10 points) Find the global max and min of f (x) on the interval
[1, 4].
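IV. (45 points; as marked) Find the following derivatives.
(a) (10 points) $\frac{\partial}{\partial x}\left(\frac{x^2\sin(\sqrt{y})}{y^2\cos(x^2)}\right)$ (b) (10 points) $\frac{\partial^3}{\partial x\,\partial y^2}\left(xy^2z^2 - \sin(yz)\right)$ (c) (10 points) $\frac{\partial x}{\partial z}$ if $\cos(wxz^2) - y^3xz = 3zx^{2/3}$
(d) (15 points) $\left(\frac{\partial w}{\partial x}\right)_z$ for $w = 3z^3x^2 + 4\sinh(yz) + \sin(zx^2)$ subject to $z^2x^3 + y^2\sqrt{z} = 7$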
V. (15 points; 5 points each) Given the price/demand function p(x) = x2 − (x3 /30).
(a) Find the elasticity at x = 25. (b) Is the market elastic or inelastic at x = 25? Why? (c) Does revenue increase or
decrease if price increases at x = 25? Why?
Summary sheet
Usual derivative formulas (see chapter 1), plus:
Elasticity $= \eta = \frac{p/x}{dp/dx} = \frac{dx/x}{dp/p}$
Simple interest: FVIF = 1 + r t

Compound interest: FVIF = (1 + k)n
Continuously compounded interest: FVIF = er t
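
As a quick check of these formulas, the Maple sketch below reworks two of the test problems above whose answers appear in the keys that follow. It assumes that k in the compound interest formula is the interest rate per compounding period and n is the number of periods (the summary sheet does not define them), and the name elast is just a label chosen here.

```maple
# Elasticity (p/x)/(dp/dx) for the price/demand function p = 200 - 0.1 x
# (Fall 2011, problem III), evaluated at x = 900.
p := x -> 200 - 0.1*x;
elast := (p(x)/x) / diff(p(x), x);
eval(elast, x = 900);      # about -1.222

# Future value of $3000 at 5% for 8 months (Fall 2005, problem I).
# Assumption: k = 0.05/12 per month and n = 8 months.
3000*(1 + 0.05/12)^8;      # compounded monthly, about 3101.47
3000*exp(0.05*8/12);       # compounded continuously, about 3101.69
```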
I. (a) $2067.68 (b) $2067.79

II. (a) 10/51 (b) 0 (c) Does not exist
III. (a) Local mins of −11 at x = −1, 1; local max of −10 at x = 0. (b) Since putting the number line in here is awk-
ward, here is the description. The curve is decreasing for x < −1, increasing for −1 < x < 0, decreasing for 0 < x < 1, and
increasing for x > 1. (c) The global max is 53 at x = 3; the global min is −11 at x = 1.
2 2 (−w/z)+(3 x3 +4 z sec2 (y z))
IV. (a) 2 x y ex y + x2 y2 ex y (b) 0 (c) − 2 x y z+x√yy−z−1 (d) − 3 z w 9 x2 y−1
V. (a) η = −1/4 (b) Inelastic, since |η | < 1 (c) Increase, since the market is inelastic at that price.
VI. −6 y/x3
I. (a) $3101.47 (b) $3101.69

II. (a) 0 (b) 5/7 (c) 2
III. (a) Relative max at (−2, 11); relative min at (2, −21). (b) Since a number line is space-consuming, here is the
description. For x < −2, the function is increasing; for −2 < x < 2, the function is decreasing; for x > 2, the function is
increasing. (c) Global max at (4, 11); global min at (2, −21).
IV. (a) $5xy^4 + 1/(xy)$ (b) $x^3e^{xz}$ (c) $y(x+2y)^{-2} + \left[-(x+2y)^{-1} + 2y(x+2y)^{-2}\right]\left(\frac{-y}{x+2y}\right)$ (d) $10zy + 2\sqrt{y} + 2ze^{2w} - \frac{w}{z}\left(2z^2e^{2w}\right)$
V. (a) $\eta = \frac{1-(x^2/50)}{1-(3x^2/50)} = \frac{50-x^2}{50-3x^2}$ (b) Elastic, since |η| = 2.767 > 1. (c) Demand decreases by 27.67%.

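VI. $\frac{\partial f}{\partial x} = \left(\frac{v}{1+u^2v^2} + 3v^2\right)\left(2xy\,e^{x^2y}\right) + \left(\frac{u}{1+u^2v^2} + 6vu\right)\left(y\cos(xy)\right)$, $\frac{\partial f}{\partial y} = \left(\frac{v}{1+u^2v^2} + 3v^2\right)\left(x^2e^{x^2y}\right) + \left(\frac{u}{1+u^2v^2} + 6vu\right)\left(x\cos(xy)\right)$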
I. (a) $5113.35 (b) $5112.50

II. (a) 1/3 (b) 0 (c) ∞ or undefined (d) 1
III. (a) Max at (−1, 6), min at (3, −26) (b) To save space, here is a description: To the left of −1 and to the right of 3,
the function is increasing. Between −1 and 3, the function is decreasing. (c) Global max of 55 at x = 6, global min of
−75 at x = −4.
2 h i
2y 2 w − cos y
IV. (a) 3 x cos y + 8 x3 y z (b) x360 5z
× 10 z ln y + y e2 w

4 z6 (c) x2
(d) y + z e √
w
V. (a) η = −7/3 (b) Elastic, since |η | > 1. (c) Decrease, since revenue moves the opposite direction from price in
an elastic market.
VI. $(zw\sqrt{y} - 3w^5x^2z) - \left(\tfrac{1}{2}xzw\,y^{-1/2} + z^2\right) \times \left[\frac{2y}{x}\right]$
VII. $f_u\,(2) + f_v\,(x\,e^{xy})$
VIII. (a) The rate productivity changes per hundred dollars spent in employee comfort (b) The rate productivity changes
per hundred dollars spent in productivity bonuses (c) The rate productivity changes per hundred dollars spent in em-
ployee comfort per hundred dollars spent in productivity bonuses
I. (a) $3153.42 (b) $3150.00

II. (a) Does not exist (b) 25/14 (c) Does not exist (d) −13
III. (a) Local max of −3 at x = −1; local min of 5 at x = 3 (b) Global max of 6.8 at x = 6; global min of 5 at x = 3
6 x y − y sinh(x y) − 2 y2
IV. (a) (1/2) (x y3 z)−1/2 (3 x y2 z) (b) x5 sin(x z) (c) − (d) (, w2 −y ez y )−(z ez y (sec2 z)/(2 y))
3 x2 − x sinh(x y) − 4 x y
V. (a) η = −0.28 (b) Inelastic, because |η | < 1 (c) Fall by 2.8%
VI. −(80/27) y/x3
I. (a) $1764.82, $64.82 (b) $1764.96, $64.96

II. (a) 7/10 (b) 2 (c) −1
III. (a) Local mins at (3, −171) and (−2, −46); local max at (0, 18) (b) Global max at (0, 18); global min at (2, −110)
IV. (a) p = 14 (b) η = −7/36 (c) Demand increases by about 7/12% ≈ 0.5833%. (d) Revenue decreases,
since |η | < 1, so the market is inelastic.
V. (a) $\frac{(xy\sin(|y|) + z^2)(1) - (x+y+z)\left[x\sin(|y|) + xy\cos(|y|)\,|y|/y\right]}{(xy\sin(|y|) + z^2)^2}$ (b) $b^3e^{ab}$ (c) $(6x^3y^2z - 4z^3) - (9x^2y^2z^2 - 10x)\,\frac{3x^5z^2 + 5y^2z^4}{5x^4z^3 + 8xy^3}$
VI. $y - 1 = \frac{9}{14}(x + 1)$.
VII. $\frac{\partial f}{\partial x} = 2x\frac{\partial f}{\partial u} + 2y\frac{\partial f}{\partial v}$.
I. (a) $1025.28 (b) $1025.32

II. (a) Does not exist (b) 1/80 (c) ∞ (d) 2
III. (a) Max value is $\sqrt{2}/4$ at $x = \sqrt{2}$. Min value is $-\sqrt{2}/4$ at $x = -\sqrt{2}$. (b) Max of $\sqrt{2}/4$ at $x = \sqrt{2}$, min of 0 at x = 0.
IV. (a) x2 z cos(x2 y z) (b) 0 (c) −(y − (1/x) − 3 y2 )/(x − (1/y) − 6 x y) (d) −w3 sin(z w3 ) − 2 w x (x/z)
V. (a) η = −1/3 (b) Inelastic, since |η | < 1 (c) Increase
VI. For $F(x, y) = \sqrt{x^2y}$, $dy/dx = -F_x/F_y = -\left(\tfrac12(x^2y)^{-1/2}(2xy)\right)/\left(\tfrac12(x^2y)^{-1/2}(x^2)\right) = -2y/x = G(x, y)$. Then $d^2y/dx^2 = G_x + G_y\,(dy/dx) = (2y/x^2) + (-2/x)(-2y/x) = 6y/x^2 = H(x, y)$, and $d^3y/dx^3 = H_x + H_y\,(dy/dx) = (-12y/x^3) + (6/x^2)(-2y/x) = -24y/x^3$.
I. (a) $202.50 (b) $209.46 (c) $210.93

II. (a) π/20 (b) −6 (c) ∞
III. (a) Local min at x = 1, y = 5; local max at x = −1, y = 9. (b) Global min at (1, 5); global max at (3, 25).
IV. (a) p = 59 (b) η = −0.8076 (c) Increase, because |η | < 1 so the demand is inelastic at the price.
V. (a) $\frac{(\sqrt{5t-3x})(\sin t) - (x^2\sqrt{y} - \cos t)(1/2)(5t-3x)^{-1/2}(5)}{5t-3x}$ (b) 0 (c) $x\,e^{xyz} + x^2yz\,e^{xyz}$
(d) $\frac{(3w^2+7x)(12x^2) - (4x^3-7w)(7)}{(3w^2+7x)^2} + \left(\frac{4x^3-7w}{3w^2+7x}\right)\frac{(3w^2+7x)(-7) - (4x^3-7w)(6w)}{(3w^2+7x)^2}$
VI. Fx = 5 Fu sec2 x sec y + 5 Fv sech2 x sech y, Fy = 5 Fu tan x sec y tan y − 5 Fv tanh x sech y tanh y.
VII. $\left(\frac{1}{x} + \frac{1}{y+z}\right) - \left(\frac{1}{y} - \frac{x}{(y+z)^2}\right)\left(\frac{\frac{1}{\sqrt{1-x^2}} + 2x}{2y}\right)$

I. (a) $2575 (b) $9203.95 (c) $7510.44

II. (a) 13/9 (b) 7/9 (c) −9/14
III. (a) η = −1.222. Since |η | > 1, the market is elastic. (b) −12.22%, or a decrease of 12.22%.
IV. Maximum profit when x = 883.84.
V. Global minimum for f (−1) = −1/5; global maximum for f (2) = 1/4.
VI. (a) $y\cos(xy)$ (b) $-\frac{2xe^y + (y^3/x)}{x^2e^y + 3y^2\ln(x) - 1}$ (c) $-2x\sin(x^2) - (2x)\left[6zy + 4y\cos(yz)\right]$
I. (a) $1254.81 (b) $1255.23

II. (a) 1/8 (b) 2/3 (c) 1/4
III. (a) Local max of 23 (at x = −2); local min of −4 (at x = 1) (b) The function is increasing for x < −2 and for x > 1;
the function is decreasing for −2 < x < 1. (c) Global min of −4 at x = 1; global max of 131 at x = 4.
IV. (a) $\frac{(y^2\cos(x^2))(2x\sec(\sqrt{y})) - (x^2\sec(\sqrt{y}))(y^2(-\sin(x^2))(2x))}{(y^2\cos(x^2))^2}$ (b) $2z^3 - (2xz^2 + x^2yz^3)\,e^{xyz}$ (c) Let $F(x, z) = x^2e^{2z^2}$, so the equation is F(x, z) = 7. Then dF = d(7) = 0, so $dF = \frac{\partial F}{\partial x}dx + \frac{\partial F}{\partial z}dz = 0$, which gives $(2xe^{2z^2})\,dx + (x^2e^{2z^2}(4z))\,dz = 0$. Then $(2xe^{2z^2})\,dx = -(4x^2ze^{2z^2})\,dz$, or $\frac{dx}{dz} = -\frac{4x^2ze^{2z^2}}{2xe^{2z^2}} = -2xz$. To find $\frac{d^2x}{dz^2}$, you differentiate $\frac{dx}{dz}$ with respect to z, remembering that x is treated as a function of z. That gives $\frac{d^2x}{dz^2} = \frac{d}{dz}(-2xz) = -2\left(\frac{dx}{dz}z + x(1)\right) = -2((-2xz)z + x) = 4xz^2 - 2x$. (d) $(3z^2 + 2x\cos(x^2)) - (3x^2)(6zx - 4y\sin(yz))$
V. (a) η = −1/3 (b) Inelastic, since |η | < 1. (c) Increases, since the market is inelastic.
VI. $\frac{\partial f}{\partial u} = (yz\,e^{xyz})(3\sin v) + (xz\,e^{xyz})(4v^2) + (xy\,e^{xyz})\left((1/2)\,u^{-1/2}v^{1/2}\right)$ and $\frac{\partial f}{\partial v} = (yz\,e^{xyz})(3u\cos v) + (xz\,e^{xyz})(8uv) + (xy\,e^{xyz})\left((1/2)\,v^{-1/2}u^{1/2}\right)$
I. (a) $2250.00 (b) $2284.07

II. (a) 6/5 (b) 0 (c) 0
III. (a) Relative max of 5 at x = 0, relative min of −27 at x = 2, relative min of 0 at x = −1. (b) Decreasing for x < −1,
increasing for −1 < x < 0, decreasing for 0 < x < 2, increasing for 2 < x. (c) Global max of 325 at x = 4, global min of −27 at x = 2.
IV. (a) $\frac{(y^2\cos(x^2))(2x\sin(\sqrt{y})) - (x^2\sin(\sqrt{y}))(-y^2\sin(x^2)(2x))}{[y^2\cos(x^2)]^2}$ (b) $2z^2$ (c) $-\frac{-2wxz\sin(wxz^2) - y^3x - 3x^{2/3}}{-wz^2\sin(wxz^2) - y^3z - 2zx^{-1/3}}$
(d) $6z^3x + 2xz\cos(zx^2) - \frac{3z^2x^2}{2y\sqrt{z}}\,(4z\cosh(yz))$
V. (a) η = −1/3 (b) Inelastic, since |η | = 1/3 < 1. (c) Increase, since the market is inelastic at x = 25.
Chapter 4
Integration explained
4.1 The concepts behind integrals.

Differentiation is only half of calculus. Integration is the other half.
Mugsy: If he brings back gnomes and green boxes, I’m giving up.
Albert: Promise?
It represents the inverse process to differentiation. Integration is more useful, and more complicated. The different inter-
pretations we got for derivatives will be relevant here, too.
Dudley: Bye, Mugsy.
Mugsy: Hey, if you two both want to get rid of me that much, I’m staying, just to make your lives miserable.
Albert: Good work, Dudley.
Consider this problem. You want to find the voltage drop over a long distance with a voltmeter when the probe leads
are too short to go the whole way. You must break the distance up, measure the voltage drops along each segment, and
combine to get the total voltage drop. I want to frame this question in the terms that we have been using.
Mugsy: Is this really relevant?
Albert: Hang in there. It will be soon. I can see where he’s going.
Suppose, then, that y = F(x) = voltage as a function of x = position. That, then, is the voltage the meter measures,
almost. Actually, it measures the difference between the voltages at the two ends. Then ∆y = voltage the meter measures.
We want ∆y = F(b) − F(a), but can’t use ∆x = b − a, since that’s too large for the wires on the meter to reach.
Mugsy: Give those wires here, I’ll make ’em reach.
Add auxiliary points at $x = x_1, x_2, \ldots, x_{n-1}$, keeping $x_0 = a$ and $x_n = b$, using $\Delta x_j = x_j - x_{j-1}$, the old “(end)-(beginning)”
again. We can measure $\Delta y_1 = F(x_1) - F(x_0)$, $\Delta y_2 = F(x_2) - F(x_1)$, . . . , $\Delta y_n = F(x_n) - F(x_{n-1})$, because we can reach all
of those ∆x’s in a single step. But then ∆y, the entire voltage drop, is equal to $\Delta y_1 + \Delta y_2 + \cdots + \Delta y_n$, since
$$\begin{aligned}
\Delta y_1 + \Delta y_2 + \cdots + \Delta y_n &= (F(x_1) - F(x_0)) + (F(x_2) - F(x_1)) + \cdots + (F(x_n) - F(x_{n-1})) && (4.1)\\
&= F(x_n) - F(x_0) && (4.2)\\
&= F(b) - F(a) && (4.3)\\
&= \Delta y && (4.4)
\end{aligned}$$
In English, the total voltage drop is equal to the sum of all the individual voltage drops.
Dudley: Hey, that makes some sense to me. How about you, Mugsy?
Mugsy: I’m afraid it does to me, too.
Dudley: Afraid?
Mugsy: Yeah. Not scared, of course. But when math starts making too much sense, I know something’s got to be
seriously wrong.
This collapsing property in summations is called telescoping.
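To see the telescoping with actual numbers, here is a small worked case. The function F(x) = x² and the points 1, 2, 4, 7 are made up for illustration only; they are not part of the voltage problem.

```latex
\begin{align*}
\Delta y_1 + \Delta y_2 + \Delta y_3
  &= \bigl(F(2) - F(1)\bigr) + \bigl(F(4) - F(2)\bigr) + \bigl(F(7) - F(4)\bigr) \\
  &= (4 - 1) + (16 - 4) + (49 - 16) \\
  &= 3 + 12 + 33 = 48 \\
  &= 49 - 1 = F(7) - F(1)
\end{align*}
```

The middle values F(2) = 4 and F(4) = 16 each appear once with a plus sign and once with a minus sign, so only the two end values survive.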
Summations occur sufficiently often that mathematicians have developed a summation notation for them. For example,
the summation
$$f(1) + f(2) + \cdots + f(74)$$
would be written out as
$$\sum_{j=1}^{74} f(j)$$
What you do is successively replace the j in f(j) by the values starting at (as indicated by the value below the Σ sign) 1, 2,
3, . . . , up to 74, (as shown above the Σ) and evaluate f(j) for each of those values, giving f(1), f(2), f(3), . . . , f(74). You
then add up all of the values to get
$$f(1) + f(2) + f(3) + \cdots + f(74)$$
For another example,
$$\sum_{j=0}^{7} (j^2 + 3) = 3 + 4 + 7 + 12 + 19 + 28 + 39 + 52 = 164$$
It actually is not difficult once you get the hang of it.

Mugsy: This is getting scary. Much too much sense.
Albert: Let me reassure you, it will get complicated soon.
Mugsy: Really? Now I feel better, I think.
This notation is particularly handy when you don’t know the upper limit of summation, as in the addition of the ∆y’s
above. The formula we got there was $\sum_{j=1}^{n} \Delta y_j = \Delta y$ or
$$\sum_{j=1}^{n} \bigl(F(x_j) - F(x_{j-1})\bigr) = F(x_n) - F(x_0) = F(b) - F(a).$$
We will be using summation notation extensively for the rest of the year, so it will be handy to have some feel for it.
Here are a few properties of summation notation:
• If c is a constant then
$$\sum_{j=1}^{n} c\,x_j = c \sum_{j=1}^{n} x_j$$
It looks as though the c simply moved outside the summation sign (as in fact it did). Written out, this is $(cx_1 + cx_2 +
\cdots + cx_n) = c(x_1 + x_2 + \cdots + x_n)$, the familiar distributive law. In fact, most of these properties are nothing more than
arithmetic laws written in a way that tends to obscure the familiarity.
Mugsy: Why do I have the idea that mathematicians write things in the most obscure way possible?
Albert: Pick any academic discipline. They all tend to write things in picky and obscure ways, usually to
camouflage the fact that what they are saying is really trivial and obvious. Mathematicians are no exception.
The only trick is to learn the lingo, and that’s most of the battle.
• Another familiar law is
$$\sum_{j=1}^{n} (a_j + b_j) = \sum_{j=1}^{n} a_j + \sum_{j=1}^{n} b_j$$
which written out is
$$(a_1 + b_1) + (a_2 + b_2) + \cdots + (a_n + b_n) = (a_1 + a_2 + \cdots + a_n) + (b_1 + b_2 + \cdots + b_n)$$
the commutative and associative laws for addition.

Mugsy: Another one of those trivial things, huh Al?
Albert: Yup.
• Notice, however, that there are some things that you are not allowed to do with summations.
$$\sum_{j=1}^{n} (a_j \times b_j) \neq \left(\sum_{j=1}^{n} a_j\right) \times \left(\sum_{j=1}^{n} b_j\right)$$
which written out for the case of n = 2 says
$$(a_1b_1 + a_2b_2) \neq (a_1 + a_2) \times (b_1 + b_2)$$
a common algebra error.

Summations work well for addition (and subtraction), but not multiplication (or division), very much like derivatives.
Mugsy: I guess I’m not supposed to do that, right?
Albert: You got it.
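A small numerical case makes the three properties concrete. The values a = 1, 2, 3 and b = 4, 5, 6 are chosen here only for illustration.

```latex
\begin{align*}
\sum_{j=1}^{3} 2a_j &= 2 + 4 + 6 = 12 = 2\,(1 + 2 + 3) = 2\sum_{j=1}^{3} a_j \\
\sum_{j=1}^{3} (a_j + b_j) &= 5 + 7 + 9 = 21 = 6 + 15 = \sum_{j=1}^{3} a_j + \sum_{j=1}^{3} b_j \\
\sum_{j=1}^{3} a_j b_j &= 4 + 10 + 18 = 32 \neq 90 = 6 \times 15
  = \Bigl(\sum_{j=1}^{3} a_j\Bigr) \times \Bigl(\sum_{j=1}^{3} b_j\Bigr)
\end{align*}
```

The first two lines balance, while the product line is off by a wide margin.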
When there is more room, the limits of summation (the things attached to the Σ) will shift slightly around to be on top
and bottom of the summation sign. With less room, the limits of summation appear as subscripts and superscripts. Check
out the difference in placement between $\sum_{j=1}^{n} f(j)$ and
$$\sum_{j=1}^{n} f(j)$$
This is fairly common practice.

So, summation notation is merely a way of compressing lengthy additions that would otherwise be cumbersome. Sums
have their own rules, but those rules are nothing more than the rules that you already have learned for addition.
As you might guess, Maple does sums, too. The notation is similar to what we have just given. To get Maple to evaluate
$$\sum_{j=0}^{7} (j^2 + 3)$$
give it the command

> sum( j^2+3, j = 0 .. 7 );
164
and Maple types back the answer. Maple will use rational arithmetic (exact!) when possible. (There are a few excep-
tions, but you will probably never encounter them.) When the sum is not evaluable in simple form, such as including unspec-
ified limits, Maple will return a stylized summation sign. Watch what happens when you type in sum(exp(sin(a)),a=0..n);.
> sum(exp(sin(a)), a = 0 .. n );
$$\sum_{a=0}^{n} e^{\sin(a)}$$
Maple simply printed it out as nicely as possible.
Mugsy: Maple couldn’t handle it! I was beginning to think that Maple does everything.
Albert: In this case, Maple couldn’t do it because it can’t be done.
Mugsy: Rats.
The situation where one or both limits of the summation are unspecified is called indefinite summation. Maple can handle
some of these (at least ones that can be handled). Watch.
> sum( j^2+3, j = 0 .. n );
$$\frac{(n+1)^3}{3} - \frac{(n+1)^2}{2} + \frac{19\,n}{6} + \frac{19}{6}$$
> subs(n=7, %);
164
First, Maple calculated a formula for the summation that is valid for any value of n, and then we had it substitute n = 7
into the formula and checked that the formula does work (for that value at least).
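Checking one value of n is not much of a test, so here is a slightly broader check, offered as a sketch. It uses add, the standard Maple command that adds up explicit terms without looking for a formula; add is not introduced in the text, and the names S and k are chosen here for illustration.

```maple
# Closed form for the sum from the text, valid for any n.
S := sum(j^2 + 3, j = 0 .. n);

# Values of the closed form for n = 0, 1, 2, 3 ...
seq(eval(S, n = k), k = 0 .. 3);

# ... compared with adding the terms one at a time.
seq(add(j^2 + 3, j = 0 .. k), k = 0 .. 3);
```

Both sequences should read 3, 7, 14, 26, since the terms being added are 3, 4, 7, and 12.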
Homework #31
Exercises.
1. Find the values of the following summations. Check your answers with Maple, if you want.
(a) $\sum_{j=0}^{5} (j - 1)$
(b) $\sum_{k=2}^{7} (k^2 - 5)$
(c) $\sum_{j=0}^{3} 2^{-j}$
(d) $\sum_{l=1}^{4} (l - 1)^l$
(e) $\sum_{r=6}^{10} x^r$
(f) $\sum_{r=6}^{10} r^x$
2. Find the values of the following summations. Check your answers with Maple, if you want.
(a) $\sum_{j=0}^{5} (j - 3)$
(b) $\sum_{k=2}^{7} (k^2 + 3)$
(c) $\sum_{j=0}^{5} 2^j$
(d) $\sum_{l=1}^{3} l^l$
(e) $\sum_{r=7}^{12} x^r$
(f) $\sum_{r=7}^{12} r^x$
3. Make up some summations of your own. As usual, three of them will count for credit.
Problems.
1. Find $\sum_{n=0}^{87345} \cos(n\pi)$. (Obviously, the straightforward approach is not going to work well by hand. Try figuring out
by hand the values of $\sum_{n=0}^{k} \cos(n\pi)$ for k = 1, 2, 3, 4, 5, and 6. See if you can find a pattern. Then figure out where
87345 fits into that pattern.) You can check your answer with Maple again.
2. This problem (and the next) develop some properties of summations.

(a) Write out the numbers in each of these summations and then add them:
$$\sum_{j=1}^{7} (j - 3), \quad \sum_{k=1}^{7} (k - 3), \quad \sum_{r=1}^{7} (r - 3), \quad\text{and}\quad \sum_{n=1}^{7} (n - 3)$$
(b) What do you notice about the answers in the previous part? Is there an “obvious” reason for this? (This
observation is called the principle of ignorance: A variable doesn’t know what you call it.)
3. This problem looks at another property of summations.
(a) Write out the numbers in the summations and then add them:
$$\sum_{j=1}^{7} (j - 3), \quad \sum_{j=0}^{6} (j - 2), \quad \sum_{j=4}^{10} (j - 6), \quad \sum_{j=-7}^{-1} (j + 5), \quad\text{and}\quad \sum_{j=25}^{31} (j - 27)$$
(b) What do you notice about the answers to the previous part? Is there an “obvious” reason for this? This process
is called shifting the index of summation. Notice that the limits of summation and the function change, so that
the same numbers are obtained in each sum.
(c) Give the summation that should be used to duplicate the sums in the first part of this problem if the lower limit
of summation is j = 10. The numbers in each sum should be the same. Note that when the value of j in the
sum gets larger, the limits on the summation must get smaller to give the same numbers.
(d) Change
$$\sum_{j=5}^{10} (j^2 - 15)$$
to a summation
$$\sum_{k=0}^{?} (??)$$
Do this in a way that causes all the numbers in both sums to be equal. The other limit is not too difficult to figure
out, but the function is more complicated. Think of what the relation between j and k should be, and plug that
into the summation.
4.1.1 Anti-wiggle factors (anti-derivatives) = definite integrals.

What would happen if you can’t find ∆y for any value of ∆x?
Dudley: That one’s easy. You quit.
Albert: Not necessarily. Watch.
We’d need some sort of information. If we can find dy for any dx, can we reassemble the dy’s (which are submicroscopic,
remember) into a ∆y? Yes, by a process called integration.
Mugsy: Why doesn’t that surprise me?
Knowing dy for any dx is the same as knowing dy/dx, since then dy = (dy/dx) dx. So, what we want to add up are all
of these (dy/dx) dx’s. But regular sums can’t add up differentials. We need something new. That is called an integral.
The result of this summation will be a number, a value of ∆y. Such things are called definite integrals. Something called
indefinite integrals will appear shortly as a means of calculating definite integrals.
Dudley: Is that anything like using the formulas for indefinite summations to find regular summations?
Albert: Exactly! Congratulations on figuring that out!
Mugsy: Dumb luck.
Adding up wiggles (slice it up and put it back together).

The basic idea is that we change from
$$\sum \Delta y_j = \Delta y$$
to
$$\int dy = \Delta y$$
The $\int$ is called an integral sign, and is a stylized elongated S, short for summa (sum in Latin). The notation is due to
Leibniz.
We need more information, though. The limits of summation are important in sums. It wasn’t just $\sum \Delta y_j = \Delta y$, it was
$$\sum_{j=1}^{n} \Delta y_j = y(x_n) - y(x_0)$$
The notation for integrals is more properly
\[ \int_{x=a}^{x=b} dy = y(b) - y(a) \]
If we knew y = F(x), it would be easy:
\[ \int_{x=a}^{x=b} dy = F(b) - F(a) \]
just as before. The trick is to find F(x). What we are given is $dy/dx = F'(x)$, the wiggle magnification factor. So, the idea
is to “undo” the process of taking a derivative, to go from an $F'(x)$ that is given to you and find the F(x). A simple example
is $\int x^2\,dx$, with x going from 1 to 7, or as it is more commonly written
\[ \int_1^7 x^2\,dx \]
(You should omit the x in the limits when the variable matches the differential.) How do we find F(x)? That’s the whole
key to these problems. We know that $dy = d(F(x)) = F'(x)\,dx = x^2\,dx$, so we need to find F(x) that satisfies $F'(x) = x^2$. In
this case, we can actually think about it for a minute
Dudley: For Albert, it’s about a microsecond.
and realize that we can find such a function, namely $F(x) = \frac{1}{3} x^3$. Then the change in y, ∆y, is given by
\begin{align}
\Delta y &= F(7) - F(1) \tag{4.5} \\
&= \frac{1}{3}(7)^3 - \frac{1}{3}(1)^3 \tag{4.6} \\
&= 114 \tag{4.7}
\end{align}
We will systematize this considerably more.

What we have done in this example is to declare that if the dy is related to the dx by dy = x2 dx, then as x changes from
x = 1 to x = 7, then y must change by 114. Note that you don’t know what y changed to—that is, what the final value of y
is—you can only tell how much y changed. To find out what y changed to, you’d have to know what value y started with.
We’ll do that later, too.
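If you would like to see the "adding up" happen numerically, here is a rough Maple sketch. It adds up a large (but finite) number of small pieces $x^2\,\Delta x$ between x = 1 and x = 7 and compares the total with F(7) − F(1). The names F, steps, and width are chosen here only for illustration, and the number of steps is an arbitrary assumption; add is Maple's procedure for adding an explicit list of terms.

```maple
F := x -> x^3/3:                 # a function whose derivative is x^2
F(7) - F(1);                     # the exact change in y: 114

steps := 6000:                   # number of small steps (an arbitrary choice)
width := (7 - 1)/steps:          # width of each small step
# add up x^2 * width, taking x at the right-hand end of each step
evalf(add((1 + i*width)^2 * width, i = 1 .. steps));
```

The second result should come out close to 114, and closer still if you increase the number of steps.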
Move from submicroscopic to any size at all.

This process is more important than you might imagine. It often turns out that estimating the size of ∆y is exceedingly
complicated, but finding dy in terms of dx is not so bad. (For one thing, you can assume that variables are constants, since
nothing can change when all you do is wiggle by dx.) Then you simply add up all the dy’s with an integral. All of the
applications that we will be doing that require integration end up using this idea. It is the key to all integration problems.
Dudley: Al, is it really that important?
Albert: YES!
Population growth “solved.”

We looked at population growth earlier when we were introducing exponentials, and then again briefly when we were
beginning differentials. We look at it again (and again). Remember that our initial estimates will be bad in the long run.
(For how bad, see the homework.)
Mugsy: I just love it when math bombs out. . . .
But later we will improve them considerably.
As before, P = population (of rabbits, this time), and t = time.
Dudley: Why rabbits, Al?
Albert: Because human populations are more difficult. We will discuss those later.
We want to estimate dP for a (very small) time interval dt. Essentially, dP = k P dt, which says that the population growth
for short time periods is proportional (k) to the population (P) and the length of time you wait (dt). (If this still causes
discomfort, go back to some of the earlier discussions on population and differentials. See page 93.) To solve this, we need
a technique called separation of variables: Pull the P to the side of the equals sign with the dP. Any t’s that happened to be
present (in this case there aren’t any) would go to the side with the dt. This gives
\[ \frac{dP}{P} = k\,dt \]
We need a function whose differential is dP/P, and the answer is ln P. The function with differential k dt is even easier,
being d(k t). (Remember that k is a constant.)
Then dP/P = k dt becomes
d(ln P) = d(k t)
meaning that the tiny wiggles in each match. Then the larger wiggles must match, too, so
∆(ln P) = ∆(k t)
We are now getting close to the answer. (The step from the differential to the large-scale change is the hard one!)
Mugsy: Al, is that what integration is supposed to do?
Albert: Exactly. Every time.
If we have values for k and ∆t, we can find the change in P.
To make this concrete, suppose the initial population is P(0) = 50.
Dudley: That sure implies that we aren’t working with humans.
Then we can make some progress finding ∆P
\begin{align}
\Delta(\ln P) &= \ln(P(10)) - \ln(P(0)) \tag{4.8} \\
&= \ln(P(10)) - \ln(50) \tag{4.9} \\
&= \ln(P(10)) - 3.912 \tag{4.10}
\end{align}
But we also need more information, so suppose we take k = 0.1, and t going from 0 to 10 (the units of time might be
months). Then we can get
\begin{align}
\Delta(k t) &= [(k t) \text{ at } t = 10] - [(k t) \text{ at } t = 0] \tag{4.11} \\
&= (0.1 \times 10) - (0.1 \times 0) \tag{4.12} \\
&= 1 - 0 \tag{4.13} \\
&= 1 \tag{4.14}
\end{align}
So, setting ∆(ln P) = ∆(k t), we get

ln(P(10)) − 3.912 = 1
We can solve this for ln(P(10)), giving
ln(P(10)) = 1 + 3.912 = 4.912
We can un-do the logarithms by taking the exponential of both sides and get (remember that a common notation for $e^x$ is
exp(x))
P(10) = exp(4.912) = 135.9
What’s the answer? How do you have 135.9 rabbits? Three approaches suggest themselves. One says that you don’t have another rabbit
until the population hits a whole number, so there would be 135 rabbits. Another says that you should round, getting 136
rabbits. A third is that you should count the next rabbit as soon as possible, also getting 136. Either answer (135 or 136) would be
considered correct, and there is never much difference between the approaches.
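The arithmetic above is easy to redo on Maple. This is only a sketch of the calculation we just did by hand; the names Pstart, rate, lnP10, and P10 are made up for this illustration.

```maple
Pstart := 50:                            # initial population P(0)
rate := 0.1:                             # specific growth rate k
lnP10 := evalf(ln(Pstart) + rate*10);    # ln(P(10)) = ln(50) + 1, about 4.912
P10 := exp(lnP10);                       # about 135.9
floor(P10), round(P10);                  # 135 and 136, the two candidate rabbit counts
```

Changing the 10 to another value of t gives the population at that time, under the same assumptions.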
The value of k can be examined a little more carefully, too. It is called the specific growth rate. It represents the rate
of growth per rabbit, since it is k = (1/P)(dP/dt). Just the derivative dP/dt is the growth rate, but that changes with the
population. But (1/P)(dP/dt) is the growth rate per unit population (dividing by P gives the “per unit population” part),
and that tends to stay much closer to constant. (Later, we will improve the assumption that the specific growth rate is
constant. We will also try this on a more realistic example, namely the census data for the U.S.A. from 1790 to 1990.)
Homework #32
Exercises.
1. For this exercise, use the same values for initial population (P(0) = 50) and specific growth rate (k = 0.1) that were
used in class.
(a) What would the rabbit population be at t = 36 (3 years)?
(b) What would the rabbit population be at t = 720 (60 years)?
(c) Assume that each rabbit weighs 5 pounds. (Don’t get picky, it’s just an assumption.) How much would the total
number of rabbits in the previous part weigh?
(d) The earth weighs about $1.32 \times 10^{25}$ lb. Is the answer to the previous part realistic?
2. For this exercise, use the same values for initial population (P(0) = 50) and specific growth rate (k = 0.1) that were
used in class.
(a) What would the rabbit population be at t = 24 (2 years)?
(b) What would the rabbit population be at t = 480 (40 years)?
(c) Assume again that each rabbit weighs 5 pounds. How much would the total number of rabbits in the previous
part weigh?
(d) Is the answer to the previous part realistic?
Areas, a geometric application.

This is how close the Greeks came to discovering calculus. But since they had very limited algebraic skills, they weren’t
going to do much more than this.
Dudley: Just think, Mugsy. You’d have made a great Greek. Algebra skills nearly zilch.
Mugsy: I’d get mad, except that I think I’d enjoy being a Greek, too. Different reasons, though.
What they did will be very similar to what we will be doing the rest of this course.
Classical application: Archimedes’ method of exhaustion. The basic idea is to fill up (exhaust) the area underneath a
curve with some simple geometric shape. We’ll use rectangles. The limit gives the area.
Mugsy: Method of exhaustion, huh? Let me work on that.
The basic problem is to find the area under the curve y = f (x), for a ≤ x ≤ b. Slice the area up
Dudley: Chunk it through your Veg-O-Matic and not a seed out of place!
and you get a large number of strips that are roughly rectangular. Approximate the areas of the strips by rectangles, and
add these areas up to get an approximation to the area under the curve.
Here are a few examples of what it might look like.
There are several choices for the specific approximating rectangles to use, and various terms for the resulting approximation
to the area. You always use the width of the strip for the width of the rectangle. What changes is the way you determine
the height of the rectangle.
Mugsy: I just choose the height as 1. Boy, does that simplify things!
Albert: Did you wonder why you never got the right answer?
Mugsy: I usually got some partial credit, though.
Here’s a table giving several different possibilities, and the name of the resulting approximation to the area.
How you find the height       Name for approximation to area
Left-hand side of strip       Left sum
Right-hand side of strip      Right sum
Middle height of strip        Middle sum
Just fits under strip         Lower sum
Just fits over strip          Upper sum
All of them work, some better than others.

Mugsy: Some are more equal than others, eh?
Albert: Why, Mugsy. Literary allusions! I didn’t know you could.
(Which approximations are better will be discussed in the homework later, when we are more used to the concepts.) The
main idea is that as the number of strips increases, the differences between these approximations become small enough to ignore, and
all of them get increasingly close to the actual area under the curve.
The way to add up all of the areas of the strips is summation notation. This is a typical use of summation notation, and very
much like what we will be using throughout the course.
Moving to exhaustion. We are looking for the area, but that is not what we have found. We have only approximated
the area, and we aren’t even very sure how well we have done that. Before we can get the area, we need to deal with that
problem.
We get the actual area by slicing finer and finer. The whole problem is the fact that the rectangles don’t really fit well.
But if we take thinner and thinner rectangles, the error becomes a smaller and smaller fraction of the area.
The best way to see this is with pictures. We will take the graph $y = -x^3/4 + x^2 + 1$ for 0 ≤ x ≤ 4 and use left sums.
Here is the picture of the approximation using 20 rectangles.
There are little (almost) triangular regions at the top of each rectangle that represent the error, how far off the rectangle
is from the actual area of the strip under the graph. (The reason that the error regions are almost triangular is that as we
magnify the graph, it becomes more and more like a line. Remember?) We will focus in on two pairs of these, first the ones
on either side of x = 2 and then the ones on either side of x = 3.
Let’s first take a close up look at the top of the two that are next to x = 2. Then what we do is continue looking at that
same region as we increase the number of rectangles to 40, 80, 160, and 320. The shaded regions are the new areas added
into the approximation that were missed with 20 rectangles.
It is clear that we are filling in the area as we increase the number of slices. That is, we are exhausting (using up) the
area. That’s the reason for the term “method of exhaustion.”
In order to get the exact area, then, what do we do? We could use an infinite number of rectangles. That is exactly
what calculus does. However, Archimedes didn’t have calculus, so he used limits. Take a limit as the number of slices goes
to infinity, and you get the area.
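You can watch the exhaustion happen with numbers as well as pictures. The Maple sketch below computes the left sum for the curve $y = -x^3/4 + x^2 + 1$ on 0 ≤ x ≤ 4, using the same numbers of rectangles as above. The names f and leftsum are invented for this illustration, and the strips are assumed to have equal widths.

```maple
f := x -> -x^3/4 + x^2 + 1:      # the curve from the text, on 0 <= x <= 4

# left sum: heights taken at the left-hand side of each of the n strips,
# each strip having width 4/n
leftsum := n -> evalf(add(f(4*j/n) * (4/n), j = 0 .. n - 1)):

seq(leftsum(m), m = [20, 40, 80, 160, 320]);
```

The values should settle down toward a single number as the rectangles get thinner. (The exact area works out to 28/3, about 9.333.)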
There is one concern, though. What happens if the rectangles in the approximation are too tall? If we add to them, then
the approximation will get worse, not better. Never fear!
Dudley: Why do I always get afraid when people say that?
Albert: Probably experience.
The approximations do The Right Thing in that case, too. If the rectangles are too big, increasing the number of slices
shrinks the areas of the rectangles down to the area under the curve. Let’s look at what happens to this curve near x = 3 to
see that.
First, we take a close look at the tops of the two rectangles on either side of x = 3. Again, there are two roughly
triangular regions that represent the error in the approximation for those rectangles. Now watch what happens to that error
as we again increase the number of rectangles to 40, 80, 160, and 320. The shaded region now represents the area that has
been removed from the approximation, thus decreasing the error for those rectangles.
Again, it is clear that the rectangles are closing in on the correct area.
An example from Archimedes. Find the area underneath the curve $y = a x^2$ for 0 ≤ x ≤ b. (Assume a > 0 for convenience.)
Archimedes actually did this problem (or it is attributed to him anyway), except that he would not have used the
term limit.
The area of each strip will be approximately base × height, with base = ∆x. The height will be some y-coordinate
(being distance above the x-axis), so it will be $a x^2$, where we will have to determine what x-coordinate to use.
Let’s use n strips (n will be a variable that we will let go to infinity to achieve the “exhaustion” of the area). Each strip
will have width ∆x = b/n, since the width of the whole area is b, and we will divide the area into n equal widths.
Dudley: Are equal widths necessary?
Albert: No. But they simplify the calculations in most cases. All that you really need is to have the width of the
widest slice go to zero.
The x-coordinates of the ends of the slices will then be 0, b/n, 2 b/n, 3 b/n, . . . , (n − 1) b/n, b.
The heights will be $a x^2$, where we need to find the x-coordinates. I am going to use the right sum approximation. What
will the x-coordinates be, then? It depends on which strip you’re working with. The right end of the jth strip will have
coordinate x = j b/n. (Here, j is a generic, unspecified variable, 1 ≤ j ≤ n, which will turn into the index letter we use in
the summation.) The height of the jth strip is then
\[ a x^2 = a \left( \frac{j b}{n} \right)^2 = \frac{a j^2 b^2}{n^2} \]
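The area then approximates to
\begin{align}
\sum_{j=1}^{n} (\text{height} \times \text{base}) &= \sum_{j=1}^{n} \frac{a j^2 b^2}{n^2} (\Delta x) \tag{4.15} \\
&= \sum_{j=1}^{n} \frac{a j^2 b^2}{n^2} \, \frac{b}{n} \tag{4.16} \\
&= \frac{a b^3}{n^3} \sum_{j=1}^{n} j^2 \tag{4.17}
\end{align}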
We are now faced with evaluating that last summation. If you have encountered a topic called mathematical induction, you
have probably found its value. Since we don’t have time to do all of mathematics in this course, I’ll simply tell you that the
value of that sum is $\frac{1}{6} n (n + 1) (2n + 1)$. (Check me on Maple, if you aren’t sure. The command is sum(j^2,j=1..n);.
This is another example of an indefinite summation.) The approximation to the area is then
\[ \frac{a b^3}{n^3} \cdot \frac{1}{6} n (n + 1) (2n + 1) = a b^3 \, \frac{(n + 1) (2n + 1)}{6 n^2} \]
That is not an obvious answer.
Mugsy: Tell me...
But even worse, it is only an approximation to the correct answer (the area)! In order to get the correct area, we must take
the limit as n → ∞.
Mugsy: Archimedes did this? You sure?
Fortunately, we have already tackled that problem as well, and it turns out to be fairly easy.
Dudley: Right. (I had to beat Mugsy on that one.)
The $a b^3$ factor is just a constant that will come along without change. As n → ∞, the relevant terms are
$(n + 1)(2n + 1)/(6 n^2)$, and finding that limit (as the quotient of two polynomials) is easy. The answer is (after some algebra) 1/3. The
final area is $\frac{1}{3} a b^3$.
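If you don't trust the algebra, Maple can redo the whole chain: the sum, the approximation, and the limit. This is just a sketch of that check; the names S and area_n are invented here, and a, b, n, and j are assumed to be unassigned symbols in your session.

```maple
S := sum(j^2, j = 1 .. n);       # Maple's closed form for 1^2 + 2^2 + ... + n^2
factor(S);                       # should match n*(n+1)*(2*n+1)/6
area_n := a*b^3/n^3 * S:         # the right-sum approximation with n strips
limit(area_n, n = infinity);     # should give a*b^3/3
```

Maple usually reports the sum in an expanded form, which is why the factor step is there.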
That is the answer that Archimedes got, but not the form in which he stated it.
Mugsy: Aha! I knew there was a catch!
If you look at the picture of the area under the parabola, you can draw a rectangle with lower corner at the origin, and
upper corner at the tip of the parabola with coordinates $(b, a b^2)$. (How did I get the y-coordinate?) That rectangle has area
(base) × (height) = $(b)(a b^2) = a b^3$. Archimedes would have given his result as the area under the parabola is 1/3 the area
of the rectangle, just as we got. Here is the picture.
If you are now thinking about how you are going to do this yourself, don’t worry. I won’t make you do this.
Dudley: YAY!
The formulas for the summations get horrendous. But you did need to see it once for historical reasons.
Dudley: For the same reason we take a bunch of other courses around here....
But more than that, the method is classical and the basis of the way that most calculus courses treat areas. We will be
working much more complicated problems, but not this way!
Re-do Archimedes’ example from the point of view of differentials. We want to do the entire problem all over, from
scratch, but with the point of view that we will be using for the rest of the course. We’ll slice the region up also, but into
dx-width “rectangles.” Ours will be so thin that there will be no approximation involved—we will get the exact correct area
right off! The penalty that we pay for getting the right area is that we must add up the areas of the “rectangles” with an
integral rather than a summation. I said that the calculus way to solve this problem is with an infinite number of rectangles,
and this is it.
Dudley: Are there really an infinite number of those?
Albert: Yes, and that is why regular summations won’t work to add them up. But you are better off not asking how
infinite the number of rectangles there are. Summations can handle one variety of infinite numbers of terms, and
integrals handle another. Infinite sums will show up next semester.
Dudley: Aww, that’s really mean. You mean to tell me there are different sizes of infinity?
Albert: Yes, there are. I did say that you were better off not asking.
The area of one of the “rectangles” is still (base) × (height), except that the base now has length dx and the height is
just $a x^2$.
Dudley: Why don’t we use one of those approximation things?
Albert: Remember that differentials are so small, we can treat other variables as constants? That’s what is happening
here. The dx is so small that all the different ways of finding the heights give the same results, namely $a x^2$.
The area of the differential strip is then $a x^2\,dx$. Adding these up gives $\int a x^2\,dx$. The thing that we are missing is the
equivalent of the limits of summation, called the limits of integration. They tell the values of x for which the “addition” of
the $a x^2\,dx$ will occur. In this case, we go from x = 0 to x = b. The integral is then $\int_0^b a x^2\,dx$, and this is the area exactly.
The last thing is to evaluate that integral. So, we need to find a function F(x) whose differential is $a x^2\,dx$. In this case,
we can find one fairly easily.
\[ F(x) = a x^3 / 3 \]
If we find ∆F for x = 0 to x = b, we get the answer:
\begin{align}
\int_0^b a x^2\,dx &= F(b) - F(0) \tag{4.18} \\
&= \frac{a (b)^3}{3} - \frac{a (0)^3}{3} \tag{4.19} \\
&= \frac{a b^3}{3} \tag{4.20}
\end{align}
It is worth commenting that this is the same answer that we got the other way, and this way was considerably less painful.
This is the way that we will work such problems from now on.
There is a standard set of notations and terminologies for definite integrals, and we might as well get them now. The
integral $\int_0^b a x^2\,dx$ is read “The integral of $a x^2$ from x = 0 to b.” Note that the limits are read from the bottom to the top.
Reading from the top down is a sure sign of a person who is just learning calculus (just like saying that you are going to
derivate a function rather than differentiate it).
Definite integrals on Maple.

Of course, Maple will work definite integrals; you just have to tell it correctly.
Mugsy: Maple definitely hates me, and the feeling is becoming mutual. How do I get it to do what I want, Al?
Albert: Learn how to talk its language. Forget the idea of getting it to talk yours. And threatening it won’t help at
all.
Mugsy: That explains a little.
Dudley: Like all those punched-out terminal screens in the computer lab?
The procedure that integrates is called int();, and you have to give it the function, the variable, and the limits. For
example, to have Maple work $\int_0^b a x^2\,dx$, you’d type
> int( a*x^2, x = 0 .. b );
\[ \frac{a b^3}{3} \]
Note how Maple uses the x=0..b to determine the variable (it is x, not a), as well as the upper and lower limits for the
integral. Since x was indicated as the variable, Maple correctly decided that a was a constant.
Mugsy: If only it would behave so nicely for me.
Homework #33
Problem.
1. In this problem, we do another example of finding areas. This time, we can verify the answer ourselves, with a bit of
algebra and geometry. We want the area underneath the curve y = a x, for b ≤ x ≤ c. We assume a > 0 and 0 ≤ b < c,
and we’ll use the differential approach for the calculus.
(a) The differential strips are dx wide. How tall are they (as a function of x)?
(b) What is the area of the differential strip, again as a function of x?
(c) What is the range of x’s to use for adding the strips together? (This question asks for the largest and smallest
values of x the problem contains, and tells us the limits to use on the integral.)
(d) Set up the integral for the area, including the limits and the function to integrate, all in terms of x.
(e) Find a function F(x) whose differential is the differential in the integral. (You’ll need a quadratic function if
you’ve done everything correctly up to this point.)
(f) Evaluate the integral by finding ∆F for the range of x’s in the integral. (This is the area of the region, as
determined by calculus.)
(g) The region we are looking at is a trapezoid. (For those of you that don’t remember this from high school
geometry, a trapezoid has four sides, two of which are parallel. In this case, the two parallel sides are vertical.)
The area of a trapezoid is equal to (the average length of the parallel sides) × (the distance between the parallel
sides). For the region we’ve been working with, what is the distance between the parallel sides?
(h) What are the lengths of the two parallel sides? (How do you find the y-coordinate of a point on y = a x?) What
is the average of those two numbers?
(i) Get the area of the trapezoid from the high school geometry formula (by multiplying together the answers to
the previous two parts) and show that it equals the area obtained by calculus.
4.1.2 Antidifferentials = indefinite integrals.

In order to evaluate definite integrals easily, we need a way to find F(x) from $F'(x)\,dx$. That process is called indefinite
integration. We study it next.
Indefinite integrals are much more difficult to work than derivatives. You can differentiate (eventually) just about any
function you can write down. It is essentially a mechanical process that Maple can do, for example. On the other hand, it
is quite simple to write down functions which have no indefinite integral that we can find. It isn’t just more complicated, it
is impossible.
Mugsy: Just how encouraging can this guy get?
Albert: It’s even worse than he said. Does that make you feel better?
The whole topic of integration is more complicated than we could possibly cover here. We will do the basics, as much
as a standard first-year calculus course, and I’ll show you how to use Maple, which will give you a boost into the more
complicated (called non-elementary) functions.
General ideas.
We now begin our examination of a systematic procedure for finding indefinite integrals, which are used to calculate definite
integrals exactly. Soon, we will also cover a procedure for finding definite integrals approximately, for those situations in which
indefinite integrals can’t be found.
We want a function that gives a certain differential. To evaluate $\int_a^b f(x)\,dx$, we need to find a function F(x) so that
$F'(x) = f(x)$. Then $\int_a^b f(x)\,dx = F(b) - F(a)$. The reason this works is that if $F'(x) = f(x)$, then $dF = F'(x)\,dx = f(x)\,dx$,
and adding up all the differential changes dF in F(x) gives exactly ∆F = F(b) − F(a). There really is more going on here
than shows on the surface.
Mugsy: Surprise.
Need a constant of integration. The first problem that occurs is that there is no single function F(x) that satisfies
$F'(x) = f(x)$. There are lots of them! If f(x) = 2 x, then what is F(x)? It could be $x^2$, or $x^2 + 1$, or $x^2 - 17$, or $x^2 + 3\pi\sqrt{217}$. In
fact, it could be $x^2 + C$, where C is any constant. The differential of $x^2 + C$ is still 2 x dx, no matter what value the constant
C has. This always happens. The equation $F'(x) = f(x)$ will have solutions that look like F(x) + C. The C is called the
constant of integration.
This creates some concern. If the value of the definite integral depends on the constant C, then you’d better use the
same constant that I use (since I am always right :-).
Dudley: Hey, Albert! You’ve got competition!
But fortunately, the value chosen for C doesn’t ever affect the definite integral.
This point is best illustrated by an example. Early in definite integrals, I evaluated $\int_1^7 x^2\,dx$ to be 114. (Check that out. It
is on page 202.) There, I used the function $F(x) = \frac{1}{3} x^3$. Watch what happens if instead I use the function $F(x) = \frac{1}{3} x^3 + 17$.
Specifically, watch what happens to the 17.
\begin{align}
\int_1^7 x^2\,dx &= F(7) - F(1) \tag{4.21} \\
&= \left( \frac{1}{3}(7)^3 + 17 \right) - \left( \frac{1}{3}(1)^3 + 17 \right) \tag{4.22} \\
&= \frac{1}{3} \left( (7)^3 - (1)^3 \right) \tag{4.23} \\
&= 114 \tag{4.24}
\end{align}
If you stare at that for a while, it should become clear that the 17 in both F(7) and F(1) simply self-destructed because of
the subtraction. And with a bit more thinking, you should be able to convince yourself that no matter what constant went
in there instead of 17, it, too, would cancel out of the final answer.
What this means, then, is that you can use any constant of integration that you want for evaluating definite integrals, and
the value of the definite integral will not change. What constant of integration did I use with $F(x) = \frac{1}{3} x^3$? The simplest one
possible; I used C = 0, since then $F(x) + C = \frac{1}{3} x^3$. This is the usual case. You (almost) never put the constant of integration
in when evaluating definite integrals, which is the same as using the value C = 0.
Dudley: Al, what value do you use?
Albert: Zero, of course. It’s the easiest, unless I happen to see another value that’s even easier. That’s really rare,
though.
This is sufficiently important to be highlighted:
When using indefinite integrals to evaluate definite integrals, you will appear to omit the constant of
integration. But what you are actually doing is using the fact that the value of the definite integral doesn’t
depend on the value of the constant, so you are setting it to a convenient value, namely zero.
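To see the cancellation for every constant at once, and not just for 17, you can leave the constant as an unspecified symbol and let Maple do the subtraction. This is a small sketch; the name G is invented here, and C is assumed to be an unassigned symbol.

```maple
G := x -> x^3/3 + C:             # an indefinite integral of x^2, with an unspecified constant C
G(7) - G(1);                     # the C's cancel, leaving 114
```

Since the result contains no C at all, no choice of constant can change the value of the definite integral.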
Re-examine population growth from this point of view.

The constant of integration is more than an ignorable nuisance. It actually needs to be there. Let me convince you.
Mugsy: I call that a challenge.
Where was the constant of integration in the population growth example? It was there, but it was hidden. Finding it will
help to clarify what the constant of integration is used for.
Instead of finding the population at t = 10, what would happen if we tried to find it for all t > 0? We would end up with
the same equations, except that we wouldn’t have t = 10 to plug in to P(t); we’d have to leave it as t. Let’s do the whole
problem without any numeric values of anything: k or P(0).
Mugsy: Hah! See, he’s allergic to numbers. Just what I thought.
Albert: No. He uses numbers when possible. In this case, doing it all with algebra forces us to look at items that we
need to deal with, but otherwise wouldn’t have to.
Mugsy: You look at it your way; I look at it mine.
The equation dP/dt = k P gives ∆(ln P) = ∆(k t), just as before. This says that $\ln(P(t)) - \ln(P(0)) = (k t) - (k \cdot 0) = k t$ or,
writing $P(0) = P_0$, the initial population, we get
\begin{align}
\ln(P(t)) &= \ln(P_0) + k t \tag{4.25} \\
P(t) &= \exp(\ln(P_0) + k t) \tag{4.26} \\
&= \exp(\ln(P_0)) \exp(k t) \tag{4.27} \\
&= P_0 e^{k t} \tag{4.28}
\end{align}
This is exactly the equation that we used when we were introducing exponentials. But again, where is the constant of
integration? It certainly doesn’t appear (obviously, anyway).
Mugsy: You’re not convincing me.
Let’s look back at dP/dt = k P, which gave dP/P = k dt. The dP/P is the differential of ln(P), while the k dt is the
differential of k t. Then the indefinite integrals of the two differentials should also be equal, namely ln(P(t)) +C and k t +C
should be equal. Does that mean that ln(P(t)) = k t? No, and this is an important point that will occur later also. The
constants of integration that occur in the two different integrals are unrelated. It really should be that
ln(P(t)) +C1 = k t +C2
Then subtracting C1 from both sides gives

ln(P(t)) = k t +C3
where C3 = C2 −C1 , which is just another constant. This always happens, too, and deserves another box.
When integrating both sides of an equation with differentials, you only need to put a constant of integration on
one of the sides.
If we compare this to the solution that we got before, we note something interesting. The C3 is exactly the same as
ln(P0 ). What does this say? The constant of integration needs to be there in order to accommodate different possible
starting populations. That is, the equation ln(P(t)) = k t +C describes a large number of situations. To apply it to a specific
situation, you have to determine the value of C. Without that constant of integration, you wouldn’t be able to have an initial
condition.
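Maple shows the same behavior. Its dsolve procedure solves differential equations, and the sketch below runs it on dP/dt = k P twice: once with no starting population, and once with one. The name ode is invented for this illustration, and P, t, k, and P0 are assumed to be unassigned symbols.

```maple
ode := diff(P(t), t) = k*P(t):       # the equation dP/dt = k P
dsolve(ode, P(t));                   # general solution; Maple names its arbitrary constant _C1
dsolve({ode, P(0) = P0}, P(t));      # with an initial condition, expect P(t) = P0*exp(k*t)
```

Without the initial condition, Maple has to leave an arbitrary constant in the answer. Its constant multiplies the exponential, so it plays the role of exp(C) rather than C itself, but it does the same job: it is the slot where the starting population goes.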
Albert: And you can bet that in any problem like this on a test, if you leave off the constant, you will get the wrong
answer, and lose credit.
Mugsy: Well isn’t that a sweet thing to do.
Albert: You only need to get burnt once before you remember it for a long time to come.
We now have two different ways of approaching problems that require integrals. One is adding up differentials to
get changes, using definite integrals. The other is to find functions with a specified differential, using indefinite integrals.
Either approach can be used to solve problems. The constant of integration appears only in indefinite integrals (without
limits of integration), and must be determined at the end. Definite integrals can be used to find changes, and you have to
add in the starting value to get the final value. Adding in that starting value correlates precisely to evaluating the constant
of integration. Both methods give exactly the same answers. The sequence of steps is different, but the two approaches are
completely equivalent.
Constant of integration revisited.

We still aren’t done investigating the constant of integration. Many physical situations are approximated (the technical term
is modeled) by an equation involving differentials, conveniently called a differential equation. Solving these equations is
often impossible (we can’t solve all indefinite integrals, and these can get much worse), so a computer is enlisted to help
solve them approximately.
The procedures for solving $y' = f(x)$ are much more complicated than I will explain here, but I am only trying to
illustrate. How would a computer draw a picture of the solution to $y' = f(x)$? You’d tell it where to start and it would find
$y' = dy/dx = f(x)$, the slope of the solution curve at the starting point. It would then move a tiny bit in that direction (a
wiggle!), and land at a new point. It would then recalculate $y'$, and move a tiny bit along that new slope, and land at another
new spot. It would recalculate $y'$ again, and move, and recalculate, and move, and so on. By this method, you can string
together the graph of the solution to $y' = f(x)$.
There is one difficulty. We had to give the computer a piece of information at the very beginning: where to start.
Without that, how could it figure out where to go next, or even what slope to move along? For this reason, the solution to
$y' = f(x)$ requires what is called an initial condition, that is, the values of x and y to start from, what I usually call $(x_0, y_0)$.
(As always, subscripts show that these are constant values of some variables.)
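The move-and-recalculate procedure just described can be sketched in a few lines of Maple. This is only an illustration of the idea, not the method a serious solver would use. The slope function, the starting point, the step size h, and the number of steps are all assumptions made for the sake of the example, and the names f, xk, yk, h, and count are invented here.

```maple
f := x -> 3*x^2:                 # slope function dy/dx = f(x), chosen for illustration
xk := -1.0:  yk := 3.0:          # the initial condition: start at the point (-1, 3)
h := 0.01:                       # size of each tiny move in x (the "wiggle")

for count from 1 to 200 do
    yk := yk + f(xk)*h:          # move a tiny bit along the current slope
    xk := xk + h:                # land at a new point, where the slope is recalculated
end do:

xk, yk;                          # where the computer ends up after 200 tiny moves
```

Change the starting value of yk and rerun it: the whole path shifts up or down. That is the initial condition at work.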
Let’s try a specific example. Find the solution of $dy/dx = 3 x^2$ that starts at the point (−1, 3). If we just took the solution
to be $y = x^3$, without the constant of integration, then we’d have a serious problem. The curve $y = x^3$ doesn’t contain the
point (−1, 3).
Dudley: Just like might happen on a test, right?

Albert: You got it.
On the other hand, if we solve this correctly, and put in the constant of integration, the solution to $dy/dx = 3 x^2$ is $y = x^3 + C$.
Now we have the flexibility to adjust C to make the solution contain the point (−1, 3). How do we do that? We plug x = −1
and y = 3 into the equation $y = x^3 + C$ and solve for the value of C that we need: $3 = (-1)^3 + C$, so C = 4. The solution to
the problem is then $y = x^3 + 4$. You can check that $dy/dx = 3 x^2$ and the point (−1, 3) is a point of the curve. Note that you
first solve the derivative part of the equation, and then plug the initial condition in. You have no constant of integration to
evaluate until the derivative is gone.
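The "plug in and solve for C" step is mechanical enough to write down as a tiny Python sketch. The function names here are hypothetical, invented only for this illustration. The point is the order of operations: the antiderivative comes first, and only then is there a C to evaluate.

```python
def antiderivative(x):
    # One antiderivative of 3*x**2, taken with C = 0.
    return x**3


def constant_from_initial_condition(F, x0, y0):
    # The solution is y = F(x) + C, so y0 = F(x0) + C and C = y0 - F(x0).
    return y0 - F(x0)


C = constant_from_initial_condition(antiderivative, -1, 3)


def solution(x):
    return antiderivative(x) + C


print(C)             # 4
print(solution(-1))  # 3
```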
A problem of the sort illustrated here, where you solve a differential equation and evaluate the constant of integration
using an initial condition, is called an initial value problem, often abbreviated as I.V.P. Initial value problems crop up all the time in
physics.
There is a typical shorthand that is used often for writing initial conditions. If you know that y is a function of x (as
indicated by writing dy/dx), then some people (including me) will write the initial condition $y = y_0$ when $x = x_0$ this way:
$$y(x_0) = y_0$$
For example, the previous initial value problem would be written as
$$\frac{dy}{dx} = 3x^2 \quad \text{with } y(-1) = 3$$
Looking ahead, a bit

It turns out to be useful to pull one formula that we will get later back to here.
Mugsy: Oh great. You mean I gotta know this stuff before I can learn it?
Albert: It helps, but isn’t necessary.
Mugsy: Thanks, I think.
It is called the power rule, and is probably the easiest one to handle, and certainly the easiest one to check. It goes like this:
$$\int c\,u^n\,du = \frac{c\,u^{n+1}}{n+1} + C$$
where $n \neq -1$. This enables you to integrate any polynomial, and other things as well.
We have already been using this, but not formally. We had to figure out what the indefinite integral of $3x^2$ was, just a
bit ago. We did it more or less by guessing. This rule allows us to get the answer without having to think it through each
time.
Mugsy: You mean I don’t have to think about this any more?! That’s just great!
One thing you might wonder about is why do we not allow n = −1. The easy way to convince yourself that it doesn’t
work is to try it out. You get a division by 0, which is always a sign of trouble.
Mugsy: I can tell right now that there will be times that n = −1. What do I do then?
Albert: You will find out shortly.
Dudley: My suggestion is to panic.
Albert: I can guarantee that panic is not going to help.
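Here is a minimal Python sketch of the power rule applied one term at a time, which is how it handles polynomials. The names power_rule and integrate_polynomial, and the representation of a term as a (c, n) pair, are my own assumptions for illustration. The constant of integration is left off the return value, so you still have to supply the +C yourself.

```python
from fractions import Fraction


def power_rule(c, n):
    """Return (coefficient, exponent) for the integral of c*u**n, without the +C."""
    if n == -1:
        # Dividing by n + 1 would be a division by 0.
        raise ValueError("the power rule does not apply when n = -1")
    return Fraction(c) / (n + 1), n + 1


def integrate_polynomial(terms):
    """terms is a list of (c, n) pairs meaning c*u**n; integrate term by term."""
    return [power_rule(c, n) for c, n in terms]


print(power_rule(3, 2))                          # (Fraction(1, 1), 3), that is, u**3
print(integrate_polynomial([(3, 2), (4, 1)]))    # u**3 + 2*u**2
```

Asking for power_rule(1, -1) raises an error instead of returning nonsense, which is the code's version of "a sign of trouble."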
Using Maple to find indefinite integrals and solve initial value problems.
Maple works indefinite integrals, too. It’s even easier to write than definite integrals. You simply leave off the limits, but
don’t forget to tell it the variable. That is, to ask Maple for $\int a x^2\,dx$, simply type
> int( a*x^2, x);
$$\frac{a\,x^3}{3}$$
Note that Maple doesn’t put in the constant of integration! That does not relieve you of the obligation to put it in. Maple
simply assumes that you know it should go there. On the other hand, I don’t assume that you know it should go there. You
will have to convince me each time.

Mugsy: A real sweetie, let me tell you.
Sometimes when you ask Maple for an indefinite integral, it will give back something obscure. For example, if you
type in int(sin(x)/x,x); Maple can’t figure it out in terms of the “usual” functions. (It isn’t possible, Mugsy.) But there
is a “higher-transcendental” function that solves it. So Maple types back
Si(x)
Si(x) is called the Sine integral function, and is defined by that integral, with Si(0) = 0. If you encounter such things, it
basically means that Maple couldn’t work out the integral in any fashion that used just the elementary functions (polynomials,
logarithms, exponentials, trigonometric functions, and inverse trigonometric functions).
Mugsy: Those are elementary? You gotta be kidding!
Albert: They are. You should see the nasty messes that crop up when dealing with higher transcendental functions.
Maple also solves initial value problems. To do these requires somewhat more work.
Albert: After all, they are more complicated.
The way that you would ask Maple to solve $dy/dx = 3x^2$, with $y(-1) = 3$, is this:
> dsolve({diff(y(x),x)=3*x^2, y(-1)=3}, y(x));
$$y(x) = x^3 + 4$$
We encountered the solve(); family back at the beginning of the semester, when we did the Maple introduction during
the first lab period. The format for all that family is the same:
*solve( what to use in solving, what to solve for );
(The asterisk (*) represents some letter, such as f or d, or nothing at all, in the case of solve();.)
The dsolve(); is the command to solve a differential equation. The bracketed terms are used to group together all the
information that will go into the solution. The
diff(y(x),x)=3*x^2
is the $dy/dx = 3x^2$. Note that you have to indicate that y is a function of x in the diff(y(x),x). Otherwise, Maple will
get very confused.
Mugsy: That seems to happen an awful lot.
Albert: Well, Maple is not too smart.
Dudley: Only Albert would say something like that....
The y(-1)=3 is the initial condition to go with the equation. The comma after the } separates the information to go into
the solution from the item to solve for, in this case y(x). And note that you have to indicate that y is a function of x there,
too.
Maple can solve much more complicated differential equations, too. The command dsolve(); will be of exceedingly
great use when you get to the course called Differential Equations. (Assuming, of course, you take it.)
Homework #34
Exercises.
1. Find the solutions of these initial value problems. You will need to find the indefinite integrals by a process of trial
and error. We will remedy that situation soon. But all of these are very easy.
(a) $y' = 4x^3$, $y(1) = -4$
(b) $dw/dr = 3r^2$, $w(2) = 5$
(c) $dx/dt = \cos t$, $x(0) = 1$
(d) $y' = 0$, $y(2) = -8$
2. Find the solutions of these initial value problems. You will again need to find the indefinite integrals by a process of
trial and error.
(a) $y' = 7x^8$, $y(1) = -6$
(b) $dw/dr = -r^{-2}$, $w(2) = 2$
(c) $dx/dt = \sin t$, $x(0) = 4$
(d) $y' = 0$, $y(8) = -5$
3. Find the following indefinite integrals using the power rule.

(a) $\int 5x^2\,dx$
(b) $\int 12u^3\,du$
(c) $\int w^{-2}\,dw$
(d) $\int \sqrt{t}\,dt$ (Hint: Convert $\sqrt{t}$ to a power of t.)
4. Find the following indefinite integrals using the power rule.

(a) $\int 20x^4\,dx$
(b) $\int 7u^2\,du$
(c) $\int 8w^{-3}\,dw$
(d) $\int \sqrt[3]{t}\,dt$ (Hint: Convert $\sqrt[3]{t}$ to a power of t.)
Problem.
1. We have already solved a problem a little more complicated than $y' = f(x)$, namely, $dP/dt = kP$. For that, we needed
to pull the P to the side with the dP. (See the discussion on page 212.) Use this same idea to solve the initial value
problem
$$\frac{dx}{dt} = x^3 t^3, \quad \text{with } x(-1) = 1$$
Vertical free-fall motion “solved.”

Newton developed calculus to work physics (essentially). Later, we will delve into a considerable amount of physics (in
the second semester). What we will do here is considerably easier.
Mugsy: I hate physics almost as much as I hate Maple.
Albert: You’d better get used to both. Especially since you are going to be in second semester.
Mugsy: You mean I could ignore all this—including Maple—if I am planning on taking only this one semester? Please?
Dudley: OOOO! Mugsy said ‘Please’ ! He must be desperate!
Mugsy: I am. Al, answer me.
Albert: Sorry, Mugsy. Although there is much more physics next semester in calculus, there is a bit in this semester.
Not only that, you’ll end up taking physics next year, too. Get used to it now.
Mugsy: ALL RIGHT EVERYBODY, I’M GOING TO BE IN A GRUMPY MOOD FOR THE REST OF THE DAY. SO WATCH IT!
We want to solve the equation that describes frictionless vertical free-fall, the motion of a rock dropped off of a cliff,
for example.
Mugsy: How about a body dropped off a cliff?
Dudley: You wouldn’t have anyone specific in mind, would you?
Mugsy: Are you volunteering?
Dudley: Bye.
We ignore friction for the moment. We consider that in the homework since we need Maple to solve the equation.
First get velocity, with its initial condition. The approach is direct. We assume constant acceleration. That gives us an
equation for velocity that we can solve.
The equation is dv/dt = −g, where g is the acceleration of gravity, a constant. There is an assumption here that
positive is upward. Since the force of gravity is pulling downward, the negative sign appears. This is a convention, not a
requirement. You could have good reason for declaring positive to be downward, and that would be fine. You must use
your choice consistently throughout the problem, though.
Note that the differential equation requires an initial condition, a value of v when t = 0. We’ll say that $v(0) = v_0$, just to
give it some notation. The equation $dv/dt = -g$ with $v(0) = v_0$ is easy to solve:
\begin{align}
\frac{dv}{dt} &= -g \tag{4.29}\\
dv &= -g\,dt \tag{4.30}\\
\int dv &= \int -g\,dt \tag{4.31}\\
v &= -gt + C \tag{4.32}\\
(v_0) &= -g \times (0) + C \tag{4.33}\\
v_0 &= C \tag{4.34}\\
v &= -gt + v_0 \tag{4.35}
\end{align}
That wasn’t too bad.
Mugsy: It is if you don’t want to do it at all.
Dudley: You are in a grumpy mood.
Next get position. Velocity is a handy thing to have, but we don’t want velocity.
Mugsy: That depends.
We want position.
Mugsy: Position! Status! Fame! Wow! I’m in a better mood now.
We can get position from velocity by another differential equation, dy/dt = v. (Here, y is position, essentially height. I
don’t use x or s for position, for reasons that will appear in a moment.) We can get the velocity from the previous step, but
we still need an initial condition. Again, we simply make up some notation and say that initial position (height) is y = y0 .
Mugsy: I don’t think that word means what he thinks that word means.
That means that we must solve $dy/dt = -gt + v_0$ with $y(0) = y_0$. This is only slightly more difficult than velocity was.
\begin{align}
\frac{dy}{dt} &= -gt + v_0 \tag{4.36}\\
dy &= (-gt + v_0)\,dt \tag{4.37}\\
\int dy &= \int (-gt + v_0)\,dt \tag{4.38}\\
y &= -\frac{1}{2}gt^2 + v_0 t + C \tag{4.39}\\
(y_0) &= -\frac{1}{2}g \times (0)^2 + v_0 \times (0) + C \tag{4.40}\\
y_0 &= C \tag{4.41}\\
y &= -\frac{1}{2}gt^2 + v_0 t + y_0 \tag{4.42}
\end{align}
Those of you who have had physics have probably seen these equations before, but probably not derived this way.
Note that you can check the answers if you aren’t sure.
Albert: That’s a hint. It is really useful at times.
When we want to solve $dv/dt = -g$, with $v(0) = v_0$, and say that the answer is $v = -gt + v_0$, all we have to do is see if the
function given satisfies both parts of the original equation. Since it does ($\frac{d}{dt}(-gt + v_0) = -g$ and $v(0) = -g \times (0) + v_0 = v_0$),
it is correct. The same holds for $dy/dt = v$, $y(0) = y_0$.
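The same check can be done numerically. What follows is a minimal Python sketch, not part of the derivation. The values g = 32, v0 = 20, and y0 = 6 are made up for illustration, and numerical_derivative is a hypothetical helper that estimates a derivative from two nearby points.

```python
def velocity(t, g, v0):
    return -g * t + v0


def height(t, g, v0, y0):
    return -0.5 * g * t**2 + v0 * t + y0


def numerical_derivative(func, t, h=1e-6):
    # Central difference: slope between two points straddling t.
    return (func(t + h) - func(t - h)) / (2 * h)


g, v0, y0 = 32.0, 20.0, 6.0   # assumed sample values

# Part 1: the initial conditions.
print(velocity(0.0, g, v0), height(0.0, g, v0, y0))   # 20.0 6.0

# Part 2: the differential equations, tested at one sample time.
t = 1.5
dv_dt = numerical_derivative(lambda s: velocity(s, g, v0), t)
dy_dt = numerical_derivative(lambda s: height(s, g, v0, y0), t)
print(dv_dt, -g)                    # these two should agree closely
print(dy_dt, velocity(t, g, v0))    # and so should these
```

Testing one time value is evidence, not proof. The derivative check done by hand covers every t at once.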
Also, note the strategy for solving initial value problems. You first integrate the differential equation. That gives a
constant of integration. Only then do you evaluate the constant using the initial condition. Before that, you don’t have
anything you can plug in, and you don’t have a constant yet, either. Finally, you plug that value of the constant back into
the solution of the differential equation, and you are done. Initial value problems are always solved in this order.
Note that a second-order differential equation requires two initial conditions. Ultimately, for free-fall motion, we
were working a second-order differential equation, namely, $d^2y/dt^2 = -g$. (The order of a differential equation is the
maximum order of any derivative in the equation.) In the process, we needed two initial conditions, one for y and one for
v = dy/dt. We needed two integrations to “undo” the two derivatives, so we got two constants of integration. We needed
two initial conditions to evaluate the two constants. This is generally true. You will need as many initial conditions as the
order of the differential equation.
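To see why two starting values are needed, here is a rough Python sketch that steps $d^2y/dt^2 = -g$ forward in time as the pair $dy/dt = v$, $dv/dt = -g$. The function name fall, the step count, and the sample values are assumptions for illustration. The stepping scheme is the simple move-and-recalculate idea, not what Maple actually does.

```python
def fall(g, v0, y0, t_end, steps):
    """Step dy/dt = v and dv/dt = -g forward from t = 0 to t = t_end."""
    dt = t_end / steps
    v, y = v0, y0          # both initial conditions are needed just to begin
    for _ in range(steps):
        y = y + v * dt     # move a tiny bit using the current velocity
        v = v - g * dt     # then update the velocity
    return v, y


g, v0, y0 = 32.0, 20.0, 6.0   # assumed sample values
v_num, y_num = fall(g, v0, y0, 1.0, 100000)

# Closed-form answers at t = 1 from v = -g t + v0 and y = -(1/2) g t^2 + v0 t + y0.
v_exact = -g * 1.0 + v0
y_exact = -0.5 * g * 1.0**2 + v0 * 1.0 + y0
print(v_num, v_exact)
print(y_num, y_exact)
```

Leave out either v0 or y0 and the loop has nothing to start from, which is the same reason each integration left a constant to be evaluated.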
Homework #35
Exercises.
1. In this exercise, we look at the units of the equations that we got. For familiarity, I will use the English (foot-pound-
second) system rather than the metric system.
(a) If the unit of length (that is, of y) is the foot (abbreviated ft) and the unit of time is the second (abbreviated sec,
or sometimes just s), what are the units of velocity? Use the definition that v = dy/dt and remember that dy/dt
will be essentially ∆y/∆t, so the units on dy/dt will be the same as the units on ∆y divided by the units on
∆t. (Often, when you are dividing by a unit, you indicate that by the word “per.” To get gasoline mileage, you
divide the distance you drive (in miles) by the amount of gasoline used (in gallons), and the result is
a certain number of “miles per gallon.”) This gives the units for v.
(b) What are the units of acceleration? Use a = dv/dt. This gives the units for g, since it is the acceleration due to
gravity.
(c) Show that each of the terms of $v = -gt + v_0$ has the same units. (That is, the units for $v$, $-gt$, and $v_0$ are
identical. This must always happen in valid equations. If the units in an equation are not the same, you can be
quite sure that something is wrong with the equation. So, the equation $v = -gt^2$ could not possibly be correct.
This is a handy way of checking formulas to make sure you remember them accurately.) (Note: Constant factors
do not affect the units, so the negative sign (a constant factor of $-1$) and the $\frac{1}{2}$ in the next part can be ignored.) The
units of $v_0$ are the same as the units on $v$, since $v_0$ is a value of $v$ at a specific time. Also, when you multiply
two terms with units, the units multiply as well.
(d) Show that the units in each term of $y = -\frac{1}{2}gt^2 + v_0 t + y_0$ are the same.
Problems.
1. Dudley wanted to play catch by himself (Fang was off plotting the demise of every squirrel in the universe, his
usual preoccupation), so he tossed a ball (more-or-less) straight up, and caught it coming down. Suppose he threw
the ball upwards with a velocity of $v_0 = 50$ ft/sec. Use t = 0 at the time he threw and $-g$ = acceleration due to gravity
= $-32$ ft/sec$^2$. Use $y_0 = 4$ ft (the rough height of Dudley’s hand) and assume that he caught the ball also at y = 4 ft.
(a) What is the equation that determined how high the ball was? (That is, what is the equation for y under these
conditions?) (You can use the equations from the notes, and the values given in the problem to answer this quite
easily.)
(b) When (that is, for what value of t) did Dudley catch the ball? (Hint: Solve y(t) = 4 for t. You’ll get two values.
Which one do you want?)
(c) Find the value of t when dy/dt = 0 by solving that equation. This gives the time when vertical velocity is 0.
Can you give a simple description of the place in the path of the ball where the vertical velocity is 0?
(d) What is the value of $d^2y/dt^2$ at the value of t from the previous part? This gives the (vertical) acceleration.
Explain how acceleration can be non-zero when the velocity is 0. After all, acceleration is the derivative of
velocity and the derivative of 0 is 0.
2. We indicated in the notes that we ignored air resistance in the equations we got. There was a reason for that. In this
problem, we treat vertical free-fall motion with air resistance. This will require going to Maple to solve the equations
that will appear. One standard assumption is to make air resistance proportional to velocity, giving an extra term of
$-kv$ in the acceleration. (Here k is a constant determined by the shape of the object and the resistance of the air. That is, the
larger the value of k, the more the effect of air resistance enters the equation. For k = 0, there is no air resistance at
all.) The equation is then $dv/dt = -g - kv$. The initial condition is $v(0) = v_0$. (We are only solving for v, so the initial position $y(0)$
won’t be needed.)
(a) Use Maple to find v(t). This can most easily be done using the Maple command dsolve();. The format is
dsolve( {diff(v(t),t)=-g-k*v(t), v(0)=v0}, v(t) );
Make sure you keep straight the parentheses () and the curly brackets {}.
(b) What happens as t → ∞ in this equation? (Physically, that means that you are falling for a very long time.) Find
out using Maple by typing the following commands (right after the dsolve();):
assign(%);
signum(k):=1;
limit(v(t),t=infinity);
(I’ll explain the reasons for these in lab period.) This value of v is called the terminal velocity. It is the velocity
that you will approach as you fall for long times.
Mugsy: I’ve always wondered about that. Not personally, of course. I’ve just observed.
(c) What happens to terminal velocity when k gets large? (This corresponds to taking a big parachute, to increase
wind resistance.) Why do you want a big parachute when you jump out of an airplane?
(d) What happens to v as t → ∞ without air resistance? (Use the equation for v from the notes for regular free-fall
motion here. Take the limit as t → ∞. What happens to v?)
Ballistic motion solved from vertical free-fall solution.

Ballistic missiles are ones that spend most of their time in an unpowered trajectory. ICBMs (InterContinental Ballistic
Missiles), for example, only burn their engines for a maximum of 90 seconds. After that, they essentially glide. We want
to look at a more domestic example—hitting a golf ball. The same analysis applies to throwing a baseball, water balloon,
or a variety of other items. Again, we will ignore air resistance. This turns out to have rather serious consequences, in that
some results we will get will not be true to experience, but the equations are too complicated to solve (even with Maple!)
if we try to include the refinements necessary to make it true-to-life. So, the analysis doesn’t work well with items that are
so light as to be strongly affected by air resistance, such as spitballs. Sorry.
Mugsy: Oh, wow. Call Albert.
Mugsy: And I thought this class was supposed to be practical.
There is one problem. We will need to solve two second-order equations to get both x and y, and so we will need
2 + 2 = 4 initial conditions. These will be the initial positions of x and y, and the initial x- and y-velocities. The initial
x-position will be written $x_0$, in parallel to the initial y-position of $y_0$. But the initial x-velocity can’t use $v_0$, since we used
that for the initial y-velocity. So, we compromise. We will call the initial x-velocity $v_{0x}$, and the initial y-velocity $v_{0y}$. This
means that we will have to modify the answer from the previous part, but that is less confusing than other options.
We have already solved more than half of the problem. The vertical portion of the motion is given by the vertical free-fall
equations we just did. That is, $y(t) = -\frac{1}{2}gt^2 + v_{0y}t + y_0$ is still true. The horizontal part comes from simple uniform
motion, constant velocity. The equation is $x(t) = v_{0x}t + x_0$. It is the same as the equation for y, except that the horizontal
acceleration is zero.
Note that we get parametric equations for the answer! Both of the dependent variables (x and y) are given in terms of a
third, independent, variable, t. For reference, here are the equations for ballistic motion.
$$\boxed{\begin{aligned}
x(t) &= v_{0x}\,t + x_0\\
y(t) &= -\frac{1}{2}gt^2 + v_{0y}\,t + y_0
\end{aligned}}$$
Remember that when you are working with parametric equations, you always want to phrase everything in terms of the
parameter. In this case, the values of t = time answer the question “When . . . ?”
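Parametric equations translate directly into a function of the parameter that hands back both coordinates. Here is a minimal Python sketch. The name ballistic_position and the launch values 40 and 48 ft/sec are made up for illustration, and g = 32 ft/sec² is the value used in the free-fall homework.

```python
def ballistic_position(t, v0x, v0y, x0=0.0, y0=0.0, g=32.0):
    """Return (x, y) at time t for ballistic motion without air resistance."""
    x = v0x * t + x0
    y = -0.5 * g * t**2 + v0y * t + y0
    return x, y


# Every question starts from a value of t ("When ...?"), then reads off x and y.
for t in (0.0, 0.5, 1.0):
    print(t, ballistic_position(t, v0x=40.0, v0y=48.0))
```

The only input that changes from line to line is t. Both x and y follow from it, which is what it means for t to be the parameter.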
Homework #36
Problems.
1. A boomerang doesn’t obey the equations we got for ballistic motion. Why shouldn’t it? (I am not looking for a
description of what happens when you throw a boomerang. I am looking for an explanation of why a boomerang
doesn’t follow a nice, parabolic path the way a golf ball does.)
2. This problem leads you to answer the question “How far did the golf ball go in the air?” We will take $x_0 = 0$ and
$y_0 = 0$ just to make life simpler. It really wouldn’t matter. We’ll also assume flat ground. Leave $v_{0x}$ and $v_{0y}$ as
variables that will appear in your answers. Also, don’t replace g by 32; leave it as g.
(a) The distance the ball went in the air is essentially the difference between the two places that it was on the
ground. One place it was on the ground was at the start, at $y_0 = 0$. What equation determines when the ball
is on the ground? (Hint: What value of y does “on the ground” correspond to? Also, when we are asking for
“when” we want to solve an equation for t.)
(b) Solve for t in the equation in the previous part. There should be two values of t.
(c) What are the x-values that correspond to the two values of t?
(d) How far did the ball go? (It is the difference between the two x-values in the previous part!)
3. In this problem, we answer the question “How high does the golf ball go?” We will operate using all the assumptions
and directions from the previous problem.
(a) We want to maximize height, y. How do we do that? (Hint: You will need some ideas from the Derivatives-I
and Finance chapters.)
(b) Interpret the equation in the previous part in terms of a value of the y-velocity. Does it make sense?
(c) Solve the equation in the first part for t. In other words, answer the question “When does the ball get to its
maximum height?”
(d) To get the value of the maximum height, you need to plug the value of t from the previous part into the equation
for y. Do that. What do you get for maximum height?
4. In this problem, we find and work with the non-parametric relation between x and y in ballistic motion. Part of the
reason for this problem is to show that parametric equations make life easier. Use the boxed equations for x(t) and
y(t) for this problem.
(a) Solve the equation x(t) for t. (This is so that at the next step, we can get y, the dependent variable, in terms of
x, the independent variable.)
(b) Plug that equation for t into the equation for y(t). You should get an equation for y = y(x).
(c) Answer the question “How far does the golf ball go?” using just the equation from the previous part of this
problem. Do that by solving the equation $y(x) = y_0$. (You might want to use Maple. It gets pretty ugly here.)
(d) Answer the question “How high does the golf ball get?” using just that same equation. (This time, you will
want to solve $y'(x) = 0$. Why?)
5. Divide the value of t when the golf ball lands again (from an earlier homework question) by the value of t when the
golf ball reaches maximum height. Does your answer seem reasonable physically?
The relation between definite and indefinite integrals.

We’ve thrown around both definite and indefinite integrals quite a bit, enough for you to become confused about the
difference. Here is a fast summary:
Indefinite integrals never have limits, and always come with a +C.
Definite integrals always have limits, and never come with a +C.
However, definite integrals are evaluated using indefinite integrals (with C chosen to be 0). Here is the standard format
for doing that. Suppose we want to evaluate $\int_3^7 x\,dx$. Since $\int x\,dx = \frac{1}{2}x^2 + C$, we get the following
\begin{align}
\int_3^7 x\,dx &= \left.\frac{1}{2}x^2\,\right|_3^7 \tag{4.43}\\
&= \frac{1}{2}(7)^2 - \frac{1}{2}(3)^2 \tag{4.44}\\
&= 20 \tag{4.45}
\end{align}
This is what your evaluation of definite integrals should look like. Of course, if the indefinite integral takes more effort to
find, then extra work is needed at that point, but still it should have all these steps in it.
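The same three steps can be written as a short Python sketch, with an independent check that adds up thin rectangles. The names F, definite_integral, and midpoint_sum are hypothetical, chosen for this illustration only.

```python
def F(x):
    # Indefinite integral of x, with C chosen to be 0.
    return 0.5 * x**2


def definite_integral(antiderivative, a, b):
    # Evaluate at the upper limit, then subtract the value at the lower limit.
    return antiderivative(b) - antiderivative(a)


def midpoint_sum(f, a, b, n=1000):
    # Add up n thin rectangles, each measured at the middle of its strip.
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


exact = definite_integral(F, 3, 7)
approx = midpoint_sum(lambda x: x, 3, 7)
print(exact)    # 20.0
print(approx)   # should agree with 20 to many decimal places
```

If you had used F(x) = 0.5*x**2 + C with some other C, the C would cancel in F(b) - F(a), which is why choosing C = 0 is harmless here.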
Solving differential equations and parallel limits.

There are two critical items that must be observed every time that you are setting up limits on definite integrals. One is
that the values of the limits must be values of the variable in the differential in the integral. That is, an integral $\int_a^b \ldots\,dx$
indicates that x is changing from a to b, no matter what the function in . . . is. The function doesn’t change what limits you
use! (That’s a principle we’ll see more next semester.)
The other item shows up when you are working out limits on a pair of integrals with different differential variables, and
you want the integrals to be equal. This shows up when you are integrating a differential equation, for example. The key is
this:
The limits in equal integrals must be corresponding values of their variables; they will usually not be the same
for both integrals.
As an illustration of this, let me re-work the population growth problem using definite integrals. The equation was
$dP/dt = kP$, or $dP/P = k\,dt$, with initial condition $P(0) = 50$. We then take the definite integrals of both sides, and need to
find the limits. The integral of $k\,dt$ is easiest, so we do that first. The limits on $k\,dt$ will be values of the differential variable,
t. The time starts at t = 0, since that’s when we know the population, P(0). It runs up to some unspecified time t (generic).
So the integral is $\int_0^t k\,dt$. The limits on $dP/P$ also need to be obtained. But the limits on this integral will be values of P,
and they need to be the values of P that correspond to the values of t at t = 0 and (generic) t. The value of P when t = 0 is
P(0) = 50; that’s the initial condition. The value of P at an unspecified (generic) t is an equally unspecified (generic) P(t),
usually written just P. The whole idea of the problem, in fact, is to find P(t), so it will have to enter somewhere. Thus,
the integral is $\int_{50}^{P} dP/P$. The solution of the problem then is usually written this way. (We will use the same properties of
logarithms and exponentials we always need for this problem.)
\begin{align}
\int_{50}^{P} \frac{dP}{P} &= k\int_0^t dt \tag{4.46}\\
\ln(P)\,\Big|_{50}^{P} &= k\,t\,\Big|_0^t \tag{4.47}\\
\ln(P) - \ln(50) &= (k\,t) - (0) \tag{4.48}\\
\ln(P/50) &= k\,t \tag{4.49}\\
P/50 &= e^{kt} \tag{4.50}\\
P(t) &= 50\,e^{kt} \tag{4.51}
\end{align}
Note that you can read the initial condition off of the lower limits of the two integrals. This is the format you should follow
when solving problems in this course (and beyond).
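Here is a small Python sketch that tests the idea of corresponding limits numerically. The growth constant k = 0.3 and the time t = 2 are made-up values (the problem itself leaves k general), and midpoint_integral is a hypothetical helper that adds up thin rectangles. The limits on the P integral are 50 and P(t), the values of P that go with t = 0 and t.

```python
import math


def midpoint_integral(f, a, b, n=10000):
    # Add up n thin rectangles, each measured at the middle of its strip.
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


k = 0.3                        # assumed growth constant
t = 2.0                        # assumed (generic) time
P = 50 * math.exp(k * t)       # the value of P that corresponds to this t

left = midpoint_integral(lambda p: 1.0 / p, 50.0, P)   # integral of dP/P from 50 to P
right = midpoint_integral(lambda s: k, 0.0, t)         # integral of k dt from 0 to t
print(left, right)             # the two sides should agree closely
```

Try replacing the upper limit P with t on the left side. The two integrals stop agreeing, because 2 is a value of t, not a value of P.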
One note of caution: Some people argue that it is improper to use the variable of integration (P or t above) as a limit of
the definite integral. Technically, they are correct. Obscure problems or confusions can arise from the practice, but they are
so rare, and so inconvenient to avoid, most practicing mathematicians and engineers ignore the warning.
Homework #37
Exercises.
1. Find the following definite integrals. You will have to do some guessing (for the moment) to find the indefinite
integrals.
(a) $\int_2^4 2x\,dx$
(b) $\int_{-5}^{-1} 1\,dx$
(c) $\int_{-3}^{1} x\,dx$
2. Find the following definite integrals.

(a) $\int_1^6 2x\,dx$
(b) $\int_{-1}^{1} dx$
(c) $\int_{-5}^{2} x\,dx$
4.2 Finding integrals.

We now look briefly at some methods of finding indefinite integrals. Since definite integrals come from them, we will be
finding methods for evaluating them at the same time. We will also look at approximate methods for definite integrals.
There are several standard rules from derivatives and/or differentials that reflect immediately into properties of integrals:
• $\int k\,f(x)\,dx = k\int f(x)\,dx$. Only constants k are allowed to do this!
• $\int \big(f(x) \pm g(x)\big)\,dx = \int f(x)\,dx \pm \int g(x)\,dx$. Sums and differences work properly, too.
There are no product, quotient, or chain rules for integrals. This is why integration is so nasty. There is one procedure
(coming from the chain rule, so it becomes the most important rule in integration) that helps some. We will tackle that
quite soon.
Standard calculus courses spend huge amounts of time on this topic. One thing that I have done in this course is to be
realistic and say that only a very few of the methods taught are really used, and I’ll present them. Plus, I’ll point you to
Maple and integral tables, which are much more likely to be used than vague memories once you get to using integration
anywhere else.
4.2.1 Exact methods.

We look first at methods of finding indefinite integrals. From an indefinite integral, we get definite integrals exactly. When
indefinite integrals can’t be found, we can still work definite integrals approximately. We’ll cover that in the next section.
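\begin{align*}
\int u^n\,du &= \frac{1}{n+1}\,u^{n+1} + C \quad \text{for } n \neq -1\\
\int \frac{1}{u}\,du &= \ln|u| + C\\
\int \sin u\,du &= -\cos u + C\\
\int \cos u\,du &= \sin u + C\\
\int e^u\,du &= e^u + C\\
\int \frac{1}{\sqrt{1 - u^2}}\,du &= \operatorname{Arcsin} u + C\\
\int \frac{1}{1 + u^2}\,du &= \operatorname{Arctan} u + C\\
\int \frac{1}{|u|\sqrt{u^2 - 1}}\,du &= \operatorname{Arcsec} u + C
\end{align*}
Table 4.1: Standard integration formulas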
Standard formulas.
There are a very few integrals that occur so often that they simply must be memorized.
Dudley: Augh. I can’t stand oodles of memorization!
Albert: Stay calm. If you’ll look at the table, you’ll note that there aren’t oodles of formulas there.
They are in a table nearby, together with a couple that I want there just for reference. I’ll say which are which at the end.
Note that any of these can be checked easily by verifying that the derivative of the right-hand side is the integrand on the
left. All of these but the last three (the inverse trigonometric functions) should be memorized. But if you remember your
derivatives, that’s already done.
Albert: See, it’s not hard!
Dudley: That’s simple for you to say. I still struggle with algebra.
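The "differentiate the right-hand side" check can also be done numerically. Here is a minimal Python sketch. The helper derivative and the sample point u = 0.5 are my own choices for illustration, and the power rule is tested only for n = 3. The Arcsec formula is left out because Python's math module has no arcsecant function and u = 0.5 is outside its domain anyway.

```python
import math


def derivative(F, u, h=1e-6):
    # Central difference: slope between two points straddling u.
    return (F(u + h) - F(u - h)) / (2 * h)


# Each entry: (label, integrand, claimed antiderivative with C = 0)
formulas = [
    ("u^3", lambda u: u**3, lambda u: u**4 / 4),
    ("1/u", lambda u: 1 / u, lambda u: math.log(abs(u))),
    ("sin u", math.sin, lambda u: -math.cos(u)),
    ("cos u", math.cos, math.sin),
    ("e^u", math.exp, math.exp),
    ("1/sqrt(1-u^2)", lambda u: 1 / math.sqrt(1 - u**2), math.asin),
    ("1/(1+u^2)", lambda u: 1 / (1 + u**2), math.atan),
]

u = 0.5   # sample point inside the domain of every formula listed
for label, integrand, antiderivative in formulas:
    # The two numbers on each line should agree to several decimal places.
    print(label, integrand(u), derivative(antiderivative, u))
```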
Substitution.
The chain rule, the most important rule in calculus, is a derivative formula. Operated in reverse, it becomes substitution,
the most important rule in integration.
The goal of substitution is to convert a more complicated integral into something simpler, hopefully one of the standard
formulas just given. That’s one of the reasons that the standard formulas are so important. That’s also why I wrote the
standard formulas in terms of u rather than x, since u is the most typical substitution variable. Substitution is also called
change of variables for this reason.
The chain rule says that $(f(g(x)))' = f'(g(x)) \times g'(x)$. This means that
$$\int f'(g(x))\,g'(x)\,dx = f(g(x)) + C$$
This, however, is a bit difficult to see clearly.

Mugsy: I love how he underestimates my confusion.
Let me try to help by using a new variable, $u = g(x)$. The chain rule then becomes
$$\int f'(u)\,u'(x)\,dx = f(u) + C$$
If you realize $u'(x)\,dx = (du/dx)\,dx = du$, it says that $\int f'(u)\,du = f(u) + C$, which is the definition of the indefinite integral.
So this is nothing that new. But it does give a major clue as to the way to use this. We substitute u = the inside of some part
of the function we are trying to integrate. (Actually, we will hit situations where this is not accurate, but for now, it is
worthwhile letting it sink in.)
Mugsy: In other words, you let u = inside of the most complicated part always—sometimes.
Albert: Yes. Always for now. Usually, but not always, later.
The whole key to substitution is to locate what function to use for u. Here are a few items to look for:
• The inside of the most complicated part.
The most complicated part of $f'(g(x))\,g'(x)$ is the $f'(g(x))$ term. Its inside is $g(x)$, so use $u = g(x)$ for that reason.
(Note that you shouldn’t let $u = f'(g(x))$, that is, the whole thing.) This is the method that works the most often on
questions that have been cooked up (like on homework or tests).
Albert: HINT!
• Any function whose derivative appears as a factor in the integrand.
The integrand is $f'(g(x))\,g'(x)$, and the derivative of $g(x)$ (on the inside of $f'(g(x))$) is $g'(x)$, which appears as a
factor in the integrand. Use $u = g(x)$ for that reason. This obviously works only when you can integrate terms of the
integrand, and can express the rest of the integrand in terms of the integral of the rest.
Dudley: What?
Albert: Look at it this way. In case part of the integrand—that is the function you are integrating—can be
integrated reasonably easily, you then look to see if the rest of the integrand can be expressed easily as a
composition with the inside being the part you can integrate.
Indefinite integrals. Now we begin the process. When confronted by an integral that is not exactly like one of the
standard formulas, here is a rough procedure to follow.
First, convert roots and divisions to exponents. This is the same as you did in differentiation. Calculus works easier
with exponents.
See if you can locate a good substitution, using the suggestions I just gave. How do you tell if a substitution is good?
That comes next. Let’s work an example. Consider
$$\int \frac{2x}{x^2 + 1}\,dx$$
The integral looks similar to, but not exactly the same as, the Arctan x standard formula. The 2x on top ruins it. (If it were
just 2 on top, then we could pull the constant outside, and get $2\int (1/(x^2 + 1))\,dx$, which would be $2\operatorname{Arctan} x + C$. But the x
is not a constant, so it can’t be pulled out in front.) So, we convert the integral to exponents
$$\int (2x)\,(x^2 + 1)^{-1}\,dx$$
At this point, we look for a substitution, and $u = x^2 + 1$ suggests itself immediately by both criteria. It is the inside (not the
whole!) of the most complicated part, and its derivative is 2 x, which is a factor in the integral.
Now that we have the substitution, how do we use it? There is another major principle here.
When you are making a substitution, you have to change everything to the new variable.
This includes the differential and the limits (which we will worry about when we get to definite integrals).
How do you tell if a substitution “worked?” Check if all items connected with the old variable have vanished. If so, it
is a useful change.
So, proceed with the example. We have $u = x^2 + 1$, so we have to convert all the x’s in the integral to u’s. Always
begin with the differential. How do you convert dx to du? The chain rule, of course. That tells you how to change between
differentials of related variables. The derivative of the substitution $u = x^2 + 1$ is $du/dx = 2x$, so $du = 2x\,dx$. There are two
possible directions at this point.
• Rewrite the integral to include a term that is exactly 2 x dx and then put du in for it. Or
• Solve du = 2 x dx for dx and plug that into the integral.

Either approach works. I only use the first if there is no fudging around necessary to get the du exactly, so I would probably
use it here. Most of the time, however, I use the second approach. It is more mechanical—I can do it without thinking,
which is more than half the idea of having these procedures.
In the first approach, the integral becomes:
\begin{align*}
\int (2x)\,(x^2+1)^{-1}\,dx &= \int (x^2+1)^{-1}\,(2x)\,dx \tag{4.52}\\
&= \int u^{-1}\,du \tag{4.53}\\
&= \ln|u| + C \tag{4.54}\\
&= \ln\left|x^2+1\right| + C \tag{4.55}
\end{align*}
Using the second approach, the integral is worked this way:
$u = x^2 + 1$, so $du = 2x\,dx$, so $dx = \frac{du}{2x}$. Then
\begin{align*}
\int (2x)\,(x^2+1)^{-1}\,dx &= \int (u)^{-1}\,(2x)\,\frac{du}{2x} \tag{4.56}\\
&= \int u^{-1}\,du \tag{4.57}\\
&= \ln|u| + C \tag{4.58}\\
&= \ln\left|x^2+1\right| + C \tag{4.59}
\end{align*}
Even though it looks longer, I recommend the second approach, reserving the first method for those situations when you
recognize what will happen.
One more point about this integral needs to be commented on. Since the original indefinite integral here was given as a
function of x, the answer should also be given as a function of x. So, whenever you substitute in order to solve an indefinite
integral, you will have to undo the substitution before you are finished. This is usually (but not always) quite simple.
As with any indefinite integral, we can check the result by differentiating it:
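\begin{align*}
\frac{d}{dx}\left(\ln\left|x^2+1\right| + C\right) &= \frac{1}{|x^2+1|} \times \frac{\left|x^2+1\right|}{x^2+1} \times (2x) \tag{4.60}\\
&= \frac{2x}{x^2+1} \tag{4.61}
\end{align*}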
This is the function that we integrated, so it checks.
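As a second illustration of the same steps, here is a sketch with an integrand that is not from the text, $\int x\cos(x^2)\,dx$, chosen only because the second approach has to cancel a constant factor along the way. Take $u = x^2$, so $du = 2x\,dx$ and $dx = du/(2x)$:

```latex
\begin{align*}
\int x\cos(x^2)\,dx &= \int x\cos(u)\,\frac{du}{2x} \\
&= \frac12\int \cos u\,du \\
&= \frac12\sin u + C \\
&= \frac12\sin(x^2) + C
\end{align*}
% Check by differentiating:
% \frac{d}{dx}\left(\frac12\sin(x^2) + C\right) = \frac12\cos(x^2)\cdot 2x = x\cos(x^2)
```

Every x cancelled once the substitution was made, which is the sign that it worked, and differentiating the result gives back the integrand.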
It is worth checking your indefinite integrals until you are convinced you are getting them. Over-reliance on Maple here
can be bad for your ability to work such problems on a test! A large number of homework problems follow, because you
simply have to get used to this technique, and there is no way to do that but to practice.
Finally, the part you have been waiting for—how to use Maple to do this. There is a command for Maple to change
variables in an integral, but to use it is a bit roundabout. You first have to tell Maple not to evaluate the integral you have so
that you can tell it to change the variable. That is done by using what is called the inert form of int(); which is written
Int(); That is, when you type in
Int(2*x/(x^2+1),x);
Maple simply comes back with the integral unevaluated. Now you can change variables. To get to the substitution routines
for integrals, you need to load the student package first, by typing
with(student);.
An example of how this works is given here:
> with(student):
> Int(2*x/(x^2+1), x);
\[ \int \frac{2x}{x^2+1}\,dx \]
> changevar(u=x^2+1, %, u);
\[ \int \frac{1}{u}\,du \]
> value(%);
\[ \ln(u) \]
> subs(u=x^2+1, %);
\[ \ln(x^2+1) \]
> diff(%, x);
\[ \frac{2x}{x^2+1} \]
Let me comment briefly on this Maple session. Note the colon in with(student):, preventing a long list from
appearing. The changevar(u=x^2+1,%,u); is the critical step here. The changevar(); command is part of the student
package that was loaded with the with(student): command. You have to tell Maple, in this order, the equation defining
the change of variable, the integral to put it in, and the final variable after the change. The result is another inert integral.
To make Maple evaluate the integral (essentially, tell it to go ahead and work it out, even though it is an Int();), you can
use the command value();. But remember that even though the integral sign is gone, you aren’t quite done. You have
to convert indefinite integrals back to the original variable. That’s what the subs(u=x^2+1,%) does. The last step simply
checks that the answer is correct. But one more caution. Remember that Maple doesn’t put in the constant of integration,
but I want you to! (Forgetting the constant of integration will guarantee that you will lose some credit.)
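For comparison, here is a minimal sketch of the direct route, assuming only the standard int(), diff(), and simplify() commands (no student package is needed for this). The name F is introduced here purely for illustration.

```maple
# Direct route: let Maple integrate without showing the substitution.
F := int(2*x/(x^2 + 1), x);

# Check by differentiating: the difference from the integrand
# should simplify to 0 if F is a correct antiderivative.
simplify(diff(F, x) - 2*x/(x^2 + 1));
```

This route hides the substitution entirely, so it is useful for checking an answer but not for practicing the technique. As in the session above, Maple leaves off the constant of integration.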
Homework #38
Exercises.
1. Find the following indefinite integrals.
(a) $\int (3x^3 - 5x^2 + 5x - 3)\,dx$
(b) $\int \sin(8x)\,dx$
(c) $\int \left((5/x^2) - (4/\sqrt[3]{x})\right)dx$
(d) $\int \sqrt{3t + 8}\,dt$
(e) $\int \frac{\sin r}{\cos r}\,dr$
(f) $\int \frac{e^{\sqrt{w}}}{\sqrt{w}}\,dw$
(g) $\int \frac{x^3 + 1}{x^2}\,dx$ (Hint: Divide it out first.)
(h) $\int \frac{x^3 - x}{x^4 - 2x^2}\,dx$
[Major note: If you try to make up integrals yourself, you will most likely get things that can’t be integrated. Integrals
have to be concocted very carefully. This is a somewhat discouraging reality. If you want more practice, you can
try a different method. Since substitution is the chain rule in reverse, take some functions with compositions (but no
products or quotients) in them, and differentiate them. Then integrate them. You should come back to the original
function. Alternately, see me for many more examples.]
2. Find the following indefinite integrals.
(a) $\int (3x^3 - 4x^2 - 3x + 6)\,dx$
(b) $\int \cos(3x)\,dx$
(c) $\int \left((4/x^6) + (3/\sqrt[5]{x})\right)dx$
(d) $\int \sqrt{24t - 7}\,dt$
(e) $\int \frac{\cos r}{\sin r}\,dr$
(f) $\int \frac{\sin\sqrt{s}}{\sqrt{s}}\,ds$
(g) $\int \frac{x + 1}{x^3}\,dx$ (Hint: Divide it out first.)
(h) $\int \frac{x^3 + 2x}{x^4 + 4x^2}\,dx$
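The suggestion in the major note after Exercise 1 (differentiate a composition, then integrate the result) can be tried in Maple. This is only a sketch: the function sin(x^3 + 1) and the names f and g are made up for illustration and do not come from the exercises.

```maple
f := sin(x^3 + 1);    # a composition with no products or quotients
g := diff(f, x);      # the chain rule produces the practice integrand
int(g, x);            # should come back to f, possibly up to a constant
```

Work the integral of g by hand with a substitution before looking at the last line; the inside of the composition is the natural choice for u.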
Problems.
1. When substitution works, it works well. When it doesn’t work, you can sometimes force it. That’s what we are going to look at in this problem. We will be integrating $\int x\sqrt{x + 2}\,dx$. Note that the first approach I gave to substitution will not work well here. The second approach ought to be used! (This is another reason I recommend the second method.)
(a) Convert the square root to an exponent. What is the inside of the most complicated part (that is, what is u)?
What is du for that u?
(b) Plug all of this into the integral. Note that all of the x’s don’t obligingly wipe out. This is an indication of
problems, since changing the variable in an integral means all the variables must change.
(c) Can we force the integral into the new variable? The answer is yes. Solve the substitution equation for x (in
terms of u), and plug that in for the x left in the integral. You should now have an integral that has only u as the
variable. That’s what you need for substitution to be a success.
(d) Multiply out the integrand (the function you are integrating), convert each of the terms to exponents, and
integrate it.
(e) Convert the integral back to x’s by using the substitution.
(f) Check your answer by differentiating it. This will require some algebra.
2. Use the procedure from the previous problem to find $\int x^9\sqrt{x^5 + 3}\,dx$.
3. It is not uncommon to see $\ln(x^2 + 1)$ rather than $\ln\left|x^2 + 1\right|$. Why are these two equal?
4. The rare case occurs when you have to make your own substitution work. In this problem, we integrate $\int \sec x\,dx$. (The integral $\int \csc x\,dx$ operates the same way, but with all co-functions.)
(a) Multiply and divide the sec x in the integral by (sec x + tan x), and multiply out the top.
(b) Note that the top is the derivative of the bottom. What should you use for the substitution to finish this problem?
(c) Finish the problem.
Investigation.
In the questions in this investigation, we look at the standard (trig) substitutions. They are the method used to integrate the
last three standard formula (inverse trigonometric function) integrals. This is a major topic in standard calculus courses,
and is typically poorly understood there. I want to look at them only briefly. Maple does a good job on these, so instead of
confusing the daylights out of you, I’ll allow you to use Maple. I will not ask for these on any test.
Mugsy: Joy, joy.
1. The first one we tackle is
\[ \int \frac{dx}{\sqrt{1 - x^2}} \]
which we know from earlier is Arcsin x + C.
(a) Substitute $x = \sin\theta$ in the integral. Find dx. Also, plug x into and then simplify $\sqrt{1 - x^2}$ using a trigonometric identity. Plug both into the integral. You should see some cancellation.
(b) In fact, the integral reduces to $\int d\theta = \theta + C$, but we now need to get back to the original variable, x. For that, solve the substitution $x = \sin\theta$ for $\theta$. That way, we can plug in to the answer ($\theta + C$) to get the integral in terms of x. Do this.
(c) We can also use the same idea any time we have an integral containing the form $a^2 - u^2$, and not just $1 - x^2$. The substitution in that case is $u = a\sin\theta$. Use this substitution and the procedure from the earlier parts of this question to integrate
\[ \int \frac{du}{\sqrt{a^2 - u^2}} \]
(d) Check that the integral you got in the previous part is correct by differentiating your answer. (This can require some algebra.)
2. We continue with trigonometric substitutions. We also know that
\[ \int \frac{dx}{1 + x^2} = \mathrm{Arctan}\,x + C \]
We show where this comes from, too. The methods of this question closely resemble the methods of the previous question, so look back there if you get stuck.
(a) Substitute $x = \tan\theta$ in the integral: find dx, plug x into $1 + x^2$, and put both into the integral. Use a different trigonometric identity to show that this integral also reduces to $\int d\theta = \theta + C$. We again need to get back to the original variable, x, so solve the substitution $x = \tan\theta$ for $\theta$ and finish the integral.
(b) We can use the same idea (again) any time we have an integral containing the form $a^2 + u^2$, and not just $1 + x^2$. The substitution in that case is $u = a\tan\theta$. Use the method from the first part of this investigation with this substitution to integrate
\[ \int \frac{du}{a^2 + u^2} \]
This time, you will get $\int d\theta$ together with extra, constant factors. But those are easy to deal with, since they pull outside the integral sign.
3. We continue with even more trigonometric substitutions. We also know that
\[ \int \frac{1}{|u|\sqrt{u^2 - 1}}\,du = \mathrm{Arcsec}\,u + C \]
This is the last of the trio of inverse trigonometric integrals I just gave. And again, look back to the first investigation of this section for more detailed reasoning for the steps.
(a) Substitute $x = \sec\theta$ in the integral, use a trigonometric identity, and show that this integral also reduces to $\int d\theta$. (You can assume that the absolute values work out. Basically, ignore them. They are a real pain to go through in detail.) The integral is then $\theta + C$, but we now need to get back to the original variable, x. For that we need to solve the substitution $x = \sec\theta$ for $\theta$.

Form           Substitution          Differential                             Form becomes
$a^2 - u^2$    $u = a\sin\theta$     $du = a\cos\theta\,d\theta$              $a^2 - u^2 = a^2\cos^2\theta$
$a^2 + u^2$    $u = a\tan\theta$     $du = a\sec^2\theta\,d\theta$            $a^2 + u^2 = a^2\sec^2\theta$
$u^2 - a^2$    $u = a\sec\theta$     $du = a\tan\theta\sec\theta\,d\theta$    $u^2 - a^2 = a^2\tan^2\theta$
Table 4.2: Trigonometric substitutions.

(b) We can use a similar substitution any time we have an integral containing the form $u^2 - a^2$, and not just $x^2 - 1$. The substitution in that case is $u = a\sec\theta$. Use the method from the first part of this investigation with this substitution to integrate
\[ \int \frac{1}{|u|\sqrt{u^2 - a^2}}\,du \]
(The comment from the previous investigation problem about constant factors applies here, too.)
4. The past three questions provide a framework to integrate a number of different things, but one more item is needed to tie them all together. If you are confronted with a quadratic that contains only $x^2$ terms and constants, you can use the methods from the previous problems. On the other hand, a quadratic can also contain a linear term (such as $x^2 - x + 1$, where the $-x$ is the linear term). This creates serious problems. For example, if you have $\int dx/\sqrt{x^2 - x}$, you can’t just assume that x is $a^2$, a constant, since it is definitely a variable. There is a procedure for dealing with linear terms in quadratics that is so common you should have encountered it before. It is called completing the square. We will go over it in the lab period, if necessary. Maple also has a completesquare(); command (accessible after typing with(student): the same way that changevar(); is). The result of completing the square in a quadratic is a form either $a^2 - u^2$, $a^2 + u^2$, or $u^2 - a^2$, where a is a constant and the u contains the variable (usually x). (The fourth option, $-a^2 - u^2$, can always be treated as $-(a^2 + u^2)$ wherever it occurs.) The previous three problems then show you how to integrate these. The table of trigonometric substitutions (Table 4.2) summarizes what you do. For example, with $2x^2 - 8x + 5$, completing the square gives $2(x - 2)^2 - 3$. Comparing that to $u^2 - a^2$ means that you would use $u^2 = 2(x - 2)^2$ and $a^2 = 3$, or $u = \sqrt{2}\,(x - 2)$ and $a = \sqrt{3}$. And the form $u^2 - a^2$ would mean that you would want the substitution $u = a\sec\theta$, which in this case would become $\sqrt{2}\,(x - 2) = \sqrt{3}\sec\theta$. That substitution will magically cause all the right things to happen in the integral. In particular, you would get that $u^2 - a^2 = 2x^2 - 8x + 5$ would equal $a^2\tan^2\theta = 3\tan^2\theta$. We now want to look at integrating a few functions using the table for trigonometric substitutions.
(a) First we tackle
\[ \int \frac{dx}{\sqrt{4x - x^2}} \]
The function to concentrate on is $4x - x^2$. Complete the square on it. What form is it? (That is, which row in the table should we be looking at?) What is a? What is u?
(b) Which substitution do you use (from the table and the form)? Use the values for a and u in the previous part to find the actual substitution (in terms of x) to use.
(c) Use that substitution in the integral, and considerable amounts of algebra, to get it to the $\int d\theta$ stage.
(d) Finish the integral, following the same procedure used earlier. Check your answer on Maple, or by differentiation. (Maple is the easier of the two. If you try this on Maple, use sqrt(); instead of exponents. Also, you can have it differentiate the answer it gives using diff(%, x); but it won’t put it in a nice form. Use normal(%); to have it rearrange terms to confirm the check.)
(e) In this problem, we try another integral:
\[ \int \frac{dx}{x^2 + 6x + 13} \]
Follow the directions for the previous investigation problem to find this integral.
(f) We do one more integral:
\[ \int \frac{dx}{|x - 2|\sqrt{x^2 - 4x - 5}} \]
Follow the same directions to find this integral. (In this one, on Maple, you will have to ignore the absolute value signs because Maple is a bit strange there. Be careful to type this function in correctly! And the answer you get will not even have the same functions in it.)
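The worked example in question 4 can be written out as a short check. This sketch only verifies the claim made there, that the substitution $\sqrt{2}\,(x-2) = \sqrt{3}\sec\theta$ turns $2x^2 - 8x + 5$ into $3\tan^2\theta$; it does not work any of the integrals in parts (a) through (f).

```latex
\begin{align*}
2x^2 - 8x + 5 &= 2(x^2 - 4x + 4) - 8 + 5 = 2(x-2)^2 - 3 = u^2 - a^2,
  \qquad u = \sqrt{2}\,(x-2),\quad a = \sqrt{3} \\
u = a\sec\theta \;&\Longrightarrow\; u^2 - a^2 = 3\sec^2\theta - 3 = 3(\sec^2\theta - 1) = 3\tan^2\theta \\
du = a\sec\theta\tan\theta\,d\theta \;&\Longrightarrow\; \sqrt{2}\,dx = \sqrt{3}\,\sec\theta\tan\theta\,d\theta
\end{align*}
```

The last line is the differential from the third row of Table 4.2, rewritten in terms of dx; it is what replaces dx when the substitution is put into an integral.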
Definite integrals. Once we have exact procedures for indefinite integrals, getting exact procedures for definite integrals
is easy.
The procedure. Exactly the same process is used for definite integrals as for indefinite ones. The only difference is
that we need to learn what to do with the limits. Again, there are two approaches:
• Integrate the function, convert back to the original variables, and use the original limits. This is the method to use if
you already know the indefinite integral.
• Change the limits when you change the variables, and use the new limits with the new variables. This is the method
to use if you don’t already know the indefinite integral.
Note that you don’t have the option of using the new limits with the old variables. Some people always try that . . . .
How do you find the new limits? It is identical to what we had when integrating a differential equation: Make new
limits correspond to old limits. An example will help; consider
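\[ \int_1^2 \frac{2x}{x^2 + 1}\,dx \]
The first approach is the easiest here, because we already know the indefinite integral; it is $\ln\left|x^2 + 1\right| + C$. The definite integral then has the value
\begin{align*}
\int_1^2 \frac{2x}{x^2 + 1}\,dx &= \Bigl.\ln\left|x^2 + 1\right|\,\Bigr|_1^2 \tag{4.62}\\
&= \ln\left|2^2 + 1\right| - \ln\left|1^2 + 1\right| \tag{4.63}\\
&= \ln(5) - \ln(2) \tag{4.64}\\
&= \ln(5/2) \tag{4.65}
\end{align*}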
But what if we didn’t already know the indefinite integral? We’d have to work it out. Let me redo this problem, not assuming that we have worked out the indefinite integral. The substitution (determined earlier) is $u = x^2 + 1$. I will use the second method for indefinite integrals:
$u = x^2 + 1$, so $du = 2x\,dx$, so $dx = \frac{du}{2x}$. Then
\begin{align*}
\int_1^2 \frac{2x}{x^2 + 1}\,dx &= \int_2^5 \frac{2x}{u}\,\frac{du}{2x} \tag{4.66}\\
&= \int_2^5 \frac{du}{u} \tag{4.67}\\
&= \Bigl.\ln|u|\,\Bigr|_2^5 \tag{4.68}\\
&= \ln(5) - \ln(2) \tag{4.69}\\
&= \ln(5/2) \tag{4.70}
\end{align*}
This gives the same answer, as it had better.

The big question is, where did the limits 2 and 5 come from in the du integral? Remember that when you want to make two definite integrals with different variables equal, you must make the limits of the integrals correspond. That means you put the values of u in the limits of the du integral that correspond to the limits of x = 1 and x = 2 in the dx integral. How do you find the correspondence? It is given by the substitution equation, $u = x^2 + 1$. Then x = 1 (the lower limit from the dx integral) corresponds to $u = (1)^2 + 1 = 2$ (the lower limit to use on the du integral), and x = 2 (the upper limit from the dx integral) corresponds to $u = (2)^2 + 1 = 5$ (the upper limit to use on the du integral). It actually is not difficult. Much like the chain rule (which is woven through all of this), you only need to remember to do it. But this also merits a boxed reminder.
Mugsy: I don’t have anything to say here. But I decided to break up the monotony.
When changing variables in a definite integral, you should also change the limits. You do this by plugging the
old limits one at a time into the equation for the substitution, and getting the new limits out.
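Here is the boxed rule applied once more, as a sketch on an integral that is not from the text, $\int_0^2 x\,e^{x^2}\,dx$. With $u = x^2$ we have $du = 2x\,dx$, so $dx = du/(2x)$, and the limits x = 0 and x = 2 become $u = 0^2 = 0$ and $u = 2^2 = 4$:

```latex
\begin{align*}
\int_0^2 x\,e^{x^2}\,dx &= \int_0^4 x\,e^{u}\,\frac{du}{2x} \\
&= \frac12\int_0^4 e^{u}\,du \\
&= \Bigl.\frac12\,e^{u}\,\Bigr|_0^4 \\
&= \frac12\left(e^4 - 1\right)
\end{align*}
```

Because the limits were changed along with the variable, there is no need to convert back to x at the end.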
Again, the procedure changevar(); in Maple also works for definite integrals, as the following shows.
> with(student):
> Int( 2*x/(x^2+1), x = 1 .. 2 );
\[ \int_1^2 \frac{2x}{x^2+1}\,dx \]
> changevar(u=x^2+1,%,u);
\[ \int_2^5 \frac{1}{u}\,du \]
> value(%);
\[ -\ln(2) + \ln(5) \]
Note that the changevar(); routine changes the limits as it goes along. This is what you should do as well!
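A quick numerical cross-check of the two integrals is also possible. This is a sketch that assumes only the standard evalf() and inert Int() commands; it does not need the student package.

```maple
evalf(Int(2*x/(x^2 + 1), x = 1 .. 2));   # numerical value of the dx integral
evalf(Int(1/u, u = 2 .. 5));             # numerical value of the du integral
evalf(ln(5/2));                          # the exact answer as a decimal
```

All three lines should agree, at about 0.9163, which is another way of seeing that the old limits go with the old variable and the new limits with the new one.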
Why the procedure works. What I have just said ought to make sense (I do try). But there is a more fundamental
reason why the substitution procedure in definite integrals works: Both integrals add up the same numbers! This is hardly
obvious, but it is true.
There are two changes that go on in a substitution in a definite integral: You must change the differential (dx needs
to become du, and does so by the formula du = (du/dx) × dx), and you must change the limits (by the procedure given
earlier). These two changes work together to guarantee that the numbers going into the integral’s summation are equal.
To see this, it is easiest to work backwards, from $\int_2^5 (1/u)\,du$ to $\int_1^2 2x/(x^2 + 1)\,dx$, and show that everything matches up.
The area of each differential-width sliver in the du integral is (base) × (height), with the du being the (base) and the 1/u being the (height). We won’t have du equaling dx, since the differentials of different variables must be related through the chain rule. For that, $du = (du/dx) \times dx = (2x)\,dx$. So that explains the 2x in the dx integral. It is a factor needed to multiply the width of the dx slivers to equal the value of the width of the du slivers.
The height of the du slivers is 1/u, and this “obviously” equals $1/(x^2 + 1)$, since $u = x^2 + 1$. At least it will as long as the values plugged in for u are equal to $(x^2 + 1)$’s values. Note that we don’t want u to equal x, because then 1/u wouldn’t equal $1/(x^2 + 1)$, and the heights would be off. So, how do we make the values plugged in for u equal to the values plugged in for $x^2 + 1$? We adjust the limits of the integral so that this happens! The value u = 2 is just what you’d get from plugging x = 1 into $u = x^2 + 1$, and the value u = 5 is just what you’d get by plugging x = 2 into $u = x^2 + 1$. So, as x goes from 1 to 2, we want the values of u to go from 2 to 5. Then the values for 1/u and $1/(x^2 + 1)$ will match sliver-for-sliver, and the heights will be equal.
If you don’t get this explanation of why the procedure works, don’t worry too much. You should, however, know what
the procedure for changing limits is and use it when the integral calls for it.
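The sliver-for-sliver matching can be written down compactly. In this sketch, $x_i$ and $\Delta x_i$ are names introduced here for the position and width of the i-th sliver on the x side, and $u_i$, $\Delta u_i$ are the corresponding quantities on the u side:

```latex
\begin{align*}
u_i &= x_i^2 + 1, \qquad \Delta u_i \approx \frac{du}{dx}\,\Delta x_i = 2x_i\,\Delta x_i \\
\frac{1}{u_i}\,\Delta u_i &\approx \frac{1}{x_i^2 + 1}\,(2x_i)\,\Delta x_i \\
\sum_i \frac{1}{u_i}\,\Delta u_i &\approx \sum_i \frac{2x_i}{x_i^2 + 1}\,\Delta x_i
\end{align*}
```

The left-hand sum adds slivers as u runs from 2 to 5, and the right-hand sum adds the matching slivers as x runs from 1 to 2. The approximation in the width improves as the slivers get thinner.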
Homework #39
Exercises.
1. Find the following definite integrals. Change the limits when you use substitution to evaluate the integral on the ones for which you don’t know the indefinite integral. (Some of the indefinite integrals showed up in the last homework exercises.)
(a) $\int_0^2 (3x^3 - 5x^2 + 5x - 3)\,dx$
(b) $\int_{\pi/16}^{\pi/4} \sin(8x)\,dx$
(c) $\int_{-1}^{1} \sqrt{3t + 8}\,dt$
(d) $\int_0^1 \frac{\sin r}{\cos r}\,dr$
(e) $\int_1^2 \frac{1}{1 + x}\,dx$
2. Find the following definite integrals. Change the limits when you use substitution to evaluate the integral on the ones for which you don’t know the indefinite integral. (Some of the indefinite integrals showed up in the last homework exercises.)
(a) $\int_0^1 (3x^3 - 4x^2 - 3x + 6)\,dx$
(b) $\int_{\pi/6}^{\pi/2} \cos(3x)\,dx$
(c) $\int_1^2 \sqrt{24t - 7}\,dt$
(d) $\int_0^1 \frac{\sin\sqrt{s}}{\sqrt{s}}\,ds$
(e) $\int_0^1 \frac{1}{1 + 2x}\,dx$
Problem.
1. This problem investigates a way to evaluate some definite integrals very rapidly.
(a) Give an argument showing that $\int_a^a f(x)\,dx = 0$ for any function f(x) and any value a.
(b) Find the substitution and new limits in $\int_{-1}^{1} x/\sqrt{x^2 + 1}\,dx$.
(c) What is the value of the integral in the preceding part of this problem? (Hint: Look at the first part of this problem.)
(d) Try that same substitution on the integral $\int_{-1}^{1} 1/\sqrt{x^2 + 1}\,dx$. What indicates that this integral might not be zero? (In fact, it has a value roughly equal to 1.76.)
Partial fractions.
This is a method of integrating any rational function (a fancy name for the quotient of two polynomials). It works in theory
always, often in practice. There is a specific method to use.
Conversion to a proper form. When applying partial fractions, the first thing to do is to convert the integrand to [poly-
nomial] + [proper rational function] using polynomial division. A proper rational function is one where the degree of the
top is less than the degree of the bottom. If you don’t have this, the methods of partial fractions will fail. They always
assume a proper rational function.
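The polynomial division step can be done in Maple with the quo() and rem() commands. This is a sketch on a made-up improper rational function, $(x^3 + 2x)/(x^2 - 1)$; the names pnum, pden, q, and r are introduced here for illustration only.

```maple
pnum := x^3 + 2*x:            # degree 3 on top
pden := x^2 - 1:              # degree 2 on the bottom, so this is not proper
q := quo(pnum, pden, x);      # polynomial part: x
r := rem(pnum, pden, x);      # remainder: 3*x
q + r/pden;                   # [polynomial] + [proper rational function]
normal(q + r/pden - pnum/pden);   # 0, so nothing was lost
```

By hand, $x^3 + 2x = x\,(x^2 - 1) + 3x$, so the function equals $x + 3x/(x^2 - 1)$, and only the second piece goes on to partial fractions.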
Factor the denominator. All the factors should either be linear or quadratic, with the quadratic having only imaginary roots. All polynomials (with real coefficients) can be factored this far in theory. This is the step where practice might not agree with theory. You can tell whether a quadratic has imaginary roots by completing the square on it. If it has the form $u^2 + a^2$, then it has imaginary roots. If it has the form $u^2 - a^2$, then it will factor into $(u - a)(u + a)$. (If this makes you think that trigonometric substitutions fit in here, you are precisely correct.)
Set up the correct partial fractions form. Getting the right form takes a bit of practice, but is actually quite simple once you get the hang of it. The basic idea is that each factor in the denominator generates one or more terms in the partial fractions form. The number of terms it generates equals its exponent, and the denominators of those terms are the original factor (without its exponent) raised to successive powers, starting at one and working up to the exponent on the factor. That is, the denominator of the last term equals exactly the original factor. That determines the denominators of the partial fractions form. The numerators are much easier. The factors will either be linear or quadratic. Linear factors get constants on top in the form, and quadratics get linear terms on top in the form. An example will help:
\[ \frac{(?)}{x^2\,(x - 1)^3\,(x^2 + 4)^3\,(x^2 + 9)} \]
has a partial fractions expansion of
\[ \frac{A}{x} + \frac{B}{x^2} + \frac{C}{x - 1} + \frac{D}{(x - 1)^2} + \frac{E}{(x - 1)^3} + \frac{Fx + G}{x^2 + 4} + \frac{Hx + I}{(x^2 + 4)^2} + \frac{Jx + K}{(x^2 + 4)^3} + \frac{Lx + M}{x^2 + 9} \]
The $x^2$ in the denominator gave the first 2 terms in the expansion. The $(x - 1)^3$ gave the next 3 terms. The $(x^2 + 4)^3$ gave the next 3 terms. And the $(x^2 + 9) = (x^2 + 9)^1$ gave the last term. Note that the linear terms to powers ($x^2$ and $(x - 1)^3$) have constants on top, while the quadratic terms to powers ($(x^2 + 4)^3$ and $(x^2 + 9)$) have linear terms on top. Note that the numerator (top) polynomial of the original function has nothing to do with the form as it is set up. It controls the values of the coefficients A through M, but not the setup.
Solve for the coefficients in the partial fractions form. This is usually an algebraic nightmare. Fortunately, Maple will
do all of that for us. In fact, Maple has a function that does the conversion to partial fractions form directly.
The Maple command to change R(x), a rational function, to its partial fractions form is convert(R(x),parfrac,x);.
You must tell Maple what function to convert (R(x)), what conversion to perform (parfrac is the partial fractions indica-
tor), and what the variable is (x, since there could be other variables around).
An example is
> y := (x^2+2*x+3)/(x^2*(x^2+4)^2);
\[ y := \frac{x^2 + 2x + 3}{x^2\,(x^2 + 4)^2} \]
It is possible to put the function straight into the convert(); statement, but I would encourage you not to do that. By first defining y, and letting Maple print it out, you can look at the function and guarantee that you have the right function. It is very easy to mistype these functions, particularly by not putting parentheses around the bottom function.
It should be noted that the numbers in this Maple example are quite tame compared to what you can get in a partial
fractions expansion.
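A minimal sketch of the conversion step for the function y defined above, using the convert(); form described in the text. The name pf is introduced here for illustration, and no output is reproduced.

```maple
y := (x^2 + 2*x + 3)/(x^2*(x^2 + 4)^2):
pf := convert(y, parfrac, x);   # partial fractions form of y
normal(pf - y);                 # recombining should give 0
```

From the setup rules, the result should have the shape $A/x + B/x^2 + (Cx + D)/(x^2 + 4) + (Ex + F)/(x^2 + 4)^2$, although some of the coefficients may turn out to be zero. The normal(); line is a cheap way to confirm that the expansion still equals the original function.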
There is one nice application that fits in here. Several times, we have looked at the population growth model, $dP/dt = kP$. This leads to exponential growth: $P(t) = P_0\,e^{kt}$. This equation works quite well, but for limited times only. Suppose, for example, that we take the example we had before, where P(t) = number of rabbits at time t (in months), k = 0.1, and $P_0 = 50$. Then at t = 1200 (100 years), we’d have $P(t) = 50\,e^{120} \approx 6.5 \times 10^{53}$ rabbits, weighing substantially more than the entire solar system. Clearly, something is wrong with the equation. Shortage of food, overcrowding, lack of privacy, etc., cause a drop in the growth rate. How do we incorporate this into the equations? The usual way is to add another factor, and get the logistic equation
\[ \frac{dP}{dt} = kP\left(1 - \frac{P}{M}\right) \]
where M is the maximum stable population of rabbits. Then, as P gets close to M, dP/dt (the growth rate) drops, causing slower growth. If for some reason, P should ever get larger than M, then the growth rate would become negative, and the population would then decrease back to M. So, let’s solve this differential equation. First separate variables to get
\[ k\,dt = \frac{dP}{P\,(1 - P/M)} = \frac{M}{P\,(M - P)}\,dP \]
We want to integrate both sides, but partial fractions are needed on the last integral. Using partial fractions, you get
\[ \frac{M}{P\,(M - P)} = \frac{1}{P} + \frac{1}{M - P} \]
so integrating the separated equation gives

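\begin{align*}
\int_0^t k\,dt &= \int_{P_0}^{P} \left(\frac{1}{P} + \frac{1}{M - P}\right) dP \tag{4.71}\\
\Bigl. k\,t\,\Bigr|_0^t &= \Bigl.\bigl(\ln|P| - \ln|M - P|\bigr)\Bigr|_{P_0}^{P} \tag{4.72}\\
k\,(t) - k\,(0) &= \left.\ln\left|\frac{P}{M - P}\right|\,\right|_{P_0}^{P} \tag{4.73}\\
k\,t &= \ln\left|\frac{P}{M - P}\right| - \ln\left|\frac{P_0}{M - P_0}\right| \tag{4.74}\\
e^{kt} &= \frac{P/(M - P)}{P_0/(M - P_0)} \tag{4.75}
\end{align*}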
This is the solution, but it is not in convenient form. What follows is a brief summary of the algebra needed to convert it.

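\begin{align*}
P &= (M - P)\left(\frac{P_0}{M - P_0}\right)e^{kt} \tag{4.76}\\
P &= M\left(\frac{P_0}{M - P_0}\right)e^{kt} - P\left(\frac{P_0}{M - P_0}\right)e^{kt} \tag{4.77}\\
P + P\left(\frac{P_0}{M - P_0}\right)e^{kt} &= M\left(\frac{P_0}{M - P_0}\right)e^{kt} \tag{4.78}\\
P\left(1 + \left(\frac{P_0}{M - P_0}\right)e^{kt}\right) &= M\left(\frac{P_0}{M - P_0}\right)e^{kt} \tag{4.79}\\
P &= \frac{M\left(\frac{P_0}{M - P_0}\right)e^{kt}}{1 + \left(\frac{P_0}{M - P_0}\right)e^{kt}} \tag{4.80}
\end{align*}
Dividing top and bottom by the last term on the bottom gives a nicer form:
\[ \boxed{P(t) = \frac{M}{1 + \frac{M - P_0}{P_0}\,e^{-kt}}} \tag{4.81} \]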
This is one convenient form of the logistic equation. A last bit of adjusting is common here. Take
\[ W = \frac{M - P_0}{P_0} \qquad\text{and}\qquad t_0 = \frac{\ln W}{k} \]
and get for the second term on the bottom
\begin{align*}
\frac{M - P_0}{P_0}\,e^{-kt} &= W e^{-kt} \tag{4.82}\\
&= e^{\ln W}\,e^{-kt} \tag{4.83}\\
&= e^{\ln W - kt} \tag{4.84}\\
&= e^{-k\,(t - (\ln W)/k)} \tag{4.85}\\
&= e^{-k\,(t - t_0)} \tag{4.86}
\end{align*}
Putting this into the formula for P(t) gives another final form.
\[ \boxed{P(t) = \frac{M}{1 + e^{-k\,(t - t_0)}}} \tag{4.87} \]
Obviously, this is a much more complicated problem, but the answer (in the form given last) is not too bad. The differential
equation we solved is called the logistic equation, and turns out to be remarkably accurate over long periods of time. It is
handy in predicting the spread of rumors, or diseases, or technology, as well as predicting populations.
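The first boxed form can be checked against the differential equation in Maple. This is a sketch that assumes M, P0, and k have not been given values earlier in the session, so they stay symbolic; P is defined here as a Maple function for illustration.

```maple
# First boxed form of the solution, with M, P0, k left symbolic
P := t -> M/(1 + ((M - P0)/P0)*exp(-k*t)):

# Left side minus right side of dP/dt = k P (1 - P/M); should reduce to 0
simplify(diff(P(t), t) - k*P(t)*(1 - P(t)/M));

# Initial condition; should reduce to P0
simplify(P(0));
```

If both lines come out as expected, the formula satisfies the logistic equation and starts at the right population, which is exactly what a solution has to do.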
The form of the logistic equation that you use is determined by the information you are given or want, much like
deciding which form of a line to use. You will usually have the data M, k, and P0 , and if so, you use the first boxed equation
to get the equation. On the other hand, if you then rewrite the equation into the second form using the equations for W and
t0 , you can also read off the value of t0 , which is a useful number to have. (See the homework.)
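To see that the two boxed forms describe the same curve, they can be compared numerically. The data in this sketch (M = 1000, P0 = 100, k = 0.2) are hypothetical values chosen only for illustration; they are not the rabbit data from the text or the homework, and the names Pa and Pb are introduced here.

```maple
M := 1000: P0 := 100: k := 0.2:      # hypothetical data, not from the text
W := (M - P0)/P0;                    # here W = 9
t0 := ln(W)/k;
Pa := t -> M/(1 + W*exp(-k*t)):             # first boxed form
Pb := t -> M/(1 + exp(-k*(t - t0))):        # second boxed form
evalf(Pa(10)), evalf(Pb(10));               # the two values should agree
```

Trying a few other values of t in the last line is a reasonable way to convince yourself that the rewriting with W and t0 changed the appearance of the formula and nothing else.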
Using the logistic equation, two men named Pearl and Reed predicted in 1920 that the population of the United States
was given by
\[ P(t) = \frac{197{,}273{,}522}{1 + e^{-0.03134\,(t - 1914.32)}} \]
This fitted the population of the U.S. for the entire range of 1790 to 1910 to an accuracy of 4% (which the constants were chosen to do), but remained that accurate all the way through the 1950 census.
Albert: But that was a major success. You have to realize that on the basis of three values (the populations in 1790, 1850, and 1910) the populations at thirteen censuses were approximated exceedingly accurately.
From 1960 on, the model has failed rather badly. Why? Probably because the maximum stable population (given by M) has increased substantially since 1950 due to technological progress. It is possible to fit new constants (M, k, and $t_0$) to the population figures from 1960, 1970, and 1980. (Three equations, three unknowns.) The result is (due to Maple!)
\[ P_1(t) = \frac{363{,}447{,}466}{1 + e^{-0.0265\,(t - 1961.00)}} \tag{4.88} \]
How well does it work? If you evaluate it, $P_1(1990) = 248{,}319{,}646$, for a relative error of 0.16% versus the actual 1990 census. Not bad.
Mugsy: Hey! I like that!
The 2000 census figures are still under some debate, but seem to have settled down. The actual U.S. population
according to the 2000 census was 281,421,906. The predicted number is 268,076,627, for a percentage error of 4.7%.
Why was the error so big? There are several reasons. First, there was a major push (for political, not humanitarian,
reasons, in my opinion) to legalize a large number of illegal immigrants. The Hispanic population of this country swelled
enormously, and we have not yet seen the effects of that played out.
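The evaluations quoted above can be repeated in Maple. This sketch uses the constants of equation (4.88) as printed and the 2000 census count quoted in the text; the names p1990 and p2000 are introduced here for illustration. Because the printed constants are rounded, the results may differ from the quoted figures in the last few digits.

```maple
P1 := t -> 363447466/(1 + exp(-0.0265*(t - 1961.00))):
p1990 := P1(1990);      # compare with the 248,319,646 quoted in the text
p2000 := P1(2000);      # compare with the 268,076,627 quoted in the text

# Relative error against the 2000 census count quoted in the text
abs(p2000 - 281421906)/281421906;
```

The last line should come out near 0.047, which is the 4.7% error discussed above.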
Another reason for such a large error is that we used data points that were too close together and tried to extrapolate
too far with them. We can remedy that by taking a different set of data points to get the constants in the logistic equation.
Using the data from 1960, 1980, and 2000 fits better. (Note that this is what Pearl and Reed did as well. They chose 1790,
1850, and 1910 as the years for fitting the data, picking the years that were as far apart as possible.) If you grind through
this (again, with Maple’s help!) using 1960, 1980, and 2000, you get this equation.
> solve( {subs({P=281421906,t=2000}, eqn), subs({P=226542199,t=1980}, eqn),
>         subs({P=179323175,t=1960}, eqn)} ):
> evalf(%):
> assign(%):
> rhs(eqn);
\[ \frac{M}{1 + e^{K\left(-\frac{1}{2} - \frac{\sqrt{1 + 4x}}{2} - t0\right)}} \]
The error in 1970 is 0.67% and the error in 1990 is 1.75%. Much better. However, the interpretation of that equation is a bit strange. See the homework.
Homework #40
Exercises.
1. Find the partial fractions expansion of the following functions. (The use of Maple is virtually mandatory, unless you are seriously masochistic. I made no attempt to end up with nice coefficients here.)
(a) $\frac{1}{(x + 2)(x - 3)(x + 4)(x - 5)}$
(b) $\frac{4x + 11}{(x - 3)(x^2 + 9)}$
(c) $\frac{4x + 11}{(x - 3)^2 (x^2 + 9)}$
(d) $\frac{4x + 11}{(x - 3)(x^2 + 9)^2}$
(e) $\frac{4x + 11}{(x - 3)^2 (x^2 + 9)^2}$
2. Find the partial fractions expansion of the following functions. Again, Maple is needed.
(a) $\frac{1}{(x + 1)(x - 2)(x - 3)(x + 4)}$
(b) $\frac{11x + 4}{(x - 3)(x^2 + 9)}$
(c) $\frac{11x + 4}{(x - 3)^2 (x^2 + 9)}$
(d) $\frac{11x + 4}{(x - 3)(x^2 + 9)^2}$
(e) $\frac{11x + 4}{(x - 3)^2 (x^2 + 9)^2}$
3. (a) Give the set-up for the partial fractions expansion of the following function.
\[ \frac{(?)}{x^2 (x - 6)^4 (x + 1)(x^2 + 4)(x^2 + 12)^2} \]
(b) What is the condition on the degree of the numerator (?) for this function to be a proper rational function?
4. For this exercise, use the rabbit data (k = 0.1, P0 = 50), together with a limiting population of M = 500 for logistic
equations. It would be useful to refer back to the notes for the derivation of the logistic equation for this problem.
(See the index or table of contents.)
(a) Find the value of W .

(b) Find the corresponding t0 .
(c) Write down the logistic equation for the data of this exercise.
(d) Calculate the value for P(t) at t = 10 and compare it to the value obtained by a simple exponential growth
model. (You can look up exponential growth in the contents or index also.)
(e) Find the rabbit population from the logistic equation at t = 50. Compare that value to the limiting population.
Why should they be close?
(f) Compare it also to the value obtained with exponential growth at t = 50.
5. In this exercise, we look at the logistic equations for the U.S. populations as derived in the notes for the years
1960–1970–1980 and for the years 1960–1980–2000. (See the boxed equation on pages 234 and 235.)
(a) What is the maximum U.S. population that the data suggest for these two equations?
(b) Why do you think that the maximum populations are so different?
(c) Compare both specific growth rates (the $k$s) to each other and to the specific growth rate that Pearl and Reed
obtained.
6. Using the two logistic equations in the notes (1960–1970–1980, and 1960–1980–2000), predict the population of the
United States in the year 2010. Would it be reasonable to try to decide between those two models on the basis of how
close they were to the real data in 2010?
7. Decompose the following into partial fractions. Work these out by hand, don’t just set them up.
(a) $\dfrac{4x - 13}{2x^2 + x - 6}$
(b) $\dfrac{13x + 5}{3x^2 - 7x - 6}$
(c) $\dfrac{7x^2 - 29x + 24}{(2x - 1)(x - 2)^2}$
(d) $\dfrac{x^2 - 17x + 35}{(x^2 + 1)(x - 4)}$
8. Integrate the following (which should look familiar).
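(a) $\displaystyle\int \frac{4x - 13}{2x^2 + x - 6}\,dx$
(b) $\displaystyle\int \frac{13x + 5}{3x^2 - 7x - 6}\,dx$
(c) $\displaystyle\int \frac{7x^2 - 29x + 24}{(2x - 1)(x - 2)^2}\,dx$
(d) $\displaystyle\int \frac{x^2 - 17x + 35}{(x^2 + 1)(x - 4)}\,dx$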
Problems.
1. Use the second boxed form of the logistic equation for this problem.
(a) Show that $P(t_0) = M/2$.
(b) Show that $P''(t_0) = 0$. (This requires a bunch of algebra. Maple comes in very handy for this one.)
(c) The second derivative is the derivative of the first derivative.

Dudley: Even I remember that.
What can you say about any function (and we will think of that function as being $P'(t)$ in a moment) where its
derivative is zero? (Think about what we did in the chapter on derivatives.)
(d) On the basis of the first and immediately previous parts of this problem, give two different facts about the
population at t = t0 .
Investigations.
1. The modern philosopher Jean Jacques Rousseau formulated a simple model of population growth for 18th century
England based on the following (true) observations:
• The birth rate in London is smaller than the birth rate in rural England.
• The death rate in London is greater than the death rate in rural England.
• As England industrializes, more and more people are moving from rural England to London.
Rousseau then reasoned that the population of England would eventually decline to zero. Let’s show that his
conclusion is wrong. Let $b_R$, $b_L$, $d_R$, $d_L$, and $m$ be the birth rates in rural England and London, the death rates in rural
England and London, and the net rate people are moving to London from rural England.
(a) Write down the equations giving the rate of population change in rural England and London. (Think about what
increases or decreases population in the different areas, and include the relevant rates from the beginning of this
problem with the correct signs.)
(b) Come up with a set of values for the constants that satisfy Rousseau’s observations, but make both rates of
population change positive.
2. In this question, we want to look at the rates of change of population, $P$, given by the logistic equation
\[ \frac{dP}{dt} = kP\left(1 - \frac{P}{M}\right) \]
for various population sizes. We always assume $P(t) > 0$. ($P(t) = 0$ is boring, and
$P(t) < 0$ makes no sense.)
(a) Suppose P(t) < M at some time t. Show that dP/dt > 0 at that time. Do this by looking at the signs of the
factors in the formula for dP/dt.
(b) Use the previous part to show that for populations below the maximum population, M, the population grows.
(c) Suppose P(t) > M at some time t. Show that dP/dt < 0 at that time.
(d) Use the previous part to show that for populations larger than the maximum population, M, the population
decreases.
3. This problem is like the previous one, with an extra term. Assume that there is not only a maximum sustainable
population, M, but also a minimum survivable population, m. Assume m < M. The adjusted equation is

\[ \frac{dP}{dt} = kP\left(1 - \frac{P}{M}\right)\left(\frac{P}{m} - 1\right) \]
We want to check that it does what we expect.
(a) Give the signs of dP/dt for the intervals 0 < P < m, m < P < M, and M < P. (Figure out the signs of each of
the factors in the equation just as in the previous problem.)
(b) On the basis of this, what would you predict if the population P ever drops below m?
(c) What would be the implication of this model for fishing, hunting, etc.?
(d) Use Maple to solve this adjusted equation. (Don’t plug in initial conditions. If you try, Maple gives up.)
Integration by parts.
Even though the product rule for derivatives doesn’t give a product rule for integrals, it does give a rule, called integration
by parts, or often just “parts.” It is used on a different set of integrals than substitution or partial fractions.
Differential formula. Parts is based on the product rule for differentials: $d(uv) = u\,dv + v\,du$, where $u$ and $v$ are any
functions we want. Solve for $u\,dv$ and integrate, and you get $\int u\,dv = \int d(uv) - \int v\,du$ and finally
\[ \int u\,dv = u v - \int v\,du \tag{4.89} \]
That is the differential form of integration by parts, and probably the easiest form in which to remember it.
Note one thing. It takes one integral ($\int u\,dv$) and gives another ($\int v\,du$).
Dudley: Didn’t substitution take one integral and give another also?
Albert: Very good. But that is where the similarity ends. The whole idea, then, is to choose u and v strategically so
that the second integral actually is easier. The objective for both substitution and parts is the same: Get an integral that is
closer to being able to be worked! The basic formulas we have are the only ones that give genuine answers, rather than
more integrals to be worked.
For example, let’s work
\[ \int x e^x\,dx \]
Look at the integration by parts formula. We start with an integral $\int u\,dv$, and we end with an integral $\int v\,du$. In going from
the first to the second, the $u$ became a $du$. In other words, we will end up differentiating the $u$-part of the integral. Also, in
going from the first to the second, the $dv$ becomes a $v$. In other words, we will end up integrating the $v$-part of the integral.
When we look then at the example,
\[ \int x e^x\,dx \]
we end up having to decide which part to integrate and which part to differentiate. The $e^x\,dx$ would be easy to integrate,
and the $x$ would be simple to differentiate. So, let’s try it. Set $u = x$, and set $e^x\,dx = dv$. Then $du = dx$, and $v = e^x$. (All
right, it should be $v = e^x + C$, but we’ll see in the homework that we can ignore the $+C$ at this point of integration by parts.)
The integration by parts formula gives
\[ \int x e^x\,dx = x e^x - \int e^x\,dx = x e^x - e^x + C \]
It is easy to check that this is correct. (How?)
How to apply it. But you might argue that it is just as easy to differentiate $e^x$ as it is to integrate it, and certainly integrating
$x\,dx$ is no difficulty either. Why use that particular choice of $u$ and $dv$? We’ll get to that next.
It is convenient to establish three categories of functions:
1. logs and inverse trigonometric functions
2. polynomials and powers of x
3. exponentials, sines, and cosines
Every function we deal with has factors that fit into one or more of these categories.
There are a couple of advantages of remembering this table. First, when do you use integration by parts? In any of three
cases:
1. a product of functions from different categories
2. a single term from the first category
3. a product of terms all from the third category.

This is very handy to know. You don’t want to use integration by parts unless you have to, and this tells you when you have
to. (Alternatively, there are very few instances where you can use both substitution and integration by parts. If substitution
doesn’t look promising, integration by parts is a good next guess.)
Second, when you do have to use integration by parts, you differentiate the lower-numbered category part (that is, let
$u$ = that term), and integrate the higher-numbered category part (that is, let $dv$ = that term together with the differential).
In the example, $\int x e^x\,dx$, the $x$ fits in category 2 and the $e^x$ fits in category 3. Since this is a product of different
categories, integration by parts is the method. Since $x$ is in the lower-numbered category (that is, category 2), use $u = x$,
and since $e^x$ is in the higher-numbered category (that is, category 3), use $dv = e^x\,dx$.
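The category rule is mechanical enough to write down as a few lines of code. Here is a minimal sketch in Python (not Maple, and not part of the student package). The function name choose_parts and the idea of tagging each factor with its category number by hand are assumptions made just for this illustration; the code does not parse formulas.

```python
# Category numbers follow the three-category table in the text:
# 1 = logs and inverse trig, 2 = polynomials and powers of x,
# 3 = exponentials, sines, and cosines.
def choose_parts(factors):
    """Pick u and dv for a product of two factors.

    `factors` is a list of two (name, category) pairs; the names are
    plain labels, not parsed expressions. Returns (u, dv_factor).
    """
    (name_a, cat_a), (name_b, cat_b) = factors
    if cat_a == cat_b and cat_a != 3:
        raise ValueError("same category: the rule does not call for parts")
    # Differentiate the lower-numbered category, integrate the higher one.
    if cat_a <= cat_b:
        return name_a, name_b
    return name_b, name_a


# The example from the text: x is category 2, exp(x) is category 3.
u, dv = choose_parts([("exp(x)", 3), ("x", 2)])
print("u =", u, " dv =", dv, "dx")
```

The order in which you list the factors does not matter; the category numbers decide.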
If the integrand is a single term from the first category, treat it as a product of 1 × the integrand, and treat the 1 as being
a polynomial. That makes it a function from the first category times a function from the second category, so $u$ = original
function, and $dv = 1\,dx$ in the integration by parts formula. For example,
\[ \int \ln x\,dx \]
is integrated by letting $u = \ln x$ and $dv = dx$, so that $du = (1/x)\,dx$ and $v = x$. This gives
\begin{align}
\int \ln x\,dx &= x \ln x - \int x(1/x)\,dx \tag{4.90} \\
&= x \ln x - \int dx \tag{4.91} \\
&= x \ln x - x + C \tag{4.92}
\end{align}
This can also be checked.
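One way to do that check, sketched here with nothing beyond the product rule, is to differentiate the answer and see whether the integrand comes back:

```latex
\frac{d}{dx}\bigl(x \ln x - x + C\bigr)
  = \underbrace{1 \cdot \ln x + x \cdot \frac{1}{x}}_{\text{product rule on } x \ln x} - 1 + 0
  = \ln x + 1 - 1
  = \ln x
```

Since the derivative is the original integrand, the antiderivative is correct.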
The last case, of products of functions all from category 3, will be dealt with right after showing how this works on
Maple.
Maple will help you with integration by parts. First, you need to load the student package by the command with(student);
You then can use the command intparts(); to get a single step of integration by parts. You will need the inert form of
integration, Int(); again, or else Maple will just integrate it by parts by itself. You need to tell it the integral, and the
function u (to be differentiated). Maple assumes the rest of the integrand and differential are dv, and works the integration
by parts for you (if it can). It gives back the next step, again without further evaluation. So, for example, if you have $\int x e^x\,dx$,
you would proceed this way.
> with(student):
> Int(x*exp(x),x);
> intparts(%,x);
> value(%);
\[ \int x e^x\,dx \]
\[ x e^x - \int e^x\,dx \]
\[ x e^x - e^x \]
That’s the answer. But Maple still won’t put in the +C. Maple could have done the whole integral from the start,
though.
> int( x*exp(x), x);
\[ (-1 + x)\,e^x \]
At any point, you can type
value(%);
and Maple will then go ahead and work the integral out for you, as shown. And if you forget to use the inert form and use
int();,
Maple just grinds it out, also as shown.
Multiple integrations by parts. Sometimes, when you use integration by parts, you get another integral that is also a
candidate for integration by parts. In that case, go ahead and use it!
Mugsy: I thought things like this were outlawed with the Inquisition.
Straightforward application. For example, if we started with $\int x^2 e^x\,dx$, one integration by parts (with $u = x^2$ and
$dv = e^x\,dx$, so $du = 2x\,dx$ and $v = e^x$) would give
\[ \int x^2 e^x\,dx = x^2 e^x - \int 2x e^x\,dx \]
and the integral that’s left would also require an integration by parts. In fact, it is twice the integral we worked out earlier.
The net result would be that
\[ \int x^2 e^x\,dx = x^2 e^x - 2\,(x e^x - e^x) + C \]
A few words of caution: Be careful of negative signs! They crop up all over, and it’s easy to miss one. The same goes for
factors that come out of the integrals (such as the 2 in the example).
Dudley: Multiple minus signs! AUGH!
Also, avoid circular work, where everything cancels. An example of that phenomenon is given in the homework. The
way to avoid the circular work trap is always to use these guidelines about what u and v should be. Or in the most general
situation, be careful never to let u in one integration by parts be the result of integrating the dv of the previous step. Doing
so reverses the progress made in the previous integration by parts step.
Mugsy: I just hate it when that happens.
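A dropped minus sign or a lost factor of 2 is easy to catch numerically. The sketch below is in Python rather than Maple, and the helper name midpoint_sum is made up for the illustration. It compares $F(1) - F(0)$ for the antiderivative of $x^2 e^x$ found above with a brute-force sum of thin rectangles over the interval from 0 to 1. If the two numbers disagree beyond the last few digits, something went wrong in the algebra.

```python
import math


def midpoint_sum(f, a, b, n):
    """Approximate the integral of f on [a, b] with n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def integrand(x):
    return x**2 * math.exp(x)


def antiderivative(x):
    # The result from the text, without the +C (it cancels in F(b) - F(a)).
    return x**2 * math.exp(x) - 2 * (x * math.exp(x) - math.exp(x))


exact = antiderivative(1.0) - antiderivative(0.0)   # works out to e - 2
approx = midpoint_sum(integrand, 0.0, 1.0, 1000)
print(exact, approx)
```

Try flipping one sign inside antiderivative and rerunning; the mismatch shows up immediately.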
There is one handy procedure for doing integration by parts multiple times very rapidly, as long as the integrand is
of the form (polynomial) × (a single category 3 function). Essentially, the procedure is a systematized formulation of
multiple integration by parts. It is not another method, but a slick way of organizing integration by parts. The procedure
is called tabular integration by parts, and works this way. You will use two columns. The polynomial is put at the top of
one column and the rest of the integrand (not including the differential) is put at the top of the other column. (It should
be a single category 3 function, which should be easy to integrate repeatedly.) Successive derivatives of the polynomial
are put on successive rows in its column, stopping when you reach a derivative of 0. Then fill out the other column with
successive integrals of that function, going down row by row in parallel with the derivatives until you run out of rows on
the polynomial side. From this, you can write down the integral. Take each row in the polynomial column, and multiply it
by the entry in the other column that is one row down from it. Then add up all the products with alternating signs (plus, then
minus, then plus, then minus, etc.), starting with positive. (That is, you take the first product with whatever sign it has, take
the second product and change its sign, take the third product with its sign, the fourth product with sign changed, etc.)
(Why do we use one row down rather than products straight across? Because we must do one more integral
than derivative in order to evaluate the integral. Why use alternating signs? Because the integration by parts formula has
this subtraction in it, and this method is exactly the same as integration by parts, just put together to aid in calculations.)
An example will help.
Mugsy: You said it!
Let’s work
\[ \int (x^3 + 4x^2 + 6x) \cos(2x)\,dx \]
The columns are headed by $x^3 + 4x^2 + 6x$ and $\cos(2x)$. The table for the integration by parts is then
Polynomial              Other function
$x^3 + 4x^2 + 6x$       $\cos(2x)$
$3x^2 + 8x + 6$         $(1/2) \sin(2x)$
$6x + 8$                $-(1/4) \cos(2x)$
$6$                     $-(1/8) \sin(2x)$
$0$                     $(1/16) \cos(2x)$
The successive products are then (remember to use “down one!”):
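\begin{align*}
&(x^3 + 4x^2 + 6x) \times (1/2) \sin(2x) \\
&(3x^2 + 8x + 6) \times (-(1/4) \cos(2x)) \\
&(6x + 8) \times (-(1/8) \sin(2x)) \\
&(6) \times (1/16) \cos(2x)
\end{align*}
The integral is then the sum of these with the correct signs:
\begin{align*}
&(x^3 + 4x^2 + 6x) \times (1/2) \sin(2x) \\
&\quad - (3x^2 + 8x + 6) \times (-(1/4) \cos(2x)) \\
&\quad + (6x + 8) \times (-(1/8) \sin(2x)) \\
&\quad - (6) \times (1/16) \cos(2x) \\
&= \left(\frac{1}{2} x^3 + 2x^2 + \frac{9}{4} x - 1\right) \sin(2x) + \left(\frac{3}{4} x^2 + 2x + \frac{9}{8}\right) \cos(2x) + C
\end{align*}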
This procedure is substantially faster and easier than regular integration by parts, but it is limited in which integrals it
will work with. (Did you notice that I put in the +C? It really does need to be there!)
Dudley: Hey. That’s not too bad, actually. Except for all the minus signs.
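Because the tabular method is pure bookkeeping, it is a natural thing to hand to a computer. What follows is a rough sketch in Python (not Maple) for integrands of the form (polynomial) × cos(ax) only. The function names and the choice to store a polynomial as a list of coefficients, lowest power first, are assumptions made for this illustration. Exact fractions are used so the coefficients can be compared with the hand calculation above.

```python
from fractions import Fraction


def derivative(poly):
    """Differentiate a polynomial given as coefficients [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(poly)][1:]


def integrate_trig(coeff, kind, a):
    """One antiderivative of coeff*cos(a x) or coeff*sin(a x)."""
    if kind == "cos":
        return coeff / a, "sin"      # integral of cos(ax) is sin(ax)/a
    return -coeff / a, "cos"         # integral of sin(ax) is -cos(ax)/a


def tabular_poly_cos(poly, a):
    """Integrate poly(x)*cos(a x) by the tabular method.

    Returns (sin_poly, cos_poly): the polynomial multiplying sin(a x)
    and the one multiplying cos(a x) in the answer (the +C is omitted).
    """
    result = {"sin": [Fraction(0)] * len(poly),
              "cos": [Fraction(0)] * len(poly)}
    row = [Fraction(c) for c in poly]           # polynomial column
    coeff, kind = Fraction(1), "cos"            # other column
    sign = 1
    while row:                                  # stop when the derivative is 0
        coeff, kind = integrate_trig(coeff, kind, a)    # "down one"
        for k, c in enumerate(row):
            result[kind][k] += sign * coeff * c
        row = derivative(row)
        sign = -sign                            # alternating signs
    return result["sin"], result["cos"]


# x^3 + 4x^2 + 6x, lowest degree first, times cos(2x)
sin_poly, cos_poly = tabular_poly_cos([0, 6, 4, 1], 2)
print(sin_poly)
print(cos_poly)
```

Reading the two lists from lowest power up should reproduce the coefficients of the sin(2x) and cos(2x) terms in the answer worked out by hand.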
Solving for the integral. Occasionally, the trick is to get the resulting and initial integrals of integration by
parts to match, and then solve for that integral. One example is $\int \sec^3 x\,dx$, which can be evaluated this way. First, we split the $\sec^3 x$
into $(\sec x)(\sec^2 x)$, and use a regular integration by parts. Then a trigonometric identity allows us to get a $\sec^3 x$
back. Watch.
\[ \int \sec^3 x\,dx = \int \sec x \sec^2 x\,dx \]
Let $u = \sec x$, $dv = \sec^2 x\,dx$, so $du = \sec x \tan x\,dx$, and $v = \tan x$, so integration by parts gives (with the trigonometric identity
$\tan^2 x = \sec^2 x - 1$)
\begin{align}
\int \sec^3 x\,dx &= \sec x \tan x - \int \sec x \tan^2 x\,dx \tag{4.93} \\
&= \sec x \tan x - \int \sec x (\sec^2 x - 1)\,dx \tag{4.94} \\
&= \sec x \tan x - \int (\sec^3 x - \sec x)\,dx \tag{4.95} \\
&= \sec x \tan x - \int \sec^3 x\,dx + \int \sec x\,dx \tag{4.96} \\
2 \int \sec^3 x\,dx &= \sec x \tan x + \ln|\sec x + \tan x| + C \tag{4.97} \\
\int \sec^3 x\,dx &= \frac{1}{2} \sec x \tan x + \frac{1}{2} \ln|\sec x + \tan x| + C \tag{4.98}
\end{align}
I snuck in the integral $\int \sec x\,dx = \ln|\sec x + \tan x| + C$, which we did in the homework (see the substitution section of this
chapter). The Maple commands that duplicate this are:
> with(student):
> Int(sec(x)^3, x);
\[ \int \sec(x)^3\,dx \]
Don’t forget the capital “I.”
> intparts(%, sec(x));
\[ \frac{\sec(x) \sin(x)}{\cos(x)} - \int \frac{\sec(x) \tan(x) \sin(x)}{\cos(x)}\,dx \]
You differentiate one factor of sec x and integrate the other two. Next, we convert everything to sines and cosines.
> expand(simplify(%));
\[ \frac{\sin(x)}{\cos(x)^2} - \int \frac{\sin(x)^2}{\cos(x)^3}\,dx \]
Our next Maple step pulls out a trick we haven’t seen before. It is a new command powsubs();. It operates much like
subs();, except that it also works right for powers. In the example here, powers of 1/cos(x) will convert to powers of
sec(x). It is defined in the student package.
> powsubs( 1/cos(x) = sec(x), % );
\[ \sec(x)^2 \sin(x) - \int \sec(x)^3 \sin(x)^2\,dx \]
Next, we solve for the integral. Again, note the capital “I”’s.
> solve( Int(sec(x)^3,x) = %, Int(sec(x)^3,x) );
\[ \sec(x)^2 \sin(x) - \int \sec(x)^3 \sin(x)^2\,dx \]
We’re almost done. Just that last pesky integral.
> value(%);
\[ \sec(x)^2 \sin(x) - \frac{1}{2}\,\frac{\sin(x)^3}{\cos(x)^2} - \frac{1}{2} \sin(x) + \frac{1}{2} \ln(\sec(x) + \tan(x)) \]
The same process will integrate $e^{ax} \cos(bx)$ or $e^{ax} \sin(bx)$, except that two integrations by parts are needed to get back to
the original integral. To avoid cancellation in your integral, I would suggest using $u = e^{ax}$ for both integrations by parts.
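Here is how that plays out for $e^{ax} \cos(bx)$, as a worked sketch. It assumes $b \neq 0$, uses $u = e^{ax}$ in both steps as suggested, and writes $I$ for the integral we are after (the name $I$ is just shorthand introduced here).

```latex
\begin{align*}
I = \int e^{ax}\cos(bx)\,dx
  &= \frac{1}{b}\,e^{ax}\sin(bx) - \frac{a}{b}\int e^{ax}\sin(bx)\,dx
  && (u = e^{ax},\ dv = \cos(bx)\,dx) \\
  &= \frac{1}{b}\,e^{ax}\sin(bx)
     - \frac{a}{b}\left(-\frac{1}{b}\,e^{ax}\cos(bx) + \frac{a}{b}\,I\right)
  && (u = e^{ax},\ dv = \sin(bx)\,dx) \\
  &= \frac{1}{b}\,e^{ax}\sin(bx) + \frac{a}{b^2}\,e^{ax}\cos(bx) - \frac{a^2}{b^2}\,I \\
\left(1 + \frac{a^2}{b^2}\right) I
  &= \frac{e^{ax}\bigl(b\sin(bx) + a\cos(bx)\bigr)}{b^2} \\
I &= \frac{e^{ax}\bigl(a\cos(bx) + b\sin(bx)\bigr)}{a^2 + b^2} + C
\end{align*}
```

Just as with $\sec^3 x$, the original integral reappears on the right, and you solve for it.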
4.2.2 What to do if you must evaluate an integral exactly.

Every calculus student at some point confronts an integral that must be worked exactly but isn’t part of a section on a specific integral
technique, and a troubling truth sinks in slowly: It isn’t obvious where to start.
Albert: This is particularly discouraging on something like a test, where it often hits.
Mugsy: When that hits, I hit back. Hard.
Albert: I’m assuming that you mean you put extra effort into solving that problem on the test.
Mugsy: Yeah. Sure.
Here’s a brief summary of methods to use.
Look for “obvious” substitutions.

The first thing to do is to see if some substitution will work. If you have an exponential with an exponent that is not
something like k x, you basically must substitute u = exponent. We just don’t have any other way of integrating that type of
function. Basically the same goes for the arguments of sines, cosines, and all the other functions we had.
If the derivative of one portion of the integrand is also a factor in the integrand, you should definitely consider using
u = portion whose derivative is present.
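A small worked case, chosen here as an illustration, shows both hints at once: the exponent is not of the form $kx$, and its derivative (up to a constant factor) is sitting in the integrand.

```latex
\int x\,e^{x^2}\,dx
  \quad\text{with } u = x^2,\ du = 2x\,dx
  \quad\Longrightarrow\quad
  \frac{1}{2}\int e^{u}\,du = \frac{1}{2}\,e^{u} + C = \frac{1}{2}\,e^{x^2} + C
```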
Decide if partial fractions or integration by parts can work.

After looking at substitutions, you should check if the function is in some special form that has a method for it. Partial
fractions works only on rational functions, that is, ones that are the quotient of polynomials. Any other type of function,
and you need to move on. And, in order for partial fractions to work, the denominator should either be factored or easily
factorable.
Integration by parts works when you have either the product of two different types of functions or a single type of certain
functions. (See the discussion on integration by parts earlier for details.) It can also work in many other situations, but it
requires truly clever choices of u and dv.
“Non-obvious” substitutions.
If you have a quadratic, look to see if completing the square will put it into the form $\pm u^2 \pm a^2$. Then a trigonometric
substitution (see a long and painful investigation a few sections back) will probably help. Beyond that, you can also try
Maple, especially more recent releases. (Maple is continually refining its integration techniques, because they are used so
much!)
Homework #41
Exercises.
1. Work the following integrals. You can check your answer on Maple, but you need to be able to do these by hand, too!
(a) $\int x^2 \sin(4x)\,dx$
(b) $\int x^8 \ln(7x)\,dx$
(c) $\int (\operatorname{Arcsin} x)^2\,dx$ (A challenge!)
(d) $\int x^2 \sin(x^3)\,dx$
(e) $\int x^2 \sin(x)\,dx$
(f) $\int \sin x\, e^{\cos x}\,dx$
(g) $\int x^8 \sqrt{x^3 + 1}\,dx$ (Hint: $x^8 = (x^2)(x^3)^2$.)
2. Work the following integrals.

(a) $\int x^2 \cos(3x)\,dx$
(b) $\int x^5 \ln(6x)\,dx$
(c) $\int \operatorname{Arcsec} x\,dx$ (Hint: Look at what you did in integration by parts to find a substitution.)
(d) $\int x^2 \cos(x^3)\,dx$
(e) $\int x^2 \cos(x)\,dx$
(f) $\int \cos x\, e^{\sin x}\,dx$
(g) $\int x^7 \sqrt{x^4 + 1}\,dx$ (Hint: $x^7 = (x^4)(x^3)$.)
3. For each of the following integrals, give the best method to integrate it “by hand,” just the first step. (“Maple” is not
an answer here!)
Mugsy: Rats.
You do not have to evaluate them. Just say how you would do them, if you had to.
(a) $\int x^4 \cos x\,dx$
(b) $\int x^5 e^{x^3}\,dx$
(c) $\displaystyle\int \frac{x^2 - 1}{x^3 - 3x}\,dx$
(d) $\int (\ln x)^6\,dx$
(e) $\displaystyle\int \frac{3x^2 - 5x + 9}{(x + 3)^5 (x^2 + 1)^4}\,dx$
4. For each of the following integrals, give the best method to integrate it “by hand,” just the first step.
(a) $\int x^4 e^x\,dx$
(b) $\int x^5 \cos(x^3)\,dx$
(c) $\displaystyle\int \frac{x + 1}{x^2 + 2x}\,dx$
(d) $\int (\ln x)^8\,dx$
(e) $\displaystyle\int \frac{5x^2 - 3x + 4}{(x - 3)^4 (x^2 + 81)^3}\,dx$
5. How do you check if an indefinite integral is correct? Use that to check that $\int x e^x\,dx = x e^x - e^x + C$ is correct.
Problems.
1. This shows you what happens if you use the wrong assignments in an integration by parts done twice. Start with
$\int x^2 e^x\,dx = x^2 e^x - 2 \int x e^x\,dx$ (obtained by integration by parts with the right assignments of $u = x^2$ and $dv = e^x\,dx$),
and then use the wrong assignments for $u$ and $dv$ (that is, $u = e^x$ and $dv = x\,dx$) in the right-hand side integral, and
show that you end up with $\int x^2 e^x\,dx = \int x^2 e^x\,dx$.
2. Work $\int (x^3 + 4x^2 + 6x) \cos(2x)\,dx$ by the usual integration by parts formula (repeatedly), and compare the successive
terms obtained with the terms that occurred in the tabular organization in the notes.
3. In this problem, we examine an integral that earlier versions of Maple couldn’t work. The moral of this problem is
that you still need to know how to use the methods (substitution, integration by parts, etc.) even if you do have Maple
available to you. The newer versions of Maple (like the one in the computer lab) can handle this! But for this problem,
pretend Maple can’t do it. (Simulate that by using Int(); rather than int();.) Type
Int(sqrt(1+x^4)/x,x);
on Maple in the computer lab and proceed with this problem.
(a) If we recognize that $1 + x^4$ fits into the pattern $a^2 + u^2$, with $a = 1$ and $u = x^2$, we can put these into the
substitution form $u = a \tan\theta$ and get a substitution $x^2 = \tan\theta$. To tell Maple to make the substitution, we have
to load the substitution (change-of-variables) routine, by typing with(student): Then type
changevar(x^2=tan(theta),%%,theta);.
(The doubled percent signs are needed because the
with(student):
was actually immediately before, and we want to go back to the integral.) Then type
value(%);
to get Maple to finish the integral.
(b) Note that the integral is in terms of θ , not x, which is the original variable. To get back, type
subs(tan(theta)=x^2,%);
and Maple gives the answer in terms of x.
Dudley: Hey, Al. Am I supposed to get an Arctanh?
Albert: Didn’t think you’d ever see that, huh? Try typing in convert(%,ln) to see what happens when
you use logarithms. You’ll see why Arctanh is used.
(c) Differentiate (on Maple) the answer from the previous part, simplify it with normal(%); (you might also need
factor();), and check that it is equal to the original function.
Investigations.
1. In this question, we explore the formula $\int x e^x\,dx = x e^x - e^x + C$. We almost could have “guessed” that the formula
would have to be like this. (Of course, this requires inspired guessing.) We will attempt to reconstruct the integral by
finding a function whose derivative is $x e^x$.
(a) You know that the derivative of $e^x$ is $e^x$, so suppose you tried, as a first shot, $x e^x$ for the integral. Differentiate
that, and see how close to $x e^x$ you get.
(b) There is an extra term (an $e^x$) in the derivative that isn’t in the integral. (Right?) So, let’s see if we can get rid
of it. We must subtract something from the $x e^x$ (our original guess for the integral) to get rid of this extra $e^x$ in
the derivative. What would we need to subtract from the guess so that the extra $e^x$ in the derivative cancels?
(c) Do the differentiation of the guess including that extra term, and check that you do get exactly $x e^x$ as the
derivative.
(Note: Reexamine the integration by parts procedure applied to $\int x e^x\,dx$ and you’ll see that the integral on the
right-hand side does exactly what the steps here did. That’s really all there is to integration by parts. You try
something ($uv$) and then subtract off what you need to (in the second integral) to make things work correctly.)
2. We now investigate what happens if we decide to put in the constant of integration when finding the v term in the
integration by parts.
(a) In the integral $\int x e^x\,dx$, with $u = x$ and $dv = e^x\,dx$, let $v = e^x + 37$. Carry out the rest of the integral, watching
what happens to the 37. Show that the answer doesn’t change.
(b) Give a reason why the $v$ term never needs to have the $+C$ attached to it in any integration by parts use.
(c) Evaluate $\int x \operatorname{Arctan} x\,dx$, using integration by parts with $u = \operatorname{Arctan} x$ and $dv = x\,dx$. Write down the integrals
you get by taking $v = \frac{1}{2} x^2$ and $v = \frac{1}{2} x^2 + \frac{1}{2}$. Note that the integral with $v = \frac{1}{2} x^2 + \frac{1}{2}$ is easier to work because of
a really nice, and unusual, cancellation. (Note: Integrals where this happens are exceedingly rare. This is more
of a curiosity than something to keep in mind to use regularly.)
Integral tables.
There are numerous integral tables around—so many that it is difficult to give much advice. However, here is a tiny
annotated bibliography of ones you should consider. Stop by my office if you want to see any of these or learn about others.
• Most calculus texts have a table of integrals, often on the flyleaf pages inside the covers. These are minimal, and not
too useful when confronted with complicated integrals.
• CRC Standard Math Tables has a section on integrals (as well as most of the other mathematics that you’d encounter).
It is reasonably good, not comprehensive, but likely to have what you want. I’d recommend getting this if you
are planning on going into a field that uses quite a bit of math (e.g., engineering, math, physics, or statistics).
• Handbook of Mathematical Functions With Formulas, Graphs, and Mathematical Tables, by Abramowitz and Stegun.
This book is more comprehensive than the CRC one, and can be obtained quite cheaply in paperback. It also is a
major collection of topics, not just integrals. Its treatment is more thorough than CRC’s and more geared to the
person who already knows what to do and just needs to be reminded about the details or to look up the value of some
function. You might want to consider this if you are going to do lots of math, such as graduate work in math or
engineering.
• Table of Integrals, Series, and Products by Gradshteyn and Ryzhik. The bulk of this hefty volume is integrals of all
sorts. It comes the closest to being the comprehensive reference for integrals. It also includes summation and product
formulas. This would be useful only for the most specialized math training.
Finding the integral you want. Locating the section containing the integral you want can be a chore. Integral tables
(especially the larger ones) contain so many integrals that some organization is critical. Usually, the best procedure for
finding the correct area is to go to the last section that mentions any factor in your integral, and keep up that process
through all the factors you have. For example, if you have $\int x^2 e^x \sin x\,dx$, you would probably find that exponentials
occurred after polynomials, and sines after exponentials, so you’d turn to sines. Then look through there until you locate
exponentials times sines, and then look for polynomials times exponentials times sines.
Quadratics often only appear as $a^2 + u^2$, $a^2 - u^2$, and $u^2 - a^2$. (The other combination, $-u^2 - a^2$, is equal to $-(u^2 + a^2)$,
and is rarely encountered.) The integrals that you get are more like
\[ \int \sqrt{x^2 + 6x - 7}\,dx \]
You can’t simply ignore the 6 x, so you must do what you always do when you don’t like the linear term in a quadratic:
You complete the square. Then proceed as we did before. (See homework investigations about trigonometric substitutions.
That’s the essential basis for all of these forms.)
Mugsy: What a pain that was! You mean it is actually used?
Albert: Very definitely.
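For the quadratic in the example above, the completing-the-square step can be sketched like this. The letters $u$ and $a$ are chosen to match the table forms; the integral is only set up here, not finished.

```latex
\begin{align*}
x^2 + 6x - 7 &= (x^2 + 6x + 9) - 9 - 7 = (x + 3)^2 - 16 = u^2 - a^2,
  \qquad u = x + 3,\quad a = 4,\quad du = dx \\
\int \sqrt{x^2 + 6x - 7}\,dx &= \int \sqrt{u^2 - 4^2}\,du
\end{align*}
```

That puts the integral into the $u^2 - a^2$ form that a table will list.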
Adding on the constant of integration. Integral tables almost always leave off the +C, the same as Maple does. You
still shouldn’t do that! You will lose credit if you do!
Don’t rely on integral tables (or Maple) for all your integration needs. You still need to be able to use substitution,
partial fractions, and/or integration by parts to be able to get to an integral in the table. I have tried to emphasize that point
in the homework. I hope you’ve picked it up.
Dudley: Well, it’s nice of you to say so. I’m assuming that this means that those sorts of things will show up on the
tests.
Albert: You got it.
Reduction formulas. Often, tables only give you a method of extending the range of variables using what is called a
reduction formula. The value of an integral is given in terms of another integral, with some variable lowered (reduced) or
occasionally raised.
A typical example is
\[ \int \sin^n(ax)\,dx = -\frac{1}{na} \sin^{n-1}(ax) \cos(ax) + \frac{n-1}{n} \int \sin^{n-2}(ax)\,dx \]
In this case, the exponent ($n$) is lowered to ($n - 2$), which is considered progress. You would use this formula if you wanted
to find, for example, the value of $\int \sin^{12}(3x)\,dx$. You’d use $n = 12$ and $a = 3$ in the formula on the right side. That would
leave you with an integral $\int \sin^{10}(3x)\,dx$, which you would then need to evaluate. You do that by going back to the same
formula again, but with $n = 10$ this time (and $a = 3$ still). You’d get another integral, this time $\int \sin^8(3x)\,dx$. Again, you
use the formula. You keep doing this until you get $\int \sin^0(3x)\,dx = \int dx = x + C$, which you should know already. After
that, “all” you have to do is reassemble the pieces to the final answer.
Mugsy: I’m much better at creating little pieces than reassembling them.
As you might expect, most reduction formulas come from applying integration by parts. You’ll get a chance to try one
in the homework.
Dudley: I can hardly wait.
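The "use the formula, then use it again" routine is exactly what a recursive function does. As a rough sketch in Python (not Maple), the code below applies the sine reduction formula repeatedly until the exponent reaches 0 or 1, then compares the resulting definite integral of $\sin^{12}(3x)$ from 0 to 1 with a brute-force sum of thin rectangles. The function names are invented for this illustration, and the interval from 0 to 1 is an arbitrary choice.

```python
import math


def sin_power_antiderivative(n, a, x):
    """Value at x of an antiderivative of sin(a x)**n, built by applying
    the reduction formula until the exponent reaches 0 or 1."""
    if n == 0:
        return x                              # integral of dx
    if n == 1:
        return -math.cos(a * x) / a           # integral of sin(ax)
    first = -math.sin(a * x) ** (n - 1) * math.cos(a * x) / (n * a)
    return first + (n - 1) / n * sin_power_antiderivative(n - 2, a, x)


def midpoint_sum(f, lo, hi, slices):
    """Approximate the integral of f on [lo, hi] with midpoint rectangles."""
    h = (hi - lo) / slices
    return h * sum(f(lo + (i + 0.5) * h) for i in range(slices))


# Definite integral of sin(3x)^12 from 0 to 1, two ways.
by_formula = (sin_power_antiderivative(12, 3, 1.0)
              - sin_power_antiderivative(12, 3, 0.0))
by_slices = midpoint_sum(lambda x: math.sin(3 * x) ** 12, 0.0, 1.0, 20000)
print(by_formula, by_slices)
```

The computer does the reassembling of the pieces for you, which is the part that is easiest to botch by hand.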
Homework #42
Exercises.
1. Use the reduction formula given in this section to evaluate

\[ \int \sin^5(7x)\,dx \]
You can check your answer using Maple, but beware of different forms of the same answer! (Take Maple’s answer,
subtract your answer, and then do a simplify(%,trig); on it. If you get 0, or any constant, you are correct.)
2. Use the reduction formula in this section to evaluate
\[ \int \sin^6(5x)\,dx. \]
4.2.3 Approximate methods.

As I have said before, elementary functions come from a finite combination of powers, trigonometric functions, inverse
trigonometric functions, logarithms, and exponentials. In calculus class, we stick with elementary functions. Integrating
elementary functions, however, can give new (non-elementary) functions. So, if we can’t find the indefinite integral of a
function, how can we find the definite integral? We can’t use the F(b) − F(a) trick. Some examples of non-elementary
integrals are
\[ \int e^{-a x^2}\,dx \]
\[ \int \frac{\sin x}{x}\,dx \]
\[ \int \sqrt{1 - k^2 \sin x}\,dx \]
The first of these shows up in probability and statistics. (It’s the famous bell-shaped curve function.) The second is called the
sine integral, and is used in signal processing. (The Fresnel, pronounced fruh-NELL, integrals used in optics involve $\sin(x^2)$
instead.) The third is generally called an elliptic integral, and shows
up in miscellaneous places in physics and math.
To find the definite integrals of these functions, we approximate them. This is a very sophisticated subject, and we’ll
just touch on the surface of it.
Mugsy: How about skipping it altogether?
Albert: Definitely not.
We will motivate the approximations using the area concept of integrals. That is, we will interpret the integrals as areas,
and come up with different ways of approximating that area. The approach is simple: Slice it up; approximate it; add it
back together. We can’t take the limit anymore, though, since all you get is the integral that we can’t evaluate. So, we stop
short of the limit, and evaluate the sum we get “by hand.” This is why the result is an approximation. It also gives a major
clue about how to improve the approximation: Increase the number of slices! More on that later; it is not as simple as it
sounds.
Mugsy: Nothing in this course is as simple as it sounds.
On the other hand, we will end up using Maple extensively in this section, because it is so numeric-intensive.
Riemann sums.
The idea of Riemann sums is to approximate the strips using rectangles. It’s a reasonable idea—basically what we did in
setting up integrals. There is only one question, and that is “What height should I use for the rectangle?” Usual calculus
courses cover two options, and Maple allows a third. You can use the heights at the right-hand endpoint of each slice
(called the rightsum(); in Maple), or the left-hand endpoint (called the leftsum(); in Maple), or the midpoint (called
the middlesum(); in Maple). To access the rightsum();, leftsum();, and middlesum(); procedures in Maple, type
with(student);
first. That defines the procedures for Maple. For example, an integral whose value cannot be written in terms of elementary functions is
$$\int_0^2 \sqrt{1 + x^5}\,dx$$
It still has an exact value, though, which we want to approximate. We will use this single integral throughout all the
approximation procedures we get. To apply each of the procedures from above, you would use these commands:
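> with(student):
> leftsum( sqrt(1+x^5), x = 0 .. 2 );
> rightsum( sqrt(1+x^5), x = 0 .. 2 );
> middlesum( sqrt(1+x^5), x = 0 .. 2 );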
You will notice immediately that you don’t get numbers. You get messy looking summations. To get numbers, use
evalf(value(%)); on them. The Maple Riemann sum routines give results similar to what Int(); does. They don’t try
to do any simplification or evaluation. You must force them.
Mugsy: If you need more force, just call. That I’m good at.
Maple uses only 4 slices unless you tell it otherwise. To get 10 slices with middlesum();, for example, you’d use
> with(student):
> middlesum( sqrt(1+x^5), x = 0 .. 2, 10 );
> evalf(%);
$$\frac{1}{5} \sum_{i=0}^{9} \sqrt{1 + \left(\frac{i}{5} + \frac{1}{10}\right)^5}$$
4.210170324
That is, you put the number of slices you want to use (if it’s different from 4) after the limits. The others work the same
way.
You can get a very nice picture of the specific rectangles that are used with rightsum();, leftsum();, and middlesum();.
The commands for the pictures are
rightbox();, leftbox();, and middlebox();
with the same types of arguments used in the approximations. For example, you get a picture of the boxes used in
middlesum(sqrt(1+x^5),x=0..2,10);
by typing
middlebox(sqrt(1+x^5),x=0..2,10);
You can actually see the boxes, with their heights set by the values of the function at the middles of their tops.
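If you want to see what the three Riemann sums are doing without Maple, here is a minimal Python sketch of the same idea. The function name riemann_sum and its where argument are made up for this illustration; they are not Maple commands, and this is not how Maple computes its sums internally.

```python
import math

def riemann_sum(f, a, b, n, where="mid"):
    """Riemann sum of f on [a, b] using n equal slices.

    where = "left", "mid", or "right" picks the point in each slice
    whose function value sets the height of the rectangle.
    """
    h = (b - a) / n                                   # width of each slice
    offset = {"left": 0.0, "mid": 0.5, "right": 1.0}[where]
    return h * sum(f(a + (i + offset) * h) for i in range(n))

# The running example: the integral of sqrt(1 + x^5) from 0 to 2, 10 slices.
f = lambda x: math.sqrt(1 + x**5)

left = riemann_sum(f, 0, 2, 10, "left")
mid = riemann_sum(f, 0, 2, 10, "mid")
right = riemann_sum(f, 0, 2, 10, "right")
print(left, mid, right)
```

The middle number should land very close to the 4.210170324 that middlesum(); gave above. Because this function is increasing, the left sum comes out too small and the right sum too large.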
Homework #43
Exercises.
1. Calculate the decimal values of rightsum();, leftsum();, and middlesum(); for the following integrals, using
n = 10 for each. Put down all 10 digits that Maple gives.
(a) $\int_0^{\pi/4} \tan x\,dx$
(b) $\int_0^1 e^x\,dx$
(c) $\int_0^1 \frac{\sqrt{x}}{1 + x^4}\,dx$
2. Again, calculate the decimal value of the three Riemann sums for the following integrals, using n = 10 each time.
And again, put down all 10 digits that Maple gives.
(a) $\int_0^{\pi/3} \cos x\,dx$
(b) $\int_0^2 e^x\,dx$
(c) $\int_0^1 \frac{\sqrt{x}}{1 + x^5}\,dx$
Trapezoidal rule.
We can improve the approximation considerably by using trapezoids rather than rectangles to approximate the areas of the
slices.
In case you’ve forgotten your high school geometry, a trapezoid has four sides, two of them parallel. In this case, the
two parallel sides are vertical. The bottom will be horizontal, and the top will be a line of whatever slope is determined by
the function.
Of course, you want to do this on Maple, and of course, Maple will do it. To get to the Maple routines that do trapezoidal
approximation, you again need with(student); and the routine is, of course, trapezoid();. To approximate, for
example,
$$\int_0^2 \sqrt{1 + x^5}\,dx$$
by the trapezoidal rule with 10 slices, you would use the command
> with(student):
> trapezoid( sqrt(1+x^5), x=0..2, 10);
> evalf(%);
$$\frac{1}{10} + \frac{1}{5} \sum_{i=1}^{9} \sqrt{1 + \frac{i^5}{3125}} + \frac{\sqrt{33}}{10}$$
4.244981679
Again, you’ll get a summation, which you convert to a decimal by evalf(value(%));. This really is nothing too new.
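The summation Maple prints is just the trapezoid areas added up: half weight on the two end values, full weight on everything in between. A minimal Python sketch of that formula follows. The helper name trapezoid is my own choice for the illustration and only happens to match the Maple command.

```python
import math

def trapezoid(f, a, b, n):
    """Trapezoidal rule for f on [a, b] with n equal slices."""
    h = (b - a) / n                                    # width of each slice
    interior = sum(f(a + i * h) for i in range(1, n))  # shared edges count fully
    return h * (0.5 * f(a) + interior + 0.5 * f(b))    # the two ends count half

f = lambda x: math.sqrt(1 + x**5)
approx = trapezoid(f, 0, 2, 10)
print(approx)
```

With 10 slices this should agree closely with the 4.244981679 shown in the Maple session.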
Homework #44
Exercises.
1. Calculate the trapezoidal approximation (in decimal form) to the following integrals. Use n = 10. (They should look
familiar.) Again, put down all 10 digits that Maple gives.
(a) $\int_0^{\pi/4} \tan x\,dx$
(b) $\int_0^1 e^x\,dx$
(c) $\int_0^1 \frac{\sqrt{x}}{1 + x^4}\,dx$
2. Again, calculate the decimal value of the trapezoidal approximation for the following integrals, using n = 10 for
each. Again, put down all 10 digits that Maple gives.
(a) $\int_0^{\pi/3} \cos x\,dx$
(b) $\int_0^2 e^x\,dx$
(c) $\int_0^1 \frac{\sqrt{x}}{1 + x^5}\,dx$
Simpson’s rule.
We can look at Riemann sums as approximating the function by a constant on each slice. The trapezoidal rule approximates
the function by a linear function on each slice. Simpson’s rule uses quadratic functions.
The difficulty is to determine enough information to fit a quadratic. To determine a constant, you only need one value
(the value of the constant), and we had several ways of determining that constant, using the value of the function either at
the left-hand or right-hand endpoint, or the midpoint. For the trapezoidal rule, you need two values, and the values used
were the values at each end of the interval. For Simpson’s rule, you will need three values (corresponding to a, b, and c in
the formula $ax^2 + bx + c$). It is quite reasonable to use the values at both endpoints and the midpoint of the interval, and
that’s what’s done, except that it is always written differently. Instead of using both ends and the middle of a single interval,
the interval is split into two intervals, and the outside ends and the common (“middle endpoint”) point are used. The net
result is that Simpson’s rule requires that the number of intervals be even.
Of course, Maple includes Simpson’s rule. After the usual with(student); you get it by the command simpson();.
To approximate, for example,
$$\int_0^2 \sqrt{1 + x^5}\,dx$$
by Simpson’s rule with 10 slices, you would use the command
> with(student):
> simpson( sqrt(1+x^5), x=0..2, 10);
> evalf(%);
This is exactly like what we did before.
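For the curious, the usual composite form of Simpson’s rule weights the function values 1, 4, 2, 4, ..., 2, 4, 1. Here is a minimal Python sketch under that assumption. The helper name simpson is mine, and I am not claiming Maple organizes its arithmetic the same way.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson's rule for f on [a, b] with n equal slices (n even)."""
    if n % 2 != 0:
        raise ValueError("Simpson's rule needs an even number of slices")
    h = (b - a) / n
    odd = sum(f(a + i * h) for i in range(1, n, 2))    # the "middle" points
    even = sum(f(a + i * h) for i in range(2, n, 2))   # shared interior endpoints
    return (h / 3) * (f(a) + 4 * odd + 2 * even + f(b))

f = lambda x: math.sqrt(1 + x**5)
approx = simpson(f, 0, 2, 10)

# Sanity check: fitting quadratics happens to reproduce cubics exactly,
# so the integral of x^3 from 0 to 2 should come out as 4.
cubic = simpson(lambda x: x**3, 0, 2, 10)
print(approx, cubic)
```

The check on x^3 is a handy way to catch a mistyped weight. Asking for an odd number of slices raises an error, which mirrors the even-slices requirement described above.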
Homework #45
Exercises.
1. Calculate Simpson’s rule approximation (in decimal form) to the following integrals. Use n = 10. (They should look
familiar.) Once again, put down all 10 digits that Maple gives.
(a) $\int_0^{\pi/4} \tan x\,dx$
(b) $\int_0^1 e^x\,dx$
(c) $\int_0^1 \frac{\sqrt{x}}{1 + x^4}\,dx$
2. Again, calculate the decimal value of the Simpson’s rule for the following integrals, using n = 10 for each. And
again, put down all 10 digits that Maple gives.
(a) $\int_0^{\pi/3} \cos x\,dx$
(b) $\int_0^2 e^x\,dx$
(c) $\int_0^1 \frac{\sqrt{x}}{1 + x^5}\,dx$
Problem.
1. In this problem, we collect in one place all the information that the previous exercises have given, and then look at
various patterns in the data. For this, we create a table with the following headings
Exact Leftsum Middlesum Rightsum Trapezoid Simpson
leaving room for several more columns. There will be three rows, one for each of the three integrals in exercise 1 of
the past several sections.
(a) Fill in the table, using the numbers you collected earlier. For the Exact column, you might need to use Maple’s
internal approximation routines, and use evalf(%); on them. Again, use full 10 digit accuracy on these.
(b) Compare the Exact column with the others. Which column is consistently closer?
(c) Create a new column, labeled Avg., and fill it in with the averages of the Rightsum and Leftsum columns.
Compare these numbers to the numbers in the Trapezoid column. (The same thing should happen for each
row.) Can you explain what happened?
Additional ideas (higher-order approximations).

It would seem natural to approximate the function by cubics, quartics (fourth-degree polynomials), etc. to get better
approximations, but this is never seriously done. Simpson’s rule is used often, and (for technical reasons), the trapezoidal
rule is used occasionally.
Of course, Maple is the easiest way to use these. But it uses another method, or actually several methods depending on
the integral, when you ask it for an evalf(%); on an integral that it can’t work exactly. To find out more details, type
?int[numerical]
and be ready to be confused.
Dudley: Is he serious?
Albert: That depends on what you mean. If you are asking if he expects you to type that into Maple, no, he isn’t
serious. If you are asking if he expects you to get confused if you do type that in, yes, he is serious.
Accuracy considerations.
The approximation techniques that we have used here are, of course, inaccurate. That’s why they are called approximations.
Some of them are more accurate (closer to the correct value) than others. What we want to look at next is just that. How
accurate (or, how inaccurate) are these approximations?
By definition, the error is
Exact value − Approximate value
The value of the error depends on the method of approximation, the function, the interval, and the number of slices. Of
these, the method and the number of slices can be chosen for any specific problem, since the function and interval are given
to you.
Don’t just increase the number of slices to some huge number and expect better accuracy. Roundoff errors can kill your
estimate, and you’ll be spending a huge amount of time unnecessarily. This is particularly true of the approximations built
into some calculators. (I think they use Simpson’s rule, but I’ve never looked into that carefully.)
The art of balancing approximation method and the number of slices is a delicate one. I intend only to convince you of
that in the homework.
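To make the dependence on the number of slices concrete, here is a small Python sketch that tabulates the error of the trapezoidal rule as the slices double. The integral of e^x from 0 to 1 is used only as a stand-in because its exact value, e − 1, is known. The helper name trapezoid is my own.

```python
import math

def trapezoid(f, a, b, n):
    """Trapezoidal rule for f on [a, b] with n equal slices."""
    h = (b - a) / n
    interior = sum(f(a + i * h) for i in range(1, n))
    return h * (0.5 * f(a) + interior + 0.5 * f(b))

exact = math.e - 1.0          # exact value of the integral of e^x from 0 to 1

rows = []
for n in (6, 12, 24, 48):
    err = exact - trapezoid(math.exp, 0.0, 1.0, n)   # Exact - Approximate
    rows.append((n, err, err * n**2))
    print(n, err, err * n**2)
```

Each doubling of n cuts the error by roughly a factor of 4, so the last column barely moves. That is the kind of pattern to look for when balancing method against number of slices.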
Homework #46
Investigations.
This entire investigation refers to the integral
$$\int_0^{\pi/2} \frac{dx}{1 + \sin x}$$
1. (a) Use Maple to find the exact value of the integral. You will note that Maple grinds for quite a while on this one.
It is not a simple integral! But it does give a whole number answer. (If you get $\frac{1}{2}\pi + 1$, go back to your function
and type it in correctly. It is not 1/1+sin(x)! Also remember to capitalize the P in Pi for Maple.)
(b) We will set up a table for deciding how well each of the different methods approximates the integral as n =
number of slices increases. The table (it will be quite large) should look like
          Middlesum                  Trapezoid                  Simpson
n     Value  Err  Err × n^2     Value  Err  Err × n^2     Value  Err  Err × n^4
6
12
24
48
The rest of the investigation will help you fill it out correctly.
i. Fill in the columns labeled “Value” of Middlesum, Trapezoid, and Simpson on the integral. (Write down
the full 10 significant digits with these. Otherwise the rest of the problem won’t work.)
ii. Knowing the correct value of the integral (from the first question), you can then calculate the numbers in
the column labeled “Err” in all those approximations.
iii. After that, multiply the numbers in the column labeled “Err” by $n^2$, $n^2$, and $n^4$ as indicated. (That is, on
the first row, where n = 6, multiply the error in Middlesum by $36 = 6^2$, multiply the error in Trapezoid
by $36 = 6^2$, and the error in Simpson by $1296 = 6^4$. Do similar things on the other rows.) (Be careful to
multiply the value in Simpson by $n^4$ rather than $n^2$.)
iv. For each of the three methods, the numbers in “Err × n^2” (or “Err × n^4” for Simpson’s rule) should have
specific values as n → ∞. (To prove that requires considerably more effort than we have time for.) Estimate
the three limits for the three methods.
v. Use the three estimated limits to estimate the values of the three errors when n = 96. Do this by taking the
limits from the previous part of this question, and dividing by $96^2$ or $96^4$.
vi. Finally, run each of the methods with n = 96 (this could take some time), find the Values, calculate the
Errs, and figure out the last columns again. Use these numbers to check your answers to the previous part
of this question.
(c) You might have noticed (I am being generous there) that the Errs for middlesum(); and trapezoid();
tend to be related. Specifically, the Err for trapezoid(); is usually quite close to −2 times the Err for
middlesum();. This means that we can come much closer to the correct value of the integral if we use this
fact. Suppose the actual, exact value of an integral is X. Since the error is the exact value minus the approximate
value, middlesum(); will give an approximation $A_m = X - E_m$ while trapezoid(); will give an approximation
$A_t = X - E_t$, where $E_m$ and $E_t$ are the errors from the middlesum(); and trapezoid(); approximations,
respectively. But if $E_t \approx -2 E_m$, then $2 E_m + E_t \approx 0$. Then
\begin{align}
2 A_m + A_t &= 2 (X - E_m) + (X - E_t) \tag{4.99} \\
&= 2 X - 2 E_m + X - E_t \tag{4.100} \\
&= 3 X - (2 E_m + E_t) \tag{4.101} \\
&\approx 3 X \tag{4.102}
\end{align}
Then $X \approx (2 A_m + A_t)/3$ should be a very good approximation. Check this out using the table of Values of
middlesum(); and trapezoid(); for n = 6, 12, and 24. Compare these to the Values of simpson(); at
n = 12, 24, and 48. (It should look very much the same!) This is another way to get Simpson’s rule.
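The combination of twice the midpoint value plus the trapezoid value, divided by 3, can be tried out on the chapter’s running integral (not the homework integral) with a short Python sketch. The three helper functions are my own minimal versions of the rules, written under the usual textbook formulas, and are not Maple’s routines.

```python
import math

def midpoint(f, a, b, n):
    """Midpoint Riemann sum with n equal slices."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def trapezoid(f, a, b, n):
    """Trapezoidal rule with n equal slices."""
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n)) + 0.5 * f(b))

def simpson(f, a, b, n):
    """Composite Simpson's rule with n equal slices (n must be even)."""
    h = (b - a) / n
    odd = sum(f(a + i * h) for i in range(1, n, 2))
    even = sum(f(a + i * h) for i in range(2, n, 2))
    return (h / 3) * (f(a) + 4 * odd + 2 * even + f(b))

f = lambda x: math.sqrt(1 + x**5)

pairs = []
for n in (6, 12, 24):
    combo = (2 * midpoint(f, 0, 2, n) + trapezoid(f, 0, 2, n)) / 3
    pairs.append((combo, simpson(f, 0, 2, 2 * n)))   # compare with Simpson at 2n
    print(n, combo, simpson(f, 0, 2, 2 * n))
```

The two printed columns should agree to essentially every digit, because the midpoints of n slices are exactly the “middle endpoints” that Simpson’s rule uses with 2n slices.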
4.3 Applying 1-dimensional integrals.

Traditional calculus courses have a long section right after methods of integration that is quaintly entitled “Applications,”
consisting of up to a dozen contrived situations that are obviously cooked up to make integration look useful. I want to go
over the same set of “applications” from the point of view of showing you how to set up integrals. I won’t pretend that you
will ever use these as they stand. However, the method that is used to set up integrals is critical to using integration. In fact,
it will become a little boring by the end. That’s the idea.
Dudley: What? Integration is supposed to get boring?
Mugsy: If he’s trying to be funny, it isn’t working.
Albert: No, he’s serious.
Mugsy: It still isn’t working, then.
Albert: The procedure for setting up integrals for applications can get boring. It’s the same thing over and over.
I will cover these faster than most calculus courses, because what I want you to learn is different. I want you to learn
the procedure behind setting up integrals, not the specific formulas.
Many more applications (a lot closer to real-life) will form the majority of this course, beginning in the previous chapter,
picking up again next chapter, and continuing all of next semester. This section is to get you warmed up.
Mugsy: I prefer jogging.
4.3.1 Areas, again.

We looked at areas as one of the initial interpretations of integrals. We come back to them and discover that we made a few
assumptions. Trying to correct them will lead us into a few unexpected, and new, topics.
What happens if f (x) becomes negative?

As an example, what is the area for the curve $y = 3x^2 - 9$ from x = 0 to various values? The indefinite integral (used to
find all the definite integrals) is $x^3 - 9x$. (Okay, it should have a +C attached, but as before, we can set C = 0 because
we will be using it only for evaluating definite integrals.)
For 0 ≤ x ≤ 4, the definite integral gives $[(4)^3 - 9(4)] - [(0)^3 - 9(0)] = 28$, which is a reasonable answer, I suppose.
For 0 ≤ x ≤ 3, the definite integral gives $[(3)^3 - 9(3)] - [(0)^3 - 9(0)] = 0$, and that is suspicious. In fact, a little thought
indicates that it can’t be correct. How could the area be 0!?
Albert: Hint: Any time you get an area of 0, you have almost certainly made a mistake!
Nevertheless, try 0 ≤ x ≤ 2. The definite integral gives $[(2)^3 - 9(2)] - [(0)^3 - 9(0)] = -10$.
Dudley: Hold it! An area of 0 was bad enough, but we know that it can’t be negative!
What is going on? First, look at the graph.
Note that f (x) < 0 for $0 \le x < \sqrt{3}$. This is the culprit! When f (x) < 0, the f (x) dx’s will be negative also, and so will
the sum (that is, the integral). When f (x) turns positive again, the f (x) dx’s become positive too. For 0 ≤ x ≤ 2, there were
more negatives than positives, and the combined sum was negative. The negative and positive terms exactly cancel as you
add up for 0 ≤ x ≤ 3. For 0 ≤ x ≤ 4, there were enough positive terms to make the overall sum positive.
So, the integral is doing what it is supposed to do, namely add up a bunch of differential-sized items. The only problem
is that those items can be negative because the function multiplying dx is negative, and the area ought always to be positive.
When the function is positive, the definite integral gives the area correctly. When the function is negative, the f (x) dx
gives the negative of the area of the slice, and adding those up gives the negative of the area. That’s not too hard to work
with; we just change the sign to get the area. The difficulties occur when the function is sometimes positive and sometimes
negative, as in the example. Then the positives and negatives cancel.
What do we do?
What we need is something that will change the sign of negative numbers, but leave positive numbers alone. We have
something that does that—absolute values! The area of a differential sliver is always | f (x) | dx, whether f (x) is positive or
negative.
The general formula for area “under” the graph of y = f (x) for a ≤ x ≤ b can be expressed as
$$\int_a^b |f(x)|\,dx$$
“All” we have to do now is figure out how to integrate absolute values of functions. Note that we really should say that we
are finding the area between y = f (x) and the x-axis rather than the area below y = f (x), because when f (x) < 0, it is the
area above the curve that we are finding. If we use between, we cover both situations with one word.
Learning how to integrate absolute values of functions turns out to be more useful than it might first appear.
Mugsy: Here we go again. I can hardly wait for him to explain how we will use this every day for the rest of our lives.
Dudley: Depends on how long you live.
Mugsy: I can adjust yours right now, if you keep that up.
There are a number of situations where absolute values show up, mainly because
$$\sqrt{z^2} = |z|$$
Any time (well, almost) you are integrating the square root of something, you will want to make the inside part of the square
root a perfect square and then take the square root. At that point, you must put absolute values in, because of this identity.
Then you get to proceed to the following section’s procedure.
Definite integrals of | f (x) |.

The procedure for finding the definite integral of | f (x) | will look vaguely familiar. We already have a clue about how to
proceed. If f (x) ≥ 0, we can proceed simply by integrating f (x) (which equals | f (x) | when f (x) ≥ 0).
If f (x) < 0, we can proceed by integrating − f (x) (which also equals | f (x) | when f (x) < 0). But for f (x) < 0, it is
easier to integrate f (x) and change the sign at the end rather than messing around with the negative sign throughout. This
is what I meant when I said that if f (x) is always positive or always negative, there is no difficulty.
So, what we do is break the interval a ≤ x ≤ b up into parts on which f (x) has constant sign. We did that back with
derivatives, one of the ways we looked for maxes and mins using the first derivative. We do the same here, but for a
completely different reason.
We can find where f (x) is positive or negative by looking at where f (x) is 0 (or isn’t defined—but those functions
require extra work to integrate, and we won’t worry about that problem until next semester). If there are places where the
function is 0 outside the interval a ≤ x ≤ b, we ignore them. The places inside the interval where the function is 0 break the
interval into pieces, and we integrate over each piece separately. (We don’t dare lump them together, because some might
be positive and some negative, and we above all don’t want cancellation to occur.) The intervals where f (x) < 0 will give
negative integrals, and the intervals where f (x) > 0 will give positive integrals. (This is a handy check, if you need one.)
Change all the values of the integrals to positive, and add, and you’ll get the total area.
The correct and general way to find areas between curves and the x-axis.
The area between the curve y = f (x) and the x-axis for a ≤ x ≤ b is, as stated earlier,
$$\int_a^b |f(x)|\,dx = \text{total area between } y = f(x) \text{ and the x-axis} \tag{4.103}$$
It is integrated by the process described earlier.

When f (x) > 0, which is all we had used up to this point, the formula reduces to $\int_a^b f(x)\,dx$, which is what we used way
back then.
For example, let’s find the correct area between $f(x) = 3x^2 - 9$ and the x-axis for x between 0 and 4. We first need to find where
f (x) changes sign. This is done by solving $3x^2 - 9 = 0$, which gives $3x^2 = 9$, or $x = \pm\sqrt{3}$. Since $x = -\sqrt{3}$ is not in the interval
0 ≤ x ≤ 4, we can discard that, and we then have that the intervals over which f (x) doesn’t change sign are $0 < x < \sqrt{3}$ and
$\sqrt{3} < x < 4$. Integrate f (x) over these intervals and get:
$$\int_0^{\sqrt{3}} (3x^2 - 9)\,dx = \left[x^3 - 9x\right]_0^{\sqrt{3}} = (3\sqrt{3} - 9\sqrt{3}) - (0 - 0) = -6\sqrt{3}$$
$$\int_{\sqrt{3}}^{4} (3x^2 - 9)\,dx = \left[x^3 - 9x\right]_{\sqrt{3}}^{4} = (64 - 36) - (3\sqrt{3} - 9\sqrt{3}) = 28 + 6\sqrt{3}$$
Now what do we do? The value of the first integral is clearly negative, which is correct, since the function is negative over
the entire interval. The second integral is clearly positive, which again checks with the function being positive. How do we
find the integral of $|f(x)| = |3x^2 - 9|$ over 0 ≤ x ≤ 4? We add up the absolute values of the two integrals that we got:
$$\left|-6\sqrt{3}\right| + \left|28 + 6\sqrt{3}\right| = 6\sqrt{3} + 28 + 6\sqrt{3} = 28 + 12\sqrt{3}$$
This is the area between the curve $f(x) = 3x^2 - 9$ and the x-axis for 0 ≤ x ≤ 4.
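The split-then-add-absolute-values procedure is mechanical enough to write down as a short Python sketch. The names F, total_area, and cuts are made up for this illustration, and the cut points are typed in by hand rather than solved for.

```python
import math

def F(x):
    """An antiderivative of f(x) = 3x^2 - 9 (taking C = 0)."""
    return x**3 - 9 * x

def total_area(F, cuts):
    """Add up |F(hi) - F(lo)| over consecutive pieces between the cut points."""
    return sum(abs(F(hi) - F(lo)) for lo, hi in zip(cuts, cuts[1:]))

# Interval 0 <= x <= 4, split where f changes sign (x = sqrt(3)).
cuts = [0.0, math.sqrt(3), 4.0]

area = total_area(F, cuts)        # total area: 28 + 12*sqrt(3), about 48.78
net = F(4.0) - F(0.0)             # plain integral with no splitting: 28
print(net, area)
```

Dropping the middle cut point reproduces the cancellation problem: the function then returns 28 instead of the total area.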
Three final notes.
Mugsy: At least I can count that high.
First, if you don’t take the absolute values before you add in the example, you get $(-6\sqrt{3}) + (28 + 6\sqrt{3}) = 28$, which is
the answer we obtained by just integrating $3x^2 - 9$ from 0 to 4. Think about this for a while until you are convinced that
this should always happen.
Second, be careful when you are taking absolute values of differences (subtractions). You don’t just change all the signs
to positive. For example, $|5 - \sqrt{3}|$ is $5 - \sqrt{3}$ rather than $5 + \sqrt{3}$. (For those of you who remember such things, the triangle
inequality says that $|a \pm b| \le |a| + |b|$, and the ≤ can often be a <. You can’t just pull absolute values apart that way!)
On the other hand, $|1 - \sqrt{3}|$ is $-(1 - \sqrt{3}) = \sqrt{3} - 1$. You take the absolute value after you have calculated the number,
and not the absolute values of the individual terms. The way to think about it is that you first punch the quantity into your
calculator, and only at the end do you decide whether you need to change the sign to make it positive. In the examples, you
first find that $5 - \sqrt{3} \approx 3.268$, so $|5 - \sqrt{3}| \approx 3.268$, but $1 - \sqrt{3} \approx -0.732$, so $|1 - \sqrt{3}| \approx 0.732$.
Finally, there is another way this type of problem will get stated, that looks like it requires more work on your part.
Mugsy: I can see it coming now.
Sometimes, just the function is given to you, and you are supposed to find the limits by yourself. An example of such a
problem is this. Find the area between the curve $y = x^2 - x - 6$ and the x-axis. No limits are given to you! What do you
do? Simple. In cases like this, you figure out where the function intersects the x-axis, and use those for limits. (You’d have
to do this step anyway, so there really isn’t any extra work.) You then integrate over the interval(s) you get, and add up
the absolute values. Note that you don’t normally integrate over the intervals that are infinite (going to +∞ or −∞). In the
example of $y = x^2 - x - 6$, the curve intersects the x-axis at x = −2 and 3. Integrating, you get
$$\int_{-2}^{3} (x^2 - x - 6)\,dx = \left[\tfrac{x^3}{3} - \tfrac{x^2}{2} - 6x\right]_{-2}^{3} = -\tfrac{27}{2} - \tfrac{22}{3} = -\tfrac{125}{6}$$
The absolute value is $\frac{125}{6}$, which is the area.
One short technical note on that. It is possible that there are no intersection points, or just one. An example of no
intersection points would be $y = 1/(1 + x^2)$, and an example of one intersection point would be $y = x e^{-x}$. In that case, you
must use infinity (∞) and/or negative infinity (−∞) to complete the limits. For $y = 1/(1 + x^2)$, you’d use −∞ to ∞. For
$y = x e^{-x}$, you’d use 0 (the intersection point) and ∞. Why ∞ rather than −∞ is more complicated than I want to go into
here. It would take us into what are called improper integrals, and that’s another topic for next semester.
The area between two curves.

We are exceedingly close to solving another, more general, problem.
Mugsy: And, of course, you couldn’t help dropping in as long as you were in the neighborhood.
What is the area between two curves y = g(x) and y = h(x) for a ≤ x ≤ b? The answer is that it is the same as the area
between y = h(x) − g(x) and the x-axis for a ≤ x ≤ b. A picture is helpful to convince you that is right.
All the items about y = f (x) concerning the area between the curve and the x-axis have corresponding items concerning
the area between y = g(x) and y = h(x). All you have to do is use f (x) = h(x) − g(x) and all the pieces fall in place. The
item that corresponds to y = f (x) crossing the x-axis is y = g(x) and y = h(x) crossing each other because f (x) = 0 is the
same as h(x) − g(x) = 0 or g(x) = h(x). Where f (x) > 0 and f (x) < 0 corresponds to where h(x) is above or below g(x)
again because f (x) > 0 means h(x) − g(x) > 0 or h(x) > g(x). Think about this until it makes sense.
The idea behind this is fairly important. As long as f (x) = h(x) − g(x), the values of f (x) dx will be the same as
the values of (h(x) − g(x)) dx. That means that the areas of the slivers obtained from the area under y = f (x) will match
precisely the areas of the slivers from the area below y = h(x) and above y = g(x). Adding up the slivers will then give the
same area both times. Geometrically, this says that you can slide the areas around vertically, and push the area up or down
so that one side of the area is the x-axis.
Here’s the procedure you have been waiting for.
Dudley: That’s not really what I would have said. Dreading is a little closer.
To find the area between y = g(x) and y = h(x) for a ≤ x ≤ b, you first solve g(x) = h(x), keeping only the points that are
between a and b. Those points break up the interval between a and b into sub-intervals. Integrate over each sub-interval
separately, and add together the absolute values of the results to get the final answer.
Again, if the limits aren’t given to you, use the intersection points (g(x) = h(x)) as before to get the areas.
For example, to find the area between the curves y = 4x and $y = x^3$, you have to solve $4x = x^3$, giving values x = 0, −2,
and 2. You would use the limits −2 ≤ x ≤ 2, and the integral you’d want is
$$\int_{-2}^{2} \left|4x - x^3\right|\,dx$$
(In these problems, you always use the very largest and very smallest intersection points for the limits on the integral.) To
evaluate this, you’d set up two integrals (one from −2 to 0, and one from 0 to 2), and add up the absolute values. The two
integrals are −4 and 4, so the total area is |−4| + |4| = 8. (Note that if you find $\int_{-2}^{2} (4x - x^3)\,dx$ without splitting it up, you get
0, and that can’t be right! The area is never equal to 0, for us at least.)
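In case the values −4 and 4 look as though they came from nowhere, here are the two pieces worked out, using $2x^2 - \tfrac{x^4}{4}$ as the antiderivative of $4x - x^3$:

```latex
\begin{align*}
\int_{-2}^{0} (4x - x^3)\,dx &= \left[2x^2 - \tfrac{x^4}{4}\right]_{-2}^{0} = (0 - 0) - (8 - 4) = -4 \\
\int_{0}^{2} (4x - x^3)\,dx &= \left[2x^2 - \tfrac{x^4}{4}\right]_{0}^{2} = (8 - 4) - (0 - 0) = 4
\end{align*}
```

The first piece is negative because $x^3$ lies above 4x there; the second is positive because 4x lies above $x^3$.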
This seems reasonably simple (or at least seems as though it will be simple once you figure it out). It really is. But it
illustrates the key to all other applications: Slice up the item you want to find into differential-thickness slivers, estimate
each sliver, and reassemble with an integral.
Albert: This is the beginning of the boredom.
It might come as a shock, but Maple does not have an area-finding routine.
Mugsy: Boy, am I shocked.
You have to do that manually. Maple will find where f (x) = 0, or where g(x) = h(x). Maple will also do the integration, once
you give it the limits. On the other hand, if all you want is an approximation, you can get by with Maple’s approximation
routines. For example, if you wanted to find the approximate area between $y = 3x^2 - 9$ and the x-axis for 0 ≤ x ≤ 4, you
could just tell Maple evalf(Int(abs(3*x^2-9),x=0..4));, and you’d get an approximation. Note, however, that this is
just an approximation, and I will usually ask for the exact value in the homework. For the area between y = 4x and $y = x^3$,
you could tell Maple solve(4*x=x^3,x);, and get the numbers 0, -2, 2. (More complicated problems might require
you to use fsolve();) You’d then have to tell Maple evalf(Int(abs(4*x-x^3),x=-2..2)); Using the Int(); (inert)
form of integration prevents Maple from trying to integrate symbolically first, so Maple works the integral only numerically.
This can save quite a bit of time.
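The same approximate-the-absolute-value idea can be imitated without Maple. The Python sketch below uses a plain midpoint sum as a stand-in; it is an assumption of mine for illustration and certainly not the method Maple uses inside evalf(Int(...));.

```python
def midpoint(f, a, b, n):
    """Midpoint Riemann sum of f on [a, b] with n equal slices."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Area between y = 4x and y = x^3: integrate |4x - x^3| from -2 to 2.
area = midpoint(lambda x: abs(4 * x - x**3), -2, 2, 1000)

# Without the absolute value, the two halves cancel.
net = midpoint(lambda x: 4 * x - x**3, -2, 2, 1000)
print(area, net)
```

The first number should come out very close to the exact area of 8, and the second very close to 0.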
Slicing horizontally.
It is possible to slice areas horizontally rather than vertically. (In fact, a thorough investigation of this approach leads to a
new theory of integration called Lebesgue (pronounced luh-BAYG) integration. In contrast, what we have been doing is
called Riemann integration. But that’s for graduate-level work and not for us to worry about here.)
When you slice horizontally, the sliver will be dy thick, and of some length. Integration will then force you to integrate
( ) dy. That means that the variable of integration will have to be y, so the function will have to get expressed in terms of y,
and the limits will have to be y-limits.
So, the length of the sliver (to multiply by the thickness dy to get the area) will have to get expressed in terms of y. Since
horizontal distances are the differences in x-coordinates, we’ll need the x-coordinates of the ends of the sliver, meaning that
we will need the left- and right-hand curves in the form x = f (y) and x = g(y).
This is important. It tells us when we should use this method rather than slicing vertically. Look at the form of the
functions given to you; if they are in the form y = f (x), slice vertically, while if they are in the form x = g(y), slice
horizontally. If the equations are given implicitly (such as x + y = 4, with no preferred variable), you get to choose.
Admittedly, slicing vertically is much more common.
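Here is a small worked example of horizontal slicing, with curves chosen purely for illustration: the area to the right of $x = y^2$ and to the left of x = 4. The curves meet where $y^2 = 4$, so the y-limits are −2 and 2, and each sliver has length (right x) − (left x) $= 4 - y^2$.

```latex
\begin{align*}
\int_{-2}^{2} (4 - y^2)\,dy
  &= \left[4y - \tfrac{y^3}{3}\right]_{-2}^{2} \\
  &= \left(8 - \tfrac{8}{3}\right) - \left(-8 + \tfrac{8}{3}\right) \\
  &= \tfrac{32}{3}
\end{align*}
```

Since $4 - y^2 \ge 0$ on the whole interval, no splitting or absolute values are needed here.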
A few comments at the end.

Next semester, we’ll be learning a somewhat more comprehensive method of getting areas by slicing both horizontally
and vertically simultaneously (called double integration). It’s nice to have some experience now slicing either vertically or
horizontally; it will help then.
It is very helpful, when possible, to have a reasonable graph of the function(s) involved. At the very least, it gives you
something to stare at while you are trying to figure out the rest of the problem.
Dudley: That’s how I feel all the time.
Seriously, it might show you intersection points you would otherwise miss, or something else that you didn’t notice.
Homework #47
Exercises.
1. Find the area between $y = 6x^2 + 9x + 3$ and the x-axis.
2. Find the area between $y = x^3 - 3x - 3$ and y = x − 3.
3. Find the area below $y = \sqrt{x}$, above y = 2, and to the left of x = 9. (Drawing a picture is a definite help here! The
phrase “to the left of x = 9” shows that x = 9 is the upper limit of integration. The lower limit of integration is the
intersection of $y = \sqrt{x}$ with y = 2.)
4. Find the area between the curves $y = x^3$, y = −x, and x = 1. (Again, a picture is a great help. Find the single chunk
of area that includes all three curves as sides. Limits of integration again show up as intersection points of boundary
curves.)
5. Find the area to the left of $x = 6y^2$, and to the right of the y-axis, for 1 ≤ y ≤ 3. (Note that when the variable of
integration is y, the problem is stated entirely in terms of “left of” and “right of.” This is a clue.)
6. Find the area between $x = y^3$ and $x = y^4$.
4.3.2 Net and total distances traveled.

The next standard “application” is net and total distance traveled. In this type of problem, you are given the velocity v(t),
and the times $t_1 \le t \le t_2$, and the question asks for the net and/or total distance traveled.
We need to define terms here. Net distance is the change in position between beginning and end, ignoring what might
have happened in between. Total distance is the distance covered, including all backtracking, in getting from start to finish.
Obviously, total distance will be greater than or equal to net distance.
Oddly enough, finding total distance is exactly the same as finding area between a function and the x-axis.
Dudley: That’s odd? By this point of the course, any connection at all between any two things at all seems likely.
The reason is the similarity between finding a sliver of area and a “sliver” of distance traveled. When f (x) > 0, then
| f (x) | dx = f (x) dx is the area of the sliver. If f (x) < 0, then f (x) dx is the negative of the area, and the actual area is
| f (x) | dx = − f (x) dx. In either case, the area of the sliver is | f (x) | dx.
When we try to find total distance traveled, we run up against the same problem. When v(t) > 0, then the distance
traveled during a dt-sliver (of time) is v(t) dt, that is, rate × time. (We can use this formula even though v(t) is changing
since v(t) won’t be changing during a dt-interval. This is the usual advantage of taking a differential of change.) On the
other hand, if v(t) < 0, then v(t) dt will be negative, and the total distance traveled doesn’t want to count any distance as
negative. All distances covered need to be accumulating as positive values; otherwise there is cancellation, and backtracking
can’t be accounted for. So what do we need to do? Convert v(t) < 0 to be positive, and use in that case |v(t) | dt = −v(t) dt
for the distance traveled during a dt-interval. Again, in either case, we want to use |v(t) | dt for the distance. To add up all
those distances, we use an integral, and get
\[ \int_{t_1}^{t_2} |v(t)|\,dt = \text{total distance traveled} \tag{4.104} \]
The net distance traveled is much easier to find. Since v = ds/dt, we get that
\begin{align}
\int_{t_1}^{t_2} v(t)\,dt &= \int_{t_1}^{t_2} \frac{ds}{dt}\,dt \tag{4.105}\\
&= s(t)\Big|_{t_1}^{t_2} \tag{4.106}\\
&= s(t_2) - s(t_1) \tag{4.107}
\end{align}
and this is just the net distance. We then get this
\[ \int_{t_1}^{t_2} v(t)\,dt = \text{net distance traveled} \tag{4.108} \]
Note that net distance can be positive, negative, or zero, depending on the sign of v(t).
By analogy with net and total distances, I will sometimes refer to the area between a function and the x-axis as the total
area, while just integrating the function without the absolute values gives the net area.
We will cover total distance traveled in two dimensions in the next section.
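To see formulas (4.104) and (4.108) side by side, here is a small Maple sketch. The velocity v(t) = 2t − 4 on 0 ≤ t ≤ 5 is an assumption made up for this illustration; it is not one of the exercises. This velocity is negative for t < 2 and positive afterward, so working by hand the net distance is 5, while the total distance is 4 (backward) plus 9 (forward), or 13.

```maple
# Hypothetical velocity chosen for illustration: v(t) = 2t - 4 on 0 <= t <= 5
v := 2*t - 4:

# Net distance: integrate v(t) itself, so the backtracking cancels
net := int(v, t = 0 .. 5);

# Total distance: integrate |v(t)|, so every bit of motion counts as positive
total := int(abs(v), t = 0 .. 5);
```

If Maple balks at the absolute value for some other velocity, one workaround is to split the integral by hand at the times where v(t) = 0 and flip the sign on the pieces where v(t) < 0.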
Homework #48
Exercises.
1. Find the net and total distances traveled for the following velocity functions, v(t).
(a) v(t) = cos t, 0 ≤ t ≤ π/2
(b) v(t) = 6t − 48, 0 ≤ t ≤ 20
2. Find the net and total distances traveled for the following velocity functions, v(t).
(a) v(t) = sin(t/4), 0 ≤ t ≤ 2π
(b) v(t) = 4t − 16, 0 ≤ t ≤ 10
4.3.3 Arc length.

Another typical “application” is the length of a curve, called the arc length (sometimes written as a single word). This is
actually a two-dimensional version of the total distance traveled problem, but it will take a while to see that.
The situation is this. We have a curve specified either as an explicit or parametric equation. (The other method of spec-
ifying a curve—an implicit equation—is not well suited to arc length problems, and we don’t work with that possibility.)
We want to find the total length of the curve.
The method, in case you haven’t figured it out by now, is to chop up the curve into differential-sized pieces, find the
length of each piece, and add the pieces back together with an integral. We will be using s for the length of the curve, so
ds = length of the differential piece.
The formula for ds comes directly from the Pythagorean theorem:
\[ ds = \sqrt{dx^2 + dy^2} \tag{4.109} \]
Drawing a little differential triangle should convince you of this. The only hassle is with the curved “hypotenuse,” but
remember that differential-sized values of dx and dy don’t allow enough room for the hypotenuse to curve enough to mess
up the equation.
Integrating adds up all the ds’s, and the result is the arc length:
\[ \text{arc length} = \int \sqrt{dx^2 + dy^2} \quad \text{(sort of)} \tag{4.110} \]
That’s the formula, but it contains something of an awkward problem.

Mugsy: Thrills.
We can’t integrate the thing as it stands. What’s the differential for the integral? Is it dx or dy? Without knowing that, we
don’t know the limits for a specific problem, or even what variable to express everything in terms of. In other words, the
formula is nice, but needs considerable work before it is usable.
Dudley: I suppose that’s the challenge in this section.
Albert: Yep.
Converting the formula to a form you can use for integrating involves creating the differential you want to use. This is
actually a benefit. The formula does not specify what the differential variable is. We can then choose it to suit the problem
we are given. The variable to use in the differential is the independent variable used in describing the curve.
Dudley: Isn’t that always the case?
Albert: Yes. But in this situation, it is handy to make that statement very explicitly.
How do we create this differential? We factor it out of the integrand. Essentially, we write the ds as (ds/d?) d?, where
we fill in ? with the correct variable. The most common variable is probably x, so if we had a problem with x as the
independent variable, the formula would be ds = (ds/dx) dx. But that needs to get translated in terms of the integral. The
procedure for that is always the same, and isn’t even very difficult:
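\begin{align}
ds &= \sqrt{dx^2 + dy^2} \tag{4.111}\\
&= \sqrt{\left(\frac{dx^2}{dx^2} + \frac{dy^2}{dx^2}\right) \times (dx^2)} \tag{4.112}\\
&= \sqrt{1 + \left(\frac{dy}{dx}\right)^2} \times dx \tag{4.113}
\end{align}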
This, then, is what gets integrated.

Dudley: Al, shouldn’t we take √(dx^2) = |dx|?
Albert: Why, Dudley! How observant of you! You are certainly correct. That is one of the points that was made so
emphatically earlier, and you even remembered.
Dudley: Thanks for the compliment. Answer my question.
Albert: Oh, yes. When you are setting up the integral, the limits are always put in with the smaller value on the
bottom and the larger number for the upper limit. This means that x, or whatever the differential variable is, will be
increasing. That means that the differential of that variable will always be positive. Does that help?
Mugsy: No. But I’ll believe you.
Note that since x is the independent variable, dy/dx can be calculated as usual, and the integrand turns into a function of x,
which is what the integrand needs to be. Supply x-limits (which will have to be given to you as part of the problem), plug
in the dy/dx (which you calculate from the equation of the curve), and you are ready to integrate. The answer becomes
\[ \int_{x_1}^{x_2} \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx \tag{4.114} \]
A very similar thing happens in the unusual case of y as the independent variable. The formula becomes
\[ \int_{y_1}^{y_2} \sqrt{1 + \left(\frac{dx}{dy}\right)^2}\,dy \]
as you can easily check. Again you must supply limits (values of y this time, since the differential is dy), and the formula
for dx/dy (obtained by differentiation), and integrate.
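Here is a short worked case with y as the independent variable. The curve x = (2/3)y^(3/2) for 0 ≤ y ≤ 3 is an assumption picked for this illustration, precisely because the integral comes out cleanly; most curves will not be this cooperative.

```latex
\begin{align*}
\frac{dx}{dy} &= y^{1/2} \\
\text{arc length} &= \int_0^3 \sqrt{1 + \left(y^{1/2}\right)^2}\,dy
   = \int_0^3 \sqrt{1 + y}\,dy \\
&= \frac{2}{3}(1 + y)^{3/2}\,\Big|_0^3
   = \frac{2}{3}\,(8 - 1) = \frac{14}{3}
\end{align*}
```

Notice that the limits are y-values, matching the dy differential.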
A similar, but slightly different, thing happens in the case of parametric equations. There, the independent variable is t,
and you don’t get quite as much simplification. The differential for arc length is calculated as before:
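\begin{align}
ds &= \sqrt{dx^2 + dy^2} \tag{4.115}\\
&= \sqrt{\left(\frac{dx^2}{dt^2} + \frac{dy^2}{dt^2}\right) \times (dt^2)} \tag{4.116}\\
&= \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2} \times dt \tag{4.117}
\end{align}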
This, then, is what is integrated in this case. The formula becomes
\[ \int_{t_1}^{t_2} \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2} \times dt \tag{4.118} \]
where you must supply t-limits, and plug in for both dx/dt and dy/dt and then integrate.
Let’s do some examples. Take y = x^2 for 0 ≤ x ≤ 3. The value of dy/dx = 2x, and the limits on the integral are 0 and
3. The arc length is
\[ \int_0^3 \sqrt{1 + (2x)^2}\,dx = \int_0^3 \sqrt{1 + 4x^2}\,dx \]
This integral can be worked exactly (try it on Maple), and the answer is
\[ \frac{3}{2}\sqrt{37} + \frac{\ln(6 + \sqrt{37})}{4} \]
which evalf’s to 9.747088758.
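If you want to try this example in Maple, the following is a minimal sketch of one way to do it, following formula (4.114). The name L for the arc length is my own choice for this illustration. Depending on the version, Maple may display the exact answer in a different but equivalent form (for instance, with an inverse hyperbolic sine in place of the logarithm), but the evalf value should agree with the decimal quoted above.

```maple
# The curve y = x^2 on 0 <= x <= 3, with x as the independent variable
y := x^2:

# Integrand from (4.114): sqrt(1 + (dy/dx)^2)
ds := sqrt(1 + diff(y, x)^2):

# Exact arc length, then its decimal approximation
L := int(ds, x = 0 .. 3);
evalf(L);
```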
For another example, suppose we take x = t^2 + t and y = t^3 − t, for 0 ≤ t ≤ 2. (We could solve x for t and plug into
y, but that would leave a set of equations that are ghastly. Then we’d have to differentiate them before squaring them. It
would be serious. See the homework, where I guide you through this problem using Maple, for just some of the hassles
that can occur.)
Mugsy: What?! You mean that we’re going to have to go through a bunch of algebra that he has already described
as “ghastly?”!
Albert: Sort of. You have to get Maple to do it, actually. That makes it reasonable.
Then you get
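\begin{align}
\int_{t_1}^{t_2} \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2} \times dt &= \int_0^2 \sqrt{(2t + 1)^2 + (3t^2 - 1)^2}\,dt \tag{4.119}\\
&= \int_0^2 \sqrt{9t^4 - 2t^2 + 4t + 2}\,dt \tag{4.120}
\end{align}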
This function can’t be integrated in terms of elementary functions (it is called an elliptic integral).
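For contrast, here is a parametric curve where the integral does work out by hand. The curve x = t^2, y = (2/3)t^3 for 0 ≤ t ≤ 1 is an assumption chosen for this illustration so that the square root simplifies; it is not typical.

```latex
\begin{align*}
\frac{dx}{dt} &= 2t, \qquad \frac{dy}{dt} = 2t^2 \\
\text{arc length} &= \int_0^1 \sqrt{(2t)^2 + (2t^2)^2}\,dt
   = \int_0^1 2t\sqrt{1 + t^2}\,dt \qquad (t \ge 0) \\
&= \frac{2}{3}(1 + t^2)^{3/2}\,\Big|_0^1
   = \frac{2}{3}\left(2\sqrt{2} - 1\right) \approx 1.219
\end{align*}
```

As a sanity check, the straight-line distance between the endpoints (0, 0) and (1, 2/3) is about 1.202, and the curve should be (and is) slightly longer than that.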
Homework #49
Exercises.
1. Calculus texts love to give problems that involve arc length where the square root in the integrand can be taken
exactly. This involves a very careful choice of functions y = f (x) or x = x(t), y = y(t). This question works through
some of the y = f (x) sort. The trick usually is to choose y = f (x) so that what’s inside the square root in the integral
is a perfect square. Taking the square root then leaves some reasonable function to integrate.
Dudley: My definition of reasonable doesn’t match your definition very closely.
(a) Take y = (1/3)(x^2 + 2)^(3/2). Find the integrand for arc length and simplify it to some ∫ (?) dx that can be integrated
easily. (That is, so the square root is gone.)

(b) Take y = (1/4)x^4 + (1/8)x^(−2). Again, find the integrand, but this time, multiply out the squared term inside the square
root. You should find that a bit of cancellation allows the four terms inside the square root to reduce to three
terms that are also a perfect square. Take the square root. This is the usual trick for this type of problem. Check
this out using Maple if you get stuck (or even if you don’t).
(c) Take y = x^(3/2). Find the integrand. In this case, you get a differential that can be integrated exactly even though
the square root is still present.
2. Find the circumference of a circle of radius R. Do this by writing the circle as x = R cos θ , y = R sin θ and finding the
length of the curve. (Hint: What are the limits on θ that go around the circle once?)
Investigations.
1. This question is to show you how much easier it is to work arc length with parametric equations when that is the form
you have. Use Maple (unless you are feeling very brave or very masochistic) to do the following parts. Remember
to change the colons at the end of these commands to semicolons so that you can see the answers.
(a) Define
> x := t^2 + t:
> y := t^3 - t:
and then find ds (without the dt).
> ds := sqrt( diff(x,t)^2 + diff(y,t)^2 ):
Then find the arc length. (You will save a lot of time by using the inert form of the integration command,
Int();, inside evalf();. With the active form, Maple grinds for a long time trying to figure it out, and ultimately
generates a gigantic, and unusable, answer.)
> evalf( Int( ds, t = 0 .. 2) ):
The value should be between 9 and 10. This is the correct value of the arc length.
(b) In the rest of the problem, we try to do this by converting the equations to an explicit form. Clear x by typing
this.
> x := 'x':
and then type

> solve(x=t^2+t, t):
This gives t as a function of x. You will note that there are two solutions. This makes sense, since it is a
quadratic equation for t. Keep both solutions around by assigning a variable to the solutions. (I call it ts for
“the t’s,” that is the value of t given by the x equations.)
> ts := %:
(c) Let
> t := ts[1]:
so that t takes on the first of those solution values. Then look at
> y:
Maple will automatically use the value of t that you had from the t:=ts[1] when finding y. (At this point, you
have solved the parametric equations down to the form y = f (x), that is, an explicit equation.)
(d) Now find
> ds := sqrt( 1 + diff(y,x)^2 ):
We now want to integrate ds for a certain range of x’s. How would you find the x-limits to correspond to
0 ≤ t ≤ 2? (Hint: You have the equation!) Integrate the function by typing
> evalf( Int(ds, x = 0 .. 6) ):
You will notice that the answer is either between 9 and 10 (the correct one) or between 24 and 25 (which is
obviously not right).
(e) Repeat the instructions for the previous two parts, replacing the first step, t:=ts[1];, with
> t := ts[2]:
Now the evalf(Int(ds,x=<limits you found>)); gives the other value (whichever one you did not
get two parts ago).
(f) Why did one value of t give the wrong integral, and the other the correct integral, and how would you tell which
one is correct without evaluating the integrals? (Hint: Plug t = 1 into the original parametric equations for x.
What value of x do you get? Plug that value of x back into the ts[1] and ts[2] expressions that you got by
solving for t. Which one gives t = 1, the correct value of t?)
(Are you convinced that you should just stick with the parametric equations form of x(t) and y(t) when that’s
what is given to you? And if the equations for x(t) and y(t) weren’t so easy (!), Maple wouldn’t be able to solve
for t at all, and the whole process would be stuck from the beginning. It would eliminate the problem of which
solution of t to use, since there wouldn’t be any solutions!)
Mugsy: I’m not convinced that this guy isn’t a sadist.
Dudley: Huh? Does that mean you think he is or you think he isn’t?
Mugsy: Yes.
2. In this question, we show that the formula for arc length reduces to the formula for total distance traveled in the
previous section. Accordingly, we assume that x = x(t), and that y(t) = 0, so that the object is moving along the
x-axis, and t1 ≤ t ≤ t2 .
(a) Find ds/dt. What do you get when you take the square root? (Hint: It is not dx/dt!)
(b) Plug ds/dt into the integral for arc length with the dt differential and the correct limits on the integral. Since
v(t) = dx/dt, put v(t) into the integral, and show that it is exactly what we got before.
4.3.4 Surface areas of revolution.

Another standard “application” is the areas of surfaces of revolution. First, the term surface of revolution needs to be
explained. If you take a wire and spin it around some axis fast enough, it will appear to be a sort of solid surface. That’s
the surface of revolution for that wire and that axis. What we want to do is find the area of that surface.
By now, the procedure should be getting familiar. We chop up the surface, find the area of a differential slice, and add
up those slices by an integral. It isn’t quite obvious how to slice this surface up, though.
Mugsy: I thought that this was supposed to become boring. How come so much keeps being “not quite obvious?”
It turns out that the right way to do it is to slice the curve up into differentials, and watch what happens to a differential as
it spins.
A differential of the curve, when it spins, produces a differential-width hoop. In order to find the area of the hoop, we
slice it across and lay it out flat, where it becomes a ribbon, essentially a long, skinny rectangle. The area will be the width
(ds) times the length. The length of the ribbon is the circumference of the hoop. (Think about that for a while until it makes
sense.)
Mugsy: He’s discriminating against me again.
The circumference of the hoop is 2π× (radius of hoop). For definiteness, we’ll use the Greek letter ρ (rho, the equivalent
of “r,” and pronounced like “row”)
Dudley: ρ, ρ, ρ your boat,
Mugsy: Gently down the stream. . .
Albert: At least you’re paying attention.
for the radius of the hoop. That makes the length of the ribbon 2πρ, and the area of the ribbon 2πρ ds. Adding all of these
up (with an integral, of course) gives the formula:
\[ \text{Surface area of revolution} = \int 2\pi\rho\,ds \tag{4.121} \]
This formula presents us with all the same problems that arc length did, and then some.
Dudley: Mugsy, I’m beginning to agree with you about that “boring” thing.
We must deal with ds again, and we do that exactly the same way that we did with arc length. (This “application” always
immediately follows arc length for this reason.) So, that is something you should know how to do.
The ρ presents more of a difficulty. It came from the radius of the hoop, but how do we find that when we are confronted
with a problem? Basically, the radius of the hoop is the distance of the differential (the ds-piece) of the curve from the axis
of rotation. (Think about that for a bit until it makes sense.)
Mugsy: Hey, would you cut that out? I already feel picked on.
Dudley: I never thought I’d see the day....
So, how do we find ρ in a problem? It is the distance of the curve (which is where the differential ds piece lives) from the
axis of rotation. This needs to be highlighted:
ds = same as in arc length

ρ = distance of the curve from the axis of rotation
There are two warnings. First, remember that distances between objects are always measured along a perpendicular.
Since the axis of rotation must be a line, the distance is measured perpendicular to that line. Second, remember that ρ, just
like everything else in the integral, must be expressed in terms of the integrating differential variable.
Let’s try to find surface area now for a specific example. Suppose we revolve the curve y = x^2 for 0 ≤ x ≤ 4 about the
y-axis. We’ll get a bowl-shaped object called a paraboloid, but picturing it is not critical. What is useful is to be able to
draw the curve and the axis of rotation in two dimensions (the xy-plane). In this case, the graph is fairly simple. The curve
is a parabola, and the axis of rotation is the y-axis. Finding ds is also fairly easy:
\begin{align}
ds &= \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx \tag{4.122}\\
&= \sqrt{1 + (2x)^2}\,dx \tag{4.123}\\
&= \sqrt{1 + 4x^2}\,dx \tag{4.124}
\end{align}
Note that by doing this calculation, we have declared that x is the independent variable and the variable of integration for this
problem. We also know from the curve that the limits on the integral will be (x going from) 0 to 4. The remaining items are
to find ρ and assemble the information into the integral. (Note that we don’t need to know ρ or the limits to find ds.)
When we tackle ρ, we need to understand what it is, and locate ρ on the graph. It will measure the distance from
a differential chunk of the curve y = x^2 to the y-axis. In this case, the perpendicular to the y-axis is a horizontal line.
(Anything that passes through the point can be considered perpendicular to it, so the y-axis determines the perpendicular.)
How do you measure a horizontal distance? It is the difference in the x-coordinates, or more specifically, it is the
x-coordinate of the right-hand end minus the x-coordinate of the left-hand end. The right-hand end is the point, with x-
coordinate x (generic point). The left-hand end is the y-axis, with x-coordinate 0. (The y-axis has x-coordinate 0; the x-axis
has y-coordinate 0.) Then ρ = (x) − (0) = x.
Now let’s assemble the answer. The formula is ∫ 2πρ ds, which in this case is
\[ \text{Area} = \int_0^4 2\pi(x)\sqrt{1 + 4x^2}\,dx = 2\pi\int_0^4 x\sqrt{1 + 4x^2}\,dx \]
This integral can be worked exactly (a simple substitution does it; see the homework), but this is somewhat unusual. Most
integrals can’t be worked exactly.
Dudley: At least he assigns ones that are possible to do.
Be careful that you realize that we are finding the distance between the individual points (actually, the differential-sized
chunks) of the curve y = x^2 and the y-axis, and not the distance between the whole curve and y-axis. Those are two different,
though related, things. Each point has its own distance from the y-axis. But the curve as a whole has a single distance from
the y-axis, which in this case is 0. (The two intersect at (0,0). The distance of the curve in general is the smallest of the
distances from each of the points.)
Suppose we alter the problem slightly. Suppose we take the same curve and rotate it about the x-axis rather than the
y-axis. Then all we need to recalculate is the value of ρ. The ds portion of the integral depends only on the curve, and not
on how it is rotated. What is the value of ρ this time? The axis of rotation is now the x-axis, so ρ is the distance from the
point (x, x^2) to the x-axis. What is the perpendicular to the x-axis (again, the perpendicular to the point won’t specify any
direction)? It is a vertical line segment. What is the length of a vertical segment? It is the difference in y-coordinates, or
more specifically, it is the upper point’s y-coordinate minus the lower point’s y-coordinate. In this case, the line segment
runs from the point (x, x^2) to the x-axis. The y-coordinate of the upper point is then x^2; the y-coordinate of the lower end of
the segment is 0. The difference of these is ρ = (x^2) − (0) = x^2. The surface area of this is
\[ \text{Area} = \int_0^4 2\pi(x^2)\sqrt{1 + 4x^2}\,dx = 2\pi\int_0^4 x^2\sqrt{1 + 4x^2}\,dx \]
Again, this integral can be worked exactly, but the work is considerably harder. See the homework (where Maple does it!).
Note that we found the surface area without knowing anything about what the surface itself looks like.
It was useful to express the point (x, y) as (x, x^2), since then we will have everything expressed in terms of the variable
we will want to have in the integral (x in this case, from the fact that we found ds in terms of dx).
What happens if you rotate about a line other than the x- or y-axis? The ds lives just on the curve, and couldn’t care
less about what line it will get rotated about. However, ρ depends on the line. The easiest lines to deal with are horizontal
(in the form y = C) or vertical (in the form x = C). In that case, you find ρ will be |y −C | or |x −C |, depending on the line.
(You’ll have to think about this for a while.)
Mugsy: That’s it. I give up.
Dudley: What are you going to do? Quit?
Mugsy: No. Stop thinking. I think.
To work with the absolute values (and note that I didn’t use them in the examples), you figure out the sign of y − C or x − C
first, and use it to remove the absolute value before plugging into the integral. For example, if we rotated the y = x^2 curve
(for 0 ≤ x ≤ 4) about the line y = 20, the value of ρ would be |y − 20|. But for points on the curve, y ≤ 16 < 20, so
ρ = |y − 20| = −(y − 20) = 20 − y. That’s what you would use in the integral.
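To make the y = 20 case concrete: with ds = √(1 + 4x^2) dx and ρ = 20 − y = 20 − x^2, the surface area integral would be ∫ 2π(20 − x^2)√(1 + 4x^2) dx, with x going from 0 to 4. The Maple lines below are a sketch of how one might approximate it. The variable names ds and rho are my own choices for this illustration.

```maple
# Curve y = x^2 on 0 <= x <= 4, rotated about the horizontal line y = 20
y := x^2:
ds := sqrt(1 + diff(y, x)^2):

# Radius of the hoop: the curve lies below the line, so rho = 20 - y
rho := 20 - y:

# Numerical value of the surface area, using the inert Int inside evalf
evalf( Int(2*Pi*rho*ds, x = 0 .. 4) );
```

Only the rho line changes if you pick a different horizontal axis of rotation; the ds line depends on the curve alone.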
Homework #50
Exercises.
1. Evaluate \( 2\pi\int_0^4 x\sqrt{1 + 4x^2}\,dx \). This can be done with a substitution. Use Maple to check your answer.
2. Evaluate \( 2\pi\int_0^4 x^2\sqrt{1 + 4x^2}\,dx \) using Maple.
3. Set up the integral giving the surface area obtained by rotating the curve y = x sin x for 0 ≤ x ≤ π about the line
y = −2. Use Maple to approximate the integral.
Problems.
1. The distance between the point (x0, y0) and the line a x + b y = c is
\[ \frac{|a x_0 + b y_0 - c|}{\sqrt{a^2 + b^2}} \]
Use this formula to set up an integral giving the surface area obtained by rotating the curve y = ln x, for 1 ≤ x ≤ 4,
about the line y = 3 x + 1. (Note that “set up” means just that. You aren’t going to be able to evaluate the integral you
get.)
2. This exercise will find the surface area of a sphere of radius R. A circle is what you want to rotate to give a sphere.
The simplest way to express the curve is parametrically. Use x = R cos t, and y = R sin t. Rotate about the x-axis.
(a) What limits should be put on t? This is not an easy question! In order for the problem to work correctly, we
must pick limits on t that specify a section of the curve that, when rotated, covers the sphere completely, but
with no overlap. (Not covering the surface completely means that the answer would be too small. Overlap, on
the other hand, would count some of the surface area more than once, and the result would be too big.)
(b) What is ds? (There will be a substantial simplification using a trigonometric identity.) What is ρ? Set up
the integral and evaluate it. (Look up the surface area of a sphere in some reference book. Does your answer
check?)
Albert: Archimedes also worked out this problem, getting this answer, using his method of exhaustion. I’m
impressed, anyway.
Mugsy: Really? I thought he was strictly simple stuff.
4.3.5 Hydrostatic force (pressure).

This is the last “application” from typical calculus courses that we’ll cover in this cursory way. Other applications from
calculus will show up later on, as we encounter them in more natural settings.
Mugsy: Like where we can hear the little birdies twittering in the trees?
Dudley: Not that kind of natural setting.
Mugsy: Oh.
This time we want to calculate the force acting on a surface submerged in a fluid, which is the reason for the “hydro”
in “hydrostatic.” A fluid is either a gas or liquid, but we will generally work with liquids. And, the fluid is not moving
(otherwise things get exceedingly complicated), which is the reason for the “static” in “hydrostatic.”
The basic formula comes from physics:
F = PA
where F is the force generated, P is a (constant) pressure, and A is the area of the surface. (Sometimes this topic is called
fluid pressure, but that is technically incorrect. We will be calculating a force. Pressure is force per unit area.)
If the pressure were constant, we’d be done. The formula would give it to us. But in the types of problems we’ll be
working, the pressure will vary with depth under the surface of the fluid, and that is what we will have to deal with. We
find the pressure from Archimedes’ law: the pressure of a liquid equals the density times the depth.
Dudley: That Archimedes guy really got around.
Mugsy: What really scares me is that this is looking too much like physics.
Albert: It is physics. But just a little bit of it.
How are we going to work this kind of problem? As always, we slice it up, find what we need for each slice, and add it
up with an integral. The thing to remember is how to slice. With hydrostatic force, we are required by Archimedes’ law to
slice so that the pressure remains a constant on each slice. That means we must slice horizontally, in every case.
Dudley: Every?
Albert: Every.
Once we have the force on each slice, we add up the forces and get a total force. The force on each slice comes from its
area times its fluid pressure by Archimedes’ law.
This type of problem is notorious for giving you a description of a situation that doesn’t include very much information.
In order to analyze it, you must provide the framework. Although there are numerous reasonable possibilities, I encourage
you to adopt one consistent approach: Let x be depth under the liquid. That means that x increases as you go down (not
up), which is a bit strange, but it turns out to be convenient. The surface of the liquid is then given by x = 0.
The problem is typically stated thus: Find the force exerted on a plate with a certain description submerged a certain
way in a fluid of some density. As indicated earlier, you should let x be depth. The density should be a constant, δ (another
Greek letter, “lower case” delta). The shape and size of the plate (it could be anything from the end of an aquarium tank to
a submarine hatch) will have to be taken into account. It fits into the problem by letting l(x) be the length of a horizontal
strip (differential-thickness sliver) at depth x. While setting up l(x), you should at the same time decide the x-coordinates
of the top and bottom of the plate. They will turn into limits on the integral, representing the smallest and largest values
that are being added up to give the force on the plate.
How do we assemble all of this? We find the force on a strip by looking for the bit of force due to the horizontal
sliver. Since the pressure is essentially constant on the sliver (which is why we took it to be horizontal), the force is
(pressure)×(area).
We get the pressure by multiplying the density of the liquid times the depth. That presents problems, since the density of
the liquid is not the force per unit volume. It is the mass per unit volume. However, we can get the force by multiplying by
the acceleration of gravity, usually written g.
Dudley: Al, Mugsy has given up on this. But I still want to try to understand. Can you do something?
Albert: It’s really that F = m a thing. Force is mass times acceleration. So, you multiply the mass times acceleration,
and you get force. In this case, the acceleration is the acceleration of gravity, since gravity is what is causing the force.
Pressure is then (density)× g×(depth) = δ g × x. Area is (length)×(height) = l(x) × dx. The force on the sliver is then
δ g x l(x) dx. Adding all of these up with an integral gives
\[ \text{Hydrostatic force} = \int \delta\,g\,x\,l(x)\,dx \tag{4.125} \]
We are in the position to make this a usable formula. The limits to be supplied are the smallest and largest values of
x for the plate. In other words, it is the depths of the top and bottom of the plate. The sticky one is l(x). That has to be
worked out with each new problem, and is dependent on the shape and depth of the plate. It is usually a matter of geometry,
and not necessarily simple stuff. The idea is to find the length of the strip as it changes with x = depth. For this reason, I
will spare you the gory details of how this works. In the somewhat unlikely event you ever need to use this, there should be
enough information here for you to work it out.
As an example, let’s find the force on a submarine porthole.
Mugsy: I didn’t think submarines even had portholes.
Albert: They don’t. I think this might be a joke.
Suppose the porthole has a radius of 1 foot, and the center of the porthole is 200 feet below the surface of the water. The
density of sea water is 64 lb/ft^3. (Actually, that’s the force-density, not a mass-density, since pounds are a force. That means
that we can take δ g = 64. So, we can actually ignore one term in that integral.) The trick is with l(x), as always. We need
an equation for it. The equation of a circle of radius 1, with center at (200, 0) (remember that down is positive and x is the
depth!) is (x − 200)^2 + y^2 = 1^2. The distance l(x) will go from one side of the circle to the other, in other words, between
the two y-values. Since solving the equation gives y = ±√(1 − (x − 200)^2), we get that
\[ l(x) = \left(\sqrt{1 - (x - 200)^2}\right) - \left(-\sqrt{1 - (x - 200)^2}\right) = 2\sqrt{1 - (x - 200)^2} \]
Finally, we need the values for the limits on the integral. Since the largest value of x on the hatch is 201, and the smallest
value of x on the hatch is 199, the limits are 199 to 201. The integral set up is
\begin{align}
\text{Force} &= \int \delta\,g\,x\,l(x)\,dx \tag{4.126}\\
&= \int_{199}^{201} 64\,x \cdot 2\sqrt{1 - (x - 200)^2}\,dx \tag{4.127}
\end{align}
Maple can evaluate it

Mugsy: Of course.
and it gives 12800π ≈ 40,000 pounds, or about 20 tons. You don’t have to worry about being able to open the hatch on a
sub while it’s under water. The force of the water is very effective at keeping it shut.
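The value 12800π can also be checked by hand. The following is a sketch using the substitution u = x − 200 (my choice of substitution, which shifts the center of the circle to u = 0). The first integral vanishes because its integrand is an odd function on a symmetric interval, and the second is the area of half of a unit circle.

```latex
\begin{align*}
\int_{199}^{201} 64\,x \cdot 2\sqrt{1 - (x - 200)^2}\,dx
  &= 128\int_{-1}^{1} (u + 200)\sqrt{1 - u^2}\,du \\
  &= 128\int_{-1}^{1} u\sqrt{1 - u^2}\,du
     + 25600\int_{-1}^{1} \sqrt{1 - u^2}\,du \\
  &= 0 + 25600 \cdot \frac{\pi}{2} = 12800\,\pi
\end{align*}
```

Note that 12800π = (64 × 200) × π, which is the pressure at the center of the porthole times the area of the porthole.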
Homework #51
Exercises.
1. Find the fluid force on Hoover Dam. Treat it as a rectangle that is 726 feet high and 1244 feet long, with the water
level with the top. Use the “density” of water 62.4 lb/ft^3. Convert your answer into tons (2000 lbs = 1 ton).
2. Set up an integral for the fluid force on a porthole of a submarine. Specifically, presume that the porthole is a circle
of radius R, with the center at a depth of D below the surface of the water. Also assume that D > R.

1. The distributive and associative laws for addition mean that you can pull constants outside of summations and split
summations of terms apart.
2. Integrals “add up” differentials, allowing you to move from differential-sized changes to any change size at all.
3. Indefinite integrals must contain constants of integration. They are used to evaluate definite integrals, which never
contain constants of integration. When evaluating definite integrals, the value of the constant of integration in the
indefinite integral doesn’t make any difference, so it is usually set to zero (so it appears that the constant was simply
omitted).
4. When integrating both sides of an equation, you need to put a constant of integration on only one side.
5. The equations of ballistic motion are:
\begin{align*}
x(t) &= v_{0x}\,t + x_0 \\
y(t) &= -\frac{1}{2}\,g\,t^2 + v_{0y}\,t + y_0
\end{align*}
6. When applying limits to two definite integrals at once, you must apply corresponding (not equal) limits. The
differential in the integral tells you what the independent variable is, and you want corresponding values of those
variables.
7. The standard integration formulas are
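\begin{align*}
\int u^n\,du &= \frac{1}{n+1}\,u^{n+1} + C \quad \text{for } n \neq -1 \\
\int \frac{1}{u}\,du &= \ln|u| + C \\
\int \sin u\,du &= -\cos u + C \\
\int \cos u\,du &= \sin u + C \\
\int e^u\,du &= e^u + C \\
\int \frac{1}{\sqrt{1 - u^2}}\,du &= \operatorname{Arcsin} u + C \\
\int \frac{1}{1 + u^2}\,du &= \operatorname{Arctan} u + C \\
\int \frac{1}{|u|\sqrt{u^2 - 1}}\,du &= \operatorname{Arcsec} u + C
\end{align*}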
8. You use substitution to convert integrals to one of these standard forms. You let u be
• The inside of the most complicated part, or
• Any function whose derivative appears as a factor in the integrand, or
• Something more complicated (like a trig substitution), but don’t worry about those.
Remember to change the differential to the new variable, as well as the limits (for definite integrals).
9. You use partial fractions to integrate rational functions (the quotient of two polynomials). The steps are:
(a) Divide the integrand, if necessary, to make sure that the expression you use partial fractions on is proper (that
is, the degree of the top is less than the degree of the bottom).
(b) Factor the denominator into linear or irreducible quadratic terms.
(c) Set up the correct partial fractions form, which is to put in terms for each factor in the denominator, the number
of terms equalling the degree of the factor, and the numerators being either constants (when the factor is linear
to a power) or linear (when the factor is quadratic to a power).
(d) Solve for the coefficients in the numerators. (You will not be required to do this step by hand.)
10. My approach to integration by parts uses categories of functions. The three categories are:
Category 1. Logarithms and inverse trigonometric functions
Category 2. Polynomials and powers of x
Category 3. Exponentials, sines, and cosines
The formula for integration by parts is
\[ \int u\,dv = u\,v - \int v\,du \]
You use integration by parts when you have

• the product of functions from more than one category, in which case you let u = the function from the lower-numbered category (for example, u = x in x e^x, and u = ln x in x ln x); or
• a single term from the first category, in which case you let dv = dx; or
• a product of terms all from the third category, in which case you will need to do two integrations by parts. You should let u = the exponential term (if it exists) both times; otherwise, let u = any term the first time, and the second time u should be what that first u became. (The danger is the second integration by parts undoing the first one.) Then you will have to solve for the integral you want.
11. Tabular integration is a very fast, systematic way to apply integration by parts, but it only works when you want to
integrate a polynomial times a category 3 function.
12. Tables of integrals are usually organized by increasingly complicated functions. To use one, pick out the most complicated term in the integral you want to evaluate, and find that section in the table. Then pick out the next most complicated term in the integral, and locate the subsection (don’t go out of the section) with that kind of term. Keep doing that until you find the integral you have.
13. Reduction formulas are common in integral tables. You end up using them several times, with different values of the
constants, until you get to an integral you can work directly.
14. There are numerous methods of approximating definite integrals. The ones that we covered (Riemann sums based on right endpoints, left endpoints, and midpoints, plus the trapezoidal rule and Simpson’s rule) are elementary. You will not have to know how to calculate those by hand.
15. The area between the curve y = f(x) and the x-axis for a ≤ x ≤ b is
\[ \int_a^b |f(x)|\,dx. \]
If the values of a and b are not given to you, solve f(x) = 0 and use the largest and smallest values of x.
16. To integrate the absolute value of a function over an interval a ≤ x ≤ b, you find the places where f (x) = 0, and
discard the points that are not between a and b. You integrate the function over the remaining intervals, and add the
absolute values of the answers.
17. To find the area between two curves y = f(x) and y = g(x) for a ≤ x ≤ b, calculate
\[ \int_a^b |f(x) - g(x)|\,dx. \]
If the values of a and b aren’t given to you, solve f(x) = g(x) and use the largest and smallest x values.
18. If v(t) is the velocity of an object along the x-axis, then the total distance traveled for $t_1 \le t \le t_2$ is
\[ \int_{t_1}^{t_2} |v(t)|\,dt, \]
while the net distance traveled is
\[ \int_{t_1}^{t_2} v(t)\,dt. \]
19. The general formula for arclength is $\int ds$, but to apply that to a specific problem in two dimensions, you need to use $ds = \sqrt{dx^2 + dy^2}$. (In three dimensions, it is $ds = \sqrt{dx^2 + dy^2 + dz^2}$.)
20. You convert ds into something that you can integrate by factoring out of it the differential of the independent variable. Note that when you do that, you end up dividing the $dx^2$ and $dy^2$ by the square of the independent variable’s differential. The quotient of the squares of two differentials is the square of the derivative. You then take the limits on the independent variable as the limits on the integral.
21. The formula for the surface area of revolution generated by revolving a curve about a line is $\int 2\pi\rho\,ds$, where ρ is the (function giving the) distance between the curve and the axis of rotation, and ds is the same as the ds for arclength.
22. The formula for the hydrostatic force (the total force due to fluid pressure) on the vertical face of a submerged object is
\[ \int \delta\,g\,x\,l(x)\,dx, \]
where x represents depth below the surface of the fluid, δ is the (mass) density of the fluid, g is the acceleration of gravity (so that δ g = the force density of the fluid), and l(x) is the horizontal length of the object at depth x. You also need to provide limits of integration, which are the minimum and maximum depths of the object.
23. The new Maple commands from this chapter are:
• sum(function,variable=start..end); which adds up all the values of the function replacing variable
in function successively by the values from start to end.
• int(function,variable); which finds the indefinite integral of function with respect to variable, and int(function,variable=start..end); which finds the definite integral of function from variable=start to variable=end.
• Int(function, variable); and Int(function, variable=start..end); do nothing more than format the integral to print it out on the screen. (These are called the inert forms of the integrals.) To get Maple to carry out the integration on an inert integral in the previous step, type in value(");.
• convert(function,variable,parfrac); which does a partial fractions expansion of function assuming
that the variable is variable.
• The change of variables command is built into the student package, so before you can use it, you have to tell Maple with(student):. Then the command changevar(integral, substitution_equation, new_variable); performs a substitution (change of variable) in the integral, including changing the limits of a definite integral.
• The integral approximation routines are built into the student package, so all the commands that follow in this point have to be preceded by the command with(student):. (The colon suppresses the listing of the different routines in student.)
rightsum(function, variable=start..end, number_of_intervals);,
leftsum(function, variable=start..end, number_of_intervals);,
middlesum(function, variable=start..end, number_of_intervals);,
trapezoid(function, variable=start..end, number_of_intervals);,
and simpson(function, variable=start..end, number_of_intervals);
are all definite integral approximation routines, using, respectively, the right endpoints, the left endpoints, the midpoints of approximating rectangles, the trapezoidal rule, and Simpson’s rule. The integral that is approximated is $\int_{\text{start}}^{\text{end}} \text{function}\; d(\text{variable})$. The number_of_intervals is always optional, but if you don’t specify a value, Maple will use 4. The output of these routines is an inert summation. To convert it to a decimal value, use evalf(value("));.
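For readers who want to see what the five approximation rules compute, here is a minimal sketch in Python rather than Maple. The function names are stand-ins chosen for this illustration (they are not the student package routines), and the default of 4 intervals simply mirrors the default described above.

```python
# Illustrative Python stand-ins for the five approximation rules.
# These are NOT Maple's routines; names and defaults are assumptions.

def left_sum(f, a, b, n=4):
    """Riemann sum using the left endpoint of each of n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + i * h) for i in range(n))

def right_sum(f, a, b, n=4):
    """Riemann sum using the right endpoint of each subinterval."""
    h = (b - a) / n
    return h * sum(f(a + (i + 1) * h) for i in range(n))

def middle_sum(f, a, b, n=4):
    """Riemann sum using the midpoint of each subinterval."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def trapezoid(f, a, b, n=4):
    """Trapezoidal rule: the average of the left and right sums."""
    return (left_sum(f, a, b, n) + right_sum(f, a, b, n)) / 2

def simpson(f, a, b, n=4):
    """Simpson's rule; n must be even."""
    if n % 2 != 0:
        raise ValueError("Simpson's rule needs an even number of intervals")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points alternate weights 4, 2, 4, ...
        total += (4 if i % 2 == 1 else 2) * f(a + i * h)
    return h * total / 3

def square(x):
    return x * x

# Approximations of the integral of x^2 from 0 to 2 (exact value 8/3).
approximations = {
    "left": left_sum(square, 0, 2),
    "right": right_sum(square, 0, 2),
    "middle": middle_sum(square, 0, 2),
    "trapezoid": trapezoid(square, 0, 2),
    "simpson": simpson(square, 0, 2),
}
```

With only 4 intervals the left and right sums bracket the true value, the midpoint and trapezoid values are closer, and Simpson’s rule is exact for this quadratic integrand.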
Chapter 5
Water Balloons
5.1 Introduction.
5.1.1 What happened?
In previous incarnations of this textbook, chapter 5 was all about how to relate the counter of cassette tapes or VCR tapes to
elapsed time. But fewer and fewer people even own cassette players or VCR tapes, having replaced them with CD-ROMs
and DVDs. So, to keep the text relevant, it has been changed around to something more of a timeless nature: water balloon
launching.
Dudley: Gee, I wonder if they will show how to maximize distance?
Albert: Maybe. It isn’t that hard.
Mugsy: Can this stuff be applied to other things than water balloons?
Dudley: It should. You have something in mind?
Mugsy: Yeah, but you probably don’t want to know. It’s someone, rather than something.
Dudley: You’re right, I don’t want to know.
Mugsy: You never heard of a human cannonball?
Dudley: Was this done voluntarily?
Mugsy: For some definition of voluntarily, yes.
5.2 The background.

We are going to do two different things using water balloons. One is to estimate launch velocity and how that depends on the amount that the balloon is pulled back in the launcher. The other is to figure out how to track the balloon on a camera, by figuring out how the vertical angle changes with time.
Mugsy: That’s so that you can capture all this for YouTube. You want to make the launch look good, right?
Albert: I use Xanga.
5.2.1 The equations we will need

We studied ballistic motion in the last chapter. Here are the equations from there. Note that they are parametric equations.
\[ x(t) = v_{0x}\,t + x_0 \]
\[ y(t) = -\tfrac{1}{2}\,g\,t^2 + v_{0y}\,t + y_0 \]
Here is the meaning of the variables:
Variable    Meaning
t           Time
$v_{0x}$    Initial x velocity
$x_0$       Initial horizontal position
g           Acceleration of gravity
$v_{0y}$    Initial y velocity
$y_0$       Initial height
Dudley: AGGGH! Variables! Lotsa variables!
Albert: But you have seen all of them before, Dudley.
Dudley: That doesn’t mean that they didn’t terrify me then, too!
To simplify things, we will assume that the water balloon is launched from ground level, a reasonable assumption.
Dudley: So these equations don’t work if you are launching water balloons out of a dorm room window?
Mugsy: Or off of the roof of the science building?
Albert: I refuse to answer some questions that can get you into serious trouble.
That means that we will assume that y0 = 0. It simplifies the equations immensely.
For this, we are going to want to find the range of the water balloon. As with all parametric equations, we will want to rephrase the question in terms of the parameter. What is the range of the launcher going to be? Described in terms of the trajectory of the water balloon, it is the distance between the two points at which the balloon is on the ground (those two points being where it is launched from and where it lands). Since the parameter is time, questions regarding the parameter will be phrased with the word when. So, what we want to find is when the balloon is on the ground. There should be two times. Then, once we have the two times, we have to find the two positions, and from that, find the distance between the two positions. So, how do we find when the water balloon is on the ground? The defining characteristic of being on the ground is y = 0, so what we want to do is solve y = 0 for t. Note that y is a quadratic in t, so we will get two different values of t, which is just what we want. Remember that $y_0 = 0$.
\begin{align}
0 &= y \tag{5.1} \\
  &= -\tfrac{1}{2}\,g\,t^2 + v_{0y}\,t \tag{5.2} \\
  &= t\left(-\tfrac{1}{2}\,g\,t + v_{0y}\right) \tag{5.3}
\end{align}
That gives two equations, $t = 0$ and $-\tfrac{1}{2}\,g\,t + v_{0y} = 0$. The first value, t = 0, is expected. That would represent the launch of the balloon. The other value solves to give $t = 2 v_{0y}/g$. That is the landing time. To get the range, we will need to know the positions at those times. By plugging into the equation for x(t), we get that at t = 0, $x = x_0$ and at $t = 2 v_{0y}/g$, $x = x_0 + 2 v_{0x} v_{0y}/g$. The difference is the range, $\dfrac{2 v_{0x} v_{0y}}{g}$.
That is correct, as far as it goes, but isn’t in the most usable form.
Dudley: Hey! Maybe he will show how to maximize distance!
Albert: It’s beginning to look like it.
If we go back to trigonometry we get that v0x = v0 cos(θ ) and v0y = v0 sin(θ ), where θ is the launch angle.
Dudley: AAAUUGGGGH! Another variable!
If we plug those in, and use the trig identity sin(2θ) = 2 sin(θ) cos(θ), we get that the range is actually
\[ \text{Range} = \frac{(v_0)^2 \sin(2\theta)}{g}. \]
From this, we can see how to maximize distance.
Dudley: At last, what I have been waiting for!
The largest value of the right hand side occurs when sin(2 θ ) = 1, which occurs when 2 θ = π/2, or when θ = π/4.
Dudley: Lessee, π/4 radians is, uh, 45°, right?
Albert: Yes! Congratulations!
Dudley: So launching the water balloon at 45° will maximize the distance it goes?
Albert: Yes and no.
Dudley: Hey, quit being as confusing as the book. Give me an answer I can use.
Albert: If there weren’t any air resistance, then yes, a 45° launch angle will maximize distance. But if you want to take air resistance into account, you need to lower that angle somewhat. The maximum distance for hitting a baseball occurs at about a 40° angle.
Mugsy: OK, how would you know that?
Albert: I read a book called The Physics of Baseball.
Mugsy: And I suppose that was pleasure reading for you.
It is also important to notice here that the range is proportional to the square of the launch velocity. That is, doubling
the launch velocity multiplies the range by four. Doing what you can to increase the velocity is clearly important.
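The range results of this section can be checked numerically. The sketch below is in Python rather than Maple, ignores air resistance just as the derivation does, and uses a launch speed of 20 purely as a made-up illustrative value (g = 9.8 is the value the chapter uses later). It compares the range found from the landing time with the closed-form formula, searches whole-degree angles for the longest range, and checks the effect of doubling the launch speed.

```python
import math

G = 9.8  # acceleration of gravity, the value used later in the chapter

def landing_time(v0, theta):
    """Second root of y(t) = 0 when y0 = 0: t = 2*v0y/g."""
    return 2 * v0 * math.sin(theta) / G

def range_from_components(v0, theta):
    """Range as x(landing time) - x(0) = v0x * t_land."""
    v0x = v0 * math.cos(theta)
    return v0x * landing_time(v0, theta)

def range_formula(v0, theta):
    """Closed form: v0^2 * sin(2*theta) / g."""
    return v0 ** 2 * math.sin(2 * theta) / G

V0 = 20.0  # hypothetical launch speed, for illustration only

# Whole-degree launch angle (1..89) giving the longest range.
best_degrees = max(range(1, 90),
                   key=lambda d: range_formula(V0, math.radians(d)))

# Ratio of ranges when the launch speed is doubled at a fixed angle.
doubling_ratio = (range_formula(2 * V0, math.radians(40))
                  / range_formula(V0, math.radians(40)))
```

Under these assumptions the search lands on 45 degrees and the doubling ratio comes out as 4, matching the two conclusions above.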
5.2.2 The launcher

We want to look at how operating the launcher affects the balloon. Specifically, how does the range change depending on
how much you pull back?
Mugsy: Oh, that one is easy!
Albert: It is? Explain, please.
Mugsy: The more you pull back, the farther it goes.
Albert: Yes, but there is a lot more to it.
Mugsy: It seems like there always is.
For this, we need to incorporate a bit of physics, but only a bit.
Mugsy: Dudley, quit whimpering.
The general formula for elastic, deformable objects (also called media) is F = −k x, called Hooke’s law. Here, F is the
force needed to deform (stretch or compress) the medium, x is the length of the deformation, and k is the spring constant.
Mugsy: Dudley, quit whimpering.
The potential energy in a spring (or any stretched elastic medium) is $\tfrac{1}{2} k x^2$, where k is the spring constant for the medium, and x is the distance stretched.
That energy is transferred to the balloon’s kinetic energy, $\tfrac{1}{2} m v^2$, where m is the mass of the balloon, and v is the velocity. It would be better to use $v_0$ here, since that velocity will be the launch velocity.
Setting these equal to each other, you get that
\[ v_0 = \sqrt{\frac{k}{m}}\; x. \]
This has a couple of consequences that are not at all obvious.
• The launch velocity is directly proportional to the amount that the balloon is pulled back.
• The smaller the balloon (that is, the less mass), the faster it will be launched.
• The stiffer the launch bands (that is, the larger k), the faster the balloon is launched.
Those certainly aren’t at all obvious.

Mugsy: Gee, Dudley, you stopped whimpering.
Dudley: Yeah. This looks like it could be useful.
So, if we put this together with the range equation, we get that the range of the water balloon launcher is
\[ \text{Range} = \frac{k \sin(2\theta)}{m\,g}\; x^2. \]
The observation that is most important for us is that the range is quadratic in x, the amount that you pull back on the
launcher.
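A short numerical check of that quadratic dependence, sketched in Python. The spring constant and balloon mass below are invented values used only for illustration; nothing in the text specifies them.

```python
import math

G = 9.8  # acceleration of gravity

def launch_speed(x, k, m):
    """Launch speed from the energy balance: v0 = sqrt(k/m) * x."""
    return math.sqrt(k / m) * x

def launcher_range(x, k, m, theta):
    """Range in terms of pull-back x: k*sin(2*theta)*x^2 / (m*g)."""
    return k * math.sin(2 * theta) * x ** 2 / (m * G)

K = 100.0   # hypothetical spring constant
M = 0.25    # hypothetical balloon mass
THETA = math.radians(40)

range_at_1 = launcher_range(1.0, K, M, THETA)
range_at_2 = launcher_range(2.0, K, M, THETA)

# The same range computed from the launch speed and the range formula.
via_speed = launch_speed(2.0, K, M) ** 2 * math.sin(2 * THETA) / G
```

Doubling the pull-back doubles the launch speed and therefore quadruples the range, whatever values k and m actually have.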
5.2.3 Data
Suppose we ran an experiment to check this out. You might come up with the following data.
Mugsy: Might come up with this data?
Albert: Correct. Dr. Coulliette can’t find his water balloon launcher to get some real data.
The launch angle is 40°. We are going to try to figure out the launch velocity.
x     Range
0     0.0
1     1.8
2     7.0
3     15.3
4     26.4
5     40.2
6     56.4
7     74.8
8     95.3
9     117.6
10    141.6
Here, x is the percentage of max stretch divided by 10. (So, an 80% stretch would correspond to x = 80/10 = 8.)
It is worth plotting this.
It certainly looks quadratic.
5.2.4 We have lots of data. Now what?

We certainly have a lot of data. In fact, we have a lot more than we need, in one sense. We are going to try to estimate the launch velocity from this data. But that means that we have 10 equations for one unknown. That is not what we would normally want. Why don’t we just throw out nine of the data points, use the remaining one and solve for $v_0$? It’s a good question, and deserves a good answer.
One of the hassles of measurement is errors. If we take a lot of measurements for the same thing, we can hope that the errors would average out in the long run. But that means we have to take a lot more measurements than is necessary. In this case, if you estimate $v_0$ from each of those data points (except for (0, 0), which doesn’t give any value at all for $v_0$), you get different values of $v_0$. Which one should you take? Simple, you don’t take any of them!
Mugsy: I like that! Can we leave now, then?

Albert: No.
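To see concretely that each data point gives its own estimate, here is a small Python sketch. Since the range is supposed to be a constant times $x^2$, the ratio Range/$x^2$ should come out the same for every point if the data were exact; it does not. The range list is an assumption: it uses the values that reproduce the sums and error columns quoted later in the chapter (26.4, 40.2, 56.4, 74.8, and 141.6 at x = 4, 5, 6, 7, and 10).

```python
# Data for x = 1..10 (the (0, 0) point gives no information).
# ASSUMPTION: range values chosen to match the chapter's later sums.
xs = list(range(1, 11))
ranges = [1.8, 7.0, 15.3, 26.4, 40.2, 56.4, 74.8, 95.3, 117.6, 141.6]

# If Range = (constant) * x^2 held exactly, every ratio would be equal.
ratios = [r / x ** 2 for x, r in zip(xs, ranges)]

# Gap between the largest and smallest single-point estimates.
spread = max(ratios) - min(ratios)
```

The ratios drift steadily downward from 1.8 as x grows, so ten single-point estimates give ten different answers, which is why a "best" overall value is needed.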
What we want to do is locate the “best” value for v0 . The whole problem boils down to deciding what “best” means.
There are lots of options for that, and you will see more if you take Applied Math. But the most common (even if it is not
the best) goes by the name of Least Squares analysis.
Dudley: Why don’t we use the best method instead?
Albert: There are several reasons. First, this is the most common. You need practice with it before you use the others.
Second, the others are much more difficult to handle mathematically. Third, least squares leads to the other topics
that this chapter is supposed to cover.
Dudley: All right already! I give up! Let’s do least squares!
The overall idea is simple. If our data were exact, and our value for $v_0$ were exact as well, and the model for the equations were exact, then the equation for the range would be an exact equality for each data point. But alas, none of those assumptions is correct. We live in a fallen world, and have to deal with it. Our goal, then, is to find a value of $v_0$ that makes the equation as close as possible for all the different data points. So, what we need is a method of determining how inaccurate the equation is for a specific data point and a specific value for $v_0$.
In one sense, all we have to do is use the value of v0 and the angle, plug those into the equation for the range and
compare it to the listed value for the range. The closer we are, the better.
But again, we want to use all the data points we have. Changing v0 might make one data point better, but at the expense
of making one or more of the other data points worse. What we need is a method of deciding how good one value of v0
really is. That brings us to the least squares equation.
The obvious candidate for how close the approximation to the range is to the real value of the range is to subtract,
and try to get as near zero as you can. That number is the error in the approximation. We want the errors to be small. A
too-simple approach would be to add all the errors together and try to make that small. After all, that would use all the data
points. The problem is that the errors can be both positive and negative. This would mean that positive and negative errors
could cancel, and that is not at all what we want. We want all the errors to be small individually.
So, instead, we add up the squares of the errors, and try to make that small. That way, positives and negatives get
counted the same, and they can’t cancel.
Dudley: Why don’t we take the absolute value instead of squaring?
Albert: That is one of the “better methods” that we are not covering. The reason we don’t do that is simple. Do you
remember how to differentiate absolute values?
Dudley: Yes.
Mugsy: No.
Albert: The equations that you would have to solve with absolute values are much more complicated, enough so that
using absolute values is rarely done.
What we are going to try to do, then, is to make the sum of the squares of the errors as small as possible, by choosing $v_0$ correctly.
Mugsy: Hey! That’s why they call it Least Squares.
Albert: Very good.
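The cancellation problem that motivates squaring is easy to demonstrate. In this Python sketch both error lists are made up for illustration: the first has large errors that happen to cancel, the second has small errors that do not.

```python
# Two hypothetical sets of errors (invented for illustration).
large_errors = [3.0, -3.0, 2.0, -2.0]   # a bad fit whose errors cancel
small_errors = [0.1, -0.2, 0.1, 0.1]    # a good fit

def sum_of_errors(errors):
    """The too-simple measure: positives and negatives can cancel."""
    return sum(errors)

def sum_of_squares(errors):
    """The least squares measure: every error counts, none cancel."""
    return sum(e * e for e in errors)

plain_large = sum_of_errors(large_errors)      # 0.0, looks perfect
plain_small = sum_of_errors(small_errors)      # about 0.1, looks worse
squares_large = sum_of_squares(large_errors)   # 26.0
squares_small = sum_of_squares(small_errors)   # about 0.07
```

Judged by the plain sum, the bad fit looks better than the good one; judged by the sum of squares, the ordering comes out the right way around.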
5.2.5 Working with the Least Squares equation

Setting up the Least Squares equation
Let’s work through the error for the second data point in some detail. The rest of them are identical.
Dudley: Why not the first one?
Albert: Because when x = 0, the range is 0, no matter what v0 is. There is no error in that measurement.
So, we assume that the data is quadratic. In order to simplify the calculations
Dudley: Hey, I’m all for that!
we assume that the range has an equation like
\[ \text{Range} = A\,x^2. \]
Once we get a value for A, then we can go back and actually see what value that gives for $v_0$.
The second data point has x = 1 and a range of 1.8. The error would be $(1.8 - A(1)^2)$. Of course, this depends on A, as it should. For different values of A, we get different values of the error for this term. Our work is to find the value of A that makes this error, and all the other errors as well, small. We square the error and get $(1.8 - A(1)^2)^2$. Similarly, the square of the error for the third data point is $(7.0 - A(2)^2)^2$. Keep going, and add them up at the end. Fortunately, we have Maple around. If you add them all up, and expand the result, you get $25333 A^2 - 74150.6 A + 54337.74$. That is the value of the sum of the squares of the errors.
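The expansion can be reproduced without Maple. Expanding $\sum (R_j - A x_j^2)^2$ gives $A^2 \sum x_j^4 - 2A \sum x_j^2 R_j + \sum R_j^2$, so the three coefficients are just sums over the data. A minimal Python sketch follows; the range list is an assumption, using the values that reproduce the coefficients quoted above (26.4, 40.2, 56.4, 74.8, and 141.6 at x = 4, 5, 6, 7, and 10).

```python
# ASSUMPTION: range values chosen so the sums match the quoted polynomial.
xs = list(range(0, 11))
ranges = [0.0, 1.8, 7.0, 15.3, 26.4, 40.2, 56.4, 74.8, 95.3, 117.6, 141.6]

# Coefficients of SSE(A) = coeff_a2 * A^2 + coeff_a1 * A + coeff_a0.
coeff_a2 = sum(x ** 4 for x in xs)
coeff_a1 = -2 * sum(x ** 2 * r for x, r in zip(xs, ranges))
coeff_a0 = sum(r ** 2 for r in ranges)

def sse(a):
    """Sum of squared errors computed directly, term by term."""
    return sum((r - a * x ** 2) ** 2 for x, r in zip(xs, ranges))
```

Evaluating `sse` at any trial value of A and comparing with the polynomial is a quick way to confirm the expansion.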
Solving the Least Squares equation

We want to find the value of A that makes this as small as possible. Well, that is something we have already learned to do!
That is nothing more than a max/min problem, and we are given the function! How simple can life get!
Mugsy: I wouldn’t say that this is exactly simple. . . .
What we do is differentiate, set the derivative equal to zero and solve for A.
The derivative is 50666 A − 74150.6. Set that equal to 0 and solve for A and you get A = 1.46.
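The same calculation as a few lines of Python, using only the polynomial coefficients quoted above. For a quadratic $aA^2 + bA + c$ with $a > 0$, setting the derivative $2aA + b$ to zero gives $A = -b/(2a)$.

```python
# Coefficients of SSE(A) as quoted in the text.
a, b, c = 25333.0, -74150.6, 54337.74

def sse(value):
    """SSE as a function of A."""
    return a * value ** 2 + b * value + c

def sse_derivative(value):
    """d(SSE)/dA = 2*a*A + b."""
    return 2 * a * value + b

# Where the derivative is zero.
a_best = -b / (2 * a)

# The two-decimal value used in the text.
a_rounded = round(a_best, 2)
```

The unrounded solution is about 1.4635, which rounds to the 1.46 used in the text.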
The corresponding errors

Let’s check and see how well this value of A actually works, by plugging the values of x into $A x^2$, and comparing to the ranges that we had.
x     Range    $A x^2$   Error
0     0.0      0.00      0.00
1     1.8      1.46      0.34
2     7.0      5.84      1.16
3     15.3     13.14     2.16
4     26.4     23.36     3.04
5     40.2     36.50     3.70
6     56.4     52.56     3.84
7     74.8     71.54     3.26
8     95.3     93.44     1.86
9     117.6    118.26    -0.66
10    141.6    146.00    -4.40
The sum of the squares of the errors is, according to Maple, 77.6868 at the rounded value A = 1.46. The unrounded minimizer, A = 74150.6/50666 ≈ 1.4635, gives a slightly smaller sum (about 77.37), and every other value of A gives a larger one.
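The error column and its sum of squares can be recomputed in a few lines of Python. The range list is an assumption: it uses the values for which Range minus $A x^2$ reproduces the error column (26.4, 40.2, 56.4, 74.8, and 141.6 at x = 4, 5, 6, 7, and 10).

```python
# ASSUMPTION: range values consistent with the error column.
xs = list(range(0, 11))
ranges = [0.0, 1.8, 7.0, 15.3, 26.4, 40.2, 56.4, 74.8, 95.3, 117.6, 141.6]

A = 1.46  # the rounded one-parameter least squares value

# Error = measured range minus fitted range, point by point.
fitted = [A * x ** 2 for x in xs]
errors = [r - f for r, f in zip(ranges, fitted)]

# Sum of the squares of the errors.
sse_one_parameter = sum(e * e for e in errors)
```

Note the pattern in the signs: the fit undershoots in the middle of the data and overshoots at the end, which is a hint that a single $A x^2$ term is not quite the right shape.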
How do we know that the value of A that we found actually minimizes the sum of the squares of the errors? Well, we
had several ways of telling a while ago. The second derivative test is the easiest here. The second derivative of the sum of
the squares of the errors is 50666, which is positive, making the value of A that was found the place where the minimum
value happens. Alternatively, in this case we can tell very easily, since the equation for the sum of the squares of the errors is a quadratic with a positive coefficient on $A^2$. That means that the parabola opens up, and the place where the tangent is horizontal is a minimum.
5.2.6 Now, let’s do even better

We can improve this estimate, by using a more general quadratic. Suppose we try to approximate the range using $A x^2 + B x$. The process is exactly the same. We set up the sum of the squares of the errors, which we will call SSE, just to keep things shorter. We then minimize it to find the values of A and B. Things, however, get more complicated.
Mugsy: How could I have guessed?
The problem is that we will end up with a function of two variables, namely A and B, and we haven’t yet worked with minimizing (or maximizing) functions of more than one variable.
Mugsy: And I expect that is what we will have to learn, right?
Albert: Right.
Dudley: And this gives a better approximation?

Albert: It’s a bit hard to say without working through the numbers, but it is very likely a better approximation.
Getting the equation

The equation for the least squares approximation is virtually identical to what we did before. The only change is that we will have $A x^2 + B x$ where we used to have $A x^2$. The equation will start $SSE(A, B) = (1.8 - (A(1)^2 + B(1)))^2 + (7.0 - (A(2)^2 + B(2)))^2$, and continue for another 8 terms.
Again, thanks to Maple, this isn’t too hard to deal with. We get that the value of $SSE(A, B) = 25333 A^2 - 74150.6 A + 54337.74 + 6050 A B + 385 B^2 - 8934.2 B$. This is complicated enough that we will need to learn how to tackle such problems in general. There are no nice, simple direct solutions here.
For the moment, we will learn how to find the maxes and mins of a function f (x, y), and then apply it to SSE(A, B) later.
5.2.7 Locating critical points

How did we minimize a function with just one variable in it? We set the derivative equal to zero and solved. Now we
have a function with two variables, x and y. What do we do? From our work before, you could guess that you set the first
derivative equal to 0 and solve.
That’s close, but it has two flaws. We are looking for values for two variables, so we will need two equations to solve
for them. Besides, there are two first (partial) derivatives. Which do we set equal to zero? All right, you think, set both first
derivatives equal to 0 and solve. That’s correct, and fixes the other flaw.
Dudley: That sounds too easy.
Albert: Maybe so, but it’s still correct.
We set both first partial derivatives equal to zero, and we get two equations for the two variables. That gives you the critical
points. In this case, to find the critical points of f (x, y), solve the two equations
\[ \frac{\partial f}{\partial x} = 0 \quad \text{and} \quad \frac{\partial f}{\partial y} = 0 \]
simultaneously for x and y. This gives points (usually more than one) which are the critical points.
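As a small worked illustration of solving the two equations together, take the function $f(x, y) = x^2 + x y + y^2 - 3x$. (This function is made up for the illustration; it is not one of the examples or exercises in the text.)

```latex
\begin{align*}
\frac{\partial f}{\partial x} &= 2x + y - 3 = 0, &
\frac{\partial f}{\partial y} &= x + 2y = 0. \\
\intertext{The second equation gives $x = -2y$. Substituting into the first:}
2(-2y) + y - 3 &= 0 \quad\Longrightarrow\quad y = -1, &
x &= -2(-1) = 2.
\end{align*}
```

So this function has exactly one critical point, at (2, −1). Because both partial derivatives here are linear, there is only one solution; with higher-degree functions the two equations can have several simultaneous solutions.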
Note that what we are doing is similar to the less general method of finding critical points of f (x). We aren’t going
to look at places where a derivative doesn’t exist. Functions of more than one dimension (independent variable) are much
more complicated, and we are forced back to relying on the equivalent of the second derivative test. Places for which the
first derivative is not defined won’t have second derivatives, either, and the second derivative test (the only one we have)
fails. Therefore, we ignore such cases.
Mugsy: You can’t do it so you ignore it?
Albert: That’s the general idea.
Categorizing critical points in two dimensions.

As you expect, there are maxes and mins and messes in two dimensions, but something new shows up, too—a saddle point.
It needs to be explained.
Mugsy: You’d better believe it does.
In multiple dimensions, we get to move freely in different directions. A saddle point can be described (slightly inaccurately) as a point which is a min in one direction, but a max in another direction. How is that possible? A picture is worth 1000 words. It looks like a saddle.
To accommodate this new possibility, the terminology needs to change. Maxes become peaks, mins become pits,
saddles become passes (from the idea of a mountain pass), and messes become problems. The possibilities are then peaks,
pits, passes, and problems, retaining our alliterative scheme.
Mugsy: Aw. How cute.
Dudley: Hey, anything that helps me remember is good.
In tribute to some ingenious individual from Fall 2013, there is an alternative way to think about the shape of a saddle
point. Consider the shape of a Pringles (Registered Trademark acknowledged here and for the rest of the chapter) potato
chip. It fits perfectly, in several senses. The shape is just right, and it also fits the alliteration: peaks, pits, Pringles, and
problems.
Mugsy: Why do they do this to me right before lunch? Now I’m hungry.
Telling the difference using the second derivative test.

There are more possibilities to check for using the second derivative test, as well as more second derivatives to find. This
makes the second derivative test in two dimensions more difficult. There is a pattern to it, but right now, you’ll basically
have to refer to it each time (or, for the more ambitious of you, memorize it).
Dudley: You mean I won’t have to memorize it?
Albert: You’ll be given it on the test. You only have to know how to use it.
Once you have a critical point (where $f_x = f_y = 0$), you classify it by plugging those values of x and y into a formula
\[ \Delta = (f_{xx}) \times (f_{yy}) - (f_{xy})^2 \]
for the quantity ∆, called the discriminant. The four cases for the critical point are:
Case                          Type of critical point
∆ > 0 and $f_{xx} > 0$        Relative min (pit)
∆ > 0 and $f_{xx} < 0$        Relative max (peak)
∆ < 0                         Saddle (pass)
∆ = 0                         Mess (problem)
For example, take $f(x, y) = x^2 + y^2 + 3xy$. The partial derivatives are easy: $f_x = 2x + 3y$ and $f_y = 2y + 3x$. Setting these both equal to zero gives that the only critical point is (0, 0). Next,
\[ f_{xx} = 2, \quad f_{xy} = 3, \quad \text{and} \quad f_{yy} = 2 \]
so
\[ \Delta = (2)(2) - (3)^2 = -5 < 0 \]
so the second derivative test says (0, 0) is a saddle. (And it is.)
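The table translates directly into a small decision function. This is a Python sketch; the function name and the returned labels are choices made for this illustration, using the book’s peaks, pits, passes, and problems vocabulary.

```python
def classify_critical_point(fxx, fyy, fxy):
    """Apply the two-variable second derivative test.

    The arguments are the second partial derivatives evaluated
    at a critical point (a point where fx = fy = 0).
    """
    delta = fxx * fyy - fxy ** 2  # the discriminant
    if delta > 0:
        # delta > 0 forces fxx to be nonzero, so one of these applies.
        return "pit" if fxx > 0 else "peak"
    if delta < 0:
        return "pass"
    return "problem"  # delta == 0: the test gives no information

# The worked example: f = x^2 + y^2 + 3xy at (0, 0).
example = classify_critical_point(fxx=2, fyy=2, fxy=3)
```

For the worked example the discriminant is −5, so the function reports a pass (saddle), in agreement with the hand calculation.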

Let me warn you about something that might occur to you, but will not work. Don’t just look along the x-direction and y-direction. That is, it is not good enough to check what is going on at (x ± dx, y) and (x, y ± dy). Consider the function we just did. Here $f(dx, 0) = f(-dx, 0) = (dx)^2 > 0$, and $f(0, dy) = f(0, -dy) = (dy)^2 > 0$, so it looks like (0, 0) is a minimum. (The function is going up along each of the four axis directions.) On the other hand, $f(c, -c) = -c^2 < 0$, for any c > 0, so it really is a saddle. (That is, it is going up in some directions, but it is also going down in others.)
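A quick numerical version of this warning, sketched in Python. The step size 0.1 is an arbitrary small value chosen for illustration.

```python
def f(x, y):
    """The example function f(x, y) = x^2 + y^2 + 3xy."""
    return x ** 2 + y ** 2 + 3 * x * y

h = 0.1  # arbitrary small step away from the critical point (0, 0)

# Stepping along the four axis directions: f goes up every time.
axis_values = [f(h, 0), f(-h, 0), f(0, h), f(0, -h)]

# Stepping along the diagonal direction (h, -h): f goes down.
diagonal_value = f(h, -h)
```

All four axis values are positive while the diagonal value is negative, so checking only the axis directions would wrongly suggest a minimum.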
What happens in more variables.

In more dimensions (that is, more variables), there are again only maxes, mins, saddles, and messes. But in higher dimensions, saddles become more and more common, since they can be more and more varied. You’ll have to wait until you take the applied math course for more details.
Mugsy: Is it all right if I pass on that one?
5.2.8 Back to the problem.

OK, so now we have the means of doing what we want. Let’s actually go ahead and do it. The function was $SSE(A, B) = 25333 A^2 - 74150.6 A + 54337.74 + 6050 A B + 385 B^2 - 8934.2 B$. The partial derivatives are $50666 A + 6050 B - 74150.6$ and $6050 A + 770 B - 8934.2$. Setting both of these equal to 0 and solving gives A = 1.262924425 and B = 1.679879518.
Dudley: Hey, that was fairly easy!
Mugsy: Especially if you use Maple.
Notice that the value of A here is not the same as the value of A from before. That is normally the case. When you
throw more terms into the least squares function, all the coefficients will readjust.
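Since both partial derivatives are linear in A and B, the solving step is a 2-by-2 linear system. Here is a Python sketch using Cramer’s rule, with the coefficients taken from the partial derivatives above. It also applies the second derivative test, which the text has set up but not spelled out for this function: the second partials of SSE are the constants 50666, 770, and 6050.

```python
# The two equations, written as
#   50666 A + 6050 B = 74150.6
#    6050 A +  770 B =  8934.2
a11, a12, rhs1 = 50666.0, 6050.0, 74150.6
a21, a22, rhs2 = 6050.0, 770.0, 8934.2

# Cramer's rule for a 2x2 system.
det = a11 * a22 - a12 * a21
A = (rhs1 * a22 - a12 * rhs2) / det
B = (a11 * rhs2 - a21 * rhs1) / det

# Second derivative test: SSE_AA = a11, SSE_BB = a22, SSE_AB = a12,
# so the discriminant is the same number as the determinant above.
discriminant = a11 * a22 - a12 ** 2
is_minimum = discriminant > 0 and a11 > 0
```

The discriminant works out to 2410320, which is positive, and the second partial with respect to A is positive, so the critical point is a relative min (a pit), as hoped.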
Size of the errors

Now for the moment you have been waiting for.
Mugsy: The end of the course?
Is this actually a better approximation? Let’s calculate (using Maple, of course) the errors.
x     Range    $A x^2 + B x$   Error
0     0.0      0.00            0.00
1     1.8      2.94            -1.14
2     7.0      8.41            -1.41
3     15.3     16.41           -1.11
4     26.4     26.93           -0.53
5     40.2     39.97           0.23
6     56.4     55.54           0.86
7     74.8     73.64           1.16
8     95.3     94.27           1.03
9     117.6    117.42          0.18
10    141.6    143.09          -1.49
If you compare to the approximation using $A x^2$, you can see that the errors are considerably less now. The sum of the squares of the errors with these values of A and B is now 10.2483, down from 77.6868. Another way to see this is with the graph.
The diamonds are the points, the curve that starts lower and ends higher is the graph of the least squares value using $A x^2$, while the other curve is the graph of the least squares value using $A x^2 + B x$. It is fairly clear that the second curve fits better.
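In place of the graph, the comparison can also be made numerically. This Python sketch recomputes both sums of squared errors from the data. The range list is an assumption: it uses the values consistent with the error columns (26.4, 40.2, 56.4, 74.8, and 141.6 at x = 4, 5, 6, 7, and 10).

```python
# ASSUMPTION: range values consistent with the error columns.
xs = list(range(0, 11))
ranges = [0.0, 1.8, 7.0, 15.3, 26.4, 40.2, 56.4, 74.8, 95.3, 117.6, 141.6]

def sum_squared_errors(model):
    """Sum of (measured - model(x))^2 over all the data points."""
    return sum((r - model(x)) ** 2 for x, r in zip(xs, ranges))

def one_parameter(x):
    """Least squares fit of the form A x^2."""
    return 1.46 * x ** 2

def two_parameter(x):
    """Least squares fit of the form A x^2 + B x."""
    return 1.262924425 * x ** 2 + 1.679879518 * x

sse_one = sum_squared_errors(one_parameter)
sse_two = sum_squared_errors(two_parameter)

# Fraction by which adding the B x term reduced the SSE.
improvement = (sse_one - sse_two) / sse_one
```

The two sums come out at roughly 77.69 and 10.25, so the extra term removes most of the squared error.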
5.2.9 Finding launch velocity from these

Now, let’s go back and see if we can figure out what the launch velocity must have been. The formula for the range was $(v_0)^2 \sin(2\theta)/g$. We know that g = 9.8 and the launch angle was 40°. If we assume that x represents the number of tenths of the maximum velocity, then the range would be (in theory) this.
\begin{align*}
\text{Range} &= \frac{\left(v_0\,\frac{x}{10}\right)^2 \sin(2\theta)}{g} \\
             &= \frac{(v_0)^2 \sin(2\theta)}{100\,g}\; x^2
\end{align*}
That is, the value of A is $(v_0)^2 \sin(2\theta)/(100\,g)$. Once we know the value of A (and we actually have two choices for it now!), we can get a value (or two potential values) of $v_0$.
Using A = 1.46, we get $v_0 = \sqrt{\dfrac{100\,A\,g}{\sin(2\theta)}} = 38.1$, while using A = 1.2629 (the two-parameter value) gives $v_0 = 35.45$.
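The two launch velocities can be recomputed with a short Python sketch. It solves $A = (v_0)^2 \sin(2\theta)/(100\,g)$ for $v_0$, with g = 9.8 and a 40 degree launch angle as stated above.

```python
import math

G = 9.8                     # acceleration of gravity
THETA = math.radians(40)    # launch angle

def launch_velocity(a_coefficient):
    """Solve A = v0^2 sin(2 theta) / (100 g) for v0."""
    return math.sqrt(100 * a_coefficient * G / math.sin(2 * THETA))

v0_one_parameter = launch_velocity(1.46)          # from the A x^2 fit
v0_two_parameter = launch_velocity(1.262924425)   # from the A x^2 + B x fit
```

Note that the second value uses the unrounded A; rounding A to 1.26 first would shift the answer slightly, to about 35.41.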
Further improvements
Well, if we can make better approximations with more terms, why don’t we go ahead and try to fit the range to $A x^2 + B x + C$? We can certainly set up the equation for SSE(A, B, C) simply enough. And to minimize it, we would set all three partial derivatives equal to 0 and solve for A, B, and C. You would get (with Maple’s help again)
Mugsy: I’m actually getting to the point where I don’t shudder when I hear the word Maple. That program is actually
handy!
that A = 1.22, B = 2.17 and C = −1.29. The sum of the squares of the errors is now 7.3896.
Yes, this is technically more accurate, but the improvement is a lot less. Adding in B x dropped the sum of the squares of
the errors down from 77.6868 to 10.2483, an 86% drop. The next drop, from 10.2483 to 7.3896, is only 28%. The question
is whether it is worth it.
There is another consideration. The one data point that we can be absolutely sure of is (0, 0): If you have zero velocity
at the launch, the water balloon won’t go anywhere. If you have a non-zero value of C, the fitted curve won’t go through
that point. On that basis, there is some rationale for not putting in the C term.
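For the curious, the three-parameter fit can be reproduced in Python. Setting the three partial derivatives of SSE(A, B, C) to zero gives three linear equations whose coefficients are sums of powers of x; the sketch below builds them and solves by Gaussian elimination. The helper `solve_linear_system` is written for this illustration, and the range list is an assumption: it uses the values consistent with the chapter’s error columns (26.4, 40.2, 56.4, 74.8, and 141.6 at x = 4, 5, 6, 7, and 10).

```python
# ASSUMPTION: range values consistent with the chapter's error columns.
xs = list(range(0, 11))
ranges = [0.0, 1.8, 7.0, 15.3, 26.4, 40.2, 56.4, 74.8, 95.3, 117.6, 141.6]

def power_sum(k):
    """Sum of x^k over the data."""
    return float(sum(x ** k for x in xs))

def moment(k):
    """Sum of x^k * Range over the data."""
    return sum(x ** k * r for x, r in zip(xs, ranges))

def solve_linear_system(matrix, rhs):
    """Gaussian elimination with partial pivoting (illustrative helper)."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for i in range(n):
        pivot = max(range(i, n), key=lambda r: abs(aug[r][i]))
        aug[i], aug[pivot] = aug[pivot], aug[i]
        for r in range(i + 1, n):
            factor = aug[r][i] / aug[i][i]
            for col in range(i, n + 1):
                aug[r][col] -= factor * aug[i][col]
    solution = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][col] * solution[col] for col in range(i + 1, n))
        solution[i] = (aug[i][n] - known) / aug[i][i]
    return solution

# Equations from dSSE/dA = 0, dSSE/dB = 0, dSSE/dC = 0.
normal_matrix = [
    [power_sum(4), power_sum(3), power_sum(2)],
    [power_sum(3), power_sum(2), power_sum(1)],
    [power_sum(2), power_sum(1), power_sum(0)],
]
A, B, C = solve_linear_system(normal_matrix, [moment(2), moment(1), moment(0)])

sse_three = sum((r - (A * x ** 2 + B * x + C)) ** 2
                for x, r in zip(xs, ranges))
```

With this data the fitted constant term comes out negative, so the fitted curve passes below the origin rather than through it, which is the concern raised above about including C.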
The truth comes out

The data were generated using Maple, setting up differential equations for x and y, using the ballistic equations of motion,
with extra terms for air resistance added in. The initial velocity was actually 43. The air resistance slowed the motion of
the water balloon down enough that the initial velocity appeared lower. But it was in the right general area.
The more interesting thing to notice is that the value of v0 from the two-parameter (A and B) approximation was further
off than the value of v0 from the one-parameter (just A) approximation, even though the curve fit better with two parameters.
The reason is that the improvement in the fit was due to B, which wasn’t included in the calculation for v0 .
Homework #52
Exercises.
1. Find the critical points of the following functions and classify them (a max, min, saddle, or mess):
(a) $x^2 + 2y^2 + 6x + 8y + 12$
(b) $x^2 - 2y^2 + 6x + 8y + 12$
(c) $x^2 + 2y^2 - 6x + 8y + 12$
(d) $x^3 + 3y^2 - 6xy$
2. Find the critical points of the following functions and classify them (a max, min, saddle, or mess):
(a) $3x^2 + 2y^2 + 6x + 8y + 12$
(b) $3x^2 - 2y^2 + 6x + 8y + 12$
(c) $3x^2 + 2y^2 - 6x + 8y + 12$
(d) $x^3 + 3y - 12xy^2$
Problems.
1. Let $x_j$ take on the values 5, 1, 5, 4, and 5. Let $f(x) = \sum_{j=1}^{5} (x - x_j)^2$. Write out f(x) and show that f(x) has a
minimum when x = average of the $x_j$’s. (This is true in all cases, actually.) Do this by setting $f'(x) = 0$ and solving.
2. In this problem, you will fit $y = f(x) = A x + B x^2$ to some data by hand. Use this data: $(x_j, y_j)$ is measured as (0, 0),
(1, 1), (2, 5), (3, 9). There are few enough here so that you can do the algebra with only minimal pain. Also, note
that this is a simple $y = x^2$, with a change in the value at x = 2.
(a) Set up SSE(A, B) for these points, and plug all the data in. You should get something like
$((0) - (A(0) + B(0)^2))^2$ + ((rest of the terms)).
(b) Differentiate SSE with respect to both A and B. Don’t forget the chain rule!
(c) Set both the derivatives to zero and solve simultaneously. (You will end up with numbers in the hundreds. Don’t
panic, but be careful.)
(d) Use those values for A and B to evaluate $A x + B x^2$ at x = 1, 2, and 3, and compare with the values in the original
data. (If you are correct, you should get numbers close, but not equal, to 1, 5, and 9.)
(e) Use the second derivative test on SSE to decide if this is a maximum, minimum or saddle.
3. The general quadratic in x and y looks like $f(x, y) = A x^2 + B x y + C y^2 + D x + E y + F$ (not all of A, B, and C are
zero). This problem asks (and answers) various questions about what its relative maxes and mins look like.
(a) Find ∂f/∂x and ∂f/∂y, and set them both equal to 0 and solve for the critical point. Show that if $B^2 - 4AC \neq 0$,
there is a single critical point. (Maple is of use here in solving the equations you get. What you need to show is
that if $B^2 - 4AC \neq 0$, the equations can be solved, meaning the solution Maple gets really exists. To see that,
look at what would happen if $B^2 - 4AC = 0$.)
(b) Show that the critical point in the previous part is either a maximum, minimum or a saddle. (That is, it is not a
mess as long as $B^2 - 4AC \neq 0$.)
(c) An example of $B^2 - 4AC = 0$ is $f(x, y) = x^2 + y$. Show that it has no critical points at all.
(d) Another example of $B^2 - 4AC = 0$ is $f(x, y) = x^2 + 2xy + y^2$. Show that this has lots of critical points, and
show that they are all minimums (of a sort). [Hint: Factor f(x, y).]
5.3 Video angle

Suppose we want to set up a video camera to record our water balloon launching. If we fire directly away from the camera
(so that the camera is immediately behind the launcher), then the only motion that the camera needs to make is to pan up
and down. It is much more complicated to record this from the side, where the camera has to pan both up and down and
side to side. Just what we are going to do will keep us busy.
For a bit of simplification, we assume that the camera height is exactly equal to the height of the balloon launcher. The
two will be roughly equal anyway, and it makes life easier. What that means in terms of the variables is that we can set
y0 = 0.
What we need is a formula for the camera angle (measured up from horizontal). That is not too bad, if you remember
trigonometry.
Mugsy: And if you don’t remember trigomonics, or whatever it is?
Albert: Just hang on and he will give you the equation.
Mugsy: Good, as long as I don’t have to do this myself.
The formula for angle, which we will call φ, is
\[ \phi = \arctan\left(\frac{y}{x}\right). \]
Dudley: Oh great, ANOTHER variable.

Albert: But this is the same angle as you saw in polar coordinates.
Dudley: But we called it θ then, didn’t we? And θ is already used to mean something else here! ARRGGHH.
Now, what we want is dφ/dt, since that represents how fast the camera is panning up or down, with panning up represented
by dφ/dt > 0 and panning down given by dφ/dt < 0.
There are two ways that we could proceed. One would be to plug the equations for x and y into the equation for φ and
differentiate. That would work, but it wouldn’t provide us the kinds of insight that we will want. What we will do instead
is leave x and y in the equation for φ , but remember that they are functions of t. Then we will differentiate it and at the end,
plug in the values.
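Differentiating φ = arctan(y/x) with respect to t requires the chain rule.
\begin{align}
\frac{d\phi}{dt} &= \frac{1}{1 + (y/x)^2}\,\frac{d}{dt}\left(\frac{y}{x}\right) \tag{5.4}\\
&= \frac{1}{1 + (y/x)^2}\cdot\frac{x\,\frac{dy}{dt} - y\,\frac{dx}{dt}}{x^2} \tag{5.5}\\
&= \frac{1}{x^2 + y^2}\left(x\,\frac{dy}{dt} - y\,\frac{dx}{dt}\right) \tag{5.6}
\end{align}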
There are a few things that we can get from this before we plug in the equations for x and y. Note that dφ /dt is not zero
at the top (maximum height) of the balloon’s arc. How can we tell? At the top of the arc, dy/dt = 0, and plugging that in,
we get that
\[ \frac{d\phi}{dt} = \frac{1}{x^2 + y^2}\left(x \cdot 0 - y\,\frac{dx}{dt}\right), \]
which will be negative since all the values of the variables and derivatives are positive, and there is an overall negative sign.
If you draw a picture of the balloon’s arc, you will see that the camera has to pan upwards until some point before the top
of the arc (the point is where the line from the camera to the arc is tangent to the arc), and then will pan down from there to
the balloon’s landing.
Dudley: I can see where it would have been hard to pick up on that if you had just plugged in the equations for x and
y before differentiating.
Mugsy: Not that this made it easy. Or obvious.
Albert: That’s the advantage of using the equations.
Dudley: What are you trying to do, plug calculus as useful?
Albert: Why, yes, I am.
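If you would like to see equation (5.6) pass a sanity check, here is a minimal Python sketch (Python rather than Maple, purely for illustration). It compares the formula against a brute-force numerical derivative of φ = arctan(y/x), and then confirms that dφ/dt is negative at the top of the arc. The sample values for g, x0, v0x, and v0y, and all of the function names, are assumptions made for this sketch.

```python
import math

# Sample values, assumed only for illustration (metric units).
G, X0, V0X, V0Y = 9.8, 5.0, 2.0, 10.0

def x(t):
    return V0X * t + X0

def y(t):
    return -0.5 * G * t**2 + V0Y * t

def dx_dt(t):
    return V0X

def dy_dt(t):
    return -G * t + V0Y

def phi(t):
    # Camera angle, measured up from horizontal.
    return math.atan(y(t) / x(t))

def dphi_dt_formula(t):
    # Equation (5.6): (x dy/dt - y dx/dt) / (x^2 + y^2).
    return (x(t) * dy_dt(t) - y(t) * dx_dt(t)) / (x(t)**2 + y(t)**2)

def dphi_dt_numeric(t, h=1e-6):
    # Central difference approximation to the derivative of phi.
    return (phi(t + h) - phi(t - h)) / (2 * h)

t_top = V0Y / G  # time at which dy/dt = 0 (top of the arc)

print(dphi_dt_formula(0.5), dphi_dt_numeric(0.5))  # the two should agree closely
print(dphi_dt_formula(t_top))                      # negative: already panning down
```

The agreement at a single time is not a proof, of course, but it is a cheap way to catch a sign error in the quotient rule.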
5.3.1 Getting rate of change in camera angle

Now that we have the formula with the derivatives for dφ /dt, we can plug in the equations for ballistic motion and get the
rate that we want:

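\begin{align}
\frac{d\phi}{dt} &= \frac{1}{x^2 + y^2}\left(x\,\frac{dy}{dt} - y\,\frac{dx}{dt}\right) \tag{5.7}\\
&= \frac{(v_{0x}t + x_0)(-g t + v_{0y}) - \left(-\tfrac{1}{2} g t^2 + v_{0y} t\right)(v_{0x})}{(v_{0x}t + x_0)^2 + \left(-\tfrac{1}{2} g t^2 + v_{0y} t\right)^2} \tag{5.8}\\
&= \frac{-2\,(g t^2 v_{0x} + 2 g t x_0 - 2 v_{0y} x_0)}{4 v_{0x}^2 t^2 + 8 v_{0x} t x_0 + 4 x_0^2 + t^4 g^2 - 4 t^3 g v_{0y} + 4 t^2 v_{0y}^2} \tag{5.9}
\end{align}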
Mugsy: OK, I give. How’d he get that last equation?

Albert: Maple.
This just goes to show that sometimes plugging in the equations obscures things more than clarifies them.
Just to make sure that this is correct, let’s actually plug in some numbers and make a graph. Using the metric system,
g = 9.8 m/s2 , and let’s take x0 = 5 (so the camera is 5 meters behind the launcher), v0x = 2 m/s, and v0y = 10 m/s (so the
balloon is being launched mostly up). Plugging all those into φ and dφ /dt gives
\[ \phi = \arctan\left(\frac{-49t^2 + 100t}{20t + 50}\right) \quad\text{and}\quad \frac{d\phi}{dt} = \frac{-(980t^2 + 4900t - 5000)}{2401t^4 - 9800t^3 + 10400t^2 + 2000t + 2500} \]
A bit of calculation shows that the balloon is in the air from t = 0 to t = 2.04, so a plot of that range gives this graph:
Now, let’s make sure the graph is reasonable.

Mugsy: Sure looks reasonable to me, but I have no idea what it ought to look like.
Albert: That’s why he needs to explain it. So listen.
There are two functions plotted here. The parabola-looking curve is the plot of φ , and the curve dropping from upper left
to lower right is the plot of dφ /dt.
The value of φ starts and ends at 0, which makes sense, since the balloon starts and ends on the ground. It also hits
its max before the balloon gets to max height, though since we aren’t plotting the height of the balloon, that isn’t quite
obvious. But since we are ignoring air resistance, the path of the balloon is a perfect parabola, so it attains its max height
right in the middle. The middle in this case is at t = 1.02. The max of φ occurs where the graph of dφ /dt crosses the axis,
at t = 0.869, which occurs before the midpoint, as we showed earlier.
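The two times quoted above can be reproduced with a few lines of Python. This is only a sketch under the sample values in the text (g = 9.8, x0 = 5, v0x = 2, v0y = 10): it uses the rational form of dφ/dt for those values, with numerator −(980t² + 4900t − 5000), and a plain bisection search. The helper name `bisect` is made up for this sketch.

```python
def dphi_dt(t):
    # dphi/dt for the sample values g = 9.8, x0 = 5, v0x = 2, v0y = 10.
    num = -(980 * t**2 + 4900 * t - 5000)
    den = 2401 * t**4 - 9800 * t**3 + 10400 * t**2 + 2000 * t + 2500
    return num / den

def bisect(f, lo, hi, tol=1e-9):
    # Simple bisection; assumes f(lo) and f(hi) have opposite signs.
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_lo > 0) == (f_mid > 0):
            lo, f_lo = mid, f_mid   # root is in the upper half
        else:
            hi = mid                # root is in the lower half
    return (lo + hi) / 2

t_land = 10 / 4.9        # y = -4.9 t^2 + 10 t = 0 at landing
t_top = t_land / 2       # a parabola peaks halfway through the flight
t_max_phi = bisect(dphi_dt, 0.0, t_land)

print(round(t_land, 2), round(t_top, 2), round(t_max_phi, 3))
```

The last number printed is where dφ/dt crosses zero, and it falls before the midpoint of the flight, which is the point of the paragraph above.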
Now, let’s look at the graph of dφ/dt. This is more interesting.
Mugsy: For some people, maybe.
The value of dφ /dt is positive for 0 < t < 0.869, meaning that the camera angle is increasing at those times. But you can
clearly see that the rate of increase is dropping: The camera angle is changing fastest near t = 0, which makes sense since
that is when the balloon is closest to the camera. The “acceleration” (second derivative) of the camera angle is negative,
since that can be seen as the slope of the derivative curve, and that is clearly negative. So, the angle increases until it hits a
max at t = 0.869 (where the second derivative is negative, assuring us that it is a max), then it starts dropping, all the way
back to 0, when the balloon hits the ground.
Dudley: Derivatives and derivatives of derivatives and slopes, all this is giving me a major headache!
Albert: Once you sort it out, it actually helps. Working through these graphs, and these explanations helps to give
you a feel for what is going on.
Dudley: And until you sort it out?
Albert: It’s a major headache. I recommend you learn to sort it out.
5.3.2 The moral of the story — related rates problems.

The problem that we just did was a bit more complicated than what we are going to do next. There, we had φ as a function
of two other variables, both of which were functions of t. For what we are going to do next (mostly, anyway), we will have
one variable that is a function of another variable, which itself is a function of t. To be a bit more specific, if y is a function
of x, and x is a function of t, then ultimately, y can be found if we only know t (by calculating x first). That means, we have
y = y(t), so we can find dy/dt. The way that it is done is, of course, the chain rule, the most important rule in calculus:
\[ \frac{dy}{dt} = \frac{dy}{dx}\,\frac{dx}{dt}. \]
We can calculate dy/dx, since we are given y as a function of x, and we can calculate dx/dt for the same reason. The chain
rule tells us how to combine these two to get what we want, dy/dt.
Actually, whenever you can get two out of three derivatives in the chain rule, you can use the chain rule to get the third.
This seems obvious, until you have to apply it to problems.
Dudley: What’s obvious to one person isn’t obvious to another.
Mugsy: You wouldn’t have specific people in mind, there, would you?
A typical kind of problem (that this is a very long example of) is called a related rates problem. Here’s the general scenario:
You are given a situation that contains two quantities that are changing. You are also given the rate at which
one of the quantities is changing. You are to determine the rate at which the other quantity is changing.
The general procedure to solve such problems is:
1. Find the equation that connects the quantities that are given. (This is usually the hardest part of the problem!)
Albert: AMEN.
2. Use that equation to find the derivative of one quantity with respect to the other.
3. Use the chain rule (relating the derivative in the previous step to the rates of change of the quantities) to solve for the
remaining rate.
When you are lucky, such as in the homework exercises (but not the problems) that follow, you are given the equation.
An example of this simpler sort of problem would be profitable.
Dudley: It would also be unusual.
Problem: Suppose $y = x^2 - 5x + 8$, and you know that dx/dt = 4 when x = 1. Find dy/dt at x = 1.
Answer: The equation relating these rates is the chain rule, since we can find dy/dx, but need dy/dt. That is, we want to
change the variable of differentiation. The chain rule says
\[ \frac{dy}{dt} = \frac{dy}{dx} \times \frac{dx}{dt} \]
From the equation, dy/dx = 2x − 5. At x = 1, the value of dy/dx is 2(1) − 5 = −3. The value of dx/dt is 4. Therefore, at
x = 1,
\[ \frac{dy}{dt} = \frac{dy}{dx} \times \frac{dx}{dt} = -3 \times 4 = -12 \]
That’s the answer! Seem too simple? It really is (except when you have to come up with the equation yourself).
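The same calculation takes three lines in Python, shown here as a sketch rather than anything official. The second half is an optional check: it invents a particular x(t) = 1 + 4t (an assumption, since the problem only gives dx/dt at one instant) and differentiates y(x(t)) numerically at t = 0.

```python
def y_of_x(x):
    return x**2 - 5 * x + 8

def dy_dx(x):
    # Derivative of y with respect to x.
    return 2 * x - 5

def dy_dt(x, dx_dt):
    # Chain rule: dy/dt = (dy/dx) * (dx/dt).
    return dy_dx(x) * dx_dt

print(dy_dt(1, 4))  # -12

# Optional check with a hypothetical x(t) that has x(0) = 1 and dx/dt = 4.
def x_of_t(t):
    return 1 + 4 * t

h = 1e-6
numeric = (y_of_x(x_of_t(h)) - y_of_x(x_of_t(-h))) / (2 * h)
print(numeric)  # should be very close to -12
```

Any other x(t) with the same position and rate at that instant would give the same answer, which is exactly what the chain rule promises.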
Finally, let me do an example of a related rates problem stated in words. It is more difficult than any problem you are
liable to have to work.
Problem: In his never-ending search to rid the universe of squirrels, Fang had gotten Dudley to catapult zucchini squashes
at night under the street light, since he wants to learn to track falling objects by just their shadows. (Dudley has an
overabundance of zucchini, so he doesn’t mind.) The street light is 7 meters directly above the catapult. The catapult
launches zucchini with initial horizontal velocity of 2 m/s and initial vertical velocity of 10 m/s. Find how fast the shadow
of the zucchini is traveling when the zucchini reaches the top of its path.
Answer: There are all kinds of hassles in working this problem. There are a lot of equations, and very few of them are given
to you. The main thing (after wasting an enjoyable few minutes trying to get the ideal picture) is to try to get an equation
for the position of the shadow. There are also three variables, instead of the usual two.
We set things up to make them as simple as possible. We put the origin of our coordinate system right at the catapult,
so the street light is at position (0, 7). If the zucchini is at position (x, y) (which we can figure out from the equations of
ballistic motion that we had in the last chapter—but not yet), then where is the shadow of the zucchini? For that, we
have to use similar triangles. (Stare at the picture; it really can help.)
There are two triangles we use. One is formed by the light, down to the catapult, and then to the zucchini’s shadow. The
other is formed by the zucchini, down to the point immediately below it on the ground, and then to the zucchini’s shadow.
These triangles are similar. The light to the catapult is 7 m, and the catapult to the shadow is s. The zucchini down to the
ground is y and that point to the shadow is s − x.
The ratios from similar triangles give this:
\[ \frac{7}{s} = \frac{y}{s - x}. \]
Solve that for s and you get $s = \frac{7x}{7 - y}$.
The rate at which the shadow is moving is ds/dt. From the solution above, we get that by the quotient rule:
\[ \frac{ds}{dt} = \frac{(7 - y)\left(7\,\frac{dx}{dt}\right) - (7x)\left(-\frac{dy}{dt}\right)}{(7 - y)^2} \]
We “only” need to fill in the values of x, y, dx/dt, and dy/dt to get the value of ds/dt.
The equations of ballistic motion give x = 2t and y = −4.9t 2 + 10t. If we can find the value of t, we can plug in and
get everything we need. The value of t requested is “when the zucchini reaches the top of its path.” How do we find that?
If you think back
Mugsy: Or look back, in my case
you will remember that the top of the path occurs at the value of t for which dy/dt = 0. In this case, that is −9.8t + 10 = 0,
or t = 10/9.8 ≈ 1.02. For that value of t, x = 2 ∗ t = 2.04, y = 5.10 (safely below the light), dx/dt = 2, and dy/dt =
(−9.8 ∗ t + 10) = 0. In that case, we simply plug it all in and get
\[ \frac{ds}{dt} = \frac{(7 - (5.10))\,(7\,(2)) - (7\,(2.04))\,(0)}{(7 - (5.10))^2} = 7.37 \]
as the speed of the shadow.
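For anyone who wants to check the plugging-in, here is a short Python sketch of the whole computation. The constant and function names are invented for this sketch; the numbers are the ones in the problem (light 7 m up, v0x = 2 m/s, v0y = 10 m/s, g = 9.8 m/s²).

```python
G = 9.8              # m/s^2
LIGHT_HEIGHT = 7.0   # m, street light directly above the catapult
V0X, V0Y = 2.0, 10.0 # m/s

def shadow_speed(t):
    # Position and velocity of the zucchini from the ballistic equations.
    x = V0X * t
    y = -0.5 * G * t**2 + V0Y * t
    dx_dt = V0X
    dy_dt = -G * t + V0Y
    # Quotient rule applied to s = 7x / (7 - y).
    numerator = (LIGHT_HEIGHT - y) * (LIGHT_HEIGHT * dx_dt) + (LIGHT_HEIGHT * x) * dy_dt
    return numerator / (LIGHT_HEIGHT - y)**2

t_top = V0Y / G  # dy/dt = 0 at the top of the path
print(t_top, shadow_speed(t_top))
```

Carrying full precision gives a shadow speed of about 7.38 m/s. The 7.37 above comes from rounding y to 5.10 before dividing, so the two agree to within rounding.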
Homework #53
Exercises.
1. Find the value of dy/dt at x = −2 if $y = 2x^3 - 7x^2 + 5x - 4$ and dx/dt = −1.
2. Find the value of dy/dt at t = 1 if $y = \sqrt{3x^2 + 1}$, and $x = t^2 - 3t + 1$. (Do this without actually performing the
composition to make y an explicit function of t. There’s an easier way!)
Problems.
1. Dudley has just been pulled over by a police officer, who has used a radar gun to clock Dudley at 63 m.p.h. in a
55 m.p.h. zone. However, the officer was on a side road at the time, and Dudley wants to argue that his velocity
as measured by the gun (the rate at which the distance between Dudley and the gun was changing) was not the
same as the speed Dudley was actually going along the road, so Dudley shouldn’t be ticketed. The various items
of information that need to be used are these: The officer was sitting on a side road 0.2 mile from the road Dudley
was on, and the radar caught Dudley when he was 0.5 mile from the intersection of the two roads (which are straight
and meet perpendicularly). We want to figure out how fast Dudley was going using only this information. Set up
variables s = distance between Dudley and the radar gun, x = distance between Dudley and the intersection.
(a) What is the equation that relates s to x? [Hint: Draw a picture and use the Pythagorean theorem. This problem
does not use similar triangles.]
(b) What is ds/dx? What equation relates ds/dx and ds/dt?
(c) What is the value of s when x = 0.5?
(d) What is dx/dt when x = 0.5? (This is the speed Dudley was actually going.) Did Dudley “earn” the ticket or
was he right to question the value the officer had?
(e) Show that the speed that the radar gun measures in this situation is always going to be less than the actual speed.
Do this by taking the equation relating ds/dt and dx/dt before you plugged in values for s and x, and using the
fact that x < s.
2. Fang prides himself at being able to dig precisely hemispherical holes. If he can excavate at 0.3 m^3 per minute
(meaning that Fang is digging dirt out of the hole at exactly that rate), how fast is the radius of the hole changing
when it is 2 m? (This is a very typical related rates problem. Work first on getting the picture, then get the equations.)
3. The fact that roosters do not swim has not kept Bill from enjoying that activity. (And if people try to remind Bill that
ducks do swim, he pretends that he has water in his ears and can’t hear them.) One day, Bill landed near the center
of a pond, and the ripples from his landing spread out in a perfectly circular pattern, receding from Bill at 4 m/s.
How fast was the area of that circle increasing when the radius was 10 m? (This is also a very typical related rates
problem.)
5.4 Linear regression as another application.

Just so you don’t think that what we did before is an isolated instance, let me bring up the entire topic of linear regression
analysis. What we have been doing is also called least squares analysis, and is used to fit data to curves all the time. In the
case that the curve happens to be a line, it is called linear regression.
5.4.1 Most common way to fit data to a line.

In many measurement-oriented situations (e.g., physics labs), you will be measuring data to fit to a linear equation. This
happens sufficiently often to give it a special name. Fitting data to a line by a least-squares condition is called linear
regression.
One purpose of doing this is to estimate the value of the dependent variable at a value for which the independent
variable was not measured. This would be equivalent to asking for the value of t when c = 250, and going back to the
$t = A c + B c^2$ equation with the values we got for A and B and plugging in c = 250 to get the value of t. This is a process
called interpolation, finding a value inside the range of data you have. Another option is called extrapolation, where you
predict values beyond the range of data you have. That is a much more complicated problem, since your equations might
or might not be valid way out there.
5.4.2 The general setup.

In this situation, you will have data points $(x_j, y_j)$, for lots of j’s. You then want to minimize
\[ \mathrm{SSE}(m, b) = \sum_j \left(y_j - (m x_j + b)\right)^2 \]
which is the sum of the squares of the vertical displacements from the line, which are the errors in the measurements.
Note the change in character of the letters! At this point, the $x_j$ and the $y_j$ are the numbers (constants) taken from
the data. That shouldn’t be too much of a surprise since subscripts on variables like x and y often indicate that they are
constants. The variables, the things that will change as we are looking for the minimum, are m and b. That is, we are trying
to look for the “best” slope and intercept.
Remember that $y_j$ = actual y-value of a data point, while $m x_j + b$ = predicted y-value of the data point, and
$y_j - (m x_j + b)$ = difference between these, the error in the prediction.
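To make the roles of the letters concrete, here is a tiny Python sketch of SSE as a function of m and b, with the data held fixed. The three data points are made up for illustration (they lie exactly on y = 2x + 1), and the function name `sse` is just a label for this sketch.

```python
def sse(m, b, xs, ys):
    # Sum of squared vertical errors between the data and the line y = m x + b.
    return sum((y - (m * x + b))**2 for x, y in zip(xs, ys))

# Made-up data lying exactly on y = 2x + 1.
xs = [0, 1, 2]
ys = [1, 3, 5]

print(sse(2, 1, xs, ys))  # 0: the line passes through every point
print(sse(2, 0, xs, ys))  # 3: each of the three points is off by 1
```

Notice that xs and ys never change between the two calls; only m and b do. That is the "change in character" described above.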
5.4.3 The procedure.

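To solve this, you set
\[ \frac{\partial\,\mathrm{SSE}}{\partial m} = 0 \quad\text{and}\quad \frac{\partial\,\mathrm{SSE}}{\partial b} = 0 \]
and solve simultaneously. In this case, you can solve the equations explicitly. If you do, you get the equations for m and b:
\[ m = \frac{n \sum (x_j y_j) - \left(\sum x_j\right)\left(\sum y_j\right)}{n \sum (x_j^2) - \left(\sum x_j\right)^2} \]
\[ b = \frac{\left(\sum x_j^2\right)\left(\sum y_j\right) - \left(\sum x_j\right)\left(\sum x_j y_j\right)}{n \sum (x_j^2) - \left(\sum x_j\right)^2} \]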
These equations are often used in practice.

Let me do a real problem using these equations. In 1954, Roger Bannister ran the world’s first timed (less than) 4-minute
mile. The world record times for the mile since then are (with time in seconds) in Table 5.1.
It would be possible to use linear regression on this data, and estimate the time as a function of year. That would
have the unrealistic effect of being able to calculate the year of the first three-minute mile (2116), and even the year of the
world’s first zero-minute mile (2623).
Dudley: Mugsy, can you run that fast?
Mugsy: Depends on who—or what—is chasing me.
On the other hand, it would be more reasonable to presume that the velocities are linear functions of time. For that, we
need to find the velocities (which will be the $y_j$’s, the dependent variables) in m.p.h. for each of these records. Also, it is
awkward to use numbers as large as 1954 and up for year, so I will let $x_j$ = (year − 1954), the independent variables. Then,
we get Table 5.2.

Year    Time (s)
1954    239.4
1954    238.0
1957    237.2
1958    234.5
1962    234.4
1964    234.1
1965    233.6
1966    231.3
1967    231.1
1975    231.0
1975    229.4
1979    229.0
1980    228.8
1981    228.5
1981    228.4
1981    227.3
1985    226.3
Table 5.1: Record-setting times for the mile run.

x_j (Year − 1954)    y_j (velocity in m.p.h.)
 0                   15.04
 0                   15.13
 3                   15.18
 4                   15.35
 8                   15.36
10                   15.38
11                   15.41
12                   15.56
13                   15.58
21                   15.58
21                   15.69
25                   15.72
26                   15.73
27                   15.75
27                   15.76
27                   15.84
31                   15.91
Table 5.2: Record-setting speeds for the mile

If you do the calculations, you get
n = 17
$\sum x_j = 266$
$\sum y_j = 263.97$
$\sum x_j^2 = 5954$
$\sum x_j y_j = 4172.7$
This gives values of m = 0.0236 and b = 15.16. The predicted equation is then
v = 0.0236 x + 15.16
The maximum absolute error in the prediction is 0.12, which occurs in both 1954 and 1966. The percentage error is less
than 1%, more a result of the fact that the velocity is changing very slowly than that this is a good fit.
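The sums and the resulting slope and intercept are easy to reproduce. The following Python sketch applies the formulas for m and b to the data of Table 5.2; the function name `linear_fit` is made up here, and the code is only an illustration of the formulas, not the way the numbers in the text were originally produced.

```python
def linear_fit(xs, ys):
    # Least-squares slope and intercept from the closed-form formulas.
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sum_xx - sum_x**2
    m = (n * sum_xy - sum_x * sum_y) / denom
    b = (sum_xx * sum_y - sum_x * sum_xy) / denom
    return m, b

# Table 5.2: years after 1954 and record speeds in m.p.h.
xs = [0, 0, 3, 4, 8, 10, 11, 12, 13, 21, 21, 25, 26, 27, 27, 27, 31]
ys = [15.04, 15.13, 15.18, 15.35, 15.36, 15.38, 15.41, 15.56, 15.58,
      15.58, 15.69, 15.72, 15.73, 15.75, 15.76, 15.84, 15.91]

m, b = linear_fit(xs, ys)
print(round(m, 4), round(b, 2))  # 0.0236 15.16
```

Both formulas share the same denominator, so it is computed once. If all the x-values were identical that denominator would be zero, which is the formulas' way of saying that a vertical stack of points does not determine a slope.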
When would this predict a three-minute mile to be run? That’s not an easy question, because we have changed things
around so much. First thing we’d need to do is find out the velocity needed for a three-minute mile. That’s not bad. It’s
\begin{align}
v &= \frac{d}{t} \tag{5.10}\\
&= \frac{1\ \text{mile}}{3\ \text{min}} \tag{5.11}\\
&= \frac{1}{3}\ \text{mi/min} \times 60\ \frac{\text{min}}{\text{hr}} \tag{5.12}\\
&= 20\ \text{mi/hr} \tag{5.13}
\end{align}
Then, we have to find x when v = 20. That’s not too bad either:
20 = 0.0236 x + 15.16
x = 205
Since x measures years after 1954, this would give 205 + 1954 = 2159 for the year of the first three-minute mile. That’s
not far from the year of the three-minute mile predicted from linear regression based on times. On the other hand, there is
no chance of a zero-minute mile this way, since that would mean velocity is infinite, meaning x would be infinite as well.
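Here is the same prediction as a small Python sketch, using the fitted values m = 0.0236 and b = 15.16 from above. The function name is invented for this sketch, and it assumes (as the text does) that the straight-line fit can be extrapolated far outside the data, which is exactly the risky part.

```python
M, B = 0.0236, 15.16   # fitted slope and intercept (speed in m.p.h. vs. years after 1954)
BASE_YEAR = 1954

def year_for_mile_time(minutes):
    # Speed in m.p.h. needed to cover 1 mile in the given number of minutes.
    v = 60.0 / minutes
    # Solve v = M x + B for x, the number of years after 1954.
    x = (v - B) / M
    return BASE_YEAR + x

print(round(year_for_mile_time(3)))  # 2159
```

Calling the function with smaller and smaller times pushes the year later and later without limit, and a time of exactly zero is not allowed at all (division by zero), matching the remark that a zero-minute mile never happens under this model.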
Homework #54
Exercises.
1. Find the estimated time to run 1 mile in 1975, using the equation earlier for velocity, and compare it to the times
listed for 1975.
2. Find the estimated time to run 1 mile in 1981, using the equation earlier for velocity, and compare it to the times
listed for 1981.
Problems.
1. In this problem, you will fit $y = f(x) = m x + b$ to some data by hand. Use this data: $(x_j, y_j)$ is measured as (0, 0),
(1, 1), (2, 5), (3, 9). (Note that this is the same data as used to fit a quadratic before. You can fit any data you want
to any function you want.) There are few enough here so that you can do the algebra with only minimal pain. Also,
note that this is a simple $y = x^2$, with a change in the value at x = 2.
(a) Find n, $\sum x_j$, $\sum y_j$, $\sum x_j y_j$, and $\sum x_j^2$ for the data given.
(b) Plug these values into the equations that I gave for m and b in the notes for a linear regression fit.
(c) Give the values of $m x_j + b$ for $x_j$ = 0, 1, 2, and 3, and compare them to the data given.
2. The average values of x and y for the data are what you are used to, namely $\bar{x} = \frac{1}{n}\sum x_j$ and $\bar{y} = \frac{1}{n}\sum y_j$. Plug the
formulas for $\bar{x}$ and m and b into $m\bar{x} + b$ and show that it reduces algebraically to the formula for $\bar{y}$. (This means that
$\bar{y} = m\bar{x} + b$ is always true for linear regression. Since $\bar{x}$ and $\bar{y}$ are usually simple to find, this equation will often get
solved to give $b = \bar{y} - m\bar{x}$, which is used to get b once you’ve found m.)
3. Suppose there are only two data points, (x1 , y1 ) and (x2 , y2 ). Show that the slope and intercept (m and b) of the least
squares line y = m x + b is the same as the slope and intercept of the line that passes through the two points. You can
use the equations I gave for m and b. You will have some algebra ahead of you. (In other words, in this case the least
squares line is the line through the two points.)

1. Related rates problems are basically opportunities to apply the chain rule. You are given (or, if you aren’t lucky, you
have to derive) an equation that connects (relates) two variables; call them x and y here. Then given the rate at which
one of them is changing (that is, given dx/dt or dy/dt) at specific values of the variables, you can find the rate at
which the other is changing by dy/dt = (dy/dx) × (dx/dt). You find dy/dx from the equation that connects x and
y, plug in the one rate you are given and the values of the variables, and you can solve for the other rate.
2. One standard way to find the “best” equation to fit some data is to minimize the sum of the squares of the errors.
3. To find the critical point(s) of a function f (x, y), set both partial derivatives equal to zero and solve simultaneously.
4. To categorize (determine the nature of the graph at) a critical point, evaluate the three second partial derivatives
($f_{xx}$, $f_{xy} = f_{yx}$, and $f_{yy}$) at the point. Then you plug those into the formula $\Delta = (f_{xx}) \times (f_{yy}) - (f_{xy})^2$, called the
discriminant. Then, the four possibilities are given by the following table:
Case                        Type of critical point
∆ > 0 and f_xx > 0          Relative min (pit)
∆ > 0 and f_xx < 0          Relative max (peak)
∆ < 0                       Saddle (pass)
∆ = 0                       Mess (problem)
5. Linear regression is the name given to fitting data to a line by the least squares approach. The formulas are
\[ m = \frac{n \sum (x_j y_j) - \left(\sum x_j\right)\left(\sum y_j\right)}{n \sum (x_j^2) - \left(\sum x_j\right)^2} \]
\[ b = \frac{\left(\sum x_j^2\right)\left(\sum y_j\right) - \left(\sum x_j\right)\left(\sum x_j y_j\right)}{n \sum (x_j^2) - \left(\sum x_j\right)^2} = \bar{y} - m\,\bar{x} \]
6. There were no new Maple commands in this chapter.
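The discriminant table in item 4 translates almost word for word into code. The following Python sketch is only an illustration (the function name `classify` and the sample functions are chosen here, not taken from the text); you supply the three second partials evaluated at the critical point.

```python
def classify(fxx, fxy, fyy):
    # Second derivative test using the discriminant fxx * fyy - fxy^2.
    disc = fxx * fyy - fxy**2
    if disc > 0:
        return "relative min" if fxx > 0 else "relative max"
    if disc < 0:
        return "saddle"
    return "mess"

# f = x^2 + y^2 at (0, 0): fxx = 2, fxy = 0, fyy = 2
print(classify(2, 0, 2))    # relative min
# f = -x^2 - y^2 at (0, 0): fxx = -2, fxy = 0, fyy = -2
print(classify(-2, 0, -2))  # relative max
# f = x^2 - y^2 at (0, 0): fxx = 2, fxy = 0, fyy = -2
print(classify(2, 0, -2))   # saddle
# f = (x + y)^2 at (0, 0): fxx = 2, fxy = 2, fyy = 2
print(classify(2, 2, 2))    # mess
```

When the discriminant is positive, fxx cannot be zero (the product fxx × fyy would then be zero), so the two-way split on the sign of fxx covers every case.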
5.6 Finals from previous years

Final, Fall 2004
I. (10 points, 5 points each) Find the following derivatives.

dy ∂z √
(a) for y(t) = t 3 and x(t) = e2t (b) for z x2 + x ez y = cos x
dx ∂y
II. (20 points, 10 pts each) (a) Find the area between the curves $y = x^3 + 2$ and $y = 3x + 2$ between intersection points
A and B. Reference the following graph.
(b) SET UP the integral to find the length of the $y = x^3 + 2$ curve (the arc length) between points A and B (at the outer two
intersection points). Don’t evaluate the integral.
III. (10 points) Solve the following initial value problem. $\frac{dy}{dt} = t\cos(t^2)$, y(0) = 1.
IV. (40 pts, 10 points each) Evaluate the following integrals.
(a) $\int (x^4 + x^2 + 1)\cosh(2x)\,dx$   (b) $\int \frac{x + 1}{\sqrt{3 - 2x - x^2}}\,dx$   (c) $\int \frac{1}{x^2 + x - 2}\,dx$   (d) $\int_1^2 x \ln x\,dx$
V. (10 points) Suppose that the average yearly cost per item for producing x items of a business product is C(x) =
10 + (100/x). If the current production is x = 10 and production is increasing at a rate of 2 items per year, find the
rate of change of the average cost.
VI. (10 points) Given the data {(0, 0), (1, 3), (2, 7)}, find the two equations that you would solve to find the constants A and
B that would give the best least-squares fit of y = A + B sin x for this data. Recall that the least-squares error expression for
this problem is $E(A, B) = \sum_{i=1}^{3} [y_i - (A + B\sin(x_i))]^2$. DO NOT SOLVE FOR A AND B! Write your answer as a linear
system for A and B, i.e., (constant) = (constant)A + (constant)B, (constant) = (constant)A + (constant)B. (Note these constants
may have different values.)
Final, Fall 2005

I. (10 points, 5 points each) Find the following sums. (a) $\sum_{k=1}^{3} \frac{k - 1}{k}$   (b) $\sum_{n=1}^{5} \sin\frac{n\pi}{2}$
II. (20 points, 10 pts each)
(a) Find the area between the curves $y = x^4 - 2x^2$ and $y = 2x^2$ from the left intersection point to the right intersection point.
(The graph was given on the test.) (b) Another way to describe the quartic curve in part (a) is parametrically, where
x = t and $y = t^4 - 2t^2$. SET UP the integral to find the length of this curve (the arc length) between the intersection points
using this parametric form (i.e., in terms of the variable t only). Don’t evaluate the integral.
III. (15 points) Solve the following initial value problem.
$\frac{dy}{dt} = t \ln(t)$, y(1) = 0.
IV. (40 pts, 10 points each) Evaluate the following integrals.
(a) $\int \frac{\sin(2t + 1)}{\cos^2(2t + 1)}\,dt$   (b) $\int (x^3 - 2x^2 + 3x)\,e^{2x}\,dx$   (c) $\int \frac{x + 4}{x^2 + 5x + 6}\,dx$   (d) $\int_0^1 \arcsin x\,dx$
V. (10 points) The pressure (P) at the outlet of a pump is related to the velocity (v) at the pump outlet by $P = -\frac{k}{\rho} v^2$, where k
and ρ are constants and the ratio k/ρ is 4.1 for this application. If a gauge at the pump outlet measures velocity as 4 ft/sec
and rising at a rate of 2 ft/sec^2, at what rate is the outlet pressure changing?
VI. (10 points) Find all of the local min/max/saddle points for $f(x, y) = x^3 + y^3 + 3x^2 - 3y^2 - 8$.
Final, Fall 2006
I. (10 points, 5 points each) Find the following sums.
(a) $\sum_{n=1}^{5} (n^2 - 1)$   (b) $\sum_{j=5}^{9} \left[(j - 4)^2 - 1\right]$
II. (10 points) In an attempt to impress the women of Asbury with their environmental sensitivity and excellent personal
hygiene, the HR Math Men have built a clothesline by stretching a rope between two poles. Dustin the Math Ninja correctly
points out to a female admirer that the line assumes the shape of a catenary curve with equation $y = 5\,(e^{x/10} + e^{-x/10})$, where
−10 ≤ x ≤ 10. Set up the integral that would calculate the amount of rope used in this project.
III. (10 points) If a water balloon is shot from x = 0 and y = 0 with an initial x velocity of 20 feet per second and an initial y
velocity of 64 feet per second, how far will it travel in the x direction if it lands at the same level? Use the equations (g = 32
feet/sec^2) $v_y = v_{0y} - g t$, $y = y_0 + v_{0y} t - \frac{1}{2} g t^2$, $x = x_0 + v_{0x} t$.
IV. (10 points) Solve the following initial value problem. $\frac{dy}{dx} = x \sin(x^2)$, y(0) = 1.
V. (40 pts, 10 points each) Evaluate the following integrals.
(a) $\int_{\pi/2}^{\pi} \frac{4\cos t}{(\sin t + 1)^2}\,dt$   (b) $\int (x^2 + 2x + 3)\sinh(3x)\,dx$   (c) $\int \frac{1}{x^2 + x - 2}\,dx$   (d) $\int x^2 \ln x\,dx$
VI. (10 points) If $x y^2 - \ln(x y) = x$ and $\frac{dy}{dt} = 3$ at x = 1, y = 1, find $\frac{dx}{dt}$ at this same point.
VII. (10 points) Find all of the local min/max/saddle points for $f(x, y) = x^3 + y^3 - 3xy$.
VIII. (10 pts) Use integration to show that the area of a right triangle is equal to (1/2)*base*height.
Final, Fall 2007
I. (10 points, 5 points each) Find the following sums.
(a) $\sum_{k=3}^{7} (k^3 + 3)$   (b) $\sum_{j=1}^{4} \sqrt{j^2 + 1}$
II. (10 points) Solve the following initial value problem.
$\frac{dy}{dt} = t\,e^{t^2}$, y(0) = 3
III. (40 pts, 10 points each) Evaluate the following integrals.
(a) $\int_0^2 \frac{x + 3}{(x^2 + 6x + 1)^3}\,dx$   (b) $\int (x^2 - 2x)\,e^{2x}\,dx$   (c) $\int \frac{dx}{(x - 4)^2 (x - 1)}$   (d) $\int t^3 e^{t^2}\,dt$ (Hint: Use $u = t^2$
substitution first, and then integration by parts.) 5 point BONUS
IV. (10 points) Find all of the local min/max/saddle points for $f(x, y) = x^4 + y^4 - 4xy$.
V. (a) (15 points) Find the area between the following curves: $x = y^3 - 26y + 10$ and $x = 40 - 6y^2 - y^3$. Note that the
intersection points are at (15, −5), (−41, 3), and (35, −1). (b) (15 points) SET UP the integral for the length of the
$x = 40 - 6y^2 - y^3$ curve from the FIRST intersection point to the LAST.
VI. (10 points) Water is pouring into a rectangular fish tank with a rate of 3 ft^3/min. How fast is the water level rising
in the tank if the base of the tank is a 2 ft × 3 ft rectangle? (Note that the volume of a rectangular tank is (area of the
base)×(height).)
Final, Fall 2008
I. (10 points; 5 points each) Find the values of the following summations:
a.) $\sum_{j=-5}^{-1} 2^{j+4}$ b.) $\sum_{k=-2}^{3} \frac{k + 1}{k^2 - 2}$
II. (15 points; 5 points each) Find all the first partial derivatives of $f(x, y, z) = \frac{x y^2 z}{\sin(2xy + 3z)}$.
III. (10 points) Find $y$ if $\frac{dy}{dx} = 3x^2 - 6$ and $y(0) = 4$.
IV. (15 points; 5 points each) Find all three critical points of f (x, y) = 2 x2 + y2 − x2 y − 5, and classify all of them.
V. (10 points) Find the area between the curves y = x2 + x − 2 and y = 3 x − 2. [Note: An unlabeled sketch of the curves
was provided on the test.]
VI. (15 points; 5 points each) N3RD is going to build a trebuchet (a medieval throwing device) for launching pumpkins.
They could use your help with some of the equations. They need to know the initial speed to throw a pumpkin 300 feet. If
the launch angle is roughly 40◦ , the equations of motion are x = 0.766 v0 t and y = −16t 2 + 0.643 v0 t.
(a) Solve y = 0 for t. The positive value gives the time that the pumpkin hits the ground after launch. The value of t will
still have a v0 in it, but that is fine right now. (b) Plug that positive value of t into the equation for x. This gives the
distance that the pumpkin travels. It will still have a v0 in it. (c) Set that distance equal to 300 and solve for v0 . This is
the necessary launch speed (in feet per second).
VII. (30 points; 10 points each) Evaluate the following integrals.
(a) $\int x^3 e^{2x}\,dx$ (b) $\int \sin(\sin x) \cos x\,dx$ (c) $\int \frac{6x^2 - 3x + 6}{x (x + 1)(x - 2)}\,dx$
Final, Fall 2009
I. (10 points; 5 points each) Find the numeric values for the following summations:
(a) $\sum_{l=3}^{8} (l + 3)^2$ (b) $\sum_{n=1}^{6} n \cos(n\pi)$
II. (10 points) Solve the following initial value problem: $\frac{dy}{dt} = t^2 + 2t + 1$, $y(0) = 1$.
III. (20 points; 5 points each) One of Dr. C’s favorite (yet, admittedly perverse) activities is scaring students as they enter
calc class. As Jacob entered class on a sleepy Monday morning, Dr. C leaped from his hiding spot and yelled, ’Recall.’
Jacob, of course, jumped. His initial horizontal velocity was 5 ft/sec and his initial vertical velocity was 16 ft/sec. Start
t = 0 at the moment he launched. The equations for ballistic motion are $x(t) = v_{0x} t + x_0$ and $y(t) = -\frac{1}{2} g t^2 + v_{0y} t + y_0$;
use the value $g = 32$ ft/sec$^2$.
(a) When did Jacob land back on the ground? (b) How far did he travel horizontally? (c) When did he reach the
top of his jump? (d) How high did he go?
IV. (25 points, 5 points each) For each of the following four integrals, give the method that should be used to begin to
evaluate it (the first step), and the appropriate information about the method. The possible methods, with the corresponding
information, are:
Method                  Information
Substitution            u = ?
Partial fractions       Setup
Integration by parts    u = ?
(a) $\int x^2 e^{x^3}\,dx$ (b) $\int \frac{\mathrm{Arctan}(\ln x)}{x}\,dx$ (c) $\int \frac{6x^2 - 15x + 22}{(x + 3)(x^2 + 2)^2 (x + 7)^3}\,dx$ (d) $\int (x^2 + 3x + 2) \sinh(2x)\,dx$
(e) Pick one of these four integrals and work it out completely.
V. (20 points) Find all three critical points of f (x, y) = 2 x2 + y2 − x2 y − 5 and classify all of them.
VI. (10 points) A reactor plant operator must monitor the flow rate through the nuclear core to assure adequate cooling.
When a flow sensor fails, the engineer on watch uses an energy conservation principle to develop the following relationship
between coolant flow rate (Q) and pressure (p) at another monitoring point in the plant: $24 p^2 + 6 Q = 18$. On the next watch
(2 hours later), the operator observes that the pressure (p) is 800 pounds per square inch (psi) and is increasing at a rate of
0.01 psi per hour (d p/dt = 0.01). How is the coolant flow rate changing at this time (i.e., find dQ/dt at this monitoring
point at this time)? Assume that all relevant units are consistent to give the rate of change.
VII. (10 points) In a typical display of caring and sensitivity, Jess offered to carry Jordan’s backpack to his classes while
he was recovering from a sprained ankle. As part of her fitness program, Jess maintained careful records of her trips. She
described her path with the following functions: x(t) = cos(t/6), y(t) = sin(t/4). SET UP the integral that would determine
the length Jess walked in the noble endeavor from t = 0 to t = 720.
VIII. (10 points Bonus) Find the area between the curves y = x3 − 2 x and y = x2 . See the following sketch.
Final, Fall 2010
I. (10 points; 5 points each) Find the numeric values for the following sums.
(a) $\sum_{k=2}^{5} \frac{k^2 - 4}{k + 1}$ (b) $\sum_{j=-1}^{1} j^2 - 3j + 2$
II. (10 points) Solve the following IVP: $\frac{dy}{dt} = t^3 \cos(t^4)$, $y(0) = 10$.
III. (15 points; 5 points each) With the abundant free time available due to an unusually slow social season, the Asbury
math men built a catapult. After completion, they learned that the projectile would take off with a vertical velocity of 48
ft/sec and a horizontal velocity of 18 ft/sec from ground level. Start with t = 0 at launch time and use the ballistic motion
equations to answer the following questions.
(a) Find the total flight time of the projectile from launch until it hits the ground. (b) What’s the range, i.e., how far will
it go horizontally in this time? (c) On a sunny spring afternoon, our heroes hear that a group of women are studying on
the Student Center deck. They decide to impress these young ladies by launching a container filled with difficult proofs. To
maintain anonymity (and because they couldn’t push the catapult any closer), they launched from a distance of 20 ft from
the deck. They were concerned that their projectile would clear the 12 ft rail. Calculate how long it will take to reach this
distance and then calculate the altitude of the object at that time to decide if the projectile will clear the rail.
IV. (40 points; 10 points each) Find the following integrals
(a) $\int \frac{1 - \sin(x)}{x + \cos(x)}\,dx$ (b) $\int \ln(4x)\,dx$ (c) $\int \frac{2x - 1}{x^2 - 5x + 6}\,dx$ (d) $\int_0^{\pi/4} x^3 \sin(2x)\,dx$
V. (15 points) Find and classify all three critical points of $f(x, y) = x^2 + 2xy^2 + 6y^2 - 2x$.
VI. (15 points) Find the total area between $y = x^4 + 2x^3 - 3x^2 - 8x - 4$ and the x-axis. Hint: $x^4 + 2x^3 - 3x^2 - 8x - 4 =
(x - 2)(x + 1)^2 (x + 2)$. The following graph may be helpful:
Final, Fall 2011
I. (10 points, 5 points each) Find the numeric values for the following summations:
(a) $\sum_{l=3}^{8} (l + 1)^3$ (b) $\sum_{n=1}^{6} n \cos(n\pi)$
II. (10 points) Find the area between the curves y = x3 − 2 x and y = x2 . Refer to the graph.
III. (10 points) Solve the following initial value problem:

$\frac{dy}{dt} = t^2 + 2t + 1$, $y(0) = 1$.
IV. (40 points, 10 points each) Evaluate the following four integrals.
(a) $\int x^2 e^{x^3}\,dx$ (b) $\int_0^{\pi/4} x \sin(2x)\,dx$ (c) $\int \frac{x^2 + 2}{(x - 1)(2x - 8)(x + 2)}\,dx$ (d) $\int (x^2 + 3x + 2) \sinh(2x)\,dx$
V. (20 points, 10 points each) Given the function $f(x, y) = x^3 - 3xy + \frac{y^2}{2} + 8$.
(a) Find the two critical points of this function. (b) Categorize these points according to the second derivative test.
VI. (10 points) A reactor plant operator must monitor the flow rate through the nuclear core to assure adequate cooling.
When a flow sensor fails, the engineer on watch uses an energy conservation principle to develop the following relationship
between coolant flow rate (Q) and pressure (p) at another monitoring point in the plant: $24 p^2 + 6 Q = 18$. On the next
watch (2 hours later), the operator observes that the pressure (p) is 800 pounds per square inch (psi) and is increasing at a
rate of 0.01 psi per hour ($\frac{dp}{dt} = 0.01$). How is the coolant flow rate changing at this time (i.e., find $\frac{dQ}{dt}$ at this monitoring
point at this time)? Assume that all relevant units are consistent to give the rate of change.
VII. (10 points, 5 points each) In an effort to impress the women of the calculus class, the calculus math men built a
trebuchet for launching flaming frozen turkeys (as a holiday celebration). After the WPD were called, one officer used his
radar gun to determine that $v_{0x} = 35$ ft/sec and $v_{0y} = 64$ ft/sec. The trebuchet launched the birds from a height of 4 ft. Use
the equations of ballistic motion
$x(t) = v_{0x} t + x_0$ and $y(t) = -\frac{1}{2} g t^2 + v_{0y} t + y_0$ (with $g = 32$ ft/sec$^2$)
to answer the following questions.
(a) How far up (maximum height) did the turkey fly? (b) How far away did the turkey land?
Final, Fall 2012
I. (10 points; 5 points each) Find the numeric values for the following summations.
(a) $\sum_{l=1}^{3} (l - 3)^2$ (b) $\sum_{n=1}^{4} (n^2 - 3n + 1)$
II. (10 points) Find the area between the curves $y = x\sqrt{1 - x^2}$ and $y = x/2$.
III. (10 points) Solve the following initial value problem: $\frac{dy}{dt} = t\sqrt{t^2 + 3}$, $y(0) = 1$.
IV. (40 points; 10 points each) Evaluate the following four integrals.
(a) $\int_0^{\pi/2} x^3 \sin(4x^4)\,dx$ (b) $\int (x^2 - 2x - 3) e^{2x}\,dx$ (c) $\int \frac{3x^2 - 4x + 5}{(x - 1)(x^2 + 1)}\,dx$ (d) $\int t^2 \sqrt{t + 8}\,dt$
V. (20 points; 10 points each) Given the function f (x, y) = 4 x − 3 x3 − 2 x y2 .
(a) Find the critical points of this function. (b) Categorize these points according to the second derivative test.
VI. (10 points) Given that $\frac{dx}{dt} = 0.2$ at $x = 4$, $y = 1$, and $x^2 y - 2\sqrt{x y} = 17$, find $\frac{dy}{dt}$ at this same point.
VII. (10 points; 5 points each) A projectile (OK, I’m trying to behave here. . . .) is launched from ground level (x = y = 0)
with v0x = 10 ft/sec and v0y = 16 ft/sec. Use the equations of ballistic motion, x(t) = v0x t + x0 and y(t) = −(1/2) gt 2 +
v0y t + y0 , where g = 32 ft/sec2 , vx (t) = v0x , and vy (t) = v0y − gt, to answer the following questions.
(a) What is the maximum height of the projectile? (b) What is its range, i.e., how far will it go in the x direction?
Final, Fall 2013
I. (10 points; 5 points each) Find the numeric values for the following summations.
(a) $\sum_{l=4}^{8} (l^2 + 1)$ (b) $\sum_{n=2}^{5} n \sin\left(\frac{n\pi}{2}\right)$
II. (15 points) Find the area between the curves y = t 3 − 2 and y = 2t 2 + 3t − 2. Reference the following graph.
III. (10 points) Solve the following initial value problem: $\frac{dy}{dx} = x \cos(x^2)$, $y(0) = 1$.
IV. (24 points; 8 points each) Find the following integrals.
(a) $\int x^3 e^{x^4}\,dx$ (b) $\int \ln(5x)\,dx$ (c) $\int (x^2 + 3x + 2) \cosh(3x)\,dx$
V. (4 points) SET UP the partial fraction expansion for $\frac{x^4 - 5x^2 + 22}{(x - 3)(x^2 + 1)^2 (x - 7)^3}$. DON’T determine the coefficients!
VI. (10 points) Given the function f (x, y) = 2 x2 − y3 − 2 x y, find all the relative max/min values.
VII. (15 points; 5 points each) One of Dr. C’s favorite integration apps is ballistic motion. (He uses it to model projectile
vomiting as part of a parenting class; don’t ask. . . .) Due to time constraints, he was not able to cover it this year, but he
thought, ‘Hey! Why not cover it on the exam?’, so here goes. Near the surface of the earth, the acceleration of gravity, g,
is a constant (32 ft/sec$^2$ or 9.8 m/sec$^2$). Since acceleration is the time derivative of velocity or speed (v), we get another of
Dr. C’s favorite things, an IVP, i.e., $\frac{dv}{dt} = -g$, $v(0) = v_0$, where $v_0$ is a given initial speed.
(a) Solve this IVP to get $v(t) = v_0 - g t$. (b) You may be thinking to yourself, ‘Hey! Isn’t velocity the derivative of
distance?’ Yes, so it makes another IVP for distance, y: $\frac{dy}{dt} = v(t)$, $y(0) = y_0$, where $v(t) = v_0 - g t$ comes from part (a) and $y_0$
is the given initial distance. OK, you know the drill. Solve this IVP for $y(t)$. (c) Suppose an object is thrown up from
ground level ($y_0 = 0$) with an initial speed of 64 ft/sec. Maximize your equation in part (b) to find how high the object will
fly, i.e., find the maximum distance y. (Use $g = 32$ ft/sec$^2$.)
VIII. (15 points) In an increasingly desperate attempt to court female attention, Team Math Man (male calculus students
who requested to remain anonymous) volunteered to help the college physical plant solve the hot water shortage in Glide-
Crawford. The engineer in charge develops an equation relating the pressure at the boiler (p) to the flow at the Glide-
Crawford (GC) header (Q) as −p3 + 6 Q = 18. During the morning prime shower time, he measures the pressure at the
boiler as 800 pounds per square inch (psi) and the pressure as falling (−) at a rate of 0.1 psi per hour (d p/dt). How can
Team Math Man find the rate at which the pressure at the GC header is changing (dQ/dt) at this time? Find the value of
dQ/dt at this time.
Miscellaneous questions, not from any test.
The following were not problems on tests, but were problems that one student requested to work on. They are, in general,
harder than what I would put on a test, but are excellent practice. If you can work these, you will be well-prepared for the
final.
6. Find $\int x\sqrt{a x + b}\,dx$, showing your work.
7. Find $\int_{-1}^{1} \frac{3x - 1}{\sqrt{3x^2 - 2x + 3}}\,dx$.
8. Find $\frac{d}{dx}\left[\int_x^{2x} \sin(t^3)\,dt\right]$.
Note that it is impossible to find the indefinite integral of $\sin(t^3)$, but you don’t need to find it to solve this problem. Show
your work.
9. Sketch the region in the first quadrant above the line $y = 3x - 2$, and below the line $y = 4$. Find the area of the region.
10. Find the length of the curve $y = \frac{1}{3}\sqrt{x}\,(3 - x)$ from $x = 0$ to $x = 3$. Your answer should be a number, which you can
leave in any form you want.
11. Find the area between the curve y = x3 − 6 x2 + 8 x and the x-axis.
These are problems that were on miscellaneous other tests, and are also good practice.
12. Find and classify the critical points of the function f (x, y) = x3 + 8 y3 − 6 x2 − 12 y2 + 4. (There will be four of them.)
13. (15 points total)
a.) (10 points) Find the equation of the linear regression line for the following data:
x y
0 −1
2 2
4 2
6 5
b.) (5 points) Predict the value of y at x = 5.
Summary sheet
Usual derivative formulas, plus:
Formulas from finance: Simple interest: FVIF $= 1 + r t$
Compound interest: FVIF $= (1 + k)^n$
Continuously compounded interest: FVIF $= e^{r t}$.
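The three future-value interest factors can be compared side by side with a short script. This is only a sketch: the function names and the sample rate and term are made up for illustration and are not from any homework problem.

```python
import math

def fvif_simple(r, t):
    """Simple interest: FVIF = 1 + r*t."""
    return 1 + r * t

def fvif_compound(k, n):
    """Compound interest: FVIF = (1 + k)^n, with k the rate per period and n the number of periods."""
    return (1 + k) ** n

def fvif_continuous(r, t):
    """Continuously compounded interest: FVIF = e^(r*t)."""
    return math.exp(r * t)

if __name__ == "__main__":
    # Hypothetical example: 5% per year for 10 years.
    r, t = 0.05, 10
    print("simple:    ", fvif_simple(r, t))
    print("compound:  ", fvif_compound(r, t))
    print("continuous:", fvif_continuous(r, t))
```

Multiplying any of these factors by the present value gives the future value; the continuous factor is the largest of the three for positive r and t.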
Linear regression formulas:
$m = \frac{n (\sum x_i y_i) - (\sum x_i)(\sum y_i)}{n (\sum x_i^2) - (\sum x_i)^2}$ and $b = \frac{(\sum x_i^2)(\sum y_i) - (\sum x_i)(\sum x_i y_i)}{n (\sum x_i^2) - (\sum x_i)^2}$
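The regression formulas translate directly into code. The sketch below applies them to the four data points of miscellaneous problem 13, namely (0, −1), (2, 2), (4, 2), (6, 5). The function name is illustrative, and exact fractions are used on the assumption that the data are integers.

```python
from fractions import Fraction

def regression_line(xs, ys):
    """Return (m, b) for the least-squares line y = m*x + b, using the summary-sheet formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    denom = n * sum_xx - sum_x ** 2          # shared denominator of m and b
    m = Fraction(n * sum_xy - sum_x * sum_y, denom)
    b = Fraction(sum_xx * sum_y - sum_x * sum_xy, denom)
    return m, b

if __name__ == "__main__":
    m, b = regression_line([0, 2, 4, 6], [-1, 2, 2, 5])
    print("m =", m, " b =", b)
    print("prediction at x = 5:", m * 5 + b)
```

For this data the sums are $\sum x_i = 12$, $\sum y_i = 8$, $\sum x_i y_i = 42$, and $\sum x_i^2 = 56$, so both formulas share the denominator 80.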
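Second derivative test in two variables:
For $D = \left(\frac{\partial^2 f}{\partial x\,\partial y}\right)^2 - \frac{\partial^2 f}{\partial x^2}\,\frac{\partial^2 f}{\partial y^2}$,
$D > 0$: the critical point is a saddle.
$D < 0$ and $\frac{\partial^2 f}{\partial x^2} > 0$: the critical point is a relative min.
$D < 0$ and $\frac{\partial^2 f}{\partial x^2} < 0$: the critical point is a relative max.
$D = 0$ is a mess (any of these, or worse).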
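The test is mechanical once the three second partials are known at a critical point. The sketch below uses this sheet's sign convention, $D = f_{xy}^2 - f_{xx} f_{yy}$ (the opposite sign from many textbooks). The function name and the sample function $f(x, y) = x^3 + y^3 - 3xy$ are chosen for illustration.

```python
def classify_critical_point(fxx, fyy, fxy):
    """Classify a critical point from its second partials, with D = fxy^2 - fxx*fyy."""
    d = fxy ** 2 - fxx * fyy
    if d > 0:
        return "saddle"
    if d < 0:
        # D < 0 forces fxx*fyy > 0, so fxx cannot be zero here.
        return "relative min" if fxx > 0 else "relative max"
    return "inconclusive"  # D = 0: the test gives no information

if __name__ == "__main__":
    # For f(x, y) = x^3 + y^3 - 3xy: fxx = 6x, fyy = 6y, fxy = -3.
    for (x, y) in [(0, 0), (1, 1)]:
        print((x, y), classify_critical_point(6 * x, 6 * y, -3))
```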
Final, Fall 2004, Answers
I. (a) $(3t^2)/(2 e^{2t})$ (b) $-(\frac{1}{2} x e^x y^{-1/2})/(2 z x + (e^x + x e^x) y^{1/2} + \sin x)$
II. (a) 9/2 (b) $\int_{-\sqrt{3}}^{\sqrt{3}} \sqrt{1 + 9x^4}\,dx$
III. $y = \frac{1}{2} \sin(t^2) + 1$
IV. (a) $\frac{1}{2}(x^4 + x^2 + 1) \sinh(2x) - \frac{1}{4}(4x^3 + 2x) \cosh(2x) + \frac{1}{8}(12x^2 + 2) \sinh(2x) - \frac{1}{16}(24x) \cosh(2x) + \frac{1}{32}(24) \sinh(2x) + C$
(b) $-(3 - 2x - x^2)^{1/2} + C$ (c) $\frac{1}{3} \ln|x - 1| - \frac{1}{3} \ln|x + 2| + C$ (d) $2 \ln 2 - \frac{3}{4}$
V. −2
VI. Solve 6 A − 20 + 2 B (sin(1) + sin(2)) = 0 and −2 (3 − A − B sin(1)) sin(1) − 2 (7 − A − B sin(2)) sin(2) = 0.
I. (a) (7/6) (b) 1

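II. (a) 128/15 (b) $\int_{-2}^{2} \sqrt{1 + (4t^3 - 4t)^2}\,dt$
III. $y = \frac{1}{2} t^2 \ln(t) - \frac{1}{4} t^2 + \frac{1}{4}$
IV. (a) $\frac{1}{2}(\cos(2t + 1))^{-1} + C$ (b) $\frac{1}{2} e^{2x}(x^3 - 2x^2 + 3x) - \frac{1}{4} e^{2x}(3x^2 - 4x + 3) + \frac{1}{8} e^{2x}(6x - 4) - \frac{1}{16} e^{2x}(6) + C$ (c)
$\frac{2}{7} \ln|x + 6| + \frac{5}{7} \ln|x - 1| + C$ (d) $\frac{\pi - 1}{2}$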
V. −65.6
VI. Saddle points at (0, 0) and (−2, 2); relative min at (0, 2); relative max at (−2, 0).
I. (a) 50 (b) 50
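II. $\int_{-10}^{10} \sqrt{1 + \left[5\left(\frac{1}{10} e^{x/10} - \frac{1}{10} e^{-x/10}\right)\right]^2}\,dx$
III. 80 feet
IV. $y = -\frac{1}{2} \cos(x^2) + \frac{3}{2}$
V. (a) $-2$ (b) $\frac{1}{3}(x^2 + 2x + 3) \cosh(3x) - \frac{1}{9}(2x + 2) \sinh(3x) + \frac{2}{27} \cosh(3x) + C$ (c) $-\frac{1}{3} \ln(x + 2) + \frac{1}{3} \ln(x - 1) + C$
(d) $\frac{1}{3} x^3 \ln x - \frac{1}{9} x^3 + C$
VI. 3
VII. Saddle at (0, 0), min at (1, 1)
VIII. Area $= \int_0^b \frac{h}{b} x\,dx = \frac{h}{2b} x^2 \Big|_0^b = \frac{h}{2b}(b^2 - 0) = \frac{h b^2}{2b} = \frac{1}{2} h b$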

I. (a) 790 (b) $\sqrt{2} + \sqrt{5} + \sqrt{10} + \sqrt{17} \approx 10.93566$
II. $y = \frac{1}{2} e^{t^2} + (5/2)$
III. (a) $72/289 \approx 0.24913$ (b) $(x^2 - 2x)(\frac{1}{2} e^{2x}) - (2x - 2)(\frac{1}{4} e^{2x}) + (2)(\frac{1}{8} e^{2x}) + C$ (c) $-\frac{1}{9} \ln|x - 4| - \frac{1}{3}(x - 4)^{-1} + \frac{1}{9} \ln|x - 1| + C$ (d) $\frac{1}{2}(t^2 - 1) e^{t^2} + C$.
IV. (0, 0) is a saddle point; (1, 1) is a local min; (−1, −1) is a local min.
V. (a) 256 (b) $\int_{-5}^{2} \sqrt{1 + (-12y - 3y^2)^2}\,dy$
VI. 0.5 ft/min.
I. (a) $15\tfrac{1}{2}$ (b) $-13/14$
II. (a) $\frac{\partial f}{\partial x} = \frac{\sin(2xy + 3z)\,(y^2 z) - (x y^2 z)\,((2y) \cos(2xy + 3z))}{\sin^2(2xy + 3z)}$ (b) $\frac{\partial f}{\partial y} = \frac{\sin(2xy + 3z)\,(2xyz) - (x y^2 z)\,((2x) \cos(2xy + 3z))}{\sin^2(2xy + 3z)}$
(c) $\frac{\partial f}{\partial z} = \frac{\sin(2xy + 3z)\,(x y^2) - (x y^2 z)\,((3) \cos(2xy + 3z))}{\sin^2(2xy + 3z)}$
III. y = x3 − 6 x + 4
IV. Relative min at (0, 0); saddle points at (±2, 2).
V. 4/3
VI. (a) $t = 0.0401875\,v_0$ (b) $x = 0.030784\,v_0^2$ (c) $v_0 = 98.72$
VII. (a) $\frac{1}{2} x^3 e^{2x} - \frac{3}{4} x^2 e^{2x} + \frac{3}{4} x e^{2x} - \frac{3}{8} e^{2x} + C$ (b) $-\cos(\sin x) + C$ (c) $-3 \ln|x| + 5 \ln|x + 1| + 4 \ln|x - 2| + C$
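The three steps of the trebuchet problem (VI on this exam) can be checked numerically. The sketch below assumes the equations given on the test, $x = 0.766\,v_0 t$ and $y = -16t^2 + 0.643\,v_0 t$; the function and parameter names are illustrative.

```python
import math

def launch_speed(distance, vx_coeff=0.766, vy_coeff=0.643, half_g=16.0):
    """Launch speed v0 (ft/sec) needed to land `distance` feet away on level ground."""
    # Step (a): y = 0 at t = (vy_coeff / half_g) * v0, the positive root.
    time_per_v0 = vy_coeff / half_g
    # Step (b): x = vx_coeff * v0 * t = (vx_coeff * time_per_v0) * v0^2.
    range_per_v0_squared = vx_coeff * time_per_v0
    # Step (c): set the range equal to `distance` and solve for v0.
    return math.sqrt(distance / range_per_v0_squared)

if __name__ == "__main__":
    print("flight time per unit v0:", 0.643 / 16.0)
    print("range per unit v0^2:   ", 0.766 * 0.643 / 16.0)
    print("v0 for 300 ft:         ", launch_speed(300))
```

The intermediate coefficient in step (b) is $0.766 \times 0.0401875 \approx 0.030784$, which is the value that leads to the launch speed in part (c).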
I. (a) 271 (b) 3

II. $y(t) = \frac{1}{3} t^3 + t^2 + t + 1$
III. (a) Landed at t = 1 second (b) 5 feet (c) 0.5 second (d) 4 feet
IV. (a) Substitution, $u = x^3$. (Integral becomes $\frac{1}{3} e^{x^3} + C$.) (b) Substitution, $u = \ln x$. (Integral becomes $\ln x\,\mathrm{Arctan}(\ln x) - \frac{1}{2} \ln[1 + (\ln x)^2] + C$.)
(c) Partial fractions, setup is $\frac{A}{x + 3} + \frac{Bx + C}{x^2 + 2} + \frac{Dx + E}{(x^2 + 2)^2} + \frac{F}{x + 7} + \frac{G}{(x + 7)^2} + \frac{H}{(x + 7)^3}$. (Integral
is a real mess, but Maple says it is $(1/64) \ln(x + 3) - (7901/2255067) \ln(x^2 + 2) + (29927/9020268) \sqrt{2}\,\mathrm{Arctan}(x\sqrt{2}/2) -
(1243739/144324288) \ln(x + 7) + (1/1061208)(-2900x + 6020)/(x^2 + 2) + 48427/(2122416 (x + 7)) + 421/(20808 (x + 7)^2) + C$.)
(d) Integration by parts (tabular integration), $u = x^2 + 3x + 2$, $dv = \sinh(2x)\,dx$. (Integral becomes $(x^2 + 3x + 2)(1/2) \cosh(2x) - (2x + 3)(1/4) \sinh(2x) + (2)(1/8) \cosh(2x) + C$.) (e) (Included above)
V. (0, 0) is a relative min; (2, 2) is a saddle; (−2, 2) is a saddle.
VI. dQ/dt = −64
VII. $\int_0^{720} \sqrt{\frac{1}{36} \sin^2\left(\frac{t}{6}\right) + \frac{1}{16} \cos^2\left(\frac{t}{4}\right)}\,dt$
VIII. 37/12
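The related-rates answer to the reactor problem (VI on this exam) follows from differentiating $24p^2 + 6Q = 18$ with respect to time, which gives $48 p \frac{dp}{dt} + 6 \frac{dQ}{dt} = 0$. A minimal check of the arithmetic, with an illustrative function name:

```python
def coolant_flow_rate_change(p, dp_dt):
    """dQ/dt implied by 24*p^2 + 6*Q = 18, i.e. 48*p*dp/dt + 6*dQ/dt = 0."""
    return -48.0 * p * dp_dt / 6.0

if __name__ == "__main__":
    # Values from the problem: p = 800 psi, dp/dt = 0.01 psi per hour.
    print(coolant_flow_rate_change(800, 0.01))
```

The negative sign says the flow rate is falling while the pressure rises.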
I. 143/20 = 7.15
II. y = (1/4) sin(t 4 ) + 10
III. (a) 3 seconds (b) 54 feet (c) t = 10/9 second; 33.5 feet
IV. (a) $\ln(x + \cos(x)) + C$ (b) $x \ln(4x) - x + C$ (c) $5 \ln(x - 3) - 3 \ln(x - 2) + C$ (d) $(3/4)(\pi/4)^2 - (3/8) \approx 0.08764$
V. (1, 0) is a relative min; (−3, 2) is a saddle point; (−3, −2) is another saddle point.
VI. 96/5 = 19.2
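The total-area answer in VI can be confirmed exactly. Since $x^4 + 2x^3 - 3x^2 - 8x - 4 = (x - 2)(x + 1)^2 (x + 2)$, the curve meets the x-axis at $-2$, $-1$, and $2$, and the total area is the sum of the absolute values of the integrals between consecutive roots. The sketch below does this with exact fractions; the helper names are illustrative.

```python
from fractions import Fraction

# Coefficients of x^4 + 2x^3 - 3x^2 - 8x - 4, highest power first.
COEFFS = [1, 2, -3, -8, -4]

def antiderivative(x):
    """Evaluate an antiderivative of the polynomial at x (constant of integration 0)."""
    x = Fraction(x)
    n = len(COEFFS)
    # The term c*x^(n-1-i) integrates to c/(n-i) * x^(n-i).
    return sum(Fraction(c, n - i) * x ** (n - i) for i, c in enumerate(COEFFS))

def total_area(roots):
    """Sum |integral| over each interval between consecutive roots."""
    pts = sorted(roots)
    return sum(abs(antiderivative(b) - antiderivative(a)) for a, b in zip(pts, pts[1:]))

if __name__ == "__main__":
    print(total_area([-2, -1, 2]))
```

Splitting at every root is the safe habit, even though here the curve only touches the axis at $x = -1$ and stays below it on the whole interval.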
I. (a) 271 (b) 3

II. 37/12
III. $y = \frac{1}{3} t^3 + t^2 + t + 1$
IV. (a) $\frac{1}{3} e^{x^3} + C$ (b) 1/4 (c) $-\frac{1}{6} \ln(|x - 1|) + \frac{1}{2} \ln(|2x - 8|) + \frac{1}{6} \ln(|x + 2|) + C$ (d) $\frac{1}{2}(x^2 + 3x + 2) \cosh(2x) - \frac{1}{4}(2x + 3) \sinh(2x) + \frac{1}{4} \cosh(2x) + C$.
V. (a) (0, 0) and (3, 9) (b) (0, 0) is a saddle point; (3, 9) is a relative minimum.
VI. dQ/dt = −64.
VII. (a) 68 feet (b) 142.15 feet
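The two ballistic answers in VII can be reproduced from $x(t) = v_{0x} t + x_0$ and $y(t) = -\frac{1}{2} g t^2 + v_{0y} t + y_0$. The sketch below assumes $g = 32$ ft/sec$^2$ and $x_0 = 0$, neither of which is stated with the answers; the function names are illustrative.

```python
import math

def max_height(v0y, y0, g=32.0):
    """Height at the top of the flight, where vy = v0y - g*t = 0."""
    t_top = v0y / g
    return -0.5 * g * t_top ** 2 + v0y * t_top + y0

def landing_distance(v0x, v0y, y0, x0=0.0, g=32.0):
    """Horizontal position when y(t) = 0, taking the positive root of the quadratic."""
    t_land = (v0y + math.sqrt(v0y ** 2 + 2.0 * g * y0)) / g
    return v0x * t_land + x0

if __name__ == "__main__":
    # Values from the problem: v0x = 35 ft/sec, v0y = 64 ft/sec, launch height 4 ft.
    print("max height:", max_height(64.0, 4.0))
    print("range:     ", landing_distance(35.0, 64.0, 4.0))
```

Because the launch height is 4 ft rather than ground level, the landing time is slightly more than twice the time to the top.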
I. (a) 30 (b) 4
II. 5/24 ≈ 0.2083
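III. $y = \frac{1}{3}(t^2 + 3)^{3/2} + (1 - \sqrt{3})$
IV. (a) $-\frac{1}{16} \cos(\pi^4/4) + \frac{1}{16}$ (b) $\frac{1}{2} e^{2x}(x^3 - 2x - 3) - \frac{1}{4} e^{2x}(3x^2 - 2) + \frac{1}{8} e^{2x}(6x) - \frac{6}{16} e^{2x} + C$ (c) $2 \ln|x - 1| - 3\,\mathrm{Arctan}\,x + \frac{1}{2} \ln(x^2 + 1) + C$
(d) $\frac{2}{3} t^2 (t + 8)^{3/2} - \frac{8}{15} t (t + 8)^{5/2} + \frac{8}{15} \cdot \frac{2}{7} (t + 8)^{7/2} + C$ or $\frac{2}{7}(t + 8)^{7/2} - \frac{2}{5} \cdot 16\,(t + 8)^{5/2} + \frac{2}{3} \cdot 64\,(t + 8)^{3/2} + C$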
V. (a) The critical points are $(0, \sqrt{2})$, $(0, -\sqrt{2})$, $(2/3, 0)$, $(-2/3, 0)$. (b) $(0, \pm\sqrt{2})$ are saddle points. $(2/3, 0)$ is a
relative maximum. $(-2/3, 0)$ is a relative minimum.
VI. −3/28 ≈ −0.1071.
VII. (a) 4 feet (b) 10 feet
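The related-rates answer in VI can be checked by implicit differentiation. The sketch below assumes the relation is $F(x, y) = x^2 y - 2\sqrt{x y} = 17$, which is the reading that reproduces $-3/28$; then $\frac{dy}{dt} = -\frac{F_x}{F_y} \frac{dx}{dt}$. The function name is illustrative.

```python
import math

def dy_dt(x, y, dx_dt):
    """dy/dt from F(x, y) = x^2*y - 2*sqrt(x*y) = constant, via F_x*dx/dt + F_y*dy/dt = 0."""
    root = math.sqrt(x * y)
    f_x = 2.0 * x * y - y / root   # partial of F with respect to x
    f_y = x ** 2 - x / root        # partial of F with respect to y
    return -f_x * dx_dt / f_y

if __name__ == "__main__":
    # Values from the problem: x = 4, y = 1, dx/dt = 0.2.
    print(dy_dt(4.0, 1.0, 0.2))
```

At the given point $F_x = 7.5$ and $F_y = 14$, so the rate is $-(7.5)(0.2)/14$.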
Miscellaneous problems
6. $\frac{2}{5a^2}(a x + b)^{5/2} - \frac{2b}{3a^2}(a x + b)^{3/2} + C$
7. $2 - 2\sqrt{2}$.
8. $2 \sin(8x^3) - \sin(x^3)$.
9. 16/3
10. $2\sqrt{3}$
11. 8
12. (0, 0) is a max; (0, 1) is a saddle; (4, 0) is a saddle; (4, 1) is a min
13. (a) $y = \frac{9}{10} x - \frac{7}{10}$ (b) 19/5
I. (a) 195 (b) 2

II. 71/6 ≈ 11.83.
III. $y = \frac{1}{2} \sin(x^2) + 1$.
IV. (a) $\frac{1}{4} e^{x^4} + C$ (b) $x \ln(5x) - x + C$ (c) $\frac{1}{3}(x^2 + 3x + 2) \sinh(3x) - \frac{1}{9}(2x + 3) \cosh(3x) + \frac{2}{27} \sinh(3x) + C$
V. $\frac{A}{x - 3} + \frac{Bx + C}{x^2 + 1} + \frac{Dx + E}{(x^2 + 1)^2} + \frac{F}{x - 7} + \frac{G}{(x - 7)^2} + \frac{H}{(x - 7)^3}$
VI. Critical point (0, 0) is a saddle (Pringle). Critical point (−1/6, −1/3) is a relative min.
VII. (a) Integrating gives $\int dv = \int (-g)\,dt$, or $v = -g t + C$. Plug in the initial condition to get the value of C: $v_0 =
-g (0) + C$, or $C = v_0$. The solution to the IVP is $v(t) = -g t + v_0 = v_0 - g t$. (b) $y = y_0 + v_0 t - \frac{1}{2} g t^2$. (c) 64 feet
VIII. dQ/dt = −80/3 ≈ −26.67
Appendix A
Answers to Homework Exercises
A.1 Chapter 0.
Homework #1
Exercises.
1. (a) is a function, since an input of 1 always produces an output of 1, and there are no other repeated inputs. (b) is not
a function, since an input of 1 produces different outputs. (c) is not a function, since it does not pass the vertical line test.
(d) is a function, since it produces an unambiguous number. (e) is a function, since it produces an unambiguous number.
2. (a) is a function since there are no repeated inputs. (b) is not a function since (for example) an input of 1 could produce
1 or 6. (c) is a function, since it passes the vertical line test. (d) is a function, since it produces an unambiguous number.
(e) is a function, since it produces an unambiguous number.
3. (a) {1, 2, 3, 4}. (b) {1, 2, 3, 4}. (d) The domain is all (real) numbers. (e) The domain is all non-negative numbers.
4. (a) {1, 2, 3, 4, 5, 6} (b) {1, 2, 3} (d) The domain is all non-negative real numbers. (e) The domain is all (real)
numbers.
5. There are 6 parts to the exercises, 3 parts to the problems, and 7 parts on the investigation, for a total of 6 ∗ 1 + 3 ∗ 2 + 7 ∗
3 = 33 points.
Homework #2
Exercises.
1. (a) Has an inverse, given by {(1, 1), (2, 4), (3, 3), (4, 2)}. (b) Has no inverse, since inputs of 2, 3, and 4 all give 1. (c)
Has no inverse, since it fails the horizontal line test. (d) Has no inverse, since inputs of 1 and −1 (for example) both give
1. (e) Has an inverse, given by x2 .
2. (a) Has no inverse, since (for example) both 1 and 6 as inputs give an output of 3. (b) Has an inverse, given by
{(1, 1), (2, 2), (3, 3), (4, 3), (5, 2), (6, 1)}. (c) Has no inverse, since it fails the horizontal line test. (d) Has an inverse,
given by x2 . (e) has no inverse, since inputs of 1 and −1 (for example) both produce an output of 1.
3. A function is one-to-one when each output comes from only one input. The inverse comes from interchanging the inputs
and outputs, so if a function is one-to-one then the inverse has the property that each input comes from only one output,
making the inverse a function.
Homework #3
Exercises.
1. (a) $f(g(x)) = 8x^2 + 6x + 1$, $g(f(x)) = 4x^2 - 2x + 1$ (b) $\frac{1}{2} x - \frac{1}{2}$ (c) [Graph not reproduced.] (d) $g(g^{-1}(x)) =
g(\frac{1}{2} x - \frac{1}{2}) = 2 (\frac{1}{2} x - \frac{1}{2}) + 1 = x$, and $g^{-1}(g(x)) = g^{-1}(2x + 1) = \frac{1}{2}(2x + 1) - \frac{1}{2} = x$. (e) 0, $\frac{1}{2}$ (there are lots of other
possibilities) (f) If $f(0) = f(1/2) = 0$, then the inverse value of 0 could be either 0 or 1/2, showing that the inverse is not a function.
2. (a) $f(g(x)) = 8x^2 + 18x + 9$, $g(f(x)) = 4x^2 - 6x + 3$ (b) $\frac{1}{2} x - \frac{3}{2}$ (c) [Graph not reproduced.] (d) $g(g^{-1}(x)) =
g(\frac{1}{2} x - \frac{3}{2}) = 2 (\frac{1}{2} x - \frac{3}{2}) + 3 = x$, and $g^{-1}(g(x)) = g^{-1}(2x + 3) = \frac{1}{2}(2x + 3) - \frac{3}{2} = x$. (e) 0, $\frac{3}{2}$
Note: The process of including the graphs distorts the axes. This means that angles will not look “right” in the
graphs.
3. $f(f(x)) = \frac{-1 + x}{x}$, $f(f(f(x))) = x$
4. $f(f(x)) = -\frac{-\frac{x}{1 - x}}{1 - \left(-\frac{x}{1 - x}\right)} = x$
5. (a) −16; 4 a (b) −4
6. (a) 3; a (b) 3
A.2 Chapter 1.
Homework #4
Exercises.
1. (a) y2 = 34; ∆x = 4, ∆y = 36 (b) msec = 9 (c) msec = 2 ∆x + 1 = 9
2. (a) y2 = 1, ∆x = 1, ∆y = 3 (b) msec = 3 (c) msec = 2 ∆x + 1 = 3
3. (a) y2 = 13, ∆x = −3, ∆y = 15 (b) msec = −5 (c) msec = 2 ∆x + 1 = −5 (d) No, the answers to parts (b) and (c)
agree, so there is no problem with ∆x being negative.
Homework #5
Exercises.
1. (a) msec = 6 x1 + 3 ∆x − 2 (b) 10
2. (a) msec = 6 x1 + 3 ∆x − 5 (b) −11
Homework #6
Exercises.
1. (a) ∆x now is 1−5 = −4, while before it was 5−1 = 4. And ∆y now is −2−34=−36, while before it was 34−(−2) = 36.
So, both ∆x and ∆y change signs. (b) Now, msec = (∆y)/(∆x) = (−36)/(−4) = 9. This is the same as the old value of
msec . (c) Both ∆x and ∆y change signs, so msec doesn’t change.
2. (a) ∆x is now 1 − 2 = −1, while before it was 2 − 1 = 1. And ∆y now is −2 − 1 = −3, while before it was 1 − (−2) = 3.
So, both ∆x and ∆y change signs. (b) Now, msec = (∆y)/(∆x) = (−3)/(−1) = 3. This is the same as the old value of msec .
(c) Both ∆x and ∆y change signs, so msec doesn’t change.

3. Call the points x1 and x2 , so ∆x = x2 − x1 . Then ∆y = tan(x2 ) − tan(x1 ) = tan(x1 + ∆x) − tan(x1 ). The process breaks
down at this point, since you can’t expand tan(x1 + ∆x) − tan(x1 ) to get a ∆x factor that can cancel the one in the bottom.
4. We already had that mtan = 4 x1 − 3, so substitute 1, 3, and 5 in for x1 , and you get 1, 9, and 17. To get the equations of
the tangent lines, we also need the y-coordinates, which are −2, 8, and 34, respectively. The equations of the tangent lines
are then: y − (−2) = 1 (x − 1), y − 8 = 9 (x − 3), and y − 34 = 17 (x − 5).
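The tangent lines in Exercise 4 can be recomputed in a few lines. The quadratic $f(x) = 2x^2 - 3x - 1$ used below is an assumption: the function is not restated with this answer, but this choice is consistent with $m_{tan} = 4x_1 - 3$ and with the y-coordinates $-2$, 8, and 34 quoted above.

```python
def f(x):
    """Assumed function, chosen to match the slopes and y-coordinates in the answer."""
    return 2 * x ** 2 - 3 * x - 1

def tangent_slope(x1):
    """Slope of the tangent line at x1, from m_tan = 4*x1 - 3."""
    return 4 * x1 - 3

def tangent_line(x1):
    """Return (slope, y-intercept) of the tangent line y - f(x1) = m*(x - x1)."""
    m = tangent_slope(x1)
    return m, f(x1) - m * x1

if __name__ == "__main__":
    for x1 in (1, 3, 5):
        m, b = tangent_line(x1)
        print(f"x1 = {x1}: point ({x1}, {f(x1)}), slope {m}, line y = {m}x + ({b})")
```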
Homework #7
Exercises.
1. (a) $\Delta y = 1.03$, $f'(x_1)\,\Delta x = 1.0$ (b) $\Delta y = -.0036430966$, $f'(x_1)\,\Delta x = -.003703703704$ (c) $\Delta y = -.011904762$,
$f'(x_1)\,\Delta x = -.0125$
2. (a) $\Delta y = 1.587$, $f'(x_1)\,\Delta x = 1.2$ (b) $\Delta y = .048808848$, $f'(x_1)\,\Delta x = .05$
(c) $\Delta y = -.0039215686$, $f'(x_1)\,\Delta x = -.004$
Homework #8
Exercises.
1 1 √3 3
1. (a) 72 x2 + 10 x − 10 (b) −32t + 5 (c) 8 x7 + 6 x5 + 4 x3 + 2 x (d) 3 x2/3 (e) x
+ x3/2 (f) − 10
x3
− 25 1
x3/2
2. (a) 4 x3 − 111 x2 + 16 (b) 15t 2 + 16t − 9 (c) 4 x3 − 3 x2 + 2 x − 1 (d) − x23 (e) − 18
x4
+ 12 x2 (f) 1 √1 1 1
2 x + 3 x2/3 + 41 1
x3/4
3. Varies with the student.
4. 4x−3
5. It is ds/dt.
Homework #9
Exercises.
1. (a) 35 (b) 3 (c) undefined (d) 0 (e) 0
2. (a) 54 (b) 12 (c) 0 (d) undefined (e) 0
3. (a) 2.025641026; 2.002506266; 2.000250063; 1.975609756; 1.997506234; 1.999750062. The limit seems to be 2. (b)
x−5
The function is undefined. (c) x−1 (d) 2 (e) The answers vary.
Homework #10
Exercises.
1. (a) $(21x^2 + 10x)(2x^4 - 5x) + (7x^3 + 5x^2)(8x^3 - 5)$ (b) $(1)\,x^{1/2} + x\,(\frac{1}{2} x^{-1/2})$
2. (a) $(10x + 3)(2x^5 - 3x^4) + (5x^2 + 3x)(10x^4 - 12x^3)$ (b) $1 \cdot x^{1/3} + x \cdot \frac{1}{3} x^{-2/3}$
3. (a) $\frac{(x)(2x) - (x^2 + 3)(1)}{x^2}$ (b) $\frac{(x^2 + 3)(1) - (x)(2x)}{(x^2 + 3)^2}$ (c) $\frac{(x^2 + 8)\,[(6x - 2)(5x + 7) + (3x^2 - 2x + 3)(5)] - [(3x^2 - 2x + 3)(5x + 7)]\,(2x)}{(x^2 + 8)^2}$
4. (a) $\frac{(x^2 + 1)(1) - (x)(2x)}{(x^2 + 1)^2}$ (b) $\frac{(x)(2x) - (x^2 + 1)(1)}{x^2}$ (c) $\frac{(x^2 + 5)\,[(10x + 3)(x - 9) + (5x^2 + 3x - 4)(1)] - (5x^2 + 3x - 4)(x - 9)(2x)}{(x^2 + 5)^2}$
Homework #11
Exercises.
1. (a) $\frac{[(2x + \sqrt{x})(x + 1)]\,[(6x)(2x - 1/x) + (3x^2 + 5)(2 + 1/x^2)] - [(3x^2 + 5)(2x - 1/x)]\,[(2 + (1/2) x^{-1/2})(x + 1) + (2x + \sqrt{x})(1)]}{[(2x + \sqrt{x})(x + 1)]^2}$
(b) $\frac{[(s + \sqrt{s})(s + 4)]\,[(1 - (1/3) s^{-2/3})(s^2 + 8) + (s - \sqrt[3]{s})(2s)] - [(s - \sqrt[3]{s})(s^2 + 8)]\,[(1 + (1/2) s^{-1/2})(s + 4) + (s + \sqrt{s})(1)]}{[(s + \sqrt{s})(s + 4)]^2}$
2. (a) $\frac{[(x + 2\sqrt{x})(x + 8)]\,[(12x^2 - 3)(x - 1/x) + (4x^3 - 3x + 1)(1 + 1/x^2)] - [(4x^3 - 3x + 1)(x - 1/x)]\,[(1 + x^{-1/2})(x + 8) + (x + 2\sqrt{x})(1)]}{[(x + 2\sqrt{x})(x + 8)]^2}$
(b) $\frac{[(u^2 + 7)(u^3 - 9)]\,[(1 + (3/2) u^{-1/2})(4u^2 + 7) + (u + 3\sqrt{u})(8u)] - [(u + 3\sqrt{u})(4u^2 + 7)]\,[(2u)(u^3 - 9) + (u^2 + 7)(3u^2)]}{[(u^2 + 7)(u^3 - 9)]^2}$
Homework #12
Exercises. √ 2
1. (a) − 23 √2−3
1
x
(b) 6 (5 z3 − 8 z)5 (15 z2 − 8) (c) 2 r 2 r − 1 + √2rr−1
2 4 2 2 2 −8t+1)3 [4 (6t 2 +9t+5)3 (12t+9)]
5 8
(θ 2 +1) (25 θ 4 )−(5 θ 5 ) (2 θ )
(d) (6t +9t+5) [3 (3t −8t+1) (6t−8)]−(3t
2
(6t +9t+5) 8 (e) 9 5θ
θ 2 +1 (θ 2 +1)2
√ 3
2. (a) − 53 1
(b) 7 (5 z4 + 8 z3 − 1)6 (20 z3 + 24 z2 ) (c) 3 r2 4 r − 1 + 2 √4rr−1
(3−5 x)2/3
3 3 4 2 )]−(3t 3 −4)5 [3 (4t+1)2 (4)]
7 2
(d) (4t+1) [5 (3t −4) (9t(4t+1) 6 (e) 8 θ +2
θ 2 +θ
(θ +θ ) (1)−(θ +2) (2 θ +1)
(θ 2 +θ )2
2 2 +1 x2 −1 x2 −1
3. 1
x2 +1
− 2 (x2x+1)2 = (x−x2 +1) 2 becomes − (x2 +1)2 = − (x2 +1)2
2
4. ( x21+1 + 2 (x2x+1)2 ) (x2 + 1)2 becomes 3 x2 + 1.
Homework #13
Exercises.
|x3 −x | (3 x2 −1) x (|x |/x)−|x | (1)
1. (a) (2 x) |x | + x2 (|x | /x) = 3 x |x | (b) x3 −x
(c) =0
√ √ |x |2
|x |
(|x | x2 +1) (0)−(1) [ x x2 +1+|x | 12 (x2 +1)−1/2 (2 x)]
(d) √
2 2
[|x | x +1]
|4 x3 −7 x2 | (12 x2 −14 x) |x | (1)−x (|x |/x)
2. (a) (5 x4 ) |x | + x5 (|x | /x) = 6 x4 |x | (b) 4 x3 −7 x2
(c) =0
√ √ |x |2
|x | 1 2 −1/2
(|x | x2 −1) (0)−(1) [ x 2
(d) √x −1+|x | 2 (x −1) (2 x)]
[|x | x2 −1]2
Homework #14
Exercises.
sec2 (|θ |) |θ |
1. (a) |tan(θ )| 2
tan(θ ) sec (θ ) (b) θ (c) 3 sin2 (θ ) cos(θ ) (d) 3t 2 sec(4t) + 4t 3 sec(4t) tan(4t)
(α 2 +1) (− sin(5 α) (5))−(cos(5 α)) (2 α)
(e) (α 2 +1)2
2. (a) − cos(cos(θ )) sin(θ ) (b) − sin(sin(θ )) cos(θ ) (c) −5 cot4 (x) csc2 (x)
(α 2 −1) (1 tan α+α sec2 α)−(α tan α) (2 α)
(d) 4t 3 sec(t 2 ) + 2t 5 sec(t 2 ) tan(t 2 ) (e) (α 2 −1)2
Homework #15
Exercises. 2 (1−Arctan(x))
1. (a) √ 2 x
(b) 2 1+(1+x 2 )2 (c) |Arcsin(5 x) |
Arcsin(5 x)
√ 5
(d) − sec 1+x2
1−4t 2 1−25 x2
(1−2 x2 ) ((1) Arcsec x+x √1 )−(x Arcsec x)(−4 x)
|x | x2 −1
(e) (1−2 x2 )2
5 2 |Arcsec(x2 −1) |
2. (a) √ (b) √ (c) Arcsec(x2 −1) 2 √2 x 2 2
2
−3−25t −20t 2
Arctan(4 x) (1+16 x ) |x −1 | (x −1) −1
1 1 −2 √ 1

(d) sec Arcsin x tan Arcsin x − (Arcsin x)
1−x2
(x2 +sin x) [ 1 2 Arcsin x+Arctan x √ 1 −(Arctan x Arcsin x) [2 x+cos x]
1+x 1−x2 ]
(e) (x2 +sin x)2
Homework #16
Exercises.
sec(x) tan(x) sin(x− 1x ) 1 1 Arcsec x
Arcsec x(1)−x ( |x |√1x2 −1 )
1. (a) 1+sec(x) (b) x + ln(x) cos(x − x ) (1 + x2
) (c) x (Arcsec x)2
|ln(sin(x)) | cos(x) (β +3)2 [(1) ln β +β (1/β )]−(β ln β ) [2 (β +3) (1)]
2. (a) ln(r) + 1 (b) ln(sin(x)) sin(x) (c) (β +3)4
1/7
ln(x)25 x18 (x8 −x)41

41 8 x7 −1
3. (a) (1+x2 )39 Arcsin29 (2 x) 25
× ( 7 x ln(x) + 7 x + 7 x8 −x − 78
1 18 1 x
7 1+x2 − 7
58 √ 1
) (b) sin(x)ln(x) × ( ln(sin(x))
x +
1−4 x2 Arcsin(2 x)
ln(x) cos(x)
sin(x) )
21 (x) x12 (x2 −2)43
1/5
4. (a) (2sin 2 13
x +5) Arctan (3 x) 79 × ( 21 cos(x) 12 1 86 x 78 x 237
5 sin(x) + 5 x + 5 x2 −2 − 5 2 x3 +5 − 5 (1+9 x2 ) Arctan(3 x) )
1
(x) 2
(b) sin(x)cos(x) (− sin(x) ln(sin(x)) + cos
sin(x) )
6. x4 sin(x) ( 4x + cos(x) 3 4
sin(x) ) becomes, when multiplied out, 4 x sin(x) + x cos(x).
Homework #17
Exercises.
1 (cosh(x2 )) [(1) sec x+x sec x tan x]−(x sec x) [sinh(x2 ) (2 x)]
1. (a) cosh(ex ) ex (b)
(1−x2 ) Arctanh(x)
(c) cosh2 (x2 )
2. (a) sinh(sin(ex )) cos(ex ) ex (b) 4t 3 4
sinh(t) + t cosh(t)
(sin(ln x)) [e2 x (2) Arccosh(3 x)+e2 x √ 3 ]−(e2 x Arccosh(3 x)) [cos(ln x) (1/x)]
(3 x)2 −1
(c) sin2 (ln x)
1 x2 −1 x2 −1 x
3. 2 x , x2 +1 , 2 x2 +1
( 1 + 1+x ) (1−x)
1 1−x (1−x)2
4. 2 1+x becomes (x+1)1(x−1) .
Homework #18
Exercises.
2 3 3 2
1. (a) dy =
(15 x sin (x) + 15 x sin (x) cos(x)) dx
(tan3 x)[5 x4 ln x+x5 (1/x)]−(x5 ln x)[3 tan2 x sec2 x]
(b) dw = tan6 x
dx
2 x x x

2. (a) dy = (10 x − 2) dx (b) dz = (x ) [(e ) (ln x)+(e x)4(1/x)]−(e ln x) [2 x]
dx
|t 3 −3t | (3t 2 −3) dt
(z3 ) (sec2 z−1)−(tan z−z) (3 z2 )

3. (a) (cos(u) − u sin(u)) du (b) z6
dz (c) t 3 −3t
(1+z2 ) [−2 sin z−1]−(2 cos z−z) [2 z]

|7t−8 | dt u
4. (a) 7 7t−8 (b) (Arctan(u) + 1+u2 ) du (c) (1+z2 )2
dz
5. dx = cos(θ ) dr − r sin(θ ) dθ
Homework #19
Exercises.
1. (a) $\frac{b \cos(t)}{a (-\sin(t))}$ (b) $\frac{21t^6 + 24t^2 - 5}{8t^3 - 15t^2 + 2}$
2. (a) $\frac{\cosh(t)}{\sinh(t)}$ (b) $\frac{45t^8 - 40t^4 + 4t}{10t^4 + 9t^2}$
Homework #20
Exercises.
1. (a) $6x \ln(x) + 5x$ (b) $e^x x^{-1} - 2 e^x x^{-2} + 2 e^x x^{-3}$ (c) $\frac{(1 + \sin^2 x)\,[-\sin x] - (\cos x)\,[2 \sin x \cos x]}{(1 + \sin^2 x)^2}$
2. (a) 12 x2 sec(x) + 8 x3 sec(x) tan(x) + x4 sec(x) tan2 (x) + x4 sec3 (x)
x x 2 x x x x x x 2
(b) ( sin x esin−e2 x cos x ) − ( sin x (e cos x+e (−(sin
sin x))−e cos x (2 sin x cos x)
2 x)2
2e
= sin 2 e cos x 2 e cos x
x − sin2 x + sin3 x
(c) (−1) (x 1 − (ln x)2 )−2 [(1) 1 − (ln x)2 + (x) (1/2) (1 − (ln x)2 )−1/2 (−2 ln x (1/x))]
p p
3. − R sin13 (t)
4. − a2 sinb 3 (t)
5. 6 ex + 18 x ex + 9 x2 ex + x3 ex
A.3 Chapter 2.
Homework #21
Exercises.
1. (a) 17 (b) Does not exist (c) 0 (d) 1 (e) 0

2. (a) 54 (b) Does not exist (c) 0 (d) 32 (e) ∞
3. Answers will vary. Up to 3 points credit given.
Homework #22
Exercises.
1. (a) $19,000 (b) $23,965.58 (c) $24,432.20 (d) $24,540.94 (e) $24,596.03
2. (a) $19,000 (b) $23,673.64 (c) $24,351.89 (d) $24,513.57 (e) $24,596.03
3. (a) 3.867% (b) 7.573%
4. (a) 5.406% (b) 9.636%
Homework #23
Exercises.
1. (a) The critical points are s = −2, 6, which are a max and a min, respectively. (b) The only critical point is x = 2, which
is neither a max nor a min.
2. (a) x = 3 is the only critical point, and it is neither a max nor a min. (b) The critical points are x = 3, 7, which are a
max and a min, respectively.
3. (a) Falling for −3 < x < −4/3, rising for −4/3 < x < 0. The global min (−31/3) is at x = −4/3, and the global max
(−2) is at x = −3. (b) Falling for 5 < x < 8 and rising for 8 < x < 10. The global min (16) is at x = 8, and the global
max (25) is at x = 5. (c) Falling for 0 < x < 1/e, rising for 1/e < x < 3. The global min (−1/e = −0.368) is at x = 1/e,
and the global max (3 ln 3 = 3.30) is at x = 3.
4. (a) Falling for −1 < x < 7/6, rising for 7/6 < x < 3. The global min (−25/12) is at x = 7/6, and the global max (12)
is at x = −1. (b) Falling for 4 < x < 6, rising for 6 < x < 10. The global min (12) is at x = 6, and the global max (16) is
at x = 4. (c) Rising for 0 < x < 1 and falling for 1 < x < 5. The global min (0) is at x = 0, and the global max (1/e) is at
x = 1.
Homework #24
Exercises.
1. (a) $R = 25x^{1/5}$ (b) $dR/dx = 5x^{-4/5} > 0$ (c) $R = 25x^{-1/5}$, $dR/dx = -5x^{-6/5} < 0$ (d) When the exponent on p is
greater than −1, the exponent on R is positive, so when the exponent multiplies the coefficient in taking the derivative, the
answer is positive. If the exponent on p is less than −1, the exponent on R is negative, so when the exponent multiplies the
coefficient in taking the derivative, the answer is negative.
2. (a) $R = 18x^{1/3}$ (b) $dR/dx = 6x^{-2/3} > 0$ (c) $R = 18x^{-1/3}$, $dR/dx = -6x^{-4/3} < 0$
3. (a) The price changes by (2%)/(−1.3) = −1.54%, so the price goes down by 1.54%. (b) The price changes by
(−3%)/(−1.3) = 2.31%, so the price goes up by 2.31%.
4. (a) The price changes by (1%)/(−0.9) = −1.11%, so the price goes down by 1.11%. (b) The price changes by
(−2%)/(−0.9) = +2.22%, so the price goes up by 2.22%.
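The arithmetic in Exercises 3 and 4 is one relation rearranged. The line below is only a sketch: it assumes the elasticity E is the percent change in demand divided by the percent change in price (the exercise statements are not reproduced here), and the symbols E, q, and p are labels chosen for this note.

```latex
E = \frac{\%\Delta q}{\%\Delta p}
\quad\Longrightarrow\quad
\%\Delta p = \frac{\%\Delta q}{E},
\qquad \text{for example} \quad \frac{-3\%}{-1.3} \approx 2.31\%.
```

With a negative elasticity, demand and price move in opposite directions, which is why a drop in demand goes with a rise in price.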
A.4 Chapter 3.
Homework #25
Exercises.
1. (a) Three-dimensional function, so it takes four dimensions to graph it. (b) Five-dimensional function, so it takes six
dimensions to graph it.
2. (a) Four-dimensional function, so it takes five dimensions to graph it. (b) Eight-dimensional function, so it takes nine
dimensions to graph it.
3. (a) Partial with respect to x is $6x^2 - 10xy$; partial with respect to y is $-5x^2 - 6y^5$. (b) Partial with respect to x is
$y^3(3x - y^2) + 3xy^3$; partial with respect to y is $3xy^2(3x - y^2) - 2xy^4$. (c) Partial with respect to x is
$\mathrm{Arcsin}\left(\frac{x}{y}\right) + \dfrac{x}{y\sqrt{1 - \frac{x^2}{y^2}}}$; partial with respect to y is $-\dfrac{x^2}{y^2\sqrt{1 - \frac{x^2}{y^2}}}$.
4. (a) Partial with respect to x is $14x + 27x^2y^2$; partial with respect to y is $18x^3y - 10y^4$. (b) Partial with respect to x
is $\dfrac{(4x - 5y^3)[e^y] - (xe^y)[4]}{(4x - 5y^3)^2}$; partial with respect to y is $\dfrac{(4x - 5y^3)[xe^y] - (xe^y)[-15y^2]}{(4x - 5y^3)^2}$.
(c) Partial with respect to x is $y^2\sec\left(\frac{x}{y}\right)\tan\left(\frac{x}{y}\right)$;
partial with respect to y is $3y^2\sec\left(\frac{x}{y}\right) + y^3\sec\left(\frac{x}{y}\right)\tan\left(\frac{x}{y}\right)(-x/y^2)$.
5. (a) Partial with respect to ρ is $\ln(\mathrm{Arctan}(\theta))$; partial with respect to θ is $\dfrac{\rho}{(1+\theta^2)\,\mathrm{Arctan}(\theta)}$.
(b) Partial with respect to p is V; partial with respect to V is p; partial with respect to T is −n r.
(c) Partial with respect to x is $2xy(xy - z^2)^4 + 4x^2y^2(xy - z^2)^3$; partial with respect to y is $x^2(xy - z^2)^4 + 4x^3y(xy - z^2)^3$;
partial with respect to z is $-8x^2y(xy - z^2)^3 z$.
6. (a) Partial with respect to ρ is $\dfrac{\sin(\theta)}{\rho\,(1+\ln(\rho)^2)}$; partial with respect to θ is $\mathrm{Arctan}(\ln(\rho))\cos(\theta)$.
(b) Partial with respect to $v_1$ is $\dfrac{(1+(v_1v_2)/c^2)[1] - (v_1+v_2)[v_2/c^2]}{(1+(v_1v_2)/c^2)^2}$; partial with respect to $v_2$ is
$\dfrac{(1+(v_1v_2)/c^2)[1] - (v_1+v_2)[v_1/c^2]}{(1+(v_1v_2)/c^2)^2}$.
(c) Partial with respect to x is $-\sin\left(\frac{xz^2}{y+z}\right) z^2/(y+z)$; partial with respect to y is
$-\sin\left(\frac{xz^2}{y+z}\right) xz^2(-1)(y+z)^{-2}(1)$; partial with respect to z is $-\sin\left(\frac{xz^2}{y+z}\right)\left(\dfrac{(y+z)[2xz] - (xz^2)[1]}{(y+z)^2}\right)$.
Homework #26
There are no exercises in this homework set.
Homework #27
Exercises.
1. (a) $f_{xx} = 12x^2y^5$, $f_{xy} = 20x^3y^4$, $f_{yy} = 20x^4y^3$ (b) $f_{xx} = -\frac{y}{x^2}$, $f_{xy} = \frac{1}{x}$, $f_{yy} = 0$
(c) $f_{xx} = -2\,\frac{xy^3}{(1+x^2y^2)^2}$, $f_{xy} = -\frac{-1+x^2y^2}{(1+x^2y^2)^2}$, $f_{yy} = -2\,\frac{x^3y}{(1+x^2y^2)^2}$
2. (a) $f_{xx} = 2y^5$, $f_{xy} = 10xy^4$, $f_{yy} = 20x^2y^3$ (b) $f_{xx} = -2\,\frac{yx}{(1+x^2)^2}$, $f_{xy} = \frac{1}{1+x^2}$, $f_{yy} = 0$
(c) $f_{xx} = -\frac{1}{x^2}$, $f_{xy} = 0$, $f_{yy} = -\frac{1}{y^2}$
Homework #28
Exercises.
1. f (x, y) = y − 2/x has f (x, y) = 0 as a level set, which is the same as y = 2/x. Another option is f (x, y) = x y, which has
x y = 2 as a level set, which is the same as y = 2/x.
2. f (x, y) = y − x has y − x = 5 as a level set, which is the same as y = x + 5. Another function is f (x, y) = y/(x + 5) which
has y/(x + 5) = 1 as a level set, also the same as y = x + 5.
Homework #29
Exercises.
1. Implicit. The variable z on the left also occurs on the right.
2. Explicit. The variable w occurs only once in the equation, and it is by itself on the left side.
3. (a) −2 xy (b) $-\dfrac{18x^5y^4 - 14xy^5}{12x^6y^3 - 35x^2y^4}$ (c) $-\dfrac{-4x^3y^3 - 24x^5y^7}{1 - 3x^4y^2 - 28x^6y^6}$
cos( xy )
2
y2 e(x y ) −3 x2 y4 y sin( xy ) y cos( xy ) x sin( xy ) x
4. (a) − 2 (b) − + x2
− y2
− x (c) 2 (x2 +y2 ) (1−2 y
)
2 x y e(x y ) −4 x3 y3 x2 +y2
2 2
5. − x y+y
3
6. Varies with the student.
7. $-\dfrac{(12x^2y^2 - 35xy^6)\left[40x^3 + 8y^3 + 24xy^2\left(-\frac{10x^4 + 8xy^3 - 5y^7}{12x^2y^2 - 35xy^6}\right) - 35y^6\left(-\frac{10x^4 + 8xy^3 - 5y^7}{12x^2y^2 - 35xy^6}\right)\right]}{(12x^2y^2 - 35xy^6)^2}$
$+\dfrac{(10x^4 + 8xy^3 - 5y^7)\left[\left(24xy^2 + 24x^2y\left(-\frac{10x^4 + 8xy^3 - 5y^7}{12x^2y^2 - 35xy^6}\right)\right) - 35y^6 - 210xy^5\left(-\frac{10x^4 + 8xy^3 - 5y^7}{12x^2y^2 - 35xy^6}\right)\right]}{(12x^2y^2 - 35xy^6)^2}$
8. $-\dfrac{(e^{3x} + x^3\cos(y^2)\,2y)\left[\left(9e^{3x}y + 3e^{3x}\left(-\frac{3e^{3x}y + 3x^2\sin(y^2)}{e^{3x} + x^3\cos(y^2)\,2y}\right)\right) + \left(6x\sin(y^2) + 3x^2\left(\cos(y^2)\,2y\left(-\frac{3e^{3x}y + 3x^2\sin(y^2)}{e^{3x} + x^3\cos(y^2)\,2y}\right)\right)\right)\right]}{(e^{3x} + x^3\cos(y^2)\,2y)^2}$
$+\dfrac{(3e^{3x}y + 3x^2\sin(y^2))\left[3e^{3x} + \left(3x^2\cos(y^2)\,2y\right) + \left(x^3\left(-\sin(y^2)\right)(2y)^2\left(-\frac{3e^{3x}y + 3x^2\sin(y^2)}{e^{3x} + x^3\cos(y^2)\,2y}\right)\right) + \left(x^3\cos(y^2)\,2\left(-\frac{3e^{3x}y + 3x^2\sin(y^2)}{e^{3x} + x^3\cos(y^2)\,2y}\right)\right)\right]}{(e^{3x} + x^3\cos(y^2)\,2y)^2}$
xy 2
z e −3 x sin y z/(x+y z)+7/(1+(y+x z) )
9. − 9 18
x z2 ex y −3 x z cos y
10. − 1/(x+y z)+7 z/(1+(y+x z)2 )
Homework #30
A.5 Chapter 4.
Homework #31
Exercises.
1. (a) 9 (b) 109 (c) 15/8 = 1.875 (d) 90 (e) $x^6 + x^7 + x^8 + x^9 + x^{10}$ (f) 6x + 7x + 8x + 9x + 10x
2. (a) −3 (b) 157 (c) 63 (d) 32 (e) $x^7 + x^8 + x^9 + x^{10} + x^{11} + x^{12}$ (f) 7x + 8x + 9x + 10x + 11x + 12x
Homework #32
Exercises.
1. (a) 1829.911722 (b) $0.9293358726 \times 10^{33}$ (c) $0.4646679363 \times 10^{34}$ pounds (d) No. The answer to the previous part
indicates that the rabbits will weigh 352 million times the weight of the earth.
2. (a) 551.1588190 (b) $0.3508367956 \times 10^{23}$ (c) $0.1754183978 \times 10^{24}$ pounds (d) The answer, while less than the weight
of the earth, is still unrealistic. The rabbits won’t weigh more than 1% of the total weight of the earth.
Homework #33
Homework #34
Exercises.
1. (a) $y(x) = x^4 - 5$ (b) $w(r) = r^3 - 3$ (c) $x(t) = \sin(t) + 1$ (d) $y(x) = -8$
2. (a) $y(x) = \frac{7}{9}x^9 - 6\frac{7}{9}$ (b) $w(r) = \frac{1}{r} + \frac{3}{2}$ (c) $x(t) = -\cos(t) + 5$ (d) $y(x) = -5$
3. (a) $(5/3)\,x^3 + C$ (b) $3u^4 + C$ (c) $-w^{-1} + C$ (d) $(2/3)\,t^{3/2} + C$
4. (a) $4x^5 + C$ (b) $(7/3)\,u^3 + C$ (c) $-4w^{-2} + C$ (d) $(3/4)\,t^{4/3} + C$
Homework #35
Exercises.
1. (a) The units of velocity are feet per second. (b) The units of acceleration (a or g) are feet per second per second, or feet
per second squared. (c) The units for v are feet per second. The units for $-g\,t$ are (feet per second per second)*(second)
= feet per second. The units for $v_0$ are feet per second. They all have the same units. (d) The units for y are feet. The
units of $-(1/2)\,g\,t^2$ are (feet per second per second)*(second)² = feet. The units of $v_0 t$ are (feet per second)*(second)
= feet. The units of $y_0$ are feet. They are all the same.
Homework #36
Homework #37
Exercises.
1. (a) 12 (b) 4 (c) −4
2. (a) 35 (b) 2 (c) −21/2
Homework #38
Exercises.
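1. (a) $\frac{3}{4}x^4 - \frac{5}{3}x^3 + \frac{5}{2}x^2 - 3x + C$ (b) $-\frac{1}{8}\cos(8x) + C$ (c) $-\frac{5}{x} - 6x^{2/3} + C$ (d) $\frac{2}{9}(3t + 8)^{3/2} + C$ (e) $-\ln(\cos(r)) + C$
(f) $2e^{\sqrt{w}} + C$ (g) $\frac{1}{2}x^2 - \frac{1}{x} + C$ (h) $\frac{1}{4}\ln(x^4 - 2x^2) + C$
2. (a) $\frac{3}{4}x^4 - \frac{4}{3}x^3 - \frac{3}{2}x^2 + 6x + C$ (b) $\frac{1}{3}\sin(3x) + C$ (c) $-\frac{4}{5}\,\frac{1}{x^5} + \frac{15}{4}x^{4/5} + C$ (d) $\frac{1}{36}(24t - 7)^{3/2} + C$ (e) $\ln(\sin r) + C$
(f) $-2\cos(\sqrt{s}) + C$ (g) $-\frac{1}{x} - \frac{1}{2}\,\frac{1}{x^2} + C$ (h) $\frac{1}{4}\ln(x^4 + 4x^2) + C$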
Homework #39
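Exercises.
1. (a) $\frac{8}{3}$ (b) $-\frac{1}{8}$ (c) $\frac{22}{9}\sqrt{11} - \frac{10}{9}\sqrt{5} \approx 5.622785066$ (d) $-\ln(\cos(1)) \approx 0.6156264703$ (e) $\ln(3) - \ln(2) \approx 0.4054651084$
2. (a) $\frac{47}{12}$ (b) $-\frac{2}{3}$ (c) $\frac{41}{36}\sqrt{41} - \frac{17}{36}\sqrt{17} \approx 5.345424947$ (d) $-2\cos(1) + 2 \approx 0.919395388$ (e) $\frac{1}{2}\ln(3) \approx 0.5493061445$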
Homework #40
Exercises.
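1. (a) $\frac{1}{70}\,\frac{1}{x+2} - \frac{1}{70}\,\frac{1}{x-3} - \frac{1}{126}\,\frac{1}{x+4} + \frac{1}{126}\,\frac{1}{x-5}$
(b) $\frac{23}{18}\,\frac{1}{x-3} - \frac{1}{18}\,\frac{-3+23x}{x^2+9}$
(c) $\frac{23}{18}\,\frac{1}{(x-3)^2} - \frac{11}{54}\,\frac{1}{x-3} + \frac{1}{54}\,\frac{-36+11x}{x^2+9}$
(d) $\frac{23}{324}\,\frac{1}{x-3} - \frac{23}{324}\,\frac{x+3}{x^2+9} - \frac{1}{18}\,\frac{-3+23x}{(x^2+9)^2}$
(e) $\frac{23}{324}\,\frac{1}{(x-3)^2} - \frac{17}{486}\,\frac{1}{x-3} + \frac{1}{972}\,\frac{33+34x}{x^2+9} + \frac{1}{54}\,\frac{-36+11x}{(x^2+9)^2}$
2. (a) $\frac{1}{36}\,\frac{1}{1+x} - \frac{1}{18}\,\frac{1}{x-2} + \frac{1}{28}\,\frac{1}{x-3} - \frac{1}{126}\,\frac{1}{x+4}$
(b) $\frac{37}{18}\,\frac{1}{x-3} - \frac{1}{18}\,\frac{-87+37x}{x^2+9}$
(c) $\frac{37}{18}\,\frac{1}{(x-3)^2} - \frac{2}{27}\,\frac{1}{x-3} + \frac{1}{54}\,\frac{-99+4x}{x^2+9}$
(d) $\frac{37}{324}\,\frac{1}{x-3} - \frac{37}{324}\,\frac{x+3}{x^2+9} - \frac{1}{18}\,\frac{-87+37x}{(x^2+9)^2}$
(e) $\frac{37}{324}\,\frac{1}{(x-3)^2} - \frac{41}{972}\,\frac{1}{x-3} + \frac{1}{972}\,\frac{12+41x}{x^2+9} + \frac{1}{54}\,\frac{-99+4x}{(x^2+9)^2}$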
3. (a) The set-up is $A/x + B/x^2 + C/(x-6) + D/(x-6)^2 + E/(x-6)^3 + F/(x-6)^4 + G/(x+1) + (Hx+I)/(x^2+4) + (Jx+K)/(x^2+12) + (Lx+M)/(x^2+12)^2$.
(b) The degree of the denominator (bottom) is 13, so the degree of the numerator (top) needs to be less than 13.
4. (a) 9 (b) 10 ln(9) ≈ 21.97224577 (c) $\dfrac{M}{1+e^{(-k(t-t_0))}}$
(d) $\dfrac{500}{1+e^{(-1.0+1.\,\ln(9))}} \approx 115.9846584$; 135.9140914 (e) $\dfrac{500}{1+e^{(-5.0+1.\,\ln(9))}} \approx 471.4128093$; 500; They should be close, since P(t)
approaches M as t goes to infinity. (f) 7420.657955; This is nowhere near correct, since the exponential growth is not
only unbounded, it grows very rapidly.
5. (a) M=363,447,466 for the 1960–1970–1980 data and 913,434,253 for the 1960–1980–2000 data (b) The maximums
are so different because the dramatic increase in population in 2000 meant that we weren’t close to leveling out. See the
homework problem in this section for more details. (c) For the 1960–1970–1980 data, k = 0.0265 and for the 1960–1980–
2000 data, k = 0.0150, as opposed to Pearl and Reed’s k = 0.03134. So, the specific growth rates are getting smaller.
6. 285,517,942 for the 1960–1970–1980 data, and 311,453,851 for the 1960–1980–2000 data. The difference between
them is 8.3%, which means that a population in between might not do a good job of saying which one is better. However,
if the real data is much closer to one of those two numbers than the other, that would be a fairly good reason to support that
model over the other.
7. (a) $\frac{3}{x+2} - \frac{2}{2x-3}$ (b) $\frac{4}{x-3} + \frac{1}{3x+2}$ (c) $\frac{5}{2x-1} + \frac{1}{x-2} - \frac{2}{(x-2)^2}$ (d) $\frac{2x-9}{x^2+1} - \frac{1}{x-4}$
8. (a) $3\ln(x+2) - \ln(2x-3) + C$ (b) $4\ln(x-3) + (1/3)\ln(3x+2) + C$ (c) $(5/2)\ln(2x-1) + \ln(x-2) + \frac{2}{x-2} + C$
(d) $\ln(x^2+1) - 9\,\mathrm{Arctan}\,x - \ln(x-4) + C$
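The logistic numbers in Exercise 4 of this homework can be rechecked in Maple. This is only a sketch: the values M = 500, k = 1, t0 = ln(9)/k, and the starting population of 50 are read off the printed answers rather than the exercise statement, and the names P and Q are made up for this check.

```maple
# Assumed values, inferred from the printed answers to Exercise 4.
M := 500:  k := 1.0:  t0 := ln(9.0)/k:
# Logistic model in the form given in part (c).
P := t -> M/(1 + exp(-k*(t - t0))):
# Unrestricted exponential growth with the same start (50) and rate, for comparison.
Q := t -> 50*exp(k*t):
P(1.0);  Q(1.0);    # compare with part (d)
P(5.0);  Q(5.0);    # compare with parts (e) and (f)
```

Because the exponential in the denominator of P shrinks toward 0 as t grows, P(t) levels off at M while Q(t) keeps growing.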
Homework #41
Exercises.
1. (a) $-\frac{1}{4}x^2\cos(4x) + \frac{1}{32}\cos(4x) + \frac{1}{8}x\sin(4x) + C$ (b) $\frac{1}{9}x^9\ln(7x) - \frac{1}{81}x^9 + C$
(c) $x\,\mathrm{Arcsin}^2(x) + 2\,\mathrm{Arcsin}(x)\sqrt{1-x^2} - 2x + C$ (d) $-\frac{1}{3}\cos(x^3) + C$ (e) $-x^2\cos(x) + 2x\sin(x) + 2\cos(x) + C$
(f) $-e^{\cos x} + C$ (g) $\left(\frac{2}{21}x^9 + \frac{2}{105}x^6 - \frac{8}{315}x^3 + \frac{16}{315}\right)\sqrt{x^3+1} + C$.
2. (a) $\frac{1}{3}x^2\sin(3x) - \frac{2}{27}\sin(3x) + \frac{2}{9}x\cos(3x) + C$ (b) $\frac{1}{6}x^6\ln(6x) - \frac{1}{36}x^6 + C$ (c) $x\,\mathrm{Arcsec}(x) - \ln(x + \sqrt{x^2-1}) + C$
(d) $\frac{1}{3}\sin(x^3) + C$ (e) $x^2\sin(x) + 2x\cos(x) - 2\sin(x) + C$ (f) $e^{\sin(x)} + C$ (g) $\left(\frac{1}{10}x^4 - \frac{1}{15}\right)\sqrt{x^4+1} + C$
3. (a) Tabular integration or integration by parts. (b) Substitution $u = x^3$ first. (Then integration by parts.) (c) Partial
fraction expansion. Or in this case, substitution $u = x^3 - 3x$ also works. (d) Integration by parts, $u = (\ln(x))^6$, $dv = dx$.
(e) Partial fractions, $A/(x+3) + B/(x+3)^2 + C/(x+3)^3 + D/(x+3)^4 + E/(x+3)^5 + (Fx+G)/(x^2+1) + (Hx+I)/(x^2+1)^2 + (Jx+K)/(x^2+1)^3 + (Lx+M)/(x^2+1)^4$.
4. (a) Tabular integration or integration by parts. (b) Substitution ($u = x^3$) (and then integration by parts). (c) Partial
fractions, or in this case, substitution also works ($u = x^2 + 2x$). (d) Integration by parts ($u = (\ln(x))^8$, $dv = dx$). (e)
Partial fractions, $A/(x-3) + B/(x-3)^2 + C/(x-3)^3 + D/(x-3)^4 + (Ex+F)/(x^2+81) + (Gx+H)/(x^2+81)^2 + (Ix+J)/(x^2+81)^3$.
5. You differentiate the result and see if you come back to the original integrand:
diff( x*exp(x)-exp(x)+C,x); which gives $x e^x$.
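Done by hand, the same check is one use of the product rule. The line below is a worked derivation, not Maple output:

```latex
\frac{d}{dx}\left(x e^{x} - e^{x} + C\right)
  = \left(1 \cdot e^{x} + x \cdot e^{x}\right) - e^{x} + 0
  = x e^{x}
```

The two $e^{x}$ terms cancel, which leaves exactly the integrand.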
Homework #42
Exercises.
1. Maple gives $-\frac{1}{35}\sin^4(7x)\cos(7x) - \frac{4}{105}\sin^2(7x)\cos(7x) - \frac{8}{105}\cos(7x) + C$
2. Maple gives $-\frac{1}{30}\sin^5(5x)\cos(5x) - \frac{1}{24}\sin^3(5x)\cos(5x) - \frac{1}{16}\cos(5x)\sin(5x) + \frac{5}{16}x + C$
3. Maple realized that $\sin(ax)^{(n-2)}$ equals $\sin(ax)^n$ divided by $\sin(ax)^2$. But then it split apart the integrals of the two:
$\int \sin(ax)^{(n-2)}\,dx = \int \sin(ax)^n\,dx \int \left(1/\sin(ax)^2\right) dx$. You can only pull constants out like that.
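A small example shows why the split is illegal. The two functions here (x and x) are chosen only for illustration, and the constants of integration are set to zero:

```latex
\int x \cdot x \, dx = \frac{x^{3}}{3},
\qquad\text{but}\qquad
\left(\int x \, dx\right)\left(\int x \, dx\right)
  = \frac{x^{2}}{2} \cdot \frac{x^{2}}{2} = \frac{x^{4}}{4}.
```

The two sides differ, so the integral of a product is not the product of the integrals.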
Homework #43
Exercises.
1. (a) Rightsum = 0.3863568043, Leftsum = 0.3078169880, Middlesum = 0.3463172133 (b) Rightsum = 1.805627583,
Leftsum = 1.633799400, Middlesum = 1.717566087 (c) Rightsum = 0.5655963332, Leftsum = 0.5155963332,
Middlesum = 0.5500354202
2. (a) Rightsum = 0.8390539012, Leftsum = 0.8914137786, Middlesum = 0.8664212399 (b) Rightsum = 7.049244379,
Leftsum = 5.771433159, Middlesum = 6.378420082 (c) Rightsum = 0.5826548602, Leftsum = 0.5326548602,
Middlesum = 0.5674084923
Homework #44
Exercises.
1. (a) 0.3470868961 (b) 1.719713491 (c) 0.5405963332
2. (a) 0.8652338398 (b) 6.410338769 (c) 0.5576548602
Homework #45
Exercises.
1. (a) 0.3465764762 (b) 1.718282782 (c) 0.5452337739
2. (a) 0.8660259831 (b) 6.389112621 (c) 0.5625063505
Homework #46
Homework #47
Exercises.
1. 18
2. 8
3. 8/3 with graph:
4. 34 with graph:
5. 52
6. 1/20
Homework #48
Exercises.
1. (a) 1; 1 (b) 240; 624
2. (a) 4; 4 (b) 40; 104
Homework #49
Exercises.
1. (a) $\sqrt{x^2+1}$ (b) $x^3 + \frac{1}{4}\,\frac{1}{x^3}$ (c) $\frac{1}{2}\sqrt{9x+4}$
2. $2\pi R$
Homework #50
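Exercises.
1. $2\pi\left(\frac{65}{12}\sqrt{65} - \frac{1}{12}\right) \approx 273.8666398$
2. $2\pi\left(\frac{129}{8}\sqrt{65} + \frac{1}{64}\ln(-8 + \sqrt{65})\right) \approx 816.5660538$
3. $2\pi\int_0^{\pi} (x\sin(x) + 2)\sqrt{(\sin(x) + x\cos(x))^2 + 1}\;dx \approx 93.30454438$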
Homework #51
Exercises.
1. $0.1022864769 \times 10^{8}$ tons
A.6 Chapter 5.
Homework #52
Exercises.
1. (a) There is a relative min at the point (−3, −2), with value of −5. (b) There is a saddle point at (−3, 2), with value
11. (c) There is a relative min at (3, −2), with value −5. (d) There is a saddle point at (0, 0) with value 0, and a relative
min at (2, 2) with value −4.
2. (a) There is a relative min at (−1, −2), with value 1. (b) There is a saddle point at (−1, 2), with value 17. (c) There is
a relative min at (1, −2) with value 1. (d) There are saddle points at (1/2, 1/4) with value 1/2 and at (−1/2, −1/4) with
value −1/2.
3. (a) $c = \frac{1}{2}\,\frac{-A + \sqrt{A^2 + 4Bt}}{B},\ \frac{1}{2}\,\frac{-A - \sqrt{A^2 + 4Bt}}{B}$ (b) You want the positive square root, because c should be a positive number.
(c) $c = -1834.446263 + 2534.659463\,\sqrt{0.5238063398 + 0.0007890606332\,t}$
4. Since you can tell how much (at a minimum) recording time there is on a cassette tape, once you know how much time
has elapsed, you can tell the minimum amount of time left.
Homework #53
Exercises.
1. −57
2. 3/2
Homework #54
Exercises.
1. 3.832494443 minutes, or 229.9496666 seconds. The times listed for 1975 are 231.0 and 229.4, which are fairly close.
2. 3.798141443 minutes, or 227.8884866 seconds. The times listed for 1981 are 228.5, 228.4 and 227.3, again reasonably
close.
Index
Absolute values
  derivatives, 68–69
  integrating, 254
Albert, 3
Arc length, 258
Area between curves, 255
Average price, 145

Balancing differentials, 93
Ballistic motion, 218–219

Chain rule, 61–65
  multi-variable, 158–165
Change of variables, 222
Composition, see Functions, composition
Concave down, 100
Concave up, 100
Continuous at a point, 50
Continuous compounding, 86, 120–133
Continuous function, 50
Critical point
  two variables
    classifying, 279
    finding, 278
Critical points, 137
Curvature, 100

Definite integrals, 201, 229–230
  approximating, 247–251
  limits on, 220
  substitution, 229
Degrees, 70
∆-notation, 28
Demand marginal, 144
Dependent variable, see Functions, formulas, dependent variable
Derivatives, 29–104
  chain rule, 61–65
  constrained partial, 181–186
  dependent variable, 37
  higher-order, 99–103
    partial, 165–167
  implicit, 172–179
    higher-order, 173–179
  independent variable, 37
  partial, 153–186
    mixed, 166
  product rule, 55
    three functions, 55–56
  quotient rule, 56, 67
  trig functions, 70–73
Difference quotient, 28
Differentials, 42, 90–93
Distance, net and total, 257
Domain, see Functions, domain
Dudley, 1

e, 22
Elastic, 145
Elasticity, 145–147
Elementary functions, 68
ε-δ argument, 51
Error, 45
Exponential functions, see Functions, exponential
  applications, 83–86
  derivatives, 83–86
  notation, 83
Exponential growth, 131–132
Exponents, laws of, 21
Extrapolation, 288

Finance, 119–149
  compound interest, 122
  continuous compounding, 123–133
  inventory control, 135–141
  notation, 120
  simple interest, 121
Function
  dimension of, 152
  explicit, 95
  implicit, 95
  inverse
    derivative, 67
Functions
  absolute values
    derivatives, 68–69
  composition, 13–14, 62
  defined, 5–7
  domain, 7
  exponential, 21–22, 131
    graphs, 22
  formulas, 7
    dependent variable, 8
    independent variable, 8
  gnomes, 5–6
  graphs, 7
  green boxes, 5
    differentiating, 38–39
  implicit, 168–179
  inverse, 9–12
  inverse trig
    defined, 74, 75
  logarithmic, 22–24
  multi-variable, 152–153
  one-to-one, 10
  ordered pairs, 6
  proper lists, 6
  range, 8
  trig, 15–17, 70
    inverse, 18–21

Global extrema, 138
Gnomes, see Functions, gnomes
Graphs
  dimension of, 153
Green boxes, see Functions, green boxes

Half life, 84
Higher-order derivatives, 99
Hydrostatic force, 265
Hyperbolic trig functions
  defined, 86–87
  derivatives, 86–87

Indefinite integrals, 221–229
  checking, 224
Independent variable, see Functions, formulas, independent variable
Indeterminate forms, 123–128
Inelastic, 145
Initial value problem, 213
Integral tables, 245
Integrals
  absolute values in, 254
  applications, traditional, 252
  exact methods, 221
    partial fractions, 231
      Maple, 232
    substitution, 222–225
      on Maple, 224, 230
Integration, 197–267
Integration by parts, 238–242
  multiple, 240
  solving for the integral, 241
  tabular, 240
Interpolation, 31, 43, 288
Inverse functions, see Functions, inverse
Inverse hyperbolic trig functions
  defined, 87
  derivatives, 87–89
Inverse trig functions, see Functions, trig, inverse
  defined, 74, 75
  derivatives, 74–76

L’Hôpital’s rule, 125–128
Least Squares analysis, 276
Level sets, 168–171
Limits, 49–52, 123–128
Linear regression, 287–290
  equations, 288
  explained, 288
Local extrema, 138
Logarithmic differentiation, 79–81
Logarithmic functions, see Functions, logarithmic
  derivatives, 77–78
Logarithms, laws of, 23
Logistic equation, 232

Maple
  difference quotients on, 30–31
  integration by parts, 239
  limits, 52, 128
  partial fractions, 232
  substitution in integrals, 224, 230
Marginals, 144
Mugsy, 2

Natural logarithms, 24
Newton’s law of cooling, 84

One-to-one functions, see Functions, one-to-one
Ordered pairs, see Functions, ordered pairs
Orders of growth, 132

Parametric equations, 94–97
  defined, 95
Partial fractions, 231
Population growth, 86
Product rule, 55
  three functions, 55–56

Quotient rule, 56, 67

Radians, 70, 71
Radioactive decay, 84–86
Radiometric dating, 85–86
Range, see Functions, range
Rational function, 127
Reduction formulas, 246
Regular points, 137
Relation, 10
Relative change, 146
Riemann sums, 247

Second derivative, 99
Second derivative test, 138
Simpson’s rule, 250
Solution of a diff. eq., 104
Summary
  Chapter 0, 25
  Chapter 1, 105
  Chapter 2, 150
  Chapter 3, 187
  Chapter 4, 268
  Chapter 5, 292
Summation notation, 197
Surface areas of revolution, 262

Telescoping sum, 197
Third derivative, 99
Trapezoidal rule, 249
Trig functions, see Functions, trig
  defined, 70
  derivatives, 70–73
  hyperbolic, 86
  inverse, 87
Trig identities, 17

Unit elasticity, 145

Vertical line test, 7

Water Balloons, 272
Wiggle magnification factor, 39
Wiggle magnification formula
  three variables, 155
  two variables, 155
Wiggles, 38
WMF, see Wiggle magnification factor
  multi-variable, 154
