Forrest Math148notes

Math 148: Calculus II
Instructor: Brian E. Forrest

December 30, 2009
Chapter 1
Integration
This chapter is devoted to basic integral calculus: we explore the definitions,
properties, and techniques of integration. We will prove the fundamental theorem of calculus, one of the most important theorems in analysis.
1.0
Areas Under a Curve
Problem. How do we find areas under the graph of a function f (x)?

Example 1.1. Determine the area bounded by the graph of $f(x) = x^2$, the
lines $x = 0$, $x = 1$, and $y = 0$.
1. Let $A$ be the area of the region $R$.
2. We can approximate the area by a rectangle with height equal to $f(1)$
(i.e., $f(1) = 1^2 = 1$) and base the interval $[0, 1]$. The area of this
rectangle is $E_0 = \text{height} \times \text{base} = 1 \cdot 1 = 1$. Note that $A \le E_0$.
3. Next we split the interval $[0, 1]$ into two equal parts at $x = \frac{1}{2}$. In this case,
$R = R_1 + R_2$ and we can estimate the area of $R$ by estimating the areas
of $R_1$ and $R_2$. Let $A_1$ and $A_2$ denote the areas of $R_1$ and $R_2$, respectively.
Repeating these steps, we estimate $A_1$ and $A_2$ just as before, with rectangles
of heights $f(\frac{1}{2}) = \frac{1}{4}$ and $f(1) = 1$, and of bases $\frac{1}{2} - 0 = \frac{1}{2}$ and
$1 - \frac{1}{2} = \frac{1}{2}$, respectively. Hence,
\[ A_1 \le f\left(\tfrac{1}{2}\right)\left(\tfrac{1}{2} - 0\right) = \left(\tfrac{1}{2}\right)^2 \cdot \tfrac{1}{2} = \tfrac{1}{8} \]
and
\[ A_2 \le f(1)\left(1 - \tfrac{1}{2}\right) = 1^2 \cdot \tfrac{1}{2} = \tfrac{1}{2}, \]
whence $A = A_1 + A_2 \le \frac{1}{8} + \frac{1}{2} = \frac{5}{8}$.
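In general, we can divide the interval $[0, 1]$ into $n$ identical parts. Let $A_i$
denote the area of $R_i$; then
\[ R = R_1 + R_2 + \cdots + R_n, \qquad A = A_1 + A_2 + \cdots + A_n. \]
We see that $A_i \le f\left(\frac{i}{n}\right)\left(\frac{i}{n} - \frac{i-1}{n}\right) = \left(\frac{i}{n}\right)^2 \frac{1}{n}$. Hence
\[
\begin{aligned}
A = A_1 + \cdots + A_n = \sum_{i=1}^{n} A_i
  &\le \sum_{i=1}^{n} \left(\frac{i}{n}\right)^2 \frac{1}{n} \\
  &= \frac{1}{n^3} \sum_{i=1}^{n} i^2 \\
  &= \frac{1}{n^3} \cdot \frac{n(n+1)(2n+1)}{6} \\
  &= \frac{(1)\left(1 + \frac{1}{n}\right)\left(2 + \frac{1}{n}\right)}{6}.
\end{aligned}
\]
Note that
\[ \lim_{n\to\infty} \sum_{i=1}^{n} \left(\frac{i}{n}\right)^2 \left(\frac{1}{n}\right) = \lim_{n\to\infty} \frac{(1)\left(1 + \frac{1}{n}\right)\left(2 + \frac{1}{n}\right)}{6} = \frac{1}{3}. \]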
Question: Is $\frac{1}{3}$ the area of the region $R$?
The right-hand estimates give $A \le \frac{1}{3}$. We also want $A \ge \frac{1}{3}$, since this will show that $A = \frac{1}{3}$.

Strategy: Instead of taking the right-hand sum, we take the left-hand sum; we
are now underestimating.
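If we again divide $[0, 1]$ into $n$ equal parts to generate subregions $R_1, \dots, R_n$,
we can once more estimate the area $A_i$ of $R_i$ by approximating $R_i$ with a rectangle
whose base is $\left[\frac{i-1}{n}, \frac{i}{n}\right]$ and whose height equals $f\left(\frac{i-1}{n}\right)$. In this case, we have
\[ A_i \ge f\left(\frac{i-1}{n}\right) \frac{1}{n} = \left(\frac{i-1}{n}\right)^2 \frac{1}{n} = (i-1)^2 \, \frac{1}{n^3}, \]
so that
\[ A \ge \sum_{i=1}^{n} (i-1)^2 \, \frac{1}{n^3} = \frac{1}{n^3} \sum_{i=1}^{n-1} i^2 = \frac{1}{n^3} \cdot \frac{(n-1)(n)(2n-1)}{6} = \frac{\left(1 - \frac{1}{n}\right)(1)\left(2 - \frac{1}{n}\right)}{6}, \]
and hence
\[ A \ge \lim_{n\to\infty} \frac{\left(1 - \frac{1}{n}\right)(1)\left(2 - \frac{1}{n}\right)}{6} = \frac{1}{3}. \]
Conclusion: $A = \frac{1}{3}$. With this section as our motivation, we can now define
what an integral is; as a result, we will be able to define formally the concept
of area.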
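The squeeze argument above can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original notes; the function names `left_sum` and `right_sum` are illustrative. It computes the under- and over-estimates for $f(x) = x^2$ on $[0, 1]$ and shows both approaching $\frac{1}{3}$.

```python
from fractions import Fraction

def right_sum(n):
    """Over-estimate: sum_{i=1}^n (i/n)^2 * (1/n) for f(x) = x^2 on [0, 1]."""
    return sum(Fraction(i, n) ** 2 * Fraction(1, n) for i in range(1, n + 1))

def left_sum(n):
    """Under-estimate: sum_{i=1}^n ((i-1)/n)^2 * (1/n) for f(x) = x^2 on [0, 1]."""
    return sum(Fraction(i - 1, n) ** 2 * Fraction(1, n) for i in range(1, n + 1))

for n in (1, 2, 10, 100, 1000):
    lo, hi = left_sum(n), right_sum(n)
    # The true area A is trapped between lo and hi for every n.
    print(n, float(lo), float(hi))
```

For $n = 2$ the over-estimate is the value $\frac{5}{8}$ found by hand, and the two estimates always differ by exactly $\frac{1}{n}$.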
1.1
Integrals and Riemann Sums
Definition 1.1.1. Let $[a, b]$ be a finite, closed interval. A partition $P$ of $[a, b]$
is a finite subset of $[a, b]$ of the form
\[ P = \{x_i : a = x_0 < x_1 < \cdots < x_n = b\}. \]
Definition 1.1.2. For a partition $P = \{x_0, x_1, \dots, x_n\}$ of $[a, b]$ (so that $P$ determines $n$ subintervals),
1. define $\Delta x_i := x_i - x_{i-1}$, and
2. define the norm of $P$, denoted $\|P\|$, by
\[ \|P\| := \max\{\Delta x_i : i = 1, 2, \dots, n\}. \]
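Example 1.2. For any $n \in \mathbb{N}$ we can construct the $n$-regular partition $P_n$ of
$[a, b]$ by subdividing $[a, b]$ into $n$ identical parts, with the length of each subinterval
being $\frac{b-a}{n}$. I.e., $\Delta x_i = \frac{b-a}{n}$ for each $i$, and $\|P_n\| = \frac{b-a}{n}$. Note that
\[
\begin{aligned}
x_1 &= a + \frac{b-a}{n}, \\
x_2 &= a + 2\,\frac{b-a}{n}, \\
    &\;\;\vdots \\
x_i &= a + i\,\frac{b-a}{n}, \\
    &\;\;\vdots \\
x_n &= a + n\,\frac{b-a}{n} = b.
\end{aligned}
\]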
Assume that $f(x)$ is bounded on $[a, b]$. Let $P$ be a partition of $[a, b]$. Let
\[ M_i = \sup\{f(x) : x \in [x_{i-1}, x_i]\}, \]
and let
\[ m_i = \inf\{f(x) : x \in [x_{i-1}, x_i]\}. \]
Then $m_i \le f(x) \le M_i$ for all $x \in [x_{i-1}, x_i]$.
Definition 1.1.3. Given $f(x)$, bounded on $[a, b]$, and a partition $P$ of $[a, b]$, we
define the upper Riemann sum of $f(x)$ with respect to $P$ by
\[ U(f, P) := U_a^b(f, P) = \sum_{i=1}^{n} M_i \,\Delta x_i, \]
and the lower Riemann sum of $f(x)$ with respect to $P$ by
\[ L(f, P) := L_a^b(f, P) = \sum_{i=1}^{n} m_i \,\Delta x_i, \]
where $M_i$ and $m_i$ are defined as above. Finally, for each $i$, choose $c_i \in [x_{i-1}, x_i]$;
a Riemann sum for $f(x)$ on $[a, b]$ with respect to $P$ is defined by
\[ S_a^b(f, P) = \sum_{i=1}^{n} f(c_i) \,\Delta x_i. \]
The endpoints $a$ and $b$ in the notation of definition 1.1.3 are usually omitted.
Remark 1.1.4. Since $m_i \le f(c_i) \le M_i$ and $\Delta x_i > 0$, we can conclude that
\[ L_a^b(f, P) \le S_a^b(f, P) \le U_a^b(f, P). \]
Definition 1.1.5. Given a partition $P$, we say that a partition $Q$ is a refinement
of $P$ if and only if $P \subseteq Q$. We also say that $Q$ refines $P$ or $P$ is refined by $Q$.
Theorem 1.1.6. Let $f(x)$ be bounded on $[a, b]$. Let $P$, $Q$ be partitions of $[a, b]$
with $Q$ a refinement of $P$. Then
\[ L(f, P) \le L(f, Q) \le U(f, Q) \le U(f, P). \]
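Proof. Assume first that $Q = P \cup \{y_0\}$ for some $y_0 \in [a, b] \setminus P$. Hence
$Q = \{a = x_0 < x_1 < \cdots < x_{i-1} < y_0 < x_i < \cdots < x_n = b\}$ and
$P = \{x_i : a = x_0 < x_1 < \cdots < x_n = b\}$. Let
\[ M_j = \sup\{f(x) : x \in [x_{j-1}, x_j]\}, \qquad m_j = \inf\{f(x) : x \in [x_{j-1}, x_j]\}. \]
Also let
\[
\begin{aligned}
M_{i,1} &= \sup\{f(x) : x \in [x_{i-1}, y_0]\}, & m_{i,1} &= \inf\{f(x) : x \in [x_{i-1}, y_0]\}, \\
M_{i,2} &= \sup\{f(x) : x \in [y_0, x_i]\}, & m_{i,2} &= \inf\{f(x) : x \in [y_0, x_i]\}.
\end{aligned}
\]
Note that since $f([x_{i-1}, x_i]) \supseteq f([x_{i-1}, y_0])$ and $f([x_{i-1}, x_i]) \supseteq f([y_0, x_i])$, we
have that $M_{i,1} \le M_i$ and $M_{i,2} \le M_i$. Similarly, $m_{i,1} \ge m_i$ and $m_{i,2} \ge m_i$. Now
\[
\begin{aligned}
U(f, P) &= \sum_{j=1}^{n} M_j \,\Delta x_j \\
 &= \sum_{j \ne i} M_j \,\Delta x_j + M_i \,\Delta x_i \\
 &= \sum_{j \ne i} M_j \,\Delta x_j + M_i (y_0 - x_{i-1}) + M_i (x_i - y_0) \\
 &\ge \sum_{j \ne i} M_j \,\Delta x_j + M_{i,1} (y_0 - x_{i-1}) + M_{i,2} (x_i - y_0) \\
 &= U(f, Q).
\end{aligned}
\]
Similarly,
\[
\begin{aligned}
L(f, P) &= \sum_{j=1}^{n} m_j \,\Delta x_j \\
 &= \sum_{j \ne i} m_j \,\Delta x_j + m_i \,\Delta x_i \\
 &= \sum_{j \ne i} m_j \,\Delta x_j + m_i (y_0 - x_{i-1}) + m_i (x_i - y_0) \\
 &\le \sum_{j \ne i} m_j \,\Delta x_j + m_{i,1} (y_0 - x_{i-1}) + m_{i,2} (x_i - y_0) \\
 &= L(f, Q).
\end{aligned}
\]
Since $L(f, Q) \le U(f, Q)$ by remark 1.1.4, we conclude that
\[ L(f, P) \le L(f, Q) \le U(f, Q) \le U(f, P). \]
We can use induction on the number of points in $Q \setminus P$ to establish the theorem.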
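As an illustration of theorem 1.1.6, the sketch below computes upper and lower sums for a partition and for one of its refinements. It is not part of the original notes: it assumes $f$ is nondecreasing on the interval, so that the supremum on each subinterval is attained at the right endpoint and the infimum at the left endpoint, and the helper name `upper_lower` is hypothetical.

```python
from fractions import Fraction

def upper_lower(f, partition):
    """Return (U(f, P), L(f, P)) for a NONDECREASING f.

    For such f, sup on [x_{i-1}, x_i] is f(x_i) and inf is f(x_{i-1}).
    """
    pts = sorted(partition)
    upper = sum(f(b) * (b - a) for a, b in zip(pts, pts[1:]))
    lower = sum(f(a) * (b - a) for a, b in zip(pts, pts[1:]))
    return upper, lower

f = lambda x: x * x  # nondecreasing on [0, 1]
P = {Fraction(0), Fraction(1, 2), Fraction(1)}
Q = P | {Fraction(1, 4), Fraction(2, 3)}  # Q refines P

U_P, L_P = upper_lower(f, P)
U_Q, L_Q = upper_lower(f, Q)
print(L_P, L_Q, U_Q, U_P)  # expected to be in nondecreasing order
```

Adding points can only raise the lower sum and lower the upper sum, which is exactly the chain of inequalities in the theorem.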
Corollary 1.1.7. If $f(x)$ is bounded on $[a, b]$ and $P$ and $Q$ are partitions of $[a, b]$, then
$L(f, P) \le U(f, Q)$.
Proof. Let $T = P \cup Q$. Then $T$ refines both $P$ and $Q$. Hence
$L(f, P) \le L(f, T) \le U(f, T) \le U(f, Q)$, as desired.
Definition 1.1.8. Let $f(x)$ be bounded on $[a, b]$. The upper Riemann integral
for $f(x)$ over $[a, b]$ is
\[ \overline{\int_a^b} f(x)\,dx = \inf\{U(f, P) : P \text{ a partition of } [a, b]\}, \]
and the lower Riemann integral for $f(x)$ over $[a, b]$ is
\[ \underline{\int_a^b} f(x)\,dx = \sup\{L(f, P) : P \text{ a partition of } [a, b]\}. \]
Observation. By corollary 1.1.7,
\[ \underline{\int_a^b} f(x)\,dx \le \overline{\int_a^b} f(x)\,dx. \]
1.2
Integrable Functions
Definition 1.2.1. We say that $f(x)$ is Riemann integrable on $[a, b]$ if
\[ \underline{\int_a^b} f(x)\,dx = \overline{\int_a^b} f(x)\,dx, \]
in which case we denote this common value by
\[ \int_a^b f(x)\,dx. \]
In definitions 1.1.8 and 1.2.1, we encountered new notation: the $\int$ is the
integral sign, $a$ and $b$ are the endpoints of integration, $f(x)$ is the integrand, and $dx$
refers to the variable of integration (also called the dummy variable), which in this
case is $x$.
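Example 1.3. Let
\[
f(x) =
\begin{cases}
1 & \text{if } x \in [0, 1] \cap \mathbb{Q}, \\
-1 & \text{if } x \in [0, 1] \setminus \mathbb{Q}.
\end{cases}
\]
If $P = \{x_i : 0 = x_0 < x_1 < \cdots < x_n = 1\}$, then let
\[ M_i = \sup\{f(x) : x \in [x_{i-1}, x_i]\}, \qquad m_i = \inf\{f(x) : x \in [x_{i-1}, x_i]\}. \]
Every subinterval contains both rational and irrational points, so $M_i = 1$ and $m_i = -1$ for each $i$. We have that
\[ U(f, P) = \sum_{i=1}^{n} M_i \,\Delta x_i = \sum_{i=1}^{n} \Delta x_i = 1, \quad\text{and}\quad L(f, P) = \sum_{i=1}^{n} m_i \,\Delta x_i = -\sum_{i=1}^{n} \Delta x_i = -1. \]
Hence
\[ \overline{\int_0^1} f(x)\,dx = 1 \ne -1 = \underline{\int_0^1} f(x)\,dx, \]
and hence this function is not integrable on $[0, 1]$.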

Observation. $f(x)$ in the above example is discontinuous everywhere on $[0, 1]$.
Theorem 1.2.2. Let $f(x)$ be bounded on $[a, b]$. Then $f(x)$ is integrable on
$[a, b]$ if and only if for every $\epsilon > 0$ there exists a partition $P$ of $[a, b]$ such that
\[ U(f, P) - L(f, P) < \epsilon. \]
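Proof. Assume that $f(x)$ is integrable on $[a, b]$. I.e.,
\[ \underline{\int_a^b} f(x)\,dx = \overline{\int_a^b} f(x)\,dx = \int_a^b f(x)\,dx. \]
Let $\epsilon > 0$. Since $\overline{\int_a^b} f(x)\,dx = \inf\{U(f, P) : P \text{ is a partition}\}$, there exists a
partition $P_1$ such that
\[ \int_a^b f(x)\,dx \le U(f, P_1) < \int_a^b f(x)\,dx + \frac{\epsilon}{2}. \]
Similarly, since $\underline{\int_a^b} f(x)\,dx = \sup\{L(f, P) : P \text{ is a partition}\}$, there exists a partition $P_2$ such that
\[ \int_a^b f(x)\,dx - \frac{\epsilon}{2} < L(f, P_2) \le \int_a^b f(x)\,dx. \]
Let $Q = P_1 \cup P_2$. Then, using theorem 1.1.6 for the second and fourth inequalities,
\[ \int_a^b f(x)\,dx - \frac{\epsilon}{2} < L(f, P_2) \le L(f, Q) \le U(f, Q) \le U(f, P_1) < \int_a^b f(x)\,dx + \frac{\epsilon}{2}. \]
This implies that
\[ U(f, Q) - L(f, Q) < \epsilon. \]
To prove the converse, assume that for each $\epsilon > 0$ there exists a partition $P$
of $[a, b]$ such that
\[ U(f, P) - L(f, P) < \epsilon. \]
However, by definition 1.1.8, we have that
\[ L(f, P) \le \underline{\int_a^b} f(x)\,dx \le \overline{\int_a^b} f(x)\,dx \le U(f, P), \]
and hence
\[ 0 \le \overline{\int_a^b} f(x)\,dx - \underline{\int_a^b} f(x)\,dx < \epsilon. \]
Since $\epsilon$ is arbitrary, we obtain
\[ \underline{\int_a^b} f(x)\,dx = \overline{\int_a^b} f(x)\,dx; \]
therefore $f(x)$ is integrable on $[a, b]$, as required.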

Example 1.4. Let $f(x) = x^2$. Let $P_n$ be the $n$-regular partition of $[0, 1]$. Since $f$ is increasing on $[0, 1]$,
\[ U_0^1(f, P_n) = \sum_{i=1}^{n} f(x_i) \,\Delta x_i = \sum_{i=1}^{n} \frac{i^2}{n^2} \cdot \frac{1}{n} \]
and
\[ L_0^1(f, P_n) = \sum_{i=1}^{n} f(x_{i-1}) \,\Delta x_i = \sum_{i=1}^{n} \frac{(i-1)^2}{n^2} \cdot \frac{1}{n}. \]
Then
\[ U(f, P_n) - L(f, P_n) = \frac{n^2}{n^2} \cdot \frac{1}{n} - \frac{0^2}{n^2} \cdot \frac{1}{n} = \frac{1}{n}. \]
This shows that for all $\epsilon > 0$, we can find a partition $P$ such that
$U(f, P) - L(f, P) < \epsilon$ (simply select a large enough $n$ by the Archimedean Principle and
use the $n$-regular partition $P_n$). Hence by theorem 1.2.2, $f(x)$ is integrable on
$[0, 1]$. Later, we will be able to show that
\[ \int_0^1 x^2\,dx = \frac{1}{3}. \]
Recall the following definition.
Definition 1.2.3. We say that $f(x)$ is uniformly continuous on an interval $I$
if for every $\epsilon > 0$, there exists a $\delta > 0$ such that for all $x, y \in I$,
\[ |x - y| < \delta \quad\text{implies}\quad |f(x) - f(y)| < \epsilon. \]
Also recall the following theorem.

Theorem 1.2.4 [Sequential Characterization of Uniform Continuity]. A function $f(x)$ is uniformly continuous on an interval $I$ if and only if
whenever $\{x_n\}$, $\{y_n\}$ are sequences in $I$,
\[ \lim_{n\to\infty} (x_n - y_n) = 0 \quad\text{implies}\quad \lim_{n\to\infty} (f(x_n) - f(y_n)) = 0. \]

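Example 1.5. Let $I = \mathbb{R}$ and $f(x) = x^2$. Take $\{x_n\} = \{n + \frac{1}{n}\}$, $\{y_n\} = \{n\}$.
Then $\lim_{n\to\infty} (x_n - y_n) = \lim_{n\to\infty} \frac{1}{n} = 0$, but
\[
\begin{aligned}
\lim_{n\to\infty} (f(x_n) - f(y_n)) &= \lim_{n\to\infty} \left[ \left(n + \frac{1}{n}\right)^2 - n^2 \right] \\
 &= \lim_{n\to\infty} \left( 2 + \frac{1}{n^2} \right) \\
 &= 2 \ne 0.
\end{aligned}
\]
Hence $f(x) = x^2$ is not uniformly continuous on $\mathbb{R}$.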
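The computation in this example is easy to confirm with exact arithmetic. The short sketch below is illustrative only (the function name `gaps` is not from the notes); it returns both $x_n - y_n$ and $f(x_n) - f(y_n)$ for the sequences chosen above.

```python
from fractions import Fraction

def gaps(n):
    """Return (x_n - y_n, f(x_n) - f(y_n)) for x_n = n + 1/n, y_n = n, f(x) = x^2."""
    x_n = Fraction(n) + Fraction(1, n)
    y_n = Fraction(n)
    return x_n - y_n, x_n ** 2 - y_n ** 2

for n in (1, 10, 100, 1000):
    dx, df = gaps(n)
    # dx shrinks to 0 while df stays above 2
    print(n, float(dx), float(df))
```

The inputs get arbitrarily close while the outputs stay more than 2 apart, which is what rules out uniform continuity on $\mathbb{R}$.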
One should already be familiar with the following theorem.
Theorem 1.2.5. If f (x) is continuous on [a, b], then f (x) is uniformly continuous on [a, b].
Proof. Assume that $f(x)$ is continuous on $[a, b]$ but not uniformly continuous
on $[a, b]$. Then by theorem 1.2.4, there exist sequences $\{x_n\}$ and $\{y_n\}$ in $[a, b]$ with
$\lim_{n\to\infty} (x_n - y_n) = 0$ but $\lim_{n\to\infty} (f(x_n) - f(y_n)) \ne 0$. (Note that this limit
may not even exist.) By choosing a subsequence if necessary, we can assume
without loss of generality that there exists an $\epsilon_0 > 0$ such that
\[ |(f(x_n) - f(y_n)) - 0| = |f(x_n) - f(y_n)| \ge \epsilon_0 \tag{1.1} \]
for all $n \in \mathbb{N}$.
Since $\{x_n\} \subset [a, b]$, by the Bolzano-Weierstrass Theorem $\{x_n\}$ has a subsequence $\{x_{n_k}\}$ with $\lim_{k\to\infty} x_{n_k} = x_0 \in [a, b]$. Note that $\lim_{k\to\infty} (x_{n_k} - y_{n_k}) = 0$,
hence $\lim_{k\to\infty} y_{n_k} = x_0$ also. By the sequential characterization of continuity,
we have
\[ \lim_{k\to\infty} f(x_{n_k}) = f(x_0), \qquad \lim_{k\to\infty} f(y_{n_k}) = f(x_0). \]
Therefore, we can select a $K \in \mathbb{N}$ with
\[ |f(x_{n_k}) - f(x_0)| < \frac{\epsilon_0}{2} \quad\text{and}\quad |f(y_{n_k}) - f(x_0)| < \frac{\epsilon_0}{2} \]
for all $k \ge K$. Hence we have, for all $k \ge K$,
\[
\begin{aligned}
\epsilon_0 \le |f(x_{n_k}) - f(y_{n_k})| &\le |f(x_{n_k}) - f(x_0)| + |f(y_{n_k}) - f(x_0)| \\
 &< \frac{\epsilon_0}{2} + \frac{\epsilon_0}{2} \\
 &= \epsilon_0,
\end{aligned}
\]
which is impossible; the strict upper bound directly contradicts inequality (1.1). Hence $f(x)$ is uniformly continuous on
$[a, b]$.
Theorem 1.2.6 [Integrability Theorem for Continuous Functions].
If f (x) is continuous on [a, b], then f (x) is integrable on [a, b].
Proof. Let $\epsilon > 0$. Since $f(x)$ is continuous on $[a, b]$, $f(x)$ is also uniformly
continuous on $[a, b]$ by theorem 1.2.5 above. We can find a $\delta > 0$ such that
if $|x - y| < \delta$, then $|f(x) - f(y)| < \frac{\epsilon}{b-a}$. Let $P$ be a partition of $[a, b]$ with
$\|P\| = \max\{\Delta x_i\} < \delta$. For example, we can choose the $n$-regular partition
of $[a, b]$ with $n$ large enough (chosen by the Archimedean Principle) so that
$\frac{b-a}{n} = \|P_n\| < \delta$. After we have chosen $P$, let
\[ M_i = \sup\{f(x) : x \in [x_{i-1}, x_i]\}, \qquad m_i = \inf\{f(x) : x \in [x_{i-1}, x_i]\}. \]
By the extreme value theorem, there exist $c_i, d_i \in [x_{i-1}, x_i]$ such that $f(c_i) = m_i$
and $f(d_i) = M_i$; note that $M_i - m_i = |M_i - m_i| = |f(d_i) - f(c_i)|$. Since
$|c_i - d_i| \le \Delta x_i < \delta$, we have that $|f(d_i) - f(c_i)| < \frac{\epsilon}{b-a}$. This shows that
\[
\begin{aligned}
U(f, P) - L(f, P) &= \sum_{i=1}^{n} M_i \,\Delta x_i - \sum_{i=1}^{n} m_i \,\Delta x_i \\
 &= \sum_{i=1}^{n} (M_i - m_i) \,\Delta x_i \\
 &< \sum_{i=1}^{n} \frac{\epsilon}{b-a} \,\Delta x_i \\
 &= \frac{\epsilon}{b-a} \sum_{i=1}^{n} \Delta x_i \\
 &= \frac{\epsilon}{b-a} (b - a) \\
 &= \epsilon.
\end{aligned}
\]
Hence $f(x)$ is integrable by theorem 1.2.2.
Remark 1.2.7. Assume that $f(x)$ is continuous on $[a, b]$. For each $n$, let $P_n$ be
the $n$-regular partition. Then if $S(f, P_n)$ is any Riemann sum associated with
$P_n$, we have
\[ \int_a^b f(x)\,dx = \lim_{n\to\infty} S(f, P_n). \]
Proof. Given $\epsilon > 0$, we can find an $N$ large enough so that if $n \ge N$, then
$\frac{b-a}{n} < \delta$, with $\delta$ as chosen in the proof of theorem 1.2.6 above. Hence
\[ U(f, P_n) - L(f, P_n) < \epsilon, \]
via an argument similar to the one made in theorem 1.2.6. But $U(f, P_n) \ge S(f, P_n) \ge L(f, P_n)$
and $U(f, P_n) \ge \int_a^b f(x)\,dx \ge L(f, P_n)$. This implies that
\[ \left| \int_a^b f(x)\,dx - S(f, P_n) \right| < \epsilon \]
and so $\lim_{n\to\infty} S(f, P_n) = \int_a^b f(x)\,dx$, as remarked.
Note that the above remark shows that if $f(x)$ is continuous, we have
\[ \int_a^b f(x)\,dx = \lim_{n\to\infty} \sum_{i=1}^{n} f\left(a + i\,\frac{b-a}{n}\right) \frac{b-a}{n}. \]
Definition 1.2.8. Given a partition $P = \{x_i : a = x_0 < x_1 < \cdots < x_n = b\}$,
define the right-hand Riemann sum, $S_R$, by
\[ S_R := \sum_{i=1}^{n} f(x_i) \,\Delta x_i; \]
define the left-hand Riemann sum, $S_L$, by
\[ S_L := \sum_{i=1}^{n} f(x_{i-1}) \,\Delta x_i; \]
define the midpoint Riemann sum, $S_M$, by
\[ S_M := \sum_{i=1}^{n} f\left(\frac{x_{i-1} + x_i}{2}\right) \Delta x_i. \]
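The three sums in definition 1.2.8 translate directly into code. The sketch below is an illustration rather than part of the notes; the helper names `regular_partition` and `riemann_sums` are hypothetical, and floating-point arithmetic is used, so the values are approximate. It evaluates all three sums for $\sin$ on $[0, \pi]$, whose integral is 2.

```python
import math

def regular_partition(a, b, n):
    """Points x_0, ..., x_n of the n-regular partition of [a, b]."""
    return [a + i * (b - a) / n for i in range(n + 1)]

def riemann_sums(f, partition):
    """Return (S_L, S_R, S_M): left-hand, right-hand and midpoint Riemann sums."""
    pts = list(partition)
    pairs = list(zip(pts, pts[1:]))  # consecutive points (x_{i-1}, x_i)
    s_left = sum(f(lo) * (hi - lo) for lo, hi in pairs)
    s_right = sum(f(hi) * (hi - lo) for lo, hi in pairs)
    s_mid = sum(f((lo + hi) / 2) * (hi - lo) for lo, hi in pairs)
    return s_left, s_right, s_mid

s_l, s_r, s_m = riemann_sums(math.sin, regular_partition(0.0, math.pi, 1000))
print(s_l, s_r, s_m)  # all three should be close to 2
```

Any partition can be passed in, not only a regular one; the sums only use the consecutive differences $\Delta x_i$.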
Theorem 1.2.9. Assume that $f(x)$ is integrable on $[a, b]$. Given $\epsilon > 0$, there
exists $\delta > 0$ such that if $P$ is a partition of $[a, b]$ with $\|P\| < \delta$, then
\[ \left| S(f, P) - \int_a^b f(x)\,dx \right| < \epsilon \]
for any Riemann sum $S(f, P)$.
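Proof. Since $f(x)$ is integrable, we can find a partition $P_1$ of $[a, b]$ with
$U(f, P_1) - L(f, P_1) < \frac{\epsilon}{2}$. In particular, for any Riemann sum $S(f, P_1)$, we have
$U(f, P_1) \ge S(f, P_1) \ge L(f, P_1)$ and $U(f, P_1) \ge \int_a^b f(x)\,dx \ge L(f, P_1)$. Suppose
$P_1 = \{x_i : a = x_0 < x_1 < \cdots < x_n = b\}$. Then let
\[ M = \sup\{f(x) : x \in [a, b]\}, \qquad m = \inf\{f(x) : x \in [a, b]\}. \]
If $M - m = 0$, then $f(x)$ is constant on $[a, b]$, so say $f(x) = c$ for all $x \in [a, b]$.
In this case $U(f, P) = c(b - a) = L(f, P)$ for any partition $P$. Hence any $\delta > 0$
would satisfy the theorem and we are done. So assume $M > m$.
Let $0 < \delta < \frac{\epsilon}{2n(M - m)}$. Let $P = \{y_j : a = y_0 < y_1 < \cdots < y_j < \cdots < y_k = b\}$ be
any partition of $[a, b]$ with $\|P\| < \delta$. Let
\[ T = \{j \in \{1, 2, \dots, k\} : [y_{j-1}, y_j] \subseteq [x_{i-1}, x_i] \text{ for some } i\}. \]
Also, let
\[ M_j = \sup\{f(x) : x \in [y_{j-1}, y_j]\}, \qquad m_j = \inf\{f(x) : x \in [y_{j-1}, y_j]\}. \]
Then we have that
\[
\begin{aligned}
U(f, P) - L(f, P) &= \sum_{j=1}^{k} M_j \,\Delta y_j - \sum_{j=1}^{k} m_j \,\Delta y_j \\
 &= \sum_{j=1}^{k} (M_j - m_j) \,\Delta y_j \\
 &= \sum_{j \in T} (M_j - m_j) \,\Delta y_j + \sum_{j \in \{1, \dots, k\} \setminus T} (M_j - m_j) \,\Delta y_j.
\end{aligned}
\]
Note that
\[ \sum_{j \in T} (M_j - m_j) \,\Delta y_j \le U(f, P_1 \cup P) - L(f, P_1 \cup P) \le U(f, P_1) - L(f, P_1) < \frac{\epsilon}{2}. \]
Observe that $\{1, 2, \dots, k\} \setminus T$ has at most $n$ elements: for each $j \notin T$, the open
interval $(y_{j-1}, y_j)$ contains some point $x_i$ of $P_1$, a given $x_i$ lies in at most one such
open interval, and $P_1$ has fewer than $n$ points strictly between $a$ and $b$. Moreover, for each
$j \in \{1, \dots, k\} \setminus T$, we have
\[ (M_j - m_j) \,\Delta y_j \le (M - m) \,\|P\| < (M - m) \,\frac{\epsilon}{2n(M - m)} = \frac{\epsilon}{2n}. \]
This shows that
\[ \sum_{j \in \{1, \dots, k\} \setminus T} (M_j - m_j) \,\Delta y_j < \sum_{j \in \{1, \dots, k\} \setminus T} \frac{\epsilon}{2n} \le n \cdot \frac{\epsilon}{2n} = \frac{\epsilon}{2}. \]
Hence, if $\|P\| < \delta$, then $U(f, P) - L(f, P) < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$; therefore, if $S(f, P)$ is any
Riemann sum, then
\[ \left| S(f, P) - \int_a^b f(x)\,dx \right| < \epsilon, \]
as required.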
This theorem is a generalization of remark 1.2.7. In particular, this theorem
provides an easy, alternate proof of remark 1.2.7. Here we state the remark
again as a corollary.
Corollary 1.2.10. If $f(x)$ is integrable on $[a, b]$, then
\[ \int_a^b f(x)\,dx = \lim_{n\to\infty} S(f, P_n) = \lim_{n\to\infty} \sum_{i=1}^{n} f(c_i) \,\frac{b-a}{n}, \]
where $P_n$ is the $n$-regular partition of $[a, b]$ and $c_i \in [x_{i-1}, x_i]$. (Here $S(f, P_n)$ is any
Riemann sum.)
Proof. Let $\epsilon > 0$. By the above theorem, we can find $\delta > 0$ so that if $\|P_n\| = \frac{b-a}{n} < \delta$, then
\[ \left| S(f, P_n) - \int_a^b f(x)\,dx \right| < \epsilon \]
for any Riemann sum $S(f, P_n)$. The result follows by choosing an $N$ (using the
Archimedean Principle) so that $\frac{b-a}{n} \le \frac{b-a}{N} < \delta$ for all $n \ge N$.
In general, we write
\[ \int_a^b f(x)\,dx = \lim_{\|P\| \to 0} S(f, P). \]
Theorem 1.2.11. Let f (x) be monotonic on [a, b]. Then f (x) is integrable on
[a, b].
Proof. Let $P_n$ be the $n$-regular partition. Assume, without loss of generality,
that $f(x)$ is nondecreasing on $[a, b]$. Then
\[ U(f, P_n) = S_R(f, P_n) = \sum_{i=1}^{n} f(x_i) \,\frac{b-a}{n}, \quad\text{and}\quad L(f, P_n) = S_L(f, P_n) = \sum_{i=1}^{n} f(x_{i-1}) \,\frac{b-a}{n}. \]
So
\[
\begin{aligned}
U(f, P_n) - L(f, P_n) &= \sum_{i=1}^{n} f(x_i) \,\frac{b-a}{n} - \sum_{i=0}^{n-1} f(x_i) \,\frac{b-a}{n} \\
 &= \frac{b-a}{n} \left[ (f(x_1) + \cdots + f(x_n)) - (f(x_0) + \cdots + f(x_{n-1})) \right] \\
 &= \frac{b-a}{n} (f(x_n) - f(x_0)) \\
 &= \frac{b-a}{n} (f(b) - f(a)).
\end{aligned}
\]
Since $\lim_{n\to\infty} \frac{b-a}{n} (f(b) - f(a)) = 0$, $f(x)$ is integrable on $[a, b]$ by theorem
1.2.2.
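The telescoping identity in this proof can be verified exactly for a concrete nondecreasing function. The sketch below is illustrative only: the function name `monotone_gap` and the sample function $x^3 + x$ are choices made here, and integer endpoints are assumed so that rational arithmetic stays exact.

```python
from fractions import Fraction

def monotone_gap(f, a, b, n):
    """U(f, P_n) - L(f, P_n) for a NONDECREASING f and the n-regular partition of [a, b].

    Assumes a and b are integers so every quantity is an exact Fraction.
    """
    h = Fraction(b - a, n)
    xs = [a + i * h for i in range(n + 1)]
    upper = sum(f(x) * h for x in xs[1:])    # right endpoints give the sup
    lower = sum(f(x) * h for x in xs[:-1])   # left endpoints give the inf
    return upper - lower

f = lambda x: x ** 3 + x  # nondecreasing on all of R
a, b, n = -1, 2, 50
print(monotone_gap(f, a, b, n), Fraction(b - a, n) * (f(b) - f(a)))
```

Both printed values agree, and the gap shrinks like $\frac{1}{n}$ no matter how many discontinuities a monotonic $f$ has.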
Question: Assume that $f(x)$ is bounded on $[a, b]$ and continuous on $[a, b]$ except at $c$.
Is $f(x)$ still integrable on $[a, b]$?
Theorem 1.2.12. Assume that $f(x)$ is bounded on $[a, b]$ and continuous on $[a, b]$ except at $c \in [a, b]$.
Then $f(x)$ is integrable on $[a, b]$.
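Proof. Let $P = \{x_i : a = x_0 < x_1 < \cdots < x_n = b\}$ be a partition of $[a, b]$ so
that $c \in (x_{i_0 - 1}, x_{i_0})$ for some $i_0$. Let $\epsilon > 0$. Let
\[
\begin{aligned}
M &= \sup\{f(x) : x \in [a, b]\}, & m &= \inf\{f(x) : x \in [a, b]\}, \\
M_i &= \sup\{f(x) : x \in [x_{i-1}, x_i]\}, & m_i &= \inf\{f(x) : x \in [x_{i-1}, x_i]\}.
\end{aligned}
\]
We can choose $P$ so that $\Delta x_{i_0} < \frac{\epsilon}{3(M - m)}$. Then
\[ (M_{i_0} - m_{i_0}) \,\Delta x_{i_0} \le (M - m) \,\Delta x_{i_0} < (M - m) \,\frac{\epsilon}{3(M - m)} = \frac{\epsilon}{3}. \]
Since $f(x)$ is uniformly continuous
on $[a, x_{i_0 - 1}]$, by refining as necessary, we can assume that if $0 < i < i_0$, then
$M_i - m_i < \frac{\epsilon}{3(x_{i_0 - 1} - a)}$. We now have
\[ \sum_{i=1}^{i_0 - 1} (M_i - m_i) \,\Delta x_i < \sum_{i=1}^{i_0 - 1} \frac{\epsilon}{3(x_{i_0 - 1} - a)} \,\Delta x_i = \frac{\epsilon}{3} \sum_{i=1}^{i_0 - 1} \frac{\Delta x_i}{x_{i_0 - 1} - a} = \frac{\epsilon}{3}. \]
A similar argument shows, by refining if necessary, that
\[ \sum_{i = i_0 + 1}^{n} (M_i - m_i) \,\Delta x_i < \frac{\epsilon}{3}. \]
Then
\[
\begin{aligned}
U(f, P) - L(f, P) &= \sum_{i=1}^{n} (M_i - m_i) \,\Delta x_i \\
 &= \sum_{i=1}^{i_0 - 1} (M_i - m_i) \,\Delta x_i + (M_{i_0} - m_{i_0}) \,\Delta x_{i_0} + \sum_{i = i_0 + 1}^{n} (M_i - m_i) \,\Delta x_i \\
 &< \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} \\
 &= \epsilon.
\end{aligned}
\]
This shows that $f(x)$ is integrable on $[a, b]$ by theorem 1.2.2.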

Observation. Observe the following facts:
1. If $f(x)$ is bounded on $[a, b]$ and continuous except possibly at finitely
many points, then $f(x)$ is integrable on $[a, b]$.
2. If $f(x)$ is integrable on $[a, b]$ and $g(x) = f(x)$ except at finitely many
points, then $g(x)$ is integrable and
\[ \int_a^b f(x)\,dx = \int_a^b g(x)\,dx. \]
In fact, much more is true. Let $(a, b)$ be an open interval. Define the length
of $(a, b)$ to be $l(a, b) = b - a$. We say that a subset $A \subseteq \mathbb{R}$ has Lebesgue measure
zero if for each $\epsilon > 0$, there exists a sequence $\{(a_n, b_n)\}$ of open intervals such
that
\[ A \subseteq \bigcup_{i=1}^{\infty} (a_i, b_i) \quad\text{and}\quad \sum_{i=1}^{\infty} l(a_i, b_i) < \epsilon. \]
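Diversion. Suppose $A \subseteq \mathbb{R}$ is countable. Then $A$ has Lebesgue measure zero.

Proof. Assume that $A = \{x_1, x_2, \dots, x_n, \dots\}$ is countable. Let $\epsilon > 0$. Let
\[ (a_n, b_n) = \left( x_n - \frac{\epsilon}{2^{n+1}},\; x_n + \frac{\epsilon}{2^{n+1}} \right). \]
Clearly, $A \subseteq \bigcup_{n=1}^{\infty} (a_n, b_n)$. Also,
\[
\begin{aligned}
\sum_{n=1}^{\infty} l(a_n, b_n) &= \sum_{n=1}^{\infty} l\left( x_n - \frac{\epsilon}{2^{n+1}},\; x_n + \frac{\epsilon}{2^{n+1}} \right) \\
 &= \sum_{n=1}^{\infty} \left[ \left( x_n + \frac{\epsilon}{2^{n+1}} \right) - \left( x_n - \frac{\epsilon}{2^{n+1}} \right) \right] \\
 &= \sum_{n=1}^{\infty} \frac{\epsilon}{2^n} \\
 &= \epsilon.
\end{aligned}
\]
Since $\epsilon > 0$ was arbitrary, the same construction with $\frac{\epsilon}{2}$ in place of $\epsilon$ gives total length $\frac{\epsilon}{2} < \epsilon$,
implying that $A$ has Lebesgue measure zero.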

Diversion. Let $f(x)$ be a bounded function on $[a, b]$. If the set of points of discontinuity
of $f(x)$ on $[a, b]$ has Lebesgue measure zero, then $f(x)$ is integrable on $[a, b]$.
1.3
Properties of Integrals
Theorem 1.3.1. Assume that $f(x)$ is integrable on $[a, b]$. Then
1. $(cf)(x) = c \cdot f(x)$ is integrable on $[a, b]$ and $\int_a^b (cf)(x)\,dx = c \int_a^b f(x)\,dx$
for all $c \in \mathbb{R}$.
2. If $g(x)$ is also integrable on $[a, b]$, then $(f + g)(x) = f(x) + g(x)$ is integrable
on $[a, b]$ and $\int_a^b (f + g)(x)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$.
Proof. To prove (1), we consider three cases: $c = 0$, $c > 0$, and $c < 0$. If $c = 0$,
then $(cf)(x) = c \cdot f(x) = 0$ for all $x$, hence $(cf)(x)$ is integrable (since it is
continuous). Moreover, $U_a^b(cf, P) = 0 = L_a^b(cf, P)$ for all partitions $P$ of $[a, b]$.
It follows that
\[ \int_a^b (cf)(x)\,dx = 0 = c \int_a^b f(x)\,dx. \]
Now we assume that $c \ne 0$. Let $\epsilon > 0$. Since $f(x)$ is integrable, we can choose
(by theorem 1.2.2) a partition $P$ such that $U_a^b(f, P) - L_a^b(f, P) < \frac{\epsilon}{|c|}$. It is clear
that
\[ U(cf, P) = c \cdot U(f, P) \quad\text{and}\quad L(cf, P) = c \cdot L(f, P) \]
if $c > 0$, and
\[ U(cf, P) = c \cdot L(f, P) \quad\text{and}\quad L(cf, P) = c \cdot U(f, P) \]
if $c < 0$. In either case, we now have
\[
\begin{aligned}
|U(cf, P) - L(cf, P)| &= |c \cdot [U(f, P) - L(f, P)]| \\
 &= |c| \cdot |U(f, P) - L(f, P)| \\
 &= |c| \cdot [U(f, P) - L(f, P)] && \text{(by remark 1.1.4)} \\
 &< |c| \cdot \frac{\epsilon}{|c|} \\
 &= \epsilon.
\end{aligned}
\]
This shows that $(cf)(x)$ is integrable on $[a, b]$. By corollary 1.2.10, we have that
\[
\begin{aligned}
\int_a^b (cf)(x)\,dx &= \lim_{n\to\infty} S(cf, P_n) \\
 &= \lim_{n\to\infty} c \cdot S(f, P_n) \\
 &= c \lim_{n\to\infty} S(f, P_n) \\
 &= c \int_a^b f(x)\,dx.
\end{aligned}
\]
This proves part (1) of the theorem. The proof of part (2) is left as a homework
exercise.
Theorem 1.3.2. If f (x) is integrable on [a, b], so is |f (x)|.
Proof. This proof is left as a homework exercise. Hint: let $h(x) = |f(x)|$; then
for any partition $P$ of $[a, b]$, $U(h, P) - L(h, P) \le U(f, P) - L(f, P)$.
Definition 1.3.3. Let $f(x)$ be defined on an interval $I$. Then define the positive
part of $f$ to be the function (also defined on $I$)
\[
f_+(x) :=
\begin{cases}
f(x) & \text{if } f(x) \ge 0, \\
0 & \text{if } f(x) < 0.
\end{cases}
\]
We also define the negative part of $f$ to be the function (also defined on $I$)
\[
f_-(x) :=
\begin{cases}
0 & \text{if } f(x) \ge 0, \\
-f(x) & \text{if } f(x) < 0.
\end{cases}
\]
Observation. Note that
\[ f(x) = f_+(x) - f_-(x), \quad\text{and}\quad |f(x)| = f_+(x) + f_-(x). \]
Also, if $f(x)$ is integrable on $[a, b]$, then $f_+(x)$ and $f_-(x)$ are also integrable.
The proof of this fact is left as a homework exercise.
Theorem 1.3.4. Suppose $f(x)$ is integrable on $[a, b]$. Then
\[ \left| \int_a^b f(x)\,dx \right| \le \int_a^b |f(x)|\,dx. \]
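Proof. From the observation above, we have that
\[ \int_a^b f(x)\,dx = \int_a^b [f_+(x) - f_-(x)]\,dx = \int_a^b f_+(x)\,dx - \int_a^b f_-(x)\,dx \quad \text{(by theorem 1.3.1)}, \]
and that
\[ \int_a^b |f(x)|\,dx = \int_a^b [f_+(x) + f_-(x)]\,dx = \int_a^b f_+(x)\,dx + \int_a^b f_-(x)\,dx \quad \text{(by theorem 1.3.1)}. \]
Since $f_+(x) \ge 0$ and $f_-(x) \ge 0$ on $[a, b]$, $\int_a^b f_+(x)\,dx \ge 0$ and $\int_a^b f_-(x)\,dx \ge 0$
(all Riemann sums $S_a^b(f_+, P)$ and $S_a^b(f_-, P)$ are greater than or equal to 0 for
all partitions $P$ of $[a, b]$). Hence we can now show that
\[
\begin{aligned}
\left| \int_a^b f(x)\,dx \right| &= \left| \int_a^b f_+(x)\,dx - \int_a^b f_-(x)\,dx \right| \\
 &\le \left| \int_a^b f_+(x)\,dx \right| + \left| \int_a^b f_-(x)\,dx \right| \\
 &= \int_a^b f_+(x)\,dx + \int_a^b f_-(x)\,dx \\
 &= \int_a^b |f(x)|\,dx,
\end{aligned}
\]
as required.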
Definition 1.3.5 [Geometric Interpretation of the Definite Integral]. If $f(x) \ge 0$ on $[a, b]$ and $f(x)$ is integrable on $[a, b]$, then the area under the
curve bounded by $y = f(x)$, $y = 0$, $x = a$ and $x = b$ is
\[ \int_a^b f(x)\,dx. \]
Now suppose $f(x)$ is no longer restricted to be nonnegative on $[a, b]$. Then the
area above the $x$-axis bounded by $y = f(x)$, $y = 0$, $x = a$ and $x = b$ is
\[ \int_a^b f_+(x)\,dx. \]
Similarly, the area below the $x$-axis bounded by $y = f(x)$, $y = 0$, $x = a$ and
$x = b$ is
\[ \int_a^b f_-(x)\,dx. \]
From these, we can finally define the area $A$ bounded by $y = f(x)$, $y = 0$, $x = a$
and $x = b$ as the sum of the area above the $x$-axis and the area below the
$x$-axis bounded by the same curves; i.e.,
\[ A := \int_a^b f_+(x)\,dx + \int_a^b f_-(x)\,dx = \int_a^b |f(x)|\,dx. \]
Theorem 1.3.6. Let $f(x)$ be bounded on $[a, b]$ and let $a < c < b$. Then $f(x)$ is
integrable on $[a, b]$ if and only if $f(x)$ is integrable on $[a, c]$ and on $[c, b]$. In this
case, we have
\[ \int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx. \]
Proof. Assume that $f(x)$ is integrable on $[a, b]$. Let $\epsilon > 0$. Since $f(x)$ is integrable, we can find a partition $P$ with
\[ P = \{x_i : a = x_0 < x_1 < \cdots < x_n = b\} \]
for which
\[ U(f, P) - L(f, P) < \epsilon. \]
By refining if necessary, we can assume that $c = x_k$ for some $0 < k < n$, so
suppose that
\[ P = \{x_i : a = x_0 < x_1 < \cdots < x_k = c < \cdots < x_n = b\}. \]
Now let $P_1 = \{x_i : a = x_0 < x_1 < \cdots < x_k = c\}$ and $P_2 = \{x_i : c = x_k < x_{k+1} < \cdots < x_n = b\}$.
Then we have $U(f, P) = U_a^c(f, P_1) + U_c^b(f, P_2)$ and
$L(f, P) = L_a^c(f, P_1) + L_c^b(f, P_2)$. In particular,
\[
\begin{aligned}
0 \le U_a^c(f, P_1) - L_a^c(f, P_1) &\le U_a^c(f, P_1) - L_a^c(f, P_1) + \left( U_c^b(f, P_2) - L_c^b(f, P_2) \right) \\
 &= U_a^c(f, P_1) + U_c^b(f, P_2) - \left( L_c^b(f, P_2) + L_a^c(f, P_1) \right) \\
 &= U(f, P) - L(f, P) \\
 &< \epsilon,
\end{aligned}
\]
and similarly,
\[
\begin{aligned}
0 \le U_c^b(f, P_2) - L_c^b(f, P_2) &\le U_c^b(f, P_2) - L_c^b(f, P_2) + \left( U_a^c(f, P_1) - L_a^c(f, P_1) \right) \\
 &= U_a^c(f, P_1) + U_c^b(f, P_2) - \left( L_c^b(f, P_2) + L_a^c(f, P_1) \right) \\
 &= U(f, P) - L(f, P) \\
 &< \epsilon.
\end{aligned}
\]
This shows that $f(x)$ is integrable on $[a, c]$ and $[c, b]$ respectively, by theorem
1.2.2.
Now assume the converse: $f(x)$ is integrable on $[a, c]$ and on $[c, b]$. Let $\epsilon > 0$.
Let $P_1 = \{x_i : a = x_0 < x_1 < \cdots < x_k = c\}$ be a partition of $[a, c]$ so that
$U(f, P_1) - L(f, P_1) < \frac{\epsilon}{2}$. Let $P_2 = \{y_i : c = y_0 < y_1 < \cdots < y_{n-k} = b\}$ be a
partition of $[c, b]$ so that $U(f, P_2) - L(f, P_2) < \frac{\epsilon}{2}$. Let $x_i = y_{i-k}$ for all $k \le i \le n$
so that $P = P_1 \cup P_2$ satisfies
\[ P = \{x_i : a = x_0 < x_1 < \cdots < x_k = c < \cdots < x_n = b\}. \]
We now have:
\[
\begin{aligned}
U(f, P) - L(f, P) &= (U(f, P_1) + U(f, P_2)) - (L(f, P_1) + L(f, P_2)) \\
 &= (U(f, P_1) - L(f, P_1)) + (U(f, P_2) - L(f, P_2)) \\
 &< \frac{\epsilon}{2} + \frac{\epsilon}{2} \\
 &= \epsilon.
\end{aligned}
\]
Again by theorem 1.2.2, $f(x)$ is integrable on $[a, b]$.

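Let $P_n$ be the $n$-regular partition of $[a, c]$ and $Q_n$ be the $n$-regular partition
of $[c, b]$. Let $R_n = P_n \cup Q_n$; this is a partition of $[a, b]$. Suppose $f(x)$ is integrable
on $[a, b]$. By applications of corollary 1.2.10, we have that for any Riemann sums
$S_1(f, P_n)$ and $S_2(f, Q_n)$,
\[
\begin{aligned}
\int_a^c f(x)\,dx + \int_c^b f(x)\,dx &= \lim_{n\to\infty} S_1(f, P_n) + \lim_{n\to\infty} S_2(f, Q_n) \\
 &= \lim_{n\to\infty} [S_1(f, P_n) + S_2(f, Q_n)] \\
 &= \lim_{n\to\infty} S(f, R_n) \\
 &= \int_a^b f(x)\,dx,
\end{aligned}
\]
where $S(f, R_n)$ is a Riemann sum obtained by concatenating $S_1$ and $S_2$. (The partition $R_n$ need not be regular; the last equality follows from theorem 1.2.9, since $\|R_n\| \to 0$.)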

We have so far worked with integrals whose lower bound a is strictly less
than its upper bound b; the following definition expands the integral to include
all possible bounds.
Definition 1.3.7. Suppose $f$ is integrable on $[a, b]$, $a < b$.
1. For any real number $c \in [a, b]$, we define
\[ \int_c^c f(x)\,dx := 0. \]
2. We define
\[ \int_b^a f(x)\,dx := -\int_a^b f(x)\,dx. \]
Remark 1.3.8. All of our properties of integrals are still valid with this expanded definition.
For example, we can show that the following theorem is true.
Theorem 1.3.9. Let $f(x)$ be bounded and integrable on an interval $I$ containing
$a$ and $b$, $a \ne b$. Suppose that $m \le f(x) \le M$ for all $x \in I$. Then
\[ m \le \frac{1}{b-a} \int_a^b f(x)\,dx \le M. \]
Proof. We first show that the theorem holds if $a < b$. Indeed, if $P = \{x_i : a = x_0 < x_1 < \cdots < x_n = b\}$
is any partition of $[a, b]$ and $L_a^b(f, P)$ is the corresponding lower
Riemann sum, then we have
\[
\begin{aligned}
m(b - a) &= m \sum_{i=1}^{n} \Delta x_i \\
 &= \sum_{i=1}^{n} m \,\Delta x_i \\
 &\le \sum_{i=1}^{n} m_i \,\Delta x_i \\
 &= L_a^b(f, P) \\
 &\le \int_a^b f(x)\,dx,
\end{aligned}
\]
where
\[ m_i := \inf f([x_{i-1}, x_i]). \]
Similarly, $\int_a^b f(x)\,dx \le M(b - a)$. Dividing these two inequalities by the positive
quantity $b - a$ yields the theorem.
To prove the theorem for the case when $a > b$, we simply note that
\[ \frac{1}{b-a} \int_a^b f(x)\,dx = \frac{1}{a-b} \left( -\int_a^b f(x)\,dx \right) = \frac{1}{a-b} \int_b^a f(x)\,dx \]
by definition 1.3.7. We now apply the first paragraph of the proof to this
equivalent quantity with the roles of $a$ and $b$ reversed and the theorem follows.
Definition 1.3.10. If $f(x)$ is integrable on $[a, b]$, then we define the average
value or the mean value of $f$ on $[a, b]$ as the quantity
\[ \frac{1}{b-a} \int_a^b f(x)\,dx. \]
We now give two intuitive justifications for the terminology "average value" in
definition 1.3.10.
1. Geometric justification. The signed area spanned by the curve of $f$
over $[a, b]$ indicates the sum of all values of $f$ over $[a, b]$. Dividing this
by the length of the interval $[a, b]$ indicates, on average, how much each
value of $f(x)$ contributed to the area:
\[ \frac{1}{b-a} \int_a^b f(x)\,dx. \]
That is to say, the average value of $f$ should accomplish the following:
the area of the rectangle whose base is $[a, b]$ and whose height is the average value is equal
to the area of the region bounded by $f$ whose base is also $[a, b]$. One can
view the average value as the average height of $f$ on $[a, b]$.
2. Sampling. We can sample the values of $f$ at different points and simply
average them (in the finite sense). Then we can let the number of sample
points go to infinity (in a way that we end up sampling each part of
the curve equally). To accomplish this, let $P_n$ be the $n$-regular partition
of $[a, b]$. The average of the values of $f$ at the right-hand endpoints of this
partition is:
\[
\begin{aligned}
\frac{f(x_1) + f(x_2) + \cdots + f(x_n)}{n}
 &= \frac{1}{b-a} \left[ f(x_1) \,\frac{b-a}{n} + f(x_2) \,\frac{b-a}{n} + \cdots + f(x_n) \,\frac{b-a}{n} \right] \\
 &= \frac{1}{b-a} \sum_{i=1}^{n} f(x_i) \,\Delta x_i \\
 &= \frac{1}{b-a} \, S_R(f, P_n).
\end{aligned}
\]
Letting $n \to \infty$ (i.e., letting the number of sample points go to infinity
in a way that different parts of the curve get sampled equally), we obtain
\[ \lim_{n\to\infty} \frac{1}{b-a} \, S_R(f, P_n) = \frac{1}{b-a} \int_a^b f(x)\,dx, \]
by corollary 1.2.10.
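The sampling justification can be tried numerically. The following sketch is not from the notes; the name `sampled_average` is illustrative and floating-point arithmetic is assumed. It averages $f$ at the right-hand endpoints of the $n$-regular partition and compares the result for $f(x) = x^2$ on $[0, 1]$ with the average value $\frac{1}{1-0}\int_0^1 x^2\,dx = \frac{1}{3}$.

```python
def sampled_average(f, a, b, n):
    """Plain average of f at the right-hand endpoints x_1, ..., x_n of the n-regular partition."""
    h = (b - a) / n
    return sum(f(a + i * h) for i in range(1, n + 1)) / n

for n in (10, 100, 1000, 10000):
    # the sample means should approach 1/3 as n grows
    print(n, sampled_average(lambda x: x * x, 0.0, 1.0, n))
```

No division by $b - a$ appears in the code: averaging $n$ equally spaced samples already builds in the factor $\frac{1}{b-a}$, as the computation above shows.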
1.4
The Fundamental Theorem of Calculus
In this section we will prove an important theorem: the fundamental theorem
of calculus. First we will prove something about integral functions, that is, functions
of the form $F(x) = \int_a^x f(t)\,dt$.
Theorem 1.4.1. Assume that $f(t)$ is defined on an interval $I$ containing a
point $a$ and for each $x \in I$, the integral $\int_a^x f(t)\,dt$ exists. Also assume that there
exists an $M \in \mathbb{R}$ such that $|f(t)| \le M$ for all $t \in I$. Then the function
\[ F(x) := \int_a^x f(t)\,dt \]
satisfies
\[ |F(x) - F(y)| \le M |x - y| \]
for all $x, y \in I$.
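Proof. First note that
\[ \left| \int_\alpha^\beta f(t)\,dt \right| = \left| \int_\beta^\alpha f(t)\,dt \right| \]
for all $\alpha, \beta \in I$. With this observation, we have, with the help of theorems 1.3.4
and 1.3.6, that
\[
\begin{aligned}
|F(x) - F(y)| &= \left| \int_a^x f(t)\,dt - \int_a^y f(t)\,dt \right| \\
 &= \left| \int_a^x f(t)\,dt + \int_y^a f(t)\,dt \right| \\
 &= \left| \int_y^x f(t)\,dt \right| \\
 &\le \left| \int_y^x |f(t)|\,dt \right| \\
 &\le \left| \int_y^x M\,dt \right| \\
 &= M |x - y|.
\end{aligned}
\]
The second inequality follows from an exercise in Riemann sums.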
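Corollary 1.4.2. With notation the same as the above theorem, $F(x)$ is uniformly continuous on $I$.

Proof. Let $\epsilon > 0$. We may assume $M > 0$ (otherwise replace $M$ by $M + 1$). Put $\delta = \frac{\epsilon}{M}$. Then if $x, y \in I$ are such that $|x - y| < \delta$, then
we have
\[ |F(x) - F(y)| \le M |x - y| < M \delta = M \cdot \frac{\epsilon}{M} = \epsilon, \]
implying the corollary.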
Question: Is it possible to show that $F(x)$ is also differentiable? If so, what is
$F'(x)$?
The following major theorem answers the above question.
Theorem 1.4.3 [First Fundamental Theorem of Calculus]. Assume
that $f(t)$ is defined on an interval $I$ containing $a$. Also suppose that $\int_a^x f(t)\,dt$
exists for all $x \in I$; define
\[ F(x) := \int_a^x f(t)\,dt. \]
Let $x_0$ be in the interior of $I$ (i.e., there exists a $\delta > 0$ such that $(x_0 - \delta, x_0 + \delta) \subseteq I$).
If $f(t)$ is continuous at $x_0$, then $F(x)$ is differentiable at $x_0$ and
\[ F'(x_0) = f(x_0). \]
Proof. Consider
\[ F(x_0 + h) - F(x_0) = \int_a^{x_0 + h} f(t)\,dt - \int_a^{x_0} f(t)\,dt = \int_{x_0}^{x_0 + h} f(t)\,dt. \]
Given $\epsilon > 0$, since $f(t)$ is continuous at $x_0$, we can find a $\delta > 0$ such that if
$|t - x_0| < \delta$, then $|f(t) - f(x_0)| < \epsilon$. Hence for each $t \in (x_0 - \delta, x_0 + \delta)$, we have
\[ f(x_0) - \epsilon < f(t) < f(x_0) + \epsilon. \tag{1.2} \]
Let $h$ be such that $0 < |h| < \delta$. First note that
\[ \frac{F(x_0 + h) - F(x_0)}{h} = \frac{1}{h} \int_{x_0}^{x_0 + h} f(t)\,dt, \]
and so by inequality (1.2) and theorem 1.3.9, we have that
\[ f(x_0) - \epsilon \le \frac{F(x_0 + h) - F(x_0)}{h} \le f(x_0) + \epsilon. \tag{1.3} \]
For all $h$ satisfying $0 < |h| < \delta$, inequality (1.3) gives
\[ \left| \frac{F(x_0 + h) - F(x_0)}{h} - f(x_0) \right| \le \epsilon. \]
Since $\epsilon > 0$ was arbitrary, this implies that
\[ F'(x_0) = \lim_{h \to 0} \frac{F(x_0 + h) - F(x_0)}{h} = f(x_0) \]
and the proof is complete.
Example 1.6. Let $\mathbb{R}^+ = \{x \in \mathbb{R} : x > 0\} = (0, \infty)$. Let $f : \mathbb{R}^+ \to \mathbb{R}$ be defined
by $f(t) = \frac{1}{t}$. Also, let $F : \mathbb{R}^+ \to \mathbb{R}$ be defined by
\[ F(x) = \int_1^x f(t)\,dt = \int_1^x \frac{1}{t}\,dt. \]
The first fundamental theorem of calculus shows that $F'(x) = \frac{1}{x}$ for all $x > 0$.
We know that if $H(x) = \ln(x)$, then $H'(x) = \frac{1}{x}$ for all $x > 0$. Let $G(x) = F(x) - H(x)$. Then
\[ G'(x) = F'(x) - H'(x) = \frac{1}{x} - \frac{1}{x} = 0. \]
The mean value theorem shows that there exists a $C \in \mathbb{R}$ such that $G(x) = C$
for all $x > 0$; however,
\[ C = G(1) := F(1) - H(1) = \int_1^1 \frac{1}{t}\,dt - \ln(1) = 0 - 0 = 0. \]
We conclude that
\[ \ln(x) = \int_1^x \frac{1}{t}\,dt \]
for all $x \in \mathbb{R}^+$.
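The identity $\ln(x) = \int_1^x \frac{1}{t}\,dt$ can be sanity-checked numerically. The sketch below is an illustration only: it approximates the integral by a midpoint Riemann sum (a choice made here, not something the notes prescribe), the name `log_via_integral` is hypothetical, and the result is compared with the standard library's `math.log`.

```python
import math

def log_via_integral(x, n=100_000):
    """Approximate int_1^x dt/t by a midpoint Riemann sum on the n-regular partition.

    For 0 < x < 1 the convention int_1^x = -int_x^1 (definition 1.3.7) is used.
    """
    if x <= 0:
        raise ValueError("x must be positive")
    lo, hi = (1.0, x) if x >= 1 else (x, 1.0)
    h = (hi - lo) / n
    total = sum(h / (lo + (i + 0.5) * h) for i in range(n))
    return total if x >= 1 else -total

for x in (0.5, 1.0, 2.0, 10.0):
    print(x, log_via_integral(x), math.log(x))
```

The two columns agree to many decimal places, and the value at $x = 1$ is exactly 0, matching $C = G(1) = 0$ above.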
Example 1.7. Evaluate $\frac{d}{dx} \int_0^x e^{t^2}\,dt$.
Note that $e^{t^2}$ is continuous on $\mathbb{R}$, and so the first fundamental theorem of
calculus shows that
\[ \frac{d}{dx} \int_0^x e^{t^2}\,dt = e^{x^2} \]
for all $x \in \mathbb{R}$.
Example 1.8. Evaluate $F'(x)$, where
\[ F(x) := \int_0^{x^2} \sin(t^3)\,dt. \]
Let
\[ H(u) := \int_0^u \sin(t^3)\,dt; \]
then $F(x) = H(x^2)$, whence by the chain rule, $F'(x) = H'(x^2) \,\frac{d}{dx} x^2 = 2x H'(x^2)$.
By the first fundamental theorem of calculus, $H'(u) = \sin(u^3)$, and hence
$H'(x^2) = \sin(x^6)$. We conclude that
\[ F'(x) = 2x H'(x^2) = 2x \sin(x^6). \]
Example 1.9. Let $F : \mathbb{R} \to \mathbb{R}$ be defined by
\[ F(x) = \int_x^{x^2} \sin(t^3)\,dt. \]
Find $F'(x)$.
Note that
\[
\begin{aligned}
F(x) &= \int_x^{x^2} \sin(t^3)\,dt \\
 &= \int_x^0 \sin(t^3)\,dt + \int_0^{x^2} \sin(t^3)\,dt \\
 &= \int_0^{x^2} \sin(t^3)\,dt - \int_0^x \sin(t^3)\,dt,
\end{aligned}
\]
and so by the fundamental theorem of calculus and the previous example, we
have that
\[ F'(x) = 2x \sin(x^6) - \sin(x^3). \]
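A formula of this kind can be checked against a numerical derivative. The sketch below is illustrative and rests on two assumptions that are not in the notes: the integral is approximated by a midpoint Riemann sum, and the derivative by a central difference with a hypothetical step size of $10^{-4}$. All function names are chosen here for the example.

```python
import math

def integral(f, a, b, n=20_000):
    """Midpoint Riemann sum for int_a^b f; also valid when a > b because h is then negative."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

def F(x):
    """F(x) = int_x^{x^2} sin(t^3) dt, computed numerically."""
    return integral(lambda t: math.sin(t ** 3), x, x * x)

def F_prime_formula(x):
    """The derivative obtained from the first fundamental theorem and the chain rule."""
    return 2 * x * math.sin(x ** 6) - math.sin(x ** 3)

def F_prime_numeric(x, step=1e-4):
    """Central-difference approximation of F'(x)."""
    return (F(x + step) - F(x - step)) / (2 * step)

for x in (0.5, 1.0, 1.2):
    print(x, F_prime_numeric(x), F_prime_formula(x))
```

Agreement of the two columns to several decimal places is consistent with the formula, though of course it is evidence rather than proof.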
As seen from the previous examples, the first fundamental theorem of calculus easily lends itself to the following generalization.
Corollary 1.4.4. Assume that $f(t)$, $g(x)$, and $h(x)$ are defined on an interval
$I$. Also suppose that $g(x)$ and $h(x)$ are both differentiable on the interior of $I$,
and that $\int_{g(x)}^{h(x)} f(t)\,dt$ exists for all $x \in I$. Define $F : I \to \mathbb{R}$ by
\[ F(x) := \int_{g(x)}^{h(x)} f(t)\,dt. \]
Let $x_0$ be in the interior of $I$ (i.e., there exists a $\delta > 0$ such that $(x_0 - \delta, x_0 + \delta) \subseteq I$).
If $f(t)$ is continuous at $g(x_0)$ and at $h(x_0)$, and both of these points lie in the interior of $I$, then $F(x)$ is differentiable at $x_0$ and
\[ F'(x_0) = f(h(x_0)) \,h'(x_0) - f(g(x_0)) \,g'(x_0). \]
Proof. This proof is left as an exercise. (Hint: Mimic the procedure in the
previous examples.)
Definition 1.4.5. Suppose that $f(x)$ is defined on an open interval containing
an interval $I$. We say that $F(x)$ is an antiderivative of $f(x)$ on $I$ if $F'(x) = f(x)$
for all $x \in I$.
Example 1.10. If $f(x) = x$, then $F(x) = \frac{1}{2} x^2$ is an antiderivative of $f(x)$
on $\mathbb{R}$.
Observation. Recall that if $F(x)$, $G(x)$ are continuous on an interval $[a, b]$ and
are antiderivatives of $f(x)$ on $(a, b)$, where $a < b$, then $H(x) := F(x) - G(x)$ is
such that $H'(x) = F'(x) - G'(x) = 0$ for all $x \in (a, b)$. Since $H(x)$ is continuous on
$[a, b]$, the mean value theorem shows that $H(x) = C \in \mathbb{R}$ is a constant for all
$x \in [a, b]$; therefore, all antiderivatives of $f(x)$ on $(a, b)$ are of the form
\[ F(x) + C. \]
Conversely, all such functions are antiderivatives of $f(x)$ on $(a, b)$. Hence if we
find one antiderivative of $f(x)$ on $(a, b)$, we have found all of them. In general,
if $F(x)$ is an antiderivative of $f(x)$ on $(a, b)$, we write
\[ \int f(x)\,dx = F(x) + C. \tag{1.4} \]
The integral on the left side of (1.4) is called an indefinite integral and
represents the family of antiderivatives of $f(x)$ on $(a, b)$. The $f(x)$ is, as before,
called the integrand and the $dx$ refers to the variable of (indefinite) integration.
The following theorem, the second fundamental theorem of calculus, relates the
indefinite integral to the definite integral and explains their similar notation.
Theorem 1.4.6 [Second Fundamental Theorem of Calculus]. Assume
that $f(x)$ is defined on an interval $[a, b]$, $a < b$. Assume that $f(x)$ is continuous
on $[a, b]$ and differentiable on $(a, b)$. If $G(x)$ is continuous on $[a, b]$ and is an
antiderivative of $f(x)$ on $(a, b)$, then
\[ \int_a^b f(x)\,dx = G(b) - G(a). \]
Proof. First note that $f(x)$ is integrable on $[a, b]$ by theorem 1.2.6. So let $F : [a, b] \to \mathbb{R}$ be defined by
$$F(x) = \int_a^x f(t)\,dt.$$
The first fundamental theorem of calculus shows that $F'(x) = f(x)$ on $(a, b)$; i.e., $F(x)$ is an antiderivative of $f(x)$ on $(a, b)$. In observation 5, we noted (by the mean value theorem) that $G(x) = F(x) + C$ on $[a, b]$ for some $C \in \mathbb{R}$. We now have
$$\begin{aligned}
G(b) - G(a) &= [F(b) + C] - [F(a) + C] \\
&= F(b) - F(a) \\
&:= \int_a^b f(x)\,dx - \int_a^a f(x)\,dx \\
&= \int_a^b f(x)\,dx - 0 \\
&= \int_a^b f(x)\,dx,
\end{aligned}$$
as required.
Definition 1.4.7. Suppose $f : S \to \mathbb{R}$ with $S \subseteq \mathbb{R}$. If $a, b \in S$, then we define the quantity $f(x)|_a^b$ as follows:
$$f(x)|_a^b := f(b) - f(a).$$
With this definition, the second fundamental theorem of calculus can be
rewritten as follows:
$$\int_a^b f(x)\,dx = \left. \int f(x)\,dx \,\right|_a^b.$$
The second fundamental theorem allows us to compute exact numeric values for certain definite integrals. Let us return to the first problem posed in the first section: What is the area bounded by the curve $y = x^2$ and the lines $y = 0$, $x = 0$, $x = 1$?
Problem. Find $\int_0^1 x^2\,dx$.
Solution. Note that $\int x^2\,dx = \frac{1}{3}x^3 + C$ is an antiderivative. Let $G(x) = \frac{1}{3}x^3$. By the second fundamental theorem of calculus, we have that
$$\int_0^1 x^2\,dx = G(x)|_0^1 := G(1) - G(0) = \frac{1}{3} - 0 = \frac{1}{3},$$
as expected.
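As a sanity check on the computation above, the following minimal sketch compares $G(1) - G(0)$ with a right-hand Riemann sum of $x^2$ on $[0, 1]$. The helper name `right_riemann_sum` and the choice of 10,000 subintervals are assumptions made for illustration only; they are not part of the notes.

```python
def right_riemann_sum(f, a, b, n):
    """Right-endpoint Riemann sum of f on [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + i * dx) for i in range(1, n + 1)) * dx


def G(x):
    # The antiderivative of x^2 chosen in the solution above.
    return x**3 / 3


exact = G(1) - G(0)  # value given by the second fundamental theorem
approx = right_riemann_sum(lambda x: x**2, 0.0, 1.0, 10_000)
print(exact, approx)
```

The right-hand sum overestimates the area slightly, as in Example 1.1, and the gap shrinks as the number of subintervals grows.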
Remark 1.4.8. Here is a list of common antiderivatives. The integrands are
defined on their natural domains and the antiderivatives are valid on those
domains.
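1. $\int c\,dx = cx + C$,
2. $\int x^{\alpha}\,dx = \frac{1}{\alpha + 1}x^{\alpha + 1} + C$ for all $\alpha \neq -1$,
3. $\int x^{-1}\,dx = \int \frac{1}{x}\,dx = \ln(|x|) + C$,
4. $\int \sin(x)\,dx = -\cos(x) + C$,
5. $\int \cos(x)\,dx = \sin(x) + C$,
6. $\int e^x\,dx = e^x + C$,
7. $\int \frac{1}{\sqrt{1 - x^2}}\,dx = \arcsin(x) + C$,
8. $\int \frac{-1}{\sqrt{1 - x^2}}\,dx = \arccos(x) + C$,
9. $\int \sec^2(x)\,dx = \tan(x) + C$,
10. $\int \tan(x)\,dx = \ln(|\sec(x)|) + C$,
11. $\int \ln(x)\,dx = x(\ln(x) - 1) + C$.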
Proof. We will check the last two items and leave the rest as exercises.
10. Note that
$$\frac{d}{dx}\left[\ln(|\sec(x)|) + C\right] = \frac{1}{|\sec(x)|}\frac{d}{dx}|\sec(x)| + 0 = \frac{1}{|\sec(x)|}\frac{d}{dx}|\sec(x)|$$
by the chain rule. For $x \in \left(-\frac{\pi}{2} + 2n\pi, \frac{\pi}{2} + 2n\pi\right)$ ($n$ any integer), $\sec(x) > 0$, and so
$$\frac{1}{|\sec(x)|}\frac{d}{dx}|\sec(x)| = \frac{1}{\sec(x)}\frac{d}{dx}\sec(x) = \frac{1}{\sec(x)}\sec(x)\tan(x) = \tan(x),$$
as claimed. The case where $x \in \left(\frac{\pi}{2} + 2n\pi, \frac{3\pi}{2} + 2n\pi\right)$ is similar.
11. Note that by the product rule,
$$\frac{d}{dx}[x\ln(x) - x + C] = 1 \cdot \ln(x) + x \cdot \frac{1}{x} - 1 + 0 = \ln(x) + 1 - 1 = \ln(x)$$
for all $x > 0$, as required.
Note that indefinite integration satisfies the linearity property just as derivatives do. That is, $\int (f(x) + g(x))\,dx = \int f(x)\,dx + \int g(x)\,dx$ and $\int c f(x)\,dx = c\int f(x)\,dx$, for all continuous $f(x)$, $g(x)$ and $c \in \mathbb{R}$. This is due to the linearity of the derivative.
1.5
Substitution and the Change of Variables Theorem
1.5.1
Substitution
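Example 1.11. Evaluate $\int xe^{x^2}\,dx$.
Observe that $\int xe^{x^2}\,dx = \frac{1}{2}\int (2x)e^{x^2}\,dx = \frac{1}{2}(e^{x^2} + C_0) = \frac{1}{2}e^{x^2} + C$, where $C = \frac{1}{2}C_0$. To see this, note that the chain rule shows $\frac{d}{dx}\left[\frac{1}{2}e^{x^2} + C\right] = \frac{1}{2} \cdot 2xe^{x^2} = xe^{x^2}$.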
Observation. Suppose that $f(x)$ and $g(x)$ are defined on an open interval $I$ and differentiable on $I$. Recall that the chain rule states
$$\frac{d}{dx}f(g(x)) = f'(g(x))g'(x),$$
valid on $I$. Now we have that
$$\int \frac{d}{dx}f(g(x))\,dx = \int f'(g(x))g'(x)\,dx$$
and hence
$$f(g(x)) + C = \int f'(g(x))g'(x)\,dx.$$
Also note that
$$\int f'(u)\,du = f(u) + C.$$
Let $u = g(x)$; we get that
$$\int f'(g(x))g'(x)\,dx = f(g(x)) + C = f(u) + C = \int f'(u)\,du.$$
This suggests that if $u = g(x)$, then $du = g'(x)\,dx$.
In this section, we will attempt to find antiderivatives by this technique,
known as (indefinite) integration by substitution. It is, in essence, the chain rule
backwards.
Sample Question 1. Evaluate $\int 2xe^{x^2}\,dx$.
Strategy: We want to put this integral into the form $\int f'(g(x))g'(x)\,dx$. To do this, we try to match each factor of $2xe^{x^2}$ to a factor of $f'(g(x))g'(x)$. The most promising match looks like $f'(u) = e^u$ with $u = g(x) := x^2$, where $g'(x) = 2x$. In other words, we try to find a quantity in the variable $x$ (call it $u$) such that its differential $du$ absorbs most of the difficult parts of the integrand; after this absorption, what remains of the integral should be expressible in terms of the quantity $u$ only. With luck, whatever remains is easier to integrate.
In this question, if we choose our magic quantity to be $u = x^2$, then notice that $du = 2x\,dx$ absorbs a large chunk of the integrand, so that what remains is $e^{x^2}\,du$. Now $e^{x^2}$ is just $e^u$, and so the integral turns into $\int e^u\,du$, a trivial integral. Usually, this magic quantity $u$ can be found inside the integrand itself; in this case, it is found in the exponent of $e$. Do not forget to re-express the result in terms of the original variable by substituting the magic quantity $u$ back into the result.
Sample Solution 1. Let $u = x^2$. We get that $du = 2x\,dx$. After substituting $u = x^2$ and $du = 2x\,dx$ into $\int 2xe^{x^2}\,dx$, we get that
$$\int 2xe^{x^2}\,dx = \int e^u\,du = e^u + C.$$
We can now substitute $u = x^2$ back into our new antiderivative to get
$$\int 2xe^{x^2}\,dx = e^u + C = e^{x^2} + C.$$
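Since substitution is the chain rule run backwards, a result obtained this way can always be checked by differentiating it. The sketch below does this numerically for Sample Solution 1, using a central difference quotient; the function names and the step size `h` are illustrative assumptions rather than anything prescribed by the notes.

```python
import math


def antiderivative(x):
    # Result of Sample Solution 1 with C = 0.
    return math.exp(x * x)


def integrand(x):
    return 2 * x * math.exp(x * x)


def central_difference(F, x, h=1e-6):
    """Approximate F'(x) by the symmetric difference quotient."""
    return (F(x + h) - F(x - h)) / (2 * h)


sample_points = [-1.0, -0.5, 0.0, 0.5, 1.0]
max_error = max(
    abs(central_difference(antiderivative, x) - integrand(x))
    for x in sample_points
)
print(max_error)
```

A small `max_error` is consistent with $\frac{d}{dx}e^{x^2} = 2xe^{x^2}$; it is evidence at the sampled points, not a proof.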
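Example 1.12. Evaluate $\int \tan(x)\,dx := \int \frac{\sin(x)}{\cos(x)}\,dx$. Let $u = \cos(x)$, whence $du = -\sin(x)\,dx$, or $\frac{du}{-\sin(x)} = dx$. Now we have that
$$\begin{aligned}
\int \frac{\sin(x)}{\cos(x)}\,dx &= \int \frac{\sin(x)}{u} \cdot \frac{du}{-\sin(x)} \\
&= -\int \frac{1}{u}\,du \\
&= -(\ln(|u|) + C_0) \\
&= -\ln(|u|) + C \quad (\text{where } C = -C_0) \\
&= \ln\left(\frac{1}{|u|}\right) + C \\
&= \ln\left(\left|\frac{1}{\cos(x)}\right|\right) + C \\
&= \ln(|\sec(x)|) + C.
\end{aligned}$$
Example 1.13. Evaluate $\int \frac{x}{x+1}\,dx$.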
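Note that
$$\begin{aligned}
\int \frac{x}{x+1}\,dx &= \int \frac{x+1}{x+1}\,dx - \int \frac{1}{x+1}\,dx \\
&= \int 1\,dx - \int \frac{1}{x+1}\,dx.
\end{aligned}$$
We know that $\int 1\,dx = x + C_0$. Now let $u = x + 1$, whence $du = dx$ and
$$\int \frac{1}{x+1}\,dx = \int \frac{1}{u}\,du = \ln(|u|) + C_1 = \ln(|x+1|) + C_1.$$
We conclude that
$$\int \frac{x}{x+1}\,dx = (x + C_0) - (\ln(|x+1|) + C_1) = x - \ln(|x+1|) + C,$$
where $C = C_0 - C_1$.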
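An alternate solution is to let $u = x + 1$ right away to obtain
$$\begin{aligned}
\int \frac{x}{x+1}\,dx &= \int \frac{u-1}{u}\,du \\
&= \int \left(1 - \frac{1}{u}\right) du \\
&= \int 1\,du - \int \frac{1}{u}\,du \\
&= (u + C_0) - (\ln(|u|) + C_1) \\
&= x - \ln(|x+1|) + 1 + C_0 - C_1 \\
&= x - \ln(|x+1|) + C,
\end{aligned}$$
where $C = 1 + C_0 - C_1$.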
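Example 1.14. Evaluate $\int \sec(\theta)\,d\theta$.
We make the not-so-trivial observation that
$$\begin{aligned}
\int \sec(\theta)\,d\theta &= \int \sec(\theta)\,\frac{\sec(\theta) + \tan(\theta)}{\sec(\theta) + \tan(\theta)}\,d\theta \\
&= \int \frac{\sec^2(\theta) + \sec(\theta)\tan(\theta)}{\sec(\theta) + \tan(\theta)}\,d\theta. \qquad (1.5)
\end{aligned}$$
We make the substitution $u = \sec(\theta) + \tan(\theta)$, and so $du = [\sec(\theta)\tan(\theta) + \sec^2(\theta)]\,d\theta$. Now note that the entire numerator of (1.5) is absorbed into $du$ and the entire denominator becomes $u$, so that we obtain
$$\begin{aligned}
\int \sec(\theta)\,d\theta &= \int \frac{\sec^2(\theta) + \sec(\theta)\tan(\theta)}{\sec(\theta) + \tan(\theta)}\,d\theta \\
&= \int \frac{1}{u}\,du \\
&= \ln(|u|) + C \\
&= \ln(|\sec(\theta) + \tan(\theta)|) + C.
\end{aligned}$$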
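Example 1.15. Let $f : [-1, 1] \to \mathbb{R}$ be defined by $f(x) = \sqrt{1 - x^2}$. Evaluate
$$\int f(x)\,dx = \int \sqrt{1 - x^2}\,dx,$$
valid on $[-1, 1]$.
The solution to this type of problem involves an inverse trigonometric substitution, which will be explored in more detail in a later section. Let $\theta = \arcsin(x)$. Since $x \in [-1, 1]$, $\theta \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$ and so $\cos(\theta) \geq 0$ on this interval. We have that $x = \sin(\theta)$, and so $dx = \cos(\theta)\,d\theta$. We now have that $\sqrt{1 - x^2} = \sqrt{1 - \sin^2(\theta)} = \sqrt{\cos^2(\theta)} = |\cos(\theta)| = \cos(\theta)$. Putting these together, we have
$$\begin{aligned}
\int \sqrt{1 - x^2}\,dx &= \int \cos(\theta)\cos(\theta)\,d\theta \\
&= \int \cos^2(\theta)\,d\theta \\
&= \int \frac{\cos(2\theta) + 1}{2}\,d\theta \\
&= \frac{1}{2}\int \cos(2\theta)\,d\theta + \frac{1}{2}\int 1\,d\theta \\
&= \frac{1}{2} \cdot \frac{\sin(2\theta)}{2} + \frac{1}{2}\theta + C \\
&= \frac{1}{4}\sin(2\theta) + \frac{\theta}{2} + C \\
&= \frac{1}{4}\sin(2\arcsin(x)) + \frac{\arcsin(x)}{2} + C.
\end{aligned}$$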
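Checking, we have
$$\begin{aligned}
\frac{d}{dx}\left[\frac{1}{4}\sin(2\arcsin(x)) + \frac{\arcsin(x)}{2} + C\right]
&= \frac{1}{4}\cos(2\arcsin(x)) \cdot \frac{2}{\sqrt{1 - x^2}} + \frac{1}{2\sqrt{1 - x^2}} \\
&= \frac{1}{2\sqrt{1 - x^2}}\left(\cos(2\arcsin(x)) + 1\right) \\
&= \frac{1}{2\sqrt{1 - x^2}}\left(2\cos^2(\arcsin(x)) - 1 + 1\right) \\
&= \frac{1}{\sqrt{1 - x^2}}\cos^2(\arcsin(x)). \qquad (1.6)
\end{aligned}$$
But observe that
$$\cos^2(\arcsin(x)) + x^2 = \cos^2(\arcsin(x)) + \sin^2(\arcsin(x)) = 1,$$
and so
$$\cos^2(\arcsin(x)) = 1 - x^2.$$
Continuing from (1.6) above, we have that
$$\frac{d}{dx}\left[\frac{1}{4}\sin(2\arcsin(x)) + \frac{\arcsin(x)}{2} + C\right] = \frac{1 - x^2}{\sqrt{1 - x^2}} = \sqrt{1 - x^2},$$
and we are done.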
1.5.2
Change of Variables Theorem
How does substitution work for definite integrals? What happens to the bounds
of integration after we attempt to make a substitution?
Definition 1.5.1. Suppose $g(x)$ is defined on an interval $[a, b]$. We say that $g(x)$ is continuously differentiable on $[a, b]$ if $g(x)$ is differentiable on $(a, b)$, $g'(x)$ is continuous on $(a, b)$, and the one-sided limits $\lim_{x \to a^+} g'(x)$ and $\lim_{x \to b^-} g'(x)$ exist.
Theorem 1.5.2 [Change of Variables]. Assume that $g(x)$ is defined and continuous on an interval $[a, b]$, $a < b$. Suppose also that $g(x)$ is continuously differentiable on $[a, b]$. Suppose also that $f(x)$ is defined and continuous on $g([a, b])$. Then
$$\int_a^b f(g(x))g'(x)\,dx = \int_{g(a)}^{g(b)} f(u)\,du,$$
where $g'(x)$ has been extended continuously to $[a, b]$.

Proof. Note that $g([a, b])$ is a closed, bounded interval by the extreme value theorem. Also note that theorem 1.2.6 shows that $f(u)$ is integrable on $g([a, b])$; let $F : I \to \mathbb{R}$ be defined by
$$F(t) = \int_{g(a)}^{t} f(u)\,du.$$
By the first fundamental theorem of calculus, $F(t)$ is differentiable on $I = g([a, b])$ with $F'(t) = f(t)$ on $I = g([a, b])$. Let $H : [a, b] \to \mathbb{R}$ be defined by
$$H(x) = F(g(x)) = \int_{g(a)}^{g(x)} f(u)\,du.$$
Since $g(x)$ is differentiable on $(a, b)$, the chain rule shows that for all $x \in (a, b)$, we have that
$$H'(x) = F'(g(x))g'(x) = f(g(x))g'(x).$$
That is, $H(x)$ is an antiderivative of $f(g(x))g'(x)$ on $(a, b)$. Extend $g'(x)$ continuously to $[a, b]$ so that $f(g(x))g'(x)$ is now continuous on $[a, b]$.
Note that H(x) = F (g(x)) is continuous on [a, b]. To see this, first note that
g(x) is continuous on [a, b]. Next, note that by the extreme value theorem, f (x)
is bounded on g([a, b]), whence corollary 1.4.2 shows that F (t) is (uniformly)
continuous on g([a, b]). The composition of F (t) with g(x) will therefore be
continuous on [a, b].
All conditions of the second fundamental theorem of calculus are now satisfied. By that theorem,
$$H(b) - H(a) = \int_a^b f(g(x))g'(x)\,dx;$$
however, we also have
$$H(b) - H(a) := \int_{g(a)}^{g(b)} f(u)\,du - \int_{g(a)}^{g(a)} f(u)\,du = \int_{g(a)}^{g(b)} f(u)\,du.$$
We conclude that
$$\int_a^b f(g(x))g'(x)\,dx = \int_{g(a)}^{g(b)} f(u)\,du.$$
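Example 1.16. Evaluate $\int_e^{e^2} \frac{1}{x\ln(x)}\,dx$.
Let $g(x) = \ln(x)$. Then $g'(x) = \frac{1}{x}$; the change of variables theorem and the second fundamental theorem of calculus show that
$$\begin{aligned}
\int_e^{e^2} \frac{1}{x\ln(x)}\,dx &= \int_{g(e)}^{g(e^2)} \frac{1}{u}\,du \\
&= \int_{\ln(e)}^{\ln(e^2)} \frac{1}{u}\,du \\
&= \int_1^2 \frac{1}{u}\,du \\
&= \ln(|u|)|_1^2 \\
&= \ln(2) - \ln(1) \\
&= \ln(2).
\end{aligned}$$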
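The change of variables theorem says that the two definite integrals in Example 1.16 have the same value even though their integrands and bounds differ. The sketch below approximates both sides with a midpoint rule; `midpoint_rule` is a hypothetical helper introduced here for illustration, and the number of subintervals is an arbitrary choice.

```python
import math


def midpoint_rule(f, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


# Left side: integrate f(g(x)) g'(x) = 1 / (x ln x) over [e, e^2].
lhs = midpoint_rule(lambda x: 1 / (x * math.log(x)), math.e, math.e**2)

# Right side: integrate f(u) = 1 / u over [g(e), g(e^2)] = [1, 2].
rhs = midpoint_rule(lambda u: 1 / u, 1.0, 2.0)

print(lhs, rhs, math.log(2))
```

Both approximations should agree with $\ln(2)$ to several decimal places, which matches the exact computation above.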
1.6
Integration by Parts
Recall that the product rule states that
$$\frac{d}{dx}(fg) = \frac{df}{dx}g + \frac{dg}{dx}f = f'(x)g(x) + g'(x)f(x).$$
Hence for differentiable functions $f(x)$ and $g(x)$, we have that
$$f(x)g(x) + C = \int \frac{d}{dx}(f(x)g(x))\,dx = \int f'(x)g(x)\,dx + \int g'(x)f(x)\,dx,$$
and thus we get the integration by parts formula
$$\int g'(x)f(x)\,dx = f(x)g(x) - \int f'(x)g(x)\,dx.$$
(Or equivalently, we can write it this way:
$$\int f'(x)g(x)\,dx = f(x)g(x) - \int g'(x)f(x)\,dx.)$$
The constant $C$ is accounted for by the indefinite integration sign.
Just as integration by substitution was the reversal of the chain rule, this
technique is the reversal of the product rule.
Sample Question 2. Evaluate $\int xe^x\,dx$.
Strategy: We want to rid the integrand of the $x$. Integration by parts is ideal for this maneuver. Our way of removing the $x$ is to differentiate it downwards to produce the constant 1. As a compensation, we must integrate the other factor $e^x$ upwards; however, integrating $e^x$ is not difficult at all. The integration by parts formula tells us that
$$\int xe^x\,dx = xe^x - \int 1 \cdot e^x\,dx = xe^x - e^x + C = (x - 1)e^x + C.$$
Sample Solution 2. Let $f(x) = x$, $g(x) = e^x$. Then $f'(x) = 1$ and $g'(x) = e^x$. Integrating by parts, we have
$$\begin{aligned}
\int xe^x\,dx = \int f(x)g'(x)\,dx &= f(x)g(x) - \int f'(x)g(x)\,dx \\
&= xe^x - \int 1 \cdot e^x\,dx \\
&= xe^x - e^x + C \\
&= (x - 1)e^x + C.
\end{aligned}$$
Sample Question 3. Evaluate $\int x^2\sin(x)\,dx$.
Strategy: As before, the integrand is a product of two factors, one of which is easy to remove by bringing it downwards by differentiation. On the other hand, we must bring the other factor, $\sin(x)$, upwards by integration. Integrating $\sin(x)$ is not difficult. Once we are done, we end up with this integral:
$$\int 2x(-\cos(x))\,dx.$$
There are still two factors in the integrand. It looks like we need to apply the same technique one more time: We differentiate $2x$ downwards and integrate $-\cos(x)$ upwards again, finally eliminating the original factor $x^2$. We get
$$\int 2(-\sin(x))\,dx.$$
This integral is easy to evaluate.
Sample Solution 3. Let $f_1(x) = x^2$ and $g_1(x) = -\cos(x)$. Then $f_1'(x) = 2x$ and $g_1'(x) = \sin(x)$. Integrating by parts, we get
$$\begin{aligned}
\int x^2\sin(x)\,dx = \int f_1(x)g_1'(x)\,dx &= f_1(x)g_1(x) - \int f_1'(x)g_1(x)\,dx \\
&= x^2(-\cos(x)) - \int 2x(-\cos(x))\,dx.
\end{aligned}$$
Let $f_2(x) = 2x$ and $g_2(x) = -\sin(x)$. Then $f_2'(x) = 2$ and $g_2'(x) = -\cos(x)$. Integrating by parts again, we get
$$\begin{aligned}
\int 2x(-\cos(x))\,dx = \int f_2(x)g_2'(x)\,dx &= f_2(x)g_2(x) - \int f_2'(x)g_2(x)\,dx \\
&= 2x(-\sin(x)) - \int 2(-\sin(x))\,dx \\
&= -2x\sin(x) - 2\cos(x) + C_0.
\end{aligned}$$
Putting these together, we have
$$\begin{aligned}
\int x^2\sin(x)\,dx &= x^2(-\cos(x)) - \int 2x(-\cos(x))\,dx \\
&= -x^2\cos(x) - [-2x\sin(x) - 2\cos(x) + C_0] \\
&= -x^2\cos(x) + 2x\sin(x) + 2\cos(x) + C
\end{aligned}$$
(where $C = -C_0$).
Sample Question 4. Evaluate $\int e^x\sin(x)\,dx$.
Strategy: It does not appear that we can eliminate either of the two factors through integration or differentiation, because $e^x$ and $\sin(x)$ never vanish after some differentiation or integration; however, we can use this fact to our advantage. Note that $e^x$ is invariant under integration and $\sin(x)$ is cyclic under differentiation. That is, $e^x$ does not change when integrated, and $\sin(x)$ first becomes $\cos(x)$ and then becomes $-\sin(x)$ when differentiated. To demonstrate, after the first integration by parts, our integral transforms into
$$\int e^x\cos(x)\,dx.$$
After the second, we get
$$\int e^x(-\sin(x))\,dx = -\int e^x\sin(x)\,dx;$$
we got back the negative of our original integral. This means we can write the integral in question in terms of itself and the other items generated along the way, whence we do some algebra.
Observe the sample solution below.
Sample Solution 4. Let $f_1(x) = \sin(x)$ and $g(x) = e^x$. Then $f_1'(x) = \cos(x)$ and $g'(x) = e^x$. Integrating by parts, we get
$$\begin{aligned}
\int e^x\sin(x)\,dx = \int g'(x)f_1(x)\,dx &= g(x)f_1(x) - \int g(x)f_1'(x)\,dx \\
&= e^x\sin(x) - \int e^x\cos(x)\,dx.
\end{aligned}$$
Let $f_2(x) = \cos(x)$. Then $f_2'(x) = -\sin(x)$. Integrating by parts again, we get
$$\begin{aligned}
\int e^x\cos(x)\,dx = \int g'(x)f_2(x)\,dx &= e^x\cos(x) - \int g(x)f_2'(x)\,dx \\
&= e^x\cos(x) - \int e^x(-\sin(x))\,dx \\
&= e^x\cos(x) + \int e^x\sin(x)\,dx.
\end{aligned}$$
Putting these together, we get
$$\begin{aligned}
\int e^x\sin(x)\,dx &= e^x\sin(x) - \int e^x\cos(x)\,dx \\
&= e^x\sin(x) - \left[e^x\cos(x) + \int e^x\sin(x)\,dx\right] \\
&= e^x\sin(x) - e^x\cos(x) - \int e^x\sin(x)\,dx.
\end{aligned}$$
Moving $\int e^x\sin(x)\,dx$ to the left side, we obtain
$$\int e^x\sin(x)\,dx + \int e^x\sin(x)\,dx = e^x\sin(x) - e^x\cos(x),$$
which implies
$$2\int e^x\sin(x)\,dx = e^x\sin(x) - e^x\cos(x) + C_0.$$
We conclude that
$$\int e^x\sin(x)\,dx = \frac{e^x\sin(x) - e^x\cos(x)}{2} + C$$
(where $C = \frac{C_0}{2}$).
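Each integration by parts result in this section can be checked by differentiating the claimed antiderivative and comparing it with the integrand. The following sketch does so numerically for the integrals $\int xe^x\,dx$, $\int x^2\sin(x)\,dx$, and $\int e^x\sin(x)\,dx$; the helper `central_difference`, the sample points, and the step size are assumptions made for illustration.

```python
import math


def central_difference(F, x, h=1e-5):
    """Approximate F'(x) by the symmetric difference quotient."""
    return (F(x + h) - F(x - h)) / (2 * h)


# Each pair is (integrand, claimed antiderivative with C = 0).
pairs = [
    (lambda x: x * math.exp(x),
     lambda x: (x - 1) * math.exp(x)),
    (lambda x: x**2 * math.sin(x),
     lambda x: -x**2 * math.cos(x) + 2 * x * math.sin(x) + 2 * math.cos(x)),
    (lambda x: math.exp(x) * math.sin(x),
     lambda x: (math.exp(x) * math.sin(x) - math.exp(x) * math.cos(x)) / 2),
]

sample_points = [-2.0, -0.5, 0.0, 1.0, 2.5]
worst = max(
    abs(central_difference(F, x) - f(x))
    for f, F in pairs
    for x in sample_points
)
print(worst)
```

This kind of check is especially useful for catching sign errors, which are easy to make when the formula is applied twice in a row.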
Example 1.17. Evaluate $\int \ln(x)\,dx$ (valid on $(0, \infty)$).
There is an invisible second factor in the integrand: $1 \cdot \ln(x)$. With this insight, we let $f(x) = x$ and $g(x) = \ln(x)$. Then $f'(x) = 1$ and $g'(x) = \frac{1}{x}$. Integrating by parts, we get
$$\begin{aligned}
\int \ln(x)\,dx = \int f'(x)g(x)\,dx &= f(x)g(x) - \int g'(x)f(x)\,dx \\
&= x\ln(x) - \int \frac{1}{x} \cdot x\,dx \\
&= x\ln(x) - \int 1\,dx \\
&= x\ln(x) - x + C.
\end{aligned}$$
Checking, we have that
$$\frac{d}{dx}[x\ln(x) - x + C] = 1 \cdot \ln(x) + x \cdot \frac{1}{x} - 1 + 0 = \ln(x) + 1 - 1 = \ln(x).$$
This indefinite integral is valid on $(0, \infty)$, as required.
Example 1.18. Evaluate $\int \arctan(x)\,dx$.
Using the same technique, we try to eliminate $\arctan(x)$ by differentiation, while at the same time integrating the invisible factor 1. Let $f(x) = x$ and $g(x) = \arctan(x)$. Then $f'(x) = 1$ and $g'(x) = \frac{1}{1 + x^2}$. Integrating by parts, we obtain
$$\begin{aligned}
\int \arctan(x)\,dx = \int f'(x)g(x)\,dx &= f(x)g(x) - \int f(x)g'(x)\,dx \\
&= x\arctan(x) - \int \frac{x}{1 + x^2}\,dx.
\end{aligned}$$
Making the substitution $u = 1 + x^2$, we find $du = 2x\,dx$ and so we now have
$$\begin{aligned}
\int \frac{x}{1 + x^2}\,dx &= \frac{1}{2}\int \frac{2x}{1 + x^2}\,dx \\
&= \frac{1}{2}\int \frac{1}{u}\,du \\
&= \frac{1}{2}\ln(|u|) + C_0 \\
&= \frac{1}{2}\ln(|1 + x^2|) + C_0.
\end{aligned}$$
Putting these together, we have that
$$\begin{aligned}
\int \arctan(x)\,dx &= x\arctan(x) - \int \frac{x}{1 + x^2}\,dx \\
&= x\arctan(x) - \left[\frac{1}{2}\ln(|1 + x^2|) + C_0\right] \\
&= x\arctan(x) - \frac{1}{2}\ln(|1 + x^2|) + C
\end{aligned}$$
(where $C = -C_0$).
Below are some indefinite integrals that integration by parts will solve particularly well:
$\int \arcsin(x)\,dx$,
$\int \arccos(x)\,dx$,
$\int x^n\sin(x)\,dx$,
$\int x^n\cos(x)\,dx$,
$\int x^ne^x\,dx$,
$\int \cos(x)e^x\,dx$,
$\int \sin(x)e^x\,dx$.
1.7
Partial Fractions
In this section, we will look at how we would integrate rational functions, that is, functions of the form $f(x) = \frac{p(x)}{q(x)}$, where $p(x)$ and $q(x) \neq 0$ are both polynomials.
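Problem. Evaluate $\int \frac{1}{x^2 - 1}\,dx$.
Solution. Note that
$$\int \frac{1}{x^2 - 1}\,dx = \int \frac{1}{(x + 1)(x - 1)}\,dx,$$
and so if there exist some $A, B \in \mathbb{R}$ such that $\frac{1}{(x+1)(x-1)} = \frac{A}{x+1} + \frac{B}{x-1}$, then we would have
$$\begin{aligned}
\int \frac{1}{x^2 - 1}\,dx &= \int \frac{1}{(x + 1)(x - 1)}\,dx \\
&= \int \left(\frac{A}{x+1} + \frac{B}{x-1}\right) dx \\
&= \int \frac{A}{x+1}\,dx + \int \frac{B}{x-1}\,dx \\
&= A\ln(|x + 1|) + B\ln(|x - 1|) + C.
\end{aligned}$$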
Do such $A$ and $B$ exist? To find them, we write
$$\frac{1}{(x + 1)(x - 1)} = \frac{A}{x+1} + \frac{B}{x-1} = \frac{A(x - 1) + B(x + 1)}{(x + 1)(x - 1)}, \qquad (1.7)$$
whence
$$1 = A(x - 1) + B(x + 1).$$
Setting $x = 1$, we obtain $1 = 2B$, which means $B = \frac{1}{2}$. Setting $x = -1$, we obtain $1 = -2A$, which means $A = -\frac{1}{2}$. Note that this method is not yet justified, since we would be dividing by 0 if we set $x = 1$ or $x = -1$ at equation (1.7); however, we can justify this procedure in many ways, and we leave the details as an exercise. Of course, we can always just check that these $A$ and $B$ do indeed satisfy (1.7):
$$\frac{-\frac{1}{2}}{x+1} + \frac{\frac{1}{2}}{x-1} = \frac{-\frac{1}{2}(x - 1) + \frac{1}{2}(x + 1)}{(x + 1)(x - 1)} = \frac{-\frac{1}{2}x + \frac{1}{2} + \frac{1}{2}x + \frac{1}{2}}{(x + 1)(x - 1)} = \frac{1}{x^2 - 1}.$$
The solution above therefore yields
$$\begin{aligned}
\int \frac{1}{x^2 - 1}\,dx &= A\ln(|x + 1|) + B\ln(|x - 1|) + C \\
&= -\frac{1}{2}\ln(|x + 1|) + \frac{1}{2}\ln(|x - 1|) + C.
\end{aligned}$$
So the general problem is the following.
Problem. Evaluate $\int \frac{p(x)}{q(x)}\,dx$, where $p(x)$ and $q(x)$ are polynomials ($q(x)$ is not the zero polynomial).
Definition 1.7.1. Given a polynomial $p(x) = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n$, where $n \geq 1$ and $a_n \neq 0$, $n$ is called the degree of $p(x)$ (notation: $\deg(p(x)) = n$). For $p(x) = 0$, we define $\deg(p(x)) := -\infty$.
To simplify the problem, we make the following assumptions and observations.
1. Without loss of generality, we will assume that $\deg(p(x)) < \deg(q(x))$, for otherwise we could use polynomial division to write
$$\frac{p(x)}{q(x)} = p_1(x) + \frac{p_2(x)}{q(x)},$$
where $\deg(p_2(x)) < \deg(q(x))$. Integrating $p_1(x)$ is simple, and so the problem reduces to the case where $\deg(p(x)) < \deg(q(x))$.
2. By the fundamental theorem of algebra, given any polynomial $q(x)$, we can write
$$q(x) = a(q_1(x))^{m_1}(q_2(x))^{m_2} \cdots (q_k(x))^{m_k},$$
where each $q_i(x) = x - a_i$ for some $a_i \in \mathbb{R}$ or $q_i(x) = x^2 + b_ix + c_i$ for some $b_i, c_i \in \mathbb{R}$, where $q_i(x)$ is irreducible. We will assume without loss of generality that $a = 1$, so that $q(x)$ is monic. Also, $p(x)$ and $q(x)$ share no common factors.
1.7.1
Type I Rational Functions
Definition 1.7.2. A rational function $f(x) = \frac{p(x)}{q(x)}$ is said to be type I if $p(x)$ and $q(x)$ are polynomials satisfying
1. $\deg(p(x)) < \deg(q(x))$,
2. $p(x) = 0$ implies $q(x) \neq 0$, and
3. $q(x) = (x - a_1)(x - a_2) \cdots (x - a_k)$ with $a_i \neq a_j$ if $i \neq j$.
Example 1.19. $f(x) = \frac{1}{x^2 - 1} = \frac{1}{(x+1)(x-1)}$ is type I.
Remark 1.7.3. If $f(x) = \frac{p(x)}{q(x)}$ is type I, then there exist constants $A_1, A_2, \ldots, A_k$ such that
$$f(x) = \frac{A_1}{x - a_1} + \frac{A_2}{x - a_2} + \cdots + \frac{A_k}{x - a_k}, \qquad (1.8)$$
where $q(x) = (x - a_1)(x - a_2) \cdots (x - a_k)$. The quantity on the right side of (1.8) is called the partial fraction decomposition of $f(x)$.
Proof. This proof is omitted.
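With the constants $A_1, A_2, \ldots, A_k$ described above, we have
$$\begin{aligned}
\int f(x)\,dx &= \int \left(\frac{A_1}{x - a_1} + \frac{A_2}{x - a_2} + \cdots + \frac{A_k}{x - a_k}\right) dx \\
&= \int \frac{A_1}{x - a_1}\,dx + \int \frac{A_2}{x - a_2}\,dx + \cdots + \int \frac{A_k}{x - a_k}\,dx \\
&= A_1\ln(|x - a_1|) + A_2\ln(|x - a_2|) + \cdots + A_k\ln(|x - a_k|) + C. \qquad (1.9)
\end{aligned}$$
To find $A_1, A_2, \ldots, A_k$, we multiply both sides of (1.8) by
$$q(x) = (x - a_1)(x - a_2) \cdots (x - a_k)$$
to get
$$p(x) = \sum_{i=1}^{k} A_i \prod_{\substack{j=1 \\ j \neq i}}^{k} (x - a_j).$$
Setting $x = a_m$ for $m = 1, 2, \ldots, k$ yields
$$p(a_m) = A_m \prod_{\substack{j=1 \\ j \neq m}}^{k} (a_m - a_j).$$
Note that
$$\prod_{\substack{j=1 \\ j \neq m}}^{k} (a_m - a_j) \neq 0;$$
therefore, we conclude that
$$A_m = \frac{p(a_m)}{C_m} \quad \text{where} \quad C_m = \prod_{\substack{j=1 \\ j \neq m}}^{k} (a_m - a_j),$$
for each $m = 1, 2, \ldots, k$.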
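The formula $A_m = p(a_m)/C_m$ translates directly into a short routine. The sketch below is one possible rendering using exact rational arithmetic; the function name `cover_up_coefficients` and the choice to pass $p$ as a callable are assumptions made for this illustration.

```python
from fractions import Fraction


def cover_up_coefficients(p, roots):
    """Return [A_1, ..., A_k] with A_m = p(a_m) / prod_{j != m} (a_m - a_j).

    p is a callable polynomial; roots is a list of distinct numbers
    a_1, ..., a_k, so that q(x) = (x - a_1)...(x - a_k).
    """
    coefficients = []
    for m, a_m in enumerate(roots):
        c_m = Fraction(1)
        for j, a_j in enumerate(roots):
            if j != m:
                c_m *= Fraction(a_m) - Fraction(a_j)
        coefficients.append(Fraction(p(a_m)) / c_m)
    return coefficients


# 1 / ((x + 1)(x - 1)), the opening problem of this section:
# roots listed in the order -1, 1, so the output is [A, B].
print(cover_up_coefficients(lambda x: 1, [-1, 1]))
```

For the opening problem this reproduces $A = -\frac{1}{2}$ and $B = \frac{1}{2}$. The routine only applies to type I rational functions, since it relies on the roots being distinct.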
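Example 1.20. Evaluate $\int \frac{x}{(x-1)(x-3)}\,dx$.
We write
$$\frac{x}{(x - 1)(x - 3)} = \frac{A}{x - 1} + \frac{B}{x - 3},$$
whence
$$x = A(x - 3) + B(x - 1).$$
Setting $x = 1$, we get that $1 = A(-2)$, which means $A = -\frac{1}{2}$. Setting $x = 3$, we get that $3 = B(2)$, which means $B = \frac{3}{2}$. Hence
$$\begin{aligned}
\int \frac{x}{(x - 1)(x - 3)}\,dx &= \int \frac{-\frac{1}{2}}{x - 1}\,dx + \int \frac{\frac{3}{2}}{x - 3}\,dx \\
&= -\frac{1}{2}\ln(|x - 1|) + \frac{3}{2}\ln(|x - 3|) + C.
\end{aligned}$$
1.7.2
Type II Rational Functions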
Definition 1.7.4. A rational function $f(x) = \frac{p(x)}{q(x)}$ is said to be type II if $p(x)$ and $q(x)$ are polynomials satisfying
1. $\deg(p(x)) < \deg(q(x))$,
2. $p(x) = 0$ implies $q(x) \neq 0$,
3. $q(x) = (x - a_1)^{m_1}(x - a_2)^{m_2} \cdots (x - a_k)^{m_k}$ with $a_i \neq a_j$ if $i \neq j$,
4. $m_i \geq 1$ for all $i = 1, 2, \ldots, k$, and
5. $m_i > 1$ for some $i = 1, 2, \ldots, k$.
In this case, one can show that $f(x) = \frac{p(x)}{q(x)}$ has a partial fraction decomposition where each term $(x - a_j)^{m_j}$ of $q(x)$ contributes $m_j$ terms of the form
$$\frac{A_{j,1}}{x - a_j} + \frac{A_{j,2}}{(x - a_j)^2} + \frac{A_{j,3}}{(x - a_j)^3} + \cdots + \frac{A_{j,m_j}}{(x - a_j)^{m_j}}$$
to the decomposition.
Integrating this type of rational functions is very similar to integrating those
of type I. We give an example below.
Example 1.21. Evaluate $\int \frac{1}{x^2(x+1)}\,dx$.
Here, $q(x) = x^2(x + 1)$, making $\frac{1}{x^2(x+1)}$ a type II rational function; the $x^2$ will contribute two terms to the decomposition while the $(x + 1)$ will contribute one term:
$$\frac{1}{x^2(x + 1)} = \frac{A_{1,1}}{x} + \frac{A_{1,2}}{x^2} + \frac{A_{2,1}}{x + 1} := \frac{A}{x} + \frac{B}{x^2} + \frac{C}{x + 1}. \qquad (1.10)$$
To find $A$, $B$, and $C$, we multiply the entire equation by $q(x) = x^2(x + 1)$ to get
$$1 = Ax(x + 1) + B(x + 1) + Cx^2. \qquad (1.11)$$
Letting $x = 0$, we get that $1 = 0 + B \cdot 1 + 0 = B$. Letting $x = -1$, we obtain $1 = 0 + 0 + C \cdot 1 = C$. Finally, we find $A$ by comparing coefficients in equation (1.11):
$$\begin{aligned}
1 &= Ax^2 + Ax + Bx + B + Cx^2 = (A + 1)x^2 + (A + 1)x + 1 \\
0 &= (A + 1)x^2 + (A + 1)x \\
0x^2 &= (A + 1)x^2 \\
0 &= A + 1 \\
A &= -1.
\end{aligned}$$
After checking, we find that this combination of $A$, $B$, and $C$ satisfies (1.10). We therefore have that
$$\begin{aligned}
\int \frac{1}{x^2(x + 1)}\,dx &= \int \frac{-1}{x}\,dx + \int \frac{1}{x^2}\,dx + \int \frac{1}{x + 1}\,dx \\
&= -\ln(|x|) - \frac{1}{x} + \ln(|x + 1|) + C.
\end{aligned}$$
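The "after checking" step in Example 1.21 can be carried out mechanically. The sketch below evaluates both sides of (1.10) with $A = -1$, $B = 1$, $C = 1$ at a handful of rational points, using exact arithmetic so that equality is tested without rounding error; the function names and the sample points are illustrative assumptions.

```python
from fractions import Fraction


def original(x):
    # Left side of (1.10): 1 / (x^2 (x + 1)).
    return Fraction(1) / (x**2 * (x + 1))


def decomposition(x, A=-1, B=1, C=1):
    # Right side of (1.10) with the constants found in the example.
    return Fraction(A) / x + Fraction(B) / x**2 + Fraction(C) / (x + 1)


# Rational sample points, skipping the poles x = 0 and x = -1.
points = [Fraction(n, 3) for n in range(-10, 11) if n not in (0, -3)]
all_match = all(original(x) == decomposition(x) for x in points)
print(all_match)
```

Two rational functions with the same denominator degree that agree at this many points must be identical, so agreement here does confirm the decomposition.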
1.7.3
Type III Rational Functions
Definition 1.7.5. A rational function $f(x) = \frac{p(x)}{q(x)}$ is said to be type III if $p(x)$ and $q(x)$ are polynomials and $f(x)$ is neither type I nor type II. That is, $f(x)$ satisfies
1. $\deg(p(x)) < \deg(q(x))$,
2. $p(x) = 0$ implies $q(x) \neq 0$,
3. $q(x)$ is of the form
$$q(x) = (x^2 + b_1x + c_1)^{m_1}(x^2 + b_2x + c_2)^{m_2} \cdots (x^2 + b_lx + c_l)^{m_l}\left[(x - a_1)^{m_{l+1}}(x - a_2)^{m_{l+2}} \cdots (x - a_k)^{m_{l+k}}\right]$$
with 0 < l k and mi > 0 for all i = 1, 2, . . . , l + k,
4. ai 6= aj if i 6= j, for l i, j k,
5. (bi , ci ) 6= (bj , cj ) if i 6= j, for 1 i, j l,
6. $(x^2 + b_ix + c_i)$ is irreducible for all $i \in \{1, 2, \ldots, l\}$, and
7. $(x^2 + b_ix + c_i)$ is not a polynomial factor of $p(x)$ for any $i \in \{1, 2, \ldots, l\}$.
Essentially, type III rational functions have at least one irreducible quadratic factor in their denominators. One can show that each term $(x^2 + b_jx + c_j)^{m_j}$ of $q(x)$ will contribute $m_j$ terms to the partial fraction decomposition of $f(x)$ as follows:
$$\frac{A_{j,1}x + B_{j,1}}{x^2 + b_jx + c_j} + \frac{A_{j,2}x + B_{j,2}}{(x^2 + b_jx + c_j)^2} + \frac{A_{j,3}x + B_{j,3}}{(x^2 + b_jx + c_j)^3} + \cdots + \frac{A_{j,m_j}x + B_{j,m_j}}{(x^2 + b_jx + c_j)^{m_j}}.$$
The linear factors of $q(x)$ contribute terms to the decomposition in the same manner as before.
Example 1.22. Suppose that $f(x) = \frac{p(x)}{q(x)}$ is a type III rational function where $q(x) = x(x^2 + 1)^2$. Its partial fraction decomposition takes the form
$$\frac{p(x)}{q(x)} = \frac{A_1}{x} + \frac{A_2x + B_2}{x^2 + 1} + \frac{A_3x + B_3}{(x^2 + 1)^2}.$$
Once we find the constants in the decomposition, we begin integrating. The

linear terms are integrated as before; the quadratic terms are handled by the
following simple fact.
Example 1.23. $\int \frac{1}{x^2 + 1}\,dx = \arctan(x) + C$.
From this, we can solve the following.

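Example 1.24. Evaluate $\int \frac{1}{x^2 + x + 1}\,dx$.
Note that this type III rational function is already in its decomposition. We first complete the square: $x^2 + x + 1 = \left(x + \frac{1}{2}\right)^2 + \frac{3}{4}$. Then, we factor the $\frac{3}{4}$ out to get
$$x^2 + x + 1 = \frac{3}{4}\left[\frac{4}{3}\left(x + \frac{1}{2}\right)^2 + 1\right],$$
and so we get
$$\frac{1}{x^2 + x + 1} = \frac{4}{3} \cdot \frac{1}{\left[\frac{2}{\sqrt{3}}\left(x + \frac{1}{2}\right)\right]^2 + 1}.$$
Let $u = \frac{2}{\sqrt{3}}\left(x + \frac{1}{2}\right)$, so that $du = \frac{2}{\sqrt{3}}\,dx \implies dx = \frac{\sqrt{3}}{2}\,du$. Integrating, we get
$$\begin{aligned}
\int \frac{1}{x^2 + x + 1}\,dx &= \frac{4}{3}\int \frac{dx}{\left[\frac{2}{\sqrt{3}}\left(x + \frac{1}{2}\right)\right]^2 + 1} \\
&= \frac{4}{3}\int \frac{1}{u^2 + 1} \cdot \frac{\sqrt{3}}{2}\,du \\
&= \frac{2}{\sqrt{3}}\int \frac{1}{u^2 + 1}\,du \\
&= \frac{2}{\sqrt{3}}(\arctan(u) + C_0) \\
&= \frac{2}{\sqrt{3}}\arctan\left(\frac{2}{\sqrt{3}}\left(x + \frac{1}{2}\right)\right) + C.
\end{aligned}$$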
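In general, we get the following.
Example 1.25. Evaluate $\int \frac{1}{x^2 + bx + c}\,dx$, where $b^2 - 4c < 0$ (i.e., $x^2 + bx + c$ is irreducible).
Note that we can do the same algebra as the previous example to obtain
$$\frac{1}{x^2 + bx + c} = \frac{1}{D} \cdot \frac{1}{\left[\frac{1}{\sqrt{D}}\left(x + \frac{b}{2}\right)\right]^2 + 1},$$
where $D = c - \frac{b^2}{4} > 0$. Making the substitution $u = \frac{1}{\sqrt{D}}\left(x + \frac{b}{2}\right)$, we get that $du = \frac{1}{\sqrt{D}}\,dx \implies dx = \sqrt{D}\,du$, and so
$$\begin{aligned}
\int \frac{1}{x^2 + bx + c}\,dx &= \frac{1}{D}\int \frac{dx}{\left[\frac{1}{\sqrt{D}}\left(x + \frac{b}{2}\right)\right]^2 + 1} \\
&= \frac{1}{D}\int \frac{1}{u^2 + 1}\sqrt{D}\,du \\
&= \frac{\sqrt{D}}{D}\int \frac{1}{u^2 + 1}\,du \\
&= \frac{1}{\sqrt{D}}(\arctan(u) + C_0) \\
&= \frac{1}{\sqrt{D}}\arctan\left(\frac{1}{\sqrt{D}}\left(x + \frac{b}{2}\right)\right) + C.
\end{aligned}$$
We have found the general formula
$$\int \frac{1}{x^2 + bx + c}\,dx = \frac{1}{\sqrt{D}}\arctan\left(\frac{1}{\sqrt{D}}\left(x + \frac{b}{2}\right)\right) + C,$$
where $D = c - \frac{b^2}{4} > 0$.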
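The general formula is easy to code and to test against a numerical integral. The following sketch assumes nothing beyond the formula itself; the names `antiderivative` and `midpoint_rule`, the interval $[0, 2]$, and the test values $b = c = 1$ (the quadratic from Example 1.24) are choices made here for illustration.

```python
import math


def antiderivative(x, b, c):
    """(1/sqrt(D)) * arctan((x + b/2) / sqrt(D)) with D = c - b^2/4 and C = 0."""
    D = c - b * b / 4
    if D <= 0:
        raise ValueError("x^2 + bx + c must be irreducible (b^2 - 4c < 0)")
    root_D = math.sqrt(D)
    return math.atan((x + b / 2) / root_D) / root_D


def midpoint_rule(f, lo, hi, n=20_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    dx = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * dx) for i in range(n)) * dx


b, c = 1.0, 1.0  # x^2 + x + 1, as in Example 1.24
numeric = midpoint_rule(lambda x: 1 / (x * x + b * x + c), 0.0, 2.0)
closed_form = antiderivative(2.0, b, c) - antiderivative(0.0, b, c)
print(numeric, closed_form)
```

With $b = c = 1$ we have $D = \frac{3}{4}$, so $\frac{1}{\sqrt{D}} = \frac{2}{\sqrt{3}}$, and the routine reduces to the answer found in Example 1.24.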
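We now look at how we would integrate an irreducible quadratic term with a linear polynomial as its numerator.
Example 1.26. Evaluate $\int \frac{x}{x^2 + 1}\,dx$.
Let $u = x^2 + 1$; then $du = 2x\,dx \implies x\,dx = \frac{du}{2}$. Integrating, we have
$$\begin{aligned}
\int \frac{x}{x^2 + 1}\,dx &= \int \frac{1}{u} \cdot \frac{du}{2} \\
&= \frac{1}{2}\int \frac{1}{u}\,du \\
&= \frac{1}{2}(\ln(|u|) + C_0) \\
&= \frac{1}{2}\ln(|x^2 + 1|) + C.
\end{aligned}$$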
Example 1.27. Evaluate $\int \frac{x + 1}{x^2 + 1}\,dx$.
We first break this up into a sum of two integrals:
$$\int \frac{x + 1}{x^2 + 1}\,dx = \int \frac{x}{x^2 + 1}\,dx + \int \frac{1}{x^2 + 1}\,dx,$$
both of which we have already integrated in previous examples. We get that
$$\int \frac{x + 1}{x^2 + 1}\,dx = \frac{1}{2}\ln(|x^2 + 1|) + \arctan(x) + C.$$
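Example 1.28. Evaluate $\int \frac{x}{x^2 + x + 1}\,dx$.
We again break this up into two integrals, keeping in mind that the substitution $u = x^2 + x + 1$ has differential $du = (2x + 1)\,dx \implies \frac{du}{2} = \left(x + \frac{1}{2}\right)dx$. We have
$$\begin{aligned}
\int \frac{x}{x^2 + x + 1}\,dx &= \int \frac{x + \frac{1}{2}}{x^2 + x + 1}\,dx + \int \frac{-\frac{1}{2}}{x^2 + x + 1}\,dx \\
&= \int \frac{1}{u} \cdot \frac{du}{2} - \frac{1}{2}\int \frac{1}{x^2 + x + 1}\,dx \\
&= \frac{1}{2}\ln(|u|) - \frac{1}{2} \cdot \frac{2}{\sqrt{3}}\arctan\left(\frac{2}{\sqrt{3}}\left(x + \frac{1}{2}\right)\right) + C \\
&= \frac{1}{2}\ln(|x^2 + x + 1|) - \frac{1}{2} \cdot \frac{2}{\sqrt{3}}\arctan\left(\frac{2}{\sqrt{3}}\left(x + \frac{1}{2}\right)\right) + C.
\end{aligned}$$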
We now have an algorithm to integrate every irreducible quadratic term of the type $\frac{Ax + B}{x^2 + bx + c}$. But what about the terms of the form $\frac{Ax + B}{(x^2 + bx + c)^n}$, where $n > 1$? We can break this up into two integrals
$$\int \frac{Ax + \frac{Ab}{2}}{(x^2 + bx + c)^n}\,dx + \left(B - \frac{Ab}{2}\right)\int \frac{1}{(x^2 + bx + c)^n}\,dx.$$
Here, we deliberately made the first integral easy to integrate: simply let $u = x^2 + bx + c$ and we get $du = (2x + b)\,dx \implies \frac{A}{2}\,du = \left(Ax + \frac{Ab}{2}\right)dx$. The integral on the left now becomes the easy integral
$$\int \frac{1}{u^n} \cdot \frac{A}{2}\,du = \frac{A}{2}\int \frac{1}{u^n}\,du.$$
The integral on the right is the more difficult one, and it would require techniques from the next section to solve.
1.8
Inverse Trigonometric Substitutions
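Problem. Evaluate $\int \frac{1}{(x^2 + 1)^2}\,dx$.
Solution. Here, we make the substitution $\theta = \arctan(x) \implies x = \tan(\theta)$. We get that $dx = \sec^2(\theta)\,d\theta$ and as a result
$$\begin{aligned}
\int \frac{1}{(x^2 + 1)^2}\,dx &= \int \frac{\sec^2(\theta)}{(\tan^2(\theta) + 1)^2}\,d\theta \\
&= \int \frac{\sec^2(\theta)}{(\sec^2(\theta))^2}\,d\theta \\
&= \int \frac{1}{\sec^2(\theta)}\,d\theta \\
&= \int \cos^2(\theta)\,d\theta \\
&= \int \left(\frac{\cos(2\theta)}{2} + \frac{1}{2}\right) d\theta \\
&= \frac{1}{2} \cdot \frac{\sin(2\theta)}{2} + \frac{\theta}{2} + C \\
&= \frac{1}{4}\sin(2\theta) + \frac{\theta}{2} + C,
\end{aligned}$$
since $\cos^2(\theta) = \frac{\cos(2\theta) + 1}{2}$. Recall that $x = \tan(\theta)$; from figure 1.1, we can see that $\sin(\theta) = \frac{x}{\sqrt{x^2 + 1}}$ and $\cos(\theta) = \frac{1}{\sqrt{x^2 + 1}}$. Then $\sin(2\theta) = 2\sin(\theta)\cos(\theta) = \frac{2x}{x^2 + 1}$. We therefore have
$$\int \frac{1}{(x^2 + 1)^2}\,dx = \frac{1}{4}\sin(2\theta) + \frac{\theta}{2} + C = \frac{x}{2(x^2 + 1)} + \frac{1}{2}\arctan(x) + C.$$
Figure 1.1: The substitution $\theta = \arctan(x)$.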
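As a numerical cross-check of the result above, the sketch below compares a midpoint-rule approximation of $\int_0^1 \frac{1}{(x^2+1)^2}\,dx$ with the value $F(1) - F(0)$ of the antiderivative $F(x) = \frac{x}{2(x^2+1)} + \frac{1}{2}\arctan(x)$. The helper name `midpoint_rule` and the interval $[0, 1]$ are assumptions chosen for illustration.

```python
import math


def F(x):
    # Antiderivative found via the substitution x = tan(theta), with C = 0.
    return x / (2 * (x * x + 1)) + math.atan(x) / 2


def midpoint_rule(f, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


numeric = midpoint_rule(lambda x: 1 / (x * x + 1) ** 2, 0.0, 1.0)
closed_form = F(1.0) - F(0.0)  # equals 1/4 + pi/8
print(numeric, closed_form)
```

The closed form evaluates to $\frac{1}{4} + \frac{\pi}{8}$, since $\arctan(1) = \frac{\pi}{4}$.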
The substitution $\theta = \arctan(x)$ is called an inverse trigonometric substitution, or simply an inverse trig substitution. If we rewrite this particular substitution as $x = \tan(\theta)$ and then take advantage of the identity $\sec^2(\theta) = 1 + \tan^2(\theta)$, we can evaluate many similar integrals.
Note. If the integrand has terms of the form $(a^2x^2 + b^2)$, then make the substitution $\theta = \arctan\left(\frac{ax}{b}\right) \implies ax = b\tan(\theta)$. Use the substitution $ax = b\tan(\theta)$ to obtain
$$(a^2x^2 + b^2) = (b^2\tan^2(\theta) + b^2) = b^2(\tan^2(\theta) + 1) = b^2(\sec^2(\theta)).$$
1.29. One should make the = arctan x substitution to evaluate

RExample
1
dx.
2
x +1
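Problem. Evaluate the definite integral $\int_{-1}^{1} \sqrt{1 - x^2}\,dx$.
Solution. Here, we make the substitution $\theta = \arcsin(x) \implies x = \sin(\theta)$. Then $dx = \cos(\theta)\,d\theta$, valid on $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$. Note that $\cos(\theta) \geq 0$ on that interval. We now have, by the change of variables theorem, that
$$\begin{aligned}
\int_{-1}^{1} \sqrt{1 - x^2}\,dx &= \int_{\arcsin(-1)}^{\arcsin(1)} \sqrt{1 - \sin^2(\theta)}\cos(\theta)\,d\theta \\
&= \int_{-\pi/2}^{\pi/2} \sqrt{\cos^2(\theta)}\cos(\theta)\,d\theta \\
&= \int_{-\pi/2}^{\pi/2} |\cos(\theta)|\cos(\theta)\,d\theta \\
&= \int_{-\pi/2}^{\pi/2} \cos(\theta)\cos(\theta)\,d\theta \\
&= \int_{-\pi/2}^{\pi/2} \cos^2(\theta)\,d\theta.
\end{aligned}$$
We now use the fact that $\cos^2(\theta) = \frac{\cos(2\theta) + 1}{2}$ to get
$$\begin{aligned}
\int_{-\pi/2}^{\pi/2} \cos^2(\theta)\,d\theta &= \int_{-\pi/2}^{\pi/2} \frac{\cos(2\theta)}{2}\,d\theta + \int_{-\pi/2}^{\pi/2} \frac{1}{2}\,d\theta \\
&= \left.\frac{\sin(2\theta)}{4}\right|_{-\pi/2}^{\pi/2} + \left.\frac{\theta}{2}\right|_{-\pi/2}^{\pi/2} \\
&= \left(\frac{\sin(\pi)}{4} - \frac{\sin(-\pi)}{4}\right) + \left(\frac{\pi}{4} + \frac{\pi}{4}\right) \\
&= (0 - 0) + \frac{\pi}{2} \\
&= \frac{\pi}{2}.
\end{aligned}$$
We get that the area of the semicircle of radius 1 is $\frac{\pi}{2}$.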
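The sketch below approximates both the original integral in $x$ and the transformed integral in $\theta$ with a midpoint rule and compares each with $\frac{\pi}{2}$. The helper `midpoint_rule` and the number of subintervals are illustrative assumptions. Note that the integrand $\sqrt{1 - x^2}$ has unbounded slope at $x = \pm 1$, so the $x$-side approximation converges more slowly than the $\theta$-side one.

```python
import math


def midpoint_rule(f, a, b, n=50_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


# Original integral: sqrt(1 - x^2) over [-1, 1].
x_side = midpoint_rule(lambda x: math.sqrt(1 - x * x), -1.0, 1.0)

# After the substitution x = sin(theta): cos^2(theta) over [-pi/2, pi/2].
theta_side = midpoint_rule(
    lambda t: math.cos(t) ** 2, -math.pi / 2, math.pi / 2
)

print(x_side, theta_side, math.pi / 2)
```

Both values should be close to $\frac{\pi}{2} \approx 1.5708$, the area of the unit semicircle.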
The above solution demonstrates another inverse trig substitution: $\theta = \arcsin(x)$. This substitution is used so that we could take advantage of the identity $1 - \sin^2(x) = \cos^2(x)$.
Note. If the integrand has terms of the form $(b^2 - a^2x^2)$, then make the substitution $\theta = \arcsin\left(\frac{ax}{b}\right) \implies ax = b\sin(\theta)$. Use the substitution $ax = b\sin(\theta)$ to obtain
$$(b^2 - a^2x^2) = (b^2 - b^2\sin^2(\theta)) = b^2(1 - \sin^2(\theta)) = b^2(\cos^2(\theta)).$$
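Problem. Evaluate $\int \frac{1}{\sqrt{x^2 - 1}}\,dx$.
Solution. Here, we make the substitution $\theta = \operatorname{arcsec}(x) \implies x = \sec(\theta)$. Then $dx = \sec(\theta)\tan(\theta)\,d\theta$, valid on $\left[0, \frac{\pi}{2}\right)$. Now $\tan(\theta) \geq 0$ on that interval; hence, we have that
$$\begin{aligned}
\int \frac{1}{\sqrt{x^2 - 1}}\,dx &= \int \frac{\sec(\theta)\tan(\theta)}{\sqrt{\sec^2(\theta) - 1}}\,d\theta \\
&= \int \frac{\sec(\theta)\tan(\theta)}{\sqrt{\tan^2(\theta)}}\,d\theta \\
&= \int \frac{\sec(\theta)\tan(\theta)}{\tan(\theta)}\,d\theta \\
&= \int \sec(\theta)\,d\theta \\
&= \ln(|\sec(\theta) + \tan(\theta)|) + C.
\end{aligned}$$
Recall that $x = \sec(\theta)$; from the fact that $\tan(\theta) = \sqrt{\sec^2(\theta) - 1} = \sqrt{x^2 - 1}$, we get
$$\int \frac{1}{\sqrt{x^2 - 1}}\,dx = \ln(|\sec(\theta) + \tan(\theta)|) + C = \ln\left(\left|x + \sqrt{x^2 - 1}\right|\right) + C.$$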
The above solution demonstrates the use of the substitution $\theta = \operatorname{arcsec}(x)$. It is most often used in conjunction with the identity $\sec^2(x) - 1 = \tan^2(x)$.
Note. If the integrand has terms of the form $(a^2x^2 - b^2)$, then make the substitution $\theta = \operatorname{arcsec}\left(\frac{ax}{b}\right) \implies ax = b\sec(\theta)$. Use the substitution $ax = b\sec(\theta)$ to obtain
$$(a^2x^2 - b^2) = (b^2\sec^2(\theta) - b^2) = b^2(\sec^2(\theta) - 1) = b^2(\tan^2(\theta)).$$
With these techniques, we can now integrate any rational function $f(x) = \frac{p(x)}{q(x)}$ (where $q(x)$ is not the zero polynomial).
1.9
Applications of the Integral
1.9.1
Area Between Curves
Problem. Let R be the region bounded by the graphs of f (x) and g(x) and
the lines x = a and x = b. Suppose that f (x) and g(x) are continuous on [a, b].
Find the area A of region R (figure 1.2).
Figure 1.2: Area between curves.
For convenience, we will in the future say that the region R is bounded by
(the graphs of ) f (x) and g(x) on [a, b].
Solution. We find $A$ in three steps.
1. We begin by partitioning the interval $[a, b]$ using
$$P := \{x_i : a = x_0 < x_1 < \cdots < x_{i-1} < x_i < \cdots < x_n = b\}.$$
Let $R_i$ be the subregion of $R$ bounded by the graphs of $y = g(x)$ and $y = f(x)$ on $[x_{i-1}, x_i]$. Let $A_i$ be the area of $R_i$. Then $A = \sum_{i=1}^{n} A_i$. Figure 1.3 shows this step.
Figure 1.3: Partitioning $R$ into subregions.
2. We estimate each $A_i$ with the area of a rectangle with its base on $[x_{i-1}, x_i]$ and its top and bottom on the lines $y = \max\{g(x_i), f(x_i)\}$ and $y = \min\{g(x_i), f(x_i)\}$, respectively. This rectangle has width $\Delta x_i := x_i - x_{i-1}$ and height $|g(x_i) - f(x_i)|$. Thus,
$$A_i \approx |g(x_i) - f(x_i)|\Delta x_i,$$
so that
$$A = \sum_{i=1}^{n} A_i \approx \sum_{i=1}^{n} |g(x_i) - f(x_i)|\Delta x_i.$$
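3. If $f(x)$ and $g(x)$ are continuous on $[a, b]$ and if $P_n$ is the $n$-regular partition, then letting $n \to \infty$ gives
$$\begin{aligned}
A &= \lim_{n \to \infty} \sum_{i=1}^{n} A_i \\
&= \lim_{n \to \infty} \sum_{i=1}^{n} \left|g\left(a + i\,\frac{b - a}{n}\right) - f\left(a + i\,\frac{b - a}{n}\right)\right| \frac{b - a}{n} \\
&= \int_a^b |g(x) - f(x)|\,dx,
\end{aligned}$$
where the last equality is given by corollary 1.2.10.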
The above solution can be used as the definition of the area bounded by (the
graphs of ) f (x) and g(x) on [a, b], where f (x) and g(x) are continuous functions.
If f (x) and g(x) intersect at least twice and at most a finite number of times,
then we can define the area bounded by (the graphs of ) f (x) and g(x) as the
area bounded by f (x) and g(x) on [c, d], where x = c is the leftmost point where
the two functions intersect and d is the rightmost point where the two functions
intersect.
Example 1.30. Let $f(x) = x$, $g(x) = x^3$. Find the area bounded by $f(x)$ and $g(x)$.
Since we are not given bounds $a$ and $b$, we are finding the total bounded area. We note that $f(x) = x$ and $g(x) = x^3$ intersect three times: at $x = -1$, at $x = 0$, and at $x = 1$. To see this, note that $x = x^3$ implies that $x^3 - x = 0 \implies x(x + 1)(x - 1) = 0$. Let $A$ be the area bounded by the two graphs. Then we have that
$$\begin{aligned}
A &= \int_{-1}^{1} |x^3 - x|\,dx \\
&= \int_{-1}^{0} |x^3 - x|\,dx + \int_{0}^{1} |x^3 - x|\,dx \\
&= \int_{-1}^{0} (x^3 - x)\,dx + \int_{0}^{1} (x - x^3)\,dx \\
&= \left.\left(\frac{x^4}{4} - \frac{x^2}{2}\right)\right|_{-1}^{0} + \left.\left(\frac{x^2}{2} - \frac{x^4}{4}\right)\right|_{0}^{1} \qquad (1.12) \\
&= \frac{1}{4} + \frac{1}{4} \\
&= \frac{1}{2}.
\end{aligned}$$
Equality (1.12) is due to the second fundamental theorem of calculus.
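The three-step construction of the area translates into a short Riemann-sum routine. The sketch below uses the right endpoints $x_i = a + i\frac{b-a}{n}$ of the $n$-regular partition, as in step 3 of the solution; the function name `area_between` and the value of `n` are assumptions made for illustration.

```python
def area_between(f, g, a, b, n=20_000):
    """Approximate the integral of |g(x) - f(x)| over [a, b].

    Uses the n-regular partition with right endpoints x_i = a + i (b - a) / n.
    """
    dx = (b - a) / n
    total = 0.0
    for i in range(1, n + 1):
        x_i = a + i * dx
        total += abs(g(x_i) - f(x_i)) * dx
    return total


# Example 1.30: f(x) = x and g(x) = x^3 on [-1, 1].
area = area_between(lambda x: x, lambda x: x**3, -1.0, 1.0)
print(area)
```

Because the absolute value is taken inside the sum, the routine does not need to know where the curves cross; the exact computation by hand, in contrast, requires splitting the integral at $x = 0$.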

1.9.2
Moments and Centers of Mass
Informal definition. A moment is the product of the mass of an object times the directed distance from a fixed point.
Assume that $R$ is a region in $\mathbb{R}^2$ bounded by $f(x)$ and $g(x)$ on $[a, b]$ with $f(x) \leq g(x)$ on $[a, b]$. Assume that $f(x)$ and $g(x)$ are continuous on $[a, b]$. Also assume that the region $R$ has area $A$ and represents a thin plate of uniform density and uniform thickness $t$. Then the mass of the plate, $m$, is given by
$$m = tA.$$
Without loss of generality, we may assume that the constant $t$ is equal to 1, so that
$$m = A = \int_a^b (g(x) - f(x))\,dx.$$
Problem. Find the moment of $R$ relative to the $y$-axis.
Solution. As before, we first partition the interval $[a, b]$ into $n$ subintervals using
$$P := \{x_i : a = x_0 < x_1 < \cdots < x_{i-1} < x_i < \cdots < x_n = b\}.$$
This subdivides $R$ into $n$ regions $R_i$, $i = 1, 2, \ldots, n$. The moment $M_i$ associated with the region $R_i$ is approximately
$$M_i \approx m_ic_i = c_i(g(c_i) - f(c_i))\Delta x_i,$$
where $c_i = \frac{x_{i-1} + x_i}{2}$ represents the $x$ center, or the average distance from the $y$-axis, of the rectangle with its base on $[x_{i-1}, x_i]$ and its top and bottom at $g(c_i)$ and $f(c_i)$, respectively; $m_i = (g(c_i) - f(c_i))\Delta x_i$ is the area, which is the mass, of this rectangle. The total moment of $R$ relative to the $y$-axis, $M_x$, is the sum of the individual moments
$$M_x = \sum_{i=1}^{n} M_i \approx \sum_{i=1}^{n} c_i(g(c_i) - f(c_i))\Delta x_i.$$
(See figure 1.4.)
Figure 1.4: $c_i$ gives the shaded rectangle's average distance from the $y$-axis.
Letting $n \to \infty$ gives
$$\begin{aligned}
M_x &= \lim_{n \to \infty} \sum_{i=1}^{n} M_i \\
&= \lim_{n \to \infty} \sum_{i=1}^{n} c_i(g(c_i) - f(c_i))\Delta x_i \\
&= \int_a^b x(g(x) - f(x))\,dx.
\end{aligned}$$
The last equality is due to corollary 1.2.10. If we remove the assumption that $f(x) \leq g(x)$, then we can see that we would get
$$M_x = \int_a^b x|g(x) - f(x)|\,dx.$$
Informal definition. Suppose $R$ is a region in $\mathbb{R}^2$ bounded by $f(x)$ and $g(x)$ on $[a, b]$, where $f(x)$ and $g(x)$ are continuous on $[a, b]$. The centre of mass of $R$ relative to the $y$-axis, or the $x$-component of the centre of mass of $R$, is the number $\bar{x}$ such that the mass of $R$ times $\bar{x}$ equals the total moment of $R$ relative to the $y$-axis. That is,
\[ \bar{x} \, m_R = M_x = \int_a^b x |g(x) - f(x)| \, dx, \]
where $m_R$ is the mass, which is the area, of $R$.
From the above definition, it is easy to see that $\bar{x} = \frac{M_x}{m_R}$.
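Example 1.31. Find $M_x$ and $\bar{x}$ for the region $R$ bounded by $f(x) = x$ and $g(x) = x^3$ on $[0, 1]$.
We have that
\begin{align*}
M_x &= \int_0^1 x |x - x^3| \, dx = \int_0^1 x (x - x^3) \, dx \\
    &= \int_0^1 (x^2 - x^4) \, dx = \left[ \frac{x^3}{3} - \frac{x^5}{5} \right]_0^1 = \frac{2}{15}.
\end{align*}
The $x$-component of the center of mass of $R$ is given by
\[
\bar{x} = \frac{M_x}{m_R}
        = \frac{\int_a^b x |g(x) - f(x)| \, dx}{\int_a^b |g(x) - f(x)| \, dx}
        = \frac{\int_0^1 x (x - x^3) \, dx}{\int_0^1 (x - x^3) \, dx}
        = \frac{2/15}{1/4}
        = \frac{8}{15},
\]
where the integral in the denominator has been computed in example 1.30.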
We now turn our attention to the x-axis and the y-component of the center
of mass. So let R be a region in R2 bounded by f (x) and g(x) on [a, b], where
f (x) and g(x) are continuous on [a, b].
Problem. Find the moment of R relative to the x-axis.
Solution. We partition the interval $[a, b]$ into $n$ subintervals using
\[ P := \{x_i : a = x_0 < x_1 < \cdots < x_{i-1} < x_i < \cdots < x_n = b\}. \]
This subdivides $R$ into $n$ regions $R_i$, $i = 1, 2, \ldots, n$. The moment $M_i$ associated with the region $R_i$ is approximately
\[ M_i \approx m_i c_i = |g(x_i) - f(x_i)| \, \Delta x_i \cdot \frac{f(x_i) + g(x_i)}{2}, \]
where $c_i := \frac{f(x_i) + g(x_i)}{2}$ represents the $y$ center, or the average distance from the $x$-axis, of the rectangle with its base on $[x_{i-1}, x_i]$ and its top and bottom at $\max\{f(x_i), g(x_i)\}$ and $\min\{f(x_i), g(x_i)\}$, respectively; $m_i = |g(x_i) - f(x_i)| \Delta x_i$ is the area, which is the mass, of this rectangle. The total moment of $R$ relative to the $x$-axis, $M_y$, is the sum of the individual moments
\[ M_y = \sum_{i=1}^{n} M_i \approx \sum_{i=1}^{n} |g(x_i) - f(x_i)| \, \frac{f(x_i) + g(x_i)}{2} \, \Delta x_i. \]
(See figure 1.5.) Letting $n \to \infty$ gives
\begin{align*}
M_y &= \lim_{n \to \infty} \sum_{i=1}^{n} M_i \\
    &= \lim_{n \to \infty} \sum_{i=1}^{n} |g(x_i) - f(x_i)| \, \frac{f(x_i) + g(x_i)}{2} \, \Delta x_i \\
    &= \int_a^b |g(x) - f(x)| \, \frac{f(x) + g(x)}{2} \, dx.
\end{align*}
The last equality is due to corollary 1.2.10.
Figure 1.5: $c_i$ gives the shaded rectangle's average distance to the $x$-axis.

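Informal definition. Suppose $R$ is a region in $\mathbb{R}^2$ bounded by $f(x)$ and $g(x)$ on $[a, b]$, where $f(x)$ and $g(x)$ are continuous on $[a, b]$. The centre of mass of $R$ relative to the $x$-axis, or the $y$-component of the centre of mass of $R$, is the number $\bar{y}$ such that the mass of $R$ times $\bar{y}$ equals the total moment of $R$ relative to the $x$-axis. That is,
\[ \bar{y} \, m_R = M_y = \int_a^b |g(x) - f(x)| \, \frac{f(x) + g(x)}{2} \, dx, \]
where $m_R$ is the mass, which is the area, of $R$.
Similar to $\bar{x}$, $\bar{y}$ satisfies $\bar{y} = \frac{M_y}{m_R}$.
Informal definition. Suppose $R$ is a region in $\mathbb{R}^2$ bounded by $f(x)$ and $g(x)$ on $[a, b]$, where $f(x)$ and $g(x)$ are continuous on $[a, b]$. The center of mass of $R$ is the point $P := (\bar{x}, \bar{y})$.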
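Example 1.32. Let $R$ be the region bounded by $f(x) = x^3$ and $g(x) = x$ on $[0, 1]$. Find the total moment of $R$ relative to the $x$-axis, $M_y$. Find $(\bar{x}, \bar{y})$.
Note that $g(x) \ge f(x)$ on $[0, 1]$; hence we have that
\begin{align*}
M_y &= \int_0^1 |g(x) - f(x)| \, \frac{f(x) + g(x)}{2} \, dx = \int_0^1 (x - x^3) \, \frac{(x + x^3)}{2} \, dx \\
    &= \frac{1}{2} \int_0^1 (x^2 - x^6) \, dx = \frac{1}{2} \left[ \frac{x^3}{3} - \frac{x^7}{7} \right]_0^1 \\
    &= \frac{1}{2} \left( \frac{1}{3} - \frac{1}{7} \right) = \frac{2}{21}.
\end{align*}
Now
\[
\bar{y} = \frac{M_y}{m_R}
        = \frac{\int_a^b |g(x) - f(x)| \, \frac{f(x) + g(x)}{2} \, dx}{\int_a^b |g(x) - f(x)| \, dx}
        = \frac{\int_0^1 (x - x^3) \, \frac{x + x^3}{2} \, dx}{\int_0^1 (x - x^3) \, dx}
        = \frac{2/21}{1/4}
        = \frac{8}{21}.
\]
We have computed $\bar{x} = \frac{8}{15}$ in the previous example, and so we conclude that the point
\[ (\bar{x}, \bar{y}) = \left( \frac{8}{15}, \frac{8}{21} \right) \]
is the center of mass of $R$.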
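The two centre-of-mass formulas can be checked numerically. The sketch below is only an illustration, assuming the region between $y = x^3$ and $y = x$ on $[0, 1]$ that the computations in examples 1.31 and 1.32 use; the function names `midpoint_integral` and `centroid` are hypothetical helpers, not part of the notes.

```python
def midpoint_integral(func, a, b, n=100_000):
    """Approximate the integral of func over [a, b] using n midpoint rectangles."""
    width = (b - a) / n
    return sum(func(a + (i + 0.5) * width) for i in range(n)) * width


def centroid(f, g, a, b):
    """Return (x_bar, y_bar) for the region between the graphs of f and g on [a, b]."""
    # m_R: the area (mass) of the region.
    area = midpoint_integral(lambda x: abs(g(x) - f(x)), a, b)
    # M_x in the notes' notation: moment relative to the y-axis.
    moment_x = midpoint_integral(lambda x: x * abs(g(x) - f(x)), a, b)
    # M_y in the notes' notation: moment relative to the x-axis.
    moment_y = midpoint_integral(
        lambda x: abs(g(x) - f(x)) * (f(x) + g(x)) / 2, a, b
    )
    return moment_x / area, moment_y / area


x_bar, y_bar = centroid(lambda x: x**3, lambda x: x, 0.0, 1.0)
print(x_bar, y_bar)  # should be close to 8/15 and 8/21
```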
1.9.3
Volumes of Revolution
In this subsection, we will develop techniques to compute volumes generated

by revolving a region around an axis. We will also see a (somewhat surprising)
relationship between volumes of revolution and centers of mass.
Washer method. We begin by looking at solids generated by revolving a
region R around the x-axis.
Problem. Let $R$ be the region bounded by the graphs of $y = f(x)$ and $y = 0$ on $[a, b]$, where $f(x) \ge 0$ on $[a, b]$ and is continuous on $[a, b]$. The region $R$ is rotated around the $x$-axis to generate a cylinder-like solid of revolution $S$. What is the volume $V$ of $S$?
Solution. Again, we solve this problem in three steps.
1. We first partition the interval $[a, b]$ via
\[ P = \{x_i : a = x_0 < x_1 < \cdots < x_{i-1} < x_i < \cdots < x_n = b\}. \]
This divides $R$ into $n$ subregions $R_i$, $i = 1, 2, \ldots, n$, and thus divides $S$ into $n$ subsolids $S_i$, $i = 1, 2, \ldots, n$. Let $V_i$ be the volume of $S_i$, the subsolid generated by rotating $R_i$ around the $x$-axis. We have
\[ V = \sum_{i=1}^{n} V_i. \]
2. We now estimate each $V_i$. We approximate $R_i$ with a rectangle $W_i$ with height $f(x_i)$ and base on the interval $[x_{i-1}, x_i]$; in this way, $S_i$ is approximated by the true cylinder generated by rotating $W_i$ around the $x$-axis. This cylinder has thickness $\Delta x_i := x_i - x_{i-1}$ and radius $f(x_i)$, and hence it has volume $\pi r^2 h = \pi (f(x_i))^2 \Delta x_i$. This is seen in figure 1.6. We now have
\[ V_i \approx \pi (f(x_i))^2 \Delta x_i \]
and so
\[ V = \sum_{i=1}^{n} V_i \approx \sum_{i=1}^{n} \pi (f(x_i))^2 \Delta x_i. \]
Figure 1.6: Approximating each $S_i$ by $W_i$ revolved around the $x$-axis.
3. Letting $n \to \infty$, we get the volume of revolution formula
\begin{align*}
V &= \lim_{n \to \infty} \sum_{i=1}^{n} V_i \\
  &= \lim_{n \to \infty} \sum_{i=1}^{n} \pi (f(x_i))^2 \Delta x_i \\
  &= \pi \int_a^b (f(x))^2 \, dx.
\end{align*}
We will now consider the general case.
Problem. Let $R$ be the region bounded by the graphs of $y = f(x)$ and $y = g(x)$ on $[a, b]$, where $0 \le f(x) \le g(x)$ on $[a, b]$ and both functions are continuous on $[a, b]$. The region $R$ is rotated around the $x$-axis to generate a washer-like solid of revolution $S$. What is the volume $V$ of $S$?
Solution. Let $R_1$ be the region bounded by $y = f(x)$ and $y = 0$ on $[a, b]$. Let $R_2$ be the region bounded by $y = g(x)$ and $y = 0$ on $[a, b]$. The volume obtained by rotating $R$ is simply the volume obtained by rotating $R_2$ minus the volume obtained by rotating $R_1$. We therefore get the general volume of revolution formula
\begin{align*}
V &= \pi \int_a^b (g(x))^2 \, dx - \pi \int_a^b (f(x))^2 \, dx \\
  &= \pi \int_a^b [(g(x))^2 - (f(x))^2] \, dx.
\end{align*}
Here, $g(x)$ represents the outer radius and $f(x)$ the inner radius.
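Example 1.33. Find the volume $V$ of the solid obtained by revolving the region bounded by $y = x$ and $y = x^3$ on $[0, 1]$ around the $x$-axis.
By the volume of revolution formula, we have
\begin{align*}
V &= \pi \int_a^b [(g(x))^2 - (f(x))^2] \, dx = \pi \int_0^1 [(x)^2 - (x^3)^2] \, dx \\
  &= \pi \int_0^1 (x^2 - x^6) \, dx = \pi \left[ \frac{x^3}{3} - \frac{x^7}{7} \right]_0^1 \\
  &= \pi \left( \frac{1}{3} - \frac{1}{7} \right) = \frac{4\pi}{21}.
\end{align*}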
Where have we seen this computation before?

Observation. Let $R$ be the region bounded by $f(x)$ and $g(x)$ on $[a, b]$ with $f(x)$ and $g(x)$ continuous on $[a, b]$ and $0 \le f(x) \le g(x)$. Let $V$ be the volume of the solid obtained by revolving $R$ around the $x$-axis and $A_R$ be the area of $R$. We have that
\begin{align*}
V = \pi \int_a^b [(g(x))^2 - (f(x))^2] \, dx
  &= 2\pi \left[ \frac{\int_a^b \frac{(g(x))^2 - (f(x))^2}{2} \, dx}{\int_a^b (g(x) - f(x)) \, dx} \right] \int_a^b (g(x) - f(x)) \, dx \\
  &= 2\pi \left[ \frac{\int_a^b \frac{g(x) + f(x)}{2} \, |g(x) - f(x)| \, dx}{\int_a^b |g(x) - f(x)| \, dx} \right] \int_a^b |g(x) - f(x)| \, dx \\
  &= 2\pi \bar{y} A_R.
\end{align*}
Note that during the revolution around the $x$-axis, the centre of mass of $R$ travels on a circle of radius $\bar{y}$. Since $2\pi \bar{y}$ is the circumference of that circle, the equation above shows that $V$ is the product of the distance that the center of mass travels during the revolution and the area of the region $R$.
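The identity $V = 2\pi \bar{y} A_R$ can be illustrated numerically on the region of example 1.33. The following is a rough sketch under that assumption; `midpoint_integral` is a hypothetical helper and the variable names are chosen only for this illustration.

```python
import math


def midpoint_integral(func, a, b, n=100_000):
    """Approximate the integral of func over [a, b] using n midpoint rectangles."""
    width = (b - a) / n
    return sum(func(a + (i + 0.5) * width) for i in range(n)) * width


inner = lambda x: x**3  # f(x), the inner radius
outer = lambda x: x     # g(x), the outer radius

# Washer formula: V = pi * integral of (g^2 - f^2).
washer_volume = math.pi * midpoint_integral(
    lambda x: outer(x) ** 2 - inner(x) ** 2, 0.0, 1.0
)

# Centre-of-mass route: V = 2 * pi * y_bar * A_R.
area = midpoint_integral(lambda x: outer(x) - inner(x), 0.0, 1.0)
y_bar = midpoint_integral(
    lambda x: (outer(x) - inner(x)) * (outer(x) + inner(x)) / 2, 0.0, 1.0
) / area
pappus_volume = 2 * math.pi * y_bar * area

print(washer_volume, pappus_volume)  # both should be close to 4*pi/21
```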
Shell method.
Now we look at volumes of revolution around the y-axis.
Problem. Suppose that $a, b \in \mathbb{R}$ satisfy $0 \le a < b$. Assume that $f(x)$ and $g(x)$ are continuous on $[a, b]$. Let $R$ be the region bounded by $f(x)$ and $g(x)$ on $[a, b]$. Find the volume, $V$, of the solid $S$ generated by rotating $R$ around the $y$-axis.
Solution. The solution is similar to the $x$-axis case.
1. We first partition the interval $[a, b]$ via
\[ P = \{x_i : a = x_0 < x_1 < \cdots < x_{i-1} < x_i < \cdots < x_n = b\}. \]
This divides $R$ into $n$ subregions $R_i$, $i = 1, 2, \ldots, n$. Let $V_i$ be the volume of $S_i$, the subsolid generated by rotating $R_i$ around the $y$-axis. We have
\[ V = \sum_{i=1}^{n} V_i. \]
2. We now estimate each $V_i$. We approximate $R_i$ with a rectangle $W_i$ with its base on the interval $[x_{i-1}, x_i]$ and its top and bottom at $\max\{f(x_i), g(x_i)\}$ and $\min\{f(x_i), g(x_i)\}$, respectively; in this way, $S_i$ is approximated by the hollow cylinder generated by rotating $W_i$ around the $y$-axis. This hollow cylinder has thickness $\Delta x_i := x_i - x_{i-1}$, radius $x_i$, and height $|g(x_i) - f(x_i)|$; therefore, it has volume approximately equal to $2\pi r h \cdot t = 2\pi x_i |g(x_i) - f(x_i)| \Delta x_i$. This is seen in figure 1.7. We now have
\[ V_i \approx 2\pi x_i |g(x_i) - f(x_i)| \Delta x_i \]
and so
\[ V = \sum_{i=1}^{n} V_i \approx \sum_{i=1}^{n} 2\pi x_i |g(x_i) - f(x_i)| \Delta x_i. \]
Figure 1.7: Approximating each $S_i$ by $W_i$ revolved around the $y$-axis.
3. Letting $n \to \infty$, we get the volume of revolution formula
\begin{align*}
V &= \lim_{n \to \infty} \sum_{i=1}^{n} V_i \\
  &= \lim_{n \to \infty} \sum_{i=1}^{n} 2\pi x_i |g(x_i) - f(x_i)| \Delta x_i \\
  &= 2\pi \int_a^b x |g(x) - f(x)| \, dx.
\end{align*}

This last formula looks oddly familiar as well. From the last observation,
we expect this formula to conform to the relationship that the volume is given
by the distance travelled by the center of mass of R (during the revolution)
multiplied by the area of R.
Observation. Let $a, b \in \mathbb{R}$ be such that $0 \le a < b$. Let $f(x)$ and $g(x)$ be continuous functions on $[a, b]$ and let $R$ be the region bounded by $f(x)$ and $g(x)$ on $[a, b]$. Let $V$ be the volume of the solid obtained by revolving $R$ around the $y$-axis and $A_R$ be the area of $R$. We have that
\begin{align*}
V = 2\pi \int_a^b x |g(x) - f(x)| \, dx
  &= 2\pi \left[ \frac{\int_a^b x |g(x) - f(x)| \, dx}{\int_a^b |g(x) - f(x)| \, dx} \right] \int_a^b |g(x) - f(x)| \, dx \\
  &= 2\pi \bar{x} A_R,
\end{align*}
as expected.
This pair of observations is evidence for the fact that $V = d A_R$, where $d$ is the distance that the center of mass travels during the revolution and $A_R$ is the area of the region $R$. This claim is in fact valid and can be used to find volumes of revolution around any axis $L$ ($L$ is a line that does not pass through the interior of $R$). To do so, we first find the center of mass $P$ and the area $A_R$ of the region; this is done using the ordinary $x$-$y$ axes. Then we find the distance $r$ between $P$ and the line $L$. Finally, after noting that $P$ does not change location between calculations, we get that $V = d A_R = 2\pi r A_R$ and we are done.
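To illustrate the claim for an axis other than the coordinate axes, here is a small sketch that rotates the region between $y = x^3$ and $y = x$ on $[0, 1]$ around the vertical line $x = -1$. The choice of that line, the use of shells of radius $x + 1$, and the helper `midpoint_integral` are assumptions made for this illustration only; the notes themselves derive the shell formula just for the $y$-axis.

```python
import math


def midpoint_integral(func, a, b, n=100_000):
    """Approximate the integral of func over [a, b] using n midpoint rectangles."""
    width = (b - a) / n
    return sum(func(a + (i + 0.5) * width) for i in range(n)) * width


height = lambda x: x - x**3  # |g(x) - f(x)| on [0, 1]
axis = -1.0                  # hypothetical axis of rotation: the line x = -1

# Direct computation with cylindrical shells of radius (x - axis).
shell_volume = 2 * math.pi * midpoint_integral(
    lambda x: (x - axis) * height(x), 0.0, 1.0
)

# Centre-of-mass route: V = 2 * pi * r * A_R with r the distance from P to the line.
area = midpoint_integral(height, 0.0, 1.0)
x_bar = midpoint_integral(lambda x: x * height(x), 0.0, 1.0) / area
distance_to_axis = x_bar - axis
pappus_volume = 2 * math.pi * distance_to_axis * area

print(shell_volume, pappus_volume)  # both should be close to 23*pi/30
```

Only the $x$-component of the center of mass matters here because the axis is vertical; for a slanted line one would need the full point-to-line distance.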
1.9.4
Arc Length
Problem. Given a function f (x) defined on [a, b] with f (x) continuously differentiable on [a, b], determine the length S of the graph of f (x) over [a, b].
Solution. We use the usual three-step process.
1. We first partition $[a, b]$ using
\[ P := \{x_i : a = x_0 < x_1 < \cdots < x_{i-1} < x_i < \cdots < x_n = b\}. \]
This subdivides the graph of $f(x)$ into subarcs $C_i$, $i = 1, 2, \ldots, n$, where each $C_i$ is the graph of $f(x)$ on the interval $[x_{i-1}, x_i]$. Let $S_i$ be the length of $C_i$.
2. We estimate each $S_i$. We replace the arc $C_i$ by the secant $E_i$ joining the points $(x_{i-1}, f(x_{i-1}))$ and $(x_i, f(x_i))$. We then use the length $L_i$ of $E_i$ to approximate the length of $C_i$. Note that $L_i$ is $\sqrt{(\Delta x_i)^2 + (\Delta y_i)^2}$, where $\Delta y_i := f(x_i) - f(x_{i-1})$, and so
\[ S_i \approx L_i = \sqrt{(\Delta x_i)^2 + (\Delta y_i)^2}. \]
By the mean value theorem, there exists some $c_i \in (x_{i-1}, x_i)$ such that
\[ \Delta y_i := f(x_i) - f(x_{i-1}) = f'(c_i)(x_i - x_{i-1}) = f'(c_i) \Delta x_i. \]
This can be seen in figure 1.8. We now have that
\begin{align*}
S_i \approx L_i &= \sqrt{(\Delta x_i)^2 + (\Delta y_i)^2} \\
                &= \sqrt{(\Delta x_i)^2 + (f'(c_i) \Delta x_i)^2} \\
                &= \sqrt{1 + (f'(c_i))^2} \, \Delta x_i;
\end{align*}
therefore,
\[ S = \sum_{i=1}^{n} S_i \approx \sum_{i=1}^{n} L_i = \sum_{i=1}^{n} \sqrt{1 + (f'(c_i))^2} \, \Delta x_i. \]
Figure 1.8: Estimating each subarc $C_i$ with the secant $E_i$.
3. Extend $f'(x)$ continuously to $[a, b]$. Letting $n \to \infty$ gives
\begin{align*}
S &= \lim_{n \to \infty} \sum_{i=1}^{n} S_i \\
  &= \lim_{n \to \infty} \sum_{i=1}^{n} \sqrt{1 + (f'(c_i))^2} \, \Delta x_i \\
  &= \int_a^b \sqrt{1 + (f'(x))^2} \, dx,
\end{align*}
where the last equality is due to corollary 1.2.10 and the continuity of $f'(x)$ on $[a, b]$.
Definition 1.9.1. Suppose that $f(x)$ is defined on $[a, b]$ and is continuously differentiable on $[a, b]$. Extend $f'(x)$ continuously to $[a, b]$. Then the quantity
\[ S := \int_a^b \sqrt{1 + (f'(x))^2} \, dx \]
is called the arc length of $f(x)$ on $[a, b]$.
Example 1.34. Find the arc length of $f(x) = x$ on the interval $[0, 1]$.
We have
\[ S := \int_0^1 \sqrt{1 + (f'(x))^2} \, dx = \int_0^1 \sqrt{1 + 1^2} \, dx = \int_0^1 \sqrt{2} \, dx = \sqrt{2}, \]
as expected.
Diversion. We can generalize the notion of arc length to two dimensions. Let $x(t)$ and $y(t)$ be defined on and continuously differentiable on $[a, b]$. Let $F : [a, b] \to \mathbb{R}^2$ be a vector-valued function defined by $F(t) = (x(t), y(t))$. Find the length of the graph of $F(t)$.
Solution. Extend $x'(t)$ and $y'(t)$ continuously to $[a, b]$. Subdividing $[a, b]$ using
\[ P := \{t_i : a = t_0 < t_1 < \cdots < t_{i-1} < t_i < \cdots < t_n = b\} \]
as before, we find that
\[ \Delta x_i := x(t_i) - x(t_{i-1}) = x'(c_i) \Delta t_i \approx x'(t_i) \Delta t_i \]
for some $c_i \in (t_{i-1}, t_i)$ and
\[ \Delta y_i := y(t_i) - y(t_{i-1}) = y'(d_i) \Delta t_i \approx y'(t_i) \Delta t_i \]
for some $d_i \in (t_{i-1}, t_i)$, by the mean value theorem. Then
\begin{align*}
S_i &\approx \sqrt{(\Delta x_i)^2 + (\Delta y_i)^2} \\
    &\approx \sqrt{(x'(t_i) \Delta t_i)^2 + (y'(t_i) \Delta t_i)^2} \\
    &= \sqrt{(x'(t_i))^2 + (y'(t_i))^2} \, \Delta t_i.
\end{align*}
We have, therefore, that
\[ S = \sum_{i=1}^{n} S_i \approx \sum_{i=1}^{n} \sqrt{(x'(t_i))^2 + (y'(t_i))^2} \, \Delta t_i. \]
Letting $n \to \infty$ gives
\begin{align*}
S &= \lim_{n \to \infty} \sum_{i=1}^{n} S_i \\
  &= \lim_{n \to \infty} \sum_{i=1}^{n} \sqrt{(x'(t_i))^2 + (y'(t_i))^2} \, \Delta t_i \\
  &= \int_a^b \sqrt{(x'(t))^2 + (y'(t))^2} \, dt,
\end{align*}
where the last equality is due to corollary 1.2.10 and the continuity of $x'(t)$ and $y'(t)$ on $[a, b]$.
Example 1.35. Let $F(t) = (\cos(t), \sin(t))$, defined for $t \in [0, 2\pi]$. That is, $F(t)$ traces the unit circle as $t$ ranges from $0$ to $2\pi$. We will find the arc length of $F(t)$. Let $x(t) = \cos(t)$ and $y(t) = \sin(t)$; both functions are continuously differentiable on $[0, 2\pi]$. We have $x'(t) = -\sin(t)$ and $y'(t) = \cos(t)$ on $(0, 2\pi)$. Extend them continuously to $[0, 2\pi]$. Here,
\[ \sqrt{(x'(t))^2 + (y'(t))^2} = \sqrt{(-\sin(t))^2 + (\cos(t))^2} = \sqrt{1} = 1. \]
Thus,
\[ S_F := \int_0^{2\pi} \sqrt{(x'(t))^2 + (y'(t))^2} \, dt = \int_0^{2\pi} 1 \, dt = 2\pi, \]
as expected.
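The secant construction behind the arc length formula can be tried directly: sum the lengths of the secants joining consecutive points on the curve. The sketch below does this for the unit circle and for the line of example 1.34; the function name `polygonal_length` and the number of subdivisions are illustrative assumptions.

```python
import math


def polygonal_length(x, y, a, b, n=100_000):
    """Sum the lengths of n secants joining consecutive points (x(t), y(t)) on [a, b]."""
    total = 0.0
    prev_x, prev_y = x(a), y(a)
    for i in range(1, n + 1):
        t = a + (b - a) * i / n
        cur_x, cur_y = x(t), y(t)
        # Length of the secant E_i: sqrt((delta x_i)^2 + (delta y_i)^2).
        total += math.hypot(cur_x - prev_x, cur_y - prev_y)
        prev_x, prev_y = cur_x, cur_y
    return total


circle_length = polygonal_length(math.cos, math.sin, 0.0, 2 * math.pi)
line_length = polygonal_length(lambda t: t, lambda t: t, 0.0, 1.0)
print(circle_length)  # should be close to 2*pi
print(line_length)    # should be close to sqrt(2)
```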
Observation. Given a function $f(x)$ defined on $[a, b]$ such that $f(x)$ is continuously differentiable on $[a, b]$, we can parametrize $f(x)$ as $F(t) = (t, f(t))$. Now $x'(t) = 1$ and $y'(t) = f'(t)$. Extend them both continuously to $[a, b]$ and observe that
\[ \sqrt{(x'(t))^2 + (y'(t))^2} = \sqrt{1 + (f'(t))^2}, \]
and so
\[ S = \int_a^b \sqrt{1 + (f'(t))^2} \, dt \]
is a special case of the generalized notion of arc length.
1.10
Improper Integrals
1.10.1
Definitions and Examples
Recall the following theorem for sequences of real numbers.

Theorem 1.10.1 [Monotone Convergence Theorem]. A monotonic sequence $\{a_n\} \subset \mathbb{R}$ converges if and only if it is bounded.
This theorem is true only because $\mathbb{R}$ satisfies the least upper bound principle. To present the principle, we first need to define what a least upper bound is.
Definition 1.10.2. Given a set $S \subset \mathbb{R}$, a real number $\alpha$ is called an upper bound for $S$ if $x \le \alpha$ for all $x \in S$. A real number $\beta$ is called a lower bound for $S$ if $\beta \le x$ for all $x \in S$. $S$ is said to be bounded above if $S$ has an upper bound. $S$ is said to be bounded below if $S$ has a lower bound. $S$ is said to be bounded if $S$ has an upper bound and a lower bound.
Definition 1.10.3. We say that $\alpha \in \mathbb{R}$ is the least upper bound for a set $S \subset \mathbb{R}$ if
1. $\alpha$ is an upper bound for $S$, and
2. if $\gamma$ is an upper bound for $S$, then $\alpha \le \gamma$.
The least upper bound of S is often called the supremum of S. If a set S has a
least upper bound, then we denote it by sup S or lub S.
We can make a similar definition for the greatest lower bound, or infimum,
of a set S, denoted by inf S or glb S.
Axiom 1.10.4 [Least Upper Bound Property]. A nonempty subset $S \subset \mathbb{R}$ that is bounded above always has a least upper bound.
Similarly, there is a greatest lower bound property; it can be proven as a theorem if we assume the least upper bound property as an axiom. We now prove the monotone convergence theorem.
Proof. Assume that $\{a_n\}$ is nondecreasing; the case for a nonincreasing sequence is similar and is left as an exercise.
Suppose that $\{a_n\}$ is bounded. In particular, it is bounded above. The least upper bound principle shows that the set of values $\{a_n\}$ has a least upper bound $L := \sup\{a_n\}$. Let $\epsilon > 0$. Since $L - \epsilon < L$, there must exist some $N \in \mathbb{N}$ so that $L - \epsilon < a_N \le L$, for otherwise every term $a_n$ in the sequence satisfies $a_n \le L - \epsilon$, making $L - \epsilon$ an upper bound for $\{a_n\}$, a contradiction since $L$ is the least such upper bound. For all $n \ge N$, we have that
\[ L - \epsilon < a_N \le a_n \le L, \]
which implies that $|a_n - L| < \epsilon$ for all $n \ge N$. This shows that $\{a_n\}$ converges; in fact, it converges to $L = \sup\{a_n\}$.
Conversely, suppose that $\{a_n\}$ converges (to $L$, say). Then for $\epsilon_0 = 1$, there exists an $N_0$ so that for all $n \ge N_0$, $|a_n - L| < \epsilon_0 = 1$. Hence for all $n \ge N_0$, we have that
\[ |a_n| = |a_n - L + L| \le |a_n - L| + |L| < 1 + |L|. \]
Let $M = \max\{|a_1|, |a_2|, \ldots, |a_{N_0 - 1}|, 1 + |L|\}$. Then $|a_n| \le M$ for all $n \in \mathbb{N}$, showing that $\{a_n\}$ is bounded. This completes the proof.
Definition 1.10.5 [Limits at $\infty$]. Suppose that a function $f(x)$ is defined on $[a, \infty)$. We say that $f(x)$ converges to $L$ as $x$ goes to infinity (notation: $\lim_{x \to \infty} f(x) = L$) if for every $\epsilon > 0$, there exists an $M \ge a$ such that if $x \ge M$, then $|f(x) - L| < \epsilon$. If no such $L$ exists, then we say that $f(x)$ diverges as $x$ goes to $\infty$ or that $\lim_{x \to \infty} f(x)$ does not exist.
We say that $f(x)$ diverges to $\infty$ as $x$ goes to $\infty$ (notation: $\lim_{x \to \infty} f(x) = \infty$) if for every $k > 0$, there exists an $M \ge a$ such that if $x \ge M$, then $f(x) \ge k$. We can make a similar definition for functions that diverge to $-\infty$ as $x$ goes to $\infty$.
One can make a similar definition for limits at $-\infty$. If $\lim_{x \to \infty} f(x) = \pm\infty$, then $\lim_{x \to \infty} f(x)$ does not exist and $f(x)$ grows without bound. Many results about limits of functions apply equally well to limits at infinity, such as the squeeze theorem and the usual arithmetic rules.
Definition 1.10.6. Assume that $f(t)$ is defined on $[a, \infty)$ for some $a \in \mathbb{R}$ and is integrable on $[a, x]$ for each $x \ge a$. We say that the improper (Riemann) integral
\[ \int_a^\infty f(t) \, dt \]
converges if $\lim_{x \to \infty} \int_a^x f(t) \, dt$ exists. In this case, we write
\[ \int_a^\infty f(t) \, dt = \lim_{x \to \infty} \int_a^x f(t) \, dt. \]
Otherwise, we say that $\int_a^\infty f(t) \, dt$ diverges.
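Example 1.36. Determine whether $\int_1^\infty \frac{1}{t} \, dt$ converges; if it does, find its value.
We have
\[ \int_1^\infty \frac{1}{t} \, dt := \lim_{x \to \infty} \int_1^x \frac{1}{t} \, dt = \lim_{x \to \infty} \ln(t) \Big|_1^x = \lim_{x \to \infty} (\ln(x) - \ln(1)) = \infty. \]
This limit does not exist, and so $\int_1^\infty \frac{1}{t} \, dt$ diverges.
Example 1.37. Determine whether $\int_1^\infty \frac{1}{t^2} \, dt$ converges; if it does, find its value.
We have
\[ \int_1^\infty \frac{1}{t^2} \, dt := \lim_{x \to \infty} \int_1^x \frac{1}{t^2} \, dt = \lim_{x \to \infty} \left[ -\frac{1}{t} \right]_1^x = \lim_{x \to \infty} \left( -\frac{1}{x} + 1 \right) = 0 + 1 = 1. \]
Hence $\int_1^\infty \frac{1}{t^2} \, dt$ converges and it equals 1.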
1.10.2
Monotone Convergence Theorem for Functions
Definition 1.10.7. Suppose that a function $F(x)$ is defined on $[a, \infty)$. We say that $F(x)$ is increasing on $[a, \infty)$ if for every $x_1, x_2 \in [a, \infty)$ with $x_1 < x_2$, we have $F(x_1) < F(x_2)$. We say that $F(x)$ is nondecreasing on $[a, \infty)$ if for every $x_1, x_2 \in [a, \infty)$ with $x_1 < x_2$, we have $F(x_1) \le F(x_2)$. We can make similar definitions for decreasing and nonincreasing functions.
Definition 1.10.8. A function $F(x)$ defined on $[a, \infty)$ is monotonic on $[a, \infty)$ if it is either nondecreasing on $[a, \infty)$ or nonincreasing on $[a, \infty)$.
Question: Is there an analogue of the monotone convergence theorem for nondecreasing functions?
Theorem 1.10.9 [Monotone Convergence Theorem for Functions]. Suppose that a function $F(x)$ is defined on $[a, \infty)$ and is monotonic on $[a, \infty)$. Then $\lim_{x \to \infty} F(x)$ exists if and only if $F([a, \infty)) := \{F(x) : x \ge a\}$ is bounded.
Proof. First suppose that $F(x)$ is nondecreasing on $[a, \infty)$.
Assume that $S := F([a, \infty))$ is bounded. Then the least upper bound principle shows that $S$ has a least upper bound $L := \sup S$. Let $\epsilon > 0$. Since $L - \epsilon < L$, there must exist some $M \ge a$ such that $L - \epsilon < F(M) \le L$. For all $x \ge M$, we have
\[ L - \epsilon < F(M) \le F(x) \le L, \]
since $F(x)$ is nondecreasing on $[a, \infty)$. This shows that $|F(x) - L| < \epsilon$ for all $x \ge M$, and so $\lim_{x \to \infty} F(x)$ exists and is in fact equal to $L := \sup S := \sup\{F(x) : x \ge a\}$.
Conversely, suppose that $\lim_{x \to \infty} F(x)$ exists (and equals $L$, say). Then for $\epsilon_0 = 1$, there exists some $M \ge a$ such that for all $x \ge M$, we have $|F(x) - L| < \epsilon_0 = 1$. Hence for all $x \ge M$, we have
\[ F(x) \le |F(x)| = |F(x) - L + L| \le |F(x) - L| + |L| < 1 + |L|. \]
For all $x$ such that $a \le x < M$, we have that $F(x) \le F(M)$ since $F(x)$ is nondecreasing on $[a, \infty)$. Let $K = \max\{1 + |L|, F(M)\}$. Then $F(x) \le K$ for all $x \in [a, \infty)$; i.e., $S$ is bounded above. A lower bound for $S$ is $F(a)$, because $F(a) \le F(x)$ for all $x \ge a$ since $F(x)$ is nondecreasing on $[a, \infty)$. This shows that $S := \{F(x) : x \ge a\}$ is bounded.
The case for nonincreasing $F(x)$ is similar and is left as an exercise.
Why is this theorem important? Its importance can be seen in the following
corollary.
Corollary 1.10.10. Suppose that $f(t)$ is defined on $[a, \infty)$ with $f(t) \ge 0$ for all $t \ge a$. Also suppose that $f(t)$ is integrable on $[a, x]$ for each $x \ge a$. Then the improper Riemann integral
\[ \int_a^\infty f(t) \, dt \]
converges if and only if the set
\[ \left\{ \int_a^x f(t) \, dt : x \ge a \right\} \]
is bounded above.
Proof. Let $F : [a, \infty) \to \mathbb{R}$ be defined by $F(x) = \int_a^x f(t) \, dt$. Since $f(t) \ge 0$ for all $t \ge a$, $F(x)$ is nondecreasing. In particular, $F(x) \ge F(a) = \int_a^a f(t) \, dt = 0$ for all $x \ge a$. The monotone convergence theorem for functions shows that
\[ \int_a^\infty f(t) \, dt := \lim_{x \to \infty} F(x) \]
exists if and only if the set
\[ F([a, \infty)) := \{F(x) : x \ge a\} := \left\{ \int_a^x f(t) \, dt : x \ge a \right\} \]
is bounded. But this set is bounded if and only if it is bounded above, since $F(a) = 0$ is automatically a lower bound for $F([a, \infty))$. The result follows.
In the corollary above, if $F([a, \infty))$ is not bounded above, then we can easily show that $\lim_{x \to \infty} F(x) := \int_a^\infty f(t) \, dt = \infty$.
1.10.3
Convergence Tests
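Theorem 1.10.11 [p-Test for Integrals]. The improper integral
\[ \int_1^\infty \frac{1}{t^p} \, dt \]
converges if and only if $p > 1$.
Proof. We have already shown in example 1.36 that if $p = 1$, then the improper integral diverges; so assume that $p \ne 1$. We have that
\begin{align*}
\int_1^\infty \frac{1}{t^p} \, dt &:= \lim_{x \to \infty} \int_1^x t^{-p} \, dt \\
  &= \lim_{x \to \infty} \left[ \frac{t^{-p+1}}{-p+1} \right]_1^x \\
  &= \lim_{x \to \infty} \left( \frac{x^{1-p}}{1-p} - \frac{1^{1-p}}{1-p} \right). \tag{1.13}
\end{align*}
Note that
\[ \lim_{x \to \infty} \frac{x^{1-p}}{1-p} = \begin{cases} \infty & \text{if } 1 - p > 0, \\ 0 & \text{if } 1 - p < 0. \end{cases} \]
Or, put another way,
\[ \lim_{x \to \infty} \frac{x^{1-p}}{1-p} = \begin{cases} \infty & \text{if } p < 1, \\ 0 & \text{if } p > 1. \end{cases} \]
Hence, the quantity at (1.13) satisfies
\[ \lim_{x \to \infty} \left( \frac{x^{1-p}}{1-p} - \frac{1}{1-p} \right) = \begin{cases} \infty & \text{if } p < 1, \\ \frac{1}{p-1} & \text{if } p > 1. \end{cases} \]
This shows that
\[ \int_1^\infty \frac{1}{t^p} \, dt \]
converges (to $\frac{1}{p-1}$) if and only if $p > 1$, as required.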
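The behaviour described by the $p$-test can be observed by evaluating the partial integrals $\int_1^x t^{-p} \, dt$ in closed form for growing $x$. The sketch below is illustrative only; the function name `partial_integral` and the sample values of $p$ and $x$ are assumptions made for the example.

```python
import math


def partial_integral(p, x):
    """Closed form of the integral of t**(-p) over [1, x], for x >= 1."""
    if p == 1:
        return math.log(x)
    return (x ** (1 - p) - 1) / (1 - p)


for p in (0.5, 1, 2, 3):
    values = [partial_integral(p, x) for x in (1e1, 1e3, 1e6)]
    print(p, values)
# For p = 2 and p = 3 the values level off near 1/(p - 1);
# for p = 0.5 and p = 1 they keep growing.
```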
Theorem 1.10.12 [Comparison Test for Integrals]. Assume that $f(t)$, $g(t)$ are defined on $[a, \infty)$ with $0 \le f(t) \le g(t)$ for all $t \ge a$. Also suppose that $f(t)$ and $g(t)$ are integrable on $[a, x]$ for all $x \ge a$.
1. If $\int_a^\infty g(t) \, dt$ converges, then $\int_a^\infty f(t) \, dt$ converges.
2. If $\int_a^\infty f(t) \, dt$ diverges, then $\int_a^\infty g(t) \, dt$ diverges.
Proof. Since the second item is the contrapositive of the first, it suffices to verify the first item. Assume that $\int_a^\infty g(t) \, dt$ converges. Let $F, G : [a, \infty) \to \mathbb{R}$ be defined by
\[ F(x) = \int_a^x f(t) \, dt \]
and
\[ G(x) = \int_a^x g(t) \, dt. \]
Corollary 1.10.10 shows that the set
\[ G([a, \infty)) := \left\{ \int_a^x g(t) \, dt : x \ge a \right\} \]
is bounded above; i.e., some $M \in \mathbb{R}$ is an upper bound. Then for any $x \ge a$, we have
\[ F(x) = \int_a^x f(t) \, dt \le \int_a^x g(t) \, dt = G(x) \le M, \]
showing that $M$ is an upper bound for the set
\[ F([a, \infty)) := \left\{ \int_a^x f(t) \, dt : x \ge a \right\}. \]
Hence by corollary 1.10.10 again, $\int_a^\infty f(t) \, dt$ converges. The proof is now complete.
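Example 1.38. Show that $\int_1^\infty \frac{|\sin(t)|}{t^2 + t + 2} \, dt$ converges.
Let $f(t) = \frac{|\sin(t)|}{t^2 + t + 2}$ and $g(t) = \frac{1}{t^2}$. Note that
\[ 0 \le f(t) = \frac{|\sin(t)|}{t^2 + t + 2} \le \frac{1}{t^2 + t + 2} \le \frac{1}{t^2} = g(t) \]
on $[1, \infty)$. Since $\int_1^\infty \frac{1}{t^2} \, dt$ converges by the $p$-test for integrals with $p = 2$, the comparison test for integrals shows that $\int_1^\infty \frac{|\sin(t)|}{t^2 + t + 2} \, dt$ converges, as required.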
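Example 1.39. Show that $\int_1^\infty \frac{\ln(t)}{t} \, dt$ diverges.
Let $f(t) = \frac{1}{t}$ and $g(t) = \frac{\ln(t)}{t}$. Note that $0 \le f(t) = \frac{1}{t} \le \frac{\ln(t)}{t} = g(t)$ for all $t \ge e$. The $p$-test for integrals shows that
\[ \int_1^\infty \frac{1}{t} \, dt = \int_1^e \frac{1}{t} \, dt + \int_e^\infty \frac{1}{t} \, dt \]
diverges (to $\infty$) with $p = 1$, so $\int_e^\infty \frac{1}{t} \, dt$ diverges, and by the comparison test for integrals, $\int_e^\infty \frac{\ln(t)}{t} \, dt$ diverges, whence
\[ \int_1^\infty \frac{\ln(t)}{t} \, dt = \int_1^e \frac{\ln(t)}{t} \, dt + \int_e^\infty \frac{\ln(t)}{t} \, dt \]
diverges, as required.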
Theorem 1.10.13. Suppose that $f(t)$, $g(t)$ are defined on $[a, \infty)$ with $f(t)$ and $g(t)$ integrable on $[a, x]$ for all $x \ge a$. Suppose that $\int_a^\infty f(t) \, dt$ converges to $L$ and $\int_a^\infty g(t) \, dt$ converges to $M$. Then
1. for all $c \in \mathbb{R}$, $\int_a^\infty c \cdot f(t) \, dt$ converges, and
\[ \int_a^\infty c \cdot f(t) \, dt = c \int_a^\infty f(t) \, dt = c \cdot L; \]
2. $\int_a^\infty (f(t) + g(t)) \, dt$ converges, and
\[ \int_a^\infty (f(t) + g(t)) \, dt = \int_a^\infty f(t) \, dt + \int_a^\infty g(t) \, dt = L + M. \]
Proof. This proof is left as an exercise.
It will be shown in example 1.40 that $\int_1^\infty \frac{\sin(t)}{\sqrt{t}} \, dt$ converges; however, it can also be shown that $\int_1^\infty \left| \frac{\sin(t)}{\sqrt{t}} \right| dt$ diverges. We make the following definition.
Definition 1.10.14. Suppose $f(t)$ is defined on $[a, \infty)$ and is integrable on $[a, x]$ for all $x \ge a$. We say that the improper integral $\int_a^\infty f(t) \, dt$ converges absolutely if $\int_a^\infty |f(t)| \, dt$ converges. We say that $\int_a^\infty f(t) \, dt$ converges conditionally if $\int_a^\infty f(t) \, dt$ converges but $\int_a^\infty |f(t)| \, dt$ diverges.
Theorem 1.10.15. Suppose $f(t)$ is defined on $[a, \infty)$ and is integrable on $[a, x]$ for all $x \ge a$. If $\int_a^\infty f(t) \, dt$ converges absolutely, then $\int_a^\infty f(t) \, dt$ converges.
Proof. Observe that $0 \le f(t) + |f(t)| \le 2|f(t)|$ for all $t \ge a$. Since $\int_a^\infty |f(t)| \, dt$ converges, both $\int_a^\infty -|f(t)| \, dt$ and $\int_a^\infty 2|f(t)| \, dt$ converge by theorem 1.10.13. By the comparison test for integrals, we have that $\int_a^\infty (f(t) + |f(t)|) \, dt$ also converges. Since $f(t) = (f(t) + |f(t)|) + (-|f(t)|)$, we have that
\[ \int_a^\infty f(t) \, dt = \int_a^\infty (f(t) + |f(t)|) \, dt + \int_a^\infty -|f(t)| \, dt \]
converges by theorem 1.10.13, as required.

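Example 1.40. Show that $\int_1^\infty \frac{\sin(t)}{\sqrt{t}} \, dt$ converges.
Let $f(t) = -\cos(t)$ and $g(t) = \frac{1}{\sqrt{t}} = t^{-1/2}$. Then $f'(t) = \sin(t)$ and $g'(t) = -\frac{1}{2} t^{-3/2}$. Integrating by parts, we have that
\begin{align*}
\int \frac{1}{\sqrt{t}} \sin(t) \, dt &= -\cos(t) \frac{1}{\sqrt{t}} - \int (-\cos(t)) \left( -\frac{1}{2} t^{-3/2} \right) dt \\
  &= -\frac{\cos(t)}{\sqrt{t}} - \int \frac{\cos(t)}{2} t^{-3/2} \, dt
\end{align*}
is an antiderivative of $\frac{\sin(t)}{\sqrt{t}}$. By the second fundamental theorem of calculus, for any $x \ge 1$ we have
\begin{align*}
\int_1^x \frac{1}{\sqrt{t}} \sin(t) \, dt &= \left[ -\frac{\cos(t)}{\sqrt{t}} \right]_1^x - \int_1^x \frac{\cos(t)}{2} t^{-3/2} \, dt \\
  &= \left( -\frac{\cos(x)}{\sqrt{x}} + \cos(1) \right) - \int_1^x \frac{\cos(t)}{2} t^{-3/2} \, dt.
\end{align*}
Letting $x \to \infty$ yields
\begin{align*}
\lim_{x \to \infty} \int_1^x \frac{1}{\sqrt{t}} \sin(t) \, dt &= \cos(1) - \lim_{x \to \infty} \frac{\cos(x)}{\sqrt{x}} - \lim_{x \to \infty} \int_1^x \frac{\cos(t)}{2} t^{-3/2} \, dt, \\
\int_1^\infty \frac{1}{\sqrt{t}} \sin(t) \, dt &= \cos(1) - \lim_{x \to \infty} \frac{\cos(x)}{\sqrt{x}} - \int_1^\infty \frac{\cos(t)}{2} t^{-3/2} \, dt.
\end{align*}
Since both limits on the right exist, the limit on the left exists by theorem 1.10.13. To see that $\lim_{x \to \infty} \frac{\cos(x)}{\sqrt{x}}$ exists, we observe that
\[ -\frac{1}{\sqrt{x}} \le \frac{\cos(x)}{\sqrt{x}} \le \frac{1}{\sqrt{x}} \]
for all $x \ge 1$; since
\[ \lim_{x \to \infty} -\frac{1}{\sqrt{x}} = 0 = \lim_{x \to \infty} \frac{1}{\sqrt{x}}, \]
the squeeze theorem for functions shows that $\lim_{x \to \infty} \frac{\cos(x)}{\sqrt{x}}$ exists. To see that $\int_1^\infty \frac{\cos(t)}{2} t^{-3/2} \, dt$ converges, note that
\[ 0 \le \frac{|\cos(x)|}{2 x^{3/2}} \le \frac{1}{x^{3/2}} \]
for all $x \ge 1$; since $\int_1^\infty \frac{1}{x^{3/2}} \, dx$ converges by the $p$-test for integrals with $p = \frac{3}{2}$, $\int_1^\infty \frac{\cos(t)}{2} t^{-3/2} \, dt$ converges absolutely by the comparison test for integrals, hence converges by theorem 1.10.15, and we are done.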
Chapter 2
Series
In this chapter, we will study series, or infinite sums. By the end of this
chapter, we will possess a number of useful tests to test for the convergence of
different series and to estimate the values of those series that do converge.
2.0
Definitions and Basic Theorems
Problem. How do we add infinitely many numbers?

Example 2.1. We consider the following infinite sum: $1 + (-1) + 1 + (-1) + 1 + (-1) + \cdots$. There are numerous possibilities for the value of such an infinite sum; in particular, we could compute it in these two different ways:
1. $[1 + (-1)] + [1 + (-1)] + [1 + (-1)] + \cdots = 0 + 0 + 0 + \cdots = 0$.
2. $1 + [(-1) + 1] + [(-1) + 1] + [(-1) + 1] + \cdots = 1 + 0 + 0 + 0 + \cdots = 1$.
So the value of this infinite sum is ambiguous. Just by changing the way we parenthesize the terms, we get different results.
We need an alternate method of addition for infinitely many terms.
Definition 2.0.1. A series is a formal sum of infinitely many terms
\[ a_1 + a_2 + \cdots + a_n + \cdots. \]
We denote the series by $\sum_{n=1}^\infty a_n$. Given a series $\sum_{n=1}^\infty a_n$, we define the $k$-th partial sum to be
\[ S_k := a_1 + a_2 + \cdots + a_k = \sum_{n=1}^{k} a_n. \]
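We say that a series converges if and only if the sequence of partial sums $\{S_k\}_{k \in \mathbb{N}}$ converges. In this case, we write $\sum_{n=1}^\infty a_n = \lim_{k \to \infty} S_k$. If $\{S_k\}$ does not converge, we say that the series $\sum_{n=1}^\infty a_n$ diverges. We say that $\sum_{n=1}^\infty a_n$ diverges to $\infty$ if $\lim_{k \to \infty} S_k = \infty$.
Here is our first result involving the above formal definitions.
Theorem 2.0.2 [Divergence Test]. If $\sum_{n=1}^\infty a_n$ converges, then
\[ \lim_{n \to \infty} a_n = 0. \]
Proof. Assume that the series $\sum_{n=1}^\infty a_n$ converges to some $L \in \mathbb{R}$; i.e.,
\[ \lim_{k \to \infty} S_k = L. \]
But then
\[ \lim_{k \to \infty} S_{k-1} = L \]
also. Hence,
\[ \lim_{k \to \infty} (S_k - S_{k-1}) = \lim_{k \to \infty} S_k - \lim_{k \to \infty} S_{k-1} = L - L = 0. \]
However, for all $k \ge 2$, we have
\[ S_k - S_{k-1} = \sum_{n=1}^{k} a_n - \sum_{n=1}^{k-1} a_n = a_k. \]
This shows that $\lim_{k \to \infty} a_k = 0$.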

Example 2.2. Consider the series
\[ \sum_{n=1}^\infty \frac{\sqrt{n^2 + 3n - 1}}{5n - 2}. \]
Upon seeing this, some students will immediately apply more elaborate tests from the next few sections to this series to see whether or not it converges; a much better observation is that the terms do not go to zero. Indeed, we have
\begin{align*}
\frac{\sqrt{n^2 + 3n - 1}}{5n - 2} &= \frac{n \sqrt{1 + \frac{3}{n} - \frac{1}{n^2}}}{5n - 2} \\
  &= \frac{\sqrt{1 + \frac{3}{n} - \frac{1}{n^2}}}{5 - \frac{2}{n}} \\
  &\to \frac{1}{5}
\end{align*}
as $n \to \infty$. Hence by the divergence test, this series diverges.
We should always check to see if the terms go to zero before even considering
the possibility that the series converges!
Sample Question 5. Determine whether or not the series
\[ \sum_{n=2}^\infty \frac{1}{[\ln(n)]^{1/n}} \]
converges.
Strategy: Following the above advice, we first check to see if the sequence of terms $\left\{ \frac{1}{[\ln(n)]^{1/n}} \right\}$ tends to 0. If we have access to a calculator, we discover that $\frac{1}{[\ln(n)]^{1/n}}$ is close to 1 for large values of $n$. How do we prove this?
The key to this question is becoming comfortable with strange expressions like $\frac{1}{[\ln(n)]^{1/n}}$. When an expression involving a variable $n$, in this case $\frac{1}{\ln(n)}$, is raised to another expression of the same variable, in this case $\frac{1}{n}$, the standard procedure is to take the natural logarithm of the terms to produce a new sequence; this maneuver will bring the exponent down. The limit of this new sequence is usually obtained by using the fundamental log limit: $\lim_{n \to \infty} \frac{\ln(n)}{n} = 0$. Once the limit for this new sequence has been found, we undo the natural logarithm by exponentiating. This entire process is exemplified in the sample solution. In the solution, we see that the logarithms of the terms approach zero, which shows that the terms approach $e^0 = 1$. The divergence test then shows that the series diverges.
How do we tell at first sight that a series diverges due to strange-looking terms not going to zero? Here is a useful list of limits derived from the same technique as the one employed in the solution: $n^{1/n} \to 1$, $[\ln(n)]^{1/n} \to 1$, $\frac{1}{n^{1/n}} \to 1$, and of course, $\frac{1}{[\ln(n)]^{1/n}} \to 1$, as $n \to \infty$. In fact, many similar expressions do not approach zero as $n \to \infty$; look out for those!
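Sample Solution 5. Let $a_n = \frac{1}{[\ln(n)]^{1/n}}$. We shall show that the sequence of terms $\{a_n\}_{n \ge 2}$ does not tend to zero. Let $b_n = \ln(a_n) = \ln\left( \frac{1}{[\ln(n)]^{1/n}} \right)$. We have
\begin{align*}
b_n = \ln\left( \frac{1}{[\ln(n)]^{1/n}} \right) &= \ln\left( \left( \frac{1}{\ln(n)} \right)^{1/n} \right) \\
  &= \frac{1}{n} \ln\left( \frac{1}{\ln(n)} \right) \\
  &= \frac{1}{n} [\ln(1) - \ln(\ln(n))] \\
  &= -\frac{1}{n} \ln(\ln(n)) \\
  &= -\frac{\ln(\ln(n))}{n}.
\end{align*}
The fundamental log limit states that $\lim_{n \to \infty} \frac{\ln(n)}{n} = 0$. Since $0 \le \ln(\ln(n)) \le \ln(n)$ for $n \ge 3$, the squeeze theorem gives $b_n = -\frac{\ln(\ln(n))}{n} \to 0$ as $n \to \infty$. Now $\{b_n\}$ is a sequence tending to 0, so by the sequential characterization of continuity, $\{e^{b_n}\}$ is a sequence tending to $e^0 = 1$ ($e^x$ is continuous at $x = 0$). But $e^{b_n} = e^{\ln\left( \frac{1}{[\ln(n)]^{1/n}} \right)} = \frac{1}{[\ln(n)]^{1/n}} = a_n$, hence $a_n \to 1 \ne 0$ as $n \to \infty$. The divergence test shows that $\sum_{n=2}^\infty a_n = \sum_{n=2}^\infty \frac{1}{[\ln(n)]^{1/n}}$ diverges.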
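The calculator experiment mentioned in the strategy can be reproduced in a few lines. This is only an illustrative sketch; the function name `term` and the sample indices are arbitrary choices.

```python
import math


def term(n):
    """The n-th term 1 / (ln n)^(1/n) of the series in sample question 5 (n >= 2)."""
    return 1 / math.log(n) ** (1 / n)


for n in (2, 10, 1_000, 1_000_000):
    # b_n = ln(a_n) = -ln(ln(n)) / n should approach 0, so a_n approaches 1.
    print(n, term(n), -math.log(math.log(n)) / n)
```

Numerical evidence of this kind suggests the limit but does not replace the argument in the sample solution.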
It is extremely important to note that the converse of the divergence test is false. For example, consider $\sum_{n=1}^\infty \frac{1}{n}$; this series is known as the harmonic series. Its terms $\frac{1}{n}$ go to zero, and yet the series diverges.
Observation. The harmonic series diverges to $\infty$.
Proof. It is clear that the sequence $\{S_k\}$ is increasing. Consider $\{S_{2^k}\}$, a subsequence of $\{S_k\}$. It is enough to show that this subsequence diverges to $\infty$. Note that
\begin{align*}
S_1 &= 1, \\
S_2 &= 1 + \frac{1}{2}, \\
S_4 &= 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} > 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{4} = 1 + \frac{1}{2} + \frac{1}{2}, \\
S_8 &= 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{8} \\
    &> 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{4} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} = 1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{2}, \\
    &\;\;\vdots \\
S_{2^k} &\ge 1 + \frac{k}{2}.
\end{align*}
This shows that $\{S_{2^k}\}$ is not bounded above and the result follows.
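The bound $S_{2^k} \ge 1 + \frac{k}{2}$ can be watched numerically, which also shows how slowly the harmonic series grows. The following is a small illustrative sketch; the function name `harmonic_partial_sum` and the range of $k$ are arbitrary choices.

```python
def harmonic_partial_sum(k):
    """Return S_k = 1 + 1/2 + ... + 1/k."""
    return sum(1 / n for n in range(1, k + 1))


for k in range(0, 15):
    s = harmonic_partial_sum(2**k)
    # Compare the partial sum S_{2^k} with the lower bound 1 + k/2 from the proof.
    print(k, s, 1 + k / 2)
```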
On the other hand, there is a class of series, known as the geometric series, that converges rapidly under a certain condition.
Definition 2.0.3 [Geometric Series]. A geometric series is a series of the form $\sum_{n=0}^\infty r^n$ for some $r \in \mathbb{R}$; $r$ is sometimes called the common ratio of the geometric series.
Theorem 2.0.4 [Geometric Series Test]. A geometric series converges if and only if its common ratio $r$ satisfies $|r| < 1$. Moreover, in this case, the series converges to $\frac{1}{1-r}$.
Proof. If $r = 1$, then $S_k = k + 1$ and hence $\sum_{n=0}^\infty r^n$ diverges. So consider the case where $r \ne 1$. Note that
\begin{align*}
S_k &= 1 + r + r^2 + \cdots + r^k, \\
r S_k &= r + r^2 + r^3 + \cdots + r^{k+1}.
\end{align*}
We have $S_k - r S_k = 1 - r^{k+1}$, which implies $S_k = \frac{1 - r^{k+1}}{1 - r}$. We know that $\lim_{k \to \infty} r^{k+1} = 0$ if $|r| < 1$, and that the limit does not exist if $|r| > 1$ or $r = -1$. Hence $\sum_{n=0}^\infty r^n$ converges if and only if $|r| < 1$. Moreover, in this case, the series
\[ \sum_{n=0}^\infty r^n = \lim_{k \to \infty} \frac{1 - r^{k+1}}{1 - r} = \frac{1}{1 - r}. \]
This completes the proof.
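The partial-sum formula $S_k = \frac{1 - r^{k+1}}{1 - r}$ and the limit $\frac{1}{1-r}$ can be compared against a direct summation. The sketch below is illustrative; the function names and the sample ratios are assumptions made for this example.

```python
def geometric_partial_sum(r, k):
    """Return S_k = 1 + r + r^2 + ... + r^k by direct summation."""
    return sum(r**n for n in range(k + 1))


def geometric_closed_form(r, k):
    """Return (1 - r^(k+1)) / (1 - r), valid for r != 1."""
    return (1 - r ** (k + 1)) / (1 - r)


for r in (0.5, -0.3, 0.9):
    direct = geometric_partial_sum(r, 200)
    print(r, direct, geometric_closed_form(r, 200), 1 / (1 - r))
```

For $|r| < 1$ the three printed values should agree closely, and convergence is slower the closer $|r|$ is to 1.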
2.1
Positive Series
Definition 2.1.1. A series $\sum_{n=1}^\infty a_n$ is positive if $a_n \ge 0$ for all $n$.
Observation.
\[ S_{k+1} = \sum_{n=1}^{k+1} a_n = \sum_{n=1}^{k} a_n + a_{k+1} \ge \sum_{n=1}^{k} a_n = S_k. \]
I.e., for positive series, $\{S_k\}$ is nondecreasing. This shows that either $\{S_k\}$ is bounded (and hence convergent by the monotone convergence theorem) or $\lim_{k \to \infty} S_k = \infty$.
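Example 2.3. Consider $\sum_{n=2}^\infty \frac{1}{n(n-1)}$. Note that $\frac{1}{n(n-1)} = \frac{1}{n-1} - \frac{1}{n}$ and so
\begin{align*}
S_k &= \sum_{n=2}^{k} \frac{1}{n(n-1)} \\
    &= \sum_{n=2}^{k} \left( \frac{1}{n-1} - \frac{1}{n} \right) \\
    &= \left( 1 - \frac{1}{2} \right) + \left( \frac{1}{2} - \frac{1}{3} \right) + \left( \frac{1}{3} - \frac{1}{4} \right) + \cdots + \left( \frac{1}{k-1} - \frac{1}{k} \right) \\
    &= 1 - \frac{1}{k}.
\end{align*}
Hence $\lim_{k \to \infty} S_k = \lim_{k \to \infty} \left( 1 - \frac{1}{k} \right) = 1$. So $\sum_{n=2}^\infty \frac{1}{n(n-1)} = 1$.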
Series like $\sum_{n=2}^{\infty} \frac{1}{n(n-1)}$ are said to telescope, because they can be expressed in the form $\sum_{n=2}^{\infty} \left( f(n-1) - f(n) \right)$ and so the middle terms all cancel out, or "collapse," like the middle sections of a real telescope. In the end, only the two "ends" are left uncollapsed. Telescoping series often allow one to calculate the actual sum of the series. The sample questions below further illustrate this idea.
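Sample Question 6. Determine whether the series
\[ \sum_{n=2}^{\infty} \left( \frac{1}{\ln(n+1)} - \frac{1}{\ln(n)} \right) \]
converges. If it does, find its sum.
Strategy: This series is already in a telescoping form. If we write the series out, term by term, as follows
\begin{align*}
\sum_{n=2}^{\infty} \left( \frac{1}{\ln(n+1)} - \frac{1}{\ln(n)} \right)
  = {} & \left( \frac{1}{\ln(3)} - \frac{1}{\ln(2)} \right) + \left( \frac{1}{\ln(4)} - \frac{1}{\ln(3)} \right) \\
       & + \left( \frac{1}{\ln(5)} - \frac{1}{\ln(4)} \right) + \left( \frac{1}{\ln(6)} - \frac{1}{\ln(5)} \right) + \cdots,
\end{align*}
we will notice that the positive fraction in a bracket cancels the negative fraction in the next bracket. If the "tail end" of the telescoping partial sum tends to zero, then the only remaining fraction is just the first negative term: $-\frac{1}{\ln(2)}$.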
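Sample Solution 6. Let $S_k = \sum_{n=2}^{k} \left( \frac{1}{\ln(n+1)} - \frac{1}{\ln(n)} \right)$, $k \geq 2$, be the partial sum of the series in question. Now
\begin{align*}
S_k = {} & \sum_{n=2}^{k} \left( \frac{1}{\ln(n+1)} - \frac{1}{\ln(n)} \right) \\
    = {} & \left( \frac{1}{\ln(3)} - \frac{1}{\ln(2)} \right) + \left( \frac{1}{\ln(4)} - \frac{1}{\ln(3)} \right) + \left( \frac{1}{\ln(5)} - \frac{1}{\ln(4)} \right) + \cdots \\
         & + \left( \frac{1}{\ln((k-1)+1)} - \frac{1}{\ln(k-1)} \right) + \left( \frac{1}{\ln(k+1)} - \frac{1}{\ln(k)} \right) \\
    = {} & -\frac{1}{\ln(2)} + \frac{1}{\ln(k+1)} \\
    \to {} & -\frac{1}{\ln(2)} + 0 = -\frac{1}{\ln(2)}
\end{align*}
as $k \to \infty$. Hence $\sum_{n=2}^{\infty} \left( \frac{1}{\ln(n+1)} - \frac{1}{\ln(n)} \right) := \lim_{k \to \infty} S_k$ converges to $-\frac{1}{\ln(2)}$.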
The "tail end" of the telescoping partial sum in the above sample solution is $\frac{1}{\ln(k+1)}$, and it indeed tends to zero as $k \to \infty$. The "head end," $-\frac{1}{\ln(2)}$, is what remains of the series and becomes the sum of the series.
Sample Question 7. Determine whether the series
\[ \sum_{n=2}^{\infty} \frac{1}{n^2 - 1} \]
converges. If it does, find its sum.
Strategy: This series is more difficult to put into a telescoping form; however, we should immediately recognize the difference of squares in the denominator: $n^2 - 1 = (n+1)(n-1)$. Now the goal is to write that fraction in a "self-destructive" form so that the middle terms telescope: such forms as $f(n) - f(n-1)$, $f(n+1) - f(n-1)$, $f(n+1) - f(n)$, and so on. When the fraction is a product of two reciprocals of linear terms, like $\frac{1}{n+1}$ and $\frac{1}{n-1}$, we can carry out this standard trick:
\begin{align*}
\frac{1}{(n+1)(n-1)} &= \frac{2}{2(n+1)(n-1)} \\
  &= \frac{(n+1) - (n-1)}{2(n+1)(n-1)} \\
  &= \frac{n+1}{2(n+1)(n-1)} - \frac{n-1}{2(n+1)(n-1)} \\
  &= \frac{1}{2(n-1)} - \frac{1}{2(n+1)}.
\end{align*}
This process is essentially the reversal of putting two fractions over a common denominator. Now the sum becomes
\[ \left( \frac{1}{2} - \frac{1}{6} \right) + \left( \frac{1}{4} - \frac{1}{8} \right) + \left( \frac{1}{6} - \frac{1}{10} \right) + \left( \frac{1}{8} - \frac{1}{12} \right) + \left( \frac{1}{10} - \frac{1}{14} \right) + \cdots. \]
Notice how the negative fraction of each bracket cancels with the positive fraction two brackets later? Every term will cancel out except $\frac{1}{2}$ and $\frac{1}{4}$ because they do not have any earlier negative terms with which to cancel out; we expect the sum to be $\frac{1}{2} + \frac{1}{4} = \frac{3}{4}$. We will formalize this in the solution below, by showing that the partial sums tend to a limit.
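Sample Solution 7. First note that
\begin{align*}
\frac{1}{n^2 - 1} &= \frac{1}{(n+1)(n-1)} \\
  &= \frac{2}{2(n+1)(n-1)} \\
  &= \frac{(n+1) - (n-1)}{2(n+1)(n-1)} \\
  &= \frac{n+1}{2(n+1)(n-1)} - \frac{n-1}{2(n+1)(n-1)} \\
  &= \frac{1}{2(n-1)} - \frac{1}{2(n+1)}
\end{align*}
for all $n \geq 2$. Let $S_k = \sum_{n=2}^{k} \frac{1}{n^2 - 1}$ for $k \geq 2$. Then
\begin{align*}
S_k = {} & \sum_{n=2}^{k} \frac{1}{n^2 - 1} = \sum_{n=2}^{k} \left( \frac{1}{2(n-1)} - \frac{1}{2(n+1)} \right) \\
    = {} & \left( \frac{1}{2(2-1)} - \frac{1}{2(2+1)} \right) + \left( \frac{1}{2(3-1)} - \frac{1}{2(3+1)} \right) + \left( \frac{1}{2(4-1)} - \frac{1}{2(4+1)} \right) + \cdots \\
         & + \left( \frac{1}{2((k-1)-1)} - \frac{1}{2((k-1)+1)} \right) + \left( \frac{1}{2(k-1)} - \frac{1}{2(k+1)} \right) \\
    = {} & \frac{1}{2(2-1)} + \frac{1}{2(3-1)} - \frac{1}{2((k-1)+1)} - \frac{1}{2(k+1)} \\
    = {} & \frac{3}{4} - \frac{1}{2k} - \frac{1}{2k+2}.
\end{align*}
The second to last equality is due to the telescoping middle terms. Hence $S_k = \frac{3}{4} - \frac{1}{2k} - \frac{1}{2k+2} \to \frac{3}{4} - 0 - 0 = \frac{3}{4}$ as $k \to \infty$. Hence $\sum_{n=2}^{\infty} \frac{1}{n^2 - 1} := \lim_{k \to \infty} S_k$ converges to $\frac{3}{4}$.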
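The telescoped formula for the partial sums of $\sum_{n=2}^{\infty} \frac{1}{n^2-1}$ can be sanity-checked with exact fractions. This is a minimal sketch; the names `partial_sum` and `closed_form` are illustrative only.

```python
from fractions import Fraction

def partial_sum(k):
    """S_k = sum_{n=2}^{k} 1/(n^2 - 1), computed exactly."""
    return sum(Fraction(1, n * n - 1) for n in range(2, k + 1))

def closed_form(k):
    """3/4 - 1/(2k) - 1/(2k + 2), the telescoped value of S_k."""
    return Fraction(3, 4) - Fraction(1, 2 * k) - Fraction(1, 2 * k + 2)
```

Agreement for finitely many values of $k$ is consistent with the derivation above, and the closed form makes the limit $\frac{3}{4}$ visible directly.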
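Remark 2.1.2. The series $\sum_{n=2}^{\infty} \frac{1}{n^2}$ converges.
Proof. Note that $n(n-1) = n^2 - n < n^2$ for all $n \geq 2$ and so $\frac{1}{n^2} < \frac{1}{n^2 - n}$ for all $n \geq 2$. Now it may seem that
\begin{align*}
\sum_{n=2}^{\infty} \frac{1}{n^2} &= \frac{1}{2^2} + \frac{1}{3^2} + \cdots + \frac{1}{n^2} + \cdots \\
  &\overset{?}{<} \frac{1}{2^2 - 2} + \frac{1}{3^2 - 3} + \cdots + \frac{1}{n^2 - n} + \cdots \\
  &= \sum_{n=2}^{\infty} \frac{1}{n^2 - n} \\
  &= 1.
\end{align*}
The inequality is not really valid since we have not shown that if the individual terms of a series are smaller than those of another series, then the value of the series is smaller. Instead, we will note the following:
\begin{align*}
S_k = \sum_{n=2}^{k} \frac{1}{n^2} &= \frac{1}{2^2} + \frac{1}{3^2} + \cdots + \frac{1}{k^2} \\
  &< \frac{1}{2^2 - 2} + \frac{1}{3^2 - 3} + \cdots + \frac{1}{k^2 - k} \\
  &\leq \sum_{n=2}^{\infty} \frac{1}{n^2 - n} \\
  &= 1.
\end{align*}
The second inequality follows since the partial sums $T_k = \sum_{n=2}^{k} \frac{1}{n^2 - n}$ of the series $\sum_{n=2}^{\infty} \frac{1}{n^2 - n}$ are increasing. This shows that the partial sums $S_k$ are bounded. Since $S_k$ is also increasing, $\sum_{n=2}^{\infty} \frac{1}{n^2} := \lim_{k \to \infty} S_k$ converges by the monotone convergence theorem.
The above remark and its proof suggest that there is a comparison theorem for positive series. In fact, there is.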
2.1.1 The Comparison and the Limit Comparison Tests
Theorem 2.1.3 [Comparison Test]. Let $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ be two positive series satisfying $0 \leq a_n \leq b_n$. Then
1. If $\sum_{n=1}^{\infty} b_n$ converges, then so does $\sum_{n=1}^{\infty} a_n$.
2. If $\sum_{n=1}^{\infty} a_n$ diverges, then so does $\sum_{n=1}^{\infty} b_n$.
Proof. To prove (1), we mimic the proof of remark 2.1.2. We let $S_k := \sum_{n=1}^{k} a_n$ and $T_k := \sum_{n=1}^{k} b_n$ be the $k$-th partial sums of the two series and let $T := \sum_{n=1}^{\infty} b_n$. Since $T_k$ is nondecreasing, we have that $T_k \leq T$ for all $k$. Hence
\[ S_k = \sum_{n=1}^{k} a_n \leq \sum_{n=1}^{k} b_n = T_k \leq T \]
for all $k$. This shows that $\{S_k\}$ is bounded above by $T$; since $\{S_k\}$ is nondecreasing, $\{S_k\}$ converges by the monotone convergence theorem. This completes the proof of part (1). Part (2) is simply the contrapositive of part (1) and follows immediately.
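The chain $S_k \leq T_k \leq T$ from the proof can be observed on the pair of series from remark 2.1.2, namely $a_n = \frac{1}{n^2}$ and $b_n = \frac{1}{n^2 - n}$ for $n \geq 2$, where $T = 1$. The sketch below uses exact fractions; the function names are hypothetical.

```python
from fractions import Fraction

def s(k):
    """Partial sum of 1/n^2 for n = 2, ..., k."""
    return sum(Fraction(1, n * n) for n in range(2, k + 1))

def t(k):
    """Partial sum of 1/(n^2 - n) for n = 2, ..., k."""
    return sum(Fraction(1, n * n - n) for n in range(2, k + 1))

def chain_holds(k):
    """Check S_k < T_k <= T, where T = 1 is the sum of the larger series."""
    return s(k) < t(k) <= 1
```

The bounded, nondecreasing sequence $\{S_k\}$ is exactly what the monotone convergence theorem needs.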
Remark 2.1.4 [Linearity of Series]. We will use the following facts many times.
1. A series $\sum_{n=1}^{\infty} a_n$ converges if and only if $\sum_{n=k}^{\infty} a_n$ converges for some $k$. Moreover, in this case, $\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{k-1} a_n + \sum_{n=k}^{\infty} a_n$ for $k \geq 2$.
2. If a series $\sum_{n=1}^{\infty} a_n$ converges, then the series $\sum_{n=1}^{\infty} c\,a_n$ converges for all $c \in \mathbb{R}$. Moreover, in this case, $\sum_{n=1}^{\infty} c\,a_n = c \sum_{n=1}^{\infty} a_n$.
3. If two series $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ both converge, then $\sum_{n=1}^{\infty} (a_n \pm b_n)$ converges, and it converges to $\sum_{n=1}^{\infty} a_n \pm \sum_{n=1}^{\infty} b_n$.
Proof. The proof of this remark will be left as an exercise.
Sample Question 8. Find the sum of the series
\[ \sum_{n=1}^{\infty} 4 \left( \frac{2}{3} \right)^n. \]
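Sample Solution 8. Note that $\sum_{n=0}^{\infty} \left( \frac{2}{3} \right)^n$ is a geometric series; by the geometric series test, $\sum_{n=0}^{\infty} \left( \frac{2}{3} \right)^n$ converges to $\frac{1}{1 - \frac{2}{3}} = 3$. Now
\begin{align*}
\sum_{n=1}^{\infty} 4 \left( \frac{2}{3} \right)^n &= 4 \left( \sum_{n=0}^{\infty} \left( \frac{2}{3} \right)^n - 1 \right) \\
  &= 4 \cdot 3 - 4 \\
  &= 8.
\end{align*}
This is valid by linearity of series (remark 2.1.4 (1) and (2)).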
The comparison test gives us a quick way to establish remark 2.1.2 and many
other examples.
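Example 2.4. We saw that $\sum_{n=2}^{\infty} \frac{1}{n(n-1)}$ converges. Since $\frac{1}{n^2} \leq \frac{1}{n^2 - n} = \frac{1}{n(n-1)}$ for $n \geq 2$, $\sum_{n=2}^{\infty} \frac{1}{n^2}$ converges by the comparison test. This shows that $\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges by remark 2.1.4 (1).
Diversion. $\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}$.
Example 2.5. Consider $\sum_{n=1}^{\infty} \frac{1}{n^p}$. We know that $\sum_{n=1}^{\infty} \frac{1}{n}$ diverges and $\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges. If $p \leq 1$, then $\frac{1}{n^p} \geq \frac{1}{n}$; the comparison test shows that $\sum_{n=1}^{\infty} \frac{1}{n^p}$ diverges. If $p \geq 2$, then $\frac{1}{n^p} \leq \frac{1}{n^2}$; the comparison test shows that $\sum_{n=1}^{\infty} \frac{1}{n^p}$ converges.
Example 2.6. Consider $\sum_{n=1}^{\infty} \sin\left( \frac{1}{n^2} \right)$. If we let $f(x) = x$ and $g(x) = \sin(x)$, then $f'(x) = 1 \geq \cos(x) = g'(x)$ and $f(0) = 0 = g(0)$; the mean value theorem tells us that $x \geq \sin(x)$ for all $x \geq 0$. We now have that $0 \leq \sin\left( \frac{1}{n^2} \right) \leq \frac{1}{n^2}$. Since $\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges, we get that $\sum_{n=1}^{\infty} \sin\left( \frac{1}{n^2} \right)$ converges by the comparison test.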
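The next example illustrates a "reversed" case where the comparison theorem will not help.
Example 2.7. Consider $\sum_{n=1}^{\infty} \sin\left( \frac{1}{n} \right)$. The previous example shows that $0 \leq \sin\left( \frac{1}{n} \right) \leq \frac{1}{n}$. But $\sum_{n=1}^{\infty} \frac{1}{n}$ diverges and so the comparison test does not help here.
Note that in the above example, our feeling is that $\sum_{n=1}^{\infty} \sin\left( \frac{1}{n} \right)$ diverges also since $\sin\left( \frac{1}{n} \right)$ behaves like $\frac{1}{n}$ for large $n$, i.e.,
\[ \lim_{n \to \infty} \frac{\sin\left( \frac{1}{n} \right)}{\frac{1}{n}} = \lim_{t \to 0} \frac{\sin(t)}{t} = 1. \]
Indeed, for large $n$, we have that
\[ \frac{1}{2} \leq \frac{\sin\left( \frac{1}{n} \right)}{\frac{1}{n}} \leq \frac{3}{2}, \]
and so
\[ \frac{1}{2} \cdot \frac{1}{n} \leq \sin\left( \frac{1}{n} \right) \leq \frac{3}{2} \cdot \frac{1}{n}. \]
So now a constant multiple of $\frac{1}{n}$ is on the correct side of $\leq$ for the comparison test to work. By remark 2.1.4 (1), we can then extend divergence to the whole series. Note that in this process, we have shown that in order for the comparison theorem to work, we really only need some $N$ so that $a_n \leq b_n$ for all $n \geq N$. The next theorem formalizes this limiting argument.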
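Theorem 2.1.5 [Limit Comparison Test]. Let $\{a_n\}$ and $\{b_n\}$ be positive sequences (with $b_k \neq 0$ for all $k$). Assume that $\lim_{n \to \infty} \frac{a_n}{b_n} = L$ where $L \in [0, \infty)$ or $L = \infty$. Then:
1. If $L \in (0, \infty)$, then $\sum_{n=1}^{\infty} a_n$ converges if and only if $\sum_{n=1}^{\infty} b_n$ converges.
2. If $L = 0$, and $\sum_{n=1}^{\infty} b_n$ converges, then $\sum_{n=1}^{\infty} a_n$ also converges.
3. If $L = \infty$, and $\sum_{n=1}^{\infty} a_n$ converges, then $\sum_{n=1}^{\infty} b_n$ also converges.
Proof. We will prove the three parts separately.
1. Assume that $L \in (0, \infty)$. Since $\lim_{n \to \infty} \frac{a_n}{b_n} = L$, there exists $N \in \mathbb{N}$ such that if $n \geq N$, then
\begin{align*}
\left| \frac{a_n}{b_n} - L \right| < \frac{L}{2}
  &\implies -\frac{L}{2} < \frac{a_n}{b_n} - L < \frac{L}{2} \\
  &\implies \frac{L}{2} < \frac{a_n}{b_n} < \frac{3L}{2} \\
  &\implies \frac{L}{2} b_n < a_n < \frac{3L}{2} b_n
\end{align*}
for all $n \geq N$. If $\sum_{n=1}^{\infty} a_n$ converges, then $\sum_{n=N}^{\infty} a_n$ converges due to remark 2.1.4 (1). By the comparison test, $\sum_{n=N}^{\infty} \frac{L}{2} b_n$ converges; since $L \neq 0$, remark 2.1.4 (parts (1) and (2)) shows that $\sum_{n=1}^{\infty} b_n$ converges. Similarly, if $\sum_{n=1}^{\infty} b_n$ converges, so does $\sum_{n=N}^{\infty} \frac{3L}{2} b_n$. By the comparison test, $\sum_{n=N}^{\infty} a_n$ converges; so $\sum_{n=1}^{\infty} a_n$ converges.
2. If $L = 0$, then there exists $N \in \mathbb{N}$ such that for all $n \geq N$,
\[ 0 \leq \frac{a_n}{b_n} < 1 \implies 0 \leq a_n \leq b_n \]
for all $n \geq N$. By the comparison test and remark 2.1.4, if $\sum_{n=1}^{\infty} b_n$ converges, then so does $\sum_{n=1}^{\infty} a_n$.
3. If $L = \infty$, then there exists an $N \in \mathbb{N}$ such that for all $n \geq N$,
\[ \frac{a_n}{b_n} > 1 \implies a_n > b_n > 0 \]
for all $n \geq N$. By the comparison test and remark 2.1.4, if $\sum_{n=1}^{\infty} a_n$ converges, then so does $\sum_{n=1}^{\infty} b_n$.
This completes the proof.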

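Example 2.8. Now we will apply our new tool to prove divergence of
\[ \sum_{n=1}^{\infty} \sin\left( \frac{1}{n} \right). \]
Let $a_n = \sin\left( \frac{1}{n} \right)$ and $b_n = \frac{1}{n}$. Note that
\[ \lim_{n \to \infty} \frac{a_n}{b_n} = \lim_{n \to \infty} \frac{\sin\left( \frac{1}{n} \right)}{\frac{1}{n}} = 1. \]
Our sequences $\{a_n\}$ and $\{b_n\}$ satisfy the requirements for the limit comparison test and hence $\sum_{n=1}^{\infty} \sin\left( \frac{1}{n} \right)$ diverges because $\sum_{n=1}^{\infty} \frac{1}{n}$ diverges.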
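Sample Question 9. Determine whether the series
\[ \sum_{n=1}^{\infty} \frac{\sqrt{n^2 + n - 1}}{2n^3 - 2n + 1} \]
converges.
Strategy: We first notice that this is an "ugly" series. But if $n$ is very large, the lower-order terms do not contribute much to the quotient. Indeed, for large $n$, the terms are approximately $\frac{\sqrt{n^2}}{2n^3} = \frac{1}{2n^2}$ (which is just one-half of $\frac{1}{n^2}$). This is a limiting process, and it suggests that we use the limit comparison test. Moreover, we should compare the terms of the series to $\frac{1}{n^2}$ because we think $\frac{1}{n^2}$ is a very good approximation for the terms in our series. We feel that this series will converge because $\sum_{n=1}^{\infty} \frac{1}{n^2}$ does.
Sample Solution 9. Let $a_n = \frac{\sqrt{n^2 + n - 1}}{2n^3 - 2n + 1}$, $b_n = \frac{1}{n^2}$. Comparing the two, we have
\[ \frac{a_n}{b_n} = \frac{n^2 \sqrt{n^2 + n - 1}}{2n^3 - 2n + 1} = \frac{n^3 \sqrt{1 + \frac{1}{n} - \frac{1}{n^2}}}{n^3 \left( 2 - \frac{2}{n^2} + \frac{1}{n^3} \right)} = \frac{\sqrt{1 + \frac{1}{n} - \frac{1}{n^2}}}{2 - \frac{2}{n^2} + \frac{1}{n^3}}. \]
Taking the limit as $n \to \infty$, we have
\[ \lim_{n \to \infty} \frac{a_n}{b_n} = \lim_{n \to \infty} \frac{\sqrt{1 + \frac{1}{n} - \frac{1}{n^2}}}{2 - \frac{2}{n^2} + \frac{1}{n^3}} = \frac{\sqrt{1 + 0 - 0}}{2 - 0 + 0} = \frac{1}{2}. \]
Since the limit is a positive real number and $\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges, the limit comparison test shows that $\sum_{n=1}^{\infty} \frac{\sqrt{n^2 + n - 1}}{2n^3 - 2n + 1}$ converges.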
Sample Question 10. Determine whether the series
\[ \sum_{n=0}^{\infty} \frac{n^2}{(n^3 + 1)^2} \]
converges.
Strategy: This sample question is essentially the same as the previous one. The terms of this series are fractions, where the numerator has degree two and the denominator has degree six, so the terms are decreasing to zero. If $n$ is large, we can ignore the lower-order terms in the fraction and the quotient is approximately $\frac{n^2}{(n^3)^2} = \frac{1}{n^4}$. This is a limiting process, and it suggests we use the limit comparison test, comparing with $\frac{1}{n^4}$. We know that the series $\sum_{n=1}^{\infty} \frac{1}{n^4}$ converges, so we expect the series in question to converge also.
Sample Solution 10. As before, let $a_n = \frac{n^2}{(n^3 + 1)^2}$ and $b_n = \frac{1}{n^4}$. Comparing them, we get
\[ \frac{a_n}{b_n} = \frac{\frac{n^2}{(n^3 + 1)^2}}{\frac{1}{n^4}} = \frac{n^2 \cdot n^4}{(n^3 + 1)^2} = \frac{1}{\left( 1 + \frac{1}{n^3} \right) \left( 1 + \frac{1}{n^3} \right)}. \]
Taking the limit as $n \to \infty$, we get
\[ \lim_{n \to \infty} \frac{a_n}{b_n} = \lim_{n \to \infty} \frac{1}{\left( 1 + \frac{1}{n^3} \right) \left( 1 + \frac{1}{n^3} \right)} = \frac{1}{(1 + 0)(1 + 0)} = 1. \]
Since the limit is a positive real number and the series $\sum_{n=1}^{\infty} \frac{1}{n^4}$ converges, the limit comparison test shows that $\sum_{n=1}^{\infty} \frac{n^2}{(n^3 + 1)^2} = \sum_{n=0}^{\infty} \frac{n^2}{(n^3 + 1)^2}$ converges also.
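The limits used in the limit comparison arguments above can be estimated numerically by evaluating $\frac{a_n}{b_n}$ at a large index. The sketch below assumes the terms $\frac{\sqrt{n^2+n-1}}{2n^3-2n+1}$ for Sample Question 9 and $\frac{n^2}{(n^3+1)^2}$ for Sample Question 10; the function names are illustrative, and a single large $n$ only suggests the limit rather than proving it.

```python
import math

def ratio_sample_9(n):
    """a_n / b_n with a_n = sqrt(n^2 + n - 1) / (2n^3 - 2n + 1), b_n = 1/n^2."""
    a = math.sqrt(n * n + n - 1) / (2 * n ** 3 - 2 * n + 1)
    b = 1 / n ** 2
    return a / b

def ratio_sample_10(n):
    """a_n / b_n with a_n = n^2 / (n^3 + 1)^2, b_n = 1/n^4."""
    a = n ** 2 / (n ** 3 + 1) ** 2
    b = 1 / n ** 4
    return a / b

def ratio_sine(n):
    """a_n / b_n with a_n = sin(1/n), b_n = 1/n (Example 2.8)."""
    return math.sin(1 / n) / (1 / n)
```

The three ratios should be close to $\frac{1}{2}$, $1$, and $1$ respectively for large $n$, matching the limits computed by hand.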
2.1.2 The Integral Test
Problem. Recall that $\sum_{n=1}^{\infty} \frac{1}{n^p}$ diverges for $p \leq 1$ and converges for $p \geq 2$. But what can we say about the convergence of the series when $1 < p < 2$? For example, does $\sum_{n=1}^{\infty} \frac{1}{n^{3/2}}$ converge?
The following useful theorem and its corollary solve the above problem and many other problems.
Theorem 2.1.6 [Integral Test]. Assume that $f(x)$ is continuous on $[1, \infty)$; in particular, $\int_1^k f(x)\,dx$ exists for all $k \in \mathbb{N}$. Assume also that $f(x) \geq 0$ on $[1, \infty)$ and that $f(x)$ is decreasing on $[1, \infty)$. Define $a_n := f(n)$ for each $n \in \mathbb{N}$ and let $S_k$ be the $k$-th partial sum of $\sum_{n=1}^{\infty} a_n$. Then
1. For all $k \in \mathbb{N}$,
\[ \int_1^{k+1} f(x)\,dx \leq S_k \leq a_1 + \int_1^k f(x)\,dx. \tag{2.1} \]
2. $\sum_{n=1}^{\infty} a_n$ converges if and only if $\int_1^{\infty} f(x)\,dx$ converges.
3. In the case that $\sum_{n=1}^{\infty} a_n$ converges, then
\[ 0 \leq S - S_k \leq \int_k^{\infty} f(x)\,dx, \tag{2.2} \]
where $S = \sum_{n=1}^{\infty} a_n$. (Note that by (2), $\int_k^{\infty} f(x)\,dx$ exists.)
Proof. Since $f(x)$ is decreasing, we have, for all $k \in \mathbb{N}$,
\[ \int_1^{k+1} f(x)\,dx \leq U_1^{k+1}(f, P_k) = \sum_{n=1}^{k} f(n) \cdot 1 = \sum_{n=1}^{k} a_n = S_k, \tag{2.3} \]
where $P_k$ is the $k$-regular partition of $[1, k+1]$. This can be seen in figure 2.1.
Figure 2.1: Partial sum overestimates the integral.
Similarly, we also have
\[ \int_1^{k} f(x)\,dx \geq L_1^{k}(f, P_{k-1}) = \sum_{n=2}^{k} f(n) \cdot 1 = \sum_{n=2}^{k} a_n = S_k - a_1 \tag{2.4} \]
for all $k \geq 2$; but note that $\int_1^1 f(x)\,dx = 0 \geq 0 = S_1 - a_1$. This can be seen in figure 2.2. So combining (2.3) and (2.4), we have that for all $k \in \mathbb{N}$,
\[ \int_1^{k+1} f(x)\,dx \leq S_k \leq a_1 + \int_1^k f(x)\,dx. \]
This proves part (1).
Figure 2.2: Partial sum underestimates the integral.
To show part (2), we note that by corollary 1.10.10 and the fact that $F(u) := \int_1^u f(x)\,dx$ is increasing on $[1, \infty)$, $\int_1^{\infty} f(x)\,dx$ converges if and only if the set
\[ \left\{ \int_1^k f(x)\,dx : k \in \mathbb{N} \right\} \]
is bounded. Similarly, since $a_n = f(n) \geq 0$, $\{S_k\}$ is nondecreasing and therefore $\sum_{n=1}^{\infty} a_n$ converges if and only if $\{S_k\}$ is bounded (by the monotone convergence theorem). Assume that $\int_1^{\infty} f(x)\,dx$ converges. Extending the right-most inequality of (2.1) (which we just proved), we have
\[ S_k \leq a_1 + \int_1^k f(x)\,dx \leq a_1 + \int_1^{\infty} f(x)\,dx, \]
since $f(x)$ is nonnegative. This shows that $\{S_k\}$ is bounded and hence convergent. Similarly, assume $\sum_{n=1}^{\infty} a_n$ converges. Extending the left-most inequality of (2.1), we have
\[ \int_1^{k+1} f(x)\,dx \leq S_k \leq S := \sum_{n=1}^{\infty} a_n, \]
since $\{S_k\}$ is nondecreasing. This shows that $\left\{ \int_1^k f(x)\,dx : k \in \mathbb{N} \right\}$ is bounded and hence $\int_1^{\infty} f(x)\,dx$ is convergent. The proof for part (2) is now complete. Finally, assume that $\sum_{n=1}^{\infty} a_n$ converges. Then via a Riemann sum argument,
\[ 0 \leq S - S_k = \sum_{n=1}^{\infty} a_n - \sum_{n=1}^{k} a_n = \sum_{n=k+1}^{\infty} a_n \leq \int_k^{\infty} f(x)\,dx. \]
(See figure 2.3.) The proof is now complete.
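The bounds (2.1) and (2.2) can be illustrated on the concrete choice $f(x) = \frac{1}{x^2}$, for which the integrals are available in closed form and the full sum is $\frac{\pi^2}{6}$ (the value quoted in the Diversion earlier). This is only a floating-point sketch with illustrative function names.

```python
import math

def partial_sum(k):
    """S_k = sum_{n=1}^{k} 1/n^2."""
    return sum(1 / n ** 2 for n in range(1, k + 1))

def integral(a, b):
    """Exact value of the integral of 1/x^2 over [a, b]."""
    return 1 / a - 1 / b

def bounds_hold(k):
    """Check (2.1) and (2.2) for f(x) = 1/x^2 at a given k."""
    s_k = partial_sum(k)
    lower = integral(1, k + 1)
    upper = 1 + integral(1, k)       # a_1 = f(1) = 1
    tail = math.pi ** 2 / 6 - s_k    # S - S_k
    # The integral of 1/x^2 over [k, infinity) equals 1/k.
    return lower <= s_k <= upper and 0 <= tail <= 1 / k
```

Part (3) is the practically useful piece: it says that stopping at $S_k$ leaves an error of at most $\frac{1}{k}$ for this particular series.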

The following useful corollary solves our problem about the convergence of $\sum_{n=1}^{\infty} \frac{1}{n^p}$.
Figure 2.3: Bound given by the integral test.
Corollary 2.1.7 [p-Test for Series]. $\sum_{n=1}^{\infty} \frac{1}{n^p}$ converges if and only if $p > 1$.
Proof. If $p \leq 0$, then $\sum_{n=1}^{\infty} \frac{1}{n^p}$ diverges by the divergence test. Consider the case where $p > 0$. We have that $f(x) := \frac{1}{x^p}$ is decreasing and continuous on $[1, \infty)$, satisfying the requirements to apply the integral test; by the integral test, we have that $\sum_{n=1}^{\infty} \frac{1}{n^p}$ converges if and only if $\int_1^{\infty} f(x)\,dx$ converges. But $\int_1^{\infty} f(x)\,dx = \int_1^{\infty} \frac{1}{x^p}\,dx$ converges if and only if $p > 1$ by the p-test for integrals. Hence the result follows.
This connection between series and integrals is not surprising, for integration is a limiting process of summation. When sensitive functions like $\frac{1}{x^p}$ or $\frac{1}{x \ln(x)^p}$ are integrated on $[1, \infty)$ and $[2, \infty)$, respectively, we can expect the convergence and divergence to depend on the size of $p$ in a subtle way. If $p$ is too small, then the function decays too slowly and the integral will diverge. If $p$ is large enough, then the function will decrease rapidly enough for the integral to exist. The integral test transfers this continuous phenomenon to series. Whenever the convergence of a series depends critically on the precise rate of decay of the terms, one can expect to use the integral test.
Sample Question 11. Determine whether the series
\[ \sum_{n=2}^{\infty} \frac{1}{n \ln(n)} \]
converges.
Strategy: As we have seen in the harmonic series, even though the terms go to zero, the terms may not go to zero fast enough for the series to converge. We are not sure whether this series converges: $\frac{1}{\ln(n)}$ may not shrink $\frac{1}{n}$ fast enough for the series to escape divergence. When we are unsure if the rate of decay of the terms is just enough for the series to converge, we can use the integral test. If we look at the terms, we realize it is easy to integrate $\frac{1}{x \ln(x)}$; the question of whether this series converges is just the question of whether the integral converges. The solution below shows that $\sum_{n=2}^{\infty} \frac{1}{n \ln(n)}$ diverges, albeit very, very slowly.
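Sample Solution 11. Let $f(x) = \frac{1}{x \ln(x)}$ be defined on $[2, \infty)$ and let $a_n = f(n) = \frac{1}{n \ln(n)}$. Note that $f(x)$ is both continuous and decreasing on $[2, \infty)$, satisfying the requirements to use the integral test. Hence $\sum_{n=2}^{\infty} a_n$ converges if and only if $\int_2^{\infty} f(x)\,dx$ converges. We now compute
\[ \int_2^{\infty} \frac{1}{x \ln(x)}\,dx := \lim_{b \to \infty} \int_2^b \frac{1}{x \ln(x)}\,dx. \]
Here we make a change of variable $u = \ln(x)$, and so $du = \frac{1}{x}\,dx$ and we continue integrating
\begin{align*}
\lim_{b \to \infty} \int_2^b \frac{1}{x \ln(x)}\,dx &= \lim_{b \to \infty} \int_{\ln(2)}^{\ln(b)} \frac{1}{u}\,du \\
  &= \lim_{b \to \infty} \ln(u) \Big|_{\ln(2)}^{\ln(b)} \\
  &= \lim_{b \to \infty} \ln(\ln(b)) - \ln(\ln(2)),
\end{align*}
which diverges. Consequently $\sum_{n=2}^{\infty} \frac{1}{n \ln(n)}$ diverges by the integral test.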
The following example uses part (1) of the integral test to show how slowly this series diverges.
Example 2.9. How large must $k$ be so that
\[ \sum_{n=2}^{k} \frac{1}{n \ln(n)} \geq 100? \]
By part (1) of the integral test, $\sum_{n=2}^{k} \frac{1}{n \ln(n)} \geq \int_2^{k+1} \frac{1}{x \ln(x)}\,dx$. We will attempt to make $\int_2^{k+1} \frac{1}{x \ln(x)}\,dx \geq 100$. From the sample question above, we have that
\begin{align*}
\int_2^{k+1} \frac{1}{x \ln(x)}\,dx &= \ln(\ln(k+1)) - \ln(\ln(2)) \\
  &\geq \ln(\ln(k+1)) - \ln(\ln(e)) = \ln(\ln(k+1)).
\end{align*}
If we make $k \geq e^{e^{100}} - 1$, then $\ln(k+1) \geq e^{100}$, which then guarantees that $\ln(\ln(k+1)) \geq 100$.
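To get a feeling for the size of $e^{e^{100}}$, one can count its decimal digits through logarithms instead of trying to evaluate it, since $\log_{10}\left(e^{e^{100}}\right) = \frac{e^{100}}{\ln(10)}$. The sketch below also checks the lower bound from part (1) of the integral test on a modest partial sum; the function names are hypothetical.

```python
import math

def partial_sum(k):
    """sum_{n=2}^{k} 1/(n ln n), computed directly."""
    return sum(1 / (n * math.log(n)) for n in range(2, k + 1))

def integral_lower_bound(k):
    """Integral of 1/(x ln x) over [2, k + 1]."""
    return math.log(math.log(k + 1)) - math.log(math.log(2))

def decimal_digits_of_threshold(target):
    """Approximate number of decimal digits of e^(e^target)."""
    return math.exp(target) / math.log(10)
```

With `target = 100` the digit count itself is of order $10^{43}$, while summing the first $10^5$ terms directly still gives a value below 4.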
We need to add an astronomic number of terms before the sum is greater than 100. If the series is so "close" to converging, would it converge if we increase the exponent on $\ln(n)$ by a small amount? Would $\frac{1}{\ln(n)^{1.1}}$ shrink $\frac{1}{n}$ fast enough for convergence?
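Sample Question 12. Determine whether the series
\[ \sum_{n=2}^{\infty} \frac{1}{n \ln(n)^{1.1}} \]
converges.
Sample Solution 12. We use the integral test as before. Let $f(x) = \frac{1}{x \ln(x)^{1.1}}$ be defined on $[2, \infty)$ and let $a_n = f(n) = \frac{1}{n \ln(n)^{1.1}}$. Note that $f(x)$ is both continuous and decreasing on $[2, \infty)$, satisfying the requirements to use the integral test. Hence $\sum_{n=2}^{\infty} a_n$ converges if and only if $\int_2^{\infty} f(x)\,dx$ converges. We now compute
\[ \int_2^{\infty} \frac{1}{x \ln(x)^{1.1}}\,dx := \lim_{b \to \infty} \int_2^b \frac{1}{x \ln(x)^{1.1}}\,dx. \]
Here, we make a change of variable $u = \ln(x)$, and so $du = \frac{1}{x}\,dx$ and we continue integrating
\begin{align*}
\lim_{b \to \infty} \int_2^b \frac{1}{x \ln(x)^{1.1}}\,dx &= \lim_{b \to \infty} \int_{\ln(2)}^{\ln(b)} \frac{1}{u^{1.1}}\,du \\
  &= \lim_{b \to \infty} \left. -\frac{1}{0.1\,u^{0.1}} \right|_{\ln(2)}^{\ln(b)} \\
  &= \lim_{b \to \infty} \left( -\frac{1}{0.1 \ln(b)^{0.1}} + \frac{1}{0.1 \ln(2)^{0.1}} \right) \\
  &= \lim_{b \to \infty} \left( -\frac{10}{\ln(b)^{0.1}} + \frac{10}{\ln(2)^{0.1}} \right).
\end{align*}
This time, the integral converges to $0 + \frac{10}{\ln(2)^{0.1}} \approx 10.37$. Hence $\sum_{n=2}^{\infty} \frac{1}{n \ln(n)^{1.1}}$ converges.
In fact, one could see that if the exponent $p$ of $\ln(n)$ in the series $\sum_{n=2}^{\infty} \frac{1}{n \ln(n)^p}$ satisfies $p > 1$, then the series will converge.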
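The same substitution $u = \ln(x)$ gives a closed form for general $p > 1$: the integral of $\frac{1}{x \ln(x)^p}$ over $[2, \infty)$ equals $\frac{\ln(2)^{1-p}}{p-1}$. The sketch below evaluates that expression; the function name is illustrative and the formula is a direct generalization of the computation for $p = 1.1$, not something stated in the notes.

```python
import math

def log_power_integral(p):
    """Integral of 1/(x ln(x)^p) over [2, infinity), for p > 1."""
    if p <= 1:
        raise ValueError("the integral diverges for p <= 1")
    return math.log(2) ** (1 - p) / (p - 1)
```

For `p = 1.1` this reproduces the value of roughly 10.37, and the value grows without bound as $p$ decreases toward 1, in line with the divergence found for $p = 1$.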
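Sample Question 13. Determine whether the series
\[ \sum_{n=0}^{\infty} \frac{1}{1 + \sqrt{n}} \]
converges.
Strategy: We should immediately notice that this series is approximately $\sum_{n=1}^{\infty} \frac{1}{\sqrt{n}}$, a hint that we should use the limit comparison test. The series $\sum_{n=1}^{\infty} \frac{1}{\sqrt{n}}$ diverges by the p-test for series; the series in question should diverge, too.
Sample Solution 13. Let $a_n = \frac{1}{1 + \sqrt{n}}$ and $b_n = \frac{1}{\sqrt{n}}$, for $n \geq 1$. Note that $a_n, b_n > 0$ for all $n \geq 1$. Comparing the two, we have
\[ \frac{a_n}{b_n} = \frac{\frac{1}{1 + \sqrt{n}}}{\frac{1}{\sqrt{n}}} = \frac{\sqrt{n}}{1 + \sqrt{n}} = \frac{1}{\frac{1}{\sqrt{n}} + 1}, \]
for all $n \geq 1$. So $\lim_{n \to \infty} \frac{a_n}{b_n} = \lim_{n \to \infty} \frac{1}{\frac{1}{\sqrt{n}} + 1} = \frac{1}{0 + 1} = 1 \in (0, \infty)$. The limit comparison test shows that the series $\sum_{n=1}^{\infty} \frac{1}{1 + \sqrt{n}}$ converges if and only if $\sum_{n=1}^{\infty} \frac{1}{\sqrt{n}}$ converges. But the latter diverges by the p-test for series, so the former diverges also. This shows that $\sum_{n=0}^{\infty} \frac{1}{1 + \sqrt{n}}$ diverges (by remark 2.1.4 (1)).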
2.1.3 The Ratio Test
Recall that a geometric series $\sum_{n=0}^{\infty} r^n$ converges if and only if $|r| < 1$. The property that guarantees its convergence is that the ratio between successive terms is always $r$, some number whose magnitude is strictly less than 1. This causes the terms to converge to zero very rapidly. If an arbitrary series is such that the limiting ratio between successive terms is strictly less than 1, will the series converge?
Theorem 2.1.8 [Ratio Test]. Let $\{a_n\}$ be a positive sequence satisfying $a_n \neq 0$ for all $n \in \mathbb{N}$. Assume that
\[ \lim_{n \to \infty} \frac{a_{n+1}}{a_n} = L, \]
where $L \in \mathbb{R}$ or $L = \infty$.
1. If $0 \leq L < 1$, then $\sum_{n=1}^{\infty} a_n$ converges.
2. If $L > 1$, including the case $L = \infty$, then $\sum_{n=1}^{\infty} a_n$ diverges.
Note that if $L = 1$, then the ratio test does not tell us anything about convergence or divergence of $\sum_{n=1}^{\infty} a_n$.

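Proof. To prove part (1), suppose $L$ satisfies $0 \leq L < 1$. Choose $r$ so that $0 \leq L < r < 1$. Since $\lim_{n \to \infty} \frac{a_{n+1}}{a_n} = L$, we can find $N_0 \in \mathbb{N}$ so that if $n \geq N_0$, then
\[ \left| \frac{a_{n+1}}{a_n} - L \right| < r - L \]
and so
\begin{align*}
L - r < \frac{a_{n+1}}{a_n} - L < r - L
  \implies 2L - r < \frac{a_{n+1}}{a_n} < r,
\end{align*}
where $0 < \frac{a_{n+1}}{a_n}$ since $\{a_n\}$ is positive. So $0 < \frac{a_{n+1}}{a_n} < r$ for all $n \geq N_0$. It follows that
\begin{align*}
\frac{a_{N_0+1}}{a_{N_0}} < r &\implies a_{N_0+1} < r a_{N_0}, \\
\frac{a_{N_0+2}}{a_{N_0+1}} < r &\implies a_{N_0+2} < r a_{N_0+1} < r^2 a_{N_0}, \\
\frac{a_{N_0+3}}{a_{N_0+2}} < r &\implies a_{N_0+3} < r a_{N_0+2} < r^3 a_{N_0}, \\
  &\;\;\vdots \\
\frac{a_{N_0+k}}{a_{N_0+k-1}} < r &\implies a_{N_0+k} < r a_{N_0+k-1} < r^k a_{N_0}.
\end{align*}
This can be proven inductively. Note that since $0 < r < 1$, we have that $\sum_{k=0}^{\infty} r^k$ converges by the geometric series test. Hence $\sum_{k=0}^{\infty} r^k a_{N_0}$ converges by remark 2.1.4. The comparison test shows that $\sum_{k=0}^{\infty} a_{N_0+k}$ converges. But $\sum_{k=0}^{\infty} a_{N_0+k} = \sum_{n=N_0}^{\infty} a_n$, so $\sum_{n=1}^{\infty} a_n$ converges by remark 2.1.4. This proves part (1) of the theorem.
We proceed similarly to prove part (2). Suppose $L$ satisfies $L > 1$ (including the case $L = \infty$). Choose $r$ so that $1 < r < L$. Since $\lim_{n \to \infty} \frac{a_{n+1}}{a_n} = L$, we can find $N_0$ so that for all $n \geq N_0$,
\begin{align*}
\left| \frac{a_{n+1}}{a_n} - L \right| < L - r
  &\implies r - L < \frac{a_{n+1}}{a_n} - L < L - r \\
  &\implies r < \frac{a_{n+1}}{a_n} < 2L - r
\end{align*}
if $L < \infty$, or
\[ r < \frac{a_{n+1}}{a_n} \]
if $L = \infty$. In either case, we have that for all $n \geq N_0$, $\frac{a_{n+1}}{a_n} > r$. Now
\begin{align*}
\frac{a_{N_0+1}}{a_{N_0}} > r &\implies a_{N_0+1} > r a_{N_0}, \\
\frac{a_{N_0+2}}{a_{N_0+1}} > r &\implies a_{N_0+2} > r a_{N_0+1} > r^2 a_{N_0}, \\
\frac{a_{N_0+3}}{a_{N_0+2}} > r &\implies a_{N_0+3} > r a_{N_0+2} > r^3 a_{N_0}, \\
  &\;\;\vdots \\
\frac{a_{N_0+k}}{a_{N_0+k-1}} > r &\implies a_{N_0+k} > r a_{N_0+k-1} > r^k a_{N_0}.
\end{align*}
Again, this can be proven inductively. Since $r > 1$, $\lim_{k \to \infty} r^k a_{N_0} = \infty$. Hence by the comparison theorem for sequences, $\lim_{k \to \infty} a_{N_0+k} = \infty$. This implies that $\lim_{n \to \infty} a_n = \infty \neq 0$, and by the divergence test, $\sum_{n=1}^{\infty} a_n$ diverges. The theorem follows.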
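We first demonstrate that the ratio test has limitations, namely the case where $L = 1$.
Example 2.10. Consider the trivially divergent series $\sum_{n=1}^{\infty} 1$. Let $a_n = 1$. Then
\[ \lim_{n \to \infty} \frac{a_{n+1}}{a_n} = 1. \]
The ratio test does not detect the divergence of this series.
Example 2.11. Consider $\sum_{n=1}^{\infty} \frac{1}{n}$. This is the harmonic series, and it diverges. Let $a_n = \frac{1}{n}$. Then
\[ \lim_{n \to \infty} \frac{a_{n+1}}{a_n} = \lim_{n \to \infty} \frac{\frac{1}{n+1}}{\frac{1}{n}} = \lim_{n \to \infty} \frac{n}{n+1} = 1. \]
The ratio test does not help.
Example 2.12. Consider $\sum_{n=1}^{\infty} \frac{1}{n^2}$. We know this series converges. Let $a_n = \frac{1}{n^2}$. Then
\[ \lim_{n \to \infty} \frac{a_{n+1}}{a_n} = \lim_{n \to \infty} \frac{\frac{1}{(n+1)^2}}{\frac{1}{n^2}} = \lim_{n \to \infty} \left( \frac{n}{n+1} \right)^2 = 1. \]
The ratio test does not help.
Example 2.13. Consider $\sum_{n=1}^{\infty} \frac{1}{n^{1+1/n}}$. Let $a_n = \frac{1}{n^{1+1/n}}$. Then
\begin{align*}
\lim_{n \to \infty} \frac{a_{n+1}}{a_n} &= \lim_{n \to \infty} \frac{\frac{1}{(n+1)^{1+1/(n+1)}}}{\frac{1}{n^{1+1/n}}} \\
  &= \lim_{n \to \infty} \frac{n \cdot n^{1/n}}{(n+1)(n+1)^{1/(n+1)}} \\
  &= \lim_{n \to \infty} \frac{n}{n+1} \cdot \frac{\lim_{n \to \infty} n^{1/n}}{\lim_{n \to \infty} (n+1)^{1/(n+1)}}.
\end{align*}
Note that $\ln(n^{1/n}) = \frac{1}{n} \ln(n) = \frac{\ln(n)}{n} \to 0$ as $n \to \infty$, and so $n^{1/n} = e^{\ln(n^{1/n})} \to e^0 = 1$ as $n \to \infty$ ($e^x$ is continuous on $\mathbb{R}$). This means that $(n+1)^{1/(n+1)} \to 1$ as well. Also, $\frac{n}{n+1} \to 1$. Hence the above limit evaluates to 1, which means the ratio test does not help.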
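The four limits above can be illustrated by evaluating $\frac{a_{n+1}}{a_n}$ at a large index. The helper `ratio` and the dictionary `examples` below are illustrative names; evaluating at one large $n$ only suggests the limiting value.

```python
def ratio(term, n):
    """a_{n+1} / a_n for a sequence given as a function of n."""
    return term(n + 1) / term(n)

# The four sequences from Examples 2.10 to 2.13.
examples = {
    "1": lambda n: 1.0,
    "1/n": lambda n: 1 / n,
    "1/n^2": lambda n: 1 / n ** 2,
    "1/n^(1+1/n)": lambda n: 1 / n ** (1 + 1 / n),
}
```

All four ratios are close to 1 for large $n$, even though the first two series diverge and the third converges, which is why the case $L = 1$ carries no information.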
Question: Does the series $\sum_{n=1}^{\infty} \frac{1}{n^{1+1/n}}$ converge?
The ratio test has failed to detect convergence and divergence for these four series. In fact, it does not convey any information when $a_n = \frac{p(n)}{q(n)}$, where $p$ and $q$ are polynomials. The ratio test will most likely fail with "borderline" series like $\sum_{n=1}^{\infty} \frac{1}{n^{1+1/n}}$: it has good reasons to converge (all terms have exponent greater than 1) and it has equally good reasons to diverge (the series is approximately the harmonic series). Because the core of the proof of the ratio test is the geometric series test, one can expect the ratio test to yield fruit only when the series being tested converges or diverges very rapidly like the geometric series. Here are some sample questions involving the ratio test.
Sample Question 14. Determine whether the series
\[ \sum_{n=0}^{\infty} \frac{n!}{2^n} \]
converges.
Strategy: This series involves factorials and exponentials; both of these grow extremely fast. We know, however, that $n!$ grows substantially faster than $2^n$. Since $n!$ is in the numerator, one can expect this series to diverge spectacularly. With this in mind, we expect the ratio test to detect this rapid divergence. Notice that, in the solution, the ratio test is well-suited to the manipulation of exponents and factorials.
Sample Solution 14. Let $a_n = \frac{n!}{2^n}$; $\{a_n\}$ is a positive sequence satisfying the requirements of the ratio test. Now
\[ \lim_{n \to \infty} \frac{a_{n+1}}{a_n} = \lim_{n \to \infty} \frac{\frac{(n+1)!}{2^{n+1}}}{\frac{n!}{2^n}} = \lim_{n \to \infty} \frac{(n+1)! \, 2^n}{n! \, 2^{n+1}} = \lim_{n \to \infty} \frac{n+1}{2} = \infty. \]
The ratio test shows that this series diverges.
Alternate Strategy: One should always keep in mind that in order for a series to converge, the terms must go to zero (the divergence test). In this case, the terms obviously do not go to zero, since $n!$ grows much faster than $2^n$. As a result, the ratio test is not needed at all to show this series diverges.
Alternate Solution: Let $a_n = \frac{n!}{2^n}$. Since $4! = 24 > 16 = 2^4$, we have that for all $n \geq 4$,
\[ n! = n(n-1)(n-2) \cdots 5 \cdot 4! > \overbrace{2 \cdot 2 \cdot 2 \cdots 2}^{n-4} \cdot 2^4 = 2^n. \]
This shows that for all $n \geq 4$, $a_n > 1$. Hence $\lim_{n \to \infty} \frac{n!}{2^n} \neq 0$ and by the divergence test, $\sum_{n=0}^{\infty} a_n$ diverges.
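Sample Question 15. Determine whether the series
\[ \sum_{n=0}^{\infty} \frac{n^3}{n!} \]
converges.
Strategy: Whenever we see a factorial sign, it is very likely that we have to use the ratio test. Looking at the fraction, the factorial obviously grows faster than a cubic. The terms will decrease extremely rapidly towards zero; as a result, the ratio test should be able to detect this.
Sample Solution 15. Let $a_n = \frac{n^3}{n!}$. Then $a_n \geq 0$ for all $n \geq 0$. We will look at the ratio $\frac{a_{n+1}}{a_n}$. We have
\begin{align*}
\frac{a_{n+1}}{a_n} &= \frac{\frac{(n+1)^3}{(n+1)!}}{\frac{n^3}{n!}} \\
  &= \frac{(n+1)^3 \, n!}{n^3 \, (n+1)!} \\
  &= \left( \frac{n+1}{n} \right)^3 \frac{1}{n+1} \\
  &\to 1^3 \cdot 0 = 0
\end{align*}
as $n \to \infty$. Hence by the ratio test, $\sum_{n=0}^{\infty} \frac{n^3}{n!}$ converges.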
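Sample Question 16. Determine whether the series
\[ \sum_{n=0}^{\infty} \frac{n^2}{3^n} \]
converges.
Strategy: The exponential in the denominator is a good indication that we should use the ratio test. The quadratic in the numerator does not grow nearly as fast as powers of three in the denominator. Again, the terms decay rapidly towards zero, and the ratio test should detect this.
Sample Solution 16. Let $a_n = \frac{n^2}{3^n}$. Then $a_n \geq 0$ for all $n \geq 0$. We will look at the ratio $\frac{a_{n+1}}{a_n}$. We have
\begin{align*}
\frac{a_{n+1}}{a_n} &= \frac{\frac{(n+1)^2}{3^{n+1}}}{\frac{n^2}{3^n}} \\
  &= \frac{(n+1)^2 \, 3^n}{n^2 \, 3^{n+1}} \\
  &= \left( \frac{n+1}{n} \right)^2 \frac{1}{3} \\
  &\to 1^2 \cdot \frac{1}{3} = \frac{1}{3} < 1
\end{align*}
as $n \to \infty$. The ratio test shows that $\sum_{n=0}^{\infty} \frac{n^2}{3^n}$ converges.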
Sample Question 17. Determine whether the series
\[ \sum_{n=1}^{\infty} \frac{5^n n!}{n^n} \]
converges.
Strategy: This series is full of evidence that we should use the ratio test: a fraction involving exponentials and factorials. This time, it is difficult to tell which side grows faster: the numerator has powers of five multiplied by a factorial, while the denominator has $n^n$, so we really do need the ratio test. Hopefully one side grows so much faster than the other that the ratio test will be able to detect convergence or divergence.
Sample Solution 17. As usual, we let $a_n = \frac{5^n n!}{n^n}$. It is clear that $a_n \geq 0$ for all $n \geq 1$. We will look at the ratio $\frac{a_{n+1}}{a_n}$. We have
\begin{align*}
\frac{a_{n+1}}{a_n} &= \frac{\frac{5^{n+1} (n+1)!}{(n+1)^{n+1}}}{\frac{5^n n!}{n^n}} \\
  &= \frac{5^{n+1} (n+1)! \, n^n}{5^n \, n! \, (n+1)^{n+1}} = 5(n+1) \frac{n^n}{(n+1)^{n+1}} \\
  &= 5 \frac{n^n}{(n+1)^n} = 5 \left( \frac{n}{n+1} \right)^n = \frac{5}{\left( \frac{n+1}{n} \right)^n} \\
  &= \frac{5}{\left( 1 + \frac{1}{n} \right)^n} \to \frac{5}{e} > 1
\end{align*}
as $n \to \infty$. The ratio test shows that the series $\sum_{n=1}^{\infty} \frac{5^n n!}{n^n}$ diverges.
2.1.4 The Root Test
Another test that is based on the geometric series test is the following.
Theorem 2.1.9 [Root Test]. Let $\{a_n\}$ be a positive sequence. Assume that $\lim_{n \to \infty} a_n^{1/n} = L$, where $L \in \mathbb{R}$ or $L = \infty$.
1. If $0 \leq L < 1$, then $\sum_{n=1}^{\infty} a_n$ converges.
2. If $L > 1$, including the case $L = \infty$, then $\sum_{n=1}^{\infty} a_n$ diverges.
Once again, note that if $L = 1$, then the root test does not tell us anything about convergence or divergence of $\sum_{n=1}^{\infty} a_n$.
Proof. We first prove part (1). Suppose $L$ satisfies $0 \leq L < 1$. Choose $r$ so that $L < r < 1$. Since $\lim_{n \to \infty} a_n^{1/n} = L$, we can find an $N_0 \in \mathbb{N}$ such that for all $n \geq N_0$,
\begin{align*}
\left| a_n^{1/n} - L \right| < r - L
  &\implies L - r < a_n^{1/n} - L < r - L \\
  &\implies 2L - r < a_n^{1/n} < r \\
  &\implies a_n < r^n,
\end{align*}
where $0 \leq a_n$ ($\{a_n\}$ is positive). So $0 \leq a_n < r^n$ for all $n \geq N_0$. Note that $\sum_{n=N_0}^{\infty} r^n$ converges by the geometric series test and remark 2.1.4, and so by the comparison test, $\sum_{n=N_0}^{\infty} a_n$ converges; therefore, $\sum_{n=1}^{\infty} a_n$ converges by remark 2.1.4.
To prove part (2), suppose $L$ satisfies $L > 1$, including the case $L = \infty$. Choose $r$ so that $1 < r < L$. We can find an $N_0 \in \mathbb{N}$ such that for all $n \geq N_0$,
\begin{align*}
\left| a_n^{1/n} - L \right| < L - r
  &\implies r - L < a_n^{1/n} - L < L - r \\
  &\implies r < a_n^{1/n} < 2L - r \\
  &\implies r^n < a_n
\end{align*}
if $L < \infty$, or
\[ r < a_n^{1/n} \implies r^n < a_n \]
if $L = \infty$. In either case, we have that for all $n \geq N_0$, $r^n < a_n$. Note that $\lim_{n \to \infty} r^n = \infty$ since $r > 1$ and so by the comparison theorem for sequences, $\lim_{n \to \infty} a_n = \infty$. By the divergence test, $\sum_{n=1}^{\infty} a_n$ diverges. The theorem follows.
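As an illustration of how the ratio test and the root test relate, the sketch below estimates both $\frac{a_{n+1}}{a_n}$ and $a_n^{1/n}$ for $a_n = \frac{5^n n!}{n^n}$, the terms from Sample Question 17. To avoid overflow it works with $\ln(a_n)$, using `math.lgamma(n + 1)` for $\ln(n!)$. The function names are hypothetical, and the observation that both quantities approach $\frac{5}{e}$ is a numerical one here, not a result proved in these notes.

```python
import math

def log_term(n):
    """ln(a_n) for a_n = 5^n n! / n^n, using lgamma(n + 1) = ln(n!)."""
    return n * math.log(5) + math.lgamma(n + 1) - n * math.log(n)

def ratio(n):
    """a_{n+1} / a_n, computed from logarithms."""
    return math.exp(log_term(n + 1) - log_term(n))

def nth_root(n):
    """a_n^(1/n), computed from logarithms."""
    return math.exp(log_term(n) / n)
```

Both values exceed 1 for large $n$, so either test would indicate divergence for this series; the root converges to its limit more slowly than the ratio does.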
2.2 Absolute and Conditional Convergence
We have been working with positive series throughout the last section. We now
move on to other types of series.
Definition 2.2.1. We say that $\sum_{n=1}^{\infty} a_n$ converges absolutely if $\sum_{n=1}^{\infty} |a_n|$ converges. We say that $\sum_{n=1}^{\infty} a_n$ converges conditionally if it converges but $\sum_{n=1}^{\infty} |a_n|$ diverges.
Question: Does absolute convergence imply convergence?
Proposition 2.2.2. If $\sum_{n=1}^{\infty} |a_n|$ converges, then $\sum_{n=1}^{\infty} a_n$ converges.
Proof. Consider $b_n := a_n + |a_n|$. Then we have $0 \leq b_n \leq 2|a_n|$. By remark 2.1.4 (2), $\sum_{n=1}^{\infty} 2|a_n|$ converges, so now the comparison test shows that the positive series $\sum_{n=1}^{\infty} b_n$ converges. Since $a_n = b_n - |a_n|$ by definition of $b_n$, we have that $\sum_{n=1}^{\infty} a_n$ converges to
\[ \sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty} b_n - \sum_{n=1}^{\infty} |a_n| \in \mathbb{R}, \]
by remark 2.1.4 (3). This proves the proposition.

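Example 2.14. Consider the series $\sum_{n=1}^{\infty} \frac{\cos(n)}{n^2}$. Note that $\left| \frac{\cos(n)}{n^2} \right| \leq \frac{1}{n^2}$. Since $\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges, so does the positive series $\sum_{n=1}^{\infty} \left| \frac{\cos(n)}{n^2} \right|$ by the comparison test. This shows that $\sum_{n=1}^{\infty} \frac{\cos(n)}{n^2}$ converges absolutely and hence it converges by proposition 2.2.2.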
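Example 2.15 [Alternating Harmonic Series]. Consider the alternating harmonic series, $\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots$. Let $S_k$ be the $k$-th partial sum of this series. Let $a_n = \frac{1}{n}$; then
\[ S_k = \sum_{n=1}^{k} (-1)^{n-1} a_n = a_1 - a_2 + a_3 - a_4 + \cdots. \]
We can see that
\[ S_1 > S_3 > S_5 > \cdots > S_{2k-1} > S_{2k+1} > \cdots > \frac{1}{2}. \]
The sequence $\{S_{2k-1}\}$ is decreasing and bounded below by $\frac{1}{2}$. By the monotone convergence theorem, $S_{2k-1} \to L$ for some $L \in \mathbb{R}$. Similarly, we have that
\[ S_2 < S_4 < S_6 < \cdots < S_{2k} < S_{2k+2} < \cdots < 1. \]
The sequence $\{S_{2k}\}$ is increasing and bounded above by 1. By the monotone convergence theorem, $S_{2k} \to M$ for some $M \in \mathbb{R}$. Note that $|S_{2k} - S_{2k-1}| = \frac{1}{2k} \to 0$ as $k \to \infty$. Therefore, $L = M$ and $\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} := \lim_{k \to \infty} S_k$ converges to $L = M$. But we know that $\sum_{n=1}^{\infty} \left| \frac{(-1)^{n-1}}{n} \right| = \sum_{n=1}^{\infty} \frac{1}{n}$ diverges. So $\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n}$ converges conditionally.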
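The behaviour of the odd and even partial sums of the alternating harmonic series can be observed with exact fractions. This is a small sketch; `partial_sums` is an illustrative name.

```python
from fractions import Fraction

def partial_sums(count):
    """Return [S_1, ..., S_count] for the alternating harmonic series."""
    sums, total = [], Fraction(0)
    for n in range(1, count + 1):
        total += Fraction((-1) ** (n - 1), n)
        sums.append(total)
    return sums
```

Slicing the result as `sums[0::2]` gives $S_1, S_3, S_5, \ldots$ and `sums[1::2]` gives $S_2, S_4, S_6, \ldots$; the first list decreases, the second increases, and the gap between $S_{2k-1}$ and $S_{2k}$ is exactly $\frac{1}{2k}$.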
One probably can guess from the above example that there is a more general
theorem that will prove convergence for series whose terms alternate signs and
decrease to zero. This is in fact the case, but for now, we will turn to consequences of conditional convergence. We will first need the following definition.
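Definition 2.2.3. Let $\{a_n\} \subset \mathbb{R}$ be any sequence. Define
\[ a_n^+ := \begin{cases} a_n & \text{if } a_n \geq 0, \\ 0 & \text{if } a_n < 0. \end{cases} \]
Also define
\[ a_n^- := \begin{cases} 0 & \text{if } a_n \geq 0, \\ -a_n & \text{if } a_n < 0. \end{cases} \]
Note. For any sequence $\{a_n\}$,
1. $a_n^+ \geq 0$ and $a_n^- \geq 0$ for all $n \in \mathbb{N}$.
2. We have
\begin{align*}
a_n &= a_n^+ - a_n^-, \\
|a_n| &= a_n^+ + a_n^-,
\end{align*}
for all $n \in \mathbb{N}$.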
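Remark 2.2.4. If a series $\sum_{n=1}^{\infty} a_n$ converges conditionally, then
\[ \sum_{n=1}^{\infty} a_n^+ = \infty = \sum_{n=1}^{\infty} a_n^-. \]
Proof. Suppose to the contrary that $\sum_{n=1}^{\infty} a_n^+ \neq \infty$. Since $a_n^+ \geq 0$ for all $n$, $\sum_{n=1}^{\infty} a_n^+$ is a positive series and hence converges to some $L \in \mathbb{R}$. We know that $\sum_{n=1}^{\infty} a_n = M \in \mathbb{R}$. Since $a_n^- = a_n^+ - a_n$, we have that
\[ \sum_{n=1}^{\infty} a_n^- = \sum_{n=1}^{\infty} a_n^+ - \sum_{n=1}^{\infty} a_n = L - M \in \mathbb{R} \]
by remark 2.1.4 (3). Since $|a_n| = a_n^+ + a_n^-$, this means that
\[ \sum_{n=1}^{\infty} |a_n| = \sum_{n=1}^{\infty} a_n^+ + \sum_{n=1}^{\infty} a_n^- = 2L - M \in \mathbb{R} \]
again by remark 2.1.4 (3). This is a contradiction since $\sum_{n=1}^{\infty} |a_n| = \infty$ by definition of conditional convergence. This contradiction arose due to the assumption that $\sum_{n=1}^{\infty} a_n^+$ converges; hence $\sum_{n=1}^{\infty} a_n^+$, a positive series, must diverge to $\infty$. The exact same argument, mutatis mutandis, shows that $\sum_{n=1}^{\infty} a_n^- = \infty$.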
We once again review the divergence test: If any series $\sum_{n=1}^{\infty} a_n$ converges, then $a_n \to 0$. We now have the following interesting observation.
Observation. Let $\sum_{n=1}^{\infty} a_n$ be a conditionally convergent series. We know by remark 2.2.4, $\sum_{n=1}^{\infty} a_n^+ = \infty = \sum_{n=1}^{\infty} a_n^-$. We also have that $\sum_{n=j}^{\infty} a_n^+ = \infty = \sum_{n=k}^{\infty} a_n^-$ for any $j, k \in \mathbb{N}$ by remark 2.1.4 (1). This means that the magnitude of the sum of any tail of the two series can exceed $M$ for any $M \in \mathbb{N}$. We now claim that for any $\alpha \in \mathbb{R}$, it is possible to rearrange the positive and negative terms of $\sum_{n=1}^{\infty} a_n$ so that the newly rearranged series $\sum_{n=1}^{\infty} a'_n = \alpha$.
To do so, simply choose positive terms until the partial sum exceeds $\alpha$. This is possible since $\sum_{n=1}^{\infty} a_n^+ = \infty$. Now choose enough negative terms until the partial sum drops back below $\alpha$. This is possible since $\sum_{n=1}^{\infty} a_n^-$ also diverges to $\infty$. Now we choose enough positive terms until the partial sum again exceeds $\alpha$. This is possible since we noted that whatever tail remains of the positive series still diverges to $\infty$. Do the same with the negative series. Repeating this process, we now have a rearrangement of $\sum_{n=1}^{\infty} a_n$ that oscillates near $\alpha$. We know that this new series actually converges to $\alpha$ since $a_n \to 0$ as $n \to \infty$.
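The greedy procedure described in the observation can be sketched for the alternating harmonic series, whose positive terms are $1, \frac{1}{3}, \frac{1}{5}, \ldots$ and whose negative terms are $-\frac{1}{2}, -\frac{1}{4}, \ldots$. The function name and the choice of series are assumptions made for illustration, and floating-point sums are used, so this only illustrates the oscillation around the target rather than proving convergence.

```python
def rearranged_partial_sum(target, steps):
    """Partial sum of a greedy rearrangement of the alternating harmonic series.

    While the running sum is at or below the target, take the next unused
    positive term; otherwise take the next unused negative term.
    """
    total = 0.0
    next_odd, next_even = 1, 2
    for _ in range(steps):
        if total <= target:
            total += 1 / next_odd
            next_odd += 2
        else:
            total -= 1 / next_even
            next_even += 2
    return total
```

After the first crossing, the distance from the target is at most the size of the most recently used term on each side, which is why the partial sums settle near any chosen target.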
The above informal observation motivates the following formal definition of
a rearrangement of a series.
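Definition 2.2.5. Given a series $\sum_{n=1}^{\infty} a_n$ and a bijection $\sigma : \mathbb{N} \to \mathbb{N}$, the series $\sum_{n=1}^{\infty} b_n$ where
\[ b_n := a_{\sigma(n)} \]
is called a rearrangement of $\sum_{n=1}^{\infty} a_n$.
Theorem 2.2.6. Let $\sum_{n=1}^{\infty} a_n$ be any series. Then
1. If $\sum_{n=1}^{\infty} a_n$ is conditionally convergent, then for any $\alpha \in \mathbb{R} \cup \{\infty, -\infty\}$, there exists a rearrangement $\sum_{n=1}^{\infty} b_n$ of $\sum_{n=1}^{\infty} a_n$ such that
\[ \sum_{n=1}^{\infty} b_n = \alpha. \]
2. If $\sum_{n=1}^{\infty} a_n = L \in \mathbb{R}$ is absolutely convergent, then for any rearrangement $\sum_{n=1}^{\infty} b_n$ of $\sum_{n=1}^{\infty} a_n$,
\[ \sum_{n=1}^{\infty} b_n = L. \]
Proof. We will leave the proof as a homework exercise.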
Hint: For part (1), formalize the argument presented in the observation above. For part (2), suppose $b_n = a_{\phi(n)}$ for some bijection $\phi$. Let
\[ S_k = \sum_{n=1}^{k} a_n, \qquad T_k = \sum_{n=1}^{k} a_{\phi(n)}, \qquad R_k = \sum_{n=1}^{k} |a_n|. \]
How large can $|S_k - T_k|$ be? You may want to use $R_k$ in some triangle inequality. Remember that $\sum_{n=1}^{\infty} |a_n|$ converges.

The above theorem shows that absolutely convergent series and conditionally
convergent series behave as polar opposites when rearranged.
Now we will look at series whose terms alternate signs and decrease to zero.
2.3
Alternating Series Test
Recall that the alternating harmonic series converged. We will generalize the
result in the following theorem; the proof will imitate the proof for example
2.15.
Theorem 2.3.1 [Alternating Series Test]. Assume that $\{a_n\}$ satisfies the following:
1. $a_n \ge 0$,
2. $a_{n+1} \le a_n$,
3. $\lim_{n \to \infty} a_n = 0$.
Then $\sum_{n=1}^{\infty} (-1)^{n-1} a_n$ converges. Moreover, in this case,
\[ |S - S_k| \le a_{k+1}, \]
where $S_k := \sum_{n=1}^{k} (-1)^{n-1} a_n$ and $S := \sum_{n=1}^{\infty} (-1)^{n-1} a_n$.
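Proof. We will show that the two subsequences $\{S_{2k-1}\}_{k \in \mathbb{N}}$ and $\{S_{2k}\}_{k \in \mathbb{N}}$ of $\{S_k\}_{k \in \mathbb{N}}$ converge to the same limit $L$.
We first prove that both subsequences are monotonic. We have
\[
\begin{aligned}
S_{2(k+1)-1} - S_{2k-1} &= S_{2k+1} - S_{2k-1} \\
&= \sum_{n=1}^{2k+1} (-1)^{n-1} a_n - \sum_{n=1}^{2k-1} (-1)^{n-1} a_n \\
&= (-1)^{2k-1} a_{2k} + (-1)^{(2k+1)-1} a_{2k+1} \\
&= -a_{2k} + a_{2k+1} \\
&\le 0.
\end{aligned}
\]
This shows that $\{S_{2k-1}\}$ is decreasing. Similarly,
\[
\begin{aligned}
S_{2(k+1)} - S_{2k} &= S_{2k+2} - S_{2k} \\
&= \sum_{n=1}^{2k+2} (-1)^{n-1} a_n - \sum_{n=1}^{2k} (-1)^{n-1} a_n \\
&= (-1)^{(2k+1)-1} a_{2k+1} + (-1)^{(2k+2)-1} a_{2k+2} \\
&= a_{2k+1} - a_{2k+2} \\
&\ge 0.
\end{aligned}
\]
This shows that $\{S_{2k}\}$ is increasing.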

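Now we will show that both subsequences are bounded. We have
\[
\begin{aligned}
S_{2k-1} &= (a_1 - a_2) + (a_3 - a_4) + \cdots + (a_{2k-3} - a_{2k-2}) + a_{2k-1} \\
&\ge 0 + 0 + \cdots + 0 + a_{2k-1} \\
&\ge 0,
\end{aligned}
\]
and
\[
\begin{aligned}
S_{2k} &= a_1 - (a_2 - a_3) - (a_4 - a_5) - \cdots - (a_{2k-2} - a_{2k-1}) - a_{2k} \\
&\le a_1 - 0 - 0 - \cdots - 0 - a_{2k} \\
&\le a_1.
\end{aligned}
\]
Hence $\{S_{2k-1}\}$ is bounded below by $0$ and $\{S_{2k}\}$ is bounded above by $a_1$. By the monotone convergence theorem, $\lim_{k \to \infty} S_{2k-1} = L \in \mathbb{R}$ and $\lim_{k \to \infty} S_{2k} = M \in \mathbb{R}$.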
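Next we show that $L = M$. Let $\epsilon > 0$. We can choose a $K$ so that we have all of $|a_{2K}| < \frac{\epsilon}{3}$, $|L - S_{2K-1}| < \frac{\epsilon}{3}$, and $|S_{2K} - M| < \frac{\epsilon}{3}$. We now have
\[
\begin{aligned}
|L - M| &\le |L - S_{2K-1}| + |S_{2K-1} - S_{2K}| + |S_{2K} - M| \\
&= |L - S_{2K-1}| + |(-1)^{2K-1} a_{2K}| + |S_{2K} - M| \\
&= |L - S_{2K-1}| + |a_{2K}| + |S_{2K} - M| \\
&< \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} \\
&= \epsilon.
\end{aligned}
\]
Since $\epsilon > 0$ was arbitrary, $L = M$. This shows that $\sum_{n=1}^{\infty} (-1)^{n-1} a_n := \lim_{k \to \infty} S_k = S$, where $S := L = M$.
Finally, since $S_{2k}$ increases to $S$ and $S_{2k-1}$ decreases to $S$, we have
\[ S_{2k} \le S \le S_{2k-1} \]
for all $k \in \mathbb{N}$. So
\[ |S_k - S| \le |S_k - S_{k+1}| = |(-1)^k a_{k+1}| = a_{k+1} \]
for all $k \in \mathbb{N}$. This proves the last part of the theorem.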
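As a numerical sanity check of the estimate $|S - S_k| \le a_{k+1}$, the sketch below uses the alternating harmonic series $a_n = \frac1n$. It relies on the standard fact (not proved in this section) that this series sums to $\ln 2$; the helper name `alternating_partial_sum` is an illustrative choice.

```python
import math

def alternating_partial_sum(a, k):
    """S_k = sum_{n=1}^{k} (-1)^(n-1) * a(n)."""
    return sum((-1) ** (n - 1) * a(n) for n in range(1, k + 1))

def a(n):
    # Terms of the alternating harmonic series: positive, decreasing, tending to 0.
    return 1.0 / n

S = math.log(2)  # assumed known value of the full sum

for k in (1, 2, 5, 10, 100, 1000):
    error = abs(S - alternating_partial_sum(a, k))
    # Print k, the actual error, the bound a_{k+1}, and whether the bound holds.
    print(k, error, a(k + 1), error <= a(k + 1))
```

The last column should read `True` in every row, in line with the final part of the theorem.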
Sample Question 18. Show that
\[ \sum_{n=2}^{\infty} \frac{(-1)^n}{n[\ln(n)]^2} \]
converges.
Strategy: Whenever we see $(-1)^n$, it is very likely that we have to use the alternating series test. We must check three things: Do the terms go to zero? Is the series always alternating? Are the magnitudes of the terms decreasing?
The first two questions are easy to answer: $\frac{1}{n[\ln(n)]^2} \to 0$ as $n \to \infty$, and $\frac{1}{n[\ln(n)]^2} > 0$ for all $n \ge 2$. The last question is simple once we realize that both $\frac{1}{n}$ and $\frac{1}{[\ln(n)]^2}$ are positive and decreasing, so their product is decreasing. The alternating series test then proves that this series converges.
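Sample Solution 18. Let $a_n = \frac{1}{n[\ln(n)]^2}$. Then it is clear that $a_n \to 0$ as $n \to \infty$ and $a_n > 0$ for all $n \ge 2$. Also observe that $\frac{1}{n} > \frac{1}{n+1} > 0$ and $\frac{1}{\ln(n)} > \frac{1}{\ln(n+1)} > 0$ for all $n \ge 2$. This shows that
\[ a_n = \frac{1}{n[\ln(n)]^2} = \frac{1}{n} \cdot \frac{1}{\ln(n)} \cdot \frac{1}{\ln(n)} > \frac{1}{n+1} \cdot \frac{1}{\ln(n+1)} \cdot \frac{1}{\ln(n+1)} = \frac{1}{(n+1)[\ln(n+1)]^2} = a_{n+1} \]
for all $n \ge 2$. Hence the series
\[ \sum_{n=2}^{\infty} (-1)^n a_n = \sum_{n=2}^{\infty} \frac{(-1)^n}{n[\ln(n)]^2} \]
converges by the alternating series test.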

Sample Question 19. Determine whether the following series converges conditionally, converges absolutely, or diverges:
\[ \frac{1}{1 \cdot 2} - \frac{1}{3 \cdot 2} + \frac{1}{5 \cdot 2} - \frac{1}{7 \cdot 2} + \cdots. \]
Strategy: This series is not given in sigma notation; after observing the pattern, we can see that the terms are $a_n := \frac{1}{2(2n+1)}$ for $n \ge 0$ and the signs alternate. The series is
\[ \sum_{n=0}^{\infty} (-1)^n a_n = \sum_{n=0}^{\infty} \frac{(-1)^n}{2(2n+1)}. \]
This type of question not only requires us to determine whether the series converges, but also how well the series converges: We must also check whether the series of absolute values $\sum_{n=0}^{\infty} |(-1)^n a_n| = \sum_{n=0}^{\infty} a_n$ converges. With the absolute value sign, we see that the series is just one-half of the odd terms of the harmonic series, and should diverge; we can use the limit comparison test to show this. Without the absolute value sign, the series is alternating, and the terms decrease to zero, so we can use the alternating series test to prove convergence. Our conjecture is that this series converges conditionally.
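Sample Solution 19. The series in question is $\sum_{n=0}^{\infty} (-1)^n a_n$ where $a_n := \frac{1}{2(2n+1)}$. We shall first show that this series converges. It is clear that $a_n > 0$ for all $n \ge 0$ and that $a_n = \frac{1}{2(2n+1)} = \frac{1}{4n+2} \to 0$ as $n \to \infty$. Also,
\[ a_{n+1} = \frac{1}{2(2(n+1)+1)} = \frac{1}{4n+6} < \frac{1}{4n+2} = \frac{1}{2(2n+1)} = a_n \]
for all $n \ge 0$. By the alternating series test, the series $\sum_{n=0}^{\infty} (-1)^n a_n$ converges.
Now consider the series $\sum_{n=0}^{\infty} |(-1)^n a_n| = \sum_{n=0}^{\infty} a_n$. Let $b_n = \frac{1}{n}$. Comparing $a_n$ with $b_n$, we have
\[ \frac{a_n}{b_n} = \frac{\frac{1}{2(2n+1)}}{\frac{1}{n}} = \frac{n}{4n+2} = \frac{1}{4 + \frac{2}{n}} \to \frac{1}{4+0} = \frac{1}{4} \in (0, \infty) \]
as $n \to \infty$. Since $\sum_{n=1}^{\infty} \frac{1}{n}$ diverges, the limit comparison test shows that $\sum_{n=1}^{\infty} a_n$ also diverges. It follows that $\sum_{n=0}^{\infty} a_n = \sum_{n=0}^{\infty} \frac{1}{2(2n+1)}$ diverges.
Hence the series
\[ \frac{1}{1 \cdot 2} - \frac{1}{3 \cdot 2} + \frac{1}{5 \cdot 2} - \frac{1}{7 \cdot 2} + \cdots \]
converges conditionally.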
In this type of question, remember that if the series of absolute values converges, then the original series is said to converge absolutely, and hence is guaranteed to converge by proposition 2.2.2! There is no point in spending extra time checking whether the original series converges. Similarly, if the original series diverges, then there is no more work to do.
The only case where one has to check the convergence of both the original
series and the series of absolute values is exemplified via the above sample
question: conditional convergence. In this type of convergence, the original
series converges but the series of absolute values diverges.
Based on this discussion, whenever a question similar to the above sample
question appears, it is best to do one's rough work in this order:
1. Check whether the series of absolute values converges. If it does, the
original series is said to converge absolutely and there is no more work to
do. Go to the next step if it diverges.
2. Check whether the original series converges. If it does, then the original
series is said to converge conditionally. If it does not, then the original
series diverges.
Of course, one may write the final solution in a different order, as seen in the
sample solution above.
The next example compares the rate of convergence between the series from sample question 18 and $\sum_{n=2}^{\infty} \frac{1}{n[\ln(n)]^2}$.
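Example 2.16. Consider the two series $\sum_{n=2}^{\infty} \frac{1}{n[\ln(n)]^2}$ and $\sum_{n=2}^{\infty} \frac{(-1)^n}{n[\ln(n)]^2}$. We have shown that the latter series converges; we will use the integral test to show that the former series converges. Substituting $u = \ln(x)$, $\frac{du}{dx} = \frac{1}{x}$, we get
\[
\begin{aligned}
\int_2^{\infty} \frac{1}{x[\ln(x)]^2} \, dx &:= \lim_{b \to \infty} \int_2^{b} \frac{1}{x[\ln(x)]^2} \, dx \\
&= \lim_{b \to \infty} \int_{\ln(2)}^{\ln(b)} \frac{1}{u^2} \, du \\
&= \lim_{b \to \infty} \left[ -\frac{1}{u} \right]_{\ln(2)}^{\ln(b)} \\
&= \lim_{b \to \infty} \left( -\frac{1}{\ln(b)} + \frac{1}{\ln(2)} \right) \\
&= 0 + \frac{1}{\ln(2)} \\
&= \frac{1}{\ln(2)}.
\end{aligned}
\]
So the integral test shows that the positive series $\sum_{n=2}^{\infty} \frac{1}{n[\ln(n)]^2}$ converges. Observe that by proposition 2.2.2, the series $\sum_{n=2}^{\infty} \frac{(-1)^n}{n[\ln(n)]^2}$ converges also. This provides an alternate (messier) solution to the above sample question.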
Both the integral test and the alternating series test give us an estimation clause. Let $S_k = \sum_{n=2}^{k} \frac{1}{n[\ln(n)]^2}$ and $S = \sum_{n=2}^{\infty} \frac{1}{n[\ln(n)]^2}$. We can use the method in example 2.9 to estimate how large $k$ must be in order for the partial sum $S_k$ to satisfy $|S - S_k| < \frac{1}{100}$; repeating the previous integration, the integral test shows that
\[
\begin{aligned}
|S - S_k| &\le \int_k^{\infty} \frac{1}{x[\ln(x)]^2} \, dx \\
&:= \lim_{b \to \infty} \int_k^{b} \frac{1}{x[\ln(x)]^2} \, dx \\
&= \lim_{b \to \infty} \left( -\frac{1}{\ln(b)} + \frac{1}{\ln(k)} \right) \\
&= \frac{1}{\ln(k)}.
\end{aligned}
\]
We need $\frac{1}{\ln(k)} < \frac{1}{100}$, which is achieved by $k > e^{100}$.
Let us compare this with the alternating series $\sum_{n=2}^{\infty} \frac{(-1)^n}{n[\ln(n)]^2}$. Let $T_k = \sum_{n=2}^{k} \frac{(-1)^n}{n[\ln(n)]^2}$ and $T = \sum_{n=2}^{\infty} \frac{(-1)^n}{n[\ln(n)]^2}$. The alternating series test states that
\[ |T - T_k| \le \frac{1}{(k+1)[\ln(k+1)]^2}. \]
We want $\frac{1}{(k+1)[\ln(k+1)]^2} < \frac{1}{100}$, or $100 < (k+1)[\ln(k+1)]^2$. Simple trial-and-error shows that this holds if and only if $k \ge 14$, which is, incidentally, much better than $k > e^{100}$.
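The trial-and-error step can be reproduced with a few lines of code. This is a minimal sketch; the function name `alternating_bound` is an illustrative choice, and the loop simply walks $k$ upward from 2 until the alternating series bound drops below $\frac{1}{100}$.

```python
import math

def alternating_bound(k):
    """Bound 1 / ((k+1) * ln(k+1)^2) on |T - T_k| from the alternating series test."""
    return 1.0 / ((k + 1) * math.log(k + 1) ** 2)

k = 2
while alternating_bound(k) >= 1 / 100:
    k += 1

print(k)               # smallest k for which the bound is below 1/100
print(math.exp(100))   # threshold required by the integral test estimate
```

Because the bound decreases as $k$ grows, the first $k$ that passes the test is the smallest one that works, and every larger $k$ works as well.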
2.4
Determining Convergence
How do we use the numerous tests that we have developed to test for convergence
of series? We generally proceed in this order.
1. Use the divergence test. If the terms do not tend to zero, then the series
diverges.
2. Next, check for absolute convergence. These tests are useful:
comparison test,
limit comparison test,
integral test,
p-test for series,
ratio test,
root test.
If the series is absolutely convergent, then we are done.
3. Finally, if the series evades absolute convergence and is an alternating
series, then we use the alternating series test to test for (conditional)
convergence.
Different tests are sensitive to different series. With experience, one can
generally select the correct test to use at first sight.
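Example 2.17. Determine whether or not
\[ \sum_{n=1}^{\infty} (-1)^{n+1} \frac{\ln(n)}{n} \]
converges.
Note that
\[ \left| (-1)^{n+1} \frac{\ln(n)}{n} \right| = \frac{\ln(n)}{n} > \frac{1}{n} \]
for all $n > e$. By the comparison test, $\sum_{n=1}^{\infty} \frac{\ln(n)}{n}$ diverges by comparison to the divergent p-series $\sum_{n=1}^{\infty} \frac{1}{n}$. We now test for conditional convergence. Let $f(x) = \frac{\ln(x)}{x}$. Then
\[ f'(x) = \frac{\frac{1}{x} \cdot x - \ln(x)}{x^2} = \frac{1 - \ln(x)}{x^2} < 0 \]
for all $x > e$. Thus, $\left\{ \frac{\ln(n)}{n} \right\}$ decreases for $n > e$, that is, from $n = 3$ onward (by the decreasing function theorem). The fundamental log limit directly shows that $\lim_{n \to \infty} \frac{\ln(n)}{n} = 0$. Hence, from $n = 3$ onward, the given series satisfies the requirements for the alternating series test; since finitely many terms do not affect convergence, $\sum_{n=1}^{\infty} (-1)^{n+1} \frac{\ln(n)}{n}$ converges (conditionally).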
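Example 2.18. Determine whether or not
\[ \sum_{n=1}^{\infty} \left( \frac{3n^2 + 1}{4n^2 - 2n + 1} \right)^n \]
converges.
Let $a_n = \left( \frac{3n^2+1}{4n^2-2n+1} \right)^n$. Note that $a_n > 0$ for all $n \in \mathbb{N}$. We note that
\[ a_n^{1/n} = \frac{3n^2 + 1}{4n^2 - 2n + 1} \to \frac{3}{4} \]
as $n \to \infty$. Since $\frac{3}{4} < 1$, the root test shows that $\sum_{n=1}^{\infty} \left( \frac{3n^2+1}{4n^2-2n+1} \right)^n$ converges (absolutely).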

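Example 2.19. Determine whether or not
\[ \sum_{n=1}^{\infty} \left( \frac{n}{n+1} \right)^n \]
converges.
Note that
\[ \left( \frac{n}{n+1} \right)^n = \left( \frac{1}{\frac{n+1}{n}} \right)^n = \frac{1}{\left( \frac{n+1}{n} \right)^n} = \frac{1}{\left( 1 + \frac{1}{n} \right)^n} \to \frac{1}{e} \ne 0 \]
as $n \to \infty$. The divergence test shows that $\sum_{n=1}^{\infty} \left( \frac{n}{n+1} \right)^n$ diverges.
Example 2.20. Determine whether or not
\[ \sum_{n=2}^{\infty} \frac{\sin(n^3)}{n^2 - 1} \]
converges.
Since $\left| \frac{\sin(n^3)}{n^2 - 1} \right| \le \left| \frac{1}{n^2 - 1} \right| = \frac{1}{n^2 - 1}$ for all $n \ge 2$, the comparison test shows that $\sum_{n=2}^{\infty} \frac{\sin(n^3)}{n^2 - 1}$ converges absolutely if $\sum_{n=2}^{\infty} \frac{1}{n^2 - 1}$ converges. The latter does indeed converge, for
\[ \frac{\frac{1}{n^2 - 1}}{\frac{1}{n^2}} = \frac{n^2}{n^2 - 1} \to 1 \]
as $n \to \infty$, and so the limit comparison test shows that $\sum_{n=2}^{\infty} \frac{1}{n^2 - 1}$ converges by comparison to the convergent p-series $\sum_{n=2}^{\infty} \frac{1}{n^2}$.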
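Example 2.21. For which values of $x$ does the series
\[ \sum_{n=1}^{\infty} \frac{x^n}{n} \]
converge?
Let $a_n = \frac{x^n}{n}$. Then
\[ \left| \frac{a_{n+1}}{a_n} \right| = \frac{\left| \frac{x^{n+1}}{n+1} \right|}{\left| \frac{x^n}{n} \right|} = \frac{|x|^{n+1} \, n}{|x|^n (n+1)} = |x| \, \frac{n}{n+1}. \]
We have that $\lim_{n \to \infty} \left| \frac{a_{n+1}}{a_n} \right| = \lim_{n \to \infty} |x| \, \frac{n}{n+1} = |x|$. The ratio test shows that $\sum_{n=1}^{\infty} \frac{x^n}{n}$ converges absolutely if $|x| < 1$ and diverges if $|x| > 1$. If $x = 1$, then the series becomes the harmonic series $\sum_{n=1}^{\infty} \frac{1^n}{n} = \sum_{n=1}^{\infty} \frac{1}{n}$, which diverges by the p-test for series. If $x = -1$, then the series becomes $\sum_{n=1}^{\infty} \frac{(-1)^n}{n}$, which converges by the alternating series test. Hence the given series converges if and only if $x \in [-1, 1)$.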
Chapter 3
Power Series
It would be convenient if certain functions can be expressed as an infinite sum
of very simple functions (such as polynomials). It would be miraculous if we can
differentiate or integrate these functions by differentiating or integrating their
infinite polynomial expansions, term by term. This chapter will explore what
it means to add infinitely many functions, what it means for an infinite sum of
functions to converge, what it means for a function to be expressible as such an
infinite sum, and finally, what these infinite sums are capable of.
3.0
Definitions and Basic Facts
Definition 3.0.1. A power series centered at $x = a$ is a formal sum of the form
\[ \sum_{n=0}^{\infty} a_n (x - a)^n := a_0 + a_1 (x - a) + a_2 (x - a)^2 + a_3 (x - a)^3 + \cdots. \]
Each $a_n$ is called a term of the power series; $a$ is called the center of the power series. Also, for convenience of notation, we define $0^0 = 1$ (for evaluating power series only).
Example 3.1. These are power series:
\[ \sum_{n=0}^{\infty} x^n \quad \text{and} \quad \sum_{n=0}^{\infty} \frac{x^n}{n!}. \]
We have some fundamental problems to consider.
Problem. For which values of $x$ does the series converge? I.e., for which real numbers $x_0 \in \mathbb{R}$ does the numerical series $\sum_{n=0}^{\infty} a_n (x_0 - a)^n$ converge?

Example 3.2. Consider the power series $\sum_{n=0}^{\infty} x^n$. Let $x = r$; we know from the geometric series test that the series $\sum_{n=0}^{\infty} r^n$ converges if and only if $|r| < 1$, so this power series converges for $x \in (-1, 1)$ only. Incidentally, $(-1, 1)$ is called the interval of convergence for this power series and $1$ is its radius of convergence. We will formally define these two terms later.
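Example 3.3. Consider the power series $\sum_{n=0}^{\infty} \frac{x^n}{n!}$. (Note: $0! := 1$.) For which values of $x$ does this power series converge? We will use the ratio test to find out. Suppose $x \ne 0$. Let $b_n = \frac{x^n}{n!}$. Then
\[ \left| \frac{b_{n+1}}{b_n} \right| = \frac{\left| \frac{x^{n+1}}{(n+1)!} \right|}{\left| \frac{x^n}{n!} \right|} = \frac{|x^{n+1}| \, n!}{|x^n| \, (n+1)!} = \frac{|x|}{n+1} \to 0 \]
as $n \to \infty$, for all $x \in \mathbb{R} \setminus \{0\}$. The ratio test shows that the power series converges absolutely for all $x \in \mathbb{R} \setminus \{0\}$. If $x = 0$, then the power series obviously converges. Hence $\sum_{n=0}^{\infty} \frac{x^n}{n!}$ converges for all $x \in \mathbb{R}$.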
Example 3.4. Consider the power series $\sum_{n=0}^{\infty} n! x^n$. We will again use the ratio test to determine which values of $x$ make the power series converge. Suppose $x \ne 0$. Let $b_n = |n! x^n|$. Then
\[ \frac{b_{n+1}}{b_n} = \frac{|(n+1)! \, x^{n+1}|}{|n! \, x^n|} = (n+1)|x| \to \infty \]
as $n \to \infty$. The ratio test shows that the power series diverges for all $x \in \mathbb{R} \setminus \{0\}$. The power series clearly converges for $x = 0$. Hence $\sum_{n=0}^{\infty} n! x^n$ converges on the set $\{0\}$ only.
In practice, the ratio test will be the primary tool to determine the values
of x for which a power series converges.
Definition 3.0.2. Assume that $\sum_{n=0}^{\infty} a_n (x - a)^n$ converges on a set $I$. Let $f : I \to \mathbb{R}$ be defined by
\[ f(x_0) = \sum_{n=0}^{\infty} a_n (x_0 - a)^n \]
for each $x_0 \in I$. We write
\[ f(x) = \sum_{n=0}^{\infty} a_n (x - a)^n. \]
In this case, we say that $f(x)$ is represented by the power series $\sum_{n=0}^{\infty} a_n (x - a)^n$.
Observation. Without loss of generality, we may assume that $a = 0$ for all power series.
Looking at the three previous examples, we see that the sets of values on which the power series converge are all intervals. This is in fact true of all power series; we will prove this later. The power series $\sum_{n=0}^{\infty} x^n$ converges (absolutely) on $(-1, 1)$; the power series $\sum_{n=0}^{\infty} \frac{x^n}{n!}$ converges absolutely on $\mathbb{R}$; the power series $\sum_{n=0}^{\infty} n! x^n$ converges (absolutely) on $\{0\}$, a degenerate interval.
It appears that all these intervals of convergence are symmetric about the center of their respective power series (zero). This is another fact that will be proved later. For a power series with a bounded interval of convergence $I$, we will let the term radius of convergence or the letter $R$ denote the distance from its center to one of the endpoints of $I$; for a power series with interval of convergence $\mathbb{R}$, we say that it has a radius of convergence $R = \infty$. Thus, $R = 1$ for $\sum_{n=0}^{\infty} x^n$; $R = \infty$ for $\sum_{n=0}^{\infty} \frac{x^n}{n!}$; $R = 0$ for $\sum_{n=0}^{\infty} n! x^n$.
We look at two more examples to illustrate the fact that the interval of
convergence may contain one or both of its endpoints.
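Example 3.5. Consider the power series $\sum_{n=1}^{\infty} \frac{x^n}{n}$. Let $b_n = \frac{x^n}{n}$. Then
\[ \left| \frac{b_{n+1}}{b_n} \right| = \frac{\left| \frac{x^{n+1}}{n+1} \right|}{\left| \frac{x^n}{n} \right|} = \frac{|x^{n+1}| \, n}{|x^n| (n+1)} = \frac{n}{n+1} |x| \to |x| \]
as $n \to \infty$. The ratio test shows that the power series converges for all $x$ satisfying $|x| < 1$ and diverges for all $x$ satisfying $|x| > 1$. If $x = 1$, then $\sum_{n=1}^{\infty} \frac{x^n}{n} = \sum_{n=1}^{\infty} \frac{1}{n}$ diverges by the p-test for series. If $x = -1$, then $\sum_{n=1}^{\infty} \frac{x^n}{n} = \sum_{n=1}^{\infty} (-1)^n \frac{1}{n}$ converges by the alternating series test. Hence $\sum_{n=1}^{\infty} \frac{x^n}{n}$ has interval of convergence $[-1, 1)$ and radius of convergence $1$.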
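Example 3.6. Consider the power series $\sum_{n=1}^{\infty} \frac{x^n}{n^2}$. Let $b_n = \frac{x^n}{n^2}$. Then
\[ \left| \frac{b_{n+1}}{b_n} \right| = \frac{\left| \frac{x^{n+1}}{(n+1)^2} \right|}{\left| \frac{x^n}{n^2} \right|} = \frac{|x^{n+1}| \, n^2}{|x^n| (n+1)^2} = \left( \frac{n}{n+1} \right)^2 |x| \to |x| \]
as $n \to \infty$. The ratio test shows that the power series converges for all $x$ satisfying $|x| < 1$ and diverges for all $x$ satisfying $|x| > 1$. If $|x| = 1$, then $\sum_{n=1}^{\infty} \left| \frac{x^n}{n^2} \right| = \sum_{n=1}^{\infty} \frac{1}{n^2}$ converges by the p-test for series; this shows that the power series converges (absolutely) for $x = \pm 1$ as well. Hence $\sum_{n=1}^{\infty} \frac{x^n}{n^2}$ has interval of convergence $[-1, 1]$ and radius of convergence $1$.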
Observation. The power series above all converge absolutely on the interior of
their intervals of convergence.
We are now ready to explore intervals and radii of convergence formally.
3.1
Interval and Radius of Convergence
Theorem 3.1.1. If a power series $\sum_{n=0}^{\infty} a_n x^n$ converges at $x = x_1$, then it converges absolutely at $x = x_0$ for any $x_0$ satisfying $|x_0| < |x_1|$.
Proof. Note that since $\sum_{n=0}^{\infty} a_n x_1^n$ converges, we have that $\lim_{n \to \infty} |a_n x_1^n| = 0$ by the divergence test; therefore, there exists $M \ge 0$ such that $|a_n x_1^n| \le M$ for all $n$. Let $x_0$ be any real number satisfying $|x_0| < |x_1|$. Then
\[
\begin{aligned}
|a_n x_0^n| &= |a_n| |x_0|^n \\
&= |a_n| |x_1|^n \frac{|x_0|^n}{|x_1|^n} \\
&= |a_n x_1^n| \left| \frac{x_0}{x_1} \right|^n \\
&\le M \left| \frac{x_0}{x_1} \right|^n.
\end{aligned}
\]
Since $\left| \frac{x_0}{x_1} \right| < 1$, $\sum_{n=0}^{\infty} M \left| \frac{x_0}{x_1} \right|^n$ converges by the geometric series test and linearity of series. The comparison test then shows that $\sum_{n=0}^{\infty} |a_n x_0^n|$ converges, as desired.
Sample Question 20. Suppose that $\sum_{n=0}^{\infty} a_n x^n$ converges at $x = -4$ and diverges at $x = 6$. What can be said about the convergence or divergence of the following series?
(a) $\sum_{n=0}^{\infty} a_n$
(b) $\sum_{n=0}^{\infty} a_n 8^n$
(c) $\sum_{n=0}^{\infty} a_n (-3)^n$
(d) $\sum_{n=0}^{\infty} (-1)^n a_n 9^n$
Strategy: In summary, theorem 3.1.1 says that if a power series converges at $x$ equal to some number $x_0$, then the power series will converge (absolutely) for $x$ equal to any number whose absolute value is less than $|x_0|$. So in this question, we are given that $\sum_{n=0}^{\infty} a_n (-4)^n$ converges; therefore, we know that $\sum_{n=0}^{\infty} a_n 1^n$ and $\sum_{n=0}^{\infty} a_n (-3)^n$ both converge, simply because $|1| < |-4|$ and $|-3| < |-4|$. What about $\sum_{n=0}^{\infty} a_n 8^n$? It cannot converge, because if it did converge, then $\sum_{n=0}^{\infty} a_n 6^n$ would have to converge as well (because $|6| < |8|$), but we are given that $\sum_{n=0}^{\infty} a_n 6^n$ diverges! Finally, the same reasoning applies to $\sum_{n=0}^{\infty} (-1)^n a_n 9^n$. The trick is to notice that $\sum_{n=0}^{\infty} (-1)^n a_n 9^n$ is really just $\sum_{n=0}^{\infty} a_n (-9)^n$ and so it cannot converge because $\sum_{n=0}^{\infty} a_n 6^n$ diverges.
Sample Solution 20. By applying theorem 3.1.1, we have the following.
(a) We are given that $\sum_{n=0}^{\infty} a_n (-4)^n$ converges. Since $|1| < |-4|$, $\sum_{n=0}^{\infty} a_n 1^n = \sum_{n=0}^{\infty} a_n$ converges.
(b) $\sum_{n=0}^{\infty} a_n 8^n$ diverges, for if it converged, then $\sum_{n=0}^{\infty} a_n 6^n$ must also converge (because $|6| < |8|$), a contradiction.
(c) We are given that $\sum_{n=0}^{\infty} a_n (-4)^n$ converges. Since $|-3| < |-4|$, $\sum_{n=0}^{\infty} a_n (-3)^n$ converges.
(d) Note that $\sum_{n=0}^{\infty} (-1)^n a_n 9^n = \sum_{n=0}^{\infty} a_n (-9)^n$; but $\sum_{n=0}^{\infty} a_n (-9)^n$ diverges, for if it converged, then $\sum_{n=0}^{\infty} a_n 6^n$ must also converge (because $|6| < |-9|$), a contradiction.
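The reasoning in this sample solution amounts to comparing $|x|$ with the two given points. The sketch below encodes only what theorem 3.1.1 lets us conclude; the function `classify` and its default arguments (convergence at $-4$, divergence at $6$, taken from the sample question) are illustrative, not part of the notes.

```python
def classify(x, converges_at=-4.0, diverges_at=6.0):
    """What theorem 3.1.1 implies about sum a_n x^n at the point x.

    Hypothetical helper: assumes the series is known to converge at
    `converges_at` and to diverge at `diverges_at`.
    """
    if abs(x) < abs(converges_at):
        return "converges absolutely"   # direct use of theorem 3.1.1
    if abs(x) > abs(diverges_at):
        return "diverges"               # else the series would converge at diverges_at
    if x == converges_at:
        return "converges"              # given
    if x == diverges_at:
        return "diverges"               # given
    return "undetermined"               # the theorem says nothing here

for label, x in [("(a)", 1), ("(b)", 8), ("(c)", -3), ("(d)", -9)]:
    print(label, x, classify(x))
```

Points such as $x = 5$ or $x = 4$ fall in the "undetermined" band: the given information does not settle them either way.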
Here we will extend the notation $\sup S$ to mean $\infty$ when the set $S$ is not bounded above, and similarly $\inf S$ to mean $-\infty$ when $S$ is not bounded below. For a nonempty set $S$ that is bounded above, $\sup S$ still denotes the least upper bound of $S$; similarly for $\inf S$.
Definition 3.1.2 [Radius of Convergence]. For a power series $\sum_{n=0}^{\infty} a_n x^n$, construct the set
\[ S := \left\{ x_0 \in \mathbb{R} : \sum_{n=0}^{\infty} a_n x_0^n \text{ converges} \right\}. \]
Note that $S$ is nonempty, since $0 \in S$. Let $R = \sup S$. We will call $R$ the radius of convergence of the series $\sum_{n=0}^{\infty} a_n x^n$. Note that since $0 \in S$, we have that $R \ge 0$.
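Proposition 3.1.3. Given a power series $\sum_{n=0}^{\infty} a_n x^n$, let
\[ S := \left\{ x_0 \in \mathbb{R} : \sum_{n=0}^{\infty} a_n x_0^n \text{ converges} \right\}. \]
If $R := \sup S$ satisfies
1. $R = 0$, then $S = \{0\}$;
2. $R \in (0, \infty)$, then $S = (-R, R)$, $[-R, R)$, $(-R, R]$, or $[-R, R]$;
3. $R = \infty$, then $S = \mathbb{R}$.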
Proof. We will prove the three parts separately.
1. We know that $0 \in S$. So assume that $x_1 \in S$ with $x_1 \ne 0$. Let $x_0 = \frac{|x_1|}{2} > 0$. We have $|x_0| < |x_1|$ and so by theorem 3.1.1, $|x_0| = x_0 \in S$. But $R := \sup S = 0 < x_0 \in S$ is a contradiction, hence $S = \{0\}$.
2. Let $x_0 \in (-R, R)$. Then $|x_0| < R$. There exists $x_1 \in S$ with $|x_0| < |x_1| \le R$ (otherwise $|x_0|$ would be an upper bound for $S$ satisfying $|x_0| < R := \sup S$, a contradiction) and so $x_0 \in S$ by theorem 3.1.1. So $S \supseteq (-R, R)$. Now suppose $x_0$ satisfies $|x_0| > R$. We claim that $\sum_{n=0}^{\infty} a_n x_0^n$ diverges; i.e., $x_0 \notin S$. To see this, suppose to the contrary that $\sum_{n=0}^{\infty} a_n x_0^n$ converges; i.e., $x_0 \in S$. We can find $x_1 > 0$ with $0 < R < x_1 < |x_0|$. By theorem 3.1.1, $x_1 \in S$. This is a contradiction, since $x_1 > R := \sup S$. Hence all $x_0$ satisfying $|x_0| > R$ also satisfy $x_0 \notin S$. The desired result follows.
3. Let $x_0 \in \mathbb{R}$. Since $R := \sup S = \infty$, there exists $x_1 \in S$ with $|x_0| < |x_1|$. Hence $x_0 \in S$ by theorem 3.1.1; it follows that $S = \mathbb{R}$ and the proof is complete.
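Corollary 3.1.4. Given a power series $\sum_{n=0}^{\infty} a_n x^n$, the set
\[ S := \left\{ x_0 \in \mathbb{R} : \sum_{n=0}^{\infty} a_n x_0^n \text{ converges} \right\} \]
is an interval symmetric about its centre $0$.
Proof. Letting $R = \sup S$, all three cases of proposition 3.1.3 show that $S$ is an interval symmetric about $0$ (apart from possibly containing only one of its two endpoints).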
Definition 3.1.5 [Interval of Convergence]. Given a power series $\sum_{n=0}^{\infty} a_n x^n$, the set
\[ S := \left\{ x_0 \in \mathbb{R} : \sum_{n=0}^{\infty} a_n x_0^n \text{ converges} \right\} \]
is called the interval of convergence of the power series. Note that by corollary 3.1.4 above, $S$ is actually an interval.
Note. Similar language can be defined for those power series not centered at
a = 0. Since centering at a = 0 loses no generality, a general power series still
has an interval of convergence symmetric about its center a and the radius of
convergence $R$ would still be half of the length of the interval of convergence. A general power series centered at $a$ with radius of convergence $R$ has interval of convergence $(a - R, a + R)$, together with possibly one or both of its endpoints.
The ratio test is our principal tool in determining the radius of convergence
of a power series. The following is a direct corollary of the ratio test.

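Corollary 3.1.6. Given a power series $\sum_{n=0}^{\infty} a_n (x - a)^n$, if $\lim_{n \to \infty} \left| \frac{a_{n+1}}{a_n} \right| = L$, where $L \in \mathbb{R}$ or $L = \infty$, then we have the following:
1. if $L = 0$, then $\sum_{n=0}^{\infty} a_n (x - a)^n$ has interval of convergence $\mathbb{R}$ and radius of convergence $R = \infty$;
2. if $L > 0$ and $L \in \mathbb{R}$, then $\sum_{n=0}^{\infty} a_n (x - a)^n$ has radius of convergence $R = \frac{1}{L}$;
3. if $L = \infty$, then $\sum_{n=0}^{\infty} a_n (x - a)^n$ has interval of convergence $\{a\}$ and radius of convergence $R = 0$.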
Proof. The corollary is a direct consequence of the ratio test. This proof is left
as an exercise.
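The three cases of the corollary can be illustrated numerically on the coefficients of the power series seen earlier. This is only a sketch: evaluating $\left| \frac{a_{n+1}}{a_n} \right|$ at a few large $n$ suggests the limit $L$ but proves nothing, and the names `coefficient_ratio` and `examples` are illustrative.

```python
from fractions import Fraction
from math import factorial

def coefficient_ratio(a, n):
    """|a_{n+1} / a_n|, computed exactly with rational arithmetic."""
    return abs(a(n + 1) / a(n))

# Coefficient sequences of three power series from this chapter.
examples = {
    "1/n!": lambda n: Fraction(1, factorial(n)),  # ratio 1/(n+1): suggests L = 0, R = infinity
    "1/n": lambda n: Fraction(1, n),              # ratio n/(n+1): suggests L = 1, R = 1
    "n!": lambda n: Fraction(factorial(n)),       # ratio n+1: suggests L = infinity, R = 0
}

for name, a in examples.items():
    ratios = [float(coefficient_ratio(a, n)) for n in (10, 100, 1000)]
    print(name, ratios)
```

The printed ratios shrink toward 0, approach 1, and grow without bound, respectively, matching cases 1, 2, and 3.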
Sample Question 21. Determine the radius of convergence and interval of convergence of the power series
\[ \sum_{n=2}^{\infty} \frac{x^n}{\ln(n)}. \]
Strategy: We begin this type of question by first finding the radius of convergence $R$ using the ratio test. Then, for a power series centered at $0$ (it usually is) we know that it converges on at least the interval $(-R, R)$ (if $R > 0$). To find the interval of convergence, we need only determine whether the power series converges at the endpoints $x = R$ and $x = -R$. At these endpoints, the ratio test process gives a limiting ratio of $1$, which renders the ratio test useless; instead, we will need other tests to determine convergence or divergence. The usual endpoint tests are the divergence test, the alternating series test, the comparison and limit comparison tests, the integral test, and the p-test for series.
In this particular example, the radius of convergence is $R = 1$ and so the endpoints are $1$ and $-1$. The alternating series test is used to determine whether the given power series converges at $x = -1$ while the limit comparison test and the p-test are used to determine whether the series converges at $x = 1$.
Sample Solution 21. For any $x \in \mathbb{R}$, set $b_n = \frac{x^n}{\ln(n)}$ for each $n \ge 2$. Then
\[ \frac{|b_{n+1}|}{|b_n|} = \frac{\left| \frac{x^{n+1}}{\ln(n+1)} \right|}{\left| \frac{x^n}{\ln(n)} \right|} = \left| \frac{x^{n+1} \ln(n)}{x^n \ln(n+1)} \right| = |x| \, \frac{\ln(n)}{\ln(n+1)}. \]
Hence $\lim_{n \to \infty} \frac{|b_{n+1}|}{|b_n|} = \lim_{n \to \infty} |x| \, \frac{\ln(n)}{\ln(n+1)} = |x| \cdot 1 = |x|$. The ratio test shows that the power series converges if $|x| < 1$ and diverges if $|x| > 1$. Hence this power series has a radius of convergence $R = 1$ and converges on the interval $(-1, 1)$ and possibly one or both of its endpoints.
At the endpoint $x = -1$, we have that $\sum_{n=2}^{\infty} \frac{(-1)^n}{\ln(n)}$ is an alternating series with the terms $\frac{1}{\ln(n)}$ positive, decreasing, and converging to $0$ as $n \to \infty$; hence the alternating series test shows that $\sum_{n=2}^{\infty} \frac{(-1)^n}{\ln(n)}$ converges.
At the other endpoint $x = 1$, the series becomes $\sum_{n=2}^{\infty} \frac{1}{\ln(n)}$. Letting $c_n = \frac{1}{\ln(n)}$ and $d_n = \frac{1}{n}$, we have $c_n, d_n > 0$ and
\[ \lim_{n \to \infty} \frac{d_n}{c_n} = \lim_{n \to \infty} \frac{\frac{1}{n}}{\frac{1}{\ln(n)}} = \lim_{n \to \infty} \frac{\ln(n)}{n} = 0 \]
by the fundamental log limit. The limit comparison test shows that $\sum_{n=2}^{\infty} \frac{1}{\ln(n)}$ diverges if $\sum_{n=2}^{\infty} \frac{1}{n}$ diverges, which it does by the p-test for series.
We conclude that the interval of convergence for the power series $\sum_{n=2}^{\infty} \frac{x^n}{\ln(n)}$ is $[-1, 1)$.
We applied the full ratio test to find the radius of convergence in the above sample solution. We could have applied corollary 3.1.6 instead and saved a little work. Instead of checking the limiting ratio of the whole term $a_n (x - a)^n$, we merely check the limiting ratio of the coefficient $a_n$. The following alternate solution uses corollary 3.1.6 instead.
Alternate Solution: Let $a_n = \frac{1}{\ln(n)}$ for $n \ge 2$. Then $a_n > 0$ and we have
\[ \frac{a_{n+1}}{a_n} = \frac{\frac{1}{\ln(n+1)}}{\frac{1}{\ln(n)}} = \frac{\ln(n)}{\ln(n+1)} \to 1 \]
as $n \to \infty$. Corollary 3.1.6 shows that the radius of convergence of $\sum_{n=2}^{\infty} \frac{x^n}{\ln(n)}$ is $R = \frac{1}{1} = 1$. This power series will converge on $(-1, 1)$ and possibly one or both of its endpoints.
Checking the endpoint cases $x = 1$ and $x = -1$ remains the same as in sample solution 21.
Recall that if a power series $\sum_{n=0}^{\infty} a_n x^n$ converges on an interval of convergence $I$, then we can define a function $f : I \to \mathbb{R}$ by $f(x_0) = \sum_{n=0}^{\infty} a_n x_0^n$ for each $x_0 \in I$. This function $f$ is said to be represented by the power series $\sum_{n=0}^{\infty} a_n x^n$ on $I$. We can build power series for new functions based on functions represented by known power series.
Example 3.7. As seen in example 3.2, $\sum_{n=0}^{\infty} x^n$ has interval of convergence $(-1, 1)$. Let $f : (-1, 1) \to \mathbb{R}$ be defined by $f(x_0) = \sum_{n=0}^{\infty} x_0^n$ for each $x_0 \in (-1, 1)$. By the geometric series test, $f(x_0) = \sum_{n=0}^{\infty} x_0^n = \frac{1}{1 - x_0}$ for each $x_0 \in (-1, 1)$. Hence the function $\frac{1}{1-x}$ is represented by the power series $\sum_{n=0}^{\infty} x^n$ on $(-1, 1)$.
Sample Question 22. Find a power series representation for $f(x) = \frac{1}{1 - x^3}$ on the interval $(-1, 1)$.
Strategy: We must build a power series representation for the new function $\frac{1}{1 - x^3}$. To do this, we build our power series based on some similar function for which we already know the power series representation. Searching, we find that $\frac{1}{1-x}$ is one such function. It is similar enough for us to transform it into $\frac{1}{1 - x^3}$. Observe that the interval of convergence of $\sum_{n=0}^{\infty} x^n$ is $(-1, 1)$; also note that if $x \in (-1, 1)$, then its cube $x^3$ is also in $(-1, 1)$.
The solution looks as if we merely plugged in $x^3$ in place of $x$ in both $\frac{1}{1-x}$ and its power series representation $\sum_{n=0}^{\infty} x^n$. Admittedly, this is actually a good summary of the overall strategy; however, we must watch for any changes in the interval of convergence. In this particular example, it only happens that $x^3$, the thing we are plugging in, preserves membership in $(-1, 1)$. That is, in order for the cube of a number to be in $(-1, 1)$, that number must also be from $(-1, 1)$. This need not occur for all problems.
Sample Solution 22. Let $f(u) = \frac{1}{1-u}$. We know [by example 3.7] that $f$ is represented by the power series $\sum_{n=0}^{\infty} u^n$ on its interval of convergence $(-1, 1)$. Hence for any real number $x_0$ whose cube $x_0^3$ lies in $(-1, 1)$, we have that
\[ \frac{1}{1 - x_0^3} = f(x_0^3) = \sum_{n=0}^{\infty} (x_0^3)^n = \sum_{n=0}^{\infty} x_0^{3n}. \]
But we also know that $x^3 \in (-1, 1)$ if and only if $x \in (-1, 1)$, hence $\frac{1}{1 - x^3}$ is represented by the power series $\sum_{n=0}^{\infty} x^{3n}$ on $(-1, 1)$.
Sample Question 23. Find a power series representation for $f(x) = \frac{1}{1 + x^2}$ on the interval $(-1, 1)$.
Strategy: As before, we try to find a familiar function $f$ (whose power series representation is known) such that we can plug something in to get $\frac{1}{1 + x^2}$. We plug the same thing into the power series representation of $f$ to get the answer. After this process, we must check if the interval of convergence changes.
It seems that if we plug $-x^2$ into $\frac{1}{1-x}$ in place of the $x$, we will get $\frac{1}{1 + x^2}$. Doing the same substitution to its power series representation $\sum_{n=0}^{\infty} x^n$, we get $\sum_{n=0}^{\infty} (-x^2)^n = \sum_{n=0}^{\infty} (-1)^n x^{2n}$ as the answer. Did the interval of convergence change? To answer this, we first observe that $(-1, 1)$ is the interval of convergence of the original series $\sum_{n=0}^{\infty} x^n$. We then ask ourselves: what values of $x$ will make $-x^2$ a member of $(-1, 1)$? The answer is all values $x$ between $-1$ and $1$, i.e., all $x \in (-1, 1)$.
Please note that in this and the previous sample solution, we used a different variable name, $u$, for the original power series $\sum_{n=0}^{\infty} u^n$ to avoid confusion. If we used $x$, we would end up plugging $x^3$ or $-x^2$ into $x$, a somewhat awkward maneuver.
Sample Solution 23. Let $f(u) = \frac{1}{1-u}$. We know [by example 3.7] that $f$ is represented by the power series $\sum_{n=0}^{\infty} u^n$ on its interval of convergence $(-1, 1)$. Hence for any real number $x_0$ whose negative square $-x_0^2$ lies in $(-1, 1)$, we have that
\[ \frac{1}{1 + x_0^2} = \frac{1}{1 - (-x_0^2)} = f(-x_0^2) = \sum_{n=0}^{\infty} (-x_0^2)^n = \sum_{n=0}^{\infty} (-1)^n x_0^{2n}. \]
But we also know that $-x^2 \in (-1, 1)$ if and only if $x \in (-1, 1)$, hence $\frac{1}{1 + x^2}$ is represented by the power series $\sum_{n=0}^{\infty} (-1)^n x^{2n}$ on the interval $(-1, 1)$.
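A quick numerical check of this representation: the sketch below compares partial sums of $\sum (-1)^n x^{2n}$ with $\frac{1}{1+x^2}$ at a few sample points inside $(-1, 1)$. The function name `partial_sum`, the sample points, and the choice of 200 terms are illustrative assumptions.

```python
def partial_sum(x, terms):
    """Sum of (-1)^n * x^(2n) for n = 0, ..., terms - 1."""
    return sum((-1) ** n * x ** (2 * n) for n in range(terms))

for x in (0.0, 0.5, -0.9):
    exact = 1.0 / (1.0 + x * x)
    # The two printed values should agree closely for |x| < 1.
    print(x, exact, partial_sum(x, 200))
```

Convergence slows down as $|x|$ approaches $1$, since the terms $x^{2n}$ then decay more slowly.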
Here are our next fundamental problems.

Problem. What are the properties of functions that are represented by a power
series? For example, is it continuous? Is it differentiable? Is the derivative of
the function represented by the new power series obtained by differentiating
each term of the original power series, term by term? What about its integral?
Problem. Which functions actually have a power series representation?
To solve these problems, we must look more closely at sequences of functions.
3.2
Uniform and Pointwise Convergence
3.2.1
Sequences of Functions
Definition 3.2.1. Let $X$ be any nonempty set. A sequence in $X$ is a function $f : \mathbb{N} \to X$. For the sake of intuitive notation, we will use a letter with $n$ as its subscript, such as $x_n$, $f_n$, $a_n$, or $b_n$, to represent $f(n)$, for each $n \in \mathbb{N}$; we will rarely write $f(n)$. Also, we will usually denote the sequence $f$ as $\{x_n\}_{n \in \mathbb{N}}$ or simply $\{x_n\}$ (where $x$ could be any letter). We call $x_n := f(n)$ the $n$th term of the sequence.
Less formally, a sequence is simply an infinite, ordered list of objects from

some set (repetition is allowed). In this section, we will be working mainly with
the set C[a, b], the set of all continuous functions on [a, b].
Definition 3.2.2. Given a sequence $\{f_n(x)\}$ of functions defined on a set $I \subseteq \mathbb{R}$, we say that the sequence $\{f_n(x)\}$ converges pointwise to a function $f_0(x)$ if
\[ \lim_{n \to \infty} f_n(x_0) = f_0(x_0) \]
for all $x_0 \in I$ (for convenience we may drop the independent variable $x$ in $\{f_n(x)\}$ and $f_0(x)$). We also say that $f_0$ is the pointwise limit of $\{f_n\}$; we write $f_n \to f_0$ pointwise on $I$.
Suppose that $\{f_n\}$ is a sequence in $C[0, 1]$ with $f_n \to f_0$ pointwise on $[0, 1]$. What can be said about $f_0(x)$? Not much.
Example 3.8. Firstly, the pointwise limit $f_0$ of a sequence of continuous functions $\{f_n\}$ may not be continuous. Let $f_n(x) = x^n$ on $[0, 1]$. Note that $f_n(x)$ is continuous on $[0, 1]$ for all $n \in \mathbb{N}$. Now
\[ \lim_{n \to \infty} f_n(x_0) = \lim_{n \to \infty} x_0^n = \begin{cases} 0 & \text{if } x_0 \in [0, 1), \\ 1 & \text{if } x_0 = 1. \end{cases} \]
Let
\[ f_0(x) = \begin{cases} 0 & \text{if } x \in [0, 1), \\ 1 & \text{if } x = 1. \end{cases} \]
Then $f_n \to f_0$ pointwise on $[0, 1]$ and yet $f_0$ is discontinuous at $x = 1$!

Example 3.9. Suppose a sequence of continuous functions $\{f_n\}$ converges pointwise to $f_0$ on an interval $[a, b]$; we would hope that $\left\{ \int_a^b f_n(x) \, dx \right\}$ converges to $\int_a^b f_0(x) \, dx$. Unfortunately, this is not the case. For each $n \in \mathbb{N}$, let
\[ f_n(x) = \begin{cases} 4n^2 x & \text{if } x \in [0, \frac{1}{2n}], \\ -4n^2 x + 4n & \text{if } x \in (\frac{1}{2n}, \frac{1}{n}), \\ 0 & \text{if } x \in [\frac{1}{n}, 1] \end{cases} \]
for each $x \in [0, 1]$. Let $f_0(x) = 0$ for each $x \in [0, 1]$. The sequence $\{f_n\}$ converges pointwise to $f_0$ on $[0, 1]$. Note that the sequence $\{f_n\}$ is designed so that the integral $\int_0^1 f_n(x) \, dx = 1$ for all $n \in \mathbb{N}$; yet $\int_0^1 f_0(x) \, dx = 0 \ne 1$!
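Both claims in this example (the integrals all equal 1, while the values at any fixed point are eventually 0) can be checked exactly. The sketch below assumes the piecewise formula for $f_n$ given in the example; the helper names `f` and `integral` are illustrative. Since $f_n$ is linear between its breakpoints, the trapezoid rule on those breakpoints gives the exact integral.

```python
from fractions import Fraction

def f(n, x):
    """The tent-shaped function f_n from example 3.9, evaluated exactly on [0, 1]."""
    x = Fraction(x)
    if x <= Fraction(1, 2 * n):
        return 4 * n * n * x               # rising edge on [0, 1/(2n)]
    if x < Fraction(1, n):
        return -4 * n * n * x + 4 * n      # falling edge on (1/(2n), 1/n)
    return Fraction(0)                     # zero on [1/n, 1]

def integral(n):
    """Exact integral of f_n over [0, 1]: trapezoid rule on the breakpoints."""
    pts = [Fraction(0), Fraction(1, 2 * n), Fraction(1, n), Fraction(1)]
    return sum((b - a) * (f(n, a) + f(n, b)) / 2 for a, b in zip(pts, pts[1:]))

print([integral(n) for n in (1, 2, 10, 100)])           # every entry equals 1
print([f(n, Fraction(1, 10)) for n in (1, 5, 10, 50)])  # eventually 0 at x = 1/10
```

The tent has base $\frac{1}{n}$ and height $2n$, so its area stays at 1 while the tent slides toward $x = 0$ and leaves every fixed point behind.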
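Diversion. Let $f_0 : [0, 1] \to \{0, 1\}$ be defined by
\[ f_0(x) = \begin{cases} 1 & \text{if } x \in [0, 1] \cap \mathbb{Q}, \\ 0 & \text{if } x \in [0, 1] \setminus \mathbb{Q}. \end{cases} \]
Does there exist a sequence $\{f_n\}$ of continuous functions on $[0, 1]$ such that $f_n \to f_0$ pointwise on $[0, 1]$?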
Pointwise convergence turns out to be the wrong concept of convergence to
use here. The following version of convergence for a sequence of functions is
much better.
Definition 3.2.3 [Uniform Convergence]. We say that a sequence of functions $\{f_n\}$ converges uniformly on $S \subseteq \mathbb{R}$ to a function $f_0$ if for every $\epsilon > 0$, there exists an $N \in \mathbb{N}$ such that if $n \ge N$, then
\[ |f_n(x) - f_0(x)| < \epsilon \]
for all $x \in S$.
The following theorem shows that all uniform limits of continuous functions
are continuous.
Theorem 3.2.4. Assume that $f_n$ is continuous for all $n \in \mathbb{N}$ on an interval $I \subseteq \mathbb{R}$. Assume also that $\{f_n\}$ converges uniformly to $f_0$ on $I$. Then $f_0$ is continuous on $I$.
Proof. Let $x_0 \in I$. We will show that $f_0$ is continuous at $x = x_0$. Let $\epsilon > 0$. We choose an $N \in \mathbb{N}$ so that if $n \ge N$, then
\[ |f_n(x) - f_0(x)| < \frac{\epsilon}{3} \tag{3.1} \]
for all $x \in I$. So for this $N$, we have that $f_N$ is continuous at $x = x_0$; hence there exists a $\delta > 0$ such that if $x \in I$ and $|x - x_0| < \delta$, then
\[ |f_N(x) - f_N(x_0)| < \frac{\epsilon}{3}. \tag{3.2} \]
Now for any $x \in I$ with $|x - x_0| < \delta$, we have
\[
\begin{aligned}
|f_0(x) - f_0(x_0)| &\le |f_0(x) - f_N(x)| + |f_N(x) - f_N(x_0)| + |f_N(x_0) - f_0(x_0)| \\
&< \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} \\
&= \epsilon,
\end{aligned}
\]
where $|f_0(x) - f_N(x)| < \frac{\epsilon}{3}$ due to inequality (3.1), $|f_N(x) - f_N(x_0)| < \frac{\epsilon}{3}$ due to inequality (3.2), and $|f_N(x_0) - f_0(x_0)| < \frac{\epsilon}{3}$ due to inequality (3.1). Hence $f_0$ is continuous at $x = x_0$. The desired result follows.
Uniform convergence is a stronger type of convergence than pointwise convergence. It turns out that uniform convergence has many properties that exist
in a much more general framework.
3.2.2
Normed Linear Spaces and Metric Spaces
Definition 3.2.5. Let $V$ be a vector space over $\mathbb{R}$. A norm on $V$ is a function $\|\cdot\| : V \to \mathbb{R}$ satisfying
1. [Positive Definiteness] $\|x\| \geq 0$ and $\|x\| = 0$ if and only if $x = 0$, for all $x \in V$;
2. [Positive Homogeneity] $\|\alpha x\| = |\alpha| \|x\|$ for all $x \in V$ and all $\alpha \in \mathbb{R}$;
3. [Triangle Inequality] $\|x + y\| \leq \|x\| + \|y\|$, for all $x, y \in V$.
The ordered pair $(V, \|\cdot\|)$ is called a normed linear space.
Example 3.10. Let $V = \mathbb{R}$ with the usual addition and scalar multiplication, and let $\|\cdot\| = |\cdot|$, the usual absolute value. One can easily check that $|\cdot|$ is a norm and hence $(\mathbb{R}, |\cdot|)$ is a normed linear space.
Example 3.11. Let $V = \mathbb{R}^2$ with the usual vector addition and scalar multiplication, and for each $x = (x_1, x_2) \in \mathbb{R}^2$, let $\|x\|_2 = \sqrt{x_1^2 + x_2^2}$. One can again check that $\|\cdot\|_2$ is a norm and hence $(\mathbb{R}^2, \|\cdot\|_2)$ is a normed linear space. The $\|\cdot\|_2$ in this example is called the Euclidean norm, where the subscript 2 refers to the squares and the square root.
Intuitively, $\|x\|$ represents the length or size of the vector $x$.
Given a nonempty set $X$, it is sometimes useful to talk about how close or
far away two points $x, y \in X$ are. To talk about the concept of distance, we
can artificially define a metric on this set that measures the distance between
any given pair of elements. The following definition makes the idea of distance
precise.
Definition 3.2.6. Given a nonempty set $X$, a metric on $X$ is a function $d : X \times X \to \mathbb{R}$ satisfying
1. [Positive Definiteness] $d(x, y) \geq 0$ and $d(x, y) = 0$ if and only if $x = y$, for all $x, y \in X$;
2. [Symmetry] $d(x, y) = d(y, x)$, for all $x, y \in X$;
3. [Triangle Inequality] $d(x, y) \leq d(x, z) + d(z, y)$, for all $x, y, z \in X$.
The ordered pair $(X, d)$ is called a metric space.
Intuitively, d(x, y) is the distance from x to y. The triangle inequality
states that distance travelled going from a point x to a point y is at most (if
not shorter than) the distance travelled going from x, taking a detour to a third
point z, and then going back from z towards y.
If one has a normed linear space, then a natural metric space arises out
of it. We can define the distance between two vectors as the norm of their
difference vector, which measures how large their difference is.
Observation. Let $(V, \|\cdot\|)$ be a normed linear space. Let $d : V \times V \to \mathbb{R}$ be defined by $d(x, y) = \|x - y\|$. Then $(V, d)$ is a metric space. (The metric $d$ here is said to be induced by the norm $\|\cdot\|$.)
Proof. Let $d$ be defined as above. Then by definition 3.2.5, we have the following.
1. $d(x, y) = \|x - y\| \geq 0$ for all $x, y \in V$. Also, for all $x, y \in V$, $d(x, y) = \|x - y\| = 0$ if and only if $x - y = 0$, that is, if and only if $x = y$.
2. $d(x, y) = \|x - y\| = \|(-1)(y - x)\| = |-1| \|y - x\| = \|y - x\| = d(y, x)$, for all $x, y \in V$.
3. $d(x, y) = \|x - y\| = \|(x - z) + (z - y)\| \leq \|x - z\| + \|z - y\| = d(x, z) + d(z, y)$, for all $x, y, z \in V$.
Hence $(V, d)$ is a metric space.
An important example of a normed linear space is the following.
Example 3.12. Let
$$V = C[a, b] := \{ f : [a, b] \to \mathbb{R} : f \text{ is continuous on } [a, b] \}.$$
Define the function $\|\cdot\|_\infty : V \to \mathbb{R}$ by
$$\|f\|_\infty := \sup\{|f(x)| : x \in [a, b]\} = \max\{|f(x)| : x \in [a, b]\}.$$
The second equality is true by the extreme value theorem. We check below that $\|\cdot\|_\infty$ is a norm:
1. $\|f\|_\infty \geq 0$ by definition, and $\|f\|_\infty = \max\{|f(x)| : x \in [a, b]\} = 0$ implies that $|f(x)| \leq 0$ for all $x \in [a, b]$, hence $f(x) = 0$ for all $x \in [a, b]$.
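2. Note that, for every $\alpha \in \mathbb{R}$,
$$\begin{aligned} \|\alpha f\|_\infty &= \max\{|\alpha f(x)| : x \in [a, b]\} \\ &= \max\{|\alpha| |f(x)| : x \in [a, b]\} \\ &= |\alpha| \max\{|f(x)| : x \in [a, b]\} \\ &= |\alpha| \|f\|_\infty. \end{aligned}$$
3. It is easy to establish that
$$\begin{aligned} \|f + g\|_\infty &= \max\{|(f + g)(x)| : x \in [a, b]\} \\ &= \max\{|f(x) + g(x)| : x \in [a, b]\} \\ &\leq \max\{|f(x)| + |g(x)| : x \in [a, b]\} \\ &\leq \max\{|f(x)| : x \in [a, b]\} + \max\{|g(x)| : x \in [a, b]\} \\ &= \|f\|_\infty + \|g\|_\infty. \end{aligned}$$
Hence $\|\cdot\|_\infty$ is a norm and so $(C[a, b], \|\cdot\|_\infty)$ is a normed linear space. Here, $\|\cdot\|_\infty$ is called the sup-norm.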
Not surprisingly, the metric space induced by the above normed linear space
is also very important.
Example 3.13. Let $(C[a, b], \|\cdot\|_\infty)$ be defined as in example 3.12. The metric induced by $\|\cdot\|_\infty$ is the metric $d_\infty : C[a, b] \times C[a, b] \to \mathbb{R}$ defined by $d_\infty(f, g) = \|f - g\|_\infty := \max\{|f(x) - g(x)| : x \in [a, b]\}$.
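As a rough illustration of the induced metric, the sketch below approximates the maximum of $|f(x) - g(x)|$ on a uniform grid. The function name `sup_distance` and the grid size are assumptions made for this example; a grid search only gives a lower estimate of the true maximum unless the grid contains a maximiser.

```python
def sup_distance(f, g, a, b, samples=1001):
    """Approximate max |f(x) - g(x)| over [a, b] on a uniform grid.

    This is only a lower estimate of the true maximum unless the grid
    happens to contain a maximiser.
    """
    step = (b - a) / (samples - 1)
    return max(abs(f(a + i * step) - g(a + i * step)) for i in range(samples))


# Distance between f(x) = x and g(x) = x^2 on [0, 1].
# |x - x^2| is largest at x = 1/2, where it equals 1/4.
d = sup_distance(lambda x: x, lambda x: x * x, 0.0, 1.0)
```

Here the grid contains $x = 1/2$, so the estimate agrees with the exact value $1/4$ up to rounding.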
With the proper notion of distance developed, we are now able to talk about
convergence of sequences in a general metric space.
Definition 3.2.7. Given a metric space $(X, d)$, we say that a sequence $\{x_n\} \subseteq X$ converges to $x_0 \in X$ if for every $\varepsilon > 0$ there exists an $N \in \mathbb{N}$ such that if $n \geq N$, then $d(x_n, x_0) < \varepsilon$. I.e., $d(x_n, x_0) \to 0$ in $\mathbb{R}$ as $n \to \infty$.
Example 3.14. Consider the metric space $(C[a, b], d_\infty)$. Applying definition 3.2.7 above, a sequence $\{f_n\} \subseteq C[a, b]$ converges to $f_0 \in C[a, b]$ if and only if for every $\varepsilon > 0$ we can find an $N \in \mathbb{N}$ such that if $n \geq N$, then
$$\begin{aligned} d_\infty(f_n, f_0) < \varepsilon &\iff \|f_n - f_0\|_\infty < \varepsilon \\ &\iff \max\{|f_n(x) - f_0(x)| : x \in [a, b]\} < \varepsilon \\ &\iff |f_n(x) - f_0(x)| < \varepsilon \text{ for all } x \in [a, b]. \end{aligned}$$
This is precisely the definition of uniform convergence!
It turns out that uniform convergence is equivalent to convergence in a different structure: the metric space $(C[a, b], d_\infty)$.
Definition 3.2.8. Given a metric space $(X, d)$, we say that a sequence $\{x_n\} \subseteq X$ is Cauchy if for every $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that if $n, m \geq N$, then $d(x_n, x_m) < \varepsilon$.
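Proposition 3.2.9. Assume that $\{x_n\}$ is a sequence in a metric space $(X, d)$ with $x_n \to x_0$ in $(X, d)$ as $n \to \infty$. Then $\{x_n\}$ is Cauchy in $(X, d)$.
Proof. Let $\varepsilon > 0$. Since $x_n \to x_0$, we can find an $N \in \mathbb{N}$ so that if $n \geq N$, then
$$d(x_n, x_0) < \frac{\varepsilon}{2}.$$
If $n, m \geq N$, then we have
$$\begin{aligned} d(x_n, x_m) &\leq d(x_n, x_0) + d(x_0, x_m) \\ &= d(x_n, x_0) + d(x_m, x_0) \\ &< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} \\ &= \varepsilon. \end{aligned}$$
This shows that $\{x_n\}$ is Cauchy in $(X, d)$.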
Question: If $\{x_n\}$ is Cauchy in some metric space $(X, d)$, does $\{x_n\}$ converge?
In general, the answer is no. But those metric spaces having the property that all
Cauchy sequences converge are special and are given a name.
Definition 3.2.10. We say that a metric space $(X, d)$ is complete if every Cauchy sequence $\{x_n\} \subseteq X$ is convergent in $(X, d)$.
Theorem 3.2.11 [Completeness Theorem for $C[a, b]$]. The metric space $(C[a, b], d_\infty)$ is complete. I.e., if $\{f_n\}$ is Cauchy in $(C[a, b], d_\infty)$, then $f_n \to f_0$ in $(C[a, b], d_\infty)$ as $n \to \infty$, for some $f_0 \in C[a, b]$.
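Proof. Since $\{f_n\}$ is Cauchy in $(C[a, b], d_\infty)$, for any $\varepsilon > 0$, we can find $N \in \mathbb{N}$ such that if $n, m \geq N$, then
$$d_\infty(f_n, f_m) := \|f_n - f_m\|_\infty < \varepsilon.$$
Let $x_0 \in [a, b]$. Then if $n, m \geq N$, we have
$$|f_n(x_0) - f_m(x_0)| \leq \|f_n - f_m\|_\infty < \varepsilon.$$
This shows that $\{f_n(x_0)\}$ is Cauchy in $(\mathbb{R}, |\cdot|)$, for each $x_0 \in [a, b]$. By completeness of $(\mathbb{R}, |\cdot|)$, $\{f_n(x_0)\}$ converges for each $x_0 \in [a, b]$. Let $f_0 : [a, b] \to \mathbb{R}$ be defined by
$$f_0(x) = \lim_{n \to \infty} f_n(x)$$
for each $x \in [a, b]$.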

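We claim that $\{f_n\}$ converges to $f_0$ with respect to $d_\infty$. Suppose to the contrary that $\{f_n\}$ does not converge to $f_0$. Then there exists an $\varepsilon_0 > 0$ such that for all $N \in \mathbb{N}$, there exists some $n \geq N$ and some $x \in [a, b]$ with
$$|f_n(x) - f_0(x)| \geq \varepsilon_0. \tag{3.3}$$
Given this $\varepsilon_0$, choose $N_0 \in \mathbb{N}$ so that if $n, m \geq N_0$, then
$$\|f_n - f_m\|_\infty < \frac{\varepsilon_0}{2}.$$
We can choose an $n_0 \geq N_0$ and some $x_0 \in [a, b]$ so that inequality (3.3) holds. For this $x_0$, we can choose an $m_0 \geq N_0$ so that
$$|f_{m_0}(x_0) - f_0(x_0)| < \frac{\varepsilon_0}{2};$$
this is possible since $f_m(x_0) \to f_0(x_0)$ as $m \to \infty$. We now have
$$\begin{aligned} \varepsilon_0 &\leq |f_{n_0}(x_0) - f_0(x_0)| \\ &\leq |f_{n_0}(x_0) - f_{m_0}(x_0)| + |f_{m_0}(x_0) - f_0(x_0)| \\ &\leq \|f_{n_0} - f_{m_0}\|_\infty + |f_{m_0}(x_0) - f_0(x_0)| \\ &< \frac{\varepsilon_0}{2} + \frac{\varepsilon_0}{2} \\ &= \varepsilon_0. \end{aligned}$$
This is a contradiction, and hence $f_n \to f_0$ (with respect to $d_\infty$) as $n \to \infty$, as claimed. By theorem 3.2.4, $f_0 \in C[a, b]$ and the proof is complete.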
The following is a very useful corollary that will help us solve the fundamental problems posed in a previous section.
Corollary 3.2.12 [Weierstrass M-Test]. Assume that $\{f_n\} \subseteq C[a, b]$. Assume that $\sum_{n=1}^{\infty} \|f_n\|_\infty$ is convergent. For each $k \in \mathbb{N}$, let $S_k : [a, b] \to \mathbb{R}$ be defined by $S_k(x) = \sum_{n=1}^{k} f_n(x)$, for each $x \in [a, b]$. Then $\{S_k\}$ converges uniformly on $[a, b]$ to some $f_0 \in C[a, b]$. In this case, we write $f_0(x) = \sum_{n=1}^{\infty} f_n(x)$.
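Proof. Let $T_k = \sum_{n=1}^{k} \|f_n\|_\infty$ and $T = \sum_{n=1}^{\infty} \|f_n\|_\infty$. We are given that $T_k \to T$ as $k \to \infty$. Let $\varepsilon > 0$. We can find $N \in \mathbb{N}$ so that if $k > j \geq N$, then $T_k - T_j = \sum_{n=j+1}^{k} \|f_n\|_\infty < \varepsilon$, since convergent sequences are Cauchy. Now for all $k, j$ satisfying $k > j \geq N$, we have
$$\begin{aligned} |S_k(x) - S_j(x)| &= \left| \sum_{n=j+1}^{k} f_n(x) \right| \\ &\leq \sum_{n=j+1}^{k} |f_n(x)| \\ &\leq \sum_{n=j+1}^{k} \|f_n\|_\infty \\ &< \varepsilon \end{aligned}$$
for all $x \in [a, b]$. This shows that $\|S_k - S_j\|_\infty < \varepsilon$ for all $k, j$ satisfying $k > j \geq N$, which in turn shows that $\{S_k\}$ is Cauchy in $(C[a, b], d_\infty)$. By theorem 3.2.11, $\{S_k\}$ converges uniformly to some $f_0 \in C[a, b]$ and the proof is complete.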
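Example 3.15. Consider $\sum_{n=0}^{\infty} \left(\frac{3}{4}\right)^n \sin(4^n x)$ on $[a, b]$ for any $a, b \in \mathbb{R}$. Note that for any $n \in \mathbb{N} \cup \{0\}$, we have that
$$\left| \left(\tfrac{3}{4}\right)^n \sin(4^n x) \right| = \left| \left(\tfrac{3}{4}\right)^n \right| |\sin(4^n x)| \leq \left| \left(\tfrac{3}{4}\right)^n \right| = \left(\tfrac{3}{4}\right)^n$$
for all $x \in [a, b]$. This shows that
$$\left\| \left(\tfrac{3}{4}\right)^n \sin(4^n x) \right\|_\infty \leq \left(\tfrac{3}{4}\right)^n$$
for all $n \in \mathbb{N} \cup \{0\}$. Since the larger series $\sum_{n=0}^{\infty} \left(\frac{3}{4}\right)^n$ converges by the geometric series test, the smaller series $\sum_{n=0}^{\infty} \left\| \left(\frac{3}{4}\right)^n \sin(4^n x) \right\|_\infty$ converges by the comparison test. The Weierstrass M-test then shows that $\{S_k\}$ converges uniformly to a function $f_0 \in C[a, b]$, where $S_k = \sum_{n=0}^{k} \left(\frac{3}{4}\right)^n \sin(4^n x)$. In particular, $f_0 = \sum_{n=0}^{\infty} \left(\frac{3}{4}\right)^n \sin(4^n x)$ is continuous on $[a, b]$.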
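The Cauchy estimate behind the M-test can be observed numerically for this series. The sketch below is only an illustration: the names `partial_sum`, `tail_bound`, and `max_gap`, the interval $[0, 2\pi]$, and the grid size are all assumptions. It uses the geometric series identity $\sum_{n=j+1}^{\infty} (3/4)^n = 3 (3/4)^j$ as the bound.

```python
import math


def partial_sum(k, x):
    """S_k(x) = sum of (3/4)^n sin(4^n x) for n = 0, ..., k."""
    return sum((0.75 ** n) * math.sin((4 ** n) * x) for n in range(k + 1))


def tail_bound(j):
    """Sum of (3/4)^n over n > j, which equals 3 * (3/4)^j."""
    return 3 * 0.75 ** j


def max_gap(k, j, samples=2001):
    """Largest |S_k(x) - S_j(x)| over a uniform grid on [0, 2*pi]."""
    xs = [2 * math.pi * i / (samples - 1) for i in range(samples)]
    return max(abs(partial_sum(k, x) - partial_sum(j, x)) for x in xs)


# For each j, the gap between S_{j+5} and S_j respects the tail bound,
# independently of the point x that is sampled.
gaps_within_bound = [max_gap(j + 5, j) <= tail_bound(j) for j in (0, 5, 10)]
```

The bound does not depend on $x$, which is exactly what makes the convergence uniform rather than merely pointwise.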
Diversion. The function $\sum_{n=0}^{\infty} \left(\frac{3}{4}\right)^n \sin(4^n x)$ is nowhere differentiable on $\mathbb{R}$.
An important application of the Weierstrass M-test is the following theorem.
Theorem 3.2.13. Assume that the power series $\sum_{n=0}^{\infty} a_n x^n$ has radius of convergence $R > 0$. For any $x_0$ satisfying $0 \leq x_0 < R$, define $S_k : [-x_0, x_0] \to \mathbb{R}$ by
$$S_k(x) = \sum_{n=0}^{k} a_n x^n$$
for each $x \in [-x_0, x_0]$. The sequence $\{S_k\}_{k \in \mathbb{N}}$ converges uniformly on $[-x_0, x_0]$ to some function $f_0 = \sum_{n=0}^{\infty} a_n x^n \in C[-x_0, x_0]$.

Proof. If $x_0 = 0$, then the theorem is trivially true. So consider the case $x_0 > 0$. By proposition 3.1.3, the interval of convergence $I$ of $\sum_{n=0}^{\infty} a_n x^n$ satisfies $I \supseteq (-R, R) \supseteq [-x_0, x_0]$. Choose an $x_1$ so that $0 < x_0 < x_1 < R$. Then $x_1 \in I$ and so $\sum_{n=0}^{\infty} a_n x_1^n$ converges; theorem 3.1.1 then implies that $\sum_{n=0}^{\infty} a_n x^n$ converges absolutely at $x = x_0$. Now note that for any $n \in \mathbb{N}$ and $x \in [-x_0, x_0]$,
$$|a_n x^n| = |a_n| |x|^n \leq |a_n| x_0^n = |a_n x_0^n|, \quad \text{so} \quad \|a_n x^n\|_\infty \leq |a_n x_0^n|.$$
Since the series $\sum_{n=0}^{\infty} |a_n x_0^n|$ converges, the comparison test shows that the series $\sum_{n=0}^{\infty} \|a_n x^n\|_\infty$ converges also. By the Weierstrass M-test, $\{S_k\}$ converges uniformly on $[-x_0, x_0]$ to some $f_0 = \sum_{n=0}^{\infty} a_n x^n \in C[-x_0, x_0]$, as required.
So in particular, all power series are continuous on the interior of their interval of convergence. I.e., the function $f(x)$ represented by the power series $\sum_{n=0}^{\infty} a_n x^n$ is continuous on $(-R, R)$, where $R$ is its radius of convergence.
Problem. Suppose that we have the power series $\sum_{n=1}^{\infty} \frac{x^n}{n}$. Then its radius of convergence is 1, and it converges at the left endpoint $x = -1$ but not at the right endpoint $x = 1$. Define $f : [-1, 1) \to \mathbb{R}$ by $f(x) = \sum_{n=1}^{\infty} \frac{x^n}{n}$. Is $f(x)$ continuous from the right at $x = -1$? That is, does $\lim_{x \to -1^+} f(x) = f(-1) := \sum_{n=1}^{\infty} \frac{(-1)^n}{n}$?
By a process called Abel summation, we can show the following result.
Theorem 3.2.14 [Continuity Theorem for Power Series]. Suppose the power series $\sum_{n=0}^{\infty} a_n x^n$ has radius of convergence $R > 0$ and interval of convergence $I$. Then the function $f : I \to \mathbb{R}$ defined by
$$f(x) = \sum_{n=0}^{\infty} a_n x^n$$
is continuous on $I$.
Proof. This proof is omitted. (It follows from a theorem called Abel's Theorem.)
With this major result in hand, we are now ready to go on to the next
section.
3.3
Term-by-Term Integration and Differentiation
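Theorem 3.3.1. Assume that $\{f_n\} \subseteq C[a, b]$. Assume also that $f_n \to f_0$ with respect to $d_\infty$ on $[a, b]$, as $n \to \infty$; i.e., $f_n \to f_0$ uniformly on $[a, b]$. Then
$$\lim_{n \to \infty} \int_a^b f_n(t)\,dt = \int_a^b f_0(t)\,dt.$$
That is, we can switch the limit sign as follows:
$$\lim_{n \to \infty} \int_a^b f_n(t)\,dt = \int_a^b \left[ \lim_{n \to \infty} f_n(t) \right] dt.$$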
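Proof. Let $\varepsilon > 0$. We can find an $N \in \mathbb{N}$ such that if $n \geq N$, then $\|f_n - f_0\|_\infty < \frac{\varepsilon}{b - a}$. We now have, for all $n \geq N$,
$$\begin{aligned} \left| \int_a^b f_n(t)\,dt - \int_a^b f_0(t)\,dt \right| &= \left| \int_a^b [f_n(t) - f_0(t)]\,dt \right| && (3.4) \\ &\leq \int_a^b |f_n(t) - f_0(t)|\,dt && (3.5) \\ &\leq \int_a^b \|f_n - f_0\|_\infty\,dt \\ &= (b - a) \|f_n - f_0\|_\infty \\ &< (b - a) \frac{\varepsilon}{b - a} \\ &= \varepsilon. \end{aligned}$$
Equality (3.4) holds by theorem 1.3.1; inequality (3.5) is given by theorem 1.3.4. This shows that
$$\lim_{n \to \infty} \int_a^b f_n(t)\,dt = \int_a^b f_0(t)\,dt,$$
as required.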
Corollary 3.3.2. Suppose that $f(x) = \sum_{n=0}^{\infty} a_n x^n$ on $(-R, R)$, for some radius of convergence $R > 0$. Suppose $[a, b] \subseteq (-R, R)$. Then
$$\int_a^b f(x)\,dx = \sum_{n=0}^{\infty} \left[ \int_a^b a_n x^n\,dx \right] = \sum_{n=0}^{\infty} \frac{a_n b^{n+1} - a_n a^{n+1}}{n + 1}. \tag{3.6}$$
That is, for a power series, we can integrate term by term.
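Proof. For each $k \in \mathbb{N}$, let $S_k : [a, b] \to \mathbb{R}$ be defined by $S_k(x_0) = \sum_{n=0}^{k} a_n x_0^n$ for each $x_0 \in [a, b]$. We have already shown that $S_k \to f$ uniformly on $[a, b]$ by theorem 3.2.13. Hence by the previous theorem (theorem 3.3.1), we have that
$$\begin{aligned} \int_a^b f(x)\,dx &= \lim_{k \to \infty} \int_a^b S_k(x)\,dx \\ &= \lim_{k \to \infty} \int_a^b \left[ \sum_{n=0}^{k} a_n x^n \right] dx \\ &= \lim_{k \to \infty} \sum_{n=0}^{k} \left[ \int_a^b a_n x^n\,dx \right], \end{aligned}$$
where the last equality is due to the (finite) additivity of definite integrals given by theorem 1.3.1. Continuing, we have
$$\begin{aligned} \lim_{k \to \infty} \sum_{n=0}^{k} \left[ \int_a^b a_n x^n\,dx \right] &= \lim_{k \to \infty} \sum_{n=0}^{k} \left. \frac{a_n x^{n+1}}{n + 1} \right|_a^b && (3.7) \\ &= \lim_{k \to \infty} \sum_{n=0}^{k} \frac{a_n b^{n+1} - a_n a^{n+1}}{n + 1} \\ &:= \sum_{n=0}^{\infty} \frac{a_n b^{n+1} - a_n a^{n+1}}{n + 1}. \end{aligned}$$
The equality at (3.7) is due to the second fundamental theorem of calculus. This proves the required result.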
Observation. In corollary 3.3.2 above, equation (3.6) is still true even if a > b.
The reader should verify this fact via the definition of definite integrals whose
bounds are reversed (definition 1.3.7).
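Example 3.16. We know that
$$f(u) := \frac{1}{1 - u} = \sum_{n=0}^{\infty} u^n$$
for all $u \in (-1, 1)$ (and $\sum_{n=0}^{\infty} u^n$ has an interval of convergence of $(-1, 1)$). Since $x \in (-1, 1)$ if and only if $-x \in (-1, 1)$, we have that
$$\frac{1}{1 + x} = \frac{1}{1 - (-x)} = f(-x) = \sum_{n=0}^{\infty} (-x)^n = \sum_{n=0}^{\infty} (-1)^n x^n$$
for all $x \in (-1, 1)$. Let $F : (-1, 1) \to \mathbb{R}$ be defined by
$$F(x_0) = \int_0^{x_0} \frac{1}{1 + t}\,dt$$
for each $x_0 \in (-1, 1)$. Now we have
$$\begin{aligned} F(x) = \int_0^x \frac{1}{1 + t}\,dt &= \ln(|1 + t|) \Big|_0^x \\ &= \ln(|1 + x|) - \ln(|1 + 0|) \\ &= \ln(|1 + x|) \\ &= \ln(1 + x) \end{aligned}$$
for all $x \in (-1, 1)$; the first line is due to the fundamental theorem of calculus.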
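But we also have, by corollary 3.3.2 above,
$$\begin{aligned} F(x) = \int_0^x \frac{1}{1 + t}\,dt &= \int_0^x \sum_{n=0}^{\infty} (-1)^n t^n\,dt \\ &= \sum_{n=0}^{\infty} \frac{(-1)^n x^{n+1} - (-1)^n 0^{n+1}}{n + 1} \\ &= \sum_{n=0}^{\infty} (-1)^n \frac{x^{n+1}}{n + 1} \end{aligned}$$
for all $x \in (-1, 1)$. Therefore, $\ln(1 + x) = \sum_{n=0}^{\infty} (-1)^n \frac{x^{n+1}}{n + 1}$ for all $x \in (-1, 1)$. We note that $\sum_{n=0}^{\infty} (-1)^n \frac{x^{n+1}}{n + 1}$ converges at $x = 1$ by the alternating series test. Using the fact that $\ln(1 + x)$ is continuous at $x = 1$ and the continuity theorem for power series, we have that $\ln(2) = \ln(1 + 1) = \sum_{n=0}^{\infty} (-1)^n \frac{1^{n+1}}{n + 1} = \sum_{n=0}^{\infty} \frac{(-1)^n}{n + 1}$. Hence
$$\ln(x + 1) = \sum_{n=0}^{\infty} (-1)^n \frac{x^{n+1}}{n + 1}$$
on $(-1, 1]$.
In the previous example, we finally showed that $\sum_{n=0}^{\infty} \frac{(-1)^n}{n + 1} = 1 - \frac{1}{2} + \frac{1}{3} - \cdots$ converges to $\ln(2)$.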
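The series for $\ln(1 + x)$ can be compared against a library logarithm. This is a minimal sketch; the function name `log1p_partial_sum` and the truncation levels are arbitrary choices for illustration. The endpoint check relies on the standard alternating series estimate, which bounds the error by the first omitted term.

```python
import math


def log1p_partial_sum(x, k):
    """Sum of (-1)^n x^(n+1) / (n+1) for n = 0, ..., k."""
    return sum((-1) ** n * x ** (n + 1) / (n + 1) for n in range(k + 1))


# Inside (-1, 1) the convergence is fast ...
inner_error = abs(log1p_partial_sum(0.5, 60) - math.log(1.5))

# ... while at the endpoint x = 1 it is slow; the alternating series
# estimate bounds the error by the first omitted term, 1 / (k + 2).
k = 10_000
endpoint_error = abs(log1p_partial_sum(1.0, k) - math.log(2.0))
```

The contrast between the two errors reflects that the convergence at $x = 1$ comes only from the alternating signs, not from geometric decay of the terms.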
Definition 3.3.3. Given a power series $\sum_{n=0}^{\infty} a_n x^n$, the formal derivative or the term-by-term derivative of the power series is
$$\sum_{n=1}^{\infty} n a_n x^{n-1},$$
and the formal antiderivative or the term-by-term antiderivative of the power series is
$$C + \sum_{n=0}^{\infty} \frac{a_n}{n + 1} x^{n+1}.$$
Note that we have not yet shown whether the term-by-term derivative or
the term-by-term antiderivative are actually the derivative or antiderivative of
the original series, respectively.
Question: If a power series $\sum_{n=0}^{\infty} a_n x^n$ has radius of convergence $R$, what can we say about the radii of convergence of its formal derivative and formal antiderivative?
Theorem 3.3.4. Assume that a power series $\sum_{n=0}^{\infty} a_n x^n$ has radius of convergence $R$. Then its formal derivative $\sum_{n=1}^{\infty} n a_n x^{n-1}$ and its formal antiderivative $C + \sum_{n=0}^{\infty} \frac{a_n}{n + 1} x^{n+1}$ both have a radius of convergence greater than or equal to $R$.
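Proof. If $R = 0$, the theorem is trivial. So assume $R > 0$ and suppose $x_0 \in (-R, R)$ with $x_0 \neq 0$ (every power series converges at $x_0 = 0$). Let $x_1, x_2$ be chosen so that $0 < |x_0| < x_1 < x_2 < R$. We know that $\sum_{n=0}^{\infty} a_n x_2^n$ converges, so by theorem 3.1.1, $\sum_{n=0}^{\infty} |a_n x_1^n|$ converges. By the divergence test, $|a_n x_1^n| \to 0$ as $n \to \infty$, so $|a_n x_1^{n-1}| = |a_n x_1^n| / x_1 \to 0$ as well, and therefore $|a_n x_1^{n-1}| \leq M$ for all $n \in \mathbb{N}$, for some $M > 0$. Consider
$$\sum_{n=1}^{\infty} |n a_n x_0^{n-1}|.$$
We have that
$$\begin{aligned} |n a_n x_0^{n-1}| &= n |a_n| |x_0|^{n-1} \\ &= n |a_n| |x_1|^{n-1} \frac{|x_0|^{n-1}}{|x_1|^{n-1}} \\ &= n |a_n x_1^{n-1}| \left| \frac{x_0}{x_1} \right|^{n-1} \\ &\leq n M \left| \frac{x_0}{x_1} \right|^{n-1} \end{aligned}$$
for all $n \in \mathbb{N}$. Let $b_n = n M \left| \frac{x_0}{x_1} \right|^{n-1}$. We have $b_n > 0$ and
$$\frac{b_{n+1}}{b_n} = \frac{(n + 1) M \left| \frac{x_0}{x_1} \right|^{n}}{n M \left| \frac{x_0}{x_1} \right|^{n-1}} = \frac{n + 1}{n} \left| \frac{x_0}{x_1} \right| \to \left| \frac{x_0}{x_1} \right|$$
as $n \to \infty$. Since $\left| \frac{x_0}{x_1} \right| < 1$, $\sum_{n=1}^{\infty} b_n = \sum_{n=1}^{\infty} n M \left| \frac{x_0}{x_1} \right|^{n-1}$ converges by the ratio test. The comparison test then shows that $\sum_{n=1}^{\infty} |n a_n x_0^{n-1}|$ converges and hence $\sum_{n=1}^{\infty} n a_n x_0^{n-1}$ converges absolutely, for any $x_0 \in (-R, R)$. This shows that the radius of convergence of $\sum_{n=1}^{\infty} n a_n x^{n-1}$ is at least $R$. A similar argument shows that the radius of convergence of $C + \sum_{n=0}^{\infty} \frac{a_n}{n + 1} x^{n+1}$ is at least $R$; the details are left to the reader as an exercise.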
Corollary 3.3.5. The radii of convergence of all three series
$$\sum_{n=0}^{\infty} a_n x^n, \quad \sum_{n=1}^{\infty} n a_n x^{n-1}, \quad \text{and} \quad C + \sum_{n=0}^{\infty} \frac{a_n}{n + 1} x^{n+1}$$
are the same.

Proof. Suppose that $\sum_{n=0}^{\infty} a_n x^n$ has radius of convergence $R$, $\sum_{n=1}^{\infty} n a_n x^{n-1}$ has radius of convergence $S$, and $C + \sum_{n=0}^{\infty} \frac{a_n}{n + 1} x^{n+1}$ has radius of convergence $T$. By theorem 3.3.4 above, we already know that $S \geq R$ and $T \geq R$.
Now we make the simple observation that the formal derivative of $C + \sum_{n=0}^{\infty} \frac{a_n}{n + 1} x^{n+1}$ is $\sum_{n=0}^{\infty} a_n x^n$ and the formal antiderivative of $\sum_{n=1}^{\infty} n a_n x^{n-1}$ is $C + \sum_{n=1}^{\infty} a_n x^n$, which has the same radius of convergence as $\sum_{n=0}^{\infty} a_n x^n$. Theorem 3.3.4 shows that $R \geq S$, $R \geq T$, implying $R = S = T$. The corollary follows.
This last corollary establishes a very important fact: term-by-term derivatives and antiderivatives preserve the radius of convergence. The natural question to ask, then, is whether the formal derivative and formal antiderivative
converge back to the actual derivative and antiderivative of the original power
series. The following theorem is an important first step toward showing that the answer is yes.
Theorem 3.3.6. Assume that $\{F_n\} \subseteq C[a, b]$ with $F_n(a) \to C$ as $n \to \infty$. Also assume that each $F_n(x)$ has a continuous derivative $f_n(x)$ on $[a, b]$ and that $\{f_n\}$ converges uniformly on $[a, b]$ to a function $g \in C[a, b]$. Then $\{F_n\}$ converges uniformly to a function $G \in C[a, b]$ and $G'(x) = g(x)$ for all $x \in (a, b)$.
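Proof. From the fundamental theorem of calculus we have
$$F_n(x) = \int_a^x f_n(t)\,dt + F_n(a)$$
for every $n \in \mathbb{N}$ and $x \in [a, b]$. Let $G : [a, b] \to \mathbb{R}$ be defined by
$$G(x) = \int_a^x g(t)\,dt + C,$$
for each $x \in [a, b]$. (Remember that $C = \lim_{n \to \infty} F_n(a)$.) We now have that, for each $x \in [a, b]$ and each $n \in \mathbb{N}$,
$$\begin{aligned} |F_n(x) - G(x)| &= \left| \left[ \int_a^x f_n(t)\,dt + F_n(a) \right] - \left[ \int_a^x g(t)\,dt + C \right] \right| \\ &= \left| \left[ \int_a^x f_n(t)\,dt - \int_a^x g(t)\,dt \right] + [F_n(a) - C] \right| \\ &= \left| \int_a^x [f_n(t) - g(t)]\,dt + [F_n(a) - C] \right| \\ &\leq \left| \int_a^x [f_n(t) - g(t)]\,dt \right| + |F_n(a) - C| \\ &\leq \int_a^x |f_n(t) - g(t)|\,dt + |F_n(a) - C| \\ &\leq \int_a^x \|f_n - g\|_\infty\,dt + |F_n(a) - C| \\ &= (x - a) \|f_n - g\|_\infty + |F_n(a) - C| \\ &\leq (b - a) \|f_n - g\|_\infty + |F_n(a) - C|. \end{aligned}$$
Let $\varepsilon > 0$. We can choose an $N \in \mathbb{N}$ such that if $n \geq N$, then $\|f_n - g\|_\infty < \frac{\varepsilon}{2(b - a)}$ and $|F_n(a) - C| < \frac{\varepsilon}{2}$. Hence if $n \geq N$, we have that for each $x \in [a, b]$,
$$\begin{aligned} |F_n(x) - G(x)| &\leq (b - a) \|f_n - g\|_\infty + |F_n(a) - C| \\ &< (b - a) \frac{\varepsilon}{2(b - a)} + \frac{\varepsilon}{2} \\ &= \frac{\varepsilon}{2} + \frac{\varepsilon}{2} \\ &= \varepsilon. \end{aligned}$$
This shows that $\|F_n - G\|_\infty < \varepsilon$ for all $n \geq N$ and hence $F_n \to G$ as $n \to \infty$ with respect to $d_\infty$. By the fundamental theorem of calculus, $G(x)$ is differentiable on $(a, b)$ and $G'(x) = g(x)$ for each $x \in (a, b)$.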
Corollary 3.3.7. Assume that $\sum_{n=0}^{\infty} a_n x^n$ has radius of convergence $R > 0$. Let $F : (-R, R) \to \mathbb{R}$ be defined by $F(x) = \sum_{n=0}^{\infty} a_n x^n$ for each $x \in (-R, R)$. Then $F(x)$ is differentiable on $(-R, R)$ with $F'(x) = \sum_{n=1}^{\infty} n a_n x^{n-1}$ for all $x \in (-R, R)$.
Proof. Let $x_1$ be any real number satisfying $0 < x_1 < R$. For each $k \in \mathbb{N}$, let $F_k : [-x_1, x_1] \to \mathbb{R}$ be defined by
$$F_k(x) = \sum_{n=0}^{k} a_n x^n$$
for each $x \in [-x_1, x_1]$. Note that $F_k'(x) = \sum_{n=0}^{k} n a_n x^{n-1} = \sum_{n=1}^{k} n a_n x^{n-1}$. Since the formal derivative $\sum_{n=1}^{\infty} n a_n x^{n-1}$ also has radius of convergence $R$ by corollary 3.3.5, $\{F_k'\}$ converges uniformly on $[-x_1, x_1]$ to a function $f := \sum_{n=1}^{\infty} n a_n x^{n-1}$, by theorem 3.2.13. Finally, since $\{F_k(-x_1)\}$ converges to some $C \in \mathbb{R}$, theorem 3.3.6 above shows that $F(x) = \sum_{n=0}^{\infty} a_n x^n$ is differentiable on $(-x_1, x_1)$ with
$$F'(x) = f(x) = \sum_{n=1}^{\infty} n a_n x^{n-1} \tag{3.8}$$
for each $x \in (-x_1, x_1)$. Since $x_1 \in (0, R)$ is arbitrary, equation (3.8) is true for all $x \in (-R, R)$.
Corollary 3.3.8. Assume that $\sum_{n=0}^{\infty} a_n x^n$ has radius of convergence $R > 0$. Let $F : (-R, R) \to \mathbb{R}$ be defined by $F(x) = \sum_{n=0}^{\infty} a_n x^n$ for each $x \in (-R, R)$. Then $F(x)$ is differentiable on $(-R, R)$ an infinite number of times.
Sample Question 24. Consider the power series
$$\sum_{n=1}^{\infty} \frac{x^n}{n^2}.$$
Find its interval of convergence $I$. Define $f(x) = \sum_{n=1}^{\infty} \frac{x^n}{n^2}$ for each $x \in I$. Find the intervals of convergence for the power series representations of $f'$ and $f''$.
Strategy: The first part of the question is a simple exercise in using the ratio test or corollary 3.1.6 to find the radius of convergence $R$ and then testing the endpoints for convergence. We know by corollary 3.3.7 that the term-by-term derivative of the power series representation for $f$ is the power series representation for $f'$. That is,
$$\begin{aligned} f'(x) = \frac{d}{dx} \left[ \sum_{n=1}^{\infty} \frac{x^n}{n^2} \right] &= \sum_{n=1}^{\infty} \frac{d}{dx} \left[ \frac{x^n}{n^2} \right] \\ &= \sum_{n=1}^{\infty} \frac{n x^{n-1}}{n^2} = \sum_{n=1}^{\infty} \frac{x^{n-1}}{n} \\ &= \sum_{n=0}^{\infty} \frac{x^n}{n + 1}. \end{aligned}$$
Corollary 3.3.5 shows that this series will share the same interval of convergence as the original series, except perhaps at one or both endpoints of the interval. That is, the only potential difference between the intervals of convergence of the two series is the endpoints: We merely need to check for the convergence of the endpoint cases. Differentiating one more time, we get
$$\begin{aligned} f''(x) = \frac{d}{dx} \left[ \sum_{n=0}^{\infty} \frac{x^n}{n + 1} \right] &= \sum_{n=0}^{\infty} \frac{d}{dx} \left[ \frac{x^n}{n + 1} \right] \\ &= \sum_{n=1}^{\infty} \frac{n x^{n-1}}{n + 1}. \end{aligned}$$
Again, this series will share almost the same interval of convergence as the original series. All we need to do is to test the endpoint cases.
Sample Solution 24. Set $a_n = \frac{1}{n^2}$ for each $n \geq 1$. Then $a_n > 0$ and
$$\frac{a_{n+1}}{a_n} = \frac{\frac{1}{(n+1)^2}}{\frac{1}{n^2}} = \frac{n^2}{(n + 1)^2} = \frac{n^2}{n^2 + 2n + 1} = \frac{1}{1 + \frac{2}{n} + \frac{1}{n^2}} \to 1$$
as $n \to \infty$. Corollary 3.1.6 shows that the radius of convergence of the series $\sum_{n=1}^{\infty} a_n x^n = \sum_{n=1}^{\infty} \frac{x^n}{n^2}$ is $R = \frac{1}{1} = 1$. Hence it converges only on $(-1, 1)$ and at possibly one or both endpoints.
At the endpoint $x = -1$, the series becomes
$$\sum_{n=1}^{\infty} \frac{(-1)^n}{n^2},$$
which is an alternating series whose terms $\frac{1}{n^2}$ are positive, decreasing, and tending to zero. The alternating series test shows that $\sum_{n=1}^{\infty} \frac{x^n}{n^2}$ converges at $x = -1$. At the other endpoint $x = 1$, the series becomes
$$\sum_{n=1}^{\infty} \frac{1}{n^2},$$
a p-series with $p = 2$. The series converges by the p-test for series. We conclude that the series $\sum_{n=1}^{\infty} \frac{x^n}{n^2}$ has an interval of convergence $[-1, 1]$.
By corollaries 3.3.5 and 3.3.7, the derived series
$$\begin{aligned} \frac{d}{dx} \left[ \sum_{n=1}^{\infty} \frac{x^n}{n^2} \right] &= \sum_{n=1}^{\infty} \frac{d}{dx} \left[ \frac{x^n}{n^2} \right] \\ &= \sum_{n=1}^{\infty} \frac{n x^{n-1}}{n^2} = \sum_{n=1}^{\infty} \frac{x^{n-1}}{n} = \sum_{n=0}^{\infty} \frac{x^n}{n + 1} \end{aligned}$$
also has a radius of convergence $R = 1$ and represents $f'(x)$ on $(-1, 1)$. At the endpoint $x = -1$, this power series becomes
$$\sum_{n=0}^{\infty} \frac{(-1)^n}{n + 1},$$
which is an alternating series whose terms $\frac{1}{n+1}$ are positive, decreasing, and tending to zero. The alternating series test shows that this series converges. At the other endpoint $x = 1$, this power series becomes
$$\sum_{n=0}^{\infty} \frac{1}{n + 1},$$
which is the harmonic series. The p-test for series shows that it diverges. We conclude that the derived series $\sum_{n=0}^{\infty} \frac{x^n}{n + 1} = f'(x)$ has an interval of convergence $[-1, 1)$.
Differentiating once again, we obtain
$$\frac{d}{dx} \left[ \sum_{n=0}^{\infty} \frac{x^n}{n + 1} \right] = \sum_{n=0}^{\infty} \frac{d}{dx} \left[ \frac{x^n}{n + 1} \right] = \sum_{n=1}^{\infty} \frac{n x^{n-1}}{n + 1},$$
which has the same radius of convergence $R = 1$ and represents the function $f''(x)$ on $(-1, 1)$ by corollaries 3.3.5 and 3.3.7. At the endpoints $x = \pm 1$, this power series becomes
$$\sum_{n=1}^{\infty} \frac{n (\pm 1)^{n-1}}{n + 1}.$$
Set $a_n = \frac{n (\pm 1)^{n-1}}{n + 1}$ for each $n \geq 1$, so that $|a_n| = \frac{n}{n + 1} \to 1 \neq 0$ as $n \to \infty$. This shows that the terms of the series do not converge to zero and so the divergence test shows that the power series diverges at both endpoints. Hence $\sum_{n=1}^{\infty} \frac{n x^{n-1}}{n + 1} = f''(x)$ has an interval of convergence $(-1, 1)$.
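Term-by-term differentiation in this solution can be sanity-checked numerically at an interior point. The sketch below is an illustration under assumptions: the truncation level `TERMS`, the step `h`, and the helper names are arbitrary, and the closed form $-\ln(1 - x)/x$ for $\sum_{n=0}^{\infty} \frac{x^n}{n+1}$ on $0 < |x| < 1$ is a standard identity that is not stated in these notes.

```python
import math

TERMS = 200  # truncation level; an arbitrary choice for this sketch


def f(x):
    """Truncated sum of x^n / n^2 for n = 1, ..., TERMS."""
    return sum(x ** n / n ** 2 for n in range(1, TERMS + 1))


def f_prime_series(x):
    """Truncated term-by-term derivative: sum of x^n / (n + 1)."""
    return sum(x ** n / (n + 1) for n in range(TERMS))


def central_difference(func, x, h=1e-5):
    """Symmetric difference quotient approximating func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)


x = 0.5
series_value = f_prime_series(x)
numeric_value = central_difference(f, x)
closed_form = -math.log(1 - x) / x
```

All three values should agree closely at $x = 0.5$; near the endpoints the truncated sums converge much more slowly, so this check says nothing about endpoint behaviour.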
Corollary 3.3.9. If $f(x)$ is represented by $\sum_{n=0}^{\infty} a_n x^n$ on $(-R, R)$, where $R > 0$ is its radius of convergence, then $\int f(x)\,dx = C + \sum_{n=0}^{\infty} \frac{a_n}{n + 1} x^{n+1}$ on $(-R, R)$.
The corollaries above completely answered the fundamental problems posed
several sections ago. There remains one more important problem.
Problem. Assume that a function $f$ is represented by both $\sum_{n=0}^{\infty} a_n x^n$ and $\sum_{n=0}^{\infty} b_n x^n$ on an open interval $I$ containing 0. Is $a_n = b_n$ for all $n$?
The next section will give an affirmative answer to the above problem.
3.4
Taylor Series
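Lemma 3.4.1. Suppose $f(x)$ is represented by the power series $\sum_{n=0}^{\infty} a_n x^n$ on $(-R, R)$, where $R > 0$ is its radius of convergence. Then $f$ is differentiable at $x = 0$ infinitely many times and $a_k = \frac{f^{(k)}(0)}{k!}$ for each $k \in \mathbb{N} \cup \{0\}$.
Proof. By corollary 3.3.8, we know that a power series $\sum_{n=0}^{\infty} a_n x^n$ is infinitely differentiable on $(-R, R)$, where $R > 0$ is its radius of convergence. Let $f(x) = \sum_{n=0}^{\infty} a_n x^n$ for each $x \in (-R, R)$. By repeated applications of corollary 3.3.7, we now have, on $(-R, R)$,
$$\begin{aligned} f'(x) &= \sum_{n=0}^{\infty} \frac{d}{dx} [a_n x^n] = \sum_{n=1}^{\infty} n a_n x^{n-1}, \\ f''(x) &= \sum_{n=1}^{\infty} \frac{d}{dx} [n a_n x^{n-1}] = \sum_{n=2}^{\infty} n(n - 1) a_n x^{n-2}, \\ f'''(x) &= \sum_{n=2}^{\infty} \frac{d}{dx} [n(n - 1) a_n x^{n-2}] = \sum_{n=3}^{\infty} n(n - 1)(n - 2) a_n x^{n-3}, \\ &\;\;\vdots \\ f^{(k)}(x) &= \sum_{n=k}^{\infty} n(n - 1)(n - 2) \cdots (n - (k - 1)) a_n x^{n-k} \\ &= \sum_{n=k}^{\infty} n(n - 1)(n - 2) \cdots (n - k + 1) a_n x^{n-k}. \end{aligned}$$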
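Recall that $0^0 := 1$ when evaluating power series (see definition 3.0.1). Evaluating each expression at $x = 0 \in (-R, R)$ yields
$$\begin{aligned} f(0) &= \sum_{n=0}^{\infty} a_n 0^n = a_0 0^0 + a_1 0^1 + \cdots = a_0 0^0 = 0!\, a_0, \\ f'(0) &= \sum_{n=1}^{\infty} n a_n 0^{n-1} = 1 a_1 0^{1-1} + 2 a_2 0^{2-1} + \cdots = 1 \cdot a_1 0^0 = 1!\, a_1, \\ f''(0) &= \sum_{n=2}^{\infty} n(n - 1) a_n 0^{n-2} = 2(2 - 1) a_2 0^{2-2} + 3(3 - 1) a_3 0^{3-2} + \cdots = 2 \cdot 1 \cdot a_2 0^0 = 2!\, a_2, \\ f'''(0) &= \sum_{n=3}^{\infty} n(n - 1)(n - 2) a_n 0^{n-3} = 3(3 - 1)(3 - 2) a_3 0^{3-3} + 4(4 - 1)(4 - 2) a_4 0^{4-3} + \cdots = 3 \cdot 2 \cdot 1 \cdot a_3 0^0 = 3!\, a_3, \\ &\;\;\vdots \\ f^{(k)}(0) &= \sum_{n=k}^{\infty} n(n - 1)(n - 2) \cdots (n - k + 1) a_n 0^{n-k} = k!\, a_k. \end{aligned}$$
We conclude that $a_0 = f(0)$ and $f^{(k)}(0) = k!\, a_k \Rightarrow a_k = \frac{f^{(k)}(0)}{k!}$ for each $k \in \mathbb{N}$. The lemma follows.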
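The relation $a_k = f^{(k)}(0)/k!$ can be checked exactly in the special case of a polynomial, which is a power series with finitely many nonzero coefficients. The coefficient list and the helper names below are hypothetical and chosen only for illustration.

```python
from math import factorial


def differentiate(coeffs):
    """Coefficients of the derivative of sum(coeffs[n] * x**n)."""
    return [n * c for n, c in enumerate(coeffs)][1:]


def kth_derivative_at_zero(coeffs, k):
    """Differentiate k times, then evaluate at x = 0 (the constant term)."""
    for _ in range(k):
        coeffs = differentiate(coeffs)
    return coeffs[0] if coeffs else 0


a = [5, -3, 0, 2, 7]  # hypothetical coefficients a_0, ..., a_4

# f^(k)(0) is always an integer multiple of k! here, so integer
# division recovers each coefficient exactly.
recovered = [kth_derivative_at_zero(a, k) // factorial(k) for k in range(len(a))]
```

Each differentiation shifts the coefficient list by one place and multiplies by the old index, which is the same bookkeeping as in the proof above.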
Lemma 3.4.2. Assume that the power series $\sum_{n=0}^{\infty} a_n (x - a)^n$ has a radius of convergence $R > 0$. Suppose $f(x) = \sum_{n=0}^{\infty} a_n (x - a)^n$ for each $x \in (a - R, a + R)$. Then $f$ is infinitely differentiable on $(a - R, a + R)$ and
$$a_n = \frac{f^{(n)}(a)}{n!}$$
for each $n \in \mathbb{N} \cup \{0\}$.

Proof. This lemma follows immediately from lemma 3.4.1 above after noting that centering at $a = 0$ loses no generality. (We can simply set $g(x) = f(x + a)$ for each $x \in (-R, R)$ and apply the lemma to $g$; since $g^{(n)}(0) = f^{(n)}(a)$ for each $n$, if the lemma holds for $g(x)$ at $x = 0$, then the lemma holds for $f(x)$ at $x = 0 + a = a$.)
Suppose that the power series $\sum_{n=0}^{\infty} a_n (x - a)^n$ with radius of convergence $R > 0$ represents some function $f$ on $(a - R, a + R)$. The above lemma says that the terms of the power series are predetermined by the various derivatives of $f$ at $x = a$. In particular, if a function is represented by two different power series $\sum_{n=0}^{\infty} a_n (x - a)^n$ and $\sum_{n=0}^{\infty} b_n (x - a)^n$, then each $a_k$ and $b_k$ are equal to the same predetermined value $\frac{f^{(k)}(a)}{k!}$ and hence $a_k = b_k$. This is made rigorous in the following theorem.
Theorem 3.4.3 [Uniqueness Theorem for Power Series Representation]. Suppose that $\sum_{n=0}^{\infty} a_n (x - a)^n$ and $\sum_{n=0}^{\infty} b_n (x - a)^n$ have radii of convergence $R > 0$ and $S > 0$, respectively. Also suppose that $f(x) = \sum_{n=0}^{\infty} a_n (x - a)^n$ and $f(x) = \sum_{n=0}^{\infty} b_n (x - a)^n$ for all $x \in I \subseteq (a - R, a + R) \cap (a - S, a + S)$, $I$ an open interval containing $a$. Then $a_n = b_n$ for all $n \in \mathbb{N} \cup \{0\}$.
Proof. Define $g : (a - R, a + R) \to \mathbb{R}$ by
$$g(x) = \sum_{n=0}^{\infty} a_n (x - a)^n$$
for each $x \in (a - R, a + R)$ and $h : (a - S, a + S) \to \mathbb{R}$ by
$$h(x) = \sum_{n=0}^{\infty} b_n (x - a)^n$$
for each $x \in (a - S, a + S)$.
Note now that $g(x) = f(x) = h(x)$ on all of $I$, where $I$ is an open interval containing $a$. By corollary 3.3.8, $g$ and $h$ (which take on the same values as $f$ on $I$) are differentiable at $x = a \in I$ infinitely many times, and so $f$ is also differentiable at $x = a$ infinitely many times. This implies
$$g^{(n)}(a) = f^{(n)}(a) = h^{(n)}(a)$$
for all $n \in \mathbb{N} \cup \{0\}$. Applying lemma 3.4.2 to $g$ and $h$ yields
$$a_n = \frac{g^{(n)}(a)}{n!} = \frac{f^{(n)}(a)}{n!} = \frac{h^{(n)}(a)}{n!} = b_n$$
for all $n \in \mathbb{N} \cup \{0\}$.
By now, we know that if a function $f(x)$ is representable by a power series (on an open interval containing its center $a$), then it must be infinitely differentiable at $x = a$. We know that if it has a representation $\sum_{n=0}^{\infty} a_n (x - a)^n$, then its terms $a_n$ are predetermined as stated in lemma 3.4.2. In fact, if a function $f(x)$ is infinitely differentiable around some center $x = a$, we can always form the power series $\sum_{n=0}^{\infty} a_n (x - a)^n$ where $a_n = \frac{f^{(n)}(a)}{n!}$.
Definition 3.4.4. Suppose that $f(x)$ is differentiable at $x = a$ infinitely many times. Its Taylor series centered at $x = a$ is the power series
$$\sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} (x - a)^n.$$
Suppose this power series has interval of convergence $I$. Its $k$th partial sum is the function $S_k : I \to \mathbb{R}$ defined by
$$\begin{aligned} S_k(x) &= \sum_{n=0}^{k} \frac{f^{(n)}(a)}{n!} (x - a)^n \\ &= f(a) + f'(a)(x - a) + \frac{f''(a)}{2!} (x - a)^2 + \cdots + \frac{f^{(k)}(a)}{k!} (x - a)^k. \end{aligned}$$
We call $S_k(x)$ the $k$th Taylor polynomial of $f(x)$ centered at $x = a$. We denote this polynomial by $P_{k,a}(x)$. Note that $P_{k,a}(x)$ is defined as long as $f$ is at least $k$ times differentiable at $x = a$.
The natural questions to ask are: What is its radius of convergence $R$? Does this series converge back to the function $f(x)$ on $(a - R, a + R)$?
The first question is a matter of using the standard convergence tests for series, such as the ratio test. Once we establish that the radius of convergence $R$ is strictly greater than 0, we then ask the second question. Unfortunately, not all functions have Taylor series that converge back to the functions themselves.
Example 3.17. Let $f : \mathbb{R} \to \mathbb{R}$ be defined by
$$f(x) = \begin{cases} e^{-\frac{1}{x^2}} & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases}$$
A fairly simple induction argument shows that $f(x)$ is differentiable at $x = 0$ infinitely many times and that
$$f^{(n)}(0) = 0$$
for all $n \in \mathbb{N} \cup \{0\}$. That is, its Taylor series centered at $x = 0$
$$\sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} x^n = \sum_{n=0}^{\infty} \frac{0}{n!} x^n = \sum_{n=0}^{\infty} 0$$
has radius of convergence $R = \infty$ and is equal to 0 identically on all of $\mathbb{R}$. But $f(x) = e^{-\frac{1}{x^2}} > 0$ when $x \neq 0$; hence this Taylor series does not converge back to $f$.
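The flatness of this function at the origin can be seen numerically. The sketch below is illustrative only; the sample point $0.1$, the powers tested, and the helper names are arbitrary choices, and the numbers merely suggest (they do not prove) that every derivative at 0 vanishes.

```python
import math


def f(x):
    """f(x) = exp(-1/x^2) for x != 0 and f(0) = 0, as in Example 3.17."""
    return math.exp(-1.0 / (x * x)) if x != 0 else 0.0


def taylor_partial_sum(x, k):
    """k-th Taylor polynomial at 0; every coefficient f^(n)(0)/n! is 0."""
    return sum(0.0 * x ** n for n in range(k + 1))


# f(x) / x^m stays tiny near 0 for each fixed power m, which is the
# flatness that forces every derivative at 0 to vanish.
ratios = [f(0.1) / 0.1 ** m for m in (1, 5, 10)]

# Away from 0 the function is positive while its Taylor polynomials are 0.
gap = f(0.5) - taylor_partial_sum(0.5, 50)
```

The gap at $x = 0.5$ equals $e^{-4}$ no matter how many Taylor terms are taken.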
In the above example, the function $f(x)$ approached 0 as $x \to 0$ at a rate so extremely fast that the $k$th derivative at $x = 0$ is 0 for all $k$. Its Taylor series could not tell the difference between the behaviour of the constant function 0 and the behaviour of the function $f(x)$ at $x = 0$. Look at a graph of the function $f(x)$ and see for yourself!
A Taylor series centered at $x = a$ is a function that tries to encode in it all of the local information at the center $x = a$. To be precise, it encodes in it the rates of change at $x = a$ at various orders of differentiation, i.e., the instantaneous curvature information at $x = a$. It is the function that would result if this local curvature at $x = a$ is used to construct the function globally. In this Taylor sense, the function $f(x)$ in the above example is not the most natural candidate to possess the curvature at $x = 0$; instead, the constant function 0 is the most natural function to possess the absolutely flat curvature at $x = 0$. Their Taylor series centered at 0 says so.
Definition 3.4.5. Let $f(x)$ be differentiable at $x = a$ infinitely many times. Form its $k$th Taylor polynomial centered at $x = a$:
$$P_{k,a}(x) = \sum_{n=0}^{k} \frac{f^{(n)}(a)}{n!} (x - a)^n.$$
Suppose the Taylor series of $f$ centered at $x = a$ has a radius of convergence $R > 0$. Define $R_{k,a} : (a - R, a + R) \to \mathbb{R}$ by
$$\begin{aligned} R_{k,a}(x) &= f(x) - P_{k,a}(x) \\ &= f(x) - \sum_{n=0}^{k} \frac{f^{(n)}(a)}{n!} (x - a)^n. \end{aligned}$$
$R_{k,a}$ is called the $k$th remainder function of $f(x)$ centered at $x = a$.

The following observation gives us a criterion that can be used to check
whether a function's Taylor series converges back to itself.
Remark 3.4.6. Suppose we have a function $f : \mathbb{R} \to \mathbb{R}$ such that $f$ is differentiable at $x = a$ infinitely many times. Form its Taylor series centered at $x = a$: $\sum_{n=0}^{\infty} a_n (x - a)^n$ where $a_n = \frac{f^{(n)}(a)}{n!}$. Assume that the Taylor series has radius of convergence $R > 0$. Then $f(x) = \sum_{n=0}^{\infty} a_n (x - a)^n$ uniformly on $[b, c]$ for any $[b, c] \subset (a - R, a + R)$ if and only if its $k$th remainder function $R_{k,a}(x) \to 0$ pointwise on $(a - R, a + R)$ as $k \to \infty$.
Proof. Note that if the remainder function converges to 0 pointwise on $(a - R, a + R)$, then the power series $\sum_{n=0}^{\infty} a_n (x - a)^n$ converges pointwise to $f$ on $[b, c] \subset (a - R, a + R)$. But by theorem 3.2.13, the power series converges uniformly to a function $g$ on $[b, c] \subset (a - R, a + R)$. Uniform limits are pointwise limits, and so $f = g$ on $[b, c]$ and $\sum_{n=0}^{\infty} a_n (x - a)^n$ actually converges uniformly to $f$ on $[b, c]$.
In the other direction, if $\sum_{n=0}^{\infty} a_n (x - a)^n$ converges uniformly back to the function $f$ on any $[b, c] \subset (a - R, a + R)$, then it is clear that the pointwise limit of the $k$th remainder functions is 0 on all of $(a - R, a + R)$.
In essence, the Taylor series of a function f converges back to f if and only if
its remainder function tends to the constant function 0 pointwise. The following
theorem makes the above criterion practical to use.
Theorem 3.4.7 [Taylor's Theorem]. Assume that $f(x)$ is $k + 1$ times differentiable on an open interval $I$ containing $a$. Form its $k$th Taylor polynomial centered at $x = a$ (defined on $I$):
$$P_{k,a}(x) = \sum_{n=0}^{k} a_n (x - a)^n$$
where $a_n = \frac{f^{(n)}(a)}{n!}$. Then for each $x \in I$, there exists some $c$ between $x$ and $a$ such that
$$R_{k,a}(x) := f(x) - P_{k,a}(x) = \frac{f^{(k+1)}(c)}{(k + 1)!} (x - a)^{k+1}.$$
Proof. This proof is omitted.
Remark 3.4.6 allows us to establish that a Taylor series converges back to the original function by merely checking that the pointwise limit of the $k$th remainder functions is 0. In practice, this means that we will fix an $x_0 \in (a - R, a + R)$ and proceed to show that the $k$th remainder function evaluated at $x = x_0$ tends to zero as $k$ tends to infinity. See the following example.
Example 3.18. We know from example 3.3 that $\sum_{n=0}^{\infty} \frac{x^n}{n!}$ converges on the
entire $\mathbb{R}$. We will now show that $e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}$ for all $x \in \mathbb{R}$. Let $f(x) = e^x$.
Note that $f'(x) = e^x$, $f''(x) = e^x$, \ldots, $f^{(k)}(x) = e^x$ for any $k \in \mathbb{N}$. So
$f^{(k)}(0) = e^0 = 1$ for all $k \in \mathbb{N}$. Its Taylor series centered at $x = 0$ is
\[ \sum_{n=0}^{\infty} \frac{x^n}{n!}. \]
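Does this converge back to $e^x$? We check the size of its $k$th remainder function
centered at $x = 0$, evaluated at some arbitrary $x_0 \in \mathbb{R}$:
\begin{align*}
|R_{k,0}(x_0)| &= \left| \frac{f^{(k+1)}(c_k)}{(k+1)!}\, x_0^{k+1} \right| \tag{3.9} \\
&= |f^{(k+1)}(c_k)| \left| \frac{x_0^{k+1}}{(k+1)!} \right| \\
&= |e^{c_k}| \left| \frac{x_0^{k+1}}{(k+1)!} \right|,
\end{align*}
for some $c_k$ between $x_0$ and $0$; equality (3.9) is due to Taylor's theorem. This
chain of equalities holds for all $k \in \mathbb{N}$, with a possibly different $c_k$ for each $k$. We know
that $f(x) = e^x$ is bounded on $[-|x_0|, |x_0|]$, so there exists an $M > 0$ such that
$f(c) = e^c < M$ for all $c \in [-|x_0|, |x_0|]$. Hence for each natural number $k$,
\[ 0 \le |R_{k,0}(x_0)| = |e^{c_k}| \left| \frac{x_0^{k+1}}{(k+1)!} \right| \le M \left| \frac{x_0^{k+1}}{(k+1)!} \right|. \]
Note that $\lim_{k\to\infty} M \left| \frac{x_0^{k+1}}{(k+1)!} \right| = 0$ and so by the squeeze theorem,
\[ \lim_{k\to\infty} |R_{k,0}(x_0)| = 0. \]
This limit holds for all $x_0 \in \mathbb{R}$ and so $R_{k,0}(x) \to 0$ pointwise on $\mathbb{R}$ as $k \to \infty$.

This shows that
\[ e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} \]
on $\mathbb{R}$, uniform on any $[a, b] \subset \mathbb{R}$.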
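The remainder estimate in Example 3.18 can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names are illustrative, and the constant $M$ is taken to be $e^{|x_0|}$, which is one admissible (non-strict) bound for $e^c$ on $[-|x_0|, |x_0|]$.

```python
import math

def exp_partial_sum(x, k):
    """Return the kth Taylor polynomial P_{k,0}(x) = sum_{n=0}^{k} x^n / n!."""
    return sum(x**n / math.factorial(n) for n in range(k + 1))

def remainder_bound(x, k):
    """Return M * |x|^(k+1) / (k+1)! with the assumed choice M = e^{|x|}."""
    return math.exp(abs(x)) * abs(x) ** (k + 1) / math.factorial(k + 1)

x0 = 3.0  # arbitrary test point; any real number could be used
for k in (5, 10, 20):
    actual = abs(math.exp(x0) - exp_partial_sum(x0, k))
    # The actual remainder should not exceed the bound (up to rounding).
    print(k, actual, remainder_bound(x0, k))
```

For large $k$ the true remainder drops below floating-point precision, so the printed error levels off near rounding error even though the bound keeps shrinking.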
We will also show that the Taylor series of sin(x) and cos(x) converge back
to themselves.
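Example 3.19. Consider the function $f(x) = \sin(x)$. We can easily establish,
using mathematical induction, that
\[ f^{(k)}(x) = \begin{cases}
\sin(x) & \text{if } k \equiv 0 \pmod 4, \\
\cos(x) & \text{if } k \equiv 1 \pmod 4, \\
-\sin(x) & \text{if } k \equiv 2 \pmod 4, \\
-\cos(x) & \text{if } k \equiv 3 \pmod 4.
\end{cases} \tag{3.10} \]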

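The Taylor series of $f(x)$ centered at $x = 0$ is
\[ \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} x^n, \]
where $f^{(n)}(0)$ is either $0$ or $\pm 1$ as described in (3.10) above. Since every alternate
term is $\pm\sin(0) = 0$, we can rewrite this series as
\[ \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!}. \]
Let $x_0 \in \mathbb{R}$ with $x_0 \ne 0$ (the series trivially converges at $x_0 = 0$). Let $b_n = (-1)^n \frac{x_0^{2n+1}}{(2n+1)!}$. We now have
\[ \frac{|b_{n+1}|}{|b_n|} = \frac{\left| (-1)^{n+1} \frac{x_0^{2n+3}}{(2n+3)!} \right|}{\left| (-1)^n \frac{x_0^{2n+1}}{(2n+1)!} \right|} = \frac{|x_0|^{2n+3} (2n+1)!}{|x_0|^{2n+1} (2n+3)!} = \frac{|x_0|^2}{(2n+2)(2n+3)} \to 0 \]
as $n \to \infty$. By the ratio test, the Taylor series above has radius of convergence
$R = \infty$.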
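Now form the $k$th remainder function for $f(x)$:
\[ R_{k,0}(x) := f(x) - P_{k,0}(x). \]
For any $x_0 \in \mathbb{R}$, Taylor's theorem states that there exists a $c_k$ between $x_0$ and $0$
satisfying
\[ R_{k,0}(x_0) = \frac{f^{(k+1)}(c_k)}{(k+1)!} x_0^{k+1}, \]
for all $k \in \mathbb{N} \cup \{0\}$. We conclude by equation (3.10) that $|f^{(k+1)}(c_k)| \le 1$ for all
$k \in \mathbb{N} \cup \{0\}$. We now have that
\begin{align*}
0 \le |R_{k,0}(x_0)| &= \left| \frac{f^{(k+1)}(c_k)}{(k+1)!} x_0^{k+1} \right| \\
&= |f^{(k+1)}(c_k)| \, \frac{|x_0|^{k+1}}{(k+1)!} \\
&\le \frac{|x_0|^{k+1}}{(k+1)!} \to 0
\end{align*}
as $k \to \infty$. By the squeeze theorem, $\lim_{k\to\infty} |R_{k,0}(x_0)| = 0$. This limit holds
for all $x_0 \in \mathbb{R}$ and so $R_{k,0}(x) \to 0$ pointwise on $\mathbb{R}$ as $k \to \infty$. This shows that
\[ f(x) = \sin(x) = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!} \]
on $\mathbb{R}$, uniform on any $[a, b] \subset \mathbb{R}$.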

Sample Question 25. Evaluate the following series:
\[ \sum_{n=0}^{\infty} \frac{(-1)^n \pi^{2n+1}}{4^{2n+1} (2n+1)!}. \]

Strategy: Whenever we see odd powers of $x$ in the numerator and odd factorials
in the denominator, we should immediately think of the power series for $\sin(x)$:
\[ \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!}, \]
convergent for any $x \in \mathbb{R}$. Upon closer examination, we realize that the given
series is just the power series for $\sin(x)$ evaluated at $x = \frac{\pi}{4}$. How did we realize
this? The key is that the odd powers of $\pi$ in the numerator and the same
odd powers of $4$ in the denominator can be merged together to form $\left(\frac{\pi}{4}\right)^{2n+1}$,
revealing the disguised power series for $\sin(x)$ evaluated at $x = \frac{\pi}{4}$:
\[ \sum_{n=0}^{\infty} \frac{(-1)^n \pi^{2n+1}}{4^{2n+1} (2n+1)!} = \sum_{n=0}^{\infty} (-1)^n \frac{\left(\frac{\pi}{4}\right)^{2n+1}}{(2n+1)!}. \]
We conclude that this series converges and evaluates to $\sin\left(\frac{\pi}{4}\right) = \frac{1}{\sqrt{2}}$.
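Sample Solution 25. We know that the power series $\sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!}$
converges for all $x \in \mathbb{R}$ and represents the function $\sin(x)$. Note that
\[ \sum_{n=0}^{\infty} \frac{(-1)^n \pi^{2n+1}}{4^{2n+1} (2n+1)!} = \sum_{n=0}^{\infty} (-1)^n \frac{\left(\frac{\pi}{4}\right)^{2n+1}}{(2n+1)!}. \]
But this is the power series $\sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!} = \sin(x)$ evaluated at $x = \frac{\pi}{4}$.
Hence the series converges to $\sin\left(\frac{\pi}{4}\right) = \frac{1}{\sqrt{2}}$.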
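As a quick sanity check on Sample Solution 25, the partial sums of the series can be compared against $\sin(\pi/4)$. This is only an illustrative sketch; the helper name and the choice of ten terms are assumptions made here, not part of the notes.

```python
import math

def sine_series_partial_sum(x, terms):
    """Sum the first `terms` terms of sum_{n>=0} (-1)^n x^(2n+1) / (2n+1)!."""
    return sum((-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1)
               for n in range(terms))

# Evaluate the series from Sample Question 25 at x = pi/4.
series_value = sine_series_partial_sum(math.pi / 4, 10)
closed_form = 1 / math.sqrt(2)  # sin(pi/4)
print(series_value, closed_form)
```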
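Example 3.20. Consider $g(x) = \cos(x)$. If we let $f(x) = \sin(x)$, then $g(x) = f'(x)$. We know that $f(x) = \sum_{n=0}^{\infty} a_n x^n$, where
\[ a_n = \begin{cases}
0 & \text{if } n \equiv 0 \pmod 4, \\
\frac{1}{n!} & \text{if } n \equiv 1 \pmod 4, \\
0 & \text{if } n \equiv 2 \pmod 4, \\
-\frac{1}{n!} & \text{if } n \equiv 3 \pmod 4.
\end{cases} \]
A direct application of corollary 3.3.7 shows that
\[ g(x) = \frac{d}{dx} f(x) = \frac{d}{dx} \left[ \sum_{n=0}^{\infty} a_n x^n \right] = \sum_{n=1}^{\infty} n a_n x^{n-1}, \]
whence after simplification,
\[ g(x) = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!}. \tag{3.11} \]
One can check that the series at (3.11) is the Taylor series for $\cos(x)$, uniform
on any $[a, b] \subset \mathbb{R}$.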
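Alternatively, we can proceed as follows:
\begin{align*}
\cos(x) = \frac{d}{dx} \sin(x) &= \frac{d}{dx} \left[ \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!} \right] \\
&= \sum_{n=0}^{\infty} \frac{d}{dx} \left[ (-1)^n \frac{x^{2n+1}}{(2n+1)!} \right] \\
&= \sum_{n=0}^{\infty} (-1)^n (2n+1) \frac{x^{2n}}{(2n+1)!} \\
&= \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!},
\end{align*}
uniform on any $[a, b] \subset \mathbb{R}$.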
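We will conclude this section with a result about multiplying sequences of
functions.
Theorem 3.4.8. Assume that $\{f_n\}, \{g_n\} \subset C[a, b]$. If $f_n \to f$, $g_n \to g$ uniformly on $[a, b]$ as $n \to \infty$, then $f_n g_n \to f g$ uniformly on $[a, b]$ as $n \to \infty$.
Proof. By the extreme value theorem, the continuous functions $f, g$ on $[a, b]$
achieve their maximums and minimums on $[a, b]$. Hence $\|f\|_\infty < M_0$ and
$\|g\|_\infty < M_1$ for some $M_0, M_1 > 0$. Choose $N_0 \in \mathbb{N}$ large enough so that
for all $n \ge N_0$,
\[ \|g_n - g\|_\infty < 1. \]
By the triangle inequality, we have that for all $n \ge N_0$,
\[ \|g_n\|_\infty = \|g_n - g + g\|_\infty \le \|g_n - g\|_\infty + \|g\|_\infty < 1 + M_1. \]
Hence let
\[ M_2 = \max_{1 \le n < N_0} \{ \|g_n\|_\infty \} \]
and $M = \max\{1 + M_1, M_2\}$. Then for all $n \in \mathbb{N}$, $\|g_n\|_\infty \le M$.
Let $\varepsilon > 0$. Choose an $N \in \mathbb{N}$ large enough so that for all $n \ge N$, both $\|f_n - f\|_\infty < \frac{\varepsilon}{2M}$
and $\|g_n - g\|_\infty < \frac{\varepsilon}{2M_0}$. Using the fact that $\|uv\|_\infty \le \|u\|_\infty \|v\|_\infty$ for any $u$,
$v \in C[a, b]$, we have that for all $n \ge N$,
\begin{align*}
\|f_n g_n - f g\|_\infty &= \|f_n g_n - f g_n + f g_n - f g\|_\infty \\
&\le \|f_n g_n - f g_n\|_\infty + \|f g_n - f g\|_\infty \\
&\le \|g_n\|_\infty \|f_n - f\|_\infty + \|f\|_\infty \|g_n - g\|_\infty \\
&< M \frac{\varepsilon}{2M} + M_0 \frac{\varepsilon}{2M_0} \\
&= \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\end{align*}
This shows that $f_n g_n \to f g$ uniformly on $[a, b]$ as $n \to \infty$.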
3.5
Applications
Example 3.21. Find a power series representation for $f(x) = \frac{x^2}{1+x^3}$.
To do this, we note that on $(-1, 1)$,
\[ g(u) := \frac{1}{1-u} = \sum_{n=0}^{\infty} u^n. \]
Hence for any $x$ such that $-x^3 \in (-1, 1)$, we have
\[ \frac{1}{1+x^3} = \frac{1}{1-(-x^3)} = g(-x^3) = \sum_{n=0}^{\infty} (-x^3)^n = \sum_{n=0}^{\infty} (-1)^n x^{3n}. \]
But $-x^3 \in (-1, 1)$ if and only if $x \in (-1, 1)$, hence for all $x \in (-1, 1)$, $\frac{1}{1+x^3} = \sum_{n=0}^{\infty} (-1)^n x^{3n}$. Now
\[ f(x) = x^2 \cdot \frac{1}{1+x^3} = x^2 \sum_{n=0}^{\infty} (-1)^n x^{3n} = \sum_{n=0}^{\infty} (-1)^n x^2 x^{3n} = \sum_{n=0}^{\infty} (-1)^n x^{3n+2}. \]
This is the required power series representation for $f(x)$ on $(-1, 1)$.
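Example 3.22. Find the 5th degree Taylor polynomial for $f(x) = x^2 e^{x^2} \sin(x)$
centered at $0$. Find $f^{(5)}(0)$.
Let $f(x) = O(g(x))$ here denote the statement that $f(x)$ is Big-O of $g(x)$
as $x \to 0$. We know that $e^u = \sum_{n=0}^{\infty} \frac{u^n}{n!}$ for $u \in \mathbb{R}$ and so
\[ e^{x^2} = \sum_{n=0}^{\infty} \frac{(x^2)^n}{n!} = \sum_{n=0}^{\infty} \frac{x^{2n}}{n!} = 1 + x^2 + \frac{x^4}{2!} + O(x^6); \]
also, $\sin(x) = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!} = x - \frac{x^3}{3!} + O(x^5)$. We now have, by theorem 3.4.8,
\begin{align*}
e^{x^2} \sin(x) &= \left( 1 + x^2 + \frac{x^4}{2!} + O(x^6) \right) \left( x - \frac{x^3}{3!} + O(x^5) \right) \\
&= x - \frac{x^3}{3!} + x^3 + O(x^5) \\
&= x + \frac{5}{6} x^3 + O(x^5), \\
x^2 e^{x^2} \sin(x) &= x^2 \left( x + \frac{5}{6} x^3 + O(x^5) \right) \\
&= x^3 + \frac{5}{6} x^5 + O(x^7).
\end{align*}
This shows that the 5th degree Taylor polynomial for $f(x)$ is $x^3 + \frac{5}{6} x^5$. By
lemma 3.4.1, we have that $f^{(5)}(0) = 5! \, a_5 = 5! \cdot \frac{5}{6} = 100$.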
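The coefficient bookkeeping in Example 3.22 can be double-checked with exact rational arithmetic. The sketch below is an illustration only: it represents each series by a list of coefficients truncated at degree 5 (the helper `multiply_truncated` and the variable names are assumptions introduced here) and multiplies them term by term.

```python
from fractions import Fraction
from math import factorial

DEGREE = 5  # we only need the Taylor polynomial up to x^5

def multiply_truncated(p, q, degree=DEGREE):
    """Multiply two coefficient lists, discarding terms above `degree`."""
    out = [Fraction(0)] * (degree + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= degree:
                out[i + j] += a * b
    return out

# e^{x^2} = sum x^{2n}/n!, stored as coefficients of x^0, x^1, ..., x^5.
exp_x2 = [Fraction(0)] * (DEGREE + 1)
for n in range(DEGREE // 2 + 1):
    exp_x2[2 * n] = Fraction(1, factorial(n))

# sin(x) = sum (-1)^n x^{2n+1}/(2n+1)!
sin_x = [Fraction(0)] * (DEGREE + 1)
for n in range((DEGREE - 1) // 2 + 1):
    sin_x[2 * n + 1] = Fraction((-1) ** n, factorial(2 * n + 1))

x_squared = [Fraction(0), Fraction(0), Fraction(1)]

taylor = multiply_truncated(x_squared, multiply_truncated(exp_x2, sin_x))
fifth_derivative_at_0 = factorial(5) * taylor[5]
print(taylor, fifth_derivative_at_0)
```

Truncating at degree 5 is safe here because terms of degree above 5 in either factor can only contribute to terms of degree above 5 in the product.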
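Example 3.23. Estimate
\[ \int_0^{0.1} \frac{1}{1+x^{27}} \, dx \]
with an error less than $\frac{1}{10^{28}}$.
We know that $\frac{1}{1-u} = \sum_{n=0}^{\infty} u^n$ for all $u \in (-1, 1)$. If $-x^{27} \in (-1, 1)$,
then $\frac{1}{1+x^{27}} = \frac{1}{1-(-x^{27})} = \sum_{n=0}^{\infty} (-x^{27})^n = \sum_{n=0}^{\infty} (-1)^n x^{27n}$. Since $-x^{27} \in (-1, 1)$ if and only if $x \in (-1, 1)$, we have that for all $x \in (-1, 1)$, $\frac{1}{1+x^{27}} = \sum_{n=0}^{\infty} (-1)^n x^{27n}$.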
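By corollary 3.3.2, we have that
\begin{align*}
\int_0^{0.1} \frac{1}{1+x^{27}} \, dx &= \int_0^{0.1} \sum_{n=0}^{\infty} (-1)^n x^{27n} \, dx \\
&= \sum_{n=0}^{\infty} \int_0^{0.1} (-1)^n x^{27n} \, dx \\
&= \sum_{n=0}^{\infty} (-1)^n \left[ \frac{x^{27n+1}}{27n+1} \right]_0^{0.1} \\
&= \sum_{n=0}^{\infty} (-1)^n \frac{(0.1)^{27n+1}}{27n+1}.
\end{align*}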
We see that $\int_0^{0.1} \frac{1}{1+x^{27}} \, dx = \sum_{n=0}^{\infty} (-1)^n \frac{(0.1)^{27n+1}}{27n+1}$. Using the error clause from
the alternating series test, we have that
\[ \left| \int_0^{0.1} \frac{1}{1+x^{27}} \, dx - \sum_{n=0}^{0} (-1)^n \frac{(0.1)^{27n+1}}{27n+1} \right| < \frac{(0.1)^{27(1)+1}}{27(1)+1} = \frac{1}{28 (10)^{28}} < \frac{1}{10^{28}}. \]
Hence $\sum_{n=0}^{0} (-1)^n \frac{(0.1)^{27n+1}}{27n+1} = (-1)^0 \frac{(0.1)^{27(0)+1}}{27(0)+1} = 0.1$ is the required estimate.
3.5.1
Banach Contractive Mapping Theorem
We will end the course with a very elegant result about contractive maps: maps
from a metric space into itself that bring every pair of elements of the space closer together. We will use it to establish the existence and uniqueness of a solution to
an integral equation.
Theorem 3.5.1 [Banach Contractive Mapping Theorem for $C[a, b]$].
Suppose that $\Phi : C[a, b] \to C[a, b]$ is a contractive map; that is, suppose that
$\Phi$ is such that there exists $k$ satisfying $0 \le k < 1$ with
\[ \|\Phi(u) - \Phi(v)\|_\infty \le k \|u - v\|_\infty \]
for all $u, v \in C[a, b]$. Then there exists a unique function $f \in C[a, b]$ such that
$\Phi(f) = f$. Here, $f$ is called a fixed point of $\Phi$.
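Proof. Let $f_0 \in C[a, b]$. For each $n \in \mathbb{N}$, define $f_n = \Phi(f_{n-1})$. Set $g_n = f_{n+1} - f_n$
for each $n \in \mathbb{N} \cup \{0\}$. Then we have
\begin{align*}
\|g_1\|_\infty &= \|f_2 - f_1\|_\infty = \|\Phi(f_1) - \Phi(f_0)\|_\infty \le k \|f_1 - f_0\|_\infty = k^1 \|g_0\|_\infty, \\
&\;\;\vdots \\
\|g_n\|_\infty &= \|f_{n+1} - f_n\|_\infty = \|\Phi(f_n) - \Phi(f_{n-1})\|_\infty \le k \|f_n - f_{n-1}\|_\infty \le k^n \|g_0\|_\infty.
\end{align*}
We can easily establish that $\|g_n\|_\infty \le k^n \|g_0\|_\infty$ for all $n \in \mathbb{N}$ using the principle
of mathematical induction. Since $0 \le k < 1$, the geometric series test shows that
$\sum_{n=0}^{\infty} k^n \|g_0\|_\infty = \|g_0\|_\infty \sum_{n=0}^{\infty} k^n$ converges, whence the comparison test shows
that $\sum_{n=0}^{\infty} \|g_n\|_\infty$ converges. By the Weierstrass $M$-test, $\sum_{n=0}^{\infty} g_n$ converges
uniformly to some $g \in C[a, b]$. But observe that
\[ \sum_{n=0}^{k} g_n = \sum_{n=0}^{k} (f_{n+1} - f_n) = (f_1 - f_0) + (f_2 - f_1) + \cdots + (f_{k+1} - f_k) = f_{k+1} - f_0. \]
This shows that $f_{k+1} - f_0 \to g$ with respect to $d_\infty$ as $k \to \infty$; therefore,
$f_{k+1} \to g + f_0$ with respect to $d_\infty$ as $k \to \infty$. Let $f = g + f_0 = \lim_{k\to\infty} f_k$. We
claim that $\Phi(f) = f$. To see this, note that for each $n \in \mathbb{N}$,
\[ 0 \le \|f_n - \Phi(f)\|_\infty = \|\Phi(f_{n-1}) - \Phi(f)\|_\infty \le k \|f_{n-1} - f\|_\infty \to 0 \]
as $n \to \infty$. Hence by the squeeze theorem, $\lim_{n\to\infty} \|f_n - \Phi(f)\|_\infty = 0$, implying
that $f_n \to \Phi(f)$ as $n \to \infty$. But $f_n \to f$ as $n \to \infty$, and limits are unique, and
so $\Phi(f) = f$. We also claim that $f$ is the only function to satisfy $\Phi(f) = f$. To
show this, suppose that $h \in C[a, b]$ satisfies $\Phi(h) = h$. Then
\[ 0 \le \|h - f\|_\infty = \|\Phi(h) - \Phi(f)\|_\infty \le k \|h - f\|_\infty. \]
Because $k < 1$, the above implies that $0 \le (1 - k) \|h - f\|_\infty \le 0$ and so
$\|h - f\|_\infty = 0$. This shows that $f = h$ and the proof is complete.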
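Example 3.24. Show that there exists a unique $f \in C[0, 1]$ so that
\[ f(x) = e^x + \int_0^x \frac{\sin(t)}{2} f(t) \, dt. \tag{3.12} \]
Let $\Phi : C[0, 1] \to C[0, 1]$ be defined by
\[ \Phi(g)(x) = e^x + \int_0^x \frac{\sin(t)}{2} g(t) \, dt \]
for each $g \in C[0, 1]$. Because integral functions (of continuous integrands) on
$[0, 1]$ are continuous by corollary 1.4.2, $\Phi(g) \in C[0, 1]$ for all $g \in C[0, 1]$. We
note that for any $x \in [0, 1]$, and $f$, $g \in C[0, 1]$,
\begin{align*}
|\Phi(g)(x) - \Phi(f)(x)| &= \left| e^x + \int_0^x \frac{\sin(t)}{2} g(t) \, dt - \left( e^x + \int_0^x \frac{\sin(t)}{2} f(t) \, dt \right) \right| \\
&= \left| \int_0^x \frac{\sin(t)}{2} (g(t) - f(t)) \, dt \right| \\
&\le \int_0^x \left| \frac{\sin(t)}{2} \right| |g(t) - f(t)| \, dt \\
&\le \int_0^x \left| \frac{\sin(t)}{2} \right| \|g - f\|_\infty \, dt,
\end{align*}
whence by linearity of definite integrals (theorem 1.3.1),
\begin{align*}
\int_0^x \left| \frac{\sin(t)}{2} \right| \|g - f\|_\infty \, dt &= \|g - f\|_\infty \int_0^x \left| \frac{\sin(t)}{2} \right| dt \\
&\le \|g - f\|_\infty \int_0^1 \left| \frac{\sin(t)}{2} \right| dt \\
&\le \|g - f\|_\infty \int_0^1 \frac{1}{2} \, dt \\
&= \frac{1}{2} \|g - f\|_\infty.
\end{align*}
This shows that $\|\Phi(g) - \Phi(f)\|_\infty \le \frac{1}{2} \|g - f\|_\infty$; that is, $\Phi$ is contractive. By the Banach
contractive mapping theorem, there exists a unique function $f_0 \in C[0, 1]$ such
that $\Phi(f_0) = f_0$. But a function $f$ satisfies the integral equation (3.12) if and only
if $\Phi(f) = f$. Hence $f_0$ is the unique solution to the integral equation (3.12).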
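The contraction estimate in Example 3.24 can be observed numerically. The sketch below is an approximation under stated assumptions, not the method of the notes: functions in $C[0, 1]$ are replaced by their values on a uniform grid, the integral is approximated with the trapezoid rule, and the names `apply_phi`, `sup_distance`, and the grid size `N` are all introduced here for illustration.

```python
import math

N = 200  # number of grid subintervals (an arbitrary choice)
xs = [i / N for i in range(N + 1)]

def apply_phi(g):
    """Approximate Phi(g)(x) = e^x + int_0^x (sin(t)/2) g(t) dt on the grid."""
    integrand = [math.sin(t) / 2 * gt for t, gt in zip(xs, g)]
    out = [math.exp(xs[0])]  # at x = 0 the integral is zero
    integral = 0.0
    for i in range(1, N + 1):
        # Trapezoid rule on [x_{i-1}, x_i], whose width is 1/N.
        integral += (integrand[i - 1] + integrand[i]) / (2 * N)
        out.append(math.exp(xs[i]) + integral)
    return out

def sup_distance(u, v):
    """Grid version of the sup-norm distance between u and v."""
    return max(abs(a - b) for a, b in zip(u, v))

f = [0.0] * (N + 1)  # start the iteration from the zero function
gaps = []
for _ in range(12):
    f_next = apply_phi(f)
    gaps.append(sup_distance(f_next, f))
    f = f_next
print(gaps)
```

Each entry of `gaps` should be at most half of the previous one, mirroring the constant $\frac{1}{2}$ derived above; in practice the gaps shrink considerably faster.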
Note that not only does the Banach contractive mapping theorem guarantee
uniqueness and existence of fixed points, it also provides a constructive method
to find the fixed point; namely, start with any function $f_0 \in C[a, b]$ and iteratively apply the contractive map $\Phi$ to it. The limit of this iteration will be the
desired fixed point of $\Phi$.
Example 3.25. Show that there exists a unique function $f_0(x) \in C[0, 1]$ such
that
\[ f_0(x) = x + \int_0^x t^2 f_0(t) \, dt. \tag{3.13} \]
Find a power series representation for this function on $[0, 1]$.
Let $\Phi : C[0, 1] \to C[0, 1]$ be defined by
\[ \Phi(g)(x) = x + \int_0^x t^2 g(t) \, dt. \]
Because integral functions (of continuous integrands) on $[0, 1]$ are continuous by
corollary 1.4.2, $\Phi(g) \in C[0, 1]$ for all $g \in C[0, 1]$. Note that $f$ is a solution to
(3.13) if and only if $\Phi(f) = f$. Observe that for any $x \in [0, 1]$, and $f$, $g \in C[0, 1]$,
we have that
\begin{align*}
|\Phi(g)(x) - \Phi(f)(x)| &= \left| x + \int_0^x t^2 g(t) \, dt - \left( x + \int_0^x t^2 f(t) \, dt \right) \right| \\
&= \left| \int_0^x t^2 (g(t) - f(t)) \, dt \right| \\
&\le \int_0^x |t^2| \, |g(t) - f(t)| \, dt \\
&\le \int_0^1 |t^2| \, |g(t) - f(t)| \, dt \\
&\le \int_0^1 t^2 \|g - f\|_\infty \, dt \\
&= \|g - f\|_\infty \int_0^1 t^2 \, dt \\
&= \|g - f\|_\infty \left[ \frac{t^3}{3} \right]_0^1 \\
&= \frac{1}{3} \|g - f\|_\infty.
\end{align*}
This shows that $\Phi$ is contractive. By the Banach contractive mapping theorem,
$\Phi$ has a unique fixed point $f_0$ with $\Phi(f_0) = f_0$, so $f_0$ is the unique solution to
(3.13).
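To find the series representation of $f_0(x)$, we begin with $f_1 = 0$, and $f_{n+1} = \Phi(f_n)$. So
\begin{align*}
f_2 = \Phi(f_1) &= x + \int_0^x t^2 \cdot 0 \, dt = x, \\
f_3 = \Phi(f_2) &= x + \int_0^x t^2 \cdot t \, dt = x + \left[ \frac{t^4}{4} \right]_0^x = x + \frac{x^4}{4}, \\
f_4 = \Phi(f_3) &= x + \int_0^x t^2 \left( t + \frac{t^4}{4} \right) dt = x + \int_0^x \left( t^3 + \frac{t^6}{4} \right) dt \\
&= x + \frac{x^4}{4} + \frac{x^7}{4 \cdot 7}, \\
f_5 = \Phi(f_4) &= x + \int_0^x t^2 \left( t + \frac{t^4}{4} + \frac{t^7}{4 \cdot 7} \right) dt \\
&= x + \int_0^x \left( t^3 + \frac{t^6}{4} + \frac{t^9}{4 \cdot 7} \right) dt \\
&= x + \frac{x^4}{4} + \frac{x^7}{4 \cdot 7} + \frac{x^{10}}{4 \cdot 7 \cdot 10},
\end{align*}
and so on. We can easily show using induction that for $n \ge 2$,
\[ f_n(x) = \sum_{i=0}^{n-2} \frac{x^{3i+1}}{1 \cdot 4 \cdot 7 \cdots (3i+1)}. \]
We know from the Banach contractive mapping theorem that if $f_{n+1} = \Phi(f_n)$
for all $n \in \mathbb{N}$, then $f_n \to f_0$ uniformly. This shows that
\[ f_0(x) = \sum_{n=0}^{\infty} \frac{x^{3n+1}}{1 \cdot 4 \cdot 7 \cdots (3n+1)} \]
is the required power series representation.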
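The iterates in Example 3.25 are polynomials, so the map $\Phi$ can be applied exactly to a list of coefficients. The following sketch (with the hypothetical helper `apply_phi` and exact `Fraction` arithmetic chosen here for illustration) reproduces $f_2, \ldots, f_5$ starting from $f_1 = 0$.

```python
from fractions import Fraction

def apply_phi(coeffs):
    """Apply Phi(p)(x) = x + int_0^x t^2 p(t) dt to a polynomial.

    coeffs[i] is the coefficient of x^i; the result uses the same layout.
    """
    result = [Fraction(0)] * (len(coeffs) + 3)
    for power, c in enumerate(coeffs):
        # t^2 * c * t^power integrates to c * x^(power+3) / (power+3).
        result[power + 3] += Fraction(c) / (power + 3)
    result[1] += 1  # the leading "x +" term
    return result

f = [Fraction(0)]   # f_1 = 0
iterates = [f]      # iterates[j] holds f_{j+1}
for _ in range(4):
    f = apply_phi(f)
    iterates.append(f)

f5 = iterates[4]
# Print the nonzero coefficients of f_5 as (power, coefficient) pairs.
print([(i, c) for i, c in enumerate(f5) if c != 0])
```

The denominators $4$, $4 \cdot 7 = 28$, and $4 \cdot 7 \cdot 10 = 280$ appear directly in the output, matching the pattern $1 \cdot 4 \cdot 7 \cdots (3i+1)$.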
Appendix A
Extra Sample Questions

A.1
Finding Radii and Intervals of Convergence
Recall that the interval of convergence of a power series $\sum_{n=0}^{\infty} a_n (x-a)^n$ is
symmetric around its center $a$ and may or may not contain either endpoint. One-half the length of this interval is called the power series' radius of convergence.
When asked to find the radius and interval of convergence of a power series, we
should first find the radius of convergence. The principal tools for determining
the radius of convergence are corollary 3.1.6 (involving the ratio test), and occasionally the root test. Once the radius of convergence has been determined, we
then find out if the power series converges at the endpoints. These two steps will
determine the interval of convergence. The tests used for endpoint convergence
are usually the alternating series test, divergence test, $p$-test for series, integral
test, and the comparison tests. The first sample question we give illustrates this
process.
Sample Question 26. Determine the radius of convergence and the interval
of convergence of the following power series:
\[ \sum_{n=1}^{\infty} \frac{(-1)^n x^n}{n 2^n}. \]
Strategy: Our tool for determining the radius of convergence of this power
series is corollary 3.1.6. Let $a_n$ be the $n$th term (coefficient) of this power series;
that is, let $a_n = \frac{(-1)^n}{n 2^n}$. What does the ratio $\frac{|a_{n+1}|}{|a_n|}$ approach as $n \to \infty$? In
the sample solution, we will determine that this limit is $\frac{1}{2}$ and so the radius of
convergence is $R = 2$.
We then test for the two endpoint cases $x = -R$ and $x = R$. We end
up using the $p$-test to prove that the power series diverges for the case $x = -2$
and the alternating series test to prove that the power series converges for the
case $x = 2$. The interval of convergence is therefore $(-2, 2]$ and the question is
finished. Note that the endpoint cases are two specific numeric series; testing
convergence of these two series requires techniques and strategies discussed in
chapter two.
Sample Solution 26. Let $a_n = \frac{(-1)^n}{n 2^n}$ for each $n \in \mathbb{N}$. Then
\[ \frac{|a_{n+1}|}{|a_n|} = \frac{\left| \frac{(-1)^{n+1}}{(n+1) 2^{n+1}} \right|}{\left| \frac{(-1)^n}{n 2^n} \right|} = \frac{n 2^n}{(n+1) 2^{n+1}} = \frac{n}{n+1} \cdot \frac{1}{2} \to 1 \cdot \frac{1}{2} = \frac{1}{2} \]
as $n \to \infty$. Corollary 3.1.6 then shows that the power series $\sum_{n=1}^{\infty} \frac{(-1)^n x^n}{n 2^n}$ has a
radius of convergence $R = \frac{1}{1/2} = 2$. Hence this power series converges on $(-2, 2)$,
and possibly at one or both of the endpoints $-2$ and $2$.
At the endpoint $x = -2$, the power series becomes
\[ \sum_{n=1}^{\infty} \frac{(-1)^n (-2)^n}{n 2^n} = \sum_{n=1}^{\infty} \frac{(-1)^n (-1)^n}{n} = \sum_{n=1}^{\infty} \frac{1}{n}, \]
which diverges by the $p$-test for series with $p = 1$. At the other endpoint $x = 2$,
the power series becomes
\[ \sum_{n=1}^{\infty} \frac{(-1)^n 2^n}{n 2^n} = \sum_{n=1}^{\infty} \frac{(-1)^n}{n}, \]
which is an alternating series with the terms $\frac{1}{n}$ positive, decreasing, and tending
to zero, and hence this series converges by the alternating series test.
We conclude that the radius of convergence of this power series is $R = 2$ and
it has interval of convergence $(-2, 2]$.
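The limiting ratio used in Sample Solution 26 can be tabulated exactly. This is a small illustrative sketch; the function names are assumptions made here, and `Fraction` is used only to avoid floating-point underflow in $2^n$ for large $n$.

```python
from fractions import Fraction

def coefficient(n):
    """Return a_n = (-1)^n / (n * 2^n) as an exact fraction (n >= 1)."""
    return Fraction((-1) ** n, n * 2 ** n)

def ratio(n):
    """Return |a_{n+1}| / |a_n|, which should approach 1/2."""
    return abs(coefficient(n + 1) / coefficient(n))

for n in (1, 10, 100, 1000):
    # The reciprocal of the limiting ratio is the radius of convergence.
    print(n, float(ratio(n)), float(1 / ratio(n)))
```

The printed reciprocals approach $2$ from above, consistent with $R = 2$.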
Sample Question 27. Determine the radius of convergence and the interval
of convergence of the following power series:
\[ \sum_{n=0}^{\infty} \frac{x^n}{n+2}. \]
Strategy: Again, we will use corollary 3.1.6 to determine the radius of convergence. We shall see that the radius of convergence of this series is $R = 1$. At
the endpoint $x = -1$, the series becomes
\[ \sum_{n=0}^{\infty} \frac{(-1)^n}{n+2}. \]
The $(-1)^n$ should remind us of the alternating series test. This numeric series
converges by the alternating series test. At the other endpoint $x = 1$, the series
becomes
\[ \sum_{n=0}^{\infty} \frac{1}{n+2} = \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots, \]
which is just the harmonic series with the first term missing. This numeric series
diverges by the $p$-test for series with $p = 1$.
Sample Solution 27. Let $a_n = \frac{1}{n+2}$ for each $n \ge 0$. Then $a_n \ne 0$ and we have
\[ \frac{a_{n+1}}{a_n} = \frac{\frac{1}{n+3}}{\frac{1}{n+2}} = \frac{n+2}{n+3} \to 1 \]
as $n \to \infty$. Corollary 3.1.6 shows that this power series has radius of convergence
$R = \frac{1}{1} = 1$. Hence the given power series converges on $(-1, 1)$ and possibly at one
or both of its endpoints.
At the endpoint $x = -1$, the power series becomes
\[ \sum_{n=0}^{\infty} \frac{(-1)^n}{n+2}, \]
which is an alternating series with the terms $\frac{1}{n+2}$ positive, decreasing, and
tending to zero, and hence the series converges by the alternating series test.
At the other endpoint $x = 1$, the power series becomes
\[ \sum_{n=0}^{\infty} \frac{1}{n+2} = \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots = \sum_{n=2}^{\infty} \frac{1}{n}, \]
which diverges by the $p$-test for series with $p = 1$.
We conclude that the power series $\sum_{n=0}^{\infty} \frac{x^n}{n+2}$ has radius of convergence $R = 1$ and interval of convergence $[-1, 1)$.
In the sample solution above, the ratio $\frac{a_{n+1}}{a_n}$ approached $1$ as $n \to \infty$, which
told us that the radius of convergence is $\frac{1}{1} = 1$. Do not confuse this with the
useless case of the ratio test where the ratio approaches $1$! Remember that we
are applying a corollary of the ratio test, not the ratio test itself. If we were
to apply the full ratio test to the power series, we would have included the $x^n$
in the terms $a_n$ as well. But because of corollary 3.1.6, knowing the limiting
ratio of the coefficients is enough. Read the corollary carefully and know exactly
what it states.
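Sample Question 28. Determine the radius of convergence and the interval
of convergence of the following power series:
\[ \sum_{n=0}^{\infty} \frac{3^n x^n}{(n+1)^2}. \]
Strategy: We do exactly as we did before: use corollary 3.1.6 to find the radius
of convergence and then test for the endpoints.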

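Sample Solution 28. Let $a_n = \frac{3^n}{(n+1)^2}$ for each $n \ge 0$. Then $a_n \ne 0$ and
\[ \frac{a_{n+1}}{a_n} = \frac{\frac{3^{n+1}}{(n+2)^2}}{\frac{3^n}{(n+1)^2}} = \frac{3^{n+1} (n+1)^2}{3^n (n+2)^2} = 3 \left( \frac{n+1}{n+2} \right)^2 \to 3 \cdot 1 = 3 \]
as $n \to \infty$. Corollary 3.1.6 shows that the radius of convergence of the given
power series is $R = \frac{1}{3}$. Hence the given power series converges on $\left( -\frac{1}{3}, \frac{1}{3} \right)$ and
possibly at one or both of its endpoints.
At the endpoint $x = \frac{1}{3}$, the power series becomes
\[ \sum_{n=0}^{\infty} \frac{3^n \left( \frac{1}{3} \right)^n}{(n+1)^2} = \sum_{n=0}^{\infty} \frac{1}{(n+1)^2} = \sum_{n=1}^{\infty} \frac{1}{n^2}, \]
which converges by the $p$-test for series with $p = 2$. At the other endpoint
$x = -\frac{1}{3}$, the power series becomes
\[ \sum_{n=0}^{\infty} \frac{3^n \left( -\frac{1}{3} \right)^n}{(n+1)^2} = \sum_{n=0}^{\infty} \frac{(-1)^n}{(n+1)^2} = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^2}, \]
which is an alternating series whose terms $\frac{1}{n^2}$ are positive, decreasing, and
tending to zero, and hence this series converges by the alternating series test.
We conclude that the power series $\sum_{n=0}^{\infty} \frac{3^n x^n}{(n+1)^2}$ has radius of convergence $\frac{1}{3}$
and interval of convergence $\left[ -\frac{1}{3}, \frac{1}{3} \right]$.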
Sample Question 29. Determine the radius of convergence and the interval
of convergence of the following power series:
\[ \sum_{n=0}^{\infty} \frac{n! \, x^n}{(2n)!}. \]
Strategy: This power series looks more complicated because it has more items
in the numerator and a large denominator. But this makes no difference to our
approach. We will proceed exactly as before. Letting $a_n = \frac{n!}{(2n)!}$, then $a_n \ne 0$
and we discover that its successive ratio should satisfy
\[ \frac{a_{n+1}}{a_n} = \frac{\frac{(n+1)!}{(2n+2)!}}{\frac{n!}{(2n)!}} = \frac{(n+1)! \, (2n)!}{n! \, (2n+2)!} = \frac{n+1}{(2n+2)(2n+1)} = \frac{n+1}{4n^2 + 6n + 2} \to 0 \]
as $n \to \infty$, because the numerator is approximately $n$ and the denominator
is approximately $4n^2$, for large $n$. Corollary 3.1.6 shows that the radius of
convergence is $R = \infty$ and hence this power series converges on all of $\mathbb{R}$.
This is the first problem we have encountered where the limiting ratio is
zero. As seen above, this will mean that the power series converges everywhere
and there are no endpoints to test!
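Sample Solution 29. Let $a_n = \frac{n!}{(2n)!}$ for all $n \ge 0$. Then $a_n \ne 0$ and for any
$n \ge 1$, we have
\[ 0 \le \frac{a_{n+1}}{a_n} = \frac{\frac{(n+1)!}{(2n+2)!}}{\frac{n!}{(2n)!}} = \frac{(n+1)! \, (2n)!}{n! \, (2n+2)!} = \frac{n+1}{(2n+2)(2n+1)} = \frac{n+1}{4n^2 + 6n + 2} \le \frac{2n}{4n^2} = \frac{1}{2n} \to 0 \]
as $n \to \infty$. Hence by the squeeze theorem, $\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = 0$ and corollary
3.1.6 shows that the given power series has radius of convergence $R = \infty$. We
conclude that the given power series has interval of convergence $\mathbb{R}$.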
Sample Question 30. Determine the radius of convergence and the interval
of convergence of the following power series:
\[ \sum_{n=1}^{\infty} \frac{n x^n}{1 \cdot 3 \cdot 5 \cdots (2n-1)}. \]
Strategy: This power series has an ugly denominator. As $n$ becomes larger,
the denominator becomes longer; however, this changing length does not affect
our strategy at all. We proceed as before with corollary 3.1.6 to find the limiting
ratio of the coefficients. We will again discover that the limiting ratio is $0$, which
means that the interval of convergence of this power series will be all of $\mathbb{R}$.
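Sample Solution 30. Let $a_n = \frac{n}{1 \cdot 3 \cdot 5 \cdots (2n-1)}$ for each $n \ge 1$. Then for each
$n \ge 1$, $a_n \ne 0$; also, we have
\[ \frac{a_{n+1}}{a_n} = \frac{\frac{n+1}{1 \cdot 3 \cdot 5 \cdots (2n-1)(2n+1)}}{\frac{n}{1 \cdot 3 \cdot 5 \cdots (2n-1)}} = \frac{(n+1)(1 \cdot 3 \cdot 5 \cdots (2n-1))}{n (1 \cdot 3 \cdot 5 \cdots (2n-1) \cdot (2n+1))} = \frac{n+1}{n} \cdot \frac{1}{2n+1} \to 1 \cdot 0 = 0 \]
as $n \to \infty$. Corollary 3.1.6 shows that the given power series has radius of
convergence $R = \infty$ and interval of convergence $\mathbb{R}$.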
Sample Question 31. Determine the radius of convergence and the interval
of convergence of the following power series:
\[ \sum_{n=2}^{\infty} \frac{x^n}{(\ln(n))^n}. \]
(The sum starts at $n = 2$ so that $\ln(n) > 0$ and every term is defined.)
Strategy: The first thing we should notice is that each term of this power series
is a fraction whose numerator and denominator are both raised to
the same exponent $n$. When this happens, the root test becomes the most convenient tool to determine the radius of convergence. Recall that if the limiting
$n$th root of the absolute values of the terms is $L$, then the series converges if
$L < 1$, and the series diverges if $L > 1$. No conclusion can be made if $L = 1$.
The $n$th root of the absolute value of the $n$th term of this power series looks
like $\sqrt[n]{\frac{|x|^n}{(\ln(n))^n}} = \frac{|x|}{\ln(n)}$, which tends to $0$ as $n \to \infty$ regardless of which value of $x$
we plug in. Hence the root test shows that this power series converges for all
values of $x$. (The interval of convergence is all of $\mathbb{R}$ and there are no endpoints
to test!)
Sample Solution 31. Let $x_0$ be any real number. Let $a_n = \frac{x_0^n}{(\ln(n))^n}$ for $n \ge 2$.
Then
\[ \sqrt[n]{|a_n|} = \sqrt[n]{\left| \frac{x_0^n}{(\ln(n))^n} \right|} = \sqrt[n]{\left( \frac{|x_0|}{\ln(n)} \right)^n} = \frac{|x_0|}{\ln(n)} \to 0 \]
as $n \to \infty$. The root test shows that the power series $\sum_{n=2}^{\infty} \frac{x^n}{(\ln(n))^n}$ converges
for $x = x_0$, for any $x_0 \in \mathbb{R}$. Hence the radius of convergence of this power series
is $R = \infty$ and its interval of convergence is all of $\mathbb{R}$.
The next question asks us to find the radius and interval of convergence for
a power series that is not centered at $0$. We first need to find its center.
Sample Question 32. Determine the radius of convergence and the interval
of convergence of the following power series:
\[ \sum_{n=0}^{\infty} n (3x+2)^n. \]
Strategy: As said before, we need to find the center of this power series first.
In order to put this power series into the standard form $\sum_{n=0}^{\infty} a_n (x-a)^n$, we
need to get rid of the $3$ in front of the $x$. To do this, we factor the $3$ out. The
power series now becomes
\[ \sum_{n=0}^{\infty} n \left[ 3 \left( x + \frac{2}{3} \right) \right]^n = \sum_{n=0}^{\infty} n 3^n \left( x - \left( -\frac{2}{3} \right) \right)^n. \]
The center is now apparent: the coefficient $a_n$ is $n \cdot 3^n$ and the center $a$ is
$-\frac{2}{3}$. Our interval of convergence will be symmetric around $-\frac{2}{3}$ (this information is why we need to know the center in the first place). Once we find
the radius of convergence $R$, we know that the interval of convergence will be
$\left( -\frac{2}{3} - R, -\frac{2}{3} + R \right)$, along with possibly either or both endpoints.
Finding the radius of convergence and checking endpoints should be routine
by now. The only extra thing that we have to do for this question is find
the center of the power series in question.
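Sample Solution 32. We first note that
\[ \sum_{n=0}^{\infty} n (3x+2)^n = \sum_{n=0}^{\infty} n \left[ 3 \left( x + \frac{2}{3} \right) \right]^n = \sum_{n=0}^{\infty} n 3^n \left( x + \frac{2}{3} \right)^n = \sum_{n=0}^{\infty} n 3^n \left( x - \left( -\frac{2}{3} \right) \right)^n \]
is a power series centered at $-\frac{2}{3}$. Let $a_n = n \cdot 3^n$ for each $n \ge 0$. Then $a_n \ne 0$ for all $n \ge 1$
and
\[ \frac{a_{n+1}}{a_n} = \frac{(n+1) \, 3^{n+1}}{n \, 3^n} = \frac{n+1}{n} \cdot 3 \to 1 \cdot 3 = 3 \]
as $n \to \infty$. Corollary 3.1.6 shows that this power series has radius of convergence $R = \frac{1}{3}$ and hence it has an interval of convergence $\left( -\frac{2}{3} - \frac{1}{3}, -\frac{2}{3} + \frac{1}{3} \right) = \left( -1, -\frac{1}{3} \right)$ along with possibly one or both of its endpoints.
At the endpoint $x = -1$, the given power series becomes
\[ \sum_{n=0}^{\infty} n (3 \cdot (-1) + 2)^n = \sum_{n=0}^{\infty} n \cdot (-1)^n, \]
which diverges by the divergence test because its terms $n \cdot (-1)^n$ do not
approach $0$ as $n \to \infty$. At the other endpoint $x = -\frac{1}{3}$, the given power series
becomes
\[ \sum_{n=0}^{\infty} n \left( 3 \cdot \left( -\frac{1}{3} \right) + 2 \right)^n = \sum_{n=0}^{\infty} n, \]
which also diverges by the divergence test because its terms $n$ grow without
bound. We conclude that this power series has interval of convergence $\left( -1, -\frac{1}{3} \right)$.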
In general, the radius of convergence of a power series changes after making
a substitution. Here is a question that directly asks us to find this change after
making the substitution $u = x^2$.
Sample Question 33. Suppose that the radius of convergence of the power
series
\[ \sum_{n=0}^{\infty} a_n x^n \]
is $R$. What is the radius of convergence of the power series
\[ \sum_{n=0}^{\infty} a_n x^{2n}? \]
Strategy: We quickly realize that the latter series is $\sum_{n=0}^{\infty} a_n (x^2)^n$, which is
just the former series with the substitution "$x = x^2$"; to avoid this awkwardness, we relabel the former power series as $\sum_{n=0}^{\infty} a_n u^n$. Now we can write the
substitution as $u = x^2$. (Changing the variable name from $x$ to $u$ has little to do with the actual solution; it merely makes the notation clearer.) We
know that the former power series has an interval of convergence $I$ containing
$(-R, R)$, with possibly one or both endpoints. Therefore, the latter series will
converge whenever $x^2 \in (-R, R)$, and will diverge whenever $x^2 \notin [-R, R]$.
Observe that $x^2 \in (-R, R)$ whenever $x \in \left( -\sqrt{R}, \sqrt{R} \right)$ and $x^2 \notin [-R, R]$
whenever $x \notin \left[ -\sqrt{R}, \sqrt{R} \right]$. Hence the power series $\sum_{n=0}^{\infty} a_n x^{2n}$ converges if
$x \in \left( -\sqrt{R}, \sqrt{R} \right)$ and nowhere else, except perhaps at one or both endpoints.
The inclusion or exclusion of endpoints does not affect the radius of convergence,
and so this shows that the radius of convergence for the power series
$\sum_{n=0}^{\infty} a_n x^{2n}$ is $\sqrt{R}$. The sample solution below makes this argument rigorous.
This sample question illustrates that whenever we make a substitution like
$u = x^2$, we must check to see if we have changed the radius of convergence. In
this example, after substituting $u = x^2$, we found that the radius of convergence
did change from $R$ to $\sqrt{R}$ (if $R = 1$, then there is no change).

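Sample Solution 33. For convenience, we relabel the power series $\sum_{n=0}^{\infty} a_n x^n$
as $\sum_{n=0}^{\infty} a_n u^n$. The power series $\sum_{n=0}^{\infty} a_n x^{2n} = \sum_{n=0}^{\infty} a_n (x^2)^n$ is obtained
by substituting $u = x^2$ in the power series $\sum_{n=0}^{\infty} a_n u^n$. We are given that
$\sum_{n=0}^{\infty} a_n u^n$ has a radius of convergence $R$, and hence it has an interval of convergence $I$ satisfying $[-R, R] \supseteq I \supseteq (-R, R)$. We conclude that $\sum_{n=0}^{\infty} a_n x^{2n} = \sum_{n=0}^{\infty} a_n (x^2)^n$ converges wherever $x^2 \in I$ and diverges wherever $x^2 \notin I$. But
$x^2 \in (-R, R) \subseteq I$ if $x \in \left( -\sqrt{R}, \sqrt{R} \right)$, and $x^2 \notin [-R, R] \supseteq I$ if $x \notin \left[ -\sqrt{R}, \sqrt{R} \right]$.
Hence $\sum_{n=0}^{\infty} a_n x^{2n}$ converges on an interval $J$ satisfying $\left[ -\sqrt{R}, \sqrt{R} \right] \supseteq J \supseteq \left( -\sqrt{R}, \sqrt{R} \right)$, which shows that it has a radius of convergence $\sqrt{R}$.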
A.2
Building Power Series from Analytic Formulae
The sample questions in this section require us to expand a function given by
an analytic formula (such as $e^x \cos(x)$ and $\frac{x^3}{1+x^2}$) into its Taylor series centered
at $0$. To do this, we simply need to find one power series representation valid
on some interval containing $0$. Recall that this is because once this valid
representation is found for a function $f(x)$, it must be its Taylor series. So
the task of finding the Taylor series for $f(x)$ is just the task of finding a valid
power series representation for $f(x)$. How do we go about finding power series
representations for functions?
We can try to build the given analytic formula using simpler functions
whose power series representations are known. Once we assemble the components into the given formula, we replace each component by its power series
representation and the result is the representation that we are looking for.
For example, suppose two simple functions $f$ and $g$ have power series representations $\sum_{n=0}^{\infty} a_n x^n$ and $\sum_{n=0}^{\infty} b_n x^n$, respectively. If a function $h$ can be
built by adding $f$ and $g$ together, then $h$ will have power series representation
$\sum_{n=0}^{\infty} a_n x^n + \sum_{n=0}^{\infty} b_n x^n = \sum_{n=0}^{\infty} (a_n + b_n) x^n$. This representation will be valid
on the common interval on which both component series are valid. In practice,
a complicated function $h$ usually cannot be built by just adding two simple
functions together. Building it may require clever additions, multiplications,
divisions, integrations, differentiations and substitutions of numerous functions
whose power series are known. In the questions that follow, we look at various
strategies employed in building the target function using known functions.
Here is a list of functions whose Taylor series are known:
1. $e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$, valid on $\mathbb{R}$.
2. $\sin(x) = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$, valid on $\mathbb{R}$.
3. $\cos(x) = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots$, valid on $\mathbb{R}$.
4. $\frac{1}{1-x} = \sum_{n=0}^{\infty} x^n = 1 + x + x^2 + x^3 + \cdots$, valid on $(-1, 1)$.
You may use any of these functions as building blocks in questions that follow;
know these representations well, in both the sigma notation and the expanded
notation. Remember that polynomials are Taylor series of themselves already,
valid everywhere. We begin with a simple example involving multiplication.
Sample Question 34. Find the Taylor series (centered at $0$) and its radius of
convergence for the following function:
\[ f(x) = x^2 \cos(x). \]
On what interval is the representation valid?
Strategy: This is the easiest type of function to assemble: a polynomial times
a function whose power series representation is known. Since $f(x)$ is just the
product of $x^2$ and $\cos(x)$, both of which are known functions (with known power
series expansions), we will simply multiply $x^2$ into the representation of $\cos(x)$,
which is $\sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!}$, to get our desired representation.
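Sample Solution 34. Since $\cos(x) = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!}$ on all of $\mathbb{R}$, we have
that
\begin{align*}
f(x) = x^2 \cos(x) &= x^2 \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!} \\
&= \sum_{n=0}^{\infty} (-1)^n \frac{x^2 x^{2n}}{(2n)!} \\
&= \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+2}}{(2n)!},
\end{align*}
convergent and valid on all of $\mathbb{R}$. This is the Taylor series for $f(x)$, with radius
of convergence $R = \infty$.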
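Here is one more question of this type.
Sample Question 35. Find the Taylor series (centered at $0$) and its radius of
convergence for the following function:
\[ f(x) = x e^{-x}. \]
Strategy: The formula $x e^{-x}$ practically assembles itself: it is the product of a
polynomial $x$ and the function $e^{-x}$. Now $e^{-x}$ is simply $e^u = \sum_{n=0}^{\infty} \frac{u^n}{n!}$ with a
substitution $u = -x$, so it has the representation $\sum_{n=0}^{\infty} \frac{(-x)^n}{n!} = \sum_{n=0}^{\infty} (-1)^n \frac{x^n}{n!}$.
We simply take this and multiply the $x$ into it, yielding the answer.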
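Sample Solution 35. First note that $e^u = \sum_{n=0}^{\infty} \frac{u^n}{n!}$, valid on all of $\mathbb{R}$. Hence
\begin{align*}
f(x) = x e^{-x} &= x \sum_{n=0}^{\infty} \frac{(-x)^n}{n!} = x \sum_{n=0}^{\infty} (-1)^n \frac{x^n}{n!} \\
&= \sum_{n=0}^{\infty} (-1)^n \frac{x \cdot x^n}{n!} = \sum_{n=0}^{\infty} (-1)^n \frac{x^{n+1}}{n!},
\end{align*}
convergent and valid on all of $\mathbb{R}$. This is the desired Taylor series for $f(x)$, with
radius of convergence $R = \infty$.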
Here is another question involving multiplication of two simple functions.
This time, however, the multiplication is messier because neither factor is
a polynomial.
Sample Question 36. Find the first three nonzero terms of the Taylor series
(centered at $0$) for the following function:
\[ f(x) = e^{-x^2} \cos(x). \]
Strategy: This question does not ask us to find the full power series representation: we need only find the first few terms. Again, the function $f(x)$ is very
easy to assemble because its formula practically assembles itself. It is a product
of the functions $\cos(x)$ and $e^u$, with substitution $u = -x^2$. We know the power
series representation for both of these functions; it is a matter of multiplying
the series out, term by term:
\begin{align*}
f(x) = e^{-x^2} \cos(x) &= \left( \sum_{n=0}^{\infty} \frac{(-x^2)^n}{n!} \right) \left( \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!} \right) \\
&= \left( \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{n!} \right) \left( \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!} \right) \\
&= \left( 1 - x^2 + \frac{x^4}{2!} - \frac{x^6}{3!} + \cdots \right) \left( 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots \right) \\
&= 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - x^2 + x^2 \cdot \frac{x^2}{2!} + \frac{x^4}{2!} + \cdots \\
&= 1 - \frac{3}{2} x^2 + \frac{25}{24} x^4 + \cdots.
\end{align*}
Anything after the $x^4$ terms will be of degree six and above. The first three
nonzero terms of the power series representation for $f(x)$ will therefore be $1 - \frac{3}{2} x^2 + \frac{25}{24} x^4$. We could get more terms if we like, and we could even spot a pattern;
however, it would involve more gruesome multiplication, which is probably why
the question asks us to stop at the first three terms.
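Sample Solution 36. We first note that $e^u = \sum_{n=0}^{\infty} \frac{u^n}{n!}$ for all $u \in \mathbb{R}$, and so
$e^{-x^2} = \sum_{n=0}^{\infty} \frac{(-x^2)^n}{n!} = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{n!}$, valid on all of $\mathbb{R}$.
Here, let $f(x) = O(g(x))$ denote the statement that $f(x)$ is Big-O of $g(x)$
as $x \to 0$. We have that
\begin{align*}
f(x) &= e^{-x^2} \cos(x) \\
&= \left( \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{n!} \right) \left( \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!} \right) \\
&= \left( 1 - x^2 + \frac{x^4}{2!} + O(x^6) \right) \left( 1 - \frac{x^2}{2!} + \frac{x^4}{4!} + O(x^6) \right) \\
&= 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - x^2 + x^2 \cdot \frac{x^2}{2!} + \frac{x^4}{2!} + O(x^6) \\
&= 1 - \frac{3}{2} x^2 + \frac{25}{24} x^4 + O(x^6),
\end{align*}
convergent and valid on $\mathbb{R}$. Hence the first three nonzero terms of the Taylor
series for $f(x)$ are $1 - \frac{3}{2} x^2 + \frac{25}{24} x^4$.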
The next three questions involve substitutions, a technique already discussed
in sample questions 22 and 23.
Sample Question 37. Find the Taylor series (centered at $0$) and its radius of
convergence for the following function:
\[ f(x) = \frac{1}{1 + 4x^2}. \]
Strategy: We follow exactly the same strategy as that of sample questions 22
and 23. Since this function feels like $\frac{1}{1-u}$, our intuition should tell us that
constructing $f(x)$ from the simple function $\frac{1}{1-u}$ is our best bet. By substituting
something into $u$, we hope to obtain $\frac{1}{1+4x^2}$. The obvious candidate is $-4x^2$.
Sample Solution 37. Let $g(u) = \frac{1}{1-u}$. Then $g(u)$ has power series representation
$\sum_{n=0}^{\infty} u^n$, valid only on $(-1, 1)$. Then we have that $f(x)$ has power series
representation
\[ f(x) = \frac{1}{1 + 4x^2} = \frac{1}{1 - (-4x^2)} = g(-4x^2) = \sum_{n=0}^{\infty} (-4x^2)^n = \sum_{n=0}^{\infty} (-1)^n 4^n x^{2n}, \]
convergent if and only if $-4x^2 = -(2x)^2 \in (-1, 1)$. This occurs if and
only if $x \in \left( -\frac{1}{2}, \frac{1}{2} \right)$, and so the above representation is valid only on
$\left( -\frac{1}{2}, \frac{1}{2} \right)$. This shows that $\sum_{n=0}^{\infty} (-1)^n 4^n x^{2n}$ is the Taylor series
for $f(x)$ with radius of convergence $\frac{1}{2}$.
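To see the interval of convergence in action, here is a small numerical sketch in Python (the function names and sample points are our own choices). Inside $\left(-\frac{1}{2}, \frac{1}{2}\right)$ the partial sums settle on $\frac{1}{1+4x^2}$; outside, the individual terms grow, so the series cannot converge.

```python
def partial_sum(x, terms):
    """Sum of (-1)^n 4^n x^(2n) for n = 0 .. terms-1."""
    return sum((-4 * x * x) ** n for n in range(terms))

def target(x):
    """The function the series is supposed to represent."""
    return 1 / (1 + 4 * x * x)

inside = 0.3   # lies in (-1/2, 1/2)
error_inside = abs(partial_sum(inside, 200) - target(inside))

outside = 0.6  # lies outside [-1/2, 1/2]
# The n-th term has size (4 * 0.36)^n = 1.44^n, which grows without bound.
term_sizes = [abs((-4 * outside * outside) ** n) for n in (10, 20, 40)]

print(error_inside, term_sizes)
```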
Finding the right thing to substitute in is the key to solving this type of
question. Try the following.
\[ f(x) = \frac{1}{4 + x^2}. \]
You may assume that the Taylor series is a valid representation for f (x) on its
interval of convergence.
Strategy: To build $\frac{1}{4+x^2}$, we follow the substitution strategy as before, starting
with $\frac{1}{1-u}$. We try to substitute something into $u$ to construct our target function.
But this time, the 1 from $\frac{1}{1-u}$ and the 4 from $\frac{1}{4+x^2}$ are different numbers!
No matter what we plug into $u$, we can't turn the 1 into a 4. So the trick here
is to simply factor out the offending 4 like this:
\[ f(x) = \frac{1}{4 + x^2} = \frac{1}{4 \left( 1 + \frac{x^2}{4} \right)} = \frac{1}{4} \cdot \frac{1}{1 + \frac{x^2}{4}}. \]
Once we figure out the representation of $\frac{1}{1 + \frac{x^2}{4}}$, we then multiply the $\frac{1}{4}$ in. Now
the power series representation for $\frac{1}{1 + \frac{x^2}{4}}$ is quite easy to obtain. We will make
the substitution $u = -\frac{x^2}{4}$ to get
\[ \sum_{n=0}^{\infty} \left( -\frac{x^2}{4} \right)^n = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{4^n}; \]
after multiplying the $\frac{1}{4}$ back in, we get
\[ \sum_{n=0}^{\infty} (-1)^n \frac{1}{4} \cdot \frac{x^{2n}}{4^n} = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{4^{n+1}}, \]
the desired representation. We then carefully check the radius of convergence
of this new power series and we are done.
The lesson to be learned is this: If there is a constant somewhere you don't
like, factor it out and multiply it back in later!
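Sample Solution 38. First note that
\[ f(x) = \frac{1}{4 + x^2} = \frac{1}{4 \left( 1 + \frac{x^2}{4} \right)} = \frac{1}{4} \cdot \frac{1}{1 - \left( -\frac{x^2}{4} \right)}. \]
Let $g(u) = \frac{1}{1-u}$, which has power series representation $\sum_{n=0}^{\infty} u^n$,
with interval of convergence $(-1, 1)$. Then we have that $f(x)$ has power series
representation
\begin{align*}
f(x) = \frac{1}{4} \cdot \frac{1}{1 - \left( -\frac{x^2}{4} \right)}
&= \frac{1}{4}\, g\!\left( -\frac{x^2}{4} \right) = \frac{1}{4} \sum_{n=0}^{\infty} \left( -\frac{x^2}{4} \right)^n \\
&= \frac{1}{4} \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{4^n} = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{4^{n+1}},
\end{align*}
which converges if and only if $-\frac{x^2}{4} = -\left( \frac{x}{2} \right)^2 \in (-1, 1)$, which occurs if and
only if $\left| \frac{x}{2} \right| < 1$; that is, if and only if $x \in (-2, 2)$. The representation
$\sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{4^{n+1}}$ is the Taylor series for
$f(x)$, with radius of convergence 2.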
Try the following sample question; it involves a more interesting substitution.
Sample Question 39. Find the Taylor series (centered at 0) and its interval
of convergence for the following function:
\[ f(x) = 2^x. \]
Strategy: The only basic function similar to $f(x)$ (that we can think of) is $e^u$.
We will try to substitute something into $u$ to construct $2^x$. Our job is to find
the item that will transform the base $e$ into 2. Here is how: Note that $e^{\ln(2)} = 2$,
and so to get $2^x$, we will have to substitute $u = \ln(2)x$. Now we have
\[ e^{\ln(2)x} = \left( e^{\ln(2)} \right)^x = 2^x. \]
The key to this insight is that $\ln(x)$ is the inverse function of $e^x$ on the positive
real numbers, and so $e^{\ln(2)}$ cancels out to produce 2 back.
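Sample Solution 39. The function $g(u) := e^u$ has power series representation
$\sum_{n=0}^{\infty} \frac{u^n}{n!}$, valid for all $u \in \mathbb{R}$. Hence $f(x)$ has Taylor series
\[ f(x) = 2^x = \left( e^{\ln(2)} \right)^x = e^{\ln(2)x} = g(\ln(2)x) = \sum_{n=0}^{\infty} \frac{(\ln(2)x)^n}{n!} = \sum_{n=0}^{\infty} \frac{(\ln(2))^n x^n}{n!}, \]
valid on all of $\mathbb{R}$.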
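A quick numerical check of this representation, written as a Python sketch (the function name `two_to_the_x` and the choice of 40 terms are ours): the partial sums should agree with $2^x$ at any real $x$, since the series is valid on all of $\mathbb{R}$.

```python
import math

def two_to_the_x(x, terms=40):
    """Partial sum of sum_{n>=0} (ln 2)^n x^n / n!."""
    log2 = math.log(2)
    total, term = 0.0, 1.0  # term holds (ln(2) x)^n / n!, starting at n = 0
    for n in range(terms):
        total += term
        term *= log2 * x / (n + 1)  # step from the n-th term to the (n+1)-th
    return total

samples = [-3.0, -0.5, 0.0, 1.0, 2.5]
errors = [abs(two_to_the_x(x) - 2.0 ** x) for x in samples]
print(errors)
```

Building each term from the previous one avoids computing large factorials and powers separately.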
The next question is a hybrid of trigonometric insight and an easy substitution.
\[ f(x) = \sin^2(x). \]
Strategy: Remember that $\sin^2(x)$ means $\sin(x)\sin(x)$. We could write out the
power series representation for $\sin(x)$ twice and multiply the pair out, term by
term. But this is not a good technique, as the multiplication could be extremely
messy, as shown in sample question 36, and we may not be able to spot the
pattern of the coefficients. We need a better (cleverer) method of constructing
$f(x)$.
If we remember our basic trigonometric identities, we should recall this one:
\[ \cos(2x) = 1 - 2\sin^2(x). \]
Solving for $\sin^2(x)$, we get that
\[ \sin^2(x) = \frac{1}{2} (1 - \cos(2x)). \]
We have just discovered that $\sin^2(x)$ can be constructed by assembling together
$\frac{1}{2}$, 1, and $\cos(2x)$. The power series representation of $\cos(2x)$ is easy to find:
we simply substitute $u = 2x$ into $\cos(u) = \sum_{n=0}^{\infty} (-1)^n \frac{u^{2n}}{(2n)!}$. We then multiply
the negative sign in, add 1 to it, and scale it by $\frac{1}{2}$. Since the cosine representation
is valid on all of $\mathbb{R}$, the resulting series will also be valid on all of $\mathbb{R}$, and we are
done.
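Sample Solution 40. Since $\cos(2x) = 1 - 2\sin^2(x)$, we get that $f(x) =
\sin^2(x) = \frac{1}{2}(1 - \cos(2x))$. Since $\cos(u)$ has power series representation
$\sum_{n=0}^{\infty} (-1)^n \frac{u^{2n}}{(2n)!}$, valid on all of $\mathbb{R}$, substituting $u = 2x$ shows that the power
series representation of $f(x)$ is
\begin{align*}
f(x) = \frac{1}{2} (1 - \cos(2x))
&= \frac{1}{2} \left( 1 - \sum_{n=0}^{\infty} (-1)^n \frac{(2x)^{2n}}{(2n)!} \right) \\
&= \frac{1}{2} \left( 1 + \sum_{n=0}^{\infty} (-1)^{n+1} \frac{(2x)^{2n}}{(2n)!} \right)
 = \frac{1}{2} + \frac{1}{2} \sum_{n=0}^{\infty} (-1)^{n+1} \frac{(2x)^{2n}}{(2n)!} \\
&= \frac{1}{2} + \sum_{n=0}^{\infty} (-1)^{n+1} \frac{2^{2n-1} x^{2n}}{(2n)!},
\end{align*}
which converges on $\mathbb{R}$ and is valid on $\mathbb{R}$. This is the Taylor series for $f(x)$; it
has radius of convergence $R = \infty$. (Since there is no other notationally effective
way to indicate that the constant term in the power series needs to be increased
by $\frac{1}{2}$, we can leave the power series representation as shown.)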
Integration enters our collection of assembling techniques in the following
question.
\[ f(x) = \ln(1 + x). \]
You may assume that the Taylor series is a valid representation of $f(x)$ on its
interval of convergence.
Strategy: This function does not look like any of the functions that we are
familiar with. It does not look like $e^x$, $\sin(x)$, $\cos(x)$ or $\frac{1}{1-x}$. This should tell
us that we cannot assemble $f(x)$ using addition, multiplication, or substitution
of those functions. This leaves us with integration and differentiation. Recall
that the integral of $\frac{1}{u}$ is $\ln(u)$, and so integrating $\frac{1}{1+x}$ will yield $\ln(1 + x)$:
\[ \int_0^x \frac{1}{1+t}\, dt = \ln(1+t) \Big|_0^x = \ln(1 + x) - \ln(1 + 0) = \ln(1 + x). \]
We have successfully built $\ln(1 + x)$ by integrating a function $\frac{1}{1+x}$ whose power
series representation is easy to find. The desired representation for $\ln(1 + x)$ is
therefore obtained by integrating the power series representation for $\frac{1}{1+x}$, term
by term.
When should we use integration to build the target function? Look for these
two functions: $\ln(u)$ and $\arctan(u)$, as they are easily obtained by integrating
simple functions.

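Sample Solution 41. Note that
\[ \int_0^x \frac{1}{1+t}\, dt = \ln(1+t) \Big|_0^x = \ln(1 + x) - \ln(1 + 0) = \ln(1 + x), \]
for any $x \in (-1, 1)$. Also note that for $t \in (-1, 1)$, $\frac{1}{1+t} = \frac{1}{1-(-t)}$ has
representation $\sum_{n=0}^{\infty} (-t)^n = \sum_{n=0}^{\infty} (-1)^n t^n$, with radius of convergence 1. Hence on
$(-1, 1)$, the function $f(x) = \ln(1 + x)$ has power series representation
\begin{align*}
\int_0^x \frac{1}{1+t}\, dt
&= \int_0^x \left( \sum_{n=0}^{\infty} (-1)^n t^n \right) dt = \sum_{n=0}^{\infty} \int_0^x (-1)^n t^n \, dt \\
&= \sum_{n=0}^{\infty} (-1)^n \int_0^x t^n \, dt = \sum_{n=0}^{\infty} (-1)^n \left[ \frac{t^{n+1}}{n+1} \right]_0^x \\
&= \sum_{n=0}^{\infty} (-1)^n \left( \frac{x^{n+1}}{n+1} - 0 \right) = \sum_{n=0}^{\infty} (-1)^n \frac{x^{n+1}}{n+1}
 = \sum_{n=1}^{\infty} (-1)^{n-1} \frac{x^n}{n},
\end{align*}
with radius of convergence the same as that of $\sum_{n=0}^{\infty} (-1)^n t^n$, which is 1. This
representation is the Taylor series for $f(x)$.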
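The term-by-term integration can be checked numerically. In this Python sketch (the function name `log1p_series` and the sample point $x = 0.5$ are our own choices), the partial sums of $\sum_{n=1}^{\infty} (-1)^{n-1} \frac{x^n}{n}$ approach $\ln(1+x)$ as more terms are included.

```python
import math

def log1p_series(x, terms):
    """Partial sum of sum_{n>=1} (-1)^(n-1) x^n / n, using `terms` terms."""
    return sum((-1) ** (n - 1) * x ** n / n for n in range(1, terms + 1))

x = 0.5  # any point of (-1, 1) would do
errors = [abs(log1p_series(x, k) - math.log(1 + x)) for k in (5, 10, 20, 40)]
print(errors)  # the error shrinks as the number of terms grows
```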
Here is a question that requires both integration and subtraction.


\[ f(x) = \ln\left( \frac{1+x}{1-x} \right). \]
You may assume that the Taylor series is a valid representation of f (x) on its
interval of convergence.
Strategy: This may look intimidating at first, but the previous sample question
showed us how $\ln(u)$ can be obtained from integration, which definitely helps
here. The question is: How do we get $\ln\left( \frac{1+x}{1-x} \right)$? If we remember our logarithm
laws, we should recall that $\ln\left( \frac{a}{b} \right) = \ln(a) - \ln(b)$ for all positive $a$, $b$. We now
have
\[ f(x) = \ln\left( \frac{1+x}{1-x} \right) = \ln(1 + x) - \ln(1 - x), \]
valid when $x \in (-1, 1)$. This discovery dissolves the question into a pair of easy
integration problems. We use the technique from the previous sample question.
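Sample Solution 42. First note that for all $x \in (-1, 1)$, we have that $1 + x > 0$
and $1 - x > 0$, so that
\[ f(x) = \ln\left( \frac{1+x}{1-x} \right) = \ln(1 + x) - \ln(1 - x). \]
Also note that $\int_0^x \frac{1}{1+t}\, dt = \ln(1+t) \big|_0^x = \ln(1 + x) - \ln(1 + 0) = \ln(1 + x)$
and $\int_0^x \frac{1}{1-t}\, dt = -\ln(1-t) \big|_0^x = -\ln(1 - x) + \ln(1 - 0) = -\ln(1 - x)$, for all
$x \in (-1, 1)$. Therefore,
\begin{align*}
f(x) &= \ln(1 + x) - \ln(1 - x) \\
&= \int_0^x \frac{1}{1+t}\, dt + \int_0^x \frac{1}{1-t}\, dt \\
&= \int_0^x \left( \frac{1}{1+t} + \frac{1}{1-t} \right) dt,
\end{align*}
for all $x \in (-1, 1)$.
Let $g(u) = \frac{1}{1-u}$. Then $g(u) = \sum_{n=0}^{\infty} u^n$ for all $u \in (-1, 1)$. Observe that
\[ \frac{1}{1+t} = \frac{1}{1 - (-t)} = g(-t) = \sum_{n=0}^{\infty} (-t)^n = \sum_{n=0}^{\infty} (-1)^n t^n, \]
valid whenever $-t \in (-1, 1)$; i.e., whenever $t \in (-1, 1)$. This means that
$\frac{1}{1+t} + \frac{1}{1-t}$ has representation
\[ \sum_{n=0}^{\infty} (-1)^n t^n + \sum_{n=0}^{\infty} t^n = \sum_{n=0}^{\infty} \left( (-1)^n + 1 \right) t^n = 2 + 2t^2 + 2t^4 + 2t^6 + \cdots = \sum_{n=0}^{\infty} 2t^{2n}, \]
valid on its interval of convergence $(-1, 1)$. We now have that
\begin{align*}
f(x) = \int_0^x \left( \frac{1}{1+t} + \frac{1}{1-t} \right) dt
&= \int_0^x \left( \sum_{n=0}^{\infty} 2t^{2n} \right) dt \\
&= \sum_{n=0}^{\infty} \int_0^x 2t^{2n} \, dt = \sum_{n=0}^{\infty} 2 \left[ \frac{t^{2n+1}}{2n+1} \right]_0^x \\
&= \sum_{n=0}^{\infty} 2 \left( \frac{x^{2n+1}}{2n+1} - 0 \right) = \sum_{n=0}^{\infty} 2\, \frac{x^{2n+1}}{2n+1},
\end{align*}
valid at least on $(-1, 1)$. Thus $\sum_{n=0}^{\infty} 2\, \frac{x^{2n+1}}{2n+1}$ is the desired Taylor series for $f(x)$
and it has a radius of convergence $R$ the same as that of the integrand series
$\sum_{n=0}^{\infty} 2t^{2n}$, which is 1.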
Try integration with the following question, which involves arctan(u).
\[ f(x) = \arctan(2x). \]
Strategy: Recall that the arctangent function can be obtained by integrating
$\frac{1}{1+t^2}$ from 0 to $u$; i.e., $\int_0^u \frac{1}{1+t^2}\, dt = \arctan(u)$. The representation for $\frac{1}{1+t^2}$
is easy to find (see sample question 23). We start with the basic function
$\frac{1}{1-u} = \sum_{n=0}^{\infty} u^n$ and substitute $u = -t^2$ to obtain its representation
$\frac{1}{1+t^2} = \sum_{n=0}^{\infty} (-t^2)^n = \sum_{n=0}^{\infty} (-1)^n t^{2n}$, valid on $(-1, 1)$. Hence integrating this term
by term will yield the power series representation of $\arctan(u)$. Next we simply
plug $2x$ into $u$ to get the answer.
In summary, we built the target function $\arctan(2x)$ by starting with $\frac{1}{1-u}$,
whose representation is known; then we made the substitution $u = -t^2$; next,
we integrated $\frac{1}{1+t^2}$ from 0 to $u$; finally, we made the substitution $u = 2x$. Doing
the same manipulations to the power series representation of $\frac{1}{1-u}$ yields the
answer.
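Sample Solution 43. Note that
\[ \int_0^u \frac{1}{1+t^2}\, dt = \arctan(t) \Big|_0^u = \arctan(u) - \arctan(0) = \arctan(u), \]
for any $u \in (-1, 1)$. Also note that for $t \in (-1, 1)$, $\frac{1}{1+t^2} = \frac{1}{1-(-t^2)}$ has
representation $\sum_{n=0}^{\infty} (-t^2)^n = \sum_{n=0}^{\infty} (-1)^n t^{2n}$, with radius of convergence 1. Hence
on $(-1, 1)$, the function $\arctan(u)$ has power series representation
\begin{align*}
\int_0^u \frac{1}{1+t^2}\, dt
&= \int_0^u \left( \sum_{n=0}^{\infty} (-1)^n t^{2n} \right) dt = \sum_{n=0}^{\infty} \int_0^u (-1)^n t^{2n} \, dt \\
&= \sum_{n=0}^{\infty} (-1)^n \int_0^u t^{2n} \, dt = \sum_{n=0}^{\infty} (-1)^n \left[ \frac{t^{2n+1}}{2n+1} \right]_0^u \\
&= \sum_{n=0}^{\infty} (-1)^n \left( \frac{u^{2n+1}}{2n+1} - 0 \right) = \sum_{n=0}^{\infty} (-1)^n \frac{u^{2n+1}}{2n+1},
\end{align*}
with radius of convergence identical to that of $\sum_{n=0}^{\infty} (-1)^n t^{2n}$, which is 1. The
power series representation for $f(x) = \arctan(2x)$ is therefore
\[ \sum_{n=0}^{\infty} (-1)^n \frac{(2x)^{2n+1}}{2n+1} = \sum_{n=0}^{\infty} (-1)^n \frac{2^{2n+1} x^{2n+1}}{2n+1}. \]
This series converges whenever $2x \in (-1, 1)$ and diverges whenever $2x \notin [-1, 1]$;
i.e., this series converges only on the interval $\left( -\frac{1}{2}, \frac{1}{2} \right)$ and possibly on one or
both of its endpoints. Hence this representation has radius of convergence $\frac{1}{2}$
and is valid at least on the interval $\left( -\frac{1}{2}, \frac{1}{2} \right)$. This series is the desired Taylor
series for $f(x)$.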
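As a numerical illustration of the radius of convergence $\frac{1}{2}$, the Python sketch below (function name and sample points are our own) compares partial sums with $\arctan(2x)$ at a point inside the interval and shows that the terms blow up at a point outside it.

```python
import math

def arctan2x_series(x, terms):
    """Partial sum of sum_{n>=0} (-1)^n 2^(2n+1) x^(2n+1) / (2n+1)."""
    return sum((-1) ** n * (2 * x) ** (2 * n + 1) / (2 * n + 1) for n in range(terms))

inside = 0.4    # 2x = 0.8 lies in (-1, 1)
error_inside = abs(arctan2x_series(inside, 200) - math.atan(2 * inside))

outside = 0.75  # 2x = 1.5 lies outside [-1, 1]
# Sizes of individual terms; they grow, so the series diverges here.
term_sizes = [(2 * outside) ** (2 * n + 1) / (2 * n + 1) for n in (5, 10, 20)]

print(error_inside, term_sizes)
```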
The next question involves division. We follow the same method of long
division of polynomials used in algebra. Because we have not formally justified
division of power series, for the next question, you may assume that division
produces a valid representation for the quotient function. Typically, questions
will only ask for the first few terms of the quotient power series.
Sample Question 44. Use division of power series to find the first three
nonzero terms of the Taylor series (centered at 0) for the function
\[ f(x) = \frac{\ln(1 - x)}{e^x}. \]
You may assume that the quotient series is the Taylor series for f (x), valid on
an open interval containing 0.
Strategy: This question directly asks us to divide the power series representation
of $\ln(1 - x)$ by the power series representation of $e^x$. Since we want three
nonzero terms in the quotient series, three nonzero terms from both the divisor
series and the dividend series should be enough. When dividing, remember that
the order in which the powers of $x$ are considered is reversed compared to normal
polynomial long division. We must clear the lowest power of $x$ first before
trying to clear a higher power of $x$.
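Sample Solution 44. Note that $\ln(1 - x) = -\int_0^x \frac{1}{1-t}\, dt = -\int_0^x \left( \sum_{n=0}^{\infty} t^n \right) dt$,
valid for all $x \in (-1, 1)$, and so
\begin{align*}
\ln(1 - x) &= -\int_0^x \left( \sum_{n=0}^{\infty} t^n \right) dt = -\sum_{n=0}^{\infty} \int_0^x t^n \, dt
 = -\sum_{n=0}^{\infty} \left[ \frac{t^{n+1}}{n+1} \right]_0^x \\
&= -\sum_{n=0}^{\infty} \left( \frac{x^{n+1}}{n+1} - 0 \right) = -\sum_{n=0}^{\infty} \frac{x^{n+1}}{n+1}
 = -x - \frac{1}{2} x^2 - \frac{1}{3} x^3 - \frac{1}{4} x^4 - \cdots,
\end{align*}
valid for all $x \in (-1, 1)$. The Taylor series for $e^x$ is $\sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + x + \frac{1}{2} x^2 +
\frac{1}{6} x^3 + \cdots$, valid everywhere. The Taylor series for $f(x) = \frac{\ln(1-x)}{e^x}$ is therefore
obtained by dividing the two series:
\[ \frac{-x - \frac{1}{2} x^2 - \frac{1}{3} x^3 - \frac{1}{4} x^4 - \cdots}{1 + x + \frac{1}{2} x^2 + \frac{1}{6} x^3 + \cdots}. \]
As shown below, three nonzero terms from both the numerator and the denominator
suffice to yield three nonzero terms in the quotient series:
\[ \left( -x - \tfrac{1}{2} x^2 - \tfrac{1}{3} x^3 + \cdots \right) \div \left( 1 + x + \tfrac{1}{2} x^2 + \cdots \right) = -x + \tfrac{1}{2} x^2 - \tfrac{1}{3} x^3 + \cdots \]
\[
\begin{array}{rrrl}
-x & {}- \tfrac{1}{2} x^2 & {}- \tfrac{1}{3} x^3 & {}+ \cdots \\
-x & {}- x^2 & {}- \tfrac{1}{2} x^3 & {}+ \cdots \\ \hline
 & \tfrac{1}{2} x^2 & {}+ \tfrac{1}{6} x^3 & {}+ \cdots \\
 & \tfrac{1}{2} x^2 & {}+ \tfrac{1}{2} x^3 & {}+ \cdots \\ \hline
 & & -\tfrac{1}{3} x^3 & {}+ \cdots \\
 & & -\tfrac{1}{3} x^3 & {}+ \cdots \\ \hline
 & & 0 & {}+ \cdots
\end{array}
\]
Hence the first three terms of the Taylor series for $f(x)$ are $-x + \frac{1}{2} x^2 - \frac{1}{3} x^3$, as
required.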
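The long division can also be carried out mechanically. The Python sketch below is our own illustration (the name `divide_series` is hypothetical, and it assumes the divisor has a nonzero constant term); it clears the lowest power of $x$ first, exactly as described in the strategy, and uses exact fractions.

```python
from fractions import Fraction

def divide_series(numerator, denominator, count):
    """First `count` coefficients of numerator/denominator, lowest power first.

    Assumes denominator[0] != 0, which holds for e^x.
    """
    num = list(numerator) + [Fraction(0)] * count
    den = list(denominator) + [Fraction(0)] * count
    quotient = []
    for k in range(count):
        # Whatever the earlier quotient terms already contribute to x^k is
        # subtracted first; the remainder must be cleared by the new term.
        already = sum(quotient[j] * den[k - j] for j in range(k))
        quotient.append((num[k] - already) / den[0])
    return quotient

# ln(1 - x) = -x - x^2/2 - x^3/3 - ...   (coefficients of x^0 .. x^3)
log_coeffs = [Fraction(0)] + [Fraction(-1, n) for n in range(1, 4)]
# e^x = 1 + x + x^2/2 + x^3/6 + ...
exp_coeffs = [Fraction(1), Fraction(1), Fraction(1, 2), Fraction(1, 6)]

quotient = divide_series(log_coeffs, exp_coeffs, 4)
print(quotient)  # coefficients of x^0, x^1, x^2, x^3 in the quotient
```

Supplying more coefficients of both series and a larger `count` yields further terms of the quotient.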
Finally, since all valid power series representations of functions are necessarily
the Taylor series of those functions, we can always find the (potential) power
series representation of an infinitely differentiable function $f(x)$ by calculating
its derivatives. This is the most direct approach to finding the Taylor series for
a function $f(x)$, and it involves no assembling. All we do is compute (usually
by brute force) the derivatives of all orders of $f(x)$ at a point $a$ where it
is differentiable infinitely many times. Note that this method will always work
for any infinitely differentiable function, regardless of how complicated the
function is. The (very major) drawback is that complicated functions will have very
complicated derivatives, making them nearly impossible to compute.
Sample Question 45. Find the Taylor series (centered at 0) for the following
function:
\[ f(x) = \sqrt{1 + x}. \]
What is the radius of convergence of this Taylor series? You may assume that
this Taylor series is a valid representation of f (x) on its interval of convergence.
Strategy: After some thought, we realize that this power series is hopeless to
assemble. We cannot get the square root. This leaves us with the method of
directly finding its Taylor series by computing its derivatives of all orders at
$x = 0$. Of course, we cannot compute infinitely many derivatives; we need to
spot a pattern in the derivatives as $n$ increases. To do so, we compute the first
few derivatives and take a guess at what the pattern is. We can usually prove
that our guess is correct via mathematical induction. Once we have determined
all of its derivatives, we need to find the radius of convergence $R$ of the resulting
Taylor series, using standard techniques discussed in the previous section.
Finding the pattern of the derivatives for this function is easy, but difficult
to write down. We list below the first five derivatives of $f(x) = \sqrt{1 + x}$.
\begin{align*}
f(x) &= (1 + x)^{\frac{1}{2}}, \\
f'(x) &= \frac{1}{2} (1 + x)^{-\frac{1}{2}}, \\
f''(x) &= -\frac{1}{4} (1 + x)^{-\frac{3}{2}}, \\
f'''(x) &= \frac{3}{8} (1 + x)^{-\frac{5}{2}}, \\
f^{(4)}(x) &= -\frac{15}{16} (1 + x)^{-\frac{7}{2}}, \\
f^{(5)}(x) &= \frac{105}{32} (1 + x)^{-\frac{9}{2}}.
\end{align*}
The numerator of the coefficient fraction is the product of the first $n - 1$ odd
numbers with alternating signs and the denominator looks like $2^n$. The exponent
on $1 + x$ seems to be $\frac{1-2n}{2}$. The difficult part of this pattern is writing down
the product of the first $n - 1$ odd numbers. Observe that the product of the
first $n - 1$ odd numbers is the factorial of $2n - 2$ with the first $n - 1$ even numbers
removed. With that in mind, we should find that the product of the first $n - 1$
odd numbers is equal to
\begin{align*}
(1)(3)(5)(7) \cdots (2n - 3) &= \frac{(2n - 2)!}{(2)(4)(6)(8) \cdots (2n - 2)} \\
&= \frac{(2n - 2)!}{(2 \cdot 1)(2 \cdot 2)(2 \cdot 3)(2 \cdot 4) \cdots (2 \cdot (n - 1))} \\
&= \frac{(2n - 2)!}{2^{n-1} (n - 1)!},
\end{align*}
valid for every $n \geq 1$. Since $f'(x)$ carries a positive sign and the signs alternate
from there, the sign of $f^{(n)}(x)$ is $(-1)^{n-1}$. Thus the desired formula for $f^{(n)}(x)$ is
\[ f^{(n)}(x) = (-1)^{n-1} \frac{(2n - 2)!}{2^{n-1} (n - 1)!} \cdot \frac{1}{2^n} (1 + x)^{\frac{1-2n}{2}} = (-1)^{n-1} \frac{(2n - 2)!}{2^{2n-1} (n - 1)!} (1 + x)^{\frac{1-2n}{2}}. \]
Upon evaluating these derivatives at $x = 0$, the $(1 + x)^{\frac{1-2n}{2}}$ part becomes 1
and disappears, making our Taylor series look like
\[ f(0) + \sum_{n=1}^{\infty} \frac{f^{(n)}(0)}{n!} x^n = 1 + \sum_{n=1}^{\infty} \frac{(-1)^{n-1} (2n - 2)!}{2^{2n-1} (n - 1)!\, n!} x^n. \]
The ratio test should handle the radius of convergence well.
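Before proving the pattern by induction, it is reassuring to test it. The Python sketch below is our own check (function names are hypothetical): it compares a closed form for the coefficient of $x^n$, with the sign taken as $(-1)^{n-1}$ so that the $n = 1$ coefficient is $+\frac{1}{2}$, against coefficients obtained by applying the power rule $n$ times to $(1+x)^{1/2}$. It also prints a few ratios of consecutive coefficients, which is what the ratio test examines.

```python
from fractions import Fraction
from math import factorial

def closed_form(n):
    """Coefficient of x^n (n >= 1): (-1)^(n-1) (2n-2)! / (2^(2n-1) (n-1)! n!)."""
    return Fraction((-1) ** (n - 1) * factorial(2 * n - 2),
                    2 ** (2 * n - 1) * factorial(n - 1) * factorial(n))

def by_differentiation(n):
    """f^(n)(0) / n! for f(x) = (1 + x)^(1/2), using the power rule n times."""
    coefficient, exponent = Fraction(1), Fraction(1, 2)
    for _ in range(n):
        coefficient *= exponent  # differentiating brings the exponent down
        exponent -= 1
    return coefficient / factorial(n)

coefficients = [closed_form(n) for n in range(1, 7)]
ratios = [abs(closed_form(n + 1) / closed_form(n)) for n in (1, 10, 100, 1000)]
print(coefficients)
print([float(r) for r in ratios])  # ratios creep up toward 1
```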
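Sample Solution 45. Note that $\sqrt{x + 1}$ is differentiable at $x = 0$ infinitely
many times. We claim that the $n$th derivative of $f(x) = \sqrt{x + 1}$ is
\[ f^{(n)}(x) = (-1)^{n-1} \frac{(2n - 2)!}{2^{2n-1} (n - 1)!} (1 + x)^{\frac{1-2n}{2}}, \]
for all $n \geq 1$. This claim is true for $n = 1$, for
\[ (-1)^{1-1} \frac{(2 \cdot 1 - 2)!}{2^{2 \cdot 1 - 1} (1 - 1)!} (1 + x)^{\frac{1 - 2 \cdot 1}{2}} = \frac{1}{2^1} (1 + x)^{-\frac{1}{2}} = \frac{1}{2} (1 + x)^{-\frac{1}{2}} = f'(x). \]
Suppose this claim is true for some $k \geq 1$. Then
\begin{align*}
f^{(k+1)}(x) &= \frac{d}{dx} f^{(k)}(x) \\
&= \frac{d}{dx} \left[ (-1)^{k-1} \frac{(2k - 2)!}{2^{2k-1} (k - 1)!} (1 + x)^{\frac{1-2k}{2}} \right] \\
&= (-1)^{k-1} \frac{(2k - 2)!}{2^{2k-1} (k - 1)!} \cdot \frac{1 - 2k}{2} \, (1 + x)^{\frac{1-2k-2}{2}} \\
&= (-1)^{k} \frac{(2k)(2k - 2)!(2k - 1)}{(2k) \cdot 2 \cdot 2^{2k-1} (k - 1)!} (1 + x)^{\frac{1-2(k+1)}{2}} \\
&= (-1)^{k} \frac{(2k)!}{2^{2k+1}\, k!} (1 + x)^{\frac{1-2(k+1)}{2}},
\end{align*}
which proves the claim for $k + 1$. By the principle of mathematical induction,
this claim is true for all $n \geq 1$.
We now form the Taylor series for $f(x)$:
\[ f(0) + \sum_{n=1}^{\infty} \frac{f^{(n)}(0)}{n!} x^n = 1 + \sum_{n=1}^{\infty} \frac{(-1)^{n-1} (2n - 2)!}{2^{2n-1} (n - 1)!\, n!} x^n. \]
Setting $a_n = \frac{(-1)^{n-1} (2n-2)!}{2^{2n-1} (n-1)!\, n!}$, we get that
\begin{align*}
\frac{|a_{n+1}|}{|a_n|}
&= \frac{\left| \frac{(-1)^{n} (2n)!}{2^{2n+1}\, n!\, (n+1)!} \right|}{\left| \frac{(-1)^{n-1} (2n-2)!}{2^{2n-1} (n-1)!\, n!} \right|}
 = \frac{(2n)!\, 2^{2n-1} (n - 1)!\, n!}{(2n - 2)!\, 2^{2n+1}\, n!\, (n + 1)!}
 = \frac{(2n)(2n - 1)}{4n(n + 1)} \\
&= \frac{4n \left( n - \frac{1}{2} \right)}{4n(n + 1)} = \frac{n - \frac{1}{2}}{n + 1} \to 1
\end{align*}
as $n \to \infty$, and hence this Taylor series has radius of convergence $R = \frac{1}{1} = 1$
by corollary 3.1.6.
A.3 Building Analytic Formulae from Power Series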
The next few sample questions require us to find explicit formulae for certain
power series. In essence, this section is the reversal of the previous section, in
that we are now trying to guess the function from which a power series was
built. The general strategy is to break up the given power series into simpler
ones that we can recognize. Just like assembling functions in the previous
section, breaking power series apart is a skill that will improve with practice.
Again, sums, differences, products and substitutions are good ways to break a
complicated series into simpler ones, ones that have known formulae.
We begin with an easy question.
\[ \sum_{n=0}^{\infty} \frac{x^{n+1}}{(n + 1)!}. \]
What is its interval of convergence I? Find an explicit formula for this power
series on I.
Strategy: If we write the series out, we get
\[ \sum_{n=0}^{\infty} \frac{x^{n+1}}{(n + 1)!} = x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots, \]
which is simply the series $\sum_{n=1}^{\infty} \frac{x^n}{n!}$ with the index shifted by 1. We know that
the series $\sum_{n=1}^{\infty} \frac{x^n}{n!}$ is just the series $\sum_{n=0}^{\infty} \frac{x^n}{n!} = e^x$ with the zeroth term missing.
What is the zeroth term? It is $\frac{x^0}{0!} = 1$, and hence the power series in question
should have the formula $e^x - 1$. Also, the series $\sum_{n=0}^{\infty} \frac{x^n}{n!}$ and the given series
will share the same interval of convergence: all of $\mathbb{R}$.
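Sample Solution 46. Note that $\sum_{n=0}^{\infty} \frac{x^{n+1}}{(n+1)!} = \sum_{n=1}^{\infty} \frac{x^n}{n!}$, which converges
wherever $\sum_{n=0}^{\infty} \frac{x^n}{n!}$ converges. But we know that the latter series converges on
all of $\mathbb{R}$, and it represents the function $e^x$; hence
\[ \sum_{n=0}^{\infty} \frac{x^{n+1}}{(n + 1)!} = \sum_{n=1}^{\infty} \frac{x^n}{n!} = \sum_{n=0}^{\infty} \frac{x^n}{n!} - \frac{x^0}{0!} = e^x - 1, \]
valid on its interval of convergence $\mathbb{R}$.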

We use substitution to turn the following power series into one which we
recognize.
\[ \sum_{n=0}^{\infty} (-1)^n \frac{x^{4n}}{n!}. \]
What is its interval of convergence I? Find an explicit formula for this power
series on I.
Strategy: The first thing we should notice is the similarity between this power
series and $\sum_{n=0}^{\infty} \frac{x^n}{n!} = e^x$. What suggested this similarity? It is the $x^{4n}$ in the
numerator and the $n!$ in the denominator. We now try to find a way to
make this series look more like $\sum_{n=0}^{\infty} \frac{u^n}{n!}$. To do this, we first note that $x^{4n}$
is simply $(x^4)^n$. We then note that $(-1)^n$ and $(x^4)^n$ can be merged together to
form $(-x^4)^n$. Thus the series now becomes
\[ \sum_{n=0}^{\infty} \frac{(-x^4)^n}{n!}, \]
which is perfect, since this is the series $\sum_{n=0}^{\infty} \frac{u^n}{n!} = e^u$ where $u$ has been
replaced by $-x^4$. We have successfully recognized this power series as a composition
of a known series $e^u = \sum_{n=0}^{\infty} \frac{u^n}{n!}$ and the substitution $u = -x^4$. We can
conclude that the power series
\[ \sum_{n=0}^{\infty} (-1)^n \frac{x^{4n}}{n!} \]
is equal to $e^{-x^4}$, and it is valid on all of $\mathbb{R}$ because the series $\sum_{n=0}^{\infty} \frac{u^n}{n!} = e^u$
converges everywhere.
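Sample Solution 47. We know that $\sum_{n=0}^{\infty} \frac{u^n}{n!}$ converges on all of $\mathbb{R}$ and is
equal to $e^u$ for any $u \in \mathbb{R}$. Hence for any $x \in \mathbb{R}$,
\[ \sum_{n=0}^{\infty} (-1)^n \frac{x^{4n}}{n!} = \sum_{n=0}^{\infty} \frac{(-x^4)^n}{n!} \]
converges and is equal to $e^{-x^4}$.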

The solution looks very short, but it is the result of following these general
steps:
1. Find one or more known power series (with known explicit formulae) that
are similar to the power series in question;
2. Find out how to break down the given power series into these known power
series, using appropriate manipulations (substitution, addition, subtraction, multiplication).
3. Keep track of the manipulations; after the break-down is complete, just
replace the known power series with their formulae.
In particular, the second item is the most difficult to recognize; the above
sample question only involved a substitution. The key was the exponent laws:
we saw that $x^{4n}$ is the same as $(x^4)^n$ and that items with common exponents
can be merged together, like the $(-1)^n$ and the $(x^4)^n$. Some questions involve
more than just a simple substitution and may require the combination of several
known power series; in this case, we simply piece together the formulae of the
known power series to obtain the desired explicit formula. With the above in
mind, try the following sample question.
Sample Question 48. Let $a_{2n} = 1$ and $a_{2n+1} = 2$ for all $n \geq 0$. Consider the
power series
\[ \sum_{n=0}^{\infty} a_n x^n = 1 + 2x + x^2 + 2x^3 + x^4 + \cdots. \]
What is its interval of convergence $I$? The function $f$ is defined by $f(x) =
\sum_{n=0}^{\infty} a_n x^n$ for each $x \in I$. Find an explicit formula for $f(x)$.
Strategy: Since each $a_n$ is either 1 or 2, the radius of convergence can be found
by comparing the absolute series
\[ \sum_{n=0}^{\infty} a_n |x|^n = 1 + 2|x| + |x|^2 + 2|x|^3 + \cdots \]
to the larger series
\[ \sum_{n=0}^{\infty} 2|x|^n = 2 + 2|x| + 2|x|^2 + 2|x|^3 + \cdots. \]
Since the latter is just a constant multiple of the series $1 + |x| + |x|^2 + |x|^3 + \cdots$,
it will converge wherever $1 + |x| + |x|^2 + |x|^3 + \cdots$ converges; i.e., when $|x| < 1$.
Hence the comparison test shows that $\sum_{n=0}^{\infty} a_n x^n$ converges absolutely when
$|x| < 1$. Does it converge anywhere else? Well, if $|x| \geq 1$, then each term of
this power series has size $a_n |x|^n$, which is greater than or equal to $a_n \cdot 1^n =
a_n \geq 1$, which does not tend to zero as $n \to \infty$. The divergence test shows that
$\sum_{n=0}^{\infty} a_n x^n$ diverges for any $x \notin (-1, 1)$. Now that we have established that
the interval of convergence is $(-1, 1)$, we turn to the task of finding an explicit
formula for this power series. We follow the three general steps outlined above.
First we identify some similar power series whose formulae are known. The
given series looks a bit like the known series $1 + x + x^2 + x^3 + \cdots = \sum_{n=0}^{\infty} x^n =
\frac{1}{1-x}$, except the given power series has one more of each odd power. The left-over
odd powers are $x + x^3 + x^5 + x^7 + \cdots = \sum_{n=0}^{\infty} x^{2n+1}$. So these are the two
simpler power series related to the given power series.
The second step requires us to find a way to break apart the given power
series into some combination of the two power series identified in step one. We
know that the sum of those two power series produces the given power series:
\begin{align*}
\sum_{n=0}^{\infty} a_n x^n &= 1 + 2x + x^2 + 2x^3 + x^4 + 2x^5 + x^6 + 2x^7 + \cdots \\
&= \left( 1 + x + x^2 + x^3 + x^4 + x^5 + x^6 + x^7 + \cdots \right) + \left( x + x^3 + x^5 + x^7 + \cdots \right).
\end{align*}
That is, we separated the given series into the two component series $\sum_{n=0}^{\infty} x^n$
and the left-over $\sum_{n=0}^{\infty} x^{2n+1}$.
To continue with the third step, we need to find the explicit formulae of these
two component series. We definitely know the formula for the series $\sum_{n=0}^{\infty} x^n$;
it is $\frac{1}{1-x}$, convergent and valid on $(-1, 1)$. The formula for the second series
requires a bit more work to find. Note that it is equal to the product $x \sum_{n=0}^{\infty} x^{2n}$,
which is equal to $x \sum_{n=0}^{\infty} (x^2)^n$, which is equivalent to $x \sum_{n=0}^{\infty} u^n$ with the
substitution $u = x^2$. Hence this series has the formula $x \cdot \frac{1}{1-x^2}$, valid whenever
$x^2$ is in $(-1, 1)$, which occurs when $x \in (-1, 1)$.
The third step is simple. Since we added the two simpler series together
to obtain the given power series, we replace the two simpler series with their
formulae to get the answer.
\begin{align*}
1 + 2x + x^2 + 2x^3 + x^4 + 2x^5 + \cdots &= \sum_{n=0}^{\infty} x^n + \sum_{n=0}^{\infty} x^{2n+1} \\
&= \frac{1}{1 - x} + x \cdot \frac{1}{1 - x^2} \\
&= \frac{1 + x}{(1 - x)(1 + x)} + \frac{x}{1 - x^2} \\
&= \frac{1 + x}{1 - x^2} + \frac{x}{1 - x^2} \\
&= \frac{1 + 2x}{1 - x^2}.
\end{align*}
We conclude that the series in question $\sum_{n=0}^{\infty} a_n x^n = 1 + 2x + x^2 + 2x^3 + \cdots$
is equal to $\frac{1+2x}{1-x^2}$. Since the formulae of the component series are both valid on
$(-1, 1)$, their sum is also valid on $(-1, 1)$.
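Sample Solution 48. Let $x_0 \in (-1, 1)$. Let $b_n = a_n x_0^n$ and $c_n = 2x_0^n$. Then
because either $a_n = 1$ or $a_n = 2$, we have that
\[ |b_n| = a_n |x_0|^n \leq 2|x_0|^n = |c_n|, \]
for all $n \geq 0$. The comparison test shows that $\sum_{n=0}^{\infty} a_n |x_0|^n$ converges if
$\sum_{n=0}^{\infty} 2|x_0|^n$ converges, which it does because it is a constant multiple of a
convergent geometric series $\sum_{n=0}^{\infty} |x_0|^n$ (with common ratio $|x_0| < 1$). Hence
$\sum_{n=0}^{\infty} a_n x^n$ converges absolutely for any $x \in (-1, 1)$. To show that these are
the only values of $x$ at which the series converges, we use the divergence test.
Let $x_1 \notin (-1, 1)$. Observe that
\[ a_n |x_1|^n \geq a_n \cdot 1^n = a_n \geq 1, \]
for every $n \geq 0$. Hence $a_n |x_1|^n \not\to 0$ as $n \to \infty$ and so by the divergence
test, we conclude that $\sum_{n=0}^{\infty} a_n x^n$ diverges for all $x \notin (-1, 1)$. The interval of
convergence $I$ of the power series $\sum_{n=0}^{\infty} a_n x^n$ is $(-1, 1)$.
We will now find an explicit formula for $\sum_{n=0}^{\infty} a_n x^n$. We know that
\[ \frac{1}{1 - x} = \sum_{n=0}^{\infty} x^n = 1 + x + x^2 + x^3 + \cdots, \]
valid on $(-1, 1)$. We also know that
\begin{align*}
\frac{x}{1 - x^2} = x \cdot \frac{1}{1 - x^2} &= x \sum_{n=0}^{\infty} (x^2)^n \\
&= x \sum_{n=0}^{\infty} x^{2n} \\
&= \sum_{n=0}^{\infty} x^{2n+1} \\
&= x + x^3 + x^5 + \cdots,
\end{align*}
valid whenever $x^2 \in (-1, 1)$; i.e., whenever $x \in (-1, 1)$. The term-by-term sum
of these two series
\begin{align*}
\sum_{n=0}^{\infty} x^n + \sum_{n=0}^{\infty} x^{2n+1}
&= \left( 1 + x + x^2 + x^3 + x^4 + x^5 + \cdots \right) + \left( x + x^3 + x^5 + \cdots \right) \\
&= 1 + 2x + x^2 + 2x^3 + x^4 + 2x^5 + \cdots \\
&= \sum_{n=0}^{\infty} a_n x^n
\end{align*}
is therefore valid on $(-1, 1)$ and represents the sum of the two functions
\begin{align*}
\frac{1}{1 - x} + \frac{x}{1 - x^2} &= \frac{1 + x}{(1 - x)(1 + x)} + \frac{x}{1 - x^2} \\
&= \frac{1 + x}{1 - x^2} + \frac{x}{1 - x^2} \\
&= \frac{1 + 2x}{1 - x^2},
\end{align*}
valid on $(-1, 1)$.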
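A quick numerical comparison supports the formula. In this Python sketch (names and sample points are ours), partial sums of $1 + 2x + x^2 + 2x^3 + \cdots$ are compared with $\frac{1+2x}{1-x^2}$ at several points of $(-1, 1)$.

```python
def partial_sum(x, terms):
    """Sum of a_n x^n for n < terms, where a_n = 1 for even n and 2 for odd n."""
    return sum((1 if n % 2 == 0 else 2) * x ** n for n in range(terms))

def closed_form(x):
    """The explicit formula (1 + 2x) / (1 - x^2)."""
    return (1 + 2 * x) / (1 - x * x)

samples = [-0.9, -0.5, 0.0, 0.3, 0.9]  # all inside (-1, 1)
errors = [abs(partial_sum(x, 2000) - closed_form(x)) for x in samples]
print(errors)
```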

The above sample question involved all three manipulations: separating the
power series into a sum $\sum_{n=0}^{\infty} x^n + \sum_{n=0}^{\infty} x^{2n+1}$, decomposing $\sum_{n=0}^{\infty} x^{2n+1}$ into
a product $x \sum_{n=0}^{\infty} x^{2n}$, and rewriting $\sum_{n=0}^{\infty} x^{2n}$ as a composition of a known
series $\sum_{n=0}^{\infty} u^n$ and the substitution $u = x^2$. These manipulations eventually
decomposed the power series into two series whose explicit formulae are known;
we then pieced these formulae together to find the desired formula.
Sample Question 49*. Consider the power series $\sum_{n=0}^{\infty} a_n x^n$, where $a_{n+4} =
a_n$ for all $n \geq 0$. Find the interval of convergence $I$ of this series and find an
explicit formula for this series, valid on $I$.
Strategy: If we drop the sigma notation and write the series out, the power
series looks like
\[ a_0 + a_1 x^1 + a_2 x^2 + a_3 x^3 + a_0 x^4 + a_1 x^5 + a_2 x^6 + a_3 x^7 + a_0 x^8 + \cdots, \]
and so on, with the coefficients repeating every four terms. So this question is just
a bigger and more general version of the previous sample question. Instead of
two different coefficients, we now have four. Instead of specific numbers, we are
given four generic variables. How did we test for interval of convergence in the
last sample question? We compared it with the series with every coefficient
equal to 2, the larger of the two coefficients. After we found out that the given
series converges on $(-1, 1)$, we then checked whether it converges anywhere else.
We will imitate that strategy here.
First we compare the given series in absolute value to the larger series whose
coefficients are all equal to $M := \max\{|a_0|, |a_1|, |a_2|, |a_3|\}$. I.e., we compare
$\sum_{n=0}^{\infty} |a_n x^n| = \sum_{n=0}^{\infty} |a_n| |x|^n$ to
\[ \sum_{n=0}^{\infty} |M x^n| = \sum_{n=0}^{\infty} M |x|^n. \]
We know that $\sum_{n=0}^{\infty} M|x|^n$ converges for any $x \in (-1, 1)$, so the
smaller series $\sum_{n=0}^{\infty} |a_n x^n|$ converges on $(-1, 1)$ as well; we conclude that
$\sum_{n=0}^{\infty} a_n x^n$ converges absolutely on $(-1, 1)$. Does it converge anywhere else? Well, if all four
of the $a_n$'s are 0, then this series converges everywhere. So suppose at least one
of the four different $a_n$'s is not 0. We want to see if $|x| \geq 1$ will make the series
diverge. To check this, we check the size of the terms in the series:
\[ |a_n x^n| = |a_n| |x|^n \geq |a_n| \cdot 1^n = |a_n|, \]
which obviously does not tend to 0 as $n \to \infty$, because at least one out of every
four $a_n$'s is a nonzero constant. The divergence test then shows that this series
diverges if $|x| \geq 1$. The given power series converges only on $I = (-1, 1)$.
We will now try to find an explicit formula for this power series. The formula
will likely depend on the four numbers $a_0$, $a_1$, $a_2$ and $a_3$. We will imitate the
strategy used in the previous sample question: write this power series as a sum
of component series. We first take away $\sum_{n=0}^{\infty} a_0 x^{4n}$ from $\sum_{n=0}^{\infty} a_n x^n$. (The
$x^{4n}$ comes from the fact that an $a_0$ term occurs once every multiple of 4.) What's
left over is
\begin{align*}
& (a_0 - a_0) + a_1 x + a_2 x^2 + a_3 x^3 + (a_0 - a_0) x^4 + a_1 x^5 + a_2 x^6 + a_3 x^7 + \cdots \\
&= a_1 x + a_2 x^2 + a_3 x^3 + a_1 x^5 + a_2 x^6 + a_3 x^7 + \cdots.
\end{align*}
So this maneuver took away all the $a_0$ terms. We do the same again, taking
away the $a_1$ terms using $\sum_{n=0}^{\infty} a_1 x^{4n+1}$. After we take this away, we get
\begin{align*}
& (a_1 - a_1) x + a_2 x^2 + a_3 x^3 + (a_1 - a_1) x^5 + a_2 x^6 + a_3 x^7 + \cdots \\
&= a_2 x^2 + a_3 x^3 + a_2 x^6 + a_3 x^7 + \cdots.
\end{align*}
We repeat this two more times with $\sum_{n=0}^{\infty} a_2 x^{4n+2}$ and $\sum_{n=0}^{\infty} a_3 x^{4n+3}$. At
the end, every term will have disappeared. This means that our target series
$\sum_{n=0}^{\infty} a_n x^n$ can be decomposed into a four-part sum
\[ \sum_{n=0}^{\infty} a_0 x^{4n} + \sum_{n=0}^{\infty} a_1 x^{4n+1} + \sum_{n=0}^{\infty} a_2 x^{4n+2} + \sum_{n=0}^{\infty} a_3 x^{4n+3}. \]
The formula for each of the four parts should be easy to find. After we have
found them, we simply add the four formulae together to get our answer. We
shall see that each individual formula is valid on $(-1, 1)$, so their sum is also
valid on $(-1, 1)$.
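Sample Solution 49. It is clear that if $a_0 = a_1 = a_2 = a_3 = 0$, then $a_n = 0$
for any $n \geq 0$ and so $\sum_{n=0}^{\infty} a_n x^n = 0$, convergent and valid on all of $\mathbb{R}$. So for
the rest of this solution, suppose that at least one of $a_0$, $a_1$, $a_2$, $a_3$ is not zero.
Let $M = \max\{|a_0|, |a_1|, |a_2|, |a_3|\}$. Then $|a_n| \leq M$ for all $n \geq 0$ and
\[ |a_n x^n| = |a_n| |x|^n \leq M |x|^n, \]
for all $n \geq 0$. But we have that $\sum_{n=0}^{\infty} M |x|^n$ converges for any $x \in (-1, 1)$ and
so the comparison test shows that $\sum_{n=0}^{\infty} |a_n x^n|$ converges on $(-1, 1)$; i.e., the
power series $\sum_{n=0}^{\infty} a_n x^n$ converges absolutely (and hence converges) on $(-1, 1)$.
We will now show that this power series converges for no other values of $x$. So
suppose $|x| \geq 1$. We have
\[ |a_n x^n| = |a_n| |x|^n \geq |a_n| \cdot 1^n = |a_n|. \]
Since at least one of $a_0$, $a_1$, $a_2$, $a_3$ is not zero and $a_{n+4} = a_n$ for all $n \geq 0$,
the sequence of terms $|a_n x^n| \geq |a_n|$ does not tend to zero as $n \to \infty$. The
divergence test shows that the series $\sum_{n=0}^{\infty} a_n x^n$ diverges for any $x \notin (-1, 1)$.
This shows that the interval of convergence of $\sum_{n=0}^{\infty} a_n x^n$ is $(-1, 1)$.
We now find a formula for $\sum_{n=0}^{\infty} a_n x^n$ valid on $(-1, 1)$. Let $g(u) = \frac{1}{1-u}$.
Then $g(u) = \sum_{n=0}^{\infty} u^n$, valid on $(-1, 1)$. We have that
\[ \frac{1}{1 - x^4} = g(x^4) = \sum_{n=0}^{\infty} (x^4)^n = \sum_{n=0}^{\infty} x^{4n}, \]
valid whenever $x^4 \in (-1, 1)$; i.e., when $x \in (-1, 1)$. Therefore we have that
\begin{align*}
\frac{1}{1 - x^4} &= \sum_{n=0}^{\infty} x^{4n}, \\
\frac{x}{1 - x^4} = x \cdot \frac{1}{1 - x^4} &= x \sum_{n=0}^{\infty} x^{4n} = \sum_{n=0}^{\infty} x^{4n+1}, \\
\frac{x^2}{1 - x^4} = x^2 \cdot \frac{1}{1 - x^4} &= x^2 \sum_{n=0}^{\infty} x^{4n} = \sum_{n=0}^{\infty} x^{4n+2}, \\
\frac{x^3}{1 - x^4} = x^3 \cdot \frac{1}{1 - x^4} &= x^3 \sum_{n=0}^{\infty} x^{4n} = \sum_{n=0}^{\infty} x^{4n+3},
\end{align*}
all valid on $(-1, 1)$. Observe that
\begin{align*}
& a_0 \sum_{n=0}^{\infty} x^{4n} + a_1 \sum_{n=0}^{\infty} x^{4n+1} + a_2 \sum_{n=0}^{\infty} x^{4n+2} + a_3 \sum_{n=0}^{\infty} x^{4n+3} \\
&= \sum_{n=0}^{\infty} a_0 x^{4n} + \sum_{n=0}^{\infty} a_1 x^{4n+1} + \sum_{n=0}^{\infty} a_2 x^{4n+2} + \sum_{n=0}^{\infty} a_3 x^{4n+3} \\
&= \left( a_0 + a_0 x^4 + a_0 x^8 + \cdots \right) + \left( a_1 x + a_1 x^5 + a_1 x^9 + \cdots \right) \\
&\quad + \left( a_2 x^2 + a_2 x^6 + a_2 x^{10} + \cdots \right) + \left( a_3 x^3 + a_3 x^7 + a_3 x^{11} + \cdots \right) \\
&= a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_0 x^4 + a_1 x^5 + a_2 x^6 + a_3 x^7 + \cdots \\
&= \sum_{n=0}^{\infty} a_n x^n.
\end{align*}
This sum is valid wherever each of the four summands is valid; i.e., for all
$x \in (-1, 1)$. Hence $\sum_{n=0}^{\infty} a_n x^n$ is the sum of the four functions
\[ \sum_{n=0}^{\infty} a_n x^n = \frac{a_0}{1 - x^4} + \frac{a_1 x}{1 - x^4} + \frac{a_2 x^2}{1 - x^4} + \frac{a_3 x^3}{1 - x^4} = \frac{a_0 + a_1 x + a_2 x^2 + a_3 x^3}{1 - x^4}, \]
valid on its interval of convergence $(-1, 1)$.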
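The general formula can be spot-checked numerically. The Python sketch below is our own illustration; the coefficient values are arbitrary and the function names are hypothetical. It also confirms that the choice $a_0 = a_2 = 1$, $a_1 = a_3 = 2$ agrees with the formula $\frac{1+2x}{1-x^2}$ from sample question 48.

```python
def periodic_partial_sum(coeffs, x, terms):
    """Sum of a_n x^n for n < terms, where a_n = coeffs[n % 4]."""
    return sum(coeffs[n % 4] * x ** n for n in range(terms))

def periodic_closed_form(coeffs, x):
    """The explicit formula (a0 + a1 x + a2 x^2 + a3 x^3) / (1 - x^4)."""
    a0, a1, a2, a3 = coeffs
    return (a0 + a1 * x + a2 * x ** 2 + a3 * x ** 3) / (1 - x ** 4)

coeffs = (3.0, -1.0, 0.5, 2.0)  # arbitrary illustrative values for a0, a1, a2, a3
samples = [-0.8, -0.2, 0.0, 0.5, 0.8]
errors = [abs(periodic_partial_sum(coeffs, x, 2000) - periodic_closed_form(coeffs, x))
          for x in samples]

# Special case a0 = a2 = 1, a1 = a3 = 2 compared with (1 + 2x) / (1 - x^2).
x = 0.3
special_gap = abs(periodic_closed_form((1.0, 2.0, 1.0, 2.0), x) - (1 + 2 * x) / (1 - x * x))
print(errors, special_gap)
```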
A.4 Integration and Power Series
A.4.1 Indefinite Integration Using Power Series
These two problems involve replacing the integrand with its power series and
integrating it term by term.
Sample Question 50. Evaluate the following indefinite integral as a power
series:
\[ \int \frac{1}{1 + x^4}\, dx. \]
Strategy: This integral should not be a problem, because we can find the power
series for $\frac{1}{1+x^4}$. Then all we have to do is integrate that power series term by
term. Do not forget to include the constant of integration $C$ at the beginning
of the integrated series.
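Sample Solution 50. We know that $g(u) := \frac{1}{1-u} = \sum_{n=0}^{\infty} u^n$, valid on $(-1, 1)$.
Hence
\[ \frac{1}{1 + x^4} = \frac{1}{1 - (-x^4)} = g(-x^4) = \sum_{n=0}^{\infty} (-x^4)^n = \sum_{n=0}^{\infty} (-1)^n x^{4n}, \]
convergent and valid whenever $-x^4 \in (-1, 1)$; i.e., whenever $x \in (-1, 1)$. This
shows that
\begin{align*}
\int \frac{1}{1 + x^4}\, dx &= \int \left( \sum_{n=0}^{\infty} (-1)^n x^{4n} \right) dx \\
&= C + \sum_{n=0}^{\infty} \int (-1)^n x^{4n} \, dx = C + \sum_{n=0}^{\infty} (-1)^n \frac{x^{4n+1}}{4n + 1},
\end{align*}
convergent and valid at least on $(-1, 1)$.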
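One way to gain confidence in an integrated series is to compare it with a numerical integral. The Python sketch below is our own check: it evaluates the series antiderivative (taking $C = 0$) at an upper limit inside $(-1, 1)$ and compares the result with a composite Simpson's rule approximation of $\int_0^{0.7} \frac{1}{1+t^4}\, dt$. The upper limit 0.7 and the helper names are arbitrary choices.

```python
def antiderivative_series(x, terms=60):
    """Partial sum of sum_{n>=0} (-1)^n x^(4n+1) / (4n+1), taking C = 0."""
    return sum((-1) ** n * x ** (4 * n + 1) / (4 * n + 1) for n in range(terms))

def simpson(f, a, b, panels=1000):
    """Composite Simpson's rule; `panels` must be even."""
    h = (b - a) / panels
    total = f(a) + f(b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

upper = 0.7  # any upper limit in (-1, 1)
from_series = antiderivative_series(upper) - antiderivative_series(0.0)
from_quadrature = simpson(lambda t: 1 / (1 + t ** 4), 0.0, upper)
print(from_series, from_quadrature)
```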
Sample Question 51. Evaluate the following indefinite integral as a power
series:
\[ \int \frac{x}{1 + x^5}\, dx. \]
Strategy: This problem is almost identical to the previous question. The only
difference is the integrand.
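Sample Solution 51. We know that $g(u) := \frac{1}{1-u} = \sum_{n=0}^{\infty} u^n$, valid on $(-1, 1)$. Hence
\[
\frac{x}{1 + x^5} = x \cdot \frac{1}{1 - (-x^5)} = x\,g(-x^5) = x \sum_{n=0}^{\infty} (-x^5)^n = \sum_{n=0}^{\infty} (-1)^n x^{5n+1},
\]
convergent and valid whenever $-x^5 \in (-1, 1)$; i.e., whenever $x \in (-1, 1)$. This shows that
\[
\int \frac{x}{1 + x^5}\,dx = \int \left( \sum_{n=0}^{\infty} (-1)^n x^{5n+1} \right) dx = C + \sum_{n=0}^{\infty} \left( \int (-1)^n x^{5n+1}\,dx \right) = C + \sum_{n=0}^{\infty} (-1)^n \frac{x^{5n+2}}{5n+2},
\]
convergent and valid at least on $(-1, 1)$.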
A.4.2 Approximating Definite Integrals Using Power Series

Sometimes we need a good numeric approximation to certain definite integrals; if the integrand has a power series representation whose terms contain the factor $(-1)^n$, then the alternating series test can be used to provide an error bound. The problems below require us to estimate certain definite integrals to within a specified accuracy. As with indefinite integrals, we first turn the integrand into its power series representation. In all three questions below, the power series representations will have a $(-1)^n$ in their terms. We then integrate the series term by term, substituting in the numeric bounds of the definite integral. The power series then turns into an alternating series, so that we can use the error clause from the alternating series test to get an estimate. Example 3.23 demonstrates this process.
Sample Question 52. Use power series to approximate the given definite integral, accurate to four decimal places:
\[
\int_0^{0.5} \frac{dx}{1 + x^6}.
\]
Strategy: The integrand $\frac{1}{1+x^6} = \frac{1}{1-(-x^6)}$ has a simple power series representation $\sum_{n=0}^{\infty} (-x^6)^n = \sum_{n=0}^{\infty} (-1)^n x^{6n}$. We then evaluate the definite integral term by term, substituting the bounds of integration, like this:
\[
\begin{aligned}
\int_0^{0.5} \frac{dx}{1 + x^6} &= \int_0^{0.5} \left( \sum_{n=0}^{\infty} (-1)^n x^{6n} \right) dx \\
&= \sum_{n=0}^{\infty} \left( \int_0^{0.5} (-1)^n x^{6n}\,dx \right) \\
&= \sum_{n=0}^{\infty} (-1)^n \left[ \frac{x^{6n+1}}{6n+1} \right]_0^{0.5} \\
&= \sum_{n=0}^{\infty} (-1)^n \left( \frac{(0.5)^{6n+1}}{6n+1} - 0 \right) \\
&= \sum_{n=0}^{\infty} (-1)^n \frac{(0.5)^{6n+1}}{6n+1}.
\end{aligned}
\]
After we substitute the 0.5 and the 0 into the power series, it becomes an alternating series.
The error bound from the alternating series test states that
\[
\left| \sum_{n=0}^{\infty} (-1)^n \frac{(0.5)^{6n+1}}{6n+1} - \sum_{n=0}^{k} (-1)^n \frac{(0.5)^{6n+1}}{6n+1} \right| \le \left| (-1)^{k+1} \frac{(0.5)^{6k+7}}{6k+7} \right| = \frac{(0.5)^{6k+7}}{6k+7};
\]
that is, the $k$th partial sum estimates the actual sum to within $|a_{k+1}|$ units of accuracy, where $a_{k+1}$ is the next term in the sum. We want the estimate to deviate from the real sum by no more than $10^{-5}$, so making $|a_{k+1}| < 10^{-5}$ will guarantee that. Each $|a_{k+1}|$ looks like $\frac{(0.5)^{6k+7}}{6k+7}$, so taking $k = 1$ gives $|a_{k+1}| = \frac{(0.5)^{13}}{13} < 0.0000094 < 10^{-5}$, as required. Adding the first two terms is already enough to get a really good estimate of this integral!
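Sample Solution 52. Note that
\[
\frac{1}{1 + x^6} = \frac{1}{1 - (-x^6)} = \sum_{n=0}^{\infty} (-x^6)^n = \sum_{n=0}^{\infty} (-1)^n x^{6n},
\]
valid whenever $-x^6 \in (-1, 1)$; i.e., whenever $x \in (-1, 1)$. Therefore,
\[
\begin{aligned}
\int_0^{0.5} \frac{dx}{1 + x^6} &= \int_0^{0.5} \left( \sum_{n=0}^{\infty} (-1)^n x^{6n} \right) dx \\
&= \sum_{n=0}^{\infty} \left( \int_0^{0.5} (-1)^n x^{6n}\,dx \right) \\
&= \sum_{n=0}^{\infty} (-1)^n \left[ \frac{x^{6n+1}}{6n+1} \right]_0^{0.5} \\
&= \sum_{n=0}^{\infty} (-1)^n \left( \frac{(0.5)^{6n+1}}{6n+1} - 0 \right) \\
&= \sum_{n=0}^{\infty} (-1)^n \frac{(0.5)^{6n+1}}{6n+1}.
\end{aligned}
\]
This is an alternating series with terms $\frac{(0.5)^{6n+1}}{6n+1}$ positive, decreasing, and converging to zero. The alternating series test shows that
\[
\left| \sum_{n=0}^{\infty} (-1)^n \frac{(0.5)^{6n+1}}{6n+1} - \sum_{n=0}^{1} (-1)^n \frac{(0.5)^{6n+1}}{6n+1} \right| \le \left| (-1)^{1+1} \frac{(0.5)^{6(1+1)+1}}{6(1+1)+1} \right| = \frac{(0.5)^{13}}{13} < 0.0000094 < 10^{-5}.
\]
We conclude that $\sum_{n=0}^{1} (-1)^n \frac{(0.5)^{6n+1}}{6n+1} = \frac{0.5}{1} - \frac{(0.5)^7}{7} \approx 0.4988839 \approx 0.4989$ is an estimate to the desired accuracy.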
In simple words, the alternating series error bound says that the error is
no larger than the size of the next term. The more terms we add, the more
accurate our estimate will be. How far off will our estimate be? No more than
the size of the next term that has not been added in. We are always seeking
the first term that is smaller in magnitude than the indicated error tolerance.
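The rule "stop as soon as the next term is below the tolerance" can be sketched in Python as follows. This is only an illustration: the function name `alternating_estimate` is our own, and it assumes that the magnitudes passed in are positive, decreasing, and converge to zero, so that the alternating series error bound applies (otherwise the loop may not terminate or the bound may not hold).

```python
from itertools import count


def alternating_estimate(term, tol):
    """Sum (-1)^n * term(n) until the first omitted term is below tol.

    term(n) must return the positive, decreasing magnitudes b_n.
    Returns (k, partial_sum, error_bound), where the partial sum runs
    over n = 0..k and error_bound = term(k + 1) < tol.
    """
    partial = 0.0
    for k in count():
        partial += (-1) ** k * term(k)
        bound = term(k + 1)  # size of the next term not yet added
        if bound < tol:
            return k, partial, bound


# Magnitudes from Sample Question 52: b_n = 0.5^(6n+1) / (6n+1).
k, estimate, bound = alternating_estimate(
    lambda n: 0.5 ** (6 * n + 1) / (6 * n + 1), 1e-5
)
print(k, estimate, bound)
```

For the terms of Sample Question 52 and a tolerance of $10^{-5}$, the loop stops at $k = 1$, matching the choice made in the strategy above.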
Sample Question 53. Use power series to approximate the given definite integral with an error no more than 0.001:
\[
\int_0^{0.5} x^2 e^{-x^2}\,dx.
\]
Strategy: The strategy remains the same: We find the power series of the integrand, integrate it term by term, substitute in the bounds of integration, and then use the alternating series test error bound to obtain an estimate by selecting a $k$ large enough so that the $k$th partial sum approximates the actual sum to within the desired accuracy. In this question, the first term that is smaller in magnitude than 0.001 is the term with index $n = 2$. So an estimate obtained by adding just the 0th and 1st terms of the series will be accurate enough.
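Sample Solution 53. Note that $g(u) := e^u = \sum_{n=0}^{\infty} \frac{u^n}{n!}$, valid on all of $\mathbb{R}$. Hence
\[
x^2 e^{-x^2} = x^2 g(-x^2) = x^2 \sum_{n=0}^{\infty} \frac{(-x^2)^n}{n!} = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+2}}{n!},
\]
valid everywhere. Therefore,
\[
\begin{aligned}
\int_0^{0.5} x^2 e^{-x^2}\,dx &= \int_0^{0.5} \left( \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+2}}{n!} \right) dx \\
&= \sum_{n=0}^{\infty} \left( \int_0^{0.5} (-1)^n \frac{x^{2n+2}}{n!}\,dx \right) \\
&= \sum_{n=0}^{\infty} (-1)^n \left[ \frac{x^{2n+3}}{(2n+3)\,n!} \right]_0^{0.5} \\
&= \sum_{n=0}^{\infty} (-1)^n \left( \frac{(0.5)^{2n+3}}{(2n+3)\,n!} - 0 \right) \\
&= \sum_{n=0}^{\infty} (-1)^n \frac{(0.5)^{2n+3}}{(2n+3)\,n!}.
\end{aligned}
\]
This is an alternating series with terms $\frac{(0.5)^{2n+3}}{(2n+3)\,n!}$ positive, decreasing, and converging to zero. The alternating series test shows that
\[
\left| \sum_{n=0}^{\infty} (-1)^n \frac{(0.5)^{2n+3}}{(2n+3)\,n!} - \sum_{n=0}^{1} (-1)^n \frac{(0.5)^{2n+3}}{(2n+3)\,n!} \right| \le \left| (-1)^{1+1} \frac{(0.5)^{2(1+1)+3}}{(2(1+1)+3)(1+1)!} \right| = \frac{(0.5)^7}{7(2!)} = 0.0005580\ldots < 0.001.
\]
We conclude that $\sum_{n=0}^{1} (-1)^n \frac{(0.5)^{2n+3}}{(2n+3)\,n!} = \frac{(0.5)^3}{3} - \frac{(0.5)^5}{5} = 0.03541\overline{6}$ is an estimate to the desired accuracy.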
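The arithmetic in Sample Solution 53 can be double-checked with a short Python sketch. The helper names `partial_sum` and `simpson` are our own, and Simpson's rule serves only as an independent reference value for the integral; it is not the method used in these notes.

```python
import math


def partial_sum(k):
    """Sum over n = 0..k of (-1)^n * 0.5^(2n+3) / ((2n+3) * n!)."""
    return sum(
        (-1) ** n * 0.5 ** (2 * n + 3) / ((2 * n + 3) * math.factorial(n))
        for n in range(k + 1)
    )


def simpson(f, a, b, panels=1000):
    """Composite Simpson's rule on [a, b]; panels is rounded up to an even number."""
    if panels % 2:
        panels += 1
    h = (b - a) / panels
    total = f(a) + f(b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


estimate = partial_sum(1)  # (0.5)^3 / 3 - (0.5)^5 / 5
reference = simpson(lambda x: x * x * math.exp(-x * x), 0.0, 0.5)
next_term = 0.5 ** 7 / (7 * math.factorial(2))  # alternating series error bound

print(estimate, reference, abs(estimate - reference), next_term)
```

The printed difference between the two-term estimate and the reference value should be no larger than the next term, which in turn is below the tolerance 0.001.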
Here is the final question. The strategy will be omitted because it is identical
to that employed in previous problems.
Sample Question 54. Use power series to approximate the given definite integral, accurate to three decimal places:
\[
\int_0^{0.5} \cos(x^2)\,dx.
\]
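Sample Solution 54. Since $\cos(u) = \sum_{n=0}^{\infty} (-1)^n \frac{u^{2n}}{(2n)!}$, valid on all of $\mathbb{R}$, we have that
\[
\cos(x^2) = \sum_{n=0}^{\infty} (-1)^n \frac{(x^2)^{2n}}{(2n)!} = \sum_{n=0}^{\infty} (-1)^n \frac{x^{4n}}{(2n)!},
\]
valid everywhere. Therefore,
\[
\begin{aligned}
\int_0^{0.5} \cos(x^2)\,dx &= \int_0^{0.5} \left( \sum_{n=0}^{\infty} (-1)^n \frac{x^{4n}}{(2n)!} \right) dx \\
&= \sum_{n=0}^{\infty} \left( \int_0^{0.5} (-1)^n \frac{x^{4n}}{(2n)!}\,dx \right) \\
&= \sum_{n=0}^{\infty} (-1)^n \left[ \frac{x^{4n+1}}{(4n+1)(2n)!} \right]_0^{0.5} \\
&= \sum_{n=0}^{\infty} (-1)^n \left( \frac{(0.5)^{4n+1}}{(4n+1)(2n)!} - 0 \right) \\
&= \sum_{n=0}^{\infty} (-1)^n \frac{(0.5)^{4n+1}}{(4n+1)(2n)!}.
\end{aligned}
\]
This is an alternating series with terms $\frac{(0.5)^{4n+1}}{(4n+1)(2n)!}$ positive, decreasing, and converging to zero. The alternating series test shows that
\[
\left| \sum_{n=0}^{\infty} (-1)^n \frac{(0.5)^{4n+1}}{(4n+1)(2n)!} - \sum_{n=0}^{1} (-1)^n \frac{(0.5)^{4n+1}}{(4n+1)(2n)!} \right| \le \left| (-1)^{1+1} \frac{(0.5)^{4(1+1)+1}}{(4(1+1)+1)(2(1+1))!} \right| = \frac{(0.5)^9}{9(4!)} < 9.1 \times 10^{-6} < 10^{-4}.
\]
We conclude that $\sum_{n=0}^{1} (-1)^n \frac{(0.5)^{4n+1}}{(4n+1)(2n)!} = \frac{0.5}{1} - \frac{(0.5)^5}{5(2!)} = 0.496875 \approx 0.497$ is an estimate to the required accuracy.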
Diversion. The Taylor series for $\cos(x)$ is $\sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!}$. When evaluated at any $x = x_0 \in \mathbb{R}$, the power series becomes a numeric, alternating series. This is one way a scientific calculator can calculate cosines of angles to many digits of accuracy: by just summing the first few terms of the series
\[
\sum_{n=0}^{\infty} (-1)^n \frac{x_0^{2n}}{(2n)!} = 1 - \frac{x_0^2}{2!} + \frac{x_0^4}{4!} - \frac{x_0^6}{6!} + \cdots.
\]
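To illustrate the diversion, here is a minimal Python sketch that sums the cosine series until the next term falls below a tolerance. The function name `cos_taylor` and the default tolerance are our own choices. The sketch assumes a modest $|x_0|$ (say $|x_0| \le 1$), so that the terms decrease from the start; it is not a claim about how any particular calculator is implemented.

```python
import math


def cos_taylor(x0, tol=1e-12):
    """Sum (-1)^n x0^(2n) / (2n)! until the next term is smaller than tol."""
    term = 1.0  # the n = 0 term
    total = 0.0
    n = 0
    while abs(term) >= tol:
        total += term
        n += 1
        # Ratio of consecutive terms: -x0^2 / ((2n - 1)(2n)).
        term *= -x0 * x0 / ((2 * n - 1) * (2 * n))
    return total


for x0 in (0.0, 0.5, 1.0):
    print(x0, cos_taylor(x0), math.cos(x0))
```

Each term is obtained from the previous one by a single multiplication, which avoids computing large powers and factorials separately.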
Appendix B
Differential Equations
Definition B.0.1. A differential equation is an equation of the type
\[
F(x, f(x), f'(x), \ldots, f^{(n)}(x)) = 0,
\]
where $x$ is a variable and $f(x)$ is a function of $x$. The order of the differential equation is $n$, the order of the highest derivative in the equation.
In this chapter, we will only study differential equations of the form $f'(x) = F(x, f(x))$; these differential equations are called first-order differential equations. These are often written using a variable $y$ to represent $f(x)$, and thus first-order differential equations look like $y' = f(x, y)$. As an example, $y' = \sin(xy) + 3$ is a first-order differential equation. A solution to the differential equation
\[
F(x, y, y', \ldots, y^{(n)}) = 0
\]
on an interval $I$ is a function $y = \varphi(x)$ so that
\[
F(x, \varphi(x), \varphi'(x), \ldots, \varphi^{(n)}(x)) = 0
\]
for all $x \in I$. For a first-order differential equation, this means
\[
\varphi'(x) = f(x, \varphi(x))
\]
for all $x \in I$.
Example B.1. Consider the first-order differential equation $y' = f(x)$. This means that $y = \varphi(x)$ is an antiderivative of $f(x)$. Hence, $\varphi(x) = \int f(x)\,dx$.
Example B.2. Consider the first-order differential equation $y' = e^x$. This means that $y = \varphi(x) = \int e^x\,dx$, which shows that $\varphi(x) = e^x + C$, for any $C \in \mathbb{R}$.
As seen from the examples above, a differential equation need not have a unique solution.
B.1 First-Order Separable Differential Equations

Definition B.1.1. A first-order differential equation is called separable if it can be written in the form $y' = f(x)g(y)$.
Example B.3. Consider the separable equation
\[
y' = xy, \tag{B.1}
\]
where $f(x) = x$ and $g(y) = y$. If $y = 0$ identically, then it is a solution to (B.1), since $0' = 0 = x \cdot 0$ for all $x \in \mathbb{R}$. If $y \neq 0$, then we can write $\frac{y'}{y} = x$. Hence,
\[
\int \frac{y'}{y}\,dx = \int x\,dx.
\]
A change of variables $dy = y'\,dx$ shows that
\[
\int \frac{dy}{y} = \int x\,dx = \frac{1}{2}x^2 + C.
\]
But $\int \frac{dy}{y} = \ln(|y|)$, and so $\ln(|y|) = \frac{1}{2}x^2 + C$. This is called an implicit solution. We then solve for $y$:
\[
|y| = e^{\frac{x^2}{2} + C} = e^C e^{\frac{x^2}{2}} = C_1 e^{\frac{x^2}{2}}
\]
with $C_1 > 0$; therefore, $y = \pm C_1 e^{\frac{x^2}{2}} = Ae^{\frac{x^2}{2}}$ for a nonzero $A \in \mathbb{R}$.

Note that we have not verified that functions of this form are actually solutions to (B.1), but if there were to be a solution, it must be of the form $Ae^{\frac{x^2}{2}}$ for some $A \in \mathbb{R}$. One can check that every function of this type is actually a solution to (B.1), and so
\[
y = Ae^{\frac{x^2}{2}}
\]
is a solution to (B.1) for any $A \in \mathbb{R}$, including $A = 0$.

The algorithm for solving separable equations is as follows.
1. We first find all exceptional solutions. Namely, find all $y_0 \in \mathbb{R}$ such that $g(y_0) = 0$. For each such $y_0$, $\varphi(x) = y_0$ is a solution, for
\[
\varphi'(x) = 0 = f(x) \cdot 0 = f(x)g(y_0) = f(x)g(\varphi(x)).
\]
2. If $g(y) \neq 0$, then we get $\frac{y'}{g(y)} = f(x)$, and so $\int \frac{y'}{g(y)}\,dx = \int f(x)\,dx$. Upon the change of variables $dy = y'\,dx$, we obtain the implicit solution
\[
\int \frac{1}{g(y)}\,dy = \int f(x)\,dx.
\]
3. We finally evaluate the integrals and solve for $y$ to get explicit solutions.
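The outcome of this algorithm for the equation $y' = xy$ of Example B.3 can be checked numerically. The sketch below integrates the equation with the classical fourth-order Runge-Kutta method, which is a standard numerical technique brought in here purely as an independent check and is not part of these notes. It then compares the result with the explicit solution $Ae^{x^2/2}$, where $A = y(0)$. The function name `rk4` and the sample values are our own.

```python
import math


def rk4(f, x0, y0, x_end, steps=1000):
    """Integrate y' = f(x, y) from (x0, y0) to x_end with classical RK4."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y


def rhs(x, y):
    """Right-hand side f(x) g(y) with f(x) = x and g(y) = y."""
    return x * y


A = 2.0  # sample initial value y(0) = A
numeric = rk4(rhs, 0.0, A, 1.5)
exact = A * math.exp(1.5 ** 2 / 2)
print(numeric, exact)

# The exceptional solution y = 0 (where g(y0) = 0) stays at zero.
print(rk4(rhs, 0.0, 0.0, 1.5))
```

The two printed values in the first line should agree closely, and the second line shows that starting from the exceptional value $y_0 = 0$ the solution remains constant.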
Exponential growth and decay. This type of growth and decay is used to model the growth of a bacteria colony with unlimited resources, as well as radioactive decay. The rate of change of $y$ is proportional to the current value of $y$; that is, $y' = ky$. Solving, we obtain the exceptional solution $y = 0$. In the case where $y \neq 0$, we have that
\[
\int \frac{dy}{y} = \int k\,dt \implies \ln(|y|) = kt + C \implies y = Ae^{kt}
\]
for some $A \in \mathbb{R}$. Now $y(0) = Ae^{k \cdot 0} = A \cdot 1 = A$ represents the initial population or amount of substance. If $y \geq 0$, then the quantity $y' = ky$ satisfies $y' \geq 0$ if $k > 0$ and $y' \leq 0$ if $k < 0$. For the case $k > 0$, we get exponential growth; for the case $k < 0$, we get exponential decay.
Definition B.1.2. Given a differential equation $F(x, y, y', \ldots, y^{(n)}) = 0$, we can specify certain additional requirements such as
\[
y(x_0) = y_0,\quad y(x_1) = y_1,\quad \ldots,\quad y(x_n) = y_n.
\]
These additional requirements are called initial values.
Logistic growth and decay. Logistic growth is used to model the proportion, $P(t)$, of a population that has been exposed to a virus at time $t$ after some fixed point in time. The rate of new exposures is proportional to the product of $P(t)$ and $(1 - P(t))$. This gives a differential equation $P' = kP(1 - P)$. The logistic model simulates populations with finite resources: The virus has only a fixed, finite number of people to infect. To solve this differential equation, we first look for exceptional cases. There are two: $P = 0$ and $P = 1$. So assume $P \neq 0$ and $P \neq 1$. We then write
\[
\int \frac{dP}{P(1 - P)} = \int k\,dt = kt + C_0,
\]
whence by a partial fraction decomposition, we have
\[
\begin{aligned}
\int \frac{dP}{P(1 - P)} &= \int \left( \frac{1}{P} + \frac{1}{1 - P} \right) dP \\
&= \int \frac{1}{P}\,dP + \int \frac{1}{1 - P}\,dP \\
&= \ln(|P|) - \ln(|1 - P|) + C_1.
\end{aligned}
\]
This gives
\[
\ln(|P|) - \ln(|1 - P|) + C_1 = kt + C_0 \implies \ln\left( \frac{|P|}{|1 - P|} \right) = kt + C,
\]
where $C = C_0 - C_1$. Hence, we get that $\left| \frac{P}{1 - P} \right| = C_2 e^{kt}$, where $C_2 = e^C$ must be positive to yield a solution. This implies that $\frac{P}{1 - P} = Ae^{kt}$, where $A \in \mathbb{R}$, $A \neq 0$. Rearranging, we have
\[
\begin{aligned}
P &= Ae^{kt} - PAe^{kt} \\
P + PAe^{kt} &= Ae^{kt} \\
P(1 + Ae^{kt}) &= Ae^{kt},
\end{aligned}
\]
whence we obtain $P = \frac{Ae^{kt}}{1 + Ae^{kt}}$, $A \neq 0$. From the fact that $\frac{P}{1 - P} = Ae^{kt}$, we get $A = \frac{P}{1 - P}e^{-kt}$; observe that if $0 < P < 1$, then $A > 0$ and if $P < 0$ or $P > 1$, then $A < 0$. That is, if $0 < P < 1$, then $A$ is forced to be positive in order to yield a solution, and if $P < 0$ or $P > 1$, $A$ is forced to be negative. In our physical model, $P$ is a ratio, and so it satisfies $0 < P < 1$, whence $A > 0$ and we obtain our result
\[
P(t) = \frac{Ae^{kt}}{1 + Ae^{kt}}
\]
where $A > 0$.
Question: How do we find the values of $k$ and $A$?

This is where we need further constraints, such as initial values, to pin down a specific $k$ and $A$. For example, if we were given that $P(0) = 0.1$ and $P(5) = 0.3$, then
\[
0.1 = P(0) = \frac{Ae^{k(0)}}{1 + Ae^{k(0)}} = \frac{A}{1 + A},
\]
from which we obtain
\[
A = \frac{0.1}{1 - 0.1} = \frac{1}{9}.
\]
This means that $P(t) = \frac{\frac{1}{9}e^{kt}}{1 + \frac{1}{9}e^{kt}}$. Then
\[
0.3 = P(5) = \frac{\frac{1}{9}e^{5k}}{1 + \frac{1}{9}e^{5k}},
\]
from which we can solve for $k$.
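One way to carry out this last step is to use the relation $\frac{P}{1-P} = Ae^{kt}$ from the derivation above, which gives $e^{5k} = \frac{0.3/0.7}{1/9} = \frac{27}{7}$ and hence $k = \frac{1}{5}\ln\frac{27}{7} \approx 0.27$. The Python sketch below performs the same computation for general data; the function names `fit_logistic` and `logistic` are our own, and the sketch assumes $0 < P < 1$ at both data points.

```python
import math


def fit_logistic(p0, t1, p1):
    """Return (A, k) for P(t) = A e^{kt} / (1 + A e^{kt}) with P(0) = p0, P(t1) = p1.

    Uses P / (1 - P) = A e^{kt}, which requires 0 < p0 < 1 and 0 < p1 < 1.
    """
    A = p0 / (1 - p0)
    k = math.log((p1 / (1 - p1)) / A) / t1
    return A, k


def logistic(t, A, k):
    """Evaluate P(t) = A e^{kt} / (1 + A e^{kt})."""
    e = A * math.exp(k * t)
    return e / (1 + e)


A, k = fit_logistic(0.1, 5.0, 0.3)
print(A, k)
print(logistic(0.0, A, k), logistic(5.0, A, k))
```

The second line of output recovers the two given values $P(0) = 0.1$ and $P(5) = 0.3$, up to rounding.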
B.2 First-Order Linear Differential Equations

Definition B.2.1. A first-order differential equation is said to be linear if it has the form $y' + f(x)y = g(x)$.
We will attempt to solve such a differential equation. First note that linear differential equations look oddly like the product rule from differentiation
\[
y'I(x) + I'(x)y = (yI(x))', \tag{B.2}
\]
except some terms present in $y'I(x) + I'(x)y$ are missing in $y' + f(x)y$. We multiply the expression $y' + f(x)y$ by a function $I(x)$ to make it look more like $y'I(x) + I'(x)y$:
\[
y'I(x) + I(x)f(x)y. \tag{B.3}
\]
Now, if our $I(x)$ is such that $I(x)f(x) = I'(x)$, then (B.3) will look exactly like the left side of (B.2). But since $I(x)f(x) = I'(x)$ is a separable differential equation, we can solve for $I(x)$. We write $f(x) = \frac{I'(x)}{I(x)}$ to obtain
\[
\begin{aligned}
\int f(x)\,dx &= \int \frac{dI}{I(x)} \\
\ln(|I(x)|) &= \int f(x)\,dx \\
|I(x)| &= e^{\int f(x)\,dx}.
\end{aligned}
\]
We will take the positive solution $I(x) = e^{\int f(x)\,dx}$. This $I(x)$ is called the integrating factor, and it satisfies $I(x)f(x) = I'(x)$.
How and where do we use the integrating factor $I(x)$? Returning to our original differential equation
\[
y' + f(x)y = g(x),
\]
we first multiply both sides of this equation by $I(x) = e^{\int f(x)\,dx}$ to get
\[
y'I(x) + I(x)f(x)y = y'I(x) + I'(x)y = (yI(x))' = I(x)g(x).
\]
We can now integrate both sides as follows
\[
\int \frac{d}{dx}\left[ yI(x) \right] dx = \int I(x)g(x)\,dx,
\]
whence we obtain
\[
yI(x) = \int I(x)g(x)\,dx, \qquad y = \frac{\int I(x)g(x)\,dx}{I(x)}.
\]
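Example B.4. Consider the first-order linear differential equation
\[
y' - xy = x. \tag{B.4}
\]
Let $f(x) = -x$ and $g(x) = x$, so that (B.4) can be written as $y' + f(x)y = g(x)$. The integrating factor $I(x)$ is
\[
I(x) = e^{\int f(x)\,dx} = e^{\int -x\,dx} = e^{-\frac{x^2}{2}}.
\]
Hence,
\[
y = \frac{\int x e^{-\frac{x^2}{2}}\,dx}{e^{-\frac{x^2}{2}}}.
\]
Letting $u = -\frac{x^2}{2}$, we have that $du = -x\,dx$ and so
\[
\int x e^{-\frac{x^2}{2}}\,dx = -\int e^u\,du = -e^u + C = -e^{-\frac{x^2}{2}} + C;
\]
therefore,
\[
y = \frac{-e^{-\frac{x^2}{2}} + C}{e^{-\frac{x^2}{2}}} = -1 + \frac{C}{e^{-\frac{x^2}{2}}} = -1 + Ce^{\frac{x^2}{2}}
\]
for all $C \in \mathbb{R}$. It can be checked that every function of the form
\[
\varphi(x) = -1 + Ce^{\frac{x^2}{2}}
\]
satisfies (B.4), and we are done.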
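The final check in Example B.4 can be sketched numerically in Python. The sketch assumes that (B.4) reads $y' - xy = x$ and that the candidate solutions are $\varphi(x) = -1 + Ce^{x^2/2}$. It approximates $\varphi'$ by a central difference and evaluates the residual $\varphi'(x) - x\varphi(x) - x$, which should be close to zero. The names `phi` and `residual` and the step size `h` are our own choices.

```python
import math


def phi(x, C):
    """Candidate solution -1 + C e^{x^2 / 2} of y' - x y = x."""
    return -1.0 + C * math.exp(x * x / 2)


def residual(x, C, h=1e-5):
    """Central-difference estimate of phi'(x) - x * phi(x) - x."""
    dphi = (phi(x + h, C) - phi(x - h, C)) / (2 * h)
    return dphi - x * phi(x, C) - x


for C in (-2.0, 0.0, 3.0):
    for x in (-1.5, 0.0, 0.7, 1.5):
        print(C, x, residual(x, C))
```

The residuals are not exactly zero because the derivative is only approximated, but they should be tiny compared with the size of $\varphi$ itself.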
