
Numerical Methods in Mechanical Systems

Course Notes

Williams R. Calderón Muñoz, Ph.D.
Department of Mechanical Engineering,
Universidad de Chile,
Santiago, Chile.

Last update: August 14th, 2013


1 Interpolation and Numerical Integration
1.1 Taylor series expansion
The most fundamental mathematical concept in numerical analysis is the Taylor series expansion.
Let us assume that the function $f(x)$ possesses $n+1$ continuous derivatives everywhere on the interval $[x_0, x]$. Then this function can be represented by a finite Taylor series of the form

$$f(x) = f(x_0) + \frac{f'(x_0)}{1!}(x-x_0) + \frac{f''(x_0)}{2!}(x-x_0)^2 + \cdots + \frac{f^{(n)}(x_0)}{n!}(x-x_0)^n + \frac{f^{(n+1)}(\xi)}{(n+1)!}(x-x_0)^{n+1}, \tag{1}$$

where $\xi$ is some number between $x_0$ and $x$. If $f(x)$ satisfies more stringent conditions, this function can be represented by an infinite Taylor series

$$f(x) = f(x_0) + \frac{f'(x_0)}{1!}(x-x_0) + \frac{f''(x_0)}{2!}(x-x_0)^2 + \cdots + \frac{f^{(n)}(x_0)}{n!}(x-x_0)^n + \cdots \tag{2}$$

when $|x - x_0|$ is sufficiently small.
Example. If $f(x) = e^x$, its Taylor series around $x = x_0$ is

$$e^x = e^{x_0} + e^{x_0}(x-x_0) + \frac{e^{x_0}}{2}(x-x_0)^2 + \frac{e^{x_0}}{6}(x-x_0)^3 + er_T(x),$$

where $er_T(x) = \frac{e^{\xi}}{24}(x-x_0)^4$ is the truncation error, with $\xi$ between $x_0$ and $x$. By taking $x_0 = 0$, for $x$ sufficiently small, we can get the approximation

$$e^x \approx 1 + x + \frac{1}{2}x^2 + \frac{1}{6}x^3,$$

with an error smaller than $\frac{1}{24}x^4$. If $x = 0.01$, $e^{0.01} \approx 1.01005$, and $er_T(0.01) \lesssim 0.01^4/24 \approx 4 \times 10^{-10}$.
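As a quick check of these formulas, the following short Python sketch (an illustration, not part of the original notes) evaluates the cubic Taylor approximation of $e^x$ and compares it against math.exp:

```python
import math

def taylor_exp(x, n=3):
    """Truncated Taylor series of e^x about x0 = 0, up to degree n."""
    return sum(x**k / math.factorial(k) for k in range(n + 1))

x = 0.01
approx = taylor_exp(x)
print(f"approx = {approx:.10f}")                     # ~1.0100501667
print(f"exact  = {math.exp(x):.10f}")
print(f"error  = {abs(math.exp(x) - approx):.2e}")   # well below x**4/24
```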
1.2 Polynomial interpolation
Interpolation is a powerful tool to analyze and represent information collected from experiments or other sources, and to gain further knowledge about the meaning and behavior of the phenomena captured in a set of data points.
Mathematically, interpolation is defined as a continuous representation of a discrete set of data points. Applications might be for differentiation or integration, or simply estimating the value of the function between two data points.
In general, there are two distinct interpolation methods: standard, where the curve passes smoothly through the data points, and least squares, where a smooth curve passes sufficiently close to data points that have some uncertainty.
General interpolation problem
The general standard interpolation problem is defined as follows:

GIVEN:

1. A set of data points $x_i$, with $i = 0, 1, 2, \ldots, N$;
2. A set of values $f_i \equiv f(x_i)$, with $i = 0, 1, 2, \ldots, N$;
3. A set of basis functions $b_j(x)$, with $j = 0, 1, 2, \ldots, N$.

FIND: coefficients $a_j$, with $j = 0, 1, 2, \ldots, N$, so that

$$f(x_i) = \sum_{j=0}^{N} a_j b_j(x_i), \quad i = 0, 1, 2, \ldots, N.$$

For simplicity, we have assumed linearity in the coefficients $a_j$, and subsequently the above equation is just a system of linear algebraic equations which can be written as

$$f_i = B_{ij} a_j,$$

where $B_{ij} \equiv b_j(x_i)$. The set of basis functions $b_j(x)$ is still arbitrary at this point (as long as they are complete in the space of f), but for convenience we impose two restrictions on them:

1. They should be easy to evaluate, and
2. They should be easy to differentiate and integrate.

The most common basis functions used that satisfy these restrictions are polynomials of various kinds. It is important to mention that there exist other very useful basis functions, such as Fourier: sin and cos.
Lagrange Interpolation
Suppose we have a set of N + 1 (non-equally spaced) data points $(x_i, f_i)$. By choosing the basis functions $b_j(x) = x^j$, with $j = 0, 1, 2, \ldots, N$, we can construct a polynomial of degree N that passes through the data:

$$f_i = \sum_{j=0}^{N} a_j x_i^j = a_0 + a_1 x_i + a_2 x_i^2 + \cdots + a_N x_i^N, \quad i = 0, 1, 2, \ldots, N. \tag{3}$$

Here, we have N + 1 equations for the N + 1 unknowns $a_0, a_1, a_2, \ldots, a_N$.
This procedure for finding the coefficients of the polynomial is not very attractive. It involves solving a system of algebraic equations which is generally ill-conditioned for large N (the resulting matrix is called a Vandermonde matrix). In practice, we will define the polynomial in an explicit way (as opposed to solving a system of equations). Define a polynomial of degree N associated with each point $x_j$:

$$L_j^{(N)}(x) = \alpha_j (x-x_0)(x-x_1)\cdots(x-x_{j-1})(x-x_{j+1})\cdots(x-x_N),$$

where $\alpha_j$ is a constant to be determined. Let's use the product notation for $L_j^{(N)}$:

$$L_j^{(N)}(x) = \alpha_j \prod_{i=0,\, i \neq j}^{N} (x - x_i).$$
If x is equal to any of the data points except $x_j$, then $L_j^{(N)}(x) = 0$. Note

$$L_j^{(N)}(x_j) = \alpha_j \prod_{i=0,\, i \neq j}^{N} (x_j - x_i).$$

Let

$$\alpha_j = \left[ \prod_{i=0,\, i \neq j}^{N} (x_j - x_i) \right]^{-1},$$

so that

$$L_j^{(N)}(x) = \frac{(x-x_0)\cdots(x-x_{j-1})(x-x_{j+1})\cdots(x-x_N)}{(x_j-x_0)\cdots(x_j-x_{j-1})(x_j-x_{j+1})\cdots(x_j-x_N)}. \tag{4}$$
Then

$$L_j^{(N)}(x_i) = \delta_{ij},$$

where $\delta_{ij}$ is the Kronecker delta (1 if $i = j$ and 0 if $i \neq j$). Now we can form the following linear combination

$$f(x) = \sum_{j=0}^{N} L_j^{(N)}(x) f_j.$$
This is a polynomial of degree N because it is a linear combination of polynomials of degree N. It is the classic Lagrange polynomial. By construction, it passes through all the data points. For example, at $x = x_i$

$$f(x_i) = L_0^{(N)}(x_i) f_0 + L_1^{(N)}(x_i) f_1 + \cdots + L_j^{(N)}(x_i) f_j + \cdots + L_N^{(N)}(x_i) f_N. \tag{5}$$

Since $L_j^{(N)}(x_i)$ is equal to zero except for $j = i$, and $L_i^{(N)}(x_i) = 1$,

$$f(x_i) = f_i. \tag{6}$$

Thus, f(x) is the desired polynomial. Note that the polynomial interpolation is unique. That is, there is only one polynomial of degree N passing through a set of N + 1 points.
The Lagrange polynomial is just a compact and numerically better-behaved way of expressing the polynomial whose coefficients could have also been obtained from solving a system of algebraic equations. If we assume the data $f_i$ are exactly the values a smooth function $y(x)$ takes at $x_i$ (i.e. $f_i = y(x_i)$), then it can be shown that

$$y(x) = f(x) + \frac{y^{(N+1)}(\xi)}{(N+1)!} F(x), \quad x_0 < \xi < x_N.$$

Note that $F(x) = (x-x_0)\cdots(x-x_N)$ and $F(x_i) = 0$, so that the error is largest between points and for more wiggly functions. For a large set of data points (say greater than 10), polynomial interpolation can be very dangerous. Although the polynomial is fixed (tied down) at the data points, it can wander wildly between the points, which can lead to large errors for derivatives or interpolated values. For these cases (large sets of data points), better results are obtained by using piecewise Lagrange interpolation.
Example: Let us take $y(x) = e^x$ for $0 \le x \le 1$. We use three points (N = 2): $x_0 = 0$, $x_1 = 0.5$, $x_2 = 1$, and take $f_i = y(x_i)$. Then

$$f(x) = \frac{(x-x_1)(x-x_2)}{(x_0-x_1)(x_0-x_2)} f(x_0) + \frac{(x-x_0)(x-x_2)}{(x_1-x_0)(x_1-x_2)} f(x_1) + \frac{(x-x_0)(x-x_1)}{(x_2-x_0)(x_2-x_1)} f(x_2).$$

At $x = 0.25$ we have

$$f(0.25) = \frac{(-0.25)(-0.75)}{(-0.5)(-1)} e^0 + \frac{(0.25)(-0.75)}{(0.5)(-0.5)} e^{0.5} + \frac{(0.25)(-0.25)}{(1)(0.5)} e^1$$
$$= 0.375000 + 1.236541 - 0.339785 = 1.271756.$$

The exact value of the function at $x = 0.25$ is $y(0.25) = e^{0.25} = 1.2840254$, so the error is approximately 1%.
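The worked example above is easy to reproduce in code. The sketch below (my illustration; the helper name lagrange_eval is not from the notes) evaluates the Lagrange form directly:

```python
import math

def lagrange_eval(x, xs, fs):
    """Evaluate the Lagrange interpolating polynomial through (xs, fs) at x."""
    total = 0.0
    for j, (xj, fj) in enumerate(zip(xs, fs)):
        L = 1.0
        for i, xi in enumerate(xs):
            if i != j:
                L *= (x - xi) / (xj - xi)   # eq. (4)
        total += L * fj
    return total

xs = [0.0, 0.5, 1.0]
fs = [math.exp(v) for v in xs]
print(lagrange_eval(0.25, xs, fs))   # ~1.271756
print(math.exp(0.25))                # 1.284025..., error ~1%
```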
Hermite Interpolation
In some situations, we may want an interpolant that not only agrees in value with a given function at specified locations, but whose derivatives up to some order also match the derivatives of the function at the same locations. Thus, for the simplest case we require

$$f(x_i) = y(x_i), \quad f'(x_i) = y'(x_i), \quad i = 0, \ldots, N. \tag{7}$$

This results in Hermite interpolation. The analysis is very similar to that of Lagrange interpolation. The polynomial must be linear in all the $f_i$ and $f'_i$, so we have

$$f(x) = \sum_{j=0}^{N} H_j(x) f_j + \sum_{j=0}^{N} \tilde{H}_j(x) f'_j, \tag{8}$$

where $H_j(x)$ and $\tilde{H}_j(x)$ are polynomials of degree $2N+1$ with the following properties

$$H_j(x_i) = \delta_{ij}, \quad H'_j(x_i) = 0, \quad \tilde{H}_j(x_i) = 0, \quad \tilde{H}'_j(x_i) = \delta_{ij}. \tag{9}$$

The required polynomial can be constructed by using the properties of the Lagrange polynomial, and the result may be expressed in terms of $L_j^{(N)}(x)$. The result is

$$H_j(x) = \left[ 1 - 2 L_j^{(N)\prime}(x_j)(x - x_j) \right] \left[ L_j^{(N)}(x) \right]^2,$$

$$\tilde{H}_j(x) = (x - x_j) \left[ L_j^{(N)}(x) \right]^2.$$

Finally, the error can be estimated by

$$y(x) = f(x) + \frac{y^{(2N+2)}(\xi)}{(2N+2)!} F^2(x), \quad x_0 < \xi < x_N. \tag{10}$$

Note that since derivative data often are not available, this method is not used frequently as an interpolation technique.
Piecewise Lagrange Interpolation
Instead of fitting a single polynomial of order N to all the data, one fits lower-order polynomials to sections of it. This is used in many practical applications and is the basis for some numerical methods. The simplest case is linear Lagrange interpolation

$$f_i(x) \approx \frac{x_{i+1} - x}{x_{i+1} - x_i} f_i + \frac{x - x_i}{x_{i+1} - x_i} f_{i+1}. \tag{11}$$

The subscript on f(x) indicates that this formula holds only for $x_i \le x \le x_{i+1}$. An improvement can be obtained by using the quadratic Lagrange polynomial

$$f_i(x) \approx \frac{(x-x_i)(x-x_{i+1})}{(x_{i-1}-x_i)(x_{i-1}-x_{i+1})} f_{i-1} + \frac{(x-x_{i-1})(x-x_{i+1})}{(x_i-x_{i-1})(x_i-x_{i+1})} f_i + \frac{(x-x_{i-1})(x-x_i)}{(x_{i+1}-x_{i-1})(x_{i+1}-x_i)} f_{i+1} \tag{12}$$

for $x_{i-1} \le x \le x_{i+1}$. This procedure can be extended to higher-order interpolation.
Cubic Splines
The main problem with piecewise Lagrange interpolation is that it has discontinuous derivatives at the boundaries of the segments, which causes difficulties when differentiating. Interpolation with cubic splines gets around this difficulty. We can define a spline interpolating curve by the following criteria:

- The curve is piecewise cubic; that is, the coefficients of the polynomial are different on each interval $(x_i, x_{i+1})$.
- The curve passes through the given data.
- With the exception of the end points, the first and second derivatives are continuous at the points $x_i$.

From continuity of $f''(x)$, using the piecewise linear Lagrange formula, we have

$$f''(x) \approx \frac{x_{i+1} - x}{x_{i+1} - x_i} f''(x_i) + \frac{x - x_i}{x_{i+1} - x_i} f''(x_{i+1}). \tag{13}$$

Integrating the equation twice and fitting to the data $f(x_i)$ and $f(x_{i+1})$ results in

$$f_i(x) = \frac{(x_{i+1}-x)^3}{6h_i} f''(x_i) + \frac{(x-x_i)^3}{6h_i} f''(x_{i+1}) + (x_{i+1}-x)\left[ \frac{f(x_i)}{h_i} - \frac{h_i}{6} f''(x_i) \right] + (x-x_i)\left[ \frac{f(x_{i+1})}{h_i} - \frac{h_i}{6} f''(x_{i+1}) \right], \tag{14}$$

where $h_i = x_{i+1} - x_i$. If we can find $f''(x_i)$, the formula will be complete. This is obtained from continuity of $f'$ using two adjacent formulas. The result is

$$\frac{h_{i-1}}{6} f''(x_{i-1}) + \frac{h_{i-1}+h_i}{3} f''(x_i) + \frac{h_i}{6} f''(x_{i+1}) = \frac{f(x_{i+1}) - f(x_i)}{h_i} - \frac{f(x_i) - f(x_{i-1})}{h_{i-1}}. \tag{15}$$

This is a tridiagonal system for $f''(x_i)$ that is numerically easily solved. This equation cannot be applied at $i = 0$ or $N$, so we only have $N-1$ equations for $N+1$ unknowns. We get two additional equations by making some assumptions and/or approximations in the end intervals. Some choices of boundary conditions are:

- Periodic: $f''(x_0) = f''(x_{N-1})$, $f''(x_N) = f''(x_1)$.
- Parabolic: $f''(x_0) = f''(x_1)$, $f''(x_N) = f''(x_{N-1})$.
- Natural (or free): $f''(x_0) = f''(x_N) = 0$.
- Cantilever: $f''(x_0) = \kappa f''(x_1)$, $f''(x_N) = \kappa f''(x_{N-1})$, with $0 \le \kappa \le 1$.

For equally spaced intervals h, we can show that if $f_i$ are the exact values a smooth function $y(x)$ takes at $x_i$, then

$$|y(x) - f(x)| \le \frac{h^4}{96} \max_{\xi} |y^{iv}(\xi)|, \quad x_0 < \xi < x_N. \tag{16}$$
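As an illustration of how equations (14)-(15) turn into a small linear solve, here is a minimal sketch of a natural cubic spline on a grid (hypothetical helper name; numpy assumed; a banded solver would be preferred in practice):

```python
import numpy as np

def natural_spline_second_derivs(x, f):
    """Solve the tridiagonal system (15) for f'' at the nodes,
    with natural end conditions f''(x_0) = f''(x_N) = 0."""
    N = len(x) - 1
    h = np.diff(x)
    A = np.zeros((N + 1, N + 1))
    rhs = np.zeros(N + 1)
    A[0, 0] = A[N, N] = 1.0          # natural BCs: f'' = 0 at both ends
    for i in range(1, N):
        A[i, i - 1] = h[i - 1] / 6
        A[i, i]     = (h[i - 1] + h[i]) / 3
        A[i, i + 1] = h[i] / 6
        rhs[i] = (f[i + 1] - f[i]) / h[i] - (f[i] - f[i - 1]) / h[i - 1]
    return np.linalg.solve(A, rhs)

x = np.linspace(0.0, 1.0, 11)
f = np.exp(x)
fpp = natural_spline_second_derivs(x, f)
print(fpp[5], np.exp(x[5]))  # mid-grid f'' is close to e^{0.5}, despite end bias
```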
1.3 Numerical Differentiation

Derivatives are part of a wide range of mathematical models in engineering. Numerical techniques are used very often to solve mathematical models, and an important step is to obtain derivatives from numerical data.
We would like to get the derivative of a smooth function defined on a discrete set of grid points $x_0, x_1, x_2, \ldots, x_N$. Assume that the data are the exact values of the function at the data points, and that we need the derivative only at the data points. We will look into the construction of numerical approximations of the derivative, called finite differences. There are two approaches to construct such numerical approximations: using interpolation formulas, or Taylor series approximations.
Finite Differences from Interpolation

From linear piecewise Lagrange interpolation we have

$$f(x) \approx \frac{x_{i+1}-x}{x_{i+1}-x_i} f(x_i) + \frac{x-x_i}{x_{i+1}-x_i} f(x_{i+1}). \tag{17}$$

Differentiating, we find that the derivative is a constant and the same at the two ends of the interval

$$f'(x_i) = \frac{f_{i+1}-f_i}{h_i} \equiv \frac{1}{h_i}\Delta f_i, \qquad f'(x_{i+1}) = \frac{f_{i+1}-f_i}{h_i} \equiv \frac{1}{h_i}\nabla f_{i+1}, \tag{18}$$

where $\Delta$ and $\nabla$ are the forward and backward difference operators. Note that the derivative is discontinuous at the end points.
From quadratic piecewise Lagrange interpolation we have

$$f(x) = \frac{(x-x_i)(x-x_{i+1})}{(x_{i-1}-x_i)(x_{i-1}-x_{i+1})} f(x_{i-1}) + \frac{(x-x_{i-1})(x-x_{i+1})}{(x_i-x_{i-1})(x_i-x_{i+1})} f(x_i) + \frac{(x-x_{i-1})(x-x_i)}{(x_{i+1}-x_{i-1})(x_{i+1}-x_i)} f(x_{i+1}). \tag{19}$$

Differentiating and evaluating the result at the midpoint of the interval we obtain

$$f'(x_i) = \frac{-h_i}{h_{i-1}(h_{i-1}+h_i)} f_{i-1} + \left( \frac{1}{h_{i-1}} - \frac{1}{h_i} \right) f_i + \frac{h_{i-1}}{h_i(h_{i-1}+h_i)} f_{i+1}. \tag{20}$$

For equally spaced intervals $h_{i-1} = h_i = h$, so

$$f'(x_i) = \frac{f_{i+1}-f_{i-1}}{2h} = \frac{1}{2h}(\Delta + \nabla) f_i. \tag{21}$$

This is the central difference formula, and is the average of the forward and backward formulas.
If we differentiate f(x) twice and evaluate the result at $x_i$, we obtain a constant in the interval $x_{i-1} \le x \le x_{i+1}$

$$f''(x_i) = \frac{2}{h_{i-1}(h_{i-1}+h_i)} f_{i-1} - \frac{2}{h_{i-1}h_i} f_i + \frac{2}{h_i(h_{i-1}+h_i)} f_{i+1}, \tag{22}$$

which for equally spaced intervals reduces to

$$f''(x_i) = \frac{f_{i-1} - 2f_i + f_{i+1}}{h^2} = \frac{1}{h^2}\Delta\nabla f_i = \frac{1}{h^2}\nabla\Delta f_i. \tag{23}$$
Since $f''$ is constant, it is called a forward, central, or backward formula depending on whether it is evaluated at $x_{i-1}$, $x_i$, or $x_{i+1}$, respectively. The use of cubic splines to derive finite difference approximations has received little attention. It requires the solution of the tridiagonal system

$$\frac{h_{i-1}}{6} f''(x_{i-1}) + \frac{h_{i-1}+h_i}{3} f''(x_i) + \frac{h_i}{6} f''(x_{i+1}) = \frac{f(x_{i+1}) - f(x_i)}{h_i} - \frac{f(x_i) - f(x_{i-1})}{h_{i-1}}. \tag{24}$$

For uniform intervals we obtain

$$\frac{1}{6} f''_{i-1} + \frac{2}{3} f''_i + \frac{1}{6} f''_{i+1} = \frac{f_{i-1} - 2f_i + f_{i+1}}{h^2}. \tag{25}$$

Note that the solution of the system gives us the approximation for the second derivative, and the effect of the spline is to distribute the previous result over the central point and its neighbors with weights 1/6, 2/3, and 1/6. Once this tridiagonal system is solved, the appropriate $f''_i$ can be used in the first-derivative approximation obtained by differentiating the spline approximation for f(x) and evaluating the result at $x_i$.
Finite Differences from Taylor Series

Finite differences are a practical technique to approximate derivatives and thus to solve differential equations.
Finite difference formulas can be easily derived from Taylor series expansions. For example, to obtain an approximation for the derivative of f(x) at the point $x_i$, we use

$$f(x_{i+1}) = f(x_i) + (x_{i+1}-x_i) f'(x_i) + \frac{(x_{i+1}-x_i)^2}{2} f''(x_i) + \cdots \tag{26}$$

Rearrangement leads to:

$$f'(x_i) = \frac{f(x_{i+1}) - f(x_i)}{h_i} - \frac{h_i}{2} f''(x_i) + \cdots \tag{27}$$

When the grid points are uniformly spaced, the above formula can be recast in the following form

$$f'_i = \frac{f_{i+1} - f_i}{h} + O(h). \tag{28}$$

This formula is referred to as the first forward difference. The exponent of h in O(h) is the order of accuracy of the method. With a first-order scheme, if we refine the mesh size by a factor of 2, the error (called the truncation error) is reduced by approximately a factor of 2. Similarly,

$$f'_i = \frac{f_i - f_{i-1}}{h} + O(h) \tag{29}$$

is called the first-order backward difference formula. Higher-order (more accurate) schemes can be derived from Taylor series of the function f at different points about the point $x_i$. For example, the widely used central difference formula can be obtained from subtraction of the two Taylor series expansions:

$$f_{i+1} = f_i + h f'_i + \frac{h^2}{2} f''_i + \frac{h^3}{6} f'''_i + \cdots \tag{30}$$

$$f_{i-1} = f_i - h f'_i + \frac{h^2}{2} f''_i - \frac{h^3}{6} f'''_i + \cdots \tag{31}$$

This leads to

$$f'_i = \frac{f_{i+1} - f_{i-1}}{2h} - \frac{h^2}{6} f'''_i + \cdots \tag{33}$$

This is a second-order formula. In general, we can obtain higher accuracy if we include more points. Here is a fourth-order formula:

$$f'_i = \frac{f_{i-2} - 8f_{i-1} + 8f_{i+1} - f_{i+2}}{12h} + O(h^4). \tag{34}$$

The main difficulty with higher-order formulas occurs near the boundaries of the domain. They require the functional values at points outside the domain, which are not available. Near the boundaries one usually resorts to lower-order formulas. Similar formulas can be derived for second- or higher-order derivatives. For example, the second-order central difference formula for the second derivative is

$$f''_i = \frac{f_{i+1} - 2f_i + f_{i-1}}{h^2} + O(h^2), \tag{35}$$

and is obtained by adding the two Taylor series expansions described above.
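To see these orders of accuracy concretely, a short sketch (my illustration, not from the notes) can measure the error of the forward and central formulas for f(x) = sin x as h is halved:

```python
import math

def forward_diff(f, x, h):
    return (f(x + h) - f(x)) / h            # first order, eq. (28)

def central_diff(f, x, h):
    return (f(x + h) - f(x - h)) / (2 * h)  # second order, eq. (33)

x, exact = 1.0, math.cos(1.0)
for h in (0.1, 0.05, 0.025):
    ef = abs(forward_diff(math.sin, x, h) - exact)
    ec = abs(central_diff(math.sin, x, h) - exact)
    print(f"h={h:0.3f}  forward err={ef:.2e}  central err={ec:.2e}")
# halving h roughly halves the forward error and quarters the central error
```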
Matrix Representation of Finite Difference Schemes

This is a general procedure for constructing difference formulas. As an example, suppose we want to construct the most accurate difference scheme that involves the functional values at points i, i+1, and i+2. In other words, given the restriction on the points involved, we ask for the highest order of accuracy that can be achieved. That is

$$f'_i + \sum_{k=0}^{2} a_k f_{i+k} = O(??), \tag{36}$$

where the $a_k$ are the coefficients of the linear combination of Taylor series, and they are to be determined. We take a linear combination of the Taylor series for the terms in the above formula using a convenient table:

                 f_i     f'_i       f''_i          f'''_i
  f'_i            0        1          0               0
  a_0 f_i        a_0       0          0               0
  a_1 f_{i+1}    a_1     a_1 h     a_1 h^2/2       a_1 h^3/6
  a_2 f_{i+2}    a_2     a_2(2h)   a_2(2h)^2/2     a_2(2h)^3/6

Now, we form the sum by adding the columns in the above table:

$$f'_i + \sum_{k=0}^{2} a_k f_{i+k} = (a_0+a_1+a_2) f_i + (1 + a_1 h + 2a_2 h) f'_i + \left(\frac{a_1 h^2}{2} + 2a_2 h^2\right) f''_i + \left(\frac{a_1 h^3}{6} + \frac{4a_2 h^3}{3}\right) f'''_i + \cdots \tag{37}$$

To get the highest accuracy, we must set as many of the low-order terms to zero as possible. We have three coefficients; therefore, we can set the coefficients of the first three terms to zero:

$$a_0 + a_1 + a_2 = 0$$
$$a_1 h + 2a_2 h = -1 \tag{38}$$
$$a_1 h^2/2 + 2a_2 h^2 = 0$$

Solving these equations leads to:

$$a_0 = \frac{3}{2h}, \quad a_1 = -\frac{2}{h}, \quad a_2 = \frac{1}{2h}. \tag{39}$$

Thus, the resulting (second-order) formula is

$$f'_i = \frac{-3f_i + 4f_{i+1} - f_{i+2}}{2h}, \tag{40}$$

and the leading-order truncation error term is

$$\frac{h^2}{3} f'''_i. \tag{41}$$
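The three linear conditions in (38) can be solved mechanically; the sketch below (illustrative only, numpy assumed) recovers the weights in (39) for h = 1 and checks the resulting formula (40) on a smooth function:

```python
import numpy as np

h = 1.0
# Conditions (38): rows correspond to the f_i, f'_i and f''_i columns of the table.
A = np.array([[1.0, 1.0, 1.0],
              [0.0, h, 2 * h],
              [0.0, h**2 / 2, 2 * h**2]])
b = np.array([0.0, -1.0, 0.0])
a0, a1, a2 = np.linalg.solve(A, b)
print(a0, a1, a2)   # 1.5, -2.0, 0.5  ->  3/(2h), -2/h, 1/(2h)

# The scheme is f'_i = -(a0 f_i + a1 f_{i+1} + a2 f_{i+2})/h for actual step h.
hh = 0.01
approx = -(a0 * np.sin(1.0) + a1 * np.sin(1.0 + hh) + a2 * np.sin(1.0 + 2 * hh)) / hh
print(approx, np.cos(1.0))   # agree to O(hh**2)
```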
1.4 Numerical Integration

Here we will discuss methods for evaluation of the integral of the function f(x) in the interval [a, b]:

$$I \equiv \int_a^b f(x)\, dx. \tag{42}$$

The function f(x) is defined either analytically or on a set of discrete points $a = x_0, x_1, \ldots, x_N = b$. All integration formulas, also called quadrature formulas, are of the form

$$I \approx \sum_{j=0}^{N} w_j f(x_j), \tag{43}$$

where the $w_j$ are usually referred to as weights. The specific choice of these weights is what distinguishes the different quadrature formulas.

Newton-Cotes Formulas

One of the easiest ways to obtain quadrature formulas is to use Lagrange interpolation on an evenly spaced mesh and integrate the result. Divide the range of integration into N intervals:

$$x_i = a + ih, \quad h = \frac{b-a}{N}, \quad i = 0, 1, 2, \ldots, N.$$

Pass a Lagrange interpolating polynomial of degree N through the points $(x_i, f(x_i))$:

$$P(x) = \sum_{j=0}^{N} L_j^{(N)}(x) f(x_j).$$

Now integrate this polynomial

$$I \approx \int_a^b P(x)\, dx = \sum_{j=0}^{N} f(x_j) \int_a^b L_j^{(N)}(x)\, dx = (b-a) \sum_{j=0}^{N} C_j^N f(x_j), \tag{44}$$

where

$$C_j^N = \frac{1}{b-a} \int_a^b L_j^{(N)}(x)\, dx \tag{45}$$

are pure numbers, called Cotes numbers, and are given in the table below (it is not included). Observe that in this case the weights are

$$w_j = (b-a) C_j^N. \tag{46}$$

It can be easily shown that

$$\sum_{j=0}^{N} C_j^N = 1 \quad \text{and} \quad C_j^N = C_{N-j}^N. \tag{47}$$
Trapezoidal Rule

For one interval, the trapezoidal rule is given by

$$I_i = \int_{x_i}^{x_{i+1}} f(x)\, dx \approx \frac{h_i}{2}(f_i + f_{i+1}). \tag{48}$$

For the entire interval [a, b], the trapezoidal rule is

$$I = \sum_{i=0}^{N-1} I_i \approx h \left[ \frac{1}{2} f_0 + \frac{1}{2} f_N + \sum_{j=1}^{N-1} f_j \right], \tag{49}$$

where uniform spacing is assumed.

Simpson's Rule

Simpson's rule for uniform intervals is given by

$$I \approx \frac{h}{3} \left[ f_0 + f_N + 4 \sum_{j=1,\, j\ \mathrm{odd}}^{N-1} f_j + 2 \sum_{j=2,\, j\ \mathrm{even}}^{N-2} f_j \right]. \tag{50}$$

To use Simpson's rule for the entire interval of integration the number of points, N + 1, must be odd (even number of panels).
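A compact sketch of both composite rules (my illustration; numpy assumed) applied to the integral of e^x on [0, 1], whose exact value is e - 1:

```python
import numpy as np

def trapezoid(f, a, b, N):
    x = np.linspace(a, b, N + 1)
    y = f(x)
    h = (b - a) / N
    return h * (0.5 * y[0] + 0.5 * y[-1] + y[1:-1].sum())     # eq. (49)

def simpson(f, a, b, N):              # N must be even (odd number of points)
    x = np.linspace(a, b, N + 1)
    y = f(x)
    h = (b - a) / N
    return h / 3 * (y[0] + y[-1] + 4 * y[1:-1:2].sum() + 2 * y[2:-1:2].sum())

exact = np.e - 1.0
print(abs(trapezoid(np.exp, 0, 1, 8) - exact))  # O(h^2) error
print(abs(simpson(np.exp, 0, 1, 8) - exact))    # O(h^4) error, much smaller
```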
2 Numerical Solution of Ordinary Differential Equations

2.1 Introduction

In this chapter, we will see some fundamental concepts and numerical techniques to solve ordinary differential equations (ODEs).
Let us consider a nonlinear first-order ODE

$$\frac{dy}{dt} = f(t, y), \tag{51}$$

with initial value $y(0) = y_0$. We will always assume the existence and uniqueness of the solution, and also that f(t, y) has continuous partial derivatives with respect to t and y of as high order as necessary. The purpose of all numerical methods for the solution of this differential equation is to obtain the solution at time $t_{n+1} = t_n + h$, given the solution for $0 \le t \le t_n$.
Let us consider the Taylor series methods and expand the solution at $t_{n+1}$ about the solution at $t_n$

$$y_{n+1} = y_n + h y'_n + \frac{h^2}{2} y''_n + \frac{h^3}{6} y'''_n + \cdots \tag{52}$$

From our differential equation, we have

$$y'_n = f(t_n, y_n). \tag{53}$$

To evaluate the higher-order derivatives we use the chain rule

$$y'' = \frac{dy'}{dt} = \frac{df}{dt} = \frac{\partial f}{\partial t} + \frac{\partial f}{\partial y}\frac{dy}{dt} = f_t + f f_y,$$

$$y''' = \frac{\partial}{\partial t}(f_t + f f_y) + f \frac{\partial}{\partial y}(f_t + f f_y) = f_{tt} + 2 f f_{ty} + f_t f_y + f f_y^2 + f^2 f_{yy}. \tag{54}$$

It is clear that the number of terms increases rapidly, and the method is not very practical for higher than second-order accuracy.
The method based on the first two terms in the expansion is called the forward (or explicit) Euler method:

$$y_{n+1} = y_n + h f(t_n, y_n). \tag{55}$$

One simply starts from the initial condition, $y_0$, and marches forward using the formula to obtain $y_1, y_2, y_3, \ldots$. Since it is a very simple method to analyze, we will study its properties thoroughly. From the Taylor series, it is apparent that the forward Euler method is second-order accurate for one time step. However, the global error for advancing from the initial condition to the final time, $t_f$, is only first-order accurate.
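A minimal forward Euler driver, written as an illustration (not part of the original notes):

```python
def forward_euler(f, y0, t0, tf, h):
    """March y' = f(t, y) from t0 to tf with step h; returns lists of t and y."""
    t, y = t0, y0
    ts, ys = [t], [y]
    while t < tf - 1e-12:
        y = y + h * f(t, y)        # eq. (55)
        t += h
        ts.append(t)
        ys.append(y)
    return ts, ys

# y' = -y, y(0) = 1; exact solution e^{-t}
ts, ys = forward_euler(lambda t, y: -y, 1.0, 0.0, 1.0, 0.1)
print(ys[-1])   # ~0.3487, vs exact e^{-1} ~ 0.3679; global error shrinks like O(h)
```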
Among the more accurate methods that we will see are the Runge-Kutta formulas. With Runge-Kutta methods, the solution at time step $t_{n+1}$ is obtained in terms of $y_n$, $f(t_n, y_n)$, and f(t, y) evaluated at intermediate steps between $t_n$ and $t_{n+1}$ (not including $t_{n+1}$). The higher accuracy is achieved because more information about f is provided due to intermediate evaluations of f. This is in contrast to the Taylor series method, where we provided higher derivatives of f at $t_n$.

Higher accuracy can also be obtained by providing information at times $t < t_n$. That is, the formulas involve $y_{n-1}, y_{n-2}, \ldots$ and $f_{n-1}, f_{n-2}, \ldots$. These methods are called multi-step methods.

We will also distinguish between explicit and implicit methods. The formulas that involve f(t, y) evaluated at $t_{n+1}$ belong to the class of implicit methods. Since f may be a nonlinear function of y, to obtain the solution at each time step, implicit methods usually require the solution of nonlinear equations. Although the computational cost per time step is high, implicit methods offer the advantage of numerical stability.
2.2 Numerical concepts

- Accuracy. The order of a finite difference approximation of a differential equation is the rate at which the global error of the finite difference solution approaches zero as the sizes of the grid spacings approach zero. This corresponds to the global truncation error. There are two types of truncation errors: amplitude errors and phase errors.
- Stiffness. An integration difficulty that can only occur in systems of equations (or equations of order larger than first) whenever the ratio between the largest and smallest eigenvalue of the system is large.
- Stability. When applied to a differential equation that has a bounded solution, a finite difference equation is stable if it produces a bounded solution and is unstable if it produces an unbounded solution. It is quite possible that the numerical solution to a differential equation grows unbounded even though its exact solution is well behaved. Of course, there are cases for which the exact solution grows unbounded, but for our discussion of stability we concentrate only on the cases in which the exact solution is bounded. Given a system of ODEs and a numerical method, in stability analysis we seek conditions in terms of the parameters of the numerical method (mainly the step size h) for which the numerical solution remains bounded. In this context, we have three classes of numerical methods:
  - Stable numerical scheme. The numerical solution does not blow up with any choice of step size;
  - Unstable numerical scheme. The numerical solution blows up with any choice of step size;
  - Conditionally stable numerical scheme. The numerical solution remains bounded with certain choices of step size.
- Convergence. A finite difference method is convergent if the solution of the finite difference equation approaches a limit as the sizes of the grid spacings go to zero. Note that we have no guarantee in general that this limit corresponds to the exact solution of the differential equation (unless we know the solution, in which case there would be no need for a numerical approximation).
- Consistency. A finite difference equation is consistent with a differential equation if the difference between the finite difference equation and the differential equation (i.e. the truncation error) goes to zero as the sizes of the grid spacings go to zero independently.

An important result which connects the last three concepts is known as the Lax-Richtmyer (or Lax equivalence) theorem:
Given a properly posed initial value problem and a finite difference approximation to it that satisfies the consistency condition, stability is the necessary and sufficient condition for the numerical solution to converge to the analytical solution.
2.3 Model differential problem

For convenience of analytical treatment, we will perform the stability analysis on the model initial value differential problem

$$y' = \lambda y, \quad y(0) = y_0, \tag{56}$$

where $\lambda$ is a constant. The solution of this model problem is

$$y = e^{\lambda t} y_0. \tag{57}$$

We allow $\lambda$ to be complex

$$\lambda = \lambda_R + i\lambda_I, \tag{58}$$

with $\lambda_R \le 0$. The generalization to complex $\lambda$ will allow us to readily apply the results of our analysis to systems of ordinary differential equations and partial differential equations. For example, consider the second-order differential equation

$$y'' + \omega^2 y = 0. \tag{59}$$

The exact solution is sinusoidal

$$y = c_1 \cos\omega t + c_2 \sin\omega t. \tag{60}$$

We can rewrite this second-order equation as two first-order equations

$$\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}' = \begin{pmatrix} 0 & 1 \\ -\omega^2 & 0 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}.$$

The eigenvalues of the 2×2 matrix are $\lambda = \pm i\omega$. Diagonalizing the matrix

$$A = \begin{pmatrix} 0 & 1 \\ -\omega^2 & 0 \end{pmatrix}$$

with the matrix of its eigenvectors, S,

$$A = S P S^{-1} \tag{61}$$

leads to the uncoupled set of equations

$$z' = P z, \tag{62}$$

where

$$z = S^{-1} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix},$$

and P is the diagonal matrix with the eigenvalues of A on the diagonal. The uncoupled differential equations for the components of z are

$$z'_1 = i\omega z_1, \quad z'_2 = -i\omega z_2. \tag{63}$$

This simple example illustrates that higher-order differential equations or systems of first-order differential equations can reduce to uncoupled ODEs of the form described before with a complex coefficient. The imaginary part of the coefficient results in oscillatory solutions of the form

$$e^{\pm i\omega t}, \tag{64}$$

and the real part dictates whether the solution grows or decays. In the stability analysis of our model initial value differential problem we will be concerned only with cases where $\lambda$ has zero or negative real part.
2.4 Model difference problem

Analogous to the differential problem, consider the single first-order linear model initial value difference problem

$$y_{n+1} = \sigma y_n, \quad n = 0, 1, 2, \ldots, \tag{65}$$

where $y_0$ is given and $\sigma$ is complex in general. The solution of this problem is

$$y_n = \sigma^n y_0. \tag{66}$$

Note that the solution remains bounded only if $|\sigma| \le 1$.
2.5 Stability of numerical methods

The connection between the exact solution and the difference solution is evident if we evaluate the exact solution at $t_n = nh$, for $n = 0, 1, \ldots$, where $h > 0$. Then

$$y_n \equiv y(t_n) = e^{\lambda t_n} y_0 = e^{\lambda n h} y_0 = \sigma^n y_0, \tag{67}$$

where $\sigma = e^{\lambda h}$. Since we take our exact solution to be bounded, then

$$|\sigma| = |e^{\lambda h}| \le 1 \iff \mathrm{Re}(\lambda h) = \lambda_R h \le 0. \tag{68}$$

In the $\lambda_R h$–$\lambda_I h$ plane, the region of stability of the exact solution is the left half-plane.
Explicit or Forward Euler method

Applying the forward Euler method

$$y_{n+1} = y_n + h f(t_n, y_n) \tag{69}$$

to the model problem leads to

$$y_{n+1} = y_n + h\lambda y_n = (1 + \lambda h) y_n. \tag{70}$$

Thus, the solution at time step n can be written as

$$y_n = (1 + \lambda h)^n y_0. \tag{71}$$

Note that

$$\sigma = 1 + \lambda h. \tag{72}$$

The numerical solution is stable (bounded) if

$$|\sigma| = |1 + \lambda h| = |1 + \lambda_R h + i\lambda_I h| = \sqrt{(1 + \lambda_R h)^2 + (\lambda_I h)^2} \le 1. \tag{73}$$

Note that only a small portion of the left half-plane (which is the region of stability of the analytical solution) is the region of stability for the explicit Euler method. This region is inside the circle

$$(1 + \lambda_R h)^2 + (\lambda_I h)^2 = 1. \tag{74}$$

For any value of $\lambda h$ in the left half-plane and outside this circle, the numerical solution blows up while the exact solution decays. The numerical method is conditionally stable. To have a stable numerical solution, we must reduce the step size h so that $\lambda h$ falls within the circle. If $\lambda$ is real (and negative), then the maximum step size for stability is

$$0 \le h \le \frac{2}{|\lambda|}. \tag{75}$$

The circle is only tangent to the imaginary axis. Therefore, the explicit or forward Euler method is always unstable (irrespective of the step size) for purely imaginary $\lambda$.
If $\lambda$ is real and the numerical solution is unstable, then we must have

$$|1 + \lambda h| > 1, \tag{76}$$

which means that $(1 + \lambda h)$ is negative with magnitude greater than 1. Since

$$y_n = (1 + \lambda h)^n y_0, \tag{77}$$

the numerical solution exhibits oscillations with a change of sign at every time step. This behavior of the numerical solution is a good indication of instability.
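The bound (75) is easy to observe numerically; this sketch (illustrative) applies forward Euler to y' = -10y with steps on either side of h = 2/|λ| = 0.2:

```python
def euler_model(lam, h, n_steps, y0=1.0):
    y = y0
    for _ in range(n_steps):
        y = (1.0 + lam * h) * y   # sigma = 1 + lambda*h, eq. (72)
    return y

lam = -10.0
print(euler_model(lam, 0.18, 50))  # sigma = -0.8: bounded, decays with sign flips
print(euler_model(lam, 0.22, 50))  # sigma = -1.2: oscillates and blows up
```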
Implicit or backward Euler method

The implicit Euler scheme is given by the following formula

$$y_{n+1} = y_n + h f(t_{n+1}, y_{n+1}). \tag{78}$$

Note that in contrast to the explicit Euler scheme, we cannot very easily obtain the solution at the next time step. If f is nonlinear, we must solve a nonlinear algebraic equation at each time step to obtain $y_{n+1}$. Therefore, the computational cost per time step for this scheme is much higher than that for the explicit Euler scheme. However, as we shall see below, the implicit Euler scheme has better stability properties. Applying backward Euler to the model equation, we obtain

$$y_{n+1} = y_n + h\lambda y_{n+1}. \tag{79}$$

Solving for $y_{n+1}$, we have

$$y_{n+1} = (1 - \lambda h)^{-1} y_n, \tag{80}$$

or

$$y_n = \sigma^n y_0, \tag{81}$$

where

$$\sigma = \frac{1}{1 - \lambda h}. \tag{82}$$

The denominator is a complex number and can be written as the product of its modulus and phase factor,

$$\sigma = \frac{1}{A e^{i\theta}}, \tag{83}$$

where

$$A = |1 - \lambda h| = \sqrt{[\mathrm{Re}(1-\lambda h)]^2 + [\mathrm{Im}(1-\lambda h)]^2} = \sqrt{(1 - \lambda_R h)^2 + (\lambda_I h)^2} \ge 1 \tag{84}$$

since $\lambda_R \le 0$, and

$$\theta = \tan^{-1}\frac{\mathrm{Im}(1-\lambda h)}{\mathrm{Re}(1-\lambda h)} = \tan^{-1}\frac{-\lambda_I h}{1 - \lambda_R h}. \tag{85}$$

The modulus of $\sigma$ is

$$|\sigma| = \frac{|\cos\theta - i\sin\theta|}{A} = \frac{\sqrt{\cos^2\theta + \sin^2\theta}}{A} = \frac{1}{A} \le 1. \tag{86}$$

Thus, the implicit or backward Euler scheme is unconditionally stable, and the region of stability in the ($\lambda_R h$, $\lambda_I h$) plane is identical to that of the exact solution. When the region of stability of a difference equation is identical to the region of stability of the differential equation, the finite difference scheme is sometimes referred to as A-stable. Unconditional stability is the usual characteristic of implicit methods. The price is higher computational cost per time step.
Numerical stability does not imply accuracy. A method can be stable but inaccurate. From the stability point of view, our objective is to use the maximum step size h to reach the final destination at time $t = t_f$. Large time steps translate to a lower number of function evaluations and lower computational cost. This may not be the optimum h for acceptable accuracy, but it is optimum for stability.
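For the linear model equation the implicit update (80) is a closed-form division (a general nonlinear f would instead require a Newton or similar solve at each step). A sketch, illustrative only:

```python
lam, h = -10.0, 0.5            # step far beyond the explicit limit 2/|lam| = 0.2
sigma = 1.0 / (1.0 - lam * h)  # eq. (82): sigma = 1/6 here
y = 1.0
for _ in range(4):
    y = sigma * y
print(y)   # (1/6)^4 ~ 7.7e-4: bounded and decaying, as unconditional stability predicts
```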
Runge-Kutta Methods

The order of accuracy of numerical methods can be increased if one supplies additional information about the function f. Runge-Kutta methods introduce points between $t_n$ and $t_{n+1}$, and evaluate f at these intermediate points. The additional function evaluations result in a higher cost per time step, but the accuracy is increased and, as it turns out, better stability properties are also obtained.
We begin by describing the general form of second-order Runge-Kutta formulas for solving

$$y' = f(t, y). \tag{87}$$

The solution at time step $t_{n+1}$ is obtained from

$$y_{n+1} = y_n + \gamma_1 k_1 + \gamma_2 k_2, \tag{88}$$

where the functions $k_1$ and $k_2$ are defined sequentially

$$k_1 = h f(t_n, y_n), \tag{89}$$
$$k_2 = h f(t_n + \alpha h,\ y_n + \beta k_1), \tag{90}$$

and $\alpha$, $\beta$, $\gamma_1$ and $\gamma_2$ are constants to be determined. These constants are determined to ensure the highest order of accuracy for the method. To establish the order of accuracy, let us consider the Taylor series expansion of $y(t_{n+1})$

$$y_{n+1} = y_n + h y'_n + \frac{h^2}{2} y''_n + \cdots \tag{91}$$

We have

$$y'_n = f(t_n, y_n), \tag{92}$$

and using the chain rule we have already obtained

$$y'' = f_t + f f_y, \tag{93}$$

where $f_t$ and $f_y$ are the partial derivatives of f with respect to t and y, respectively. Thus,

$$y_{n+1} = y_n + h f(t_n, y_n) + \frac{h^2}{2}(f_{t_n} + f_n f_{y_n}) + \cdots \tag{94}$$

By taking a two-dimensional Taylor series expansion of $k_2$ we get

$$k_2 = h[f(t_n, y_n) + \alpha h f_{t_n} + \beta k_1 f_{y_n} + \cdots]. \tag{95}$$

Noting that $k_1 = h f(t_n, y_n)$, and substituting it in the solution at time step $t_{n+1}$, we get

$$y_{n+1} = y_n + (\gamma_1 + \gamma_2) h f_n + \gamma_2 \alpha h^2 f_{t_n} + \gamma_2 \beta h^2 f_n f_{y_n} + \cdots \tag{96}$$

If we compare Eqs. (94) and (96), and match coefficients of similar terms, we get

$$\gamma_1 + \gamma_2 = 1, \quad \gamma_2 \alpha = \frac{1}{2}, \quad \gamma_2 \beta = \frac{1}{2}. \tag{97}$$

These are three non-linear equations for the four unknowns ($\gamma_1$, $\gamma_2$, $\alpha$, $\beta$). Using $\alpha$ as a free parameter, we have

$$\gamma_2 = \frac{1}{2\alpha}, \quad \beta = \alpha, \quad \gamma_1 = 1 - \frac{1}{2\alpha}. \tag{98}$$

With the constants chosen, we have a one-parameter family of second-order Runge-Kutta formulas:

$$y_{n+1} = y_n + \left(1 - \frac{1}{2\alpha}\right) k_1 + \frac{1}{2\alpha} k_2, \tag{99}$$

where

$$k_1 = h f(t_n, y_n), \tag{100}$$
$$k_2 = h f(t_n + \alpha h,\ y_n + \alpha k_1). \tag{101}$$

Thus, we have a second-order Runge-Kutta formula for each arbitrary value of $\alpha$. The choice of $\alpha = 1/2$ is made frequently.
Runge-Kutta formulas are often presented in a different but equivalent form. For example, the popular form of the second-order Runge-Kutta formula ($\alpha = 1/2$) is presented in the following (predictor-corrector) format:

$$y^*_{n+1/2} = y_n + \frac{h}{2} f(t_n, y_n), \quad \text{(Forward Euler predictor half-step)} \tag{102}$$
$$y_{n+1} = y_n + h f(t_{n+1/2},\ y^*_{n+1/2}). \quad \text{(Midpoint rule corrector full-step)} \tag{103}$$

Applying this method to the model equation $y' = \lambda y$ we get

$$y_{n+1} = \left(1 + \lambda h + \frac{1}{2}\lambda^2 h^2\right) y_n. \tag{104}$$

For stability, we must have $|\sigma| \le 1$ with $\sigma = 1 + \lambda h + \frac{1}{2}\lambda^2 h^2$.
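The predictor-corrector form (102)-(103) translates directly into code; this sketch is an illustration:

```python
def rk2_midpoint(f, y0, t0, tf, h):
    """Second-order Runge-Kutta (alpha = 1/2, midpoint form), eqs. (102)-(103)."""
    t, y = t0, y0
    while t < tf - 1e-12:
        y_half = y + 0.5 * h * f(t, y)        # forward Euler predictor
        y = y + h * f(t + 0.5 * h, y_half)    # midpoint corrector
        t += h
    return y

print(rk2_midpoint(lambda t, y: -y, 1.0, 0.0, 1.0, 0.1))  # ~0.3685 vs e^{-1}, O(h^2)
```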
The most widely used Runge-Kutta method is the fourth-order formula

$$y_{n+1} = y_n + \frac{1}{6} k_1 + \frac{1}{3}(k_2 + k_3) + \frac{1}{6} k_4, \tag{105}$$

where

$$k_1 = h f(t_n, y_n), \tag{106}$$
$$k_2 = h f(t_n + \tfrac{1}{2}h,\ y_n + \tfrac{1}{2}k_1), \tag{107}$$
$$k_3 = h f(t_n + \tfrac{1}{2}h,\ y_n + \tfrac{1}{2}k_2), \tag{108}$$
$$k_4 = h f(t_n + h,\ y_n + k_3). \tag{109}$$

Note that four function evaluations are required at each time step. The fourth-order Runge-Kutta scheme can also be written in the following (predictor-corrector) format:

$$y^*_{n+1/2} = y_n + \frac{h}{2} f(t_n, y_n), \quad \text{(Forward Euler predictor half-step)} \tag{110}$$
$$y^{**}_{n+1/2} = y_n + \frac{h}{2} f(t_{n+1/2},\ y^*_{n+1/2}), \quad \text{(Backward Euler corrector half-step)} \tag{111}$$
$$y^*_{n+1} = y_n + h f(t_{n+1/2},\ y^{**}_{n+1/2}), \quad \text{(Midpoint rule predictor full-step)} \tag{112}$$
$$y_{n+1} = y_n + \frac{h}{6}\left[ f(t_n, y_n) + 2 f(t_{n+1/2}, y^*_{n+1/2}) + 2 f(t_{n+1/2}, y^{**}_{n+1/2}) + f(t_{n+1}, y^*_{n+1}) \right]. \quad \text{(Simpson's rule corrector full-step)} \tag{113}$$

Applying this method to the model equation $y' = \lambda y$ we get

$$y_{n+1} = \left(1 + \lambda h + \frac{1}{2}\lambda^2 h^2 + \frac{1}{6}\lambda^3 h^3 + \frac{1}{24}\lambda^4 h^4\right) y_n. \tag{114}$$

For stability, we must have $|\sigma| \le 1$ with $\sigma = 1 + \lambda h + \frac{1}{2}\lambda^2 h^2 + \frac{1}{6}\lambda^3 h^3 + \frac{1}{24}\lambda^4 h^4$.
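A standard implementation of (105)-(109), given as an illustrative sketch:

```python
def rk4_step(f, t, y, h):
    """One step of the classical fourth-order Runge-Kutta method, eqs. (105)-(109)."""
    k1 = h * f(t, y)
    k2 = h * f(t + 0.5 * h, y + 0.5 * k1)
    k3 = h * f(t + 0.5 * h, y + 0.5 * k2)
    k4 = h * f(t + h, y + k3)
    return y + (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

t, y, h = 0.0, 1.0, 0.1
for _ in range(10):
    y = rk4_step(lambda t, y: -y, t, y, h)
    t += h
print(y)   # ~0.367879, matches e^{-1} to O(h^4)
```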
Multi-Step Methods

In the case of Runge-Kutta formulas, higher-order accuracy was obtained by several function evaluations. Higher-order accuracy can also be achieved if one uses data from prior to $t_n$; that is, if the solution y and/or f at $t_{n-1}, t_{n-2}, \ldots$ are used. This is another way of providing additional information about f. Methods that use information from prior to step n are called multi-step schemes. The apparent price for the higher-order accuracy is the use of additional computer memory. Multi-step methods are not self-starting. Usually another method, such as explicit Euler, is used to start the calculations for the first or the first few time steps.
The most general m-step method is

$$\sum_{i=0}^{m} a_i y_{n+1-i} - h \sum_{i=0}^{m} b_i f(t_{n+1-i}, y_{n+1-i}) = 0, \tag{115}$$

and is defined by the number of steps m and the parameters $a_i$ and $b_i$. Without loss of generality we can take $a_0 = 1$. If $b_0 = 0$ the method is explicit; otherwise it is implicit. Note that multi-step algorithms require only one new function evaluation for each step. There are many multi-step methods. They typically come in families, each member of the family possessing a different order of accuracy. Three of the more popular families are Adams-Bashforth, Adams-Moulton, and Gear.
See Tables....
A classical multi-step method is the leapfrog method:

$$y_{n+1} = y_{n-1} + 2h f(t_n, y_n). \tag{116}$$

This method is derived by applying the second-order central difference formula for $y'$ in the equation $y' = f(t, y)$.
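A sketch of the leapfrog method (116) for the model problem, started with one explicit Euler step since the scheme is not self-starting (illustrative only):

```python
def leapfrog(f, y0, t0, tf, h):
    """Leapfrog (116); the first step uses explicit Euler as a starter."""
    t = t0
    y_prev = y0
    y = y0 + h * f(t0, y0)        # starter step
    t += h
    while t < tf - 1e-12:
        y_prev, y = y, y_prev + 2 * h * f(t, y)
        t += h
    return y

# y' = -y is a decaying problem; leapfrog is only neutrally stable (it suits
# imaginary eigenvalues), so over long integrations a parasitic mode appears.
print(leapfrog(lambda t, y: -y, 1.0, 0.0, 1.0, 0.01))  # ~0.3679 over this short run
```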
System of ordinary differential equations

Euler and Runge-Kutta methods. See Dr. Goodwine's notes.
Boundary Value Problems

When data associated with a differential equation are prescribed at more than one value of the independent variable, the problem is a boundary value problem. In initial value problems, all the data (y(0), y'(0), ...) are prescribed at one value of the independent variable (in this case at t = 0). To have a boundary value problem, we must have at least a second-order differential equation

$$y'' = f(x, y, y'), \quad y(0) = y_0, \quad y(L) = y_L, \tag{117}$$

where f is an arbitrary function. There are two techniques for solving boundary value problems:

Shooting Method. Shooting is a technique which uses standard methods for initial value problems such as Runge-Kutta methods.
Direct Methods. These methods are based on straightforward discretization of the derivatives in the differential equation and solving the resulting system of algebraic equations.

Shooting Method

Let's reduce the above second-order ODE into two first-order equations by letting u = y and v = y', so that

$$u' = v, \tag{118}$$
$$v' = f(x, u, v), \tag{119}$$

with end conditions

$$u(0) = y_0, \quad u(L) = y_L. \tag{120}$$

To solve this system one needs one condition for each of the unknowns u and v, rather than two for one and none for the other. Therefore, we use a guess for v(0) and integrate the equations to x = L. At this point, u(L) is compared to $y_L$; if the agreement is not satisfactory (most likely it will not be), another guess is made for v(0), and the iterative process is repeated.
For linear problems this iterative process is very systematic; only two iterations are needed. Let us consider the general linear equation

$$y''(x) + A(x) y'(x) + B(x) y(x) = f(x), \quad y(0) = y_0, \quad y(L) = y_L.$$

Let us denote two solutions of the equation as $y_1(x)$ and $y_2(x)$, which were obtained using two initial guesses for y'(0). Since the differential equation is linear, the exact solution can be formed as a linear combination of $y_1$ and $y_2$

$$y(x) = c_1 y_1(x) + c_2 y_2(x). \tag{121}$$

Since $y_1(0) = y_2(0) = y(0)$, it follows that

$$c_1 + c_2 = 1. \tag{122}$$

Next, we require that $y(L) = y_L$, which, in turn, requires that

$$c_1 y_1(L) + c_2 y_2(L) = y_L. \tag{123}$$

We have two equations for $c_1$ and $c_2$; the solution is

$$c_1 = \frac{y_L - y_2(L)}{y_1(L) - y_2(L)}, \tag{124}$$

$$c_2 = \frac{y_1(L) - y_L}{y_1(L) - y_2(L)}. \tag{125}$$

This approach also works for higher-order linear systems. In general, if n conditions are specified at the final point, n + 1 solutions must be generated to obtain the final solution by superposition. For this reason, if the numbers of conditions at the two endpoints are unequal, it is more efficient to start the integration from the side where the larger number of conditions is given.
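This two-shot superposition is easy to exercise; the sketch below (an illustration, not from the notes) solves y'' = -y on [0, π/2] with y(0) = 0, y(π/2) = 1, whose solution is sin x, and recovers the exact initial slope via equations (124)-(125):

```python
import math

def integrate_ivp(slope0, L, n=200):
    """Integrate u' = v, v' = -u from x = 0 with u(0)=0, v(0)=slope0 (RK4); return u(L)."""
    h = L / n
    u, v = 0.0, slope0
    for _ in range(n):
        k1u, k1v = v, -u
        k2u, k2v = v + 0.5*h*k1v, -(u + 0.5*h*k1u)
        k3u, k3v = v + 0.5*h*k2v, -(u + 0.5*h*k2u)
        k4u, k4v = v + h*k3v, -(u + h*k3u)
        u += h*(k1u + 2*k2u + 2*k3u + k4u)/6
        v += h*(k1v + 2*k2v + 2*k3v + k4v)/6
    return u

L, yL = math.pi/2, 1.0
u1, u2 = integrate_ivp(0.5, L), integrate_ivp(2.0, L)   # two guessed slopes
c1 = (yL - u2) / (u1 - u2)                               # eq. (124)
c2 = 1.0 - c1                                            # eq. (122)
print(c1*0.5 + c2*2.0)   # recovered initial slope ~1.0 = cos(0)
```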
Unfortunately, when f is non-linear, we may have to perform several iterations to obtain the solution at L to within a prescribed accuracy. We shall demonstrate the solution procedure using the secant method. Consider y(L) as a (non-linear) function of y'(0).
Suppose that using two initial guesses, $y'_1(0)$ and $y'_2(0)$, we obtained the solutions $y_1(x)$ and $y_2(x)$ with the values at L denoted by $y_1(L)$ and $y_2(L)$, such that $y_1(L) < y_L < y_2(L)$. With the secant method, we form the straight line between the points $(y'_1(0), y_1(L))$ and $(y'_2(0), y_2(L))$. The equation for this line is

$$y'(0) = y'_2(0) + m[y(L) - y_2(L)], \tag{126}$$

where $m = \frac{y'_2(0) - y'_1(0)}{y_2(L) - y_1(L)}$ is the reciprocal of the slope of the line. The next guess is the value for y'(0) at which the above straight-line approximation to the function predicts $y_L$. That is the intersection of the horizontal line from $y_L$ with the straight line, which yields

$$y'_3(0) = y'_2(0) + m[y_L - y_2(L)]. \tag{127}$$

In general, the successive iterates are obtained from the formula

$$y'_{\nu+1}(0) = y'_{\nu}(0) + m_{\nu-1}[y_L - y_{\nu}(L)], \tag{128}$$

where $\nu = 2, 3, \ldots$ is the iteration index, and

$$m_{\nu-1} = \frac{y'_{\nu}(0) - y'_{\nu-1}(0)}{y_{\nu}(L) - y_{\nu-1}(L)} \tag{129}$$

are the reciprocals of the slopes of the successive straight lines (secants). The iteration process is continued until $y_{\nu}(L)$ is sufficiently close to $y_L$.
A different procedure for obtaining new values of y'(0) = v(0) is to replace the secant method by Newton's method. This iteration method generally converges quadratically.
Let us consider the system

$$u' = v, \tag{130}$$
$$v' = f(x, u, v), \tag{131}$$

with the boundary conditions

$$u(0) = y_0, \tag{132}$$
$$u(L) = y_L. \tag{133}$$

We take as an initial guess v(0) = s. The problem is to find s such that the solution of the initial value problem satisfies the outer boundary condition. That is, if we denote the solution of the initial value problem by [u(x, s), v(x, s)], then we seek s such that

$$\Phi(s) \equiv u(L, s) - y_L = 0. \tag{134}$$

We use the Newton method to solve this equation. If $s_0$ is a guess for a root of the equation $\Phi(s) = 0$, a better guess, $s_1$, is usually obtained by extrapolation to the axis of the tangent to $z = \Phi(s)$ at $s = s_0$, and so on. This yields the following iteration process:

$$s_{n+1} = s_n - \frac{\Phi(s_n)}{\frac{d\Phi}{ds}(s_n)} = s_n - \frac{u(L, s_n) - y_L}{u_s(L, s_n)}, \quad n = 0, 1, 2, \ldots \tag{135}$$

We can see that $s_{n+1} = s_n$ only if $\Phi(s_n) = 0$, and then the condition at the right end is satisfied exactly. In general, this will not occur for any finite n. We iterate until $|s_{n+1} - s_n| \le \epsilon$ for some sufficiently small $\epsilon$. Therefore, the condition at the right end is also approximately satisfied.
To obtain the derivative of u with respect to s, we take the derivatives of the initial value problem with respect to s. By using

$$U = \frac{\partial u}{\partial s}, \quad V = \frac{\partial v}{\partial s}, \quad F = \frac{\partial f}{\partial s} = U \frac{\partial f}{\partial u} + V \frac{\partial f}{\partial v}, \tag{136}$$

we obtain the following variational equations

$$U' = V, \tag{137}$$
$$V' = F(x, u, v, U, V), \tag{138}$$

with initial conditions

$$U(0) = 0, \quad V(0) = 1. \tag{139}$$

The procedure is as follows. An initial guess of $v(0) = s_0$ is made. Then the initial value problem is solved, followed by the solution of the variational problem. Then $s_1$ can be computed and $|s_1 - s_0|$ can be checked to see if it is less than a specified error tolerance $\epsilon$, in which case we are done. If $|s_1 - s_0| \ge \epsilon$, then we use the next guess $v(0) = s_1$ to repeat the whole procedure. The integration methods for the initial value problem and the variational problem are arbitrary.
If y(L) is a very sensitive function of y'(0), there will be difficulties in obtaining a converged solution. In these cases it may be a good idea to integrate from the opposite direction, by guessing a value of y'(L) and iterating until y(0) is sufficiently close to $y_0$.
Direct Methods

With direct methods, one simply approximates the derivatives in the differential equation with a finite-difference (or some other) approximation of suitable accuracy. One obtains a system of algebraic equations for the dependent variables at the node points. For linear differential equations, the system is a linear system of algebraic equations; for non-linear equations, it is a non-linear system of algebraic equations.
A second-order approximation to the general linear equation above yields

$$\frac{y_{j+1} - 2y_j + y_{j-1}}{h^2} + A_j \frac{y_{j+1} - y_{j-1}}{2h} + B_j y_j = f_j, \quad j = 1, 2, \ldots, N-1, \tag{140}$$

with $y_0$ and $y_N = y_L$ given by the boundary conditions, and where $h = x_j - x_{j-1}$. Rearranging,

$$\alpha_j y_{j+1} + \beta_j y_j + \gamma_j y_{j-1} = f_j, \quad j = 1, 2, \ldots, N-1, \tag{141}$$

where

$$\alpha_j = \left( \frac{1}{h^2} + \frac{A_j}{2h} \right), \quad \beta_j = \left( B_j - \frac{2}{h^2} \right), \quad \gamma_j = \left( \frac{1}{h^2} - \frac{A_j}{2h} \right). \tag{142}$$

This is a tridiagonal system of linear algebraic equations. The only special treatment comes at the points next to the boundaries, j = 1 and j = N − 1. At j = 1, we have

$$\alpha_1 y_2 + \beta_1 y_1 = f_1 - \gamma_1 y_0. \tag{143}$$
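As an illustration, the sketch below assembles the tridiagonal system (141)-(142) for y'' + y = 0 with y(0) = 0, y(π/2) = 1 (so A = 0, B = 1, f = 0) and compares against the exact solution sin x (numpy assumed; a full matrix solve is used only for brevity):

```python
import numpy as np

L, N = np.pi / 2, 50
h = L / N
x = np.linspace(0.0, L, N + 1)
y0, yL = 0.0, 1.0

# Interior unknowns y_1 .. y_{N-1}; here A_j = 0, B_j = 1, f_j = 0.
alpha = 1 / h**2           # coefficient of y_{j+1}
beta = 1.0 - 2 / h**2      # coefficient of y_j
gamma = 1 / h**2           # coefficient of y_{j-1}

M = np.zeros((N - 1, N - 1))
rhs = np.zeros(N - 1)
for j in range(N - 1):
    M[j, j] = beta
    if j > 0:
        M[j, j - 1] = gamma
    if j < N - 2:
        M[j, j + 1] = alpha
rhs[0] -= gamma * y0       # boundary terms moved to the right-hand side, eq. (143)
rhs[-1] -= alpha * yL

y_inner = np.linalg.solve(M, rhs)
print(abs(y_inner - np.sin(x[1:-1])).max())   # small O(h^2) error
```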
3 Numerical Solution of Partial Differential Equations

3.1 Introduction

Partial differential equations (PDEs) model a wide range of physical phenomena. PDEs can be of first order or higher order, there may be a single equation or a system of equations, they may be linear or nonlinear, and they may be homogeneous or inhomogeneous.
The distinction between initial and boundary value problems, which is so important in the numerical approach to the solution of ODEs, is more complicated for PDEs. The reason is that there are now two or more independent variables, so it is possible for a problem to be an initial value problem with respect to one variable and, simultaneously, a boundary value problem with respect to another variable. It is possible to have pure initial value problems, pure boundary value problems, and mixed initial-boundary value problems.
First-order PDEs occur only occasionally in physics and engineering problems. Systems of first-order equations are more common, but they can be discussed along with higher-order equations. Many PDEs occurring in applications are second order in at least one of the independent variables; fourth-order equations arise in mechanics of solid bodies (and elsewhere).
The classification scheme for PDEs depends on the nature of their characteristics. In the case of two independent variables, characteristics are lines in the plane of the independent variables (in higher dimensions the characteristics become surfaces) along which signals can propagate, and they are also the locations of possible discontinuities in the solution of the equation.
PDEs can be classified into three kinds: hyperbolic, parabolic and elliptic.

3.1.1 Hyperbolic

A hyperbolic equation possesses only families of real characteristics. Physical systems that are governed by hyperbolic equations are ones in which signals propagate at finite speed or over a finite region. A good example of study is the first-order wave equation

$$\frac{\partial \phi}{\partial t} + c \frac{\partial \phi}{\partial x} = 0. \tag{144}$$

We use the terms space and time coordinates even though there are cases of interest in which all the coordinates are in fact spatial. Hyperbolic equations are always posed in domains that extend to infinity in the timelike coordinate and are thus open in this direction. The spatial coordinate may or may not be bounded; there may also be boundary conditions (one at each boundary); otherwise, we have a pure initial value problem. Note that since our model equation is first order in the spacelike coordinate, only one boundary condition can be applied at most, so that the spacelike coordinate is also open.

3.1.2 Parabolic

Parabolic equations have only one independent family of real characteristics. The model parabolic equation we will study is the diffusion equation

$$\frac{\partial \phi}{\partial t} = \alpha \frac{\partial^2 \phi}{\partial x^2}. \tag{145}$$

The initial and boundary conditions that are applied to parabolic equations are typically similar to those for hyperbolic equations. The domain of solution is open in the time dimension, and the spatial domain may be open or closed. In the case of a closed spatial domain we require two boundary conditions for our model equation.

3.1.3 Elliptic

Elliptic equations do not have real characteristics. The model equation we will study is the Poisson equation

$$\frac{\partial^2 \phi}{\partial x^2} + \frac{\partial^2 \phi}{\partial y^2} = f(x, y). \tag{146}$$

In elliptic problems every point in the solution domain is affected by disturbances at every other point. This fact makes elliptic problems particularly difficult to solve. The proper domain for the solution of an elliptic equation is normally a closed region. Furthermore, it can be shown that necessary and sufficient boundary conditions are provided by giving one datum (the value of the dependent variable, its normal derivative, or some linear combination of the two) at each point on the boundary.
3.2 Difference Schemes as Banded Matrices

A difference scheme is usually written as a point operator. The three-point central difference formula for a second derivative is given by

$$\left( \frac{\partial^2 \phi}{\partial x^2} \right)_i \approx (\delta_{xx}\phi)_i = \frac{1}{\Delta x^2}(\phi_{i-1} - 2\phi_i + \phi_{i+1}), \tag{147}$$

where the index i refers to the location where the dependent variable is evaluated in the uniformly spaced x-direction. This difference scheme can be expressed as a matrix operator once the boundary conditions are specified. If $\phi$ is specified at an endpoint, the boundary condition is said to be Dirichlet; a Neumann boundary condition corresponds to the specification of $\partial\phi/\partial x$ at an endpoint; finally, a periodic boundary condition can be specified at the endpoints. The matrix representation for parabolic and hyperbolic equations with the three kinds of boundary conditions can now be written as the system of first-order ODEs

$$\frac{d\vec{\phi}}{dt} = A\vec{\phi} + \vec{f}. \tag{148}$$

The ability to express finite difference schemes in vector-matrix form is absolutely essential to our subsequent analysis. In their matrix forms, finite difference, finite volume, and finite element schemes all appear similar, and much of their evaluation follows the same pattern. For instance, periodic banded matrices come from the use of periodic boundary conditions. It can be easily shown that for the model diffusion equation we have

$$A = \frac{\alpha}{\Delta x^2} B_p[1, -2, 1]. \tag{149}$$

The 1's at the top right and bottom left corners enter because of the wrap-around property of periodicity. Periodic banded matrices are designated by the subscript p for this reason.
3.3 Semi-Discretization and Matrix Stability Analysis

PDEs can now be readily converted to a system of ODEs by using finite difference approximations for the derivatives in all but one of the independent variables, and by using the matrix formulation given above.
Let us consider the following diffusion equation:

$$\frac{\partial \phi}{\partial t} = \alpha \frac{\partial^2 \phi}{\partial x^2}, \tag{150}$$

with initial and boundary conditions

$$\phi(x, 0) = g(x), \quad \text{and} \quad \phi(0, t) = \phi(L, t) = 0. \tag{151}$$

We discretize the coordinate x with N + 1 uniformly spaced grid points

$$x_j = x_{j-1} + \Delta x, \quad x_0 = 0, \quad x_N = L, \quad j = 1, 2, \ldots, N. \tag{152}$$

Note that $\Delta x = L/N$ and we can also write $x_j = j\Delta x$. Using the second-order central difference approximation for the second derivative in x results in

$$\frac{d\phi_j}{dt} = \alpha \frac{\phi_{j+1} - 2\phi_j + \phi_{j-1}}{\Delta x^2}, \quad j = 1, 2, \ldots, N-1, \tag{153}$$

where $\phi_j(t) = \phi(x_j, t)$. This is a system of ordinary differential equations which can be written in matrix form as

$$\frac{d\vec{\phi}}{dt} = A\vec{\phi}, \tag{154}$$

where the $\phi_j$'s are the elements of the vector $\vec{\phi}$, and A is the (N−1)×(N−1) tridiagonal matrix

$$A = \frac{\alpha}{\Delta x^2} B[1, -2, 1]. \tag{155}$$

This is possible since A is a banded matrix. Note that higher-order space-derivative approximations can be formed by increasing the bandwidth of the matrix.
The result of the discretization of our diffusion equation is a system of ordinary differential equations, and it can be solved using any of the numerical methods such as Runge-Kutta formulas or multi-step methods. However, when dealing with systems, we have to be concerned about stiffness. To investigate stiffness, we have to study the eigenvalue structure of the matrix A. Eigenvalues of A can be obtained from a known formula for the eigenvalues of a tridiagonal matrix with constant entries. Let B[a, b, c] be an (N−1)×(N−1) tridiagonal matrix. It can be shown that the eigenvalues of B are

$$\lambda_j = b + 2\sqrt{ac}\, \cos\theta_j, \quad \theta_j = \frac{j\pi}{N}, \quad j = 1, 2, \ldots, N-1. \tag{156}$$

Therefore, the eigenvalues of A are

$$\lambda_j = -\frac{2\alpha}{\Delta x^2}\left[ 1 - \cos\left(\frac{j\pi}{N}\right) \right]. \tag{157}$$

The eigenvalue with the smallest magnitude is

$$\lambda_1 = -\frac{2\alpha}{\Delta x^2}\left[ 1 - \cos\left(\frac{\pi}{N}\right) \right]. \tag{158}$$

For large N, the series expansion for cos(π/N) converges rapidly

$$\cos\left(\frac{\pi}{N}\right) = 1 - \frac{1}{2!}\left(\frac{\pi}{N}\right)^2 + \frac{1}{4!}\left(\frac{\pi}{N}\right)^4 - \cdots \tag{159}$$

Retaining the first two terms in the expansion results in

$$\lambda_1 \approx -\frac{\alpha\pi^2}{N^2 \Delta x^2}. \tag{160}$$

Also, for large N, we have

$$\lambda_{N-1} \approx -\frac{4\alpha}{\Delta x^2}. \tag{161}$$

Therefore, the ratio of the eigenvalue with the largest modulus to the eigenvalue with the smallest modulus is

$$\frac{|\lambda_{N-1}|}{|\lambda_1|} \approx \left(\frac{2N}{\pi}\right)^2. \tag{162}$$

Clearly, for large N the system is stiff. Notice that all the eigenvalues of A are real and negative. Standard decoupling of the equations using the matrix S of eigenvectors yields

$$\frac{d\vec{\psi}}{dt} = \Lambda \vec{\psi}, \tag{163}$$

where $\Lambda = S^{-1} A S$ is a diagonal matrix with the eigenvalues of A on the diagonal and $\vec{\psi} = S^{-1}\vec{\phi}$. Since the equations are uncoupled, the solutions can be obtained more easily

$$\psi_j(t) = e^{\lambda_j t} \psi_j(0). \tag{164}$$

Thus, negative eigenvalues result in the decay of the solution, which is expected for the heat equation.
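The eigenvalue formula (157) and the stiffness ratio (162) can be checked directly (illustrative sketch, numpy assumed):

```python
import numpy as np

alpha, L, N = 1.0, 1.0, 50
dx = L / N
# (N-1)x(N-1) tridiagonal matrix B[1, -2, 1] scaled by alpha/dx^2, eq. (155)
A = (alpha / dx**2) * (np.diag(-2.0 * np.ones(N - 1))
                       + np.diag(np.ones(N - 2), 1)
                       + np.diag(np.ones(N - 2), -1))
lam = np.sort(np.linalg.eigvalsh(A))
print(lam[-1])                                 # smallest magnitude ~ -alpha*pi^2/L^2
print(lam[0])                                  # largest magnitude ~ -4*alpha/dx^2
print(lam[0] / lam[-1], (2 * N / np.pi)**2)    # stiffness ratio vs eq. (162)
```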
Now, consider the model hyperbolic partial differential equation

$$\frac{\partial \phi}{\partial t} + c \frac{\partial \phi}{\partial x} = 0, \tag{165}$$

with initial and boundary conditions

$$\phi(x, 0) = g(x), \quad \text{and} \quad \phi(0, t) = 0. \tag{166}$$

Semi-discretization with the second-order central difference formula leads to

$$\frac{d\phi_j}{dt} + c \frac{\phi_{j+1} - \phi_{j-1}}{2\Delta x} = 0. \tag{167}$$

In matrix notation we have

$$\frac{d\vec{\phi}}{dt} = -\frac{c}{2\Delta x} B[-1, 0, 1]\vec{\phi}. \tag{168}$$

Thus, the eigenvalues of this matrix are

$$\lambda_j = -\frac{c}{\Delta x}\left[ i \cos\left(\frac{j\pi}{N+1}\right) \right] = -i\omega_j. \tag{169}$$

Then, the eigenvalues of the matrix resulting from semi-discretization of our hyperbolic partial differential equation are purely imaginary. The temporal part of the solution is of the form $e^{-i\omega_j t}$ with

$$\omega_j = \frac{c}{\Delta x} \cos\left(\frac{j\pi}{N+1}\right), \tag{170}$$

which has an oscillatory (non-decaying) character. Our diffusion and hyperbolic equations are examples of extreme cases, one with a decaying solution (negative eigenvalues) and the other with oscillatory behavior (imaginary eigenvalues). These two examples and the associated discretizations provide support for the use of the model equation $y' = \lambda y$ for partial differential equations. The case with $\lambda$ real and negative is a model for the diffusion PDE, and the case with purely imaginary $\lambda$ is a model for the hyperbolic PDE. Thus, in applying numerical methods to these partial differential equations, the same issues discussed for ODEs should be considered. For example, application of the explicit Euler scheme to our hyperbolic equation will lead to unconditionally unstable numerical solutions, and application of the same scheme to our diffusion equation is conditionally stable. In the latter case, the maximum time step is obtained from the requirement that

$$|\sigma_j| = |1 + \Delta t\, \lambda_j| \le 1, \quad j = 1, 2, \ldots, N-1, \tag{171}$$

which leads to

$$\Delta t \le \frac{2}{|\lambda|_{max}}, \tag{172}$$

where $|\lambda|_{max}$ is the magnitude of the eigenvalue with the largest modulus of the matrix obtained from semi-discretization of our diffusion equation. Substituting $\lambda_{N-1}$ leads to

$$\Delta t \le \frac{\Delta x^2}{2\alpha}. \tag{173}$$

This is a severe restriction on the time step. It implies that increasing the spatial accuracy (reducing $\Delta x$) must be accompanied by a significant reduction in the time step.
The stability analysis performed above uses the eigenvalues of the matrix obtained from semi-discretization of the PDE at hand. We will refer to this type of analysis as matrix stability analysis. Since boundary conditions are implemented in the semi-discretization, their effects are accounted for in the matrix stability analysis.
3.4 von Neumann Stability Analysis

In virtually all cases, numerical stability problems arise solely from (full) discretization of the PDEs and not from the boundary conditions. von Neumann's stability analysis is a readily implementable analytical procedure for determining the stability properties of a numerical method applied to a PDE; it does not take into account the effects of boundary conditions. In fact, it is assumed that the spatial domain is infinite or, for finite domains, that the boundary conditions are periodic. The technique works for linear constant-coefficient differential equations.
We will apply von Neumann's technique to the fully discrete equation

$$\phi_j^{(n+1)} = \phi_j^{(n)} + \frac{\alpha \Delta t}{\Delta x^2}\left( \phi_{j+1}^{(n)} - 2\phi_j^{(n)} + \phi_{j-1}^{(n)} \right). \tag{174}$$

This equation results from approximating the spatial derivative in our diffusion equation with a second-order central difference and using explicit Euler time advancement (FTCS).
The key part of von Neumann's analysis is assuming a solution of the form

$$\phi_j^{(n)} \sim \sigma^n e^{ikx_j}. \tag{175}$$

Substituting in the fully discrete equation results in

$$\sigma^{n+1} e^{ikx_j} = \sigma^n e^{ikx_j} + \frac{\alpha \Delta t}{\Delta x^2} \sigma^n \left( e^{ikx_{j+1}} - 2e^{ikx_j} + e^{ikx_{j-1}} \right). \tag{176}$$

Noting that

$$x_{j \pm 1} = x_j \pm \Delta x, \tag{177}$$

and dividing both sides by $\sigma^n e^{ikx_j}$ leads to

$$\sigma = 1 + \frac{2\alpha \Delta t}{\Delta x^2}\left[ \cos(k\Delta x) - 1 \right] = 1 - \frac{4\alpha \Delta t}{\Delta x^2} \sin^2\left(\frac{k\Delta x}{2}\right). \tag{178}$$

For stability, we must have $|\sigma| \le 1$; that is

$$\left| 1 - \frac{4\alpha \Delta t}{\Delta x^2} \sin^2\left(\frac{k\Delta x}{2}\right) \right| \le 1. \tag{179}$$

From this we have two inequalities; one is always satisfied, and the other can be rewritten as

$$\frac{4\alpha \Delta t}{\Delta x^2} \sin^2\left(\frac{k\Delta x}{2}\right) \le 2, \tag{180}$$

or

$$\Delta t \le \frac{\Delta x^2}{2\alpha \sin^2(k\Delta x/2)}. \tag{181}$$

The most restrictive case occurs when $\sin^2(k\Delta x/2) = 1$. Thus, the maximum time step is given by

$$\Delta t \le \frac{\Delta x^2}{2\alpha}. \tag{182}$$

This is identical to the expression obtained using matrix stability analysis.
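A quick numerical confirmation of the bound (182) for FTCS (illustrative sketch, numpy assumed):

```python
import numpy as np

def ftcs(alpha, dx, dt, steps, N=50):
    """FTCS update (174) with homogeneous Dirichlet BCs; returns max |phi|."""
    x = np.linspace(0.0, 1.0, N + 1)
    phi = np.sin(np.pi * x)                  # smooth initial condition
    r = alpha * dt / dx**2
    for _ in range(steps):
        phi[1:-1] = phi[1:-1] + r * (phi[2:] - 2 * phi[1:-1] + phi[:-2])
    return np.abs(phi).max()

alpha, dx = 1.0, 1.0 / 50
dt_max = dx**2 / (2 * alpha)                 # eq. (182)
print(ftcs(alpha, dx, 0.9 * dt_max, 2000))   # decays: stable
print(ftcs(alpha, dx, 1.1 * dt_max, 2000))   # roundoff-seeded sawtooth grows huge: unstable
```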
3.5 Modified Wave Number Analysis

Modified wave number analysis is very similar to von Neumann analysis; in many ways it is more straightforward. It is intended to expand the range of applicability of what we have learned about the stability properties of a numerical method applied to ODEs to the application of the same numerical method to PDEs.
Consider our diffusion equation. Assuming a solution of the form

$$\phi(x, t) = \psi(t) e^{ikx}, \tag{183}$$

and substituting in our diffusion equation, leads to

$$\frac{d\psi}{dt} = -\alpha k^2 \psi. \tag{184}$$

In the assumed form of the solution, k is the wave number. In practice, one uses a finite difference scheme to approximate the spatial derivative. For example, using the second-order central difference we have

$$\frac{d\phi_j}{dt} = \alpha \frac{\phi_{j+1} - 2\phi_j + \phi_{j-1}}{\Delta x^2}, \quad j = 1, 2, \ldots, N-1. \tag{185}$$

Let

$$\phi(x_j, t) = \phi_j(t) = \psi(t) e^{ikx_j}. \tag{186}$$

Substituting and dividing by $e^{ikx_j}$ leads to

$$\frac{d\psi}{dt} = -\frac{4\alpha}{\Delta x^2} \sin^2\left(\frac{k\Delta x}{2}\right) \psi, \tag{187}$$

or

$$\frac{d\psi}{dt} = -\alpha k'^2 \psi, \tag{188}$$

where

$$k'^2 = \frac{4}{\Delta x^2} \sin^2\left(\frac{k\Delta x}{2}\right) = \frac{4}{\Delta x^2}\left[ \left(\frac{k\Delta x}{2}\right)^2 - \frac{1}{3}\left(\frac{k\Delta x}{2}\right)^4 + \frac{2}{45}\left(\frac{k\Delta x}{2}\right)^6 - \cdots \right]. \tag{189}$$
k' is called the modified wave number. If we use any other finite difference scheme, we would also arrive at an equation of the form d\psi/dt = -\alpha k'^2 \psi, but with a different modified wave number. Thus, each finite-difference scheme has a distinct modified wave number associated with it.
The equation d\psi/dt = -\alpha k'^2 \psi is identical to the model equation y' = \lambda y, with \lambda = -\alpha k'^2. We studied the stability properties of various numerical methods with respect to this model equation. Using the modified wave number analysis, we can readily obtain the stability properties of a time advancement method applied to a PDE. All we have to do is replace \lambda with -\alpha k'^2 in our ODE analysis.
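As a quick numerical check (a sketch added here, not from the notes), one can compare k'^2 from Eq. (189) with the exact k^2; the two agree for well-resolved modes (small k\Delta x) and depart as k\Delta x approaches \pi:

import numpy as np

dx = 0.1
k = np.linspace(0.5, np.pi / dx, 6)                 # sample wave numbers up to pi/dx
k_mod_sq = 4.0 / dx**2 * np.sin(k * dx / 2.0)**2    # modified wave number squared
for ki, km in zip(k, k_mod_sq):
    print(f"k*dx = {ki * dx:5.2f}:   k^2 = {ki**2:9.2f}   k'^2 = {km:9.2f}")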
As an illustration of the modified wave number analysis, consider our hyperbolic equation with the central difference approximation for the spatial derivative. It can be easily verified that the substitution \phi_j = \psi(t)\, e^{ikx_j} results in

\frac{d\psi}{dt} = -i k' c\, \psi, (190)
where

k' = \frac{\sin(k\Delta x)}{\Delta x} (191)
is the modified wave number. In this case, the corresponding \lambda in the model equation is purely imaginary. Thus, we would know, for example, that application of the explicit Euler or second-order Runge-Kutta scheme would lead to numerical instabilities. On the other hand, if the leapfrog scheme is used, the time step has to be such that

\Delta t \le \frac{1}{|k'|\, c} = \frac{\Delta x}{c\, |\sin(k\Delta x)|}. (192)
If we consider the worst case,

\Delta t_{max} = \frac{\Delta x}{c}, (193)

and

\frac{c\, \Delta t}{\Delta x} \le 1. (194)
The quantity c\Delta t/\Delta x is called the CFL number, named after Courant, Friedrichs, and Lewy.
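In practice one simply sizes the time step from the CFL condition. A minimal helper (illustrative, not from the notes; the safety factor is a common but arbitrary choice):

def max_leapfrog_dt(c, dx, safety=0.9):
    # Largest stable time step for leapfrog + central differencing,
    # from c*dt/dx <= 1, reduced by a safety factor.
    return safety * dx / abs(c)

print(max_leapfrog_dt(c=2.0, dx=0.01))   # 0.0045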
3.6 Higher Dimensions
Take the two-dimensional heat equation

\frac{\partial \phi}{\partial t} = \alpha \left(\frac{\partial^2 \phi}{\partial x^2} + \frac{\partial^2 \phi}{\partial y^2}\right). (195)
For the numerical solution, one introduces a grid in the (x, y) plane. Let \phi_{j,k}^{(n)} denote the value of \phi at the grid point (j, k) at time step n. We consider J+1 grid points in x and K+1 points in y. The boundary points are at j = 0, J and k = 0, K. Application of any explicit numerical method is very straightforward. For example, consider explicit Euler in conjunction with the second-order finite-difference approximation for the spatial derivatives:

\frac{\phi_{j,k}^{(n+1)} - \phi_{j,k}^{(n)}}{\Delta t} = \alpha \left[\frac{\phi_{j+1,k}^{(n)} - 2\phi_{j,k}^{(n)} + \phi_{j-1,k}^{(n)}}{\Delta x^2} + \frac{\phi_{j,k+1}^{(n)} - 2\phi_{j,k}^{(n)} + \phi_{j,k-1}^{(n)}}{\Delta y^2}\right]. (196)
Given an initial condition \phi_{j,k}^{(0)} on the grid points for each j and k, one simply marches forward in time to obtain the solution at subsequent time steps. When j = 1 or J-1, or k = 1 or K-1, boundary values are required, and their values from the prescribed boundary conditions are used.
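A sketch of one such explicit Euler (FTCS) step in Python is given below; it assumes the boundary values are stored in the edge rows and columns of the array phi and are left untouched (the names and array layout are illustrative, not from the notes):

import numpy as np

def ftcs_step_2d(phi, alpha, dt, dx, dy):
    # Advance the interior points one time step via Eq. (196);
    # boundary rows/columns carry the prescribed boundary values.
    new = phi.copy()
    new[1:-1, 1:-1] = phi[1:-1, 1:-1] + alpha * dt * (
        (phi[2:, 1:-1] - 2.0 * phi[1:-1, 1:-1] + phi[:-2, 1:-1]) / dx**2
        + (phi[1:-1, 2:] - 2.0 * phi[1:-1, 1:-1] + phi[1:-1, :-2]) / dy**2)
    return new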
The stability properties of this scheme can be studied in the same way as in the one-dimensional case. Considering solutions of the form \phi(x_j, y_k, t) = \psi(t)\, e^{ik_1 x_j + ik_2 y_k}, the heat equation transforms to

\frac{d\psi}{dt} = -\alpha \left(k_1'^2 + k_2'^2\right)\psi, (197)
where k_1' and k_2' are the modified wave numbers

k_1'^2 = \frac{4}{\Delta x^2}\sin^2\!\left(\frac{k_1 \Delta x}{2}\right), \qquad k_2'^2 = \frac{4}{\Delta y^2}\sin^2\!\left(\frac{k_2 \Delta y}{2}\right). (198)
Therefore, for stability we must have

\Delta t \le \frac{2}{\alpha}\left[\frac{4}{\Delta x^2}\sin^2\!\left(\frac{k_1 \Delta x}{2}\right) + \frac{4}{\Delta y^2}\sin^2\!\left(\frac{k_2 \Delta y}{2}\right)\right]^{-1}. (199)
The worst case is when \sin^2(k_1 \Delta x/2) = 1 and \sin^2(k_2 \Delta y/2) = 1. Thus,

\Delta t \le \frac{1}{2\alpha}\left(\frac{1}{\Delta x^2} + \frac{1}{\Delta y^2}\right)^{-1}. (200)
In the special case where \Delta x = \Delta y = h, we get

\Delta t \le \frac{h^2}{4\alpha}, (201)

which is twice as restrictive as the one-dimensional limit. Similarly, in three dimensions one obtains

\Delta t \le \frac{h^2}{6\alpha}. (202)
3.7 Numerical Schemes for PDE
3.7.1 Dufort-Frankel Method
Let us consider the one-dimensional heat equation. The method is constructed in two steps. First, apply leapfrog in time and second-order central differencing in space:

\frac{\phi_j^{(n+1)} - \phi_j^{(n-1)}}{2\Delta t} = \frac{\alpha}{\Delta x^2}\left(\phi_{j+1}^{(n)} - 2\phi_j^{(n)} + \phi_{j-1}^{(n)}\right) + O(\Delta x^2, \Delta t^2). (203)
Thus, this scheme is second-order accurate in space and time. Since this scheme is unconditionally unstable, we have to modify it. The Dufort-Frankel scheme is obtained by substituting on the right-hand side

\phi_j^{(n)} = \frac{\phi_j^{(n+1)} + \phi_j^{(n-1)}}{2} + O(\Delta t^2). (204)
Thus, with this substitution the formal order of accuracy does not change. This results in

(1 + 2\gamma)\,\phi_j^{(n+1)} = (1 - 2\gamma)\,\phi_j^{(n-1)} + 2\gamma\,\phi_{j+1}^{(n)} + 2\gamma\,\phi_{j-1}^{(n)}, (205)

where \gamma = \alpha \Delta t / \Delta x^2.
This substitution turns the method unconditionally stable. The Dufort-Frankel method has the same stability property as implicit methods, but with much less work per time step, because no matrix needs to be inverted at each time step.
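A sketch of the resulting update, Eq. (205), assuming the two previous time levels are available (a single FTCS step can bootstrap the first level); names are illustrative:

import numpy as np

def dufort_frankel_step(phi_old, phi_now, alpha, dt, dx):
    # phi_old = level n-1, phi_now = level n; returns level n+1 on the interior.
    g = alpha * dt / dx**2                      # gamma in Eq. (205)
    phi_new = phi_old.copy()                    # keeps the boundary values
    phi_new[1:-1] = ((1.0 - 2.0 * g) * phi_old[1:-1]
                     + 2.0 * g * (phi_now[2:] + phi_now[:-2])) / (1.0 + 2.0 * g)
    return phi_new

Note that no linear system is solved: the update is explicit even though it is unconditionally stable.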
Substituting the series expansions for \phi_{j+1}^{(n)}, \phi_{j-1}^{(n)}, \phi_j^{(n+1)} and \phi_j^{(n-1)} into the above equation, we get

\frac{\partial \phi}{\partial t} - \alpha \frac{\partial^2 \phi}{\partial x^2} = -\frac{\Delta t^2}{6}\frac{\partial^3 \phi}{\partial t^3} + \frac{\alpha \Delta x^2}{12}\frac{\partial^4 \phi}{\partial x^4} - \alpha \frac{\Delta t^2}{\Delta x^2}\frac{\partial^2 \phi}{\partial t^2} - \frac{\alpha \Delta t^4}{12\, \Delta x^2}\frac{\partial^4 \phi}{\partial t^4} + \cdots. (206)
The difficulty arises from the third term on the right-hand side. For a given time step, refining the spatial mesh actually increases this error term. Thus, one cannot improve the accuracy of the numerical solution by letting \Delta x \to 0 and \Delta t \to 0 arbitrarily; consistency with the heat equation requires \Delta t / \Delta x \to 0 as the mesh is refined.
3.7.2 Crank-Nicolson Method
We already know that semi-discretization of the heat equation leads to a stiff system of ODEs. Besides, the stability limit for the application of explicit schemes is too stringent. For these reasons, implicit methods are typically used in solving parabolic equations. The most popular implicit scheme for the heat equation,

\frac{\partial \phi}{\partial t} = \alpha \frac{\partial^2 \phi}{\partial x^2}, (207)

is the Crank-Nicolson method.
Application of the method leads to

\frac{\phi^{(n+1)} - \phi^{(n)}}{\Delta t} = \frac{\alpha}{2}\left[\frac{\partial^2 \phi^{(n+1)}}{\partial x^2} + \frac{\partial^2 \phi^{(n)}}{\partial x^2}\right]. (208)
Approximating the spatial derivatives with the second-order finite-difference scheme gives

\phi_j^{(n+1)} - \phi_j^{(n)} = \frac{\alpha \Delta t}{2}\left[\frac{\phi_{j+1}^{(n+1)} - 2\phi_j^{(n+1)} + \phi_{j-1}^{(n+1)}}{\Delta x^2} + \frac{\phi_{j+1}^{(n)} - 2\phi_j^{(n)} + \phi_{j-1}^{(n)}}{\Delta x^2}\right]. (209)
Defining \beta = \alpha \Delta t / 2\Delta x^2 and collecting the unknown terms with superscript (n+1) on the left and the terms with superscript (n) on the right, we get the following tridiagonal system of equations:

-\beta \phi_{j+1}^{(n+1)} + (1 + 2\beta)\,\phi_j^{(n+1)} - \beta \phi_{j-1}^{(n+1)} = \beta \phi_{j+1}^{(n)} + (1 - 2\beta)\,\phi_j^{(n)} + \beta \phi_{j-1}^{(n)}, \quad j = 1, 2, \ldots, J-1. (210)
At each time step a tridiagonal system of equations must be solved. Application of implicit methods to partial differential equations always requires solving a system of algebraic equations.
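A sketch of one Crank-Nicolson step using the Thomas algorithm for the tridiagonal solve, assuming homogeneous Dirichlet boundaries (phi = 0 at both ends); the helper names are illustrative, not from the notes:

import numpy as np

def thomas(a, b, c, d):
    # Solve a tridiagonal system; a, b, c are the sub-, main and super-diagonals.
    n = len(d)
    cp, dp = np.empty(n), np.empty(n)
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def crank_nicolson_step(phi, alpha, dt, dx):
    beta = alpha * dt / (2.0 * dx**2)
    # Right-hand side of Eq. (210) on the interior points.
    rhs = beta * phi[2:] + (1.0 - 2.0 * beta) * phi[1:-1] + beta * phi[:-2]
    n = len(rhs)
    phi_new = np.zeros_like(phi)                # boundaries stay at zero
    phi_new[1:-1] = thomas(np.full(n, -beta), np.full(n, 1.0 + 2.0 * beta),
                           np.full(n, -beta), rhs)
    return phi_new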
If we use the model equation y' = \lambda y, the amplification factor is

\sigma = \frac{1 + \lambda \Delta t / 2}{1 - \lambda \Delta t / 2}. (211)

Using the modified wave number analysis, the amplification factor for the Crank-Nicolson method applied to the heat equation is obtained by substituting -\alpha k'^2 for \lambda in this equation. The modified wave number k' is
k'^2 = \frac{4}{\Delta x^2}\sin^2\!\left(\frac{k\Delta x}{2}\right). (212)
Then,

\sigma = \frac{1 - 2\,(\alpha \Delta t / \Delta x^2)\sin^2(k\Delta x/2)}{1 + 2\,(\alpha \Delta t / \Delta x^2)\sin^2(k\Delta x/2)}. (213)
Since |\sigma| \le 1 for all k, the method is unconditionally stable. Notice that for large \alpha \Delta t / \Delta x^2, \sigma \to -1, which leads to temporal oscillations in the solution.
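A quick check (a sketch, not from the notes) of the amplification factor (213): it remains within [-1, 1] for any time step, but approaches -1 as alpha*dt/dx^2 grows, which is the source of the oscillations:

import numpy as np

def cn_sigma(r, k_dx):
    # r = alpha * dt / dx^2; Eq. (213) at wave number k.
    s = 2.0 * r * np.sin(k_dx / 2.0)**2
    return (1.0 - s) / (1.0 + s)

for r in (0.5, 5.0, 50.0):
    print(f"r = {r:5.1f}:  sigma at k*dx = pi is {cn_sigma(r, np.pi):+.4f}")

The output approaches -1 (here +0.0000, -0.8182, -0.9802), so the highest wave number flips sign almost perfectly every step.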
3.7.3 Crank-Nicolson Scheme in Higher Dimensions
Due to the time step restrictions of explicit schemes, the use of implicit methods is more desirable. Applying the Crank-Nicolson scheme to the two-dimensional heat equation results in

\frac{\phi^{(n+1)} - \phi^{(n)}}{\Delta t} = \frac{\alpha}{2}\left[\frac{\partial^2 \phi^{(n+1)}}{\partial x^2} + \frac{\partial^2 \phi^{(n+1)}}{\partial y^2} + \frac{\partial^2 \phi^{(n)}}{\partial x^2} + \frac{\partial^2 \phi^{(n)}}{\partial y^2}\right]. (214)
Using second-order finite differences in space and assuming \Delta x = \Delta y = h yields

\phi_{j,k}^{(n+1)} - \phi_{j,k}^{(n)} = \frac{\alpha \Delta t}{2h^2}\left[\phi_{j+1,k}^{(n+1)} - 2\phi_{j,k}^{(n+1)} + \phi_{j-1,k}^{(n+1)} + \phi_{j,k+1}^{(n+1)} - 2\phi_{j,k}^{(n+1)} + \phi_{j,k-1}^{(n+1)}\right]
+ \frac{\alpha \Delta t}{2h^2}\left[\phi_{j+1,k}^{(n)} - 2\phi_{j,k}^{(n)} + \phi_{j-1,k}^{(n)} + \phi_{j,k+1}^{(n)} - 2\phi_{j,k}^{(n)} + \phi_{j,k-1}^{(n)}\right]. (215)
Defining \beta = \alpha \Delta t / 2h^2 and writing the unknowns on the left-hand side, we get

-\beta \phi_{j+1,k}^{(n+1)} + (1 + 4\beta)\,\phi_{j,k}^{(n+1)} - \beta \phi_{j-1,k}^{(n+1)} - \beta \phi_{j,k+1}^{(n+1)} - \beta \phi_{j,k-1}^{(n+1)}
= \beta \phi_{j+1,k}^{(n)} + (1 - 4\beta)\,\phi_{j,k}^{(n)} + \beta \phi_{j-1,k}^{(n)} + \beta \phi_{j,k+1}^{(n)} + \beta \phi_{j,k-1}^{(n)} \equiv G_{j,k}^{(n)}. (216)
This is a system of algebraic equations for \phi_{j,k}^{(n+1)}. It can be written as

B\,\phi^{(n+1)} = G^{(n)}, (217)

where B is a large banded (block-tridiagonal) matrix and \phi^{(n+1)} is the vector of unknowns at time step (n+1).
3.7.4 Alternating Direction Implicit (ADI) Methods and Approximate Factorization
The difficulty of working with large matrices, resulting from the direct implementation of implicit schemes for partial differential equations in higher dimensions, has led to the development of so-called split or factored schemes. These schemes circumvent large matrices while maintaining the same order of accuracy. If we apply Crank-Nicolson and second-order spatial differencing to the two-dimensional heat equation, we get

\frac{\phi^{(n+1)} - \phi^{(n)}}{\Delta t} = \frac{\alpha}{2} A_x \left[\phi^{(n+1)} + \phi^{(n)}\right] + \frac{\alpha}{2} A_y \left[\phi^{(n+1)} + \phi^{(n)}\right] + O(\Delta t^2, \Delta x^2, \Delta y^2), (218)
where A_x and A_y are difference operators representing the spatial derivatives in the x and y directions, respectively. For example, A_x \phi is a vector of length (J-1)(K-1), where J+1 is the number of grid points in the x-direction and K+1 in the y-direction, with elements defined as

\frac{\phi_{j+1,k} - 2\phi_{j,k} + \phi_{j-1,k}}{\Delta x^2}, \quad j = 1, 2, \ldots, J-1, \quad k = 1, 2, \ldots, K-1. (219)
Thus, our discrete two-dimensional heat equation can be rewritten as

\left[I - \frac{\alpha \Delta t}{2} A_x - \frac{\alpha \Delta t}{2} A_y\right] \phi^{(n+1)} = \left[I + \frac{\alpha \Delta t}{2} A_x + \frac{\alpha \Delta t}{2} A_y\right] \phi^{(n)} + \Delta t\,[O(\Delta t^2, \Delta x^2, \Delta y^2)]. (220)
After some algebra, we can rewrite the discrete equations in the factored form

\left[I - \frac{\alpha \Delta t}{2} A_x\right]\left[I - \frac{\alpha \Delta t}{2} A_y\right] \phi^{(n+1)} = \left[I + \frac{\alpha \Delta t}{2} A_x\right]\left[I + \frac{\alpha \Delta t}{2} A_y\right] \phi^{(n)}. (221)

The factored and unfactored forms differ only by the cross term (\alpha \Delta t / 2)^2 A_x A_y (\phi^{(n+1)} - \phi^{(n)}), which is of order \Delta t^3 and therefore does not degrade the second-order accuracy of the scheme.
This equation is much easier and more cost-effective to implement than the large system encountered in the non-factored form. This is called an approximate factorization. Approximate factorization represents a class of methods: the idea is to factor a numerical operator for which the equations are difficult to solve into the product of two or more operators for which the solution is computationally easier and just as accurate.
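A sketch of the factored update (221) for \Delta x = \Delta y = h with zero Dirichlet boundaries, reusing the thomas(a, b, c, d) helper from the one-dimensional Crank-Nicolson sketch above (all names illustrative). The point is that each factor couples unknowns along one direction only, so the 2D solve reduces to a sequence of independent 1D tridiagonal solves:

import numpy as np

def factored_step(phi, alpha, dt, h):
    # One step of (I - b Ax)(I - b Ay) phi_new = (I + b Ax)(I + b Ay) phi,
    # where b = alpha*dt/(2h^2) absorbs the 1/h^2 of the 1D stencils.
    b = alpha * dt / (2.0 * h**2)
    tmp = phi.copy()                               # (I + b Ax) phi
    tmp[1:-1, :] += b * np.diff(phi, 2, axis=0)
    rhs = tmp.copy()                               # (I + b Ay)(I + b Ax) phi
    rhs[:, 1:-1] += b * np.diff(tmp, 2, axis=1)
    star = np.zeros_like(phi)
    for k in range(1, phi.shape[1] - 1):           # x-direction solves
        n = phi.shape[0] - 2
        star[1:-1, k] = thomas(np.full(n, -b), np.full(n, 1.0 + 2.0 * b),
                               np.full(n, -b), rhs[1:-1, k])
    new = np.zeros_like(phi)
    for j in range(1, phi.shape[0] - 1):           # y-direction solves
        n = phi.shape[1] - 2
        new[j, 1:-1] = thomas(np.full(n, -b), np.full(n, 1.0 + 2.0 * b),
                              np.full(n, -b), star[j, 1:-1])
    return new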
3.7.5 Relaxation Methods
The word relaxation is used to mean the process of finding the iterative solution of a coupled set of algebraic equations. In our applications, the algebraic equations represent a set of difference equations, usually derived from the differencing of partial differential equations. The number of iterations required to reach the solution depends upon both the rate of convergence and the convergence path. Since the exact solution is generally unknown, and since the process is iterative, some test of a converged solution must be supplied. The quantity tested is called the residual. The degree to which the residual must be reduced toward zero before a solution is acceptable is quite arbitrary and generally problem dependent. The solution is said to have converged when it is invariant under further iteration; it is then independent of the initial guess.
These methods are to be compared with direct methods, for instance, Gaussian elimination. With direct methods the solution is determined exactly (up to machine precision) in a finite number of steps. Very efficient direct methods have been developed for systems that arise from a two-dimensional elliptic equation. They are usually based on the fast Fourier transform (FFT) or the method of cyclic reduction. Classical relaxation methods are easy to implement and may be successfully applied to more general systems than the direct methods.
Consider the Poisson equation

\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = f(x, y), (222)

in the unit square (0 < x < 1, 0 < y < 1), subject to the condition that u = 0 on the boundary. Using \Delta x = 1/I, \Delta y = 1/J, and central second-order finite differences, we get

\frac{u_{i-1,j} - 2u_{i,j} + u_{i+1,j}}{\Delta x^2} + \frac{u_{i,j-1} - 2u_{i,j} + u_{i,j+1}}{\Delta y^2} = f_{i,j}, \quad 1 \le i \le I-1, \quad 1 \le j \le J-1, (223)
with u_{i,j} = 0 for i = 0, I or j = 0, J. We can rewrite this equation as

Au = f, (224)

where A is a banded (block-tridiagonal) matrix of coefficients, u is the vector of unknowns, and f is a vector containing the f_{i,j}.
An iterative method applied to this equation is a procedure of the type

A_1 v^{(p+1)} = A_2 v^{(p)} + f, (225)

where A_1 and A_2 are nonzero matrices from a nontrivial decomposition A = A_1 - A_2. Using this procedure, we get
v^{(p+1)} = A_1^{-1} A_2 v^{(p)} + A_1^{-1} f = P v^{(p)} + b, (226)

where we require v^{(p)} \to u as p \to \infty, preferably quickly.
3.7.6 Jacobi Method
Let us start by considering the set of linear equations

Au = f. (227)

We write the matrix A in the form

A = D - L - U, (228)

where D is the diagonal of A, with elements a_{i,i} assumed to be nonzero, and -L and -U are the strictly lower and upper triangular parts of A, with elements a_{i,j} (i > j) and a_{i,j} (i < j), respectively. Then we have
(D - L - U)\, u = f. (229)

This can be rewritten as

Du = (L + U)\, u + f, (230)

or

u = D^{-1}(L + U)\, u + D^{-1} f. (231)
In component form, the iteration scheme expressed by this last equation is the Jacobi method, written as

v_i^{(p+1)} = \frac{1}{a_{i,i}}\left[-\sum_{j \ne i} a_{i,j}\, v_j^{(p)} + f_i\right]. (232)
This corresponds precisely to solving the ith equation for the ith unknown v_i. If we substitute the matrices corresponding to the central second-order finite-difference discretization of the Poisson equation (with \Delta x = \Delta y = h), we get

v_{i,j}^{(p+1)} = \frac{1}{4}\left[v_{i-1,j}^{(p)} + v_{i+1,j}^{(p)} + v_{i,j-1}^{(p)} + v_{i,j+1}^{(p)}\right] - \frac{h^2}{4} f_{i,j}. (233)
If we define the Jacobi iteration matrix by

P_J = D^{-1}(L + U), (234)

the Jacobi method in matrix form is

v^{(p+1)} = P_J\, v^{(p)} + D^{-1} f. (235)
There is a simple but important modification which can be made to the Jacobi iteration. As before, we compute the new Jacobi iterate using

v^{(*)} = P_J\, v^{(p)} + D^{-1} f. (236)
However, v^{(*)} is now an intermediate value. The new approximation is given by the weighted average

v^{(p+1)} = (1 - \omega)\, v^{(p)} + \omega\, v^{(*)}, (237)
where 0 < \omega \le 1 is a weighting factor which may be chosen. This generates an entire family of iterations called the weighted or damped Jacobi method. Notice that with \omega = 1 we recover the original Jacobi iteration. The combined method is given by

v^{(p+1)} = \left[(1 - \omega) I + \omega P_J\right] v^{(p)} + \omega D^{-1} f. (238)
If we define the weighted Jacobi iteration matrix by

P_\omega = (1 - \omega) I + \omega P_J, (239)

then the method may be expressed as

v^{(p+1)} = P_\omega\, v^{(p)} + \omega D^{-1} f. (240)
Using Av = f - r, where r is the residual, the weighted Jacobi iteration can also be written in the form

v^{(p+1)} = v^{(p)} + \omega D^{-1} r^{(p)}. (241)
This says that the new approximation is obtained from the current one by adding an appropriate weighting of the residual. This is just one example of a stationary linear iteration. In general, recalling that e = u - v and Ae = r, we have

u = v + A^{-1} r. (242)
From this expression, an iteration may be formed by taking

v^{(p+1)} = v^{(p)} + B r^{(p)}, (243)

where B is some approximation to A^{-1}. If B can be chosen close to A^{-1}, the iteration should be effective.
The Jacobi and weighted Jacobi methods wait until all components of the new approximation have been computed before using them. This requires 2(I-1)(J-1) storage locations for the approximation vector. It also means that new information is not used as soon as it is available.
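A compact sketch of weighted Jacobi for the discrete Poisson problem (223), with \Delta x = \Delta y = h and u = 0 on the boundary, using the residual form (241); the weight omega = 2/3 is a common choice for smoothing but is otherwise arbitrary (names illustrative):

import numpy as np

def weighted_jacobi(f, h, omega=2.0 / 3.0, sweeps=100):
    # v <- v + omega * D^{-1} r, with D = -4/h^2 on the diagonal of A.
    v = np.zeros_like(f)
    for _ in range(sweeps):
        r = np.zeros_like(f)
        r[1:-1, 1:-1] = f[1:-1, 1:-1] - (
            v[2:, 1:-1] + v[:-2, 1:-1] + v[1:-1, 2:] + v[1:-1, :-2]
            - 4.0 * v[1:-1, 1:-1]) / h**2
        v -= omega * (h**2 / 4.0) * r            # omega * D^{-1} r
    return v

With omega = 1 this reduces to the plain Jacobi update (233).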
3.7.7 Gauss-Seidel Method
The Gauss-Seidel method incorporates a simple change to the Jacobi method: components of the new approximation are used as soon as they are computed. This means that components of the approximation vector v are over-written as soon as they are updated. This small change reduces the storage requirement for the approximation vector to only (I-1)(J-1) locations. It can be shown that the Gauss-Seidel method is equivalent to successively setting each component of the residual vector to zero and solving for the corresponding component of the solution.
As in the Jacobi method, splitting the matrix A in the form A = D - L - U, we can write the system of equations Au = f as

(D - L)\, u = U u + f, (244)

or

u = (D - L)^{-1} U u + (D - L)^{-1} f. (245)

This corresponds precisely to solving the ith equation for u_i and using the new approximations for components 1, 2, \ldots, i-1. If we now define the Gauss-Seidel iteration matrix by

P_G = (D - L)^{-1} U, (246)

we can express the method as

v^{(p+1)} = P_G\, v^{(p)} + (D - L)^{-1} f. (247)
For our discrete Poisson equation (again with \Delta x = \Delta y = h) the method is

v_{i,j}^{(p+1)} = \frac{1}{4}\left[v_{i-1,j}^{(p+1)} + v_{i+1,j}^{(p)} + v_{i,j-1}^{(p+1)} + v_{i,j+1}^{(p)}\right] - \frac{h^2}{4} f_{i,j}. (248)
We have assumed that the components of v are updated in ascending order. For the Jacobi and weighted Jacobi methods, the order is immaterial, since components are never over-written. However, for the Gauss-Seidel method, the order of updating is significant. Instead of sweeping through the grid points in ascending order, we could sweep through the unknowns in descending order, or alternate between ascending and descending orders. The latter procedure is called the symmetric Gauss-Seidel method. Another effective alternative is to update all the (i + j) even components first, and then return and update all the (i + j) odd components. This strategy leads to the red-black Gauss-Seidel method (on a two-dimensional grid). The advantages of red-black versus regular Gauss-Seidel are not immediately apparent; the issue is often problem-dependent. However, red-black Gauss-Seidel does have a clear advantage for implementation on a parallel computer: the red points need only black points for their updating and therefore may be updated in any order. This work represents (I-1)(J-1)/2 independent tasks that can be distributed among several independent processors. In a similar way, the black sweep can also be done by several independent processors.
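A sketch of one red-black sweep for the discrete Poisson problem with \Delta x = \Delta y = h (array names illustrative, not from the notes); each color is updated in a single vectorized pass, which mirrors the parallel decomposition described above:

import numpy as np

def red_black_sweep(v, f, h):
    I, J = v.shape
    i, j = np.meshgrid(np.arange(I), np.arange(J), indexing="ij")
    interior = (i > 0) & (i < I - 1) & (j > 0) & (j < J - 1)
    for color in (0, 1):                          # 0: red (i+j even), 1: black
        mask = interior & ((i + j) % 2 == color)
        nbr = np.zeros_like(v)                    # recomputed so black sees new red
        nbr[1:-1, 1:-1] = (v[2:, 1:-1] + v[:-2, 1:-1]
                           + v[1:-1, 2:] + v[1:-1, :-2])
        v[mask] = 0.25 * nbr[mask] - 0.25 * h**2 * f[mask]
    return v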
4 Method of Weighted Residuals