Rootfinding
f(x) = 0 (7.1)

We will explore some simple numerical methods for solving this equation, and also consider some possible difficulties.

Assume f is continuous on [a, b] and
f(a) f(b) < 0. (7.2)
Then f(x) changes sign on [a, b], and f(x) = 0 has at least one root α in the interval.
Definition
The simplest numerical procedure for finding a root is to repeatedly halve the interval [a, b], keeping the half on which f(x) changes sign. This procedure is called the bisection method, and it is guaranteed to converge to a root, denoted here by α.

Suppose that we are given an interval [a, b] satisfying (7.2) and an error tolerance ε > 0. The bisection method consists of the following steps:

B1. Define c = (a + b)/2.
B2. If b − c ≤ ε, then accept c as the root and stop.
B3. If sign[f(b)] · sign[f(c)] ≤ 0, then set a := c; otherwise set b := c. Return to step B1.

The interval [a, b] is halved with each loop through steps B1 to B3. The test B2 will be satisfied eventually, and with it the condition |α − c| ≤ ε will be satisfied. Notice that in step B3 we test the sign of sign[f(b)] · sign[f(c)] in order to avoid the possibility of underflow or overflow in the multiplication of f(b) and f(c).
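The steps B1 to B3 can be sketched in Python as follows (the function name `bisect` and its argument order are illustrative only; this is not the `bisect.m` routine referred to below):

```python
import math

def bisect(f, a, b, eps):
    """Bisection: halve [a, b], keeping the half on which f changes sign,
    until b - c <= eps (step B2)."""
    if f(a) * f(b) > 0:
        raise ValueError("f must change sign on [a, b]")
    while True:
        c = (a + b) / 2                                   # B1
        if b - c <= eps:                                  # B2: then |alpha - c| <= eps
            return c
        # B3: compare signs of f(b), f(c) without multiplying the values,
        # avoiding underflow/overflow
        if math.copysign(1, f(b)) * math.copysign(1, f(c)) <= 0:
            a = c                                         # root in [c, b]
        else:
            b = c                                         # root in [a, c]

root = bisect(lambda x: x**6 - x - 1, 1.0, 2.0, 1e-3)
```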
Example
Find the largest root α of
f(x) ≡ x^6 − x − 1 = 0. (7.3)
Use bisect.m
n    a       b       c       b − c     f(c)
1    1.0000  2.0000  1.5000  0.5000     8.8906
2    1.0000  1.5000  1.2500  0.2500     1.5647
3    1.0000  1.2500  1.1250  0.1250    -0.0977
4    1.1250  1.2500  1.1875  0.0625     0.6167
5    1.1250  1.1875  1.1562  0.0312     0.2333
6    1.1250  1.1562  1.1406  0.0156     0.0616
7    1.1250  1.1406  1.1328  0.0078    -0.0196
8    1.1328  1.1406  1.1367  0.0039     0.0206
9    1.1328  1.1367  1.1348  0.0020     0.0004
10   1.1328  1.1348  1.1338  0.00098   -0.0096
The number of iterates n needed to guarantee b − c ≤ ε satisfies
n ≥ log((b − a)/ε) / log 2.
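This bound can be checked against the table (assuming the tolerance ε = 0.001 that the table appears to use): with b − a = 1 it gives n = 10, matching the ten rows above.

```python
import math

# Number of bisection steps so that b - c <= eps, for [a, b] = [1, 2]
# and the (assumed) tolerance eps = 0.001 of the table.
a, b, eps = 1.0, 2.0, 1e-3
n = math.ceil(math.log((b - a) / eps) / math.log(2))
```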
The line tangent to the graph of y = f(x) at (x_0, f(x_0)) is the graph of the linear Taylor polynomial
p_1(x) = f(x_0) + f'(x_0)(x − x_0).
The root of p_1(x) = 0 approximates the root of f(x) = 0:
x_1 = x_0 − f(x_0)/f'(x_0).
Repeating the process with x_1,
x_2 = x_1 − f(x_1)/f'(x_1),
and in general
x_{n+1} = x_n − f(x_n)/f'(x_n), n ≥ 0. (7.6)
Example
Using Newton's method, solve (7.3), used earlier for the bisection method. Here
f(x) = x^6 − x − 1, f'(x) = 6x^5 − 1,
and the iteration is
x_{n+1} = x_n − (x_n^6 − x_n − 1)/(6 x_n^5 − 1), n ≥ 0. (7.7)
The true root is α = 1.134724138, and x_6 = α to nine significant digits. Newton's method may converge slowly at first. However, as the iterates come closer to the root, the speed of convergence increases.
Use newton.m
Compare these results with the results for the bisection method.
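A Python sketch of the iteration (7.7) (the function name `newton` is illustrative; this is not the `newton.m` routine):

```python
def newton(f, df, x0, eps, itmax=100):
    """Newton's method x_{n+1} = x_n - f(x_n)/f'(x_n)  (7.6),
    stopping via the error estimate (7.17)."""
    x = x0
    for _ in range(itmax):
        dx = f(x) / df(x)
        x -= dx
        if abs(dx) <= eps:        # |alpha - x_n| ~ |x_{n+1} - x_n|
            return x
    raise RuntimeError("no convergence")

f  = lambda x: x**6 - x - 1
df = lambda x: 6 * x**5 - 1
root = newton(f, df, 1.5, 1e-10)
```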
Example
One way to compute a/b on early computers (which had hardware arithmetic for addition, subtraction, and multiplication only) was by multiplying a and 1/b, with 1/b approximated by Newton's method. Apply Newton's method to
f(x) ≡ b − 1/x = 0,
where we assume b > 0. The root is α = 1/b, the derivative is
f'(x) = 1/x^2,
and Newton's method is given by
x_{n+1} = x_n − (b − 1/x_n)/(1/x_n^2),
i.e.,
x_{n+1} = x_n (2 − b x_n), n ≥ 0. (7.8)
The iterates converge to α = 1/b provided the initial guess satisfies
0 < x_0 < 2/b. (7.10)
3. Rootfinding Math 1070
> 3. Rootfinding > 3.2 Newton's method
The iteration (7.8), x_{n+1} = x_n(2 − b x_n), n ≥ 0, converges to α = 1/b if and only if the initial guess x_0 satisfies
0 < x_0 < 2/b.

Figure: The iterative solution of b − 1/x = 0.
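A sketch of the division iteration (7.8), which uses only subtraction and multiplication; note that the starting guesses obey (7.10):

```python
def reciprocal(b, x0, n=10):
    """Approximate 1/b via x_{k+1} = x_k * (2 - b*x_k)  (7.8)."""
    x = x0
    for _ in range(n):
        x = x * (2.0 - b * x)
    return x

approx = reciprocal(3.0, 0.5)    # 0 < 0.5 < 2/3, so the iterates converge to 1/3
```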
Error analysis
Assume
f'(α) ≠ 0, (7.12)
i.e., the graph of y = f(x) is not tangent to the x-axis when the graph intersects it at x = α. The case f'(α) = 0 is treated in Section 3.5. Note that combining (7.12) with the continuity of f'(x) implies that f'(x) ≠ 0 for all x near α.
By Taylor's theorem,
f(α) = f(x_n) + (α − x_n) f'(x_n) + ½ (α − x_n)^2 f''(c_n),
with c_n an unknown point between α and x_n. Note that f(α) = 0 by assumption; dividing by f'(x_n) gives
0 = f(x_n)/f'(x_n) + (α − x_n) + (α − x_n)^2 f''(c_n)/(2 f'(x_n)),
and therefore
α − x_{n+1} = −(α − x_n)^2 · f''(c_n)/(2 f'(x_n)). (7.13)
This formula says that the error in xn+1 is nearly proportional to the
square of the error in xn .
When the initial error is sufficiently small, this shows that the error in
the succeeding iterates will decrease very rapidly, just as in (7.11).
Example
For the earlier iteration (7.7), i.e., x_{n+1} = x_n − (x_n^6 − x_n − 1)/(6 x_n^5 − 1), n ≥ 0, we have f''(x) = 30x^4. If we are near the root α, then
f''(c_n)/(2 f'(c_n)) ≈ f''(α)/(2 f'(α)) = 30α^4/(2(6α^5 − 1)) ≈ 2.42.
More generally, write
f''(c_n)/(2 f'(x_n)) ≈ f''(α)/(2 f'(α)) ≡ M. (7.15)
Thus,
α − x_{n+1} ≈ −M (α − x_n)^2, n ≥ 0,
and multiplying by M,
|M(α − x_{n+1})| ≈ |M(α − x_n)|^2.
Assuming that all of the iterates are near α, then inductively we can show that
|M(α − x_n)| ≤ |M(α − x_0)|^{2^n}, n ≥ 0.
For convergence we therefore need |M(α − x_0)| < 1, i.e.,
|α − x_0| < 1/|M| = |2 f'(α)/f''(α)|. (7.16)
If the quantity |M| is very large, then x_0 will have to be chosen very close to α to obtain convergence. In such a situation, the bisection method is probably an easier method to use.
The choice of x_0 can be very important in determining whether Newton's method will converge.
α − x_n ≈ x_{n+1} − x_n. (7.17)
This is the standard error estimation formula for Newton's method, and it is usually fairly accurate.
However, this formula is not valid if f'(α) = 0, a case that is discussed in Section 3.5.
Example
Consider the error in the entry x_3 of the previous table:
α − x_3 = 4.73E−3,
x_4 − x_3 = 4.68E−3.
Example
Use Newton's method to find a root of f(x) = x^2. The iteration is
x_{n+1} = x_n − f(x_n)/f'(x_n) = x_n − x_n^2/(2x_n) = x_n/2.
So the method converges to the root α = 0, but the convergence is only linear:
e_{n+1} = e_n/2.
Example
Use Newton's method to find a root of f(x) = x^m. The iteration is
x_{n+1} = x_n − x_n^m/(m x_n^{m−1}) = ((m − 1)/m) x_n.
The method converges to the root α = 0, again with linear convergence:
e_{n+1} = ((m − 1)/m) e_n.
> 3. Rootfinding > Error Analysis - linear convergence
Theorem
Assume f ∈ C^{m+1}[a, b] has a root α of multiplicity m. Then Newton's method is locally convergent to α, and the absolute error e_n satisfies
lim_{n→∞} e_{n+1}/e_n = (m − 1)/m. (7.18)
Example
Find the multiplicity of the root α = 0 of f(x) = sin x + x^2 cos x − x^2 − x, and estimate the number of steps of Newton's method needed for convergence to 6 correct decimal places (use x_0 = 1). Expanding in Taylor series about 0,
sin x = x − x^3/6 + ⋯ and x^2 cos x = x^2 − x^4/2 + ⋯,
so
f(x) = −x^3/6 − x^4/2 + ⋯,
and α = 0 is a root of multiplicity m = 3.
Example
Apply Newton's method to f(x) = x^4 + 3x^2 + 2 with starting guess x_0 = 1. (Note that f(x) ≥ 2 for all real x, so f has no real root.)

Figure: the first Newton iterates x_0, x_1 for f(x) = x^4 + 3x^2 + 2.
Assume that two initial guesses to α are known and denote them by x_0 and x_1. They may occur
on opposite sides of α, or
on the same side of α.
Replacing the tangent line by the secant line through (x_{n−1}, f(x_{n−1})) and (x_n, f(x_n)) leads to the secant method:
x_{n+1} = x_n − f(x_n) · (x_n − x_{n−1}) / (f(x_n) − f(x_{n−1})), n ≥ 1. (7.20)

Figure: the secant iterates x_0, x_1, x_2, x_3.
Use secant.m
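A Python sketch of the secant iteration (7.20) (illustrative only; this is not the `secant.m` routine):

```python
def secant(f, x0, x1, eps, itmax=100):
    """Secant method (7.20): Newton's formula with f'(x_n) replaced by the
    slope through the two most recent iterates."""
    f0, f1 = f(x0), f(x1)
    for _ in range(itmax):
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        if abs(x2 - x1) <= eps:
            return x2
        x0, f0, x1, f1 = x1, f1, x2, f(x2)
    raise RuntimeError("no convergence")

# the running example, with the same starting values as the table below
root = secant(lambda x: x**6 - x - 1, 2.0, 1.0, 1e-10)
```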
Example
We solve the equation f(x) ≡ x^6 − x − 1 = 0.

n   x_n           f(x_n)      x_n − x_{n−1}   α − x_{n−1}
0   2.0           61.0
1   1.0           -1.0        -1.0
2   1.01612903    -9.15E-1     1.61E-2         1.35E-1
3   1.19057777     6.57E-1     1.74E-1         1.19E-1
4   1.11765583    -1.68E-1    -7.29E-2        -5.59E-2
5   1.13253155    -2.24E-2     1.49E-2         1.71E-2
6   1.13481681     9.54E-4     2.29E-3         2.19E-3
7   1.13472365    -5.07E-6    -9.32E-5        -9.27E-5
8   1.13472414    -1.13E-9     4.92E-7         4.92E-7
The iterate x_8 equals α rounded to nine significant digits. As with Newton's method (7.7) for this equation, the initial iterates do not converge rapidly. But as the iterates become closer to α, the speed of convergence increases.
> 3. Rootfinding > 3.3.1 Error Analysis
The error for the secant method satisfies
α − x_{n+1} = −(α − x_n)(α − x_{n−1}) · f''(ξ_n)/(2 f'(ζ_n)), (7.21)
with ξ_n, ζ_n unknown points near α. The secant method comes from Newton's method via the approximation
f'(x_n) ≈ (f(x_n) − f(x_{n−1})) / (x_n − x_{n−1}).
Check that the use of this in the Newton formula (7.6) will yield (7.20).
The formula (7.21) can be used to obtain the further error result that if x_0 and x_1 are chosen sufficiently close to α, then we have convergence and
lim_{n→∞} |α − x_{n+1}| / |α − x_n|^r = |f''(α)/(2 f'(α))|^{r−1} ≡ c,
where r = (√5 + 1)/2 ≈ 1.62. Thus, with M = f''(α)/(2 f'(α)) as before,
c = |M|^{r−1}.
The restriction (7.16) on the initial guess for Newton's method can be replaced by a similar one for the secant iterates, but we omit it.
Finally, the result (7.22) can be used to justify the error estimate
α − x_n ≈ x_{n+1} − x_n.

Example
For the iterate x_5 in the previous table,
α − x_5 = 2.19E−3, x_6 − x_5 = 2.29E−3. (7.23)
General remarks
The derivation of both the Newton and secant methods illustrates a general principle of numerical analysis: when trying to solve a problem for which there is no direct or simple method of solution, approximate it by another problem that you can solve more easily.
GENERAL OBSERVATION
When dealing with problems involving differentiable functions f (x),
move to a nearby problem by approximating each such f (x) with a
linear problem.
The linearization of mathematical problems is common throughout applied
mathematics and numerical analysis.
There are three generalizations of the secant method that are also important. The Method of False Position, or Regula Falsi, is similar to the bisection method, but with the midpoint replaced by a secant-method-like approximation. Given an interval [a, b] that brackets a root (assume that f(a)f(b) < 0), define the next point
c = (b f(a) − a f(b)) / (f(a) − f(b)),
as in the secant method; but unlike the secant method, the new point is guaranteed to lie in [a, b], since the points (a, f(a)) and (b, f(b)) lie on opposite sides of the x-axis. The new interval, either [a, c] or [c, b], is chosen according to whether f(a)f(c) < 0 or f(c)f(b) < 0, respectively, and still brackets a root.
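A sketch of these steps; the test function x^3 − 2x^2 + 1.5x (whose only real root is 0) is the one from the example below:

```python
def false_position(f, a, b, n):
    """False Position: bisection with the midpoint replaced by the
    secant point c = (b*f(a) - a*f(b)) / (f(a) - f(b))."""
    fa, fb = f(a), f(b)
    assert fa * fb < 0, "need a bracketing interval"
    for _ in range(n):
        c = (b * fa - a * fb) / (fa - fb)
        fc = f(c)
        if fc == 0:
            return c
        if fa * fc < 0:
            b, fb = c, fc          # root in [a, c]
        else:
            a, fa = c, fc          # root in [c, b]
    return c

root = false_position(lambda x: x**3 - 2*x**2 + 1.5*x, -1.0, 1.0, 60)
```

Note that, unlike bisection, one endpoint (here a = −1) may never move, so the bracket width need not go to zero even though c converges.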
Example
Apply the Method of False Position on the initial interval [−1, 1] to find the root r = 0 of f(x) = x^3 − 2x^2 + (3/2)x.

Figure: false position iterates x_0, x_1, x_2, x_3, x_4 on [−1, 1] (two configurations).
Newton's method
x_{n+1} = x_n − f(x_n)/f'(x_n), n = 0, 1, 2, . . .
can be viewed as an iteration of the form x_{n+1} = g(x_n).

Definition
A solution α of the equation x = g(x) is called a fixed point of g.
Example
As a motivational example, consider solving the equation
x^2 − 5 = 0 (7.24)
for the root α = √5 ≈ 2.2361. We give four iterative methods to solve this equation:

I1. x_{n+1} = 5 + x_n − x_n^2          [of the form x = x + c(x^2 − a), c ≠ 0]
I2. x_{n+1} = 5/x_n                    [x = a/x]
I3. x_{n+1} = 1 + x_n − x_n^2/5        [x = x + c(x^2 − a), c ≠ 0]
I4. x_{n+1} = (x_n + 5/x_n)/2          [x = (x + a/x)/2]

All four iterations have the property that if the sequence {x_n : n ≥ 0} has a limit α, then α is a root of (7.24). For each equation, check this as follows: replace x_n and x_{n+1} by α, and then show that this implies α^2 = 5.
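The four schemes can be compared numerically; a short sketch (I1 diverges rapidly, so only I2, I3, and I4 are run):

```python
import math

# Fixed-point schemes for x^2 - 5 = 0, started at x0 = 2.5.
alpha = math.sqrt(5)

def iterate(g, x0, n):
    x = x0
    for _ in range(n):
        x = g(x)
    return x

x_I2 = iterate(lambda x: 5 / x, 2.5, 2)              # oscillates: 2.5 -> 2 -> 2.5
x_I3 = iterate(lambda x: 1 + x - x**2 / 5, 2.5, 25)  # converges, rate ~ 0.106
x_I4 = iterate(lambda x: (x + 5 / x) / 2, 2.5, 6)    # Newton form: quadratic
```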
> 3. Rootfinding > 3.4 Fixed point iteration
Lemma
Let g ∈ C[a, b]. Assume that g([a, b]) ⊆ [a, b], i.e.,
x ∈ [a, b] ⇒ g(x) ∈ [a, b].
Then x = g(x) has at least one solution α in the interval [a, b].

Figure: the graphs of y = g(x) and y = x; each intersection is a fixed point.
Lipschitz continuous
Definition
Given g : [a, b] → R, g is called Lipschitz continuous with constant λ > 0 (denoted g ∈ Lip_λ [a, b]) if
|g(x) − g(y)| ≤ λ |x − y| for all x, y ∈ [a, b].

Definition
g : [a, b] → R is called a contraction map if g ∈ Lip_λ [a, b] with λ < 1.
Lemma
Let g ∈ Lip_λ [a, b] with λ < 1 and g([a, b]) ⊆ [a, b]. Then x = g(x) has exactly one solution α. Moreover, for the iteration
x_{n+1} = g(x_n), we have x_n → α for any x_0 ∈ [a, b], and
|α − x_n| ≤ (λ^n / (1 − λ)) |x_1 − x_0|.
Proof.
Existence: follows from the previous Lemma.
Uniqueness: assume α, β are both solutions: α = g(α), β = g(β). Then
|α − β| = |g(α) − g(β)| ≤ λ |α − β|,
so (1 − λ)|α − β| ≤ 0 with 1 − λ > 0, hence α = β.
If x_n ∈ [a, b], then g(x_n) = x_{n+1} ∈ [a, b]; thus {x_n}_{n≥0} ⊂ [a, b].
Linear convergence with rate λ:
|α − x_{n+1}| = |g(α) − g(x_n)| ≤ λ |α − x_n|.

Error estimate:
|α − x_n| ≤ |x_n − x_{n+1}| + |α − x_{n+1}| ≤ |x_n − x_{n+1}| + λ |α − x_n|,
so
|α − x_n| ≤ (1/(1 − λ)) |x_{n+1} − x_n|,
and hence
|α − x_{n+1}| ≤ λ |α − x_n| ≤ (λ/(1 − λ)) |x_{n+1} − x_n|.
Define
λ = max_{x∈[a,b]} |g'(x)|.

Theorem 2.6
Assume g ∈ C^1[a, b], g([a, b]) ⊆ [a, b], and λ = max_{x∈[a,b]} |g'(x)| < 1. Then
1. x = g(x) has a unique solution α in [a, b],
2. x_n → α for any x_0 ∈ [a, b],
3. |α − x_n| ≤ (λ^n/(1 − λ)) |x_1 − x_0|,
4. lim_{n→∞} (α − x_{n+1})/(α − x_n) = g'(α).
Proof.
3. Rootfinding Math 1070
> 3. Rootfinding > 3.4 Fixed point iteration
Theorem 2.7
Assume α solves x = g(x), with g ∈ C^1(I_ε) for some interval I_ε ∋ α and |g'(α)| < 1. Then the conclusions of Theorem 2.6 hold for x_0 close enough to α.

Proof.
Since |g'(α)| < 1, by continuity there is an interval I_ε = [α − ε, α + ε] on which max |g'(x)| ≤ λ < 1. Take x_0 ∈ I_ε; then
|α − x_1| = |g(α) − g(x_0)| ≤ λ |α − x_0| ≤ ε,
so x_1 ∈ I_ε and, by induction, x_n ∈ I_ε for all n. Theorem 2.6 then applies with [a, b] = I_ε.

On the other hand, if |g'(α)| > 1, then for x_n near α
|α − x_{n+1}| = |g'(ξ_n)| |α − x_n| > |α − x_n|,
i.e., the iterates move away from α: divergence.
Examples
Recall α = √5.
1. g(x) = 5 + x − x^2; g'(x) = 1 − 2x, g'(α) = 1 − 2√5 ≈ −3.47. Thus the iteration will not converge to √5.
2. g(x) = 5/x, g'(x) = −5/x^2; g'(α) = −5/(√5)^2 = −1. We cannot conclude from this that the iteration converges or diverges. From the table, it is clear that the iterates will not converge to α.
3. g(x) = 1 + x − x^2/5, g'(x) = 1 − 2x/5, g'(α) = 1 − 2√5/5 ≈ 0.106, i.e., the iteration will converge. Also,
|α − x_{n+1}| ≈ 0.106 |α − x_n|.
Consider the iteration x_{n+1} = x_n + c(x_n^2 − 3) for computing α = √3. Here g(x) = x + c(x^2 − 3), so
g'(x) = 1 + 2cx, and g'(α) = 1 + 2c√3.
We need |g'(α)| < 1:
−1 < 1 + 2c√3 < 1.
Optimal choice: 1 + 2c√3 = 0, i.e., c = −1/(2√3).
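A quick sketch confirming the optimal choice: with g'(α) = 0 the convergence is superlinear.

```python
import math

# Iteration x_{n+1} = x_n + c*(x_n^2 - 3) for alpha = sqrt(3),
# with the optimal c = -1/(2*sqrt(3)) so that g'(alpha) = 0.
c = -1 / (2 * math.sqrt(3))
x = 2.0
for _ in range(8):
    x = x + c * (x**2 - 3)
```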
The possible behaviour of the fixed point iterates x_n depends on the size of g'(α). To see the convergence, consider the case of x_1 = g(x_0), the height of the graph of y = g(x) at x_0. We bring the number x_1 back to the x-axis by using the line y = x and the height y = x_1. We continue this with each iterate, obtaining a stairstep behaviour when g'(α) > 0. When g'(α) < 0, the iterates oscillate around the fixed point α, as can be seen in the figures.
Figure: 1 < g'(α)
Figure: g'(α) < −1
We need a more precise way to deal with the concept of the speed of convergence of an iteration method.

Definition
We say that a sequence {x_n : n ≥ 0} converges to α with order of convergence p ≥ 1 if
|α − x_{n+1}| ≤ c |α − x_n|^p, n ≥ 0,
for some c > 0. For the fixed point iteration, the mean value theorem gives
|α − x_{n+1}| = |g'(ξ_n)| |α − x_n| ≈ |g'(α)| |α − x_n|,
i.e., linear convergence (p = 1) with rate |g'(α)|.
Theorem 2.8
Assume g ∈ C^p(I_ε) for some I_ε containing α, and
g'(α) = g''(α) = ⋯ = g^{(p−1)}(α) = 0, p ≥ 2.
Then for x_0 close enough to α, x_n → α and
lim_{n→∞} (α − x_{n+1}) / (α − x_n)^p = (−1)^{p−1} g^{(p)}(α)/p!,
i.e., convergence is of order p.

Proof: expanding in a Taylor series,
x_{n+1} = g(x_n) = g(α) + (x_n − α) g'(α) + ⋯ + ((x_n − α)^{p−1}/(p−1)!) g^{(p−1)}(α) + ((x_n − α)^p/p!) g^{(p)}(ξ_n),
where g(α) = α and all derivative terms through order p − 1 vanish, so
x_{n+1} − α = ((x_n − α)^p / p!) g^{(p)}(ξ_n).
Hence
(α − x_{n+1})/(α − x_n)^p = (−1)^{p−1} g^{(p)}(ξ_n)/p! → (−1)^{p−1} g^{(p)}(α)/p!.
3. Rootfinding Math 1070
> 3. Rootfinding > 3.4 Fixed point iteration
Newton's method
x_{n+1} = x_n − f(x_n)/f'(x_n), n ≥ 0,
is a fixed point iteration x_{n+1} = g(x_n) with
g(x) = x − f(x)/f'(x).
Then
g'(x) = f(x) f''(x)/(f'(x))^2, so g'(α) = 0,
and
g''(x) = (f' f'' + f f''')/(f')^2 − 2 f (f'')^2/(f')^3, so g''(α) = f''(α)/f'(α).
By Theorem 2.8 with p = 2,
lim_{n→∞} (α − x_{n+1})/(α − x_n)^2 = −g''(α)/2 = −f''(α)/(2 f'(α)).
Consider the modified iteration
x_{n+1} = x_n − f(x_n)/a,
with a fixed; e.g., a = f'(x_0). This is a fixed point iteration
x_{n+1} = x_n − f(x_n)/a = g(x_n), g(x) = x − f(x)/a,
with
g'(α) = 1 − f'(α)/a.
For convergence we need |g'(α)| = |1 − f'(α)/a| < 1: linear convergence with rate |1 − f'(α)/a| (Thm 2.6). If a = f'(x_0) and x_0 is close enough to α, then 1 − f'(α)/a ≈ 0 and the convergence is fast.
Recall from Theorem 2.6 that for the fixed point iteration x_{n+1} = g(x_n),
(α − x_{n+1})/(α − x_n) → g'(α).
By the mean value theorem, α − x_n = g(α) − g(x_{n−1}) = g'(ξ_{n−1})(α − x_{n−1}). From (7.28)-(7.29), writing α − x_{n−1} = (α − x_n) + (x_n − x_{n−1}),
α − x_n = g'(ξ_{n−1})[(α − x_n) + (x_n − x_{n−1})],
so
α − x_n = (g'(ξ_{n−1})/(1 − g'(ξ_{n−1}))) (x_n − x_{n−1}) ≈ (g'(α)/(1 − g'(α))) (x_n − x_{n−1}).
We need an estimate for g'(α).
Define the computable ratio
λ_n = (x_n − x_{n−1}) / (x_{n−1} − x_{n−2}).
Then
λ_n = [(α − x_{n−1}) − (α − x_n)] / [(α − x_{n−2}) − (α − x_{n−1})]
    = [(α − x_{n−1}) − g'(ξ_{n−1})(α − x_{n−1})] / [(α − x_{n−1})/g'(ξ_{n−2}) − (α − x_{n−1})]
    = g'(ξ_{n−2}) (1 − g'(ξ_{n−1})) / (1 − g'(ξ_{n−2})),
so λ_n → g'(α) as n → ∞. Replacing g'(α) by λ_n,
α − x_n ≈ (λ_n/(1 − λ_n)) (x_n − x_{n−1}). (7.30)
From (7.30),
α ≈ x_n + (λ_n/(1 − λ_n)) (x_n − x_{n−1}). (7.31)
Define the

Aitken Extrapolation Formula
x̂_n = x_n + (λ_n/(1 − λ_n)) (x_n − x_{n−1}). (7.32)
Example
Repeat the example for I3. The table contains the differences x_n − x_{n−1}, the ratios λ_n, and the estimated error from α − x_n ≈ (λ_n/(1 − λ_n))(x_n − x_{n−1}), given in the column Estimate. Compare the column Estimate with the error column in the previous table.
n   x_n          x_n − x_{n−1}   λ_n      Estimate
0   2.5
1   2.25         -2.50E-1
2   2.2375       -1.25E-2        0.0500   -6.58E-4
3   2.23621875   -1.28E-3        0.1025   -1.46E-4
4   2.23608389   -1.35E-4        0.1053   -1.59E-5
5   2.23606966   -1.42E-5        0.1055   -1.68E-6
6   2.23606815   -1.50E-6        0.1056   -1.77E-7
7   2.23606800   -1.59E-7        0.1056   -1.87E-8
Algorithm (Aitken)
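A sketch reproducing the λ_n and Estimate quantities for iteration I3 (computed here in double precision, so the values match the 8-digit table only approximately):

```python
# Aitken extrapolation (7.32) applied to the iteration I3 for sqrt(5).
g = lambda x: 1 + x - x**2 / 5
xs = [2.5]
for _ in range(7):
    xs.append(g(xs[-1]))

n = 5
lam = (xs[n] - xs[n-1]) / (xs[n-1] - xs[n-2])    # ratio lambda_n ~ g'(alpha)
estimate = lam / (1 - lam) * (xs[n] - xs[n-1])   # estimate of alpha - x_n  (7.30)
x_hat = xs[n] + estimate                         # Aitken value (7.32)
```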
General remarks
Quasi-Newton Iterates
For solving f(x) = 0: choose x_0 and iterate
x_{k+1} = x_k − f(x_k)/a_k, k = 0, 1, . . . ,
where a_k ≈ f'(x_k). Two important choices:

1. a_k = (f(x_k + f(x_k)) − f(x_k)) / f(x_k): Steffensen's method. This is the finite difference method with h_k = f(x_k); it has quadratic convergence.

2. a_k = (f(x_k) − f(x_{k'})) / (x_k − x_{k'}), where k' is the largest index < k such that f(x_k) f(x_{k'}) < 0: Regula Falsi.
Need x_0, x_1 with f(x_0) f(x_1) < 0; then
x_2 = x_1 − f(x_1) (x_1 − x_0)/(f(x_1) − f(x_0)),
x_3 = x_2 − f(x_2) (x_2 − x_0)/(f(x_2) − f(x_0)), etc.
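A sketch of Steffensen's method for the running example x^6 − x − 1 = 0 (the starting guess 1.2 is an illustrative choice):

```python
def steffensen(f, x, itmax=50, eps=1e-12):
    """Steffensen's method: quasi-Newton with
    a_k = (f(x_k + f(x_k)) - f(x_k)) / f(x_k)."""
    for _ in range(itmax):
        fx = f(x)
        if fx == 0:
            return x
        a = (f(x + fx) - fx) / fx      # finite difference with h = f(x)
        x_new = x - fx / a
        if abs(x_new - x) <= eps:
            return x_new
        x = x_new
    raise RuntimeError("no convergence")

root = steffensen(lambda x: x**6 - x - 1, 1.2)
```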
Example.
(a) f(x) = (x − 1)^2 (x + 2) has two roots: the root α = 1 has multiplicity 2, and α = −2 is a simple root.
(b) f(x) = x^3 − 3x^2 + 3x − 1 has α = 1 as a root of multiplicity 3. To see this, note that f(x) = (x − 1)^3.
When the Newton and secant methods are applied to the calculation of a multiple root α, the convergence of α − x_n to zero is much slower than it would be for a simple root. In addition, there is a large interval of uncertainty as to where the root actually lies, because of the noise in evaluating f(x). The large interval of uncertainty for a multiple root is the most serious problem associated with numerically finding such a root.
The computer used is decimal with six digits in the significand, and it uses rounding. The function f(x) is evaluated in the nested form of (7.39), and f'(x) is evaluated similarly. The results are given in the table.
n   x_n       f(x_n)    α − x_n    Ratio
0   0.800000  0.03510   0.300000
1   0.892857  0.01073   0.207143   0.690
2   0.958176  0.00325   0.141824   0.685
3   1.00344   0.00099   0.09656    0.681
4   1.03486   0.00029   0.06514    0.675
5   1.05581   0.00009   0.04419    0.678
6   1.07028   0.00003   0.02972    0.673
7   1.08092   0.0       0.01908    0.642
and the error decreases at about the constant rate λ = (m − 1)/m. In our example, λ = 2/3, since the root has multiplicity m = 3, which corresponds to the values in the last column of the table. The error formula (7.42) implies a much slower rate of convergence than is usual for Newton's method. With any root of multiplicity m ≥ 2, the number λ = (m − 1)/m ≥ 1/2; thus, the bisection method is always at least as fast as Newton's method for multiple roots. Of course, m must be an odd integer to have f(x) change sign at x = α, thus permitting the bisection method to be applied.
> 3. Rootfinding > 3.5 The Numerical Evaluation of Multiple Roots
Consider Newton's method
x_{k+1} = x_k − f(x_k)/f'(x_k)
applied to
f(x) = (x − α)^p h(x), p ≥ 1, h(α) ≠ 0.
As a fixed point iteration, x_{k+1} = g(x_k) with
g(x) = x − (x − α) h(x) / (p h(x) + (x − α) h'(x)).
Differentiating,
g'(x) = 1 − h(x)/(p h(x) + (x − α) h'(x)) − (x − α) d/dx [ h(x)/(p h(x) + (x − α) h'(x)) ],
and
g'(α) = 1 − 1/p = (p − 1)/p.
E.g., for p = 2: (p − 1)/p = 1/2, so Newton's method converges only linearly with rate 1/2.
Modified Newton's method:
x_{k+1} = x_k − p f(x_k)/f'(x_k),
i.e., x_{k+1} = g(x_k) with
g(x) = x − p f(x)/f'(x).
Now
g'(α) = 1 − p/p = 0,
so the convergence is again quadratic:
lim_{k→∞} (α − x_{k+1})/(α − x_k)^2 = −g''(α)/2.
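A sketch comparing ordinary and modified Newton on the double root α = 1 of f(x) = (x − 1)^2 (x + 2) from the earlier example:

```python
f  = lambda x: (x - 1)**2 * (x + 2)
df = lambda x: 2 * (x - 1) * (x + 2) + (x - 1)**2

def newton_p(x, p, n):
    """x_{k+1} = x_k - p*f(x_k)/f'(x_k); p = multiplicity of the root."""
    for _ in range(n):
        fx = f(x)
        if fx == 0:            # landed exactly on the root
            break
        x = x - p * fx / df(x)
    return x

x_plain = newton_p(1.5, 1, 6)  # ordinary Newton: linear, rate (p-1)/p = 1/2
x_mod   = newton_p(1.5, 2, 6)  # modified Newton: quadratic
```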
Roots of polynomials
Consider p(x) = 0, where
p(x) = a_0 + a_1 x + ⋯ + a_n x^n, a_n ≠ 0,
with factorization
p(x) = a_n (x − z_1)(x − z_2) ⋯ (x − z_n), z_1, . . . , z_n ∈ C.
Cauchy's bound: every root satisfies
|z_j| ≤ 1 + max_{0 ≤ i ≤ n−1} |a_i / a_n|;
together with a corresponding lower bound this gives r_1 ≤ |z_j| ≤ r_2.
p(x) = a_0 + a_1 x + a_2 x^2 + ⋯ + a_n x^n (7.43)
can be evaluated by nested multiplication:
p(x) = a_0 + x(a_1 + x(a_2 + ⋯ + x(a_{n−1} + a_n x) ⋯)). (7.44)
To evaluate p(ξ), define
b_n = a_n,
b_k = a_k + ξ b_{k+1}, k = n − 1, n − 2, . . . , 0;
then p(ξ) = b_0.
Consider
q(x) = b_1 + b_2 x + ⋯ + b_n x^{n−1}.
Claim:
p(x) = b_0 + (x − ξ) q(x).
Proof.
b_0 + (x − ξ) q(x)
= b_0 + (x − ξ)(b_1 + b_2 x + ⋯ + b_n x^{n−1})
= (b_0 − ξ b_1) + (b_1 − ξ b_2) x + ⋯ + (b_{n−1} − ξ b_n) x^{n−1} + b_n x^n
= a_0 + a_1 x + ⋯ + a_n x^n = p(x),
since b_k − ξ b_{k+1} = a_k for each k.
Deflation
Newton's method for p(x) = 0:
x_{k+1} = x_k − p(x_k)/p'(x_k), k = 0, 1, 2, . . .
To evaluate p and p' at x = ξ:
p(ξ) = b_0;
p'(x) = q(x) + (x − ξ) q'(x), so p'(ξ) = q(ξ).
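A sketch of nested multiplication with deflation; the cubic (x − 1)(x − 2)(x − 3) is an illustrative example, not one from the text:

```python
def horner(a, xi):
    """Nested multiplication (7.44): given coefficients a = [a0, ..., an],
    return p(xi) = b0 and the deflated coefficients [b1, ..., bn] of q."""
    n = len(a) - 1
    b = [0.0] * (n + 1)
    b[n] = a[n]
    for k in range(n - 1, -1, -1):
        b[k] = a[k] + xi * b[k + 1]
    return b[0], b[1:]

# p(x) = x^3 - 6x^2 + 11x - 6 = (x-1)(x-2)(x-3), evaluated at xi = 2
p2, q = horner([-6.0, 11.0, -6.0, 1.0], 2.0)
dp2 = q[0] + 2.0 * q[1] + 2.0**2 * q[2]   # p'(2) = q(2)
```

Here p(2) = 0 confirms 2 is a root, and q = [3, −4, 1] is (x − 1)(x − 3), the deflated polynomial.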
Given: a = (a_0, a_1, . . . , a_n).
Output: b = (b_1, b_2, . . . , b_n), the coefficients of the deflated polynomial q(x), and the root.

Newton(a, n, x_0, ε, itmax, root, b, ierr)
itnum := 1
1. ξ := x_0; b_n := a_n; c := a_n
   for k = n − 1, . . . , 1: b_k := a_k + ξ b_{k+1}; c := b_k + ξ c    { c = p'(ξ) }
   b_0 := a_0 + ξ b_1                                                 { b_0 = p(ξ) }
   if c = 0: ierr := 2, exit                                          { p'(ξ) = 0 }
   x_1 := x_0 − b_0/c                                                 { x_1 = x_0 − p(x_0)/p'(x_0) }
   if |x_0 − x_1| ≤ ε: ierr := 0, root := x_1, exit
   if itnum = itmax: ierr := 1, exit
   itnum := itnum + 1; x_0 := x_1; go to 1.
Conditioning
p(x) = a_0 + a_1 x + ⋯ + a_n x^n, with roots α_1, . . . , α_n.
Perturbation polynomial: q(x) = b_0 + b_1 x + ⋯ + b_n x^n.
Perturbed polynomial:
p(x; ε) = p(x) + ε q(x) = (a_0 + ε b_0) + (a_1 + ε b_1) x + ⋯ + (a_n + ε b_n) x^n,
with roots α_j(ε), continuous functions of ε, α_j(0) = α_j.
(Absolute) condition number:
k_j = lim_{ε→0} |α_j(ε) − α_j| / |ε|.
Example
(x − 1)^3 = 0 has the triple root α_1 = α_2 = α_3 = 1.
Perturbed: (x − 1)^3 = ε (i.e., q(x) = 1).
Set y = x − 1 and a = ε^{1/3}:
y^3 = ε, i.e., y^3 − a^3 = (y − a)(y^2 + ya + a^2) = 0,
so y_1 = a and
y_{2,3} = (−a ± √(a^2 − 4a^2))/2 = a(−1 ± i√3)/2,
i.e., y_2 = aω, y_3 = aω^2, where ω = (−1 + i√3)/2, |ω| = 1.
The perturbed roots are
α_1(ε) = 1 + ε^{1/3}, α_2(ε) = 1 + ω ε^{1/3}, α_3(ε) = 1 + ω^2 ε^{1/3},
so
|α_j(ε) − 1| = |ε|^{1/3}, and |α_j(ε) − 1|/|ε| = |ε|^{−2/3} → ∞ as ε → 0.
Condition number = ∞.
General argument
Let α be a simple root: p(α) = 0, p'(α) ≠ 0. The perturbed polynomial p(x; ε) has a root α(ε) with an expansion
α(ε) = α + Σ_{ℓ≥1} γ_ℓ ε^ℓ = α + γ_1 ε + γ_2 ε^2 + ⋯,
where the term γ_1 ε is what matters; the remaining terms are negligible if ε is small:
(α(ε) − α)/ε = γ_1 + γ_2 ε + ⋯ → γ_1 = α'(0).
To find γ_1, differentiate the identity p(α(ε), ε) = 0, i.e.,
p(α(ε)) + ε q(α(ε)) = 0,
with respect to ε:
p'(α(ε)) α'(ε) + q(α(ε)) + ε q'(α(ε)) α'(ε) = 0.
Setting ε = 0:
p'(α) α'(0) + q(α) = 0, so γ_1 = −q(α)/p'(α).
Hence
k = |γ_1| = |q(α)/p'(α)|,
and k is large if p'(α) is close to zero.
Example
p(x) = W_7 = ∏_{i=1}^{7} (x − i),
q(x) = x^6, ε = 0.002.
Here α_j = j and
p'(α_j) = ∏_{ℓ=1, ℓ≠j}^{7} (j − ℓ),
so
k_j = |q(α_j)/p'(α_j)| = j^6 / |∏_{ℓ≠j} (j − ℓ)|.
In particular,
α_j(ε) ≈ α_j − ε q(α_j)/p'(α_j).
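A sketch computing the condition numbers k_j for this example (the perturbed roots move by roughly ε · k_j):

```python
# Condition numbers k_j = |q(alpha_j)/p'(alpha_j)| for W7 = prod_{i=1..7} (x - i),
# with q(x) = x^6.
def k(j):
    dp = 1.0
    for l in range(1, 8):
        if l != j:
            dp *= (j - l)          # p'(j) = prod_{l != j} (j - l)
    return abs(j**6 / dp)

ks = {j: k(j) for j in range(1, 8)}
worst = max(ks, key=ks.get)        # the worst-conditioned root
```

With ε = 0.002 the root j = 6 (k_6 = 388.8) is predicted to move by about 0.78, even though the perturbation of the coefficients is tiny.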
Systems of Nonlinear Equations
f_1(x_1, . . . , x_m) = 0,
f_2(x_1, . . . , x_m) = 0,
⋮ (7.45)
f_m(x_1, . . . , x_m) = 0.
If we denote
F(x) = (f_1(x), f_2(x), . . . , f_m(x))^T, F : R^m → R^m,
then (7.45) is equivalent to writing
F(x) = 0. (7.46)
Fixed point form:
x = G(x), G : R^m → R^m.
A solution α: α = G(α) is called a fixed point of G.
Example: F(x) = 0 is equivalent to
x = x − A F(x) ≡ G(x), for some nonsingular matrix A ∈ R^{m×m}.
Iteration: choose an initial guess x_0 and set
x_{n+1} = G(x_n), n = 0, 1, 2, . . .
Recall, for x ∈ R^m,
‖x‖_p = (Σ_{i=1}^{m} |x_i|^p)^{1/p}, 1 ≤ p < ∞,
‖x‖_∞ = max_{1≤i≤m} |x_i|.
Proof. Assume G ∈ Lip_λ(D) with λ < 1 and G(D) ⊆ D. Then
‖x_k − x_0‖ = ‖Σ_{i=0}^{k−1} (x_{i+1} − x_i)‖ ≤ Σ_{i=0}^{k−1} ‖x_{i+1} − x_i‖
≤ Σ_{i=0}^{k−1} λ^i ‖x_1 − x_0‖ = ((1 − λ^k)/(1 − λ)) ‖x_1 − x_0‖
< (1/(1 − λ)) ‖x_1 − x_0‖.
The same bound applied to ‖x_k − x_ℓ‖ shows that {x_n} is a Cauchy sequence, so
x_{n+1} = G(x_n) → α as n → ∞,
and α = G(α): α is a fixed point.
Uniqueness: assume β = G(β). Then
‖α − β‖ = ‖G(α) − G(β)‖ ≤ λ ‖α − β‖,
so (1 − λ)‖α − β‖ ≤ 0 with 1 − λ > 0, hence ‖α − β‖ = 0 and α = β.
Jacobian matrix
Definition
F : R^m → R^m is continuously differentiable (F ∈ C^1(R^m)) if, for every x ∈ R^m, the partial derivatives
∂f_i(x)/∂x_j, i, j = 1, . . . , m,
exist and are continuous. The Jacobian matrix is the m × m matrix
F'(x) := [ ∂f_i(x)/∂x_j ], i.e., (F'(x))_{ij} = ∂f_i(x)/∂x_j, i, j = 1, . . . , m,
with row index i and column index j.
For a scalar function f, the mean value theorem gives f(x) − f(y) = ∇f(z)^T (x − y) for some z on the segment joining x and y, where
∇f(z)^T (x − y) = (∂f(z)/∂x_1)(x_1 − y_1) + ⋯ + (∂f(z)/∂x_m)(x_m − y_m).
Applying this to each component of F:
f_i(x) − f_i(y) = ∇f_i(z_i)^T (x − y), i = 1, . . . , m,
with (in general different) points z_i on the segment joining x and y.
For the fixed point iteration, the mean value theorem gives
α_i − (x_{n+1})_i = g_i(α) − g_i(x_n) = ∇g_i(z_{ni})^T (α − x_n), i = 1, . . . , m,
with z_{ni} on the segment joining α and x_n. Stacking these rows,
α − x_{n+1} = J_n (α − x_n), (7.47)
where J_n is the matrix with rows ∇g_1(z_{n1})^T, . . . , ∇g_m(z_{nm})^T.
If x_n → α, then J_n → G'(α), the Jacobian of G at α. The size of G'(α) will affect convergence.
Theorem 2.9
Assume
D is a closed, bounded, convex subset of R^m,
G ∈ C^1(D),
G(D) ⊆ D,
λ = max_{x∈D} ‖G'(x)‖_∞ < 1.
Then
(i) x = G(x) has a unique solution α ∈ D,
(ii) for any x_0 ∈ D, x_{n+1} = G(x_n) converges to α,
(iii) ‖α − x_{n+1}‖_∞ ≤ (‖G'(α)‖_∞ + ε_n) ‖α − x_n‖_∞, where ε_n → 0.

Proof sketch: for x, y ∈ D, the mean value theorem gives, as in (7.47),
‖α − x_{n+1}‖_∞ ≤ ‖J_n‖_∞ ‖α − x_n‖_∞ ≤ (‖J_n − G'(α)‖_∞ + ‖G'(α)‖_∞) ‖α − x_n‖_∞,
with ε_n = ‖J_n − G'(α)‖_∞ → 0 as x_n → α.
Example (p.104)
Solve
f_1(x_1, x_2) ≡ 3x_1^2 + 4x_2^2 − 1 = 0,
f_2(x_1, x_2) ≡ x_2^3 − 8x_1^3 − 1 = 0,
for α near (x_1, x_2) = (−0.5, 0.25). Iteratively:
(x_{1,n+1}, x_{2,n+1})^T = (x_{1,n}, x_{2,n})^T − A (3x_{1,n}^2 + 4x_{2,n}^2 − 1, x_{2,n}^3 − 8x_{1,n}^3 − 1)^T,
A = [ 0.016  −0.17
      0.52   −0.26 ].
This is
x_{n+1} = x_n − A F(x_n) ≡ G(x_n),
and one finds
‖α − x_{n+1}‖ / ‖α − x_n‖ ≤ 0.04, since ‖G'(α)‖ ≈ 0.04 with A = F'(x_0)^{−1}.
Why?
G'(x) = I − A F'(x), so G'(α) = I − A F'(α).
We need ‖G'(α)‖ ≈ 0, i.e.,
A ≈ F'(α)^{−1}; here A = F'(x_0)^{−1} with x_0 near α.
The m-dimensional Parallel Chords Method:
x_{n+1} = x_n − (F'(x_0))^{−1} F(x_n).
Newton's method for systems uses F'(x_n) instead. In practice the Jacobian is not inverted:
1. Solve the linear system F'(x_n) δ_n = −F(x_n).
2. Set x_{n+1} = x_n + δ_n.
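A sketch of Newton's method for the 2 × 2 example system above (the signs of f_2 and the root location are reconstructed, so treat them as an assumption); the linear system is solved by Cramer's rule:

```python
def F(x1, x2):
    return (3*x1**2 + 4*x2**2 - 1, x2**3 - 8*x1**3 - 1)

def J(x1, x2):
    """Jacobian F'(x)."""
    return ((6*x1, 8*x2), (-24*x1**2, 3*x2**2))

x1, x2 = -0.5, 0.25                   # initial guess near the root
for _ in range(10):
    f1, f2 = F(x1, x2)
    (a, b), (c, d) = J(x1, x2)
    det = a*d - b*c
    # solve J * (d1, d2) = (-f1, -f2) by Cramer's rule
    d1 = (-f1*d + f2*b) / det
    d2 = (-f2*a + f1*c) / det
    x1, x2 = x1 + d1, x2 + d2

residual = max(abs(v) for v in F(x1, x2))
```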
For Newton's method, ‖G'(α)‖ = 0. If G ∈ C^1(B_r(α)), where B_r(α) = {y : ‖y − α‖ ≤ r}, then by continuity
‖G'(x)‖ ≤ λ < 1 for x ∈ B_r(α),
for some r > 0. By Theorem 2.9 with D = B_r(α), we get (at least) linear convergence.
Theorem
Assume
F ∈ C^1(D), with F' ∈ Lip_γ(D),
there is α ∈ D such that F(α) = 0,
(F'(α))^{−1} exists, with ‖(F'(α))^{−1}‖ ≤ β.
Then there is ε > 0 such that if ‖x_0 − α‖ < ε, the Newton iterates
x_{n+1} = x_n − (F'(x_n))^{−1} F(x_n)
are well defined and
‖x_{n+1} − α‖ ≤ βγ ‖x_n − α‖^2
(γ: a measure of the nonlinearity of F). So we need ε < 1/(βγ).
Quasi-Newton methods:
x_{n+1} = x_n − A_n^{−1} F(x_n), A_n ≈ F'(x_n).
Ex.: Finite Difference Newton:
(A_n)_{ij} = (f_i(x_n + h_n e_j) − f_i(x_n)) / h_n ≈ ∂f_i(x_n)/∂x_j,
with h_n small and e_j = (0 . . . 0 1 0 . . . 0)^T, the 1 in the j-th position.
Global Convergence
Newton's method:
x_{n+1} = x_n + s_n d_n,
where
d_n = −(F'(x_n))^{−1} F(x_n), s_n = 1.
If the full Newton step s_n = 1 is not satisfactory, e.g., ‖F(x_{n+1})‖_2 > ‖F(x_n)‖_2, the step length s_n is reduced. In library solvers, the system is typically defined through a user function fun, starting from the vector x0 as initial guess; the function fun returns the n values f_1(x), . . . , f_n(x) for any value of the input vector x.
Unconstrained Minimization
Given f(x_1, x_2, . . . , x_n) : R^n → R, find
min_{x∈R^n} f(x).
Solve:
∇f(x) = 0, with ∇f = (∂f/∂x_1, . . . , ∂f/∂x_n)^T. (7.48)
The Hessian matrix is
H_{ij} = ∂^2 f / (∂x_i ∂x_j),
and Newton's method for (7.48) is
x_{n+1} = x_n − H(x_n)^{−1} ∇f(x_n).
Taylor:
m_n(x) = f(x_n) + ∇f(x_n)^T (x − x_n) + ½ (x − x_n)^T H(x_n) (x − x_n),
so m_n(x) ≈ f(x) for x near x_n. Newton's method minimizes this quadratic model; note that
∇^2 m_n(x_n) = H(x_n).
Modify:
x_{n+1} = x_n − H̃(x_n)^{−1} ∇f(x_n),
where
H̃(x_n) = H(x_n) + μ_n I,
for some μ_n ≥ 0. If λ_1, . . . , λ_n are the eigenvalues of H(x_n) and λ̃_1, . . . , λ̃_n the eigenvalues of H̃(x_n), then
λ̃_i = λ_i + μ_n.
Need μ_n such that λ_min + μ_n > 0, so that H̃(x_n) is positive definite.
Gershgorin Theorem: for A = (a_{ij}), the eigenvalues lie in the union of the circles with centers a_{ii} and radii r_i = Σ_{j=1, j≠i}^{n} |a_{ij}|.
3. Rootfinding Math 1070
> 3. Rootfinding > Unconstrained Minimization
Gershgorin Circles

Figure: three Gershgorin circles in the complex plane, with radii r_1 = 3, r_2 = 2, r_3 = 1.
Descent Methods
Definition
d is a descent direction for f(x) at the point x_0 if
f(x_0) > f(x_0 + λ d) for 0 < λ ≤ λ_0,
for some λ_0 > 0.

Lemma
d is a descent direction if and only if ∇f(x_0)^T d < 0.
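The lemma can be checked numerically on a simple quadratic (the function f(x_1, x_2) = x_1^2 + 3x_2^2 is an illustrative choice, not from the text):

```python
# Check: at x0 = (1, 1), grad f = (2, 6), and d = -grad f satisfies
# grad^T d < 0, so a small step along d decreases f.
def f(x1, x2):
    return x1**2 + 3*x2**2

g = (2.0, 6.0)                   # gradient of f at (1, 1)
d = (-g[0], -g[1])               # steepest-descent direction
slope = g[0]*d[0] + g[1]*d[1]    # grad f(x0)^T d
lam = 0.01
decreased = f(1 + lam*d[0], 1 + lam*d[1]) < f(1, 1)
```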