Solving System of Linear Equations: Y. Sharath Chandra Mouli

Solving System of Linear Equations
Y. Sharath Chandra Mouli
Indian Institute of Technology Kanpur
sharath@iitk.ac.in
September 11, 2017
Overview I
1 System of Linear Equations
2 Gaussian Elimination (Back substitution)
3 Gauss-Jordan Elimination
4 LU Decomposition
Performing the LU Decomposition
Crout's algorithm
5 Tridiagonal Systems of Equations
6 Band Diagonal Systems of Equations
7 Iterative Methods
Stationary Iterative Methods
The Jacobi Method
The Gauss-Seidel Method
Successive Overrelaxation Method (SOR)
Symmetric Successive Overrelaxation Method (SSOR)
Non-Stationary Iterative Methods
Conjugate Gradient Method (CG)
MINRES and SYMMLQ
Overview II
CG on Normal Equations, CGNE and CGNR
Generalized Minimal Residual (GMRES)
BiConjugate Gradient (BiCG)
Quasi-Minimal Residual (QMR)
Conjugate Gradient Squared Method (CGS)
BiConjugate Gradient Stabilized (Bi-CGSTAB)
Conclusions from Iterative Methods
System of Linear Equations
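$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1N}x_N &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2N}x_N &= b_2 \\
&\;\;\vdots \\
a_{M1}x_1 + a_{M2}x_2 + \cdots + a_{MN}x_N &= b_M
\end{aligned}
\tag{1}
$$
Here the N unknowns $x_j$, $j = 1, 2, \ldots, N$, are related by M equations.
The coefficients $a_{ij}$ with $i = 1, 2, \ldots, M$ and $j = 1, 2, \ldots, N$ are known numbers, as are the right-hand side quantities $b_i$, $i = 1, 2, \ldots, M$.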
Gaussian Elimination (Back substitution)
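Echelon form (for a square system, M = N; primes mark entries modified by the elimination):
$$
\begin{bmatrix}
a'_{11} & a'_{12} & a'_{13} & \cdots & a'_{1N} \\
0 & a'_{22} & a'_{23} & \cdots & a'_{2N} \\
\vdots & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & a'_{NN}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{bmatrix}
=
\begin{bmatrix} b'_1 \\ b'_2 \\ \vdots \\ b'_N \end{bmatrix}
$$
Back substitution:
$$ x_N = b'_N / a'_{NN} \tag{2} $$
$$ x_i = \frac{1}{a'_{ii}} \left[ b'_i - \sum_{j=i+1}^{N} a'_{ij} x_j \right], \quad i = N-1, N-2, \ldots, 1 \tag{3} $$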
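The elimination and back substitution steps can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code: the function name `gauss_solve` is hypothetical, and the partial pivoting (row swaps) is an addition that the slide does not discuss.

```python
def gauss_solve(a, b):
    """Solve a x = b by Gaussian elimination followed by back substitution."""
    n = len(b)
    # Work on an augmented copy [A | b] so the inputs stay untouched.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]

    # Forward elimination to upper-triangular (echelon) form.
    for col in range(n):
        # Partial pivoting (an assumption, not on the slide): pick the largest pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]

    # Back substitution, Eqs. (2) and (3).
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


x = gauss_solve([[2, 1, -1], [-3, -1, 2], [-2, 1, 2]], [8, -11, -3])
```

For this small system the exact solution is (2, 3, -1), which the routine should reproduce up to rounding.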
Gauss-Jordan Elimination
Linear matrix equation
$$ [A] \cdot [\, x_1 \mid x_2 \mid x_3 \mid Y \,] = [\, b_1 \mid b_2 \mid b_3 \mid I \,] $$
where A and Y are square matrices, the $b_i$ and $x_i$ are column vectors placed side by side, and I is the identity matrix. It simultaneously solves the linear sets
$$ [A]\{x_1\} = \{b_1\}, \quad [A]\{x_2\} = \{b_2\}, \quad [A]\{x_3\} = \{b_3\}, \quad \text{and} \quad [A]\{Y\} = \{I\}, $$
so that Y is the inverse of A.
It requires all the right-hand sides to be stored and manipulated at the same time.
When the inverse matrix is not desired, Gauss-Jordan is three times slower than the best alternative technique for solving a single linear set.
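As a rough sketch of the idea, the following Python routine reduces the augmented matrix [A | B] to [I | X], so one right-hand side and the identity can be processed in the same pass. The name `gauss_jordan` and the row pivoting are assumptions made for this illustration.

```python
def gauss_jordan(a, b):
    """Reduce [A | B] to [I | X] and return X; each column of B is one right-hand side."""
    n = len(a)
    m = [[float(v) for v in a[i]] + [float(v) for v in b[i]] for i in range(n)]
    for col in range(n):
        # Row pivoting (an assumption added for numerical safety).
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so the pivot becomes 1.
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        # Eliminate the column from every other row, above and below.
        for row in range(n):
            if row != col:
                f = m[row][col]
                m[row] = [v - f * pv for v, pv in zip(m[row], m[col])]
    return [row[n:] for row in m]


a = [[2, 1], [1, 3]]
# Columns of rhs: the vector b1 = (3, 5) followed by the 2 x 2 identity.
rhs = [[3, 1, 0],
       [5, 0, 1]]
solution = gauss_jordan(a, rhs)
x1 = [row[0] for row in solution]        # solves A x1 = b1
inverse = [row[1:] for row in solution]  # solves A Y = I
```

All right-hand sides ride along in the same working array, which is the storage cost the slide mentions.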
LU Decomposition
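Writing
$$ L \cdot U = A $$
where L is a lower triangular matrix (elements $\alpha_{ij}$) and U is an upper triangular matrix (elements $\beta_{ij}$).
For a 4 × 4 matrix,
$$
\begin{bmatrix}
\alpha_{11} & 0 & 0 & 0 \\
\alpha_{21} & \alpha_{22} & 0 & 0 \\
\alpha_{31} & \alpha_{32} & \alpha_{33} & 0 \\
\alpha_{41} & \alpha_{42} & \alpha_{43} & \alpha_{44}
\end{bmatrix}
\begin{bmatrix}
\beta_{11} & \beta_{12} & \beta_{13} & \beta_{14} \\
0 & \beta_{22} & \beta_{23} & \beta_{24} \\
0 & 0 & \beta_{33} & \beta_{34} \\
0 & 0 & 0 & \beta_{44}
\end{bmatrix}
=
\begin{bmatrix}
a_{11} & a_{12} & a_{13} & a_{14} \\
a_{21} & a_{22} & a_{23} & a_{24} \\
a_{31} & a_{32} & a_{33} & a_{34} \\
a_{41} & a_{42} & a_{43} & a_{44}
\end{bmatrix}
$$
We use the decomposition to write
$$ A x = (LU) x = L (U x) = b, $$
so that we first solve $L y = b$ for y and then solve $U x = y$ for x.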
LU Decomposition
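By forward substitution (with $\alpha_{ij}$ the elements of L),
$$ y_1 = \frac{b_1}{\alpha_{11}} $$
$$ y_i = \frac{1}{\alpha_{ii}} \left[ b_i - \sum_{j=1}^{i-1} \alpha_{ij} y_j \right], \quad i = 2, 3, \ldots, N $$
By backward substitution (with $\beta_{ij}$ the elements of U),
$$ x_N = \frac{y_N}{\beta_{NN}} $$
$$ x_i = \frac{1}{\beta_{ii}} \left[ y_i - \sum_{j=i+1}^{N} \beta_{ij} x_j \right], \quad i = N-1, N-2, \ldots, 1 $$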
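The two triangular solves can be sketched as follows, assuming the factors L and U are already available as nested lists. The function name `lu_solve` and the small example factors are hypothetical.

```python
def lu_solve(lower, upper, b):
    """Solve (L U) x = b by forward substitution (L y = b) then back substitution (U x = y)."""
    n = len(b)
    # Forward substitution: y_i depends only on y_1 .. y_{i-1}.
    y = [0.0] * n
    for i in range(n):
        s = sum(lower[i][j] * y[j] for j in range(i))
        y[i] = (b[i] - s) / lower[i][i]
    # Back substitution: x_i depends only on x_{i+1} .. x_N.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(upper[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / upper[i][i]
    return x


# Example factors whose product is A = [[4, 3], [2, 1]].
lower = [[1.0, 0.0], [0.5, 1.0]]
upper = [[4.0, 3.0], [0.0, -0.5]]
x = lu_solve(lower, upper, [10.0, 4.0])
```

Once the factors exist, each additional right-hand side costs only these two sweeps.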
Performing the LU Decomposition
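Crout's algorithm (with $\alpha_{ij}$ the elements of L and $\beta_{ij}$ the elements of U):
Set $\alpha_{ii} = 1$, $i = 1, \ldots, N$.
For each $j = 1, 2, 3, \ldots, N$ do these two procedures:
First, for $i = 1, 2, \ldots, j$,
$$ \beta_{ij} = a_{ij} - \sum_{k=1}^{i-1} \alpha_{ik} \beta_{kj} $$
Second, for $i = j+1, j+2, \ldots, N$,
$$ \alpha_{ij} = \frac{1}{\beta_{jj}} \left[ a_{ij} - \sum_{k=1}^{j-1} \alpha_{ik} \beta_{kj} \right] $$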
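A direct transcription of the two procedures into Python might look like the sketch below. The names `crout_lu`, `alpha` (entries of L) and `beta` (entries of U) are chosen for this illustration, and no pivoting is done, so a zero diagonal entry of U stops the routine.

```python
def crout_lu(a):
    """Factor a = L U column by column, with a unit diagonal on L."""
    n = len(a)
    alpha = [[0.0] * n for _ in range(n)]  # lower triangular factor L
    beta = [[0.0] * n for _ in range(n)]   # upper triangular factor U
    for i in range(n):
        alpha[i][i] = 1.0
    for j in range(n):
        # First procedure: entries of U in column j (rows 0 .. j).
        for i in range(j + 1):
            s = sum(alpha[i][k] * beta[k][j] for k in range(i))
            beta[i][j] = a[i][j] - s
        if beta[j][j] == 0.0:
            raise ZeroDivisionError("zero pivot; pivoting would be needed")
        # Second procedure: entries of L in column j (rows j+1 .. n-1).
        for i in range(j + 1, n):
            s = sum(alpha[i][k] * beta[k][j] for k in range(j))
            alpha[i][j] = (a[i][j] - s) / beta[j][j]
    return alpha, beta


lower, upper = crout_lu([[4.0, 3.0], [2.0, 1.0]])
```

Each entry needed on the right-hand side has already been computed by the time it is used, which is why the column ordering works.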
Tridiagonal Systems of Equations
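Non-zero elements occur only on the diagonal plus or minus one column.
The set of equations to be solved is
$$
\begin{bmatrix}
b_1 & c_1 & 0 & \cdots & & & \\
a_2 & b_2 & c_2 & \cdots & & & \\
 & & & \cdots & & & \\
 & & & \cdots & a_{N-1} & b_{N-1} & c_{N-1} \\
 & & & \cdots & 0 & a_N & b_N
\end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_{N-1} \\ u_N \end{bmatrix}
=
\begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_{N-1} \\ r_N \end{bmatrix}
$$
Condition for diagonal dominance:
$$ |b_j| > |a_j| + |c_j|, \quad j = 1, \ldots, N $$
The elimination algorithm for tridiagonal systems, which does no pivoting, may fail if the matrix does not satisfy the diagonal dominance condition.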
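The slide does not spell the algorithm out, so the following is a sketch of the usual non-pivoting forward elimination and back substitution for tridiagonal systems (often called the Thomas algorithm). The function name `solve_tridiagonal` and the storage convention (`a[0]` and `c[-1]` unused) are assumptions.

```python
def solve_tridiagonal(a, b, c, r):
    """Solve a tridiagonal system with sub-diagonal a, diagonal b, super-diagonal c."""
    n = len(b)
    gamma = [0.0] * n   # scaled super-diagonal of the eliminated system
    u = [0.0] * n
    pivot = b[0]
    if pivot == 0.0:
        raise ZeroDivisionError("zero pivot in row 0")
    u[0] = r[0] / pivot
    # Forward elimination: remove the sub-diagonal entry of each row.
    for j in range(1, n):
        gamma[j] = c[j - 1] / pivot
        pivot = b[j] - a[j] * gamma[j]
        if pivot == 0.0:
            raise ZeroDivisionError("zero pivot; the no-pivoting algorithm fails")
        u[j] = (r[j] - a[j] * u[j - 1]) / pivot
    # Back substitution.
    for j in range(n - 2, -1, -1):
        u[j] -= gamma[j + 1] * u[j + 1]
    return u


# Diagonally dominant example with exact solution (1, 2, 3).
u = solve_tridiagonal([0.0, 1.0, 1.0], [4.0, 4.0, 4.0], [1.0, 1.0, 0.0], [6.0, 12.0, 14.0])
```

The work and storage are both proportional to N, and the zero-pivot checks mark where the failure mentioned above would occur.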
Band Diagonal Systems of Equations
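Definition:
$$ a_{ij} = 0 \quad \text{when} \quad j > i + m_2 \quad \text{or} \quad i > j + m_1 $$
Band diagonal matrices are stored and manipulated in a so-called compact form, which results in a long narrow matrix with $m_1 + 1 + m_2$ columns and N rows.
$$
\begin{bmatrix}
3 & 1 & 0 & 0 & 0 & 0 & 0 \\
4 & 1 & 5 & 0 & 0 & 0 & 0 \\
9 & 2 & 6 & 5 & 0 & 0 & 0 \\
0 & 3 & 5 & 8 & 9 & 0 & 0 \\
0 & 0 & 7 & 9 & 3 & 2 & 0 \\
0 & 0 & 0 & 3 & 8 & 4 & 6 \\
0 & 0 & 0 & 0 & 2 & 4 & 4
\end{bmatrix}
\;\Longrightarrow\;
\begin{bmatrix}
x & x & 3 & 1 \\
x & 4 & 1 & 5 \\
9 & 2 & 6 & 5 \\
3 & 5 & 8 & 9 \\
7 & 9 & 3 & 2 \\
3 & 8 & 4 & 6 \\
2 & 4 & 4 & x
\end{bmatrix}
$$
This band diagonal matrix, which has N = 7, $m_1 = 2$, and $m_2 = 1$, is stored compactly as the 7 × 4 matrix on the right. Entries marked x are unused.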
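To make the compact layout concrete, the sketch below packs a full band matrix into its N × (m1 + 1 + m2) form and multiplies the packed form by a vector. The function names are hypothetical, and `None` stands in for the unused entries marked x.

```python
def to_compact(a, m1, m2):
    """Pack a band matrix: row i, slot k holds a[i][i - m1 + k]."""
    n = len(a)
    compact = [[None] * (m1 + m2 + 1) for _ in range(n)]
    for i in range(n):
        for k in range(m1 + m2 + 1):
            j = i - m1 + k
            if 0 <= j < n:
                compact[i][k] = a[i][j]
    return compact


def band_matvec(compact, m1, m2, x):
    """Compute A x directly from the compact storage."""
    n = len(compact)
    y = [0.0] * n
    for i in range(n):
        for k in range(m1 + m2 + 1):
            j = i - m1 + k
            if 0 <= j < n:
                y[i] += compact[i][k] * x[j]
    return y


full = [
    [3, 1, 0, 0, 0, 0, 0],
    [4, 1, 5, 0, 0, 0, 0],
    [9, 2, 6, 5, 0, 0, 0],
    [0, 3, 5, 8, 9, 0, 0],
    [0, 0, 7, 9, 3, 2, 0],
    [0, 0, 0, 3, 8, 4, 6],
    [0, 0, 0, 0, 2, 4, 4],
]
compact = to_compact(full, m1=2, m2=1)
row_sums = band_matvec(compact, 2, 1, [1.0] * 7)
```

With this indexing the diagonal always sits in slot `m1` of each row, so the packed array needs N(m1 + 1 + m2) entries instead of N².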
Iterative Methods
Stationary methods:
Stationary methods are older, simpler to understand and implement, but usually not as effective.
Iterative methods that can be expressed in the simple form
$$ x^{(k)} = B x^{(k-1)} + c \tag{4} $$
(where neither B nor c depends upon the iteration count k) are called stationary iterative methods.
Nonstationary methods:
Nonstationary methods are a relatively recent development.
Their analysis is usually harder to understand, but they can be highly effective.
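The Jacobi Method:
Let $Ax = b$ be the linear system. Its ith equation is
$$ \sum_{j=1}^{n} a_{i,j} x_j = b_i \tag{5} $$
Solving for $x_i$ while assuming the other entries of x remain fixed gives
$$ x_i = \frac{b_i - \sum_{j \neq i} a_{i,j} x_j}{a_{i,i}} \tag{6} $$
As an iterative method,
$$ x_i^{(k)} = \frac{b_i - \sum_{j \neq i} a_{i,j} x_j^{(k-1)}}{a_{i,i}}, \tag{7} $$
which is the Jacobi method.
In matrix terms, $x^{(k)} = D^{-1}(L + U)x^{(k-1)} + D^{-1}b$,
where D, $-L$, and $-U$ are the diagonal, strictly lower triangular, and strictly upper triangular parts of A, so that $A = D - L - U$.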
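A minimal sketch of the component-wise Jacobi update is given below. The function name `jacobi`, the zero starting guess, and the stopping rule (largest change between successive iterates below `tol`) are assumptions made for illustration.

```python
def jacobi(a, b, tol=1e-10, max_iter=500):
    """Jacobi iteration: every component is updated from the previous iterate only."""
    n = len(b)
    x = [0.0] * n
    for _ in range(max_iter):
        x_new = [
            (b[i] - sum(a[i][j] * x[j] for j in range(n) if j != i)) / a[i][i]
            for i in range(n)
        ]
        converged = max(abs(p - q) for p, q in zip(x_new, x)) < tol
        x = x_new
        if converged:
            break
    return x


# Diagonally dominant test system with exact solution (1, 2, 3).
a = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]
x = jacobi(a, b)
```

Because the new iterate is built entirely from the old one, the order in which the equations are visited does not matter.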
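The Gauss-Seidel Method:
The equations are examined one at a time in sequence, and each newly computed component is used as soon as it is available:
$$ x_i^{(k)} = \frac{b_i - \sum_{j<i} a_{i,j} x_j^{(k)} - \sum_{j>i} a_{i,j} x_j^{(k-1)}}{a_{i,i}} \tag{8} $$
In matrix terms, the Gauss-Seidel method can be expressed as
$$ x^{(k)} = (D - L)^{-1}(U x^{(k-1)} + b) $$
where D, $-L$, and $-U$ are the diagonal, strictly lower triangular, and strictly upper triangular parts of A.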
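The only change from Jacobi is that the vector is overwritten in place, as the sketch below shows. The name `gauss_seidel`, the zero starting guess, and the stopping rule are assumptions.

```python
def gauss_seidel(a, b, tol=1e-10, max_iter=500):
    """Gauss-Seidel iteration: updated components are reused within the same sweep."""
    n = len(b)
    x = [0.0] * n
    for _ in range(max_iter):
        change = 0.0
        for i in range(n):
            # x[j] for j < i already holds the new value, for j > i the old one.
            s = sum(a[i][j] * x[j] for j in range(n) if j != i)
            new_value = (b[i] - s) / a[i][i]
            change = max(change, abs(new_value - x[i]))
            x[i] = new_value
        if change < tol:
            break
    return x


a = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]
x = gauss_seidel(a, b)
```

Unlike Jacobi, the result of each sweep depends on the order in which the equations are processed.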
Successive Overrelaxation Method (SOR):
SOR applies extrapolation to the Gauss-Seidel method.
It takes a weighted average between the previous iterate and the computed Gauss-Seidel iterate, successively for each component:
$$ x_i^{(k)} = \omega \bar{x}_i^{(k)} + (1 - \omega) x_i^{(k-1)} \tag{9} $$
where $\bar{x}$ denotes a Gauss-Seidel iterate and $\omega$ is the extrapolation factor.
The value of $\omega$ is chosen to accelerate the rate of convergence; the method can converge only for $0 < \omega < 2$.
In matrix form,
$$ x^{(k)} = (D - \omega L)^{-1}(\omega U + (1 - \omega) D) x^{(k-1)} + \omega (D - \omega L)^{-1} b $$
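Eq. (9) can be sketched as a one-line change to a Gauss-Seidel sweep. The function name `sor` and the default `omega=1.1` are arbitrary choices for this illustration, not recommended values; with `omega=1.0` the routine reduces to Gauss-Seidel.

```python
def sor(a, b, omega=1.1, tol=1e-10, max_iter=500):
    """Successive overrelaxation: blend the Gauss-Seidel value with the previous one."""
    if not 0.0 < omega < 2.0:
        raise ValueError("omega must lie in (0, 2)")
    n = len(b)
    x = [0.0] * n
    for _ in range(max_iter):
        change = 0.0
        for i in range(n):
            s = sum(a[i][j] * x[j] for j in range(n) if j != i)
            gs_value = (b[i] - s) / a[i][i]                    # Gauss-Seidel iterate
            new_value = omega * gs_value + (1.0 - omega) * x[i]  # Eq. (9)
            change = max(change, abs(new_value - x[i]))
            x[i] = new_value
        if change < tol:
            break
    return x


a = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]
x = sor(a, b)
```

How many sweeps are saved depends strongly on the choice of `omega` for the matrix at hand.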
Symmetric Successive Overrelaxation Method (SSOR):
If A is symmetric, SSOR combines two SOR sweeps.
First, a forward SOR sweep is carried out.
In the second sweep the unknowns are updated in the reverse order.
The convergence rate of SSOR with an optimal value of $\omega$ is usually slower than the convergence rate of SOR with optimal $\omega$.
$$ x^{(k)} = B_1 B_2 x^{(k-1)} + \omega (2 - \omega)(D - \omega U)^{-1} D (D - \omega L)^{-1} b, \tag{10} $$
where $B_1 = (D - \omega U)^{-1}(\omega L + (1 - \omega) D)$ and $B_2 = (D - \omega L)^{-1}(\omega U + (1 - \omega) D)$.
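The forward and reverse sweeps can be sketched component-wise as below, without forming $B_1$ and $B_2$ explicitly. The name `ssor`, the default `omega=1.2`, and the stopping rule are assumptions.

```python
def ssor(a, b, omega=1.2, tol=1e-10, max_iter=500):
    """One SSOR iteration = a forward SOR sweep followed by a backward SOR sweep."""
    n = len(b)
    x = [0.0] * n
    forward = list(range(n))
    backward = forward[::-1]
    for _ in range(max_iter):
        x_old = list(x)
        for order in (forward, backward):
            for i in order:
                s = sum(a[i][j] * x[j] for j in range(n) if j != i)
                x[i] = omega * (b[i] - s) / a[i][i] + (1.0 - omega) * x[i]
        if max(abs(p - q) for p, q in zip(x, x_old)) < tol:
            break
    return x


# Symmetric positive definite test system with exact solution (1, 2, 3).
a = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]
x = ssor(a, b)
```

Each iteration costs two sweeps, which is part of why SSOR is more often used as a preconditioner than as a solver in its own right.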
Conjugate Gradient Method (CG):
It is one of the oldest and best known nonstationary methods, and it is effective for symmetric positive definite systems.
The iterates $x^{(i)}$ are updated in each iteration by a multiple $\alpha_i$ of the search direction vector $p^{(i)}$:
$$ x^{(i)} = x^{(i-1)} + \alpha_i p^{(i)} \tag{11} $$
Correspondingly, the residuals $r^{(i)} = b - A x^{(i)}$ are updated as
$$ r^{(i)} = r^{(i-1)} - \alpha q^{(i)}, \quad \text{where } q^{(i)} = A p^{(i)} \tag{12} $$
The choice $\alpha = \alpha_i = r^{(i-1)T} r^{(i-1)} / p^{(i)T} A p^{(i)}$ minimizes $r^{(i)T} A^{-1} r^{(i)}$ over all possible choices of $\alpha$ in Eq. 12.
Contd..
Conjugate Gradient Method (CG):
The search directions are updated using the residuals:
$$ p^{(i)} = r^{(i)} + \beta_{i-1} p^{(i-1)}, \tag{13} $$
where the choice $\beta_i = r^{(i)T} r^{(i)} / r^{(i-1)T} r^{(i-1)}$ ensures that $p^{(i)}$ and $A p^{(i-1)}$ (or, equivalently, $r^{(i)}$ and $r^{(i-1)}$) are orthogonal.
Theory
The unpreconditioned conjugate gradient method constructs the ith iterate $x^{(i)}$ as an element of $x^{(0)} + \mathrm{span}\{r^{(0)}, \ldots, A^{i-1} r^{(0)}\}$ so that $(x^{(i)} - \hat{x})^T A (x^{(i)} - \hat{x})$ is minimized, where $\hat{x}$ is the exact solution of $Ax = b$.
This minimum is guaranteed to exist in general only if A is symmetric positive definite.
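The updates of Eqs. (11) to (13) fit in a short loop. The sketch below assumes an unpreconditioned method, a zero starting guess, and a stopping test on the residual norm; the function name `conjugate_gradient` and the parameters `tol` and `max_iter` are hypothetical.

```python
def conjugate_gradient(a, b, tol=1e-12, max_iter=100):
    """Unpreconditioned CG for a symmetric positive definite matrix a."""
    n = len(b)

    def matvec(v):
        return [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(p * q for p, q in zip(u, v))

    x = [0.0] * n
    r = list(b)        # r = b - A x with x = 0
    p = list(r)        # first search direction is the residual
    rho = dot(r, r)
    for _ in range(max_iter):
        if rho ** 0.5 < tol:
            break
        q = matvec(p)
        alpha = rho / dot(p, q)                          # step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]    # Eq. (11)
        r = [ri - alpha * qi for ri, qi in zip(r, q)]    # Eq. (12)
        rho_new = dot(r, r)
        beta = rho_new / rho
        p = [ri + beta * pi for ri, pi in zip(r, p)]     # Eq. (13)
        rho = rho_new
    return x


a = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]
x = conjugate_gradient(a, b)
```

Only one matrix-vector product and a handful of vectors are needed per iteration, which is what makes the method attractive for large sparse systems.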
MINRES and SYMMLQ:
The MINRES and SYMMLQ methods are variants of the CG method that can be applied to symmetric indefinite systems.
For indefinite matrices the minimization property of the Conjugate Gradient method is no longer well-defined.
MINRES and SYMMLQ avoid the LU factorization and do not suffer from breakdown.
MINRES minimizes the residual in the 2-norm.
SYMMLQ solves the projected system, but does not minimize anything (it keeps the residual orthogonal to all previous ones).
MINRES and SYMMLQ:
Theory
When A is symmetric but not positive definite, we can still construct an orthogonal basis for the Krylov subspace by three-term recurrence relations.
Eliminating the search directions in Eq. 11 and Eq. 12 gives a recurrence
$$ A R_i = R_{i+1} T_i \tag{14} $$
where $T_i$ is an $(i + 1) \times i$ tridiagonal matrix.
The residual is minimized in the 2-norm by obtaining
$$ x^{(i)} \in \mathrm{span}\{r^{(0)}, A r^{(0)}, \ldots, A^{i-1} r^{(0)}\}, \quad x^{(i)} = R_i y \tag{15} $$
that minimizes
$$ \| A x^{(i)} - b \|_2 = \| A R_i y - b \|_2 = \| R_{i+1} T_i y - b \|_2 \tag{16} $$

CG on Normal Equations, CGNE and CGNR:
These CG variants are the simplest methods for nonsymmetric or indefinite systems.
The system is transformed to a symmetric definite one and then CG is applied.
CGNE solves the system $(A A^T) y = b$ for y and then computes the solution $x = A^T y$.
CGNR solves $(A^T A) x = \tilde{b}$ for the solution vector x, where $\tilde{b} = A^T b$.
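As an illustration of CGNR, the sketch below runs plain CG on $A^T A x = A^T b$ while applying $A$ and $A^T$ in turn rather than forming $A^T A$. The function name `cgnr`, the zero starting guess, and the stopping test on the normal-equation residual are assumptions.

```python
def cgnr(a, b, tol=1e-10, max_iter=100):
    """CG applied to the normal equations A^T A x = A^T b."""
    rows, cols = len(a), len(a[0])

    def mat(v):      # A v
        return [sum(a[i][j] * v[j] for j in range(cols)) for i in range(rows)]

    def mat_t(v):    # A^T v
        return [sum(a[i][j] * v[i] for i in range(rows)) for j in range(cols)]

    def dot(u, v):
        return sum(p * q for p, q in zip(u, v))

    x = [0.0] * cols
    r = mat_t(b)     # residual of the normal equations for x = 0
    p = list(r)
    rho = dot(r, r)
    for _ in range(max_iter):
        if rho ** 0.5 < tol:
            break
        q = mat_t(mat(p))            # (A^T A) p without forming A^T A
        alpha = rho / dot(p, q)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        rho_new = dot(r, r)
        p = [ri + (rho_new / rho) * pi for ri, pi in zip(r, p)]
        rho = rho_new
    return x


# Nonsymmetric example with exact solution (1, 2).
x = cgnr([[4.0, 1.0], [2.0, 3.0]], [6.0, 8.0])
```

The price of this simplicity is that the condition number of $A^T A$ is the square of that of $A$, so convergence can be slow for ill-conditioned systems.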
Generalized Minimal Residual (GMRES):
GMRES is an extension of MINRES to unsymmetric systems.
Like MINRES, it generates a sequence of orthogonal vectors, but in the absence of symmetry this can no longer be done with short recurrences.
Instead, all previously computed vectors in the orthogonal sequence have to be retained.
For this reason, restarted versions of the method are used.
In the Conjugate Gradient method, the residuals form an orthogonal basis for the space $\mathrm{span}\{r^{(0)}, A r^{(0)}, A^2 r^{(0)}, \ldots\}$.
In GMRES, this basis is formed explicitly:
Generalized Minimal Residual (GMRES):
$w^{(i)} = A v^{(i)}$
for $k = 1, \ldots, i$
    $w^{(i)} = w^{(i)} - (w^{(i)}, v^{(k)}) v^{(k)}$
end
$v^{(i+1)} = w^{(i)} / \| w^{(i)} \|$
The inner product coefficients $(w^{(i)}, v^{(k)})$ and $\| w^{(i)} \|$ are stored in an upper Hessenberg matrix.
The GMRES iterates are constructed as
$$ x^{(i)} = x^{(0)} + y_1 v^{(1)} + \cdots + y_i v^{(i)} $$
where the coefficients $y_k$ have been chosen to minimize the residual norm $\| b - A x^{(i)} \|$.
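The orthogonalization loop above (a modified Gram-Schmidt process) can be sketched in Python as follows. This covers only the basis construction and the Hessenberg matrix, not the least squares solve for the coefficients $y_k$; the function name `arnoldi` and its arguments are assumptions.

```python
def arnoldi(a, r0, steps):
    """Build orthonormal Krylov vectors v and the (steps+1) x steps Hessenberg matrix h."""
    n = len(r0)

    def matvec(v):
        return [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(p * q for p, q in zip(u, v))

    def norm(v):
        return dot(v, v) ** 0.5

    scale = norm(r0)
    v = [[t / scale for t in r0]]                # v^(1) = r^(0) / ||r^(0)||
    h = [[0.0] * steps for _ in range(steps + 1)]
    for i in range(steps):
        w = matvec(v[i])
        for k in range(i + 1):
            h[k][i] = dot(w, v[k])               # inner product coefficient
            w = [wj - h[k][i] * vkj for wj, vkj in zip(w, v[k])]
        h[i + 1][i] = norm(w)
        if h[i + 1][i] == 0.0:
            break                                # Krylov subspace is exhausted
        v.append([wj / h[i + 1][i] for wj in w])
    return v, h


a = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 2.0, 4.0]]
v, h = arnoldi(a, [1.0, 0.0, 0.0], steps=2)
```

The list `v` grows by one vector per step, which is the storage growth that motivates restarted GMRES.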
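BiConjugate Gradient (BiCG):
Unlike the conjugate gradient method, this algorithm does not require the matrix A to be self-adjoint, but instead one needs to perform multiplications by the transpose $A^T$ (the conjugate transpose for complex matrices).
We update two sequences of residuals
$$ r^{(i)} = r^{(i-1)} - \alpha_i A p^{(i)}, \quad \tilde{r}^{(i)} = \tilde{r}^{(i-1)} - \alpha_i A^T \tilde{p}^{(i)} $$
and two sequences of search directions
$$ p^{(i)} = r^{(i-1)} + \beta_{i-1} p^{(i-1)}, \quad \tilde{p}^{(i)} = \tilde{r}^{(i-1)} + \beta_{i-1} \tilde{p}^{(i-1)} $$
The choices
$$ \alpha_i = \frac{\tilde{r}^{(i-1)T} r^{(i-1)}}{\tilde{p}^{(i)T} A p^{(i)}}, \quad \beta_i = \frac{\tilde{r}^{(i)T} r^{(i)}}{\tilde{r}^{(i-1)T} r^{(i-1)}} $$
ensure the bi-orthogonality relations $\tilde{r}^{(i)T} r^{(j)} = \tilde{p}^{(i)T} A p^{(j)} = 0$ if $i \neq j$.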
BiConjugate Gradient (BiCG):
Convergence
For symmetric positive definite systems the method delivers the same results as CG, but at twice the cost per iteration.
For nonsymmetric matrices it has been shown that the method is more or less comparable to GMRES (in terms of number of iterations).
Convergence behavior may be quite irregular, and the method may even break down.
One breakdown situation is due to the possible event that $z^{(i-1)T} \tilde{r}^{(i-1)} \approx 0$, where $z^{(i-1)}$ is the preconditioned residual (equal to $r^{(i-1)}$ when no preconditioner is used).
The other breakdown situation, $\tilde{p}^{(i)T} q^{(i)} \approx 0$ with $q^{(i)} = A p^{(i)}$, occurs when the LU decomposition fails.
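A bare-bones unpreconditioned BiCG loop, with the two breakdown checks made explicit, might look like the sketch below. The function name `bicg`, the zero starting guess, and the choice of the initial shadow residual equal to the initial residual are assumptions made for this illustration.

```python
def bicg(a, b, tol=1e-10, max_iter=100):
    """Unpreconditioned BiCG for a general square matrix a."""
    n = len(b)

    def matvec(v):       # A v
        return [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]

    def matvec_t(v):     # A^T v
        return [sum(a[j][i] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(p * q for p, q in zip(u, v))

    x = [0.0] * n
    r = list(b)          # residual for x = 0
    rt = list(r)         # shadow residual (assumed equal to r initially)
    p, pt, rho_prev = None, None, 1.0
    for i in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        rho = dot(rt, r)
        if rho == 0.0:
            raise ArithmeticError("breakdown: shadow residual orthogonal to residual")
        if i == 0:
            p, pt = list(r), list(rt)
        else:
            beta = rho / rho_prev
            p = [ri + beta * pi for ri, pi in zip(r, p)]
            pt = [ri + beta * pi for ri, pi in zip(rt, pt)]
        q = matvec(p)
        qt = matvec_t(pt)
        denom = dot(pt, q)
        if denom == 0.0:
            raise ArithmeticError("breakdown: shadow direction orthogonal to A p")
        alpha = rho / denom
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        rt = [ri - alpha * qi for ri, qi in zip(rt, qt)]
        rho_prev = rho
    return x


# Nonsymmetric example with exact solution (1, 2).
x = bicg([[4.0, 1.0], [2.0, 3.0]], [6.0, 8.0])
```

A practical code would test for near-zero rather than exactly zero values in both checks; the exact comparisons here only mark where the breakdowns arise.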
Quasi-Minimal Residual (QMR):
The main idea behind this algorithm is to solve the reduced tridiagonal system in a least squares sense, similar to the approach followed in GMRES.
Since the constructed basis for the Krylov subspace is bi-orthogonal, rather than orthogonal as in GMRES, the obtained solution is viewed as a quasi-minimal residual solution.
Convergence
Convergence is smoother than for BiCG.
Look-ahead steps in the QMR method prevent breakdown in all cases but the so-called incurable breakdown, where no number of look-ahead steps would yield a next iterate.
Conjugate Gradient Squared Method (CGS):
In BiCG, the residual vector $r^{(i)}$ can be regarded as the product of $r^{(0)}$ and an ith degree polynomial in A, that is
$$ r^{(i)} = P_i(A) r^{(0)} \tag{17} $$
This same polynomial satisfies $\tilde{r}^{(i)} = P_i(A^T) \tilde{r}^{(0)}$, so that
$$ \rho_i = (\tilde{r}^{(i)}, r^{(i)}) = (P_i(A^T) \tilde{r}^{(0)}, P_i(A) r^{(0)}) = (\tilde{r}^{(0)}, P_i^2(A) r^{(0)}). \tag{18} $$
This suggests that if $P_i(A)$ reduces $r^{(0)}$ to a smaller vector $r^{(i)}$, then it might be advantageous to apply this contraction operator twice and compute $P_i^2(A) r^{(0)}$.
Conjugate Gradient Squared Method (CGS):
Eq. 18 shows that the iteration coefficients can still be recovered from these vectors, and it turns out to be easy to find the corresponding approximations for x.
Convergence
The speed of convergence of CGS is often about twice that of BiCG.
Convergence behavior is highly irregular.
Local corrections to the current solution may be so large that cancellation effects occur. This may lead to a less accurate solution than suggested by the updated residual.
The method tends to diverge if the starting guess is close to the solution.
BiConjugate Gradient Stabilized (Bi-CGSTAB):
Bi-CGSTAB was developed to solve nonsymmetric linear systems while avoiding the often irregular convergence patterns of CGS.
Instead of computing the CGS sequence $i \mapsto P_i^2(A) r^{(0)}$, Bi-CGSTAB computes $i \mapsto Q_i(A) P_i(A) r^{(0)}$, where $Q_i$ is an ith degree polynomial describing a steepest descent update.
Convergence
CGS can be viewed as a method in which the BiCG contraction operator is applied twice. Bi-CGSTAB can be interpreted as the product of BiCG and repeatedly applied GMRES(1).
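The slides do not list the Bi-CGSTAB recurrences, so the sketch below follows the commonly published unpreconditioned form of the algorithm. The function name `bicgstab`, the zero starting guess, the shadow residual choice, and the early exit when the intermediate residual `s` is already small are assumptions.

```python
def bicgstab(a, b, tol=1e-10, max_iter=100):
    """Unpreconditioned Bi-CGSTAB; the omega step is the local steepest descent update."""
    n = len(b)

    def matvec(v):
        return [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(p * q for p, q in zip(u, v))

    x = [0.0] * n
    r = list(b)              # residual for x = 0
    r_hat = list(r)          # fixed shadow residual (assumed equal to r)
    rho = alpha = omega = 1.0
    v = [0.0] * n
    p = [0.0] * n
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        rho_new = dot(r_hat, r)
        if rho_new == 0.0:
            raise ArithmeticError("breakdown: rho is zero")
        beta = (rho_new / rho) * (alpha / omega)
        p = [ri + beta * (pi - omega * vi) for ri, pi, vi in zip(r, p, v)]
        v = matvec(p)
        alpha = rho_new / dot(r_hat, v)
        s = [ri - alpha * vi for ri, vi in zip(r, v)]       # BiCG-like half step
        if dot(s, s) ** 0.5 < tol:
            x = [xi + alpha * pi for xi, pi in zip(x, p)]
            break
        t = matvec(s)
        omega = dot(t, s) / dot(t, t)                        # minimizes ||s - omega t||
        x = [xi + alpha * pi + omega * si for xi, pi, si in zip(x, p, s)]
        r = [si - omega * ti for si, ti in zip(s, t)]
        rho = rho_new
    return x


# Nonsymmetric example with exact solution (1, 2).
x = bicgstab([[4.0, 1.0], [2.0, 3.0]], [6.0, 8.0])
```

Note that only products with A appear; unlike BiCG, no multiplication by $A^T$ is required.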
Conclusions from Iterative Methods
The classical Conjugate Gradient method (CG) is one of the most powerful iterative schemes for solving large sparse linear systems (Eq. 1) with Hermitian positive definite coefficient matrices A.
The Biconjugate Gradient algorithm (BCG, called BiCG above) is the extension of CG to linear systems (Eq. 1) with general non-Hermitian nonsingular coefficient matrices.
Unlike CG, the BCG iterates are not characterized by a minimization property.
As a result, BCG shows irregular convergence behavior with wild oscillations in the residual norm.
In the BCG algorithm, even breakdowns (more precisely, divisions by 0) may occur.
In the quasi-minimal residual method (QMR), the iterates are defined by a quasi-minimization of the residual norm, which leads to smooth convergence curves and remedies this BCG problem.
Furthermore, QMR can be implemented based on a look-ahead version of the nonsymmetric Lanczos algorithm, which avoids possible breakdowns of the process.
Except for special cases, such as complex symmetric matrices, BCG and QMR require matrix-vector multiplications with both the coefficient matrix A of Eq. 1 and its transpose $A^T$.
The Conjugate Gradients Squared algorithm (CGS) is a variant of BCG that does not involve $A^T$.
However, like BCG, CGS also exhibits rather erratic convergence behavior.
The Bi-CGSTAB algorithm uses local steepest descent steps to obtain a more smoothly convergent CGS-like process.
However, Bi-CGSTAB can converge considerably more slowly than CGS.
The End
