
Solving System of Linear Equations

Y. Sharath Chandra Mouli

Indian Institute of Technology Kanpur
sharath@iitk.ac.in

September 11, 2017

Y. Sharath Chandra Mouli (IITK) Solving System of Linear Equations September 11, 2017 1 / 32
Overview I
1 System of Linear Equations
2 Gaussian Elimination (Back substitution)
3 Gauss-Jordan Elimination
4 LU Decomposition
    Performing the LU Decomposition
    Crout's algorithm
5 Tridiagonal Systems of Equations
6 Band Diagonal Systems of Equations
7 Iterative Methods
    Stationary Iterative Methods
        The Jacobi Method
        The Gauss-Seidel Method
        Successive Overrelaxation Method (SOR)
        Symmetric Successive Overrelaxation Method (SSOR)
    Non-Stationary Iterative Methods
        Conjugate Gradient Method (CG)
        MINRES and SYMMLQ
Overview II
        CG on Normal Equations, CGNE and CGNR
        Generalized Minimal Residual (GMRES)
        BiConjugate Gradient (BiCG)
        Quasi-Minimal Residual (QMR)
        Conjugate Gradient Squared Method (CGS)
        BiConjugate Gradient Stabilized (Bi-CGSTAB)
    Conclusions from Iterative Methods

System of Linear Equations

\[
\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1N} x_N &= b_1 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2N} x_N &= b_2 \\
&\;\;\vdots \\
a_{M1} x_1 + a_{M2} x_2 + \cdots + a_{MN} x_N &= b_M
\end{aligned}
\tag{1}
\]

Here the N unknowns $x_j$, j = 1, 2, ..., N are related by M equations.
The coefficients $a_{ij}$ with i = 1, 2, ..., M and j = 1, 2, ..., N are known numbers, as are the right-hand side quantities $b_i$, i = 1, 2, ..., M.

Gaussian Elimination (Back substitution)

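Echelon form for a square N × N system (primes denote the entries after elimination):

\[
\begin{bmatrix}
a'_{11} & a'_{12} & a'_{13} & \cdots & a'_{1N} \\
0 & a'_{22} & a'_{23} & \cdots & a'_{2N} \\
\vdots & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & a'_{NN}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{bmatrix}
=
\begin{bmatrix} b'_1 \\ b'_2 \\ \vdots \\ b'_N \end{bmatrix}
\]

Back substitution:

\[
x_N = \frac{b'_N}{a'_{NN}} \tag{2}
\]

\[
x_i = \frac{1}{a'_{ii}} \left[ b'_i - \sum_{j=i+1}^{N} a'_{ij} x_j \right] \tag{3}
\]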
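The reduction to echelon form followed by Eqs. (2) and (3) can be sketched as below. This is a minimal illustration on plain Python lists, not a production solver; the function name `gauss_solve` is hypothetical, and the partial pivoting step is an assumption added for numerical safety (the slide itself does not mention pivoting).

```python
def gauss_solve(a, b):
    """Solve a x = b by Gaussian elimination followed by back substitution."""
    n = len(b)
    # Work on an augmented copy [A | b] so the inputs are left untouched.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting (an assumption, not on the slide): use the largest pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate the entries below the pivot to reach echelon form.
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution, Eqs. (2) and (3).
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x
```

For example, `gauss_solve([[2, 1, -1], [-3, -1, 2], [-2, 1, 2]], [8, -11, -3])` should return values close to `[2, 3, -1]`.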

Gauss-Jordan Elimination

Linear matrix equation

\[
[A] \cdot \left[\, x_1 \;\middle|\; x_2 \;\middle|\; x_3 \;\middle|\; Y \,\right]
= \left[\, b_1 \;\middle|\; b_2 \;\middle|\; b_3 \;\middle|\; I \,\right]
\]

where A and Y are square matrices, the $b_i$'s and $x_i$'s are column vectors, and I is the identity matrix, simultaneously solves the linear sets

\[
[A]\{x_1\} = \{b_1\}, \quad [A]\{x_2\} = \{b_2\}, \quad [A]\{x_3\} = \{b_3\} \quad \text{and} \quad [A]\{Y\} = \{I\}
\]

It requires all the right-hand sides to be stored and manipulated at the same time.
When the inverse matrix is not desired, Gauss-Jordan is three times slower than the best alternative technique for solving a single linear set.
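A minimal sketch of the idea, assuming plain Python lists and a hypothetical function name `gauss_jordan`: the matrix is augmented with the right-hand-side columns and the identity, and full elimination turns the left block into the identity, leaving the solutions and the inverse behind. Row pivoting is an assumption added here for stability.

```python
def gauss_jordan(a, rhs):
    """Reduce [A | B | I] to [I | X | A^-1]; rhs holds the b vectors as columns."""
    n = len(a)
    k = len(rhs[0])
    m = [
        [float(v) for v in a[i]]
        + [float(v) for v in rhs[i]]
        + [1.0 if i == j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    for col in range(n):
        # Row pivoting (an assumption): bring the largest entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        # Unlike Gaussian elimination, clear the column above AND below the pivot.
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    solutions = [row[n:n + k] for row in m]
    inverse = [row[n + k:] for row in m]
    return solutions, inverse
```

All right-hand sides are carried through every row operation, which illustrates why they must be stored and manipulated at the same time.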
LU Decomposition

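Writing

\[
L \cdot U = A
\]

where L is a lower triangular matrix (entries $\alpha_{ij}$) and U is an upper triangular matrix (entries $\beta_{ij}$).

For a 4 × 4 matrix,

\[
\begin{bmatrix}
\alpha_{11} & 0 & 0 & 0 \\
\alpha_{21} & \alpha_{22} & 0 & 0 \\
\alpha_{31} & \alpha_{32} & \alpha_{33} & 0 \\
\alpha_{41} & \alpha_{42} & \alpha_{43} & \alpha_{44}
\end{bmatrix}
\begin{bmatrix}
\beta_{11} & \beta_{12} & \beta_{13} & \beta_{14} \\
0 & \beta_{22} & \beta_{23} & \beta_{24} \\
0 & 0 & \beta_{33} & \beta_{34} \\
0 & 0 & 0 & \beta_{44}
\end{bmatrix}
=
\begin{bmatrix}
a_{11} & a_{12} & a_{13} & a_{14} \\
a_{21} & a_{22} & a_{23} & a_{24} \\
a_{31} & a_{32} & a_{33} & a_{34} \\
a_{41} & a_{42} & a_{43} & a_{44}
\end{bmatrix}
\]

We use the decomposition as

\[
A x = (L U) x = L (U x) = b
\]

that is, first solve L y = b for y, and then solve U x = y.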

LU Decomposition

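With $\alpha_{ij}$ the entries of L and $\beta_{ij}$ the entries of U:

By forward substitution,

\[
y_1 = \frac{b_1}{\alpha_{11}}
\]

\[
y_i = \frac{1}{\alpha_{ii}} \left[ b_i - \sum_{j=1}^{i-1} \alpha_{ij} y_j \right], \quad i = 2, 3, \ldots, N
\]

By backward substitution,

\[
x_N = \frac{y_N}{\beta_{NN}}
\]

\[
x_i = \frac{1}{\beta_{ii}} \left[ y_i - \sum_{j=i+1}^{N} \beta_{ij} x_j \right], \quad i = N-1, N-2, \ldots, 1
\]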

Performing the LU Decomposition

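Crout's algorithm (with $\alpha_{ij}$ the entries of L and $\beta_{ij}$ the entries of U):
Set $\alpha_{ii} = 1$, i = 1, ..., N
For each j = 1, 2, 3, ..., N do these two procedures:
First, for i = 1, 2, ..., j

\[
\beta_{ij} = a_{ij} - \sum_{k=1}^{i-1} \alpha_{ik} \beta_{kj}
\]

Second, for i = j + 1, j + 2, ..., N

\[
\alpha_{ij} = \frac{1}{\beta_{jj}} \left[ a_{ij} - \sum_{k=1}^{j-1} \alpha_{ik} \beta_{kj} \right]
\]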
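The two procedures above, together with the forward and backward substitution formulas, can be sketched as follows. This is an illustrative version on plain Python lists with hypothetical names `lu_decompose` and `lu_solve`; it assumes no pivoting is required, so it stops with an error on a zero pivot rather than reordering rows.

```python
def lu_decompose(a):
    """Crout-style loop with unit diagonal on L (alpha_ii = 1); no pivoting."""
    n = len(a)
    lower = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # alpha
    upper = [[0.0] * n for _ in range(n)]                                   # beta
    for j in range(n):
        # First procedure: beta_ij for i = 1..j.
        for i in range(j + 1):
            upper[i][j] = a[i][j] - sum(lower[i][k] * upper[k][j] for k in range(i))
        if upper[j][j] == 0.0:
            raise ZeroDivisionError("zero pivot; pivoting would be needed")
        # Second procedure: alpha_ij for i = j+1..N.
        for i in range(j + 1, n):
            s = sum(lower[i][k] * upper[k][j] for k in range(j))
            lower[i][j] = (a[i][j] - s) / upper[j][j]
    return lower, upper


def lu_solve(lower, upper, b):
    """Solve L y = b by forward substitution, then U x = y by back substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(lower[i][j] * y[j] for j in range(i))) / lower[i][i]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(upper[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / upper[i][i]
    return x
```

Once `lower` and `upper` are computed, `lu_solve` can be reused for any number of right-hand sides without repeating the decomposition.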

Tridiagonal Systems of Equations

Non-zero elements only on the diagonal plus or minus one column.
The set of equations to be solved is

\[
\begin{bmatrix}
b_1 & c_1 & 0 & \cdots & 0 & 0 & 0 \\
a_2 & b_2 & c_2 & \cdots & 0 & 0 & 0 \\
\vdots & & & \ddots & & & \vdots \\
0 & 0 & 0 & \cdots & a_{N-1} & b_{N-1} & c_{N-1} \\
0 & 0 & 0 & \cdots & 0 & a_N & b_N
\end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_{N-1} \\ u_N \end{bmatrix}
=
\begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_{N-1} \\ r_N \end{bmatrix}
\]

Condition for diagonal dominance:

\[
|b_j| > |a_j| + |c_j|, \quad j = 1, \ldots, N
\]

The algorithm may fail if the matrix does not satisfy the diagonal dominance condition.
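A tridiagonal system of this form can be solved in O(N) operations by elimination without pivoting. The sketch below uses the slide's names a, b, c, r, u for the three diagonals, right-hand side, and solution; the function name `solve_tridiagonal` and the convention that `a[0]` and `c[-1]` are unused placeholders are assumptions made for illustration.

```python
def solve_tridiagonal(a, b, c, r):
    """Solve a tridiagonal system; a[0] and c[-1] are unused placeholders."""
    n = len(b)
    gam = [0.0] * n  # scratch storage for the eliminated super-diagonal
    u = [0.0] * n
    bet = b[0]
    if bet == 0.0:
        raise ZeroDivisionError("zero pivot in first row")
    u[0] = r[0] / bet
    # Forward sweep: eliminate the sub-diagonal.
    for j in range(1, n):
        gam[j] = c[j - 1] / bet
        bet = b[j] - a[j] * gam[j]
        if bet == 0.0:
            # Can happen when the matrix is not diagonally dominant.
            raise ZeroDivisionError("zero pivot; algorithm fails")
        u[j] = (r[j] - a[j] * u[j - 1]) / bet
    # Back substitution.
    for j in range(n - 2, -1, -1):
        u[j] -= gam[j + 1] * u[j + 1]
    return u
```

The explicit zero-pivot check corresponds to the failure mode mentioned above for matrices that are not diagonally dominant.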

Band Diagonal Systems of Equations

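Definition:

\[
a_{ij} = 0 \quad \text{when} \quad j > i + m_2 \quad \text{or} \quad i > j + m_1
\]

Band diagonal matrices are stored and manipulated in a so-called compact form.
This results in a long narrow matrix with $m_1 + 1 + m_2$ columns and N rows.

\[
\begin{bmatrix}
3 & 1 & 0 & 0 & 0 & 0 & 0 \\
4 & 1 & 5 & 0 & 0 & 0 & 0 \\
9 & 2 & 6 & 5 & 0 & 0 & 0 \\
0 & 3 & 5 & 8 & 9 & 0 & 0 \\
0 & 0 & 7 & 9 & 3 & 2 & 0 \\
0 & 0 & 0 & 3 & 8 & 4 & 6 \\
0 & 0 & 0 & 0 & 2 & 4 & 4
\end{bmatrix}
\quad \longrightarrow \quad
\begin{bmatrix}
x & x & 3 & 1 \\
x & 4 & 1 & 5 \\
9 & 2 & 6 & 5 \\
3 & 5 & 8 & 9 \\
7 & 9 & 3 & 2 \\
3 & 8 & 4 & 6 \\
2 & 4 & 4 & x
\end{bmatrix}
\]

The band diagonal matrix on the left, which has N = 7, $m_1 = 2$, and $m_2 = 1$, is stored compactly as the 7 × 4 matrix on the right (x marks unused storage locations).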
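The mapping from the full matrix to the compact form can be sketched as below. The function name `to_compact` is hypothetical, and using `None` for the unused corner entries (the x's above) is an assumption of this illustration.

```python
def to_compact(a, m1, m2):
    """Store a band matrix with m1 sub- and m2 super-diagonals as N x (m1+1+m2)."""
    n = len(a)
    width = m1 + 1 + m2
    compact = [[None] * width for _ in range(n)]
    for i in range(n):
        for k in range(width):
            j = i + k - m1  # column in the full matrix; the diagonal sits at k = m1
            if 0 <= j < n:
                compact[i][k] = a[i][j]
    return compact


full = [
    [3, 1, 0, 0, 0, 0, 0],
    [4, 1, 5, 0, 0, 0, 0],
    [9, 2, 6, 5, 0, 0, 0],
    [0, 3, 5, 8, 9, 0, 0],
    [0, 0, 7, 9, 3, 2, 0],
    [0, 0, 0, 3, 8, 4, 6],
    [0, 0, 0, 0, 2, 4, 4],
]
compact = to_compact(full, m1=2, m2=1)
```

Each row of `compact` holds the entries of one row of the band, shifted so that the main diagonal always lands in column `m1`.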
Iterative Methods

Stationary methods:
Stationary methods are older and simpler to understand and implement, but usually not as effective.
Iterative methods that can be expressed in the simple form

\[
x^{(k)} = B x^{(k-1)} + c \tag{4}
\]

(where neither B nor c depends upon the iteration count k) are called stationary iterative methods.

Nonstationary methods:
Nonstationary methods are a relatively recent development.
Their analysis is usually harder to understand, but they can be highly effective.

Stationary Iterative Methods

The Jacobi Method:


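Let Ax = b be the linear system; its ith equation is

\[
\sum_{j=1}^{n} a_{i,j} x_j = b_i \tag{5}
\]

$x_i$ is solved for while assuming the other entries of x remain fixed:

\[
x_i = \frac{b_i - \sum_{j \neq i} a_{i,j} x_j}{a_{i,i}} \tag{6}
\]

As an iterative method,

\[
x_i^{(k)} = \frac{b_i - \sum_{j \neq i} a_{i,j} x_j^{(k-1)}}{a_{i,i}}, \tag{7}
\]

which is the Jacobi method.
In matrix terms,

\[
x^{(k)} = D^{-1} (L + U) x^{(k-1)} + D^{-1} b
\]

where D, −L, and −U are the diagonal, strictly lower triangular, and strictly upper triangular parts of A, so that A = D − L − U.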
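Eq. (7) translates almost directly into code. The following is a minimal sketch with a hypothetical function name `jacobi`, a zero starting vector, and a fixed iteration count instead of a convergence test; all three are assumptions of this illustration.

```python
def jacobi(a, b, iterations=50):
    """Jacobi iteration, Eq. (7): every component uses only the previous iterate."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        # Build the whole new vector from the old one before replacing it.
        x = [
            (b[i] - sum(a[i][j] * x[j] for j in range(n) if j != i)) / a[i][i]
            for i in range(n)
        ]
    return x
```

For the diagonally dominant system `jacobi([[4, 1], [2, 5]], [6, 12])` the iterates approach `[1, 2]`.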
Stationary Iterative Methods

The Gauss-Seidel Method:

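The equations are examined one at a time in sequence, and previously computed results are used as soon as they are available:

\[
x_i^{(k)} = \frac{b_i - \sum_{j < i} a_{i,j} x_j^{(k)} - \sum_{j > i} a_{i,j} x_j^{(k-1)}}{a_{i,i}} \tag{8}
\]

In matrix terms, the definition of the Gauss-Seidel method can be expressed as

\[
x^{(k)} = (D - L)^{-1} \left( U x^{(k-1)} + b \right)
\]

where D, −L, and −U are the diagonal, strictly lower triangular, and strictly upper triangular parts of A.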
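A minimal sketch of Eq. (8), under the same assumptions as a basic Jacobi loop (hypothetical name `gauss_seidel`, zero starting vector, fixed iteration count). The only structural difference is that `x` is updated in place, so components with j < i already hold their new values.

```python
def gauss_seidel(a, b, iterations=50):
    """Gauss-Seidel iteration, Eq. (8): new values are used as soon as available."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        for i in range(n):
            # x[j] for j < i is already the k-th iterate; for j > i it is the (k-1)-th.
            s = sum(a[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / a[i][i]
    return x
```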

Stationary Iterative Methods

Successive Overrelaxation Method (SOR):

SOR applies extrapolation to the Gauss-Seidel method.
It takes a weighted average between the previous iterate and the computed Gauss-Seidel iterate, successively for each component:

\[
x_i^{(k)} = \omega \bar{x}_i^{(k)} + (1 - \omega) x_i^{(k-1)} \tag{9}
\]

where $\bar{x}$ denotes a Gauss-Seidel iterate, and $\omega$ is the extrapolation factor.
The value of $\omega$ is chosen to accelerate the rate of convergence ($0 < \omega < 2$).
In matrix form,

\[
x^{(k)} = (D - \omega L)^{-1} \left( \omega U + (1 - \omega) D \right) x^{(k-1)} + \omega (D - \omega L)^{-1} b
\]
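Eq. (9) can be sketched component by component as below. The function name `sor`, the default `omega=1.1`, the zero starting vector, and the fixed iteration count are all assumptions of this illustration; choosing a good extrapolation factor is problem dependent.

```python
def sor(a, b, omega=1.1, iterations=100):
    """SOR, Eq. (9): blend the Gauss-Seidel value with the previous iterate."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        for i in range(n):
            s = sum(a[i][j] * x[j] for j in range(n) if j != i)
            gs_value = (b[i] - s) / a[i][i]  # the Gauss-Seidel iterate for component i
            x[i] = omega * gs_value + (1.0 - omega) * x[i]
    return x
```

With `omega=1.0` the update reduces to plain Gauss-Seidel.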

Stationary Iterative Methods

Symmetric Successive Overrelaxation Method (SSOR):


If A is symmetric, SSOR combines two SOR sweeps together.
First, a forward SOR sweep is carried out.
In the second sweep the unknowns are updated in the reverse order.
The convergence rate, with an optimal value of $\omega$, is usually slower than the convergence rate of SOR with optimal $\omega$.

\[
x^{(k)} = B_1 B_2 x^{(k-1)} + \omega (2 - \omega) (D - \omega U)^{-1} D (D - \omega L)^{-1} b, \tag{10}
\]

where

\[
B_1 = (D - \omega U)^{-1} \left( \omega L + (1 - \omega) D \right), \qquad
B_2 = (D - \omega L)^{-1} \left( \omega U + (1 - \omega) D \right).
\]
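The two sweeps can be sketched as a forward relaxation pass followed by the same pass in reverse order. The function name `ssor`, the helper `relax`, the default `omega=1.1`, and the fixed iteration count are assumptions of this illustration.

```python
def ssor(a, b, omega=1.1, iterations=100):
    """SSOR: one forward SOR sweep followed by one backward SOR sweep per iteration."""
    n = len(b)
    x = [0.0] * n

    def relax(i):
        # Single-component SOR update using the current contents of x.
        s = sum(a[i][j] * x[j] for j in range(n) if j != i)
        x[i] = omega * (b[i] - s) / a[i][i] + (1.0 - omega) * x[i]

    for _ in range(iterations):
        for i in range(n):             # forward sweep
            relax(i)
        for i in range(n - 1, -1, -1):  # backward sweep, reverse order
            relax(i)
    return x
```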

Non-Stationary Iterative Methods

Conjugate Gradient Method (CG):

It is one of the oldest and best-known nonstationary methods, and it is effective for symmetric positive definite systems.
The iterates $x^{(i)}$ are updated in each iteration by a multiple ($\alpha_i$) of the search direction vector $p^{(i)}$:

\[
x^{(i)} = x^{(i-1)} + \alpha_i p^{(i)} \tag{11}
\]

Correspondingly the residuals $r^{(i)} = b - A x^{(i)}$ are updated as

\[
r^{(i)} = r^{(i-1)} - \alpha q^{(i)} \quad \text{where} \quad q^{(i)} = A p^{(i)} \tag{12}
\]

The choice $\alpha = \alpha_i = r^{(i-1)T} r^{(i-1)} / p^{(i)T} A p^{(i)}$ minimizes $r^{(i)T} A^{-1} r^{(i)}$ over all possible choices for $\alpha$ in Eq. 12.

Contd..
Non-Stationary Iterative Methods

Conjugate Gradient Method (CG):


The search directions are updated using the residuals

\[
p^{(i)} = r^{(i)} + \beta_{i-1} p^{(i-1)}, \tag{13}
\]

where the choice $\beta_i = r^{(i)T} r^{(i)} / r^{(i-1)T} r^{(i-1)}$ ensures that $p^{(i)}$ and $A p^{(i-1)}$ (or equivalently, $r^{(i)}$ and $r^{(i-1)}$) are orthogonal.

Theory
The unpreconditioned conjugate gradient method constructs the ith iterate $x^{(i)}$ as an element of $x^{(0)} + \mathrm{span}\{r^{(0)}, \ldots, A^{i-1} r^{(0)}\}$ so that $(x^{(i)} - \hat{x})^T A (x^{(i)} - \hat{x})$ is minimized, where $\hat{x}$ is the exact solution of Ax = b.
This minimum is guaranteed to exist in general only if A is symmetric positive definite.
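Eqs. (11) to (13) combine into the short loop sketched below. It is an unpreconditioned illustration on plain Python lists; the names `conjugate_gradient`, `dot`, and `matvec`, the zero starting guess, and the relative residual stopping test are assumptions, and A is assumed symmetric positive definite.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(a, v):
    return [dot(row, v) for row in a]


def conjugate_gradient(a, b, tol=1e-12, max_iter=100):
    """Unpreconditioned CG for symmetric positive definite A, starting from x = 0."""
    x = [0.0] * len(b)
    r = list(b)        # r = b - A x with x = 0
    p = list(r)        # first search direction is the residual
    rr = dot(r, r)
    b_norm = dot(b, b) ** 0.5
    for _ in range(max_iter):
        if rr ** 0.5 <= tol * b_norm:
            break
        q = matvec(a, p)                              # q = A p
        alpha = rr / dot(p, q)                        # step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]  # Eq. (11)
        r = [ri - alpha * qi for ri, qi in zip(r, q)]  # Eq. (12)
        rr_new = dot(r, r)
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]   # Eq. (13)
        rr = rr_new
    return x
```

Only one matrix-vector product and a handful of vector operations are needed per iteration, and only a fixed number of vectors are stored.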

Non-Stationary Iterative Methods

MINRES and SYMMLQ:

The MINRES and SYMMLQ methods are variants that can be applied to symmetric indefinite systems.
For indefinite matrices the minimization property of the Conjugate
Gradient method is no longer well-defined.
The MINRES and SYMMLQ methods are variants of the CG method
that avoid the LU-factorization and do not suffer from breakdown.
MINRES minimizes the residual in the 2-norm.
SYMMLQ solves the projected system, but does not minimize
anything (it keeps the residual orthogonal to all previous ones).

Non-Stationary Iterative Methods

MINRES and SYMMLQ:


Theory
When A is not positive definite, but symmetric, we can still construct an orthogonal basis for the Krylov subspace by three-term recurrence relations.
Eliminating the search directions in Eq. 11 and Eq. 12 gives a recurrence

\[
A R_i = R_{i+1} T_i \tag{14}
\]

where $T_i$ is an $(i + 1) \times i$ tridiagonal matrix.
The residual is minimized in the 2-norm by obtaining

\[
x^{(i)} \in \mathrm{span}\{r^{(0)}, A r^{(0)}, \ldots, A^{i-1} r^{(0)}\}, \quad x^{(i)} = R_i y \tag{15}
\]

that minimizes

\[
\| A x^{(i)} - b \|_2 = \| A R_i y - b \|_2 = \| R_{i+1} T_i y - b \|_2 \tag{16}
\]


Non-Stationary Iterative Methods

CG on Normal Equations, CGNE and CGNR:

CG variants based on the normal equations are the simplest methods for nonsymmetric or indefinite systems.
The system is transformed to a symmetric definite one and then CG is applied.
CGNE solves the system $(A A^T) y = b$ for y and then computes the solution $x = A^T y$.
CGNR solves $(A^T A) x = \tilde{b}$ for the solution vector x, where $\tilde{b} = A^T b$.
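A minimal CGNR sketch, assuming plain Python lists and hypothetical names `cgnr`, `dot`, `matvec`, and `transpose`: it runs an ordinary CG loop on the normal equations while applying A and its transpose in turn, so the product of the two is never formed explicitly. The zero starting guess and relative stopping test are assumptions of this illustration.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(a, v):
    return [dot(row, v) for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def cgnr(a, b, tol=1e-12, max_iter=100):
    """CG applied to (A^T A) x = A^T b, for a general nonsingular A."""
    at = transpose(a)
    rhs = matvec(at, b)            # b~ = A^T b
    x = [0.0] * len(rhs)
    r = list(rhs)                  # residual of the normal equations for x = 0
    p = list(r)
    rr = dot(r, r)
    rhs_norm = rr ** 0.5
    for _ in range(max_iter):
        if rr ** 0.5 <= tol * rhs_norm:
            break
        q = matvec(at, matvec(a, p))   # (A^T A) p without forming A^T A
        alpha = rr / dot(p, q)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x
```

Each iteration costs two matrix-vector products, one with A and one with its transpose.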

Non-Stationary Iterative Methods

Generalized Minimal Residual (GMRES):

GMRES is an extension of MINRES to unsymmetric systems.
Like MINRES, it generates a sequence of orthogonal vectors, but in the absence of symmetry this can no longer be done with short recurrences.
Instead, all previously computed vectors in the orthogonal sequence have to be retained.
For this reason, restarted versions of the method are used.
In the Conjugate Gradient method, the residuals form an orthogonal basis for the space $\mathrm{span}\{r^{(0)}, A r^{(0)}, A^2 r^{(0)}, \ldots\}$.
In GMRES, this basis is formed explicitly:

Non-Stationary Iterative Methods

Generalized Minimal Residual (GMRES):

    w^{(i)} = A v^{(i)}
    for k = 1, ..., i
        w^{(i)} = w^{(i)} - (w^{(i)}, v^{(k)}) v^{(k)}
    end
    v^{(i+1)} = w^{(i)} / ||w^{(i)}||

The inner product coefficients $(w^{(i)}, v^{(k)})$ and $\|w^{(i)}\|$ are stored in an upper Hessenberg matrix.
The GMRES iterates are constructed as

\[
x^{(i)} = x^{(0)} + y_1 v^{(1)} + \cdots + y_i v^{(i)}
\]

where the coefficients $y_k$ have been chosen to minimize the residual norm $\| b - A x^{(i)} \|$.
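The basis construction above (a Gram-Schmidt orthogonalization of the Krylov vectors) can be sketched as follows. This covers only the orthogonal basis and the Hessenberg coefficients, not the least-squares solve for the $y_k$ that a complete GMRES needs; the names `arnoldi`, `dot`, and `matvec` are hypothetical.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(a, v):
    return [dot(row, v) for row in a]


def arnoldi(a, v0, steps):
    """Build orthonormal vectors v(1..steps+1) and the (steps+1) x steps Hessenberg h."""
    norm0 = dot(v0, v0) ** 0.5
    basis = [[x / norm0 for x in v0]]
    h = [[0.0] * steps for _ in range(steps + 1)]
    for i in range(steps):
        w = matvec(a, basis[i])                      # w = A v(i)
        for k in range(i + 1):
            h[k][i] = dot(w, basis[k])               # coefficient (w, v(k))
            w = [wj - h[k][i] * vj for wj, vj in zip(w, basis[k])]
        h[i + 1][i] = dot(w, w) ** 0.5               # ||w||
        if h[i + 1][i] == 0.0:
            break                                    # Krylov subspace is exhausted
        basis.append([wj / h[i + 1][i] for wj in w])
    return basis, h
```

Because every new vector is orthogonalized against all earlier ones, storage and work grow with the iteration count, which is the motivation for restarting.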
Non-Stationary Iterative Methods

BiConjugate Gradient (BiCG):


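Unlike the conjugate gradient method, this algorithm does not require the matrix A to be self-adjoint, but instead one needs to perform multiplications by the (conjugate) transpose $A^T$.
We update two sequences of residuals

\[
r^{(i)} = r^{(i-1)} - \alpha_i A p^{(i)}, \qquad \tilde{r}^{(i)} = \tilde{r}^{(i-1)} - \alpha_i A^T \tilde{p}^{(i)}
\]

and two sequences of search directions

\[
p^{(i)} = r^{(i-1)} + \beta_{i-1} p^{(i-1)}, \qquad \tilde{p}^{(i)} = \tilde{r}^{(i-1)} + \beta_{i-1} \tilde{p}^{(i-1)}
\]

The choices

\[
\alpha_i = \frac{\tilde{r}^{(i-1)T} r^{(i-1)}}{\tilde{p}^{(i)T} A p^{(i)}}, \qquad
\beta_i = \frac{\tilde{r}^{(i)T} r^{(i)}}{\tilde{r}^{(i-1)T} r^{(i-1)}}
\]

ensure the bi-orthogonality relations $\tilde{r}^{(i)T} r^{(j)} = \tilde{p}^{(i)T} A p^{(j)} = 0$ if $i \neq j$.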
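The coupled recurrences above can be sketched as an unpreconditioned loop. The names `bicg`, `dot`, `matvec`, and `transpose` are hypothetical, and starting from x = 0 with the shadow residual set equal to the initial residual is a common choice assumed here rather than something the slide specifies. The sketch raises an error on the two exact-zero breakdown cases instead of trying to recover.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(a, v):
    return [dot(row, v) for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def bicg(a, b, tol=1e-12, max_iter=100):
    """Unpreconditioned BiCG for a general square matrix, starting from x = 0."""
    at = transpose(a)
    x = [0.0] * len(b)
    r = list(b)
    rt = list(r)                 # shadow residual (assumed equal to r at the start)
    p, pt = list(r), list(rt)
    rho_prev = 1.0
    b_norm = dot(b, b) ** 0.5
    for i in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:
            break
        rho = dot(rt, r)
        if rho == 0.0:
            raise ArithmeticError("BiCG breakdown: shadow residual orthogonal to residual")
        if i > 0:
            beta = rho / rho_prev
            p = [ri + beta * pi for ri, pi in zip(r, p)]
            pt = [ri + beta * pi for ri, pi in zip(rt, pt)]
        q = matvec(a, p)         # q = A p
        qt = matvec(at, pt)      # q~ = A^T p~
        denom = dot(pt, q)
        if denom == 0.0:
            raise ArithmeticError("BiCG breakdown: zero denominator in alpha")
        alpha = rho / denom
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        rt = [ri - alpha * qi for ri, qi in zip(rt, qt)]
        rho_prev = rho
    return x
```

Each iteration needs one product with A and one with its transpose, which is the source of the doubled cost per iteration compared with CG.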
Non-Stationary Iterative Methods

BiConjugate Gradient (BiCG):


Convergence
For symmetric positive definite systems the method delivers the same results as CG, but at twice the cost per iteration.
For nonsymmetric matrices it has been shown that the method is more or less comparable to GMRES (in terms of the number of iterations).
Convergence behavior may be quite irregular, and the method may even break down.
One breakdown situation is due to the possible event that $z^{(i-1)T} \tilde{r}^{(i-1)} \approx 0$ (z is the preconditioned residual; z = r without preconditioning).
The other breakdown situation, $\tilde{p}^{(i)T} q^{(i)} \approx 0$ with $q^{(i)} = A p^{(i)}$, occurs when the LU decomposition fails.

Non-Stationary Iterative Methods

Quasi-Minimal Residual (QMR):

The main idea behind this algorithm is to solve the reduced tridiagonal system in a least squares sense, similar to the approach followed in GMRES.
Since the constructed basis for the Krylov subspace is bi-orthogonal, rather than orthogonal as in GMRES, the obtained solution is viewed as a quasi-minimal residual solution.
Convergence
Convergence is smoother than for BiCG.
Look-ahead steps in the QMR method prevent breakdown in all cases but the so-called incurable breakdown, where no number of look-ahead steps would yield a next iterate.

Non-Stationary Iterative Methods

Conjugate Gradient Squared Method (CGS):

In BiCG, the residual vector $r^{(i)}$ can be regarded as the product of $r^{(0)}$ and an ith degree polynomial in A, that is

\[
r^{(i)} = P_i(A) r^{(0)} \tag{17}
\]

This same polynomial satisfies $\tilde{r}^{(i)} = P_i(A^T) \tilde{r}^{(0)}$, so that

\[
\rho_i = (\tilde{r}^{(i)}, r^{(i)}) = (P_i(A^T) \tilde{r}^{(0)}, P_i(A) r^{(0)}) = (\tilde{r}^{(0)}, P_i^2(A) r^{(0)}). \tag{18}
\]

This suggests that if $P_i(A)$ reduces $r^{(0)}$ to a smaller vector $r^{(i)}$, then it might be advantageous to apply this contraction operator twice, and compute $P_i^2(A) r^{(0)}$.

Non-Stationary Iterative Methods

Conjugate Gradient Squared Method (CGS):

Eq. 18 shows that the iteration coefficients can still be recovered from these vectors, and it turns out to be easy to find the corresponding approximations for x.
Convergence
CGS often converges about twice as fast as BiCG.
Convergence behavior is highly irregular.
Local corrections to the current solution may be so large that cancellation effects occur.
This may lead to a less accurate solution than suggested by the updated residual.
The method tends to diverge if the starting guess is close to the solution.
Non-Stationary Iterative Methods

BiConjugate Gradient Stabilized (Bi-CGSTAB):

Bi-CGSTAB was developed to solve nonsymmetric linear systems while avoiding the often irregular convergence patterns of CGS.
Instead of computing the CGS sequence $i \mapsto P_i^2(A) r^{(0)}$, Bi-CGSTAB computes $i \mapsto Q_i(A) P_i(A) r^{(0)}$, where $Q_i$ is an ith degree polynomial describing a steepest descent update.
Convergence
CGS can be viewed as a method in which the BiCG contraction operator is applied twice. Bi-CGSTAB can be interpreted as the product of BiCG and repeatedly applied GMRES(1).

Conclusions from Iterative Methods

The classical Conjugate Gradient method (CG) is one of the most powerful iterative schemes for solving large sparse linear systems (Eq. 1) with Hermitian positive definite coefficient matrices A.
The Biconjugate Gradient algorithm (BCG, called BiCG earlier) is the extension of CG to linear systems (Eq. 1) with general non-Hermitian nonsingular coefficient matrices.
Unlike CG, the BCG iterates are not characterized by a minimization property.
As a result, convergence behavior can be irregular, with wild oscillations in the residual norm.
In the BCG algorithm, even breakdowns (more precisely, division by 0) may occur.
In the quasi-minimal residual method (QMR), the iterates are defined by a quasi-minimization of the residual norm, which leads to smooth convergence curves and remedies this BCG problem.

Conclusions from Iterative Methods

Furthermore, QMR can be implemented based on a look-ahead version of the nonsymmetric Lanczos algorithm, which avoids possible breakdowns of the process.
Except for special cases, such as complex symmetric matrices, BCG and QMR require matrix-vector multiplications with both the coefficient matrix A of Eq. 1 and its transpose $A^T$.
The Conjugate Gradient Squared algorithm (CGS) does not involve $A^T$.
However, like BCG, CGS also exhibits rather erratic convergence behavior.
The Bi-CGSTAB algorithm uses local steepest descent steps to obtain a more smoothly convergent CGS-like process.
However, Bi-CGSTAB can converge considerably slower than CGS.

The End
