Fem1d C PDF

1D-FEM in C: Steady State Heat
Conduction
Kengo Nakajima
Information Technology Center
Programming for Parallel Computing (616-2057)
Seminar on Advanced Computing (616-4009)
FEM1D 2
• 1D-code for Static Linear-Elastic Problems by

Galerkin FEM
• Sparse Linear Solver
– Conjugate Gradient Method
– Preconditioning
• Storage of Sparse Matrices
• Program
FEM1D 3
Keywords
• 1D Steady State Heat Conduction Problems
• Galerkin Method
• Linear Element
• Preconditioned Conjugate Gradient Method
FEM1D 4
1D Steady State Heat Conduction
heat generation/volume Q
  T  
 Q  0
x  x 
x=0 (xmin) x= xmax
• Uniform: Sectional Area: A, Thermal Conductivity: 

• Heat Generation Rate/Volume/Time [QL-3T-1] Q
• Boundary Conditions
– x=0 ： T= 0 (Fixed Temperature)
T
– x=xmax：  0 (Insulated)
x
FEM1D 5
  T  
 Q  0
x  x 
x=0 (xmin) x= xmax

T
x
FEM1D 6
Analytical Solution
heat generation/volume Q   T  
 Q  0
x  x 
x=0 (xmin) x= xmax
T  0@ x  0 T
 0 @ x  xmax
x
T   Q
T   Q x  C1  C1  Q xmax , T   0 @ x  xmax
1
T   Q x 2  C1 x  C2  C2  0, T  0 @ x  0
2
x
1  2 Q xmax
 T  Qx 
2 
FEM1D 7
1D Linear Element (1/4)

一次元線形要素
• 1D Linear Element
– Length= L
• Node (Vertex)（節点）
• Element（要素）
– Ti Temperature at i
T Ti
– Tj Temperature at j
– Temperature T on each
Tj
element is linear
function of x (Piecewise
Linear): L
x
T  1   2 x Xi Xj
FEM1D 8
1D Linear Element (1/4)

一次元線形要素
• 1D Linear Element
– Length= L
• Node (Vertex)
• Element
– Ti Temperature at i
T Ti
– Tj Temperature at j
– Temperature T on each
Tj
element is linear
function of x (Piecewise
Linear): L
x
T  1   2 x Xi Xj
FEM1D 9
Piecewise Linear
1 2 3 4 5 Global Node ID
1 2 3 4 Element ID
1 2 1 2 1 2 1 2
1 2 3 4
Local Node ID
for each elem.
Gradient of temperature is constant in each
element (might be discontinuous at each “node”)
FEM1D 10
1D Linear Elem.: Shape Function (2/4)

• Coef’s are calculated basedT Ti
on info. at each node
T  Ti @ x  X i , T  T j @ x  X j Tj
Ti  1   2 X i , T j  1   2 X j
L
• Coefficients:
Ti X j  T j X i T j  Ti x
Xi Xj
1  , 2 
L L
• T can be written as follows, according to Ti and Tj :
Ni, Nj
 Xj x  x  Xi 
T    Ti    Tj Shape Function （形状関数）or
 L   L 
Interpolation Function（内挿関
Ni Nｊ数）, function of x (only)
FEM1D 11

• Number of Shape Functions T
T i
= Number of Vertices of
Each Element Tj
– Ni: Function of Position
– A kind of Test/Trial Functions L
 Xj x  x  Xi 
N i   , N j    x
 L  Xi Xj
 L 
• Linear combination of shape functions provides
displacement “in” each element
– Coef’s (unknows): Temperature at each node
i Trial/Test Function (known
M function of position, defined in
T  N iTi  N jT j TuMM   ai i domain and at boundary. “Basis”
i 1 in linear algebra.
ai Coefficients (unknown)
FEM1D 12

T Ti
• Value of Ni
– =1 at one of the nodes in
Tj
element
– =0 on other nodes L
 Xj x  x  Xi 
N i   , N j    x
 L   L  Xi Xj
N
1.0 Ni Nj
x
Xi Xj
FEM1D 13
Galerkin Method (1/4)

• Governing Equation for 1D
Steady State Heat
Conduction Problems
(Uniform ):
 d 2T  
  2   Q  0
 dx 
T  N {} Distribution of temperature in each element
(matrix form), : Temperature at each node
• Following integral equation is obtained at each
element by Galerkin method, where [N]’s are also
weighting functions:
T   d 2T   
V N    dx 2   QdV  0
 
FEM1D 14

体積当たり一様発熱 Q
• Green’s Theorem (1D)
x=0 (xmin) x= xmax
 d 2B  dB  dA dB 
V A dx 2  dV  S A dx dS V  dx dx  dV
• Apply this to the 1st part of eqn with 2nd-order diff.:
T  d 2T   d N T dT  T dT
 N    dV       dV    N  dS
V  dx 2   dx dx  dx
  V   S
• Consider the following terms: V

S S
dT d N 
T  N {},  {} q   dT
dx dx dx
: Heat flux at element surface[QL-2T-1]
FEM1D 15

• Finally, following eqn is 体積当たり一様発熱 Q
obtained by considering
heat generation term Q ： x=0 (xmin) x= xmax
 d N T d N   V
    dV   
V 
dx dx  S S
T T
  q N  dS   QN  dV  0
S V
• This is called “weak form（弱形式）”. Original PDE

consists of terms with 2nd-order diff., but this “weak
form” only includes 1st-order diff by Green’s theorem.
– Requirements for shape functions are “weaker” in “weak
form”. Linear functions can describe effects of 2nd-order
differentiation.
FEM1D 16

 d N T d N  
    dV   
V 
dx dx  x=0 (xmin) x= xmax
T T
 q N  dS  Q N  dV  0
  V
S V
S S
• These terms coincide at element boundaries and

disappear. Finally, only terms on the domain
boundaries remain.
FEM1D 17
Weak Form and Boundary Conditions

• Value of dependent variable is
defined (Dirichlet)
x=0 (xmin) x= xmax
– Weighting Function = 0
– Principal B.C. (Boundary V
Condition)（第一種境界条件） S S
– Essential B.C.（基本境界条件）
• Derivatives of Unknowns  d N T d N  
dV   
(Neumann)
   
dx dx 
V  
– Naturally satisfied in weak form T T
 q N  dS  Q N  dV  0
 
– Secondary B.C.（第二種境界 S V
条件） dT
where q  
– Natural B.C（自然境界条件） dx
FEM1D 18
Weak Form with B.C.: on each elem.
(e) (e) (e)

k     f 
T
(e)  d N  d N  
k  dV
   
dx dx 
V  
(e)  T T
f    QN  dV   q N  dS
V S
FEM1D 19
Integration over Each Element: [k]

 Xj x  x  Xi  dN i   1  dN j
1
N i   , N j      ,  
 L   L  dx  L  dx  L 
 d N T d N   u ui
V   dx dx dV
L uj
 1 / L 
     1 / L, 1 / L A dx
0 
1/ L 
L
2x1 matrix 1x2 matrix
x
A L  1  1 A  1  1 Xi  0 Xj L
 2  dx 
L 0
  1  1 L  1  1
 x x
N i  1  , N j   
A: Sectional Area  L L
L: Length
FEM1D 20
Integration over Each Element: {f} (1/2)

 Xj x  x  Xi  dN i   1  dN j
1

Ni   
, Nj     ,  
 L   L  dx  L  dx  L 
 x x
N i  1  , N j   
 L L
L
T 1  x / L  Q AL 1 Heat Generation
V Q
 N  dV  Q A
0  x / L  dx  2 1 (Volume)
1 : 1
A: Sectional Area
L: Length
FEM1D 21
Integration over Each Element: {f} (2/2)

 Xj x  x  Xi  dN i   1  dN j1

Ni   
, Nj     ,  
 L   L  dx  L  dx  L 
L
T 1  x / L  Q AL 1 Heat Generation
V Q N  dV  Q A0  x / L  dx  2 1 (Volume)
T 0 dT Surface Heat Flux

 q N  dS  q A x L  q A , q  
S 1 dx
when surface heat flux acts on

only this surface.
FEM1D 22
Global Equations
• Accumulate Element Equations:
k (e)  (e)   f (e) Element Matrix, Element Equations
K    F  Global Matrix, Global Equations
K    k , F     f 
: global vector of  
This is the final linear equations
(global equations) to be solved.
FEM1D 23
ECCS2012 System
Creating Directory
>$ cd Documents
>$ mkdir 2014summer your favorite name
>$ cd 2014summer
This is your “top” directory, and is called <$E-TOP> in this class.
1D Code for Steady-State Heat Conduction Problems
>$ cd <$E-TOP>
>$ cp /home03/skengon/Documents/class_eps/F/1d.tar .
>$ cp /home03/skengon/Documents/class_eps/C/1d.tar .
>$ tar xvf 1d.tar
>$ cd 1d
FEM1D 24
Compile & GO !
>$ cd <$E-TOP>/1d
>$ cc –O 1d.c (or g95 –O 1d.f)
>$ ./a.out
Control Data input.dat
4 NE (Number of Elements)
1.0 1.0 1.0 1.0 x (Length of Each Elem.: L) ，Q, A, 
100 Number of MAX. Iterations for CG Solver
1.e-8 Convergence Criteria for CG Solver
x=1
Element ID
1 2 3 4 5
Node ID (Global)
1 2 3 4
x=0 x=1 x=2 x=3 x=4
FEM1D 25
Results
>$ ./a.out
4 iters, RESID= 4.154074e-17
### TEMPERATURE
1 0.000000E+00 0.000000E+00
2 3.500000E+00 3.500000E+00
3 6.000000E+00 6.000000E+00
4 7.500000E+00 7.500000E+00
5 8.000000E+00 8.000000E+00
Computational Analytical
x=1
Element ID
1 2 3 4 5
Node ID (Global)
1 2 3 4
x=0 x=1 x=2 x=3 x=4
FEM1D 26
Element Eqn’s/Accumulation (1/3)

• 4 elements, 5 nodes
Element ID
1 2 3 4 5
Node ID (Global)
1 2 3 4
x=0 x=1 x=2 x=3 x=4
• [k] and {f} of Element-1:

(1) A  1  1 (1) Q AL 1
k   f   
L   1  1 2 1
• As for Element-4:
( 4) A  1  1 ( 4) Q AL 1
k   f   
L   1  1 2 1
FEM1D 27

• Element-by-Element Accumulation:
4
K    k (e)  + + +
e 1
4
F     f (e)  + + +
e 1
FEM1D 28

• Relations to FDM
(e) A  1  1
k  
L  1  1
+1 -1
4 A
(e) -1 +1 +1 -1
K    k   [ + -1 +1 + +1 -1 + ] L
e 1 -1 +1 +1 -1
-1 +1
+1 -1  d 2T   T  2Ti  Ti 1 
-1 +2 -1 A    2 dV     i 1 2 dV
V
dx  V  L 
 -1 +2 -1 
-1 +2 -1 L  T  2Ti  Ti 1  A
-1 +1
  i 1 2   AL  Ti 1  2Ti  Ti 1  
 L  L
Something familiar …
FEM: Coefficient Matrices are generally sparse
(many ZERO’s)
FEM1D 29
2nd –Order Differentiation in FDM
• Approximate Derivative at×(center of i and i+1)

i-1 i i+1
 d  i1  i
×   
x x  dx i 1/ 2 x
x→0: Real Derivative
• 2nd-Order Diff. at i
 d   d  i1  i i  i1
    
 d 2   dx i 1/ 2  dx i1/ 2
 2    x x  i 1  2i  i 1
 dx i x x x2
FEM1D 30
Element-by-Element Operation
very flexible if each element has different material
property, size, etc.
(e) ( e ) A( e )  1  1
k     1  1
L( e )  
+1 -1
4
(e) -1 +1 (1) A(1) +1 -1 ( 2) A( 2)
  + -1 +1 
(1)
K    k
e 1 L L( 2 )
+1 -1
(3) A(3) ( 4) A( 4)
 ( 3)
+ 
-1 +1 L +1 -1 L( 4 )
-1 +1
FEM1D 31
Element/Global Operations
1 2 3 4
Local Node ID
e e e e
1 2 1 2 1 2 1 2
T T T T
 d N  d N    d N  d N    d N  d N    d N  d N  
  E  dV      E  dV      E  dV      E  dV   
V  dx V V V
dx   dx dx   dx dx   dx dx 
T T T T T T T T
   N  dS   X N  dV  0    N  dS   X N  dV  0    N  dS   X N  dV  0    N  dS   X N  dV  0
S V S V S V S V
[k ]( 3) {}( 3)  { f }(3) [k ]( 4 ) {}( 4)  { f }( 4 )

[k ](1) {}(1)  { f }(1) [k ]( 2 ) {}( 2 )  { f }( 2 )
k11( 3) k12( 3)  1( 3)   f1( 3)  k11( 4 ) k12( 4 )  1( 4 )   f1( 4) 
k11( 2 ) k12( 2 )  1( 2)   f1( 2)   ( 4)   (4) 
k11(1) k12(1)  1(1)   f1(1)    ( 2)   ( 3) ( 3)   ( 3) 
  ( 3)  ( 4)   ( 4) 
 (1) (1)   (1) 
  (1)   (2) ( 2)   (2)  k 22 k 21 k 22  2   f 2 
k 22 k 21 k 22  2   f 2  k 21  2   f 2 
k 21  2   f 2 
1 2 3 4
1 2 3 4 5
[ K ]{}  {F }
 D1 AU11   1   B1 
Mapping Information needed,
 AL
 21 D2 AU 21     B 
  2   2 
from element-matrix
 AL31 D3 AU 31   3    B3  to global-matrix.
 
 AL41 D4 AU 41   4   B4 
   
 AL51 D5   5   B5 
FEM1D 32
Around the FEM in 3 minutes
Accumulation to
Global/overall Matrix
[K ]{}  {F}
13 14 15 16  D X X X  1   F1 
X D X X
X X     F 
7 8 9   2   2 
 X D X X X X  3   F3 
    
9 10 11 12  X D X X  4   F4 
X X D X X X  5   F5 
4 5 6     
X X X X D X X X X  6   F6 
5 6 7 8  X X X X D X X X X     F 
  7   7 
 X X X D X X  8   F8 
1 2 3       F 
X X D X X X
  9   9 
1 2 3 4  X X X X D X X X X 10  F10 
    
 X X X X D X X X X 11  F11 
 X X X D X X 12  F12 
    
 X X D X 13  F13 
 X X X X D X 14  F 
    14 
 X X X X D X 15  F15 
    
 X X X D16  F16 
FEM1D 33

Galerkin FEM
– Preconditioning
• Program
FEM1D 34
Large-Scale Linear Equations in

Scientific Applications
• Solving large-scale linear equations Ax=b is the most
important and expensive part of various types of
scientific computing.
– for both linear and nonlinear applications
• Various types of methods proposed & developed.
– for dense and sparse matrices
– classified into direct and iterative methods
• Dense Matrices:密行列: Globally Coupled Problems
– BEM, Spectral Methods, MO/MD (gas, liquid)
• Sparse Matrices:疎行列: Locally Defined Problems
– FEM, FDM, DEM, MD (solid), BEM w/FMM
FEM1D
Direct Method
直接法
 Gaussian Elimination/LU Factorization.
 compute A-1 directly.
Good
 Robust for wide range of applications.
 Good for both dense and sparse matrices
Bad
 More expensive than iterative methods (memory, CPU)
 not scalable
35
Solver-Iterative 36
What is Iterative Method ?

反復法
Linear Equations Initial Solution
連立一次方程式初期解
 a11 a12  a1n  x1   b1   x1( 0 ) 

      
 a 21 a 22  a 2 n  x 2   b2  (0)
 x2 
       x (0)  
    
       
a an2  x (0) 
 n1  a nn  x n   bn   n 
A x b
Starting from a initial vector x(0), iterative method

obtains the final converged solutions by iterations
(1) ( 2)
x , x ,
1D-Part1
Iterative Method
反復法
• Stationary Method
– Only x (solution vector) changes during iterations.
– SOR, Gauss-Seidel, Jacobi
Ax  b 
– Generally slow, impractical
x ( k 1)  Mx ( k )  Nb
• Non-Stationary Method
– With restriction/optimization conditions
– Krylov-Subspace
– CG: Conjugate Gradient
– BiCGSTAB: Bi-Conjugate Gradient Stabilized
37
– GMRES: Generalized Minimal Residual
FEM1D
Iterative Method (cont.)

Good
 Less expensive than direct methods, especially in memory.
 Suitable for parallel and vector computing.
Bad
 Convergence strongly depends on problems, boundary
conditions (condition number etc.)
 Preconditioning is required : Key Technology for
Parallel FEM
38
Solver-Iterative
Non-Stationary/Krylov Subspace
Method (1/2)
非定常法・クリロフ部分空間法
Ax  b  x  b  I  A x
Compute x0, x1, x2, ..., xk by the following iterative procedures:
x k  b  I  A x k 1
 b  Ax k 1   x k 1
 rk 1  x k 1 where rk  b  Ax k : residual
k 1
x k  x 0   ri
i 0
rk  b  Ax k  b  Ark 1  x k 1 
39
 b  Ax k 1   Ark 1  rk 1  Ark 1  I  A rk 1
Solver-Iterative
Non-Stationary/Krylov Subspace
Method (2/2)
非定常法・クリロフ部分空間法
k 1 k 2 k 1
i
x k  x 0   ri  x 0  r0   I  A ri  x 0  r0   I  A  r0
i 0 i 0 i 1
k 1
 k 1 i i
z k  r0   I  A  r0  I   I  A  r0
i 1  i 1 
zk is a vector which belongs to kth Krylov Subspace（クリ

ロフ部分空間）, approximate solution vector xk is derived
by the Krylov Subspace:
2 k 1
0 0
r , Ar , A r ,, A 0 r0 
40
FEM1D 41
Conjugate Gradient Method

共役勾配法
• Conjugate Gradient: CG
– Most popular “non-stationary” iterative method
• for Symmetric Positive Definite (SPD) Matrices
– 対称正定
– {x}T[A]{x}>0 for arbitrary {x}
– All of diagonal components, eigenvaules and leading
principal minors > 0 （主小行列式・首座行列式）
– Matrices of Galerkin-based FEM: heat conduction, Poisson,
static linear elastic problems a a a a 11
 a  12 13 14 1n
a a22 a23
• Algorithm  21 a24  a2 n 
a a32 a33 a34  a3n 
det  31 
– “Steepest Descent Method” a41 a42 a43 a44  a4 n 
      
– x(i)= x(i-1) + i p(i)  
an1 an 2 an 3 an 4  ann 
• x(i)：solution，p(i)：search direction，i: coefficient
– Solution {x} minimizes {x-y}T[A]{x-y}, where {y} is exact
solution.
FEM1D 42
Procedures of Conjugate Gradient

Compute r(0)= b-[A]x(0)
for i= 1, 2, … • Mat-Vec. Multiplication
z(i-1)= r(i-1) • Dot Products
i-1= r(i-1) z(i-1)
if i=1 • DAXPY (Double
p(1)= z(0)
else
Precision: a{X} + {Y})
i-1= i-1/i-2
p(i)= z(i-1) + i-1 p(i-1)
endif
q(i)= [A]p(i) x(i) : Vector
i = i-1/p(i)q(i)
x(i)= x(i-1) + ip(i) i : Scalar
r(i)= r(i-1) - iq(i)
check convergence |r|
end
FEM1D 43

i-1= r(i-1) z(i-1)
if i=1 • DAXPY
p(1)= z(0)
else
i-1= i-1/i-2
p(i)= z(i-1) + i-1 p(i-1)
endif
i = i-1/p(i)q(i)
r(i)= r(i-1) - iq(i)
end
FEM1D 44

i-1= r(i-1) z(i-1)
if i=1 • DAXPY
p(1)= z(0)
else
i-1= i-1/i-2
p(i)= z(i-1) + i-1 p(i-1)
endif
i = i-1/p(i)q(i)
r(i)= r(i-1) - iq(i)
end
FEM1D 45

i-1= r(i-1) z(i-1)
if i=1 • DAXPY
p(1)= z(0)
else
• Double
i-1= i-1/i-2 • {y}= a{x} + {y}
p(i)= z(i-1) + i-1 p(i-1)
endif
i = i-1/p(i)q(i)
r(i)= r(i-1) - iq(i)
end
FEM1D 46

for i= 1, 2, …
x(i) : Vector
z(i-1)= r(i-1) i : Scalar
i-1= r(i-1) z(i-1)
if i=1
p(1)= z(0)
else
i-1= i-1/i-2
p(i)= z(i-1) + i-1 p(i-1)
endif
q(i)= [A]p(i)
i = i-1/p(i)q(i)
x(i)= x(i-1) + ip(i)
r(i)= r(i-1) - iq(i)
end
FEM1D 47
Derivation of CG Algorithm (1/5)

Solution x minimizes the following equation if y is the exact
solution (Ay=b)
x  y T Ax  y 
x  y T Ax  y   x, Ax    y, Ax   x, Ay    y, Ay 
 x, Ax   2 x, Ay    y, Ay    x, Ax   2x, b    y, b  Const.
Therefore, the solution x minimizes the following f(x):

1
f x    x, Ax    x, b 
2
1 Arbitrary vector h
f x  h   f  x   h, Ax  b   h, Ah 
2
FEM1D 48
1
f x    x, Ax   x, b 
2
1 Arbitrary vector h
f x  h   f  x   h, Ax  b   h, Ah 
2
1
f  x  h    x  h, A( x  h)    x  h, b 
2
1 1
  x  h, Ax   x  h, Ah    x, b   h, b 
2 2
1 1 1 1
  x, Ax   h, Ax    x, Ah   h, Ah    x, b   h, b 
2 2 2 2
1 1
  x, Ax    x, b   h, Ax   h, b   h, Ah 
2 2
1
 f  x   h, Ax  b   h, Ah 
2
FEM1D 49

CG method minimizes f(x) at each iteration.
Assume that approximate solution: x(0), and
search direction vector p(k) is defined at k-th iteration.
x ( k 1)  x ( k )   k p ( k )
Minimization of f(x(k+1)) is done as follows:
(k ) (k ) 1 2 (k )
       
f x   k p   k p , Ap ( k )   k p ( k ) , b  Ax ( k )  f x ( k )
2

f x ( k )   k p ( k )  p ( k ) , b  Ax ( k )
   p(k ) , r (k )

 0  k  (k ) (k )
 (k ) (k )
(1)
 k p , Ap
   p , Ap

r ( k )  b  Ax ( k ) residual vector
FEM1D 50
Residual vector at (k+1)-th iteration: r ( k 1)  b  Ax ( k 1) , r ( k )  b  Ax ( k )

( k 1) (k ) (k ) r ( k 1)  r ( k )  Ax ( k 1)  Ax ( k )   k Ap ( k )
r r   k Ap (2)
Search direction vector p is defined by the following

recurrence formula:
p ( k 1)  r ( k 1)   k p ( k ) , r ( 0 )  p ( 0 ) (3)
It’s lucky if we can get exact solution y at (k+1)-th iteration:

y  x ( k 1)   k 1 p ( k 1)
FEM1D 51

BTW, we have the following (convenient) orthogonality relation:
(k )
Ap , y  x ( k 1)   0
(k )
Ap , y  x ( k 1)  p ( k ) , Ay  Ax ( k 1)  p ( k ) , b  Ax ( k 1)
    
(k )
 p   
, b  A x ( k )   k p ( k )  p ( k ) , b  Ax ( k )   k Ap ( k ) 
(k )
 p     
, r ( k )   k Ap ( k )  p ( k ) , r ( k )   k p ( k ) , Ap ( k )  0
(k )
 p , r (k )
 k  (k )
p , Ap ( k ) 
Thus, following relation is obtained:
(k )
Ap    
, y  x ( k 1)  Ap ( k ) ,  k 1 p ( k 1)  0  p ( k 1) , Ap ( k )  0 
FEM1D 52

( k 1) (k ) ( k 1) (k ) (k ) ( k 1) (k ) (k ) (k )
 p , Ap   r   p , Ap   r , Ap     p , Ap   0
k k
( k 1) (k )
 r , Ap 
  k (4)
(k ) (k )
 p , Ap 
( k 1) (k )
 p , Ap   0 p(k) and p(k+1) are “conjugate（共役）” for matrix A
Compute p(0)=r(0)= b-[A]x(0) ( i 1)
for i= 1, 2, … , r ( i 1)
p 
calc. i-1
 i 1  (i 1)
p , Ap (i 1)
 
x(i)= x(i-1) + i-1p(i-1)
r(i)= r(i-1) – i-1[A]p(i-1)
 r (i ) , Ap ( i 1)
 
check convergence |r|  i 1  (i 1)
(if not converged)
p , Ap (i 1)
 
calc. i-1
p(i)= r(i) + i-1 p(i-1)
end
FEM1D 53
Properties of CG Algorithm
Following “conjugate（共役）” relationship is obtained for
arbitrary (i,j):
(i )
p , Ap ( j )   0 i  j 
Following relationships are also obtained for p(k) and r(k):

(i ) (k )
r , r ( j )  0 i  j ,
 p , r (k )  r (k ) , r (k )
  
In N-dimensional space, only N sets of orthogonal and linearly
independent residual vector r(k). This means CG method
converges after N iterations if number of unknowns is N.
Actually, round-off error sometimes affects convergence.
FEM1D 54
Proof (1/3) (i )
r , r ( j )  0 i  j 

Mathematical Induction (i )
p , Ap ( j )  0 i  j 

数学的帰納法
(k )
, r (k )
p 
(1)  k  (k )
p , Ap ( k )
 
(2) r ( k 1)  r ( k )   k Ap ( k )
(3) p ( k 1)  r ( k 1)   k p ( k ) , r ( 0 )  p ( 0 )
(4)  r ( k 1) , Ap ( k )
 
k 
p ( k ) , Ap ( k )
 
FEM1D 55
Proof (2/3) (i )
r , r ( j )  0 i  j 

Mathematical Induction (i ) ( j)
(＊)
p , Ap   0 i  j 
数学的帰納法
(＊) is satisfied for i  k , j  k where i  j
if i < k r ( k 1) , r ( i )  r ( i ) , r ( k 1) (2)
  
 r (i ) , r ( k )   k Ap ( k )   
(＊) (i ) (k ) (4) (i ) ( i 1) (k )

  k r , Ap     p   p , Ap  k i 1
(i ) (k ) (＊) ( i 1) (k )
  k p , Ap      p , Ap   0 k i 1
if i = k ( k 1) (k ) (k ) (k ) (k )
r , r (k ) (2) r , r   r , Ap  k
(3) (k ) (k ) (k ) ( k 1) (k )
(k ) (k )  r , r    p   p ,  Ap  k 1 k
,r p 
(1)  k  ( k ) (＊) (k ) (k ) (1) (k ) (k ) (k ) (k ) (k ) (k )
p , Ap ( k )
   r , r     p , Ap   r , r    p , r 
k
( k 1) (k ) (k )
(2) r r   k Ap (3) (k ) (k ) ( k 1) (k ) (k )
 r , r    p  r ,r  k 1
( k 1)
(3) p  r ( k 1)   k p ( k ) (2)
( k 1) (k ) ( k 1) ( k 1) ( k 1)
   pk 1 , r     p ,r   Ap  k 1 k 1
 r ( k 1) , Ap ( k )
 
(4)  k  ( k 1) (1)
( k 1) ( k 1) ( k 1)
p ( k ) , Ap ( k )
      p
k 1 ,r     p , Ap  0 k 1
FEM1D 56
Proof (3/3) (i )
r , r ( j )  0 i  j 

Mathematical Induction (i ) ( j)
(＊)
p , Ap   0 i  j 
数学的帰納法
(k )
, r (k )
p 
(1)  k  ( k )
(＊) is satisfied for i  k , j  k where i  j p , Ap ( k )
 
(3) (2) r ( k 1)  r ( k )   k Ap ( k )
( k 1)
if i < k p  
, Ap (i )  r ( k 1)   k p ( k ) , Ap (i )  (3) p
( k 1)
 r ( k 1)   k p ( k )
(＊) ( k 1)
 r , Ap ( i )   r ( k 1) , Ap ( k )
 
(4)  k 
(2) 1 ( k 1) (i ) ( i 1)
p ( k ) , Ap ( k )
 
 r ,r  r  0
k
( k 1)
if i = k p (3)   
, Ap ( k )  r ( k 1) , Ap ( k )   k p ( k ) , Ap ( k ) 
(4)
0
FEM1D 57
( k 1)
r , r (k )  0

( k 1) (2) (k )
r , r (k )   r   
, r ( k )  r ( k ) ,  k Ap ( k )
(3) (k ) (k ) ( k 1) (k )
 r , r (k )   p   p , k 1 k Ap 
(＊) (k ) (1)
(k ) (k ) (k ) (k )
 r , r (k )     p , Ap   r
k , r (k )   p , r (k )  0

(k )
(k ) (k ) (k ) (k ) , r (k )
p 
 r ,r   p ,r  (1)  k  ( k )
p , Ap ( k )
 
(2) r ( k 1)  r ( k )   k Ap ( k )
( k 1)
(3) p  r ( k 1)   k p ( k )
 r ( k 1) , Ap ( k )
 
(4)  k 
p ( k ) , Ap ( k )
 
FEM1D 58
k，k
Usually, we use simpler definitions of k，k as follows:
(k ) (k ) (k )
 p , b  Ax   p , r (k )
 r (k ) , r (k )
 
 
k 
(k ) (k ) (k ) (k )
 (k )
 p , Ap   p , Ap p , Ap ( k )
  
(k ) (k ) (k ) (k )
  p , r   r , r 
 r ( k 1) , Ap ( k )
 r ( k 1) , r ( k 1)
  
k  (k ) (k )

p , Ap  r (k ) , r (k )
  
( k 1) ( k 1)
( k 1) (k ) r , r ( k )  r ( k 1) , r ( k 1) 
, Ap
r     r
k k
FEM1D 59

for i= 1, 2, …
x(i) : Vector
z(i-1)= r(i-1) i : Scalar
i-1= r(i-1) z(i-1)
if i=1
p(1)= z(0) ( i 1)
else r , r (i 1) i 1
   
i-1= i-1/i-2
 i 1 (i 2)
r , r (i 2) i 2
   
p(i)= z(i-1) + i-1 p(i-1)
endif ( i 1)
q(i)= [A]p(i) r , r ( i 1)   i 1 
i = i-1/p(i)q(i)
i  (i )
p , Ap ( i ) 
x(i)= x(i-1) + ip(i)
r(i)= r(i-1) - iq(i)
end
FEM1D 60
Preconditioning for Iterative Solvers

 Convergence rate of iterative solvers strongly depends
on the spectral properties (eigenvalue distribution) of
the coefficient matrix A.
 Eigenvalue distribution is small, eigenvalues are close to 1
 In “ill-conditioned” problems, “condition number” (ratio of
max/min eigenvalue if A is symmetric) is large（条件数）.
 A preconditioner M (whose properties are similar to
those of A) transforms the linear system into one with
more favorable spectral properties （前処理）
 M transforms Ax=b into A'x=b' where A'=M-1A, b'=M-1b
 If M~A, M-1A is close to identity matrix.
 If M-1=A-1, this is the best preconditioner (Gaussian Elim.)
 Generally, A'x’=b’ where A'=ML-1AMR-1, b'=ML-1b, x’=MRx
 ML/MR: Left/Right Preconditioning（左／右前処理）
FEM1D 61
Preconditioned CG Solver
Compute r(0)= b-[A]x(0) [M]= [M1][M2]
for i= 1, 2, …
solve [M]z(i-1)= r(i-1) [A’]x’=b’
i-1= r(i-1) z(i-1) [A’]=[M1]-1[A][M2]-1
if i=1 x’=[M2]x, b’=[M1]-1b
p(1)= z(0)
else p’=>[M2]p, r’=>[M1]-1r
i-1= i-1/i-2 x
p(i)= z(i-1) + i-1 p(i-1) (i)
p’ = r’(i-1) + ’i-1 p’(i-1)
endif
q(i)= [A]p(i) [M2]p(i)= [M1]-1r(i-1) + ’i-1 [M2]p(i-1)
i = i-1/p(i)q(i)
x(i)= x(i-1) + ip(i) p(i)= [M2]-1[M1]-1r(i-1) + ’i-1 p(i-1)
r(i)= r(i-1) - iq(i) p(i)= [M]-1r(i-1) + ’i-1 p(i-1)
end
’i-1= ([M]-1r(i-1),r(i-1))/
([M]-1r(i-2),r(i-2))
’i-1= ([M]-1r(i-1),r(i-1))/
( p(i-1),[A]p(i-1))
FEM1D 62
ILU(0), IC(0)
• Widely used Preconditioners for Sparse Matrices
– Incomplete LU Factorization（不完全LU分解）
– Incomplete Cholesky Factorization (for Symmetric
Matrices)（不完全コレスキー分解）
• Incomplete Direct Method

– Even if original matrix is sparse, inverse matrix is not
necessarily sparse.
– fill-in
– ILU(0)/IC(0) without fill-in have same non-zero pattern
with the original (sparse) matrices
FEM1D 63
Diagonal Scaling, Point-Jacobi
D1 0 ... 0 0
0 D 0 0 
 2 
M    ... ... ... 
 
0 0 DN 1 0 
 0 0 ... 0 DN 
• solve [M]z(i-1)= r(i-1) is very easy.

• Provides fast convergence for simple problems.
• 1d.f, 1d.c
FEM1D 64

Galerkin FEM
– Preconditioning
• Program
FEM1D 65
Coef. Matrix derived from FEM

D X X X  1   F1 
X D X X X X     F 
• Sparse Matrix   2   2 
 X D X X X X  3   F3 
    
 X D X X  4   F4 
– Many “0”’s X X D X X X  5   F5 
    
X X X X D X X X X  6   F6 
• Storing all components  X X X X D X X X X     F 
  7   7 
 X X X D X X  8   F8 
 X X D X X X      F 
(e.g. A(i,j)) is not efficient   9   9 
 X X X X D X X X X 10  F10 
    
 X X X X D X X X X 11  F11 
for sparse matrices  X X X D X X 12  F12 
    
 X X D X 13  F13 
– A(i,j) is suitable for dense  X X X X D X 14  F14 
    
 X X X X D X 15  F15 
matrices     
 X X X D16  F16 
• Number of non-zero off-diagonal components is

O(100) in FEM
– If number of unknowns is 108 :
• A(i,j): O(1016) words
• Actual Non-zero Components：O(1010) words
• Only (really) non-zero off-diag. components should
be stored on memory
FEM1D 66
Variables/Arrays in 1d.f, 1d.c

related to coefficient matrix
name type size description
N I - # Unknowns
NPLU I - # Non-Zero Off-Diagonal Components

Diag(:) R N Diagonal Components
U(:) R N Unknown Vector
Rhs(:) R N RHS Vector

0:N Off-Diagonal Components (Number of Non-Zero Off-
Index(:) I
N+1 Diagonals at Each ROW)
Item(:) I NPLU Off-Diagonal Components (Corresponding Column ID)
AMat(:) R NPLU Off-Diagonal Components (Value)
Only non-zero components are stored according

to “Compressed Row Storage”.
FEM1D 67
Mat-Vec. Multiplication for Sparse Matrix

Compressed Row Storage (CRS)
Diag (i) Diagonal Components (REAL, i=1~N)
Index(i) Number of Non-Zero Off-Diagonals at Each ROW (INT, i=0~N)
Item(k) Off-Diagonal Components (Corresponding Column ID)
（INT, k=1, index(N)）
AMat(k) Off-Diagonal Components (Value)
（ REAL, k=1, index(N) ）
{Y}= [A]{X} D X X X  1   F1 
X D X X X X     F 
  2   2 
 X D X X X X  3   F3 
    
 X D X X  4   F4 
X X D X X X  5   F5 
do i= 1, N 
X X X X D X X X X
   
 6   F6 
 X X X X D X X X X     F 
Y(i)= Diag(i)*X(i)   7   7 
 X X X D X X  8   F8 
 X X D X X X      F 
do k= Index(i-1)+1, Index(i)   9   9 
 X X X X D X X X X 10  F10 
    
 X X X X D X X X X 11  F11 
Y(i)= Y(i) + Amat(k)*X(Item(k)) 

X X X D X X 12  F12 
   
 X X D X 13  F13 
enddo  X X X X D X   F 
  14   14 
 X X X X D X 15  F15 
    
enddo  X X X D16  F16 
FEM1D 68
CRS or CSR ?
for Compressed Row Storage
• In Japan and USA, “CRS” is very general for
abbreviation of “Compressed Row Storage”, but they
usually use “CSR” in Europe (especially in France).
• “CRS” in France
– Compagnie Républicaine de Sécurité
• Republic Security Company of France
• French scientists may feel
uncomfortable when we use “CRS” in
technical papers and/or presentations.
FEM1D 69
Mat-Vec. Multiplication for Sparse Matrix

{Q}=[A]{P}
for(i=0;i<N;i++){
W[Q][i] = Diag[i] * W[P][i];
for(k=Index[i];k<Index[i+1];k++){
W[Q][i] += AMat[k]*W[P][Item[k]];
}
}
FEM1D 70
Mat-Vec. Multiplication for Dense Matrix

Very Easy, Straightforward
 a11 a12 ... a1, N 1 a1, N   x1   y1 
 a a a a  x   y 
 21 22 2, N 1 2, N   2   2 
 ... ... ...      
  x   y 
aN 1,1 aN 1,2 aN 1, N 1 aN 1, N 
 N 1   N 1 
 aN ,1 aN ,2 ... aN , N 1 aN , N   xN   yN 
 
{Y}= [A]{X}
do j= 1, N
Y(j)= 0.d0
do i= 1, N
Y(j)= Y(j) + A(i,j)*X(i)
enddo
enddo
FEM1D 71

1 2 3 4 5 6 7 8
1 1.1 2.4 0 0 3.2 0 0 0 
2
4.3 3.6 0 2.5 0 3.7 0
 9.1 
3 0 0 5.7 0 1.5 0 3.1 0 
 
4 4.1 0 9.8 2.5 2.7 0
0 0 
5 3.1 9.5 10.4 0 11.5 0 4.3 0 
 
6 0 0 6.5 0 0 12.4 9.5 0 
7 0 6.4 2.5 0 0 1.4 23.1 13.1
 
8  0 9.5 1.3 9.6 0 3.1 0 51.3
FEM1D 72
Compressed Row Storage (CRS): C

Numbering starts from 0 in program
0 1 2 3 4 5 6 7
1.1 2.4 3.2 N= 8
0
◎ ① ④
4.3 3.6 2.5 3.7 9.1 Diagonal Components
1 Diag[0]= 1.1
◎ ① ③ ⑤ ⑦
5.7 1.5 3.1 Diag[1]= 3.6
2 Diag[2]= 5.7
② ④ ⑥
4.1 9.8 2.5 2.7 Diag[3]= 9.8
3 Diag[4]= 11.5
① ③ ④ ⑤
Diag[5]= 12.4
3.1 9.5 10.4 11.5 4.3
4 Diag[6]= 23.1
◎ ① ② ④ ⑥
Diag[7]= 51.3
6.5 12.4 9.5
5
② ⑤ ⑥
6.4 2.5 1.4 23.1 13.1
6
① ② ⑤ ⑥ ⑦
7 9.5 1.3 9.6 3.1 51.3
① ② ③ ⑤ ⑦
FEM1D 73
Compressed Row Storage (CRS)：C

0 1 2 3 4 5 6 7
1.1 2.4 3.2
0
◎ ① ④
3.6 4.3 2.5 3.7 9.1
1
① ◎ ③ ⑤ ⑦
5.7 1.5 3.1
2
② ④ ⑥
3 9.8 4.1 2.5 2.7
③ ① ④ ⑤
11.5 3.1 9.5 10.4 4.3
4
④ ◎ ① ② ⑥
12.4 6.5 9.5
5
⑤ ② ⑥
23.1 6.4 2.5 1.4 13.1
6
⑥ ① ② ⑤ ⑦
7 51.3 9.5 1.3 9.6 3.1
⑦ ① ② ③ ⑤
FEM1D 74

# Non-Zero Index[0]= 0
Off-Diag.
1.1 2.4 3.2
0 2 Index[1]= 2
◎ ① ④
3.6 4.3 2.5 3.7 9.1 4 Index[2]= 6
1
① ◎ ③ ⑤ ⑦
5.7 1.5 3.1
2 2 Index[3]= 8
② ④ ⑥
3 9.8 4.1 2.5 2.7
3 Index[4]= 11
③ ① ④ ⑤
11.5 3.1 9.5 10.4 4.3
4 4 Index[5]= 15
④ ◎ ① ② ⑥
12.4 6.5 9.5
5 2 Index[6]= 17
⑤ ② ⑥
23.1 6.4 2.5 1.4 13.1 4 Index[7]= 21
6
⑥ ① ② ⑤ ⑦ NPLU= 25
7 51.3 9.5 1.3 9.6 3.1 4 Index[8]= 25 (=Index[N])
⑦ ① ② ③ ⑤
(Index[i])th~(Index[i+1]) th:
Non-Zero Off-Diag. Components corresponding to i-th row.

FEM1D 75

# Non-Zero Index[0]= 0
Off-Diag.
1.1 2.4 3.2
0 2 Index[1]= 2
◎ ① ④
3.6 4.3 2.5 3.7 9.1 4 Index[2]= 6
1
① ◎ ③ ⑤ ⑦
5.7 1.5 3.1
2 2 Index[3]= 8
② ④ ⑥
3 9.8 4.1 2.5 2.7
3 Index[4]= 11
③ ① ④ ⑤
11.5 3.1 9.5 10.4 4.3
4 4 Index[5]= 15
④ ◎ ① ② ⑥
12.4 6.5 9.5
5 2 Index[6]= 17
⑤ ② ⑥
23.1 6.4 2.5 1.4 13.1 4 Index[7]= 21
6
⑥ ① ② ⑤ ⑦
7 51.3 9.5 1.3 9.6 3.1 4 Index[8]= 25 NPLU= 25
⑦ ① ② ③ ⑤ (=Index[N])

FEM1D 76
1.1 2.4 3.2

0
◎ ① ④
3.6 4.3 2.5 3.7 9.1 Item[ 6]= 4, AMat[ 6]= 1.5
1 Item[18]= 2, AMat[18]= 2.5
① ◎ ③ ⑤ ⑦
5.7 1.5 3.1
2
② ④ ⑥
3 9.8 4.1 2.5 2.7
③ ① ④ ⑤
11.5 3.1 9.5 10.4 4.3
4
④ ◎ ① ② ⑥
12.4 6.5 9.5
5
⑤ ② ⑥
23.1 6.4 2.5 1.4 13.1
6
⑥ ① ② ⑤ ⑦
7 51.3 9.5 1.3 9.6 3.1
⑦ ① ② ③ ⑤
FEM1D 77
1.1 2.4 3.2 Diag [i] Diagonal Components (REAL, i=0~N-1)

0
◎ ①,0 ④,1 Index[i[ Number of Non-Zero Off-Diagonals at
3.6 4.3 2.5 3.7 9.1 Each ROW (INT, i=0~N)
1
① ◎,2 ③,3 ⑤,4 ⑦,5 Item[k] Off-Diagonal Components
5.7 1.5 3.1 (Corresponding Column ID)
2
② ④,6 ⑥,7 （INT, k=0, index[N]）
9.8 4.1 2.5 2.7 Amat[k] Off-Diagonal Components (Value)
3
③ ①,8 ④,9 ⑤,10 （ REAL, k=0, index[N] ）
11.5 3.1 9.5 10.4 4.3
4 {Y}=[A]{X}
④ ◎,11 ①,12 ②,13 ⑥,14
12.4 6.5 9.5
5 for(i=0;i<N;i++){
⑤ ②,15 ⑥,16
Y[i] = Diag[i] * X[i];
23.1 6.4 2.5 1.4 13.1 for(k=Index[i];k<Index[i+1];k++){
6
⑥ ①,17 ②,18 ⑤,19 ⑦,20 Y[i] += AMat[k]*X[Item[k]];
7 51.3 9.5 1.3 9.6 3.1 }
⑦ ①,21 ②,22 ③,23 ⑤,24 }
FEM1D 78

Galerkin FEM
– Preconditioning
• Program
FEM1D 79
Finite Element Procedures

• Initialization
– Control Data
– Node, Connectivity of Elements (N: Node#, NE: Elem#)
– Initialization of Arrays (Global/Element Matrices)
– Element-Global Matrix Mapping (Index, Item)
• Generation of Matrix
– Element-by-Element Operations (do icel= 1, NE)
• Element matrices
• Accumulation to global matrix
– Boundary Conditions
• Linear Solver
FEM1D 80
Program: 1d.c (1/6)

variables and arrays
/*
// 1D Steady-State Heat Transfer
// FEM with Piece-wise Linear Elements
// CG (Conjugate Gradient) Method
//
// d/dx(CdT/dx) + Q = 0
// T=0@x=0
*/
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <assert.h>
int main(){
int NE, N, NPLU, IterMax;
int R, Z, Q, P, DD;
double dX, Resid, Eps, Area, QV, COND;
double X1, X2, U1, U2, DL, Strain, Sigma, Ck;
double QN, XL, C2, Xi, PHIa;
double *PHI, *Rhs, *X;
double *Diag, *AMat;
double **W;
int *Index, *Item, *Icelnod;
double Kmat[2][2], Emat[2][2];
int i, j, in1, in2, k, icel, k1, k2, jS;
int iter;
FILE *fp;
double BNorm2, Rho, Rho1=0.0, C1, Alpha, DNorm2;
int ierr = 1;
int errno = 0;
FEM1D 81
Variable/Arrays (1/2)
Name Type Size I/O Definition
NE I I # Element
N I O # Node
NPLU I O # Non-Zero Off-Diag. Components
IterMax I I MAX Iteration Number for CG
errno I O ERROR flag
R, Z, Q, P, I O Name of Vectors in CG
DD
dX R I Length of Each Element
Resid R O Residual for CG
Eps R I Convergence Criteria for CG
Area R I Sectional Area of Element
QV R I Heat Generation Rate/Volume/Time Q
COND R I Thermal Conductivity
FEM1D 82
Variable/Arrays (2/2)
Name Type Size I/O Definition
X R N O Location of Each Node
PHI R N O Temperature of Each Node
Rhs R N O RHS Vector
Diag R N O Diagonal Components
W R [4][N] O Work Array for CG
Amat R NPLU O Off-Diagonal Components (Value)
Index I N+1 O Number of Non-Zero Off-Diagonals at
Each ROW
Item I NPLU O Off-Diagonal Components
(Corresponding Column ID)
Icelnod I 2*NE O Node ID for Each Element
Kmat R [2][2] O Element Matrix [k]
Emat R [2][2] O Element Matrix
FEM1D 83
Program: 1d.c (2/6)

Initialization, Allocation of Arrays
/*
// +-------+ Control Data input.dat
// | INIT. |
// +-------+ 4 NE (Number of Elements)
*/ 1.0 1.0 1.0 1.0 x (Length of Each Elem.: L) ，Q, A, 
fp = fopen("input.dat", "r"); 100 Number of MAX. Iterations for CG Solver
assert(fp != NULL); 1.e-8 Convergence Criteria for CG Solver
fscanf(fp, "%d", &NE);
fscanf(fp, "%lf %lf %lf %lf", &dX, &QV, &Area, &COND);
fscanf(fp, "%d", &IterMax);
fscanf(fp, "%lf", &Eps);
fclose(fp);
1 2 3 4
N= NE + 1; 1 2 3 4 5
PHI = calloc(N, sizeof(double));

X = calloc(N, sizeof(double)); NE: # Element
Diag = calloc(N, sizeof(double));
N : # Node (NE+1)
AMat = calloc(2*N-2, sizeof(double));
Rhs = calloc(N, sizeof(double));
Index= calloc(N+1, sizeof(int));
Item = calloc(2*N-2, sizeof(int));
Icelnod= calloc(2*NE, sizeof(int));
FEM1D 84
Program: 1d.c (2/6)

/*
// +-------+
// | INIT. |
// +-------+
*/
fp = fopen("input.dat", "r");
assert(fp != NULL);
fclose(fp);
N= NE + 1;
X = calloc(N, sizeof(double));
AMat = calloc(2*N-2, sizeof(double)); Amat: Non-Zero Off-Diag. Comp.
Rhs = calloc(N, sizeof(double)); Item: Corresponding Column ID
Index= calloc(N+1, sizeof(int));
Item = calloc(2*N-2, sizeof(int));
Icelnod= calloc(2*NE, sizeof(int));
FEM1D 85
Element/Global Operations
1 2 3 4
Local Node ID
e e e e
1 2 1 2 1 2 1 2
T T T T
 d N  d N    d N  d N    d N  d N    d N  d N  
  E  dV      E  dV      E  dV      E  dV   
V  dx V V V
dx   dx dx   dx dx   dx dx 
T T T T T T T T
   N  dS   X N  dV  0    N  dS   X N  dV  0    N  dS   X N  dV  0    N  dS   X N  dV  0
S V S V S V S V
[k ]( 3) {}( 3)  { f }(3) [k ]( 4 ) {}( 4)  { f }( 4 )

[k ](1) {}(1)  { f }(1) [k ]( 2 ) {}( 2 )  { f }( 2 )
k11( 3) k12( 3)  1( 3)   f1( 3)  k11( 4 ) k12( 4 )  1( 4 )   f1( 4) 
k11( 2 ) k12( 2 )  1( 2)   f1( 2)   ( 4)   (4) 
k11(1) k12(1)  1(1)   f1(1)    ( 2)   ( 3) ( 3)   ( 3) 
  ( 3)  ( 4)   ( 4) 
 (1) (1)   (1) 
  (1)   (2) ( 2)   (2)  k 22 k 21 k 22  2   f 2 
k 22 k 21 k 22  2   f 2  k 21  2   f 2 
k 21  2   f 2 
1 2 3 4
1 2 3 4 5
[ K ]{}  {F }
 D1 AU11   1   B1  Number of non-zero off-
 AL D2 AU 21     B 
 21   2   2  diag. components is 2 for
 AL31 D3 AU 31   3    B3 
  each node. This number
 AL41 D4 AU 41   4   B4 
    is 1 at boundary nodes).
 AL51 D5   5   B5 
FEM1D 86
Attention: In C program, node and

element ID’s start from 0.
1 2 3 4
1 2 3 4 5
0 1 2 3
0 1 2 3 4
e e e e
0 1 0 1 0 1 0 1
FEM1D 87
Program: 1d.c (2/6)

/*
// +-------+ icel
// | INIT. |
// +-------+
*/ Icelnod[2*icel] Icelnod[2*icel+1]
fp = fopen("input.dat", "r");
assert(fp != NULL); =icel =icel+1
fclose(fp); Amat: Non-Zero Off-Diag. Comp.
N= NE + 1; Item: Corresponding Column ID
X = calloc(N, sizeof(double)); Number of non-zero off-diag.
components is 2 for each node. This
AMat = calloc(2*N-2, sizeof(double)); number is 1 at boundary nodes).
Rhs = calloc(N, sizeof(double));
Index= calloc(N+1, sizeof(int)); Total Number of Non-Zero Off-Diag.
Item = calloc(2*N-2, sizeof(int)); Components:
Icelnod= calloc(2*NE, sizeof(int)); 2*(N-2)+1+1= 2*N-2
FEM1D 88
Program: 1d.c (3/6)

Initialization, Allocation of Arrays (cont.)
W = (double **)malloc(sizeof(double *)*4);
if(W == NULL) {
fprintf(stderr, "Error: %s¥n", strerror(errno));
return -1;
}
for(i=0; i<4; i++) {
W[i] = (double *)malloc(sizeof(double)*N);
if(W[i] == NULL) {
return -1;
}
}
for(i=0;i<N;i++) PHI[i] = 0.0;
for(i=0;i<N;i++) Diag[i] = 0.0;
for(i=0;i<N;i++) Rhs[i] = 0.0;
for(k=0;k<2*N-2;k++) AMat[k] = 0.0; X: X-coordinate
for(i=0;i<N;i++) X[i]= i*dX; component of each node
for(icel=0;icel<NE;icel++){
Icelnod[2*icel ]= icel;
Icelnod[2*icel+1]= icel+1;
}
Kmat[0][0]= +1.0;
Kmat[0][1]= -1.0;
Kmat[1][0]= -1.0;
Kmat[1][1]= +1.0;
FEM1D 89
Program: 1d.c (3/6)

if(W == NULL) {
return -1;
}
for(i=0; i<4; i++) {
if(W[i] == NULL) {
return -1;
}
}
for(i=0;i<N;i++) PHI[i] = 0.0;
for(i=0;i<N;i++) Diag[i] = 0.0;
for(i=0;i<N;i++) Rhs[i] = 0.0;
for(k=0;k<2*N-2;k++) AMat[k] = 0.0;
for(i=0;i<N;i++) X[i]= i*dX;
for(icel=0;icel<NE;icel++){ icel
Icelnod[2*icel+1]= icel+1;
} Icelnod[2*icel] Icelnod[2*icel+1]
Kmat[0][0]= +1.0; =icel =icel+1
Kmat[0][1]= -1.0;
Kmat[1][0]= -1.0;
Kmat[1][1]= +1.0;
FEM1D 90
Program: 1d.c (3/6)

if(W == NULL) {
return -1;
}
for(i=0; i<4; i++) {
if(W[i] == NULL) {
return -1;
}
}
for(i=0;i<N;i++) PHI[i] = 0.0;
for(i=0;i<N;i++) Diag[i] = 0.0;
for(i=0;i<N;i++) Rhs[i] = 0.0;
for(k=0;k<2*N-2;k++) AMat[k] = 0.0;
for(i=0;i<N;i++) X[i]= i*dX;
Icelnod[2*icel+1]= icel+1; (e)  d N T d N    A  1  1
} k  dV 
   
dx dx  L   1  1
Kmat[0][0]= +1.0; V    
Kmat[0][1]= -1.0;
Kmat[1][0]= -1.0; [Kmat]
Kmat[1][1]= +1.0;
FEM1D 91
Program: 1d.c (4/6)

Global Matrix: Column ID for Non-Zero Off-Diag’s
/*
// +--------------+ Number of non-zero off-diag. components is 2 for each
// | CONNECTIVITY | node. This number is 1 at boundary nodes).
// +--------------+
*/
for(i=0;i<N+1;i++) Index[i] = 2; Total Number of Non-Zero Off-Diag. Components:
Index[0]= 0; 2*(N-2)+1+1= 2*N-2= NPLU= Index[N]
Index[1]= 1;
Index[N]= 1; # Non-Zero Index[0]= 0
Off-Diag.
for(i=0;i<N;i++){ 1.1 2.4 3.2
2 Index[1]= 2
Index[i+1]= Index[i+1] + Index[i]; 0 ◎ ① ④
} 3.6 4.3 2.5 3.7 9.1
1 4 Index[2]= 6
① ◎ ③ ⑤ ⑦
NPLU= Index[N];
5.7 1.5 3.1
2 2 Index[3]= 8
for(i=0;i<N;i++){ ② ④ ⑥
jS = Index[i]; 3 9.8 4.1 2.5 2.7
3 Index[4]= 11
if(i == 0){ ③ ① ④ ⑤
Item[jS] = i+1; 11.5 3.1 9.5 10.4 4.3
}else if(i == N-1){ 4 4 Index[5]= 15
④ ◎ ① ② ⑥
Item[jS] = i-1;
}else{ 12.4 6.5 9.5
5 2 Index[6]= 17
Item[jS] = i-1; ⑤ ② ⑥
Item[jS+1] = i+1; 23.1 6.4 2.5 1.4 13.1 4 Index[7]= 21
6
} ⑥ ① ② ⑤ ⑦
} 51.3 9.5 1.3 9.6 3.1
7 4 Index[8]= 25
⑦ ① ② ③ ⑤

FEM1D 92
Program: 1d.c (4/6)

Global Matrix: Column ID for Non-Zero Off-Diag’s
/*
// +--------------+ i-1 i
// | CONNECTIVITY | i-1 i i+1
// +--------------+
*/
for(i=0;i<N+1;i++) Index[i] = 2;
Index[0]= 0;
Index[1]= 1;
Index[N]= 1; # Non-Zero Index[0]= 0
Off-Diag.
for(i=0;i<N;i++){ 1.1 2.4 3.2
2 Index[1]= 2
Index[i+1]= Index[i+1] + Index[i]; 0 ◎ ① ④
} 3.6 4.3 2.5 3.7 9.1
1 4 Index[2]= 6
① ◎ ③ ⑤ ⑦
NPLU= Index[N];
5.7 1.5 3.1
2 2 Index[3]= 8
for(i=0;i<N;i++){ ② ④ ⑥
jS = Index[i]; 3 9.8 4.1 2.5 2.7
3 Index[4]= 11
if(i == 0){ ③ ① ④ ⑤
Item[jS] = i+1; 11.5 3.1 9.5 10.4 4.3
}else if(i == N-1){ 4 4 Index[5]= 15
④ ◎ ① ② ⑥
Item[jS] = i-1;
}else{ 12.4 6.5 9.5
5 2 Index[6]= 17
Item[jS] = i-1; ⑤ ② ⑥
Item[jS+1] = i+1; 23.1 6.4 2.5 1.4 13.1 4 Index[7]= 21
6
} ⑥ ① ② ⑤ ⑦
} 51.3 9.5 1.3 9.6 3.1
7 4 Index[8]= 25
⑦ ① ② ③ ⑤

FEM1D 93
Program: 1d.c (5/6)

Element Matrix ~ Global Matrix
/*
// +-----------------+
// | MATRIX assemble |
// +-----------------+
*/
in1= Icelnod[2*icel]; icel
in2= Icelnod[2*icel+1]; in1 in2
X1 = X[in1];
X2 = X[in2];
DL = fabs(X2-X1);
Ck= Area*COND/DL;
Emat[0][0]= Ck*Kmat[0][0];
Emat[0][1]= Ck*Kmat[0][1];
Emat[1][0]= Ck*Kmat[1][0];
Emat[1][1]= Ck*Kmat[1][1];
Diag[in1]= Diag[in1] + Emat[0][0];
if (icel==0){k1=Index[in1];
}else {k1=Index[in1]+1;}
k2=Index[in2];
AMat[k1]= AMat[k1] + Emat[0][1];
QN= 0.5*QV*Area*dX;
Rhs[in1]= Rhs[in1] + QN;
}
FEM1D 94
Program: 1d.c (5/6)

/*
// +-----------------+
// +-----------------+
*/
X1 = X[in1];
X2 = X[in2];
DL = fabs(X2-X1);
Ck= Area*COND/DL;  1  1 A
Emat[0][0]= Ck*Kmat[0][0];  Kmat 
Emat[0][1]= Ck*Kmat[0][1]; Emat   k (e)  A  
Emat[1][0]= Ck*Kmat[1][0]; L  1  1 L
Emat[1][1]= Ck*Kmat[1][1];
k2=Index[in2];
QN= 0.5*QV*Area*dX;
}
FEM1D 95
Program: 1d.c (5/6)

/*
// +-----------------+
// +-----------------+
*/
X1 = X[in1];
X2 = X[in2];
DL = fabs(X2-X1);
Ck= Area*COND/DL;
Emat[0][0]= Ck*Kmat[0][0];
Emat[0][1]= Ck*Kmat[0][1];
Emat[1][0]= Ck*Kmat[1][0];
Emat[1][1]= Ck*Kmat[1][1];
(e) EA  1  1
Diag[in1]= Diag[in1] + Emat[0][0]; Emat   k  
Diag[in2]= Diag[in2] + Emat[1][1]; L   1  1
k2=Index[in2];
QN= 0.5*QV*Area*dX;
}
FEM1D 96
Program: 1d.c (5/6)

/*
// +-----------------+
// +-----------------+
*/
X1 = X[in1];
X2 = X[in2];
DL = fabs(X2-X1);
Ck= Area*COND/DL;
Emat[0][0]= Ck*Kmat[0][0];
Emat[0][1]= Ck*Kmat[0][1];
Emat[1][0]= Ck*Kmat[1][0];
Emat[1][1]= Ck*Kmat[1][1];
Non-zero Off-Diag. at i-th row:
Diag[in2]= Diag[in2] + Emat[1][1]; Index[i], Index[i]+1
}else {k1=Index[in1]+1;} i-1 i i+1
k2=Index[in2];
Index[i] Index[i]+1
QN= 0.5*QV*Area*dX; (e) A  1  1 k1
Rhs[in1]= Rhs[in1] + QN; Emat   k  
Rhs[in2]= Rhs[in2] + QN; L   1  1
}
k2
FEM1D 97
General Elements: k1
“in2” as a off-diag. component of “in1”
/*
// +-----------------+
// +-----------------+
*/
in1= Icelnod[2*icel];
in2= Icelnod[2*icel+1];
X1 = X[in1];
X2 = X[in2]; icel
DL = fabs(X2-X1); in1 in2
Ck= Area*COND/DL;
Emat[0][0]= Ck*Kmat[0][0];
Emat[0][1]= Ck*Kmat[0][1];
Emat[1][0]= Ck*Kmat[1][0];
Emat[1][1]= Ck*Kmat[1][1];
}else {k1=Index[in1]+1;} i-1 in1 in2
k2=Index[in2];
Index[i] Index[in1]+1
QN= 0.5*QV*Area*dX; (e) A  1  1 k1
Rhs[in2]= Rhs[in2] + QN; L   1  1
}
FEM1D 98
General Elements: k2
/*
// +-----------------+
// +-----------------+
*/
X1 = X[in1];
X2 = X[in2]; icel
Ck= Area*COND/DL;
Emat[0][0]= Ck*Kmat[0][0];
Emat[0][1]= Ck*Kmat[0][1];
Emat[1][0]= Ck*Kmat[1][0];
Emat[1][1]= Ck*Kmat[1][1];
}else {k1=Index[in1]+1;} in1 in2 i+1
k2=Index[in2];
Index[in2] Index[i]+1
QN= 0.5*QV*Area*dX; (e) A  1  1
Rhs[in2]= Rhs[in2] + QN; L   1  1
}
k2
FEM1D 99
0-th Element: k1
/*
// +-----------------+
// +-----------------+
*/
X1 = X[in1];
X2 = X[in2]; icel＝0
Ck= Area*COND/DL;
Emat[0][0]= Ck*Kmat[0][0];
Emat[0][1]= Ck*Kmat[0][1];
Emat[1][0]= Ck*Kmat[1][0];
Emat[1][1]= Ck*Kmat[1][1];
Diag[in1]= Diag[in1] + Emat[0][0]; Non-zero Off-Diag. at i-th row: Index[i]
}else {k1=Index[in1]+1;} in1 in2
k2=Index[in2];
Index[in1]
QN= 0.5*QV*Area*dX; (e) A  1  1 k1
Rhs[in2]= Rhs[in2] + QN; L   1  1
}
FEM1D 100
Program: 1d.c (5/6)

RHS: Heat Generation Term
/*
// +-----------------+
// +-----------------+
*/
X1 = X[in1];
X2 = X[in2];
DL = fabs(X2-X1);
Ck= Area*COND/DL;
Emat[0][0]= Ck*Kmat[0][0];
Emat[0][1]= Ck*Kmat[0][1];
Emat[1][0]= Ck*Kmat[1][0];
Emat[1][1]= Ck*Kmat[1][1];
k2=Index[in2];
L
QN= 0.5*QV*Area*dX; T 1  x / L  Q AL 1
V Q N  dV  Q A0  x / L  dx  2 1
}
FEM1D 101
Program: 1d.c (6/6)

Dirichlet B.C. @ X=0
/*
// +---------------------+
// | BOUNDARY conditions |
// +---------------------+
*/
/* X=Xmin */
i=0;
jS= Index[i];
AMat[jS]= 0.0;
Diag[i ]= 1.0;
Rhs [i ]= 0.0;
for(k=0;k<NPLU;k++){
if(Item[k]==0){AMat[k]=0.0;
}}
FEM1D 102
  T  
 Q  0
x  x 
x=0 (xmin) x= xmax

T
x
FEM1D 103
(Linear) Equation at x=0

T1= 0 (or T0 = 0)
  T  
 Q  0
x  x 
x=0 (xmin) x= xmax

T
x
FEM1D 104
Program: 1d.c (6/6)

/*
// +---------------------+
// | BOUNDARY conditions | T1=0
// +---------------------+
*/
Diagonal Component=1
/* X=Xmin */ RHS=0
i=0;
jS= Index[i];
Off-Diagonal Components= 0.
AMat[jS]= 0.0;
Diag[i ]= 1.0;
Rhs [i ]= 0.0;
}}
FEM1D 105
Program: 1d.c (6/6)

/*
// +---------------------+
// | BOUNDARY conditions | T1=0
// +---------------------+
*/
Diagonal Component=1
/* X=Xmin */ RHS=0
i=0;
jS= Index[i];
Off-Diagonal Components= 0.
AMat[jS]= 0.0;
Diag[i ]= 1.0; Erase !
Rhs [i ]= 0.0;
}}
FEM1D 106
Program: 1d.c (6/6)

/*
// +---------------------+ T1=0
// +---------------------+ Diagonal Component=1
*/
RHS=0
/* X=Xmin */ Off-Diagonal Components= 0.
i=0;
jS= Index[i];
Elimination and Erase
AMat[jS]= 0.0;
Diag[i ]= 1.0;
Rhs [i ]= 0.0;
}}
Column components of boundary nodes (Dirichlet B.C.) are

moved to RHS and eliminated for keeping symmetrical feature of
the matrix (in this case just erase off-diagonal components)
FEM1D 107
if T1≠ 0
/*
// +---------------------+ Column components of boundary nodes
// +---------------------+ (Dirichlet B.C.) are moved to RHS and
*/ eliminated for keeping symmetrical feature
of the matrix.
/* X=Xmin */ Index[ j 1]1

i=0;
jS= Index[i]; Diag j j   Amat k Item[ k ]  Rhs j
AMat[jS]= 0.0; k  Index[ j ]
Diag[i ]= 1.0;
Rhs [i ]= PHImin;
for(j=1;i<N;i++){
for (k=Index[j];k<Index[j+1];k++){
if(Item[k]==0){
Rhs [j]= Rhs[j] – Amat[k]*PHImin;
AMat[k]= 0.0;
}}
FEM1D 108
if T1≠ 0
/*
// +---------------------+ Index[ j 1]1
// +---------------------+
Diag j j   Amat k Item[ k ]
*/ k  Index[ j ], k  k s
 Rhs j  Amatk s Item[ k s ]

/* X=Xmin */  Rhs j  Amatk s min where Item[k s ]  0
i=0;
jS= Index[i];
AMat[jS]= 0.0;
Diag[i ]= 1.0;
Rhs [i ]= PHImin;
for(j=1;i<N;i++){
for (k=Index[j];k<Index[j+1];k++){
if(Item[k]==0){
Rhs [j]= Rhs[j] – Amat[k]*PHImin;
AMat[k]= 0.0;
}} Column componentsof boundary nodes
(Dirichlet B.C.) are moved to RHS and
eliminated for keeping symmetrical feature
of the matrix.
FEM1D 109
Secondary B.C. (Insulated)

Heat Gen. Rate Q
  T  
 Q  0
x  x 
x=0 (xmin) x= xmax
T  0@ x  0 T
 0 @ x  xmax
x
T 0 dT
 q N  dS  q A xL
 q A , q   Surface Flux
S 1 dx
According to insulated B.C., q  0

is satisfied. No contribution by this term.
T Insulated B.C. is automatically satisfied
 0 @ x  xmax without explicit operations
x -> Natural B.C.
FEM1D 110
Preconditioned CG Solver
D1 0 ... 0 0
for i= 1, 2, …
0 D 0 0 
solve [M]z(i-1)= r(i-1) 2
 
i-1= r(i-1) z(i-1)
M    ... ... ... 
if i=1  
p(1)= z(0)  0 0 DN 1 0 
else  0 0 ... 0 DN 
i-1= i-1/i-2
p(i)= z(i-1) + i-1 p(i-1)
endif
q(i)= [A]p(i)
i = i-1/p(i)q(i)
x(i)= x(i-1) + ip(i)
r(i)= r(i-1) - iq(i)
end
FEM1D 111
Diagonal Scaling, Point-Jacobi

D1 0 ... 0 0
0 D 0 0 
 2 
M    ... ... ... 
 
0 0 DN 1 0 
 0 0 ... 0 DN 
• solve [M]z(i-1)= r(i-1) is very easy.

• Provides fast convergence for simple problems.
• 1d.f, 1d.c
FEM1D 112
CG Solver (1/6)
/*
// +---------------+
// | CG iterations |
// +---------------+
*/
R = 0;
Reciprocal numbers （逆数）of diagonal
Z = 1; components are stored in W[DD][i].
Q = 1; Computational cost for division is
P = 2;
usually expensive.
DD= 3;
for(i=0;i<N;i++){
W[DD][i]= 1.0 / Diag[i];
}
FEM1D 113
CG Solver (1/6)
/*
// +---------------+ Compute r(0)= b-[A]x(0)
// | CG iterations | for i= 1, 2, …
// +---------------+ solve [M]z(i-1)= r(i-1)
*/
R = 0;
i-1= r(i-1) z(i-1)
Z = 1; if i=1
Q = 1; p(1)= z(0)
P = 2; else
DD= 3;
i-1= i-1/i-2
for(i=0;i<N;i++){ p(i)= z(i-1) + i-1 p(i-1)
W[DD][i]= 1.0 / Diag[i]; endif
}
q(i)= [A]p(i)
i = i-1/p(i)q(i)
W[0][i]= W[R][i] ⇒ {r} x(i)= x(i-1) + ip(i)
W[1][i]= W[Z][i] ⇒ {z} r(i)= r(i-1) - iq(i)
W[1][i]= W[Q][i] ⇒ {q} end
W[2][i]= W[P][i] ⇒ {p}
W[3][i]= W[DD][i] ⇒ 1/{D}
FEM1D 114
CG Solver (2/6)
/* Compute r(0)= b-[A]x(0)
//-- {r0}= {b} - [A]{xini} | for i= 1, 2, …
*/
for(i=0;i<N;i++){ solve [M]z(i-1)= r(i-1)
W[R][i] = Diag[i]*U[i]; i-1= r(i-1) z(i-1)
for(j=Index[i];j<Index[i+1];j++){ if i=1
W[R][i] += AMat[j]*U[Item[j]];
}
p(1)= z(0)
} else
i-1= i-1/i-2
BNorm2 = 0.0; p(i)= z(i-1) + i-1 p(i-1)
for(i=0;i<N;i++){
BNorm2 += Rhs[i] * Rhs[i]; endif
W[R][i] = Rhs[i] - W[R][i]; q(i)= [A]p(i)
} i = i-1/p(i)q(i)
x(i)= x(i-1) + ip(i)
BNRM2=|b|2 r(i)= r(i-1) - iq(i)
for convergence criteria check convergence |r|
of CG solvers end
FEM1D 115
CG Solver (3/6)
for(iter=1;iter<=IterMax;iter++){ Compute r(0)= b-[A]x(0)
for i= 1, 2, …
/*
//-- {z}= [Minv]{r} solve [M]z(i-1)= r(i-1)
*/ i-1= r(i-1) z(i-1)
for(i=0;i<N;i++){ if i=1
W[Z][i] = W[DD][i] * W[R][i];
}
p(1)= z(0)
else
/* i-1= i-1/i-2
//-- RHO= {r}{z} p(i)= z(i-1) + i-1 p(i-1)
*/
Rho= 0.0; endif
for(i=0;i<N;i++){ q(i)= [A]p(i)
Rho += W[R][i] * W[Z][i]; i = i-1/p(i)q(i)
}
x(i)= x(i-1) + ip(i)
r(i)= r(i-1) - iq(i)
end
FEM1D 116
CG Solver (4/6)
//-- {p} = {z} if ITER=1 for i= 1, 2, …
// BETA= RHO / RHO1 otherwise
*/ solve [M]z(i-1)= r(i-1)
if(iter == 1){ i-1= r(i-1) z(i-1)
for(i=0;i<N;i++){ if i=1
W[P][i] = W[Z][i];
}
p(1)= z(0)
}else{ else
Beta = Rho / Rho1; i-1= i-1/i-2
for(i=0;i<N;i++){ p(i)= z(i-1) + i-1 p(i-1)
W[P][i] = W[Z][i] + Beta*W[P][i];
} endif
} q(i)= [A]p(i)
i = i-1/p(i)q(i)
/*
//-- {q}= [A]{p}
x(i)= x(i-1) + ip(i)
*/ r(i)= r(i-1) - iq(i)
for(i=0;i<N;i++){ check convergence |r|
W[Q][i] = Diag[i] * W[P][i]; end
for(j=Index[i];j<Index[i+1];j++){
W[Q][i] += AMat[j]*W[P][Item[j]];
}
}
FEM1D 117
CG Solver (5/6)
//-- ALPHA= RHO / {p}{q} for i= 1, 2, …
*/
C1 = 0.0; solve [M]z(i-1)= r(i-1)
for(i=0;i<N;i++){ i-1= r(i-1) z(i-1)
C1 += W[P][i] * W[Q][i]; if i=1
}
p(1)= z(0)
Alpha = Rho / C1; else
i-1= i-1/i-2
/* p(i)= z(i-1) + i-1 p(i-1)
//-- {x}= {x} + ALPHA*{p}
// {r}= {r} - ALPHA*{q} endif
*/ q(i)= [A]p(i)
for(i=0;i<N;i++){ i = i-1/p(i)q(i)
U[i] += Alpha * W[P][i];
W[R][i] -= Alpha * W[Q][i];
x(i)= x(i-1) + ip(i)
} r(i)= r(i-1) - iq(i)
end
FEM1D 118
CG Solver (6/6)
DNorm2 = 0.0; Compute r(0)= b-[A]x(0)
for(i=0;i<N;i++){ for i= 1, 2, …
DNorm2 += W[R][i] * W[R][i];
} solve [M]z(i-1)= r(i-1)
Resid = sqrt(DNorm2/BNorm2); i-1= r(i-1) z(i-1)
if i=1
if((iter)%1000 == 0){
printf("%8d%s%16.6e¥n",iter, "
p(1)= z(0)
iters, RESID=", Resid); else
} i-1= i-1/i-2
if(Resid <= Eps){ierr = 0; break;} p(i)= z(i-1) + i-1 p(i-1)
Rho1 = Rho; i-2
} endif
q(i)= [A]p(i)
i = i-1/p(i)q(i)
DNorm2 r Ax  b x(i)= x(i-1) + ip(i)
Resid     Eps
BNorm2 b b r(i)= r(i-1) - iq(i)
Control Data input.dat end
4 NE (Number of Elements)
1.0 1.0 1.0 1.0 x (Length of Each Elem.: L) ，Q, A, 
100 Number of MAX. Iterations for CG Solver
1.e-8 Convergence Criteria for CG Solver
FEM1D 119
Finite Element Procedures

• Initialization
– Control Data
– Node, Connectivity of Elements (N: Node#, NE: Elem#)
– Initialization of Arrays (Global/Element Matrices)
– Element-Global Matrix Mapping (Index, Item)
• Generation of Matrix
– Element-by-Element Operations (do icel= 1, NE)
• Element matrices
• Accumulation to global matrix
– Boundary Conditions
• Linear Solver
FEM1D 120
Remedies for Higher Accuracy

• Finer Meshes
NE=8, dX=12.5 F
 u log A1 x  A2   log A2 
8 iters, RESID= 2.822910E-16 U(N)= 1.953586E-01 EA1
### DISPLACEMENT
1 0.000000E+00 -0.000000E+00
2 1.101928E-02 1.103160E-02
3 2.348034E-02 2.351048E-02
4 3.781726E-02 3.787457E-02
5 5.469490E-02 5.479659E-02 1 2 3 4 5
6 7.520772E-02 7.538926E-02
7 1.013515E-01 1.016991E-01 1 2 3 4
8 1.373875E-01 1.381746E-01
9 1.953586E-01 1.980421E-01
NE=20, dX=5
20 iters, RESID= 5.707508E-15 U(N)= 1.975734E-01
### DISPLACEMENT
1 0.000000E+00 -0.000000E+00
2 4.259851E-03 4.260561E-03
3 8.719160E-03 8.720685E-03
4 1.339752E-02 1.339999E-02
……
17 1.145876E-01 1.146641E-01
18 1.295689E-01 1.296764E-01
19 1.473466E-01 1.475060E-01
20 1.692046E-01 1.694607E-01
21 1.975734E-01 1.980421E-01
FEM1D 121

• Finer Meshes
• Higher Order Shape/Interpolation Function（高次補
間関数・形状関数）
– Higher-Order Element（高次要素）
– Linear-Element, 1st-Order Element: Lower Order（低次要
素）
• Formulation which assures continuity of n-th order
derivatives
– Cn Continuity（Cn連続性）
FEM1D 122

• Finer Meshes
• Higher Order Shape/Interpolation Function（高次補
間関数・形状関数）
– Higher-Order Element（高次要素）
– Linear-Element, 1st-Order Element: Lower Order（低次要
素）
• Formulation which assures continuity of n-th order
derivatives
– Cn Continuity（Cn連続性）
• Linear Elements
– Piecewise Linear
– C0 Continuity
• Only dependent variables are continuous at element boundary
FEM1D 123
Example: 1D Heat Transfer (1/2)

T0  0C , h  0.10W / cm 2  K • Temp. Thermal Fins
• Circular Sectional Area,
  4.00W / cm  K 銅Cupper r=1cm
• Boundary Condition
7.50cm
– x=0 : Fixed Temperature
TS  150C – x=7.5 : Insulated
• Convective Heat Transfer
on Cylindrical Surface
– q= h (T-T0)
– q：Heat Flux
Convective Heat Transfer on
Cylindrical Surface • Heat Flow/Unit Surface
Area/sec.
FEM1D 124
Example: 1D Heat Transfer (2/2)

150
### RESULTS (linear interpolation)
ID X FEM. ANALYTICAL Exact Solution
1 0.00000 150.00000 150.00000 ERR(%): 0.00000
2 1.87500 102.62226 103.00165 ERR(%): 0.25292
3 3.75000 73.82803 74.37583 ERR(%): 0.36520 100
4 5.62500 58.40306 59.01653 ERR(%): 0.40898
5 7.50000 53.55410 54.18409 ERR(%): 0.41999
deg-C
### RESULTS (quadratic interpolation) 50  hP 
ID X FEM. ANALYTICAL cosh  x 
1 0.00000 150.00000 150.00000 ERR(%): 0.00000 T ( x)  TS  T0   A  T
0
 hP 
2 1.87500 102.98743 103.00165 ERR(%): 0.00948 cosh X max 
3 3.75000 74.40203 74.37583 ERR(%): 0.01747  A 
4 5.62500 59.02737 59.01653 ERR(%): 0.00722 0
5 7.50000 54.21426 54.18409 ERR(%): 0.02011 0.00 2.00 4.00 6.00 8.00
### RESULTS (linear interpolation) X

ID X FEM. ANALYTICAL
1 0.00000 150.00000 150.00000 ERR(%): 0.00000
2 0.93750 123.71561 123.77127 ERR(%): 0.03711 Quadratic interpolation provides
3 1.87500 102.90805 103.00165 ERR(%): 0.06240 more accurate solution,
4 2.81250 86.65618 86.77507 ERR(%): 0.07926
5 3.75000 74.24055 74.37583 ERR(%): 0.09019 especially if X is close to 7.50cm.
6 4.68750 65.11151 65.25705 ERR(%): 0.09703
7 5.62500 58.86492 59.01653 ERR(%): 0.10107
8 6.56250 55.22426 55.37903 ERR(%): 0.10317
9 7.50000 54.02836 54.18409 ERR(%): 0.10382

Fem1d C PDF

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Fem1d C PDF

Uploaded by

Copyright:

Available Formats

1D-FEM in C: Steady State Heat

• 1D-code for Static Linear-Elastic Problems by

1D Steady State Heat Conduction

x=0 (xmin) x= xmax

• Uniform: Sectional Area: A, Thermal Conductivity: 

1D Steady State Heat Conduction

x=0 (xmin) x= xmax

• Uniform: Sectional Area: A, Thermal Conductivity: 

x=0 (xmin) x= xmax

1D Linear Element (1/4)

1D Linear Element (1/4)

1D Linear Elem.: Shape Function (2/4)

1D Linear Elem.: Shape Function (3/4)

1D Linear Elem.: Shape Function (4/4)

Galerkin Method (1/4)

Galerkin Method (2/4)

• Consider the following terms: V

Galerkin Method (3/4)

• This is called “weak form（弱形式）”. Original PDE

Galerkin Method (4/4)

• These terms coincide at element boundaries and

Weak Form and Boundary Conditions

Weak Form with B.C.: on each elem.

(e) (e) (e)

Integration over Each Element: [k]

Integration over Each Element: {f} (1/2)

Integration over Each Element: {f} (2/2)

T 0 dT Surface Heat Flux

when surface heat flux acts on

k (e)  (e)   f (e) Element Matrix, Element Equations

K    F  Global Matrix, Global Equations

This is your “top” directory, and is called <$E-TOP> in this class.

1D Code for Steady-State Heat Conduction Problems

Control Data input.dat

4 iters, RESID= 4.154074e-17

Element Eqn’s/Accumulation (1/3)

• [k] and {f} of Element-1:

Element Eqn’s/Accumulation (2/3)

Element Eqn’s/Accumulation (3/3)

2nd –Order Differentiation in FDM

• Approximate Derivative at×(center of i and i+1)

x→0: Real Derivative

[k ]( 3) {}( 3)  { f }(3) [k ]( 4 ) {}( 4)  { f }( 4 )

• 1D-code for Static Linear-Elastic Problems by

Large-Scale Linear Equations in

What is Iterative Method ?

 a11 a12  a1n  x1   b1   x1( 0 ) 

Starting from a initial vector x(0), iterative method

Iterative Method (cont.)

zk is a vector which belongs to kth Krylov Subspace（クリ

Conjugate Gradient Method

Procedures of Conjugate Gradient

Procedures of Conjugate Gradient

Procedures of Conjugate Gradient

Procedures of Conjugate Gradient

Procedures of Conjugate Gradient

Derivation of CG Algorithm (1/5)

Therefore, the solution x minimizes the following f(x):

Derivation of CG Algorithm (2/5)

Minimization of f(x(k+1)) is done as follows:

Derivation of CG Algorithm (3/5)

Residual vector at (k+1)-th iteration: r ( k 1)  b  Ax ( k 1) , r ( k )  b  Ax ( k )

Search direction vector p is defined by the following