Outline
Uses For Simulation
Engineering Design
Virtual Environments
Model Verification
Course Philosophy
Example Problems
Power distribution on an Integrated Circuit
Load bearing on a space frame
Temperature distribution in a package
Circuit Analysis
Equations
Current-voltage relations for circuit elements (resistors, capacitors,
transistors, inductors), current balance equations
Recent Developments
Matrix-Implicit Krylov Subspace methods
Electromagnetic
Analysis of Packages
Equations
Maxwell's Partial Differential Equations
Recent Developments
Fast Solvers for Integral Formulations
Structural Analysis of
Automobiles
Equations
Force-displacement relationships for mechanical elements (plates,
beams, shells) and sum of forces = 0.
Partial Differential Equations of Continuum Mechanics
Recent Developments
Meshless Methods, Iterative methods, Automatic Error Control
Recent Developments
Multigrid Methods for Unstructured Grids
Engine Thermal
Analysis
Equations
The Poisson Partial Differential Equation.
Recent Developments
Fast Integral Equation Solvers, Monte-Carlo Methods
Micromachine Device
Performance Analysis
Equations
Elastomechanics, Electrostatics, Stokes Flow.
Recent Developments
Fast Integral Equation Solvers, Matrix-Implicit Multi-level Newton
Methods for coupled domain problems.
Stock Price
Equations
Black-Scholes Partial Differential Equation
Recent Developments
Financial Service Companies are hiring engineers, mathematicians and
physicists.
Virtual Environments
for Computer Games
Equations
Multibody Dynamics, elastic collision equations.
Recent Developments
Multirate integration methods, parallel simulation
Virtual Surgery
Equations
Partial Differential Equations of Elastomechanics
Recent Developments
Parallel Computing, Fast methods
Biomolecule Electrostatic
Optimization
Figure: a ligand (drug molecule) binding a receptor (protein molecule) near an ECM protein, with + and - partial charges marked.
Equations
The Poisson Partial Differential Equation.
Recent Developments
Matrix-Implicit Iterative Methods, Fast Integral Equation Solvers
Figure: a tongue-in-cheek course flowchart. CLASS leads to "Make sense?"; "No" leads to anxiety and DROP, "Yes" leads to "Works!". Developing an understanding of computational complexity leads to faster methods, and developing an understanding of convergence issues leads to robust methods; the right algorithms bring happiness, and new algorithms bring fame.
Course Philosophy
Examine Several Modern Techniques
Understand, practically and theoretically, how the
techniques perform on representative, but real,
applications
Why Prove Theorems?
A theorem guarantees, given its assumptions, that the method will always work.
Theorems can help debug programs.
A theorem's proof can tell you what to do in practice.
Figure: a processor layout (left) showing the Cache, ALU and Decoder blocks, and a simplified diagram (right) with a +3.3 V power supply and the main power wires.
One application problem which generates large systems of equations is the problem of distributing power to the various parts of a Very Large Scale Integrated (VLSI) circuit processor.
The picture on the left of the slide shows a layout for a typical processor, with different functional blocks noted. The processor pictured has nearly a million transistors, and millions of wires which transport signals and power. All one can really see by eye are the larger wires that carry power, and patterns of wires that carry signals for Boolean operations such as AND and OR.
A typical processor can be divided into a number of functional blocks, as diagrammed on the layout on the left. There are caches, which store copies of data and instructions from main memory for faster access. There are execution units, which perform Boolean and numerical operations on data, such as AND, OR, addition and multiplication. These execution units are often grouped together and referred to as an Arithmetic Logic Unit (ALU). Another main block of the processor is the instruction decoder, which translates instructions fetched from the cache into actions performed by the ALU.
On the right is a vastly simplified diagram of the processor, showing a typical 3.3-volt power supply, the three main functional blocks, and the wires (in red) carrying power from the supply to those blocks. The wires, which are part of the integrated circuit, are typically a micron thick, ten microns wide and thousands of microns long (a micron is a millionth of a meter). The resistance of these thin wires is significant, and therefore even though the supply is 3.3 volts, the voltage may not be 3.3 volts across each of the functional blocks.
The main problem we address is whether or not each functional block has sufficient voltage to operate properly.
Figure: a space frame holding cargo; labeled parts include the beams, the joints, the attachment to the ground, the cargo, the vehicle, and the droop at the loaded end.
In the diagram is a picture of a space frame used to hold cargo (in red) to be lowered into a vehicle. The space frame is made using steel beams (in yellow) that are bolted together at the purple joints. When cargo is hanging off the end of the space frame, the frame droops.
The main problem we will address is how much the space frame droops under load.
Thermal Analysis
Thermal Analysis
Select the shape so that
a) the temperature does not get too high, and
b) the amount of metal used is minimized.
1000s of small
companies
Market cap.:
Cadence 4,000
Synopsys/Avanti 5,000
Mentor Graphics 2,600
1.4 billion
Each of the elements in the simplified layout, the supply, the wires and the
functional blocks, can be modeled by relating the voltage across that element to the
current that passes through the element. Using these element constitutive relations,
we can construct a circuit from which we can determine the voltages across the
functional blocks and decide if the VLSI circuit will function.
Modeling the Circuit
The supply becomes a voltage source.
Figure: physical symbol (the power supply providing current) and circuit element symbol (a voltage source $V_s$).
Constitutive equation: $V = V_s$
The power supply provides whatever current is necessary to ensure that the voltage across the supply is maintained at a set value. Note that the constitutive equation (in the figure), which is supposed to relate element voltage (V) to element current (I), does not include current as a variable. This should not be surprising, since the voltage is always maintained regardless of how much current is supplied, and therefore knowing the voltage tells one nothing about the supplied current.
Modeling the Circuit
The functional blocks (e.g., the ALU) become current sources.
Figure: physical symbol (the block) and circuit element symbol (a current source $I_s$).
Constitutive equation: $I = I_s$
The functional blocks, the ALU, the cache and the decoder, are complicated circuits containing thousands of transistors. In order to determine whether a functional block will always have sufficient voltage to operate, a simple model must be developed that abstracts away many of the operating details. A simple worst-case model is to assume that each functional block is always drawing its maximum current. Each block is therefore modeled as a current source, where one assumes that the associated currents have been determined by analyzing each functional block in more detail. Note that once again the constitutive equation is missing a variable; this time it is voltage. Since a current source passes the same current independent of the voltage across the source, that V is missing should be expected.
Modeling the Circuit
The wires become resistors.
Figure: physical symbol (a wire section) and circuit model (a resistor carrying current I).
Constitutive equation (Ohm's law): $I R - V = 0$
$$ R = \text{resistivity} \times \frac{\text{Length}}{\text{Area}} $$
(Length and cross-sectional Area are design parameters; resistivity is a material property.)
The model for the wires connecting the supply to the functional blocks is a resistor, where the resistance is proportional to the length of the wire (the current has further to travel) and inversely proportional to the wire's cross-sectional area (the current has more paths to choose from): a long, thin wire has high resistance, a short, wide wire low resistance.
That the current through a resistor is proportional to the voltage across the resistor is Ohm's law.
Modeling VLSI Power Distribution
The power supply becomes a voltage source; the functional blocks (Cache, ALU, Decoder) become current sources ($I_C$, $I_{ALU}$, $I_D$); the wires become resistors. The result is a schematic.
To generate a representation which can be used to determine the voltages across each of the functional units, consider each of the models previously described.
First, replace the supply with a voltage source.
Second, replace each functional block with an associated current source.
Third, replace each section of wire with a resistor.
Note that each resistor representing a wire replaces a single section with no branches, though the section can have turns.
The resulting connection of resistors, current sources and voltage sources is called a circuit schematic. Formulating equations from schematics will be discussed later.
Figure: the simplified space frame, with its attachments to the ground and the load.
In order to examine the space frame, we will consider a simplified example with only four steel beams and a load. Recall that the purple dots represent the points where steel beams are bolted together. Each of the elements in the simplified layout, the beams and the load, can be modeled by relating the relative positions of the element's terminals to the force produced by the element. Using these element constitutive relations, we can construct a schematic from which we can determine the frame's droop.
Modeling the Frame
The load becomes a force source.
Figure: physical symbol (a hanging mass) and schematic symbol (a force source $F_{load}$).
Constitutive equations: $F_x = 0$, $F_y = -F_{load}$
The load is modeled as a force pulling in the negative Y direction (Y being vertical, X being horizontal).
Note that the constitutive equation does not include a variable for the load's position, following from the fact that the load's force is independent of position.
Modeling the Frame
The beam becomes a strut, with ends at $(x_1, y_1)$ and $(x_2, y_2)$.
$$ L = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} $$
Constitutive equation (Hooke's law):
$$ f = E A_c \, \frac{L - L_0}{L_0} $$
where $L_0$ is the unstretched length and $A_c$ the cross-sectional area (design parameters), and $E$ is Young's modulus (a material property).
Figure: applying an axial force stretches the beam to $L_1 > L_0$, and removing the force restores $L_0$; under compression a beam may buckle, and the figure contrasts the buckling and no-buckling cases.
Figure: with no applied force the strut has its rest length $L_0$ and $f = 0$; stretched to length $L_1$, it carries a force $f = K\,\Delta L$, proportional to the elongation.
To determine K, consider that the force required to stretch a beam by an amount $\Delta L$ is
(I) inversely proportional to its unstretched length (it is easier to stretch a 10-inch rubber band 1 inch than to stretch a 1-inch rubber band 1 inch),
(II) directly proportional to its cross-sectional area (imagine 10 rubber bands in parallel), and
(III) dependent on the material (rubber stretches more easily than steel).
Combining (I), (II) and (III) leads to the formula at the bottom of the slide.
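Below is a small numerical sketch of this constitutive relation. It is not from the course materials, and the endpoint coordinates and the steel-like values of E, Ac and L0 are made-up illustrations.

```python
# A minimal sketch of the strut constitutive relation f = E*Ac*(L - L0)/L0.
# All values (E, Ac, L0, coordinates) are made-up illustrations, in SI units.
import math

def strut_force(x1, y1, x2, y2, E=200e9, Ac=1e-4, L0=1.0):
    """Axial force in a strut whose ends sit at (x1, y1) and (x2, y2)."""
    L = math.hypot(x1 - x2, y1 - y2)   # current length
    return E * Ac * (L - L0) / L0      # positive when stretched

# A 1 m strut stretched by 1 mm carries about 20 kN.
print(strut_force(0.0, 0.0, 1.001, 0.0))
```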
Modeling the Frame
Figure: the complete frame schematic with the struts and the load.
Formulating Equations
from Schematics
Two Types of Unknowns
Circuit - Node voltages, element currents
Struts - Joint positions, strut forces
Two Types of Equations
Conservation Law Equation
Circuit - Sum of Currents at each node = 0
Struts - Sum of Forces at each joint = 0
Constitutive Equation
Circuit - element current is related to voltage
across the element
Struts - element force is related to the change
in element length
Heat Flow
1-D Example
Figure: a bar with incoming heat along its length; near-end temperature T(0), far-end temperature T(1), position coordinate x.
Heat Flow
Discrete Representation
Figure: the bar divided into sections with temperatures $T_1, T_2, \ldots, T_{N-1}, T_N$ between the end temperatures T(0) and T(1).
Heat Flow
Constitutive Relation
The heat flow between adjacent sections is driven by their temperature difference:
$$ h_{i+1,i} = \frac{T_{i+1} - T_i}{\Delta x} $$
In the limit as the sections become vanishingly small,
$$ \lim_{\Delta x \to 0} h(x) = \frac{\partial T(x)}{\partial x} $$
Heat Flow
Conservation Law
Net heat flow into each control volume = 0:
$$ h_{i+1,i} - h_{i,i-1} = h_s\,\Delta x $$
(heat in from the left, heat out from the right, and $h_s$ the incoming heat per unit length)
In the limit as the sections become vanishingly small,
$$ \lim_{\Delta x \to 0} h_s(x) = \frac{\partial h(x)}{\partial x} = \frac{\partial}{\partial x}\frac{\partial T(x)}{\partial x} $$
Heat Flow
Circuit Analogy
Each section becomes a resistor with $\frac{1}{R} = \frac{1}{\Delta x}$, and each section temperature $T_1, \ldots, T_N$ becomes a node voltage. The end temperatures become voltage sources, $v_s = T(0)$ and $v_s = T(1)$, and the incoming heat becomes current sources $i_s = h_s\,\Delta x$.
Formulating Equations
Two Types of Unknowns
Circuit - node voltages, element currents
Struts - joint positions, strut forces
Conducting bar - temperatures, section heat flows
Two Types of Equations
Conservation Law Equation
Circuit - sum of currents at each node = 0
Struts - sum of forces at each joint = 0
Bar - sum of heat flows into each control volume = 0
Constitutive Equations
Circuit - element current related to voltage
Struts - strut force related to length change
Bar - section temperature drop related to heat flow
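To make the discretized heat-flow equations concrete, here is a small sketch that assembles and solves the tridiagonal system for a bar with fixed end temperatures. It is not from the course materials; unit conductivity is assumed, and N, T0, T1 and hs are made-up values.

```python
# A minimal sketch of the 1-D heat conservation law -T'' = hs with fixed
# end temperatures; unit conductivity assumed, all values made up.
import numpy as np

N, T0, T1, hs = 10, 0.0, 1.0, 5.0            # interior nodes, end temps, source
dx = 1.0 / (N + 1)

# One conservation-law row per control volume: tridiagonal (-1, 2, -1).
A = 2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
b = hs * dx**2 * np.ones(N)
b[0] += T0                                    # end temperatures act as sources
b[-1] += T1

print(np.linalg.solve(A, b))                  # interior temperatures
```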
Figure: an example schematic with a voltage source $v_s$ and numbered nodes.
Given a circuit schematic, the problem is to determine the node voltages and element currents. In order to begin, one needs labels for the node voltages, and therefore the nodes are numbered zero, one, two, ..., N, where N+1 is the total number of nodes.
The node numbered zero has a special meaning: it is the reference node. Voltages are not absolute quantities, but must be measured against a reference.
To understand this point better, consider the simple example of a current source and a resistor.
Figure: a one-amp current source driving a one-ohm resistor between node 1 and node 0.
In order for one amp to flow through the resistor, V1 - V0 must equal one volt. But is V1 = 11 volts and V0 = 10 volts? Or is V1 = 101 volts and V0 = 100 volts? It really does not matter; what is important is that V1 is one volt higher than V0. So, let V0 define a reference and set its value to a convenient number: V0 = 0.
Figure: the example schematic with the element currents $i_1$ through $i_5$ labeled.
The second set of unknowns are the element currents. Obviously, the currents passing through current sources are already known, so one need only label the currents through resistors and voltage sources. The currents are denoted $i_1, i_2, \ldots$
Figure: KCL written at each node of the example schematic:
$$ i_1 + i_5 - i_4 = 0 $$
$$ i_{s2} + i_{s3} - i_2 - i_5 = 0 $$
$$ i_{s1} - i_1 + i_2 = 0 $$
$$ i_4 - i_{s1} - i_{s2} - i_3 = 0 $$
$$ i_3 - i_{s3} = 0 $$
The conservation law for a circuit is that the sum of currents at each node equals zero. This is often referred to as Kirchhoff's current law. Another way to state this law, which more clearly indicates its conservation nature, is to say:
Any current entering a node must leave the node.
The conservation is that no current is lost: what comes in goes out. This statement also makes it clear that the direction of the current determines its sign when summing the currents. Currents leaving the node are positive terms in the sum and currents entering the node are negative terms (one can reverse this convention, but one must be consistent).
Figure: the example schematic with the resistor branch relations
$$ R_1 i_1 = 0 - V_1, \qquad R_2 i_2 = V_1 - V_2, \qquad R_3 i_3 = V_3 - V_4, \qquad R_4 i_4 = V_4 - 0 $$
For $R_2$ in the figure, this is just Ohm's law $V = R\,I$ with $V = V_1 - V_2$ and $I = i_2$.
One should again take note of the direction of the current. If current travels from the left node through the resistor to the right node, then the left node voltage will be higher than the right node voltage by an amount $R\,I$.
Summary of key points
Outline
Formulating Equations from Schematics
Struts and Joints Example
Formulating Equations from Schematics
Struts Example: Identifying Unknowns
Figure: a strut schematic with unknown joint positions $(x_1, y_1)$ and $(x_2, y_2)$ and fixed (hinged) joints at (0, 0) and (1, 0).
Given a schematic for the struts, the problem is to determine the joint positions and the strut forces.
Recall that the joints in the struts problem correspond physically to the locations where steel beams are bolted together. The joints are also analogous to the nodes in the circuit, but there is an important difference: the joint position is a vector, because one needs two (X, Y) coordinates (three, (X, Y, Z), in three dimensions) to specify a joint position.
The joint positions are labeled $x_1, y_1, x_2, y_2, \ldots, x_J, y_J$, where J is the number of joints whose positions are unknown. As in circuits, in struts and joints there is also an issue of a position reference: the position of a joint is usually specified with respect to a reference joint.
Note also the hatched symbol in the figure. It denotes a fixed structure (like a concrete wall, for example). Joints on such a wall have their positions fixed, and usually one such joint is selected as the reference joint. The reference joint has the position (0, 0) ((0, 0, 0) in three dimensions).
Formulating Equations from Schematics
Struts Example: Identifying Unknowns (continued)
Figure: the schematic with the x and y components of the four strut forces and the load force $f_{load}$ labeled.
The second set of unknowns are the strut forces. Like the currents in the circuit examples, these forces can be considered branch quantities. There is again a complication due to the two-dimensional nature of the problem: each force has an x and a y component. The strut forces are labeled $f_x^1, f_y^1, \ldots, f_x^S, f_y^S$, where S is the number of struts.
Formulating Equations from Schematics
Struts Example: Aside on Strut Forces
Figure: a strut from (0, 0) to $(x_1, y_1)$ carrying force f along its axis, with components $f_x$ and $f_y$ in the global (X, Y) coordinate system.
$$ f = E A_c \, \frac{L_0 - L}{L_0} = \alpha \,(L_0 - L), \qquad L = \sqrt{x_1^2 + y_1^2} $$
$$ f_x = \frac{x_1}{L}\, f, \qquad f_y = \frac{y_1}{L}\, f $$
The force, f, in a stretched strut always acts along the direction of the strut, as shown in the figure. However, it will be necessary to sum the forces at a joint, and the individual struts connected to a joint will not all be in the same direction. So, to sum such forces, it is necessary to compute the components of the forces in the X and Y directions. Since one must select the directions of the X and Y axes once for a given problem, such axes are referred to as the global coordinate system. One can then think of the process of computing $f_x$, $f_y$ shown in the figure as mapping from a local to a global coordinate system.
The formulas for determining $f_x$ and $f_y$ from f follow easily from the geometry depicted in the figure; one is simply projecting the force vector onto the coordinate axes.
Formulating Equations from Schematics
Struts Example: Conservation Law
Figure: joints at $(x_1, y_1)$ and $(x_2, y_2)$, strut forces $f^1, \ldots, f^4$, fixed joints at (0, 0) and (1, 0), and the load $f_{load}$.
$$ f_x^1 + f_x^2 + f_x^3 = 0, \qquad f_y^1 + f_y^2 + f_y^3 = 0 $$
$$ f_y^4 - f_y^3 + f_{load,y} = 0 \quad \text{(and similarly for the x components)} $$
Force Equilibrium:
Sum of X-directed forces at a joint = 0
Sum of Y-directed forces at a joint = 0
The conservation law for struts is usually referred to as requiring force equilibrium. There are some subtleties about signs, however. To begin, consider that the X-directed forces at a joint must sum to zero, otherwise the joint will accelerate in the X direction. The Y-directed forces must also sum to zero, to avoid joint acceleration in the Y direction.
To see the subtlety about signs, consider a single strut aligned with the X axis, with ends at (x1, 0) and (x2, 0). If the strut is stretched, with ends at (x1, 0) and (x2 + d, 0), it will exert forces fa and fb in an attempt to contract. The forces fa and fb are equal in magnitude but opposite in sign, because fa points in the positive X direction and fb in the negative X direction.
If one examines the force equilibrium equation for the left-hand joint, then that equation will be of the form
other forces + fa = 0,
whereas the equilibrium equation for the right-hand joint will be
other forces + fb = other forces - fa = 0.
In setting up a system of equations for the strut, one need not include both fa and fb as separate variables in the system of equations. Instead, one can select either force and implicitly exploit the relationship between the forces on opposite sides of the strut.
As an example, consider that for strut 3 between joint 1 and joint 2 on the slide, we have selected to represent the force on the joint-1 side of the strut and labeled that force f3. Therefore, in the conservation law associated with joint 1, force f3 appears with a positive sign, but in the conservation law associated with joint 2 we need the opposite-side force, -f3. Although the physical mechanism seems quite different, this trick of representing the equations using only the force on one side of the strut as a variable makes an algebraic analogy with the circuit sum-of-currents law. That is, it appears as if a strut's force leaves one joint and enters another.
Struts Example: Conservation Law (forces written through the constitutive functions)
$$ f_x^1 = F_x(x_1 - 0,\, y_1 - 0), \qquad f_y^1 = F_y(x_1 - 0,\, y_1 - 0) $$
$$ f_x^2 = F_x(x_1 - 1,\, y_1 - 0), \qquad f_y^2 = F_y(x_1 - 1,\, y_1 - 0) $$
$$ f_x^3 = F_x(x_1 - x_2,\, y_1 - y_2), \qquad f_y^3 = F_y(x_1 - x_2,\, y_1 - y_2) $$
$$ f_x^4 = F_x(x_2 - 1,\, y_2 - 0), \qquad f_y^4 = F_y(x_2 - 1,\, y_2 - 0) $$
The X-axis alignment can be used to simplify the relation between the force on the $x_1$ side of the strut and $x_1$ and $x_2$:
$$ f_x = \frac{x_1 - x_2}{|x_1 - x_2|}\;\alpha\,\bigl( L_0 - |x_1 - x_2| \bigr) $$
Note that there are two ways to make $f_x$ negative, pointing in the negative x direction: either $x_1 - x_2 > 0$, which corresponds to flipping the strut, or $|x_2 - x_1| < L_0$, which corresponds to compressing the strut.
Formulating Equations from Schematics
Struts Example: Summary
Figure: two collinear struts $f_1$, $f_2$ with joints at $(x_1, y_1 = 0)$ and $(x_2, y_2 = 0)$ and load $f_L$.
Conservation Law:
At node 1: $f_{1x} + f_{2x} = 0$
At node 2: $-f_{2x} + f_L = 0$
Constitutive Equations:
$$ f_{1x} = \frac{x_1 - 0}{|x_1 - 0|}\,\bigl( L_0 - |x_1 - 0| \bigr), \qquad f_{2x} = \frac{x_1 - x_2}{|x_1 - x_2|}\,\bigl( L_0 - |x_1 - x_2| \bigr) $$
Substituting the constitutive equations into the conservation laws gives two nonlinear equations in $x_1$ and $x_2$:
$$ \frac{x_1}{|x_1|}\bigl( L_0 - |x_1| \bigr) + \frac{x_1 - x_2}{|x_1 - x_2|}\bigl( L_0 - |x_1 - x_2| \bigr) = 0 $$
$$ -\frac{x_1 - x_2}{|x_1 - x_2|}\bigl( L_0 - |x_1 - x_2| \bigr) + f_L = 0 $$
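The two equilibrium equations above are nonlinear in x1 and x2. As a sketch (not the course's solution method), one can hand them to a black-box nonlinear solver; alpha, L0 and fL below are made-up values, and the sign conventions follow the reconstruction above.

```python
# A minimal sketch of solving the two-strut equilibrium equations with a
# generic nonlinear solver; alpha, L0, fL and the initial guess are made up.
import numpy as np
from scipy.optimize import fsolve

alpha, L0, fL = 1.0, 1.0, 0.1

def residual(x):
    x1, x2 = x
    f1x = alpha * (L0 - abs(x1)) * np.sign(x1)
    f2x = alpha * (L0 - abs(x1 - x2)) * np.sign(x1 - x2)
    return [f1x + f2x,        # equilibrium at joint 1
            -f2x + fL]        # equilibrium at joint 2

print(fsolve(residual, [1.0, 2.0]))   # expect roughly x1 = 1.1, x2 = 2.2
```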
Constitutive Relations
Circuit - branch (element) current proportional to branch (element) voltage
Struts - branch (strut) force proportional to branch (strut) displacement
Circuit Example
One matrix column for each unknown:
N columns for the node voltages
B columns for the branch currents
One matrix row for each equation:
N rows for KCL
B rows for the element constitutive equations (linear!)
Circuit Example: Conservation Equation
Figure: the example schematic with node voltages $V_1$ through $V_4$, resistors $R_1$ through $R_5$, branch currents $i_1$ through $i_5$, and current sources $i_{s1}$, $i_{s2}$, $i_{s3}$.
To generate a matrix equation for the circuit, we begin by writing the KCL equation
at each node in terms of the branch currents and the source currents. In particular,
we write
$$ i_1 + i_2 = i_{s1} $$
$$ i_2 - i_5 = i_{s2} - i_{s3} $$
$$ i_3 = i_{s3} $$
$$ i_3 + i_4 = i_{s1} + i_{s2} $$
Circuit Example: Conservation Equation (matrix form)
One row for each KCL equation, with a right-hand side collecting the source currents:
$$ A \begin{bmatrix} i_1 \\ i_2 \\ i_3 \\ i_4 \\ i_5 \end{bmatrix} = \begin{bmatrix} i_{s1} \\ i_{s2} - i_{s3} \\ i_{s3} \\ i_{s1} + i_{s2} \end{bmatrix} $$
where the entries of A are 0 and $\pm 1$.
Circuit Example: Conservation Equation (a single resistor's contribution)
Figure: resistor $R_k$ carrying current $i_k$ from node n1 to node n2.
KCL at n1: $\sum i_{other} + i_k = i_s$
KCL at n2: $\sum i_{other} - i_k = i_s$
So branch k contributes a +1 (in row n1) and a -1 (in row n2) to the kth column of the matrix. What happens when one end of a resistor is connected to the reference (the zero node)? In that case, there is only one contribution to the kth column of the matrix.
Circuit Example: Conservation Equation (a current source's contribution)
Figure: current source $i_{sb}$ from node n1 to node n2.
A current source contributes only to the right-hand side:
KCL at n1: $\sum i_b\text{'s} = \sum i_{s,other} + i_{sb}$
KCL at n2: $\sum i_b\text{'s} = \sum i_{s,other} - i_{sb}$
Circuit Example: Conservation Equation (assembled)
Figure: the full schematic; assembling all of the branch contributions gives
$$ A \begin{bmatrix} i_1 \\ i_2 \\ i_3 \\ i_4 \\ i_5 \end{bmatrix} = \begin{bmatrix} i_{s1} \\ i_{s2} - i_{s3} \\ i_{s3} \\ i_{s1} + i_{s2} \end{bmatrix} $$
Circuit Example: Constitutive Equation
$$ i_1 = \frac{1}{R_1} V_{b1} = \frac{1}{R_1}(0 - V_1), \qquad i_2 = \frac{1}{R_2} V_{b2} = \frac{1}{R_2}(V_1 - V_2) $$
$$ i_3 = \frac{1}{R_3} V_{b3} = \frac{1}{R_3}(V_3 - V_4), \qquad i_4 = \frac{1}{R_4} V_{b4} = \frac{1}{R_4}(V_4 - 0), \qquad i_5 = \frac{1}{R_5} V_{b5} = \frac{1}{R_5}(0 - V_2) $$
The current through a resistor is related to the voltage across the resistor, which in turn is related to the node voltages. Consider a resistor $R_1$ carrying current $i_1$ between nodes at voltages $V_1$ and $V_2$: the voltage across the resistor is $V_1 - V_2$, and the current through it is
$$ i_1 = \frac{1}{R_1}(V_1 - V_2) $$
Circuit Example: Constitutive Equation (examine the matrix construction)
Collecting the branch-voltage relations above into a matrix:
$$ \begin{bmatrix} V_{b1} \\ V_{b2} \\ V_{b3} \\ V_{b4} \\ V_{b5} \end{bmatrix} = \begin{bmatrix} -1 & 0 & 0 & 0 \\ 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 1 \\ 0 & -1 & 0 & 0 \end{bmatrix} \begin{bmatrix} V_1 \\ V_2 \\ V_3 \\ V_4 \end{bmatrix} $$
To generate a matrix equation that relates the node voltages to the branch voltages,
one notes that the voltage across a branch is just the difference between the node
voltages at the ends of the branch. The sign is determined by the direction of the
current, which points from the positive node to the negative node.
Since there are B branch voltages and N node voltages, the matrix relating the two
has B rows and N columns.
Circuit Example: Constitutive Equation and the Incidence Matrix
The KCL equations use the incidence matrix A,
$$ A \begin{bmatrix} i_1 \\ \vdots \\ i_5 \end{bmatrix} = I_s $$
while the node-to-branch voltage relation uses its transpose:
$$ \begin{bmatrix} V_{b1} \\ \vdots \\ V_{b5} \end{bmatrix} = A^T \begin{bmatrix} V_1 \\ \vdots \\ V_4 \end{bmatrix} $$
A relation exists between the matrix associated with the conservation law (KCL) and the matrix associated with the node-to-branch relation. To see this, examine a single resistor $R_k$ carrying branch current k between nodes l and m.
For the conservation law, branch k contributes two non-zeros (a +1 in row l and a -1 in row m) to the kth column of A.
The voltage across branch k is $V_l - V_m$, so the kth branch contributes the same two non-zeros, transposed, to the kth row of the node-to-branch relation.
It is easy to see that each branch element contributes a column to the incidence matrix A, and contributes the transpose of that column, a row, to the node-to-branch relation.
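As a quick numerical check of this A / A^T relationship, here is a sketch using the incidence matrix reconstructed above; it is not from the course materials, and the resistance and voltage values are made up.

```python
# A minimal sketch of the node-branch relations: branch voltages come from
# A^T, KCL uses A. Resistances and node voltages are made-up values.
import numpy as np

AT = np.array([[-1,  0,  0,  0],      # Vb1 = 0 - V1
               [ 1, -1,  0,  0],      # Vb2 = V1 - V2
               [ 0,  0,  1, -1],      # Vb3 = V3 - V4
               [ 0,  0,  0,  1],      # Vb4 = V4 - 0
               [ 0, -1,  0,  0]],     # Vb5 = 0 - V2
              dtype=float)
A = AT.T

R = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Vn = np.array([1.0, 0.5, 0.8, 0.2])

Vb = AT @ Vn            # node-to-branch relation
Ib = Vb / R             # constitutive relation
print(A @ Ib)           # net current into each node
```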
Circuit Example: Constitutive Equation (diagonal form)
$$ \begin{bmatrix} i_1 \\ i_2 \\ i_3 \\ i_4 \\ i_5 \end{bmatrix} = \begin{bmatrix} \frac{1}{R_1} & & & & \\ & \frac{1}{R_2} & & & \\ & & \frac{1}{R_3} & & \\ & & & \frac{1}{R_4} & \\ & & & & \frac{1}{R_5} \end{bmatrix} \begin{bmatrix} V_{b1} \\ V_{b2} \\ V_{b3} \\ V_{b4} \\ V_{b5} \end{bmatrix} $$
Each resistor $R_k$ contributes $\frac{1}{R_k}$ to entry (k, k).
Circuit Example: Constitutive Equation (in terms of node voltages)
Writing the branch voltages as $A^T V_N$ and letting $\alpha$ denote the diagonal matrix of conductances,
$$ \begin{bmatrix} i_1 \\ \vdots \\ i_5 \end{bmatrix} - \alpha\, A^T \begin{bmatrix} V_1 \\ \vdots \\ V_4 \end{bmatrix} = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix} $$
Circuit Example: Node-Branch Form
$$ \begin{bmatrix} I & -\alpha A^T \\ A & 0 \end{bmatrix} \begin{bmatrix} I_b \\ V_N \end{bmatrix} = \begin{bmatrix} 0 \\ I_s \end{bmatrix} $$
$$ I_b - \alpha A^T V_N = 0 \quad \text{(constitutive relation)}, \qquad A\, I_b = I_s \quad \text{(conservation law)} $$
Struts Example: In 2-D
One pair of columns for each unknown:
- J pairs of columns for the joint positions
- S pairs of columns for the strut forces
One pair of matrix rows for each equation:
- J pairs of rows for the force equilibrium equations
- S pairs of rows for the linearized constitutive relations
Struts Example: Conservation Equation
Figure: the strut schematic with joint $(x_2, y_2)$, fixed joints at (0, 0) and (1, 0), strut forces $f_1, \ldots, f_4$ and load $f_l$.
$$ f_{1x} + f_{2x} + f_{3x} = 0, \qquad f_{1y} + f_{2y} + f_{3y} = 0 $$
$$ -f_{3x} + f_{4x} = f_{lx}, \qquad -f_{3y} + f_{4y} = f_{ly} $$
Struts Example: Conservation Equation (stamping approach)
Figure: the same schematic. One pair of rows per joint ($x_1, y_1, x_2, y_2$) and one pair of columns per strut force builds the incidence matrix A, with $\pm 1$ entries:
$$ A \begin{bmatrix} f_{1x} \\ f_{1y} \\ f_{2x} \\ f_{2y} \\ f_{3x} \\ f_{3y} \\ f_{4x} \\ f_{4y} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ f_{lx} \\ f_{ly} \end{bmatrix} = F_L $$
Note that the incidence matrix, A, for the strut problem is very similar to the incidence matrix for the circuit problem, except that the two-dimensional forces and positions generate 2x2 blocks in the incidence matrix. Consider a single strut with force $f_s$ between joints j1 and j2. The force equilibrium equations for the two joints at the ends of the strut are:
At joint j1: $\sum f_{x,other} + f_{sx} = f_{lx}^{j1}$ and $\sum f_{y,other} + f_{sy} = f_{ly}^{j1}$
At joint j2: $\sum f_{x,other} - f_{sx} = f_{lx}^{j2}$ and $\sum f_{y,other} - f_{sy} = f_{ly}^{j2}$
Note that the matrix entries are 2x2 blocks. Therefore, the individual entries in the block for strut s's contribution to joint j1's conservation equation need specific indices: we use j1x, j1y to indicate the two rows and sx, sy to indicate the two columns.
Struts Example: Constitutive Equation
Figure: a strut from (0, 0) to $(x_1, y_1)$ with displacements $u_{x1} = x_1 - x_0$ and $u_{y1} = y_1 - y_0$.
$$ f_x = F_x(x, y) = \frac{x}{L}\,(L_0 - L), \qquad f_y = F_y(x, y) = \frac{y}{L}\,(L_0 - L), \qquad L = \sqrt{x^2 + y^2} $$
If x and y are perturbed a small amount from some $x_0, y_0$ such that $x_0^2 + y_0^2 = L_0^2$, then since $F_x(x_0, y_0) = 0$,
$$ f_x \approx \frac{\partial F_x}{\partial x}(x_0, y_0)\,(x_1 - x_0) + \frac{\partial F_x}{\partial y}(x_0, y_0)\,(y_1 - y_0) $$
and a similar expression holds for $f_y$.
One should note that rotating the strut, even without stretching it, will violate the small-perturbation conditions. The Taylor series will then not give good approximate forces, because they will point in an incorrect direction.
Struts Example: Constitutive Equation (matrix form)
Figure: the schematic with joint displacements $(u_{x1}, u_{y1})$ and $(u_{x2}, u_{y2})$.
$$ \begin{bmatrix} f_{1x} \\ f_{1y} \\ \vdots \\ f_{4x} \\ f_{4y} \end{bmatrix} - \begin{bmatrix} \alpha_{11} & & & \\ & \alpha_{22} & & \\ & & \alpha_{33} & \\ & & & \alpha_{44} \end{bmatrix} A^T \begin{bmatrix} u_{x1} \\ u_{y1} \\ u_{x2} \\ u_{y2} \end{bmatrix} = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix} $$
where each $\alpha_{ss}$ is a 2x2 block.
Struts Example: Constitutive Equation, the (s, s) block
For strut s between joints with initial positions $(x_1^0, y_1^0)$ and $(x_2^0, y_2^0)$:
$$ \alpha(s, s) = \begin{bmatrix} \dfrac{\partial F_x}{\partial x} & \dfrac{\partial F_x}{\partial y} \\[4pt] \dfrac{\partial F_y}{\partial x} & \dfrac{\partial F_y}{\partial y} \end{bmatrix} \quad \text{evaluated at } (x_2^0 - x_1^0,\; y_2^0 - y_1^0) $$
Struts Example: Node-Branch Form
$$ \begin{bmatrix} I & -\alpha A^T \\ A & 0 \end{bmatrix} \begin{bmatrix} f_s \\ u \end{bmatrix} = \begin{bmatrix} 0 \\ f_L \end{bmatrix} $$
(2S + 2J equations, where S = number of struts and J = number of unfixed joints)
$$ f_s - \alpha A^T u = 0 \quad \text{(constitutive equation)}, \qquad A f_s = f_L \quad \text{(conservation law)} $$
Struts Example: Comparison with the circuit
$$ \begin{bmatrix} I & -\alpha A^T \\ A & 0 \end{bmatrix} \begin{bmatrix} f_s \\ u \end{bmatrix} = \begin{bmatrix} 0 \\ f_L \end{bmatrix} \qquad \text{vs.} \qquad \begin{bmatrix} I & -\alpha A^T \\ A & 0 \end{bmatrix} \begin{bmatrix} I_b \\ V_N \end{bmatrix} = \begin{bmatrix} 0 \\ I_s \end{bmatrix} $$
Generating Matrices: Nodal Formulation (Circuit Example)
Writing KCL at each node directly in terms of the node voltages:
$$ \frac{1}{R_1} V_1 + \frac{1}{R_2}(V_1 - V_2) = i_{s1} $$
$$ \frac{1}{R_2}(V_2 - V_1) + \frac{1}{R_5} V_2 = i_{s2} - i_{s3} $$
$$ \frac{1}{R_3}(V_3 - V_4) = i_{s3} $$
$$ \frac{1}{R_4} V_4 + \frac{1}{R_3}(V_4 - V_3) = i_{s1} + i_{s2} $$
Generating Matrices: Nodal Formulation (Circuit Example, matrix form)
The resulting nodal system is $G\,v = I_s$:
$$ \begin{bmatrix} \frac{1}{R_1}+\frac{1}{R_2} & -\frac{1}{R_2} & 0 & 0 \\ -\frac{1}{R_2} & \frac{1}{R_2}+\frac{1}{R_5} & 0 & 0 \\ 0 & 0 & \frac{1}{R_3} & -\frac{1}{R_3} \\ 0 & 0 & -\frac{1}{R_3} & \frac{1}{R_3}+\frac{1}{R_4} \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \end{bmatrix} = \begin{bmatrix} i_{s1} \\ i_{s2} - i_{s3} \\ i_{s3} \\ i_{s1} + i_{s2} \end{bmatrix} $$
Examining the nodal equations, one sees that a resistor contributes a current to two equations, and its current depends on two voltages. For a resistor $R_k$ between nodes n1 and n2:
KCL at node n1: $\sum i_{others} + \frac{1}{R_k}(V_{n1} - V_{n2}) = i_s$
KCL at node n2: $\sum i_{others} - \frac{1}{R_k}(V_{n1} - V_{n2}) = i_s$
So the matrix entries associated with $R_k$ are $+\frac{1}{R_k}$ at (n1, n1) and (n2, n2), and $-\frac{1}{R_k}$ at (n1, n2) and (n2, n1).
Generating Matrices: Nodal Formulation (stamping rule)
For each resistor R between nodes n1 and n2:
G(n1, n1) = G(n1, n1) + 1/R
G(n2, n2) = G(n2, n2) + 1/R
G(n1, n2) = G(n1, n2) - 1/R
G(n2, n1) = G(n2, n1) - 1/R
(If one end is the reference node, only the surviving diagonal entry is stamped.)
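The stamping rule above translates directly into code. The sketch below is an illustration, not the course's software; the node numbering (0-based, with -1 for the reference node) and all element values are assumptions.

```python
# A minimal sketch of nodal-matrix stamping; node numbers are 0-based and
# -1 denotes the reference (ground) node. All element values are made up.
import numpy as np

N = 4
G = np.zeros((N, N))
Is = np.zeros(N)

def stamp_resistor(G, n1, n2, R):
    g = 1.0 / R
    if n1 >= 0: G[n1, n1] += g
    if n2 >= 0: G[n2, n2] += g
    if n1 >= 0 and n2 >= 0:
        G[n1, n2] -= g
        G[n2, n1] -= g

stamp_resistor(G, 0, -1, 1.0)    # R1: node 1 to ground
stamp_resistor(G, 0, 1, 2.0)     # R2: node 1 to node 2
stamp_resistor(G, 2, 3, 3.0)     # R3: node 3 to node 4
stamp_resistor(G, 3, -1, 4.0)    # R4: node 4 to ground
stamp_resistor(G, 1, -1, 5.0)    # R5: node 2 to ground
Is[0], Is[2] = 1.0, 0.5          # current sources into nodes 1 and 3

print(np.linalg.solve(G, Is))    # node voltages
```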
Generating Matrices: Nodal Formulation
Circuit (resistor networks): $G\, V_n = I_s$, an N x N system.
Struts: $G\, u_j = F_L$, a 2J x 2J system.
Nodal Formulation vs. the Node-Branch Matrix
$$ \begin{bmatrix} I & -\alpha A^T \\ A & 0 \end{bmatrix} \begin{bmatrix} I_b \\ V_N \end{bmatrix} = \begin{bmatrix} 0 \\ I_s \end{bmatrix} \quad \text{(node-branch)} \qquad\qquad [\,G\,][\,V_N\,] = [\,I_s\,] \quad \text{(nodal)} $$
Nodal Formulation: G-matrix properties
Diagonally dominant: $|G_{ii}| \ge \sum_{j \ne i} |G_{ij}|$
Symmetric: $G_{ij} = G_{ji}$
Smaller: $N \times N \ll (N+B) \times (N+B)$ and $2J \times 2J \ll (2J+2S) \times (2J+2S)$
Nodal Formulation from the Node-Branch form
$$ \underbrace{\begin{bmatrix} I & -\alpha A^T \\ A & 0 \end{bmatrix}}_{M} \underbrace{\begin{bmatrix} I_b \\ V_n \end{bmatrix}}_{x} = \underbrace{\begin{bmatrix} 0 \\ I_s \end{bmatrix}}_{b} $$
Nodal Formulation
$$ I_b - \alpha A^T V_N = 0 $$
$$ A\,( I_b - \alpha A^T V_N ) = A \cdot 0 $$
$$ A I_b = I_s \quad\Rightarrow\quad \underbrace{A\,\alpha\,A^T}_{G}\, V_N = I_s $$
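Here is a quick numerical confirmation that $A\,\alpha\,A^T$ produces a symmetric nodal matrix; the incidence matrix is the one reconstructed earlier, and the conductances are made-up values.

```python
# A minimal sketch checking G = A * alpha * A^T; values are made up.
import numpy as np

AT = np.array([[-1, 0, 0, 0], [1, -1, 0, 0], [0, 0, 1, -1],
               [0, 0, 0, 1], [0, -1, 0, 0]], dtype=float)
A = AT.T
alpha = np.diag(1.0 / np.array([1.0, 2.0, 3.0, 4.0, 5.0]))  # conductances

G = A @ alpha @ AT
print(np.allclose(G, G.T))                       # symmetric, as claimed
print(np.linalg.solve(G, [1.0, 0.0, 0.5, 0.0]))  # V_N for some I_s
```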
Problem Element for the Nodal Formulation: the Voltage Source
Figure: a voltage source $V_s$ carrying current $i_s$ between nodes n1 and n2.
Constitutive equation: $0 \cdot i_s + V_{n1} - V_{n2} = V_s$ (the element current does not appear).
Problem Element: Voltage Source (node-branch form)
Figure: the example circuit with a voltage source $V_s$ added as branch 6.
The resistor rows of the constitutive block keep the pattern $i_k - \frac{1}{R_k}(A^T V)_k = 0$, but the voltage-source row reads $0 \cdot i_6 + (A^T V)_6 = V_s$: its current coefficient is zero.
Problem Element: Voltage Source (why nodal elimination fails)
The constitutive equation now reads
$$ \begin{bmatrix} I_{bR} \\ 0 \end{bmatrix} - \alpha A^T V_N = 0 $$
in which the voltage-source currents are missing (only the resistor currents $I_{bR}$ appear). Multiplying by A therefore no longer produces $A I_b$, which the conservation law $A I_b = I_s$ requires: the voltage-source currents cannot be eliminated and must be kept as unknowns.
Problem Element for the Nodal Formulation: the Rigid Rod
Figure: a rigid rod of fixed length L between $(x_1, y_1)$ and $(x_2, y_2)$.
Like the voltage source, the rigid rod's constitutive equation (for example, $0 \cdot f + (y_1 - y_2) = L$ for a vertical rod) contains no force variable: the rod fixes a position difference regardless of force, so the rod force cannot be eliminated.
Nodal Formulation, Example Problem: a Resistor Grid
Figure: a resistor grid with node voltages $V_1, \ldots, V_{100}$ along the first row, $V_{101}, \ldots, V_{200}$ along the second, and so on up to $V_{901}, \ldots, V_{1000}$.
Nodal Formulation, Example Problem: comparing the Node-Branch and Nodal formulations on the grid.
Node-branch:
General constitutive equations
Larger, sparser system
No diagonal dominance
Nodal:
Conserved quantity must be a function of node variables
Smaller, denser system
Diagonally dominant and symmetric
Outline
Solution Existence and Uniqueness
Gaussian Elimination Basics
LU factorization
Pivoting and Growth
Hard to solve problems
Conditioning
Application Problems
Systems of Linear Equations
$$ \underbrace{G}_{M}\, V_n = \underbrace{I_s}_{b} \quad\Rightarrow\quad M x = b $$
$$ \begin{bmatrix} M_1 & M_2 & \cdots & M_N \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_N \end{bmatrix} \qquad\Longleftrightarrow\qquad x_1 M_1 + x_2 M_2 + \cdots + x_N M_N = b $$
(the $M_i$ are the columns of M)
Systems of Linear Equations: Key Questions
Given $M x = b$: Is there a solution? Is the solution unique?
Is there a solution? A solution exists when there are weights $x_1, \ldots, x_N$ such that
$$ x_1 M_1 + x_2 M_2 + \cdots + x_N M_N = b $$
Key questions, continued: if there are weights $y_1, \ldots, y_N$, not all zero, with
$$ y_1 M_1 + y_2 M_2 + \cdots + y_N M_N = 0 $$
then if $M x = b$, also $M(x + y) = b$, so the solution is not unique.
Systems of Linear Equations: Key Questions for Square Matrices (important properties)
Gaussian Elimination Basics
Gaussian Elimination Basics: Reminder by Example (3 x 3)
$$ \begin{bmatrix} M_{11} & M_{12} & M_{13} \\ M_{21} & M_{22} & M_{23} \\ M_{31} & M_{32} & M_{33} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} $$
$$ M_{11} x_1 + M_{12} x_2 + M_{13} x_3 = b_1 $$
$$ M_{21} x_1 + M_{22} x_2 + M_{23} x_3 = b_2 $$
$$ M_{31} x_1 + M_{32} x_2 + M_{33} x_3 = b_3 $$
Gaussian Elimination Basics: Reminder by Example, Key Idea
Use equation 1 to remove $x_1$ from equations 2 and 3: subtract $\frac{M_{21}}{M_{11}}$ times equation 1 from equation 2, and $\frac{M_{31}}{M_{11}}$ times equation 1 from equation 3:
$$ \left( M_{22} - \frac{M_{21}}{M_{11}} M_{12} \right) x_2 + \left( M_{23} - \frac{M_{21}}{M_{11}} M_{13} \right) x_3 = b_2 - \frac{M_{21}}{M_{11}} b_1 $$
$$ \left( M_{32} - \frac{M_{31}}{M_{11}} M_{12} \right) x_2 + \left( M_{33} - \frac{M_{31}}{M_{11}} M_{13} \right) x_3 = b_3 - \frac{M_{31}}{M_{11}} b_1 $$
Gaussian Elimination Basics: Reminder by Example, Key Idea in the Matrix
With pivot $M_{11}$ and multipliers $\frac{M_{21}}{M_{11}}$ and $\frac{M_{31}}{M_{11}}$:
$$ \begin{bmatrix} M_{11} & M_{12} & M_{13} \\ 0 & M_{22} - \frac{M_{21}}{M_{11}} M_{12} & M_{23} - \frac{M_{21}}{M_{11}} M_{13} \\ 0 & M_{32} - \frac{M_{31}}{M_{11}} M_{12} & M_{33} - \frac{M_{31}}{M_{11}} M_{13} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 - \frac{M_{21}}{M_{11}} b_1 \\ b_3 - \frac{M_{31}}{M_{11}} b_1 \end{bmatrix} $$
Gaussian Elimination Basics: Reminder by Example, Remove $x_2$ from Equation 3
The updated entry $\tilde M_{22} = M_{22} - \frac{M_{21}}{M_{11}} M_{12}$ becomes the next pivot, and $\frac{\tilde M_{32}}{\tilde M_{22}}$ the next multiplier; subtracting the scaled second row from the third zeroes the (3, 2) entry and updates $\tilde M_{33}$ and $\tilde b_3$.
Gaussian Elimination Basics: Reminder by Example, Simplify the Notation
Write the entries updated during the first step as $\tilde M_{ij}$ and the updated right-hand sides as $\tilde b_i$; after that step, only the 2x2 block $\tilde M_{22}, \tilde M_{23}, \tilde M_{32}, \tilde M_{33}$ and $\tilde b_2, \tilde b_3$ remain to be processed.
Gaussian Elimination Basics: Reminder by Example, Remove $x_2$ from Equation 3
$$ \begin{bmatrix} M_{11} & M_{12} & M_{13} \\ 0 & \tilde M_{22} & \tilde M_{23} \\ 0 & 0 & \tilde M_{33} - \frac{\tilde M_{32}}{\tilde M_{22}} \tilde M_{23} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} b_1 \\ \tilde b_2 \\ \tilde b_3 - \frac{\tilde M_{32}}{\tilde M_{22}} \tilde b_2 \end{bmatrix} $$
(multiplier: $\frac{\tilde M_{32}}{\tilde M_{22}}$)
Gaussian Elimination Basics: Reminder by Example, GE Yields a Triangular System
$$ \begin{bmatrix} U_{11} & U_{12} & U_{13} \\ 0 & U_{22} & U_{23} \\ 0 & 0 & U_{33} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} $$
(U and y are the matrix and right-hand side as altered during GE.) Back substitution:
$$ x_3 = \frac{y_3}{U_{33}}, \qquad x_2 = \frac{y_2 - U_{23} x_3}{U_{22}}, \qquad x_1 = \frac{y_1 - U_{12} x_2 - U_{13} x_3}{U_{11}} $$
Gaussian Elimination Basics: Reminder by Example, the Right-Hand-Side Updates
$$ y_1 = b_1, \qquad y_2 = b_2 - \frac{M_{21}}{M_{11}} b_1, \qquad y_3 = b_3 - \frac{M_{31}}{M_{11}} b_1 - \frac{\tilde M_{32}}{\tilde M_{22}} y_2 $$
Equivalently, y satisfies a unit lower-triangular system built from the multipliers:
$$ \begin{bmatrix} 1 & 0 & 0 \\ \frac{M_{21}}{M_{11}} & 1 & 0 \\ \frac{M_{31}}{M_{11}} & \frac{\tilde M_{32}}{\tilde M_{22}} & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} $$
Gaussian Elimination Basics: Reminder by Example, Fitting the Pieces Together
The multipliers form the unit lower-triangular factor L, the reduced entries form the upper-triangular factor U, and $M = L\,U$.
Basics of LU Factorization
Solve M x = b in three steps:
Step 1: Factor M = L U.
Step 2: Forward elimination: solve L y = b.
Step 3: Backward substitution: solve U x = y.
Recall from basic linear algebra that a matrix can be factored into the product of a lower and an upper triangular matrix using Gaussian elimination. The basic idea of Gaussian elimination is to use equation one to eliminate x1 from all but the first equation; then equation two is used to eliminate x2 from all but the second equation. This procedure continues, reducing the system to upper triangular form while also modifying the right-hand side.
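The three steps translate into a short program. This sketch (not the course's code) factors without pivoting, so it assumes a matrix, such as a diagonally dominant one, for which no pivot vanishes; the test matrix is made up.

```python
# A minimal sketch of LU factorization (no pivoting) plus forward elimination
# and back substitution; safe only when no pivot vanishes. Values made up.
import numpy as np

def lu_factor(M):
    A = M.copy()
    n = len(A)
    for i in range(n - 1):
        A[i+1:, i] /= A[i, i]                          # store multipliers in L
        A[i+1:, i+1:] -= np.outer(A[i+1:, i], A[i, i+1:])
    return A

def lu_solve(A, b):
    n = len(b)
    y = b.copy()
    for i in range(n):                                 # forward: L y = b
        y[i] -= A[i, :i] @ y[:i]
    x = y.copy()
    for i in reversed(range(n)):                       # backward: U x = y
        x[i] = (x[i] - A[i, i+1:] @ x[i+1:]) / A[i, i]
    return x

M = np.array([[4.0, 2.0, 1.0], [2.0, 5.0, 2.0], [1.0, 2.0, 6.0]])
b = np.array([1.0, 2.0, 3.0])
print(lu_solve(lu_factor(M), b))
print(np.linalg.solve(M, b))                           # should match
```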
Basics of LU Factorization: Solving Triangular Systems
$$ \begin{bmatrix} l_{11} & 0 & \cdots & 0 \\ l_{21} & l_{22} & & \vdots \\ \vdots & & \ddots & 0 \\ l_{N1} & \cdots & & l_{NN} \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_N \end{bmatrix} $$
Basics of LU Factorization: Solving Triangular Systems, Algorithm
$$ y_1 = \frac{b_1}{l_{11}}, \qquad y_2 = \frac{1}{l_{22}}\,(b_2 - l_{21} y_1), \qquad y_3 = \frac{1}{l_{33}}\,(b_3 - l_{31} y_1 - l_{32} y_2), \qquad \ldots $$
$$ y_N = \frac{1}{l_{NN}}\left( b_N - \sum_{i=1}^{N-1} l_{Ni}\, y_i \right) $$
Cost: 0 multiplies for $y_1$, 1 multiply for $y_2$, ..., N-1 multiplies for $y_N$: about $\frac{(N-1)(N-2)}{2}$ add/subtracts, $\frac{(N-1)(N-2)}{2}$ multiplies, and N divides.
Order $N^2$ operations.
Basics of LU Factorization: Factoring (picture)
Figure: a 4 x 4 matrix during factorization. The multipliers $\frac{M_{21}}{M_{11}}, \frac{M_{31}}{M_{11}}, \frac{M_{41}}{M_{11}}$, then $\frac{M_{32}}{M_{22}}, \frac{M_{42}}{M_{22}}$, then $\frac{M_{43}}{M_{33}}$ overwrite the entries they zero, while the remaining entries are updated in place.
The above is an animation of LU factorization. In the first step, the first equation is used to eliminate x1 from the 2nd through 4th equations. This involves multiplying row 1 by a multiplier and then subtracting the scaled row 1 from each of the target rows. Since such an operation would zero out the a21, a31 and a41 entries, we can replace those zeroed entries with the scaling factors, also called the multipliers. For row 2, the scale factor is a21/a11, because if one multiplies row 1 by a21/a11 and then subtracts the result from row 2, the resulting a21 entry would be zero. Entries a22, a23 and a24 would also be modified during the subtraction, and this is noted by changing the color of these matrix entries to blue. As row 1 is used to zero a31 and a41, a31 and a41 are replaced by multipliers. The remaining entries in rows 3 and 4 will be modified during this process, so they are recolored blue.
This factorization process continues with row 2. Multipliers are generated so that row 2 can be used to eliminate x2 from rows 3 and 4, and these multipliers are stored in the zeroed locations. Note that as entries in rows 3 and 4 are modified during this process, they are converted to green. The final step is to use row 3 to eliminate x3 from row 4, modifying row 4's entry, which is denoted by converting a44 to pink.
It is interesting to note that since the multipliers are standing in for zeroed matrix entries, they are not modified during the factorization.
Factoring Algorithm (LU Basics)
For i = 1 to n-1 {
    For j = i+1 to n {
        M(j,i) = M(j,i) / M(i,i)        (M(i,i) is the pivot; M(j,i) becomes the multiplier)
        For k = i+1 to n {
            M(j,k) = M(j,k) - M(j,i) * M(i,k)
        }
    }
}
Factoring (LU Basics): Zero Pivots
At step i, the already-factored portion and the multipliers (L) occupy the first i-1 rows and columns.
What if $M_{ii} = 0$? Then the multiplier $\frac{M_{ji}}{M_{ii}}$ cannot be formed.
Simple fix (partial pivoting): if $M_{ii} = 0$, find $M_{ji} \ne 0$ with $j > i$ and swap row j with row i (the right-hand-side entries $b_i$ and $b_j$ swap as well; the unknowns keep their order).
Factoring (LU Basics): Zero Pivots and Diagonal Dominance
If M is strictly diagonally dominant, then $|a_{11}| > \sum_{j=2}^{n} |a_{1j}|$, so $a_{11} \ne 0$. After the first step, the second row becomes
$$ 0,\quad a_{22} - \frac{a_{21}}{a_{11}} a_{12},\quad a_{23} - \frac{a_{21}}{a_{11}} a_{13},\quad \ldots,\quad a_{2n} - \frac{a_{21}}{a_{11}} a_{1n} $$
Is the reduced (n-1) x (n-1) matrix still strictly diagonally dominant? That is, is
$$ \left| a_{22} - \frac{a_{21}}{a_{11}} a_{12} \right| > \sum_{j > 2} \left| a_{2j} - \frac{a_{21}}{a_{11}} a_{1j} \right| \, ? $$
(It is, so a strictly diagonally dominant matrix never produces a zero pivot.)
LU Basics: Numerical Problems with Small Pivots (contrived example)
$$ \begin{bmatrix} 10^{-17} & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \end{bmatrix} $$
$$ L = \begin{bmatrix} 1 & 0 \\ 10^{17} & 1 \end{bmatrix}, \qquad U = \begin{bmatrix} 10^{-17} & 1 \\ 0 & 2 - 10^{17} \end{bmatrix} $$
Can we represent $2 - 10^{17}$ in floating point?
Forward elimination:
$$ \begin{bmatrix} 1 & 0 \\ 10^{17} & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \end{bmatrix} \;\Rightarrow\; y_1 = 1,\; y_2 = 3 - 10^{17} $$
Back substitution (exact):
$$ x_2 = \frac{3 - 10^{17}}{2 - 10^{17}}, \qquad x_1 = 10^{17}\,(1 - x_2) = \frac{10^{17}}{10^{17} - 2} \approx 1 $$
With rounding, $2 - 10^{17}$ and $3 - 10^{17}$ both round to $-10^{17}$, so $y_2 = -10^{17}$, $x_2 = 1$, and then $x_1 = \frac{1 - x_2}{10^{-17}} = 0$: the computed $x_1$ is completely wrong.
LU Basics: Numerical Problems, Small Pivots (the basic problem)
IEEE double precision uses 64 bits: 1 sign bit, 11 exponent bits and 52 mantissa bits, representing numbers of the form $\pm X.XXX\ldots \times \text{base}^{\text{exponent}}$.
Key issue: avoid small differences between large numbers!
LU Basics: Numerical Problems, Small Pivots (exact vs. rounded)
Without pivoting, the exact solution is $x \approx \begin{bmatrix} 1 \\ 1 \end{bmatrix}$, but the rounded computation returns $x = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$.
LU Basics: Small Pivots, the Partial Pivoting Fix
If $|M_{ii}| < \max_{j>i} |M_{ji}|$, swap row i with the row holding the largest-magnitude entry below the pivot. For the example, reordering the rows gives
$$ L U_{reordered} = \begin{bmatrix} 1 & 0 \\ 10^{-17} & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 0 & 1 - 2 \cdot 10^{-17} \end{bmatrix} $$
Now the multiplier is small. Forward elimination:
$$ \begin{bmatrix} 1 & 0 \\ 10^{-17} & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \end{bmatrix} \;\Rightarrow\; y_1 = 3,\; y_2 = 1 - 3 \cdot 10^{-17} \approx 1 $$
Notice that without partial pivoting, $y_2$ was $3 - 10^{17}$, or $-10^{17}$ with rounding. The right-hand-side value 3 in the unpivoted case was rounded away, whereas now it is preserved. Continuing with the back substitution gives $x_2 \approx 1$ and $x_1 \approx 1$, the correct answer.
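The small-pivot disaster is easy to reproduce. The sketch below reruns the contrived example in ordinary double precision and compares the unpivoted recurrence to a pivoted library solve; it is an illustration, not course code.

```python
# A minimal sketch of the small-pivot example: unpivoted elimination loses x1.
import numpy as np

M = np.array([[1e-17, 1.0], [1.0, 2.0]])
b = np.array([1.0, 3.0])

print(np.linalg.solve(M, b))        # pivoted solve: approximately [1, 1]

m = M[1, 0] / M[0, 0]               # huge multiplier, 1e17
u22 = M[1, 1] - m * M[0, 1]         # 2 - 1e17 rounds to -1e17
y2 = b[1] - m * b[0]                # 3 - 1e17 rounds to -1e17
x2 = y2 / u22                       # 1.0
x1 = (b[0] - M[0, 1] * x2) / M[0, 0]
print([x1, x2])                     # [0.0, 1.0]: x1 is completely wrong
```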
Hard to Solve Systems: Fitting Example (Polynomial Interpolation)
Table of data: $(t_0, f(t_0)),\ (t_1, f(t_1)),\ \ldots,\ (t_N, f(t_N))$; fit a polynomial in t through the points.
Hard to Solve Systems: Matrix Form
$$ \begin{bmatrix} 1 & t_0 & \cdots & t_0^N \\ 1 & t_1 & \cdots & t_1^N \\ \vdots & & & \vdots \\ 1 & t_N & \cdots & t_N^N \end{bmatrix} \begin{bmatrix} \alpha_0 \\ \alpha_1 \\ \vdots \\ \alpha_N \end{bmatrix} = \begin{bmatrix} f(t_0) \\ f(t_1) \\ \vdots \\ f(t_N) \end{bmatrix} \qquad (M_{interp}, \text{ with } \alpha_i \text{ the polynomial coefficients}) $$
The kth row in the system of equations on the slide corresponds to insisting that the Nth-order polynomial match the data exactly at point $t_k$. Notice that we selected the order of the polynomial to match the number of data points so that a square system is generated. This would not generally be the best approach to fitting data, as we will see in the next slides.
Hard to Solve Systems: Fitting Example, fitting f(t) = t
Figure: fitted coefficient value versus coefficient number.
Notice what happens when we try to fit a high-order polynomial to a function that is nearly t. Instead of getting one coefficient equal to one and the rest zero, when a 100th-order polynomial is fit to the data, extremely large coefficients are generated for the higher-order terms. This is due to the extreme sensitivity of the problem, as we shall see shortly.
Hard to Solve Systems: Perturbation Analysis, Vector Norms
$$ \|x\|_2 = \sqrt{\sum_{i=1}^{n} |x_i|^2} \quad \text{(L2, Euclidean norm; its unit ball } \|x\|_2 < 1 \text{ is a circle)} $$
$$ \|x\|_1 = \sum_{i=1}^{n} |x_i| \quad \text{(L1 norm)} $$
$$ \|x\|_\infty = \max_i |x_i| \quad \text{(L-infinity norm; its unit ball is a square)} $$
Hard to Solve Systems: Perturbation Analysis, Matrix Norms
Vector-induced norm:
$$ \|A\| = \max_{x \ne 0} \frac{\|A x\|}{\|x\|} = \max_{\|x\| = 1} \|A x\| $$
$$ \|A\|_1 = \max_j \sum_{i=1}^{n} |A_{ij}| = \text{max absolute column sum} \quad (\text{why? let } x = e_j) $$
$$ \|A\|_\infty = \max_i \sum_{j=1}^{n} |A_{ij}| = \text{max absolute row sum} $$
Hard to Solve Systems: Perturbation Analysis, Perturbation Equation
$$ (M + \delta M)(x + \delta x) = b $$
Here $\delta M$ models LU roundoff, $\delta x$ models the resulting solution perturbation, and b is the unperturbed right-hand side. Since $M x - b = 0$,
$$ M\,\delta x = -\delta M\,(x + \delta x) \quad\Rightarrow\quad \delta x = -M^{-1}\,\delta M\,(x + \delta x) $$
Taking norms gives the relative-error relation
$$ \frac{\|\delta x\|}{\|x + \delta x\|} \le \underbrace{\|M^{-1}\|\,\|M\|}_{\text{"condition number"}}\; \frac{\|\delta M\|}{\|M\|} $$
As the algebra shows, the relative change in the solution x is bounded by an M-dependent factor times the relative change in M. The factor $\|M^{-1}\|\,\|M\|$ was historically referred to as the condition number of M, but that definition has been abandoned, as it makes the condition number norm-dependent. Instead, the condition number of M is the ratio of singular values of M:
$$ \mathrm{cond}(M) = \frac{\sigma_{max}(M)}{\sigma_{min}(M)} $$
Singular values are outside the scope of this course; consider consulting Trefethen & Bau.
Hard to Solve Systems: Perturbation Analysis (geometry)
Figure: two cases. Case 1, orthogonal columns: a perturbation of relative size $\frac{\|\delta M\|}{\|M\|}$ produces a comparably sized change $\delta x$. Case 2, nearly aligned columns (differing by parts in $10^6$): the same relative perturbation produces an enormously larger $\delta x$.
Hard to Solve Systems: Geometric Analysis of Polynomial Interpolation
Figure: log(cond(M)) versus the number of interpolation points for the matrix with rows $1, t_i, t_i^2, \ldots$; the condition number grows rapidly (roughly $10^6$ at 16 points and $10^{13}$ or more at 32 points).
Does row scaling reduce the condition number? Row scaling replaces M by D M, with D diagonal:
$$ \begin{bmatrix} d_{11} & & 0 \\ & \ddots & \\ 0 & & d_{NN} \end{bmatrix} \begin{bmatrix} a_{11} & \cdots & a_{1N} \\ \vdots & & \vdots \\ a_{N1} & \cdots & a_{NN} \end{bmatrix} = \begin{bmatrix} d_{11} a_{11} & \cdots & d_{11} a_{1N} \\ \vdots & & \vdots \\ d_{NN} a_{N1} & \cdots & d_{NN} a_{NN} \end{bmatrix} $$
Theorem: if floating-point arithmetic is used, then row scaling (solving D M x = D b) will not reduce roundoff growth in a meaningful way. If $L U x = b$ gives the computed solution x, and $D M = L' U'$ with $L' U' x' = D b$, then $x = x'$: no roundoff reduction.
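The growth of the interpolation matrix's condition number is easy to observe numerically. This sketch is an illustration (the sample points are made up); it uses the singular-value-ratio definition of cond(M) mentioned above.

```python
# A minimal sketch of how the Vandermonde (interpolation) matrix's condition
# number explodes with the number of points; sample points are made up.
import numpy as np

for N in (4, 8, 16, 32):
    t = np.linspace(-1.0, 1.0, N)
    M = np.vander(t, increasing=True)    # rows 1, t_i, t_i^2, ...
    print(N, np.linalg.cond(M))          # sigma_max / sigma_min
```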
Summary
Solution Existence and Uniqueness
Gaussian Elimination Basics
LU factorization
Pivoting and Growth
Hard to solve problems
Conditioning
Outline
LU Factorization Reminder.
Sparse Matrices
Struts and joints, resistor grids, 3-d heat flow
Factoring (LU Basics): Picture
Figure: the 4 x 4 factorization picture again, as a reminder: the multipliers overwrite the entries they zero while the remaining entries are updated in place.
Factoring Algorithm (LU Basics), with operation counts
For i = 1 to n-1 {
    For j = i+1 to n {
        M(j,i) = M(j,i) / M(i,i)        ($\sum_{i=1}^{n-1} (n-i) \approx \frac{n^2}{2}$ multipliers)
        For k = i+1 to n {
            M(j,k) = M(j,k) - M(j,i) * M(i,k)        ($\sum_{i=1}^{n-1} (n-i)^2 \approx \frac{n^3}{3}$ multiply-adds)
        }
    }
}
Sparse Matrices, Applications: Space Frame
Figure: a space frame with five joints and its nodal matrix; each X is a 2x2 block, and the matrix is sparse because each joint couples only to the joints its struts connect to.
Sparse Matrices, Applications: Resistor Grid
Figure: an m x m resistor grid, nodes numbered 1, 2, ..., m across the first row, m+1, m+2, ..., 2m across the second, and so on up to $m^2$.
The resistive grid is an important special case, as it is a model for discretized partial differential equations (we will see this later). Let's consider the nodal matrix and examine the locations and number of non-zeros. The matrix has a special form which is easy to discern from a 4 x 4 example: in the 4 x 4 case the nodal matrix is block tridiagonal, with tridiagonal blocks on the diagonal and additional non-zeros a distance 4 from the diagonal. The tridiagonal blocks are due to the interaction between contiguously numbered nodes along a single row in the grid; the non-zeros a distance 4 from the diagonal are due to the coupling between grid rows.
Sparse Matrices, Nodal Formulation Applications: Temperature in a Cube
Figure: the 3-D circuit model; successive m x m grid layers are numbered $m^2 + 1$, $m^2 + 2$, ..., so each node couples to neighbors within its layer (offsets 1 and m) and across layers (offset $m^2$).
Sparse Matrices, Nodal Formulation: Tridiagonal Example
Figure: a 1-D chain of nodes gives a tridiagonal matrix, with non-zeros only on the diagonal and the first off-diagonals.
Sparse Matrices: Tridiagonal Example, GE Algorithm
For i = 1 to n-1 {
    For j = i+1 to n {
        M(j,i) = M(j,i) / M(i,i)
        M(j,k) = M(j,k) - M(j,i) * M(i,k)
    }
}
For a tridiagonal matrix, only one row below each pivot is non-zero and it has only one entry to update: order N operations!
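For a tridiagonal matrix the elimination above reduces to the classic O(N) "Thomas algorithm". The sketch below is an illustration with made-up coefficients, not course code; it assumes no pivoting is needed (true, e.g., for diagonally dominant matrices).

```python
# A minimal sketch of O(N) elimination on a tridiagonal system (Thomas
# algorithm); assumes nonzero pivots. All coefficient values are made up.
import numpy as np

def thomas(a, b, c, d):
    """a: sub-diagonal, b: diagonal, c: super-diagonal, d: right-hand side."""
    n = len(b)
    b, d = b.copy(), d.copy()
    for i in range(1, n):
        m = a[i - 1] / b[i - 1]        # the single multiplier for row i
        b[i] -= m * c[i - 1]
        d[i] -= m * d[i - 1]
    x = np.zeros(n)
    x[-1] = d[-1] / b[-1]
    for i in range(n - 2, -1, -1):     # back substitution
        x[i] = (d[i] - c[i] * x[i + 1]) / b[i]
    return x

n = 6
print(thomas(-np.ones(n - 1), 2.0 * np.ones(n), -np.ones(n - 1), np.ones(n)))
```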
Sparse Matrices: Fill-In, Resistor Example
Figure: a three-node resistor circuit ($R_1, \ldots, R_5$, source $i_{S1}$) and its nodal matrix equation
$$ G \begin{bmatrix} V_1 \\ V_2 \\ V_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ i_{S1} \end{bmatrix} $$
with G symmetric and diagonally dominant.
Recalling from lecture 2, the entries in the nodal matrix can be derived by noting that a resistor $R_k$ between nodes n1 and n2 contributes $+\frac{1}{R_k}$ to the (n1, n1) and (n2, n2) entries and $-\frac{1}{R_k}$ to the (n1, n2) and (n2, n1) entries. It is also interesting to note that $G_{ii}$ is equal to the sum of the conductances (one over resistance) incident at node i.
Sparse Matrices: Fill-In, Example
Figure: a matrix non-zero structure before and after one elimination step; entries marked X are non-zero. Consider the 2x2 example
$$ \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & 0 \end{bmatrix} $$
The result of factoring is
$$ \begin{bmatrix} a_{11} & a_{12} \\ \frac{a_{21}}{a_{11}} & -\frac{a_{21} a_{12}}{a_{11}} \end{bmatrix} $$
Notice that the factored matrix has a non-zero entry in the bottom right corner, whereas the original matrix did not. This changing of a zero entry to a non-zero entry is referred to as a fill-in.
Sparse Matrices: Fill-In, Second Example (fill-ins propagate)
Figure: a matrix with 7 zeros before factorization.
In the example, the 4 x 4 mesh matrix begins with 7 zeros; during LU factorization, 5 of the zeros become non-zero. What is of additional concern is that fill-ins propagate: the first step of LU factorization, where a multiple of the first row is subtracted from the second row, generates fill-ins in the third and fourth columns of row two. When multiples of row 2 are then subtracted from rows 3 and 4, the fill-ins generated in row 2 generate second-level fill-ins in rows 3 and 4.
Sparse Matrices: Fill-In, Reordering
Figure: a star circuit ($V_1, V_2, V_3$). Numbering the center node first produces fill-ins during factorization; numbering it last produces no fill-ins.
In the context of the nodal equation formulation, renumbering the nodes seems like a simple operation to reduce fill-in, since selecting the node numbers was arbitrary to begin with. Keep in mind, however, that such a renumbering of nodes corresponds to swapping both rows and columns in the matrix.
Sparse Matrices: Fill-In, Reordering
Figure: at each factorization step, the possible fill-in locations are the intersections of the non-zeros in the pivot's unfactored row and column; the already-factored portion and the multipliers are fixed.
Sparse Matrices: Fill-In, Markowitz Reordering
For i = 1 to n:
    Find the diagonal j >= i with the minimum Markowitz product; swap rows and columns j and i; factor the new row i and insert any fill-ins.
This is a greedy algorithm!
Finding the minimum product at the first step costs about K*N operations, where K is the average number of non-zeros per row. The second step of the algorithm is to swap rows and columns in the factorization; a good data structure will make the swap inexpensive. The third step is to factor the reordered matrix and insert the fill-ins; if the matrix is very sparse, this third step will also be inexpensive.
Since one must then find the diagonal in the updated matrix with the minimum Markowitz product, the products must be recomputed at a cost of about K*(N-1) operations. Continuing, it is clear that about (1/2)*K*N^2 operations will be needed just to compute the Markowitz products in a reordering algorithm.
It is possible to improve the situation by noting that very few Markowitz products change during a single step of the factorization. The mechanics of such an optimization are easiest to see by examining the graphs of a matrix.
Sparse Matrices: Fill-In
Figures: reordering examples with Markowitz products 0, 3, 0 and 2; a matrix that starts very sparse, stays very sparse through the early steps, then goes dense near the end of factorization; and an unfactored random matrix alongside the same matrix after factoring, showing substantial fill-in.
Sparse Matrices: Matrix Graphs, Construction
Figure: a 5 x 5 non-zero pattern and the corresponding graph with nodes 1 through 5.
In the case where the matrix is structurally symmetric ($a_{ij} \ne 0$ if and only if $a_{ji} \ne 0$), an undirected graph can be associated with the matrix. The graph has one node per matrix row and one edge per symmetric pair of off-diagonal non-zeros.
Sparse Matrices: Matrix Graphs, Markowitz Products
Figure: the example graph, with Markowitz products
$$ M_{11}: 3 \times 3 = 9,\quad M_{22}: 2 \times 2 = 4,\quad M_{33}: 3 \times 3 = 9,\quad M_{44}: 2 \times 2 = 4,\quad M_{55}: 2 \times 2 = 4 $$
equal to the squared node degrees $(\text{degree } 1)^2 = 9$, $(\text{degree } 2)^2 = 4$, and so on.
That the ith node degree squared is equal to the Markowitz product associated with the ith diagonal is easy to see. The node degree is the number of edges emanating from the node, and each edge represents both an off-diagonal row entry and an off-diagonal column entry. Therefore, the number of off-diagonal row entries multiplied by the number of off-diagonal column entries equals the node degree squared.
Sparse Matrices: Matrix Graphs, Factorization
Figure: eliminating a node from the example graph.
After step i in the factorization, the unfactored portion of the matrix is smaller, of size (n - i) x (n - i), and may be denser if there are fill-ins. The graph can be used to represent the locations of non-zeros in the unfactored portion of the matrix, but two things must change:
1) A node must be removed, as the unfactored portion has one fewer row.
2) The edges associated with fill-ins must be added.
In the animation, we show by example how the graph is updated during a step of LU factorization. We can state the manipulation precisely: if row i is eliminated in the matrix, then node i must be eliminated from the graph. In addition, all nodes adjacent to node i (adjacent nodes are ones connected by an edge) are made adjacent to each other by adding the necessary edges. The added edges represent fill-in.
Sparse Matrices: Matrix Graphs, Example
Figure: a 5-node example graph and its matrix. The Markowitz products (column non-zeros times row non-zeros, equal to the node degree squared) are 3 x 3 = 9, 2 x 2 = 4, 3 x 3 = 9, 3 x 3 = 9 and 3 x 3 = 9.
Sparse Matrices: Matrix Graphs, Example (swap node 2 with node 1)
Figure: node 2 has the minimum Markowitz product, so it is swapped to the front and eliminated first; eliminating it adds fill-in edges between its neighbors, and the process repeats on the reduced graph.
Sparse Matrices: Graphs of the Resistor Grid
A quick way to get a rough idea of how long it takes to factor the $M^2 \times M^2$ matrix associated with an M x M grid like a resistor array is to examine the graph. If one orders the center column of M nodes in the graph last, then they will become completely connected, as shown in the animation. However, a completely connected graph corresponds to a dense matrix. Since the resulting dense M x M matrix requires $M^3$ operations to factor, this suggests that factoring an M x M grid costs on the order of $M^3$ operations, though making such an argument precise is beyond the scope of the course.
Sparse Matrices: Sparse Data Structures
Figure: a vector of row pointers; row i points to a pair of arrays holding the values (Val i1, Val i2, ...) and the column indices (Col i1, Col i2, ...).
In order to store a sparse matrix efficiently, one needs a data structure which can represent only the matrix non-zeros. One simple approach is based on the observation that each row of a sparse matrix has at least one non-zero entry. One then constructs a pair of arrays for each row, where one array holds the matrix entries and the other the entries' columns. As an example, in the matrix
$$ \begin{bmatrix} a_{11} & 0 & a_{13} \\ a_{21} & a_{22} & 0 \\ 0 & 0 & a_{33} \end{bmatrix} $$
each row stores only its (value, column) pairs, for instance $(a_{13}, 3)$, $(a_{21}, 1)$ and $(a_{33}, 3)$.
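Here is a sketch of the row-wise (value, column) storage just described, with a matrix-vector product that touches only the stored non-zeros. The entry values are placeholders; the structure, not the numbers, is the point.

```python
# A minimal sketch of row-wise sparse storage: each row keeps (value, column)
# pairs for its non-zeros only. The numeric values are placeholders.
rows = [
    [(11.0, 0), (13.0, 2)],     # row 1: a11, a13
    [(21.0, 0), (22.0, 1)],     # row 2: a21, a22
    [(33.0, 2)],                # row 3: a33
]

def matvec(rows, x):
    """Sparse matrix-vector product over the stored non-zeros only."""
    return [sum(val * x[col] for val, col in row) for row in rows]

print(matvec(rows, [1.0, 2.0, 3.0]))
```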
Sparse Matrices: the row-matching problem
Figure: updating row j using pivot row i. Row i has non-zeros in columns i+1, i+7 and i+15, while row j has non-zeros in columns i+1, i+4, i+5, i+7, i+9, i+12 and i+15.
One must read all the row j entries to find the 3 that match row i.
Sparse Matrices: operations versus cache misses
        Rows    Ops        Misses
Res     300     904,387    248,967
RAM     2,806   1,017,289  3,817,587
Grid    4,356   3,180,726  3,597,746
More misses than ops!
Sparse Matrices: the scatter-gather fix
Figure: row j's (value, column) pairs scattered into a length-n working vector.
1) Read all the elements in row j, and scatter them into an n-length vector.
2) Access only the needed elements using array indexing!
Summary
LU Factorization and Diagonal Dominance.
Factor without numerical pivoting
Sparse Matrices
Struts, resistor grids, 3-D heat flow -> O(N) non-zeros
QR Factorization: Singular Example (LU factorization fails)
Figure: a strut-and-joint configuration with a load force whose equilibrium matrix is singular in the unknowns $v_1, \ldots, v_4$.
$$ \begin{bmatrix} M_1 & M_2 & \cdots & M_N \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_N \end{bmatrix} = \begin{bmatrix} b_1 \\ \vdots \\ b_N \end{bmatrix} \qquad\Longleftrightarrow\qquad x_1 M_1 + x_2 M_2 + \cdots + x_N M_N = b $$
QR Factorization: Orthogonalization
If M has orthogonal columns, $M_i^T M_j = 0$ for $i \ne j$, then taking the inner product of $x_1 M_1 + x_2 M_2 + \cdots + x_N M_N = b$ with $M_i$ gives
$$ x_i\, M_i^T M_i = M_i^T b \quad\Rightarrow\quad x_i = \frac{M_i^T b}{M_i^T M_i} $$
QR Factorization: Orthogonalization, Orthonormal M (picture)
M is orthonormal if $M_i^T M_j = 0$ for $i \ne j$ and $M_i^T M_i = 1$.
Figure: in the orthogonal case the components $x_1$, $x_2$ along $M_1$, $M_2$ are found independently; in the non-orthogonal case they are coupled.
QR Factorization: Orthogonalization, the QR Algorithm's Key Idea
Replace the original matrix equation
$$ \begin{bmatrix} M_1 & M_2 & \cdots & M_N \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_N \end{bmatrix} = \begin{bmatrix} b_1 \\ \vdots \\ b_N \end{bmatrix} $$
by one whose matrix has orthonormal columns:
$$ \begin{bmatrix} Q_1 & Q_2 & \cdots & Q_N \end{bmatrix} \begin{bmatrix} y_1 \\ \vdots \\ y_N \end{bmatrix} = \begin{bmatrix} b_1 \\ \vdots \\ b_N \end{bmatrix} \qquad Q y = b \;\Rightarrow\; y = Q^T b $$
QR Factorization: Orthogonalization, Projection Formula
$$ \tilde Q_2 = M_2 - r_{12} M_1, \qquad M_1^T \tilde Q_2 = 0 \;\Rightarrow\; r_{12} = \frac{M_1^T M_2}{M_1^T M_1} $$
Normalization:
$$ Q_2 = \frac{1}{r_{22}}\, \tilde Q_2, \qquad r_{22} = \sqrt{\tilde Q_2^T \tilde Q_2} $$
QR Factorization: Orthogonalization, how a 2 x 2 matrix is converted
$$ \begin{bmatrix} M_1 & M_2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = x_1 M_1 + x_2 M_2 = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = y_1 Q_1 + y_2 Q_2 $$
Since $M_1 = r_{11} Q_1$ and $M_2 = r_{22} Q_2 + r_{12} Q_1$,
$$ \begin{bmatrix} r_{11} & r_{12} \\ 0 & r_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} $$
The 2 x 2 QR factorization, with Q orthonormal and R upper triangular:
$$ \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} \\ 0 & r_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} $$
Step 1) $Q R x = b \;\Rightarrow\; R x = Q^T b = \tilde b$
Step 2) Backsolve $R x = \tilde b$
QR Factorization: Orthogonalization, 3 x 3 Case
$$ \tilde Q_3 = M_3 - r_{13} M_1 - r_{23} M_2 $$
with the orthogonality requirements
$$ M_1^T ( M_3 - r_{13} M_1 - r_{23} M_2 ) = 0, \qquad M_2^T ( M_3 - r_{13} M_1 - r_{23} M_2 ) = 0 $$
One must therefore solve equations for the coefficients:
$$ \begin{bmatrix} M_1^T M_1 & M_1^T M_2 \\ M_2^T M_1 & M_2^T M_2 \end{bmatrix} \begin{bmatrix} r_{13} \\ r_{23} \end{bmatrix} = \begin{bmatrix} M_1^T M_3 \\ M_2^T M_3 \end{bmatrix} $$
In general, the Nth column requires solving an (N-1) x (N-1) system for $r_{1,N}, \ldots, r_{N-1,N}$.
QR Factorization: Orthogonalization, 3 x 3 Case (use previously orthogonalized vectors)
Orthogonalizing against the already computed $Q_1$, $Q_2$ instead of the original columns decouples the coefficient equations:
$$ \tilde Q_2 = M_2 - r_{12} Q_1, \qquad \tilde Q_3 = M_3 - r_{13} Q_1 - r_{23} Q_2 $$
QR Factorization: Basic Algorithm (Modified Gram-Schmidt)
For i = 1 to N {
    $r_{ii} = \sqrt{M_i^T M_i}$
    $Q_i = \frac{1}{r_{ii}} M_i$        (normalize; about 2N operations)
    For j = i+1 to N {
        $r_{ij} = M_j^T Q_i$
        $M_j = M_j - r_{ij} Q_i$
    }
}
Total: about $\sum_{i=1}^{N} (N - i)\, 2N \approx N^3$ operations.
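The modified Gram-Schmidt loop above can be written out directly. This sketch is an illustration with a made-up test matrix; it assumes the columns are independent (no zero $r_{ii}$).

```python
# A minimal sketch of Modified Gram-Schmidt QR; assumes independent columns.
import numpy as np

def mgs_qr(M):
    A = M.astype(float).copy()
    n = A.shape[1]
    Q, R = np.zeros_like(A), np.zeros((n, n))
    for i in range(n):
        R[i, i] = np.linalg.norm(A[:, i])
        Q[:, i] = A[:, i] / R[i, i]         # normalize column i
        for j in range(i + 1, n):
            R[i, j] = Q[:, i] @ A[:, j]     # coefficient r_ij
            A[:, j] -= R[i, j] * Q[:, i]    # orthogonalize remaining columns
    return Q, R

M = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]])
Q, R = mgs_qr(M)
print(np.allclose(Q @ R, M), np.allclose(Q.T @ Q, np.eye(3)))
```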
QR Factorization: Basic Algorithm, by Picture
Figure: the columns $M_1, \ldots, M_N$ are converted one at a time into $Q_1, \ldots, Q_N$, while the coefficients fill the upper triangle of R:
$$ R = \begin{bmatrix} r_{11} & r_{12} & r_{13} & \cdots & r_{1N} \\ 0 & r_{22} & r_{23} & \cdots & r_{2N} \\ 0 & 0 & r_{33} & \cdots & r_{3N} \\ & & & \ddots & \\ 0 & 0 & 0 & \cdots & r_{NN} \end{bmatrix} $$
QR Factorization: Basic Algorithm, Zero Column
If a column (say $M_2$) becomes zero after orthogonalization, it was linearly dependent on the previous columns. The resulting factorization simply carries a zero column in Q and a zero row in R:
$$ \begin{bmatrix} Q_1 & 0 & Q_3 & \cdots & Q_N \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & \cdots & r_{1N} \\ 0 & 0 & 0 & \cdots & 0 \\ 0 & 0 & r_{33} & \cdots & r_{3N} \\ & & & \ddots & \\ 0 & 0 & 0 & \cdots & r_{NN} \end{bmatrix} $$
QR Factorization
Singular Example (recap)
[ M_1  M_2  ...  M_N ] [ x_1; ...; x_N ] = [ b_1; ...; b_N ],
x_1 M_1 + x_2 M_2 + ... + x_N M_N = b

QR Factorization
Minimization View: Alternative Formulations
Instead of solving M x = b directly, minimize the squared residual norm
R(x)^T R(x) = sum_{i=1}^{N} ( R_i(x) )^2,   R(x) = b - M x

QR Factorization
Minimization View: One-dimensional Minimization
Minimize along e_1 alone:
d/dx_1 [ R(x)^T R(x) ] = -2 b^T M e_1 + 2 x_1 ( M e_1 )^T ( M e_1 ) = 0
=>  x_1 = ( b^T M e_1 ) / ( e_1^T M^T M e_1 )    <- Normalization
SMA-HPC 2003 MIT
QR Factorization
Minimization View: One-dimensional Minimization, Picture
(Figure: b projected onto M e_1 = M_1, with coefficient x_1 = (b^T M e_1)/(e_1^T M^T M e_1).)
One-dimensional minimization yields the same result as projection on the column!
SMA-HPC 2003 MIT

QR Factorization
Minimization View: Two-dimensional Minimization
|| b - x_1 M e_1 - x_2 M e_2 ||^2 contains
-2 x_1 b^T M e_1 + x_1^2 ( M e_1 )^T ( M e_1 )
-2 x_2 b^T M e_2 + x_2^2 ( M e_2 )^T ( M e_2 )
+2 x_1 x_2 ( M e_1 )^T ( M e_2 )    <- Coupling term
SMA-HPC 2003 MIT

QR Factorization
Minimization View: Two-dimensional Minimization Continued
Replace e_1, e_2 by search directions p_1, p_2: the coupling term
2 v_1 v_2 ( M p_1 )^T ( M p_2 ) vanishes when the M p_i are orthogonal.

QR Factorization
Minimization View: Orthogonalized Search Directions
p_i = e_i - sum_{j=1}^{i-1} r_ji p_j,  with r_ji chosen so ( M p_i )^T ( M p_j ) = 0:
r_ji = ( ( M p_j )^T ( M e_i ) ) / ( ( M p_j )^T ( M p_j ) )

QR Factorization
Minimization View: Differentiating
d/dv_i || b - sum_j v_j M p_j ||^2 = -2 b^T M p_i + 2 v_i ( M p_i )^T ( M p_i ) = 0
=>  v_i = ( b^T M p_i ) / ( ( M p_i )^T ( M p_i ) )

QR Factorization
Minimization Algorithm
For i = 1 to N {
    p_i = e_i
    For j = 1 to i-1:  p_i = p_i - r_ji p_j
    v_i = ( b^T M p_i ) / ( ( M p_i )^T ( M p_i ) )
    x = x + v_i p_i
}
QR Factorization
Minimization View: Minimization and QR Comparison
The QR columns Q_1, Q_2, ..., Q_N are orthonormal; the search directions
p_1 = (1/r_11) e_1,  p_2 = (1/r_22) ( e_2 - r_12 e_1 ),  ...,
p_N = (1/r_NN) ( e_N - sum_i r_iN e_i )
are M^T M - orthonormal.

QR Factorization / Search Directions
QR factorization: the search directions are the unit vectors, orthogonalized in the
M^T M inner product. Minimization: search directions { p_1, ..., p_N }, also
M^T M - orthogonalized. Krylov-subspace methods orthogonalize their search directions
in the same M^T M inner product. Why?
SMA-HPC 2003 MIT
Search Directions
Summary
QR Algorithm
Projection Formulas
Orthonormalizing the columns as you go
Modified Gram-Schmidt Algorithm
Minimization View of QR
Basic Minimization approach
Orthogonalized Search Directions
QR and Length minimization produce identical results
Outline
General Subspace Minimization Algorithm
Review orthogonalization and projection formulas

Arbitrary Subspace Methods
Pick a k-dimensional subspace span{ w_0, ..., w_{k-1} }, w_i in R^N, and look for
x^k = sum_{i=0}^{k-1} alpha_i w_i

Arbitrary Subspace Methods: Residual Minimization
If x^k = sum_{i=0}^{k-1} alpha_i w_i, then
r^k = b - M x^k = b - sum_{i=0}^{k-1} alpha_i M w_i
|| r^k ||_2^2 = ( b - sum_i alpha_i M w_i )^T ( b - sum_i alpha_i M w_i )

Arbitrary Subspace Methods: Residual Minimization, Computational Approach
Minimizing || r^k ||_2^2 = || b - sum_{i=0}^{k-1} alpha_i M w_i ||_2^2 is easy if
( M w_i )^T ( M w_j ) = 0 for i != j,  i.e. the ( M w_i ) are orthogonal.
So create a set of vectors { p_0, ..., p_{k-1} } such that span{ p_i } = span{ w_i }
and ( M p_i )^T ( M p_j ) = 0, i != j.

Arbitrary Subspace Methods: Residual Minimization, Algorithm Steps
For j = 0 to k-1:
    p_j = w_j - sum_{i=0}^{j-1} [ ( ( M w_j )^T ( M p_i ) ) / ( ( M p_i )^T ( M p_i ) ) ] p_i
x^k = sum_{i=0}^{k-1} [ ( ( r^0 )^T ( M p_i ) ) / ( ( M p_i )^T ( M p_i ) ) ] p_i
Arbitrary Subspace Methods: Residual Minimization, Algorithm Steps by Picture
1) Orthogonalize the M w_i's:  w_0 -> p_0,  w_1 -> p_1,  w_2 -> p_2,  w_3 -> p_3.
(Figure: each new M p_j made orthogonal to the previous M p_i's.)

Arbitrary Subspace Methods: Minimization Algorithm
r^0 = b - M x^0
For j = 0 to k-1 {
    p_j = w_j
    For i = 0 to j-1:
        p_j = p_j - ( ( M p_j )^T ( M p_i ) ) p_i     <- Orthogonalize search direction
    p_j = p_j / sqrt( ( M p_j )^T ( M p_j ) )         <- Normalize
    x^{j+1} = x^j + ( ( r^j )^T ( M p_j ) ) p_j       <- Update solution
    r^{j+1} = r^j - ( ( r^j )^T ( M p_j ) ) M p_j     <- Update residual
}
Arbitrary Subspace Methods: Subspace Selection Criteria
Pick { w_0, ..., w_{k-1} } so that min_alpha || b - sum_{i=0}^{k-1} alpha_i M w_i ||_2 is small.

Arbitrary Subspace Methods: Subspace Selection, Historical Development
Consider minimizing  f(x) = (1/2) x^T M x - x^T b.
grad_x f(x) = M x - b,  so  x = M^{-1} b  minimizes f.
This suggests building the subspace from the gradients (the residuals):
{ grad f(x^0), ..., grad f(x^{k-1}) }.

Arbitrary Subspace Methods: Subspace Selection, Krylov Subspace
span{ w_0, ..., w_{k-1} } = span{ r^0, ..., r^{k-1} } = span{ r^0, M r^0, ..., M^{k-1} r^0 }
- the Krylov subspace.
SMA-HPC 2003 MIT
Krylov Methods
alpha_k = ( ( r^k )^T ( M p_k ) ) / ( ( M p_k )^T ( M p_k ) )
x^{k+1} = x^k + alpha_k p_k
r^{k+1} = r^k - alpha_k M p_k
p_{k+1} = r^{k+1} - sum_{j=0}^{k} [ ( ( M r^{k+1} )^T ( M p_j ) ) / ( ( M p_j )^T ( M p_j ) ) ] p_j

Krylov Methods: Symmetric Case
If M = M^T, almost all the terms in the orthogonalization sum vanish automatically,
and the search-direction update collapses to a short recurrence.
Krylov Methods: Nodal Formulation
No-leak Example - Insulated Bar and Matrix
(Figure: heat-conducting bar with incoming heat, near-end temperature T(0) and far-end
temperature T(1), discretized into m nodes.)
Nodal equation form:
[  2  -1          ]
[ -1   2  -1      ]
[       ...       ]
[          -1   2 ]
SMA-HPC 2003 MIT

Krylov Methods: Nodal Formulation
No-leak Example - Circuit and Matrix
(Figure: the equivalent resistor line, nodes 1 to m; the same tridiagonal nodal matrix.)
SMA-HPC 2003 MIT

Krylov Methods: Nodal Formulation
Leaky Example - Conducting Bar and Matrix
(Figure: the same bar with leakage to the surroundings, and the equivalent circuit with a
leak resistor at every node.)
Nodal equation form:
[ 2.01  -1             ]
[  -1  2.01  -1        ]
[         ...          ]
[          -1    2.01  ]
SMA-HPC 2003 MIT
(Figure: GCR residual versus iteration for the insulating and leaky examples. The
insulating problem's residual decays slowly, reaching about 1e-4 only after some 50-60
iterations; the better-conditioned leaky problem's residual falls below 1e-5 far sooner.)
Krylov Subspace Methods: Convergence Analysis, Polynomial Approach
x^{k+1} in span{ r^0, M r^0, ..., M^k r^0 }
=>  x^{k+1} = sum_{i=0}^{k} gamma_i M^i r^0 = phi_k(M) r^0
r^{k+1} = r^0 - M phi_k(M) r^0 = ( I - M phi_k(M) ) r^0

Krylov Methods: Convergence Analysis, Basic Properties
1) x^{k+1} = phi_k(M) r^0, where phi_k is the k-th order polynomial minimizing || r^{k+1} ||_2
2) r^{k+1} = b - M x^{k+1} = r^0 - M phi_k(M) r^0 = ( I - M phi_k(M) ) r^0 = psi_{k+1}(M) r^0,
   where psi_{k+1} is the (k+1)-st order polynomial minimizing || r^{k+1} ||_2
   subject to psi_{k+1}(0) = 1
3) || r^{k+1} ||_2 = || psi_{k+1}(M) r^0 ||_2 <= || psi~(M) ||_2 || r^0 ||_2
   for any (k+1)-st order psi~ with psi~(0) = 1
Therefore any polynomial which satisfies the zero constraint can be used to get an
upper bound on || r^{k+1} ||_2 / || r^0 ||_2.
SMA-HPC 2003 MIT
Eigenvalues and Vectors Review: Basic Definitions
M u_i = lambda_i u_i    ( u_i an eigenvector, lambda_i its eigenvalue )
Or: lambda_i is an eigenvalue of M if M - lambda_i I is singular, and
u_i is an eigenvector of M if ( M - lambda_i I ) u_i = 0.

Eigenvalues and Vectors Review: Examples
[ 1.1  1  ]     [ 1 1 0 0 ]     [ M_11                    ]
[  1  1.1 ]     [ 1 1 0 0 ]     [ M_21  M_22              ]
                [ 0 0 1 1 ]     [  ...        ...         ]
                [ 0 0 1 2 ]     [ M_N1 ... M_N,N-1  M_NN  ]
Eigenvalues? Eigenvectors?

Eigenvalues and Vectors Review: A Simplifying Assumption
Assume M has a full set of eigenvectors:
M [ u_1  u_2  u_3 ... u_N ] = [ lambda_1 u_1  lambda_2 u_2  lambda_3 u_3 ... lambda_N u_N ]
i.e.  M U = U Lambda,  Lambda = diag( lambda_1, ..., lambda_N )
=>  U^{-1} M U = Lambda   or   M = U Lambda U^{-1}

Eigenvalues and Vectors Review: Spectral Radius
(Figure: eigenvalues plotted in the complex plane, Im(lambda) versus Re(lambda).)
Eigenvalues and Vectors Review: Example
(Figure: unit-length heat-conducting rod with incoming heat; end temperatures set by
sources vs = T(0) at node 1 and vs = T(1) at node N.)
Nodal matrix:
[  2 -1  0 ... 0 ]
[ -1  2 -1      ]
[      ...      ]
[ 0 ...  -1   2 ]
(Figure: its eigenvalues for N = 20.)
Eigenvalues and Vectors Review: Useful Eigenproperties, Spectral Mapping Theorem
Given a polynomial
f(x) = a_0 + a_1 x + ... + a_p x^p
f(M) = a_0 I + a_1 M + ... + a_p M^p
Then
spectrum( f(M) ) = f( spectrum(M) )
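The theorem is easy to spot-check numerically; a small sketch of our own, with an
arbitrary test matrix and the polynomial f(x) = 3x^2 - x + 5:

    import numpy as np

    M = np.array([[2.0, 1.0], [1.0, 2.0]])
    fM = 3 * M @ M - M + 5 * np.eye(2)         # f(M) = 3 M^2 - M + 5 I
    lam = np.linalg.eigvals(M)                 # spectrum(M) = {1, 3}
    print(np.sort(np.linalg.eigvals(fM)))      # eigenvalues of f(M): [7, 29]
    print(np.sort(3 * lam**2 - lam + 5))       # f applied to the eigenvalues: same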
Useful Eigenproperties: Spectral Mapping Theorem Proof
M = U Lambda U^{-1}  =>  M M = U Lambda U^{-1} U Lambda U^{-1} = U Lambda^2 U^{-1},
and in general  M^p = U Lambda^p U^{-1}.
f(M) = U ( a_0 I + a_1 Lambda + ... + a_p Lambda^p ) U^{-1}    <- the middle factor is diagonal
so  f(M) U = U ( a_0 I + a_1 Lambda + ... + a_p Lambda^p ).
SMA-HPC 2003 MIT

Useful Eigenproperties: Spectral Decomposition
x = alpha_1 u_1 + alpha_2 u_2 + ... + alpha_N u_N
Compute alpha by solving U alpha = x  =>  alpha = U^{-1} x
Applying M to x yields
M x = M ( alpha_1 u_1 + ... + alpha_N u_N ) = lambda_1 alpha_1 u_1 + ... + lambda_N alpha_N u_N
SMA-HPC 2003 MIT

Krylov Methods: Convergence Analysis, Important Observations
1) There is an n-th order polynomial with psi_n(0) = 1 (its roots at the eigenvalues)
   for which psi_n(M) r^0 = 0, and therefore r^n = 0.
2) If M has only q distinct eigenvalues, the GCR algorithm converges in at most q steps.

Summary
Arbitrary Subspace Algorithm
Orthogonalization of Search Directions
Outline
Reminder about GCR
Residual minimizing solution
Krylov Subspace
Polynomial Connection
Preconditioners
Diagonal Preconditioners
Approximate LU preconditioners
Generalized Conjugate Residual Algorithm (with Normalization)
r^0 = b - M x^0
For j = 0 to k-1 {
    p_j = r^j                                         <- Residual is next search direction
    For i = 0 to j-1:
        p_j = p_j - ( ( M p_j )^T ( M p_i ) ) p_i     <- Orthogonalize search direction
    p_j = p_j / sqrt( ( M p_j )^T ( M p_j ) )         <- Normalize
    x^{j+1} = x^j + ( ( r^j )^T ( M p_j ) ) p_j       <- Update solution
    r^{j+1} = r^j - ( ( r^j )^T ( M p_j ) ) M p_j     <- Update residual
}
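A compact numpy transcription of the loop above (our own sketch, not course code; for
clarity it stores all search directions, so memory grows with the iteration count):

    import numpy as np

    def gcr(M, b, x0=None, tol=1e-10, kmax=200):
        """Generalized Conjugate Residual iteration (sketch of the slide's loop)."""
        n = len(b)
        x = np.zeros(n) if x0 is None else x0.copy()
        r = b - M @ x
        P, MP = [], []                      # search directions p_j and products M p_j
        for j in range(kmax):
            p, Mp = r.copy(), M @ r         # residual is next search direction
            for pi, Mpi in zip(P, MP):      # orthogonalize so (Mp_j)^T (Mp_i) = 0
                beta = Mp @ Mpi
                p -= beta * pi
                Mp -= beta * Mpi
            nrm = np.linalg.norm(Mp)        # normalize so (Mp_j)^T (Mp_j) = 1
            p, Mp = p / nrm, Mp / nrm
            alpha = r @ Mp                  # update solution and residual
            x += alpha * p
            r -= alpha * Mp
            P.append(p); MP.append(Mp)
            if np.linalg.norm(r) < tol:
                break
        return x

    A = np.diag([2.0] * 10) + np.diag([-1.0] * 9, 1) + np.diag([-1.0] * 9, -1)
    b = np.ones(10)
    print(np.allclose(A @ gcr(A, b), b))    # True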
Generalized Conjugate Residual Algorithm: Algorithm Steps by Picture
1) Orthogonalize the M r^i's:  r^0 -> p_0,  r^1 -> p_1,  r^2 -> p_2,  r^3 -> p_3.
(Figure: r^k and the orthogonalized, normalized M p_k.)

Generalized Conjugate Residual Algorithm
First search direction:  r^0 = b - M x^0 = b,   p_0 = r^0 / || M r^0 ||
Residual-minimizing solution:  x^1 = x^0 + ( ( r^0 )^T M p_0 ) p_0
r^1 = b - M x^1 = r^0 - alpha_1 M r^0
Second search direction:  p_1 = ( r^1 - beta_{1,0} p_0 ) / || M ( r^1 - beta_{1,0} p_0 ) ||

Generalized Conjugate Residual Algorithm
Residual-minimizing solution:  x^2 = x^1 + ( ( r^1 )^T M p_1 ) p_1
r^2 = b - M x^2 = r^0 - gamma_{2,1} M r^0 - gamma_{2,0} M^2 r^0
p_2 = ( r^2 - beta_{2,0} p_0 - beta_{2,1} p_1 ) / || M ( r^2 - beta_{2,0} p_0 - beta_{2,1} p_1 ) ||

Generalized Conjugate Residual Algorithm, k-th Step
p_k = r^k - sum_{j=0}^{k-1} ( ( M r^k )^T ( M p_j ) ) p_j,  then  p_k = p_k / || M p_k ||
      <- Orthogonalize and normalize search direction
alpha_k = ( r^k )^T ( M p_k )
x^{k+1} = x^k + alpha_k p_k
r^{k+1} = r^k - alpha_k M p_k

Generalized Conjugate Residual Algorithm: Polynomial View
1) x^{k+1} lies in span{ r^0, M r^0, ..., M^k r^0 }, and GCR picks the x^{k+1} in that
   span which minimizes || r^{k+1} ||_2^2.
Krylov Methods: Nodal Formulation (recap)
No-leak example: the insulated bar's nodal matrix is tridiag( -1, 2, -1 ).
Leaky example: the conducting bar's nodal matrix is tridiag( -1, 2.01, -1 ).
SMA-HPC 2003 MIT
(Figure: GCR residual versus iteration for the insulating and leaky examples, as before;
the leaky, better-conditioned problem converges in far fewer iterations.)
Krylov Methods: Residual Minimization, Optimality of the Polynomial
Therefore any polynomial which satisfies the constraints can be used to get an upper bound
on || r^{k+1} || / || r^0 ||.

Induced Norms: Matrix Magnification Question
Suppose y = M x. How much larger is || y || than || x ||?

Induced Norms: Vector Norm Review
L2 (Euclidean) norm:  || x ||_2 = sqrt( sum_{i=1}^{N} |x_i|^2 )
L1 norm:              || x ||_1 = sum_{i=1}^{N} |x_i|
L-infinity norm:      || x ||_inf = max_i |x_i|
(Figure: the unit balls || x || < 1 for each of the three norms.)
Induced Matrix Norms: Standard Induced l-norms
Definition:  || M ||_l = max_{ || x ||_l = 1 } || M x ||_l
Examples:
|| M ||_1 = max_j sum_{i=1}^{N} | M_ij |      <- Max Column Sum
|| M ||_inf = max_i sum_{j=1}^{N} | M_ij |    <- Max Row Sum
SMA-HPC 2003 MIT

Induced Matrix Norms: Standard Induced l-norms Continued
|| M ||_1 = max_j sum_i | M_ij |.   Why? Let x = [ 0 ... 1 ... 0 ]^T pick out the
largest column.
|| M ||_inf = max_i sum_j | M_ij |.  Why? Let x = [ +/-1  +/-1 ... +/-1 ]^T match the
signs of the largest row.
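Both formulas can be checked against numpy's built-in induced norms (a small sketch of
our own):

    import numpy as np

    M = np.array([[1.0, -2.0], [3.0, 4.0]])
    print(np.abs(M).sum(axis=0).max(),     # max column sum = 6.0
          np.linalg.norm(M, 1))            # induced 1-norm: same
    print(np.abs(M).sum(axis=1).max(),     # max row sum = 7.0
          np.linalg.norm(M, np.inf))       # induced infinity-norm: same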
As the algebra on the slide shows, the relative change in the solution x is bounded by an
A-dependent factor times the relative change in A. The factor
|| A^{-1} || || A ||
was historically referred to as the condition number of A, but that definition has been
abandoned, since it makes the condition number norm-dependent. Instead, the condition
number of A is defined as the ratio of its extreme singular values:
cond(A) = sigma_max(A) / sigma_min(A)
Singular values are outside the scope of this course; consider consulting Trefethen & Bau.
Useful Eigenproperties: Spectral Mapping Theorem (recap)
For f(x) = a_0 + a_1 x + ... + a_p x^p and f(M) = a_0 I + a_1 M + ... + a_p M^p:
spectrum( f(M) ) = f( spectrum(M) ).

Krylov Methods: Convergence Analysis, Norm of Matrix Polynomials
With U = [ u_1 ... u_N ] the eigenvectors of M:
psi_k(M) = U diag( psi_k(lambda_1), ..., psi_k(lambda_N) ) U^{-1}
|| psi_k(M) ||_2 <= || U ||_2 || U^{-1} ||_2 max_i | psi_k(lambda_i) |
                 = cond(U) max_i | psi_k(lambda_i) |
    <- cond(U): the condition number of M's eigenspace
since || diag( psi_k(lambda_i) ) ||_2 = max_{||x||=1} || diag( psi_k(lambda_i) ) x ||_2
    = max_i | psi_k(lambda_i) |.
SMA-HPC 2003 MIT

Krylov Methods: Convergence Analysis, Important Observations
psi_n(M) = 0 for the n-th order polynomial with roots at the eigenvalues,
and therefore r^n = 0.

Krylov Methods: Residual Polynomial, Symmetric Case
If M = M^T then
1) M has orthonormal eigenvectors, so
   cond(U) = || [ u_1 ... u_N ] || || [ u_1 ... u_N ]^{-1} || = 1
2) || psi_k(M) ||_2 = max_i | psi_k(lambda_i) |
(Figure: * = eigenvalues of M; curves = 5th-order and 8th-order residual polynomials,
equal to 1 at zero and small near the eigenvalues.)

Krylov Methods: Chebyshev Bounds
For eigenvalues in [ lambda_min, lambda_max ] on the positive real axis, choose
psi_k(lambda) = C_k( 1 + 2 ( lambda_min - lambda ) / ( lambda_max - lambda_min ) )
              / C_k( 1 + 2 lambda_min / ( lambda_max - lambda_min ) )
where C_k is the k-th Chebyshev polynomial: psi_k(0) = 1 and psi_k is uniformly small
on [ lambda_min, lambda_max ].

Krylov Methods: Chebyshev Result
|| r^k || / || r^0 || <= 2 [ ( sqrt( lambda_max / lambda_min ) - 1 )
                           / ( sqrt( lambda_max / lambda_min ) + 1 ) ]^k
Krylov Methods: Preconditioning, Diagonal Example
(Example: for a diagonal matrix such as diag(1, ..., 1, 2), dividing through by the
diagonal maps every eigenvalue to 1.)

Krylov Methods: Diagonal Preconditioners
Let M = D + M_nd    (diagonal plus nondiagonal part)
Apply GCR to  ( D^{-1} M ) x = ( I + D^{-1} M_nd ) x = D^{-1} b
The inverse of a diagonal is cheap to compute.
Usually improves convergence.
SMA-HPC 2003 MIT
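Reusing the gcr() sketch from earlier, diagonal (Jacobi) preconditioning is a two-line
change (our own illustration, with a made-up badly scaled matrix):

    import numpy as np

    # Diagonal preconditioning: solve D^{-1} M x = D^{-1} b with the gcr() sketch above.
    M = np.diag([1.0, 1.0, 1.0, 100.0]) + 0.1 * np.ones((4, 4))
    b = np.ones(4)
    d = np.diag(M)                      # D^{-1} is cheap: just divide by the diagonal
    x = gcr(M / d[:, None], b / d)      # preconditioned system
    print(np.allclose(M @ x, b))        # True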
Heat-Conducting Bar Example
(Figure: bar discretized at x_1, x_2, ..., x_i, x_{i+1}, ..., x_n, with one Delta x
100 times smaller than the rest.)
The discretized system
[ tridiagonal ] [ u_1; ...; u_n ] = [ f(x_1); ...; f(x_n) ]
is still tridiagonal, but the rows adjacent to the small element carry entries scaled by
the factor 100 (diagonal entries like 1 + 100, off-diagonals -100), so
lambda_max / lambda_min > 100.

(Figure: || r^k || / || r^0 || versus iteration - GCR convergence stagnates for this badly
conditioned system.)
SMA-HPC 2003 MIT

Heat-Conducting Bar Example: Preconditioned Matrix Eigenvalues
(Figure: after diagonal preconditioning, the eigenvalues cluster with a few outliers.)
The residual-minimizing Krylov-subspace algorithm can eliminate outlying eigenvalues by
placing polynomial zeros directly on them.

Matrix-solve cost comparison (m nodes per grid side):
            Dense LU    Sparse LU    GCR
1-D grid    O(m^3)      O(m)         O(m^2)
2-D grid    O(m^6)      O(m^3)       O(m^3)
3-D grid    O(m^9)      O(m^6)       O(m^4)
Krylov Methods: Preconditioning, Approximate LU Preconditioners
Let M ~ L U    (an approximate factorization)
Apply GCR to  ( (L U)^{-1} M ) x = (L U)^{-1} b.
Computing ( (L U)^{-1} M ) x is equivalent to solving L U y = M x:
one forward and one back substitution per iteration.
SMA-HPC 2003 MIT

Krylov Methods: Preconditioning, Approximate LU Preconditioners Continued
(Figures: eigenvalue clustering and improved convergence with approximate-LU
preconditioning.)

Summary
Reminder about GCR
Residual minimizing solution
Krylov Subspace
Polynomial Connection
Preconditioners
Diagonal Preconditioners
Approximate LU preconditioners
SMA-HPC 2003 MIT
Jacob White

Outline
Nonlinear Problems
Struts and Circuit Example
Newton's Method
Derivation of Newton
Quadratic Convergence
Examples
Global Convergence
Convergence Checks
Nonlinear Problems: Strut Example
(Figure: two struts, pinned at (x_0, y_0) and (x_1, y_1), joined at (x_2, y_2), carrying
a load force W.)
Need to solve:  f_x = 0,  f_y + W = 0    (force balance at the free joint)

Nonlinear Problems: Strut Constitutive Equation
A strut stretched from rest length L_0 to length L carries the force
f = ( E A_c / L_0 ) ( L_0 - L )
directed along the strut; with L = sqrt( x^2 + y^2 ),
f_x = f ( x / L ),   f_y = f ( y / L ).

Nonlinear Problems: Strut Example
L_1 = sqrt( (x_2 - x_0)^2 + (y_2 - y_0)^2 ),   L_2 = sqrt( (x_2 - x_1)^2 + (y_2 - y_1)^2 )
f_1x = H ( L_0 - L_1 ) ( x_2 - x_0 ) / L_1,   f_2x = H ( L_0 - L_2 ) ( x_2 - x_1 ) / L_2
(and similarly f_1y, f_2y), with H = E A_c / L_0. Force balance:
f_1x + f_2x = 0,   f_1y + f_2y + W = 0

Nonlinear Problems: Strut Example, Why Nonlinear?
H ( L_0 - L_1 ) ( y_2 - y_0 ) / L_1 + H ( L_0 - L_2 ) ( y_2 - y_1 ) / L_2 + W = 0
The lengths L_1, L_2 depend on (x_2, y_2) through square roots - the equations are nonlinear.

Nonlinear Problems: Circuit Example
(Figure: two-node circuit with a voltage source, 10-Ohm resistors, and a diode with
voltage V_d.)
Resistor:  I_r = V_r / 10        Diode:  I_d = I_s ( e^{ V_d / V_t } - 1 )
Need to solve the nodal current balances:
I_d - I_r = 0
I_vsrc - I_r = 0
- nonlinear because of the diode exponential.
Nonlinear Problems: Solve Iteratively
Solve f(x) = 0 iteratively:
    guess at a solution x^0
    repeat for k = 0, 1, 2, ...: generate x^{k+1} from x^k and f(x^k)
    until converged
Ask: Does the iteration converge to the correct solution?
     How fast does the iteration converge?

Richardson Iteration: Definition
x^{k+1} = x^k + f( x^k )
(Figure: at the solution x*, f(x*) = 0 and the iteration is stationary.)

Richardson Iteration: Example 1
f(x) = -0.7 x + 10.  Start with x^0 = 0:
x^1 = x^0 + f(x^0) = 10
x^2 = x^1 + f(x^1) = 13         x^6 ~ 14.27
x^3 = x^2 + f(x^2) = 13.9       x^7 ~ 14.28
x^4 = x^3 + f(x^3) = 14.17      x^8 ~ 14.28    <- Converged
(Figure: the iteration as the intersection of y = x with y = x + f(x) = 0.3 x + 10.)

Richardson Iteration: Example 2
f(x) = 2 x - 10.  Start with x^0 = 0:
x^1 = x^0 + f(x^0) = -10
x^2 = x^1 + f(x^1) = -40
x^3 = x^2 + f(x^2) = -130
x^4 = x^3 + f(x^3) = -400
No convergence!
Richardson Iteration: Convergence Setup
Iteration equation:  x^{k+1} = x^k + f( x^k )
Exact solution:      x*     = x*  + f( x* )    ( since f(x*) = 0 )
Computing differences:
x^{k+1} - x* = x^k - x* + f( x^k ) - f( x* )
Need to estimate f( x^k ) - f( x* ).

Richardson Iteration: Convergence, Mean Value Theorem
f(v) - f(y) = ( df/dx )( x~ ) ( v - y )   for some x~ in [ v, y ]

Richardson Iteration: Convergence, Using the MVT
x^{k+1} - x* = ( 1 + ( df/dx )( x~_k ) ) ( x^k - x* )

Richardson Iteration: Convergence, Richardson Theorem
If   | 1 + ( df/dx )( x ) | <= gamma < 1 whenever | x - x* | <= delta
And  | x^0 - x* | <= delta
Then | x^{k+1} - x* | <= gamma | x^k - x* |
Or   lim_{k -> inf} x^k = x*,  with  lim | x^{k+1} - x* | / | x^k - x* | = gamma
     <- Linear convergence

Richardson Iteration: Example 1 Revisited
(Figure: f(x) = -0.7 x + 10, for which 1 + df/dx = 0.3: the error shrinks by 0.3 per step.)

Richardson Iteration: Problems
Convergence requires | 1 + df/dx | < 1 near the solution - a restrictive condition.
Newton's Method: Another Approach
Linearize f about x^k:
f( x* ) ~ f( x^k ) + ( df/dx )( x^k ) ( x* - x^k )
Define the iteration:
Do k = 0 to ... {
    x^{k+1} = x^k - [ ( df/dx )( x^k ) ]^{-1} f( x^k ),
    if [ ( df/dx )( x^k ) ]^{-1} exists
} until convergence
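A minimal Python sketch of the iteration (our own; the function and its derivative are
passed in explicitly):

    # 1-D Newton iteration (sketch). f and df are the function and its derivative.
    def newton_1d(f, df, x0, tol=1e-10, kmax=50):
        x = x0
        for k in range(kmax):
            dx = -f(x) / df(x)          # solve df(x^k) (x^{k+1} - x^k) = -f(x^k)
            x += dx
            if abs(dx) < tol:           # convergence check on the update
                break
        return x

    # Example: f(x) = x^2 - 1 from x0 = 3; converges quadratically to x* = 1.
    print(newton_1d(lambda x: x * x - 1.0, lambda x: 2.0 * x, 3.0))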
Newton's Method: Graphically
(Figure: each step follows the tangent line at ( x^k, f(x^k) ) to its zero crossing.)

Newton's Method: Example
(Figure: the iterates marching rapidly into the solution.)
Newton's Method: Convergence
Taylor expansion with remainder, evaluated at the solution ( f(x*) = 0 ):
0 = f( x* ) = f( x^k ) + ( df/dx )( x^k ) ( x* - x^k ) + (1/2) ( d^2 f/dx^2 )( x~ ) ( x* - x^k )^2
for some x~ in [ x^k, x* ].
But by the Newton definition:
f( x^k ) + ( df/dx )( x^k ) ( x^{k+1} - x^k ) = 0

Newton's Method: Convergence Cont'd
Subtracting:
( df/dx )( x^k ) ( x^{k+1} - x* ) = (1/2) ( d^2 f/dx^2 )( x~ ) ( x^k - x* )^2
Dividing through:
x^{k+1} - x* = (1/2) [ ( df/dx )( x^k ) ]^{-1} ( d^2 f/dx^2 )( x~ ) ( x^k - x* )^2
Suppose | [ ( df/dx )( x ) ]^{-1} (1/2) ( d^2 f/dx^2 )( x ) | <= L for all x;
then  | x^{k+1} - x* | <= L | x^k - x* |^2
Newton's Method: Convergence Example 1
f(x) = x^2 - 1 = 0;  find x* = 1.   ( df/dx )( x^k ) = 2 x^k
Newton:  2 x^k ( x^{k+1} - x^k ) = -( ( x^k )^2 - 1 )
or  ( x^{k+1} - 1 ) = ( x^k - 1 )^2 / ( 2 x^k )
Convergence is quadratic.

Newton's Method: Convergence Example 2
f(x) = x^2 = 0;  x* = 0.   ( df/dx )( x^k ) = 2 x^k
Note: df/dx is not bounded away from zero.
Newton:  2 x^k ( x^{k+1} - x^k ) = -( x^k )^2,  for x^k != x* = 0
or  ( x^{k+1} - 0 ) = (1/2) ( x^k - 0 )
Convergence is linear.
Newton's Method: Convergence Examples 1, 2
(Figure: error versus iteration for both examples - quadratic versus linear decay.)

Newton's Method: Convergence
Suppose | [ ( df/dx )( x ) ]^{-1} (1/2) ( d^2 f/dx^2 )( x ) | <= L for all x.
If L | x^0 - x* | <= gamma < 1, then x^k converges to x*.
Proof:
| x^1 - x* | <= L | x^0 - x* | | x^0 - x* | <= gamma | x^0 - x* |
| x^2 - x* | <= L gamma | x^0 - x* | | x^1 - x* |,
or | x^2 - x* | <= gamma^2 | x^1 - x* | <= gamma^3 | x^0 - x* |
| x^3 - x* | <= gamma^4 | x^2 - x* | <= gamma^7 | x^0 - x* |,  etc.

Newton's Method: Convergence Theorem
If L is bounded ( df/dx bounded away from zero; d^2 f/dx^2 bounded ),
then Newton's method is guaranteed to converge given a "close enough" guess.
Always converges?
(Figure: an f(x) for which a poor initial guess makes the iterates x^1, ... oscillate -
convergence depends on a good initial guess.)
Newton's Method: Convergence Checks
Declare convergence when both
| x^{k+1} - x^k | < epsilon_xa + epsilon_xr | x^{k+1} |
| f( x^{k+1} ) | < epsilon_fa
(Figure: checking only | x^{k+1} - x^k | can stop far from x* when f is steep, and
checking only | f | can stop far from x* when f is flat - use both tests.)
SMA-HPC 2003 MIT
Summary
Nonlinear Problems
Struts and Circuit Example
Derivation of Newton
Quadratic Convergence
Examples
Global Convergence
Convergence Checks
Outline
Quick Review of 1-D Newton
Convergence Testing

1-D Reminder: Newton Idea
0 = f( x* ) = f( x ) + ( df/dx )( x ) ( x* - x ) + ...
=>  ( df/dx )( x ) ( x* - x ) ~ -f( x )

1-D Reminder: Newton Algorithm
x^0 = Initial Guess, k = 0
Repeat {
    Solve ( df/dx )( x^k ) ( x^{k+1} - x^k ) = -f( x^k )
    k = k + 1
} Until ?   | x^{k+1} - x^k | < threshold ?   | f( x^{k+1} ) | < threshold ?
SMA-HPC 2003 MIT

1-D Reminder: Newton Algorithm, Algorithm Picture
(Figure: tangent-line steps.)

1-D Reminder: Convergence Checks
| x^{k+1} - x^k | < epsilon_a + epsilon_r | x^{k+1} |   and   | f( x^{k+1} ) | < epsilon_fa
(Figure: the steep-f and flat-f failure modes of the individual tests.)
SMA-HPC 2003 MIT

1-D Reminder: Local Convergence
Convergence depends on a good initial guess.
(Figure: oscillating iterates from a poor starting point.)
Multidimensional Newton Method: Example Problem
Strut:  l = sqrt( x^2 + y^2 ),   F = ( E A_c / l_o ) ( l_o - l )
f_x = F ( x / l ),   f_y = F ( y / l )
Equations, with applied loads F_Lx, F_Ly:
F(x) =  f_x + F_Lx = 0,   f_y + F_Ly = 0
OR      ( x / l ) ( E A_c / l_o ) ( l_o - l ) + F_Lx = 0
        ( y / l ) ( E A_c / l_o ) ( l_o - l ) + F_Ly = 0

Multidimensional Newton Method: Example Problem, Nonlinear Resistors
(Figure: two-node nodal-analysis circuit; branch currents i_1, i_2, i_3 through nonlinear
resistors with i = g(v); branch voltages v_1b = v_1, v_2b = v_1 - v_2, v_3b = v_2.)
At Node 1:  i_1 + i_2 = 0   =>   g( v_1 ) + g( v_1 - v_2 ) = 0
At Node 2:  i_3 - i_2 = 0   =>   g( v_2 ) - g( v_1 - v_2 ) = 0
Two coupled nonlinear equations in two unknowns.
Multidimensional Newton Method: General Setting
x* in R^N and F: R^N -> R^N.
F( x* ) = F( x ) + J_F( x ) ( x* - x ) + H.O.T.     <- J_F: the Jacobian matrix
If x is close to the exact solution:
J_F( x ) ( x* - x ) ~ -F( x )
SMA-HPC 2003 MIT

Multidimensional Newton Method: Strut Example
For the strut equations
( x / l ) ( l_o - l ) + F_Lx = 0,   ( y / l ) ( l_o - l ) + F_Ly = 0:
J_F( x ) = [ ?  ? ; ?  ? ]    <- what are the four entries?
SMA-HPC 2003 MIT

Multidimensional Newton Method: Nodal Analysis, Nonlinear Resistor
F_1( v ) = g( v_1 ) + g( v_1 - v_2 ) = 0    ( Node 1: i_1 + i_2 = 0 )
F_2( v ) = g( v_2 ) - g( v_1 - v_2 ) = 0    ( Node 2: i_3 - i_2 = 0 )
J_F( v ) = [ ?  ? ; ?  ? ]    <- what are the four entries?
Multidimensional Newton Method: Jacobian Matrix
J_F( x ) Delta x ~ F( x + Delta x ) - F( x ),  where

           [ dF_1(x)/dx_1  ...  dF_1(x)/dx_N ]
J_F( x ) = [     ...                ...      ]
           [ dF_N(x)/dx_1  ...  dF_N(x)/dx_N ]

Multidimensional Newton Method: Jacobian Matrix, Singular Case
Suppose J_F( x ) is singular? Then the Newton update J_F( x ) Delta x = -F( x ) has no
unique solution and the iteration breaks down (more on this later).

Multidimensional Newton Method: Newton Algorithm
x^0 = Initial Guess, k = 0
Repeat {
    Compute F( x^k ), J_F( x^k )
    Solve J_F( x^k ) ( x^{k+1} - x^k ) = -F( x^k ) for x^{k+1}
    k = k + 1
} Until || x^{k+1} - x^k ||, || F( x^{k+1} ) || small enough
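The same loop in numpy for a general system F(x) = 0 (a sketch of ours; the two-node
resistor example is completed with an illustrative constitutive law g(v) = v^3 + v and a
1 A source injected at node 1 - both our choices, not from the slides):

    import numpy as np

    def newton(F, JF, x0, tol=1e-10, kmax=50):
        """Multidimensional Newton (sketch of the slide's loop)."""
        x = x0.copy()
        for k in range(kmax):
            dx = np.linalg.solve(JF(x), -F(x))   # J_F(x^k) dx = -F(x^k)
            x += dx
            if np.linalg.norm(dx) < tol and np.linalg.norm(F(x)) < tol:
                break
        return x

    g  = lambda v: v**3 + v                      # illustrative nonlinear resistor
    dg = lambda v: 3 * v**2 + 1
    F  = lambda v: np.array([g(v[0]) + g(v[0] - v[1]) - 1.0,   # node 1 (1 A source)
                             g(v[1]) - g(v[0] - v[1])])        # node 2
    JF = lambda v: np.array([[dg(v[0]) + dg(v[0] - v[1]), -dg(v[0] - v[1])],
                             [-dg(v[0] - v[1]), dg(v[1]) + dg(v[0] - v[1])]])
    print(newton(F, JF, np.zeros(2)))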
Multidimensional Newton Method: Jacobian Entries for the Nodal Example
dF_n1( v ) / dv_n1 = dg( v_n1 )/dv + dg( v_n1 - v_n2 )/dv
dF_n1( v ) / dv_n2 = -dg( v_n1 - v_n2 )/dv

Multidimensional Newton Method: Stamping a Resistor
A nonlinear resistor between nodes n1 and n2 with current g( v_n1 - v_n2 ) contributes
to F( v ):    +g( v_n1 - v_n2 ) at row n1,   -g( v_n1 - v_n2 ) at row n2
to J_F( v ):  +dg/dv at (n1, n1) and (n2, n2),   -dg/dv at (n1, n2) and (n2, n1)

Multidimensional Newton Method: Newton Loop with Stamping
Compute F( x^k ), J_F( x^k ):
    Zero J_F and F
    For each element: compute element currents and derivatives;
    sum currents into F, sum derivatives into J_F
Solve J_F( x^k ) ( x^{k+1} - x^k ) = -F( x^k ) for x^{k+1}
k = k + 1
} Until || x^{k+1} - x^k ||, || F( x^{k+1} ) || small enough
SMA-HPC 2003 MIT
Multidimensional Newton's Method: Heat-Flow Example
(Figure: the heat-conducting bar, end temperatures set by sources vs = T(0) and vs = T(1),
now with a temperature-dependent (nonlinear) heat-loss element i_h at each node.)
Multidimensional Newton Method: Multidimensional Convergence Theorem
Main Theorem. If
a) || J_F^{-1}( x^k ) || <= beta              ( Inverse is bounded )
b) || J_F( x ) - J_F( y ) || <= A || x - y || ( Derivative is Lipschitz continuous )
then Newton's method converges - locally, and quadratically.

Multidimensional Newton Method: Key Lemma
If || J_F( x ) - J_F( y ) || <= A || x - y ||
Then || F( x ) - F( y ) - J_F( y )( x - y ) || <= (A/2) || x - y ||^2

Multidimensional Newton Method: Theorem Proof
x^{k+1} - x^k = -J_F^{-1}( x^k ) F( x^k )   =>   || x^{k+1} - x^k || <= beta || F( x^k ) ||
By the Newton definition, F( x^{k-1} ) + J_F( x^{k-1} )( x^k - x^{k-1} ) = 0, so
F( x^k ) = F( x^k ) - F( x^{k-1} ) - J_F( x^{k-1} )( x^k - x^{k-1} )
Finally, using the Lemma:
|| x^{k+1} - x^k || <= ( beta A / 2 ) || x^k - x^{k-1} ||^2
SMA-HPC 2003 MIT

Multidimensional Newton Method: Theorem Proof Continued
If ( beta A / 2 ) || x^1 - x^0 || = gamma < 1, then || x^{k+1} - x^k || shrinks at least
geometrically, so sum_k || x^{k+1} - x^k || converges and therefore x^k converges.
Non-converging Case: 1-D Picture
(Figure: an f(x) whose Newton iterates jump back and forth without converging.)

Newton Method with Limiting: Newton Algorithm
Repeat {
    Compute F( x^k ), J_F( x^k )
    Solve J_F( x^k ) Delta x^{k+1} = -F( x^k ) for Delta x^{k+1}
    x^{k+1} = x^k + limited( Delta x^{k+1} )
    k = k + 1
} Until || Delta x^{k+1} ||, || F( x^{k+1} ) || small enough
SMA-HPC 2003 MIT

Newton Method with Limiting: Limiting Methods
Direction corrupting (clamp each component):
    limited( Delta x^{k+1} )_i = sign( Delta x_i^{k+1} ) min( | Delta x_i^{k+1} |, delta )
Non-corrupting (scale the whole vector):
    limited( Delta x^{k+1} ) = kappa Delta x^{k+1},
    kappa = min( 1, delta / || Delta x^{k+1} || )
Newton Method with Limiting: Damped Newton Scheme, General Damping Scheme
Solve J_F( x^k ) Delta x^{k+1} = -F( x^k ) for Delta x^{k+1}
x^{k+1} = x^k + alpha^k Delta x^{k+1}
Pick alpha^k to minimize || F( x^k + alpha^k Delta x^{k+1} ) ||_2^2:
d/dalpha || F( x^k + alpha Delta x^{k+1} ) ||_2^2
    = 2 F( x^k + alpha Delta x^{k+1} )^T J_F( x^k + alpha Delta x^{k+1} ) Delta x^{k+1}

Newton Method with Limiting: Damped Newton Convergence Theorem
If
a) || J_F^{-1}( x^k ) || <= beta              ( Inverse is bounded )
b) || J_F( x ) - J_F( y ) || <= A || x - y ||
Then there exists a set of alpha^k's in (0, 1] such that
|| F( x^{k+1} ) || = || F( x^k + alpha^k Delta x^{k+1} ) || < gamma || F( x^k ) ||
with gamma < 1.
Newton Method with Limiting: Damped Newton, Nested Iteration
x^0 = Initial Guess, k = 0
Repeat {
    Compute F( x^k ), J_F( x^k )
    Solve J_F( x^k ) Delta x^{k+1} = -F( x^k ) for Delta x^{k+1}
    Find alpha^k in (0, 1] such that || F( x^k + alpha^k Delta x^{k+1} ) || is minimized
    x^{k+1} = x^k + alpha^k Delta x^{k+1}
    k = k + 1
} Until || Delta x^{k+1} ||, || F( x^{k+1} ) || small enough
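A sketch with a simple backtracking choice of alpha^k (our own simplification: rather
than exactly minimizing || F ||, halve alpha until || F || decreases):

    import numpy as np

    def damped_newton(F, JF, x0, tol=1e-10, kmax=100):
        """Damped Newton (sketch): backtrack alpha until ||F|| decreases."""
        x = x0.copy()
        for k in range(kmax):
            dx = np.linalg.solve(JF(x), -F(x))
            alpha, f0 = 1.0, np.linalg.norm(F(x))
            while np.linalg.norm(F(x + alpha * dx)) >= f0 and alpha > 1e-8:
                alpha *= 0.5              # crude stand-in for the 1-D minimization
            x += alpha * dx
            if np.linalg.norm(F(x)) < tol:
                break
        return x

    # Scalar diode-style example where full Newton steps overshoot badly:
    F  = lambda v: np.array([1e-16 * (np.exp(v[0] / 0.025) - 1) + (v[0] - 1.0) / 10])
    JF = lambda v: np.array([[1e-16 / 0.025 * np.exp(v[0] / 0.025) + 0.1]])
    print(damped_newton(F, JF, np.array([0.0])))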
Newton Method with Limiting: Damped Newton
(Figure: damped steps on the previously non-converging f(x); alpha < 1 pulls the iterate
back inside the region where the Newton model is trustworthy.)
Summary
Quick Review of 1-D Newton
Convergence Testing
Basic Algorithm
Description of the Jacobian.
Jacobian Construction.
Local Convergence Theorem
Outline
Damped Newton Schemes
Globally Convergent if Jacobian is Nonsingular
Difficulty with Singular Jacobians
Multidimensional Newton Method: Newton Algorithm (recap)
Repeat { Compute F( x^k ), J_F( x^k );
Solve J_F( x^k )( x^{k+1} - x^k ) = -F( x^k ); k = k + 1 }
Until || x^{k+1} - x^k ||, || F( x^{k+1} ) || small enough.
SMA-HPC 2003 MIT

Multidimensional Convergence Theorem (recap)
If a) || J_F^{-1}( x^k ) || <= beta ( inverse is bounded ) and
b) || J_F( x ) - J_F( y ) || <= A || x - y || ( derivative is Lipschitz continuous ),
then Newton converges quadratically from a close enough initial guess.
Implications: convergence is only local.

Non-converging Case: 1-D Picture (recap)
(Figure: oscillating iterates.)

Newton Method with Limiting (recap)
x^{k+1} = x^k + limited( Delta x^{k+1} ), with the direction-corrupting or non-corrupting
limiters above; or the damped update x^{k+1} = x^k + alpha^k Delta x^{k+1}, with alpha^k
chosen to minimize || F( x^k + alpha Delta x^{k+1} ) ||_2^2.
SMA-HPC 2003 MIT

Newton Method with Limiting: Damped Newton Convergence Theorem
If
a) || J_F^{-1}( x^k ) || <= beta              ( Inverse is bounded )
b) || J_F( x ) - J_F( y ) || <= A || x - y || ( Derivative is Lipschitz continuous )
Then there exists a set of alpha^k's in (0, 1] such that
|| F( x^{k+1} ) || < gamma || F( x^k ) || with gamma < 1:
damped Newton is globally convergent when the Jacobian is nonsingular.

Damped Newton: Nested Iteration (recap)
Solve J_F( x^k ) Delta x^{k+1} = -F( x^k ); find alpha^k in (0, 1] minimizing
|| F( x^k + alpha^k Delta x^{k+1} ) ||; update and repeat until small enough.
Newton Method with Limiting: Damped Newton Example
(Figure: voltage source, 10-Ohm resistor, diode; node voltages v_1, v_2.)
I_r - (1/10) V_r = 0,   I_d = I_s ( e^{ V_d / V_t } - 1 )
Eliminating currents gives the scalar equation
f( v_2 ) = 10^{-16} ( e^{ v_2 / 0.025 } - 1 ) + ( v_2 - 1 ) / 10 = 0

Newton Method with Limiting: Damped Newton Example Cont.
(Figure: f( v_2 ) - essentially linear until v_2 approaches the diode turn-on voltage,
then exponentially exploding; undamped Newton overshoots wildly, damped Newton converges.)

Damped Newton: Nested Iteration (recap)
Solve J_F( x^k ) Delta x^{k+1} = -F( x^k ); find alpha^k in (0, 1] minimizing
|| F( x^k + alpha^k Delta x^{k+1} ) ||; update; repeat until small enough.
Newton Method with Limiting: Damped Newton Theorem Proof
Delta x^{k+1} = -J_F^{-1}( x^k ) F( x^k )      <- Newton direction
Lemma:  || F( x ) - F( y ) - J_F( y )( x - y ) || <= (A/2) || x - y ||^2
Combining, for a step of size alpha:
|| F( x^k + alpha Delta x^{k+1} ) ||
    <= || F( x^k ) + alpha J_F( x^k ) Delta x^{k+1} || + (A/2) alpha^2 || Delta x^{k+1} ||^2
    <= ( 1 - alpha ) || F( x^k ) || + ( A beta^2 / 2 ) alpha^2 || F( x^k ) ||^2

Newton Method with Limiting: Damped Newton Theorem Proof, Cont.
|| F( x^{k+1} ) || <= [ 1 - alpha^k + ( alpha^k )^2 ( A beta^2 / 2 ) || F( x^k ) || ] || F( x^k ) ||
Two cases:
1) If ( A beta^2 / 2 ) || F( x^k ) || < 1/2:  pick alpha^k = 1; the bracket is below 1/2.
2) If || F( x^k ) || > 1 / ( A beta^2 ):  pick alpha^k = 1 / ( A beta^2 || F( x^k ) || ) < 1;
   the bracket becomes 1 - alpha^k + alpha^k / 2 = 1 - alpha^k / 2 < 1.

Newton Method with Limiting: Damped Newton Theorem Proof, Cont. II
So || F( x^{k+1} ) || <= gamma^k || F( x^k ) || with gamma^k < 1, and in the second case
gamma^k <= 1 - 1 / ( 2 A beta^2 || F( x^k ) || ) <= 1 - 1 / ( 2 A beta^2 || F( x^0 ) || )
since || F || is decreasing - the reduction factor is bounded away from 1.
Damped Newton: Nested Iteration (recap)
Solve J_F( x^k ) Delta x^{k+1} = -F( x^k ); find alpha^k in (0, 1] minimizing
|| F( x^k + alpha^k Delta x^{k+1} ) ||; x^{k+1} = x^k + alpha^k Delta x^{k+1}; repeat.
(Figure: the damped steps on the hard 1-D example.)
Continuation Schemes: Source or Load-Stepping, Basic Concepts
General setting: solve F( x(lambda), lambda ) = 0 where
a) F( x(0), 0 ) = 0 is easy to solve       <- Starts the continuation
b) F( x(1), 1 ) = F( x(1) ) = 0 is the problem we actually want to solve
(Figure: the solution path x(lambda) from lambda = 0 to lambda = 1; disconnected or
jumping paths are disallowed.)
SMA-HPC 2003 MIT

Continuation Schemes: Template Algorithm
Solve F( x(0), 0 ),  x(prev) = x(0)
delta = 0.01,  lambda = delta
While lambda < 1 {
    x^0( lambda ) = x(prev)
    Try to solve F( x(lambda), lambda ) = 0 with Newton
    If Newton converged:  x(prev) = x(lambda);  lambda = lambda + delta;  delta = 2 delta
    Else:  delta = delta / 2;  lambda = lambda_prev + delta
}
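A Python sketch of the template (ours, with slightly simplified step bookkeeping;
newton_solve stands for any Newton solver that returns the solution or raises on
non-convergence - here a tiny finite-difference Newton for the scalar demo):

    # Continuation template (sketch). newton_solve(f, x0) returns the root or raises.
    def continuation(F, newton_solve, x0, dlam0=0.01):
        x_prev, lam_prev = newton_solve(lambda x: F(x, 0.0), x0), 0.0
        dlam = dlam0
        while lam_prev < 1.0:
            lam = min(lam_prev + dlam, 1.0)
            try:
                x_prev = newton_solve(lambda x: F(x, lam), x_prev)  # warm start
                lam_prev, dlam = lam, 2 * dlam                      # success: grow step
            except RuntimeError:
                dlam = 0.5 * dlam                                   # failure: shrink step
        return x_prev

    def newton_solve(f, x0):
        x = x0
        for _ in range(50):
            h = 1e-7
            df = (f(x + h) - f(x)) / h          # finite-difference derivative
            x = x - f(x) / df
            if abs(f(x)) < 1e-10:
                return x
        raise RuntimeError("Newton failed")

    # Demo: source-stepping f(x, lam) = x^3 + x - 10*lam; final root is x = 2.
    print(continuation(lambda x, lam: x**3 + x - 10 * lam, newton_solve, 0.0))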
Continuation Schemes: Basic Concepts, Diode Example
(Figure: source V_s, resistor R, diode.)
Source stepping:  f( v(lambda), lambda ) = i_diode( v ) + (1/R)( v - lambda V_s ) = 0
The Jacobian  df/dv = G + 1/R  is not lambda-dependent!
Strut example (load stepping):  f_x( x, y ) = 0,   f_y( x, y ) + lambda f_L = 0

Continuation Schemes: Description
A general homotopy:
F~( x(lambda), lambda ) = lambda F( x(lambda) ) + ( 1 - lambda ) x(lambda)
Observations:
lambda = 0:  F~( x(0), 0 ) = x(0) = 0,   dF~/dx = I
lambda = 1:  F~( x(1), 1 ) = F( x(1) ),  dF~/dx = dF/dx

Continuation Schemes: Basic Algorithm
Same template as above, but with a better start for each Newton solve:
x^0( lambda ) = x(prev) + ?
(Figure: the simple predictor x^0( lambda + delta ) = x(lambda) just reuses the previous
solution.)

Continuation Schemes: Update Improvement
F( x(lambda + delta), lambda + delta ) ~ F( x(lambda), lambda )
    + [ dF/dx ] ( x(lambda + delta) - x(lambda) ) + [ dF/dlambda ] delta = 0
=>  x^0( lambda + delta ) = x(lambda) - [ dF/dx ]^{-1} [ dF/dlambda ] delta
    <- Better guess for the next step's Newton; dF/dx is on hand from the last step's Newton
If F~( x, lambda ) = lambda F( x ) + ( 1 - lambda ) x, then
dF~/dlambda = F( x ) - x(lambda)    <- Easily computed
SMA-HPC 2003 MIT

Continuation Schemes: Graphically
(Figure: the tangent predictor x^0( lambda + delta ) follows the slope of the path
x(lambda).)

Continuation Schemes: Arc-length versus Lambda Steps
(Figure: a folding solution path x(lambda); plain lambda steps fail at the fold, while
arc-length steps follow it around.)
Continuation Schemes: Arc-length Steps?
Treat lambda as an unknown and add an arc-length constraint:
|| x(lambda) - x(lambda_prev) ||^2 + ( lambda - lambda_prev )^2 = arc^2

Continuation Schemes: Arc-length Newton System
The Newton iteration solves the bordered system
[ dF( x^k, lambda^k )/dx            dF( x^k, lambda^k )/dlambda ] [ x^{k+1} - x^k           ]
[ 2 ( x^k - x(lambda_prev) )^T      2 ( lambda^k - lambda_prev ) ] [ lambda^{k+1} - lambda^k ]
  = - [ F( x^k, lambda^k ) ;
        || x^k - x(lambda_prev) ||^2 + ( lambda^k - lambda_prev )^2 - arc^2 ]
Note: at a fold the upper left-hand block dF/dx is singular,
but the bordered matrix remains nonsingular.
Summary
Damped Newton Schemes
Globally Convergent if Jacobian is Nonsingular
Difficulty with Singular Jacobians
Improving Efficiency
Better first guess for each continuation step
Arc-length Continuation
SMA-HPC 2003 MIT
Outline
Image Segmentation Example
Large nonlinear system of equations
Formulation? Continuation? Linear Solver?
Arc-Length Continuation
SMA-HPC 2003 MIT
Simple Smoother
(Figure: input image -> nonlinear smoother circuit -> smoothed output.)

Nonlinear Smoother: Circuit Diagram
(Figure: a resistor-grid circuit, one node per pixel, with nonlinear resistors between
neighboring nodes, i = f( Delta v ).)

Nonlinear Smoother: Nonlinear Resistor Constitutive Equation
(Figure: current versus voltage - nearly linear for small Delta v, saturating for large
Delta v so that sharp edges are not smoothed away; the parameter beta sets where
saturation begins. Second figure: the curve for several values of beta.)
SMA-HPC 2003 MIT
Questions
What equation formulation? Node-Branch or Nodal?

Newton-Iterative Method: Basic Algorithm, Nested Iteration
x^0 = Initial Guess, k = 0
Repeat {
    Compute F( x^k ), J_F( x^k )
    Solve J_F( x^k ) Delta x^{k+1} = -F( x^k ) approximately for Delta x^{k+1}
    x^{k+1} = x^k + Delta x^{k+1}
    k = k + 1
} Until || Delta x^{k+1} ||, || F( x^{k+1} ) || small enough

Newton-Iterative Method: Basic Algorithm
Delta x^{k+1,l} = the Newton delta from l GCR steps:
J_F( x^k ) Delta x^{k+1,l} = -F( x^k ) + r^{k,l}    <- r^{k,l}: the GCR residual
If
a) || J_F^{-1}( x^k ) || <= beta              ( Inverse is bounded )
b) || J_F( x ) - J_F( y ) || <= A || x - y || ( Derivative is Lipschitz continuous )
c) || r^{k,l} || <= C || F( x^k ) ||^2        ( More accurate near convergence )
Then the Newton-Iterative method converges quadratically.
SMA-HPC 2003 MIT
Newton-Iterative Method: Convergence Proof
J_F( x^k ) Delta x^{k+1} = -F( x^k ) + r^{k,l}
Combining with the Lemma:
|| F( x^{k+1} ) || <= || F( x^k ) + J_F( x^k ) Delta x^{k+1} || + (A/2) || Delta x^{k+1} ||^2
                   =  || r^{k,l} || + (A/2) || J_F^{-1}( x^k ) ( F( x^k ) - r^{k,l} ) ||^2
                  <=  || r^{k,l} || + ( A beta^2 / 2 ) ( || F( x^k ) || + || r^{k,l} || )^2

Newton-Iterative Method: Convergence Proof Cont. II
Using || r^{k,l} || <= C || F( x^k ) ||^2:
|| F( x^{k+1} ) || <= C || F( x^k ) ||^2
                     + ( A beta^2 / 2 ) ( 1 + C || F( x^k ) || )^2 || F( x^k ) ||^2
The factor multiplying || F( x^k ) ||^2 is easily bounded, so convergence is quadratic.
Easily Bounded
Newton-Iterative
Method
Matrix-Free Idea
J F x k 'x k 1
F xk
JF x
p |
F x k H pl F x k
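The matrix-free Jacobian-vector product is one line (a sketch of ours; choosing epsilon
well is a real issue in practice and is simply hard-coded here):

    import numpy as np

    def jacvec(F, x, p, eps=1e-7):
        """Matrix-free approximation J_F(x) p ~ (F(x + eps*p) - F(x)) / eps."""
        return (F(x + eps * p) - F(x)) / eps

    # Check against the analytic Jacobian of F(x) = [x0^2 + x1, x0*x1]:
    F = lambda x: np.array([x[0]**2 + x[1], x[0] * x[1]])
    x, p = np.array([1.0, 2.0]), np.array([0.3, -0.4])
    J = np.array([[2.0, 1.0], [2.0, 1.0]])      # J_F at x = (1, 2)
    print(jacvec(F, x, p), J @ p)               # approximately equal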
Gerschgorin Circle Theorem: Theorem Statement
Given a matrix M = [ m_{i,j} ], every eigenvalue lambda satisfies, for some i,
| lambda - m_{i,i} | <= sum_{j != i} | m_{i,j} |

Gerschgorin Circle Theorem: Picture
(Figure: in the complex plane, the i-th circle has center m_{i,i} and radius
sum_{j != i} | m_{i,j} |; all eigenvalues lie in the union of the circles.)
SMA-HPC 2003 MIT

Gerschgorin Circle Theorem: Leaky Resistor Line, Nodal Matrix
tridiag( -1, 2.1, -1 ):  circle centers 2.1, radii at most 2,
so every eigenvalue lies in [0.1, 4.1] - bounded away from zero.
SMA-HPC 2003 MIT

Gerschgorin Circle Theorem: Resistor Line, Nodal Matrix
tridiag( -1, 2, -1 ):  circle centers 2, radii at most 2,
so the eigenvalues lie in [0, 4] - zero is not excluded.
Continuation Schemes: Basic Concepts (recap), General Setting
Solve F( x(lambda), lambda ) = 0 where
a) F( x(0), 0 ) = 0 is easy to solve
b) F( x(1), 1 ) = F( x(1) )
c) x(lambda) is sufficiently smooth    <- Hard to insure!
(Figure: a disallowed, discontinuous solution path x(lambda).)
SMA-HPC 2003 MIT

Continuation Schemes: Template Algorithm (recap)
Solve F( x(0), 0 ), x(prev) = x(0); delta = 0.01, lambda = delta.
While lambda < 1: warm-start Newton from x^0( lambda ) = x(prev), or from the tangent
predictor x^0( lambda ) = x(prev) - [ dF/dx ]^{-1} [ dF/dlambda ] delta;
double delta on success, halve it on failure.

Continuation Schemes: Description (recap)
F~( x(lambda), lambda ) = lambda F( x(lambda) ) + ( 1 - lambda ) x(lambda);
at lambda = 0, F~ = x and dF~/dx = I; at lambda = 1, F~ = F and dF~/dx = dF/dx.

Continuation Schemes: Arc-length Steps (recap)
(Figure: at a fold the algorithm must switch back to increasing lambda after rounding it.)
Arc-length constraint:
|| x(lambda) - x(lambda_prev) ||^2 + ( lambda - lambda_prev )^2 = arc^2,
solved together with F = 0 by the bordered Newton system; the upper left-hand block
dF/dx may be singular at the fold, but the bordered matrix is not.
Summary
Image Segmentation Example
Large nonlinear system of equations
Examined issues in selecting numerical
methods
Arc-Length Continuation
SMA-HPC 2003 MIT
Outline
Initial Value problem examples
Signal propagation (circuits with capacitors).
Space frame dynamics (struts and masses).
Chemical reaction dynamics.
Investigate the simple finite-difference methods
Forward-Euler, Backward-Euler, Trap Rule.
Look at the approximations and algorithms
Examine properties experimentally.
Analyze Convergence for Forward-Euler
Application
Problems
Signal Transmission in an
Integrated Circuit
Signal Wire
Wire has resistance
Wire and ground plane form a capacitor
Logic
Gate
Ground Plane
Metal Wires carry signals from gate to gate.
How long is the signal delayed?
SMA-HPC 2003 MIT
Logic
Gate
Application
Problems
Signal Transmission in an
Integrated Circuit
Circuit Model
resistor
capacitor
Application
Problems
Oscillations in a Space
Frame
Application
Problems
Oscillations in a Space
Frame
Simplified Structure
Bolts
Struts
Ground
Load
Application
Problems
Oscillations in a Space
Frame
Modeling with Struts, Joints and
Point Masses
Point Mass
Strut
Application
Problems
Chemical Reaction
Dynamics
Crucible
Reagent
Strange green
stuff
How fast is product produced?
Does it explode?
SMA-HPC 2003 MIT
Application Problems: Signal Transmission in an Integrated Circuit, A 2x2 Example
(Figure: two-node RC line - capacitors C1, C2 to ground, resistors R1, R3 to ground,
R2 between the nodes; currents i_C1, i_R1, i_R2, i_R3, i_C2.)
Constitutive equations:
i_c = C dv_c/dt,   i_R = (1/R) v_R
Conservation laws (current balance at each node):
[ C1  0  ] d/dt [ v1 ]   [ -( 1/R1 + 1/R2 )       1/R2        ] [ v1 ]
[ 0   C2 ]      [ v2 ] = [       1/R2        -( 1/R3 + 1/R2 ) ] [ v2 ]

Application Problems: A 2x2 Example, Numbers
Let C1 = C2 = 1, R1 = R3 = 10, R2 = 1:
dx/dt = [ -1.1   1.0 ]
        [  1.0  -1.1 ] x
Eigenvalues -0.1 and -2.1; eigenvectors [1; 1] and [1; -1].
SMA-HPC 2003 MIT
An Aside on Eigenanalysis
Consider an ODE:  dx(t)/dt = A x(t),   x(0) = x_0
Eigendecomposition:
A = E Lambda E^{-1},   E = [ E_1  E_2 ... E_n ],   Lambda = diag( lambda_1, ..., lambda_n )
Change of variables:  x(t) = E y(t)   <=>   y(t) = E^{-1} x(t)
Substituting:  d( E y(t) )/dt = A E y(t),   E y(0) = x_0
Multiply by E^{-1}:  dy(t)/dt = E^{-1} A E y(t) = Lambda y(t)    <- Decoupled equations!
Decoupling:  dy_i(t)/dt = lambda_i y_i(t)   =>   y_i(t) = e^{ lambda_i t } y_i(0)
Steps for solving dx(t)/dt = A x(t), x(0) = x_0:
1) Determine E, Lambda
2) Compute y(0) = E^{-1} x_0
3) Compute y(t) = diag( e^{ lambda_1 t }, ..., e^{ lambda_n t } ) y(0)
4) x(t) = E y(t)
SMA-HPC 2003 MIT
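The four steps translate directly to numpy (a sketch of ours, using the 2x2 circuit
example from the previous slide):

    import numpy as np

    # Solve dx/dt = A x by eigendecomposition (the four steps above).
    A = np.array([[-1.1, 1.0], [1.0, -1.1]])
    x0 = np.array([1.0, 0.0])                  # v1(0) = 1, v2(0) = 0

    lam, E = np.linalg.eig(A)                  # step 1: E and the eigenvalues
    y0 = np.linalg.solve(E, x0)                # step 2: y(0) = E^{-1} x0
    t = 1.5
    y = np.exp(lam * t) * y0                   # step 3: y(t) = diag(e^{lam t}) y(0)
    x = E @ y                                  # step 4: x(t) = E y(t)
    print(x)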
Application Problems: Signal Transmission, A 2x2 Example
(Figure: the response from the initial condition v1(0) = 1, v2(0) = 0 - a fast transient
on the -2.1 mode, then slow decay on the -0.1 mode.)
Application Problems: Oscillating Strut and Mass, A 2x2 Example
y = y_0 + u    (rest position plus displacement)
Conservation law:  f_s + f_m = 0
f_m = M d^2u/dt^2,    f_s = E A_c ( y - y_0 ) / y_0 = ( E A_c / y_0 ) u
First-order form:
[ M  0 ] d/dt [ v ]   [ 0   -E A_c / y_0 ] [ v ]
[ 0  1 ]      [ u ] = [ 1        0       ] [ u ]
Let M = 1, E A_c / y_0 = 1:
dx/dt = [ 0  -1 ]
        [ 1   0 ] x
Eigenvalues +i and -i.
(Figure: v(0) = 1, u(0) = 0 - an undamped oscillation over t in [0, 15].)
SMA-HPC 2003 MIT
Application Problems: Chemical Reaction Example, A 2x2 Example
(Figure: a crucible of reagent producing strange green stuff - temperature T and
product R. How fast is product produced? Does it explode?)
d/dt [ T ]   [ -1   1 ] [ T ]
     [ R ] = [  4  -1 ] [ R ]
Eigenvalues +1 and -3: an unstable system.
(Figure: from T(0) = 1, R(0) = 0 the solution grows to about 12 by t = 2.5.)
SMA-HPC 2003 MIT
Finite Difference Methods: Basic Concepts
First, discretize time:  t_1, t_2, t_3, ..., t_L,  with t_L = T.
Second, represent the waveform by the discrete values x^1, x^2, ..., x^L:
x^l ~ x( t_l )    <- approximate (discrete) solution versus exact solution
Third, approximate (d/dt) x( t_l ) using the x^l's.
Example:  (d/dt) x( t_l ) ~ ( x^l - x^{l-1} ) / Delta t   or   ( x^{l+1} - x^l ) / Delta t
SMA-HPC 2003 MIT

Finite Difference Methods: Forward Euler Approximation
(Figure: the chord slope ( x(t_{l+1}) - x(t_l) ) / Delta t versus the tangent slope
(d/dt) x( t_l ).)
( x( t_{l+1} ) - x( t_l ) ) / Delta t ~ (d/dt) x( t_l ) = A x( t_l )
or
x( t_{l+1} ) ~ x( t_l ) + Delta t A x( t_l )

Finite Difference Methods: Forward Euler Algorithm
x( t_1 ) ~ x^1 = x(0) + Delta t A x(0)
x( t_2 ) ~ x^2 = x^1 + Delta t A x^1
...
x( t_L ) ~ x^L = x^{L-1} + Delta t A x^{L-1}
(Figure: successive Euler steps Delta t A x^l.)
Finite Difference Methods: Backward Euler Approximation
(Figure: the chord slope matched to the tangent at the new time point t_{l+1}.)
( x( t_{l+1} ) - x( t_l ) ) / Delta t ~ (d/dt) x( t_{l+1} ) = A x( t_{l+1} )
or
x( t_{l+1} ) ~ x( t_l ) + Delta t A x( t_{l+1} )

Finite Difference Methods: Backward Euler Algorithm
x( t_2 ) ~ x^2 = [ I - Delta t A ]^{-1} x^1
...
x( t_L ) ~ x^L = [ I - Delta t A ]^{-1} x^{L-1}

Finite Difference Methods: Trapezoidal Rule
( x( t_{l+1} ) - x( t_l ) ) / Delta t ~ (1/2) [ (d/dt) x( t_{l+1} ) + (d/dt) x( t_l ) ]
                                      = (1/2) [ A x( t_{l+1} ) + A x( t_l ) ]
or
x( t_{l+1} ) ~ x( t_l ) + ( Delta t / 2 ) A ( x( t_{l+1} ) + x( t_l ) )
(Figure: the chord slope as the average of the tangent slopes at both ends.)
SMA-HPC 2003 MIT
Finite Difference Methods: Trapezoidal Rule Algorithm
x( t_1 ) ~ x^1 = x(0) + ( Delta t / 2 ) ( A x(0) + A x^1 ),  i.e.
[ I - ( Delta t / 2 ) A ] x^1 = [ I + ( Delta t / 2 ) A ] x(0)
x( t_2 ) ~ x^2 = [ I - ( Delta t / 2 ) A ]^{-1} [ I + ( Delta t / 2 ) A ] x^1
...
x( t_L ) ~ x^L = [ I - ( Delta t / 2 ) A ]^{-1} [ I + ( Delta t / 2 ) A ] x^{L-1}

Finite Difference Methods: Numerical Integration View
(d/dt) x(t) = A x(t)   =>   x( t_{l+1} ) = x( t_l ) + int_{t_l}^{t_{l+1}} A x(tau) dtau
FE:   approximate the integral by Delta t A x( t_l )
BE:   approximate it by Delta t A x( t_{l+1} )
Trap: approximate it by ( Delta t / 2 ) ( A x( t_l ) + A x( t_{l+1} ) )

Finite Difference Methods: Basic Concepts Summary
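For the linear problem dx/dt = A x, all three schemes are a few lines each (our own
sketch, of the kind used to produce the experiments shown next):

    import numpy as np

    def integrate(A, x0, dt, L, scheme="fe"):
        """Forward-Euler, Backward-Euler, or trapezoidal steps for dx/dt = A x."""
        I = np.eye(len(x0))
        x, xs = x0.copy(), [x0.copy()]
        for _ in range(L):
            if scheme == "fe":
                x = x + dt * (A @ x)
            elif scheme == "be":
                x = np.linalg.solve(I - dt * A, x)
            else:                                  # trapezoidal rule
                x = np.linalg.solve(I - 0.5 * dt * A, (I + 0.5 * dt * A) @ x)
            xs.append(x.copy())
        return np.array(xs)

    A = np.array([[0.0, -1.0], [1.0, 0.0]])        # the oscillator example
    traj = integrate(A, np.array([1.0, 0.0]), 0.1, 300, "trap")
    print(np.linalg.norm(traj[-1]))                # trap preserves the amplitude (~1)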
Finite Difference Methods: Numerical Experiments, Unstable Reaction
(Figure: Delta t = 0.1 - exact solution versus Backward Euler, Trapezoidal rule, and
Forward Euler; all three track the growing solution, with the trap rule closest.)
(Figure: error detail - the Backward Euler and Forward Euler errors are comparable in
size and opposite in sign; the trap-rule error is far smaller.)

Finite Difference Methods: Numerical Experiments, Unstable Reaction Convergence
(Figure: max error versus timestep on a log-log plot - Forward and Backward Euler
converge first order, the trap rule second order.)

Finite Difference Methods: Numerical Experiments, Oscillating Strut and Mass
(Figure: Delta t = 0.1 - Forward Euler's oscillation grows, Backward Euler's decays,
the trap rule maintains the oscillation amplitude.)

Finite Difference Methods: Numerical Experiments, Two-Timescale RC Circuit
(Figure: Backward Euler computed solution - accurate with small Delta t, still
well-behaved with large Delta t.)
(Figure: Forward Euler computed solution - explodes unless Delta t is small.)

Finite Difference Methods: Numerical Experiments Summary
Energy preservation:
Why did BE produce a decaying oscillation?
Why did FE produce a growing oscillation?
Why did trap rule maintain oscillation amplitude?
Finite Difference Methods: Convergence Analysis, Convergence Definition
max_{l in [0, T/Delta t]} || x^l - x( l Delta t ) || -> 0 as Delta t -> 0
(Figure: x^l computed with Delta t, x^l computed with Delta t / 2, and x_exact.)

Finite Difference Methods: Convergence Analysis, Order-p Convergence
max_{l in [0, T/Delta t]} || x^l - x( l Delta t ) || <= C ( Delta t )^p

Finite Difference Methods: Convergence Analysis, Two Conditions for Convergence
Consistency (the one-step error vanishes fast enough) and stability.

Finite Difference Methods: Convergence Analysis, Consistency Definition
( 1 / Delta t ) || x^1 - x( Delta t ) || -> 0 as Delta t -> 0

Finite Difference Methods: Consistency of Forward Euler
Forward-Euler definition:  x^1 = x(0) + Delta t A x(0)
Expanding x( Delta t ) in Delta t about zero yields
x( Delta t ) = x(0) + Delta t dx(0)/dt + ( ( Delta t )^2 / 2 ) d^2 x(tau)/dt^2,
tau in [0, Delta t]
Noting that dx(0)/dt = A x(0) and subtracting:
|| x^1 - x( Delta t ) || <= ( ( Delta t )^2 / 2 ) max || d^2 x / dt^2 ||
which proves the theorem if the derivatives of x are bounded.
Finite Difference Methods: Convergence Analysis for Forward Euler
Forward-Euler definition:  x^{l+1} = x^l + Delta t A x^l
Expanding the exact solution in Delta t about l Delta t yields
x( (l+1) Delta t ) = x( l Delta t ) + Delta t A x( l Delta t ) + e^l,
|| e^l || <= ( ( Delta t )^2 / 2 ) max || d^2 x(tau) / dt^2 ||

Finite Difference Methods: Convergence Analysis for Forward Euler Continued
Subtracting, with the global error E^l = x^l - x( l Delta t ):
E^{l+1} = ( I + Delta t A ) E^l - e^l
|| E^{l+1} || <= ( 1 + Delta t || A || ) || E^l || + C ( Delta t )^2
SMA-HPC 2003 MIT

Finite Difference Methods: A Helpful Bound on Difference Equations
If   u^{l+1} <= ( 1 + epsilon ) u^l + b,   u^0 = 0,   epsilon > 0
Then u^l <= sum_{j=0}^{l-1} ( 1 + epsilon )^j b = [ ( ( 1 + epsilon )^l - 1 ) / epsilon ] b

Finite Difference Methods: Finishing the Bound
To finish, note ( 1 + epsilon ) <= e^epsilon, so ( 1 + epsilon )^l <= e^{ l epsilon }:
u^l <= [ ( e^{ l epsilon } - 1 ) / epsilon ] b
With epsilon = Delta t || A || and b = C ( Delta t )^2:
|| E^l || <= [ ( e^{ l Delta t || A || } - 1 ) / ( Delta t || A || ) ] C ( Delta t )^2
          <= ( e^{ || A || T } - 1 ) ( C / || A || ) Delta t
Hence
max_{l in [0, L]} || E^l || <= ( e^{ || A || T } - 1 ) ( C / || A || ) Delta t
- Forward Euler is first-order convergent.
Finite Difference Methods: Convergence Analysis, Experiments
(Figure: Forward Euler versus exact for the reaction example - R_FE, R_exact, T_exact,
T_FE over t in [0, 2.5]; and the errors R_exact - R_FE, T_exact - T_FE.)
(Figure: Forward Euler versus exact for the circuit example - v1_exact, v1_FE, v2_FE,
v2_exact over t in [0, 3.5]; and the errors v1_exact - v1_FE, v2_exact - v2_FE.)
Summary
Initial Value problem examples
Signal propagation (two time scales).
Space frame dynamics (oscillator).
Chemical reaction dynamics (unstable system).
Looked at the simple finite-difference methods
Forward-Euler, Backward-Euler, Trap Rule.
Look at the approximations and algorithms
Experiments generated many questions
Analyzed Convergence for Forward-Euler
Many more questions to answer, some next time
Outline
Small Timestep issues for Multistep Methods
Local truncation error
Selecting coefficients.
Nonconverging methods.
Stability + Consistency implies convergence
Next Time Investigate Large Timestep Issues
Absolute Stability for two time-scale examples.
Oscillators.
Multistep Methods: Basic Equations, General Notation
Problem:  (d/dt) x(t) = f( x(t), u(t) )
k-step multistep approach:
sum_{j=0}^{k} alpha_j x^{l-j} = Delta t sum_{j=0}^{k} beta_j f( x^{l-j}, u( t_{l-j} ) )
(Time discretization t_{l-k}, ..., t_{l-3}, t_{l-2}, t_{l-1}, t_l;
the alpha_j and beta_j are the multistep coefficients.)

Multistep Methods: Basic Equations, Common Algorithms
Multistep equation:
sum_{j=0}^{k} alpha_j x^{l-j} = Delta t sum_{j=0}^{k} beta_j f( x^{l-j}, u( t_{l-j} ) )
Forward-Euler approximation:
x( t_l ) ~ x( t_{l-1} ) + Delta t f( x( t_{l-1} ), u( t_{l-1} ) )
FE discrete equation:   x^l - x^{l-1} = Delta t f( x^{l-1}, u( t_{l-1} ) )
Multistep coefficients: k = 1, alpha_0 = 1, alpha_1 = -1, beta_0 = 0, beta_1 = 1
BE discrete equation:   x^l - x^{l-1} = Delta t f( x^l, u( t_l ) )
Multistep coefficients: k = 1, alpha_0 = 1, alpha_1 = -1, beta_0 = 1, beta_1 = 0
Trap discrete equation:
x^l - x^{l-1} = ( Delta t / 2 ) [ f( x^l, u( t_l ) ) + f( x^{l-1}, u( t_{l-1} ) ) ]
Multistep coefficients: k = 1, alpha_0 = 1, alpha_1 = -1, beta_0 = 1/2, beta_1 = 1/2
Multistep Methods: Basic Equations, Definitions and Observations
Scalar ODE test problem:  (d/dt) v(t) = lambda v(t),  v(0) = v_0
Why such a simple test problem? Applying the multistep formula to dx/dt = A x,
sum_j alpha_j x^{l-j} = Delta t sum_j beta_j A x^{l-j},
and changing variables to y = E^{-1} x (the eigencoordinates) decouples it:
sum_j alpha_j y^{l-j} = Delta t sum_j beta_j Lambda y^{l-j}
- each component is exactly the scalar test problem with lambda = lambda_i.

Multistep Methods: the Scalar Test Problem
sum_{j=0}^{k} alpha_j v^{l-j} = Delta t lambda sum_{j=0}^{k} beta_j v^{l-j}
(Figure: the complex lambda-plane - Re(lambda) < 0: decaying solutions;
Re(lambda) > 0: growing solutions; near the imaginary axis: oscillations.)
Multistep Methods: Convergence Analysis, Convergence Definition
max_{l in [0, T/Delta t]} | v^l - v( l Delta t ) | -> 0 as Delta t -> 0
(Figure: v^l computed with Delta t, with Delta t / 2, and v_exact.)
Order-p convergence:
max_{l in [0, T/Delta t]} | v^l - v( l Delta t ) | <= C ( Delta t )^p
(Figure: max error versus timestep - Backward Euler and Forward Euler first order,
trap rule second order.)
Multistep Methods: Convergence Analysis, Two Conditions for Convergence
Consistency (small local truncation error) and stability of the difference equation.

Multistep Methods: Local Truncation Error
Multistep formula:
sum_{j=0}^{k} alpha_j v^{l-j} - Delta t lambda sum_{j=0}^{k} beta_j v^{l-j} = 0
The exact solution almost satisfies the multistep formula:
sum_{j=0}^{k} alpha_j v( t_{l-j} ) - Delta t sum_{j=0}^{k} beta_j (d/dt) v( t_{l-j} ) = e^l
where e^l is the local truncation error (LTE).
Global error:  E^l = v( t_l ) - v^l
( alpha_0 - Delta t lambda beta_0 ) E^l + ( alpha_1 - Delta t lambda beta_1 ) E^{l-1} + ...
    + ( alpha_k - Delta t lambda beta_k ) E^{l-k} = e^l

Multistep Methods: Forward-Euler LTE and Convergence (recap)
For Forward Euler, Taylor expansion gives
| e^l | <= ( ( Delta t )^2 / 2 ) max | d^2 v / dt^2 |, and
| E^{l+1} | <= ( 1 + Delta t |lambda| ) | E^l | + C ( Delta t )^2
The difference-inequality bound (if u^{l+1} <= (1+epsilon) u^l + b with u^0 = 0, then
u^l <= [((1+epsilon)^l - 1)/epsilon] b, and (1+epsilon)^l <= e^{l epsilon}) then yields,
noting l Delta t <= T:
max_{l in [0, L]} | E^l | <= ( e^{ |lambda| T } - 1 ) ( C / |lambda| ) Delta t
Multistep Methods: Forward-Euler Experiments (recap)
(Figure: Forward Euler versus exact for the reaction example (R, Temp) and the circuit
example (v1, v2), with the corresponding errors - consistent with first-order convergence.)
Multistep Methods: Exactness Constraints
e^l = sum_{j=0}^{k} alpha_j v( t_{l-j} ) - Delta t sum_{j=0}^{k} beta_j (d/dt) v( t_{l-j} ),
tested on (d/dt) v(t) = lambda v(t) via polynomials. If v(t) = t^p, so that
(d/dt) v(t) = p t^{p-1}, then
e^l = sum_{j=0}^{k} alpha_j ( (k-j) Delta t )^p
      - Delta t sum_{j=0}^{k} beta_j p ( (k-j) Delta t )^{p-1}
    = ( Delta t )^p [ sum_{j=0}^{k} alpha_j (k-j)^p - p sum_{j=0}^{k} beta_j (k-j)^{p-1} ]

Multistep Methods: Exactness Constraints Continued
If
sum_{j=0}^{k} alpha_j (k-j)^p - p sum_{j=0}^{k} beta_j (k-j)^{p-1} = 0   for all p <= p_0
then the method is exact on polynomials up to degree p_0, and for smooth v
sum_j alpha_j v( t_{l-j} ) - Delta t sum_j beta_j (d/dt) v( t_{l-j} ) = e^l
    = C ( Delta t )^{ p_0 + 1 }
Multistep Methods: Exactness Constraints for k = 2
Writing the constraints for p = 0, ..., 4 (unknowns alpha_0, alpha_1, alpha_2,
beta_0, beta_1, beta_2):

[ 1   1   1     0    0    0 ] [ alpha_0 ]   [ 0 ]
[ 2   1   0    -1   -1   -1 ] [ alpha_1 ]   [ 0 ]
[ 4   1   0    -4   -2    0 ] [ alpha_2 ] = [ 0 ]
[ 8   1   0   -12   -3    0 ] [ beta_0  ]   [ 0 ]
[ 16  1   0   -32   -4    0 ] [ beta_1  ]   [ 0 ]
                              [ beta_2  ]
Note:  sum_j alpha_j = 0, always (the p = 0 row).

Multistep Methods: Exactness Constraints for k = 2, Examples
Forward-Euler:  alpha_0 = 1, alpha_1 = -1, alpha_2 = 0, beta_0 = 0, beta_1 = 1, beta_2 = 0
FE satisfies p = 0 and p = 1 but not p = 2   =>   LTE = C ( Delta t )^2
Backward-Euler: alpha_0 = 1, alpha_1 = -1, alpha_2 = 0, beta_0 = 1, beta_1 = 0, beta_2 = 0
BE satisfies p = 0 and p = 1 but not p = 2   =>   LTE = C ( Delta t )^2
Trap Rule:      alpha_0 = 1, alpha_1 = -1, alpha_2 = 0, beta_0 = 0.5, beta_1 = 0.5, beta_2 = 0
Trap satisfies p = 0, 1, and 2 but not p = 3   =>   LTE = C ( Delta t )^3
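The constraint system can be solved numerically; this sketch of ours fixes alpha_0 = 1
and solves the five p = 0..4 constraints for the remaining coefficients, reproducing the
method derived on the next slide:

    import numpy as np

    # Exactness constraints for k = 2, p = 0..4, with alpha_0 = 1 fixed.
    # Unknowns: alpha_1, alpha_2, beta_0, beta_1, beta_2.
    k = 2
    rows, rhs = [], []
    for p in range(5):
        a = [float((k - j) ** p) for j in range(3)]                 # alpha_j (k-j)^p
        b = [-p * float((k - j) ** (p - 1)) if p > 0 else 0.0
             for j in range(3)]                                     # -p beta_j (k-j)^{p-1}
        rows.append(a[1:] + b)       # drop the alpha_0 column ...
        rhs.append(-a[0])            # ... and move it to the right-hand side
    sol = np.linalg.solve(np.array(rows), np.array(rhs))
    print(sol)   # alpha_1 = 0, alpha_2 = -1, beta_0 = 1/3, beta_1 = 4/3, beta_2 = 1/3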
Multistep Methods: Solving All Five Constraints for k = 2
Fixing alpha_0 = 1 and moving it to the right-hand side:
[ 1   1    0    0    0 ] [ alpha_1 ]   [ -1  ]
[ 1   0   -1   -1   -1 ] [ alpha_2 ]   [ -2  ]
[ 1   0   -4   -2    0 ] [ beta_0  ] = [ -4  ]
[ 1   0  -12   -3    0 ] [ beta_1  ]   [ -8  ]
[ 1   0  -32   -4    0 ] [ beta_2  ]   [ -16 ]
alpha_0 = 1, alpha_1 = 0, alpha_2 = -1, beta_0 = 1/3, beta_1 = 4/3, beta_2 = 1/3
Multistep Methods: LTE Experiment, dv/dt = v
(Figure: LTE versus timestep on a log-log plot - FE, Trap, and the new "Beste" method,
whose LTE falls like ( Delta t )^5.)

Multistep Methods: Max Error Experiment, dv/dt = v, t in [0, 1]
(Figure: max error versus timestep - FE and Trap behave as expected.
Where's BESTE?)

Multistep Methods: Worrisome Experiment, dv/dt = -v
(Figure: max error versus timestep - Beste's error explodes, to the 10^100 scale and
beyond, as the timestep shrinks, while FE and Trap converge.)
Multistep Methods: LTE versus Global Error
( alpha_0 - Delta t lambda beta_0 ) E^l + ( alpha_1 - Delta t lambda beta_1 ) E^{l-1} + ...
    + ( alpha_k - Delta t lambda beta_k ) E^{l-k} = e^l,    E^l = v( l Delta t ) - v^l
We made the LTE so small; how come the global error is so large?

Aside on Difference Equations
a_0 x^l + a_1 x^{l-1} + ... + a_k x^{l-k} = u^l,   x^0 = x_0, x^1 = x_1, ..., x^{k-1} = x_{k-1}
x^l = -(1/a_0)( a_1 x^{l-1} + ... + a_k x^{l-k} ) + (1/a_0) u^l
x can be related to u by a convolution:  x^l = sum_{j=0}^{l} h^{l-j} u^j,
where h is built from the roots zeta of  a_0 z^k + a_1 z^{k-1} + ... + a_k = 0.
Example: suppose x^l = x^{l-1} + u^l and x^0 = 0 (a single root at z = 1). Then
x^1 = u^1,  x^2 = u^1 + u^2,  ...,  x^l = sum_{j=1}^{l} u^j,
so | x^l | <= C l max_j | u^j | - even a simple root on the unit circle lets inputs
accumulate linearly in l.

Multistep Methods: Global Error Bound
( alpha_0 - Delta t lambda beta_0 ) E^l + ... + ( alpha_k - Delta t lambda beta_k ) E^{l-k} = e^l
If the difference equation is stable,
max_{l in [0, T/Delta t]} | E^l | <= C ( T / Delta t ) max_{l in [0, T/Delta t]} | e^l |
for any e^l.
Multistep Methods: Stability - the Root Condition
( alpha_0 - Delta t lambda beta_0 ) E^l + ... + ( alpha_k - Delta t lambda beta_k ) E^{l-k} = e^l
The difference equation is stable if the roots of
sum_{j=0}^{k} alpha_j z^{k-j} = 0
are either inside the unit circle, or simple (non-multiple) on the unit circle.
(Figure: as Delta t -> 0, the roots of sum_j ( alpha_j - Delta t lambda beta_j ) z^{k-j} = 0
move inward to match the roots of the alpha-polynomial.)

Multistep Methods: an Unstable Example
alpha_0 = 1, alpha_1 = 4, alpha_2 = -5, beta_0 = 0, beta_1 = 4, beta_2 = 2
(Figure: the roots of z^2 + 4z - 5 = 0 are z = 1 and z = -5 - the root at -5 is far
outside the unit circle, so the method is violently unstable.)

Multistep Methods: Convergence Analysis for Multistep Methods
Consistency:  max LTE <= C_1 ( Delta t )^{ p_0 + 1 }  for Delta t < Delta t_0
Stability:    roots of sum_{j=0}^{k} alpha_j z^{k-j} = 0 inside or simple on the unit circle
              =>  max_l | E^l | <= C_2 ( T / Delta t ) max_l | e^l |
Convergence result:
max_{l in [0, T/Delta t]} | E^l | <= C T ( Delta t )^{ p_0 }
Summary
Small Timestep issues for Multistep Methods
Local truncation error and Exactness.
Difference equation stability.
Stability + Consistency implies convergence.
Next time
Absolute Stability for two time-scale examples.
Oscillators.
Maybe Runge-Kutta schemes
Outline
Small Timestep issues for Multistep Methods
Reminder about LTE minimization
A nonconverging example
Stability + Consistency implies convergence
Investigate Large Timestep Issues
Absolute Stability for two time-scale examples.
Oscillators.
Multistep Methods: Basic Equations (recap)
(d/dt) x(t) = f( x(t), u(t) );
sum_{j=0}^{k} alpha_j x^{l-j} = Delta t sum_{j=0}^{k} beta_j f( x^{l-j}, u( t_{l-j} ) ).

Multistep Methods: Scalar Test Problem (recap)
(d/dt) v(t) = lambda v(t), v(0) = v_0;
sum_j alpha_j v^{l-j} = Delta t lambda sum_j beta_j v^{l-j}.
(Figure: decaying / growing / oscillating regions of the lambda-plane.)

Multistep Methods: Convergence Definition (recap)
max_{l in [0, T/Delta t]} | v^l - v( l Delta t ) | -> 0 as Delta t -> 0.
Two conditions for convergence: consistency and stability.

Multistep Methods: LTE and Global Error (recap)
sum_j alpha_j v( t_{l-j} ) - Delta t sum_j beta_j (d/dt) v( t_{l-j} ) = e^l;
E^l = v( t_l ) - v^l satisfies
( alpha_0 - Delta t lambda beta_0 ) E^l + ( alpha_1 - Delta t lambda beta_1 ) E^{l-1} + ...
    + ( alpha_k - Delta t lambda beta_k ) E^{l-k} = e^l.
Multistep Methods: Exactness Constraints (recap)
With v(t) = t^p, the LTE vanishes exactly when
sum_{j=0}^{k} alpha_j (k-j)^p - p sum_{j=0}^{k} beta_j (k-j)^{p-1} = 0;
satisfying this for all p <= p_0 gives LTE = C ( Delta t )^{ p_0 + 1 }.
For k = 2, the p = 0..4 constraints give the 5x6 system shown earlier;
note sum_j alpha_j = 0, always.

Multistep Methods: the "Beste" Method (recap)
Fixing alpha_0 = 1 and solving all five constraints:
alpha_0 = 1, alpha_1 = 0, alpha_2 = -1, beta_0 = 1/3, beta_1 = 4/3, beta_2 = 1/3
Satisfies all five exactness constraints   =>   LTE = C ( Delta t )^5
Multistep Methods: Experiments (recap)
(Figure: LTE versus timestep for dv/dt = v - Beste's LTE is by far the smallest.)
(Figure: max error versus timestep for dv/dt = v, t in [0, 1] - where's BESTE?)
(Figure: max error versus timestep for dv/dt = -v - Beste's global error explodes.)

Multistep Methods: LTE versus Global Error (recap)
( alpha_0 - Delta t lambda beta_0 ) E^l + ... + ( alpha_k - Delta t lambda beta_k ) E^{l-k} = e^l
We made the LTE so small; how come the global error is so large?
The error equation is itself a difference equation, driven by the e^l's; its behavior
depends on Delta t and on the interval through a convolution sum.
Aside on Difference Equations: Root Relation
a_0 x^l + ... + a_k x^{l-k} = u^l,   x^0 = 0, ..., x^{k-1} = 0
x relates to the input u by the convolution sum
x^l = sum_{j=0}^{l} h^{l-j} u^j,
h^l = sum_{q=1}^{Q} sum_{m=0}^{M_q - 1} kappa_{q,m} l^m ( zeta_q )^l
where the zeta_q are the roots, with multiplicity M_q, of
a_0 z^k + a_1 z^{k-1} + ... + a_k = 0.

Aside on Difference Equations: Bounding Terms
| x^l | = | sum_q sum_m sum_j kappa_{q,m} (l-j)^m ( zeta_q )^{l-j} u^j |
       <= ( sum_{q,m} R_{q,m} ) max_j | u^j |,   with R_{q,m} independent of l,
provided every root is inside the unit circle or simple on it.

Multistep Methods: Stability (recap)
( alpha_0 - Delta t lambda beta_0 ) E^l + ... + ( alpha_k - Delta t lambda beta_k ) E^{l-k} = e^l
If, as Delta t -> 0, the roots of
sum_j ( alpha_j - Delta t lambda beta_j ) z^{k-j} = 0
stay inside (or simple on) the unit circle, the accumulated error is bounded by
C(T) max | e^l |.
(Figure: roots moving inward as Delta t -> 0; the unstable example z^2 + 4z - 5 = 0 with
roots 1 and -5.)

Multistep Methods: Convergence (recap)
Consistency: max LTE <= C_1 ( Delta t )^{ p_0 + 1 } for Delta t < Delta t_0.
Stability: roots of sum_{j=0}^{k} alpha_j z^{k-j} = 0 inside the unit circle or simple
on it, so max_l | E^l | <= C_2 ( T / Delta t ) max_l | e^l |.
Convergence result: max_{l in [0, T/Delta t]} | E^l | <= C T ( Delta t )^{ p_0 }.
Multistep Methods: Large Timesteps, Circuit Example
(d/dt) x(t) = A x(t),   eig(A) = -2.1, -0.1
(Figure: Backward Euler computed solution - good with small Delta t, still well-behaved
with large Delta t. Forward Euler on the same two-time-constant circuit - blows up
unless Delta t is small.)

Multistep Methods: Absolute Stability on the Scalar Test Problem
(d/dt) v(t) = lambda v(t),   v(0) = v_0
Forward-Euler:  v^{l+1} = v^l + Delta t lambda v^l = ( 1 + Delta t lambda ) v^l
    If | 1 + Delta t lambda | > 1 the solution grows, even if Re(lambda) < 0.
Backward-Euler: v^{l+1} = v^l + Delta t lambda v^{l+1}
    =>  v^{l+1} = v^l / ( 1 - Delta t lambda )
    If | 1 / ( 1 - Delta t lambda ) | < 1 the solution decays, even if Re(lambda) > 0.
Trap Rule:  v^{l+1} = v^l + 0.5 Delta t lambda ( v^{l+1} + v^l )
    =>  v^{l+1} = [ ( 1 + 0.5 Delta t lambda ) / ( 1 - 0.5 Delta t lambda ) ] v^l
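These per-step growth factors are easy to tabulate (a small sketch of ours; the value
lambda*dt = -3 is outside FE's stability region but inside BE's and the trap rule's):

    # Per-step growth factors z(lambda*dt) for FE, BE, and trapezoidal rule.
    ldt = -3.0                                  # lambda * dt, a decaying mode
    z_fe = 1 + ldt                              # |z| = 2   -> FE grows (unstable)
    z_be = 1 / (1 - ldt)                        # |z| = 1/4 -> BE decays
    z_tr = (1 + 0.5 * ldt) / (1 - 0.5 * ldt)    # |z| = 1/5 -> trap decays
    print(abs(z_fe), abs(z_be), abs(z_tr))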
Multistep Methods: Forward Euler Region of Absolute Stability
z = 1 + Delta t lambda
(Figure: the ODE stability region is the whole left half of the lambda-plane; the
difference equation is stable for |z| <= 1, i.e. for Delta t lambda inside the unit disk
centered at -1 - a disk of diameter 2 touching the origin.)

Multistep Methods: Backward Euler Region of Absolute Stability
z = 1 / ( 1 - Delta t lambda )
(Figure: the difference equation is stable for Delta t lambda anywhere outside the unit
disk centered at +1 - a region that contains the entire left half plane.)

Multistep Methods: Stability Definitions
A-stable: the region of absolute stability contains the entire left half of the
Delta t lambda plane.
4
2
Numerical Experiments
Oscillating Strut and Mass
t = 0.1
Forward-Euler
0
-2
-4
-6
Trap rule
0
10
Backward-Euler
15
20
25
30
Multistep Methods
Forward Euler
z = (1 + t )
Im ( )
ODE stability
region
Im(z)
oscillating
unstable
-1
Difference Eqn
Stability region
Re(z)
2
t
Region of
Absolute
Stability
Re ( )
Multistep Methods
Backward Euler
z = (1 t )
Im ( )
Im(z)
decaying
-1
Difference Eqn
Stability region
oscillating
Re(z)
Region of
Absolute
Stability
Multistep Methods
1 + 0.5t )
(
Trap Rule z = (1 0.5t )
Im ( )
Im(z)
oscillating
oscillating
-1
Difference Eqn
Stability region
Re(z)
Region of
Absolute
Stability
Multistep Methods
Summary
Small Timestep issues for Multistep Methods
Local truncation error and Exactness.
Difference equation stability.
Stability + Consistency implies convergence.
Investigate Large Timestep Issues
Absolute Stability for two time-scale examples.
Oscillators.
Didnt talk about
Runge-Kutta schemes, higher order A-stable
methods.
Outline
Periodic Steady-state problems
Application examples and simple cases
Finite-difference methods
Formulating large matrices
Shooting Methods
State transition function
Sensitivity matrix

Periodic Steady-State Basics: Basic Definition
dx(t)/dt = F( x(t) ) + u(t)    <- x: state, u: input
(Figure: a periodic waveform over T, 2T, 3T.)
x( t + T ) = x( t )  for t >> 0

Periodic Steady-State Basics: Interesting Property
dx(t)/dt = F( x(t) ) + u(t)
Then if u is periodic with period T and  x( t_0 + T ) = x( t_0 )  for some t_0,
x( t + T ) = x( t )  for all t > t_0.
Periodic Steady-State
Basics
Periodic Input
Wind
Response
Oscillating Platform
Desired Info
Oscillation Amplitude
Application Examples
Swaying Bridge
Periodic Steady-State
Basics
Application Examples
Communication Integrated
Circuit
Periodic Input
Received Signal at 900Mhz
Response
filtered demodulated signal
Desired Info
Distortion
Periodic Steady-State
Basics
Application Examples
Automobile Vibration
Periodic Input
Regularly Spaced
Road Bumps
Response
Car Shakes
Desired Info
Shake amplitude
Periodic Steady-State Basics: Simple Example
RLC filter / spring + mass + dashpot:
M d^2x/dt^2 + D dx/dt + x = u(t)    <- u: input force

Periodic Steady-State Basics: Simple Example Cont.
u(t) = 0, lightly damped ( D << M ): the response is a slowly decaying oscillation,
x(t) ~ K e^{ -( D / 2M ) t } cos( t / sqrt(M) + phi )
(Figure: the oscillation inside its decay envelope K e^{ -( D / 2M ) t }.)
SMA-HPC 2003 MIT

Periodic Steady-State Basics: Sinusoidal Steady State
For the linear problem with a sinusoidal input,
dx(t)/dt = A x(t) + e^{ i omega t }    <- input
the periodic steady state can be written down directly:
x(t) = ( i omega I - A )^{-1} e^{ i omega t }
Periodic Steady-State Basics: Aside, Reviewing Integration Methods
Nonlinear system:  dx(t)/dt = F( x(t) ) + u(t),   x(0) = x_0    <- initial condition
Implicit methods require solving
x^l - x^{l-1} = Delta t [ F( x^l ) + u( l Delta t ) ]  at each step.

Aside, Reviewing Integration Methods: Backward-Euler Example
Forward-Euler:
x( t_1 ) ~ x^1 = x(0) + Delta t f( x(0), u(0) )
x( t_2 ) ~ x^2 = x^1 + Delta t f( x^1, u( t_1 ) )
...
x( t_L ) ~ x^L = x^{L-1} + Delta t f( x^{L-1}, u( t_{L-1} ) )
Backward-Euler:
x( t_1 ) ~ x^1 = x(0) + Delta t f( x^1, u( t_1 ) )
x( t_2 ) ~ x^2 = x^1 + Delta t f( x^2, u( t_2 ) )
...
x( t_L ) ~ x^L = x^{L-1} + Delta t f( x^L, u( t_L ) )
    <- a nonlinear equation solution is required at each step.

Aside, Reviewing Integration Methods: Implicit Methods
Each step solves
F~( x^l ) = alpha_0 x^l - Delta t beta_0 f( x^l, u( t_l ) ) + b = 0,
where b = sum_{j>=1} [ alpha_j x^{l-j} - Delta t beta_j f( x^{l-j}, u( t_{l-j} ) ) ]
is independent of x^l. Newton iteration:
[ alpha_0 I - Delta t beta_0 (df/dx)( x^{l,j} ) ] ( x^{l,j+1} - x^{l,j} ) = -F~( x^{l,j} )
The initial guess x^{l,0} comes from a polynomial predictor through the previous time
points, and the Jacobian approaches alpha_0 I as Delta t -> 0.
Boundary-Value Problem: Basic Formulation
N differential equations:   (d/dt) x_i(t) = F_i( x(t) )    <- Differential equation solution
N periodicity constraints:  x_i( T ) = x_i( 0 )            <- Periodicity constraint
SMA-HPC 2003 MIT

Boundary-Value Problem: Linear Case with Backward Euler
dx(t)/dt = A x(t) + u(t),  t in [0, T],   x(T) = x(0)    <- periodicity constraint
Discretize, Delta t = T / L:
x^l = x^{l-1} + Delta t ( A x^l + u( l Delta t ) );  periodicity implies x^0 = x^L.

Boundary-Value Problem: the (NL) x (NL) Finite-Difference Matrix
[ (1/Delta t) I - A                                 -(1/Delta t) I ] [ x^1 ]   [ u( Delta t )   ]
[ -(1/Delta t) I    (1/Delta t) I - A                              ] [ x^2 ]   [ u( 2 Delta t ) ]
[                      ...                ...                      ] [ ... ] = [      ...       ]
[                          -(1/Delta t) I    (1/Delta t) I - A     ] [ x^L ]   [ u( L Delta t ) ]
- block bidiagonal, except for the top-right block, which enforces periodicity.
Boundary-Value Problem: Nonlinear Problem
dx(t)/dt = F( x(t) ) + u(t),  t in [0, T],   x(T) = x(0)    <- periodicity constraint
The finite-difference equations, all solved at once:
            [ x^1 - x^L     - Delta t ( F( x^1 ) + u( Delta t ) )    ]
H_FD( x ) = [ x^2 - x^1     - Delta t ( F( x^2 ) + u( 2 Delta t ) )  ] = 0
            [                   ...                                  ]
            [ x^L - x^{L-1} - Delta t ( F( x^L ) + u( L Delta t ) )  ]
Boundary-Value Problem: Shooting Method, Basic Definitions
Start with  dx(t)/dt = F( x(t) ) + u(t),  and assume x(t) is unique given x(0).
The D.E. defines a state-transition function:
Phi( y, t_0, t_1 ) = x( t_1 ),  where x(t) is the D.E. solution given x( t_0 ) = y.
SMA-HPC 2003 MIT

Boundary-Value Problem: State Transition Function Example
dx(t)/dt = lambda x(t)   =>   Phi( y, t_0, t_1 ) = e^{ lambda ( t_1 - t_0 ) } y

Boundary-Value Problem: Shooting Method, Abstract Formulation
Solve
H( x(0) ) = Phi( x(0), 0, T ) - x(0) = 0    <- Phi( x(0), 0, T ) = x(T)
Newton:
J_H( x ) = dPhi( x, 0, T )/dx - I
J_H( x^k ) ( x^{k+1} - x^k ) = -H( x^k )
SMA-HPC 2003 MIT
Boundary-Value Problem: Shooting Method, Computing Newton
To compute Phi( x(0), 0, T ):
integrate dx(t)/dt = F( x(t) ) + u(t) on [0, T].
What is dPhi( x, 0, T )/dx - the sensitivity of x(T) to x(0)?

Boundary-Value Problem: Sensitivity Matrix by Perturbation
dPhi( x, 0, T )/dx is the N x N matrix [ dx_i(T) / dx_j(0) ]: perturb each component of
x(0) in turn and re-integrate - N extra transient solutions per Newton iteration.
SMA-HPC 2003 MIT
Boundary-Value Problem: Shooting Method, Efficient Sensitivity Evaluation
A Backward-Euler step satisfies  x^1 - x(0) - Delta t [ F( x^1 ) + u( t_1 ) ] = 0.
Differentiating with respect to x(0):
[ I - Delta t (dF/dx)( x^1 ) ] dx^1/dx(0) = I
and in general
[ I - Delta t (dF/dx)( x^l ) ] dx^l/dx(0) = dx^{l-1}/dx(0)
so
dPhi( x, 0, T )/dx = prod_{l=1}^{L} [ I - Delta t (dF/dx)( x^l ) ]^{-1}
- a product of the L timestep Newton Jacobians, already factored during the transient.
For the linear problem,  dPhi( x, 0, T )/dx = ( I - Delta t A )^{-L}.
Shooting Method: Matrix-Free Approach, Basic Setup

Start with dx(t)/dt = F( x(t) ) + u(t) and, as before,

H( x(0) ) ≡ Φ( x(0), 0, T ) - x(0) = 0
J_H(x) = ∂Φ(x, 0, T)/∂x - I
J_H(x^k) ( x^{k+1} - x^k ) = -H(x^k)

Solve each Newton update with a Krylov method, which needs only matrix-vector products. In

[ ∂Φ(x^k, 0, T)/∂x - I ] ( x^{k+1} - x^k ) = x^k - Φ( x^k, 0, T )
          A                      x         =         b

the product with a search direction p^j can be approximated with one extra integration:

[ ∂Φ(x^k, 0, T)/∂x - I ] p^j ≈ ( Φ( x^k + ε p^j, 0, T ) - Φ( x^k, 0, T ) ) / ε - p^j
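A minimal MATLAB sketch of this matrix-free Jacobian-vector product; the dynamics F below are an assumed test system, and ode45 stands in for whatever integrator computes Φ:

% Matrix-free shooting matvec: approximate (dPhi/dx - I)*p with one
% extra integration (sketch).
F = @(t,x) [x(2); -sin(x(1)) - 0.1*x(2) + cos(t)];   % assumed dynamics
T = 2*pi;

x0 = [0.1; 0];
[~, X] = ode45(F, [0 T], x0);   phiX = X(end,:)';    % Phi(x0, 0, T)
H = phiX - x0;                                       % shooting residual

p  = randn(2,1);
ep = 1e-6*max(1, norm(x0))/norm(p);                  % perturbation size
[~, Xp] = ode45(F, [0 T], x0 + ep*p);
Jp = (Xp(end,:)' - phiX)/ep - p;                     % (dPhi/dx - I)*p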
Shooting Method: Matrix-Free Approach, Convergence for GCR

Example: dx/dt - A x = 0 with eig(A) real and negative.

Shooting-Newton Jacobian:  ∂Φ(x, 0, T)/∂x - I = e^{AT} - I

With the eigendecomposition A = S Λ S^{-1},

e^{AT} - I = S diag( e^{λ_1 T} - 1, ..., e^{λ_N T} - 1 ) S^{-1}.

Since the λ_i are negative, the eigenvalues e^{λ_i T} - 1 cluster near -1; only the few slow modes (|λ_i T| small) fall outside the cluster, so GCR converges in a few iterations.
Summary
Periodic Steady-state problems
Application examples and simple cases
Finite-difference methods
Formulating large matrices
Shooting Methods
State transition function
Sensitivity matrix
Outline
Three Methods so far
Time integration until steady-state achieved
Finite difference methods
Shooting Methods
Shooting Methods
State transition function
Sensitivity matrix
Matrix-Free Approach
Spectral Methods
Galerkin and Collocation Methods
Basic Definition
Periodic Steady-State Basics

dx(t)/dt = F( x(t) ) + u(t)   (x: state, u: input)

x(t + T) = x(t) for t >> 0
Spectral Methods: Fourier Representation, Truncation Approximation

x(t) = Σ_{l=-∞}^{∞} X_l e^{i 2π l t / T}

x(t) ≈ Σ_{l=-L}^{L} X_l e^{i 2π l t / T}

(Example plot: truncated Fourier series approximations of a square wave.)
Spectral Methods: Fourier Representation, Real Signals

If x(t) is real then X_{-l} = X_l^*, so

x(t) = X_0 + Σ_{l=1}^{L} ( X_l e^{i 2π l t / T} + X_l^* e^{-i 2π l t / T} ),

which is real.
Spectral Methods: Fourier Representation, Orthogonality

∫_0^T e^{i 2π l t / T} e^{-i 2π m t / T} dt = 0  for l ≠ m,

so

∫_0^T e^{-i 2π m t / T} x(t) dt = ∫_0^T e^{-i 2π m t / T} Σ_l X_l e^{i 2π l t / T} dt = T X_m.
Spectral Methods: Fourier Representation, Advantages

For smooth x(t) the coefficients decay rapidly:

X_m = (1/T) ∫_0^T e^{-i 2π m t / T} x(t) dt = O(c^m) as m → ∞, for some c < 1,

and periodicity is built into the representation:

x(t + T) = Σ_{l=-L}^{L} X_l e^{i 2π l (t+T) / T} = Σ_{l=-L}^{L} X_l e^{i 2π l t / T} = x(t).
Spectral Methods: Computing Coefficients, Residual

Substitute the truncated series into the differential equation and define the residual

R(X, t) = d/dt Σ_{l=-L}^{L} X_l e^{i 2π l t / T} - F( Σ_{l=-L}^{L} X_l e^{i 2π l t / T} ) - u(t).

Galerkin: force the residual to be orthogonal to the test functions,

∫_0^T e^{-i 2π m t / T} R(X, t) dt = 0,   m ∈ {-L, ..., 0, ..., L}.
Spectral Methods: Computing Coefficients, Galerkin Equation

Using orthogonality, the Galerkin conditions become, for each m ∈ {-L, ..., 0, ..., L}:

i 2π m X_m - ∫_0^T e^{-i 2π m t / T} F( Σ_{l=-L}^{L} X_l e^{i 2π l t / T} ) dt - ∫_0^T e^{-i 2π m t / T} u(t) dt = 0.
Spectral Methods: Computing Coefficients, Linear Galerkin F(x) = Ax

For F(x) = Ax the integrals evaluate by orthogonality and the system decouples into 2L+1 blocks:

[ (i 2π (-L)/T) I - A                                      ] [ X_{-L}   ]   [ U_{-L}   ]
[               (i 2π (-L+1)/T) I - A                      ] [ X_{-L+1} ] = [ U_{-L+1} ]
[                           ⋱                              ] [    ⋮     ]   [    ⋮     ]
[                                     (i 2π L/T) I - A     ] [ X_L      ]   [ U_L      ]

where U_m = (1/T) ∫_0^T e^{-i 2π m t / T} u(t) dt. The matrix is block diagonal: each harmonic decouples as [ (i 2π m / T) I - A ] X_m = U_m.
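A small MATLAB sketch of this decoupled linear Galerkin (harmonic-balance) solve; the test matrix A and the input spectrum U are assumptions chosen for illustration:

% Linear Galerkin: each Fourier harmonic decouples (sketch).
A = [0 1; -1 -0.2];  T = 2*pi;  L = 16;  N = size(A,1);
U = zeros(N, 2*L+1);                     % input spectrum U_m, m = -L..L
U(2, L)   = 0.5;                         % assumed input cos(t): U_{-1} = 1/2
U(2, L+2) = 0.5;                         % ...and U_{+1} = 1/2 (column L+1 is m = 0)

X = zeros(N, 2*L+1);
for m = -L:L
  X(:, m+L+1) = ( (1i*2*pi*m/T)*eye(N) - A ) \ U(:, m+L+1);
end

t = linspace(0, T, 200);                 % evaluate x(t) from its Fourier series
x = real( X * exp(1i*2*pi*(-L:L)'*t/T) );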
Spectral Methods: Computing Coefficients, Collocation Equations

Instead of testing with exponentials, force the residual to zero at 2L+1 timepoints:

R(X, t_l) = 0,   l ∈ {1, ..., 2L+1}.
Spectral Methods: Computing Coefficients, Discrete Fourier Transform

The timepoint values and the Fourier coefficients are related by the DFT matrix:

[ x(t_1)      ]   [ e^{-i 2π L t_1/T}       ...  e^{i 2π L t_1/T}      ] [ X_{-L} ]
[    ⋮        ] = [         ⋮                            ⋮             ] [   ⋮    ]
[ x(t_{2L+1}) ]   [ e^{-i 2π L t_{2L+1}/T}  ...  e^{i 2π L t_{2L+1}/T} ] [ X_L    ]

If t_l = l T / (2L + 1), the DFT matrix has orthogonal columns.
Spectral Methods: Computing Coefficients, Collocation Using Timepoints

Writing the collocation equations in terms of the timepoint values, the time derivative becomes the spectral differentiation matrix

D = E diag( i 2π (-L)/T, ..., i 2π L/T ) E^{-1},

where E is the DFT matrix above, and the collocation system is

D [ x(t_1); ...; x(t_{2L+1}) ] - [ F(x(t_1)); ...; F(x(t_{2L+1})) ] = [ u(t_1); ...; u(t_{2L+1}) ].
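Spectral differentiation is usually applied with the FFT rather than an explicit matrix. A minimal MATLAB sketch (the smooth periodic test function is an assumption):

% Spectral differentiation of a periodic signal via the FFT (sketch).
T = 2*pi;  n = 33;                       % n = 2L+1 samples
t = (0:n-1)' * T/n;
x = exp(sin(t));                         % assumed smooth T-periodic test signal

k  = [0:(n-1)/2, -(n-1)/2:-1]';          % FFT frequency ordering (n odd)
dx = real( ifft( (1i*2*pi*k/T) .* fft(x) ) );

max(abs(dx - cos(t).*exp(sin(t))))       % compare with the exact derivative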
Spectral Methods: Spectral Differentiation Example

(Plot: spectral differentiation applied to a sample periodic waveform.)

The same "differentiation matrix times samples" form describes both discretizations. With the backward-difference matrix,

(1/Δt) [ 1 0 ... 0 -1; -1 1 0 ... 0; ...; 0 ... -1 1 ] [ x(t_1); ...; x(t_{2L+1}) ] - [ F(x(t_1)); ...; F(x(t_{2L+1})) ] = [ u(t_1); ...; u(t_{2L+1}) ],

the finite-difference periodic steady-state equations are recovered; replacing the difference matrix with the spectral differentiation matrix D = E diag( i 2π l / T ) E^{-1} gives the collocation equations, with spectral rather than first-order accuracy.
Summary
Four Methods
Time integration until steady-state achieved
Finite difference methods
Shooting Methods
Spectral Methods
Shooting Methods
State transition function
Sensitivity matrix
Matrix-Free Approach
Spectral Methods
Galerkin and Collocation Methods
SMA-HPC 2003 MIT
Engine Thermal Analysis / Electrostatic Analysis
Equations: The Laplace Partial Differential Equation.

Heat Flow: 1-D Example

(Figure: a 1-D bar with incoming heat; near-end temperature T(0), far-end temperature T(1).)

Discrete Representation: the bar is divided into short sections, with unknown temperatures T_1, T_2, ..., T_{N-1}, T_N between the ends T(0) and T(1).
Heat Flow, 1-D Example: Constitutive Relation

Between adjacent sections, the heat flow is proportional to the temperature difference:

h_{i+1,i} = heat flow = κ ( T_{i+1} - T_i ) / Δx

In the limit as the sections become vanishingly small,

lim_{Δx → 0} h(x) = κ ∂T(x)/∂x.

Heat Flow, 1-D Example: Conservation Law

Heat flows into each control volume sum to zero: heat in from the left, heat out from the right, and the incoming heat per unit length h_s balance as

h_{i+1,i} - h_{i,i-1} = -h_s Δx.

In the limit,

lim_{Δx → 0} ∂h(x)/∂x = κ ∂²T(x)/∂x² = -h_s.
Heat Flow, 1-D Example: Circuit Analogy

The discrete equations are exactly a resistor ladder: each section is a resistor with 1/R = κ/Δx, the incoming heat per section is a current source i_s = h_s Δx, and the boundary temperatures are voltage sources, v_s = T(0) at node T_1's end and v_s = T(1) at node T_N's end.

Heat Flow, 1-D Example: Normalized 1-D Equation

κ ∂²T(x)/∂x² = -h_s   becomes, after normalization,   -∂²u(x)/∂x² = f(x),  i.e.  -u_xx(x) = f(x).
Using Basis Functions: Residual Equation

Partial differential equation form:

-∂²u/∂x² = f,   u(0) = 0,  u(1) = 0

Approximate u with basis functions:

u(x) ≈ u_h(x) = Σ_{i=1}^{n} ω_i φ_i(x)

Residual:

R(x) = Σ_{i=1}^{n} ω_i d²φ_i(x)/dx² + f(x)

(Figure: example basis functions.)
Using Basis Functions: Basis Weights, Galerkin Scheme

Force the residual to be orthogonal to each basis function, ∫_0^1 φ_l(x) R(x) dx = 0:

∫_0^1 φ_l(x) [ Σ_{i=1}^{n} ω_i d²φ_i(x)/dx² + f(x) ] dx = 0,   l ∈ {1, ..., n}.

Integrating by parts (the boundary terms vanish since φ_l(0) = φ_l(1) = 0):

∫_0^1 (dφ_l(x)/dx) ( Σ_{i=1}^{n} ω_i dφ_i(x)/dx ) dx - ∫_0^1 φ_l(x) f(x) dx = 0,   l ∈ {1, ..., n}.
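For concreteness, a minimal MATLAB sketch of this Galerkin scheme with piecewise-linear hat functions on a uniform mesh (the load f is an assumed example); with hat functions the stiffness integrals reduce to the familiar tridiagonal matrix:

% 1-D Galerkin FEM with hat functions for -u'' = f, u(0) = u(1) = 0 (sketch).
n = 50;  h = 1/(n+1);
x = (1:n)' * h;                            % interior nodes
f = @(x) 50*ones(size(x));                 % assumed heat source

% Stiffness: int phi_l' phi_i' dx gives (1/h) tridiag(-1, 2, -1).
A = (1/h) * ( 2*eye(n) - diag(ones(n-1,1),1) - diag(ones(n-1,1),-1) );

% Load: int phi_l f dx, approximated by lumped quadrature h*f(x_l).
b = h * f(x);

w = A\b;                                   % basis weights = nodal values u_h(x_l)
plot([0; x; 1], [0; w; 0]);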
Convergence Analysis: Heat Equation, Overview of FEM

The question is: how big is ‖u - u_h‖?

-∂²u/∂x² = f,   u(0) = 0,  u(1) = 0

Multiplying by any test function v (with v(0) = v(1) = 0) and integrating by parts gives the weak form

∫ (∂u/∂x)(∂v/∂x) dx = ∫ f v dx   for all v,
      a(u, v)            l(v)

introducing an abstract notation for the equation u must satisfy:

a(u, v) = l(v)   for all v.
Convergence Analysis: Heat Equation, Key Idea

a(u, u) defines a norm: a(u, u) ≡ ‖u‖²  (u is restricted to be 0 at 0 and 1!). Using the norm properties, it is possible to show that the Galerkin solution u_h is the projection of u onto the approximation space X_h:

‖u - u_h‖ = min_{w_h ∈ X_h} ‖u - w_h‖
(solution error)   (projection error)

For the piecewise-linear basis with n nodes this gives the error estimate

‖u - u_h‖ = O(1/n).
Summary
Why Poisson Equation
Reminder about heat conducting bar
Outline
Informal Finite Difference Methods
Heat Conducting Bar
Summary
Informal Finite Difference Methods
Heat Conducting Bar
Outline
Reminder about FEM and F-D
1-D Example
Krylov Method
Communication Lower bound
Preconditioners based on improving communication
FD Matrix Properties: 1-D Poisson Finite Differences

Grid x_0, x_1, x_2, ..., x_n, x_{n+1}, with -u_xx = f discretized as

-( u_{j+1} - 2 u_j + u_{j-1} ) / Δx² = f(x_j),

giving the tridiagonal system

(1/Δx²) [ 2 -1 0 ... ; -1 2 -1 ... ; ... ; ... -1 2 ] [ u_1; ...; u_n ] = [ f(x_1); ...; f(x_n) ].
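A minimal MATLAB sketch that builds this matrix sparsely and solves it (the right-hand side is an assumed example):

% 1-D Poisson: -u'' = f, u(0) = u(1) = 0, second-order finite differences (sketch).
n  = 100;  dx = 1/(n+1);
e  = ones(n,1);
A  = spdiags([-e 2*e -e], -1:1, n, n) / dx^2;   % tridiagonal FD matrix
xj = (1:n)' * dx;
f  = sin(pi*xj);                                % assumed right-hand side
u  = A\f;                                       % exact solution: sin(pi*x)/pi^2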
3-D Boundary-Value Problem Application Examples

Structural Analysis of Automobiles
Equations: force-displacement relationships for mechanical elements (plates, beams, shells) and sum of forces = 0; the Partial Differential Equations of Continuum Mechanics.

Aerodynamics
Equations: the Navier-Stokes Partial Differential Equations.

Engine Thermal Analysis
Equations: the Poisson Partial Differential Equation.
FD Matrix Properties: Discretized Poisson in 2-D

On an m x m grid (nodes x_1, ..., x_{m²}, numbered so that x_{j±m} are the vertical neighbors of x_j):

-( u_{j+1} - 2u_j + u_{j-1} ) / (Δx)²  -  ( u_{j+m} - 2u_j + u_{j-m} ) / (Δy)²  = f(x_j)
            (u_xx)                                    (u_yy)

FD Matrix Properties: Discretized Poisson in 3-D

In 3-D the stencil picks up the neighbors x_{j±1}, x_{j±m}, and x_{j±m²}:

-( u_{j+1} - 2u_j + u_{j-1} ) / (Δx)²  -  ( u_{j+m} - 2u_j + u_{j-m} ) / (Δy)²  -  ( u_{j+m²} - 2u_j + u_{j-m²} ) / (Δz)²  = f(x_j)
            (u_xx)                                    (u_yy)                                      (u_zz)

(Figure: matrix nonzero pattern for the m = 4 example.)
FD Matrix Properties: Summary

Numerical properties: the matrices are diagonally dominant,

|A_ii| ≥ Σ_{j ≠ i} |A_ij|,

with diagonal entries 2/Δx² in 1-D, 4/Δx² in 2-D, and 6/Δx² in 3-D (taking Δx = Δy = Δz).

Structural properties: the matrix is m² x m² in 2-D and m³ x m³ in 3-D, and banded:

1-D: A_ij = 0 for |i - j| > 1
2-D: A_ij = 0 for |i - j| > m
3-D: A_ij = 0 for |i - j| > m²

with 3 nonzeros per row in 1-D, 5 in 2-D, and 7 in 3-D.
Basics of GE: Triangularizing, Picture

Eliminating the first column: the multipliers A_21/A_11, A_31/A_11, A_41/A_11 zero out A_21, A_31, A_41, and every remaining entry in those rows is updated (for example A_23 ← A_23 - (A_21/A_11) A_13), leaving a smaller system in A_22, ..., A_44 to triangularize in turn.
GE Basics: Triangularizing Algorithm

For i = 1 to n-1 {                          // for each row
    For j = i+1 to n {                      // for each row below the pivot
        For k = i+1 to n {                  // for each element beyond the pivot
            A_jk <- A_jk - (A_ji / A_ii) * A_ik   // pivot A_ii, multiplier A_ji/A_ii
        }
    }
}

Work: form n-1 reciprocals (pivots), form Σ_{i=1}^{n-1} (n - i) = n²/2 multipliers, and perform Σ_{i=1}^{n-1} (n - i)² ≈ n³/3 multiply-adds.

Complexity of GE for the Poisson matrices (n unknowns, m grid points per side):
1-D: O(n³) = O(m³)
2-D: n = m², so O(m⁶)
3-D: n = m³, so O(m⁹)
Banded GE: Triangularizing Algorithm

For a matrix with bandwidth b, the inner loops only touch the band of NONZEROS:

For i = 1 to n-1 {
    For j = i+1 to min(i+b-1, n) {
        For k = i+1 to min(i+b-1, n) {
            A_jk <- A_jk - (A_ji / A_ii) * A_ik
        }
    }
}

Work: Σ_{i=1}^{n-1} ( min(b-1, n-i) )² = O(b² n) multiply-adds.

Complexity of banded GE for the Poisson matrices:
1-D: b is a small constant, so O(b² n) = O(m)
2-D: b = m, n = m², so O(m⁴)
3-D: b = m², n = m³, so O(m⁷)
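In MATLAB the banded structure is exploited automatically by sparse backslash. A small sketch of the 2-D case, where the natural grid ordering gives bandwidth m (the right-hand side is an assumed example):

% 2-D Poisson on an m x m grid; sparse solve exploits the O(b^2 n) banded cost (sketch).
m = 50;  n = m^2;  h = 1/(m+1);
I = speye(m);  e = ones(m,1);
T = spdiags([-e 2*e -e], -1:1, m, m);
A = (kron(I, T) + kron(T, I)) / h^2;     % 5-point Laplacian, bandwidth m
f = ones(n, 1);                          % assumed unit source
u = A\f;                                 % banded/sparse Gaussian elimination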
Preconditioning: Krylov Methods, Diagonal Preconditioners

Let A = D + A_nd (diagonal plus nondiagonal part) and apply GCR to the preconditioned system D^{-1} A x = D^{-1} b.

Krylov Methods: Convergence Analysis, Optimality of the GCR Polynomial

GCR picks the optimal residual polynomial, so any polynomial which satisfies the constraints (degree k+1, value 1 at zero) can be used to get an upper bound on ‖r^{k+1}‖ / ‖r^0‖.
Preconditioning: 1-D Discretized PDE Example

For the 1-D Poisson matrix, D^{-1}A is

[  1   -0.5    0    ...    0  ]
[ -0.5   1   -0.5   ...    0  ]
[        ⋱     ⋱     ⋱        ]
[  0    ...  -0.5    1   -0.5 ]
[  0    ...    0   -0.5    1  ]

Krylov iterates have the form x^{k+1} = x^0 + Σ_{j=0}^{k} α_j (D^{-1}A)^j r^0. Each multiplication by D^{-1}A spreads information by only one grid point, so if r^0 = [1 0 ... 0]^T (a source at one end), at least O(m) iterations are needed before x^{k+1} can even be nonzero across the whole grid.

(Plot: x_exact and b = r^0 for the 1-D example.)
Preconditioning: Eigenanalysis

Recall the eigenvalues of D^{-1}A are λ_k = 1 - cos( kπ/(m+1) ), k = 1, ..., m. Therefore

λ_max / λ_min = ( 1 - cos( mπ/(m+1) ) ) / ( 1 - cos( π/(m+1) ) ) = O(m²),

and the GCR iteration count

k = log(convergence tolerance) / log( ( √(λ_max/λ_min) - 1 ) / ( √(λ_max/λ_min) + 1 ) ) = O(m)

grows linearly with m, matching the communication bound. With O(m) iterations and O(n) work per iteration, the overall costs compare as:

            1-D      2-D      3-D
Banded GE   O(m)     O(m⁴)    O(m⁷)
GCR         O(m²)    O(m³)    O(m⁴)
Preconditioning Approaches: Gauss-Seidel Preconditioning, Physical View

A Gauss-Seidel sweep updates u_1^(new) from u_0 and u_2^(old), then u_2^(new) from u_1^(new) and u_3^(old), and so on up to u_n^(new): new values propagate down the line within a single sweep.

With preconditioner (D + L), the Krylov vectors are powers of (D + L)^{-1} A applied to r^0; the lower-triangular solve spreads information rapidly, but in only one direction along the line.

(Figure: nonzero pattern of the Krylov vectors for the 1-D discretized PDE; X = nonzero.)
Preconditioning Approaches: Symmetric Gauss-Seidel

Sweep forward (u_1^(new), ..., u_n^(new)), then backward (u_{n-1}^(newer), ..., u_1^(newer)), so information propagates in both directions.

As an iteration, symmetric Gauss-Seidel is

x^{k+1} = (D + U)^{-1} L (D + L)^{-1} U x^k + (D + U)^{-1} b - (D + U)^{-1} L (D + L)^{-1} b

or, equivalently, in preconditioned-residual form,

x^{k+1} = x^k - (D + U)^{-1} D (D + L)^{-1} ( A x^k - b ),

i.e. the preconditioner is M^{-1} = (D + U)^{-1} D (D + L)^{-1}.
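A sketch of applying this preconditioner inside a Krylov loop: given the splitting of A into D, L, U, the function below returns M^{-1} r using two sparse triangular solves (the test matrix is an assumed 1-D Poisson example):

% Symmetric Gauss-Seidel preconditioner apply: z = M\r with
% M = (D + L) inv(D) (D + U)  (sketch).
n = 100;  e = ones(n,1);
A = spdiags([-e 2*e -e], -1:1, n, n);      % 1-D Poisson test matrix
D  = spdiags(spdiags(A, 0), 0, n, n);
Lm = tril(A, -1);  Um = triu(A, 1);

applyM = @(r) (D + Um) \ ( D * ( (D + Lm) \ r ) );

r = rand(n,1);
z = applyM(r);                             % use z in place of r inside GCR/CG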
Preconditioning Approaches: Block Diagonal Preconditioners, Line Schemes

Precondition with the block diagonal of A, where each block is the tridiagonal matrix coupling one grid line of the m² unknowns (cheap to factor).

Problem: line preconditioners communicate rapidly in only one direction.
Solution: alternate, doing lines first in x, then in y.
Preconditioning Approaches: Block Diagonal Preconditioners, Domain Decomposition

Approach: partition the m x m grid (m² points) into subdomain blocks of l x l points and precondition with the block diagonal; the trade-off is block size against iteration count.

Block cost: factoring the l x l subdomain grids with sparse GE, O(m² l) per preconditioner application.
GCR iterations: the communication bound gives O(m / l) iterations.
This suggests insensitivity to l: the algorithm is O(m³).
Preconditioning Approaches: Overlapping Domain Preconditioners

(Figures: grid and matrix pictures for line preconditioners and for overlapping-domain preconditioners on the m² grid.)
Preconditioning Approaches: Incomplete Factorization Schemes

Outline
Sparse Matrices and Fill-In
Example
Sparse Matrices: Fill-In Example

[ X X X ]      eliminating column 1      [ X X X ]
[ X X 0 ]      updates the zeros:        [ X X F ]
[ X 0 X ]                                [ X F X ]

(X = nonzero, F = fill-in.) A fill-in appears wherever a row below the pivot and a column to the right of the pivot are both nonzero.

Second Example: Fill-Ins Propagate

A fill-in created at one elimination step acts as a nonzero in later steps, creating further fill-ins.
Sparse Matrices: Fill-In

Factoring can turn a very sparse matrix dense.

(Figures: nonzero patterns of an unfactored and a factored random matrix, and of the 3-D FD matrix for the m = 4 example.)
Preconditioning Approaches: Incomplete Factorization Schemes

Key idea: run Gaussian elimination but discard the fill-ins (keep only entries in the original sparsity pattern), and use the resulting approximate factors as the preconditioner.

Summary
3-D BVP Examples: Aerodynamics, Continuum Mechanics, Heat-Flow
Krylov Method: Communication Lower bound
Preconditioners based on improving communication
Outline
Integral Equation Methods
Exterior versus interior problems
Start with using point sources
Standard Solution Methods in 2-D
Galerkin Method
Collocation Method
Issues in 3-D
Panel Integration
SMA-HPC 2003 MIT
Interior vs. Exterior Problems

Temperature in a tank (interior): ∇²T = 0 inside, temperature known on the surface.
Heat flow (exterior): ∇²T = 0 outside, temperature known on the surface; κ is the thermal conductivity, and ∂T/∂n is the surface normal derivative.

Electrostatics example: a conductor held at potential v; ∇²ψ = 0 outside, ψ given on the surface; ε is the dielectric permittivity.

(Figures: discretized micromachined resonator structure, with computed forces shown in bottom and top views.)
Exterior Problems

The exact boundary condition is T → 0 at infinity, and only ∂T/∂n on the surface is needed, but a finite-difference or FEM approach computes T everywhere and must truncate the mesh: T(∞) = 0 becomes T(R) = 0 on an artificial outer surface.
Green's Function for Laplace's Equation

In 2-D: if u = log √( (x - x_0)² + (y - y_0)² ), then ∂²u/∂x² + ∂²u/∂y² = 0 for all (x, y) ≠ (x_0, y_0).

In 3-D: if u = 1 / √( (x - x_0)² + (y - y_0)² + (z - z_0)² ), then ∂²u/∂x² + ∂²u/∂y² + ∂²u/∂z² = 0 for all (x, y, z) ≠ (x_0, y_0, z_0).
Laplace's Equation in 2-D: Simple Idea

u is given on the surface, and we need ∂²u/∂x² + ∂²u/∂y² = 0 outside. Let

u = α log √( (x - x_0)² + (y - y_0)² )

with the source point (x_0, y_0) inside the surface: the PDE outside is then satisfied automatically. Problem solved? Only if one source can match the given boundary values.

More Points

Use n source points (x_1, y_1), ..., (x_n, y_n) inside the surface:

u = Σ_{i=1}^{n} α_i log √( (x - x_i)² + (y - y_i)² ) = Σ_{i=1}^{n} α_i G( x - x_i, y - y_i ).
Laplace's Equation in 2-D: Matching at Test Points

Pick n test points (x_{t1}, y_{t1}), ..., (x_{tn}, y_{tn}) on the surface and force u to match the given potential Ψ there:

[ G(x_{t1} - x_1, y_{t1} - y_1)  ...  G(x_{t1} - x_n, y_{t1} - y_n) ] [ α_1 ]   [ Ψ(x_{t1}, y_{t1}) ]
[              ⋮                  ⋱                ⋮                ] [  ⋮  ] = [         ⋮         ]
[ G(x_{tn} - x_1, y_{tn} - y_1)  ...  G(x_{tn} - x_n, y_{tn} - y_n) ] [ α_n ]   [ Ψ(x_{tn}, y_{tn}) ]

(Example plot: circle of radius r with a ring of sources, R = 10, n = 40.)
Laplace's Equation in 2-D: Integral Formulation

A limiting argument (sources smeared onto the surface) gives the integral equation

Ψ(x) = ∫_surface G(x, x') σ(x') dS',

where σ is an unknown surface source density. Represent σ with basis functions,

σ(x) = Σ_{i=1}^{n} σ_i φ_i(x),

for example by approximating a circle with n straight-line segments l_1, ..., l_n and taking φ_i = 1 on segment i and 0 elsewhere. Then

Ψ(x) ≈ ∫_{approx surface} G(x, x') Σ_{i=1}^{n} σ_i φ_i(x') dS' = Σ_{i=1}^{n} σ_i ∫_{line l_i} G(x, x') dS'.
Laplace's Equation in 2-D: Weighted Residuals

Define the residual

R(x) = Ψ(x) - ∫_{approx surface} G(x, x') Σ_{i=1}^{n} σ_i φ_i(x') dS'

and enforce ∫ ξ_i(x) R(x) dS = 0 for all test functions ξ_i.

Collocation: ξ_i(x) = δ(x - x_{ti}) (point-matching), so R(x_{ti}) = 0:

Ψ(x_{ti}) = Σ_{j=1}^{n} σ_j ∫_{approx surface} G(x_{ti}, x') φ_j(x') dS'
                          ( ≡ A_{i,j} )
Laplace's Equation in 2-D: Collocation

With the collocation points at the line centers,

A_{i,j} = ∫_{line j} G(x_{ti}, x') dS',

and the system is

[ A_{1,1} ... A_{1,n} ] [ σ_1 ]   [ Ψ(x_{t1}) ]
[    ⋮     ⋱     ⋮    ] [  ⋮  ] = [     ⋮     ]
[ A_{n,1} ... A_{n,n} ] [ σ_n ]   [ Ψ(x_{tn}) ]

Note the collocation matrix is generally not symmetric:

A_{1,2} = ∫_{line 2} G(x_{t1}, x') dS'  ≠  ∫_{line 1} G(x_{t2}, x') dS' = A_{2,1}.
Laplace's Equation in 2-D: Galerkin

Galerkin: ξ_i(x) = φ_i(x) (test = basis). Then

b_i = ∫_{approx surface} φ_i(x) Ψ(x) dS,
A_{i,j} = ∫_{approx surface} ∫_{approx surface} G(x, x') φ_i(x) φ_j(x') dS' dS,

and A σ = b. If G(x, x') = G(x', x), then A_{i,j} = A_{j,i}: A is symmetric.

With piecewise-constant basis functions on the segments,

b_i = ∫_{line i} Ψ(x) dS,    A_{i,j} = ∫_{line i} ∫_{line j} G(x, x') dS' dS.
3-D Laplace's Equation

Integral equation:

Ψ(x) = ∫_surface ( 1 / ‖x - x'‖ ) σ(x') dS'

Represent σ(x) ≈ Σ_{i=1}^{n} σ_i φ_i(x), with panel basis functions

φ_j(x) = 1 if x is on panel j, φ_j(x) = 0 otherwise.

Put collocation points x_{ci} at the panel centroids:

Ψ(x_{ci}) = Σ_{j=1}^{n} σ_j ∫_{panel j} G(x_{ci}, x') dS'   ( ≡ A_{i,j} ),

an n x n dense system A σ = Ψ.
3-D Laplace's Equation: Panel Integration

A_{i,j} = ∫_{panel j} ( 1 / ‖x_{ci} - x'‖ ) dS'

One-point quadrature approximation:

A_{i,j} ≈ (panel area) / ‖x_{ci} - x_{centroid j}‖

Four-point quadrature approximation:

A_{i,j} ≈ Σ_{k=1}^{4} ( 0.25 * panel area ) / ‖x_{ci} - x_{point k}‖
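A minimal MATLAB sketch assembling the collocation matrix with centroid (one-point) quadrature. The geometry here is an assumed flat square plate of m x m panels, and the self-terms use the analytic disk formula from the next slide rather than the singular one-point rule:

% Centroid-quadrature collocation matrix for the 3-D first-kind equation (sketch).
m = 10;  h = 1/m;                          % m x m panels on a unit square plate
[cx, cy] = meshgrid((0.5:m)*h);            % panel centroids
ctr = [cx(:), cy(:), zeros(m^2,1)];
n = m^2;  area = h^2;

A = zeros(n);
for i = 1:n
  for j = 1:n
    if i ~= j
      A(i,j) = area / norm(ctr(i,:) - ctr(j,:));   % one-point quadrature
    else
      R = sqrt(area/pi);                           % equal-area disk radius
      A(i,j) = 2*pi*R;                             % analytic disk self-term
    end
  end
end

Psi   = ones(n,1);                         % unit potential on the plate
sigma = A\Psi;                             % panel source densities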
3-D Laplace's Equation: Calculating the Self-Term

A_{i,i} = ∫_{panel i} ( 1 / ‖x_{ci} - x'‖ ) dS'

One-point quadrature fails: ‖x_{ci} - x_{ci}‖ = 0. But 1/‖x_{ci} - x'‖ is an integrable singularity.

Tricks of the trade: integrate in two pieces, a disk of radius R surrounding the collocation point plus the rest of the panel,

A_{i,i} = ∫_{disk} ( 1 / ‖x_{ci} - x'‖ ) dS' + ∫_{rest of panel} ( 1 / ‖x_{ci} - x'‖ ) dS'.

The disk integral has the singularity but also an analytic formula,

∫_{disk} ( 1 / ‖x_{ci} - x'‖ ) dS' = ∫_0^{2π} ∫_0^R (1/r) r dr dθ = 2πR,

and the rest-of-panel integrand is smooth, so standard quadrature applies.
3-D Laplace's Equation: Calculating the Self-Term, Other Tricks of the Trade

The integrand of A_{i,i} = ∫_{panel i} ( 1 / ‖x_{ci} - x'‖ ) dS' is singular, but for flat polygonal panels there are analytic formulas for the full integral, avoiding the disk decomposition entirely.
3-D Laplace's Equation: Galerkin (test = basis)

b_i = ∫_{panel i} Ψ(x) dS,
A_{i,j} = ∫_{panel i} ∫_{panel j} ( 1 / ‖x - x'‖ ) dS' dS,

giving a symmetric system A σ = b; the diagonal entries now involve a double singular integral over the same panel.
Summary
Integral Equation Methods
Exterior versus interior problems
Start with using point sources
Standard Solution Methods
Collocation Method
Galerkin Method
Next Time Fast Solvers
Use a Krylov-Subspace Iterative Method
Compute MV products Approximately
Outline
Solving Discretized Integral Equations
Using Krylov Subspace Methods
Fast Matrix-Vector Products
Multipole Algorithms
Multipole Representation.
Basic Hierarchy
Algorithmic Improvements
Local Expansions
Adaptive Algorithms
Computational Results
Exterior Dirichlet Problem (review)

A conductor held at potential v: ∇²ψ = 0 outside, ψ given on the surface, and

ψ(x) = ∫_surface G(x, x') σ(x') dS'
(potential)  (Green's function)  (charge density)

(Figures: discretized resonator structure, with computed forces shown in bottom and top views.)
Solving Discretized Integral Equations: GCR

GCR iteration:

compute A p^k
α_k = (r^k)ᵀ (A p^k) / ( (A p^k)ᵀ (A p^k) )
x^{k+1} = x^k + α_k p^k
r^{k+1} = r^k - α_k A p^k
p^{k+1} = r^{k+1} - Σ_{j=0}^{k} ( (A r^{k+1})ᵀ (A p^j) / ( (A p^j)ᵀ (A p^j) ) ) p^j

Complexity of GCR: the discretized integral-equation matrix is dense, so each matrix-vector product A p^k costs O(n²).

(Plot: solve cost vs. 1/(# panels).)
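A compact MATLAB sketch of this GCR loop; Afun is any matrix-vector product routine, which is exactly the hook the fast (multipole, precorrected-FFT) methods replace with an approximate O(n log n) product:

% GCR with a user-supplied matvec (sketch).
function x = gcr(Afun, b, tol, maxit)
  n = length(b);  x = zeros(n,1);  r = b;
  P = zeros(n,0);  AP = zeros(n,0);              % search directions and A*p
  for k = 1:maxit
    p = r;  Ap = Afun(p);
    for j = 1:size(P,2)                          % orthogonalize A*p against old ones
      beta = Ap' * AP(:,j);
      p  = p  - beta * P(:,j);
      Ap = Ap - beta * AP(:,j);
    end
    nrm = norm(Ap);  p = p/nrm;  Ap = Ap/nrm;    % normalize so (Ap)'(Ap) = 1
    alpha = r' * Ap;
    x = x + alpha * p;   r = r - alpha * Ap;
    P(:,end+1) = p;  AP(:,end+1) = Ap;
    if norm(r) < tol * norm(b), break; end
  end
end

For the dense panel system, x = gcr(@(p) A*p, Psi, 1e-6, 200); a fast solver swaps @(p) A*p for the approximate product.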
Summary
Solving Discretized Integral Equations
GCR plus Fast Matrix-Vector Products
Multipole Algorithms
Multipole Representation.
Basic Hierarchy
Algorithmic Improvements
Local Expansions
Adaptive Algorithms
Computational Results
Precorrected-FFT Algorithms
Molecular Dynamics
Nicolas Hadjiconstantinou

Molecular dynamics is a technique for computing the equilibrium and non-equilibrium properties of classical* many-body systems.

* The nuclear motion of the constituent particles obeys the laws of classical mechanics (Newton).

Moldy
A free and easy-to-use molecular dynamics simulation package can be found at the CCP5 program library (http://www.ccp5.ac.uk/librar.shtml), under the name Moldy. At this site a variety of very useful information as well as molecular simulation tools can be found.
Moldy is easy to use and comes with a very well written manual which can help as a reference. I expect the homework assignments to be completed using Moldy.
Why molecular dynamics? It allows us to study and understand material behavior so that we can model it, and it tells us what the answer is when we do not have models.

Example: Diffusion Equation

Consider fluxes F|_x and F|_{x+dx} through the faces of a control volume dx dy dz. Conservation of mass:

d/dt ( n dx dy dz ) = ( F|_x - F|_{x+dx} ) dy dz
                    ≈ ( F|_x - [ F|_x + (∂F/∂x)|_x dx + (∂²F/∂x²) dx²/2 + ... ] ) dy dz
                    = -(∂F/∂x) dx dy dz - (∂²F/∂x²) (dx²/2) dy dz

In the limit dx → 0,

∂n/∂t + ∂F/∂x = 0,

and with the constitutive law F = -D ∂n/∂x,

∂n/∂t = D ∂²n/∂x²   (the diffusion equation!)

Molecular dynamics can supply the transport coefficient D, and test the constitutive law itself, when no model is available.
Example: Ensemble Averages

Let Γ_i denote a microscopic state of the system. A macroscopic observable is an ensemble average,

⟨A⟩ = ∫ ρ(Γ) A(Γ) dΓ   (or, discretely, ⟨A⟩ = Σ_i ρ(Γ_i) A(Γ_i)),

where for an equilibrium system the distribution is the Boltzmann weight

ρ(E) ∝ exp( -E / kT ),

and k is Boltzmann's constant. For non-equilibrium systems, solving a problem reduces to the task of calculating ρ(Γ). Molecular methods are similar to experiments: rather than solving for ρ(Γ), we measure ⟨A⟩ directly.
Equations of Motion

Newton's equations: for i = 1, ..., N,

m_i d²r_i/dt² = F_i = -∇_{r_i} U( r_1, r_2, ..., r_N ),

where U(r_1, r_2, ..., r_N) is the potential energy of the system:

U = Σ_i U_1(r_i) + Σ_i Σ_{j>i} U_2(r_i, r_j) + Σ_i Σ_{j>i} Σ_{k>j} U_3(r_i, r_j, r_k) + ...

U_1(r_i): external field
U_2(r_i, r_j): pair interaction, U_2(r_ij) with r_ij = |r_i - r_j|
U_3(r_i, r_j, r_k): three-body interaction (expensive to calculate)

In practice the sum is usually truncated at an effective pair potential,

U ≈ Σ_i U_1(r_i) + Σ_i Σ_{j>i} U_2^eff(r_ij),

where U_2^eff includes some of the effects of the three-body interactions.
The Lennard-Jones Potential

U(r) = 4ε [ (σ/r)¹² - (σ/r)⁶ ]

The r⁻¹² term is a steep short-range repulsion; the attractive tail (~ -1/r⁶) dominates for r > 2^{1/6} σ, the location of the potential minimum.

(Figure: the Lennard-Jones potential U and force F as a function of separation r, with σ = ε = 1.)
Reduced Units

What is the evaporation temperature of a Lennard-Jones liquid? Questions like this are posed in reduced units:

Number density:  ρ* = ρ σ³
Temperature:     T* = kT / ε
Pressure:        P* = P σ³ / ε
Time:            t* = t √( ε / (m σ²) )
Integration Algorithms

An integration algorithm should, among other criteria, permit a long timestep. This is very important because, for a given total simulation time, the longer the timestep the smaller the number of force evaluation calls, and the force evaluation dominates the cost.
The Verlet Algorithm

Taylor expansions forward and backward in time,

r(t + Δt) = r(t) + Δt V(t) + (Δt²/2) a(t) + ...
r(t - Δt) = r(t) - Δt V(t) + (Δt²/2) a(t) - ...

sum to the position update

r(t + Δt) = 2 r(t) - r(t - Δt) + Δt² a(t).

ADVANTAGES
1) Very compact and simple to program
2) Excellent energy conservation properties (helped by time-reversibility)
3) Time reversible: the update is symmetric in r(t + Δt) and r(t - Δt)
4) Local error O(Δt⁴)

DISADVANTAGES
1) Awkward handling of velocities: V(t) = ( r(t + Δt) - r(t - Δt) ) / (2Δt)
   a) Need the r(t + Δt) solution before getting V(t)
   b) Error for velocities is O(Δt²)
2) Positions come from adding a small Δt² a(t) term to a difference of large, nearly equal quantities, which can lose precision.
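A minimal MATLAB sketch of the Verlet update for a single particle; the harmonic force below is an assumed stand-in for the pair-force sums of a real MD code:

% Position-Verlet integration for m*r'' = F(r) (sketch; harmonic test force).
F  = @(r) -r;                           % assumed force
m  = 1;  dt = 0.01;  nsteps = 1000;

r_old = 1.0;                            % r(t - dt), from r(0) = 1
r     = 1.0 - 0.5*dt^2/m;               % r(t), consistent with V(0) = 0
traj  = zeros(nsteps,1);
for s = 1:nsteps
  r_new = 2*r - r_old + dt^2 * F(r)/m;  % Verlet position update
  V     = (r_new - r_old) / (2*dt);     % velocity lags one step behind
  r_old = r;  r = r_new;
  traj(s) = r;
end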
The Beeman Algorithm

r(t + Δt) = r(t) + Δt V(t) + Δt² ( 4 a(t) - a(t - Δt) ) / 6
V(t + Δt) = V(t) + Δt ( 2 a(t + Δt) + 5 a(t) - a(t - Δt) ) / 6

The coordinates are equivalent to the Verlet algorithm; V is more accurate than Verlet.

Predictor-Corrector Methods

a) Predict positions and velocities.
b) Evaluate accelerations from the new positions and velocities (if forces are velocity dependent).
c) Correct the prediction.
d) Go to (a).

Although these methods can be very accurate, the nature of MD simulations is not well suited to them. The reason is that any prediction and correction which does not take into account the motion of the neighbors is unreliable.
Velocity Predictor-Corrector (Beeman Form)

a) Predict positions:       r(t + Δt) = r(t) + Δt V(t) + Δt² ( 4 a(t) - a(t - Δt) ) / 6
b) Predict velocities:      V^P(t + Δt) = V(t) + Δt ( 3 a(t) - a(t - Δt) ) / 2
c) Evaluate accelerations:  a(t + Δt) = (1/m) F( r(t + Δt), V^P(t + Δt) )
d) Correct velocities:      V^C(t + Δt) = V(t) + Δt ( 2 a(t + Δt) + 5 a(t) - a(t - Δt) ) / 6
e) Replace V^P with V^C and go to (c).

If there are no velocity-dependent forces this reduces to the Beeman method discussed above.
Periodic Boundary Conditions

Today's computers can easily treat N > 1000, so artifacts from small systems with periodic boundary conditions are limited. Equilibrium properties are unaffected, but long wavelengths are not possible. In today's implementations, each particle interacts with the closest images of the other molecules (the minimum-image convention).
Measuring Macroscopic Quantities

Density:               ρ = (1/V) Σ_{i=1}^{N} m_i
Pressure (virial):     P = (1/V) ⟨ Σ_i m V_i V_i + Σ_i Σ_{j>i} r_ij F_ij ⟩  (the pressure tensor; the scalar pressure is one third of its trace)
Macroscopic velocity:  u = Σ_i P_i / Σ_i m_i

Initial Conditions

We need initial conditions for the positions and velocities of all molecules in the system. Velocities can be drawn from the equilibrium distribution P(E) ∝ exp( -E / kT ), with kinetic energy E_k = Σ_i P_i² / (2 m_i).
Equilibration

Because systems are typically initialized in the incorrect state, potential energy is usually either converted into thermal energy or thermal energy is consumed as the system equilibrates. In order to stop the temperature from drifting, one of the following methods is used:

1) Velocity rescaling:

V_i ← √( T_d / T ) V_i

where T_d is the desired temperature and the current temperature is

T = ( 2 / (3 N k) ) Σ_{i=1}^{N} P_i² / (2 m_i).

This is the simplest and most crude way of controlling the temperature of the simulation.
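A MATLAB sketch of the rescaling step; the particle data here are random placeholders, and kB and Td are assumed reduced-unit values:

% Velocity rescaling thermostat step (sketch).
N  = 1000;  m = ones(N,1);  kB = 1;  Td = 1.2;   % assumed values
V  = randn(N,3);                          % placeholder velocities
Ek = 0.5 * sum( m .* sum(V.^2, 2) );      % total kinetic energy
T  = 2*Ek / (3*N*kB);                     % instantaneous temperature
V  = sqrt(Td / T) * V;                    % rescale toward desired Td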
Potential Truncation

To limit the force computation, the potential is cut off at radius r_c:

U_tr(r) = U(r) for r ≤ r_c,   0 for r > r_c

Not favored, because U_tr(r) is discontinuous at r_c and so does not conserve energy. This is fixed by the truncated and shifted potential:

U_tr-sh(r) = U(r) - U(r_c) for r ≤ r_c,   0 for r > r_c
Neighbor Bookkeeping: Verlet List and Cell-Index Method

Verlet list: keep a list of all pairs within a radius r_l > r_c, so that neighbor pairs need not be recalculated every timestep; the list is rebuilt only when a particle may have crossed the skin r_l - r_c.

(Figure: a particle with cutoff circle r_c inside list circle r_l, neighbors numbered.)

Cell-Index Method

Divide the simulation domain into m subcells in each direction (here m = 5) and search only the subcells within the cutoff (conservative). Example: if the subcell size is larger than the cutoff, then for cell 13 only cells 7, 8, 9, 12, 13, 14, 17, 18, 19 need to be searched.

The pair search then costs about N N_c with N_c = N / m² particles per cell, instead of the all-pairs count (1/2) N (N - 1).
Constraint Methods

2) Nosé-Hoover thermostat: augment the equations of motion,

dr_i/dt = P_i / m,    dP_i/dt = F_i - f P_i,

where the friction coefficient f is a dynamical variable given by

df/dt = ( k g / Q ) ( T - T_d ),

with g the number of degrees of freedom and Q a thermal inertia parameter.

3) Gaussian isokinetic constraint: require dT/dt ∝ d/dt Σ_i P_i² = 0 exactly, which gives

dr_i/dt = P_i / m,    dP_i/dt = F_i - ξ P_i,    ξ = ( Σ_{i=1}^{N} P_i · F_i ) / ( Σ_{i=1}^{N} P_i² ).
Homework Discussion

Use (and modify) the control file control.argon. You can run short familiarization runs with this file.
control.argon calls the argon.in system specification file. Increase the number of particles to at least 1000 (to reduce small-system effects and noise).
Note that density in control.argon is in units equivalent to g/cm³ (= 1000 kg/m³).
Reduce noise by increasing the number of particles and/or the averaging interval (average-interval).
Make sure that temperature does not drift significantly from the original temperature. You may want to rescale (scale-interval) more often and stop rescaling (scale-end) later. Make sure you are not sampling while rescaling!
By setting rdf-interval = 0 you disable the radial distribution function calculation, which makes the output file more readable.
#
# Sample control file. Different from the one provided with code.
#
sys-spec-file=argon.in
density=1.335          #density=1335 kg/m3
temperature=120        #initial temperature=120K
scale-interval=2       #rescale velocities every 2 timesteps
scale-end=5000         #stop rescaling velocities at timestep 5000
nsteps=25000           #run a total of 25000 timesteps
print-interval=500     #print results every 500 timesteps
roll-interval=500      #keep rolling average results every 500 timesteps
begin-average=5001     #begin averaging at timestep 5001
average-interval=10000 #average for 10000 timesteps
step=0.01              #timestep 0.01ps
subcell=2              #subcell size for linked-cell method
strict-cutoff=1        #calculate forces using strict O(N^2) algorithm
cutoff=10.2            #3 * sigma
begin-rdf=2501         #begin radial distribution function calculation at timestep 2501
rdf-interval=0
rdf-out=2500
1) This problem is intended to reinforce your understanding of the nodal analysis and node-branch equation formulation techniques.
Consider the following simple circuit.

(Circuit: current source Is driving resistors R1, R2, R3, with node 0 as ground.)

a) Apply nodal analysis to generate a linear system of equations which can be used to compute the circuit node voltages. Where appropriate, please give matrix or vector entries as analytical formulas in terms of R1, R2, R3 and Is.
b) Use the node-branch approach to form a linear system of equations which can be used to compute the circuit node voltages and resistor currents. Where appropriate, please give matrix or vector entries as analytical formulas in terms of R1, R2, R3 and Is.
2) In order to help you better understand how to create a program which reads in schematics and generates systems of equations, we have written a set of matlab functions and scripts which can be used to read in a file containing a circuit description and set up the associated nodal analysis matrix and the appropriate right-hand side. In this problem, you will modify these functions and scripts, so bear with us while we describe what they do.
The matlab script assumes that the circuit description is given as a list of statements in a file, and that the element types are described in the following format.

resistors:
rlabel  node1  node2  val
where label is an arbitrary label, node1 and node2 are integer circuit node numbers, and val is the resistance.

current sources:
ilabel  node1  node2  val
where label is an arbitrary label, node1 and node2 are integer circuit node numbers, and val is the source current.

voltage sources:
vlabel  node1  node2  val
where label is an arbitrary label, node1 and node2 are integer circuit node numbers, and val is the source voltage.

voltage-controlled current sources:
clabel  node1  node2  node3  node4  val
where label is an arbitrary label, node1, node2, node3, and node4 are integer circuit node numbers, and val is the controlled source's transconductance (denoted gm), which means that the current flowing from node1 to node2 is val times the voltage difference between node3 and node4 (i.e. i = gm (v3 - v4)). Note, ground is always node 0.

To use the supplied scripts to solve our supplied example circuit file, test.ckt, first start up matlab. When in matlab, type file = 'test.ckt' to specify the name of the input file. Then type readckt to run the script in the file readckt.m. This will read the file and put the information in arrays. Finally, type loadMatrix to run the function in loadMatrix.m; this will create the conductance matrix and the right-hand side associated with the circuit in test.ckt. To determine the vector of node voltages, type v = G\b.

The scripts we have provided have an implementation of nodal analysis for resistors, current sources and voltage-controlled current sources. Your job will be to extend the implementation to include voltage sources for the special case where one terminal of the voltage source is connected to ground. To accomplish this, you will need to modify only the file loadMatrix.m.
You could, of course, switch the simulator so that it uses node-branch analysis. However, for networks of two-terminal positive resistors, the conductance matrix has a structure that is advantageous for numerical calculations (positive diagonals, negative off-diagonals, symmetry, diagonal dominance). Assuming that one terminal of the voltage source is connected to ground, it is possible to implement voltage sources in your Matlab simulator in such a way that you will still preserve the above properties of the conductance matrix. Please implement such a scheme in your Matlab simulator. Only implement what is needed for resistors, current sources, and grounded voltage sources; do not bother with voltage-controlled current sources. Be sure to test your modified simulator; you will need a working version for problem (3).

Some Helpful Notes.
It will be necessary to make contributions to the right-hand side (RHS) vector for each resistor that is connected to a voltage source, and the matrix will require modifications as well. In calculating these modifications, you will find the array sourcenodes, generated by readckt, helpful.
To make the simulator easier to debug, the node numbers in input files correspond to the row numbers in the generated conductance matrix. If the input file node numbers are not contiguous, or if voltage source nodes are not last, there will be rows in the resulting G matrix with only a single one on the diagonal.
3) In this problem we will examine the heat conducting bar basic example, but will consider the case of a "leaky" bar, to give you practice developing a numerical technique for a new physical problem.
With an appropriate input file, the simulator you developed in problem 2 can be used to solve numerically the one-dimensional Poisson equation with arbitrary boundary conditions. The Poisson equation can be used to determine the steady-state temperature distribution in a heat-conducting bar, as in

∂²T(x)/∂x² = -H(x)/κ_m + (κ_a/κ_m) ( T(x) - T_0 )        (1)

where T(x) is the temperature at a point in space x, H(x) is the heat generated at x, κ_m is the thermal conductivity along the metal bar, and κ_a is the thermal conductivity from the bar to the surrounding air. The temperature T_0 is the surrounding air temperature. The ratio κ_a/κ_m will be small, as heat moves much more easily along the bar than dissipates from the bar into the surrounding air.
Now suppose one is trying to decide if it is necessary to have heat sink connections (a heat sink is usually just a large chunk of metal which dissipates heat rapidly enough to stay close to room temperature) at the ends of an electronic package. You can use simulation to help you make this decision.

a) Use your Matlab simulator to numerically solve the above Poisson equation for T(x), x ∈ [0, 1], given H(x) = 50 for x ∈ [0, 1], κ_a = 0.001, and κ_m = 0.1. In addition, assume the ambient air temperature is T_0 = 350, and T(0) = 300 and T(1) = 300. The boundary conditions at x = 0 and x = 1 model heat sink connections to a cool metal cabinet at both ends of the package. That is, it is assumed that the heat sink connections will ensure both ends of the package are fixed at near room temperature. In your numerical calculation, how did you pick Δx? How do you know your solution is accurate?
b) Now use your simulator to numerically solve the above equation for T(x), x ∈ [0, 1], given H(x) = 50 for x ∈ [0, 1], κ_a = 0.001, and κ_m = 0.1. In addition, assume the ambient air temperature is T_0 = 350, and that T(0) and T(1) are unknown but ∂T/∂x(0) = 0 and ∂T/∂x(1) = 0. The zero heat flow boundary conditions at x = 0 and x = 1 imply that there are no heat sinks at either end of the package. Compare your results to part (a).
c) For the case examined in part (b) above, what will happen if κ_a is identically zero? Can you solve this problem? Can you find a reasonable solution by examining what happens as κ_a approaches zero?
4) This last problem uses the simulator you developed in the previous question to remind you about some of the properties of eigenvalues and eigenvectors before we use them in lecture. Be sure to familiarize yourself with MATLAB's eig command before you start.
a) Consider the leaky bar system in question 3, assume T_0 = T(0) = T(1) = 0, and use the thermal conductivity values in part 3a). For this system, find a heat distribution H(x) such that max_x H(x) = 1 and, when you solve for the temperature,

T(x) = λ H(x)

where λ is a real number. You need only plot H(x) and T(x); you do not need an analytical formula.
b) For the problem in part 4a, how many different H(x)'s and λ's are there?
c) Suppose κ_a = 0, but all the other settings in part 3a) hold. How do the H(x)'s and λ's which satisfy T(x) = λ H(x) change? Please explain your results.
!#"$%&('*)
+-,+./.01
23/45
"6798;:=<> @?A
/4B5C0!D
E 1GFIHKJ*L!MONQPR!STNUJVMXWH9YZS[YQMO\X\$]TW^`_+abcSGJVL!S>d3e-fga\XhWb*MiJVL!^jMO^`_!\XST^`STH7JkaDJ*MOWHlNZMXHnm*o5prqsOtuaH!v=wxo5prqsOtzy
1|{}L+aDJ~MON-J*L!Sv!MXSTbcSTH!]SSrJIYSSTHJ*L!S[JIYWa\OhWb*MiJVL!^`NkC{}L!MO]kL9MON#SJcJVSb|aDH!vKYQLl
4 1Z&R!_!_WNcSWR>YSb*S-JVbU5MOH!h~JVWhSTH!SbVaDJ*S#aHS!aD^_!\XSWbYQL!MX]kLzWH!SZW!JVLlSh]ba\XhWb*MiJVL!^`N
]TWHSTbchSTH!]STN
B
aNUJVSTb#J*L+aHJVL!SWJ*L!STby{}L7CYZWR!\Xv9WRaWMOvR!NcMOH!hazNc5^`^SrJVb*MX]GJVSNcJ~Srla^`_!\OS
&1B\OSaDN*S[+H!vaH9Srla^`_!\OSGWb-YQLlMO]kL9WH!S>d>e-fa\OhWb*MXJ*L!^j]TWH7Sb*hSN~aDNcJVSb#JVL+aHJVLlS3WDJVL!SbTy
, 1&R!_!_WN*SzJ*L!S`d>e-fua\OhWb*MiJVL!^MONGa_!_!\XMOSTv;J*WN*W\i&MXH!h$YQL!STbcS
MXNGNc5^`^SrJVb*MX]zaH!v%L+aDNa\O\
!RlJQJIYZWWMXJ*N#STMXhSTH7Da\ORlSTN#MOHKJ*L!SMOH7JVSbcDa\l&kDlJVLlS3WDJVL!SbQJIYZW`STMOhSTH7Da\OR!SNQSMOH!hAaH!vly
1n&L!WYJ*L+aDJJVL!S%bcSTN*MXv!R+a\[_lb*W5v!R!]TSv7JVLlS=J*L!MObcvMiJVSbVaDJ*MOWHW3JVLlSd3e-fa\XhWb*MiJVL!^
MOHlSTPR+a\XMXJI
D
D
NVaJVMONU+STNJVLlS
~}
MOH7JJ*L!MOHlKaWRlJQ]TWH!NcJ*b*R!]rJVMOHlhaJVLlMOb*v5IWbcv!STb#_W\X5H!W^MOa\WDJ*L!S[Wb*^
Z
V/
4B1{}L+aJQMON#JVL!SSTNUJ|WR!Hlv9WR]TaH+H!vWb
D5
YQMXJ*LJVLlS[aWSGSTMXhSTHl_!b*W_STbUJVMXSTNTySGN*Rlb*SJ*W_!MO]k`aHSrla^`_!\OS~JVL+aDJL+aN#a>N*Rl`]TMXSTH7J#HR!^STbZWv!MXNcJVMXH!]J
STMXhSTH7Da\XR!STN/aH!v9_!\OWJr aN~azRlH!]J*MOWHWWHaz\OWhDI\XMOH!STabZ_l\OWJ-J*W`STbcMXCWR!bQWR!H!v!Ny
IV
01|R!^`STbcMO]aD\NcMO^R!\OaDJVMXWHWDZW5]TSTaHn+WYv&&H!a^MX]TN[MXNGR!N*Sv%MOH;YZSaDJ*L!STb_!bcSTv!MX]JVMXWH%aH!vnMXHnb*SN*SaDb*]kL=WH
h\XW+a\Y-abc^MXH!h!y%#L!S]TW^^`WH!\i%R!N*SvNcMO^R!\OaDJVMXWH@J*ST]kL!HlMOPR!STN`abcSK+aNcSTvWHb*S_STaDJVSvNcW\OR&JVMOWHW|a
WMON*NcWHSPR+aDJVMXWHAWbL5v!bcWNcJVaDJVMX]Q_!b*SN*N*Rlb*STNyZFJMXNPR!MiJVSQ]TW^^`WHzJVW3]TW^`_!RlJ*SQL5v!bcWNcJVaDJVMX]|_lb*STNcN*R!bcSTNZ
N*W\X5MOH!h[J*L!SQWMXN*NcWHSPR+aDJVMXWHAWH`a[v!W^aMOHJ*L+aDJ]TWSTb*NJVLlS~SH7JVMObcS#-JV\OaH7JVMO]#W5]TSTaH$7bcSTN*Rl\XJVMXH!hMOHz^`MO\X\OMOWH!N
WR!Hl5HlWYQHlNTyg#L!SWRlH!v+abU]WH!v!MiJVMXWH!NAWb`J*L+aDJWMXN*NcWHSTPR+aDJ*MOWHJVL!SHv!ST_STHlvWHJ*L!SNcST\OS]J*STv
_!L75N*MX]a\Z^W5v!ST\yFIHJ*L!MON3_!bcW!\OS^WRYQMO\X\BSrla^`MOH!SAN*W\X5MOH!havlMON*]b*SrJVMOSTvJIYZWDv!MO^`STH!NcMOWH!a\WMXN*NcWH
STPR+aJVMOWH@_!bcW!\XST^aNcN*W5]TMOaDJVSv=YQMiJVL%^W5v!S\OMOHlhCJVL!SAW5]TSaDH$aDH!v@Sla^MXH!SJVLlSMXHJ*STb*a]J*MOWH!N[SJIYZSTSHJVLlS
]kL!WMX]TSW_!L75N*MO]Ta\^W5v!S\ L!SH!]TS3WR!H!v+aDbcK]WH!v!MiJVMOWH!NkaDH!v9J*L!SbVaDJ*SWBd3e-f]WH7STbchSTHl]TSy
#L!S#W5]TSTaHMONaHR!HR!NcR+a\!vlW^aDMOH3S]aR!NcS#MXJMON^R!]kLYQMOvlSTbJ*L+aHvlSTST_6yW[^`W5v!ST\7JVLlMONT]TWH!NcMOv!SbN*W\X5MOH!h
azJIYWDv!MO^`STHlN*MOWH+a\$WMONcN*WH9STPR+aJVMOWH$
Vl
Vl
( Vlr
YQL!Sb*S>%l DMON#J*L!S>LlWb*MXTWH7JkaD\
_WNcMXJ*MOWH$lKl MON#JVL!SSbcJ*MO]aD\
_WNcMXJ*MOWHWbQv!S_lJVL$aH!vA V!
zg
l
Wb!Wb-JVL!S_+aDbcJV\iKGMXb*MO]kLl\OSJ^W5v!S\
V!B(
V!
Vl
A(&
l
1{@SL+aSYQb*MXJcJVSHna^aDJV\Oa9RlH!]J*MOWH+Omr5!D9|pkw6 jYQL!MO]kL;YQMX\O\hSTHlSTbVaJVSJ*L!SNc_+ab*NcS
^aDJVbcMO]TSNGaNcN*W5]TMOaDJVSvYQMXJ*L;v!MONc]TbcSJVMXTMXH!hWJVL;J*L!S_lR!b*Sz~STR!^aH!H%aDH!v;_+abUJV\iGMXb*MX]kL!\OSrJ|_lb*W!\XST^`NTy|#L!S
N*]b*MX_lJ3MONN*R!]kL=JVL+aJ3J*L+aDJv!MXN*]b*SJ*MOTaDJVMXWH%]aH=S`avcR!NcJ*STv$YQMXJ*L
aH!v@;lT@
YQL!Sb*S
aH!v
abcSMXH7JVSThSTb*NydSH!STb*aDJVSAJVL!S`^aDJVbcMO]TSN3WbJ*L!S]TaN*S`
l=
aDH!v
ly
|WDJVSBYQL!SHb*STPR!SNcJ*STvJVW;hSH!STb*aDJVSCJ*L!SC^aDJVbcMi@aNcN*W5]TMOaDJVSvYQMXJVLJ*L!SC_!Rlb*ST\i~STR!^aH!H_lb*W!\XST^9JVLlS
N*]b*MX_lJ-+Omr5GhSTH!SbVaDJ*STN~az^aJVb*M`YQL!MO]kLKWbc]TSTN#J*L!S[_WDJVSTH7J*Mxa\$MXHKWH!S[]Wb*H!Sb#WJVL!S[_!bcW!\XST^vlW^aDMOHJ*W
S3STbcW!yB{}LMXN-JVL+aJ
4B1#L!S#^aJVb*MX]TSNhSTHlSTbVaJVSTvAMXH_+abUJ a7ab*S#NU5^^`SJ*b*MO]aH!vz]TaHzS#NcW\XSTvYQMiJVL`d3e-fR!NcMOH!haJ*b*R!H!]TaDJVSv
+a]kWbcJ*L!WhWH+a\OMXaDJ*MOWH6yFI^`_!\XST^`STH7JZJ*L!SJ*b*R!H!]TaDJVSvK+a]kWbcJ*L!WhWH!a\OMXaDJ*MOWHzMXHJVL!Swxo5prqsOtb*WRlJVMXH!SaDH!v
]TW^_+aDb*SZJ*L!SJ*MO^`SZbcSTPR!MObcSTvJ*WG]aDb*bc>WRlJ d>e-fMXJ*STbVaJVMOWH!NWHJVLlS-hSTHlSTbVaJVSTvz^aDJ*b*MO]STNYQMXJ*L]TW^`_!\XSJVS
aH!vJVbcR!H!]TaDJVSv+a]kWbcJ*L!WhWH!a\OMXaDJ*MOWH!N cR!NUJQR!N*SWR!b#Y-aDJ*]kLJ*WAJ*MO^`Sy
&1[|_!_!\iKWR!b~^`W&v!Mi+STvd3e-fa\OhWb*MXJ*L!^
N*SrJJVLlS3J*W\OSbVaH!]S3J*W
55-JVW`N*W\i&MXH!hAJVL!S>v!MONc]Tb*SrJVMXTSTv_!R!bcS
|SR!^aH!HnaHlv;_+abUJV\X9GMXb*MX]kL!\OSrJQ_!b*Wl\OST^`NTR!NcMOH!haH7b*aH!v!W^ST]rJVWbWby|\XWJ [
aNaRlH!]J*MOWHnWZWH%a\OWhI\OMXH!SaDb~_l\OWJ|WbWJVLn_lb*W!\XST^`NTy
WYYZWR!\Xv;WR=aH+aD\X5TS3YQLJVL!S_!abcJ*\X
GMObcMO]kL!\XSJ#_lb*W!\XST^ ]TWH7Sb*hSN|aNUJVSbk
1
_&JVMOWH+a\e-aHWR9+HlvaAhW5W&vK_!bcST]WH!v!MiJVMOWH!STb-Wb-J*L!S_+abUJV\XCGMObcMO]kL!\XSJ-_lb*W!\XST^
!#"$%&('*)
+-,+./.01
2345&
"$687:9<;= >@?A
/4BDC.!E
FHGJILKMON#P!QSRUT!V*GW!XYKZ[R\KI]P+^JRUG_!Xa`G_bKcT!V*GW!XYKZedQY_DGV*f!KVgILGihQSjKkKjKV\`G_!KA^lXYQaI*I*XYKmILQYZAKmILGino^JILnpP%q!T$r
stImnoGjKoV*RmG_bXS`uILP!KvW+^JR*QYnR]GwBILPbKvG_!Kyxtf!QYZAKo_bR*QYG_+^XFHKz#I*G_%Z{KyILP!G|f$r]}@Kcz~QYXYXnoGjKoVmZcq!XaILQYfbQYZAKo_!R\QYG_+^JX
FHKyz#ILG_eZ{KyILP!G|f!R#QS_8I*P!Km_!Ky|IHT!V*GW!XYKZR*KyIr
C 1Us_8ILPbQYR~T!V*GW!XYKZedb`Gq8z~QYXYXz~V*QSI*K3^AFOKz#I*G_D^XShGV\QSILPbZw5GV#R\GXSj|QS_!hvILP!KmR\I*V*qbI*Rg^J_!fv\GQY_6ILR#T!V\GW!XSKoZQS_
b
ILPbK{+hqbV*KAWKXYGzmrlGq@z~QSXYXBq!R\K{I*P+^JIkFHKz#I*G_R\GXSjKoVmILGQS_6jKoRILQSh6^JILK{^eTbV*GW!XSKoZI*P+^JI=P+^RmIzGDnGV*V\KonyI
R*GXYqbI*QYG_!RrFOGILKJdJI*P!KTbV*GW!XSKoZQSR>R\`|Z{ZAKI*V*QSn^JWGq&II*P!Kx^&QYRodR\GgQSI>no^_kWKILV*Ko^JILKfc^R^~G_!Kyxtf!QYZAKo_bR*QYG_+^X
T!V\GW!XYKZer
x1, y1
(0,0)
(d,0)
+GV-ILPbQYR#T!V\GW!XYKZed!^R*R\q!Z{K33gBmQY_ILP!KmRILV*q&IHnG_!RILQSI*qbILQajKmKoq+^ILQYG_$r
1e+GVILP!KR\ILV\qbILRi^_!f@\GQS_I*RlK!^JZ{T!XSKD^WGjKJdOR\q!T!TGR*KDI*P!K<^TbT!XYQSKofw5GV*noKP!^Ri_!G_boKoV\G(^_!f
noGZ{TG_!K_I*RmR\G=`Gqn^_{_!GIq!R*KUR`|Z{ZAKI*V\`byr]QSjKOI*P!KU_!G|f+^X+w5GV\Zw5GVR\`|R\I*KoZGJw
_!G_!XSQY_!Ko^VKoq+^JI*QYG_bR
z~P!QSnpPn^J_uWKmq!R*KfeI*GAfbKILKV*ZAQY_!KUI*P!K]jJ^XYq!KR~Gw$-^_!fu|hQajKo_u+d/=^_bf>Jr
4 1O&q!TbTGR\KHzKg_!Gz^JR*R*qbZ{K~ILP+^II*P!Kg^JT!T!XYQSKofw5GV*nKgQSRG_!XS`A^JnILQS_!hmQY_AILP!KO_!Koh^JILQajKOvf!QYV\KonI*QYG_rB]QSjK
B
ILPbK3G_!KUf!QSZ{K_!R*QSG_+^X_!G_!XSQY_!Ko^VK|q!^JILQSG_iw5GV-W6`lKoXYQSZ{QS_+^JI*QY_!hmILP!KU$jJ^V\Q^W!XSKgq!R\QY_!hvI*P!KUw^nIILP+^JI-W`
R\`|ZAZ{KyILV`iILP!KmP!GV*QYG_6Ip^Xw5GV*nKoR~Zcq!RIHWKmQY_8W+^XY^_!nKr
&1&qbT!TGR*K%[ uQY_:ILP!K{Kyb^ZAT!XYK{^WGjKd^_bf^R\R*q!ZAKQYR=^nI*QY_!heQY_:ILP!K{_!Koh6^ILQSjKl%f!QSV*KonyILQSG_@R\G
ILP!^JIm`Gq@n^_:q!R*KILPbK{G_bKyxfbQYZAKo_!R\QYG_+^JXKoq+^ILQYG_@f!KV*QajKof@QY_%T+^VIlW/rA}@V*QaILK^8Zl^JI*X^W<R\noV\QYTbImILG8q!R\K
FHKyz#ILG_RZ{KyILP!G|fvILG]no^XYnq!X^JI*KILPbK\GQY_6IBf!Ky+KonyILQYG_l^R^Uw5q!_bnILQSG_GwW6`vR\IL^V\I*QY_!hUz~QaILPAU(&pO(
^_!f<I*P!Ko_%QY_bnoV*Ko^R*QS_!hiILPbKvZl^h_!QSI*q!f!KkGJwI*P!K^T!TbXYQYKfw5GV\noKvQY_KoQYhPIUQY_!nV*KoZAKo_6I*R]GwA>kgb6&r=HR\K
^RI*P!KgQS_!QSI*Q^Xbhq!KoR\Rw5GVFHKz#I*G_$R-ZAKILPbG&fAILP!KHR*GXSqbILQSG_lGwILP!KUTbV*Kj|QSGq!RXSG6^f$r+GVI*P!KU(=R*GXSjKJdbq!R\K
r
ILPbK3QS_!QSI*Q^Xhq!KR*RO|B
&1mFOGzI*P+^JIO`Gq<_!GzI*P!KkR\GXYqbI*QYG_uw5GVHmHbSDZAK^J_!QY_!hAILP!K3w5GV*nKkQSRg^nyILQS_!h{fbGz~_6z^V*f+ydzGV*
W+^np6z^V*fbRgW6`V*Kf!q!nQY_!hvILPbK3Z{^h_bQSILqbf!KUGwILP!Km^JT!T!XYQSKofiw5GV\noKmQY_KQYhP6I~RILKoTbRodO&6|r]_bnoK3^h6^QS_$d
ILPbKkQY_bQSILQY^X
]q!KR*RHw5GV~I*P!KcFOKz#I*G_QaILKoV*^JILQSG_u^JIHK^npPDXSG6^fuR\I*KoTR\P!Gq!XSfWK3I*P!KkR\GXYq&ILQYG_uGJwILP!K3T!V\Kj|QYGqbR
XYG^f8R\ILKT$r
1H}P6`lQYRI*P!K]R*GXSqbILQSG_l`GqefbKILKV*ZAQY_!Kfiw5GV#>Obd|ILP!K]XY^R\I#XSG6^fiRILKT$d+GwT+^VI3nf!QSKoV\Ko_6I#w5V*GZILPbK
R*GXYqbI*QYG_`GqenGZ{TbqbILKfuw5GV-I*P!K3Obd!ILPbKm!V*R\IOXSG6^fR\I*KoT$d^JI#I*P!KmWKhQY_b_!QY_!hvGJwT+^JV\Ikf/\
!#"$%&('*)
+-,+./.01
23/45
"687:9<;3 =?>@
4AB,6C
DFEGH6IKJMLHON6PHRQ$HG8PGNSGTH&UVH!NSLXW#Y[Z]\A^_PB`ASE6abLH&c
PH5Gd^PBeAEG8UAfgH:fAe
GT^_L-GT\gHh$i+GT\kjgabflG8^G
\bN6P]PHm
H&STN6L#`ASE6abLHOcnP5Y
o 1qpVr!sutdvxw5y!sz|{|tO}Ky!{~{|!&{|wt!~wr+}suyd+{uz|s~-}ws*t!syF]yO*{y!{6O~*{|wtsu+{|wt!~u!s!{|z{|&
*{!vTt+z5~{|~-5!{|uz|zBzsy!~-*wr!sXwzz|wk{|t!}Kt!wt!z{|t!su#w{|~*~wtdsu+T*{|wt
&
&
T
k{*rr!sAw!tOy+uwt!yO{{wt!~
x+
+
@
Tt!y
O
R
kd~*{|v@!zsM+t!{*sy!{susut!s@~r!suv@s{~3O~*suy?wd~*wzsr!s@t!wt!z{|t!su[w{~*~wts5!T{wt?wttktOw&y!s
}{|y$O*r!sy!{|~usss5!T{wt!~q*s
kr!s*sAsAk{z|zwt!~{|y!s$r!sA{t*suTz
Xw#
t!y
M
M
T
&
T
T
M
wTs
t!w
r!s8t!w5y!su~
t!y
*s]t!wT{|t!z|!y!sy{|trOs
{|wt$!
!
]
!*r!
*w!}r]
r!sw!t!y!]w
t!y!{{wt!~
y!{~*us*M
{|
O
k*
TrO
suk sutsq
A*wsk*r+TA*r!squw!{tT~*~*w5u{|Tsyk{rM*r!skwsky!{|~us{usy@s+T{wt!~A{~At!wt!~{|t!}Oz*su}*y!zsu~~
*r!sTz|Osu~Xwr!s
w
~ pVr!T#y!w5su~*r!{|~{|v@!zwO-y+Tvxsuy8qs#wt]vxsr!w5y!~-T!!z|{suy@w~*wz5{|t!}
rO{|~k!*w!z|sv
wzsKr!sTwsKsu+T*{|wt!~k{r%xvM!z*{|y!{vxst!~*{wt+z$s#*wt ~vxsr!w5y<!~{|t!}]Bsuw]{|t!{{Tz
}!su~~
X wrOs
~ut!y
@ suv@wt!~*Tsr!Tw!!w}*vr!{ssu~K+Ty!T*{|uwtsu}sut!sATt!y
s
*r!st!vM
suw
s#*wt]{*su*T{wt!~*su!{*syBwM{t!~*!swt!s+{|t
y!ssuvx{t!
u!*{|tx*r!s~wz|&
{wt]Xw-wxu~*s~ubkr!sut
Tt!ydkrOsut pVr+Tqr+T!st!~#kr!sut
!~{|t!}xy+v@syRs#wt<*wB~*wzs[Xw*r!s
~kr!st
ww!}rOz8r!wz}s3 ut
q
u!*@{txXssur+t
*su~{|y!!zsTz|+{|wt!~ s~*O*s*wMuw!t
w]uwv@!O*s
w3wt!s!{|t
!r*wRy+v@r!s]t!s# wty!sz
su Xw*vwRyOss*v@{|t!s]rOwvM
rOsBX!t!*{|wts Tz|+{|wt!~Kw
? w
vxt]z|{t!s~5~*suv~*wzO{wt!~sus3su!{*suy
k{|zz+t!wTw*suy6&!&-{|~{|tO~suyBwtOz
A ~*{t!}[-{vxOz|{|s~gr+
/wK
X/
*w5y!!*~u6
Xy
/{|~
*r!sut]vxy!suw!{t wv*{ Xsus
~suy]
!
Xw *vv/ *{s*w/O
!
r
s
@
v
s
*
!
r
&
w
X/
8!~*{t!}r!s[O!*w&{|vxT{wt
w =rOs@t!vxsxw!{|t&XsusMXwzz|wk~X*wvrOs@r+3wt!z%]wO{t!sw8uwvx!&s@
!
{|~t!susy!suy6
Oqt!wTwO{t!swwvxOOs
!
X
w-rOs[yO{|~**s{|suydt!wt!z{|t!su-w{|~~*wtdsu+{|wt8Xwv!w!zsuvwt!s+wvx!*srOs[wts*}st!us3T*s
~t!y+T*ys#wtwFr!s]wtsu}sutOusdT*sBw#rOsduwO{t&_X*ssv@srOw&y
wq
wRuwv@+sBuwtsu}s
t!yds&!z|{|t]w!k*s
su~+vTs[~*suv@{|zw}!zw#wq|
| |
su~*!~BXw-r!swxv@s*r!w5y!~k
/s tu +!~*s X/
~*Oz~ - w#!O*w~*s~wr!{~ks&s*{vx
t!y
w-*r!sK3-z}w{rOvF&~*~!v@s
kr!sus
{|~qr!s *su~{|y!!zgtOy
uwtsu}sut!s@krOsut
{|t!z|z!kr!st?T!!*w&{|vxT*{|t!}
T
rOs[vxT**{_suw#O*w5y!!u+!~*s
y!s!t!suyws
t!y
r+TK{|~lr!w
w u!sutwvxs@w!Muw!{t5X*ssxv@s*r!w5y:vxw5y!{5{t!}
uzw~*s#*w[suw3ut@w@vTsq
!~{|t!}~t!y+T*yxs#*wt ~
wnuzw~*s#*w[suw3ut@w@vTsq
/A
/
v@srOw&y
!!w~*sBwsB{|ts*s~sy{|tzw u!uz|!zT*{|wtO~ugTt!ywt!z?tOsusuy*wRuwtsu*}s8s#*wt ~
v@KsrOw&y<~wx*r+T
k{|zz$w%rO{|ssKrOsKy!s~*{|suyRTuuO{tFrO s
wqkr+Tz|Os3w
Xssu~X!tO{wt8sTz+T{wt!~
t!wTr!sO!*wTrw%5{|t!}%w?!!zs#wt ~xvxsr!w5yO~*{|tO}?wtOzX!tO{wtsTz|!T{wt!~{|~*w
y
w vx!&sr!s3w!{|td!!w&{vT*suz]+t!{sy!{6 s*sutOusu~
u
r+Tk{|~
X * #
kr!s*s
{|~-rOs[su*wkk{r
{|t]rOs
3sutdt!ydusuw~qsuz|~skr!s*s - w
suqs#wtF{sT*{|wtxwOz|yd ~*!r!t!{*syO{su*st!usv@s*r!w&y]su!{*s
vtBXOt!*{|wtdsTz+T{wt!~
0 1t*r!{|~q!w!z|svF!wk{z|z
y!s!!}BwOvxTz|8+~suyRqs#wt<~*wzsqXwqTz|uOzT*{|t!}rOs[Xwus3su!{
z|{!*{!vw~*{{|wtw w{t~M{tz|wy!suy~!*!*s wKw*susu{t!z%suzuwv@sx*w<k*{*sxw!wkt
s#wtF~*wzskXwv~u*Tr$+kr!{r!ssuqw8+t!yds ~{|s
r!s[!*w}v~*r!w!z|yd*suy]r!sXwzz|wk{|t!}3Xw*vxTk~k{t!!O
w{t*~u
kr!s*s
z|wy!suy
~**O*~u
kr!s*s
{|~#*r!ssuzT~{u{wrOs[~O
z|wy!~u
Aw<y!su{|y!s[wx!~*s3w!vxT*z8~**{O~qk{rFsX!zz]{t!~*s*suy<!O}~ur!st<wtF*r!sKw!~*sKsu<!}s
wFk{|zz+tOy~*sqw FTz|B+z|s~#kr!{|rd!~*s[qs#wt ~qv@srOw&y]w+tOyd*r!ssu!{|z{|!{|!vw~*{{wt]wl
~*swTl~*&~t!y w{|t~#~! ssyws&*sut+z
Xw*su~ r!sw&y!s!~su~qr!st!w5y+z
Tt+z5~{|~-XwvM!z{|wt$
~*wwFvxBk{|~rd*w@s5{|s *r+TkvxTs*{Tz6sXws[~ {t!}Mr!{~k!*wOz|suv
L
KJ
*
M
r sy!{su+zsu~XwrOsM~**O*~ w{t~w&y!sM*s
!
t!y
*suy!~{t
rOsx!tO~s*r!suy
{|t!{{Tz w{|tw~*{{|wt!~u
rOs@~OKwt!t!s*{5{ltOy?rOsxT!!z|{suyRXwusu~
Tt!y
~
5
u
~
u
s
!
r
s
*
3
s
!
r
u
s
[
s
u
s
|
z
u
s
~
k
O
r
|
{
F
r
k
{
|
z
z
}
{
s
u
z
!
u
s
~
k
~
@
w
k
+
r
T
|
{
#
~
k
w
O
t
}
s
~5~
t!y8s~ ~5~
DE&;0FC#)NO-
FU
U
w@s&!z|{|tBkrdtOydO]{
G
svxwt!~*T*s@*r+TKw!Msu~!z*{|t!}s#*wt ~Mv@srOw&y$g!!z{|suy:wFsrw#*r!ssOvxOz|sO*w!zsuv@~u
uw tsu}su~5!y!{|uz|z
V1
r!s*ssMxXs
wtFvxsr!w5y
s#
v@ws3s~sOvxOz|su~qr+k{|zz=!~s!*w!z|svx~sstRXwxuw*suz!w}*vxv@suy
W
8w!y!suO!}}syRs#wt&+~suyR~**O*~tOy w{t*~~wzsuwts!Tvx!zs[*su~
#y8
kr
t!
<w!y!suO!}}sys#wt5+~suy?~O~[tOy
pVrd*s*r!s[t!~su~yO{su*st
^]
\1
`_
FX ~5~
pVr+Tr+Ost!~
LU
Eb b b
a[ UU[
~
r!sB*y!{{wt+z s#*wtv@srOw&ywts~
{|t+z At*{su~M?vx{y!t!{}r *wsu
[s rOw&ydwBwtsu}swt*r!sMv@w~sOv@!zsu~u6t!yR{tFrOs
A*{u sKXwkrOsMs*~wt<kr!wButR}ss#wt ~v@
T~*s[wgM*{|sO{|t]r!sXss~{sT*{|wtO~
c
}{sutd
&
wTs{O*r!sr!t!}s{|tKzsut!}TrK{~~vTz|z_*r!s-wsXw*vsuy!!su~g*wrOsv@w&y!szwTz|*suyO{vx!zsuv@sutsy
r!s8w!kt!s!*w}vwt8~*wv@s
nsOv@!z|s~
s*{B5!y!{|wts*}st!us
ed
{b
kr+Tss~*rOsuv@su~wut*r!{|t!wTw:}sBqs#wt ~xvxsr!w5y*w?wts*}sFwtn~@vxtw
s
~ ~5~r!w!}r*su~
sKXws*tOsuy$r!w ssu
ssutRsKr+sut tOsuus~*~**{zF~*!
~5~~qwRut
ussuy!syk{*rdz|zwl*r!suv
|
}
!#"$%&('*)
+-,+./.01
2345&
"6798:<;= <>&?@A4B<C7D
C 1GFIHKJ*L!MONQP!RTSU!VXWZY\[S]K^_MXVOV!WG`!abYKMXH!W_VOS@caVEJ*R*]!HEcabJ*MOSHdWZRTR*SRe@NfJgaUEMOVOMhJI[iaHEjkcZSHlWZRTmWZH!cWon5SRpaHdWq`EaYdP!VOW
E
YdWJrLES&jkJ*L+abJ#jES&WN-H!SJscZSH7lWR*mWtlWZRT[K^sWVOVuQvsSH!NTMOj!WR-]!N*MXH!mwJ*L!Won5SVXVOS^_MOH!m3MOH7J*WZmR*abJrMXSHkYdWJ*L!S&jKJrSxNTSVXlW
n5SR-J*L!Wtyz5{T|-^_L!MOcgL}NrabJ*MONf~+WZN yz5{T|Q!yz5{T|Ge
I
y+yZ
(+y
^_L!WR*WybuQWR*WtyaPEP!R*S`&MOYKabJrWN-yz5{T|#abJ#J*MOYdWPSMOH7J#{Q(
a7|#WJ*WZRTYKMXH!WoJrLEW3VXS&cZaV6J*R*]!H!cZabJrMXSH9WZRTR*SR_SnJrLEMONdTVXWaP&n5R*SmiYKWqJrL!S@j$u
U/|-FIN#JrL!WYdWJ*L!S&jNfJgaU!VXW>MXVOVJ*L!WYKWqJrL!S@jcZSH7lWZR*mW
c |-BVXSJ_aH!j9cSYdP+aR*WJrL!WtcSYKPE]EJrWj>abH!jJrL!WtWG`EacJN*SVO]EJ*MOSHkn5SRsJrL!WocaNTWwk=e+aH!j9SH9J*L!WtMOH7JrWRTlbaV
{-EZbuQN*W e &&e!aH!j Ehu
j/|#
S&SkcabR*Wn5]EVOVX[abJ#[S]!R_P!VOSJ*NZe+abH!j>WG`&P!VaMXH9[S]!R_RTWZN*]EVXJrN_MXH}P!aRTJ_cbu
, 1FIH}J*L!MON-PER*SU!VXWZY[S]^_MOVXV6m7aMXH9P!RrabcJrMXcZW^_MhJrL9JrL!WJ*WZcgL!HEMO]!WZN_n5SR#WNTJrMXYkaJrMOHEmxVOS@caVJrRT]!H!caJrMOSH9WZR*RTSRZe
aH!j}aVXN*SxWG`EaYKMXH!Wo^_L+aJ_L+aP!PWZH!N-J*SxJrL!WabcZcZ]ERracq[SnaH9MXHJ*WZmR*abJrMXSHYdWJ*L!S&jK^_L!WZHJrL!WR*WaRTWpT]!YdP!N-MXH
JrLEWdPER*SU!VXWZYj!WZNTcZR*MXPEJrMXSH$NtH!SH!VXMOH!WZaRon5]!HEcJrMXSH$uvsSH!N*MXj!WZR]!NTMOH!m9JrLEWxn5SVOVXS^_MXH!mMXHJ*WZmR*abJrMXSH<YdWJ*L!S&jAJ*S
N*SVXlWtn5SR-J*L!Wtyz5{T|-^_L!MOcgL}NraJrMONf~+WZN yz5{T|Qz5yz5{T|T|qe
I
yy&
zETz5y|7qz5y@
7 z5y|*|*|
^_L!WR*WybuQSJ*WtJrL+abJ#yMON#MXHJ*WZH!jEWZjJrSdaP!P!RTS`EMXYkaJrWyz5{T|#abJ#J*MOYdWPSMOH7J#{Q
a7|QWJ*WZR*YdMOHEW-JrL!W_lbaVO]!WNSbn
EQaHEj7sNTS=JrL!abJQJrL!WVOS@caV&JrR*]EH!cabJ*MOSHdWZRTR*SRpSnJrL!WaUSlWYdWJ*L!S@jKMXNpSR*j!WR
! upH!W3aP!P!RTS7acgLn5SR#cZSYdP!]EJ*MOH!miJ*L!WcZS@WdcZMXWZH7JrNMON-J*SJ*WZNfJJ*L!Wtn5SR*Yw]!Vax]!NTMOH!miyz{T|p*{G*{ rZ{fu
U/|&]!P!PSN*Wtz5yz{T|*|QEyz5{T|GuQN*WJrL!WoaUSlWoYdWJrLES&j$e^_MXJ*LKJ*L!WocS&WqKcMOWHJ*Ns[S]j!WJ*WZRTYKMXH!WZjMXHAza7|qe
JrScSYKPE]EJrWxJ*L!WiN*SVX]EJrMXSH<n5SRoJ*L!WMOH7JrWRTlbaV{=(EZuvsSYdP+aRTWw[S]!RcZSYdP!]EJ*WZj%N*SVX]EJrMXSHJ*SJ*L!WWq`EacqJ
N*SVO]EJ*MOSH}abJ_{\n5SR \ b E b7 hXh usS^naNfJMON#JrL!W=aUSlW3YKWqJrL!S@j>cSH7lWR*mMXH!mKJ*S
JrLEW3WG`EacJN*SVX]EJrMXSH/
c|}S^NT]!P!PSN*Wz5yz5{T|T|
yz{T|n5SRyz5{T|
EaH!jz5yz5{T|T|
oEyz{T|}SJ*L!WZRf^_MON*WbuNTWJrLEW
aUSlW<YdWJ*L!S&j6e^_MXJ*LJ*L!WcS&WqKcMOWHJ*Nk[S]j!WJ*WZRTYKMXH!WZjMOHza7|qeJrScSYdP!]EJrWJrLEWN*SVO]EJ*MOSHn5SRJrL!W>MOH&
* *X!g* X
JrWRTlbaV6{sEZupvsSYdP+aR*W[S]!R#cSYdP!]EJrWj9N*SVX]EJrMXSHkJ*SJ*L!WWG`!abcJN*SVX]EJrMXSH9abJs{QdzE
|
uS^nabNTJMONJ*L!W-abUSlWsYKWqJrL!S@jwcSH7lWZRTmMOHEmoJrSoJ*L!WsWG`EacJBNTSVO]&JrMOSH/
01FIH<JrLEMONoPER*SU!VXWZY[S]<^_MOVXVcSYdP+aR*W3JrL!W=~+H!MXJ*Wqj!MXWZRTWZH!cWxaH!jN*L!S@SJrMXH!mkYdWJ*L!S@j!Nn5SRN*SVhl@MOH!mkakH!SH&
VOMXH!WabRkP!RTSU!VOWY>upS]N*LES]!VOjN*WZW<J*L+abJKJrL!W<JI^SYKWqJrL!S@j!NkaR*W<]!MhJrW<jEMXWZR*WH7JuvsSH!NTMOj!WRkJrLEW<N*cZaVaR
WZ]+aJrMOSH
yz5{T|Q(NTMOH!L_ycZSN {
&]!PEPSNTWxJrL!MXNoW@]!abJrMXSHAL+aNtaKPWR*MOS@j!MXc=NfJrWabjE[7INTJrabJrWiNTSVO]&JrMOSHSnQPWZRTMOS@j>e6MuWuheyz{T|_yz5{|
^_L!WR*WdMXNaH%MOH7JrWmWZRuiFIHJ*L!MONtPER*SU!VXWZY[S]%^_MOVOVWq`EaYdMOH!WxJI^S}j!MXWZRTWZH7J=aP!P!RTS7acgL!WN3n5SR~!H!j!MOHEmJ*L!MON
PWZR*MXS&jEMOcNTSVO]EJ*MOSH6u#L+aJMONZeB[S]^_MOVOV]!N*W9JI^SjEMXWZR*WH7JiJrWZcgLEH!MO]!WNdJ*S<~+H!ja<N*SVX]EJrMXSHJrS<J*L!W}aUSlW
WZ]+aJrMOSH$e+SH9JrL!WMXHJ*WZRflabV${-EZeE^_L!MXcgLabVON*SiN*abJrMXNT~+WNJ*L!WcZSHENTJrR*aMOH7J#J*L+abJ_yz7|Byzf|Gu
a7|d&WJk=e#aH!j]!N*WJrL!W~+H!MhJrWGIj!Mh6WR*WZHEcZW}YKWqJrL!S@jJ*SNTSVXlW}J*L!W<abUSlW}PWZR*MXS&jEMOc9P!RTSU!VOWY>uNTW
U+acg7^saR*j@IWZ]EVOWZRdn5SRiJrL!W9J*MOYdWqIjEMON*cR*WqJrMOZabJrMXSH$eQx{&E}N*SJrL!WR*W^_MOVXV-UW%H!S@j!WNKMXH[S]!R~+H!MhJrWq
j!Mh6WR*WZHEcZW3j!MONTcZR*WqJrMXabJ*MOSH$eEaH!j>aHMOH!MhJrMOaVm]!WZNTNyz{T|pu
U |#om7abMOH}]!N*MXH!mkU+abcg^saRTj&IW]!VOWRn5SR#JrLEW=JrMXYKWGIj!MXN*cR*WJ*MOZabJrMXSH$eE]EN*W=J*L!W3N*L!S@SJrMXH!mdYKWqJrL!S@j}J*SkSU&JgaMXH}yz{T|
/
n5SR_3upNTW3J*L!WMOH!MhJrMOaV/m]!WZNTNyz7|Qu
c|3S^VOWqJx(Eu}NTMOH!mJ*L!WkN*aYdWKj!MXN*cR*WJ*MOZabJrMXSHabH!jMOHEMXJrMOaVm]EWZN*NxaNwMXHP+aRfJ9za7|qeN*SVhlWKn5SR3yz{T|
]!NTMOH!m3JrL!W_~+H!MhJrWqj!MXWZRTWZH!cWYdWJ*L!S@j$uL+abJpL+abP!PWH!N^_LEWZHk[S]kJrRf[iJrS3]!N*WJrL!WN*LES&SJ*MOH!m3YdWJrLES&jdSHKJ*L!MON
P!RTSU!VOWY
j/|Rf[]EN*MOHEmJ*L!W>lbaVX]!W>Sbnoyz|in5S]EH!jU7[JrL!W~+H!MhJrWqj!MXWZRTWZH!cWYdWJ*L!S&jaNKaHMXH!MXJ*MaVm]!WZNTNKn5SRJrLEW
N*LES&SJ*MOH!mYdWJ*L!S&j6uQS^Yx]EcgL>cZaH}[S]}PWRTJr]ER*Uyz7|#aH!jNfJrMOVXV$L+alWJrLEW3NTL!S@SJrMXH!miYKWqJrL!S@jcZSH7lWZR*mW
!#"$%&('*)
+-,+./.01
23/4567
"$8:9<;%=> ?A@:&
7A4BDC.E'F+GHI 8JKLM
86+GGN7G1
CO1#PRQTS*U!VXWZY![F\]!^X_`ba\cedfVX^X^+]_gcOW*VXQOh>SiUO_kjTlmonpf`e\^X_Nqc!^srt[BuOa7Q+r`IVXqNWZY![F\h[irt`vu!_N`I\Q!WwSi[irSi_NuTVXQIqN^srtW*WNx
y#U!_3W*\z{SRd|r[F_>V}W#Y\WwSi_u~\Q:S*U!_qN\c![FW*_d|_]WFV}Si_tx
r8|W*_S*U!_`e\^XuOa`I\^X_qNc!^Xr[ZuOa7Q+r`IVXqNW|W*\z{SRd|r[*_S*\qr^}qNc!^XrtSi_SiU!_V}W*\S*U!_N[F`Trt^q\`eYO[*_NWFW*V}]!VX^}V}SRa\z?^}VXc!VXu
r[Fh\QrSBrS*_N`IY_[irtS*c![*_rtQ!urMuO_NQ!WFV}SRa>\z+!:xy#UO_-VXWF\SiU!_[*`er^8qN\`eY![F_NWFW*VX]OVX^XVoSRa
VXWfu!_H+Q!_uDrW
t
y#U!_3_H&Y_N[*V}`e_Q8Sr^$tr^}c!_\z
&7 > +x
rtS#S*U!_NWF_>q\Q!u!VoSiVX\Q!W#VXW t:
]/g\dWF_NQ!WFV}SiVo_rt[*_ka\c![[*_W*c!^oSiWgS*\SiU!_kY\S*_NQ8SiVXr^qNcOSwR\c!W*_u$?rQ!uDS*U!_c!WF_\zZrWFS*[*VXqHSqNc&SFR\t\[r
^XV}Q!_u&Rq_N^}^$^XV}WFSfrYOY![*\8rtqU$x
2. In this problem you will consider the difference between the local and global truncation error for
boundary-value problems.
a) Consider the equation

[...]

on the interval [0, 1], with boundary conditions [...] and [...]. Determine analytically
the exact solution.
b) Construct a finite-difference discretization scheme for computing the solution to the equation in
part (a) numerically. Compute the numerical solution to the equation using fixed spatial step sizes
of Δx = [...], [...], and [...].
c) Now consider the equation

[...]

d) Construct a finite-difference discretization scheme for computing the solution to the equation in
part (c) numerically. Compute the numerical solution to the equation using fixed spatial step sizes
of Δx = [...], [...], and [...].
e) Show that the local truncation error, that is, the residue when the exact solution is substituted
into the discrete equation, is the same for the problem in part (a) as discretized in part (b) and
for the problem in part (c) as discretized in part (d).
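As a template for parts (b) and (d), here is a minimal matlab sketch of a centered-difference discretization of a model problem u''(x) = f(x) on [0, 1]; the right-hand side f and the boundary values u0, u1 are stand-ins, not the problem's actual data.

% Centered differences for the model problem u''(x) = f(x) on [0,1],
% u(0) = u0, u(1) = u1; f, u0, u1 are stand-ins for the problem's data.
f  = @(x) sin(pi*x);  u0 = 0;  u1 = 0;
dx = 0.01;
x  = (dx:dx:1-dx)';                           % interior nodes
n  = numel(x);
e  = ones(n,1);
A  = spdiags([e -2*e e], -1:1, n, n) / dx^2;  % second-difference operator
b  = f(x);
b(1)   = b(1)   - u0/dx^2;                    % fold boundary values into b
b(end) = b(end) - u1/dx^2;
u  = A \ b;                                   % numerical solution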
3. For this problem, we developed some extensive matlab code to help you understand boundary-
element methods by having you use the method to determine the capacitance of a single conductor.
The problem is formulated as follows:

How much charge must we put on a conductor to raise its voltage from zero volts to one volt?

We can solve for the charge density on the conductor surface by solving the integral equation

ψ(x) = ∫_surface σ(x') / (4π ε₀ ‖x − x'‖) dS',

where ψ is the given conductor potential and σ is the surface charge
density. We have developed a set of matlab codes that use the collocation method to compute the
charge density. In this problem, you will examine the code and then convert the code to using a
Galerkin method. The matlab code includes several files listed below:

[...] This is the main routine which calculates capacitance.
[...] This file contains the routine which analytically computes the potential due to a single panel.
[...] This routine reads a set of panels describing the discretized geometry.
[...] This routine computes panel centroids.
[...] This routine sets up the collocation matrix.

We have also provided two examples: a square plate (files [...]) and a sphere (files [...]) with three successively refined
discretizations, as the file names imply. The plate is a unit square plate uniformly discretized into
[...], [...], and [...] panels. The sphere is a unit sphere discretized into a nonuniform collection of
[...], [...], and [...] panels. Once you have downloaded the files from the web, you can run an example by
typing [...], and you can then also examine the matrix. Be forewarned that the sphere
examples take a long time to run.
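Before reading the course code, it may help to see the shape of a collocation assembly. The following is a rough sketch on a toy geometry (a unit plate in the z = 0 plane, with ε₀ = 1), not the course's actual routines; the centroid self-term formula is exact for a flat square panel.

% Rough sketch of collocation-matrix assembly (not the course's code).
s = 8;  h = 1/s;                     % unit plate, s-by-s square panels
[cx, cy] = meshgrid(h/2:h:1-h/2);
centroids = [cx(:), cy(:), zeros(s*s,1)];
areas = h^2 * ones(s*s,1);
m = size(centroids,1);
P = zeros(m,m);
for i = 1:m                          % collocation point: centroid of panel i
  for j = 1:m                        % source panel j, constant charge density
    if i == j                        % exact centroid self-term, square panel
      P(i,j) = sqrt(areas(j)) * log(1+sqrt(2)) / pi;
    else
      r = norm(centroids(i,:) - centroids(j,:));
      P(i,j) = areas(j) / (4*pi*r);  % one-point quadrature for panel j
    end
  end
end
sigma = P \ ones(m,1);               % enforce unit potential at all centroids
C = areas' * sigma                   % capacitance = total charge (eps0 = 1)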
a) Examine the coefficient matrix produced by the collocation method, and explain why the matrix
is nearly symmetric for the plate examples, but far from symmetric for the sphere examples.
b) The capacitance of a unit plate is approximately 40.8 picofarads. How rapidly is the collocation
method converging to that answer?
c) Once you have examined the collocation code, you can then probably see how to convert the
code to using a Galerkin method. For this problem, we would like you to compare the accuracy
of the Galerkin method to the collocation method for a given number of panels, but only for the
square plate problem. We only want you to work with the square plate problem because in these
problems the panels are all small squares. To do the Galerkin calculation, you will
need to compute the integral of a potential over a square
panel. Do not try to come up with an analytic formula (though if you do, please let me know!).
There are many ways to approximately calculate the required integral. If you are stuck for an
approach, contact the TA for some help.
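One approximation in the spirit of that hint is to subdivide the test panel and sum point evaluations of the source panel's potential. The sketch below assumes flat square panels in the z = 0 plane and a one-point rule for the source panel; both assumptions are ours, not the problem's.

% Approximate the Galerkin inner integral: integrate, over test panel i,
% the potential of source panel j, using an s-by-s grid on panel i.
% ci, Li: centroid and side length of panel i; cj, areaj: data of panel j.
% Example call: galerkint([0 0 0], 0.125, [0.5 0 0], 0.125^2, 4)
function val = galerkint(ci, Li, cj, areaj, s)
  h = Li / s;
  val = 0;
  for p = 1:s
    for q = 1:s
      pt  = ci + [(p-0.5)*h - Li/2, (q-0.5)*h - Li/2, 0];
      val = val + h^2 * areaj / (4*pi*norm(pt - cj));
    end
  end
end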
d) Suppose the collocation or Galerkin matrix equations are being solved by Gaussian elimination.
Which will take longer for very large problems, forming the matrix or solving the matrix? Which
takes longer for these small problems?
e) If you are using GCR to solve the collocation or Galerkin matrix equations, which approach
would you prefer, Galerkin or collocation? In forming your Galerkin matrix, did you use a method
that guarantees the matrix will be symmetric?
f) Suppose the collocation or Galerkin matrix equations are being solved using a Krylov-subspace
method like GCR. Which will take longer for very large problems, forming the matrix or solving
the matrix? What additional information do you need to answer this question?
g) The square plate examples are discretized into grids of uniformly sized panels ([...],
[...], and [...] for the three examples above). Examine the coefficient matrix generated by
your Galerkin version of calccap for these examples, and notice how many of the matrix entries
have exactly the same value. As a function of the number of panels, how many unique values are
there in the matrices associated with the uniformly discretized square plates? Please give an
explanation for your results.
h) Make a special version of your Galerkinized calccap that only works for uniformly discretized
square plates, and use the observation you made in (g) to more efficiently compute the coefficient
matrix for this special set of problems.
i) Suppose you now try to solve the linear system for the charge vector by using the GCR algorithm.
Can you use the observation in (g) to substantially reduce the memory needed to solve uniformly
discretized square plate problems?
Problem 2.1
a) Assume, for the sake of simplicity, that all resistors in the line are of resistance R. The
structure of the N × N conductance matrix G is:
\[
G =
\begin{pmatrix}
\frac{2}{R} & -\frac{1}{R} & 0 & \cdots & 0 \\
-\frac{1}{R} & \frac{2}{R} & -\frac{1}{R} & \ddots & \vdots \\
0 & \ddots & \ddots & \ddots & 0 \\
\vdots & \ddots & -\frac{1}{R} & \frac{2}{R} & -\frac{1}{R} \\
0 & \cdots & 0 & -\frac{1}{R} & \frac{2}{R}
\end{pmatrix}.
\]
The matrix G is tridiagonal (i.e., a band matrix with one nonzero off-diagonal on each side of the main diagonal). By inspection,
the number of nonzero entries in G is N + 2(N − 1) = 3N − 2.
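This count is easy to confirm in matlab (R = 1 is assumed only for the sketch):

N = 10;  e = ones(N,1);
G = spdiags([-e 2*e -e], -1:1, N, N);   % tridiagonal conductance matrix
nnz(G)                                  % returns 3*N - 2 = 28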
b) The matrix problem for the resistor line, written in terms of the resistance matrix G⁻¹,
is G⁻¹ i = v, where i is the vector of current-source currents flowing into each of the nodes,
and v is the vector of node voltages. For our original resistor line, i is a zero vector.
Suppose now that the jth entry of the vector i is nonzero. Physically, an injection of
current into node j will cause a change in all the node voltages. The jth entry of vector
i multiplies only the jth column of G⁻¹. So a change in all the node voltages in v will
be algebraically possible only if the jth column of G⁻¹ consists of all nonzero entries, i.e.,
(G⁻¹)_{ij} ≠ 0 for all i.
By extending this argument to all entries of the current source vector (and all columns of
the resistance matrix), we see that the N × N resistance matrix G⁻¹ is full, i.e., will have
N² nonzero entries.
c) The factorization of the tridiagonal conductance matrix G produces two bidiagonal factors
L and U, such that LU = G. In order to see this, let's examine the first few elimination
steps for the matrix G.
After the first elimination step we get:
\[
G^{(1)} =
\begin{pmatrix}
\frac{2}{R} & -\frac{1}{R} & 0 & \cdots & 0 \\
0 & \frac{3}{2R} & -\frac{1}{R} & \ddots & \vdots \\
0 & -\frac{1}{R} & \frac{2}{R} & \ddots & 0 \\
\vdots & \ddots & \ddots & \ddots & -\frac{1}{R} \\
0 & \cdots & 0 & -\frac{1}{R} & \frac{2}{R}
\end{pmatrix},
\]

and after the second elimination step:

\[
G^{(2)} =
\begin{pmatrix}
\frac{2}{R} & -\frac{1}{R} & 0 & \cdots & \cdots & 0 \\
0 & \frac{3}{2R} & -\frac{1}{R} & \ddots & & \vdots \\
0 & 0 & \frac{4}{3R} & -\frac{1}{R} & \ddots & \vdots \\
\vdots & \ddots & -\frac{1}{R} & \frac{2}{R} & \ddots & 0 \\
\vdots & & \ddots & \ddots & \ddots & -\frac{1}{R} \\
0 & \cdots & \cdots & 0 & -\frac{1}{R} & \frac{2}{R}
\end{pmatrix}.
\]
Each elimination step targets only one row in the tridiagonal matrix G. In addition, the
triangular block of zeros in the upper-right corner of the matrix remains untouched. Thus,
after all N − 1 elimination steps, the L matrix will have ones on the main diagonal and
the N − 1 multipliers on the subdiagonal. The U matrix will also be bidiagonal, with the
pivots on the main diagonal and −1/R entries on the superdiagonal. It follows that the number
of nonzero entries in L or U is N + (N − 1) = 2N − 1.
For N = 1000 the number of nonzero entries in G⁻¹ is 1,000,000, while L and U will each
contain only 1999 nonzero entries. It is not a good idea to use the inverse of a matrix
for solving the matrix problem, due to the excessive number of required multiplications,
which is proportional to the number of nonzero entries.
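The contrast is easy to reproduce; the sketch below (with R = 1 assumed) compares the fill of the LU factors with the fully dense inverse:

N = 1000;  e = ones(N,1);
G = spdiags([-e 2*e -e], -1:1, N, N);
[L,U] = lu(G);                  % no pivoting occurs: G is diagonally dominant
[nnz(L), nnz(U), nnz(inv(G))]   % about 2N-1, 2N-1, and N^2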
d) To determine the smallest entry in the resistance matrix, let's use the fact that an entry
r_{ij} of the resistance matrix is simply the voltage at node i caused by a unit current source
connected to node j. We are looking for the smallest entry, i.e., the case where this voltage is
minimal. You can easily figure out that we should put the current source at one end of the line
and examine the voltage at the other end of the line. In other words, the smallest element
of an N × N resistance matrix is always r_{1N} or r_{N1}; they are equal, since matrix inversion
preserves symmetry.
Now imagine our line with the current source connected to the first node. The voltage at node
N is then

r_{1N} = R / (N + 1).    (1)
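A quick numerical check of (1), with R = 1 assumed for the sketch:

N = 10;  e = ones(N,1);
G = spdiags([-e 2*e -e], -1:1, N, N);
v = G \ [1; zeros(N-1,1)];   % node voltages for a unit current into node 1
[v(N), 1/(N+1)]              % both print 0.0909...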
Problem 2.2
a) It is very easy to find a counterexample; just remember the formula for, say, the second pivot:

ã₂₂ = a₂₂ − a₂₁ a₁₂ / a₁₁.    (2)
b) The statement can be proven by mathematical induction. Let us assume that we have proven our
statement for the order N − 1. Now we need to show that, for the order N of the matrix, after
eliminating the first row from all subsequent rows, the resulting (N − 1) × (N − 1) submatrix:
1. will have all positive diagonal entries, no larger than the original diagonal entries;
2. will have negative off-diagonals;
3. will be strictly diagonally dominant;
4. will be tridiagonal.
The first statement we have already shown, since all multipliers except for M₂₁ are
zero. The second one is also trivial, and so is the last.
Now, let us show that the third statement holds. Since the initial matrix is strictly diagonally
dominant, we know that |a₂₂| > |a₂₁| + |a₂₃| and |a₁₁| > |a₁₂|. The only thing we need
to show is that the number we subtract from a₂₂ is smaller in magnitude than |a₂₁|, which
follows since |a₂₁ a₁₂ / a₁₁| < |a₂₁|. Therefore we have:

|ã₂₂| > |a₂₂| − |a₂₁| > |a₂₁| + |a₂₃| − |a₂₁| = |a₂₃|.    (3)
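A small numerical illustration (the matrix is an arbitrary strictly diagonally dominant tridiagonal example):

e = ones(6,1);
T = full(spdiags([-e 3*e -e], -1:1, 6, 6));
[L,U] = lu(T);   % partial pivoting performs no row swaps on this matrix
diag(U)'         % all pivots positive and no larger than the original 3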
Problem 2.3
a) For N(y, k) = I + y eₖᵀ and given k, the matrix N structurally looks like the identity matrix
with the vector y added to its kth column:

\[
N(y, k) =
\begin{pmatrix}
1 & \cdots & y_1 & \cdots & 0 \\
  & \ddots & \vdots &  &  \\
0 & \cdots & 1 + y_k & \cdots & 0 \\
  &  & \vdots & \ddots &  \\
0 & \cdots & y_n & \cdots & 1
\end{pmatrix},
\]

where y = [y₁ ··· yₙ]ᵀ and the column shown in the middle is column k. A simple check will help
you verify that N⁻¹ is structurally similar to N. Let

\[
N^{-1} =
\begin{pmatrix}
1 & \cdots & w_1 & \cdots & 0 \\
  & \ddots & \vdots &  &  \\
0 & \cdots & w_k & \cdots & 0 \\
  &  & \vdots & \ddots &  \\
0 & \cdots & w_n & \cdots & 1
\end{pmatrix}.
\]

Equating the kth column of N N⁻¹ to the kth column of the identity gives

w₁ + y₁ wₖ = 0
w₂ + y₂ wₖ = 0
⋮
(1 + yₖ) wₖ = 1
⋮
wₙ + yₙ wₖ = 0,

so wₖ = 1/(1 + yₖ) and w_j = −y_j/(1 + yₖ) for j ≠ k:

\[
N^{-1} =
\begin{pmatrix}
1 & \cdots & -\frac{y_1}{1+y_k} & \cdots & 0 \\
  & \ddots & \vdots &  &  \\
0 & \cdots & \frac{1}{1+y_k} & \cdots & 0 \\
  &  & \vdots & \ddots &  \\
0 & \cdots & -\frac{y_n}{1+y_k} & \cdots & 1
\end{pmatrix}.
\]

b) We need N(y, k) x = x + y xₖ = eₖ, which holds if we choose

y = (1/xₖ)(eₖ − x).
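A three-line check of this choice of y (sizes chosen arbitrarily for the sketch):

n = 5;  k = 3;
x = rand(n,1);  ek = zeros(n,1);  ek(k) = 1;
y = (ek - x) / x(k);
(eye(n) + y*ek') * x     % returns ek, up to roundoff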
c) Consider the matrix

\[
A =
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots &        &        & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}.
\]

Let us write A as

A = ( x₁⁽⁰⁾  x₂⁽⁰⁾  ···  xₙ⁽⁰⁾ ),

where x_j⁽⁰⁾ denotes the jth column of A at step 0 of our (yet-to-be derived) matrix inversion
algorithm.
In part (b) we showed how to find a matrix N such that N x = eₖ for a given x. Let us
denote this matrix by N(x, k), so that N(x, k) x = eₖ. In the first step of the algorithm we
compute N(x₁⁽⁰⁾, 1) and multiply it into A, so that

N(x₁⁽⁰⁾, 1) A = ( e₁  x₂⁽¹⁾  ···  xₙ⁽¹⁾ ),

where

x_j⁽¹⁾ = N(x₁⁽⁰⁾, 1) x_j⁽⁰⁾.

We now multiply N(x₁⁽⁰⁾, 1) A by N(x₂⁽¹⁾, 2), and so forth, so that after k steps we have

∏_{j=k}^{1} N(x_j⁽ʲ⁻¹⁾, j) A = ( e₁  ···  e_k  x_{k+1}⁽ᵏ⁾  ···  xₙ⁽ᵏ⁾ ),

where

x_j⁽ᵏ⁾ = ∏_{i=k}^{1} N(x_i⁽ⁱ⁻¹⁾, i) x_j⁽⁰⁾.
What makes the algorithm in-place is that since x_j⁽ᵏ⁾ = e_j for j ≤ k, we no longer have to store
those columns explicitly; the columns that define the N(x_j⁽ʲ⁻¹⁾, j) factors are stored in their
place. In matlab:
% In-place inversion of A (no pivoting): the loop overwrites A with inv(A).
n = size(A,1);
for i=1:n,
  y = -A(:,i);
  y = y / A(i,i);          % y(j) = -A(j,i)/A(i,i)
  y(i) = 1 / A(i,i);       % column i of the elementary factor
  for k = i+1:n;           % update the remaining columns of A
    m = A(i,k);
    A(i,k) = 0;
    for j=1:n;
      A(j,k) = A(j,k) + m*y(j);
    end;
  end;
  A(:,i) = y;              % store the factor column in place
  for k = 1:i-1;           % update the already-inverted columns
    m = A(i,k);
    A(i,k) = 0;
    for j=1:n;
      A(j,k) = A(j,k) + m*y(j);
    end;
  end;
end;
On a Sun Sparc10, this routine is about 500 times slower than Matlab's built-in inv()
function.
A more efficient code would operate on the matrix by columns:
%in-place invert a matrix A; vectorized
n = size(A,1);
for i=1:n,
  y = -A(:,i);
  y = y / A(i,i);
  y(i) = 1 / A(i,i);
  for k = i+1:n;
    m = A(i,k);  A(i,k) = 0;
    A(:,k) = A(:,k) + m*y;   % one column update replaces the inner j loop
  end;
  A(:,i) = y;
  for k = 1:i-1;
    m = A(i,k);  A(i,k) = 0;
    A(:,k) = A(:,k) + m*y;
  end;
end;
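Either routine can be sanity-checked as follows (the test matrix here is an arbitrary diagonally dominant example, since neither loop pivots):

A0 = rand(6) + 6*eye(6);   % diagonally dominant, so no pivoting is needed
A  = A0;
% ... run either in-place loop above on A ...
norm(A*A0 - eye(6))        % near machine precision if all went well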
This routine is significantly faster; it is only a factor of 10-15 slower than Matlab. Regardless
of how fast the machine is, the fact that direct matrix inversion takes O(N³) operations
limits the size of the problem we can solve in a reasonable time. Matlab took roughly 2.7
seconds to invert an N = 250 matrix. In a month, it could probably do N = 25000; in a
year, about N = 50000. The first algorithm above could probably handle only N = 3000
in a month, N = 7000 in a year.