
Linear multistep method

Abstract
Linear multistep methods are used for the numerical solution of ordinary differential
equations. Conceptually, a numerical method starts from an initial point and then takes a
short step forward in time to find the next solution point. The process continues with
subsequent steps to map out the solution. Single-step methods (such as Euler's method)
refer to only one previous point and its derivative to determine the current value.
Methods such as Runge-Kutta take some intermediate steps (for example, a half-step) to
obtain a higher order method, but then discard all previous information before taking a
second step. Multistep methods attempt to gain efficiency by keeping and using the
information from previous steps rather than discarding it. Consequently, multistep
methods refer to several previous points and derivative values. In the case of linear
multistep methods, a linear combination of the previous points and derivative values is
used.

Definitions
Numerical methods for ordinary differential equations approximate solutions to initial
value problems of the form

y' = f(t, y),    y(t_0) = y_0.

The result is approximations for the value of y(t) at discrete times t_i:

t_i = t_0 + ih
y_i ≈ y(t_i) = y(t_0 + ih)
f_i = f(t_i, y_i)

where h is the time step (sometimes referred to as Δt).

A linear multistep method uses a linear combination of yi and yi' to calculate the value of
y for the desired current step.

A multistep method uses the previous s steps to calculate the next value. Consequently,
the desired value at the current processing stage is y_{n+s}.

A linear multistep method is a method of the form

y_{n+s} + a_{s-1} y_{n+s-1} + ... + a_0 y_n = h ( b_s f(t_{n+s}, y_{n+s}) + b_{s-1} f(t_{n+s-1}, y_{n+s-1}) + ... + b_0 f(t_n, y_n) ),

where h denotes the step size and f the right-hand side of the differential equation. The
coefficients a_0, ..., a_{s-1} and b_0, ..., b_s determine the method. The designer of the
method chooses the coefficients; often, many coefficients are zero. Typically, the
designer chooses the coefficients so they will exactly interpolate y(t) when it is an nth
order polynomial.

If the value of b_s is nonzero, then the value of y_{n+s} depends on the value of f(t_{n+s}, y_{n+s}).
Consequently, the method is explicit if b_s = 0. In that case, the formula can directly
compute y_{n+s}. If b_s ≠ 0, then the method is implicit and the equation for y_{n+s} must be
solved. Iterative methods such as Newton's method are often used to solve the implicit
formula.

Sometimes an explicit multistep method is used to "predict" the value of y_{n+s}. That value
is then used in an implicit formula to "correct" the value. The result is a predictor–corrector
method.
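As a small illustration of the predictor–corrector idea (a sketch of mine, not code from the article), the following Python snippet predicts with the two-step Adams–Bashforth formula and corrects once with the trapezoidal (one-step Adams–Moulton) rule; the function name and the test problem y' = y are assumptions made for the example.

```python
import numpy as np

def predictor_corrector(f, t0, y0, h, n_steps):
    """Illustrative sketch: two-step Adams-Bashforth predictor followed by
    one trapezoidal (Adams-Moulton) correction per step."""
    t = t0 + h * np.arange(n_steps + 1)
    y = np.zeros(n_steps + 1)
    y[0] = y0
    y[1] = y0 + h * f(t[0], y[0])          # bootstrap the first step with Euler
    for n in range(n_steps - 1):
        # predict with the explicit two-step Adams-Bashforth formula
        y_pred = y[n+1] + h * (1.5 * f(t[n+1], y[n+1]) - 0.5 * f(t[n], y[n]))
        # correct once with the implicit trapezoidal rule, using the prediction
        y[n+2] = y[n+1] + 0.5 * h * (f(t[n+1], y[n+1]) + f(t[n+2], y_pred))
    return t, y

# hypothetical test problem y' = y, y(0) = 1
t, y = predictor_corrector(lambda t, y: y, 0.0, 1.0, 0.25, 8)
print(y[-1], np.exp(t[-1]))   # compare with the exact solution e^t
```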

Examples
Consider for an example the problem

y' = y,    y(0) = 1.

The exact solution is y(t) = e^t.

One-Step Euler
A simple numerical method is Euler's method:

y_{n+1} = y_n + h f(t_n, y_n).

Euler's method can be viewed as an explicit multistep method for the degenerate case of
one step.

This method, applied with step size h = 1/2 on the problem y' = y, gives the following
results:

y1 = 1.5,  y2 = 2.25,  y3 = 3.375,  y4 = 5.0625
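A minimal sketch (assuming the step size h = 1/2 and the test problem above; the function name is mine) that reproduces these Euler iterates:

```python
def euler(f, t0, y0, h, n_steps):
    """Explicit Euler: y_{n+1} = y_n + h*f(t_n, y_n)."""
    t, y = t0, y0
    values = [y0]
    for _ in range(n_steps):
        y = y + h * f(t, y)
        t = t + h
        values.append(y)
    return values

print(euler(lambda t, y: y, 0.0, 1.0, 0.5, 4))  # [1.0, 1.5, 2.25, 3.375, 5.0625]
```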

Two-Step Adams Bashforth


Euler's method is a one-step method. A simple multistep method is the two-step Adams–
Bashforth method

y_{n+2} = y_{n+1} + h ( (3/2) f(t_{n+1}, y_{n+1}) − (1/2) f(t_n, y_n) ).

This method needs two values, y_{n+1} and y_n, to compute the next value, y_{n+2}. However,
the initial value problem provides only one value, y_0 = 1. One possibility to resolve this
issue is to use the y_1 computed by Euler's method as the second value. With this choice,
the Adams–Bashforth method yields (rounded to four digits):

y2 = 2.375,  y3 = 3.7813,  y4 = 6.0234

The exact solution at t = t_4 = 2 is e^2 = 7.3891, so the two-step Adams–Bashforth
method is more accurate than Euler's method. This is always the case if the step size is
small enough.
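The same computation can be scripted; this is an illustrative sketch (the function name and looping style are mine, not the article's):

```python
import math

def adams_bashforth2(f, t0, y0, h, n_steps):
    """Two-step Adams-Bashforth, bootstrapped with one Euler step."""
    ts = [t0, t0 + h]
    ys = [y0, y0 + h * f(t0, y0)]          # y1 from Euler's method
    for n in range(n_steps - 1):
        ys.append(ys[n+1] + h * (1.5 * f(ts[n+1], ys[n+1]) - 0.5 * f(ts[n], ys[n])))
        ts.append(ts[-1] + h)
    return ts, ys

ts, ys = adams_bashforth2(lambda t, y: y, 0.0, 1.0, 0.5, 4)
print(round(ys[4], 4), math.exp(2.0))   # 6.0234 versus e^2 = 7.389...
```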

Multistep Method Families


Three families of linear multistep methods are commonly used: Adams–Bashforth
methods, Adams–Moulton methods, and the backward differentiation formulas (BDFs).

Adams–Bashforth methods

The Adams–Bashforth methods are explicit methods. The coefficients are a_{s−1} = −1
and a_{s−2} = ⋯ = a_0 = 0, while the b_j are chosen such that the method has order s
(this determines the method uniquely).

The Adams–Bashforth methods with s = 1, 2, 3, 4, 5 are (Hairer, Nørsett & Wanner 1993,
§III.1; Butcher 2003, p. 103):

• y_{n+1} = y_n + h f(t_n, y_n) — this is simply the Euler method;
• y_{n+2} = y_{n+1} + h ( (3/2) f(t_{n+1}, y_{n+1}) − (1/2) f(t_n, y_n) );
• y_{n+3} = y_{n+2} + h ( (23/12) f(t_{n+2}, y_{n+2}) − (16/12) f(t_{n+1}, y_{n+1}) + (5/12) f(t_n, y_n) );
• y_{n+4} = y_{n+3} + h ( (55/24) f(t_{n+3}, y_{n+3}) − (59/24) f(t_{n+2}, y_{n+2}) + (37/24) f(t_{n+1}, y_{n+1}) − (9/24) f(t_n, y_n) );
• y_{n+5} = y_{n+4} + h ( (1901/720) f(t_{n+4}, y_{n+4}) − (2774/720) f(t_{n+3}, y_{n+3}) + (2616/720) f(t_{n+2}, y_{n+2}) − (1274/720) f(t_{n+1}, y_{n+1}) + (251/720) f(t_n, y_n) ).


The coefficients b_j can be determined as follows. Use polynomial interpolation to find
the polynomial p of degree s − 1 such that

p(t_{n+i}) = f(t_{n+i}, y_{n+i})    for i = 0, …, s − 1.

The Lagrange formula for polynomial interpolation yields

p(t) = Σ_{j=0}^{s−1} [ (−1)^{s−j−1} f(t_{n+j}, y_{n+j}) / ( j! (s−j−1)! h^{s−1} ) ] Π_{i=0, i≠j}^{s−1} (t − t_{n+i}).

The polynomial p is locally a good approximation of the right-hand side of the
differential equation y' = f(t,y) that is to be solved, so consider the equation y' = p(t)
instead. This equation can be solved exactly; the solution is simply the integral of p. This
suggests taking

y_{n+s} = y_{n+s−1} + ∫_{t_{n+s−1}}^{t_{n+s}} p(t) dt.

The Adams–Bashforth method arises when the formula for p is substituted. The
coefficients b_j turn out to be given by

b_{s−j−1} = [ (−1)^j / ( j! (s−j−1)! ) ] ∫_0^1 Π_{i=0, i≠j}^{s−1} (u + i) du,    for j = 0, …, s − 1.

Replacing f(t, y) by its interpolant p incurs an error of order h^s, and it follows that the s-
step Adams–Bashforth method indeed has order s (Iserles 1996, §2.1).
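The integral formula above can be evaluated symbolically; here is a small sketch using sympy (my own choice of tool, not one mentioned in the article) that reproduces the Adams–Bashforth coefficients:

```python
import sympy as sp

def adams_bashforth_coeffs(s):
    """Return [b_0, ..., b_{s-1}] for the s-step Adams-Bashforth method, using
    b_{s-j-1} = (-1)^j / (j! (s-j-1)!) * integral_0^1 prod_{i != j} (u + i) du."""
    u = sp.symbols('u')
    b = [sp.Integer(0)] * s
    for j in range(s):
        poly = sp.Integer(1)
        for i in range(s):
            if i != j:
                poly *= (u + i)
        integral = sp.integrate(poly, (u, 0, 1))
        b[s - j - 1] = sp.Integer((-1)**j) / (sp.factorial(j) * sp.factorial(s - j - 1)) * integral
    return b

print(adams_bashforth_coeffs(2))   # [-1/2, 3/2]
print(adams_bashforth_coeffs(3))   # [5/12, -4/3, 23/12]
```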

The Adams–Bashforth methods were designed by John Couch Adams to solve a


differential equation modelling capillary action due to Francis Bashforth. Bashforth
(1883) published his theory and Adams' numerical method (Goldstine 1977).

Adams–Moulton methods
The Adams–Moulton methods are similar to the Adams–Bashforth methods in that they
also have a_{s−1} = −1 and a_{s−2} = ⋯ = a_0 = 0. Again the b coefficients are chosen
to obtain the highest order possible. However, the Adams–Moulton methods are implicit
methods. By removing the restriction that b_s = 0, an s-step Adams–Moulton method can
reach order s + 1, while an s-step Adams–Bashforth method has only order s.

The Adams–Moulton methods with s = 0, 1, 2, 3, 4 are (Hairer, Nørsett & Wanner 1993,
§III.1; Quarteroni, Sacco & Saleri 2000):

• y_n = y_{n−1} + h f(t_n, y_n) — this is the backward Euler method;
• y_{n+1} = y_n + (h/2) ( f(t_{n+1}, y_{n+1}) + f(t_n, y_n) ) — this is the trapezoidal rule;
• y_{n+2} = y_{n+1} + h ( (5/12) f(t_{n+2}, y_{n+2}) + (8/12) f(t_{n+1}, y_{n+1}) − (1/12) f(t_n, y_n) );
• y_{n+3} = y_{n+2} + h ( (9/24) f(t_{n+3}, y_{n+3}) + (19/24) f(t_{n+2}, y_{n+2}) − (5/24) f(t_{n+1}, y_{n+1}) + (1/24) f(t_n, y_n) );
• y_{n+4} = y_{n+3} + h ( (251/720) f(t_{n+4}, y_{n+4}) + (646/720) f(t_{n+3}, y_{n+3}) − (264/720) f(t_{n+2}, y_{n+2}) + (106/720) f(t_{n+1}, y_{n+1}) − (19/720) f(t_n, y_n) ).

The derivation of the Adams–Moulton methods is similar to that of the Adams–Bashforth
method; however, the interpolating polynomial uses not only the points t_{n−1}, …, t_{n−s}, as
above, but also t_n. The coefficients are given by

b_{s−j} = [ (−1)^j / ( j! (s−j)! ) ] ∫_0^1 Π_{i=0, i≠j}^{s} (u + i − 1) du,    for j = 0, …, s.

The Adams–Moulton methods are solely due to John Couch Adams, like the Adams–
Bashforth methods. The name of Forest Ray Moulton became associated with these
methods because he realized that they could be used in tandem with the Adams–
Bashforth methods as a predictor-corrector pair (Moulton 1926); Milne (1926) had the
same idea. Adams used Newton's method to solve the implicit equation (Hairer, Nørsett
& Wanner 1993, §III.1).
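As an illustration of that idea (a sketch of my own, not Adams' actual procedure), one step of the implicit backward Euler method can be solved for the new value with Newton's method:

```python
def backward_euler_step(f, dfdy, t_next, y_prev, h, tol=1e-12, max_iter=50):
    """One backward Euler step: solve y = y_prev + h*f(t_next, y) for y
    with Newton's method (scalar case; names are illustrative)."""
    y = y_prev + h * f(t_next, y_prev)              # predictor: explicit Euler guess
    for _ in range(max_iter):
        residual = y - y_prev - h * f(t_next, y)
        step = residual / (1.0 - h * dfdy(t_next, y))   # Newton correction
        y -= step
        if abs(step) < tol:
            break
    return y

# one step of size h = 0.5 on y' = y starting from y(0) = 1; exact answer is 1/(1-h) = 2
print(backward_euler_step(lambda t, y: y, lambda t, y: 1.0, 0.5, 1.0, 0.5))  # 2.0
```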

Analysis
The central concepts in the analysis of linear multistep methods, and indeed any
numerical method for differential equations, are convergence, order, and stability.

The first question is whether the method is consistent: is the difference equation

y_{n+s} + a_{s−1} y_{n+s−1} + ⋯ + a_0 y_n = h ( b_s f(t_{n+s}, y_{n+s}) + ⋯ + b_0 f(t_n, y_n) )

a good approximation of the differential equation y' = f(t,y)? More precisely, a
multistep method is consistent if the local error goes to zero faster than the step size h as
h goes to zero, where the local error is defined to be the difference between the result y_{n+s}
of the method, assuming that all the previous values y_{n+s−1}, …, y_n are exact, and the exact
solution of the equation at time t_{n+s}. A computation using Taylor series shows that a
linear multistep method is consistent if and only if

Σ_{k=0}^{s−1} a_k = −1    and    Σ_{k=0}^{s} b_k = s + Σ_{k=0}^{s−1} k a_k.

All the methods mentioned above are consistent (Hairer, Nørsett & Wanner 1993, §III.2).

If the method is consistent, then the next question is how well the difference equation
defining the numerical method approximates the differential equation. A multistep
method is said to have order p if the local error is of order O(h^{p+1}) as h goes to zero.
This is equivalent to the following condition on the coefficients of the method:

Σ_{k=0}^{s−1} a_k = −1    and    q Σ_{k=0}^{s} k^{q−1} b_k = s^q + Σ_{k=0}^{s−1} k^q a_k    for q = 1, …, p.

The s-step Adams–Bashforth method has order s, while the s-step Adams–Moulton
method has order s + 1 (Hairer, Nørsett & Wanner 1993, §III.2).

These conditions are often formulated using the characteristic polynomials

ρ(z) = z^s + Σ_{k=0}^{s−1} a_k z^k    and    σ(z) = Σ_{k=0}^{s} b_k z^k.

In terms of these polynomials, the above condition for the method to have order p
becomes

ρ(e^h) − h σ(e^h) = O(h^{p+1})    as h → 0.

In particular, the method is consistent if it has order at least one, which is the case if ρ(1) = 0
and ρ'(1) = σ(1).
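As a small illustration (my own sketch, using sympy), one can verify the condition ρ(e^h) − hσ(e^h) = O(h^{p+1}) for the two-step Adams–Bashforth method:

```python
import sympy as sp

h = sp.symbols('h')
z = sp.exp(h)

# two-step Adams-Bashforth: rho(z) = z^2 - z, sigma(z) = (3/2) z - 1/2
rho = z**2 - z
sigma = sp.Rational(3, 2) * z - sp.Rational(1, 2)

expr = sp.series(rho - h * sigma, h, 0, 5).removeO()
print(sp.simplify(expr))   # leading term is (5/12)*h**3, so the method has order 2
```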

If the roots of the characteristic polynomial ρ all have modulus less than or equal to 1 and
the roots of modulus 1 have multiplicity 1, we say that the root condition is satisfied. The
method is called stable (or zero-stable) if it satisfies the root condition. The method is
convergent if and only if it is consistent and the root condition is satisfied; consequently,
a consistent method is convergent if and only if it is stable.

Furthermore, if the method is stable, the method is said to be strongly stable if z = 1 is
the only root of modulus 1. If it is stable and all roots of modulus 1 are simple, but
there is more than one such root, it is said to be relatively stable. Note that 1 must be a
root, since ρ(1) = 0 for a consistent method; thus stable methods are always one of these two.

Example
Consider the Adams–Bashforth three-step method

y_{n+3} = y_{n+2} + h ( (23/12) f(t_{n+2}, y_{n+2}) − (16/12) f(t_{n+1}, y_{n+1}) + (5/12) f(t_n, y_n) ).

The characteristic equation is thus

ρ(z) = z³ − z² = z²(z − 1),

whose roots are z = 0 (double) and z = 1, so the root condition is satisfied and the method
is strongly stable.
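A quick numerical check of the root condition (an illustrative sketch, not part of the article):

```python
import numpy as np

# rho(z) = z^3 - z^2 for the three-step Adams-Bashforth method
roots = np.roots([1, -1, 0, 0])
print(roots)                           # [1. 0. 0.]
# all roots lie in the closed unit disk and the modulus-1 root z = 1 is simple,
# so the root condition holds
print(np.all(np.abs(roots) <= 1.0))    # True
```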

First and second Dahlquist barriers


These two results were proved by Germund Dahlquist and represent an important bound
for the order of convergence and for the A-stability of a linear multistep method.

First Dahlquist barrier

A zero-stable linear q-step multistep method cannot attain an order of convergence
greater than q + 1 if q is odd and greater than q + 2 if q is even. If the method is also
explicit, then it cannot attain an order greater than q (Hairer, Nørsett & Wanner 1993,
Thm III.3.5).

Second Dahlquist barrier

There are no explicit A-stable linear multistep methods. The implicit ones have order
of convergence at most 2 (Hairer & Wanner 1996, Thm V.1.4).

Goldstine, Herman H. (1977), A History of Numerical Analysis from the 16th through the
19th Century, New York: Springer-Verlag.

Methods for nonstiff problems
Consider an initial value problem

y' = f(t, y),    y(t_0) = y_0,    (1)

where the solution can be scalar- or vector-valued. Let t_0 < t_1 < t_2 < ⋯ be a sequence of
grid points, which for simplicity is supposed to be equidistant with step size
h = t_{n+1} − t_n. A numerical method is an algorithm that yields approximations y_n to the
solution y(t_n) at the grid points.

Abstract linear spaces

Cartesian geometry, introduced by Fermat and Descartes around 1636, had a very large
influence on mathematics bringing algebraic methods into geometry. By the middle of
the 19th Century however there was some dissatisfaction with these coordinate methods
and people began to search for direct methods, i.e. methods of synthetic geometry which
were coordinate free.

It is possible however to trace the beginning of the vector concept back to the beginning
of the 19th Century with the work of Bolzano. In 1804 he published a work on the
foundations of elementary geometry Betrachtungen über einige Gegenstände der
Elementargeometrie. Bolzano, in this book, considers points, lines and planes as
undefined elements and defines operations on them. This is an important step in the
axiomatisation of geometry and an early move towards the necessary abstraction for the
concept of a linear space to arise.

The move away from coordinate geometry was mainly due to the work of Poncelet and
Chasles who were the founders of synthetic geometry. The parallel development in
analysis was to move from spaces of concrete objects such as sequence spaces towards
abstract linear spaces. Instead of substitutions defined by matrices, abstract linear
operators must be defined on abstract linear spaces.

In 1827 Möbius published Der barycentrische Calcul a geometrical book which studies
transformations of lines and conics. The novel feature of this work is the introduction of
barycentric coordinates. Given any triangle ABC then if weights a, b and c are placed at
A, B and C respectively then a point P, the centre of gravity, is determined. Möbius
showed that every point P in the plane is determined by the homogeneous coordinates
[a,b,c], the weights required to be placed at A, B and C to give the centre of gravity at P.
The importance here is that Möbius was considering directed quantities, an early
appearance of vectors.

In 1837 Möbius published a book on statics in which he clearly states the idea of
resolving a vector quantity along two specified axes.

Between these two works of Möbius, a geometrical work by Bellavitis was published in
1832 which also contains vector type quantities. His basic objects are line segments AB
and he considers AB and BA as two distinct objects. He defines two line segments as
'equipollent' if they are equal and parallel, so, in modern notation, two line segments are
equipollent if they represent the same vector. Bellavitis then defines the 'equipollent sum
of line segments' and obtains an 'equipollent calculus' which is essentially a vector space.

In 1814 Argand had represented the complex numbers as points on the plane, that is as
ordered pairs of real numbers. Hamilton represented the complex numbers as a two
dimensional vector space over the reals although of course he did not use these general
abstract terms. He presented these results in a paper to the Irish Academy in 1833. He
spent the next 10 years of his life trying to define a multiplication on the 3-dimensional
vector space over the reals. Hamilton's quaternions, published in 1843, was an important
example of a 4-dimensional vector space but, particularly with Tait's work on quaternions
published in 1873, there was to be some competition between vector and quaternion
methods.


In 1857 Cayley introduced matrix algebras, helping the move towards more general
abstract systems by adding to the different types of structural laws being studied. In 1858
Cayley noticed that the quaternions could be represented by matrices.

In 1867 Laguerre wrote a letter to Hermite Sur le calcul des systèmes linéaires. His
systèmes linéaires is a table of coefficients of a system of linear equations denoted by a
single upper-case letter and Laguerre defines addition, subtraction and multiplication of
these linear systems. In this work Laguerre aims to unify algebraic systems such as
complex numbers, Hamilton's quaternions and notions introduced by Galois and by
Cauchy.

Laguerre's work on linear systems was followed up by a paper by Carvallo in 1891. In


this paper he defines operators on vector functions and draws a clear distinction between
operators and matrices.

To understand the difference between the notions of an operator and a matrix, it suffices
to say that, if one changes the coordinate system, one obtains a different matrix to
represent the same vector function, but the same operator.

Another mathematician who was moving towards geometry without coordinates was
Grassmann. His work is highly original but the notion of barycentric coordinates
introduced by Möbius was his main motivation. Grassmann's contribution Die
Ausdehnungslehre appeared in several different versions. The first was in 1844 but it was
a very difficult work to read, and clearly did not find favour with mathematicians, so
Grassmann tried to produce a more readable version which appeared in 1862. Clebsch
inspired Grassmann to work on this new version.

Grassmann studied an algebra whose elements are not specified, so are abstract
quantities. He considers systems of elements on which he defines a formal operation of
addition, scalar multiplication and multiplication. He starts with undefined elements
which he calls 'simple quantities' and generates more complex quantities using specified
rules.

But ... I go further, since I call these not just quantities but simple quantities. There are
other quantities which are themselves compounded quantities and whose characteristics
are as distinct relative to each other as the characteristics of the different simple quantities
are to each other. These quantities come about through addition of higher forms ...

His work contains the familiar laws of vector spaces but, since he also has a
multiplication defined, his structures satisfy the properties of what are today called
algebras. The precise structures are now known as Grassmann algebras. The ideas of
linearly independent and linearly dependent sets of elements are clearly contained in
Grassmann's work as is the idea of dimension (although he does not use the term). The
scalar product also appears in Grassmann's 1844 work.

Grassmann's 1862 version of Die Ausdehnungslehre has a long introduction in which


Grassmann gives a summary of his theory. In this introduction he also defends his formal
methods which had clearly been objected to by a number of mathematicians. Grassmann's
justification comes very close to saying that he is setting up an axiomatic theory and this
shows that he is well ahead of his time.

Cauchy and Saint-Venant have some claims to have invented similar systems to
Grassmann. Saint-Venant's claim is a fair one since he published a work in 1845 in which
he multiplies line segments in an analogous way to Grassmann. In fact when Grassmann
read Saint-Venant's paper he realised that Saint-Venant had not read his 1844 work and
sent two copies of the relevant parts to Cauchy, asking him to pass one copy to Saint-
Venant.

However, rather typically of Cauchy, in 1853 he published Sur les clefs algébrique in
Comptes Rendus which describes a formal symbolic method which coincides with that of
Grassmann's method (but makes no reference to Grassmann). Grassmann complained to
the Académie des Sciences that his work had priority over Cauchy's and, in 1854, a
committee was set up to investigate who had priority. We still await the committee's
report!

The first to see the importance of Grassmann's work was Hankel. In 1867 he wrote a
paper Theorie der complexen Zahlensysteme concerning formal systems where
combination of the symbols are abstractly defined. He credits Grassmann's Die
Ausdehnungslehre as giving the foundation for his work.

The first to give an axiomatic definition of a real linear space was Peano in a book
published in Torino in 1888. He credits Leibniz, Möbius's 1827 work, Grassmann's 1844
work and Hamilton's work on quaternions as providing ideas which led him to his formal
calculus.

Peano's 1888 book Calcolo geometrico secondo l'Ausdehnungslehre di H. Grassmann


preceduto dalle operazioni della logica deduttiva is remarkable. It gives the basic calculus
of set operations, introducing the modern notation ∩, ∪, ∈ for intersection, union and "is an
element of". It was many years before this notation was to become accepted; in fact
Peano's book seems to have had very little influence for many years. It is equally
remarkable for containing an almost modern introduction to linear spaces and linear
algebra.

In Chapter IX of the book Peano gives axioms for a linear space.
It is hard to believe that Peano writes the following in 1888. It could almost come from a
1988 book! The first is for equality of elements

1. (a = b) if and only if (b = a), if (a = b) and (b = c) then (a = c).


2. The sum of two objects a and b is defined, i.e. an object is defined denoted by a +
b, also belonging to the system, which satisfies:
If (a = b) then (a + c = b + c), a + b = b + a, a + (b + c) = (a + b) + c,
and the common value of the last equality is denoted by a + b + c.
3. If a is an object of the system and m a positive integer, then we understand by ma
the sum of m objects equal to a. It is easy to see that for objects a, b, ... of the
system and positive integers m, n, ... one has
If (a = b) then (ma = mb), m(a+b) = ma+mb, (m+n)a = ma+na,
m(na) = (mn)a, 1a = a.
We suppose that for any real number m the notation ma has a meaning such that
the preceding equations are valid.

Peano goes on to state the existence of a zero object 0 and says that 0a = 0, that a - b
means a + (-b) and states it is easy to show that a - a = 0 and 0 + a = a.

Peano defines a linear system to be any system of objects satisfying his four conditions.
He goes on to define dependent objects and independent objects. He then defines
dimension.

Definition: The number of the dimensions of a linear system is the maximal number of
linearly independent objects in the system.

He proves that finite dimensional spaces have a basis and gives examples of infinite
dimensional linear spaces. Peano considers entire functions f(x) of a variable x, defines
the sum of f1(x) and f2(x) and the product of f(x) by a real number m. He says:-

If one considers only functions of degree n, then these functions form a linear system
with n + 1 dimensions, the entire functions of arbitrary degree form a linear system with
infinitely many dimensions.

Peano defines linear operators on a linear space, shows that by using coordinates one
obtains a matrix. He defines the sum and product of linear operators.

In the 1890s Pincherle worked on a formal theory of linear operators on an infinite


dimensional vector space. However Pincherle did not base his work on that of Peano,
rather on the abstract operator theory of Leibniz and d'Alembert.

Like so much work in this area it had very little immediate impact and axiomatic infinite
dimensional vector spaces were not studied again until Banach and his associates took up
the topic in the 1920s.

Although never attaining the level of abstraction which Peano had achieved, Hilbert and
his student Schmidt looked at infinite dimensional spaces of functions in 1904. Schmidt
introduced a move towards abstraction in 1908 introducing geometrical language into
Hilbert space theory. The fully axiomatic approach appeared in Banach's 1920 doctoral
dissertation.

Overview
A system of linear inequalities defines a polytope as a feasible region. The simplex
algorithm begins at a starting vertex and moves along the edges of the polytope until it
reaches the vertex of the optimum solution.

Consider a linear programming problem,

maximize c^T x
subject to Ax ≤ b, x ≥ 0,

with x the vector of variables of the problem, c a vector representing the linear form to
optimize, A a rectangular p × n matrix and b the vector of linear constraints.

In geometric terms, each inequality specifies a half-space in n-dimensional Euclidean
space, and their intersection is the set of all feasible values the variables can take. The
region is convex and either empty, unbounded, or a polytope.

The set of points where the objective function obtains a given value v is defined by the
hyperplane cTx = v. We are looking for the largest v such that the hyperplane still
intersects the feasible region.

As v increases, the hyperplane translates in the direction of the vector c. Intuitively, and
indeed it can be shown by convexity, the last hyperplane to intersect the feasible region
will either just graze a vertex of the polytope, or a whole edge or face. In the latter two
cases, it is still the case that the endpoints of the edge or face will achieve the optimum
value. Thus, the optimum value will always be achieved on (at least) one of the vertices
of the polytope.

The simplex algorithm applies this insight by walking along edges of the (possibly
unbounded) polytope to vertices with higher objective function value. When a local
maximum is reached, by convexity it is also the global maximum and the algorithm
terminates. It also finishes when an unbounded edge is visited, concluding that the
problem is unbounded and has no finite optimum. The algorithm always terminates because
the number of vertices in the polytope is finite; moreover, since we always jump between
vertices in the same direction (that of the objective function), we hope that the number of
vertices visited will be small. Usually more than one adjacent vertex improves the objective
function, so a pivot rule must be specified to determine which vertex to pick. The
selection of this rule has a great impact on the runtime of the algorithm.

In 1972, Klee and Minty[2] gave an example of a linear programming problem in which
the polytope P is a distortion of an n-dimensional cube. They showed that the simplex
method as formulated by Dantzig visits all 2^n vertices before arriving at the optimal
vertex. This shows that the worst-case complexity of the algorithm is exponential time.
Since then it has been shown that for almost every deterministic pivot rule there is a family of
examples on which it performs badly. It is an open question whether there is a pivot rule with
polynomial-time, or even sub-exponential, worst-case complexity.

Nevertheless, the simplex method is remarkably efficient in practice. It has been known
since the 1970s that it has polynomial-time average-case complexity under various
distributions. These results on "random" matrices still didn't quite capture the desired
intuition that the method works well on "typical" matrices. In 2001 Spielman and Teng
introduced the notion of smoothed complexity to provide a more realistic analysis of the
performance of algorithms.[3]

Other algorithms for solving linear programming problems are described in the linear
programming article.
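For orientation, modern library solvers expose linear programming directly; the following sketch (using scipy, my own choice, on a made-up two-variable problem) maximizes an objective by minimizing its negation, since linprog minimizes by convention:

```python
from scipy.optimize import linprog

# hypothetical problem: maximize 3x + 2y subject to x + y <= 4, x + 3y <= 6, x, y >= 0
c = [-3.0, -2.0]                 # negate because linprog minimizes
A_ub = [[1.0, 1.0], [1.0, 3.0]]
b_ub = [4.0, 6.0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)], method="highs")
print(res.x, -res.fun)           # optimal vertex [4, 0] and maximal objective value 12
```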

Algorithm

The simplex algorithm requires the linear programming problem to be in augmented
form, so that the inequalities are replaced by equalities. The problem can then be written
as follows in matrix form:

Maximize Z in:

    [ 1  −c^T  0 ] [ Z  ]   [ 0 ]
    [ 0   A    I ] [ x  ] = [ b ],    x ≥ 0, x_s ≥ 0,
                   [ x_s ]

where x_s are the slack variables introduced by the augmentation process, i.e. the non-
negative distances between the point and the hyperplanes representing the linear
constraints A.

By definition, the vertices of the feasible region are each at the intersection of n
hyperplanes (either from A or from the non-negativity constraints x ≥ 0). This corresponds
to n zeros in the n+p variables of (x, x_s). Thus two feasible vertices are adjacent when they
share n−1 zeros in the variables of (x, x_s). This is the advantage of the augmented-form
notation: vertices and jumps along edges of the polytope are easy to write down.

The simplex algorithm goes as follows:

• Start off by finding a feasible vertex. It will have at least n zeros among the n+p
variables of (x, x_s); these zero variables will be called the n current non-basic variables.
• Write Z as an affine function of the n current non-basic variables. This is done by
transvections (elementary row operations) that eliminate the non-zero coefficients in
the first line of the matrix above.
• Now we want to jump to an adjacent feasible vertex to increase Z's value. That
means increasing the value of one of the non-basic variables, while keeping the n−1
others at zero. Among the n candidates, we choose the one with the greatest positive
rate of increase of Z, also called the direction of highest gradient.
o The entering variable is therefore easily identified as the non-basic variable with
the maximum positive coefficient in the affine expression for Z.
o If there are no more positive coefficients in the affine expression for Z,
then the solution vertex has been found and the algorithm terminates.
• Next we need to compute the coordinates of the new vertex we jump to. That
vertex will have one of the p currently nonzero variables set to 0, and that leaving
variable must be found among the p candidates. By convexity of the feasible polytope,
the correct one is the variable that, when set to 0, allows the smallest increase of the
entering variable. It is easily found by computing p ratios of coefficients and taking the
lowest ratio. The entering variable then replaces the leaving one, and the algorithm
loops back to writing Z as a function of the new non-basic variables (see the sketch
after this list).
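The loop above can be realized directly on the tableau; here is a minimal, unoptimized sketch in Python (NumPy), assuming the standard-form problem max c^T x with Ax ≤ b and x, b ≥ 0 so that the slack basis gives the starting vertex. It uses Dantzig's most-negative-reduced-cost pivot rule and does not guard against degeneracy or cycling:

```python
import numpy as np

def simplex_max(c, A, b):
    """Naive tableau simplex for: maximize c^T x  s.t.  A x <= b, x >= 0, b >= 0.
    Illustrative sketch only (no anti-cycling, no presolve)."""
    p, n = A.shape
    # constraint rows: [A | I | b]; objective row: [-c | 0 | 0]
    T = np.hstack([A.astype(float), np.eye(p), b.reshape(-1, 1).astype(float)])
    z = np.concatenate([-np.asarray(c, dtype=float), np.zeros(p + 1)])
    basis = list(range(n, n + p))                  # slack variables start as basic
    while True:
        j = int(np.argmin(z[:-1]))                 # entering variable (Dantzig's rule)
        if z[j] >= -1e-12:                         # no negative reduced cost: optimal
            break
        col = T[:, j]
        if np.all(col <= 1e-12):
            raise ValueError("problem is unbounded")
        ratios = np.where(col > 1e-12, T[:, -1] / np.where(col > 1e-12, col, 1.0), np.inf)
        i = int(np.argmin(ratios))                 # leaving row (minimum ratio test)
        T[i] /= T[i, j]                            # pivot: normalize the pivot row
        for r in range(p):
            if r != i:
                T[r] -= T[r, j] * T[i]
        z -= z[j] * T[i]
        basis[i] = j
    x = np.zeros(n + p)
    x[basis] = T[:, -1]
    return x[:n], z[-1]

# same hypothetical problem as above: maximize 3x + 2y, x + y <= 4, x + 3y <= 6
x_opt, z_opt = simplex_max([3, 2], np.array([[1.0, 1.0], [1.0, 3.0]]), np.array([4.0, 6.0]))
print(x_opt, z_opt)    # expected: [4, 0] and 12
```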

Example

x = 0 is clearly a feasible vertex, so we start off with it: the 3 current non-basic variables
are x, y and z. Luckily, Z is already expressed as an affine function of x, y, z, so no
transvections need to be done at this step. Here Z's coefficient of greatest magnitude in the
first row of the matrix is −4, attached to z, so the direction with highest gradient is z. We
then need to compute the coordinates of the vertex we jump to by increasing z. That will
result in setting either s or t to 0, and the correct one must be found. For s, the allowed
change in z equals 10/1 = 10; for t it is 15/3 = 5. t is the correct one because it minimizes
that change. The new vertex is thus x = y = t = 0. Rewrite Z as an affine function of these
new non-basic variables:

Now all coefficients on the first row of the matrix have become nonnegative. That means
Z cannot be improved by increasing any of the current non-basic variables. The algorithm
terminates and the solution is the vertex x = y = t = 0. At that vertex Z = 20 and this is its
maximum value. Usually the coordinates of the solution vertex are needed in the standard
variables x, y, z; a little substitution yields x = 0, y = 0 and z = 5.

Implementation

The tableau form used above to describe the algorithm lends itself to an immediate
implementation in which the tableau is maintained as a rectangular (m+1)-by-(m+n+1)
array. It is straightforward to avoid storing the m explicit columns of the identity matrix
that will occur within the tableau by virtue of B being a subset of the columns of [A I]. This
implementation is referred to as the standard simplex method. The storage and
computation overhead are such that the standard simplex method is a prohibitively
expensive approach to solving large linear programming problems.

In each simplex iteration, the only data required are the first row of the tableau, the
(pivotal) column of the tableau corresponding to the entering variable and the right-hand-
side. The latter can be updated using the pivotal column and the first row of the tableau
can be updated using the (pivotal) row corresponding to the leaving variable. Both the
pivotal column and pivotal row may be computed directly using the solutions of linear
systems of equations involving the matrix B and a matrix-vector product using A. These
observations motivate the revised simplex method, for which implementations are
distinguished by their invertible representation of B.

In large linear programming problems A is typically a sparse matrix and, when the
resulting sparsity of B is exploited when maintaining its invertible representation, the

revised simplex method is a vastly more efficient solution procedure than the standard
simplex method. Commercial simplex solvers are based on the primal (or dual) revised
simplex method.

Analytic geometry

Figure: Three geometric linear functions — the red and blue ones have the same slope (m),
while the red and green ones have the same y-intercept (b).
Main article: linear equation

In analytic geometry, the term linear function is sometimes used to mean a first-degree
polynomial function of one variable. These functions are known as "linear" because they
are precisely the functions whose graph in the Cartesian coordinate plane is a straight
line.

Such a function can be written as

f(x) = mx + b

(called slope-intercept form), where m and b are real constants and x is a real variable.
The constant m is often called the slope or gradient, while b is the y-intercept, which
gives the point of intersection between the graph of the function and the y-axis. Changing
m makes the line steeper or shallower, while changing b moves the line up or down.

Examples of functions whose graph is a line include the following:

• f1(x) = 2x + 1
• f2(x) = x / 2 + 1
• f3(x) = x / 2 − 1.

The graphs of these are shown in the image at right.

Vector spaces

In advanced mathematics, a linear function means a function that is a linear map, that is,
a map between two vector spaces that preserves vector addition and scalar
multiplication.

For example, if x and f(x) are represented as coordinate vectors, then the linear functions
are those functions f that can be expressed as

f(x) = Mx,

where M is a matrix. A function

f(x) = mx + b

is a linear map if and only if b = 0. For other values of b this falls in the more general
class of affine maps.
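A tiny check of this distinction (my own illustration): additivity fails as soon as b ≠ 0.

```python
def f(x, m=2.0, b=1.0):
    return m * x + b

# a linear map must satisfy f(x + y) == f(x) + f(y); with b = 1 it does not
print(f(1.0 + 2.0), f(1.0) + f(2.0))                       # 7.0 versus 8.0 -> not linear
# with b = 0 the two sides agree, so f(x) = m*x is a linear map
print(f(1.0 + 2.0, b=0.0), f(1.0, b=0.0) + f(2.0, b=0.0))  # 6.0 and 6.0
```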

Abstract: The linear stability analysis for linear multistep methods leads to studying the
location of the roots of the associated characteristic polynomial with respect to the unit
circle in the complex plane. It is known that if the discrete problem is an initial value one,
it is sufficient to determine when all the roots are inside the unit disk. This requirement is,
however, conflicting with the order conditions, as established by the Dahlquist barrier. The
conflict disappears if one uses a linear multistep method coupled with boundary conditions
(BVMs). In this paper, a rigorous analysis of the linear stability for some classes of BVMs
is presented. The study is carried out by using the notion of type of a polynomial.
© 2007 European Society of Computational Methods in Sciences and Engineering
Keywords: Linear multistep methods, stability of numerical methods, polynomial type
Mathematics Subject Classification: 65L06, 65L20

1 Introduction

One of the most important problems to be solved when a first-order differential equation
in R^s is

approximated by a linear multistep method is the control of the errors between the
continuous and the discrete solutions. In the last forty years a lot of effort has been devoted
to this field, mainly when the methods are applied to dissipative problems. In such a case,
the study of the propagation of the errors is made by means of a linear difference equation
(frequently called the error equation) depending on a complex parameter. Considering that
the resulting equation is, in general, of order greater than one, some additional conditions
must be fixed in order to get the solution we are interested in. When all of them are chosen
at the first points of the interval of integration we are solving discrete initial value methods
(IVMs). It is well known that in this case the asymptotic stability of the zero solution of the
error equation is equivalent to requiring that the associated stability polynomial is a Schur
polynomial, that is, all its roots have modulus less than one, as the complex parameter, say
q, varies in the left-hand complex plane. This request is, however, conflicting with the order
conditions, as established by the Dahlquist barrier. On the other hand, since only one
condition is inherited from the continuous problem there is no valid reason to use discrete
IVMs.

It is possible to split the additional conditions, part at the beginning and part at the end of
the interval of integration, solving a boundary value method (BVM). In this case the concept
of stability needs to be generalized. In such a case the notion of well-conditioning is more
appropriate. Essentially, such a notion requires that under a perturbation δη of the imposed
conditions, the perturbation of the solution δy should be bounded as follows

‖δy‖ ≤ κ ‖δη‖,

where κ is independent of the number of points in the discrete mesh. If a discrete boundary
value problem is considered, the error equation is well-conditioned if the number of initial
conditions is equal to the number of roots of the stability polynomial inside the unit circle
and the number of conditions at the end of the interval of integration is equal to the number
of roots outside of the unit circle [6]. This result generalizes the stability condition for IVMs,
where all the roots need to be inside the unit disk. In order to control that the number of
roots inside the unit circle is constant for q ∈ C⁻, the notion of type of a polynomial p(z) of
degree k is useful. A polynomial is of type

T(p(z)) = (r1, r2, r3),    r1 + r2 + r3 = k,

if r1 is the number of roots inside, r2 on the boundary and r3 outside the unit disk in the
complex plane, respectively.

Denoting, as usual, by ρ_k(z) and σ_k(z) the first and second characteristic polynomials of a
generic k-step linear multistep method and by q = hλ the characteristic complex parameter,
where h is the stepsize and λ is the parameter in the usual test equation (i.e., y' = λy), the
classical A-stability condition requires that T(ρ_k(z) − q σ_k(z)) = (k, 0, 0) for all q ∈ C⁻. In
the BVM approach, the A-stability condition requires only that r2 vanishes, i.e.,

T(ρ_k(z) − q σ_k(z)) = (r1, 0, r3),    for all q ∈ C⁻.

The stability problem is then reduced to studying whether or not T(ρ_k(z) − q σ_k(z)) remains
constant for q ∈ C⁻, no matter whether the discrete problem is an IVM or a BVM one. Of
course, we have now much more freedom since the case r3 = 0 is only one of the allowed
possibilities.

Except for very simple cases, the proof that the number of roots inside the unit circle
remains constant for all q ∈ C⁻ is often checked numerically. In this paper a theoretical
analysis is carried out for the classes of linear multistep methods generalizing the Backward
Differentiation Formulas (GBDFs) and the Adams methods (OGAMs, GAMs) [2, 6]. The
starting point of this study will be either the explicit form of the coefficients or relevant
properties of them.

2 Polynomial type and the stability problem for LMMs

The study of the location of the zeros of special polynomials with respect to a curve in the
complex plane is an old problem whose pioneering works go back to Schur [7]. Classical
criteria such as Schur's or Routh–Hurwitz's are in general difficult to apply to high degree
polynomials. The following result provides conditions which are simpler to check in our
subsequent analysis.

Theorem 2.1 Let p(z) be the real polynomial of degree k defined by

p(z) = Σ_{j=0}^{k} a_j z^j

and p*(z) = z^k p(z^{-1}) its adjoint. Suppose that the following conditions are satisfied:

i) p(1) ≠ 0;
ii) there exists m ∈ N, m ≤ k, such that z^{2m−k+1} p(z) − p*(z) = a_k (z − 1)^{2m+1}.

Then,

T(p(z)) = (k − m, 0, m)            if (−1)^m a_k p(1) > 0,
T(p(z)) = (k − m − 1, 0, m + 1)    if (−1)^m a_k p(1) < 0.

Proof. We first prove that p(z) has no zeros on the unit circle when p(1) ≠ 0. Suppose that
p(e^{iθ̂}) = 0 for a fixed θ̂ ∈ (0, 2π). Then p*(e^{iθ̂}) = e^{ikθ̂} p(e^{−iθ̂}) = 0, since the
coefficients are real. Moreover, from hypothesis ii) it turns out that

0 = e^{i(2m−k+1)θ̂} p(e^{iθ̂}) − p*(e^{iθ̂}) = a_k (e^{iθ̂} − 1)^{2m+1}.

This equality is only verified for θ̂ = 0, which is excluded by hypothesis i).

Let g be defined by

g(z) = z^{2(m−k)+1} p(z²) / p(1).

Supposing that n is the number of zeros of p inside the unit circle (counted with
multiplicity), g has n_p = 2(k − m) − 1 poles and n_r = 2n zeros inside the unit circle. We
determine the winding number w of g(z) around the circle |z| = 1 (see, e.g., [7]). The
hypothesis ii), relating p and p*, yields

g(z) − g(z^{−1}) = [ z^{2(2m−k+1)} p(z²) − z^{2k} p(z^{−2}) ] / ( z^{2m+1} p(1) )
                 = [ z^{2(2m−k+1)} p(z²) − p*(z²) ] / ( z^{2m+1} p(1) )
                 = a_k (z² − 1)^{2m+1} / ( z^{2m+1} p(1) )
                 = a_k (z − z^{−1})^{2m+1} / p(1).

Then,

Im g(e^{iθ}) = [ g(e^{iθ}) − g(e^{−iθ}) ] / (2i)
             = a_k (e^{iθ} − e^{−iθ})^{2m+1} / ( 2i p(1) )
             = a_k (2i sin θ)^{2m+1} / ( 2i p(1) )
             = [ (−1)^m a_k / (2 p(1)) ] (2 sin θ)^{2m+1}.

This quantity vanishes for θ = 0, π. Moreover, according to the sign of (−1)^m a_k p(1), it is
either positive or negative on (0, π) and the opposite on (π, 2π). Considering that w assumes
the value +1 in the first case and −1 in the other one, and that the principle of the argument
yields w = n_r − n_p = 2(n − k + m) + 1, it immediately follows that n = k − m if
(−1)^m a_k p(1) > 0 and n = k − m − 1 if (−1)^m a_k p(1) < 0. Since p has a total of k zeros,
the assertion follows.
Example 2.1 Let

p(z) = 10 − 5z + z².

It satisfies hypothesis ii) with m = k = 2. Moreover, since (−1)^m a_k p(1) = 6, it follows that
T(p(z)) = (0, 0, 2). In fact, the roots of p(z) are z1 = (5 + i√15)/2 and z2 = (5 − i√15)/2.

Example 2.2 Consider the polynomial

p(z) = −7 + 3z + 5z² + z³.

In this case hypothesis ii) is verified with m = 2. Moreover, since (−1)^m a_k p(1) = 2, we get
T(p(z)) = (1, 0, 2). In fact, the roots of p(z) are z1 ≈ 0.8662, z2 ≈ −2.2108, z3 ≈ −3.6554.

Example 2.3 Let

p(z) = −9 + z + 5z² + z³.

Here p(1) = −2 and condition ii) is verified with m = 2. Since (−1)^m a_k p(1) = −2, one has
that T(p(z)) = (0, 0, 3). The roots are z1 ≈ −4.2731, z2 ≈ −1.8596, z3 ≈ 1.1326.
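A quick numerical cross-check of these types (my own sketch, not part of the paper) simply counts roots by modulus:

```python
import numpy as np

def polynomial_type(coeffs, tol=1e-9):
    """Return (r1, r2, r3): number of roots inside, on, and outside the unit circle.
    `coeffs` are listed from the highest degree down, as numpy.roots expects."""
    roots = np.roots(coeffs)
    r1 = int(np.sum(np.abs(roots) < 1 - tol))
    r2 = int(np.sum(np.abs(np.abs(roots) - 1) <= tol))
    r3 = int(np.sum(np.abs(roots) > 1 + tol))
    return r1, r2, r3

print(polynomial_type([1, -5, 10]))        # Example 2.1: (0, 0, 2)
print(polynomial_type([1, 5, 3, -7]))      # Example 2.2: (1, 0, 2)
print(polynomial_type([1, 5, 1, -9]))      # Example 2.3: (0, 0, 3)
```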
The above result turns out to be useful in studying the stability properties of discrete
problems obtained by using Boundary Value Methods (BVMs) with (k1, k2)-boundary
conditions, i.e., k-step linear multistep methods (LMMs) to which k1 initial conditions and
k2 = k − k1 final ones are imposed.

For the sake of completeness, we briefly report the essential results on the stability problem
for BVMs (see [6] for details). The characteristic (or stability) polynomial of the method is

π_k(z, q) = ρ_k(z) − q σ_k(z),    q ∈ C,    (1)

where

ρ_k(z) := Σ_{j=0}^{k} α_j^{(k)} z^j,    σ_k(z) := Σ_{j=0}^{k} β_j^{(k)} z^j.    (2)

Although the locution "well-conditioning" would be more appropriate when dealing with
boundary value problems, we shall continue to use the term stability, as is customary. It is
known that the well-conditioning of a linear boundary value problem, either continuous or
discrete, is related to the so-called dichotomy. In the discrete case it essentially states that
the number of initial conditions should be equal to the number of roots of the characteristic
polynomial inside the unit circle and, of course, the number of conditions at the end of the
interval of integration should be equal to the number of roots outside the unit circle. Then,
supposing that k1 is the number of initial conditions and k2 the number of final ones, the
generalization of the stability request to boundary value problems can be done as follows.

Definition 2.1 The k-step BVM with (k1, k2)-boundary conditions is said to be A_{k1,k2}-stable
if C⁻ ⊆ D_{k1,k2}, where

D_{k1,k2} = { q ∈ C : T(π_k(z, q)) = (k1, 0, k2) }

denotes the region of (k1, k2)-absolute stability.

When k2 = 0 the above definition reduces to classical A-stability.

The notion of consistency remains unchanged with respect to the IVM case, i.e.,

ρ_k(1) = 0,    ρ'_k(1) = σ_k(1).    (3)

We choose here the normalization condition σ_k(1) = 1.

Conclusion

COMSAT could read the writing on the wall and eventually offered SUPER-
COMPACT-PC, a version of which could run on an IBM PC. They added their own
layout tool, AUTO-ART. However, SuperCOMPACT PC was a "chopped-up" version of
the large mainframe program, while Touchstone was written for the PC architecture,
taking advantage of the new operating system, and the decade-long leadership of
Compact Software was successfully challenged by EESof. After Comsat lost interest in
CAD, COMPACT Software again became a private company, owned by Ulrich Rohde. It
was located across the street from his other microwave enterprise, Synergy Microwave,
in beautiful Patterson NJ, birthplace of funnyman Lou Costello.

Update July 5, 2006. We'll let Ulrich fill you in on the history of COMPACT from here
on:

"The 201 McLean Boulevard Patterson location on Route 20 near the river is and was a
nice location and Compact was in a fancy building. No need to shoot at Patterson

When I bought the company from Comsat, both the mainframe version and the PC
version were unstable, and models such as T-junctions, crosses and others were fairly
inaccurate at higher frequencies. Becoming a partner in the DARPA MIMIC program,
and mostly financed from earnings rather than the DARPA money, these things were changed
by developing EM-based models, verified by Raytheon and Texas Instruments.

The most significant contributions were:

N-dimensional nodal noise analysis for all types of both linear and nonlinear circuits,
including oscillators, mixers and amplifiers under large signal conditions.

