© A. Megretski, J. Wyatt, 2007. All rights reserved.
Preface
The course is aimed at filling gaps in the basic linear algebra and functional analysis background of graduate students interested in communication, control, signal processing, optimization, and related areas.
Principal topics include:
(a) field-independent linear algebra (vector spaces, bases, dimensions, matrix algebra,
linear transformations, linear equations, determinants, characteristic polynomials)
emphasizing the coordinate-free approach with applications in linear systems and
coding theory;
(c) convexity (convex functions and convex sets, convex optimization, cutting plane methods, Carathéodory theorem, Minkowski functionals, Krein-Milman theorem, Hahn-Banach theorem, minimax theorem) with applications in optimization;
(d) approximation and topology (norms, approximation, functional spaces, Fourier and Laplace transforms, compactness, fixed point theorems, differentiation, implicit mapping theorems, first and second order conditions of optimality) with applications in robustness analysis and optimization.
Contents
Preface
3 Determinants
3.1 Motivation
3.1.1 Parameter-Dependent Linear Equations
3.1.2 Determinants as Dynamical System Invariants
3.2 Construction of a Determinant
3.2.1 Signed Area
3.2.2 Multilinear Skew Symmetric Functions
3.2.3 Determinant as Signed Volume Gain
3.3 Basic Properties of Determinants
3.3.1 Elementary Properties
Determinant of Identity
Determinant of an Invertible Operator
Multiplicativity of Determinant
Determinants of Similar Operators
Determinants and Duality
3.3.2 Determinants and Block Decompositions
Direct Sums and Block Decompositions
Determinants of Block Triangular Matrices
Schur Identity and Cramer's Formula
4 Characteristic Polynomials
4.1 Motivation: LTI Systems
4.1.1 Autonomous Systems
4.1.2 Reachability of LTI State Space Models
4.2 Basic Properties
4.2.1 Companion Form
4.2.2 Invariant Subspaces and Irreducible Polynomials
4.2.3 Cayley-Hamilton Theorem
4.2.4 Schur Decomposition
5 Quadratic Forms and Scalar Products
5.1 Motivation
5.1.1 Euclidean Geometry
5.1.2 Quadratic Constraints and Uncertainty Modeling
5.1.3 Random Variables and Second Order Statistics
5.2 Positive Definiteness of Quadratic Forms
5.2.1 Cauchy-Bunyakovski-Schwarz Inequality
6 Linear-Quadratic Optimization
6.1 Motivation
6.1.1 Optimal Control
6.1.2 Kalman Filter
6.2 Basic Properties of LQ Optimization
6.2.1 Equivalence of LQ Optimization Problems
6.2.2 Well-Posedness
6.2.3 Necessary and Sufficient Conditions of Optimality
6.2.4 Optimizing Sequences
6.2.5 Optimal Cost in LQ Optimization
8 Convexity
8.1 Convex Sets and Convex Functions
8.1.1 Intersections and Maximums
8.1.2 Convexity and Differentiation
8.1.3 Convexity Preserving Operations
8.2 Basic Theorems of Convex Analysis
8.2.1 Hahn-Banach Theorem
A standard way of modeling both the physical and the virtual worlds is by writing systems of equations. General systems of equations are hard to deal with in a systematic fashion: they are tough to solve practically, and also difficult to analyze theoretically. The so-called linear equations turn out to be a nice exception: they are relatively straightforward to solve practically, and their theoretical analysis is supported by a rich and powerful theory. This chapter covers the most elementary aspects of linear equations, grouped around the statement that a system of n scalar linear equations with n scalar unknowns (counted properly) has a unique solution. It introduces the notions of vector space and linear transformation as mathematical abstractions of linearity, dimension of a vector space as an accurate measure of the number of equations or number of variables, and matrix of a linear transformation as a representation and practical computation tool.
1.1 Motivation
A linear equation with real variables has the form A(v) = u, where A : V → U is a given linear function, V and U are real vector spaces, u ∈ U is a given element of U, and v ∈ V is an element of V to be found. The main objective of this chapter is to teach recognition of real vector space structures, assessment and representation of linear functions, and counting of dimensions.
(a) the set V of all polynomials of degree less than m, as well as the set U of all columns
of n real numbers, are real vector spaces;
In addition, in order to calculate a solution p of (1.1) for a specific data set, one can use a matrix representation of A with respect to some bases in V and U.
p = p(t, h) of two real variables, of degree less than m with respect to each of them, with real coefficients p_{r,i} ∈ IR such that

$$p(t_k, h_k) = y_k \ (k = 1, \dots, n), \qquad p(t, h) = \sum_{r=0}^{m-1} \sum_{i=0}^{m-1} p_{r,i}\, t^r h^i. \qquad (1.3)$$
Since p is defined by m² independent real parameters, one can expect that the interpolation problem will have a unique solution whenever n = m² and the interpolation nodes are pairwise distinct, i.e.

$$(t_k - t_i)^2 + (h_k - h_i)^2 \ne 0 \quad \text{for } k \ne i. \qquad (1.4)$$
This, however, is not the case in general. We will consider this example in terms of real vector spaces, dimensions, and linear transformations, and show that a less trivial picture emerges.
where the positive integer m and the interpolation data [(t_k, y_k)]_{k=1}^{n} are given, while the coefficients p_i, q_i of the polynomials p, q are to be determined.
Interpolation by rational functions offers the possibility of much better use of the free
parameters to match given data. On the other hand, the theoretical analysis becomes
more involved compared to the polynomial case, as the function mapping (p, q) to the
column of the values of p/q is unlikely to be linear. We will show that finding a linear equation interpretation is more challenging in this case, but still possible, and that it yields nice conditions for existence and uniqueness of solutions of the rational interpolation problem.
How can one be sure that this system of equations has a non-zero solution? One way to see this is to recognize the set of all possible functions q : {1, . . . , n} → IR, assigning a real number to each node, as a vector space, and the transformation mapping q to the sequence of values appearing in (1.6) as a linear function. Comparing the dimensions of the two vector spaces will lead to the observation that (1.6) has n variables but only n − 1 equations (if counted properly). Hence, a non-zero solution of (1.6) does exist.
Definition 1.1 A set V and two functions p : V × V → V ("p" for "plus") and s : IR × V → V ("s" for "scale") are said to define a real vector space if there exists an element 0_V ∈ V (called the zero of V) such that conditions (V1)-(V8), listed below, are satisfied, where v + u and cv, for v, u ∈ V and c ∈ IR, are used as shortcuts for p(v, u) and s(c, v) respectively:

(V1) (v + u) + w = v + (u + w) for all v, u, w ∈ V;

(V2) c_1(c_2 v) = (c_1 c_2)v for all v ∈ V, c_1, c_2 ∈ IR;

(V3) v + u = u + v for all v, u ∈ V;

(V4) c(v + u) = (cv) + (cu) for all v, u ∈ V, c ∈ IR;

(V5) (c_1 + c_2)v = (c_1 v) + (c_2 v) for all v ∈ V, c_1, c_2 ∈ IR;

(V6) v + 0_V = v for all v ∈ V;

(V7) 0 · v = 0_V for all v ∈ V;

(V8) 1 · v = v for all v ∈ V.
To streamline notation, several conventions are typically used, as long as this does not
cause ambiguity:
(e) standard operation priority rules (do multiplication and division before addition and subtraction when not sure, etc.) are applied, so that, for example, c_1 v + c_2 u means (c_1 v) + (c_2 u) and not c_1(v + (c_2 u)).
Note that, once the addition function is fixed, only one element can be suitable to play the role of the zero of the vector space. Indeed, if v_1, v_2 ∈ V are such that v_1 + v = v and v_2 + v = v for every v ∈ V then

v_2 = v_1 + v_2 = v_2 + v_1 = v_1.
Mathematical formality requires one to make a distinction between the set V and the triplet 𝒱 = (V, p, s), because, on the same set, vector operations satisfying (V1)-(V8) can be defined in different ways. While 𝒱 is the "true" vector space, it is common to ignore the difference between V and 𝒱 unless this causes ambiguity, so that v ∈ 𝒱 should be interpreted as v ∈ V.
Multiplication of a vector by another vector is not included in the definition of a real vector space (another inadmissible operation is addition of a scalar to a vector). Though a multiplication of vectors satisfying the distributive, associative, and commutative laws can be defined on every vector space, there is usually no natural way of doing this.
For example, IR^1 is the same as IR, IR^2 is the set of all columns of two real numbers (essentially, the same thing as IR × IR), etc. The set IR^n appears naturally when analyzing interactions of n real parameters.
The elements of IR^n can be added (to each other) and scaled (by a real number) in a natural (component-wise) fashion, to get another element of IR^n:

$$\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} + \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{bmatrix}, \qquad c \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} c x_1 \\ c x_2 \\ \vdots \\ c x_n \end{bmatrix}.$$
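As a quick numerical sanity check (not part of the original text), the axioms (V1)-(V8) can be spot-checked for these component-wise operations. The Python sketch below uses integer entries so that all equalities hold exactly; the helper names add, scale, and rand_vec are ad hoc choices.

```python
import random

def add(x, y):                      # component-wise addition in IR^n
    return tuple(a + b for a, b in zip(x, y))

def scale(c, x):                    # component-wise scaling in IR^n
    return tuple(c * a for a in x)

def rand_vec(n):                    # integer entries keep all checks exact
    return tuple(random.randint(-10, 10) for _ in range(n))

n = 4
zero = (0,) * n
for _ in range(1000):
    u, v, w = rand_vec(n), rand_vec(n), rand_vec(n)
    c1, c2 = random.randint(-5, 5), random.randint(-5, 5)
    assert add(add(v, u), w) == add(v, add(u, w))                    # (V1)
    assert scale(c1, scale(c2, v)) == scale(c1 * c2, v)              # (V2)
    assert add(v, u) == add(u, v)                                    # (V3)
    assert scale(c1, add(v, u)) == add(scale(c1, v), scale(c1, u))   # (V4)
    assert scale(c1 + c2, v) == add(scale(c1, v), scale(c2, v))      # (V5)
    assert add(v, zero) == v                                         # (V6)
    assert scale(0, v) == zero                                       # (V7)
    assert scale(1, v) == v                                          # (V8)
```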
Linear Subspaces
It is common to define an interesting real vector space V as a subset V U of a larger
real vector space U, with the addition and scaling operations inherited from U. Indeed,
for this construction to be meaningful, V must be closed under the addition and scaling
operations from U.
Definition 1.2 Let U be a real vector space. A subset V ⊂ U is called a linear subspace of U if v_1 + v_2 ∈ V and cv ∈ V whenever v, v_1, v_2 ∈ V and c ∈ IR.
Equipped with the addition and scaling operations inherited from U, a linear subspace
V becomes a real vector space in its own right.
For example, the subset C[0, 1] ⊂ IR^{[0,1]} of all continuous functions f : [0, 1] → IR is a linear subspace of IR^{[0,1]}, and hence is automatically a real vector space with respect to the usual operations of addition and scaling.
Elementary Geometry
A vector space can be associated naturally with the usual elementary geometry on the plane (or in space), according to the following set of definitions, which translates the geometric terms of points, lines, segments, distances, etc. into the language of real vector spaces.
(a) The set V is the set of all points on the plane (or in space). An arbitrarily selected point O is to be called zero.
[Figure: a line through O showing the points O, 0.5A, 0.75A, A, and 1.5A, together with the points B and A + B, illustrating addition and scaling of points on the plane.]
With this definition, the plane and the three-dimensional space become real vector spaces, and the axioms (V1)-(V8) become a set of relatively simple theorems of elementary geometry. It becomes possible to express geometric objects in terms of real vector space operations. For example, a line passing through two points A ≠ B can be viewed as the set

(AB) = {(1 − t)A + tB : t ∈ IR};

two lines (A_1 B_1) and (A_2 B_2) can be called parallel if and only if

A_1 − B_1 = c(A_2 − B_2)
for some c ∈ IR, etc. While not all geometric notions are covered by this association (for example, there are no means for defining angles or comparing lengths of non-parallel segments within the framework of real vector spaces), the addition of a scalar product operation, to be discussed in later chapters, makes a real vector space an accurate representation of elementary geometry. As a result, linear algebra becomes the most powerful and convenient tool for proving geometric theorems, though most people learn linear algebra too late to use it this way.
These operations, however, do not make X a real vector space. For example, the associativity law does not work, as, for v_1 = v_2 = 0.5 ∈ X
Abstract Statements
Here are two examples of statements proven at a very abstract level. Both represent obvious claims, which nevertheless have to be derived from the axioms.

The first statement is about cancelling identical terms on both sides of equalities between vector sums: if v + w = u + w then v = u.
Proof.

v = v + 0_V (by V6) = v + 0 · w (by V7) = v + (1 + (−1))w
  = v + (1 · w + (−1) · w) (by V5) = v + (w + (−1) · w) (by V8) = (v + w) + (−1) · w (by V1)
  = (u + w) + (−1) · w (by assumption) = u + (w + (−1) · w) (by V1) = u,

where the final equality repeats the first five steps in reverse order.
The second statement claims that scaling a zero vector results in a zero vector no
matter what the scaling parameter is.
Proof.

c · 0_V = c(0 · 0_V) (by V7) = (c · 0) · 0_V (by V2) = 0 · 0_V = 0_V (by V7).
In the rest of the presentation, we will not sink to this picky level of detail again.
[Figure: triangle A_1 A_2 A_3 with the midpoints M_1, M_2, M_3 of its sides and the three medians intersecting at the common point W.]
According to the definition of addition and scaling, for every two points A, B ∈ V such that A ≠ B the segment [AB] and the line (AB) are the sets

[AB] = {(1 − t)A + tB : t ∈ [0, 1]},   (AB) = {(1 − t)A + tB : t ∈ IR},

and the middle point C of segment [AB] is given by 0.5(A + B). Three points A_1, A_2, A_3 ∈ V define a triangle

A_1 A_2 A_3 = {t_1 A_1 + t_2 A_2 + t_3 A_3 : t_i ≥ 0, t_1 + t_2 + t_3 = 1}
if they do not belong to a single line. A median in A1 A2 A3 is one of the three segments
[Ai Mi ], where M1 is the middle of [A2 A3 ], M2 is the middle of [A1 A3 ], and M3 is the
middle of [A1 A2 ].
The theorem of interest claims that [A_1 M_1], [A_2 M_2], and [A_3 M_3] have a common point W. To prove this, note that W must be of the form W = t_1 A_1 + t_2 A_2 + t_3 A_3, where t_1 + t_2 + t_3 = 1, and, due to the symmetry (re-naming points A_1, A_2, A_3 should have no effect on W), one would expect that t_1 = t_2 = t_3. This yields W = (1/3)(A_1 + A_2 + A_3) as a guess for what W actually is. Now it remains to verify that W indeed belongs to all three medians. Since

(1/3)(A_1 + A_2 + A_3) = (1/3)A_1 + (2/3)((1/2)(A_2 + A_3)) = (1/3)A_1 + (2/3)M_1,

it follows that W ∈ [A_1 M_1]. The inclusions W ∈ [A_2 M_2] and W ∈ [A_3 M_3] are derived in a similar way.
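The verification is easy to replicate numerically. The following Python sketch (an illustration, not from the original text) checks, with exact rational arithmetic, that W = (1/3)(A_1 + A_2 + A_3) equals (1/3)A_i + (2/3)M_i for each i, i.e. that W divides each median in ratio 2:1; the triangle coordinates are arbitrary.

```python
from fractions import Fraction as Fr

# Vertices of an arbitrary triangle, with exact rational coordinates.
A1, A2, A3 = (Fr(0), Fr(0)), (Fr(4), Fr(1)), (Fr(1), Fr(5))

def comb(points, coeffs):           # affine combination of plane points
    return tuple(sum(c * p[i] for c, p in zip(coeffs, points)) for i in range(2))

M1 = comb((A2, A3), (Fr(1, 2), Fr(1, 2)))   # midpoints of the sides
M2 = comb((A1, A3), (Fr(1, 2), Fr(1, 2)))
M3 = comb((A1, A2), (Fr(1, 2), Fr(1, 2)))

W = comb((A1, A2, A3), (Fr(1, 3), Fr(1, 3), Fr(1, 3)))

# W = (1/3)A_i + (2/3)M_i lies on every median [A_i M_i].
for A, M in ((A1, M1), (A2, M2), (A3, M3)):
    assert W == comb((A, M), (Fr(1, 3), Fr(2, 3)))
```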
Definition 1.3 Let V and U be two real vector spaces. A function A : V → U is called linear if

A(v_1 + v_2) = A(v_1) + A(v_2),   A(cv) = cA(v)

for every v, v_1, v_2 ∈ V and c ∈ IR.
Definition 1.4 The null-space (or kernel) of a linear function A : V → U is the set

ker(A) := {v ∈ V : Av = 0}.
Example 1.1 If V is a linear subspace of a real vector space U, the inclusion function A : V → U which maps every v to itself is linear. An important special case is the identity function I_V : V → V defined by I_V(v) = v. When V = IR^n, the notation I_n is frequently used in place of I_V. Also, when the vector space V can be figured out from the context, I can be used instead of I_V.
Example 1.2 Let V be the real vector space C[0, 1] of all continuous functions f : [0, 1] → IR. The integration formula

(Af)(t) = ∫_0^t f(τ) dτ

defines a linear operator A : V → V. The operator maps f ∈ C[0, 1] to a continuously differentiable function g ∈ C[0, 1] such that g(0) = 0 and g′(t) = f(t) for all t. The proof of linearity simply refers to the properties of integration. In contrast, the formula Lf = f(0.5) defines a linear functional L : C[0, 1] → IR.
defines a functional H : C[0, 1] → IR which is not linear. To prove the absence of linearity, one typically needs a counterexample. For instance, let f(t) ≡ 1. Then H(f) = 1 but H(2f) = 4. Since H(2f) ≠ 2H(f), the function H is not linear.
Example 1.4 Let V be the real vector space of all polynomial functions f : IR → IR. Let A : V → V and B : V → V be the differentiation and multiplication by the independent variable operators, defined by

(Af)(t) = f′(t),   (Bf)(t) = t f(t).

Then BA maps f to g, where g(t) = t f′(t), while AB maps f to h, where h(t) = t f′(t) + f(t). Therefore AB − BA maps f to f, i.e.

AB − BA := AB + (−1)BA = I.
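The identity AB − BA = I is easy to spot-check with a computer algebra system. A minimal sketch (not from the original text), using sympy on an arbitrary polynomial:

```python
import sympy as sp

t = sp.symbols('t')
f = 3*t**4 - 2*t**2 + 7*t - 1        # an arbitrary polynomial test vector

ABf = sp.diff(t * f, t)              # (AB)f = (t f(t))' = t f'(t) + f(t)
BAf = t * sp.diff(f, t)              # (BA)f = t f'(t)

assert sp.expand(ABf - BAf) == sp.expand(f)   # (AB - BA)f = f
```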
In essence, the proof of Theorem 1.1 is simple. If V_0 = V then the statement is obvious. Otherwise, take a vector w ∈ V, w ∉ V_0, and consider the set

V_1 = {v + tw : v ∈ V_0, t ∈ IR},
Lemma 1.3 (Zorn's Lemma) Let X be a non-empty set. Let Y be a subset of X × X with the following properties:

Then there exists x_max ∈ X such that (x_max, x) ∉ Y for all x ≠ x_max.
Zorn's Lemma can be interpreted in the following way. The set Y defines a partial order on X, in which x_1 is said to be less than or equal to x_2 (notation x_1 ≤ x_2) whenever (x_1, x_2) ∈ Y. The order is called partial because it is possible for both inequalities x_1 ≤ x_2 and x_2 ≤ x_1 to be false. Assumptions (a) and (b) reflect the usual properties of ordering: the inequalities x_1 ≤ x_2 and x_2 ≤ x_1 are satisfied simultaneously if and only if x_1 = x_2, and the inequalities x_1 ≤ x_2, x_2 ≤ x_3 imply x_1 ≤ x_3. Condition (c) establishes that every completely ordered subset X_0 of X has an upper bound x_ub ∈ X, i.e. an element, not necessarily belonging to X_0, such that the inequality x ≤ x_ub holds for all x ∈ X_0. The conclusion of the Lemma is that the set X has at least one maximal element x_max, i.e. one for which x_max ≤ x holds for no x ≠ x_max.
To apply Zorn's Lemma to prove Theorem 1.1, define X as the set of all linear extensions F : W → U of A_0 (i.e. such that V_0 ⊂ W ⊂ V and F v = A_0 v for all v ∈ V_0). Since A_0 ∈ X, the set X is not empty. Define the partial order on X according to which F_1 : W_1 → U is less than or equal to F_2 : W_2 → U if and only if W_1 ⊂ W_2 and F_2 v = F_1 v for all v ∈ W_1. Then conditions (a), (b) of Lemma 1.3 are evidently satisfied. Moreover, condition (c) is satisfied as well because, for a completely ordered subset X_0 of extensions F : W → U, an upper bound F_ub can be chosen as the function mapping the union W_ub of the subspaces W to U (since each element w ∈ W_ub belongs to some W, the value of F_ub(w) is well defined for all w ∈ W_ub). Therefore X has a maximal element F_max.
Theorem 1.2 (and some of its generalizations) is frequently used in linear algebra and functional analysis related proofs. One of its interpretations is in terms of information recovery, where the transformation v ↦ Av is viewed as a measurement process, which is associated with some loss of the information contained in v. Assuming that Bv is the information to be recovered, the function C is the "filter" converting the measurement Av into the needed data Bv. Obviously, whatever is in the null-space of A is lost in the measurement process. Theorem 1.2, essentially, claims that the rest can be recovered by a linear filter.
Example 1.5 A linear function A : V → U such that ker(A) = {0} always has a left inverse A⁺: a linear function A⁺ : U → V such that A⁺A = I_V. To prove this, use Theorem 1.2 with W = V and B = I_V.
Definition 1.5 Let V be a real vector space. The dual space V^♯ is the real vector space of all linear functions f : V → IR (a subspace of IR^V).
and hence it is common to associate (IR^n)^♯ with the vector space IR^{1,n} of all 1-by-n real matrices.
The vector spaces IR^n and IR^{1,n} appear to be very similar. In particular, the linear function F : IR^n → IR^{1,n} defined by

$$F \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} x_1 & \dots & x_n \end{bmatrix}$$

establishes a linear bijection between IR^n and IR^{1,n}. In general, however, a linear bijection between V and V^♯ does not exist.
Definition 1.6 Let V, U be real vector spaces. For a linear function A : V → U, its dual A^♯ is the linear function A^♯ : U^♯ → V^♯ mapping each linear functional f : U → IR to the linear functional g = A^♯f : V → IR defined by g(v) = f(Av).
Example 1.7 A linear function A : IR^m → IR^n can be represented by its matrix a, which allows one to view A as multiplication by a matrix on the left: v ↦ av. If (IR^m)^♯ and (IR^n)^♯ are represented as the vector spaces of row matrices IR^{1,m} and IR^{1,n} respectively, the dual A^♯ becomes multiplication by the matrix a on the right: q ∈ IR^{1,n} ↦ qa ∈ IR^{1,m}.
It is easy to verify that the duality transformation A ↦ A^♯ satisfies the usual identities similar to those valid for transposition of matrices:

(A + B)^♯ = A^♯ + B^♯,   (AB)^♯ = B^♯A^♯,   (A^{−1})^♯ = (A^♯)^{−1},   etc.
While the notion of orthogonality is not generally available for pairs of vectors from the same real vector space, it can be applied to elements of a vector space and its dual. The following statement is a commonly used relation between the null space of a linear function and the range of its dual; it is actually a special case of Theorem 1.2.
Theorem 1.3 If V, U are real vector spaces and A : V → U is a linear function then

(ker(A))^⊥ = R(A^♯).
Proof. If f ∈ (ker(A))^⊥ then f v = 0 for all v such that Av = 0. According to Theorem 1.2, applied with B = f and W = IR, there exists C ∈ U^♯ such that f = CA, i.e. f = A^♯C ∈ R(A^♯). Conversely, if f ∈ R(A^♯) then there exists g ∈ U^♯ such that f v = gAv for every v ∈ V, and hence f v = 0 whenever Av = 0.
where v_i are elements of the same real vector space V, and c_i are real numbers, called the coefficients of the linear combination.
Linear Independence
Informally speaking, a finite sequence of vectors is called linearly independent if none of its elements can be represented as a linear combination of the others.
When a sequence of measurements is performed, some may become redundant: for example,
when n = 3, and three measurements are defined by
$$w(1) = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \qquad w(2) = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \qquad w(3) = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix},$$
the third one does not produce any additional information, as the identity
is satisfied no matter what the values of the unknown parameters ai are. Hence, the outcome
of the third measurement can be predicted using the results from the first two.
The redundancy of the third measurement can be interpreted in terms of linear independence (more precisely, the lack of it): the identity (1.8) holds because

w(3) = w(1) − w(2),

which means that the sequence (w(1), w(2), w(3)) is not linearly independent. The interpretation
holds in the general case: a sequence (w(1), . . . , w(k)) of k measurements w(i) ∈ IR^n contains a redundant one if and only if it is not linearly independent.
Example 1.9 Let V ⊂ IR^IR be the real vector space of all polynomial functions f : IR → IR. The sequence of monomials (1, t, . . . , t^n) is linearly independent for every n. Indeed, every linear combination of these monomials is a polynomial whose coefficients are the coefficients of the linear combination. Since a polynomial equals zero identically only if all of its coefficients are zero, the conditions of Definition 1.8 are satisfied.
The definition only allows finite bases. While bases with an infinite number of elements
are very important, their proper use requires an adequate framework for approximation
and convergence of vectors, something that the concept of a general real vector space
does not provide. For vector spaces of functions, an intuitively appealing notion of a basis
would call for the possibility of arbitrarily good approximation of every vector by a linear
combination of its elements.
The following statement describes bases of a general real vector space.
Theorem 1.4 Let V be a real vector space, V ≠ {0}. Then one of the following conditions is satisfied:

(a) for every positive integer k there exists a linearly independent sequence (v_1, . . . , v_k) of k elements of V;

(b) there exists a positive integer n such that every linearly independent sequence (v_1, . . . , v_k) of v_i ∈ V has k ≤ n elements, and can be extended to a finite basis (v_1, . . . , v_n) of V.

The number n from case (b) of Theorem 1.4 is called the dimension of V, notation n = dim(V). By convention, dim(V) = ∞ when case (a) takes place, and dim(V) = 0 for V = {0}.
The proof of Theorem 1.4 is based on the following observation.
where a_{i,m} are real numbers. If u_0 = 0, the set Y is obviously not linearly independent. Otherwise a_{0,k} ≠ 0 for some k. For b_i = a_{i,k}/a_{0,k}, each vector u_i − b_i u_0 is a linear combination of the elements of the sequence S_w obtained by excluding v_k from S_v. Since S_w has n elements, by the inductive assumption the sequence

(u_1 − b_1 u_0, . . . , u_m − b_m u_0)
is not linearly independent, i.e. there exist real numbers c_1, . . . , c_m, not all of which equal zero, such that

0 = c_1(u_1 − b_1 u_0) + · · · + c_m(u_m − b_m u_0) = −(c_1 b_1 + · · · + c_m b_m)u_0 + c_1 u_1 + · · · + c_m u_m,

which proves Lemma 1.4 for m = n + 1.
Therefore a linearly independent sequence cannot have more elements than a basis. On
the other hand, if a linearly independent sequence Sv = (v1 , . . . , vk ) is not a basis, there
exists a vector vk+1 which is not a linear combination of Sv , and hence can be appended
to Sv to produce a longer linearly independent sequence (v1 , . . . , vk , vk+1). If this process
of extending Sv can be continued without termination, V has no finite basis. Otherwise,
the process terminates at a basis of V . This concludes the proof of Theorem 1.4.
Example 1.11 Let V be the set of all continuous functions f : [0, 1] → IR which are piecewise linear, in the sense that they can be represented in the form f(t) = a_k t + b_k on each of the intervals t ∈ [(k − 1)/n, k/n], where k = 1, . . . , n, and n is a fixed number. It is easy to see that V is a real vector space. What is the dimension of V?
Intuition tells us that an element f ∈ V is defined by n + 1 real parameters a_0, a_1, . . . , a_n representing the values a_i = f(i/n) of f(t) at t = 0, 1/n, 2/n, . . . , 1. Hence one would expect the dimension of V to be n + 1. To prove this, consider the sequence S_f of n + 1 functions

$$f_i(t) = \begin{cases} 1 - |nt - i|, & |nt - i| \le 1, \\ 0, & \text{otherwise}, \end{cases} \qquad i = 0, 1, \dots, n.$$
It is sufficient to establish that Sf is a basis in V .
(a) Each function fi belongs to V .
(b) The value of

f_c(t) = c_0 f_0(t) + c_1 f_1(t) + · · · + c_n f_n(t)

at t = i/n equals c_i for i = 0, 1, . . . , n. Hence f_c = 0 implies c_i = 0 for all i, which means that S_f is linearly independent.

(c) For every f ∈ V the function

g(t) = f(0)f_0(t) + f(1/n)f_1(t) + · · · + f(1)f_n(t)

belongs to V and equals f at the points t = i/n, i = 0, 1, . . . , n. Hence g(t) = f(t) for all t, which proves that S_f is a generating set.
As established in (a)-(c), Sf is a basis of V , and hence dim(V ) = n + 1.
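The "hat" basis is easy to reproduce numerically. The Python sketch below (illustrative, not part of the original text; the names hat, basis, and nodal are ad hoc) builds the functions f_i and checks that g = Σ f(i/n) f_i reproduces a piecewise linear f given by its nodal values, both at the grid points and in between.

```python
def hat(i, n):
    # f_i(t) = 1 - |nt - i| where |nt - i| <= 1, and 0 otherwise
    return lambda t: max(0.0, 1.0 - abs(n * t - i))

n = 5
basis = [hat(i, n) for i in range(n + 1)]

# A piecewise linear f specified by its nodal values f(i/n).
nodal = [0.0, 2.0, -1.0, 0.5, 3.0, 1.0]
g = lambda t: sum(c * fi(t) for c, fi in zip(nodal, basis))

for i in range(n + 1):                       # g matches f at every node
    assert abs(g(i / n) - nodal[i]) < 1e-12

# Between nodes, g is the linear interpolant; check at t = 0.3 (between 1/n and 2/n).
expected = 2.0 + (-1.0 - 2.0) * (0.3 * n - 1)
assert abs(g(0.3) - expected) < 1e-12
```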
(b) An n-by-m real matrix a can be scaled by a real number c ∈ IR. The result is the matrix of cA, where A : IR^m → IR^n is the linear function with matrix a. Algorithmically, the operation is component-wise multiplication by c.
(c) An n-by-m matrix a can be multiplied on the left by a k-by-n matrix b. The result ba is the matrix of BA, where A : IR^m → IR^n and B : IR^n → IR^k are the linear functions with matrices a and b respectively. Algorithmically, the operation means

$$(ba)_{r,q} = b_{r,1} a_{1,q} + b_{r,2} a_{2,q} + \dots + b_{r,n} a_{n,q} = \sum_{i=1}^{n} b_{r,i}\, a_{i,q}.$$
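A minimal Python sketch (an illustration, not from the text) implementing this multiplication rule directly; the function name matmul is an ad hoc choice.

```python
def matmul(b, a):
    # b is k-by-n, a is n-by-m; (ba)[r][q] = sum_i b[r][i] * a[i][q]
    k, n, m = len(b), len(a), len(a[0])
    assert all(len(row) == n for row in b), "inner dimensions must agree"
    return [[sum(b[r][i] * a[i][q] for i in range(n)) for q in range(m)]
            for r in range(k)]

b = [[1, 2], [0, 1], [3, 0]]        # 3-by-2
a = [[1, 0, 2], [0, 1, 1]]          # 2-by-3
print(matmul(b, a))                 # [[1, 2, 4], [0, 1, 1], [3, 0, 6]]
```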
It is easy to verify that the three elementary operations introduced for real matrices
satisfy the familiar laws of associativity and distributivity. It is important to remember
that commutativity, while valid for the addition of matrices, does not hold for multiplica-
tion. In fact, unless k = m, the product ab is not even defined for a k-by-n matrix b and
an n-by-m matrix a! Another important difference is that a product ba of two non-zero
matrices can be zero.
The association between operations on matrices and operations on linear functions ex-
tends to arbitrary real vector spaces of finite dimensions, as long as proper bases are used.
For scaling and addition, the conclusion is obvious. The following statement establishes
the connection for multiplication.
Theorem 1.6 Let V, U, W be real vector spaces with bases b_V, b_U, and b_W respectively. Let a, b be matrices of linear operators A : V → U and B : U → W with respect to the pairs of bases (b_V, b_U) and (b_U, b_W) respectively. Then ba is the matrix of BA with respect to the bases b_V and b_W.
Proof. A basis b_V = (v_1, . . . , v_m) of a real vector space defines a linear function T_V : IR^m → V according to

$$T_V \begin{bmatrix} x_1 \\ \vdots \\ x_m \end{bmatrix} = x_1 v_1 + \dots + x_m v_m.$$

Since b_V is a basis, T_V is a bijection, and hence has a well defined inverse T_V^{−1}. The bases b_U and b_W define similar linear bijections T_U : IR^n → U and T_W : IR^k → W, where n and k are the dimensions of U and W respectively. By construction, the matrix a of A with respect to the bases b_V, b_U is the matrix of the linear function α : IR^m → IR^n defined by α = T_U^{−1}AT_V. Similarly, the matrix b of B with respect to the bases b_U, b_W is the matrix of the linear function β : IR^n → IR^k defined by β = T_W^{−1}BT_U, and the matrix c of BA is the same as the matrix of γ = T_W^{−1}BAT_V. Since

βα = T_W^{−1}BT_U T_U^{−1}AT_V = T_W^{−1}BAT_V = γ,

the conclusion c = ba follows.
Proof. If dim(ker(A)) = ∞ then dim(V) = ∞ and hence equality (1.10) holds. Otherwise, if dim(ker(A)) = k, where k < ∞, let b_0 = (v_1, . . . , v_k) be a basis in ker(A).
implies

c_{k+1}v_{k+1} + · · · + c_n v_n ∈ ker(A),

which means that the sequence b is not linearly independent.

If dim(V) = ∞, b_0 can be extended to an arbitrarily long linearly independent sequence b = (v_1, . . . , v_n), and the arguments used for dim(V) < ∞ can be used to show that dim(R(A)) = ∞.
it follows from (1.10) that dim(R(A)) = dim(U). According to Theorem 1.4, this means that R(A) = U, i.e. that the equation Av = u has a solution v ∈ V for every u ∈ U.
The number dim(R(A)) is important enough to have a special term for it.
The solution v of Av = u will be unique when ker(A) = {0}. Indeed, if Av_1 = u and Av_2 = u then A(v_1 − v_2) = 0, which means v_1 − v_2 ∈ ker(A). Conversely, if Av = u and w ∈ ker(A) then A(v + w) = u as well.
Example 1.14 Returning to the setup of interpolation by polynomials of a single real variable, one can use Theorem 1.7 to prove that for every sequence (t_1, . . . , t_m) of m different real numbers, and every sequence (y_1, . . . , y_m) of m real numbers, there exists a unique polynomial p = p(t) of degree less than m such that p(t_k) = y_k for all k = 1, . . . , m.

Indeed, let V be the real vector space of all polynomials of degree less than m. The function A : V → IR^m defined by

$$A(p) = \begin{bmatrix} p(t_1) \\ \vdots \\ p(t_m) \end{bmatrix}$$

is linear. Since V has the basis of m monomials (1, t, . . . , t^{m−1}), its dimension is m. Since the dimension of IR^m is m as well, it is sufficient to show that ker(A) = {0}, i.e. that Ap = 0 implies p = 0. Since Ap = 0 means that the m different numbers t_i are roots of the polynomial p of degree less than m, the equality p ≡ 0 is implied, which completes the proof of feasibility of polynomial interpolation.
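The argument translates directly into a computation: in the monomial basis, A is represented by the Vandermonde matrix, so the interpolating polynomial is obtained by solving one square linear system. A numpy sketch (illustrative only, with arbitrary data):

```python
import numpy as np

t = np.array([0.0, 1.0, 2.0, 4.0])    # m distinct interpolation nodes
y = np.array([1.0, 3.0, -2.0, 5.0])   # prescribed values

# Matrix of A in the monomial basis (1, t, ..., t^{m-1}): a Vandermonde matrix.
V = np.vander(t, increasing=True)

c = np.linalg.solve(V, y)             # unique coefficients, since the t_k differ
assert np.allclose(np.polyval(c[::-1], t), y)
```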
Example 1.15 Let V be the set of those real-valued functions f : IR × IR → IR of two real variables which can be represented in the polynomial form

$$f(t, h) = \sum_{i=0}^{2} \sum_{k=0}^{2} f_{i,k}\, t^i h^k,$$

where f_{i,k} are real constants, and satisfy the rotational symmetry constraint
Bi-Orthogonality
Let S_u = (u_1, . . . , u_n) be a sequence of elements of a real vector space U. A sequence S_f = (f_1, . . . , f_n) of linear functionals f_i ∈ U^♯ is called bi-orthogonal to S_u when

$$f_i(u_k) = \delta_{ik} := \begin{cases} 1, & i = k, \\ 0, & i \ne k. \end{cases}$$

v = f_1(u)v_1 + · · · + f_n(u)v_n,

A(c_1 v_1 + · · · + c_n v_n) = u

yields c_k = f_k u.
The following statement claims that for every linearly independent sequence of linear functionals there exists a bi-orthogonal sequence of vectors.
The proof of Theorem 1.8 is constructive and essentially describes the Gaussian elimination algorithm for solving systems of linear equations.
Proof. The "if" part is easy: if S_f and S_u are bi-orthogonal then applying the functional

f = c_1 f_1 + · · · + c_n f_n

is such that

f_n w = f u ≠ 0,   f_1 w = f_2 w = · · · = f_m w = 0.

Hence, for u_n = (f_n w)^{−1}w the sequence S_u = (u_1, . . . , u_n) is bi-orthogonal to S_f.
Proof. According to Theorem 1.9, for every linearly independent sequence in V there exists a linearly independent sequence in V^♯ (the bi-orthogonal one) of the same length. Conversely, according to Theorem 1.8, for every linearly independent sequence in V^♯ there exists a linearly independent sequence in V (the bi-orthogonal one) of the same length.
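In IR^n, with functionals represented as row vectors, a bi-orthogonal sequence can be computed by matrix inversion: if the columns of an invertible matrix U are u_1, . . . , u_n, the rows of U^{−1} are bi-orthogonal functionals, since U^{−1}U = I_n. A numpy sketch (an illustration, not from the text):

```python
import numpy as np

U = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])      # columns are u_1, u_2, u_3

F = np.linalg.inv(U)                 # rows are the functionals f_1, f_2, f_3

assert np.allclose(F @ U, np.eye(3)) # bi-orthogonality: f_i(u_k) = delta_ik
```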
The next statement is a familiar relation between the dimensions of a space, its sub-
space, and the orthogonal complement of the subspace.
Proof. If dim(U) = ∞ then dim(V) = ∞ and hence the equality is satisfied no matter what the value of dim(U^⊥) is.

If dim(U) = k < ∞, let S_u = (u_1, . . . , u_k) be a basis in U, and let S_v = (u_1, . . . , u_n) be an extension of S_u to a linearly independent sequence in V. Let S_f = (f_1, . . . , f_n) be the corresponding bi-orthogonal sequence of functionals f_i ∈ V^♯. Then f_i u_r = 0 for i > k and r ≤ k, which means that (f_{k+1}, . . . , f_n) is a linearly independent sequence in U^⊥. Hence
The last statement is the well-known relation between the column and row ranks
of a matrix, translated into the language of general linear functions.
S_f = (f_1, . . . , f_k) = (A^♯g_1, . . . , A^♯g_k)
So far, the discussion of vector spaces was limited to the special case when real numbers
are used as scalars, which was reflected in the term real vector space. This chapter
adapts the standard definitions and theorems to the case of arbitrary fields of scalars. It
turns out that the basic methods of real vector spaces, presented in chapter 1, can be
generalized without modification to the case of a general field. As a side benefit, this
chapter should also give a fine opportunity to review basic matrix algebra constructions
in a new environment.
Some of the alternative fields of scalars are quite familiar. For example, vector spaces
with complex scalars are convenient for dealing with eigenvectors and eigenvalues. Vector
spaces with rational scalars are better suited for precise numerical manipulations with
matrices. The main attention of this chapter will be given to recognition and construction
of finite fields, which are used to extend the benefits of linearity to discrete mathematics.
2.1 Motivation
The mathematical notion of a field is an abstraction for data which can be handled by
operations of addition, subtraction, multiplication, and division satisfying the convenient
and familiar properties of associativity, distributivity, and commutativity. Among many
examples, the set IR of real numbers, equipped with the standard arithmetic operations, is
perhaps the most important field: its elements are used to represent values of continuously
varying, or "analog", physical parameters. Another classical example is the set C of complex numbers, originally invented to serve in intermediate calculations associated with real numbers. Practical calculations associated with real or complex numbers are typically performed using rational numbers (the set Q), which gives another example of a field.
While complex numbers appear to be perfect for representing parameters of the physical world, most of human and computer decision making is done in terms of discrete data, for which the set of admissible values is finite. Linearity of the "quantized" world is expressed by finite fields, which are finite sets on which operations of addition, subtraction,
multiplication, and division satisfying the usual properties are defined. Finite fields are
used in discrete data processing, such as coding and decoding in digital communications,
as well as to speed up calculations associated with analog data.
Error-Correcting Coding
Consider the task of using redundancy to protect a large portion of digital data from
inevitable corruption. The data could be some computer code to be stored on a CD
expected to be scratched and otherwise misused, or some scientific measurements transmitted by a weak and noisy radio signal from a deep space probe, etc. One model of the situation treats the original data as a binary sequence x = (x_1, . . . , x_m) of x_i ∈ {0, 1}, which is to be transformed (or encoded) into a longer binary sequence z = (z_1, . . . , z_n) of z_i ∈ {0, 1} in such a way that, even after z is modified (corrupted) into a different sequence y = (y_1, . . . , y_n), the original data x can still be recovered from y, provided that the difference between y and z is not too large, in a certain sense. The setup is shown in diagram (2.1), where the encoder E : {0, 1}^m → {0, 1}^n and decoder D : {0, 1}^n → {0, 1}^m are functions to be designed, while C : {0, 1}^n → {0, 1}^n is given.
x  ↦(E)  z = E(x)  ↦(C)  y = C(z, w)  ↦(D)  x = D(y)    (2.1)
When the bits z_i represent storage on a compact disc, the errors, caused by scratches and manufacturing defects, are expected to come in bursts, which makes the independent bit errors model inadequate.
A common sense idea for encoding is that of "mixing", in the sense that every single bit z_i of z = E(x) should depend on a large number of bits of x:

z = F(x) :  z_i = f_i(x_1, . . . , x_m)  (i = 1, . . . , n),    (2.2)
so that, when a few components of z are corrupted in the transition from z to y, there is still enough information to restore x from y. This raises the issue of complexity of decoding, since, even when no data corruption occurs (z = y), recovering x from y means solving system (2.2) of n equations with m variables. Since, in general, only linear systems are inexpensive to solve, one would want (2.2) to satisfy the requirements of linearity. However, the concept of real vector space linearity does not apply here because, by the nature of the application, the variables x_i are discrete, i.e. can take only a finite number of values.
A significant benefit of understanding vector spaces over general fields is the realization
that there is a way of treating the elements of {0, 1} (or, more generally, sequences of bits
of fixed length k, as well as some other finite sets) as elements of a finite field, which
allows one to extend the concept of linearity, with all of its benefits, to the discrete
variable setup. The coding schemes which use linear operations on finite fields allow for
a significant simplification of the decoding process. There is a cost associated with using
linear coding: in most cases the length n of the encoded message z tends to be larger than
the theoretical minimum achievable with general nonlinear codes, but the gap appears to
be shrinking as better linear codes are being developed.
Network Coding
A set G = {(a, b)} of ordered pairs (a, b) of numbers a, b ∈ {1, . . . , n}, such that a < b for all (a, b) ∈ G, and two positive integers k, m, such that k + m ≤ n, can be used as a simplified model for delivering information from k sources to m sinks over a network with n nodes, with the interpretation that node a can send information to node b if and only if (a, b) ∈ G.
A memoryless q-bit network communication algorithm for G is described by a collection of encoding functions f_{a,b}. For a given (a, b) ∈ G, the function f_{a,b} combines the information originally available at node a with all information received at a from other nodes to produce a q-bit word to be sent to node b. Formally, f_{a,b} maps B^{N(a)+1} to B, where B = {0, 1}^q and N(a) is the number of elements in the set

R(a) = {c : (c, a) ∈ G}.

The meaning of each function f_{a,b} is given by

y(a, b) = f_{a,b}(u(a), y(c_1, a), . . . , y(c_{N(a)}, a)),   c_1 < · · · < c_{N(a)},  c_i ∈ R(a),    (2.3)
[Figure 2.1: the "butterfly" network with nodes 1-6. Nodes 1 and 2 send y(1, 3) and y(2, 3) to node 3, and y(1, 5), y(2, 6) directly to sinks 5 and 6; node 3 sends y(3, 4) to node 4, which sends y(4, 5) and y(4, 6) to the sinks.]
where u(a) is the q-bit word representing the information originating at node a, and y(a, b) is the q-bit word sent from a to b. For example, in the so-called "butterfly" network shown in Figure 2.1:

u(a) = g_{a,d}(y(c_1, d), . . . , y(c_{N(d)}, d)),   c_1 < · · · < c_{N(d)},  c_i ∈ R(d)
As in the case of error-correcting coding, network coding benefits from "mixing", which means that some of the functions f_{a,b} do not just select one of their inputs for the output (as in routing) but actually depend on all of their arguments. For example, no routing scheme can produce an acceptable network coding algorithm for the butterfly network from Figure 2.1 with k = m = 2 (i.e. sending information from nodes 1 and 2 to nodes 5 and 6), as in this case either y(3, 4) = u(1), in which case u(2) cannot be recovered at node 5, or y(3, 4) = u(2), in which case u(1) cannot be recovered at node 6.
In contrast, defining
y(1, 3) = y(1, 5) = u(1), y(2, 3) = y(2, 6) = u(2), (2.4)
y(4, 5) = y(4, 6) = y(3, 4) = y(1, 3) + y(2, 3),
where the addition is done component-wise modulo 2, yields an acceptable network code.
The general difficulty associated with mixing strategies is that the decoder will have to solve a system of many equations with many unknowns. Linearity of the encoding functions f_{a,b} aids greatly both in assessing feasibility of decoding and in reducing complexity of the decoding algorithm. Since the alphabet B = {0, 1}^q is a finite set, the framework for linearity will have to be based on finite fields. For example, the network coding algorithm defined by (2.4) is linear over the binary field ZZ_2 = {0, 1}.
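For q-bit words, the component-wise addition modulo 2 in (2.4) amounts to bitwise XOR. The Python sketch below (illustrative, not from the text; variable names follow the link labels of Figure 2.1) checks that both sinks can recover both source words:

```python
# q-bit words as Python ints; addition over ZZ_2 per bit is XOR.
def butterfly(u1, u2):
    y13, y15 = u1, u1        # node 1 forwards its word
    y23, y26 = u2, u2        # node 2 forwards its word
    y34 = y13 ^ y23          # node 3 mixes: y(3,4) = y(1,3) + y(2,3)
    y45 = y46 = y34          # node 4 forwards the mixture
    u2_at_5 = y15 ^ y45      # sink 5: u(1) + (u(1) + u(2)) = u(2)
    u1_at_6 = y26 ^ y46      # sink 6: u(2) + (u(1) + u(2)) = u(1)
    return u2_at_5, u1_at_6

for u1 in range(8):          # exhaustive check over all 3-bit words
    for u2 in range(8):
        assert butterfly(u1, u2) == (u2, u1)
```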
and then substituting the right side for x_n in the remaining equations.
Despite the theoretical simplicity of the algorithm, its implementations (such as LAPACK, which is what MATLAB relies upon) frequently fail to produce an accurate solution, even for feasible systems of equations. The implementation failures are due to the round-off errors associated with the floating point arithmetic used in most engineering calculations. The issue of sensitivity of linear equation solving with respect to round-off errors will be studied later in the course, after the concept of distance between vectors is introduced. In this chapter, it is appropriate to address the topic of solving systems of linear equations exactly.
An appealing framework for exact linear equation solvers assumes that the entries of A and y (and hence those of the recursively generated reduced matrices) are rational numbers, and hence can be represented precisely in computer memory as pairs of sequences of bits (one for the numerator and one for the denominator). Within this framework, which relies on linearity with respect to the field Q of all rational numbers, a direct implementation of the exact Gaussian elimination algorithm is possible.
Such an implementation, however, is bound to be extremely inefficient for large n. To see the reason, one can estimate the number of bits needed to store the numerators and denominators of the entries of the reduced matrices involved. Assume that the original integer entries p_{i,k}, q_{i,k} of a_{i,k} = p_{i,k}/q_{i,k} require m bits each to store. Then the numerators and denominators of

$$\tilde a_{i,k} = a_{i,k} - \frac{a_{i,n}\, a_{n,k}}{a_{n,n}} = \frac{p_{i,k}\, q_{i,n}\, q_{n,k}\, p_{n,n} - p_{i,n}\, p_{n,k}\, q_{i,k}\, q_{n,n}}{q_{i,k}\, q_{i,n}\, q_{n,k}\, p_{n,n}}$$

require, roughly, 4m bits each to store. After k steps of the Gaussian elimination procedure, one can expect to need 4^k m bits to store each entry of the reduced matrix A, which actually indicates that the Gaussian elimination algorithm, when implemented directly, has exponential complexity growth.
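The growth is easy to observe experimentally. The Python sketch below (an illustration, not part of the text) runs fraction-based elimination on a random integer matrix and prints the largest bit size of an entry after each step. Note that the 4^k estimate applies to unreduced fractions; Python's Fraction reduces to lowest terms automatically, so the printed numbers grow more slowly, but the trend is still visible.

```python
import random
from fractions import Fraction

n = 8
a = [[Fraction(random.randint(-9, 9)) for _ in range(n)] for _ in range(n)]
for i in range(n):
    a[i][i] += 50                     # keep the pivots away from zero

def bits(x):                          # bits in numerator plus denominator
    return x.numerator.bit_length() + x.denominator.bit_length()

for step in range(n - 1):
    for i in range(step + 1, n):
        r = a[i][step] / a[step][step]
        for k in range(step, n):
            a[i][k] -= r * a[step][k]
    print(step, max(bits(a[i][k]) for i in range(n) for k in range(n)))
```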
Using arithmetic calculations in finite fields can resolve the complexity growth issue. One way to apply the idea is by performing all arithmetic operations on integers modulo a sufficiently large prime number p, which means working in the finite field ZZ_p, where log(p) ∼ n(m + log(n)). This way, the number of bits required to store an entry of a reduced version of matrix A does not exceed log_2(p) + 1. A more efficient algorithm relies on finding a sufficiently good approximate solution in the field of q-adic numbers for an arbitrarily chosen prime number q, and then converting the approximation to the actual solution.
Q, IR, and C are familiar to the reader, uses them as examples, and introduces the finite fields F_q, where q = p^m and p is a prime number. A general classification of fields in terms of characteristic numbers and extensions is given, showing that every field, depending on its characteristic number, is an extension of either Q or F_p for some prime number p.
Note that, once the addition and multiplication functions are fixed, only one element can be suitable to play the role of the zero of the field. Indeed, if a, b ∈ F are such that a + x = x and b + x = x for every x ∈ F then

a = b + a = a + b = b.
Similarly, only one element can be suitable to play the role of unit of the field.
Mathematical formality requires one to make a distinction between the set F and the triplet ℱ = (F, a, m), because, on the same set, arithmetic operations satisfying the field axioms (F1)-(F9) can be defined in different ways. While ℱ is the "true" field, it is common to ignore the difference between F and ℱ (unless this causes ambiguity), so that x ∈ ℱ has the same meaning as x ∈ F.
(a) x + y_1 = x + y_2 = 0 implies y_1 = y_2;

y_1 = y_1 · 1 = y_1(x y_2) = y_2(x y_1) = y_2.
The second example states that the product of two non-zero elements of a field is not
zero.
z = z · 1 = z(xy) = (zx)y = 0 · y = 0.
(a) The element y in (F8), which, according to Lemma 2.1, is uniquely defined by x, is denoted by −x.
(b) The element y in (F9), which, according to Lemma 2.1, is uniquely defined by x, is denoted by x^{−1}.
(f) For a positive integer n, and x ∈ F, nx and x^n denote respectively the sum and the product of n copies of x;
(g) Standard operation priority rules (do multiplication and division before addition
and subtraction when not sure, etc.) are applied, so that, for example, x + yz
means x + (yz) and not (x + y)z.
For example, complex conjugation defines a homomorphism of the field C into itself.
a | 0 1 2        m | 0 1 2
0 | 0 1 2        0 | 0 0 0
1 | 1 2 0        1 | 0 1 2
2 | 2 0 1        2 | 0 2 1

a | 0 1 8        m | 0 1 8
0 | 0 1 8        0 | 0 0 0
1 | 1 8 0        1 | 0 1 8
8 | 8 0 1        8 | 0 8 1
It is easy to see that F3 and F are very much the same field, and the only difference is
in the ways the elements are labeled: F is obtained from F3 by re-naming 2 as 8. In
fact, fields F and F3 are equivalent, or isomorphic, according to the following definition.
In other words, field F is isomorphic to field G if G and the field operations of G are
obtained by re-naming the elements of F . For most practical purposes, isomorphic fields
should be treated as equal.
Example 2.2 The set Q[j] consisting of all complex numbers z of the form z = a + jb, where j = √−1 and a, b are rational numbers, with the standard 0, 1, and operations of addition and multiplication, is a field.

Proof. To verify this, use the identities
Example 2.3 The set ZZ of all integers, with the usual selection of zero and identity, and equipped with the standard operations of addition and multiplication, is not a field.

Proof. For x = 2 ∈ ZZ, there is no y ∈ ZZ such that xy = 1.
Example 2.4 The set F consisting of all real numbers z of the form z = a + bα, where α = ∛2 and a, b are rational numbers, with the standard 0, 1, and operations of addition and multiplication, is not a field.

Proof. Indeed, assume to the contrary that F is a field. Since α ∈ F, this implies α² ∈ F, i.e. α² = a + bα for some a, b ∈ Q. Multiplying the equality by α yields 2 = aα + α²b. Substituting a + bα in place of α² produces the equality 2 − ba = (a + b²)α. Since α is not a rational number, this implies 2 = ba and a + b² = 0, which yields 2 = −b³ (impossible, since −∛2 is not a rational number).
Finite Fields
A finite field can be constructed explicitly, by defining 0F , 1F , addition and multiplication
on a particular finite set F .
Example 2.5 Every field has at least two different elements: 0 = 0F and 1 = 1F . There is
a field, usually denoted by F2 (or sometimes ZZ2 ), which has no other elements. Here are the
addition and multiplication tables in ZZ2 :
a | 0 1        m | 0 1
0 | 0 1        0 | 0 0
1 | 1 0        1 | 0 1
It is easy to see that the operations in F2 are the standard addition and multiplication of integers
done modulo 2.
Example 2.6 Let p be a positive integer not smaller than 2. Let F = {0, 1, 2, . . . , p − 1} be the set of p integers between 0 and p − 1. For every x_1, x_2 ∈ F let a(x_1, x_2) be defined as the remainder of the division of the usual integer sum x_1 + x_2 by p, i.e.

$$a(x_1, x_2) = \begin{cases} x_1 + x_2, & x_1 + x_2 < p, \\ x_1 + x_2 - p, & x_1 + x_2 \ge p. \end{cases}$$

Similarly, let m(x_1, x_2) be defined as the remainder of the division of the usual integer product x_1 x_2 by p. It can be shown that the definition satisfies conditions (F1)-(F9) if and only if p is a prime number. The resulting field is usually denoted by F_p (or ZZ_p). In the simplest case p = 2 we get the field from Example 2.5.
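A brute-force check of the "if and only if" claim is straightforward, since axioms (F1)-(F8) hold for any p ≥ 2 and only the existence of multiplicative inverses (F9) can fail. A Python sketch, illustrative only:

```python
# Addition and multiplication modulo p define a field exactly when every
# non-zero element has a multiplicative inverse, i.e. when p is prime.
def is_field(p):
    return all(any((x * y) % p == 1 for y in range(1, p)) for x in range(1, p))

print([p for p in range(2, 20) if is_field(p)])   # [2, 3, 5, 7, 11, 13, 17, 19]
```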
Not every set with more than one element allows arithmetic operations which define a field. In particular, there is no way to complete the partially filled addition and multiplication tables below, defined on the five-element set {0, 1, b, c, d}, so that conditions (F1)-(F9) are satisfied:

a | 0 1 b c d        m | 0 1 b c d
0 | 0 1 b c d        0 | 0 0 0 0 0
1 | 1 0 · · ·        1 | 0 1 b c d
b | b · · · ·        b | 0 b · · ·
c | c · · · ·        c | 0 c · · ·
d | d · · · ·        d | 0 d · · ·
Theorem 2.1 Let q be a positive integer. A field F with q elements exists if and only if q = p^m, where p is a prime number and m is a positive integer. Every two fields X and Y with q elements are isomorphic.
where the sequence (x_k) is required (for the sake of uniqueness of representation) to have an infinite number of zero terms. The arithmetic operations on real numbers can be developed completely in terms of representations (2.6), though this may not be the most
(for example, 2^{−1} + 2^{−2} + 2^{−3} + · · · = 1).
is used, where it is required (for the sake of uniqueness of representation) that an infinite number of terms x_k are not equal to m − 1.
The fields of p-adic numbers are used in number theory, but are sometimes handy in common engineering calculations as well. An important application of p-adic numbers is exact and efficient solving of systems of linear equations in the field Q of rational numbers.
It is easy to see that Q_p contains a subfield isomorphic to the field of rational numbers. Therefore, for a system of equations Ax = y, where A and y are matrices with rational
entries, solving for x ∈ Q^n is equivalent to solving for x ∈ Q_p^n. Due to the nature of arithmetic operations in Q_p, it is possible to compute (sequentially and efficiently) the first few terms of the p-adic expansion of the vector x (shown here for p = 2):

x = x_0 + 2x_1 + 4x_2 + · · · .
It turns out that knowing a finite set of the first coefficients of a p-adic expansion of a rational number allows one to reconstruct the number. Therefore, the first few terms of a p-adic expansion of x are sufficient for computing it exactly. Calculation of exact rational solutions of systems of linear equations via p-adic approximations is one of the best linear equation solving algorithms available.
Proof. Let z_1, z_2, . . . , z_{n−1} be the list of all non-zero elements of F. For a non-zero element x ∈ F let y_k = x z_k. Since, for i ≠ k,

y_i − y_k = x(z_i − z_k) ≠ 0,

the elements y_1, . . . , y_{n−1} are distinct and non-zero, and hence form a permutation of z_1, . . . , z_{n−1}. Therefore, for

z = z_1 z_2 · · · z_{n−1},   y = y_1 y_2 · · · y_{n−1},

we have y = z and y = x^{n−1}z with z ≠ 0, and it follows that x^{n−1} = 1 for every non-zero element of F. Hence x^n − x = x(x^{n−1} − 1) = 0 for every element x of F.
Theorem 2.2, and some of its generalizations, are used to speed up checking whether a given large number p is a prime: instead of trying to factor p (no provably fast algorithm of such factorization is known so far), one computes the p-th power of a few integers x ∈ [2, p − 2], modulo p. According to Theorem 2.2, if the remainder of dividing x^p by p is not x for some x ∈ [2, p − 2] then p is not a prime number. The opposite is not true: for example, 4^{15} = 4 modulo 15, but 15 is not a prime number.
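In Python the test is one line thanks to built-in modular exponentiation. A sketch (not from the text) reproducing the 15 example; the name fermat_witness is an ad hoc choice:

```python
def fermat_witness(x, p):
    # True if x proves p composite: x^p mod p differs from x mod p (Theorem 2.2).
    return pow(x, p, p) != x % p

print(fermat_witness(2, 15))   # True: 2^15 = 8 (mod 15), so 15 is composite
print(fermat_witness(4, 15))   # False: 4^15 = 4 (mod 15), despite 15 = 3 * 5
```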
Definition 2.5 Let F be a field. The minimal positive integer q such that the sum of q copies of 1 ∈ F equals 0 is called the characteristic of F. If no such q exists, the characteristic of F is defined as zero.
Proof. The main idea is to look at the elements a1_F ∈ F, defined as sums of a copies of 1_F, where a ∈ IN is a positive integer, and to use the identity (ab)1_F = (a1_F)(b1_F) for all a, b ∈ IN, which follows from the basic field axioms.
To prove (a), assume to the contrary that q = ab, where a, b ∈ IN are smaller than q. Then, according to the definition of q, neither a1_F nor b1_F equals zero. However, (a1_F)(b1_F) = q1_F = 0_F, which contradicts the result of Lemma 2.2.
To prove (b), define the subfield F_0 by

$$F_0 := \{h1_F : h \in Q\}, \quad \text{where} \quad h1_F = \begin{cases} 0, & \text{for } h = 0, \\ (b1_F)/(a1_F), & \text{for } h = b/a, \ a, b \in IN, \\ -(b1_F)/(a1_F), & \text{for } h = -b/a, \ a, b \in IN. \end{cases}$$
(i) for a field of characteristic zero, a1_F ≠ 0_F for all a ∈ IN (this justifies division by a1_F);

(ii) (b_1 1_F)(a_2 1_F) = n1_F = (b_2 1_F)(a_1 1_F) whenever b_1 a_2 = n = b_2 a_1 (this shows that (b1_F)/(a1_F) is the same element of F for every representation h = b/a of a rational number h as a ratio of two integers a, b ∈ IN).
The map h ↦ h1_F is a bijection between Q and a subset F_0 of F which maps sums into sums and products into products. Hence F_0 is a subfield of F which is equivalent to Q.
To prove (c), note that the map h ∈ F_q ↦ h1_F ∈ F is a bijection between F_q and

F̃_q = {0_F, 1_F, 1_F + 1_F, . . . , (q − 1)1_F} ⊂ F,

which maps F_q-sums and F_q-products into F-sums and F-products respectively. Hence F̃_q is a subfield of F which is equivalent to F_q.
Theorem 2.3 can be interpreted in the following way: all fields are based on (or are extensions of) either Q or ZZ_p for some prime p. The fields of characteristic p ≠ 0 have a clear "finiteness" flavor (though they do not have to be finite in general). The extensions of Q include but are not limited to the usual number fields, such as IR or C.
The conventions used for real vector spaces are also applicable to vector spaces over
general fields.
Just as vector spaces over IR are called real vector spaces, vector spaces over the field
C are called complex vector spaces.
Linear Subspaces
As in the case of real vector spaces, a subset W of a vector space V over field F becomes
a vector space over F when it is closed under the operations of addition and scaling
inherited from V , i.e.
w_1 + w_2 ∈ W,  cw ∈ W   for all w_1, w_2, w ∈ W, c ∈ F.
where (f_1, . . . , f_m) is a basis in F, and the c_i ∈ F_p are arbitrary, the total number of elements in F is p^m.
Definition 2.7 Let V and U be two vector spaces over the same field F. A function A : V → U is called linear if

A(v_1 + v_2) = A(v_1) + A(v_2),   A(cv) = cA(v)

for every v, v_1, v_2 ∈ V and c ∈ F.

Definition 2.8 The null-space (or kernel) of a linear function A : V → U is the set

ker(A) := {v ∈ V : Av = 0}.
Example 2.8 An n-by-m matrix A with entries from a field F naturally defines a linear function A : F^m → F^n (it is mathematically incorrect but very convenient to denote both the matrix and the linear function by the same letter). For example, the matrix

$$A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$$

defines a linear function A : (F_3)^2 → (F_3)^2 with null space and range given by

$$\ker(A) = \left\{ \begin{bmatrix} c \\ c \end{bmatrix} : c \in F_3 \right\}, \qquad R(A) = \left\{ \begin{bmatrix} c \\ 2c \end{bmatrix} : c \in F_3 \right\}.$$
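Since (F_3)² has only nine elements, the null space and range can be found by brute force. A Python sketch (illustrative, not from the text):

```python
p = 3
A = [[1, 2], [2, 1]]

def apply(A, v):                    # matrix-vector product over F_p
    return tuple(sum(a * x for a, x in zip(row, v)) % p for row in A)

vectors = [(x1, x2) for x1 in range(p) for x2 in range(p)]
print([v for v in vectors if apply(A, v) == (0, 0)])   # ker: (0,0), (1,1), (2,2)
print(sorted({apply(A, v) for v in vectors}))          # range: (0,0), (1,2), (2,1)
```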
Example 2.9 It is still true for a general field F that, for a polynomial p of degree m with coefficients in F, the equation p(x) = 0 has not more than m different solutions x ∈ F. Therefore, when x_1, . . . , x_n are n different elements of F, and m < n, the linear function A mapping the vector space V (over F) of all polynomials p ∈ F[x] of degree not larger than m to F^n according to

$$Ap = \begin{bmatrix} p(x_1) \\ \vdots \\ p(x_n) \end{bmatrix}$$

has zero null space.
Example 2.10 Let F be a field. Let V be the vector space over F of all polynomials p with coefficients in F. Let A : V → V and B : V → V be the differentiation and multiplication by the independent variable operators defined by

$$Af = \sum_{k=1}^{n} k f_k x^{k-1}, \qquad Bf = \sum_{k=0}^{n} f_k x^{k+1} \qquad \text{for } f = \sum_{k=0}^{n} f_k x^k.$$
Then, just as in the case of polynomials with real coefficients, AB − BA maps f to f, i.e. AB − BA = I.
Definition 2.9 Let V be a vector space over field F. The dual space V^♯ is the vector space over F of all linear functions f : V → F (a subspace of F^V).
Example 2.11 Let F be a field. Every linear functional f : F^n → F has the form

$$f \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = c_1 x_1 + \dots + c_n x_n = \begin{bmatrix} c_1 & \dots & c_n \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix},$$
where c_i ∈ F, and hence it is natural to associate (F^n)^♯ with the vector space F^{1,n} of all 1-by-n matrices with entries from F. The vector spaces F^n and F^{1,n} appear to be very similar. In particular, the linear function F : F^n → F^{1,n} defined by

$$F \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} x_1 & \dots & x_n \end{bmatrix}$$
For every field F, the duality transformation A ↦ A♯ satisfies the usual identities similar to those valid for transposition of matrices:
(A + B)♯ = A♯ + B♯,  (AB)♯ = B♯A♯,  (A^{−1})♯ = (A♯)^{−1},  etc.
While the notion of orthogonality is not generally available for pairs of vectors from the same vector space, it can be applied to elements of a general vector space and its dual.
Definition 2.11 For a subspace U of a vector space V over field F,
U^⊥ := {f ∈ V♯ : f(v) = 0 ∀ v ∈ U}.
The following statement is a commonly used relation between the null space of a linear function and the range of its dual, which is actually a special case of Theorem 1.2.
Theorem 2.6 If V, U are vector spaces over field F and A : V → U is a linear function then
(ker(A))^⊥ = R(A♯).
Definition 2.12 Let V be a vector space over field F. A sequence (v1, . . . , vn) of elements vi ∈ V is called linearly independent if a linear combination
c1v1 + · · · + cnvn   (2.10)
of its elements equals zero only when all of its coefficients ci ∈ F are equal to zero.
The following statement describes bases of a vector space over a general field F .
Theorem 2.7 Let V be a vector space over field F, V ≠ {0}. Then one of the following conditions is satisfied:
(a) for every positive integer k there exists a linearly independent sequence (v1, . . . , vk) of k elements of V;
(b) there exists a positive integer n such that every linearly independent sequence (v1, . . . , vk) of vi ∈ V has k ≤ n elements, and can be extended to a finite basis (v1, . . . , vn) of V.
The number n from case (b) of Theorem 2.7 is called the dimension of V, notation n = dim(V). By convention, dim(V) = ∞ when case (a) takes place, and dim(V) = 0 for V = {0}.
Theorem 2.8 Let V, U be vector spaces over field F. Then every basis bV = (v1, . . . , vm) in V defines a one-to-one correspondence between linear functions A : V → U and sequences of m vectors g1, . . . , gm ∈ U, according to gk = Avk for k = 1, . . . , m.
The convention for representing linear functions is to form an n-by-m table (n rows and m columns) filled with the nm elements ai,k ∈ F, so that the first column is a1,1, a2,1, . . . , an,1, the second column is a1,2, a2,2, . . . , an,2, etc. The table is referred to as the matrix of A with respect to the bases bV, bU:
a = [ a1,1 a1,2 . . . a1,m
      a2,1 a2,2 . . . a2,m
      . . .
      an,1 an,2 . . . an,m ]
When A is a linear operator, i.e. when V = U, it is customary to use the same basis
for V and U: bV = bU .
Elementary operations on matrices with entries in a general field F are defined in the
same way as for matrices with real entries.
As in the case of real vector spaces, for general fields of scalars, the solution v of
Av = u will be unique when A is linear and ker(A) = {0}.
at all elements xk ∈ F. By construction, up to (2^r − m − 2)/2 errors in the sequence (qk) can be tolerated to allow recovery of the original data pk. Moreover, if each value of qk is stored as a straight sequence of r bits, the recovery algorithm will tolerate up to (2^r − m − 2)/4 bursts of errors, as long as each burst is not more than r bits long. The code is named Reed-Solomon after its inventors.
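The evaluation idea behind the code can be illustrated in a few lines; the sketch below is my own simplified illustration (a prime field and toy sizes chosen for convenience, not the parameters used in the notes): the data symbols are the coefficients of a polynomial, the codeword is its value at every field element, and any m + 1 intact values recover the data by Lagrange interpolation.

    p = 257                                  # a prime, so ZZ_p is a field (toy choice)

    def encode(data):                        # data symbols = polynomial coefficients
        return [sum(c * pow(x, k, p) for k, c in enumerate(data)) % p
                for x in range(p)]           # codeword = values at every field element

    def interpolate(points, x0):             # Lagrange interpolation mod p
        total = 0
        for xi, yi in points:
            num = den = 1
            for xj, _ in points:
                if xj != xi:
                    num = num * (x0 - xj) % p
                    den = den * (xi - xj) % p
            total = (total + yi * num * pow(den, p - 2, p)) % p   # Fermat inverse
        return total

    data = [17, 42, 0, 251]                       # m + 1 = 4 data symbols
    code = encode(data)
    four_samples = list(enumerate(code))[10:14]   # any m + 1 intact values suffice
    print(interpolate(four_samples, 0), data[0])  # p(0) = first data symbol recovered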
Gaussian Elimination
The algorithm of Gaussian elimination extends without a glitch to vector spaces over
arbitrary fields.
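For instance, a minimal Python sketch of the algorithm over Fp (my own illustration; it assumes the coefficient matrix is invertible) differs from the real-coefficient version only in that division is multiplication by a modular inverse:

    def solve_mod_p(A, b, p):
        n = len(A)
        M = [row[:] + [bi] for row, bi in zip(A, b)]     # augmented matrix
        for col in range(n):
            piv = next(r for r in range(col, n) if M[r][col] % p)   # nonzero pivot
            M[col], M[piv] = M[piv], M[col]
            inv = pow(M[col][col], p - 2, p)             # inverse via Fermat's theorem
            M[col] = [x * inv % p for x in M[col]]
            for r in range(n):
                if r != col and M[r][col]:
                    M[r] = [(x - M[r][col] * y) % p for x, y in zip(M[r], M[col])]
        return [M[r][n] for r in range(n)]

    # x1 + 2*x2 = 1, 2*x1 + 2*x2 = 0 over F_3:
    print(solve_mod_p([[1, 2], [2, 2]], [1, 0], 3))      # [2, 1]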
Let Su = (u1, . . . , un) be a sequence of elements of a vector space U over field F. A sequence Sf = (f1, . . . , fn) of linear functionals fi ∈ U♯ is called bi-orthogonal to Su when
fi(uk) = δik := { 1, i = k;  0, i ≠ k. }
In particular, when A is a linear function with Avk = uk for a basis (v1, . . . , vn), solving
A(c1v1 + · · · + cnvn) = u
yields ck = fk(u), i.e. the solution is v = f1(u)v1 + · · · + fn(u)vn.
The following statement generalizes the claim that, for every linearly independent sequence of vectors, there exists a bi-orthogonal sequence of linear functionals.
Theorems 2.10, 2.11 can be used to generalize a number of statements about the relation between dimension-counting and duality.
Chapter 3
Determinants
3.1 Motivation
Determinants play an important role in handling linear operators on finite dimensional
vector spaces. In particular, they are useful in figuring out how a solution of a system
of linear equations depends on its parameters. Among other things, determinants are
involved in the formulae for changing coordinates in multivariable integration, define barrier functions in interior point algorithms, and serve as invariants in proving infeasibility
of certain design tasks.
A direct analysis of the Gaussian elimination algorithm makes one expect that the number of digits needed to represent a denominator of x will grow exponentially with n.
Fortunately, this pessimistic upper bound is overly conservative. A more careful analysis
of the situation, conveniently performed using determinants, shows that the number of
digits in the numerator and denominator of the entries of x will actually grow not faster
than O(n log(n)).
stabilizes system (3.1) in the sense that every y(t) satisfying (3.1) and (3.2) simultaneously converges to zero exponentially as t → +∞. This can be shown by observing that, subject to (3.1) and (3.2),
d/dt {y(t)² + ẏ(t)²} = 2(1 − u(t))y(t)ẏ(t) = −0.5|y(t)ẏ(t)|,
and hence the Lyapunov function
V(y, ẏ) = y² + ẏ²
is monotonically non-increasing along the system trajectories.
While the dependence of the 2-by-2 matrix valued function M = M(t) on the coefficient function u = u(t) is complicated, the analysis of the determinants of the matrices involved shows that det(M(t)) = 1 for all t. Since M(t) would have to converge to zero as t → +∞ in the case when every solution of (3.1) converges to zero, the identity det(M(t)) = 1 proves that stabilization of (3.1) with program control is impossible. Thus, the use of the determinant as an invariant (something remaining constant as time progresses) of dynamical system (3.1) plays a critical role in establishing infeasibility of the stabilization problem.
[Figure: the parallelogram spanned by vectors A and B, with C = A + B, used to define the area function of the pair of vectors.]
A and B. With one of the vectors fixed, the function becomes almost linear, in the sense that
S0(cA, B) = cS0(A, B),  S0(A1 + A2, B) = S0(A1, B) + S0(A2, B)
when c ≥ 0 and the points A1, A2 lie on the same side of the line (OB). It is possible to get full linearity with respect to each of the arguments A, B by replacing S0 with its signed area version S = S(A, B), where S(A, B) = S0(A, B) when the points O, A, B do not belong to the same line and the line (OB) can be obtained by rotating the line (OA) around the origin O by an angle φ ∈ (0, π), taking S(A, B) = −S0(A, B) otherwise. The resulting signed area S = S(A, B) is not only multilinear (linear with respect to each of its individual arguments), but also skew symmetric, in the sense that S(A, B) = −S(B, A) for all A, B.
It can be proven that every other multilinear skew symmetric function Φ : V × V → IR has the form Φ(A, B) = c0S(A, B), where c0 ∈ IR is a real constant. This observation can be used to define the determinant det(L) of a linear function L : V → V as the constant λ such that
S(LA, LB) = λS(A, B)  ∀ A, B.
Indeed, since L is linear and S is multilinear and skew symmetric, the function SL(A, B) := S(LA, LB) is also multilinear and skew symmetric. Therefore there exists a constant λ ∈ IR such that S(LA, LB) = λS(A, B) for all A, B. Since S(A, B) ≠ 0 for some argument pairs (A, B), the constant λ ∈ IR is uniquely defined by L.
A similar definition of the determinant can be produced for an arbitrary vector space
V over an arbitrary field F .
Subject to the multilinearity assumption, the two conditions are equivalent when the
characteristic of F is not equal to 2, and the second condition is the proper specification
for the signed volume when F has characteristic 2.
To define a multilinear skew symmetric function by an explicit formula, one can use
the following classical construction.
For a positive integer n let Bn denote the set of all permutations of the elements of the set {1, . . . , n}, i.e. sequences
b = (b(1), b(2), . . . , b(n)) : b(i) ∈ {1, . . . , n},  b(i) ≠ b(k) for i ≠ k.
For example, (3, 1, 4, 2) is a permutation from B4, while (1, 2, 4) and (1, 2, 3, 1) are not.
A well known formula from combinatorics says that Bn has exactly n! elements.
Let
sign(a) = { 1, a > 0;  0, a = 0;  −1, a < 0 }
denote the sign of a real number a. For every n and every b ∈ Bn let
ε(b) = ∏_{i<k} sign(b(k) − b(i))   (3.3)
be the function which equals 1 when the number of inversions in b (i.e. pairs (i, k) of indexes such that i < k but b(i) > b(k)) is even, and equals −1 otherwise. For example, there are three inversions (b(1) > b(2), b(1) > b(4), and b(3) > b(4)) in b = (3, 1, 4, 2), and hence ε((3, 1, 4, 2)) = −1.
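The inversion count is straightforward to compute; a minimal Python check of (3.3), reproducing the example above:

    def eps(b):
        # +1 or -1 according to the parity of the number of inversions in b
        inversions = sum(1 for i in range(len(b)) for k in range(i + 1, len(b))
                         if b[i] > b[k])
        return 1 if inversions % 2 == 0 else -1

    print(eps((3, 1, 4, 2)))   # -1: three inversions, as computed in the text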
The most remarkable feature of the inversion counting function is that ε(b) changes its sign when two elements of the sequence b are interchanged to form a new permutation c.
Theorem 3.1 Let n be a positive integer. Let i, k ∈ {1, . . . , n} be such that i < k. For a b ∈ Bn let c ∈ Bn be defined according to
c(r) = { b(i), r = k;  b(k), r = i;  b(r), otherwise. }
Then ε(c) = −ε(b).
The following theorem shows how permutations and the inversion counting function can be used to construct multilinear skew symmetric forms.
Theorem 3.2 Let V be a vector space over field F. Let f = (f1, . . . , fn) be a sequence of elements fi ∈ V♯ (i.e. linear functionals fi : V → F). Then the function Φf : V^n → F defined by
Φf(v1, . . . , vn) = Σ_{b∈Bn} ε(b) ∏_{i=1}^n fi(vb(i))   (3.4)
is multilinear and skew symmetric.
Proof. Each term of the sum in (3.4) contains exactly one factor depending (linearly) on each of the arguments vi ∈ V, and hence is multilinear. Since a linear combination of multilinear functions is multilinear as well, the multilinearity of Φf follows.
To prove skew symmetry, assume that vi = vk for some i < k. Then the elements of the sum in (3.4) can be combined into pairs, where the permutations b, c defining the elements of a single pair can be obtained from each other by switching their i-th and k-th terms. Since ε(b) = −ε(c), and
f1(vb(1)) · · · fn(vb(n)) = f1(vc(1)) · · · fn(vc(n)),
the sum of terms in each pair equals zero. Hence the total value of Φf is zero as well.
Theorem 3.3 Let V be a vector space of dimension n < ∞ over field F. Let Φ0 : V^n → F be a multilinear skew symmetric function which is not identically equal to zero. Then every multilinear skew symmetric function Φ : V^n → F has the form
Φ(v1, . . . , vn) = c·Φ0(v1, . . . , vn)  ∀ vi ∈ V
for some constant c ∈ F.
Proof. First, let us show that every multilinear skew symmetric function g : V^n → F such that g(u1, . . . , un) = 0 for some basis u = (u1, . . . , un) in V is identically equal to zero.
Let us do this for n = 2 first. Since every v1, v2 ∈ V can be represented in the form v1 = c11u1 + c21u2, v2 = c12u1 + c22u2, expanding by multilinearity and using g(ui, ui) = 0 and g(u2, u1) = −g(u1, u2) yields g(v1, v2) = (c11c22 − c21c12)g(u1, u2) = 0.
The same expansion, using multilinearity and skew symmetry, can be used to show that in general, for
vk = Σ_{i=1}^n cik ui  (k = 1, . . . , n),
the representation
g(v1, . . . , vn) = h·g(u1, . . . , un)
takes place, where h = h(c11, . . . , cnn) is some function of the coefficients cik. Hence g ≡ 0 as soon as g(u1, . . . , un) = 0.
To finish the proof of Theorem 3.3, let u = (u1, . . . , un) be a basis in V. Since by assumption Φ0 is not identically equal to zero, Φ0(u1, . . . , un) ≠ 0. Then g = Φ − cΦ0, where
c = Φ(u1, . . . , un) / Φ0(u1, . . . , un),
is a multilinear skew symmetric function which takes zero value at u. Hence g ≡ 0, i.e. Φ ≡ cΦ0.
Definition 3.2 Let V be a vector space of dimension n over field F. Let A : V → V be a linear function. The determinant det(A) of A is defined by the identity
Φ(Av1, . . . , Avn) = det(A)·Φ(v1, . . . , vn)  ∀ vi ∈ V,
where Φ : V^n → F is an arbitrary multilinear skew symmetric function which is not identically equal to zero (by Theorem 3.3, the value of det(A) does not depend on the choice of Φ).
In other words, the determinant of a linear operator is the multiplicative gain in signed volume produced by its action.
Using the expression from (3.4) for the multilinear skew symmetric form Φ = Φf, where v = (v1, . . . , vn) and f = (f1, . . . , fn) are bi-orthogonal bases in V and V♯ respectively, yields an explicit, though rather complicated, formula
det(A) = Σ_{b∈Bn} ε(b) ∏_{i=1}^n fi(Avb(i)).   (3.5)
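Formula (3.5) can be evaluated literally for small matrices; the following Python sketch (an illustration, usable only for small n because of the n! terms) sums ε(b) times the product of the entries a[i][b(i)]:

    from itertools import permutations

    def eps(b):
        return 1 if sum(b[i] > b[k] for i in range(len(b))
                        for k in range(i + 1, len(b))) % 2 == 0 else -1

    def det(a):
        n = len(a)
        total = 0
        for b in permutations(range(n)):
            term = eps(b)
            for i in range(n):
                term *= a[i][b[i]]          # entry a_{i, b(i)}
            total += term
        return total

    print(det([[1, 2, 0], [3, 1, 4], [0, 2, 2]]))   # -18, in exact integer arithmetic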
Determinant of Identity
If V is a finite dimensional vector space and I : V → V is the identity operator I(v) ≡ v then the signed volume is not changed by the action of I, and hence det(I) = 1, where 1 is the unit in the field of scalars of V.
Theorem 3.4 Let V be a vector space of dimension n < ∞ over field F. Let A : V → V be a linear function. Then det(A) = 0 if and only if ker(A) ≠ {0}.
Proof. If ker(A) = {0} and (v1, . . . , vn) is a basis in V then (Av1, . . . , Avn) is a basis in V. Hence, for a skew symmetric multilinear form Φ0 : V^n → F which is not identically equal to zero, the values
b = Φ0(v1, . . . , vn),  a = Φ0(Av1, . . . , Avn)
are not zero, and therefore det(A) = a/b ≠ 0.
Conversely, if A is not invertible then every element in the range R(A) is a linear combination of a fixed sequence of n − 1 vectors u = (u1, . . . , un−1). Following the proof of Theorem 3.3, the value of Φ0(Av1, . . . , Avn) is a linear combination of the values of Φ0(w1, . . . , wn) where wi ∈ {u1, . . . , un−1} for every i. Since this means that at least two arguments of Φ0(w1, . . . , wn) are equal, Φ0(w1, . . . , wn) = 0, and hence Φ0(Av1, . . . , Avn) = 0.
Though, formally speaking, Theorem 3.4 can be used to verify feasibility of a linear equation, it is not always a good idea to follow this path. For instance, let us return to the setup of Example 1.14. Checking feasibility of a polynomial interpolation problem
p(ti) = yi  (i = 1, . . . , n),
for a given set of samples (ti, yi), where ti ≠ tk for i ≠ k, is equivalent to checking feasibility of the linear equation Mx = y, where y ∈ IR^n is the column vector with entries yi, and M : IR^n → IR^n is defined by
M[p0; . . . ; pn−1] = [p(t1); . . . ; p(tn)].
One way to solve the problem is by figuring out that the determinant of the matrix
[ 1 t1 . . . t1^{n−1}
  1 t2 . . . t2^{n−1}
  . . .
  1 tn . . . tn^{n−1} ]
of M in the standard basis is not equal to zero. The approach used in Example 1.14, though mathematically equivalent, appears to be more straightforward and concentrated on the fundamentals.
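As an aside, the determinant of the matrix above has the classical closed form ∏_{i<k}(tk − ti) (a known fact about Vandermonde matrices, not derived in these notes), which is non-zero exactly when the ti are distinct; a quick numerical cross-check in Python:

    import numpy as np

    t = np.array([0.5, 1.0, 2.0, 3.5])
    V = np.vander(t, increasing=True)       # rows (1, t_i, t_i^2, t_i^3)
    closed_form = np.prod([t[k] - t[i] for i in range(len(t))
                           for k in range(i + 1, len(t))])
    print(np.linalg.det(V), closed_form)    # both nonzero since the t_i are distinct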
Multiplicativity of Determinant
The following statement is the multiplicative property of the determinant.
Theorem 3.5 Let V be a vector space of dimension n < ∞ over field F. Let A : V → V and B : V → V be two linear functions. Then
det(AB) = det(A) det(B).
Proof. Since AB is the composition of A and B, the total signed volume gain produced by AB
is the product of the gains corresponding to separate actions of B and A.
Example 3.2 In the example from subsection 3.1.2, assume that the piecewise constant function u takes value u(τ) = uk for tk ≤ τ < tk+1, where 0 = t0 < t1 < · · · < tm+1 = t. Then the differential equation in (3.1) can be solved explicitly on each of the intervals [tk, tk+1]:
y(τ) = y(tk) cos(uk^{1/2}(τ − tk)) + ẏ(tk)uk^{−1/2} sin(uk^{1/2}(τ − tk))  (τ ∈ [tk, tk+1]),
so that the matrix M(t) mapping (y(0), ẏ(0)) to (y(t), ẏ(t)) has determinant det(M(t)) = 1. Hence it is impossible to make all components of M(t) converge to zero as t → +∞.
The following theorem states that similar functions have equal determinants.
Theorem 3.6 Let U and V be finite dimensional vector spaces over field F. Let A : U → U, B : V → V, and S : U → V be linear functions such that SA = BS. If S is a bijection then det(A) = det(B).
Proof. Let (u1, . . . , un) be a basis in U, and let ΦV : V^n → F be a multilinear skew symmetric form which is not identically equal to zero. Since S is a bijection, (Su1, . . . , Sun) is a basis in V, and ΦU(w1, . . . , wn) := ΦV(Sw1, . . . , Swn) defines a valid multilinear skew symmetric form on U^n. Hence
det(A) = ΦU(Au1, . . . , Aun) / ΦU(u1, . . . , un)
       = ΦV(SAu1, . . . , SAun) / ΦV(Su1, . . . , Sun)
       = ΦV(BSu1, . . . , BSun) / ΦV(Su1, . . . , Sun)
       = det(B).
Theorem 3.7 Let V be a finite dimensional vector space over field F, and let A : V → V be a linear function. Then det(A♯) = det(A).
In terms of matrices, Theorem 3.7 is the familiar claim that transposition does not change the determinant of a matrix.
Proof. Let n = dim(V). Consider the function Φ : (V♯)^n × V^n → F defined according to the proof of Theorem 3.2:
Φ(f1, . . . , fn, v1, . . . , vn) = Σ_{b∈Bn} ε(b) ∏_{i=1}^n fi(vb(i)).
With the first n arguments fi ∈ V♯ fixed, Φ is a multilinear skew symmetric function on V^n. With the last n arguments fixed, Φ is a multilinear skew symmetric function on (V♯)^n. Since (A♯fi)(v) = fi(Av), evaluating Φ(A♯f1, . . . , A♯fn, v1, . . . , vn) = Φ(f1, . . . , fn, Av1, . . . , Avn) in the two ways yields det(A♯)Φ = det(A)Φ, and hence det(A♯) = det(A).
It is easy to see that, due to the condition V1 ∩ V2 = {0}, the representation of a given v ∈ V in the form v = v1 + v2, where vi ∈ Vi, is unique. Moreover, the functions Π[V1, V2] : V → V1 and Π[V2, V1] : V → V2 defined according to
Π[V1, V2](v1 + v2) = v1,  Π[V2, V1](v1 + v2) = v2  (vi ∈ Vi)
are linear.
When vector spaces U, V have decompositions U = U1 ⊕ U2 and V = V1 ⊕ V2, a linear function A : U → V naturally defines four linear functions A1,1 : U1 → V1, A1,2 : U2 → V1, A2,1 : U1 → V2, and A2,2 : U2 → V2, defined according to
Ai,k uk = Π[Vi, V3−i](Auk)  for uk ∈ Uk.
Theorem 3.8 Let V be a vector space of dimension n < ∞ over field F. Let A : V → V be a linear function.
(a) If the block matrix representation of A with respect to a direct sum decomposition V = V1 ⊕ V2 is given by
A = [ A11 A12; 0 A22 ],
where 0 denotes the linear function 0 : V1 → V2 which maps every element v1 ∈ V1 to zero, then
det(A) = det(A11) det(A22).   (3.6)
(b) Identity (3.6) also holds if the block matrix representation of A with respect to a direct sum decomposition V = V1 ⊕ V2 is given by
A = [ A11 0; A21 A22 ].
Proof. The proofs for (a) and (b) are similar. Let us prove (a).
Let (v1, . . . , vr) and (vr+1, . . . , vn) be bases in V1 and V2 respectively. Let (f1, . . . , fn) be the sequence of functionals from V♯ which is bi-orthogonal to (v1, . . . , vn). Then
fi(Avk) = { fi(A11vk) for i, k ≤ r;  fi(A22vk) for i, k > r;  fi(A12vk) for i ≤ r, k > r;  0 for i > r, k ≤ r. }
Hence only the permutations b ∈ Bn which map {1, . . . , r} into itself contribute to (3.5); every such b is a combination of some c ∈ Br and d ∈ Bn−r, with ε(b) = ε(c)ε(d), so that
det(A) = Σ_{c∈Br, d∈Bn−r} ε(c)ε(d) ∏_{i=1}^r fi(Avc(i)) ∏_{t=1}^{n−r} ft+r(Avd(t)+r)
       = ( Σ_{c∈Br} ε(c) ∏_{i=1}^r fi(A11vc(i)) ) ( Σ_{d∈Bn−r} ε(d) ∏_{t=1}^{n−r} ft+r(A22vd(t)+r) )
       = det(A11) det(A22).
Theorem 3.8 allows one to calculate determinants of matrices associated with row and column operations, as in det T(c) = 1, where T(c) is the n-by-n matrix with units on the main diagonal and a single non-zero off-diagonal element c in the i-th row and k-th column (so that for every n-by-n matrix A the product AT(c) is obtained from A by adding its i-th column, scaled by c, to its k-th column, and T(c)A is obtained from A by adding its k-th row, scaled by c, to its i-th row). Similarly, Theorem 3.8 implies that the determinant det(A) of a diagonal matrix A equals the product of its diagonal entries.
Theorem 3.9 Let U, V be finite dimensional vector spaces over field F. Let A : U → V and B : V → U be linear functions. Then
det(I − AB) = det(I − BA).   (3.7)
Proof. By Theorem 3.8, det [ I 0; −B I ] = 1. Hence
det [ I A; B I ] = det( [ I A; B I ][ I 0; −B I ] ) = det [ I − AB  A; 0  I ] = det(I − AB),
det [ I A; B I ] = det( [ I 0; −B I ][ I A; B I ] ) = det [ I  A; 0  I − BA ] = det(I − BA),
which implies (3.7).
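A quick numerical sanity check of Theorem 3.9 in Python (an illustration with random data), using rectangular A and B so that AB and BA have different sizes:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 5))
    B = rng.standard_normal((5, 3))
    d1 = np.linalg.det(np.eye(3) - A @ B)
    d2 = np.linalg.det(np.eye(5) - B @ A)
    print(np.isclose(d1, d2))   # True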
One of many applications of Theorem 3.9 is a well known explicit formula for solving the linear equation Mx = v with respect to x, sometimes referred to as the Kramers formula.
Theorem 3.10 Let V be a finite dimensional vector space over field F. Let M : V → V be a linear function, f ∈ V♯ a linear functional on V, and v, e ∈ V vectors from V, such that ker(M) = {0} and f(e) = 1. Then
f(M^{−1}v) = det(M − Mef + vf) / det(M).   (3.8)
Proof. By Theorems 3.5 and 3.9,
det(M − Mef + vf)/det(M) = det(I − ef + M^{−1}vf) = det(1F − f(e − M^{−1}v)) = 1 − f(e) + f(M^{−1}v) = f(M^{−1}v).
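In matrix terms, taking e to be a standard basis vector and f the matching coordinate functional (so that f(e) = 1), formula (3.8) extracts a single entry of M^{−1}v; a numerical check in Python (an illustration with random data):

    import numpy as np

    rng = np.random.default_rng(1)
    n, i = 4, 2
    M = rng.standard_normal((n, n))
    v = rng.standard_normal(n)
    e = np.eye(n)[:, i]                           # column vector e; f = e'
    N = M - np.outer(M @ e, e) + np.outer(v, e)   # M - M e f + v f
    print(np.linalg.det(N) / np.linalg.det(M),    # the Kramers ratio ...
          np.linalg.solve(M, v)[i])               # ... equals (M^{-1} v)_i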
Chapter 4
Characteristic Polynomials
This chapter uses characteristic polynomials and invariant subspaces to study the prop-
erties associated with iterations of linear operators: a topic highly relevant when working
with linear time invariant (LTI) dynamical systems.
Indeed, arranging the pairs of consecutive Fibonacci numbers into vectors yields (4.1), where
x(t) = [ y(t+1); y(t) ],  x0 = [ 1; 0 ],  A = [ 1 1; 1 0 ].
Studying the characteristic polynomial χA(s) = det(sI − A) of A (in particular, its roots) yields an explicit formula for y(t), as well as a conclusion about its asymptotic
behavior:
χA(s) = det [ s−1 −1; −1 s ] = s² − s − 1
has real roots
λ = (1 + √5)/2,  μ = (1 − √5)/2,
and, accordingly, the Fibonacci number y(t) is a linear combination of λ^t and μ^t:
y(t) = (λ^t − μ^t)/√5,
so that y(t) converges to +∞ and y(t + 1)/y(t) converges to λ as t → +∞.
It is instructive to do the analysis of the Fibonacci recurrence in fields of finite characteristic. For example, in F5 the characteristic polynomial of A has a double root s = 3, which is certified by the identity
s² − s − 1 = (s − 3)²  (in F5).
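Both descriptions of y(t) are easy to cross-check; the following Python sketch (my own illustration) compares the recursion with the root formula and verifies the double root in F5:

    # the recursion x(t+1) = A x(t): carry the pair (y(t+1), y(t))
    a, b = 1, 0
    seq = []
    for _ in range(10):
        seq.append(b)
        a, b = a + b, a
    print(seq)                                       # 0, 1, 1, 2, 3, 5, 8, 13, 21, 34

    lam, mu = (1 + 5 ** 0.5) / 2, (1 - 5 ** 0.5) / 2
    print([round((lam ** t - mu ** t) / 5 ** 0.5) for t in range(10)])   # same numbers

    print(all((s * s - s - 1) % 5 == ((s - 3) ** 2) % 5 for s in range(5)))  # (s-3)^2 in F5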
x(t + 1) = Ax(t) + Bu(t),  y(t) = Cx(t) + Du(t),   (4.2)
where x(t), u(t), y(t) are elements of vector spaces X, U, Y over the same field F, and A : X → X, B : U → X, C : X → Y, D : U → Y are fixed linear functions.
The typical understanding of (4.2) is that it defines a linear system which transforms an
input time series u = (u(0), u(1), . . . ) into output sequence y = (y(0), y(1), . . . ), using the
vector x0 = x(0) of initial conditions as an additional input parameter. In the classical
applications of LTI systems the field F = IR of real numbers is used to describe (some-
times approximately) dynamical relations between time samples of physical parameters,
represented, naturally, by real numbers. However, LTI state space models over finite fields
extend linear system methods to finite automata, and can be used to simplify analysis of
discrete algorithms in communications and control.
Consider, for example, the issue of reachability and observability for the model in
(4.2).
Definition 4.1 System (4.2) is called reachable if for every x̄ ∈ X there exists a positive integer T and a sequence u(0), u(1), . . . of vectors u(t) ∈ U such that the solution of (4.2) with x(0) = 0 satisfies x(T) = x̄.
Naturally, reachability describes the ability of a control action u = u(t) to steer the state x = x(t) from a zero initial position to an arbitrary given position x̄ ∈ X, given enough time. Moreover, due to the linearity of equations (4.2), reachability also means the possibility to steer from any given initial position x0 ∈ X to an arbitrary terminal position x̄ ∈ X.
The question of reachability can be answered in terms of invariant subspaces. Specifically, system (4.2) is not reachable if and only if there exists a subspace X0 ⊂ X which is proper (i.e. X0 ≠ X), contains the range of B (i.e. R(B) ⊂ X0), and is A-invariant, in the sense that AX0 is contained in X0.
For example, consider the input-output recursive equation
y(t + 1) + y(t − 1) = u(t) + 2u(t − 1),   (4.3)
which can be represented in the form (4.2) with
x(t) = [ 2u(t−1) − y(t−1); y(t) ],  A = [ 0 −1; 1 0 ],  B = [ 2; 1 ],  C = [ 0 1 ],  D = 0.
When F = IR is the field of real numbers (i.e. when y(t) and u(t) are real), system (4.2) is reachable. This can be seen by noticing that the vectors
B = [ 2; 1 ],  AB = [ −1; 2 ]
span X = IR², and hence IR² is the only A-invariant subspace which contains the range of B. Interestingly, the reachability property does not necessarily extend to the interpretations of (4.3) in other fields F. For example, with F = F5, since
AB = [ −1; 2 ] = 2 [ 2; 1 ] = 2B  (in F5),
the span of B is a proper A-invariant subspace containing R(B), and hence the system is not reachable.
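A quick Python check of this computation (an illustration using the matrices above):

    import numpy as np

    A = np.array([[0, -1], [1, 0]])
    B = np.array([2, 1])
    R = np.column_stack([B, A @ B])        # reachability matrix [B, AB]
    print(np.linalg.det(R))                # 5.0: nonzero over IR
    print(round(np.linalg.det(R)) % 5)     # 0: singular over F_5
    print((A @ B) % 5, (2 * B) % 5)        # AB = 2B in F_5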
Theorem 4.1 For an integer n > 1 and a field F let (a0, . . . , an−1) be a sequence of elements of F. If v = (v1, . . . , vn) is a basis in a vector space V over F then the (uniquely defined) linear operator A such that
Avk = vk+1  (k = 1, . . . , n − 1),  Avn = −a0v1 − a1v2 − · · · − an−1vn
has characteristic polynomial
det(sI − A) = s^n + an−1s^{n−1} + · · · + a1s + a0.
Proof. It is sufficient to prove that for a multilinear skew symmetric form Φ : V^n → F such that Φ(v1, . . . , vn) = 1 the identity
Φ(sv1 − v2, sv2 − v3, . . . , svn−1 − vn, svn + a0v1 + · · · + an−1vn) = s^n + an−1s^{n−1} + · · · + a1s + a0   (4.4)
holds. For n = 2, direct expansion gives
Φ(sv1 − v2, sv2 + a0v1 + a1v2) = (s² + a1s + a0)Φ(v1, v2) = s² + a1s + a0.
Assume that (4.4) is valid for all n ≤ k, where k ≥ 2. Then, for n = k + 1, expanding with respect to the first argument yields
Φ(sv1 − v2, . . . ) = sΦ(v1, sv2 − v3, . . . ) − Φ(v2, sv2 − v3, . . . ).   (4.5)
Since
(u1, . . . , uk) ↦ Φ(v1, u1, . . . , uk)
is multilinear and skew symmetric, the inductive assumption can be applied to the first term in (4.5), to show that it equals s(s^{n−1} + an−1s^{n−2} + · · · + a1). In the second term the components proportional to v2, . . . , vn can be dropped one by one by skew symmetry, leaving
−Φ(v2, sv2 − v3, . . . , svn + a0v1 + · · · + an−1vn) = Φ(v2, v3, . . . , vn, a0v1) = a0,
and summing the two terms proves (4.4) for n = k + 1.
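Theorem 4.1 is easy to test numerically; the sketch below (an illustration with arbitrarily chosen coefficients) builds the companion matrix in the standard basis and reads off its characteristic polynomial:

    import numpy as np

    a = [6.0, -5.0, 2.0]                   # (a_0, a_1, a_2), so n = 3
    n = len(a)
    A = np.zeros((n, n))
    A[1:, :-1] = np.eye(n - 1)             # A v_k = v_{k+1} for k < n
    A[:, -1] = [-c for c in a]             # A v_n = -a_0 v_1 - ... - a_{n-1} v_n
    print(np.poly(A))                      # [1, 2, -5, 6]: det(sI - A) = s^3 + 2s^2 - 5s + 6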
Therefore, if a linear operator has a non-trivial invariant subspace, its characteristic poly-
nomial is reducible, i.e. can be factored into a product of two polynomials of positive
degree. Moreover, one of the two polynomials is the characteristic polynomial of the
restriction of A to the invariant subspace V1 .
Example 4.1 If a polynomial p = p(x) of degree m with coefficients from field F has a root a ∈ F, i.e. p(a) = 0, then p can be factored according to p(x) = (x − a)q(x), where q = q(x) is a polynomial of degree m − 1 with coefficients from F (this can be shown by performing long polynomial division of p(x) by x − a). Therefore a polynomial p = p(x) with complex coefficients is irreducible over field C if and only if it has the form p(x) = p1x + p0 with p1 ≠ 0. Similarly, a polynomial with real coefficients is irreducible over IR if p(x) = p2x² + p1x + p0 where either p2 = 0 ≠ p1 or 4p2p0 > p1².
The following theorem claims that every linear operator A on a finite dimensional vec-
tor space V has a non-zero invariant subspace V1 such that the characteristic polynomial
of the restriction A11 of A to V1 is irreducible.
Proof. For every v ∈ V, v ≠ 0, let ν = ν(v) be the minimal positive integer k for which A^k v is a linear combination of the vectors v, Av, . . . , A^{k−1}v. Since dim(V) = n, it follows that ν(v) ≤ n for all v ∈ V.
Let m = ν(u), u ≠ 0, be the minimal value of ν, i.e. the vectors u, Au, . . . , A^{m−1}u are linearly independent, and
A^m u = cm−1A^{m−1}u + · · · + c1Au + c0u
for some ci ∈ F. Equivalently,
p(A)u = (A^m + pm−1A^{m−1} + · · · + p1A + p0I)u = 0
for
p(x) = x^m + pm−1x^{m−1} + · · · + p1x + p0,  pi = −ci ∈ F.
Let us show that the polynomial p = p(x) must be irreducible over F. Indeed, if p = q1q2 where both polynomials q1, q2 have degree less than m then
q1(A)q2(A)u = p(A)u = 0.
Then either q2(A)u = 0, which means ν(u) < m, or q1(A)w = 0 for w = q2(A)u ≠ 0, which means ν(w) < m, with both equalities contradicting the minimality of m = ν(u).
To complete the proof, note that the vectors u, Au, . . . , A^{m−1}u form a basis in an A-invariant subspace V1 ⊂ V, and, according to Theorem 4.1, p(s) = det(sI − A11) is the characteristic polynomial of the restriction A11 of A to V1.
It turns out that the identity χA(A) = 0 holds for every linear operator A : V → V.
Proof. For every v ∈ V, v ≠ 0, let k be the smallest positive integer such that A^k v is a linear combination of v, Av, . . . , A^{k−1}v. As was shown in the proof of Theorem 4.2, this means p(A)v = 0 for a polynomial p of degree k, and the vectors v, Av, . . . , A^{k−1}v form a basis in an A-invariant subspace V1, such that p(s) = det(sI − A11) is the characteristic polynomial of the restriction A11 of A to V1. Hence χA(s) = q(s)p(s) for some polynomial q, and
χA(A)v = q(A)p(A)v = 0.
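A numerical spot check of the Cayley-Hamilton identity in Python (an illustration with random data; np.poly returns the coefficients of det(sI − A)):

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.standard_normal((4, 4))
    c = np.poly(A)                          # coefficients of det(sI - A), degree 4
    chi_A = sum(ck * np.linalg.matrix_power(A, 4 - k) for k, ck in enumerate(c))
    print(np.allclose(chi_A, 0))            # True up to rounding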
U = U1 ⊕ U2 ⊕ · · · ⊕ Um
If the characteristic polynomial of M22 is irreducible, set m = 2, M2,2 = M22 to finish the proof. Otherwise, applying Theorem 4.2 to M22 represents it in the block upper triangular form
M = [ M1,1 M12 M13
       0   M2,2 M23
       0    0   M33 ],
When F = C is the field of complex numbers, the dimensions of the spaces Uk must be equal to 1 for irreducibility, and hence the Mi,k are just scalars, which makes (4.7) an upper triangular form called the complex Schur decomposition. The Schur decomposition of complex matrices is an important operation, because it not only calculates the eigenvalues Mk,k of matrix M, but can be performed efficiently and in a numerically stable fashion. More specifically, in terms of matrix representations, it is always possible to find a complex Schur decomposition MS = TMT^{−1} of a matrix M for which the coordinate transformation T is unitary, i.e. preserves Hermitian length of vectors (to be discussed in the next chapter).
When F = IR, the dimensions of the spaces Uk must be equal to 1 or 2, and the corresponding upper triangular block representation is called the real Schur decomposition.
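For concreteness, a minimal illustration using SciPy's schur routine (an external library call, not part of these notes); it returns the triangular factor T and a unitary transformation Z with M = ZTZ*:

    import numpy as np
    from scipy.linalg import schur

    rng = np.random.default_rng(3)
    M = rng.standard_normal((4, 4))
    T, Z = schur(M, output='complex')       # M = Z T Z*, Z unitary
    print(np.allclose(Z @ T @ Z.conj().T, M),    # decomposition reproduces M
          np.allclose(np.tril(T, -1), 0))        # T is upper triangular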
Chapter 5
Quadratic Forms
and Scalar Products
The concepts discussed in this chapter apply to real vector spaces, which also allows for
an easy generalization to complex vector spaces, as every vector space over an extension
field of IR is naturally a vector space over IR as well. The main use of quadratic forms in
most applications is to define and to compare lengths of non-parallel vectors.
This chapter introduces bilinear functionals and scalar products, positive definiteness,
Cauchy-Schwarz inequality, Gramm-Schmidt orthogonalization, and other related math-
ematical constructions used to work with uncertainty of functional models, second order
statistics of random variables, elementary geometry proofs, and many other applications.
5.1 Motivation
Quadratic forms are defined by pushing the notion of linearity just a bit further.
While the definition can be easily generalized to introduce forms of every positive
integer power (cubic, fourth order, etc.), quadratic forms are remarkably easier to handle.
For example, one of the most commonly checked properties of a quadratic form σ : V → IR on a vector space V over the field F = IR of real numbers is its positive definiteness. While positive definiteness of a quadratic form on a finite dimensional real vector space can be verified easily, assessment of this property for forms of any other even order remains a much harder task.
Then, out of the three real numbers (v3, u), (v4, u), and (v5, u), at least two have the same sign (i.e. are either both non-negative or both non-positive). Assume, without loss of generality, that these two vectors are v3 and v4. If (v3, u) ≥ 0 and (v4, u) ≥ 0 then the hemisphere defined by (v, u) ≥ 0 contains v1, . . . , v4. Otherwise, the hemisphere defined by (v, u) ≤ 0 contains v1, . . . , v4.
or, equivalently,
x(t + 1) = Ax(t),  x(t) = [ y(t+1); y(t) ],  A = [ 2a −a²; 1 0 ].
Roughly speaking, the recursive equation (5.1) is called asymptotically stable if every
solution y = y(t) of (5.1) converges to zero as t +. Assessment of asymptotic
stability of recursive equations is a fundamental question in control systems.
Since the characteristic polynomial χ(s) = (s − a)² of A has a double root at s = a, every solution of (5.1) has the form y(t) = (c1 + c2t)a^t,
as, due to the presence of nonlinearity, it is impossible to produce a useful explicit formula
for y(t) as a function of t. While a complete stability analysis of (5.2) for all possible values
of a is difficult, playing with quadratic forms allows one to certify asymptotic stability or
lack thereof for some values of a.
First, let us show how asymptotic stability of (5.1) can be analyzed without referring to an explicit formula for its solutions. Consider the function σ1 : IR² → IR defined by
σ1([x1; x2]) = (x1 − ax2)² + rx2²,
where r > 0 is a parameter. Note that σ1 is a quadratic form on the real vector space V = IR². Indeed, σ1(v) = b1(v, v), where b1 : V × V → IR is the bilinear function defined by
b1([u1; u2], [v1; v2]) = (u1 − au2)(v1 − av2) + ru2v2.
For a fixed value of parameter λ, the function σ2 mapping v ∈ V to λσ1(v) − σ1(Av) ∈ IR is a quadratic form as well, because σ2(v) = b2(v, v) for the bilinear function b2(u, v) = λb1(u, v) − b1(Au, Av). It is easy to check that, for |a| < 1, there exist λ ∈ (0, 1) and r > 0 such that σ2 takes only non-negative values, so that σ1(x(t + 1)) ≤ λσ1(x(t)) for every t = 0, 1, 2, . . . for every solution of (5.1). Since σ1 itself takes non-negative values, this means that
y(t)² ≤ σ1(x(t)) ≤ λ^t σ1(x(0))
converges to zero as t → +∞.
Similarly, it is easy to establish that, for |a| > 1, there always exist λ > 1 and r > 0 such that −σ2 is positive definite. This means that σ1(x(t)) ≥ λ^t σ1(x(0)) > 0 for every non-zero solution of (5.1), which rules out asymptotic stability.
While the relation between w(t) and y(t) cannot be described by quadratic inequalities exactly, there are certain quadratic inequalities which are implied by the relation between w(t) and y(t). In particular, for every δ > 0 there exists ε = ε(δ) > 0 such that the inequality
σ(x(t), w(t)) ≥ 0
Using a standard criterion of positive definiteness for finite dimensional quadratic forms, it is easy to establish that for every a ∈ IR such that |a| > 1 there exist λ > 1, ε > 0, and τ ≥ 0 such that the corresponding quadratic form
is positive definite. This, in turn, can be used to prove that recursion (5.2) is not asymptotically stable when |a| > 1. Indeed, otherwise there would exist a solution y = y(t) of (5.2) which takes an infinite number of non-zero values, and converges to zero as t → +∞. Since the inequality |y(t)| ≤ δ would be valid for sufficiently large t, it follows that σ̃(x(t), w(t)) ≥ 0, and hence σ1(x(t + 1)) ≥ λσ1(x(t)) for sufficiently large t, which contradicts the assumption that y(t) → 0 while y(t) is not identically equal to zero.
The approach can also be used to establish (global) asymptotic stability for some values a ∈ (−1, 1). Indeed, since
0 ≤ y sin(y) ≤ hy²  ∀ y : |y| ≤ y0,   (5.3)
the corresponding function σ3 is a quadratic form on IR² × IR. If a ∈ IR is such that values of λ ∈ (0, 1) and τ ≥ 0 can be found making the quadratic form
positive definite then (5.2) is asymptotically stable. Indeed, since σ3(x(t), w(t)) ≥ 0 for all solutions of (5.2), positive definiteness of σ4 for some τ ≥ 0 implies that σ1(x(t + 1)) ≤ λσ1(x(t)) for all t.
are well defined and can be handled using the standard rules of integration concerning
linearity, inequalities, convergence, etc. Such functions v are referred to as (scalar) random
variables, and the quantity E[v] is called the expected value of v. While a complete proper
treatment of random variables requires a solid foundation in the theory of integration
(something not assumed in these lectures), certain aspects of the framework, typically
associated with treating linear combinations of a fixed finite family of random variables,
fit nicely into the subject of elementary linear algebra.
Consider the set V of all scalar real random variables v for which E[|v|²] < ∞, agreeing for convenience to consider two random variables u, v ∈ V equal when E[|v − u|²] = 0. With the natural operations of addition and scaling, V becomes a real vector space. Moreover, the bilinear form
b(u, v) = E[uv]
defines a positive definite quadratic form
σE(v) = E[|v|²].
A number of basic optimization questions in signal processing and control can be formu-
lated in terms of quadratic forms and random variables from V .
A classical example is a parameter estimation setup, in which the value of a real parameter a ∈ IR is to be estimated based on a finite number of uncorrelated noisy measurements. The measurements are represented by random variables yi (i = 1, . . . , n), related to a by the equations yi = a + wi, where wi are random variables with zero mean E[wi] = 0 and unit variance E[wi²] = 1. The assumption that wi are uncorrelated is represented by the equalities E[wiwk] = 0, to be satisfied for all pairs (i, k) with i ≠ k.
A typical objective is to find the sequence of coefficients c = (ci)_{i=1}^n of a linear estimator
â = c1y1 + · · · + cnyn,
such that the variance E[|â − a|²] of the estimation error â − a does not depend on parameter a ∈ IR, and, subject to this constraint, is minimal.
It turns out that the set U of all estimator coefficient sequences c = (ci)_{i=1}^n for which E[|â − a|²] does not depend on a ∈ IR is a linear subspace in IR^n. Moreover, the functional c ↦ E[|â − a|²] is a quadratic functional on U, in the sense that it is a sum of a quadratic form, a linear function, and a constant. The corresponding quadratic form is positive definite, which guarantees existence and uniqueness of the optimal estimator, which can be calculated easily using the basic properties of quadratic forms.
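For this particular setup the optimum is the sample mean; a minimal sketch (my own illustration, using the standard facts that independence from a forces c1 + · · · + cn = 1 and that the variance of the error is then c1² + · · · + cn²) compares a few admissible coefficient sequences:

    import numpy as np

    n = 5
    candidates = [np.full(n, 1 / n),                 # the sample mean, c_i = 1/n
                  np.array([0.5, 0.5, 0, 0, 0]),
                  np.eye(n)[0]]                      # use the first measurement only
    for c in candidates:
        print(c.sum(), (c ** 2).sum())   # constraint value 1; variance smallest at 1/n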
(u, v)σ := (σ(u + v) − σ(u − v))/4
defines a symmetric (in the sense that (u, v)σ = (v, u)σ) bilinear form (·, ·)σ : V × V → IR. Indeed, if b : V × V → IR is the bilinear form from the definition of σ(v) = b(v, v) then
(σ(u + v) − σ(u − v))/4 = (b(u, v) + b(v, u))/2.
Proof. Fix u, v ∈ V and consider the function f : IR → IR defined by f(t) = σ(u + tv). Since σ(v) = (v, v)σ for some bilinear symmetric form (·, ·)σ, f(·) is a quadratic function of t.
|u − v|σ ≤ |u|σ + |v|σ.
In particular, the triangle inequality states that, for a positive definite quadratic form σ, the quantity ρ(u, v) = |u − v|σ can serve as a measure of the distance between vectors u and v.
v = c1v1 + · · · + cnvn ∈ V
the value σ(v) is uniquely defined by the matrix Q and by the coefficients ci ∈ IR according to
σ(v) = σ(c1v1 + · · · + cnvn) = (c1v1 + · · · + cnvn, c1v1 + · · · + cnvn)σ = Σ_{i,k=1}^n cick(vi, vk)σ.
In particular, when V = IR^n, and Q is the matrix of a quadratic form σ in the standard basis, the values of σ(v) can be calculated according to the matrix algebra formula
σ(v) = v′Qv,
where v ∈ IR^n is an arbitrary n-by-1 matrix, and the prime ′ means transposition, so that v′ is a 1-by-n matrix, and, accordingly, the product v′Qv is a 1-by-1 matrix, i.e. a scalar v′Qv ∈ IR.
The following statement describes how the matrix of a quadratic form changes when
the basis with respect to which it is computed is replaced by another one.
Su = SvMuv,
where Muv is interpreted as a linear function Muv : IR^n → IR^n. By the definition of the matrix of a quadratic form,
(Svx, Svy)σ = x′Qvy,  (Sux, Suy)σ = x′Quy.
Hence
x′Quy = (SvMuvx, SvMuvy)σ = x′M′uvQvMuvy.
Since the equality holds for every pair of vectors x, y ∈ IR^n (in particular, for every pair of the standard basis vectors), Qu = M′uvQvMuv.
One useful implication of the coordinate change formula is that the sign of the determinant of the matrix of a quadratic form does not depend on the basis. Indeed, since Muv is invertible, det(Muv) ≠ 0, and hence the signs of det(Qv) and
det(Qu) = det(M′uvQvMuv) = det(M′uv) det(Qv) det(Muv) = det(Qv) det(Muv)²
coincide.
For v = c1v1 + · · · + cnvn ≠ 0 expanded in a basis orthonormal with respect to σ,
σ(v) = σ(c1v1 + · · · + cnvn) = c1² + · · · + cn² > 0,
The proof of Theorem 5.3 is constructive, and provides a very efficient way of checking positive definiteness of quadratic forms, usually referred to as Gram-Schmidt orthogonalization in a theoretical context, and as Choleski decomposition in a numerical linear algebra context.
Proof. Let us prove by induction with respect to k = 1, 2, . . . , n that there exists a set of vectors v1, . . . , vk such that ui is a linear combination of v1, . . . , vi for every i = 1, . . . , k, and (vi, vj)σ = δij for all i, j ∈ {1, . . . , k}.
For k = 1, setting v1 = σ(u1)^{−1/2}u1 (which is possible because by assumption σ(u1) > 0) proves the base of induction. Assume now that the statement is proven for all k ≤ m. Then for k = m + 1 let v1, . . . , vm be the orthonormal vectors existing according to the inductive assumption. Define
vm+1 = σ(em+1)^{−1/2}em+1,  where em+1 = um+1 − Σ_{i=1}^m (vi, um+1)σvi.
Then
(em+1, vj)σ = (um+1, vj)σ − (vj, um+1)σ = 0
for j = 1, . . . , m. Since the vectors ui form a basis, em+1 ≠ 0, and hence σ(em+1) > 0, which makes vm+1 well defined. By construction, σ(vm+1) = 1 and (vm+1, vi)σ = 0 for i ≤ m.
The inductive proof of Theorem 5.3 translates into a recursive algorithm in numerical calculations. Since the matrix M of the linear function A : V → V, defined by Avi = ui, is upper triangular in the basis (v1, . . . , vn), the original matrix Q of σ in the basis (u1, . . . , un) is represented in the form Q = M′M. Such a representation is called a Choleski factorization, and producing M for a given symmetric Q is one of the basic algorithms of numerical linear algebra.
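A minimal Python sketch of this construction (my own illustration, assuming a concrete positive definite Q): Gram-Schmidt applied to the standard basis with the scalar product (u, v) = u′Qv produces an upper triangular M with Q = M′M.

    import numpy as np

    Q = np.array([[4.0, 2.0, 0.0],
                  [2.0, 5.0, 2.0],
                  [0.0, 2.0, 5.0]])        # symmetric positive definite
    n = len(Q)
    V = np.zeros((n, n))                   # columns: orthonormal vectors v_1, ..., v_n
    for k in range(n):
        e = np.eye(n)[:, k]
        for i in range(k):                 # subtract the (v_i, u_k) v_i components
            e = e - (V[:, i] @ Q @ e) * V[:, i]
        V[:, k] = e / np.sqrt(e @ Q @ e)   # normalization: sigma(e) > 0 by definiteness
    M = np.linalg.inv(V)                   # matrix of A v_k = u_k, upper triangular
    print(np.allclose(M.T @ M, Q),         # the Choleski factorization Q = M'M
          np.allclose(np.tril(M, -1), 0))  # M is indeed upper triangular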
Proof. To prove the implication (a) ⇒ (b), note that the matrix Q of a positive definite quadratic form has a Choleski factorization Q = M′M where M is the matrix of an invertible linear operator, and hence has a non-zero determinant. Hence det(Q) = det(M)² > 0.
For n = 1 the statement is trivial. Assume the implication (b) ⇒ (a) is valid for all n ≤ m. Then for n = m + 1 consider the subspace Vm of V spanned by the first m = n − 1 vectors from the basis b. According to the inductive assumption, the restriction σm of σ to Vm is positive definite. Let u1, . . . , um be an orthonormal basis for σm. In the basis (u1, . . . , um, vm+1) the matrix of σ has the block form
Qu = [ Im h; h′ a ],
where h is an m-by-1 real vector, and a ∈ IR is a scalar.
Since the sign of the determinant of a matrix of a quadratic form does not depend on the basis,
are non-negative (they are equal to zero), but the corresponding quadratic form is not
positive semidefinite.
Chapter 6
Linear-Quadratic Optimization
This chapter presents basic properties and applications of linear quadratic (LQ) opti-
mization: existence, uniqueness, general properties, and explicit formulae for the optimal
value and optimal vector, constructions associated with Hilbert spaces, relation to optimal
control and optimal estimation.
6.1 Motivation
Maximization or minimization of a quadratic functional over a real vector space is per-
haps the simplest meaningful optimization setup imaginable. When well-posed, it can be
reduced to solving a linear equation. Quadratic optimization has a number of appealing
theoretical properties, and is extensively used in algorithmic and theoretical subjects,
such as optimal feedback control, parameter estimation, and signal processing, convex
and non-convex optimization, Hilbert space constructions, etc.
A general linear quadratic (LQ) optimization setup calls for maximization (or minimization) of a quadratic functional Φ : V → IR on a real vector space V:
Φ(v) = 2f(v) − σ(v) → sup,  v ∈ V,   (6.1)
Another common setup which can be reduced to (6.1) in a straightforward way is the minimization
σ(v) → inf,  v ∈ v0 + U := {v0 + u : u ∈ U},   (6.3)
of a quadratic form over the affine subspace v0 + U of V, where U is a fixed linear subspace of V, and v0 ∈ V is a given vector.
Here the boundary conditions imposed on y reflect an assumption of zero initial velocity, a requirement of zero terminal velocity, and the desire to change the coordinate by a unit of length. Most importantly, the definition of Φ(·) means that the cost is proportional to the square of the instantaneous acceleration.
One possible selection of y = y(t) is
y0(t) = { 2t², t ∈ [0, 0.5];  4t − 1 − 2t², t ∈ [0.5, 1]. }   (6.5)
is zero for all functions u ∈ V satisfying (6.6). Assuming for now that y is four times continuously differentiable, integration by parts transforms this into
∫₀¹ y⁽⁴⁾(t)u(t)dt = 0,
which means that the fourth derivative y⁽⁴⁾ of y must be identically equal to zero. Hence
y∗(t) = c0 + c1t + c2t² + c3t³
is a polynomial of degree less than four. Using the boundary conditions yields c0 = c1 = 0, c2 = 3 and c3 = −2, i.e.
y∗(t) = 3t² − 2t³
is a minimizer in (6.4).
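A quick numerical comparison of the two candidates (the cost values 12 and 16 below are computed here, not quoted from the notes):

    import numpy as np

    N = 200000
    t = (np.arange(N) + 0.5) / N                 # midpoint grid on [0, 1]
    ydd_star = 6 - 12 * t                        # second derivative of 3t^2 - 2t^3
    ydd_0 = np.where(t <= 0.5, 4.0, -4.0)        # second derivative of y_0 from (6.5)
    print(np.mean(ydd_star ** 2),                # approx 12.0: cost of y_*
          np.mean(ydd_0 ** 2))                   # approx 16.0: cost of y_0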
Is this y∗ the only minimizer? Since the formula for y∗ was derived from an assumption that y∗ is four times continuously differentiable, the issue has to be addressed in one way or another. It turns out, however, that a simple abstract argument yields uniqueness of the optimum in (6.4): since the restriction of σ to the subspace U of V is positive definite, the argument of minimum must be unique. Indeed, if v1 and v2 are two such minimizers then (v1, u)σ = 0 and (v2, u)σ = 0 for all u ∈ U. In particular, with u = v1 − v2 ∈ U, this implies σ(u) = (u, u)σ = 0, i.e. u = 0.
xt+1 = Atxt + Btwt,  yt = Ctxt + Dtwt,  x0 = x̄0 + e0,   (6.8)
where wt, e0, x̄0 are random variables (wt takes values in IR^{m(t)} while x̄0 and e0 take values in IR^{n(0)}) such that, in particular, the quadratic form
f ↦ f′Q0f := E[|f′e0|²] ≥ 0
is positive semidefinite.
It is assumed that the variables yt and x̄0 represent the measured data, and the task is to design two linear estimators of xt: one uses the measurements yτ with 0 ≤ τ < t, together with x̄0, and produces a before measurement yt estimate
x̂t = Gtx̄0 + Σ_{τ=0}^{t−1} Gt,τ yτ   (6.9)
of xt; the other uses the measurements yτ with 0 ≤ τ ≤ t, together with x̄0, and produces an after measurement yt estimate
x̂t⁺ = Ltx̄0 + Σ_{τ=0}^{t} Lt,τ yτ   (6.10)
of xt. The sequences of error variances
εt(Gt) = E[|x̂t − xt|²],  εt⁺(Lt) = E[|x̂t⁺ − xt|²]
are to be minimized.
σ(ξ) = E[|ξ|²],  (ξ, η)σ = E[ξ′η].
Let Ut and Ut⁺ be the linear subspaces of V consisting of all linear combinations (6.9) and (6.10) respectively (by construction, Ut ⊂ Ut⁺). Then the task of optimizing Gt is equivalent to the linear-quadratic problem of minimizing σ on the affine subspace xt + Ut of Vt, and optimization of Lt is equivalent to the linear-quadratic minimization of σ on the affine subspace xt + Ut⁺.
The orthogonality (projection) principle can be used to derive a nice recursive expression for computing the optimal x̂t and x̂t⁺ (essentially, the famous Kalman filter).
Recall that an estimate x̂t ∈ Ut is optimal if and only if
(x̂t − xt, u)σ = E[u′(x̂t − xt)] = 0
for all u ∈ Ut. Since, in particular, every random variable of the form Θyτ, where τ < t and Θ is an n(t)-by-k(τ) matrix with real coefficients, belongs to Ut, we conclude that the matrix equality E[etyτ′] = 0 must be true for et = x̂t − xt for every τ < t. Similarly, the matrix expected value E[etx̄0′] must be zero for the optimal estimate x̂t. To summarize, a linear estimate x̂t is optimal if and only if
E[(x̂t − xt)x̄0′] = 0,  E[(x̂t − xt)yτ′] = 0  ∀ τ < t,   (6.11)
and, similarly, x̂t⁺ is optimal if and only if
E[(x̂t⁺ − xt)x̄0′] = 0,  E[(x̂t⁺ − xt)yτ′] = 0  ∀ τ ≤ t.   (6.12)
Since (6.11) and (6.12) represent conditions of optimality which are both necessary and sufficient, it is possible to use them to check correctness of optimal estimation guesses. For example, since, by assumption, E[e0x̄0′] = 0, it follows that
x̂0 = x̄0   (6.13)
and
x̂t+1 = Atx̂t⁺.   (6.14)
Indeed, since x̂t⁺ satisfies (6.12), and, due to the assumptions made about wt,
E[(xt+1 − Atx̂t⁺)yτ′] = AtE[(xt − x̂t⁺)yτ′] + BtE[wtyτ′] = 0
for τ ≤ t, as well as
E[(xt+1 − Atx̂t⁺)x̄0′] = AtE[(xt − x̂t⁺)x̄0′] + BtE[wtx̄0′] = 0,
Atx̂t⁺ satisfies the orthogonality principle, and hence is a correct expression for x̂t+1.
A similar, though slightly more involved, derivation shows that
x̂t⁺ = x̂t + Ht(yt − Ctx̂t),   (6.15)
where Ht is any matrix satisfying
Ht(DtDt′ + CtQtCt′) = QtCt′,   (6.16)
and
Qt := E[etet′] = E[(x̂t − xt)(x̂t − xt)′].
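A hedged sketch of the resulting recursion (6.13)-(6.16) for a scalar, time-invariant special case of (6.8) (all concrete numbers below are illustrative; the error-variance update is derived from et+1 = A(1 − HC)et + (AHD − B)wt, which follows from the formulas above):

    import numpy as np

    A, B, C, D = 0.9, 1.0, 1.0, 0.5        # scalar system: x' = Ax + Bw, y = Cx + Dw
    rng = np.random.default_rng(4)
    x = 1.0 + rng.standard_normal()        # x_0 = xbar_0 + e_0, with E[e_0^2] = 1
    x_hat, Q = 1.0, 1.0                    # (6.13): x_hat_0 = xbar_0; Q_0 = E[e_0^2]
    for t in range(50):
        w = rng.standard_normal()
        y = C * x + D * w
        H = Q * C / (D * D + C * Q * C)            # (6.16): H(DD' + CQC') = QC'
        x_plus = x_hat + H * (y - C * x_hat)       # (6.15): after-measurement estimate
        x = A * x + B * w                          # true state propagation
        x_hat = A * x_plus                         # (6.14): before-measurement estimate
        Q = (A * (1 - H * C)) ** 2 * Q + (A * H * D - B) ** 2   # variance of e_{t+1}
    print(abs(x - x_hat), Q ** 0.5)        # realized error vs. predicted std deviation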
There are two different statements to prove here: first, the existence of a matrix Ht
satisfying (6.16), and, second, the orthogonality relations (6.12).
Once (6.16) is established, the orthogonality follows by inspection, since, according to (6.15),
et⁺ := x̂t⁺ − xt = et + HtDtwt − HtCtet,
and hence
E[et⁺yt′] = E[(et + HtDtwt − HtCtet)(Dtwt − Ctet + Ctx̂t)′]
         = E[(et + HtDtwt − HtCtet)(wt′Dt′ − et′Ct′ + x̂t′Ct′)]
         = Ht(DtDt′ + CtQtCt′) − QtCt′
         = 0,
where the last equality is (6.16).
To establish the feasibility of (6.16), we will use the following frequently used observation.
To continue proving the feasibility of (6.16), note that, for every f in the null space of DtDt′ + CtQtCt′,
0 = f′(DtDt′ + CtQtCt′)f = |Dt′f|² + (Ct′f)′Qt(Ct′f),
and hence QtCt′f = 0. Now, since
for every v V . In particular, the supremums in (6.1) and (6.2) are equal.
Proof. When (v) = 0 and f (v) = 0, both sides of the equality are zero. When (v) = 0 but
f (v) 6= 0, both sides of the equality are +. When (v) > 0, the left side is
|f (v)|2
sup{2f (v)t (v)t2 } = at t = f (v)(v)1 ,
tIR (v)
which is the same as the value on the right side:
|f (v)|2
sup t2 |f (v)|2 = at t = s(v)1/2 .
tIR, t2 (v)1 (v)
Another simple but important observation is that (6.3) can be reduced to a special case of (6.1). Indeed, let σ : V → IR be a quadratic form on a real vector space V. Let v0 ∈ V and U ⊂ V be an element and a linear subspace of V. Then for every u ∈ U
6.2.2 Well-Posedness
Well-posedness of an LQ optimization problem is related to the following two questions.
(a) Does the functional to be maximized (minimized) have a finite upper (respectively
lower) bound?
where a ∈ IR is a real parameter, defines a quadratic functional on the real vector space C[0, 1] of all continuous functions v : [0, 1] → IR if and only if a > −1 (otherwise the integral does not converge for some v ∈ C[0, 1]). Since
2rt^a − r² ≤ t^{2a}  ∀ r ∈ IR, t ∈ (0, ∞),
where the inequality becomes equality at r = t^a, the functional has a finite upper bound over C[0, 1] if and only if a > −1/2. A maximizer v∗ ∈ C[0, 1] exists if and only if a ≥ 0, in which case it is, naturally, given by v∗(t) = t^a.
While sometimes it is difficult to judge well-posedness of an LQ optimization setup
over an infinite dimensional vector space, the following simple statements are frequently
quite useful.
First, the quadratic functional from (6.1) has a finite upper bound only when the
corresponding quadratic form is positive semidefinite.
Proof. Let v1 ∈ V be a vector such that σ(v1) < 0. If f(v1) ≠ 0 then |f(tv1)|² → +∞ for t → ∞, while σ(tv1) ≤ 0 remains smaller than 1. If f(v1) = 0, let v0 ∈ V be a vector such that f(v0) ≠ 0. Then
|f(tv0 + t²v1)|² = t²|f(v0)|² → +∞  as t → +∞,
while
σ(tv0 + t²v1) = t²σ(v0) + 2t³(v0, v1)σ + t⁴σ(v1) → −∞  as t → +∞
remains smaller than 1.
we have
Φ(v) = 2f(v) − σ(v) = 2Lx − x′Qx.
Since Φ(v) has a finite upper bound, Q is positive semidefinite. Also, for every x ∈ IR^n such that Qx = 0 the equality Lx = 0 must hold, since otherwise Φ(tx) converges to +∞ as t converges to plus or minus infinity. Hence there exists a 1-by-n matrix H such that L = HQ, which implies that
2Lx − x′Qx = 2HQx − x′Qx = −(x − H′)′Q(x − H′) + HQH′
achieves its maximum at x = H′.
Φ(v∗) − Φ(v) = σ(v − v∗) ≥ 0,
where Q = Q′ and F are given real matrices of dimensions n-by-n and n-by-1 respectively. Assuming that Q = Q′ is a positive semidefinite matrix (i.e. defines a positive semidefinite quadratic form v ↦ v′Qv), the necessary and sufficient condition that a given vector v∗ ∈ IR^n maximizes Φ(v) is given by
F′u = v∗′Qu  ∀ u ∈ IR^n,
i.e. F′ = v∗′Q, which is equivalent to Qv∗ = F. When Q = Q′ > 0 is positive definite, the condition can be further simplified to v∗ = Q^{−1}F.
An equivalent formulation of Theorem 6.2 is given by the orthogonality principle, or
projection theorem.
[Figure 6.1: the affine subspace v0 + U containing the vectors v∗ and v∗ + u; the optimal v∗ is orthogonal to every u ∈ U.]
As shown on Figure 6.1, the projection theorem can be interpreted in terms of the distance and scalar product
dist(v, u) = σ(v − u)^{1/2},  (v, u)σ = (σ(v + u) − σ(v − u))/4
in V, induced by the positive semidefinite quadratic form σ. The theorem claims that a vector v∗ ∈ v0 + U has minimal distance to the origin 0 ∈ V if and only if it is orthogonal to every element of U.
The result of Theorem 6.4 classifies the optimizing sequences {vk}_{k=1}^∞ as those for which the linear functionals u ↦ (vk, u)σ converge to the linear functional u ↦ f(u) well enough, where the required type of convergence is characterized by condition (b).
Proof. The proof follows from the identity
Proof. It is easy to see why the statement is true when V = IR^n and σ is positive definite. Indeed, then σ(v) = v′Qv for some symmetric matrix Q = Q′, and every f ∈ V♯ has the form v ↦ F′v, where F ∈ IR^n is a fixed vector. Since σ is positive definite, Qv = 0 implies v′Qv = 0 and hence v = 0, i.e. ker(Q) = {0}. Hence the equation Qv = F has a unique solution v = Q^{−1}F for every F ∈ IR^n. Since
the maximum
Φ0(f) = F′Q^{−1}F  for f(v) = F′v
is indeed a quadratic form.
Now consider the general case. To prove that V∘ is a linear subspace of V♯, note that, according to Lemma 6.2, a linear functional f ∈ V♯ belongs to V∘ if and only if there exists a constant d = df ∈ IR such that |f(v)| ≤ df for all
v ∈ Ωσ := {v ∈ V : σ(v) ≤ 1}.
Lg(f) = Φ0(g + f) − Φ0(g − f)
Chapter 7
Bounded Linear Functions on Hilbert Spaces
Fixing a positive definite quadratic form σ on a real vector space V allows one to quantify the length |v|σ of every vector v ∈ V according to |v|σ = σ(v)^{1/2}. It can also be used as a basis for measuring the operator norm ‖A‖ of a linear function A as the minimal upper bound for the ratio of the lengths |Av|σ and |v|σ. The availability of vector length and the operator norm makes it possible to consider approximation and convergence of vectors and linear functions.
A major goal of this chapter is to introduce the techniques for finding an approximation Ar of a given linear function A, where Ar is restricted to be a linear function of rank less than r, which minimizes the operator norm ‖Ar − A‖ of the error function ∆ = Ar − A. In the case when A is a finite matrix with real or complex coefficients, and the standard Euclidean length of vectors is used, the procedure requires computing the dominant (largest) eigenvalues of A′A (where the prime means Hermitian conjugation), as well as the corresponding eigenvectors. When A has infinite rank, several complications may arise. First, it is not always possible to define the conjugate A′ of a linear operator A on a vector space V of infinite dimension: this, in general, requires boundedness of A and completeness of V with respect to the distance measure d(v, u) = |v − u|σ. Second, the operator A′A will not necessarily have any eigenvalues and eigenvectors: instead, A′A has a spectrum, of which eigenvalues and eigenvectors constitute a special case. The chapter explores these issues within the context of Hilbert spaces, operator norms, and the spectrum of a self-adjoint linear operator.
7.1 Motivation
Mathematical models based on Hilbert spaces and spectral decomposition of symmetric
operators are heavily utilized in a variety of applications, ranging from quantum mechanics
and distributed models to wavelets and model reduction.
Let V denote the real vector space of all such functions (where m is not fixed and can be arbitrary). The standard Fourier series can be used to represent the elements of V as limits of finite sums of sinusoids, according to
v(t) = lim_{n→∞} vn(t),  vn(t) = a0 + Σ_{k=1}^n {ak cos(kt) + bk sin(kt)},
where
a0 = (1/2π) ∫ v(t)dt,  ak = (1/π) ∫ cos(kt)v(t)dt,  bk = (1/π) ∫ sin(kt)v(t)dt,
with all integrals taken over a period of length 2π.
[Figure: a 2π-periodic function and a partial sum of its Fourier series, plotted for t ∈ [−4, 4].]
which is valid for all sequences of real numbers ak, bk, it is possible to establish that the length of the error vector h − hn decreases not slower than a constant times 1/√n, i.e.
|h − hn|σ = O(1/√n)  as n → ∞,
the length being measured by the quadratic form
σ(v) = (1/2π) ∫ |v(t)|²dt,
which is finite for every v ∈ V.
One can try to interpret this result as evidence to the point that the functions {cos(kt)}_{k=0}^∞, {sin(kt)}_{k=1}^∞ form a basis in V, in the sense that for every element v ∈ V there exist sequences of real numbers (ak)_{k=0}^∞ and (bk)_{k=1}^∞ such that
v(t) = a0 + Σ_{k=1}^∞ {ak cos(kt) + bk sin(kt)},
i.e.
lim_{n→∞} |v − vn|σ = 0  for vn(t) = a0 + Σ_{k=1}^n {ak cos(kt) + bk sin(kt)}.
the functions {cos(kt)}_{k=0}^∞, {sin(kt)}_{k=1}^∞ form a basis, in the sense that there is a one-to-one correspondence between the elements of L2[0, 1] and pairs of sequences of real numbers with finite sums of squares, mapping every u ∈ L2[0, 1] to (ak)_{k=0}^∞, (bk)_{k=1}^∞ in such a way that
lim_{n→∞} ∫ |u(t) − a0 − Σ_{k=1}^n {ak cos(kt) + bk sin(kt)}|² dt = 0.
While working directly with the real vector space L2[0, 1] requires a solid foundation in the theory of integration, it is possible to infer many of its properties from studying the quadratic form σ on the real vector space V.
The analysis of convergence of the Fourier series can be continued by studying the linear operators En : V → V mapping every function v ∈ V to the error of its approximation by a finite sum of its complex Fourier series:
(Env)(t) = v(t) − (1/2π) Σ_{k=−n}^n ∫ e^{jk(t−τ)} v(τ)dτ.
Calculation of the operator norm ‖En‖ of En, which measures the maximal (over v ≠ 0) ratio of lengths |Env|σ/|v|σ, performed using the Parseval formula, yields a partially disappointing outcome: ‖En‖ = 1 for all n! On one hand, this means that the length of the error vector Env can be as large as the length of v ≠ 0. On the positive side, the length of Env is never larger than the length of v, which is much better than can be claimed for many other approximation methods.
for some linear functionals f1, . . . , fk : IR^n → IR; computing all values of fi(v) for a given v takes O(kn) operations, and computing the sum of fi(v)vi takes another O(km) operations.
While this is not exactly the same as the original matrix model reduction question, the following relaxed formulation is important because it allows an elegant and efficient solution: given an m-by-n real matrix M and a positive integer r, find a matrix Mr of rank less than r for which the operator norm ‖∆r‖ of the difference ∆r = Mr − M is minimal.
To state a question like this, one has to fix first a way of measuring lengths of vectors in both IR^n and IR^m. By default, the positive definite quadratic form σ on IR^k is defined according to
σ([x1; . . . ; xk]) = x1² + · · · + xk²,
in which case |x|σ becomes the standard Euclidean length |x| of x.
A solution of the matrix rank reduction problem is given in terms of the Schur (eigenvalue) decomposition of the symmetric matrix A = M′M. Essentially, the r-th largest eigenvalue of M′M is the square of the minimum of ‖Mr − M‖ over the set of matrices Mr of rank less than r. Moreover, the eigenvectors of M′M corresponding to its r largest eigenvalues can be used to calculate an optimal approximation Mr.
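In computational terms this is the singular value decomposition, whose singular values squared are the eigenvalues of M′M; a minimal numerical check (an illustration with random data) that truncation achieves error equal to the r-th largest singular value:

    import numpy as np

    rng = np.random.default_rng(5)
    M = rng.standard_normal((6, 8))
    U, s, Vt = np.linalg.svd(M)
    r = 3
    Mr = U[:, :r - 1] @ np.diag(s[:r - 1]) @ Vt[:r - 1]   # truncation of rank r - 1
    print(np.linalg.norm(Mr - M, 2), s[r - 1])            # operator norm error = s_r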
The optimal rank reduction setup can be generalized to cover the important case of finding low rank approximations to linear functions M : V → U, where V and U are real vector spaces of infinite dimensions. Such approximations are quite helpful when numerical calculations with infinite dimensional vectors are to be performed. For example, while there is no explicit formula for computing the solution u : [−π, π] → IR of the differential equation
for a given function v : [−π, π] → IR, a good low rank approximation of the linear function v ↦ u can be computed using the ideas of matrix (linear function) rank reduction.
fu(v) := (u, v)σ = (σ(u + v) − σ(u − v))/4   (7.1)
satisfies the condition
|f| := sup{|f(v)| : σ(v) ≤ 1} < ∞.   (7.2)
Example 7.1 Let V = C[0, 1] be the real vector space of continuous functions v : [0, 1] → IR. Let σ, f : V → IR be the quadratic form and the linear functional defined by
σ(v) = ∫₀¹ v(t)²dt,  f(v) = ∫₀^{0.5} v(t)dt.
Then σ is positive definite, f is linear, and condition (7.2) is satisfied with |f| = 1/√2. Nevertheless, there is no continuous function u : [0, 1] → IR such that
f(v) = (u, v)σ = ∫₀¹ u(t)v(t)dt.
One can argue that the vector space V from Example 7.1 is incomplete with respect
to the metric defined by the quadratic form, in the sense that it does not contain enough
elements to define all continuous linear functionals on itself as scalar products.
Definition 7.1 A real Hilbert space is a pair (V, σ), where V is a real vector space, and σ : V → IR is a positive definite quadratic form such that every linear functional f : V → IR satisfying (7.2) can be represented in the form (7.1).
Actually, this definition is not quite standard. The following statement establishes the
equivalence between Definition 7.1 and a more general notion of completeness.
(a) for every γ ∈ (0, ∞) and every linear function f : V → IR satisfying |f(v)| ≤ γ|v|σ for all v ∈ V there exists u ∈ V such that |u|σ ≤ γ and f(v) = (u, v)σ for all v ∈ V;
rn = sup_{k,i>n} |vk − vi|.

|u − vk|^2 = |f(u − vk) − (vk, u − vk)σ| ≤ rn |u − vk|,

hence

lim_{min{k,i}→∞} sup{ |(v, vk)σ − (v, vi)σ|^2 : v ∈ V, σ(v) ≤ 1 } = 0,
As usual, a common abuse of notation allows one not to distinguish between the Hilbert space H = (V, σ) and the corresponding real vector space V, so that v ∈ H is to be understood as v ∈ V. As a rule, the scalar product (v, u)σ in a Hilbert space H = (V, σ) is denoted without the index, as (v, u) or, equivalently, ⟨v, u⟩. In these lecture notes, the length σ(v)^{1/2} of a vector v in a Hilbert space H = (V, σ) is denoted by |v| (the alternative commonly used notation ‖v‖ will be reserved for those length-like quantities (norms) for which the function v ↦ ‖v‖^2 is not a quadratic form).
i.e.

f(v) = (u, v)  where  u = f(v1)v1 + · · · + f(vn)vn.

In particular, the real vector space V = IRⁿ is usually treated as a real Hilbert space with the quadratic form

σ([x1; . . . ; xn]) = x1^2 + · · · + xn^2.
Establishing from scratch that a specific positive definite quadratic form on an infi-
nite dimensional real vector space defines a Hilbert space is usually tricky. The following
example is one of the easiest of its kind.
Example 7.3 The infinite dimensional Hilbert space ℓ2 is defined as the pair H = (V, σ), where V is the subset of the set IR^{ZZ+} consisting of all functions v : ZZ+ → IR (essentially, sequences v(0), v(1), . . . of real numbers) which are square summable, i.e. such that

σ(v) = Σ_{t=0}^{∞} |v(t)|^2 < ∞.

Obviously v ∈ V implies cv ∈ V, and the fact that for v, u ∈ V the inequality

|v(t) + u(t)|^2 ≤ 2(|v(t)|^2 + |u(t)|^2)

proves v + u ∈ V, which establishes that V is a real vector space. Similarly, since

|v(t)u(t)| ≤ 0.5(|v(t)|^2 + |u(t)|^2),

the function

(v, u) ↦ Σ_{t=0}^{∞} v(t)u(t)

is a well defined symmetric bilinear form on V, which establishes that σ : V → IR is a positive definite quadratic form.
To verify that ℓ2 is indeed a Hilbert space, take a Cauchy sequence {vk}_{k=1}^{∞} of elements vk ∈ ℓ2. Since |vk − vi| ≥ |vk(t) − vi(t)| for every t ∈ ZZ+, the sequence {vk(t)}_{k=1}^{∞} has a limit u(t) for every t ∈ ZZ+. Since

Σ_{t=0}^{∞} |vk(t) − u(t)|^2 ≤ rn  for k > n,

it follows that u ∈ ℓ2 and |vk − u| → 0 as k → ∞.
Example 7.4 Let V = C[0, 1] be the real vector space of all continuous functions v : [0, 1] → IR. The quadratic form σ : V → IR defined by

σ(u) = ∫_0^1 u(t)^2 dt  (7.3)

does not make (V, σ) a Hilbert space. Nevertheless, V can be viewed as a subset of the real Hilbert space V′ of all linear functionals f : V → IR such that |f(v)|^2 ≤ σ(v) for all v ∈ V, where an element u ∈ V is associated with the functional

fu : v ↦ ∫_0^1 u(t)v(t) dt.  (7.4)

Essentially, V′ consists of all linear functionals (7.4) where u(·) is not restricted to V, but is allowed to be a measurable function u : [0, 1] → IR which is also square integrable, in the sense that the integral (7.3) is finite. In fact, it is proper to associate V′ with the real Hilbert space L2[0, 1] of all measurable square integrable functions u : [0, 1] → IR. However, a proper discussion of L2[0, 1] relies on the theory of Lebesgue integration, and is not given here.
Under the assumptions of Theorem 7.2, every element u ∈ V can be associated with the linear functional fu ∈ V′ defined by

fu(v) = (u, v)σ.
the sequence of real numbers {fk(v)}_{k=1}^{∞} converges to a limit g(v) ∈ IR as k → ∞. Repeating the arguments from the proof of the implication (a)⇒(b) of Theorem 7.1 shows that g ∈ V′ and σ′(fk − g) → 0 as k → ∞.

To prove (b), for a given f ∈ V′ define the sequence {vk}_{k=1}^{∞} as in the proof of the implication (b)⇒(a) of Theorem 7.1, and follow its arguments to show that σ′(f − fk) → 0, where fk(v) = (vk, v)σ.
Definition 7.2 Let V be a complex vector space. A real valued quadratic form σ : V → IR is called a Hermitian form when

Just as every quadratic form σ on a real vector space defines a unique bilinear symmetric function (·, ·)σ : V × V → IR such that σ(v) = (v, v)σ, every Hermitian form σ : V → IR on a complex vector space V defines a unique function (·, ·)σ : V × V → C with the following properties:
Example 7.6 The complex vector space V of all functions v : ZZ+ → C which are square summable, in the sense that

σ(v) = Σ_{t=0}^{∞} |v(t)|^2 < ∞,

and the Hermitian form σ : V → IR define a complex Hilbert space ℓ2(C) = (V, σ) with the Hermitian inner product

(v, u) = (v, u)σ = Σ_{t=0}^{∞} v(t)'u(t).
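For finitely supported sequences stored as column vectors, this inner product is exactly the MATLAB primed product (a small illustration with made-up data; the MATLAB prime is the conjugate transpose, matching the v(t)'u(t) convention above):

v = [1+2i; -1i; 3]; u = [2; 1i; 1-1i];   % truncated elements of l2(C)
ip = v'*u;                % sum of v(t)'*u(t): the Hermitian inner product
sigma_v = v'*v;           % sigma(v) = (v,v) is real and non-negative
disp([ip, sigma_v])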
Definition 7.3 Let U, V be Hilbert spaces (either both real, or both complex). A linear function A : U → V is called bounded when there exists a constant c ∈ IR such that

|Au| ≤ c|u|  for all u ∈ U.

The minimal c ≥ 0 satisfying this condition is called the operator norm of A, denoted by ‖A‖.
It is easy to see that the opposite is also true: a linear function A : U → V is continuous with respect to the length metrics if and only if A is bounded. For nonlinear functions, the relation fails both ways: for example, the function A1 : IR → IR defined by A1(x) = x^2 is continuous but not bounded, and the function A2 : IR → IR defined by

A2(x) = 0 for x < 1,   A2(x) = 1 for x ≥ 1,
Proof. Consider the case of real scalars (the complex case is very similar). For every v ∈ V consider the linear functional fv : U → IR defined by fv(u) = (v, Au). Since

the functional is bounded, and hence can be represented as a scalar product with an element w of U: fv(u) = (w, u). Note that w is uniquely defined by v, and |w| ≤ ‖A‖ |v|. Therefore the map B : v ↦ w is linear and bounded.
The linear function B defined in Theorem 7.3 is called the conjugate of A. The conjugation operation is frequently denoted by the star sign, as in B = A*. However, deferring to MATLAB, this text will use the prime notation B = A' for conjugation.
Example 7.7 It is easy to see that the operation of conjugation is, in general, available only on Hilbert spaces. For example, let σ : U → IR be a positive definite quadratic form on a real vector space U. Let f : U → IR be a bounded linear function. Consider the positive definite quadratic form x ↦ x^2 on V = IR. Then a conjugate of f with respect to these two quadratic forms would be a map f' : IR → U such that

which is equivalent to f(u) = (f'(1), u)σ for all u ∈ U. In other words, existence of a conjugate for every bounded linear function f : U → IR is already equivalent to U being a Hilbert space with respect to σ!
Definition 7.4 A subset X of a Hilbert space H is called closed in H if for every sequence {vk}_{k=1}^{∞} of elements vk ∈ X the condition |u − vk| → 0 as k → ∞ implies u ∈ X.
Example 7.8 Let X be the set of all sequences x ∈ ℓ2 with a finite number of non-zero elements. While X is a linear subspace of ℓ2, it is not closed. To show this, let vk ∈ X be defined by

vk(t) = 2^{−t} for t < k,   vk(t) = 0 for t ≥ k.

Let u ∈ ℓ2 be defined by u(t) = 2^{−t} for all t ∈ ZZ+. Then |u − vk| converges to zero as k → ∞, but u is not an element of X, as it has an infinite number of non-zero entries. Hence X is not closed in H.

In contrast, the set of all x ∈ H such that x(t) = 0 for all odd t ∈ ZZ+ is a closed linear subspace of H.
Theorem 7.5 Let V be a linear subspace of a Hilbert space H (real or complex). The quadratic form v ↦ |v|^2 makes V a Hilbert space if and only if V is closed in H.
Theorem 7.5 can be used to prove the existence of an orthogonal complement to every
closed linear subspace of a Hilbert space.
Theorem 7.6 Let H be a Hilbert space (real or complex) with inner product (·, ·). Let V be a closed linear subspace of H. Then

Proof. To prove (a), note that V⊥ is invariant under scaling and addition, i.e. V⊥ is a linear subspace of H. Moreover, if |ek − g| → 0 as k → ∞ and (ek, v) = 0 for all k, then

i.e. V⊥ is closed.

To prove (b), note that every vector u ∈ H defines a bounded linear function fu : V → F (where F = IR or F = C) according to fu(v) = (u, v). By Theorem 7.5, V is a Hilbert space, and hence there exists w ∈ V such that fu(v) = (w, v). Equivalently, the vector e = u − w is orthogonal to V, in the sense that (e, v) = 0 for all v ∈ V. To prove uniqueness, assume e1 + v1 = e2 + v2, where e1, e2 ∈ V⊥ and v1, v2 ∈ V. Then e1 − e2 = v2 − v1, and hence

i.e. v2 = v1 and e2 = e1.
To prove (c), note that the uniqueness of the representation u = e + v, where e ∈ V⊥, v ∈ V, implies linearity of the resulting maps u ↦ e and u ↦ v, since cu = ce + cv follows from u = e + v for every scalar c ∈ F, and u1 + u2 = (e1 + e2) + (v1 + v2) follows from u1 = e1 + v1 and u2 = e2 + v2. Also, since

for e ∈ V⊥, v ∈ V, the operator norms of the functions PV and PE are not larger than 1.
While Theorem 7.6 appears to be obvious in the finite dimensional case, where every linear subspace of a Hilbert space is automatically closed, it is important to note that orthogonal projections are not well defined for non-closed subspaces. In particular, for u ∈ ℓ2 and V ⊂ ℓ2 defined in Example 7.8 there is no way to represent u as a sum u = e + v, where v ∈ V and e ∈ V⊥, because V⊥ = {0} and u ∉ V.
that every element of H is a limit of a sequence of vectors from V. The set V from Example 7.8 is a useful example of a dense linear subspace of ℓ2.

The following statement establishes the principle of extension for a bounded linear function.
Theorem 7.7 Let G and H be two Hilbert spaces (real or complex). Let U and V be dense linear subspaces of G and H respectively. Let A0 : U → V be a linear function which is bounded, in the sense that there exists a constant γ ≥ 0 such that |A0 u| ≤ γ|u| for every u ∈ U (naturally, the length |A0 u| of A0 u ∈ V is taken in H, and the length |u| of u ∈ U is taken in G). Then there exists a unique bounded linear function A : G → H such that Au = A0 u for all u ∈ U.
(Diagram: A0 : U → V sits above its extension A : G → H, with the vertical arrows the dense embeddings U ⊂ G and V ⊂ H.)
{A0 uk}_{k=1}^{∞} is a Cauchy sequence in H. Hence A0 uk → h as k → ∞ for some h ∈ H.

Let us show that h depends only on g ∈ G (and not on the selection of a specific sequence {uk}_{k=1}^{∞} converging to g). Indeed, if, in addition, |ũk − g| → 0 for some ũk ∈ U, then

converges to zero as k → ∞. Hence |A0 uk − A0 ũk| → 0, i.e. the sequences {A0 uk}_{k=1}^{∞} and {A0 ũk}_{k=1}^{∞} must have the same limit.

The fact that h is uniquely defined by g allows one to write h = A(g). Since scaling of a convergent sequence results in the same scaling of its limit, and the limit of a sum of two convergent sequences is the sum of their limits, A : G → H is a linear function. Also, since |A0 uk| ≤ γ|uk|, it follows that |Ag| ≤ γ|g|.

Finally, for every bounded linear extension B of A0 we have |Bg − Buk| → 0 whenever |uk − g| → 0. Since uk ∈ U, this yields |Bg − Buk| = |Bg − A0 uk| → 0, i.e. Bg = Ag.
The operation of continuous extension has a number of useful properties, listed in the
following statement.
|(A0 + C0)uk − (A + C)g| = |A(uk − g) + C(uk − g)| ≤ (‖A‖ + ‖C‖)|uk − g|.

To prove (d), note first that substituting u = D0 v into (v, A0 u) = (D0 v, u) yields

i.e. |D0 v| ≤ ‖A0‖|v|, which proves boundedness of D0. Let D : V → U be the continuous extension of D0. Since for |uk − g| → 0, uk ∈ U, and |vk − h| → 0, vk ∈ V, we have

(h, Ag) − (Dh, g) = lim_{k→∞} {(vk, Auk) − (Dvk, uk)} = lim_{k→∞} {(vk, A0 uk) − (D0 vk, uk)} = 0,

D = A'.
The convergence, however, is not of the ordinary point-wise type: referring to the value of f(t) for a specific t ∈ IR could be meaningless for a continuum of instances of t. This subsection explains how the summation in (7.5) can first be defined on a set of nice sequences {fk}_{k∈ZZ}, and then extended to cover the general case of sequences {fk}_{k∈ZZ} satisfying (7.6).

For completeness, the following result from the classical theory of continuous functions will be useful.
Theorem 7.9 If a continuous function f : [−π, π] → C is such that

∫_{−π}^{π} e^{jkt} f(t) dt = 0

Since f is a continuous function, r(δ) converges to zero as δ → 0, and hence the upper bound can be made arbitrarily close to zero by selecting large n and small δ. Therefore f(τ) = 0.
Let U be the complex vector space of all functions u : ZZ → C such that u(k) = 0 for all but a finite set of k ∈ ZZ. Let V be the set of all continuous functions v : [−π, π] → C such that only a finite number of the Fourier series coefficient integrals

vk = (1/2π) ∫_{−π}^{π} e^{−jkt} v(t) dt,  k ∈ ZZ  (7.7)

are not equal to zero. Let A0 : U → V be the Fourier series sum function mapping u ∈ U to v = A0(u) according to

v(t) = Σ_{k∈ZZ} e^{jkt} u(k).

Let

(u1, u2)σ = Σ_{k∈ZZ} u1(k)'u2(k),   (v1, v2)σ = (1/2π) ∫_{−π}^{π} v1(t)'v2(t) dt
Let G and H be the complex Hilbert spaces obtained by applying the process of completion, as described in Theorem 7.2, to U and V respectively. According to Theorem 7.8, the identities in (7.8) imply that the linear functions A0, B0 have uniquely defined continuous extensions A : G → H and B : H → G such that

B = A',  A'A = BA = I_G,  AA' = AB = I_H.

The linear function A extends the Fourier series sum operation to the elements of G, which can be easily associated with the Hilbert space of all functions f : ZZ → C satisfying the inequality (7.6).

In contrast, for a fixed t ∈ [−π, π] the linear function ft : U → C defined by

ft(u) = Σ_{k∈ZZ} e^{jkt} u(k)

is not bounded with respect to the Hermitian form σ, and hence cannot be extended to G.
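The isometry between G and H can be spot-checked numerically for a finitely supported u; the sketch below (test data arbitrary) compares the sum of |u(k)|^2 with the integral form of the norm of v = A0 u, using trapezoidal quadrature:

k = -2:2; u = [1; -0.5i; 2; 0.3; 1i];    % u(k) nonzero only for k = -2..2
t = linspace(-pi, pi, 10001).';
v = exp(1i*t*k)*u;                        % v(t) = sum_k e^{jkt} u(k)
lhs = trapz(t, abs(v).^2)/(2*pi);         % (1/(2*pi)) * int |v(t)|^2 dt
rhs = sum(abs(u).^2);                     % sum_k |u(k)|^2
disp([lhs, rhs])                          % agree up to quadrature error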
(b) if λ− > −∞, λ+ < ∞, and a sequence {vk}_{k=1}^{∞} of vectors vk ∈ V is such that σ(vk) = 1 and ψ(vk) converges to λ+ as k → ∞, then

To prove (a), note that, by assumption, φ(·) with w = v achieves its maximum at t = 0, and hence

0 = φ'(0) = 2 [(u, w)ψ σ(w) − ψ(w)(u, w)σ] / σ(w)^2 = 2 [(u, v)ψ − λ(u, v)σ] / σ(v).

To prove (b), note that, by assumption, the maximum of φ(t) − φ(0) converges to zero as k → ∞. Since the maximum is not smaller than

|(u, vk)ψ − λ+(u, vk)σ|^2 / (C σ(u)σ(w)),  where C = C(λ+, λ−),

the conclusion follows.
Example 7.9 Let us try to find the maximal value of x1x2 + x2x3 where x1, x2, x3 are real numbers such that x1^2 + x2^2 + x3^2 = 1. The setup corresponds to having V = IR^3 and σ : V → IR, ψ : V → IR defined by

σ([x1; x2; x3]) = x1^2 + x2^2 + x3^2,   ψ([x1; x2; x3]) = x1x2 + x2x3.

According to Theorem 7.10 (a), a set of values x1, x2, x3 is optimal only if the identity

λ(u1x1 + u2x2 + u3x3) = 0.5(u1x2 + u2x1 + u2x3 + u3x2)

holds for all u1, u2, u3 ∈ IR.
The conclusion, however, relies on the premise that ψ(v) does achieve its maximal value subject to σ(v) = 1. In this specific case, due to the fact that the dimension of V = IR^3 is finite, the assumption is indeed true. A proper way of establishing attainability of the maximum is by referring to compactness of the set

S = {v ∈ IR^3 : σ(v) = 1}

and continuity of the function ψ: compactness means that out of every sequence {vk}_{k=1}^{∞} of vk ∈ S such that ψ(vk) → λ, one can extract a subsequence wn = v_{k(n)} converging to a limit u ∈ S, which in turn implies that

The compactness and continuity arguments, which become less trivial, though still very useful, in the infinite dimensional case, will be studied in later chapters. Perhaps an easier way to see that the computed value λ = 1/√2 is indeed the maximum of ψ on S is by checking that the matrix of the quadratic form

(1/√2 + ε)σ(v) − ψ(v)

is positive semidefinite for every ε ≥ 0.
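The finite dimensional claims of this example are easy to confirm numerically; the matrix below is the representation of ψ worked out at the end of this section:

A = [0 0.5 0; 0.5 0 0.5; 0 0.5 0];   % v'*A*v = x1*x2 + x2*x3
[Q,D] = eig(A);
[lmax,i] = max(diag(D));
disp(lmax)                            % 1/sqrt(2) = 0.7071...
disp(Q(:,i))                          % proportional to [1/2; sqrt(2)/2; 1/2]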
Example 7.10 Let V = ℓ2 be the real Hilbert space of all functions v : ZZ+ → IR such that

|v|^2 = Σ_{k=0}^{∞} v(k)^2 < ∞.

Let σ(v) = |v|^2 be the quadratic form defining V as a Hilbert space. Let

ψ(v) = v(0)^2/2 + Σ_{k=1}^{∞} v(k)v(k+1) = v(0)^2/2 + v(1)v(2) + v(2)v(3) + . . . .

Since

v(k)v(k+1) ≤ 0.5{v(k)^2 + v(k+1)^2},

it follows that ψ(v)/σ(v) < 1 for all v ∈ ℓ2, v ≠ 0. Since, for wr ∈ ℓ2 defined by wr(k) = r^k, where r ∈ (0, 1) is a parameter,

σ(wr) = 1/(1 − r^2),   ψ(wr) = 1/2 + r^3/(1 − r^2),   lim_{r→1} ψ(wr)/σ(wr) = 1,

one can conclude that the minimal upper bound of ψ(v)/σ(v) equals 1.
According to Theorem 7.10 (a), the minimal upper bound λ of ψ(v)/σ(v), where v ≠ 0, can only be achieved at an element v ∈ ℓ2 satisfying the identity

λ Σ_{k=0}^{∞} u(k)v(k) = 0.5 [ u(0)v(0) + Σ_{k=1}^{∞} {u(k)v(k+1) + u(k+1)v(k)} ]

for all u ∈ ℓ2.
a(z) = z^2 − 2λz + 1.

Substituting (7.12) into the second equation in (7.11) yields c1 = c2 = c. Since z1z2 = 1, this implies that v(k) does not converge to zero as k → ∞, unless v(k) = 0 for all k > 0. Therefore the optimality condition from Theorem 7.10 (a) can only be satisfied for λ = 0.5, which is in obvious disagreement with the minimal upper bound value computed earlier at λ = 1!
The example demonstrates the dangers of relying on eigenvalues in finding maximal ratios of quadratic forms on infinite dimensional vector spaces. It also presents a setup in which the maximal value of such a ratio is not achieved. The absence of an optimizer takes place despite the fact that V is a Hilbert space defined by the quadratic form σ. This is in sharp contrast with the special case when ψ is defined by ψ(v) = |f(v)|^2, where f : V → IR is a linear function. According to the properties of linear quadratic optimization discussed in the previous chapter, the minimal upper bound of |f(v)|^2/σ(v) is achievable whenever it is finite and the pair (V, σ) defines a Hilbert space.
When (V, σ) is a Hilbert space, and A is bounded, (7.13) means that A = A', which explains the terminology.

When V is finite dimensional, the matrix of a self-adjoint operator A with respect to any orthonormal basis is symmetric (in the real case) or Hermitian (in the complex case).

We are used to representing finite dimensional quadratic forms by symmetric matrices. The following statement provides a coordinate free generalization of such representation (which applies in the infinite dimensional case as well), associating bounded quadratic forms ψ = ψ(v) on a Hilbert space H with self-adjoint operators A : H → H according to

(a) ψ is bounded, in the sense that there exists a constant γ > 0 such that |ψ(v)| ≤ γ|v|^2 for all v ∈ H;
By assumption, the quadratic form is positive semidefinite, and can be bounded by 2γ|v|^2. Hence the corresponding symmetric bilinear form satisfies

is bounded, and hence there exists a unique w = A(u) ∈ H such that ψ(u, v) = (w, v) for all v ∈ H. Since the constraints defining w are linear with respect to u and w, the function A : H → H is linear. Substituting v = Au into the identity

yields

|Au|^2 = ψ(u, Au) ≤ γ|u| |Au|,

which proves that A is bounded, and ‖A‖ ≤ γ. Since the bilinear form ψ(u, v) is symmetric, A is self-adjoint. Finally, substituting v = u into (7.15) yields ψ(v) = (v, Av).
Theorem 7.11 allows one to interpret the statements of Theorem 7.10 in terms of self-adjoint operators. Indeed, consider the setup of Theorem 7.10 where (V, σ) is a Hilbert space (i.e. σ(v) = |v|^2 on V) and ψ(v) = (v, Av) = v'Av for some bounded self-adjoint linear operator A : V → V. Then statement (a) of the theorem means that the minimal upper bound λ of the ratio ψ(v)/|v|^2 can only be achieved on eigenvectors of A which correspond to the eigenvalue λ. In turn, statement (b) claims that |Avk − λvk| → 0 as k → ∞ for every sequence {vk}_{k=1}^{∞} of vectors vk ∈ V such that |vk| = 1 and ψ(vk) → λ.
In Example 7.9, the self-adjoint operator A : IR^3 → IR^3 such that v'Av = ψ(v) is defined by

A [x1; x2; x3] = [0.5x2; 0.5(x1 + x3); 0.5x2],

and its largest eigenvalue λ = 1/√2 is the maximum of ψ(v)/|v|^2, achieved at the corresponding eigenvector

v = [1/2; √2/2; 1/2].
In Example 7.10, the self-adjoint operator A : ℓ2 → ℓ2 such that v'Av = ψ(v) is defined by

(Av)(k) = 0.5v(0) for k = 0,   (Av)(k) = 0.5v(2) for k = 1,   (Av)(k) = 0.5v(k−1) + 0.5v(k+1) for k > 1.

Operator A has a single eigenvalue λ = 0.5, which has nothing to do with the supremum (or, for that matter, infimum) of ψ(v)/|v|^2. The actual minimal upper bound of ψ(v)/|v|^2 equals 1. It is not achievable, but ψ(vk)/|vk|^2 → 1 for a sequence of vectors {vk}_{k=1}^{∞}, vk ∈ ℓ2 such that |vk| = 1 if and only if |Avk − vk| → 0.
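The gap between the eigenvalue 0.5 and the supremum 1 is easy to observe on finite truncations of A (a sketch under the block structure written out above; the truncation size is arbitrary): the largest eigenvalue of the N-by-N leading block creeps up towards 1 as N grows, but 1 is never attained.

N = 500;
A = zeros(N); A(1,1) = 0.5;          % index 1 corresponds to k = 0
for k = 2:N-1                        % couplings v(k)v(k+1) for k >= 1
    A(k,k+1) = 0.5; A(k+1,k) = 0.5;
end
disp(max(eig(A)))                    % just below 1 for large N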
has no eigenvalues, but its spectrum is the whole interval [−1, 1].

The spectrum of a bounded self-adjoint operator A : H → H on a Hilbert space H is closely related to the extrema of the functional φ(v) = v'Av/|v|^2, defined for all non-zero elements of H. According to Theorem 7.10, the minimal upper bound λ of φ belongs to the spectrum of A. Actually, since the condition |Avk − rvk| → 0 together with |vk| = 1 implies φ(vk) → r, the supremum of φ is the maximal element of σ(A). In general, eigenvectors w of A define extrema of the functional φ, in the sense that Aw = λw for w ≠ 0 implies

lim_{t→0} [φ(w + tu) − φ(w)]/t = 0  for all u ∈ H.
The following theorem provides justification for an important classification of spectrum: essentially it claims that a spectrum point is either an eigenvalue of finite multiplicity, isolated from the rest of the spectrum, or is a point of the essential spectrum, which is a generalization of the notion of eigenvalues of infinite multiplicity.

(b) m(r) = dim(Vr) < ∞, where Vr = ker(rI − A), and A has block representation

A = [ rI_{m(r)}, 0; 0, Ã ],  r ∉ σ(Ã),  (7.17)

dimension, its essential spectrum is empty. When dim(H) = ∞, σ_ess(A) is a non-empty closed subset of σ(A).
Proof. The implication (b)⇒(a) is easy, as one can define u1, . . . , un ∈ H as a basis in Vr. Then, using the block representations of matrices and vectors associated with the direct sum decomposition H = Vr ⊕ Vr⊥, we conclude that, for v = (0, ṽ) with ṽ ∈ Vr⊥,

|Av − rv| = |rṽ − Ãṽ|,

which means that |vk − vi| → 0 as k, i → ∞, i.e. {vk}_{k=1}^{∞} is a Cauchy sequence, and, as such, has a limit u ∈ V. Since Ar is bounded, Ar u = w, which proves that Ar V is a closed subspace of U.
Now let the quadratic form σ̃ : W → IR be defined by

By construction ker(Ar) = {0}, and hence σ̃ is positive definite. Since dim(W) < ∞, this implies the existence of ε > 0 such that σ̃(w) ≥ ε^2|w|^2. Hence, for w ∈ W and v ∈ V,

|Ar w + Ar v| ≥ ε|Ar w|/‖Ar‖ ≥ ε(|Ar v| − |Ar w + Ar v|)/‖Ar‖ ≥ ε(|v| − |Ar w + Ar v|)/‖Ar‖,

|Ar u| ≥ δ|u|  for all u ∈ U.
Example 7.11 Let V = C[0, 1] be the real vector space of all continuous functions v : [0, 1] → IR, equipped with the positive definite quadratic form

σ(v)  def=  ∫_0^1 |v(t)|^2 dt.

As was pointed out in earlier examples, (V, σ) is not a Hilbert space. Let H be a completion of V, i.e. a real Hilbert space which contains V as a dense subset, and such that σ(v) = |v|^2 for all v ∈ V. (It is known that H can be interpreted as the space L2[0, 1] of all square integrable measurable functions, but this fact will not be used here.)
Let A0 : V → V be the double integration operator mapping u ∈ V to y = A0(u), defined as the (unique) solution of the differential equation

ÿ = −u,  ẏ(0) = 0,  y(1) = 0.  (7.18)

Note that A0 is self-adjoint with respect to σ, since for every pair (u1, y1), (u2, y2) of solutions of (7.18), integration by parts yields

(y1, u2)σ = ∫_0^1 y1(t)u2(t) dt = ∫_0^1 ẏ1(t)ẏ2(t) dt = ∫_0^1 u1(t)y2(t) dt = (u1, y2)σ.
it follows that A0 is a bounded linear operator, and |A0 v| ≤ |v| for all v ∈ V. This makes it possible to consider the uniquely defined extension A : H → H of A0: a bounded linear operator such that Av = A0 v for all v ∈ V. Since A0 is self-adjoint, it follows that A = A'.

What is the spectrum of A? A point r ∈ IR belongs to σ(A) if and only if |ruk − Auk| converges to zero for some sequence of vectors uk ∈ H such that |uk| = 1. Since V is dense in H, one can assume that uk ∈ V without loss of generality, which makes the following calculations more straightforward.
For r ≠ 0 let yk = A0 uk be the corresponding solutions of (7.18). By assumption, |uk − r^{−1}yk| → 0 as k → ∞, i.e.

ÿk + r^{−1}yk = ek,  ẏk(0) = 0,  yk(1) = 0,  lim_{k→∞} ∫_0^1 |ek(t)|^2 dt = 0.  (7.19)
Since, for a given ek(·), a solution of (7.19) can be written explicitly as a convolution integral, we have |yk − y| → 0 and |uk − r^{−1}y| → 0 as k → ∞, where y = y(t) satisfies

the only non-zero points of the spectrum of A are the eigenvalues λ1, λ2, . . . (each of multiplicity one). A set of corresponding normalized eigenvectors {xk}_{k=1}^{∞} is given by

xk(t) = √2 cos((2k − 1)πt/2).
The point r = 0, as a limit point of the spectrum, is automatically in σ_ess(A). It is not very important, but instructive, to see how it is verified that r = 0 is not an eigenvalue of A.

To prove that r = 0 is not an eigenvalue of A0, note that y = A0 u = 0 means that u = −ÿ = 0. However, the elements of H are mystery vectors for us, and it is not right to jump into representing the relation between u ∈ H and y = Au as ÿ = −u.

To prove that r = 0 is not an eigenvalue of A, assume to the contrary that v ∈ H is a non-zero vector such that Av = 0. Then

which means that not every vector in V can be approximated arbitrarily well by the elements of A0 V in the metric of H. Since A0 V contains every two times continuously differentiable function y such that y(1) = ẏ(0) = 0, the opposite is true: A0 V is dense in H. The contradiction proves that ker(A) = {0}.
(c) the sum of multiplicities of those eigenvalues of L'L which are larger than σ^2 is smaller than k.

σ_{k−1}(L) > σ_k(L) = σ_{k+1}(L) = · · · = σ_{k+m−1}(L) > σ_{k+m}(L)

will also be orthonormal. They are called the left singular vectors of L.
When G = IRⁿ or G = Cⁿ, the linear function L can be represented in the form

L = Σ_k σ_k u_k v_k',  i.e.  Lx = Σ_k σ_k (v_k, x) u_k,

which in MATLAB is computed by

[U,S,V] = svd(L);

(u_k is the k-th column of U, v_k is the k-th column of V, σ_k is the k-th diagonal element of S).
Let L : G → H be a bounded linear function mapping one Hilbert space to another. Since linear functions of small rank (recall that the rank of a linear function is the dimension of its range) are usually much easier to deal with in practical calculations, it is frequently desirable to approximate L by a linear function Lr : G → H which has rank smaller than a given positive integer r. It is natural to measure the quality of such approximation in terms of the operator norm ‖L − Lr‖ of the error function.

The following theorem, which is the ultimate destination of this chapter, solves the bounded rank approximation problem in terms of singular values and singular vectors of L.
Theorem 7.13 Let L : G → H be a bounded linear function mapping one Hilbert space to another. Let

σ1 ≥ σ2 ≥ · · · ≥ σ_{r−1} ≥ σ_r ≥ 0

be the first r largest singular values of L. Then

(b) if σ_{r−1} > σ_r and v1, . . . , v_{r−1} are the corresponding orthonormal right singular vectors of L, then ‖L − Lr‖ = σ_r for

Lr = Σ_{k=1}^{r−1} (Lv_k) v_k'.
Example 7.12 Let V = C[0, 1] and its completion H be the real vector space and the Hilbert space considered in Example 7.11.

Let L0 : V → V be the linear integration operator mapping u ∈ V to v = L0(u), defined as the (unique) solution of the differential equation

v̇ = u,  v(0) = 0,  i.e.  v(t) = ∫_0^t u(τ) dτ.

Since |v(t)| ≤ |u| for all t ∈ [0, 1], it follows that |L0 u| ≤ |u| for all u ∈ V, i.e. L0 is bounded with respect to the metric of H. Therefore L0 has a unique bounded extension L : H → H.

Since

∫_0^1 y(t)u(t) dt = ∫_0^1 y(t)v̇(t) dt = y(1)v(1) − y(0)v(0) − ∫_0^1 ẏ(t)v(t) dt

for every continuously differentiable function y : [0, 1] → IR, it is natural to consider the linear operator M0 : V → V mapping w ∈ V to y ∈ V defined by

ẏ = −w,  y(1) = 0,  i.e.  y(t) = ∫_t^1 w(τ) dτ.

By the usual bounding argument, M0 is linear and bounded, and hence has a unique linear bounded extension M : H → H. Since, by construction, (w, L0 u)σ = (M0 w, u)σ for all w, u ∈ V, it follows that M = L'.
Since A = L'L for the self-adjoint operator A : H → H considered in Example 7.11, we can conclude that the singular values of L are given by

σ_k(L) = 2/((2k − 1)π),  (k = 1, 2, . . . ),

with

v_k(t) = √2 cos((2k − 1)πt/2)

being the corresponding right singular vectors.
The calculations allow one to draw conclusions about the possibility of approximating L by linear functions of finite rank. In particular, σ1(L) = 2/π is the operator norm ‖L‖ of L, i.e. the best error of approximating L by a linear function of rank zero (the only linear function of rank zero is zero). More interestingly, σ2(L) = 2/(3π) is the minimal possible error of approximating L by a rank one linear function L1. One optimal approximation of rank one is given by L1 = (Lv1)v1', i.e. it maps u ∈ V to w = L1 u defined by

w(t) = (4/π) sin(πt/2) ∫_0^1 cos(πτ/2) u(τ) dτ.
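The singular values computed in this example can be spot-checked by discretizing L0; the cumulative-sum matrix below is a crude quadrature, so the agreement is only approximate:

N = 400; h = 1/N;
L = h*tril(ones(N));              % (L*u)(t) approximates the integral from 0 to t
s = svd(L);
k = (1:4)';
disp([s(1:4), 2./((2*k-1)*pi)])   % computed vs predicted 2/((2k-1)*pi)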
Chapter 8
Convexity
i.e. if the segment connecting every two points on the graph of f lies above the graph,
and quasi-convex if all of its level sets
are convex.
Convexity is of paramount importance in studying optimization and game theory. It
is also a powerful tool of feasibility analysis for systems of equations and inequalities.
This can also be used to prove convexity of sets, as level sets of convex functions are convex. Finally, it is possible to derive convexity of sets and functions with complex definitions by combining information about convexity of simpler objects.
It is easy to see that every half-space is a convex set, and every affine function is convex
(on every convex subset of V ). It is also not difficult to establish that the intersection of
a family of open half-spaces is a convex set, and the minimal upper bound of a family of
linear functions is a convex function on every set over which it is finite, as stated by the
following theorem.
Theorem 8.1 Let K be a (non-empty) set of affine functionals on a real vector space V. Then

(a) the set

Ω_K^0 = {v ∈ V : h(v) < 0 for all h ∈ K}

is convex;

(b) the function

φ(v)  def=  sup_{h∈K} h(v)

is convex on every set over which it is finite.

In other words, a set defined by affine inequalities is convex, and the supremum of a family of affine functionals is a convex function.
Proof. To prove (a), let v1, v2 ∈ Ω_K^0 and t ∈ [0, 1]. Since h(v1) < 0 and h(v2) < 0 for all h ∈ K, we conclude that

h(tv1 + (1 − t)v2) = th(v1) + (1 − t)h(v2) < 0

for all h ∈ K, t ∈ [0, 1], which means tv1 + (1 − t)v2 ∈ Ω_K^0.
To prove (b), note that the supremum of a sum is not larger than the sum of the corresponding suprema, and hence
Example 8.1 Let V be the set of all Hermitian n-by-n matrices. Let Ω ⊂ V be the subset of V consisting of all positive semidefinite matrices. Is Ω a convex set?

Note that answering this question using the non-negative eigenvalues definition of positive semidefiniteness would be difficult, if not impossible. Luckily, there is another definition: a matrix M ∈ V is positive semidefinite if and only if x'Mx ≥ 0 for all x ∈ Cⁿ, x ≠ 0, or, equivalently,

h(M)  def=  −r − x'Mx < 0  for all x ∈ Cⁿ, x ≠ 0, r ∈ IR, r > 0.

Since the function h : M ↦ −r − x'Mx is affine for all x and r, the set Ω is convex.
Example 8.2 Let V be the real vector space of all Hermitian n-by-n matrices. Let Ω ⊂ V be the subset of V consisting of all (strictly) positive definite matrices. Let φ : Ω → IR be the function mapping M to the trace of M^{−1}. Is φ a convex function?

Once again, trying to rely on an explicit formula expressing the trace of M^{−1} in terms of the entries of M is likely to lead one nowhere.

To show that φ is a convex function, note that the trace of an n-by-n matrix can be defined by

trace(M) = e1'Me1 + · · · + en'Men = Σ_{k=1}^{n} ek'Mek,

where {ek}_{k=1}^{n} is the standard basis in Cⁿ. On the other hand, as follows readily from the properties of linear quadratic optimization, the identity

v'M^{−1}v = max_{u∈Cⁿ} {2Re(v'u) − u'Mu}
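The resulting convexity of φ is easy to spot-check numerically with a midpoint test (random positive definite test data; an illustration rather than a proof):

n = 5;
R1 = randn(n); M1 = R1*R1' + eye(n);      % two random positive definite matrices
R2 = randn(n); M2 = R2*R2' + eye(n);
f = @(M) trace(inv(M));
disp([f((M1+M2)/2), (f(M1)+f(M2))/2])     % the first value never exceeds the second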
where λ_k(Z), for a Hermitian n-by-n matrix Z and k ∈ {1, . . . , n}, denotes the k-th largest eigenvalue of Z. (Indeed, in an appropriate orthonormal basis, the matrix of X is a diagonal one, with the numbers X_kk = λ_k(X) on the diagonal. On the other hand, since 0 ≤ Y ≤ I, all diagonal elements Y_kk of Y are from the interval [0, 1], and, since Y has r positive eigenvalues,

Σ_{k=1}^{n} Y_kk = Σ_{k=1}^{n} λ_k(Y) ≤ r.

Hence

trace(XY) = Σ_{k=1}^{n} λ_k(X)Y_kk ≤ Σ_{k=1}^{r} λ_k(X),

where the maximum is achieved when Y11 = · · · = Y_rr = 1, Y_kk = 0 for k > r.)

Therefore the sum of the r largest eigenvalues is a convex function on the real vector space of all Hermitian matrices of fixed dimensions.
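The variational description is easy to verify numerically: with Y the orthogonal projector onto the eigenvectors of the r largest eigenvalues of X, trace(XY) reproduces their sum (random symmetric test data):

n = 6; r = 2;
X = randn(n); X = (X+X')/2;             % random symmetric matrix
[Q,D] = eig(X);
[e,idx] = sort(diag(D),'descend');
Y = Q(:,idx(1:r))*Q(:,idx(1:r))';       % 0 <= Y <= I with trace(Y) = r
disp([trace(X*Y), sum(e(1:r))])         % the two values coincide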
Example 8.4 Let V be the real vector space of all polynomial functions p : IRⁿ → IR of n real variables. Let φ : V → IR be the function mapping p to the maximum

φ(p) = max_t {p(t) e^{−|t|^2}}.

While the function φ(p) can be difficult to evaluate in terms of the coefficients of p, and finding an analytical expression for φ(p) also seems impossible, it is very easy to establish convexity of φ: since, for every fixed t ∈ IRⁿ, the function p ↦ p(t) exp(−|t|^2) is affine, φ is a maximum of a family of affine functions, and hence is convex.
f+(t)  def=  lim_{δ→0, δ>0} [f(t + δ) − f(t)]/δ

exists and is finite. Similarly, f is left differentiable if for every t ∈ (0, 1) the limit

f−(t)  def=  lim_{δ→0, δ<0} [f(t + δ) − f(t)]/δ
Theorem 8.2 For every function f : (0, 1) → IR the following conditions are equivalent:

(a) f is convex;

(b) f is continuous, right differentiable, and its right derivative f+ is monotonically non-decreasing.
Proof. To prove the implication (a)⇒(b), assume that f is convex. Then for every set of values a, b, c ∈ (0, 1) such that a < b < c we have

f(b) = f( ((b − a)/(c − a))c + ((c − b)/(c − a))a ) ≤ ((b − a)/(c − a))f(c) + ((c − b)/(c − a))f(a),

which is equivalent to

(f(c) − f(b))/(c − b) ≥ (f(c) − f(a))/(c − a) ≥ (f(b) − f(a))/(b − a).

In particular, for every t ∈ (0, 1) and 0 < δ1 < δ2 < 1 − t we have

which means that the limit in the definition of f+(t) is that of a monotonically non-increasing function with a finite lower bound, and hence exists. Also, since

inf_{t∈(a,b)} f+(t) ≤ (f(b) − f(a))/(b − a) ≤ sup_{t∈(a,b)} f+(t)
are satisfied (it is a very useful and not completely trivial exercise to see why this is true). Hence the monotonicity of f+ guarantees that, for 0 < v < u < 1,

are satisfied.
Let Ω be a subset of a real vector space V. A function φ : Ω → IR is convex if and only if its restriction to every segment in Ω is convex, i.e. if the function f_{u,v} : [0, 1] → IR defined by

f_{u,v}(t) = φ(tu + (1 − t)v)

is convex for every pair u, v ∈ Ω. In particular, it is sufficient to know that f = f_{u,v} has a non-negative second derivative at t = 0 for every pair u, v ∈ Ω to conclude that φ is convex.
When Ω is a subset of IRⁿ, and φ : IRⁿ → IR is two times continuously differentiable on an open subset of IRⁿ containing Ω, the second derivative of f_{u,v} is given by

(d²f_{u,v}/dt²)(0) = (u − v)'W(v)(u − v),

where W(v) is the Hessian of φ: the matrix of its partial second derivatives (which must be symmetric due to the continuity of the second derivatives of φ). Therefore, positive semidefiniteness of a continuous Hessian is a guarantee of convexity. In many examples (usually when Ω is a subset of a real vector space with no convenient representation as IRⁿ), instead of assembling the Hessian, it is easier to simply show that for all u, v ∈ Ω there exists a real number a such that the limit

lim_{t→0, t>0} [φ(tu + (1 − t)v) − φ(v) − at]/t²

exists and is non-negative.
Example 8.5 The function φ : [0, π] → IR defined by φ(v) = −sin(v) is convex. Indeed, φ''(v) = sin(v) is non-negative on [0, π]. As a byproduct, a representation of φ as a maximum of affine functions is given by
Example 8.6 Let Ω be the positive quadrant in IR², i.e. the set of vectors [x; y] ∈ IR² with positive components x > 0, y > 0. Obviously Ω is convex. Let the function φ : Ω → IR be defined by φ(x, y) = 1/(xy). By the Hessian criterion described above, the function φ is convex, because its Hessian

W(x, y) = [ d²φ/dx², d²φ/dxdy ; d²φ/dydx, d²φ/dy² ] = [ 2/(x^3 y), 1/(x^2 y^2) ; 1/(x^2 y^2), 2/(x y^3) ]

is positive definite on Ω. Moreover, the identity

1/(xy) = max_{x1>0, y1>0} { 1/(x1 y1) − (x − x1)/(x1^2 y1) − (y − y1)/(x1 y1^2) }

holds for all x, y > 0.
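Both the Hessian and the identity are easy to check at a sample point (the numbers below are arbitrary):

x = 2; y = 3;                           % any point in the positive quadrant
W = [2/(x^3*y),   1/(x^2*y^2);
     1/(x^2*y^2), 2/(x*y^3)];
disp(eig(W))                            % both eigenvalues are positive
x1 = 5; y1 = 0.7;                       % any tangent point gives a lower bound
disp([1/(x*y), 1/(x1*y1)-(x-x1)/(x1^2*y1)-(y-y1)/(x1*y1^2)])  % first >= second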
Example 8.7 Let V be the set of all real symmetric n-by-n matrices. Let Ω ⊂ V be the subset of all positive semidefinite matrices in V. Then for every positive integer n the function X ↦ trace(X^n) is convex on Ω. To prove this, for every X0 ∈ Ω and X1 ∈ V let f(t) = trace((X0 + tX1)^n). Since

trace((X0 + tX1)^n) = Σ_ν t^{s(ν)} trace( X_{ν(1)} X_{ν(2)} · · · X_{ν(n)} ),

where the sum is taken over the set H of all functions ν : {1, . . . , n} → {0, 1}, and the function s : H → {0, 1, . . . , n} maps ν ∈ H to the sum of its values, to prove convexity of f it is sufficient to show that the coefficient at t² in the expansion is non-negative. Indeed, since trace(AB) = trace(BA), the coefficient at t² is a sum of traces of matrices of the form X0^a X1 X0^b X1, where a, b are non-negative integers such that a + b = n − 2. Since X0 is a positive semidefinite symmetric matrix, it can be represented in the form X0 = Y², where Y = Y' ≥ 0. Then

trace(X0^a X1 X0^b X1) = trace((Y^b X1 Y^a)'(Y^b X1 Y^a)) ≥ 0.
Example 8.8 Let V be the set of all Hermitian n-by-n matrices. Let Ω ⊂ V be the subset of all positive definite matrices in V. Then the function φ : Ω → IR defined by

is convex. To prove this, it is sufficient to show that for arbitrary X ∈ Ω and ∆ ∈ V the function
L^{−1}(Ω) = {u ∈ U : L(u) ∈ Ω}

is a convex set.

is a convex set.

Ω0 = {v : f(v) < 0}

is a convex set.
and the corresponding functions φk : Ωk → IR defined by φk(x) = −sin(Lk x) are convex. The intersection Ω = [0, π]ⁿ ⊂ IRⁿ of all the sets Wk is convex as well (this is obvious, as Ω is a hypercube, but it also follows from Theorem 8.3 (c)). Hence for every ε > 0 the function φε : Ω → IR defined by

φε(x)  def=  −1 + ε Σ_{k=1}^{n} φk(x)

Xε = {x : φε(x) < 0}

are convex as well (Theorem 8.3 (d)). Finally, the set in question is convex as the intersection of all the sets Xε with ε > 0.
(a) φ is convex;

The implication (a)⇒(b) is trivial, as a level set of every convex function is convex. The other direction provides a useful way of proving convexity of homogeneous functions by checking convexity of a single level set.

Proof. Assume that assumption (b) holds. For u, v ∈ V and ε > 0 we have

φ( u/(ε + φ(u)) ) ≤ 1,   φ( v/(ε + φ(v)) ) ≤ 1.
Is the function φ : IRⁿ → [0, ∞) defined by φ(x) = (|x1|^a + · · · + |xn|^a)^{1/a} convex for all a ≥ 1? First, the function t ↦ |t|^a is convex on IR (verifiable by differentiation). Hence the function x ↦ φ(x)^a is convex. Hence the level set

Ω = {x ∈ IRⁿ : φ(x)^a ≤ 1}

is convex. Since Ω is also the level set of the homogeneous function φ : IRⁿ → [0, ∞), φ is convex as well.
The definition of an interior point given in Definition 8.1 is weaker than many alternatives (typically based on metrics and topology). For example, according to the definition, v = 0 is an interior point of the set

Ω = { [x1; x2] ∈ IR² : x2 ≤ 0 or x2 ≥ x1^2 },
Then there exists a linear function f ∈ V♯, f ≠ 0, such that f(w) ≤ f(v) for all w ∈ Ω.

Note how the inequality f(w) ≤ f(v), where f is a non-zero linear function, is used to describe mathematically a hyperplane separating v from the interior of Ω.

The proof and applications of the Hahn-Banach Theorem will be discussed in detail in the next chapter.
v = c1v1 + · · · + cmvm,  ck ∈ IR,  ck ≥ 0,  c1 + · · · + cm = 1.

For example, the segment [v, w] connecting two vectors v, w ∈ V is the set of all convex combinations of v and w.
In general, convex combinations provide a useful way of defining convex sets: as stated by the following result, the set of all convex combinations of elements from a given set is always convex.

Theorem 8.6 Let Ω0 be a subset of a real vector space V. Then the set Ω = co(Ω0) of all convex combinations of finite groups of elements from Ω0 is convex.

Proof. If

v = Σ_{k=1}^{m} a_k v_k,  u = Σ_{k=1}^{n} b_k u_k,  Σ_{k=1}^{m} a_k = Σ_{k=1}^{n} b_k = 1,

then, for t ∈ [0, 1], tv + (1 − t)u is a convex combination of v1, . . . , vm, u1, . . . , un, since ta_k ≥ 0, (1 − t)b_k ≥ 0, and

Σ_{k=1}^{m} ta_k + Σ_{k=1}^{n} (1 − t)b_k = 1.
Caratheodory's fundamental theorem states that, in a real vector space of finite dimension n, every convex combination of m > n + 1 vectors is a convex combination of a subset of n + 1 of those vectors.
Theorem 8.7
Proof. To prove (a), let d be the minimal number of elements e1, . . . , ed of the set {w1, . . . , wm} needed to represent w as a linear combination with non-negative coefficients. Let

w = a1e1 + · · · + ad ed,  ad > 0.  (8.2)

Let t0 be the smallest of the ratios a_k/c_k taken over k with c_k ≠ 0. By construction, all coefficients of the linear combination in (8.3) are non-negative, and at least one of them is zero, which contradicts the assumption of minimality of d.

To prove (b), let U = V × IR, a real vector space of dimension q = n + 1, elements of which are pairs (v, y), with v ∈ V, y ∈ IR, with point-wise addition and scaling operations. Define w, w1, . . . , wm by w = (v, 1), wk = (vk, 1). Application of statement (a) shows that there exists a subset u1, . . . , u_{n+1} of the set {v1, . . . , vm} such that
Example 8.11 One important use of Theorem 8.7 is to describe extremal probability distributions in optimization problems in which decision parameters are random variables.

Let v be a random variable which takes values in the interval [−π, π], has zero mean and unit variance. What is the maximal possible expected value of sin(v)? Formally speaking, the problem calls for maximizing the integral

E[sin(v)] = ∫_{−π}^{π} sin(t) dV(t),

where V : [−π, π] → [0, 1] is the monotonic function such that V(−π) = 0 and V(π) = 1, to be optimized.

The optimization can be simplified greatly by realizing that the set Ω of all possible values of the vector

[ E[v]; E[v²]; E[sin(v)] ] ∈ IR³

is the set of all convex combinations of vectors from the set

Ω0 = { [t; t²; sin(t)] : t ∈ [−π, π] }.
According to Theorem 8.7, every element of Ω is a convex combination of just four vectors from Ω0. Therefore, the problem can be reduced to maximizing

Σ_{k=1}^{4} c_k sin(t_k)

subject to

Σ_{k=1}^{4} c_k t_k = 0,  Σ_{k=1}^{4} c_k t_k² = 1,  Σ_{k=1}^{4} c_k = 1,  c_k ≥ 0,  t_k ∈ [−π, π],

with respect to the eight real variables c_k, t_k: still not easy, but manageable.
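A hedged numerical sketch of this reduced problem, assuming the Optimization Toolbox's fmincon (the starting point z0 is an arbitrary guess; the decision vector stacks c1..c4 on top of t1..t4):

obj = @(z) -(z(1:4)'*sin(z(5:8)));              % maximize sum c_k*sin(t_k)
Aeq = [ones(1,4) zeros(1,4)]; beq = 1;          % sum c_k = 1
lb = [zeros(4,1); -pi*ones(4,1)];               % c_k >= 0, t_k >= -pi
ub = [ones(4,1);   pi*ones(4,1)];               % c_k <= 1, t_k <= pi
moments = @(z) deal([], [z(1:4)'*z(5:8); z(1:4)'*(z(5:8).^2)-1]);  % mean 0, variance 1
z0 = [0.25*ones(4,1); [-1.5; -0.5; 0.5; 1.5]];
z = fmincon(obj, z0, [], [], Aeq, beq, lb, ub, moments);
disp(-obj(z))                                   % candidate maximal value of E[sin(v)]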
Definition 8.2 Let V be a real vector space. A subset Ω ⊂ V is called (weakly) bounded if for every u, v ∈ V, u ≠ 0, the set

E_{v,u}(Ω)  def=  {t ∈ IR : v + tu ∈ Ω}

is bounded. Similarly, the subset Ω is called (weakly) closed when E_{v,u}(Ω) is closed for all v, u ∈ V.

For convex subsets of a finite dimensional vector space, the notions of weak boundedness and closedness are the same as the usual ones.
Note that the assumptions of closedness and boundedness are really needed in statement (b) of Theorem 8.8. For example, every finite number of the closed convex subsets Ωk = [k, ∞) of IR, where k = 1, 2, . . . , have a common point, but the intersection of all the sets Ωk is empty. Similarly, every finite number of the bounded convex subsets Ωk = (0, 1/k) of IR, where k = 1, 2, . . . , have a common point, but the intersection of all the sets Ωk is empty.
Proof. To prove (a), it is sufficient to show that if every r > n sets of a family of r + 1 sets Ω1, . . . , Ω_{r+1} in an n-dimensional vector space V have a common point, then all r + 1 sets have a common point. To do this, for every k ∈ {1, . . . , r + 1} let v_k be a common point of all the sets Ω1, . . . , Ω_{r+1} except (possibly) Ω_k. Consider the vectors w_k = (v_k, 1) in V × IR. Since the dimension of V × IR is not larger than r, there exist real numbers c1, . . . , c_{r+1}, not all of which are equal to zero, such that

c1w1 + · · · + c_{r+1}w_{r+1} = 0.

The claim is that the vector

v = a1v1 + · · · + a_{r+1}v_{r+1},  where a_k = |c_k| / (|c1| + · · · + |c_{r+1}|),

belongs to the set Ω_k for every k = 1, . . . , r + 1.

To prove this, since v_k ∈ Ω_i whenever k ≠ i, it is sufficient to show that v is a convex combination of the vectors v1, . . . , v_{r+1} excluding v_k, for every k. Since, by construction, c1 + · · · + c_{r+1} = 0, so that there are both positive and negative elements among the c_k, we have

Σ_{c_k<0} a_k = Σ_{c_k>0} a_k = 1/2,   v = Σ_{c_k>0} (2a_k)v_k = Σ_{c_k<0} (2a_k)v_k,
The proof of the following statement (sometimes called the S-procedure losslessness theorem, and a very useful result in its own right) demonstrates a typical application of Helly's theorem.
Theorem 8.9 If σ, φ : V → IR are two quadratic forms on a real vector space V such that

is closed and bounded. We need to prove that the sets have a common point. According to Theorem 8.8, it is sufficient to show that every two of such sets have a non-empty intersection. Since a pair of sets Ωv, Ωw depends only on the values of σ and φ at three vectors u, v, w, it is sufficient to prove Theorem 8.9 for the linear span of u, v, w, i.e. in the case dim(V) = 3.

Since the application of Helly's theorem is done at this point in the proof, the rest is left as an exercise.
|v| ≤ R  for all v ∈ Ω,  (8.4)

and a feasibility oracle, which is a function h : IRⁿ → IRⁿ × IR taking vectors u ∈ IRⁿ as inputs, and producing pairs (w, c) ∈ IRⁿ × IR as outputs, in accordance with the following rule:

(a) if u ∈ Ω then w'u < c;

operations, whenever ε > 0 is such that the set Ω contains a ball of radius ε.
In abstract terms, the algorithm can be described as follows.

(a) Initialize u = 0 ∈ IRⁿ, H = R·I_n, go to (b).

(b) Apply the oracle to u to produce (w, c) = h(u). If w'u < c then go to (c). Otherwise update H and u according to H := H_e and u := u_e, where the non-singular n-by-n real matrix H_e and u_e ∈ IRⁿ are such that
Theorem 8.10 For a real non-singular n-by-n matrix H and a vector u ∈ IRⁿ let E denote the ellipsoid

E = {v ∈ IRⁿ : |H^{−1}(v − u)| < 1}.

For (w, c) ∈ IRⁿ × IR, where w ≠ 0, let X denote the half-space consisting of those v ∈ IRⁿ for which w'v < c. Then

(a)  r = (w'u − c)/|H'w| ∈ (0, 1);
(b) assuming condition (a) is satisfied, the ellipsoid defined by (8.5) with

u_e = u + ((1 + r)|e|τ/2) MM'w,   H_e = ((1 + 0.5pτ(1 − r))/√(1 + pτ)) M,

where

pτ = 2(1 + rn)/((1 − r)(n − 1)),   e = H'w,   τ = pτ/|e|²,   θ = (1 − 1/√(1 + pτ))/|e|²,   M = H − θHee',
Here is a sample MATLAB code for finding the minimal volume ellipsoid:
function [Hn,x0n]=ellips_cut(H,x0,L,r)
% function [Hn,x0n]=ellips_cut(H,x0,L,r)
%
% finds a minimum-volume ellipsoid {x: |inv(Hn)(x-x0n)|<1} containing the
% intersection of the ellipsoid {x: |inv(H)(x-x0)|<1} with the half-space
% {x: L(x-x0)>r*|LH|}, where L is a 1-by-n row vector and 0<r<1
n=length(x0);
LH=L*H;                                   % 1-by-n row vector
q=norm(LH);
p=q^2;
ptau=2*(1+r*n)/((1-r)*(n-1));
tau=ptau/p;
th=(1-1/sqrt(1+ptau))/p;
H1=H-(th*(H*LH'))*LH;                     % rank-one correction of H
x0n=x0+0.5*(1+r)*q*tau*(H1*(L*H1)');      % shift the center into the cut
Hn=H1*((1+0.5*ptau*(1-r))/sqrt(1+ptau));  % rescale to cover the intersection
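A minimal usage sketch (the data below are illustrative): one cut applied to the unit disk, with the volume ratio confirming that the new ellipsoid is strictly smaller:

H = eye(2); x0 = [0; 0];            % initial ellipsoid: the unit disk
L = [1 0]; r = 0.2;                 % cut along the first coordinate
[Hn, x0n] = ellips_cut(H, x0, L, r);
disp(x0n')                          % the new center moves into the cut
disp(abs(det(Hn))/abs(det(H)))      % volume ratio, strictly less than 1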
It can be shown that the volume of the ellipsoid E decreases at least by a factor of 1 − 0.5/n at each repetition of step (b), which proves the claimed convergence properties of the algorithm. It must be noted that, despite having remarkable provable convergence properties, most practical implementations of the ellipsoid algorithm turn out to be inferior to the alternatives, such as the interior point method.
Chapter 9
The Hahn-Banach Theorem
The Hahn-Banach Theorem, which states that a convex set with non-empty interior can be separated by a hyperplane from every point which is not an interior point of the set, is a very useful tool for proving results in the fields of linear algebra, functional analysis, and optimization. This chapter provides a proof of the theorem, as well as some counterexamples to its generalizations, and explores its applications in matrix algebra and optimization.
Definition 9.1 Let Ω be a convex subset of a real vector space V such that 0 is an interior point of Ω, in the sense that for every u ∈ V there exists r > 0 such that ru ∈ Ω. The Minkowski functional of Ω is the function p : V → [0, ∞) mapping every u ∈ V to

Ω̄ = {v ∈ V : p(v) ≤ 1}
Proof. If p(v) ≥ 1 then p(v + tv) = (1 + t)p(v) ≥ 1 + t > 1 (and hence v + tv ∉ Ω) for all t > 0, which means that v is not an interior point of Ω.

If p(v) < 1 then for every u ∈ V

p(v + tu) = (1 + t) p( (1/(1 + t))v + (t/(1 + t))u ) ≤ p(v) + tp(u) < 1
for every t ∈ IR, which means that p(u) ≥ f0(u) for all u ∈ U, where U is the one-dimensional linear subspace of V consisting of all vectors tu0 with t ∈ IR, and f0 : U → IR is the linear function defined by f0(tu0) = t. According to Theorem 9.2, there exists a linear function f : V → IR such that f(u0) = f0(u0) = 1, and p(v) ≥ f(v) for all v ∈ V. Hence f(v) ≤ p(v) ≤ 1 for all v ∈ Ω − w0, and also f(u0) = 1. Equivalently,

f(w) ≤ 1 + f(w0) = f(v0 − w0) + f(w0) = f(v0)  for all w ∈ Ω,

which means that f(v) = f(v0) defines the hyperplane separating Ω and v0.
Theorem 9.3 If a convex subset Ω of a real vector space V of finite dimension dim(V) = n < ∞ has no interior point, then there exists a linear function f : V → IR, not identically equal to zero, such that f(w1) = f(w2) for all w1, w2 ∈ Ω.
Proof. If Ω is empty, the conclusion holds automatically. Assuming that Ω ≠ ∅, fix an element w0 ∈ Ω, and let v1, . . . , vm be a linearly independent subset of Ω − w0 with the maximal number of elements (by construction m ≤ n).

Consider first the case when m = n. Let wk = vk + w0 for k = 1, . . . , n. Then every u ∈ V can be represented in the form

u = c1v1 + · · · + cnvn = −(c1 + · · · + cn)w0 + c1w1 + · · · + cnwn,

for

v = (1/(n + 1)) Σ_{k=0}^{n} wk

when t > 0 is small enough to satisfy the conditions

1/(n + 1) − t Σ_{k=1}^{n} ck ≥ 0,   1/(n + 1) + tck ≥ 0  (k = 1, . . . , n),
Theorem 9.3 can be used to strengthen the statement of the separation principle in the finite dimensional case.

Theorem 9.4 If Ω is a non-empty convex subset of a finite dimensional real vector space V and u ∈ V is not an element of Ω, then there exists a linear function f : V → IR such that f(w) ≤ f(u) for all w ∈ Ω, and f(w) ≠ f(u) for some w ∈ Ω.

Proof. Let U be the linear subspace of V containing all linear combinations of vectors w − u, where w ∈ Ω. Since 0 ∉ Ω − u, there exists a non-zero linear function g : U → IR such that g(w − u) ≤ 0 for all w ∈ Ω. By construction, g(w) ≤ g(u) for all w ∈ Ω, and g(w) ≠ g(u) for at least one w ∈ Ω. Let v1, . . . , vm be a basis in U. Extend it to a basis v1, . . . , vm, . . . , vn in V, and define f : V → IR according to

f(c1v1 + · · · + cnvn) = g(c1v1 + · · · + cmvm).

By construction, f is the desired linear functional.
which means that f (ek ) = 0 for all k, i.e. f (p) = 0 for all p.
is not completely satisfactory, though, as it does not describe well the robustness of the property being checked. For example, the linear equations in (9.4) frequently approximate a nonlinear system

ẋ(t) = Ax(t) + φ(x),  (9.5)

where φ : IRⁿ → IRⁿ is a continuous function bounded by

and ε > 0 is known to be small. Since there is no explicit formula for solving (9.5), the eigenvalue argument turns out to be insufficient.
An argument for robustness of the no return property can be given by obtaining a quadratic Lyapunov function V(x) = x'Px for system (9.4), which is strictly decreasing along all non-zero solutions of (9.4). Indeed, in this case the inequality V(x(T)) < V(x(0)) is guaranteed for x(t) ≢ 0, which makes the equality x(T) = x(0) impossible. The differentiation

clearly shows that V(x) = x'Px has the desired properties if and only if the n-by-n symmetric matrix P = P' satisfies the strict Lyapunov inequality

PA + A'P > 0.  (9.7)
Theorem 9.5 Let A be an n-by-n real matrix. Then the following conditions are equivalent:

(A) Let V be the real vector space of all real symmetric n-by-n matrices. Let Ω be the convex subset of V consisting of all matrices of the form PA + A'P − Q, where Q = Q' > 0 and P = P'.

(D) For every linear function f : V → IR there exists H ∈ V such that f(X) = trace(XH) for all X ∈ V.

(E) Let f : V → IR, f(X) = trace(XH), where H = H', be a non-zero linear function separating Ω from zero (it exists due to Theorem 8.5). Then

for all P = P' and for all Q = Q' > 0. Setting P = 0 yields trace(QH) ≥ 0 for every Q = Q' > 0, which means H ≥ 0. Setting P = tP0 for an arbitrary P0 ∈ V and letting t converge to ±∞ yields AH + HA' = 0. Our objective is to show that these conditions contradict the assumptions.
Consider A and H as linear operators on Cⁿ. Let U = R(H) ⊂ Cⁿ be the range of H. Since H ≠ 0, we have U ≠ {0}. Since AHv = −H(A'v) for every v ∈ Cⁿ, U is A-invariant. Let Hv be a non-zero eigenvector of the restriction of A to U, i.e. AHv = sHv, Hv ≠ 0. Then

0 = v'(AH + HA')v = v'(AHv) + (AHv)'v = 2Re(s)·v'Hv,

which, due to v'Hv ≠ 0, means that s is a purely imaginary eigenvalue of A. The contradiction proves Theorem 9.5.
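A hedged numerical companion to Theorem 9.5, assuming the Control System Toolbox's lyap (which solves A*X + X*A' + Q = 0): when A has no eigenvalues on the imaginary axis, a symmetric P satisfying (9.7) can be produced by solving a linear equation.

A = [-1 2; 0 -3];          % eigenvalues -1 and -3: none purely imaginary
P = lyap(A', -eye(2));     % solves A'*P + P*A = I
disp(eig(P*A + A'*P))      % both eigenvalues equal 1, so P*A + A'*P > 0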
as the task of finding the minimal upper bound p* of Cx subject to the inequality Ax ≤ B (understood component-wise), where x ranges over IRⁿ.

It is easy to see that every 1-by-m matrix p such that p ≥ 0 (component-wise) and pA = C provides an upper bound pB for p*, because Cx = pAx ≤ pB whenever Ax ≤ B. This leads to another optimization problem, called the dual of (9.8):
Example 9.3 If

A = [0; −1],  B = [−1; 0],  C = 1,

then there is no x ∈ IR such that Ax ≤ B, and there is no p ∈ IR^{1×2} such that pA = C and p ≥ 0.
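For a feasible instance the absence of a duality gap (established by the theorem below) is easy to observe numerically, e.g. with box constraints; the sketch assumes the Optimization Toolbox's linprog, which minimizes, so the primal objective is negated:

n = 3; C = [1 -2 3];                          % row vector, as in the text
A = [eye(n); -eye(n)]; B = ones(2*n,1);       % feasible region: |x_k| <= 1
x = linprog(-C', A, B);                       % primal: maximize C*x s.t. A*x <= B
p = linprog(B, [], [], A', C', zeros(2*n,1)); % dual: min p'*B, p >= 0, p'*A = C
disp([C*x, B'*p])                             % the two optimal values coincide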
The following statement, conveniently proven using the Hahn-Banach Theorem, establishes that there is no duality gap when the inequality Ax ≤ B has a solution.

Theorem 9.6 Let A, B, C be real matrices of dimensions m-by-n, m-by-1, and 1-by-n respectively, such that Ax0 ≤ B for some x0 ∈ IRⁿ. Then
Proof. To emphasize the essentials, the proof is given under a slightly stronger assumption that Ax0 < B for some x0 ∈ IRⁿ. Assume that y ∈ IR is such that y ≥ p*.
(A) Let

(D) For every linear function f : V → IR there exist q ∈ IR and h ∈ IR^{1×m} such that f(u, v) = qu + hv for all (u, v) ∈ V.
However, there are simple examples showing that the existence of Lagrange multipliers satisfying L'_p(x0) = 0 is not a necessary condition of optimality. For instance, with

the point x0 = 0 is the argument of the constrained minimum, but L'_p(0) = 1 for L_p(x) = x + px², for every p ∈ IR.

The following statement, proven using the separation principle, provides valid necessary conditions of optimality using Lagrange multipliers.
(a) there exist real scalars p_k ≥ 0 such that p_k g_k(x0) = 0 and L'_p(x0) = 0 for the function L_p : X → IR defined in (9.12);

where k(0) = 0.
(B) By assumption, 0 ∈ IR^{r+1} is not an element of Ω. Indeed, otherwise there exists v ∈ IRⁿ such that g'_{k(i)}(x0)v < 0 for all i = 0, 1, . . . , r, which means that, for a sufficiently small t > 0, all numbers

(E) If the functional in (9.15) is not identically equal to zero, and separates Ω and 0, in the sense that f(w) ≥ 0 for all w ∈ Ω, then

Σ_{k=0}^{m} d_k [g'_k(x0)v + ε_k] ≥ 0  for all ε_k > 0, v ∈ IRⁿ,

When d0 > 0, dividing the equality by d0 yields conclusion (a) with p_k = d_k/d0. When d0 = 0, dividing the equality by the sum d of the d_i yields (9.14) with p_k = d_k/d.
Chapter 10
Norms and Convergence
This chapter introduces major constructions associated with using limits of sequences of vectors. It covers strong convergence, defined in terms of norms, and weak convergence, defined in terms of duality (linear functionals), and explores applications of the standard notions of continuity, compactness, completeness, and separability. In contrast, the general issues of vector space topology (i.e. the abstract guidelines for defining and generalizing convergence while working with infinite dimensional vector spaces) are not discussed here.
10.1 Motivation
Properties of vector spaces associated with convergence are used extensively when dealing with existence of solutions of systems of linear and nonlinear equations, as well as optimization problems with an infinite number of variables. Convergence properties are also used as benchmarks for establishing and comparing performance of approximation and model reduction algorithms.
solution. For example, when x_k(0) = 0 for all k, one solution of (10.1) is obviously given by x_k(t) ≡ 0. However, for every function φ : IR → IR which is infinitely many times differentiable, and satisfies the condition φ^{(k)}(0) = 0 for all k, setting x_k = φ^{(k)} (the k-th derivative of φ) yields a valid set of continuously differentiable functions x_k satisfying the equations in (10.1). Since such a function can easily be not identically equal to zero, as in

φ(t) = exp(−1/t²) for t ≠ 0,   φ(t) = 0 for t = 0,

v̇(t) = lim_{δ→0, t+δ∈[0,T]} [v(t + δ) − v(t)]/δ.  (10.2)
Since V = ℓ2, as a Hilbert space, has the length |v| = (v, v)^{1/2} of its elements readily defined, the limit relation in (10.2) can be understood as the definition of v̇(t) as the element of V satisfying

lim_{δ→0, t+δ∈[0,T]} | [v(t + δ) − v(t)]/δ − v̇(t) | = 0.

It can be shown that, with this definition of the derivative, the differential equation v̇ = Av has a solution v : IR → ℓ2 satisfying the initial condition v(0) = v0 for every given v0 ∈ ℓ2.
The availability of existence and uniqueness of solution statements in this example is very much due to a proper selection of a real vector space and a convergence metric to represent sequences of real numbers. It is also possible to make a selection which, while reasonable in some other applications, leads to a lot of inconvenience in this case. For example, defining V as the real vector space of all real sequences {v(t)}_{t=0}^{∞}, with convergence defined component-wise (i.e. v_k → u in V iff v_k(t) → u(t) as k → ∞ for every fixed t ∈ ZZ+), brings one back to the original interpretation of (10.1), when each differential equation in (10.1) is considered separately, and uniqueness of solutions with fixed initial conditions does not take place.
In applications, Lemma 10.1 is used together with a (less trivial) classical statement
characterizing the compact subsets of IRn .
Theorem 10.1 A subset X ⊂ IRⁿ is compact if and only if it is closed (i.e. x_k → x and x_k ∈ X implies x ∈ X) and bounded (i.e. there exists r > 0 such that |x| ≤ r for all x ∈ X).
The proof of the following statement is a typical application of Lemma 10.1 and
Theorem 10.1.
Lemma 10.2 Let $a_1, \dots, a_m$ be a set of vectors in $\mathbb{R}^n$ such that every vector in $\mathbb{R}^n$ can be represented as a linear combination of the vectors $a_k$. Then there exists $\epsilon > 0$ such that
$$\sigma(x) \stackrel{\rm def}{=} \sum_{k=1}^{m} |a_k' x|^4 \ge \epsilon |x|^4 \qquad \forall\, x \in \mathbb{R}^n. \qquad (10.4)$$
Proof. Let $X$ be the unit sphere in $\mathbb{R}^n$, i.e. the set of all $x \in \mathbb{R}^n$ such that $|x| = 1$. Since $X$ is bounded and closed, it is compact. Since the function $\sigma$ is continuous on $X$, it achieves its minimum at a point $\bar x \in X$. Since $\bar x$ is a linear combination of the $a_k$, at least one real number $a_k' \bar x$ is not equal to zero, and hence $\sigma(\bar x) > 0$. Since both sides of (10.4) scale the same way when $x$ is replaced by $cx$, where $c \in \mathbb{R}$, the conclusion of the lemma follows with $\epsilon = \sigma(\bar x)$.
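The compactness argument can be checked numerically. The following sketch (added here for illustration; the vectors $a_k$ and the sampling scheme are arbitrary choices, not taken from the text) estimates $\epsilon$ by densely sampling the unit sphere:

```python
# Numerical sanity check of Lemma 10.2 (illustrative sketch):
# estimate eps = min of sigma(x) = sum_k |a_k' x|^4 over the unit sphere.
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 5
A = rng.standard_normal((m, n))   # rows a_k'; they span R^3 almost surely

X = rng.standard_normal((200000, n))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # points on the unit sphere
sigma_vals = np.sum(np.abs(X @ A.T) ** 4, axis=1)
print("estimated eps =", sigma_vals.min())      # strictly positive
```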
Examination of the proof of Lemma 10.1 shows that its statement has little to do with the fact that $X$ is a subset of $\mathbb{R}^n$: the only important thing is that a notion of convergence to a limit should be defined for sequences of elements of $X$, in such a way that, with respect to this definition of convergence, $X$ is sequentially compact, and the function being minimized is lower semi-continuous on $X$. It turns out that, in contrast with the finite dimensional real vector space $\mathbb{R}^n$, where all reasonable definitions of convergence turn out to be equivalent, infinite dimensional vector spaces allow substantially different definitions of convergence. Which definition of convergence to use depends on the specifics of a particular application: giving a definition in which more sequences have a limit leads to more sets being compact, at the expense of allowing fewer lower semi-continuous functions.
$$\Delta_v \stackrel{\rm def}{=} \{t \in \mathbb{R} : tv \in \Omega\} = [-r_v, r_v], \quad \text{where } r_v \in (0, \infty),$$
for every $v \in V$). This offers a convenient way of verifying that a specific function $p : V \to [0, \infty)$ is a norm.

According to a common convention, the values of norms are denoted using the double vertical bar signs, as in $p(v) = \|v\|$. This could be inconvenient when several norms on the same real vector space are used simultaneously. To avoid ambiguity in the notation, an index can be used to indicate a specific norm, as in $p(v) = \|v\|_p$ or $p(v) = \|v\|_V$.
where $p \in [1, \infty)$ is a parameter. The main step in proving that $\|\cdot\|_p$ is indeed a norm relies on convexity of the function $f : \mathbb{R} \to \mathbb{R}$ defined by $f(y) = |y|^p$ (hence $x \mapsto \|x\|_p^p$ is convex, hence the level set $\{x : \|x\|_p \le 1\}$ is convex, hence $\|\cdot\|_p$ is convex, as a homogeneous function). Symmetry and positive definiteness of $\|\cdot\|_p$ are obvious. Note that $x \mapsto \|x\|_p$ is not convex (and hence is not a norm) when $p < 1$.
The limit
$$\left\| \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} \right\|_{\infty} \stackrel{\rm def}{=} \lim_{p \to +\infty} \left\| \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} \right\|_{p} = \max_{k} |x_k|$$
is also a norm on $\mathbb{R}^n$. The norm $\|\cdot\|_{\infty}$ is the Minkowski functional of the hypercube
$$\left\{ \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} \in \mathbb{R}^n : \max_k |x_k| \le 1 \right\}.$$
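This limit is easy to observe numerically; the following added sketch (the test vector is an arbitrary choice) prints the $p$-norms of a fixed vector for growing $p$:

```python
# k.k_p approaches the max-norm as p grows (illustrative sketch).
import numpy as np

x = np.array([1.0, -3.0, 2.0])
for p in [1, 2, 4, 16, 64, 256]:
    print(p, np.sum(np.abs(x) ** p) ** (1.0 / p))
print("max-norm:", np.max(np.abs(x)))   # the limit, here 3.0
```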
Example 10.3 The idea of the $\|\cdot\|_p$-norms can be extended to sets of functions. For example, on the real vector space $V = C[0,1]$ of all continuous functions $v : [0,1] \to \mathbb{R}$ one can define the norms
$$\|v\|_p \stackrel{\rm def}{=} \left( \int_0^1 |v(t)|^p \, dt \right)^{1/p}$$
for all $p \in [1, \infty)$, with
$$\|v\|_{\infty} \stackrel{\rm def}{=} \max_{t \in [0,1]} |v(t)|$$
being the limit case. There is no a-priori preference in choosing which norm to use. The $\|v\|_{\infty}$ (L-Infinity) norm represents the peak value of $|v|$, which is suitable for worst case analysis problems. The $\|v\|_2$ (L2) norm, being generated by a quadratic form, is typically by far the easiest to work with. Using the $\|v\|_1$ (L1) norm as a measure of error in approximating a complex function by a large number of simpler ones is commonly believed to aid in reducing the number of non-zero coefficients in the resulting approximation.
Example 10.4 Let $U$ and $V$ be real vector spaces equipped with norms $\|\cdot\|_U$ and $\|\cdot\|_V$ respectively. A linear function $A : U \to V$ is called bounded (with respect to these norms) if there exists a constant $c > 0$ such that $\|Au\|_V \le c\|u\|_U$ for all $u \in U$. The maximal lower bound (infimum) of such constants $c$ is called the operator norm of $A$ (induced by $\|\cdot\|_U$ and $\|\cdot\|_V$). It is easy to check that the operator norm is indeed a norm on the real vector space $W = L(U,V)$ of all bounded linear functions $A : U \to V$. In most cases, operator norms cannot be defined by quadratic forms.
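In the finite dimensional case the induced norms can be computed explicitly; the following added sketch (the test matrix is arbitrary) uses the standard matrix-analysis formulas for the norms induced by $\|\cdot\|_1$, $\|\cdot\|_2$, and $\|\cdot\|_{\infty}$:

```python
# Induced operator norms of a matrix A (illustrative sketch).
import numpy as np

A = np.array([[1.0, -2.0],
              [3.0,  0.5]])
norm_1   = np.abs(A).sum(axis=0).max()            # induced by k.k_1: max column sum
norm_inf = np.abs(A).sum(axis=1).max()            # induced by k.k_inf: max row sum
norm_2   = np.linalg.svd(A, compute_uv=False)[0]  # induced by k.k_2: top singular value
print(norm_1, norm_inf, norm_2)
```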
Example 10.5 Consider the sequence of elements $e_k \in \ell^2$, where $\ell^2$ is the standard real Hilbert space of all functions $v : \mathbb{Z}_+ \to \mathbb{R}$ such that
$$|v|^2 \stackrel{\rm def}{=} \sum_{t=0}^{\infty} |v(t)|^2 < \infty,$$
defined by
$$e_k(t) = \begin{cases} 1, & t = k, \\ 0, & t \neq k. \end{cases}$$
While the sequence is bounded, in the sense that $|e_k| = 1$ for all $k$, it has no (strongly) convergent subsequence, since $|e_i - e_k| = \sqrt{2}$ whenever $i \neq k$.
Definition 10.3 Let $F \subset V^{\sharp}$ be a total set of linear functionals on a real vector space $V$. A sequence $\{v_k\}_{k=1}^{\infty}$ of $v_k \in V$ is said to converge (weakly, with respect to $F$) to a limit $w \in V$ (notation $v_k \stackrel{F}{\to} w$) if and only if $f(v_k) \to f(w)$ as $k \to \infty$ for every $f \in F$.
Indeed, the expansion in (10.5) converges for every pair $v, w \in V$ because the function $g : \mathbb{R} \to \mathbb{R}$ defined by
$$g(y) = \frac{|y|}{1 + |y|}$$
takes values in $[0,1]$ only. Since $g(y) = 0$ implies $y = 0$, and $F$ is total, $\rho(v, w) = 0$ if and only if $v = w$. Since $g(y_1 + y_2) \le g(y_1) + g(y_2)$ for all real $y_1, y_2$, $\rho$ is a metric on $V$. Finally, by construction, $\rho(v_k, w) \to 0$ as $k \to \infty$ if and only if $f_n(v_k - w) \to 0$ for all $n$.
The most commonly used examples of duality-based convergence are associated with normed vector spaces. Let $V$ be a real vector space with norm $\|\cdot\| : V \to [0,\infty)$. Let $V^*$ be the real vector space of all bounded linear functionals $f : V \to \mathbb{R}$ (when $V$ is a Hilbert space defined by the quadratic form $\sigma(v) = \|v\|^2$, we know that $V^*$ is, essentially, the same thing as $V$). The weak convergence in $V$ (notation $v_k \stackrel{w}{\to} u$ for the sequence $\{v_k\}_{k=1}^{\infty}$ of vectors $v_k \in V$ to converge weakly to $u \in V$) is the duality convergence defined by $F = V^*$. The weak* convergence in $V^*$ (notation $v_k \stackrel{w*}{\to} u$ for the sequence $\{v_k\}_{k=1}^{\infty}$ of $v_k \in V^*$ to weak* converge to $u \in V^*$) is the convergence defined by the set $F$ of all functionals $g_v : V^* \to \mathbb{R}$ of the form $g_v(f) = f(v)$, where $v$ ranges over $V$. Since $V^*$, equipped with the operator norm, is a normed space itself, there is also the standard weak convergence of sequences from $V^*$ which, in general, is not the same as the weak* convergence.
It is an easy observation that weak convergence is always implied by the strong one. While in general the weak convergence $v_k \stackrel{w}{\to} u$ does not imply strong convergence, it does imply boundedness of the norms $\|v_k\|$.

Theorem 10.3 If $V$ is a normed real space and $v_k \stackrel{w}{\to} u$ in $V$ then there exists a constant $r \in \mathbb{R}$ such that $\|v_k\| \le r$ for all $k$.
Theorem 10.3 has a nice proof based on the closed graph theorem, a general principle of proving functional analysis statements to be studied in the following lecture.
Example 10.7 Let $V = \ell^2$ be the Hilbert space of square summable sequences $v = \{v(t)\}_{t=0}^{\infty}$ of real numbers $v(t) \in \mathbb{R}$ with the norm
$$\|v\| = |v| = \left( \sum_{t=0}^{\infty} |v(t)|^2 \right)^{1/2}.$$
The fact that for every bounded linear functional $f : \ell^2 \to \mathbb{R}$ there exists $u \in \ell^2$ such that $f(v) = (u, v)$ and $\|f\| = |u|$ can be interpreted as the identity $(\ell^2)^* = \ell^2$. Therefore the weak and weak* convergence definitions on $\ell^2$ are equivalent. Using the unit sample functions $e_k(t) = \delta_{kt}$, it is easy to see that $v_k \stackrel{w}{\to} u$ in $\ell^2$ implies $v_k(t) \to u(t)$ for every fixed $t \in \mathbb{Z}_+$.
According to Theorem 10.3, the weak convergence also implies the existence of a finite upper bound for $|v_k|$. A simple derivation shows that these two conditions are not only necessary but also sufficient for weak convergence:
$$v_k \stackrel{w}{\to} u \ \text{ in } \ell^2 \quad \Longleftrightarrow \quad \sup_k |v_k| < \infty \ \text{ and } \ \lim_{k \to \infty} v_k(t) = u(t) \ \ \forall\, t.$$
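The criterion can be observed numerically on the unit samples from Example 10.5; the sketch below (added here; truncated vectors stand in for elements of $\ell^2$) shows $(u, e_k) \to 0$ for a fixed $u \in \ell^2$ while $|e_k| = 1$:

```python
# Weak but not strong convergence of the unit samples e_k in l2 (sketch).
import numpy as np

T = 10000                              # truncation length standing in for infinity
u = 1.0 / np.arange(1, T + 1)          # a fixed square-summable sequence
for k in [1, 10, 100, 1000]:
    e_k = np.zeros(T)
    e_k[k] = 1.0
    print(k, np.dot(u, e_k), np.linalg.norm(e_k))   # (u, e_k) -> 0, |e_k| = 1
```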
The difference between many (though not all) possible ways of introducing convergence
on vector spaces does not manifest itself in the finite dimensional case, as demonstrated
by the following result.
Theorem 10.4 Let $V$ be a real vector space with a finite basis $b = (u_1, \dots, u_n)$. Let $p : V \to [0, \infty)$ be a norm. Let $F \subset V^{\sharp}$ be a total subset of linear functionals $f : V \to \mathbb{R}$. Then for every sequence $\{v_k\}_{k=1}^{\infty}$ of vectors $v_k \in V$ and for every $w \in V$ the following conditions are equivalent:

(a) $p(w - v_k) \to 0$ as $k \to \infty$;

(b) $f(w - v_k) \to 0$ as $k \to \infty$ for every $f \in F$;

(c) $c_{ik} \to 0$ as $k \to \infty$ for every $i \in \{1, \dots, n\}$, where $c_{ik} \in \mathbb{R}$ are the coefficients of the linear decomposition
$$w - v_k = \sum_{i=1}^{n} c_{ik} u_i.$$
$$\|h\|_{\infty} \stackrel{\rm def}{=} \sup_{t \in \mathbb{Z}_+} |h(t)| < \infty,$$
$$f(v) = \sum_{t=0}^{\infty} h(t) v(t) \quad \forall\, v \in \ell^1, \qquad (10.6)$$
$$\|f\| = \|h\|_{\infty}.$$
Essentially, Theorem 10.5 establishes that, for $V = \ell^1$, its dual space $V^*$ can be associated with the real vector space $\ell^{\infty}$ of all uniformly bounded real sequences $h = \{h(t)\}_{t=0}^{\infty}$, equipped with the norm $h \mapsto \|h\|_{\infty}$.
Proof. Let $f : \ell^1 \to \mathbb{R}$ be a bounded linear functional. For all $t \in \mathbb{Z}_+$ define $h(t) = f(e_t)$, where
$$e_t(k) = \begin{cases} 1, & k = t, \\ 0, & k \neq t, \end{cases}$$
so that
$$|h(t)| \le \|f\| \, \|e_t\|_1 \le \|f\|.$$
By construction,
$$\|v - P_T v\|_1 = \sum_{t > T} |v(t)|$$
for every $h \in \ell^{\infty}$. This criterion is not very useful, though, as it can be tricky to apply, due to the arbitrariness of the sequence $h \in \ell^{\infty}$. A more detailed analysis of weak convergence in $\ell^1$ reveals the following (slightly disappointing) result.

Theorem 10.6 Weak convergence $v_k \stackrel{w}{\to} u$ takes place in $\ell^1$ if and only if $v_k \to u$ in the $\|\cdot\|_1$ norm, i.e. $\|v_k - u\|_1 \to 0$ as $k \to \infty$.
Proof. Assume to the contrary that $v_k \stackrel{w}{\to} u$ but there exists $\epsilon > 0$ such that $\|v_k - u\|_1 \ge \epsilon$ for arbitrarily large $k$. Since by assumption $|v_k(t) - u(t)| \to 0$ for every fixed $t$, this means that for every $\tau \in \mathbb{Z}_+$ there exist $n = n(\tau)$ and $r = r(\tau) > \tau$ such that
$$\sum_{t=\tau}^{r} |v_n(t) - u(t)| \ge 0.9 \|v_n - u\|_1 \ge 0.5\epsilon.$$
Let $h \in \ell^{\infty}$ be defined by
$$h(t) = \begin{cases} 1, & v_{n(k)}(t) > u(t), \ \tau(k) \le t < \tau(k+1), \\ -1, & v_{n(k)}(t) < u(t), \ \tau(k) \le t < \tau(k+1), \\ 0, & v_{n(k)}(t) = u(t), \ \tau(k) \le t < \tau(k+1). \end{cases}$$
By construction,
$$\sum_{t=0}^{\infty} h(t)(v_{n(k)}(t) - u(t)) \ge 0.8 \|v_{n(k)} - u\|_1 \ge 0.4\epsilon$$
does not converge to zero. The contradiction proves Theorem 10.6.
Note that component-wise convergence $v_k(t) \to u(t)$ is not enough for the weak* convergence $v_k \stackrel{w*}{\to} u$ in $\ell^{\infty}$: uniform boundedness is another necessary condition.
Proof. The implication (b)⇒(a) is a standard application of convergence of infinite sums: for every $w \in \ell^1$ the sequences $t \mapsto w(t)(v_k(t) - u(t))$ have the same index-independent summable upper bound
$$|w(t)(v_k(t) - u(t))| \le |w(t)| (c + \|u\|_{\infty}),$$
and hence
$$\lim_{k \to \infty} \sum_{t=0}^{\infty} w(t)(v_k(t) - u(t)) = \sum_{t=0}^{\infty} \lim_{k \to \infty} w(t)(v_k(t) - u(t)) = 0.$$
To prove that (a) implies (b) assume that $v_k \stackrel{w*}{\to} u$, i.e.
$$\lim_{k \to \infty} \sum_{t=0}^{\infty} w(t)(v_k(t) - u(t)) = 0$$
for all $w \in \ell^1$. Using $w = e_k$ (where $e_k$ are defined as in the proof of Theorem 10.5) with this identity yields $v_k(t) \to u(t)$ for every fixed $t \in \mathbb{Z}_+$. The uniform boundedness of $\|v_k\|_{\infty}$ follows from Theorem 10.3.
converges to zero in the weak* sense, while the sequence $v_k = k u_k$ does not converge anywhere in the weak* sense.

Since, in contrast with the relations such as $(\ell^2)^* = \ell^2$ and $(\ell^1)^* = \ell^{\infty}$, there is no explicit characterization of the dual space $U^*$ for $U = \ell^{\infty}$, the weak topology of $\ell^{\infty}$, in contrast with its weak* topology, is difficult to describe. It is easy to see that, in $\ell^{\infty}$, weak convergence is a strictly stronger requirement than weak* convergence.
Example 10.8 The sequence $u_k$ defined in (10.7) does not converge to zero weakly in $\ell^{\infty}$. To prove this, consider a Banach limit functional, i.e. any linear function $f : \ell^{\infty} \to \mathbb{R}$ such that $|f(v)| \le \|v\|_{\infty}$ for all $v \in \ell^{\infty}$, and
$$f(v) = \lim_{t \to \infty} v(t)$$
whenever the limit on the right side exists.
10.3.1 Completeness
A real normed space $V$ is called complete if every Cauchy sequence in $V$ has a limit. A similar notion of completeness can be introduced in other situations when convergence of sequences to a limit is defined on a real (or complex) vector space, but the case of normed spaces is by far the most useful one. Completeness is a required assumption in many fundamental theorems of functional analysis. A complete normed space is frequently called a Banach space.
Hilbert spaces form a special class of Banach spaces: availability of a scalar product
is a key element of many proofs. On the other hand, many application problems (in
particular, those involving approximation of linear transformations) do not allow for an
adequate interpretation within a Hilbert space framework, which usually makes using
more general Banach spaces a requirement.
The standard examples of Banach spaces are the vector spaces $\ell^p$ and $L^p[0,1]$ (where $1 \le p \le \infty$), as well as $C[0,1]$ (equipped with the max-norm), which usually supplies enough building blocks for modeling with Banach spaces. Naturally, a closed subspace of a Banach space is a Banach space as well. The following theorem generalizes a similar Hilbert space result, and presents a less obvious construction of a complete normed space.
Theorem 10.8 For every normed real vector space $V$ its dual $V^*$ is complete.
$g$ is bounded and $\|g - f_k\| \to 0$.
Example 10.9 Completeness depends on the norm being used. In particular, the real vector space $C[0,1]$ of all continuous functions $v : [0,1] \to \mathbb{R}$ is complete with respect to the max-norm. To see the incompleteness with respect to the $|\cdot|$-norm, consider the sequence of functions
$$v_k(t) = \begin{cases} (2t)^k, & t \le 0.5, \\ 1, & t \ge 0.5. \end{cases}$$
It is easy to see that $\{v_k\}$ is a Cauchy sequence with respect to the $|\cdot|$ norm, but the pointwise limit of $v_k$ is the unit step function with a discontinuity at $t = 0.5$, which is not an element of $C[0,1]$.
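A numerical check of this example (an added sketch; the grid and indices are arbitrary choices): the L2 distances between the $v_k$ shrink, even though the pointwise limit leaves $C[0,1]$.

```python
# v_k from Example 10.9 is Cauchy in the L2 norm (illustrative sketch).
import numpy as np

t, dt = np.linspace(0.0, 1.0, 100001, retstep=True)

def v(k):
    return np.where(t <= 0.5, (2.0 * t) ** k, 1.0)

def l2_dist(f, g):
    return np.sqrt(np.sum((f - g) ** 2) * dt)   # Riemann-sum approximation

for k in [5, 10, 20, 40, 80]:
    print(k, l2_dist(v(k), v(2 * k)))           # -> 0, yet the limit is a step
```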
10.3.2 Continuity
A definition of sequential continuity of a function $\Phi : X \to Y$ is available whenever convergence is defined for sequences of elements of $X$ and $Y$: the function $\Phi : X \to Y$ is called continuous if and only if $x_k \to x$ in $X$ implies $\Phi(x_k) \to \Phi(x)$ in $Y$.
In this subsection, we are interested in applying this concept to subsets of real vector
spaces. Since infinite dimensional vector spaces usually have several different meaningful
types of convergence available, there are different notions of continuity as well.
For linear functions $f : U \to V$ mapping one normed space into another, there are important cases when continuity does not depend on the types of convergence being used.
(c) A is continuous with respect to the strong convergence in U and weak convergence
in V ;
(d) A is bounded.
In particular, Theorem 10.9 implies that the standard operations of addition and
scaling are continuous with respect to both strong and weak convergence (though not with
respect to the mixture of weak convergence of the arguments and the strong convergence
of the output).
which means that (d) implies both (a) and (c). Similarly, since $A$ is bounded, for every $g \in V^*$ the composition $g \circ A$, mapping $u \in U$ to $g(Au) \in \mathbb{R}$, is linear and bounded as well. Hence if $f(u_k) \to f(w)$ for every bounded linear function $f : U \to \mathbb{R}$ then
Example 10.10 The identity operator on $\ell^2$ is not completely continuous. Indeed, the sequence of $e_k \in \ell^2$ defined by $e_k(t) = \delta_{kt}$ converges weakly to zero, but does not converge strongly. In contrast, the function $A : \ell^2 \to \ell^2$ mapping a sequence $v(t)$ to the sequence $u(t) = e^{-t} v(t)$ is completely continuous.

A very important example of a completely continuous function on $L^2[0,1]$ is given by the integration operator, mapping $v = v(t)$ to
$$u(t) = \int_0^t v(\tau)\, d\tau.$$
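The contrast can be seen on the weakly-null sequence $e_k$; the following added sketch (with truncated vectors standing in for $\ell^2$ elements) shows that the multiplication operator sends it to a strongly-null sequence:

```python
# Complete continuity of (Av)(t) = exp(-t) v(t) on the sequence e_k (sketch).
import numpy as np

T = 50
w = np.exp(-np.arange(T))              # the multiplier exp(-t)
for k in [1, 5, 10, 20]:
    e_k = np.zeros(T)
    e_k[k] = 1.0
    print(k, np.linalg.norm(e_k), np.linalg.norm(w * e_k))  # 1.0 vs exp(-k) -> 0
```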
and hence $|v_k| \to |u|$. On the other hand, the unit sample vectors $e_k \in \ell^2$ defined by $e_k(t) = \delta_{kt}$ converge weakly to zero, while $|e_k| = 1$ for all $k$.
Example 10.12 Let $V$ be the normed space of all continuous functions $v : [0,1] \to \mathbb{R}$, equipped with the L2 norm
$$\|v\| = \left( \int_0^1 |v(t)|^2\, dt \right)^{1/2}.$$
(b) $A$ is continuous with respect to the weak convergence (in both argument and value) if and only if there exist constants $a_0, a_1 \in \mathbb{R}$ such that
$$\phi(y) = a_0 + a_1 y \quad \forall\, y \in \mathbb{R};$$

(c) $A$ is completely continuous if and only if there exists a constant $a_0$ such that
$$\phi(y) = a_0 \quad \forall\, y \in \mathbb{R};$$
Example 10.13 Let $V$ be the normed space from Example 10.12. Let $W$ be the real vector space $C[0,1]$ equipped with the $\|\cdot\|_{\infty}$ norm. The function $A : V \to W$ mapping $v \in V$ to $w = Av \in W$ according to
$$w(t) = \left( \int_0^t v(\tau)\, d\tau \right)^2$$
is continuous with respect to the weak convergence in $V$ and the strong convergence in $W$.
10.3.3 Compactness
A set $X$, for the elements of which a notion of convergence is available, is called sequentially compact if every sequence $\{x_k\}_{k=1}^{\infty}$ of elements $x_k \in X$ contains a subsequence $\{y_k\}_{k=1}^{\infty}$, where $y_k = x_{n(k)}$, $n(k) \to \infty$ as $k \to \infty$, converging to a limit $z \in X$.
For the subsets of $\mathbb{R}^n$, equipped with the usual definition of convergence, sequential compactness is equivalent to compactness, and a subset of $\mathbb{R}^n$ is compact if and only if it is both closed and bounded.
As is demonstrated by the example of $\ell^1$, where the unit ball is not sequentially compact with respect to either strong or weak convergence, establishing compactness is not as easy in the infinite dimensional case.
The following theorem is a major tool for proving compactness in infinite dimensional vector spaces.
Theorem 10.10 Let $V$ be a normed real vector space. Assume that $V$ contains a countable subset $\Delta$ such that every vector $v \in V$ is a limit of a sequence of elements of $\Delta$. Then the unit ball of $V^*$ is sequentially compact with respect to the weak* convergence.

A subset $\Delta$ satisfying the conditions of Theorem 10.10 is called a countable dense subset of $V$. A normed space $V$ which has a countable dense subset is sometimes referred to as separable.
Proof. Let $\{f_k\}_{k=0}^{\infty}$ be a sequence of linear functionals $f_k \in V^*$ such that $\|f_k\| \le 1$ for all $k$. We need to prove the existence of a subsequence $g_k = f_{n(k)}$, where $n(k) \to \infty$, such that $g_k(v) \to g(v)$ for all $v \in V$.

Let $\{w_m\}_{m=1}^{\infty}$ be an enumeration of all elements of $\Delta$. Let us construct a strictly monotonically increasing function $n = n(k)$ in such a way that $f_{n(k)}(w_m)$ converges to a limit $g_m$ as $k \to \infty$. To do this, use the standard diagonalization argument, by defining $r : \mathbb{Z}_+ \times \mathbb{Z}_+ \to \mathbb{Z}_+$ according to the following rules:
Example 10.14 The normed spaces $\ell^p$ are separable for $1 \le p < \infty$ (in particular, the countable set consisting of all sequences of rational numbers with a finite number of non-zero elements is dense in $\ell^p$ for $1 \le p < \infty$). In contrast, $\ell^{\infty}$ is not separable. To see this, consider the set $X \subset \ell^{\infty}$ of all binary sequences $v \in \ell^{\infty}$ (i.e. such that $v(t) \in \{0,1\}$ for all $t \in \mathbb{Z}_+$). Since $\|v - u\|_{\infty} = 1$ for every two different elements of $X$, a single vector $w \in \ell^{\infty}$ cannot satisfy the inequality $\|w - v\|_{\infty} < 0.5$ for more than one element of $X$. Since $X$ is uncountable, this means that the elements of a fixed countable subset of $\ell^{\infty}$ cannot approximate all elements of $X$ arbitrarily well.
Since $\ell^1$ and $\ell^2$ are separable normed spaces, the unit balls of the corresponding dual spaces (respectively $\ell^{\infty}$ and $\ell^2$) are sequentially compact with respect to the weak* topology. This observation can be used to prove the existence of an optimal solution in optimization problems involving an infinite number of decision parameters.
Example 10.15 Consider the task of designing a control sequence aimed at minimizing the maximal input/output amplitude of a given discrete time dynamical system given by the equations
$$y(t+2) + 3y(t+1) + y(t)^2 = u(t) \quad \forall\, t \in \mathbb{Z}_+, \qquad y(0) = 1, \ y(1) = 1. \qquad (10.8)$$
Formally, the task is to find sequences $u(\cdot)$ and $y(\cdot)$ satisfying (10.8) for which the value of the performance functional
$$\phi(u(\cdot), y(\cdot)) = \sup_{t \ge 0} \{|u(t)| + |y(t)|\} \qquad (10.9)$$
is minimal. The functional $\phi$ is the minimal upper bound of a family of bounded linear functionals. Since the minimal upper bound of a limit is never larger than the limit of minimal upper bounds, $\phi$ is lower semicontinuous with respect to the weak* convergence, which guarantees the existence of a minimizer $(\bar u, \bar y)$ with $\phi(\bar u, \bar y) = \inf \phi$.
Chapter 11
Feasibility Of Equations
This chapter presents methods for establishing feasibility and robustness of solutions of linear and nonlinear equations with a (potentially) infinite number of variables. The approach relies on explicit construction of Cauchy sequences of vectors, converging to a desired solution (when proving feasibility) or to a parameter vector for which the equation has no solution (when deriving robustness from feasibility by contradiction).
One idea to be introduced is that of uniform boundedness, claiming that a system of linear equations $Au = v$ (to be solved for an element $u$ of a Banach space $U$ for a given element $v$ of another Banach space $V$), defined by a bounded linear function $A$, and feasible for every $v \in V$, admits a solution $u$ bounded by $\|u\| \le c\|v\|$, where the constant $c$ does not depend on $v$.
Another key statement to be presented is a version of the inverse function theorem, claiming, roughly speaking, that a nonlinear equation $a(u) = v$ has a solution $u \approx u_0$ for all $v \approx v_0 = a(u_0)$, as long as $a$ is continuously Frechet differentiable at $u_0$, and the range $R(A)$ of its Frechet derivative $A$ is the whole Banach space $V$ of all possible values of $v$. The inverse function theorem plays a key role in deriving necessary conditions of optimality in infinite dimensional minimization subject to a (possibly infinite) number of equality and inequality constraints.
Example 11.1 Let $\ell^1$ be the standard Banach space of all functions $x : \mathbb{Z}_+ \to \mathbb{R}$ such that
$$\|x\| \stackrel{\rm def}{=} \sum_{t=0}^{\infty} |x(t)| < \infty.$$
Informally, $\gamma(v)$ is the minimal norm of a solution $u$ of the linear equation $Au = v$ (if one ignores the fact that the minimum might not be achievable), unless the equation $Au = v$ has no solutions, in which case $\gamma(v) \stackrel{\rm def}{=} +\infty$. By construction, the function $\gamma$ achieves a minimal upper bound of $+\infty$ if and only if the equation $Au = v$ has no solution $u \in U$ for some $v \in V$. The following theorem, known as the interior mapping principle, claims that $\gamma$ achieves its minimal upper bound whenever $\sup \gamma = +\infty$, assuming that the linear function $A$ is bounded and the normed spaces $U, V$ are complete.
Theorem 11.1 Let $A : U \to V$ be a bounded linear function mapping one Banach space to another, such that $V = R(A) \stackrel{\rm def}{=} A(U)$. Then there exists $r \in \mathbb{R}$ such that for every $v \in V$ the equation $Au = v$ has a solution $u \in U$ with $\|u\| \le r\|v\|$.

In other words, Theorem 11.1 claims that a feasible system of linear equations defined by a bounded linear function on Banach spaces has a uniformly bounded solution. The statement can be appreciated better by realizing that, in general, its assumptions guarantee neither the existence of a minimizer of $\|u\|$ subject to $Au = v$, nor the existence of a bounded linear function $B : V \to U$ such that $ABv = v$ for all $v \in V$.
Proof. Let
$$D = \{v \in V : \|v\| \le 1\}, \qquad D_h = \{u \in U : \|u\| \le h\}$$
be the closed unit ball in $V$ and the closed ball of radius $h$ in $U$, centered at the origin. The first step is to prove that there exists $h > 0$ such that every $v \in D$ can be approximated arbitrarily well by $Au$ with $u \in D_h$, i.e. that the error $\|Au - v\|$ can be made arbitrarily small while using $u \in D_h$. This is done by assuming the contrary, i.e. that
(*) for every $h > 0$ there exist $\epsilon > 0$ and $v_h \in D$ such that $\|Au - v_h\| > \epsilon$ for all $u \in D_h$.

Note that (*) implies the following, seemingly stronger, statement:

(**) for every $h > 0$, every open ball $B_d(v_0) \subset D$ of positive radius $d > 0$ contains a vector $v$ which cannot be approximated arbitrarily well by $Au$ with $u \in D_h$.
Indeed, otherwise for every $w \in D$ the norms $\|v_0 - Au_0\|$ and $\|v_0 + dw - Au\|$ can be made arbitrarily small with some $u, u_0 \in D_h$, which means that
$$\|w - A\tilde u\| = \frac{\|v_0 + dw - Au - (v_0 - Au_0)\|}{d} \le \frac{\|v_0 + dw - Au\| + \|v_0 - Au_0\|}{d}$$
can be made arbitrarily small with
$$\tilde u \stackrel{\rm def}{=} \frac{u - u_0}{d} \in D_{2h/d},$$
which contradicts (*).
Using (**), construct a sequence of vectors $w_k \in V$ and positive numbers $d(k) \in \mathbb{R}$ ($k = 0, 1, \dots$) satisfying the inclusion $B_{d(k)}(w_k) \subset D$ for all $k$ by setting $w_0 = 0$, $d(0) = 1$, and for $k = 0, 1, 2, \dots$ defining $w_{k+1}$ as the vector in $B_{d(k)}(w_k)$ which cannot be approximated arbitrarily well by $Au$ with $u \in D_k$ (such $w = w_{k+1}$ exists due to (**)), and choosing $d(k+1) \in (0, d(k)/2)$ small enough so that $B_{d(k+1)}(w_{k+1}) \subset B_{d(k)}(w_k)$ and $\|Au - w_{k+1}\| > d(k+1)$ for all $u \in D_k$. By construction, $\{w_k\}_{k=0}^{\infty}$ is a Cauchy sequence in $V$ which converges to a limit $w$ such that $w \in B_{d(k)}(w_k)$ for all $k$. Since none of the elements of $B_{d(k)}(w_k)$ belongs to the closure of $AD_k$, this implies $w \notin R(A)$: a contradiction which proves that (*) is not valid.
The second part of the proof assumes that for every $v \in D$ there exists $u \in D_h$ such that $\|Au - v\| \le 0.5$ which, by homogeneity, is equivalent to the existence, for every $v \in V$, of $u \in U$ such that $\|u\| \le h\|v\|$ and $\|Au - v\| \le 0.5\|v\|$; it constructs, for every $v_* \in V$, a sequence $u_0, u_1, \dots$ converging to a vector $u_* \in U$ such that $Au_* = v_*$ and $\|u_*\| \le 2h\|v_*\|$. Let $u_0 = 0$. For $k = 0, 1, 2, \dots$ define $u_{k+1} = u_k + u$, where $u \in U$ is such that $\|u\| \le h\|v_* - Au_k\|$ and $\|Au - (v_* - Au_k)\| \le 0.5\|v_* - Au_k\|$. Then $\|v_* - Au_k\| \le 2^{-k}\|v_*\|$ and $\|u_{k+1} - u_k\| \le h\, 2^{-k}\|v_*\|$, which implies
$$\|u_k - u_m\| \le h\, 2^{1-k} \|v_*\| \quad \text{for } m > k > 0.$$
Therefore $\{u_k\}_{k=0}^{\infty}$ is a Cauchy sequence. Since $U$ is complete, $\|u_k - u_*\| \to 0$ for some $u_* \in U$ such that $\|u_*\| \le 2h\|v_*\|$. Since $A$ is bounded, $Au_* = v_*$.
Example 11.2 Let $V_1$ and $V_2$ be two closed linear subspaces of a Banach space $V$. Assume that every vector $v \in V$ can be represented as a sum $v = v_1 + v_2$, where $v_k \in V_k$ for $k = 1, 2$, though not necessarily in a unique way. Does it follow that the equation $v = v_1 + v_2$ can be solved for $v_k \in V_k$ with the solution $(v_1, v_2)$ being bounded, as in $\|v_k\| \le c\|v\|$?
Due to Theorem 11.1, the answer is affirmative. Indeed, as closed linear subspaces of a Banach space $V$, $V_1$ and $V_2$ are complete. Hence the real vector space $U = V_1 \times V_2$ of all pairs $(v_1, v_2)$, $v_k \in V_k$, equipped with the norm $\|(v_1, v_2)\| = \max\{\|v_1\|, \|v_2\|\}$, is a Banach space. The map $A : U \to V$ defined by $A(v_1, v_2) = v_1 + v_2$ is bounded (its operator norm is not larger than two). Since, by assumption, the equation $Au = v$ (which, for $u = (v_1, v_2)$, is equivalent to $v_1 + v_2 = v$) has a solution $u \in U$ for every $v \in V$, according to Theorem 11.1 it can be chosen in such a way that $\|v_k\| \le c\|v\|$, where the constant $c$ does not depend on $v$.
is linear and bounded. Moreover, the equation $Au = v$ has a solution $u = (q(0), \dot q) \in U$ for every $q \in V$. Nevertheless, for $q_n \in V$ defined by $q_n(t) \equiv t^n$, the solution $u_n = (c_n, p_n)$ of $Au_n = q_n$ has $p_n(t) \equiv n t^{n-1}$. Since $\|q_n\| = 1$ and $\|p_n\| = n$, there exists no constant $c$ such that the equation $Au = v$ has a solution $u \in U$ with $\|u\| \le c\|v\|$ for every $v \in V$.
Proof. According to Theorem 11.1, there exists $c \in \mathbb{R}$ such that the equation $Au = v$ has a solution $u \in U$ with $\|u\| \le c\|v\|$ for every $v \in V$. Since $A$ is a bijection, such a solution $u$ is unique, and is given by $u = A^{-1}v$, which implies $\|A^{-1}\| \le c$.
$$[\Phi] = \{(u, v) \in U \times V : v = \Phi(u)\}.$$

(b) $L$ is bounded.
The non-trivial part of Theorem 11.3 is the implication (a)⇒(b). One way to interpret it is by saying that, for a linear function with a closed graph, infinite amplification of input length is impossible, as it would have to be achieved at a specific input vector $u$.
Proof. Since $W = U \times V$ is a Banach space, and $[L]$ is a closed linear subspace of $W$, $[L]$ is a Banach space as well. The projection onto the first coordinate, $A : [L] \to U$, defined by
$$Aw = u \quad \text{for } w = (u, v) \in [L],$$
is a linear bijection which is bounded ($\|A\| \le 1$). Hence its inverse, mapping $u \in U$ to $(u, Lu) \in [L]$, is bounded. Therefore $L$ is bounded as well.
Example 11.4 Consider a time-varying linear dynamical system defined by the recurrent
equations
where $A(t), B(t), C(t)$ are given real matrices of dimensions n-by-n, n-by-1, and 1-by-n respectively, such that
$$\sup_t \|A(t)\| < \infty, \quad \sup_t \|B(t)\| < \infty, \quad \sup_t \|C(t)\| < \infty,$$
which define a unique output sequence $y = y(t)$ for every input sequence $v = v(t)$. There are several ways of defining stability of the model in (11.1): $\ell^2$-BIBO stability calls for $y$ to be square summable (i.e. $y \in \ell^2$) whenever $v \in \ell^2$, and bounded $\ell^2$ gain stability calls for the existence of a constant $\gamma > 0$ such that $\|y\| \le \gamma\|v\|$ for all input/output pairs, where $\|\cdot\|$ is the standard norm in $\ell^2$.
While it may appear that bounded $\ell^2$ gain stability is a stronger requirement than $\ell^2$-BIBO stability, the two notions are actually equivalent. Indeed, $\ell^2$-BIBO stability means that (11.1) defines a linear function $L : \ell^2 \to \ell^2$. Since (11.1) is defined in terms of scalar equations relating finite subsets of the scalar components of $v(\cdot)$ and $y(\cdot)$, the graph of $L$ is closed. According to Theorem 11.3, $L$ is bounded, which is equivalent to bounded $\ell^2$ gain stability.
(a) for every $v \in V$ there exists $c = c(v) \in \mathbb{R}$ such that $\|f_k(v)\| \le c\|v\|$ for all $k \in \mathbb{Z}_+$;

(b) there exists $c_0 \in \mathbb{R}$ such that $\|f_k(v)\| \le c_0\|v\|$ for every $v \in V$ and $k \in \mathbb{Z}_+$.

One way of interpreting Theorem 11.4 is that the function $\gamma : V \to \mathbb{R} \cup \{+\infty\}$ defined by
$$\gamma(v) = \sup_k \|f_k(v)\|$$
Let
$$U = U_0 \times U_1 \times U_2 \times \cdots \stackrel{\rm def}{=} \{u = (u_0, u_1, u_2, \dots) : u_k \in U_k, \ \|u\|_{\infty} = \sup_k \|u_k\| < \infty\}$$
be the real vector space of all uniformly bounded sequences of vectors $u_k \in U_k$. It is a straightforward exercise (following the proof of completeness for $\ell^{\infty}$) to show that, equipped with the norm $\|\cdot\|_{\infty}$, $U$ is complete.
By the assumption (a), the formula
$$Lv = (f_0 v, f_1 v, f_2 v, \dots)$$
defines a linear function $L : V \to U$. Since the functions $f_k$ are bounded, the graph of $L$ is closed. According to the closed graph theorem, $L$ is bounded, and hence $\|f_k\| \le c = \|L\| < \infty$.
Example 11.5 Let $V$ be a complex Banach space (same as the real one, except over the field of complex numbers). Let $A : V \to V$ be a bounded linear operator. The set $\sigma = \sigma(A)$ of all complex numbers $s \in \mathbb{C}$ for which $sI - A$ does not have a bounded inverse is called the spectrum of $A$. The spectrum of $A$ is a generalization of the set of eigenvalues to the infinite dimensional case.
Using the Jordan form decomposition, in the case when $\dim(V) < \infty$, it is relatively easy to establish that
$$r(A) \stackrel{\rm def}{=} \max\{|s| : s \in \sigma(A)\} = \lim_{n \to \infty} \|A^n\|^{1/n}. \qquad (11.2)$$
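In the finite dimensional case (11.2) is easy to observe numerically; the sketch below (added here, with an arbitrary triangular test matrix) compares $\|A^n\|^{1/n}$ with the largest eigenvalue magnitude:

```python
# ||A^n||^(1/n) converges to the spectral radius r(A) (illustrative sketch).
import numpy as np

A = np.array([[0.5, 1.0],
              [0.0, 0.6]])
r = max(abs(np.linalg.eigvals(A)))     # r(A) = 0.6 for this matrix
for n in [1, 5, 20, 100, 400]:
    An = np.linalg.matrix_power(A, n)
    print(n, np.linalg.norm(An, 2) ** (1.0 / n))
print("r(A) =", r)
```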
converges whenever $|s|$ is larger than the right side of (11.2), the right side of (11.2) is never smaller than $r(A)$. The principle of uniform boundedness is handy in establishing the inequality in the reverse direction in the general infinite dimensional case.
Due to the completeness of $V$, the expansion
$$u = B(s_0)v + (s_0 - s)B(s_0)^2 v + \cdots = \sum_{t=0}^{\infty} (s_0 - s)^t B(s_0)^{t+1} v$$
converges and defines a bounded inverse of $sI - A$, with
$$\|u\| \le \frac{\|B(s_0)\|}{1 - |s_0 - s|\, \|B(s_0)\|}\, \|v\|,$$
whenever $s_0 I - A$ has a bounded inverse $B(s_0)$ and $|s_0 - s|\, \|B(s_0)\| < 1$. Hence for every $f \in V^*$ and $v \in V$ the function $s \mapsto f((sI - A)^{-1} v)$ is analytic in the region $|s| > r(A)$. Since the expansion (11.3) is valid at least for $|s| > \|A\|$, this implies that the sequences $\{h^{-n} f(A^n v)\}$ converge to zero for $|h| > r(A)$. According to the principle of uniform boundedness, this means the existence of a constant $c_0 = c_0(h)$ such that
Theorem 11.5 If $\{v_k\}_{k=1}^{\infty}$ is a weakly convergent sequence of elements of a normed space $V$ then $\sup_k \|v_k\| < \infty$.

holds by the definition, and the existence of $g \in V^*$ satisfying $|f_k(g)| = \|v_k\|$ follows from the Hahn-Banach theorem). Since $f_k(g) = g(v_k)$ converges to a limit as $k \to \infty$, the sequence $\{|f_k(g)|\}_{k=1}^{\infty}$ is bounded for every fixed $g \in V^*$. Since $V^*$ is a Banach space, this means that $\sup \|f_k\| < \infty$, which implies $\sup \|v_k\| < \infty$.
Then the equation $v = F(v)$ has a unique solution $v \in X$. Moreover, this solution is the limit of the sequence $\{v_k\}_{k=0}^{\infty}$ of $v_k \in V$ constructed according to
$$v_{k+1} = F(v_k) \quad (k = 0, 1, 2, \dots),$$
hence
$$\|v_{k+1} - v_k\| \le \gamma^k \|F(v_0) - v_0\|$$
and
$$\|v_m - v_k\| \le \frac{\gamma^k}{1 - \gamma} \|F(v_0) - v_0\|,$$
which means that $\{v_k\}_{k=0}^{\infty}$ is a Cauchy sequence in $V$. Since by assumption $V$ is complete, there exists $u \in V$ such that $\|v_k - u\| \to 0$ as $k \to \infty$. By the closedness of $X$, $u \in X$, and
$$\|u - v\| = \|F(u) - F(v)\| \le \gamma \|u - v\|$$
$$x = \sigma_n(Ax) + w, \qquad (11.5)$$
where $A$ is a given n-by-n real matrix such that $\|A\| < 1$, and $\sigma_n : \mathbb{R}^n \to \mathbb{R}^n$ is a pointwise sigmoid function, defined by
$$\sigma_n \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} \sigma(x_1) \\ \vdots \\ \sigma(x_n) \end{bmatrix}, \qquad \sigma(y) \stackrel{\rm def}{=} \frac{y}{\max\{1, |y|\}},$$
to be solved with respect to $x \in \mathbb{R}^n$ for a given $w \in \mathbb{R}^n$. Since $\sigma_n : \mathbb{R}^n \to \mathbb{R}^n$ is contractive, in the sense that
$$|\sigma_n(v) - \sigma_n(u)| \le |v - u| \quad \forall\, v, u \in \mathbb{R}^n,$$
and the map $v \mapsto Av$ is strictly contractive with contraction coefficient $\gamma = \|A\|$, the function $F : \mathbb{R}^n \to \mathbb{R}^n$ defined by
$$F(v) = \sigma_n(Av) + w$$
is strictly contractive. According to Theorem 11.6, equation (11.5) has a unique solution $x \in \mathbb{R}^n$ for every $w \in \mathbb{R}^n$, which can be obtained as the limit of the exponentially converging sequence $v_k \in \mathbb{R}^n$ defined by
$$v_0 = 0, \quad v_{k+1} = \sigma_n(Av_k) + w \quad (k = 0, 1, 2, \dots).$$
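The iteration is easy to run; below is an added sketch with arbitrary data ($A$ scaled so that $\|A\| = 0.9 < 1$):

```python
# Fixed-point iteration for x = sigma_n(Ax) + w from Example 11.7 (sketch).
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))
A *= 0.9 / np.linalg.norm(A, 2)        # enforce ||A|| = 0.9 < 1
w = rng.standard_normal(n)

def sigma_n(y):                        # pointwise sigmoid y / max(1, |y|)
    return y / np.maximum(1.0, np.abs(y))

x = np.zeros(n)
for _ in range(300):                   # v_{k+1} = sigma_n(A v_k) + w
    x = sigma_n(A @ x) + w
residual = np.linalg.norm(x - (sigma_n(A @ x) + w))
print(x, residual)                     # residual is essentially zero
```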
Example 11.8 Existence and uniqueness of solutions of an ordinary differential equation with a Lipschitz right side is a remarkable application of the contractive map principle.

Let $a : \mathbb{R}^n \to \mathbb{R}^n$ be a Lipschitz function, i.e. such that there exists a constant $c_L \in \mathbb{R}$ satisfying
$$|a(u) - a(v)| \le c_L |u - v| \quad \forall\, u, v \in \mathbb{R}^n.$$
Our intention is to prove that for every $x_0 \in \mathbb{R}^n$ there exist $T > 0$ and a continuously differentiable function $x : [0, T] \to \mathbb{R}^n$ such that $\dot x(t) = a(x(t))$ for $t \in [0, T]$ and $x(0) = x_0$.

Let $T > 0$ be such that $\gamma = c_L T < 1$. Let $V = C([0,T] \to \mathbb{R}^n)$ be the real vector space of all continuous functions $v : [0,T] \to \mathbb{R}^n$. Equipped with the infinity norm
$$\|v\|_{\infty} \stackrel{\rm def}{=} \max_{t \in [0,T]} |v(t)|,$$
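The resulting Picard iteration can be carried out numerically; the sketch below (added; the right side $a(x) = \sin x$ is one concrete Lipschitz choice with $c_L = 1$) exhibits the contraction:

```python
# Picard iteration for x' = a(x), x(0) = x0, as in Example 11.8 (sketch).
import numpy as np

def a(x):                               # Lipschitz with c_L = 1
    return np.sin(x)

x0, T, N = 1.0, 0.5, 2001               # gamma = c_L * T = 0.5 < 1
t, dt = np.linspace(0.0, T, N, retstep=True)

x = np.full(N, x0)                      # initial guess: the constant function
for k in range(8):                      # (F x)(t) = x0 + int_0^t a(x(s)) ds
    f = a(x)
    integral = np.concatenate(([0.0], np.cumsum((f[1:] + f[:-1]) * dt / 2)))
    x_new = x0 + integral
    print(k, np.max(np.abs(x_new - x)))  # sup-distance shrinks by ~gamma per step
    x = x_new
```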
In the derivation of a generalized version of the inverse function theorem later in this section, we will use a slightly different version of the contractive map principle, adapted for finding solutions $v \in X$ of an implicit equation $H(v, v) = 0$, where $H : X \times X \to \mathbb{R}$ is a continuous function. Naturally, the fixed point equation $v = F(v)$ is a special case of $H(v, v) = 0$, obtained with $H(v, u) = \|u - F(v)\|$.
Proof. According to assumption (b), conditions (11.7) can be satisfied by choosing some $w_{k+1} \in X$ whenever $\|w_k - w_0\| < r$ and $H(w_k, w_{k-1}) = 0$. Since having $w_i$ well defined for $i \le k$ yields
$$\|w_{i+1} - w_i\| \le \gamma^i \|w_1 - w_0\|, \quad \text{hence} \quad \|w_k - w_0\| \le (1 + \gamma + \dots + \gamma^k)\|w_1 - w_0\| < r,$$
the $w_k$ are defined for all $k$, and form a Cauchy sequence with a limit $x \in W$ satisfying $\|x - w_0\| \le r$. Since $X$ is a closed subset of $W$ and $H$ is continuous, $x \in X$ and $H(x, x) = 0$.
Example 11.11 Let $V = C[0,1]$ be the normed space of continuous functions $v : [0,1] \to \mathbb{R}$, equipped with the max-norm $\|\cdot\|_{\infty}$. The function $g_1 : V \to \mathbb{R}$ defined by $g_1(v(\cdot)) = v(0)^2$ is differentiable at every point $v_0 \in V$, and its derivative $G_1 = \dot g_1(v_0)$ at $v_0$ is the linear functional $G_1 : V \to \mathbb{R}$ defined by
$$G_1 v = \dot g_1(v_0)(v) = 2 v_0(0) v(0).$$
Similarly, the function $g_2 : V \to V$ mapping $v = v(t)$ to $u = u(t) = v(t)^2$ is differentiable at every point $v_0 \in V$, and its derivative $G_2 = \dot g_2(v_0)$ at $v_0$ is the linear operator $G_2 : V \to V$ defined by
$$(G_2 v)(t) = (\dot g_2(v_0)(v))(t) = 2 v_0(t) v(t).$$
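The derivative formula for $g_2$ can be verified by a finite-difference test; in the sketch below (added here; $v_0$ and $v$ are arbitrary continuous functions sampled on a grid) the remainder has sup-norm of order $s^2$:

```python
# Finite-difference check of the Frechet derivative in Example 11.11 (sketch).
import numpy as np

t = np.linspace(0.0, 1.0, 1001)
v0 = np.cos(3 * t)                     # base point v0 in C[0,1]
v  = np.sin(5 * t)                     # direction v

g2 = lambda u: u ** 2                  # g2(v)(t) = v(t)^2
G2v = 2 * v0 * v                       # (G2 v)(t) = 2 v0(t) v(t)
for s in [1e-1, 1e-2, 1e-3]:
    remainder = g2(v0 + s * v) - g2(v0) - s * G2v
    print(s, np.max(np.abs(remainder)))   # decays like s**2 (here exactly s^2 v^2)
```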
Though, according to the formal definition, the Frechet derivative $G = \dot g(u_0)$ does not have to be bounded (as a linear function) in general, it is easy to show that $\|\dot g(u_0)\| < \infty$ whenever $g$ is continuous at $u_0$, i.e. $\|g(u) - g(u_0)\| \to 0$ as $\|u - u_0\| \to 0$.

Proof. If $G$ is not bounded, there exists a sequence of vectors $w_k \in U$ such that $\|w_k - u_0\| \to 0$ but $\|G(w_k - u_0)\| = 1$ for all $k$. Then $w_k \in X$ for sufficiently large $k$, and $\|g(w_k) - g(u_0)\| \to 1$ as $k \to \infty$, which contradicts the assumed continuity of $g$.
Frechet derivatives of composition functions can be computed using the standard chain
rule.
Proof. By assumption, for every $\epsilon > 0$ there exists $\delta > 0$ such that $\|\dot g(u) - G\| < \epsilon$ whenever $\|u - u_0\| < \delta$. For every pair $u_1, u_2 \in U$ such that $\|u_i - u_0\| < \delta$ and $u_1 \neq u_2$ define the function $h : \mathbb{R} \to U$ according to $h(t) = t u_1 + (1-t) u_2$. By the definition of the Frechet derivative, every point $\tau \in [0,1]$ is contained in an open interval $S = S(\tau)$ such that
$$\|g(h(t)) - g(h(\tau)) - (t - \tau)\dot g(h(\tau))(u_1 - u_2)\| \le \epsilon |t - \tau|\, \|u_1 - u_2\|$$
for every $t \in S(\tau)$. Since $\|\dot g(h(\tau)) - G\| < \epsilon$, this implies
$$\|g(h(t)) - g(h(\tau)) - (t - \tau)G(u_1 - u_2)\| \le 2\epsilon |t - \tau|\, \|u_1 - u_2\|,$$
i.e.
$$\|e(t) - e(\tau)\| \le 2\epsilon |t - \tau|\, \|u_1 - u_2\|$$
for $e(t) = g(h(t)) - tG(u_1 - u_2)$. Since the segment $[0,1]$ is compact, it can be covered by a finite set of such intervals. Combining the bounds on the increments of $e(t)$ over the selected intervals yields
$$\|e(1) - e(0)\| \le 2\epsilon \|u_1 - u_2\|,$$
i.e.
$$\|g(u_1) - g(u_2) - G(u_1 - u_2)\| \le 2\epsilon \|u_1 - u_2\|.$$
Since $\epsilon \to 0$ as $\delta \to 0$, this completes the proof of Theorem 11.11.
Proof. According to the interior mapping principle, there exists $c > 0$ such that the equation $Gu = v$ has a solution $u \in U$ with $\|u\| \le c\|v\|$ for every $v \in V$. According to (11.10), there exists $\delta_0 > 0$ such that $\delta_0 < \delta$ and
$$\|g(u_1) - g(u_2) - G(u_1 - u_2)\| \le \frac{\|u_1 - u_2\|}{2c} \quad \text{whenever } \|u_k - u_0\| \le \delta_0. \qquad (11.12)$$
Let us show that the conclusion of Theorem 11.12 holds for $\epsilon = \delta_0/3$.

To do this, apply Theorem 11.7 with $W = U$, $X = B_d(u_0)$, $\gamma = 0.5$, $r = \delta_0$, $w_0 = 0$, and

According to (11.12), for $\|h\| \le \delta_0/3$ the right side in (11.13) with $v = w_0 = 0$ has norm not larger than $\delta_0/(6c)$, and hence (11.13) has a solution $u = w_1 \in U$ with $\|w_1\| \le \delta_0/6$. Similarly, (11.12) implies that the equation $H(w, e) = 0$ has a solution $e$ such that $\|e - v\| \le \gamma\|u - w\|$, where $\gamma = c/(2c) = 0.5$, whenever $H(u, v) = 0$, $\|u\| < r$, and $\|w\| < r$. Since $(1 - \gamma) r > \|w_1 - w_0\|$, Theorem 11.7 proves the existence of $w \in U$ such that $\|w\| \le \|h\|$ and $H(w, w) = 0$, i.e. $g(u_0 + h + w) = g(u_0) + Gh$.
$$X^2 + A_1 X + A_0 = 0, \qquad (11.14)$$
where $A_0, A_1, X$ are bounded linear operators on a Banach space $V$ ($A_0, A_1$ are given, and $X$ is to be found). The equation does not always have a solution: when $V$ is a real vector space, (11.14) is infeasible for $V = \mathbb{R}$, $A_1 = 0$, $A_0 = 1$. When $V$ is a complex vector space, (11.14) is infeasible for
$$V = \mathbb{C}^2, \quad A_1 = 0, \quad A_0 = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}.$$
Let us show how to use Theorem 11.12 to prove that, assuming $A_1$ has a right inverse $A_1^+ : V \to V$ (i.e. a bounded linear operator such that $A_1 A_1^+ = I_V$), there exists $\epsilon > 0$ such that equation (11.14) has a solution $X$ for every $A_0$ satisfying $\|A_0\| \le \epsilon$.
Indeed, let $U$ be the Banach space of all bounded linear operators on $V$. The function $g : U \to U$ defined by
$$g(X) = X^2 + A_1 X$$
is Frechet differentiable at every point $X_0 \in U$, with the Frechet derivative $\dot g(X_0)$ mapping $\Delta \in U$ to
$$\dot g(X_0)\Delta = X_0 \Delta + \Delta X_0 + A_1 \Delta.$$
At $X_0 = 0$ the range of $\dot g(0) : \Delta \mapsto A_1 \Delta$ is the whole $U$, as the equation $A_1 \Delta = Y$ has a solution $\Delta = A_1^+ Y$ for every $Y \in U$. According to Theorem 11.12, for every $\epsilon > 0$ there exists $\delta > 0$ such that the equation $g(H + W) = A_1 H$ has a solution $W$ with $\|W\| \le \epsilon \|H\|$ whenever $\|H\| \le \delta$.
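For a finite dimensional sanity check (an added sketch; to keep it simple, $A_1$ is taken invertible, so $A_1^+ = A_1^{-1}$, and $\|A_0\|$ is small), a solution can be produced by the contractive iteration $X \mapsto -A_1^{-1}(A_0 + X^2)$:

```python
# Solving X^2 + A1 X + A0 = 0 for small A0 by fixed-point iteration (sketch).
import numpy as np

rng = np.random.default_rng(2)
n = 3
A1 = np.eye(n) + 0.1 * rng.standard_normal((n, n))  # invertible perturbation of I
A0 = 0.01 * rng.standard_normal((n, n))             # ||A0|| small
A1_inv = np.linalg.inv(A1)

X = np.zeros((n, n))
for _ in range(100):                   # X <- -A1^{-1} (A0 + X^2), a contraction here
    X = -A1_inv @ (A0 + X @ X)
print(np.linalg.norm(X @ X + A1 @ X + A0))          # residual ~ 0
```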
Another version of the implicit mapping theorem follows easily from Theorem 11.13.
where
$$L(u) = q\,\phi(u) + p\, g(u) \qquad (11.17)$$
is the Lagrangian defined by $p$ and $q$.
(a) $\Omega$ is convex and has a strong interior point $w_0$ (i.e. there exists $\epsilon > 0$ such that $w \in \Omega$ whenever $\|w - w_0\| < \epsilon$);

(c) there exists a bounded linear function $G : U \to V$ such that condition (11.10) is satisfied;

(d) $R(G) = V$.

Then there exist a bounded linear functional $p \in V^*$ and a number $q \in \{0, 1\}$ ($q = 1$ when $p = 0$) such that condition (11.16) is satisfied for the Lagrangian (11.17).
Conditions (a)-(d) of Theorem 11.15 are regularity assumptions: (a) means that $\Omega$ is a nice fat set; (b) means that $\Omega$ is linearizable around $u_0$; (c) is a more restrictive local linearizability assumption imposed on $g$ (condition (c) implies that $G = \dot g(u_0)$ is the Frechet derivative of $g$ at $u_0$, and is in turn implied by continuity and continuous Frechet differentiability of $g$ in a neighborhood of $u_0$); (d) ensures that the constraint $g(u) = 0$ is regular (i.e. defines a smooth manifold of solutions) near $u_0$. Condition (e) is the assumption of optimality of $u_0$ in the optimization setup from (11.15).
Proof. Let $C = \dot\phi(u_0)$. The first step of the proof establishes that $C(u - u_0) \ge 0$ whenever $u$ is a strong interior point of $\Omega$ such that $G(u - u_0) = 0$. Indeed, if $u + \eta \in \Omega$ whenever $\|\eta\| < d$ (where $d > 0$) then

whenever
$$\|w\| \le \epsilon_0 \|t(u - u_0)\|, \qquad \epsilon_0 = \frac{d}{\|u - u_0\|}.$$
Since, for every $\epsilon > 0$, the implicit mapping theorem guarantees, for $t > 0$ small enough, the existence of $w_t \in U$ such that $\|w_t\| < \epsilon \|t(u - u_0)\|$ and $g(u_0 + t(u - u_0) + w_t) = 0$, the optimality of $u_0$ implies $\phi(u_0 + t(u - u_0) + w_t) \ge \phi(u_0)$. Since
where $\operatorname{int}(\Omega)$ denotes the set of all strong interior points of $\Omega$. Since $\Omega$ is convex, so is $\operatorname{int}(\Omega)$, and hence so is $\Delta$. Since $\Omega$ has a strong interior point, so does $\operatorname{int}(\Omega)$. Since $R(G) = V$, this yields the existence of an interior point of $\Delta$. As was established in the first step of the proof, $0 \notin \Delta$. Hence, according to the Hahn-Banach theorem, there exists a non-zero linear functional $f : \mathbb{R} \times V \to \mathbb{R}$ such that $f(z) \ge 0$ for all $z \in \Delta$.

Since every linear functional $f : \mathbb{R} \times V \to \mathbb{R}$ has the form
The Lagrange multiplier optimality conditions of Theorem 11.15 can be used to prove
the following necessary condition of optimality for a given triple (T, u(), x()) from (11.19).
Theorem 11.16 Assume that the pair $(A, B)$ of real matrices of dimensions n-by-n and n-by-1 is controllable, and the function $h : \mathbb{R}^n \to \mathbb{R}^m$ is continuously differentiable. Let $M$ be the set of all triplets $(T, u, x)$, where $T > 0$, $u : [0,T] \to [-1, 1]$ is integrable, and $x : [0,T] \to \mathbb{R}^n$ satisfies (11.19) and $h(x(T)) = 0$. Assume that a triplet $(T, u, x) \in M$ is optimal (in the sense that $T_1 \ge T$ for every $(T_1, u_1(\cdot), x_1(\cdot)) \in M$), and such that $\dot h(x(T))$ has rank $m$. Then there exists $b \in \mathbb{R}^m$, $|b| = 1$, such that
and
$$\operatorname{sgn}[y] = \begin{cases} \{1\}, & y > 0, \\ \{-1\}, & y < 0, \\ [-1, 1], & y = 0. \end{cases}$$
Once $T > 0$ and $b \in \mathbb{R}^m$, $|b| = 1$, are fixed, $u(t)$ is uniquely defined for all $t$ except, possibly, a finite number of instances where $B'\psi(t) = 0$. Hence $x = x(\cdot)$ is completely determined by $T$ and $b$, and the constraints $h(x(T)) = 0$, $|b| = 1$ become a system of $m + 1$ scalar equations with $m + 1$ scalar variables. In applications, the optimal control strategy is computed by solving these equations numerically.
Proof. Let $\Omega_0$ be the subset of the Banach space $U = \mathbb{R} \times L^{\infty}[0,1]$ consisting of all pairs $(T, v)$ with $T > 0$ and $v \in L^{\infty}[0,1]$. Let $\phi : \Omega_0 \to \mathbb{R}$ and $g : \Omega_0 \to \mathbb{R}^m$ be defined by

Let $\Omega \subset \Omega_0$ be the set of all pairs $(T, v) \in U$ with $T > 0$ and $\|v\|_{\infty} \le 1$.

Since the relations
$$v(\tau) = u(T\tau), \qquad z(\tau) = x(T\tau)$$
define a bijection between the triples $(T, u, x) \in M$ and the triples $(T, v, z)$ satisfying the conditions $(T, v) \in \Omega$ and (11.22), the original optimization task is equivalent to the problem of minimizing $\phi(T, v)$ over the pairs $(T, v) \in \Omega$ satisfying $g(T, v) = 0$.
Since $\phi$ is linear on its domain, and the set $\Omega_0$ contains a ball around every point $(T, v) \in \Omega_0$, $\phi$ is Frechet differentiable on $\Omega_0$, and its derivative is given by
$$\dot\phi(T, v)(\delta T, \delta v) = \delta T.$$
Let us show that the function $S : \Omega_0 \to \mathbb{R}^n$ mapping $(T, v) \in \Omega_0$ to $z(1)$ according to (11.22) is Frechet differentiable at every point of its domain, and that its derivative is given by

Indeed, (11.23) defines a bounded linear function $U \to \mathbb{R}^n$, and the difference

which makes $\Delta = o(\|(\delta T, \delta v)\|)$. Note also that $\dot S(T, v)$ depends continuously on $(T, v) \in \Omega_0$.

Since $h$ is assumed continuously differentiable, the function $g$ is continuously Frechet differentiable on $\Omega_0$. Moreover, since the pair $(A, B)$ is controllable, $R(\dot S(T, v)) = \mathbb{R}^n$ for all $(T, v) \in \Omega_0$. Since it is assumed that $R(\dot h(x(T))) = \mathbb{R}^m$, it follows that $R(\dot g(T, v)) = \mathbb{R}^m$ at the optimum.
Applying Theorem 11.15 with $V = \mathbb{R}^m$ yields the existence of $q \in \{0, 1\}$ and $b \in \mathbb{R}^m$ ($b \neq 0$ when $q = 0$) such that
$$q\,\dot\phi(T, v)(\delta T, \delta v) + b'\,\dot g(T, v)(\delta T, \delta v) \ge 0 \qquad (11.24)$$
whenever $T + \delta T > 0$ and $\|v + \delta v\|_{\infty} \le 1$. Note that it is impossible to have $b = 0$, because in that case (11.24) would require $\delta T \ge 0$ whenever $T + \delta T > 0$, which is impossible for $T > 0$.
Rewriting (11.24) in a more explicit way, and setting $\delta T = 0$, yields
$$b'\,\dot h(x(T))\, x_{\delta}(T) \ge 0$$
subject to
$$\dot x_{\delta}(t) = A x_{\delta}(t) + B u_{\delta}(t), \qquad x_{\delta}(0) = 0, \qquad (11.25)$$
and $\|u + u_{\delta}\|_{\infty} \le 1$, where
$$x_{\delta}(t) \stackrel{\rm def}{=} z_{\delta}(t/T), \qquad u_{\delta}(t) \stackrel{\rm def}{=} v_{\delta}(t/T).$$
Let $\psi : [0,T] \to \mathbb{R}^n$ be defined by (11.21). Then, due to integration by parts and (11.25),
$$b'\,\dot h(x(T))\, x_{\delta}(T) = \int_0^T \psi(t)' B u_{\delta}(t)\, dt \ge 0$$
whenever $\|u + u_{\delta}\|_{\infty} \le 1$. Since $u_{\delta}(t)$ can be made positive when $u(t) = -1$, negative when $u(t) = 1$, and of either sign when $|u(t)| < 1$, it follows that $B'\psi(t)$ is non-positive when $u(t) = 1$, non-negative when $u(t) = -1$, and zero when $|u(t)| < 1$, which is equivalent to (11.20).
Index

Fermat's Little Theorem, 42
field
    characteristic, 43
    definition, 35
    equivalent (isomorphic), 38
    isomorphic (equivalent), 38
    shortcut notation, 36
    unit, 35
spectrum
    essential, 131
    of complex linear operator, 130
theorem
    Little Fermat's, 42
vector space
    complex, 45
    over a field, 44
    real, 4
    shortcut notation, 5
    zero, of, 4, 44