Difference Equations

If the equation is linear in $y(x), y(x+1), \ldots, y(x+n)$, namely, if it is given by

$$\sum_{i=0}^{n} p_i(x)\,y(x+i) = q(x),$$

the difference equation is said to be linear. When $q(x) = 0$, it is homogeneous; otherwise, it is inhomogeneous (or nonhomogeneous).

D. Linear Difference Equations

Assume that $p_0(x), \ldots, p_n(x)$ are single-valued analytic functions without poles and common zeros in some domain. Consider the linear difference equation

$$P_x(y) = \sum_{i=0}^{n} p_i(x)\,y(x+i) = 0. \tag{1}$$

If $\varphi_1(x), \varphi_2(x), \ldots, \varphi_m(x)$ are solutions of (1), then a linear combination $a_1(x)\varphi_1(x) + a_2(x)\varphi_2(x) + \cdots + a_m(x)\varphi_m(x)$ with arbitrary periodic functions $a_1(x), a_2(x), \ldots, a_m(x)$ of period 1 is also a solution of (1).

Let $\beta_1, \beta_2, \ldots$ be singular points of $p_1(x), p_2(x), \ldots, p_n(x)$, let $\alpha_1, \alpha_2, \ldots$ be the zeros of $p_0(x)$, and let $\gamma_1, \gamma_2, \ldots$ be the zeros of $p_n(x+n)$. Then the set of singular points of the linear difference equation (1) is the set $\{\alpha_i, \beta_i, \gamma_i\}$.

A function $\varphi_m(x)$ is said to be linearly dependent on the functions $\varphi_1(x), \varphi_2(x), \ldots, \varphi_{m-1}(x)$ with respect to the difference equation (1) if $\varphi_m(x) = a_1(x)\varphi_1(x) + a_2(x)\varphi_2(x) + \cdots + a_{m-1}(x)\varphi_{m-1}(x)$, where $a_1(x), a_2(x), \ldots, a_{m-1}(x)$ are functions of period 1, every one of which takes a nonzero finite value at least at one point not congruent (mod $\mathbf{Z}$) to any of the singular points, where $\mathbf{Z}$ is the additive group of integers.

A set of $m$ functions is called linearly independent if none of the functions is dependent on the other $m - 1$ functions. When a set of $n$ solutions of equation (1) is linearly independent, it is a fundamental system for (1). Any solution of (1) can be expressed as a linear combination of $n$ solutions of a fundamental system.

The determinant

$$\begin{vmatrix} \varphi_1(x) & \varphi_2(x) & \cdots & \varphi_n(x) \\ \varphi_1(x+1) & \varphi_2(x+1) & \cdots & \varphi_n(x+1) \\ \vdots & \vdots & & \vdots \\ \varphi_1(x+n-1) & \varphi_2(x+n-1) & \cdots & \varphi_n(x+n-1) \end{vmatrix}$$

formed from $n$ functions $\varphi_1(x), \varphi_2(x), \ldots, \varphi_n(x)$ is called Casorati's determinant and is denoted by $D(\varphi_1(x), \varphi_2(x), \ldots, \varphi_n(x))$. A necessary and sufficient condition for a given set of $n$ functions to be independent is that Casorati's determinant be nonzero at every point except those which are congruent to singular points of (1). Casorati's determinant is used to determine whether a given set of solutions is fundamental.

Let $\psi(x)$ be a solution of a nonhomogeneous linear difference equation

$$P_x(y) = \sum_{i=0}^{n} p_i(x)\,y(x+i) = q(x). \tag{2}$$

If $\varphi_1(x), \varphi_2(x), \ldots, \varphi_n(x)$ are $n$ linearly independent solutions of (1), then an arbitrary solution of (2) is given by

$$y = a_1(x)\varphi_1(x) + a_2(x)\varphi_2(x) + \cdots + a_n(x)\varphi_n(x) + \psi(x),$$

where $a_1(x), \ldots, a_n(x)$ are arbitrary periodic functions of period 1. Then the expression for $y$ is called a general solution of (2). If we abbreviate Casorati's determinant of a fundamental system of solutions $\varphi_1(x), \varphi_2(x), \ldots, \varphi_n(x)$ of (1) by $D(x)$ and write $\mu_i(x)$ as the quotient of the cofactor of $\varphi_i(x+n)$ of $D(x+1)$ by $D(x+1)$, we have

$$\psi(x) = \sum_{i=1}^{n} \varphi_i(x)\; \mathrm{S}\, \frac{\mu_i(x)\,q(x)}{p_n(x)}\,\Delta x,$$

assuming that the summation $\mathrm{S}$ on the right-hand side is known. This is the analog of Lagrange's method of variation of constants in the theory of linear ordinary differential equations.

E. Linear Difference Equations with Constant Coefficients

If all the coefficients in

$$\sum_{i=0}^{n} p_i\,y(x+i) = 0 \tag{3}$$

are constants, $n$ linearly independent solutions are obtained easily. Indeed, if $\lambda$ is a root of the algebraic equation $\sum_{i=0}^{n} p_i \lambda^i = 0$, $\lambda^x$ is a solution of (3). This algebraic equation is called the characteristic equation of (3). If it has $n$ distinct roots $\lambda_1, \lambda_2, \ldots, \lambda_n$, then $\lambda_1^x, \lambda_2^x, \ldots, \lambda_n^x$ are $n$ linearly independent solutions. In general, if $\lambda$ is an $m$-tuple root of the characteristic equation, then $\lambda^x, x\lambda^x, \ldots, x^{m-1}\lambda^x$ are solutions of (3). Accordingly, if $\lambda_j$ is a root of multiplicity $m_j$ ($\sum_j m_j = n$, $j = 1, 2, \ldots, s$), then $\lambda_j^x, x\lambda_j^x, \ldots, x^{m_j-1}\lambda_j^x$ ($j = 1, \ldots, s$) constitute a set of $n$ linearly independent solutions.

Even if all the $p_i$ are real, the characteristic equation may have complex roots. In such a case real solutions are obtained as follows: When $\lambda = \mu + i\nu$ is a root of multiplicity $m$, $\bar{\lambda} = \mu - i\nu$ is also a root of the same multiplicity. If we write $\rho = \sqrt{\mu^2 + \nu^2}$, $\tan\varphi = \nu/\mu$, then $\rho^x \cos\varphi x$, $\rho^x \sin\varphi x$, $x\rho^x \cos\varphi x$, $x\rho^x \sin\varphi x$, $\ldots$, $x^{m-1}\rho^x \cos\varphi x$, $x^{m-1}\rho^x \sin\varphi x$ are $2m$ independent real solutions.

Nonhomogeneous equations with constant coefficients can be generally solved by Lagrange's method with these solutions. However, when the nonhomogeneous term has a special form such as

$$p(x)\lambda^x,$$

where $p(x)$ is a polynomial in $x$ and $\lambda$ is a root of multiplicity $m$ of the characteristic equation, we can use the method of undetermined coefficients. In this particular case, the substitution $(A_0 + A_1 x + \cdots + A_k x^k)\,x^m \lambda^x$ with undetermined coefficients $A_0, A_1, \ldots, A_k$ gives solutions, $k$ being the degree of $p(x)$.

F. Difference and Differential Equations

The differential operator $d/dx$ acts on the family of functions $\{x^m \mid m = 0, \pm 1, \ldots\}$ according to $dx^m/dx = mx^{m-1}$, just as the difference operator $\Delta$ acts on the family $\{x^{(m)} = \Gamma(x+1)/\Gamma(x-m+1) \mid m = 0, \pm 1, \ldots\}$ according to $\Delta x^{(m)} = mx^{(m-1)}$. Hence by using the factorial series $\sum a_m x^{(m)}$ and its similarity to the power series $\sum a_m x^m$, we may obtain some analogies with the theory of differential equations. For example, the Frobenius method in the theory of regular singular points can be applied to the system of difference equations

$$(z-1)\,\Delta_{-1} w_k(z) = \sum_{j=1}^{n} a_{jk}(z)\,w_j(z), \qquad k = 1, 2, \ldots, n.$$

However, there are certain essential differences between functions defined as solutions of differential and difference equations. For example, Hölder's theorem states that no solution of the simple difference equation $y(x+1) - y(x) = x^{-1}$ satisfies any algebraic differential equation. Consequently, the gamma function, which is related to a solution of the equation $\psi(x) = d\log\Gamma(x)/dx$, cannot be a solution of any algebraic differential equation. For the numerical solution of ordinary differential equations by difference equation approximation → 303 Numerical Solution of Ordinary Differential Equations.

G. Geometric Difference Equations

For an arbitrary complex number $q$, an equation of the form $y(qx) = f(x, y(x))$ is called a geometric difference equation. For example, the ordinary difference equation (1) can be transformed into

$$\sum_{k=0}^{n} p_k(z)\,Y(zq^k) = 0 \tag{1'}$$

by the change of variable $z = q^x$. Although it is possible to transform an equation of the type (1') into that of the type (1), there are theories developed specifically for the type (1') since the coefficients of the equation may become more complicated by such a transformation.

References

[1] P. M. Batchelder, An introduction to linear difference equations, Dover, 1967.
[2] G. Boole, A treatise on the calculus of finite differences, Dover, 1960.
[3] R. M. Cohn, Difference algebra, Interscience, 1965.
[4] T. Fort, Finite differences and difference equations in the real domain, Oxford Univ. Press, 1948.
[5] C. Jordan, Calculus of finite differences, Chelsea, 1933.
[6] K. S. Miller, Linear difference equations, Benjamin, 1968.
[7] L. M. Milne-Thomson, The calculus of finite differences, Chelsea, 1980.
[8] N. E. Nörlund, Vorlesungen über Differenzenrechnung, Springer, 1924.
[9] N. E. Nörlund, Leçons sur les équations linéaires aux différences finies, Gauthier-Villars, 1929.
[10] C. H. Richardson, An introduction to calculus of finite differences, Van Nostrand, 1954.

105 (VII.2)

Differentiable Manifolds

A. General Remarks

The rudimentary concept of n-dimensional manifolds can already be seen in J. Lagrange's dynamics. In the middle of the 19th century n-dimensional Euclidean space was known as a continuum of $n$ real parameters (A. Cayley; H. Grassmann, 1844, 1861; L. Schläfli, 1852). The notion of general n-dimensional manifolds was introduced by B. Riemann as a result of his differential geometric observations (1854). He considered an n-dimensional manifold to be a set formed by a 1-parameter family of $(n-1)$-dimensional manifolds, just as a surface is formed by the motion of a curve. Analytical studies of topological structures of manifolds and their local properties were initiated and developed by Riemann, E. Betti, H. Poincaré, and others. To avoid the difficulties and disadvantages of analytical methods, Poincaré restricted his consideration to those topological spaces $X$ that are connected, triangulable, and such that each point of $X$
has a neighborhood homeomorphic to an n-dimensional Euclidean space. We often refer to such spaces as Poincaré manifolds; Poincaré called them n-dimensional manifolds. In 1936, H. Whitney published a monumental paper [14] on differentiable manifolds in which the various fundamental concepts on differentiable manifolds were established. This and subsequent papers written by Whitney during nearly twenty years greatly influenced the rapid advance of the theory of differentiable manifolds since 1950.

B. Topological Manifolds

An n-dimensional topological manifold $M$ is by definition a Hausdorff space in which each point $p$ has a neighborhood $U(p)$ homeomorphic to an open set of $\mathbf{R}^n$.

Let $M'$ be a Hausdorff space in which each point $p$ of $M'$ has a neighborhood $U(p)$ homeomorphic to an open set of $H^n$, where $H^n$ is the half-space $\{(x_1, x_2, \ldots, x_n) \in \mathbf{R}^n \mid x_n \ge 0\}$. Let $\partial M'$ denote the set consisting of points $p$ of $M'$ such that $p$ corresponds to a point of $H^n_0 = \{(x_1, \ldots, x_n) \in H^n \mid x_n = 0\} \subset H^n$ under the homeomorphism from $U(p)$ to an open set of $H^n$. $M'$ is called an n-dimensional topological manifold with boundary if $\partial M' \ne \emptyset$, and $\partial M'$ is called the boundary of $M'$. On the other hand, $M$ defined as above or $M'$ with $\partial M' = \emptyset$ is called an n-dimensional topological manifold without boundary. The interior of $M'$ is the complement $M^\circ = M' - \partial M'$ of the boundary. The boundary of an n-dimensional topological manifold is an $(n-1)$-dimensional topological manifold. A topological manifold without boundary is called closed or open according as it is compact or has no connected component which is compact. There exist connected topological manifolds that are not paracompact; among them, the 1-dimensional ones are called long lines. A connected paracompact topological manifold $M$ has a countable open base and is metrizable.

C. Local Coordinates

Let $M$ be an n-dimensional topological manifold. A pair $(U, \psi)$ consisting of an open set $U$ of $M$ and a homeomorphism $\psi$ of $U$ onto an open set of $\mathbf{R}^n$ is called a coordinate neighborhood of $M$. If we denote by $(x^1(p), \ldots, x^n(p))$ ($p \in U$) the coordinates of the point $\psi(p)$ of $\mathbf{R}^n$, then $x^1, x^2, \ldots, x^n$ are real-valued continuous functions defined on $U$. We call these $n$ functions the local coordinate system in the coordinate neighborhood $(U, \psi)$ and the $n$ real numbers $x^1(p), \ldots, x^n(p)$ the local coordinates of the point $p \in U$ (with respect to $(U, \psi)$). A set $S = \{(U_\alpha, \psi_\alpha)\}_{\alpha \in A}$ of coordinate neighborhoods is called an atlas of $M$ if $\{U_\alpha\}_{\alpha \in A}$ forms an open covering of $M$.

D. Differentiable Manifolds

Let $S = \{(U_\alpha, \psi_\alpha)\}_{\alpha \in A}$ be an atlas of an n-dimensional topological manifold $M$. For each pair of coordinate neighborhoods $(U_\alpha, \psi_\alpha)$ and $(U_\beta, \psi_\beta)$ in $S$ such that $U_\alpha \cap U_\beta \ne \emptyset$, $\psi_\beta \circ \psi_\alpha^{-1}$ is a homeomorphism of the open set $\psi_\alpha(U_\alpha \cap U_\beta)$ of $\mathbf{R}^n$ onto the open set $\psi_\beta(U_\alpha \cap U_\beta)$ of $\mathbf{R}^n$. Let $x = (x^1, \ldots, x^n) \in \psi_\alpha(U_\alpha \cap U_\beta)$. Then we can write $(\psi_\beta \circ \psi_\alpha^{-1})(x) = (f^1_{\beta\alpha}(x), \ldots, f^n_{\beta\alpha}(x))$. If the $n$ real-valued functions $f^1_{\beta\alpha}, \ldots, f^n_{\beta\alpha}$ defined in $\psi_\alpha(U_\alpha \cap U_\beta)$ are of class $C^r$ ($1 \le r \le \infty$) (resp. real analytic) for any $\alpha, \beta$ in $A$ such that $U_\alpha \cap U_\beta \ne \emptyset$, then we call $S$ an atlas of class $C^r$ (resp. $C^\omega$) of $M$. When an n-dimensional topological manifold $M$ has an atlas $S$ of class $C^r$ ($1 \le r \le \omega$), we call the pair $(M, S)$ an n-dimensional differentiable manifold of class $C^r$ (or $C^r$-manifold). A $C^\infty$-manifold is also called a smooth manifold, while a $C^\omega$-manifold is called a real analytic manifold. We call $M$ the underlying topological space of $(M, S)$, and we say that $S$ defines a differentiable structure of class $C^r$ (or $C^r$-structure) in $M$.

In particular, a $C^\omega$-structure is called a real analytic structure. A $C^r$-manifold whose underlying topological space is compact (paracompact) is called a compact (paracompact) $C^r$-manifold. A coordinate neighborhood $(U, \psi)$ of $M$ is called a coordinate neighborhood of class $C^r$ of $(M, S)$ if the union $S \cup \{(U, \psi)\}$ is also an atlas of class $C^r$ of $M$. In particular, each coordinate neighborhood of $M$ belonging to $S$ is of class $C^r$. The set $\bar{S}$ of all coordinate neighborhoods of class $C^r$ of $(M, S)$ is an atlas of $M$ containing $S$, and we call $\bar{S}$ the maximal atlas containing $S$. Let $S$ and $S'$ be two atlases of class $C^r$ of $M$. If $\bar{S} = \bar{S}'$, then we say that $S$ and $S'$ define the same differentiable structure of class $C^r$ on $M$ and that the differentiable manifolds $(M, S)$ and $(M, S')$ of class $C^r$ are equivalent. In particular, $(M, S)$ and $(M, \bar{S})$ are equivalent $C^r$-manifolds. Let $S$ and $S'$ be atlases of class $C^r$ and class $C^s$, respectively, where $1 \le r < s \le \omega$. Since $s > r$, we can consider $S'$ an atlas of class $C^r$. If $S$ and $S'$ define the same $C^r$-structure in $M$, then we say that the $C^s$-structure defined by $S'$ is subordinate to the $C^r$-structure defined by $S$. If $M$ is paracompact, then there exists a $C^\omega$-structure subordinate to a $C^r$-structure of $M$ (Whitney [14]).

E. Differentiable Manifolds with Boundaries

Let $U$ and $U'$ be open sets in the half-space $H^n$, and let $\varphi: U \to U'$ be a continuous mapping. If
there exist open sets $W$ and $W'$ in $\mathbf{R}^n$ containing $U$ and $U'$, respectively, and a mapping $\psi: W \to W'$ of class $C^r$ that extends $\varphi$, we call $\varphi$ a mapping of $U$ into $U'$ of class $C^r$. Let $M$ be a Hausdorff topological space. A structure of a $C^r$-manifold on $M$ is defined by a set $S = \{(U_\alpha, \psi_\alpha)\}_{\alpha \in A}$, where $\{U_\alpha\}_{\alpha \in A}$ is an open covering of $M$, and, for each $\alpha$, $\psi_\alpha$ is a homeomorphism of $U_\alpha$ onto an open set of $H^n$ such that for any $\alpha, \beta \in A$ with $U_\alpha \cap U_\beta \ne \emptyset$, $\psi_\beta \circ \psi_\alpha^{-1}$ is a mapping of class $C^r$ from $\psi_\alpha(U_\alpha \cap U_\beta)$ onto $\psi_\beta(U_\alpha \cap U_\beta)$. Let $\partial M$ denote the set consisting of points $p$ of $M$ such that $p \in U_\alpha$ and $\psi_\alpha(p) \in H^n_0 = \{(x_1, \ldots, x_n) \in H^n \mid x_n = 0\}$ for some $\alpha \in A$. If $\partial M \ne \emptyset$, the pair $(M, S)$ is called an n-dimensional differentiable manifold with boundary of class $C^r$ (or $C^r$-manifold with boundary), and $\partial M$ is called the boundary of $M$. $\partial M$ forms an $(n-1)$-dimensional $C^r$-manifold. If we put $U'_\alpha = U_\alpha \cap \partial M$ and denote the restriction of $\psi_\alpha$ to $U'_\alpha$ by $\psi'_\alpha$, then $S' = \{(U'_\alpha, \psi'_\alpha)\}_{\alpha \in A}$ is an atlas of class $C^r$ of $\partial M$. If $\partial M$ is empty, then $(M, S)$ is a $C^r$-manifold. In this sense a $C^r$-manifold is sometimes called a $C^r$-manifold without boundary.

F. Orientation of a Manifold

Let $S = \{(U_\alpha, \psi_\alpha)\}_{\alpha \in A}$ be an atlas of class $C^r$ in $M$, and for each $\alpha$ let $\{x^1_\alpha, \ldots, x^n_\alpha\}$ be the local coordinate system in a coordinate neighborhood $(U_\alpha, \psi_\alpha)$. If $U_\alpha$ and $U_\beta$ intersect, then there exist $n$ real-valued functions $F^i$ ($i = 1, \ldots, n$) defined on $\psi_\alpha(U_\alpha \cap U_\beta)$ such that $x^i_\beta(p) = F^i(x^1_\alpha(p), \ldots, x^n_\alpha(p))$ for $p \in U_\alpha \cap U_\beta$ and $i = 1, \ldots, n$. The Jacobian $D_{\beta\alpha} = D(F^1, \ldots, F^n)/D(x^1_\alpha, \ldots, x^n_\alpha)$ is different from zero at each point $(x^1_\alpha, \ldots, x^n_\alpha)$ of $\psi_\alpha(U_\alpha \cap U_\beta)$. If we can choose an atlas $S$ of $M$ so that, for any $\alpha, \beta$ such that $U_\alpha \cap U_\beta$ is nonempty, the Jacobian $D_{\beta\alpha}$ is always positive, then we say that the $C^r$-manifold $M$ is orientable, and we call $S$ an oriented atlas.

Let $S = \{(U_\alpha, \psi_\alpha)\}_{\alpha \in A}$ and $S' = \{(V_\lambda, \varphi_\lambda)\}_{\lambda \in \Lambda}$ be two oriented atlases of a connected $C^r$-manifold $M$. If $M$ is connected, then the sign of the Jacobian $D_{\lambda\alpha}(p)$ of the transformation of local coordinates is independent of the choice of $\alpha \in A$, $\lambda \in \Lambda$, and $p \in U_\alpha \cap V_\lambda$. We say that $S$ and $S'$ define the same (opposite) orientation if $D_{\lambda\alpha}$ is always positive (negative). Hence if $M$ is connected, the set of all oriented atlases of class $C^r$ is composed of two subsets such that atlases belonging to one of them have the same orientation, while two atlases belonging to different ones have the opposite orientation. Each of these subsets is called an orientation of the connected $C^r$-manifold $M$. When we assign to $M$ one of two possible orientations, $M$ is called an oriented manifold; the assigned orientation is called its positive orientation and the other its negative orientation. If $S = \{(U_\alpha, \psi_\alpha)\}_{\alpha \in A}$ belongs to the positive orientation, $S$ and $(U_\alpha, \psi_\alpha)$ are called an atlas and local coordinate system, respectively, compatible with the positive orientation.

G. Differentiable Functions

Let $f$ be a real-valued function defined in a neighborhood of a point $p$ of a $C^\infty$-manifold $M$. Let $(U, \psi)$ be a coordinate neighborhood of class $C^\infty$ such that $p \in U$. If the function $f \circ \psi^{-1}$ is of class $C^r$ ($1 \le r \le \infty$) in a neighborhood of the point $\psi(p)$ in $\mathbf{R}^n$, then the function $f$ is called a function of class $C^r$ at $p$. This definition is independent of the choice of a coordinate neighborhood of class $C^\infty$. If we denote the local coordinate system in $(U, \psi)$ by $(x^1, \ldots, x^n)$, there exists a function $f(x^1, \ldots, x^n)$ of $n$ variables defined in a neighborhood of $\psi(p)$ in $\mathbf{R}^n$ such that $f(q) = f(x^1(q), \ldots, x^n(q))$ for each point $q$ in the neighborhood of $p$. Here we use the same symbol $f$ for the function $f$ defined in a neighborhood of $p$ in $M$ and for the function $f \circ \psi^{-1}$ defined in the image of the neighborhood by $\psi$ in $\mathbf{R}^n$. The function $f$ is of class $C^r$ at $p$ if and only if $f(x^1, \ldots, x^n)$ is of class $C^r$ in a neighborhood of the point $(x^1(p), \ldots, x^n(p))$ of $\mathbf{R}^n$. A function of class $C^r$ (or $C^r$-function) in $M$ is a real-valued function in $M$ that is of class $C^r$ at every point of $M$.

H. Tangent Vectors

Let $M$ be a $C^\infty$-manifold, and let $\mathfrak{F}(M)$ be the real vector space consisting of all $C^\infty$-functions in $M$. (For the sake of simplicity, we denote a manifold $(M, S)$ by $M$.) A tangent vector $L$ at a point $p$ of $M$ is a linear mapping $L: \mathfrak{F}(M) \to \mathbf{R}$ such that $L(fg) = L(f)g(p) + f(p)L(g)$ for any $f$ and $g$ in $\mathfrak{F}(M)$. For any two tangent vectors $L_1, L_2$ and any pair of real numbers $\lambda_1, \lambda_2$ we define $\lambda_1 L_1 + \lambda_2 L_2$ by $(\lambda_1 L_1 + \lambda_2 L_2)(f) = \lambda_1 L_1(f) + \lambda_2 L_2(f)$, $f \in \mathfrak{F}(M)$.

Thus tangent vectors at $p$ form a real vector space $T_p$, which we call the tangent vector space (or simply the tangent space) of $M$ at the point $p$. The dimension of the tangent vector space $T_p$ equals the dimension of $M$. The set of all tangent vectors of $M$ forms a vector bundle over the base space $M$, called the tangent vector bundle (or tangent bundle) of $M$.

By a tangent r-frame ($r \le n$) at $p$ we mean an ordered set of $r$ linearly independent tangent vectors at $p$. The set of all tangent r-frames also forms a fiber bundle over $M$ called the tangent r-frame bundle (or bundle of tangent r-frames) (→ 147 Fiber Bundles F).
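
The Leibniz rule $L(fg) = L(f)g(p) + f(p)L(g)$ that characterizes a tangent vector can be checked numerically in a single chart. The following Python sketch is an illustration only and rests on several assumptions that are not part of the text: $M$ is taken to be $\mathbf{R}^2$ with its standard coordinates, the tangent vector is $L = \sum_i v^i (\partial/\partial x^i)_p$, and the helper `directional_derivative`, the sample functions `f` and `g`, the point `p`, and the components `v` are hypothetical choices. The derivative is approximated by central differences, so the two sides agree only up to discretization and rounding error.

```python
import math


def directional_derivative(func, p, v, h=1e-6):
    """Approximate L(func) = sum_i v^i * (d func / d x^i)(p) by central differences."""
    forward = [pi + h * vi for pi, vi in zip(p, v)]
    backward = [pi - h * vi for pi, vi in zip(p, v)]
    return (func(forward) - func(backward)) / (2.0 * h)


# Hypothetical smooth functions on R^2, chosen only for illustration.
def f(x):
    return math.sin(x[0]) * x[1]


def g(x):
    return x[0] ** 2 + math.exp(x[1])


def fg(x):
    return f(x) * g(x)


p = [0.7, -0.3]   # base point (assumed)
v = [1.5, 2.0]    # components of the tangent vector in the chart (assumed)

# Left- and right-hand sides of the Leibniz rule L(fg) = L(f) g(p) + f(p) L(g).
lhs = directional_derivative(fg, p, v)
rhs = directional_derivative(f, p, v) * g(p) + f(p) * directional_derivative(g, p, v)
leibniz_gap = abs(lhs - rhs)

# Exact value of L(f) for comparison: v^1 cos(x^1) x^2 + v^2 sin(x^1).
exact_Lf = v[0] * math.cos(p[0]) * p[1] + v[1] * math.sin(p[0])

print("Leibniz gap:", leibniz_gap)
print("L(f) numeric vs exact:", directional_derivative(f, p, v), exact_Lf)
```

Central differences are used here because their error is of second order in the step `h`; any other consistent approximation of the partial derivatives would serve the same purpose.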
I. Differentials of Functions

For a $C^\infty$-function $f$ in $M$ and a point $p$ of $M$ we can define a linear mapping $df_p: T_p \to \mathbf{R}$ by $df_p(L) = L(f)$ for all $L \in T_p$, and we call $df_p$ the differential of $f$ at $p$. The totality of differentials at $p$ of $C^\infty$-functions in $M$ forms the dual vector space of the tangent vector space $T_p$.

J. Differentiable Mappings

Let $\varphi$ be a continuous mapping of a $C^\infty$-manifold $M$ into a $C^\infty$-manifold $M'$. We call $\varphi$ a differentiable mapping of class $C^r$ (or simply a $C^r$-mapping) ($1 \le r \le \infty$) if the function $f \circ \varphi$ is of class $C^r$ for any $C^r$-function $f$ on $M'$. If $\varphi$ is a homeomorphism of $M$ onto $M'$ and $\varphi$ and $\varphi^{-1}$ are both of class $C^r$, then we call $\varphi$ a diffeomorphism of class $C^r$. If there exists a diffeomorphism of class $C^\infty$ of a $C^\infty$-manifold $M$ onto a $C^\infty$-manifold $M'$, then $M$ and $M'$ are said to be diffeomorphic.

Let $M$ and $M'$ be $C^\infty$-manifolds and $\varphi$ be a $C^\infty$-mapping of $M$ into $M'$. For a tangent vector $L$ of $M$ at $p$, a tangent vector $L'$ of $M'$ at $\varphi(p)$ is defined by $L'(g) = L(g \circ \varphi)$, $g \in \mathfrak{F}(M')$. The mapping $L \to L'$ defines a linear mapping $(d\varphi)_p$ of the tangent vector space $T_p$ of $M$ at $p$ into the tangent vector space $T'_{\varphi(p)}$ of $M'$ at $\varphi(p)$. The linear mapping $(d\varphi)_p$ is called the differential of the differentiable mapping $\varphi$ at $p$. If $(d\varphi)_p$ is surjective, $p$ is called a regular point of $\varphi$. A point on $M$ which is not a regular point is called a critical point of $\varphi$. A point $q$ on $M'$ which is an image of a critical point is called a critical value of $\varphi$, and a point on $M'$ which is not a critical value is called a regular value. In $\mathbf{R}^n$, the diffeomorphic image of a set of Lebesgue measure zero has Lebesgue measure zero. So the set of Lebesgue measure zero is well defined on a (paracompact) $C^\infty$-manifold. Then Sard's theorem states: Let $\varphi: M \to M'$ be a $C^\infty$-mapping; then the set of critical values of $\varphi$ has Lebesgue measure zero in $M'$.

K. Immersions and Embeddings

Let $M$ and $M'$ be $C^\infty$-manifolds and $\varphi$ be a $C^\infty$-mapping of $M$ into $M'$. If $(d\varphi)_p$ is injective at every point $p$ of $M$, then $\varphi$ is called an immersion of $M$ into $M'$. If $\varphi$ is an immersion, then for some neighborhood $U_p$ of any point $p$ of $M$ the restriction $\varphi|U_p$ gives rise to a homeomorphism from $U_p$ into $M'$. If an immersion $\varphi$ is injective, then $\varphi$ is called an embedding (or an imbedding) of $M$ into $M'$. An alternative definition is often used, which says that $\varphi$ is an embedding if, in addition to the above conditions, $\varphi$ gives a homeomorphism from $M$ onto $\varphi(M)$, where $\varphi(M)$ has the relative topology of $M'$. If the former definition is adopted as embedding, then a mapping $\varphi$ satisfying the conditions of the alternative definition is sometimes referred to as a regular embedding. If $M$ is compact, the two definitions coincide. In Sections L and M, embedding always means regular embedding.

The theory of embeddings and immersions is mainly concerned with ways to embed and immerse a given manifold $M$ into a manifold $M'$ of a particular type with lowest possible dimension. $M'$ is usually the Euclidean space $\mathbf{R}^N$, the projective space $P^N\mathbf{R}$, or a certain standard manifold. The theory was initiated by Whitney (1936). He proved by a "general position" argument that an n-dimensional $C^\infty$-manifold $M$ with countable basis can always be immersed in the 2n-dimensional Euclidean space and can always be embedded in the $(2n+1)$-dimensional Euclidean space as a closed set (Whitney's theorem).

L. Submanifolds

A $C^\infty$-manifold $M$ is said to be a submanifold of a $C^\infty$-manifold $M'$ if $M$ is a subset of $M'$ and the identity mapping of $M$ into $M'$ is an immersion. If the identity mapping of $M$ into $M'$ is an embedding, then $M$ is called a regular submanifold of $M'$. A regular submanifold $M$ of $M'$ is called a closed submanifold if $M$ is a closed subset of $M'$.

Let $\varphi$ be a $C^\infty$-mapping from $M$ into $M'$ and $M''$ be a submanifold of $M'$. Then for each $q \in M''$ the tangent space $T''_q$ of $M''$ at $q$ is a linear subspace of the tangent space $T'_q$ of $M'$ at $q$. Denote by $\pi_q$ the projection of $T'_q$ onto the quotient vector space $T'_q/T''_q$. A $C^\infty$-mapping $\varphi$ is called transverse to $M''$ if for each $p \in \varphi^{-1}(M'')$ the composite $\pi_{\varphi(p)} \circ d\varphi_p: T_p \to T'_{\varphi(p)}/T''_{\varphi(p)}$ is surjective. If $\varphi$ is transverse to $M''$ then $\varphi^{-1}(M'')$ is a submanifold of $M$. For any $C^\infty$-mapping $\varphi: M \to M'$ and any submanifold $M'' \subset M'$, we can find a $C^\infty$-mapping $\varphi': M \to M'$ which is transverse to $M''$ and arbitrarily close to $\varphi$ (transversality theorem). Let $M_1$ and $M_2$ be submanifolds of $M'$. Then we say $M_1$ intersects transversely to $M_2$ if the inclusion $M_1 \subset M'$ is transverse to $M_2$.

A $C^\infty$-mapping is called a submersion if it has no critical point. Let $\varphi$ be a submersion from $M$ into $M'$. Then for each point $q \in M'$, $\varphi^{-1}(q)$ is a regular submanifold of $M$, and $M$ is covered by a mutually disjoint family of submanifolds: $M = \bigcup_{q \in M'} \varphi^{-1}(q)$.

Let $M$ be a submanifold of an n-dimensional Euclidean space $\mathbf{R}^n$. We can identify the tangent vector space $T_p$ of $M$ at $p$ with the geometric tangent space of $M$ at $p$ in the Euclidean space $\mathbf{R}^n$. A vector in $\mathbf{R}^n$ that is orthogonal to the tangent vector space $T_p$ of $M$ at $p$ is called a normal vector to $M$ at $p$. The set of all vectors normal to $M$ forms a vector bundle over $M$, which we call the normal vector bundle (or normal bundle) of $M$. If $M$ is compact, then the totality of vectors normal to $M$ whose length is $\le \varepsilon$ (where $\varepsilon$ is a sufficiently small positive real number) forms a neighborhood $N(M)$ of $M$ in $\mathbf{R}^n$ which we call a tubular neighborhood of $M$. $\operatorname{Int} N(M)$ is called an open tubular neighborhood.

M. Vector Fields

Let $N$ be a subset of a $C^\infty$-manifold $M$. By a vector field on $N$ we mean a mapping $X$ that assigns to each point $p$ of $N$ a tangent vector $X_p$ of $M$ at $p$. We can consider $X$ as a cross section over $N$ of the tangent vector bundle of $M$. Let $X$ be a vector field in $M$, and let $f$ be a $C^\infty$-function in $M$. Then we can define a function $Xf$ in $M$ by $(Xf)(p) = X_p f$. We call $X$ a vector field of class $C^r$ if the function $Xf$ is of class $C^r$ for any $C^\infty$-function $f$ in $M$. Let $(x^1, \ldots, x^n)$ be the local coordinate system in a coordinate neighborhood $(U, \psi)$, and let $(\partial/\partial x^i)_p f = (\partial f/\partial x^i)(p)$ for $p \in U$ and $f \in \mathfrak{F}(M)$. Then $\partial/\partial x^i$ ($i = 1, \ldots, n$) are vector fields in $U$, and the $(\partial/\partial x^i)_p$ form a basis of $T_p$ at every point $p \in U$. A vector field $X$ in $U$ is written uniquely as $X_p = \sum_i \xi^i(p)(\partial/\partial x^i)_p$ at each point $p \in U$. Then $\xi^1, \ldots, \xi^n$ are real-valued functions defined in $U$, called the components of $X$ with respect to the local coordinate system $(x^1, \ldots, x^n)$. A vector field $X$ in $M$ is of class $C^r$ if and only if its components $\xi^i$ with respect to each coordinate system are functions of class $C^r$ ($0 \le r \le \infty$). Let $(\bar{x}^1, \ldots, \bar{x}^n)$ be another local coordinate system in a neighborhood $\bar{U}$ of $p$, and let $(\bar{\xi}^1, \ldots, \bar{\xi}^n)$ be the components of $X$ with respect to $(\bar{x}^1, \ldots, \bar{x}^n)$. Then we have $\bar{\xi}^i(q) = \sum_j (\partial \bar{x}^i/\partial x^j)(q)\,\xi^j(q)$ at each point $q \in U \cap \bar{U}$.

For the rest of this article we mean by a vector field in $M$ a vector field of class $C^\infty$, and we denote by $\mathfrak{X}(M)$ the set of all vector fields in $M$. Then $\mathfrak{X}(M)$ is an $\mathfrak{F}(M)$-module, where $\mathfrak{F}(M)$ denotes the algebra of all $C^\infty$-functions in $M$. In fact, for $f, g \in \mathfrak{F}(M)$ and $X, Y \in \mathfrak{X}(M)$, we can define a vector field $fX + gY$ by $(fX + gY)_p = f(p)X_p + g(p)Y_p$, and this defines an $\mathfrak{F}(M)$-module structure in $\mathfrak{X}(M)$.

In a coordinate neighborhood $(U, \psi)$, we can write $X = \sum_i \xi^i(\partial/\partial x^i)$. The right-hand side of this equation is sometimes called the symbol of the vector field $X$. A vector field $X$ can also be interpreted as a linear differential operator that acts on $\mathfrak{F}(M)$.

Let $X$ and $Y$ be vector fields in $M$. Then there exists a unique vector field $Z$ in $M$ such that $Zf = X(Yf) - Y(Xf)$ for any $C^\infty$-function $f$ in $M$. We denote $Z$ by $[X, Y]$ and call it the Poisson bracket (or simply bracket) of $X$ and $Y$. If $\xi^i$ and $\eta^i$ denote the components of $X$ and $Y$, respectively, in a coordinate neighborhood $(U, \psi)$, then the components $\zeta^i$ of $[X, Y]$ are given by $\zeta^i = \sum_k \{\xi^k(\partial \eta^i/\partial x^k) - \eta^k(\partial \xi^i/\partial x^k)\}$. The bracket of vector fields has the following properties: (i) $[X, Y]f = X(Yf) - Y(Xf)$, (ii) $[fX, gY] = fg[X, Y] + f(Xg)Y - g(Yf)X$, (iii) $[X + Y, Z] = [X, Z] + [Y, Z]$, (iv) $[X, Y] = -[Y, X]$, (v) $[[X, Y], Z] + [[Y, Z], X] + [[Z, X], Y] = 0$ (Jacobi identity). These identities show that $\mathfrak{X}(M)$ is a Lie algebra (→ 248 Lie Algebras) over $\mathbf{R}$.

If $\varphi$ is a diffeomorphism of $M$ onto $M'$, then for any vector field $X$ in $M$ we can define a vector field $\varphi_* X$ in $M'$ by the condition $(\varphi_* X)_p = d\varphi_q(X_q)$, $p = \varphi(q)$. Then $\varphi_*$ is an isomorphism of the Lie algebra $\mathfrak{X}(M)$ onto the Lie algebra $\mathfrak{X}(M')$.

N. Vector Fields and One-Parameter Groups of Transformations

A one-parameter group of transformations of $M$ is a family $\varphi_t$ ($t \in \mathbf{R}$) of diffeomorphisms satisfying the following two conditions: (i) the mapping of $\mathbf{R} \times M$ into $M$ defined by $(t, p) \to \varphi_t(p)$ is of class $C^\infty$; and (ii) $\varphi_s \circ \varphi_t = \varphi_{s+t}$ for $s, t \in \mathbf{R}$.

Let $\varphi_t$ be a one-parameter group of transformations of $M$. Then we can define a vector field $X$ by $X_p f = \lim_{t \to 0} (f(\varphi_t(p)) - f(p))/t$, where $p \in M$ and $f \in \mathfrak{F}(M)$. The vector field $X$ thus defined is called the infinitesimal transformation of $\varphi_t$. We also say that $\varphi_t$ is generated by $X$, and sometimes we denote $\varphi_t$ by the symbol $\exp tX$. In this case, if $(x^1, \ldots, x^n)$ is a local coordinate system, then at each point $p$ of the coordinate neighborhood, we have $X_p = \sum_i (\partial x^i(\varphi_t(p))/\partial t)_{t=0}\,(\partial/\partial x^i)_p$.

If $M$ is compact, then every vector field in $M$ is the infinitesimal transformation of a one-parameter group of transformations; that is, every vector field generates a one-parameter group of transformations. For $M$ noncompact, this is not always true. Nevertheless, for each vector field $X$ we have the following result concerning local properties of $X$: For each point $p$ of $M$, there exist a neighborhood $U$ of $p$, a positive real number $\varepsilon$, and a family $\varphi_t$ ($|t| < \varepsilon$) of mappings of $U$ into $M$ satisfying the following three conditions. (1) The mapping of $(-\varepsilon, \varepsilon) \times U$ into $M$ defined by $(t, q) \to \varphi_t(q)$ is of class $C^\infty$, and for each fixed $t$, $\varphi_t$ is a diffeomorphism of $U$ onto an open set $\varphi_t(U)$ of $M$. (2) If $|s|$, $|t|$, and $|s + t|$ are all smaller than $\varepsilon$ and $q$ and $\varphi_t(q)$ both belong to $U$, then
387 105 0
Differentiable Manifolds

cp,(rp,(q))=cp,+,(d. (3) x(q)f=lim,-,(f(<p,(q)) tem (x1, ,x”). If Kj;:::i are the components
-f(q))/t for qe U and fi g(M). We cal1 qt the of K with respect to the local coordinate sys-
local one-parameter group of local transfor- tem (Xi, , X”) in another coordinate neigh-
mations around p generated by X. borhood (U’, $‘) such that U n ci’ # 0, then for
Let X and Y be vector tïelds in M, and let each q E U II u’, the following relations hold:
cpt be the local one-parameter group of local
transformations around p generated by X. z+:j:(q) = 1 k,J3x’l/uxkl), . .(aX”/axk$
Then LX, Yl,=lim,+,(Y,-((cp,), Y)#, where
x (dx’l/&+), .(ax’~/axj~),K:~‘::;~.
(q,), Y is a vector field defïned as follows: Let
U be a neighborhood of the point p where cpt A tensor tïeld K in M is called a tensor field of
(1tI CE) is detïned. Then (<PJ, Y is the vector class C’ (0 < t < CO) if the components are func-
field on <p,(U) that is the image of Y under the tions of class CL for any coordinate neighbor-
diffeomorphism q,. In particular, if X gener- hood of M.
ates a one-parameter group of transforma- The sum K +L and the tensor product
tions of M, then we have [X, Y] = lim,,,( Y- K @3L of two tensor fields K and L are detïned
(&* Y)/t for any vector tïeld Y in M. by the rules (K + L)P = K, + L, and (K @ L),
= K, 0 L,, respectively. The contraction of
two tensor fïelds is also defined by taking the
0. Tensor Fields contraction pointwise.
Let q be a diffeomorphism of M into M’.
Let y(p) be the vector space consisting of a11 Then the differential (d<p), is an isomorphism
r-times contravariant and s-times covariant of T, onto TP (p = <p(q)) for each qc M and
tensors over the tangent vector space TP of a hence induces an isomorphism Qq of the vector
C”-manifold M, that is, space T,‘(q) onto the vector space y(p) (- 256
Linear Spaces). For any tensor iïeld K in M we
where $T_p^*$ denotes the dual linear space of $T_p$ (→ 256 Linear Spaces). A tensor field (more precisely, contravariant of order $r$ and covariant of order $s$, or simply a tensor field of type $(r, s)$) on a subset $N$ of $M$ is a mapping $K$ that assigns to each point $p$ of $N$ an element $K_p$ of the vector space $T^r_s(p)$. In particular, if $r = s = 0$, $K$ is a real-valued function on $N$, and we call $K$ a scalar field. If $r = 1$ and $s = 0$, $K$ is a vector field, called a contravariant vector field. When $r = 0$ and $s = 1$, we call $K$ a covariant vector field (or differential form of degree 1). If $r \neq 0$, $s = 0$ or $r = 0$, $s \neq 0$, we call $K$ a contravariant tensor field of order $r$ or a covariant tensor field of order $s$, respectively. A contravariant or covariant tensor field $K$ is said to be symmetric (alternating) if $K_p$ is a symmetric (alternating) tensor at every point $p$ of $M$.

Let $(x^1, \ldots, x^n)$ be the local coordinate system in a coordinate neighborhood $(U, \psi)$. Then at each point $p$ of $U$, the $(\partial/\partial x^i)_p$ $(i = 1, \ldots, n)$ form a basis of the tangent vector space $T_p$, the differentials $(dx^i)_p$ $(i = 1, \ldots, n)$ form a basis of the dual space $T_p^*$, and these bases are dual to each other. A tensor field $K$ of type $(r, s)$ defined on $M$ is written at any point $p$ of $U$ in the following form:

$$K_p = \sum K^{i_1 \cdots i_r}_{j_1 \cdots j_s}(p)\, (\partial/\partial x^{i_1})_p \otimes \cdots \otimes (\partial/\partial x^{i_r})_p \otimes (dx^{j_1})_p \otimes \cdots \otimes (dx^{j_s})_p.$$

The functions $K^{i_1 \cdots i_r}_{j_1 \cdots j_s}$ defined in $U$ are called the components of the tensor field $K$ of type $(r, s)$ with respect to the local coordinate sys-

can define a tensor field $\tilde{\varphi} K$ in $M'$ by $(\tilde{\varphi} K)_p = \tilde{\varphi}_q(K_q)$, $p = \varphi(q)$, $q \in M$. Then $\tilde{\varphi}(K + L) = \tilde{\varphi} K + \tilde{\varphi} L$, $\tilde{\varphi}(K \otimes L) = \tilde{\varphi} K \otimes \tilde{\varphi} L$, and the mapping $\tilde{\varphi}$ commutes with contraction.

Let $K$ be a tensor field and $X$ be a vector field (both of class $C^\infty$) in $M$. We define a tensor field $L_X K$ by $(L_X K)_p = \lim_{t \to 0} (K_p - (\tilde{\varphi}_t K)_p)/t$, where $\varphi_t$ denotes the local one-parameter group of local transformations around $p$ generated by $X$. We call $L_X K$ the Lie derivative of $K$ with respect to the vector field $X$. The operator $L_X : K \to L_X K$ has the following six properties: (i) $L_X(K + K') = L_X K + L_X K'$; (ii) $L_X(K \otimes K') = (L_X K) \otimes K' + K \otimes L_X K'$; (iii) the operator $L_X$ commutes with contraction; (iv) $L_X f = Xf$ for a scalar field $f$ and $L_X Y = [X, Y]$ for a vector field $Y$; (v) $L_{[X, Y]} = L_X L_Y - L_Y L_X$, that is, $L_{[X, Y]} K = L_X(L_Y K) - L_Y(L_X K)$; and (vi) $K$ is invariant under $\varphi_t$, i.e., $\tilde{\varphi}_t K = K$ for all $t$, if and only if $L_X K = 0$.

Let $K$ be a covariant tensor field of order $r$ in $M$. We always assume that $K$ is of class $C^\infty$. The value $K_p$ of $K$ at $p \in M$ is an element of the vector space $T_p^* \otimes \cdots \otimes T_p^*$ ($r$ times tensor product of $T_p^*$); hence we may consider $K_p$ an $r$-linear mapping of $T_p$ into $\mathbf{R}$ (→ 256 Linear Spaces). If $X_1, \ldots, X_r$ are vector fields in $M$, we define a $C^\infty$-function $K(X_1, \ldots, X_r)$ by $K(X_1, \ldots, X_r)(p) = K_p((X_1)_p, \ldots, (X_r)_p)$. Then the mapping that assigns to each $r$-tuple $(X_1, \ldots, X_r)$ of vector fields the $C^\infty$-function $K(X_1, \ldots, X_r)$ is an $r$-linear mapping on the $\mathfrak{F}(M)$-module $\mathfrak{X}(M)$ consisting of all vector fields of class $C^\infty$ in $M$ into $\mathfrak{F}(M)$; that is, $K(X_1, \ldots, fX_i + gY, \ldots, X_r) = fK(X_1, \ldots, X_i,$

$\ldots, X_r) + gK(X_1, \ldots, Y, \ldots, X_r)$ $(i = 1, \ldots, r)$ for $f, g \in \mathfrak{F}(M)$. Conversely, any $r$-linear mapping of the $\mathfrak{F}(M)$-module $\mathfrak{X}(M)$ into $\mathfrak{F}(M)$ can be interpreted as a covariant tensor field of order $r$ in $M$. If the tensor field $K$ is symmetric (alternating), the corresponding $r$-linear mapping $K(X_1, \ldots, X_r)$ is symmetric (alternating) with respect to $X_1, \ldots, X_r$. For the Lie derivative $L_X K$ of a covariant tensor field $K$ of order $r$, we have the following formula:

$$(L_X K)(X_1, \ldots, X_r) = X(K(X_1, \ldots, X_r)) - \sum_{i=1}^{r} K(X_1, \ldots, [X, X_i], \ldots, X_r).$$

P. Riemannian Metrics

A symmetric covariant tensor field $g$ of order 2 and of class $C^\infty$ in $M$ is called a pseudo-Riemannian metric if the symmetric bilinear form $g_p$ on the tangent vector space $T_p$ is nondegenerate at each point $p \in M$; and $g$ is called a Riemannian metric if $g_p$ is positive definite for all $p$. If $g$ is a Riemannian metric, the length $\|L\|$ of a tangent vector $L \in T_p$ is defined by $\|L\|^2 = g_p(L, L)$. On a paracompact $C^\infty$-manifold there always exists a Riemannian metric. A pair consisting of a differentiable manifold and a Riemannian metric on it is called a Riemannian manifold (→ 364 Riemannian Manifolds).

Q. Differential Forms

An alternating covariant tensor field in $M$ of order $r$ and of class $C^t$ $(0 \le t \le \infty)$ is also called a differential form (or exterior differential form) of degree $r$. A differential form of degree 1 is sometimes called a Pfaffian form. Let $\omega$ be a differential form of degree $r$. Since each alternating covariant tensor of order $r$ at a point $p$ is an element of $\bigwedge^r T_p^*$, the $r$-fold exterior product of $T_p^*$, the form $\omega$ is a mapping that sends each point $p$ of $M$ to an element $\omega_p$ of $\bigwedge^r T_p^*$. We can also regard $\omega$ as an alternating $r$-linear mapping of $\mathfrak{X}(M)$ into $\mathfrak{F}(M)$. Let $(x^1, \ldots, x^n)$ be the local coordinate system in a local coordinate neighborhood $(U, \psi)$. Since $(dx^i)_p$ $(i = 1, \ldots, n)$ is a basis of $T_p^*$ at each point $p$ of $U$, we can express $\omega_p$ $(p \in U)$ uniquely in the form

$$\omega_p = \sum_{i_1 < \cdots < i_r} a_{i_1 \cdots i_r}(p)\, (dx^{i_1})_p \wedge \cdots \wedge (dx^{i_r})_p,$$

where the sum extends over all ordered $r$-tuples $(i_1, \ldots, i_r)$ of indices such that $1 \le i_1 < i_2 < \cdots < i_r \le n$. For an ordered $r$-tuple $(i_1, \ldots, i_r)$ of indices with repeated indices we put $a_{i_1 \cdots i_r} = 0$, and for $(i_1, \ldots, i_r)$ with $r$ distinct indices we put $a_{i_1 \cdots i_r} = (\operatorname{sgn} \sigma)\, a_{j_1 \cdots j_r}$, where $(j_1, \ldots, j_r)$ $(j_1 < \cdots < j_r)$ is a permutation of $(i_1, \ldots, i_r)$ and $\operatorname{sgn} \sigma$ denotes the sign of the permutation $\sigma : i_k \to j_k$ $(k = 1, \ldots, r)$. Then we can write

$$\omega_p = \frac{1}{r!} \sum_{i_1, \ldots, i_r = 1}^{n} a_{i_1 \cdots i_r}(p)\, (dx^{i_1})_p \wedge \cdots \wedge (dx^{i_r})_p.$$

The functions $a_{i_1 \cdots i_r}$ are the components of the tensor field $\omega$, and $\omega$ is of class $C^t$ if these components are of class $C^t$ for any coordinate neighborhood. By the support (or carrier) of a differential form $\omega$ we mean the closure of the subset of $M$ consisting of all $p$ such that $\omega_p \neq 0$. In the rest of this article, differential forms are always of class $C^\infty$, and we denote by $\mathfrak{D}^r(M)$ the real vector space consisting of all differential forms of degree $r$ and of class $C^\infty$. In particular, $\mathfrak{D}^0(M) = \mathfrak{F}(M)$ and $\mathfrak{D}^r(M) = \{0\}$ for $r > n$, $n = \dim M$.

For differential forms we have the following five important operations.

(1) Exterior product. Let $\omega$ and $\theta$ be differential forms of degree $r$ and $s$, respectively. The exterior product $\omega \wedge \theta$ of $\omega$ and $\theta$ is the differential form of degree $r + s$ defined by $(\omega \wedge \theta)_p = \omega_p \wedge \theta_p$, $p \in M$. Let $X_1, \ldots, X_{r+s}$ be $r + s$ vector fields in $M$. Then we have

$$(\omega \wedge \theta)(X_1, \ldots, X_{r+s}) = \sum \operatorname{sgn}(i; j)\, \omega(X_{i_1}, \ldots, X_{i_r})\, \theta(X_{j_1}, \ldots, X_{j_s}),$$

where the summation runs over all possible partitions of $(1, 2, \ldots, r + s)$ such that $i_1 < i_2 < \cdots < i_r$ and $j_1 < j_2 < \cdots < j_s$, and $\operatorname{sgn}(i; j)$ means the sign of the permutation $(1, 2, \ldots, r + s) \to (i_1, \ldots, i_r, j_1, \ldots, j_s)$. In particular, if $\omega_1, \ldots, \omega_r$ are differential forms of degree 1, then we have $(\omega_1 \wedge \cdots \wedge \omega_r)(X_1, \ldots, X_r) = \det(\omega_i(X_j))$.

(2) Exterior differentiation. Let $\omega$ be a differential form of degree $r$, and let $\omega = (1/r!) \sum a_{i_1 \cdots i_r}\, dx^{i_1} \wedge \cdots \wedge dx^{i_r}$ in a coordinate neighborhood $(U, \psi)$. Then we can define a differential form $d\omega$ of degree $r + 1$ by the condition $d\omega = (1/r!) \sum da_{i_1 \cdots i_r} \wedge dx^{i_1} \wedge \cdots \wedge dx^{i_r}$ in $U$. The differential form $d\omega$ is called the exterior derivative (or exterior differential) of $\omega$. A differential form $\omega$ satisfying the condition $d\omega = 0$ is called a closed differential form, and a differential form $\eta$ that can be expressed as $\eta = d\omega$ for some $\omega$ is called an exact differential form. If $a, b \in \mathbf{R}$ and $\omega, \omega' \in \mathfrak{D}^r(M)$, then we have $d(a\omega + b\omega') = a\,d\omega + b\,d\omega'$. Therefore the set $\mathfrak{Z}^r(M)$ of all closed differential forms of degree $r$ and the set $\mathfrak{B}^r(M)$ of all exact differential forms of degree $r$ are linear subspaces of $\mathfrak{D}^r(M)$.

For the exterior derivative $d\omega$, we have the following formula, in which the variables under the sign $\hat{\ }$ are omitted:

$$d\omega(X_1, \ldots, X_{r+1}) = \sum_{i=1}^{r+1} (-1)^{i+1} X_i\bigl(\omega(X_1, \ldots, \hat{X}_i, \ldots, X_{r+1})\bigr) + \sum_{i<j} (-1)^{i+j}\, \omega([X_i, X_j], X_1, \ldots, \hat{X}_i, \ldots, \hat{X}_j, \ldots, X_{r+1}).$$
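As an illustrative check, the formula for $d\omega$ can be specialized to a differential form of degree 1. The sketch below assumes $n = 2$, coordinates $(x^1, x^2)$, and the coordinate vector fields $\partial_1, \partial_2$; these choices are made here only for illustration. It compares the invariant formula with the coordinate definition of $d\omega$, using the normalization $(\omega_1 \wedge \omega_2)(X_1, X_2) = \det(\omega_i(X_j))$.

```latex
% Case r = 1 of the formula for d\omega:
\[
  d\omega(X_1, X_2) = X_1\,\omega(X_2) - X_2\,\omega(X_1) - \omega([X_1, X_2]).
\]
% Illustrative assumption: n = 2, \omega = a_1\,dx^1 + a_2\,dx^2,
% X_1 = \partial_1 = \partial/\partial x^1, X_2 = \partial_2, hence [\partial_1, \partial_2] = 0.
\[
  d\omega(\partial_1, \partial_2) = \partial_1 a_2 - \partial_2 a_1 .
\]
% Coordinate definition: d\omega = da_1 \wedge dx^1 + da_2 \wedge dx^2.
\[
  d\omega = (\partial_2 a_1)\,dx^2 \wedge dx^1 + (\partial_1 a_2)\,dx^1 \wedge dx^2
          = (\partial_1 a_2 - \partial_2 a_1)\,dx^1 \wedge dx^2 .
\]
% Since (dx^1 \wedge dx^2)(\partial_1, \partial_2) = \det(dx^i(\partial_j)) = 1,
% evaluating the coordinate expression on (\partial_1, \partial_2) gives the same value.
```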



Here the variables under the sign $\hat{\ }$ are to be omitted.

(3) Interior product with vector field. Let $\omega$ be a differential form of degree $r$ and $X$ a vector field. When $r \ge 1$, we can define a differential form $\iota(X)\omega$ of degree $r - 1$ by the formula $(\iota(X)\omega)(X_1, \ldots, X_{r-1}) = \omega(X, X_1, \ldots, X_{r-1})$ for any $r - 1$ vector fields $X_1, \ldots, X_{r-1}$; if $r = 0$, we put $\iota(X)\omega = 0$. The differential form $\iota(X)\omega$ is called the interior product of $\omega$ with $X$.

(4) Lie derivative. The Lie derivative $L_X\omega$ of a differential form of degree $r$ with respect to a vector field $X$ is a differential form of the same degree. For any $r$ vector fields $X_1, \ldots, X_r$ we have, by definition, $(L_X\omega)(X_1, \ldots, X_r) = X(\omega(X_1, \ldots, X_r)) - \sum_{i=1}^{r} \omega(X_1, \ldots, [X, X_i], \ldots, X_r)$.

(5) Let $\varphi$ be a $C^\infty$-mapping of $M$ into $M'$, and let $\varphi_p^* : T_{\varphi(p)}^* \to T_p^*$ be the transpose (or dual) of the linear mapping $(d\varphi)_p : T_p \to T_{\varphi(p)}$ for $p \in M$, i.e., the mapping defined by the condition $((d\varphi)_p L, \alpha) = (L, \varphi_p^*\alpha)$ for each $L \in T_p$, $\alpha \in T_{\varphi(p)}^*$. We denote the linear mapping of $\bigwedge T_{\varphi(p)}^*$ into $\bigwedge T_p^*$ induced by $\varphi_p^*$ by $\varphi_p^*$ also. Let $\omega$ be a differential form of degree $r$ in $M'$. Then a differential form $\varphi^*\omega$ in $M$, the pullback by $\varphi$ of $\omega$, is defined by $(\varphi^*\omega)_p = \varphi_p^*\,\omega_{\varphi(p)}$, $p \in M$.

The operations defined previously satisfy the following six important relations: (i) $d^2 = 0$, that is, $d(d\omega) = 0$; (ii) $d(\omega \wedge \theta) = d\omega \wedge \theta + (-1)^r \omega \wedge d\theta$, where $\omega$ is of degree $r$; (iii) $\varphi^*(\omega \wedge \theta) = \varphi^*\omega \wedge \varphi^*\theta$, $\varphi^*(d\omega) = d(\varphi^*\omega)$; (iv) $L_X(\omega \wedge \theta) = (L_X\omega) \wedge \theta + \omega \wedge (L_X\theta)$; (v) $L_X = \iota(X) \circ d + d \circ \iota(X)$, $L_X(d\omega) = d(L_X\omega)$; and (vi) $L_{[X, Y]} = L_X \circ L_Y - L_Y \circ L_X$, $\iota([X, Y]) = L_X \circ \iota(Y) - \iota(Y) \circ L_X$.

R. De Rham Cohomology

Let $\mathfrak{D}(M) = \sum_{r=0}^{n} \mathfrak{D}^r(M)$, where $n = \dim M$. Then $\mathfrak{D}(M)$ is a cochain complex with coboundary operator $d$. We denote by $H^r(\mathfrak{D})$ the $r$-dimensional cohomology group of this cochain complex, and we call it the $r$-dimensional de Rham cohomology group of the differentiable manifold $M$. If we denote by $\mathfrak{Z}^r(M)$ and $\mathfrak{B}^r(M)$ the subspaces of $\mathfrak{D}^r(M)$ consisting of closed differential forms and exact differential forms, respectively, then $H^r(\mathfrak{D}) = \mathfrak{Z}^r(M)/\mathfrak{B}^r(M)$ $(0 \le r \le n)$ by definition. If $\omega \in \mathfrak{Z}^r(M)$ and $\theta \in \mathfrak{Z}^s(M)$, then $\omega \wedge \theta \in \mathfrak{Z}^{r+s}(M)$, and if $\omega \in \mathfrak{Z}^r(M)$ and $\theta \in \mathfrak{B}^s(M)$ (or $\omega \in \mathfrak{B}^r(M)$ and $\theta \in \mathfrak{Z}^s(M)$), then $\omega \wedge \theta \in \mathfrak{B}^{r+s}(M)$. So if we put $H(\mathfrak{D}) = \sum_{r=0}^{n} H^r(\mathfrak{D})$ (direct sum), we can define a product in $H(\mathfrak{D})$ by $[\omega][\theta] = [\omega \wedge \theta]$ for each $[\omega] \in H^r(\mathfrak{D})$, $[\theta] \in H^s(\mathfrak{D})$. With respect to this product, $H(\mathfrak{D})$ forms an algebra over $\mathbf{R}$ called the de Rham cohomology ring of $M$.

S. Partitions of Unity

Let $M$ be a paracompact $C^\infty$-manifold, and let $\{V_i\}_{i \in I}$ be a locally finite open covering of $M$ such that the closure of $V_i$ is compact for each index $i$. Then there exists a $C^\infty$-function $f_i$ in $M$ for each $i$ satisfying the following three conditions: (i) $0 \le f_i \le 1$, (ii) the support of $f_i$ is contained in $V_i$, and (iii) $\sum_i f_i(x) = 1$ for every $x \in M$. The family of $C^\infty$-functions $f_i$, $i \in I$, is called a partition of unity of class $C^\infty$ subordinate to the open covering $\{V_i\}_{i \in I}$.

T. Integrals of Differential Forms

Integrals over an Oriented Manifold. Let $M$ be an $n$-dimensional paracompact oriented $C^\infty$-manifold and $\omega$ a differential form of degree $n$ in $M$ with compact support. We can choose a positively oriented atlas $S = \{(U_\alpha, \psi_\alpha)\}_{\alpha \in A}$ such that $\{U_\alpha\}_{\alpha \in A}$ is a locally finite covering and $\bar{U}_\alpha$ is compact for each $\alpha$. Suppose first that the support of $\omega$ is contained in $U_\alpha$ for some index $\alpha$. Then $(\psi_\alpha^{-1})^*\omega$ is a differential form of degree $n$ in $\psi_\alpha(U_\alpha)$, and we can express $(\psi_\alpha^{-1})^*\omega$ in the form $a\,dx^1 \wedge \cdots \wedge dx^n$, where $(x^1, \ldots, x^n)$ are the coordinates in $\mathbf{R}^n$ such that $x^i \circ \psi_\alpha$ $(i = 1, \ldots, n)$ give a local coordinate system compatible with the orientation of $M$ and $a$ is a $C^\infty$-function with compact support. Then we define the integral of $\omega$ over $M$ by

$$\int_M \omega = \int_{\psi_\alpha(U_\alpha)} a\,dx^1 \cdots dx^n.$$

For the general case let $\{f_\alpha\}_{\alpha \in A}$ be a partition of unity of class $C^\infty$ subordinate to $\{U_\alpha\}_{\alpha \in A}$. Then the support of $f_\alpha\omega$ is contained in $U_\alpha$, and except for a finite number of the indices $\alpha$, $f_\alpha\omega$ vanishes identically. Therefore we may define the integral of $\omega$ over $M$ by

$$\int_M \omega = \sum_\alpha \int_M f_\alpha\omega,$$

and we can show that this definition of the integral is independent of the choice of oriented atlas $S$ and of a partition of unity subordinate to $S$.

Integrals over a Singular Chain. We fix rectangular coordinates in $\mathbf{R}^r$. Let $d_0$ be the origin and $d_i$ be the unit point on the $i$th coordinate axis. Let $S^r$ denote the oriented $r$-simplex $(d_0, d_1, \ldots, d_r)$ with vertices $d_0, d_1, \ldots, d_r$. When we regard $S^r$ as a point set, we denote it by $|S^r|$. An oriented singular $r$-simplex of class $C^\infty$ in $M$ is, by definition, a pair $(S^r, \varphi)$ consisting of $S^r$ and a $C^\infty$-mapping $\varphi$ of an open neighborhood of $|S^r|$ into $M$. An element of the free $\mathbf{Z}$-module generated by singular $r$-simplexes of class $C^\infty$ is called an integral singular $r$-chain

of class $C^\infty$ in $M$. We define a real singular $r$-chain of class $C^\infty$ analogously. Let $\omega$ be a differential form of degree $r$ and $(S^r, \varphi)$ be an oriented singular $r$-simplex of class $C^\infty$ in $M$. Then $\varphi^*\omega$ is a differential form of degree $r$ in a neighborhood of $|S^r|$, and we can express $\varphi^*\omega$ in the form $\varphi^*\omega = a\,dx^1 \wedge dx^2 \wedge \cdots \wedge dx^r$. We define the integral of $\omega$ over $(S^r, \varphi)$ by

$$\int_{(S^r, \varphi)} \omega = \int_{|S^r|} a\,dx^1 \cdots dx^r,$$

and the integral of $\omega$ over a singular $r$-chain $C$ of class $C^\infty$ by

$$\int_C \omega = \sum_i m_i \int_{(S^r, \varphi_i)} \omega,$$

where $C = \sum_i m_i (S^r, \varphi_i)$, $m_i \in \mathbf{Z}$ (or $m_i \in \mathbf{R}$). When $r = 0$, then $\omega$ is a function in $M$, and $S^0$ is a point $d_0$. In this case we put $\int_C \omega = \sum_i m_i\,\omega(\varphi_i(d_0))$.

Let $C_r(S, \mathbf{Z})$ ($C_r(S, \mathbf{R})$) be the $\mathbf{Z}$-module (vector space over $\mathbf{R}$) of integral (real) singular $r$-chains of class $C^\infty$ in $M$, and let $\omega(C)$ be the value of the integral of $\omega$ over a chain $C$. Then $\omega$ is a linear function in the vector space $C_r(S, \mathbf{R})$, and hence we can consider $\omega$ a singular $r$-cochain of class $C^\infty$.

U. Stokes's Formulas

(1) Let $D$ be a domain in an $n$-dimensional $C^\infty$-manifold $M$, and let $\partial D$ and $\bar{D}$ be the boundary and the closure of $D$, respectively. Let $S = \{(U_\alpha, \psi_\alpha)\}_{\alpha \in A}$ be an atlas of class $C^\infty$ of $M$, $U'_\alpha = U_\alpha \cap \bar{D}$, $\psi'_\alpha$ be the restriction of $\psi_\alpha$ to $U'_\alpha$, and $T = \{(U'_\alpha, \psi'_\alpha)\}_{\alpha \in A}$. If the pair $(\bar{D}, T)$ is a $C^\infty$-manifold with boundary under a suitable choice of $S$, then the domain $D$ is called a domain with regular (or smooth) boundary. The boundary $\partial D$ of $(\bar{D}, T)$ is then an $(n - 1)$-dimensional closed submanifold of $M$, and if $M$ is orientable, $\partial D$ is also orientable. Now let $M$ be a paracompact and oriented manifold and $D$ be a domain with regular boundary. Let $C$ be a characteristic function of $D$ in $M$, i.e., a function defined by the condition $C(p) = 1$ for $p \in D$ and $C(p) = 0$ for $p \notin D$. Let $\theta$ be a differential form of degree $n$ in $M$ with compact support. We define the integral of $\theta$ over $D$ by

$$\int_D \theta = \int_M C \cdot \theta.$$

Let $\omega$ be a differential form of degree $n - 1$ in $M$ with compact support. We then have Stokes's formula:

$$\int_D d\omega = \int_{\partial D} i^*\omega,$$

where $i$ denotes the identity mapping of the submanifold $\partial D$ into $M$ with $\partial D$ having the orientation induced naturally from that of $M$.

(2) Let $C$ be a singular $r$-chain of class $C^\infty$ in $M$, and let $\partial C$ be the boundary of $C$. Then for any differential form $\omega$ of degree $r - 1$, we have

$$\int_C d\omega = \int_{\partial C} \omega.$$

This formula is also called Stokes's formula.

V. De Rham's Theorem

Let $M$ be a connected paracompact $C^\infty$-manifold. If we consider $\omega$ as a singular cochain, we have $(d\omega)(C) = \omega(\partial C)$; by Stokes's formula, this means that the exterior differential $d\omega$ of $\omega$ is equal to the coboundary of the singular cochain $\omega$. Let $\omega$ and $C$ be a closed differential form of degree $r$ and a singular $r$-cycle of class $C^\infty$, respectively, and let $[\omega]$ and $[C]$ be the de Rham cohomology class and the singular homology class represented by $\omega$ and $C$. Using Stokes's formula, we can define the inner product $([\omega], [C])$ by

$$([\omega], [C]) = \int_C \omega.$$

Through this inner product, it follows that the de Rham cohomology group $H^r(\mathfrak{D})$ is isomorphic to the $r$th singular cohomology group $H^r(M, \mathbf{R})$, the dual space of the $r$th homology group of the complex of real singular chains of class $C^\infty$. Moreover, the de Rham cohomology ring $H(\mathfrak{D})$ is isomorphic to the singular cohomology ring $H^*(M, \mathbf{R})$ (de Rham's theorem).

W. Divergence of a Vector Field

Let $M$ be an $n$-dimensional oriented $C^\infty$-manifold, and let $S$ be an oriented atlas. Let $\omega$ be a differential form of degree $n$, and let $(x^1, \ldots, x^n)$ be the local coordinate system in a coordinate neighborhood in $S$. Then we can express $\omega$ in the coordinate neighborhood uniquely in the form $\omega = a\,dx^1 \wedge \cdots \wedge dx^n$. If the function $a$ is positive for any coordinate neighborhood in $S$, we call $\omega$ a volume element of $M$. In a paracompact oriented manifold, there always exists a volume element. (We remark that an $n$-dimensional differentiable manifold $M$ is orientable if and only if there exists an everywhere nonvanishing differential form of degree $n$.) Let $f$ be a $C^\infty$-function in $M$ with compact support. Then $f \cdot \omega$ is a differential form of degree $n$ with compact support, and so the integral

$$\int_M f \cdot \omega$$

is defined. We call this integral the integral of the function $f$ with respect to the volume element $\omega$.

Let $g$ be a Riemannian metric in $M$ and $g_{ij}$ the components of $g$ with respect to the local coordinate system $(x^1, \ldots, x^n)$ as before. Then we can define a volume element $\omega$ in $M$ by putting $\omega = \sqrt{G}\,dx^1 \wedge \cdots \wedge dx^n$, $G = \det(g_{ij})$ in each coordinate neighborhood. The volume element thus defined is called the volume element associated with the Riemannian metric $g$.

Let $\omega$ be a volume element and $X$ a vector field in $M$. Then the Lie derivative $L_X\omega$ is also a differential form of degree $n$, and we can express $L_X\omega$ in the form $L_X\omega = f_X \cdot \omega$, where $f_X$ is a scalar field, i.e., a function in $M$. We call $f_X$ the divergence of the vector field $X$ with respect to the volume element $\omega$ and denote it by $\operatorname{div} X$. If $\omega$ is associated with a Riemannian metric, then $\operatorname{div} X$ is called the divergence of $X$ with respect to the Riemannian metric.

If $M$ is compact, we have

$$\int_M \operatorname{div} X \cdot \omega = 0$$

for any vector field $X$. This result is known as Green's theorem.

X. Jets

Let $M$ and $N$ be $C^\infty$-manifolds. We define an equivalence relation in the set of all $C^\infty$-mappings of $M$ into $N$. Let $f$ and $g$ be such mappings and $p$ be a point of $M$. Choosing local coordinate systems, we write $f(p) = (f_1(x), \ldots, f_n(x))$, $g(p) = (g_1(x), \ldots, g_n(x))$, $x = (x_1, \ldots, x_m)$. We say $f$ and $g$ are equivalent at $p$ if $f(p)$, $g(p)$, and the values at $p$ of all the partial derivatives of $f_i$ and $g_i$ up to the order $r$ ($r$ an integer, $r \ge 0$) are equal $(i = 1, \ldots, n)$. An equivalence class with respect to this equivalence relation is called a jet of order $r$ at $p$. A jet of order $r$ at $p$ represented by a function $f$ is denoted by $j_p^r f$, and the points $p$ and $f(p)$ are called the source and the target of the jet $j_p^r f$, respectively. We denote by $J_p^r(M, N)$ the set of all jets of order $r$ with source at $p$ and target in $N$ and let $J^r(M, N) = \bigcup_{p \in M} J_p^r(M, N)$. For any jet $j$, let $\pi_s(j)$ and $\pi_t(j)$ denote the source and the target of $j$, respectively. We can introduce the structure of a $C^\infty$-manifold in $J^r(M, N)$ in a natural way such that the projections $\pi_s : J^r(M, N) \to M$ and $\pi_t : J^r(M, N) \to N$ are both of class $C^\infty$ and $J^r(M, N)$ is a fiber bundle over $M$ ($N$) with projection $\pi_s$ ($\pi_t$). As examples, we have:

(1) $J_0^1(\mathbf{R}, N)$ is identified with the tangent vector bundle of $N$.

(2) The set $J^r(\xi)$ of all jets of order $r$ determined by the sections of a vector bundle $\xi$ of class $C^\infty$ is also a vector bundle of class $C^\infty$.

Let $f : M \to N$ and $g : N \to L$ be $C^\infty$-mappings. We define a composition of jets by $j_{f(p)}^r g \cdot j_p^r f = j_p^r(g \circ f)$. A jet $j_p^r f \in J^r(M, N)$ is invertible if there exists a mapping $g : N \to M$ such that $j_{f(p)}^r g \cdot j_p^r f = j_p^r(1_M)$, where $1_M$ denotes the identity mapping of $M$ onto itself. We denote by $I^r(M, N)$ the set of all invertible jets in $J^r(M, N)$, and put $I_p^r(M, N) = I^r(M, N) \cap J_p^r(M, N)$.

(3) Let $G^r(n)$ be the set of all invertible jets in $I^r(\mathbf{R}^n, \mathbf{R}^n)$ whose source and target are the origin of $\mathbf{R}^n$. Then with respect to the composition of jets, $G^r(n)$ forms a Lie group that is an extension of $G^1(n) = GL(n, \mathbf{R})$ by a simply connected nilpotent Lie group. The projection $G^r(n) \to G^1(n)$ is a special example of the natural projection $J^r(M, N) \to J^s(M, N)$ $(r \ge s)$, which is defined in general.

(4) We can identify $I_0^1(\mathbf{R}^m, M)$ $(m = \dim M)$ with the tangent $m$-frame bundle over $M$. More generally, $I_0^r(\mathbf{R}^m, M)$ is a $G^r(m)$-bundle over $M$.

Y. Pseudogroup Structure

Let $X$ be a topological space, and let $\Gamma$ be a set consisting of homeomorphisms $f : U_f \to V_f$, where $U_f$, $V_f$ are open subsets of $X$. We call $\Gamma$ a pseudogroup of topological transformations if $\Gamma$ satisfies the following four conditions: (i) $\Gamma$ contains the identity mapping of $X$ onto $X$; (ii) if $f \in \Gamma$, then the restriction of $f$ onto any open subset $U$ of $U_f$ is also contained in $\Gamma$; (iii) if $f$ and $g$ are in $\Gamma$ and $V_f \subset U_g$, then $g \circ f$ is contained in $\Gamma$; and (iv) if $f \in \Gamma$, then $f^{-1} : V_f \to U_f$ is also in $\Gamma$.

Following the definition of differentiable manifolds we define a pseudogroup structure of $M$ (or, more precisely, a $\Gamma$-structure of $M$) as a set $A$ of bijections, with each member $\alpha$ defined on a subset $U_\alpha$ of $M$ onto an open set $V_\alpha$ of $X$, satisfying the following three conditions: (i) $\bigcup_\alpha U_\alpha = M$; (ii) if $\alpha, \beta \in A$, then $\alpha \circ \beta^{-1} \in \Gamma$, where the domain of definition of $\alpha \circ \beta^{-1}$ is $\beta(U_\alpha \cap U_\beta)$; and (iii) $A$ is the maximal set of bijections that satisfies conditions (i) and (ii). We introduce in $M$ the weakest (coarsest) topology such that every bijection $\alpha$ is a homeomorphism. If $\Gamma'$ is a pseudogroup of $X$ containing $\Gamma$ and $A'$ is a $\Gamma'$-structure of $M$ such that $A \subset A'$, then we say that $A'$ is subordinate to $A$. If $X = \mathbf{R}^n$ (or $H^n$, a half-space of $\mathbf{R}^n$), $\Gamma$ is the totality of diffeomorphisms of class $C^r$ of open sets of $X$ onto open sets of $X$, and $M$ is a space with Hausdorff topology, then the $\Gamma$-structure is the $C^r$-structure with or without boundary which we have already defined. We give three examples of $\Gamma$-structures subordinate to $\Gamma^r$, where $\Gamma^r$ is the totality of local transformations of class $C^r$ $(r \ge 1)$ in $\mathbf{R}^n$.

(1) When $n$ is even, we identify $\mathbf{R}^n$ with $\mathbf{C}^{n/2}$ and denote the totality of holomorphic transformations of connected open domains by $\Gamma$. The $\Gamma$-structure in this case is called a complex structure.

(2) When $n$ is odd, we define $\Gamma$ as the totality of transformations of connected open domains in $\mathbf{R}^n$ that leave invariant a Pfaffian form $\sum_{i=1}^{m} x_i\,dx_{m+i} + dx_{2m+1}$ $(n = 2m + 1)$ up to scalar factors. The $\Gamma$-structure in this case is called a contact structure.

(3) We consider $\mathbf{R}^n = \mathbf{R}^p \times \mathbf{R}^{n-p}$ and define $\Gamma$ as the set of all diffeomorphisms $U \to V$ (where $U$, $V$ are open in $\mathbf{R}^n$) such that each set of the form $U \cap (\mathbf{R}^p \times \{y\})$ is mapped onto a set of the form $V \cap (\mathbf{R}^p \times \{y'\})$. The $\Gamma$-structure in this case is called a foliated structure.

The problem of determining whether there exists a $\Gamma$-structure for given $\Gamma$ and $M$ involves widely ranging problems of topology and analysis. The classification of $\Gamma$ with reasonable conditions is another important open problem.

Haefliger has constructed the classifying space $B\Gamma$ for $\Gamma$-structures.

Z. Infinite-Dimensional Manifolds

Let $B$ and $B'$ be Banach spaces and $\varphi$ be a mapping from an open set of $B$ to $B'$. Then, using the notion of Fréchet derivatives, we can define $\varphi$ to be smooth. For a smooth mapping $\varphi$, the Jacobian $J(\varphi)$ at each point $x$ is a linear mapping from $B$ to $B'$, for which the inverse function theorem holds true as a straightforward extension of the corresponding one in the finite-dimensional case. A similar extension also holds for the existence and uniqueness theorems for solutions of ordinary differential equations with values in $B$. These facts permit us to generalize the notion of differentiable manifolds to infinite-dimensional ones. Actually, infinite-dimensional manifolds can be defined in a way similar to the finite-dimensional case, taking open sets of a certain Banach space as local coordinate neighborhoods. Such a manifold is called a Banach manifold. Various formal definitions of differentiable manifolds can also be stated for Banach manifolds in extended form. However, while differentiable manifolds are locally compact and admit a partition of unity by smooth functions subordinate to a locally finite open covering, Banach manifolds generally lack these properties. Actually, local compactness gives a criterion for whether a manifold is finite-dimensional or not. Banach manifolds often provide a basic functional-analytic view to nonlinear analysis and global analysis.

The most important category of Banach manifolds is provided by the Hilbert manifold, that is, a Banach manifold whose local coordinates are modeled on a Hilbert space. For a Hilbert manifold, a partition of unity subordinate to a locally finite open covering can be taken from smooth functions. In the following, we refer to a separable Hilbert manifold simply as a Hilbert manifold. An infinite-dimensional Hilbert manifold $M$ can be smoothly embedded as an open set of a Hilbert space and thus covered by a single coordinate neighborhood. Hence, in particular, the tangent bundle of $M$ is trivial. Historically, this fact was the first instance showing that the distinguishing properties are shared by infinite-dimensional manifolds; this was first recognized as a consequence of the theorem stating that the unitary group of an infinite-dimensional Hilbert space is contractible. Two infinite-dimensional Hilbert manifolds are diffeomorphic if and only if they are homotopically equivalent. A typical example of a Hilbert manifold is provided by the space of $L^2$-loops on a compact Riemannian manifold. Morse theory can be extended in a suitable way to a Riemannian Hilbert manifold under the Palais-Smale condition, which makes it possible for the integral curve of $\operatorname{grad} f$ to tend to a critical point, where $f$ is a Morse function.

AA. Gel'fand-Fuks Cohomology

The space $\mathfrak{X}(M)$ consisting of all the smooth vector fields on a smooth manifold $M$ has the structure of a Lie algebra under the bracket operation $[X, Y] = XY - YX$, where the vector fields $X$ and $Y$ are regarded as derivations on the algebra $C^\infty(M)$ of smooth functions on $M$. $\mathfrak{X}(M)$ is a topological Lie algebra when endowed with the topology defined by uniform convergence of the components of vector fields and all their partial derivatives on each compact set of $M$. When $\mathfrak{X}(M)$ acts continuously on a topological vector space $V$, the continuous cohomology $H^*(\mathfrak{X}(M), V)$ of $\mathfrak{X}(M)$ with coefficients in $V$ is the cohomology of the cochain complex $\bigoplus \{C^p = C^p(\mathfrak{X}(M), V), d\}$. Here $C^0 = V$, $C^p$ $(p \ge 1)$ is the space of all the alternating $p$-linear continuous mappings $\varphi$ of $\mathfrak{X}(M) \times \cdots \times \mathfrak{X}(M)$ ($p$ times) into $V$, and $d : C^p \to C^{p+1}$ is defined for $\varphi \in C^p$ and $X_i \in \mathfrak{X}(M)$ by $d\varphi(X_1) = X_1\varphi$ $(p = 0)$ and

$$d\varphi(X_1, \ldots, X_{p+1}) = \sum_{i<j} (-1)^{i+j} \varphi([X_i, X_j], X_1, \ldots, \hat{X}_i, \ldots, \hat{X}_j, \ldots, X_{p+1}) + \sum_i (-1)^{i+1} X_i\,\varphi(X_1, \ldots, \hat{X}_i, \ldots, X_{p+1}) \quad (p \ge 1).$$

When $V$ is a topological algebra and the elements of $\mathfrak{X}(M)$ act on $V$ as derivations, the exterior multiplication of cochains induces

a graded algebra structure in $H^*(\mathfrak{X}(M), V)$. When $V = \mathbf{R}$ is the trivial $\mathfrak{X}(M)$-module, then $H^*(\mathfrak{X}(M)) = H^*(\mathfrak{X}(M), \mathbf{R})$ is called the Gel'fand-Fuks cohomology of $M$. Gel'fand and Fuks proved that, for any compact oriented manifold $M$, we have $\dim H^p(\mathfrak{X}(M)) < +\infty$ for all $p$ and $H^p(\mathfrak{X}(M)) = 0$ for $1 \le p \le n$ $(n = \dim M)$. For example, if $M$ is the circle $S^1$, then the algebra $H^*(\mathfrak{X}(S^1))$ is generated by two generators $\alpha \in H^2$, $\beta \in H^3$, which are explicitly described as cochains in the following way:

[...]

Localization of the concept of Gel'fand-Fuks cohomology naturally yields the cohomology of formal vector fields. Here a formal vector field means the expression $\sum_{p=1}^{n} f_p(x_1, \ldots, x_n)\,\partial/\partial x_p$, $f_p$ being formal power series in $x_1, \ldots, x_n$. The set of all the formal vector fields forms a Lie algebra $\mathfrak{a}_n$, and the continuous cohomology of $\mathfrak{a}_n$ with respect to the Krull topology is denoted by $H^*(\mathfrak{a}_n)$. Let $B_U$ be the universal classifying space of the unitary group $U(n)$, let $(B_U)_{2n}$ be its $2n$-skeleton, and let $P_{2n}$ be the canonical principal $U(n)$-bundle restricted to $(B_U)_{2n}$. Then there is an algebra isomorphism $H^*(\mathfrak{a}_n) \cong H^*(P_{2n}; \mathbf{R})$. This cohomology and its variants play important roles in the theory of foliations.

An important subcomplex $\bigoplus \{C_\Delta^p, d\}$ of $\bigoplus \{C^p, d\}$, the diagonal complex, is defined as $C_\Delta^p = \{\varphi \in C^p \mid \varphi(X_1, \ldots, X_p) = 0 \text{ if } \operatorname{supp} X_1 \cap \cdots \cap \operatorname{supp} X_p = \emptyset\}$. Here, $\operatorname{supp} X_i$ denotes the support of $X_i$, that is, $\overline{\{x \mid X_i(x) \neq 0\}}$. Let $P_M$ be the principal $U(n)$-bundle associated with the complexified tangent bundle of $M$. $U(n)$ acts freely on the product $P_M \times P_{2n}$, and the quotient space $E_M = (P_M \times P_{2n})/U(n)$ is a fiber bundle over $M$ with fiber $P_{2n}$. Then, if $M$ is a compact oriented manifold, the cohomology $H_\Delta^*(\mathfrak{X}(M))$ of the diagonal complex is completely determined by the isomorphisms $H_\Delta^p(\mathfrak{X}(M)) \cong H^{p+n}(E_M; \mathbf{R})$ for all $p$. In particular, if all the Pontryagin classes of $M$ vanish, then $H_\Delta^p(\mathfrak{X}(M)) = \sum_{i+j=p+n} H^i(M; \mathbf{R}) \otimes H^j(\mathfrak{a}_n)$.

The Gel'fand-Fuks cohomology has a topological interpretation: $H^*(\mathfrak{X}(M)) \cong H^*(\Gamma(E_M), \mathbf{R})$ as graded algebras, where $\Gamma(E_M)$ denotes the space of all the continuous cross-sections of $E_M \to M$ with the compact open topology. Moreover the differential graded algebra $C^*(\mathfrak{X}(M))$ has a model in the sense of Sullivan constructed purely algebraically from a model of the de Rham algebra of $M$ and the Pontryagin classes of $M$. This model shows in particular that $H^*(\mathfrak{X}(M))$ is not necessarily finitely generated as a graded algebra.

The cohomology theory of $\mathfrak{X}(M)$ in the case where the representation is nontrivial has also been investigated. The natural representation on $C^\infty(M)$ is a typical example. There is also a topological interpretation: $H^*(\mathfrak{X}(M), C^\infty(M)) \cong H^*(Y_M, \mathbf{R})$, where $Y_M$ is the fiber product of the evaluation mapping $M \times \Gamma(E_M) \to E_M$ and the inclusion $P_M \subset E_M$ corresponding to a fiber inclusion $U(n) \subset P_{2n}$.

References

[1] L. Auslander, Differential geometry, Harper, 1967.
[2] R. L. Bishop and S. I. Goldberg, Tensor analysis on manifolds, Macmillan, 1968.
[3] N. Bourbaki, Variétés différentielles et analytiques, Hermann, 1967.
[4] E. Cartan, Les systèmes différentiels extérieurs et leurs applications géométriques, Actualités Sci. Ind., Hermann, 1945.
[5] H. Cartan, Formes différentielles, Hermann, 1967.
[6] C. Chevalley, Theory of Lie groups I, Princeton Univ. Press, 1946.
[7] G. de Rham, Variétés différentiables, Actualités Sci. Ind., Hermann, 1955.
[8] S. Kobayashi and K. Nomizu, Foundations of differential geometry, Interscience, I, 1963; II, 1969.
[9] S. Lang, Introduction to differentiable manifolds, Interscience, 1962.
[10] Y. Matsushima, Differentiable manifolds, Dekker, 1972.
[11] J. R. Munkres, Elementary differential topology, Ann. Math. Studies 54, Princeton Univ. Press, 1963.
[12] K. Nomizu, Lie groups and differential geometry, Publ. Math. Soc. Japan, 1956.
[13] S. Sternberg, Lectures on differential geometry, Prentice-Hall, 1964.
[14] H. Whitney, Differentiable manifolds, Ann. Math., (2) 37 (1936), 645-680.
[15] H. Whitney, Geometric integration theory, Princeton Univ. Press, 1957.
[16] J. Eells, A setting for global analysis, Bull. Amer. Math. Soc., 72 (1966), 751-807.
[17] D. Burghelea and N. Kuiper, Hilbert manifolds, Ann. Math., (2) 90 (1969), 335-352.
[18] N. Kuiper, The homotopy type of the unitary group of Hilbert space, Topology, 3 (1965), 19-30.
[19] R. Palais and S. Smale, A generalized Morse theory, Bull. Amer. Math. Soc., 70 (1964), 165-172.
[20] I. M. Gel'fand, The cohomology of infinite-dimensional Lie algebras; some questions of integral geometry, Actes Congr. Int. Math., Nice, 1970, Gauthier-Villars, vol. 1, 95-111.
[21] I. M. Gel'fand and D. B. Fuks, Cohomologies of the Lie algebra of tangent vector fields of a smooth manifold I, II, Functional Anal. Appl., 3 (1969), 194-210; 4 (1970), 110-116. (Original in Russian, 1969, 1970.)
[22] I. M. Gel'fand and D. B. Fuks, The cohomology of the Lie algebra of formal vector fields, Math. USSR-Izv. (1970), 322-337. (Original in Russian, 1970.)
[23] V. W. Guillemin, Cohomology of vector fields on a manifold, Advances in Math., 10 (1973), 192-220.
[24] M. V. Losik, On the cohomologies of infinite-dimensional Lie algebras of vector fields, Functional Anal. Appl., 4 (1970), 127-135. (Original in Russian, 1970.)
[25] R. Bott and G. Segal, The cohomology of the vector fields on a manifold, Topology, 16 (1977), 285-298.
[26] A. Haefliger, Sur la cohomologie de l'algèbre de Lie des champs de vecteurs, Ann. Sci. Ecole Norm. Sup., 9 (1976), 503-532.

106 (X.6)
Differential Calculus

A. First-Order Derivatives

Let $y = f(x)$ be a real-valued function of $x$ defined on an interval $I$ of the real line $\mathbf{R}$. If for a fixed $x_0 \in I$, the limit

$$\lim_{\substack{h \to 0 \\ x_0 + h \in I}} \frac{f(x_0 + h) - f(x_0)}{h}$$

exists and is finite, then $f$ is called differentiable at the point $x_0$, and the limit is the derivative (differential coefficient or differential quotient) of $f$ at the point $x_0$. If $f$ is differentiable at every point of a set $A \subset I$, then $f$ is said to be differentiable on $A$. The function that assigns the derivative of $f$ at $x$ to $x \in A$ is called the derivative (or derived function) of $f(x)$, which is denoted by $dy/dx$, $y'$, $\dot{y}$, $df(x)/dx$, $(d/dx)f(x)$, $f'(x)$, or $D_x f(x)$. The process of determining $f'$ is known as the differentiation of $f$. The derivative of $f$ at the point $x_0$ is written $f'(x_0)$, $(df/dx)(x_0)$, $D_x f(x_0)$, $[dy/dx]_{x = x_0}$, etc. We say that $f$ is right (left) differentiable or differentiable on the right (left) if the limit on the right, $\lim_{h \to +0} (f(x_0 + h) - f(x_0))/h$ (the limit on the left, $\lim_{h \to -0} (f(x_0 + h) - f(x_0))/h$), exists and is finite. This limit is called the right (left) derivative or derivative on the right (left) and is denoted by $D_x^+ f(x_0)$ or $f'_+(x_0)$ ($D_x^- f(x_0)$ or $f'_-(x_0)$). For instance, if $f$ is defined on $I = [a, b)$, then $D_x f(a)$ is identical to $D_x^+ f(a)$.

B. Differentials

In the definitions given above, neither $dx$ nor $dy$ in $dy/dx$ has a meaning by itself. In the following, however, we give a definition of $dx$ and $dy$, using the concept of increment, so that $dy = f'(x)\,dx$. Let $\Delta y$ denote the increment $f(x + \Delta x) - f(x)$ of $f$ corresponding to the increment $\Delta x$ of $x$. Suppose that $f(x)$ is differentiable at $x$. We set $\Delta y/\Delta x = f'(x) + \varepsilon$. Then we have $\lim_{\Delta x \to 0} \varepsilon = 0$. This can be written utilizing Landau's notation as $\Delta y = f'(x)\Delta x + o(|\Delta x|)$ $(\Delta x \to 0)$; in other words, $\Delta y$ is the sum of two terms, of which the first, $f'(x)\Delta x$, is proportional to $\Delta x$ and the second is an infinitesimal of an order higher than $\Delta x$. Here the principal part $f'(x)\Delta x$ of $\Delta y$ is called the differential of $y = f(x)$ and is denoted by $dy$. The differential $dy$ thus defined is a function of two independent variables $x$ and $\Delta x$. In particular, if $f(x) = x$, from the definition we get $dx = 1 \cdot \Delta x = \Delta x$. Hence, in general, we have $dy = f'(x)\,dx$ and $f'(x) = dy/dx$.

With respect to the rectangular coordinates $(x, y)$, the straight line with slope $f'(x_0)$ through a point $(x_0, f(x_0))$ on the graph of $y = f(x)$ is the tangent line of the graph at the point $(x_0, f(x_0))$. A function is continuous at a point where the function is differentiable, but the converse of this proposition does not hold. In fact, Weierstrass showed that the function defined by the infinite series $\sum_{n=0}^{\infty} a^n \cos b^n \pi x$, where $0 < a < 1$ and $b$ is an odd integer with $ab > (3/2)\pi + 1$, is continuous everywhere and nowhere differentiable on $(-\infty, \infty)$ [3].

C. Differentiation

For two differentiable functions $f$ and $g$ defined on the interval $I$, the following formulas hold: $(\alpha f + \beta g)' = \alpha f' + \beta g'$, where $\alpha$ and $\beta$ are constants; $(fg)' = f'g + fg'$; and $(f/g)' = (f'g - fg')/g^2$ (at every point where $g \neq 0$). Let $y = f(x)$ be a function of $x$ defined on the interval $(a, b)$ and $x = \varphi(t)$ a function of $t$ defined on $(\alpha, \beta)$. If $\varphi(t) \in (a, b)$ whenever $t \in (\alpha, \beta)$, then the composite function $y = F(t) = f(\varphi(t)) = (f \circ \varphi)(t)$ is well defined. Assume further that $f$ and $\varphi$ are differentiable on $(a, b)$ and $(\alpha, \beta)$, respectively. Then the composite function $F(t) = (f \circ \varphi)(t)$ is differentiable on $(\alpha, \beta)$, and we have the chain rule, $F'(t) = f'(x)\varphi'(t)$ $(x = \varphi(t))$, or $dy/dt = (dy/dx)(dx/dt)$. Assume that a function $y = f(x)$ is strictly increasing or decreasing and differentiable at $x_0$. If furthermore we have

f'(x_0) ≠ 0, then the inverse function x = f^{-1}(y) is also differentiable at y_0 (= f(x_0)) and satisfies (dx/dy)_{y=y_0} (dy/dx)_{x=x_0} = 1. However, if f'(x_0) = 0, then even though f^{-1}(y) is not differentiable at y_0, lim_{Δy→0} (f^{-1}(y_0 + Δy) - f^{-1}(y_0))/Δy exists and is +∞ or -∞.

D. Higher-Order Derivatives

If the derivative f'(x) of a function y = f(x) is again differentiable on I, then (f'(x))' = f''(x) is well defined as a function of x on I. In general, if f^(n-1)(x) is differentiable on I, then f(x) is called n-times differentiable on I, and the nth derivative (or nth derived function) f^(n)(x) of f(x) is defined by f^(n)(x) = (f^(n-1)(x))' and is also denoted by d^n y/dx^n or D^(n) y. The nth derivative for n ≥ 2 is called a higher-order derivative.
Concerning the nth derivative of the product of two functions, Leibniz's formula holds:

\[ (fg)^{(n)} = f^{(n)} g + \binom{n}{1} f^{(n-1)} g' + \cdots + \binom{n}{k} f^{(n-k)} g^{(k)} + \cdots + f g^{(n)}. \]

Analogous to dy = y'Δx, which is a function of x and Δx, we can define d^2 y in the notation d^2 y/dx^2 by d^2 y = d(dy) = d(y'Δx) = (y'Δx)'Δx = y''Δx^2. Since Δx = dx, it follows from the above that d^2 y = y'' dx^2. Similarly, d^n y = y^(n) dx^n and is called the nth differential (or differential of nth order) of f(x).

E. The Mean Value Theorem

Let f(x) be a continuous function defined on [a, b], and suppose that for every point x_0 on (a, b) there exists a limit lim_{h→0} (f(x_0 + h) - f(x_0))/h, which may be infinite. (These conditions are satisfied if f(x) is differentiable on [a, b].) Then there exists a point ξ such that

\[ \frac{f(b) - f(a)}{b - a} = f'(\xi), \qquad a < \xi < b. \]

This proposition is called the mean value theorem. A special case of the theorem under the further condition that f(a) = f(b) is called Rolle's theorem. If we put b - a = h, ξ = a + θh, then the conclusion of the theorem may be written as f(a + h) = f(a) + h·f'(a + θh) (0 < θ < 1).
This theorem implies the following: Let f(x) be a function as in the hypothesis of the mean value theorem, and assume further that A < f'(x) < B holds for all x with a < x < b. Then A < (f(b) - f(a))/(b - a) < B. (French mathematicians sometimes call this the "théorème des accroissements finis.") Using the mean value theorem, the following theorem can be proved: If f(x) is continuous on [a, b] and f'(x) exists and is positive on (a, b), then f(b) > f(a). Accordingly, if f'(x) > 0 at every point x of an interval I, then f(x) is †strictly increasing on that interval. (If f'(x) < 0 on I, then f is strictly decreasing.) The converse of the previous statement does not always hold (f(x) = x^3 is a counterexample, since f'(0) = 0). Furthermore, from the mean value theorem it follows that if f'(x) = 0 everywhere in an interval, then f(x) is constant on that interval. Consequently, two functions with the same derivative on an interval differ only by a constant.
Suppose that f(x) is n-times differentiable on an open interval I. For a fixed a ∈ I and an arbitrary x ∈ I, we put

\[ f(x) = f(a) + \frac{f'(a)}{1!}(x - a) + \cdots + \frac{f^{(n-1)}(a)}{(n-1)!}(x - a)^{n-1} + R_n. \]

Then R_n = f^(n)(ξ)(x - a)^n/n! for some ξ between a and x. This is called Taylor's formula, where R_n is the remainder of the nth order given by Lagrange. We also have several other forms for R_n (→ Appendix A, Table 9). If f^(n)(x) is continuous at x = a, then ξ → a as x → a, and accordingly, f^(n)(ξ) → f^(n)(a). Hence f(x) = Σ_{k=0}^{n} (f^(k)(a)/k!)(x - a)^k + o((x - a)^n). If f^(n)(x) is continuous at x = a, then, by Taylor's formula, the value of the polynomial Σ_{k=0}^{n} (f^(k)(a)/k!)(x - a)^k can be considered an approximate value of f(x) for x near a. This approximation is called the nth approximation of f(x), and its error is given by |R_{n+1}|. By applying this formula, it is sometimes possible to calculate a limit such as A = lim_{x→a} f(x)/g(x), where f(x) → 0 and g(x) → 0 as x → a. For instance, if f'(x) and g'(x) are both continuous at x = a and g'(a) ≠ 0, then by taking the first approximations of f(x) and g(x) it is easily seen that A = f'(a)/g'(a). A limit of this type is often called a limit of an indeterminate form 0/0. Similarly, we can calculate limits of such indeterminate forms as 0·∞ or 0^0 (for limits of indeterminate forms → [5]).

F. Partial Derivatives

Let w = f(x, y, ..., z) be a real-valued function of n independent real variables x, y, ..., z defined on a domain G contained in n-dimensional Euclidean space R^n. We obtain a function of a single variable from f by keeping n - 1 variables (say, (x, ..., z), i.e., all except y) fixed. If

such a function φ(y) = f(x_0, y, ..., z_0) is differentiable, that is, if

\[ \varphi'(y_0) = \lim_{\Delta y \to 0} \frac{\varphi(y_0 + \Delta y) - \varphi(y_0)}{\Delta y} = \lim_{\Delta y \to 0} \frac{f(x_0, y_0 + \Delta y, \ldots, z_0) - f(x_0, y_0, \ldots, z_0)}{\Delta y} \]

exists and is finite, then f is called partially differentiable with respect to y at (x_0, y_0, ..., z_0), and the derivative is called the partial derivative (or partial differential coefficient) of f(x, y, ..., z) with respect to y at (x_0, y_0, ..., z_0). It is denoted by [∂w/∂y]_{x=x_0, y=y_0, ..., z=z_0}, (∂/∂y)f(x_0, y_0, ..., z_0), f_y(x_0, y_0, ..., z_0), or D_y f(x_0, y_0, ..., z_0), etc. We usually assume that the point (x, y, ..., z) where partial derivatives are considered is an †interior point of the †domain of the function. Since in a space of dimension higher than 1, the †boundary of a domain may be complicated, partial derivatives at boundary points are usually not considered. If a function f possesses a partial derivative with respect to x at every point of an open set G, then f_x is a function on G and is called a partial derivative of f with respect to x. The process of determining partial derivatives of f is called the partial differentiation of f.

G. Total Differential

Let w = f(x, y, ..., z) be a function defined on a domain G, and let P = (x_0, y_0, ..., z_0) be an interior point of G. Put Δw = f(x_0 + Δx, y_0 + Δy, ..., z_0 + Δz) - f(x_0, y_0, ..., z_0). If there exist constants α, β, ..., γ such that Δw = αΔx + βΔy + ... + γΔz + o(ρ) (ρ → 0), where ρ = \sqrt{Δx^2 + Δy^2 + ... + Δz^2}, then f is called totally differentiable (or differentiable in the sense of Stolz) at P. In this case, f is partially differentiable at P with respect to each of the variables x, y, ..., z, and α = f_x(x_0, y_0, ..., z_0), β = f_y(x_0, y_0, ..., z_0), ..., γ = f_z(x_0, y_0, ..., z_0). The principal part of Δw as ρ → 0 is αΔx + βΔy + ... + γΔz, which is called the total differential of w at P. If f is totally differentiable at every point of G, then f is said to be totally differentiable on G. The total differential of w is denoted by dw. Since the total differentials of x, y, ..., z are dx = Δx, dy = Δy, ..., dz = Δz, respectively, we can write dw = f_x dx + f_y dy + ... + f_z dz; and dw is a function of independent variables x, y, ..., z, dx, dy, ..., dz. The total differentiability of f implies the continuity of f, whereas the partial differentiability of f with respect to each variable does not imply that f is continuous. (Example: Define f(x, y) = xy/(x^2 + y^2) for (x, y) ≠ (0, 0), and f(0, 0) = 0, then the function f is not continuous at (0, 0), even though both f_x and f_y exist at (0, 0).) The function f is totally differentiable on G if all f_x, f_y, ..., f_z exist and are continuous on G, or, more weakly, if all f_x, f_y, ..., f_z exist and, with possibly one exception, are continuous. Suppose that w = f(x, y) is totally differentiable at (x, y), and let Δx = ρ cos θ, Δy = ρ sin θ. As ρ → 0, for a fixed θ there exists the limit lim_{ρ→0} (Δw/ρ) = f_x(x, y) cos θ + f_y(x, y) sin θ. This limit is called the directional derivative in the direction θ at (x, y). The partial derivatives f_x and f_y are special cases of the directional derivative for θ = 0 and π/2, respectively. Suppose that we are given a curve lying in the †interior of the domain of f and that the curve passes through the point (x, y), where the curve is differentiable. Then the partial derivative of w = f(x, y) in the direction of the normal to the curve at (x, y) is called the normal derivative of w at the point (x, y) on the curve and is denoted by ∂w/∂n. Analogous definitions and notations have been introduced for functions of more than two variables.
To see the geometric significance of the total differentiability of w = f(x, y), we consider the graph of the function w = f(x, y) and a point (a, b, f(a, b)) on the graph. Then the plane represented by w - f(a, b) = α(x - a) + β(y - b) is the †tangent plane to the surface at (a, b, f(a, b)) if and only if α = f_x(a, b) and β = f_y(a, b). The existence of f_x and f_y depends on the choice of coordinate axes, while the total differentiability of f does not.

H. Higher-Order Partial Derivatives

Suppose that a partial derivative of a function w = f(x, y, ..., z) defined on an open set G again admits partial differentiation. The latter partial derivative is called a second-order partial derivative of f. We can similarly define the nth order partial derivatives. Higher-order partial derivatives are denoted as follows:

\[ \frac{\partial}{\partial x}\left(\frac{\partial w}{\partial x}\right) = \frac{\partial^2 w}{\partial x^2} = f_{xx}(x, y, \ldots, z), \qquad \frac{\partial}{\partial y}\left(\frac{\partial w}{\partial x}\right) = \frac{\partial^2 w}{\partial y\,\partial x} = f_{xy}(x, y, \ldots, z), \]
\[ \frac{\partial}{\partial x}\left(\frac{\partial^2 w}{\partial x\,\partial y}\right) = \frac{\partial^3 w}{\partial x^2\,\partial y} = f_{yxx}(x, y, \ldots, z), \quad \ldots \]

In general, f_xy and f_yx are not equal. (Peano's example: Let f(x, y) = xy(x^2 - y^2)/(x^2 + y^2) for (x, y) ≠ (0, 0) and f(0, 0) = 0. Then f_xy = -1, f_yx = 1 at (0, 0).) However, if both f_xy and f_yx are continuous on an open set G', then they coincide in G'. Furthermore, if f_x, f_y, and f_xy exist in a neighborhood U of a point P belonging to the domain of f and f_xy is continuous

at P, then fyX exists at P and f,, =f,, (H. A. formula is valid for a function of n variables
Schwarz). If f, and f, exist in U and are totally (n > 3). As in the case of functions of one vari-
differentiable at P, then f,, =LX at P (W. H. able, we cari derive approximation formulas
Young). Similarly, if the partial derivatives of for f from Taylor’s formula.
order > 3 ~,,Xy,,, and x..,,... are a11 continuous,
then Ly... =i,,,,,,,. Hence we cari change the
order of differentiation if a11 the derivatives K. Classes of Functions
concerned are continuous.
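The role of the continuity hypotheses can be checked numerically on Peano's example from this section. The following is only a minimal sketch: the central-difference helpers and the step sizes h and k are illustrative assumptions, not part of the source, and the subscript convention f_xy = ∂/∂y(f_x) follows the article.

```python
def f(x, y):
    """Peano's example: xy(x^2 - y^2)/(x^2 + y^2), with f(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x * x - y * y) / (x * x + y * y)


def f_x(x, y, h=1e-6):
    """Central-difference estimate of the partial derivative in x."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)


def f_y(x, y, h=1e-6):
    """Central-difference estimate of the partial derivative in y."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)


def mixed_partials_at_origin(k=1e-3):
    """Estimate f_xy = d/dy(f_x) and f_yx = d/dx(f_y) at (0, 0)."""
    f_xy = (f_x(0.0, k) - f_x(0.0, -k)) / (2 * k)
    f_yx = (f_y(k, 0.0) - f_y(-k, 0.0)) / (2 * k)
    return f_xy, f_yx


if __name__ == "__main__":
    print(mixed_partials_at_origin())
```

The inner step h is chosen much smaller than the outer step k so that the first-order estimates are accurate along the axes before they are differenced again; the two estimates come out close to -1 and 1, in line with the values stated for Peano's example.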
If all the partial derivatives of order n of f(P)
are continuous on an open set G, then f is said
to be a function of class C^n (or n-times continuously differentiable) on G. The set of all n-times continuously differentiable functions is denoted by C^n (n = 1, 2, ...). A continuous function is of class C^0. A function of class C^1 is also called a smooth function. It is obvious that C^0 ⊃ C^1 ⊃ C^2 ⊃ .... A partial derivative of order r ≤ s of a function belonging to class C^s does not depend on the order of the differentiation. A function belonging to C^∞ = ∩_{r=0}^∞ C^r is said to be of class C^∞ or infinitely differentiable. We sometimes say that a function has a certain "nice" property or is "well behaved" if it belongs to some C^r (r ≥ 1).
Let w = f(x, y, ..., z) be a function defined on an open set G in R^n and P = (a, b, ..., c) ∈ G. If

I. Composite Functions of Several Variables

Let w be a function of x, y, ..., z, and let each x, y, ..., z be a function of t. Suppose that the range of (x(t), y(t), ..., z(t)) is contained in the domain of w. Then w is a function of t. If further w is totally differentiable and x, y, ..., z are all differentiable, then w as a function of t is differentiable, and we have

\[ \frac{dw}{dt} = \frac{\partial w}{\partial x}\frac{dx}{dt} + \frac{\partial w}{\partial y}\frac{dy}{dt} + \cdots + \frac{\partial w}{\partial z}\frac{dz}{dt}. \]

If partial derivatives of order ≥ 2 are totally differentiable, then d^2 w/dt^2, d^3 w/dt^3, ... are obtained by repeating the above procedure. A similar consideration is valid when x, y, ..., z
are functions of several variables.
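The chain rule for a composite function of several variables can be checked numerically. The sketch below is illustrative only: the particular choices w(x, y) = x^2 y, x(t) = cos t, y(t) = sin t and the step size h are assumptions made for the example, not taken from the source.

```python
import math


def w(x, y):
    """Illustrative totally differentiable function w(x, y) = x^2 * y."""
    return x * x * y


def x_of(t):
    return math.cos(t)


def y_of(t):
    return math.sin(t)


def chain_rule_dw_dt(t):
    """dw/dt = w_x * dx/dt + w_y * dy/dt, using hand-computed derivatives."""
    x, y = x_of(t), y_of(t)
    w_x = 2 * x * y          # partial derivative of w with respect to x
    w_y = x * x              # partial derivative of w with respect to y
    dx_dt = -math.sin(t)
    dy_dt = math.cos(t)
    return w_x * dx_dt + w_y * dy_dt


def numeric_dw_dt(t, h=1e-6):
    """Central-difference derivative of the composite t -> w(x(t), y(t))."""
    forward = w(x_of(t + h), y_of(t + h))
    backward = w(x_of(t - h), y_of(t - h))
    return (forward - backward) / (2 * h)


if __name__ == "__main__":
    t0 = 0.7
    print(chain_rule_dw_dt(t0), numeric_dw_dt(t0))
```

The two printed values should agree to many digits, which is what the formula for dw/dt predicts when w is totally differentiable and x, y are differentiable.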
f(x> Y, . , 4 = f(a, b, . . >4

J. Taylor% Formula for Functions of Several +,z,


1 ...~~~n.~~~...,~(x-u)“(~-b)“...(z-<-)”
n
Variables
holds in some open neighborhood U of P,
where the right-hand side of the equality is an
Suppose that f(x, y) is defïned on an open
absolutely convergent series, then f is said to
domain G, f(x, y) has continuous partial de-
be real analytic at P. In this case, fis r-times
rivatives of orders up to II, and the line seg-
differentiable at P for any r, and we have
ment (a+(x-a)t,b+(y-h)t) (O<t< 1)is
contained in the domain G. Then there exists a r,!r,! . ..r.!
Lx
number 0 (0 <e < 1) such that r1r2”‘rn=(rl +r,+...+r,)!

f (X> Y) ,I,+,,+...+‘“f
(a, b, . , c).
’ axrlayrz. .aZrn
If f is real analytic at every point P of the domain G, then f is called a real analytic function on G. Sometimes, a real analytic function is called a function of class C^ω. A real analytic function belongs to C^∞, but the converse is not true (→ 58 C^∞-Functions and Quasi-Analytic Functions E).
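A standard textbook illustration of the last remark (not given in this article) is the function equal to exp(-1/x^2) for x ≠ 0 and to 0 at x = 0. It is infinitely differentiable, and all its derivatives vanish at 0, so its Taylor series at 0 is identically zero although the function is not. The sketch below only gives numerical evidence for this, by showing that f(x)/x^n is tiny near 0 for several n; the function names are hypothetical.

```python
import math


def flat(x):
    """exp(-1/x^2) for x != 0, and 0 at x = 0 (smooth but not analytic at 0)."""
    if x == 0.0:
        return 0.0
    return math.exp(-1.0 / (x * x))


def ratio(x, n):
    """f(x) / x^n; staying small as x -> 0 for every n suggests f(x) = o(x^n)."""
    return flat(x) / x ** n


if __name__ == "__main__":
    for n in (1, 5, 10):
        print(n, ratio(0.05, n))
    print("value away from 0:", flat(0.5))
```

This is numerical evidence only, not a proof; the referenced article 58 is the place to look for the precise statements.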

L. Extrema
X~@+(X-a)@ b+(y-b)Q),

where, for instance, the third term ((x-u). Let f be a real-valued function delïned on a
(Wx) + (Y - MVa~))2f(~, b) means domain G in an n-dimensional Euclidean
(x-a)2(a2f/ax2)(,,
b)+2(x-a)(y-b)’ space R” that has the point F. in its interor.
(a2maY)(u, b) + (Y - wbww)(~, b), If there exists a neighborhood U of P,, such
with (a2flax2)(a, b), (a2flaxay)(a, b), and that for every point P (#P,) of U we have
(a2fldy2)(a, b) denoting the values of (a’flax’), f(P)>f(P,), then we say that f has a relative
(a’fiaxay), and (a2flûy2) at (a, b), respectively. minimum at P,, and f(P,) is a relative mini-
The displayed formula is called Taylor% for- mum off: Replacing > by <, we obtain the
mula for a function of two variables. A similar definition of a relative maximum. f(P,) is

called a relative extremum if it is either a relative maximum or a relative minimum.
To find relative extrema, the following facts concerning the sign of the derivative are useful. Suppose that a function of a single variable is differentiable on an interval I. Then we have the following: (1) If f has a relative extremum at an interior point x_0 of I, then f'(x_0) = 0. (2) If f'(x_0) = 0 and f'(x) changes its sign at x_0 from positive (negative) to negative (positive), then f has a relative maximum (minimum) at x_0. (3) If f'(x_0) = 0 and f is twice differentiable on some neighborhood of x_0, then f has a relative maximum or minimum according as f''(x_0) < 0 or > 0. If f''(x_0) = 0, then nothing definite can be concluded about a relative extremum of f at x_0. In general, if there exists a neighborhood of x_0 in which f is r-times differentiable (r is even) and f^(r) is continuous, and if f'(x_0) = f''(x_0) = ... = f^(r-1)(x_0) = 0, f^(r)(x_0) > 0 (or < 0), then f has a relative minimum (maximum) at x_0. On the other hand, if this condition holds with odd r, then f(x_0) is not a relative extremum. If f'(x_0) = 0, then f(x_0) is called a stationary value of f.
If a function f of n variables x, y, ..., z has a relative extremum at (x_0, y_0, ..., z_0), then we have f_x(x_0, y_0, ..., z_0) = 0, f_y(x_0, y_0, ..., z_0) = 0, ..., f_z(x_0, y_0, ..., z_0) = 0, provided that the partial derivatives of f exist. Assume that for a function f of class C^2 of two variables x and y, we have f_x(x_0, y_0) = 0 and f_y(x_0, y_0) = 0, and let δ = f_xx(x_0, y_0) f_yy(x_0, y_0) - f_xy^2(x_0, y_0). Then we have the following: (1) If δ > 0, then according as f_xx(x_0, y_0) < 0 or > 0, f has a relative maximum or minimum at (x_0, y_0). (2) If δ < 0, then f does not have a relative extremum at (x_0, y_0). (3) If δ = 0, then without further information nothing definite can be said about a relative extremum of f at the point.
Let x_1, ..., x_n be independent variables. If a function f of variables x_1, ..., x_n has a relative extremum at a point P_0 = (x_1^0, ..., x_n^0), then f_i = f_{x_i}(P_0) = 0 (i = 1, ..., n), provided that all the partial derivatives of f exist. In general, a point P_0 where f is totally differentiable and these conditions are satisfied is called a critical point of f. The value f(P_0) at a critical point is called a stationary value. If further f is of class C^2, then consider a †quadratic form of n variables Q = Q(X_1, ..., X_n) = Σ_{i,k} f_{ik} X_i X_k, where f_{ik} = f_{x_i x_k}(P_0). Suppose that the determinant |f_{ik}| ≠ 0. Then according to whether Q is †positive definite, †negative definite, or †indefinite, f has a relative minimum, relative maximum, or no relative extremum at P_0. If |f_{ik}| = 0, then nothing can be said in general. A critical point P of f is said to be nondegenerate if |f_{ik}| ≠ 0 and degenerate if |f_{ik}| = 0.
We can also apply the method of differentiation of †implicit functions to find relative extrema of functions defined implicitly. Given functions φ_1, ..., φ_m (m < n), the problem of finding a relative extremum of f(x_1, ..., x_n) under the condition that φ_1(x_1, ..., x_n) = 0, ..., φ_m(x_1, ..., x_n) = 0 is called the problem of finding a conditional relative extremum. This problem can be reduced to the problem of finding a relative extremum of an implicit function. Actually, if the functions f, φ_1, ..., φ_m are of class C^1 and the †Jacobian ∂(φ_1, ..., φ_m)/∂(x_{n-m+1}, ..., x_n) does not vanish in the domain considered, then y_1 = x_{n-m+1}, ..., y_m = x_n can be regarded as implicit functions of x_1, ..., x_l (l = n - m). Hence we can set f(x_1, ..., x_l, y_1, ..., y_m) = f*(x_1, ..., x_l). Then f has a relative extremum at (x_1, ..., x_n) under the condition φ_1 = ... = φ_m = 0 if and only if f* has a relative extremum at P_0 = (x_1^0, ..., x_l^0). The latter condition implies that all ∂f*/∂x_j (j = 1, ..., l) vanish at P_0, which holds if and only if for suitable constants λ_1, ..., λ_m the function F(x_1, ..., x_n) = f + λ_1 φ_1 + ... + λ_m φ_m satisfies ∂F/∂x_i = 0 (i = 1, ..., n), and further φ_1 = 0, ..., φ_m = 0 at (x_1^0, ..., x_n^0). From this system of equations we can often find the values of x_1, ..., x_n. This method of finding conditional relative extrema is called Lagrange's method of indeterminate coefficients or the method of Lagrange multipliers (→ 208 Implicit Functions; 216 Integral Calculus H; 379 Series H).

References

[1] T. M. Apostol, Mathematical analysis, Addison-Wesley, 1957.
[2] N. Bourbaki, Eléments de mathématique, Fonctions d'une variable réelle, Actualités Sci. Ind., 1074b, 1132a, Hermann, second edition, 1958, 1961.
[3] R. C. Buck, Advanced calculus, McGraw-Hill, second edition, 1965.
[4] R. Courant, Differential and integral calculus I, II, Nordemann, 1938.
[5] G. H. Hardy, A course of pure mathematics, Cambridge Univ. Press, seventh edition, 1938.
[6] E. Hille, Analysis, Blaisdell, I, 1964; II, 1966.
[7] W. Kaplan, Advanced calculus, Addison-Wesley, 1952.
[8] E. G. H. Landau, Einführung in die Differentialrechnung und Integralrechnung, Noordhoff, 1934; English translation, Differential and integral calculus, Chelsea, 1965.
[9] J. M. H. Olmsted, Advanced calculus, Appleton-Century-Crofts, 1961.
[10] A. Ostrowski, Vorlesungen über

Differential- und Integralrechnung I, II, III, Birkhäuser, second edition, 1960-1961.
[11] M. H. Protter and C. B. Morrey, Modern mathematical analysis, Addison-Wesley, 1964.
[12] W. Rudin, Principles of mathematical analysis, McGraw-Hill, second edition, 1964.
[13] V. I. Smirnov, A course of higher mathematics. I, Elementary calculus; II, Advanced calculus, Addison-Wesley, 1964.
[14] A. E. Taylor, Advanced calculus, Ginn, 1955.

107 (XIII.1)
Differential Equations

A. Ordinary Differential Equations

It was Galileo who found that the acceleration of a falling body is a constant and thence derived his law of a falling body x(t) = gt^2/2 as what we would now view as a solution of the differential equation x''(t) = g, where x(t) denotes the distance the body has fallen during the time interval t and g is the constant gravitational acceleration. This pioneering work may be regarded as the first example of solution of a differential equation. Also, the †equations of motion, proposed by I. †Newton as the mathematical formulation of the law of motion, including Galileo's law as a special case, are differential equations of the second order. Thus differential equations appeared, simultaneously with differential and integral calculus, as an indispensable tool for the unified and concise expression of the laws of nature. Such laws are generally called differential laws.
Newton completely solved the equations of the †two-body problem proposed by himself; G. W. †Leibniz also succeeded in solving many simple differential equations.
In the 18th century, many mathematicians, such as the †Bernoullis, A. C. Clairaut, J. F. Riccati, L. †Euler, and J. L. †Lagrange, attacked and solved differential equations of various types independently. In that period, the emphasis was on solution by quadrature, that is, applying to †elementary functions a finite number of algebraic operations, transformations of variables, and indefinite integrations. It was toward the end of the 18th century that new methods, such as integration by infinite series, came to be discussed. A method of variation of constants for the solution of linear ordinary differential equations was invented by Lagrange in 1775. At the beginning of the 19th century, C. F. †Gauss initiated the study of differential equations satisfied by †hypergeometric series.
The problem of existence of solutions, which supplies a foundation of modern differential equation theory, was first treated by A. L. Cauchy. His proof of the existence theorem was later improved by R. L. Lipschitz (1869).
Pioneers in the function-theoretic treatment of differential equations were C. A. A. Briot and J. C. Bouquet, who investigated the singular points of a function defined by an analytic differential equation. Also, B. †Riemann proposed a new viewpoint which influenced L. Fuchs in his development of the theory of linear ordinary differential equations in the complex domain (1865). Works of A. M. Legendre on †elliptic functions and of H. †Poincaré on †automorphic functions should also be mentioned in this connection.
After the Cauchy-Lipschitz existence theorem for the equation y' = f(x, y) was known, efforts were directed toward weakening the conditions imposed on f(x, y). G. Peano first succeeded in giving a proof under the continuity assumption only (1890), and his results were sharpened by O. Perron (1915).
Regarding the uniqueness of solutions of †initial value problems, there are various results by W. F. Osgood (1898), Perron (1925), and many Japanese mathematicians. In the course of this work the necessary and sufficient condition for uniqueness was successfully formulated in a concise form (→ 316 Ordinary Differential Equations (Initial Value Problems)).
For linear differential equations with periodic coefficients, investigations were carried out by C. Hermite (1877), E. Picard (1881), G. Floquet (1883), G. W. Hill (1886), and others. For instance, solutions satisfying y(x + ω) = λy(x) were found to exist, where ω is the period of the coefficients. Analogous results followed in the case of doubly periodic coefficients.
Techniques of factorization of linear differential equations developed by G. Frobenius (1873) and E. Landau (1920) should also be noted. Picard (1883), J. Drach (1898), and E. Vessiot (1903, 1904) established a remarkable result on the solvability (in the sense of solution by quadrature) of linear differential equations, successfully extending the †Galois theory in this new direction.
The concept of †asymptotic series, which in a sense approximate the solution of differential equations, was introduced by Poincaré (1886) and extended by M. A. Lyapunov (1892), J. C. C. Kneser (1896), J. Horn (1897), C. E. Love (1914), and others. Poincaré was also the founder of topological methods in differential

equation theory, and his ideas were developed extensively by I. Bendixson (1900), Perron (1922, 1923), G. D. Birkhoff, and others (→ 126 Dynamical Systems).
In 1890, Picard invented an ingenious technique of †successive approximation for the proof of existence theorems, and his technique is now widely used in every application of functional equations. The technique of reducing linear differential equations to linear †integral equations of Volterra type was also developed.
On the †boundary value problems and †eigenvalue problems that appear in many areas of physics, there was extensive research by mathematicians such as J. C. F. Sturm (1836), J. Liouville, L. Tonelli, Picard, M. Bôcher (1898, 1921), Birkhoff (1901, 1911), and others. In this connection the problem arises of expanding a given function by an †orthogonal system of functions obtained as †eigenfunctions of a given boundary value problem. Those problems were brought into unified form by D. †Hilbert (1904) in his theory of †integral equations. Subsequently boundary value problems of ordinary and partial differential equations came to be discussed in this framework.
Finally, it should be mentioned that the †calculus of variations created by Euler and Lagrange gave rise to the study of a certain class of differential equations bearing the name of Euler (→ 46 Calculus of Variations).

B. Partial Differential Equations

The origin of partial differential equations can be traced back to the study of hydrodynamic problems by J. d'Alembert (1744) and Euler. However, perhaps Lagrange and P. S. †Laplace were the first to investigate the general theory. Subsequently, during the 18th and 19th centuries, it was developed by G. Monge, A.-M. Ampère, J. F. Pfaff, C. G. †Jacobi, Cauchy, S. †Lie, and many other mathematicians. The fundamental existence theorem for the initial value problem, now called the Cauchy-Kovalevskaya theorem, was proved by S. Kovalevskaya in 1875 (→ 321 Partial Differential Equations (Initial Value Problems)).
Because of their close connection with problems of physics, linear equations of the second order have been a chief object of research. Up to the 19th century, classification into †elliptic, †hyperbolic, and †parabolic types and the study of boundary and initial value problems for each of these types constituted the main part of the theory.
In the 20th century, more complicated problems (†nonlinear problems appearing in the study of viscous or compressible fluids, or the study of equations of †mixed type in connection with supersonic flow) have emerged as important topics; and the newly developed techniques of functional analysis have brought about remarkable changes. Especially in the study of the Schrödinger equations of quantum mechanics and of more general †evolution equations, this method has proved to be a powerful tool.
Finally, we should not fail to mention that the development of electronic computers has made it possible to obtain numerical solutions and to discover many important facts. †Numerical analysis is now becoming an indispensable part of the theory (→ 304 Numerical Solution of Partial Differential Equations).

References

[1] E. L. Ince, Ordinary differential equations, Dover, 1956.
[2] L. S. Pontryagin, Ordinary differential equations, Addison-Wesley, 1962. (Original in Russian, 1961.)
[3] E. A. Coddington and N. Levinson, Theory of ordinary differential equations, McGraw-Hill, 1955.
[4] P. Hartman, Ordinary differential equations, Wiley, 1964.
[5] E. Hille, Lectures on ordinary differential equations, Addison-Wesley, 1969.
[6] L. Bieberbach, Theorie der gewöhnlichen Differentialgleichungen auf funktionentheoretischer Grundlage dargestellt, Springer, 1953.
[7] E. Hille, Ordinary differential equations in the complex domain, Wiley, 1976.
[8] W. Wasow, Asymptotic expansions for ordinary differential equations, Interscience, 1965.
[9] I. Kaplansky, An introduction to differential algebra, Hermann, 1957.
[10] M. A. Naïmark, Linear differential operators, I, II, Ungar, 1967, 1968. (Original in Russian, 1954.)
[11] R. Courant and D. Hilbert, Methods of mathematical physics I, II, Interscience, 1953, 1962.
[12] P. R. Garabedian, Partial differential equations, Wiley, 1964.
[13] I. G. Petrovskiï, Lectures on partial differential equations, Interscience, 1954. (Original in Russian, 1950.)
[14] L. Hörmander, Linear partial differential operators, Springer, 1963.
[15] K. Yosida, Lectures on differential and integral equations, Wiley, 1960. (Original in Japanese, 1950.)

108 (XIX.10) Let S_U denote the class of functions u(t, x) that


are piecewise C^1 in x on D and have their
Differential Games range in U. Similarly, let S_V denote the class of
functions v(t, x) that are piecewise C^1 in x on D
and have their range in V.
A. Introduction
Let u E S,, and v E S,, and consider the dif-
ferential equation
The study of differential games arose from the
study of pursuit and evasion problems and dx/dt =f(t, x, u(t, x), v(t, x)). (2)
various tactical problems. The first work was
done by R. P. Isaacs [l] in a series of RAND subject to the initial condition
Corporation memoranda that appeared in x(z)=(. (3)
1954. He applied to many illustrative examples
the method of Hamilton-Jacobi differential We say that a pair (u, V)E& x S, is playable
equations (- eq. (4) below). The heuristic if for every (t, 5) in D every solution of (2)
results of Isaacs were made rigorous by W. H. satisfying (3) stays in D and reaches a terminal
Fleming [2], L. D. Berkovitz [3], A. Friedman manifold F in tïnite time, where F is a smooth
[4,5], and others. manifold contained in 0. Let Q, c SU, R, c S,
be the maximal pair of subclasses such that
each pair (u, u) E R, x R, is playable. We cal1 the
B. Zero-Sum Two-Person Games functions in a, and Sr, the strategies for the
players.
Suppose that there are two antagonists, each For each strategy pair (u, v) we cari define a
exerting partial control over the state of a functional
system. One wishes to maximize a given payoff
that is a functional of the state and the control
exerted, while the other wishes to minimize 11
this payoff. Let the state of a differential game =s@l~x(tl))+ W, x(t), 46 x(t)), U(t, x(t)))&
s ‘0
at time t be represented by an n-dimensional
vector x(t)~R”. In a zero-sum differential game where x(t) is a solution of (2) and (3) and t, is
between two players 1 and II, we are given a the tïrst time that (t, x(t)) reaches the terminal
system of n differential equations manifold F. The functional J is called the
dxldt = f (t, x, u, v) payoff.
(1)
Let (u*, V*)ER, x Q, be a strategy pair.
with an initial condition x(r) = 5 E R”, where Suppose that for any UEQ, and VE&, the
u E RP is chosen at each instant of time by inequalities
player 1 and DER~ is chosen by player II. We
assume that the function f(t, x, u, v) is con-
tinuous in t and continuously differentiable
hold for a11 (T, 5) ED. We say that (u*, u*) is a
on the entire (x, u, u)-space.
saddle point relative to the classes 0, and 0,.
It is usually assumed that both players
The function
know the present state of the game and that
they know how the game proceeds; that is, qt, x) =J(t, x, u*, u*)
they know the system (1). Each player can take
defined on D is called the value function of
the state of the game into account in making
the game. Berkovitz [3] proved that the
his choice. Thus player I can let his choice of u
value function W(t, x) is continuous on D and
be governed by a vector function u(t, x) defined
continuously differentiable on each Di and
on D, where D c [0, CO) x R” is a fixed region of
satistïes
the (t, x)-space. Similarly, player II cari let his
choice of v be governed by a vector function fEy P(c x, u,V*I + Ya, ML x, u,V*)I
o(t, x) defined on D.
A lïnite collection of subregions D,, . . , 0, of
a region D is said to constitute a decomposi-
tion of D whenever the following conditions = h(t, x, u*, v*) + ngt, x)f(t, x, u*, u*)
hold: (i) Each Di (i = 1, , r) is connected and
= - qt, x). (4)
has a piecewise smooth boundary; (ii) Di f’
Dj = cp if i #j. A function detïned on D is said to Equation (4) is called the Hamilton-Jacobi
be piecewise C’ in x on D if there is a decom- equation.
position of D such that on each Di the function Let x*(t; T, 5) be the optimal trajectory corre-
and a11 its derivatives with respect to x are sponding to the saddle-point strategies (u*, v*)
continuous in (t, x) on &. Let U and V be and resulting from an initial point (T, [)ED.
tïxed closed subsets of RP and R4, respectively. Then there exists an n-dimensional continuous
vector function $\lambda(t; \tau, \xi)$ such that the following hold [3]:

(i) The functions $x^*$ and $\lambda$ satisfy the system of differential equations

$$dx/dt = f(t, x, u^*(t, x), v^*(t, x)),$$
$$d\lambda/dt = -H_x(t, x, \lambda, u^*(t, x), v^*(t, x)),$$

where

$$H(t, x, \lambda, u, v) = h(t, x, u, v) + \lambda f(t, x, u, v).$$

(ii) If $x = x^*(t; \tau, \xi)$, $t \ge \tau$, then

$$W_x(t, x) = \lambda(t; \tau, \xi).$$

(iii) At $t = t_1$, the transversality condition

$$H \frac{\partial T}{\partial \sigma} + \frac{\partial g}{\partial t} \frac{\partial T}{\partial \sigma} + \frac{\partial g}{\partial x} \frac{\partial X}{\partial \sigma} - \lambda \frac{\partial X}{\partial \sigma} = 0$$

holds, where the terminal manifold $F$ is given parametrically by the relations

$$t = T(\sigma), \quad x = X(\sigma),$$

$\sigma$ ranging over a cube in some finite-dimensional Euclidean space.

(iv) For all $\tau \le t \le t_1$,

$$\max_{u \in U} \min_{v \in V} H(t, x^*(t), \lambda(t), u, v) = \min_{v \in V} \max_{u \in U} H(t, x^*(t), \lambda(t), u, v) = H(t, x^*(t), \lambda(t), u^*(t, x^*(t)), v^*(t, x^*(t))).$$

P. Varaiya and J. Lin [6] and Friedman [4,5] have defined certain special classes of differential games, and have shown that under their definitions the games have nonzero value functions.

C. N-Person Differential Games

In a differential game between many players, the state vector $x(t) \in R^n$ is governed by

$$dx/dt = f(t, x, u_1, \ldots, u_N), \quad x(\tau) = \xi,$$

where $u_i \in R^{p_i}$ is chosen at each instant of time by player $i$. Each $u_i$ is constrained to lie in a fixed closed subset $U_i$ of $R^{p_i}$. Let $S_i$ be the class of functions $u_i(t, x)$ that are piecewise $C^1$ in $x$ on $D$ and have their range in $U_i$.

We say that an element $u = (u_1, \ldots, u_N)$ of $S_1 \times \cdots \times S_N$ is playable if, for every $(\tau, \xi) \in D$, every solution of the differential equation

$$dx/dt = f(t, x, u_1(t, x), \ldots, u_N(t, x)), \quad x(\tau) = \xi, \qquad (5)$$

stays in $D$ and reaches a terminal manifold $F$ in finite time. Let $\Omega_i \subset S_i$ ($i = 1, \ldots, N$) be the maximal subclasses such that each element $u = (u_1, \ldots, u_N) \in \Omega_1 \times \cdots \times \Omega_N$ is playable. We call the functions $u_i \in \Omega_i$ ($i = 1, \ldots, N$) the strategies. For each strategy N-tuple we define a payoff

$$J_i(\tau, \xi, u_1, \ldots, u_N) = g_i(t_1, x(t_1)) + \int_{\tau}^{t_1} h_i(t, x(t), u_1(t, x(t)), \ldots, u_N(t, x(t)))\,dt,$$

where $x(t)$ is a solution of (5) and $t_1$ is the first time that $(t, x(t))$ reaches the terminal manifold $F$. Each player $i$ is to choose his strategy $u_i \in \Omega_i$ so as to maximize his own payoff $J_i$.

There are many definitions of "solution" for games involving more than two players. A strategy N-tuple $u^* = (u_1^*, \ldots, u_N^*)$ is called an equilibrium point for the game if the inequalities

$$J_i(u_1^*, \ldots, u_{i-1}^*, u_i, u_{i+1}^*, \ldots, u_N^*) \le J_i(u_1^*, \ldots, u_{i-1}^*, u_i^*, u_{i+1}^*, \ldots, u_N^*) \quad (i = 1, \ldots, N)$$

hold for any $u_1 \in \Omega_1, \ldots, u_N \in \Omega_N$. J. H. Case [9] has shown that the conclusions drawn for zero-sum two-person games also hold for N-person differential games.

References

[1] R. P. Isaacs, Differential games, Wiley, 1965.
[2] W. H. Fleming, The convergence problem for differential games II, Advances in game theory, Ann. Math. Studies 52, Princeton Univ. Press, 1964, 195-210.
[3] L. D. Berkovitz, Necessary conditions for optimal strategies in a class of differential games and control problems, SIAM J. Control, 5 (1967), 1-24.
[4] A. Friedman, On the definition of differential games and the existence of value and of saddle points, J. Differential Equations, 7 (1970), 69-91.
[5] A. Friedman, Differential games, Wiley, 1971.
[6] P. Varaiya and J. Lin, Existence of saddle points in differential games, SIAM J. Control, 7 (1969), 142-157.
[7] H. W. Kuhn and G. P. Szegö (eds.), Differential games and related topics, North-Holland, 1971.
[8] L. S. Pontryagin, On the theory of differential games, Russian Math. Surveys, 21 (1966), 193-246.
[9] J. H. Case, Toward a theory of many player differential games, SIAM J. Control, 7 (1969), 179-197.

109 (VII.1)
Differential Geometry

In differential geometry in the classical sense, we use differential calculus to study the properties of figures such as curves and surfaces in Euclidean planes or spaces. Owing to his studies of how to draw tangents to smooth plane curves, P. Fermat is regarded as a pioneer in this field. Since his time, differential geometry of plane curves, dealing with curvature, circles of curvature, evolutes, envelopes, etc., has been developed as a part of calculus. Also, the field has been expanded to analogous studies of space curves and surfaces, especially of asymptotic curves, lines of curvature, curvatures and geodesics on surfaces, and ruled surfaces. C. F. Gauss founded the theory of surfaces by introducing concepts of the geometry on surfaces (Disquisitiones circa superficies curvas, 1827). Gauss recognized the importance of the intrinsic geometry of surfaces, and it is generally agreed that differential geometry as it is known today was initiated by him. Thus differential geometry came to occupy a firm position as a branch of mathematics. The influence that differential-geometric investigations of curves and surfaces have exerted upon branches of mathematics, physics, and engineering has been profound. For example, E. Beltrami discovered an intimate relation between the geometry on a pseudo-sphere and non-Euclidean geometry. The study of geodesics is a fertile topic deeply related to dynamics, the calculus of variations, and topology, on which there is excellent work by J. Hadamard, H. Poincaré, P. Funk, G. D. Birkhoff, M. Morse, R. Bott, W. Klingenberg, and M. Berger, among others. The theory of minimal surfaces initiated by J. L. Lagrange was an application of the calculus of variations. At the early stages of development, G. Monge, J. B. M. C. Meusnier, A. M. Legendre, O. Bonnet, B. Riemann, K. Weierstrass, H. A. Schwarz, Beltrami, and S. Lie contributed to the theory. Weierstrass and Schwarz established its relationship with the theory of functions. J. A. Plateau showed experimentally that minimal surfaces can be realized as soap films by dipping wire in the form of a closed space curve into a soap solution (1873). The Plateau problem, i.e., the problem of proving mathematically the existence of a minimal surface with prescribed boundary curve, was solved by Tibor Radó in 1930 and independently by J. Douglas in 1931. Although the relationship to function theory is lost for higher-dimensional minimal submanifolds, their study is intimately related to the calculus of variations and topology.

Euclidean geometry is a geometry belonging to F. Klein's Erlangen program (→ 137 Erlangen Program). For other geometries in the sense of F. Klein we may also consider the corresponding differential geometries. For instance, in projective differential geometry we study by means of differential calculus the properties of curves and surfaces that are invariant under projective transformations. This subject was studied by E. J. Wilczynski, G. Fubini, and others; affine differential geometry and conformal differential geometry were studied by W. Blaschke and others (→ 110 Differential Geometry in Specific Spaces).

Influenced by Gauss's geometry on surfaces, in his inaugural address at Göttingen in 1854 Riemann advocated an intrinsic differential geometry completely independent of embeddings (Über die Hypothesen, welche der Geometrie zugrunde liegen; Werke, 2nd ed. 1892, 272-287) (→ 364 Riemannian Manifolds). Removing the restriction to two dimensions and considering abstract manifolds of dimension n, he introduced what is now known as the Riemannian metric; actually he considered the more general metrics that later formed the subject matter of the dissertation of P. Finsler in 1918 (→ 152 Finsler Spaces). Riemannian geometry includes Euclidean and non-Euclidean geometry as special cases, and is important for the great influence it exerted on geometric ideas of the 20th century. Under the influence of the algebraic theory of invariants, Riemannian geometry was then studied as a theory of invariants of quadratic covariant tensors by E. B. Christoffel, C. G. Ricci, and others. Riemannian geometry attracted wide attention after A. Einstein applied it to the general theory of relativity in 1916. In the same year, T. Levi-Civita introduced the notion of Levi-Civita parallelism, which contributed greatly to the clarification of geometric properties of Riemannian spaces. Observing parallelism to be an affine-geometric concept, H. Weyl and A. S. Eddington developed a theory of Riemannian spaces "affinely" based on the notion of parallelism without using metrical methods. Such a geometry is called a geometry of an affine connection (→ 80 Connections).

Every straight line in a Euclidean space has the property that all tangents to the line are parallel. In a space with an affine connection, we may define a family of curves called paths as an analog of straight lines. Such curves are solutions of a system of ordinary differential equations of the second order of a certain type. Coefficients of such differential equations determine a parallelism and hence an affine connection. H. Weyl discovered transformations of coefficients that leave the family of paths invariant as a whole, namely, projective transformations of an affine connection. A geometry that aims to study properties of paths or affine connections that are invariant under these transformations is called a projective geometry of paths. Such geometry was
studied by L. P. Eisenhart, O. Veblen, and others. The concept of projective connections was an outcome of such studies. Similarly, the concept of conformal connections was developed from the consideration of conformal transformations of Riemannian spaces.

These geometries cannot in general be regarded as geometries in the sense of Klein. Actually, any one of these geometries generally has no transformations that correspond to congruent transformations of geometries in the sense of Klein; even if it has such transformations, they do not act transitively on the space. Thus geometries are naturally divided into two categories, one consisting of geometries in the sense of Klein (based on the group concept) and the other of geometries based on Riemann's idea. Under such circumstances, E. Cartan unified the thoughts of Klein and Riemann from a higher standpoint and constructed his theory of connections in a series of papers published between 1923 and 1925. He developed the theory of affine, projective, and conformal connections from a viewpoint consistent with that of Klein. Just as each tangent space of a Riemannian manifold is viewed as a Euclidean space, an affine connection regards the tangent space at each point as an affine space and develops it onto the tangent space at an infinitesimally nearby point. In discussing projective connections Cartan attached a projective space to each point of a manifold as an infinitesimal approximation, and similarly for conformal connections. More generally, he attached to each point of a manifold a fixed Klein space, i.e., a homogeneous space of a Lie group, called the structure group. Thus Cartan introduced the concept of fiber bundle (→ 147 Fiber Bundles). Then he defined a connection as a development of the fiber, i.e., the generalized tangent space, at each point onto the fiber at an infinitesimally nearby point (→ 80 Connections B). If G is the group of congruent transformations in Euclidean space, a manifold with connection having G as its structural group is called a manifold with Euclidean connection. Among manifolds with Euclidean connection, Riemannian manifolds are characterized as those without torsion. If we take the group of congruent transformations of projective (conformal) geometry as G, we have manifolds with projective (conformal) connection in the sense of Cartan. Among these, there are remarkable ones called manifolds with normal projective (or conformal) connections, which are essentially the same as the ones studied by Veblen and others. Cartan's idea had a profound influence on modern differential geometry. The method of moving frames, created by G. Darboux and extensively used by Cartan in his theory of connections, was a forerunner of the theory of fiber bundles. Combining Grassmann algebra with differential calculus, Cartan developed a powerful computational tool known as calculus of differential forms (→ 105 Differentiable Manifolds Q). Differential forms have become indispensable in topology, algebraic geometry, and in studies of functions of several complex variables, as well as in differential geometry.

The work of Lie on transformation groups also had a profound influence on Cartan. The latter's work on Lie groups, particularly on simple Lie groups, and on differential geometry culminated in 1926 in his discovery of Riemannian symmetric spaces. These spaces are natural generalizations of the spherical surface and the unit disk in the complex plane with Poincaré metric, and play essential roles in unitary representation theory and in other areas of mathematics.

A tangent line to a curve C at a point P of C is the limit line of the line PQ, where Q is a point on C approaching P; hence we can define it locally. A concept (or property), such as this, that can be defined in an arbitrarily small neighborhood of a point of a given figure or a space is called a local concept (or local property) or a concept (or property) in the small. In the early stages of the development of differential geometry, differential calculus was the main tool of study, so most of the results were local. On the other hand, a concept (or property) that is defined in connection with a whole figure or a whole space is called global or in the large. In modern differential geometry, the study of relations between local and global properties has attracted the interest of mathematicians. This view was emphasized by Blaschke, who worked on the differential geometry of ovals and ovaloids. The study of rigidity of ovaloids by S. Cohn-Vossen belongs in this category, and many works on geodesics and minimal surfaces were done from this standpoint.

From the viewpoint of modern mathematics, the basic concepts on which we construct Riemannian geometry and geometries of connections are global concepts of differentiable manifolds. However, in Riemann's time the theory of Lie groups and topology were not yet developed; consequently, Riemannian geometry remained a local theory. In 1925, H. Hopf began to study the relations between local differential-geometric structures and the topological structures of Riemannian spaces. However, except for the work of Cartan, Hopf, and a few others, differential geometry in the 1920s was still largely concerned with surfaces in the 3-dimensional Euclidean space or local properties of Riemannian manifolds, and with affine, projective, and conformal connections.

Gradually the concept of differentiable manifolds was clarified, the global theory of Lie groups made progress, and topology developed; and the trend toward global differential geometry began slowly in the early 1930s. The dissertation of G.-W. de Rham published in 1931 showed that the cohomology of a manifold can be computed in terms of differential forms (→ 105 Differentiable Manifolds R). His theorem provides the theoretical foundation for expressing cohomological invariants of a manifold in terms of differential geometric invariants. In a series of papers immediately following de Rham's, W. V. D. Hodge established that, on a compact Riemannian manifold, every r-dimensional cohomology class can be uniquely represented by a harmonic form of degree r (→ 194 Harmonic Integrals).

An important class of complex manifolds with compatible Riemannian metric was discovered by J. A. Schouten, D. van Dantzig, and E. Kähler around 1929-1932. This class of manifolds, called Kähler manifolds today, comprises the projective algebraic manifolds. Hodge's theory of harmonic integrals is most effective when applied to compact Kähler manifolds (→ 232 Kähler Manifolds).

The most celebrated global theorem in classical differential geometry of surfaces is the Gauss-Bonnet formula (1848) (→ 364 Riemannian Manifolds D). The formula was generalized to closed hypersurfaces of Euclidean space by Hopf in 1925, to closed submanifolds of Euclidean space by C. B. Allendoerfer and W. Fenchel in 1940, and finally to arbitrary closed Riemannian manifolds by Allendoerfer and A. Weil in 1943. But the simple proof given by S. S. Chern in 1944 contained the notion of transgression, which has become essential in the theory of characteristic classes (→ 56 Characteristic Classes). The discovery of Pontryagin classes for Riemannian manifolds (1944) and Chern classes for Hermitian manifolds (1946) culminated in the index theorem and the Riemann-Roch theorem of F. E. P. Hirzebruch, and finally in the Atiyah-Singer index theorem.

A simple but fruitful idea of S. Bochner, relating harmonic forms to curvature, established vanishing theorems for harmonic forms of Riemannian manifolds and for holomorphic forms of Kähler manifolds under suitable positivity conditions for curvature. His idea has led to the vanishing theorems of K. Kodaira and others (→ 232 Kähler Manifolds D).

The work of C. Ehresmann in 1950 on connections in principal fiber bundles established a solid foundation to Cartan's theory of connections. Gauge theory in physics is largely based on the theory of connections in principal fiber bundles.

R. H. Nevanlinna's value distribution theory and its subsequent generalization by Chern and others can be best described in differential-geometric terms.

Differentiable manifolds are currently objects of research in both differential geometry and differential topology. While topology studies manifolds per se, differential geometry may be considered as the study of differentiable manifolds equipped with geometric structures, such as metric tensors, connections, (almost) complex structures, and various other tensors. Through these geometric structures, differential geometry enjoys close contact with many branches of mathematics. From its early days, differential geometry has had close ties to topology (as exemplified by the Gauss-Bonnet formula) and to partial differential equations and analytic functions (through, e.g., the study of minimal surfaces). The bonds with topology were strengthened by Morse theory and, more recently, by the theory of characteristic classes. In the most recent proof of the Atiyah-Singer index theorem, differential geometry is an important intermediary between topology and analysis. Differential geometry and algebraic geometry have enriched each other through Kähler manifolds. The theory of functions of several complex variables also has points of contact with differential geometry, such as value distribution theory and Cauchy-Riemann structures. Contact and symplectic structures are basic to mechanics. Lorentz manifolds and connections in principal bundles are essential mathematical tools in the general theory of relativity and in gauge theory. Topics such as minimal submanifolds, manifolds of positive curvature, and closed geodesics are active and important areas of research belonging to Riemannian geometry proper; at the same time, differential geometry provides a language and methods that are important in wider areas of mathematics.

References

[1] L. P. Eisenhart, A treatise on the differential geometry of curves and surfaces, Ginn, 1909.
[2] G. Darboux, Leçons sur la théorie générale des surfaces, Gauthier-Villars, I, second edition, 1914; II, second edition, 1915; III, 1894; IV, 1896.
[3] W. Blaschke, Vorlesungen über Differentialgeometrie, Springer, I, third edition, 1930; II, 1923; III, 1929 (Chelsea, 1967).
[4] L. P. Eisenhart, Riemannian geometry, Princeton Univ. Press, 1926.
[5] E. Cartan, Leçons sur la géométrie des espaces de Riemann, Gauthier-Villars, second edition, 1946; English translation, Geometry of Riemannian spaces, Math. Sci. Press, 1983.
[6] J. A. Schouten, Ricci-calculus, Springer, second edition, 1954.
[7] O. Veblen and J. H. C. Whitehead, The foundations of differential geometry, Cambridge Tracts, 1932.
[8] S. S. Chern, Topics in differential geometry, Lecture notes, Inst. for Adv. Study, Princeton, 1951.
[9] W. V. D. Hodge, The theory and applications of harmonic integrals, Cambridge Univ. Press, second edition, 1952.
[10] S. Kobayashi and K. Nomizu, Foundations of differential geometry, Interscience, I, 1963; II, 1969.
[11] E. Cartan, Oeuvres complètes I-III, Gauthier-Villars, 1952-1955; second edition, CNRS, 1984.
[12] A. L. Besse, Manifolds all of whose geodesics are closed, Springer, 1978.
[13] M. Berger, P. Gauduchon, and E. Mazet, Le spectre d'une variété riemannienne, Lecture notes in math. 194, Springer, 1971.
[14] D. Gromoll, W. Klingenberg, and W. Meyer, Riemannsche Geometrie im Grossen, Lecture notes in math. 55, Springer, 1968.
[15] S. Helgason, Differential geometry, Lie groups, and symmetric spaces, Academic Press, 1978.
[16] S. Kobayashi, Transformation groups in differential geometry, Springer, 1972.
[17] T. Levi-Civita, The absolute differential calculus, Blackie, 1926.
[18] J. W. Milnor, Lectures on Morse theory, Ann. Math. Studies 51, Princeton Univ. Press, 1963.
[19] R. Osserman, A survey of minimal surfaces, Van Nostrand, 1969.
[20] G. de Rham, Variétés différentiables, Hermann, 1955.
[21] A. Weil, Introduction à l'étude des variétés kählériennes, Hermann, 1958.
[22] K. Yano and S. Bochner, Curvature and Betti numbers, Ann. Math. Studies 32, Princeton Univ. Press, 1953.
[23] Y. Matsushima, Differentiable manifolds, Dekker, 1972.

110 (VII.17)
Differential Geometry in Specific Spaces

A. The Method of Moving Frames

The main theme of this article is the theory of surfaces (i.e., submanifolds) in a differentiable manifold $V$ on which a Lie transformation group $G$ acts.

If a Lie group $G$ of dimension $r$ acts transitively on a space $G_r$, and the stability group for any point of $G_r$ consists of the identity element only, then $G_r$ is called the group manifold of $G$, and an element of $G_r$ is called a frame. If $f_0$ is a fixed frame, then the mapping $a \to a f_0 \in G_r$ ($a \in G$) gives a diffeomorphism of $G$ to $G_r$. Let $I(G)$ be the set of all differential forms $\omega$ of degree 1 on $G_r$ that are invariant under transformations of $G$. Then $I(G)$ is a linear space of dimension $r$ and is the dual space of the Lie algebra $\mathfrak{g}$ of $G$. A basis $\{\omega_\lambda \mid 1 \le \lambda \le r\}$ of $I(G)$ is called a set of relative components of $G$. The structure equations hold:

$$d\omega_\lambda = \frac{1}{2} \sum_{\mu,\nu=1}^{r} c^{\lambda}_{\mu\nu}\, \omega_\mu \wedge \omega_\nu,$$

where $c^{\lambda}_{\mu\nu}$ are structure constants of the Lie algebra $\mathfrak{g}$.

Let $G$ be a Lie transformation group of a space $E$. Then a $G$-invariant submanifold of $E$ on which $G$ acts transitively is called an orbit. Each point $y \in E$ determines an orbit containing $y$. When there exist parameters $k_j$ ($1 \le j \le t$) such that any $G$-invariant on $E$ is a function of $k_1, \ldots, k_t$, then these parameters are called the fundamental invariants of $E$. Let $H$ ($\subset G$) be the stability group at a point $y_0$ on an orbit $M$; then $M$ is identified with the homogeneous space $G/H$ by the diffeomorphism $\varphi: M \to G/H$, $\varphi(a y_0) = aH$ ($a \in G$). Furthermore, a principal fiber bundle $(G_r, M, H, \tau)$ is determined by the projection $\tau: G_r \to M$, $\tau(a f_0) = a y_0$ ($a \in G$). The fiber $H_y$ on a point $y \in M$ is a group manifold of $H$. $H_y$ is called the family of frames on $y$, and an element of $H_y$ is a frame on $y$. Local coordinates $\theta_\rho$ ($1 \le \rho \le s$) of the group $H$ are called the secondary parameters and are used to indicate frames in $H_y$. When $H$ is not connected, let $H^0$ be the connected component of the identity of $H$ and $\tilde{M}$ be the covering manifold $G/H^0$ of $M$. An element $\tilde{y} \in \tilde{M}$ over $y \in M$ is called an oriented element. Now assume that the group $H$ is connected. Then the family of frames $H_y$ on each $y \in M$ is given as an integral manifold of a completely integrable system of total differential equations $\pi_i = 0$ ($1 \le i \le r - s$, $\pi_i \in I(G)$) on the group manifold $G_r$. Here the $\pi_i$ are linearly independent and are called the horizontal components of $M$. The $\pi_i$ are linear combinations of the relative components $\omega_\lambda$ of $G$, and their coefficients are generally functions of the fundamental invariants $k_j$. For simplicity, we assume that the relative components $\{\omega_\lambda\}$ are chosen such that the horizontal components $\pi_i$ and components $\omega_\alpha$ ($r - s < \alpha \le r$) are linearly independent. Then the $\omega_\rho$ ($1 \le \rho \le s$) are called the secondary components. The differentials $d\theta_\rho$ of the secondary parameters are linear combinations of the $\omega_\rho$. Furthermore, let $\{x_\sigma\}$ ($1 \le \sigma \le n$) be local coordinates of $E$; then the differentials $dx_\sigma$ are linear combinations of the differentials $dk_j$ and the horizontal components $\pi_i$.

Let $G$ be a Lie transformation group of a space $V$. We regard two $m$-dimensional surfaces $W_1$ and $W_2$ passing through $x \in V$ as equivalent if they have a contact of order $p$ at $x$. Then an equivalence class of submanifolds is called a contact element of order $p$ at the point $x$. Let $E_p$ be the set of all contact elements of order $p$ at $x$, where $x$ runs over all the points in $V$. A contact element of order $p$ naturally determines a contact element of order $p - 1$, and we denote this correspondence by $\psi: E_p \to E_{p-1}$. Thus we obtain the series of correspondences

$$V = E_0 \leftarrow E_1 \leftarrow \cdots \leftarrow E_{p-1} \xleftarrow{\psi} E_p \leftarrow \cdots,$$

where a contact element of order 0 is identified with a point of $V$. Since a transformation on the space $V$ induces a transformation on $E_p$, $G$ is also a Lie transformation group of $E_p$, and this transformation commutes with the mapping $\psi$. The fundamental invariants $k_j$ of $E_p$ are said to be of order $p$. We use similar terminology (such as frames of order $p$, etc.) throughout this article. The fundamental invariants $k_j$ of order $p$ ($1 \le j \le t_p$) can be chosen such that they contain the fundamental invariants $k_i$ ($1 \le i \le t_{p-1}$) of order $p - 1$. The additional $t_p - t_{p-1}$ invariants $k_\alpha$ ($t_{p-1} < \alpha \le t_p$) are called the invariants of order $p$. The family $H_y^p$ ($y \in E_p$) of frames of order $p$ can be chosen such that $H_y^p$ is contained in the family $H_z^{p-1}$ ($z = \psi y \in E_{p-1}$) of frames of order $p - 1$. If necessary, the family $H_y^p$ of frames of order $p$ can be made connected by defining an orientation of contact elements of order $p$. Furthermore, the horizontal components $\pi_j$ ($1 \le j \le r - s_p$) of order $p$ can be chosen such that they contain the horizontal components $\pi_i$ ($1 \le i \le r - s_{p-1}$). The additional $s_{p-1} - s_p$ components $\pi_\alpha$ ($r - s_{p-1} < \alpha \le r - s_p$) are called the principal components of order $p$.

Let $W$ be an $m$-dimensional surface of a space $V$. The contact element of order $p$ ($\ge 0$) is determined at every point of $W$ and expressed by the family of frames of order $p$ and the values of invariants of orders less than or equal to $p$. Let $\{u_i\}$ ($1 \le i \le m$) be local coordinates on $W$. Then the differentials $du_i$ are given as linear combinations of linearly independent differential forms $\pi_i$ ($1 \le i \le m$), where the $\pi_i$, called the basic components of $W$, are certain linear combinations of the differentials of the fundamental invariants of $V$ and of the horizontal components of orbits of $V$. Let $F_p(W)$ be the set of all families of frames of order $p$; then $F_p(W)$ depends on $m$ parameters $u_i$ and $s_p$ secondary parameters $\theta_\rho$ of order $p$. On the space $F_p(W)$, the differentials of invariants of orders less than or equal to $p - 1$ and the principal components of orders less than or equal to $p - 1$ are linear combinations of the basic components, whose coefficients are functions of the invariants of orders less than or equal to $p$. The differentials of invariants of order $p$ and the principal components of order $p$ are linear combinations of the basic components:

$$dk_\alpha = h_{\alpha 1} \pi_1 + \cdots + h_{\alpha m} \pi_m, \quad t_{p-1} < \alpha \le t_p,$$
$$\pi_\alpha = b_{\alpha 1} \pi_1 + \cdots + b_{\alpha m} \pi_m, \quad r - s_{p-1} < \alpha \le r - s_p,$$

where the coefficients $h_{\alpha i}$, $b_{\alpha i}$ are functions of the invariants of orders less than or equal to $p$ and, in general, the secondary parameters $\theta_\rho$ of order $p$. These coefficients are called the coefficients of order $p$. Let $I_p$ be a subgroup of $G$ preserving a family of frames of order $p$ and $D_p$ be a space whose coordinates are coefficients $(h_{\alpha i}, b_{\alpha i})$ of order $p$. Then $I_p$ acts on $D_p$ as a transformation group. Knowledge of the properties of contact elements of order less than or equal to $p$ can be utilized to obtain information about the invariants of order $p + 1$, etc. In fact, if we can choose in the $I_p$-space $D_p$ a subspace $C_p$ that intersects each orbit in $D_p$ at one and only one point, then in general the secondary parameters of order $p$ associated with the points in $C_p$ correspond to the frames of order $p$, and the parameters associated with the points in $C_p$ are the invariants of order $p + 1$. The restrictions of the coefficients of order $p$ to $C_p$ are functions of the invariants of orders less than or equal to $p + 1$; they are independent of the secondary parameters of order $p$. Thus the frames of order $p + 1$ and the invariants of orders less than or equal to $p + 1$ determine the contact elements of order $p$ of $W$ and their differentials; generally, the latter can be utilized to determine the contact elements of order $p + 1$.

This process of obtaining information of "order $p + 1$" utilizing a suitable subspace $C_p$ of $D_p$ is the so-called general method of moving frames. However, the surface $W$ may contain points for which the general method does not apply. Actually, there are surfaces $W$ for which the method does not apply for any point in $W$. Thus various methods of moving frames are necessary to cope with different kinds of surfaces. In the actual application of the method of moving frames, we use certain devices that help to simplify the calculations. In fact, an infinitesimal transformation $(\delta h_{\alpha i}, \delta b_{\alpha i})$ of the group $I_p$ acting on the space $D_p$ is expressed as a linear combination of the secondary components of order $p$; this expression is easily obtained by means of the structure equations of

G. The group I,,,, is a subgroup of I, Iïxing group, or projective differential geometry, cari
every point of the subspace C,, and its in- be found in the Theory of surfaces by J. G. .
lïnitesimal transformation is such that 6hai = 0, Darboux. The subject has been systematically
Sb,, = 0. The secondary components of order studied by H. G. H. Halphen, E. J. Wilczynski,
p + 1 are immediately obtained from the and G. Fubini. The Fubini theory was en-
equations for dk, and rr,. Furthermore, when riched substantially by E. Cartan, E. Lech, E.
m > 2, the condition for the principal compo- Bompiani, and J. Kanitani.
nents of every order to satisfy the structure In this section we consider a surface S in a
equations of G is essential to the problem of 3-dimensional projective space. Let A(u’, u’)
the existence of (m-dimensional) surfaces. (u’, uz are parameters on S) be a point of S,
As we apply the method of moving frames and associate with A a11 the frames [A, A,,
consecutively to a surface W, we eventually A,,A,l(IA,A,,A,,A,l=l),whereA,,A,,A,
arrive at the order ré having the following are points of the tangent plane to S at A. A
properties: The families of frames of order family of such frames is called the family of
4 + 1 coincide with those of order q, and the frames of order 1, and we express its differen-
invariants k, of order q + 1 are expressed as tial by
functions cp&k,) of invariants k, of order less 3
than or equal to 4. In this case, the families of dA,= 1 w,BAp, a=O,1,2,3, A,=A.
frames of orders q +j (j > 1) are a11 equal, and p=o
the invariants of orders q + j are partial (j - l)- The 0: are tPfaffian forms that depend on two
derivatives of functions cpB(k,). The family of principal parameters determining the origin A
frames of order q is called the Frenet frame. and ten secondary parameters determining the
The differential invariants on a surface are frame. Wehavew~+w~+w~+w3=O,w~=O.
defined to be differential forms generated by Furthermore, m1 = ~0, re2 = wi are indepen-
the basic components and the invariants of dent of each other and depend on the principal
each order. parameters only. Let zi, z2, z3 be tnonhomo-
Specilïcally, assume that the group G is an geneous coordinates with respect to a frame of
analytic transformation group of V, and the order 1. Then in a neighborhood of the origin,
m-dimensional surfaces W,, W, are analytic. S is expressed by z3 = Cz2f*, where the f, are
Then there exists an element 9 of G such that homogeneous functions of degree r with re-
gW, = W, if and only if W, and W, are of the spect to zi, zz.
same kind and have the same relations among If we Write fi =(ao(z’)’ + 2a,z’z2 + az(z2)*)/
the invariants of orders less than or equal to 2, then it follows from the structure equations
q + 1. These relations are called the natural of the projective transformation group that
equations of the surface. The theory of sur- w~=aoo1+a,r02,0$=a1c01+a2mZ.Ifwe
faces based on the analysis of the natural put rp2 = ao( + 2a, w1 w* + a2(W2)*, then
equations of surfaces is called natural geome- a curve on S defined by <P*= 0 is called the
try. The reduction formula cari be obtained by asymptotic curve and its tangent the asymp-
utilizing the Frenet frame; it gives the equation totic tangent. At any point of this curve, the
of the surface in the form of power series con- plane tangent to S is in contact of order 2
taining the invariants of each order. with this curve, and there are in general two
Various results are known concerning the asymptotic curves through any point of S.
theory of surfaces of the spaces V, , V, sharing Equations of the asymptotic tangent at A
the same transformation group G. We also are given by z3 = 0, f2 = 0. A point of S at
have a theory of special surfaces whose in- which the asymptotic tangents coincide is
variants satisfy specific functional relations. called a parabolic point. If every point of S is
Furthermore, we have problems concerning parabolic, then S is a tdevelopable surface, and
the deformation of a surface (preserving some the general theory is not applicable to such a
differential invariants). Actually, the theory of surface.
surfaces of dimension m other than curves and Among the family of frames of order 1, a
hypersurfaces is in general quite dihïcult. The frame satisfying a0 = a2 = 0, a, = 1 is called the
methods of tensor calculus cari be applied to frame of order 2. For this frame, the straight
the study of surfaces. The theory of tconnec- lines AA,, AA, are asymptotic tangents. With
tions cari be considered to be an outgrowth of respect to this frame, if f3 = -(bo(z’)3 +
the study of surfaces by means of the method 3b,(z’)‘z* + 3b,z’(z*)* + b3(z2)3)/3, then r$ =
of moving frames and tensor calculus. b,w’+b,w*, -w;+w:+w;-w;=2(b,w’+
b,~*), w: = b,w’ + b,u*, and the quadric
B. Projective Differential Geometry surface z3=z’z2-z3(b,z’ +b2z2+pz3) (with p
arbitrary) is called Darboux’s quadric at A, an
The rudiments of differential geometry sub- especially interesting one among contact quad-
ordinated to the tprojective transformation ries of S. Darboux’s curve is a curve on S such

that Darboux’s quadric is in contact of order 3 A into A and the image C~(S) is in contact of
at any point of it. Its tangent is called Dar- order 2 with S at A, then the pointwise corre-
boux’s tangent and is given by z3 =0, b,(z’)3 + spondence is called a projective deformation.
b3(z2)3 = 0. A necessary and sufficient condition for the
We have bob3 #O, except in the case of existence of a projective deformation between
truled surfaces. We take special frames of two surfaces is that these surfaces have the
order 2 determined by b, = b, = 0, b, = b, = 1 same projective line element [6]. A ruled sur-
and cal1 them the frames of order 3. If a frame face is projectively deformable only to a ruled
of order 3 satisiïes f4= -(c~(z~)~+~c,(z~)~z~ surface. Given an arbitrary surface S, it is
+6(c,- 1)(z’z2)2+4c3z1(z2)3+c4(z2)4)/12, generally impossible to find a surface that is
then c$ - 20: + w$ = cOwl + cl w2, ~3 -WY different from S and projectively deformable to
=c1w’+c2w2, w:-w~=c2w1fc302, wg+o; S; some conditions must be satistïed [6,8].
- 21~02= c3w1 + c4w2. With respect to this Let pol, po2, po3, p12, p13, p23 be tP1ücker
family of frames of order 3, ((w’)~ +(w’)~)/ coordinate of a straight line in a 3-dimensional
2w’w’ is an invariant associated with two projective space P3. Then we have pol p23 -
neighboring points of S, called the projective po2p’3+po3p’2=0, and there is a one-to-
line element. Also with respect to this frame, one correspondence between the ratios of {p”}
two straight lines AA,, A, A, are polar with and straight lines (- 90 Coordinates B). If the
respect to Darboux’s quadric. pij are regarded as homogeneous coordinates
Among the families of frames of order 3, a of a 5-dimensronal projective space P5, then
frame satisfying c, = c1 = c3 =0 is called a the previous equation detïnes a hyperquadric
frame of order 4. With respect to this frame, Q in P5. Thus there is a one-to-one correspon-
there exist A, p, v, p such that WY = Âw’ + pw2, dence between points of Q and straight lines in
col = vcd + pw2, coi = pd + 1w2. Hence if we P3. A curve on Q corresponds to a set of one-
put ca = - 3a, c4 = - 3b, it follows that parameter families of straight lines, or a ruled
surface. Sets of 2-parameter or 3-parameter
families of straight lines corresponding to
surfaces of 2 or 3 dimensions on Q in P5 are
called congruences of lines or complexes of
lines, respectively. Thus by using a theory of
TO= -(3/2)(aw’+bw*), 7; =(1/2)(aw’ +bo2). surfaces in P5, it is possible to establish the
theory of congruences and complexes [2,6,8],
Thus the frame of order 4 is the Frenet frame which is an important part of projective dif-
and is attached to every point of S. This frame ferential geometry.
is called the normal frame, and the invariants Specitïcally, if the surface is either a curve or
a, b, 1, p, v, p are called the fundamental differ- a hypersurface, there are numerous interesting
ential invariants. The straight lines AA, and results [2,4,6].
A, A, associated with the normal frame are
called directrices of Wilczynski of the Iïrst and
second kind, respectively. With respect to the C. Affme Differential Geometry
normal frame, S is expressed by
The theme of general affine differential geome-
try is the study of differential-geometric prop-
+(a(~‘)” + b(z2)4)/4+(z1z2)2/2+ ...
erties of a point or set of points in a space
A necessary and sufftcient condition for two that are invariant under the action of the
surfaces S, S to be projectively equivalent is taffrne transformation group. Affine differen-
that there be normal frames having the same tial geometry is the study of the properties
wl, cu2 and the same six fundamental differen- invariant under the action of the tequivalent
tial invariants. For six quantities a, b, 1, p, v, p affine transformation group, i.e., a subgroup of
to be fundamental differential invariants of a the afftne transformation group formed by
surface, they must satisfy a certain condition of elements sending (xi) to (Xi) such that
existence [6].
A frame of order 1 such that AA, and A, A, xi=a,+ c n,xj, i=l,...,n, det(aij) = 1.
j=l
are polar with respect to Darboux’s quadric
is called Darboux’s frame. With respect to The latter transformation leaves invariant the
this frame also, a theory of surfaces has been volume surrounded by an oriented closed
established. hypersurface. The method of moving frames is
Consider a pointwise correspondence be- effective in affine differential geometry.
tween two surfaces S, S, and denote by AES Let C be a plane curve, and associate with
the point corresponding to AES. If there exists any point A = A(t) of C a family of frames
a projective transformation <p that transforms [A, e,, e,], where the area of the parallelogram

determined by the two vectors e,, e2 is equal by A. B the tinner product of hyperspheres
to 1. This frame is called the frame of order 0. A, B, we obtain
Its differential is expressed by
A;A,=g,,> cr,fi=O,l,..., n,co,

dA = ; use,, de, = 5 use,, r=1,2, where


S=I S=I
co:+o$=o.
(Sa@)=
The frames of order 1,2, and 3 are character-
ized by w2=O; w’=O, wi=o’; and w2=0,
i,j=l,..., n, Sij = Sji'
wf = WI, wi =O, respectively. Then a frame of
order 3 cari be associated with each point of C Let z” be homogeneous coordinates with
and coincides with the Frenet frame. We cal1 respect to Y?. Then the +Mobius transforma-
a1 = do the affine arc element; the affine curva- tion z+Z of S” is characterized by Z”=c;za,
ture K is delïned by coi = -K~O. Then the where g,&c,B = gcr, Ici1 #O. The differential of
Frenet formula is given by the family of the frames is defmed by
dA=dae,, de, =dae,, de,= -Kdae,.
(1)
With respect to this frame, C is expressed as
where
y=x2/2+~x4/8+(d~/da)x5/40+ ... .

Further, do and K are given analytically by

da= IdA,d2A)1’3, ~=[d’AJda’,d~A/da~[,

where lM, N 1= det(M, N). M, N are column


vectors with two entries. We cal1 a the affine
There are (n + l)(n + 2)/2 linearly independent
arc length. The straight line on which e2 is
forms among w, and this is the number of
situated is called the affine normal, the dia-
parameters of the Mobius transformation
meter of the parabola osculating C at A. If rc is
group. The structure equations of this group
constant, then C is a conic section. Further-
are
more, C is an ellipse, hyperbola, or parabola
according as the constant K is positive, nega- doi=Co;r\o,P. (2)
tive, or zero. In allïne geometry, parabolas play The theme of conforma1 differential geometry
a role similar to that played by straight lines in is the properties of Pfaflïan forms w! satisfying
Euclidean geometry. (1) and (2).
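The formulas for the affine arc element and affine curvature, and the classification of conics by the sign of the affine curvature, can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes each conic is already parametrized by its affine arc length (so that |A', A''| = 1), and the function names and the three parametrizations are illustrative choices.

```python
import math

def det2(u, v):
    """Determinant |u, v| of two plane vectors (the text's |M, N|)."""
    return u[0] * v[1] - u[1] * v[0]

def ellipse(a, b, sigma):
    """First three derivatives of A(sigma) = (a cos u, b sin u), u = sigma / c,
    with c = (a b)^(1/3) chosen (an assumption) so that |A', A''| = 1."""
    c = (a * b) ** (1.0 / 3.0)
    u = sigma / c
    d1 = (-a * math.sin(u) / c, b * math.cos(u) / c)
    d2 = (-a * math.cos(u) / c**2, -b * math.sin(u) / c**2)
    d3 = (a * math.sin(u) / c**3, -b * math.cos(u) / c**3)
    return d1, d2, d3

def hyperbola(a, b, sigma):
    """Same for the hyperbola branch A(sigma) = (a sinh u, b cosh u), u = sigma / c."""
    c = (a * b) ** (1.0 / 3.0)
    u = sigma / c
    d1 = (a * math.cosh(u) / c, b * math.sinh(u) / c)
    d2 = (a * math.sinh(u) / c**2, b * math.cosh(u) / c**2)
    d3 = (a * math.cosh(u) / c**3, b * math.sinh(u) / c**3)
    return d1, d2, d3

def parabola(sigma):
    """Same for the parabola A(sigma) = (sigma, sigma^2 / 2)."""
    return (1.0, sigma), (0.0, 1.0), (0.0, 0.0)

def affine_invariants(derivs):
    """Return (|A', A''|, |A'', A'''|).
    The first value is 1 when sigma is the affine arc length;
    the second is then the affine curvature."""
    d1, d2, d3 = derivs
    return det2(d1, d2), det2(d2, d3)

if __name__ == "__main__":
    samples = [
        ("ellipse", ellipse(2.0, 1.0, 0.4)),
        ("hyperbola", hyperbola(2.0, 1.0, 0.4)),
        ("parabola", parabola(0.4)),
    ]
    for name, derivs in samples:
        unit, kappa = affine_invariants(derivs)
        print(name, "normalization:", unit, "affine curvature:", kappa)
```

For these parametrizations the affine curvature works out to (ab)^(-2/3) for the ellipse, -(ab)^(-2/3) for the hyperbola, and 0 for the parabola, in agreement with the sign rule stated above.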
There are numerous results concerning the Consider a transformation c: C zaA,-*
theory of skew curves and surfaces [ 11. Con- C z”(A, +dA,). (1) If a11 o vanish except ~0,
cerning the theory of skew curves, results on then a11 the circles through A,, A, are in-
affine length, affine curvature, affine torsion, variant, and any point P is transformed to a
affine principal normais, and affine hinormals neighboring point P on the circle, such that
are similar to those in Euclidean geometry. the cross ratio (P, P; A,, A,) is constant. This
The affine transformation group is situated transformation is called the homothety with
between the projective transformation group centers A,, A,. (2) If a11 w vanish except 00,
and the congruent transformation group and WY =C gikwk, then a11 the circles tangent to a
hence has properties analogous to theirs. The lïxed direction at A, are transformed into
theory of surfaces has a character similar to themselves, and any hypersphere through A,
that of projective differential geometry [ 11. and orthogonal to those circles is transformed
We may also consider the variation of the into a hypersphere having the same property.
affine area of a surface surrounded by a closed This transformation is called the elation with
skew curve C. We cal1 the extremal surface the tenter A,. (3) If a11 o vanish except OF =
affine minimal surface. W. Blaschke and others WA, then the transformation is an
C9ikwkt
obtained many results on the global properties elation with tenter A,. (4) If a11 o vanish
of such surfaces. except ce{, then the transformation is an in-
lïnitesimal rotation with tenter A,, with A,
D. Conforma1 Differential Geometry regarded as a point at infinity. Thus any in-
lïnitesimal Mobius transformation is de-
Let S” be a tconformal space of dimension composed into the previous four types of
n, and associate with each point A,, ES” a transformation.
frame %[A,, A,, . . , A,, A,] of the +(n + 2)- TO study the theory of curves and hypersur-
hyperspherical coordinates with origin A, faces in S”, we again utilize the Frenet frame
(- 76 Conforma1 Geometry). Then denoting chosen from a family of frames associated with

$A_0$. For example, the Frenet formula of a curve in $S^3$ is given by

[...]

We call $d\sigma$, $\kappa$, and $\tau$ the conformal arc element, conformal curvature, and conformal torsion, respectively. There are many results on conformal deformation [5].

Concerning Laguerre differential geometry, we have results dual to those in conformal differential geometry (a point is replaced by a straight line and an angle by a distance between the points of contact of the common tangents of two oriented circles).

E. Contact Manifolds

Consider a $(2n+1)$-dimensional differentiable manifold $M^{2n+1}$ with a 1-form $\eta$ such that $\eta \wedge (d\eta)^n \neq 0$, where $d\eta$ is the exterior derivative of $\eta$ and $\wedge$ denotes exterior multiplication. (Note that this is true for the 1-form in the left-hand side of eq. (2) in 82 Contact Transformations A.) Such a manifold is called a contact manifold with contact form $\eta$. The structure group of the tangent bundle of a contact manifold $M^{2n+1}$ reduces to $U(n) \times 1$, where $U(n)$ is the unitary group; hence every contact manifold is orientable. Simple but typical examples are given by the unit sphere $S^{2n+1}$ in Euclidean space $E^{2n+2}$ and the tangent sphere bundle of an $(n+1)$-dimensional Riemannian manifold $M^{n+1}$, both with natural contact forms (S. S. Chern [9]). Every 3-dimensional compact orientable differentiable manifold is a contact manifold (J. Martinet [12]).

Now a differentiable manifold $M^{2n+1}$ is said to be an almost contact manifold if it admits a tensor field $\varphi$ of type (1, 1), a vector field $\xi$, and a 1-form $\eta$ such that

$$\varphi^2 X = -X + \eta(X)\xi, \qquad \eta(\xi) = 1, \tag{3}$$

where $X$ is an arbitrary vector field on $M^{2n+1}$; and the triple $(\varphi, \xi, \eta)$ is then called an almost contact structure. (3) implies that $\varphi\xi = 0$ and $\eta(\varphi X) = 0$ (S. Sasaki [14, I]). The structure group of the tangent bundle of an almost contact manifold $M^{2n+1}$ reduces to $U(n) \times 1$. Indeed, J. W. Gray [10] took this property as his definition of almost contact structure. For any pair of vector fields $X$ and $Y$ on $M^{2n+1}$, let

$$N(X, Y) = [X, Y] + \varphi[\varphi X, Y] + \varphi[X, \varphi Y] - [\varphi X, \varphi Y] - \{X \cdot \eta(Y) - Y \cdot \eta(X)\}\xi,$$

where $[\ ,\ ]$ is the Poisson bracket; then $N$ is a tensor field of type (1, 2) over $M^{2n+1}$, which we call the torsion tensor of the almost contact structure $(\varphi, \xi, \eta)$. When $N$ vanishes identically on $M^{2n+1}$, we say that the almost contact structure is normal.

An almost contact structure $(\varphi, \xi, \eta)$ on $M^{2n+1}$ induces naturally an almost complex structure $J$ on $M^{2n+1} \times R$ (resp. $M^{2n+1} \times S^1$), which reduces to a complex structure if and only if $(\varphi, \xi, \eta)$ is normal. A similar statement is also valid for the product space of two almost contact manifolds (A. Morimoto [13]).

If $M^{2n+1}$ is an almost contact manifold with structure tensor $(\varphi, \xi, \eta)$, we can find a positive definite Riemannian metric $g$ so that $g(\varphi X, \varphi Y) = g(X, Y) - \eta(X)\eta(Y)$ for any pair of vector fields $X$ and $Y$, and the set $(\varphi, \xi, \eta, g)$ is then said to be an almost contact metric structure.

When $M^{2n+1}$ is a contact manifold with contact form $\eta$, there exists a unique vector field $\xi$ which satisfies $d\eta(X, \xi) = 0$, $\eta(\xi) = 1$ for any vector field $X$. We can then find a tensor field $\varphi$ of type (1, 1) and a positive definite metric tensor $g$ so that (i) $d\eta(X, Y) = g(X, \varphi Y)$ is satisfied for any pair of vector fields $X$ and $Y$, and (ii) $(\varphi, \xi, \eta, g)$ is an almost contact metric structure. The almost contact metric structure determined in this way by a contact form $\eta$ is called a contact metric structure. A differentiable manifold with normal contact metric structure is called a normal contact Riemannian manifold or a Sasakian manifold. Brieskorn manifolds are examples of such manifolds. They include, besides the standard sphere $S^{2n+1}$, all exotic $(2n+1)$-spheres that bound compact oriented parallelizable manifolds. An almost contact manifold is said to be regular or nonregular according as the integral curve of $\xi$ is regular or not as a submanifold. A compact regular contact manifold is a principal circle bundle over a symplectic manifold, and it admits a normal contact metric structure if and only if the base manifold is a Hodge manifold (Boothby and Wang [8], Hatakeyama [11]).

Many research papers on the topology and differential geometry of manifolds with the structures defined above have been published by S. Tanno, S. Tachibana, D. E. Blair, M. Okumura, K. Ogiue, S. I. Goldberg, and others.

References

[1] W. Blaschke, Vorlesungen über Differentialgeometrie, Springer, II, 1923; III, 1929 (Chelsea, 1967).

[2] G. Bol, Projektive Differentialgeometrie I, II, Vandenhoeck & Ruprecht, 1950.
[3] E. Cartan, La théorie des groupes finis et continus, Gauthier-Villars, 1951.
[4] E. Cartan, Leçons sur la théorie des espaces à connexion projective, Gauthier-Villars, 1937.
[5] P. C. Delens, Méthodes et problèmes des géométries différentielles euclidienne et conforme, Gauthier-Villars, 1927.
[6] G. Fubini and E. Čech, Introduction à la géométrie projective différentielle des surfaces, Gauthier-Villars, 1931.
[7] J. Kanitani, Géométrie différentielle projective des hypersurfaces, Mem. Ryojun Coll. of Eng., 1931.
[8] W. M. Boothby and H. C. Wang, On contact manifolds, Ann. Math., (2) 68 (1958), 721-734.
[9] S. S. Chern, Pseudo-groupes continus infinis, Coll. int. du CNRS, Geom. Diff., Strasbourg (1953), 119-135.
[10] J. W. Gray, Some global properties of contact structure, Ann. Math., (2) 69 (1959), 421-450.
[11] Y. Hatakeyama, Some notes on differentiable manifolds with almost contact structure, Tôhoku Math. J., (2) 15 (1963), 176-181.
[12] J. Martinet, Formes de contact sur les variétés de dimension 3, Proc. Liverpool Singularities Symposium II (1971), 142-163.
[13] A. Morimoto, On normal almost contact structures, J. Math. Soc. Japan, 15 (1963), 420-436.
[14] S. Sasaki, On differentiable manifolds with certain structures which are closely related to almost contact structure. I, Tôhoku Math. J., (2) 12 (1960), 459-476. II (with Y. Hatakeyama), ibid., 13 (1961), 281-294.
[15] K. Yano and S. Ishihara, Tangent and cotangent bundles, Dekker, 1973.
[16] D. E. Blair, Contact manifolds in Riemannian geometry, Lecture notes in math. 509, Springer, 1976.

111 (VII.12)
Differential Geometry of Curves and Surfaces

A. General Remarks

Let $f$ be an †immersion of an $m$-dimensional †differentiable manifold $M$ of class $C^r$ into an $n$-dimensional Euclidean space $E^n$. More precisely, $f$ is a differentiable mapping of class $C^r$ such that the †differential $df_p$ is injective at every point $p$ of $M$. The pair $(M, f)$ is called an immersed submanifold (or a surface) of $E^n$. When $m = 1$, we call it a curve of $E^n$, and when $m = n - 1$, a hypersurface in $E^n$. The cases of $n = 2$ and $n = 3$ have been the main objects of study in differential geometry of curves and surfaces. The differential-geometric properties for the general case of immersion are discussed in 365 Riemannian Submanifolds.

B. Frames in $E^n$

Every †Euclidean motion in $E^n$ can be expressed as the product of a parallel translation and an †orthogonal transformation that keeps the origin of $E^n$ fixed. The set of all parallel translations is a commutative group that can be identified with $R^n$. It is a normal subgroup of the group of motions $I(E^n)$ of $E^n$. So we see that $I(E^n)$ is a †semidirect product of $R^n$ and the †orthogonal group $O(n)$. The Lie algebra of $I(E^n)$ is the direct sum of $R^n$ and the Lie algebra $\mathfrak{o}(n)$ of the orthogonal group, where both are regarded as additive groups. Corresponding to this decomposition, we can write the †Maurer-Cartan differential form over $I(E^n)$ as $\omega + \Omega$, where $\omega$ belongs to $R^n$ and $\Omega$ to $\mathfrak{o}(n)$. The †structural equation $d(\omega + \Omega) = -(1/2)(\omega + \Omega) \wedge (\omega + \Omega)$ can be divided into the following two parts: $d\omega = \Omega \wedge \omega$; $d\Omega = -(1/2)\,\Omega \wedge \Omega$. These are known as the structure equations of $E^n$.

By an orthogonal frame in $E^n$ we mean an ordered set $(x, e_1, \ldots, e_n)$ consisting of a point $x$ and a set of †orthonormal vectors $e_1, e_2, \ldots, e_n$. We denote by $\mathcal{O}(n)$ the set of all orthogonal frames in $E^n$. If we denote the translation identified with $x \in R^n$ by $T_x$, then there is a one-to-one correspondence $\varphi: I(E^n) \to \mathcal{O}(n)$ given by $\varphi(T_x A) = (x, Ae_1, \ldots, Ae_n)$ ($A \in O(n)$). We can make $\mathcal{O}(n)$ into a differentiable manifold so that $\varphi$ is a †diffeomorphism. We denote the differential forms over $\mathcal{O}(n)$, which are images of $\omega$ and $\Omega$ under the †dual mapping of $\varphi^{-1}$, by the same letters $\omega$ and $\Omega$, respectively. For $\mathcal{O}(n)$ as a †principal fiber bundle over $R^n$ with the projection $\pi$: $\pi(x, e_1, \ldots, e_n) = x$ and $n$ vector-valued functions $\varphi_i$: $\varphi_i(x, e_1, \ldots, e_n) = e_i$ over $\mathcal{O}(n)$, we have

$$\omega = \sum_i \omega^i e_i, \qquad \Omega = \sum_{i<j} \Omega^{ij} E_{ij},$$
$$\omega^i = (d\pi, \varphi_i), \qquad \Omega^{ij} = (d\varphi_i, \varphi_j), \tag{1}$$
$$d\omega^i = \sum_j \Omega^{ij} \wedge \omega^j, \qquad d\Omega^{ij} = \sum_k \Omega^{ik} \wedge \Omega^{kj},$$

where $\{E_{ij}\}$ is a basis of $\mathfrak{o}(n)$ defined by $E_{ij}e_j = e_i$, $E_{ij}e_i = -e_j$, $E_{ij}e_k = 0$ ($k \neq i, j$) and $(\ ,\ )$ is the scalar product of vector-valued forms induced from the scalar product of $E^n$. Any diffeomorphism of $\mathcal{O}(n)$ onto itself preserving $\omega$ and $\Omega$ must be a Euclidean motion.
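The description of the group of motions as a semidirect product of the translations and the orthogonal group can be illustrated in the plane. The sketch below is not from the original text: it assumes a motion is stored as a pair (a, A) acting by x -> Ax + a, and the helper names (`rotation`, `act`, `compose`, `inverse`) are hypothetical. It checks that conjugating a translation by a motion is again a translation, which is the normality of the translation subgroup.

```python
import math

def rotation(theta):
    """2x2 rotation matrix, an element of O(2)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def mat_vec(A, x):
    """Matrix-vector product A x."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

def mat_mul(A, B):
    """Matrix product A B for square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def act(motion, x):
    """Apply the motion (a, A) to the point x: x -> A x + a."""
    a, A = motion
    Ax = mat_vec(A, x)
    return [Ax[i] + a[i] for i in range(len(x))]

def compose(m1, m2):
    """Motion 'm1 after m2': (a1, A1)(a2, A2) = (a1 + A1 a2, A1 A2).
    The translation part is twisted by A1; this is the semidirect product law."""
    (a1, A1), (a2, A2) = m1, m2
    A1a2 = mat_vec(A1, a2)
    return ([a1[i] + A1a2[i] for i in range(len(a1))], mat_mul(A1, A2))

def inverse(motion):
    """Inverse motion: (a, A)^(-1) = (-A^T a, A^T), using A^(-1) = A^T in O(n)."""
    a, A = motion
    At = [list(row) for row in zip(*A)]
    return ([-v for v in mat_vec(At, a)], At)

if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    m = ([1.0, 2.0], rotation(0.7))   # a general motion
    t = ([3.0, -1.0], identity)       # a pure translation T_b
    conj = compose(compose(m, t), inverse(m))
    # conj should be the translation by A b, with identity rotation part
    print("conjugate:", conj)
    print("A b      :", mat_vec(m[1], t[0]))
```

Storing the pair (a, A) rather than a full (n+1) x (n+1) homogeneous matrix keeps the twisted multiplication rule visible; a homogeneous-matrix representation would be an equally valid design.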

C. Theory of Curves

Let $(M, f)$ be an immersion of a 1-dimensional differentiable manifold $M$ into $E^n$. We identify the tangent space of $E^n$ at each point with $E^n$ itself. Then $df_x$ maps the origin of the tangent space $M_x$ to $f(x)$, and the image $df_x(M_x)$ of $M_x$ by $df_x$ is a straight line passing through $f(x)$ in $E^n$, called the tangent line of $f(M)$ at $f(x)$. By $O_f(M)$ we mean the set of all ordered sets $(x, e_1, \ldots, e_n)$, where $x \in M$ and $\{e_i\}$ is an orthonormal basis of $E^n$ such that $e_1 \in df_x(M_x)$. Then $O_f(M)$ can be naturally immersed in $\mathcal{O}(n)$ by the mapping $\tilde f$: $\tilde f(x, e_1, \ldots, e_n) = (f(x), e_1, \ldots, e_n)$. We can pull back the differential forms $\omega$, $\Omega$, $\omega^i$, $\Omega^{ij}$ over $\mathcal{O}(n)$ to $O_f(M)$ by $\tilde f^*$ and denote them by $\theta$, $\Theta$, $\theta^i$, and $\Theta^{ij}$, respectively; then we have $\theta^i = 0$ ($i > 1$). Let $f_1$ and $f_2$ be two immersions of $M$ into $E^n$. Then in order for there to exist a Euclidean motion $\alpha$ of $E^n$ such that $f_1 = \alpha \circ f_2$, it is necessary and sufficient that there exist a diffeomorphism $\varphi$ of $O_{f_1}(M)$ onto $O_{f_2}(M)$ such that $\theta_{f_1} = \varphi^*(\theta_{f_2})$, $\Theta_{f_1} = \varphi^*(\Theta_{f_2})$. Let $\pi_f$ be the projection of the fiber bundle $O_f(M)$, and let $\varphi_i$ be naturally defined vector-valued functions over $O_f(M)$. Then we have $d(f \circ \pi_f) = \theta^1 \varphi_1$, $d\varphi_i = \sum_j \Theta^{ij} \varphi_j$. If we put $ds^2(X) = \|df_x(X)\|^2$ ($X \in M_x$), then we have $(\theta^1)^2 = \pi_f^*(ds^2)$. For each point $x \in M$ there are two possibilities for the choice of $e_1$ corresponding to two orientations of the curve. But since $(d\varphi_1, d\varphi_1) = \pi_f^*(\rho^2 ds^2)$, $\rho^2$ depends only on the point $x$ of $M$. We call $\rho$ ($\geq 0$) the absolute curvature. We now choose an orientation of the curve and then $e_1$ in accordance with the orientation. Thus we get a submanifold of $O_f(M)$, which we again express by the notation $O_f(M)$. If we define the form $ds$ by $ds(X) = (df(X), e_1)$, we have $\theta^1 = \pi_f^*(ds)$, and $ds$ is called the line element. Any †local cross section $R$: $\pi_f \circ R = 1$ of the bundle $O_f(M)$ is called a moving frame. Putting

$$R^*(\theta^1) = ds, \qquad R^*(\Theta^{ij}) = \rho_{ij}\,ds,$$

we see that the following equation holds over $M$:

$$de_i = \sum_j \rho_{ij}\,ds\,e_j. \tag{2}$$

For two immersions $(M, f_1)$ and $(M, f_2)$, we have $f_1 = \alpha \circ f_2$ ($\alpha$ is a Euclidean motion) if and only if they have the same $ds$ and $\rho_{ij}$ for some moving frames.

D. Frenet's Formulas

In order to study local properties of curves it is sufficient to consider them on †Jordan arcs of class $C^r$. With respect to orthogonal coordinates $(x^1, \ldots, x^n)$ in $E^n$, such a curve is represented parametrically by $x^i = f^i(t)$ ($t \in [a, b]$, $\sum (dx^i/dt)^2 > 0$) or by a vector representation $x = x(t)$. If $\varphi$ is a diffeomorphism of a closed interval $[a', b']$ onto $[a, b]$, then $f \circ \varphi$ and $f$ are representations of the same arc in $E^n$, and $\varphi$ is called a transformation of the parameter. Any curve of class $C^1$ is †rectifiable, and its arc length is given by $s = \int_a^b \left(\sum_{i=1}^n (dx^i/dt)^2\right)^{1/2} dt$. We may choose the arc length $s$ measured from a point on the arc as a parameter, called the canonical parameter of the arc. Consider an arc $C$ of class $C^n$ given by the vector representation $x = x(s)$, $s \in [a, b]$. We assume that its †Wronskian $|x'(s), \ldots, x^{(n)}(s)|$ is not identically zero (we denote by $'$ the derivative with respect to the canonical parameter), which means that the arc $C$ is not contained in a hyperplane in $E^n$. A point at which the Wronskian vanishes is called a stationary point, and we assume that there exists no stationary point on $C$. By the †Gram-Schmidt orthonormalizing process we obtain an orthonormal basis $e_1, \ldots, e_n$ ($|e_1, \ldots, e_n| > 0$) from $n$ vectors $x'(s), \ldots, x^{(n)}(s)$ at each point of $C$. We call the frame thus determined the Frenet frame. With respect to the Frenet frame, (2) is rewritten as

$$e_i'(s) = -\kappa_{i-1}(s)\,e_{i-1}(s) + \kappa_i(s)\,e_{i+1}(s), \qquad i = 1, \ldots, n;$$
$$\kappa_0(s) = \kappa_n(s) = 0; \tag{3}$$
$$\kappa_j(s) > 0, \qquad j = 1, \ldots, n-2.$$

These are called Frenet's formulas (or the Frenet-Serret formulas). We call $\kappa_1, \kappa_2, \ldots, \kappa_{n-2}$ the first, second, ..., $(n-2)$nd curvature, respectively, while we call $\kappa_{n-1}$ the torsion for $n \geq 3$. For a curve in a lower-dimensional subspace $E^m \subset E^n$, we set $\kappa_i = 0$ ($i \geq m$). The curvatures and the torsion of a straight line are zero. To get Frenet's formulas in these special cases, we fix $e_i$ ($i > m$) in the subspace orthogonally complementary to $E^m$ in $E^n$ and proceed as in the general case. Suppose that $C_1$, $C_2$ are arcs such that both of their Frenet frames are of class $C^1$. If there exists a diffeomorphism of $C_1$ to $C_2$ that preserves arc length and the $\kappa_i$ ($i = 1, \ldots, n-1$) are equal at corresponding points, $C_1$ and $C_2$ are mapped onto each other by a motion of $E^n$. This is the fundamental theorem of the theory of curves. Given $n - 1$ functions of class $C^1$, $\kappa_1(s) \geq 0, \ldots, \kappa_{n-2}(s) \geq 0$ (we assume that the equality signs occur at most at a finite number of points) and $\kappa_{n-1}(s)$ for $0 \leq s \leq L$, there exists an arc that has $\kappa_1, \ldots, \kappa_{n-2}, \kappa_{n-1}$ as its first, ..., $(n-2)$nd curvatures and its torsion, respectively. The equations $\kappa_i = \kappa_i(s)$ are called the natural equations of the curve.

E. Plane Curves

Let $x = x(s)$ be a curve of class $C^2$ in $E^2$, and $(x(s), e_1, e_2)$ its Frenet frame. The tangent and

the normal of this curve at $x(s)$ have parametric representations $x(s) + te_1$, $x(s) + te_2$, respectively (with parameter $t$). Frenet's formulas are written as $x' = e_1$, $x'' = e_1' = \kappa e_2$, and $\kappa$ is called the curvature of the curve $C$. The natural equation is given by $\kappa = \kappa(s)$. If $\kappa(s) = \text{constant} \neq 0$ along $C$, $C$ must be a portion of a circle. Another way of defining the curvature is as follows: We take a fixed direction (for example, the positive direction of the $x$-axis on $E^2$) and denote by $\theta(s)$ the angle made by the tangent $T_s$ of the curve $C$ at $x(s)$ with the direction. Then we have $\kappa(s) = d\theta/ds$. If $n = 2$, the curvature can take both positive and negative values. Figs. 1 and 2 suggest a geometric meaning of $\kappa > 0$ and $\kappa < 0$, respectively. The circle with center $x = x(s) + (1/\kappa)e_2$ and radius $1/\kappa$ has a contact of higher order than any other circle in $E^2$. We call this circle the osculating circle (or circle of curvature) at the point $x = x(s)$, its center the center of curvature, and $1/\kappa$ the radius of curvature. The locus $C'$ of the center of curvature of a curve $C$ is called an evolute of $C$. Conversely, $C$ is called an involute of $C'$, the †envelope of the family of normal lines of $C$. When a curve is given in terms of its canonical parameter $s$, the curvature is given by $|x'(s), x''(s)|$; when the curve is given by another parameter $t$ as $x = x(t)$, the curvature is given by $\kappa(t) = |x'(t), x''(t)|/|x'(t)|^3$, where $'$ means $d/dt$.

Fig. 1. $\kappa > 0$.
Fig. 2. $\kappa < 0$.

The facts we have just stated concern local properties of plane curves. We shall now discuss the global theory of curves, which deals with properties of each curve as a whole. Let $C$: $x = x(s)$, $a \leq s \leq b$, be a closed curve. Let $\theta(s)$, $0 \leq \theta(s) < 2\pi$, be the angle that $e_1(s)$ makes with the $x$-axis. Put $\Theta = \int_a^b \theta'(s)\,ds$. Intuitively, $\Theta$ measures the total rotation of $e_1(s)$ as we run along the curve $C$ from $a$ to $b$. Since $C$ is closed, $\Theta$ is an integer multiple $I$ of $2\pi$. The integer $I$ is called the rotation number of $C$, and is equal to $(1/2\pi)\int_a^b \kappa(s)\,ds$. Let $D$ be a closed domain consisting of points in the interior and on the boundary of a simple closed curve $C$. $C$ is called a closed convex curve or an oval if $D$ is †convex in $E^2$. Among all ovals of given length, the circle has the maximum area. Various generalizations of this theorem have been obtained, and the collection of problems of this kind is called the isoperimetric problem. This problem has intimate connections with fields such as integral geometry.

The oval has a convenient parameter other than the arc length parameter $s$. Given a number $t$, $0 \leq t < 2\pi$, there exists a unique point $x(t)$ in the oval such that $e_2 = (\cos t, \sin t)$ at $x(t)$. When we describe the oval in terms of the parameter $t$, the tangent vector at $x(t)$ is parallel to that at $x(t + \pi)$, and we can define the width $W(t)$ at $x(t)$. $W(t)$ is called the width of the oval. A curve is called a curve of constant width if the curve is an oval whose width $W(t)$ does not depend on $t$. The circle is a typical example of a curve of constant width. Reuleaux's triangle is another well-known example of a curve of constant width (Fig. 3). For a curve of constant width of width $W$ and length $L$, we have $L = \pi W$.

Fig. 3

There are also some results concerning the relations between local properties (for example, curvature) and properties of the whole figure. An example is given by the four-vertex theorem. A vertex on a curve $C$ is by definition a point where $d\kappa/ds = 0$. Then there are at least four vertices on an oval of class $C^3$. A simple closed curve with $\kappa > 0$ ($< 0$) must be convex (→ 89 Convex Sets).

F. Space Curves

Let $x = x(s)$ ($s \in [a, b]$) be a curve $C$ of class $C^3$ in $E^3$ defined in terms of the canonical parameter $s$. Let $(x(s), e_1, e_2, e_3)$ be Frenet frames along $C$. Then we have the Frenet formulas

$$e_1' = \kappa_1 e_2, \qquad e_2' = -\kappa_1 e_1 + \kappa_2 e_3, \qquad e_3' = -\kappa_2 e_2.$$

We call $1/\kappa_1$, $1/\kappa_2$ the radius of curvature and the radius of torsion, respectively. The line $x = x(s_0) + te_1$ is the tangent of $C$ at $x(s_0)$. The two straight lines through the point $x(s_0)$ defined by $x = x(s_0) + te_2$ and $x = x(s_0) + te_3$ are called the principal normal and the binormal of $C$ at $x(s_0)$, respectively. The three planes through $x(s_0)$ defined by $x = x(s_0) + te_2 + \bar{t}e_3$, $x = x(s_0) + te_1 + \bar{t}e_3$, and $x = x(s_0) + te_1 + \bar{t}e_2$ are called the normal plane, the rectifying plane, and the osculating plane, respectively.
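As a concrete instance of the Frenet frame and the two invariants in the Frenet formulas for a space curve, the following minimal sketch evaluates them for a circular helix. It is not part of the original text. It assumes the standard formulas for an arbitrary regular parameter, curvature = |x' × x''| / |x'|^3 and torsion = det(x', x'', x''') / |x' × x''|^2, which the text does not state; the helix x(t) = (a cos t, a sin t, b t) and all function names are illustrative choices.

```python
import math

def cross(u, v):
    """Vector product in E^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar product in E^3."""
    return sum(p * q for p, q in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def helix_derivs(a, b, t):
    """First three t-derivatives of the helix x(t) = (a cos t, a sin t, b t)."""
    d1 = (-a * math.sin(t), a * math.cos(t), b)
    d2 = (-a * math.cos(t), -a * math.sin(t), 0.0)
    d3 = (a * math.sin(t), -a * math.cos(t), 0.0)
    return d1, d2, d3

def curvature_torsion(d1, d2, d3):
    """Curvature kappa_1 and torsion kappa_2 from derivatives with respect to
    any regular parameter (requires x' x x'' != 0)."""
    c = cross(d1, d2)
    kappa1 = norm(c) / norm(d1) ** 3
    kappa2 = dot(c, d3) / dot(c, c)   # dot(c, d3) = det(x', x'', x''')
    return kappa1, kappa2

def frenet_frame(d1, d2):
    """Frenet frame (e1, e2, e3): tangent, principal normal, binormal."""
    n1 = norm(d1)
    e1 = tuple(v / n1 for v in d1)
    c = cross(d1, d2)
    nc = norm(c)
    e3 = tuple(v / nc for v in c)
    e2 = cross(e3, e1)                # makes (e1, e2, e3) right-handed
    return e1, e2, e3

if __name__ == "__main__":
    d1, d2, d3 = helix_derivs(3.0, 4.0, 0.9)
    print("curvature, torsion:", curvature_torsion(d1, d2, d3))
    print("Frenet frame:", frenet_frame(d1, d2))
```

For this helix both invariants are constant, a / (a^2 + b^2) and b / (a^2 + b^2), and the principal normal e2 points from the curve toward the axis of the cylinder.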

At a point $x(s_0)$ of a curve $x = x(s)$ of class $C^\omega$, we take $e_1(s_0)$, $e_2(s_0)$, and $e_3(s_0)$ as unit vectors of the coordinate axes. Substituting the Frenet formulas into the Taylor expansion of $x(s)$, we see that the new coordinates $x_1(s)$, $x_2(s)$, $x_3(s)$ of $C$ are given by

$$x_1 = (s - s_0) - (\kappa_1(s_0)^2/6)(s - s_0)^3 + \dots,$$
$$x_2 = (\kappa_1(s_0)/2)(s - s_0)^2 + (\kappa_1'(s_0)/6)(s - s_0)^3 + \dots,$$
$$x_3 = (\kappa_1(s_0)\kappa_2(s_0)/6)(s - s_0)^3 + \dots.$$

These are called Bouquet's formulas. Utilizing these formulas we can see the nature of the curve with given $\kappa_1$ and $\kappa_2$. A curve and its osculating plane at a point on it have contact of order higher than any other plane through that point. The family of osculating planes of $C$ †envelops a †developable surface $S$, which coincides with the locus of tangent lines to $C$. We call $S$ the tangent surface of $C$, and $C$ the line of regression of $S$. The family of rectifying planes of $C$ also envelops a developable surface called the rectifying surface, and $C$ is a †geodesic on this surface. The family of normal planes of $C$ envelops either a cone or a tangent surface of another curve $\bar{C}$. When the natural equation of a space curve has a special form, the shape of the curve is simple. For example, $\kappa_1(s) = \text{constant}$, $\kappa_2(s) = \text{constant}$ represent a curve, called an ordinary helix, on a cylinder which cuts all the generators of the cylinder at a constant angle. More generally, it is known that if $\kappa_1/\kappa_2 = \text{constant}$, the tangent at each point of the curve makes a constant angle with a fixed direction. Such a curve is called a generalized helix or a curve of constant inclination. Each curve satisfying $a\kappa_1 + b\kappa_2 = c$ ($ab \neq 0$) is called a Bertrand curve. For a Bertrand curve there exists another curve $\bar{C}$ and a correspondence of $C$ onto $\bar{C}$ such that they have a common principal normal at corresponding points. Conversely, this property is also a sufficient condition for $C$ to be a Bertrand curve. A Mannheim curve is defined analogously as a curve having a correspondence with another curve $\bar{C}$ such that the principal normal of $C$ and the binormal of $\bar{C}$ coincide at corresponding points. When a correspondence of $C$ and $\bar{C}$ has the property that tangents at corresponding points are parallel, then the correspondence is called a correspondence of Combescure.

We have stated mainly local properties of space curves. There are also several results about global properties of curves in $E^3$ analogous to the case of plane curves. For a simple closed curve $C$ of length $L$, we call $K = \int_0^L \kappa_1(s)\,ds$ the total curvature of $C$. Generally we have $K \geq 2\pi$, while $K = 2\pi$ if and only if $C$ is a closed convex curve lying in a plane (W. Fenchel) [5, 6]. The total curvature is deeply related to the properties of †knots. If a simple closed curve in $E^3$ is knotted, then the total curvature is at least $4\pi$ [7, 8]. We fix an origin $O$ in $E^3$ and draw a unit tangent vector with initial point $O$ parallel to the unit tangent vector at each point of a space curve $C$; then the endpoint of this vector traces a curve $\bar{C}$ on the unit sphere with the center $O$. We call $\bar{C}$ the spherical indicatrix of $C$ and the correspondence of $C$ to $\bar{C}$ a spherical representation. The total curvature $K$ of a curve $C$ is equal to the length of $\bar{C}$. Consequently, we have $K = \oint_C d\theta$, where $\theta$ is the angular deflection of the tangent line along the closed curve $C$.

G. Theory of Hypersurfaces

Let $(M, f)$ be an immersion of an $(n-1)$-dimensional differentiable manifold $M$ of class $C^r$ into $E^n$. Then we can define on the hypersurface $M$ a positive definite differential form $g$ of degree 2 induced from the inner product of $E^n$: $g_x(X, X) = (df_x(X), df_x(X))$, $X \in M_x$. Then $M$ becomes a †Riemannian manifold with †Riemannian metric $g$. We call $g$ the first fundamental form of $(M, f)$. The Riemannian geometry on a surface with its first fundamental form as Riemannian metric is called geometry on a surface (→ 364 Riemannian Manifolds).

By $O_f(M)$ we mean the set of all the ordered sets $(x, e_1, \dots, e_n) \in O(n)$, where $x \in M$ and $\{e_i\}$ ($i = 1, 2, \dots, n$) is an orthonormal system of $E^n$ such that $e_i \in df_x(M_x)$ ($i = 1, \dots, n-1$). Then $O_f(M)$ with natural projection $\pi_f$ and natural differentiable structure is a †principal fiber bundle over $M$ and has a natural immersion $\tilde{f}$: $\tilde{f}(x, e_1, \dots, e_n) = (f(x), e_1, \dots, e_n)$ in the principal fiber bundle $O(n)$. We can †pull back the forms on $O(n)$ to $O_f(M)$ by $\tilde{f}^*$ and put $\theta = \tilde{f}^*(\omega)$, $\Theta = \tilde{f}^*(\Omega)$; then the structural equations of $E^n$ are transformed to $d\theta = \Theta \wedge \theta$, $d\Theta = (-1/2)\Theta \wedge \Theta$. Furthermore, if we put $\theta_i = \tilde{f}^*(\omega_i)$, $\Theta_{ij} = \tilde{f}^*(\Omega_{ij})$, then $\theta_n = 0$ and $\theta_i$, $\Theta_{ij}$ ($i, j < n$) depend only on the first fundamental form of $(M, f)$. Let $f_1$ and $f_2$ be two immersions of $M$ into $E^n$. Then in order that there exist a Euclidean motion $\alpha$ of $E^n$ such that $f_2 = \alpha \circ f_1$, it is necessary and sufficient that there exist a diffeomorphism $\varphi$ of $O_{f_1}(M)$ onto $O_{f_2}(M)$ such that $\theta_1 = \varphi^*(\theta_2)$ and $\Theta_1 = \varphi^*(\Theta_2)$. Suppose that $M$ is †orientable and oriented. Then the unit vector field normal to $df_x(M_x)$ at every point $x \in M$ in $E^n$ defines a mapping of $M$ into the unit sphere in $E^n$ called the spherical representation of $M$ or the Gauss mapping (Gauss map). Regarding the unit normal vector field $\xi$ of $M$ as a vector-valued function over $M$, we can define a symmetric product of $df$ and $d\xi$ by $-(df\,d\xi)(X, Y) =$
$(1/2)[(df(X), d\xi(Y)) + (df(Y), d\xi(X))]$, called the second fundamental form of $(M, f)$. Two immersions $f_1$ and $f_2$ of $M$ that induce the same first and second fundamental forms have a Euclidean motion $\alpha$ such that $f_2 = \alpha \circ f_1$; and the converse is also true. This fact is called the fundamental theorem of the theory of surfaces.

H. Theory of Surfaces in $E^3$ (→ 365 Riemannian Submanifolds; Appendix A, Table 4.1)

A surface in $E^n$ is locally expressed by parametric equations $x_i = x_i(u_\alpha)$ ($i = 1, \dots, n$; $\alpha = 1, \dots, m$) or by a single vector equation $x = x(u_\alpha)$. We are mainly concerned with the case $n = 3$, $m = 2$, and we express the surface by a vector representation $x = x(u, v)$. The first and second fundamental forms are written as

$$E\,du^2 + 2F\,du\,dv + G\,dv^2,$$
$$L\,du^2 + 2M\,du\,dv + N\,dv^2.$$

If we use the usual notation of †tensor analysis, then the first and second fundamental forms are also denoted by $g_{\alpha\beta}\,du^\alpha du^\beta$ and $H_{\alpha\beta}\,du^\alpha du^\beta$, respectively, where $u^\alpha$ ($\alpha = 1, 2$) are parameters (with $\Sigma$ omitted by †Einstein's convention). We call $\{g_{\alpha\beta}\}$, $\{H_{\alpha\beta}\}$ the first and second fundamental quantities, respectively (→ 365 Riemannian Submanifolds). At the point $p_0 = x(u_0, v_0)$ on a surface $S$ that corresponds to parameter values $(u_0, v_0)$, the curves expressed by $v = v_0$ and $u = u_0$ are called a $u$-curve and a $v$-curve through $p_0$, respectively. Let $x_u$, $x_v$ denote the tangent vectors $\partial x/\partial u$, $\partial x/\partial v$ at $p_0$ to the $u$-curve and $v$-curve, respectively, through the given point $p_0$ and $\xi$ denote the unit vector orthogonal to $x_u$ and $x_v$. Then $\xi$ is called the normal vector of $S$ at $p_0$ and $(x_u, x_v, \xi)$ the Gaussian frame of $S$ at $p_0$. Although a Gaussian frame is not in general an orthogonal frame, it is intimately related to local parameters. The plane that passes through the point $p_0$ and is spanned by $x_u$, $x_v$ is called the tangent plane to $S$ at $p_0$. The coefficients of the second fundamental form $L(u, v)$, $M(u, v)$, $N(u, v)$ are expressed by the inner products $L = (-x_u, \xi_u)$, $M = (-x_u, \xi_v)$, $N = (-x_v, \xi_v)$ ($\xi_u = \partial\xi/\partial u$, $\xi_v = \partial\xi/\partial v$).

Let $(X, Y)$ be the coordinates of a point on the tangent space at $p_0$ with respect to the Gaussian frame. We call the curve of the second order defined by $LX^2 + 2MXY + NY^2 = \varepsilon$ ($\varepsilon$ is a suitable constant) the Dupin indicatrix. The point $p_0$ is called an elliptic point or a hyperbolic point on $S$ according as the Dupin indicatrix at the point is an ellipse or a hyperbola. If $p_0$ is an elliptic point, then points near $p_0$ on the surface lie on one side of the tangent plane at $p_0$, whereas if $p_0$ is a hyperbolic point, points near $p_0$ on the surface lie on both sides of the tangent plane at $p_0$ (Figs. 4, 5). A hyperbolic point is also called a saddle point, since in a neighborhood of the point the surface looks like a saddle. A point that is neither elliptic nor hyperbolic is called a parabolic point; at a parabolic point we have $LN - M^2 = 0$. If at least one of $L$, $M$, $N$ does not vanish at $p_0$, then there is a neighborhood of $p_0$ of the surface that lies on one side of the tangent plane at $p_0$ (Fig. 6). If a vector $(X, Y)$ on the tangent plane of the surface at $p_0$ satisfies the equation $LX^2 + 2MXY + NY^2 = 0$, then the direction of the vector is called an asymptotic direction. If the point $p_0$ is elliptic, such a direction does not exist; if $p_0$ is hyperbolic, the direction is an †asymptotic direction of the Dupin indicatrix on the tangent plane at $p_0$. A curve $C$ on a surface such that the tangent line at each point of the curve coincides with an asymptotic direction of the surface at the point is called an asymptotic curve.

Fig. 4. Elliptic point.
Fig. 5. Hyperbolic point.
Fig. 6. Parabolic point.

Let $C$: $x = x(u(t), v(t))$ be a curve through $p_0$ on the surface $x = x(u, v)$. Then the curvature $\kappa$ of $C$ as a space curve is given by

$$\kappa\cos\theta = \frac{L\,du^2 + 2M\,du\,dv + N\,dv^2}{E\,du^2 + 2F\,du\,dv + G\,dv^2},$$
where $du : dv$ is the direction of $C$ on the surface at $p_0$ and $\theta$ is the angle between the normal of the surface at $p_0$ and the principal normal of $C$ at $p_0$. The center of curvature at a point $p_0$ of a curve $C$ of class $C^2$ on a surface of class $C^2$ is the projection on its osculating plane of the center of curvature of the section $C^*$ (of the surface) cut by the plane determined by the tangent to the curve at $p_0$ and the normal of the surface at the point (Meusnier's theorem). The curvature of the curve $C^*$ at $p_0$ is called the normal curvature of the surface at the point for the tangent direction. Since the normal curvature for a direction at a point is a continuous function of this direction that can be represented as a point on a unit circle, there exist two directions that realize the maximum and minimum of the normal curvature. These directions are given by the equation

$$\begin{vmatrix} E\,du + F\,dv & F\,du + G\,dv \\ L\,du + M\,dv & M\,du + N\,dv \end{vmatrix} = 0.$$

When this quadratic equation in $du/dv$ has nonzero discriminant, it determines two directions defined by its two roots. These directions are called principal directions at the point. A curve $C$ on a surface such that the tangent line at each point of the curve coincides with a principal direction at the point is called a line of curvature. When all lines of curvature of a surface are circles, the surface is called a cyclide of Dupin. The two normal curvatures corresponding to two principal directions are given by $1/R$, satisfying the following second-order equation:

$$\left(\frac{1}{R}\right)^2 - \frac{EN + GL - 2FM}{EG - F^2}\,\frac{1}{R} + \frac{LN - M^2}{EG - F^2} = 0.$$

They are called principal curvatures, and each of their inverses is called a radius of principal curvature. The mean value $H = (\kappa_1 + \kappa_2)/2$ of two principal curvatures $\kappa_i = 1/R_i$ ($i = 1, 2$) is called the mean curvature (or Germain's curvature), and the product $K = \kappa_1\kappa_2$ is called the total curvature (or Gaussian curvature). These are given by

$$H = \frac{1}{2}\,\frac{EN + GL - 2FM}{EG - F^2}, \qquad K = \frac{LN - M^2}{EG - F^2}.$$

A point on a surface is elliptic, hyperbolic, or parabolic according as $K > 0$, $K < 0$, or $K = 0$ at the point. A point where the second fundamental form is proportional to the first fundamental form is called an umbilical point, and a point where the second fundamental form vanishes is called a flat point or a geodesic point. If a surface consists of umbilical points only, the ratio $(L\,du^2 + 2M\,du\,dv + N\,dv^2)/(E\,du^2 + 2F\,du\,dv + G\,dv^2)$ is a constant, and the surface is either a sphere or a portion of it. If a surface consists of flat points only, the surface must be either a plane or a portion of it. The mean curvature and the Gaussian curvature of a sphere are constant, and those of a plane are both equal to zero. If we use the spherical representation of a surface stated in Section G, we can give to the Gaussian curvature the following geometric meaning: Let $A$ be the area of the domain enclosed by a closed curve $C$ around a point $p_0$ on a surface, and let $A^*$ be the area of the domain on the unit sphere enclosed by the curve that is the image of $C$ under the spherical representation of the surface. Then the limit of $A^*/A$ as the closed curve $C$ tends to the point $p_0$ is equal to $K$ at $p_0$.

Let us denote by $(g^{\alpha\beta})$ the inverse matrix of the matrix $(g_{\alpha\beta})$ whose elements are coefficients of the first fundamental form $g_{\alpha\beta}\,du^\alpha du^\beta$. We easily see that $g^{11} = G/(EG - F^2)$, $g^{12} = g^{21} = -F/(EG - F^2)$, $g^{22} = E/(EG - F^2)$. We introduce the symbols

$$[\alpha\beta, \gamma] = \frac{1}{2}\left(\frac{\partial g_{\alpha\gamma}}{\partial u^\beta} + \frac{\partial g_{\beta\gamma}}{\partial u^\alpha} - \frac{\partial g_{\alpha\beta}}{\partial u^\gamma}\right), \qquad \left\{{\gamma \atop \alpha\beta}\right\} = g^{\gamma\delta}[\alpha\beta, \delta],$$

which are called the Christoffel symbols of the first and second kinds, respectively. Suppose that a surface is given by the vector representation $x = x(u^1, u^2)$, and put $x_\alpha = \partial x/\partial u^\alpha$, $x_{\alpha\beta} = \partial x_\alpha/\partial u^\beta$. Then for the derivatives of the Gaussian frame, we obtain

$$x_{\alpha\beta} = \left\{{\gamma \atop \alpha\beta}\right\}x_\gamma + H_{\alpha\beta}\,\xi, \qquad \xi_\alpha = -g^{\gamma\beta}H_{\beta\alpha}\,x_\gamma.$$

We call the former Gauss's formula and the latter Weingarten's formula. The integrability conditions of these partial differential equations are

$$R_{\alpha\beta\gamma\delta} = H_{\alpha\gamma}H_{\beta\delta} - H_{\alpha\delta}H_{\beta\gamma}, \qquad H_{\alpha\beta;\gamma} - H_{\alpha\gamma;\beta} = 0,$$

where $H_{\alpha\beta;\gamma}$ is the covariant derivative of $H_{\alpha\beta}$ and

$$R^{\alpha}{}_{\beta\gamma\delta} = \frac{\partial}{\partial u^\gamma}\left\{{\alpha \atop \beta\delta}\right\} - \frac{\partial}{\partial u^\delta}\left\{{\alpha \atop \beta\gamma}\right\} + \left\{{\alpha \atop \varepsilon\gamma}\right\}\left\{{\varepsilon \atop \beta\delta}\right\} - \left\{{\alpha \atop \varepsilon\delta}\right\}\left\{{\varepsilon \atop \beta\gamma}\right\}$$

or $R_{\alpha\beta\gamma\delta} = g_{\alpha\varepsilon}R^{\varepsilon}{}_{\beta\gamma\delta}$ are components of the curvature tensor. The former are called the Gauss equations, and the latter the Codazzi-Mainardi equations. In connection with these equations, Bonnet's fundamental theorem states the following: Suppose that a positive definite symmetric matrix $(g_{\alpha\beta})$ and a symmetric matrix $(H_{\alpha\beta})$ are given that are functions of class $C^2$ and $C^1$, respectively, defined over a †simply connected domain $D$ in $R^2$. If they satisfy the Gauss
equations and the Codazzi-Mainardi equations, then there exists a surface $x = x(u^1, u^2)$ with the given $(g_{\alpha\beta})$ and $(H_{\alpha\beta})$ as coefficients of its first and second fundamental forms, respectively. Such a surface is determined uniquely if, for an arbitrary fixed point $(u_0^1, u_0^2)$ of $D$, we assign an arbitrary point $p_0$ and a frame $(x_1^0, x_2^0, \xi^0)$ at $p_0$ so that $x_1^0$, $x_2^0$ are orthogonal to the unit vector $\xi^0$ and $(x_\alpha^0, x_\beta^0) = g_{\alpha\beta}(u_0^1, u_0^2)$ as the Gaussian frame at $p_0$. On the other hand, we can investigate surfaces as Riemannian manifolds defined by the first fundamental form.

A †diffeomorphism between two surfaces preserving arc length is called an isometric mapping. The condition of preserving arc length is equivalent to the condition that the first fundamental quantities of the surfaces coincide at each pair of corresponding points, provided that we have introduced parameters on the two surfaces so that corresponding points have the same parameter values. In such a case two surfaces are said to be isometric. From the Gauss equation we can see that the total curvature depends only on the first fundamental quantities. So $K$ is a quantity that is preserved under isometric mappings (Gauss's theorema egregium).

A vector field $\lambda^\alpha(t)x_\alpha$ defined along a curve $u^\alpha = u^\alpha(t)$ on a surface is said to be parallel in the sense of Levi-Civita along the curve if its †covariant derivative along the curve vanishes, i.e., if

$$\frac{\delta\lambda^\alpha}{dt} = \frac{d\lambda^\alpha}{dt} + \left\{{\alpha \atop \beta\gamma}\right\}\lambda^\beta\frac{du^\gamma}{dt} = 0.$$

The length of a vector belonging to a vector field that is parallel along a curve $C$ is constant along $C$. The angle of two vectors both belonging to vector fields that are parallel along $C$ is also constant along $C$. Choose two vector fields $\eta_{(a)}$ ($a = 1, 2$) parallel along a curve $C$ on a surface that satisfy $g_{\alpha\beta}\,\eta^\alpha_{(a)}\eta^\beta_{(b)} = \delta_{ab}$. Then the tangent vector to $C$ is expressed by $du^\alpha/dt = \eta^\alpha_{(a)}v^a(t)$. Take a 2-plane and fix an orthogonal coordinate system on it; then the integral curve $\bar{C}$ of a set of ordinary differential equations $dx^a/dt = C^a_b\,v^b(t)$, where the constants $C^a_b$ are fixed by the frame $\eta_{(b)}$ at the initial point $p_0$, is called the development of $C$ (→ 80 Connections). We denote by $\kappa$ the curvature of a curve $C$ of class $C^2$ on a surface $S$ of class $C^2$ and by $\omega$ the angle between the binormal of $C$ and the normal of $S$ at the same point. Then the quantity $\kappa_g = \kappa\cos\omega$, belonging to the geometry on the surface, is called the geodesic curvature of the curve at the point. A curve with vanishing geodesic curvature is called a geodesic. It satisfies the differential equations

$$\frac{d^2u^\alpha}{ds^2} + \left\{{\alpha \atop \beta\gamma}\right\}\frac{du^\beta}{ds}\frac{du^\gamma}{ds} = 0.$$

The development of a geodesic is a straight line (→ 364 Riemannian Manifolds).

Let us consider a simply connected, orientable bounded domain $D$ on a surface such that the boundary of $D$ is a simple closed curve $C$ that consists of a finite number of arcs of class $C^2$. If we denote by $\alpha_i$ ($i = 1, 2, \dots, m$) the external angles at vertices of the curvilinear polygon $C$ (Fig. 7), we have

$$\int_C \kappa_g\,ds + \sum_{i=1}^{m}\alpha_i + \iint_D K\,d\sigma = 2\pi.$$

This is called the Gauss-Bonnet formula. In particular, if all the arcs of $C$ are geodesics, we have

$$\sum_{i=1}^{m}\alpha_i + \iint_D K\,d\sigma = 2\pi.$$

This formula implies as special cases the following well-known theorems in Euclidean geometry and spherical trigonometry: (i) The sum of interior angles of a triangle is equal to $\pi$. (ii) The area of a spherical triangle is proportional to its spherical excess. The formula also implies the following theorem: On any closed orientable surface we have $\iint K\,d\sigma = 2\pi\chi$, where $\chi$ is the †Euler characteristic of the surface. We call $\iint K\,d\sigma$ the integral curvature (or total Gaussian curvature).

Fig. 7

I. Special Surfaces in $E^3$

A surface is called a surface of revolution if it is generated by a curve $C$ on a plane $\pi$ when $\pi$ is rotated around a straight line $l$ in $\pi$. Then $l$ and $C$ are called an axis of rotation and a generating curve, respectively. A surface of revolution having the $x_3$-axis as the axis of rotation is given by the equations $x_1 = r\cos\theta$, $x_2 = r\sin\theta$, $x_3 = \varphi(r)$; its first fundamental form is $(1 + \varphi'^2)\,dr^2 + r^2\,d\theta^2$. The section of a surface of revolution by a half-plane through its axis of rotation is called a meridian. According as the meridian is a straight line parallel to the axis of rotation or a straight line intersecting the axis nonorthogonally, the surface of revolution is called a circular cylinder or a circular cone, respectively. If the
meridian is a circle that does not intersect the axis of rotation, it is called a torus.

A surface of class $C^2$ whose mean curvature $H$ vanishes everywhere is called a minimal surface (→ 275 Minimal Submanifolds). A surface of class $C^1$ realizing a relative minimum of areas among all surfaces of class $C^1$ with a given closed curve as their boundaries is an †analytic surface such that $H = 0$. Conversely, a surface of class $C^2$ with vanishing mean curvature is an analytic surface. The equation of a surface of revolution with a catenary as its generating line is given by $\sqrt{x_1^2 + x_2^2} = a(e^{x_3/a} + e^{-x_3/a})/2$. This surface is called a catenoid and is a minimal surface. Conversely, a minimal surface of revolution is necessarily a catenoid. For a surface obtained by rotating a †Delaunay curve around its base line, the mean curvature $H$ is equal to a constant ($\neq 0$). Conversely, a surface of revolution with nonzero constant mean curvature must be such a surface. A surface with constant Gaussian curvature is called a surface of constant curvature, and is a 2-dimensional Riemannian space of constant curvature (→ 364 Riemannian Manifolds). A non-Euclidean plane can be represented locally as a surface of constant curvature (→ 285 Non-Euclidean Geometry). Two surfaces of the same constant curvature are locally isometric to each other.

Surfaces of revolution of constant curvature are classified. The simplest surface of constant negative curvature is a pseudosphere, which is a surface of revolution obtained by rotating a †tractrix $x_1 = a\cos\varphi$, $x_3 = a\log\tan((\varphi/2) + (\pi/4)) - a\sin\varphi$ ($-\pi/2 < \varphi < \pi/2$) around the $x_3$-axis. A surface generated by a 1-parameter family of straight lines is called a ruled surface; a hyperboloid of one sheet, a hyperbolic paraboloid, a circular cylinder, and a circular cone are examples. The first two can be regarded as ruled surfaces in two ways. Each of the straight lines that generate a ruled surface is called a generating line. A surface consisting of straight lines parallel to a fixed line and passing through each point of a space curve $C$ is called a cylindrical surface with the director curve $C$. A surface generated by a straight line that connects a certain point $O$ with each point of a curve $C$ is called a conical surface. Both a cylindrical surface and a conical surface are ruled surfaces such that $K = 0$ everywhere. For ruled surfaces we have $K \leq 0$. In particular, a surface such that $H \neq 0$ and $K = 0$ everywhere is called a developable surface. A developable surface must be either a cylindrical surface, a conical surface, or a tangent surface of a space curve. There exist ruled surfaces that are not developable, for example, hyperboloids of one sheet and hyperbolic paraboloids. A nondevelopable ruled surface is called a skew surface. A ruled surface generated by a straight line that moves under a certain rule intersecting a fixed straight line $l$ orthogonally is called a right conoid. If we take $l$ as the $x_3$-axis, the surface is given by the equations $x_1 = u\cos v$, $x_2 = u\sin v$, $x_3 = f(v)$. A surface generated by a curve $C$ ($C$ may be chosen as a plane curve) that moves in the direction of a fixed line $l$ with constant velocity and turns around $l$ with certain constant angular velocity is called a helicoidal surface. If we take $l$ as the $x_3$-axis, the surface is given by the equations $x_1 = u\cos v$, $x_2 = u\sin v$, $x_3 = f(u) + kv$, where $k$ is a constant and $x_3 = f(x_1)$ is the equation of $C$. In particular, if $C$ is a straight line that intersects $l$ orthogonally, then $f(u) = 0$, and the surface is called a right helicoid (or ordinary helicoid). A right helicoid is both a ruled surface and a minimal surface. Conversely, a ruled surface that is also a minimal surface is necessarily a right helicoid. A helicoidal surface with a tractrix as the curve $C$ is called a Dini surface and is a surface of constant negative curvature. On the normal of a surface $S$ two points $q_i$ ($i = 1, 2$) are centers of principal curvature at $p$. The locus of each of these points is a surface called a center surface of $S$. When $S$ is a sphere, two center surfaces degenerate to a point; if $S$ is a surface of revolution, one of the center surfaces degenerates to the axis of revolution and the other is a certain surface of revolution. If $S$ is general, each of the center surfaces is the locus of an edge of regression of the developable surface generated by normals of $S$ along a line of curvature.

When a 1-parameter family of surfaces $S_t$ is given by the equation $F(x_1, x_2, x_3, t) = 0$, a surface $E$ that does not belong to this family is called an enveloping surface of the family of surfaces $\{S_t\}$ if $E$ is tangent to some $S_t$ at each point of $E$, that is, if $E$ and $S_t$ have the same tangent plane. The equation of $E$ is obtained by eliminating $t$ from $F(x_1, x_2, x_3, t) = 0$ and $(\partial F/\partial t)(x_1, x_2, x_3, t) = 0$. In general, if we denote by $\varphi(x_1, x_2, x_3) = 0$ the equation obtained by eliminating $t$ from $F = 0$ and $\partial F/\partial t = 0$, then the surface defined by $\varphi = 0$ is either the enveloping surface of $\{S_t\}$ or the locus of singular points of $S_t$. The intersection $C_{t_0}$ of the enveloping surface $E$ of $\{S_t\}$ and $S_{t_0}$ is a curve defined by $F(x_1, x_2, x_3, t_0) = 0$, $(\partial F/\partial t)(x_1, x_2, x_3, t_0) = 0$. We call $C_{t_0}$ a characteristic curve of $\{S_t\}$. Since $\{C_t\}$ is a family of curves on the enveloping surface $E$, there may exist an envelope $\Gamma$ on $E$. In such a case, $\Gamma$ is called the line of regression of $\{S_t\}$. The equation of $\Gamma$ is obtained by eliminating $t$ from $F = 0$, $\partial F/\partial t = 0$, and $\partial^2 F/\partial t^2 = 0$. In particular, the enveloping surface of a family of planes is a developable surface, and their characteristic curves are straight lines. Moreover, the line of
regression coincides with the line of regression of the tangent surface.

If there exists a diffeomorphism between two surfaces such that first fundamental forms at each pair of corresponding points are proportional, then the surfaces are said to be in a conformal correspondence. In particular, when the proportionality factor is a constant, they are said to be in a similar (or homothetic) correspondence. There exists a local conformal correspondence between any analytic surface and a plane. Namely, if we choose suitable parameters, we can reduce the first fundamental form of any analytic surface to the form $\lambda(\xi, \eta)(d\xi^2 + d\eta^2)$. Such parameters are called isothermal parameters. From the existence of isothermal parameters we can see that there exists a local conformal correspondence between any two analytic surfaces. The assumption of analyticity in these theorems is not necessary [10]. If there exists a diffeomorphism between two surfaces under which geodesics are mapped to geodesics, then the surfaces are said to be in geodesic correspondence. A surface has a locally geodesic correspondence with a plane if and only if it is a surface of constant curvature. If two surfaces are in geodesic correspondence, then with respect to parameters with the same values at corresponding points, we have the relation

$$\overline{\left\{{\alpha \atop \beta\gamma}\right\}} = \left\{{\alpha \atop \beta\gamma}\right\} + \delta^\alpha_\beta\,\varphi_\gamma + \delta^\alpha_\gamma\,\varphi_\beta$$

with a suitable covariant vector $\varphi_\gamma$, for coefficients of connections of the two surfaces.

By the †Alexander-Pontryagin duality theorem, a submanifold $M$ in $E^3$ that is homeomorphic to $S^2$ divides $E^3$ into two domains, and two points belonging to different domains cannot be connected by a broken segment unless the segment meets the surface. Such a manifold $M$ is called a closed surface. One of the two domains consists of those points with bounded distance from a point belonging to the domain. Such a domain is called the interior of the closed surface $M$. If the set $M^*$ consisting of $M$ and its interior is convex in $E^3$, the surface $M$ is called a closed convex surface (or ovaloid).

The Gaussian curvature of an ovaloid cannot be negative at any point. A closed surface with $K > 0$ must be an ovaloid. Moreover, it is known that on any closed surface there exists at least one point where $K > 0$ (J. Hadamard). If there exists no umbilical point and $K$ is strictly positive in a domain on a surface, then the two principal curvatures regarded as continuous functions on the domain cannot take their local maximum and local minimum values at the same point (D. Hilbert). A compact, connected surface of class $C^4$ with constant Gaussian curvature is a sphere. A closed surface with $K > 0$ and $H = \text{constant}$ is a sphere (H. Liebmann).

A problem proposed by H. Hopf asks whether an orientable compact surface with constant mean curvature is a sphere. In connection with this problem, Hopf showed that a closed orientable surface of class $C^3$ of genus zero with constant mean curvature is a sphere. If there is a certain relation $W(k_1, k_2) = 0$ between the two principal curvatures $k_1$, $k_2$ ($k_1 \geq k_2$) of a surface, the surface is called a Weingarten surface (or W-surface). There are many interesting results for W-surfaces. As an extension of the convex surface, tight immersions have been studied (→ 365 Riemannian Submanifolds).

The Gaussian curvature $K$ is invariant under isometries. Hence a sphere is transformed to a sphere by each isometry. This fact is sometimes described as the rigidity of a sphere. More generally, if two ovaloids are isometric, then they are congruent (Cohn-Vossen's theorem). It is known that if we remove a small circular disk from a sphere, then the remaining portion of the sphere is isometrically deformable. On the existence of closed geodesics on ovaloids, G. D. Birkhoff proved the following theorem: There exist at least three closed geodesics on any ovaloid of class $C^3$. It is also known that there exist surfaces of revolution that are not spheres but whose geodesics are all closed (→ 178 Geodesics). On a hyperbolic non-Euclidean compact †space form of genus $p$ ($p \geq 2$) there exists a geodesic whose points are everywhere dense in it (E. Hopf) (for the ergodicity of flows along geodesics on this surface → 136 Ergodic Theory; also 126 Dynamical Systems).

J. Singular Points of a Surface

Suppose that a neighborhood of a point $p_0$ of a surface $S$ in $E^3$ is given by a certain vector-valued function $f$ of class $C^r$ as $\mathbf{r} = f(u, v)$. Then a point $p_0$ where two vectors $(\partial f/\partial u)_{p_0}$, $(\partial f/\partial v)_{p_0}$ are linearly independent is called a regular point. A point on $S$ that is not regular is called a singular point. If for suitable parameters we have $(\partial f/\partial u)_{p_0} = 0$ but $(\partial f/\partial v)_{p_0}$, $(\partial^2 f/\partial u^2)_{p_0}$, $(\partial^2 f/\partial u\,\partial v)_{p_0}$ are linearly independent, then such a singular point is called a semiregular point. In general, shapes of neighborhoods of singular points are extremely complicated. However, we note the following: (i) by a small deformation of the function $f$ (and its derivatives of orders at most $r$) we can reduce $p_0$ to a regular or semiregular point of the deformed surface; (ii) if $p_0$ is semiregular, we can choose
421 112 A
Differential Operators

suitable parameters and curvilinear coordinates of class $C^r$ in $E^3$ near $p_0$ so that the surface $S$ in the neighborhood of the origin $p_0$ is expressed by the equations $x_1 = u^2$, $x_2 = v$, $x_3 = uv$ (H. Whitney's theorem [15]). (The higher-dimensional case has also been considered (Whitney [16]).)

References

[1] W. Blaschke, Vorlesungen über Differentialgeometrie I, Springer, 1924 (Chelsea, 1967).
[2] A. Duschek and W. Mayer, Lehrbuch der Differentialgeometrie I, Teubner, 1930.
[3] L. P. Eisenhart, An introduction to differential geometry, Princeton Univ. Press, 1940; revised edition, 1947.
[4] W. Klingenberg, Eine Vorlesung über Differentialgeometrie, Springer, 1973.
[5] W. Fenchel, On the differential geometry of closed space curves, Bull. Amer. Math. Soc., 57 (1951), 44-54.
[6] S. S. Chern, Curves and surfaces in Euclidean space, Studies in global geometry and analysis, Studies in Math. 4, Math. Assoc. Amer.
[7] I. Fary, Sur la courbure totale d'une courbe gauche faisant un noeud, Bull. Soc. Math. France, 77 (1949), 128-138.
[8] J. W. Milnor, On the total curvature of knots, Ann. Math., (2) 52 (1950), 248-257.
[9] E. Cartan, La théorie des groupes finis et continus et la géométrie différentielle traitées par la méthode du repère mobile, Gauthier-Villars, 1937.
[10] L. Bers, Riemann surfaces, Courant Inst. Math. Sci., 1957-1958.
[11] S. Sternberg, Lectures on differential geometry, Prentice-Hall, 1964.
[12] A. D. Aleksandrov, Intrinsic geometry of convex surfaces (in Russian), Gostekhizdat, 1948; German translation, Die innere Geometrie der konvexen Flächen, Akademie-Verlag, 1955.
[13] S. S. Chern, Topics in differential geometry, Lecture notes, Institute for Advanced Study, Princeton, 1951.
[14] H. Hopf, Zur Differentialgeometrie geschlossener Flächen im Euklidischen Raum, Convegno Internazionale di Geometria Differenziale, Italy, 1953, 45-54 (Cremona, 1954).
[15] H. Whitney, The singularities of a smooth $n$-manifold in $(2n - 1)$-space, Ann. Math., (2) 45 (1944), 247-293.
[16] H. Whitney, Singularities of mappings of Euclidean spaces, Symposium Internacional de Topología Algebraica, Universidad Nacional Autónoma de México and UNESCO, 1958, 285-301.
[17] M. P. do Carmo, Differential geometry of curves and surfaces, Prentice-Hall, 1976.
[18] T. J. Willmore, An introduction to differential geometry, Clarendon Press, 1959.
[19] N. J. Hicks, Notes on differential geometry, Van Nostrand, 1964.
[20] D. Laugwitz, Differential and Riemannian geometry, Academic Press, 1965.
[21] B. O'Neill, Elementary differential geometry, Academic Press, 1966.
[22] J. J. Stoker, Differential geometry, Wiley, 1969.

112 (XII.15)
Differential Operators

A. Definition

A mapping (or an operator) $A$ of a function space $F_1$ to a function space $F_2$ is said to be a differential operator if the value $f(x)$ of the image $f = Au$ ($u \in F_1$, $f \in F_2$) at each point $x$ is determined by the values at $x$ of $u$ and a finite number of its derivatives. If $u$ and $f$ are †distributions, the definition applies with the derivative interpreted in the sense of distributions (→ 125 Distributions and Hyperfunctions). In this article we restrict ourselves to the case of linear differential operators and consider only those of the form

$$P(x, D) = \sum_{|\alpha| \leq m} a_\alpha(x) D^\alpha, \qquad (1)$$

where $\alpha$ denotes $n$-tuples $(\alpha_1, \alpha_2, \dots, \alpha_n)$ of nonnegative integers, called multi-indices; $|\alpha|$ the length of $\alpha$: $|\alpha| = \alpha_1 + \alpha_2 + \dots + \alpha_n$; and $D^\alpha$ the differential operator $D^\alpha = D_1^{\alpha_1} D_2^{\alpha_2} \cdots D_n^{\alpha_n}$, with $D_j = (-i)\,\partial/\partial x_j$. The coefficient $(-i)$ is sometimes omitted. The coefficients $a_\alpha(x)$ are functions defined on an open set $\Omega$ in $n$-dimensional space. We call $P(x, D)$ an ordinary differential operator if the dimension $n$ of $\Omega$ is 1 and a partial differential operator if $n \geq 2$. Ordinary differential operators and partial differential operators behave quite differently in many respects.

We set

$$P(x, \xi) = \sum_{|\alpha| \leq m} a_\alpha(x)\,\xi^\alpha,$$

where $\xi = (\xi_1, \xi_2, \dots, \xi_n) \in R^n$ or $C^n$. The order of $P(x, D)$ is the greatest integer $|\alpha|$ for which $a_\alpha(x) \not\equiv 0$. In expression (1) $m$ is assumed to be equal to the order, and in that case

$$P_m(x, D) = \sum_{|\alpha| = m} a_\alpha(x) D^\alpha$$

is called the principal part of $P(x, D)$, and the corresponding polynomial $P_m(x, \xi)$ the characteristic polynomial.
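The definitions of order, principal part, and characteristic polynomial can be made concrete with a small computation. The following is a minimal sketch, assuming constant coefficients, so the dependence on $x$ is dropped; the dictionary representation (multi-index tuple mapped to coefficient), the function names, and the sample operator `P` are illustrative assumptions rather than anything prescribed by the article.

```python
def order(coeffs):
    """Greatest length |alpha| among multi-indices with a nonzero coefficient."""
    return max(sum(alpha) for alpha, a in coeffs.items() if a != 0)

def symbol(coeffs, xi, only_length=None):
    """Evaluate sum of a_alpha * xi^alpha, optionally over |alpha| == only_length."""
    total = 0
    for alpha, a in coeffs.items():
        if only_length is not None and sum(alpha) != only_length:
            continue
        term = a
        for xi_j, alpha_j in zip(xi, alpha):
            term *= xi_j ** alpha_j        # xi^alpha = xi_1^alpha_1 * ... * xi_n^alpha_n
        total += term
    return total

def characteristic_polynomial(coeffs, xi):
    """Value at xi of the polynomial attached to the principal part."""
    return symbol(coeffs, xi, only_length=order(coeffs))

# Hypothetical operator in two variables: P(D) = D_1^2 + D_2^2 + 3 D_1 + 5.
P = {(2, 0): 1, (0, 2): 1, (1, 0): 3, (0, 0): 5}

m = order(P)                                   # order of P(D)
full = symbol(P, (1, 2))                       # P(xi) at xi = (1, 2)
top = characteristic_polynomial(P, (1, 2))     # P_m(xi) at xi = (1, 2)
```

Here only the terms with $|\alpha| = 2$ enter `top`, so the lower-order terms $3D_1 + 5$ affect `full` but not the characteristic polynomial.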
Differential operators have been investigated for a long time in connection with the linear differential equations

$$P(x,D)u(x)=f(x),\quad x\in\Omega. \tag{2}$$

Except for ordinary differential operators, however, it is only rather recently that the properties of such operators have been studied from the general viewpoint.

We denote a differential operator with constant coefficients by $P(D)$. In general, $P(x, D)$ is assumed to be a linear differential operator with coefficients that are $C^\infty$-functions. However, many of the results for $C^\infty$-coefficients also hold when the coefficients are sufficiently differentiable. (For †function spaces such as $C(\Omega)$, $C^\infty(\Omega)$, $C_0^\infty(\Omega)$, $L_p(\Omega)$, etc., → 125 Distributions and Hyperfunctions, 168 Function Spaces.)

Differential operators are classified according to their properties. The most important are the elliptic, hyperbolic, and parabolic types. A differential operator $P(x, D)$ is said to be an elliptic operator if the characteristic polynomial $P_m(x, \xi)$ has no real zero except for $\xi = 0$ for each $x \in \Omega$. Typical examples are the Laplacian $\Delta = -(D_1^2 + \cdots + D_n^2)$ and the Cauchy-Riemann operator $\partial/\partial\bar z = (1/2)(\partial/\partial x + i\,\partial/\partial y)$.

A differential operator is said to be hyperbolic if the associated Cauchy problem is well posed (→ 325 Partial Differential Equations of Hyperbolic Type). The d'Alembertian $D_1^2 - (D_2^2 + \cdots + D_n^2)$ is an example. A differential operator of the form $\partial/\partial t + P(t, x, D_x)$ is called parabolic if $P(t, x, D_x)$ is strongly elliptic (→ Section G) in $x$. The heat operator $iD_{n+1} - \Delta$ is typical.

These three types of operators appear most often in applications, and if $n = 2$, then any operator of order 2 with real coefficients in $(\partial/\partial x_i)$ belongs to one of them at a generic point. In other cases, however, there are differential operators that do not belong to any of them.

B. Fundamental Solutions

If a differential operator $P(x, D)$ with $\mathscr{D}(\Omega)$ as its domain has a left inverse $F$ that is expressed as an †integral operator with †kernel distribution (in $\mathscr{D}'_{\Omega\times\Omega}$; → 125 Distributions and Hyperfunctions F), then the kernel is said to be a fundamental solution (or elementary solution). $F$ is usually a right inverse of the weak extension (→ Section F) of $P(x, D)$ and maps $\mathscr{D}(\Omega)$ into $\mathscr{D}'(\Omega)$. The image is mapped to the original function by $P(x, D)$. Nevertheless, $F$ is not a genuine right inverse, and hence the fundamental solution is not unique if it exists.

When $P(D)$ is a differential operator with constant coefficients, we call a distribution $E(x)$ a fundamental solution if it satisfies

$$P(D)E(x)=\delta(x), \tag{3}$$

where $\delta(x)$ is †Dirac's distribution ($\delta$ function). If $E(x)$ is a fundamental solution in this sense, then $F(x, y) = E(x - y)$ is the kernel of a left inverse of $P(D)$ and is a fundamental solution in the sense of the preceding paragraph.

Every differential operator $P(D)$ with constant coefficients has a fundamental solution in the sense of (3) (Ehrenpreis-Malgrange theorem; see L. Hörmander [4] for a proof). General operators $P(x, D)$ with variable coefficients do not necessarily have fundamental solutions. However, if $P(x, D)$ belongs to one of the classical types of operators (elliptic, hyperbolic, or parabolic), then it has a fundamental solution at least locally. (See F. John [8] for elliptic operators, J. Leray [11] for strongly hyperbolic operators, and S. Mizohata [12] and S. D. Èĭdel'man [9] for parabolic operators.) Leray has generalized John's method to strongly hyperbolic operators in an enormous work [13].

C. Ranges of Differential Operators

Let $P(D)$ be a differential operator with constant coefficients. Then it follows from the Ehrenpreis-Malgrange theorem that $P(D)\mathscr{D}'(\Omega) \supset \mathscr{D}(\Omega)$ holds for any open set $\Omega$. However, there are differential operators $P(x, D)$ with variable coefficients such that for any $\Omega$, $P(x, D)\mathscr{D}'(\Omega) \not\supset \mathscr{D}(\Omega)$. H. Lewy first devised such an example:

$$P(x, D) = -iD_1 + D_2 - 2(x_1 + ix_2)D_3.$$

Let $C_{2m-1}(x, D)$ be the homogeneous part of order $2m - 1$ of the commutator

$$P(x, D)\bar P(x, D) - \bar P(x, D) P(x, D).$$

Then in order that $P(x, D)\mathscr{D}'(\Omega) \supset \mathscr{D}(\Omega)$, it is necessary that

$$P_m(x,\xi)=0 \ \text{ implies } \ C_{2m-1}(x,\xi)=0$$

for all $x \in \Omega$, $\xi \in \mathbf{R}^n$ (Hörmander's theorem [4]). When $P(x, D)$ is a differential operator that does not satisfy this condition (e.g., Lewy's operator), choose an $f(x) \in \mathscr{D}(\Omega)$ that is not in $P(x, D)\mathscr{D}'(\Omega)$. Then the differential equation (2) has no distribution solutions at all. P. Schapira extended this result to the case of †hyperfunctions (also → 274 Microlocal Analysis).

Concerning the ranges of differential operators $P(D)$ with constant coefficients, we have the following detailed results due to L. Ehrenpreis [28], B. Malgrange [14], and Hörmander [4].
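As a check on the necessary condition in Hörmander's theorem, the commutator can be computed by hand for Lewy's operator, for which $m = 1$. The sketch below assumes the convention $D_j = -i\,\partial/\partial x_j$ (the one consistent with writing the Laplacian as $-(D_1^2+\cdots+D_n^2)$ in Section A) and reads $\bar P$ as $P$ with complex-conjugated coefficients; the shorthand $A$ and $f$ is introduced only for this computation and is not notation from the article.

```latex
\begin{align*}
P &= A + f D_3, & A &= -iD_1 + D_2 = -\partial_1 - i\,\partial_2, & f &= -2(x_1 + i x_2),\\
\bar P &= \bar A + \bar f D_3, & \bar A &= iD_1 + D_2 = \partial_1 - i\,\partial_2, & \bar f &= -2(x_1 - i x_2).
\end{align*}
% A and \bar A have constant coefficients; f and \bar f do not depend on x_3,
% so [A,\bar A] = 0 and [f D_3, \bar f D_3] = 0.
\begin{align*}
[P,\bar P] &= [A,\bar f D_3] + [f D_3,\bar A] = \bigl(A\bar f - \bar A f\bigr) D_3,\\
A\bar f &= -\partial_1 \bar f - i\,\partial_2 \bar f = 2 + 2 = 4,\\
\bar A f &= \partial_1 f - i\,\partial_2 f = -2 - 2 = -4,\\
[P,\bar P] &= 8 D_3, \qquad C_1(x,\xi) = 8\,\xi_3 .
\end{align*}
% The characteristic polynomial vanishes at a real point where C_1 does not:
\begin{align*}
P_1(x,\xi) &= -i\xi_1 + \xi_2 - 2(x_1 + i x_2)\xi_3, &
\xi &= (-2x_2,\; 2x_1,\; 1) \;\Rightarrow\; P_1(x,\xi) = 0,\quad C_1(x,\xi) = 8 \neq 0 .
\end{align*}
```

Under these assumptions the implication $P_1(x,\xi)=0 \Rightarrow C_1(x,\xi)=0$ fails at every point $x$, which is consistent with Lewy's operator being cited as an operator that violates the condition.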

An open set $\Omega$ is said to be P-convex for a differential operator $P(D)$ if for each compact set $K \subset \Omega$ there exists a compact set $K' \subset \Omega$ such that $\varphi \in C_0^\infty(\Omega)$ and $\operatorname{supp} P(-D)\varphi \subset K$ imply $\operatorname{supp} \varphi \subset K'$. Convex sets are P-convex for any $P(D)$. All open sets are P-convex if and only if $P(D)$ is an elliptic operator.

Theorem: The following conditions are equivalent: (i) $\Omega$ is P-convex; (ii) $P(D)\mathscr{D}'(\Omega) \supset \mathscr{E}(\Omega)$; (iii) $P(D)\mathscr{E}(\Omega) = \mathscr{E}(\Omega)$. Property (iii), the Mittag-Leffler theorem, and the solvability of †Cousin's first problem for the solutions of $P(D)u=0$ are equivalent.

An open set $\Omega$ is said to be strongly P-convex if for each compact set $K \subset \Omega$ there exists a compact set $K'$ such that $\mu \in \mathscr{E}'(\Omega)$ and $\operatorname{supp} P(-D)\mu \subset K$ imply $\operatorname{supp} \mu \subset K'$; and $\mu \in \mathscr{E}'(\Omega)$ and $\operatorname{sing\,supp} P(-D)\mu \subset K$ imply $\operatorname{sing\,supp} \mu \subset K'$; where the singular support of $\mu$ is the closure of the set of all points at which $\mu$ is not a $C^\infty$-function. Convex sets are strongly P-convex, and strongly P-convex sets are P-convex.

Theorem: $\Omega$ is strongly P-convex if and only if $P(D)\mathscr{D}'(\Omega)=\mathscr{D}'(\Omega)$.

R. Harvey has shown that every domain $\Omega$ is P-convex in the sense of hyperfunctions, i.e., the equation $P(D)u = f$ always has a hyperfunction solution $u$ on $\Omega$ for any hyperfunction $f$ on $\Omega$. For the †real analytic functions $\mathscr{A}(\Omega)$, however, $P(D)\mathscr{A}(\Omega)=\mathscr{A}(\Omega)$ does not hold for a convex open set $\Omega$ in general. Hörmander [15] gave a necessary and sufficient condition for $P(D)$ and $\Omega$ in order that this hold.

D. Hypoellipticity

A differential operator $P(x, D)$ is called hypoelliptic in $\Omega$ if for any distribution $u(x)\in\mathscr{D}'(\Omega)$, $Pu \in C^\infty(\Omega)$ implies $u \in C^\infty(\Omega)$. Further, a differential operator with real analytic coefficients $P(x, D)$ is called analytically hypoelliptic in $\Omega$ if $P$ is hypoelliptic and if for any distribution $u(x)\in\mathscr{D}'(\Omega)$, $Pu\in\mathscr{A}(\Omega)$ implies $u(x)\in\mathscr{A}(\Omega)$.

There are two fundamental facts about such operators. Let $P$ have constant coefficients; then $P(D)$ is hypoelliptic if and only if $P(\xi+i\eta)=0$ and $|\xi+i\eta|\to\infty$ imply that $|\eta|\to\infty$ (Hörmander's theorem [4]). Furthermore, $P(D)$ is analytic hypoelliptic if and only if $P$ is elliptic (Petrovskiĭ's theorem [16]). The heat operator is not elliptic, but hypoelliptic. If $P(D)$ is elliptic, then actually any hyperfunction $u(x)\in\mathscr{B}(\Omega)$ such that $Pu\in\mathscr{A}(\Omega)$ is real analytic (R. Harvey, G. Bengel). On the other hand, if $P(D)$ is not elliptic, there is a hyperfunction solution $u$ of $Pu = 0$ that is not a distribution.

Strictly speaking, the notion of hypoellipticity for general differential operators was first formulated explicitly by L. Schwartz [27]. Before that time, D. Hilbert, E. E. Levi and K. O. Friedrichs, and others investigated this problem for some elliptic operators, and the hypoellipticity was called Weyl's lemma (→ 323 Partial Differential Equations of Elliptic Type).

Similarly to the constant coefficient case the following two theorems are fundamental: Elliptic and parabolic operators $P(x, D)$ with $C^\infty$ coefficients are hypoelliptic (Schwartz [27], Mizohata [12]). Elliptic operators $P(x, D)$ with real analytic coefficients are analytic hypoelliptic (I. G. Petrovskiĭ [16], C. B. Morrey and L. Nirenberg). The latter holds also for hyperfunction solutions (M. Sato and Schapira). Moreover, the following result is known: If for each compact set $K$ in an open set $\Omega$ there exists a constant $C$ such that $\|P^k u\|_K \le C^{k+1}(mk)!$, then $u\in\mathscr{A}(\Omega)$, where $\|\cdot\|_K$ denotes the $L_2$-norm on $K$ (H. Komatsu, T. Kotake, and M. S. Narasimhan).

However, as seen by the example $D_1 + i x_1^k D_2$ in $\mathbf{R}^2$ ($k$ is even), the analytic hypoellipticity also holds for nonelliptic operators. Such an operator is called a subelliptic operator; subelliptic operators have been investigated by Hörmander, Yu. V. Egorov, F. Treves, and others [19].

Hörmander has obtained a fairly complete result on the hypoellipticity of the operators of the form

$$P=\sum_{i=1}^{r} X_i^2 + X_0 + c,$$

where $X_0, \ldots, X_r$ are homogeneous first-order differential operators with real coefficients ([17]; O. A. Oleĭnik and E. V. Radkevich [18]). Hypoellipticity was investigated extensively after the introduction of pseudodifferential operators and Fourier integral operators (→ 345 Pseudodifferential Operators).

E. Differential Operators in Banach Spaces

We consider differential operators $P(x, D)$ defined on a domain $\Omega$ as operators in the function spaces $C(\Omega)$ or $L_p(\Omega)$. Differential operators of order $m \ge 1$ are always unbounded operators in the Banach space $X = C(\Omega)$ or $L_p(\Omega)$. Moreover, their domains of definition as operators in $X$ are not generally determined uniquely by the expressions $P(x, D)$ as differential operators.

$P(x, D)$ is a linear operator that maps $C_0^\infty(\Omega)$ into $X$. This operator has a closed extension. The minimal closed extension $P_0$ is called the minimal operator of $P(x, D)$ in $X$. We have $u\in\mathscr{D}(P_0)$ and $P_0u=f$ if and only if there exists a sequence $\varphi_n \in C_0^\infty(\Omega)$ such that $\varphi_n\to u$, $P(x, D)\varphi_n\to f$. On the other hand, the

closed linear operator $P_1$ whose domain is the set of all $u\in X$ such that $P(x,D)u\in X$ in the sense of distribution is called the maximal operator (or weak extension) of $P(x, D)$. We have $u\in\mathscr{D}(P_1)$ and $P_1u=f$ if and only if $\langle u, {}^tP(x, D)\varphi\rangle = \langle f, \varphi\rangle$ for any $\varphi\in C_0^\infty(\Omega)$, where ${}^tP(x, D)$ is the transposed operator

$${}^tP(x,D)\varphi(x)=\sum_{|\alpha|\le m}(-D)^\alpha\bigl(a_\alpha(x)\varphi(x)\bigr).$$

Integration by parts shows that $P_1$ is an extension of $P_0$ and that when $X$ is the †dual space of a space $Y$, the weak extension $P_1$ in $X$ is the †dual of the minimal operator of ${}^tP(x, D)$ in $Y$. Let $X=L_p(\Omega)$ ($1\le p < \infty$), $\Omega$ be a bounded open set with smooth boundary, and $P(D)$ have constant coefficients. Then $P_1$ coincides with the smallest closed extension of the operator $P(D)$ having as its domain the set of all $u\in C^\infty(\Omega)\cap X$ such that $P(D)u\in X$. The latter closed extension is called the strong extension. The difference between the weak and the strong extension is not obvious in the variable coefficient case.

$P_0$ coincides with $P_1$ when $\Omega$ is the entire space and $P(x, D)$ is an elliptic operator whose coefficients are constants or close to constants (J. Peetre, Medd. Lunds Univ. Mat. Sem., 16 (1959); T. Ikebe and T. Kato, Arch. Rational Mech. Anal., 9 (1962)). In general, we have $P_0 \ne P_1$. Let $P(x, D)$ be an ordinary differential operator with bounded coefficients such that $|a_m(x)| \ge \delta > 0$ and $\Omega$ be the bounded interval $(a, b)$. Then the domain of $P_1$ coincides with the set of all $(m-1)$-times continuously differentiable functions $u$ such that the $(m-1)$st derivative is absolutely continuous and $P(x,D)u\in X$, while the domain of $P_0$ is the set of all functions $u$ which satisfy in addition the boundary conditions

$$u(a) = u'(a)= \cdots = u^{(m-1)}(a) = u(b)=\cdots=u^{(m-1)}(b)=0.$$

(Moreover, $u^{(m)}(a) = u^{(m)}(b) = 0$ when $X = C(a, b)$.)

Let $G(P_0)$, $G(P_1)$ ($\subset X \times X$) be the †graphs of $P_0$, $P_1$. Then the quotient space $\mathscr{B} = G(P_1)/G(P_0)$ is called the boundary space, and an element of the dual $\mathscr{B}'$ of $\mathscr{B}$, i.e., a continuous linear functional on $G(P_1)$ which vanishes on $G(P_0)$, is called a boundary value relative to $P(x, D)$. For the ordinary differential operators discussed above, the boundary space is the set of all linear combinations of $u^{(i)}(a)$ and $u^{(j)}(b)$. When $P(x, D)$ is an ordinary differential operator, we can explicitly determine the boundary values also in the case where the interval $(a, b)$ is infinite, the coefficients $a_\alpha(x)$ are not bounded, or $a_m(x)\to 0$ ($x\to a, b$); and we can show that $\mathscr{B}$ is finite-dimensional. When $P(x, D)$ is a partial differential operator, $\mathscr{B}$ is generally of infinite dimension, and the concrete forms of the elements of $\mathscr{B}$ and $\mathscr{B}'$ are not known. However, we have some information by M. I. Vishik (Amer. Math. Soc. Transl., (2) 24 (1963)) about the boundary values of elliptic operators of the second order. Combining this with the results by J.-L. Lions and E. Magenes (J. Anal. Math., 11 (1963)), we can obtain information for elliptic operators of higher order.

F. Differential Operators with Boundary Conditions

A closed operator between the minimal operator $P_0$ and the maximal operator $P_1$ is determined by designating a closed subspace $B$ of the boundary space $\mathscr{B}$. This operator is called the operator with the boundary condition $B$. Particularly important are boundary conditions expressed in the form

$$Q_i(x, D)u(x) = 0,\quad x\in\partial\Omega,\quad i=1,\ldots,k, \tag{4}$$

with differential operators $Q_i(x, D)$ ($i = 1, \ldots, k$) defined on the boundary $\partial\Omega$ of $\Omega$.

When $P(x, D)$ is an ordinary differential operator defined on a finite interval and the orders of $Q_i$ are at most $m - 1$ (or $m$), (4) always has a definite meaning. However, for partial differential operators, we need an interpretation of (4), i.e., (4) does not necessarily determine the subspace $B$ of $\mathscr{B}$ uniquely.

Let $P_s$ be the smallest closed extension of $P(x,D)$ with $\{u\in C^m(\bar\Omega)\cap X \mid Q_i(x,D)u(x)=0,\ x \in \partial\Omega,\ P(x, D)u \in X\}$ as its domain. $P_s$ is called the strong extension of the differential operator $P(x, D)$ with boundary condition (4).

On the other hand, when $\Omega$, $P$, and $Q_i$ satisfy suitable conditions, we can define the transposed differential operator ${}^tP(x, D)$ with the transposed boundary operators $R_j(x, D)$ ($j=1, \ldots, l$). Namely, there are differential operators $R_j(x, D)$ on the boundary such that a necessary and sufficient condition for $u \in C^m(\bar\Omega)$ to satisfy (4) and $Pu(x)=f(x)$ is

$$\int_\Omega f(x)v(x)\,dx=\int_\Omega u(x)\,{}^tPv(x)\,dx \tag{5}$$

for all $v(x) \in C^m(\bar\Omega)$ with the boundary conditions $R_jv = 0$, $x \in \partial\Omega$. Then the operator $P_w$ defined by $P_wu(x) =f(x)$ for the pairs $u(x)$, $f(x)$ satisfying (5) is called the weak extension of the differential operator $P(x, D)$ with boundary condition (4). As in the case of operators without boundary conditions, the weak extension is an extension of the strong extension, and generally is the dual of the strong extension in the dual space $X'$ of the transposed differential operator with the transposed boundary condition.
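A one-dimensional illustration of relation (5), not taken from the article, may help fix ideas. The choices $P = d/dx$, $\Omega = (0,1)$, and the single boundary condition $u(0)=0$ are assumptions made purely for the example; integration by parts then identifies both the transposed operator and the transposed boundary condition.

```latex
% Assumed example: P = d/dx on (0,1), boundary condition Q_1 u = u(0) = 0.
\begin{align*}
\int_0^1 u'(x)\,v(x)\,dx
  &= u(1)v(1) - u(0)v(0) + \int_0^1 u(x)\,\bigl(-v'(x)\bigr)\,dx ,
  &  {}^tP &= -\frac{d}{dx} .
\end{align*}
% Impose the transposed boundary condition R_1 v = v(1) = 0 and put f = u':
\begin{align*}
\int_0^1 f(x)\,v(x)\,dx - \int_0^1 u(x)\,{}^tPv(x)\,dx &= -\,u(0)\,v(0) .
\end{align*}
% The right-hand side vanishes for every admissible v exactly when u(0) = 0.
```

In this sketch, (5) holds for all smooth $v$ with $v(1)=0$ precisely when $u' = f$ and $u(0)=0$, so the boundary condition at $x=0$ for $P$ is paired with the boundary condition at $x=1$ for ${}^tP$.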

Regularity up to the boundary. The fundamental problems for the differential operator $P(x, D)$ with the homogeneous boundary condition (4) are to determine, for both the strong and the weak extensions, the spaces of solutions of the homogeneous equations $Pu = 0$ and their ranges. The problems mostly reduce to determining when the strong and the weak extensions coincide and, including this, also to the problem of regularity on the closed domain $\bar\Omega$ containing the boundary of the solutions $u$ of the equation $P_wu =f$. This problem was solved by Nirenberg for strongly elliptic operators in $L_2(\Omega)$ with the Dirichlet boundary condition

$$\partial^{j-1}u(x)/\partial n^{j-1} = 0,\quad j= 1,2, \ldots ,m/2,$$

and generalized later by F. Browder, M. Schechter [20], S. Agmon, Lions, and others for elliptic operators in $L_p(\Omega)$ with a kind of coercive boundary condition (→ Section H). Consequently, when $\Omega$ is bounded and smooth, $P_w$ is equal to $P_s$ for those operators. Write $P$ for $P_w$. Then the space $N(P)$ of the solutions of $Pu = 0$ is a subspace of finite dimension, and the range $R(P)$ is a closed subspace of finite codimension. In particular, it follows that the †index of $P$, $\dim N(P) - \operatorname{codim} R(P)$, is finite.

G. Strongly Elliptic Operators

A differential operator $P(x, D)$ is said to be strongly elliptic if its characteristic polynomial satisfies

$$\operatorname{Re} P_m(x,\xi)\ge c|\xi|^m>0,\quad \xi\ne 0.$$

Many of the elliptic operators, such as the Laplacian, that appear in applications are strongly elliptic. L. Gårding treated the boundary value problem with the Dirichlet condition (in the generalized sense) for strongly elliptic operators. His work initiated the general study of differential operators (→ 323 Partial Differential Equations of Elliptic Type). His theory is based on the following inequality, called Gårding's inequality:

$$\operatorname{Re}\,(P(x,D)u,u)\ge c\,\|u\|_{m/2}^2 - C\,\|u\|^2,\quad u\in C_0^\infty(\Omega),$$

with constants $c > 0$ and $C$, where $\|\cdot\|_{m/2}$ is the Sobolev norm of order $m/2$.

H. Coercive Boundary Conditions

The boundary condition (4) is said to be coercive if

$$\|\nabla^m u\| \le C(\|Pu\| + \|u\|) \tag{6}$$

holds for any $u\in C^m(\bar\Omega)$ that satisfies (4). In order that a differential operator $P$ have a coercive boundary condition, it is necessary that it be a special type of elliptic operator. In this case, Agmon, N. Aronszajn, Schechter, and others found conditions under which (4) is coercive. Agmon, A. Douglis, and Nirenberg show that the inequality (6) holds in $L_p(\Omega)$ and in the normed spaces of Hölder continuous functions under a suitable condition. The classical boundary conditions $au + b\,\partial u/\partial n =0$ ($a \ge 0$, $b \ge 0$, $a + b = 1$) are coercive for elliptic operators of the second order (→ 323 Partial Differential Equations of Elliptic Type). However, problems remain when the coefficients of $Q_j(x, D)$ are discontinuous. In order to have coincidence of the strong and weak extensions or regularity up to the boundary, it is necessary neither that $P$ be elliptic nor that the boundary condition be coercive. But it is not known to what extent these conditions can be weakened. At present, major contributions are Hörmander's work (Acta Math., 99 (1958)) dealing with operators with constant coefficients and flat boundaries, and works by J. J. Kohn [21], Nirenberg, and Hörmander concerning noncoercive boundary conditions. The latter works are connected with the theory of several complex variables, and have attracted much attention.

I. Self-Adjoint Extension

One of the fundamental problems in the case $X =L_2(\Omega)$ is whether the minimal operator $P_0$ has a self-adjoint extension. $P_0$ is symmetric if and only if $P(x, D)$ is formally self-adjoint: $P(x,D)={}^t\bar P(x,D)$. Under this condition the boundary space $\mathscr{B}$ turns out to be the direct sum of two subspaces $\mathscr{B}_\pm = \{(x, P_1 x) \mid x \in \mathscr{D}(P_1),\ P_1 x = \pm ix\} + G(P_0)$. The numbers $n_\pm = \dim \mathscr{B}_\pm$ are called the deficiency indices of $P_0$, and $P_0$ has a self-adjoint extension if and only if $n_+ = n_-$.

H. Weyl gave a method for computing $n_\pm$ for the Sturm-Liouville operators:

$$P(x,D)= -\frac{d}{dx}\Bigl(p(x)\frac{d}{dx}\Bigr)+q(x),\quad x\in(a,b).$$

We say that $a$ (resp. $b$) is of limit circle type if the solutions $u(x)$ of $P(x, D)u(x) + lu(x) = 0$ ($l\in\mathbf{C}$) always belong to $L_2$ in a neighborhood of $a$ (resp. $b$) and of limit point type if a solution does not belong to $L_2$. This classification does not depend on the choice of $l\in\mathbf{C}$. (1) If both $a$ and $b$ are of limit point type, then $n_+ = n_- = 0$ and hence $P_0$ is self-adjoint. (2) If $a$ is of limit circle type and $b$ is of limit point type, then $n_+ = n_- = 1$, and the self-adjoint extensions of $P_0$ are the operators $P_\alpha$ that are obtained from $P_1$ by assigning the boundary

condition

$$p(a)u'(a) \cos\alpha+ u(a) \sin\alpha = 0.$$

The same is true when $a$ and $b$ are interchanged. (3) If both $a$ and $b$ are of limit circle type, then $n_+ = n_- = 2$, and we can impose two boundary conditions to obtain the self-adjoint extensions.

These results have been extended by K. Kodaira [23] and N. Dunford and J. Schwartz [3] to the case of ordinary differential operators of order $m$. There are formally self-adjoint operators that have no self-adjoint extensions. For example, the operator $-i\,d/dx$ in $L_2(0, \infty)$ has deficiency indices $n_+ = 1 \ne n_- = 0$.

For partial differential operators, it is difficult to determine explicitly all self-adjoint extensions of a given formally self-adjoint operator because the boundary space is complicated. If the boundary condition (4) is formally self-adjoint and coercive, then it follows from the results of Schechter and others that the differential operator with boundary condition (4) is self-adjoint. Furthermore, conditions under which $P_0$ is self-adjoint or has a self-adjoint extension are known. The following theorem is often used as a condition of the latter type. If a †symmetric operator $T$ defined on a dense subspace of a Hilbert space $X$ is positive definite:

$$(Tx,x)\ge 0,\quad x\in\mathscr{D}(T),$$

then there is a positive definite self-adjoint extension $\tilde T$ (Friedrichs's theorem). The self-adjoint extension obtained by this theorem is called the Friedrichs extension.

J. Generators of Semigroups

From the point of view of probability theory, W. Feller investigated the extensions of the Laplacian $d^2/dx^2$ and similar operators that are the generators of order-preserving semigroups. Recently various attempts have been made to generalize his results to the multidimensional case (→ 115 Diffusion Processes; 378 Semigroups of Operators and Evolution Equations).

P. D. Lax and A. N. Milgram proved that if $P(x, D)$ is a strongly elliptic operator, then $-P(x, D)$ with the Dirichlet condition in $L_2(\Omega)$ is the generator of a semigroup [1].

K. Boundary Value Problems

There are two methods of solving the inhomogeneous boundary value problem

$$P(x,D)u(x)=f(x),\quad x\in\Omega,$$
$$Q_i(x,D)u(x) =g_i(x),\quad x\in\partial\Omega,\quad i= 1, \ldots,k.$$

In the first method, we take a function $v(x)$ that satisfies $Q_i(x, D)v(x) = g_i(x)$ and reduce the problem to the homogeneous one for $u_0 = u-v$. In the second method, we consider the pair $P(x, D)$ and $Q_i(x, D)$ as an operator that maps a function $u$ to the pair of functions $(Pu, Q_iu)$ and investigate it directly. The latter method was adopted by Peetre and Hörmander [4].

L. Estimates in Weighted Spaces

J. F. Treves, Hörmander [4], and H. Kumano-go obtained estimates similar to (6) in $L_2$-spaces relative to a weighted measure $w(x)\,dx$ instead of the usual $L_2$-spaces, and applied them to the proof of the uniqueness of Cauchy problems for differential equations with variable coefficients (→ 321 Partial Differential Equations (Initial Value Problems)). Hörmander [22] applied similar estimates to the proof of the †fundamental theorems of Stein manifolds.

M. Eigenfunction Expansions

When a self-adjoint operator $P$ in the Hilbert space $L_2(\Omega)$ is a self-adjoint extension of a differential operator $P(x, D)$, the †spectral decomposition of $P$ is concretely expressed by the expansion of functions $u\in L_2(\Omega)$ into eigenfunctions of $P(x, D)$.

If $\Omega$ is bounded, $P(x, D)$ is an elliptic operator defined on a neighborhood of $\bar\Omega$, and the boundary condition is coercive, then the †spectrum of $P$ is composed solely of eigenvalues, and the eigenvectors of $P$ are eigenfunctions of $P(x, D)$ in the classical sense and are of class $C^\infty$ up to the boundary.

N. Asymptotic Distribution of Eigenvalues

If $P = -\Delta$, the number $\nu(\lambda)$ of eigenvalues less than $\lambda$ satisfies the asymptotic relation

$$\nu(\lambda) \sim \frac{\lambda^{n/2}A}{2^{n-1}\pi^{n/2}\,n\,\Gamma(n/2)},\quad \text{as } \lambda\to\infty,$$

regardless of the shape of the domain and the boundary condition, where $n$ is the dimension and $A$ is the volume of $\Omega$ [2]. This was first proved by Weyl and extended by T. Carleman, Gårding, and others to the case of operators of higher order with variable coefficients (→ 323 Partial Differential Equations of Elliptic Type).
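The asymptotic relation for the eigenvalue count can be checked numerically in a case where the eigenvalues are known in closed form. The sketch below is only an illustration under stated assumptions: it takes $\Omega$ to be the unit square in $\mathbf{R}^2$ with the Dirichlet condition, where the eigenvalues of $-\Delta$ are $\pi^2(j^2+k^2)$ for integers $j, k \ge 1$, and compares the exact count with the leading term $\lambda^{n/2}A/(2^{n-1}\pi^{n/2}n\Gamma(n/2))$. The function names are hypothetical and chosen for this example.

```python
import math


def weyl_estimate(lam: float, n: int, volume: float) -> float:
    """Leading Weyl term: lam^(n/2) * A / (2^(n-1) * pi^(n/2) * n * Gamma(n/2))."""
    return lam ** (n / 2) * volume / (
        2 ** (n - 1) * math.pi ** (n / 2) * n * math.gamma(n / 2)
    )


def count_dirichlet_eigenvalues_unit_square(lam: float) -> int:
    """Count eigenvalues pi^2 (j^2 + k^2), j, k >= 1, that are less than lam.

    Assumption: the domain is the unit square with the Dirichlet condition,
    so the spectrum of the negative Laplacian is known explicitly.
    """
    bound = lam / math.pi ** 2          # condition becomes j^2 + k^2 < bound
    jmax = math.isqrt(int(bound)) + 1   # safe upper limit for j and k
    return sum(
        1
        for j in range(1, jmax + 1)
        for k in range(1, jmax + 1)
        if j * j + k * k < bound
    )


if __name__ == "__main__":
    for lam in (1e2, 1e3, 1e4, 1e5):
        exact = count_dirichlet_eigenvalues_unit_square(lam)
        approx = weyl_estimate(lam, n=2, volume=1.0)
        print(f"lambda={lam:>8.0f}  count={exact:>6d}  "
              f"Weyl={approx:10.1f}  ratio={exact / approx:.4f}")
```

For $n = 2$ the leading term reduces to $\lambda A/(4\pi)$. In this Dirichlet example the exact count stays below the leading term and the ratio approaches 1 slowly as $\lambda$ grows; the gap reflects lower-order boundary effects that the leading-order relation does not capture.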

O. Weyl-Stone-Titchmarsh-Kodaira Theory

When $\Omega$ is unbounded or the coefficients of $P(x, D)$ have singularities near the boundary of $\Omega$, the spectrum of $P$ may have a continuous part.

Let $P(x, D)$ be an ordinary differential operator on an interval $(a, b)$. Then for each $\lambda\in\mathbf{C}$ the equation $(P(x,D)-\lambda)\varphi(x)=0$ has $m$ linearly independent solutions $\varphi_k(x, \lambda)$ ($k = 1, \ldots, m$), and any solution is represented as a linear combination of them. Weyl and M. H. Stone obtained the spectral decomposition of $P$ (of the second order) in the form

$$u(x)=\sum_{j,k=1}^{2}\int_{-\infty}^{\infty}\varphi_j(x,\lambda)\,d\rho_{jk}(\lambda)\int_a^b \varphi_k(y,\lambda)u(y)\,dy,$$

where the $\rho_{jk}(\lambda)$ are functions of bounded variation and their variations represent the spectral measure. This formula shows that linear combinations of the $\varphi_j(x, \lambda)$ form generalized eigenfunctions even when $\lambda$ belongs to the continuous spectrum. Later, E. C. Titchmarsh and Kodaira gave a formula to obtain the density matrix $\rho_{jk}(\lambda)$ and completed the theory (Titchmarsh [6], Kodaira [23], Dunford and Schwartz [3]). This expansion theorem makes it possible to deduce in a unified manner expansion theorems for classical special functions, such as the †Fourier series expansion theorem, the expansions by †Hermite polynomials and †Laguerre polynomials, the †Fourier integral theorem, and various expansions in terms of †Bessel functions [6,7].

The relation between the coefficients of the differential operator and the spectral distribution of $P$ is important in applications and is the subject of many papers [3,5,6].

For example, let $P(x, D) = -d^2/dx^2 + q(x)$ and $\Omega=(-\infty,\infty)$. If $q(x)\to\infty$ as $|x|\to\infty$, then $P(x, D)$ is essentially self-adjoint, the spectrum is entirely composed of the point spectrum, and a detailed estimate of the $j$th eigenvalue $\lambda_j$ is also known [6]. If $q(x)$ converges rapidly to 0 as $|x|\to\infty$, then $P(x, D)$ is essentially self-adjoint, and there is only a continuous spectrum for $\lambda > 0$ and eigenvalues for $\lambda < 0$ with at most 0 as accumulation point. If $q(x)$ is a periodic function of $x$, then $P(x, D)$ is again essentially self-adjoint, and the spectrum consists of a continuous spectrum decomposed in a sequence of nonoverlapping intervals. The converse problem of determining $q(x)$ when the spectral measure is given has been studied by I. M. Gel'fand and B. M. Levitan (Amer. Math. Soc. Transl., (2) 1 (1955); original in Russian, 1951).

P. Eigenfunction Expansion for Partial Differential Operators

The theory of eigenfunction expansion for partial differential operators with continuous spectra is not as complete as it is for ordinary differential operators. What causes difficulties is that the solutions of $(P(x, D) - \lambda)u = 0$, which should be the generalized eigenfunctions, form an infinite-dimensional space, and except for special cases it is impossible to introduce convenient parameters in it. Many proofs are known for a general result, saying that any self-adjoint elliptic operator has an eigenfunction expansion into generalized eigenfunctions of the form

$$u(x)=\int_{-\infty}^{\infty}\sum_{j=1}^{k(\lambda)}\varphi_j(x,\lambda)\,d\rho_j(\lambda)\int_\Omega \overline{\varphi_j(y,\lambda)}\,u(y)\,dy$$

[10]. There are few operators, however, for which we know how to construct $\varphi_j(x, \lambda)$ and the measure $d\rho_j(\lambda)$. The Fourier transform gives the expansion for operators with constant coefficients defined on the whole space. By means of a generalized form of the Fourier transform, Ikebe gave an eigenfunction expansion for the Schrödinger operator $-\Delta + q(x)$ in $\mathbf{R}^3$ under the condition that $q(x) \in L_2$ and $q(x) = O(|x|^{-2-\varepsilon})$ as $|x|\to\infty$ [24]. Y. Shizuta, Mizohata, Lax and R. S. Phillips, N. A. Shenk, Ikebe, D. K. Fadeev, and others proved the same results for similar operators defined on an exterior domain with a bounded set deleted from a higher-dimensional Euclidean space. These theories are closely related to the †scattering theory of the Schrödinger equation $(-i\,d/dt - P)u = 0$ or the wave equation $(d^2/dt^2 +P)u =0$ associated with those operators. Lax and Phillips have developed the latter scattering theory [26] (→ 375 Scattering Theory).

Many of the problems in †quantum mechanics reduce to finding the spectral distribution of self-adjoint partial differential operators.

Q. Expansion Theorems for Non-Self-Adjoint Operators

A kind of eigenfunction expansion theorem may hold for non-self-adjoint differential operators or for non-Hilbert spaces. (See papers by the Russian school for ordinary differential operators and those by Browder,

Agmon, and others for partial differential operators.)

R. Systems of Differential Operators

We have so far dealt with single differential operators that map functions $u$ to functions $f$. A linear differential operator that maps a $p$-tuple $(u_1, \ldots, u_p)$ of functions to a $q$-tuple $(f_1, \ldots, f_q)$ of functions can be written as

$$f_i(x)=\sum_{j=1}^{p}P_{ij}(x,D)\,u_j(x),\quad i=1,\ldots,q,$$

where $P(x, D) = (P_{ij}(x, D))$ is a matrix of single differential operators. Such a matrix is called a system of differential operators. A system $P(x, D)$ is said to be underdetermined if $p > q$, determined if $p = q$, and overdetermined if $p < q$. Many propositions that hold for single operators hold for determined systems under appropriate conditions.

However, there is a fundamental difference between overdetermined (underdetermined) systems and determined systems, as is seen from the theory of several complex variables, which is the theory of a typical overdetermined system $\partial/\partial\bar z = (\partial/\partial\bar z_i)$. The theory is much more difficult for overdetermined (underdetermined) systems. The general theory of overdetermined and underdetermined systems with constant coefficients has been constructed by Ehrenpreis [28], Malgrange, V. Palamodov, and Hörmander [22] for $C^\infty$-functions and distributions. It has been extended by H. Komatsu to the case of hyperfunctions. T. Miwa has discussed the same problem for real analytic functions.

S. Symmetric Systems of the First Order

Determined systems of the first order are important in applications. Many problems in mathematical physics are formulated in terms of them. Also, single equations of higher order can be reduced to determined systems of the first order by regarding the derivatives as unknown functions. In some cases determined systems of the first order are easier to handle than single operators of higher order. In particular, a first-order system of differential operators with matrix coefficients $A_i(x)$ for the derivatives $\partial/\partial x_i$ and a zeroth-order matrix term $B(x)$ is said to be a symmetric positive system if the matrices $A_i(x)$ and $B(x)$ satisfy the following conditions: $A_i^* = A_i$; $B + B^* + \sum_i \partial A_i/\partial x_i$ is positive semidefinite. Symmetric positive systems have been studied in detail by Friedrichs [25], Phillips, C. S. Morawetz, Lax, and others.

References

[1] K. Yosida, Functional analysis, Springer, 1965.
[2] R. Courant and D. Hilbert, Methods of mathematical physics, Interscience, I, 1953; II, 1962.
[3] N. Dunford and J. T. Schwartz, Linear operators II, Interscience, 1963.
[4] L. Hörmander, Linear partial differential operators, Springer, 1963.
[5] M. A. Naimark, Linear differential operators I, II, Ungar, 1967. (Original in Russian, 1954.)
[6] E. C. Titchmarsh, Eigenfunction expansions associated with second-order differential equations, Clarendon Press, I, revised edition, 1962; II, 1958.
[7] K. Yosida, Lectures on differential and integral equations, Interscience, 1955. (Original in Japanese, 1950.)
[8] F. John, Plane waves and spherical means applied to partial differential equations, Interscience, 1955.
[9] S. D. Èĭdel'man, Parabolic systems, North-Holland, 1969. (Original in Russian, 1964.)
[10] I. M. Gel'fand and N. Ya. Vilenkin, Generalized functions IV, Applications of harmonic analysis, Academic Press, 1964.
[11] J. Leray, Hyperbolic differential equations, Lecture notes, Institute for Advanced Study, Princeton Univ. Press, 1953.
[12] S. Mizohata, Hypoellipticité des équations paraboliques, Bull. Soc. Math. France, 85 (1957), 15-50.
[13] J. Leray, Problème de Cauchy I-V, Bull. Soc. Math. France, 85 (1957), 389-429; 86 (1958), 75-96; 87 (1959), 81-180; 90 (1962), 39-156; 92 (1964), 263-361.
[14] B. Malgrange, Existence et approximation des solutions des équations aux dérivées partielles et des équations de convolution, Ann. Inst. Fourier, 6 (1955-1956), 271-355.
[15] L. Hörmander, On the existence of real analytic solutions of partial differential equations with constant coefficients, Inventiones Math., 21 (1973), 151-182.
[16] I. G. Petrovskiĭ, Sur l'analyticité des solutions des systèmes d'équations différentielles, Mat. Sb., 5 (47) (1939), 3-70.
[17] L. Hörmander, Hypoelliptic second-order differential equations, Acta Math., 119 (1967), 147-171.
[18] O. A. Oleĭnik and E. V. Radkevich, Second-order equations with nonnegative characteristic form, Amer. Math. Soc., 1973. (Original in Russian, 1971.)
[19] L. Hörmander, Seminar on singularities of solutions of linear partial differential equations, Ann. Math. Studies 91, Princeton Univ. Press, 1979.

[20] M. Schechter, On $L^p$ estimates and regularity I, II, Amer. J. Math., 85 (1963), 1-13; Math. Scand., 13 (1963), 47-69.
[21] J. J. Kohn, A priori estimates in several complex variables, Bull. Amer. Math. Soc., 70 (1964), 739-745.
[22] L. Hörmander, An introduction to complex analysis in several variables, Van Nostrand, 1966.
[23] K. Kodaira, On ordinary differential equations of any even order and the corresponding eigenfunction expansions, Amer. J. Math., 72 (1950), 502-544.
[24] T. Ikebe, Eigenfunction expansions associated with the Schrödinger operators and their applications to scattering theory, Arch. Rational Mech. Anal., 5 (1960), 1-34.
[25] K. O. Friedrichs, Symmetric positive linear differential equations, Comm. Pure Appl. Math., 11 (1958), 333-418.
[26] P. D. Lax and R. S. Phillips, Scattering theory, Academic Press, 1967.
[27] L. Schwartz, Théorie des distributions, Hermann, 1966.
[28] L. Ehrenpreis, Fourier analysis in several complex variables, Wiley-Interscience, 1970.
[29] J.-E. Björk, Rings of differential operators, North-Holland, 1979.
[30] L. Hörmander, The analysis of linear partial differential operators, I-IV, Springer, 1983-1985.

higher differentiation in $R$: (i) $\delta_\lambda(x + y) = \delta_\lambda x + \delta_\lambda y$; (ii) $\delta_\lambda(xy) = \sum \delta_\alpha x \cdot \delta_\beta y$ (the addition is performed over all pairs $\alpha$, $\beta$ of nonnegative integers that satisfy $\alpha + \beta = \lambda$); (iii) $\delta_\lambda(\delta_\mu x) = \binom{\lambda+\mu}{\lambda}\delta_{\lambda+\mu} x$; $\delta_0 x = x$. Two higher differentiations $\delta = \{\delta_\lambda\}$ and $\delta' = \{\delta'_\mu\}$ are said to be commutative if and only if $\delta_\lambda$ and $\delta'_\mu$ commute for all pairs $\lambda$, $\mu$ of nonnegative integers. Higher differentiations were introduced by H. Hasse (1935) for the study of the field of †algebraic functions of one variable in the case of nonzero characteristics.

These two definitions of differential rings coincide if the characteristic of $R$ is zero. For simplicity, we shall restrict ourselves to that case.

Let $\delta_1, \ldots, \delta_m$ be the differentiations of the differential ring $R$. If $x$ is an element of $R$, $\delta_1^{s_1}\delta_2^{s_2}\cdots\delta_m^{s_m}x$ ($s_1, s_2, \ldots, s_m$ are nonnegative integers) is called a derivative of $x$. We call $x$ constant if and only if $\delta_1 x = \cdots = \delta_m x = 0$. An †ideal $\mathfrak{a}$ of $R$ with $\delta_i\mathfrak{a} \subset \mathfrak{a}$ ($i = 1,2, \ldots, m$) is called a differential ideal of $R$. If it is a †prime ideal (semiprime ideal (i.e., an ideal containing all those elements $x$ that satisfy $x^g \in \mathfrak{a}$ for some natural number $g$)), then $\mathfrak{a}$ is called a prime differential ideal (semiprime differential ideal). A subring $S$ of $R$ with $\delta_i S \subset S$ can be regarded as a differential ring with respect to the differentiations $\delta_1, \ldots, \delta_m$. We call $S$ a differential
subring of R and R a differential extension ring
of s.
Let X,, . . . , X, be elements of a differential
113 (111.17) extension fïeld of a differential field K with the
differentiations 6 l,“.> 6mr and let &S? . . .
Differential Rings
6s;Xi(sla0,...,s,>0, l<i<n)be+algebra-
ically independent over K. The totality of their
Let R be a tcommutative ring with a unity polynomials over K, which forms a differential
element 1. If a map 6 of R into R is such that ring, is called the ring of differential polyno-
for any pair x, y of elements of R, (i) 6(x + y) mials of the differential variables X, , . , X,
=6x + 6y, and (ii) 6(xy) = 6x. y + x. 6y, then 6 over K, and is denoted by K{X,, . . , X,}. Its
is called a derivation (or differentiation) in R. A elements are called differential polynomials.
ring R provided with a finite number of mutu- For this ring of differential polynomials we
ally commutative differentiations in R is called have an analog of +Hilbert’s basis theorem in
a differential ring. In this article we consider the ring of ordinary polynomials, Ritt’s basis
only the case where R contains a subfield that theorem: If we are given any set W of differen-
has the unity element in common with R. In tial polynomials of X,, . . . , X, over K, we cari
particular, if R is a fïeld, we cal1 it a differential choose a fïnite number of differential polyno-
field. mials P,, , P, from %!r)lsuch that each element
In the above definition of differential ring, it Q of <331has an integral power Qg equal to a
is not necessary to mention the tcharacteristic linear combination of P,, . . , P, and their de-
of the subfïeld contained in R. However, to rivatives, where the coefficients of the linear
make it more effectively applicable in the case combination are elements of K {X, , . , Xx}.
of nonzero characteristics, we may detïne This theorem implies that in the ring of differ-
differential rings using higher differentiation in ential polynomials, every semiprime differential
place of the differentiation detïned above. If a ideal cari be expressed as the intersection of a
sequence 6= {S,} of maps a,,, a,, a,, of R finite number of prime differential ideals; if this
into R satisfïes the following conditions (i)-(iv) expression is tirredundant, it is unique (- 67
for any pair x, y of elements of R and any pair Commutative Rings).
i., p of nonnegative integers, then 6 is called a The equation obtained by equating a dif-
113 Ref. 430
Differential Rings

ferential polynomial to zero is called an alge- [S] E. R. Kolchin, Abelian extensions of dif-
brait differential equation. Concerning these ferential fïelds, Amer. J. Math., 82 (1960), 779-
equations, we are able to use methods similar 790.
to those used in talgebraic geometry in study- [6] K. Okugawa, Basic properties of differen-
ing the usual algebraic equations. J. Ritt made tial fields of an arbitrary characteristic and the
interesting studies on solutions of algebraic Picard-Vessiot theory, J. Math. Kyoto Univ., 2
differential equations by such methods, prin- (1963), 295-322.
cipally in the case when the ground fïeld K [7] J. F. Ritt, Differential algebra, Amer.
consists of tmeromorphic functions. Math. Soc. Colloq. Publ., 1950.
Since that time, basic study of differential
rings and fïelds has been fairly well organized
and has developed into theories such as the
following two: 114 (IX.1 8)
(1) Picard-Vessiot tbeory. This is a classical Differential Topology
theory of tlinear homogeneous differential
equations originated by E. Picard and E.
A. General Remarks
Vessiot; it resembles the tGalois theory con-
cerning algebraic equations. The Galois group
Differential topology cari be defïned as the
in this case is a linear group, and its structure
study of those properties of tdifferentiable
characterizes the solution of the differential
manifolds that are invariant under tdiffeo-
equation. E. Kolchin introduced the general
morphisms. The basic abjects studied in this
concept of the Picard-Vessiot extension fïeld of
field are the topological, combinatorial, and
a differential field and studied in detail the
differentiable structures of manifolds and the
group of differential automorpbisms (i.e., the
relationships among them. Some of the re-
group of a11 those automorphisms that com-
markable contributions, such as H. Whitney’s
mute with the differentiations and fïx elements
embedding theorem [ 11, the triangulation theo-
of the ground field), thus making the classical
rems of J. H. C. Whitehead and S. S. Cairnes
theory more precise and more general.
[2,3], and Morse theory [4], were made in
(2) Galois theory of differential fields. Gen-
the 1930s. In the late 195Os, outstanding re-
eralizing the concept of the Picard-Vessiot
sults were obtained by R. Thom, J. W. Milnor,
extension, Kolchin introduced the notion of
S. Smale, M. Kervaire, and F. Hirzebruch,
the strongly normal extension fïeld and es-
among others. Differential topology thus be-
tablished the Galois theory for such exten-
came a new, fascinating branch of mathematics.
sions. In this theory, we see that the Galois
group is an talgebraic group relative to a
tuniversal domain over the field of constants B. Differentiable Structures
of the ground iïeld. Conversely, every algebraic
group is the Galois group of a strongly normal We assume that all manifolds are tparacom-
extension fïeld. We also see that a strongly pact. Let M be a ttopological manifold. A
normal extension cari be decomposed, in a C’-equivalence class of tatlases of class C
certain sense, into a Picard-Vessiot extension (1 < r < CO) on M is called a C’-structure on M.
and an Abelian extension (i.e., an extension Any C-structure on M contains an atlas of
whose Galois group is an TAbelian variety) class C” on M, and its C”-equivalence class is
(- Kolchin [2-51, Okugawa 161). uniquely determined (Whitney Cl]). This C”-
structure is called a differentiable structure on
M compatible with the given (Y-structure.
Moreover, any Cm-structure admits a treal
References
analytic structure compatible with it in this
sense (Whitney Cl]). A C”-manifold is also
[l] 1. Kaplansky, An introduction to differen- called a smooth manifold, and a differentiable
tial algebra, Actualités Sci. Ind., Hermann, structure a smooth structure. Let D,, D, be
1957. differentiable structures on M. If two C”-
[2] E. R. Kolchin, Algebraic matric groups manifolds (M, D,) and (M, Dl) are not tdif-
and the Picard-Vessiot theory of homogeneous feomorphic, we say that the differentiable
ordinary linear differential equations, Ann. structures D,, D, are distinct (- 105 Differen-
Math., (2) 49 (1948), l-42. tiable Manifolds). Milnor defïned a certain
[3] E. R. Kolchin, Galois theory of differential invariant of differentiable structure using the
fïelds, Amer. J. Math., 75 (1953), 753-824. Hirzebruch index theorem (- 56 Character-
[4] E. R. Kolchin, On the Galois theory of istic Classes) and proved that there are several
differential fields, Amer. J. Math., 77 (1955), distinct differentiable structures on the 7-
868-894. dimensional sphere S’ (Milnor [SI). Milnor’s
example of such structures was: Let $f_k: S^3 \to SO(4)$ be the mapping defined by $f_k(\sigma)\tau = \sigma^h\tau\sigma^j$, where $k$ is an odd integer, $h$, $j$ are integers determined by $h + j = 1$, $h - j = k$, and $\sigma$, $\tau$ are †quaternions of norm 1. Let $M_k^7$ be the total space of the $S^3$-bundle over $S^4$ corresponding to the mapping $f_k$. This is an oriented closed manifold with the naturally defined differentiable structure. Moreover, for each $k$, $M_k^7$ is homeomorphic to the 7-dimensional sphere $S^7$. But if $k$, $l$ are odd and $k^2 \not\equiv l^2 \pmod 7$, $M_k^7$ is not diffeomorphic to $M_l^7$. Differentiable manifolds (such as $M_3^7$) that are homeomorphic, but not diffeomorphic, to the natural sphere are called exotic spheres. After the discovery made by Milnor, many topological manifolds other than the 7-dimensional sphere and possessing several distinct differentiable structures have been found (N. Shimada, I. Tamura). Moreover, topological manifolds admitting no differentiable structures have been constructed (M. Kervaire [6], Smale, Tamura [7], J. Eells and N. Kuiper, C. T. C. Wall [8]). On the other hand, manifolds of dimension $\leq 3$ admit unique differentiable structures.

We can introduce a †Riemannian metric on a $C^\infty$-manifold. Let $M_1^k$ and $M_2^n$ be $C^\infty$-manifolds ($k$-dimensional and $n$-dimensional, respectively) and $f: M_1^k \to M_2^n$ be an †immersion. We fix a Riemannian metric on $M_2^n$. For each point $p$ of $M_1^k$, let $N_p(f)$ be the linear space of all tangent vectors of $M_2^n$ at $f(p)$ that are orthogonal to the †tangent space of $f(U(p))$, where $U(p)$ is a small open neighborhood of $p$ in $M_1^k$. Then $\{N_p(f) \mid p \in M_1^k\}$ forms an $(n-k)$-dimensional †vector bundle $\nu_f$ over $M_1^k$, called the normal bundle of the immersion $f$. The equivalence class of $\nu_f$ is independent of the choice of Riemannian metric on $M_2^n$. When $f$ is an †embedding, we denote the submanifold $f(M_1^k)$ by $L^k$. Let $N_\varepsilon(L^k)$ be the set of all points whose distances (with respect to the Riemannian metric) from $L^k$ are $\leq \varepsilon$. If $L^k$ is compact and $\varepsilon$ is sufficiently small, $N_\varepsilon(L^k)$ is an $n$-dimensional submanifold of $M_2^n$ with boundary and is uniquely determined independently of the choice of $\varepsilon$ up to diffeomorphism. This submanifold is called a tubular neighborhood of $L^k$ in $M_2^n$ and is denoted by $N(L^k)$. The interior of $N(L^k)$ is called an open tubular neighborhood. The total space $E(\nu_f)$ of the normal bundle $\nu_f$ of the embedding $f$ is diffeomorphic to the open tubular neighborhood of $L^k$ in $M_2^n$.

C. $C^r$-Triangulation and the Smoothing Problem

Let $(K, f)$ be a †triangulation of the $C^r$-manifold $M^n$ ($1 \leq r \leq \infty$) ($f: |K| \to M^n$) satisfying the following conditions: (i) For any closed $n$-simplex $\sigma$ of $K$, $f|\sigma: \sigma \to M^n$ is a $C^r$-mapping; (ii) for any point $p$ of $\sigma$, the †Jacobian matrix of $f|\sigma$ has rank $n$ at $p$. Then we say that $(K, f)$ is a $C^r$-triangulation of the $C^r$-manifold $M^n$, and the $C^r$-structure of $M^n$ is compatible with the triangulation $(K, f)$. Concerning $C^r$-triangulations of $C^r$-manifolds, we have the following results (S. Cairns [2], J. H. C. Whitehead [3]): (i) any $C^r$-manifold ($1 \leq r \leq \infty$) has a $C^r$-triangulation, and any $C^r$-triangulation of the boundary can be extended to the whole manifold; (ii) for a $C^r$-triangulation $(K, f)$ of a $C^r$-manifold $M^n$, the triangulated space $(K, f; M^n)$ is a †combinatorial manifold; (iii) for two $C^r$-triangulations $(K_1, f_1)$ and $(K_2, f_2)$ of a $C^r$-manifold $M^n$, the triangulated spaces $(K_1, f_1; M^n)$ and $(K_2, f_2; M^n)$ are combinatorially equivalent.

Conversely, for a combinatorial manifold $(K, f; M^n)$, a differentiable structure on $M$ that is compatible with the triangulation $(K, f)$ is called a smoothing of $M$. The question of whether a smoothing of a combinatorial manifold exists is called the smoothing problem. We have some criteria for its solvability. One is expressed in terms of transverse fields, and another in terms of †microbundles (→ 147 Fiber Bundles P). We also have the following method using the theory of †obstruction (J. Munkres [9]; M. W. Hirsch and B. Mazur [10]). Let $\Gamma_k$ be the group of oriented differentiable structures on the †combinatorial $k$-sphere (an exact definition of this group is given in Section I). Let $K^i$ be the $i$-†skeleton of a combinatorial $n$-manifold $(K, f; M^n)$ and $\alpha$ a smoothing of a neighborhood $U$ of $f(K^i)$ in $M^n$. Then there exists an obstruction cocycle $C_\alpha \in Z^{i+1}(K, \Gamma_i)$ satisfying the following conditions: (i) if $C_\alpha(\sigma^{i+1}) = 0$ for an $(i+1)$-simplex $\sigma^{i+1}$ of $K$, the smoothing $\alpha$ can be extended over a neighborhood of $f(K^i \cup \sigma^{i+1})$, and vice versa; (ii) if there exists another smoothing $\alpha'$ of a neighborhood of $f(K^i)$ that coincides with the smoothing $\alpha$ of a neighborhood of $f(K^{i-1})$, then $C_{\alpha'} - C_\alpha$ is a †coboundary; conversely, for any $(i+1)$-coboundary $b$ with coefficients in $\Gamma_i$, there exists a smoothing $\alpha'$ of a neighborhood of $f(K^i)$ such that $b = C_{\alpha'} - C_\alpha$. Using this method we can prove that for $n \leq 7$ all combinatorial manifolds of dimension $n$ are smoothable.

D. Embedding and Immersion Theorems

In the following, $M^n$ and $X^m$ are $C^\infty$-manifolds of dimensions $n$ and $m$, respectively. Two †immersions $f_0, f_1: M^n \to X^m$ are said to be regularly homotopic if there exists a †homotopy $f_t: M^n \to X^m$, $0 \leq t \leq 1$, such that $f_t$ is an immersion for each $t$ and the induced mapping (†differential of $f_t$) $df_t: T(M^n) \to T(X^m)$ on the tangent spaces naturally gives rise to a continuous mapping over $T(M^n) \times I$. Two †embeddings are isotopic if they are regularly homotopic and the homotopy $f_t$ is an embedding for each $t$. Given $M^n$ and $X^m$, a fundamental problem of embedding (immersion) theory is to classify the embeddings (immersions) of $M^n$ in $X^m$ according to their isotopy (regular homotopy) classes.

Whitney proved that a continuous mapping $f: M^n \to X^m$ can be approximated by an immersion if $m \geq 2n$ and by an embedding if $m \geq 2n + 1$ [1]. The following results are also due to Whitney [1]: Any two immersions of $M^n$ in $X^m$ that are homotopic are regularly homotopic if $m \geq 2n + 2$, and any two embeddings of $M^n$ in $X^m$ that are homotopic are isotopic if $m \geq 2n + 3$. The range $m \geq 2n + 3$ is called the stable range of embeddings.

In [11, 12] (1944) Whitney improved his classical theorems and showed that $M^n$ can always be immersed in $\mathbf{R}^{2n-1}$ for $n > 1$ and $M^n$ can always be embedded in $\mathbf{R}^{2n}$. The methods used in his proof have played an important role in the subsequent development. Classification of immersions of the $n$-sphere $S^n$ in $\mathbf{R}^m$ was determined by Smale [13]. Let $p: T(M^n) \to M^n$ and $p': T(X^m) \to X^m$ be the projections of the tangent bundles of $M^n$ and $X^m$, respectively. A mapping $\varphi: T(M^n) \to T(X^m)$ is a linear fiber mapping (linear fiber map) if, for each $x \in M^n$, $\varphi(p^{-1}(x))$ is contained in a fiber of $T(X^m)$ and $\varphi|p^{-1}(x)$ is a linear mapping of rank $n$. A linear homotopy $\varphi_t: T(M^n) \to T(X^m)$ ($0 \leq t \leq 1$) is a homotopy such that each $\varphi_t$ is a linear fiber mapping. The following theorem of Hirsch [14] is fundamental to immersion theory: Assume $n < m$. If $\varphi: T(M^n) \to T(X^m)$ is a linear fiber mapping, the mapping $\bar\varphi: M^n \to X^m$ induced by $\varphi$ can be approximated by an immersion $f: M^n \to X^m$ such that $df$ and $\varphi$ are linearly homotopic. Two immersions $f, g: M^n \to X^m$ are regularly homotopic if and only if $df$ and $dg$ are linearly homotopic. If $M^n$ is immersible in $\mathbf{R}^{m+r}$, where $m > n$, with $r$ linearly independent fields of †normal vectors, then $M^n$ is immersible in $\mathbf{R}^m$. In particular, if $M^n$ is a $\pi$-manifold (→ Section I), $M^n$ is immersible in $\mathbf{R}^{n+1}$.

Let $\mathcal{E}(M^n)$ be the set of isomorphism classes of real †vector bundles over $M^n$, and consider $\theta: \mathcal{E}(M^n) \to KO(M^n)$ (→ 237 K-Theory). An element $\xi \in KO(M^n)$ is said to be positive if $\xi$ is in the image of $\theta$. If $\xi_0 \in \widetilde{KO}(M^n)$, the geometric dimension of $\xi_0$, written $g(\xi_0)$, is the least integer $k$ such that $\xi_0 + k$ is positive. Then Hirsch's theorem [14] can be expressed as follows: $M^n$ is immersible in $\mathbf{R}^{n+k}$ ($k > 0$) if and only if $g(n - \tau(M^n)) \leq k$, where $\tau(M^n)$ is the tangent bundle of $M^n$.

A. Haefliger [15] obtained the following important result: Let $M^n$ be compact and †$(k-1)$-connected, and let $X^m$ be $k$-connected. Then any continuous mapping of $M^n$ in $X^m$ is homotopic to an embedding if $2k < n$, $m \geq 2n - k + 1$, and any two homotopic embeddings of $M^n$ in $X^m$ are isotopic if $2k < n + 1$, $m \geq 2n - k + 2$. Thus if $m > 3(n + 1)/2$, any two embeddings of $S^n$ in $\mathbf{R}^m$ are isotopic. The range $m > 3(n + 1)/2$ is called the metastable range. Haefliger further classified the embeddings of $S^{4k-1}$ in $\mathbf{R}^{6k}$ and showed the existence of embeddings of $S^{4k-1}$ in $\mathbf{R}^{6k}$ that are not isotopic to the natural one. More complete results for the classification of embeddings of †homotopy $n$-spheres in $S^m$ were obtained by J. Levine [16]. Furthermore, Levine proved the following unknotting theorem in higher-dimensional knot theory [16]: Let $f: S^{n-2} \to S^n$ be a $C^\infty$-embedding for $n > 6$; then $f(S^{n-2})$ is unknotted if and only if $S^n - f(S^{n-2})$ is homotopy equivalent to $S^1$ (→ 235 Knot Theory G; also 65 Combinatorial Manifolds D).

We list some results about embeddings and immersions. If $M^n$ is noncompact, $M^n$ can always be embedded in $\mathbf{R}^{2n-1}$; if $M^n$ is a noncompact $\pi$-manifold, $M^n$ can always be immersed in $\mathbf{R}^n$; if $M^n$ is compact and orientable and $n > 4$, $M^n$ can always be embedded in $\mathbf{R}^{2n-1}$.

E. Nonembedding and Nonimmersion Theorems

We denote the †total Stiefel-Whitney class of $M^n$ by $w(M^n)$ and the †total Pontryagin class of $M^n$ by $p(M^n)$ (→ 56 Characteristic Classes). Then $(w(M^n))^{-1}$ ($\in H^*(M^n; \mathbf{Z}_2)$) and $(p(M^n))^{-1}$ ($\in H^*(M^n; \mathbf{Z})$) can be written as $\bar w(M^n) = \sum \bar w_i(M^n)$ ($\bar w_i \in H^i(M^n; \mathbf{Z}_2)$) and $\bar p(M^n) = \sum \bar p_i(M^n)$ ($\bar p_i \in H^{4i}(M^n; \mathbf{Z})$). Then the property of characteristic classes for the †Whitney sum implies the following theorem: If $M^n$ can be immersed in $\mathbf{R}^{n+k}$, then $\bar w_i(M^n) = 0$ for $i > k$ and $\bar p_i(M^n) = 0$ for $i > [k/2]$. Furthermore, if $M^n$ can be embedded in $\mathbf{R}^{n+k}$, then $\bar w_k(M^n) = 0$. As an application, these results yield the nonembedding (nonimmersion) theorem for projective spaces (→ Appendix A, Table 6.VII). Sharper theorems were obtained subsequently. In particular, Atiyah proved the following: Let $\lambda^i$ ($i = 0, 1, \ldots$) be †exterior power operations (→ 237 K-Theory), and let $\gamma^i$ be the operations defined by the formal power series $\sum_{i=0}^{\infty}\gamma^i t^i = \sum_{i=0}^{\infty}\lambda^i t^i(1 - t)^{-i}$. Then $\gamma^i(n - \tau(M^n)) = 0$ for $i > k$ ($i \geq k$) if $M^n$ can be immersed (embedded) in $\mathbf{R}^{n+k}$. Furthermore, we have an interesting result for the differentiable case. For any positive integer $q$, there exists a differentiable
manifold $M^n$ such that $M^n$ is immersible in $\mathbf{R}^k$ but not embeddable in $\mathbf{R}^{k+q}$.

F. Handlebodies

Let $W$ be a $C^\infty$-manifold with boundary $\partial W$. Then the boundary $\partial W$ has a neighborhood $U$ that is homeomorphic to $\partial W \times \mathbf{R}_+$, where $\mathbf{R}_+ = [0, \infty)$. Let $W_1$, $W_2$ be $C^\infty$-manifolds with boundaries and $f: \partial W_1 \to \partial W_2$ be a diffeomorphism. Then the quotient space $M$ of $W_1 \cup W_2$ obtained by identifying points corresponding under $f$ has a natural differentiable structure. This construction of the $C^\infty$-manifold $M$ from the $C^\infty$-manifolds $W_1$, $W_2$ is called pasting together the boundaries.

Now we consider the †topological product $W_1 \times W_2$ of manifolds $W_1$, $W_2$ with boundaries $\partial W_1$, $\partial W_2$. $W_1 \times W_2 - \partial W_1 \times \partial W_2$ has a natural atlas of class $C^\infty$. By introducing a suitable atlas of class $C^\infty$ in a neighborhood of $\partial W_1 \times \partial W_2$ in $W_1 \times W_2$, we obtain a $C^\infty$-manifold with boundary homeomorphic to $W_1 \times W_2$. More generally, let $M$ be an $n$-manifold with boundary, let $N$ be a finite union of submanifolds of dimension $\leq n - 2$ in $\partial M$, and suppose that $M$ has a corner along $N$. Extension of the $C^\infty$-structure of $M - N$ over $M$ as in this paragraph is called straightening the angle.

Let $D^n$ be the oriented unit †$n$-disk in the $n$-dimensional Euclidean space $\mathbf{R}^n$, $M_1^n$ and $M_2^n$ be oriented compact $C^\infty$-manifolds, and $f_i: D^n \to M_i^n$ ($i = 1, 2$) be †$C^\infty$-embeddings, $f_1$ †orientation-preserving and $f_2$ †orientation-reversing. Then, pasting together the boundaries of $M_1^n - \operatorname{Int} f_1(D^n)$ and $M_2^n - \operatorname{Int} f_2(D^n)$ by the mapping $f_2 \circ f_1^{-1}$, we obtain an oriented $C^\infty$-manifold, called the connected sum of $M_1^n$ and $M_2^n$ and denoted by $M_1^n \# M_2^n$. The connected sum $M_1^n \# M_2^n$ has the orientation induced from those of $M_1^n$ and $M_2^n$, and its differentiable structure is uniquely determined independently of the mappings $f_i$. Let $S^n$ be the natural $n$-dimensional sphere. Then we have $M^n \# S^n \approx M^n$ (here $\approx$ means diffeomorphic), $(M_1 \# M_2) \# M_3 \approx M_1 \# (M_2 \# M_3)$, $M_1 \# M_2 \approx M_2 \# M_1$.

Let $M^n$ be a manifold with boundary and $f: (\partial D^s) \times D^{n-s} \to \partial M^n$ be a $C^\infty$-embedding. Then we call the quotient space $X(M^n; f; s)$ of $M^n \cup (D^s \times D^{n-s})$ obtained by the identification of corresponding points under $f$ the manifold with a handle attached by $f$. Also, we call the construction of $X(M^n; f; s)$ from $M^n$ attaching a handle and call $D^s \times D^{n-s}$ an $s$-handle. After straightening the angle, $X(M^n; f; s)$ is considered naturally a $C^\infty$-manifold with boundary. Let $f_i: \partial D_i^s \times D_i^{n-s} \to \partial M^n$ ($i = 1, \ldots, k$) be embeddings whose images are mutually disjoint. Then similarly, using embeddings $f_1, \ldots, f_k$, we can define a $C^\infty$-manifold with handles $X(M^n; f_1, \ldots, f_k; s)$. In particular, $X(X(\cdots(X(X(D^n; f_1; s_1); f_2; s_2)\cdots); f_j; s_j)$ is called a handlebody.

Let $M^{n-1}$ be an oriented $(n-1)$-manifold and $f: \partial D^s \times D^{n-s} \to (M^{n-1} - \partial M^{n-1})$ an orientation-preserving $C^\infty$-embedding. Then, by straightening the angle, the quotient space of $(M^{n-1} - \operatorname{Int} f(\partial D^s \times D^{n-s})) \cup (D^s \times \partial D^{n-s})$ obtained by the identification of points corresponding under $f|\partial D^s \times \partial D^{n-s}$ is a $C^\infty$-manifold $\chi(M^{n-1}; f)$. We say this manifold is obtained by a spherical modification (or surgery) of type $(s, n-s)$ from $M^{n-1}$. The manifold $\chi(M^{n-1}; f)$ has a natural orientation, and $\partial\chi(M^{n-1}; f) = \partial M^{n-1}$. The process of spherical modification has the following relation to that of attaching a handle: Let $W$ be an $n$-dimensional manifold and $f: \partial D^s \times D^{n-s} \to \partial W$ an embedding. Then $\partial X(W; f; s) = \chi(\partial W; f)$. When $W = M^{n-1} \times [0, 1]$ and $f: \partial D^s \times D^{n-s} \to M \times \{1\}$, $\partial X(W; f; s) = \chi(M \times \{1\}; f) \cup M \times \{0\}$, and therefore $M$ is cobordant (→ Section H) to $\chi(M; f)$. Conversely, let $M_1$ be cobordant to $M_2$. Then we can obtain $M_2$ from $M_1$ by a finite sequence of spherical modifications (A. Wallace [17], Milnor). Let $M^n$ be an $n$-dimensional $C^\infty$-manifold and $f: M^n \to \mathbf{R}^1$ be a $C^\infty$-function. If $f$ satisfies the following conditions, then it is called a nice function: (i) All †critical points of $f$ are †nondegenerate; (ii) for any critical point $p$ of $f$, the †index (→ 279 Morse Theory) of $f$ at $p$ is equal to $f(p)$. We have the following theorems.

1. Let $M$ be a compact $C^\infty$-manifold. Then there exists a nice function on $M$ (M. Morse, Smale [18], Wallace [17]).

2. Let $M$ be a compact $C^\infty$-manifold and $f: M \to \mathbf{R}^1$ a $C^\infty$-function all of whose critical points are nondegenerate. Suppose that the number of critical points on $f^{-1}[-\varepsilon, \varepsilon]$ is $k$ and that they are all contained in $f^{-1}(0)$. Suppose further that the indices of these critical points are all equal to $s$. Then $f^{-1}(-\infty, \varepsilon]$ is diffeomorphic to the manifold with handles $X(f^{-1}(-\infty, -\varepsilon]; f_1, \ldots, f_k; s)$ (Morse, Thom, Smale [18]).

3. Generalized Poincaré conjecture. Let $M^n$ be an $n$-dimensional †homotopy sphere of class $C^\infty$ ($n > 5$). Then $M^n$ can be obtained by pasting together the boundaries of two $n$-disks. Consequently, $M^n$ is homeomorphic to the $n$-dimensional sphere $S^n$ (Smale [18], H. Yamasuge [19]).

4. Let $M^n$ be a †contractible compact $n$-dimensional manifold ($n > 5$), with $\partial M$ connected and simply connected. Then $M^n$ is diffeomorphic to the $n$-disk $D^n$ (Smale [18]).

5. Let $M_1^n$, $M_2^n$ be oriented, compact, simply connected $n$-dimensional manifolds ($n > 4$). If $M_1$ is $h$-cobordant (→ Section I) to $M_2$, then $M_1$ is diffeomorphic to $M_2$ ($h$-cobordism theorem; Smale [20]).
6. If $M_1^n \times \mathbf{R}^k$ is diffeomorphic to $M_2^n \times \mathbf{R}^k$, we say that $M_1^n$ is $k$-equivalent to $M_2^n$. Let $M_1$, $M_2$ be compact $n$-dimensional manifolds of the same †homotopy type. Then $M_1$ is $k$-equivalent to $M_2$ if and only if for a †homotopy equivalence $f: M_1 \to M_2$, $\tau(M_1) \oplus \varepsilon^k$ is equivalent to $f^*\tau(M_2) \oplus \varepsilon^k$, where $\tau(M_i)$ is the †tangent bundle of $M_i$ and $\varepsilon^k$ the †trivial vector bundle of dimension $k$ (Mazur).

7. In some special cases the classification of manifolds by diffeomorphism is completely determined. (i) The classification of simply connected 5-dimensional manifolds $M$ with vanishing second Stiefel-Whitney class $w_2(M)$ is determined by $H_2(M)$. (ii) A 2-connected compact 6-dimensional manifold is diffeomorphic to either $S^6$ or the connected sum of a finite number of copies of $S^3 \times S^3$ (Smale [20]). Besides these, the classifications of $(n-1)$-connected $2n$-dimensional manifolds (Wall [8]) and $(n-1)$-connected $(2n+1)$-dimensional manifolds (Tamura [21], Wall [22]) have been obtained.

G. Thom Complexes

Let $\xi$ be a real $n$-dimensional vector bundle over a paracompact space $X$, $A_\xi$ be the total space of the associated bundle of $\xi$ with fiber the closed $n$-disk $D^n$, and $\dot A_\xi$ be the total space of the associated bundle of $\xi$ with fiber the $(n-1)$-sphere $\partial D^n$. Then the quotient space $X_\xi = A_\xi/\dot A_\xi$, obtained from $A_\xi$ by contracting $\dot A_\xi$ to a point, is called the Thom space of the vector bundle $\xi$. If $X$ is a †CW-complex, then the Thom space $X_\xi$ of $\xi$ has the homotopy type of a CW-complex and is called a Thom complex. The Thom space $X_\xi$ has the canonical base point $p_\xi$ corresponding to $\dot A_\xi$.

For a coefficient group $G$, denote by $G_\xi$ the †local system of coefficient groups with stalk $G$ associated with the †orientation sheaf of an $\mathbf{R}^n$-bundle $\xi$. Then we have the Thom-Gysin isomorphism:

$H^q(X; G_\xi) \cong H^{n+q}(X_\xi, p_\xi; G)$.

Let $G$ be a closed subgroup of the orthogonal group $O(n)$ and $B_G$ be the base space of the universal $n$-dimensional vector bundle $\gamma_G$ with structure group $G$. Then we can take a connected CW-complex as $B_G$. We denote the Thom complex of the vector bundle $\gamma_G$ by $MG$ and call it the Thom complex associated with $(G, n)$. The Thom complex associated with $(G, n)$ is $(n-1)$-†connected. If $G$ is connected, then we have $\pi_n(MG) \cong \mathbf{Z}$, $\pi_n(MO(n)) \cong \mathbf{Z}_2$, $H^n(MG; \mathbf{Z}) \cong \mathbf{Z}$, and $H^n(MO(n); \mathbf{Z}_2) \cong \mathbf{Z}_2$ ($\mathbf{Z}_2$ stands for the quotient group $\mathbf{Z}/2\mathbf{Z}$). The generator $U$ of $H^n(MG; \mathbf{Z})$ is called the fundamental (cohomology) class of the Thom complex $MG$. For a general Thom space $X_\xi$, we can also define the fundamental class of $X_\xi$ using the Thom-Gysin isomorphism.

A $C^\infty$-submanifold $W^p$ of a compact manifold $V^n$ is a †support of a †singular cycle and represents a homology class $z \in H_p(V^n; G)$ ($G = \mathbf{Z}_2$ or $\mathbf{Z}$). In this case we say that the homology class $z$ of $V^n$ is realizable by a submanifold. A homology class $z \in H_{n-k}(V^n; G)$ is realizable by a submanifold if and only if there exists a mapping $f: V^n \to MO(k)$ (or $MSO(k)$) such that $f^*(U)$ is the dual cohomology class of $z$ (Thom [23]).

The homotopy group $\pi_{n+k}(MO(n))$ (resp. $\pi_{n+k}(MSO(n))$) ($n > k$) is determined only by $k$ for any $n$ up to isomorphism. It is called the $k$-dimensional stable homotopy group of the Thom spectrum $MO = \{MO(n)\}$ ($MSO = \{MSO(n)\}$) and is denoted by $\pi_k(MO)$ ($\pi_k(MSO)$). For the †unitary group $U(n)$ and the †symplectic group $Sp(n)$, we define Thom complexes $MU(n)$ and $MSp(n)$ as the Thom complexes associated with $(U(n), 2n)$ and $(Sp(n), 4n)$ by the canonical inclusions $U(n) \subset O(2n)$ and $Sp(n) \subset O(4n)$, respectively. The $k$-dimensional stable homotopy groups $\pi_k(MU) = \lim \pi_{2n+k}(MU(n))$, $\pi_k(MSp) = \lim \pi_{4n+k}(MSp(n))$ of Thom spectra $MU = \{\ldots, MU(n), SMU(n), MU(n+1), \ldots\}$, $MSp = \{\ldots, MSp(n), SMSp(n), S^2MSp(n), S^3MSp(n), MSp(n+1), \ldots\}$ can be defined similarly to the case of $\pi_k(MO)$, where $SMU(n)$ denotes the †reduced suspension of $MU(n)$. The stable homotopy groups of Thom spectra are calculated in connection with the cohomology groups utilizing the Thom-Gysin isomorphism (→ 202 Homotopy Theory T) (Thom [23], Milnor [24]).

H. Cobordism

Cobordism theory is a theory of classification of differentiable manifolds initiated by L. S. Pontryagin and V. A. Rokhlin, who called it intrinsic homology. The theory was brought to maturity by Thom [23]. Its fundamental problem is to determine whether a given compact manifold is the boundary of another manifold. Corresponding cobordism theories for combinatorial and topological manifolds are being developed (Wall, R. Williamson).

We consider only compact $C^\infty$-manifolds that are not necessarily connected. Let $\mathfrak{D}$ be the set of all diffeomorphism classes of $C^\infty$-manifolds, and let $\mathfrak{D}_0$ be the set of all orientation-preserving diffeomorphism classes of oriented $C^\infty$-manifolds. For an oriented manifold $V$, we write $-V$ for the manifold with reversed orientation.
For two compact k-manifolds V, WE a,, we has a complex structure is called a stably (or
say that V is cobordant to W if there exists a weakly) almost complex manifold. Let V, W be
compact (k + 1)-manifold XE a, such that <iX 2n-dimensional compact stably almost com-
= VU (- IV). In this definition, considering 3 plex manifolds (n > 1). We say that V is C-
instead of a,, we say that V is cobordant to W equivalent to W if they have the same +Chern
mod 2. The equivalence class of Vk with respect numbers. The set of a11 C-equivalence classes
to the cobordism relation (the cobordism [VIC forms the complex cobordism group Il,,
relation mod 2) is called an oriented (un- as in the (real) case of cobordism groups (the
oriented) cobordism class and is denoted by existence of the inverse of an element is not
[V”] ([ Vk12). The set of a11 oriented (un- trivial). Introducing multiplication into the
oriented) cobordism classes [V”] ([ Vk],) of k- direct sum U, =C II,, by the product of mani-
manifolds forms an Abelian group fik (%,) by folds, we obtain the complex cobordism ring.
the natural addition [ Vk] + [ Wk] = [ Vk U Wk] We have U,, E nZn(A4U) and U, is a poly-
($[V^k]_2 + [W^k]_2 = [V^k \cup W^k]_2$). This is called the k-dimensional oriented (unoriented) cobordism group. Define the product $[V^k] \times [W^l] = [V^k \times W^l]$ ($[V^k]_2 \times [W^l]_2 = [V^k \times W^l]_2$). Then the direct sum $\Omega = \sum \Omega_k$ ($\mathfrak{N} = \sum \mathfrak{N}_k$) forms an anticommutative (commutative) †graded algebra, which is called the cobordism ring or Thom algebra. We have the following theorems.

1. $\Omega_k$, $\mathfrak{N}_k$ are isomorphic to the stable homotopy groups $\pi_k(MSO)$, $\pi_k(MO)$ of the Thom spectra MSO, MO, respectively (Thom's fundamental theorem) [23].

2. For a natural number k not of the form $2^i - 1$, there exists a compact manifold P(k), and $\mathfrak{N}$ is a polynomial ring over $\mathbf{Z}_2$ with generators $\{[P(k)]_2 \mid k \neq 2^i - 1\}$. $\Omega \otimes \mathbf{Q}$ ($\mathbf{Q}$ is the field of rational numbers) is a polynomial ring over $\mathbf{Q}$ with generators $\{[PC^{2m}] \mid m \geq 1\}$, where $PC^{2m}$ is the complex 2m-dimensional projective space (Thom [23]). Moreover, Milnor [24] proved that the p-component of $\Omega_k$ is zero for an odd prime p. Wall proved that the 2-component of $\Omega_k$ contains no element of order 4, using a certain exact sequence which contains the natural homomorphism $\Omega_k \to \mathfrak{N}_k$ [25].

3. Let T be the ideal of $\Omega$ consisting of all torsion elements in $\Omega$. Then $\Omega/T$ is a polynomial ring over $\mathbf{Z}$ with generators $\{[Y_{4k}] \mid k \geq 1\}$, where we can take for $Y_{4k}$ a complex 2k-dimensional nonsingular algebraic variety (Milnor).

†Characteristic numbers are invariants of cobordism classes. Combining the results stated in 2, we have the following theorem:

4. Let V, W be manifolds. V is cobordant to W mod 2 if and only if they have the same corresponding Stiefel-Whitney numbers. Let V, W be oriented manifolds. V is cobordant to W if and only if they have the same corresponding Stiefel-Whitney numbers and †Pontryagin numbers. The †index of an oriented 4k-dimensional compact manifold is a cobordism class invariant and can be expressed as a linear combination with rational coefficients of †Pontryagin numbers (→ 56 Characteristic Classes).

A manifold whose †stable tangent bundle

nomial ring over $\mathbf{Z}$ with generators $\{[Y_{2k}] \mid k \geq 1\}$, where we can take for $Y_{2k}$ a complex k-dimensional nonsingular algebraic variety (Milnor [24]).

I. h-Cobordism Groups of Homotopy Spheres

A manifold M is said to be parallelizable if its tangent bundle $\tau(M)$ is trivial, and almost parallelizable if there exists a finite number of points $x_i$ in M such that $M - \bigcup \{x_i\}$ is parallelizable. A manifold M is called stably parallelizable (or s-parallelizable) if the †Whitney sum $\tau \oplus \varepsilon^1$ of its tangent bundle $\tau(M)$ and the trivial line bundle $\varepsilon^1$ is trivial. A manifold $M^n$ is called a π-manifold if $M^n$ has a trivial normal bundle when it is embedded into a Euclidean space $\mathbf{R}^N$ (N > 2n). A manifold $M^n$ is a π-manifold if and only if $M^n$ is s-parallelizable. The concepts defined in this paragraph are related by inclusions as follows: parallelizable ⇒ s-parallelizable ⇒ almost parallelizable. For a connected manifold with boundary, these three concepts are equivalent. †Group manifolds are parallelizable. An n-dimensional manifold homeomorphic to the n-dimensional sphere is parallelizable if and only if n = 1, 3, 7 (Milnor [26]). Suppose that we are given an (n − 1)-dimensional sphere $S^{n-1}$ (n even). We can consider the problem of determining the maximal number r such that there exists a tangent r-frame field over $S^{n-1}$. J. F. Adams solved this problem using †K-theory (→ 237 K-Theory) as follows: Let $n = (2a + 1)2^b$, $b = c + 4d$, where a, b, c, d are integers and $0 \leq c \leq 3$. Put $\rho(n) = 2^c + 8d$. Then $r = \rho(n) - 1$. On the other hand, homotopy spheres are π-manifolds [27].

Let $V_1$, $V_2$ be oriented compact manifolds. Then we say that $V_1$ is h-cobordant to $V_2$ if there exists an oriented manifold W with boundary $\partial W = V_1 \cup (-V_2)$ such that $V_i$ (i = 1, 2) is a deformation retract of W. The set of all h-cobordism classes of oriented homotopy n-spheres forms an Abelian group $\theta_n$ with connected sum as addition. This is called the (h-cobordism) group of homotopy n-spheres.
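Adams's formula for the maximal number of tangent frame fields quoted in this section is purely arithmetic, so it can be checked mechanically. The following Python sketch evaluates $\rho(n) = 2^c + 8d$ and $r = \rho(n) - 1$ from the decomposition $n = (2a + 1)2^b$, $b = c + 4d$, $0 \leq c \leq 3$. The function names radon_hurwitz and max_tangent_frame_fields are illustrative choices made here and do not come from the text.

```python
def radon_hurwitz(n: int) -> int:
    """Return rho(n) = 2**c + 8*d, where n = (2a + 1) * 2**b and b = c + 4d, 0 <= c <= 3."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    b = 0
    while n % 2 == 0:      # strip factors of 2 to find the exponent b
        n //= 2
        b += 1
    d, c = divmod(b, 4)    # b = c + 4d with 0 <= c <= 3
    return 2 ** c + 8 * d


def max_tangent_frame_fields(n: int) -> int:
    """Return r = rho(n) - 1, the size of the largest tangent frame field on S^(n-1)."""
    return radon_hurwitz(n) - 1
```

For n = 2, 4, 8 the sketch returns r = 1, 3, 7, matching the parallelizable spheres $S^1$, $S^3$, $S^7$, and for n = 16 it returns r = 8.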

We denote by $\theta_n(\partial\pi)$ the subgroup of homotopy n-spheres that are boundaries of π-manifolds. Kervaire and Milnor gave certain exact sequences that contain the groups $\theta_n$, the stable homotopy groups $G_n$ of spheres, and the stable homotopy groups $\pi_n(SO)$. They clarified relations among these groups and proved that $\theta_n(\partial\pi) = 0$ for n even, $\theta_n(\partial\pi)$ is a finite group for n odd ≠ 3, $\theta_n/\theta_n(\partial\pi) \subset \operatorname{Coker} J_n$, etc. [27], where $J_n: \pi_n(SO) \to G_n$ is the †J-homomorphism. From these results it follows that the $\theta_n$ (n ≠ 3) are finite Abelian groups, $\theta_n = 0$ for n < 7, n ≠ 3, $\theta_7 \cong \mathbf{Z}_{28}$, etc. (→ Appendix A, Table 6.1).

By pasting together the boundaries of two n-disks $D_1^n$ and $D_2^n$, we obtain an oriented manifold which is considered as a smoothing of the combinatorial n-sphere. The set of all orientation-preserving diffeomorphism classes of smoothings of the combinatorial n-sphere forms an Abelian group $\Gamma_n$ with connected sum as addition. This is called the group of oriented differentiable structures on the combinatorial sphere. By the generalized Poincaré conjecture and the h-cobordism theorem, we obtain $\Gamma_n = \theta_n$ (n ≠ 3, 4). Furthermore, $\Gamma_3 = 0$ and $\Gamma_4 = 0$ (J. Cerf [28]). The group $\Gamma_n$ can also be defined as follows: Let $\operatorname{Diff}^+ D^n$, $\operatorname{Diff}^+ S^{n-1}$ be the groups of orientation-preserving diffeomorphisms of $D^n$, $S^{n-1}$, respectively, where multiplication is defined by composition. Let $r: \operatorname{Diff}^+ D^n \to \operatorname{Diff}^+ S^{n-1}$ be the homomorphism induced by the restriction $D^n \to S^{n-1}$. Then the image of r is a normal subgroup, and $\Gamma_n = \operatorname{Diff}^+ S^{n-1} / r(\operatorname{Diff}^+ D^n)$.

J. Surgery Theory

A process of modifying a manifold into another by a sequence of spherical modifications is called surgery on the manifold (→ Section F). The technique of surgery proved to be a powerful tool for the development of differential topology in the 1960s. Kervaire and Milnor exploited this technique in their study of homotopy spheres [27]. W. Browder [29] and S. P. Novikov [30] used surgery to construct differentiable manifolds with the same homotopy type as that of a given Poincaré complex in dimension greater than 4.

To explain the main points of surgery theory, we introduce some terminology. A pair of finite †CW complexes (X, Y) is called a Poincaré pair of formal dimension n if there exists a class $\mu \in H_n(X, Y; \mathbf{Z})$ called the fundamental class such that the †cap product $\mu \cap: H^q(X; \mathbf{Z}) \to H_{n-q}(X, Y; \mathbf{Z})$ is an isomorphism for each q. When $Y = \emptyset$, X is called a Poincaré complex. Let M be an n-dimensional smooth manifold. Consider an embedding g of M into a Euclidean space $\mathbf{R}^{n+k}$, where k is large enough (i.e., $k \geq n + 2$). Then the isomorphism class of the †normal bundle of g is independent of the choice of embedding, and it depends only on M. Any representative of the isomorphism class is called the normal k-vector bundle of M and is denoted by $\nu_M^k$. Let (X, Y) be a Poincaré pair of formal dimension n, and let $\xi$ be a real k-vector bundle over X. A normal mapping (normal map) (f, b) consists of a degree 1 mapping $f: (M, \partial M) \to (X, Y)$ and a †bundle mapping $b: \nu_M^k \to \xi$ which covers f. A normal mapping $(f, b): (M, \partial M) \to (X, Y)$ is normally cobordant to a normal mapping $(f', b'): (M', \partial M') \to (X, Y)$ if $\partial M = \partial M'$ and there exist a smooth (n + 1)-manifold W and a mapping $F: W \to X$ such that $\partial W = M \cup (-M') \cup \partial M \times I$, F is covered by a bundle mapping $B: \nu_W^k \to \xi$, and $(F, B)|(M, \partial M) = (f, b)$, $(F, B)|(M', \partial M') = (f', b')$.

The fundamental theorem of surgery theory is formulated as follows: Let (X, Y) be a Poincaré pair of formal dimension n. Suppose that X is simply connected and $n \geq 5$. Let $(f, b): (M, \partial M) \to (X, Y)$ be a normal mapping that restricts to a homotopy equivalence on the boundary $f|\partial M: \partial M \to Y$. Then (f, b) is normally cobordant to a normal mapping $(f', b'): (M', \partial M') \to (X, Y)$ with $f': M' \to X$ a homotopy equivalence if and only if a well-defined obstruction $\sigma(f, b)$ vanishes. $\sigma(f, b)$ is called the surgery obstruction. When n is odd, $\sigma(f, b)$ always vanishes. When $n \equiv 0 \bmod 4$, $\sigma(f, b)$ is an integer. If $Y = \emptyset$, it is given by $(I(M) - I(X))/8$, where I( ) denotes the †index of the manifold or of the Poincaré complex. If $n \equiv 2 \bmod 4$, $\sigma(f, b)$ is an integer mod 2, called the Arf-Kervaire invariant. For a thorough development of simply connected surgery → [31].

In the PL or even in the topological categories, surgery theory can be developed similarly (Browder and Hirsch, R. C. Kirby and L. C. Siebenmann [32]).

In his study of Hauptvermutung for simply connected manifolds, D. Sullivan reformulated surgery in terms of the "surgery exact sequences" involving the classifying spaces G/PL or G/O. (These spaces are "homotopy theoretic fibers" of BPL → BG or BO → BG respectively.) In the special case where X is a closed simply connected PL n-manifold, the surgery exact sequence can be formulated as follows ($n \geq 5$):

$$0 \to hT(X) \xrightarrow{\eta} [X, G/PL] \xrightarrow{\theta} \begin{cases} \mathbf{Z} & (n \equiv 0 \bmod 4), \\ \mathbf{Z}_2 & (n \equiv 2 \bmod 4), \\ 0 & (n \text{ odd}). \end{cases}$$

The set hT(X) is the totality of equivalence classes of pairs (M, f), where M is a closed PL n-manifold and f is a homotopy equivalence M → X. Two such pairs (M', f') and (M'', f'') are defined to be equivalent if there exists a PL

homeomorphism $h: M' \to M''$ such that $f'' \circ h$ is homotopic to $f'$. The set of homotopy classes [X, G/PL] of mappings X → G/PL is appropriately identified with the set of all normal cobordism classes of normal mappings with target X or with the set of all PL reductions of the Spivak normal fiber space of X as a Poincaré complex [31]. The mapping $\eta$ can be defined naturally, and the mapping $\theta$ corresponds to the assignment of the surgery obstruction. The image $\eta(M, f) \in [X, G/PL]$ is sometimes called the normal invariant of (M, f). Sullivan reduced the classification problem of manifolds within a given homotopy type to a homotopy theoretic problem of the classifying spaces G/PL, G/O [33] (also → [31, 34]).

In the latter half of the 1960s surgery theory was extended by Wall to cover all compact nonsimply connected manifolds which are not necessarily orientable. He introduced a certain Abelian group $L_n(\pi; w)$, now called the Wall group, which functorially depends on the fundamental group $\pi = \pi_1(X)$, with the orientation character $w: \pi_1(X) \to \mathbf{Z}_2$, and which is of period 4 with respect to the formal dimension n of X. In Wall's theory, the surgery obstruction $\sigma(f, b)$ takes its value in this group. The group structure of Wall groups has been calculated in many cases. Sullivan's exact sequences are extended to the non-simply connected cases [34]. Making use of the extended exact sequences, Wall and W. C. Hsiang and J. L. Shaneson classified homotopy tori. The result played an important role in the work of Kirby and Siebenmann on the †annulus conjecture and stable homeomorphisms which led to the solution of †Hauptvermutung and triangulation problems on topological manifolds in 1969 (→ 65 Combinatorial Manifolds). Surgery theory has many applications to other geometric problems. Among them are finding missing boundaries for open manifolds, equivariant surgery, homology surgery [35], and surgery on codimension two submanifolds [36, 37].

K. 4-Dimensional Manifolds

The results in differential topology discussed above are mainly concerned with manifolds of dimension ≥ 5. For 4-manifolds, however, because of their peculiar nature which is not observed in other dimensions, many fundamental problems had remained unsolved until M. H. Freedman's epoch-making paper [54] appeared in 1982. His paper, together with S. K. Donaldson's theorem [55] which was published a little later, was a breakthrough in the theory of 4-manifolds.

It was Rokhlin who first discovered a strange property of 4-manifolds [38]. Rokhlin's theorem states: Let M be a closed oriented smooth 4-manifold. If M is almost parallelizable (or equivalently, if M is a spin 4-manifold), then the index of M is divisible by 16. Milnor and Kervaire gave an alternative proof of this theorem from the differential-topological point of view [39]. Freedman and Kirby and Y. Matsumoto gave geometric or elementary proofs.

Until 1981, Rokhlin's theorem had been a constant source of many curious phenomena in 4-dimensional topology. The list contains, for example: (1) the class $(4m + 2, 4n + 2) \in H_2(S^2 \times S^2) \cong \mathbf{Z} \oplus \mathbf{Z}$ cannot be represented by a smoothly embedded 2-sphere in $S^2 \times S^2$ (Kervaire and Milnor). This result was improved by W.-C. Hsiang and R. H. Szczarba, A. G. Tristram, and Rokhlin. (2) The h-cobordism theorem fails to hold in 4 or 3 dimensions (T. Matumoto, Siebenmann [40]). (3) There exists a closed smooth 4-manifold M that is homotopy equivalent to the real projective space $\mathbf{R}P^4$, but $M \not\cong \mathbf{R}P^4$ (Cappell and Shaneson). (4) There exists an open 4-manifold $W^4$ properly homotopy equivalent to $S^3 \times \mathbf{R}$ but distinct (Freedman).

Closed, connected, simply connected 4-manifolds M and N are homotopy equivalent if and only if the intersection forms defined on the 2-dimensional (co)homology groups are equivalent as inner product spaces over $\mathbf{Z}$ [41]. Wall [42] proved that if closed simply connected smooth 4-manifolds M and N are homotopy equivalent, then they are h-cobordant. Moreover, if M and N are h-cobordant, then there exists an integer $k \geq 0$ such that $M \# k(S^2 \times S^2)$ is diffeomorphic to $N \# k(S^2 \times S^2)$.

About 1973, using a certain infinite repetition process, A. Casson constructed a family of noncompact smooth 4-manifolds which are properly homotopy equivalent to the open 2-handle $D^2 \times \mathbf{R}^2$. A 4-manifold belonging to the family is called a Casson handle. He observed that if all the Casson handles are diffeomorphic to $D^2 \times \mathbf{R}^2$, then theories analogous to surgery and the h-cobordism theorem in higher dimensions can also be developed in dimension 4.

In his 1982 paper [54], Freedman proved that each Casson handle is homeomorphic to $D^2 \times \mathbf{R}^2$. (It was proved later that Casson handles are not, in general, diffeomorphic to $D^2 \times \mathbf{R}^2$.) This result and the proper h-cobordism theorem, also due to Freedman, proved many fundamental results on the topological structure of 4-manifolds: (1) If closed simply connected smooth 4-manifolds M, N are h-cobordant, they are homeomorphic. In

particular, a homotopy 4-sphere is homeomorphic to the 4-sphere $S^4$. (Proof of the 4-dimensional Poincaré conjecture.) (2) A topological 4-manifold properly homotopy equivalent to $\mathbf{R}^4$ is homeomorphic to $\mathbf{R}^4$. (3) Given a nonsingular symmetric bilinear form $\omega$ over $\mathbf{Z}$, there exists a closed, connected, simply connected topological 4-manifold whose intersection form is equivalent to $\omega$. (Therefore Rokhlin's theorem cannot be extended to topological 4-manifolds. From this, it follows that there exist simply connected topological 4-manifolds which are nonsmoothable or even nontriangulable.) (4) The homeomorphism class of a closed connected simply connected 4-manifold is determined by the intersection form and the Kirby-Siebenmann class. (This statement is an improved one by F. Quinn [56].)

Donaldson [55] revealed a sharp contrast existing between smooth and topological 4-manifolds. Donaldson's theorem states: If the intersection form of a closed, connected, simply connected smooth 4-manifold is positive definite, then the form is equivalent to the standard one $(1) \oplus (1) \oplus \cdots \oplus (1)$. This theorem, together with Casson's theory and Freedman's result (2) stated above, implies that there exists an exotic differential structure on a 4-dimensional Euclidean space $\mathbf{R}^4$. Moreover, as an application of Donaldson's theorem, the problem of representing a 2-dimensional homology class of $S^2 \times S^2$ by a smoothly embedded 2-sphere was completely solved (K. Kuga [57]): the class $(p, q) \in H_2(S^2 \times S^2)$ is represented by a smoothly embedded $S^2$ if and only if $|p| \leq 1$ or $|q| \leq 1$.

Many problems concerning differential structures on 4-manifolds remain unsolved. It is not known whether an exotic smooth 4-sphere exists.

Among other results, the proof of the 4-dimensional annulus conjecture by Quinn [56] is remarkable.

L. Miscellaneous Results

The Browder-Livesay invariant. Let $\Sigma^n$ be a homotopy n-sphere, T an involution on $\Sigma^n$, that is, a smooth mapping $T: \Sigma^n \to \Sigma^n$ with $T \circ T = \mathrm{id}$. Assume that T is free from fixed points. The involution T is said to desuspend if there exists a homotopy (n − 1)-sphere $\Sigma^{n-1}$ smoothly embedded in $\Sigma^n$ which is invariant under the action of T. Browder and G. R. Livesay defined an obstruction $\sigma(T)$ in the group 0 (n even), $\mathbf{Z}$ ($n \equiv 3 \bmod 4$), $\mathbf{Z}_2$ ($n \equiv 1 \bmod 4$) such that $\sigma(T) = 0$ if and only if T desuspends (provided that $n \geq 6$) [43]. The invariant $\sigma(T)$ is now called the Browder-Livesay invariant. The technique used there was equivariant surgery under the action of T.

De Rham homotopy theory. A new approach to topology of manifolds was invented by Sullivan [44], who considered the exterior algebra of differential forms (with polynomial coefficients) on a simplicial complex. Through the construction of the minimal model for the algebra of differential forms, he recovers the rational homotopy type of the complex under some reasonable condition on the fundamental group. Following upon the classical de Rham theorem which recovers the (real) cohomology algebra from the algebra of differential forms, this approach is called the de Rham homotopy theory. This method has proved to be useful for the explicit calculation of the algebraic topological invariants of †Kähler manifolds, †loop spaces, cross-section spaces, †algebraic varieties, etc.

Kirby calculus. Kirby [45] initiated a link-theoretic approach to 3- and 4-manifolds. Let L be a link in $S^3$. Suppose that each component of L is given a framing, which means a trivialization of a tubular neighborhood. Such a link is called a framed link. A framing of a component is determined by the †linking number between the component and its replacement along the framing. By attaching 2-handles to the 4-disk $D^4$ along a framed link in $S^3 = \partial D^4$, we obtain a handlebody; here we denote it by $W_L$. It is known that each closed connected oriented 3-manifold is obtained as the boundary of such a handlebody (W. B. R. Lickorish, Wallace). Kirby proved that for framed links L and L', the boundaries $\partial W_L$ and $\partial W_{L'}$ belong to the same orientation-preserving diffeomorphism class if and only if L is transformed into L' by a sequence of the following two kinds of elementary operations: (i) adding or deleting a trivial knot (separated from L) with framing ±1, (ii) band summation of components corresponding to "handle sliding." Kirby's approach is called Kirby calculus on framed links. For applications → [45, 53].

References

[1] H. Whitney, Differentiable manifolds, Ann. Math., (2) 37 (1936), 645-680.
[2] S. S. Cairns, Triangulation of the manifold of class one, Bull. Amer. Math. Soc., 41 (1935), 549-552.
[3] J. H. C. Whitehead, On $C^1$-complexes, Ann. Math., (2) 41 (1940), 809-824.
[4] M. Morse, The calculus of variations in the large, Amer. Math. Soc. Colloq. Publ., 1934.
[5] J. W. Milnor, On manifolds homeomorphic to the 7-sphere, Ann. Math., (2) 64 (1956), 399-405.
[6] M. A. Kervaire, A manifold which does not admit any differentiable structure, Comment. Math. Helv., 34 (1960), 257-270.
[7] I. Tamura, 8-manifolds admitting no differentiable structure, J. Math. Soc. Japan, 13 (1961), 377-382.
[8] C. T. C. Wall, Classification of (n − 1)-connected 2n-manifolds, Ann. Math., (2) 75 (1962), 163-189.
[9] J. Munkres, Obstructions to the smoothing of piecewise differentiable homeomorphisms, Ann. Math., (2) 72 (1960), 521-554.
[10] M. W. Hirsch and B. Mazur, Smoothings of piecewise linear manifolds, Ann. Math. Studies, Princeton Univ. Press, 1974.
[11] H. Whitney, The self-intersections of a smooth n-manifold in 2n-space, Ann. Math., (2) 45 (1944), 220-246.
[12] H. Whitney, The singularities of a smooth n-manifold in (2n − 1)-space, Ann. Math., (2) 45 (1944), 247-293.
[13] S. Smale, The classification of immersions of spheres in Euclidean spaces, Ann. Math., (2) 69 (1959), 327-344.
[14] M. W. Hirsch, Immersions of manifolds, Trans. Amer. Math. Soc., 93 (1959), 242-276.
[15] A. Haefliger, Plongements différentiables de variétés dans variétés, Comment. Math. Helv., 36 (1961), 47-82.
[16] J. Levine, A classification of differentiable knots, Ann. Math., (2) 82 (1965), 15-50.
[17] A. H. Wallace, Modifications and cobounding manifolds, Canad. J. Math., 12 (1960), 503-528.
[18] S. Smale, Generalized Poincaré's conjecture in dimensions greater than four, Ann. Math., (2) 74 (1961), 391-406.
[19] H. Yamasuge, On the Poincaré conjecture for $M^5$, J. Math. Osaka City Univ., 12 (1961), 1-17.
[20] S. Smale, On the structure of manifolds, Amer. J. Math., 84 (1962), 387-399.
[21] I. Tamura, Classification des variétés différentiables, (n − 1)-connexes, sans torsion, de dimension 2n + 1, Sém. H. Cartan 16-19, 1962-1963, Inst. H. Poincaré, Univ. Paris, 1964.
[22] C. T. C. Wall, Classification problems in differential topology, I, II, Topology, 2 (1963), 253-261, 263-272.
[23] R. Thom, Quelques propriétés globales des variétés différentiables, Comment. Math. Helv., 28 (1954), 17-86.
[24] J. W. Milnor, On the cobordism ring $\Omega^*$ and a complex analogue I, Amer. J. Math., 82 (1960), 505-521.
[25] C. T. C. Wall, Determination of the cobordism ring, Ann. Math., (2) 72 (1960), 292-311.
[26] J. W. Milnor, Some consequences of a theorem of Bott, Ann. Math., (2) 68 (1958), 444-449.
[27] M. A. Kervaire and J. W. Milnor, Groups of homotopy spheres I, Ann. Math., (2) 77 (1963), 504-537.
[28] J. Cerf, Sur les difféomorphismes de la sphère de dimension trois ($\Gamma_4 = 0$), Lecture notes in math. 53, Springer, 1968.
[29] W. Browder, Homotopy type of differentiable manifolds, Colloquium on Algebraic Topology, Aarhus, 1962, 42-46.
[30] S. P. Novikov, Diffeomorphisms of simply connected manifolds (in Russian), Dokl. Akad. Nauk SSSR, 143 (1962), 1046-1049.
[31] W. Browder, Surgery on simply connected manifolds, Erg. Math., 65 (1972).
[32] R. C. Kirby and L. C. Siebenmann, Foundational essays on topological manifolds, smoothings, and triangulations, Ann. Math. Studies 88, Princeton Univ. Press, 1977.
[33] D. Sullivan, Triangulating and smoothing homotopy equivalences and homeomorphisms, Geometric Topology Seminar Notes, Princeton Univ. Press, 1967.
[34] C. T. C. Wall, Surgery on compact manifolds, Academic Press, 1970.
[35] S. E. Cappell and J. L. Shaneson, The codimension two placement problem and homology equivalent manifolds, Ann. Math., (2) 99 (1974), 277-348.
[36] Y. Matsumoto, Knot cobordism groups and surgery in codimension two, J. Fac. Sci. Univ. Tokyo, 20 (1973), 253-317.
[37] M. H. Freedman, Surgery on codimension 2 submanifolds, Mem. Amer. Math. Soc., vol. 12, no. 191, 1977.
[38] V. A. Rokhlin (Rohlin), New results in the theory of 4-dimensional manifolds (in Russian), Dokl. Akad. Nauk SSSR, 84 (1952), 221-224.
[39] J. W. Milnor and M. A. Kervaire, Bernoulli numbers, homotopy groups, and a theorem of Rohlin, Proc. Int. Congress Math., 1958, Cambridge Univ. Press, 1960, 454-458.
[40] L. C. Siebenmann, Disruption of low-dimensional handle body theory by Rohlin's theorem, Topology of Manifolds, Markham, 1970, 57-76.
[41] J. W. Milnor, On simply connected 4-manifolds, Symp. Int. de Top. Alg. (Mexico 1956), Mexico, 1958, 122-128.
[42] C. T. C. Wall, On simply connected 4-manifolds, J. London Math. Soc., 39 (1964), 141-149.
[43] S. López de Medrano, Involutions on manifolds, Erg. Math., 59, 1971.
[44] D. Sullivan, Infinitesimal computation in topology, Publ. Math. Inst. HES, 47 (1977), 269-331.
[45] R. C. Kirby, A calculus for framed links in $S^3$, Inventiones Math., 45 (1978), 35-56.

[46] J. Munkres, Elementary differential topology, Ann. Math. Studies, Princeton Univ. Press, 1961.
[47] S. Smale, A survey of some recent developments in differential topology, Bull. Amer. Math. Soc., 69 (1963), 131-145.
[48] C. T. C. Wall, Topology of smooth manifolds, J. London Math. Soc., 40 (1965), 1-20.
[49] J. W. Milnor, Lectures on the h-cobordism theorem, Princeton Univ. Press, 1965.
[50] R. Stong, Notes on cobordism theory, Princeton Univ. Press, 1968.
[51] J. W. Milnor, Topology from the differentiable viewpoint, Univ. Press of Virginia, 1969.
[52] M. W. Hirsch, Differential topology, Springer, 1976.
[53] R. Mandelbaum, Four-dimensional topology: An introduction, Bull. Amer. Math. Soc., (New Series) 2 (1980), 1-159.
[54] M. H. Freedman, The topology of four-dimensional manifolds, J. Differential Geometry, 17 (1982), 357-453.
[55] S. K. Donaldson, An application of gauge theory to four-dimensional topology, J. Differential Geometry, 18 (1983), 279-315.
[56] F. Quinn, Ends of maps III: dimension 4 and 5, J. Differential Geometry, 17 (1982), 503-521.
[57] K. Kuga, Representing homology classes of $S^2 \times S^2$, Topology, 23 (1984), 133-137.

115 (XVII.8)
Diffusion Processes

A. General Remarks

Let $(\Omega, \mathfrak{B}, P)$ be a †probability space. A †Markov process $\{X_t\}_{0 \leq t < \infty}$ on a †topological space S with †continuous time parameter t is called a diffusion process if the †sample function $X_t(\omega)$ is continuous in t with probability 1 until a random time $\zeta(\omega)$, called the †terminal time. After the terminal time, $X_t(\omega)$ stays at the terminal point $\partial$. Such a process is said to be 1-dimensional or multidimensional according as S is an interval or a manifold (possibly with boundary) with dimension ≥ 2. Brownian motion is a typical diffusion process (→ 45 Brownian Motion).

To give conditions for a Markov process to be a diffusion process, let us assume that S is a †complete metric space with metric $\rho$. Let $\{X_t\}$ be a Markov process on S with time parameter t ranging over a finite interval $[t_1, t_2]$ and satisfying

$$\sup P(s, x, t, S - U_\varepsilon(x)) = O(h) \quad (h \downarrow 0) \tag{1}$$

for every $\varepsilon > 0$, where $U_\varepsilon(x)$ is the $\varepsilon$-neighborhood of x and sup is taken over all $x \in S$, and $s, t \in [t_1, t_2]$ such that $0 < t - s < h$. Then $\{X_t\}$ is a diffusion process if and only if

$$\int_{t_1}^{t_2} P(\rho(X_t, X_{t+h}) > \varepsilon)\, dt = o(h) \quad (h \downarrow 0)$$

for every $\varepsilon > 0$ [S]. This includes the following result due to E. B. Dynkin (1952) and J. R. Kinney (1953) as a special case: If in (1) we can replace O(h) with o(h), then $\{X_t\}$ will be a diffusion process. Specifically, when $\{X_t\}$ is a 1-dimensional †strong Markov process, the latter condition is also necessary for the process to be a diffusion process.

Diffusion processes are intimately related to a certain class of †partial differential equations. Let S be the real line. Assume that the †transition probability P(s, x, t, E) (s < t) of the process $\{X_t\}_{0 \leq t < \infty}$ satisfies

$$1 - P(s, x, s + h, (x - \varepsilon, x + \varepsilon)) = o(h) \quad (h \downarrow 0) \tag{2a}$$

for every $\varepsilon > 0$ and that the following limits exist:

$$\lim_{h \to 0+} \frac{1}{h} \int_{x-\varepsilon}^{x+\varepsilon} (y - x)^2 P(s, x, s + h, dy) = 2a(s, x) > 0, \tag{2b}$$

$$\lim_{h \to 0+} \frac{1}{h} \int_{x-\varepsilon}^{x+\varepsilon} (y - x) P(s, x, s + h, dy) = b(s, x), \tag{2c}$$

$$\lim_{h \to 0+} \frac{1}{h} \bigl(P(s, x, s + h, S) - 1\bigr) = c(s, x) \leq 0. \tag{2d}$$

Assume further that the transition probability is †absolutely continuous with respect to Lebesgue measure: P(s, x, t, dy) = p(s, x, t, y) dy. Then under some suitable additional conditions, p(s, x, t, y) satisfies

$$\frac{\partial p}{\partial s} = -a(s, x)\frac{\partial^2 p}{\partial x^2} - b(s, x)\frac{\partial p}{\partial x} - c(s, x)p, \qquad p(t - 0, x, t, y) = \delta(x - y), \tag{3}$$

and

$$\frac{\partial p}{\partial t} = \frac{\partial^2}{\partial y^2}\bigl(a(t, y)p\bigr) - \frac{\partial}{\partial y}\bigl(b(t, y)p\bigr) + c(t, y)p, \qquad p(s, x, s + 0, y) = \delta(y - x), \tag{4}$$

where $\delta$ is the †Dirac delta function. The coefficient c vanishes if $\{X_t\}$ is †conservative. Equations (3) and (4) are called Kolmogorov's backward equation and forward equation, respectively. They are also called the Fokker-Planck partial differential equations.

A. N. Kolmogorov [1] derived equations (3) and (4) in 1931, and W. Feller proved that (3) (or (4)) has a unique solution under certain regularity conditions on the coefficients a, b, and c, and that the solution p(s, x, t, y) is nonnegative, has an integral with respect to y that does not exceed 1, and satisfies the Chapman-Kolmogorov equation

$$\int p(s, x, t, y)\, p(t, y, u, z)\, dy = p(s, x, u, z)$$

for every 0 < s < t < u. Hence p(s, x, t, y) determines a †Markov process analytically. However, rigorous proof establishing the sufficiency of condition (2) for the Markov process $\{X_t\}$ to be a diffusion process did not appear until the 1950s.

In the †temporally homogeneous case, p(s, x, s + h, y) does not depend on s and can be written as p(h, x, y). Then a(s, x), b(s, x), and c(s, x) are also independent of s, and p(t, x, y) satisfies

$$\frac{\partial p}{\partial t} = a(x)\frac{\partial^2 p}{\partial x^2} + b(x)\frac{\partial p}{\partial x} + c(x)p, \qquad p(0+, x, y) = \delta(x - y). \tag{5}$$

Feller made an intensive study of this case and completely solved the problem of existence and uniqueness of the solution of (5) assuming that p(t, x, y) is nonnegative and that its integral with respect to y does not exceed 1 [7]. In particular, when t varies in $[0, +\infty)$ and S is an interval $[r_1, r_2]$, Feller used the Hille-Yosida theory of †semigroups of operators to determine the conditions that should be satisfied by $r_1$ and $r_2$ in order that the differential equation (5) (with the initial condition and his additional assumptions) yield one and only one solution. Feller also introduced the notion of generalized differential operators, which expresses the differential operator in the right-hand side of (5) in the most general form [S]. The probabilistic meaning of his results was clarified by Dynkin, H. P. McKean, K. Itô, D. B. Ray, and others, and all 1-dimensional diffusion processes with the †strong Markov property have now been completely determined.

Since not much research on temporally nonhomogeneous diffusion processes has been done so far, we restrict our explanation to temporally homogeneous ones. Let $\mathfrak{M} = (X_t, W, P_x \mid x \in S)$ be a Markov process, where S is a †state space, W is the †path space consisting of all paths $w: [0, +\infty] \to S \cup \{\partial\}$ which are continuous in t for $0 \leq t < \zeta(w)$ ($w(t) = \partial$ for $t \geq \zeta(w)$ while $w(t) \in S$ for $0 \leq t < \zeta(w)$), and $P_x$ is a probability measure on W under the condition that the process starts from x at t = 0 (→ 261 Markov Processes). We can actually identify W with the †basic space $\Omega$ and set $X_t(w) = w(t)$ for $w \in \Omega$. Assume that $\mathfrak{M}$ has the †strong Markov property. It follows from †Dynkin's formula for †infinitesimal generators (→ 261 Markov Processes) that the infinitesimal generator $\mathfrak{G}$ of a diffusion process has the local property stating that if u and v belong to the domain of $\mathfrak{G}$ and coincide in a neighborhood of $x_0$, then $\mathfrak{G}u(x_0) = \mathfrak{G}v(x_0)$.

B. 1-Dimensional Diffusion Processes

Let S be a straight line. A point $x \in S$ is called a right singular point if $X_t(w) \geq x$ for all $t \in [0, \zeta(w))$ with $P_x$-probability 1. A left singular point is defined analogously, with ≥ replaced by ≤. A right and left singular point is called a trap, while a right singular point which is not left singular is called a right shunt (a left shunt is defined analogously). A point is called a regular point if it is neither right nor left singular.

The set of all regular points is open. Let $(r_1, r_2)$ be a connected component of this open set. One of the most important results concerning this situation is the proof of the existence of a strictly increasing function s(x) defined on $(r_1, r_2)$ and two measures m and k on $(r_1, r_2)$ such that the infinitesimal generator $\mathfrak{G}$ of $\mathfrak{M}$ is represented as

$$\mathfrak{G}u(x) = \frac{u^+(dx) - u(x)k(dx)}{m(dx)}, \tag{6}$$

where $u^+(dx)$ is the measure $du^+(x)$ induced by the †right derivative $u^+(x)$ of u(x) with respect to s(x) (i.e., $u^+(x) = \lim_{\Delta x \to 0+} \{u(x + \Delta x) - u(x)\}/\{s(x + \Delta x) - s(x)\}$). Equation (6) gives a generalization of second-order †differential operator $au'' + bu' + cu$ (a > 0, c ≤ 0) [12]. Here m is positive for nonempty open sets, both m and k are finite for compact sets in $(r_1, r_2)$, and s, m, and k are unique in the following sense: If there are two sets of values of $s_i$, $m_i$, and $k_i$ (i = 1, 2), then $s_2(x) = cs_1(x) + \text{constant}$, $m_2(dx) = c^{-1}m_1(dx)$, and $k_2(dx) = c^{-1}k_1(dx)$ for some positive constant c. We call s, m, and k, respectively, the canonical scale, canonical measure (or speed measure), and killing measure for $\mathfrak{M}$. They determine the behavior of $X_t(w)$ belonging to $\mathfrak{M}$ inside the interval $(r_1, r_2)$. Conversely, given any such set of s, m, and k, we can find a 1-dimensional diffusion process $\mathfrak{M}$ such that s, m, and k are, respectively, the canonical scale, canonical measure, and killing measure of $\mathfrak{M}$. If $X_t(w)$ is nonvanishing in $(r_1, r_2)$ with probability 1, the killing measure k is identically zero, and the canonical scale s satisfies the equation

$$P_x(\sigma_{x_2} < \sigma_{x_1}) = \frac{s(x) - s(x_1)}{s(x_2) - s(x_1)}$$

for $x_1 < x < x_2$, where $\sigma_y$ is the †hitting time of the point y.
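The hitting-probability formula for the canonical scale lends itself to a small numerical illustration. The Python sketch below evaluates $(s(x) - s(x_1))/(s(x_2) - s(x_1))$, the probability of reaching $x_2$ before $x_1$, for a user-supplied scale s. As an assumption made only for this illustration, it also builds a scale for the operator $au'' + bu'$ with constant a > 0 and b by taking $s'(x) = \exp(-bx/a)$, which satisfies $as'' + bs' = 0$. The function names are hypothetical and do not come from the text.

```python
import math
from typing import Callable


def prob_hit_upper_first(s: Callable[[float], float],
                         x1: float, x: float, x2: float) -> float:
    """Return (s(x) - s(x1)) / (s(x2) - s(x1)) for x1 < x < x2 (no killing assumed)."""
    if not (x1 < x < x2):
        raise ValueError("require x1 < x < x2")
    return (s(x) - s(x1)) / (s(x2) - s(x1))


def scale_constant_coefficients(a: float, b: float) -> Callable[[float], float]:
    """Illustrative scale for a*u'' + b*u' with constant a > 0: s'(x) = exp(-b*x/a)."""
    if a <= 0:
        raise ValueError("a must be positive")
    if b == 0:
        return lambda x: x                      # no drift: the scale is x itself
    return lambda x: (a / b) * (1.0 - math.exp(-b * x / a))


# Without drift the probability is linear in the starting point.
p_no_drift = prob_hit_upper_first(scale_constant_coefficients(0.5, 0.0), 0.0, 0.25, 1.0)

# With drift b > 0 the upper endpoint becomes more likely from the midpoint.
p_drift = prob_hit_upper_first(scale_constant_coefficients(0.5, 1.0), 0.0, 0.5, 1.0)
```

Because the formula is a ratio of differences of s, replacing s by cs + constant with c > 0 leaves the result unchanged, in line with the uniqueness statement for the canonical scale.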

from the Brownian motion by means of a topological transformation of the state space (interval) based on $s$, a †time change based on $m$, and a †killing based on $k$. More precisely, we first transform the interval $(r_1, r_2)$ by $x \to s(x)$ into the interval $(s(r_1+0), s(r_2-0))$ so that the diffusion process on this new interval has a canonical scale coincident with $x$. The speed and killing measures are transformed accordingly. We can, therefore, assume that the canonical scale is $x$. Let us consider the case $(r_1, r_2) = (-\infty, +\infty)$ for simplicity. Let $\mathfrak{t}(t, x)$ be the †local time of Brownian motion at $x$ (→ 45 Brownian Motion). Next, we apply the †time change to the Brownian motion by means of the †additive functional

\[ \varphi(t) = \int_{-\infty}^{+\infty} \mathfrak{t}(t, x)\, m(dx), \]

and finally †kill the latter process by means of the †multiplicative functional

\[ \alpha(t) = \exp\left( -\int_{-\infty}^{+\infty} \mathfrak{t}(\varphi^{-1}(t), x)\, k(dx) \right). \]

Thus we obtain the process $\mathfrak{M}$ in $(-\infty, \infty)$. In particular, if $m(dx) = a(x)^{-1}\,dx$ and $k(dx) = |c(x)|\,dx$, we have … and $\mathfrak{G}u = au'' + cu$.

At a shunt the infinitesimal generator $\mathfrak{G}$ has a form that is a generalization of the first-order differential operator $bu' + cu$ ($c \le 0$), with $b > 0$ or $b < 0$ according as it is a right shunt or a left shunt. At a trap we have $\mathfrak{G}u(x) = -u(x)/E_x(\zeta)$.

When $S$ is an interval with endpoints $r_1$ and $r_2$ and all interior points are regular, the left endpoint $r_1$ is classified into the following 4 types, according to the behavior of $m$ near $r_1$: Take an arbitrary fixed point $r \in (r_1, r_2)$, and set $n(dx) = m(dx) + k(dx)$ and

\[ \alpha = \int_{(r_1, r]} (s(r) - s(x))\, n(dx), \qquad \beta = \int_{(r_1, r)} n((x, r))\, s(dx). \]

Then $r_1$ is a regular boundary if $\alpha < \infty$, $\beta < \infty$; an entrance boundary if $\alpha < \infty$, $\beta = \infty$; an exit boundary if $\alpha = \infty$, $\beta < \infty$; and a natural boundary if $\alpha = \infty$, $\beta = \infty$. This classification is independent of the choice of $r$. A similar classification of $r_2$ can be established. $X_t(w)$ approaches $r_1$ in finite time with positive or null probability according as $\beta$ is finite or infinite. If $\alpha = \infty$, it never happens that $X_t(w)$ starts from $r_1$ and reaches the interior of the interval $S$ even if $r_1 \in S$, whereas if $\alpha < \infty$ we can construct (adjoining $r_1$ to $S$ if necessary) a diffusion process that enters the interior from $r_1$ and whose motion in the interior coincides with that of $X_t(w)$.

If $r_1$ is a regular boundary for $\mathfrak{M}$ and $r_1 \in S$, then there are various possibilities for the behavior of $X_t(w)$ at $r_1$. They are expressed by the boundary conditions satisfied by the functions $u$ belonging to the domain of the infinitesimal generator $\mathfrak{G}$. The condition is in general of the form

\[ \gamma u(r_1) + \delta\, \mathfrak{G}u(r_1) + \mu u^{+}(r_1) = 0, \]

where $\gamma$, $\delta$, and $\mu$ are constants, $\gamma, \delta \le 0$, $\mu \ge 0$, and $|\delta| + \mu > 0$. If $\gamma = \delta = 0$, then $r_1$ is said to be a reflecting barrier. If $r_1$ is regular for $\mathfrak{M}$ and does not belong to $S$, then $X_t(w)$ vanishes exactly as $X_t(w)$ reaches $r_1$, and $r_1$ is called an absorbing barrier. This case corresponds to the boundary condition $u(r_1) = 0$. Whatever the boundary condition may be, $\mathfrak{M}$ is constructed from the Brownian motion with reflecting barrier by topological transformation of the state space, time change, and killing. Here if $\gamma \ne 0$, then killing may occur at $r_1$; if $\delta \ne 0$, the set of visiting times of $r_1$ has positive Lebesgue measure; and if $\mu \ne 0$, the trace of the motion may go beyond the point $r_1$ and reach the interior points of $S$ [2, 7-9].

If we weaken the assumption of continuity of paths and admit jumps from $r_1$, the general boundary condition becomes

\[ \gamma u(r_1) + \delta\, \mathfrak{G}u(r_1) + \mu u^{+}(r_1) + \int_{(r_1, r_2]} (u(x) - u(r_1))\, \nu(dx) = 0, \]

where $\nu$ is a measure with respect to which $\min(1, s(x) - s(r_1+0))$ is integrable.

When $S = (r_1, r_2)$, the transition probability is absolutely continuous with respect to the canonical measure, the density $p(t, x, y)$ has an †eigenfunction expansion

\[ p(t, x, y) = \int_{-\infty}^{0} e^{\lambda t}\, e(d\lambda; x, y), \]

and $p(t, x, y)$ is positive, jointly continuous in 3 variables, and symmetric in $x$ and $y$. A similar result is also known when $S$ is half-open or closed [9].

If $\mathfrak{M}$ is †recurrent, i.e., $P_x(\sigma_y < +\infty) = 1$ for every $x$ and $y$ in $S$, then there exists a unique (up to a multiplicative constant) †invariant measure for $S$ that is finite for all closed intervals in the interior of $S$. If $\mathfrak{M}$ is conservative and all the interior points of $S$ are regular, then the canonical measure is an invariant measure, provided that the endpoints are either entrance, natural, or regular reflecting.
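
The boundary classification of this section can be explored numerically. The following Python sketch is only an illustration under stated assumptions: there is no killing ($k = 0$, so $n = m$), the generator is taken in the form $\mathfrak{G} = (d/dm)(d/ds)$, the left endpoint is $r_1 = 0$, and the integrals $\alpha$ and $\beta$ are truncated at a small $\varepsilon > 0$. The function names (`truncated_alpha_beta`, `brownian`, `bessel3`) are hypothetical. Watching how the truncated values behave as $\varepsilon \downarrow 0$ suggests, but does not prove, whether $\alpha$ and $\beta$ are finite.

```python
import math


def truncated_alpha_beta(s, s_prime, m_density, r, eps, n=20000):
    """Approximate the truncated integrals
         alpha_eps = int_eps^r (s(r) - s(x)) m'(x) dx
         beta_eps  = int_eps^r m((x, r)) s'(x) dx
    with the midpoint rule on a logarithmic grid (fine near the endpoint 0)."""
    log_eps = math.log(eps)
    h = (math.log(r) - log_eps) / n
    s_r = s(r)
    alpha = 0.0
    beta = 0.0
    mass = 0.0  # running approximation of m((x, r)), accumulated from r downward
    for i in range(n - 1, -1, -1):
        x = math.exp(log_eps + (i + 0.5) * h)  # midpoint of cell i in log scale
        w = x * h                              # dx = x d(log x)
        cell = m_density(x) * w                # speed-measure mass of this cell
        alpha += (s_r - s(x)) * cell
        beta += (mass + 0.5 * cell) * s_prime(x) * w
        mass += cell
    return alpha, beta


def brownian(eps):
    # Brownian motion on (0, inf): s(x) = x, m(dx) = 2 dx (generator u''/2).
    return truncated_alpha_beta(lambda x: x, lambda x: 1.0, lambda x: 2.0, 1.0, eps)


def bessel3(eps):
    # 3-dimensional Bessel process: s(x) = -1/x, m(dx) = 2 x^2 dx
    # (generator u''/2 + u'/x).
    return truncated_alpha_beta(
        lambda x: -1.0 / x, lambda x: x ** -2, lambda x: 2.0 * x * x, 1.0, eps
    )


if __name__ == "__main__":
    for eps in (1e-2, 1e-3, 1e-4):
        print(eps, brownian(eps), bessel3(eps))
```

For Brownian motion both truncated integrals equal $(1-\varepsilon)^2$ in closed form and stay bounded, consistent with a regular boundary at 0. For the 3-dimensional Bessel process $\alpha_\varepsilon$ tends to $1/3$ while $\beta_\varepsilon$ grows like $2/(3\varepsilon)$, consistent with an entrance boundary.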

C. Multidimensional Diffusion Processes

Let the state space $S$ be a domain or the closure of a domain in the $n$-dimensional Euclidean space $R^n$. Consider a temporally homogeneous diffusion process $\{X_t\}_{0 \le t < \infty}$ on $S$. Under suitable regularity conditions the infinitesimal generator $\mathfrak{G}$ coincides, for sufficiently smooth functions $u$ in its domain, with the following †elliptic partial differential operator $A$:

\[ A = \sum_{i,j=1}^{n} a^{ij}(x) \frac{\partial^2}{\partial x^i\, \partial x^j} + \sum_{i=1}^{n} b^i(x) \frac{\partial}{\partial x^i} + c(x), \qquad c \le 0; \tag{7} \]

where $S$ has a boundary, $u$ satisfies a boundary condition of the form

\[ \sum_{i,j=1}^{n-1} \alpha^{ij}(x) \frac{\partial^2 u(x)}{\partial x^i\, \partial x^j} + \sum_{i=1}^{n-1} \beta^i(x) \frac{\partial u(x)}{\partial x^i} + \gamma(x) u(x) + \delta(x) Au(x) + \mu(x) \frac{\partial u(x)}{\partial n} = 0, \tag{8} \]

where, for simplicity, we assume that $S$ is the closed half-space defined by $x^n \ge 0$, $(\alpha^{ij})$ is a symmetric nonnegative definite matrix, $\gamma \le 0$, $\delta \le 0$, $\mu \ge 0$, and $\partial/\partial n$ is the inward-directed †conormal derivative associated with $a^{ij}$. This boundary condition was discovered by A. D. Venttsel' [13]. Conversely, given an operator $A$ such as (7) and a boundary condition (8) (if $S$ has a boundary), the existence and uniqueness of the corresponding diffusion process are known for several special cases. If $S = R^n$ and $A$ has continuous coefficients, there exists at least one diffusion process corresponding to $A$ [14-16].

A probabilistic approach to constructing diffusion processes corresponding to the operator $A$ with $c = 0$ is given by Itô's method of †stochastic differential equations (→ 406 Stochastic Differential Equations E). A somewhat different approach was introduced by D. W. Stroock and S. R. S. Varadhan [20] under the name of martingale problems (→ 261 Markov Processes C). Let $W^n = C([0, \infty) \to R^n)$ be the space of all continuous functions $w: [0, \infty) \to R^n$ endowed with compact uniform topology and $\mathfrak{B}(W^n)$ be the topological $\sigma$-field. Given $x \in R^n$, a solution to the martingale problem for the operator $A$ with $c = 0$ starting from $x$ is a probability measure $P_x$ on $(W^n, \mathfrak{B}(W^n))$ satisfying $P_x(w(0) = x) = 1$ such that $f(w(t)) - \int_0^t Af(w(s))\,ds$ is a $P_x$-martingale for all $f \in C_0^\infty(R^n)$, where $C_0^\infty(R^n)$ denotes the set of all $C^\infty$-functions on $R^n$ having compact support. If $a = (a^{ij})$ is uniformly positive definite, bounded, and continuous and if $b = (b^i)$ is bounded and Borel measurable, the martingale problem for the operator $A$ with $c = 0$ is well posed, i.e., for each $x \in R^n$, there is exactly one solution starting from $x$. (For further information related to the theory of martingale problems → [15, 20].)

Suppose next that $S$ is the closure of a bounded domain with a sufficiently smooth boundary and $A$ is given with sufficiently smooth coefficients. If the boundary condition is $\gamma u + \mu\, \partial u/\partial n = 0$, $\mu \ne 0$, then there exists a unique diffusion process on $S$ corresponding to this situation. Moreover, if $\gamma = 0$, then the process is said to have a reflecting barrier. S. Watanabe [22] gave a probabilistic condition which characterizes the reflecting diffusion processes in the normal direction among all reflecting diffusion processes in oblique directions. Suppose that a general boundary condition (8) is given. We write the left-hand side of (8) as $Lu$. Write $u = Hf$ for the solution of $Au = 0$ with boundary value $f$. Under some natural additional conditions, T. Ueno proved that if $LH$ †generates a Markov process on the boundary, then there exists a diffusion process for $A$ with boundary condition (8) [21]. If $\{X_t\}$ has a reflecting barrier, the Markov process on the boundary is, conversely, obtained from $\{X_t\}$ through time change by a nonnegative continuous †additive functional which increases only when the value of $X_t$ is on the boundary. Stochastic differential equations are also used in constructing diffusion processes with boundary condition (8) [19] (→ 406 Stochastic Differential Equations). An elegant method for constructing diffusion processes with boundary condition (8) has been introduced by Watanabe [22]. It consists of piecing together excursions from the boundary to the boundary. In this construction, the stochastic integrals for †Poisson point processes play an important role. There are various other results on multidimensional diffusion processes with general boundary conditions which cannot be covered by the method mentioned above (→ [23] and M. Motoo, Proc. Intern. Symp. Stochastic Differential Equations, Kyoto, 1976).

When we have a diffusion process on $S$ with infinitesimal generator of the form $A$, we can obtain a probabilistic expression for the solutions of various partial differential equations involving $A$. Let $\sigma$ be the hitting time of the boundary of $S$. Then $Hf(x)$ can be expressed as $E_x(f(X_{\sigma-}))$. The solution of $Au = -f$ with boundary value 0 is given by $u(x) = E_x\bigl(\int_0^\sigma f(X_t)\,dt\bigr)$, while the solution $u(t, x)$ of $\partial u/\partial t = Au$ with boundary value 0 and initial value $u(0, x) = f$ is $E_x(f(X_t);\ t < \sigma)$. The first case gives the solution of a †Dirichlet problem; and in this case, the condition for a boundary point to be †regular relative to the Dirichlet problem can also be expressed probabilistically (→ 45 Brownian Motion, 261 Markov

Processes). Furthermore, if $f(X_t)$ is replaced by $f(X_t)\exp\bigl(\int_0^t k(X_s)\,ds\bigr)$ in the expressions for $u(x)$ and $u(t, x)$, $A$ is replaced by $A + k$ (M. Kac [24]). When $k \le 0$, this replacement gives rise to a killing of the process (→ 261 Markov Processes E).

By using the theory of †Dirichlet forms we can investigate a general class of †symmetric multidimensional diffusion processes which are not in the framework of the classical diffusion processes, i.e., diffusion processes whose infinitesimal generators are not necessarily differential operators (M. Fukushima, Dirichlet Forms and Markov Processes, 1980).

D. Diffusion Processes on Manifolds

Let $M$ be a †connected †oriented †$\sigma$-compact $C^\infty$ †manifold of dimension $n$. Let $\hat{M}$ be $M$ or $M \cup \{\partial\}$ (= the one-point †compactification of $M$) according as $M$ is †compact or noncompact. Suppose that we are given a system of $C^\infty$-†vector fields $A_0, A_1, \ldots, A_r$ on $M$. We consider the following stochastic differential equation on $M$:

\[ dX_t = \sum_{k=1}^{r} A_k(X_t) \circ dw^k(t) + A_0(X_t)\,dt \tag{9} \]

(→ 406 Stochastic Differential Equations), where $w(t) = (w^1(t), w^2(t), \ldots, w^r(t))$ denotes an $r$-dimensional Brownian motion and the first term of the right-hand side is understood in the sense of the Stratonovich stochastic differential. Let $X(t, x, w)$ be the solution of (9) with $X_0 = x \in M$ defined on the $r$-dimensional Wiener space $(W_0^r, P^W)$ (→ 406 Stochastic Differential Equations) and $P_x$ be the probability law on $\hat{W}(M)$ of $(X(t, x, w))_{t \ge 0}$, where $\hat{W}(M)$ is the space of all continuous mappings $w: [0, \infty) \to \hat{M}$ with $\partial$ as a trap. Then $\{P_x \mid x \in M\}$ defines a diffusion process $\mathfrak{M}$ on $M$ which is generated by the second-order differential operator $\sum_{k=1}^{r} A_k^2/2 + A_0$, i.e., a diffusion process $\mathfrak{M}$ with the infinitesimal generator $\mathfrak{G}$ such that

\[ \mathfrak{G}f = \Bigl( \frac{1}{2} \sum_{k=1}^{r} A_k^2 + A_0 \Bigr) f, \qquad f \in C_0^\infty(M), \]

where $C_0^\infty(M)$ denotes the space of all $C^\infty$-functions on $M$ with compact support. By appealing to the analytical theory of partial differential equations we can discuss regularity properties of the transition probability of $\mathfrak{M}$ ([15]; L. Hörmander, Acta Math., 119 (1967)). Recently, P. Malliavin [25] also suggested a probabilistic method for proving elliptic regularity results (see also [19]).

We now assume that all linear sums of $A_0, A_1, \ldots, A_r$ are †complete. Then the terminal time of $\mathfrak{M}$ is infinite a.s. Let $W(M)$ be the space of all continuous mappings $w: [0, \infty) \to M$ endowed with compact uniform topology, and set $W_x(M) = \{w \mid w \in W(M),\ w(0) = x\}$. We describe the †topological support $\mathscr{S}(P_x)$ of the probability $P_x$ on $W_x(M)$, i.e., the smallest closed subset of $W_x(M)$ that carries the measure $P_x$. Let $\mathscr{U}$ be the set of all piecewise constant mappings $u: [0, \infty) \to R^r$. For a given $u = (u^1(t), u^2(t), \ldots, u^r(t))$ of $\mathscr{U}$, we consider a system of ordinary differential equations

\[ \frac{d\varphi_t}{dt} = A_0(\varphi_t) + \sum_{k=1}^{r} A_k(\varphi_t)\, u^k(t), \tag{10} \]

i.e., for every $C^\infty$-function $f$ on $M$ with compact support,

\[ \frac{d}{dt} f(\varphi_t) = (A_0 f)(\varphi_t) + \sum_{k=1}^{r} (A_k f)(\varphi_t)\, u^k(t). \]

Then for every $u \in \mathscr{U}$ and $x \in M$, we obtain a curve $\varphi = \varphi(x, u) = (\varphi_t(x, u))$ on $M$ by solving (10) with $\varphi_0 = x$. Set $\mathscr{S}^x = \{\varphi(x, u) \mid u \in \mathscr{U}\}$. Then we have

\[ \mathscr{S}(P_x) = \overline{\mathscr{S}^x} \quad \text{for every } x \in M. \]

Let $\mathfrak{L}$ be the †Lie algebra generated by $A_1, A_2, \ldots, A_r$, and set $\mathfrak{L}(x) = \{V_x \mid V \in \mathfrak{L}\}$. If $\dim \mathfrak{L}(x) = n$ for every $x \in M$, then $\mathscr{S}(P_x) = W_x(M)$ for every $x \in M$. For further information → Stroock and Varadhan (Proc. 6th Berkeley Symp. Math. Statist. Prob. III, 1972) and H. Kunita (Proc. Int. Symp. Stochastic Differential Equations, Kyoto, 1976).

Let $A$ be a smooth †nondegenerate second-order †elliptic differential operator on $M$ which is expressed in local coordinates as

\[ A = \frac{1}{2} \sum_{i,j=1}^{n} a^{ij}(x) \frac{\partial^2}{\partial x^i\, \partial x^j} + \sum_{i=1}^{n} b^i(x) \frac{\partial}{\partial x^i}, \]

where $(a^{ij}(x))$ is symmetric and †strictly positive definite. Then there exist a †Riemannian metric $g$ and a $C^\infty$-†vector field $b$ on $M$ such that

\[ A = \frac{1}{2} \Delta_M + b, \tag{11} \]

where $\Delta_M$ is the †Laplace-Beltrami operator on the †Riemannian manifold $(M, g)$. We now construct the diffusion process generated by the operator $A$, introducing a stochastic differential equation on the †bundle $O(M)$ of the †orthonormal frames. There exists an †affine connection $\nabla$ †compatible with the Riemannian metric $g$ such that for every $C^\infty$-function $f$ on $M$,

\[ \frac{1}{2} \sum_{k=1}^{n} L_k^2 (f \circ \pi) = (Af) \circ \pi, \]

where $(L_1, L_2, \ldots, L_n)$ is the system of †canonical horizontal vector fields (†basic vector fields) on $O(M)$ corresponding to the affine connection $\nabla$ and $\pi: O(M) \to M$ is the natural projection [19]. We now consider the following stochastic differential equation on $O(M)$:

\[ dr_t = \sum_{k=1}^{n} L_k(r_t) \circ dw^k(t). \tag{12} \]

Let $r(t, r, w)$ be the solution of (12) with $r_0 = r \in O(M)$ defined on the $n$-dimensional Wiener space $(W_0^n, P^W)$. Now a stochastic curve $X(t, r, w)$ on $M$ is defined by $X(t, r, w) = \pi(r(t, r, w))$. Set $X(t, r, w) = \partial$ for $t \ge \zeta$, where $\zeta$ is the †explosion time of $r(t, r, w)$. Then the probability law of $(X(t, r, w))_{t \ge 0}$ on $\hat{W}(M)$ depends only on $x = \pi(r)$, and it defines a diffusion process $\mathfrak{M}$ on $M$ which is generated by the operator $A$ of (11). (For details → [19, 26].) When $A = \Delta_M/2$, i.e., $b = 0$ in (11), the diffusion process $\mathfrak{M}$ is called the †Brownian motion on the Riemannian manifold $M$ (→ 45 Brownian Motion).

Next consider the case when $(L_1, L_2, \ldots, L_n)$ in (12) is a system of canonical horizontal vector fields corresponding to the †Riemannian connection, and let $b$ be a $C^\infty$-vector field on $M$. Let $\tilde{b}$ be the scalarization of $b$, i.e., $\tilde{b} = (\tilde{b}^1, \tilde{b}^2, \ldots, \tilde{b}^n)$, $\tilde{b}^i(r) = \sum_{j=1}^{n} b^j(x) f^i_j$, and $(f^i_j) = (e^i_j)^{-1}$ for $r = (x, e_1, e_2, \ldots, e_n) \in O(M)$, where $b = \sum_{i=1}^{n} b^i\, \partial/\partial x^i$ and $e_j = \sum_{i=1}^{n} e^i_j\, \partial/\partial x^i$ in local coordinates. Suppose that $\tilde{b}^i(r)$ is bounded on $O(M)$ for $i = 1, 2, \ldots, n$. Then

\[ M(t) = \exp\Bigl\{ \sum_{i=1}^{n} \int_0^t \tilde{b}^i(r(s, r, w))\, dw^i(s) - \frac{1}{2} \sum_{i=1}^{n} \int_0^t \tilde{b}^i(r(s, r, w))^2\, ds \Bigr\} \]

is an †exponential martingale. The diffusion process generated by the operator $\Delta_M/2 + b$ is obtained from the Brownian motion on $M$ constructed above by the †transformation of drift, i.e., the transformation by the †multiplicative functional $M(t)$ (→ 261 Markov Processes). (G. Maruyama (Nat. Sci. Rep. Ochanomizu Univ., 15 (1954)) studied this for 1-dimensional processes, I. V. Girsanov [27] and M. Motoo (Ann. Inst. Statist. Math., 12 (1960-1961)) for multidimensional cases.)

Consider a diffusion process $\mathfrak{M} = \{P_x \mid x \in M\}$ on $M$ generated by the operator $A$ of (11). Let $\omega_b$ be the †differential 1-form defined from the vector field $b$ by

\[ (\omega_b)_x(V_x) = g_x(b_x, V_x) \]

for every $x \in M$ and every $C^\infty$-vector field $V$ on $M$. Then $\mathfrak{M}$ is symmetrizable if and only if $\omega_b$ is †exact, i.e., if there exists a function $F$ on $M$ such that $\omega_b = dF$. The †invariant measures are then of the form const. $\times \exp[2F(x)]\, m(dx)$, where $m(dx)$ is the †Riemannian volume (E. Nelson, Duke Math. J., 25 (1958); Kolmogorov, Math. Ann., 113 (1937)). The diffusion process $\mathfrak{M}$ is said to be locally symmetrizable if for every †simply connected domain $D \subset M$ there exists a Borel measure $\nu_D(dx)$ on $D$ such that

\[ \int_D T_t^D f(x)\, g(x)\, \nu_D(dx) = \int_D f(x)\, T_t^D g(x)\, \nu_D(dx) \]

for all bounded continuous functions $f$ and $g$, where

\[ T_t^D f(x) = E_x[f(X_t);\ t < \tau_D], \qquad \tau_D = \inf\{t : X_t \notin D\}. \]

Then $\mathfrak{M}$ is locally symmetrizable if and only if $\omega_b$ is †closed, i.e., $d\omega_b = 0$. R. Z. Khas'minskiĭ [28] proved a pair of useful tests for explosions of diffusion processes on $M = R^n$, similar to Feller's test for $n = 1$ mentioned in Section B (→ [18]). In case of a general manifold we can investigate the possibility of explosions for $\mathfrak{M}$ by appealing to the comparison theorems for †curvatures in the theory of differential geometry ([19, 29]; S. T. Yau, J. Math. Pures Appl., 57 (1978)). The diffusion process $\mathfrak{M}$ on $M$ is said to be †recurrent if $P_x[X_t \in U \text{ for some } t > 0] = 1$ for any open subset $U$ of $M$; otherwise it is called †transient. It is well known that an $n$-dimensional Brownian motion on $R^n$ is recurrent if $n \le 2$ and transient if $n \ge 3$ [9]. There also are some results for the criterion which determines whether $\mathfrak{M}$ is recurrent or transient ([19, 28]; A. Friedman, Stochastic Differential Equations and Applications I, II, 1975, and K. Ichihara, Publ. Res. Inst. Math. Sci., 14 (1978)). Explicit formulas for the transition probability of the Brownian motion on a hyperbolic space are given in [29]. For information related to the asymptotic behavior of diffusion processes as $t \downarrow 0$ we refer the reader to Varadhan (Comm. Pure Appl. Math., 20 (1967)) and S. A. Molchanov (Russian Math. Surveys, 30 (1975)).

Diffusion approximations for suitably normalized random sequences have been studied extensively beginning with the work of A. Ya. Khinchin [30]. The theory of †limiting distributions of sums of †independent (or weakly dependent) random variables is among the best-known examples (→ 250 Limit Theorems in Probability Theory). For information related to the theory of diffusion approximations, see Yu. V. Prokhorov (Theory of Prob. Appl., 1 (1956)), A. V. Skorokhod [16], and Stroock, Varadhan, and G. Papanicolaou (Proc. 1976 Duke Turbulence Conference). Recently, many interesting examples of multidimensional diffusion processes have also been

introduced to describe probabilistic models in physics, biology, etc. (e.g., R. L. Stratonovich, Topics in the Theory of Random Noise I, II, 1963; J. F. Crow and M. Kimura, An Introduction to Population Genetics Theory, 1970; and K. Sato, Proc. Int. Symp. Stochastic Differential Equations, Kyoto, 1976).

In general, if a diffusion process $\{X_t\}$ is given on a noncompact space $S$ and $X_t$ has no limit points in $S$ as $t \uparrow \zeta$, then some natural compactification should be induced by $\{X_t\}$. The notion of a †Martin boundary for Markov processes is introduced in this connection.

References

[1] A. N. Kolmogorov, Über die analytischen Methoden in der Wahrscheinlichkeitsrechnung, Math. Ann., 104 (1931), 415-458.
[2] E. B. Dynkin, Markov processes I, II, Springer, 1965. (Original in Russian, 1963.)
[3] J. L. Doob, Stochastic processes, Wiley, 1953.
[4] K. Itô, Lectures on stochastic processes, Tata Inst., 1960.
[5] L. V. Seregin, Continuity conditions for stochastic processes, Theory of Prob. Appl., 6 (1961), 1-26. (Original in Russian, 1961.)
[6] W. Feller, Zur Theorie der stochastischen Prozesse (Existenz- und Eindeutigkeitssätze), Math. Ann., 113 (1936), 113-160.
[7] W. Feller, The parabolic differential equations and the associated semi-groups of transformations, Ann. Math., (2) 55 (1952), 468-519.
[8] W. Feller, On second order differential operators, Ann. Math., (2) 61 (1955), 90-105.
[9] K. Itô and H. P. McKean, Jr., Diffusion processes and their sample paths, Springer, 1965.
[10] P. Lévy, Processus stochastiques et mouvement brownien, Gauthier-Villars, 1948.
[11] E. B. Dynkin, One-dimensional continuous strong Markov processes, Theory of Prob. Appl., 4 (1959), 1-52. (Original in Russian, 1959.)
[12] I. S. Kats (Kac) and M. G. Krein, On the spectral functions of the string, Amer. Math. Soc. Transl., (2) 103 (1974). (Original in Russian, supplement of the Russian translation of the book by F. V. Atkinson, Discrete and continuous boundary problems.)
[13] A. D. Venttsel' (Ventcel'), On boundary conditions for multidimensional diffusion processes, Theory of Prob. Appl., 4 (1959), 164-177. (Original in Russian, 1959.)
[14] N. V. Krylov, On the selection of a Markov process from a system of processes and the construction of quasidiffusion processes, Math. USSR-Izv., 7 (1973), 691-709. (Original in Russian, 1973.)
[15] D. W. Stroock and S. R. S. Varadhan, Multidimensional diffusion processes, Springer, 1979.
[16] A. V. Skorokhod, Studies in the theory of random processes, Addison-Wesley, 1965. (Original in Russian, 1961.)
[17] K. Itô, On stochastic differential equations, Mem. Amer. Math. Soc., 1951.
[18] H. P. McKean, Jr., Stochastic integrals, Academic Press, 1969.
[19] N. Ikeda and S. Watanabe, Stochastic differential equations and diffusion processes, North-Holland and Kodansha, 1981.
[20] D. W. Stroock and S. R. S. Varadhan, Diffusion processes with continuous coefficients I, II, Comm. Pure Appl. Math., 22 (1969), 345-400, 479-530.
[21] K. Sato and T. Ueno, Multidimensional diffusions and the Markov process on the boundary, J. Math. Kyoto Univ., 4 (1965), 529-605.
[22] S. Watanabe, Poisson point process of Brownian excursions and its applications to diffusion processes, Amer. Math. Soc. Proc. Symp. Pure Math., 31 (1977), 153-164.
[23] M. Motoo, Application of additive functionals to the boundary problem of Markov processes, Proc. 5th Berkeley Symp. Math. Stat. Prob. II, pt. 2, Univ. of California Press, 1967, 75-110.
[24] M. Kac, On some connections between probability theory and differential and integral equations, Proc. 2nd Berkeley Symp. Math. Stat. Prob., Univ. of California Press, 1951, 189-215.
[25] P. Malliavin, $C^k$-hypoellipticity with degeneracy, Stochastic Analysis, A. Friedman and M. Pinsky (eds.), Academic Press, 1978, 199-214, 327-340.
[26] P. Malliavin, Formule de la moyenne, calcul de perturbations et théorèmes d'annulation pour les formes harmoniques, J. Functional Anal., 17 (1974), 274-291.
[27] I. V. Girsanov, On transforming a certain class of stochastic processes by absolutely continuous substitution of measures, Theory of Prob. Appl., 5 (1960), 285-301. (Original in Russian, 1960.)
[28] R. Z. Khas'minskiĭ, Ergodic properties of recurrent diffusion processes and stabilization of the solution of the Cauchy problem for parabolic equations, Theory of Prob. Appl., 5 (1960), 179-196. (Original in Russian, 1960.)
[29] A. Debiard, B. Gaveau and E. Mazet, Théorèmes de comparaison en géométrie riemannienne, Publ. Res. Inst. Math. Sci., 12 (1976), 391-425.
[30] A. Ya. Khinchin, Asymptotische Gesetze der Wahrscheinlichkeitsrechnung, Springer, 1933.

116 (XX.21)
Dimensional Analysis

The system of units for physical quantities is derived from a certain set of fundamental units. If the fundamental units are denoted by $\theta$, $\varphi$, $\psi$, etc., any other unit $\alpha$ (called a derived unit) can always be expressed in the form $\alpha = c\,\theta^l \varphi^m \psi^n \cdots$ ($c, l, m, n, \ldots$ are constants), by definition or by physical laws. The exponents $l, m, n, \ldots$ are called the dimensions of $\alpha$, and the content of the previous statement is expressed as $[\alpha] = [\theta^l \varphi^m \psi^n \cdots]$, which is called the dimensional formula. The usual practice is to take as fundamental units length, time, mass, temperature, and energy, which are denoted by $L$, $T$, $M$, $\theta$, and $H$, respectively. Dimensional analysis investigates the relation between physical quantities by use of the $\pi$ theorem and the law of similitude given below.

A. The $\pi$ Theorem

If a relationship $f(\alpha, \beta, \ldots) = 0$ holds among $n$ physical quantities $\alpha, \beta, \ldots$ independently of the choice of fundamental units, the equation $f(\alpha, \beta, \ldots) = 0$ can always be transformed into $F(\pi_1, \pi_2, \ldots) = 0$, where the $\pi_i$ are $n - m$ dimensionless quantities ($m$ is the number of fundamental units) of the form $\pi_i = \alpha^{x_i} \beta^{y_i} \cdots$. If we choose the $x_i$ so that $\pi_1 = \alpha \beta^{y_1} \gamma^{z_1} \cdots$ and $\pi_2$, $\pi_3$, etc. do not contain $\alpha$, then $f = 0$ implies $\alpha = \beta^{-y_1} \gamma^{-z_1} \cdots \Phi(\pi_2, \pi_3, \ldots)$, which clearly shows the manner in which the quantity $\alpha$ is related to other quantities $\beta, \gamma, \ldots$.

B. The Law of Similitude

In general, if two physical systems of the same kind have the same values of the $\pi_i$, then the physical states of the systems are similar. If we are given a family of mutually similar systems, it is sufficient to observe a particular one among them (a "model") in order to estimate physical values attached to any one of the given systems.

Consider, for example, the case of the drag $D$ acting on geometrically similar bodies placed in the flow of a viscous incompressible fluid. If $v$ is the velocity, $l$ the representative length of the body, $\rho$ the density of the fluid, and $\mu$ the viscosity (which has the dimensional formula $ML^{-1}T^{-1}$), then the $\pi$ theorem gives $D/\rho v^2 l^2 = f(\rho v l/\mu)$. Hence the drag coefficient as given by the left-hand side can be obtained by the experiments performed on a geometrically similar model. The dimensionless quantity $R = vl/\nu$ ($\nu = \mu/\rho$) is called the Reynolds number. If the wave resistance due to gravity as well as the effect of compressibility are taken into account, gravitational acceleration $g$ and sound velocity $a$ must be included, so that we have $D/\rho v^2 l^2 = f(vl/\nu, v^2/lg, v/a, C_1, C_2, \ldots)$, where $C_1, C_2, \ldots$ are other dimensionless quantities depending on the physical properties of the fluid. $Fr = v^2/lg$ is called the Froude number, and $M = v/a$ the Mach number.

Next, consider the case of heat transfer between a solid surface and a flowing fluid. Let the area of the solid surface be denoted by $S$, the heat transferred per unit time by $Q$ ($HT^{-1}$), the thermal conductivity of the fluid by $k$ ($HL^{-1}T^{-1}\theta^{-1}$), the specific heat by $c$ ($HM^{-1}\theta^{-1}$), the two representative temperatures by $T_0$ and $T_1$, and the representative length by $l$, where expressions in parentheses represent dimensional formulas. Then we have, as dimensionless quantities, the Nusselt number $Nu = Q/(kS(T_1 - T_0)/l)$, the Prandtl number $Pr = \nu/\kappa$ ($\kappa = k/\rho c$), the Grashof number $Gr = l^3 g (T_1 - T_0)/\nu^2 T_0$, and $R$, so that from the $\pi$ theorem we have the relation $Nu = f(R, Pr, Gr, C_1, C_2, \ldots)$. Furthermore, $Pe = vl/\kappa = Pr\,R$ is called the Péclet number.

References

[1] A. W. Porter, The method of dimensions, Methuen, third edition, 1946.
[2] D. C. Ipsen, Units, dimensions, and dimensionless numbers, McGraw-Hill, 1960.
[3] H. L. Langhaar, Dimensional analysis and theory of models, Wiley, 1951.
[4] H. E. Huntley, Dimensional analysis, Macdonald, 1952.
[5] C. M. Focken, Dimensional methods and their applications, Arnold, 1953.
[6] R. Kurth, Dimensional analysis and group theory in astrophysics, Pergamon, 1972.
[7] L. I. Sedov, Similarity and dimensional methods in mechanics, Academic Press, 1959. (Original in Russian, fourth edition, 1957.)
Also → 414 Systems of Units.

117 (II.21)
Dimension Theory

A. Introduction

Toward the end of the 19th century, G. Cantor discovered that there exists a one-to-one correspondence between the set of points on a line segment and the set of points on a square; and also, G. Peano discovered the existence of a †continuous mapping from the segment onto the square. Soon, the progress of the theory of

point-set topology led to the consideration of sets which are more complicated than familiar sets, such as polygons and polyhedra. Thus it became necessary to give a precise definition to dimension, a concept which had previously been used only vaguely. In 1913, L. E. J. Brouwer [9] gave a definition of dimension based on an idea of H. Poincaré. In 1922, the foundations of dimension theory for separable metric spaces were established by K. Menger [11] and P. Uryson [10]. Subsequently, P. S. Aleksandrov and W. Hurewicz contributed much to the development of the theory. The foundations of dimension theory for general metric spaces were established independently by M. Katětov and K. Morita. More general theory for †normal spaces has also been investigated; the same results as in metric spaces, however, do not always hold.

B. Definition of Dimension

Let $X$ be a normal space. If any finite open †covering of $X$ has an open covering of †order $\le n + 1$ as its refinement (→ 425 Topological Spaces R) (i.e., if for any open sets $G_i$ ($i = 1, \ldots, s$) such that $X = G_1 \cup \cdots \cup G_s$, there exist open sets $H_i$ ($i = 1, \ldots, s$) such that $H_i \subset G_i$, $X = H_1 \cup \cdots \cup H_s$, and any $n + 2$ of the $H_i$ have no point in common), then we write $\dim X \le n$. If $\dim X \le n$ but $\dim X \le n - 1$ does not hold, then we define $X$ to be $n$-dimensional and write $\dim X = n$. We call $\dim X$ the covering dimension (or Lebesgue dimension) of $X$. The idea behind this definition is due to H. Lebesgue.

There are other definitions of dimension that are given inductively. Let us define $\operatorname{Ind} X = -1$ if $X$ is empty. If for any pair consisting of a closed set $F$ and an open set $G$ with $F \subset G$ in $X$ there exists an open set $V$ such that $F \subset V \subset G$ and $\operatorname{Ind}(\bar{V} - V) \le n - 1$, then we define $\operatorname{Ind} X \le n$. Next, we define $\operatorname{ind} \emptyset = -1$. For any point $p$ of $X$ and any neighborhood $G$ of $p$, suppose that there exists an open neighborhood $V$ of $p$ such that $V \subset G$ and $\operatorname{ind}(\bar{V} - V) \le n - 1$. Then we define $\operatorname{ind} X \le n$. As before, we set $\operatorname{Ind} X = n$ ($\operatorname{ind} X = n$) if $\operatorname{Ind} X \le n$ ($\operatorname{ind} X \le n$) but $\operatorname{Ind} X \le n - 1$ ($\operatorname{ind} X \le n - 1$) does not hold. (The definition of $\operatorname{ind} X$ is due to Menger.) We call $\operatorname{Ind} X$ ($\operatorname{ind} X$) the large inductive dimension of $X$ (the small inductive dimension of $X$).

If $\dim X \le n$ does not hold for any $n$, then $X$ is called infinite-dimensional, written $\dim X = \infty$; we define $\operatorname{Ind} X = \infty$ and $\operatorname{ind} X = \infty$ similarly. These dimensions are invariant under †homeomorphisms.

The set of irrational points in a Euclidean space, the Cantor discontinuum, and †Baire zero-dimensional spaces are all 0-dimensional. The set of rational points in a separable †Hilbert space is 1-dimensional.

C. Dimension of Metric Spaces

The following theorems hold for the dimension of metric spaces (M. Katětov, Czechoslovak Math. J., 2 (1952); K. Morita, Math. Ann., 128 (1954)). Let $X$ and $Y$ be metric spaces. The equality $\dim X = \operatorname{Ind} X$ holds. If $Y \subset X$, then $\dim Y \le \dim X$. If $X$ is a union of a countable number of closed sets $F_i$ ($i = 1, 2, \ldots$), then $\dim X = \max(\dim F_i)$ (sum theorem for dimension). The inequality $\dim(X \cup Y) \le \dim X + \dim Y + 1$ holds. If $\dim X = n$, then $X$ is a union of $n + 1$ 0-dimensional subsets (decomposition theorem for dimension). We have $\dim(X \times Y) \le \dim X + \dim Y$, where $X \ne \emptyset$ (product theorem for dimension).

Each of the following is a necessary and sufficient condition for $\dim X \le n$: (i) There exists a subspace $A$ of a Baire zero-dimensional space $B(\tau)$ and a continuous closed mapping $f$ of $A$ onto $X$ such that $f^{-1}(x)$ consists of at most $n + 1$ points for each point $x$ of $X$ (K. Morita, Sci. Rep. Tokyo Kyoiku Daigaku, 5 (1955)); (ii) there exists a metric of $X$ which gives the same topology on $X$ such that for any positive number $\varepsilon$, any point $x$ of $X$, and any $n + 2$ points $x_i$ ($i = 1, \ldots, n + 2$) at a distance less than $\varepsilon$ from the $(\varepsilon/2)$-neighborhood of $x$, there are at least two points $x_i$ and $x_j$ ($i \ne j$) with distance $< \varepsilon$ (J. Nagata, Fund. Math., 45 (1958)).

Hurewicz's problem asked whether the equality $\dim X = n + m$ ($m > 0$) implies the existence of an $m$-dimensional space $A$ and a mapping $f$ of $A$ onto $X$ having property (i). It was solved affirmatively for separable metric spaces by J. H. Roberts and for general metric spaces by K. Nagami (Japan. J. Math., 30 (1960)).

If $X$ is the union of a countable number of closed †strongly paracompact subspaces, in particular if $X$ is separable, then $\operatorname{Ind} X = \operatorname{ind} X$ [1, 2]. However, it was shown by P. Roy (Bull. Amer. Math. Soc., 68 (1962)) that this equality does not hold in general.

D. Euclidean Spaces and Dimension

The $n$-dimensional †Euclidean space $R^n$ is exactly $n$-dimensional in the sense mentioned above; thus this concept of dimension agrees with our intuition. The proof of $\dim R^n \ge n$ comes from Lebesgue's theorem: If each member of a finite closed covering of an $n$-cube has

sufficiently small diameter, then the order of the covering is not less than $n + 1$. (The proof of $\dim R^n \le n$ is easy.) Let $X$ be a subset of $R^n$ and $f$ a homeomorphism from $X$ onto a subset $f(X)$ of $R^n$. If $x$ is an interior point of $X$, then $f(x)$ is an interior point of $f(X)$. Also, if an open set $A$ of $R^n$ is homeomorphic to a subset $B$ of $R^n$, then $B$ is open in $R^n$ (Brouwer's theorem on the invariance of domain [S]). This theorem holds for any manifold but not for general separable metric spaces. By the theorem of invariance of domain it can be shown that $R^m$ and $R^n$, $m \ne n$, are not homeomorphic (theorem on invariance of dimension of Euclidean spaces). Any $n$-dimensional separable metric space is embedded in a Euclidean space $R^{2n+1}$, or, more precisely, in the subset of $R^{2n+1}$ consisting of all points $x$ of which at most $n$ coordinates are rational (Menger-Nöbeling embedding theorem, G. Nöbeling, Math. Ann., 104 (1930)). Thus, from the topological point of view, any finite-dimensional separable metric space can be identified with a subset of a Euclidean space. Moreover, it is known that any $n$-dimensional separable metric space is homeomorphic to a subset of some $n$-dimensional compact metric space.

If $F$ is a bounded closed subset of $R^m$, then $\dim F \le n$ if and only if for any positive number $\varepsilon$, there exists a continuous mapping $f$ from $F$ into an $n$-dimensional polyhedron in $R^m$ such that the distance between $x$ and $f(x)$ is less than $\varepsilon$ for each point $x$ of $F$.

E. Dimension of Normal Spaces

Let $X$ be a normal space. Then $\operatorname{Ind} X \ge \dim X$ and $\operatorname{Ind} X \ge \operatorname{ind} X$, but the equalities do not necessarily hold here. The following theorems were obtained by E. Čech, Aleksandrov, C. H. Dowker, E. Hemmingsen, and Morita [1]. If $\dim X \le n$, then any locally finite open covering of $X$ has an open covering of order $\le n + 1$ as its refinement; if $A$ is an †$F_\sigma$ subset of $X$ or $A$ is strongly paracompact, then $\dim A \le \dim X$; if $X$ has a †$\sigma$-locally finite closed covering $\{F_\alpha\}$, then $\dim X = \max(\dim F_\alpha)$.

does not hold in general even if $X \times Y$ is locally compact and normal, and $\dim X = \dim Y = 0$; T. Przymusiński (Proc. Amer. Math. Soc., 76 (1979)) noted that CH can be avoided by a modification of Wage's construction. Katětov (Časopis Pěst. Mat. Fys., 75 (1950)) proved that $\dim X$ is determined by the ring $C^*(X)$ of bounded real-valued continuous functions on $X$.

F. Homological Dimension

Aleksandrov contributed much to the development of dimension theory in introducing the concept of homological dimension (Math. Ann., 106 (1932)). The homological dimension of a compact Hausdorff space $X$ with respect to an Abelian group $G$ is the largest integer $n$ such that the $n$-dimensional †Čech homology group $\check{H}_n(X, A; G)$ is nonzero for some closed subset $A$ of $X$. The cohomological dimension $D(X; G)$ is defined similarly by using the †Čech cohomology group $\check{H}^n(X, A; G)$. If $\dim X < \infty$, then $\dim X = D(X; Z)$ ($Z$ is the additive group of integers). The cohomological dimension of $X$ with respect to an arbitrary Abelian group is determined by the cohomological dimension with respect to some specified groups, and the cohomological dimension of the product space $X \times Y$ is expressed in terms of those of $X$ and $Y$ (M. F. Bokshteĭn, [S]). A compact Hausdorff space $X$ has the property that $\dim(X \times Y) = \dim X + \dim Y$ for any compact Hausdorff space $Y$ if and only if $\dim X = D(X; Q(p))$, where $Q(p)$ is the additive group of rationals mod 1 of the form $m/p^n$ for any prime number $p$ (V. Boltyanskiĭ, [S]); this result holds also when $X$ is paracompact (Y. Kodama, J. Math. Soc. Japan, 18 (1966)).

G. Dimension and Measure

Let $X$ be a separable metric space. Then $\dim X \le n$ if and only if $X$ is homeomorphic to a subset of a Euclidean space $R^{2n+1}$ whose $(n + 1)$-dimensional †Hausdorff measure is 0 (E. Szpilrajn, Fund. Math., 28 (1937); also →
In order that dim X < n, it is necessary and [ 1,2]). The infïmum of c(2 0 such that the
sufftcient that any continuous mapping from a Hausdorff measure A,(X) of dimension t(
closed subset of X into an n-sphere S” cari be vanishes is called the Hausdorff dimension of
extended continuously to X. If X and Y are X.
tparacompact and X is tlocally compact,
or if X x Y is strongly paracompact, then
dim(X x Y) Q dim X + dim Y, where X # 0; if H. Dimension Type (Fréchet’s Definition)
X is a tCW complex, then the equality holds
[14]. M. L. Wage (hoc. Nat. Acad. Sci. US, In analogy to the theory of tcardinal numbers
75 (1978)) proved under the continuum hypo- in set theory, M. Fréchet (1909) defmed the
thesis (CH) that dim(X x Y) < dim X + dim Y dimension type of topological spaces as fol-
1171 450
Dimension Theory

lows: Two spaces X and Y are said to have the 118 (V.9)
same dimension type if X is homeomorphic to
a subset of Y and Y is homeomorphic to a
Diophantine Equations
subset of X.
A. General Remarks
1. Infinite-Dimensional Spaces
A Diophantine equation is an talgebraic equa-
tion whose coefficients lie in the ring Z of
If X is a metric space with 0 < dim X < CC~,
rational integers and whose solutions are
then for each positive integer m with m<
sought in that ring. The name cornes from
dim X, X contains a (closed) subset S with
Diophantus, an Alexandrian mathematician of
dim S = m. Tumarkin asked the following
the third Century A.D., who proposed many
question: For an infinite-dimensional com-
Diophantine problems; but such equations
pact metric space X and for each positive
have a very long history, extending back to
integer m, does X contain a closed subset
ancient Egypt, Babylonia, and Greece. As
S with dim S = m? D. Henderson (Amer. 1.
early as the sixth Century B.c., Pythagoras is
Math., 89 (1967)) answered this question in the
said to have partially solved the equation x2 +
negative. Furthermore, J. Walsh (Bull. Amer.
y2=z2 byx=2n+1,y=2n2+2n,z=y+1.
Math. Soc., 84 (1978)) constructed an infïnite-
A general solution is given by the Pythagorean
dimensional compact metric space X such
numbers x = m2 - n2, y = 2mn, z = m2 + n2.
that if S is an arbitrary subset of X with
+Fermat’s problem also concerns a Diophan-
dimS>O then dimS= rx).
tine equation.
Systematic studies of Diophantine equations
References over Z have been made for the linear equation
Cy=, a,xi = a (ai, a~ Z) and for the quadratic
[l] K. Morita, Dimension theory (in Japa- equation ux2 + hxy + cy2 = k (u, h, c, k E Z) in
nese), Iwanami, 1950. two unknowns. The latter forms a principal
[2] W. Hurewicz and H. Wallman, Dimension topic of C. F. Gauss’s Disquisitiones urithme-
theory, Princeton Univ. Press, 1941. ticue and cari be regarded as a starting point of
[3] J. Nagata, Modern dimension theory, modern algebraic number theory. The special
Noordhoff, second edition, 1965. quadratic equation t2 - Du2 = f4 (DE Z) is
[4] K. Nagami, Dimension theory, Academic called Pell’s equation. If D ~0, then Pell’s
Press, 1970. equation has only a finite number of solutions.
[S] P. S. Aleksandrov, The present status of If D > 0, then a11 solutions t,,, u, of Pell’s equa-
the theory of dimension, Amer. Math. Soc. tion are given by k((tl +u,fi)/2)n=(t,,+
Transl., (2) 1 (1955), l-26. (Original in Russian, u,&)/2, provided that the pair t,. u, is a
1951.) solution with the smallest t, + u, fi> 1 [15].
[6] L. E. J. Brouwer, Beweis der Invarianz der Using continued fractions (- 83 Continued
Dimensionenzahl, Math. Ann., 70 (1911), 161- Fractions), we cari determine t 1, u, explicitly.
165. A general quadratic Diophantine equation
[7] H. Lebesgue, Sur la non-applicabilité de ux2 + bxy + cy2 = k with two unknowns cari
deux domaines appartenant respectivement à be solved completely if we use solutions of
des espaces à n et n + p dimensions, Math. Pell’s equation; this is an application of the
Ann., 70(1911), 166-168. arithmetic of quadratic fields (- 347 Qua-
[S] L. E. J. Brouwer, Beweis der Invarianz des dratic Fields) [ 11. On quadratic Diophantine
n-dimensionalen Gebiets, Math. Ann., 71 equations of several unknowns, thére are deep
(1912), 305-313. studies by C. L. Siegel (- 348 Quadratic
[9] L. E. J. Brouwer, Über den natürlichen Forms).
Dimensionsbegriff, J. Reine Angew. Math., 142 Diophantine problems consist of giving
(1913), 146-152. criteria for the existence of solutions of alge-
[ 101 P. Uryson, Les multiplicités cantoriennes, brait equations in rings and fïelds and even-
C. R. Acad. Sci. Paris, 175 (1922), 440-442. tually determining the number of such solu-
[l l] K. Menger, Dimensionstheorie, Teubner, tions. The fundamental ring of interest is
1928. Z and the fundamental field of interest is Q.
[ 121 A. Pears, Dimension theory for general One discovers rapidly, however, that to have
spaces, Cambridge Univ. Press, 1975. a11 the technical maneuverability necessary for
[ 131 R. Engelking, Dimension theory, North- handling general problems, one must consider
Holland, 1978. rings and fïelds of iïnite type over Z and Q.
[14] K. Morita, Dimension of general topo- Furthermore, one is led to consider fïnite fields
logical spaces, Surveys in General Topology, and local fields when one deals with a locali-
G. M. Reed (ed.), Academic Press, 1980. zation of the problems under consideration.
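
As a numerical illustration of the two explicit parametrizations quoted in this section, the following Python sketch generates Pythagorean triples from the stated formulas and produces solutions of Pell's equation $t^2 - Du^2 = \pm 4$ by taking powers of a fundamental solution. The function names are illustrative only, and the fundamental solution $(t_1, u_1) = (1, 1)$ for $D = 5$ is supplied by hand as an assumption of the example; it is not computed by continued fractions here.

```python
def pythagoras_family(n):
    """Pythagoras' partial solution: x = 2n + 1, y = 2n^2 + 2n, z = y + 1."""
    x = 2 * n + 1
    y = 2 * n * n + 2 * n
    return x, y, y + 1


def pythagorean_triple(m, n):
    """General parametrization: x = m^2 - n^2, y = 2mn, z = m^2 + n^2."""
    return m * m - n * n, 2 * m * n, m * m + n * n


def pell_solutions(D, t1, u1, count):
    """Return (t_k, u_k) for k = 1..count, where
    (t_k + u_k*sqrt(D))/2 = ((t1 + u1*sqrt(D))/2)**k.

    (t1, u1) is assumed to satisfy t1^2 - D*u1^2 = +4 or -4.
    """
    if t1 * t1 - D * u1 * u1 not in (4, -4):
        raise ValueError("(t1, u1) does not solve t^2 - D*u^2 = +-4")
    solutions = []
    t, u = t1, u1
    for _ in range(count):
        solutions.append((t, u))
        # Multiply (t + u*sqrt(D))/2 by (t1 + u1*sqrt(D))/2.
        # Both numerators are even because t and D*u have the same parity
        # whenever t^2 - D*u^2 = +-4, so the integer division is exact.
        t, u = (t * t1 + D * u * u1) // 2, (t * u1 + u * t1) // 2
    return solutions


if __name__ == "__main__":
    print(pythagoras_family(1), pythagorean_triple(2, 1))
    for t, u in pell_solutions(5, 1, 1, 6):
        print(t, u, t * t - 5 * u * u)
```

For $D = 5$ the pairs produced are $(1, 1), (3, 1), (4, 2), (7, 3), \ldots$, and the right-hand side alternates between $-4$ and $+4$.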
Techniques from various fields of mathematics, e.g., algebraic number theory, algebraic
geometry, analysis, Diophantine approximation, etc., have been successfully applied to solve
Diophantine problems. However, much remains unsolved today. Yu. V. Matiyasevich (1970) showed
that Hilbert's tenth problem is unsolvable; there is no general method of telling whether a
Diophantine equation has a solution. This theorem in a sense indicates the complexity of
Diophantine problems. For many centuries, no other topic has engaged the attentions of so many
mathematicians, both professional and amateur, or resulted in so many published papers. For
these miscellaneous results → Dickson [1] and Mordell [2].

B. Equations over Finite Fields

Let $k$ be a finite field of characteristic $p$ consisting of $q$ ($= p^r$) elements. Chevalley's
theorem: Let $f$ be a †form of degree $d$ in $n$ variables with coefficients in $k$ such that
$d < n$. Then $f = 0$ has a nontrivial solution in $k$. A generalization is Warning's theorem: Let
$f_1, \ldots, f_m$ be polynomials with coefficients in $k$ in $n$ variables of degrees
$d_1, \ldots, d_m$, respectively, and suppose that $d = d_1 + \cdots + d_m < n$. Then the number
$N$ of common zeros of $f_1, \ldots, f_m$ satisfies $N \equiv 0 \pmod{p}$. Warning's second
theorem asserts that if $N > 0$ then $N \ge q^{n-d}$. Warning's theorem was also improved by J. Ax
(1964) to the effect that $N \equiv 0 \pmod{q^b}$ for any integer $b < n/d$ [3]. For equations
over finite fields, counting the number of solutions is important. Let $f(x, y)$ be an
†absolutely irreducible polynomial in $x$ and $y$ over $k$. Let $N$ denote the number of zeros in
$k$ of $f(x, y)$. A. Weil proved $|N - q| \le 2g\sqrt{q} + c(d)$, where $g$ is the genus of the
curve $f(x, y) = 0$ and $c(d)$ is a constant depending on $d$. Weil's proof requires the use of
deep results from algebraic geometry. This theorem is equivalent to the †Riemann hypothesis for
algebraic curves over finite fields [4]. Later, using Stepanov's method, W. M. Schmidt and E.
Bombieri (1973) independently gave new proofs which do not depend on algebraic geometry [3]. P.
Deligne [5] proved a far-reaching generalization of Weil's theorem to †nonsingular absolutely
irreducible equations in $n$ variables. He showed $|N - q^{n-1}| = O(q^{(n-1)/2})$. This is a part
of the †Weil conjecture for zeta functions of algebraic varieties over finite fields (→ Section
E; 450 Zeta Functions Q). Schmidt obtained in an elementary manner a weaker estimate
$|N - q^{n-1}| = O(q^{n-3/2})$ but without the assumption of nonsingularity (→ Section F).

C. Equations over Local Fields

A method of solving problems in number theory by use of embeddings of the ground field into its
†completions is called a local method. Such methods have important consequences when applied to
Diophantine equations. Let $f$ be a polynomial in $n$ variables with rational integer
coefficients. The congruence $f \equiv 0 \pmod{p^k}$ is solvable for all $k \ge 1$ if and only if
$f = 0$ is solvable in †$p$-adic integers. This is an easy consequence of the compactness of the
ring of $p$-adic integers. We can solve $f = 0$ in $p$-adic integers provided that we can solve an
infinite sequence of congruences. It is generally difficult to tell when we may limit our
consideration to only a finite number of these. In this respect, the following lemma is most
useful. Hensel's lemma: Let $f(x_1, \ldots, x_n)$ be a polynomial whose coefficients are $p$-adic
integers. Let $\gamma_1, \ldots, \gamma_n$ be $p$-adic integers such that for some $i$
($1 \le i \le n$) and an integer $\delta \ge 0$, we have
$f(\gamma_1, \ldots, \gamma_n) \equiv 0 \pmod{p^{2\delta+1}}$ and
$\partial f/\partial x_i(\gamma_1, \ldots, \gamma_n) \equiv 0 \pmod{p^{\delta}}$,
$\not\equiv 0 \pmod{p^{\delta+1}}$. Then there exist $p$-adic integers
$\theta_1, \ldots, \theta_n$ such that $f(\theta_1, \ldots, \theta_n) = 0$ and
$\theta_i \equiv \gamma_i \pmod{p^{\delta+1}}$ for $i = 1, \ldots, n$. The case $\delta = 0$ is
often useful; it implies that a nonsingular solution mod $p$ can be extended to a $p$-adic
solution. Generalization to simultaneous equations is also known [6]. Skolem's method is
sometimes useful when we investigate certain types of equations over †local fields. This method
is based on some simple properties of local analytic manifolds over local fields [7]. If a
quadratic form has zeros in each local field, then it has a rational zero (†Minkowski-Hasse
theorem). When a theorem of this type holds, we say that the †Hasse principle holds (→ 348
Quadratic Forms). For forms of higher degree, the Hasse principle no longer holds even if the
forms are absolutely irreducible and nonsingular. Counterexamples were first found for cubic (E.
S. Selmer, 1951) and quintic (M. Fujiwara, 1972) forms. Asymptotic formulas in Waring's problem
(→ 4 Additive Number Theory E) can be regarded as an analytic form of the Hasse principle. As to
the quantitative formulation of the Hasse principle, there are deep results of Siegel for
quadratic forms and their generalization by T. Tamagawa. R. Brauer (1945) showed that forms in
sufficiently many variables represent zero in all $p$-adic fields. Forms of odd degree represent
zero in the field of rational numbers if the number of variables is sufficiently large compared
with the degree (B. J. Birch, 1957). Let $f$ be a polynomial with $p$-adic integer coefficients
and $c_m$ ($m \ge 0$) be the number of solutions to the congruence $f \equiv 0 \pmod{p^m}$. The
series $\varphi(t) = \sum_{m=0}^{\infty} c_m t^m$ is called the Poincaré series of $f$.
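
The counting statements of Sections B and C can be checked by brute force on small examples. The Python sketch below is illustrative only: it counts zeros modulo $p^m$ (the coefficients $c_m$ of the Poincaré series), verifies the congruence $N \equiv 0 \pmod{p}$ of Warning's theorem for one ternary quadratic form over the field with 7 elements, and lifts a nonsingular root modulo $p$ by the Newton step behind the case $\delta = 0$ of Hensel's lemma in one variable. The function names and the sample polynomials $x^2 + y^2 + z^2$ and $x^2 - 2$ are assumptions chosen for the example, not taken from the text.

```python
from itertools import product


def count_zeros(f, nvars, modulus):
    """Count tuples in (Z/modulus)^nvars with f(...) = 0 (mod modulus), by brute force."""
    return sum(1 for xs in product(range(modulus), repeat=nvars)
               if f(*xs) % modulus == 0)


def hensel_lift(f, df, root, p, k):
    """Lift a nonsingular root of f modulo p to a root modulo p**k.

    Assumes f(root) = 0 (mod p) and df(root) != 0 (mod p), i.e. delta = 0.
    """
    if f(root) % p != 0 or df(root) % p == 0:
        raise ValueError("root must be a nonsingular zero of f modulo p")
    x, modulus = root % p, p
    for _ in range(k - 1):
        modulus *= p
        inv = pow(df(x) % p, -1, p)     # inverse of f'(x) modulo p (Python 3.8+)
        x = (x - f(x) * inv) % modulus  # one Newton step: root mod p^j -> mod p^(j+1)
    return x


if __name__ == "__main__":
    # Warning's theorem: a form of degree d = 2 in n = 3 variables over F_7.
    N = count_zeros(lambda x, y, z: x * x + y * y + z * z, 3, 7)
    print("N =", N, " N mod 7 =", N % 7)

    # Poincare series coefficients c_0..c_3 for f(x) = x^2 - 2 and p = 7.
    f = lambda x: x * x - 2
    df = lambda x: 2 * x
    print([count_zeros(f, 1, 7 ** m) for m in range(4)])

    # Lift the root 3 of x^2 = 2 (mod 7) to a root modulo 7^3.
    r = hensel_lift(f, df, 3, 7, 3)
    print(r, f(r) % 7 ** 3)
```

In this example $c_0 = 1$ and $c_m = 2$ for $m = 1, 2, 3$, since each of the two nonsingular roots modulo 7 lifts uniquely; continuing this pattern gives $\varphi(t) = 1 + 2t + 2t^2 + \cdots = (1 + t)/(1 - t)$.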
J. Igusa (1975) proved, by using his theory of asymptotic expansions together with Hironaka's
†resolution theorem, that $\varphi(t)$ is a rational function of $t$ [8].

D. Integral Solutions of Some Diophantine Equations

In this section we are concerned with those equations for which some "theory" exists. For
isolated results → [1, 2].

(1) Binary Forms. Thue's theorem (1908): If $f(x) = \sum_{\nu=0}^{n} a_\nu x^\nu$
($a_\nu \in \mathbf{Z}$, $n > 2$) has distinct roots, then the number of rational integral
solutions of $\sum_{\nu=0}^{n} a_\nu x^\nu y^{n-\nu} = a$ ($\mathbf{Z} \ni a \ne 0$) is finite.
This theorem is a direct consequence of Thue's theorem on Diophantine approximation, which says
that there are only a finite number of rational numbers $p/q$ ($p, q \in \mathbf{Z}$, $q > 0$)
with $|\alpha - p/q| < 1/q^{(n/2)+1}$ for a given algebraic number $\alpha$ of degree $n$
($n > 2$) [9, p. 122]. K. F. Roth proved that $(n/2) + 1$ in this formula can be replaced by
$2 + \varepsilon$ ($\varepsilon$ is an arbitrary positive number independent of $n$)
(Mathematika, 2 (1955), 1-20). Roth's theorem was generalized to some cases of number fields and
function fields (→ 182 Geometry of Numbers) and is applied to Diophantine equations [9, 10]. A.
Baker (1968), using a completely different method, has given explicit upper bounds for the
solutions of Thue's equations, thus enabling one to compute effectively all the solutions. More
precisely, if $f$ in Thue's equation is irreducible over the rationals, then every integer
solution $(x, y)$ of the equation satisfies
$\max(|x|, |y|) < \exp\left((nH)^{(10n)^5} + (\log a)^{2n+2}\right)$, where $H$ is the †height of
$f$. The proof of this remarkable theorem is based on his deep result concerning the lower bound
for the linear forms in the logarithm of algebraic numbers (Mathematika, 13 (1966); 14 (1967)).
Baker's method has been applied to elliptic, hyperelliptic, and other curves (Baker, H. Stark, J.
Coates, V. G. Sprindzhuk, etc.).

(2) Higher-Degree Forms. A natural generalization of binary forms is a norm form. Let $K$ be an
algebraic number field of degree $t \ge 3$ and $\alpha_1, \ldots, \alpha_n$ be elements of $K$.
Then the norm
$N(\alpha_1 x_1 + \cdots + \alpha_n x_n) = \prod_{i=1}^{t} (\alpha_1^{(i)} x_1 + \cdots + \alpha_n^{(i)} x_n)$,
where $\alpha^{(i)}$ denotes a conjugate of $\alpha$, is a form of degree $t$ with rational
coefficients. It is easy to see that every form which has rational coefficients and is
irreducible over $\mathbf{Q}$ but which is a product of linear forms with algebraic coefficients
is a constant multiple of a norm form [7]. A module $M$ in an algebraic number field $K$ is
called degenerate if $M$ has a submodule $N$ such that, for some $\alpha \in K$, $\alpha N$ is a
full module in some subfield $K'$ of $K$, where $K'$ is neither $\mathbf{Q}$ nor an imaginary
quadratic field. The most basic in the theory of norm forms is (W. M.) Schmidt's theorem: Let
$\alpha_1, \ldots, \alpha_n$ be linearly independent over $\mathbf{Q}$ and suppose that the
module generated by $\alpha_1, \ldots, \alpha_n$ is nondegenerate; then
$N(\alpha_1 x_1 + \cdots + \alpha_n x_n) = c$ ($c \in \mathbf{Q}$) has only finitely many
solutions in integers $x_1, \ldots, x_n$ (Math. Ann., 191 (1971)). The proof is based on his
remarkable result on †simultaneous approximation which generalizes Roth's theorem (→ 182
Geometry of Numbers G). There are investigations on special norm forms (T. Skolem, N. I.
Fel'dman, K. Ramanathan, K. Győry, M. Fujiwara, etc.). For general forms of higher degree, not
much is yet known except for the additive forms (→ 4 Additive Number Theory E). H. Davenport
[11] proved that if $f(x)$ is a cubic form with rational integer coefficients in $n$ variables,
then $f(x) = 0$ has a nontrivial integral solution, provided that $n \ge 16$. This theorem was
proved by means of an exquisite application of the †circle method together with some geometry of
numbers. A well-known conjecture is that $n \ge 10$ suffices instead of $n \ge 16$. It is known
that over local fields, any cubic form in 10 variables has a nontrivial zero. There are various
results of this type for simultaneous additive, quadratic, and cubic forms (Davenport, D. Lewis,
R. Cook, Schmidt, etc.) [17]. A satisfactory theory of forms of higher degree, like that of
quadratic forms, is not yet known but is quite desirable. In this vein, Igusa has obtained some
new results of considerable interest, e.g., a †Poisson summation formula for higher-degree
forms, using his theory of asymptotic expansions [8].

(3) Algebraic Curves. The fundamental result is Siegel's theorem (1929): Assume that the
equations $f_i(X_1, \ldots, X_n) = 0$ ($1 \le i \le m$) determine an algebraic curve with a
positive †genus in an †affine space of dimension $n$. Then the number of rational integral
solutions of $f_i(X_1, \ldots, X_n) = 0$ ($1 \le i \le m$) is finite. This theorem was
generalized by S. Lang in the following form: Let $K$ be a finitely generated field over
$\mathbf{Q}$ and $I$ a subring of $K$ that is finitely generated over $\mathbf{Z}$. Furthermore,
let $C$ be a nonsingular projective algebraic curve with a positive genus defined over $K$, and
let $\varphi$ be a rational function on $C$ defined over $K$. Then there are only a finite number
of points $P$ on $C$ with $\varphi(P) \in I$ [10]. The proof of this theorem is based on a
generalization of Roth's theorem in the above sense and on the weak Mordell-Weil theorem (→
Section C). A. Robinson and P. Roquette gave another approach to Siegel's theorem from the
standpoint of nonstandard arithmetic (J. Number Theory, 7 (1975)). On the other hand, a
necessary condition for the existence of infinitely many solutions of $f(X, Y) = 0$ with rational
integral coefficients was given by C. Runge (J. Reine Angew. Math., 100 (1887)).

(4) Elliptic Curves. An elliptic curve $E$ is an †Abelian variety of dimension 1, or what is the
same, an irreducible nonsingular †projective algebraic curve of †genus 1 furnished with a point
$O$ as origin. The †Riemann-Roch theorem defines a group law on the set of †divisor classes of
$E$. Actually, if $P, P'$ are points of $E$, then there exists a unique point $P''$ such that
$(P) + (P') \sim (P'') + (O)$, where $\sim$ means linear equivalence, i.e., the left-hand side
minus the right-hand side is the divisor of a rational function on the curve. The group law on
$E$ is then defined by $P + P' = P''$. If the characteristic $\ne 2$ or 3, using the Riemann-Roch
theorem one finds that the curve $E$ can be defined by a Weierstrass equation
$y^2 = x^3 + ax + b$ with $a, b$ in the ground field over which the curve is defined. Conversely,
any homogeneous nonsingular cubic equation has genus 1 and defines an elliptic curve in the
projective plane once the origin has been selected. If both the curve and the origin are defined
over a field $k$, then the group law is also defined over $k$, and it becomes a 1-dimensional
Abelian variety defined over $k$. If the ground field $k$ is the field of complex numbers, the
group law is the same as that given by the †addition theorem of the †Weierstrass $\wp$-function
with invariants $g_2 = -4a$ and $g_3 = -4b$ through the parametrization $x = \wp(u)$,
$y = \frac{1}{2}\wp'(u)$. Many of the Diophantine theorems on elliptic curves are generalized to
Abelian varieties. Here we shall deal mainly with elliptic curves defined by Weierstrass
equations over $\mathbf{Q}$. Extension to algebraic number fields usually causes no trouble. For
more general elliptic curves → [19]. The Lutz-Mattuck theorem (→ Section E) obviously implies
that the points of finite order in $E_k$ ($k = \mathbf{Q}_p$) form a finite group. This torsion
group is computable. In case $a$ and $b$ are in $\mathbf{Z}$ then any point of finite order in
$E_{\mathbf{Q}}$ has coordinates $(x, y)$ in $\mathbf{Z}$ and, if $y \ne 0$,
$y^2 \mid (4a^3 + 27b^2)$ (Lutz-Nagell). The WC group (Weil-Châtelet group) of $E$ over $k$ is
the birational class of principal homogeneous spaces over $k$. The extent of validity of the
Hasse principle for elliptic curves can be measured by the Tate-Shafarevich group, which is
defined as the set of elements of the WC group that are everywhere locally trivial. This group is
conjectured to be a finite group. For other results and interesting conjectures → [12-14]. The
number of integral points on elliptic curves is finite according to Siegel's theorem on algebraic
curves. Explicit bounds for the size of these points have been given for several types of
elliptic curves by using Baker's method. For example, if $f(x, y)$ is an absolutely irreducible
polynomial with coefficients in $\mathbf{Z}$ such that the curve $f = 0$ has genus 1, then
$\max(|x|, |y|) < \exp\exp\exp\left((2H)^m\right)$, where $m = 10^{d^{10}}$, $d = \deg f$, and $H$
is the height of $f$ (Baker and Coates, 1970. The method of proof was to reduce it to the
Weierstrass equation case, which had been treated earlier by Baker, with a better bound.) By the
†Mordell-Weil theorem (→ Section E), $E_{\mathbf{Q}} \cong \mathbf{Z}^r \times$ (finite torsion
group). Here $r$ is called the rank of $E$ over $\mathbf{Q}$. There is a rather doubtful
conjecture to the effect that the rank $r$ is bounded. The rank $r$ is conjectured to be equal to
the order of the zero of $L(s, E)$ at $s = 1$ (Birch-Swinnerton-Dyer conjecture). Much numerical
and theoretical evidence supports this famous conjecture [13].

E. Rational Points of Algebraic Varieties

Let $V$ be an †abstract algebraic variety defined over a field $k$, and let $P$ be a point of
$V$. Then $P$ is called a rational point over $k$ of $V$ if the coordinates of the
†representative $P_\alpha$ contained in an †affine open set $V_\alpha$ of $V$ are in $k$ (→ 16
Algebraic Varieties D). This definition is independent of the choice of the representative
$P_\alpha$. In particular, if $V$ is a †projective variety, the point $P$ given by the
†homogeneous coordinates $(x_0, x_1, \ldots, x_n)$ is rational if and only if
$x_i/x_\rho \in k$ ($0 \le i \le n$, $x_\rho \ne 0$). In the following we state main results
concerning rational points of algebraic varieties, especially results concerning †Abelian
varieties, restricting $k$ to be either an †algebraic number field of finite degree, a †$p$-adic
number field, or a †finite field.

Mordell-Weil Theorem. Let $A$ be an Abelian variety of dimension $n$ defined over an algebraic
number field $k$ of finite degree. Then the group $A_k$ of all $k$-rational points on $A$ is
finitely generated. This theorem was proved by L. J. Mordell (1922) for the case of $n = 1$ and
by Weil (1928) for the general case [10]. The assertion that $A_k/mA_k$ is a finite group for any
rational integer $m$ is called the weak Mordell-Weil theorem; this theorem is basic in the proof
of the Mordell-Weil theorem and is used in the proof of Siegel's theorem, too. A generalization
of the Mordell-Weil theorem is obtained when $k$ is a field (of arbitrary characteristic)
finitely generated over the †prime field [10].

If $A$ is defined over a finite algebraic number field $k$, we have the following conjectures of
Birch, Swinnerton-Dyer, and Tate on the rank of $A_k$. Let $\mathfrak{p}$ be a prime ideal of $k$
at which $A$ has a good †reduction, and denote by $A_{\mathfrak{p}}$ the reduced variety. Let
$\pi_1^{(\mathfrak{p})}, \ldots, \pi_{2n}^{(\mathfrak{p})}$ be the eigenvalues of the
$N(\mathfrak{p})$th power endomorphism of $A_{\mathfrak{p}}$ with respect to an $l$-adic
representation, where $N(\mathfrak{p})$ denotes the norm of $\mathfrak{p}$ (→ 3 Abelian Varieties
E, N), and put
$L_{\mathfrak{p}}(s, A) = \prod_{i=1}^{2n} \left(1 - \pi_i^{(\mathfrak{p})} N(\mathfrak{p})^{-s}\right)^{-1}$.
The $L$-function of $A$ defined by $L(s, A) = \prod'_{\mathfrak{p}} L_{\mathfrak{p}}(s, A)$, where
the product ranges over all good primes, is the principal part of the zeta function of $A$ (→ 450
Zeta Functions S). Birch and Swinnerton-Dyer conjectured that if $k = \mathbf{Q}$ and $A$ is of
dimension 1, then there exists a constant $C \ne 0$ such that $L(s, A) \sim C(s - 1)^g$ as
$s \to 1$. Tate generalized this conjecture to any $A$ and $k$. Moreover, the constant $C$,
appropriately modified by factors corresponding to the bad primes and the infinite primes, is
thought to be expressible in terms of certain arithmetic invariants of $A$ [13].

Lutz-Mattuck Theorem. The group of rational points of an Abelian variety $A$ of dimension $n$
over a †$p$-adic number field $k$ contains a subgroup of finite index isomorphic to the direct
sum of $n$ copies of the †ring $\mathfrak{O}$ of $p$-adic integers in $k$ (E. Lutz, J. Reine
Angew. Math., 177 (1937); A. Mattuck, Ann. Math., 62 (1955)).

Mordell's Conjecture. In his 1922 paper, in which the above theorem on the set of rational
points on 1-dimensional Abelian varieties (i.e., on elliptic curves) was established, Mordell
stated the conjecture: Any algebraic curve of genus $g \ge 2$ defined over $\mathbf{Q}$ has only a
finite number of rational points. The same can be conjectured for such curves defined over any
algebraic number field $k$ of finite degree. This had remained as a conjecture until 1983. In
1961 I. R. Shafarevich conjectured: Let $k$ be any algebraic number field of finite degree, $S$ a
finite set of finite prime spots of $k$, and $g$ any natural number $\ge 2$. Then there are, up
to $k$-isomorphism, only a finite number of nonsingular algebraic curves of genus $g$ defined
over $k$ having good reduction at every finite prime spot outside $S$.

In 1973, A. N. Parshin showed that Mordell's conjecture followed from this conjecture, which was
finally proved in 1983 by G. Faltings [5]. Mordell's conjecture was thus settled in the
affirmative. Analogs of these conjectures on algebraic function fields over finite fields are
easier than the original ones and had been proved in the 1960s for Mordell's conjecture by Yu.
I. Manin, H. Grauert, and M. Miwa, and for Shafarevich's conjecture by S. Arakelov, A. N.
Parshin, and L. Szpiro.

F. $C_i$-Fields

Let $F$ be a field, and let $i \ge 0$, $d \ge 1$ be integers. Let $f$ be a homogeneous polynomial
of $n$ variables of degree $d$ with coefficients in $F$. If the equation $f = 0$ has a solution
$(x_1, \ldots, x_n) \ne (0, \ldots, 0)$ in $F$ for any $f$ with $n > d^i$, then $F$ is called a
$C_i(d)$-field. If $F$ is a $C_i(d)$-field for any $d \ge 1$, then $F$ is called a $C_i$-field. In
order for $F$ to be a $C_0$-field, it is necessary and sufficient that $F$ be an †algebraically
closed field. A $C_1$-field is sometimes called a quasi-algebraically closed field. There exists
no noncommutative division algebra of finite rank over a $C_1$-field $F$. A finite field is $C_1$
(C. Chevalley (1936)). If $F_0$ is algebraically closed, then $F = F_0(X)$ (rational function
field of one variable) is a $C_1$-field (Tsen's theorem). A homogeneous polynomial $f$ of
$n = d^i$ variables of degree $d$ with coefficients in $F$ such that $f = 0$ has no solution in
$F$ except $(0, \ldots, 0)$ is called a normic form of order $i$ in $F$. If a $C_i$-field $F_0$
has at least one normic form of order $i$, then (i) $F_0(X_1, \ldots, X_j)$ is a
$C_{i+j}$-field; and (ii) an extension of $F_0$ of finite degree is a $C_i$-field. A complete
field $F$ with respect to an †exponential valuation is a $C_1$-field whenever its residue field
$F_0$ is algebraically closed. The field $F$ of power series of one variable over a finite field
$F_0$ is a $C_2$-field (Lang). E. Artin conjectured that a $p$-adic field $\mathbf{Q}_p$ is a
$C_2$-field. It was proved by H. Hasse (1923) that $\mathbf{Q}_p$ is a $C_2(2)$-field and by D.
Lewis (1952) that $\mathbf{Q}_p$ is a $C_2(3)$-field. However, G. Terjanian (1966) [17] gave a
counterexample to Artin's conjecture; that is, he gave a quartic form of 18 variables with
coefficients in $\mathbf{Q}_2$ having only trivial zero in $\mathbf{Q}_2$. Ax and S. Kochen
(1965) [18] proved that for any integer $d \ge 1$ there exists an integer $p_0(d)$ such that
$\mathbf{Q}_p$ is a $C_2(d)$-field for $p > p_0(d)$ (→ 276 Model Theory E).

References

[1] L. Dickson, History of the theory of numbers I-III, Chelsea, 1952.
[2] L. J. Mordell, Diophantine equations, Academic Press, 1969.
[3] W. M. Schmidt, Equations over finite fields, Lecture notes in math. 536, Springer, 1976.
[4] A. Weil, Sur les courbes algébriques et les variétés qui s'en déduisent, Actualités Sci. Ind., 1041 (1948).
[5] G. Faltings, Endlichkeitssätze für abelsche Varietäten über Zahlkörpern, Inventiones Math., 73 (1983), 349-366.
[6] M. Greenberg, Lectures on forms in many variables, Benjamin, 1969.
[7] Z. I. Borevich and I. R. Shafarevich, Number theory, Academic Press, 1966. (Original in Russian, 1964.)
[8] J. Igusa, Lectures on forms of higher degree, Tata Inst., 1978.
[9] W. J. LeVeque, Topics in number theory, Addison-Wesley, 1956.
455 120 A
Dirichlet Prohlem

[lO] S. Lang, Diophantine geometry, Inter- brait number tïelds. In tpotential theory he
science, 1962. dealt with the +Dirichlet problem concerning
[l l] H. Davenport, Analytic methods for the existence of tharmonic functions. He also
Diophantine equations and Diophantine gave +Dirichlet’s condition for the convergence
inequalities, Univ. of Michigan, 1962. of trigonometric series.
[ 121 J. W. S. Cassels, Diophantine equations
with special reference to elliptic curves, J.
References
London Math. Soc., 41 (1966), 193-291.
[ 131 H. P. F. Swinnerton-Dyer, The conjec-
[l] P. G. L. Dirichlet’s Werke 1, II, Reimer,
tures of Birch and Swinnerton-Dyer and of
1889, 1897 (Chelsea, 1969).
Tate, Proc. Conf. on Local Fields, Driebergen,
Springer, 1967. [L] P. G. L. Dirichlet, Vorlesungen über Zah-
lentheorie, with supplements by R. Dedekind,
[ 143 S. Lang, Elliptic curves: Diophantine
Vieweg, fourth edition, 1894 (Chelsea, 1968).
analysis, Springer, 1978.
[3] F. Klein, Vorlesungen über die Entwick-
[ 151 1. Niven and H. Zuckerman, An introduc-
lung der Mathematik im 19. Jahrhundert 1,
tion to the theory of numbers, Wiley, 1960.
Springer, 1926 (Chelsea, 1956).
[ 161 J. Joly, Equations et variétés algébriques
sur un corps fini, Enseignement Math., (2) 19
(1973), 1-117.
[ 171 D. Lewis, Diophantine equations: p-adic
method, Studies in Math., vol. 6, Math. Assoc. 120 (X.30)
Amer., 1969. Dirichlet Problem
[ 181 J. Ax and S. Kochen, Diophantine prob-
lems over local Iïelds 1, II, Amer. J. Math., 87
(1965), 6055630,631-648. A. The Classical Dirichlet Prohlem
[ 191 J. Tate, The arithmetic of elliptic curves,
Inventiones Math., 23 (1974), 170-206. Let D be a bounded or unbounded tdomain in
R” (n > 2) with compact boundary S. The class-
ical Dirichlet problem is the problem of fmding
a tharmonic function in D that assumes the
119 (XXl.20) values of a prescribed continuous function on S.
This problem is also called the tfïrst boundary
Dirichlet, Peter Gustav
value problem (- 193 Harmonie Functions
Lejeune and Subharmonic Functions). In this article f
always stands for a boundary function given
on S. The problem is called an interior problem if D is bounded and an exterior problem
if D is unbounded. In an exterior problem, it is further required that when an †inversion
with center at an exterior point $P_0$ of D is performed on D and a †Kelvin transformation
is performed on the solution in D (when the solution exists), the function thus obtained on
the inverted image of D be harmonic at $P_0$ ($n \ge 3$). When n = 2, the solution in D, which
is already regarded as a function on the inverted image of D, is required to be harmonic
at $P_0$. Thus an interior problem can be transformed to an exterior problem, and vice versa.
We now explain the history of the classical problem.

Peter Gustav Lejeune Dirichlet (February 13, 1805-May 5, 1859) was born of a French
family in Düren, Germany. From 1822 to 1827 he was in Paris, where he became a friend of
J.-B. †Fourier. In 1827, he was appointed lecturer at the University of Breslau; in 1829,
lecturer at the University of Berlin; and in 1839, professor at the University of Berlin. In
1855, he was invited to succeed C. F. †Gauss at the University of Göttingen, where he spent
his last four years as a professor.
His works cover many aspects of mathematics; however, those on number theory, analysis,
and potential theory are the most famous. He greatly admired Gauss and is said to have
kept Gauss's Disquisitiones arithmeticae at his side even when traveling.
In number theory, he created the †Dirichlet series and proved that a sequence in arithmetic
progression contains infinitely many prime numbers, provided that the first term and the
common difference are relatively prime. Also, using his "drawer principle," which states that
if there are n + 1 objects in n drawers then at least one drawer contains at least 2 objects, he
clarified the structure of †unit groups of †alge-

Let D be a bounded domain with boundary S in $R^3$. G. Green (1828) asserted that if S is
sufficiently smooth,

$$u(P) = -\frac{1}{4\pi}\iint_S f(Q)\,\frac{\partial G(P,Q)}{\partial n_Q}\,d\sigma(Q) \tag{1}$$

is the solution for the Dirichlet problem, where f is prescribed on S, G(P, Q) is †Green's
function with the pole at Q in D, $n_Q$ is the outward-drawn normal to S at Q, and $d\sigma$ is the †sur-
face element on S. He took for granted the existence of Green's function from physical
consideration of the problem. Thus his discussion was not quite rigorous. This defect was
corrected by A. M. Lyapunov (1898) under a certain condition on S. Denote by $u_m$ the
Newtonian potential of a measure with density $m \ge 0$ on S. Assume that a continuous function
f on S and a positive constant a are given. In 1840, C. F. Gauss investigated the existence of
a density $m_f \ge 0$ on S of total mass a which satisfies
$\int_S (u_{m_f} - 2f)\,m_f\,d\sigma = \min_m \int_S (u_m - 2f)\,m\,d\sigma$, where the total mass of m is equal
to a. He asserted also that $u_{m_f} - f$ is equal to a constant b on S. If $f \equiv 0$, then $u_{m_0}$ must be
equal to a positive constant c on S, and hence $u_{m_f} - bc^{-1}u_{m_0}$ must be a solution of the
exterior problem for the boundary function f on S. However, his discussion was incomplete
because we cannot always ensure the existence of a density that gives a measure minimizing
the integral. Moreover, even in the case where D is a ball, there exists a continuous function
$f \ge 0$ on S such that there is no Newtonian potential that is equal to f on S up to a
constant (M. Ohtsuka, 1961). Gauss (1840), W. Thomson (Lord Kelvin) (1847), and G. L.
Dirichlet solved the Dirichlet problem by making use of the so-called Dirichlet principle,
which is explained in detail in Section F. After K. Weierstrass (1870) pointed out that there is
a case where no minimizing function exists, D. Hilbert (1899) gave a rigorous proof of the
Dirichlet principle under a certain condition.
Meanwhile, C. G. Neumann (1870) solved the Dirichlet problem rigorously for the first
time, although he assumed that D is a convex domain with a smooth boundary. First,
he considered the potential $W_1 = (1/2\pi)\iint_S f\,(\partial r^{-1}/\partial n)\,d\sigma$ of a double layer in D;
then he formed the potential $W_2 = (1/2\pi)\iint_S f_1\,(\partial r^{-1}/\partial n)\,d\sigma$ of a double layer with the
values $f_1$ of $W_1$ on S and defined $W_3, W_4, \ldots$ similarly. The series $W_1 - W_2 + W_3 - W_4 + \cdots$
plus a suitable constant gives a solution to the exterior Dirichlet problem for the boundary
function f. In 1887, H. Poincaré also used (1) to solve the Dirichlet problem. He obtained
Green's function with the pole at O in a bounded domain D in the following manner:
Let D' be the image of D by an inversion with center O, $S_0$ be a spherical surface
surrounding the boundary $\partial D'$ of D', and $\mu$ be a uniform measure on $S_0$ such that the potential
of $\mu$ is equal to 1 inside $S_0$. By †sweeping out $\mu$ to $\partial D'$, the solution in D' of the exterior
problem for the boundary function 1 is obtained. A †Kelvin transformation of this solution
yields the solution h in D of the interior problem for the boundary function 1/OP. Now
1/OP - h(P) is Green's function in D. In 1899 Poincaré used another method (without
utilizing (1)) to solve the Dirichlet problem [8]. He observed that it is sufficient to consider the
case where f is equal to the restriction to S of a polynomial g and that g is expressed in D as
the sum of the Newtonian potential of a signed measure $\tau$ with density $-\Delta g/(4\pi)$ and a
function that is harmonic in D and continuous on $D \cup S$. If it is possible to sweep out $\tau$ to $\partial D$,
then the solution is obtained. He showed that this is in fact the case if at every point P of S
there exists a cone that is disjoint from D and has its vertex at P. This condition is called
Poincaré's condition. In 1900, I. Fredholm discussed the Dirichlet problem by reducing it
to a problem of †integral equations. A domain D is called a Dirichlet domain if the (classical)
Dirichlet problem is always solvable in D. H. Lebesgue (1912) showed that a solution is
obtained by the method of iterative averaging in every Dirichlet domain.

B. The Dirichlet Problem in a General Domain

It has been believed that the classical Dirichlet problem is always solvable in every domain
until S. Zaremba observed in 1909 that the problem is not always solvable for a punctured
ball. In 1913, Lebesgue gave a decisive example in which the domain is homeomorphic to
a ball and bounded by a surface sufficiently smooth except at one point. Thus the central
interest shifted to finding a harmonic function in D that depends only on a continuous
function f given on $\partial D$ and coincides with the classical solution when D is a Dirichlet
domain. Extend f to a continuous function in the whole space, and denote it by $f_0$.
Approximate D by an increasing sequence $\{D_n\}$ of Dirichlet domains, and denote by $u_n$ the
solution in $D_n$ of the Dirichlet problem for the boundary function $f_0$. N. Wiener proved in
1924 that $u_n$ converges to a harmonic function that is independent of the choice of the
extension of f and $\{D_n\}$. The problem of deciding where on $\partial D$ the general solution assumes the
given boundary values is treated in Section D. O. D. Kellogg (1928) found a general method
that includes Poincaré's method of sweeping out, Schwarz's alternating method, and the
result of Wiener. Both Poincaré's method of sweeping out and Lebesgue's method of
iterative averaging yield Wiener's general solution.

C. Perron's Method

We explain O. Perron's method (1923) by considering the improved method of M. Brelot
[1]. For simplicity we assume that the domain
D is bounded in R3. Let U be the family of tion in D that assumes the boundary value 0
subharmonic functions u bounded above and at M and has a positive lower bound outside
satisfying lim s~p~-~ u(P) <f(M) for any every bal1 with tenter at M. A positive super-
boundary point M. Define H/(P) as sup,u(P), harmonie function delïned in the intersection
where u runs through CT,if this family is not of D and a neighborhood of M and taking the
empty; otherwise, set If, = --CO. Cal1 l& a boundary value 0 cari be used as a barrier. A
hypofunction. Detïne HL by -H-r and cal1 it a necessary and sufficient condition for a bound-
hyperfunction. If H, = H,, the common func- ary point M to be regular is the existence of a
tion is denoted by H,; if H,.(P) < CO, then HJ is Green% function in D assuming the value 0 at
harmonie. This is called a Perron-Brelot solu- M. This condition was given by G. Bouligand
tion (Perron-Wiener-Brelot solution or simply (1925), and it follows from the existence of a
PWB solution). The method of defining H,. is barrier. Another necessary and suflïcient con-
called Perron% method (the Perron-Brelot dition of a quantitative nature was obtained
method or the Perron-Wiener-Brelot method). by Wiener. It is equivalent to the requirement
Wiener showed in 1923 that the +Daniell- that the complement of D is not +thin at M
Stone integral cari be regarded as a general (- 338 Potential Theory G). Kellogg conjec-
solution if a +Daniell-Stone integrable function tured that the set of irregular boundary points
f is given on the boundary of a Dirichlet is of capacity zero and verilïed this in R2
domain; in 1925, he showed that the same is (1928). The conjecture was proved lïrst by G.
true for a general domain (not necessarily C. Evans (1933) in R3, and different proofs
Dirichlet). He showed also that his solution were given by F. Vasilesco (1935) and 0.
coincides with the Perron-Brelot solution Hf if Frostman (1935). The conjecture is also true
f is continuous. Unfortunately, however, from in $R^n$ for $n \ge 4$.
a wrong example he concluded that IJ, # fi,
cari hold even for a simple discontinuous J;
E. The More General Dirichlet Problem
and SO he lost interest in Perron% method.
Brelot (1939) corrected Wiener’s erroneous
So far, we have been concerned with $R^n$. More
conclusion and proved that the Daniel1 Upper
generally, Brelot and G. Choquet [3] obtained
and lower integrals are equal to fiî and fif,
the following result in a Green space & (- 193
respectively. TO any continuous f there corre-
Harmonie Functions and Subharmonic Func-
sponds an Hf, and there exists a +Radon mea- tions): Consider a metric space that contains &
sure pLp satisfying HJP) = jfdpp. This measure
and in which & is everywhere dense, and de-
is called a harmonie measure or harmonie note by A the complement of G with respect to
measure function. Brelot showed that fi/= fif
the space. Let {F} be a family of +Iïlters on 8
if and only if f is PL,-integrable for one (or such that each F converges to a certain point
every) P. In particular, If D is a Dirichlet
of A. Suppose that u < 0 whenever u is sub-
domain and E is a closed set on the boundary
harmonie and bounded above on & and
aD, then the harmonie measure function pp(E)
lim sup u < 0 along every F. Assume the exis-
takes the value 1 at an inner point (in the tence of a barrier u in a neighborhood in &
space aD) of E and vanishes on aD - E. We
of the limit point Q of every F; that is, u is to
note that pp is equal to the measure obtained be positive superharmonic, to tend to 0 along
by sweeping out the unit mass at P to aD. F, and to have a positive lower bound outside
every neighborhood of Q. Under these as-
sumptions, we obtain the PWB solution on &
D. Regular Boundary Points as in R3. There are various examples of A and
F that satisfy these conditions. In particular, L.
If Hf(P)+f(M) as P+MedD for any con-
Naïm [6] investigated in detail the case where
tinuous function f on aD, then M is called A is the +Martin boundary. More generally,
regular. The regularity of a point is a local it is possible to treat the Dirichlet problem
property. A boundary point that is not regu- axiomatically (- 193 Harmonie Functions
lar is called irregular. The regularity of M is and Subharmonic Functions).
equivalent to the convergence of pLp as P-M
to the unit mass at M with respect to the
+Vague topology. There are many sufficient F. The Dirichlet Principle
conditions and necessary conditions for a
boundary point to be regular. The existence Let D be a bounded domain with a sufficiently
of a harrier is a qualitative condition that is smooth boundary in R” and f be a piecewise
necessary and sufficient for a boundary point Cl-function on D with imite Dirichlet integral
to be regular. It was used by Poincaré and SO Ilfll’=SDIgradflZd~, where dT is the volume
named and used effectively by Lebesgue. A element. Suppose that f has a continuous
barrier is a continuous superharmonic func- boundary value <pon i3D. The classical Dirich-
let principle asserts that the solution of the Dirichlet problem for $\varphi$ has the smallest
Dirichlet integral among the functions that are piecewise of class $C^1$ in D and assume the
boundary value $\varphi$. In a general domain, $H_f$ minimizes $\|u - f\|$ among harmonic functions
u in D. Brelot [2] discussed the principle for a family of competing functions that are defined
in a domain in $\mathcal{E}$ and whose boundary values cannot be defined in the classical manner.

References

[1] M. Brelot, Familles de Perron et problème de Dirichlet, Acta Sci. Math. Szeged., 9 (1939),
133-153.
[2] M. Brelot, Etude et extensions du principe de Dirichlet, Ann. Inst. Fourier, 5 (1955),
371-419.
[3] M. Brelot and G. Choquet, Espaces et lignes de Green, Ann. Inst. Fourier, 3 (1952),
199-263.
[4] R. Courant, Dirichlet's principle, conformal mapping, and minimal surfaces,
Interscience, 1950.
[5] O. D. Kellogg, Foundations of potential theory, Springer, 1929.
[6] L. Naïm, Sur le rôle de la frontière de R. S. Martin dans la théorie du potentiel, Ann. Inst.
Fourier, 7 (1957), 183-281.
[7] M. Ohtsuka, Dirichlet problem, extremal length and prime ends, Lecture notes,
Washington Univ., St. Louis, 1962-1963.
[8] H. Poincaré, Théorie du potentiel Newtonien, Leçons professées à la Sorbonne,
Gauthier-Villars, 1899.
[9] L. L. Helms, Introduction to potential theory, Interscience, 1969.

121 (X1.3)
Dirichlet Series

A. Dirichlet Series

For $z = x + iy$, $\lambda_n > 0$, and $\lambda_n \uparrow +\infty$, the series of the form

$$f(z) = \sum_{n=1}^{\infty} a_n \exp(-\lambda_n z) \tag{1}$$

is called a Dirichlet series (more precisely, a Dirichlet series of the type $\{\lambda_n\}$). If $\lambda_n = n$, then
(1) is a power series with respect to $e^{-z}$. If $\lambda_n = \log n$, the series (1) becomes

$$\sum_{n=1}^{\infty} a_n/n^z, \tag{2}$$

which is called an ordinary Dirichlet series. If $a_n = 1$, then (2) is the †Riemann zeta function.
Series of the form (2) were introduced by P. G. L. Dirichlet in 1839 and utilized in an
investigation of the problems of †analytic number theory. Later J. Jensen (1884) and
E. Cahen (1894) extended the variable z to complex numbers. The Dirichlet series is not
only a useful tool in analytic number theory, but is also investigated as a generalization of
power series. The †Laplace transform is the generalization of the Dirichlet series to the
integral, and similar formulas often hold for both cases.

B. Convergence Regions

If the series (1) converges at $z = z_0$, then it converges in the half-plane Re z > Re $z_0$.
Therefore there is a uniquely determined real number S such that (1) converges in Re z > S and
diverges in Re z < S. If (1) always converges (diverges), we put $S = -\infty$ ($+\infty$). We call S the
abscissa of convergence (or abscissa of simple convergence). Similarly, there is a uniquely
determined real number A such that (1) converges absolutely in Re z > A and is not
absolutely convergent in Re z < A. We call A the abscissa of absolute convergence. Furthermore,
there is a uniquely determined real number U such that (1) converges uniformly in Re z > U'
for every U' > U and does not converge uniformly in Re z > U'' for every U'' < U. The
number U is called the abscissa of uniform convergence. Among these abscissas we always
have the relations

$$S \le U \le A, \qquad A - S \le \limsup_{n\to\infty} \frac{\log n}{\lambda_n}.$$

The latter was proved by Cahen (1894). The numbers S, A, U are determined from $a_n$, $\lambda_n$ by
means of the formulas

$$S = \limsup_{x\to\infty} \frac{1}{x} \log \Bigl| \sum_{[x] \le \lambda_n < x} a_n \Bigr|, \tag{3}$$

$$A = \limsup_{x\to\infty} \frac{1}{x} \log \Bigl( \sum_{[x] \le \lambda_n < x} |a_n| \Bigr), \tag{4}$$

$$U = \limsup_{x\to\infty} \frac{1}{x} \log T_x, \qquad T_x = \sup_{-\infty < y < +\infty} \Bigl| \sum_{[x] \le \lambda_n < x} a_n \exp(-i\lambda_n y) \Bigr|, \tag{5}$$

where [ ] is the †Gauss symbol. Formulas (3) and (4) were proved by T. Kojima (1914) and
(5) by M. Kunieda (1916). In particular, when $\lim_{n\to\infty} (\log n)/\lambda_n = 0$, we have

$$S = U = A = \limsup_{n\to\infty} \frac{\log |a_n|}{\lambda_n} \tag{6}$$

(O. Szász, 1922; G. Valiron, 1924).
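As a numerical illustration of formula (6), read here as $S = U = A = \limsup_{n\to\infty} \log|a_n|/\lambda_n$, the following Python sketch estimates the abscissa for one series of type $\lambda_n = n$. The coefficients $a_n = n\,2^n$, the tail window, and all function names are assumptions chosen for illustration; a maximum over a finite window only approximates a lim sup.

```python
import math

def log_abs_a(n):
    # Hypothetical coefficients a_n = n * 2**n, handled through log|a_n| to avoid overflow.
    return math.log(n) + n * math.log(2.0)

def lam(n):
    # Hypothetical exponents lambda_n = n, so that (log n) / lambda_n -> 0.
    return float(n)

def abscissa_estimate(n_min=1000, n_max=2000):
    # Approximate limsup log|a_n| / lambda_n by the maximum over a tail window.
    return max(log_abs_a(n) / lam(n) for n in range(n_min, n_max + 1))

def partial_sum(x, terms=2000):
    # Partial sum of sum_n a_n * exp(-lambda_n * x) for a real argument x.
    return sum(math.exp(log_abs_a(n) - lam(n) * x) for n in range(1, terms + 1))

S_est = abscissa_estimate()        # should be close to log 2

# To the right of the abscissa the series converges; at x = log 2 + 0.1 the
# terms are n * r**n with r = exp(-0.1), whose sum is r / (1 - r)**2.
x = math.log(2.0) + 0.1
r = math.exp(-0.1)
closed_form = r / (1.0 - r) ** 2
```

Working with $\log|a_n|$ rather than $a_n$ keeps the computation within floating-point range; comparing `partial_sum(x)` with `closed_form` checks convergence at one point to the right of the estimated abscissa.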
The series (1) converges uniformly in the a kind of periodicity for the values of f(z) over
angular domain {z 1larg(z-z,)l <tl <n/2}, Re z = x; this was the origin of the theory of
where the vertex z0 lies on the line Re z0 = S. talmost periodic functions.
Hence it represents a holomorphic function in As for the zeros of the function f(z), the
the domain Re z > S, but it is possible that following theorems are known: If f(z) is not
there is no singularity on the line Rez = S. For identically zero, it has only a fïnite number of
example, if a, =( -l)“, then the series (2) has zeros in x >S+E, emM”< y < eMx for arbitrary
S = 0, but the sum is an tentire function (2’-’ positive numbers E, M (Perron, 1908). If we
- 1) l(z). Taking the analytic continuation f(z) denote by N(T) the number of zeros in x > S
of the series (l), the intïmum R of p such that SE, T<y< 7’+26logT, then limsup,,,N(T)/
f(z) is holomorphic in Rez > p is called the (log T)2 8 6,‘~ (E. Landau, 1927).
abscissa of regularity. It is still possible that There have been many investigations into
there is no singularity on the line Re z = R. We the connection between the singularities of j”(z)
always have R <S, and R is given by and the coefficients a,. If the a, are real and
positive, the point z= S is always a singular-
R= ~~<~<~~‘i~~~~og~og+I<p(x+~Y)I+x),
sup
ity of f(z). Moreover, if S = 0, Re a, > 0, and
lim,,, (cos(arg4) ii& = 1, then z = 0 is a sin-
m a, exp( - &z)
(7) gularity of ,f(z) (C. Biggeri, 1939). Furthermore,
q(z)=“; T(l+i ”) ’
if Â,/n-, CO, lim infn-oa(&+l - 1,) > 0, then the
where log+a=max(loga,O) (C. Tanaka, 1951). line Re z = S is the tnatural boundary of f(z)
The infimum B of p such that f(z) is bounded (F. Carleson and Landau, 1921; A. Ostrowski,
in Re z > p is called the abscissa of boundedness. 1923). If S=O, liminf,,,,(A,+, -Â,)=q>O, then
We always have R <B < A. H. Bohr proved there always exist singularities on every inter-
the following three theorems concerning these val on the imaginary axis with the length 2n/q
values: (1) If {Ân} are linearly independent over (G. Polya, 1923). S. Mandelbrojt (1954, 1963)
the ring of integers, then A = B (1911). (2) If gave some interesting results concerning the
(n,,, -A,)-‘=O(expe@) for every E>O, then relations between the singularities of (1) and
U= B (1913). (3) If limsup(logn)/&=O, then S the Fourier transform of an entire function.
= U = A = B (1913). In the final case, the values If U = -CO, the function f(z) is an entire
are given by (6). function. Its tarder (in the sense of entire func-
tion) p is given by

log+ log+ M(x)


p = lim sup
C. Properties of Functions Given by Diricblet x--cc 1x1 ’
Series
M(x)= sup If(x+iy)l.
The coefficients a, in (1) are given in terms of
There have been many investigations into the
the function f(z) by
+Julia direction of f(z) and related topics by
Mandelbrojt, Valiron, and Tanaka.
~‘u.=&.~~~l((z~~dz> (8)
c Lrn
where c > max(S, 0), 1, < w ci,,, , and the
integration contour does not pass through D. Tauberian Theorems
{in}. If w= A,, then the term a, in the sum of
the left-hand side of (8) is replaced by a,/2 (0. As in the case of power series, if the series
Perron, 1908). Furthermore, if S < x, then we
Ca,, converges to s, then f( + 0) = s (Abel%
have
continuity theorem). The converse is not neces-
mg+7 sarily true. The converse theorems, with ad-
a,=lim’ f(x + iy)evCJ& + ~Y)I dy (9) ditional conditions on a, and i,, are called
T-LX T s 0”
Tauberian theorems, as in the case of power
(J. Hadamard, 1908; C. Tanaka, 1952). series. Many theorems are known about this
Ifx=Rez>S, thenf(z)=o(lyl)((yl-rco).In field. The most famous additional conditions
order to investigate its behavior more pre- are lim,,, &u”/(” - A,-,) = 0 (Landau, 1926)
cisely; Bohr introduced and a, = O((n,, -in-i)/&) (K. Ananda-Rau,
1928).
lwIf(x+~Y)l
p(x) = lim sup For the summation of Dirichlet series (es-
IYl-+m 1WlYl
pecially +Riesz’s method of summation) - 379
in his thesis (1910) and called it the order over Series R. For Tauberian theorems (especially
Re z = x. The function p(x) is nonnegative, the Wiener-Ikehara-Landau theorem) of the
monotone decreasing, convex, and continuous ordinary Dirichlet series - 123 Distribution
with respect to x. Bohr later found that there is of Prime Numbers B.
E. Series Related to Dirichlet Series

A series of the form

$$\sum_{n=1}^{\infty} \frac{n!\,a_n}{z(z+1)(z+2)\cdots(z+n)}, \qquad z \ne 0, -1, -2, \ldots$$

is called a factorial series with the coefficients $\{a_n\}$. It converges or diverges simultaneously
with the ordinary Dirichlet series $\sum a_n/n^z$ except at z = 0 and negative integers. The series

$$\sum_{n=1}^{\infty} \frac{(z-1)(z-2)\cdots(z-n)}{n!}\,a_n = \sum_{n=1}^{\infty} a_n \binom{z-1}{n}$$

is called a binomial coefficient series. It converges or diverges simultaneously with the
ordinary Dirichlet series $\sum (-1)^n a_n/n^z$ except at z = 0 and positive integers.

References

[1] V. Bernshteïn, Leçons sur les progrès récents de la théorie des séries de Dirichlet,
Gauthier-Villars, 1933.
[2] K. Chandrasekharan and S. Minakshisundaram, Typical means, Oxford Univ. Press,
1952.
[3] G. H. Hardy and M. Riesz, The general theory of Dirichlet's series, Cambridge, 1915.
[4] S. Mandelbrojt, Séries lacunaires, Actualités Sci. Ind., Hermann, 1936.
[5] S. Mandelbrojt, Séries adhérentes, régularisation des suites, applications,
Gauthier-Villars, 1952.
[6] S. Mandelbrojt, Dirichlet series, Reidel, 1972.
[7] G. Valiron, Théorie générale des séries de Dirichlet, Mém. Sci. Math.,
Gauthier-Villars, 1926.

122 (IV.14)
Discontinuous Groups

A. Definitions [1-4]

Suppose that a group $\Gamma$ acts continuously on a †Hausdorff space X, that is, for every $\gamma \in \Gamma$ and
$x \in X$, an element $\gamma x$ of X is assigned in such a way that the mapping $x \to \gamma x$ is a
homeomorphism of X onto itself and that we have $\gamma_1(\gamma_2 x) = (\gamma_1\gamma_2)x$, $1x = x$, where 1 is the identity
element of $\Gamma$. Two points $x, y \in X$ are said to be $\Gamma$-equivalent if there exists a $\gamma \in \Gamma$ such
that $y = \gamma x$. ($\Gamma$-equivalence for subsets of X is defined similarly.)
We consider the following conditions of discontinuity of $\Gamma$. (i) For every $x \in X$ and any
infinite sequence $\{\gamma_n\}$ consisting of distinct elements of $\Gamma$, the sequence $\{\gamma_n x\}$ has no
†cluster point in X. (ii) For every $x \in X$, there exists a neighborhood $U_x$ such that
$\gamma U_x \cap U_x = \emptyset$ for all but finitely many $\gamma \in \Gamma$. (ii') If $x, y \in X$ are not $\Gamma$-equivalent, there exist
neighborhoods $U_x$, $U_y$ of x, y, respectively, such that $\gamma U_x \cap U_y = \emptyset$ for all $\gamma \in \Gamma$. (iii) For any
compact subset M of X, $\gamma M \cap M = \emptyset$ for all but finitely many $\gamma \in \Gamma$.
It is easy to see that (ii) ⇒ (i), (ii) + (ii') ⇒ (iii); if, moreover, X is †locally compact, we also
have (iii) ⇒ (ii), (ii'). When (i) holds, $\Gamma$ is called a discontinuous transformation group of X,
and when (ii) holds, $\Gamma$ is called a properly discontinuous transformation group. In particular,
when X can be identified with a †homogeneous space G/K of a locally compact group
G by a compact subgroup K, the conditions (i), (ii), and (iii) for a subgroup $\Gamma$ of G are all
equivalent, and they are also equivalent to the condition that $\Gamma$ is a †discrete subgroup of G.
For a discontinuous group $\Gamma$ acting on X, the †stabilizer $\Gamma_x = \{\gamma \in \Gamma \mid \gamma x = x\}$ of $x \in X$ is
always a finite subgroup. When $\Gamma_x = \{1\}$ for all $x \in X$, $\Gamma$ is said to be free (or to act freely
on X). If $\Gamma_X = \bigcap_{x \in X} \Gamma_x = \{1\}$, $\Gamma$ is said to act †effectively on X. A point $x \in X$ is called a fixed
point of $\Gamma$ if $\Gamma_x \ne \Gamma_X$. In the following, we assume for simplicity that $\Gamma$ acts effectively on
X, unless otherwise specified.
Since $\Gamma$-equivalence is clearly an †equivalence relation, we can decompose X into
$\Gamma$-equivalence classes, or $\Gamma$-†orbits. The space of all $\Gamma$-orbits, called the quotient space of X by
$\Gamma$, is denoted by $\Gamma\backslash X$. When $\Gamma$ satisfies the conditions (ii) and (ii'), the space $\Gamma\backslash X$ becomes
a †Hausdorff space with respect to the topology of the quotient space. If, moreover, $\Gamma$ is
free, X is an (unramified) †covering space of $\Gamma\backslash X$ with the †covering transformation group
$\Gamma$. (Conversely, a covering transformation group is always a free, properly discontinuous
transformation group.) In general, X may be viewed as a covering space of $\Gamma\backslash X$ with
ramifications, and the ramifying points (in X) are nothing but the fixed points of $\Gamma$.

B. Fundamental Regions

A complete set of representatives F of $\Gamma\backslash X$ in X (that is, a subset F of X such that $\Gamma F = X$,
$\gamma F \cap F = \emptyset$ for $\gamma \in \Gamma$, $\gamma \ne 1$) is called a fundamental region of $\Gamma$ in X if it further satisfies
suitable topological or geometrical requirements. Here we assume that $\bar{F}$, the closure
of F, is the closure of its interior $F^i$. (In this case, $\bar{F}$ or $F^i$ is sometimes called the
fundamental region of $\Gamma$ instead of F itself.) Such a fundamental region exists if $\Gamma$ satisfies the
conditions (ii), (ii’), and the set of fixed points which cases we may put E= i4 (resp. & or c6).
is tnowhere dense in X (R. Baer, F. W. Levi, For the fundamental regions corresponding to
193 1). A fundamental region F is called normal these values of .s, see Fig. 1. In the cases v = 1
if the set {y F} (y E F) is locally finite, that is, if, and 2, the tautomorphic functions with respect
for every x EX, there exists a neighborhood to r are essentially given by exponential func-
Ux such that yF n U, = 0 for a11 but a finite tions and elliptic functions, respectively (-
number of y E r. If X is tconnected and F is 134 Elliptic Functions).
normal, then r is generated by the set of YE F
such that yF n F # 0. Thus it is useful to have
a fundamental region in order to lïnd a set of
generators of r and a set of tfundamental
relations for them. When X has a Yinvariant
+Bore1 measure n and F is countable, then the
measure p(F) of F is independent of the choice
of F. Hence it is legitimate to put n(r\X) =
p(F); r is called a discontinuous group of the (a) (b)
first kind (C. L. Siegel CL]) if r is a discontinu-
ous transformation group which has a normal
fundamental region F such that {y) yF f? F =
0) is finite and p(F) < cg. For instance, if X is
locally compact and F is compact (0 T\X:
compact), then F is of the first kind.
When we are concerned only with the quali- (cl (di

tative properties of r, it is sometimes conve- Fig. 1


nient to relax the conditions for a fundamental (a)v=2,~=1.(b)w~=l,cu~=i,~=i.(c)~~~=l,
region, replacing it by a fundamental (open) set ~f~,=i,,c=i’,.(d)w,=1,0,=13,~=CVh.
R of F, that is, an (open) subset R of X such
that TR = X and yR f’R = 0 for ah but a lïnite (3) X = { jz\ < 1) (unit disk) [3,10,11]. By a
number of y f r [S-9]. +Cayley transformation, the unit disk cari be
transformed to the Upper half-plane !$ = {z =
x + iy 1y > 0). Any analytic automorphism of sj
is given by a real tlinear fractional transforma-
C. The Case of a Riemann Surface
tion (Mobius transformation) z+(az+ b)(cz +
d)-’ (a, b, c, d E R, ad - bc = 1). The totality of
Let r be a discontinuous group of analytic real linear fractional transformations acts tran-
automorphisms of a tRiemann surface X. In sitively on $. Hence !$ cari be identilïed with
virtue of tuniformization theory, it is enough, the thomogeneous space G/K of G = SL(2, R)
in principle, to consider the case where X is by K = SO(2) (which is the stabilizer of the
tsimply connected. Thus we have the following point fi). Hence discontinuous groups r
three cases: of analytic automorphisms of 9 are obtained
(1) X =C U {CO} (tRiemann sphere). F is a as discrete subgroups of G. Actually, every
finite group. Since r cari also be considered as element of G detïnes an analytic automor-
a group of motions of the sphere, it is either a phism of the whole Riemann sphere, which
cyclic, tdihedral, or tregular polyhedral group leaves the real axis RU {CD} invariant. For any
ClOl. z 6 C U {CO} and a sequence {ri} consisting of
(2) X=C (complex plane). r is contained in distinct elements of F, a cluster point of the
the group of motions of the plane. The sub- sequence (y,~} in CU { co} is called a limit
group consisting of a11 parallel translations point of F. When only one or two limit points
contained in r is a tfree Abelian group of rank exist, r cari easily be transformed to one of
v < 2. If v = 0, then r is a lïnite cyclic group. the groups given in (2). Otherwise, the set f! of
When v > 0, r consists of the transformations a11 limit points of F is infinite, and either 2 =
of the following form: RU { co} or 2 is a tperfect, tnowhere dense
subset of R U { co}. When i! is inlïnite, F is
When v= 1, z+Ekz+mw k m E Zl,
called a Fuchsoid group.
When v = 2, z-+~~z+rn,w, +m,u, Since 5J has a G-invariant +Riemannian
metric ds2 = y-‘(dx’ + dy’) (called the +Pain-
(k mi E Z),
taré metric), by which Sj becomes a hyperbolic
where w, w,, w2 are nonzero complex numbers plane (+non-Euclidean plane with negative
with Im(w,/w,) > 0, and F.= +1 in general, curvature), we cari construct a fundamental
except in special cases when v = 2 and 02/wi = region F of r which is a normal polygon
i4 (resp. & or ib, where <r=exp(2rci/l)), in bounded by geodesics, that is, the arcs of cir-
cles orthogonal to the real axis. A set of gen- D. Modular Groups [ 10,131
erators of r and the fundamental relations for
them are easily obtained by observing the The group
correspondence of the equivalent sides of the
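$$\Gamma = SL(2, Z) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \;\middle|\; a, b, c, d \in Z,\ ad - bc = 1 \right\}$$
fundamental polygon. (Conversely, starting
from a normal polygon satisfying a suitable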
condition, one cari construct a discontinuous
group r having F as a fundamental region. In
this manner, we generally obtain a (nontrivial) (or the corresponding group of linear frac-
continuous family of discrete subgroups of G.) tional transformations) is called the (elliptic)
A Fuchsoid group r is finitely generated if modular group. The modular group r is a
and only if the fundamental polygon F has a Fuchsian group of the first kind acting on 5j,
fïnite number of sides, and in that case r is and its fundamental region together with the
called a Fuchsian group. (More generally, a correspondence of the equivalent sides is
shown in Fig. 2. Fig. 3 illustrates the trans-
finitely generated discontinuous group of
linear fractional transformations acting on formations under r of the fundamental tri-
a domain in the complex plane is called a angle, where r is regarded as acting on the
Kleinian group.) A Fuchsian group r is of the unit disk. From Fig. 2 we obtain the gen-
first kind if and only if L! = R U {CO}; otherwise, ) and the
it is of the second kind. It is also known that a
discontinuous group r is a Fuchsian group of fundamental relations:
the lïrst kind if and only if P(I-\@ < CO [12].
For a real point XER U {CO}, we also denote by
r, the stabilizer of x (in r). The point x is
called a (paraholic) cusp of r if rX is a free
cyclic group generated by a tparabolic trans-
There are two r-equivalence classes of elliptic
formation ( # t-1). Cusps of r are represented
points of r, which are represented by c4 = i
by vertices of the fundamental polygon on the
and c3, with [ri:{ *1,}1=2, &:{S)]=3;
real axis. On the other hand, if a tïxed point z
and only one IY-equivalence class of parabolic
of r lies in sj, then the stabilizer r, is always a
cusps, which coincides with Q U { co}. The
fïnite cyclic group generated by an telliptic
corresponding Riemann surface ‘91r is analyti-
transformation. Hence such a point z is also
cally equivalent to the Riemann sphere.
called an elliptic point of r. For a Fuchsian
group r of the fïrst kind, let {zi, , zS} be
a complete set of representatives of the r-
equivalence classes of the elliptic points of r
(which cari also be chosen from among the
vertices of the fundamental polygon), and let e,
be the order of T,?; furthermore, let t be the
number of the r-equivalence classes of para-
bolic cusps of r. Then the quotient space T\$j
cari be compactified by adjoining t points at
infïnity, and the resulting space becomes a
compact Riemann surface !Rr if we detïne an Fig. 2 Fig. 3
analytic structure on it in a suitable manner.
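The statement that the elliptic points $\zeta_4 = i$ and $\zeta_3 = \exp(2\pi i/3)$ of the modular group have stabilizers of index 2 and 3 over $\{\pm 1_2\}$ can be checked with a short Python sketch. The matrices named `S` and `U` below are one possible choice of elements of $SL(2, Z)$ fixing these points; the names and the choice are assumptions for illustration and need not coincide with the generators read off from Fig. 2.

```python
import cmath

def mobius(m, z):
    # Linear fractional transformation z -> (a z + b) / (c z + d).
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)

def matmul(m, n):
    # Product of two 2x2 integer matrices stored as nested tuples.
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

IDENTITY = ((1, 0), (0, 1))
MINUS_IDENTITY = ((-1, 0), (0, -1))

def order_mod_sign(m, max_power=12):
    # Smallest k >= 1 with m**k equal to +1_2 or -1_2, or None if none is found.
    power = m
    for k in range(1, max_power + 1):
        if power in (IDENTITY, MINUS_IDENTITY):
            return k
        power = matmul(power, m)
    return None

S = ((0, -1), (1, 0))   # assumed element fixing zeta_4 = i
U = ((0, -1), (1, 1))   # assumed element fixing zeta_3 = exp(2*pi*i/3)

zeta4 = 1j
zeta3 = cmath.exp(2j * cmath.pi / 3)

fixes_zeta4 = abs(mobius(S, zeta4) - zeta4) < 1e-12
fixes_zeta3 = abs(mobius(U, zeta3) - zeta3) < 1e-12
```

Here `order_mod_sign(S)` and `order_mod_sign(U)` give the orders of the two elements as transformations of the upper half-plane, that is, modulo $\pm 1_2$.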
The area $\mu(\mathfrak{R}_\Gamma)$ measured by the Poincaré metric is given by the †Gauss-Bonnet formula:

$$\mu(\mathfrak{R}_\Gamma) = \iint_F \frac{dx\,dy}{y^2} = 2\pi\left(2g - 2 + t + \sum_{i=1}^{s}\left(1 - \frac{1}{e_i}\right)\right),$$

where $g$ is the †genus of the Riemann surface $\mathfrak{R}_\Gamma$. It is known that there exists a lower
bound ($= \pi/21$) for $\mu(\mathfrak{R}_\Gamma)$ [12]. Automorphic functions (or Fuchsian functions) with respect
to a Fuchsian group $\Gamma$, which are essentially the same thing as algebraic functions on the
Riemann surface $\mathfrak{R}_\Gamma$, have been objects of extensive study since Poincaré (1882).

For a positive integer N, the totality $\Gamma(N)$ of elements in $\Gamma$ satisfying the condition

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \equiv \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \pmod{N}$$

forms a normal subgroup of $\Gamma$, called a principal congruence subgroup of level N. (For the
case N = 2, see Fig. 4.) In general, a subgroup $\Gamma'$ of $\Gamma$ containing $\Gamma(N)$ for some N is called a
congruence subgroup of $\Gamma$. (It is known that there actually exists a subgroup $\Gamma'$ of $\Gamma$ with a
finite index, which is not a congruence subgroup.) For $N \ge 3$, $-1_2 \notin \Gamma(N)$, so that $\Gamma(N)$ is
effective. (For N = 1, 2, we have $\Gamma(N)_{\mathfrak{H}} = \{\pm 1_2\}$.) If $N \ge 2$, $\Gamma(N)$ has no elliptic point.
The number $t(N)$ of the equivalence classes of cusps of $\Gamma(N)$ and the genus $g(N)$ of the
corresponding Rie-
mann surface $\mathfrak{R}_{\Gamma(N)}$ are given as follows:
where $X = \mathfrak{H}_n = Sp(n, R)/K$, †Siegel's upper
half-space; $\Gamma = Sp(n, Z)$, †Siegel's modular group of
$t(1) = 1$, $t(2) = 3$,
degree n), O. Blumenthal, H. Braun, and L.-K.
$t(N) = (1/2N)[\Gamma : \Gamma(N)]$ $(N \ge 3)$,
Hua, and continued by those in the German
school such as M. Koecher, H. Maass, and
$g(1) = g(2) = 0$,
others, has undergone substantial
$g(N) = 1 + ((N - 6)/24N)[\Gamma : \Gamma(N)]$ $(N \ge 3)$,
development in recent years under the influence of the
theory of algebraic groups [5, 8, 14-17] (→ 32
where $[\Gamma : \Gamma(N)] = N^3 \prod_{p \mid N}(1 - 1/p^2)$.
Automorphic Functions).
Automorphic functions with respect to $\Gamma(N)$ are
On the other hand, for a symmetric Rieman-
called tmodular functions of level N.
nian space X of negative curvature, the group
of isometries G = I(X) is a tsemisimple Lie
group of noncompact type with a finite num-
ber of connected components and with a fïnite
tenter, and X cari be identified with the homo-
geneous space of G by a maximal compact
subgroup. Therefore the study of discontinu-
ous groups of isometries of X cari be reduced
to that of discrete subgroups of a Lie group
G of this type. A typical example is the case
where X is the space of a11 real +Positive def-
inite symmetric matrices of degree n with
determinant 1; this space cari be identifîed with
the quotient space SL(n, R)/SO(n) (A E SL(n, R)
Fig. 4 acts on X by X 3 S+‘ASA). The unimodular
A fundamental region of r(2) that consists of six group r = SL(n, Z) is a discontinuous group of
fundamental regions of r(1). the first kind acting on this space X, and a
method of constructing a fundamental region
E. The Case of Many Variables of r in X is provided by the Minkowski reduc-
tion theory [6,7].
Up to the present time, discontinuous groups
r and the corresponding automorphic func-
tions have been studied only in the following F. Discrete Subgroups of a Semisimple Lie
cases: (2’) X=C”, r2 Z2” (the free Abelian Group
group of rank 2n, consisting of parallel trans-
lations) [14] (- 3 Abelian Varieties); (3’) X is Two subgroups r, r’ of a group G are called
a bounded domain in C” and r is a discon- commensurable if r f’ r’ is of fïnite index in
tinuous group of analytic automorphisms of both r and Y. For a real tlinear algebraic
X. (In this case, conditions (i), (ii), (iii) are group G c GL(n, R) defined over Q, a subgroup
equivalent.) r commensurable with Gz = G f’ GL(n, Z) is
In the case (3’), the group a(X) of a11 (com- called an arithmetic subgroup (in the original
plex) analytic automorphisms of X, endowed sense) of G (examples: SL(n, Z), Sp(n, Z)). An
with its natural (?Compact-open) topology, arithmetic subgroup r is always discrete, and
becomes a tLie group, which has r as a when G is semisimple, the quotient space T\G
discrete subgroup. When r\X is compact, is of fmite volume (P(I-\G) < CO) with respect
it is known by the theory of automorphic to an invariant measure p. Moreover, r\G is
functions (or by a theorem of Kodaira) that compact if and only if G is tQ-compact (or +Q-
r\X becomes a tprojective variety, which is a anisotropic), that is, if Go or Gz consists of
tminimal mode1 [3,14]. In particular, when X only tsemisimple elements (A. Borel, Harish-
is a tsymmetric bounded domain, i.e., when X Chandra, G. D. Mostow, and T. Tamagawa
becomes a tsymmetric Riemannian space with [6,7]); the same results remain true if G is
respect to its tBergman metric, the connected tzariski connected and has no tcharacter
component G of the identity element of G’= deiïned over Q. The proofs of these facts (and
J&‘(X) (which incidentally coincides with that the compactification of the quotient space
of the group I(X) of a11 tisometries of X) is a r\X for the noncompact case) depend on a
tsemisimple Lie group of noncompact type construction of fundamental open sets that
(i.e., without compact simple factors), and X generalizes the reduction theory of Minkowski
cari be identifïed with the homogeneous space and Siegel [5,8,15-181.
of G by a maximal compact subgroup K of G. For a connected semisimple Lie group G
The theory of discontinuous groups of this of noncompact type and a discrete subgroup
type, initiated by Siegel (especially in the case r with p(T\G) < CO,the following density

theorem holds (Borel, Ann. Math., (2) 72 (1960)): (i) For any linear representation ρ of G, the linear closure of ρ(Γ) coincides with that of ρ(G); (ii) if G is algebraic, Γ is †Zariski dense in G. Furthermore, suppose that G is a direct product of simple groups $G_i$ and the center of G is finite; Γ is called irreducible if its projection on any (proper) partial product of $\{G_i\}$ is not discrete. For instance, if G is a Q-simple algebraic group then $\Gamma = G_{\mathbf{Z}}$ is irreducible. In general, there exists a partition of the set of indices {i} such that Γ is commensurable with a direct product of irreducible discrete subgroups of the partial products corresponding to this partition, and these irreducible components are unique up to commensurability.

The method of constructing a discrete subgroup Γ of G = SL(2, R) in a geometric manner using the upper half-plane can be generalized to some extent to the construction of discrete subgroups of certain groups using hyperbolic spaces of low dimensions (E. B. Vinberg). Except for these few cases, today it is known that a discrete subgroup Γ of a semisimple Lie group G (of R-rank ≥ 2) with μ(Γ\G) < ∞ is arithmetic in a certain sense (→ Section G). This implies there are only very few discrete subgroups for a semisimple Lie group of higher rank. Actually a number of facts suggesting this result were already known in the 1960s. First, the only subgroups of SL(n, Z) (n ≥ 3), Sp(n, Z) (n ≥ 2) with finite index are congruence subgroups (H. Bass, M. Lazard, and J.-P. Serre; this result has been generalized to the case of an arbitrary †Chevalley group over an algebraic number field by C. C. Moore and H. Matsumoto). Second, $G_{\mathbf{Z}}$ = SL(n, Z), Sp(n, Z) are maximal in G [8]. Finally, it is known (rigidity theorem) that if a connected semisimple Lie group G with a finite center does not contain a simple factor which is locally isomorphic to SL(2, R), then any discrete subgroup Γ of G with compact quotient Γ\G has no nontrivial †deformation (i.e., all deformations are obtained from inner automorphisms of G) (A. Selberg, E. Calabi and E. Vesentini, and A. Weil [19]). (This last result amounts to the vanishing of the cohomology group $H^1(\Gamma, X, \mathrm{ad})$, and in this connection an extensive study has been made by Y. Matsushima, S. Murakami, G. Shimura, and K. G. Raguanathan to determine the †Betti numbers of Γ\X, and more generally the cohomology groups of the type H(Γ, X, ρ) with an arbitrary representation ρ of G. These cohomology groups are closely related to automorphic forms with respect to Γ [8, 20].) For a †nilpotent or †solvable Lie group, a general method of constructing discrete subgroups is known (see, for example, M. Saito, Amer. J. Math., 83 (1961)).

G. Rigidity and Arithmeticity

A discrete subgroup Γ of a Lie group G with μ(Γ\G) < ∞ is usually called a lattice of G. A lattice Γ of G is said to be uniform if Γ\G is compact. By a theorem of Borel and Harish-Chandra [7], an arithmetic subgroup of a real linear group (defined over Q) is a lattice. Using this result, Borel showed further, by a constructive method, that any semisimple Lie group has a lattice, especially a uniform one (Borel, Topology, 2 (1963)).

For a long time there were no known examples of nonarithmetic irreducible lattices in semisimple Lie groups other than those locally isomorphic to $SL_2(\mathbf{R})$. This naturally led to Selberg's conjecture that any irreducible (nonuniform) lattice in a semisimple Lie group G not locally isomorphic to $SL_2(\mathbf{R})$ is arithmetic (Selberg, International Colloquium on Function Theory, Bombay, 1960). The conjecture seemed to be well-grounded by the rigidity theorem of Weil and Selberg [19]. However, in 1966-1967, V. S. Makarov and Vinberg constructed nonarithmetic nonuniform lattices in SO(n, 1) (n = 3, 4, 5) by a geometric method; the lattices are generated by †reflections [21]. Thus rigidity and arithmeticity do not necessarily coincide, and the conjecture should be considered under stronger conditions.

As for rigidity of uniform lattices, Mostow established in 1970 the following strong rigidity theorem [22]: If G, G′ are semisimple Lie groups with trivial center and without compact factors, and are not locally isomorphic to $SL_2(\mathbf{R})$, and if Γ, Γ′ are irreducible uniform lattices, then any isomorphism θ: Γ → Γ′ extends to an analytic isomorphism $\bar\theta$: G → G′ (namely, $\bar\theta|_\Gamma = \theta$). The previous rigidity theorem is now implied by Mostow's. If X is a simply connected symmetric space, then a †locally symmetric space M covered by X is expressed as a quotient of X by a fixed-point-free properly discontinuous group Γ in the group I(X) of total isometries (when the Lie algebra of I(X) does not have a compact factor, this condition for Γ is equivalent to saying that it is a torsion-free discrete subgroup of the semisimple Lie group I(X)), and the fundamental group $\pi_1(M)$ of M = Γ\X is isomorphic to Γ. The strong rigidity theorem implies that compact locally symmetric spaces $M_1$ and $M_2$ (of higher dimensions) are isomorphic as Riemannian spaces if and only if $\pi_1(M_1)$ and $\pi_1(M_2)$ are isomorphic as abstract groups.

On the other hand, in 1973, G. A. Margulis proved the arithmeticity of irreducible nonuniform lattices in semisimple real linear groups of R-rank greater than 1 (Russian Math. Surveys, 29 (1974) (original in Russian,

1974); Functional Anal. Appl. 9 (1975) (original in Russian, 1975)). M. S. Raghunathan also proved independently the same fact under a slightly stronger condition. Their results, together with the rigidity of nonuniform lattices in (higher-dimensional) semisimple Lie groups of R-rank 1 established by H. Garland, Raghunathan, and G. Prasad (Inventiones Math., 21 (1973)), imply that the strong rigidity theorem holds similarly for nonuniform lattices.

The results of Margulis and Raghunathan show that the Selberg conjecture in the original sense is affirmative for the case of R-rank greater than 1. But neither argument is applicable to uniform lattices, for they depend deeply on the fact (proved by D. A. Kazdan and Margulis) that a nonuniform lattice contains a nonidentity unipotent element. Previously, in a lecture at the international congress of mathematicians at Moscow, 1966, I. I. Pyatetskii-Shapiro generalized the definition of arithmeticity and suggested that arithmeticity of lattices should be investigated without the distinction of whether they are uniform or nonuniform. His definition is equivalent to the following [9, 24]: For a connected semisimple algebraic group G defined over R, a subgroup $\Gamma \subset G = G_{\mathbf{R}}$ is an arithmetic subgroup (of G) if there is an algebraic group H defined over Q and a surjective homomorphism φ: H → Ad G defined over R such that the Lie group $(\mathrm{Ker}\,\varphi)_{\mathbf{R}}$ is compact and that $\varphi(H_{\mathbf{Z}})$ and Ad Γ are commensurable. The uniform lattice in G that is constructed by the method of Borel is arithmetic in this sense. In 1974, Margulis finally established the following arithmeticity theorem [23, 24]: If the R-rank of G is not less than 2, an irreducible lattice Γ in G is an arithmetic subgroup of G (even if it is uniform). In the same lecture, Pyatetskii-Shapiro also extended the Selberg conjecture to such "semisimple Lie groups" as those containing p-adic Lie groups as factors. Margulis proved this Pyatetskii-Shapiro conjecture affirmatively by showing that an analog of the strong rigidity theorem holds for such groups.

As for semisimple Lie groups of R-rank 1, besides the lattices constructed by Makarov and Vinberg there are only the few examples of nonarithmetic lattices in SU(2, 1) presented by Mostow (Proc. Nat. Acad. Sci. US, 75 (1978); Pacific J. Math., 86 (1980)). The problem of arithmeticity still remains open for groups locally isomorphic to SO(n, 1) (n ≥ 6), SU(n, 1) (n ≥ 3), Sp(n, 1), or $F_4$.

H. Geometric Discontinuous Groups [1, 25]

The study of discontinuous groups Γ acting on a Euclidean or projective space X as a transformation group of a given structure is a classical problem. All possibilities for such Γ have been enumerated in low-dimensional cases. For instance, there are 230 kinds of discontinuous groups of †Euclidean motions acting on 3-dimensional Euclidean space without fixed subspaces, which are classified into 32 †crystal classes (A. Schönflies and E. S. Fedorov, 1891-1892; → 92 Crystallographic Groups). All discontinuous groups of a Euclidean space generated by †reflections have also been enumerated (H. S. M. Coxeter, 1934 [25, 26]).

I. Kleinian Groups

The last decade has seen considerable research on (finitely generated) Kleinian groups. This research is closely related to the theory of †quasiconformal mappings and the †moduli of Riemann surfaces.

Making use of Eichler cohomology and †potentials, L. V. Ahlfors established his finiteness theorem and L. Bers his area theorem. Bers and B. Maskit investigated the boundaries of †Teichmüller spaces and discovered Kleinian groups with the property that the complement of the set Λ of limit points is connected and simply connected.

Numerous mathematicians have subsequently discussed the classification, deformation, and stability properties of the set Λ, uniformization and deformation of Riemann surfaces with or without nodes, and other geometric properties. In their discussions, the theory of quasiconformal mappings has played an important role. The discontinuous groups of motions of hyperbolic 3-space have also been studied.

References

[1] B. L. van der Waerden, Gruppen von linearen Transformationen, Erg. Math., Springer, 1935 (Chelsea, 1948).
[2] C. L. Siegel, Discontinuous groups, Ann. Math., (2) 44 (1943), 674-689 (Gesammelte Abhandlungen, Springer, 1966, vol. 2, 390-405).
[3] Sém. H. Cartan 6, Fonctions automorphes et espaces analytiques, Ecole Norm. Sup., 1953-1954.
[4] G. Shimura, Introduction to the arithmetic theory of automorphic functions, Publ. Math. Soc. Japan, 11 (1971).
[5] Sém. H. Cartan 10, Fonctions automorphes, Ecole Norm. Sup., 1957-1958.
[6] A. Weil, Discontinuous subgroups of classical groups, Lecture notes, Univ. of Chicago, 1958.
[7] A. Borel and Harish-Chandra, Arithmetic subgroups of algebraic groups, Ann. Math., (2) 75 (1962), 485-535.
123 A 466
Distribution of Prime Numbers

[8] Algebraic groups and discontinuous subgroups, Amer. Math. Soc. Proc. Symp. Pure Math., 1966, vol. 9.
[9] M. S. Raghunathan, Discrete subgroups of Lie groups, Erg. Math., Springer, 1972.
[10] R. Fricke and F. Klein, Vorlesungen über die Theorie der automorphen Funktionen I, Teubner, 1897, second edition, 1926.
[11] H. Poincaré, Théorie des groupes fuchsiens, Acta Math., 1 (1882), 1-62 (Oeuvres, Gauthier-Villars, 1916, vol. 2, 108-168).
[12] C. L. Siegel, Some remarks on discontinuous groups, Ann. Math., (2) 46 (1945), 708-718 (Gesammelte Abhandlungen, Springer, 1966, vol. 3, 67-77).
[13] F. Klein and R. Fricke, Vorlesungen über die Theorie der elliptischen Modulfunktionen I, Teubner, 1890.
[14] C. L. Siegel, Analytic functions of several complex variables, Lecture notes, Institute for Advanced Study, Princeton, 1948-1949.
[15] C. L. Siegel, Symplectic geometry, Amer. J. Math., 65 (1943), 1-86 (Gesammelte Abhandlungen, Springer, 1966, vol. 2, 274-359; Academic Press, 1964).
[16] I. Satake, On compactifications of the quotient spaces for arithmetically defined discontinuous groups, Ann. Math., (2) 72 (1960), 555-580.
[17] W. L. Baily and A. Borel, Compactifications of arithmetic quotients of bounded symmetric domains, Ann. Math., (2) 84 (1966), 442-528.
[18] H. Minkowski, Diskontinuitätsbereich für arithmetische Äquivalenz, J. Reine Angew. Math., 129 (1905), 220-274 (Gesammelte Abhandlungen, Teubner, 1911, vol. 2, 53-100; Chelsea, 1967).
[19] A. Weil, On discrete subgroups of Lie groups, Ann. Math., (2) 72 (1960), 369-384; II, ibid., 75 (1962), 578-602.
[20] Y. Matsushima and S. Murakami, On vector bundle valued harmonic forms and automorphic forms on symmetric Riemannian manifolds, Ann. Math., (2) 78 (1963), 365-416.
[21] E. B. Vinberg, Discrete groups generated by reflections in Lobačevskii spaces, Math. USSR-Sb., 1 (1967), 429-444. (Original in Russian, 1967.)
[22] G. D. Mostow, Strong rigidity of locally symmetric spaces, Ann. Math. Studies 78, Princeton Univ. Press, 1973.
[23] G. A. Margulis, Discrete groups of motions of manifolds of nonpositive curvature, Amer. Math. Soc. Transl., (2) 109 (1977), 33-45. (Original in Russian, 1974.)
[24] G. A. Margulis, Arithmetic irreducible lattices in semisimple groups of rank greater than 1 (in Russian), Mir, 1977. (Appendix to the Russian translation of [9].)
[25] H. S. M. Coxeter and W. O. J. Moser, Generators and relations for discrete groups, Erg. Math., Springer, 1957.
[26] N. Bourbaki, Eléments de mathématique, Groupes et algèbres de Lie, ch. 4, 5, 6, Actualités Sci. Ind., 1377, Hermann, 1968.
[27] L. Bers and I. Kra, A crash course on Kleinian groups, Lecture notes in math. 400, Springer, 1974.

123 (V.7)
Distribution of Prime Numbers

A. General Remarks

Given a real number x, we denote by π(x) the number of primes not exceeding x. A. M. Legendre (1808) obtained empirically the formula π(x) ≈ x/(log x − B) for some constant B, and C. F. Gauss (1849) obtained the formula

\[
\pi(x) \approx \int_2^x \frac{du}{\log u},
\]

assuming the average density of primes to be 1/log x. The Bertrand conjecture, which asserts the existence of at least one prime between x and 2x, was proved by P. L. Chebyshev (1848), who introduced the functions

\[
\theta(x) = \sum_{p \le x} \log p
\]

and

\[
\psi(x) = \sum_{p^m \le x} \log p = \theta(x) + \theta(\sqrt{x}) + \theta(\sqrt[3]{x}) + \cdots.
\]

(In this section, p represents a prime number.) He thereby proved

where $A = \log(2^{1/2} 3^{1/3} 5^{1/5} 30^{-1/30})$. G. F. B. Riemann (1858) considered the function ζ(s) (where s = σ + it is a complex variable), expressed by the †Dirichlet series $\sum_{n=1}^{\infty} n^{-s}$, which is convergent for σ > 1. He found relations between the zeros of ζ(s) (→ 450 Zeta Functions) and π(x). F. Mertens (1874) obtained the formulas

\[
\sum_{p \le x} \frac{1}{p} = \log\log x + B + O\Bigl(\frac{1}{\log x}\Bigr), \qquad
\prod_{p \le x} \Bigl(1 - \frac{1}{p}\Bigr) = \frac{e^{-C}}{\log x}\Bigl(1 + O\Bigl(\frac{1}{\log x}\Bigr)\Bigr),
\]

where C is the Euler constant and B is some constant.
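The relation between Chebyshev's two functions in Section A can be checked numerically. The following is a minimal sketch, not part of the original article: the helper names (`primes_up_to`, `theta`, `psi`, `psi_via_theta`) are hypothetical, and the functions are evaluated at integer arguments only. It compares ψ(x), computed directly from prime powers, with the sum θ(x) + θ(x^{1/2}) + θ(x^{1/3}) + ⋯.

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes p <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            # Cross out the multiples i*i, i*i + i, ... of the prime i.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def theta(x, primes):
    """theta(x): sum of log p over primes p <= x."""
    return sum(math.log(p) for p in primes if p <= x)


def psi(x, primes):
    """psi(x): sum of log p over all prime powers p**m <= x."""
    total = 0.0
    for p in primes:
        power = p
        while power <= x:
            total += math.log(p)
            power *= p
    return total


def integer_root(x, k):
    """Largest integer r with r**k <= x (x a positive integer)."""
    r = round(x ** (1.0 / k))
    while r ** k > x:
        r -= 1
    while (r + 1) ** k <= x:
        r += 1
    return r


def psi_via_theta(x, primes):
    """theta(x) + theta(x**(1/2)) + theta(x**(1/3)) + ... (finitely many nonzero terms)."""
    total, k = 0.0, 1
    while 2 ** k <= x:
        total += theta(integer_root(x, k), primes)
        k += 1
    return total


if __name__ == "__main__":
    ps = primes_up_to(1000)
    print(theta(1000, ps), psi(1000, ps), psi_via_theta(1000, ps))
```

Since exp(ψ(x)) is the least common multiple of the integers up to x, `psi(10, ps)` should agree with log 2520 up to rounding, which gives a quick sanity check on the implementation.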
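The empirical formulas of Legendre and Gauss quoted in Section A can be compared with the true prime count on a small range. This is an illustrative sketch under stated assumptions: the value B = 1.08366 is Legendre's historical constant and is not given in the text above, the integral is approximated by Simpson's rule after the substitution u = e^t, and all function names are hypothetical.

```python
import math


def prime_count(n):
    """pi(n): number of primes p <= n, by the sieve of Eratosthenes."""
    if n < 2:
        return 0
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return sum(sieve)


def gauss_integral(x, steps=20000):
    """Approximate the integral of du/log u from 2 to x.

    With u = exp(t) the integrand becomes exp(t)/t on [log 2, log x];
    composite Simpson's rule is applied with an even number of steps.
    """
    a, b = math.log(2.0), math.log(x)
    h = (b - a) / steps
    f = lambda t: math.exp(t) / t
    total = f(a) + f(b)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3


def legendre_estimate(x, B=1.08366):
    """Legendre's x / (log x - B); B = 1.08366 is an assumed historical value."""
    return x / (math.log(x) - B)


if __name__ == "__main__":
    x = 10**6
    print(prime_count(x), gauss_integral(x), legendre_estimate(x))
```

At x = 10^6 both approximations overshoot the true count by well under one percent, Gauss's integral by roughly 130 and Legendre's formula by roughly 45.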

B. Prime Number Theorem

The prime number theorem

\[
\lim_{x \to \infty} \frac{\pi(x)\log x}{x} = 1, \quad \text{or} \quad \pi(x) \sim \frac{x}{\log x},
\]

was proved almost simultaneously (1896) by J. Hadamard and C. J. de La Vallée-Poussin. Without using the theory of †entire functions, E. Landau (1908) established the formula

\[
\pi(x) = \operatorname{Li} x + O\bigl(x e^{-c\sqrt{\log x}}\bigr),
\]

where

\[
\operatorname{Li} x = \lim_{\delta \to +0} \Bigl( \int_0^{1-\delta} + \int_{1+\delta}^{x} \Bigr) \frac{du}{\log u}
\]

is the †logarithmic integral. It can be shown by integration by parts that

\[
\operatorname{Li} x = \frac{x}{\log x} + \frac{1!\,x}{\log^2 x} + \cdots + \frac{(k-1)!\,x}{\log^k x} + O\Bigl(\frac{x}{\log^{k+1} x}\Bigr).
\]

For example, by taking x = 10^7, we get π(x) = 664,579, Li x ≈ 664,918, and x/log x ≈ 620,421.

If the Dirichlet series $f(s) = \sum_{n=1}^{\infty} a_n n^{-s}$ satisfies the condition $\sum_{n \le x} a_n \sim cx$, then its abscissa of convergence is 1, and we have $\lim_{s \to 1+0} (s - 1) f(s) = c$. The converse is known as the †Tauberian theorem. If $F(s) = \sum_{n=1}^{\infty} a_n n^{-s}$ ($a_n \ge 0$) converges absolutely for σ > 1 and F(s) − c/(s − 1) is analytic for σ ≥ 1, then we obtain $\sum_{n \le x} a_n \sim cx$ (Wiener-Ikehara-Landau theorem, 1932). Specifically, if we put $-\zeta'(s)/\zeta(s) = \sum_{n=1}^{\infty} \Lambda(n) n^{-s}$, then the conditions of the theorem are satisfied, and we obtain $\sum_{n \le x} \Lambda(n) = \psi(x) \sim x$.

It is easily seen that the prime number theorem is equivalent to ψ(x) ~ x or θ(x) ~ x. The †number-theoretic function Λ(n) (Mangoldt's function) introduced above satisfies $\sum_{d \mid n} \Lambda(d) = \log n$. It follows from the †Möbius inversion formula that $\Lambda(n) = \sum_{d \mid n} \mu(d) \log(n/d)$. Hence Λ(n) = log p ($n = p^m$) and = 0 otherwise. Thus we obtain $\psi(x) = \sum_{n \le x} \Lambda(n) = O(x)$. When f(x) = θ(x) or ψ(x), it is easy to show that $\int_2^x (f(t)/t^2)\,dt = \log x + O(1)$ and $\liminf_{x \to \infty} f(x)/x \le 1 \le \limsup_{x \to \infty} f(x)/x$. However, it is not easy to prove f(x) ~ x. To do so, introduce the number-theoretic function M(n), which satisfies $\sum_{d \mid n} M(d) = \log^2 n$. As before, we have $M(n) = \sum_{d \mid n} \mu(d) \log^2(n/d)$; hence

\[
M(n) =
\begin{cases}
(2l - 1)\log^2 p & (n = p^l,\ l \ge 1), \\
2 \log p \log q & (n = p^l q^m,\ l \ge 1,\ m \ge 1), \\
0 & (\text{otherwise}).
\end{cases}
\]

Thus we obtain $\sum_{n \le x} M(n) = 2x\log x + O(x)$. This leads to A. Selberg's well-known formula (1949):

\[
\theta(x)\log x + \sum_{p \le x} \theta(x/p)\log p = 2x\log x + O(x),
\]

which enabled him to prove θ(x) ~ x. Thus he obtained for the first time a proof of the prime number theorem that does not use complex analytic methods. The simple formulas $\sum_{n=1}^{\infty} \mu(n)/n = 0$ and $\sum_{n \le x} \mu(n) = o(x)$, obtained by H. von Mangoldt (1897), were revealed by Landau to have a deep meaning concerning the prime number theorem. Let $\pi_r(x)$ denote the number of integers not exceeding x that can be expressed as the product of r distinct primes. In generalizing the prime number theorem, Landau (1911) proved that

\[
\pi_r(x) \sim \frac{1}{(r-1)!} \, \frac{x (\log\log x)^{r-1}}{\log x}.
\]

Let us write $\vartheta(x) = \sum_{n=1}^{\infty} e^{-\pi x n^2}$. Riemann proved that

\[
\pi^{-s/2}\,\Gamma(s/2)\,\zeta(s) = \frac{1}{s(s-1)} + \int_1^{\infty} \vartheta(x)\bigl(x^{s/2-1} + x^{-s/2-1/2}\bigr)\,dx
\]

and obtained the well-known functional equation for the zeta function (→ 450 Zeta Functions B)

\[
\pi^{-s/2}\,\Gamma(s/2)\,\zeta(s) = \pi^{-1/2+s/2}\,\Gamma(1/2 - s/2)\,\zeta(1 - s).
\]

This enables us to extend ζ(s) as a meromorphic function to the whole complex plane. Utilizing this extended ζ(s) and the following result of O. Perron on Dirichlet series, we can estimate ψ(x). Let $\sigma_0$ (≠ ∞) be the abscissa of convergence of $F(s) = \sum_{n=1}^{\infty} f(n) n^{-s}$, and let a > 0, a > $\sigma_0$, and x > 0. If

\[
\lim_{T \to \infty} \frac{1}{2\pi i} \int_{a-iT}^{a+iT} F(s)\,\frac{x^s}{s}\,ds
\]

exists, then the limit is equal to $\sum'_{n \le x} f(n)$, where Σ′ means that in the summation the last term f(x) is replaced by f(x)/2 if x is an integer. In many cases, F(s) has a pole at s = 1, and the principal part of the sum is obtained from the residue at s = 1, whereas the residual part is given by a certain contour integral. To estimate ψ(x), we use −ζ′(s)/ζ(s) as F(s); hence the problem arises of determining the zeros of ζ(s). Riemann conjectured that all the zeros of ζ(s) in the strip 0 < σ < 1 must be situated on the vertical line σ = 1/2. If this so-called †Riemann hypothesis (→ 450 Zeta Functions) is true, then it follows that $\pi(x) = \operatorname{Li} x + O(\sqrt{x}\log x)$. The ultimate validity of Riemann's hypothesis remains in doubt.

Concerning this, the most recent major result is the following formula, obtained by I. M. Vinogradov (1958): π(x) = Li x +

$O\bigl(x \exp(-c \log^{3/5} x / (\log\log x)^{1/5})\bigr)$. Without using Riemann's hypothesis, J. E. Littlewood (1918) proved that

\[
\limsup_{x \to \infty} \frac{\pi(x) - \operatorname{Li} x}{(\sqrt{x}/\log x)\log\log\log x} > 0, \qquad
\liminf_{x \to \infty} \frac{\pi(x) - \operatorname{Li} x}{(\sqrt{x}/\log x)\log\log\log x} < 0.
\]

If we denote by N(T) the number of zeros of ζ(s) in the domain 0 < σ < 1, 0 < t < T, then we have

\[
N(T) = \frac{1}{2\pi} T\log T - \frac{1 + \log 2\pi}{2\pi} T + O(\log T).
\]

Let $N_0(T)$ denote the number of zeros of ζ(s) on the interval σ = 1/2, 0 < t < T. Selberg (1942) obtained the impressive result

\[
N_0(T) > cT\log T.
\]

E. C. Titchmarsh (1936) showed that there exist 1041 zeros of ζ(s) in the domain 0 < σ < 1, 0 < t < 1468 and that all lie on the line σ = 1/2. Computers have provided further results that seem to justify the Riemann hypothesis. For example, it has been calculated that there are 75,000,000 zeros of ζ(s) in the domain 0 < σ < 1, 0 < t < 32,585,736.4 and that all are simple zeros and lie on the line σ = 1/2 (R. P. Brent, Math. Comp., 33 (1979)). N. Levinson proved in 1974 by another method that at least one-third of the zeros of the Riemann zeta function are on the line σ = 1/2. The minimum of the modulus of the imaginary part of the zeros with σ = 1/2 is t = 14.13....

C. Twin Primes

Let $p_n$ be the nth prime. We know from the prime number theorem that $p_n \sim n\log n$; more precisely, $p_n = n\log n + n\log\log n + O(n)$. A pair of primes differing only by 2 are called twin primes. It is still unknown whether there exist infinitely many twin primes. There exist infinitely many n satisfying $p_{n+1} - p_n < c\log p_n$ (P. Erdős, 1940). Suppose that $\zeta(1/2 + it) = O(|t|^c)$. A. E. Ingham (1937) proved that

\[
p_{n+1} - p_n < p_n^{\theta}, \qquad \theta = \frac{1 + 4c}{2 + 4c} + \varepsilon,
\]

by using the following density theorem related to the zeros of ζ(s): If we denote by N(α, T) the number of zeros of ζ(s) in the domain α ≤ σ ≤ 1, 0 < t ≤ T (1/2 ≤ α ≤ 1), then there is a positive constant c such that $N(\alpha, T) = O(T^{2(1+2c)(1-\alpha)}\log^5 T)$. The Lindelöf hypothesis asserts that the constant c can be made arbitrarily small. If the Riemann hypothesis holds, then the Lindelöf hypothesis also holds. It is clear that we can substitute 1/6 + ε (ε > 0) for c, ε being arbitrarily small. W. Haneke (1963) showed that c can be replaced by 6/37 + ε, and consequently that θ ≤ 61/98 + ε. D. R. Heath-Brown and H. Iwaniec (1979) proved that θ ≤ 11/20 + ε by using the †sieve method and the zero density theorem of L-functions (Section E). R. A. Rankin (1935) proved that

\[
p_{n+1} - p_n > c\,\log p_n \,(\log\log p_n)(\log\log\log\log p_n)(\log\log\log p_n)^{-2}
\]

holds for infinitely many n. If we denote by $\pi_2(x)$ the number of primes p ≤ x such that p + 2 is also a prime, then it has been conjectured by Hardy and Littlewood (Acta Math., 44 (1922)) that

\[
\pi_2(x) \sim C \int_2^x \frac{du}{\log^2 u}
\]

as x → ∞, where

\[
C = 2 \prod_{p > 2} \Bigl(1 - \frac{1}{(p-1)^2}\Bigr).
\]

The numerical evidence provided by the computation $\pi_2(10^9) = 3424506$ (Brent, Math. Comp., 28 (1974)) tends to indicate the truth of this conjecture. At present, $76 \cdot 3^{139} \pm 1$ seems to be the largest known pair of twin prime numbers (H. C. Williams and C. R. Zarnke, Math. Comp., 26 (1972)).

D. Prime Numbers in Arithmetic Progressions

Let k be a positive integer, χ(n) be a residue character modulo k (→ 295 Number-Theoretic Functions D), and $L(s, \chi) = \sum_{n=1}^{\infty} \chi(n) n^{-s}$ (σ > 1) be the †Dirichlet L-function. The function L(s, χ) of s defined by this series can be extended to an analytic function in the whole complex plane in the same way as the Riemann zeta function. In particular, when χ is the principal character, then L(s, χ), thus extended, is a meromorphic function whose only pole is situated at s = 1 and is simple; otherwise the function L(s, χ) is holomorphic on C. Using this function L(s, χ) and in connection with his research concerning the †class numbers of quadratic forms, P. G. L. Dirichlet (1837) proved that there exist infinitely many primes in the arithmetical progression l, l + k, l + 2k, ..., where l is the initial term and k a common difference relatively prime to l. This result is called the Dirichlet theorem (or prime number theorem for arithmetic progressions).

Suppose that $\mathfrak{a}$ runs over all integral ideals in a †quadratic number field K of discriminant d. Then the †Dedekind zeta function $\zeta_K(s)$ of K is defined by $\sum (N\mathfrak{a})^{-s}$ for σ > 1. By virtue of the decomposition law of prime ideals (→ 347 Quadratic Fields C), we have $\zeta_K(s) = \zeta(s) L(s, \chi)$, where χ(n) = (d/n) is the †Kronecker symbol.
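The factorization ζ_K(s) = ζ(s)L(s, χ) can be read coefficient by coefficient: the number of integral ideals of norm n equals the sum of χ(m) over the divisors m of n. The sketch below checks this for one concrete case chosen for illustration, K = Q(i) with d = −4; the choice of field and all function names are assumptions, not taken from the article. Since the Gaussian integers form a principal ideal domain with four units, the ideals of norm n correspond to the solutions of a² + b² = n counted up to units.

```python
import math


def chi_minus4(m):
    """Kronecker symbol (-4/m) for m >= 1: 0 if m is even, +1 if m = 1 mod 4, -1 if m = 3 mod 4."""
    if m % 2 == 0:
        return 0
    return 1 if m % 4 == 1 else -1


def r2(n):
    """Number of integer pairs (a, b) with a*a + b*b == n."""
    count = 0
    root = math.isqrt(n)
    for a in range(-root, root + 1):
        b_squared = n - a * a
        b = math.isqrt(b_squared)
        if b * b == b_squared:
            count += 1 if b == 0 else 2  # b and -b are distinct unless b == 0
    return count


def ideal_count(n):
    """Ideals of norm n in Z[i]: generators a + bi of norm n, up to the 4 units."""
    return r2(n) // 4


def divisor_character_sum(n):
    """Coefficient of n**(-s) in zeta(s) * L(s, chi): sum of chi(m) over m dividing n."""
    return sum(chi_minus4(m) for m in range(1, n + 1) if n % m == 0)


if __name__ == "__main__":
    for n in (1, 2, 3, 5, 25, 65):
        print(n, ideal_count(n), divisor_character_sum(n))
```

Both columns agree for every n, which is the classical statement that the number of representations of n as a sum of two squares is four times the divisor sum of the character.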
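Dirichlet's theorem in Section D, in its quantitative form, says that the primes are asymptotically equidistributed among the reduced residue classes modulo k. The following sketch counts primes up to x in each class l with (k, l) = 1 and compares the counts with π(x)/φ(k); it is an illustration only, and the function names and the sample values k = 4, x = 10^5 are assumptions.

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes p <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def count_in_progressions(x, k):
    """Map each reduced residue l mod k to the number of primes p <= x with p = l mod k."""
    counts = {l: 0 for l in range(k) if math.gcd(l, k) == 1}
    for p in primes_up_to(x):
        residue = p % k
        if residue in counts:  # primes dividing k fall in non-reduced classes
            counts[residue] += 1
    return counts


if __name__ == "__main__":
    x, k = 10**5, 4
    counts = count_in_progressions(x, k)
    expected = len(primes_up_to(x)) / len(counts)  # pi(x) / phi(k)
    for l, c in sorted(counts.items()):
        print(f"l = {l}: {c} primes (pi(x)/phi(k) = {expected:.1f})")
```

For small moduli the class counts stay close to π(x)/φ(k); how the error depends on k is exactly the question addressed by the Page-Siegel-Walfisz and Brun-Titchmarsh theorems discussed in this section.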

Utilizing $\zeta_K(s)$, we obtain formulas concerning the †class number h(d) of the field K. If d > 0, then $h(d) = (\sqrt{d}/2\log\varepsilon) L(1, \chi)$, where ε is the †fundamental unit of K. On the other hand, if d < 0, then $h(d) = (w\sqrt{|d|}/2\pi) L(1, \chi)$, where w denotes the number of the roots of unity contained in K. It follows that $L(1, \chi) \ge 2\log((1 + \sqrt{5})/2)/\sqrt{|d|}$. Let χ be a character modulo k induced by a primitive character $\chi^0$. Since we have $L(s, \chi) = L(s, \chi^0) \prod_{p \mid k} (1 - \chi^0(p) p^{-s})$, it can be shown that L(1, χ) ≠ 0 for a real character χ. It is easy to prove that L(1, χ) ≠ 0 for a complex character χ. These statements then lead to the Dirichlet theorem. The proof was simplified by H. N. Shapiro (1951). Besides Landau's three proofs for L(1, χ) ≠ 0 for real character χ (1908), there are elegant proofs by T. Estermann (1952), Selberg (1949), and others. For a character χ mod k, we always have L(1, χ) = O(log k), while $L(1, \chi)^{-1} = O(\log k)$ with one possible exception, which may occur only if χ is a real character. Even in this case, we have $L(1, \chi)^{-1} = O(k^{\varepsilon})$ (where ε > 0 is arbitrary, but O depends on ε). This result was obtained by C. L. Siegel (1934) from his study concerning class numbers of imaginary quadratic number fields. His proof was simplified by Estermann (1948) and S. D. Chowla (1950). The importance of the prime number theorem for arithmetic progressions was revealed when it was applied to the Goldbach problem (→ 4 Additive Number Theory C). Concerning this problem, the manner in which the remainder term depends on the modulus k became an object of investigation. The Page-Siegel-Walfisz theorem is convenient to use: Denote by π(x; k, l) the number of primes not exceeding x and of the form ky + l, where (k, l) = 1. If $x \ge \exp(k^{\varepsilon})$ (where ε > 0 is arbitrary), then we have

\[
\pi(x; k, l) = \frac{\operatorname{Li} x}{\varphi(k)} + O\bigl(x e^{-c\sqrt{\log x}}\bigr).
\]

Further research on the distribution of zeros of L(s, χ) is necessary for the study of π(x; k, l) when x takes smaller values. If χ is a nonprincipal real character, then L(s, χ) may have at most one real zero $\beta_1$ around 1; this is called Siegel's zero. Because of this fact, when x is small we are unable to obtain any formula to indicate the uniform distribution of primes. However, we have the following deep result, obtained by E. Fogels (1962). For a given positive ε, there exist $c_0(\varepsilon)$ and c(ε) such that $\pi(x; k, l) > c(\varepsilon)\,x/(\varphi(k) k^{\varepsilon} \log x)$, provided that $x \ge k^{c_0(\varepsilon)}$. On the other hand, Titchmarsh (1930), using the †sieve method, obtained π(x; k, l) = O(x/(φ(k) log x)) for $x > k^{c_0}$. A theorem of this type is called the Brun-Titchmarsh theorem (→ Section E). Fogels's theorem is based on the following theorem by Yu. V. Linnik (1947) and K. Prachar (1957), which is an extension of Page's theorem (1935): We let δ be the function defined by $\delta = 1 - \beta_1$ if L(s, χ) has an exceptional real zero $\beta_1$, and $\delta = c_1/\log(k(|t| + 2))$ otherwise (where $c_1$ is a suitably small number). If we denote by β the real part of any zero (≠ $\beta_1$) of L(s, χ), then the theorem states that

\[
\beta < 1 - \frac{c_2}{\log(k(|t| + 2))} \log \frac{c_1 e}{\delta \log(k(|t| + 2))},
\]

provided that $\delta \log(k(|t| + 2)) \le c_1$. Linnik called this result the Deuring-Heilbronn phenomenon. Using this theorem and the zero density theorem, Linnik (1944) proved that $p(k, l) \ll k^L$, where p(k, l) is the least prime in the arithmetic progression l, l + k, l + 2k, ..., and L is a constant. We call L Linnik's constant. M. Jutila (1977) and Chen Jing-Run (1979) proved that L ≤ 80 and L ≤ 17, respectively.

Let s be a positive integer, $b_j$, $z_j$ (1 ≤ j ≤ s) be complex, and l, m be real numbers satisfying $\max |z_j| \ge 1$, l ≥ s, and m ≥ 0. Under these conditions P. Turán (1953) obtained the following fundamental theorem, which is called the power sum theorem:

\[
\max_{m \le \nu \le m + l} \bigl| b_1 z_1^{\nu} + \cdots + b_s z_s^{\nu} \bigr| \ge \cdots
\]

This theorem is effective in research on the distribution of zeros of zeta functions. Based on this new method, Turán (1961), S. Knapowski (1962), and Fogels (1965) reached the results cited above.

Another method of research, considered to be a new sieve method, on the distribution of primes was introduced by Selberg and A. I. Vinogradov. This method was followed by W. B. Jurkat and H. E. Richert (1965). Linnik, A. Rényi (1950), and E. Bombieri (1965) founded still another method, called the large sieve method, by which P. T. Bateman, Chowla, and P. Erdős studied the value of L(1, χ).

E. Sieve Method

Let A be a set of integers, and P a set of primes. For each p ∈ P, let $\Omega_p$ be a set of residues mod p, and ω(p) the number of residues belonging to $\Omega_p$. The sieve method is a device for estimating (from above or below) the number of integers n belonging to the set $S(A, P, \Omega_p) = \{n \mid n \in A,\ n \bmod p \notin \Omega_p \text{ for } p \in P\}$. The combinatorial methods of the Brun, Buchstab, and Richert sieves are interesting and efficient but quite complicated. Here, we shall briefly describe the Selberg sieve. As an example, denote by S(x; q, l) the number of n satisfying n ≡ l mod q, n ≤ x, (n, D) = 1, where q

is a prime not exceeding x, z ≤ x, $D = \prod_{p \le z} p$, and (l, q) = 1; then

\[
S(x; q, l) \le \sum_{\substack{n \le x \\ n \equiv l \ (\mathrm{mod}\ q)}} \Bigl( \sum_{d \mid (n, D)} \lambda_d \Bigr)^2,
\]

where $\lambda_1 = 1$ and $\lambda_d$ is arbitrary for d > 1. Thus the problem reduces to the optimization of $\lambda_d$. Proceeding in this manner, C. Hooley (1967) and Y. Motohashi (1975), using analytic methods, obtained certain deep results relating to the Brun-Titchmarsh theorem. Using the large sieve, H. L. Montgomery and R. C. Vaughan (1973) expressed this theorem in a precise form: π(x; q, l) < 2x/(φ(q) log(x/q)) for all q < x.

Let $n_1, n_2, \ldots, n_Z$ be Z natural numbers not exceeding N, and Z(p, a) the number of $n_j$'s such that $n_j \equiv a \bmod p$. Rényi (1950) proved that

Set $a_n = 1$ for $n = n_j$ and $a_n = 0$ otherwise, and set $S(\alpha) = \sum_{n \le N} a_n \exp(2\pi i n \alpha)$; then

\[
p \sum_{a=0}^{p-1} \Bigl( Z(p, a) - \frac{Z}{p} \Bigr)^2 = \sum_{a=1}^{p-1} \Bigl| S\Bigl(\frac{a}{p}\Bigr) \Bigr|^2.
\]

In view of this simple fact, Bombieri (1965) and P. X. Gallagher (1968) extended the problem and proved that, in general,

\[
\sum_{q \le Q} \sum_{\substack{a \bmod q \\ (a, q) = 1}} \Bigl| S\Bigl(\frac{a}{q}\Bigr) \Bigr|^2 \ll (N + Q^2) \sum_{n \le N} |a_n|^2.
\]

Similar results can be obtained for character sums $\sum_{M < n \le M+N} a_n \chi(n)$. In this connection Montgomery proved that

\[
S(\{n;\ M < n \le M + N\};\ P,\ \Omega_p) \le \cdots
\]

where

\[
P(z) = \prod_{\substack{p \in P \\ p \le z}} p.
\]

Using these methods, the following estimate was obtained by Bombieri:

\[
\sum_{q \le x^{1/2}(\log x)^{-B}} \ \max_{y \le x} \ \max_{(l, q) = 1} \Bigl| \pi(y; q, l) - \frac{1}{\varphi(q)} \int_2^y \frac{du}{\log u} \Bigr| \ll x(\log x)^{-A},
\]

where A is arbitrary and B is a certain function of A (Vaughan). These methods, known collectively as the large sieve, were first directed toward proving the Rényi theorem stating that every sufficiently large even integer can be represented as the sum of a prime and an almost prime integer (M. B. Barban). Afterward, combining Richert's sieve with this large sieve, Chen Jing-Run (1973) improved this result to a remarkable degree (→ 4 Additive Number Theory C). Several applications of Bombieri's theorem have been demonstrated by P. D. T. A. Elliott and H. Halberstam (1966): e.g., the estimation of the number of representations of n as $p + x^2 + y^2$ (Hooley, Linnik) and the estimation of $\sum_{p < n} d(n - p)$ (Linnik, B. M. Bredihin).

Let N(α, T; χ) denote the number of zeros of L(s, χ) in the rectangle α ≤ σ ≤ 1, |t| ≤ T. Combining the large sieve with new Fourier integral techniques and the Turán †power-sum method, Gallagher (1970) proved that there exists a positive constant c satisfying

\[
\sum_{\chi \bmod q} N(\alpha, T; \chi) \ll (qT)^{c(1-\alpha)}, \qquad
\sum_{q \le Q} \sum_{\chi \bmod q} N(\alpha, T; \chi) \ll (QT)^{c(1-\alpha)}.
\]

Similar results were also obtained by G. Halász and Montgomery (1969) using another method, which was further exploited by Jutila, M. H. Huxley, and Iwaniec. In particular, Heath-Brown and Iwaniec (1979) deduced that $\pi(x + y) - \pi(x) \ge cy(\log x)^{-1}$ if $y \ge x^{11/20+\varepsilon}$. Combined with the Deuring-Heilbronn phenomenon, the zero density theorem above not only establishes the Linnik theory, but also yields the following result, due to K. A. Rodosskii, T. Tatuzawa, and A. I. Vinogradov:

\[
\pi(x; q, l) = \frac{1}{\varphi(q)} \int_2^x \frac{du}{\log u} - E\,\frac{\chi_1(l)}{\varphi(q)} \int_2^x \frac{u^{\beta_1 - 1}}{\log u}\,du + O\bigl(x e^{-c\log x/\Delta}\bigr)
\]

if x ≥ exp(log q log log q), where E = 1 or 0 according as Siegel's zero $\beta_1$ exists or not ($\chi_1$ denoting the corresponding exceptional real character), and where $\Delta = \max(\log q,\ (\log x)^{2/5}(\log\log x)^{1/5})$.

Throughout these researches, estimates of the type

\[
\cdots \ll \varphi(q)(|t| + 2)\log^{c}(q(|t| + 2))
\]

and
471 124 A
Distribution of Values (Complex Variables)

are of great importance and have been studied by A. F. Lavrik (1968), Linnik (1961), Huxley
(1972), and K. Ramachandra (1975).

F. The Prime Ideal Theorem in Algebraic Number Fields

In an algebraic number field of finite degree, the prime number theorem is replaced by the
prime ideal theorem (T. Mitsui, 1956, Fogels, 1962), which is based on the theory of the
Hecke †L-function (E. Hecke, 1917; Landau, 1918). Let $K$ be a finite Galois extension over
an algebraic number field $k$ of finite degree. Suppose that $\mathfrak{p}$ is a prime ideal
of $k$ and is not ramified in $K$. The †Frobenius automorphism of a prime divisor of
$\mathfrak{p}$ in $K$ determines a conjugate class $C$ of the Galois group of $K/k$. Let
$\pi(x; C)$ denote the number of prime ideals in $k$ associated with the class $C$ in the
above sense and whose norm does not exceed $x$. Then we have

$$\pi(x; C) = \frac{h(C)}{[K:k]} \operatorname{Li} x + O\left(x e^{-c\sqrt{\log x}}\right),$$

where $h(C)$ is the number of elements contained in $C$ and $c$ is a positive constant
depending on $K/k$. This is an extension of Chebotarev's theorem (E. Artin, 1923).

References

[1] R. Ayoub, An introduction to the analytic theory of numbers, Amer. Math. Soc. Math.
Surveys, 1963.
[2] H. Bohr and H. Cramér, Die neuere Entwicklung der analytischen Zahlentheorie, Enzykl.
Math., II C (8), 1922.
[3] K. Chandrasekharan, Introduction to analytic number theory, Springer, 1968.
[4] P. Erdős, Some recent advances and current problems in number theory, Lectures on
Modern Mathematics III, T. L. Saaty (ed.), Wiley, 1965, 196-244.
[5] E. Hecke, Vorlesungen über die Theorie der algebraischen Zahlen, Akademische Verlag,
Leipzig, 1923 (Chelsea, 1970).
[6] E. Landau, Handbuch der Lehre von der Verteilung der Primzahlen I, II, Teubner, 1909
(Chelsea, 1969).
[7] E. Landau, Vorlesungen über Zahlentheorie I, II, Hirzel, 1927 (Chelsea, 1946).
[8] G. H. Hardy, Divergent series, Clarendon Press, 1949.
[9] H. M. Edwards, Riemann's zeta function, Academic Press, 1974.
[10] A. E. Ingham, The distribution of prime numbers, Cambridge Univ. Press, 1932.
[11] D. N. Lehmer, List of prime numbers from 1 to 10,006,721, Carnegie Institution of
Washington, 1914.
[12] N. Levinson, More than one third of zeros of Riemann's zeta-function on
$\sigma = 1/2$, Advances in Math., 13 (1974), 383-436.
[13] K. Prachar, Primzahlverteilung, Springer, 1957.
[14] A. Selberg, An elementary proof of the prime number theorem for arithmetic
progressions, Canad. J. Math., 2 (1950), 66-78.
[15] E. C. Titchmarsh, The theory of the Riemann zeta-function, Clarendon Press, 1951.
[16] I. M. Vinogradov, A new estimate of the function $\zeta(1 + it)$ (in Russian), Izv.
Akad. Nauk SSSR, 22 (1958), 161-164.
[17] M. N. Huxley, The distribution of prime numbers, Oxford Univ. Press, 1972.
[18] P. Turán, Eine neue Methode in der Analysis und deren Anwendungen, Akadémiai Kiadó,
Budapest, 1953.
[19] H. L. Montgomery, Topics in multiplicative number theory, Lecture notes in math., 227,
Springer, 1971.
[20] E. Bombieri, Le grand crible dans la théorie analytique des nombres, Société
Mathématique de France, 1974.
[21] H. Halberstam and H. E. Richert, Sieve methods, Academic Press, 1974.
[22] C. Hooley, Applications of sieve methods to the theory of numbers, Cambridge Univ.
Press, 1976.
[23] H. E. Richert, Lectures on sieve methods, Tata Inst., 1976.

124 (XI.8)
Distribution of Values of Functions of a Complex Variable

A. General Remarks

Suppose that we are given a function $f: A \to B$ and that the variables $z$, $w$ take on
values in $A$, $B$, respectively. A value distribution of $f(z)$ is a set of points $z$ where
$f(z)$ takes on a certain value $w$ (called $w$-points of $f(z)$). Value distribution theory
is usually concerned with the study of value distributions of complex †analytic functions.
Value distribution theory has been developed extensively and deeply for the case where $A$
is the finite plane $|z| < \infty$ or the unit disk $|z| < 1$ and $B$ is the extended complex
plane $|w| \le \infty$, and there are many interesting results in this case (→ 272
Meromorphic Functions). Value distributions for analytic functions on general domains or on
Riemann

surfaces or of several complex variables have also been studied.

B. The Case of $|z| < R \le \infty$

For a †transcendental entire function $f(z)$, every value (including $\infty$) is a value of
the †cluster set of $f(z)$ at the point at infinity (†Weierstrass's theorem). This theorem
was improved in the following way by E. Picard in 1879: A transcendental entire function
$f(z)$ has an infinite number of $w$-points for any finite value $w$ except for at most one
finite value (†Picard's theorem). A value $w$ for which the $w$-points are at most finite is
said to be a Picard's exceptional value. E. Borel gave a precise form of this theorem,
taking into consideration the order of a function, and G. Julia proved the existence of
Julia's directions (→ 272 Meromorphic Functions; 429 Transcendental Entire Functions).
After other results in value distribution theory had been obtained by J. Hadamard, G.
Valiron, and others, R. Nevanlinna published an important work in 1925 in which he
established the so-called Nevanlinna theory of meromorphic functions in
$|z| < R \le \infty$, unifying results obtained up until that time, and which became the
starting point of the subsequent value distribution theory (→ 272 Meromorphic Functions).
T. Shimizu and L. V. Ahlfors gave a geometric meaning to the Nevanlinna †characteristic
function $T(r)$. Ahlfors established the theory of covering surfaces by metricotopological
methods in 1935, and applied it to obtain the Nevanlinna theory and many other results on
meromorphic functions. This theory revealed that the topological meaning of the number 2 of
Picard's exceptional values is closely related to the †Euler characteristic 2 of the
sphere. H. Selberg established the value distribution theory of †algebroidal functions and
gave a precise form of G. Rémoundos's theorem, which corresponds to Picard's theorem for
algebroidal functions.

Moreover, as an extension of the value distribution theory of meromorphic functions, there
is the theory of holomorphic curves or of systems of entire functions. Let
$f_0, f_1, \ldots, f_n$ ($n \ge 1$) be entire functions without common zeros for all and for
which $f_0 : f_1 : \cdots : f_n$ is not constant. Put $f = (f_0, f_1, \ldots, f_n)$. This is a
reduced representation of a nonconstant holomorphic curve $f: \mathbf{C} \to P^n(\mathbf{C})$.
For $\alpha = (\alpha_0, \alpha_1, \ldots, \alpha_n) \in \mathbf{C}^{n+1} - \{0\}$, considering
the zeros of

$$(\alpha, f) = \alpha_0 f_0 + \alpha_1 f_1 + \cdots + \alpha_n f_n \quad (\not\equiv 0),$$

we can extend Picard's theorem and the Nevanlinna theory for meromorphic functions to
holomorphic curves.

We define the characteristic function $T(r)$ ($= T(r, f)$) as

where $r_0$ is a fixed positive number and $A(r)\big|_{r_0}^{r} = A(r) - A(r_0)$. Using a
bivector $[f f'] = (f_j f_k' - f_k f_j')$ of $f$ and $f' = (f_0', f_1', \ldots, f_n')$, we
have another representation of $T(r)$:

*(+A
s‘2sis ~ 2*lc.B”ll’,,t,,
i-l ‘“S 0 0 If l2

$f$ is transcendental when $\lim T(r)/\log r = \infty$, and the number

$$\lambda = \dim\{(c_0, c_1, \ldots, c_n) \in \mathbf{C}^{n+1} \mid
c_0 f_0 + c_1 f_1 + \cdots + c_n f_n = 0\}$$

is the degeneracy index of $f$. It holds that $0 \le \lambda \le n - 1$. As an extension of
Picard's theorem, J. Dufresnoy stated that, for transcendental holomorphic curves $f$ and
among $a$ in general position, the zeros of $(a, f)$ are infinite except for at most
$n + \lambda + 1$. $X$ ($\subset \mathbf{C}^{n+1} - \{0\}$) is in general position when any
$p$ ($\le n + 1$) vectors in $X$ are linearly independent. In connection with this result,
there are many Picard-type theorems, and when $\lambda > 0$, there are some particular
results for holomorphic curves. H. Cartan, H. and J. Weyl, and Ahlfors extended the
Nevanlinna theory to holomorphic curves as follows. For
$a = (a_0, a_1, \ldots, a_n) \in \mathbf{C}^{n+1} - \{0\}$, we put

$$|a| = \text{the length of } a, \qquad
\|a, f\| = \frac{|(a, f)|}{|a|\,|f|}, \quad (a, f) \not\equiv 0,$$

and define the proximity function

$$m(r, a) = \frac{1}{2\pi} \int_0^{2\pi} \log \frac{1}{\|a, f\|}\, d\theta$$

and the counting function

$$N(r, a) = \frac{1}{2\pi} \int_0^{2\pi} \log |(a, f)|\, d\theta\, \Big|_{r_0}^{r}.$$

Then we have the first main theorem: For any $a$ for which $(a, f) \not\equiv 0$,

$$T(r) + m(r_0, a) = N(r, a) + m(r, a);$$

and the second main theorem: When $\lambda = 0$, for any $a_1, a_2, \ldots, a_q$ in general
position,

where $S(r) = O(\log rT(r))$ except for $r$ in a set $e$ of finite logarithmic measure (i.e.,
$\int_e d\log r < \infty$). For $\lambda > 0$, it is conjectured that "$q - n - 1$" can be
changed into "$q - n - \lambda - 1$." This is unsolved except for some special cases.
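
The condition of general position used in this section (any $p \le n + 1$ vectors of $X$ are linearly independent) can be checked mechanically for a finite set of vectors. The following is a minimal sketch in Python, assuming vectors with rational coordinates so that exact arithmetic is available; the function names `rank` and `in_general_position` are illustrative and do not come from the text.

```python
from fractions import Fraction
from itertools import combinations


def rank(rows):
    """Rank of a matrix with rational entries, by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivots found so far
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        # Look for a row at or below r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def in_general_position(vectors):
    """True if any p <= n + 1 of the given vectors (length n + 1) are independent."""
    dim = len(vectors[0])            # dim = n + 1
    size = min(dim, len(vectors))
    # A dependent subset of smaller size extends to a dependent subset of
    # this size, so it is enough to test the subsets of maximal size.
    return all(rank(subset) == size for subset in combinations(vectors, size))


X = [(1, 0), (0, 1), (1, 1)]               # three vectors in C^2 (n = 1)
print(in_general_position(X))              # True
print(in_general_position(X + [(2, 2)]))   # False: (1, 1) and (2, 2) are dependent
```

The sketch only covers finite sets with rational entries; for vectors with general complex coordinates the same subset test applies, but the rank computation would need complex or symbolic arithmetic.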
413 125 A
Distributions and Hyperfunctions

Some results show relations between exceptional values and the order as in the case of
meromorphic functions. Nevanlinna theory has been extended to the associated curves $f^p$
($p = 1, 2, \ldots, n$; $f^1 = f$) for $\lambda = 0$, and there have been attempts to extend
this theory of holomorphic curves even further by generalizing domains or ranges.

C. The Case of General Domains

The value distributions of meromorphic functions defined in a general domain or an open
Riemann surface depend on the function-theoretic "size" of the set of singularities (→ 169
Function-Theoretic Null Sets) or the type of the Riemann surface (→ 367 Riemann Surfaces).
For instance, we have the following theorem of Picard type: A single-valued meromorphic
function with a set of singularities of †logarithmic capacity zero takes on every value
infinitely often in any neighborhood of each singularity except for at most an
$F_\sigma$-set of values of logarithmic capacity zero (Hällström-Kametani theorem). For the
study of value distribution at general singularities, it is useful to investigate cluster
sets (→ 62 Cluster Sets). In order to generalize the Nevanlinna theory to the case of
general domains or Riemann surfaces, we take their exhaustions depending on a real
parameter $r$ and define the †counting function and so on (→ 272 Meromorphic Functions).
G. J. Hällström established the value distribution theory of meromorphic functions defined
in the complementary domain $D$ of a compact set $E$ of logarithmic capacity zero by taking
$D_r = \{z \mid u(z) < r\}$ as the exhaustion of $D$, where $u(z)$ denotes the Evans
potential for $E$, i.e., $u(z)$ is the potential corresponding to a positive mass
distribution $\mu$ on $E$ of total mass 1 which tends to $+\infty$ as $z$ tends to any point
of $E$. Thus the number of Picard's exceptional values is not greater than $2 + \xi$, where
$\xi = \limsup_{r \to \infty} F(r)/T(r)$ with $-n(r) =$ the Euler characteristic of $D_r$
and $F(r) = \int^{r} n(r)\,dr$ (Hällström-Tsuji theorem). J. Tamura, L. Sario, and others
studied the value distributions of meromorphic functions defined on Riemann surfaces. Sario
succeeded in extending the Nevanlinna theory to analytic mappings of a Riemann surface
$\mathfrak{R}$ into another Riemann surface $\mathfrak{S}$ by introducing a suitable metric
in $\mathfrak{S}$ to define the †proximity function.

In the Nevanlinna theory on general domains, we must sometimes impose conditions that the
functions must satisfy in order to obtain a concrete conclusion, for instance, the
condition that $\xi$ be finite in the Hällström-Tsuji theorem. But it is also important to
determine the domains where some result can be obtained without imposing any additional
conditions on the functions. The Hällström-Kametani theorem is an example. Although the set
of exceptional values in this theorem cannot be replaced generally by a smaller set than an
$F_\sigma$-set of logarithmic capacity zero, we have the following theorem: Let $E$ be a
†Cantor set with successive ratios $\xi_n = 2l_n/l_{n-1}$, where $l_n$ denotes the length of
the segments that remain after repeating $n$ times the process of deleting an open segment
from the middle of another segment. Then any single-valued meromorphic function with $E$ as
the set of singularities has at most 3 Picard's exceptional values if
$\lim_{n \to \infty} \xi_n = 0$ and at most 2 if $\xi_{n+1} = o(\xi_n^2)$ (L. Carleson, K.
Matsumoto). By weakening the conditions on $E$, one can get several improvements of this
result.

References

[1] K. Noshiro, Modern theory of functions (in Japanese), Iwanami, 1954.
[2] A. Dinghas, Vorlesungen über Funktionentheorie, Springer, 1961.
[3] W. K. Hayman, Meromorphic functions, Clarendon Press, 1964.
[4] H. Cartan, Sur les zéros des combinaisons linéaires de p fonctions holomorphes données,
Mathematica, 7 (1933), 5-31.
[5] L. V. Ahlfors, The theory of meromorphic curves, Acta Soc. Sci. Fenn., ser. A, 3 (1941),
3-31.
[6] H. and J. Weyl, Meromorphic functions and analytic curves, Ann. Math. Studies 12,
Princeton Univ. Press, 1943.
[7] M. Cowen and P. Griffiths, Holomorphic curves and metrics of negative curvature, J.
Analyse Math., 29 (1976), 93-153.
[8] L. Sario and K. Noshiro, Value distribution theory, Van Nostrand, 1966.
[9] K. Matsumoto, Existence of perfect Picard sets, Nagoya Math. J., 27 (1966), 213-222.

125 (XII.7)
Distributions and Hyperfunctions

A. History

The advancement of analysis, particularly in the field of partial differential equations
and harmonic analysis, necessitated the generalization of the notion of functions and
derivatives. For instance, "functions" such as Dirac's †delta function and †Heaviside's
function were used by physicists and engineering scientists

even though the former is not a function and the latter is not a differentiable function in
the classical sense. The finite parts of divergent integrals, used by J. Hadamard to
investigate the fundamental solutions of wave equations (1932), and the Riemann-Liouville
integrals due to M. Riesz (1938) were the notions that eventually led to the theory of
generalized functions. The rudiments of the idea of distribution, however, can also be
found in other earlier works. S. Bochner (1932) and T. Carleman (1944) discussed the Fourier
transforms of locally integrable functions on the reals with growth as large as a
polynomial. S. L. Sobolev introduced the notion of generalized derivative and also of
generalized solution of differential equations by means of integration by parts in studying
the Cauchy problem for hyperbolic equations (1936); J. Leray (1934), K. O. Friedrichs
(1939), and C. B. Morrey, Jr. (1940) also discussed generalized derivatives. On the other
hand, L. Fantappié (1943) investigated analytic functionals that are elements of the dual
of the space of analytic functions and applied them to the theory of partial differential
equations. Based on a systematic generalization of these investigations, L. Schwartz [1]
established the theory of distributions (1945), which not only provided a mathematical
foundation for a number of formal methods that had been used in mathematical physics but
also gave new and powerful tools for the theories of differential equations (L. Hörmander
[2]) and †Fourier transforms. Furthermore, it has been applied to †representation theory of
locally compact groups, the theory of probability, and the theory of manifolds (G. de Rham
[3]). As will be seen in Section B, distributions are defined as continuous †functionals on
a certain function space, and it is essential to select a function space appropriate to the
problems concerned. For this reason, I. M. Gel'fand and G. E. Shilov defined various classes
of generalized functions [4] as a natural extension of Schwartz's theory. In this direction
there are also various classes of ultradistributions introduced by C. Roumieu [5] and A.
Beurling.

Another but completely different approach was given by M. Sato [6, 15] in the form of the
theory of hyperfunctions (1958). Intuitively a hyperfunction is the sum of (ideal) boundary
values of holomorphic functions at the real axis. We obtain in this way an ultimate class of
localizable generalized functions. Sato employed the relative (or local) cohomology theory
to define the "boundary values" and to prove their localizing property. Such an algebraic
approach to generalized functions led naturally to an algebraic treatment of systems of
linear partial differential operators [7], called algebraic analysis by comparison with
algebraic geometry. Independently, Hörmander established a similar theory for distributions
using Fourier analysis (→ 274 Microlocal Analysis).

B. Definition of Distributions

Let $\varphi(x)$ be a complex-valued function of $x = (x_1, \ldots, x_n)$ defined on an open
set $\Omega$ in the $n$-dimensional Euclidean space $\mathbf{R}^n$. By the support (or
carrier) of $\varphi$, denoted by $\operatorname{supp} \varphi$, we mean the †closure of
$\{x \mid \varphi(x) \ne 0\}$ in $\Omega$. For multi-indices $p$, i.e., $n$-tuples
$p = (p_1, \ldots, p_n)$ of nonnegative integers, we set $|p| = p_1 + \cdots + p_n$. For a
function $\varphi(x)$ of class $C^{|p|}$, we write

$$D^p\varphi = \frac{\partial^{|p|}\varphi}{\partial x_1^{p_1} \cdots \partial x_n^{p_n}}. \tag{1}$$

In particular, $D^{(0,\ldots,0)}\varphi = \varphi$. To indicate the variables $x$ we shall
adopt the notation $D_x^p$. $\mathscr{D}(\Omega)$ denotes the set of all complex-valued
functions of class $C^\infty$ defined on $\Omega$ with †compact support, which is a †linear
space under the usual addition and scalar multiplication in function spaces. A sequence
$\{\varphi_m\}$ in $\mathscr{D}(\Omega)$ is said to converge to 0 (the function identically
equal to zero) as $m \to \infty$, denoted by $\varphi_m \Rightarrow 0$, if there exists a
compact set $E$ in $\Omega$ such that $E$ contains $\operatorname{supp} \varphi_m$ for every
$m$, and for every $p$, $\{D^p\varphi_m\}$ converges uniformly to 0 as $m \to \infty$. We
sometimes abbreviate $\mathscr{D}(\Omega)$ either as $\mathscr{D}$ or, when we want to
indicate the variables $x$, as $\mathscr{D}_x$.

A complex-valued †linear functional $T$ defined on $\mathscr{D}(\Omega)$ is called a
distribution on $\Omega$ if it is continuous on $\mathscr{D}(\Omega)$, i.e.,
$\varphi_m \Rightarrow 0$ implies $T(\varphi_m) \to 0$. The set of all distributions on
$\Omega$ is denoted by $\mathscr{D}'(\Omega)$ (or $\mathscr{D}'$). For distributions $S$ and
$T$, the sum $S + T$ and scalar multiple $\alpha T$ are defined by
$(S + T)(\varphi) = S(\varphi) + T(\varphi)$ and $(\alpha T)(\varphi) = \alpha T(\varphi)$,
respectively, which are also distributions. Hence $\mathscr{D}'(\Omega)$ is a †linear space.

C. Examples of Distributions

(1) Let $f(x)$ be a †measurable and †locally integrable function on $\Omega$. Then a
distribution $T_f$ is defined by $T_f(\varphi) = \int \varphi(x)f(x)\,dx$. Here $dx$ is the
†Lebesgue measure and the domain of integration is $\Omega$ (in fact,
$\operatorname{supp} \varphi$). If $T_f = T_g$, then $f(x) = g(x)$ †almost everywhere. Thus
we can identify the distribution $T_f$ with the corresponding function $f$, and sometimes
$T_f$ will be denoted simply by $f$.

(2) Let $\mu$ be a †Radon measure on $\Omega$, i.e., a complex-valued †regular completely
additive set function on the †Borel sets in $\Omega$. Then a distribution $T_\mu$ is defined
by $T_\mu(\varphi) = \int \varphi(x)\mu(dx)$. Example (1) is a special case where

$\mu(dx) = f(x)\,dx$. Two Radon measures $\mu$ and $\nu$ coincide if $T_\mu = T_\nu$. If the
measure is concentrated at the origin, then $T_\mu(\varphi) = c\varphi(0)$, denoted by
$c\delta$, and $\delta$ is called Dirac's distribution. Sometimes $\delta$ is denoted by
$\delta_x$, $\delta_{(x)}$, or $\delta(x)$ to indicate that it operates on functions of $x$.
A distribution $T \in \mathscr{D}'(\Omega)$ is a measure if and only if
$T(\varphi_m) \to 0$ whenever the supports $\operatorname{supp} \varphi_m$ of a sequence
$\varphi_m \in \mathscr{D}(\Omega)$ are in a fixed compact set in $\Omega$ and $\varphi_m$
converges uniformly to zero. A distribution $T$ is called a positive distribution if
$T(\varphi) \ge 0$ for any $\varphi \in \mathscr{D}(\Omega)$ that has nonnegative values at
all points $x$ in $\Omega$. Every positive distribution is equal to a $T_\mu$ corresponding
to a positive measure $\mu$.

(3) For given $p$ the distribution $\delta^{(p)}$ is defined by
$\delta^{(p)}(\varphi) = (-1)^{|p|}D^p\varphi(0)$. $\delta^{(0,\ldots,0)} = \delta$.

(4) Let $g(x)$ be a function defined but not integrable on an interval $(a, b)$, and assume
that for any positive number $\varepsilon$ it is integrable on $(a + \varepsilon, b)$.
Moreover, assume that

$$g(x) = \sum_{\nu} A_\nu (x - a)^{-\lambda_\nu} + h(x),$$

where $\operatorname{Re} \lambda_\nu \ge 1$, $\lambda_\nu$ is not an integer, and $h(x)$ is
integrable on $(a, b)$. Then

$$\int_{a+\varepsilon}^{b} g(x)\,dx
- \sum_{\nu} A_\nu \frac{\varepsilon^{-(\lambda_\nu - 1)}}{\lambda_\nu - 1} = F(\varepsilon)$$

tends to a finite value as $\varepsilon \to 0$. This limit is called the finite part (in
French partie finie) of the integral $\int_a^b g(x)\,dx$, denoted by
$\operatorname{Pf} \int_a^b g(x)\,dx$:

$$\operatorname{Pf} \int_a^b g(x)\,dx
= -\sum_{\nu} A_\nu \frac{(b - a)^{-(\lambda_\nu - 1)}}{\lambda_\nu - 1} + \int_a^b h(x)\,dx.$$

In the same way, for every $\varphi \in C^\infty(\mathbf{R}^1)$,
$T(\varphi) = \operatorname{Pf} \int_a^b g(x)\varphi(x)\,dx$ is defined, and $T$ is a
distribution which will be denoted by $\operatorname{Pf} g$, and which is frequently called
a pseudofunction. This notion is extended to the $n$-dimensional case and used to express
the fundamental solutions (→ Section EE; Appendix A, Table 15.V) of †hyperbolic partial
differential equations.

D. Localization of Distributions

Let $\Omega'$ be an open subset of $\Omega$. Every function
$\varphi \in \mathscr{D}(\Omega')$ can be extended to a function
$\varphi \in \mathscr{D}(\Omega)$ by setting $\varphi(x) = 0$ for $x \notin \Omega'$. Thus
if $T \in \mathscr{D}'(\Omega)$, then a distribution $S \in \mathscr{D}'(\Omega')$ is defined
by $S(\varphi) = T(\varphi)$ for every $\varphi \in \mathscr{D}(\Omega')$, and $S$ is called
the restriction of $T$ to $\Omega'$. Two distributions $T$ and $S$ are said to be equal in
$\Omega'$ if their restrictions to $\Omega'$ are equal. If every point in $\Omega$ has a
neighborhood where $T = S$, then $T = S$. In this sense a distribution is determined
completely by its local data, although the notion of pointwise value, having exact meaning
for functions, has no meaning for distributions. Moreover, we can construct a distribution
with given local data in the following way: Suppose that an †open covering $\{\Omega_j\}$ of
$\Omega$ and a set of distributions $T_j \in \mathscr{D}'(\Omega_j)$ are given and that $T_j$
and $T_k$ are equal in $\Omega_j \cap \Omega_k$ for any $j$ and $k$. Then there exists a
unique distribution $T \in \mathscr{D}'(\Omega)$ such that $T = T_j$ in $\Omega_j$ for each
$j$. (Define $T(\varphi) = \sum T_j(\alpha_j\varphi)$ for a †partition of unity
$\{\alpha_j\}$ subordinate to $\{\Omega_j\}$.) Namely, the distributions $\mathscr{D}'$ form a
†sheaf of linear spaces over $\mathbf{R}^n$. It is not †flabby but is soft, i.e., for any
distribution $T$ defined on a neighborhood of a closed set $F$ in $\mathbf{R}^n$, there is a
distribution on $\mathbf{R}^n$ that coincides with $T$ on a neighborhood of $F$.

For each distribution $T \in \mathscr{D}'(\Omega)$ there is a largest open subset $\Omega'$
of $\Omega$ on which $T$ vanishes. Its complement is called the support (or carrier) of $T$
and is denoted by $\operatorname{supp} T$. The support of the distribution given in example
(1) coincides with that of the function $f$. The support of $\delta^{(p)}$ is the origin.

E. Derivatives of Distributions

In example (1) in Section C, if $f(x)$ is a function of class $C^k$, then by integration by
parts $T_{D^p f}(\varphi) = (-1)^{|p|}T_f(D^p\varphi)$ for $|p| \le k$. The right-hand side
defines a distribution even if $f$ is not differentiable. In view of this example, we define
derivatives $D^pT$ of any distribution $T$ by

$$(D^pT)(\varphi) = (-1)^{|p|}T(D^p\varphi), \qquad \varphi \in \mathscr{D}. \tag{2}$$

Any distribution is infinitely differentiable. Any locally integrable function is
infinitely differentiable in the sense of distributions, and its derivatives $D^pT_f$ are
called distribution derivatives (or generalized derivatives or weak derivatives).

For example: (1) $D^p\delta = \delta^{(p)}$; (2) (1-dimensional case) $dx_+/dx = 1(x)$,
$d1(x)/dx = \delta$, where $x_+ = \max(x, 0)$ and $1(x)$ is Heaviside's function, which is
equal to 1 for $x > 0$ and to 0 for $x < 0$.

F. The Operation of Linear Differential Operators on Distributions

Let $T$ be a distribution and $\alpha(x)$ a $C^\infty$-function on $\Omega$. Then we can
define the product $\alpha T$ by $(\alpha T)(\varphi) = T(\alpha\varphi)$, where the
right-hand side is a continuous linear functional on $\mathscr{D}(\Omega)$ since the
multiplication $\varphi \mapsto \alpha\varphi$ is continuous in $\mathscr{D}(\Omega)$. We
define the dual operator (or conjugate operator) $P'(x, D)$ of a linear differential
operator $P(x, D) = \sum a_p(x)D^p$ by

$$P'(x, D)\varphi = \sum (-1)^{|p|}D^p(a_p(x)\varphi(x)), \tag{3}$$

where the $a_p(x)$ are $C^\infty$-functions on $\Omega$. Com-

bining differentiation of distributions, multiplication by functions, and addition, we can
apply partial differential operators to distributions, and we have
$(P(x, D)T)(\varphi) = T(P'(x, D)\varphi)$. Linear differential operators commute with
restriction mappings. In particular, we have
$\operatorname{supp} P(x, D)T \subset \operatorname{supp} T$. In other words, linear
differential operators are †sheaf homomorphisms on the sheaf $\mathscr{D}'$ of distributions
over $\Omega$. On the other hand, every continuous sheaf homomorphism on $\mathscr{D}'$ is a
linear differential operator whose order is finite on each compact set in $\Omega$. Even
when no continuity is assumed, a sheaf homomorphism on $\mathscr{D}'$ is a linear
differential operator except on a discrete subset of $\Omega$ (J. Peetre).

G. The Topology of $\mathscr{D}$ and $\mathscr{D}'$

Let $\emptyset = K_0 \subset K_1 \subset K_2 \subset \cdots$ be a sequence of compact sets in
$\Omega$ that exhausts $\Omega$. For every nondecreasing sequence of positive numbers
$\{a\} = \{a_0, a_1, \ldots\}$ and nondecreasing sequence of nonnegative integers
$\{k\} = \{k_0, k_1, \ldots\}$ we set

$$p_{\{a\},\{k\}}(\varphi) = \sup_{j \ge 0}\ \sup_{|p| \le k_j}\ \sup_{x \notin K_j}
a_j |D^p\varphi(x)| \tag{4}$$

for $\varphi \in \mathscr{D}(\Omega)$. We define a †locally convex topology of the space
$\mathscr{D}(\Omega)$ by taking the totality of the †seminorms $p_{\{a\},\{k\}}$ as a
fundamental system of continuous seminorms. Then $\mathscr{D}(\Omega)$ is a †nuclear
†(LF)-space, and the convergence $\varphi_m \to 0$ of a sequence $\varphi_m$ is identical to
the convergence $\varphi_m \Rightarrow 0$ in this topology. A subset
$B \subset \mathscr{D}(\Omega)$ is †bounded in the topology if and only if there exist a
compact set $E \subset \Omega$ such that $\operatorname{supp} \varphi \subset E$ for every
$\varphi \in B$, and positive numbers $M_p$ for each $p$ such that
$\sup |D^p\varphi(x)| \le M_p$ for every $\varphi \in B$.

The topology of the space $\mathscr{D}'(\Omega)$ is the †strong topology of the †dual of
$\mathscr{D}(\Omega)$: the †topology of uniform convergence on every bounded set in
$\mathscr{D}(\Omega)$. Under this topology $\mathscr{D}'(\Omega)$ is a †nuclear †reflexive
linear topological space. With respect to this topology, the linear differential operators
of Section F are continuous in $\mathscr{D}'$.

By virtue of the following convergence theorems, various limiting processes concerning
distributions can be treated easily. If the limit $\lim T_j(\varphi) = T(\varphi)$ exists
for every $\varphi \in \mathscr{D}$, where $\{T_j\}$ is a sequence of distributions, then
$T \in \mathscr{D}'$ and $\{T_j\}$ is convergent to the distribution $T$ (convergence
theorem). Moreover, for any $p$, $D^pT_j$ is convergent to $D^pT$ (theorem of termwise
differentiation). Any bounded set in $\mathscr{D}$ or $\mathscr{D}'$ is totally bounded.
Thus weak convergence of a sequence $\{T_j\}$ implies strong convergence (convergence in the
topology of the space $\mathscr{D}'$) (strong convergence theorem).

H. Distributions Depending on a Parameter

Consider distributions $T_\lambda$ depending on a parameter $\lambda$, where $\lambda$
ranges over the real line, the complex plane, or more generally an open set in Euclidean
space. The convergence theorem and the strong convergence theorem also hold in the case of
a continuous parameter $\lambda$. Thus $T_\lambda$ is continuous (differentiable) with
respect to the parameter $\lambda$ if $T_\lambda(\varphi)$ is continuous (differentiable)
with respect to $\lambda$ for any $\varphi \in \mathscr{D}$. If $T_\lambda$ is continuous
with respect to $\lambda$, then $D_x^pT_\lambda$ is also continuous with respect to
$\lambda$. If $T_\lambda$ is differentiable with respect to a real variable $\lambda$, then
$D_x^pT_\lambda$ is also differentiable, and
$\partial D_x^pT_\lambda/\partial\lambda = D_x^p(\partial T_\lambda/\partial\lambda)$. The
same facts hold for the case of several real variables. For a complex parameter $\lambda$,
$T_\lambda$ is holomorphic with respect to $\lambda$ if $T_\lambda(\varphi)$ is holomorphic
for any $\varphi \in \mathscr{D}$. The fundamental properties of holomorphic functions also
hold in this case.

If $T_\lambda$ is defined and continuous in an interval $[a, b]$, then the integral
$T = \int_a^b T_\lambda\,d\lambda$ with respect to $\lambda$ exists in $\mathscr{D}'$ and we
have

$$T(\varphi) = \int_a^b T_\lambda(\varphi)\,d\lambda.$$

I. Distributions with Compact Support

We denote by $\mathscr{E}(\Omega)$ (abbreviated to $\mathscr{E}$) the space of all
complex-valued $C^\infty$-functions on $\Omega$. $\mathscr{E}(\Omega)$ is a nuclear †Fréchet
space with the locally convex topology defined by the seminorms

$$\sup_{x \in K} |D^p\varphi(x)|$$

as $p$ ranges over all multi-indices and $K$ ranges over all compact sets in $\Omega$.

We denote by $\mathscr{E}'(\Omega)$ (abbreviated to $\mathscr{E}'$) the †strong dual of
$\mathscr{E}(\Omega)$, i.e., the set of all continuous linear functionals on
$\mathscr{E}(\Omega)$ equipped with the topology of uniform convergence on bounded sets in
$\mathscr{E}(\Omega)$. $\mathscr{E}'(\Omega)$ is a nuclear †(DF)-space. If
$T \in \mathscr{E}'(\Omega)$, then its restriction to $\mathscr{D}(\Omega)$ is a
distribution with compact support. Conversely if $T$ is a distribution with compact support
in $\Omega$, then choosing $\alpha \in \mathscr{D}(\Omega)$ which is identically equal to 1
on a neighborhood of the support of $T$, we define a linear functional $S$ on
$\mathscr{E}(\Omega)$ by $S(\varphi) = T(\alpha\varphi)$. Then $S \in \mathscr{E}'(\Omega)$,
and $S$ is independent of the choice of $\alpha$. In this sense, we can consider that $S$ is
the extension of $T$ to $\mathscr{E}(\Omega)$ and identify $S$ and $T$. Thus
$\mathscr{E}'(\Omega)$ coincides with the set of all distributions with compact support in
$\Omega$.

J. Structure Theorems for Distributions

A distribution $T \in \mathscr{D}'(\Omega)$ is said to be of order at most $k$ if
$|T(\varphi)| \le p_{\{a\},\{k\}}(\varphi)$ for $\{k\} = \{k, k, k, \ldots\}$

and for some $\{a\}$. A distribution of order 0 is a measure. Every distribution of finite
order can be represented as a finite linear combination of derivatives (in the sense of
distributions) of measures or locally integrable functions. The restriction of any
distribution to a relatively compact open subset $\Omega'$ can be represented in this way
because it is of finite order. Therefore the distributions form the smallest class of
generalized functions that contains all locally integrable functions, is stable under
differentiation, and forms a sheaf. Furthermore, for any distribution $T$ there exists a
sequence $\{\varphi_j\} \subset \mathscr{D}$ that converges to $T$ in the distribution sense.

A closed set $F$ is said to be regular if for every point $a$ in $F$ we have a neighborhood
$U$ of $a$ and constants $\omega$ and $1 \ge \alpha > 0$ such that every pair of points $x$
and $y$ in $F \cap U$ is connected by a curve contained in $F$ with length less than or
equal to $\omega|x - y|^{\alpha}$. If the support of a distribution $T$ is contained in a
regular closed set $F$, then

$$T = \sum_j D^{p_j}\mu_j$$

(the expression is not necessarily unique, and the sum is locally finite), where the
$\mu_j$ are complex-valued measures with support contained in $F$. In particular, if the
support of a distribution $T$ contains only one point $a$, then $T$ can be represented
uniquely as a finite linear combination

$$T = \sum_p c_p D^p\delta_{(a)}$$

of derivatives of the distribution $\delta_{(a)}$ defined by
$\delta_{(a)}(\varphi) = \varphi(a)$.

K. Tensor Products of Distributions

Let $\Omega_x \subset \mathbf{R}^m$ and $\Omega_y \subset \mathbf{R}^n$ be open sets. If
$T \in \mathscr{D}'(\Omega_x)$ and $S \in \mathscr{D}'(\Omega_y)$, then there is a unique
distribution $T \otimes S \in \mathscr{D}'(\Omega_x \times \Omega_y)$, called the tensor
product (or direct product), such that
$T \otimes S(\varphi(x)\psi(y)) = T(\varphi)S(\psi)$ for any
$\varphi \in \mathscr{D}(\Omega_x)$ and $\psi \in \mathscr{D}(\Omega_y)$. More directly we
have Fubini's theorem:
$T \otimes S(\varphi(x, y)) = T_x(S_y(\varphi(x, y))) = S_y(T_x(\varphi(x, y)))$ for any
$\varphi \in \mathscr{D}(\Omega_x \times \Omega_y)$. If $F$ and $G$ are linear spaces of
distributions on $\Omega_x$ and $\Omega_y$, respectively, then the †tensor product
$F \otimes G$ is identified with the linear combinations of tensor products $T \otimes S$ of
$T \in F$ and $S \in G$, which form a linear space of distributions on
$\Omega_x \times \Omega_y$. When $F$ and $G$ have †locally convex topologies, the †completed
tensor products $F \hat\otimes_\pi G$ and $F \hat\otimes_\varepsilon G$ are also usually
identified with subspaces of $\mathscr{D}'(\Omega_x \times \Omega_y)$. For example, we have
$\mathscr{D}'(\Omega_x) \hat\otimes \mathscr{D}'(\Omega_y) = \mathscr{D}'(\Omega_x \times \Omega_y)$,
$\mathscr{E}'(\Omega_x) \hat\otimes \mathscr{E}'(\Omega_y) = \mathscr{E}'(\Omega_x \times \Omega_y)$
and
$\mathscr{E}(\Omega_x) \hat\otimes \mathscr{E}(\Omega_y) = \mathscr{E}(\Omega_x \times \Omega_y)$
(including the topologies). Similarly, we have
$\mathscr{D}(\Omega_x) \hat\otimes \mathscr{D}(\Omega_y) = \mathscr{D}(\Omega_x \times \Omega_y)$,
but the left-hand side has a strictly weaker topology than the right-hand side. Spaces of
distributions with parameters (→ Section H) are often identified with completed tensor
products of spaces of distributions [8-11].

L. The Kernel Theorem

Let $\Omega_x$ and $\Omega_y$ be as above. Every distribution $K$ on
$\Omega_x \times \Omega_y$ induces the continuous linear mapping
$L_K: \mathscr{D}(\Omega_x) \to \mathscr{D}'(\Omega_y)$ defined by

$$(L_K(\varphi(x)))(\psi(y)) = K(\varphi(x)\psi(y)), \qquad \psi \in \mathscr{D}(\Omega_y).$$

Conversely every continuous linear mapping
$L: \mathscr{D}(\Omega_x) \to \mathscr{D}'(\Omega_y)$ is equal to $L_K$ for a
$K \in \mathscr{D}'(\Omega_x \times \Omega_y)$ and the correspondence $K \mapsto L_K$ is a
topological isomorphism of the locally convex space
$\mathscr{D}'(\Omega_x \times \Omega_y)$ onto
$L(\mathscr{D}(\Omega_x), \mathscr{D}'(\Omega_y))$ equipped with the topology of uniform
convergence on the bounded sets (L. Schwartz's kernel theorem, [8-11]). $K$ is called the
kernel (distribution) of the mapping $L_K$.

$L_K$ is a continuous linear mapping $\mathscr{D}(\Omega_x) \to \mathscr{E}(\Omega_y)$ (resp.
$\mathscr{D}(\Omega_x) \to \mathscr{E}'(\Omega_y)$) if and only if
$K \in \mathscr{D}'(\Omega_x) \hat\otimes \mathscr{E}(\Omega_y)$ (resp.
$\mathscr{D}'(\Omega_x) \hat\otimes \mathscr{E}'(\Omega_y)$), in which case $K$ is said to be
regular (resp. compact) in $y$. $L_K$ can be extended to a continuous linear mapping
$\mathscr{E}'(\Omega_x) \to \mathscr{D}'(\Omega_y)$ (resp.
$\mathscr{E}(\Omega_x) \to \mathscr{D}'(\Omega_y)$) if and only if
$K \in \mathscr{E}(\Omega_x) \hat\otimes \mathscr{D}'(\Omega_y)$ (resp.
$\mathscr{E}'(\Omega_x) \hat\otimes \mathscr{D}'(\Omega_y)$), in which case $K$ is said to be
regular (resp. compact) in $x$. $K$ is said to be regular (resp. compact) if it is regular
(resp. compact) both in $x$ and $y$. $L_K$ can be extended to a continuous linear mapping
$\mathscr{E}'(\Omega_x) \to \mathscr{E}(\Omega_y)$ (resp.
$\mathscr{E}(\Omega_x) \to \mathscr{E}'(\Omega_y)$) if and only if
$K \in \mathscr{E}(\Omega_x \times \Omega_y)$ (resp.
$\mathscr{E}'(\Omega_x \times \Omega_y)$). Then $K$ is said to be regularizing (resp.
compactifying) [8, 10].

M. Convolution

For distributions $S$ and $T$ on $\mathbf{R}^n$, assume that either $S$ or $T$ has a compact
support or more generally that
$\operatorname{supp} T \cap (\{x\} - \operatorname{supp} S)$ is locally uniformly bounded
with respect to $x$. Then the correspondence

$$\varphi \mapsto (S_x \otimes T_y)(\varphi(x + y))$$

defines a distribution called the convolution of $S$ and $T$ and is denoted by $S * T$. In
particular, $T_f * T_g = T_{f*g}$, where $f * g$ is the †convolution of functions $f$ and
$g$, and $T_f$, $T_g$, and $T_{f*g}$ denote the distributions corresponding to $f$, $g$, and
$f * g$, respectively. For example: $T * \delta = \delta * T = T$,
$D^pT * S = T * (D^pS) = D^p(T * S)$.

Thus a solution of the partial differential equation $P(D)T = S$ with constant coefficients
is given by the convolution $S * E$ of $S$ with a †fundamental solution $E$ of $P(D)$,
whenever the convolution exists.

If $T \in \mathscr{D}'$ and $\varphi \in \mathscr{D}$, the convolution $T * \varphi$ is equal
to a function $f$ of class $C^\infty$ (or the

distribution corresponding to $f$), and $f(y) = T_x(\varphi(y-x))$. $f$ is called the regularization of $T$. For distributions $S$ and $T$, assume that $|f(x)g(y-x)|$ is integrable on $\mathbf{R}^n$ for any regularization $f = S * \varphi$ and $g = T * \psi$. Then there exists a unique distribution $V$ such that

$$V * (\varphi * \psi) = \int f(x)g(y-x)\,dx.$$

This distribution $V$ is called the generalized convolution and is denoted by $S * T$ (C. Chevalley).

N. Tempered Distributions $\mathcal{S}'$

We denote by $\mathcal{S}(\mathbf{R}^n)$ (abbreviated to $\mathcal{S}$) the space of all †rapidly decreasing functions of class $C^\infty$ on $\mathbf{R}^n$. $\mathcal{S}$ is a nuclear Fréchet space with the locally convex topology defined by the seminorms

$$\nu_{p,q}(\varphi) = \sup_x |x^p D^q \varphi(x)|, \tag{6}$$

as $p$ and $q$ range over all multi-indices, where $x^p = x_1^{p_1} \cdots x_n^{p_n}$. We denote by $\mathcal{S}'(\mathbf{R}^n)$ (abbreviated to $\mathcal{S}'$) the strong dual of $\mathcal{S}(\mathbf{R}^n)$. Its elements are called tempered distributions (or slowly increasing distributions) and are identified with their restrictions to $\mathcal{D}(\mathbf{R}^n)$. A distribution $T \in \mathcal{D}'$ can be continuously extended to $\mathcal{S}$ if and only if any regularization $f = T * \varphi$ of $T$ is a slowly increasing continuous function (i.e., there is a polynomial $P(x)$ such that $|f(x)| \le |P(x)|$). $\mathcal{S}'(\mathbf{R}^n)$ is a nuclear (DF)-space.

Let $1 < p < \infty$, and let $q$ be the †conjugate exponent. The strong dual of the †function space $\mathcal{D}_{L^p}(\mathbf{R}^n)$ is denoted by $\mathcal{D}'_{L^q}(\mathbf{R}^n)$. Since $\mathcal{S}$ is dense in $\mathcal{D}_{L^p}$, $\mathcal{D}'_{L^q}$ is identified with a linear subspace of $\mathcal{D}'$. The strong dual of $\mathcal{D}'_{L^p}(\mathbf{R}^n)$ is equal to $\mathcal{D}_{L^q}(\mathbf{R}^n)$. Let $\dot{\mathcal{B}}(\mathbf{R}^n)$ be the closed linear subspace of $\mathcal{B}(\mathbf{R}^n)$ consisting of all functions $\varphi$ such that $D^p \varphi \in C_0(\mathbf{R}^n)$ for all $p$ (→ 168 Function Spaces). It is also the closure of $\mathcal{S}$ in $\mathcal{B}$. The strong dual of $\dot{\mathcal{B}}$ is denoted by $\mathcal{D}'_{L^1}(\mathbf{R}^n)$. Its elements are called integrable distributions because the strong dual of $\mathcal{D}'_{L^1}(\mathbf{R}^n)$ is equal to $\mathcal{B}(\mathbf{R}^n)$ so that the integral $T(1)$ has a meaning. Here $1$ is the function identically equal to 1. †Sobolev spaces $W^l_p(\mathbf{R}^n)$ and †Besov spaces $B^s_{p,q}(\mathbf{R}^n)$ are also linear subspaces of $\mathcal{S}'(\mathbf{R}^n)$.

O. Fourier Transforms

The Fourier transform

$$\mathcal{F}\varphi(x) = (\sqrt{2\pi})^{-n} \int \varphi(\xi)\exp(-ix\xi)\,d\xi \tag{7}$$

is an isomorphism of $\mathcal{S}_\xi$ onto $\mathcal{S}_x$, where $x\xi = x_1\xi_1 + \cdots + x_n\xi_n$. The Fourier transform $\mathcal{F}T \in \mathcal{S}'_x$ of $T \in \mathcal{S}'_\xi$ is defined by

$$(\mathcal{F}T)(\varphi) = T(\mathcal{F}\varphi), \quad \varphi \in \mathcal{S}. \tag{8}$$

The inverse Fourier transform is defined similarly, except that $-i$ is replaced by $i$. For example: (1) $\mathcal{F}1 = (\sqrt{2\pi})^{n}\delta$, where $1$ is the distribution corresponding to the function 1. (2) $\mathcal{F}(D^p T) = i^{|p|}x^p\,\mathcal{F}T$.

A function $\varphi$ of class $C^\infty$ is called a slowly increasing $C^\infty$-function if $\varphi$ and its derivatives of any order are slowly increasing continuous functions. The space $\mathcal{O}_M$ is the set of all such functions. Its image under the Fourier transform $\mathcal{F}(\mathcal{O}_M)$, denoted by $\mathcal{O}'_C$, coincides with the set of all distributions $T$ such that any regularization $f = T * \varphi$ is a rapidly decreasing $C^\infty$-function. A member of the space $\mathcal{O}'_C$ is called a rapidly decreasing distribution. If $\alpha \in \mathcal{O}_M$ ($\beta \in \mathcal{O}'_C$) and $T \in \mathcal{S}'$, then $\alpha T \in \mathcal{S}'$ and $\mathcal{F}(\alpha T) = (\sqrt{2\pi})^{-n}\,\mathcal{F}\alpha * \mathcal{F}T$ ($\beta * T \in \mathcal{S}'$ and $\mathcal{F}(\beta * T) = (\sqrt{2\pi})^{n}\,\mathcal{F}\beta \cdot \mathcal{F}T$). If $T$ is a distribution with compact support in $\mathbf{R}^n$, then its Fourier transform is the †entire function $f(\zeta) = (\sqrt{2\pi})^{-n}\,T(\exp(-ix\zeta))$, where $\zeta = \xi + i\eta \in \mathbf{C}^n$. For a convex compact set $K$ in $\mathbf{R}^n$, define its supporting function $H_K$ by $H_K(\eta) = \sup_{x \in K} x\eta$.

Paley-Wiener theorem. An entire function $f(\zeta)$ on $\mathbf{C}^n$ is the Fourier transform of a distribution (resp. a function of class $C^\infty$) with support in $K$ if and only if (i) for any $\varepsilon > 0$ there is a constant $C_\varepsilon$ such that $|f(\zeta)| \le C_\varepsilon \exp(H_K(\eta) + \varepsilon|\zeta|)$ on $\mathbf{C}^n$, and (ii) there are constants $C$ and $N$ such that $|f(\xi)| \le C(1+|\xi|)^{N}$ (resp. for any $N$ there is a constant $C$ such that $|f(\xi)| \le C(1+|\xi|)^{-N}$) on $\mathbf{R}^n$.

P. Fourier Series and Distributions on Tori

The $n$-dimensional †torus $\mathbf{T}^n$ is the quotient space of $\mathbf{R}^n$ with respect to the equivalence relation $x_j \equiv y_j \bmod \mathbf{Z}$ ($j = 1, \ldots, n$). The space $\mathcal{D}'(\mathbf{T}^n)$ of distributions on $\mathbf{T}^n$ is defined to be the strong dual of $\mathcal{D}(\mathbf{T}^n) = \mathcal{E}(\mathbf{T}^n)$. The volume element $dx$ of $\mathbf{T}^n$ is defined from that of $\mathbf{R}^n$. Thus, for an integrable function $f$, we can define a distribution $T_f$ by

$$T_f(\varphi) = \int_{\mathbf{T}^n} f(x)\varphi(x)\,dx, \quad \varphi \in \mathcal{D}(\mathbf{T}^n).$$

Consider the family of functions $f_p(x) = \exp(2\pi i p x)$, where $p$ ranges over all $n$-tuples of integers. Then any distribution $T$ on $\mathbf{T}^n$ has the Fourier series expansion $T = \sum c_p f_p(x)$, where the Fourier coefficients $c_p = T(f_{-p})$ are slowly increasing, i.e., $|c_p| \le C(1+|p|^2)^k$ for some $k$ and $C$. Conversely if $\{c_p\}$ is a slowly increasing sequence, then $\sum c_p f_p$ converges to a distribution $T$ on $\mathbf{T}^n$.
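The Fourier series statement of Section P can be checked numerically in the simplest case. The following is a minimal sketch, assuming $n = 1$, the distribution $T = \delta_{(a)}$ on $\mathbf{T}^1$, and the test function $\varphi(x) = \exp(\cos 2\pi x)$; the function names, the point $a = 0.3$, and the grid size are illustrative choices, not part of the text. Here $c_p = T(f_{-p}) = \exp(-2\pi i p a)$ has modulus 1, so it is slowly increasing with $k = 0$, and the partial sums of $\sum c_p f_p$ paired with $\varphi$ should approach $\varphi(a)$.

```python
import cmath
import math


def fourier_coefficient_of_delta(p, a):
    """c_p = T(f_{-p}) for T = delta at a, with f_p(x) = exp(2*pi*i*p*x)."""
    return cmath.exp(-2j * math.pi * p * a)


def pair_partial_sum(coeff, phi, n_terms, n_grid=512):
    """Pair sum_{|p| <= n_terms} c_p f_p with phi over one period.

    Each integral of f_p * phi over [0, 1) is approximated by the
    rectangle rule, which is very accurate for smooth periodic integrands.
    """
    xs = [k / n_grid for k in range(n_grid)]
    total = 0j
    for p in range(-n_terms, n_terms + 1):
        integral = sum(cmath.exp(2j * math.pi * p * x) * phi(x) for x in xs) / n_grid
        total += coeff(p) * integral
    return total


def phi(x):
    """A smooth 1-periodic test function (illustrative choice)."""
    return math.exp(math.cos(2 * math.pi * x))


a = 0.3  # hypothetical support point of the delta distribution
approx = pair_partial_sum(lambda p: fourier_coefficient_of_delta(p, a), phi, n_terms=20)
exact = phi(a)
print(abs(approx - exact))
```

The coefficients do not decay at all, so the series does not converge pointwise; the convergence is only visible after pairing with a test function, which is the sense of convergence used in the text.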

Q. Substitution

Let $f = (f_1, \ldots, f_m)$ be a $C^\infty$-mapping from an open set $\Omega$ in $\mathbf{R}^n$ into an open set $\Omega'$ in $\mathbf{R}^m$, and assume that the †rank of the Jacobian matrix $(\partial f_i/\partial x_j)$ ($i = 1, \ldots, m$; $j = 1, \ldots, n$) is equal to $m$ in a neighborhood of the †inverse image $E$ of the support of a distribution $S = S_{(y)} \in \mathcal{D}'(\Omega')$. In a neighborhood $U$ of $E$, we can choose $u_1 = g_1(x), \ldots, u_{n-m} = g_{n-m}(x)$ so that the transformation $(y, u) = (f(x), g(x))$ has the inverse transformation $x = \psi(y, u)$ of class $C^\infty$. If the support of $\varphi \in \mathcal{D}(\Omega)$ is contained in $U$, then defining $\tilde{\varphi}$ by

$$\tilde{\varphi}(y) = \int \varphi(\psi(y, u))\,J(y, u)\,du,$$

we have $\tilde{\varphi} \in \mathcal{D}(\Omega')$, where $J(y, u)$ is the absolute value of the †Jacobian of $\psi(y, u)$, and $\tilde{\varphi}$ is independent of the choice of $g_1, \ldots, g_{n-m}$. Now we define $T(\varphi) = S(\tilde{\varphi})$ if $\operatorname{supp}\varphi \subset U$, and $T(\varphi) = 0$ if $\operatorname{supp}\varphi$ does not intersect $E$. Since a distribution can be determined by its local behavior, we have a distribution $T \in \mathcal{D}'(\Omega)$ with support $E$, which is denoted by $T = S \circ f = S(f) = S_{f(x)}$ and is called the substituted distribution of $S_{(y)}$ by $y = f(x)$. It is also called the pullback of $S$ by $f$ and is denoted by $f^*S$. The chain rule for derivatives of composites

$$\frac{\partial}{\partial x_j}(S \circ f) = \sum_{i=1}^{m} \frac{\partial f_i}{\partial x_j}\left(\frac{\partial S}{\partial y_i} \circ f\right) \tag{9}$$

also holds.

For example: (1) If $S = \delta^{(p)}_{(y)}$ ($y \in \mathbf{R}^m$), then $E$ is the surface $f_1(x) = f_2(x) = \cdots = f_m(x) = 0$. Assuming that $(f_1, \ldots, f_m)$ satisfies the condition in the previous paragraph, we obtain the distribution $\delta^{(p)}(f_1, \ldots, f_m)$ with support contained in a surface. From the fact $\delta^{(p)} = D_y^p\delta$, we can write

$$\delta^{(p)}(f_1, \ldots, f_m) = \frac{\partial^{|p|}\,\delta(f_1, \ldots, f_m)}{(\partial f_1)^{p_1} \cdots (\partial f_m)^{p_m}}$$

(Gel'fand-Shilov notation). (2) For the mapping $f(x) = Ax + b$, where $A$ is an $n \times n$ regular matrix and $b$ is an $n$-vector, we can define the substituted distribution of $S$ by $y = f(x)$ for any $S \in \mathcal{D}'(\mathbf{R}^n)$. For instance, $\delta_{(x-b)}(\varphi) = \varphi(b)$. $\delta(x^2 - c^2) = (2c)^{-1}(\delta_{(x-c)} + \delta_{(x+c)})$ ($c > 0$, $x \in \mathbf{R}^1$).

R. Distributions and Currents on a Differentiable Manifold

Let $M$ be an $n$-dimensional †differentiable manifold of class $C^\infty$. Let $\{(U_\alpha, \psi_\alpha)\}$ be an †atlas. Then a distribution $T$ on $M$ is defined to be a collection of distributions $T_\alpha$ on $\psi_\alpha(U_\alpha) \subset \mathbf{R}^n$ such that $T_\beta$ on $\psi_\beta(U_\alpha \cap U_\beta)$ is equal to $T_\alpha \circ (\psi_\alpha \circ \psi_\beta^{-1})$ for any $\alpha$ and $\beta$. Since the distributions obey the same transformation law as functions under coordinate transformation, this definition has an invariant meaning, and every locally integrable function on $M$ is regarded as a distribution. More generally, a distribution †cross section of a †vector bundle of class $C^\infty$ is defined in the same way. The most important is the case of the †$p$-fold exterior power of the †cotangent bundle. Its distribution cross sections, i.e., †exterior differential forms of order $p$ with distribution coefficients, are called currents of degree $p$. If $M$ is †oriented, then the space of currents of degree $p$ on $M$ is identified with the dual space of the locally convex space $\mathcal{D}^{(n-p)}(M)$ of all differential forms of degree $n - p$, of class $C^\infty$ and with compact support in $M$. For example, if $C$ is an $(n-p)$-dimensional †singular chain, a current $T_C$ of degree $p$ is defined by $T_C(\alpha) = \int_C \alpha$. If $M$ is not †orientable, we have to consider either the double covering $\tilde{M}$ of $M$ or the cross sections of the tensor product of the bundle of $(n-p)$-covectors and the orientation bundle [1, 3]. Currents were introduced by de Rham to prove his celebrated isomorphism theorem of the †de Rham cohomology groups defined by differential forms and the †singular cohomology groups defined by singular cochains (→ 105 Differentiable Manifolds).

S. Gel'fand-Shilov Generalized Functions

Let $E$ and $F$ be †locally convex spaces for which a continuous linear injection $i: E \to F$ with dense range is defined. Then the dual $i': F' \to E'$ is a continuous linear injection with weak*-dense range. Therefore if we shrink the function space $E$, we obtain a larger space $E'$ of continuous linear functionals. Gel'fand and Shilov [4] introduced many function spaces $E$, called fundamental spaces or test function spaces, and defined corresponding spaces $E'$ of generalized functions. Their motivations were in applications to the theory of partial differential equations, where Fourier transforms play an essential role and Schwartz's framework of tempered distributions is too restrictive. They considered a pair of function spaces $E$ and $\tilde{E}$ such that the Fourier transformation $\mathcal{F}: E \to \tilde{E}$ is an isomorphism of locally convex spaces. Then the Fourier transformation defined to be the dual of $\mathcal{F}$ gives an isomorphism $\tilde{E}' \to E'$ of spaces of generalized functions. The function spaces $E$ and $\tilde{E}$ are often spaces of entire functions, so that the Gel'fand-Shilov generalized functions are not necessarily localizable in $\mathbf{R}^n$. As typical examples of spaces $E$ and $\tilde{E}$ we shall mention only spaces of type S in the next section.

L. Ehrenpreis's analytically uniform spaces

[12] are also a class of generalized function spaces introduced from a similar point of view.

T. Spaces of Type S

Let $\alpha$, $\beta$, $A$, $B$, $\bar{A}$, and $\bar{B}$ be $n$-dimensional vectors. By $A > B$ we mean $A_j > B_j$ ($j = 1, \ldots, n$). We use notations $A^p = A_1^{p_1} \cdots A_n^{p_n}$, $p^{p\alpha} = p_1^{p_1\alpha_1} \cdots p_n^{p_n\alpha_n}$. (i) For $\alpha \ge 0$, $A > 0$ the space $S_{\alpha,A}$ consists of all $C^\infty$-functions $\varphi$ such that seminorms

$$p_{q,\bar{A}}(\varphi) = \sup_x \sup_p \frac{|x^p D^q\varphi(x)|}{\bar{A}^p p^{p\alpha}} \tag{10}$$

are finite for all $\bar{A} > A$ and $q$. For example, $S_{0,A}$ is the space of all functions of class $C^\infty$ with support in $\{x \mid |x_j| \le A_j,\ j = 1, \ldots, n\}$. (ii) For $\beta \ge 0$, $B > 0$ the space $S^{\beta,B}$ consists of all $C^\infty$-functions $\varphi$ such that the seminorms

$$p^{p,\bar{B}}(\varphi) = \sup_x \sup_q \frac{|x^p D^q\varphi(x)|}{\bar{B}^q q^{q\beta}} \tag{11}$$

are finite for all $\bar{B} > B$ and $p$. (iii) For $\alpha, \beta \ge 0$ and $A, B > 0$, the space $S_{\alpha,A}^{\beta,B}$ consists of all $C^\infty$-functions such that the seminorms

$$p_{\bar{A},\bar{B}}(\varphi) = \sup_x \sup_p \sup_q \frac{|x^p D^q\varphi(x)|}{\bar{A}^p \bar{B}^q p^{p\alpha} q^{q\beta}} \tag{12}$$

are finite for all $\bar{A} > A$ and $\bar{B} > B$. The topologies for these spaces are given in terms of the seminorms (10), (11), and (12), respectively. These spaces are generically called spaces of type S.

$$\mathcal{F}(S_{\alpha,A}) = S^{\alpha,A} \quad (\alpha > 0),$$
$$\mathcal{F}(S^{\beta,B}) = S_{\beta,B} \quad (\beta > 0),$$
$$\mathcal{F}(S_{\alpha,A}^{\beta,B}) = S_{\beta,B}^{\alpha,A} \quad (\alpha + \beta > 1),$$
$$\mathcal{F}(S_{0,A}) = S^{0,A'},$$
$$\mathcal{F}(S^{0,B}) = S_{0,B'},$$
$$\mathcal{F}(S_{\alpha,A}^{\beta,B}) = S_{\beta,B'}^{\alpha,A'} \quad (\alpha + \beta = 1),$$

where $A' = A\exp(1/A)$ and $B' = B\exp(1/B)$. We set $S_\alpha = \bigcup S_{\alpha,A}$, $S^\beta = \bigcup S^{\beta,B}$, and $S_\alpha^\beta = \bigcup S_{\alpha,A}^{\beta,B}$, where $A$ and $B$ range over all positive $n$-dimensional vectors. $S_\alpha^\beta \ne \{0\}$ if and only if one of the following conditions is satisfied: (i) $\alpha + \beta \ge 1$, $\alpha > 0$, $\beta > 0$; (ii) $\alpha > 1$, $\beta = 0$; (iii) $\alpha = 0$, $\beta > 1$. In such a case the space $(S_\alpha^\beta)'$ of generalized functions contains the tempered distributions $\mathcal{S}'(\mathbf{R}^n)$.

U. Ultradistributions

The localizing property of distributions is proved mainly from the existence of †partitions of unity by functions in $\mathcal{D}$. Therefore if a test function space admits partitions of unity, the corresponding class of generalized functions forms a sheaf. Roumieu [S] took the space $\mathcal{D}_{\{M_p\}}(\Omega)$ of †ultradifferentiable functions of class $\{M_p\}$ with compact support. We can also use the space $\mathcal{D}_{(M_p)}(\Omega)$ of class $(M_p)$. The corresponding generalized functions are called ultradistributions of class $\{M_p\}$ and of class $(M_p)$, respectively. Here $M_p$ is a sequence of positive numbers satisfying the logarithmic convexity $M_p^2 \le M_{p-1}M_{p+1}$ and the Denjoy-Carleman condition $\sum M_p/M_{p+1} < \infty$. $\mathcal{D}_{\{M_p\}}(\Omega)$ (resp. $\mathcal{D}_{(M_p)}(\Omega)$) is the space of all functions $\varphi \in \mathcal{D}(\Omega)$ such that there are constants $k$ and $C$ (resp. for any $k > 0$ there is a $C$) for which we have $|D^\alpha\varphi(x)| \le Ck^{|\alpha|}M_{|\alpha|}$. In particular, the Gevrey classes $\{s\} = \{p!^s\}$ and $(s) = (p!^s)$ defined for $s > 1$ are important, and they appear often in the theory of differential equations. Almost all results for distributions have been extended to ultradistributions under appropriate conditions on $M_p$, which are satisfied by Gevrey sequences $p!^s$ (→ [14] in particular). Closely connected are A. Beurling's test function spaces $\mathcal{D}_\omega(\Omega)$ (→ [13]), where $\omega$ is a function on $\mathbf{R}^n$ continuous at the origin, satisfying $0 = \omega(0) \le \omega(\xi + \eta) \le \omega(\xi) + \omega(\eta)$ for all $\xi$ and $\eta$ in $\mathbf{R}^n$ and such that $\int \omega(\xi)(1+|\xi|)^{-n-1}\,d\xi < \infty$. $\mathcal{D}_\omega(\Omega)$ is defined to be the space of all $\varphi \in \mathcal{D}(\Omega)$ such that the Fourier transform $\hat{\varphi}$ satisfies $\int |\hat{\varphi}(\xi)|\exp(\lambda\omega(\xi))\,d\xi < \infty$ for any $\lambda > 0$. If $\omega(\xi) = \log(1+|\xi|)$, then $\mathcal{D}_\omega = \mathcal{D}$; if $\omega(\xi) = |\xi|^{1/s}$, then $\mathcal{D}_\omega = \mathcal{D}_{(s)}$. Let $\check{\omega}(\xi) = \omega(-\xi)$. The space $\mathcal{D}'_\omega(\Omega)$ of Beurling's generalized distributions is defined to be the strong dual of $\mathcal{D}_{\check{\omega}}(\Omega)$.

V. Hyperfunctions

In this and the following sections we mean by a cone $\Gamma \subset \mathbf{R}^n$ a convex open cone with vertex at 0. For two cones $\Gamma$, $\Delta$, we write $\Delta \Subset \Gamma$ if $\Delta \cap S^{n-1}$ is relatively compact in $\Gamma \cap S^{n-1}$, where $S^{n-1} = \{|x| = 1\}$. By a wedge we mean an open subset of $\mathbf{C}^n$ of the form $\Omega + i\Gamma$, where $\Omega \subset \mathbf{R}^n$ is an open subset and $\Gamma \subset \mathbf{R}^n$ is a cone. $\Gamma$ is called the opening of the wedge and $\Omega$ its edge. By an infinitesimal wedge (0-wedge for short) or a tuboid of opening $\Gamma$ and edge $\Omega$, we mean a complex open set $U$ such that $U \subset \Omega + i\Gamma$ and that for any $\Delta \Subset \Gamma$, $U$ contains the part of $\Omega + i\Delta$ which is contained in some complex neighborhood of the edge $\Omega$. The symbol $\Omega + i\Gamma 0$ will represent any one of such open sets, and $\mathcal{O}(\Omega + i\Gamma 0)$ (the inductive limit of) the totality of functions holomorphic on some of them.

A hyperfunction $f(x)$ on an open set $\Omega \subset \mathbf{R}^n$ is an equivalence class of formal expressions of the form

$$f(x) = \sum_{j=1}^{N} F_j(x + i\Gamma_j 0), \tag{13}$$

where $F_j(z) \in \mathcal{O}(\Omega + i\Gamma_j 0)$. $\{F_j(z)\}$ is called a set

of defining functions of $f(x)$. Here the equivalence relation is given as

$$F_j(x + i\Gamma_j 0) + F_k(x + i\Gamma_k 0) = (F_j + F_k)(x + i(\Gamma_j \cap \Gamma_k)0) \tag{14}$$

if $\Gamma_j \cap \Gamma_k \ne \emptyset$, that is, we can contract two terms into a single term as above and, conversely, can decompose, if possible, a term in the inverse way. These are considered to be modifications of the expression of the same hyperfunction. The totality of hyperfunctions on $\Omega$ is denoted by $\mathcal{B}(\Omega)$. It is a linear space by virtue of the linear structure naturally induced from that of holomorphic functions, combined with the above equivalence relation.

The symbol $F_j(x + i\Gamma_j 0)$, which represents by itself a hyperfunction, is called the boundary value of $F_j(z)$ to the real axis. It is merely formal and does not imply any topological limit, though there is some justification for the terminology as will be seen in Section Z.

In the case of one variable, we have only two kinds of wedges $\Omega \pm i\mathbf{R}^+$, hence a hyperfunction can be expressed by two terms:

$$F_+(x + i0) - F_-(x - i0). \tag{15}$$

Some examples of hyperfunctions of one variable are $\delta(x) = -(2\pi i)^{-1}((x + i0)^{-1} - (x - i0)^{-1})$; $Y(x) = -(2\pi i)^{-1}(\log(-x - i0) - \log(-x + i0))$; $\operatorname{Pf} x^{-1} = ((x + i0)^{-1} + (x - i0)^{-1})/2$.

W. Localization of Hyperfunctions

If $\Omega' \subset \Omega$ is an open subset, the restriction mapping $\mathcal{B}(\Omega) \to \mathcal{B}(\Omega')$ is induced from that for holomorphic functions. With this structure the correspondence $\Omega \mapsto \mathcal{B}(\Omega)$ becomes a †presheaf. It is in fact a sheaf, because it can be expressed by the terminology of relative (or local) cohomology as follows: Let $H^k_\Omega(\mathbf{C}^n, \mathcal{F})$ denote the $k$th relative cohomology group of the pair $(\mathbf{C}^n, \mathbf{C}^n \setminus \Omega)$ (also called the $k$th local cohomology group with support in $\Omega$) with coefficients in a sheaf $\mathcal{F}$ on $\mathbf{C}^n$. (It is by definition the $k$th †derived functor of $\mathcal{F} \mapsto \Gamma_\Omega(\mathbf{C}^n, \mathcal{F})$ = the totality of sections of $\mathcal{F}$ defined on a neighborhood of $\Omega$ and with support in $\Omega$ and is calculated as the $k$th cohomology group of the †complex $\Gamma_\Omega(\mathbf{C}^n, \mathcal{L})$, where $\mathcal{L}$ denotes any flabby resolution (i.e., †resolution by †flabby sheaves) of $\mathcal{F}$.) Let $\mathcal{H}^k_{\mathbf{R}^n}(\mathcal{F})$ denote the $k$th derived sheaf of $\mathcal{F}$ to $\mathbf{R}^n$. (It is by definition the sheaf on $\mathbf{R}^n$ associated with the presheaf $\Omega \mapsto H^k_\Omega(\mathbf{C}^n, \mathcal{F})$.) Then the cohomological definition of the sheaf of hyperfunctions is $\mathcal{B} = \mathcal{H}^n_{\mathbf{R}^n}(\mathcal{O})$ (the orientation being neglected). A fundamental theorem by Sato says that $\mathbf{R}^n \subset \mathbf{C}^n$ is purely $n$-codimensional with respect to $\mathcal{O}$ (i.e., $\mathcal{H}^k_{\mathbf{R}^n}(\mathcal{O}) = 0$ for $k \ne n$), and moreover $H^k_\Omega(\mathbf{C}^n, \mathcal{O}) = 0$ for $k \ne n$ for any open set $\Omega \subset \mathbf{R}^n$. Then by the general theory it can be shown that the remaining $H^n_\Omega(\mathbf{C}^n, \mathcal{O})$ agrees with the section module $\mathcal{B}(\Omega)$ of $\mathcal{H}^n_{\mathbf{R}^n}(\mathcal{O})$. Since $H^n(V, \mathcal{O}) = 0$ for any open set $V \subset \mathbf{C}^n$ (B. Malgrange), it follows further that the sheaf $\mathcal{B}$ is flabby, i.e., its sections on any open set can always be extended to the whole space.

If $U$ is a †Stein neighborhood of $\Omega$, then $H^n_\Omega(\mathbf{C}^n, \mathcal{O})$ can be expressed using the covering cohomology as the quotient space

$$\mathcal{O}(U \# \Omega) \Big/ \sum_{j=1}^{n} \mathcal{O}(U \#_j \Omega), \tag{16}$$

where

$$U \# \Omega = \{z \in U \mid \mathrm{pr}_k(z) \notin \mathrm{pr}_k(\Omega) \text{ for all } k\}, \qquad U \#_j \Omega = \{z \in U \mid \mathrm{pr}_k(z) \notin \mathrm{pr}_k(\Omega) \text{ for } k \ne j\}, \tag{17}$$

and $\mathrm{pr}_k$ is the projection from $\mathbf{C}^n$ to the $k$th coordinate. Then the isomorphism $H^n_\Omega(\mathbf{C}^n, \mathcal{O}) \cong \mathcal{B}(\Omega)$ is induced by the correspondence

$$\mathcal{O}(U \# \Omega) \ni F(z) \mapsto f(x) = \sum_\sigma \operatorname{sgn}\sigma\, F(x + i\Gamma_\sigma 0) \in \mathcal{B}(\Omega), \tag{18}$$

where $\Gamma_\sigma$ is the $\sigma$-orthant $\{y \in \mathbf{R}^n \mid \sigma_j y_j > 0,\ j = 1, \ldots, n\}$ and $\operatorname{sgn}\sigma = \sigma_1 \cdots \sigma_n$. $F(z)$ is called a defining function of the corresponding hyperfunction. For one variable, any complex neighborhood $U \supset \Omega$ is Stein, and the above isomorphism reads $\mathcal{O}(U \setminus \Omega)/\mathcal{O}(U) \cong \mathcal{B}(\Omega)$, from which the naturality of the sign in (15) follows.

Thus the notion of support is also legitimate for hyperfunctions. The sheaf of hyperfunctions $\mathcal{B}$ does not admit partitions of unity as for $\mathcal{D}'$. It is, however, flabby. Consequently, given a decomposition of a closed set into locally finite closed subsets $E = \bigcup_{\lambda \in \Lambda} E_\lambda$, and a hyperfunction $f$ with support in $E$, we can always find hyperfunctions $f_\lambda$ with support in $E_\lambda$ such that $f = \sum_{\lambda \in \Lambda} f_\lambda$. For distributions this property holds only under some regularity assumption for the decomposition.

There are several practical criteria to determine whether or not a hyperfunction is zero in some open set $\Omega$. These are called the edge of the wedge theorem. A hyperfunction $F(x + i\Gamma 0)$ with single expression is zero if and only if $F(z)$ itself is zero. $F_1(x + i\Gamma_1 0) = F_2(x + i\Gamma_2 0)$ if and only if they stick together to a function in $\mathcal{O}(\Omega + i(\Gamma_1 + \Gamma_2)0)$ (Epstein type). (Note that $\Gamma_1 + \Gamma_2$ is equal to the convex hull of $\Gamma_1 \cup \Gamma_2$, e.g., $\Gamma + (-\Gamma) = \mathbf{R}^n$ (Bogolyubov type).) $\sum_{j=1}^{N} F_j(x + i\Gamma_j 0) = 0$ if and only if there exist $G_{jk}(z) \in \mathcal{O}(\Omega + i(\Gamma_j + \Gamma_k)0)$, $j, k = 1, \ldots, N$, such that $G_{jk}(z) = -G_{kj}(z)$ and $F_j = \sum_{k=1}^{N} G_{jk}$, $j = 1, \ldots, N$ (A. Martineau [19]). These are interpretations of cohomology in terms of coverings and have global variants concerning the envelope of holomorphy.

The real analytic functions $\varphi \in \mathcal{A}(\Omega)$ on $\Omega$ are naturally included in $\mathcal{B}(\Omega)$ via the ex-

pression $\varphi(x + i\Gamma 0)$ for any $\Gamma$ or with $\Gamma = \mathbf{R}^n$. They form a subsheaf. Let $f$ be a hyperfunction on $\Omega$. The complement in $\Omega$ of the largest open subset $\Omega' \subset \Omega$, where $f(x)$ is real analytic, is called the singular support of $f(x)$ and is denoted by $\operatorname{sing\,supp} f$. If $\Omega$ is bounded and $\operatorname{sing\,supp} f \subset K$, then we can choose an expression of the form (13), where $F_j(z)$ can be continued analytically to $\Omega \setminus K$. If $\operatorname{supp} f \subset K$, then these $F_j(z)$ satisfy $\sum F_j(x) = 0$ on $\Omega \setminus K$ in the usual sense.

X. Operations on Hyperfunctions

The derivatives of a hyperfunction $f(x)$ with the expression (13) are defined through the defining functions as $D^\alpha f(x) = \sum (D^\alpha F_j)(x + i\Gamma_j 0)$. The product by a real analytic function $\psi(x)$ is defined by $\psi(x)f(x) = \sum (\psi F_j)(x + i\Gamma_j 0)$. Combining these, we have the operation of a linear partial differential operator with real analytic coefficients $P(x, D)$ on hyperfunctions. It is a †sheaf homomorphism.

Let $f \in \mathcal{B}(\Omega)$ and $D \subset \Omega$ be a compact set with piecewise smooth boundary. If $\operatorname{sing\,supp} f \cap \partial D = \emptyset$, then by means of the special expression (13) mentioned at the end of Section W the definite integral is defined as

$$\int_D f(x)\,dx = \sum_{j=1}^{N} \int_{D_j} F_j(z)\,dz,$$

where $D_j$ is a path deformed from $D$ in such a way that $\partial D_j = \partial D$ and $F_j(z)$ is holomorphic on $D_j$. The result is independent of the choice of deformations or of the boundary value expression employed.

If $\operatorname{supp} f \subset D$, the result is also independent of $D$; hence it can be written as $\int_{\mathbf{R}^n} f(x)\,dx$. If $f(x, t)$ is a hyperfunction of the two groups of variables $(x, t) \in \Omega \times V$ such that $\operatorname{sing\,supp} f \cap \partial D \times V = \emptyset$, then the integral $\int_D f(x, t)\,dx \in \mathcal{B}(V)$ is defined by the same method. It commutes with differentiation or integration with respect to $t$.

If $f(x) = \sum F_j(x + i\Gamma_j 0)$, $g(x) = \sum G_k(x + i\Delta_k 0)$, then we can define the product by $f(x)g(x) = \sum (F_jG_k)(x + i(\Gamma_j \cap \Delta_k)0)$ under the assumption that $\Gamma_j \cap \Delta_k \ne \emptyset$ for every pair $j$, $k$. Especially, the product $f(x)g(t)$ is always legitimate for two hyperfunctions depending on different groups of variables. (It can also be interpreted as the tensor product.)

A real analytic coordinate transformation $x = \Phi(\tilde{x}): \tilde{\Omega} \to \Omega$ extends naturally to a holomorphic coordinate transformation $z = \Phi(\tilde{z})$ on a complex neighborhood. A 0-wedge $\Omega + i\Gamma 0$ is transformed by $\Phi^{-1}$ to a twisted wedge containing, for each $x \in \Omega$ and $\Delta \Subset \Gamma$, a 0-wedge $\tilde{\Omega}_{\tilde{x}} + i(D\Phi^{-1})_x\Delta 0$ with some (real) neighborhood $\tilde{\Omega}_{\tilde{x}}$ of $\tilde{x} = \Phi^{-1}(x)$, where $(D\Phi^{-1})_x$ is the derivative or the Jacobian matrix of the mapping $\Phi^{-1}$ at $x$. Thus the pullback $\Phi^*f(\tilde{x}) = f(\Phi(\tilde{x})) \in \mathcal{B}(\tilde{\Omega})$ of $f(x) \in \mathcal{B}(\Omega)$ by the transformation $x = \Phi(\tilde{x})$ can be defined by substituting $z = \Phi(\tilde{z})$ into the defining functions. It is consistent with the definition of the coordinate transformation for real analytic functions. Also, the law of the change of variables for the definite integral is the same as that for real analytic functions. This can be extended to the general operation of substitution as mentioned in Section Q for distributions or even to much more general operations (→ Section CC).

The convolution $(f * g)(x) = \int_{\mathbf{R}^n} f(x - t)g(t)\,dt$ is defined under the same assumption on support as in the case of distributions. It can be defined either literally as the composition of the above operations, or directly by way of the defining functions: For example, if $g$ has compact support $\subset D$ and $\sum G_k(x + i\Delta_k 0)$ is an expression as mentioned at the end of Section W, then by choosing a suitable deformation $D_k$ of $D$, we have

$$(f * g)(x) = \sum_{j,k} \left[\int_{D_k} F_j(z - t)G_k(t)\,dt\right]_{z \mapsto x + i(\Gamma_j + \Delta_k)0}.$$

Y. Hyperfunctions and Analytic Functionals

The totality $\mathcal{B}[K]$ of hyperfunctions with support in a compact set $K$ becomes a nuclear Fréchet space. It is the dual of the nuclear (DF)-space $\mathcal{A}(K)$ of real analytic functions defined on a neighborhood of $K$. Thus $\mathcal{B}[K]$ is the space of analytic functionals with †porter in the real compact set $K$. The duality is given by the definite integral $\langle f, \varphi\rangle = \int_{\mathbf{R}^n} f(x)\varphi(x)\,dx$ for $f \in \mathcal{B}[K]$ and $\varphi \in \mathcal{A}(K)$. A sequence $\{f_k(x)\}$ in $\mathcal{B}[K]$ converges if and only if it admits an expression (13) for a common fixed set of 0-wedges $\Omega + i\Gamma_j 0$, $j = 1, \ldots, N$, such that the defining functions $F_{kj} \in \mathcal{O}(\Omega + i\Gamma_j 0)$ are also holomorphic on a fixed complex neighborhood $U$ of $\Omega \setminus K$ and converge locally uniformly in $\Omega + i\Gamma_j 0 \cup U$, where $\Omega$ is a (real) neighborhood of $K$.

$\mathcal{B}[K]$ is isomorphic to $H^n_K(\mathbf{C}^n, \mathcal{O})$, and the above duality is a special case of the Martineau-Harvey duality $H^n_K(\mathbf{C}^n, \mathcal{O}) = \mathcal{O}(K)'$ for a Stein compact set $K \subset \mathbf{C}^n$. In the 1-dimensional case this is due to G. Köthe. If $K = [a_1, b_1] \times \cdots \times [a_n, b_n]$, then $H^n_K(\mathbf{C}^n, \mathcal{O})$ can be represented, including the topology, by the quotient space

$$\mathcal{O}(U \# K) \Big/ \sum_{j=1}^{n} \mathcal{O}(U \#_j K),$$

where $U$ is a Stein neighborhood of $K$ and $U \# K$, $U \#_j K$ are defined in the same way as (17). If we choose $U = U_1 \times \cdots \times U_n$, then by

way of a defining function $F(z) \in \mathcal{O}(U \# K)$ for $f \in \mathcal{B}[K]$ the inner product is given by the contour integral

$$\langle f, \varphi\rangle = (-1)^n \oint_{\gamma_1} \cdots \oint_{\gamma_n} F(z)\varphi(z)\,dz_1 \cdots dz_n,$$

where $\gamma_j \subset U_j$ is a closed path surrounding $[a_j, b_j]$ once in the positive sense. Similar integral formulas are known for some special $K$ of various types.

Starting from analytic functionals we can reconstruct the sheaf of hyperfunctions. For example, we can put $\mathcal{B}(\Omega)$ = the totality of locally finite sums of analytic functionals with porter in $\Omega$ modulo the rearrangement of supports by decomposition (Martineau [17]). If $\Omega$ is bounded, we can also put $\mathcal{B}(\Omega) = \mathcal{B}[\bar{\Omega}]/\mathcal{B}[\partial\Omega]$ (Schapira [18]). The proof of localizability and/or flabbiness is based on the decomposability of support (which is the dual of the exact sequence $0 \to \mathcal{A}(K \cup L) \to \mathcal{A}(K) \oplus \mathcal{A}(L) \to \mathcal{A}(K \cap L) \to 0$) and the denseness of $\mathcal{B}[K] \subset \mathcal{B}[L]$ (which is the dual of the unique continuation property $0 \to \mathcal{A}(L) \to \mathcal{A}(K)$) for a pair $K \subset L$ with the same family of connected components. Note that in no way is the topology of $\mathcal{B}[K]$ localizable, or equivalently, $\mathcal{B}(\Omega)$ does not admit a reasonable topology.

Z. Embedding of Distributions

As the dual of the natural mapping $\mathcal{A}(K) \to \mathcal{E}(K)$ we have the topological embedding $\mathcal{E}'(K) \hookrightarrow \mathcal{A}(K)' = \mathcal{B}[K]$. This embedding conserves the support and hence gives rise to an embedding of sheaf $\mathcal{D}' \hookrightarrow \mathcal{B}$ (R. Harvey).

For a distribution $T$ with compact support a set of its defining functions as a hyperfunction is given by $F_\sigma(z) = T_x(W(z - x, \Gamma_\sigma))$, $\sigma$ being the multisignature, where $W(z, \Gamma_\sigma) = \int_{\Gamma_\sigma \cap S^{n-1}} W(z, \omega)\,d\omega$ and $W(z, \omega)$ is the component of a Radon decomposition of $\delta(x)$ (→ Section CC). If $\operatorname{supp} T \subset K = [a_1, b_1] \times \cdots \times [a_n, b_n]$, then as a hyperfunction it is represented by the following element of $\mathcal{O}(\mathbf{C}^n \# K)$:

$$F(z) = T_x\!\left(\frac{1}{(2\pi i)^n (x_1 - z_1) \cdots (x_n - z_n)}\right). \tag{19}$$

It is in fact in $\mathcal{O}((\mathbf{P}^1)^n \# K)$ and vanishes at infinity, where $\mathbf{P}^1 = \mathbf{C}^1 \cup \{\infty\}$ is the Riemann sphere. These formulas are valid also for hyperfunctions, and they give defining functions of some canonical types. Especially, the one given by (19) is called the standard defining function and is characterized by the foregoing properties.

For the above-mentioned defining functions for a distribution $T$, we have the following convergence in the sense of distributions on $\mathbf{R}^n$:

$$T_x = \sum_\sigma \operatorname{sgn}\sigma \lim_{y_\sigma \to 0} F_\sigma(x + iy_\sigma) \quad \text{for } y_\sigma \in \Gamma_\sigma.$$

More generally if the limit

$$\sum_{j=1}^{N} \lim_{y_j \to 0} F_j(x + iy_j) \quad \text{for } y_j \in \Gamma_j \tag{20}$$

exists in $\mathcal{D}'(\Omega)$, then the distribution defined as this limit admits $\{F_j(z)\}$ as a set of defining functions when it is considered as a hyperfunction. However, for an arbitrary set of defining functions for a distribution, (20) does not necessarily converge in $\mathcal{D}'(\Omega)$. (This is because we can add terms with any bad behavior which cancel each other formally.) If a distribution admits the boundary value expression with only one term, or if the dimension is one, then the convergence of (20) in $\mathcal{D}'(\Omega)$ necessarily holds.

The convergence of (20) in $\mathcal{D}'(\Omega)$ is equivalent to the locally uniform estimate for the defining functions of the type $F_j(z) = O(|y|^{-M})$ for some $M > 0$. These assertions can be generalized to ultradistributions. For $\mathcal{D}'_{(M_p)}(\Omega)$ (resp. $\mathcal{D}'_{\{M_p\}}(\Omega)$) the last growth condition for the defining functions reads as follows: $F_j(z) = O(\exp M^*(L/|y|))$ for some $L$ (resp. every $L > 0$), where $M^*(\rho) = \sup_p \log(\rho^p p!\,M_0/M_p)$; especially for $M_p = p!^s$ we have $M^*(\rho) \sim \rho^{1/(s-1)}$.

AA. Hyperfunctions on a Real Analytic Manifold

Sticking the hyperfunctions on coordinate patches by the transformation law mentioned in Section X, we can define the sheaf of hyperfunctions on a real analytic manifold. More generally, for a real analytic vector bundle over a real analytic manifold, we can consider the sheaf of its hyperfunction local cross sections, which is also flabby. Thus, especially on a real analytic manifold $M$, we can obtain a concrete flabby resolution of the constant sheaf $\mathbf{C}_M$ of length $\dim M$ by the sheaves of differential forms with hyperfunction coefficients: $0 \to \mathbf{C}_M \to \mathcal{B}^{(0)} \xrightarrow{d} \mathcal{B}^{(1)} \xrightarrow{d} \cdots \xrightarrow{d} \mathcal{B}^{(\dim M)} \to 0$. With this sequence we can calculate the relative cohomology groups of open pairs with coefficients in $\mathbf{C}$ by an analytic method. This is an extension of the de Rham theory for distributions [16].

If $M$ is a compact manifold equipped with a nowhere vanishing real analytic density globally defined on $M$, then we have the topological duality $\mathcal{B}(M) = \mathcal{A}(M)'$. The inner product is given by the definite integral with respect to the density.
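Returning to the convergence statement around (19) and (20) in Section Z, the one-variable case $T = \delta$ can be made concrete. The following is a minimal numerical sketch, assuming the standard defining function $F(z) = -1/(2\pi i z)$ obtained from (19) for $T = \delta$ and the Gaussian test function $\varphi(x) = e^{-x^2}$; the function names, the truncation interval, and the step size are illustrative assumptions. For $y > 0$ the difference $F(x + iy) - F(x - iy)$ equals the Poisson kernel $y/(\pi(x^2 + y^2))$, and its pairing with $\varphi$ should approach $\varphi(0) = 1$ as $y \to 0$.

```python
import math


def standard_defining_function(z):
    """F(z) = -1/(2*pi*i*z): formula (19) applied to T = delta in one variable."""
    return -1.0 / (2j * math.pi * z)


def boundary_value_difference(x, y):
    """F(x + iy) - F(x - iy) for y > 0; the result is real up to rounding."""
    upper = standard_defining_function(complex(x, y))
    lower = standard_defining_function(complex(x, -y))
    return (upper - lower).real


def pair_with_test_function(y, phi, half_width=8.0, step=1e-3):
    """Trapezoidal approximation of the pairing over [-half_width, half_width]."""
    n = int(round(2 * half_width / step))
    total = 0.0
    for k in range(n + 1):
        x = -half_width + k * step
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * boundary_value_difference(x, y) * phi(x)
    return total * step


def gaussian(x):
    """Illustrative test function with phi(0) = 1."""
    return math.exp(-x * x)


values = {y: pair_with_test_function(y, gaussian) for y in (0.1, 0.01)}
for y, value in values.items():
    print(y, value)
```

For this particular test function the pairing has the closed form $e^{y^2}\operatorname{erfc}(y)$, so the deviation from $\varphi(0)$ shrinks roughly linearly in $y$; the sketch only illustrates the distributional limit and says nothing about the hyperfunction boundary value, which is formal.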

The Fourier series is an example of hyperfunctions on real analytic manifolds. The series $\sum c_p \exp(2\pi i p x)$ converges in $\mathcal{B}(\mathbf{T}^n)$ and defines a periodic hyperfunction if and only if $c_p$ is of infra-exponential growth, i.e., $c_p = O(e^{\varepsilon|p|})$ for any $\varepsilon > 0$. $\mathbf{T}^n$ has the global complex neighborhood $(\mathbf{P}^1)^n$, and $f(x) \in \mathcal{B}(\mathbf{T}^n)$ has the corresponding boundary value expression

which represents the terms in the Fourier series such that $\sigma_j p_j > 0$.

BB. Fourier Hyperfunctions

In place of $\mathcal{S}$ we take as the basic space the space $\mathcal{P}_*$ of exponentially decreasing real analytic functions in the sense of M. Sato [15]: $f(x) \in \mathcal{P}_*$ if and only if there exist $\delta > 0$ and $\varepsilon > 0$ such that for any $\delta' < \delta$ and $\varepsilon' < \varepsilon$, $f(x)$ extends holomorphically to the neighborhood $\{|\operatorname{Im} z| < \delta'\}$ of the real axis and has order $O(e^{-\varepsilon'|\operatorname{Re} z|})$ there. $\mathcal{P}_*$ is endowed with the structure of nuclear (DF)-space via the inductive limit for $\delta > 0$, $\varepsilon > 0$. The classical Fourier transform $\mathcal{F}$ acts isomorphically on $\mathcal{P}_*$. (In fact, $\delta$ and $\varepsilon$ change their roles under $\mathcal{F}$.) The strong dual of $\mathcal{P}_*$ is called the space of Fourier hyperfunctions and is denoted by $\mathcal{Q}$. It is a nuclear Fréchet space. It contains $\mathcal{S}'$ as a dense subspace in view of the continuous and dense inclusion $\mathcal{P}_* \hookrightarrow \mathcal{S}$. It also contains classical locally integrable functions of infra-exponential growth, i.e., of order $e^{\varepsilon|x|}$ for any $\varepsilon > 0$. Thus by the duality we obtain a wider extension of Fourier transformation on $\mathcal{Q}$.

In the following a 0-wedge of the form $\mathbf{R}^n + i\Gamma 0$ will be called a 0-wedge of the form $\mathbf{D}^n + i\Gamma 0$ at the same time if it is a tubular domain (i.e., with fixed imaginary part $\Gamma 0$). Then an element $f(x) \in \mathcal{Q}$ can be expressed in the form (13), where each $F_j(z)$ is holomorphic in a 0-wedge $\mathbf{D}^n + i\Gamma_j 0$ and is of infra-exponential growth there locally uniformly in $\operatorname{Im} z$. The inner product of such $f(x)$ with $\varphi \in \mathcal{P}_*$ is given by the definite integral

$$\int_{\mathbf{R}^n} f(x)\varphi(x)\,dx = \sum_{j=1}^{N} \int_{\operatorname{Im} z = y_j} F_j(z)\varphi(z)\,dz,$$

where the $y_j \in \Gamma_j 0$ are fixed. Given a cone $\Delta$ we define its dual cone by $\Delta^\circ = \{\xi \in \mathbf{R}^n \mid \langle \xi, y\rangle \ge 0 \text{ for all } y \in \Delta\}$. If $F_j(z)$ are all of exponential decrease in $\operatorname{Re} z$ locally uniformly with respect to $\operatorname{Im} z \in \Gamma_j 0$ and $\operatorname{Re} z/|\operatorname{Re} z| \notin \Delta^\circ$, then the definite integral

$$G(\zeta) = (\sqrt{2\pi})^{-n} \int_{\mathbf{R}^n} e^{-ix\zeta} f(x)\,dx = (\sqrt{2\pi})^{-n} \sum_{j=1}^{N} \int_{\operatorname{Im} z = y_j} e^{-iz\zeta} F_j(z)\,dz \tag{21}$$

converges locally uniformly for $\zeta$ in some 0-wedge $\mathbf{D}^n - i\Delta 0$, and defines there a holomorphic function of infra-exponential growth. Thus we obtain a Fourier hyperfunction $G(\xi - i\Delta 0)$ that agrees with $\mathcal{F}f$ calculated by the duality. For a general $f(x) \in \mathcal{Q}$, the Fourier transform in the manner of Sato is calculated as follows: First we decompose $f(x)$ into the sum $\sum f_k(x)$ for which the defining functions of $f_k(x)$ decrease exponentially outside $\Delta_k^\circ$. Then we calculate $G_k(\zeta)$ by (21) and put $\mathcal{F}f = \sum G_k(\xi - i\Delta_k 0)$. An example of such decomposition is given by multiplication by $\chi_\sigma(x) = \prod_{j=1}^{n} 1/(1 + \exp \sigma_j x_j)$, which decreases exponentially outside $\Gamma_\sigma = \{\sigma_j x_j > 0\}$.

The relation between $\mathcal{Q}$ and $\mathcal{B}$ is more complicated than the relation $\mathcal{S}' \hookrightarrow \mathcal{D}'(\mathbf{R}^n)$. The growth condition for $\mathcal{Q}$ is interpreted as a condition concerning germs at infinity. Thus $\mathcal{Q}$ can be considered to be a sheaf on the directional compactification $\mathbf{D}^n = \mathbf{R}^n \cup S^{n-1}_\infty$ such that $\mathcal{Q}|_{\mathbf{R}^n} = \mathcal{B}$. Just as the sheaf $\mathcal{B}$ is obtained from $\mathcal{O}$, the sheaf $\mathcal{Q}$ is obtained as the $n$th derived sheaf $\mathcal{H}^n_{\mathbf{D}^n}(\tilde{\mathcal{O}})$ from the sheaf $\tilde{\mathcal{O}}$ on $\mathbf{D}^n + i\mathbf{R}^n$ consisting of germs of holomorphic functions of infra-exponential growth with respect to $\operatorname{Re} z$. We have $H^k_\Omega(\mathbf{D}^n + i\mathbf{R}^n, \tilde{\mathcal{O}}) = 0$ for $k \ne n$ for any open set $\Omega \subset \mathbf{D}^n$. Especially, $\mathcal{Q}$ is flabby, and the decomposition of support is available to calculate the Fourier transform. The symbol $\mathcal{Q}$ employed at the beginning to express the global Fourier hyperfunctions corresponds to $\mathcal{Q}(\mathbf{D}^n)$, and $\mathcal{Q}(\mathbf{R}^n) = \mathcal{B}(\mathbf{R}^n)$ by definition. The canonical restriction mapping $\mathcal{Q}(\mathbf{D}^n) \to \mathcal{B}(\mathbf{R}^n)$ is surjective but not injective.

As for tempered distributions $\mathcal{S}'$, we can introduce various subclasses of Fourier hyperfunctions, e.g., exponentially decreasing Fourier hyperfunctions $\bigcup_{\delta > 0} \exp(-\delta\sqrt{1 + x^2})\,\mathcal{Q}$, real analytic functions of infra-exponential growth $\mathcal{P}(\mathbf{D}^n)$, etc. We can also consider operations such as convolution and multiplication between adequate pairs, and apply differential operators with suitable coefficients. Concerning these we can avail ourselves of the same formulas as given in Section O.

A hyperfunction with compact support is naturally considered as a Fourier hyperfunction, and its Fourier transform agrees with the inner product $\langle f(x), (\sqrt{2\pi})^{-n}\exp(-ix\zeta)\rangle$, which gives an entire function of exponential type.

Paley-Wiener theorem. An entire function $f(\zeta)$ is the Fourier transform of a hyperfunction with support in a compact convex set $K$ in $\mathbf{R}^n$ if and only if it satisfies condition (i) of Section O.

The theory of Fourier hyperfunctions described above is mainly due to Sato and T. Kawai [20]. They are not the largest class of generalized functions stable under the Fourier
transformation. Following a suggestion of Sato, S. Nagamachi and N. Mugibayashi, Y. Saburi, and Y. Ito have extended them to the modified Fourier hyperfunctions in which the radial compactification $\mathbf{D}^{2n}$ of $\mathbf{C}^n$ is employed instead of the horizontal compactification $\mathbf{D}^n+i\mathbf{R}^n$ in the above theory. If we discard the localizing property, they can be extended further to the Fourier ultrahyperfunctions or ultradistributions of J. Sebastião e Silva (1958), M. Hasumi (1961), and M. Morimoto (1973).

CC. Micro-Analyticity of Hyperfunctions

The boundary value expression (13) for a hyperfunction $f(x)$ can be interpreted conversely as the description of the state of analytic continuation of $f(x)$ to the complex domain. Thus we say that $f$ is micro-analytic at $(x_0,\xi_0)$ if on a neighborhood of $x_0$ it admits the analytic continuation into the half-space $\langle\operatorname{Im}z,\xi_0\rangle<0$ in the sense that it admits an expression (13) satisfying $\Gamma_j\cap\{\langle\operatorname{Im}z,\xi_0\rangle<0\}\ne\emptyset$ for every $j$. The set of points $(x_0,\xi_0)\in\Omega\times S^{n-1}$, where $f\in\mathcal{B}(\Omega)$ is not micro-analytic, is called the singularity spectrum or singular spectrum of $f$ and is denoted by S.S.$f$. We have by definition S.S.$F(x+i\Gamma 0)\subset\Omega\times(\Gamma^{\circ}\cap S^{n-1})$. Micro-analyticity can be characterized by the Fourier transformation as follows: $f$ is micro-analytic at $(x_0,\xi_0)$ if and only if there exists a Fourier hyperfunction $g(x)$ such that the Fourier transform $\hat g(\xi)$ decreases exponentially on a †conical neighborhood of $\xi_0$ and that $f-g$ is real analytic on a neighborhood of $x_0$. A hyperfunction $f(x)$ is real analytic in a neighborhood of $x_0$ if and only if it is micro-analytic at $(x_0,\xi)$ for any $\xi\in S^{n-1}$.

With this notion we can clarify the operations on hyperfunctions. (In the following $S^{n-1}$ is identified with $(\mathbf{R}^n\setminus\{0\})/\mathbf{R}_+$ and $+$ denotes the sum in the latter.) We have

$$\text{S.S.}(fg)\subset\{(x,\lambda\xi+(1-\lambda)\eta)\mid(x,\xi)\in\text{S.S.}f,\ (x,\eta)\in\text{S.S.}g,\ 0\le\lambda\le 1\}\cup\text{S.S.}f\cup\text{S.S.}g,$$

$$\text{S.S.}f(\Phi(\tilde x))\subset\{(\tilde x,{}^tD\Phi(\tilde x)\xi)\mid(\Phi(\tilde x),\xi)\in\text{S.S.}f\},$$

and these operations are legitimate if and only if 0 does not appear in the direction component of the right-hand side. We also have

$$\text{S.S.}\int f(x,t)\,dt\subset\{(x,\xi)\mid(x,t,\xi,0)\in\text{S.S.}f\text{ for some }t\},$$

$$\text{S.S.}(f*g)\subset\{(x+y,\xi)\mid(x,\xi)\in\text{S.S.}f,\ (y,\xi)\in\text{S.S.}g\}$$

under suitable conditions for support.

The classical decomposition formula of Radon (or the plane wave decomposition of $\delta$)

$$\delta(x)=\frac{(n-1)!}{(-2\pi i)^n}\int_{S^{n-1}}\frac{d\omega}{(x\omega+i0)^n} \qquad (22)$$

can be interpreted as the decomposition of $\delta$ into hyperfunctions with S.S. in the single direction $\omega$. By the convolution, this formula supplies similar decomposition for a general hyperfunction (called the singular spectrum decomposition). (22) can be generalized to

$$\delta(x)=\frac{(n-1)!}{(-2\pi i)^n}\int_{S^{n-1}}\frac{\det(\operatorname{grad}_{\omega}\Psi(x,\omega))}{(\Phi(x,\omega)+i0)^n}\,d\omega, \qquad (23)$$

where the twisted phase $\Phi(x,\omega)$ satisfies (i) $\Phi(x,\omega)$ is a real analytic function of positive type (i.e., $\operatorname{Re}\Phi(x,\omega)=0$ implies $\operatorname{Im}\Phi(x,\omega)\ge 0$) and (ii) $\Phi(x,\omega)$ is homogeneous of order 1 in $\omega$ and $\Phi(0,\omega)=0$, $\operatorname{grad}_x\Phi(0,\omega)=\omega$; and the vector $\Psi(x,\omega)$ is such that $\langle\Psi(x,\omega),x\rangle=\Phi(x,\omega)$. If $\Phi(x,\omega)$ further satisfies $\Phi(x,\omega)\ne 0$ for $x\ne 0$, then the component becomes a hyperfunction (even a distribution) of $x$ whose S.S. is precisely one point $(0,\omega)$, and this fact is useful in theoretical applications [7,21] (→ 274 Microlocal Analysis).

DD. Structure Theorems of Hyperfunctions

A hyperfunction whose support is concentrated at the origin is expressed as the infinite derivative $J(D)\delta(x)=\sum a_{\alpha}D^{\alpha}\delta(x)$ of the Dirac measure, with the coefficients satisfying $\lim(|a_{\alpha}|\alpha!)^{1/|\alpha|}=0$. Such an operator $J(D)$ is called a local operator with constant coefficients and acts on $\mathcal{B}$ as a sheaf homomorphism. Its total symbol $J(\zeta)$ is an entire function of infra-exponential growth or of type $[1,0]$. By way of such an operator, a general (Fourier) hyperfunction can be written in the form $J(D)g(x)$ with a continuous function on $\mathbf{R}^n$ (of infra-exponential growth).

If $0\in\operatorname{supp}f\subset\{\langle\nu,x\rangle\ge 0\}$, then S.S.$f\ni(0,\pm\nu)$ (Holmgren type theorem of Kashiwara and Kawai), and furthermore, the direction component of S.S.$f$ at 0 has the form of $\{\pm\nu\}\cup\rho^{-1}(E)$ with some $E\subset S^{n-2}$, where $\rho:S^{n-1}\setminus\{\pm\nu\}\to S^{n-2}$ is the projection to the equator (the watermelon-slicing theorem of Morimoto, Kashiwara, and K. Kataoka). $E$ is called the reduced S.S. of $f$ at 0. These theorems have many applications in the theory of linear partial differential equations and also in physics.

A hyperfunction $f(x)$ with support in the hyperplane $x_n=0$ has several further remarkable properties in $x_n$. It admits a formal expansion of the form $\sum_{k=0}^{\infty}f_k(x')\delta^{(k)}(x_n)$, where $x'=(x_1,\dots,x_{n-1})$ and $f_k(x')=\int f(x)(-x_n)^k/k!\,dx_n$. The sum converges in the sense of the topology if $f$ has compact support. It reduces locally to a finite sum if $f$ is a distribution, and the $k$th term represents the $k$-ple layer distribution of mass, electric charge, etc. For a general $f$ the coefficients $\{f_k(x')\}$ do not necessarily determine $f$, though they are determined by $f$.

EE. Complex Powers of Polynomials

Among examples of special generalized functions the most important are those of the form $f_+^{\lambda}$, where $f_+=\max\{f(x),0\}$ (or more generally it can be replaced by zero on some connected components of $\{f(x)>0\}$), and $\lambda\in\mathbf{C}$ denotes a holomorphic parameter. (The discussion is the same for $f_-=\max\{-f(x),0\}$.) The simplest example, $x_+^{\lambda}$, is defined as the analytic continuation of the locally integrable function $x_+^{\lambda}$ for $\operatorname{Re}\lambda>-1$ by repeated use of the formula $x_+^{\lambda}=(\lambda+1)^{-1}D_x x_+^{\lambda+1}$, and becomes meromorphic in $\lambda$ with simple poles at $\lambda=-1,-2,\dots$. As a hyperfunction we have $x_+^{\lambda}=\{(-x+i0)^{\lambda}-(-x-i0)^{\lambda}\}/2i\sin\pi\lambda$. At a negative integer $\lambda=-n$, $x_+^{\lambda}$ has residue $(-1)^{n-1}\delta^{(n-1)}(x)/(n-1)!$ and finite part $[-(2\pi i)^{-1}z^{-n}\log(-z)]$. The latter is often denoted by $x_+^{-n}$. In general, for a germ of a real-valued real analytic function $f(x)$ we can find a differential operator $P(\lambda,x,D_x)$ with polynomial coefficients in $\lambda$ and a monic polynomial $b(\lambda)$ of minimum degree such that

$$P(\lambda,x,D_x)f_+^{\lambda+1}=b(\lambda)f_+^{\lambda} \qquad (24)$$

(Sato, I. N. Bernshtein, Kashiwara, J.-E. Björk [22]). This formula gives the analytic continuation of $f_+^{\lambda}$ just as for $x_+^{\lambda}$. The polynomial $b(\lambda)$ is called the b-function or the Sato-Bernshtein polynomial and contains valuable information regarding the singularity of $f$. It has only negative rational roots (Kashiwara). We have $f\cdot f_+^{\lambda}=f_+^{\lambda+1}$; hence $f_+^{-1}-f_-^{-1}$ (suitably interpreted as above) gives a solution of the division problem $u\cdot f=1$. Thus if $f$ is a polynomial, its inverse Fourier transform gives a tempered fundamental solution of $f(-iD)$. Furthermore, when $f$ is the relative invariant of a †prehomogeneous vector space, we can calculate $b(\lambda)$ explicitly by way of the holonomy diagram. Also, the Fourier transform of $f_+^{\lambda}$ can be calculated explicitly by way of the real holonomy diagram as a linear combination of the corresponding objects for the dual prehomogeneous vector space with coefficients similar to the †Maslov index. The simplest example is

Among practical examples are the following classical formulas: Let $P(x)$ be a nondegenerate real quadratic form and $Q(\xi)$ its †dual form, and let $q$ denote the number of negative eigenvalues of $P(x)$. Then

$$\mathcal{F}P_+^{\lambda}=-(\sqrt{2\pi})^{-n}2^{2\lambda+n}\pi^{n/2-1}\Gamma(\lambda+1)\Gamma(\lambda+n/2)\,|\det P|^{-1/2}\bigl\{\sin\pi(\lambda+q/2)\,Q_+^{-\lambda-n/2}\cdots\bigr\}$$

Here the arguments in the $\Gamma$-factor $(\lambda+1)(\lambda+n/2)$ give the b-function of $P(x)$. If $q=n-1$, we further have, letting $P_{+\pm}^{\lambda}=P_+^{\lambda}\,Y(\pm\langle x,v\rangle)$ for an eigenvector $v$ corresponding to the unique positive eigenvalue,

$$\mathcal{F}P_{+\pm}^{\lambda}=(2\pi)^{-n/2}2^{2\lambda+n-1}\pi^{n/2-1}\Gamma(\lambda+1)\Gamma(\lambda+n/2)\cdots$$

From these formulas (taking the finite part if necessary) we obtain the fundamental solution of the wave equation, the Laplacian, and their iterations. These are exactly the distributions introduced by Hadamard, M. Riesz, and others, as mentioned in Section A (→ Appendix A, Table 15.V).

References

[1] L. Schwartz, Théorie des distributions, Hermann, revised edition, 1966.
[2] L. Hörmander, Linear partial differential operators, Springer, 1963.
[3] G. de Rham, Variétés différentiables, Hermann, 1955.
[4] I. M. Gel'fand and G. E. Shilov, Generalized functions. I, Properties and operations; II, Spaces of fundamental and generalized functions; III, Theory of differential equations; IV (I. M. Gel'fand and N. Ya. Vilenkin), Applications of harmonic analysis; V (I. M. Gel'fand, M. I. Graev, and N. Ya. Vilenkin), Integral geometry and representation theory, Academic Press, 1964, 1968, 1967, 1964, 1966; VI (I. M. Gel'fand, M. I. Graev, and I. I. Pyatetskiĭ-Shapiro), Representation theory and automorphic functions, Saunders, 1969. (Originals in Russian.)
[5] C. Roumieu, Sur quelques extensions de la notion de distribution, Ann. Sci. Ecole Norm. Sup. Paris, 77 (1960), 41-121.
[6] M. Sato, Theory of hyperfunctions I, II, J. Fac. Sci. Univ. Tokyo, sec. I, 8 (1959), 139-193, 387-437.
[7] M. Sato, T. Kawai, and M. Kashiwara, Microfunctions and pseudo-differential equations, in [16, pp. 265-529].
[8] L. Schwartz, Espaces de fonctions différentiables à valeurs vectorielles, J. Analyse Math., 4 (1954-1955), 88-148.
[9] A. Grothendieck, Produits tensoriels topologiques et espaces nucléaires, Mem. Amer. Math. Soc., 16 (1955).
[10] L. Schwartz, Théorie des distributions à valeurs vectorielles, Ann. Inst. Fourier, 7 (1957), 1-141; 8 (1958), 1-209.
[11] F. Treves, Topological vector spaces, distributions and kernels, Academic Press, 1967.
[12] L. Ehrenpreis, Fourier analysis in several complex variables, Wiley-Interscience, 1970.
[13] G. Björck, Linear partial differential operators and generalized distributions, Ark. Mat., 6 (1966), 351-407.
[14] H. Komatsu, Ultradistributions. I, Structure theorems and a characterization; II, The kernel theorem and ultradistributions with support in a submanifold; III, Vector-valued ultradistributions and the theory of kernels, J. Fac. Sci. Univ. Tokyo, sec. IA, 20 (1973), 25-105; 24 (1977), 607-628; 29 (1982), 653-718.
[15] M. Sato, Theory of hyperfunctions (in Japanese), Sugaku, 10 (1958), 1-27.
[16] H. Komatsu (ed.), Hyperfunctions and pseudo-differential equations, Lecture notes in math. 287, Springer, 1973.
[17] A. Martineau, Les hyperfonctions de M. Sato, Sém. Bourbaki, 13 (1960-1961), no. 214.
[18] P. Schapira, Théorie des hyperfonctions, Lecture notes in math. 126, Springer, 1970.
[19] A. Martineau, Le "edge of the wedge theorem" en théorie des hyperfonctions de Sato, Proc. Intern. Conf. Functional Analysis and Related Topics, Tokyo, 1969, Univ. Tokyo Press, 1970, 95-106.
[20] T. Kawai, On the theory of Fourier hyperfunctions and its applications to partial differential equations with constant coefficients, J. Fac. Sci. Univ. Tokyo, sec. IA, 17 (1970), 467-517.
[21] K. Kataoka, On the theory of Radon transformations of hyperfunctions, J. Fac. Sci. Univ. Tokyo, sec. IA, 28 (1981), 331-413.
[22] J.-E. Björk, Rings of differential operators, North-Holland, 1979.

126 (IX.22)
Dynamical Systems

A. History

The theory of dynamical systems began with the investigation of the motion of planets in ancient astronomy. Qualitative investigation of mechanics in antiquity and the Middle Ages culminated in the work of J. Kepler and G. Galilei in the 17th century. At the end of that century, I. Newton founded his celebrated Newtonian mechanics, by means of which Kepler's law on the motion of planets and Galileo's observations of movement can be explained theoretically. Following this, L. Euler, J. L. Lagrange, P. S. Laplace, W. R. Hamilton, C. G. J. Jacobi, and others developed the theory using analytical methods, and founded analytical dynamics. From the end of the 18th century through the 19th century, the †three-body problem attracted the attention of many mathematicians. At the end of the 19th century, H. Bruns and H. Poincaré found that general solutions of the three-body problem could not be obtained by †quadrature, and this gave rise to a crisis of analytical dynamics. But this was resolved by Poincaré himself. He pointed out the importance of the qualitative theory based on topological methods, and obtained many fundamental results. A. M. Lyapunov with his theory of stability and G. D. Birkhoff with his many important concepts of topological dynamics established foundations of the new qualitative theory. In 1937 A. A. Andronov and L. S. Pontryagin introduced the concept of structural stability, which attracted the attention of S. Lefschetz. Lefschetz's school investigated structural stability and †nonlinear oscillations, and obtained many important results (H. F. de Baggis, L. Markus, M. M. Peixoto, and others). In about 1960, S. Smale initiated study of differentiable dynamical systems under the influence of Lefschetz's school. Smale and his school founded a new theory of differentiable dynamical systems using †differential topology. D. V. Anosov generalized the work of E. Hopf and G. A. Hedlund on †geodesic flows of closed surfaces of †constant negative curvature and established the concept of Anosov systems, which played an important role in Smale's theory. The work of Hopf, Hedlund, and Anosov is closely related to †ergodic theory. Ya. G. Sinai and R. Bowen obtained important results in ergodic theory. The concept of structural stability and its generalization are essential in the †catastrophe theory of R. Thom (→ 51 Catastrophe Theory); the theory of bifurcation of dynamical systems is another essential part of catastrophe theory. D. Ruelle and F. Takens proposed a new mathematical mechanism for the generation of turbulence using Smale's theory and Hopf bifurcation. The new theory of dynamical systems developed by Smale and others is now applied to the mathematical explanation of chaotic phenomena in many branches of science. Finally, we mention that in the 1960s A. N. Kolmogorov, V. I. Arnold, and J. Moser obtained remarkable results on the existence of quasiperiodic solutions for the n-body problem, which turned out to solve the long-standing problem of the stability of the solar system.
B. Definitions of Dynamical Systems

In the study of the evolution of physical, biological, and other systems, we construct mathematical models of the systems. Usually, the state of a given system is completely described by a collection of continuous parameters, which may be related in some cases. Thus the space $X$ of all possible states of the system can be regarded as a Euclidean space or a subset of a Euclidean space defined by some equations. In general, we assume that the space $X$ of all possible states of the system forms a †topological space, and we call it a state space or a phase space. Second, we assume that the law of evolution of states in time is given, by which we can tell the state $x_1$ at any time $t_1$ if we know the state $x_0$ at time $t_0$. Assigning $x_1$ to $x_0$, we have a mapping $\pi(t_1,t_0):X\to X$ for any times $t_0$ and $t_1$, which satisfies the following conditions: (i) $\pi(t_2,t_1)\circ\pi(t_1,t_0)=\pi(t_2,t_0)$; (ii) $\pi(t_0,t_0)=1_X$, the identity mapping of $X$. Finally, we assume that the mapping $\pi(t_1,t_0)$ depends only on $t=t_1-t_0$. Writing $\pi_t=\pi(t_1,t_0)$ if $t=t_1-t_0$, we have the following conditions from (i) and (ii) above: (i') $\pi_s\circ\pi_t=\pi_{s+t}$, $s,t\in\mathbf{R}$; (ii') $\pi_0=1_X$.

In general, the theory of topological dynamics can be regarded as the study of topological transformation groups (→ 431 Transformation Groups) originating in the topological investigations of problems arising from classical mechanics. Here, we restrict our attention to some important special cases.

(1) Let $X$ be a topological space and $\mathbf{R}$ the additive topological group of real numbers. Let $\varphi:X\times\mathbf{R}\to X$ be a continuous mapping. For each $t\in\mathbf{R}$, we define a mapping $\varphi_t:X\to X$ by $\varphi_t(x)=\varphi(x,t)$, $x\in X$. If the family of mappings $\{\varphi_t\}_{t\in\mathbf{R}}$ satisfies the following conditions, we say that $(X,\varphi)$ is a (continuous) R-action, a (continuous) flow, or a (continuous) dynamical system on $X$, and that $X$ is the phase space: (i) $\varphi_s\circ\varphi_t=\varphi_{s+t}$ for all $s,t\in\mathbf{R}$; (ii) $\varphi_0=1_X$.

Let $(X,\varphi)$ be a flow. Then $\varphi_t:X\to X$ is a †homeomorphism with $(\varphi_t)^{-1}=\varphi_{-t}$ for each $t\in\mathbf{R}$. For each $x\in X$, define a mapping $\varphi^x:\mathbf{R}\to X$ by $\varphi^x(t)=\varphi(x,t)$, $t\in\mathbf{R}$. The mapping $\varphi^x$ is called a motion through $x$, and its image $C(x)=\{\varphi^x(t)\mid t\in\mathbf{R}\}$ is called the orbit or the trajectory through $x$. An orbit is nonempty, and two orbits are either identical or mutually disjoint. The family of orbits fills up the phase space $X$ and is called the phase portrait.

Let $\mathbf{R}_+$ be the additive †semigroup of all nonnegative real numbers. If we replace $\mathbf{R}$ by $\mathbf{R}_+$ in the definition of a (continuous) flow, we obtain a definition of a (continuous) semiflow. For a semiflow $(X,\varphi)$, the mapping $\varphi_t:X\to X$, $t\in\mathbf{R}_+$, is in general not a homeomorphism but a continuous mapping.

Let $(X,\varphi)$ and $(Y,\psi)$ be flows. A homeomorphism $h:X\to Y$ is called a topological equivalence from $(X,\varphi)$ to $(Y,\psi)$ if it maps each orbit of $\varphi$ onto an orbit of $\psi$ preserving orientations of orbits (i.e., there exists an increasing homeomorphism $\alpha_x:\mathbf{R}\to\mathbf{R}$ for each $x\in X$ such that $h\varphi(x,t)=\psi(h(x),\alpha_x(t))$ for all $t\in\mathbf{R}$). Two flows are topologically equivalent if there exists a topological equivalence from one to the other. If two flows are topologically equivalent, their phase portraits have the same topological structure. Two flows $(X,\varphi)$ and $(Y,\psi)$ are flow equivalent if there exist a $c>0$ and a homeomorphism $h:X\to Y$ such that $h\varphi(x,t)=\psi(h(x),ct)$ for all $t\in\mathbf{R}$. Such an $h$ is a topological equivalence from $(X,\varphi)$ to $(Y,\psi)$.

(2) Let $X$ be a topological space and $\mathbf{Z}$ the additive group of integers. If we replace $\mathbf{R}$ by $\mathbf{Z}$ in the definition of a continuous flow, we obtain a definition of a (continuous) Z-action, a discrete flow, or a discrete dynamical system on $X$. If $(X,\varphi)$ is a discrete flow, then $f=\varphi_1:X\to X$ is a homeomorphism and $\varphi_n=f^n$ for all $n\in\mathbf{Z}$. Conversely, for a given homeomorphism $f:X\to X$, define a mapping $\varphi:X\times\mathbf{Z}\to X$ by $\varphi(x,n)=f^n(x)$, $x\in X$ and $n\in\mathbf{Z}$. Then $(X,\varphi)$ is a discrete flow such that $\varphi_n=f^n$ for $n\in\mathbf{Z}$. So we identify a homeomorphism with a discrete flow. Thus the orbit of a homeomorphism $f:X\to X$ through $x$ is $C(x)=\{f^n(x)\mid n\in\mathbf{Z}\}$.

Let $\mathbf{Z}_+$ be the additive semigroup of all nonnegative integers. If we replace $\mathbf{R}$ by $\mathbf{Z}_+$ in the definition of a continuous flow, we obtain a definition of a discrete semiflow on $X$. For a discrete semiflow $(X,\varphi)$, the mapping $\varphi_n:X\to X$, $n\in\mathbf{Z}_+$, is in general not a homeomorphism but a continuous mapping. We can identify a continuous mapping $f:X\to X$ with a discrete semiflow $(X,\varphi)$ in a natural way as above.

Let $f:X\to X$ and $g:Y\to Y$ be two homeomorphisms (continuous mappings). A homeomorphism $h:X\to Y$ such that $h\circ f=g\circ h$ is called a topological conjugacy from $f$ to $g$. And $f$ and $g$ are called topologically conjugate if there exists a topological conjugacy from $f$ to $g$. Topologically conjugate homeomorphisms have the same phase portrait in a topological sense.

(3) Let $M$ be a †differentiable manifold of class $C^r$ ($1\le r\le\infty$ or $r=\omega$). A continuous flow $(M,\varphi)$ is a flow of class $C^r$, a $C^r$-flow, a differentiable dynamical system of class $C^r$, or a one-parameter group of transformations of class $C^r$, if $\varphi:M\times\mathbf{R}\to M$ is of class $C^r$. A semiflow of class $C^r$ is defined similarly.

Let $(M,\varphi)$ and $(N,\psi)$ be $C^r$-flows. A topological equivalence $h:M\to N$ from $(M,\varphi)$ to $(N,\psi)$ is called a $C^r$-equivalence if it is a †$C^r$-diffeomorphism. Two flows are $C^r$-equivalent if there is a $C^r$-equivalence from one to the other. Classification of $C^r$-flows by $C^r$-equivalence is
difficult and sometimes too unwieldy to work with. On the other hand, there are many problems which can be solved by the knowledge of the topological structure of the phase portrait of $C^r$-flows.

(4) Let $(M,\pi)$ be a discrete flow on a differentiable manifold $M$ of class $C^r$. If $\pi:M\times\mathbf{Z}\to M$ is of class $C^r$, then $(M,\pi)$ is called a Z-action of class $C^r$, a discrete flow of class $C^r$, a discrete $C^r$-flow, or a discrete dynamical system of class $C^r$. If $(M,\pi)$ is a discrete $C^r$-flow, then $f=\pi_1:M\to M$ is a $C^r$-diffeomorphism. Conversely, a $C^r$-diffeomorphism $f:M\to M$ defines a discrete $C^r$-flow in a natural way on $M$, and we identify a $C^r$-diffeomorphism $f:M\to M$ with the discrete $C^r$-flow on $M$ defined by $f$. A discrete semiflow of class $C^r$ is defined similarly, and a †$C^r$-mapping $f:M\to M$ is identified with a discrete semiflow of class $C^r$ on $M$ defined by $f$.

Let $f:M\to M$ and $g:N\to N$ be $C^r$-diffeomorphisms ($C^r$-mappings) of differentiable manifolds $M$ and $N$ of class $C^r$. A topological conjugacy from $f$ to $g$ is called a $C^r$-conjugacy if it is a $C^r$-diffeomorphism. $f$ and $g$ are $C^r$-conjugate if there is a $C^r$-conjugacy from $f$ to $g$.

C. Examples and Remarks

(1) Let $D$ be an open set of $\mathbf{R}^n$ and $f:D\to\mathbf{R}^n$ a continuous mapping. Consider the †autonomous system of ordinary differential equations

$$dx/dt=f(x),\quad x\in D. \qquad (1)$$

We assume that for each $x\in D$ there exists a unique †nonextendable solution $\varphi(x,t)$ with the initial condition $\varphi(x,0)=x$ defined on a maximal interval $(a_x,b_x)$, $-\infty\le a_x<0<b_x\le\infty$ (→ 316 Ordinary Differential Equations (Initial Value Problems)). The set $\{\varphi(x,t)\mid a_x<t<b_x\}$ is called the trajectory through $x$. By the uniqueness assumption, we have $\varphi(\varphi(x,t),s)=\varphi(x,t+s)$ whenever both sides of the equality are defined. When $(a_x,b_x)=\mathbf{R}$ for all $x\in D$, then equation (1) is called complete. If (1) is complete, the mapping $\varphi:D\times\mathbf{R}\to D$ defined by the solutions $\varphi(x,t)$ determines a continuous flow $(D,\varphi)$. Furthermore, if $f$ is of class $C^r$, then $(D,\varphi)$ is of class $C^r$. If (1) is not complete, then there exists a continuous positive scalar function $\alpha:D\to\mathbf{R}$ such that

$$dx/dt=\alpha(x)f(x),\quad x\in D, \qquad (2)$$

is complete. The trajectories of (1) and (2) through $x$ coincide for all $x\in D$, and thus the phase portraits of (1) and (2) are the same.

(2) Let $M$ be a differentiable manifold of class $C^{\infty}$. A †vector field of class $C^r$ on $M$ ($1\le r\le\infty$) gives rise to an autonomous system of ordinary differential equations in a coordinate neighborhood of each point of $M$, and it generates a †local one-parameter group of local transformations of class $C^r$. If $M$ is †compact, this local one-parameter group of local transformations is uniquely extended to a one-parameter group of transformations of class $C^r$ (→ 105 Differentiable Manifolds). Thus a vector field of class $C^r$ on $M$ generates a unique $C^r$-flow on $M$ if $M$ is compact. Therefore, sometimes we identify a vector field of class $C^r$ with the $C^r$-flow generated by it.

(3) Let $(M,\varphi)$ be a $C^r$-flow. Then $\varphi_1:M\to M$ is a $C^r$-diffeomorphism, which we call the time-one mapping (time-one map) of $(M,\varphi)$. Thus every $C^r$-flow $(M,\varphi)$ induces a $C^r$-diffeomorphism as a time-one mapping. But the set of $C^1$-diffeomorphisms that are time-one mappings of $C^1$-flows is of the †first category in the space of all $C^1$-diffeomorphisms with $C^1$ topology (J. Palis). Thus most $C^1$-diffeomorphisms are not expressed as time-one mappings of $C^1$-flows.

(4) Let $M$ be a compact differentiable manifold of class $C^{\infty}$ with dimension $m$ and $(M,\varphi)$ a $C^r$-flow ($1\le r\le\infty$). An $(m-1)$-dimensional closed submanifold $\Sigma$ of $M$ is called a cross section of $(M,\varphi)$ if the following conditions are satisfied: (i) For any $x\in M$, there exist $t_1>0$ and $t_2<0$ such that $\varphi_{t_1}(x),\varphi_{t_2}(x)\in\Sigma$; (ii) Every orbit intersects $\Sigma$ †transversally whenever it meets $\Sigma$. Let $\Sigma$ be a cross section of $(M,\varphi)$ and $x\in\Sigma$. Let $t_x$ be the least positive number with $\varphi_{t_x}(x)\in\Sigma$. Such a $t_x$ exists for every $x\in\Sigma$, and $f:\Sigma\to\Sigma$ defined by $f(x)=\varphi_{t_x}(x)$, $x\in\Sigma$, is a $C^r$-diffeomorphism. We call this diffeomorphism $f$ the first-return mapping (first-return map) or the Poincaré mapping (Poincaré map) for $\Sigma$.

(5) Let $N$ be a compact differentiable manifold of class $C^{\infty}$ and $f:N\to N$ a $C^r$-diffeomorphism ($1\le r\le\infty$). Define an equivalence relation $\sim$ on $N\times\mathbf{R}$ generated by $(x,t+1)\sim(f(x),t)$ for $x\in N$, $t\in\mathbf{R}$. Then the quotient space $N(f)=N\times\mathbf{R}/\!\sim$ has a natural †differentiable structure of class $C^r$, and a $C^r$-flow $(N(f),\psi)$ is defined by $\psi([x,t],s)=[x,t+s]$ for $x\in N$, $t,s\in\mathbf{R}$, where $[x,t]\in N(f)$ is the equivalence class of $(x,t)$. The flow $(N(f),\psi)$ thus obtained is called the suspension of $f$. The suspension $(N(f),\psi)$ has a cross section $\Sigma=\{[x,0]\mid x\in N\}$, and the Poincaré mapping for $\Sigma$ is $C^r$-conjugate to $f$. Conversely, if $(M,\varphi)$ has a cross section $\Sigma$ and the Poincaré mapping for $\Sigma$ is $f:\Sigma\to\Sigma$, then the suspension $(\Sigma(f),\psi)$ is $C^r$-equivalent to $(M,\varphi)$.

(6) Let $U$ be an open set of $\mathbf{R}^n$ and $f:U\to\mathbf{R}^n$ a continuous mapping. Consider the †difference equation

$$x_{m+1}=f(x_m),\quad x_m\in U. \qquad (3)$$

For each $x\in U$, let $\varphi(x,m)$ be the solution of (3)
with $\varphi(x,0)=x$. If $f(U)\subset U$, then $\varphi(x,m)$ exists for all $m\in\mathbf{Z}_+$ and $x\in U$, and $\varphi$ defines a discrete semiflow on $U$. If $f:U\to U$ is a homeomorphism, then $\varphi(x,m)$ exists for all $m\in\mathbf{Z}$ and $x\in U$, and $\varphi$ defines a discrete flow on $U$ (→ 104 Difference Equations).

D. Basic Concepts

For simplicity we assume that the phase spaces of dynamical systems are metric spaces, and we denote their metrics by $d$.

(1) Let $(X,\varphi)$ be a continuous flow on a metric space $X$ and $x\in X$. A subset $A$ of $X$ is called invariant if $\varphi_t(A)\subset A$ for all $t\in\mathbf{R}$. A subset of $X$ is invariant if and only if it is a union of orbits. A subset $A$ of $X$ is positively (resp. negatively) invariant if $\varphi_t(A)\subset A$ for all $t\ge 0$ (resp. $t\le 0$). The subset $C^+(x)=\{\varphi^x(t)\mid t\ge 0\}$ (resp. $C^-(x)=\{\varphi^x(t)\mid t\le 0\}$) is called the positive (resp. negative) semiorbit or the positive (resp. negative) half-trajectory starting from $x$. A subset is positively (resp. negatively) invariant if and only if it is a union of positive (resp. negative) semiorbits. A subset is invariant if and only if it is both positively and negatively invariant.

The union and the intersection of invariant sets are invariant. If $A$ is invariant, then its closure $\bar A$, its interior $\operatorname{Int}A$, its boundary $\partial A$, and its complement $A'=X-A$ are invariant. If $A$ is invariant, then $\varphi(A\times\mathbf{R})\subset A$ and the restriction mapping $\varphi|A\times\mathbf{R}$ defines a continuous flow $(A,\varphi|A\times\mathbf{R})$ on $A$. The flow thus obtained is called the restriction of $(X,\varphi)$ on $A$.

(2) A point $x\in X$ is a singular point, an equilibrium point, a critical point, a rest point, or a fixed point if $C(x)=\{x\}$ (→ 290 Nonlinear Oscillation). A point is regular or nonsingular if it is not a singular point. The set of all singular points is a closed invariant set, and the set of all nonsingular points is an open invariant set. If $A$ is a positively invariant set which is homeomorphic to the closed unit ball in $\mathbf{R}^n$, then there exists a singular point in $A$ (†Brouwer fixed-point theorem).

A point $x\in X$ is periodic if there exists a $T\ne 0$ such that

$$\varphi(x,t+T)=\varphi(x,t) \qquad (4)$$

holds for all $t\in\mathbf{R}$. If $x$ is periodic, the motion $\varphi^x$ and the orbit $C(x)$ are said to be periodic. A point $x$ is periodic if and only if there exists a $T\ne 0$ with $\varphi(x,T)=x$. A singular point is periodic. For a periodic point $x$, a number $T$ satisfying (4) is called a period of $x$. If $x$ is nonsingular and periodic, then there exists a smallest positive period $T_0$ of $x$, and any period is an integral multiple of $T_0$. An orbit of a nonsingular periodic point is called a closed orbit. If $C$ is a closed orbit, then all points of $C$ are nonsingular periodic points, and their smallest positive periods coincide. A closed orbit is compact.

(3) Let $x$ be a point of $X$. A point $y\in X$ is called an $\omega$-limit (resp. $\alpha$-limit) point or a positive (resp. negative) limit point of $x$ if there exists a sequence $\{t_n\}$ of real numbers such that (i) $t_n\to\infty$ (resp. $t_n\to-\infty$) as $n\to\infty$ and (ii) $\varphi(x,t_n)\to y$ as $n\to\infty$. The set of all $\omega$-limit (resp. $\alpha$-limit) points of $x$ is denoted by $\omega(x)$ (resp. $\alpha(x)$) and is called the $\omega$-limit (resp. $\alpha$-limit) set of $x$. For each $x\in X$, $\omega(x)$ and $\alpha(x)$ are closed invariant sets, and the following equalities hold: $\overline{C^+(x)}=C^+(x)\cup\omega(x)$, $\overline{C^-(x)}=C^-(x)\cup\alpha(x)$, and $\overline{C(x)}=C(x)\cup\alpha(x)\cup\omega(x)$. If $x$ is a periodic point, $\overline{C(x)}=C(x)=\alpha(x)=\omega(x)$. If $C$ is a closed orbit, then $\bar C=C=C(x)=\alpha(x)=\omega(x)$ for all $x\in C$. If $A$ is a compact invariant set and $x\in A$, then $\alpha(x)$ and $\omega(x)$ are nonempty.

Assume that $X$ is †locally compact and $x\in X$. Then $\omega(x)$ is connected if it is compact, and none of the connected components of $\omega(x)$ is compact if $\omega(x)$ is not compact.

(4) Let $x$ be a point of $X$. Let $J^+(x)$ (resp. $J^-(x)$) be the set of all points $y$ satisfying the following condition: There exist a sequence $\{t_n\}$ of numbers and a sequence $\{x_n\}$ of points in $X$ such that (i) $t_n\to\infty$ (resp. $t_n\to-\infty$) as $n\to\infty$, (ii) $x_n\to x$ as $n\to\infty$, and (iii) $\varphi(x_n,t_n)\to y$ as $n\to\infty$. The set $J^+(x)$ (resp. $J^-(x)$) is a closed invariant set containing $\omega(x)$ (resp. $\alpha(x)$), called the first positive (resp. negative) prolongational limit set of $x$. If $X$ is locally compact, then $J^+(x)$ is connected if it is compact, and none of the connected components of $J^+(x)$ is compact when $J^+(x)$ is not compact.

Notions of higher prolongations have been defined and investigated by T. Ura, J. Auslander, and P. Seibert.

(5) For a discrete flow, we can similarly define basic notions such as an invariant set, fixed point, periodic point, and so on defined in Sections D(1)-D(4) by replacing $\mathbf{R}$ by $\mathbf{Z}$. The propositions and theorems stated above hold for discrete flows, except those concerning connectedness.

E. Recursive Concepts and Dispersive Concepts

(1) Let $(X,\varphi)$ be a flow on a metric space $X$. Let $x\in X$ be a point such that there exist a neighborhood $U$ of $x$ and a $T>0$ satisfying the condition $U\cap\varphi_t(U)=\emptyset$ for $t>T$. Then $x$ is called wandering. The set of all wandering points is an open invariant set. A point is nonwandering if it is not wandering. The set $\Omega$ of all nonwandering points is a closed invariant set and is called the nonwandering set.
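The notions of Section D can be tried out on a very small discrete semiflow. The following is a minimal sketch, assuming a hypothetical map $f(x)=x^2+1 \pmod{11}$ on the finite phase space $\{0,\dots,10\}$ with the discrete metric; the names `f`, `phi`, `omega_limit`, and `least_period` are illustrative and do not come from the text. It iterates the difference equation $x_{m+1}=f(x_m)$, and in this finite setting the $\omega$-limit set of a point is simply the cycle that its positive semiorbit eventually enters.

```python
def f(x):
    # Hypothetical map on the finite phase space {0, ..., 10} (an assumption for illustration).
    return (x * x + 1) % 11


def phi(x, m):
    """Solution phi(x, m) of x_{m+1} = f(x_m) with phi(x, 0) = x, for m >= 0."""
    for _ in range(m):
        x = f(x)
    return x


def omega_limit(x):
    """Omega-limit set of x: the points that the positive semiorbit visits infinitely often."""
    first_visit = {}
    orbit = []
    while x not in first_visit:
        first_visit[x] = len(orbit)
        orbit.append(x)
        x = f(x)
    # The semiorbit re-enters itself at index first_visit[x]; from there on it is a cycle.
    return set(orbit[first_visit[x]:])


def least_period(x):
    """Smallest m > 0 with phi(x, m) == x, or None if x is not periodic."""
    y = f(x)
    for m in range(1, 12):  # at most 11 points, so a period cannot exceed 11
        if y == x:
            return m
        y = f(y)
    return None
```

In this sketch the point 0 is not periodic, while its $\omega$-limit set is a periodic orbit of the map; since `f` is not injective, the example is a discrete semiflow rather than a discrete flow.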
491 126E
Dynamical Systems

The following conditions are equivalent: (i) $x$ is nonwandering, (ii) $x \in J^+(x)$, (iii) $x \in J^-(x)$. The nonwandering set $\Omega$ contains all singular points, closed orbits, and $\omega(x)$ and $\alpha(x)$ for all $x \in X$.

(2) A point $x \in X$ is positively (resp. negatively) Poisson stable if $x \in \omega(x)$ (resp. $x \in \alpha(x)$). It is Poisson stable if it is both positively and negatively Poisson stable. A positively Poisson stable point and a negatively Poisson stable point are nonwandering. The following conditions are equivalent: (i) $x$ is positively Poisson stable, (ii) $\overline{C^+(x)} = \omega(x)$, (iii) $C(x) \subset \omega(x)$, (iv) for any neighborhood $U$ of $x$ and $T > 0$, there exists a $t > T$ with $\varphi(x, t) \in U$. If $X$ is †complete and $x \in X$ is positively Poisson stable but not periodic, then $\omega(x) - C(x)$ is †dense in $\omega(x)$. This implies that if $X$ is complete, then $C(x)$ is periodic if and only if $C(x) = \omega(x)$.

(3) If all the points of the phase space $X$ are wandering, then $(X, \varphi)$ is called completely unstable. If all the points of $X$ are nonwandering, then $(X, \varphi)$ is called regionally recurrent. If $A$ is an invariant set such that the restriction of $(X, \varphi)$ on $A$ is regionally recurrent, then $(X, \varphi)$ is said to be regionally recurrent on $A$. If $(X, \varphi)$ is regionally recurrent and $X$ is locally compact, then the set of all Poisson stable points is dense in $X$.

For a given flow $(X, \varphi)$, we obtain a sequence of invariant sets $\{\Omega_n\}$ and a sequence of restriction flows $\{(\Omega_n, \varphi_n)\}$ such that (i) $(\Omega_0, \varphi_0) = (X, \varphi)$; (ii) $\Omega_{n+1}$ is the nonwandering set of $(\Omega_n, \varphi_n)$, $n \geq 0$; and (iii) $(\Omega_{n+1}, \varphi_{n+1})$ is the restriction of $(X, \varphi)$ on $\Omega_{n+1}$. Then $X = \Omega_0 \supset \Omega_1 \supset \cdots \supset \Omega_n \supset \cdots$. Put $\Omega_\omega = \bigcap_n \Omega_n$. Then $\Omega_\omega$ is an invariant set of $(X, \varphi)$, and we denote the restriction of $(X, \varphi)$ on $\Omega_\omega$ by $(\Omega_\omega, \varphi_\omega)$. Starting from $(\Omega_\omega, \varphi_\omega)$, we obtain similarly a sequence of invariant sets $\{\Omega_{\omega+n}\}$ and a sequence of flows $\{(\Omega_{\omega+n}, \varphi_{\omega+n})\}$. If we obtain an ordinal number $\gamma$ such that $\Omega_\gamma = \Omega_{\gamma+1} \neq \emptyset$ by continuing this process, then we call $\Omega_\gamma$ the set of central motions. In this case, the flow $(\Omega_\gamma, \varphi_\gamma)$ is regionally recurrent, and every invariant subset of $X$ on which $(X, \varphi)$ is regionally recurrent is contained in $\Omega_\gamma$. When $X$ is †separable and $\Omega_1$ is compact and nonempty, then the minimum of such ordinals $\gamma$ is an †ordinal of at most the second number class.

(4) Let $\mathcal{R}$ be the set of all $x \in X$ satisfying the following condition: For each $\varepsilon, T > 0$, there exist a sequence $\{x_0 = x, x_1, \ldots, x_k = x\}$ of points in $X$ and a sequence $\{t_0, t_1, \ldots, t_{k-1}\}$ of numbers such that $t_i > T$ and $d(\varphi_{t_i}(x_i), x_{i+1}) < \varepsilon$ for $0 \leq i \leq k - 1$. The set $\mathcal{R}$ is a closed invariant set containing the nonwandering set $\Omega$ and is called the chain recurrent set. If $X = \mathcal{R}$, then $(X, \varphi)$ is called chain recurrent. The restriction $(\mathcal{R}, \varphi | \mathcal{R} \times \mathbf{R})$ of $(X, \varphi)$ on $\mathcal{R}$ is chain recurrent (C. C. Conley).

(5) A nonempty closed invariant set is called a minimal set if none of its nonempty proper closed subsets is invariant. For a nonempty compact subset $A$ of $X$, the following conditions are equivalent: (i) $A$ is minimal, (ii) $\overline{C(x)} = A$ for all $x \in A$, (iii) $\overline{C^+(x)} = A$ for all $x \in A$, (iv) $\overline{C^-(x)} = A$ for all $x \in A$, (v) $\omega(x) = A$ for all $x \in A$, (vi) $\alpha(x) = A$ for all $x \in A$. A point $x \in X$ is positively (resp. negatively) Lagrange stable if $\overline{C^+(x)}$ (resp. $\overline{C^-(x)}$) is compact. If $\overline{C(x)}$ is compact, then $x$ is called Lagrange stable. Every nonempty compact invariant set contains a compact minimal set. If $x \in X$ is positively (resp. negatively) Lagrange stable, then $\omega(x)$ (resp. $\alpha(x)$) contains a compact minimal set.

A closed invariant set $A$ is called a saddle set if there exists a neighborhood $U$ of $A$ such that every neighborhood of $A$ contains a point $x$ such that $C^+(x) \not\subset U$ and $C^-(x) \not\subset U$. Otherwise $A$ is called a nonsaddle set. For a point $x$ of $X$, let $E(x)$ be the subset of $X$ consisting of the points $y$ satisfying the following condition: There exist a sequence $\{x_n\}$ of points in $X$ and two sequences $\{t_n\}$, $\{s_n\}$ of numbers such that (i) $x_n \to x$, $t_n \to \infty$, $s_n \to -\infty$ as $n \to \infty$ and (ii) $\varphi(x_n, t_n) \to y$, $\varphi(x_n, s_n) \to y$ as $n \to \infty$. For a subset $B$ of $X$, put $E(B) = \bigcup_{x \in B} E(x)$. Let $\{S_\alpha\}$ be the family of all saddle minimal sets and $\{F_\beta\}$ the family of all nonsaddle minimal sets. If the phase space $X$ is compact, then the nonwandering set $\Omega = (\bigcup_\beta F_\beta) \cup (\bigcup_\alpha E(S_\alpha)) = (\bigcup_\beta F_\beta) \cup E(\bigcup_\alpha S_\alpha)$ (T. Saito).

(6) A point $x \in X$ is said to be recurrent if for any $\varepsilon > 0$ there exists a $T > 0$ such that the †$\varepsilon$-neighborhood $U$ of $\varphi^x([t, t + T])$ contains $C(x)$ for all $t \in \mathbf{R}$. If $x$ is recurrent, the motion $\varphi^x$ and the orbit $C(x)$ are said to be recurrent. Every orbit of a compact minimal set is recurrent, and if the phase space is complete, then the closure of a recurrent orbit is a compact minimal set (Birkhoff).

A set $D$ of real numbers is called relatively dense if there exists a $T > 0$ such that $D \cap (t, t + T) \neq \emptyset$ for all $t \in \mathbf{R}$. Assume that $x \in X$ is Lagrange stable; then $x$ is recurrent if and only if for every $\varepsilon > 0$ the set $\{t \mid d(x, \varphi(x, t)) < \varepsilon\}$ is relatively dense.

(7) A flow $(X, \varphi)$ is dispersive if for any $x, y \in X$ there exist neighborhoods $U_x$ of $x$ and $U_y$ of $y$ and a $T > 0$ such that $U_x \cap \varphi_t(U_y) = \emptyset$ for all $t > T$. The following conditions are equivalent: (i) $(X, \varphi)$ is dispersive; (ii) For any $x, y \in X$, there exist neighborhoods $U_x$ of $x$ and $U_y$ of $y$ and a $T > 0$ such that $U_x \cap \varphi_t(U_y) = \emptyset$ if $|t| \geq T$; (iii) $J^+(x) = \emptyset$ for all $x \in X$.

A flow $(X, \varphi)$ is parallelizable if there exist a subset $S$ of $X$ and a homeomorphism $h: X \to S \times \mathbf{R}$ such that (i) $\varphi(S \times \mathbf{R}) = X$ and (ii) $h \circ \varphi(x, t) = (x, t)$ for all $x \in S$ and $t \in \mathbf{R}$. A flow $(X, \varphi)$ is parallelizable if and only if there is a subset $S$ of $X$ satisfying the following conditions: (i) For each $x \in X$ there exists a unique $\tau(x) \in \mathbf{R}$ with $\varphi(x, \tau(x)) \in S$; (ii) $\tau: X \to \mathbf{R}$ is continuous. A parallelizable flow is dispersive. A flow on a locally compact separable metric space is parallelizable if and only if it is dispersive (V. V. Nemytskiĭ, V. V. Stepanov).

An open set $U$ of $X$ is called a tube if there exist a $T > 0$ and a subset $S$ of $U$ satisfying the following conditions: (i) $\varphi(S \times (-T, T)) = U$, (ii) For each $x \in U$ there is a unique $\tau(x) \in \mathbf{R}$ with $|\tau(x)| < T$ such that $\varphi(x, \tau(x)) \in S$. The set $S$ in the above definition is called a local section. If $x \in X$ is a regular point, then there exists a tube containing $x$ (M. Bebutov, H. Whitney). The notion of a local section is a generalization of Poincaré's "surface sans contact" [1] or Birkhoff's "surface of section" [4], and it is related to the notion of the cross section.

(8) For discrete flows (homeomorphisms), basic notions, such as a nonwandering set, Poisson stability, regional recurrence, central motion, and minimal set, are defined similarly, and many of the propositions and theorems in Sections E(1)-E(5) hold similarly for discrete flows (homeomorphisms).

F. Stability

(1) Let $(X, \varphi)$ be a continuous flow on a metric space $X$. A point $x \in X$ is called orbitally stable if for any $\varepsilon > 0$ there exists a $\delta > 0$ such that $C^+(y)$ is contained in the $\varepsilon$-neighborhood of $C^+(x)$ for all $y$ with $d(x, y) < \delta$ (→ 394 Stability). A nonempty set $A$ is called stable if every neighborhood of $A$ contains a positively invariant neighborhood of $A$. If $A$ is compact (in particular, if $A$ is a periodic orbit), then orbital stability and stability for $A$ are equivalent. A nonempty set $A$ is called asymptotically stable if $A$ is stable and there exists a neighborhood $V$ of $A$ such that $\omega(x) \subset A$ for any $x \in V$. If $A$ is stable and $\omega(x) \subset A$ for all $x \in X$, then $A$ is called globally asymptotically stable. A point $x \in X$ is Lyapunov stable if for any $\varepsilon > 0$ there exists a $\delta > 0$ such that $d(\varphi_t(x), \varphi_t(y)) < \varepsilon$ for all $t > 0$ and $y$ with $d(x, y) < \delta$. Lyapunov stability implies orbital stability. A point $x$ is uniformly Lyapunov stable if for any $\varepsilon > 0$ there exists a $\delta > 0$ such that for $z \in C(x)$ and $y$ with $d(y, z) < \delta$ we have $d(\varphi_t(z), \varphi_t(y)) < \varepsilon$ for all $t > 0$. Uniform Lyapunov stability implies Lyapunov stability. For a singular point, the notions of uniform Lyapunov stability, Lyapunov stability, orbital stability, and stability are equivalent. Assume that the phase space $X$ is locally compact and $A$ is a nonempty compact subset of $X$. Then $A$ is asymptotically stable if and only if there exist a neighborhood $N$ of $A$ and a continuous real-valued function $L$ on $N$ such that (i) $L(x) = 0$ if $x \in A$ and $L(x) > 0$ if $x \notin A$; (ii) $L(\varphi(x, t)) < L(x)$ if $x \notin A$, $t > 0$, and $\varphi(\{x\} \times [0, t]) \subset N$. Such a function $L$ is called a Lyapunov function for $A$ (→ 394 Stability). Assume that $X$ is locally compact and $A$ is nonempty, stable, and invariant. Then $A$ is asymptotically stable if and only if there exists a neighborhood $U$ of $A$ such that any invariant set in $U$ is necessarily contained in $A$ (Ura). Let $A$ be a nonempty closed invariant set. $A$ is called an attractor if it has an open neighborhood $U$ satisfying the following conditions: (i) $U$ is positively invariant; (ii) For each open neighborhood $V$ of $A$, there is a $T > 0$ such that $\varphi_t(U) \subset V$ for all $t \geq T$. Condition (ii) implies that $\bigcap_{t \geq 0} \varphi_t(U) = A$ and $\omega(x) \subset A$ for all $x \in U$. Thus an attractor is asymptotically stable. If $A$ is an attractor, the basin of $A$ is the union of all open neighborhoods of $A$ satisfying (i) and (ii).

(2) Assume that the phase space $X$ is complete. A motion $\varphi^x$ ($x \in X$) is called almost periodic if for any $\varepsilon > 0$ there exists a relatively dense subset $\{\tau_n\}$ in $\mathbf{R}$ such that $d(\varphi^x(t), \varphi^x(t + \tau_n)) < \varepsilon$ for all $t \in \mathbf{R}$ and $\tau_n$ (→ 18 Almost Periodic Functions). If $\varphi^x$ is almost periodic, then $\varphi^y$ is almost periodic for all $y \in C(x)$. If $A$ is a compact minimal set and if $\varphi^x$ is almost periodic for some $x$ in $A$, then every motion $\varphi^y$, $y \in A$, is almost periodic. An almost periodic motion is recurrent. The converse is not true. But if $x$ is recurrent and Lyapunov stable in $C(x)$ (i.e., in the restriction flow $(C(x), \varphi | C(x) \times \mathbf{R})$), then $\varphi^x$ is almost periodic. If $x$ is uniformly Lyapunov stable in $C(x)$ and negatively Lagrange stable, then $\varphi^x$ is almost periodic (A. A. Markov).

G. Singular Points and Closed Orbits

In this section we assume that the phase space is a †paracompact differentiable manifold of class $C^\infty$ with metric $d$.

(1) Let $E$ be a finite-dimensional real vector space and $L: E \to E$ a linear automorphism. $L$ is called hyperbolic if it has no †eigenvalues of absolute value 1. If $L: E \to E$ is hyperbolic, there are unique vector subspaces $E^s$ and $E^u$ satisfying the following conditions: (i) $E = E^s \oplus E^u$, (ii) $L(E^s) = E^s$ and $L(E^u) = E^u$, (iii) if $\|\cdot\|$ is a †norm on $E$, then there exist constants $c > 0$ and $0 < \lambda < 1$ such that, for any positive integer $m$, $\|L^m(v)\| \leq c\lambda^m \|v\|$ when $v \in E^s$ and $\|L^{-m}(v)\| \leq c\lambda^m \|v\|$ when $v \in E^u$. The zero 0 of $E$ is the only fixed point of a hyperbolic linear automorphism. Put $s = \dim E^s$ and $u = \dim E^u$. Then $s + u = \dim E$, and $s$ (resp. $u$) is the number of eigenvalues of absolute value $< 1$ (resp. $> 1$) counted with multiplicity. A topological conjugacy class of a hyperbolic linear automorphism $L: E \to E$ is determined by $s$, $u$, and the signs of $\det(L | E^s)$ and $\det(L | E^u)$, where $\det(L | E^\sigma)$ is the †determinant of the restriction
$L | E^\sigma: E^\sigma \to E^\sigma$ of $L$ on $E^\sigma$ ($\sigma = s, u$). Further investigations of topological classification of linear automorphisms have been carried out by N. H. Kuiper and J. W. Robbin.

(2) Let $f: \mathbf{R}^n \to \mathbf{R}^n$ be a $C^r$-diffeomorphism ($1 \leq r \leq \infty$). Assume that the origin $0 \in \mathbf{R}^n$ is a fixed point of $f$. Then the †differential $df_0: \mathbf{R}^n \to \mathbf{R}^n$ of $f$ at 0 is a linear automorphism. It is given by $df_0(x) = J_0(f)x$, $x \in \mathbf{R}^n$, where $J_0(f)$ is the †Jacobian matrix of $f$ at 0 and $x$ is expressed as a column vector. If $df_0$ is hyperbolic, then $f$ is topologically conjugate to $df_0$ in a sufficiently small neighborhood of 0 (P. Hartman). Assume that $f$ is of class $C^\infty$, and let $\lambda_1, \ldots, \lambda_n$ (possibly repeated) be the eigenvalues of $df_0$. If $\lambda_i \neq \lambda_1^{m_1} \cdots \lambda_n^{m_n}$ for all $1 \leq i \leq n$ and for all nonnegative integers $m_1, \ldots, m_n$ with $2 \leq m_1 + \cdots + m_n$, then in a sufficiently small neighborhood of 0, $f$ is $C^\infty$-conjugate to $df_0$ (S. Sternberg).

(3) Let $f: M \to M$ be a $C^r$-diffeomorphism ($1 \leq r \leq \infty$) of a differentiable manifold $M$ of class $C^\infty$. Let $p \in M$ be a fixed point of $f$. Then the †differential $df_p$ of $f$ at $p$ is a linear automorphism of the †tangent space $T_p(M)$ of $M$ at $p$. If $df_p$ is hyperbolic, then $p$ is called a hyperbolic fixed point of $f$. Let $p$ be a hyperbolic fixed point of $f$ and $T_p(M) = E^s \oplus E^u$ be the direct sum decomposition with respect to $df_p$. Put $W^s(p) = \{x \in M \mid f^n(x) \to p \text{ as } n \to \infty\}$ and $W^u(p) = \{x \in M \mid f^{-n}(x) \to p \text{ as } n \to \infty\}$. $W^s(p)$ and $W^u(p)$ are invariant sets and images of †injective immersions of class $C^r$ of vector spaces $E^s$ and $E^u$, respectively. At the point $p$, $W^s(p)$ and $W^u(p)$ are tangent to $E^s$ and $E^u$, respectively [21]. We call $W^s(p)$ (resp. $W^u(p)$) the stable (resp. unstable) manifold of $f$ at $p$. In a sufficiently small neighborhood of a hyperbolic fixed point $p$ of $f$, $f$ is topologically conjugate to its differential $df_p$. Therefore a hyperbolic fixed point is an isolated fixed point (i.e., there are no fixed points in its sufficiently small neighborhood except itself). A hyperbolic fixed point $p$ is called a source if $\dim W^s(p) = 0$ and a sink if $\dim W^u(p) = 0$. Otherwise it is a saddle point. The number of topological conjugacy classes of hyperbolic fixed points on an $n$-dimensional manifold is $4n$.

Let $p \in M$ be a periodic point of $f$ and $m$ the smallest positive period of $p$. Replacing $f$ by $f^m$, we obtain notions of hyperbolicity, stable manifold, and so on for the periodic point $p$. We also obtain propositions and theorems similar to those stated above for periodic points.

(4) Let $A$ be a real $n \times n$ matrix. Consider the system of linear differential equations

$$dx/dt = Ax, \quad x \in \mathbf{R}^n. \qquad (5)$$

This equation generates a $C^\infty$-flow $(\mathbf{R}^n, \varphi)$, and $\varphi_t: \mathbf{R}^n \to \mathbf{R}^n$ is given by $\varphi_t(x) = e^{tA}x$, where $e^{tA}$ is the †exponential of the matrix $tA$. Thus $\varphi_t$ is a linear automorphism for all $t \in \mathbf{R}$. The origin $0 \in \mathbf{R}^n$ is a singular point of $(\mathbf{R}^n, \varphi)$. If none of the real parts of the eigenvalues of $A$ are zero, we call $0 \in \mathbf{R}^n$ a hyperbolic singular point of (5). If $0 \in \mathbf{R}^n$ is a hyperbolic singular point of (5), then there exist two vector subspaces $E^s$ and $E^u$ of $\mathbf{R}^n$ satisfying the following conditions: (i) $\mathbf{R}^n = E^s \oplus E^u$; (ii) $A(E^s) = E^s$ and $A(E^u) = E^u$; (iii) $E^s$ and $E^u$ are invariant sets of $(\mathbf{R}^n, \varphi)$; (iv) there exist positive constants $c$ and $\lambda$ such that for all $t > 0$, $\|\varphi_t(x)\| \leq ce^{-\lambda t}\|x\|$ when $x \in E^s$ and $\|\varphi_{-t}(x)\| \leq ce^{-\lambda t}\|x\|$ when $x \in E^u$, where $\|\cdot\|$ is the norm on $\mathbf{R}^n$. The origin 0 is a hyperbolic singular point of (5) if and only if the time-one mapping $\varphi_1 = e^A: \mathbf{R}^n \to \mathbf{R}^n$ of $\varphi$ is a hyperbolic linear automorphism. If 0 is a hyperbolic singular point of (5), the above direct sum decomposition $\mathbf{R}^n = E^s \oplus E^u$ coincides with the one with respect to $\varphi_1$. Put $s = \dim E^s$ and $u = \dim E^u$. Then $s + u = n$, and $s$ (resp. $u$) is the number of all eigenvalues of $A$ with negative (resp. positive) real parts counted with multiplicity. Also we obtain the following properties: (i) $x \in E^s$ (resp. $E^u$) $\Leftrightarrow$ $\varphi_t(x) \to 0$ as $t \to \infty$ (resp. $t \to -\infty$), (ii) $x \notin E^s$ (resp. $E^u$) $\Rightarrow$ $\|\varphi_t(x)\| \to \infty$ as $t \to \infty$ (resp. $t \to -\infty$), (iii) the origin 0 is the only singular point.

Let $A$ and $B$ be $n \times n$ real matrices. Let $(\mathbf{R}^n, \varphi_A)$ and $(\mathbf{R}^n, \varphi_B)$ be the flows determined by the equations $dx/dt = Ax$ and $dx/dt = Bx$, respectively. Assume that $0 \in \mathbf{R}^n$ is a hyperbolic singular point for both of the equations. Let $s$ and $u$ (resp. $s'$ and $u'$) be the integers defined for $\varphi_A$ (resp. $\varphi_B$) in the above paragraph. Then the following conditions are equivalent: (i) $(\mathbf{R}^n, \varphi_A)$ and $(\mathbf{R}^n, \varphi_B)$ are flow equivalent, (ii) $(\mathbf{R}^n, \varphi_A)$ and $(\mathbf{R}^n, \varphi_B)$ are topologically equivalent, (iii) $s = s'$, (iv) $u = u'$, (v) $(\varphi_A)_1$ and $(\varphi_B)_1$ are topologically conjugate. Further investigations of the phase portrait of the equation $dx/dt = Ax$ without the assumption of hyperbolicity were done by Kuiper.

(5) Let $D$ be an open set of $\mathbf{R}^n$ and $f: D \to \mathbf{R}^n$ a continuous mapping. Consider the system of differential equations (1), which we write down again here:

$$dx/dt = f(x), \quad x \in D. \qquad (6)$$

A point $p \in D$ is a singular point of (6) if the trajectory through $p$ consists of a single point $p$. If (6) generates a flow $(D, \varphi)$, then $p$ is a singular point of (6) if and only if $p$ is a singular point of $(D, \varphi)$. A point $p$ is a singular point of (6) if and only if $f(p) = 0$.

Suppose further that $f$ is a $C^r$-mapping ($1 \leq r \leq \infty$). A singular point $p$ of (6) is called hyperbolic if none of the real parts of the eigenvalues of the Jacobian matrix $J_p(f)$ of $f$ at $p$ is zero. Let $p$ be a singular point of (6). We assume for simplicity that $p = 0$ and (6) generates
a flow $(D, \varphi)$. Denote the Jacobian matrix of $f$ at 0 by $A$, and let $(\mathbf{R}^n, \varphi_A)$ be the flow generated by the equation $dx/dt = Ax$. If 0 is a hyperbolic singular point, then in a sufficiently small neighborhood of 0, the flows $(D, \varphi)$ and $(\mathbf{R}^n, \varphi_A)$ are flow equivalent and hence topologically equivalent (Hartman, D. M. Grobman). Let $\lambda_1, \ldots, \lambda_n$ (possibly repeated) be the eigenvalues of $A$. If $f$ is of class $C^\infty$ and $\lambda_i \neq m_1\lambda_1 + \cdots + m_n\lambda_n$ for all $1 \leq i \leq n$ and for all nonnegative integers $m_1, \ldots, m_n$ with $2 \leq m_1 + \cdots + m_n$, then in a sufficiently small neighborhood of 0, the two flows $(D, \varphi)$ and $(\mathbf{R}^n, \varphi_A)$ are $C^\infty$-equivalent (Sternberg).

(6) Let $M$ be a paracompact differentiable manifold of class $C^\infty$ and $V$ a vector field of class $C^r$ ($1 \leq r \leq \infty$) on $M$. For simplicity we assume that $V$ generates a $C^r$-flow $(M, \varphi)$. For each point $p$ of $M$, take a coordinate neighborhood $(U, \alpha)$ of class $C^\infty$ around $p$, where $U$ is an open neighborhood of $p$ in $M$ and $\alpha: U \to \mathbf{R}^n$ is a homeomorphism onto an open set $D$ in $\mathbf{R}^n$. Using $(U, \alpha)$, we can express the vector field $V$ as a system of ordinary differential equations of the form (6) in $D$. The eigenvalues $\lambda_1, \ldots, \lambda_n$ (possibly repeated) of the Jacobian matrix of $f$ at $\alpha(p)$ are independent of the choice of local coordinates $(U, \alpha)$ around $p$. Thus $\lambda_1, \ldots, \lambda_n$ are invariants of the $C^r$-equivalence, but they are not invariants of the topological equivalence.

A singular point $p$ of a flow $(M, \varphi)$ (or a vector field $V$) is called hyperbolic if $\alpha(p)$ is a hyperbolic singular point of the corresponding equation (6) (i.e., none of the real parts of the above eigenvalues $\lambda_1, \ldots, \lambda_n$ are zero). A singular point of $(M, \varphi)$ is hyperbolic if and only if it is a hyperbolic fixed point of the time-one mapping $\varphi_1$ of $\varphi$. A hyperbolic singular point $p$ is an isolated singular point.

Let $p$ be a hyperbolic singular point of $(M, \varphi)$ and $T_p(M) = E^s \oplus E^u$ the direct sum decomposition with respect to $\varphi_1$. Put $s = \dim E^s$ and $u = \dim E^u$. Then $s$ (resp. $u$) is the number of all eigenvalues $\lambda_i$ of negative (resp. positive) real parts. Put $W^s(p) = \{x \in M \mid \varphi_t(x) \to p \text{ as } t \to \infty\}$ and $W^u(p) = \{x \in M \mid \varphi_{-t}(x) \to p \text{ as } t \to \infty\}$. $W^s(p)$ and $W^u(p)$ are invariant sets and images of injective immersions of class $C^r$ of vector spaces $E^s$ and $E^u$, respectively. At the point $p$, $W^s(p)$ and $W^u(p)$ are tangent to $E^s$ and $E^u$, respectively. We call $W^s(p)$ (resp. $W^u(p)$) the stable (resp. unstable) manifold of $\varphi$ at $p$. The stable manifold and the unstable manifold of $(M, \varphi)$ at a hyperbolic singular point $p$ coincide with those of the time-one mapping $\varphi_1$ at $p$. A hyperbolic singular point $p$ is called a source if $\dim W^s(p) = 0$ and a sink if $\dim W^u(p) = 0$. Otherwise it is called a saddle point. If $p$ is a source (resp. a sink), then $W^u(p)$ (resp. $W^s(p)$) is a neighborhood of $p$. A singular point $p$ is a source (resp. a sink) if and only if all real parts of the corresponding eigenvalues $\lambda_1, \ldots, \lambda_n$ are positive (resp. negative).

(7) Let $V$ be a vector field of class $C^r$ on a $C^\infty$-differentiable manifold $M$ of dimension $n \geq 2$ and $(M, \varphi)$ the $C^r$-flow generated by $V$. Let $C$ be a closed orbit of $(M, \varphi)$, $p$ a point of $C$, and $T > 0$ its smallest positive period. Then $p$ is a fixed point of the $C^r$-diffeomorphism $\varphi_T: M \to M$. Let $V(p) \in T_p(M)$ be the value of $V$ at $p$ and $(d\varphi_T)_p: T_p(M) \to T_p(M)$ the differential of $\varphi_T$ at $p$. Then $(d\varphi_T)_p(V(p)) = V(p)$, and $V(p) \neq 0$. Therefore 1 is an eigenvalue of $(d\varphi_T)_p$. The other eigenvalues $\lambda_1, \ldots, \lambda_{n-1}$ (possibly repeated) of $(d\varphi_T)_p$ do not depend on a choice of $p \in C$ and are called the characteristic multipliers of $C$. If none of the characteristic multipliers of $C$ is of absolute value 1, we call $C$ a hyperbolic closed orbit. If $C$ is a hyperbolic closed orbit, then there exist vector subspaces $E^s_p$ and $E^u_p$ for each $p \in C$ satisfying the following conditions: (i) $T_p(M) = L(V(p)) \oplus E^s_p \oplus E^u_p$, where $L(V(p))$ is the 1-dimensional subspace generated by $V(p)$; (ii) $(d\varphi_t)_p(E^\sigma_p) = E^\sigma_{\varphi_t(p)}$ ($\sigma = s, u$) for all $t \in \mathbf{R}$, where $(d\varphi_t)_p: T_p(M) \to T_{\varphi_t(p)}(M)$. In particular, $(d\varphi_T)_p(E^\sigma_p) = E^\sigma_p$ ($\sigma = s, u$); (iii) $\dim E^s_p$ (resp. $\dim E^u_p$) is independent of $p \in C$ and is equal to the number of $\lambda_i$ of absolute value $< 1$ (resp. $> 1$). Put $W^s(C) = \{x \in M \mid d(\varphi_t(x), C) \to 0 \text{ as } t \to \infty\}$ and $W^u(C) = \{x \in M \mid d(\varphi_{-t}(x), C) \to 0 \text{ as } t \to \infty\}$. Then $W^s(C)$ (resp. $W^u(C)$) is an invariant set and is an injectively immersed $C^r$-submanifold of $M$ which is tangent to $L(V(p)) \oplus E^s_p$ (resp. $L(V(p)) \oplus E^u_p$) for each $p \in C$. $W^s(C)$ (resp. $W^u(C)$) is called the stable (resp. unstable) manifold for $C$.

Let $p$ be a point of a closed orbit $C$. An embedded $(n - 1)$-dimensional disk $D$ of class $C^r$ in $M$ containing $p$ is called a cross section for a closed orbit $C$ if $V$ is transverse to $D$ (i.e., $T_x(M) = L(V(x)) \oplus T_x(D)$ for each $x \in D$) and $D \cap C = \{p\}$. For a given cross section $D$ for $C$, there exists a neighborhood $U$ of $p$ in $D$ such that for any $x \in U$ there exists a $\tau(x) > 0$ such that $\varphi(x, \tau(x)) \in D$ and $\varphi(x, t) \notin D$ for $0 < t < \tau(x)$. A mapping $f: U \to D$, defined by $f(x) = \varphi(x, \tau(x))$, $x \in U$, is called a Poincaré mapping for $D$. The $C^r$-conjugacy class of the †germ of a Poincaré mapping $f: U \to D$ at $p$ is independent of the choice of $p \in C$ and the cross section $D$ for the closed orbit $C$. The point $p$ is a fixed point of $f$, and the eigenvalues of $df_p$ coincide with the characteristic multipliers of $C$ including multiplicity. Therefore $C$ is hyperbolic if and only if $p \in C$ is a hyperbolic fixed point of $f$. In a sufficiently small neighborhood of $C$, the flow is $C^r$-equivalent to the suspension of $f$. The topological equivalence of the flow in a sufficiently small neighborhood of $C$ is determined by the topological conjugacy class of $f$. The number of topological equivalence classes of hyperbolic closed orbits which appear in a
flow on an $n$-dimensional manifold is $4(n - 1)$ (M. C. Irwin).

(8) Let $V$ be a vector field of class $C^r$ ($1 \leq r \leq \infty$) on a compact $C^\infty$-manifold $M$. If $M$ has a nonempty boundary, we assume that $V$ points outward at all boundary points. Let $p$ be an isolated singular point of $V$ and denote by $i(p)$ the †index of the vector field $V$ at the singular point $p$. If all singular points are isolated, then the sum $\sum_p i(p)$ is independent of $V$ and is equal to the †Euler-Poincaré characteristic $\chi(M)$ of $M$ (Poincaré, H. Hopf) (→ 153 Fixed-Point Theorems). A vector field (or a flow) is called nonsingular if it has no singular points. If $\chi(M) \neq 0$, then any vector field (and hence any flow) on $M$ has a singular point. If $\chi(M) = 0$, then there exists a nonsingular vector field on $M$. If $M$ is 2-dimensional and without boundary, then $M$ admits a nonsingular vector field if and only if $M$ is a †torus or a †Klein bottle.

There are many directions in which generalizations of the Poincaré-Hopf theorem can be made. For example, an index of an isolated closed orbit of a flow has been defined and investigated by F. B. Fuller.

The phase portrait near a singular point or a closed orbit which is not hyperbolic is complicated. Many results in general situations have been obtained for planar flows, but there are only a few for higher-dimensional flows.

H. Generic Properties and Structural Stability

(1) Let $M$ be a compact $C^\infty$-manifold without boundary. The set $\mathfrak{X}^r(M)$ of all vector fields of class $C^r$ ($1 \leq r < \infty$) on $M$ forms a real vector space in a natural way. We can give a norm $\|V\|_r$ (called a $C^r$-norm) for $V \in \mathfrak{X}^r(M)$ using its expressions in a given suitable system of local coordinates of class $C^\infty$ on $M$ and their partial derivatives up to order $r$. By virtue of this norm, $\mathfrak{X}^r(M)$ is a Banach space with the topology of uniform $C^r$-convergence (→ 168 Function Spaces). A subset of a topological space is called a residual set or a Baire set if it is the intersection of a countable number of dense open sets. A residual set in $\mathfrak{X}^r(M)$ is a dense set. A proposition $P$ concerning a vector field of class $C^r$ is called generic if the set $\{V \in \mathfrak{X}^r(M) \mid P(V)\}$ contains a residual set for any $M$. The following properties are generic properties: (i) All singular points and all closed orbits are hyperbolic [21, 22]; (ii) any stable manifold (of a hyperbolic singular point or a hyperbolic closed orbit) meets transversally with any unstable manifold (of a hyperbolic singular point or a hyperbolic closed orbit) at any point of their intersection [21, 22]; (iii) the set of all periodic points (i.e., the union of all singular points and all closed orbits) is dense in the nonwandering set $\Omega$ for $V \in \mathfrak{X}^1(M)$ [23]; (iv) there is no regular first integral for $V \in \mathfrak{X}^1(M)$, where a regular first integral of a vector field $V$ on $M$ is a $C^1$-function $f: M \to \mathbf{R}$ such that $f$ is not constant on any open set of $M$ and is constant along any orbit of (the flow generated by) $V$ (R. C. Robinson).

Let $V$ (resp. $V'$) be an element of $\mathfrak{X}^r(M)$ (resp. $\mathfrak{X}^r(M')$), $(M, \varphi)$ (resp. $(M', \varphi')$) the flow generated by $V$ (resp. $V'$), and $\Omega$ (resp. $\Omega'$) the nonwandering set of $(M, \varphi)$ (resp. $(M', \varphi')$). $V$ and $V'$ are called topologically equivalent (resp. $\Omega$-equivalent) if $(M, \varphi)$ and $(M', \varphi')$ (resp. the restrictions $(\Omega, \varphi | \Omega \times \mathbf{R})$ and $(\Omega', \varphi' | \Omega' \times \mathbf{R})$) are topologically equivalent. A vector field $V \in \mathfrak{X}^r(M)$ (or the flow generated by $V$) is called $C^r$-structurally stable (resp. $C^r$-$\Omega$-stable) if there exists a neighborhood $\mathcal{N}$ of $V$ in $\mathfrak{X}^r(M)$ such that any $V'$ in $\mathcal{N}$ is topologically equivalent (resp. $\Omega$-equivalent) to $V$. If $V \in \mathfrak{X}^r(M)$ is structurally stable (resp. $\Omega$-stable), then the topological structure of the phase portrait of (the flow generated by) $V$ in the whole space (resp. the nonwandering set) remains invariant under a sufficiently small $C^r$-perturbation of $V$. The generic properties (i)-(iv) above hold for $C^1$-structurally stable vector fields (Markus, Thom, Peixoto, J. Arraut). The generic properties (i) and (iii) above hold for $C^1$-$\Omega$-stable vector fields.

(2) Let $M$ be a compact $C^\infty$-manifold without boundary. Let $F^r(M)$ be the set of all $C^r$-mappings of $M$ into itself with the topology of uniform $C^r$-convergence. Let $\mathrm{Diff}^r(M)$ be the subset of $F^r(M)$ consisting of all $C^r$-diffeomorphisms of $M$ onto itself. Then $F^r(M)$ is a complete metric space, and $\mathrm{Diff}^r(M)$ is open in $F^r(M)$. Thus $\mathrm{Diff}^r(M)$ is a †Baire space and also a topological group with respect to this topology. A proposition $P$ concerning $f \in \mathrm{Diff}^r(M)$ is called generic if the set $\{f \in \mathrm{Diff}^r(M) \mid P(f)\}$ contains a residual set. The following properties are generic: (i) Every periodic point is hyperbolic [21]; (ii) for each pair of hyperbolic periodic points $p$ and $q$, the stable manifold $W^s(p)$ intersects the unstable manifold $W^u(q)$ transversally [21]; (iii) for a $C^1$-diffeomorphism $f \in \mathrm{Diff}^1(M)$ the set of all periodic points is dense in the nonwandering set ([23], Robinson).

Let $f: M \to M$ and $g: N \to N$ be $C^r$-diffeomorphisms and $\Omega(f)$ and $\Omega(g)$ their nonwandering sets. $f$ and $g$ are called $\Omega$-conjugate if $f | \Omega(f): \Omega(f) \to \Omega(f)$ and $g | \Omega(g): \Omega(g) \to \Omega(g)$ are topologically conjugate. A diffeomorphism $f \in \mathrm{Diff}^r(M)$ is called $C^r$-structurally stable (resp. $C^r$-$\Omega$-stable) if there exists a neighborhood $\mathcal{N}$ of $f$ in $\mathrm{Diff}^r(M)$ such that any $g$ in $\mathcal{N}$ is topologically conjugate (resp. $\Omega$-conjugate) to $f$. Generic properties (i)-
(iii) above hold for $C^1$-structurally stable diffeomorphisms, and (i) and (iii) above hold for $C^1$-$\Omega$-stable diffeomorphisms.

I. Low-Dimensional Systems

(1) Let $S^1 = \{z \in \mathbf{C} \mid |z| = 1\}$ be the unit circle in the complex plane $\mathbf{C}$ and $p: \mathbf{R} \to S^1$ the †covering projection defined by $p(x) = e^{2\pi i x}$, $x \in \mathbf{R}$. Let $f: S^1 \to S^1$ be an †orientation-preserving homeomorphism. Then $f$ can be lifted to a mapping $F: \mathbf{R} \to \mathbf{R}$ satisfying the following conditions: (i) $p \circ F = f \circ p$; (ii) $F$ is monotone increasing; (iii) $F - 1_{\mathbf{R}}$ is a periodic function of period 1, where $1_{\mathbf{R}}$ is the identity mapping of $\mathbf{R}$. The limit $\rho(f) = \lim_{n \to \infty} F^n(x)/n$ exists for all $x \in \mathbf{R}$, and its †residue class modulo $\mathbf{Z}$ is independent of $F$ and $x$. We call $\rho(f)$ the rotation number of $f$. Let $f, g: S^1 \to S^1$ be orientation-preserving homeomorphisms. If $f$ and $g$ are topologically conjugate by an orientation-preserving (resp. †reversing) homeomorphism, then $\rho(f) = \rho(g)$ (resp. $\rho(f) = -\rho(g)$) modulo $\mathbf{Z}$. An orientation-preserving homeomorphism $f: S^1 \to S^1$ has a periodic point of the smallest positive period $s$ if and only if $\rho(f) = r/s$, where $r$ and $s$ are relatively prime integers. In this case all periodic points are of the smallest positive period $s$. If $\rho(f)$ is irrational, then the $\omega$-limit set $\omega(x)$ of $x \in S^1$ is independent of $x$, and $E = \omega(x)$ is either †perfect and †nowhere dense or the whole space $S^1$. If $\rho(f)$ is irrational and $E = \omega(x) = S^1$, then $f$ is called transitive. If $f$ is transitive, then it is topologically conjugate to the rotation $r_{\rho(f)}: S^1 \to S^1$ defined by $r_{\rho(f)}(e^{2\pi i x}) = e^{2\pi i (x + \rho(f))}$, $x \in \mathbf{R}$. Let $f: S^1 \to S^1$ be of class $C^1$ with $\rho(f)$ irrational. If its derivative $f'$ is of †bounded variation, then $f$ is transitive (A. Denjoy). In particular, if $f$ is of class $C^2$ with $\rho(f)$ irrational, then $f$ is topologically conjugate to the rotation $r_{\rho(f)}$. However, there are $C^1$-diffeomorphisms of $S^1$ onto itself whose nonwandering sets are not the whole space. Those $C^1$-diffeomorphisms are never topologically conjugate to $C^2$-diffeomorphisms. M. R. Herman gave a sufficient condition for a diffeomorphism of $S^1$ onto itself to be differentiably conjugate to a rotation.

A $C^1$-diffeomorphism $f: S^1 \to S^1$ is structurally stable if and only if $\Omega(f)$ is finite (hence $\Omega(f)$ consists of a finite number of periodic points), and all periodic points are hyperbolic. The set of all structurally stable $C^1$-diffeomorphisms of $S^1$ onto itself is a dense open set in $\mathrm{Diff}^1(S^1)$ (Peixoto).

(2) Let $D$ be an open set in $\mathbf{R}^2$ and $f: D \to \mathbf{R}^2$ a continuous mapping. We assume that for each $x \in D$ there exists a unique nonextendable solution $\varphi(x, t)$ with the initial condition $\varphi(x, 0) = x$ defined on $(a_x, b_x)$ for the equation $dx/dt = f(x)$, $x \in D$. Suppose that for a given point $x \in D$ there exists a compact set $K$ in $D$ containing the positive semiorbit $C^+(x) = \{\varphi(x, t) \mid 0 \leq t < b_x\}$ of $x$. Then $b_x = \infty$. Further, we assume that the $\omega$-limit set $\omega(x)$ of $x$ contains no singular points. Then we have either (i) $C^+(x) = \omega(x)$ and it is a closed orbit, or (ii) $C^+(x) \neq \omega(x)$ and $\omega(x)$ is a closed orbit. In case (ii), $\omega(x)$ is a †simple closed curve and $C^+(x)$ is a spiral which tends to the closed orbit $\omega(x)$ (Poincaré, I. Bendixson). We call such a closed orbit $\omega(x)$ a limit cycle. Let $f_1$ and $f_2$ be polynomials of two variables and $m$ the maximum of the degrees of $f_1$ and $f_2$. Let $f = (f_1, f_2): \mathbf{R}^2 \to \mathbf{R}^2$ be the mapping defined by $f_1$ and $f_2$. The equation $dx/dt = f(x)$, $x \in \mathbf{R}^2$, defined by such an $f$ is called a polynomial system of degree $m$. The following is Hilbert's 16th problem: Is there a number $N(m)$ depending only on $m$ such that the number of limit cycles for any polynomial system of degree $m$ is bounded by $N(m)$?

Let $M$ be a closed (i.e., compact, without boundary) $C^\infty$-manifold of dimension 2 and $(M, \varphi)$ a $C^2$-flow on $M$. Then a minimal set of $(M, \varphi)$ is either (i) a singular point, (ii) a closed orbit, or (iii) the whole space $M$. For case (iii), $M$ is a torus (Poincaré, Denjoy, C. L. Siegel, A. J. Schwartz). Let $T$ be a 2-dimensional torus and $(T, \varphi)$ a $C^r$-flow. Suppose that $(T, \varphi)$ has a cross section $\Sigma$ which is $C^1$-diffeomorphic to $S^1$. Let $f: \Sigma \to \Sigma$ be the Poincaré mapping for $\Sigma$. Then $(T, \varphi)$ has a closed orbit if and only if the rotation number $\rho(f)$ of $f$ is rational. If $\rho(f)$ is irrational and $(T, \varphi)$ is of class $C^2$, then $T$ is a minimal set.

Let $M$ be an orientable 2-dimensional closed $C^\infty$-manifold and $V \in \mathfrak{X}^1(M)$. Then $V$ is structurally stable if and only if the following conditions are satisfied: (i) There are only a finite number of singular points, all hyperbolic; (ii) There are only a finite number of closed orbits, all hyperbolic; (iii) There are no orbits which connect saddle points; (iv) The $\alpha(x)$ and $\omega(x)$ for any $x \in M$ are singular points or closed orbits (Peixoto). The above theorem was first proved by Andronov and Pontryagin for analytic vector fields on a 2-dimensional disk which are transverse to the boundary of the disk at any boundary point. The set of all structurally stable vector fields of class $C^r$ on $M$ is a dense open set in $\mathfrak{X}^r(M)$ (Peixoto).

J. Axiom A Systems

In this section we assume that phase spaces are closed $C^\infty$-manifolds with metric $d$ and $1 \leq r \leq \infty$ unless stated otherwise.
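
As a numerical aside on the rotation number defined in Section I(1), the limit $\rho(f) = \lim_{n \to \infty} F^n(x)/n$ can be approximated by iterating a lift $F$. The sketch below is a rough illustration under assumptions not made in the text: the lift is taken from the Arnold family $F(x) = x + \omega - (k/2\pi)\sin(2\pi x)$ with $|k| < 1$ (so that $F$ is monotone increasing and $F - 1_{\mathbf{R}}$ has period 1), the names `arnold_lift` and `rotation_number` are hypothetical, and the estimate uses $(F^n(x) - x)/n$, which has the same limit.

```python
import math

def arnold_lift(x, omega, k):
    """Assumed example lift F(x) = x + omega - (k / 2π) sin(2πx); monotone for |k| < 1."""
    return x + omega - (k / (2.0 * math.pi)) * math.sin(2.0 * math.pi * x)

def rotation_number(F, x0=0.0, n=10_000):
    """Finite-n estimate (F^n(x0) - x0) / n of the rotation number of the lift F."""
    x = x0
    for _ in range(n):
        x = F(x)
    return (x - x0) / n

# Rigid rotation (k = 0): the estimate reproduces omega up to rounding error.
print(rotation_number(lambda x: arnold_lift(x, 0.3, 0.0)))

# Independence of the starting point: two estimates differ by less than 1/n in theory.
F = lambda x: arnold_lift(x, 0.3, 0.9)
print(rotation_number(F, 0.0), rotation_number(F, 0.37))
```

The error of this estimate decays only like $1/n$, so it cannot decide whether $\rho(f)$ is rational or irrational; it is meant only to make the definition concrete.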

(1) A vector field $V \in \mathfrak{X}^r(M)$ (resp. a $C^r$-flow $(M, \varphi)$) is called a Morse-Smale vector field (resp. a Morse-Smale flow) if the following conditions are satisfied: (i) The nonwandering set is the union of a finite number of singular points and a finite number of closed orbits; (ii) The singular points and closed orbits are all hyperbolic; (iii) The stable manifolds and the unstable manifolds of the singular points and closed orbits intersect each other transversely. If $\dim M = 2$, the Morse-Smale vector fields on $M$ are exactly the structurally stable ones discussed in Section I(2). A Morse-Smale vector field is structurally stable [25]. The set of all Morse-Smale vector fields on $M$ is open but not dense in $\mathfrak{X}^r(M)$ if $\dim M > 2$ (Palis, Smale). However, it contains a dense open subset of the set of all gradient vector fields with respect to a given Riemannian metric (Smale). In particular, any closed manifold admits Morse-Smale vector fields and hence structurally stable vector fields. For a Morse-Smale vector field, the Morse inequalities hold as they do in the calculus of variations in the large [24].

(2) A vector field $V \in \mathfrak{X}^r(M)$ (or the $C^r$-flow $(M, \varphi)$ generated by $V$) is called an Anosov vector field (or an Anosov flow) if the following conditions are satisfied: (i) There is a direct sum decomposition $T_x(M) = L(V(x)) \oplus E^s_x \oplus E^u_x$ of the tangent space $T_x(M)$ for each $x \in M$ which depends continuously on $x \in M$; (ii) $(d\varphi_t)_x(E^s_x) = E^s_{\varphi_t(x)}$ and $(d\varphi_t)_x(E^u_x) = E^u_{\varphi_t(x)}$ for all $x \in M$ and $t \in \mathbf{R}$; (iii) There are a Riemannian metric on $M$ and constants $c, \lambda > 0$ such that, for all $t > 0$ and $x \in M$, $\|(d\varphi_t)_x(w)\| \le c e^{-\lambda t}\|w\|$ when $w \in E^s_x$, and $\|(d\varphi_{-t})_x(w)\| \le c e^{-\lambda t}\|w\|$ when $w \in E^u_x$, where $\|\cdot\|$ is the norm induced by the Riemannian metric. The suspensions of Anosov diffeomorphisms and the geodesic flows on Riemannian manifolds of negative curvature are important examples of Anosov flows (J. Hadamard [8]). There are examples of Anosov flows other than the ones stated above (M. Handel and W. P. Thurston, J. Franks and R. F. Williams). The following have been proved by Anosov [8]: (i) An Anosov flow is structurally stable; (ii) there are countably many closed orbits for an Anosov flow; (iii) if there exists a smooth invariant measure (i.e., an invariant measure which has a smooth density with respect to the measure associated with the Riemannian metric), then the set of all closed orbits is dense in $M$. If we assume further that the flow is of class $C^2$, then it is ergodic; (iv) the set of all Anosov vector fields is open in $\mathfrak{X}^r(M)$; (v) $\{E^s_x\}$, $\{E^u_x\}$ ($x \in M$) define foliations on $M$, which are called Anosov foliations.

(3) A vector field $V \in \mathfrak{X}^r(M)$ (or the $C^r$-flow $(M, \varphi)$ generated by $V$) is called an Axiom A vector field (or an Axiom A flow) if the following conditions are satisfied: A(a) The nonwandering set consists of a finite set $F$ of singular points, all hyperbolic, and the closure $\Lambda$ of the union of closed orbits, and $F \cap \Lambda = \emptyset$; A(b) The conditions (i)-(iii) in the definition of an Anosov flow hold with "$x \in M$" replaced by "$x \in \Lambda$." For each $x \in M$, put $W^s(x) = \{y \in M \mid d(\varphi_t(x), \varphi_t(y)) \to 0 \text{ as } t \to \infty\}$ and $W^u(x) = \{y \in M \mid d(\varphi_{-t}(x), \varphi_{-t}(y)) \to 0 \text{ as } t \to \infty\}$. We call $W^s(x)$ (resp. $W^u(x)$) the stable (resp. unstable) manifold of $V$ at $x$. For a subset $A$ of $M$, put $W^\sigma(A) = \bigcup_{x \in A} W^\sigma(x)$ ($\sigma = s, u$). For $x \in M$, put $W^{w\sigma}(x) = W^\sigma(C(x))$ ($\sigma = s, u$), where $C(x)$ is the orbit of $x$. If the flow satisfies Axiom A, then $W^s(x)$, $W^u(x)$, $W^{ws}(x)$, and $W^{wu}(x)$ are injectively immersed submanifolds for all $x \in M$ (M. W. Hirsch and C. C. Pugh). If $x \in M$ is a hyperbolic singular point, then $W^s(x)$ and $W^u(x)$ defined above coincide with those defined before. If $C(x)$ is a hyperbolic closed orbit, then $W^s(C(x))$ and $W^u(C(x))$ defined above coincide with those defined before.

For an Axiom A flow, there is a decomposition of the nonwandering set $\Omega = \Omega_1 \cup \cdots \cup \Omega_l$ (disjoint union), where each $\Omega_i$ is closed, invariant, and transitive (i.e., has a dense orbit), and $M = \bigcup_{i=1}^{l} W^s(\Omega_i) = \bigcup_{i=1}^{l} W^u(\Omega_i)$ (disjoint union) [7]. This decomposition is called the spectral decomposition of $\Omega$, and each $\Omega_i$ is called a basic set. Let $\Omega = \Omega_1 \cup \cdots \cup \Omega_l$ be the spectral decomposition of the nonwandering set for an Axiom A flow. Denote $\Omega_i \le \Omega_j$ if $W^s(\Omega_i) \cap W^u(\Omega_j) \ne \emptyset$. A sequence of basic sets $\Omega_{i_0}, \Omega_{i_1}, \ldots, \Omega_{i_k}$ ($k \ge 1$) is called a cycle if $\Omega_{i_0} \le \Omega_{i_1} \le \cdots \le \Omega_{i_k}$, $\Omega_{i_0} = \Omega_{i_k}$, and otherwise $\Omega_{i_p} \ne \Omega_{i_q}$ for $p \ne q$. An Axiom A flow which has no cycles in the above sense is said to satisfy the no cycle condition. The $\Omega$-stability theorem: An Axiom A flow with the no cycle condition is $\Omega$-stable [27]. $\Omega$-explosion: If an Axiom A flow has a cycle, then it is not $\Omega$-stable (Palis). An Axiom A flow is said to satisfy the strong transversality condition if $W^{ws}(x)$ and $W^{wu}(x)$ intersect transversely at any $x \in M$. An Axiom A flow with the strong transversality condition satisfies the no cycle condition [7]. The structural stability theorem: An Axiom A flow of class $C^1$ with the strong transversality condition is structurally stable [29, 31]. Morse-Smale flows and Anosov flows are Axiom A flows with the strong transversality condition. There are Axiom A flows other than Morse-Smale flows and Anosov flows that satisfy the strong transversality condition [7]. Stability conjecture: A $C^r$-flow is structurally stable (resp. $\Omega$-stable) if and only if it is an Axiom A flow with the strong transversality condition (resp. the no cycle condition). S. Newhouse, V. A. Pliss, Robinson, R. Mañé, S. D. Liao,
and A. Sannami have made important contributions to the study of the stability conjecture. Neither the set of all Axiom A flows nor the set of all $\Omega$-stable flows nor the set of structurally stable flows is dense in $\mathfrak{X}^r(M)$ if $\dim M > 2$ (R. Abraham and Smale, Newhouse). However, the set of all structurally stable flows is dense in $\mathfrak{X}^r(M)$ in the $C^0$ topology (M. Shub, M. M. C. de Oliveira).

(4) A diffeomorphism $f \in \mathrm{Diff}^r(M)$ is called a Morse-Smale diffeomorphism if the following conditions are satisfied: (i) The nonwandering set $\Omega$ is a finite set, and hence it consists of periodic points; (ii) all periodic points are hyperbolic; (iii) for each pair $p, q \in \Omega$, $W^s(p)$ intersects $W^u(q)$ transversally. The Morse-Smale diffeomorphisms on the unit circle $S^1$ are exactly the structurally stable ones described in Section I(1). A Morse-Smale diffeomorphism is structurally stable [25]. The set of all Morse-Smale diffeomorphisms on $M$ is open but not dense in $\mathrm{Diff}^r(M)$ if $\dim M > 2$ (Palis, Smale). However, it contains a dense open subset of the set of all time-one mappings of the flows generated by gradient vector fields. In particular, any closed manifold admits Morse-Smale diffeomorphisms and hence structurally stable diffeomorphisms. For a Morse-Smale diffeomorphism, the Morse inequalities hold [24].

Let $\Lambda$ be a closed invariant set of $f \in \mathrm{Diff}^r(M)$. $\Lambda$ is called hyperbolic if the following conditions are satisfied: (i) There is a splitting $T_x(M) = E^s_x \oplus E^u_x$ of the tangent space $T_x(M)$ for each $x \in \Lambda$, which depends continuously on $x \in \Lambda$; (ii) $df_x(E^s_x) = E^s_{f(x)}$ and $df_x(E^u_x) = E^u_{f(x)}$ for all $x \in \Lambda$; (iii) There is a Riemannian metric on $M$ and constants $c > 0$, $0 < \lambda < 1$ such that, for any integer $m > 0$ and $x \in \Lambda$, $\|df^m_x(w)\| \le c\lambda^m\|w\|$ when $w \in E^s_x$ and $\|df^{-m}_x(w)\| \le c\lambda^m\|w\|$ when $w \in E^u_x$. A diffeomorphism $f \in \mathrm{Diff}^r(M)$ is called an Anosov diffeomorphism if $M$ itself is hyperbolic for $f$. Manifolds which admit Anosov diffeomorphisms are restricted (Hirsch, K. Shiraiwa). Examples of Anosov diffeomorphisms are given as follows: Let $L: \mathbf{R}^n \to \mathbf{R}^n$ be a hyperbolic linear automorphism with $L(\mathbf{Z}^n) = \mathbf{Z}^n$, where $\mathbf{Z}^n$ is the discrete subgroup of $\mathbf{R}^n$ consisting of all elements with integral coordinates. Then $L$ induces an automorphism $f: T^n \to T^n$ of the $n$-dimensional torus $T^n = \mathbf{R}^n/\mathbf{Z}^n$, which is an Anosov diffeomorphism. There are similar constructions using hyperbolic automorphisms of simply connected nilpotent Lie groups and their uniform discrete subgroups (Smale). A homeomorphism $h: X \to X$ of a metric space $X$ is called expansive if there is a constant $\varepsilon > 0$ such that $x, y \in X$ and $d(h^n(x), h^n(y)) < \varepsilon$ for all $n \in \mathbf{Z}$ imply $x = y$. An Anosov diffeomorphism is expansive. For Anosov diffeomorphisms, theorems similar to those stated in Section J(2) hold. Franks, Newhouse, A. Manning, and J. N. Mather obtained important results concerning Anosov diffeomorphisms.

A diffeomorphism $f \in \mathrm{Diff}^r(M)$ is called an Axiom A diffeomorphism if the following conditions are satisfied: A(a) The nonwandering set $\Omega$ is hyperbolic; A(b) the set of all periodic points is dense in $\Omega$. There are examples that satisfy Axiom A(a) but not Axiom A(b) (A. Dankner, M. Kurata). For $x \in M$, put $W^s(x) = \{y \in M \mid d(f^n(x), f^n(y)) \to 0 \text{ as } n \to \infty\}$ and $W^u(x) = \{y \in M \mid d(f^{-n}(x), f^{-n}(y)) \to 0 \text{ as } n \to \infty\}$. We call $W^s(x)$ (resp. $W^u(x)$) the stable (resp. unstable) manifold of $f$ at $x$. For a subset $A$ of $M$, put $W^\sigma(A) = \bigcup_{x \in A} W^\sigma(x)$ ($\sigma = s, u$). If $f$ is an Axiom A diffeomorphism, $W^s(x)$ and $W^u(x)$ are injectively immersed submanifolds of $M$. If $x$ is a hyperbolic periodic point, $W^s(x)$ and $W^u(x)$ defined here coincide with those defined before. For Axiom A diffeomorphisms, notions such as spectral decomposition, basic sets, the strong transversality condition, and the no cycle condition are defined similarly, and theorems similar to those stated in Section J(3) hold.

Let $h: X \to X$ be a homeomorphism of a metric space $X$ and $\alpha, \beta > 0$. A sequence $\{x_i\}_{i \in \mathbf{Z}}$ of points in $X$ is an $\alpha$-pseudo-orbit if $d(h(x_i), x_{i+1}) < \alpha$ for all $i \in \mathbf{Z}$. We say that $\{x_i\}_{i \in \mathbf{Z}}$ is $\beta$-shadowed (or $\beta$-traced) by a point $x \in X$ if $d(h^i(x), x_i) < \beta$ for all $i \in \mathbf{Z}$. We say that $h$ has the pseudo-orbit tracing property if for any $\beta > 0$ there exists an $\alpha > 0$ so that any $\alpha$-pseudo-orbit is $\beta$-shadowed by some point. If $f$ is an Axiom A diffeomorphism, then $f|\Omega(f): \Omega(f) \to \Omega(f)$ has the pseudo-orbit tracing property (Bowen). Axiom A diffeomorphisms with the strong transversality condition (in particular, Anosov diffeomorphisms) have the pseudo-orbit tracing property (Bowen, K. Sawada, K. Kato, A. Morimoto).

(5) Let $S$ be a discrete topological space with $n$ elements ($n \ge 2$) and $\Sigma(S) = \prod_{i \in \mathbf{Z}} S_i$ the doubly infinite product of copies $S_i$ of $S$ with the product topology. A point $x = (x_i)_{i \in \mathbf{Z}}$ of $\Sigma(S)$ is a doubly infinite sequence of points in $S$. $\Sigma(S)$ is a totally disconnected, perfect, compact, metrizable space (i.e., a Cantor set). Let $\sigma: \Sigma(S) \to \Sigma(S)$ be a mapping defined by $\sigma(x) = y$, $x = (x_i)$, $y = (y_i)$, $y_i = x_{i+1}$ ($i \in \mathbf{Z}$). Then $\sigma$ is a homeomorphism of $\Sigma(S)$ and is called the shift automorphism with $n$ symbols.

Define a diffeomorphism $f: S^2 \to S^2$ of the 2-dimensional sphere $S^2$ as follows. We consider $S^2$ as $\mathbf{R}^2 \cup \{\infty\}$. Let $U$ be a region of $\mathbf{R}^2$ consisting of the rectangle $R$ ($= ABCD$) and an upper and a lower cap as shown in Fig. 1. $f$ maps $U$ into itself as in Fig. 1 ($f(A) = A'$, $f(B) = B'$, $f(C) = C'$, $f(D) = D'$, and so on). Here, $R \cap f(R) = P \cup Q$ is the union of two rectangles,
$P' = f^{-1}(P)$ and $Q' = f^{-1}(Q)$ are rectangles, and $f$ is affine on each of $P'$ and $Q'$ (stretching in the vertical direction and contracting in the horizontal direction). The lower cap is contracted into itself, and every point in it tends to a sink $x_0$. The upper cap is mapped into the lower one and stays there. The point $\infty$ is the only source, and $\lim_{n \to \infty} f^{-n}(x) = \infty$ for all $x \in \mathbf{R}^2 - U$. Thus the nonwandering set of $f$ consists of a sink $x_0$, a source $\infty$, and $\Lambda = \bigcap_{n \in \mathbf{Z}} f^n(R)$. The mapping $f$ constructed here is called the horseshoe diffeomorphism. It is an Axiom A diffeomorphism with the strong transversality condition (and hence structurally stable). The restriction $f|\Lambda$ of $f$ on a basic set $\Lambda$ is topologically conjugate to the shift automorphism with two symbols. It is neither a Morse-Smale diffeomorphism nor an Anosov diffeomorphism [7].

Fig. 1

A restriction of a shift automorphism on a closed invariant set is called a subshift. Let $A = (A_{ij})$ be an $n \times n$ matrix with $(i, j)$-component $A_{ij} = 0$ or $1$ for all $i, j = 1, \ldots, n$. It is irreducible if for each $i, j$ there is a positive integer $m$ such that $A^m$ has a nonzero $(i, j)$-component. Assume that $A$ is irreducible and $S = \{1, \ldots, n\}$. Put $\Sigma_A = \{x = (x_i) \in \Sigma(S) \mid A_{x_i x_{i+1}} = 1 \text{ for all } i \in \mathbf{Z}\}$. Then $\Sigma_A$ is a closed invariant set of $\Sigma(S)$. The restriction $\sigma_A = \sigma|\Sigma_A: \Sigma_A \to \Sigma_A$ is called a subshift of finite type or a Markov subshift, and $A$ is called the transition matrix. A topological classification of the subshifts of finite type has been investigated by Williams.

Let $f \in \mathrm{Diff}^r(M)$ be an Axiom A diffeomorphism and $\Lambda$ its basic set. Bowen constructed a Markov partition for $f|\Lambda$ by generalizing the Sinai construction for Anosov diffeomorphisms. The Markov partition connects $f|\Lambda$ with a suitable subshift of finite type and is applied to the study of Axiom A diffeomorphisms, especially to the ergodic theory of Axiom A diffeomorphisms (Bowen, Ruelle). A similar theory for flows was developed by Bowen, Ruelle, and M. E. Ratner.

(6) Let $x \in M$ be a hyperbolic fixed point of $f \in \mathrm{Diff}^r(M)$. A point $p \in W^s(x) \cap W^u(x)$ ($p \ne x$) is called a homoclinic point. If $W^s(x)$ and $W^u(x)$ intersect transversally at a homoclinic point $p$, then $p$ is called a transversal homoclinic point. In a neighborhood of a transversal homoclinic point, there is a closed invariant set $\Lambda$ of $f^m$ for some positive integer $m$ such that $f^m|\Lambda$ is topologically conjugate to the shift automorphism with two symbols (Smale). There are generalizations of this theorem for semiflows by F. R. Marotto, Shiraiwa, and Kurata.

K. Topological Entropy and Zeta Functions

(1) The notion of topological entropy was first defined by R. L. Adler, A. G. Konheim, and M. H. McAndrew as an analog to measure-theoretic entropy. Let $X$ be a compact topological space and $\alpha$ an open covering of $X$. Let $N(\alpha)$ be the minimum number of members of a subcovering of $\alpha$. Let $f: X \to X$ be a continuous mapping and $\alpha$ an open covering of $X$. Then $\lim_{n \to \infty} (1/n) \log N(\alpha \vee f^{-1}\alpha \vee \cdots \vee f^{-n+1}\alpha)$ exists, where $\alpha \vee f^{-1}\alpha \vee \cdots \vee f^{-n+1}\alpha$ is the open covering $\{A_0 \cap f^{-1}(A_1) \cap \cdots \cap f^{-n+1}(A_{n-1}) \mid A_0, A_1, \ldots, A_{n-1} \in \alpha\}$. We denote the above limit by $h(f, \alpha)$ and call it the topological entropy of $f$ with respect to $\alpha$. The topological entropy $h(f)$ of $f$ is defined as $\sup h(f, \alpha)$, where the supremum is taken over all open coverings $\alpha$ of $X$. Now assume that $X$ is a compact metric space. For an open covering $\alpha$ of $X$, put $\operatorname{diam} \alpha = \sup\{\operatorname{diam} A \mid A \in \alpha\}$, where $\operatorname{diam} A$ is the diameter of $A$. If $\{\alpha_n\}_{n=1}^{\infty}$ is a sequence of open coverings of $X$ such that $\operatorname{diam} \alpha_n \to 0$ as $n \to \infty$, then $h(f, \alpha_n) \to h(f)$ as $n \to \infty$. The topological entropy of the shift automorphism with $n$ symbols is equal to $\log n$. Let $L: \mathbf{R}^n \to \mathbf{R}^n$ be a linear mapping with $L(\mathbf{Z}^n) \subset \mathbf{Z}^n$ and $\lambda_1, \ldots, \lambda_n$ its eigenvalues. Let $f: T^n \to T^n$ be the induced endomorphism of the $n$-dimensional torus $T^n = \mathbf{R}^n/\mathbf{Z}^n$. Then the topological entropy is given by $h(f) = \sum_{|\lambda_i| > 1} \log|\lambda_i|$.

Let $f: X \to X$ be a continuous mapping and $M(f)$ the set of all $f$-invariant probability measures on the Borel sets of $X$. For each $\mu \in M(f)$, denote by $h_\mu(f)$ the (measure-theoretic) entropy of $f$ with respect to $\mu$. Then $h(f) = \sup\{h_\mu(f) \mid \mu \in M(f)\}$ (E. I. Dinaburg, T. N. T. Goodman, L. W. Goodwyn). The following properties hold: (i) If $f: X \to X$ is a homeomorphism, then $h(f^n) = |n| h(f)$ for all $n \in \mathbf{Z}$. (ii) If $(X, \varphi)$ is a continuous flow, then $h(\varphi_t) = |t| h(\varphi_1)$ for all $t \in \mathbf{R}$. (iii) Let $f_i: X_i \to X_i$ ($i = 1, 2$) be continuous. Then $h(f_1 \times f_2) = h(f_1) + h(f_2)$. (iv) Let $f_i: X_i \to X_i$ ($i = 1, 2$) be continuous. If there is a continuous mapping $g: X_1 \to X_2$ such that $g(X_1) = X_2$ and $g \circ f_1 = f_2 \circ g$, then $h(f_1) \ge h(f_2)$. In particular, if $f_1$ and $f_2$ are topologically conjugate, then $h(f_1) =$
$h(f_2)$. (v) If $f: X \to X$ is a homeomorphism and $\Omega$ is the nonwandering set of $f$, then $h(f) = h(f|\Omega)$. In particular, $\Omega$-conjugate homeomorphisms have the same topological entropy (Bowen). (vi) If $f: M \to M$ is an Axiom A diffeomorphism of a closed $C^\infty$-manifold $M$, then $h(f) = \limsup_{n \to \infty} (1/n) \log N_n(f)$, where $N_n(f)$ is the number of all periodic points of period $n$. And $h(f) = 0$ if and only if the nonwandering set is finite (Bowen).

Bowen gave alternative definitions of the topological entropy and generalized it for uniformly continuous mappings of (not necessarily compact) metric spaces. If $f: M \to M$ is a $C^1$-mapping of an $n$-dimensional Riemannian manifold $M$, then $h(f) \le \max\{0, n \log \sup\{\|df_x\| \mid x \in M\}\}$, where $\|df_x\|$ is the norm of $df_x: T_x(M) \to T_{f(x)}(M)$, and hence $h(f)$ is finite (Bowen, S. Ito).

(2) Let $f: M \to M$ be a continuous mapping of a closed $C^\infty$-manifold $M$ and $f_*: H_*(M) \to H_*(M)$ (resp. $f_{*1}: H_1(M) \to H_1(M)$) the induced homomorphism of $f$ on the homology group $H_*(M)$ (resp. the first homology group $H_1(M)$) with coefficients in $\mathbf{R}$. The spectral radius $s(L)$ of a linear mapping $L: E \to E$ of a real vector space $E$ is the maximum of the absolute values of the eigenvalues of $L$. The following entropy conjecture is still open: If $f: M \to M$ is a $C^1$-diffeomorphism ($C^1$-mapping), then $h(f) \ge \log s(f_*)$. Concerning the entropy conjecture, the following are known: (i) The conjecture holds for a dense open set of $\mathrm{Diff}^r(M)$ in the $C^0$-topology (Shub); (ii) for any continuous mapping $f$, $h(f) \ge \log s(f_{*1})$ (Manning); (iii) the conjecture fails for a homeomorphism (Pugh); (iv) the conjecture holds for an Axiom A diffeomorphism with the no cycle condition (Shub and Williams); (v) the conjecture holds for any continuous mapping of the $n$-dimensional torus (M. Misiurewicz, F. Przytycki); (vi) for a $C^1$-mapping $f: M \to M$, $h(f) \ge \log|\deg f|$, where $\deg f$ is the mapping degree of $f$ (Misiurewicz, Przytycki).

(3) M. Artin and B. Mazur first defined the zeta function of a diffeomorphism by analogy to Weil's zeta function. Let $f: X \to X$ be a homeomorphism of a compact metric space $X$. Assume that the number $N_m = N_m(f)$ of all periodic points of period $m$ is finite for all $m$. Put $\zeta(t) = \exp\left(\sum_{m=1}^{\infty} N_m t^m/m\right)$ and call it the zeta function of $f$. For any closed $C^\infty$-manifold $M$, there is a dense set $E$ of $\mathrm{Diff}^r(M)$ such that $\tilde{N}_m(f) \le c k^m$ for $f \in E$, where $\tilde{N}_m(f)$ is the number of isolated periodic points of period $m$ and $c$ and $k$ are positive constants depending only on $f$ (Artin, Mazur). Hence, for such $f \in E$, the series $\exp\left(\sum_{m=1}^{\infty} \tilde{N}_m(f) t^m/m\right)$ has a positive radius of convergence. Originally, Artin and Mazur called this the zeta function of $f$.

Let $\sigma_A: \Sigma_A \to \Sigma_A$ be a subshift of finite type with transition matrix $A$; then $\zeta(t) = 1/\det(E - tA)$, where $E$ is the unit matrix (Bowen and O. E. Lanford III). The zeta function of an Axiom A diffeomorphism has a positive radius of convergence and is a rational function (K. Meyer, J. Guckenheimer, Manning).

Let $(M, \varphi)$ be a nonsingular $C^r$-flow on a closed $C^\infty$-manifold $M$. Let $\Gamma$ be the set of all closed orbits and $l(\gamma)$ the smallest positive period of $\gamma \in \Gamma$. Smale defined the zeta function of $(M, \varphi)$ by $Z(s) = \prod_{\gamma \in \Gamma} \prod_{k=0}^{\infty} \left(1 - [\exp l(\gamma)]^{-s-k}\right)$. If $(M, \varphi)$ is the geodesic flow on a surface of constant negative curvature, $Z(s)$ reduces to the Selberg zeta function.

There are generalizations and modifications of the notion of zeta function by Ruelle and Franks.

L. Classical Dynamical Systems

(1) Let $M$ be a $C^\infty$-manifold without boundary. A $C^r$-flow $(M, \varphi)$ or a $C^r$-diffeomorphism $f: M \to M$ with a smooth invariant measure is called a classical dynamical system. Most important classical dynamical systems are Hamiltonian or Lagrangian systems. In the modern formulation, a Hamiltonian system consists of $M$, a symplectic form $\omega$ on $M$ (i.e., a nondegenerate closed 2-form), and the vector field $X_H$ on $M$ defined by a $C^{r+1}$-function $H: M \to \mathbf{R}$. We call $X_H$ the Hamiltonian vector field with energy function $H$. Let $(M, \omega, X_H)$ be a Hamiltonian system. Then the following hold: (i) $M$ is of even dimension and there is a system of local coordinates $(q^1, \ldots, q^n, p_1, \ldots, p_n)$ such that $\omega = \sum_{i=1}^{n} dq^i \wedge dp_i$ (J. G. Darboux). In these coordinates $X_H$ is expressed by Hamilton's equations $dq^i/dt = \partial H/\partial p_i$, $dp_i/dt = -\partial H/\partial q^i$, $i = 1, \ldots, n$; (ii) the smooth measure defined by the volume element $\Omega = ((-1)^{[n/2]}/n!)\,\omega^n$ is an invariant measure for the flow generated by $X_H$ (J. Liouville); (iii) the energy function $H$ is constant along any trajectory of $X_H$. Especially, $H^{-1}(e)$ ($e \in \mathbf{R}$) is an invariant set for each $e$ and is called an energy surface. Energy surfaces are submanifolds of codimension one for almost all $e \in \mathbf{R}$.

Important examples of Hamiltonian systems are given as follows. Let $Q$ be an $n$-dimensional manifold and $T^*(Q)$ the cotangent bundle of $Q$. Then $T^*(Q)$ has a canonical symplectic form $\omega_0$. We call $Q$ a configuration space and $T^*(Q)$ a momentum phase space. For any $C^{r+1}$-function $H$ on $T^*(Q)$, we have a Hamiltonian system $(T^*(Q), \omega_0, X_H)$ in which $X_H$ is of class $C^r$. The Hamiltonian formalism is translated into Lagrangian formalism by using the tangent bundle $T(Q)$ instead of $T^*(Q)$. $T(Q)$ is called a velocity phase space.
For a given differentiable function $L$ on $T(Q)$, the energy function $E$ on $T(Q)$ and the Lagrangian vector field $X_E$ on $T(Q)$ are constructed. If $L$ satisfies a certain condition, there is a system of local coordinates $(q^1, \ldots, q^n, \dot{q}^1, \ldots, \dot{q}^n)$ such that $X_E$ is expressed in these coordinates by the Euler-Lagrange equations $dq^i/dt = \dot{q}^i$, $\frac{d}{dt}(\partial L/\partial \dot{q}^i) = \partial L/\partial q^i$, $i = 1, \ldots, n$. The projection of an integral curve (i.e., an orbit) of $X_E$ into $Q$ is called a base integral curve. Under a suitable condition, the base integral curve of a Lagrangian (or a Hamiltonian) system is the geodesic of the Jacobi metric on $Q$ up to a reparametrization, and the restriction of the flow generated by $X_E$ on an energy surface is the geodesic flow described below.

(2) Let $M$ be a complete Riemannian manifold of class $C^\infty$ and $S(M) = \{v \in T(M) \mid \|v\| = 1\}$. Let $\pi: S(M) \to M$ be the projection defined by $\pi(v) = x$ for $v \in T_x(M)$. Then $S(M)$ is a sphere bundle over $M$, which is called the unit tangent sphere bundle over $M$. Each $v \in S(M)$ determines a unique geodesic $C_v: \mathbf{R} \to M$ such that $C_v(0) = \pi(v)$ and the tangent vector $C_v'(0)$ to $C_v$ at $t = 0$ is equal to $v$. Let $v_t$ be the tangent vector $C_v'(t)$ to $C_v$ at $t$ for $t \in \mathbf{R}$. Then $\|v_t\| = 1$ and $\pi(v_t) = C_v(t)$ for all $t \in \mathbf{R}$. Define a mapping $\varphi: S(M) \times \mathbf{R} \to S(M)$ by $\varphi(v, t) = v_t$. Then $(S(M), \varphi)$ is a $C^\infty$-flow, which is called the geodesic flow on $M$. By the classical Liouville theorem, a geodesic flow has a smooth invariant measure.

(3) Let $T^n = \mathbf{R}^n/\mathbf{Z}^n$ be the $n$-dimensional torus and $\omega_1, \ldots, \omega_n$ real constants. Define a mapping $\varphi: T^n \times \mathbf{R} \to T^n$ by $\varphi([x_1, \ldots, x_n], t) = [x_1 + \omega_1 t, \ldots, x_n + \omega_n t]$, where $[x_1, \ldots, x_n] \in \mathbf{R}^n/\mathbf{Z}^n = T^n$ is the residue class of $(x_1, \ldots, x_n) \in \mathbf{R}^n$ modulo $\mathbf{Z}^n$. Then $(T^n, \varphi)$ is a $C^\infty$-flow. We call it the translation flow with frequencies $\omega_1, \ldots, \omega_n$. If $\omega_1, \ldots, \omega_n$ are linearly independent over $\mathbf{Z}$, they are called independent. Every orbit of the translation flow is dense if and only if its frequencies are independent (Poincaré, H. Weyl). A translational flow with independent frequencies is called quasiperiodic.

Let $(M, \omega, X_H)$ be a Hamiltonian system on a $2n$-dimensional manifold $M$. Under a certain condition, there exist an open set $U$ of $M$ and a diffeomorphism $f: U \to T^n \times \mathbf{R}^n$ such that the following holds: Identify $U$ with $T^n \times \mathbf{R}^n$ by $f$ and let $(q^1, \ldots, q^n, p_1, \ldots, p_n)$ be the coordinates of $T^n \times \mathbf{R}^n$. Then the energy function $H$ is independent of $q = (q^1, \ldots, q^n)$ so that Hamilton's equations become $dq^i/dt = \partial H/\partial p_i$, $dp_i/dt = -\partial H/\partial q^i = 0$ for $i = 1, \ldots, n$. Therefore the solutions are given by $q^i(t) = (\partial H/\partial p_i(c))\,t + q^i(0)$ modulo $\mathbf{Z}$, $p_i(t) = p_i(0) = c_i$, $i = 1, \ldots, n$, where $c = (c_1, \ldots, c_n)$. Therefore $N_c = f^{-1}(T^n \times \{c\})$ is an invariant torus (i.e., an invariant set diffeomorphic to $T^n$) of (the flow generated by) $X_H$ for all $c \in \mathbf{R}^n$, and the restriction flow on $N_c$ is the translational flow with frequencies $\omega_1 = \partial H/\partial p_1(c), \ldots, \omega_n = \partial H/\partial p_n(c)$ (Arnold). Assume further that the Hessian $\det(\partial^2 H/\partial p_i \partial p_j)$ does not vanish at $c$ and $\omega_1, \ldots, \omega_n$ are independent. Let $\tilde{H}$ be an energy function obtained by adding a sufficiently small perturbation to $H$. Then for almost all $c'$ near $c$, there exists an invariant torus $\tilde{N}_{c'}$ of $X_{\tilde{H}}$ near $N_{c'}$ such that the restriction flow on $\tilde{N}_{c'}$ is differentiably equivalent to the translational flow with the same frequencies (Kolmogorov, Arnold, Moser).

(4) Generic properties for Hamiltonian systems were investigated by M. Buchner, Markus, Meyer, Pugh, Robinson, Takens, and Newhouse.

M. Bifurcation

(1) Consider a differential equation with a parameter. For example, let $X$ be a domain of $\mathbf{R}^n$, $J = (-1, 1)$, and $f: J \times X \to \mathbf{R}^n$ a $C^r$-mapping. For each $\mu \in J$, define $f_\mu: X \to \mathbf{R}^n$ by $f_\mu(x) = f(\mu, x)$, $x \in X$. Consider the differential equation

$dx/dt = f_\mu(x)$, $x \in X$ and $\mu \in J$. (7)

As $\mu$ varies, the topological structure of the phase portrait of (7) may change. Suppose that there exists $\mu_0 \in J$ such that the topological structure of the phase portrait of (7) changes at $\mu = \mu_0$ but remains the same when $\mu_0 - \varepsilon < \mu < \mu_0$ or $\mu_0 < \mu < \mu_0 + \varepsilon$ for some $\varepsilon > 0$. Then $\mu_0$ is called a bifurcation point of (7).

Hopf bifurcation: Assume that $X = \mathbf{R}^2$ and the origin $0 \in \mathbf{R}^2$ is a singular point of (7) for all $\mu \in J$. Assume further that the Jacobian matrix of $f_\mu$ at $0 \in \mathbf{R}^2$ has two distinct complex conjugate eigenvalues $\lambda(\mu)$ and $\bar{\lambda}(\mu)$ such that the real part $\operatorname{Re} \lambda(\mu)$ of $\lambda(\mu)$ is positive when $\mu > 0$, zero when $\mu = 0$, and negative when $\mu < 0$. Then $0 \in \mathbf{R}^2$ is a sink for $\mu < 0$ and a source for $\mu > 0$. Now assume further that $\frac{d}{d\mu}(\operatorname{Re} \lambda(\mu))|_{\mu=0}$ is positive and $0 \in \mathbf{R}^2$ is a "vague attractor." Then $0 \in J$ is a bifurcation point, and there exists an asymptotically stable closed orbit for (7) near and around $0 \in \mathbf{R}^2$ which depends continuously on $\mu$ for $\mu > 0$ [37]. Thus a sink of (7) ($\mu < 0$) changes to a source and an asymptotically stable closed orbit ($\mu > 0$) when $\mu$ changes its sign. The Hopf bifurcation theorem can be generalized to a higher-dimensional case, and there is a diffeomorphism version of the theorem.

(2) Bifurcations in more general settings have been investigated by many mathematicians, including Thom, Arnold, R. J. Sacker, G. R. Sell, D. H. Sattinger, G. Iooss, Ruelle, and Takens. Generic bifurcations of dynamical systems have been investigated by J. Sotomayor, Meyer, P. Brunovsky, and others; bifurcations of Morse-Smale systems by Newhouse, Palis, Peixoto, and S. Matsumoto; and bifurcations of Axiom A diffeomorphisms by Newhouse and Palis.

N. Miscellaneous Topics

(1) Let $S^3$ be the 3-dimensional sphere and $V \in \mathfrak{X}^1(S^3)$. H. Seifert proved that if $V$ is sufficiently close (in $C^0$ topology) to a nonsingular vector field tangent to the fibers of the Hopf fibration $S^3 \to S^2$, then $V$ has a closed orbit. He conjectured that every nonsingular vector field $V \in \mathfrak{X}^1(S^3)$ had a closed orbit. (Seifert conjecture; → 154 Foliations D). If a nonsingular vector field $V \in \mathfrak{X}^1(S^3)$ is transverse to a codimension 1 foliation of class $C^2$, then it has a closed orbit (S. P. Novikov). Let $M$ be a 3-dimensional $C^\infty$-manifold. Then there exists a nonsingular vector field $V \in \mathfrak{X}^1(M)$ with no closed orbit in any homotopy class of a nonsingular vector field on $M$ (P. A. Schweitzer). Thus the Seifert conjecture fails for a vector field of class $C^1$, but the conjecture for a vector field of class $C^r$ ($r \ge 2$) is an open problem. Related work has been done by Fuller, H. Chu, and A. Weinstein.

(2) Let $M$ be a closed connected $C^\infty$-manifold. A $C^r$-flow $(M, \varphi)$ (resp. $f \in \mathrm{Diff}^r(M)$) is a minimal flow (resp. a minimal diffeomorphism) if $M$ itself is a minimal set. If $M$ admits a locally free $S^1$-action of class $C^\infty$, then it admits a minimal $C^\infty$-diffeomorphism, and if $M$ admits a locally free special (in particular, a free) $T^2$-action of class $C^\infty$, then it admits a minimal $C^\infty$-flow (A. Fathi, Herman, A. B. Katok). Open problems: What are the topological properties of the manifolds admitting minimal flows? Does $S^3$ admit a minimal flow?

(3) E. N. Lorenz studied numerical solutions of the following nonlinear equations in $\mathbf{R}^3$ which arose from the convection equation: $dx/dt = -\sigma x + \sigma y$, $dy/dt = -xz + rx - y$, $dz/dt = xy - bz$. When $\sigma = 10$, $r = 28$, and $b = 8/3$, he found irregular behavior in this dynamical system. R. M. May studied numerical solutions of the following difference equation in connection with the growth of biological populations with nonoverlapping generations: $x_{n+1} = a x_n(1 - x_n)$, $x_n \in [0, 1]$ ($1 < a < 4$). He found that the dynamical structure of the above difference equation was delicate and complicated. Y. Ueda and H. Kawakami also found similar phenomena in their numerical study of Duffing's equation of the type $d^2x/dt^2 + k\,dx/dt + x^3 = B \cos t$. The phenomena observed in these investigations were called chaos, which exhibits strange attractors. These investigations have attracted the attention of many mathematicians and scientists. For 1-dimensional semidynamical systems such as May's equation, T. Y. Li, J. A. Yorke, A. N. Sharkovskii, J. W. Milnor, Thurston, and many others have obtained notable results, while for Lorenz's equation we have results by Ruelle, Guckenheimer, Williams, Sinai, and many others. Chaos arising from discretization of differential equations has been studied by M. Yamaguti, S. Ushiki, and others (→ 433 Turbulence and Chaos).

References

[1] H. Poincaré, Mémoire sur les courbes définies par une équation différentielle, J. Math. Pures Appl., (3) 7 (1881), 375-422; 8 (1882), 251-296. (Oeuvres 1, Gauthier-Villars, 1928, 3-84.)
[2] H. Poincaré, Sur les courbes définies par les équations différentielles, J. Math. Pures Appl., (4) 1 (1885), 167-244; 2 (1886), 151-217. (Oeuvres 1, 90-158, 167-222.)
[3] H. Poincaré, Les méthodes nouvelles de la mécanique céleste. I, Solutions périodiques, Non-existence des intégrales uniformes, Solutions asymptotiques. II, Méthodes de MM. Newcomb, Gyldén, Lindstedt et Bohlin. III, Invariants intégraux, Solutions périodiques du deuxième genre, Solutions doublement asymptotiques. Gauthier-Villars, I, 1892; II, 1893; III, 1899; English translation, New methods of celestial mechanics I-III, Clearinghouse for Federal Scientific and Technical Information, Springfield, 1967.
[4] G. D. Birkhoff, Dynamical systems, Amer. Math. Soc. Colloq. Publ., 1927.
[5] A. A. Andronov and L. S. Pontryagin, Systèmes grossiers, Dokl. Akad. Nauk SSSR, 14 (1937), 247-250.
[6] S. Lefschetz, Differential equations: geometric theory, Interscience, 1957.
[7] S. Smale, Differentiable dynamical systems, Bull. Amer. Math. Soc., 73 (1967), 747-817.
[8] D. V. Anosov, Geodesic flows on closed Riemannian manifolds with negative curvature, Proc. Steklov Inst. Math., 90 (1967), 1-235.
[9] R. Bowen, Equilibrium states and the ergodic theory of Anosov diffeomorphisms, Lecture notes in math. 470, Springer, 1975.
[10] D. Ruelle and F. Takens, On the nature of turbulence, Comm. Math. Phys., 20 (1971), 167-192.
[11] J. Moser, Stable and random motions in dynamical systems, Ann. Math. Studies 77, Princeton Univ. Press, 1973.
[12] L. H. Loomis and S. Sternberg, Advanced calculus, Addison-Wesley, 1968.
503 127 A
Dynamic Programming

[13] V. V. Nemytskii and V. V. Stepanov, Qualitative theory of differential equations, Princeton Univ. Press, 1960. (Original in Russian, 1947.)
[14] N. P. Bhatia and G. P. Szego, Stability theory of dynamical systems, Springer, 1970.
[15] W. H. Gottschalk and G. A. Hedlund, Topological dynamics, Amer. Math. Soc. Colloq. Publ., 1955.
[16] R. Abraham and J. W. Robbin, Transversal mappings and flows, Benjamin, 1967.
[17] L. Markus, Lectures in differentiable dynamics, revised edition, Regional Conf. Series in Math. 3, Amer. Math. Soc., 1980.
[18] Z. Nitecki, Differentiable dynamics, MIT Press, 1971.
[19] M. Shub, Stabilité globale des systèmes dynamiques, Astérisque, Soc. Math. France, 56 (1978).
[20] M. C. Irwin, Smooth dynamical systems, Academic Press, 1980.
[21] S. Smale, Stable manifolds for differential equations and diffeomorphisms, Ann. Scuola Norm. Sup. Pisa, (3) 17 (1963), 97-116.
[22] I. Kupka, Contribution à la théorie des champs génériques, Contributions to Differential Equations, 2 (1963), 457-482.
[23] C. Pugh, An improved closing lemma and a general density theorem, Amer. J. Math., 89 (1967), 1010-1021.
[24] S. Smale, Morse inequalities for a dynamical system, Bull. Amer. Math. Soc., 66 (1960), 43-49.
[25] J. Palis and S. Smale, Structural stability theorems, Amer. Math. Soc. Proc. Symp. Pure Math., 14 (1970), 223-231.
[26] M. W. Hirsch and C. C. Pugh, Stable manifolds and hyperbolic sets, Amer. Math. Soc. Proc. Symp. Pure Math., 14 (1970), 133-163.
[27] S. Smale, The $\Omega$-stability theorem, Amer. Math. Soc. Proc. Symp. Pure Math., 14 (1970), 289-297.
[28] S. Smale, Notes on differentiable dynamical systems, Amer. Math. Soc. Proc. Symp. Pure Math., 14 (1970), 277-287.
[29] J. W. Robbin, A structural stability theorem, Ann. Math., (2) 94 (1971), 447-493.
[30] J. W. Robbin, Topological conjugacy and structural stability for discrete dynamical systems, Bull. Amer. Math. Soc., 78 (1972), 923-952.
[31] R. C. Robinson, Structural stability of $C^1$ flows, Lecture notes in math. 468, Springer, 1975, 262-277.
[32] M. Shub, Dynamical systems, filtrations and entropy, Bull. Amer. Math. Soc., 80 (1974),
[34] R. Bowen, On Axiom A diffeomorphisms, Regional Conf. Series in Math. 35, Amer. Math. Soc., 1978.
[35] R. Abraham and J. E. Marsden, Foundations of mechanics, second edition, Benjamin, 1978.
[36] V. I. Arnold and A. Avez, Problèmes ergodiques de la mécanique classique, Gauthier-Villars, 1967; English translation, Ergodic problems of classical mechanics, Benjamin, 1968.
[37] J. E. Marsden and M. McCracken, The Hopf bifurcation and its applications, Applied math. sci. 19, Springer, 1976.

127 (XIX.8)
Dynamic Programming

A. General Remarks

There are two types of multistage decision processes. In one of them, an outcome of the whole process is determined at the final stage without any consideration of the outcome for each intermediate stage. The extensive form of a game is of this type. In the other type, an outcome is assigned at each stage of a multistage decision process. The theory of dynamic programming, dealing with this latter type, has been developed by R. Bellman and others since 1950 and is now one of the fundamental branches of mathematical programming, along with the theories of linear and nonlinear programming. The following examples illustrate some features of multistage decision processes.

Multistage allocation process. We are given a quantity $x > 0$ that can be divided into two parts $y$ and $(x - y)$. From $y$ we obtain a return $g(y)$, and from $(x - y)$ a return $h(x - y)$. In so doing, we expend a certain amount of our original resources and are left with a new quantity, $ay + b(x - y)$, $0 < a, b < 1$, with which the process is continued. How do we proceed so as to maximize the total return obtained in a finite or unbounded number of stages?

Multistage choice process. Suppose that we possess two gold mines A and B, the first of which contains an amount $x$ of gold, while the second contains an amount $y$. In addition, we have a single gold mining machine with the property that if used to mine gold in the mine A, there is a probability $P_1$ that it will mine
27741. a fraction rr of the gold there and remain in
[33] M. W. Hirsch, C. C. Pugh, and M. Shub, working order, and a probability 1 -Pi that it
Invariant manifolds, Lecture notes in math. Will mine no gold and be damaged beyond
583, Springer, 1977. repair. Similarly, the mine B has associated
with it the corresponding probabilities $p_2$ and $1 - p_2$ and fraction $r_2$. How do we proceed in order to maximize the total amount of gold before the machine is defunct?

These two processes have the following features in common: (1) In each case we have a physical system characterized in any state by a small set of parameters, the state variables. (2) In each state of either process we have a choice of a number of decisions. (3) The effect of a decision is a transformation of the state variables. (4) The past history of the system is of no importance in determining future actions (†Markov property). (5) The purpose of the process is to maximize some function of the state variables.

A policy is a rule for making decisions that yields an allowable sequence of decisions; an optimal policy is a policy that maximizes a preassigned function of the final state variables. A convenient term for this preassigned function of the final state variables is criterion function. One of the characteristic features of Bellman's methodology of dynamic programming is the appeal to the principle of optimality: An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision.

In the multistage allocation process the state variables are $x$ (the quantity of resources) and $z$ (the return obtained up to the current stage). The decision at any stage consists of an allocation of a quantity $0 \le y \le x$. This decision has the effect of transforming $x$ into $ay + b(x - y)$ and $z$ into $z + g(y) + h(x - y)$. The purpose of the process is to maximize the final value of $z$. Denote by $f_N(x)$ the $N$-stage return obtained starting from an initial state $x$ and using an optimal policy. Then we have

$$ f_1(x) = \max_{0 \le y \le x} \bigl[ g(y) + h(x - y) \bigr], $$
$$ f_n(x) = \max_{0 \le y \le x} \bigl[ g(y) + h(x - y) + f_{n-1}(ay + b(x - y)) \bigr], \quad n \ge 2. $$

This recurrence relation yields a method for obtaining the sequence $\{f_n(x)\}$ inductively.

In the stochastic gold-mining process, the state variables are $x$ and $y$ (the present level of the two mines) and $z$ (the amount of gold mined to date). The decision at any stage consists of a choice of A and B. If A is chosen, $(x, y)$ goes into $((1 - r_1)x, y)$ and $z$ into $z + r_1 x$, and if B is chosen, $(x, y)$ goes into $(x, (1 - r_2)y)$ and $z$ into $z + r_2 y$. The purpose of the process is to maximize the expected value of $z$ obtained before the machine becomes defunct. Denote by $f(x, y)$ the expected amount of gold obtained using an optimal sequence of choice. Then we have

$$ f(x, y) = \max \bigl\{ p_1 \bigl[ r_1 x + f((1 - r_1)x,\, y) \bigr],\; p_2 \bigl[ r_2 y + f(x,\, (1 - r_2)y) \bigr] \bigr\}. $$

The optimal policy can be described in the following way. We choose A or B according as $p_1 r_1 x/(1 - p_1)$ is greater or less than $p_2 r_2 y/(1 - p_2)$. We can choose either A or B if equality holds. After an operation according to such a choice, the machine may become defunct and terminate the process. If the machine is usable, then we can apply our policy to a new combination of the amounts of gold in A and B.

B. Discrete Deterministic Processes

By a deterministic process we mean a process in which the outcome of a decision is uniquely determined by the decision. We assume that the state of the system, apart from time dependence, is described in any stage by an $M$-dimensional vector $p$ constrained to lie within some region $D$. Let $T = \{T_q\}$ (where $q$ runs over a set $S$) be a set of transformations with the property that $p \in D$ implies that $T_q(p) \in D$ for all $q \in S$, i.e., any transformation $T_q$ carries $D$ into itself. The term "discrete" signifies here that we have a process consisting of a finite or denumerably infinite number of stages. A policy, for the finite process which we consider first, consists of a selection of $N$ transformations in order, $P = (T_1, T_2, \ldots, T_N)$, yielding successively the sequence of states $p_i = T_i(p_{i-1})$ $(i = 2, 3, \ldots, N)$ with $p_1 = T_1(p)$. These transformations are to be chosen to maximize a given function $R$ of the final state $p_N$. Observe that the maximum value of $R(p_N)$, as determined by an optimal policy, will be a function of the initial vector $p$ and the number $N$ of stages only. Let us then define our basic auxiliary functions $f_N(p) = \max_P R(p_N)$ = the $N$-stage return obtained starting from an initial state $p$ and using an optimal policy. This sequence is defined for $N = 1, 2, \ldots$, and $p \in D$. The essential uses of the principle of optimality can be observed from the following two features. The first is the use of the embedding principle. The original process is embedded in a family of similar processes. In place of attempting to determine the characteristics of an optimal policy for an isolated process, we attempt to deduce the common properties of the set of optimal policies possessed by the members of the family. The second feature is the derivation of recurrence relations by which the functional equations connecting the members of the sequence $\{f_k(p)\}$ are established. Assume that we choose some transformation $T_q$ as a result of our first decision, obtaining in this way a new state vector $T_q(p)$. The maximum return
from the following $k - 1$ stages is, by definition, $f_{k-1}(T_q(p))$. It follows that, if we wish to maximize the total $k$-stage return, $q$ must now be chosen to maximize this $(k - 1)$-stage return. The result is the basic recurrence relation $f_k(p) = \max_{q \in S} f_{k-1}(T_q(p))$, for $k \ge 2$, with $f_1(p) = \max_{q \in S} R(T_q(p))$. For the case of an unbounded process, the sequence $\{f_k(p)\}$ is replaced by a single function $f(p)$, the total return obtained by using an optimal policy starting from state $p$, and the recurrence relation is replaced by the functional equation $f(p) = \max_q f(T_q(p))$.

C. Discrete Stochastic Processes

We again consider a discrete process, but one in which the transformations are stochastic rather than deterministic. The initial vector $p$ is transformed into a stochastic vector $z$ with an associated distribution function $dG_q(p, z)$ dependent on $p$ and the choice of $q$. We assume that $z$ is known after the decision has been made and before the next decision is to be made. We agree to measure the value of a policy in terms of some average value of the function of the final state. Let us call this expected value the return. Beginning with the case of a finite process, we define $f_k(p)$ as before. The expected return as a result of the initial choice of $T_q$ is therefore

$$ \int_{z \in D} f_{k-1}(z)\, dG_q(p, z). $$

Consequently, the recurrence relation for the sequence $\{f_k(p)\}$ is

$$ f_k(p) = \max_{q \in S} \int_{z \in D} f_{k-1}(z)\, dG_q(p, z), \quad k \ge 2, $$

with $f_1(p) = \max_{q \in S} \int_{z \in D} R(z)\, dG_q(p, z)$. Considering the unbounded process, we obtain the functional relation

$$ f(p) = \max_{q \in S} \int_{z \in D} f(z)\, dG_q(p, z). $$

D. Continuous Deterministic Processes

A number of interesting processes require that decisions be made at each point of a continuum, such as a time interval. The simplest examples of processes of this character are provided by the †calculus of variations. Let us denote by $f(p; T)$ the return obtained over a time interval $[0, T]$ starting from the initial state $p$ and employing an optimal policy. Although we consider the process as one consisting of choices made at each point $t$ on $[0, T]$, it is better to begin with the concept of choosing policies (functions) over intervals, and then pass to the limit as these intervals shrink to points. Let $d$ be an allowable decision made over the interval $[0, S]$, and let $p_d$ be the state at $S$ starting from the initial state $p$ and employing $d$. The application of the principle of optimality suggests that

$$ f(p; S + T) = \sup_{d \in D} f(p_d; T), \tag{1} $$

where the supremum is taken over the set $D$ of all allowable decisions $d$.

The limiting form of (1) as $S \to 0$ is a nonlinear partial differential equation (Bellman partial differential equation). This expression is important for use in actual analysis. For numerical purposes, $S$ is kept nonzero but small. R. Bellman showed that it is possible to avoid many of the quite difficult rigorous details involved in this limiting procedure if we are interested only in the computational solution of variational processes.

E. Markovian Decision Processes

Consider a physical system which at any of the times $t = 0, \Delta, 2\Delta, \ldots$ must lie in one of the states $S_1, S_2, \ldots, S_N$. Let $y_i(n)$ be the probability that the system is in $S_i$ at time $n\Delta$, and let $p_{ij}$ be the probability that the system is in state $S_j$ at $t + \Delta$ if it is in state $S_i$ at time $t$. We suppose that the transition probabilities $p_{ij}$ are independent of $t$. We assume that the $p_{ij}$ depend on a parameter $q$, which may be a vector, and that at each stage of the process $q$ is to be chosen so as to maximize the probability that the system is in the state $S_1$. We obtain the nonlinear system

$$ y_1(n+1) = \max_q \sum_{j=1}^{N} p_{j1}(q)\, y_j(n) = \sum_{j=1}^{N} p_{j1}(q^*)\, y_j(n), $$
$$ y_i(n+1) = \sum_{j=1}^{N} p_{ji}(q^*)\, y_j(n), \quad i = 2, 3, \ldots, N, $$

where $q^* = q^*(n)$ in the remaining $N - 1$ equations is one of the values of $q$ that maximize $y_1(n + 1)$. There are similar processes that can be considered as continuous analogs of this type of decision process. These are called Markovian decision processes and were discussed by Bellman. There is, however, another type of Markovian decision process in which a reward is given at each stage. For each state $S_i$ of the system there are $k$ alternatives $1, 2, 3, \ldots, k$. If we choose the alternative $h$ among these $k$ alternatives, then the transition probabilities $p_{ij}^h$ $(j = 1, 2, \ldots, N)$ are determined, and a reward $r_{ij}^h$ is associated with each state $S_j$. Let us denote by $u_i(n)$ the total expected return obtained at the $n$th stage by appealing to an optimal policy when the initial state is $S_i$. Then the principle of optimality in the theory
of dynamic programming yields

$$ u_i(n+1) = \max_h \Bigl( \sum_{j=1}^{N} p_{ij}^h \bigl( r_{ij}^h + u_j(n) \bigr) \Bigr). $$

A policy-iteration method involving a value-determination operation with a policy-improvement routine was given by R. A. Howard [4].

F. Dynamic Programming and the Calculus of Variations

Problems in the calculus of variations can be viewed as multistage decision problems of a continuous type. It can be shown that the dynamic-programming approach yields formal derivations of classical necessary conditions for the calculus of variations. Let us consider the problem of minimizing the functional

$$ J(y) = \int_a^b f(x, y, y')\, dx, $$

where the function $y$ is subject to $y(a) = c$. We embed this problem within the family of problems generated by allowing $a$ and $c$ to be parameters with the ranges of variation $-\infty < a < b$, $-\infty < c < \infty$. Now we define the optimal value function $S(a, c) = \min_y J(y)$. Then the principle of optimality yields the functional equation

$$ S(a, c) = \min \Bigl( \int_a^{a+\Delta} f(x, y, y')\, dx + S(a + \Delta, c(y)) \Bigr), $$

where the minimization is taken over all functions defined over $[a, a + \Delta]$ with $y(a) = c$ and $c(y) = y(a + \Delta)$. Then, writing $u = y'(a)$, we get

$$ -\frac{\partial S}{\partial a} = \min_u \Bigl( f(a, c, u) + u \frac{\partial S}{\partial c} \Bigr). $$

This yields the Euler equation, the Legendre condition, the Weierstrass condition, and the Erdmann corner conditions. Furthermore, it can be shown that the functional equation characterization yields the Hamilton-Jacobi partial differential equation of classical mechanics.

The dynamic-programming approach can be applied to more general problems in the calculus of variations.

G. Dynamic Programming and the Maximum Principle

In general, the method of dynamic programming carries a more universal character than the maximum principle of optimal control theory. However, in contrast to the latter, this method does not have a rigorous logical foundation. V. G. Boltyanskii [6] has presented a justification of the dynamic programming method.

Let $f_i(x, u)$ $(i = 0, 1, \ldots, n)$ be defined for $x \in V \subset \mathbf{R}^n$ and $u \in U \subset \mathbf{R}^r$, where $V$ is an open set, and continuously differentiable on $V \times U$. Suppose that two points $x^0$ and $x^1$ are given in $V$. Among all the piecewise continuous controls $u(t) = (u_1(t), \ldots, u_r(t)) \in U$ which transfer the phase point moving in accordance with

$$ \frac{dx_i}{dt} = f_i(x, u(t)), \quad i = 1, \ldots, n, $$

from $x^0 = x(t_0)$ to $x^1 = x(t_1)$, find the control $u(t)$ for which the functional

$$ J = \int_{t_0}^{t_1} f_0(x(t), u(t))\, dt $$

takes the smallest value.

A continuous function $w(x) = w(x_1, \ldots, x_n)$ is called a Bellman function relative to a point $a \in V$ if it possesses the following properties: (1) $w(a) = 0$; (2) there exists a set $M$ (the singular set of $w(x)$), which is closed in $V$ and does not contain interior points, such that the function $w(x)$ is continuously differentiable on the set $V - M$ and satisfies the condition

$$ \sup_{u \in U} \Bigl( \sum_{i=1}^{n} \frac{\partial w(x)}{\partial x_i} f_i(x, u) - f_0(x, u) \Bigr) = 0, \quad x \in V - M. $$

The following theorem gives a sufficient optimality condition.

Theorem: Assume that for $dx/dt = f(x, u(t))$ given in a region $V \subset \mathbf{R}^n$ there exists a Bellman function $w(x)$ relative to the point $a \in V$ with a piecewise smooth singular set. Assume, furthermore, that for any point $x^0 \in V$ there exists a control $u(t)$ which transfers the phase point from $x^0 = x(t_0)$ to $a = x(t_1)$ and satisfies the relation

$$ \int_{t_0}^{t_1} f_0(x(t), u(t))\, dt = -w(x^0). $$

Then any such control $u(t)$ is optimal in $V$.

Recently, Vinter and Lewis [7] obtained the following general result in this connection. The sufficient condition is given in terms of a solution to the Bellman partial differential equation. It is shown that if this equation is modified so that it is actually an inequality, and if this inequality is required to be satisfied in a limiting sense only, then the condition is also necessary for optimality.

H. Characteristic Features of Dynamic Programming

The characteristic features of the dynamic-programming approach can be summarized in
the following five points: (1) the advantage of lower dimensionality in comparison with the enumeration approach; (2) the possibility of finding maxima and/or minima of functions defined over restricted domains for which differential calculus may not work well; (3) the availability of numerical solutions in recursive form; (4) the possibility of formulating certain problems to which classical methods do not apply; and (5) the applicability of the method to most types of problems in mathematical programming, such as †inventory and production control, optimal searching, and some optimal and adaptive control processes.
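
As an illustration of point (3), the two recurrences stated earlier in this article can be evaluated numerically. The following Python sketch is not taken from the references; it is one possible discretization under stated assumptions. The function names `allocation_returns` and `reward_process_returns`, the uniform grid on $[0, x_{\max}]$, the piecewise-linear interpolation, and the starting values $u_i(0) = 0$ are all choices made here for illustration. The first function tabulates the allocation returns $f_1, \ldots, f_N$ of Section A; the second iterates the reward recurrence for $u_i(n)$ of Section E.

```python
from typing import Callable, List, Sequence, Tuple


def allocation_returns(
    g: Callable[[float], float],
    h: Callable[[float], float],
    a: float,
    b: float,
    x_max: float,
    stages: int,
    grid: int = 100,
) -> Tuple[List[float], List[List[float]]]:
    """Tabulate f_1, ..., f_stages on a uniform grid over [0, x_max].

    Assumes 0 < a, b < 1, so the remaining resources never leave the grid.
    """
    xs = [x_max * i / grid for i in range(grid + 1)]

    def interp(values: Sequence[float], x: float) -> float:
        # Piecewise-linear interpolation of a tabulated function on [0, x_max].
        t = x / x_max * grid
        i = min(int(t), grid - 1)
        w = t - i
        return (1.0 - w) * values[i] + w * values[i + 1]

    tables: List[List[float]] = []
    for n in range(stages):
        current = []
        for k, x in enumerate(xs):
            best = float("-inf")
            # Candidate allocations y are the grid points with 0 <= y <= x
            # (an approximation: the true maximum is over all real y).
            for y in xs[: k + 1]:
                value = g(y) + h(x - y)
                if n > 0:
                    # Add f_{n-1} evaluated at the new quantity a*y + b*(x - y).
                    value += interp(tables[-1], a * y + b * (x - y))
                best = max(best, value)
            current.append(best)
        tables.append(current)
    return xs, tables


def reward_process_returns(
    p: Sequence[Sequence[Sequence[float]]],
    r: Sequence[Sequence[Sequence[float]]],
    stages: int,
) -> List[float]:
    """Iterate u_i(n+1) = max_h sum_j p[i][h][j] * (r[i][h][j] + u_j(n)).

    p[i][h][j] and r[i][h][j] are the transition probability and reward for
    alternative h in state i; u(0) = 0 is an assumption of this sketch.
    """
    u = [0.0] * len(p)
    for _ in range(stages):
        u = [
            max(
                sum(pij * (rij + uj) for pij, rij, uj in zip(p[i][h], r[i][h], u))
                for h in range(len(p[i]))
            )
            for i in range(len(p))
        ]
    return u
```

For example, `allocation_returns(lambda y: y, lambda y: 2 * y, 0.5, 0.25, 1.0, 2)` tabulates $f_1$ and $f_2$ for linear returns, where the recurrence can also be solved by hand. The cost grows with the square of the grid size per stage, which is acceptable for a one-dimensional state but reflects why the dimensionality of the state vector matters in practice.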

References

[1] R. E. Bellman, Dynamic programming, Princeton Univ. Press, 1957.
[2] R. E. Bellman, Applied dynamic programming, Princeton Univ. Press, 1962.
[3] R. E. Bellman, Adaptive control processes: a guided tour, Princeton Univ. Press, 1961.
[4] R. A. Howard, Dynamic programming and Markov processes, Technology Press and Wiley, 1960.
[5] S. E. Dreyfus, Dynamic programming and calculus of variations, Academic Press, 1965.
[6] V. G. Boltyanskii, Sufficient conditions for optimality and the justification of the dynamic programming method, SIAM J. Control, 4 (1966), 326-361.
[7] R. B. Vinter and R. M. Lewis, A necessary and sufficient condition for optimality of dynamic programming type, making no a priori assumption on the control, SIAM J. Control and Optimization, 16 (1978), 571-583.
128 A 510
Econometrics

128 (XVIII.15)
Econometrics

A. General Remarks

The term econometrics can be interpreted in various ways. In its widest sense, it means the application of mathematical methods to economic problems and includes mathematical economics, †mathematical programming, etc. However, here we use it to mean statistical methods applied to economic analysis.

The object of econometrics is to provide methods to analyze relationships between economic variables. We classify these methods into four categories according to the types of relationships involved: (1) Analysis of causal relations: If a set of variables $X_1, \ldots, X_n$ affects an economic variable $Y$, we can estimate the direction and extent of those effects on $Y$. (2) Analysis of equilibrium: When a set of economic variables $Y_1, \ldots, Y_m$ is determined through a market equilibrium mechanism, we can analyze the structure of relationships that determines the equilibrium. (3) Analysis of correlation: When a set of economic variables is affected simultaneously by some (unknown) common factors, we can analyze the correlation structure of the variables. (4) Analysis of time interdependence: A process of development in time of a set of economic variables can be analyzed.

There are two types of economic data: (a) macroeconomic data, representing quantities and variables related to a national economy as a whole, usually based on national census; and (b) microeconomic data, representing information about the economic behavior of individual persons, households, and firms. Macroeconomic data are usually given as a set of time series (→ 421 Time Series Analysis A), while microeconomic data are obtained mainly through statistical surveys and are given as cross-sectional data. These two types of data, related to macroeconomic theory and microeconomic theory, respectively, require different approaches; and sometimes information obtained from both types of data has to be combined; obtaining macroeconomic information from microeconomic data is called aggregation.

B. Regression Analysis

The most common technique for the first category of problems is †regression analysis, which is applied to both microeconomic and macroeconomic analysis. However, there are problems peculiar to economic analysis, where a factor can seldom be controlled; and usually there are too many highly related independent variables. In such cases, if all possible independent variables are taken into the model, the accuracy of the †estimators of the coefficients becomes extremely poor. Such a phenomenon, called multicollinearity, brings up the problem of selection of independent variables (→ 403 Statistical Models), to which no satisfactory solution has been given. Also, assumptions about the error terms may be dubious, the error terms may be correlated, or the variances may be different. If the †variance-covariance matrix of the errors is given, the †generalized least squares method can be applied, but usually such a matrix is not available.

C. Systems of Simultaneous Equations

The second category of problems is peculiar to economic analysis and applies mainly to macroeconomic data. Suppose that $Y = (Y_1, \ldots, Y_G)'$ is a vector consisting of $G$ economic variables, among which there exist $G$ relationships that determine the equilibrium levels of the variables. We also suppose that there exist $K$ variables $Z = (Z_1, \ldots, Z_K)'$ that are independent of the economic relations but affect the equilibrium. The variables $Y$ are called endogenous variables, and the $Z$ are called exogenous variables. If we assume linear relationships among them, we have an expression such as

$$ Y = BY + \Gamma Z + u, \tag{1} $$

where $B$ and $\Gamma$ are matrices with constant coefficients and $u$ is a vector of disturbances or errors. (1) is called the linear structural equation system and is a system of simultaneous equations. By solving the equations formally, we get the so-called reduced form

$$ Y = \Pi Z + v, \tag{2} $$

where $\Pi = (I - B)^{-1}\Gamma$, $v = (I - B)^{-1}u$. The relation of $Y$ to $Z$ is determined through the reduced form (2), and if we have enough data on $Y$ and $Z$ we can estimate $\Pi$. The problem of identification is to decide whether we can determine the unknown parameters in $B$ and $\Gamma$ uniquely from the parameters in the reduced form. A necessary condition for the parameters in one of the equations in (1) to be identifiable is that the number of unknown parameters (or, since known constants in the system are usually set equal to zero, the number of variables appearing in the equation) not be greater than $K + 1$. If it is exactly equal to $K + 1$, the equation is said to be just identified,
and if it is less than $K + 1$, the equation is said to be overidentified.

If all the equations in the system are just identified, for arbitrary $\Pi$ there exist unique $B$ and $\Gamma$ that satisfy $\Pi = (I - B)^{-1}\Gamma$. Therefore, if we denote the †least squares estimator of $\Pi$ by $\hat{\Pi}$, we can estimate $B$ and $\Gamma$ from the equation $(I - \hat{B})\hat{\Pi} = \hat{\Gamma}$. This procedure is called the indirect least squares method and is equivalent to the †maximum likelihood method if we assume normality for $u$.

When some of the equations are overidentified, the estimation problem becomes complicated. Three kinds of procedures have been proposed: (1) full system methods, (2) single equation methods, and (3) subsystem methods. In full system methods all the parameters are considered simultaneously, and if normality is assumed, the maximum likelihood estimator can be obtained by minimizing $|(Y - \Pi Z)(Y - \Pi Z)'|$. Since it is usually difficult to compute the maximum likelihood estimator, a simpler, but asymptotically equivalent, three-stage least squares method has been proposed. The single equation methods and the subsystem methods take into consideration only the information about the parameters in one equation or in a subset of the equations, and estimate the parameters in each equation separately. There is a single equation method, called the limited information maximum likelihood method, based on the maximum likelihood approach, and also a two-stage least squares method, which estimates $\Pi$ first by least squares, computes $\hat{Y} = \hat{\Pi}Z$, and then applies the least squares method to the model

$$ Y = B\hat{Y} + \Gamma Z + \tilde{u}. $$

These two and also some others are asymptotically equivalent. Among asymptotically equivalent classes of estimators corresponding to different information structures it has been established that the maximum likelihood estimators have asymptotically higher-order efficiency [5] (→ 399 Statistical Estimation) than other estimators, and Monte Carlo and numerical studies show that they are in most cases better than others if properly adjusted for the biases.

In many simultaneous equation models which have been applied to actual macroeconomic data, the values of endogenous variables obtained in the past appear on the right-hand sides of equations (1). Such variables are called lagged variables, and they can be treated, at least in the asymptotic theory of inference, as though they were exogenous. Hence exogenous variables and lagged endogenous variables are jointly called predetermined variables. When many lagged variables appear over many time periods and when some structure among the coefficients of those lagged variables can be assumed, such a model is called a distributed lag model.

Sometimes it is necessary to include some nonlinear equations in the simultaneous equation model. Such nonlinear simultaneous equation models are difficult to deal with, partly because the solution of the equation may not be unique, and in practical applications ad hoc procedures are applied to obtain estimates of the parameters.

D. Multivariate and Time Series Analysis

Problems in the third category can be approached by †multivariate analysis techniques. Sometimes †principal component analysis and †canonical correlation analysis have been applied to analyze the variations of a large amount of data. However, the practical meaning of the results obtained is often dubious.

The fourth category is the problem of time series analysis. Sophisticated theories of stochastic processes have little relevance for economic time series, because usually the time series do not satisfy such conditions as being stationary or having the †Markov property, etc. Recently, however, autoregressive moving average (†ARMA) and multivariate ARMA models [4] have been applied to macroeconomic data, especially for the purpose of prediction and for determining the direction of causal relations (→ 421 Time Series Analysis).

Traditionally, fluctuations of economic time series have been thought to consist of trend, cyclic variation, seasonal variation, and error. Various ad hoc techniques have been used to separate or eliminate such components, but the theoretical treatment of such problems is far from satisfactory.

References

[1] W. C. Hood and T. C. Koopmans (eds.), Studies in econometric method, Cowles Commission Monograph 14, Wiley, 1953.
[2] H. Theil, Economic forecasts and policy, North-Holland, 1958.
[3] J. Johnston, Econometric methods, McGraw-Hill, 1963.
[4] G. E. P. Box and G. M. Jenkins, Time series analysis: Forecasting and control, Holden-Day, revised edition, 1976.
[5] M. Akahira and K. Takeuchi, Asymptotic efficiency of statistical estimators: Concepts and higher order efficiency, Lecture notes in statistics 7, Springer, 1981.
129 Ref. 512
Einstein, Albert

129 (XXI.21)
Einstein, Albert

Albert Einstein (March 14, 1879-April 18, 1955) was born of Jewish parents in the city of Ulm in southern Germany. He became a Swiss citizen soon after graduating from the Eidgenössische Technische Hochschule of Zürich in 1900. Afterward, he obtained a position as examiner of patents at Bern, and while at that post, he published his theories on light quanta, †special relativity, and †Brownian motion. After briefly holding professorships at the University of Zürich and the University of Prague, he became a professor at the University of Berlin in 1913. His general theory of relativity was announced in 1916, and in 1921 he won the Nobel Prize in physics for his contributions to theoretical physics. To escape Nazi persecution, he fled to the United States in 1933, and until his retirement in 1945 he was a professor at the Institute for Advanced Study at Princeton. He advised President Roosevelt of the feasibility of constructing the atomic bomb, but after World War II, along with others who had been connected with the bomb, he was active in promoting the nuclear disarmament movement and the establishment of a world government.

The theory of relativity raises fundamental epistemological problems concerning time, space, and matter. The results of the general theory were verified in 1919 by observations of the solar eclipse.

Through his latter years, Einstein continued to work on †unified field theory and on the generalization of relativity theory.

References

[1] P. Frank, Einstein, his life and times, Knopf, 1947.
[2] A. Einstein, The meaning of relativity, Princeton Univ. Press, fifth edition, 1956.
[3] C. Seelig, Albert Einstein, eine dokumentarische Biographie, Europa Verlag, 1954.

130 (XX.16)
Electromagnetism

A. Maxwell's Equations

Mathematical formulation of electromagnetics leads to †initial value and †boundary value problems for Maxwell's equations according to the geometric nature of the medium. Maxwell's equations for a vacuum are written in the form

$$ \varepsilon_0\, \partial E/\partial t = \operatorname{rot} H - J_e, \qquad \varepsilon_0 \operatorname{div} E = \rho_e, $$
$$ \mu_0\, \partial H/\partial t = -\operatorname{rot} E - J_m, \qquad \mu_0 \operatorname{div} H = \rho_m, \tag{1} $$

where $E$ and $H$ are the electric and magnetic field vectors, $\rho_e$ and $\rho_m$ the electric and magnetic charge densities, $J_e$ and $J_m$ the electric and magnetic current densities, $\varepsilon_0$ and $\mu_0$ constants, and the quantity $1/\sqrt{\varepsilon_0 \mu_0} = c$ the speed of light in vacuum ($2.99797 \times 10^8$ m/s). Charge and current densities must satisfy the equations of continuity

$$ \partial \rho_e/\partial t + \operatorname{div} J_e = 0, \qquad \partial \rho_m/\partial t + \operatorname{div} J_m = 0. \tag{2} $$

Following upon the observation that, apparently, $\rho_m = 0$ and $J_m = 0$ in nature, we henceforth set them equal to zero. This causes an asymmetry between the electric and magnetic quantities. On the other hand, the proposition "$\rho_m = 0$ and $J_m = 0$" cannot be deduced from the classical theory itself.

In the presence of matter, additional charge and current appear due to the electric and magnetic polarization $P$ and $M$ of the material. Therefore, in this case, it is necessary to make the following substitutions in (1):

$$ \rho_e \to \rho = \rho_e - \operatorname{div} P, $$
$$ J_e \to J = J_e + \partial P/\partial t + \operatorname{rot} M, $$
$$ H \to H + M. \tag{3} $$

Moreover, if we define the electric flux density (or electric displacement) $D$ and the magnetic flux density (or magnetic induction) $B$ by

$$ D = \varepsilon_0 E + P, \qquad B = \mu_0 (H + M), \tag{4} $$

then Maxwell's equations (1) are transformed into

$$ \partial D/\partial t = \operatorname{rot} H - J_e, \qquad \operatorname{div} D = \rho_e, $$
$$ \partial B/\partial t = -\operatorname{rot} E, \qquad \operatorname{div} B = 0. \tag{5} $$

In the electromagnetic field in a vacuum there is energy with a density

$$ u = (\varepsilon_0/2) E^2 + (\mu_0/2) H^2, \tag{6} $$

and energy flux with a density expressed by the Poynting vector

$$ S = E \times H. \tag{7} $$

Between these quantities the following relation holds:

$$ \partial u/\partial t + \operatorname{div} S = 0. \tag{8} $$

An electric charge $q$ moving with velocity $v$ in an electromagnetic field is subject to the Lorentz force

$$ F = qE + qv \times B. \tag{9} $$
This force can be interpreted as being caused by the Maxwell stress tensor

$T_{ik} = (\varepsilon_0/2)(-2E_iE_k + \delta_{ik}E^2) + (\mu_0/2)(-2H_iH_k + \delta_{ik}H^2).$  (10)

By introducing the scalar potential $V$ and the vector potential $A$, we can express the field vectors as

$B = \operatorname{rot} A, \quad E = -\operatorname{grad} V - \partial A/\partial t.$  (11)

Furthermore, if we impose an auxiliary condition (Lorentz condition)

$\operatorname{div} A + \varepsilon_0\mu_0\,\partial V/\partial t = 0,$  (12)

then we obtain from (1) the wave equations

$\Box V = -\rho_e/\varepsilon_0, \quad \Box A = -\mu_0 J_e,$  (13)

where $\Box = \Delta - \varepsilon_0\mu_0\,\partial^2/\partial t^2$ is called the d'Alembertian and is sometimes written as $\Box^2$. From (13) we conclude that the electromagnetic field can propagate in a vacuum as a wave with speed $c = 1/\sqrt{\varepsilon_0\mu_0}$. In †quantum theory the potentials $V$ and $A$ are regarded as being more fundamental than $E$ and $H$ themselves. However, they are not uniquely determined, in the sense that the gauge transformation

$V \to V + \partial\psi/\partial t, \quad A \to A - \operatorname{grad}\psi$  (14)

with an arbitrary function $\psi$ of the space and time variables does not affect the fields.
Maxwell's equations are invariant under the †Lorentz transformation. Therefore they can be written in 4-dimensional tensor form (→ 359 Relativity C).
Maxwell's equations can be regarded as the †wave equations for †bosons with spin 1 (†photons). The equations of quantum electrodynamics are obtained if we regard the field quantities as quantum-mechanical variables (q-numbers) and then perform †second quantization.

B. Concrete Problems

In solving Maxwell's equations concretely, we usually make additional assumptions for polarizations, electric current, and field vectors

$P = \chi_e E, \quad M = \chi_m H, \quad J_e = \sigma E,$  (15)

where $\chi_e$, $\chi_m$, and $\sigma$ are called the electric susceptibility, magnetic susceptibility, and conductivity, respectively. Then we have

$D = \varepsilon E, \quad B = \mu H,$  (16)

where $\varepsilon = \varepsilon_0 + \chi_e$ is the dielectric constant and $\mu = \mu_0(1 + \chi_m)$ is the magnetic permeability. Therefore the equations (5) become identical to (1) if $\varepsilon_0$ and $\mu_0$ are replaced by $\varepsilon$ and $\mu$, respectively. ($\rho_e$ and $J_e$ are set equal to zero.)

Some cases of practical importance are given below.
(1) Electrostatics. If the fields are time-independent and there is no electric current, then $E$ and $H$ are mutually independent. The static electric field is calculated from the solution of the boundary value problem of the †Poisson equation $\Delta V = -\rho_e/\varepsilon$. Specifically, $V$ takes a constant value in each conductor.
(2) Magnetostatics. For zero electric current, the problem of magnetostatics is solved in the same way as in electrostatics. For the case of nonvanishing stationary electric current, the problem is reduced to that of solving

$\Delta A = -\mu J_e, \quad \operatorname{div} A = 0.$  (17)

(3) Electric current in a conductor. A stationary electric current in a conductor is governed by the equation of continuity $\operatorname{div} J = 0$, Ohm's law $J = \sigma E$, and a special case of Maxwell's equation $\operatorname{rot} E = 0$. The electric current produces heat (Joule heat) proportional to $J \cdot E$ per unit volume per unit time (Joule's law). For certain substances, the specific resistance $\sigma^{-1}$ suddenly becomes negligibly small below a critical temperature. This is called superconductivity.
(4) Quasistationary electric circuit. The problem appearing most often in electrical engineering is that of a quasistationary circuit. Its characteristic feature is that the electric currents exist only in the circuit elements (inductors, capacitors, and resistors) and in the lines connecting them. The current (both $J_e$ and $\partial D/\partial t$) can be neglected in all other parts of the system. (This could be compared to the situation in dynamics where we consider systems of material points or of rigid bodies having a finite number of degrees of freedom, although every material body is essentially a continuum.) The system network is constructed as a †linear graph with the circuit elements as its branches. Topological †network theory deals with the relation between the structure of the linear graph and the electrical characteristics of the network, whereas function-theoretic network theory deals with the relation between current and voltage at each part of the network. In the latter theory, current and voltage are considered as functions of the frequency of the sinusoidal alternating voltage applied to some point of the network. Together these constitute a unique theoretical system in engineering mathematics for designing system networks (→ 282 Networks).
(5) Theory of electromagnetic waves. The theory of electromagnetic waves deals with the case where the changes of all field quantities are proportional to $e^{i\omega t}$, and in addition the frequency $\omega$ is so large that the term $\partial D/\partial t$ in
(5) is of the same order as or larger than $J_e$. In such a situation the electromagnetic field behaves like a wave.
Problems of various types arise depending on the geometry of the conducting and dielectric substances, on the type of the energy source, etc. Important problems are: (i) radiation of a wave from a point source into free space; (ii) scattering of a plane wave by small bodies or cylinders; (iii) diffraction of a wave through holes in a conducting plate; (iv) reflection and refraction of a wave at the boundary between different media; (v) wave propagation along a conducting tube (wave guide); and (vi) resonance of the electromagnetic field in a cavity surrounded by a conducting substance. Theoretical treatment similar to that for ordinary networks is possible for microwave circuits consisting of wave guides, cavities, etc.
(6) Wave guides. For electromagnetic waves with a harmonic time dependence $e^{-i\omega t}$ propagating inside a hollow tube of uniform cross section extending in the z-direction with perfectly conducting walls (called a wave guide), the Maxwell equations become

$\operatorname{rot} E = i\omega B, \quad \operatorname{div} B = 0,$
$\operatorname{rot} B = -i\varepsilon\mu\omega E, \quad \operatorname{div} E = 0,$  (18)

with the boundary condition

$n \times E = 0, \quad n \cdot B = 0,$

where $n$ is a unit normal at the boundary surface. A further reduction is gained by Fourier analysis in the z-variable. For harmonic z-dependence $e^{ikz}$, the transverse components $E_t = (e_z \times E) \times e_z$ and $B_t = (e_z \times B) \times e_z$ are determined from the z-components $E_z = e_z \cdot E$ and $B_z = e_z \cdot B$ by the following part of the Maxwell equations:

$ikE_t + i\omega\, e_z \times B_t = \operatorname{grad}_t E_z,$
$ikB_t - i\varepsilon\mu\omega\, e_z \times E_t = \operatorname{grad}_t B_z$  (19)

(where $\operatorname{grad}_t$ is the transverse component of the gradient), up to the solutions for $E_z = B_z = 0$, called transverse electromagnetic (TEM) waves, for which $k = \omega\sqrt{\varepsilon\mu}$, $B_t = \pm\sqrt{\varepsilon\mu}\, e_z \times E_t$, and $E_t$ is a solution of the electrostatic problem in two dimensions $\operatorname{rot} E_t = 0$, $\operatorname{div} E_t = 0$. To have $E_t \neq 0$ for the TEM solution, it is necessary to have two or more surfaces, such as a coaxial cable (region between two concentric cylinders) or a parallel-wire transmission line. Nonzero longitudinal components $E_z$ and $B_z$ are determined from the 2-dimensional equations

$(\Delta_t + \varepsilon\mu\omega^2 - k^2)E_z = 0, \quad (\Delta_t + \varepsilon\mu\omega^2 - k^2)B_z = 0$

(where $\Delta_t$ is the transverse part of the Laplacian) with boundary conditions $E_z = 0$ and $\partial B_z/\partial n = 0$ on the wall, where $\partial/\partial n$ denotes the partial derivative in the normal direction. The solutions with $B_z = 0$ are called transverse magnetic (TM) waves (or electric (E) waves) and those with $E_z = 0$ transverse electric (TE) waves (or magnetic (M) waves); for each of these cases the equations determine an eigenwave number $k$ for a given angular frequency $\omega$ (typically in a waveguide situation) or eigenfrequency $\nu = \omega/(2\pi)$ for a given wavelength $\lambda = 2\pi/k$ (typically in a resonant cavity situation).

References

[1] J. D. Jackson, Classical electrodynamics, Wiley, second edition, 1975.
[2] W. K. H. Panofsky and M. Phillips, Classical electricity and magnetism, Addison-Wesley, second edition, 1961.
[3] A. Sommerfeld, Electrodynamics, trans. by E. G. Ramberg, Academic Press, 1952. (Original in German, 1948.)
[4] P. Moon and D. E. Spencer, Foundations of electrodynamics, Van Nostrand, 1960.
[5] J. A. Stratton, Electromagnetic theory, McGraw-Hill, 1941.
[6] B. I. Bleaney, Electricity and magnetism, Clarendon Press, 1957.
[7] R. Fano, L. J. Chu, and R. B. Adler, Electromagnetic fields, energy and forces, Wiley, 1960.
[8] E. G. Hallén, Electromagnetic theory, Chapman & Hall, 1962.

131 (X.8)
Elementary Functions

A. Definition

A function of a finite number of real or complex variables that is †algebraic, exponential, logarithmic, trigonometric, or inverse trigonometric, or the composite of a finite number of these, is called an elementary function. Elementary functions comprise the most common type of function in elementary calculus.
J. Liouville [1] defined the elementary functions as follows: An algebraic function of a finite number of complex variables is called an elementary function of class 0. Then $e^z$ and $\log z$ are called elementary functions of class 1. Inductively, we define elementary functions of class $n$ under the assumption that elementary functions of class at most $n - 1$ have already
been defined. Let $g(t)$ and $g_j(w_1, \ldots, w_n)$ ($1 \le j \le m$) be elementary functions of class at most 1 and $f(z_1, \ldots, z_m)$ be an elementary function of class at most $n - 1$. Then the composite functions $g(f(z_1, \ldots, z_m))$ and $f(g_1(w_1, \ldots, w_n), \ldots, g_m(w_1, \ldots, w_n))$ (and only such functions) are called the elementary functions of class at most $n$. An elementary function of class at most $n$ and not of class at most $n - 1$ is called an elementary function of class $n$. A function that is an elementary function of class $n$ for some integer $n$ is called an elementary function. In this article, we explain the properties of the most common elementary functions.

B. Exponential and Logarithmic Functions of a Real Variable

Let $a > 0$, $a \neq 1$. A function $f(x)$ of a real variable satisfying the functional relation

$f(x + y) = f(x)f(y), \quad f(1) = a,$  (1)

satisfies $f(n) = a^n$ for positive integers $n$ and $f(-n) = 1/a^n$ for negative integers $-n$. In general, $f(n/m) = \sqrt[m]{a^n}$ for every rational number $r = n/m$. If we assume that $f(x)$ is continuous, then there is a unique strictly monotone function $f(x)$ defined in $(-\infty, \infty)$ whose range is $(0, \infty)$. The function $f(x)$ is called the exponential function with the base $a$ and is denoted by $a^x$, read "a to the power x" and also called a power of $a$ with exponent $x$. Its inverse function is called the logarithmic function to the base $a$, and is denoted by $\log_a x$. The specific value $\log_a x$ is called the logarithm of $x$ to the base $a$. If $g(x) = \log_a x$, we have

$g(xy) = g(x) + g(y), \quad g(a) = 1.$  (2)

Hence we have $xy = f(g(x) + g(y))$. Therefore we can reduce multiplication to addition by using a numerical table for the logarithmic function.

C. Logarithmic Computation

The logarithm to the base 10 is called the common logarithm. If two numbers $x$, $y$ expressed in the decimal system differ only in the position of the decimal point (i.e., $y = x \cdot 10^n$ for an integer $n$), they share the same fractional parts in their common logarithms. The integral part of the common logarithm is called the characteristic, and the fractional part is called the mantissa. (We note that the word "mantissa" is also frequently used for the fractional part $a$ in the †floating point representation $x = a \cdot 10^n$, $10^{-1} \le a < 1$, or $1 \le a < 10$.) The common logarithms of integers have been computed and published in tables.

D. Derivatives of Exponential and Logarithmic Functions

The function $f(x) = a^x$ is differentiable, and $f'(x) = k_a f(x)$, where $k_a$ is a constant determined by the base $a$. If we take the base $a$ to be

$e = \lim_{\nu \to \infty} \left(1 + \frac{1}{\nu}\right)^{\nu} = \sum_{\nu=0}^{\infty} \frac{1}{\nu!} = 2.71828\ldots,$

then we have $k_e = 1$. The number $\sum (1/\nu!)$ is usually called Napier's number and is denoted by $e$ after L. Euler (his letter to C. Goldbach of 1731; → Appendix B, Table 6). In 1873, C. Hermite proved that $e$ is a transcendental number. We sometimes denote $e^x$ by $\exp x$; the term exponential function usually means the function $\exp x$. The function $e^x$ is invariant under differentiation, and conversely, a function invariant under differentiation necessarily has the form $Ce^x$. The logarithm to the base $e$ is called the Napierian logarithm (or natural logarithm), and we usually denote it by $\log x$ without explicitly naming the base $e$ (sometimes it is denoted by $\ln x$). The derivative of $\log x$ is $1/x$, hence we have the integral representation

$\log x = \int_1^x \frac{dx}{x}.$  (3)

The constant factor $k_a$ in the derivative of $a^x$ is equal to $\log a$. The graphs of $y = e^x$ and $y = \log x$ are shown in Fig. 1. The functions $e^x$ and $\log(1 + x)$ are expanded in the following Taylor series at $x = 0$:

$e^x = \sum_{\nu=0}^{\infty} \frac{x^\nu}{\nu!},$  (4)

$\log(1 + x) = \sum_{\nu=1}^{\infty} (-1)^{\nu+1} \frac{x^\nu}{\nu}.$  (5)

The power series in the right-hand sides of (4) and (5) are called the exponential series and the logarithmic series, respectively. The radii of convergence of (4) and (5) are $\infty$ and 1, respectively.

Fig. 1. Graphs of $y = e^x$ and $y = \log x$.
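The exponential series (4) and the logarithmic series (5) can be checked numerically by comparing partial sums with library values. The following is a minimal sketch; the function names `exp_series` and `log1p_series` and the chosen numbers of terms are illustrative assumptions, not part of the original text.

```python
import math


def exp_series(x, terms=30):
    """Partial sum of the exponential series (4): sum of x**v / v! for v < terms."""
    total, term = 0.0, 1.0
    for v in range(terms):
        total += term
        term *= x / (v + 1)  # next term x**(v+1) / (v+1)!
    return total


def log1p_series(x, terms=200):
    """Partial sum of the logarithmic series (5), used here only for |x| < 1."""
    if not -1.0 < x < 1.0:
        raise ValueError("series (5) has radius of convergence 1")
    total, power = 0.0, 1.0
    for v in range(1, terms + 1):
        power *= x  # power == x**v
        total += (-1) ** (v + 1) * power / v
    return total


print(exp_series(1.0), math.e)
print(log1p_series(0.5), math.log1p(0.5))
```

The exponential series converges for every real $x$, whereas the partial sums of (5) converge more and more slowly as $|x|$ approaches 1, which reflects the radius of convergence 1.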

E. Trigonometric and Inverse Trigonometric Functions of a Real Variable

The trigonometric functions of a real variable $x$ are the functions $\sin x$, $\cos x$ (→ 432 Trigonometry) and the functions $\tan x = \sin x/\cos x$, $\cot x = \cos x/\sin x$, $\sec x = 1/\cos x$, and $\operatorname{cosec} x = 1/\sin x$ derived from $\sin x$ and $\cos x$. The derivatives of $\sin x$ and $\cos x$ are $\cos x$ and $-\sin x$, respectively. They have the following Taylor expansions at $x = 0$:

$\sin x = \sum_{\nu=0}^{\infty} \frac{(-1)^\nu}{(2\nu+1)!} x^{2\nu+1},$  (6)

$\cos x = \sum_{\nu=0}^{\infty} \frac{(-1)^\nu}{(2\nu)!} x^{2\nu}.$  (7)

The radii of convergence of (6) and (7) are both $\infty$.
The inverse functions of $\sin x$, $\cos x$, and $\tan x$ are the inverse trigonometric functions and are denoted by arc sin $x$, arc cos $x$, and arc tan $x$, respectively. (Instead of this notation, $\sin^{-1} x$, $\cos^{-1} x$, and $\tan^{-1} x$ are also used.) These functions are infinitely multiple-valued, as shown in Fig. 2. But if we restrict their ranges within the part shown by solid lines in Fig. 2, they are considered single-valued functions. To be more precise, we restrict the range as follows: $-\pi/2 \le$ arc sin $x \le \pi/2$, $0 \le$ arc cos $x \le \pi$, $-\pi/2 <$ arc tan $x < \pi/2$.

Fig. 2. Graphs of the inverse trigonometric functions (principal values shown by solid lines).

The functions having these ranges are called the principal values and are sometimes denoted by Arc sin $x$, Arc cos $x$, and Arc tan $x$, respectively. The derivatives of these functions are $(1 - x^2)^{-1/2}$, $-(1 - x^2)^{-1/2}$, $(1 + x^2)^{-1}$, respectively (→ Appendix A, Table 9.I; for the Taylor or Laurent expansions of $\tan x$, $\cot x$, $\sec x$, $\operatorname{cosec} x$, arc sin $x$, arc cos $x$, arc tan $x$, etc., → Appendix A, Table 10.IV).

F. Hyperbolic Functions

Let $P$ be a point on the branch of the hyperbola $x^2 - y^2 = 1$, $x > 0$, and let $O$ be the origin and $A$ the vertex $(1, 0)$ of the hyperbola. Denote by $\theta/2$ the area of the domain surrounded by the line segments $OA$, $OP$, and the arc $AP$ of the hyperbola. Then we define the coordinates of $P$ to be $(\cosh\theta, \sinh\theta)$ as functions of $\theta$. We have

$\cosh x = (e^x + e^{-x})/2,$
$\sinh x = (e^x - e^{-x})/2,$  (8)

called the hyperbolic cosine and hyperbolic sine, respectively. As in the case of trigonometric functions, we define the hyperbolic tangent by $\tanh x = \sinh x/\cosh x$, the hyperbolic cotangent by $\coth x = \cosh x/\sinh x$, the hyperbolic secant by $\operatorname{sech} x = 1/\cosh x$, and the hyperbolic cosecant by $\operatorname{cosech} x = 1/\sinh x$. They are called the hyperbolic functions. The graphs of $\sinh x$ and $\cosh x$ are shown in Fig. 3. The trigonometric functions are sometimes called circular functions.

Fig. 3. Graphs of $\sinh x$ and $\cosh x$.

We now introduce the Gudermannian (or Gudermann function):

$\theta = \operatorname{gd} u = 2 \arctan e^u - \pi/2,$
$u = \operatorname{gd}^{-1}\theta = \log(\tan\theta + \sec\theta) = \frac{1}{2}\log\frac{1 + \sin\theta}{1 - \sin\theta}.$

Then the hyperbolic functions can be expressed in terms of the trigonometric functions. For example,

$\sinh u = \tan\theta, \quad \cosh u = \sec\theta, \quad \tanh u = \sin\theta.$

G. Elementary Functions of a Complex Variable (→ Appendix A, Table 10)

(1) Exponential function. The power series (4) converges for all finite values if we replace $x$ by the complex variable $z$ and gives an †entire function of $z$ with an †essential singularity at the point at infinity. This is the exponential function $e^z$ of a complex variable $z$. It satisfies the addition formula (1), $e^{z_1 + z_2} = e^{z_1}e^{z_2}$, and it is also the †analytic continuation of the exponential function of a real variable. For a purely imaginary number $z = iy$, we have the Euler formula

$e^{iy} = \cos y + i\sin y.$  (9)

The function $w = e^z$ gives a †conformal mapping from the z-plane to the w-plane, as shown in Fig. 4, which maps the imaginary axis of
the z-plane onto the unit circle of the w-plane ($w = u + iv$). For $z = x + iy$ ($x$, $y$ are real numbers), we have $e^z = e^x e^{iy}$; hence $e^z$ is a †simply periodic function with fundamental period $2\pi i$.

Fig. 4. The mapping $w = e^z$ from the z-plane to the w-plane.

(2) Logarithmic function. The logarithmic function $\log z$ of a complex variable $z$ is the inverse function of $e^z$. It is an infinitely multiple-valued analytic function that has †logarithmic singularities at $z = 0$ and $z = \infty$. All possible values are expressed by $\log z + 2n\pi i$ ($n$ is an arbitrary integer), where we select a suitable value $\log z$. The principal value of $\log z$ is usually taken as $\log r + i\theta$, where $z = re^{i\theta}$ ($r = |z|$, $\theta$ is the argument of $z$) and $0 \le \theta < 2\pi$. (Sometimes the range of the argument is taken as $-\pi < \theta \le \pi$.) The principal value of $\log z$ is sometimes denoted by $\operatorname{Log} z$. The power series (5) gives one of its †functional elements. The integral representation (3) holds for a complex variable $z$. The multivalency of $\log z$ results from the selection of a contour of integration; the integral of $1/z$ around the origin is $2\pi i$, which is the increment of $\log z$.
(3) Power. The exponential function $a^z$ for an arbitrary complex number $a$ is defined to be $\exp(z \log a)$. Similarly, $z^a$ is defined to be $\exp(a \log z)$. The function $z^a$ is an algebraic function if and only if $a$ is rational. In other cases, the function $z^a$ is an elementary function of class 2.
(4) Trigonometric functions, inverse trigonometric functions, hyperbolic functions. The trigonometric, inverse trigonometric, and hyperbolic functions of a complex variable are defined by the analytic continuations of the corresponding functions of a real variable. For example, $\sin z$ and $\cos z$ are defined by the power series (6) and (7), respectively. They are entire functions whose zero points are $n\pi$ and $(n - \frac{1}{2})\pi$ ($n$ is an integer), respectively. They are also represented by †Weierstrass's infinite product (→ Appendix A, Table 10.VI). The functions $\tan z$, $\cot z$, $\sec z$, and $\operatorname{cosec} z$ are †meromorphic functions of $z$ on the complex z-plane, and they are expressed by †Mittag-Leffler partial fractions (→ Appendix A, Tables 10.IV, 10.V). As can be shown from (8) and (9), we have

$\cos z = (e^{iz} + e^{-iz})/2, \quad \sin z = (e^{iz} - e^{-iz})/(2i),$
$\cosh z = \cos iz, \quad \sinh z = (\sin iz)/i.$  (10)

Each of these formulas (10) is called an Euler formula. For a complex variable, the trigonometric and hyperbolic functions are composites of exponential functions, the inverse trigonometric functions are composites of logarithmic functions, and all of them are elementary functions of class 1. The definition of elementary functions by Liouville described in Section A refers, of course, to the functions of a complex variable. We remark that the inverse function of an elementary function is not necessarily an elementary function. For example, the inverse function of $y = x - a\sin x$ is not an elementary function (→ 309 Orbit Determination B).
The derivative of an elementary function is also an elementary function. However, the †primitive function of an elementary function is not necessarily an elementary function. The primitive function of a rational function or an algebraic function of †genus 0 is again an elementary function. Similar properties hold for rational functions of trigonometric functions. Liouville [1] carried through a deep investigation of the situation where the integral of an elementary function is also an elementary function.

References

[1] J. Liouville, Mémoire sur la classification des transcendantes et sur l'impossibilité d'exprimer les racines de certaines équations en fonction finie explicite des coefficients, J. Math. Pures Appl., 2 (1837), 56-104; 3 (1838), 523-546.
[2] E. G. H. Landau, Einführung in die Differentialrechnung und Integralrechnung, Noordhoff, 1934; English translation: Differential and integral calculus, Chelsea, 1965.
[3] N. Bourbaki, Eléments de mathématique, Fonctions d'une variable réelle, ch. 3, Fonctions élémentaires, Actualités Sci. Ind., 1074b, Hermann, second edition, 1958.
Also → references to 106 Differential Calculus.

132 (XX.31)
Elementary Particles

A. Introduction

The word "atom" is derived from the Greek word for indivisible. It turns out that an atom
is divisible into its constituent nucleus and electrons. The nucleus, in turn, consists of protons and neutrons (together called nucleons). Photons (quanta of electromagnetic waves), electrons, protons, and neutrons (denoted by $\gamma$, $e$, $p$, and $n$, respectively), along with many other subsequently discovered subnuclear particles, are called elementary particles, while nuclei, atoms, and molecules are the composite particles composed of these elementary particles.
States of a particle form an irreducible (unitary) representation of the †proper inhomogeneous Lorentz group with positive energy (or a finite direct sum of such representations). Thus a mass ($m \ge 0$) and a spin ($j = 0, \frac{1}{2}, 1, \ldots$) are assigned to each particle (→ 258 Lorentz Group). For example, $e$, $p$, and $n$ have spin $\frac{1}{2}$ and nonzero masses, while $\gamma$ has spin 1 and zero mass.
Many elementary particles are unstable, decaying into other particles. The average lifetime is denoted by $\tau$ and its inverse is called the half-width. The time in which half of many samples of the same particle decays is called the half-life, given by $\tau \log 2$. For example, electrons, protons, and photons are supposed to be stable (or at least to have very long lifetimes), while a neutron decays into a proton, an electron, and a neutrino $\nu$ ($\beta$-decay) with a lifetime of about 15 minutes.
From the study of the relativistic equations (Dirac equations) for wave functions of an electron, P. A. M. Dirac predicted the existence of particles with the same mass as the electron, but of opposite electric charge (Dirac's hole theory, 1930). These were discovered in 1932 and called positrons. Every elementary particle is now believed to be associated with an antiparticle characterized by the opposite sign of the particle's additive quantum numbers, the two being connected by the †PCT theorem. Hence the positron is the antiparticle of the electron. The antiproton, theoretically expected for a long time and experimentally found in 1955, is the antiparticle of the proton. Antiprotons, antineutrons, and positrons are constituents of antimatter. The particles whose additive quantum numbers are all invariant under change of sign, such as photons ($\gamma$), neutral pions ($\pi^0$), etc., seem to be antiparticles of themselves and are said to be self-conjugate.
Elementary particles have four distinct types of interaction: gravitational, weak, electromagnetic, and strong, in increasing order of interaction strength. Gravitational and electromagnetic interactions were recognized in earlier centuries because these interactions are of long range. A. Einstein put forward the idea of a light quantum or photon as a lump of electromagnetic energy behaving like a particle. Whether a corresponding quantum (graviton) exists for gravitational interactions is a question related to the existence of gravitational waves themselves, and is not yet settled.
The nuclear force is an example of a strong interaction and is studied to elucidate nuclear structure and to derive the †cross sections of various collision processes involving nuclei. H. Yukawa predicted in 1935 the existence of a particle associated with the nuclear force, just as photons are associated with the electromagnetic interaction. Its mass was predicted, from the range of the nuclear force, to be about 200 times the electron mass, which is intermediate between the masses of electrons and nucleons, and hence the particle was named a mesotron or a meson. These were found in cosmic rays in 1947 and, in fact, it was found that the mesons relevant to the nuclear force (now called pions and denoted by $\pi^+$, $\pi^-$, $\pi^0$ according to their electric charge) decay into other kinds of particles called muons (denoted by $\mu^+$, $\mu^-$) with a charged-pion lifetime of $2.6 \times 10^{-8}$ sec ($\pi^\pm \to \mu^\pm + \nu(\bar\nu)$; the neutral pion $\pi^0$ decays, over a shorter lifetime, into photons). The muons then decay into electrons and neutrinos ($\mu^\pm \to e^\pm + \nu + \bar\nu$, $\bar\nu$ indicating antineutrinos) with muon lifetimes of about $2.2 \times 10^{-6}$ sec. Weak interactions are relevant to these decays as well as to the $\beta$-decay of the nucleus and the neutron. Since 1962 electron neutrinos $\nu_e$ and muon neutrinos $\nu_\mu$ have been distinguished in these decays, so that electron and muon numbers may be conserved.
Since 1949, many new particles (unstable under weak interactions) have been gradually found in cosmic rays; these are called strange particles. Some of the early ones are hyperons ($\Lambda$, $\Sigma$, $\Xi$), of spin $\frac{1}{2}$, and kaons ($K$, $\bar K$), of spin 0. For such strange particles, the strangeness quantum number, which is preserved in the strong interaction, has been introduced, and the Nakano-Nishijima-Gell-Mann formula concerning this number is known to hold. This says that $Q = I_3 + \frac{1}{2}(B + S)$ for each elementary particle, where $Q$ is the electric charge in units of that of the positron, $I_3$ is the third component of the isospin, $B$ is the baryon number, and $S$ is the strangeness.
Quantum-mechanical wave functions for a system of identical particles seem to be either totally symmetric (Bose statistics) or totally antisymmetric (Fermi statistics) under permutations of particles. Accordingly, particles are either bosons (or Bose particles) or fermions (or Fermi particles). All bosons seem to have integer spins and all fermions half-odd-integer spins. This is the connection of spin and statistics and follows from certain axioms in quantum field theory (→ 150 Field Theory). There
has been some discussion on intermediate statistics (parabosons and parafermions).

B. Families of Elementary Particles

Elementary particles are classified into families of leptons, photons, and hadrons.
The family of leptons now consists of electrons ($e$), muons ($\mu$), and tau-leptons ($\tau$), their accompanying neutrinos ($\nu_e$, $\nu_\mu$, $\nu_\tau$), and their antiparticles; $\tau$ was discovered in the late 1970s. Experimentally, $\nu_\tau$ is not well established nor has the possibility $\nu_\tau = \nu_e$ yet been excluded. Leptons have spin $\frac{1}{2}$. They are characterized by having no strong interactions.
The family of photons consists of photons and the recently discovered intermediary weak vector bosons. Gluons (→ Section C (4)), if they exist, also belong to this family.
The family of hadrons has a large number of members which are either mesons or baryons, the former being bosons and the latter fermions. We now have, in addition to pions and kaons, many resonant mesons (unstable under strong interaction) such as $\rho$-mesons and $\omega$-mesons. At present we have, in all, more than 20 species of mesons. This number does not count spin, charge, and antiparticle degrees of freedom. The baryon subfamily includes nucleons, hyperons, and excited states, now consisting of more than 30 species, again not counting spin, charge, and antiparticle degrees of freedom. We have nucleonic resonances with spin as high as 11/2.
Hadrons are now well understood as composites of subhadronic constituents called quarks (and antiquarks), although free quarks have not been observed. (Hence the problem of quark confinement has been discussed extensively.) Mesons are systems made up of a quark and an antiquark. Baryons are systems made up of three quarks. Therefore, at our present level of knowledge, the elementary particles might be leptons, photons, and quarks (and the corresponding antiparticles).

C. Methods in the Theory of Elementary Particles

(1) †Quantum Field Theory. Application of the ideas of quantum mechanics to electromagnetic fields and their interaction with electrons resulted in the formulation of quantum electrodynamics (and more generally quantum field theory). Application of quantum-mechanical perturbation theory to quantum electrodynamics with the fine-structure constant $\alpha = e^2/(\hbar c)$ (about 1/137) as an expansion parameter resulted in divergent expressions, the so-called divergence difficulties. The ultraviolet divergence comes from integration over high momenta (of virtual particles), and the infrared divergence is due to the zero mass of photons. It was later found that the ultraviolet divergence can be combined with a small number of parameters of the theory (i.e., electron mass, (possibly nonzero) photon mass, and electromagnetic coupling constant $e$) into a revised set of constants (called the renormalized masses and coupling constant), which are then equated to the observed finite values of these constants; this procedure is called renormalization. Physically, this renormalization is pictured to be effected by virtual photons (and electron-positron pairs) surrounding (bare) electrons and photons. The infrared divergence is supposed to be a reflection of the fact that electrons can be accompanied by infinitely many photons with negligibly small total energy (a situation made possible by the zero mass of photons); this cannot be experimentally analyzed (and is indistinguishable from a single electron).
The relativistically covariant formulation of the renormalized perturbation theory of quantum electrodynamics proposed by S. Tomonaga, J. S. Schwinger, and R. P. Feynman (independently and in different forms), and in particular the Feynman rules and Feynman diagrams that lead to the Feynman integrals (→ 146 Feynman Integrals), made possible detailed theoretical computations, and the computed values (such as the Lamb shift of hydrogen and the anomalous magnetic moment of an electron) fit marvelously well with observed values, an achievement considered a great success of quantum electrodynamics. F. J. Dyson more or less showed that the renormalization procedure really absorbs all the divergences in all orders of the perturbation expansion in terms of renormalized constants, though there were later refinements and elaborations of the proof. This work also leads to the division of quantum field theories into two classes: renormalizable theories, where (infinitely many) ultraviolet divergences can be absorbed into a finite number of constants by renormalized perturbation theory, and unrenormalizable theories. The question of whether perturbation series converge in some sense is an unsolved question of quantum electrodynamics.
In quantum field theory, the central role is played by quantum fields, which are operator-valued generalized functions of a space-time point. Particle interpretations of any state at infinite past and infinite future are obtained in the theory from the asymptotic behavior of the fields (†asymptotic fields) at time $-\infty$ and time $+\infty$, and from their relation, expressed by the
†S-matrix, describing how particles scatter by collision. Thus any model of quantum fields makes, in principle, a prediction about what particles appear and how they behave (asymptotically) in mutual collisions.
After the success of quantum electrodynamics, the perturbation theory of quantum fields was applied to systems of pions and nucleons; this proved to be unsuccessful, possibly due to the lack of an appropriate expansion parameter. This has led some people to study the mathematically rigorous consequences of quantum field theories; these mathematical consequences do not require any perturbation calculation, simply following from a small number of mathematically formulated axioms believed to be satisfied by a large class of quantum field theories. This approach, referred to as †axiomatic quantum field theory, has yielded a few physically meaningful consequences of general nature: analyticity of some S-matrix elements, the †PCT theorem, and the connection of spin and statistics (→ Section A).
While the axiomatic approach provides a general framework, the †constructive field theory developed later provides examples of quantum field theories that fit into such a framework. Because of its concrete nature, it can make statements about detailed properties of the model, such as the (non-)existence of composite particles, the establishment of perturbation theory as an asymptotic expansion, and phase-transition phenomena and the related broken symmetry as the coupling constant varies. It has, however, been successful for only 2 and 3 space-time dimensions.

(2) Analytic S-Matrix Approach. Due to the failure of the perturbation approach in quantum field theory, a new approach was developed based on assumptions about the analyticity properties of the S-matrix elements. The assumed analyticity properties were surmised from examination of Feynman integrals and of nonrelativistic potential scattering, and partially follow from axiomatic field theory. In this approach, the information that scattering amplitudes, or S-matrix elements, possess certain analytic properties with respect to energies, scattering angles, and so on is expressed by means of integral representations. For example, the forward two-body scattering amplitude $f(s)$ as a function of $s$ = (energy in the center of mass system)$^2$ is written as

$\operatorname{Im} f(s)$ is related to the total cross section by the unitarity of the S-matrix. Hence equation (1), called a dispersion relation, gives a relation among observable quantities. An integral representation for the general two-body scattering amplitude $f(s, t, u)$ (2 incoming and 2 outgoing particles) has been proposed by S. Mandelstam and is called the Mandelstam representation, where $s = (p_1 + p_2)^2$, $t = (p_1 + p_3)^2$, and $u = (p_1 + p_4)^2$ (squares in a Minkowski metric) and $p_1$, $p_2$, $p_3$, $p_4$ are the 4-momenta of the incoming and outgoing particles, with sign reversed for the latter. (Some relations hold among variables: $p_i^2 = m_i^2$ with $m_i$ the mass, $\sum p_i = 0$, $s + t + u = \sum m_i^2$.) Under the interchange of incoming and outgoing particles, the corresponding $f$'s are related by analytic continuation. This is called the crossing symmetry. The study of the S-matrix directly from its analyticity and unitarity is called the S-matrix approach (→ 386 S-Matrices).
The analyticity of the two-body scattering amplitude $f$ as a function of the angular momentum $l$ with a fixed $s$ was investigated by T. Regge for nonrelativistic potential scattering, and later the idea was applied to the S-matrix approach. The poles $l = l(s)$ of $f$ ($f$ is considered to be a function of $l$ for each fixed $s$) are called Regge poles. Regge trajectories $l = l(s)$ for variable $s < 0$ have been shown to play important roles in the high-energy behavior of scattering amplitudes for small values of the variable $t$ (four-momentum-transfer squared).
An approximate expression of S-matrix elements, with Regge poles and satisfying the crossing symmetry, was introduced by G. Veneziano and is called the Veneziano model. It has developed into the so-called dual resonance model (dual in the sense that s-channel poles are dual to t-channel poles) and has subsequently evolved into the string model of hadrons, according to which hadrons are viewed as systems composed of strings joining quarks (and antiquarks).

(3) Group-Theoretical Approach. In connection with the symmetry properties shown by the spectra and reaction patterns of hadrons, a group-theoretical approach has been developed. For example, the similarity of the behavior of neutrons and protons in nuclei, apart from the difference in their electromagnetic properties, was formulated as isospin invariance (under the group SU(2)). The symmetry properties are sometimes understood in terms of new additive quantum numbers that are conserved or nearly conserved, and these properties are made more concrete in the final stage by the introduction of fundamental constituents carrying these quantum numbers. Based on the canonical formalism of quan-
the optical theorem, which is a statement of tum field theory, (Noether) currents, as
521 132 D
Elementary Particles

quantum-mechanical generators of the symmetry, are introduced in association with conserved quantum numbers. The commutation relation of these currents, referred to as a current algebra, shows the structure of a Lie algebra expressing the symmetry of the Lagrangian of the system of hadrons; for example, SU(3) × SU(3) for three species (flavors) of massless quarks (the first factor for vector currents and the second for axial-vector currents). The approach showed remarkable success when combined with the hypothesis of the partially conserved axial-vector currents (PCAC), which requires the divergence of the axial-vector currents to be proportional to the pseudoscalar meson fields.

Even if a theory has a symmetry under a group G in its formulation, a vacuum state of the theory might not have a symmetry under G. If that occurs, we speak of spontaneously broken symmetry. Under some assumptions, a particle of zero mass (which is connected to the vacuum by the current for the spontaneously broken symmetry) is associated with the spontaneously broken symmetry. This statement is called Goldstone's theorem, and the relevant particle of zero mass is called the Nambu-Goldstone boson. The PCAC hypothesis is believed to be connected with the spontaneous breakdown of the axial SU(3) symmetry with pions, kaons, and eta-mesons as Nambu-Goldstone bosons. This approach produced the Adler-Weisberger sum rule, which relates the weak axial-vector coupling constants to pion-nucleon scattering cross sections.

A group-theoretical attempt to treat fermions and bosons on an equal footing resulted in the introduction of super Lie algebras ($Z_2$-graded Lie algebras). The basic new ingredient in this approach is a special class of generators roughly interpreted as the square root of the four-momentum, whose anticommutators (instead of commutators) are linear combinations of ordinary generators. A supermultiplet, which is an irreducible representation of a super Lie algebra, consists of both fermions and bosons. Extensions to local super Lie algebras (and also incorporation of gravitons into the framework) have been tried, with no realistic model emerging so far.

(4) Non-Abelian Gauge Field Theory. Recently, quantum field theory has been revived as non-Abelian †gauge theory, resulting in what are considered to be successful qualitative and semiquantitative predictions. The quantization of the theory was at first carried out in terms of the †Feynman path integral, where fictitious particles, called Faddeev-Popov ghosts, appear through the precise definition of the functional measure of the path integral. The canonical quantization has also been formulated with the explicit introduction of Faddeev-Popov ghosts from the beginning. Non-Abelian gauge field theory exhibits the very important property of asymptotic freedom, which states that the interaction at asymptotically high energies, or at very short distances, approaches that of free (noninteracting) theory. This property is required for the description of hadronic systems made of quarks in view of the experimental observation of the scaling behavior of deep inelastic inclusive structure functions. Thus gauge theory is believed to describe the dynamics of systems of quarks in hadrons. It is called quantum chromodynamics, and the quantum of the gauge field is called the gluon.

The recent development of gauge field theory has been accompanied by many technical refinements of quantum field theory, which include the methods of dimensional regularization and renormalization groups.

Dimensional regularization starts with Feynman integrals defined for n-dimensional momenta with $n \neq 4$, in order to give meaning to integrals divergent for $n = 4$. Then Feynman integrals have poles at $n = 4$, which are absorbed into the unrenormalized constants by means of the renormalization procedure. This method is particularly suited for non-Abelian gauge theory, since it is the regularization method that keeps gauge invariance at each step of the calculation.

The †renormalization-group equation has been known for a long time. It results from the requirement that physical quantities should be dimensionally covariant under renormalization of constants in the theory. The renormalization-group equation is usually written as a differential equation for a Green's function, expressing the fact that a change of scale of momenta is balanced by changing coupling constants and masses. Further refinements are due to C. G. Callan, K. Symanzik, S. Weinberg, and G. 't Hooft.

D. Models of Elementary Particles

Hadrons are now considered to be made of more fundamental constituents; at present, at least 15 species of subhadronic, fermionic constituents (quarks) have been proposed. Attempts to understand subhadronic structure have a long history, the main landmarks of which include the Fermi-Yang model (where π-mesons are supposed to be made of protons and neutrons), the Sakata model (where all hadrons are supposed to be made of protons, neutrons, and Λ-hyperons), and a variety of quark models developed from the eightfold way of M. Gell-Mann and Y. Ne'eman (new

assignments of representations of SU(3) to particles somewhat different from the Sakata model; for example, the octet representation for mesons and low-lying baryons, and the decuplet for excited baryons). Originally, quarks were supposed to come in 3 species (u-quarks, d-quarks, s-quarks), each carrying its own quantum number, now called the flavor quantum number. Then it was suggested that each flavored quark has three additional degrees of freedom, i.e., three new quantum numbers (called color quantum numbers) in order to account for experimental data: (1) the spin-statistics problem of baryonic ground state wave functions, (2) the decay rate of $\pi^0 \to 2\gamma$, (3) the Drell ratio (= total cross section for {$e^- + e^+ \to$ anything}, divided by the cross section for $e^- + e^+ \to \mu^- + \mu^+$). Recently, two new flavor degrees of freedom besides the old 3 flavors (u, d, s) have been discovered, the carriers of which are c-quarks and b-quarks, c being the constituents of J/ψ-particles, b of Υ-particles. The combination of 5 flavors and 3 colors results in 15 quarks, as stated earlier.

The so-called standard model is a quantum field theory based on a local, non-Abelian gauge group SU(3) × SU(2) × U(1). The group SU(3) is called the color SU(3) group, which is supposed to be strictly unbroken and expresses the invariance of the theory under the local SU(3) transformation of the three color degrees of freedom. Quarks having spin 1/2 transform as its 3-dimensional fundamental representation, and the vector gauge bosons transforming as its 8-dimensional regular representation are gluons. This part of the theory is quantum chromodynamics (QCD). The remaining part of the theory, based on the local gauge group SU(2) × U(1), is called the Glashow-Weinberg-Salam model or its hadronic extension quantum flavor dynamics (QFD), and unifies the electromagnetic and the weak interactions. This gauge group is supposed to be spontaneously broken with the only unbroken subgroup U(1)' corresponding to the electromagnetic gauge transformation. The underlying mechanism for the spontaneous breakdown of the gauge group SU(2) × U(1) is not well understood, but conventionally it is assumed to occur through the so-called Higgs mechanism. This is a mechanism proposed by P. W. Higgs, whereby the Goldstone boson acquires a nonzero mass if the broken symmetry occurs in the presence of an associated massless vector field (called a gauge vector field), which also becomes associated with massive bosons. The gauge vector bosons are identified with photons (γ) (unbroken symmetry and hence massless) and weak intermediary bosons (broken symmetry and hence massive), usually denoted by W and Z. Quarks and leptons transform under the group SU(2) as doublet or singlet representations. The renormalizability of the spontaneously broken gauge field theory has been established by 't Hooft. Major predictions of the Glashow-Weinberg-Salam model, including the existence of the gauge bosons W and Z, have been borne out experimentally.

Grand unified models attempt to unify QCD and QFD, employing a larger Lie group containing SU(3) × SU(2) × U(1) as a subgroup. The most popular ones are those based on the groups SU(5), SO(10) and on some of the exceptional groups. Super grand unified models attempt to unify QCD, QFD, and the gravitational interaction, with super Lie groups as a possible basis. Recently, there have been attempts to search for subquark structures.

References

[1] J. D. Bjorken and S. D. Drell, Relativistic quantum mechanics, McGraw-Hill, 1964.
[2] J. D. Bjorken and S. D. Drell, Relativistic quantum fields, McGraw-Hill, 1965.
[3] G. F. Chew, The analytic S-matrix, Benjamin, 1966.
[4] S. L. Adler and R. F. Dashen, Current algebras, Benjamin, 1967.
[5] V. de Alfaro, S. Fubini, G. Furlan, and C. Rossetti, Currents in hadron physics, North-Holland, 1973.
[6] M. Gell-Mann and Y. Ne'eman, The eightfold way, Benjamin, 1964.
[7] R. E. Behrends, J. Dreitlein, C. Fronsdal, and W. Lee, Simple groups and strong interaction symmetries, Rev. Mod. Phys., 34 (1962), 1-40.
[8] E. Fermi and C. N. Yang, Are mesons elementary particles? Phys. Rev., 76 (1949), 1739-1743.
[9] S. Sakata, On a composite model for the new particles, Prog. Theoret. Phys., 16 (1956), 686-688.
[10] L. Corwin, Y. Ne'eman, and S. Sternberg, Graded Lie algebra in mathematics and physics, Rev. Mod. Phys., 47 (1975), 573-603.
[11] M. Jacob (ed.), Dual theory, Physics Reports Reprint Book Series, vol. 1, North-Holland, 1974.
[12] J. Scherk, An introduction to the theory of dual models and strings, Rev. Mod. Phys., 47 (1975), 123-164.
[13] H. Harari, Quarks and leptons, Phys. Rep., 42 (1978), 235-309.
[14] E. S. Abers and B. W. Lee, Gauge theories, Phys. Rep., 9 (1973), 1-141.
[15] S. Weinberg, Recent progress in gauge theories of the weak, electromagnetic and
strong interactions, Rev. Mod. Phys., 46 (1974), 255-277.
[16] J. C. Taylor, Gauge theories of weak interactions, Cambridge Univ. Press, 1976.

133 (XIV.9)
Ellipsoidal Harmonics

A. Ellipsoidal Coordinates

If $a > b > c$, then for any given $(x, y, z) \in \mathbf{R}^3$, the three roots of the cubic equation in $\theta$

$F(\theta) = \dfrac{x^2}{a^2+\theta} + \dfrac{y^2}{b^2+\theta} + \dfrac{z^2}{c^2+\theta} - 1 = 0$

are real and lie in the intervals $\theta > -c^2$, $-c^2 > \theta > -b^2$, and $-b^2 > \theta > -a^2$. Denoting these three roots by $\lambda$, $\mu$, and $\nu$ (they are labeled so as to satisfy the inequalities $\lambda > -c^2 > \mu > -b^2 > \nu > -a^2$), $F(\lambda) = 0$, $F(\mu) = 0$, and $F(\nu) = 0$ represent an ellipsoid, a hyperboloid of one sheet, and a hyperboloid of two sheets, respectively. They are confocal with the ellipsoid

$\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} + \dfrac{z^2}{c^2} - 1 = 0,$

pass through the point $(x, y, z)$, and mutually intersect orthogonally.

The quantities $\lambda, \mu, \nu$ are called the ellipsoidal coordinates of the point $(x, y, z)$. Rectangular coordinates $(x, y, z)$ are expressed in terms of ellipsoidal coordinates $(\lambda, \mu, \nu)$ by the formula

$x^2 = \dfrac{(a^2+\lambda)(a^2+\mu)(a^2+\nu)}{(a^2-b^2)(a^2-c^2)},$  (1)

and two others obtained from (1) by cyclic permutations of $(a, b, c)$ and $(x, y, z)$.

B. Ellipsoidal Harmonics

When a †harmonic function $\psi$ of three real variables is constant on the surface $\lambda$ = constant, $\mu$ = constant, or $\nu$ = constant in ellipsoidal coordinates, the function $\psi$ is called an ellipsoidal harmonic. A solution of Laplace's equation $\Delta\psi = 0$ in the form $\psi = \Lambda(\lambda)M(\mu)N(\nu)$ can be obtained by the method of separation of variables. The equation $\Delta\psi = 0$ is written in the form

$\sum (\mu - \nu)\,\Delta_\lambda \dfrac{\partial}{\partial\lambda}\left(\Delta_\lambda \dfrac{\partial\psi}{\partial\lambda}\right) = 0,$

where the summation is taken over the even permutations of $(\lambda, \mu, \nu)$, and

$\Delta_\lambda = \sqrt{(a^2+\lambda)(b^2+\lambda)(c^2+\lambda)}.$

The ordinary differential equation

$4\Delta_\lambda \dfrac{d}{d\lambda}\left(\Delta_\lambda \dfrac{d\Lambda}{d\lambda}\right) = (K\lambda + C)\Lambda$  (2)

is satisfied by $\Lambda$ and also by $M$ and $N$ if we replace $\lambda$ by $\mu$ and $\nu$, respectively. Equation (2) is called Lamé's differential equation, with $K$ and $C$ the separation constants.

Let $K = n(n+1)$ for $n = 0, 1, 2, \ldots$. Then equation (2), for a suitable value (the eigenvalue) of $C$, has a solution that is a polynomial in $\lambda$ or a polynomial multiplied by one, two, or three of $\sqrt{a^2+\lambda}$, $\sqrt{b^2+\lambda}$, and $\sqrt{c^2+\lambda}$. Among these solutions $2n + 1$ are linearly independent. We denote these solutions by $\Lambda = f_n^m(\lambda)$ ($m = 1, 2, \ldots, 2n+1$). They are essentially equivalent to the Lamé functions, to be defined at the end of this section. To be precise, by setting

$\lambda + (a^2 + b^2 + c^2)/3 = \xi,$
$C = B + n(n+1)(a^2 + b^2 + c^2)/3,$
$e_1 = (b^2 + c^2 - 2a^2)/3, \ \ldots; \quad e_1 + e_2 + e_3 = 0,$

we have

$\dfrac{d^2\Lambda}{d\xi^2} + \dfrac{1}{2}\left(\dfrac{1}{\xi-e_1} + \dfrac{1}{\xi-e_2} + \dfrac{1}{\xi-e_3}\right)\dfrac{d\Lambda}{d\xi} = \dfrac{n(n+1)\xi + B}{4(\xi-e_1)(\xi-e_2)(\xi-e_3)}\,\Lambda.$  (3)

This can also be written in the form

$\dfrac{d^2\Lambda}{du^2} = \bigl(n(n+1)\wp(u) + B\bigr)\Lambda$

by the change of variable $\xi = \wp(u)$ with the Weierstrass †℘-function.

The differential equation (3) has $\xi = e_1, e_2, e_3, \infty$ as †regular singular points. A solution of (3) that is a polynomial in $\xi$ or a polynomial multiplied by one, two, or three of $\sqrt{\xi-e_1}$, $\sqrt{\xi-e_2}$, and $\sqrt{\xi-e_3}$ is called a Lamé function of the first kind.

C. Classification of Lamé Functions

The $2n + 1$ linearly independent solutions $f_n^m(\lambda)$ of (2) are classified into the following four families. If $n$ is an even number $2p$, then $p + 1$ solutions $f_n^m(\lambda)$ among the $2n + 1$ solutions are polynomials in $\lambda$ of degree $p$, and the other $3p$ functions are polynomials in $\lambda$ of degree $p - 1$ multiplied by

$\sqrt{(a^2+\lambda)(b^2+\lambda)}$, $\sqrt{(b^2+\lambda)(c^2+\lambda)}$, or $\sqrt{(c^2+\lambda)(a^2+\lambda)}$.

Since all these polynomials are products with real factors of degree 1, the solutions belonging

to the first family are of the type

$f_n^m(\lambda) = (\lambda-\theta_1)(\lambda-\theta_2)\cdots(\lambda-\theta_{n/2}),$  (5)

while solutions of the latter kinds are of the type

$f_n^m(\lambda) = \left\{\sqrt{(a^2+\lambda)(b^2+\lambda)} \ \text{or}\ \sqrt{(b^2+\lambda)(c^2+\lambda)} \ \text{or}\ \sqrt{(c^2+\lambda)(a^2+\lambda)}\right\} \times (\lambda-\theta_1)(\lambda-\theta_2)\cdots(\lambda-\theta_{n/2-1}).$  (6)

The functions (5) and (6) are called Lamé functions of the first species and of the third species, respectively. On the other hand, if $n$ is an odd number $2p + 1$, then $3(p + 1)$ solutions among the $2n + 1$ functions $f_n^m(\lambda)$ are of the type

$f_n^m(\lambda) = \left\{\sqrt{a^2+\lambda} \ \text{or}\ \sqrt{b^2+\lambda} \ \text{or}\ \sqrt{c^2+\lambda}\right\} \times (\lambda-\theta_1)(\lambda-\theta_2)\cdots(\lambda-\theta_{(n-1)/2}),$  (7)

and the other $p$ functions are of the type

$f_n^m(\lambda) = \sqrt{(a^2+\lambda)(b^2+\lambda)(c^2+\lambda)} \times (\lambda-\theta_1)(\lambda-\theta_2)\cdots(\lambda-\theta_{(n-3)/2}).$  (8)

The functions (7) and (8) are called Lamé functions of the second species and of the fourth species, respectively. Hence in either case we have $2n + 1$ linearly independent Lamé functions.

When $n$ is even, we obtain an ellipsoidal harmonic

$\psi = f_n^m(\lambda)\,f_n^m(\mu)\,f_n^m(\nu)$  (4)

by multiplying $f_n^m(\lambda)$, $f_n^m(\mu)$, and $f_n^m(\nu)$ belonging to the first family. Also, in this case, by setting

$\Theta_p = \dfrac{x^2}{a^2+\theta_p} + \dfrac{y^2}{b^2+\theta_p} + \dfrac{z^2}{c^2+\theta_p} - 1 = \dfrac{(\lambda-\theta_p)(\mu-\theta_p)(\nu-\theta_p)}{(a^2+\theta_p)(b^2+\theta_p)(c^2+\theta_p)},$

we have

$\psi = \Theta_1\Theta_2\cdots\Theta_{n/2}$  (9)

up to constant coefficients. Utilizing Lamé functions of the third species (instead of functions of the first species) and formula (1), we find that

$\psi = (yz \ \text{or}\ zx \ \text{or}\ xy) \times \Theta_1\Theta_2\cdots\Theta_{n/2-1}.$  (10)

For even $n$, every ellipsoidal harmonic expressible in terms of polynomials in $x, y, z$ of degree $n$ can be written as a linear combination of the functions (9) and (10), which are called the ellipsoidal harmonics of the first and of the third species, respectively. Similarly for odd $n$, the ellipsoidal harmonics of the second and of the fourth species

$\psi = (x \ \text{or}\ y \ \text{or}\ z)\,\Theta_1\Theta_2\cdots\Theta_{(n-1)/2},$  (11)
$\psi = xyz\,\Theta_1\Theta_2\cdots\Theta_{(n-3)/2}$  (12)

are composed of the Lamé functions of the second and fourth species, respectively. For odd $n$, these forms are a complete system of ellipsoidal harmonics that are linearly independent and expressible in terms of polynomials in $x, y, z$ of degree $n$.

The zeros $\xi_1, \xi_2, \ldots, \xi_p$ of the Lamé functions are real, and $\xi_i \neq \xi_j$ ($i \neq j$). They never coincide with any one of $e_1$, $e_2$, and $e_3$. If $e_1 > e_2 > e_3$, then $\xi_1, \ldots, \xi_p$ all lie between $e_1$ and $e_3$. If $m$ is an integer such that $0 \le m \le p$, we have one and only one Lamé function (with the species given) with $m$ of its zeros lying between $e_1$ and $e_2$ and the remaining $p - m$ zeros between $e_2$ and $e_3$ (Stieltjes's theorem). In this way, a complete system of linearly independent Lamé functions of the specified type may be obtained, since $m$ assumes $p + 1$ different values. When the constant $B$ appearing in the differential equation (3) takes specific values so that the equation has Lamé functions of the first kind as its solution, (3) also yields a solution $\Lambda$ such that $\Lambda \to \xi^{-(n+1)/2}$ as $\xi \to \infty$. This function $\Lambda$ is called the Lamé function of the second kind.

D. Ellipsoids of Revolution (Spheroids)

When the fundamental ellipsoid is a spheroid

$\dfrac{x^2+y^2}{a^2} + \dfrac{z^2}{c^2} = 1,$

it is convenient to use the spheroidal coordinates $(\xi, \eta, \varphi)$ given by

$x = l\sqrt{(\xi^2-1)(1-\eta^2)}\cos\varphi,$
$y = l\sqrt{(\xi^2-1)(1-\eta^2)}\sin\varphi,$  (13)
$z = l\xi\eta, \quad l = \sqrt{c^2-a^2}$

for $a^2 < c^2$ (prolate) and

$x = l\sqrt{(\xi^2+1)(1-\eta^2)}\cos\varphi,$
$y = l\sqrt{(\xi^2+1)(1-\eta^2)}\sin\varphi,$  (14)
$z = l\xi\eta, \quad l = \sqrt{a^2-c^2}$

for $a^2 > c^2$ (oblate). The solutions of Laplace's equation, which are regular at all finite points, are given by

$\psi = P_n^m(\xi)\,P_n^m(\eta)\,(\cos m\varphi \ \text{or}\ \sin m\varphi)$

in the prolate case, and

$\psi = P_n^m(i\xi)\,P_n^m(\eta)\,(\cos m\varphi \ \text{or}\ \sin m\varphi)$

in the oblate case. Here Pnmis the tassociated (18) which are regular on the whole domain
Legendre function of the tïrst kind. Solutions -1 <x < 1 by per(x), and the corresponding
which are regular outside a tïnite ellipsoid eigenvalues by A,,,, (assuming the boundary
cari be composed of the tassociated Legendre condition stated in this paragraph concern-
functions of the second kind, Qn(<) or Qn(i<) ing singularities). In particular, when K+O,
instead of Pn(<) or P:(i{), respectively. equation (18) reduces to tlegendre’s associ-
ated differential equation, the eigenvalues of 1
become n(n + 1) (n is a positive integer), and
E. Spheroidal Wave Functions the corresponding eigenfunctions become the
associated Legendre functions of the first kind:
Transforming the tHelmholtz equation in
prolate spheroidal coordinates (13) we have

Hence per(x) is a solution whiçh tends to a


constant multiple of Pn(x) as K+O. Using a
+GY=O, rc=kl. system of orthogonal functions Pr(x), we cari
expand pc:(x) as
(15)
w34 = C 4XYx),
By separating variables in the form Y = I,m
X(t) Y(rj);Ftm(~, we have the equations Il- n 1= even number. (20)
The coefficients A satisfy a recurrence
formula

(
,212+21-l-2&
L-W+1)+K (21-1)(21+3) 4Tl
>

t16W &t-m- 1x-4 A”


(21-3)(21-l) n’f ’
which X and Y, respectively, must satisfy. The
only difference between equations (16a) and (l+m+ 1)(1+m+2& -o
-2 n,1+2- (21)
(16b) arises from the fact that the domain of (21+3)(21+5)
(16a) is given by 1~ 5 whereas the domain of
(16b) is given by -1 CV< 1. For the oblate The functions per(x) and peT(x) are ortho-
gonal in the domain -1 <x < 1.
spheroid, utilizing formula (14) we have
Another solution of (18) exists which corre-

&{$(@+ug)++‘g) sponds to the same eigenvalue A,,,, is inde-


pendent of per(x), and has the opposite parity:

1 1 a9 q434 = IZ-m,C=,,,” AiYIQiYx)


+ (--
1-V/*
52
>
a<pz
1
+i?Y=O. (17)

(22)
By separating variables as before, Y(q) satistïes J<
+ j,m C=oddBTjJI;I(x)~
the same equation as (16b), while X(t) satistïes
where the A$ are the same as in (20) and are
equation (16a) with 5 replaced by it. Al1 these
determined by the recurrence formula (21),
equations are of the type
while for j 2 m + 2, the Br j satisfy the recur-
rente formula

km-j(j+ 1)+r22’(~~I~:j~~z)B,j
A solution of (18) is known as a spheroidal
wave function. The equation (18) has *I as
,tj-m-l)tj-4 Bm,_
+Ic (2j-3)(2j-1) “” ’
regular singular points and CO as an irregular
singular point of class 1. Hence spheroidal +K2(j+m+l)(j+m+2) B”, -o,
wave functions behave like tlegendre func- “,‘+2 - (23)
(2j+3)(2j+5)
tions in the interval [ -1, l] and like tBesse1
functions in the neighborhood of 00. Since the associated Legendre function of
the second kind
Qf(~)=(l-x~)~‘~d~QJdx~
F. The Functions pc:(x) and qer(x)
is of the form
When we Write a solution of (18), it is custom-
ary to Write x instead of z when z is contained
Q:(x) = pl(x)log~~

in the interval [ -1, 11. We denote solutions of +(l-x2)-“‘2 x (a polynomial in x)



for 1> m, the qer(x) have x = k 1 as singular Disregarding a constant factor, this coincides
points. with the function detïned by (22) with Q;(z) =
By expressing the solution of equation (18) (z2 - l),‘* dmQ,/dzm in place of the associated
in integral form we find that pet(x) satisfïes an Legendre function of the second kind.
integral equation

Tmv .,mPGw References

= l (1 -x2)““*(1 -5’)“‘*eiKx5pe~(5)d5, (24) [l] M. J. 0. Strutt, Lamésche- Mathieusche-


s -1 und verwandte Funktionen in Physik and
Technik, Erg. Math., Springer, 1932 (Chelsea,
where the coefficient v,,, is related to Pflm(0) or
1967).
P;‘(o).
[2] E. W. Hobson, The theory of spherical and
In order to extend the domain of deiïni-
ellipsoidal harmonies, Cambridge Univ. Press,
tion of pc:(x) and qen(x) beyond the interval
1931 (Chelsea, 1955).
[ -1, 11, we adopt, in the domain G obtained
[3] Y. Hagihara, Theories of equilibrium fïg-
by deleting the interval [ -1, l] from the com-
ures of a rotating homogeneous fluid mass,
plex plane, the Heine-Hobson definition of the revised translation, NASA, Washington D.C.,
associated Legendre function, 1970. (Original in Japanese, 1933.)
P;(z) = (z* - l)m’2 d” P,,/dz”, (25) [4] Y. Hagihara, Celestial mechanics, vol. 4,
pts. 1, 2, Japan Society for the Promotion of
instead of N. M. Ferrer& definition (19), and Science, 1975.
construct a solution of (18) in G:
[S] C. Flammer, Spheroidal wave functions,
KW= C .WYW Stanford Univ. Press, 1957.
[6] J. A. Stratton, P. M. Morse, L. J. Chu, J.
Il-n1 =even number, (26) D. C. Little, and F. J. Corbatb, Spheroidal
wave functions, including tables of separation
which is like (20) and again satisiïes the inte- constants and coefficients, Wiley, 1956.
gral equation (24). From this we cari obtain
the expansion formula

per(z)= fi (z2- 1)m’2 134 (XIV.3)


%mWrn Elliptic Functions

A. Elliptic Integrals
Il- n ( = even number. (27)
Let <p(z) be a polynomial in z of degree 3 or 4
Multiplying this by a constant, we defme with complex coefficients and R(z, w) a rational
function in z and w. Then R(z, m) is called
71(z’- y’*
.KW= 2 Zm an elliptic irrational function. An integral of the
J type j R dz is called an elliptic integral. The
origin of the name cornes from the integral
that appears in calculating the arc length of an
ellipse. Any elliptic integral cari be expressed
F” =(l+m)! A” by a suitable change of variables as a sum of
n,l (lwrn)! “31 elementary functions and elliptic integrals of
the following three kinds:
This expression asymptotically assumes the
form dz

je:(z) - sin(rcz - n7~/2)/rcz SJ (1 -z2)(1 -k*z*)’

1 - k2z2
for IzI » 1. In a similar manner we fïnd a ~ dz,
solution SF 1-z*

and
ne:(z)= -
J7L (z’ - l)m’*
2
Zrn

s (l-a2z2)
dz
(l-z*)(l-k*z*)

(- Appendix A, Table 16.1). These three kinds


of integrals are called elliptic integrals of the
having the asymptotic form
first, second, and third kind, respectively, in
ne:(z) - cos(fcz - n7c/2)/K-z. Legendre-Jacobi standard form. This classifica-

tion corresponds to that of †Abelian integrals. The constant $k$ is called the modulus of these elliptic integrals, and $a$ is called the parameter.

Let the four zeros of $\varphi(z)$ be $\alpha_1, \alpha_2, \alpha_3, \alpha_4$ (we take one of them as $\infty$ when $\varphi(z)$ is of degree 3). The †Riemann surface $\mathfrak{R}$ corresponding to the elliptic irrational function has the zeros $\alpha_1, \alpha_2, \alpha_3, \alpha_4$ as †branch points with degree of ramification 1, and is of two sheets and of †genus 1. If the integrand does not have a pole with a †residue, then the integral is multivalued only because the value (called the periodicity modulus) of the integral taken along the normal section (basis of the homology group) is not equal to zero.

B. Elliptic Integrals of the First Kind

When $R$ is a function without singularities other than branch points, only the topological structure of $\mathfrak{R}$ gives rise to the multivaluedness of the integral of $R$. The standard form is

$F(z) = \displaystyle\int_0^z \frac{dz}{\sqrt{(1-z^2)(1-k^2z^2)}} = \int_0^\varphi \frac{d\varphi}{\sqrt{1-k^2\sin^2\varphi}} = F(k, \varphi),$  (1)

where $z = \sin\varphi$. This integral is the inverse function of $\operatorname{sn} w$ (→ Section J). The periodicity moduli are $2iK'$ and $4K$, where

$K = K(k) = \displaystyle\int_0^1 \frac{dz}{\sqrt{(1-z^2)(1-k^2z^2)}} = \int_0^{\pi/2} \frac{d\varphi}{\sqrt{1-k^2\sin^2\varphi}},$
$K' = K(k'), \quad k'^2 = 1 - k^2.$

We call $K(k)$ a complete elliptic integral of the first kind and $F(k, \varphi)$ an incomplete elliptic integral of the first kind (→ Appendix A, Table 16). Setting

$\sin\varphi_1 = \dfrac{(1+k')\sin\varphi\cos\varphi}{\sqrt{1-k^2\sin^2\varphi}}, \quad k_1 = \dfrac{1-k'}{1+k'},$

we have the relation

$F(k, \varphi) = (1+k_1)\,F(k_1, \varphi_1)/2,$  (2)

which is called Landen's transformation. Since $k_1 < k$ when $0 < k < 1$, this transformation reduces the calculation of elliptic integrals to those with smaller values of $k$. For two given positive numbers $a$ and $b$, put $a_0 = a$, $b_0 = b$, and $a_{n+1} = (a_n + b_n)/2$, $b_{n+1} = \sqrt{a_n b_n}$. Then the sequences $\{a_n\}$ and $\{b_n\}$ converge rapidly to the common limit, which is called the arithmetico-geometric mean of $a$ and $b$, and is denoted by $\operatorname{ag}(a, b)$. The complete elliptic integral satisfies the relation $K(k) = \pi / (2\operatorname{ag}(1, k'))$.

C. Elliptic Integrals of the Second Kind

When $R$ has poles with residue zero, its integral has no singularities other than poles. The standard form is

$F(z) = \displaystyle\int_0^z \sqrt{\frac{1-k^2z^2}{1-z^2}}\,dz = \int_0^\varphi \sqrt{1-k^2\sin^2\varphi}\,d\varphi = E(k, \varphi),$  (3)

where $z = \sin\varphi$. We have

$F(z) = \displaystyle\int_0^u \operatorname{dn}^2 u\,du = \frac{\Theta'(u)}{\Theta(u)} + \frac{E}{K}\,u$

if we set $z = \operatorname{sn} u$. Here, $\Theta$ is Jacobi's theta function, expressed through a theta function to be described in Section I, and $K$, $K'$ are the same as in the case of elliptic integrals of the first kind. The quantity

$E = \displaystyle\int_0^1 \sqrt{\frac{1-k^2z^2}{1-z^2}}\,dz = \int_0^{\pi/2} \sqrt{1-k^2\sin^2\varphi}\,d\varphi$

is called a complete elliptic integral of the second kind.

D. Elliptic Integrals of the Third Kind

When $R$ has poles with nonzero residue its integral has logarithmic singularities. In this case, residues also contribute to multivaluedness of the integral. The standard form is

$F(z) = \displaystyle\int_0^z \frac{dz}{(1-a^2z^2)\sqrt{(1-z^2)(1-k^2z^2)}} = \int_0^\varphi \frac{d\varphi}{(1-a^2\sin^2\varphi)\sqrt{1-k^2\sin^2\varphi}},$

and it is expressed as

$F(z) = \dfrac{\operatorname{sn}\alpha}{\operatorname{cn}\alpha\,\operatorname{dn}\alpha}\left(\dfrac{1}{2}\log\dfrac{\Theta(u-\alpha)}{\Theta(u+\alpha)} + u\,\dfrac{\Theta'(\alpha)}{\Theta(\alpha)}\right) + u$

if we set $z = \operatorname{sn} u$ and $a^2 = k^2\operatorname{sn}^2\alpha$ (→ Appendix A, Table 16).

E. Elliptic Functions and Periodic Functions

Historically the elliptic function was first introduced as the inverse function of the elliptic

integral. However, since it has been realized that elliptic functions are characterized as functions with double periodicity, it is now customary to define them as doubly periodic functions.

If $f(x)$, defined on a linear space $X$, satisfies the relation $f(x + \omega) = f(x)$ for some $\omega \in X$ and all $x \in X$, the number $\omega$ is called a period of $f(x)$, and $f(x)$ with a period other than zero is called a periodic function. The set $P$ of all periods of $f(x)$ forms an additive group contained in $X$. If a basis $\omega_1, \ldots, \omega_n$ of the additive group $P$ exists, its members are called fundamental periods of $f(x)$.

Any continuous, nonconstant, periodic function of a real variable has only one positive fundamental period and is called a simply periodic function. The †trigonometric functions are typical examples: $\sin x$ and $\cos x$ have the fundamental period $2\pi$; $\tan x$ and $\cot x$ have the fundamental period $\pi$ (→ 159 Fourier Series).

A single-valued nonconstant †meromorphic function of $n$ complex variables cannot have more than $2n$ fundamental periods that are linearly independent on the real number field. A function of one complex variable with two fundamental periods is called a doubly periodic function.

Let $\omega, \omega'$ be the fundamental periods of a doubly periodic function. For a given number $a$, the parallelogram with vertices $a$, $a + \omega$, $a + \omega'$, $a + \omega + \omega'$ is called the fundamental period parallelogram. The complex plane is covered with a network of congruent parallelograms, called period parallelograms, obtained by translating the fundamental period parallelogram through $m\omega + n\omega'$ ($m, n = 0, \pm 1, \pm 2, \ldots$).

A doubly periodic function $f(u)$ meromorphic on the complex plane is called an elliptic function. For simplicity, we usually denote the fundamental periods of an elliptic function by $2\omega_1$ and $2\omega_3$, and introduce $\omega_2$ defined by the relation $\omega_1 + \omega_2 + \omega_3 = 0$. The first, and therefore also higher, derivatives of any elliptic function are elliptic functions with the same periods. The set of all elliptic functions with the same periods forms a †field. The number of poles in a period parallelogram is finite. The sum of the orders of the poles is called the order of the elliptic function. An elliptic function with no poles in a period parallelogram is merely a constant (Liouville's first theorem). The sum of the residues of an elliptic function at its poles in any period parallelogram is zero (Liouville's second theorem). Hence there can be no elliptic function of order 1. An elliptic function of order $n$ assumes any value $n$ times in a period parallelogram (Liouville's third theorem). The sum of the zeros minus the sum of the poles is a period (Liouville's fourth theorem).

F. Weierstrass's Elliptic Functions

Weierstrass defined

$\wp(u) = \dfrac{1}{u^2} + {\sum}' \left\{\dfrac{1}{(u-\Omega)^2} - \dfrac{1}{\Omega^2}\right\}$

as the simplest kind of elliptic function. Here $\Omega = 2m\omega_1 + 2n\omega_3$, with $m, n$ integers. The summation $\sum'$ extends over all integral values (positive, negative, and zero) of $m$ and $n$, except for $m = n = 0$. $\wp(u)$ is an elliptic function of order 2 with periods $2\omega_1$ and $2\omega_3$, called a Weierstrass ℘-function. The following functions $\zeta(u)$ and $\sigma(u)$ are called the Weierstrass zeta and sigma functions, respectively:

$\zeta(u) = \dfrac{1}{u} + {\sum}' \left\{\dfrac{1}{u-\Omega} + \dfrac{1}{\Omega} + \dfrac{u}{\Omega^2}\right\}$

and

$\sigma(u) = u\,{\prod}' \left\{\left(1 - \dfrac{u}{\Omega}\right)\exp\left(\dfrac{u}{\Omega} + \dfrac{u^2}{2\Omega^2}\right)\right\}.$

These have quasiperiodicity, expressed by

$\zeta(u + 2\omega_i) = \zeta(u) + 2\eta_i,$  (4)
$\sigma(u + 2\omega_i) = -e^{2\eta_i(u+\omega_i)}\sigma(u),$  (5)
$\eta_1 + \eta_2 + \eta_3 = 0, \quad \eta_i = \zeta(\omega_i), \quad i = 1, 2, 3;$

and they satisfy the relations

$\wp(u) = -\zeta'(u)$  (6)

and

$\zeta(u) = \sigma'(u)/\sigma(u).$  (7)

The function $\wp(u)$ is an even function of $u$, and $\zeta(u)$ and $\sigma(u)$ are odd functions of $u$. By considering the integral $\int \zeta(u)\,du$ once around the boundary of a fundamental period parallelogram, we have

$\eta_1\omega_3 - \eta_3\omega_1 = \eta_2\omega_1 - \eta_1\omega_2 = \eta_3\omega_2 - \eta_2\omega_3 = \pm\dfrac{\pi}{2}\,i, \quad \operatorname{Im}(\omega_3/\omega_1) \gtrless 0,$  (8)

which is called the Legendre relation.

The derivative

$\wp'(u) = -2\sum 1/(u-\Omega)^3$

of a ℘-function is an elliptic function of order 3 and bears the following relation to $\wp(u)$:

$(\wp'(u))^2 = 4(\wp(u))^3 - g_2\wp(u) - g_3 = 4(\wp(u)-e_1)(\wp(u)-e_2)(\wp(u)-e_3),$
$g_2 = 60\,{\sum}' 1/\Omega^4, \quad g_3 = 140\,{\sum}' 1/\Omega^6,$
$e_i = \wp(\omega_i), \quad i = 1, 2, 3.$  (9)
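Relation (9) can be checked numerically from the defining lattice sums. The following is a minimal sketch under stated assumptions: the half-periods $\omega_1 = 1/2$, $\omega_3 = i/2$ (a square lattice), the symmetric truncation $|m|, |n| \le 40$ of every sum, and all function names are illustrative choices that do not come from the text. Because the sums are truncated, the two sides of (9) agree only approximately.

```python
# Hedged sketch: compare both sides of relation (9) using truncated lattice sums.
# The names lattice, wp, wp_prime, invariants, check_relation_9 are hypothetical.

def lattice(w1, w3, n):
    """Nonzero lattice points Omega = 2*m*w1 + 2*k*w3 with |m|, |k| <= n."""
    return [2 * m * w1 + 2 * k * w3
            for m in range(-n, n + 1)
            for k in range(-n, n + 1)
            if (m, k) != (0, 0)]

def wp(u, omegas):
    """Truncated P-function: 1/u^2 + sum' {1/(u - Omega)^2 - 1/Omega^2}."""
    return 1 / u**2 + sum(1 / (u - w)**2 - 1 / w**2 for w in omegas)

def wp_prime(u, omegas):
    """Truncated derivative: -2 * sum of 1/(u - Omega)^3, Omega = 0 included."""
    return -2 / u**3 - 2 * sum(1 / (u - w)**3 for w in omegas)

def invariants(omegas):
    """Truncated g2 = 60 sum' Omega^-4 and g3 = 140 sum' Omega^-6."""
    g2 = 60 * sum(1 / w**4 for w in omegas)
    g3 = 140 * sum(1 / w**6 for w in omegas)
    return g2, g3

def check_relation_9(u, w1=0.5, w3=0.5j, n=40):
    """Return (lhs, rhs, g2, g3) with lhs = P'(u)^2, rhs = 4 P^3 - g2 P - g3."""
    omegas = lattice(w1, w3, n)
    g2, g3 = invariants(omegas)
    p, dp = wp(u, omegas), wp_prime(u, omegas)
    return dp**2, 4 * p**3 - g2 * p - g3, g2, g3

if __name__ == "__main__":
    lhs, rhs, g2, g3 = check_relation_9(0.3 + 0.2j)
    # Relative mismatch between the two sides of (9); small but not zero,
    # since every sum is cut off at |m|, |k| <= n.
    print(abs(lhs - rhs) / abs(lhs))
```

For this particular lattice the truncated index set is invariant under $\Omega \to i\Omega$, so the computed $g_3$ vanishes up to rounding; a lattice without that symmetry would give a nonzero $g_3$ and exercise all three terms on the right-hand side.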

By differentiating this relation successively, we see that $\wp^{(n)}(u)$ is expressed as a polynomial in $\wp(u)$ if $n$ is an even number and as the product of $\wp'(u)$ and a polynomial in $\wp(u)$ if $n$ is an odd number.

In particular, writing $\wp(u)=z$ in (9) we find that the $\wp$-function is the inverse function of the elliptic integral

\[
u=\int_z^{\infty}\frac{dz}{\sqrt{4z^3-g_2z-g_3}}
\]

(→ Appendix A, Table 16.IV).

Any elliptic function can be expressed in terms of Weierstrass's functions. Specifically, let the poles of $f(u)$ and their orders be $a_1, a_2, \ldots, a_m$ and $h_1, h_2, \ldots, h_m$, respectively, and let the principal part in the expansion of $f(u)$ near the pole $a_k$ be

\[
\sum_{j=1}^{h_k}\frac{A_{kj}}{(u-a_k)^j},\qquad k=1,2,\ldots,m. \tag{10}
\]

Then we obtain

\[
f(u)=C+\sum_{k=1}^{m}\left[A_{k1}\,\zeta(u-a_k)+\sum_{j=2}^{h_k}\frac{(-1)^jA_{kj}}{(j-1)!}\,\wp^{(j-2)}(u-a_k)\right], \tag{11}
\]

where $C$ is a constant depending on $f(u)$. This can be reduced to

\[
f(u)=A+B\,\wp'(u)
\]

by using the addition theorems (→ Appendix A, Table 16) for $\wp$ and zeta functions, where $A$ and $B$ are rational functions of $\wp(u)$. Therefore, given two elliptic functions with the same periods, after expressing them as rational functions of $\wp$ and $\wp'$ in the above form and eliminating $\wp$ and $\wp'$, we obtain an algebraic equation with constant coefficients. In particular, for any elliptic function $f(u)$ we obtain an †algebraic differential equation of the first order by using this method, with $f'(u)$ an elliptic function with the same periods. Furthermore, the functions $f(u+v)$, $f(u)$, and $f(v)$ satisfy an algebraic equation. Thus for any elliptic function an algebraic addition theorem holds.

G. Elliptic Functions of the Second Kind

As an extension of the definition of elliptic functions, if a †meromorphic function $f$ satisfies the relations

\[
f(u+2\omega_1)=\mu_1f(u),\qquad f(u+2\omega_3)=\mu_3f(u) \tag{12}
\]

($\mu_1$ and $\mu_3$ are constants) with the fundamental periods $2\omega_1$, $2\omega_3$, we call $f(u)$ an elliptic function of the second kind. What we have called simply an elliptic function can now be called an elliptic function of the first kind. For constants $\rho$ and $v$, the function

\[
f(u)=e^{\rho u}\,\frac{\sigma(u-v)}{\sigma(u)} \tag{13}
\]

is an example of an elliptic function of the second kind. In this case

\[
\mu_i=e^{2\rho\omega_i-2v\eta_i},\qquad i=1,3. \tag{14}
\]

Furthermore, for given constants $\mu_1$ and $\mu_3$, any elliptic function of the second kind is expressed as the product of an elliptic function of the first kind and the function (13) with $\rho$ and $v$ determined by (14).

H. Elliptic Functions of the Third Kind

If a meromorphic function $f$ satisfies

\[
f(u+2\omega_i)=e^{a_iu+b_i}f(u),\qquad i=1,3 \tag{15}
\]

($a_i$ and $b_i$ are constants) with periods $2\omega_1$, $2\omega_3$, we call it an elliptic function of the third kind (→ 3 Abelian Varieties I).

The Weierstrass sigma function $\sigma(u)$ is an example of an elliptic function of the third kind. The functions $\sigma_1$, $\sigma_2$, and $\sigma_3$, defined by the equations

\[
\sigma_i(u)=-\frac{e^{\eta_iu}\,\sigma(u-\omega_i)}{\sigma(\omega_i)},\qquad i=1,2,3, \tag{16}
\]

are also elliptic functions of the third kind. In the case of elliptic functions of the second and third kinds, $2\omega_1$ and $2\omega_3$ are not periods in the strict sense defined earlier, but are conveniently referred to as the periods. The functions $\sigma_i$ ($i=1,2,3$) are called cosigma functions.

I. Theta Functions

The theta functions, or more strictly, elliptic theta functions, are defined by

\[
\begin{aligned}
\vartheta_1(v,\tau)&=2\sum_{n=0}^{\infty}(-1)^nq^{(n+1/2)^2}\sin(2n+1)\pi v,\\
\vartheta_2(v,\tau)&=2\sum_{n=0}^{\infty}q^{(n+1/2)^2}\cos(2n+1)\pi v,\\
\vartheta_3(v,\tau)&=1+2\sum_{n=1}^{\infty}q^{n^2}\cos 2n\pi v,\\
\vartheta_4(v,\tau)&=1+2\sum_{n=1}^{\infty}(-1)^nq^{n^2}\cos 2n\pi v,
\end{aligned} \tag{17}
\]

where $q=e^{i\pi\tau}$, $\operatorname{Im}\tau>0$. We call (17) the $q$-expansion formulas of the theta functions, and we sometimes write $\vartheta_0$ in place of $\vartheta_4$. A theta function is an elliptic function of the third kind with periods 1 and $\tau$. Any elliptic function can be expressed as a quotient of theta functions (→ Appendix A, Table 16 for specific examples). The $q$-expansion formula is quite suitable for numerical computation because of its rapid convergence; its terms decrease as the $n^2$ powers of $q$ as $n\to\infty$.

An elliptic function with the fundamental periods $2\omega_1$ and $2\omega_3$ can also be viewed as having the periods $2\omega_1'=2\omega_3$, $2\omega_3'=-2\omega_1$. Consequently, theta functions formed with the parameter $\tau=\omega_3/\omega_1$ can be expressed in terms of the parameter $\tau'=\omega_3'/\omega_1'=-\omega_1/\omega_3=-1/\tau$, and we have

\[
\begin{aligned}
\vartheta_1(v,\tau)&=iA\,\vartheta_1(v/\tau,-1/\tau),\\
\vartheta_2(v,\tau)&=A\,\vartheta_4(v/\tau,-1/\tau),\\
\vartheta_3(v,\tau)&=A\,\vartheta_3(v/\tau,-1/\tau),\\
\vartheta_4(v,\tau)&=A\,\vartheta_2(v/\tau,-1/\tau),\\
A&=\sqrt{i/\tau}\,\exp(-\pi iv^2/\tau).
\end{aligned} \tag{18}
\]

These are called Jacobi's imaginary transformations. If $\operatorname{Im}(-1/\tau)\gg 1$, then $|q|\approx 1$, and therefore the series in the $q$-expansion formula converges slowly. However, by an imaginary transformation we get $\operatorname{Im}(\tau')\gg 1$ and $|q'|\approx 0$ (where $q'=e^{i\pi\tau'}$), so that computations become much easier.

Each of the theta functions satisfies, as a function of two variables $v$ and $\tau$, the following partial differential equation of the heat-conduction type:

\[
\frac{\partial^2\vartheta(v,\tau)}{\partial v^2}=4\pi i\,\frac{\partial\vartheta(v,\tau)}{\partial\tau} \tag{19}
\]

(→ Appendix A, Table 16.II).

J. Jacobi's Elliptic Functions

C. G. J. Jacobi defined elliptic functions as inverse functions of elliptic integrals of the first kind in the Legendre-Jacobi standard form (1). They are, in the above notation,

\[
\operatorname{sn}w=\sqrt{e_1-e_3}\,\frac{\sigma(u)}{\sigma_3(u)}=\frac{\vartheta_3(0)\,\vartheta_1(v)}{\vartheta_2(0)\,\vartheta_4(v)}, \tag{20}
\]
\[
\operatorname{cn}w=\frac{\sigma_1(u)}{\sigma_3(u)}=\frac{\vartheta_4(0)\,\vartheta_2(v)}{\vartheta_2(0)\,\vartheta_4(v)}, \tag{21}
\]
\[
\operatorname{dn}w=\frac{\sigma_2(u)}{\sigma_3(u)}=\frac{\vartheta_4(0)\,\vartheta_3(v)}{\vartheta_3(0)\,\vartheta_4(v)}, \tag{22}
\]

where $w=\sqrt{e_1-e_3}\,u$ and $v=u/2\omega_1$.

These functions satisfy the relations

\[
\operatorname{sn}^2w+\operatorname{cn}^2w=1,\qquad k^2\operatorname{sn}^2w+\operatorname{dn}^2w=1, \tag{23}
\]

where

\[
k^2=\frac{e_2-e_3}{e_1-e_3}=\frac{(\vartheta_2(0))^4}{(\vartheta_3(0))^4}. \tag{24}
\]

The constants $k$ and $k'=\sqrt{1-k^2}$ are called the modulus and the complementary modulus, respectively. Furthermore, the relation $d\operatorname{sn}w/dw=\operatorname{cn}w\operatorname{dn}w$ holds. The function $z=\operatorname{sn}w$ is the inverse function of the elliptic integral (1) (→ Appendix A, Table 16.III).

References

[1] S. Lang, Elliptic functions, Addison-Wesley, 1973.
[2] D. W. Masser, Elliptic functions and transcendence, Springer, 1975.
[3] C. G. J. Jacobi, Fundamenta nova theoriae functionum ellipticarum (1829) (C. G. J. Jacobis Gesammelte Werke 1, G. Reimer, 1881, 49-239).
[4] G. H. Halphen, Traité des fonctions elliptiques et de leurs applications I, II, III, Gauthier-Villars, 1886-1891.
[5] A. Hurwitz and R. Courant, Vorlesungen über allgemeine Funktionentheorie und elliptische Funktionen, Springer, third edition, 1929.
[6] H. E. Rauch and A. Lebowitz, Elliptic functions, theta functions and Riemann surfaces, Williams & Wilkins, 1973.
[7] H. Hancock, Elliptic integrals, Dover, 1958.

135 (II.3)
Equivalence Relations

A. General Remarks

Suppose that we are given a relation $R$ between elements of a set $X$ such that for any elements $x$ and $y$ of $X$, either $xRy$ or its negation holds. The relation $R$ is called an equivalence relation (on $X$) if it satisfies the following three conditions: (1) $xRx$, (2) $xRy$ implies $yRx$, and (3) $xRy$ and $yRz$ imply $xRz$. Conditions (1), (2), and (3) are called the reflexive, symmetric, and transitive laws, respectively. Together, they are called the equivalence properties. Condition (1) can be replaced by the following: (1') For each $x$ there exists an $x'$ such that $xRx'$. The relation "$x$ is equal to $y$" is an equivalence relation. If $xRy$ means that $x$ and $y$ are in $X$, then $R$ is also an equivalence relation. An equivalence relation is often denoted by the symbol $\sim$. The relations of congruence and similarity between figures are equivalence relations. If $X$ is the set of integers and $x\equiv y$ means that $x-y$ is even, then the relation $\equiv$ is an equivalence relation.
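The three laws can be tested mechanically on a finite sample of elements. The sketch below is an illustration added here, not part of the original text; the helper name `is_equivalence` and the sample range are assumptions made for the example. It checks the parity relation on the integers described above and, for contrast, the order relation $\le$, which is reflexive and transitive but not symmetric.

```python
from itertools import product


def is_equivalence(elements, related):
    """Check the reflexive, symmetric, and transitive laws on a finite sample."""
    elements = list(elements)
    reflexive = all(related(x, x) for x in elements)
    symmetric = all(related(y, x)
                    for x, y in product(elements, repeat=2)
                    if related(x, y))
    transitive = all(related(x, z)
                     for x, y, z in product(elements, repeat=3)
                     if related(x, y) and related(y, z))
    return reflexive and symmetric and transitive


def same_parity(x, y):
    """x is related to y when x - y is even."""
    return (x - y) % 2 == 0


def less_or_equal(x, y):
    """The usual order on integers: reflexive and transitive, but not symmetric."""
    return x <= y


sample = range(-4, 5)
print(is_equivalence(sample, same_parity))
print(is_equivalence(sample, less_or_equal))
```

A finite check of this kind can only refute the laws on the sample; it does not prove them for the whole set of integers.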
531 136 B
Ergodic Theory

B. Equivalence Classes and Quotient Sets

Let $R$ be an equivalence relation. "$xRy$" is read: "$x$ and $y$ are equivalent" (or "$x$ is equivalent to $y$"). The subset of $X$ consisting of all elements equivalent to an element $a$ is called the equivalence class of $a$. By (1), (2), and (3), each equivalence class is nonempty, the equivalence class of $a$ contains $a$, and different equivalence classes do not overlap. Namely, $X$ is decomposed into a †disjoint union of equivalence classes. This †partition is called the classification of $X$ with respect to $R$. For example, the set of integers is classified into the equivalence class of even numbers and that of odd numbers by the relation $\equiv$. Conversely, since the relation "$x$ and $y$ belong to the same member of a partition" is an equivalence relation, we can regard any partition as a classification. An element chosen from an equivalence class is called a representative of the equivalence class. In the example we can take 0 and 1 as the representatives of equivalence classes of even and odd numbers, respectively. $X/R$ denotes the set of equivalence classes of $X$ with respect to $R$, and is called the quotient set of $X$ with respect to $R$. The mapping $p: X\to X/R$ that carries $x$ in $X$ into the equivalence class of $x$ is called the canonical surjection (or projection). The idea of equivalence relations can be generalized to deal with the case when $X$ is a †class.

C. Stronger and Weaker Equivalence Relations

Let $R$ and $S$ be two equivalence relations on $X$. If $xRy$ always implies $xSy$, then we say that $R$ is stronger than $S$, $S$ is weaker than $R$, the classification with respect to $R$ is finer than the one with respect to $S$, or the classification with respect to $S$ is coarser than the one with respect to $R$. The relations "$x$ is equal to $y$" and "$x$ and $y$ are in $X$" are the strongest and the weakest equivalence relations, respectively. The equivalence relations on $X$ are partially ordered by their strength, and the set of equivalence relations on $X$ forms a †complete lattice with respect to this ordering.

References

[1] N. Bourbaki, Eléments de mathématique I, Théorie des ensembles, ch. 2, Actualités Sci. Ind., 1212c, Hermann, second edition, 1960; English translation, Theory of sets, Addison-Wesley, 1968.
Also → references to 381 Sets.

136 (XVII.15)
Ergodic Theory

A. General Remarks

The origin of ergodic theory was the so-called ergodic hypothesis, which provided the foundation for classical statistical mechanics as created by L. Boltzmann and J. Gibbs toward the end of the 19th century (→ 402 Statistical Mechanics). Attempts by various mathematicians to give a rigorous proof of the hypothesis resulted in the recurrence theorem of H. Poincaré and C. Carathéodory and the ergodic theorems of G. D. Birkhoff and J. von Neumann, which marked the beginnings of ergodic theory as we know it today. As the theory developed it acquired close relationships with other branches of mathematics, for example, the theory of dynamical systems, probability theory, functional analysis, number theory, differential topology, and differential geometry.

The principal object of modern ergodic theory is to study properties of †measurable transformations, particularly transformations with an invariant measure. In most cases, the transformations studied are defined on a Lebesgue measure space with a finite (or $\sigma$-finite) measure. A Lebesgue measure space with a finite measure ($\sigma$-finite measure) is a †measure space that is measure-theoretically isomorphic to a bounded interval (to the real line) with the usual †Lebesgue measure, possibly together with an at most countable number of atoms. It is known that any separable complete †metric space with a complete regular Borel †probability measure is a Lebesgue measure space with a finite measure. We assume, unless stated otherwise, that the measure space $(X,\mathcal{B},m)$ is a Lebesgue measure space. All the subsets of $X$ mentioned are assumed measurable, and a pair of sets or functions that coincide almost everywhere are identified. We use the abbreviation "a.e." to denote "†almost everywhere."

B. Ergodic Theorems

Let $(X,\mathcal{B},m)$ be a $\sigma$-finite measure space. A transformation $\varphi$ defined on $X$ is called measurable if for every $B\in\mathcal{B}$, $\varphi^{-1}(B)\in\mathcal{B}$. A †bijective transformation $\varphi$ on $X$ is called bimeasurable if both $\varphi$ and $\varphi^{-1}$ are measurable. A measurable transformation $\varphi$ is called measure-preserving (or equivalently, the measure $m$ is invariant under $\varphi$) if $m(\varphi^{-1}(B))=m(B)$ holds for every $B$. It is called nonsingular if $m(\varphi^{-1}(B))=0$ whenever $m(B)=0$, and ergodic

if $m((\varphi^{-1}(B)\cup B)-(\varphi^{-1}(B)\cap B))=0$ implies either $m(B)=0$ or $m(X-B)=0$.

The mean ergodic theorem of von Neumann (Proc. Nat. Acad. Sci. US, 18 (1932)) states that if $\varphi$ is a measure-preserving transformation on $(X,\mathcal{B},m)$, then for every function $f$ belonging to the †Hilbert space $L_2(X)=L_2(X,\mathcal{B},m)$ (→ 168 Function Spaces), the sequence

\[
A_nf(x)=\frac{1}{n}\sum_{k=0}^{n-1}f(\varphi^kx),\qquad n=1,2,\ldots,
\]

converges in the †norm of $L_2(X)$ as $n\to\infty$ to a function $f^*$ that satisfies $f^*(\varphi x)=f^*(x)$ a.e. The individual (or pointwise) ergodic theorem of Birkhoff (Proc. Nat. Acad. Sci. US, 17 (1931)) states that for every $f$ belonging to the †function space $L_1(X)$, the sequence $A_nf(x)$ converges a.e. to $f^*$. From either of these theorems it follows that for any set $E$ satisfying $\varphi^{-1}(E)=E$ and $m(E)<\infty$, the limit function $f^*$ satisfies $\int_Ef^*\,dm=\int_Ef\,dm$. In particular, if $m(X)=1$ and $\varphi$ is ergodic, then the limit $f^*$ equals the constant $\int f\,dm$ a.e. This fact therefore gives a mathematical justification to the ergodic hypothesis, which states that the "time mean" $\bigl(\sum_{k=0}^{n-1}f(\varphi^kx)\bigr)/n$ of what is observable over a sufficiently long time can be replaced by the "phase mean" $\int f\,dm$. Both von Neumann's theorem and Birkhoff's theorem were subsequently generalized in various directions by many authors.

(1) Mean ergodic theorems are concerned with the †strong convergence of the sequence of averages $A_n=\bigl(\sum_{k=0}^{n-1}T^k\bigr)/n$ of the iterates of a †bounded linear operator $T$ on some †Banach space. A generalization of von Neumann's theorem due to F. Riesz, K. Yosida, and S. Kakutani dispenses with the assumptions that the linear operator $T$ is induced by a measure-preserving transformation $\varphi$ and that $T$ acts on the Hilbert space $L_2(X)$. A version of this generalization states that if a linear operator $T$ defined on a Banach space $\mathfrak{X}$ satisfies the conditions

then for an element $f\in\mathfrak{X}$ the sequence of averages $A_nf$ converges strongly to an element $f^*\in\mathfrak{X}$ if and only if there exists a subsequence converging †weakly to $f^*$. From this theorem of Riesz, Yosida, and Kakutani follows the $L_p$-mean ergodic theorem ($1\le p<\infty$) for so-called Markov operators: If $T$ is a linear operator defined on each of the Banach spaces $L_p(X)$ ($1\le p<\infty$) by means of the formula $Tf(x)=\int f(y)P(x,dy)$, where $P(x,B)$ is the †transition probability of a †Markov process on $(X,\mathcal{B})$ leaving the measure $m$ invariant (i.e., $\int P(x,B)\,dm=m(B)$ for every $B\in\mathcal{B}$) (→ 261 Markov Processes), then for every $f$ belonging to $L_p(X)$ ($1\le p<\infty$) the sequence $A_nf$ converges in the norm of $L_p(X)$ to a limit function $f^*$.

(2) Birkhoff's ergodic theorem has been extended to the following individual ergodic theorem by E. Hopf (1954): If $T$ is a †positive linear operator mapping $L_1(X)$ into $L_1(X)$ and $L_\infty(X)$ into $L_\infty(X)$ with $\|T\|_1\le 1$ and $\|T\|_\infty\le 1$, then for every $f$ in $L_1(X)$ the sequence $A_nf$ converges a.e. to a limit $f^*$. If $T$ is a Markov operator, then $T$ maps each $L_p(X)$ into itself and satisfies $\|T\|_p\le 1$ for each $p$ ($1\le p\le\infty$), and therefore Hopf's ergodic theorem applies to such $T$. Special cases of this theorem were proved earlier by J. Doob and by Kakutani. Later, N. Dunford and J. Schwartz showed that the assumption of the positivity of $T$ can be dispensed with in Hopf's theorem. For a positive linear operator $T$ on $L_1(X)$ satisfying $\|T\|_1\le 1$, R. Chacon and D. Ornstein (1960) proved that the ratio ergodic theorem holds: For every pair of functions $f$ and $g$ in $L_1(X)$ with $g\ge 0$ a.e., $\lim_{n\to\infty}\sum_{k=0}^{n}T^kf(x)\big/\sum_{k=0}^{n}T^kg(x)$ exists and is finite a.e. on the set $\{x\mid\sum_{k=0}^{\infty}T^kg(x)>0\}$. This theorem extends earlier results of Hopf and W. Hurewicz dealing with special classes of operators arising from measurable transformations. Hopf's ergodic theorem can be deduced from the Chacon-Ornstein theorem, while it is known that there are positive operators $T$ on $L_1(X)$ satisfying $\|T\|_1\le 1$ for which $\lim_{n\to\infty}A_nf$ fails to exist on a set of positive measure for some $f\in L_1(X)$. This shows that the assumption $\|T\|_\infty\le 1$ is crucial in Hopf's theorem.

(3) As was the case in the original proof by Birkhoff of his ergodic theorem, every known proof of an individual ergodic theorem depends crucially on the so-called maximal ergodic lemma (or maximal inequality). For the case of a positive linear operator $T$ on $L_1(X)$ with $\|T\|_1\le 1$, Hopf proved the relevant maximal ergodic lemma: If $E(f)$ is the set $\{x\mid\sup_{n\ge 1}A_nf(x)>0\}$ for each $f$ in $L_1(X)$, then $\int_{E(f)}f\,dm\ge 0$. Hopf's original proof of this lemma was quite intricate, but A. Garsia (1965) succeeded in giving an extremely simple and elegant proof. From the maximal ergodic lemma the following so-called dominated ergodic theorem can also be obtained: Let $T$ be a linear operator mapping each $L_p(X)$ into $L_p(X)$ satisfying $\|T\|_p\le 1$ for each $p$ ($1\le p\le\infty$). If for $f$ in $L_p(X)$ we let $\bar f(x)=\sup_{n\ge 1}|A_nf(x)|$, then if $1<p<\infty$ we have

while if $p=1$ and $m(X)<\infty$ we have

\[
\int_X\bar f(x)\,dm\le 2\left[m(X)+\int_X|f(x)|\log^+|f(x)|\,dm\right].
\]

This theorem was obtained first by N. Wiener (1939) for the special case of $T$ induced by a measure-preserving transformation. M. Akcoglu (1975) proved an isometric "dilation" theorem for positive contractions (i.e., positive linear operators $T$ for which $\|T\|\le 1$) in the space $L_p(X)$ for $1<p<\infty$, by means of which he was able to deduce a dominated ergodic theorem for an arbitrary positive contraction in $L_p(X)$ from the corresponding theorem for a positive linear isometry in $L_p(X)$ proved earlier by A. Ionescu-Tulcea. From this theorem of Akcoglu one can obtain the following individual ergodic theorem: If $T$ is a positive contraction in $L_p(X)$ for some $p$ with $1<p<\infty$, then for every $f$ in $L_p(X)$, the sequence $A_nf$ converges a.e.

(4) Both mean and individual ergodic theorems can be extended without difficulty to a continuous time parameter semigroup $\{T_t\mid t\ge 0\}$ of bounded linear operators such that $T_tT_s=T_{t+s}$ ($T_0=I$), under a suitable continuity assumption on $T_t$ with respect to $t$, by replacing the discrete time average $\bigl(\sum_{k=0}^{n-1}T^k\bigr)/n$ with $\bigl(\int_0^tT_s\,ds\bigr)/t$. Further extensions to $n$-parameter semigroups were obtained by N. Wiener and by N. Dunford and A. Zygmund. For mean ergodic theorems, even further extensions were possible to †amenable semigroups of bounded linear operators. For 1-parameter semigroups, the behavior of the mean at zero, $\bigl(\int_0^tT_s\,ds\bigr)/t$ as $t\downarrow 0$ (local ergodic theorem) or $\lambda\int_0^\infty e^{-\lambda s}T_sf\,ds$ as $\lambda\downarrow 0$ or $\lambda\uparrow\infty$ (Abelian ergodic theorem), has also been investigated by Wiener, U. Krengel, E. Hille, Yosida, and others. Abelian ergodic theorems are related to properties of the †resolvent of the semigroup $\{T_t\}$ (→ 378 Semigroups of Operators and Evolution Equations). Further extensions of all these theorems in various directions have been given by many authors; for these extensions and related topics → [5-7].

(5) J. F. C. Kingman (1968) proved an interesting and useful extension of (both mean and individual) ergodic theorems, called the subadditive ergodic theorem. A real-valued †stochastic process $\{X_{i,k}\mid 0\le i<k,\ k=1,2,3,\ldots\}$ is called a subadditive process if it satisfies the following conditions: (i) Whenever $i<j<k$, $X_{i,k}\le X_{i,j}+X_{j,k}$. (ii) The †joint distribution of $\{X_{i+1,j+1}\}$ is the same as that of $\{X_{i,j}\}$. (iii) The †expectation $g_k=E(X_{0,k})$ exists, and satisfies $g_k\ge -Ak$ for some constant $A$ and for all $k\ge 1$. Kingman proved that if $\{X_{i,k}\}$ is a subadditive stochastic process, then the limit $\xi=\lim_{n\to\infty}X_{0,n}/n$ exists a.e. and in the mean, and $E(\xi)=\lim_{n\to\infty}(1/n)g_n$. Note that if $\varphi$ is a measure-preserving transformation on a finite measure space $(X,\mathcal{B},m)$ and if $f$ is an element of $L_1(X)$, then $X_{i,k}=\sum_{j=i}^{k-1}f\circ\varphi^j$ for $0\le i<k$, where $k=1,2,3,\ldots$, defines a subadditive process (in fact, $X_{i,k}=X_{i,j}+X_{j,k}$ for $i<j<k$). The multiplicative ergodic theorem, proved by V. Oseledec, which plays a significant role in ergodic theory of †nonhyperbolic smooth dynamical systems and which has found applications also in †algebraic groups, can be derived from the subadditive ergodic theorem [8, 9].

C. Recurrence and Invariant Measures

In this section we assume that the measure space $(X,\mathcal{B},m)$ is nonatomic. A nonsingular measurable transformation $\varphi$ defined on $(X,\mathcal{B},m)$ is called recurrent (infinitely recurrent) if for every set $B$ and for almost all $x\in B$, there exists an $n\in\mathbf{Z}^+$ (infinitely many $n\in\mathbf{Z}^+$) such that $\varphi^n(x)\in B$. A set $W$ is called wandering under $\varphi$ if $\varphi^{-n}(W)\cap\varphi^{-k}(W)=\emptyset$ for $n\ne k$. The transformation $\varphi$ is called conservative if no sets of positive measure are wandering under $\varphi$, and incompressible if $B\supset\varphi^{-1}B$ implies $m(B-\varphi^{-1}B)=0$. The following statements about a nonsingular measurable transformation $\varphi$ are equivalent: (i) $\varphi$ is recurrent; (ii) $\varphi$ is infinitely recurrent; (iii) $\varphi$ is incompressible; (iv) $\varphi$ is conservative. An immediate consequence of this is the following recurrence theorem of Poincaré (in the form formulated by Carathéodory): A measure-preserving transformation on a finite measure space is infinitely recurrent. In fact, in order for a nonsingular measurable transformation $\varphi$ to be recurrent (and hence infinitely recurrent) it is sufficient that there exist a finite measure $\mu$ invariant under $\varphi$ and equivalent to (i.e., mutually †absolutely continuous with) the given measure $m$.

The invariant measure problem is one of the basic problems in ergodic theory and is formulated abstractly in the following way: Given a nonsingular measurable transformation $\varphi$ on a $\sigma$-finite measure space $(X,\mathcal{B},m)$, find necessary and sufficient conditions for the existence of a finite (or $\sigma$-finite) measure invariant under $\varphi$ and equivalent to $m$. The given measure $m$ specifies only the class of equivalent measures among which an invariant measure is to be found. Therefore we can assume without loss of generality that $m$ is a finite measure. For the remainder of this section, unless we explicitly state otherwise, we always mean by an invariant measure the one that is equivalent to $m$. The Poincaré recurrence theorem states that $\varphi$ being recurrent is necessary for the existence

of a finite invariant measure. The recurrence of $\varphi$ is, however, not sufficient, since an ergodic transformation with an infinite but $\sigma$-finite invariant measure is recurrent and has no finite invariant measure. A necessary and sufficient condition for the existence of a finite invariant measure was given by a theorem of A. Hajian and Kakutani (1964) which states: $\varphi$ has a finite invariant measure if and only if $\varphi$ has no weakly wandering sets (a set $W$ is called weakly wandering under $\varphi$ if there exists an infinite subset $\{n_k\}$ of $\mathbf{Z}^+$ such that $\varphi^{-n_k}W\cap\varphi^{-n_j}W=\emptyset$, $k\ne j$). Hajian also proved that a bimeasurable transformation $\varphi$ has a finite invariant measure if and only if $\varphi$ is strongly recurrent in the following sense: For every set $E$ with $m(E)>0$, there exists a positive integer $k=k(E)$ such that

\[
\max_{0\le j\le k}m(\varphi^{n+j}E\cap E)>0
\]

for every $n\in\mathbf{Z}$.

H. Furstenberg [11] obtained the following striking extension of the Poincaré recurrence theorem: If a bimeasurable transformation $\varphi$ possesses a finite invariant measure, then for every set $E$ with $m(E)>0$ and for every integer $k\ge 2$, there exists $n\ge 1$ such that $m(E\cap\varphi^nE\cap\varphi^{2n}E\cap\cdots\cap\varphi^{(k-1)n}E)>0$. From this theorem one can deduce a difficult theorem of E. Szemerédi on †arithmetic progressions which states: Any subset of the integers having positive †upper density contains arithmetic progressions of arbitrary length. In fact, it is not difficult to show that the theorems of Furstenberg and of Szemerédi are mutually equivalent, and for this reason the theorem of Furstenberg is sometimes referred to as the ergodic Szemerédi theorem.

For a nonsingular bimeasurable transformation $\varphi$, a pair of sets $A$ and $B$ are said to be countably equivalent under $\varphi$ if there exist countable decompositions $\{A_k\mid k\in\mathbf{Z}^+\}$ and $\{B_k\mid k\in\mathbf{Z}^+\}$ for $A$ and $B$, respectively, and an infinite subset $\{n_k\}$ of $\mathbf{Z}$ such that $\varphi^{n_k}A_k=B_k$ for each $k$. $A$ and $B$ are said to be finitely equivalent under $\varphi$ if finite decompositions $\{A_k\}$ and $\{B_k\}$ can be chosen. It was proved by Hopf that (i) $\varphi$ is recurrent if and only if no set of positive measure is finitely equivalent under $\varphi$ to one of its proper subsets, and (ii) $\varphi$ has a finite invariant measure if and only if no set of positive measure is countably equivalent under $\varphi$ to one of its proper subsets. It can be shown that if $\varphi$ is ergodic, then $\varphi$ has no $\sigma$-finite invariant measure if and only if every pair of sets of positive measure are countably equivalent under $\varphi$.

The first example of a transformation admitting no $\sigma$-finite invariant measure was constructed by Ornstein in 1960. Since then, a number of simpler examples have been obtained by L. Arnold, Brunel, and others. It is now known that there are many different types of transformations having no $\sigma$-finite invariant measure (→ Section F). Furthermore, it was shown by Ionescu-Tulcea that in the group of all nonsingular bimeasurable transformations with a suitable metric, those having a $\sigma$-finite invariant measure form a subset of the †first category.

Various extensions of results on the invariant measure problem for nonsingular transformations to the case of †Markov processes having nonsingular transition probabilities were obtained by Y. Ito, J. Neveu, S. Foguel, and others [12].

For investigation of detailed properties of particular transformations arising from classical dynamical systems, problems in number theory, and so on, rather than the existence of invariant measures it is more important to determine a specific form of an invariant measure with desirable properties and to develop methods to decide when such a measure is unique. Various people have considered special classes of transformations; these workers have been able to obtain explicit descriptions of invariant measures with nice properties for the transformations in question and have derived interesting consequences. Most important among these is the so-called Gibbs measure, introduced and investigated by Ya. Sinai. He obtained this notion by generalizing the concept of the equilibrium Gibbs distribution, which plays a prominent role in †statistical mechanics. It is defined in the following way: Let $X$ be a compact metric space, $\varphi$ a homeomorphism on $X$, and $\mu_0$ a probability measure on $X$ invariant under $\varphi$. For a function $g$ belonging to $L_1(X)$ and for $m, n>0$ let

and form a sequence of probability measures $\mu_{m,n}(g)$ absolutely continuous with respect to $\mu_0$, for which the †Radon-Nikodym derivative

\[
\frac{d\mu_{m,n}(g)(x)}{d\mu_0(x)}=\frac{\exp\left[\sum_{k=-m}^{n}g(\varphi^kx)\right]}{\Xi_{m,n}(g\mid\mu_0)}
\]

holds. A measure that is a limit point, in the sense of †weak convergence, of the sequence of measures $\mu_{m,n}(g)$ is called a Gibbs measure constructed from $\mu_0$ and $g$. It is clear that a Gibbs measure is invariant under $\varphi$. For mixing topological Markov shifts (→ Section D), Sinai showed that if one starts with the invariant measure $\mu_0$ of maximal entropy (→ Section H), the existence of which was earlier shown by W. Parry, one can determine a class of functions $g$ for which the Gibbs measure

is unique, i.e., p(g)=lim,,.,,pL,,Jg) exists. is called monothetic if it has a topological


Sinaï further investigated the properties of the generator.
unique measure ,n(g) in detail. R. Bowen and (2) If <pis a group tendomorphism of a com-
D. Ruelle, detïning the Gibbs measure some- pact Abelian group G, then q preserves the
what differently, investigated the existence Haar measure m. If <pis a group tautomor-
and uniqueness of such measures, thereby phism, thdn it is a bijective measure-preserving
recapturing the results of Sinaï in the case of transformation on (G, %?,m). A continuous
mixing topological Markov shifts. With the group automorphism cp induces a group auto-
aid of Markov partitions (- Section G), these morphism cp* of the character group G*.
results on Gibbs measures were carried over The measure-preserving transformation cp is
to the case of tAnosov and “taxiom A” diffeo- ergodic if and only if every character except
morphisms, and they provided essential tools the identity has an infinite orbit under the
in the investigation of the ergodic behavior of induced automorphism ‘p*. When the group
these transformations. For details - [13,14]. is the n-dimensional torus T”, a continuous
The explicit form of the density of the in- group automorphism <p is uniquely repre-
variant measure with respect to the Lebesgue sented by an n x n matrix with integer entries
measure for the transformation associated and with determinant kl. In this case, q is
with tcontinued fraction expansion was al- ergodic if and only if no roots of unity appear
ready known to Gauss, and one cari draw among the teigenvalues of the representing
from it numerous conclusions about metric matrix.
properties of continued fractions. The way in (3) Let (Y, .zZ) be a measurable space, let
which Gauss was able to determine this invar- (Y,, ZZZJ=(Y, &) for each n E Z, and define
iant measure, however, was never explained. (Y *, a*) to be the tproduct measurable space
Recently, Sh. Ito, H. Nakada, and S. Tanaka (&z Y,, &,54,). The transformation cp
(Keio Eng. Rep., 30 (1977)) developed an inter- defined on (Y*, J&‘*) by
esting method to describe the mechanism used
to arrive at this density function for the invar-
iant measure for the continued fraction trans-
formation. They employed similar methods in
with y;=~,,+~ for each n is called the shift
subsequent work to determine explicit density
transformation. Let p be a probability measure
functions for invariant measures for other
on (Y, .r4) such that (Y, &, p) is a Lebesgue
related number-theoretic transformations and
measure space, let p, = p for each n, and detïne
for certain classes of continuous mappings
p* to be the tproduct measure I’IntzpL, on
over an interval; by means of the explicit forms
(Y*, &*). Then (Y*, &*, ,u*) is a Lebesgue
of invariant measures they were able to de-
measure space, and the shift transformation cp
scribe the metric properties of these transfor-
is a bijective measure-preserving transforma-
mations in detail.
tion. Considered with the product measure, cp
is called a generalized Bernoulli shift. When the
set Y is at most countable and the measure p
D. Examples and Construction of Measure- on Y is given by a sequence {pj} of positive
Preserving Transformations numbers with C pj = 1, <p is called a Bernoulli
shift. Suppose that P(y, A) is a Markov tran-
Examples of measure-preserving transforma- sition function on (Y, &) and n is a proba-
tions appear in many different contexts. We bility measure invariant under P(y, A). Then
describe some of the important ones. we cari detïne the Markov measure 7~* on the
(1) Let G be a +locally compact Abelian product space (Y*, &*) by setting
group satisfying the second axiom of tcounta-
bility (- 423 Topological Groups), IA a o- .*,,*,=f~~+....l.+,f~,n(dYII)P(Yo,~Yi)
algebra of Bore1 subsets of G, and m its +Haar
measure (normalized if G is compact). Then
x P(Y,,dY,)“‘P(Y,~,,dY,),
(G, g, m) is a Lebesgue measure space. For a
lïxed element go E G, delïne the transforma- for a cylinder set
tion Q:G+G by qgO(g)=g+gO. Then Vu, is
a bijective measure-preserving transforma-
tion on (G, &J, m) and is called the rotation
on G by the element go. If G is compact, then and extending it to a11 of d*. The shift trans-
the rotation Pu, is ergodic if and only if the formation <p preserves the Markov measure
cyclic subgroup generated by the element go is rr*. Considered as a measure-preserving trans-
dense in G. If this happens, the element go is formation on (Y*, d*, rc*), rp is called a Mar-
called the topological generator of G. A group kov shift.

A generalized Bernoulli shift is always ergodic. A Markov shift is ergodic if and only if the corresponding Markov process is †irreducible, which is the case if and only if the following property is satisfied: For every pair of sets $A$ and $B$ with $\pi(A)\pi(B) > 0$, there exists an $n \in \mathbf{Z}^+$ such that $\int_A P^n(y, B)\, d\pi > 0$.

There are other measures besides the product measure and the Markov measure that can be defined on the product space $(Y^*, \mathfrak{A}^*)$ and are invariant under the shift transformation $\varphi$. For example, if $Y$ is a Borel subset of $\mathbf{R}$ and $\mathfrak{A}$ is the $\sigma$-algebra of Borel subsets of $Y$, then any †stationary stochastic process taking values in $Y$ induces such a measure on $(Y^*, \mathfrak{A}^*)$. When considered with a measure of this type, the shift transformation $\varphi$ is called the shift associated with the stationary process. Properties of the shift associated with a †stationary Gaussian process have been investigated by G. Maruyama, I. Girsanov, H. Totoki, and others. In particular, it is known that the shift is ergodic if and only if the †spectral measure for the †covariance function of the associated Gaussian process is continuous [15].

(4) Other important examples of measure-preserving transformations arise from classical dynamical systems, which will be described in Section G.

(5) There are several ways of constructing new measure-preserving transformations from given ones. We describe important cases.

(i) Let $\varphi$ be a nonsingular, measurable, recurrent transformation (not necessarily measure-preserving) on a $\sigma$-finite measure space $(X, \mathfrak{B}, m)$, and let $A$ be a set of positive measure. For $x \in A$, let $n(x) = \min\{n \in \mathbf{Z}^+ \mid \varphi^n(x) \in A\}$. The transformation $\varphi_A: A \to A$ defined by $\varphi_A(x) = \varphi^{n(x)}(x)$ is a nonsingular measurable transformation on the measure space $(A, \mathfrak{B} \cap A, m_A)$, where $m_A(B) = m(A \cap B)/m(A)$, and it is measure-preserving if $\varphi$ is. We call $\varphi_A$ the transformation induced by $\varphi$ on $A$. It is ergodic if $\varphi$ is ergodic.

(ii) Let $\varphi$ be a nonsingular measurable transformation on a $\sigma$-finite measure space $(X, \mathfrak{B}, m)$, and suppose that $\{A_n\}$ is a countable (possibly finite) partition of $X$. Define a function $f: X \to \mathbf{Z}^+$ by setting $f(x) = n$ for $x \in A_n$, and let $\tilde{X} = \{(x, j) \mid x \in X,\ 1 \le j \le f(x)\}$ be a subspace of the product measure space $(X \times \mathbf{Z}^+, \mathfrak{B} \times \mathfrak{C}, m \times \mu)$, where $\mathfrak{C}$ is the $\sigma$-algebra of all subsets of $\mathbf{Z}^+$, and $\mu$ is the measure on $(\mathbf{Z}^+, \mathfrak{C})$ defined by $\mu(\{n\}) = 1$ for each $n \in \mathbf{Z}^+$. The transformation $\tilde{\varphi}$ defined on $\tilde{X}$ by $\tilde{\varphi}(x, j) = (x, j+1)$ if $1 \le j < f(x)$ and $= (\varphi(x), 1)$ if $j = f(x)$ is a nonsingular measurable transformation on the measure space $(\tilde{X}, (\mathfrak{B} \times \mathfrak{C}) \cap \tilde{X}, (m \times \mu)_{\tilde{X}})$, and it is measure-preserving if $\varphi$ is. We call $\tilde{\varphi}$ the transformation built from $\varphi$ with the ceiling function $f$. If $\varphi$ is ergodic and $m(A_n) \to 0$ as $n \to \infty$, then $\tilde{\varphi}$ is also ergodic.

(iii) Suppose that $\psi$ is a measure-preserving transformation on $(X, \mathfrak{B}, m)$ and $\varphi_x$ is a measure-preserving transformation on $(Y, \mathfrak{A}, \mu)$ for each $x \in X$. Assume that the mapping $(x, y) \to \varphi_x(y)$ is measurable with respect to the $\sigma$-algebras $\mathfrak{B} \times \mathfrak{A}$ and $\mathfrak{A}$. The transformation $\theta$ defined on the product space $(X \times Y, \mathfrak{B} \times \mathfrak{A}, m \times \mu)$ by $\theta(x, y) = (\psi(x), \varphi_x(y))$ is measure-preserving and is called the skew product of $\psi$ and $\{\varphi_x\}$. If $\varphi_x = \varphi$ for all $x \in X$, then we get a direct product transformation $(\psi \times \varphi)(x, y) = (\psi(x), \varphi(y))$.

(iv) A measure-preserving transformation $\varphi$ on $(Y, \mathfrak{A}, \mu)$ is said to be a factor transformation (or a homomorphic image) of a measure-preserving transformation $\psi$ on $(X, \mathfrak{B}, m)$ if there exists a measurable transformation $\eta$ from $X$ onto $Y$ such that $m \circ \eta^{-1} = \mu$ and $\varphi\eta = \eta\psi$. If $\mathfrak{B}'$ is a $\sigma$-subalgebra of $\mathfrak{B}$ and $\psi$ leaves $\mathfrak{B}'$ invariant (i.e., $\psi^{-1}\mathfrak{B}' \subset \mathfrak{B}'$), then $\psi$ induces a factor transformation $\varphi$ on the measure space $(X, \mathfrak{B}', m)$. Conversely, if a measure-preserving transformation $\varphi$ on $(Y, \mathfrak{A}, \mu)$ is a factor transformation of $\psi$ on $(X, \mathfrak{B}, m)$ via a mapping $\eta$, then $\mathfrak{B}' = \eta^{-1}\mathfrak{A}$ is a $\sigma$-subalgebra of $\mathfrak{B}$ invariant under $\psi$.

(6) A one-parameter family $\{\varphi_t \mid t \in \mathbf{R}\}$ of bijective measure-preserving transformations on a measure space $(X, \mathfrak{B}, m)$ is called a flow. A flow is called continuous if the mapping $t \to T_t$ is †weakly continuous, where $\{T_t\}$ is the one-parameter family of †unitary operators on $L^2(X)$ induced by the flow $\{\varphi_t\}$. A flow is called measurable if the mapping $(t, x) \to \varphi_t(x)$ is a measurable transformation of $\mathbf{R} \times X$ into $X$. A measurable flow is continuous. A. Vershik and Maruyama proved that for any continuous flow there exists a measurable flow, unique in a specified sense, which is spatially isomorphic (in the sense specified in Section E) to the given flow.

Important examples of flows are given by classical dynamical systems (→ Section G), and by continuous-time stationary stochastic processes.

An important tool in the study of flows is provided by the theorem of W. Ambrose and Kakutani: Every measurable ergodic flow without a fixed point is spatially isomorphic to an S-flow. A measurable flow $\{\varphi_t\}$ is called an S-flow (special flow or flow built under a function) if there exist a measure-preserving transformation $\varphi$ of a measure space $(X, \mathfrak{B}, m)$ and an $\mathbf{R}^+$-valued function $f$ on $(X, \mathfrak{B}, m)$ such that each $\varphi_t$ is a measure-preserving transformation on the subspace $X^f = \{(x, u) \mid x \in X,\ 0 \le u < f(x)\}$ of the product measure space
$(X \times \mathbf{R}^+, \mathfrak{B} \times \mathfrak{B}_+, m \times \lambda)$ given by

$$\varphi_t(x, u) = \begin{cases} (x,\ u + t) & \text{if } -u \le t < -u + f(x), \\[4pt] \left(\varphi^n(x),\ u + t - \sum_{k=0}^{n-1} f(\varphi^k(x))\right) & \text{if } -u + \sum_{k=0}^{n-1} f(\varphi^k(x)) \le t < -u + \sum_{k=0}^{n} f(\varphi^k(x)), \\[4pt] \left(\varphi^{-n}(x),\ u + t + \sum_{k=-n}^{-1} f(\varphi^k(x))\right) & \text{if } -u - \sum_{k=-n}^{-1} f(\varphi^k(x)) \le t < -u - \sum_{k=-n+1}^{-1} f(\varphi^k(x)), \end{cases}$$

$n \ge 1$. Here $\mathfrak{B}_+$ is the $\sigma$-algebra of Borel subsets of $\mathbf{R}^+$, and $\lambda$ is the usual Lebesgue measure.


E. Isomorphism Problems

In this section we assume that the Lebesgue measure spaces considered are probability spaces. For simplicity, following the common usage among Russian mathematicians, we call a measure-preserving transformation on $(X, \mathfrak{B}, m)$ an endomorphism and a bijective measure-preserving transformation an automorphism.

An automorphism $\varphi_1$ (a flow $\{\varphi_t^{(1)}\}$) on $(X_1, \mathfrak{B}_1, m_1)$ is said to be spatially isomorphic (or metrically isomorphic) to an automorphism $\varphi_2$ (a flow $\{\varphi_t^{(2)}\}$) on $(X_2, \mathfrak{B}_2, m_2)$ if there exist sets $N_1$ and $N_2$ with $m_1(N_1) = m_2(N_2) = 0$ and a bijective measurable transformation $\theta$ from $X_1 - N_1$ to $X_2 - N_2$ such that $m_1 \circ \theta^{-1} = m_2$ and $\theta\varphi_1 = \varphi_2\theta$ ($\theta\varphi_t^{(1)} = \varphi_t^{(2)}\theta$ for each $t$).

Classification of automorphisms and flows into isomorphism classes constitutes the central problem of modern ergodic theory. Properties of automorphisms and flows that are preserved under spatial isomorphisms are called isomorphism invariants (or metric invariants). There are several isomorphism invariants that are essential to the study of the isomorphism problem. We describe these below for the case of automorphisms. There are corresponding invariants for flows as well.

(1) Spectral Invariants. Two automorphisms $\varphi_1$ and $\varphi_2$ are said to be spectrally isomorphic if the †unitary operators $T_1$ and $T_2$ induced by $\varphi_1$ and $\varphi_2$ on the Hilbert spaces $L^2(X_1)$ and $L^2(X_2)$, respectively, are unitarily equivalent (i.e., there exists an isometric isomorphism $V$ of $L^2(X_1)$ onto $L^2(X_2)$ such that $VT_1 = T_2V$). Properties preserved under spectral isomorphisms are called spectral invariants (or spectral properties). If $\varphi_1$ and $\varphi_2$ are spatially isomorphic, it is clear that they are spectrally isomorphic; but the converse is not true in general.

(i) The property of $\varphi$ being ergodic is a spectral property since $\varphi$ is ergodic if and only if the number 1 is a simple eigenvalue of the induced unitary operator $T$. If $\varphi$ is ergodic, then the set of all eigenvalues of the induced operator $T$ forms a subgroup of the circle group, each eigenvalue is simple, and each †eigenfunction has constant absolute value. If the †spectrum of the induced operator $T$ consists entirely of eigenvalues, $\varphi$ is said to have discrete spectrum (or pure point spectrum). A theorem due to von Neumann and P. Halmos, the first theorem on the question of isomorphism, states that two ergodic automorphisms $\varphi_1$ and $\varphi_2$ with discrete spectra are spatially isomorphic if and only if they are spectrally isomorphic, which is the case if and only if the induced operators $T_1$ and $T_2$ have the same set of eigenvalues. Furthermore, every ergodic automorphism with discrete spectrum is spatially isomorphic to an ergodic rotation on a compact Abelian group [2]. Analogous results were obtained by L. Abramov for a bigger class of automorphisms, namely, for ergodic automorphisms having so-called quasidiscrete spectra.

(ii) An automorphism $\varphi$ is ergodic if and only if for every pair of sets $A$, $B$,

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} m(\varphi^k(A) \cap B) = m(A)m(B).$$

Strengthening this condition, we can define $\varphi$ to be weakly mixing if for every pair of sets $A$, $B$,

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \left| m(\varphi^k(A) \cap B) - m(A)m(B) \right| = 0,$$

strongly mixing if

$$\lim_{n \to \infty} m(\varphi^n(A) \cap B) = m(A)m(B),$$

and $k$-fold mixing if for arbitrary choice of sets $A_j$, $j = 0, 1, \ldots, k$,

$$\lim m(A_0 \cap \varphi^{n_1}A_1 \cap \varphi^{n_2}A_2 \cap \cdots \cap \varphi^{n_k}A_k) = m(A_0)m(A_1) \cdots m(A_k),$$

where the limit is taken as $n_1, n_2, \ldots, n_k \to \infty$ in such a way that $n_1 < n_2 < \cdots < n_k$ and $\min_{1 \le j \le k}(n_j - n_{j-1}) \to \infty$. The property of an automorphism $\varphi$ being weakly mixing or strongly mixing is a spectral property. For instance, $\varphi$ is weakly mixing if and only if the number 1 is a simple eigenvalue and is the only eigenvalue of the induced operator $T$. It is also known that $\varphi$ is weakly mixing if and only if the direct product automorphism $\varphi \times \varphi$ is ergodic. The set of all weakly mixing automorphisms forms a dense †$G_\delta$-set in the group
of all automorphisms on $(X, \mathfrak{B}, m)$ considered with the so-called weak topology (Halmos's theorem). On the other hand, it was shown by V. Rokhlin that the set of all strongly mixing automorphisms is a set of first category with respect to the weak topology. However, there are only a few known examples of automorphisms that are weakly mixing but not strongly mixing.

(iii) An automorphism $\varphi$ is said to have countable Lebesgue spectrum if the maximal spectral type of the induced unitary operator $T$ restricted to the †orthocomplement of the subspace of constant functions in $L^2(X)$ is equivalent to the Lebesgue measure and its †multiplicity is countably infinite.

(iv) An automorphism $\varphi$ is called a K-automorphism (or Kolmogorov automorphism) if there exists a $\sigma$-subalgebra $\mathfrak{B}_0$ of $\mathfrak{B}$ such that (a) $\varphi\mathfrak{B}_0 \supset \mathfrak{B}_0$ and $\varphi\mathfrak{B}_0 \ne \mathfrak{B}_0$, (b) $\bigvee_{n \in \mathbf{Z}} \varphi^n\mathfrak{B}_0 = \mathfrak{B}$, and (c) $\bigwedge_{n \in \mathbf{Z}} \varphi^n\mathfrak{B}_0 = \mathfrak{N}$, where $\mathfrak{N}$ is the $\sigma$-subalgebra of $\mathfrak{B}$ consisting of null sets and their complements. The notion of a K-flow (or Kolmogorov flow) is defined similarly. K-automorphisms are $k$-fold mixing for all orders $k$ and have countable Lebesgue spectra. Generalized Bernoulli shifts are all K-automorphisms. An ergodic Markov shift is a K-automorphism if and only if it is strongly mixing, which is the case if and only if the corresponding Markov process is †irreducible and †aperiodic (→ 260 Markov Chains B). A continuous group automorphism of a compact Abelian group is a K-automorphism if and only if it is ergodic. In particular, a continuous group automorphism of the $n$-dimensional torus $T^n$ is a K-automorphism if and only if no roots of unity appear among the eigenvalues of the representing matrix. Automorphisms and flows arising from classical dynamical systems also provide examples of K-automorphisms and K-flows. In particular, a †geodesic flow on a surface of negative curvature is a K-flow, and each automorphism (except the identity) of this flow is a K-automorphism. For the shift transformation $\varphi$ associated with a stationary Gaussian process, it was shown by Maruyama [15] that $\varphi$ is (a) weakly mixing if and only if it is ergodic, (b) strongly mixing if and only if the covariance function of the associated Gaussian process tends to 0 as $n \to \infty$, and (c) a K-automorphism if and only if the †spectral measure of the covariance function is absolutely continuous with respect to the Lebesgue measure.

(v) Examples of automorphisms having various types of spectra have been constructed by a number of authors by using stationary Gaussian processes and the theory of approximation developed by A. Katok and A. Stepin [16].

(vi) An ergodic automorphism with quasidiscrete spectrum has a mixed spectrum, that is, the spectrum of the induced unitary operator $T$ has a continuous component and eigenvalues in addition to 1. Anzai (1951) constructed a special class of skew product automorphisms having mixed spectra and showed that in this class there are automorphisms that are spectrally isomorphic but not spatially isomorphic. However, the question of whether two spatially nonisomorphic automorphisms exist among automorphisms having the same purely continuous spectrum remained unanswered for a long time, until in 1958 Kolmogorov (Dokl. Akad. Nauk SSSR, (5) 119 (1958)) settled it affirmatively by using a new isomorphism invariant called entropy.

(2) Generators and Entropy. (i) By a partition $\xi = \{A_\alpha\}$ of the space $X$ we mean a collection of sets $A_\alpha$ such that $A_\alpha \cap A_{\alpha'} = \emptyset$ whenever $\alpha \ne \alpha'$ and $\bigcup A_\alpha = X$. We denote by $\varepsilon$ the partition of $X$ into individual points, and by $\nu$ the trivial partition $\{X\}$. A partition into a finite (countable) number of sets is called a finite (countable) partition. A partition $\xi$ is said to be finer than another partition $\zeta$ (or $\zeta$ is coarser than $\xi$) if for every $A \in \xi$ there is a set $B \in \zeta$ such that $A \subset B$. For a collection $\{\xi_\alpha\}$ of partitions of $X$, we denote by $\bigvee_\alpha \xi_\alpha$ the coarsest partition that is finer than each $\xi_\alpha$, and by $\bigwedge_\alpha \xi_\alpha$ the finest partition that is coarser than each $\xi_\alpha$. If $\xi_k = \{A_{k,n}\}$ is a sequence of countable (or finite) partitions, then $\bigvee_k \xi_k$ is precisely the partition of $X$ into nonempty intersections of the form $\bigcap_k A_{k,n_k}$ with $A_{k,n_k} \in \xi_k$ for each $k$. With a partition $\xi$ of $X$ we associate a $\sigma$-subalgebra $\mathfrak{B}(\xi)$ of $\mathfrak{B}$ which is the $\sigma$-algebra of all $\mathfrak{B}$-measurable sets that are a union of elements in $\xi$. Two partitions $\xi$ and $\zeta$ are said to coincide a.e. if $\mathfrak{B}(\xi) = \mathfrak{B}(\zeta)$ a.e. (i.e., for every $A \in \mathfrak{B}(\xi)$ there exists a set $B$ in $\mathfrak{B}(\zeta)$ such that $m(A \cup B - A \cap B) = 0$, and conversely).

(ii) Suppose that $\varphi$ is an endomorphism of $(X, \mathfrak{B}, m)$ and $\xi$ a partition of $X$. By $\varphi^{-1}\xi$ we mean the partition $\{\varphi^{-1}(A) \mid A \in \xi\}$. If $\varphi$ is an automorphism, we also define $\varphi\xi = \{\varphi(A) \mid A \in \xi\}$. A partition $\xi$ is called a generator for an endomorphism $\varphi$ if $\bigvee_{n=0}^{\infty} \varphi^{-n}\xi = \varepsilon$ a.e. If $\bigvee_{n=-\infty}^{\infty} \varphi^n\xi = \varepsilon$ a.e., $\xi$ is called a two-sided generator for an automorphism $\varphi$.

An endomorphism $\varphi$ is said to be periodic at a point $x \in X$ if there exists a positive integer $n$ such that $\varphi^n(x) = x$, and aperiodic if the set of points of periodicity has measure zero. If the measure space $(X, \mathfrak{B}, m)$ is nonatomic, then every ergodic endomorphism on it is aperiodic. A theorem of Rokhlin states that every aperiodic automorphism $\varphi$ has a countable two-sided generator. This implies that every such $\varphi$ is spatially isomorphic to the shift
transformation on the infinite product space $(Y^*, \mathfrak{A}^*)$ considered with some invariant measure $\mu^*$, where each coordinate space $Y_k = Y$ has at most a countable number of points. Krieger improved this result by showing that if an ergodic automorphism $\varphi$ has finite entropy, then $\varphi$ has a finite two-sided generator [17].

(iii) For a finite or countable partition $\xi = \{A_n\}$, define the entropy $H(\xi)$ of the partition to be $-\sum_n m(A_n)\log(m(A_n))$ (the logarithms here and below are natural logarithms). We denote by $\mathcal{Z}$ the set of all partitions $\xi$ with $H(\xi) < \infty$. If $\xi \in \mathcal{Z}$, then for any endomorphism $\varphi$ the limit

$$h(\varphi, \xi) = \lim_{n \to \infty} \frac{1}{n} H\left(\bigvee_{k=0}^{n-1} \varphi^{-k}\xi\right)$$

exists and is finite. The entropy $h(\varphi)$ of the endomorphism $\varphi$ is defined to be $\sup\{h(\varphi, \xi) \mid \xi \in \mathcal{Z}\}$ and is an isomorphism invariant. Properties of entropy have been investigated extensively since the notion was introduced by Kolmogorov. We cite a few results.

(a) If a partition $\xi \in \mathcal{Z}$ is a generator for an endomorphism $\varphi$ or a two-sided generator for an automorphism $\varphi$, then $h(\varphi) = h(\varphi, \xi)$ (Sinai's lemma). (b) For every integer $n$, $h(\varphi^n) = |n|h(\varphi)$, and for a measurable flow $\{\varphi_t\}$, $h(\varphi_t) = |t|h(\varphi_1)$ for every real number $t$. (c) If an automorphism $\varphi$ is periodic, then $h(\varphi) = 0$. (d) If $\varphi_1$ is a factor transformation of $\varphi_2$, then $h(\varphi_1) \le h(\varphi_2)$. (e) $h(\varphi_1 \times \varphi_2) = h(\varphi_1) + h(\varphi_2)$. (There is a more complicated formula (due to Rokhlin) for the entropy of a skew product automorphism.) (f) If $\varphi$ is a recurrent automorphism and $\varphi_A$ is the automorphism induced by $\varphi$ on a subset $A$ with $m(A) > 0$, then $h(\varphi_A) = h(\varphi)/m(A)$ (Abramov's formula). (g) If $\varphi$ is a Bernoulli shift with probability distribution $\{p_n\}$, then $h(\varphi) = -\sum_n p_n \log p_n$. (h) If $\varphi$ is a Markov shift based on the Markov transition probability $P_{ij}$ (defined on a countable or finite state space) and an invariant measure $\pi_i$, then

$$h(\varphi) = -\sum_i \sum_j \pi_i P_{ij} \log P_{ij}.$$

(i) If $\varphi$ is ergodic and has a pure point spectrum, or more generally, has a quasidiscrete spectrum, then $h(\varphi) = 0$. (j) For an ergodic group automorphism $\varphi$ on an $n$-dimensional torus, $h(\varphi) = \sum \log|\lambda|$, where the sum is taken over all eigenvalues $\lambda$ of modulus $> 1$ of the representing matrix. (k) If an automorphism $\varphi$ has positive entropy, then in $L^2(X)$ there exists a subspace invariant under the induced unitary operator $T$ such that the spectrum of $T$ restricted to this subspace is countable Lebesgue (Rokhlin's theorem). It follows from (k) that automorphisms with †singular spectra or spectra of finite multiplicity must have zero entropy. In proving assertion (g), Kolmogorov established for the first time the fact that there are uncountably many spatially nonisomorphic Bernoulli shifts.

(iv) An automorphism $\varphi$ is said to have completely positive entropy if $h(\varphi, \xi) > 0$ for every partition $\xi \ne \nu$. It was shown by Rokhlin and Sinai that an automorphism $\varphi$ has completely positive entropy if and only if $\varphi$ is a K-automorphism. M. Pinsker proved that for every automorphism $\varphi$ there exists a partition, called the Pinsker partition, that is invariant under $\varphi$ and such that the factor transformation of $\varphi$ with respect to this partition has zero entropy and is the largest among the factor transformations of $\varphi$ with zero entropy.

(v) Rokhlin showed that $h(\varphi) = 0$ for an endomorphism $\varphi$ if and only if all of its factor transformations are automorphisms. An endomorphism $\varphi$ is called exact if $\bigwedge_{n=0}^{\infty} \varphi^{-n}\varepsilon = \nu$ a.e. Rokhlin introduced a way to associate with each endomorphism a certain automorphism, called the natural extension, which reflects the properties of the endomorphism. For example, an endomorphism and its natural extension are simultaneously ergodic or nonergodic, are mixing of the same order, and have equal entropy. The natural extension of an exact endomorphism is a K-automorphism.

(vi) Automorphisms $\varphi_1$ and $\varphi_2$ are said to be weakly isomorphic if each of them is a factor transformation of the other. Sinai proved (1964) that for each ergodic automorphism with positive entropy, there exists a factor automorphism having the same entropy and isomorphic to a Bernoulli shift, and hence it follows in particular that Bernoulli shifts with the same entropy are weakly isomorphic. Ornstein (Adv. in Math., 4 (1970)) went further and succeeded in proving the following remarkable result: Two Bernoulli shifts with equal entropy are spatially isomorphic. Partial results in this direction were obtained earlier by L. Meshalkin, J. Blum, and D. Hanson. In the proof of Ornstein's theorem, essential use was made of the following theorem of C. Shannon and B. McMillan, which plays a fundamental role in information theory (→ 213 Information Theory): Suppose that $\varphi$ is an ergodic endomorphism on $(X, \mathfrak{B}, m)$ and $\xi$ is a partition of $X$. For a point $x \in X$, let $A_n(x)$ denote the element in the partition $\bigvee_{k=0}^{n-1} \varphi^{-k}\xi$ that contains $x$. Then for almost all $x$,

$$\lim_{n \to \infty} \left(-\frac{1}{n} \log m(A_n(x))\right)$$

exists and equals $h(\varphi, \xi)$.

(vii) Techniques and ideas developed by Ornstein in his proof of the isomorphism theorem have been refined and extended further by himself, B. Weiss, N. Friedman, M. Smorodinsky, and others, and numerous re-
sults were subsequently obtained on the isomorphism problem. We describe below some of the main results and basic concepts; for more detailed accounts → [21-24].

A sequence $\{\xi_n\}$ of partitions is said to be independent if for every choice of sets $A_1, \ldots, A_k$ with $A_i \in \xi_{n_i}$, $m(\bigcap_{i=1}^{k} A_i) = \prod_{i=1}^{k} m(A_i)$ whenever $n_1, n_2, \ldots, n_k$ are all distinct. For a fixed $\varepsilon > 0$, two partitions $\xi$ and $\zeta$ are said to be $\varepsilon$-independent if

$$\sum_{A \in \xi} \sum_{B \in \zeta} \left| m(A \cap B) - m(A)m(B) \right| < \varepsilon.$$

A pair $(\varphi, \xi)$, where $\varphi$ is an automorphism of $(X, \mathfrak{B}, m)$ and $\xi$ is a finite (or countable) partition of $X$, is called a process (on $X$). $\mathfrak{B}(\bigvee_{n=-\infty}^{\infty} \varphi^n\xi)$ is a $\sigma$-subalgebra of $\mathfrak{B}$ invariant under $\varphi$, and $\varphi$ restricted to this $\sigma$-subalgebra is a factor automorphism of $\varphi$ and is isomorphic to a shift transformation on an infinite product space, where each coordinate space has the same number of points as the number of atoms in $\xi$. A process $(\varphi, \xi)$ is called a Bernoulli process (or an independent process) if the sequence of partitions $\{\varphi^n\xi \mid n \in \mathbf{Z}\}$ is independent, and a weak Bernoulli (W.B.) process if for every $\varepsilon > 0$ there exists a $k > 0$ such that the partitions $\bigvee_{i=k}^{k+n} \varphi^i\xi$ and $\bigvee_{i=-n}^{0} \varphi^i\xi$ are $\varepsilon$-independent for all $n \ge 0$.

Let $J$ be a finite set and $\alpha$, $\beta$ be two functions from the set $\{1, 2, \ldots, N\}$ to $J$. By $d_N(\alpha, \beta)$ we denote $(1/N)\#\{n \mid \alpha(n) \ne \beta(n)\}$, where $\#$ denotes the cardinality. $d_N$ defines a metric on the space $J^{\{1,2,\ldots,N\}}$ called the Hamming distance. Now, for a process $(\varphi, \xi)$ on $X$ with $\xi = \{A_j \mid j \in J\}$, we define for $x, y \in X$, $d_N(x, y)$ to be equal to $d_N(\xi_N(x), \xi_N(y))$, where for $x \in X$, $\xi_N(x)$ is the point $(j(x), j(\varphi(x)), \ldots, j(\varphi^{N-1}(x))) \in J^{\{1,2,\ldots,N\}}$ and $j(x)$ is the index $j \in J$ for which $x \in A_j$. A process $(\varphi, \xi)$ is called very weak Bernoulli (V.W.B.) if for any $\varepsilon > 0$ there exists an $N_0$ such that whenever $N > N_0$ there is a set $G$ with $m(G) > 1 - \varepsilon$ belonging to $\mathfrak{B}(\bigvee_{n=-\infty}^{0} \varphi^n\xi)$ satisfying the following condition: for any $A \in \mathfrak{B}(\bigvee_{n=-\infty}^{0} \varphi^n\xi)$ with $A \subset G$ and $m(A) > 0$, there exists a probability measure $\nu$ on $X \times X$ satisfying (i) $\nu(E \times X) = m(E)$ and $\nu(X \times F) = m_A(F)$ for all $E, F \in \mathfrak{B}$ and (ii) $\int_{X \times X} d_N(x, y)\, d\nu < \varepsilon$. It is not difficult to show that W.B. processes are V.W.B.

For processes $(\varphi, \xi)$ on $(X, \mathfrak{B}, m)$ and $(\bar{\varphi}, \bar{\xi})$ on $(\bar{X}, \bar{\mathfrak{B}}, \bar{m})$, where both $\xi$ and $\bar{\xi}$ are indexed by the same finite set $J$, we define $\bar{d}_N((\varphi, \xi), (\bar{\varphi}, \bar{\xi}))$ to be $\inf\{\int_{X \times \bar{X}} d_N(\xi_N(x), \bar{\xi}_N(\bar{x}))\, d\mu(x, \bar{x})\}$, where the infimum is taken over all probability measures $\mu$ on $X \times \bar{X}$ satisfying $\mu(E \times \bar{X}) = m(E)$ for all $E \in \mathfrak{B}$ and $\mu(X \times \bar{E}) = \bar{m}(\bar{E})$ for all $\bar{E} \in \bar{\mathfrak{B}}$. $\bar{d}_N$ decreases with $N$, and we define $\bar{d}((\varphi, \xi), (\bar{\varphi}, \bar{\xi}))$ to be $\lim_{N \to \infty} \bar{d}_N((\varphi, \xi), (\bar{\varphi}, \bar{\xi}))$. Finally, a process $(\varphi, \xi)$ is said to be finitely determined (F.D.) if for any $\varepsilon > 0$ there exist a $\delta > 0$ and $N$ such that if $(\bar{\varphi}, \bar{\xi})$ is any process satisfying (i) $\xi$ and $\bar{\xi}$ are indexed by the same set $J$, (ii) $|h(\varphi, \xi) - h(\bar{\varphi}, \bar{\xi})| < \delta$, and (iii) $\bar{d}_N((\varphi, \xi), (\bar{\varphi}, \bar{\xi})) < \delta$, then $\bar{d}((\varphi, \xi), (\bar{\varphi}, \bar{\xi})) < \varepsilon$. Let us call a partition $\xi$ an F.D. generator for an automorphism $\varphi$ if $\xi$ is a two-sided generator for $\varphi$ and the process $(\varphi, \xi)$ is finitely determined. A general isomorphism theorem proved by Ornstein states: If automorphisms $\varphi$ and $\bar{\varphi}$ have the same entropy and both have F.D. generators, then $\varphi$ and $\bar{\varphi}$ are isomorphic. Ornstein and Weiss proved further that the process $(\varphi, \xi)$ is F.D. if and only if it is V.W.B.

From these basic results a number of results can be deduced. (a) Nontrivial factors of Bernoulli shifts are Bernoulli. (b) Strongly mixing Markov shifts have two-sided weak Bernoulli generators and hence are isomorphic to Bernoulli shifts. (c) Every Bernoulli shift can be embedded in a flow, which implies among other things that Bernoulli shifts have roots of all orders. (d) It was shown by Y. Katznelson that every ergodic group automorphism of an $n$-dimensional torus has a two-sided generator which can be shown to be V.W.B., and hence is also isomorphic to a Bernoulli shift. R. Adler and B. Weiss earlier proved by entirely different methods that on a 2-dimensional torus, two ergodic group automorphisms having the same entropy are spatially isomorphic. They did this by constructing a two-sided generator $\xi$ for such automorphisms which is also a Markov partition (→ Section G).

(viii) Among further positive results there are the following: (a) Except for a trivial change in the time scale, any two Bernoulli flows are isomorphic (Ornstein). (b) Every ergodic group automorphism of a compact group is isomorphic to a Bernoulli shift (Thomas and Miles; Lind). (c) A number of automorphisms and flows arising from classical dynamical systems are shown to be Bernoulli (→ Section G). (d) Examples of exact endomorphisms arise in connection with problems in number theory; for example, †continued fraction expansion and $\beta$-expansion. The natural extensions of the exact endomorphisms associated with continued fraction expansion and $\beta$-expansion are now known to be spatially isomorphic to Bernoulli shifts.

(ix) Since many of the automorphisms that have been shown to be K-automorphisms are now known to be isomorphic to Bernoulli shifts, it is natural to expect that every K-automorphism is in fact Bernoulli. However, Ornstein (1973) constructed an example of a K-automorphism that is not a Bernoulli shift. A lot of work has since been done to investigate how bad K-automorphisms can be. It turns out that K-automorphisms share almost none of the finer properties of Bernoulli shifts. For example: (a) There are uncountably many
nonisomorphic K-automorphisms of the same entropy (Ornstein and P. Shields). (b) There is a K-flow that is not Bernoulli, and there are uncountably many nonisomorphic K-flows having the same entropy at time one (Smorodinsky). (c) There exists a K-automorphism not isomorphic to its inverse (Ornstein and Shields). (d) There exists an automorphism that cannot be written as a direct product of a K-automorphism and an automorphism with zero entropy. This example shows that a conjecture made earlier by Pinsker was false (Ornstein). (e) There are weakly isomorphic K-automorphisms that are not isomorphic (Polit and Rudolph).


F. Weak Equivalence and Monotone Equivalence

(1) Weak Equivalence. In order to construct examples of †factors in the theory of von Neumann algebras (→ 308 Operator Algebras), F. Murray and von Neumann considered various ergodic groups of bimeasurable nonsingular transformations on a finite measure space. In this context a group $\mathfrak{G} = \{g\}$ of transformations is called ergodic if $m((g^{-1}(B) \cup B) - (g^{-1}(B) \cap B)) = 0$ for every $g \in \mathfrak{G}$ implies either $m(B) = 0$ or $m(X - B) = 0$. A measure $\mu$ is said to be invariant under the group $\mathfrak{G} = \{g\}$ if $\mu$ is invariant under every transformation $g$ in $\mathfrak{G}$. Murray and von Neumann's construction (the so-called group measure space construction) gives a type II$_1$ †factor if the group admits a finite equivalent invariant measure, a type II$_\infty$ factor if it admits an infinite (but $\sigma$-finite) equivalent invariant measure, and a type III factor if it has no $\sigma$-finite equivalent invariant measure. In this connection we note that Hajian and Y. Itô extended the theorem of Hajian and Kakutani and proved that an arbitrary group $\mathfrak{G} = \{g\}$ of nonsingular bimeasurable transformations admits a finite invariant measure if and only if no set of positive measure is weakly wandering under the group $\mathfrak{G}$ (a set $W$ is said to be weakly wandering under a group $\mathfrak{G}$ if there exists an infinite subset $\{g_n \mid n \in \mathbf{Z}^+\}$ of $\mathfrak{G}$ such that $g_n(W) \cap g_k(W) = \emptyset$ for $n \ne k$). Analogously to the terminologies used in the theory of von Neumann algebras, we define an ergodic countable group $\mathfrak{G}$ (or an ergodic bimeasurable nonsingular transformation $\varphi$) to be of type II$_1$, II$_\infty$, or III if $\mathfrak{G}$ (or $\varphi$) has a finite equivalent invariant measure, a $\sigma$-finite infinite equivalent invariant measure, or no $\sigma$-finite equivalent invariant measure, respectively.

For a bimeasurable nonsingular transformation $\varphi$, define the full group $[\varphi]$ to be the group of all bimeasurable nonsingular transformations $\psi$ such that, for some $n = n(x)$, $\psi(x) = \varphi^n(x)$ for almost all $x$. Two transformations $\varphi_1$ and $\varphi_2$ are said to be weakly equivalent if there exists a bimeasurable nonsingular transformation $\theta$ such that $\theta[\varphi_1]\theta^{-1} = [\varphi_2]$. H. Dye proved that any pair of type II$_1$ transformations are weakly equivalent. It easily follows from this that the same is true for type II$_\infty$ transformations. Krieger showed, on the other hand, that among type III transformations there are uncountably many weakly nonequivalent ones. In fact, Krieger [28] introduced an invariant for weak equivalence, called the ratio set, by means of which he classified type III transformations into mutually weakly nonequivalent subclasses III$_\lambda$ for $0 \le \lambda \le 1$. Krieger showed further that (a) for $0 < \lambda \le 1$, every pair of transformations in the class III$_\lambda$ is weakly equivalent; (b) in the class III$_0$ there are uncountably many mutually weakly nonequivalent transformations; and (c) two ergodic transformations are weakly equivalent if and only if the corresponding factors constructed via the group measure space construction are †*-isomorphic. An ergodic flow of nonsingular transformations (not necessarily measure-preserving) called the associated flow was constructed independently by Krieger [29] and by T. Hamachi, Y. Oka, and M. Osikawa [30], and was shown to give another effective invariant for weak equivalence. Krieger showed that the mapping which assigns to each ergodic transformation its associated flow gives a †bijective mapping between the set of all weak equivalence classes of ergodic type III$_0$ transformations and the set of all isomorphism classes of aperiodic conservative ergodic flows. In this connection the theorem of U. Krengel and I. Kubo on the representation of ergodic flows, extending the theorem of Ambrose and Kakutani mentioned in Section D, plays a significant role.

A bimeasurable nonsingular transformation $\theta$ is called a normalizer of another transformation $\varphi$ if it satisfies $\theta[\varphi] = [\varphi]\theta$. The set $\mathcal{N}(\varphi)$ of all normalizers of a transformation $\varphi$ forms a group called the normalizer group, which contains the full group $[\varphi]$ as a subgroup. One can introduce a suitable topology on $\mathcal{N}(\varphi)$ to make it a complete separable metrizable group. Hamachi [31] has shown that, for type III transformations $\varphi$, the quotient group $\mathcal{N}(\varphi)/[\varphi]^-$, where $[\varphi]^-$ denotes the closure of $[\varphi]$, is algebraically and topologically isomorphic to the †commutant of the associated flow.

Results obtained by Krieger and others mentioned above were motivated in part by corresponding developments in von Neumann algebra theory due mainly to A. Connes (→ 308 Operator Algebras). From the recent deep result of Connes on the uniqueness of †approximately finite-dimensional factors of type II$_\infty$
it follows that every approximately finite-dimensional factor with the exception of type III$_1$ is *-isomorphic to a factor constructed from an ergodic transformation via the group measure space construction.

(2) Monotone Equivalence. The theory of weak equivalence discussed above deals with the structure of orbits of a transformation or of groups of transformations. In fact, transformations $\varphi$ and $\psi$ are weakly equivalent if and only if there exists a bimeasurable nonsingular transformation $\theta$ mapping the $\varphi$-orbit of almost all $x$ onto the $\psi$-orbit of $\theta(x)$. A somewhat more stringent notion of equivalence dealing with orbit structure is called monotone equivalence or Kakutani equivalence. We say that measurable flows $\{\varphi_t\}$ and $\{\bar{\varphi}_t\}$ of measure-preserving transformations on finite measure spaces $(X, \mathfrak{B}, m)$ and $(\bar{X}, \bar{\mathfrak{B}}, \bar{m})$, respectively, are monotonely equivalent if there exists a bimeasurable measure-preserving transformation $\theta$ on $X$ to $\bar{X}$ such that for almost all $x \in X$ and all $t \in \mathbf{R}$, $\theta\varphi_t(x) = \bar{\varphi}_{\tau(t,x)}\theta(x)$, where $\tau(t, x)$ is a monotone increasing function of $t$. Two S-flows (→ Section D) built over the same base transformation with different ceiling functions are monotonely equivalent, and it can be shown that monotonely equivalent ergodic flows are isomorphic to S-flows built over the same base transformation. This induces an equivalence relation on transformations, also called monotone equivalence or Kakutani equivalence: Two transformations $\varphi$ and $\bar{\varphi}$ are monotonely equivalent if they can serve as base transformations of the same (or equivalent) flow. This notion of equivalence was introduced by Kakutani in [32], where he showed that $\varphi$ and $\bar{\varphi}$ are monotonely equivalent if and only if there are sets $E \subset X$ and $\bar{E} \subset \bar{X}$ such that the induced transformations (→ Section D) $\varphi_E$ and $\bar{\varphi}_{\bar{E}}$ are isomorphic. Abramov's formula mentioned in Section E implies that there are at least three classes of transformations (and flows) that are mutually nonequivalent: transformations (flows) of zero, finite, and infinite entropy. Nothing much was done in this equivalence theory for a number of years until in 1975 J. Feldman and E. Satayev independently showed that there are monotonely nonequivalent transformations within each class of zero, positive, and infinite entropy. Since then extensive work has been

V.W.B. of the isomorphism theory described in Section E by using an $\bar{f}$- instead of a $\bar{d}$-metric. They then showed that this property, called loosely Bernoulli (L.B.) by Feldman and monotonely very weak Bernoulli (M.V.W.B.) by Satayev, stays invariant under monotone equivalence, and that there exist transformations with and without this property. An $\bar{f}$-metric is defined by starting off with an $f_N$-metric on the space $J^{\{1,2,\ldots,N\}}$ instead of the Hamming distance $d_N$ and proceeding in exactly the same way as for the definition of $\bar{d}$, namely, by extending $f_N$ to $\bar{f}_N$ and finally to $\bar{f}$. For $\alpha$ and $\beta$ in $J^{\{1,2,\ldots,N\}}$, $f_N(\alpha, \beta)$ is defined by setting $1 - f_N(\alpha, \beta)$ equal to $1/N$ times the maximal integer $n$ for which there are positive integers $j_1 < j_2 < \cdots < j_n$, $k_1 < k_2 < \cdots < k_n$ with $\alpha(j_i) = \beta(k_i)$, $i = 1, \ldots, n$. Ornstein and Weiss discovered that by substituting $\bar{f}$ for $\bar{d}$ one can develop a theory of monotone equivalence that parallels the isomorphism theory described in Section E. One can define a process $(\varphi, \xi)$ to be finitely fixed (F.F.) by substituting $\bar{f}$ for $\bar{d}$ in the definition of F.D. process, and, as was mentioned earlier, to be L.B. by doing the same in the definition of the V.W.B. process. Ornstein and Weiss proved: (a) If $\varphi$ and $\bar{\varphi}$ both have zero entropy, both have finite positive entropy, or both have infinite entropy, and if both have F.F. generators, then they are monotonely equivalent. (b) Let $\varphi$ have an F.F. generator. If $h(\varphi) = 0$, then $\varphi$ is monotonely equivalent to an irrational rotation of the circle. If $0 < h(\varphi) < \infty$, then $\varphi$ is monotonely equivalent to a Bernoulli shift of finite entropy. If $h(\varphi) = \infty$, then $\varphi$ is monotonely equivalent to a Bernoulli shift of infinite entropy. (c) If $\varphi$ has an F.F. generator, then $(\varphi, \xi)$ is F.F. for all nontrivial partitions $\xi$. (d) $(\varphi, \xi)$ is F.F. if and only if it is L.B. It is known further that within each entropy class there exist uncountably many monotonely nonequivalent transformations. A measurable flow $\{\varphi_t\}$ is called L.B. if it can be represented as an S-flow built over an L.B. transformation. The L.B. flows of zero entropy are those monotonely equivalent to a †Kronecker flow on a 2-dimensional torus, and L.B. flows of finite positive (infinite) entropy are those monotonely equivalent to the Bernoulli flow of finite (infinite) entropy. The direct product of an L.B. flow and a Bernoulli flow is L.B., while it was shown by M. Ratner that the horocycle flow (→ Section G)
done by Feldman, Satayev, A. Katok, Orn- is L.B. but its direct product with itself is not.
Stein, Weiss, D. Rudolph, M. Ratner, and
others, and numerous results have been ob-
tained. For detailed accounts of this devel- G. Classical Dynamical Systems
opment - [22,24,33,34]. The main idea
employed by Feldman and Satayev was to By a classical dynamical system we mean a
introduce a new metric called an f-metric tdiffeomorphism or a flow generated by a
and then to define a notion corresponding to smooth +vector held on some tdifferentiable
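The definition of $\bar f_N$ in Section F amounts to a longest-common-subsequence computation: the maximal $n$ with $\alpha(j_i) = \beta(k_i)$ along increasing index sequences is the length of a longest common subsequence of $\alpha$ and $\beta$. The following is a minimal sketch under that reading; the function names `lcs_length`, `f_bar`, and `hamming` are illustrative and do not come from the article, and words are represented as Python strings or lists.

```python
def lcs_length(a, b):
    """Length of a longest common subsequence of the sequences a and b."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def f_bar(a, b):
    """f-bar distance of two words of equal length N: 1 - LCS(a, b) / N."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("words must be nonempty and of equal length")
    return 1.0 - lcs_length(a, b) / len(a)


def hamming(a, b):
    """Normalized Hamming distance: fraction of positions where a and b differ."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("words must be nonempty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)
```

Shifting a periodic word by one place changes every coordinate, so the Hamming distance is maximal, while the $\bar f$ distance stays small because only one symbol has to be discarded. This insensitivity to time changes is what makes the metric suited to monotone equivalence.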
543 136 G
Ergodic Theory

manifold $M^n$. Such a system is nonsingular with respect to a measure defined by any †Riemannian metric on $M^n$. For a fixed Riemannian metric, we call measures smooth if they have a smooth density with respect to the measure given by the metric.

(1) Among classical dynamical systems, geodesic flows have been investigated most extensively. Let $T_1(M)$ be the unitary †tangent bundle over the manifold $M^n$. A point $(x, e) \in T_1(M)$ defines a unique †geodesic through $x$ in the direction of $e$. The geodesic flow on $T_1(M)$ is the flow defined by $\varphi_t(x, e) = (x_t, e_t)$, where $x_t$ is the point in $M^n$ reached from $x$ after time $t$ under a motion with unit speed along the geodesic determined by $(x, e)$, and $e_t$ is the unit vector at $x_t$ tangent to the geodesic. The classical †Liouville theorem in this context implies that the measure on $T_1(M)$ that is the product of the measure on $M^n$ induced by the metric and the Lebesgue measure on the $(n-1)$-dimensional sphere gives a smooth invariant measure for the geodesic flow. A wide class of systems arising from mechanics can be described as geodesic flows.

Hopf and G. Hedlund proved that if the manifold $M^n$ is compact and has constant negative curvature, then the geodesic flow is strongly mixing. Later, by using the theory of group †representations, I. Gel'fand and S. Fomin proved that the spectrum of a geodesic flow on a compact manifold of constant negative curvature is Lebesgue, and is even countable Lebesgue in the case where the manifold is of dimension 2. F. Mautner and later L. Auslander, L. Green, and F. Hahn extended this algebraic method to flows obtained under the action of some one-parameter subgroup of a †Lie group acting on its †homogeneous space and obtained extensive results for the case of †nilpotent and some †solvable Lie groups [35].

(2) The flow on an $n$-dimensional torus defined by

$\varphi_t(x_1, x_2, \ldots, x_n) = (x_1 + \omega_1 t, x_2 + \omega_2 t, \ldots, x_n + \omega_n t)$

is called a translational flow or a Kronecker flow. The numbers $\omega_1, \omega_2, \ldots, \omega_n$ are called frequencies. Every orbit of $\{\varphi_t\}$ is dense in the torus if and only if the frequencies are linearly independent over $\mathbf{Z}$. The motion under a translational flow with independent frequencies is called a quasiperiodic motion. A translation flow for a quasiperiodic motion has discrete spectrum.

(3) Sinai obtained a useful criterion for a classical dynamical system to be a K-system. Let $M^n$ be compact, and suppose that $\{\varphi_t\}$ is a flow on $M^n$ defined by a smooth vector field and preserving some smooth measure $\mu$. A one-parameter group $\{\psi_s\}$ of transformations of the space $M^n$, given by a vector field, is called the flow transversal to the flow $\{\varphi_t\}$ if (i) the decomposition of the space $M^n$ into the trajectories of the flow $\{\psi_s\}$ is invariant under $\{\varphi_t\}$; (ii) the limit $\lim_{t \to 0} \lim_{s \to 0} (\beta_s(t, x) - t)/ts = \alpha(x)$ exists for the function $\beta_s(t, x)$, which is defined to be the time length of the segment $\{\varphi_t\psi_u(x) \mid 0 \le u \le t\}$ of the trajectory of the flow $\{\psi_s\}$. Sinai's fundamental theorem states that if a flow $\{\varphi_t\}$ is ergodic and has a transversal ergodic flow $\{\psi_s\}$ for which $\int \alpha(x)\,d\mu < 0$, then $\{\varphi_t\}$ is a K-flow. If $\alpha(x) < 0$, then we can even drop the assumption that $\{\varphi_t\}$ is ergodic. If $\int \alpha(x)\,d\mu > 0$, the theorem holds for the flow $\{\varphi_{-t}\}$.

A geodesic flow on a 2-dimensional manifold of constant negative curvature always has a transversal flow, called a horocycle flow. The ergodicity of a horocycle flow was proved by Hedlund. It follows from Sinai's fundamental theorem, therefore, that a geodesic flow on a surface of constant negative curvature is a K-flow. Sinai proved even more: A geodesic flow on any surface of negative curvature is a K-flow. There is an extension of the notion of transversal flow to higher dimensions, called transversal field. Using this notion Sinai proved that a geodesic flow on a manifold (of any dimension) of constant negative curvature is a K-flow. Finally, Ornstein and Weiss established that geodesic flows on compact manifolds of negative curvature are Bernoulli.

(4) D. Anosov considered a class of flows and diffeomorphisms satisfying a condition that characterizes unstable motions such as geodesic flows on a manifold of negative curvature. They are now called Anosov flows (or Y-flows) and Anosov diffeomorphisms (or Y-diffeomorphisms) (→ 126 Dynamical Systems). Anosov proved that if an Anosov flow has a smooth invariant measure, then it is ergodic and either it has a continuous nonconstant eigenfunction or it is a K-flow. Anosov diffeomorphisms with smooth invariant measures are K-automorphisms. Sinai constructed for †transitive Anosov diffeomorphisms (and for Anosov flows) a special partition of the underlying manifold $M$ called a Markov partition having desirable properties. The importance of such partitions lies in the fact that they enable one to represent such diffeomorphisms as Markov shifts. Starting with a measure invariant for such a Markov shift having the maximal entropy (→ Section H), Sinai constructed a Gibbs measure (→ Section C), which turned out to be unique in this case and gave rise to a natural invariant measure $\mu$ for the corresponding diffeomorphism. He proved further that the diffeomorphism $\varphi$ considered as an
automorphism of the measure space $(M, \mathcal{B}, \mu)$ is a K-automorphism. It was shown subsequently by R. Azencott that $\varphi$ is in fact Bernoulli with respect to $\mu$. Sinai investigated further the uniqueness of the invariant measure attaining the maximal entropy for transitive Anosov diffeomorphisms, and he was able to show among other things that the set of transitive $C^2$-Anosov diffeomorphisms that do not have an invariant measure absolutely continuous with respect to the measure induced by the Riemannian metric on $M$ contains an open dense subset. The methods employed by Sinai in these investigations were extended further by D. Ruelle, Bowen, and others. Bowen, in particular, was able to construct Markov partitions for a wider class of diffeomorphisms, namely, those satisfying the so-called axiom A introduced earlier by S. Smale, and characterized the Gibbs measures for diffeomorphisms in this class by means of the variational principle (→ Section H). For a more detailed account of the results of Sinai and Bowen → [13, 14].

(5) An important example of a system that is neither an Anosov nor an "axiom A" system because of nonsmoothness has been studied by Sinai: the simplest mechanical model due to Boltzmann and Gibbs of an ideal gas, which is described as a system generated by tiny rigid spherical pellets moving inside a rectangular box and colliding elastically. Sinai succeeded in proving that this is a K-system, thereby giving an affirmative answer to the classical question of the ergodicity of the basic model of statistical mechanics [38]. It was shown subsequently by M. Aizenman, S. Goldstein, and J. Lebowitz that this system is Bernoulli. Sinai also investigated another nonsmooth system of classical importance: the system describing the motion of a billiard ball on a square table with a finite number of convex obstacles. He showed that such a system is a K-system. Ornstein and G. Gallavotti then showed that Sinai's methods in fact show that the system is Bernoulli. Sinai's methods were extended further by L. Bunimovich and I. Kubo to study properties of billiard systems in more complicated domains. For a more detailed account of these matters → [24]. The results obtained by Sinai and Bowen for Anosov and "axiom A" systems were extended to an even wider class of systems by A. Stepin, R. Sacksteder, M. Brin, and Ya. Pesin (→ [24]).

H. Miscellany

(1) An arbitrary aperiodic automorphism can be approximated by periodic automorphisms. More precisely, if $\varphi$ is an aperiodic automorphism, then for any positive integer $n$ and $\delta > 0$, there exists a periodic automorphism $\psi$ of period $n$ such that $m(\{x \mid \varphi(x) \ne \psi(x)\}) < (1/n) + \delta$ (theorem of Halmos and Rokhlin). The question as to how quickly this approximation can be carried out has been investigated in detail by Katok, Stepin, and others. The rate of this approximation was shown to have a close relationship with the entropy and spectral properties of the automorphism $\varphi$. By utilizing this relationship, various examples of automorphisms with specified spectral properties have been constructed [16].

(2) When an arbitrary automorphism $\varphi$ is given on a Lebesgue measure space $(X, \mathcal{B}, m)$, there exists a unique decomposition $\xi = \{A_\alpha\}$ of $X$ such that (i) each $A_\alpha$ is invariant under $\varphi$ and (ii) except for a negligible set (in a specified sense) of $A_\alpha$, each $A_\alpha$ is turned (in a natural manner) into a Lebesgue measure space, and the restriction of $\varphi$ to $A_\alpha$ is an ergodic automorphism. This decomposition is called the ergodic decomposition of $X$ with respect to $\varphi$. There is a corresponding decomposition with respect to a flow. A formula also exists that enables us to compute the entropy $h(\varphi)$ in terms of the entropies of ergodic components of $\varphi$.

(3) Let $X$ be a compact metric space and $\varphi: X \to X$ a †homeomorphism. N. Krylov and N. Bogolyubov showed that there always exists on $X$ a Borel probability measure $\mu$ that is invariant under $\varphi$. Let $\mathcal{P}$ be the collection of all Borel probability measures on $X$, and let $\mathcal{P}_\varphi$ be the subset of $\mathcal{P}$ consisting of those invariant under $\varphi$. Then $\mathcal{P}$ and $\mathcal{P}_\varphi$ are both convex sets compact with respect to the †weak * topology. If $\mathcal{E}_\varphi$ is the set of all extreme points in $\mathcal{P}_\varphi$, then by the †Krein-Milman theorem $\mathcal{E}_\varphi$ is not empty. A measure $\mu$ in $\mathcal{P}_\varphi$ belongs to $\mathcal{E}_\varphi$ if and only if $\varphi$ is ergodic with respect to $\mu$. When the set $\mathcal{E}_\varphi$ consists of a single element, $\varphi$ is called uniquely ergodic; $\varphi$ is called minimal if for every point $x \in X$, the orbit of $x$ under $\varphi$, $\mathrm{orb}_\varphi(x) = \{\varphi^n(x) \mid n \in \mathbf{Z}\}$, is dense in $X$. $\varphi$ is called strictly ergodic if it is both minimal and uniquely ergodic. A theorem of J. Oxtoby states that $\varphi$ is strictly ergodic if and only if for every continuous real-valued function $f$ on $X$, the sequence of averages $\left(\sum_{k=0}^{n-1} f(\varphi^k(x))\right)/n$ converges uniformly to a constant $M(f)$. There are homeomorphisms that are minimal or uniquely ergodic but not strictly ergodic.

(4) R. Jewett proved that any weakly mixing measure-preserving transformation on a Lebesgue space is spatially isomorphic to a strictly ergodic transformation. Krieger extended the result by showing that if the entropy of an ergodic transformation $\varphi$ is finite, then $\varphi$ is spatially isomorphic to a strictly ergodic transformation. Similar results were
obtained for flows by K. Jacobs, M. Denker, and E. Eberlein (→ [39]).

(5) A topological analog of the notion of entropy, called topological entropy, was introduced by Adler, A. Konheim, and J. McAndrew. This is defined as follows: For every open covering $\mathcal{A}$ of a compact topological space $X$, let $N(\mathcal{A})$ be the number of sets in the minimal subcovering of $\mathcal{A}$. For open coverings $\mathcal{A}$ and $\mathcal{B}$, let $\mathcal{A} \vee \mathcal{B}$ be the open covering $\{A \cap B \mid A \in \mathcal{A}, B \in \mathcal{B}\}$. For any open covering $\mathcal{A}$ and a continuous mapping $\varphi$ on $X$, the limit $\lim_{n \to \infty} (\log N(\mathcal{A} \vee \varphi^{-1}\mathcal{A} \vee \cdots \vee \varphi^{-(n-1)}\mathcal{A}))/n = h_{\mathrm{top}}(\varphi, \mathcal{A})$ exists. Topological entropy $h_{\mathrm{top}}(\varphi)$ of the continuous transformation $\varphi$ is now defined by $h_{\mathrm{top}}(\varphi) = \sup\{h_{\mathrm{top}}(\varphi, \mathcal{A}) \mid \mathcal{A}$ an open covering of $X\}$. L. Goodwyn showed that $h_{\mathrm{top}}(\varphi) \ge h_\mu(\varphi)$ for any $\varphi$-invariant probability measure $\mu$, where $h_\mu(\varphi)$ is the measure-theoretic entropy of $\varphi$ regarded as a $\mu$-preserving transformation. T. Goodman went further and succeeded in proving that $h_{\mathrm{top}}(\varphi) = \sup\{h_\mu(\varphi) \mid \mu$ a $\varphi$-invariant probability measure$\}$. In connection with these results there is interest in the question of the existence and uniqueness of an invariant measure for $\varphi$ with maximal entropy, i.e., a $\varphi$-invariant measure $\mu$ for which $h_\mu(\varphi) = h_{\mathrm{top}}(\varphi)$. Such a measure does not always exist, and even if it does it may not be unique. The notions of topological entropy and of measures with maximal entropy are generalized by Ruelle in the following way. For an open covering $\mathcal{A}$, a continuous mapping $\varphi$, and a real-valued continuous function $g$ on $X$, let $Z_n(\mathcal{A}, \varphi, g)$ be equal to $\inf\{\sum_{A \in \Gamma} \sup_{x \in A} \exp \sum_{k=0}^{n-1} g(\varphi^k(x))\}$, where the inf is taken over all subcoverings $\Gamma$ of the covering $\mathcal{A} \vee \varphi^{-1}\mathcal{A} \vee \cdots \vee \varphi^{-(n-1)}\mathcal{A}$. Then a finite limit $P(\mathcal{A}, \varphi, g) = \lim_{n \to \infty} (1/n) \log Z_n(\mathcal{A}, \varphi, g)$ exists, and the quantity $P(\varphi, g) = \sup\{P(\mathcal{A}, \varphi, g) \mid \mathcal{A}$ an open covering of $X\}$ is called topological pressure. When $g = 0$, $P(\varphi, g)$ reduces to $h_{\mathrm{top}}(\varphi)$. Ruelle proved for †expansive mappings $\varphi$ that $P(\varphi, g) = \sup\{h_\mu(\varphi) + \int g\,d\mu \mid \mu$ is a $\varphi$-invariant probability measure$\}$. This assertion is called the variational principle for the topological pressure. The variational principle was proved for general continuous mappings $\varphi$ by P. Walters. If a $\varphi$-invariant measure $\mu$ satisfies $P(\varphi, g) = h_\mu(\varphi) + \int g\,d\mu$, then $\mu$ is called an equilibrium state for $g$ with respect to $\varphi$. It is known that for expansive mappings $\varphi$ every continuous function $g$ on $X$ has an equilibrium state.

(6) Application of ergodic theory to problems in analytic number theory has been made by several authors. Ergodic or mixing properties of particular measure-preserving transformations that arise in connection with various problems in number theory have been exploited to give answers to these problems. New and more striking applications of ideas of ergodic theory to different types of questions in number theory have been started by Yu. Linnik, Furstenberg, W. Veech, T. Kamae, and others (→ [40, 41]).

(7) Most of the results discussed in this article dealt with the action of a cyclic group of transformations or of a one-parameter flow. There are significant extensions of many of these results to different types of group actions. For recent developments → [22].

References

[1] E. Hopf, Ergodentheorie, Springer, 1937.
[2] P. R. Halmos, Lectures on ergodic theory, Publ. Math. Soc. Japan, 1956.
[3] P. Billingsley, Ergodic theory and information, Wiley, 1965.
[4] K. Jacobs, Neuere Methoden und Ergebnisse der Ergodentheorie, Springer, 1960.
[5] N. Dunford and J. T. Schwartz, Linear operators I, Interscience, 1958.
[6] K. Yosida, Functional analysis, Springer and Kinokuniya, second edition, 1968.
[7] U. Krengel, Recent progress on ergodic theorems, Soc. Math. France, Astérisque, 50 (1977), 151-192.
[8] J. F. C. Kingman, Subadditive ergodic theory, Ann. Probability, 1 (1973), 883-909.
[9] D. Ruelle, Ergodic theory of differentiable dynamical systems, Publ. Math. Inst. HES, 50 (1979), 27-58.
[10] N. A. Friedman, Introduction to ergodic theory, Van Nostrand, 1969.
[11] H. Furstenberg, Ergodic behavior of diagonal measures and a theorem of Szemerédi on arithmetic progressions, J. Analyse Math., 31 (1977), 204-256.
[12] S. R. Foguel, Ergodic theory of Markov processes, Van Nostrand, 1969.
[13] Ya. G. Sinai, Gibbs measures in ergodic theory, Russian Math. Surveys, 27 (1972), 21-69. (Original in Russian, 1972.)
[14] R. Bowen, Equilibrium states and the ergodic theory of Anosov diffeomorphisms, Lecture notes in math. 470, Springer, 1975.
[15] G. Maruyama, The harmonic analysis of stationary stochastic processes, Mem. Fac. Sci. Kyushu Univ., (A) 4 (1949), 45-106.
[16] A. B. Katok and A. M. Stepin, Approximations in ergodic theory, Russian Math. Surveys, 22 (5) (1967), 77-102. (Original in Russian, 1967.)
[17] W. Krieger, On entropy and generators of measure-preserving transformations, Trans. Amer. Math. Soc., 149 (1970), 453-464.
[18] V. A. Rokhlin (Rohlin), New progress in
the theory of transformations with invariant measure, Russian Math. Surveys, 15 (4) (1960), 1-22. (Original in Russian, 1960.)
[19] V. A. Rokhlin, Lectures on the entropy theory of measure-preserving transformations, Russian Math. Surveys, 22 (5) (1967), 1-52. (Original in Russian, 1967.)
[20] W. Parry, Entropy and generators in ergodic theory, Benjamin, 1969.
[21] D. S. Ornstein, Ergodic theory, randomness, and dynamical systems, Yale Univ. Press, 1974.
[22] D. S. Ornstein, A survey of some recent results on ergodic theory, Math. Assoc. Amer. Studies in Math., 18 (1978), 229-262.
[23] P. Shields, The theory of Bernoulli shifts, Chicago Lectures in Math., Univ. of Chicago Press, 1973.
[24] A. B. Katok, Ya. G. Sinai, and A. M. Stepin, Theory of dynamical systems and general transformation groups with invariant measure, J. Soviet Math., 7 (1977), 974-1065. (Original in Russian, 1975.)
[25] J. Moser, E. Phillips, and S. Varadhan, Ergodic theory, a seminar, NYU Lecture Notes, 1975.
[26] A. M. Vershik and S. A. Yuzvinskii, Dynamical systems with invariant measure, Progress in Math. VIII, Plenum, 1970, 151-215. (Original in Russian, 1967.)
[27] H. A. Dye, On groups of measure-preserving transformations I, II, Amer. J. Math., 81 (1959), 119-159; 85 (1963), 551-576.
[28] L. Sucheston (ed.), Contributions to ergodic theory and probability, Lecture notes in math. 160, Springer, 1970.
[29] W. Krieger, On ergodic flows and the isomorphism of factors, Math. Ann., 223 (1976), 19-70.
[30] T. Hamachi, Y. Oka, and M. Osikawa, Flows associated with ergodic non-singular transformation groups, Publ. Res. Inst. Math. Sci., 11 (1975), 31-50.
[31] T. Hamachi, The normalizer group of an ergodic automorphism of type III and the commutant of an ergodic flow, J. Functional Anal., 40 (1981), 387-403.
[32] S. Kakutani, Induced measure preserving transformations, Proc. Imp. Acad. Tokyo, 19 (1943), 635-641.
[33] A. B. Katok, Monotone equivalence in ergodic theory, Math. USSR-Izv., 11 (1977), 99-146. (Original in Russian, 1977.)
[34] E. A. Satayev, An invariant of monotone equivalence determining the quotients of automorphisms monotonely equivalent to a Bernoulli shift, Math. USSR-Izv., 11 (1977), 147-169. (Original in Russian, 1977.)
[35] L. Auslander, L. Green, F. Hahn, et al., Flows on homogeneous spaces, Ann. Math. Studies, Princeton Univ. Press, 1963.
[36] V. I. Arnold and A. Avez, Ergodic problems of classical mechanics, Benjamin, 1968.
[37] Ya. G. Sinai, Introduction to ergodic theory, Math. Notes, Princeton Univ. Press, 1976.
[38] Ya. G. Sinai, On the foundations of the ergodic hypothesis for a dynamical system of statistical mechanics, Sov. Math. Dokl., 4 (1963), 1818-1822. (Original in Russian, 1963.)
[39] M. Denker, C. Grillenberger, K. Sigmund, Ergodic theory on compact spaces, Lecture notes in math. 527, Springer, 1976.
[40] Yu. V. Linnik, Ergodic properties of algebraic fields, Springer, 1968.
[41] W. A. Veech, Topological dynamics, Bull. Amer. Math. Soc., 83 (1977), 775-830.

137 (VI.18)
Erlangen Program

When F. Klein succeeded K. G. C. von Staudt as professor at the Philosophical Faculty of Erlangen University in 1872, he gave an inauguration lecture entitled "Comparative Consideration of Recent Geometric Researches," which later appeared as an article [1]. In it he developed a penetrating idea, now called the Erlangen program, in which he utilized group-theoretic concepts to unify various kinds of geometries that until that time had been considered separately.

The concept of transformation is not new; it was, however, not until the 18th century that the concept of transformation groups was recognized as useful. The theory of †invariants of linear groups and the †Galois theory of algebraic equations attracted attention in the 19th century. In the same century, †projective geometry made remarkable progress, for example, when A. Cayley and E. Laguerre discovered that metrical properties of Euclidean and †non-Euclidean geometries can be interpreted in the language of projective geometry. Cayley proclaimed, "All geometry is projective geometry." After learning geometry under J. Plücker, Klein made the acquaintance of S. Lie. Both men understood the importance of the group concept in mathematics. Lie studied the theory of †continuous transformation groups, and Klein studied discontinuous transformation groups from a geometric standpoint. Klein was thus led to the idea of the Erlangen program, which provided a bird's-eye view of geometry.

Klein's idea can be summarized as follows: A space $S$ is a given set with some geometric structure. Let a transformation group $G$ of $S$ be given. A subset of $S$, called a figure, may
have various kinds of properties. The study of the properties that are left invariant under all transformations belonging to $G$ is called the geometry of the space $S$ subordinate to the group $G$. Let this geometry be denoted by $(S, G)$. Two figures of $S$ are said to be congruent in $(S, G)$ if one of them is mapped to the other by a transformation of $G$. The geometry $(S, G)$ is actually the theory of invariants of $S$ under $G$, with the term invariants to be understood in a wider sense; it means both invariant quantities and invariant properties or relations. Replacing $G$ in $(S, G)$ by a subgroup $G'$ of $G$, we obtain another geometry $(S, G')$. A series of subgroups of $G$ gives rise to a series of geometries. For instance, let $A$ be a figure of $S$. The elements of $G$ leaving $A$ invariant form a subgroup $G(A)$ of $G$ that operates on $S' = S - A$. We thus obtain a geometry $(S', G(A))$ in which $A$ is called an absolute figure. In this way, many geometries are obtained from projective geometry. Klein gave numerous examples. It is noteworthy that he mentioned even the groups of †rational and †homeomorphic transformations.

Klein's idea not only synthesized the geometries known at that time, but also became a guiding principle for the development of new geometries.

In 1854 G. F. B. Riemann published his epochmaking idea of Riemannian geometry. This geometry has a metric, but in general lacks congruence transformations (isometries). Thus Riemannian geometry is a geometry that is not included in the framework of the Erlangen program. The importance of Riemannian geometry was acknowledged when it was used by A. Einstein in 1916 as a foundation of his general theory of relativity. H. Weyl, O. Veblen, and J. A. Schouten discovered geometries that are generalizations of affine, projective, and †conformal geometries in the same way as Riemannian geometry is a generalization of Euclidean geometry. It became necessary to establish a theory that reconciled the ideas of Klein and Riemann; E. Cartan succeeded in this by introducing the notion of †connection (→ 80 Connections). However, the Erlangen program, which gave an insight into the essential character of classical geometries, still maintains its role as one of the guiding principles of geometry.

References

[1] F. Klein, Vergleichende Betrachtungen über neuere geometrische Forschungen, Math. Ann., 43 (1893), 63-100 (Gesammelte mathematische Abhandlungen, Springer, 1921, vol. 1, 460-497).
[2] G. Fano, Kontinuierliche geometrische Gruppen, Enzykl. Math., Leipzig, 1907-1910, Geometrie III, pt. 1, AB4b, 289-388.
[3] F. Klein, Vorlesungen über nicht-Euklidische Geometrie, Springer, 1928.
[4] F. Klein, Vorlesungen über höhere Geometrie, Springer, 1926.
[5] E. Cartan, Les récentes généralisations de la notion d'espace, Bull. Sci. Math., 48 (1924), 294-320.
[6] E. Cartan, La théorie des groupes et les recherches récentes de géométrie différentielle, Enseignement Math., 24 (1925), 1-18; Proc. Internat. Congr. Math., Toronto, 1 (1928), 85-94.

138 (XV.3)
Error Analysis

A. General Remarks

The data obtained by observations or measurements in astronomy, geodesy, and other sciences do not usually give exact values of the quantities in question. The error is the difference between the approximation and the exact value. The theory of errors originated from systematic work with data accompanied by errors, and the statistical treatment of experimental data was the main concern in the beginning stages (→ 397 Statistical Data Analysis). However, due to the recent development of high-speed computers it has become possible to carry out computations on a tremendously large scale, and the detailed analysis of errors has become an absolute necessity in modern numerical computation. Hence the analysis of errors in relation to numerical computation has become the center of research in error theory.

B. Errors

One rarely makes a mistake in counting a small number of things; therefore the exact value of the count can be determined. On the other hand, the exact value in decimals is never obtainable for a continuous quantity, say length, no matter how fine measurements are made, and a large or small stochastic error is thus inevitable in measuring a continuous quantity. A discrete finite quantity is a digital quantity, and a continuous quantity is an analog quantity. The natures of these two quantities are quite different. The values of a digital quantity are distributed on some discrete set, while the values of an analog quantity are distributed with a continuous
probability. Thus there is the possibility of error even for treatment of digital quantities, although checking the results for these quantities is easy. It is preferable to regard digital quantities as being analog quantities if the possible values are densely distributed.

On the other hand, when an appropriate †analog computer is not available, analog quantities receive treatment similar to digital quantities. They are expressed as $x$ times some unit, and $x$ is expanded in the decimal or binary systems. An approximation to such an expansion is obtained by rounding off a numeral at some place, the position depending on the capacity for computation by available methods. There are two ways of rounding off numbers, the fixed point method and the floating point method. The former specifies the place of digits where the rounding off is made, and the latter essentially specifies the number of significant digits.

Classification of errors. (1) Errors of input data are the errors included in input data themselves. Such input-data errors include the errors that occur when we represent constants such as $1/3$, $\sqrt{2}$, $\pi$ by finite decimals. (2) Truncation errors occur in approximate expressions for the computation formulas under consideration. (3) Roundoff errors occur in taking some finite number of digits from the earlier digits in the numerical value at each step. If the computation of an infinite number of digits were actually possible, no errors of this type would appear. Recently, it has been considered more preferable to call this "computational error."

The difference between fixed point and floating point rounding off is that the former is better suited for operations of addition and subtraction and the latter is better for multiplication and division. In fixed point rounding off, if a number is multiplied many times by numbers less than 1, a so-called underflow may occur, and many digits may disappear; a great deal of information can thus be lost. In computation for scientific research that involves frequent multiplication and division, floating point rounding off is preferable. It should be noted that rounding off for addition and subtraction may also cause a critical loss of information. This phenomenon is called canceling digits. For instance, in the subtraction $7.6325071 - 7.6318425 = 0.0006646$, where the subtrahend and minuend share several early significant digits, the difference loses those digits. Thus, relative errors may be magnified tremendously. By taking a large number of significant digits, such a situation may be avoided to some extent. So-called high-precision computation shows its effectiveness in such cases. Similarly, when a small number $b$ is added to a large number $a$, the result may be just $a$ and the information of $b$ could be lost completely. This kind of loss of information can often cause serious trouble.

C. Methods of Error Analysis

In order to analyze the propagation of errors, let us assume that all numbers are carried to infinitely many digits so that no roundoff error occurs. Suppose that we are to evaluate the function $y = f(x_1, \ldots, x_n)$ when $x_1, x_2, \ldots, x_n$ are assigned. Let $\varepsilon_T$ be the truncation error of an approximate expression. If an input error $\delta_i$ for $x_i$ exists, then the corresponding error for $y$ is

[...]

Moreover, suppose that at the final step we round off to get a result with a finite number of digits, by which an error $\varepsilon$ is introduced. Then the final error $\delta$ for $y$ is

[...]

This procedure is performed for each step needed in the computation. If $y = f(x_1, \ldots, x_n)$ is a specified step, then the input error $\delta_i$ for that step involves all the errors arising before that step, i.e., $\delta_i$ is an accumulated error. Forward analysis is a method to estimate the total accumulated error from the initial input data. It is usually quite difficult to obtain precise estimates by means of this method. In contrast, J. H. Wilkinson proposed the following backward analysis. Here, the computational value $y$ is considered as the exact result for the modified initial data $\bar x_1, \ldots, \bar x_n$, say $y = f(\bar x_1, \ldots, \bar x_n)$, and the estimates for $|x_i - \bar x_i|$ are given. For example, in the binary floating point arithmetic of $u$ bits, we always have the relation:

computational value of $a + b = a(1 + \delta) + b(1 + \varepsilon)$,

with $|\delta|, |\varepsilon| < 2^{-u}$, even when cancellation or loss of information occurs. Wilkinson has made a deep investigation of error analysis for linear computation, algebraic equations, and eigenvalue problems by means of backward analysis [4, 5].

D. Proliferation of Errors

The phenomena usually called "accumulation of errors" should more appropriately be called the proliferation of errors, where the algorithm itself includes a particular mechanism to increase a small error indefinitely. An example is the recursion formula for †Bessel functions

$$J_{n+1}(x) = (2n/x)J_n(x) - J_{n-1}(x).$$

It is customary to compute $J_n(x)$ by this formula, starting with the values of $J_0(x)$ and $J_1(x)$, with $x$ given. By putting $J_{n-1}(x) = y_n$, $J_n(x) = z_n$, the recursion formula can be regarded as a linear transformation of the point $P_n(y_n, z_n)$ in a plane into another point $P_{n+1}(y_{n+1}, z_{n+1})$, where

$$y_{n+1} = z_n, \qquad z_{n+1} = -y_n + (2n/x)z_n.$$

The †eigenvalues $\lambda_1, \lambda_2$ of this †difference equation satisfy the following: As long as $n < |x|$, we have $|\lambda_1| = |\lambda_2| = 1$, while if $n > |x|$, $\lambda_1$ is greater than 1 and increases rapidly as $n$ tends to infinity. Consequently, even the slightest discrepancy in the position of $P_n$ gives rise to a greatly magnified error in the result [3].

Many studies have been made of the propagation of errors and of instability phenomena in the numerical solution of ordinary differential equations (→ 303 Numerical Solution of Ordinary Differential Equations; [2]).

References

[1] J. von Neumann and H. H. Goldstine, Numerical inverting of matrices of higher order, Bull. Amer. Math. Soc., 53 (1947), 1021-1099.
[2] P. Henrici, Discrete variable methods in ordinary differential equations, Wiley, 1962.
[3] T. Uno, The problem of error propagation (in Japanese), Sûgaku, 15 (1963), 30-40.
[4] J. H. Wilkinson, Rounding errors in algebraic processes, Prentice-Hall, 1963.
[5] J. H. Wilkinson, Algebraic eigenvalue problem, Clarendon Press, 1965.
[6] L. B. Rall (ed.), Error in digital computation I, II, Wiley, 1965, 1966.

139 (VI.3)
Euclidean Geometry

A. History

Attempts to construct axiomatically the geometry of ordinary 3-dimensional space were undertaken by the ancient Greeks, culminating in Euclid's Elements (→ 187 Greek Mathematics). The fifth postulate of Euclid's Elements requires that two straight lines in a plane that meet a third line, as shown in Fig. 1, in angles $\alpha$, $\beta$ whose sum is less than 180°, have a common point. In the Elements, two straight lines in a plane without a common point are said to be parallel. It can be proved from the other statements given as axioms in the Elements that if $\alpha + \beta = 180°$, the two lines $l$ and $l'$ in Fig. 2 are parallel. Hence given a line $l$ and a point $P$ not lying on $l$, there exists a line $l'$ passing through $P$ that is parallel to $l$. The fifth postulate ensures the uniqueness of the parallel $l'$ passing through the given point $P$. For this reason, the fifth postulate is also called the axiom of parallels. Utilizing this axiom, we can prove the well-known theorems on parallel lines, the sum of interior angles of triangles, etc. The axiom plays an important role in the proof of the Pythagorean theorem in the Elements. The axiom is also called Euclid's axiom.

Fig. 1    Fig. 2

However, Euclid states this axiom in a quite complicated form, and unlike his other axioms, it cannot be verified within a bounded region of the space. Many mathematicians tried in vain to deduce it from other axioms. Finally the axiom was shown to be independent of other axioms in the Elements by the invention of non-Euclidean geometry in the 19th century (→ 285 Non-Euclidean Geometry).

The term Euclidean geometry is used in contrast to non-Euclidean geometry to refer to the geometry based on Euclid's axiom of parallels as well as on other axioms explicit or implicit in Euclid's Elements. It was in the 19th century that a complete system of Euclidean geometry was explicitly formulated (→ 155 Foundations of Geometry). From the standpoint of present-day mathematics, it would be natural to define first the group of motions by the axiom of free mobility due to H. Helmholtz (→ Section B) and then, following F. Klein, to define Euclidean geometry as the study of properties of spaces that are invariant under the groups of these motions (→ 137 Erlangen Program).

B. Group of Motions

Let $P$ be an †ordered field and $A^n$ the $n$-dimensional †affine space over $P$. Let $B^r$ be an $r$-dimensional affine subspace of $A^n$, $B^{r-1}$ an $(r-1)$-dimensional subspace of $B^r$, $B^{r-2}$ an $(r-2)$-dimensional subspace of $B^{r-1}$, etc. In the sequence of subspaces $B^r, B^{r-1}, \ldots, B^0$, each $B^k - B^{k-1}$ consists of two †half-spaces ($k = r, r-1, \ldots, 1$). Let $H^k$ be one of these half-spaces. Then the sequence of half-spaces $H^r, H^{r-1}, \ldots, H^1$ is called an $r$-dimensional flag, denoted by $\mathfrak{H}^r$ ($n \ge r \ge 1$), and $B^r$ and $H^r$ are called the principal space and the principal half-space of $\mathfrak{H}^r$, respectively. If $f$ is a †proper affine transformation of $A^n$, then $f(H^r), f(H^{r-1}), \ldots, f(H^1)$ form an $r$-dimensional flag $\mathfrak{K}^r$. We write $f(\mathfrak{H}^r) = \mathfrak{K}^r$.

Let $\mathfrak{A}^n$ be the group of all proper affine transformations of $A^n$. A subgroup $\mathfrak{B}^n$ of $\mathfrak{A}^n$ with the following two properties is called the group of motions, and any element of $\mathfrak{B}^n$ is called a motion (or congruent transformation). (1) Let $r$ be an integer between 1 and $n$, and let $\mathfrak{H}^r$, $\mathfrak{K}^r$ be any two $r$-dimensional flags. Then there exists an element $f$ of $\mathfrak{B}^n$ that carries $\mathfrak{H}^r$ to $\mathfrak{K}^r$: $f(\mathfrak{H}^r) = \mathfrak{K}^r$. (2) Let $f$, $g$ be two elements of $\mathfrak{B}^n$ with $f(\mathfrak{H}^r) = \mathfrak{K}^r$, $g(\mathfrak{H}^r) = \mathfrak{K}^r$, and let $p$ be any point on the principal space of $\mathfrak{H}^r$. Then $f(p) = g(p)$, that is, $f$, $g$ have the same "effect" on the principal space. In particular, when $r = n$, then $f = g$. That $\mathfrak{A}^n$ possesses a subgroup $\mathfrak{B}^n$ with properties (1) and (2) is called the axiom of free mobility.

When $n = 1$, it is easy to see that the elements of $\mathfrak{B}^1$ are only those elements $f$ of $\mathfrak{A}^1$ that can be expressed in the form $f(x) = \pm x + a$ ($a \in P$). When $n \ge 2$, $P$ must satisfy the following condition in order that a subgroup $\mathfrak{B}^n$ with properties (1) and (2) exists in $\mathfrak{A}^n$: If $a, b \in P$, then $P$ contains an element $x$ such that $x^2 = a^2 + b^2$. When this condition is satisfied, the ordered field $P$ is called a Pythagorean field. Every †real closed field (e.g., the field $\mathbf{R}$ of real numbers) is Pythagorean. If $\mathfrak{B}^n$ exists, its uniqueness is assured by (1) and (2). Furthermore, if $P$ contains a square root of every positive element (this condition is satisfied, for example, by $\mathbf{R}$), then conditions (1) and (2) are reducible to the case $r = n$ only, i.e., conditions (1) and (2) for other values of $r$ follow from (1) and (2) with $r = n$. Hereafter, we assume the existence of $\mathfrak{B}^n$.

Suppose that we have $A^n \supset B^r \supset B^k$ ($n \ge r \ge k \ge 0$), and let $\mathfrak{H}^r$ be a flag with the principal space $B^r$: $\mathfrak{H}^r = (H^r, \ldots, H^k, \ldots, H^1)$. Let $\mathfrak{K}^r$ be another flag with the same principal space $B^r$: $\mathfrak{K}^r = (K^r, \ldots, K^k, \ldots, K^1)$, where we suppose that $H^j = K^j$ for $k \ge j \ge 1$, whereas for $r \ge i \ge k+1$, we suppose that $H^i$ and $K^i$ are different half-spaces on $B^i$ divided by $B^{i-1}$. The flag $\mathfrak{K}^r$ is denoted by $\mathfrak{H}^r_k$. An element $f$ of $\mathfrak{B}^n$ with $f(\mathfrak{H}^r) = \mathfrak{H}^r_k$ is called a symmetry (or reflection) of $B^r$ with respect to $B^k$. It leaves every point on $B^k$ invariant, and its effect on $B^r$ is determined only by $B^k$ independently of the choice of half-spaces in $\mathfrak{H}^r$ and $\mathfrak{K}^r$ (subject to the conditions mentioned above). In particular, the symmetry of $A^n$ with respect to a point $A^0 = p$ is called a central symmetry with respect to the center $p$; and the symmetry of $A^n$ with respect to a hyperplane $A^{n-1} = h$ is called a hyperplanar symmetry. They are uniquely determined by $p$ and $h$, respectively, and are denoted by $S_p$ and $S(h)$, respectively. If $H(A^n)$ is the set of all hyperplanes of $A^n$, then $\mathfrak{B}^n$ is generated by $\{S(h) \mid h \in H(A^n)\}$. Furthermore, if $p$, $q$ are two points of $A^n$, the composite $S_q S_p$ is a parallel translation by $2 \cdot \overrightarrow{pq}$ (Fig. 3). The parallel translations generate a normal subgroup $\mathfrak{T}^n$ of $\mathfrak{B}^n$. For $p, q \in A^n$, the element of $\mathfrak{T}^n$ that carries $p$ to $q$ is denoted by $\tau_{pq}$. The subgroup of $\mathfrak{B}^n$ that leaves a point $p$ of $A^n$ invariant is denoted by $\mathfrak{O}^n_p$. Obviously, we have $\mathfrak{O}^n_q = \tau_{pq} \mathfrak{O}^n_p \tau_{pq}^{-1}$. Thus all the $\mathfrak{O}^n_p$ (for $p \in A^n$) are isomorphic. We call $\mathfrak{O}^n_p$ the orthogonal group around $p$ and any element of $\mathfrak{O}^n_p$ an orthogonal transformation around $p$. More generally, any element of $\mathfrak{B}^n$ that leaves a subspace $A^k$ of $A^n$ invariant is called an orthogonal transformation around the subspace $A^k$.

Fig. 3 ($x' = S_p(x)$, $x'' = S_q(x')$)

An element of $\mathfrak{B}^n$ that preserves the orientation of $A^n$, i.e., is represented by a †proper affinity with a positive determinant, is called a proper motion. Proper motions form a subgroup $\mathfrak{B}^n_0$ of $\mathfrak{B}^n$. Rotations are, by definition, orthogonal transformations belonging to $\mathfrak{B}^n_0$. Sometimes $\mathfrak{B}^n_0$ is called the group of motions; then $\mathfrak{B}^n$ is called the group of motions in the wider sense. In this article, however, we shall continue to use the terminology introduced above.

The study of the properties of $A^n$ invariant under $\mathfrak{B}^n$ is $n$-dimensional Euclidean geometry. Since $\mathfrak{A}^n \supset \mathfrak{B}^n$, every proposition in affine geometry (→ 7 Affine Geometry) can be considered a proposition in Euclidean geometry, but there are many propositions that are proper to Euclidean geometry. Sometimes the subgroup of $\mathfrak{A}^n$ generated by $\mathfrak{B}^n$ and the homotheties of $A^n$, i.e., elements of $\mathfrak{A}^n$ represented by †scalar matrices, is called the group of motions in the wider sense, and the study of properties of $A^n$ invariant under this group is called $n$-dimensional Euclidean geometry in the wider sense.

C. Length of Segments

Two figures $F$, $F'$ in $A^n$ are said to be congruent if there exists an $f \in \mathfrak{B}^n$ such that $f(F) = F'$.
Then we write $F \equiv F'$. The congruence relation is an †equivalence relation. Let $s = \overline{pq}$, $s' = \overline{p'q'}$ be two †segments in $A^n$. We say that $s$, $s'$ have equal length when $s \equiv s'$. Length is an attribute of the equivalence class of segments. The length of $s$ is denoted by $|s|$. All segments of the form $\overline{pp}$ are congruent, and we define $|\overline{pp}| = 0$. If we are given a length and a †half-line starting from a point $p$, we can find a unique point $q$ on it such that $|\overline{pq}|$ = the given length (Fig. 4). Let $r$ be a point on the extension of $\overline{pq}$. The length $|\overline{pr}|$ is then uniquely determined by $|\overline{pq}|$ and $|\overline{qr}|$. It is defined as the sum of the lengths: $|\overline{pr}| = |\overline{pq}| + |\overline{qr}|$. With respect to the addition thus defined, lengths of segments in $A^n$ form a †commutative semigroup with the cancellation law, which can be extended to an †Abelian group $M$ with 0 as the identity element (→ 190 Groups P).

Fig. 4

Let $|s| \ne 0$ and $|s'|$ be any length. On a half-line starting from $p$, we can find points $q$, $r$ with $|\overline{pq}| = |s|$, $|\overline{pr}| = |s'|$. Then the element $\overline{pr}/\overline{pq} = \lambda \in P$ (→ 7 Affine Geometry) is a positive element of $P$ uniquely determined by $|s|$ and $|s'|$. We call $\lambda$ the measure of $|s'|$ with the unit $|s|$ and denote it by $|s'| : |s|$. If $P$ is †Archimedean, $\lambda$ can be represented by a real number (→ 149 Fields N). We have $(|s'| + |s''|) : |s| = (|s'| : |s|) + (|s''| : |s|)$ and $(|s''| : |s'|)(|s'| : |s|) = |s''| : |s|$ (if $|s'| \ne 0$). Thus the mapping $|s'| \to |s'| : |s|$ sends the additive semigroup of lengths to that of the positive elements of $P$. This is actually an isomorphism, which can be extended to an isomorphism of $M$ onto the additive group of the field $P$.

Let $\varphi(X, X', \ldots, X^{(\nu)})$ be a †homogeneous rational function of a finite number of variables $X, X', \ldots, X^{(\nu)}$. If a relation $\varphi(\lambda, \lambda', \ldots, \lambda^{(\nu)}) = 0$ holds for $\lambda = |s| : |s_0|$, $\lambda' = |s'| : |s_0|$, ..., $\lambda^{(\nu)} = |s^{(\nu)}| : |s_0|$, where $|s_0|$ is a length $\ne 0$, then $\varphi(\lambda_1, \lambda_1', \ldots, \lambda_1^{(\nu)}) = 0$ holds also for $\lambda_1 = |s| : |s_1|$, $\lambda_1' = |s'| : |s_1|$, ..., $\lambda_1^{(\nu)} = |s^{(\nu)}| : |s_1|$, where $|s_1|$ is any other length $\ne 0$. Hence, in this case, the expression $\varphi(|s|, |s'|, \ldots, |s^{(\nu)}|) = 0$ is meaningful.

D. Angles and Their Measure

An angle $\angle AOB$ is a figure constituted by two half-lines $OA$, $OB$ starting from the same point $O$ but belonging to different straight lines. The point $O$ is called the vertex, and the two half-lines $OA$, $OB$ are called the sides of $\angle AOB$ (Fig. 5). Two congruent angles are said to have the same measure, denoted by $|\angle AOB|$ or sometimes simply by $\alpha$. Let $\mathfrak{H}^2 = (H^2, H^1)$ ($H^1$ = half-line $QR$) be a given 2-dimensional flag and $\alpha$ the given measure of an angle. Then we can find a unique half-line $QP$ in the given half-plane $H^2$ such that $|\angle PQR| = \alpha$ (Fig. 6). The angle $\angle PQR$ is said to "belong" to $\mathfrak{H}^2$. Let $K^2$ be the half-plane separated by the line $P \cup Q$ containing the half-line $QR$. Then $H^2 \cap K^2$ is called the interior of the angle $\angle PQR$. Let $\angle P'QR$ be another angle belonging to $\mathfrak{H}^2$. If the interior of the latter angle is a subset of the interior of $\angle PQR$ and $QR \ne QP'$, then $|\angle PQR|$ is said to be greater than $|\angle P'QR|$, and we write $|\angle PQR| > |\angle P'QR|$. Actually, it can be shown that $>$ is a relation between the measures of $\angle PQR$ and $\angle P'QR$, and that the set of measures of angles forms a †linearly ordered set with respect to the relation $\ge$ defined in the obvious way. When $|\angle PQR| > |\angle P'QR|$, we write $|\angle PQR| = |\angle PQP'| + |\angle P'QR|$. Actually, these are relations between measures of angles. Furthermore, if the measure $\alpha$ of an angle is given, the set of measures of angles $\le \alpha$ forms a linearly ordered set order-isomorphic to a segment and satisfying: (i) if $\beta < \gamma$ then there exists a $\delta$ such that $\beta + \delta = \gamma$; (ii) $\beta + \delta = \delta + \beta$ and $(\beta_1 + \beta_2) + \beta_3 = \beta_1 + (\beta_2 + \beta_3)$ if all these sums exist; and (iii) $\beta_1 + \delta = \beta_2 + \delta$ implies $\beta_1 = \beta_2$. When $P$ is Archimedean, these properties imply that the measures of angles $\le |\angle PQR|$ can be represented by positive real numbers $\le k$ ($k$ is any given positive number) such that the relations of ordering and addition are preserved.

Fig. 5    Fig. 6

This one-to-one correspondence between the measures of angles and a subset $S$ of the interval $(0, k]$ of real numbers can be extended to a correspondence between the measures of general angles and the subset of $\mathbf{R}$ obtained from $S$. When $P = \mathbf{R}$, then we have $S = (0, k]$, and any real number appears as a measure of a general angle. We can choose $\angle PQR$ and the positive number $k$ arbitrarily, but it is customary to choose them as follows. Suppose we are given an angle $\angle AOB$. Let the extensions of the half-lines $OA$ and $OB$ in the opposite directions be $OA'$ and $OB'$, respectively. The angles $\angle AOB$ and $\angle A'OB$ are called supplementary angles of each other, and so are
$\angle AOB$ and $\angle AOB'$. The angles $\angle AOB$ and $\angle A'OB'$ are called vertical angles to each other (Fig. 7). Any angle is congruent to its vertical angle, and an angle that is congruent to its supplementary angle has a fixed measure. Such an angle (or its measure) is called a right angle. In the description of the measurement of angles, we usually consider the case where the special angle $\angle PQR$ is a right angle, and we set $k = \pi/2$. (The existence and uniqueness of the right angle can be proved.) An angle that is greater (smaller) than a right angle is called an obtuse (acute) angle. A general angle whose measure is twice (four times) a right angle is called a straight angle (perigon). Sometimes we choose as the "unit angle" 1/90 of a right angle, which is called a degree (hence a right angle = 90 degrees, denoted by 90°); 1/60 of a degree is called a minute (1° = 60 minutes, denoted by 60′), and 1/60 of a minute is called a second (1′ = 60 seconds, denoted by 60″). If, as usual, we put the right angle equal to $\pi/2$, then the unit angle is $(2/\pi)$(right angle). This is called a radian, and 1 radian = 180°/$\pi$ = 57°17′44.806...″ ≈ 57.3°.

Fig. 7

If a straight line $m$ intersects two straight lines $l$, $l'$, eight angles $\alpha, \beta, \gamma, \delta, \alpha', \beta', \gamma', \delta'$ appear, as in Fig. 8. In this figure, $\alpha$ and $\alpha'$, $\beta$ and $\beta'$, $\gamma$ and $\gamma'$, and $\delta$ and $\delta'$ are called corresponding angles, while $\alpha$ and $\gamma'$, $\beta$ and $\delta'$, $\gamma$ and $\alpha'$, and $\delta$ and $\beta'$ are called alternate angles to each other. When $l$ and $l'$ are parallel, each of these angles is congruent to its corresponding or alternate angle.

The Pythagorean theorem asserts that if a triangle $\triangle ABC$ is given for which $\angle ABC$ is a right angle (Fig. 9), then $|AB|^2 + |BC|^2 = |CA|^2$ (which makes sense since $X^2 + Y^2 - Z^2$ is a homogeneous polynomial).

Fig. 8    Fig. 9

E. Rectangular Coordinates

When two straight lines $l$, $m$ intersect, two pairs of vertical angles appear. If one of these angles is a right angle, then all are. Then we say that $l$ and $m$ are orthogonal (or perpendicular) to each other, and write $l \perp m$. Let $l$ be a line and $A^r$ an $r$-dimensional subspace of $A^n$ ($1 \le r \le n-1$) intersecting $l$ at a point $O = A^r \cap l$. If $l$ is orthogonal to all lines on $A^r$ passing through $O$, then $l$ is said to be orthogonal to $A^r$, and we write $l \perp A^r$ (Fig. 10). If $A^{n-1}$ is any hyperplane in $A^n$, then there exists a unique line $l$ through a given point $P$ of $A^n$ that is orthogonal to $A^{n-1}$; this $l$ is called the perpendicular to $A^{n-1}$ through $P$, and the intersection $l \cap A^{n-1}$ is called the foot of the perpendicular through $P$. When $A^{n-1}$ is given, the mapping from $A^n$ to $A^{n-1}$ assigning to every point $P$ of $A^n$ the foot of the perpendicular through $P$ is called the orthogonal projection from $A^n$ to $A^{n-1}$.

Fig. 10

Let $\mathfrak{H}^n = (H^n, H^{n-1}, \ldots, H^1)$ be an $n$-dimensional flag of $A^n$ and $O$ the initial point of the half-line $H^1$. Then we can find a point $E_i$ in $H^i$ ($i = 1, 2, \ldots, n$) such that $O \cup E_i \perp O \cup E_j$ ($i \ne j$; $i, j = 1, 2, \ldots, n$). Moreover, if $|e|$ is any unit of length, then $E_i$ can be chosen uniquely so that $|OE_i| = |e|$ ($i = 1, 2, \ldots, n$). Then $O, E_1, \ldots, E_n$ are †independent points in $A^n$, and we have $A^n = O \cup E_1 \cup \cdots \cup E_n$. Thus we have a †frame $\Sigma = (O; E_1, \ldots, E_n)$ of $A^n$ with $O$ as origin and the $E_i$ as unit points. Such a frame is called an orthogonal frame. A coordinate system with this frame, called an orthogonal coordinate system adapted to $\mathfrak{H}^n$, is uniquely determined by $\mathfrak{H}^n$. A motion is characterized as an †affinity sending one orthogonal frame onto another or onto itself.

Utilizing an orthogonal coordinate system, the lengths of segments and the measures of angles can be expressed simply. Let $(x_1, \ldots, x_n)$ be the coordinates of $X$ with respect to such a coordinate system. Then the length $|OX|$ of the segment $OX$ (with $|e|$ as unit) is equal to $\left(\sum_{i=1}^n x_i^2\right)^{1/2}$, and when $Y$ is another point, with coordinates $(y_1, \ldots, y_n)$, $O \ne X$, $O \ne Y$, then we have

$$\cos|\angle XOY| = \frac{\sum_{i=1}^n x_i y_i}{\left(\sum_{i=1}^n x_i^2\right)^{1/2}\left(\sum_{i=1}^n y_i^2\right)^{1/2}}.$$

In particular, we have $O \cup X \perp O \cup Y$ if and only if $\sum_{i=1}^n x_i y_i = 0$.

We may write $x = \overrightarrow{OX}$ for the †location vector of $X$. Then the †affinity $Ax + b$ is a motion if and only if $A$ is an †orthogonal matrix. Thus the †inner product $(x, y)$ is invariant
under motions; it therefore has meaning in Euclidean geometry. If we put $|x| = (x, x)^{1/2}$, then the right-hand sides of the formulas for $|OX|$ and $\cos|\angle XOY|$ can be written as $|x|$ and $(x, y)/(|x| \cdot |y|)$. More generally, we have $|XY| = |y - x|$. This is the Euclidean distance (or simply distance) between $X$ and $Y$. Then $A^n$ becomes a †metric space with this distance, i.e., a Euclidean space. Historically, the notion of metric spaces was introduced in generalizing Euclidean spaces (→ 273 Metric Spaces).

F. Area and Volume

The subset $I^n$ of $A^n$ consisting of points $(x_1, \ldots, x_n)$ with respect to an orthogonal coordinate system, with $0 \le x_i \le 1$, $i = 1, \ldots, n$, is called an $n$-dimensional unit cube. A function $m$ that assigns to †polyhedra in the wider sense $P, Q, \ldots$ in $A^n$ nonnegative real numbers $m(P), m(Q), \ldots$ is called an $n$-dimensional volume if it satisfies the following four conditions: (1) $m(\emptyset) = 0$. (2) $m(P \cup Q) + m(P \cap Q) = m(P) + m(Q)$. (3) If $P$ is sent to $Q$ by a translation, then $m(P) = m(Q)$. (4) $m(I^n) = 1$. It has been proved that such a function is unique and has the property that $P \equiv Q$ implies $m(P) = m(Q)$. Thus the concept of volume can be defined in the framework of Euclidean geometry. More generally, if the affinity $f(x) = Ax + b$ sends $P$ onto $Q$, then $m(Q) = c \cdot m(P)$, where $c$ is the absolute value of the determinant $|A|$. If $P$ is covered by a finite number of hyperplanes, then $m(P) = 0$, and if $P$ is a †parallelotope with $n$ independent edges $a_1, \ldots, a_n$, then $m(P) = \operatorname{abs}|a_1, \ldots, a_n|$, where $|a_1, \ldots, a_n|$ is the determinant of the $n \times n$ matrix with $a_i$ as the $i$th column vector, and $\operatorname{abs} x$ is the absolute value of the real number $x$. If $P$ is an †$n$-simplex whose vertices have location vectors $x_0, x_1, \ldots, x_n$, then we have

$$m(P) = \frac{1}{n!} \operatorname{abs} \begin{vmatrix} 1 & 1 & \cdots & 1 \\ x_0 & x_1 & \cdots & x_n \end{vmatrix}.$$

The volume of any polyhedron can be obtained by dividing it into $n$-simplexes and summing their volumes. If $P$ is an $r$-dimensional polyhedron in $A^n$, then the $r$-dimensional volume of $P$ is obtained by dividing $P$ into $r$-simplexes and summing their $r$-dimensional volumes (in the respective $r$-dimensional Euclidean spaces containing them). In particular, when $r = 1$, we speak of length (e.g., the length of a broken line), and when $r = 2$, of area. If $V$ is the $r$-dimensional volume of an $r$-dimensional parallelotope with $r$ edges $a_1, \ldots, a_r$, we have the formula

$$V^2 = \begin{vmatrix} (a_1, a_1) & \cdots & (a_1, a_r) \\ \vdots & & \vdots \\ (a_r, a_1) & \cdots & (a_r, a_r) \end{vmatrix}.$$

The notion of measure of point sets other than polyhedra is a generalization of the notion of volume of polyhedra (→ 270 Measure Theory).

G. Orthonormalization

Let $O, A_1, \ldots, A_n$ be $n + 1$ †independent points in $A^n$. Then the $n$ vectors $a_i = \overrightarrow{OA_i}$, $i = 1, \ldots, n$, are independent. The points $O, A_1, \ldots, A_n$ determine an $n$-dimensional flag $\mathfrak{H}^n$ of $A^n$ as follows. Let $H^1$ be the half-line $OA_1$, $H^2$ be the half-plane on the plane $O \cup A_1 \cup A_2$ separated by the line $O \cup A_1$ in which $A_2$ lies, ..., $H^n$ be the half-space on $A^n$ separated by the hyperplane $O \cup A_1 \cup \cdots \cup A_{n-1}$ in which $A_n$ lies. Let $b_1, \ldots, b_n$ be the unit vectors of the rectangular coordinate system adapted to $\mathfrak{H}^n$. Suppose further that we are given a rectangular coordinate system. Then $b_1, \ldots, b_n$ can be obtained from $a_1, \ldots, a_n$ by the following procedure, called orthonormalization (E. Schmidt): First put $b_1 = a_1/|a_1|$, so that $|b_1| = 1$. Then $c_2 = a_2 - (a_2, b_1)b_1$ satisfies $(b_1, c_2) = 0$, $c_2 \ne 0$. Put $b_2 = c_2/|c_2|$. Then we have $(b_1, b_2) = 0$, $|b_2| = 1$. When $b_1, \ldots, b_{i-1}$ are obtained in this way, so that $(b_j, b_k) = \delta_{jk}$ for $1 \le j, k \le i-1$, then $c_i = a_i - (a_i, b_1)b_1 - \cdots - (a_i, b_{i-1})b_{i-1}$ satisfies $(b_j, c_i) = 0$, $c_i \ne 0$. Hence $b_i = c_i/|c_i|$ added to $b_1, \ldots, b_{i-1}$ retains the property $(b_j, b_k) = \delta_{jk}$ for $1 \le j, k \le i$, and this procedure can be continued to $i = n$.

Two vectors $u$, $v$ are called orthogonal (denoted $u \perp v$) if $(u, v) = 0$, and $u$ is called normalized when $|u| = 1$. Thus any two of the vectors $b_1, \ldots, b_n$ are orthogonal, and each of them is normalized. Between given vectors $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$ we have the relation $\{a_1, \ldots, a_i\}$ (= the linear space generated by $a_1, \ldots, a_i$) $= \{b_1, \ldots, b_i\}$, $i = 1, \ldots, n$.

Let $\mathfrak{M}_1$, $\mathfrak{M}_2$ be two subspaces of the linear space $\mathfrak{M}$ of the vectors of the Euclidean space $A^n$. If any element of $\mathfrak{M}_1$ is orthogonal to any element of $\mathfrak{M}_2$, then $\mathfrak{M}_1$ and $\mathfrak{M}_2$ are called orthogonal and written $\mathfrak{M}_1 \perp \mathfrak{M}_2$. For any proper subspace $\mathfrak{M}_1$ of $\mathfrak{M}$, it can be shown by the method of orthonormalization that there exists a unique proper subspace $\mathfrak{M}_2$ of $\mathfrak{M}$ such that $\mathfrak{M} = \mathfrak{M}_1 \cup \mathfrak{M}_2$, $\mathfrak{M}_1 \perp \mathfrak{M}_2$. Such a subspace $\mathfrak{M}_2$ is called the orthocomplement of $\mathfrak{M}_1$ (with respect to $\mathfrak{M}$). Then $\mathfrak{M}_1 \cap \mathfrak{M}_2 = \{0\}$ follows, and hence $\mathfrak{M} = \mathfrak{M}_1 + \mathfrak{M}_2$. Every element $a$ of $\mathfrak{M}$ is therefore written uniquely in the form $a_1 + a_2$, $a_1 \in \mathfrak{M}_1$, $a_2 \in \mathfrak{M}_2$; we call $a_1$ the $\mathfrak{M}_1$-component of $a$ and $a_2$ the orthogonal component of $a$ with respect to $\mathfrak{M}_1$. The mapping from $\mathfrak{M}$ to $\mathfrak{M}_1$ assigning $a_1$ to $a$ is called the orthogonal projection from $\mathfrak{M}$ to $\mathfrak{M}_1$; it is a linear and †idempotent mapping.
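
The orthonormalization procedure of Section G and the orthogonal projection onto a subspace can be sketched in a few lines of Python. This is a minimal illustration over floating point numbers, assuming vectors are given by their rectangular coordinates; the function names `inner`, `norm`, `orthonormalize`, and `project` and the tolerance `tol` are illustrative choices, not notation from the article.

```python
from math import sqrt

def inner(u, v):
    """Inner product (u, v) in rectangular coordinates."""
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    """Length |u| = (u, u)^(1/2)."""
    return sqrt(inner(u, u))

def orthonormalize(vectors, tol=1e-12):
    """Schmidt orthonormalization: c_i = a_i - sum_j (a_i, b_j) b_j, b_i = c_i/|c_i|.

    Raises ValueError when some c_i vanishes (up to tol), i.e. when the
    given vectors are not independent.
    """
    basis = []
    for a in vectors:
        c = list(a)
        for b in basis:
            coeff = inner(a, b)
            c = [ci - coeff * bi for ci, bi in zip(c, b)]
        length = norm(c)
        if length <= tol:
            raise ValueError("vectors are not linearly independent")
        basis.append([ci / length for ci in c])
    return basis

def project(a, basis):
    """Orthogonal projection of a onto the span of an orthonormal list `basis`."""
    p = [0.0] * len(a)
    for b in basis:
        coeff = inner(a, b)
        p = [pi + coeff * bi for pi, bi in zip(p, b)]
    return p

# Three independent vectors in 3-dimensional space.
a1, a2, a3 = [1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]
b = orthonormalize([a1, a2, a3])

# Component of a3 in the plane spanned by a1, a2, and its orthogonal component.
in_plane = project(a3, b[:2])
orthogonal_part = [x - y for x, y in zip(a3, in_plane)]
```

Applying `project` twice with the same basis returns the same vector up to rounding, which is the idempotence stated above, and `orthogonal_part` is orthogonal to both `b[0]` and `b[1]`.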
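
The volume formulas of Section F (the simplex formula with the row of ones, and $V^2$ as the determinant of inner products of the edges) can be checked numerically. The sketch below assumes real coordinates in a rectangular coordinate system and uses a naive cofactor expansion that is only suitable for small matrices; the helper names `det`, `parallelotope_volume`, and `simplex_volume` are hypothetical.

```python
from math import factorial, sqrt

def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def inner(u, v):
    """Inner product (u, v) in rectangular coordinates."""
    return sum(ui * vi for ui, vi in zip(u, v))

def parallelotope_volume(edges):
    """r-dimensional volume V of the parallelotope with the given r edges,
    from V^2 = det[(a_i, a_j)]."""
    gram = [[inner(a, b) for b in edges] for a in edges]
    # Clamp tiny negative rounding errors before taking the square root.
    return sqrt(max(det(gram), 0.0))

def simplex_volume(vertices):
    """Volume of an n-simplex with n + 1 vertices in n-space:
    (1/n!) * abs det of the matrix whose k-th column is (1, x_k)."""
    n = len(vertices) - 1
    rows = [[1.0] * (n + 1)] + [[v[i] for v in vertices] for i in range(n)]
    return abs(det(rows)) / factorial(n)

# Area of a parallelogram lying in 3-space, and volume of a parallelotope.
area = parallelotope_volume([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]])
volume = parallelotope_volume([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])

# Area of a right triangle and volume of the standard tetrahedron.
triangle = simplex_volume([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
tetrahedron = simplex_volume([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
```

For $n$ independent edges in $n$-space the value from `parallelotope_volume` agrees with the absolute value of the determinant of the edge matrix, which is the relation between the two parallelotope formulas in Section F.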

H. Distance between Subspaces

Since the Euclidean space $A^n$ is a metric space, the distance is defined between any two nonempty subsets of $A^n$ (→ 273 Metric Spaces). Let $A^r$, $B^s$ be two subspaces of dimensions $r$, $s$ of $A^n$, and let $d$ be the distance between them. Then it can be shown that there exist points $p \in A^r$, $q \in B^s$ such that $d = |\overline{pq}|$, and if $p' \in A^r$, $q' \in B^s$ are any other points with $d = |\overline{p'q'}|$, then $\overrightarrow{pq} = \overrightarrow{p'q'}$. In particular, when $r = 0$ and $s = n - 1$ (i.e., when $A^r = p$ is a point and $B^s = B^{n-1}$ is a hyperplane), the distance $d$ can be obtained as follows: If $(a, x) = b$ is an equation of $B^{n-1}$ and $p$ is the location vector of $p$, then $d = |(a, p) - b|/|a|$. If $|a| = 1$ in this equation of $B^{n-1}$, then $d$ is given simply by $|(a, p) - b|$. An equation $(a, x) = b$ of a hyperplane is said to be in Hesse's normal form if $|a| = 1$.

I. Spheres and Subspaces

The set of points in a Euclidean space lying at a fixed distance $r$ from a given point is called the sphere of radius $r$ with center at the given point. If $p$ is the location vector of the center of this sphere with respect to a given rectangular coordinate system, then the equation of the sphere is $|x - p| = r$ or $(x, x) - 2(p, x) + (p, p) - r^2 = 0$. The set of points lying at equal distances from $k + 1$ points with location vectors $p_0, p_1, \ldots, p_k$ ($k \ge 1$) is a linear subspace of the space (which may be $\emptyset$ or the entire space). If these points are independent, then the subspace has dimension $n - k$, where $n$ is the dimension of the entire space. In particular, if these points are vertices of an $n$-dimensional †simplex, then there is a unique sphere passing through them, called the circumscribing sphere of the simplex. In this case, the simplex is said to be inscribed in the sphere. If $p_0, p_1, \ldots, p_n$ are location vectors of the vertices of the simplex, then the equation of the circumscribing sphere of the simplex is given by

$$\begin{vmatrix} 1 & 1 & \cdots & 1 & 1 \\ p_0 & p_1 & \cdots & p_n & x \\ p_0^2 & p_1^2 & \cdots & p_n^2 & x^2 \end{vmatrix} = 0.$$

When $n = 2$ or 3, there are many classical results concerning the circumscribing circle of a triangle, the circumscribing sphere of a simplex, and other figures related to a triangle or a simplex.

References

[1] G. D. Birkhoff and R. Beatley, Basic geometry, Scott, Foresman, 1941 (Chelsea, third edition, 1959).
[2] E. E. Moise, Elementary geometry from an advanced standpoint, Addison-Wesley, 1963.
[3] H. Weyl, Mathematische Analyse des Raumproblems, Springer, 1923 (Chelsea, 1960).
[4] F. Klein, Vorlesungen über höhere Geometrie, Springer, third edition, 1926 (Chelsea, 1957).
[5] J. Dieudonné, Algèbre linéaire et géométrie élémentaire, Hermann, second corrected edition, 1964; English translation, Linear algebra and geometry, Hermann, 1969.
[6] G. Choquet, L'enseignement de la géométrie, Hermann, 1964.

140 (VI.4)
Euclidean Spaces

A space satisfying the axioms of Euclidean geometry is called a Euclidean space. An †affine space having as †standard vector space an $n$-dimensional Euclidean †inner product space over a real number field $\mathbf{R}$ is an $n$-dimensional Euclidean space $E^n$. In an $n$-dimensional Euclidean space $E^n$, we fix an †orthogonal frame $\Sigma = (O, E_1, \ldots, E_n)$, $e_i = \overrightarrow{OE_i}$, $(e_i, e_j) = \delta_{ij}$. The frame $\Sigma$ determines †rectangular coordinates $(x_1, x_2, \ldots, x_n)$ of each point in $E^n$. We can thus establish a one-to-one correspondence between $E^n$ and $\mathbf{R}^n = \{(x_1, \ldots, x_n) \mid x_i \in \mathbf{R}\}$. In this sense we identify $E^n$ and $\mathbf{R}^n$ and usually call $\mathbf{R}^n$ itself a Euclidean space. The 1-dimensional space $\mathbf{R}^1$ is a straight line, and the †Cartesian product of $n$ copies of $\mathbf{R}^1$ is an $n$-dimensional Euclidean space (or Cartesian space). Given points $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ in the Euclidean space $\mathbf{R}^n$, the †distance $d(x, y)$ between them is given by

$$d(x, y) = \left(\sum_{i=1}^n (x_i - y_i)^2\right)^{1/2}.$$

Thus the distance $d(x, y)$ supplies $\mathbf{R}^n$ with the structure of a †metric space. We call $x_i$ the $i$th coordinate of the point $x$, the point $(0, \ldots, 0)$ the origin of $\mathbf{R}^n$, and the set of points $\{x \mid -\infty < x_i < \infty;\ x_j = 0,\ j \ne i\}$ the $x_i$-axis (or $i$th coordinate axis). For an integer $m$ such that $-1 \le m \le n$, we define $m$-dimensional †subspaces in $\mathbf{R}^n$; a $(-1)$-dimensional subspace is the empty set, a 0-dimensional subspace is a point, and a 1-dimensional subspace is a straight line. If we take an orthogonal frame, an $m$-dimensional subspace is represented as an $\mathbf{R}^m$ (→ 139 Euclidean Geometry; 7 Affine Geometry).

As a †topological space, $\mathbf{R}^n$ is †locally compact and †connected. A bounded closed set in $\mathbf{R}^n$ is †compact (Bolzano-Weierstrass theorem). Given a point $a = (a_1, \ldots, a_n)$ in $\mathbf{R}^n$ and a real positive number $r$, the subset $\{x \mid d(x, a) \le r\}$ of
$\mathbf{R}^n$ is called an $n$-dimensional solid sphere, solid $n$-sphere, ball, $n$-ball, disk, or $n$-disk with center $a$ and radius $r$, its †interior $\{x \mid d(x, a) < r\}$ an $n$-dimensional open sphere, open $n$-sphere, open ball, open $n$-ball, open disk, or open $n$-disk, and its †boundary $\{x \mid d(x, a) = r\}$ an $(n-1)$-dimensional sphere or $(n-1)$-sphere. In particular, a 2-dimensional solid sphere is called a circular disk, its interior an open circle, and its boundary a circumference. A disk or a circumference is sometimes called simply a circle. The family of $n$-dimensional open spheres with center $a$ gives a base for a neighborhood system of the point $a$. Suppose that we are given a sphere $S$ and two points $x$, $y$ on $S$. The points $x$, $y$ are called antipodal points on the sphere $S$ if there exists a straight line $L$ passing through the center of $S$ such that $S \cap L = \{x, y\}$. The segment (or the length of the segment) whose endpoints are antipodal points is called the diameter of the solid sphere (or of the sphere). The notion of †diameter (→ 273 Metric Spaces) of a solid sphere or of a sphere considered as a subset of the metric space $\mathbf{R}^n$ coincides with the notion of diameter of the corresponding set defined above. When $n \ge 3$, the intersection of a sphere and a 2-dimensional plane passing through the center of the sphere is called a great circle of the sphere. For $m$ such that $1 \le m \le n$, we consider an $m$-dimensional solid sphere or an $(m-1)$-dimensional sphere in an $m$-dimensional plane $\mathbf{R}^m$. These spheres are also called $m$-dimensional solid spheres or $(m-1)$-dimensional spheres in $\mathbf{R}^n$.

In particular, the solid sphere of radius 1 having the origin as its center is called the unit disk, unit ball, or unit cell, and its boundary is called the unit sphere. (In particular, when we deal with the 2-dimensional space $\mathbf{R}^2$, we use the term circle instead of sphere, as in unit circle.) The points $(0, \ldots, 0, 1)$ and $(0, \ldots, 0, -1)$ are called the north pole and south pole of the unit sphere, respectively. The $(n-2)$-dimensional sphere, which is the intersection of the unit sphere and the hyperplane $x_n = 0$, is called the equator; the part of the unit sphere that is "above" this hyperplane (i.e., in the half-space $x_n > 0$) is called the northern hemisphere, and the part that is "below" the hyperplane (i.e., in the half-space $x_n < 0$) the southern hemisphere.

Let $a_i$, $b_i$ be real numbers satisfying $a_i < b_i$ ($i = 1, 2, \ldots, n$). The subset $\{x \mid a_i < x_i < b_i,\ i = 1, 2, \ldots, n\}$ of $\mathbf{R}^n$ is called an open interval of $\mathbf{R}^n$, and the subset $\{x \mid a_i \le x_i \le b_i\}$ a closed interval. They are sometimes called rectangles (when $n = 2$), rectangular parallelepipeds, or boxes. An open interval is actually an open set of $\mathbf{R}^n$, and a closed interval is a closed set. In particular, the closed interval $\{x \mid 0 \le x_i \le 1,\ i = 1, 2, \ldots, n\}$ is called the unit cube (or unit $n$-cube) of $\mathbf{R}^n$. We can take the set of open intervals as base for a neighborhood system of $\mathbf{R}^n$.

All the †convex closed sets (for example, closed intervals) having interior points in $\mathbf{R}^n$ are homeomorphic to an $n$-dimensional solid sphere. A topological space $I^n$ that is homeomorphic to an $n$-dimensional solid sphere is called an $n$-dimensional (topological) solid sphere, (topological) $n$-cell, or $n$-element. A topological space $S^{n-1}$ homeomorphic to an $(n-1)$-dimensional sphere is called an $(n-1)$-dimensional topological sphere (or simply $(n-1)$-dimensional sphere). The spaces $I^n$ and $S^{n-1}$ are †orientable †topological manifolds whose orientations are determined by assigning the generators of the (relative) †homology groups $H_n(I^n, \dot{I}^n)$ and $H_{n-1}(S^{n-1})$, respectively (both are infinite cyclic groups). By means of the †boundary operator $\partial: H_n(I^n, \dot{I}^n) \to H_{n-1}(S^{n-1})$, the orientation of $I^n$ or $S^{n-1}$ determines that of the other.

References

See references to 139 Euclidean Geometry.

141 (XXI.22)
Euler, Leonhard

Leonhard Euler (April 15, 1707–September 18, 1783) was born in Basel, Switzerland. In his mathematical development he was greatly influenced by the Bernoullis (→ 38 Bernoulli Family). He was invited to the St. Petersburg Academy in 1726 and remained there until 1741, when he was invited to Berlin by Frederick the Great (1712–1786). Euler was active at the Berlin Academy until 1766, when he returned to St. Petersburg. Already having lost the sight of his right eye in 1735, he now became blind in his left eye also. This, however, did not impede his research in any way, and he continued to work actively until his death in St. Petersburg.

Euler was the central figure in the mathematical activities of the 18th century. He was interested in all fields of mathematics, but especially in analysis in the style of †Leibniz, which had been passed down through the Bernoullis and was developed by him into a form that led to the mathematics of the 19th century. Through his work analysis became more easily applicable to the fields of physics and dynamics. He developed calculus further and dealt formally with complex numbers. He also contributed to such fields as †partial differential equations, the theory of †elliptic functions, and the †calculus of variations. He contributed much to the progress of algebra and theory of numbers in this period, and also did pioneering work in topology. He had, however, little of the concern for rigorous foundations that characterized the 19th century. He was the most prolific mathematician of all time, and his collected works are still incomplete, though some seventy volumes have already been published.

References

[1] L. Euler, Opera omnia, ser. 1, vol. 1-29, ser. 2, vol. 1-30, ser. 3, vol. 1-13, Teubner and O. Füssli, 1911-1967.

142 (XV.11)
Evaluation of Functions

nondestructive Read-Only Storage (ROS). For example, the microprogrammed control used for the unified COordinate Rotation DIgital Computer (CORDIC) algorithm is effective in calculating elementary functions because of its simplicity, accuracy, and capability of high-speed execution via parallel processing. It is not clear whether the applications of approximation formulas heretofore in use will be superseded by microprogramming techniques in all kinds of computers. However, the value of the approximation formulas recognized in the 1950s has been declining insofar as elementary functions are concerned. Recently, as a method for the evaluation of functions based on a new viewpoint, some "unrestricted" algorithms have been proposed by Brent [1], which are useful for the computation of elementary and special functions when the required precision is not known in advance or when high accuracy is necessary. It is expected that methods for the evaluation of functions using approximation formulas will be further
A. History developed as microprogramming techniques
and unrestricted algorithms see wider use.
By the evaluation of a function f(x) we mean
the application of algorithms for obtaining the
approximate value of the function. The evalu- B. Evaluation of Functions Using
ation methods are classified roughly into two Approximation Formulas
groups: (1) evaluation of functions using ap-
proximation formulas, and (2) evaluation using Suppose we approximate a function f(x) by a
microprogramming techniques. The advent function g(x) using the following class of func-
of high-speed computers has brought about tions p,(x) and qj(x). Let continuous functions
drastic changes in the evaluation of functions. pi(x), . . . ,p.(x) and ql(x), , q,(x) detïned on
Before the introduction of high-speed com- a closed interval [a, h] satisfy the following
puters in the 1940s mathematical tables had conditions: (i) p , , . . . ,p, and ql, , q,,, are both
played a prominent role. The iïrst issue (pub- linearly independent; (ii) there exist at most a
lished in 1943) of the journal Mathematical tînite number of zeros for &‘& b,q,(x) in [a, h]
tables and other aids to computation (MTAC), for any choice of b,, b,, , b,,,, except for the
the predecessor of the journal Mathematics oj’ case b, = b, = = b, = 0; (iii) there is a con-
computation, was primarily concerned with tinuous function g(x) with a nonzero denomi-
tables of mathematical functions. One of the nator in [a, b] representable as
aims of this journal was to facilitate the ex-
change of information on errors in the tables. g(x)= f uiPi(x) 5 bjqjlxh (1)
i=l 1 j=l
The tables, obtained in the past by tedious
hand calculation, cari now be easily prepared whereai,i=l ,..., n,andbj,j=O,l ,..., m,are
by high-speed computers, and there was a constants. Then g(x) is called a generalized
period when more accurate and extensive rational function based on a class of functions
tables were published one after another. It {p,(x)}, i= 1, . . . . n, and {qj(x)}, j= 1, . . . . m. If
is ironie, however, that high-speed comput- pi(x) = x’-l and qj(x) = xj-l, (1) is reduced to a
ers revolutionized numerical analysis and rational function. If m = 1 and q,(x) = 1, (1)
prompted a shift of emphasis in the tïeld, is a linear combination of p,(x) and is called
beginning in the 1950s away from the use of an approximation of linear type to f(x); and
numerical tables and toward exploration of further, if p,(x) = xi-l, (1) is reduced to a poly-
the most efftcient methods of approximation of nomial. The crux of the approximation prob-
the functions, thus causing a rapid decrease in lem lies in the criterion to be used in choosing
the need for tables. the approximate constants in (1). There are
Recently, a significant trend in computer three methods for choosing them, which lead
design has replaced the conventional logic to three types of approximation of major
control section with “stored logic,” or micro- importance: (i) interpolatory approximations,
programmed control, stored in high-speed, (ii) +least-squares approximations, and (iii)

min-max error approximations (sometimes called best approximations). As a major aim of computer approximation of a function is to make the maximum error as small as possible, the third type of approximation has been used for digital computers.

For every continuous function $f(x)$, there always exist min-max error approximations of the form (1). In generating a min-max error approximation in practice, we essentially depend upon the following conditions and theorems. Let a function space $F$ be a $d$-dimensional linear space. If for $f \in F$ which is not identically zero, there exist at most $(d-1)$ zeros of $f(x)$ in $[a, b]$, then $F$ is called the unisolvent space or the Haar space.

Let one of the best approximations to a continuous function $f(x)$ be $g(x)$. If the linear space $D$ of all the functions of the form $\{\sum_{i=1}^{n} a_i p_i(x) + \sum_{j=1}^{m} b_j g(x) q_j(x)\}$ is unisolvent, then the best approximation is unique. For the case where $g(x)$ is an approximation of linear type to $f(x)$, Haar [2] proved the following theorem.

Let $g(x)$ be the best approximation to $f(x)$ and $g(x) = \sum_{i=1}^{m} a_i p_i(x)$. A necessary and sufficient condition for the best approximation to be always unique is that the $m$-dimensional linear space generated by the functions $p_1, p_2, \dots, p_m$ be unisolvent. In particular, we have the best unique polynomial approximation (one variable) of $f(x)$ defined on $[a, b]$.

Let the necessary and sufficient condition of the previous theorem be satisfied in a $d$-dimensional linear space $D$. A necessary and sufficient condition for $g(x)$ given by (1) to be the best approximation of $f(x)$ is that there exist points $a \le x_0 < x_1 < \dots < x_{d-1} < x_d \le b$, called deviation points, such that $(-1)^i (f(x_i) - g(x_i)) = \mu$ ($i = 0, 1, \dots, d$).

A great number of algorithms are known by means of which one can calculate the best approximation $g(x)$ of a function $f(x)$ for the given values of $n$ and $m$ and for $p_i(x) = x^{i-1}$ and $q_j(x) = x^{j-1}$ in (1). However, the following three kinds of algorithms are used most frequently for generating min-max approximations on computers: Remes's second algorithm [3], the differential correction algorithm, and Yamauti's folding-up method.

C. Evaluation of Elementary Functions Based on Microprogramming

Chen [6] has given a general algorithm for calculating an elementary function $z = f(x)$ as follows. Let $F(x, y) = y\,g(x) + h(x)$, where $y$ is some parameter for evaluating $z_0 = f(x_0) = F(x_0, y_0)$. We assume that $(x_0, y_0)$ has been given and that $z_0$ is unknown. Suppose that a new point $(x_{k+1}, y_{k+1})$ is obtained from $(x_k, y_k)$ according to a pair of transformations, $x_{k+1} = \varphi(x_k, y_k)$ and $y_{k+1} = \psi(x_k, y_k)$, keeping the value of $F(x_k, y_k)$ invariant. If $x_k$ is forced to converge to $x_\omega$ and if $g(x_\omega) = 1$ and $h(x_\omega) = 0$, then $z_0 = f(x_0) = F(x_0, y_0) = F(x_1, y_1) = \dots = F(x_\omega, y_\omega) = y_\omega$. In this procedure it is necessary to determine $g(x)$, $h(x)$, $\varphi(x)$, and $\psi(x)$ for the given $f(x)$. Most of the elementary functions can be evaluated by Chen's algorithm, which is essentially identical with Specker's Sequential Table Look-up (STL) method based on addition formulas [7], e.g., $\log x = \log(xa) - \log a$ or $e^x = e^{x-p} e^p$. The iteration equations (2) of CORDIC given later can also be derived by the STL method with complex numbers.

The use of coordinate rotation to evaluate elementary functions is not new. In 1956 and 1959 Volder [8] described the CORDIC for the calculation of trigonometric functions, multiplication, division, and conversion between the binary and the decimal and r-adic number systems. Daggett, in 1959, discussed the use of CORDIC for decimal-binary conversion. It was recognized by Walther [9] in 1971 that these algorithms could be merged into one unified algorithm. Consider coordinate systems parameterized by $m$ in which the radius $R$ and angle $A$ of the vector $P = (x, y)$ are defined as $R = (x^2 + m y^2)^{1/2}$ and $A = m^{-1/2} \arctan(m^{1/2} y/x)$. The basis of Walther's algorithm is the coordinate rotation in a linear ($m = 0$), circular ($m = 1$), or hyperbolic ($m = -1$) coordinate system, depending on which function is calculated. Iteration equations of CORDIC are as follows. The point $(x_{i+1}, y_{i+1}, z_{i+1})$ is obtained from the point $(x_i, y_i, z_i)$ by means of the transformation

$$x_{i+1} = x_i + m y_i \delta_i,$$
$$y_{i+1} = y_i - x_i \delta_i, \tag{2}$$
$$z_{i+1} = z_i + \alpha_i,$$

where $m$ is a parameter for the coordinate system, $\alpha_i = m^{-1/2} \arctan(m^{1/2} \delta_i)$, and $\delta_i$ is a suitable value, e.g., $\pm 2^{-i}$. The angle $A_{i+1}$ and radius $R_{i+1}$ are $A_{i+1} = A_i - \alpha_i$ and $R_{i+1} = R_i \times K_i \equiv R_i \times (1 + m \delta_i^2)^{1/2}$. After $n$ iterations we find $A_n = A_0 - \alpha$ and $R_n = R_0 \times K$, and then

$$x_n = K\{x_0 \cos(\alpha m^{1/2}) + y_0 m^{1/2} \sin(\alpha m^{1/2})\},$$
$$y_n = K\{y_0 \cos(\alpha m^{1/2}) - x_0 m^{-1/2} \sin(\alpha m^{1/2})\},$$
$$z_n = z_0 + \alpha,$$

where $\alpha = \sum_{i=0}^{n-1} \alpha_i$ and $K = \prod_{i=0}^{n-1} K_i$. These relations are summarized in Table 1 for $m = 1$, $m = 0$, and $m = -1$ in the following special cases. (i) The value of $A$ is forced to converge to zero; $y_n \to 0$. (ii) The value of $z$ is forced to converge to zero; $z_n \to 0$.

Table 1. Input and Output Functions for CORDIC

                       Input               Quantity
Function      m     x_0      y_0   z_0     to be 0      Output
sin t         1     1/K_1    0     t       z_n → 0      y_n → sin t
cos t         1     1/K_1    0     t       z_n → 0      x_n → cos t
tan^-1 t      1     1        t     0       y_n → 0      z_n → tan^-1 t
xz            0     x        0     z       z_n → 0      y_n → xz
y/x           0     x        y     0       y_n → 0      z_n → y/x
sinh t       -1     1/K_-1   0     t       z_n → 0      y_n → sinh t
cosh t       -1     1/K_-1   0     t       z_n → 0      x_n → cosh t
tanh^-1 t    -1     1        t     0       y_n → 0      z_n → tanh^-1 t

$K_1 = \prod_{j=0}^{n-1} (1 + \delta_j^2)^{1/2}$, $K_{-1} = \prod_{j=0}^{n-1} (1 - \delta_j^2)^{1/2}$

D. Fast Fourier Transform (FFT)

When a function $f(x)$ can be taken to be periodic, it is advantageous to use trigonometric polynomials as least-squares approximating functions. The summations arising in least-squares approximations based on the trigonometric polynomials play an important role in various applications, and a quite efficient algorithm for the evaluation of such sums was developed by Cooley and Tukey [10] in 1965, known as the fast Fourier transform (FFT). Let $x_m$ ($0 \le m \le N-1$) be a set of complex numbers, and consider

$$X_n = \frac{1}{N} \sum_{m=0}^{N-1} x_m \exp(-2\pi i m n/N) \quad (0 \le n \le N-1). \tag{3}$$

Equation (3) is often called the discrete Fourier transform (DFT) of the sequence $x_m$, it being analogous to the continuous Fourier transform,

$$X_n = \frac{1}{T} \int_0^T x(t) \exp(-2\pi i n t/T)\,dt \quad (0 \le n < N), \tag{4}$$

where $x(t)$ is a periodic function of $t$ with period $T$. $X_n$ is called the $n$th Fourier coefficient of a set of $N$ equally spaced samples of the function $x(t)$ ($t = jT/N$, $0 \le j < N$). In the same way as for the continuous Fourier transform, the discrete transform can be inverted to yield

$$x_n = \sum_{m=0}^{N-1} X_m \exp(2\pi i n m/N) \quad (0 \le n \le N-1).$$

Here $x_n$ is called the coefficient of the inverse Fourier transform, and $X_m$ and $x_n$ thus form a transform pair.

The FFT algorithm for a finite sequence of length $N$ in (3) is based on the fact that the calculation of (3) can be performed in stages by using the direct product decomposition if $N = N_1 N_2$, with $N_1$ and $N_2$ relatively prime. For example, the 2-dimensional Fourier transform coefficient is given by

$$X_{n_1,n_2} = \frac{1}{N_1 N_2} \sum_{m_1=0}^{N_1-1} \sum_{m_2=0}^{N_2-1} \exp(-2\pi i n_1 m_1/N_1) \exp(-2\pi i n_2 m_2/N_2)\, x_{m_1,m_2}$$
$$(0 \le n_1 \le N_1 - 1,\ 0 \le n_2 \le N_2 - 1).$$

If by an elementary operation we mean one complex multiplication and one complex addition, we can evaluate $X_{n_1,n_2}$ through $(N_1 N_2)^2$ such operations using Horner's scheme. By the direct product decomposition method, however, the $(N_1 N_2)^2$ operations can be reduced to only $N_1 N_2 (N_1 + N_2)$ operations. Because the matrix corresponding to the transformation mentioned above is a direct product of $N_1 \times N_1$ and $N_2 \times N_2$ matrices, we can perform the calculations in two stages: first to obtain $\xi_{m_1,n_2}$ for $0 \le m_1 \le N_1 - 1$ and $0 \le n_2 \le N_2 - 1$ and then to obtain $X_{n_1,n_2}$ for $0 \le n_2 \le N_2 - 1$ and $0 \le n_1 \le N_1 - 1$. We have

$$\xi_{m_1,n_2} = \frac{1}{N_2} \sum_{m_2=0}^{N_2-1} \exp(-2\pi i n_2 m_2/N_2)\, x_{m_1,m_2},$$
$$X_{n_1,n_2} = \frac{1}{N_1} \sum_{m_1=0}^{N_1-1} \exp(-2\pi i n_1 m_1/N_1)\, \xi_{m_1,n_2}.$$

This direct product decomposition method is well known for 2- or 3-dimensional Fourier transforms. Even for a 1-dimensional Fourier transform of length $N = N_1 N_2$, if $N_1$ and $N_2$ are relatively prime, one can use the method of direct product decomposition. Even when $N_1$ and $N_2$ are not prime, we can use a method of "pseudo"-direct product decomposition to reduce the number of operations. Namely, we can put $m = m_1 + N_1 m_2$ ($0 \le m_1 < N_1$, $0 \le m_2 < N_2$); $n = N_2 n_1 + n_2$ ($0 \le n_1 < N_1$, $0 \le n_2 < N_2$). Then, similarly as before, $X_n$ in (3) can be rewritten as

$$X_n = \frac{1}{N_1} \sum_{m_1=0}^{N_1-1} \exp(-2\pi i n_1 m_1/N_1) \times \exp(-2\pi i m_1 n_2/(N_1 N_2))\, \xi_{m_1,n_2},$$

where

$$\xi_{m_1,n_2} = \frac{1}{N_2} \sum_{m_2=0}^{N_2-1} \exp(-2\pi i n_2 m_2/N_2)\, x_{m_1 + N_1 m_2}, \tag{5}$$

which is a DFT of length $N_2$. If we put

$$a_{m_1,n_2} = \exp(-2\pi i m_1 n_2/(N_1 N_2))\, \xi_{m_1,n_2}, \tag{6}$$

then

$$X_n = \frac{1}{N_1} \sum_{m_1=0}^{N_1-1} \exp(-2\pi i n_1 m_1/N_1)\, a_{m_1,n_2}. \tag{7}$$

Formula (7) is nothing but a DFT of length $N_1$. Thus an FFT of length $N = N_1 N_2$ can be calculated by decomposing it into three stages as follows: (i) obtain $N_1$ transforms of length $N_2$ in (5); (ii) multiply $\xi_{m_1,n_2}$ by $\exp(-2\pi i m_1 n_2/N)$ in (6) (phase rotation); (iii) calculate $N_2$ transforms of length $N_1$ in (7). If either or both of $N_1$ and $N_2$ can be factored further so that, e.g., $N = N_1 N_2 = N_1 N_{21} N_{22} = \cdots$, an FFT of length $N_2$ can be calculated similarly by decomposing it, and so on, and in this way one can reduce the total number of operations. This is the principle of FFT pointed out by H. Takahasi. When $N$ is a power of two, it can be shown that the FFT algorithm requires approximately $N \log_2 N/2$ operations.

E. Padé Approximation

Let $f(x) \sim c_0 + c_1 x + c_2 x^2 + \cdots$ be a formal power series. For any pair of nonnegative integers $(p, q)$, we define the $(p, q)$th Padé approximation of $f(x)$ as follows: The Padé approximation is a rational function

$$(a_0 + a_1 x + a_2 x^2 + \dots + a_p x^p) / (b_0 + b_1 x + b_2 x^2 + \dots + b_q x^q)$$

satisfying the condition that all terms in the formal power series

$$(b_0 + b_1 x + \dots + b_q x^q)(c_0 + c_1 x + \cdots) - (a_0 + a_1 x + \dots + a_p x^p)$$

should vanish up to the term $x^{p+q}$. An infinite matrix whose $(p, q)$th entry is the $(p, q)$th Padé approximation is called the Padé table for $f(x)$. The Padé approximation is uniquely determined, provided that every Hankel determinant

$$\begin{vmatrix} c_p & c_{p+1} & \cdots & c_{p+q} \\ c_{p+1} & c_{p+2} & \cdots & c_{p+q+1} \\ \vdots & \vdots & & \vdots \\ c_{p+q} & c_{p+q+1} & \cdots & c_{p+2q} \end{vmatrix}$$

never vanishes.

When we expand $f(x)$ into a continued fraction of Stieltjes type, say,

$$f(x) \sim \cfrac{c_0}{1 + \cfrac{\alpha_1 x}{1 + \cfrac{\alpha_2 x}{1 + \cdots}}},$$

then its $2p$th and $(2p+1)$th approximate fractions are the $(p, p)$th and $(p+1, p)$th Padé approximation of $f(x)$, respectively.

References

[1] R. P. Brent, Unrestricted algorithms for elementary and special functions, IFIP, 1980, 613-619.
For the theory of approximation,
[2] A. Haar, Die Minkowskische Geometrie und die Annäherung an stetige Funktionen, Math. Ann., (1918), 294-311.
[3] A. Ralston, Rational Chebyshev approximation by Remes' algorithms, Numer. Math., 7 (1965), 322-330.
[4] E. W. Cheney, Introduction to approximation theory, McGraw-Hill, 1966.
[5] M. J. D. Powell, Approximation theory and methods, Cambridge Univ. Press, 1981.
For microprogramming techniques,
[6] T. C. Chen, The automatic computation of exponential, logarithms, ratios and square roots, IBM Res. Rep. RJ 970 (1972), 32.
[7] W. H. Specker, A class of algorithms for ln x, exp x, sin x, cos x, tan^-1 x and cot^-1 x, IEEE Trans. Elec. Comp., EC-14 (1965), 85-86.
[8] J. E. Volder, Binary computation algorithms for coordinate rotation and function generation, Convair Rep. IAR-1 148 Aeroelectronics group, 1956.
[9] J. S. Walther, A unified algorithm for elementary functions, Proc. AFIPS, 1971, Spring Joint Comp. Conf., 379-385.
For fast Fourier transforms,
[10] J. W. Cooley and J. W. Tukey, An algorithm for the machine calculation of complex Fourier series, Math. Comp., 19 (1965), 297-301.
[11] E. O. Brigham, The fast Fourier transform, Prentice-Hall, 1974.
For the Padé approximation,
[12] H. Padé, Sur la représentation approchée d'une fonction par des fractions rationnelles, Ann. Sci. Ecole Norm. Sup., (3) 9 (1892), 1-92.
[13] G. A. Baker, Jr., The Padé approximation 1, 2, Encyclopedia of Mathematics and Its Applications, vols. 13, 14, Addison-Wesley, 1981.
For tables of approximation formulas,
[14] C. Hastings, Jr., Approximations for digital computers, Princeton Univ. Press, 1955.
[15] J. F. Hart (ed.), Computer approximation, Wiley, 1968.
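
Returning to Section D, the three-stage decomposition of a DFT of length $N = N_1 N_2$ given by formulas (5), (6), and (7) can be sketched in Python as follows. This is a minimal illustration, not an optimized FFT: the names `dft` and `dft_two_factor` are hypothetical, the short transforms are evaluated directly from (3) rather than being decomposed further, and the $1/N$ normalization of (3) is kept throughout.

```python
import cmath

def dft(x):
    """Direct evaluation of (3), including the 1/N factor."""
    N = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * m * n / N) for m in range(N)) / N
            for n in range(N)]

def dft_two_factor(x, N1, N2):
    """DFT of length N1*N2 via m = m1 + N1*m2 and n = N2*n1 + n2."""
    N = N1 * N2
    assert len(x) == N
    # Stage (i), formula (5): N1 transforms of length N2; xi[m1][n2].
    xi = [dft([x[m1 + N1 * m2] for m2 in range(N2)]) for m1 in range(N1)]
    # Stage (ii), formula (6): phase rotation by exp(-2*pi*i*m1*n2/N).
    a = [[cmath.exp(-2j * cmath.pi * m1 * n2 / N) * xi[m1][n2] for n2 in range(N2)]
         for m1 in range(N1)]
    # Stage (iii), formula (7): N2 transforms of length N1, stored at n = N2*n1 + n2.
    X = [0j] * N
    for n2 in range(N2):
        column = dft([a[m1][n2] for m1 in range(N1)])
        for n1 in range(N1):
            X[N2 * n1 + n2] = column[n1]
    return X
```

Comparing `dft_two_factor(x, N1, N2)` with `dft(x)` for a sequence of length $N_1 N_2$ checks the index bookkeeping; the factors need not be relatively prime, which is the "pseudo"-direct product case described in Section D.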
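
For the Padé approximation of Section E, the defining condition (all terms of the product minus the numerator vanish up to $x^{p+q}$) is a linear system for the coefficients. The following Python sketch solves it in exact rational arithmetic. It rests on assumptions not made in the text: the function name `pade` is hypothetical, the normalization $b_0 = 1$ is imposed, and the system is assumed to be nonsingular (the sketch raises `StopIteration` otherwise).

```python
from fractions import Fraction

def pade(c, p, q):
    """Return (a, b) with b[0] = 1 for the (p, q)th Pade approximation.

    c must hold the series coefficients c_0, ..., c_{p+q}.
    """
    c = [Fraction(v) for v in c]

    def coef(k):
        return c[k] if k >= 0 else Fraction(0)

    # Coefficient of x^k for k = p+1, ..., p+q must vanish:
    #   sum_{j=1}^{q} b_j c_{k-j} = -c_k   (augmented matrix rows below).
    A = [[coef(p + r - j) for j in range(1, q + 1)] + [-coef(p + r)]
         for r in range(1, q + 1)]
    # Gauss-Jordan elimination over the rationals.
    for col in range(q):
        piv = next(r for r in range(col, q) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        pivot = A[col][col]
        A[col] = [v / pivot for v in A[col]]
        for r in range(q):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    b = [Fraction(1)] + [A[r][q] for r in range(q)]
    # Coefficient of x^k for k = 0, ..., p gives the numerator.
    a = [sum(b[j] * coef(k - j) for j in range(min(k, q) + 1)) for k in range(p + 1)]
    return a, b
```

For instance, feeding in the first five Taylor coefficients of $e^x$ with $p = q = 2$ returns the coefficient lists of numerator and denominator, which can be compared with a hand computation.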

143 (XI.18)
Extremal Length

A. General Remarks

The notable relation between the lengths of certain families of curves in a plane domain and the area of the domain has long been recognized and utilized in function theory. L. V. Ahlfors and A. Beurling formulated this relation by introducing the notion of extremal length for families of curves [1]. Although there are various definitions of extremal length, they are essentially the same except for one due to J. Hersch [3] and A. Pfluger [2].

The image of an interval under a continuous mapping is called a curve. We say that it is locally rectifiable if every continuous arc of the curve is rectifiable. Let $C$ be a finite or countable collection of locally rectifiable curves in a plane and $\rho$, $0 \le \rho \le \infty$, be a Baire function defined in the plane. Represent $C$ in terms of arc length $s$ (→ 246 Length and Area), and set $\langle C, \rho \rangle = \int_C \rho\,ds$. For a family $\Gamma$ of finite or countable collections $C$, $\rho$ is called admissible if $\langle C, \rho \rangle \ge 1$ for every $C \in \Gamma$. If no $C \in \Gamma$ consists of a finite or countable number of points, then $\rho = \infty$ is always admissible for $\Gamma$. Call $\inf\{\iint \rho^2\,dx\,dy\}$, where $\rho$ runs over admissible Baire functions, the module of $\Gamma$, and denote it by $M(\Gamma)$. The reciprocal $\lambda(\Gamma) = 1/M(\Gamma)$ is called the extremal length of $\Gamma$. If no Baire functions are admissible for $\Gamma$, we set $\lambda(\Gamma) = 0$. The extremal length is defined equivalently in two other ways as follows: Let $\rho$ be a nonnegative Baire function, and put $L(\Gamma, \rho) = \inf\{\langle C, \rho \rangle \mid C \in \Gamma\}$, then $\lambda(\Gamma) = \sup L(\Gamma, \rho)^2 / \iint \rho^2\,dx\,dy$, where $\rho$ runs over nonnegative Baire functions. Next, let $\Phi$ be the collection of nonnegative Baire functions $\rho$ such that $\iint \rho^2\,dx\,dy \le 1$; then $\lambda(\Gamma) = \sup\{L(\Gamma, \rho)^2 \mid \rho \in \Phi\}$. We obtain the same value for $\lambda(\Gamma)$ if we require an admissible $\rho$ to be lower semicontinuous. If $\rho$ is required to be continuous, then the extremal length defined by Hersch and Pfluger is obtained. As is shown in example (1) of Section B, there is a case where the two definitions actually differ. If an admissible $\rho$ yields $M(\Gamma) = \iint \rho^2\,dx\,dy$, then $\rho\,|dz|$ is called an extremal metric. Beurling gave a necessary and sufficient condition for a metric to be extremal [4].

We list four properties of extremal length: (1) $\lambda(\Gamma_1) \ge \lambda(\Gamma_2)$ if $\Gamma_1 \subset \Gamma_2$. (2) $M(\bigcup_n \Gamma_n) \le \sum_n M(\Gamma_n)$. (3) Let $\{\Gamma_n\}$ and $\Gamma$ be given. Suppose that there are mutually disjoint measurable sets $\{E_n\}$ such that each $C_n \in \Gamma_n$ is contained in $E_n$. If each element of $\bigcup_n \Gamma_n$ contains at least one $C \in \Gamma$, then $M(\Gamma) \ge \sum_n M(\Gamma_n)$, and hence

$$M\Bigl(\bigcup_n \Gamma_n\Bigr) = \sum_n M(\Gamma_n).$$

If each $C \in \Gamma$ contains at least one $C_n \in \Gamma_n$ for every $n$, then $\lambda(\Gamma) \ge \sum_n \lambda(\Gamma_n)$. (4) Let $f$ be an analytic function in a domain $\Omega$ and $\{C\}$ be given in $\Omega$. Denote by $f(C)$ the image of $C$ by $f$. Then $\lambda(\{C\}) \le \lambda(\{f(C)\})$. The equality holds if $f$ is one-to-one. This shows that $\lambda(\{C\})$ is conformally invariant.

B. Extremal Distance

Let $\Omega$ be a domain in a plane, $\partial\Omega$ its boundary, and $X_1$, $X_2$ sets on $\Omega \cup \partial\Omega$. The extremal length of the family of curves in $\Omega$ connecting points of $X_1$ and points of $X_2$ is called the extremal distance between $X_1$ and $X_2$ (relative to $\Omega$) and is denoted by $\lambda_\Omega(X_1, X_2)$.

Example (1). Let $\Omega = \{z \mid |z| < 2\}$, $X_1 = \partial\Omega$, and $X_2$ be a countable set in $|z| < 1$ such that the set of accumulation points of $X_2$ coincides with $|z| = 1$. Then $\lambda_\Omega(X_1, X_2) = \infty$, but the extremal distance in the sense of Hersch and Pfluger is equal to $(2\pi)^{-1} \log 2$.

Example (2). In a rectangle with sides $a$ and $b$, the extremal distance between the sides of length $a$ is $b/a$.

Example (3). Let $\Omega$ be an annulus $r_1 < |z| < r_2$. The extremal distance $\lambda$ between the two boundary circles of $\Omega$ is equal to $(2\pi)^{-1} \log(r_2/r_1)$. The extremal length of the family of curves in $\Omega$ homotopic to the boundary circles is equal to $1/\lambda$.

Example (4). Let $\Omega$ be a domain in the extended $z$-plane such that $\infty \in \Omega$. Let $z_0 \in \Omega$ and $\{|z - z_0| = r\} \subset \Omega$, and denote by $\lambda_r$ the extremal distance between $\{|z - z_0| = r\}$ and a set $X \subset \bar{\Omega}$ relative to $\Omega$. Then $\lambda_r - (2\pi)^{-1} \log r$ increases with $r$. We call the limit the reduced extremal distance and denote it by $\tilde{\lambda}_\Omega(X, \infty)$. Robin's constant for Green's function in $\Omega$ with pole at $z = \infty$ is equal to $2\pi \tilde{\lambda}_\Omega(\partial\Omega, \infty)$.

Extremal length is also defined on Riemann surfaces. Some classical conformal invariants can be given in a generalized form in terms of extremal length. The notion of extremal length has applications in various branches of function theory, such as conformal and quasiconformal mappings, the Phragmén-Lindelöf theorem, the coefficient problem, the type problem of Riemann surfaces, and studies of the boundary properties of functions of finite Dirichlet integrals. It is also applied to problems in differential geometry. Extending the notion of extremal length, M. Ohtsuka considered extremal length with weight, and B. Fuglede introduced the notion of generalized module in higher-dimensional spaces [8]. These notions have useful properties and applications.

References

[1] L. V. Ahlfors and A. Beurling, Conformal invariants and function-theoretic null-sets, Acta Math., 83 (1950), 101-129.
[2] A. Pfluger, Extremallängen und Kapazität, Comment. Math. Helv., 29 (1955), 120-131.
[3] J. Hersch, Longueurs extrémales et théorie des fonctions, Comment. Math. Helv., 29 (1955), 301-337.
[4] M. Ohtsuka, Dirichlet problem, extremal length and prime ends, Van Nostrand, 1970.
[5] Y. Kusunoki, Some classes of Riemann surfaces characterized by the extremal length, Proc. Japan Acad., 32 (1956).
[6] M. Ohtsuka, On limit of BLD functions along curves, J. Sci. Hiroshima Univ., 28 (1964).
[7] T. Fuji'i'e, Boundary behaviors of Dirichlet functions, J. Math. Kyoto Univ., 10 (1970).
[8] B. Fuglede, Extremal length and functional completion, Acta Math., 98 (1957), 171-219.
[9] J. A. Jenkins, Univalent functions and conformal mapping, Erg. Math., Springer, 1958.
[10] J. Väisälä, Lectures on n-dimensional quasiconformal mappings, Lecture notes in math. 229, Springer, 1971.
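
As an illustration of the definitions in Section A, the value $b/a$ in Example (2) of Section B can be verified directly. The following is a sketch under an assumed choice of coordinates (not fixed in the text): the rectangle is $R = (0, a) \times (0, b)$, the sides of length $a$ lie on $y = 0$ and $y = b$, and $\Gamma$ is the family of curves in $R$ joining these two sides.

```latex
\begin{align*}
&\text{Upper bound for } M(\Gamma):\ \rho = 1/b \text{ on } R \text{ (and } 0 \text{ elsewhere) is admissible,}\\
&\qquad \text{since every } C \in \Gamma \text{ has length at least } b, \text{ so }
  M(\Gamma) \le \iint_R \rho^2 \, dx\, dy = \frac{ab}{b^2} = \frac{a}{b}.\\[4pt]
&\text{Lower bound: for any admissible } \rho \text{ and each } x \in (0, a),\quad
  \int_0^b \rho(x, y)\, dy \ge 1,\\
&\qquad \text{hence } a \le \iint_R \rho \, dx\, dy
  \le \Bigl( \iint_R \rho^2 \, dx\, dy \Bigr)^{1/2} (ab)^{1/2},
  \text{ i.e. } \iint_R \rho^2 \, dx\, dy \ge \frac{a}{b}.\\[4pt]
&\text{Therefore } M(\Gamma) = \frac{a}{b} \text{ and } \lambda(\Gamma) = \frac{1}{M(\Gamma)} = \frac{b}{a}.
\end{align*}
```

The lower bound uses only the vertical segments in $\Gamma$ together with the Cauchy-Schwarz inequality, and the constant metric $\rho = 1/b$ attains the infimum, so it is an extremal metric in the sense of Section A.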

144 (XXI.23)
Fermat, Pierre de

Pierre de Fermat (August 20, 1601 - January 12, 1665) was born into a family of leather merchants near Toulouse, France. He became an attorney and in 1631 a member of the Toulouse district assembly. When not engaged in such work, he did research in mathematics, so that he consigned his results only to his correspondence or to unpublished manuscripts. The manuscripts were published posthumously by his son in 1679 and are known as Varia opera mathematica. His research into number theory, stimulated by Bachet's (1581-1638) translation of the Arithmetika of Diophantus (published in 1621), made Fermat's name immortal and initiated modern number theory. He posed the famous Fermat's Problem, which has yet to be solved (→ 145 Fermat's Problem). He began analytic geometry by studying the theory of conic sections of Apollonius, and utilizing this theory he dealt with the notions of tangent lines, maximal (minimal) values of functions, and quadrature, which made him a pioneer in calculus. He also wrote a precursory work in the theory of probability in the course of his correspondence with Pascal. Fermat's principle is important in the field of optics, where it is known as the law of least action. Unlike Descartes, he emphasized the revival rather than the criticism of Greek mathematics.

References

[1] P. Fermat, Oeuvres I-V and supplement, P. Tannery and C. Henry (eds.), Gauthier-Villars, 1891-1922.

145 (V.16)
Fermat's Problem

The last theorem of Fermat (c. 1637) asserts that if $n$ is a natural number greater than 2, then

$$x^n + y^n = z^n \tag{1}$$

has no rational integral solution $x$, $y$, $z$ with $xyz \ne 0$. In the case $n = 2$, equation (1) has integral solutions called Pythagorean numbers (→ 118 Diophantine Equations). Fermat read a Latin translation of Diophantus' Arithmetika, in which the problem of finding all Pythagorean numbers is treated. In his personal copy of that book, Fermat wrote his assertion as a marginal note at the point at which $n = 2$ in equation (1) is treated and added the famous words, "I have discovered a truly remarkable proof of this theorem which this margin is too small to contain." It is not known whether Fermat actually had a proof. Fermat's problem asks for a proof or disproof of this conjecture, which itself has not been solved despite centuries of efforts by many mathematicians; but its study has promoted remarkable advances in number theory. In particular, E. E. Kummer's theory of ideal numbers and the development of the theory of cyclotomic fields were originally conceived in treating Fermat's problem.

In this article, we consider only those integral solutions $x$, $y$, $z$ of equation (1) with $xyz \ne 0$ that are relatively prime. We also restrict ourselves to the cases $n = l$ (odd prime) and $n = 4$, without loss of generality.

For smaller values of $n$, the nonsolvability of equation (1) was proved long ago, for $n = 3$ by L. Euler (1770), and later again by A. M. Legendre; for $n = 4$ by Fermat and Euler; for $n = 5$ by P. G. L. Dirichlet and Legendre (1825); and for $n = 7$ by G. Lamé (1839). S. Germain and Legendre found some results on more general cases, but the most remarkable result was obtained by Kummer (J. Reine Angew. Math., 40 (1850), Abh. Akad. Wiss. Berlin (1857)).

Let $l$ be an odd prime, $\zeta$ a primitive $l$th root of unity, and $h$ the class number of the cyclotomic field $Q(\zeta)$. Then the class number $h_2$ of the real subfield $Q(\zeta + \zeta^{-1})$ of $Q(\zeta)$ divides $h$. We call $h_1 = h/h_2$ and $h_2$ the first and second factors of $h$, respectively.

(1) If $l$ is regular, that is, if $(h, l) = 1$, then $x^l + y^l = z^l$ has no solution (Kummer, 1850).

There are infinitely many irregular primes [3]; those under 100 are 37, 59, and 67. There are 7,128 regular primes and 4,605 irregular primes between 3 and 125,000. It is not yet known whether there are infinitely many regular prime numbers, although the beginning part of the sequence of natural numbers contains a larger number of these than the number of irregular prime numbers. The condition $(l, h) = 1$ is equivalent to saying that the numerators of Bernoulli numbers $B_{2m}$ ($m = 1, 2, \dots, (l-3)/2$) are not divisible by $l$ (Kummer, 1850).

Kummer obtained a result on irregular primes (1857) which was improved later as follows. Note that if $l$ is not regular then $h_1$ is divisible by $l$ (Kummer, 1850) (→ 14 Algebraic Number Fields).

(2) If $(h_2, l) = 1$ and the numerators of Bernoulli numbers $B_{2ml}$ ($m = 1, 2, \dots, (l-3)/2$) are not divisible by $l^3$, then $x^l + y^l = z^l$ has no solution (H. S. Vandiver, Trans. Amer. Math. Soc., 31 (1929)). By computation Vandiver confirmed that $x^l + y^l = z^l$ has no solution for

$l < 619$. At present, this procedure has been continued for $l < 125{,}000$ using computers by the method of D. H. Lehmer, E. Lehmer, and H. S. Vandiver, Proc. Nat. Acad. Sci. US, 40 (1954) (S. S. Wagstaff, Math. Comp., 32 (1978)).

When the condition $(xyz, l) = 1$ or $(xyz, l) = l$ is added, we speak of Case I or Case II, respectively. The following theorems hold for Case I.

(3) If $x^l + y^l = z^l$ has a solution in Case I, then

$$B_{2m} f_{l-2m}(t) \equiv 0 \pmod{l}, \quad m = 1, 2, \dots, (l-3)/2, \tag{2}$$

holds for $-t = x/y$, $y/x$, $y/z$, $z/y$, $x/z$, and $z/x$, where $f_m(t) = \sum_{r=1}^{l-1} r^{m-1} t^r$, and $B_{2m}$ is the $m$th Bernoulli number. This is called Kummer's criterion (D. Mirimanov, 1905).

A simplification of the above result is

(4a) If $x^l + y^l = z^l$ has a solution in Case I, then

$$(2^{l-1} - 1)/l \equiv 0 \pmod{l}$$

(A. Wieferich, J. Reine Angew. Math., 136 (1909)). This result created a sensation at the time of its publication. It was first shown that 1093 and 3511 are the only primes with $l < 3700$ for which the above congruence holds; it is presently known that no other $l$ with $l < 6 \times 10^9$ satisfies this congruence. The criterion (4a) was gradually improved by Mirimanov (1910, 1911), P. Furtwängler (1912), Vandiver (1914), G. Frobenius (1914), F. Pollaczek (1917), T. Morishima (1931), and J. B. Rosser (1940, 1941). For example:

(4b) If $x^l + y^l = z^l$ has a solution in Case I, then

$$(m^{l-1} - 1)/l \equiv 0 \pmod{l} \tag{3}$$

holds for all $m$ with $2 \le m \le 43$. By means of this result, Rosser (1941) showed for $l < 41{,}000{,}000$, and D. H. Lehmer and E. Lehmer (Bull. Amer. Math. Soc., 47 (1941)) showed for $l < 253{,}747{,}889$ that $x^l + y^l = z^l$ has no solution in Case I.

We have hitherto been concerned with rational integral solutions of $x^l + y^l = z^l$. We may also consider the problem of proving or disproving that $\alpha^l + \beta^l = \gamma^l$ has no solution $\alpha$, $\beta$, $\gamma$ with $\alpha\beta\gamma \ne 0$ in the ring of algebraic integers of $Q(\zeta)$. Case I means the impossibility of

$$\alpha^l + \beta^l + \gamma^l = 0, \quad (\alpha\beta\gamma, l) = 1, \tag{4}$$

and Case II means the impossibility of

$$\alpha^l + \beta^l = \varepsilon \lambda^{nl} \gamma^l, \quad (\alpha\beta\gamma, l) = 1, \tag{5}$$

where $n$ is a natural number, $\varepsilon$ is a unit in $Q(\zeta)$, and $\lambda = 1 - \zeta$. We have the following results:

(1*) If $(h, l) = 1$, then neither equation (4) nor equation (5) has a solution (Kummer, 1850).

(2*) Under the same conditions as in statement (2), equation (4) has no solution. If we additionally restrict $\alpha$, $\beta$, $\gamma$ to relatively prime integers of $Q(\zeta + \zeta^{-1})$ and replace $\lambda$ by $(1 - \zeta)(1 - \zeta^{-1})$, then equation (5) also has no solution (Vandiver, 1929).

(4b*) If equation (4) has solutions $\alpha$, $\beta$, $\gamma$ in $Q(\zeta)$, then congruence relation (3) holds for all $m$ with $2 \le m \le 43$ (Morishima, 1934).

When $l$ is sufficiently large, we have the results of M. Krasner (C. R. Acad. Sci. Paris (1934)) and Morishima (Proc. Japan Acad., 11 (1935)).

Bibliographies are given in Vandiver and Wahlin [1] and Vandiver [2].

References

[1] H. S. Vandiver and G. E. Wahlin, Algebraic numbers II, Bull. Nat. Res. Council, no. 62, 1928.
[2] H. S. Vandiver, Fermat's last theorem, Amer. Math. Monthly, 53 (1946), 555-578.
[3] P. Ribenboim, 13 lectures on Fermat's last theorem, Springer, 1979.

146 (XX.30)
Feynman Integrals

A. Introduction

As the S-matrix or Green's function in quantum field theory is usually prohibitively difficult to calculate, perturbative expansions in terms of coupling constants have been employed since the beginning of the theory (→ 386 S-Matrices). R. P. Feynman (Phys. Rev., 76 (1949)) invented a way of calculating the series in terms of Feynman integrals. His method drastically simplified the preceding method due to S. Tomonaga and J. S. Schwinger, even though, as was later shown, the two methods are theoretically equivalent (F. J. Dyson, Phys. Rev., 75 (1949)). A Feynman integral is an integral associated with a Feynman graph according to the Feynman rule explained in Section B. Feynman integrals inherit the troublesome problem of divergence, and some recipe which systematically provides them with a definite meaning is needed. Such a recipe is given by the renormalization theory of Tomonaga, Schwinger, Feynman, and Dyson. A mathematically rigorous renormalization theory was given by N. N. Bogolyubov and O. S. Parasiuk (Acta Math., 97 (1957)), later supplemented by K. Hepp (Comm. Math.

Phys., 2 (1966)). See also W. Zimmermann (Comm. Math. Phys., 15 (1969)) and S. A. Anikin et al. (Theor. Math. Phys., 17 (1973)). Furthermore, E. R. Speer [1] gave a mathematically convenient recipe of renormalization (under the condition that massless particles are irrelevant). Although the series expansion in coupling constants is a divergent series even after renormalization (→ 386 S-Matrices), the study of Feynman integrals has given much insight into the qualitative aspects of the S-matrix, and in particular, into its analytic structure (e.g., R. J. Eden et al. [2]). In this respect the discovery of the Landau-Nakanishi equations, which describe the location of singularities of Feynman integrals, was crucially important (L. D. Landau, N. Nakanishi, and J. Bjorken, 1959; → Section C). Later, R. E. Cutkosky found a formula which gives important information concerning the ramification of Feynman integrals near their singularities (→ Section C). It gave impetus to J. Leray's mathematical study of Feynman integrals from the viewpoint of integration of multivalued analytic functions (Leray, Bull. Soc. Math. France, 87 (1959)). Such studies were subsequently carried out by D. Fotiadi, M. Froissart, J. Lascoux, F. Pham, etc.; → [3-5] and references cited there for this topic. An extensive study by G. Ponzano, T. Regge, Speer, and J. M. Westwater on the monodromy structure of Feynman integrals is closely related to the studies by Pham and

oriented, that $W_l^+ \ne W_l^-$ and that $G$ is connected. The orientation of a line is indicated by the symbol →. Given an orientation, we define the incidence number $[j : l]$ to be $+1$ or $-1$ according to whether $L_l$ ends or starts from $V_j$. In other cases, $[j : l]$ is defined to be zero. The incidence number $[j : r]$ is defined in the same way but with $L_r^e$ replacing $L_l$ (Fig. 1).

Fig. 1. In this example of a Feynman graph, two of the internal lines cross in the drawing but do not intersect, and the diagram should be drawn in $R^4$, not in $R^2$; this is indicated by ×. For convenience, multiple lines joining the same pair of vertices are usually drawn in a curvilinear manner, as shown.

The Feynman rule associates the following integral $F_G(p)$ with each Feynman graph $G$:

$$F_G(p) = \int \frac{\prod_j \delta^4\bigl(\sum_r [j : r]\, p_r + \sum_l [j : l]\, k_l\bigr)}{\prod_l (k_l^2 - m_l^2 + i0)} \times \prod_l d^4 k_l, \tag{1}$$

where $k_l^2 = k_{l,0}^2 - \sum_{\nu=1}^{3} k_{l,\nu}^2$. Here 1/(k_l^2 - m_l^2 +
others (- Regge in [6] and references cited J-1 0) means lim,,,(l/(k~-rnf +fi E)
there). On the other hand, the progress of (- 125 Distributions and Hyperfunctions).
microlocal analysis has thrown new light on Here we consider the case where the interac-
the Feynman integrals and has given a unified tion Lagrangian density does not contain
foundation to these various other studies (- differential operators (ie., direct coupling) and
Section C; also Pham, M. Kashiwara, and T. a11 relevant particles are spinless. In general,
Kawai in [6], M. Sato et al. in [6] and refer- we should multiply the integrand of F,(p) by a
ences cited there). matrix of polynomials of the p, and k,.
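
The incidence numbers can be made concrete with a small computation. The sketch below is only an illustration under stated assumptions: the graph (a one-loop "bubble" with two vertices, two internal lines and two external lines), the function names, and the sample momenta are all hypothetical and not taken from the text. It tabulates $[j:l]$ and $[j:r]$ and evaluates, at each vertex $V_j$, the combination $\sum_r [j:r]p_r + \sum_l [j:l]k_l$ that appears as the argument of the delta function in (1).

```python
# Toy graph (an assumption for illustration, not taken from the text):
# vertices V_1, V_2; internal lines L_1, L_2 both run from V_1 to V_2;
# external line L^e_1 ends at V_1 and external line L^e_2 starts from V_2.

ZERO = (0, 0, 0, 0)


def incidence(n_vertices, lines):
    """Rows j, columns l of incidence numbers for lines given as (start, end).

    Vertices are 0-based; None marks the free end of an external half-line.
    The entry is +1 if the line ends at V_j, -1 if it starts from V_j, else 0.
    """
    rows = [[0] * len(lines) for _ in range(n_vertices)]
    for col, (start, end) in enumerate(lines):
        if start is not None:
            rows[start][col] = -1
        if end is not None:
            rows[end][col] = +1
    return rows


def combine(coeffs, vectors):
    """Return sum_i coeffs[i] * vectors[i] for four-vectors stored as tuples."""
    total = ZERO
    for c, v in zip(coeffs, vectors):
        total = tuple(t + c * x for t, x in zip(total, v))
    return total


def vertex_sums(inc_ext, inc_int, p, k):
    """sum_r [j:r] p_r + sum_l [j:l] k_l for every vertex V_j."""
    return [
        tuple(a + b for a, b in zip(combine(row_e, p), combine(row_i, k)))
        for row_e, row_i in zip(inc_ext, inc_int)
    ]


inc_int = incidence(2, [(0, 1), (0, 1)])        # [j:l]
inc_ext = incidence(2, [(None, 0), (1, None)])  # [j:r]

p = [(3, 0, 0, 1), (3, 0, 0, 1)]                # sample external momenta
k = [(1, 2, 0, 0), (2, -2, 0, 1)]               # chosen so that k_1 + k_2 = p_1

per_vertex = vertex_sums(inc_ext, inc_int, p, k)
overall = combine([sum(col) for col in zip(*inc_ext)], p)  # sum_{j,r} [j:r] p_r
print(per_vertex, overall)
```

Each internal line starts at one vertex and ends at another, so every column of the table $[j:l]$ sums to zero. Adding the vertex combinations over all $j$ therefore removes the $k_l$ and leaves $\sum_{j,r}[j:r]p_r$, whatever values the internal momenta take.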
F,(P) bas the form KSj,,Cj:rlP,)fdP); we
often investigate ,fc(p) instead of F,(p). The
B. Definitions function &(p) is studied on M = def{ peR4” 1
Cj,,[j:r]p,=O}, and is called a Feynman
First, the notion of Feynman graphs is intro- amplitude. The integral(l) is not well defined
duced. A Feynman graph is sometimes called a as it stands because of the following prob-
Feynman diagram. A Feynman graph G con- lems: (a) 1s the product appearing in the inte-
sists of tïnitely many points (called vertices) grand well detïned? (b) 1s the integral conver-
{ I$}j=i,,,,,,, iïnitely many 1-dimensional seg- gent? The fïrst problem is not serious if m, #O
ments (called interna1 lines) {L,},=,,,,,,, and (- 274 Microlocal Analysis E) However, the
finitely many half-lines (called external lines) second problem, called the ultraviolet diver-
jw”=l....,“~ all of which are located in a 4- gence, is serious. The renormalization proce-
dimensional affine space. Each of the end- dure is intended to overcome this diflïculty.
points w’ and IV- of L, and the endpoint of When some m, is equal to zero, even the first
L; coincide with some vertex y. A four-vector problem, called the infrared divergence, is
p, = (P,,~, p,, , , p,, *, p,., 3) is associated with each serious. See D. R. Yennie et al. (Ann. Phys., 13
external line L; and a constant m, > 0 is as- (1961)) and T. Kinoshita (J. Math. Phys., 3
sociated with each interna1 line L,. For sim- (1962)) for analyses of the infrared divergenee.
plicity, we usually suppose, in addition, that In this article we always assume for simplicity
each interna1 line and each external line are that every m, is strictly positive, even though
such an assumption is too restrictive from the physical viewpoint.
Calculations of Feynman integrals are often done by means of the parametric representation of
the integral (1) of the form

$$\cdots \times \int_0^1 \frac{\delta\bigl(1-\sum_{l=1}^{N}\alpha_l\bigr)\,\prod_{l=1}^{N} d\alpha_l}{U(\alpha)^2\,\bigl(V(p,\alpha)+\sqrt{-1}\,0\bigr)^{-N+2n'-2}},$$

where $U(\alpha)$ and $V(p, \alpha)$ are determined by the topological structure of the graph $G$.
See [7] for the derivation of this formula and the treatment of the integral written in this
form. It is useful not only for the study of the singularity structure of $F_G(p)$ but also for
the study of spectral representations, etc. (→, e.g., [7, 8]). Note that several different
notations are used in the literature for the parametric representation of the integral. Hence
one should be careful in referring to papers which use parametric representations.


C. Analytic Properties of Feynman Integrals

The celebrated result of Landau (Nuclear Phys., 13 (1959)), Nakanishi (Prog. Theor. Phys., 22
(1959)) and Bjorken (thesis, Stanford Univ., 1959) asserts that the singularities of a Feynman
amplitude $f_G(p)$ are confined to the subset $L^+(G)$ (called a positive-α Landau-Nakanishi
variety) of $M = \{p \in \mathbf{R}^{4n} \mid \sum_{j,r}[j:r]p_r = 0\}$, defined by the following
set of equations (called Landau-Nakanishi equations), where $u_r$, $w_j$, and $k_l$ are real
four-vectors and $\alpha_l$ is a real number, all of which are to be eliminated to define
relations among the $p_r$ (note, however, that a positive-α Landau-Nakanishi variety is not
strictly a subvariety of $M$, because of the constraint (2e)):

$$u_r = \sum_{j=1}^{n'} [j:r]\,w_j \quad (r = 1, \ldots, n), \tag{2a}$$

$$\sum_{r=1}^{n} [j:r]\,p_r + \sum_{l=1}^{N} [j:l]\,k_l = 0 \quad (j = 1, \ldots, n'), \tag{2b}$$

$$\sum_{j=1}^{n'} [j:l]\,w_j = \alpha_l k_l \quad (l = 1, \ldots, N), \tag{2c}$$

$$\alpha_l (k_l^2 - m_l^2) = 0 \quad (l = 1, \ldots, N), \tag{2d}$$

$$\alpha_l \geq 0, \tag{2e}$$

with some

$$\alpha_l \neq 0. \tag{2f}$$

Usually a Landau-Nakanishi variety (resp., equation) is called a Landau variety (resp.,
equation) for short. The equation (2a) is conventional; (2b) represents the energy-momentum
conservation law at the vertex $V_j$; (2d) corresponds to the mass-shell constraint (if
$\alpha_l \neq 0$) on the internal line $L_l$. Since (2c) entails
$\sum_l \varepsilon_l(C)\alpha_l k_l = 0$ for a closed circuit (= loop) $C$ of $G$ for some set of
values $\varepsilon_l(C) = \pm 1$ or 0 with $\varepsilon_l(C)$ being 0 if $L_l$ does not belong to
$C$, the equation (2c) is usually called the closed-circuit condition. By attaching $w_j$ to the
vertex $V_j$ and associating $\alpha_l k_l$ to the internal line $L_l$, we get a diagram
representing a multiple collision of classical point particles, since the relations (2a)-(2f)
are just the classical conditions for such a collision (S. Coleman and R. E. Norton, Nuovo
Cimento, 38 (1965)). The fact that the Landau-Nakanishi equations admit such an interpretation
is neither accidental nor superficial in view of the †macroscopic causality of the S-matrix.
Note also that there is another interpretation of the Landau-Nakanishi equations, which
emphasizes their resemblance to †Kirchhoff's law (Nakanishi, Prog. Theor. Phys., 23 (1960)).
Such a resemblance can be used to study the structure of Feynman integrals from the viewpoint
of †graph theory. See [7] and the references cited there for this topic.
An important observation by Pham and Sato (1973), which opened a way to the microlocal
analysis of Feynman integrals and the S-matrix, is the following: If we consider the
Landau-Nakanishi equations to define a subvariety of $S^*M$, the †spherical cotangent bundle of
$M$, by eliminating only $w_j$, $k_l$, and $\alpha_l$, then the resulting variety describes the
†singularity spectrum of $f_G(p)$. More precisely, S.S. $f_G(p)$ is confined to the set
$\{(p; \sqrt{-1}\,u)\}$, where $(p; u)$ satisfies the Landau-Nakanishi equations. The rigorous
proof of this statement was given by Sato et al. in [6]. The subset of $S^*M$ or
$\sqrt{-1}\,S^*M$ thus obtained is denoted by $\mathcal{L}^+(G)$ and is also called a positive-α
Landau-Nakanishi variety. It is noteworthy that the microlocalization of the classical result
of Landau, Nakanishi, and Bjorken had essentially been achieved in a less sophisticated manner
by D. Iagolnitzer and H. P. Stapp (Comm. Math. Phys., 14 (1969)) in the framework of S-matrix
theory. The variety defined by (2a)-(2d) and (2f) is denoted by $L(G)$ or $\mathcal{L}(G)$ and is
a Landau-Nakanishi variety. In a neighborhood of $p^0$ in $L^+(G)$, $f_G(p)$ is the boundary
value of a holomorphic function $f_G(p)$ whose domain of definition is determined by
$u$-vectors (→ 274 Microlocal Analysis E). Furthermore, in simple cases one can verify that
$f_G(p)$ can be analytically continued to define a holomorphic function on the universal
covering space $\widetilde{U - L(G)^{\mathbf{C}}}$ of $U - L(G)^{\mathbf{C}}$ for a complex
neighborhood $U$ of $p^0$, where $L(G)^{\mathbf{C}}$ denotes a complexification of $L(G)$. Hence
we can discuss the difference of
$f_G(\gamma p)$ and $f_G(p)$, where $\gamma$ denotes a loop encircling $L(G)^{\mathbf{C}}$.
Cutkosky (J. Math. Phys., 1 (1960)) observed that it can be expressed (in simple cases) by the
integral obtained by replacing $1/(k_l^2 - m_l^2 + \sqrt{-1}\,0)$ in the right-hand side of (1) by
$-2\pi\sqrt{-1}\,\delta^+(k_l^2 - m_l^2) \underset{\text{def}}{=} -2\pi\sqrt{-1}\,Y(k_{l,0})\,\delta(k_l^2 - m_l^2)$,
where $Y(k_{l,0})$ denotes the †Heaviside function. A result of this type is called a
discontinuity formula and is now obtained for the S-matrix itself in a suitably modified
manner, i.e., the discontinuity formula holds beyond the framework of perturbation theory
(→ 386 S-Matrices). As is mentioned in 386 S-Matrices, the discontinuity formula is closely
related to Sato's conjecture on the †holonomic character of the S-matrix (T. Kawai and Stapp
in [6]). Actually, M. Kashiwara and Kawai (in [6]) proved that $f_G(p)$ satisfies a †holonomic
system of linear differential equations whose characteristic variety is confined to the
extended Landau-Nakanishi variety $L(G)$. They further proved that the system has regular
singularities (→ 274 Microlocal Analysis G). Their result gives, on one hand, a precise version
of Regge's statement to the effect that $f_G(p)$ is a generalization of a †hypergeometric
function (in Battelle Rencontres, C. DeWitt and J. A. Wheeler (eds.), Benjamin, 1968), and, on
the other hand, a rigorous proof of the fact that $f_G(p)$ is a Nilsson class function. This
fact is closely related to the works of D. Fotiadi, J. Lascaux, Pham, and others. Kashiwara and
Kawai (Comm. Math. Phys., 54 (1977)) also showed that the holonomic character of the Feynman
amplitude is an important clue for understanding the so-called hierarchical principle, which
had been proposed and studied in connection with the †Mandelstam representation by the
Cambridge group (Eden et al. [2]). Furthermore, Kashiwara et al. (Comm. Math. Phys., 60 (1978))
gave a useful expression of $f_G(p)$ at several physically important points by analyzing the
microlocal structure of the holonomic system that $f_G(p)$ satisfies. Thus the use of
†(micro-) differential equations in analyzing Feynman amplitudes has turned out to be effective
in understanding their singularity structures in a unified manner.


References

[1] E. R. Speer, Generalized Feynman amplitudes, Princeton Univ. Press, 1969.
[2] R. J. Eden, P. V. Landshoff, D. I. Olive, and J. C. Polkinghorne, The analytic S-matrix,
Cambridge Univ. Press, 1966.
[3] R. C. Hwa and V. L. Teplitz, Homology and Feynman integrals, Benjamin, 1966.
[4] F. Pham, Introduction à l'étude topologique des singularités de Landau, Gauthier-Villars,
1967.
[5] S. Lefschetz, Applications of algebraic topology, Springer, 1975.
[6] Publ. Res. Inst. Math. Sci., 12, suppl. (1977) (Proc. Oji Seminar on Algebraic Analysis,
1976).
[7] N. Nakanishi, Graph theory and Feynman integrals, Gordon & Breach, 1971. (Original in
Japanese.)
[8] I. T. Todorov, Analytic properties of Feynman diagrams in quantum field theory, Pergamon,
1971. (Original in Russian, 1966.)


147 (IX.13)
Fiber Bundles

A. General Remarks

E. Stiefel [2] introduced certain †diffeomorphism invariants of †differentiable manifolds by
considering a field of a finite number of linearly independent vectors attached to each point
of a manifold; and H. Whitney [3] obtained the notion of fiber bundles as a compound idea of a
manifold and such a field of tangent vectors. S. S. Chern [4] emphasized the global point of
view in differential geometry by recognizing the relation between the notion of †connections
(due to E. Cartan) and the theory of fiber bundles. The theory of fiber bundles is also applied
to various fields of mathematics, for example, the theory of †Lie groups, †homogeneous spaces,
†covering spaces, and general vector bundles, vector bundles of class $C^r$, or analytic vector
bundles. Homological properties of fiber bundles are studied by means of †spectral sequences,
and cohomology structures of several homogeneous spaces and several characteristic classes are
determined explicitly by means of †cohomology operations. Also, the group $K(X)$, formed by
equivalence classes of vector bundles over a finite †CW complex $X$, is a †generalized
cohomology group, treated in †K-theory, in which further development is expected
(→ 237 K-Theory).

B. Definitions

Let $E$, $B$, $F$ be topological spaces, $p: E \to B$ a continuous mapping, and $G$ an †effective
left topological †transformation group of $F$. If there exist an †open covering $\{U_\alpha\}$
($\alpha \in A$) of $B$ and a homeomorphism $\varphi_\alpha: U_\alpha \times F \approx p^{-1}(U_\alpha)$
for each $\alpha \in A$ having the following three properties, then the system
$(E, p, B, F, G, U_\alpha, \varphi_\alpha)$ is
called a coordinate bundle: (1) pcp,(h, y) = b bundle) if G operates on G by left translations.


(b~u~,y~F). (2) Detïne <~~,~:Fxp-‘(b) (~EU,) This is also defïned by the following con-
by v,,d~)=v~(b~~); then ~paW=v&ovo,b~G ditions: G is a right ttopological transfor-
for beU,f? U,. (3) gsa: U,f’U,+C is continu- mation group of P, and there exist an open
ous. We say that this bundle is equivalent covering {U} of B and homeomorphisms <p:
to a coordinate bundle (E, p, B, F, G, U;, cp;) U x Gzq-‘(U) with qcp(b,g)=b, <p(b,g).g’=
if y,,(b)= <p;;h o<P.,~E G (hé LJ,n U;) and cp(h, gg’) (bE U; g, g’ E G). A bundle mapping
g,,c: U,n U;+G is continuous. An equivalence Y: P+ P’ between two principal bundles y~=
class 5 = (E, p, B, F, G) of coordinate bundles is (P, q, B, G) and 1’ = (P’, q’, B’, G) is also delïned
called a fïber bundle (or G-bundle), and E is as a continuous mapping Y with Y(x g) =
called the total space (or bundle space), p the W) 9.
projection, B the base space, F the fïber, and G
the bundle group (or structure group). Also, UC
D. Associated Fiber Bundles
of a coordinate bundle (E, p, B, F, G, Un, cp,)
belonging to the class 5 is called the coordinate
Let ré= (P, q, B, G) be a principal bundle, and
neighborhood, cp, the coordinate function, and
let F be a topological space having G as an
gOa the coordinate transformation (or transition
effective left topological transformation group.
function).
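
As a concrete instance of a coordinate transformation $g_{\beta\alpha}(b) = \varphi_{\beta,b}^{-1} \circ \varphi_{\alpha,b}$, the following sketch works out the two-chart description of the open Möbius band. The model is an assumption made for illustration: the base is the circle $\mathbf{R}/\mathbf{Z}$ with $b \in [0, 1)$, the fiber is $\mathbf{R}$, the group is $\{+1, -1\}$ acting by multiplication, and a point of the total space is stored as a representative $(b, y)$ with the gluing $(b + 1, y) \sim (b, -y)$. All function names are hypothetical.

```python
def phi_alpha(b, y):
    """Chart over U_alpha = {0 < b < 1}: the identity on representatives."""
    assert 0 < b < 1
    return (b, y)


def phi_beta(b, y):
    """Chart over U_beta = {b != 1/2}, built on the lift of b to (-1/2, 1/2).

    For b in (1/2, 1) the lift is b - 1, and the gluing (b - 1, y) ~ (b, -y)
    turns the fiber coordinate y into -y on the stored representative.
    """
    assert 0 <= b < 1 and b != 0.5
    return (b, y) if b < 0.5 else (b, -y)


def phi_beta_inverse(point):
    """Inverse of phi_beta on a stored representative (b, y)."""
    b, y = point
    assert 0 <= b < 1 and b != 0.5
    return (b, y) if b < 0.5 else (b, -y)


def transition(b):
    """g_{beta alpha}(b), read off from its action on the fiber element y = 1."""
    _, y = phi_beta_inverse(phi_alpha(b, 1))
    return y


print(transition(0.25), transition(0.75))
```

On the overlap $U_\alpha \cap U_\beta$, which has two components, the coordinate transformation is $+1$ on one component and $-1$ on the other; it is locally constant and hence continuous as a map into the group.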
Then G is a right topological transformation
Let 5 = (E, p, B, F, G) and 5’ = (E’, p’, B’, F, G)
group of the product space P x F by (x, y). g =
be two liber bundles with the same liber and
(x.g,g-’ .y) (xéP,y~F,geG). Consider the
group. A continuous mapping Y: E-t,%’ is
+orbit space P x GF = (P x F)/G, and define the
called a bundle mapping (hundle map) from C:to
continuous mapping p:P x,F+B by ~{@,y)}
5’ if the following two conditions are satislïed:
=q(x).ThenuxoF=(Px,F,p,B,F,G)is
(1) There is a continuous mapping $ : B-B
a tïber bundle, called the associated fiber
with p’oY =Y op. (2) t,hflr,,(b)=&,;! oY ocp,,,~
bundle of ré with liber F. On the other hand, rl
G(bEU,n$-‘(VJ, b’=\lr(b)), and $,or:Ucn
is called an associated principal bundle of 5 =
Ic, -‘( VL)-G is continuous, where {U,, cp,} and
(E, p, B, F, G) if 4 = r xG F. A principal bundle
{Vi, cpi} are pairs of coordinate neighbor-
q having the same coordinate transformations
hoods and functions of r and c’, respectively.
as 5 is an associated principal bundle of 5, and
Moreover, if $ is a homeomorphism, then Y is
two liber bundles are equivalent if and only if
also a homeomorphism and Y-i is a bundle
their associated principal bundles are equiva-
mapping.
lent. Therefore, given a fiber bundle 5, there
Let 5 =(E, p, B, F, G) and 5’ =(E’, p’, B, F, G)
exists a principal bundle q such that 5 = 9 xG F.
be two fiber bundles with the same base space,
liber, and group. If there is a bundle map-
ping Y: E+E’ such that $: B-B as described E. Examples of Fiher Bundles
before is the identity mapping, then we say
that < is equivalent (or isomorphic) to 5’ and (1) Product bundle. (B x F, pl, B, F, G), where
Write 5 = 5’. Take the same coordinate neigh- pl, the projection of the product space, is
borhoods {U,}, and let gPa and gbdl be the co- called a product bundle if there is just one
ordinate transformations of 5 and <‘, respec- coordinate neighborhood B and the coordi-
tively. Then 5 = 5’ if and only if there are nate function is the identity mapping of B x F.
continuous mappings 1,: U,+G with g;,(b)= A bundle that is equivalent to a product bun-
*p(b)gpaVWa@-’ @E ua n UP>. dle is called a trivial bundle.
For a system { tfp&} of coordinate transforma- (2) A tcovering (F, p, Y) is a fiber bundle
tions of a liber bundle, we have gyp(b)gPbl(b) = whose liber is the discrete space ~~‘(y,)
gJb) (b E UCn U, n U,). Conversely, given an (y0 E Y), and the structure group is a factor
open covering { Um} and a system of gfiK: U, n U, group of the tfundamental group rci(Y,yJ. In
*G satisfying, this condition, there is a unique particular, a tregular covering is a principal
G-bundle (E, p, B, F, G) with { goo} as a system bundle.
of coordinate transformations. Actually, E is (3) Hopf bundle. Let A be the real number
the tidentification space of ,!?= {(b, y, c()1b E U,} lïeld R, the complex number field C, or the
c B x F x A obtained by identifying two points quaternion fïeld H, 3, = dim, A, and A”” the
(b, Y, 4, (b’, Y’, B) with b = b’, Y’ = y,&). Y, ad P (n + 1)-dimensional linear space over A. Iden-
is defmed by p{ (b, y, m)} = b, where the index tify two points (z,, , zJ, (~0, ,zL) of the
set A = {E} is considered a discrete space. subspace A”+I - (0) (0 is the origin) if there is a
~612 such that zi=ziz (i=O, . . ..n). Then we
C. Principal Fiber Bundles obtain the identification space P”(A), called the
n-dimensionai projective space over A. Let Si
A liber bundle n = (P, q, B, G, G) is called a (= S’@+r)-‘, the (E(n+ l)- 1)-sphere) be the
principal fiber bundle (or simply principal unit sphere in A”+‘. Then Si is the topological
transformation group of Si by the product of an (n, + n,)-dimensional real vector bundle


A, and the orbit space SI/S: is P”(A). Fur- (E, p, B), denoted by t, @ t2 and called the
thermore, (Si, 4, P”(A), Si) (q is the projection) Whitney sum of 5, and c2. Similarly, we cari
is a principal bundle called the Hopf bundle (or define the tensor product 5, @ t2, the p-fold
Hopf fibering). These comments are valid also exterior power A”< (or bundle <‘p’ of p-vectors),
for n = m. When n = 1, P’(A) is homeomorphic and the hundle of homomorphisms
to S”, and the Hopf bundle is (S’“-‘, 4, S”, S”-‘) Hem([,, 5,) of dimension n,n,, (p), and
(1. = 1,2,4). A Hopf bundle is delïned similarly n, nî, respectively, using the tensor product
for i, = 8 using the +Cayley algebra, and the R”I @ R”I =R”I”*, the p-fold texterior power
projection q: Y-’ +SA (1. = 2,4,8) is the Hopf APR” = R(i) (the space (R”)@) of p-vectors
mapping (Hopf map). in R”), and the space of homomorphisms
(4) Let G be a topological group, H its Hom(R”1, R”2) = Rnl”*. (For the last one, we use
closed subgroup, and r:G+G/H, r(g)=gH the Hom((d J1, d,,) instead of Hom(d,,, db).)
natural projection. If there exist a neighbor- Hom(& E’)= t* is called the dual (vector)
hood U of ~(H)E G/H and a continuous map- bundle of 5, where s1 is the trivial line bundle.
pingf: U-G such that rof is the identity If we use coordinate transformations, 0, 0,
mapping, then we say that H has a local cross A”, and <* are obtained by the direct sum,
section fin G, and (G/K, p, G/H, HIK, H/I&) is +Kronecker product, matrix of +p-minors, and
a liber bundle for any closed subgroup K of H ttranspose of matrices, respectively. If t2 is
(where p is the natural projection gK+gH and a subbundle of 5,) the quotient bundle 5, /c2
K, is the largest normal subgroup of H con- of dimension n, - n2 is delïned by using the
tained in K). The associated principal bundle quotient vector space R”l/R”2= R”-“*, and
of the latter fiber bundle is (G/K,, p, G/H, t2 @ (t1/t2) is equivalent to 5,. These oper-
HIK,). Any closed subgroup H of a +Lie ations preserve the equivalence relation of
group G has a local cross section in G; hence bundles. Also, @ and @ are commutative up
the above bundles cari be obtained. to equivalence and satisfy the associative and
distributive laws. For each 5 having a lïnite-
dimensional +CW complex as base space, there
F. Vector Bundles is a 5’ such that 5 0 <’ is trivial.
Using the complex number lïeld C or the
A system < =(E, p, B) of topological spaces E, B quaternion field H instead of the real number
and a continuous mapping p: E-B is called an lïeld R, we cari deline similarly the complex
n-dimensional real vector bundle if the follow- vector bundle or the quaternion vector bundle
ing two conditions are satislïed: (1) p-‘(b) is a and the operations 0, 0, etc. For a complex
real vector space for each b E B. (2) There exist vector bundle 5, the complex conjugate bun-
an open covering { U,} (a~ A) of B and a co- dle 5 is defined by the complex conjugate of
ordinate function qDol:U, x F z P ml (U,) for each matrices.
a6A, where F=R”; furthermore, the <P~,~: (5) Tangent bundles, tensor bundles. Let M
R” z p ml (b) are isomorphisms of vector spaces. be an n-dimensional tdifferentiable manifold of
In this case, goa = (pp,b o <P~,~:R”zR”( be class C’. Consider the ttangent vector space
U, f’ U,) is an element of the tgeneral linear T’(M) at PE M, set T(M) = UP,, T,(M), and
group GL(n, R). Hence a vector bundle is a delïne n: T(M)+M by n(T,(M))=p. For a
lïber bundle with liber R” and group GL(n, R), tcoordinate neighborhood CI,, of p with local
and the converse is also true. A l-dimensional coordinate system (x,, ,x,), each point of
vector bundle is called a line bundle. A vector Y’( UP) is represented by C;=i @/3x,, and
bundle 5’ = (E’, p’, B) is called a subbundle of a ne1 (U,) has a coordinate system (x1, ,x,,
vector bundle 5 = (E, p, B) if E’ c E, p 1E’ = p’, fi, ,,fJ. Hence T(M) is a C’-‘-manifold, and
and p’-‘(b) is a vector subspace of p-‘(b) for Z(M) = (T(M), TT,M, R”, GL(n, R)) is an n-
each b E B. dimensional real vector bundle. 2(M) is called
Let <i and t2 be two vector bundles of the tangent (vector) bundle, its dual bundle
dimension n, and n2 with the same base space 2*(M) the cotangent (vector) bundle, and the
B. Let E be the union of the direct sum p;‘(b) tensor product 2(M) @ . @ 2*(M) 0 . . a
+p;‘(b)forbEB,anddefinep:E-Bby tensor bundle of M. The line bundle A”X*(M)
p(p;‘(b)+p;‘(b))=b. Take the same coordi- is called the canonical bundle of M.
nate neighborhoods U, for 5, and c2, and For a tcomplex manifold M, T(M) is a
define <p,: U, x R”1+“2-*p-‘(UJ by <p,(b,y)= complex manifold and 2(M) is a complex
(<pb,,+dd~)(~~R “1% = R”i + R”z), where vector bundle. Therefore these bundles are
the cpi: U, x R”I~ pi-’ (U,) are the coordi- delïned as complex bundles.
nate functions of 5,. Then E is topologized by (6) Tangent r-frame bundle. In the preceding
taking the family {q,(O)} (0 is open in U, x example, the space of all ttangent r-frames of
R”j+Q) as the +open base, and we obtain M is a bundle space with base space M and
group GL(n, R). It is called the tangent r-frame and gi, is an element of the i,th copy of G,
bundle (or the bundle of tangent r-frames) of k = 1, . , m. Here we cari omit ti,gi, if ti, = 0.
M. Regard E, as a topological space with a weak
topology such that the coordinate functions
es ti, ewg, are continuous. Define the right
G. Tbe Classification Problem
action of G on E, by (ti,gi, @ . . . @ ti,gi,).g=
ti,(gi, ‘g)@ 0 ti,(gim.g). Let B, and p: EG+
For a tïber bundle 5 =(E, p, B, F, G) and a
B, be the identification space of E, by this
continuous mapping $ : B’-t B, consider the
action of G and the identification mapping.
subspace E’= {(x,b’)~E x ~?I~(X)=~(W)} of
Then (EG, p, B,, G) is a universal bundle for G,
E x B’ and the projections p’:E’+B’ and Y:
and B, is called the classifying space for G. E,
E’+E. Then I/I#< = (E’, p’, B’, F, G) is a fïber
and B, are sometimes written as EG and BG.
bundle, and Y is a bundle mapping from $#t
In particular, a classifying space B, is a count-
to t; $#l is called the induced bundle of 5 by
able CW complex for any countable CW
$. Let { Uz} and { gpb} be the systems of coordi-
group G (i.e., a topological group that is a
nate neighborhoods and transformations of 5.
countable CW complex such that the mapping
Then { $ ml (U,)} and { goa o $} are correspond-
g+g-l of G into G and the product mapping
ing systems of $ # <. If Y : E’-t E is a bundle
G x G-G are both cellular). The following
mapping from [’ to r having $ : B’+B as the
examples of classifying spaces for Lie groups
mapping of base spaces, then <‘= $#t. Also,
are also useful. Note that every CW complex
wehave*#5,~~#52if51r~2;(~o~‘)#5-
B, of a given G has the same thomotopy type.
+,Y#($#<). If 5 is a principal bundle, then $#t
is also principal, and I~/#(V x,F)-(ll/#n) x,F.
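
The construction of the induced bundle $\psi^{\#}\xi$ can be traced on a finite toy model. The sketch below is an illustration only: the spaces are finite sets (so every map is continuous for the discrete topology), the bundle $\xi$ is a product bundle, and all names and sample values are hypothetical. It builds $E' = \{(x, b') \in E \times B' \mid p(x) = \psi(b')\}$ together with the projections $p'$ and $\Psi$, and checks that $p \circ \Psi = \psi \circ p'$ and that $\Psi$ carries each fiber of $p'$ onto the corresponding fiber of $p$.

```python
# Toy bundle xi: base B, fiber F, total space E = B x F, projection p.
B = [0, 1, 2]
F = ["u", "v"]
E = [(b, y) for b in B for y in F]


def p(x):
    return x[0]


# A map psi: B' -> B between finite sets.
B_prime = ["s", "t", "w", "z"]
psi = {"s": 0, "t": 2, "w": 2, "z": 1}

# Total space of the induced bundle and its two projections.
E_prime = [(x, b1) for x in E for b1 in B_prime if p(x) == psi[b1]]


def p_prime(e):
    return e[1]


def Psi(e):
    return e[0]


def fiber_of_p(b):
    return {x for x in E if p(x) == b}


def image_of_fiber_of_p_prime(b1):
    return {Psi(e) for e in E_prime if p_prime(e) == b1}


commutes = all(p(Psi(e)) == psi[p_prime(e)] for e in E_prime)
fibers_match = all(
    image_of_fiber_of_p_prime(b1) == fiber_of_p(psi[b1]) for b1 in B_prime
)
print(len(E_prime), commutes, fibers_match)
```

Two points of $B'$ with the same image under $\psi$ (here `"t"` and `"w"`) receive separate copies of the same fiber, which is the set-level content of describing $\psi^{\#}\xi$ by the neighborhoods $\psi^{-1}(U_\alpha)$ and the transformations $g_{\beta\alpha} \circ \psi$.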
For a tparacompact space B’, $1 t = $2 5 if
$, , $,: B’dB are thomotopic. 1. Examples of Universal Bundles
For a topological group G, a principal
bundle ((n, G)=(i?(n, G),p, B(n, G), G) is called (1) G is either O(n), U(n), or Sp(n): Let A and n
an n-universal bundle if E(n, G) is +n-connected be as in (3) of Section E. According as A is R,
(n < 00); its base space B(n, G) is called an n- C, or H, we Jet U(n, A) be the +Orthogonal
classifying space of G. In particular, <(CO, G) group O(n), the +unitary group LJ(n), or the
= tG = (EG, p, B,, G) is called simply a universal +symplectic group SP(~). Then the Stiefel
bundle and B, a classifying space of G. Then manifold V,+,,,(A)= U(m+n,A)/I, x U(n, A)
we have the classification tbeorem: Let B be a (Im is the unit element of U(m, A)) is (A(n+ 1)
CW complex with dim B < n; then the set of - 2)-connected. Hence the principal bun-
equivalence classes of principal G-bundles with de W(n+ 11-Z U(m+n, N)=(K,+,,,(A),
base space 6 is in one-to-one correspondence M ,+,,,(A), U(m, A)) from (4) of Section E is a
with the thomotopy set n(B; B(n, G)) of con- (A(n + 1) - 2)-universal bundle of U(m, A),
tinuous mappings of B into B(n, G). Such a where the base space M,+,,,(A)= L’(m+n,A)/
correspondence is given by associating with U@n, A) x U(n, A) is the +Grassmann manifold.
the induced bundle $#<(PI, G) a continuous (2) G is either O(C~), U(m), or S~(OC). The
mapping I,!I: B+B(n, G), called the character- examples in (1) are valid for m, n = CO. Con-
istic mapping or classifying mapping (charac- sider the tinductive limit group U(m, A) =
teristic map or classifying map) of $#<(II, G). Un U(n, A) under the natural inclusion
Furthermore, if G is an effective left topo- U(n, A) t U(n + 1, A), and supply the infinite
logical transformation group of F, the set of classical group U(GO, A) with the weak topol-
equivalence classes of G-bundles with base ogy (this means that a set 0 of U(co, A) is
space B and tïber F is in one-to-one corre- open if and only if each 0 fl U(n, A) is open in
spondence with rr(B; B(n, G)). The correspon- U(n, A)). Then the infinite Stiefel manifold
dence is given by associating $#([(n, G) xGF) v ,+,,,(A) and the infinite Grassmann manifold
to *. M ,+,,,(A) (m = CIZor n = CO) are detïned as
before, and we have

H. Construction of Universal Bundles Mvn(A)=U M,+,,m(N>

For an arbitrary topological group G, J. W. and SO on. Furthermore, these manifolds are
Milnor [6] constructed a universal bundle (E,, CW complexes, and V,,,(A) (m< CO) is CO-
p, B,, G) in the following manner. The join connected. Although U( CQ,A) is not a Lie
E, = Go o Go of countably infinite group, U(m, A) x U(n, A) has a local cross
copies of G is detïned as follows: A point e of section in U(m + n, A) for m, n < m [7]. There-
E, is the symbol ti,gi, 0 @ t,,gi, (1 < i, < fore, setting n= 10 in (1) ((CO, U(m, A)) is a
i,<...<i,,m=1,2,3 ,... ),whereti ,,..., ti,are universal bundle of U(m, A), and the infinite
real numbers satisfying tii 3 0, til + + tim= 1, Grassmann manifold M,,,(A) is a classifying
space $B_{U(m,\Lambda)}$. Also, $\xi(\lambda(n+1)-2, U(\infty, \Lambda))$ in (1) is a
$(\lambda(n+1)-2)$-universal bundle of $U(\infty, \Lambda)$ ($n < \infty$).
(3) $G$ is either $SO(m)$ or a general Lie group. For the †rotation group $SO(m)$, we have
$\xi(n-1, SO(m)) = (V_{m+n,m}(\mathbf{R}), p, \tilde{M}_{m+n,m}, SO(m))$ and
$B_{SO(m)} = \tilde{M}_{\infty,m}$, where $\tilde{M}_{m+n,m} = SO(m+n)/SO(m) \times SO(n)$ is the
oriented Grassmann manifold. For any compact Lie group $G$, we have
$\xi(n-1, G) = (V_{m+n,m}(\mathbf{R}), p, O(m+n)/G \times O(n), G)$, where $G \subset O(m)$. For any
connected Lie group $G$, we have $\xi(n, G) = \xi(n, G_1) \times_{G_1} G$, where $G_1$ is the
maximum compact subgroup of $G$ (since $G/G_1$ is homeomorphic to a Euclidean space,
$\xi(n, G)$ reduces to $\xi(n, G_1)$).


J. Reduction of Fiber Bundles

Let $G$ be a topological group and $H$ its closed subgroup. We say that the structure group of
a $G$-bundle $\xi$ is reduced to $H$ if $\xi$ is equivalent to a $G$-bundle whose coordinate
transformations take values in $H$. For a principal $H$-bundle $\eta_0 = (P, q, B, H)$, the
associated $H$-bundle $\eta_0 \times_H G = (P \times_H G, p, B, G)$ with fiber $G$ is defined,
where $H$ operates on $G$ by the product of $G$; it is also a principal $G$-bundle if we define
an operation of $G$ on $P \times_H G$ by $\{(x, g)\} \cdot g' = \{(x, gg')\}$. For a principal
$G$-bundle $\eta$, we say that $\eta$ is reducible to an $H$-bundle if there is a principal
$H$-bundle $\eta_0$ with $\eta \equiv \eta_0 \times_H G$, and we call $\eta_0$ a reduced bundle
of $\eta$. It is easy to see that the group of a $G$-bundle $\xi$ is reducible to $H$ if and only
if the associated principal $G$-bundle of $\xi$ is reducible to $H$. Also, if $\eta_0$ is a
reduced bundle of $\eta$, then $\psi^{\#}\eta_0$ is a reduced bundle of $\psi^{\#}\eta$.
Now, assume that $H$ has a local cross section in $G$ and $G/H$ is $\infty$-connected. Then for
an $n$-universal bundle $\xi(n, H)$ of $H$, $\xi(n, H) \times_H G$ is an $n$-universal bundle of
$G$ ($n$-connectedness of $E(n, H) \times_H G$ is shown by the †homotopy exact sequence of
†fiber spaces). Therefore, by the classification theorem, the group of any $G$-bundle is
reducible to $H$, and the equivalence classes of $G$-bundles are in one-to-one correspondence
with those of $H$-bundles.
(1) A $G$-bundle is trivial if and only if its group is reducible to $e$ (identity element). A
$2n$-dimensional differentiable manifold $M$ of class $C^\infty$ has an †almost complex
structure if and only if the group of the tangent bundle $\mathfrak{T}(M)$ is reducible to
$GL(n, \mathbf{C})$, i.e., $\mathfrak{T}(M)$ is considered as an $n$-dimensional complex vector
bundle.
(2) Since $GL(n, \mathbf{R}) \approx O(n) \times \mathbf{R}^{n(n+1)/2}$ and
$GL(n, \mathbf{C}) \approx U(n) \times \mathbf{R}^{n^2}$, $n$-dimensional real (complex) vector
bundles can be considered as $O(n)$ ($U(n)$)-bundles with fiber $\mathbf{R}^n$ ($\mathbf{C}^n$).


K. Homotopy and Homology Theory of Bundles

Since a fiber bundle is a †locally trivial fiber space, the exact sequence and the spectral
sequence of fiber spaces (→ 148 Fiber Spaces) are applicable to fiber bundles. For example, the
cohomology structures of homogeneous spaces of classical groups have been determined by
A. Borel, J.-P. Serre, and others.
(1) Characteristic class. For a classifying space $B_G$ of a topological group $G$, we have an
isomorphism $\pi_n(B_G) \cong \pi_{n-1}(G)$ of homotopy groups and the following classification
theorem of fiber bundles over the $n$-sphere $S^n$. The set of the equivalence classes of
principal $G$-bundles or $G$-bundles with fiber $F$ over the base space $S^n$ is in one-to-one
correspondence with the set $\pi_{n-1}(G)/\pi_0(G)$ of equivalence classes under the operation
of $G$ on $\pi_{n-1}(G)$ given by the inner automorphisms of $G$; such a correspondence is given
by associating with each principal $G$-bundle $\eta = (P, q, S^n, G)$ the class (called the
characteristic class of $\eta$) containing the image $\Delta(\iota_n)$ of a generator
$\iota_n \in \pi_n(S^n)$ by the homomorphism
$\Delta: \pi_n(S^n) \cong \pi_n(P, G) \to \pi_{n-1}(G)$. Take $U_1$ and $U_2$ (the open sets of
$S^n$ such that the last coordinates $t_{n+1}$ are $> -1/2$ and $< 1/2$, respectively) as
coordinate neighborhoods of $\eta$. Then the restriction $T = g_{12} \mid S^{n-1}$ represents the
characteristic class of $\eta$, where $g_{12}: U_1 \cap U_2 \to G$ is the coordinate
transformation and $S^{n-1}$ is the equator of $S^n$.
(2) For the principal bundle $\eta = (SO(n+1), q, S^n, SO(n))$, the mapping
$T: S^{n-1} \to SO(n)$ is given by

$$T(t_1, \ldots, t_n) = \bigl(I_n - 2(t_i t_j)\bigr)\begin{pmatrix} I_{n-1} & 0 \\ 0 & -1 \end{pmatrix}$$

($I_n$ is the unit matrix of degree $n$). Hence the †mapping degree of the composite
$q' \circ T: S^{n-1} \to S^{n-1}$ (of $T$ and the natural projection $q': SO(n) \to S^{n-1}$) is
equal to 0 if $n$ is odd and 2 if $n$ is even. From this fact and the homotopy exact sequence,
we have

$$\pi_n(V_{m+n,m}(\mathbf{R})) = \begin{cases} \mathbf{Z} & \text{if } m = 1 \text{ or } n \text{ is even}, \\ \mathbf{Z}_2 = \mathbf{Z}/2\mathbf{Z} & \text{if } m > 1 \text{ and } n \text{ is odd} \end{cases}$$

for the real Stiefel manifold $V_{m+n,m}(\mathbf{R})$, which is $(n-1)$-connected.
(3) Sphere bundles. An $O(n+1)$-bundle with fiber $S^n$ is called an $n$-sphere bundle. The set
of equivalence classes of $n$-sphere bundles with base space $S^m$ is in one-to-one
correspondence with $\pi_{m-1}(O(n+1))/\pi_0(O(n+1))$. For example, any 1-sphere bundle over
$S^m$ ($m > 3$) and any $n$-sphere bundle over $S^3$ is trivial. Every 3-sphere bundle over $S^4$
is equivalent to one of $\{\xi_{m,n} \mid m \text{ an integer}, n \text{ a positive integer}\}$,
where
$\xi_{m,n}$ is defined as follows: Let $\rho, \sigma\colon S^3 \to O(4)$ be defined by $\rho(q)q' = qq'q^{-1}$, $\sigma(q)q' = qq'$ ($q$, $q'$ are †quaternions of norm 1). Then these mappings represent generators of $\pi_3(O(4)) \cong \pi_3(S^3 \times S^3) \cong \mathbf{Z} + \mathbf{Z}$, and $\xi_{m,n}$ is the 3-sphere bundle over $S^4$ corresponding to the element $m\{\rho\} + n\{\sigma\} \in \pi_3(O(4))$ (i.e., to the mapping $f_{m,n}\colon S^3 \to O(4)$ defined by $f_{m,n}(q)(q') = q^{m+n}q'q^{-m}$). (Here we use the fact that the operation of the element $r \in O(4)$ ($r(q) = q^{-1}$) is given by $r\rho r^{-1} = \rho$, $r\sigma r^{-1} = \rho\sigma^{-1}$.)

L. Cross Sections

For a fiber bundle $\xi = (E, p, B, F, G)$, a cross section $f\colon B_0 \to E$ over a subspace $B_0$ ($\subset B$) is a continuous mapping such that $p \circ f$ is the identity mapping of $B_0$; a cross section over $B$ is called a cross section of $\xi$. A bundle $\xi$ is trivial if and only if the associated principal bundle of $\xi$ has a cross section. More generally, given a principal bundle $\eta = (P, q, B, G)$ and a closed subgroup $H$ having a local cross section in $G$, $\eta$ is reducible to $H$ if and only if the associated bundle $\eta \times_G (G/H) = (P/H, q', B, G/H)$ with fiber $G/H$ has a cross section.

Suppose that the base space $B$ of the fiber bundle $\xi = (E, p, B, F, G)$ is a †polyhedron. We denote the †$r$-skeleton of $B$ by $B^r$ and consider the problem of extending cross sections $f_r\colon B^r \to E$ successively for $r = 0, 1, \ldots$. Clearly, there is a cross section $f_0$. For each $r$-simplex $\sigma$ of $B$, we have $(p^{-1}(\sigma), p, \sigma, F) \cong (\sigma \times F, p_1, \sigma, F)$ since $\sigma$ is †contractible. Hence there is a bundle mapping $\varphi_\sigma\colon \sigma \times F \cong p^{-1}(\sigma)$ with $p \circ \varphi_\sigma = p_1$. Assume the existence of a cross section $f_{r-1}\colon B^{r-1} \to E$, and consider the mapping $h'_\sigma = p_2 \circ \varphi_\sigma^{-1} \circ (f_{r-1} \mid \dot\sigma)\colon \dot\sigma \to F$ ($p_2\colon \sigma \times F \to F$ is the projection and $\dot\sigma$ is the boundary of $\sigma$). Then if $h'_\sigma$ is extensible to $h_\sigma\colon \sigma \to F$, an extended cross section $f_\sigma\colon \sigma \to E$ of $f_{r-1} \mid \dot\sigma$ is defined by $f_\sigma(b) = \varphi_\sigma(b, h_\sigma(b))$ ($b \in \sigma$), and the extension $f_r\colon B^r \to E$ of $f_{r-1}$ is defined by $f_r \mid \sigma = f_\sigma$, $f_r \mid B^{r-1} = f_{r-1}$. If $\pi_{r-1}(F) = 0$, for example, there is an extension $h_\sigma$ of $h'_\sigma$ since $(\sigma, \dot\sigma) \approx (V^r, S^{r-1})$, and $f_{r-1}$ is extensible to a cross section $f_r$.

Now assume that the base space $B$ of a $G$-bundle $\xi = (E, p, B, F, G)$ is an †arcwise connected polyhedron and $F$ is †$(n-1)$-connected. Then there is a cross section $f\colon B^n \to E$ constructed by the stepwise method of the previous paragraph. But if $\pi_n(F) \neq 0$, we have an obstruction to extending $f$ over $B^{n+1}$. Now we explain how to measure this obstruction. Suppose that $F$ is †$n$-simple. Then for each $(n+1)$-simplex $\sigma$ of $B$, the mapping $h'_\sigma\colon \dot\sigma \to F$, defined by $f$ as in the previous paragraph, determines a unique element $c(f)(\sigma)$ of the homotopy group $\pi_n(F)$. Hence we have a †cochain $c(f) \in C^{n+1}(B; \pi_n(F))$, and $f$ is extensible to a cross section over $B^{n+1}$ if and only if $c(f) = 0$. Thus there is a cross section over $B^{n+1}$ if and only if the set of $c(f)$ for every cross section $f$ over $B^n$ contains the cochain 0; $\{c(f)\}$ is considered as a measure of the obstruction.

Let $w\colon I \to B$ be a †path. We consider the space $I \times F$ and the canonical projection $p_1\colon I \times F \to I$. Then there is a bundle mapping $\Omega\colon I \times F \to E$ with $p \circ \Omega = w \circ p_1$, since $w^\#\xi = (I \times F, p_1, I, F)$; and a homeomorphism $w^\#\colon F \approx F$ is defined by $w^\# = \varphi_{0,b_0}^{-1} \circ \Omega_0 \circ \Omega_1^{-1} \circ \varphi_{1,b_1}$, where $b_s = w(s)$ ($s = 0, 1$), $\varphi_s\colon U_s \times F \approx p^{-1}(U_s)$ is a coordinate function of a coordinate neighborhood $U_s \ni b_s$, and $\Omega_s$ ($= \Omega \mid s \times F$)$\colon F \approx p^{-1}(b_s)$. The homeomorphism $w^\#$ induces an isomorphism $w^\#\colon \pi_n(F) \cong \pi_n(F)$, and $\pi_n(F)$ forms a †local coefficient on $B$. Then the cochain $c(f)$ is a †cocycle with the local coefficient $\pi_n(F)$, called the obstruction cocycle of $f$. Furthermore, the set $\{c(f)\}$ for every cross section $f\colon B^n \to E$ is a cohomology class $c^{n+1}(\xi) \in H^{n+1}(B; \pi_n(F))$ (local coefficient), and $c^{n+1}(\xi)$ is called the primary obstruction to the construction of a cross section. There is a cross section over the $(n+1)$-skeleton $B^{n+1}$ if and only if $c^{n+1}(\xi) = 0$. The local coefficient $\pi_n(F)$ is trivial if $B$ is †simply connected or, for example, if the structure group $G$ of $\xi$ is connected (in which case $\xi$ is called an orientable fiber bundle); when this is true, $c^{n+1}(\xi)$ is an element of $H^{n+1}(B; \pi_n(F))$, where $\pi_n(F)$ is not a local coefficient. Furthermore, if $c^{n+1}(\xi) = 0$ and $\pi_i(F) = 0$ ($n < i < m$), then the secondary obstruction $O^{m+1}(\xi) \subset H^{m+1}(B; \pi_m(F))$ is defined similarly (→ 305 Obstructions).

M. Stiefel-Whitney Classes

Let $\xi = (E, p, B, F, O(n))$ be an $O(n)$-bundle over an arcwise connected polyhedron $B$ and let $\xi'$ be its associated principal bundle. Consider the Stiefel manifold $V_{n,n-k} = V_{n,n-k}(\mathbf{R}) = O(n)/(I_{n-k} \times O(k))$, which is $(k-1)$-connected, and the associated bundle $\xi^k = \xi' \times_{O(n)} V_{n,n-k}$ with fiber $V_{n,n-k}$. The primary obstruction $W_{k+1}(\xi) = c^{k+1}(\xi^k) \in H^{k+1}(B; \pi_k(V_{n,n-k}))$ ($k = 0, 1, \ldots, n-1$) is called the Stiefel-Whitney class of $\xi$. We have $2W_{k+1}(\xi) = 0$ unless $k = n-1$ and $k$ is odd. Hence we usually consider $W_{k+1}(\xi) \in H^{k+1}(B; \mathbf{Z}_2)$. $\xi$ is orientable, i.e., the group of $\xi$ is reducible to $SO(n)$, if and only if $W_1(\xi) = 0$. The Stiefel-Whitney classes of an $n$-dimensional †differentiable manifold $M$ are defined to be those of the tangent bundle $\mathfrak{T}(M)$. Since the orientability of $M$ coincides with that of $\mathfrak{T}(M)$, $M$ is orientable if and only if $W_1(M) = 0$. The condition $W_{k+1}(M) = 0$ is necessary for the existence of a continuous field of orthonormal tangent $(n-k)$-frames over $M$ (if $k = n-1$ this condition is also suffi-
cient). Also, when $M$ is closed, $W_n(M)$ is equal to $\chi(M)\mu$, where $\mu$ is the fundamental cohomology class of $M$ and $\chi(M)$ is the †Euler characteristic of $M$ (→ 56 Characteristic Classes B).

N. Chern Classes

For a $U(n)$-bundle $\xi = (E, p, B, F, U(n))$, the primary obstruction $C_{k+1}(\xi) = c^{2k+2}(\xi^k) \in H^{2k+2}(B; \mathbf{Z})$ ($k = 0, 1, \ldots, n-1$) of the associated bundle $\xi^k = \xi' \times_{U(n)} V_{n,n-k}(\mathbf{C})$ is called the Chern class of $\xi$. If we consider $\xi$ as an $O(2n)$-bundle by $U(n) \subset O(2n)$, then $W_{2k+1}(\xi) = 0$ and $W_{2k}(\xi) = C_k(\xi)$ (mod 2). The Chern classes of a real $2n$-dimensional almost complex manifold $M$ are defined to be those of the tangent bundle $\mathfrak{T}(M)$ (→ 56 Characteristic Classes C).

O. Bundles of Class $C^r$, Analytic Bundles

A fiber bundle $\xi = (E, p, B, F, G)$ is called a fiber bundle of class $C^r$ ($r = 0, 1, \ldots, \infty, \omega$) if $E$, $B$, $F$ are differentiable manifolds of class $C^r$, $G$ is a Lie group and a transformation group of $F$ of class $C^r$, and $p$ and the coordinate functions are differentiable mappings of class $C^r$. Bundles of class $C^0$ are usual $G$-bundles, and those of class $C^\omega$ are real analytic fiber bundles. Similarly, complex analytic fiber bundles are defined by the notions of complex manifolds, †complex Lie groups, and †holomorphic mappings. For example, the universal bundles $\xi(n-1, O(m))$ and $\xi(2n, U(m))$ are real and complex analytic principal bundles, respectively, and the tangent bundle $\mathfrak{T}(M)$ of a $C^{r+1}$ (or complex) manifold is a $C^r$ (or complex analytic) vector bundle. The operations of the Whitney sum, etc., are defined analogously for these vector bundles.

The equivalence of $C^r$ (complex analytic) bundles is defined by means of bundle mappings that are $C^r$-differentiable (holomorphic). Bundles of class $C^r$ ($r \le \infty$) are classified by $C^r$ mappings into a classifying space in the same manner as for bundles of class $C^0$. Also, the connection of class $C^r$ (→ 80 Connections) in $C^r$ bundles is an important notion.

For complex analytic bundles, a similar classification has been obtained for restricted spaces by K. Kodaira, Serre, S. Nakano [8], and others. The classification of complex analytic bundles over a Stein manifold is reduced to that of bundles of class $C^0$ (Oka's principle [9]), and similar results are valid for $C^\omega$-manifolds. The complex analytic (or holomorphic) connection does not necessarily exist, and M. F. Atiyah [10] found the condition for its existence and its relation with Chern classes.

P. Microbundles

A system $\mathfrak{x}\colon B \xrightarrow{i} E \xrightarrow{j} B$ of topological spaces $E$, $B$ and continuous mappings $i$, $j$ is called an $n$-dimensional microbundle over $B$ if for each $b \in B$, there exist a neighborhood $U$ of $b$, a neighborhood $V$ of $i(U)$, and a homeomorphism $h\colon V \approx U \times \mathbf{R}^n$ with $h \circ i \mid U = i_1$, $j \mid V = p_1 \circ h$ ($i_1\colon U \to U \times 0 \subset U \times \mathbf{R}^n$, and $p_1\colon U \times \mathbf{R}^n \to U$ is the projection). Let $H_0(n)$ be the topological group of all homeomorphisms of $\mathbf{R}^n$ onto itself fixing the origin with compact-open topology. Then the equivalence classes of $n$-dimensional microbundles over $B$ are naturally in one-to-one correspondence with the equivalence classes of $H_0(n)$-bundles with base space $B$ and fiber $\mathbf{R}^n$ [13].

In the category of polyhedra and PL (†piecewise linear) mappings, the notion of a PL microbundle can be defined in the same manner. The structural group of $n$-dimensional PL microbundles is defined, in a generalized sense, as a complete †semi-simplicial complex [11]. The tangent PL microbundle is defined for any †topological (PL) manifold. J. Milnor classified †smoothings of PL manifolds by means of PL microbundles [11] and then showed that the †tangent vector bundle of a smooth manifold and its †Pontryagin classes are not topologically invariant [12].

For a †PL embedding $f\colon M \to N$ between PL manifolds, if there is a neighborhood $E$ of $f(M)$ in $N$ and a PL mapping $p\colon E \to M$ so that a diagram $\nu\colon M \xrightarrow{f} E \xrightarrow{p} M$ is a PL microbundle, then $\nu$ is called a normal PL microbundle of $f$. In this case, $f$ is †locally flat. There is a locally flat PL embedding between PL manifolds which admits no normal PL microbundle [14].

Q. Block Bundles

As the normal bundle theory for locally flat PL embeddings of PL manifolds, the concept of block bundle was introduced independently by C. P. Rourke and B. J. Sanderson [15], M. Kato [16], and C. Morlet [17]. Let $E$ be a †polyhedron, and let $K$ be a cell complex. A set $\{E_\sigma \mid \sigma \in K\}$ of PL balls in $E$ indexed on $K$ is called a $q$-block structure of $E$ if the following three conditions are satisfied: (1) $\bigcup_{\sigma \in K} E_\sigma = E$; (2) for each $\sigma \in K$, there is a †PL homeomorphism $h_\sigma\colon \sigma \times I^q \to E_\sigma$ such that $h_\sigma(\tau \times I^q) = E_\tau$ for each face $\tau$ of $\sigma$, where $I = [-1, 1]$; and (3) if $E_\sigma \cap E_\tau \neq \emptyset$, then $E_\sigma \cap E_\tau = E_\rho$, where $\rho = \sigma \cap \tau$. For $\sigma \in K$, $E_\sigma$ is called the block over $\sigma$, and $h_\sigma$ in (2) is called a trivialization of $E_\sigma$. Then a triple $(E, K, \{E_\sigma \mid \sigma \in K\})$ is referred to as a $q$-block bundle over $K$ and is denoted by $\xi/K$. Another block bundle $\xi'/K = (E', K, \{E'_\sigma \mid \sigma \in$
$K\})$ over the same complex $K$ is said to be isomorphic with $\xi/K$ if there is a PL homeomorphism $g\colon E \to E'$, called an isomorphism, such that $g(E_\sigma) = E'_\sigma$ ($\sigma \in K$). A PL embedding $i\colon |K| \to E$ is a zero-section of $\xi/K$ if for each $\sigma \in K$, there is a trivialization $h_\sigma\colon \sigma \times I^q \to E_\sigma$ such that $h_\sigma(x, 0) = i(x)$ ($x \in \sigma$). In this case we say that $\xi/K$ is a block bundle with a zero-section $i\colon |K| \to E$. There is a unique zero-section of $\xi/K$ up to isomorphism of $\xi/K$ onto itself. For every †locally flat PL embedding $f\colon M \to W$ between PL manifolds $M$ and $W$ of codimension $q$ and for any cell division $K$ of $M$, a †derived neighborhood $N$ of $f(M)$ in $W$ admits a unique $q$-block bundle $\nu/K = (N, K, \{N_\sigma \mid \sigma \in K\})$ with $f\colon M \to N$ as a zero-section up to isomorphism respecting the zero-section [15]. The block bundle $\nu/K$ is called a normal block bundle of $f\colon M \to W$.

References

[1] N. E. Steenrod, The topology of fibre bundles, Princeton Univ. Press, 1951.
[2] E. Stiefel, Richtungsfelder und Fernparallelismus in n-dimensionalen Mannigfaltigkeiten, Comment. Math. Helv., 8 (1936), 3-51.
[3] H. Whitney, Topological properties of differentiable manifolds, Bull. Amer. Math. Soc., 43 (1937), 785-805.
[4] S. S. Chern, Some new view-points in differential geometry in the large, Bull. Amer. Math. Soc., 52 (1946), 1-30.
[5] F. Hirzebruch, Topological methods in algebraic geometry, Springer, third edition, 1966.
[6] J. W. Milnor, Construction of universal bundles II, Ann. Math., (2) 63 (1956), 430-436.
[7] J. C. Moore, Espaces classifiants, Sém. H. Cartan, 12, no. 5 (1959-1960), Ecole Norm. Sup., Secrétariat Mathématique.
[8] S. Nakano, On complex analytic vector bundles, J. Math. Soc. Japan, 7 (1955), 1-12.
[9] H. Grauert, Analytische Faserungen über holomorph-vollständigen Räumen, Math. Ann., 135 (1958), 263-273.
[10] M. F. Atiyah, Complex analytic connections in fiber bundles, Trans. Amer. Math. Soc., 85 (1957), 181-207.
[11] J. W. Milnor, Microbundles and differentiable structures, Lecture notes, Princeton Univ., 1961.
[12] J. W. Milnor, Microbundles I, Topology, 3 (1964), Suppl. 1, 53-80.
[13] J. Kister, Microbundles are fiber bundles, Bull. Amer. Math. Soc., 69 (1963), 854-857.
[14] C. P. Rourke and B. J. Sanderson, An imbedding without a normal bundle, Inventiones Math., 3 (1967), 293-299.
[15] C. P. Rourke and B. J. Sanderson, Block-bundles I-III, Ann. Math., (2) 87 (1968).
[16] M. Kato, Combinatorial prebundles I, II, Osaka J. Math., 4 (1967).
[17] C. Morlet, Les méthodes de la topologie différentielle dans l'étude des variétés semi-linéaires, Ann. Sci. École Norm. Sup., 1 (1968), 313-394.
[18] D. Husemoller, Fiber bundles, McGraw-Hill, 1966.

148 (IX.10)
Fiber Spaces

A. General Remarks

J.-P. Serre [1] generalized the concept of fiber bundles to that of fiber spaces by utilizing the covering homotopy property (→ Section B). He applied the theory of †spectral sequences, due to J. Leray, to the (cubic) †singular (co)homology groups of fiber spaces. These are quite useful for determining (co)homology structures and homotopy groups of topological spaces, and are now of fundamental importance in algebraic topology.

B. Definitions

Let $p\colon E \to B$ be a continuous mapping between topological spaces, and let $X$ be a topological space. Then we say that $p$ has the covering homotopy property with respect to $X$ if for any mapping $f\colon X \to E$ and †homotopy $g_t\colon X \to B$ ($0 \le t \le 1$) with $p \circ f = g_0$, there is a homotopy $f_t\colon X \to E$ ($0 \le t \le 1$) with $f_0 = f$ and $p \circ f_t = g_t$. We call $(E, p, B)$ a fiber space (or fibration) if $p$ has the covering homotopy property with respect to every cube $I^n = \{(x_1, \ldots, x_n) \mid 0 \le x_i \le 1\}$, $n = 0, 1, \ldots$ (then $p$ has the covering homotopy property with respect to every †CW complex). Then $E$ is called the total space, $p$ the projection, $B$ the base space, and $F_b = p^{-1}(b)$ the fiber over $b \in B$.

Let $E$, $B$, $F$ be topological spaces and $p\colon E \to B$ a continuous mapping. We call $(E, p, B, F)$ a locally trivial fiber space if for each $b \in B$, there exist an open neighborhood $U$ of $b$ and a homeomorphism $\varphi\colon U \times F \to p^{-1}(U)$ with $p \circ \varphi(b', y) = b'$ ($b' \in U$, $y \in F$). In this case, $p$ has the covering homotopy property with respect to each †paracompact space; hence $(E, p, B)$ is a fiber space. A †fiber bundle is clearly a locally trivial fiber space.
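
As a small worked illustration of the covering homotopy property, consider the simplest case, a product $E = B \times F$ with $p$ the projection onto $B$. This choice of example and the component notation $f^B$, $f^F$ are assumptions made here for illustration and do not appear in the article. A sketch of the covering homotopy in this case:

```latex
% Assumption: E = B \times F and p = pr_1 (trivial product); f^B, f^F are the
% components of the given mapping f, introduced only for this illustration.
\[
  f = (f^{B}, f^{F}) \colon X \to B \times F, \qquad p \circ f = f^{B} = g_{0},
\]
\[
  f_{t}(x) = \bigl(g_{t}(x),\, f^{F}(x)\bigr) \qquad (0 \le t \le 1),
\]
\[
  f_{0} = (g_{0}, f^{F}) = f, \qquad p \circ f_{t} = g_{t}.
\]
```

The mapping $(x, t) \mapsto f_t(x)$ is continuous because $g_t$ and $f^F$ are, so in this case the property holds with respect to every space $X$. In a locally trivial fiber space the same formula is available only over each neighborhood $U$.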
C. Path Spaces

Another important example of a fiber space is a path space. A path in a topological space $X$ is a continuous mapping $w\colon I \to X$ ($I = [0, 1]$). Given subsets $A_0$ and $A_1$ of $X$, the path space $\Omega(X; A_0, A_1)$ is the space of all paths $w\colon I \to X$ satisfying $w(\varepsilon) \in A_\varepsilon$ ($\varepsilon = 0, 1$) topologized by †compact-open topology. Define $p_\varepsilon\colon \Omega(X; A_0, A_1) \to A_\varepsilon$ by $p_\varepsilon(w) = w(\varepsilon)$ ($\varepsilon = 0, 1$). Then $(\Omega(X; A_0, A_1), p_\varepsilon, A_\varepsilon)$ is a fiber space; in fact, $p_\varepsilon$ has the covering homotopy property with respect to every topological space. In particular, the total space of the fiber space $(\Omega(X; X, *), p_0, X)$ ($* \in X$) is †contractible, and the fiber $p_0^{-1}(*) = \Omega(X; *, *) = \Omega X$ over $*$ is the †loop space of $X$ with base point $*$. For a continuous mapping $f\colon Y \to X$, consider the space $E_f = \{(y, w) \in Y \times \Omega(X; X, X) \mid f(y) = w(0)\}$ and the continuous mapping $p\colon E_f \to X$ defined by $p(y, w) = w(1)$. Then $Y \subset E_f$, and $Y$ is a †deformation retract of $E_f$; furthermore, $(E_f, p, X)$ is a fiber space with $f = p \mid Y$.

D. Homotopy Groups of Fiber Spaces

For †homotopy groups of fiber spaces, the Hurewicz-Steenrod isomorphism theorem holds: Let $(E, p, B)$ be a fiber space and $F = p^{-1}(*)$ the fiber over the base point $* \in B$. Then $p_*\colon \pi_n(E, F) \to \pi_n(B)$ is an isomorphism for $n \ge 2$ and a bijection for $n = 1$. By this theorem, we have the homotopy exact sequence of a fiber space:

$$\cdots \to \pi_{n+1}(B) \to \pi_n(F) \xrightarrow{i_*} \pi_n(E) \xrightarrow{p_*} \pi_n(B) \to \cdots.$$

Furthermore, the more general exact sequence

$$\cdots \to \pi(Z; \Omega B)_0 \to \pi(Z; F)_0 \xrightarrow{i_*} \pi(Z; E)_0 \xrightarrow{p_*} \pi(Z; B)_0$$

is valid for each CW complex $Z$, where $\pi(Z; X)_0$ denotes the †homotopy set of mappings from $Z$ to $X$ relative to the base point.

Example (1). A cross section of a fiber space $(E, p, B)$ is a continuous mapping $f\colon B \to E$ with $p \circ f = 1$. If $(E, p, B)$ has a †cross section or the fiber $F$ is a †retract of $E$, then $\pi_n(E) \cong \pi_n(B) + \pi_n(F)$. If $F$ is contractible in $E$, then $\pi_n(B) \cong \pi_n(E) + \pi_{n-1}(F)$ ($n \ge 2$).

Example (2). $(E, p, B)$ is called an $n$-connective fiber space if $B$ is †arcwise connected, $E$ is †$n$-connected, and $p_*\colon \pi_i(E) \to \pi_i(B)$ is an isomorphism for $i > n$. For each arcwise connected space $B$ and integer $n$, there is such a fiber space.

Example (3). For a CW complex $X$, there are topological spaces $X_n$ and continuous mappings $f_n\colon X \to X_n$, $q_{n+1}\colon X_{n+1} \to X_n$ ($n = 0, 1, \ldots$) with the following four properties: (i) $X_n$ ($0 \le n \le m$) is a point if $X$ is $m$-connected; (ii) $f_{n*}\colon \pi_i(X) \cong \pi_i(X_n)$ ($i \le n$); (iii) $(X_n, q_n, X_{n-1})$ is a fiber space, and its fiber is an †Eilenberg-MacLane space $K(\pi_n(X), n)$; (iv) $q_n \circ f_n$ is †homotopic to $f_{n-1}$. Such a system $\{X_n, f_n, q_n\}$ is called the Postnikov system of $X$ and is in a sense considered a decomposition of $X$ into Eilenberg-MacLane spaces.

E. Spectral Sequences of Fiber Spaces

The cohomological properties of fiber spaces are obtained mainly from the following results (which are valid similarly for homology except for properties of products). Assume that the base space $B$ of a given fiber space $(E, p, B)$ is †simply connected and the fiber $F = p^{-1}(*)$ is arcwise connected, and let $R$ be a commutative ring with unit. Then the spectral sequence (of singular cohomology) of the fiber space $(E, p, B)$ (with coefficients in $R$) is defined to be a sequence $\{E_r, d_r\}$ satisfying the following properties:

(i) $E_r = \sum_{p,q} E_r^{p,q}$ are bigraded $R$-modules, and $d_r = \sum_{p,q} d_r^{p,q}$ are $R$-linear differentials such that $d_r(E_r^{p,q}) \subset E_r^{p+r,q-r+1}$.

(ii) $E_{r+1}^{p,q} = \operatorname{Ker} d_r^{p,q} / \operatorname{Im} d_r^{p-r,q+r-1}$, which means that $E_{r+1} = H(E_r)$.

(iii) $E_r^{p,q} = 0$ for $p < 0$ or $q < 0$, $E_r^{p,q} = E_{r+1}^{p,q} = \cdots = E_\infty^{p,q}$ for $r > \max(p, q+1)$.

(iv) $E_r$ has a product for which $E_r^{p,q} \cdot E_r^{p',q'} \subset E_r^{p+p',q+q'}$ and $d_r(u \cdot v) = (d_r u) \cdot v + (-1)^{p+q} u \cdot d_r v$ ($u \in E_r^{p,q}$). Furthermore, the induced product in $H(E_r)$ coincides with the product in $E_{r+1}$.

(v) $E_\infty$ is the †bigraded module associated with some filtration of the cohomology module $H^*(E; R)$; that is, $H^n(E; R) = D^{0,n} \supset D^{1,n-1} \supset \cdots \supset D^{n,0} \supset D^{n+1,-1} = 0$ and $E_\infty^{p,q} = D^{p,q}/D^{p+1,q-1}$. Furthermore, the †cup product $\cup$ in $H^*(E; R)$ satisfies $D^{p,q} \cup D^{p',q'} \subset D^{p+p',q+q'}$ and coincides with the given product in $E_\infty$.

(vi) $E_2^{p,q} = H^p(B; H^q(F; R))$, and the product in $E_2$ coincides with the cup product in $H^*(B; H^*(F; R))$.

(vii) The composition of $H^n(B; R) = E_2^{n,0} \to E_3^{n,0} \to \cdots \to E_{n+1}^{n,0} = E_\infty^{n,0} = D^{n,0} \subset H^n(E; R)$ is equal to $p^*$, and the composition of $H^n(E; R) = D^{0,n} \to E_\infty^{0,n} = E_{n+2}^{0,n} \subset \cdots \subset E_3^{0,n} \subset E_2^{0,n} = H^n(F; R)$ is equal to $i^*$ ($i\colon F \subset E$), where each $\to$ is the projection onto the quotient module. In the sequence

$$H^{n-1}(F; R) \xrightarrow{\delta^*} H^n(E, F; R) \xleftarrow{p^*} H^n(B; R),$$

we have $\delta^{*-1}(\operatorname{Im} p^*) = E_n^{0,n-1}$, $\operatorname{Coim} p^* = E_n^{n,0}$, and $d_n\colon E_n^{0,n-1} \to E_n^{n,0}$ is equal to the transgression $\tau^* = p^{*-1} \circ \delta^*\colon \delta^{*-1}(\operatorname{Im} p^*) \to \operatorname{Coim} p^*$. Each element of $\delta^{*-1}(\operatorname{Im} p^*)$ is called transgressive.

In the following examples, we assume that $R$ is a principal ideal ring.

Example (4). Let $k$ be a commutative
field, and assume that $\dim_k H_n(B; k)$ and $\dim_k H_n(F; k)$ are finite. Then for the †Poincaré polynomial $P_X(t) = \sum_n b_n t^n$, $b_n = \dim_k H_n(X; k)$, we have $P_E(t) = P_B(t)P_F(t) - (1 + t)\varphi(t)$, where $\varphi(t)$ is a polynomial with nonnegative coefficients (Leray). In particular, for the †Euler characteristic $\chi(X) = P_X(-1)$, we have $\chi(E) = \chi(B)\chi(F)$. Also, if $i_*\colon H_n(F; k) \to H_n(E; k)$ is monomorphic for each $n \ge 0$, then $P_E(t) = P_B(t)P_F(t)$.

Example (5). Isomorphism theorem: If $H_n(B; R) = 0$ ($0 < n < r$) and $H_n(F; R) = 0$ ($0 < n < s$), then $p_*\colon H_n(E, F; R) \to H_n(B; R)$ is isomorphic for $0 < n < r + s$ and epimorphic for $n = r + s$, and we have the following homology exact sequence:

$$\cdots \to H_n(F; R) \xrightarrow{i_*} H_n(E; R) \xrightarrow{p_*} H_n(B; R) \to H_{n-1}(F; R) \to \cdots, \qquad n < r + s$$

(similarly for the cohomology).

Example (6). Assume that $H^n(F; R) \cong H^n(S^r; R)$ ($S^r$ is the $r$-sphere, $r \ge 1$). Let $\hat{E} = M_p$ be the †mapping cylinder of $p\colon E \to B$ and $\hat{p}\colon \hat{E} \to B$ be the continuous mapping defined by $p$. Then the Thom-Gysin isomorphism $g\colon H^{n-r-1}(B; R) \cong H^n(\hat{E}, E; R)$ ($n \ge 0$) with $g(\alpha) = \hat{p}^*(\alpha) \cup g(1)$ holds (→ 114 Differential Topology G). Also, we have the Gysin exact sequence:

$$\cdots \to H^n(B; R) \xrightarrow{p^*} H^n(E; R) \to H^{n-r}(B; R) \xrightarrow{h} H^{n+1}(B; R) \to \cdots,$$

where $h$ satisfies $h(\alpha) = \alpha \cup \Omega = \Omega \cup \alpha$ ($\Omega = g(1) \in H^{r+1}(B; R)$). Here $\Omega$ is equal to the image of a generator of $H^r(F; R) = R$ by the transgression $\tau^*$, and $2\Omega = 0$ if $r$ is even. (These results hold also for $r = 0$ and $R = \mathbf{Z}_2$.)

Example (7). Assume that $H^n(B; R) \cong H^n(S^r; R)$ ($r \ge 2$). Then we have $H^{n-r}(F; R) \cong H^n(E, F; R)$ ($n > 0$) and the Wang exact sequence:

$$\cdots \to H^n(E; R) \xrightarrow{i^*} H^n(F; R) \xrightarrow{\theta} H^{n-r+1}(F; R) \to H^{n+1}(E; R) \to \cdots,$$

where $\theta$ satisfies $\theta(\alpha \cup \beta) = \theta(\alpha) \cup \beta + (-1)^{n(r-1)} \alpha \cup \theta(\beta)$ ($\alpha \in H^n(F; R)$, $\beta \in H^*(F; R)$).

Example (8). For a field $k$ of odd characteristic, if $H^n(E; k) = 0$ ($n > 0$) and the algebra $H^*(F; k)$ is generated by a finite number of elements of odd degree, then $H^*(F; k) \cong \Lambda_k(x_1, \ldots, x_l)$ (the †exterior algebra) and $H^*(B; k) \cong k[y_1, \ldots, y_l]$, where $y_i = \tau^*(x_i)$ (A. Borel).

References

[1] J.-P. Serre, Homologie singulière des espaces fibrés, Ann. Math., (2) 54 (1951), 425-505.
[2] S. T. Hu, Homotopy theory, Academic Press, 1959.
[3] A. Borel, Sur la cohomologie des espaces fibrés principaux et des espaces homogènes de groupes de Lie compacts, Ann. Math., (2) 57 (1953), 115-207.
[4] E. H. Spanier, Algebraic topology, McGraw-Hill, 1966.
[5] G. W. Whitehead, Elements of homotopy theory, Springer, 1978.

149 (III.6)
Fields

A. Definition

A set $K$ having at least two elements is called a field if two operations, called †addition ($+$) and †multiplication ($\cdot$), are defined in $K$ and satisfy the following three axioms.

(1) For any two elements $a$, $b$ of $K$, the sum $a + b$ is defined; the associative law $(a + b) + c = a + (b + c)$ and the commutative law $a + b = b + a$ hold; and there exists for arbitrary $a$, $b$ a unique element $x$ such that $a + x = b$; that is, $K$ is an †Abelian group with respect to the addition (the †identity element of this group is denoted by 0 and is called the zero element of $K$).

(2) For any two elements $a$, $b$ of $K$, the product $ab$ ($= a \cdot b$) is defined; the associative law $(ab)c = a(bc)$ and the commutative law $ab = ba$ hold; and there exists for arbitrary $a$, $b$ with $a \neq 0$ a unique element $x$ such that $ax = b$, that is, the set $K^*$ of all nonzero elements of $K$ is an Abelian group with respect to multiplication. $K^*$ is called the multiplicative group of $K$, while the identity element of $K^*$ is denoted by 1 and is called the unity element (unit element or identity element) of $K$.

(3) The distributive law $a(b + c) = ab + ac$ holds. In other words, a field is a †commutative ring whose nonzero elements form a group with respect to multiplication.

A noncommutative ring whose nonzero elements form a group is called a noncommutative field (skew field or s-field). It should be noted that sometimes a field is defined as a ring whose nonzero elements form a group without assuming the commutativity of that group, and in this case our "field" defined before is called a commutative field. (The term "skew field" is sometimes used to mean either a commutative or a noncommutative field.) In this article we limit ourselves to commutative fields (for noncommutative fields → 29 Associative Algebras).
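
The three axioms can be checked mechanically on a small finite example. The following is a minimal sketch, assuming the integers modulo $n$ as the candidate set; the function names `has_unique_solution` and `is_field_mod` are hypothetical helpers introduced here, not part of the article. It tests axioms (1) to (3) by brute force.

```python
from itertools import product


def has_unique_solution(op, a, b, elems):
    """Return True if exactly one x in elems satisfies op(a, x) == b."""
    return sum(1 for x in elems if op(a, x) == b) == 1


def is_field_mod(n):
    """Brute-force check of axioms (1)-(3) for the integers modulo n."""
    if n < 2:  # a field must have at least two elements
        return False
    elems = list(range(n))

    def add(a, b):
        return (a + b) % n

    def mul(a, b):
        return (a * b) % n

    for a, b, c in product(elems, repeat=3):
        # associative laws of axioms (1) and (2), distributive law of axiom (3)
        if add(add(a, b), c) != add(a, add(b, c)):
            return False
        if mul(mul(a, b), c) != mul(a, mul(b, c)):
            return False
        if mul(a, add(b, c)) != add(mul(a, b), mul(a, c)):
            return False
    for a, b in product(elems, repeat=2):
        # commutative laws
        if add(a, b) != add(b, a) or mul(a, b) != mul(b, a):
            return False
        # axiom (1): a + x = b has a unique solution x
        if not has_unique_solution(add, a, b, elems):
            return False
        # axiom (2): a x = b has a unique solution x whenever a != 0
        if a != 0 and not has_unique_solution(mul, a, b, elems):
            return False
    return True


if __name__ == "__main__":
    for n in (2, 4, 5, 6, 7):
        print(n, is_field_mod(n))
```

Under these assumptions the check succeeds for the prime moduli 2, 5, and 7 and fails for 4 and 6, where an equation such as $2x = 1$ has no solution. The exhaustive loop is only practical for small $n$ and is meant to make the axioms concrete, not to serve as an efficient test.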
B. General Properties

Since a field $K$ is a commutative ring, we have properties such as $a0 = 0a = 0$, $(-a)b = a(-b) = -ab$ for elements $a$, $b$ in $K$. If a †subring $k$ of $K$ is a field, we say that $k$ is a subfield of $K$ or $K$ is an overfield (extension field or simply extension) of $k$. If a field $K$ has no subfield other than $K$, $K$ is called a prime field.

A map $f$ of a field $K$ into another field $K'$ is called a (field) homomorphism if it is a ring homomorphism, i.e., if it satisfies $f(a + b) = f(a) + f(b)$, $f(ab) = f(a)f(b)$. Since a field is †simple as a ring, every (field) homomorphism is an injection unless it maps everything to zero. A homomorphism of $K$ into $K'$ is called an isomorphism if it is a bijection, and $K$ and $K'$ are called isomorphic if there exists an isomorphism of $K$ onto $K'$. An isomorphism of $K$ onto itself is called an automorphism of $K$.

If there is a natural number $n$ such that the sum $n1 = 1 + \cdots + 1$ ($n$ terms) of the unity element 1 is 0, then the minimum of such $n$ is a prime number $p$, called the characteristic of $K$. On the other hand, if there is no natural number $n$ such that $n1 = 0$, we say that the characteristic of $K$ is 0.

C. Examples of Fields

The rational number field $\mathbf{Q}$ consisting of all rational numbers, the real number field $\mathbf{R}$ consisting of all real numbers, and the complex number field $\mathbf{C}$ consisting of all complex numbers, are all fields of characteristic 0. A subfield of the complex number field $\mathbf{C}$ is called a number field. The rational number field is a prime field, and every prime field of characteristic 0 is isomorphic to the rational number field. For the ring $\mathbf{Z}$ of all rational integers, the residue class ring modulo a prime number $p$ is a field $\mathbf{Z}/p\mathbf{Z} = \{0, 1, 2, \ldots, p - 1 \pmod{p}\}$ of characteristic $p$, called the residue class field for $p$. Thus $\mathbf{Z}/p\mathbf{Z}$ is a prime field, and every prime field of characteristic $p$ is isomorphic to $\mathbf{Z}/p\mathbf{Z}$. If the number of the elements of a field $K$ is finite, $K$ is called a finite field. $\mathbf{Z}/p\mathbf{Z}$ is an example of a finite field.

D. Extensions of a Field

In order to express that $K$ is an extension field of $k$, we often use the notation $K/k$. Subfields of $K$ containing $k$ are called intermediate fields of $K/k$. Consider two extensions $K_1/k_1$ and $K_2/k_2$, and let $\varphi\colon K_1 \to K_2$ be an isomorphism which induces an isomorphism $\psi\colon k_1 \to k_2$. Then we call $\varphi$ an extension of $\psi$. Suppose that $k_1$, $K_2$ are given fields and $K_2$ contains a subfield $k_2$ isomorphic to $k_1$. Then there exist an extension field $K_1$ of $k_1$ and an isomorphism $\varphi\colon K_1 \to K_2$ which is an extension of the given isomorphism $\psi\colon k_1 \to k_2$; construction of the field $K_1$ is often called the embedding of $k_1$ into $K_2$. When $K_1$ and $K_2$ are extensions of $k$, an isomorphism $\varphi\colon K_1 \to K_2$ is called a $k$-isomorphism if it leaves every element of $k$ invariant.

In an extension $K/k$, let $S$ be a subset of $K$. The smallest intermediate field of $K/k$ containing $S$ is called the field obtained by adjoining $S$ to $k$ or the field generated by $S$ over $k$, denoted by $k(S)$. The field $k(S)$ consists of those elements in $K$ each of which is a rational expression in a finite number of elements of $S$ with coefficients in $k$. An extension field $k(t)$ obtained by adjoining a single element $t$ to $k$ is called a simple extension of $k$, and in this case $t$ is called a primitive element of the extension. The †rational function field $k(X)$ with coefficient field $k$ is a simple extension of $k$ with a primitive element $X$.

When subfields $k_\lambda$ ($\lambda \in \Lambda$) of a field $K$ are given, the smallest subfield of $K$ containing all these subfields exists and is called the composite field of the $k_\lambda$.

E. Algebraic and Transcendental Extensions

An element $\alpha$ of an extension field $K$ of a field $k$ is called an algebraic element over $k$ if $\alpha$ is a †zero point of a nonzero polynomial, say, $f(X) = a_0 + a_1 X + \cdots + a_n X^n$ with coefficients in $k$. If $\alpha$ is not algebraic over $k$, then $\alpha$ is called a transcendental element over $k$. An algebraic element $\alpha$ is always a root of an irreducible polynomial over $k$ which is uniquely determined up to a constant factor ($\in k^*$) and is called the minimal polynomial of $\alpha$ over $k$. $K$ is called an algebraic extension of $k$ if all elements of $K$ are algebraic over $k$; otherwise we call $K$ a transcendental extension of $k$. If $K_1$ is an algebraic extension of $K$ and $K$ is an algebraic extension of $k$, then $K_1$ is also an algebraic extension of $k$. In an arbitrary extension field $K$ of $k$, the set of all algebraic elements over $k$ forms an algebraic extension field of $k$. A simple extension $k(t)$ with a transcendental element $t$ is isomorphic to the rational function field of one variable with coefficient field $k$. If $t$ is an algebraic element over $k$, $k(t)$ is isomorphic to the †residue class field of the polynomial ring $k[X]$ modulo the minimal polynomial $f(X)$ of $t$ over $k$.

F. Finite Extensions

An extension field $K$ of a field $k$ is called a finite extension if $K$ has no infinite set of elements that are †linearly independent over $k$,
i.e., if K is a finite-dimensional linear space (X-CC,)~‘(X-~J~ . . . (X-a,,$“, r > 1, where


over k. The dimension of the linear space over al, az, . , c(, are distinct roots of f(X) in its
k is called the degree of K over k and is de- splitting lïeld; between the degree n of f(X)
noted by (K : k) (or [K : k]). If K is a finite and the number m of distinct roots of ,f(X),
extension of k and L is a Imite extension of K, the relation n = mp’ holds. In particular, if
then L is also a fmite extension of k and (L : K) ap*E k for some r, we cal1 a a purely inseparable
(K : k) = (L: k). Every fïnite extension fïeld of k element over k. An algebraic extension of
is an algebraic extension of k and is obtained k is called purely inseparable if a11 elements
by adjoining a finite number of algebraic elements to k. Conversely, every field obtained by adjoining a finite number of algebraic elements to k is a finite extension of k. If $K = k(\alpha)$ with an algebraic element $\alpha$, then $(K : k)$ is equal to the degree of the minimal polynomial of $\alpha$ over k, also called the degree of $\alpha$ over k. Every element of $k(\alpha)$ is expressed as a polynomial in $\alpha$ with coefficients in k. On the other hand, for any nonconstant polynomial $f(X)$ of $k[X]$ there exists a simple extension $k(\alpha)$ such that $\alpha$ is a root of $f(X)$.

G. Normal Extensions

An algebraic extension field K of a field k is called a normal extension of k if every irreducible polynomial of $k[X]$ which has a root in K can always be decomposed into a product of linear factors in $K[X]$. An extension field K of k is called a splitting field of a (nonconstant) polynomial $f(X) \in k[X]$ if $f(X)$ can be decomposed as a product of linear polynomials, i.e., $f(X) = c(X - \alpha_1)(X - \alpha_2)\cdots(X - \alpha_n)$, $c \in k$, $\alpha_i \in K$. A splitting field K of $f(X)$ ($f(X) \in k[X]$) is called a minimal splitting field of $f(X)$ if any proper subfield L of K ($K \supsetneq L \supseteq k$) is not a splitting field of $f(X)$. A minimal splitting field of $f(X)$ is obtained by adjoining all the zero points of $f(X)$. A finite extension field of k is a normal extension if and only if it is a minimal splitting field of a polynomial of $k[X]$. For any given (nonconstant) polynomial $f(X) \in k[X]$, there exists a minimal splitting field of $f(X)$, and all minimal splitting fields of $f(X)$ are k-isomorphic.

H. Separable and Inseparable Extensions

An algebraic element $\alpha$ over k is called a separable element or an inseparable element over k according as the minimal polynomial of $\alpha$ over k is †separable or †inseparable. An algebraic extension K of k is called a separable extension of k if all the elements of K are separable over k; otherwise, K is called an inseparable extension. An element $\alpha$ is separable with respect to k if and only if the minimal polynomial of $\alpha$ over k has no double root in its splitting field. If $\alpha$ is inseparable, then k has nonzero characteristic p, and the minimal polynomial $f(X)$ of $\alpha$ can be decomposed as $f(X) =$ ...

... of the field are purely inseparable over k. In an algebraic extension K of k the set of all separable elements forms an intermediate field $K_s$ of K/k. The field $K_s$ is called the maximal separable extension of k in K. If K is inseparable over k, i.e., if $K \neq K_s$, then the characteristic of k is $p \neq 0$, and K is purely inseparable over $K_s$. The degrees $d = [K_s : k]$ and $f = [K : K_s]$ are denoted by $[K : k]_s$ and $[K : k]_i$, respectively. A separable extension of a separable extension of k is also separable over k, and every finite separable extension of k is a simple extension.

If no inseparable irreducible polynomial in $k[X]$ exists, we call k a perfect field; otherwise, an imperfect field. Every field of characteristic 0 is a perfect field. A field of characteristic p ($\neq 0$) is perfect if and only if for each $a \in k$ the polynomial $X^p - a$ has a root in k. Every algebraic extension of a perfect field is a separable extension and a perfect field. Any imperfect field has an inseparable, in fact purely inseparable, proper extension.

I. Algebraically Closed Fields

If every nonconstant polynomial of $k[X]$ can be decomposed into a product of linear polynomials of $k[X]$, or equivalently, if every irreducible polynomial of $k[X]$ is linear, k is called an algebraically closed field; k is algebraically closed if and only if k has no algebraic extension field other than k, and hence every algebraically closed field is perfect. For any given field k there exists an algebraically closed algebraic extension field of k unique up to k-isomorphisms (E. Steinitz); hence we call such a field the algebraic closure of k. To proceed further, suppose that we are given a field k and its extension K. If there is no algebraic element of K over k outside of k, i.e., if k is the intersection of K and the algebraic closure of k, then we say that k is algebraically closed in K. The complex number field is an algebraically closed field (C. F. Gauss's fundamental theorem of algebra; → 10 Algebraic Equations).

J. Conjugates

Let k be a field and K an algebraic extension of k. Two elements $\alpha$, $\beta$ of K are called conjugate over k if they are roots of the same irreducible polynomial of $k[X]$ (or equivalently, if the minimal polynomials of $\alpha$ and $\beta$ with respect to k coincide); in this case we call the subfields $k(\alpha)$, $k(\beta)$ conjugate fields over k. The conjugate fields $k(\alpha)$ and $k(\beta)$ are k-isomorphic under an isomorphism $\sigma$ such that $\sigma(\alpha) = \beta$. In particular, if K is a normal extension of k, the number of conjugate elements of an element $\alpha$ of K is the number of distinct roots of the minimal polynomial $f(X)$ of $\alpha$, which is independent of the choice of a normal extension K containing k. The element $\alpha$ is separable if and only if the number of conjugate elements in K is the same as the degree of $f(X)$. On the other hand, $k(\alpha)$ is normal over k if and only if $k(\alpha)$ coincides with all its conjugate fields.

Let $\alpha$ be a separable algebraic element over k, and let $\alpha_1 = \alpha, \alpha_2, \ldots, \alpha_n$ be conjugate elements of $\alpha$ over k. The product $A = \alpha_1\alpha_2\cdots\alpha_n$ and sum $B = \alpha_1 + \alpha_2 + \cdots + \alpha_n$ are elements of k. Indeed, if $f(X) = X^n + c_1X^{n-1} + \cdots + c_n$ is the minimal polynomial of $\alpha$ with respect to k, we have $A = (-1)^n c_n$, $B = -c_1$, and A and B are called the norm and the trace of $\alpha$, respectively, denoted by $A = N(\alpha)$, $B = \mathrm{Tr}(\alpha)$. Let K be a finite separable extension of degree n over k, and let $\alpha$ be an element of K. Then the degree m of the minimal polynomial of $\alpha$ is a divisor of n; that is, $n = mr$ with a positive integer r. We define the norm and the trace of $\alpha$ with respect to K/k by $N_{K/k}(\alpha) = N(\alpha)^r$, $\mathrm{Tr}_{K/k}(\alpha) = r\,\mathrm{Tr}(\alpha)$, respectively. Then these quantities satisfy $N_{K/k}(\alpha\beta) = N_{K/k}(\alpha)N_{K/k}(\beta)$, $\mathrm{Tr}_{K/k}(\alpha + \beta) = \mathrm{Tr}_{K/k}(\alpha) + \mathrm{Tr}_{K/k}(\beta)$ for $\alpha, \beta \in K$. (For the Galois theory of algebraic extensions → 172 Galois Theory.)

K. Transcendental Extensions

Let K be an extension of k and $u_1, \ldots, u_n$ be elements of K. An element v of K is said to be algebraically dependent on the elements $u_1, u_2, \ldots, u_n$ if v is algebraic over the field $k(u_1, u_2, \ldots, u_n)$. A subset S of K is called algebraically independent over k if no $u \in S$ is algebraically dependent on a finite number of elements of S different from u itself; S is called a transcendence basis of K over k if S is algebraically independent and K is algebraic over $k(S)$. Furthermore, if K is separable over $k(S)$, then S is called a separating transcendence basis of K over k. There always exists a transcendence basis S of K over k, and the †cardinal number of S depends only on K and k; this cardinal number is called the transcendence degree (or degree of transcendency) of K over k. (When S is an infinite set, we sometimes say that the transcendence degree is infinite.) In particular, if $K = k(S)$ with an algebraically independent S, K is called a purely transcendental extension of k.

An extension K of k is called a separably generated extension, or simply a separable extension, if every finitely generated intermediate field of K/k has a separating transcendence basis over k. If K itself has a separating transcendence basis over k, then K is separably generated, but not conversely.

A purely transcendental extension field of k having a finite transcendence degree n is also called a rational function field in n variables over k, and a finite extension of such a rational function field is called an algebraic function field in n variables over k.

Let K and L be extension fields of k, both contained in a common extension field. We say that K and L are linearly disjoint over k if every subset of K linearly independent over k is also linearly independent over L, or equivalently, if every subset of L linearly independent over k is also linearly independent over K. An algebraic function field $K = k(x_1, x_2, \ldots, x_n)$ over k (whose transcendence degree is $\leq n$) is called a regular extension of k if K and the algebraic closure $\bar{k}$ of k are linearly disjoint. In order that K be regular over k it is necessary and sufficient that k be algebraically closed in K and that K be separably generated over k.

L. Derivations

A map D of a field K into itself is called a derivation of K if it satisfies $D(a + b) = D(a) + D(b)$ and $D(ab) = aD(b) + bD(a)$ for all $a, b \in K$. The set of elements c of K for which $D(c) = 0$ is a subfield. If the characteristic of K is p ($\neq 0$), then $D(x^p) = 0$ for all $x \in K$. Let k be a subfield of K. A derivation D of K is called a derivation over k if $D(c) = 0$ for all $c \in k$; the totality of derivations over k is a †K-module. If $K = k(x_1, x_2, \ldots, x_n)$ is an algebraic function field over k, then the K-module of derivations over k has finite dimension s ($\leq n$) and we can choose s suitable elements $u_1, u_2, \ldots, u_s$ of K such that K is separably algebraic over $k(u_1, u_2, \ldots, u_s)$. Generally, the transcendence degree r of K over k does not exceed s, and K is separably generated over k if and only if $r = s$.

M. Finite Fields

Finite fields were first considered by E. Galois (1830), so they are also called Galois fields. There is no finite noncommutative field (Wedderburn's theorem, J. H. M. Wedderburn, Trans. Amer. Math. Soc., 6 (1905)). A simple
proof for this was given by E. Witt (Abh. Math. Sem. Univ. Hamburg, 8 (1931)). The characteristic of a finite field is a prime p, and the number of elements of the field is a power of p. Conversely, for any given prime number p and natural number $\alpha$, there exists a finite field with $p^\alpha$ elements. Such a field is unique up to isomorphism and is denoted by $GF(p^\alpha)$ or $F_q$ ($q = p^\alpha$). For any positive integer m, $GF(p^{\alpha m})$ is an extension field of $GF(p^\alpha)$ of degree m and a †cyclic extension. Every element a of $GF(p^\alpha)$ satisfies $a^{p^\alpha} = a$; hence a has its pth root in $GF(p^\alpha)$. Therefore every finite field is perfect. The multiplicative group of $GF(p^\alpha)$ is a cyclic group of order $p^\alpha - 1$.

N. Ordered Fields and Real Fields

A field K is called an ordered field if there is given a †total order in K such that $a > b$ implies $a + c > b + c$ for all c, and $a > b$, $c > 0$ implies $ac > bc$. The characteristic of an ordered field is always 0. An element a of K is called a positive element or a negative element according as $a > 0$ or $a < 0$. For an element a of K the absolute value of a, denoted by $|a|$, is a or $-a$ according as $a \geq 0$ or $a < 0$. If we define neighborhoods of a by the sets $\{x \mid a - \varepsilon < x < a + \varepsilon\}$ with positive elements $\varepsilon$, K becomes a †Hausdorff space. If, for any two positive elements a, b of K, there exists a natural number n such that $na > b$, then we call K an Archimedean ordered field. Two ordered fields are called similarly isomorphic if there exists an isomorphism between them under which positive elements are always mapped to positive elements. The rational number field and the real number field are examples of Archimedean ordered fields, while every Archimedean ordered field is similarly isomorphic to a subfield of the real number field. (For the structure of non-Archimedean ordered fields, see [8].) A field k is called a formally real field (or simply real field) if $-1$ (1 is the unity element of k) cannot be expressed as a finite sum of squares of elements of k. The real number field is a model of formally real fields. More generally, every ordered field is a formally real field. A formally real field is called a real closed field if no proper algebraic extension of it is a formally real field. The real number field is a real closed field. The algebraic closure of a real closed field is obtained by adjoining a root of the polynomial $X^2 + 1$. If a is a nonzero element of a real closed field, then either a or $-a$ is a square of an element of the field. Every real closed field can be made an ordered field in a unique way, namely, by defining squares of nonzero elements to be positive elements. Since it is known that every formally real field is a subfield of a real closed field, it follows that every formally real field can be made an ordered field and therefore is of characteristic 0. The problem of constructing an ordered field out of a formally real field is closely related to the existence of valuations of a certain type [8]. The notion of formally real fields was introduced by E. Artin (Abh. Math. Sem. Univ. Hamburg, 5 (1927)). By making use of the theory of formally real fields, Artin succeeded in solving affirmatively †Hilbert's 17th problem, which asked whether every positive definite rational expression (i.e., a rational expression with real coefficients that takes positive values for all real variables) can be expressed as a sum of squares of rational expressions. More precisely, it was shown by A. Pfister that every positive definite function in $\mathbf{R}(X_1, \ldots, X_n)$ is a sum of at most $2^n$ squares.

References

[1] E. Steinitz, Algebraische Theorie der Körper, J. Reine Angew. Math., 137 (1910) 167-309.
[2] E. Steinitz, Algebraische Theorie der Körper, Walter de Gruyter, 1930 (Chelsea, 1950).
[3] H. Hasse, Höhere Algebra I, II, Sammlung Göschen, Walter de Gruyter, 1926-1927.
[4] B. L. van der Waerden, Algebra I, Springer, seventh edition, 1966.
[5] A. A. Albert, Fundamental concepts of higher algebra, Univ. of Chicago Press, 1956.
[6] N. Bourbaki, Eléments de mathématique, Algèbre, ch. 5, Actualités Sci. Ind., 1102b, Hermann, second edition, 1959.
[7] N. Jacobson, Lectures in abstract algebra III, Van Nostrand, 1964.
[8] M. Nagata, Field theory, Dekker, 1977.

150 (XX.28)
Field Theory

A. History

When a quantity $\psi(x)$, such as velocity, is defined at every point x in a certain region of a space, we say that a field of the quantity $\psi$ is given. This general concept is used in many branches of science. Here we confine ourselves to some branches of physics, in particular to the quantum theory of fields, which describes †elementary particles.

The †theories of elasticity and †hydrodynamics (in particular, concerning †Euler's
equation of motion) deal with displacement and velocity fields, respectively. However, a field in a vacuum (ether), which is quite different from a field in a space filled with matter, first became a subject of physics in †electromagnetism. M. Faraday (1837) introduced the electromagnetic field and discovered its fundamental laws, and J. C. Maxwell (1873) completed the mathematical formulation. On the basis of this formalism, A. Einstein (1905) established the theory of †relativity and later developed the general theories of relativity and of gravity.

Although quantum theory originated from the problem of blackbody radiation, the quantum theory of the electromagnetic field was first developed by P. A. M. Dirac (1927) after the development of †quantum mechanics. Along similar lines P. Jordan and E. P. Wigner (1927) quantized the matter wave (electron field), and W. Heisenberg and W. Pauli (1929) developed the quantum theory of wave fields in general. Subsequently, Jordan and Pauli (1927), Pauli (1939), S. Tomonaga (1943), J. Schwinger (1948), and others reformulated the theory in a relativistically covariant manner. Quantum electrodynamics, dealing with an electromagnetic field interacting with electrons (and positrons), has given excellent agreement with experimental measurements when formulated in this way. Divergence difficulties inherent to quantized field theory were bypassed by the renormalization procedure of Tomonaga, Schwinger, R. P. Feynman, and F. J. Dyson (1947).

On the other hand, H. Yukawa (1934) applied the concept of the quantized field to the interpretation of nuclear force and predicted the existence of $\pi$-mesons. Many kinds of new particles have since been found, including $\pi$-mesons and muons. The field theory of $\pi$-mesons can explain the qualitative features of the meson-nucleon system. Similar theories can be formulated for other types of unstable particles that have been found in cosmic rays since 1949.

During the progress of meson theory, various types of fields were investigated, and a general theory of elementary particles was developed. Dirac (1936) proposed a general wave equation for elementary particles, and Pauli and M. Fierz (1939) proved the connection of spin and statistics (→ the end of Section D). Schwinger (1951) derived quantum-mechanical equations of motion and commutation relations from a unified variational principle. For the cases where no interaction is present, a general theory of elementary particles consistent with the requirements of relativity and of quantum theory was established as the theory of free fields.

B. Relativistically Covariant Classical Fields

Relativistically covariant fields $\varphi_r^\alpha(x)$ on the level of classical theory are functions of the space-time point x with either real or complex values depending on the index $\alpha$ (which distinguishes different fields) [1]. A finite-dimensional representation $D^\alpha$ of $SL(2, \mathbf{C})$ (→ 258 Lorentz Group) on either a real or complex vector space is assigned to each $\alpha$, and the index r of $\varphi_r^\alpha(x)$ refers to components in this representation space. Each model is specified in terms of a Lagrangian density $\mathcal{L}(x)$ that is a function (typically a polynomial) of $\varphi_r^\alpha(x)$, $\partial_\mu\varphi_r^\alpha(x)$ ($\partial_\mu$ denoting $\partial/\partial x^\mu$, $\mu = 0, \ldots, 3$) and their complex conjugates. $\mathcal{L}$ is taken to be invariant under the replacement of $\varphi_r^\alpha(x)$ and $\partial_\mu\varphi_r^\alpha(x)$ by $\sum_s D^\alpha(A)_{rs}\varphi_s^\alpha(x)$ and $\sum_{s,\nu} D^\alpha(A)_{rs}\Lambda(A)_\mu{}^\nu\partial_\nu\varphi_s^\alpha(x)$ for all $A \in SL(2, \mathbf{C})$ (called the Lorentz invariance of $\mathcal{L}$) and possibly under the replacement of $\varphi_r^\alpha(x)$ and $\partial_\mu\varphi_r^\alpha(x)$ by $\sum_\beta U^{\alpha\beta}\varphi_r^\beta(x)$ and $\sum_\beta U^{\alpha\beta}\partial_\mu\varphi_r^\beta(x)$ (called an internal symmetry).

The fields are supposed to satisfy the following partial differential equations, called the field equations, which are obtained as the †Euler equations of the variational problem for the action integral $I = \int\mathcal{L}(x)\,dx$:

$$\frac{\partial\mathcal{L}(x)}{\partial\varphi_r^\alpha(x)} - \sum_{\mu=0}^{3}\frac{\partial}{\partial x^\mu}\,\frac{\partial\mathcal{L}(x)}{\partial(\partial_\mu\varphi_r^\alpha(x))} = 0,$$

$$\frac{\partial\mathcal{L}(x)}{\partial\bar\varphi_r^\alpha(x)} - \sum_{\mu=0}^{3}\frac{\partial}{\partial x^\mu}\,\frac{\partial\mathcal{L}(x)}{\partial(\partial_\mu\bar\varphi_r^\alpha(x))} = 0$$

(the bar indicates the complex conjugate), where the second equation is identical to the first for a real field. For complex fields, $\mathcal{L}(x)$ is chosen such that $I = \bar{I}$ (possibly except for a surface term) in order to ensure that the above two equations are complex conjugates of each other.

The invariance of I under a Lie group of (local and/or volume-preserving point) transformations of fields implies a conservation law for a certain quantity (Noether's theorem; → e.g., E. L. Hill, Rev. Mod. Phys., 23 (1951)). For translational invariance ($x^\mu \to x^\mu + a^\mu$), the energy-momentum (stress) tensor

$$T^\mu{}_\nu(x) = -\delta^\mu_\nu\,\mathcal{L}(x) + \sum_{\alpha, r}\left\{\frac{\partial\mathcal{L}(x)}{\partial(\partial_\mu\varphi_r^\alpha(x))}\,\partial_\nu\varphi_r^\alpha(x) + \frac{\partial\mathcal{L}(x)}{\partial(\partial_\mu\bar\varphi_r^\alpha(x))}\,\partial_\nu\bar\varphi_r^\alpha(x)\right\}$$

(the complex conjugate terms in all equations to be suppressed for real fields) satisfies the differential conservation law $\sum_{\mu=0}^{3}\partial_\mu T^\mu{}_\nu(x) = 0$ as a consequence of the field equations, which in turn implies the time-independence of the
following quantity provided that Tk,(x) Dirac field:


vanishes sufficiently fast at spatial infïnity:

H= T0,(x)d3x, Pk= T0,(X)d3X (xO=t). (6,(x)+= C,&(x)y$, yg=(Y$ are +Dirac’s Y-


s s matrices),
These are the total field Hamiltonian (energy) massive real vector tïeld:
and momentum.
In a general situation, P”( =&gYPTPpr
where gPV is the Minkowski metric tensor with
the signature 1, -1, -1, -1) is not necessarily x (~~~,(x)-û”a,(x))+,~C~“~~,(x)A”(x)~
PV
symmetric in p and v. F. J. Belinfante (Physica,
6 (1939); 7 (1940)) has given a formula (by a where D” is trivial for scalar tïelds, [l, 0] @
change of T(x) via so-called 4-divergence) for [0, 11 for the Dirac tïeld, and [ 1, l] for vector
obtaining a symmetrized energy-momentum fields.
tensor P, for which H and Pk are equal to For interacting tïelds, some interaction parts
those defmed from the above TP, and for are added to the sum of such free Lagrangian
which the differential conservation law is densities of relevant helds. Some examples are:
satisfïed. P(q)(or gq4) interaction:
Lorentz invariance implies the conservation
WP~N (or sdx)“b
law x,3,, AP‘““(x) = 0 for the angular momen-
Yukawa-type interaction:
tum density
SC,,, w4+Ykk(x)A,(x),
Fermi interactions:
g Cw GdC rsV(X)+~~
r k%4>
X(&,. $~‘(x)+sy&!r(x))
(Uk = 1 (scalar, no index k), y” (vector, G,,. = gpP.),
y”~” (tensor, G,,. =gpI<‘gVV.), y5yU (pseudovector,
G,,, = gpr<.>y5 = iy”y1y2y3), y5 (pseudoscalar, no
index k).
and the time-independence of the total angular
momentum

~“2 _- ,,,,fO’“d3X C. Heuristic Theory of Quantized Fields


(x0 = t),
s
For free tïelds, a quantization procedure
where Sf:” is antisymmetric in p, v and D”(A),, similar to the usual quantum mechanics (-
=&,+&VS~~Y~,,Y/2 for A(A),,=q,,,+c.,, with 351 Quantum Mechanics; 377 Second Quan-
an infinitesimal E,,“. (We have (A(.~)x)~ = tization) leads to the following type of canon-
C,WKx’ ad A(A),,=C,g,,A(A)P,.) ical commutation or anticommutation rela-
A continuous one-parameter group U(p) tions among tïelds (and their fïrst time deriva-
of interna1 symmetries implies the conserva- tives) at time 0; for example,
tion law C,, &P(x) = 0 for the 4-current den-
sity (charge (p=O) and current (p= 1,2,3) real scalar field:
densities) Cd2 xl, 402 y)1 = id36 - y),
c4m 4, dO>Y)l- = cm, 4, aKA Y11- =o,
Dirac field:
cim xx bk(O,Y)1 + = &,a3(x -Y),
cdm Xl>$.a Y)1 + = ccm, 4, im Y)1 + = 0.
and the time-independence of the charge
([A,Bli=AB+BA,S,,=Oforr#sandS,,=l,
lJ”(x)d3x (x’=t), where U’(O)“fl=/2”fl. An
d3(x - y) = I& 6(xk - y”).) The free-field equa-
example of U(p) is the multiplication by
tions then lead to the following 4-dimensional
exp iQ”p (called a gauge transformation of the
commutation relations:
first kind), where Q” is an integer with p = 0
for real fields q;. real scalar fiel&
Examples of Lagrangian densities for non- Cdxb P(Y)] = - 4h-y),
interacting tïelds (called free Lagrangian den- Dirac tïeld:
sities) are: [ddx), k(y)+1 + =(C,~i,a, -im4&Mx -Y),
real scalar fïeld: massive real vector field:
[A,(x),A,(y)l- =i(g,,+m-*a,a,)A,(x-y).
(C,“g”‘~,~o(x)~“<p(x)-m2<p(x)2)/2,
complex scalar fîeld: Here Ai, is the tinvariant distribution. There is
(C,“s”“~~<p(x)~,,cp(x)-m2<p(x)~(x)), a unique representation of such relations for a
field (called the Fock representation) with a vector $\Omega_0$ (called the free vacuum vector) which is annihilated by $\int\varphi(x)f(x)\,dx$, by $\int\psi_r(x)f(x)\,dx$ and $\int\bar\psi_r(x)f(x)\,dx$, or by $\int A_\mu(x)f(x)\,dx$ whenever $\int e^{ip\cdot x}f(x)\,dx = 0$ for $p^0 > 0$ (here, $p\cdot x = \sum g_{\mu\nu}p^\mu x^\nu$), and which is cyclic (i.e., polynomials of (smeared-out) fields applied to $\Omega_0$ generate a dense subset). The free fields in the Fock representation satisfy the Wightman axioms (→ Section D).

In the Fock representation of canonical fields a translationally invariant vector must be a free vacuum vector, up to multiplication by a complex number. In order to construct a model of a translationally invariant interaction among canonical fields with a unique vector of minimal energy (a vector which is called the true or interacting vacuum and which must be translationally invariant if unique), one must look for some other suitable representation. Such a no-go theorem is called Haag's theorem.

The earliest formulation of interacting quantized fields was developed heuristically by formal manipulation in the Fock representation, described in textbooks of quantum field theory [2-6]. It can be mathematically justified if the so-called cutoff is introduced by limiting the space to a finite volume, possibly changing the space into a lattice and smoothing out fields (effectively cutting off the high-energy part of the interaction) (A. M. Jaffe, O. E. Lanford III, and A. S. Wightman, Comm. Math. Phys., 15 (1969)). The full theory is then expected to be obtained by taking a limit of the true vacuum expectation values as various cutoff parameters are removed. This is the aim of constructive field theory (→ Section F), which has been achieved for some space-time models of dimension 2 and 3.

In quantum field theory, the S-matrix is given in terms of (the mass-shell restriction of the Fourier transform of) the vacuum expectation value of the time-ordered product of fields, called $\tau$-functions (→ Section E). In the heuristic approach, it is given in terms of the following Gell-Mann-Low formula (its imaginary time version being mathematically used in constructive field theory): If the field $\varphi(x) = e^{iHx^0}\varphi_0((0, \mathbf{x}))e^{-iHx^0}$ with $H = H_0 + H_I$ ($H_0$ is the free Hamiltonian as the generator of the time translation for the free field $\varphi_0(x)$ and $H_I$ is the interaction Hamiltonian, such as $P(\varphi_0)$) and $x_1^0 > \cdots > x_n^0$, then

$$(\Omega, \varphi(x_1)\cdots\varphi(x_n)\Omega) = \lim_{T\to\infty}\frac{(e^{iHT}\Omega_0,\ \varphi(x_1)\cdots\varphi(x_n)\,e^{-iHT}\Omega_0)}{(\Omega_0,\ e^{-2iHT}\Omega_0)},$$

where $\Omega_0$ is the free vacuum. The numerator can be written in terms of $U(t, s) = e^{itH_0}e^{-i(t-s)H}e^{-isH_0}$ and the canonical field $\varphi_0$ at time 0. The covariant perturbation series due to Tomonaga, Schwinger, Feynman, and Dyson is obtained by substituting the following expansion, which is the iteration of the Duhamel formula:

$$U(t, s) = \sum_{n=0}^{\infty}(-i)^n\int_s^t dt_1\cdots\int_s^{t_{n-1}}dt_n\,H_I(t_1)\cdots H_I(t_n).$$

Here, $H_I(t) = e^{itH_0}H_Ie^{-itH_0}$. Each term can be represented by connected †Feynman diagrams (the denominator canceling out all disconnected graphs) and computed according to the †Feynman rule, yielding †Feynman integrals.

Each expression so obtained (formally) may be a divergent integral (in the absence of the cutoff), in which case one tries to cancel it out by modifying the original Hamiltonian with the alteration of parameters such as mass and coupling constant (by an amount called the renormalization constant, which is divergent in the absence of the cutoff) or possibly by the addition of terms again involving renormalization constants. If this can be achieved in terms of a finite number of renormalization constants, the model or the Hamiltonian is called renormalizable [7]. If the divergent integral appears only in a finite number of graphs (counting the same subgraph of an infinite number of different diagrams as one graph), the model is called superrenormalizable.

D. Axiomatic Quantum Field Theory

Relativistically covariant fields $\varphi_\alpha(x)$ on a Hilbert space $\mathcal{H}$ are called Wightman fields if the following four Wightman axioms are fulfilled [8-10]:

(1) The fields $\varphi_\alpha(x)$ are operator-valued distributions: For each $C^\infty$-function f of rapid decrease on the Minkowski space ($f \in \mathcal{S}(\mathbf{R}^4)$), $\varphi_\alpha(f)$ is an operator defined on a common domain D dense in $\mathcal{H}$ and satisfying $\varphi_\alpha(f)D \subset D$ (the domain of $\varphi_\alpha(f)^*$ contains D), $\varphi_\alpha(f)^*|D = \varphi_{\bar\alpha}(\bar f)$ for another index $\bar\alpha$, and $(\Psi, \varphi_\alpha(f)\Phi)$ for any $\Psi$ and $\Phi$ in D is linear and continuous in f relative to the topology of the Schwartz space $\mathcal{S}(\mathbf{R}^4)$.

(2) Relativistic covariance: There exists a continuous unitary representation $U(a, A)$ ($a \in \mathbf{R}^4$ and $A \in SL(2, \mathbf{C})$) of the universal covering group $\tilde{\mathcal{P}}_+^\uparrow$ of the †restricted inhomogeneous Lorentz group $\mathcal{P}_+^\uparrow$ on $\mathcal{H}$ such that $U(a, A)D \subset$
If the mass operator (P P)“’ has an iso-


lated point spectrum at m with the eigenspace
Z’,(m), then there are sufflciently many n, a =
(a,, , a,,), and f(xl, . . ,x,) such that cp,(f)Q~
where f,,a(x)=f(h(A)-l(x-a)) and (D(A),& is
,YiUl(m), and the linear hull of such vectors are
a fïnite-dimensional representation of SL(2, C).
dense in X1(m). In terms of the fj satisfying
The index a usually consists of tundotted and
<p,~,,(fj)Sr~Z’~(m~) (some of the m’s may coin-
dotted indices, interchanged in E, and of other
cide) and the solutions gj(x) = Jgj(p)exp i(px -
indices, interchanged among them in E.)
wj(p)xo)d3p, wj(p)=(p2 +mf)“‘, of the Klein-
(3) Locality: If the supports off and g are
Gordon equation, then the following limits,
mutually spacelike, then
called out and in states, exist:
ou,
Yin (4 . ..h.)=tli~~Qe,(t,sl)...Q.(t,g,)R,
on D, where E(cI,~)= fl. If E(c(,/~)= -1 exactly
when D(A),,=D(A)ps= -1 for A= -1 (i.e.,
when both vu and <ppare Fermi fields), they Qj<t,gj,= <pa(,,(X~
+(t,X), ...>Xnj+(GX))
are said to satisfy the normal commutation s
relations. X&(X~ + . +xn,)gj(t,x)dx, . ..dx”.dx,
(4) Spectrum conditions: Let U(a, 1) = eiO.p.
hj=(27L)3gj(p)cP,,(ns2E~~(mj).
The joint spectrum of P’ (p = 0, 1,2,3) is in the
forward cane v+ ={p~R~[p.p>O,p~>O}, It delïnes the S-matrix elements as follows:
with a point spectrum of multiplicity 1 at p = 0.
A vector 0 belonging to the point spectrum 0 S(!I, . ..h.;h; . ..h”.)
of P’ is called the true (or interacting) vacuum =(Y”“‘(h, . . . h”), Yi”(h; h;,)).
and is required to be in D.
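
The spectrum condition in axiom (4) is a membership test for the closed forward cone, which can be made concrete with a few lines of code. The following is a minimal sketch, not part of the axioms themselves: the names `METRIC`, `minkowski_dot`, and `in_forward_cone` are hypothetical, and the metric signature (1, -1, -1, -1) is the one stated earlier in this article for the Minkowski metric tensor.

```python
from typing import Sequence

# Assumed metric signature (1, -1, -1, -1), as stated in Section B of this article.
METRIC = (1.0, -1.0, -1.0, -1.0)


def minkowski_dot(p: Sequence[float], q: Sequence[float]) -> float:
    """Return p.q = p^0 q^0 - p^1 q^1 - p^2 q^2 - p^3 q^3."""
    return sum(g * a * b for g, a, b in zip(METRIC, p, q))


def in_forward_cone(p: Sequence[float]) -> bool:
    """Check membership in the closed forward cone: p.p >= 0 and p^0 >= 0."""
    return minkowski_dot(p, p) >= 0.0 and p[0] >= 0.0


if __name__ == "__main__":
    rest = (1.0, 0.0, 0.0, 0.0)        # timelike, future-directed
    lightlike = (1.0, 1.0, 0.0, 0.0)   # on the boundary of the cone
    spacelike = (1.0, 2.0, 0.0, 0.0)   # outside the cone
    past = (-1.0, 0.0, 0.0, 0.0)       # timelike but past-directed
    for name, p in [("rest", rest), ("lightlike", lightlike),
                    ("spacelike", spacelike), ("past", past)]:
        print(name, in_forward_cone(p))
```

The zero vector p = 0 also passes the test, which matches the requirement that the vacuum sits at the point p = 0 of the joint spectrum.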
Usually D is taken to be minimal, namely, D This is called Haag-Ruelle scattering theory,
is the linear hull of R and ~p,~(f,). . cp,,(f,)0 and the existence proof is based on the cluster-
with a11 possible n, c(~, , c(,,,f, , . . ,f.. By means ing property of WBT and an asymptotic esti-
of the nuclear theorem, it is possible to defïne mate for the behavior of the g’s for large t
and a11 x (R. Haag, Phys. Reu., 112 (1958); D.
a linear operator <poL,,,,..(f) for f(x,, , x”) in
5f’(R4”), linear and continuous in f on D such Ruelle, Helu. Phys. Acta, 35 (1962)). The out
that it coincides with rp,,(f,) cp,“(f,) if f(x) and in states cari be interpreted in the limit of
= ny=, J(xj). If such operators are introduced, inlïnite future and past as the state where n
particles are moving at velocities vj related to
then the linear hull of 52 and (p,,,,.,,(f)0 is
the spectrum pu of P” through the relation p” =
taken to be D.
Under the foregoing axiom, the vacuum mj/( 1 - vf)l”, p = mjvj/( 1 - vj2)li2 (with probabil-
expectation values of the products of fïeld ity amplitude proportional to h,(p)) (H. Araki
operators detïne tempered distributions W,(x) and R. Haag, Comm. Math. Phys., 4 (1967)).
Since
= WC,,,.=”(x1, ,x,), called Wightman functions,
such that (Y”“‘(h, h”), Y”‘(,; . I?L,))

w,(nf;)=(52,<p,I(f,).“cp,“(f,)n).
The notion of a connected graph in pertur-
bation theory corresponds to the following
where the sum is over a11 permutations P, Yo”’
notion of truncated Wightman functions WcT:
cari be viewed as a unitary mapping from the
tFock space X0 over XE X1(m) to the closed
w=T(r)=C(-l)m~l(m-l)! 1 fi W,(I,),
m (Ix) k=1 subspace 2’“’ spanned by Y”“‘(hl h,), n =
0, 1,2,. . . (Q for n=O, h, itself for n= 1). The
free tïelds on 8’“’ and Xi” are called (out and
in) asymptotic fields. Likewise, Yi” is a unitary
where ! indicates a set of variables xj~M, jE1, mapping from X0 to 2’“. If 2’” =,X0”‘, then
{Zk} is a partition of 1 into m subsets, with the the matrix element S cari be viewed as a ma-
order of the x’s in each I, remaining the same trix element of a unitary operator on X0,
as in I, and the sum over {lk} extending over called an S-matrix. Its unitary transform by
a11 possible partitions. If the spectrum of {Pu} Yout and by Yi” coincide and detïne a unitary
in R’ is contained in vJ’={p~MIp.p>m~, operator, sometimes called an S-operator, on
p” > m} for some m > 0 (the mass gap), then Xi” = %‘Oout.The equality 2 = ~4”” = Ü+?~~’is
WuT has an exponential clustering property at called the completeness of the scattering states
spatial infïnity. For example WaT(x, + a,, . . , or asymptotic completeness.
x, + a& m’R+O as a distribution in x if m’ cm, If the scattering states are complete, then
ay=Ofor alljand R=maxlaj-akl+a. the following LSZ asymptotic condition, due
to Lehmann, Symanzik, and Zimmermann detïned by


(Nuouo Cimento, 1 (1955)) holds:
P

=~e(xoi cPlci)W,a(x)
where a*(h) is the tcreation operator (a*(h). ( >
Y(h, . h,)=Y(hh, . h,) for either out or
(often with an additional factor (-i)“-‘), where
in states), ~~~~,,(fj)fi~~,(m~) is not required,
the Fourier-Laplace transform &q”; C,,) of
and (Q, v~,~,(&)Q) = 0 is assumed instead. This
0(x0; C,) is a rational function and 0(x0; C,/Ci)
leads to an explicit expression for the S-matrix
is the inverse Fourier transform of its bound-
elements in terms of r-functions, which takes
ary value as Im 4’ E Ci tending to 0, and the
the following simple form if a11 particles have
cane Ci and hence ri is specitïed by a consistent
spin 0, a11 tïelds <p=are scalar (D(A),, = &), and
choice of signs of Cj,, Im q,? for a11 nonempty
ah nj cari be taken to be 1. First define the
proper subsets 1 of (1, . . , n). The Fourier
connected part S,(h, . . h,; h; &) from S by
transform of ri(x) is a boundary value of an
exactly the same equation as truncated Wight-
analytic function common to a11 ri and coin-
man functions with S to be set to 0 if n = 0, 1
cides with ?ETfor p” E Ci. Making use of this
or n’ = 0, 1 except when n = M’ = 1 (S(h, h’) =
relation, some analyticity properties of the S-
(h, h’)). Then
matrix (including TCP symmetry) have been
S#I, . . h,; h,,, h,) proved ([ll, 121; J. Bros, H. Epstein, and
V. Glaser, Comm. Math. Phys., 1 (1965);
Epstein, Glaser, and A. Martin, Comm. Math.
Phys., 13 (1969); Epstein, J. Math. Phys., 8
(1967)).
-Pf+l>“.> - PJ ,J ( - izj”2 hj(Pj) dnj(Pj))> The two-point function for scalar tïelds has
1 the following simple expression in terms of
where p: = (p/ + mf)112 (the set of such p is tinvariant distributions, sometimes called
called the positive mass shell; there the vanish- the Kallen-Lehmann representation with the
ing factors (p, pi-m*) cancel the poles of fa?), Kallen-Lehmann weight p:
dQ,$p) is the Lorentz invariant measure
(21~ I)-‘d3p(=S(p.p-mj)d4p) on the posi- (Q>cp(x)<p(~)Q)= m &(x-Y)dp(K-2).
tive mass shell, h E ,rir (m) is represented by a s0
measurable function h(p) with the inner prod-
Under the Wightman axiom (with the
uct (h, h’)=~h(p)h’(p)dR,(p), Zj is defined by normal commutation relations), there exists
(h, <p,,,,(fj)0)=Zj”2(h,~) for a11 h~%‘r(m~) and
an antiunitary operator 0, called the TCP
f(p)=(2n)m3’2Jeip’Xf(x)dx, and ?z is the operator, which satisfïes the relations
Fourier transform of the truncated time-
ordered function (z-function) detïned by OR=R, OU(a,A)W’=U(-a,A),

o2 = U(0, -1)
s,T(p, ,..., p,)= raT(x, ,..., x,)expi C pj’xj
s ( j=l > @<p,(x)@-‘=(-l)V(-i)N<p,(-x),

where 11is the number of undotted indices in c(,


x fi (27cm3”d4xj,
j=l N =0 if D(A),,= 1 for A = -1 (Bose tïelds), and
N = 1 if D(A),,= -1 for A= -1 (Fermi tïelds).
P The choice of &l for ~(a, a) in the locality
axiom cannot be opposite to the normal com-
mutation relations for any a except for the
in which the sum is over a11 permutations P, T trivial case <p,(x) = 0 ([S- 101; G. Lüders and B.
indicates the truncated Wightman functions, Zumino, Phys. Reo., 110 (1958); N. Burgoyne,
0(x0; C,) is the characteristic function of the Nuouo Cimento, 8 (1958)). This is called the
cane xP,~, 2 xPc2, 2 .a xi<,, (the time ordering) connection of spin and statistics. For a general
possibly smeared out by convolution with a choice of k 1 for .$a, fi), the Wightman axioms
C”-function of compact support (SO that it cari imply existence of a certain number of even-
multiply a distribution), where the formula oddness conservation laws (i.e., the existence of
does not actually depend on the smearing a unitary representation u of a group (Z,)’ for
functions. (Usually fields <p, are normalized SO some 1 such that u(g)Q=Q u(g)cp,(f)u(g)*=
that Zj = 1.) Xo,(g)<p,(f) for some characters 1, on the group)
The r-functions (0. Steinmann, He[u. Phys. SO that the Klein transforms Q=(x) = u(g,)qJx)
Acta, 33 (1960); D. Ruelle, Nuovo Cimento, 19 (for some choice of g=) of the original tïelds
(1961); H. Araki, J. Math. Phys., 2 (1961)) are satisfy the Wightman axioms with the normal

commutation relations ([8-10]; H. Araki, J. Math. Phys., 2 (1961)).

E. Theory of Local Observables

Except for the technical assumption of operator-valued distributions,
physical contents of the Wightman axioms have been formulated in terms of
the von Neumann algebra $\mathfrak{A}(O)$ for bounded open sets $O$,
generated by those observables which can be measured in the space-time
region $O$, as follows:
(1) Isotony: If $O_1 \supset O_2$, then $\mathfrak{A}(O_1) \supset \mathfrak{A}(O_2)$.
(2) Covariance: $U(a, A)\mathfrak{A}(O)U(a, A)^* = \mathfrak{A}(\Lambda(A)O + a)$.
(3) Locality: If $Q_j \in \mathfrak{A}(O_j)$ and $O_1$ is spacelike to $O_2$,
then $[Q_1, Q_2] = 0$. (No signal can propagate faster than the speed of
light.)
In addition, the spectrum condition is as before (the stability of the
vacuum) and, since we restrict our attention to the closed span of
$\mathfrak{A}(O)\Omega$ for all $O$, $\Omega$ is assumed to be cyclic for
$\bigcup_O \mathfrak{A}(O)$. (Then the latter is irreducible.) By treating
$Q(x) = U(x, 1)QU(x, 1)^*$ for $Q \in \mathfrak{A}(O)$ as a (noncovariant
but localized) field, Haag-Ruelle scattering theory and the analyticity
properties of the S-matrix described above for Wightman fields hold in
exactly the same way in the theory of local observables. The notion of an
algebra of local observables has been a concern of R. Haag since the late
1950s with the consequent analogy to Wightman fields being demonstrated by
Araki in his Zürich lectures of 1961-1962. Hence the above axioms are
sometimes called Haag-Araki axioms.
With the help of the additional axiom
$(\mathfrak{A}(O_1) \cup \mathfrak{A}(O_2))'' = \mathfrak{A}(O_1 \cup O_2)$,
called the additivity, the vacuum vector $\Omega$ is cyclic and separating
for $\mathfrak{A}(O)$ for any bounded open $O$ (Reeh-Schlieder theorem) and
$\mathfrak{A}(O) = \mathfrak{A}(\hat{O})$, where $\hat{O}$ is the double cone
$\{x \mid |x^0| + |\mathbf{x}| < L\}$ if
$O = \{x \mid |x^0| + |\mathbf{x}| < L, |\mathbf{x}| < \varepsilon\}$ for any
$\varepsilon > 0$, for example (Borchers theorem).
A merit of the Haag-Araki axioms is that these axioms have direct physical
interpretation. In particular they always imply the commutativity at
spacelike separation of supports, in contrast to the anticommutativity for
Fermi fields. This then necessitates the consideration of the
representations associated with some other states, such as states with an
odd number of fermions and representations of the C*-algebra $\mathfrak{A}$
of quasilocal observables (generated by all $\mathfrak{A}(O)$), which are
nonequivalent to the vacuum representation and are called superselection
sectors. This viewpoint was introduced by Haag and D. Kastler (J. Math.
Phys., 5 (1964)) and the C*-algebra version of the Haag-Araki axioms is
called the Haag-Kastler axioms.
If $O'$ denotes the causal complement of $O$ (i.e., the set of all points
spacelike to $O$), then the assumption $\mathfrak{A}(O)' = \mathfrak{A}(O')$
is called duality and is proved for a certain type of region, which includes
the double cone in the case of free fields. With the assumption of the
duality for double cones in the vacuum sector, S. Doplicher, Haag, and J.
Roberts (Comm. Math. Phys., 13 and 15 (1969); 24 (1971); 35 (1974))
succeeded in the analysis of superselection sectors and clarified the
connection of spin and statistics in a much more satisfactory fashion, as
well as the anticommutativity of intertwining operators for superselection
sectors for Fermi statistics.

F. Constructive Field Theory

An effort to make mathematical sense out of the heuristic theory of
quantized fields and to produce examples of Wightman fields and the
associated system of local observables has been pushed forward by J. Glimm
and A. Jaffe since the mid 1960s and is known as constructive field theory.
Since 1972, the Euclidean methods, already known in some sense, have become
extremely powerful central tools, and are collectively known as Euclidean
field theory [13-16].
The Wightman function $W(z_1 \ldots z_n)$ is analytic at the Schwinger
points $z_j = (ix_j^0, \mathbf{x}_j)$ ($x_j \in \mathbf{R}^4$) if
$x_j \neq x_k$ for $j \neq k$, and its value
$S(x_1 \ldots x_n) = W(z_1 \ldots z_n)$ is called the Schwinger function.
The axioms for Schwinger functions equivalent to Wightman axioms are known
as Osterwalder-Schrader axioms (Comm. Math. Phys., 31 (1973); 42 (1975)).
The positivity axiom reflecting the positive definite metric of the Hilbert
space for Wightman fields is known as O-S positivity (or T-positivity or
reflection positivity).
Since the Schwinger function is symmetric in its variables, it can be viewed
as the expectation value of the product of (commuting) random fields, called
Euclidean fields, if an additional positivity holds. This idea was put
forward by K. Symanzik in the 1960s. E. Nelson then realized that Euclidean
fields for free fields have the Markov property, and he developed Euclidean
Markov field theory. A work of Guerra in 1972 utilizing Nelson symmetry
revealed the extreme power of this approach, and the whole of constructive
field theory has been studied in Euclidean formulation with remarkable
results for super-renormalizable models in space-time of dimension 2 and 3.
The key point is the Feynman-Kac-Nelson formula, which expresses Schwinger
functions

as functional integrals and reveals a mathematical connection between
Euclidean field theory and classical statistical mechanics.

G. Gauge Theory

†Electrodynamics in terms of the 4-vector potential $A_\mu(x)$ is invariant
under the local gauge transformations
$A_\mu(x) \to A_\mu(x) + \partial_\mu \Lambda(x)$ (and the associated
transformation of the charged fields) for $\Lambda$ satisfying the wave
equation $\Box \Lambda = 0$. This leads on one hand to complication in the
canonical quantization of the fields $A_\mu(x)$ and, on the other hand, to
the necessity of an indefinite inner-product space as exemplified by the
Gupta-Bleuler formalism of quantum electrodynamics. In such a formalism, the
physical Hilbert space (with positive definite metric) is introduced by
considering a specific subspace (physical subspace) of a semidefinite metric
and taking the quotient by its null subspace.
The corresponding theory with a noncommutative gauge group is known as the
theory of Yang-Mills fields. In order to restore the formal unitarity of the
S-matrix in the perturbation series in terms of Feynman diagrams, L. D.
Faddeev and V. N. Popov (Phys. Lett., 25B (1967)) introduced fictitious
particles, called Faddeev-Popov ghosts. T. Kugo and I. Ojima (Prog. Theoret.
Phys., Suppl., 66 (1979)) developed a canonical quantization scheme for the
Yang-Mills field which naturally introduces additional fields corresponding
to the Faddeev-Popov ghosts and specifies the physical subspace by means of
the simple condition that it be the kernel of the generator of BRS
transformations, earlier introduced by C. Becchi, A. Rouet, and R. Stora.
Gauge theory can be formulated on a lattice of finite volume as a kind of
classical statistical mechanics. This is known as lattice gauge theory, and
the important issue currently being investigated is whether or not its
limit, as the lattice interval tends to 0 (the continuum limit) and the
volume tends to infinity, produces a nontrivial quantum theory of gauge
fields.

References

[1] L. D. Landau and E. M. Lifshits, The classical theory of fields,
Pergamon, 1975. (Original in Russian, 1973.)
[2] J. D. Bjorken and S. D. Drell, Relativistic quantum fields,
McGraw-Hill, 1965.
[3] S. S. Schweber, An introduction to relativistic quantum field theory,
Harper & Row, 1961.
[4] H. Umezawa, Quantum field theory, North-Holland, 1956.
[5] K. Nishijima, Fields and particles, Benjamin, 1969.
[6] N. N. Bogolyubov and D. V. Shirkov, Introduction to the theory of
quantized fields, Wiley, third edition, 1980.
[7] K. Hepp, Théorie de la renormalisation, Springer, 1969.
[8] R. F. Streater and A. S. Wightman, PCT, spin and statistics, and all
that, Benjamin, 1978.
[9] R. Jost, The general theory of quantized fields, Amer. Math. Soc., 1965.
[10] N. N. Bogolyubov, A. A. Logunov, and I. T. Todorov, Introduction to
axiomatic quantum field theory, Benjamin, 1975. (Original in Russian, 1969.)
[11] C. DeWitt and R. Omnès (eds.), Dispersion relations and elementary
particles, Wiley, 1960.
[12] M. Chrétien and S. Deser (eds.), Axiomatic field theory, Gordon &
Breach, 1966.
[13] G. Velo and A. S. Wightman (eds.), Constructive quantum field theory,
Springer, 1973.
[14] B. Simon, The $P(\varphi)_2$ Euclidean (quantum) field theory,
Princeton Univ. Press, 1974.
[15] J. Glimm and A. Jaffe, Quantum physics, Springer, 1981.
[16] B. Simon, Functional integration and quantum physics, Academic Press,
1979.
[17] A. Jaffe and C. Taubes, Vortices and monopoles, Birkhäuser, 1980.

151 (WA)
Finite Groups

A. The Number of Finite Groups of a Given Order

A group is called a finite group if its order is finite (→ 190 Groups).
Since the early years of the theory of finite groups, a major problem has
been to find the number of distinct isomorphism classes of groups having a
given order. It is almost impossible, however, to find a general solution to
the problem unless the values of n are restricted to a (small) subset of the
natural numbers. Let f(n) denote the number of isomorphism classes of finite
groups of order n. If p is a prime number, then f(p) = 1 and any group of
prime order is a cyclic group. If p is prime, then any group of order $p^2$
is an †Abelian group and $f(p^2) = 2$. If p and q are distinct primes and
p > q, then f(pq) = 2 or 1 according as p is congruent to 1 modulo q or not.
If $p \equiv 1 \pmod{q}$, there is a non-Abelian group of order pq as well as
a cyclic group of order pq. For small n, the value of f(n) is as

follows:

  n     8  12  16  18  20  24  27  28  30  32  60
  f(n)  5   5  14   5   5  15   5   4   4  51  13

For any n, $f(n) \geq 1$. When p is prime, $f(p^m)$ is known for m < 6:
$f(p^3) = 5$, $f(p^4) = 15$ if p > 2. For $f(p^5)$ see O. Schreier, Abh.
Math. Sem. Univ. Hamburg, 4 (1926). For $f(2^6)$ see [11]. Set
$f(p^m) = p^l$ and $l = Am^3$. Then $A \to 2/27$ as $m \to \infty$ (G.
Higman, Proc. London Math. Soc., 10 (1960); C. C. Sims, Symposium on Group
Theory, Harvard, 1963).

B. Fundamental Theorems on Finite Groups

The following are some fundamental theorems useful in studying finite
groups.
(1) The order of any subgroup of a finite group G divides the order of G
(J. L. Lagrange). The converse is not necessarily true. If a finite group G
contains a subgroup of order n for any divisor n of the order of G, then G
is a †solvable group. Furthermore, if G contains a unique subgroup of order
n for each divisor n of the order of G, then G is a cyclic group.
Let p be a prime number. Let the order of a finite group G be $p^a m$, where
m is not divisible by p. A subgroup of order $p^a$ of G is called a p-Sylow
subgroup (or simply a Sylow subgroup) of G. The importance of this concept
may be seen from the next theorem.
(2) A finite group contains a p-Sylow subgroup for any prime divisor p of
the order of the group. Furthermore, p-Sylow subgroups are conjugate to each
other. The number of distinct p-Sylow subgroups is congruent to 1 modulo p.
In general, the number of distinct p-Sylow subgroups of G that contain a
given subgroup whose order is a power of p is congruent to 1 modulo p
(Sylow's theorems).
(3) A p-group is a †nilpotent group (a finite group is called a p-group if
its order is a power of p). Thus any finite p-group G of order > 1 contains
a nonidentity element in the †center of G. Furthermore, any proper subgroup
of G is different from its †normalizer. A paper by P. Hall (Proc. London
Math. Soc., (2) 36 (1933)) is a classic and fundamental work on p-groups.
A group G of order 8 with 2 generators σ, τ and relations $\sigma^4 = 1$,
$\tau\sigma\tau^{-1} = \sigma^{-1}$, $\sigma^2 = \tau^2$ is called the
quaternion group. This group is isomorphic to the multiplicative group
consisting of {±1, ±i, ±j, ±k} in the †quaternion field. A generalized
quaternion group is a group of order $2^n$ with 2 generators σ, τ and
relations $\sigma^{2^{n-1}} = 1$, $\tau\sigma\tau^{-1} = \sigma^{-1}$,
$\tau^2 = \sigma^{2^{n-2}}$. A non-Abelian group all of whose subgroups are
normal subgroups is called a Hamilton group. A Hamilton group is the direct
product of a quaternion group, an Abelian group of odd order, and an Abelian
group of exponent 2 (i.e., $\rho^2 = 1$ for each element ρ).
Let G be a finite group, and let
$G_0 = G \supset G_1 \supset \cdots \supset G_r = \{1\}$ be a †composition
series of G. The set of isomorphism classes of the simple groups
$G_{i-1}/G_i$, i = 1, 2, ..., r, is uniquely determined (up to arrangement)
by the †Jordan-Hölder theorem. Thus the two most fundamental problems of
finite groups are (i) the study of the simple groups and (ii) the study of a
group with a given set of composition factors. The first is one of the
leading problems of the theory, although it has been in a state of
stagnation until rather recently (→ Section J). As to the second problem,
initial works by H. Wielandt and others are under way (particularly in the
direction of various generalizations of Sylow's theorems). For the class of
finite solvable groups, the first problem has a rather trivial solution;
only the second problem is important, and even in this case the theory seems
to leave something to be desired.

C. Finite Nilpotent Groups

A finite group is nilpotent if and only if it is the †direct product of its
p-Sylow subgroups, where p ranges over all the prime divisors of the order.
Any maximal subgroup of a nilpotent group is normal. The converse holds for
finite groups; that is, a finite group is nilpotent if and only if all its
maximal subgroups are normal.

D. Finite Solvable Groups

One of the most profound results on finite groups, an affirmative answer to
the long-standing Burnside conjecture, is the Feit-Thompson theorem (Pacific
J. Math., 13 (1963)): A finite group of odd order is solvable. The index of
a maximal subgroup of a finite solvable group is a power of a prime number
(E. Galois). But the converse is not true. The unique simple group of order
168 has the property that all maximal subgroups are of prime power index. A
finite solvable group contains a self-normalizing nilpotent subgroup (i.e.,
a nilpotent subgroup H such that $N_G(H) = H$), and any two such subgroups
are conjugate (R. W. Carter, Math. Z., 75 (1960); cf. W. Gaschütz, Math. Z.,
80 (1963), for a generalization). Such a subgroup is called a Carter
subgroup and is an analog of a Cartan subalgebra of a Lie algebra. But
unlike Cartan subalgebras, most simple groups do not contain any
self-normalizing nilpotent subgroups. A finite solvable group of order mn
(m, n are relatively prime) contains a subgroup of order m; two subgroups of
order m are conjugate; if $l$

is a divisor of m, then any subgroup of order $l$ is contained in a subgroup
of order m (P. Hall). The converse of the first part of this theorem is also
true: A finite group is solvable if it contains a subgroup of order m for
any decomposition of the order in the form mn, (m, n) = 1 (P. Hall's
solvability criterion). This generalizes the famous Burnside theorem
asserting the solvability of a group of order $p^a q^b$, where both p and q
are prime numbers. If the sequence of the quotient groups of a †principal
series of a finite group G consists of cyclic groups, then the group G is
called supersolvable. A finite group is supersolvable if and only if the
index of any maximal subgroup is a prime number (B. Huppert, Math. Z., 60
(1954)). If p is the largest prime divisor of the order of a finite
supersolvable group G, then a p-Sylow subgroup of G is a normal subgroup.

E. Hall Subgroups

A subgroup is called a Hall subgroup if its order is relatively prime to its
index (see the theorems of P. Hall on finite solvable groups). There is no
general theorem known on the existence of a Hall subgroup. If a finite group
G has a normal Hall subgroup N, then G contains a Hall subgroup H that is a
complement of N in the sense that G = NH and $N \cap H = 1$; furthermore,
any two complements are conjugate (Schur-Zassenhaus theorem). The analog of
Hall's theorem on finite solvable groups fails for nonsolvable groups. But
if a finite group, solvable or not, contains a nilpotent Hall subgroup H of
order n, then any subgroup of an order dividing n is conjugate to a subgroup
of H (Wielandt, Math. Z., 60 (1954); cf. P. Hall, Proc. London Math. Soc., 4
(1954), for a generalization). There are some generalizations of these
results for maximal π-subgroups which may not be Hall subgroups.

F. π-Solvable Groups

Let π be a set of prime numbers. Denote the set of prime numbers not in π by
π'. A finite group is called a π-group if all the prime divisors of the
order belong to π. A finite group is called π-solvable if any composition
factor is either a π'-group or a solvable π-group. If π = {p} consists of a
single prime number p, we use terms such as p-group or p-solvable (instead
of {p}-solvable). Let G be a π-solvable group. A series of subgroups
$P_0 = 1 \subseteq N_0 \subsetneq P_1 \subsetneq N_1 \subsetneq \cdots \subsetneq P_l \subseteq N_l = G$
defined by the properties that $P_i/N_{i-1}$ is the maximal normal
π-subgroup of $G/N_{i-1}$ and $N_i/P_i$ is the maximal normal π'-subgroup of
$G/P_i$ is called the π-series of G, and the integer $l$ is called the
π-length of G. A solvable group is π-solvable for any set π of prime
numbers. A π-solvable group contains a Hall subgroup which is a π-group and
also a Hall subgroup which is a π'-group; an analog of Hall's theorem on
finite solvable groups holds. Hall and Higman (Proc. London Math. Soc., 7
(1956)) discovered deep relations between the p-length of a p-solvable group
and invariants of its p-Sylow subgroup. For example, the p-length is 1 if a
p-Sylow subgroup is Abelian.

G. Permutation Groups

The set of all permutations on a set Ω of n elements forms a group of order
n! whose structure depends only on n. This group is called the symmetric
group of degree n, denoted by $S_n$. Any subgroup of $S_n$ is a permutation
group of degree n. When it is necessary to mention the set Ω on which
permutations operate, $S_n$ may be denoted as S(Ω), and a subgroup of S(Ω)
is called a permutation group on Ω. The set of n elements on which $S_n$
operates is usually assumed to be {1, 2, ..., n}, and an element σ of $S_n$
is written as

$$\sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ 1' & 2' & \cdots & n' \end{pmatrix}, \qquad \text{for example,} \quad \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 2 & 3 & 1 & 5 & 4 & 6 \end{pmatrix},$$

where i' is the image of i by σ: i' = σ(i). The element σ may be written as
(...)(abc...z)(...), which means that σ cyclically maps a into b, b into c,
and so on, and finally z back into a. In the example, σ = (1 2 3)(4 5)(6).
It is customary to omit the cycle with only one letter in it, such as (6) in
the example. With this convention, σ = (1 2 3)(4 5) may be an element of
$S_n$ for any n ≥ 5, leaving all the letters i ≥ 6 invariant. A cycle of
length $l$ is an element σ of $S_n$ which moves $l$ letters
$a_1, \ldots, a_l$ cyclically and fixes all the rest; i.e.,
$\sigma = (a_1, \ldots, a_l)$. Then an expression such as σ = (1 2 3)(4 5)
is the same as the product of two cycles (1 2 3) and (4 5). In general, any
permutation can be expressed as the product of mutually disjoint cycles (two
cycles $(a_1, \ldots, a_l)$ and $(b_1, \ldots, b_m)$ are said to be disjoint
if $a_i \neq b_j$ for all i and j). Furthermore, the expression of the
permutation as the product of mutually disjoint cycles is unique up to the
order in which these cycles are written. A cycle of length 2 is called a
transposition. Any permutation may be written as a product of
transpositions. This expression is not unique, but the parity of the number
of transpositions in the expression is determined by the permutation. A
permutation is called even if it is the product of an even number of
transpositions.
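The cycle notation and the parity rule described above can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function names `cycles` and `is_even` and the dictionary encoding of a permutation are hypothetical choices, not notation from the text. The parity test relies on the standard fact (not stated above) that a cycle of length l is a product of l - 1 transpositions.

```python
def cycles(perm):
    """Decompose a permutation into mutually disjoint cycles.

    `perm` is a dict sending each letter i to its image i' = sigma(i)
    (a hypothetical encoding chosen for this sketch). Cycles with only
    one letter are omitted, following the convention in the text.
    """
    seen = set()
    result = []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle = [start]
        seen.add(start)
        nxt = perm[start]
        while nxt != start:          # follow start -> sigma(start) -> ...
            cycle.append(nxt)
            seen.add(nxt)
            nxt = perm[nxt]
        if len(cycle) > 1:
            result.append(tuple(cycle))
    return result


def is_even(perm):
    """Parity via the cycle decomposition.

    Assumes the standard fact that a cycle of length l can be written
    as a product of l - 1 transpositions.
    """
    transpositions = sum(len(c) - 1 for c in cycles(perm))
    return transpositions % 2 == 0


# The example permutation from the text: 1->2, 2->3, 3->1, 4->5, 5->4, 6->6.
sigma = {1: 2, 2: 3, 3: 1, 4: 5, 5: 4, 6: 6}
print(cycles(sigma))   # [(1, 2, 3), (4, 5)]
print(is_even(sigma))  # False: 2 + 1 = 3 transpositions
```

The decomposition is unique only up to the order of the cycles; the sketch fixes one order by starting each cycle at its smallest letter.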

A permutation that is not even is called odd. The symmetric group $S_n$
contains the same number of even and odd permutations.
The totality of even permutations forms a normal subgroup of order (n!)/2
(n ≥ 2) called the alternating group of degree n and usually denoted by
$A_n$. An even permutation is the product of cycles of length 3. The
alternating group $A_5$ of degree 5 is the nonsolvable group of minimal
order. This fact was known to Galois. If n ≠ 4, then the alternating group
$A_n$ is a simple group and a unique proper normal subgroup of $S_n$. If
n = 4, $A_4$ contains a normal noncyclic subgroup V of order 4. In this case
$A_4$ and V are the only proper normal subgroups of $S_4$. A noncyclic group
of order 4 is called a (Klein) four-group. If n ≥ 5, the symmetric group
$S_n$ is not a solvable group. This is the group-theoretic ground for the
famous theorem, proved by Ruffini, Abel, and Galois, which asserts the
impossibility of an algebraic solution of a general algebraic equation of
degree more than four. If n ≤ 4, $S_n$ is solvable. The group $S_4$ has a
composition series with composition factors of orders 2, 3, 2, 2, and $S_4$
is realized as the group of motions in 3-dimensional space which preserve an
octahedron. Hence $S_4$ is called the octahedral group. Similarly, $A_4$
($A_5$) is realized as the group of motions in space which preserve a
tetrahedron (icosahedron); thus $A_4$ is called the tetrahedral group and
$A_5$ the icosahedral group.
These groups have been extensively studied in view of their geometric
aspect. The group of motions of a plane which preserve a regular polygon is
called a dihedral group. If a regular polygon has n sides, then the group
has order 2n. Sometimes a Klein four-group is included in the class of
dihedral groups (for n = 2). The dihedral groups, octahedral group, etc.,
are called regular polyhedral groups. A finite subgroup of the group of
motions in 3-dimensional space is either cyclic or one of the regular
polyhedral groups. A dihedral group is generated by two elements of order 2.
Conversely, a finite group generated by two elements of order 2 is a
dihedral group. This simple fact has surprisingly many consequences in the
theory of finite groups of even order [15, ch. 9]. A dihedral group of order
2n contains a cyclic normal subgroup of order n, and hence is solvable.
If n ≠ 6, every automorphism of $S_n$ is inner. The order of the group of
automorphisms of $S_6$ is twice the order of $S_6$. The index $(S_n : H)$ of
a subgroup H of $S_n$ is at least equal to n unless $H = A_n$. If
$(S_n : H) = n$, then H is isomorphic to $S_{n-1}$. If n ≠ 6, $S_n$ contains
a unique conjugate class of subgroups of index n. But $S_6$ contains two
such classes, which are exchanged by an automorphism of $S_6$ (→ Section I).

H. Transitive Permutation Groups

A permutation group G on a set Ω is called a transitive permutation group if
for any pair (a, b) of elements of Ω, there exists a permutation of G which
sends a into b. Otherwise G is said to be intransitive. Let G be a
transitive permutation group on a set Ω, and let a be an element of Ω. The
totality of elements of G which leave a invariant forms a subgroup of G
called the stabilizer of a (in G). The index of the stabilizer is equal to
the number of elements of Ω, the degree of G. Thus the degree of a
transitive permutation group G divides the order of G (a fundamental
theorem).
The concept of orbits is important. Let G be a permutation group on a set Ω.
A subset Γ of Ω is called an orbit of G if it is G-invariant and G acts
transitively on Γ. In other words, a subset Γ of Ω is an orbit of G if the
following two conditions are satisfied: (i) If a ∈ Γ and g ∈ G, the image
g(a) also lies in Γ, and (ii) if a and b are two elements of Γ, there exists
an element x of G such that b = x(a). Thus each element x of G induces a
permutation $\varphi_\Gamma(x)$ on Γ. The set of all these permutations
$\varphi_\Gamma(x)$ (x ∈ G) forms a permutation group on Γ, which may be
denoted by $\varphi_\Gamma(G)$. Then $\varphi_\Gamma(G)$ is transitive on Γ,
and $\varphi_\Gamma$ is a homomorphism of G onto $\varphi_\Gamma(G)$. Thus
the number of elements in an orbit Γ is a divisor of the order of G. It is
clear that the set Ω on which G acts is the union of mutually disjoint
orbits $\Gamma_1, \ldots, \Gamma_r$ of G. This implies that the degree of G
is the sum of the numbers of elements in the orbits $\Gamma_i$. The
resulting equation often contains nontrivial relations. If $\varphi_i$
denotes the homomorphism $\varphi_{\Gamma_i}$ defined before, then G is
isomorphic to a subgroup of the direct product of the groups
$\varphi_i(G)$, i = 1, 2, ..., r.
A transitive permutation group is called regular if the stabilizer of any
letter is the identity subgroup {1}. A transitive permutation group is
regular if and only if its order equals its degree. Any group can be
realized as a regular permutation group (Cayley's theorem). A transitive
permutation group which is Abelian is always regular.
Let G be a transitive permutation group on a set Ω. If the stabilizer of an
element a of Ω is a maximal subgroup, G is called primitive, and otherwise
imprimitive. A normal subgroup, which is ≠ {1}, of a primitive permutation
group is transitive. An imprimitive permutation group induces a
decomposition of the set Ω into the union of mutually disjoint subsets
$\Delta_1, \ldots, \Delta_s$ (s > 1) such that each $\Delta_i$ contains at
least two elements, and if x ∈ G maps an element a of $\Delta_i$ onto an
element b of $\Delta_j$, then x maps every element of $\Delta_i$ into
$\Delta_j$: $x(\Delta_i) = \Delta_j$. The set
$\{\Delta_1, \ldots, \Delta_s\}$ is called a system of im-

primitivity. A subset Δ of Ω is called a block if $x(\Delta) \cap \Delta$
equals Δ or the empty set for all x in G. A block is called nontrivial if
Δ ≠ Ω and Δ contains at least two elements. Each member of a system of
imprimitivity is a nontrivial block. A transitive permutation group is
primitive if and only if there is no nontrivial block.
A permutation group G on a set Ω is called k-transitive (or k-ply
transitive, where k is a natural number) if for two arbitrary k-tuples
$(a_1, \ldots, a_k)$ and $(b_1, \ldots, b_k)$ of distinct elements of Ω,
there is an element of G which maps $a_i$ into $b_i$ for all
i = 1, 2, ..., k. If k ≥ 2, G is called multiply transitive. A doubly
transitive permutation group is always primitive. The symmetric group $S_n$
of degree n is n-transitive, while the alternating group $A_n$ is
(n-2)-transitive for n ≥ 3. Conversely, an (n-2)-transitive permutation
group on {1, 2, ..., n} is either $S_n$ or $A_n$. For multiply transitive
permutation groups which are simple, see the list in Section I. If k ≥ 6, no
k-transitive groups are known at present except $S_n$ and $A_n$. If the
Schreier conjecture (→ Section I) is true, then there are no 6-transitive
permutation groups except $S_n$ and $A_n$ (Wielandt, Math. Z., 74 (1960); H.
Nagao, Nagoya J. Math., 27 (1964); O'Nan, Amer. Math. Soc. Notices, 20
(1973)).
Two 5-transitive permutation groups other than $S_n$ and $A_n$ are known:
the groups $M_{12}$ and $M_{24}$ of degrees 12 and 24, respectively,
discovered by E. L. Mathieu in 1861 and 1873. The stabilizer of a letter in
$M_{12}$ ($M_{24}$) is a 4-transitive permutation group of degree 11 (23),
denoted by $M_{11}$ ($M_{23}$). No 4-transitive permutation groups other
than $S_n$, $A_n$, $M_i$ (i = 11, 12, 23, and 24) are known. The groups
$M_{11}$, $M_{12}$, $M_{23}$, $M_{24}$ and the stabilizer $M_{22}$ of a
letter in $M_{23}$ are called Mathieu groups. They are simple groups which
have quite exceptional properties. For Mathieu groups, see E. Witt, Abh.
Math. Sem. Univ. Hamburg, 12 (1938). A k-transitive permutation group G on Ω
of degree n and order $n(n-1) \cdots (n-k+1)$ has the property that no
nonidentity element of G leaves k distinct letters of Ω invariant. If k ≥ 4,
such a group is one of the following: $S_n$, $A_{k+2}$, $M_{11}$, and
$M_{12}$ (C. Jordan). For k = 2 and 3, see H. Zassenhaus, Abh. Math. Sem.
Univ. Hamburg, 11 (1936).
A multiply transitive permutation group G contains a normal subgroup S such
that S is a non-Abelian simple group and G is isomorphic to a subgroup of
the group Aut S of the automorphisms of S, except when the degree n of G is
a power of a prime number and G contains a regular normal subgroup of order
n which is an †elementary Abelian group (W. S. Burnside). Furthermore, in
these exceptional cases, G is at most 2-transitive if n is odd, while G is
at most 3-transitive if n is even but more than 4. The symmetric group $S_4$
of degree 4 is the only 4-transitive group that contains a proper solvable
normal subgroup.
A transitive extension of a permutation group H on Ω is defined as follows.
Let ∞ be a new element not contained in Ω. A transitive extension G of H is
a transitive permutation group on the set {Ω, ∞} in which the stabilizer of
∞ is the given permutation group H on Ω. Transitive extensions do not exist
for some H. Suppose that a permutation group H admits a transitive extension
G that is primitive. If H is simple, then G is also simple unless the degree
of G is a power of a prime number. Constructing transitive extensions has
been an effective method for constructing sporadic simple groups.
Permutation groups of prime degree have been studied extensively since the
last century, partly because of their connection with algebraic equations of
prime degree. Let p be a prime number. A transitive permutation group of
degree p is either multiply transitive and nonsolvable or has a normal
subgroup of order p with factor group isomorphic to a cyclic group of order
dividing p - 1 (Burnside). Choose two cycles x and y of length p in $S_p$.
If y is not a power of x, then the subgroup ⟨x, y⟩ generated by x and y is a
multiply transitive permutation group which is simple. The structure of
⟨x, y⟩ is not known despite its simple definition. More attention has been
paid to groups of degree p, where p is a prime number such that
(p - 1)/2 = q is another prime number. The problem is to decide if such
nonsolvable groups contain the alternating group $A_p$. The Mathieu groups
$M_{11}$ and $M_{23}$ are the only known exceptions for p > 7. The search
for additional exceptions has been aided by the development of high-speed
computers. It is known that there is no exceptional group of degree
p = 2q + 1 for 23 < p ≤ 4079 (P. J. Nikolai and E. T. Parker, Math. Tables
Aids Comput., 12 (1958); see N. Ito, Bull. Amer. Math. Soc., 69 (1963), for
further results in this direction).
A Frobenius group is a nonregular transitive permutation group in which the
identity is the only element leaving more than one letter invariant. A
Frobenius group of degree n contains exactly n - 1 elements which displace
all the letters. These n - 1 elements together with the identity form a
regular normal subgroup of order n. This is a theorem of Frobenius; all the
existing proofs depend on the theory of characters. The regular normal
subgroup of a Frobenius group is nilpotent (J. G. Thompson, Proc. Nat. Acad.
Sci. US, 45

(1959); [15, ch. 10; 16, ch. 3]). A Zassenhaus group is a transitive extension of a Frobenius group.

I. Finite Simple Groups

All simple groups of finite order were completely classified in February 1982 (→ Section J; [23]). These are divided into the following four classes: (1) cyclic groups of prime order, (2) alternating groups of degree $\geq 5$, (3) simple groups of Lie type, and (4) other simple groups.
The subclass (1) consists of cyclic groups of prime order p for any prime number p. Abelian simple groups belong to this subclass. The subclass (4) consists of twenty-six sporadic simple groups including five Mathieu groups. All sporadic groups, other than the five Mathieu groups, are of recent discovery.
Simple groups of Lie type are analogs of simple Lie groups, and include the classical groups as well as the exceptional groups and the groups of twisted type.
Classical groups are divided into four types: †linear, †unitary, †symplectic, and †orthogonal (→ 60 Classical Groups). Let $q = p^r$ be a power of a prime number p. Consider a vector space V of dimension $n \geq 2$ over the field $F_q$ of q elements, except in the unitary case where V is a vector space of dimension $n \geq 2$ over the field $F_{q^2}$ of $q^2$ elements. Let f be a nondegenerate form on V which is †Hermitian in the unitary case (with respect to the automorphism of order 2 of $F_{q^2}$ over $F_q$), †skew symmetric bilinear in the symplectic case, and †quadratic in the orthogonal case. In the orthogonal case, the dimension of V is assumed $\geq 3$. Consider the group of all linear transformations of V (linear case) of determinant 1, or the group of all linear transformations of determinant 1 which leave the form f invariant (in other cases). In the orthogonal case, take the commutator subgroup. With each of these groups, the factor group of it by its center is a simple group with a few exceptions.
There are several notations to denote these groups. E. Artin's notation for simple groups, which is reasonably descriptive and simple, follows the name of the simple group: n and q are as described in the preceding paragraph, g is the order of the simple group, and (a, b) denotes the greatest common divisor of two natural numbers a and b.
Linear simple group, $L_n(q)$:

$g = q^{n(n-1)/2} \prod_{i=2}^{n} (q^i - 1)/d$, $d = (n, q-1)$.

Unitary group, $U_n(q)$:

$g = q^{n(n-1)/2} \prod_{i=2}^{n} (q^i - (-1)^i)/d$, $d = (n, q+1)$.

The structure of unitary groups does not depend on the form.
Symplectic group, $S_n(q)$, n = 2m:

$g = q^{m^2} \prod_{i=1}^{m} (q^{2i} - 1)/d$, $d = (2, q-1)$.

In the symplectic case, the dimension n of the space V must be even, so n = 2m, and the structure does not depend on the form.
Orthogonal group in odd dimension n = 2m + 1, $O_{2m+1}(q)$:

$g = q^{m^2} \prod_{i=1}^{m} (q^{2i} - 1)/d$, $d = (2, q-1)$.

The structure does not depend on the form in odd dimension.
Orthogonal groups in even dimension n = 2m: There are two inequivalent forms, one with †index m (which is maximal) and the other with index m - 1. The two orthogonal groups are denoted by $O_{2m}(\varepsilon, q)$, $\varepsilon = \pm 1$, where $\varepsilon = 1$ if the form is of maximal index and -1 otherwise. Then

$g = q^{m(m-1)} (q^m - \varepsilon) \prod_{i=1}^{m-1} (q^{2i} - 1)/d$, $d = (4, q^m - \varepsilon)$.

The value of $\varepsilon$ is determined by the form f: $\varepsilon = 1$ if f is equivalent to $\sum_{i=1}^{m} x_{2i-1} x_{2i}$, and $\varepsilon = -1$ if $f \sim x_1^2 + \gamma x_1 x_2 + x_2^2 + \sum_{i=2}^{m} x_{2i-1} x_{2i}$, where the polynomial $t^2 + \gamma t + 1$ is irreducible over $F_q$.
There are other ways to denote these groups. Let X = X(*, *) be a group of nonsingular linear transformations of a vector space V. Two asterisks indicate two invariants, such as the dimension of V and the number of elements in the ground field. The notation SX stands for the subgroup of X consisting of linear transformations with determinant 1, and the notation PX stands for the factor group of the linear group X by its center. Thus PX is a subgroup of the group of all projectivities of the projective space formed by the linear subspaces of V. The following list is self-explanatory, except the last term in each row, which is the notation of L. E. Dickson [1]:

$L_n(q) = PSL(n, q) = LF(n, q)$
$U_n(q) = PSU(n, q) = HO(n, q^2)$
$S_n(q) = PSp(n, q) = A(n, q)$.

(LF: linear fractional group; HO: hyperorthogonal group; A: Abelian linear group.) If f is a nondegenerate quadratic form, then the subgroup of GL(n, q) consisting of all the elements leaving the form f invariant may be denoted by O(n, q, f). Let $\Omega(n, q, f)$ denote the commutator subgroup of O(n, q, f). Set $\varepsilon = 1$ if f is of maximal index, and $\varepsilon = -1$ otherwise. Then

$O_n(\varepsilon, q) = P\Omega(n, q, f)$.

Dickson's notation for orthogonal groups is complicated and seldom used.
Finite simple groups corresponding to Lie groups of some exceptional type were studied by Dickson early in this century, but C. Chevalley (Tôhoku Math. J., (2) 7 (1955)) proved the existence, simplicity, and other properties of groups of any (exceptional) type over any field by a unified method. Simple Lie algebras over the field C of complex numbers are completely classified, and according to the classification theory they are in one-to-one correspondence with the †Dynkin diagrams. Let L be a simple Lie algebra (over C) corresponding to the †Dynkin diagram of type X (→ 248 Lie Algebras). Let $L = L_0 + \sum_\alpha L_\alpha$ be a Cartan decomposition of L, where $\alpha$ ranges over the †root system $\Delta$ of L. It is possible to choose a basis B of L with the following properties (†Chevalley's canonical basis): B consists of $e_\alpha \in L_\alpha$ ($\alpha \in \Delta$) and a basis of $L_0$; the structure constants of L with respect to B are all rational integers; the automorphism $x_\alpha(\xi)$ in the †adjoint group defined by

$x_\alpha(\xi) = \exp(\xi \, \mathrm{ad}\, e_\alpha)$ ($\xi \in C$)

maps each element of B into a linear combination of elements of B with coefficients which are polynomials in $\xi$ with integer coefficients. Thus the matrix $A_\alpha(\xi)$ representing the transformation $x_\alpha(\xi)$ with respect to B has coefficients which are polynomials in $\xi$ with integer coefficients.
The elements of B span a Lie algebra $L_Z$ over the ring Z of integers. Let F be a field and form $L_F = F \otimes_Z L_Z$. Then $L_F$ is a Lie algebra over F, and the set B may be identified with a basis of $L_F$ over F. Let t be any element of F, $A_\alpha(t)$ be the matrix obtained from $A_\alpha(\xi)$ by replacing the complex variable $\xi$ by the element t, and finally $x_\alpha(t)$ be the linear transformation of $L_F$ represented by the matrix $A_\alpha(t)$ with respect to B. The group generated by the $x_\alpha(t)$ for each root $\alpha$ and each element t of F is called the Chevalley group of type X over F. The commutator subgroups of the Chevalley groups are simple, with a few exceptions which will be stated after the complete list of simple groups of Lie type. Suppose that $X = A_n$, $D_n$, or $E_6$ (→ 248 Lie Algebras S). Then the Dynkin diagram of type X has a nontrivial symmetry; let it be $\alpha \to \bar\alpha$. Suppose that the field F has an automorphism $\sigma$ of the same order as the order of the symmetry of the Dynkin diagram. Let $\theta$ be the automorphism of the Chevalley group which sends $x_\alpha(t)$ to $x_{\bar\alpha}(t^\sigma)$. Let U (resp. V) be the subgroup of the Chevalley group generated by $x_\alpha(t)$ with $\alpha > 0$, $t \in F$ ($x_\beta(t)$, $\beta < 0$), and let $U^1$ ($V^1$) be the subgroup consisting of all the elements of U (V) which are left invariant by $\theta$. The group generated by $U^1$ and $V^1$ is called the group of twisted type. If the order of $\sigma$ is i, this group is said to be of twisted type $^iX$. In all but one case, the group of twisted type is simple (see R. Steinberg, Pacific J. Math., 9 (1959)). The value i is 2 except when $X = D_4$. Since $D_4$ admits symmetries of orders 2 and 3, there are two twisted types. If $X = B_2$, $G_2$, or $F_4$, then the diagram has a symmetry. If the characteristic p of the ground field F is 2, 3, or 2 according as $X = B_2$, $G_2$, or $F_4$ and the field F has an automorphism $\sigma$ such that $(t^\sigma)^\sigma = t^p$ for any $t \in F$, then a procedure similar to the one described before is applicable, and the group of twisted type $X'$ is obtained (R. H. Ree, Amer. J. Math., 83 (1961)). The group of twisted type is simple if the field F has more than three elements.
The following list contains all the simple groups of Lie type. For each classical group, we list the type followed by identification:

$A_n = L_{n+1}(q)$ ($n \geq 1$)
$^2A_n = U_{n+1}(q)$ ($n \geq 1$)
$B_n = O_{2n+1}(q)$ ($n > 1$)
$C_n = S_{2n}(q)$ ($n > 2$)
$D_n = O_{2n}(1, q)$ ($n > 3$)
$^2D_n = O_{2n}(-1, q)$ ($n > 3$)

For other groups the type of the group is followed by the customary name or notation, if any, and the order g:

$B_2'$: Suzuki group, Sz(q), $q = 2^{2n+1}$: $g = q^2(q-1)(q^2+1)$
$^3D_4$: $g = q^{12}(q^8+q^4+1)(q^6-1)(q^2-1)$
$G_2$: $g = q^6(q^6-1)(q^2-1)$
$G_2'$: Ree group, Re(q), $q = 3^{2n+1}$: $g = q^3(q^3+1)(q-1)$
$F_4$: $g = q^{24}(q^{12}-1)(q^8-1)(q^6-1)(q^2-1)$
$F_4'$: $q = 2^{2n+1}$: $g = q^{12}(q^6+1)(q^4-1)(q^3+1)(q-1)$
$E_6$: $dg = q^{36}(q^{12}-1)(q^9-1)(q^8-1)(q^6-1)(q^5-1)(q^2-1)$, $d = (3, q-1)$
$^2E_6$: $dg = q^{36}(q^{12}-1)(q^9+1)(q^8-1)(q^6-1)(q^5+1)(q^2-1)$, $d = (3, q+1)$
$E_7$: $dg = q^{63}(q^{18}-1)(q^{14}-1)(q^{12}-1)(q^{10}-1)(q^8-1)(q^6-1)(q^2-1)$, $d = (2, q-1)$
$E_8$: $g = q^{120}(q^{30}-1)(q^{24}-1)(q^{20}-1)(q^{18}-1)(q^{14}-1)(q^{12}-1)(q^8-1)(q^2-1)$.

$B_2'$: M. Suzuki, Proc. Nat. Acad. Sci. US, 46 (1960); $G_2'$ and $F_4'$: Ree, Amer. J. Math., 83 (1961); $G_2$: Dickson, Trans. Amer. Math. Soc., 2 (1901), Math. Ann., 60 (1905); other Chevalley groups: Chevalley, Tôhoku Math. J., (2) 7 (1955); twisted types: Steinberg, Pacific J. Math., 9 (1959), J. Tits, Séminaire Bourbaki (1958), Publ. Math. Inst. HES (1959), D. Hertzig, Amer. J. Math., 83 (1961), Proc. Amer. Math. Soc., 12 (1961).
Nonsimple cases: $L_2(2)$, $L_2(3)$, $U_3(2)$, and Sz(2) are solvable groups of orders 6, 12, 72, and 20, respectively. The groups $O_5(2)$, $G_2(2)$, $G_2'(3)$, and $F_4'(2)$ contain normal subgroups of indices 2, 2, 3, and 2, respectively. These normal subgroups are simple and identified as $L_2(9)$, $U_3(3)$, $L_2(8)$ in the first three cases. The normal subgroup of $F_4'(2)$ is not in the list of simple groups of Lie type and is quite exceptional (Tits's simple group, Ann. Math., (2) 80 (1964)).
Isomorphisms between various simple groups: $L_2(q) = U_2(q) = S_2(q) = O_3(q)$; $O_5(q) = S_4(q)$; $O_4(1, q) = L_2(q) \times L_2(q)$; $O_4(-1, q) = L_2(q^2)$; $O_6(1, q) = L_4(q)$; $O_6(-1, q) = U_4(q)$; $O_{2n+1}(q) = S_{2n}(q)$ if q is a power of 2; $L_2(2) = S_3$; $L_2(3) = A_4$; $L_2(4) = L_2(5) = A_5$; $L_2(7) = L_3(2)$; $L_2(9) = A_6$; $L_4(2) = A_8$; $U_4(2) = S_4(3)$. If q is odd and $2n \geq 6$, then $S_{2n}(q)$ and $O_{2n+1}(q)$ have the same order but are not isomorphic. $L_3(4)$ and $L_4(2)$ have the same order but are not isomorphic. There is no other isomorphism or coincidence of orders among the known simple groups (Artin, Comm. Pure Appl. Math., 8 (1955)).
The groups of the automorphisms of simple groups belonging to subclasses (1), (2), and (3) are known. For the simple groups of Lie type, see Steinberg, Canad. J. Math., 10 (1960).
The following list of twenty-six groups consists of all the simple groups that belong to class (4):
Five Mathieu groups whose orders are

$M_{11}$: $g = 7,920 = 2^4 \cdot 3^2 \cdot 5 \cdot 11$
$M_{12}$: $g = 95,040 = 2^6 \cdot 3^3 \cdot 5 \cdot 11$
$M_{22}$: $g = 443,520 = 2^7 \cdot 3^2 \cdot 5 \cdot 7 \cdot 11$
$M_{23}$: $g = 10,200,960 = 2^7 \cdot 3^2 \cdot 5 \cdot 7 \cdot 11 \cdot 23$
$M_{24}$: $g = 244,823,040 = 2^{10} \cdot 3^3 \cdot 5 \cdot 7 \cdot 11 \cdot 23$

The other twenty-one groups have been discovered since 1964. Each group is identified by the symbol $(x)_i$, indicating that it is the ith group discovered in the year 19x. The list continues with the name or names of discoverers, the order of the group, and a brief description.
$(64)_1$: Z. Janko, $g = 175,560 = 2^3 \cdot 3 \cdot 5 \cdot 7 \cdot 11 \cdot 19$. A subgroup of the Chevalley group $G_2(11)$. See J. Algebra, 3 (1966).
$(67)_1$: M. Hall and Z. Janko, $g = 604,800 = 2^7 \cdot 3^3 \cdot 5^2 \cdot 7$, a transitive extension of $U_3(3)$ of degree 100.
$(67)_2$: D. G. Higman and Sims, $g = 44,352,000 = 2^9 \cdot 3^2 \cdot 5^3 \cdot 7 \cdot 11$, a transitive extension of $M_{22}$ of degree 100. It is a normal subgroup of index 2 in the group of automorphisms of a certain graph with 100 vertices.
$(67)_3$: Suzuki, $g = 448,345,497,600 = 2^{13} \cdot 3^7 \cdot 5^2 \cdot 7 \cdot 11 \cdot 13$, a transitive extension of $G_2(4)$; defined from the automorphism group of a graph of 1782 vertices.
$(67)_4$: J. McLaughlin, $g = 898,128,000 = 2^7 \cdot 3^6 \cdot 5^3 \cdot 7 \cdot 11$, a transitive extension of $U_4(3)$; defined from a graph of 275 vertices.
$(68)_1$: G. Higman, Z. Janko, and J. McKay, $g = 50,232,960 = 2^7 \cdot 3^5 \cdot 5 \cdot 17 \cdot 19$, a transitive extension of the group which is obtained from $L_2(16)$ by adjoining the field automorphism of order 2. The existence was verified by using a computer.
$(68)_2$, $(68)_3$, $(68)_4$: J. H. Conway,

$g = 2^{21} \cdot 3^9 \cdot 5^4 \cdot 7^2 \cdot 11 \cdot 13 \cdot 23 = 4,157,776,806,543,360,000$,
$g = 2^{18} \cdot 3^6 \cdot 5^3 \cdot 7 \cdot 11 \cdot 23$,
$g = 2^{10} \cdot 3^7 \cdot 5^3 \cdot 7 \cdot 11 \cdot 23$.

The big group is obtained from the automorphism group of a lattice in 24-dimensional space, and the two smaller ones are subgroups of it. The lattice was defined by J. Leech in connection with a problem of close packing of spheres in 24 dimensions (Canad. J. Math., 19 (1967)).
$(68)_5$: B. Fischer, $g = 2^{17} \cdot 3^9 \cdot 5^2 \cdot 7 \cdot 11 \cdot 13 = 64,561,751,654,400$, a transitive extension of $U_6(2)$ derived by means of a certain graph.
$(69)_1$: D. Held and others, $g = 2^{10} \cdot 3^3 \cdot 5^2 \cdot 7^3 \cdot 17 = 4,030,387,200$.
$(69)_2$: B. Fischer, $g = 2^{18} \cdot 3^{13} \cdot 5^2 \cdot 7 \cdot 11 \cdot 13 \cdot 17 \cdot 23 = 4,089,470,473,293,004,800$.
$(69)_3$: B. Fischer, $g = 2^{21} \cdot 3^{16} \cdot 5^2 \cdot 7^3 \cdot 11 \cdot 13 \cdot 17 \cdot 23 \cdot 29 = 1,255,205,709,190,661,721,292,800$.
$(71)_1$: R. N. Lyons and C. C. Sims, $g = 2^8 \cdot 3^7 \cdot 5^6 \cdot 7 \cdot 11 \cdot 31 \cdot 37 \cdot 67$.
The existence of $(69)_1$ and $(71)_1$ was verified by
using computers; $(69)_2$ and $(69)_3$ were derived by means of certain graphs. For a more detailed account of these simple groups, see J. Tits, Séminaire Bourbaki (1970), No. 375, and the references [18, 19, 20].
$(72)_1$: A. Rudvalis, $g = 2^{14} \cdot 3^3 \cdot 5^3 \cdot 7 \cdot 13 \cdot 29$, a transitive extension of Tits's simple group, i.e., a normal subgroup of $F_4'(2)$ of index 2. Concerning this group $(72)_1$, see the article of J. H. Conway and D. B. Wales, J. Algebra, 27 (1973).
$(73)_1$: M. O'Nan, $g = 2^9 \cdot 3^4 \cdot 5 \cdot 7^3 \cdot 11 \cdot 19 \cdot 31$. This group $(73)_1$ was discovered by O'Nan and the existence was verified by C. Sims, using a computer.
$(73)_2$: B. Fischer, $g = 2^{41} \cdot 3^{13} \cdot 5^6 \cdot 7^2 \cdot 11 \cdot 13 \cdot 17 \cdot 19 \cdot 23 \cdot 31 \cdot 47$.
$(73)_3$: B. Fischer and R. Griess, $g = 2^{46} \cdot 3^{20} \cdot 5^9 \cdot 7^6 \cdot 11^2 \cdot 13^3 \cdot 17 \cdot 19 \cdot 23 \cdot 29 \cdot 31 \cdot 41 \cdot 47 \cdot 59 \cdot 71$.
$(74)_1$: J. G. Thompson, $g = 2^{15} \cdot 3^{10} \cdot 5^3 \cdot 7^2 \cdot 13 \cdot 19 \cdot 31$.
$(74)_2$: K. Harada, $g = 2^{14} \cdot 3^6 \cdot 5^6 \cdot 7 \cdot 11 \cdot 19$.
The existence of $(73)_2$ was suggested by B. Fischer, and then that of $(73)_3$ by B. Fischer and R. Griess. Shortly after this, the existence of $(74)_1$ and $(74)_2$ was suggested by J. G. Thompson, and then Thompson and Harada proved the existence of these groups with the aid of P. Smith and S. Norton, using computers. The existence of $(73)_2$ was established in 1976 by Leon and Sims, using a computer. Very recently (July 1980) R. Griess has announced that the group $(73)_3$ is realized as a group of automorphisms of a 196,883-dimensional commutative nonassociative algebra over the rational numbers.
$(75)_1$: Z. Janko, $g = 2^{21} \cdot 3^3 \cdot 5 \cdot 7 \cdot 11^3 \cdot 23 \cdot 29 \cdot 31 \cdot 37 \cdot 43$. This group $(75)_1$ was discovered by Z. Janko. The existence was verified by using a computer.
For a more detailed account of the twenty-six sporadic simple groups, see [21].
Simple groups of order < 1000 are $A_5$ (g = 60), $L_2(7)$ (168), $A_6$ (360), $L_2(8)$ (504), and $L_2(11)$ (660). All simple groups of order < 20,000 are known.
Among the known simple groups, the following multiply transitive permutation representations are known: Alternating groups $A_n$ (degree n), $A_7$ and $A_8$ (degree 15), $A_6$ (degree 10), $A_5$ (degree 6), Mathieu groups $M_i$ (degree i), $M_{11}$ (degree 12), $L_n(q)$ (degree $(q^n - 1)/(q - 1)$), $L_2(p)$ (degree p for p = 5, 7, 11), $U_3(q)$ (degree $1 + q^3$), Sz(q) (degree $1 + q^2$), Re($3^n$) (degree $1 + 3^{3n}$), $S_{2n}(2)$ (degrees $2^{n-1}(2^n \pm 1)$), the Higman-Sims group $(67)_2$ (degree 176), and the Conway group $(68)_4$ (degree 276). Among them, $A_n$ (degree n, $n \geq 5$), $M_i$ (degree i), $M_{11}$ (degree 12), $L_2(2^m)$ (degree $1 + 2^m$), and $L_2(5)$ (degree 5) are triply transitive.
There are several remarkable properties of known finite simple groups which have been conjectured to hold for arbitrary finite simple groups. One of the most famous is the Schreier conjecture, which asserts that the group of outer automorphisms of a simple group is solvable. This has been verified for all known cases. Another conjecture says that a finite simple group is generated by two elements. This has also been verified for almost all known groups. In many cases, there is a generating set of two elements, one of which has order 2. There is no counterexample known to disprove the universal validity of this property. Except for Sz(q), the orders of known simple groups are divisible by 12.

J. Classification of Finite Simple Groups

The objective of classification theory is to find the complete list of finite simple groups; this was accomplished in February 1982, following the series of important works mentioned below.
The order of a finite non-Abelian simple group is divisible by at least three distinct prime numbers (W. S. Burnside; → Section D). The order of a finite non-Abelian simple group is even (W. Feit and J. G. Thompson; → Section D). These theorems are special cases of the following theorem: If G is a finite non-Abelian simple group in which the normalizer of any solvable subgroup $\neq \{1\}$ is solvable, then $G = L_2(q)$ (q > 3), Sz($2^{2n+1}$) ($n \geq 1$), $A_7$, $L_3(3)$, $U_3(3)$, $M_{11}$, or Tits's simple group. In particular, a minimal simple group is isomorphic to $L_2(p)$ ($p \equiv 2$ or 3 (mod 5), p > 3), $L_2(2^p)$, $L_2(3^p)$, Sz($2^p$), or $L_3(3)$, where p is a prime number (a finite non-Abelian simple group is called a minimal simple group if all proper subgroups are solvable). This theorem is proved in a series of papers by J. G. Thompson (Bull. Amer. Math. Soc., 74 (1968), Pacific J. Math., 33 (1970), 39 (1971), 48 (1973), 50 (1974), and 51 (1974)). The method which Thompson used in these papers has since been generalized in various ways by many authors to establish a number of important theorems. There are also some interesting consequences of this theorem concerning solvable groups. For example, a finite group is solvable if and only if every pair of elements generates a solvable subgroup.
Let G be a non-Abelian simple group of even order and S one of its 2-Sylow subgroups. Then S is neither cyclic nor a generalized quaternion group (W. S. Burnside, R. Brauer, and M. Suzuki). If S is a dihedral group, then $G = A_7$ or $L_2(q)$ (q odd $\geq 5$) (D. Gorenstein and J. H. Walter). These theorems deal with the
cases where S is "small." The study in this direction has culminated in the classification of finite simple groups all of whose 2-subgroups are generated by at most four elements (D. Gorenstein and K. Harada, Mem. Amer. Math. Soc., 147 (1974)).
If a 2-Sylow subgroup of a finite non-Abelian simple group G is an Abelian group, then $G = L_2(r)$ ($r \equiv 0$, 3 or 5 (mod 8), r > 3), or else G possesses an element of order 2 whose centralizer is isomorphic to $Z_2 \times L_2(q)$ ($q \equiv 3$ or 5 (mod 8), q > 3) (J. H. Walter). In the latter case, G is called a group of Janko-Ree type (J-R type for short), and if $q \neq 5$, it is called a group of Ree type. If q = 5, then $G = (64)_1$, Janko's simple group of order 175,560. The Ree groups Re(q) are groups of Ree type. Since the discovery of Re(q), it has long been an open problem to show that there are no other groups of Ree type. Very recently, the combined work of E. Bombieri, J. G. Thompson and others settled the problem (Inventiones Math., 58 (1980)).
A surprisingly short proof of Walter's theorem above is given by H. Bender (Math. Z., 117 (1970)). Bender's method applies to a much larger class of groups. A subgroup A of a finite group G is said to be strongly closed if $A^g \cap N_G(A) \subset A$ for each $g \in G$. D. M. Goldschmidt proved (Ann. Math., 99 (1974)) that if A is a strongly closed Abelian 2-subgroup, then the subgroup $G_0$ generated by the conjugates of A possesses a normal series $G_0 \supseteq G_1 \supseteq G_2$ with the properties: $G_2$ is of odd order, $G_1/G_2$ is a 2-group and is contained in the center of $G_0/G_2$, and either $G_0 = G_1$ or $G_0/G_1$ is the direct product of simple groups on the following list: $L_2(q)$ ($q \equiv 0$, 3 or 5 (mod 8), q > 3), Sz($2^{2n-1}$), $U_3(2^n)$ (n > 1), and the groups of J-R type. Furthermore, $AG_2/G_2 \supseteq G_1/G_2$ and $AG_1/G_1$ is the center of a 2-Sylow subgroup of $G_0/G_1$. This theorem generalizes an earlier result of G. Glauberman (the so-called Z*-theorem) which states that if A is a strongly closed subgroup of order 2, then the image of A in the quotient group G/K by the maximal normal subgroup of odd order is a normal subgroup and hence is contained in the center of G/K. These theorems of Glauberman and Goldschmidt are of fundamental importance in the study of finite simple groups since they provide an effective tool for showing that a given group has a normal Abelian 2-subgroup.
Glauberman obtained a criterion for the existence of a strongly closed Abelian p-subgroup $\neq \{1\}$ for some prime p [15, 22]. For any prime p, the quadratic group Qd(p) is defined to be the semidirect product of the 2-dimensional vector space V(2, p) over the field of p elements by the special linear group SL(2, p) where the action of SL(2, p) on V(2, p) is taken to be the natural one. A finite group G contains a strongly closed Abelian p-subgroup $A \neq \{1\}$ if no section of G is isomorphic to Qd(p) (a section is a quotient group of a subgroup). Furthermore, if p is odd, then we can choose as A a characteristic subgroup of a p-Sylow subgroup S of G which is determined only by the structure of S. Therefore a finite non-Abelian simple group has a section isomorphic to $Qd(2) = S_4$ except when it is one of the simple groups mentioned in Goldschmidt's theorem. This theorem generalizes an unpublished result of J. G. Thompson to the effect that 3 divides the order of finite non-Abelian simple groups except Sz($2^{2n+1}$).
Let G be a 2-transitive permutation group on n + 1 letters, and assume that the stabilizer H of a letter contains a normal subgroup K which is regular on the remaining n letters. Then G contains a normal subgroup N such that G is isomorphic to a subgroup of the automorphism group of N and either $N = L_2(q)$, Sz(q), $U_3(q)$ or a group of Ree type, or else N is 2-transitive on the n + 1 letters and no nonidentity element of N leaves two distinct letters invariant (the structure of N in the latter case is also known (H. Zassenhaus; → Section H)). This theorem is proved by E. Shult for n even (Illinois J. Math., 16 (1972)) and by C. Hering, W. M. Kantor, and G. M. Seitz for n odd (J. Algebra, 20 (1972)). Its proof depends on the work of many authors who considered various special cases, especially the work of H. Zassenhaus, W. Feit, N. Ito, and M. Suzuki on the classification of Zassenhaus groups, and the work of M. Suzuki on the case where n is even and H/K is of odd order (Ann. Math., 79 (1964)). In this special case which Suzuki handles, the stabilizer H is of even order and $H \cap H^g$ is of odd order for any $g \in G - H$. If a proper subgroup H of an arbitrary finite group G has this property, then H is called a strongly embedded subgroup. Extending the work of Suzuki, H. Bender proved the following theorem (J. Algebra, 17 (1971)): If a finite group G possesses a strongly embedded subgroup, then either (i) a 2-Sylow subgroup of G is cyclic or a generalized quaternion group, or (ii) G possesses a normal series $G = G_0 \supseteq G_1 \supseteq G_2$ such that $G_0/G_1$ and $G_2$ are of odd order and $G_1/G_2 = L_2(2^n)$, $U_3(2^n)$, or Sz($2^{2n-1}$) (n > 1). This theorem generalizes another theorem of Suzuki who reached the same conclusion under the assumption that two distinct 2-Sylow subgroups have only the identity element in common. Bender's theorem is of fundamental importance in the classification theory of finite simple groups, since a strongly embedded subgroup often appears as an obstacle to the proofs of classification
theorems. For a generalization of Bender's theorem, see a paper by M. Aschbacher (Proc. Amer. Math. Soc., 38 (1973)), which also contains an alternative proof of Shult's theorem.
The theorem of Shult, Hering, Kantor, and Seitz may be interpreted as a classification of finite groups having a split (B, N)-pair of rank 1. Let G be a finite group and let B and N be subgroups of G such that (i) B and N generate G, (ii) $T = B \cap N$ is a normal subgroup of N, and (iii) W = N/T is generated by a set S of elements of order 2 such that $sBs \neq B$ and $sBw \subset BwB \cup BswB$ for each $s \in S$ and each $w \in W$. The subgroups B and N are called a (B, N)-pair of G (the quadruplet (G, B, N, S) is called a Tits system), and the cardinality of the set S is called the rank of the (B, N)-pair. The (B, N)-pair is said to be split if B has a normal subgroup U such that B = TU and $T \cap U = \{1\}$, and is said to be saturated if $T = \bigcap_{n \in N} B^n$. If a finite group G has a split saturated (B, N)-pair of rank 1, and if $Z = \bigcap_{g \in G} B^g$, then G/Z is a 2-transitive permutation group satisfying the assumption of the theorem of Shult, Hering, Kantor, and Seitz, and information is obtained on the structure of G. In general, the simple groups of Lie type are characterized as simple groups with certain (B, N)-pairs. For (B, N)-pairs of rank 2, see papers by P. Fong and G. M. Seitz (Inventiones Math., 21 (1973), 24 (1974)). J. Tits has developed a satisfactory theory on finite groups having a (B, N)-pair of rank at least 3 (Lecture notes in math. 386, Springer).
Let G be a finite group generated by a conjugate class D of elements of order 2, and let $\pi$ be the set of positive integers consisting of the orders of the products of two distinct elements of D. Furthermore, assume that G has no nontrivial solvable normal subgroup. B. Fischer proved (Inventiones Math., 13 (1971)) that if $\pi = \{2, 3\}$, then G contains a normal subgroup isomorphic to one of the following groups: $A_n$, $S_{2n}(2)$, $O_{2n}(\pm 1, 2)$, $O_{2n}(\pm 1, 3)$, $U_n(2)$, and the three Fischer's simple groups $(68)_5$, $(69)_2$, $(69)_3$. For a generalization of this theorem, see papers by M. Aschbacher (Math. Z., 127 (1972), J. Algebra, 26 (1973)). The most powerful result in this direction is given by F. Timmesfeld (J. Algebra, 33 (1975), 35 (1975)): Suppose $\pi$ consists of 2, 4, and odd positive integers. Furthermore, assume that if d and e are in D and de is of order 4, then $(de)^2 \in D$. Then $G = A_6$, $U_3(3)$, the Hall-Janko group $(67)_1$, $L_n(q)$ ($n \geq 3$), $O_{2n+1}(q)$ ($n \geq 3$), $O_{2n}(\pm 1, q)$ ($n \geq 4$), $G_2(q)$, $^3D_4(q)$, $F_4(q)$, $^2E_6(q)$, $E_6(q)$, $E_7(q)$, or $E_8(q)$, where $q = 2^n$.
Let G be a simple group of even order and let H be the centralizer of an element of order 2. Then the order of G is bounded by a function f of the order h of H (for example, we can choose $f(h) = \{h(h+1)\}!$; R. Brauer and K. A. Fowler, Ann. Math., 62 (1955)). In particular, there exist only finitely many isomorphism classes of finite simple groups which contain an element of order 2 with a given centralizer H. This fact is a ground for Brauer's program of studying simple groups of even order in terms of the structure of the centralizers of elements of order 2. There are a number of important results concerning Brauer's program [20]. For example, nine sporadic simple groups were discovered in related works. Since 1973, Brauer's program has been improved greatly by M. Aschbacher, D. Gorenstein, and others, and the classification of finite simple groups was finally completed in February 1982 [21, 23].

References

[1] L. E. Dickson, Linear groups with an exposition of the Galois field theory, Teubner, 1901 (Dover, 1958).
[2] W. Burnside, Theory of groups of finite order, Cambridge Univ. Press, second edition, 1911.
[3] H. Zassenhaus, Lehrbuch der Gruppentheorie, Teubner, 1937; English translation, The theory of groups, Chelsea, 1958.
[4] A. Speiser, Die Theorie der Gruppen von endlicher Ordnung, Springer, third edition, 1937.
[5] W. Specht, Gruppentheorie, Springer, 1956.
[6] M. Suzuki, Structure of a group and the structure of its lattice of subgroups, Erg. Math., Springer, 1956.
[7] H. S. M. Coxeter and W. O. J. Moser, Generators and relations for discrete groups, Erg. Math., Springer, 1957.
[8] M. Hall, The theory of groups, Macmillan, 1959.
[9] C. W. Curtis and I. Reiner, Representation theory of finite groups and associative algebras, Interscience, 1962.
[10] W. R. Scott, Group theory, Prentice-Hall, 1964.
[11] M. Hall and J. K. Senior, The groups of order $2^n$ ($n \leq 6$), Macmillan, 1964.
[12] H. Wielandt, Finite permutation groups, Academic Press, 1964.
[13] J. Dieudonné, Sur les groupes classiques, Actualités Sci. Ind., Hermann, 1948.
[14] J. Dieudonné, La géométrie des groupes classiques, Erg. Math., Springer, 1955.
[15] D. Gorenstein, Finite groups, Harper & Row, 1968. Second edition, Chelsea, 1980.
[16] D. Passman, Permutation groups, Benjamin, 1968.
[17] B. Huppert, Endliche Gruppen, Springer, 1967.
[18] R. Brauer and C. H. Sah (eds.), Theory of finite groups, a symposium, Benjamin, 1969.
[19] M. B. Powell and G. Higman (eds.), Finite simple groups, Academic Press, 1971.
[20] W. Feit, The current situation in the theory of finite simple groups, Actes Congr. Intern. Math., 1970, Nice, Gauthier-Villars, vol. 1, p. 55-93.
[21] D. Gorenstein, The classification of finite simple groups I, Bull. Amer. Math. Soc., 1 (1979), 43-199.
[22] G. Glauberman, Factorization in local subgroups of finite groups, Regional conference series in mathematics, no. 33 (1977).
[23] D. Gorenstein, Finite simple groups, Plenum, 1982.

152 (VII.18)
Finsler Spaces

A. Definitions

Let T(M) be the †tangent vector bundle of an n-dimensional †differentiable manifold M. An element of T(M) is denoted by (x, y), where x is a point of M and y is a †tangent vector of M at x. Given a †local coordinate system $(x^1, \ldots, x^n)$ of M, we can obtain a local coordinate system of T(M) by regarding $(x^1, \ldots, x^n, y^1, \ldots, y^n) = (x^i, y^i)$ as coordinates of the pair $(x, y) \in T(M)$, where $(x^1, \ldots, x^n)$ are coordinates of a point x of M and $y = \sum_j y^j \partial/\partial x^j$. A continuous real-valued function L(x, y) defined on T(M) is called a Finsler metric if the following conditions are satisfied: (i) L(x, y) is differentiable at $y \neq 0$; (ii) $L(x, \lambda y) = |\lambda| L(x, y)$ for any element (x, y) of T(M) and any real number $\lambda$; and (iii) if we put $g_{ij}(x, y) = (1/2) \partial^2 L(x, y)^2 / \partial y^i \partial y^j$, the symmetric matrix $(g_{ij}(x, y))$ is positive definite. A differentiable manifold with a Finsler metric is called a Finsler space. There exists a Finsler metric on a manifold M if and only if M is †paracompact. We call $F(x, y) = L(x, y)^2$ the fundamental form of the Finsler space. When F(x, y) is a quadratic form of $(y^1, \ldots, y^n)$, L(x, y) is a †Riemannian metric, and $F(x, y) = \sum_{i,j} g_{ij}(x) y^i y^j$. Therefore a Finsler space is a Riemannian space if and only if $g_{ij}$ does not depend on y. The matrix $g_{ij}$ is also called the fundamental tensor of the Finsler space (i, j = 1, ..., n).
Thus the notion of a Finsler metric is an extension of that of Riemannian metric. The study of differentiable manifolds utilizing such generalized metrics was considered by B. Riemann, but he stated that a Riemannian metric is more convenient for the purpose since "only nongeometrical results can be obtained" by using Finsler metrics [7]. P. Finsler initiated the systematic study of Finsler metrics and extended to a Finsler space many concepts and theorems valid in the classical theory of curves and surfaces [S].

B. The Finsler Metric

In a Finsler space, the arc length of a curve x = x(t) ($a \leq t \leq b$) is given by $\int_a^b L(x, dx/dt)\,dt$. Therefore a †geodesic in a Finsler space is defined as a †stationary curve for the problem of †variation $\delta \int_a^b L(x, dx/dt)\,dt = 0$, and the differential equation of the geodesic is given by

$\frac{d^2 x^i}{ds^2} + \sum_{j,k} \gamma^i_{jk}\left(x, \frac{dx}{ds}\right) \frac{dx^j}{ds} \frac{dx^k}{ds} = 0$,

where s is the arc length and $\gamma^i_{jk}(x, y)$ is the †Christoffel symbol of $g_{ij}$, i.e.,

$\gamma^i_{jk}(x, y) = \frac{1}{2} \sum_l g^{il} \left( \frac{\partial g_{jl}}{\partial x^k} + \frac{\partial g_{lk}}{\partial x^j} - \frac{\partial g_{jk}}{\partial x^l} \right)$,

where $(g^{ij}(x, y))$ is the inverse matrix of $(g_{ij}(x, y))$.
The distance between two points in a Finsler space is defined, as in a Riemannian space, as the infimum of the lengths of curves joining the two points. Many properties of Riemannian spaces as metric spaces can be extended to Finsler spaces. The topology defined by the Finsler metric coincides with the original topology of the manifold. A Finsler space M is said to be †complete if every Cauchy sequence of M as a metric space is convergent. The following three conditions are equivalent: (i) M is complete; (ii) each bounded closed subset of M is compact; (iii) each geodesic in M is infinitely extendable. In a complete Finsler space, any two points can be joined by the shortest geodesic.
A diffeomorphism $\varphi$ of a Finsler space M preserves the distance between an arbitrary pair of points if and only if the transformation on T(M) induced by $\varphi$ preserves the Finsler metric L(x, y). Such a transformation is called an †isometry of the Finsler space. In the †compact-open topology the set of all isometries of a Finsler space is a †Lie transformation group of dimension at most n(n + 1)/2. If a Finsler space admits the isometry group of dimension greater than (n(n - 1) + 2)/2, it is a Riemannian space of constant curvature [S].

C. The Theory of Connections

An important difference between a Finsler space and a Riemannian space relates to their
properties with respect to the theory of †connections. In the case of a Riemannian space, the Christoffel symbols constructed from the fundamental tensor are exactly the coefficients of a connection, whereas in the case of a Finsler space, the Christoffel symbols $\gamma^{i}_{jk}(x, y)$ do not define a connection, for the fundamental tensor $g_{ij}$ depends not only on the points of the space but also on the directions of tangent vectors at these points.
When we consider notions such as tensors, etc., in a Finsler space M, it is generally more convenient to take the whole tangent vector bundle T(M) into consideration rather than restricting ourselves to the space M. For example, let P be the †tangent n-frame bundle over a Finsler space M and $Q = p^{-1}(P)$ be the †principal fiber bundle over T(M) induced from P by the projection p of T(M) onto M. We call the elements of fiber bundles associated with Q tensors. In this sense, the fundamental tensor $g_{ij}$ in a Finsler space is the covariant tensor field of order 2. Therefore it is natural to consider a connection in a Finsler space as a connection in the principal fiber bundle Q. The connection in a Finsler space defined by E. Cartan is exactly of this type [3]. Namely, he showed that by assigning to a connection in Q certain conditions related to the Finsler metric, we can determine uniquely a connection from the fundamental tensor so that the covariant differential of the fundamental tensor vanishes.
Cartan's introduction of the notion of connection produced a development in the theory of Finsler spaces that parallels the development in the theory of Riemannian spaces, and many important results have been obtained. O. Varga (1941) succeeded in obtaining a Cartan connection in a simpler way by using the notion of osculating Riemannian space. S. S. Chern (1943) studied general Euclidean connections that contain Cartan connections as a special case. Noticing that the tangent space of a Finsler space is a †normed linear space, H. Rund (1950) obtained many notions different from those of Cartan. However, as far as the theory of connections is concerned, the two theories do not seem to be essentially different. The theory of curvature in a Finsler space is more complicated than that in a Riemannian space because we have three curvature tensors in the Cartan connection. Using the fact that, in a local cross section of the tangent vector bundle of a Finsler space, a Riemannian metric can be introduced by the Finsler metric, L. Auslander (1955) [2] extended to Finsler spaces the results of J. L. Synge and S. B. Myers on the curvature and topology of Riemannian spaces. A. Lichnérowicz extended the Gauss-Bonnet formula to Finsler spaces by considering an integral on the subbundle of the tangent vector bundle satisfying L(x, y) = 1 [6]. M. H. Akbar-Zadeh studied †holonomy groups and transformation groups of Finsler spaces by using the theory of fiber bundles.
Connections of Finsler spaces have been investigated by many geometers, but most of them used methods considerably different from those of the modern theory of connections in principal fiber bundles. J. H. Taylor and Synge (1925) defined the covariant differential of a vector field along a curve. L. Berwald (1926) defined a connection from the point of view of the general geometry of paths. A curve on a manifold satisfying the differential equation
[...]
is called a path. The theory was originated by O. Veblen and T. Y. Thomas and generalized as above by J. Douglas. Characteristically, with respect to a Berwald connection, the covariant differential of the fundamental tensor does not vanish.
A Finsler space is a space endowed with a metric for line elements. As a dual concept, we have a Cartan space, which is endowed with a metric for areal elements [4]. A. Kawaguchi (1937) extended these notions further and studied a space of line elements of higher order (or Kawaguchi space).

References

[1] M. H. Akbar-Zadeh, Les espaces de Finsler et certaines de leurs généralisations, Ann. Sci. Ecole Norm. Sup., (3) 80 (1963) 1-79.
[2] L. Auslander, On curvature in Finsler geometry, Trans. Amer. Math. Soc., 79 (1955) 378-388.
[3] E. Cartan, Les espaces de Finsler, Actualités Sci. Ind., Hermann, 1934.
[4] E. Cartan, Les espaces métriques fondés sur la notion d'aire, Actualités Sci. Ind., Hermann, 1933.
[5] P. Finsler, Über Kurven und Flächen in allgemeinen Räumen, dissertation, Göttingen, 1918.
[6] A. Lichnérowicz, Quelques théorèmes de géométrie différentielle globale, Comment. Math. Helv., 22 (1949) 271-301.
[7] B. Riemann, Über die Hypothesen, welche der Geometrie zu Grunde liegen, Habilitationsschrift, 1854. (Gesammelte mathematische Werke, Teubner, 1876, 254-270; Dover, 1953.)
[8] H. C. Wang, On Finsler spaces with completely integrable equations of Killing, J. London Math. Soc., 22 (1947) 5-9.
[9] H. Rund, The differential geometry of Finsler spaces, Springer, 1959.
[10] M. Matsumoto, The theory of Finsler connections, Publ. Study Group Geom. 5, Okayama Univ., 1970.

153 (IX.7)
Fixed-Point Theorems

A. General Remarks

Given a mapping f of a space X into itself, a point x of X is called a fixed point of f if f(x) = x. When X is a topological space and f is a continuous mapping, we have various theorems concerning the fixed points of f.

B. Fixed-Point Theorems for Polyhedra

(1) Brouwer Fixed-Point Theorem. Let X be a †simplex and $f: X \to X$ a continuous mapping. Then f has a fixed point in X (Math. Ann., 69 (1910), 71 (1912)).

(2) Lefschetz Fixed-Point Theorem. Let $H_p(X)$ be the p-dimensional †homology group of a finite polyhedron X (with integral coefficients), $T_p(X)$ the †torsion subgroup of $H_p(X)$, and $B_p(X) = H_p(X)/T_p(X)$. The continuous mapping $f: X \to X$ naturally induces a homomorphism $f_*$ of the free Z-module $B_p(X)$ into itself. Let $\alpha_p$ be the †trace of $f_*$, and $\Lambda_f = \sum_{p=0}^{n} (-1)^p \alpha_p$ (n = dim X). We call this integer $\Lambda_f$ the Lefschetz number of f.
We have the Lefschetz fixed-point theorem: (i) Let f, g be continuous mappings sending X into itself. If f, g are †homotopic ($f \simeq g$), then $\Lambda_f = \Lambda_g$. (ii) If $\Lambda_f \neq 0$, then f has at least one fixed point in X (Trans. Amer. Math. Soc., 28 (1926)).
The condition $\Lambda_f \neq 0$ is, however, not necessary for the existence of a fixed point of f. The Brouwer fixed-point theorem is obtained immediately from (i) and (ii). In particular, if the mapping f is homotopic to the identity mapping $1_X$, then $\alpha_p$ is the pth †Betti number of X, and $\Lambda_f$ is equal to the †Euler characteristic $\chi(X)$ of X. Hence, in this case, if $\chi(X) \neq 0$, then f has a fixed point.
When X is a compact oriented manifold without boundary the Lefschetz number $\Lambda_f$ of f can be interpreted as the †intersection number of the graph of f and the diagonal of X. More generally, let X and Y be compact oriented n-dimensional manifolds without boundary. If f and $g: X \to Y$ are continuous mappings, a point x of X such that f(x) = g(x) is called a coincidence point of f and g. The intersection number $\Lambda_{f,g}$ of the graph of f and that of g is called the coincidence number of f and g. If $\Lambda_{f,g} \neq 0$, then f and g have at least one coincidence point. The coincidence number $\Lambda_{f,g}$ is also expressed as $\sum_{p=0}^{n} (-1)^p \operatorname{tr}(f^* \circ g_! \mid H^p(X))$, where $g_!: H^p(X) \to H^p(Y)$ is the †Gysin homomorphism of g.
Suppose that a finite group G acts on the manifolds X and Y. If $f: X \to Y$ is a mapping, a point x of X such that $\tau f(x) = f\tau(x)$ for all $\tau \in G$ is called an equivariant point of f. When G is a group of order 2 and acts on X nontrivially, the equivariant point index $\hat{\Lambda}_f$ is employed. This index was introduced by Nakaoka (Japan. J. Math. 4 (1978)), using the †equivariant cohomology. It has the property that $\hat{\Lambda}_f \neq 0$ implies that f admits an equivariant point. The prototype of this theorem is the Borsuk-Ulam theorem (Fund. Math. 20 (1933)), which states that a continuous mapping $f: S^n \to \mathbf{R}^n$ always admits a point $x \in S^n$ such that f(x) = f(-x).

(3) Lefschetz Number and Fixed-Point Indices. Suppose that |K| is an n-dimensional homogeneous polyhedron (i.e., any simplex of K that is not a face of another simplex of K is of dimension n), and $f: |K| \to |K|$ is a continuous mapping. Then there exists a continuous mapping $g: |K| \to |K|$ homotopic to f and admitting only isolated fixed points $\{q_1, \ldots, q_l\}$, each of which is an inner point of an n-dimensional simplex of K. The †local degree $\lambda_i$ of the mapping g at $q_i$ is called the fixed-point index of g at $q_i$. Then $J = \sum_i \lambda_i$ does not depend on the choice of g and is equal to $(-1)^n \Lambda_f$.

(4) Singularities of a Continuous Vector Field. Let X be an n-dimensional †differentiable manifold and F a †continuous vector field on X that assigns a tangent vector $x_p$ to each point p of X. A point p is called a singular point of F if $x_p$ is the zero vector. The vector field F induces in a natural manner a continuous mapping $f: X \to X$ that is homotopic to the identity mapping $1_X$. Then a fixed point of f is a singular point of F, and vice versa. When such a singular point p is isolated, there exists a †coordinate neighborhood N of p that is homeomorphic to an n-dimensional open ball such that $x_q$ is nonzero for every point q in N except for q = p. Let N' be the boundary of N. Then we may consider $N' \cong S^{n-1}$ and a mapping $F|N'$ from $S^{n-1}$ to $\mathbf{R}^n \setminus \{0\} \simeq S^{n-1}$. The †degree of a mapping $S^{n-1} \to \mathbf{R}^n \setminus \{0\} \simeq S^{n-1}$ is called the index of the singular point p. This index is equal to the fixed-point index $\lambda_p$ of f at p. Hence, when X is compact and has no
boundary, the sum of indices of (isolated) singular points of F is equal to $(-1)^n \chi(X)$. In particular, a compact manifold X without boundary admits a continuous vector field with no singular point if and only if $\chi(X) = 0$ (Hopf's theorem, Math. Ann., 96 (1927)).

(5) Poincaré-Birkhoff Fixed-Point Theorem. In certain cases, a continuous mapping $f: X \to X$ of a finite polyhedron X into itself has fixed points even if $\Lambda_f = 0$. For example, let X be the annular space $\{(r, \theta) \mid \alpha \le r \le \beta\}$ ($(r, \theta)$ are the polar coordinates of points in a Euclidean plane) and let $f: X \to X$ be a homeomorphism satisfying the following conditions: (i) there exist continuous functions $g(\theta)$, $h(\theta)$ such that $g(\theta) < \theta$, $h(\theta) > \theta$, $f(\alpha, \theta) = (\alpha, g(\theta))$, $f(\beta, \theta) = (\beta, h(\theta))$; (ii) there exists a continuous positive function $\rho(r, \theta)$ defined for $\alpha < r < \beta$ such that
$$\iint_D \rho(r, \theta)\, dr\, d\theta = \iint_D \rho(f(r, \theta))\, dr\, d\theta$$
for all measurable sets D. Then f has at least two fixed points. This theorem was conjectured in 1912 by Poincaré, who hoped to apply it to solve the †restricted three-body problem. The theorem was later proved by G. D. Birkhoff (Trans. Amer. Math. Soc., 14 (1913); see also M. Brown and W. D. Neumann, Michigan Math. J. 24 (1977)) and is called the Poincaré-Birkhoff fixed-point theorem or the last theorem of Poincaré.

C. Atiyah-Bott and Atiyah-Singer Fixed-Point Theorems

There is a far-reaching generalization of the Lefschetz formula given by Atiyah and Bott (Ann. Math., (2) 86 (1967), 88 (1968)). Let M be a compact differentiable manifold without boundary and $f: M \to M$ a differentiable mapping with only simple fixed points; that is, it is assumed that $\det(1 - df_p) \neq 0$ for each fixed point $p \in M$ of f, where $df_p$ is the differential of f at the point p. The fixed points of f are finite in number. Suppose that an †elliptic complex over M
$$\mathcal{E}: 0 \to \Gamma(E_0) \xrightarrow{d_0} \Gamma(E_1) \xrightarrow{d_1} \cdots \xrightarrow{d_{l-1}} \Gamma(E_l) \to 0$$
and a sequence of smooth vector bundle homomorphisms $\varphi_i: f^*E_i \to E_i$ (i = 0, ..., l) are given such that $d_i T_i = T_{i+1} d_i$ for each i, where $T_i: \Gamma(E_i) \to \Gamma(E_i)$ is defined by $T_i s(x) = \varphi_i s(f(x))$ for $s \in \Gamma(E_i)$. The sequence $T = (T_i)$ induces endomorphisms $H^i T$ of the homology groups $H^i(\mathcal{E})$ of the elliptic complex $\mathcal{E}$. We define the Lefschetz number L(T) by $L(T) = \sum_{i=0}^{l} (-1)^i \operatorname{tr} H^i T$. On the other hand, for a fixed point p of f, let $\varphi_{i,p}: E_{i,p} \to E_{i,p}$ denote the restriction of $\varphi_i$ on the fiber $E_{i,p}$ of $E_i$ over p. Under the circumstances mentioned above the Lefschetz number L(T) is given by the formula
$$L(T) = \sum_{p} \frac{\sum_{i=0}^{l} (-1)^i \operatorname{tr} \varphi_{i,p}}{|\det(1 - df_p)|},$$
where the summation is over the fixed points of f.
Here are some examples of the above formula. First, take as $\mathcal{E}$ the †de Rham complex and as $\varphi_i$ the obvious one, i.e., the ith exterior power of the transpose of df. In this case, the formula reduces to the classical Lefschetz formula. As a second example consider the †Dolbeault complex
$$0 \to \Lambda^{0,0}(M) \xrightarrow{\bar\partial} \Lambda^{0,1}(M) \xrightarrow{\bar\partial} \cdots \xrightarrow{\bar\partial} \Lambda^{0,n}(M) \to 0$$
of a compact complex manifold M and a holomorphic mapping $f: M \to M$ with only simple fixed points. In this case the formula above reduces to one giving the Lefschetz number of the induced endomorphisms $f^*: H^{0,*}(M) \to H^{0,*}(M)$ of the †Dolbeault cohomology:
$$L(f^*) = \sum_{p} \frac{1}{\det_{\mathbf{C}}(1 - df_p)},$$
where $df_p$ is regarded as a holomorphic differential.
If the assumption that the mapping f has only simple fixed points is replaced by the one that f is a diffeomorphism of M contained in a compact †transformation group G, then there is also a generalized Lefschetz formula, given by Atiyah and Singer (Ann. Math., (2) 87 (1968)). The fixed-point set of such a diffeomorphism is a closed submanifold of M (consisting of several components). Suppose that we are given an elliptic complex $\mathcal{E}$ over M and a lift of the G-action on M to $\mathcal{E}$. The latter implies that, if we define $T_i: \Gamma(E_i) \to \Gamma(E_i)$ by $T_i s(x) = f^{-1} s(f(x))$ for $s \in \Gamma(E_i)$, then $d_i T_i = T_{i+1} d_i$ holds for each i. Under these circumstances, the Lefschetz number L(T) can be expressed in the form
$$L(T) = \sum_{i} \nu(F_i),$$
where the summation is over the components $\{F_i\}$ of the fixed-point set $M^f$ of f and where the number $\nu(F_i)$ is written explicitly in terms of the †symbol of the elliptic complex $\mathcal{E}$ with G-action, the characteristic classes of the manifold $F_i$, the characteristic classes of the normal bundle of $F_i$ in M, and the action of $g = f^{-1}$ on the normal vectors. The formula is essentially a reformulation of the †Atiyah-Singer index theorem. In fact, L(T) is the †analytic index of $\mathcal{E}$ evaluated on g and the number $\nu(F_i)$ is deduced from the †topological index of $\mathcal{E}$ using
the localization theorem. The most useful elliptic complexes are de Rham complexes, Dolbeault complexes, signature operators and Dirac operators. In the case of Dolbeault complexes, f is assumed to be an analytic automorphism, and the number $\nu(F_i)$ takes the form
[...]
Here, the normal bundle N of $F_i$ has a decomposition $N = \sum_{\theta} N(\theta)$ into the sum of complex vector bundles such that g acts on $N(\theta)$ as $e^{i\theta}$, and $\mathcal{U}^{\theta}$ is the characteristic class defined by
[...]
where the Chern class of $N(\theta)$ is written as $c(N(\theta)) = \prod_j (1 + x_j)$; moreover $\mathcal{T}(F_i)$ denotes the †Todd class of the complex manifold $F_i$, and $N^*$ denotes the dual bundle of N (→ 237 K-Theory H).

D. Fixed-Point Theorems for Infinite-Dimensional Spaces

Birkhoff and O. D. Kellogg generalized Brouwer's fixed-point theorem to the case of function spaces (Trans. Amer. Math. Soc., 23 (1922)). Their result was utilized to show the existence of solutions of certain differential equations, and has led to a new method in the theory of functional equations.
J. P. Schauder obtained the following theorem: Let A be a closed convex subset of a Banach space, and assume that there exists a continuous mapping T sending A to a †countably compact subset T(A) of A. Then T has fixed points (Studia Math., 2 (1930)). This theorem is called the Schauder fixed-point theorem.
A. Tikhonov generalized Brouwer's result and obtained the following Tikhonov fixed-point theorem (Math. Ann., 111 (1935)): Let R be a locally convex †topological linear space, A a compact convex subset of R, and T a continuous mapping sending A into itself. Then T has fixed points.
This theorem may be applied to the case where R is the space of continuous mappings sending an m-dimensional Euclidean space $E^m$ into a k-dimensional Euclidean space $E^k$ to show the existence of solutions of certain differential equations. For example, when m = k = 1, consider the differential equation
$$dy/dx = f(x, y), \qquad y(x_0) = y_0.$$
We set $T(y) = y_0 + \int_{x_0}^{x} f(t, y(t))\, dt$ to determine a continuous mapping $T: R \to R$. Then the fixed points of T are the solutions of the differential equation. Now we can apply the theorem of Tikhonov to show the existence of solutions. On the other hand, when we are given problems of functional analysis, Schauder's fixed-point theorem is usually more convenient to apply than Tikhonov's theorem.
The following theorem, written in terms of functional analysis, is useful for applications: Let D be a subset of an n-dimensional Euclidean space, F the family of continuous functions defined on D, and $T: F \to F$ a mapping. Suppose that the following three conditions are satisfied: (i) For $f_1, f_2 \in F$, $0 < \lambda < 1$ implies $\lambda f_1 + (1 - \lambda) f_2 \in F$. (ii) If a series $\{f_k\}$ of functions in F converges uniformly in the wider sense to a function f, then $f \in F$; and furthermore, the series $\{Tf_k\}$ converges uniformly in the wider sense to Tf. (iii) The family T(F) is a †normal family of functions on D. Then there exists a function $f \in F$ such that Tf = f.
Let R be a topological linear space and T a mapping assigning a closed convex subset T(x) of R to each point x of R. A point x of R is called a fixed point of T if $x \in T(x)$. The mapping T is called semicontinuous if the condition $x_n \to a$, $y_n \to b$ ($y_n \in T(x_n)$) implies that $b \in T(a)$. In particular, if K is a bounded closed convex subset of a finite-dimensional Euclidean space R and T a semicontinuous mapping sending points of K into convex subsets of K, then T admits fixed points (Kakutani fixed-point theorem, Duke Math. J., 8 (1941)). This result was further generalized to the case of locally convex topological linear spaces by Ky Fan (Proc. Nat. Acad. Sci. US, 38 (1952)).

References

[1] R. F. Brown, The Lefschetz fixed point theorem, Scott, Foresman, 1971.
[2] J. Cronin, Fixed points and topological degree in nonlinear analysis, Amer. Math. Soc. Math. Surveys, no. 11, 1964.
[3] S. Lefschetz, Topology, Amer. Math. Soc. Colloq. Publ., vol. XII, 1930.
[4] D. R. Smart, Fixed point theorems, Cambridge tracts in math. 66, Cambridge Univ. Press, 1974.

154 (IX.21)
Foliations

A. Introduction

A foliation is a kind of geometric structure on manifolds, such as a differentiable or complex structure. The study of foliations evolved from
investigation of the behavior of †orbits of a vector field and also of the solutions of †total differential equations. Through the early works of C. Ehresmann, G. Reeb, and A. Haefliger, together with the development of manifold theory in the 1960s, it became an established field of mathematics. Since then, great progress has been made in this field, especially in its topological aspects. At the same time it turns out that many problems in foliation theory are deeply related not only to the geometry of manifolds but also to various other branches of mathematics, such as the theory of differential equations, functional analysis, and group theory.

B. Definitions and General Remarks

A foliation on a manifold can be defined within various categories: topological, $C^r$-differentiable ($1 \le r \le \infty$), real analytic, and holomorphic. For definiteness, however, we restrict ourselves to the $C^r$-differentiable category in what follows. Furthermore, all manifolds are assumed to be paracompact.
Let M be an n-dimensional $C^\infty$-manifold, possibly with boundary. A codimension q, $C^r$-foliation of M ($0 \le q \le n$, $0 \le r \le \infty$) is a family $\mathcal{F} = \{L_\alpha \mid \alpha \in A\}$ of arcwise connected subsets, called leaves, of M with the following properties: (i) $L_\alpha \cap L_{\alpha'} = \emptyset$ if $\alpha \neq \alpha'$; (ii) $\bigcup_{\alpha \in A} L_\alpha = M$; (iii) Every point in M has a local coordinate system $(U, \varphi)$ of class $C^r$ such that for each leaf $L_\alpha$ the arcwise connected components of $U \cap L_\alpha$ are described by $x^{n-q+1} = \text{constant}, \ldots, x^n = \text{constant}$, where $x^1, x^2, \ldots, x^n$ denote the local coordinates in the system $(U, \varphi)$.
In particular, every leaf of $\mathcal{F}$ is an (n - q)-dimensional †submanifold of M. The totality of integral submanifolds of a †completely integrable nonsingular system of †Pfaffian equations on $\mathbf{R}^n$, $\omega_i = a_{i1}(x)\,dx_1 + a_{i2}(x)\,dx_2 + \cdots + a_{in}(x)\,dx_n = 0$ (i = 1, 2, ..., q) forms a codimension q foliation, and the totality of integral curves of a nonsingular vector field of class $C^r$ on M ($r \ge 1$) constitutes a codimension n - 1 $C^r$-foliation.
Let $\mathcal{F}$ be a codimension q, $C^r$-foliation of M ($r \ge 1$). Then M admits a $C^{r-1}$ p-plane field consisting of all vectors tangent to the leaves of $\mathcal{F}$, and, dually, a $C^{r-1}$ q-plane field (p + q = n). Denote the former by $\tau(\mathcal{F})$ and the latter by $\nu(\mathcal{F})$, and call them the tangent bundle and the normal bundle of $\mathcal{F}$, respectively. $\nu(\mathcal{F})$ is isomorphic to the quotient bundle $T(M)/\tau(\mathcal{F})$. $\mathcal{F}$ is called transversely orientable if $\nu(\mathcal{F})$ is an orientable vector bundle. A $C^r$-mapping $f: N \to M$ is said to be transverse to the foliation $\mathcal{F}$ if the composite mapping $T(N) \xrightarrow{df} T(M) \to T(M)/\tau(\mathcal{F})$ is epimorphic at each point of N. In this case, f induces a codimension q, $C^r$-foliation $f^{-1}(\mathcal{F})$ of N whose leaves are the arcwise connected components of $f^{-1}(L)$ ($L \in \mathcal{F}$).
In particular, if Q is a q-dimensional $C^r$-manifold and $f: N \to Q$ is a $C^r$-submersion, f induces a codimension q, $C^r$-foliation of N whose leaves are the arcwise connected components of $f^{-1}(x)$ ($x \in Q$).
A $C^r$ p-plane field $\xi$ on M is called involutive if, for any $C^r$ vector fields X, Y on M such that $X_x, Y_x \in \xi_x$ ($x \in M$), the †Lie bracket [X, Y] satisfies $[X, Y]_x \in \xi_x$. This condition is known as the Frobenius integrability condition for $\xi$. If $\xi$ is defined locally by q Pfaffian equations $\omega_1 = \cdots = \omega_q = 0$, the above condition is equivalent to the condition that there are $C^{r-1}$ 1-forms $\theta_{ij}$ (i, j = 1, ..., q) such that $d\omega_i = \sum_{j=1}^{q} \theta_{ij} \wedge \omega_j$. A $C^r$ p-plane field $\xi$ is said to be completely integrable if it is a tangent bundle of some foliation. When $r \ge 2$, $\xi$ is completely integrable if and only if it is involutive (Frobenius theorem) (→ 428 Total Differential Equations). There is a topological obstruction to the complete integrability of $\xi$ (→ Section F).
A closed $C^\infty$-manifold M admits a codimension 1, $C^\infty$-plane field if and only if the Euler number of M vanishes. In 1944, Reeb constructed a codimension 1, $C^\infty$-foliation of the 3-sphere $S^3$ as follows [1]. Let f(x) be a $C^\infty$-even function defined on the open interval (-1, 1), such that
$$\lim_{|x| \to 1} \frac{d^k}{dx^k} \frac{1}{f'(x)} = 0 \qquad (k = 0, 1, 2, \ldots).$$
The graphs of the equations y = f(x) + c ($-1 < x < 1$, $c \in \mathbf{R}$) together with the lines $x = \pm 1$ constitute a codimension 1, $C^\infty$-foliation of $[-1, 1] \times \mathbf{R}$. Then by rotating it around the y-axis in $\mathbf{R}^3$, we obtain a codimension 1, $C^\infty$-foliation of $D^2 \times \mathbf{R}$, where $D^2$ denotes the closed 2-disk. The foliation is invariant under vertical translations and therefore defines a codimension 1, $C^\infty$-foliation of $D^2 \times S^1$. This foliation is called the Reeb component of $D^2 \times S^1$. Since $S^3$ is a union of two solid tori intersecting in the common toroidal surface, the Reeb components of each solid torus constructed above define the so-called Reeb foliation of $S^3$ (Fig. 1).

Fig. 1
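
A concrete function satisfying the flatness condition in Reeb's construction may help fix ideas. The choice of f below is an illustrative assumption, not taken from the text; any even $C^\infty$ function on (-1, 1) with the same boundary behavior would serve equally well.

```latex
% Hypothetical example of a C^infty even function f on (-1,1) for the Reeb construction.
\[
  f(x) = \exp\!\left(\frac{x^{2}}{1-x^{2}}\right) - 1, \qquad -1 < x < 1 .
\]
% Differentiating, with u(x) = x^2/(1-x^2) and u'(x) = 2x/(1-x^2)^2:
\[
  f'(x) = \frac{2x}{(1-x^{2})^{2}} \exp\!\left(\frac{x^{2}}{1-x^{2}}\right),
  \qquad
  \frac{1}{f'(x)} = \frac{(1-x^{2})^{2}}{2x}\,
                    \exp\!\left(-\frac{x^{2}}{1-x^{2}}\right) \quad (x \neq 0).
\]
% Every derivative of 1/f' is a finite sum of terms of the form
%   R(x) (1-x^2)^{-m} exp(-x^2/(1-x^2)),
% with R rational and regular near |x| = 1, so the exponential factor forces
\[
  \lim_{|x| \to 1} \frac{d^{k}}{dx^{k}} \frac{1}{f'(x)} = 0
  \qquad (k = 0, 1, 2, \ldots).
\]
```

With this f, the graphs y = f(x) + c tend to $+\infty$ as $|x| \to 1$ and approach the lines $x = \pm 1$; the vanishing of all derivatives of $1/f'$ at the boundary is what lets the family of graphs fit together smoothly with the two boundary lines.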
Generalizing the above construction into a differential topological method, one obtains the following results: Every closed 3-dimensional manifold admits a codimension 1, $C^\infty$-foliation (S. P. Novikov [3], W. Lickorish, J. Wood); every odd-dimensional sphere admits a codimension 1, $C^\infty$-foliation (I. Tamura, A. Durfee, B. Lawson). On the other hand, every open manifold has a codimension 1, $C^\infty$-foliation induced by a submersion over $\mathbf{R}$ (→ Section F).
Let M be a total space of a $C^r$-bundle over a $C^\infty$-manifold B with fiber F. If F is a $C^\infty$-manifold and the †structure group reduces to a totally disconnected subgroup of $\mathrm{Diff}^r F$, the group of all $C^r$-diffeomorphisms of F, then the local sections, which are defined in an obvious manner using the local triviality of this bundle, fit together to give leaves of a codimension q, $C^r$-foliation (q = dim F). In this case, M is called a foliated bundle or a flat F-bundle over B. Each leaf of this foliation is diffeomorphic to a covering manifold of B and transverse to the fibers of the bundle $M \to B$. Foliated bundles exhibit a class of foliations; this is especially important in connection with the characteristic classes of foliations (→ Section G).

C. Holonomy

The notion of the holonomy of a leaf, given by Ehresmann, is a generalization of the †Poincaré mapping in †dynamical systems. Let $\mathcal{F}$ be a codimension q, $C^r$-foliation of M and L be a leaf of $\mathcal{F}$. Let N(L) denote the total space of the normal disk bundle of L in M. Choose a $C^r$-immersion $i: N(L) \to M$ such that i restricted to the zero section of N(L) is the natural inclusion and i maps the fibers of N(L) transversely to the foliation $\mathcal{F}$. Then the induced foliation $i^{-1}(\mathcal{F})$ of N(L) has the properties that the leaves are transverse to the fibers of N(L) and the zero section of N(L) is a leaf. If $\gamma$ is an oriented loop in L based at a point $x_0 \in L$, then there is a neighborhood U of 0 in the fiber over $x_0$ satisfying the following: for each point $x \in U$ there is a curve $\gamma_x: [0, 1] \to N(L)$ having the properties: (i) $\gamma_x(0) = x$, (ii) $\mathrm{Im}(\gamma_x)$ lies on a leaf of $i^{-1}(\mathcal{F})$, and (iii) $\pi \circ \gamma_x(t) = \gamma(t)$ for any $t \in [0, 1]$, where $\pi: N(L) \to L$ is the bundle projection. The family of curves $\{\gamma_x \mid x \in U\}$ gives a $C^r$-diffeomorphism $H_\gamma$ from U to another open set of $\pi^{-1}(x_0)$, which assigns $\gamma_x(1)$ to x. Let $G_q^r$ denote the group of †germs at 0 of all local $C^r$-diffeomorphisms of $\mathbf{R}^q$ which fix 0. The germ at 0 of the mapping $H_\gamma$ depends only on the †homotopy class of $\gamma$, and, by identifying $\pi^{-1}(x_0)$ with $\mathbf{R}^q$, we obtain a homomorphism $h_L: \pi_1(L, x_0) \to G_q^r$. $h_L$ is determined by L up to conjugacy and is called the holonomy homomorphism, or simply the holonomy, of the leaf L. The image of $h_L$ is called the holonomy group of L. For $r \ge 1$, by differentiating each element of $G_q^r$, one has a homomorphism $dh_L: \pi_1(L, x_0) \to GL(q; \mathbf{R})$, called the linear holonomy of L. The holonomy of a proper leaf (→ Section D) completely characterizes the foliation of a neighborhood of it (Haefliger).

D. Topology of Leaves

Let $\mathcal{F}$ be a codimension q foliation of M. The leaf topology is a topology of M defined by requiring each connected component of the set of the form $U \cap L$ to be open, where U is an open set in M and $L \in \mathcal{F}$. Leaves are nothing but the connected components of M with respect to this topology. A leaf $L \in \mathcal{F}$ is called a compact leaf if L is compact in the leaf topology. In general, L is called proper if two topologies on L induced from the original and the leaf topologies on M coincide. Any compact leaf is proper. A leaf L is called locally dense if $\mathrm{Int}\, \bar{L} \neq \emptyset$. If a leaf is neither proper nor locally dense, it is called exceptional. Since we are assuming that M is paracompact, a leaf that is a closed subset of M is proper (Haefliger). There exists a codimension 1, $C^1$-foliation of the 2-torus $T^2$ that contains exceptional leaves. But in the $C^r$ category ($r \ge 2$) such a foliation does not exist on $T^2$ (A. Denjoy, C. Siegel). In higher dimensions, there are examples of $C^\infty$-foliations with exceptional leaves (R. Sacksteder). The following result is called the Novikov closed leaf theorem [3]: Any codimension 1, $C^r$-foliation ($r \ge 2$) of a closed 3-dimensional manifold M contains a Reeb component if either $\pi_1(M)$ is finite or $\pi_2(M) \neq 0$ ($M \neq S^1 \times S^2$, $S^1 \times \mathbf{R}P^2$). In particular, every $C^r$-foliation of $S^3$ contains a compact leaf homeomorphic to $T^2$. The question of whether every codimension 2, $C^r$-foliation of $S^3$ has a compact leaf is known as the Seifert conjecture. There is a counterexample in the $C^1$ case (P. Schweitzer [7]), but it remains unsolved for $r \ge 2$ (→ 126 Dynamical Systems N).
A compact leaf L is said to be stable if it has an arbitrarily small open neighborhood that is a union of compact leaves. The following results are called the Reeb stability theorems: (1) Let L be a compact leaf of a $C^r$-foliation ($r \ge 0$). If the holonomy group of L is finite, then it is stable (Reeb [1]). (2) Let $\mathcal{F}$ be a transversely orientable codimension 1, $C^r$-foliation ($r \ge 1$) of a compact manifold M (tangent to the boundary). If there exists a compact leaf L with $H^1(L; \mathbf{R}) = 0$, then M is a fiber bundle over $S^1$ or [0, 1], and the leaves of $\mathcal{F}$ are the fibers of this bundle. In particular,
every leaf is compact and stable (Reeb [1], Thurston [8]).
The generalization of the stability theorem for proper leaves has been investigated by T. Inaba and P. Dippolito.

E. Haefliger Structures

Let $\mathcal{G}_q^r$ be the †pseudogroup consisting of all $C^r$-diffeomorphisms g of an open subset of $\mathbf{R}^q$ to another open subset of $\mathbf{R}^q$. We write $\Gamma_q^r$ for the set of all germs $[g]_x$ of g at x, $x \in$ domain of g, $g \in \mathcal{G}_q^r$. The sheaf topology of $\Gamma_q^r$ is the topology whose open base is the family of subsets of the form $\bigcup_{x \in \text{domain of } g} \{[g]_x\}$. With this topology and the multiplication induced from the composition of $\mathcal{G}_q^r$, $\Gamma_q^r$ is a topological groupoid, i.e., a †groupoid whose multiplication and inverse mappings are continuous. If one identifies a point x in $\mathbf{R}^q$ with $[\mathrm{id}_{\mathbf{R}^q}]_x$, $\Gamma_q^r$ contains $\mathbf{R}^q$ as a subspace.
A codimension q, $C^r$-Haefliger structure or a $\Gamma_q^r$-structure $\mathcal{H}$ on a topological space X is a maximal covering of X by open sets $\{U_i \mid i \in J\}$, such that for each pair $i, j \in J$, there is a continuous mapping $\gamma_{ij}: U_i \cap U_j \to \Gamma_q^r$ satisfying
$$\gamma_{ik}(x) = \gamma_{ij}(x) \circ \gamma_{jk}(x) \quad \text{for } x \in U_i \cap U_j \cap U_k. \qquad (*)$$
Since $\gamma_{ii}(x)$ is the germ of the identity mapping for $x \in U_i$, the correspondence $x \to \gamma_{ii}(x)$ defines a continuous mapping $f_i: U_i \to \mathbf{R}^q \subset \Gamma_q^r$. A codimension q, $C^r$-foliation of a $C^\infty$-manifold M is the same as a $\Gamma_q^r$-structure on M such that each $f_i$ is a $C^r$-submersion (Haefliger [5]).
If $f: Y \to X$ is continuous and $\mathcal{H}$ is a $\Gamma_q^r$-structure on X, there is an induced $\Gamma_q^r$-structure $f^{-1}\mathcal{H}$ on Y which is defined by $\{f^{-1}(U_i), \gamma_{ij} \circ f \mid i, j \in J\}$. Two $\Gamma_q^r$-structures $\mathcal{H}_0$ and $\mathcal{H}_1$ on X are said to be homotopic if there exists a $\Gamma_q^r$-structure $\mathcal{H}$ on $X \times [0, 1]$ such that $\mathcal{H} \mid X \times \{t\} = \mathcal{H}_t$ (t = 0, 1).
Let $\Gamma_q^r(X)$ be the set of homotopy classes of $\Gamma_q^r$-structures on X. There exists a space $B\Gamma_q^r$, called the classifying space for $\Gamma_q^r$-structures, such that there is a natural one-to-one correspondence between $\Gamma_q^r(X)$ and $[X, B\Gamma_q^r]$ for any paracompact space X, where [A, B] denotes the set of homotopy classes of continuous mappings from A to B. By condition (*) above, if r > 0, the differentials $\{d\gamma_{ij}(x) \mid x \in U_i \cap U_j,\ i, j \in J\}$ define a q-dimensional vector bundle $\nu(\mathcal{H})$ over X, which is called the normal bundle of $\mathcal{H}$. The correspondence $\mathcal{H} \to \nu(\mathcal{H})$ gives a continuous mapping $\nu: B\Gamma_q^r \to BGL(q; \mathbf{R})$ among classifying spaces. If r = 0, there is also a similar mapping $\nu: B\Gamma_q^0 \to B\mathrm{Top}_q$. Let $B\bar{\Gamma}_q^r$ be the homotopy fiber of the mapping $\nu$. $B\bar{\Gamma}_q^r$ is a classifying space for the $\Gamma_q^r$-structures with trivialized normal bundles.
There is a tight connection between the topology of $B\bar{\Gamma}_q^r$ and the group structure of $\mathrm{Diff}^r(\mathbf{R}^q)$, which is stated below. Let $\mathrm{Diff}_K^r(\mathbf{R}^q)$ be the topological group of all $C^r$-diffeomorphisms of $\mathbf{R}^q$ that are identities outside some compact sets with the $C^r$ topology. Let $\mathrm{Diff}_{K,\delta}^r(\mathbf{R}^q)$ denote $\mathrm{Diff}_K^r(\mathbf{R}^q)$ with the discrete topology and $B\overline{\mathrm{Diff}}_K^r(\mathbf{R}^q)$ denote the homotopy fiber of the natural mapping $B\mathrm{Diff}_{K,\delta}^r(\mathbf{R}^q) \to B\mathrm{Diff}_K^r(\mathbf{R}^q)$. Then there exists a continuous mapping $B\overline{\mathrm{Diff}}_K^r(\mathbf{R}^q) \to \Omega^q B\bar{\Gamma}_q^r$ that induces an isomorphism in the homology group with integer coefficient for $0 \le r \le \infty$ and $q \ge 1$, where $\Omega^q$ denotes the qth †loop space functor (J. N. Mather [12], Thurston [10]). Further, it has been proved that $\mathrm{Diff}_{K,\delta}^r(\mathbf{R}^q)$ is a †simple group (if $r \neq q + 1$) and that $B\bar{\Gamma}_q^r$ is †(q + 1)-connected for $r \neq q + 1$ (Mather and Thurston, Haefliger [5]). The group $\mathrm{Homeo}_K \mathbf{R}^q$ is †acyclic (Mather), and hence $B\bar{\Gamma}_q^0$ is †contractible.

F. Existence and Classification of Foliations

Not every plane field on a manifold is isomorphic to a completely integrable field (R. Bott). Thus, in general, the existence of a plane field does not guarantee the existence of a foliation of a manifold. Haefliger and Thurston solved the existence and classification problems in foliation in terms of Haefliger structures as follows. Two codimension q foliations $\mathcal{F}_0$ and $\mathcal{F}_1$ of a $C^\infty$-manifold M are said to be concordant if there is a codimension q foliation $\mathcal{F}$ on $M \times [0, 1]$ that is transverse to $M \times \{t\}$ (t = 0, 1) and induces there the given foliation $\mathcal{F}_t$ (t = 0, 1). They are said to be integrably homotopic if one further requires that the foliation $\mathcal{F}$ be transverse to $M \times \{t\}$ for all $t \in [0, 1]$ in the definition above. Similarly, two subbundles $\xi_0$, $\xi_1$ of T(M) are said to be concordant if there is a subbundle $\xi$ of $T(M \times [0, 1])$ such that $\xi \mid M \times \{t\} = \xi_t$ for t = 0, 1, and they are said to be homotopic if one further assumes that $\xi \mid M \times \{t\}$ is a subbundle of $T(M \times \{t\})$ for all $t \in [0, 1]$. The following theorem is of fundamental importance.
Theorem: Let M be an open (resp. closed) $C^\infty$-manifold. Then for each $r = 0, 1, \ldots, \infty$, the integrable homotopy classes (resp. concordance classes) of codimension q, $C^r$-foliations of M are in a natural one-to-one correspondence with homotopy classes of $\Gamma_q^r$-structures $\mathcal{H}$ on M together with homotopy classes (resp. concordance classes) of subbundles of T(M) isomorphic to $\nu(\mathcal{H})$. (M. L. Gromov, A. Phillips, Haefliger [5], Thurston [8]).
The following are consequences of the theorem: (i) A closed manifold M admits a codimension 1, $C^\infty$-foliation if and only if the Euler number of M vanishes. (ii) If a manifold
admits a †$q$-frame field, then the associated $q$-plane field is homotopic to the normal bundle of a codimension $q$ foliation of $M$. (iii) Every dimension $q$ plane field on a $C^\infty$-manifold is homotopic to the normal bundle of a $C^0$-foliation with $C^\infty$ leaves.

G. Characteristic Classes of Foliations

Let $B\Gamma_q^r$ be the classifying space of $\Gamma_q^r$-structures. An element of the cohomology group $H^*(B\Gamma_q^r; \mathbf{R})$ is called a (real) characteristic class of codimension $q$, $C^r$-foliations. If $\mathcal{F}$ is a codimension $q$, $C^r$-foliation of $M$ and $f: M \to B\Gamma_q^r$ is the classifying mapping for $\mathcal{F}$, then an element $\alpha(\mathcal{F}) = f^*\alpha \in H^*(M; \mathbf{R})$, $\alpha \in H^*(B\Gamma_q^r; \mathbf{R})$, is called the characteristic class of $\mathcal{F}$ corresponding to $\alpha$. The first nontrivial characteristic classes of foliations are known as the Godbillon-Vey classes (C. Godbillon and J. Vey [13]) and can be defined as follows. Let $\mathcal{F}$ be a transversely oriented, codimension $q$, $C^\infty$-foliation of $M$. Then there exists a †$q$-form $\Omega$ on $M$ such that on a neighborhood $U$ of each point of $M$, $\Omega$ is written as $\omega_1 \wedge \dots \wedge \omega_q$, where $\omega_1, \dots, \omega_q$ are linearly independent 1-forms that vanish on leaves of $\mathcal{F}$. By the integrability condition, there is a 1-form $\eta$ such that $d\Omega = \eta \wedge \Omega$. Then the Godbillon-Vey class $\Gamma_{\mathcal{F}}$ of $\mathcal{F}$ is the de Rham cohomology class in $H^{2q+1}(M; \mathbf{R})$ represented by the closed $(2q+1)$-form $\eta \wedge (d\eta)^q$.

The following construction provides a wide class of characteristic classes of foliations (Bott and Haefliger [14], I. Bernstein and B. Rosenfeld [15]). Let $J_k$ be the set of †$k$-jets at 0 of local $C^\infty$-diffeomorphisms of $\mathbf{R}^q$ keeping 0 fixed. The set $\{J_k\}_{k=0}^{\infty}$ forms an †inverse system of †Lie groups with respect to the natural homomorphism $p_k: J_{k+1} \to J_k$, and each $J_k$ ($k \ge 1$) contains $O(q)$ as a †maximal compact subgroup. Let $P_k$ be the differentiable fiber bundle of $k$-jets at 0 of local diffeomorphisms of $\mathbf{R}^q$ whose domains contain 0. It is a †principal $J_k$-bundle over $\mathbf{R}^q$. Denote by $A(P_\infty)$ the †direct limit of the de Rham complexes of $\{P_k\}_{k=0}^{\infty}$, and let $A$ be the subcomplex of $A(P_\infty)$ consisting of invariant forms with respect to the natural action of $\Gamma_q^\infty$. $A$ is canonically isomorphic to the †cochain complex $A(\mathfrak{a}_q)$ of continuous alternating forms on $\mathfrak{a}_q$, where $\mathfrak{a}_q$ is the topological Lie algebra of †formal vector fields on $\mathbf{R}^q$ (→ 105 Differentiable Manifolds AA).

Now let $\mathcal{F}$ be a codimension $q$, $C^\infty$-foliation of $M$. Let $P_k(\mathcal{F})$ denote the differentiable fiber bundle over $M$ whose fiber over $x \in M$ is the space of $k$-jets at $x$ of the $C^\infty$-submersion $f: U \to \mathbf{R}^q$ from an open neighborhood $U$ of $x$ to $\mathbf{R}^q$, satisfying (i) $f(x) = 0$, (ii) $\operatorname{Ker} df = \tau(\mathcal{F})|_U$. This is a principal $J_k$-bundle over $M$, and its restriction to $U$ is isomorphic to the pullback by $f$ of the bundle $P_k$. Hence there are homomorphisms from the set of invariant forms on $P_k$ to $A(P_k(\mathcal{F}))$ that induce a homomorphism $A = A(\mathfrak{a}_q) \to \varinjlim A(P_k(\mathcal{F}))$. This homomorphism is compatible with the action of $O(q)$ and hence induces a homomorphism of $O(q)$-basic subcomplexes. Thus one obtains a homomorphism $\varphi_{\mathcal{F}}: H^*(\mathfrak{a}_q; O(q)) \to \varinjlim H^*(A(P_k(\mathcal{F})); O(q)) \cong H^*(M; \mathbf{R})$. In fact, $\varphi_{\mathcal{F}}$ depends only on the 2-jet of the foliation $\mathcal{F}$, and one can think of it as a homomorphism $\varphi: H^*(\mathfrak{a}_q; O(q)) \to H^*(B\Gamma_q^r; \mathbf{R})$ ($r \ge 2$). The elements in $\operatorname{Im} \varphi$ are called the smooth characteristic classes of foliations.

Let $WO_q$ be the differential graded algebra:

$$WO_q = E(u_1, u_3, \dots, u_{2[(q+1)/2]-1}) \otimes \mathbf{R}_q[c_1, c_2, \dots, c_q],$$

where $du_i = c_i$, $dc_i = 0$, $\deg(u_i) = 2i - 1$, $\deg(c_i) = 2i$, and $E$ denotes the †exterior algebra over $\mathbf{R}$ generated by the $u_i$'s, and $\mathbf{R}_q$ denotes the †polynomial algebra of the $c_i$'s truncated by the †ideal generated by elements of degree $> 2q$. There exists a homomorphism of differential graded algebras $WO_q \to A(\mathfrak{a}_q; O(q))$ which induces an isomorphism in cohomology (I. M. Gel'fand and D. B. Fuks). For a codimension $q$ foliation $\mathcal{F}$ of $M$, the cohomology class determined by $c_{2i}$ corresponds to the $i$th Pontryagin class of the normal bundle $\nu(\mathcal{F})$ of $\mathcal{F}$, and the cohomology class $u_1 c_1^q$ corresponds to the Godbillon-Vey class of $\mathcal{F}$. In particular, the subring of the cohomology ring $H^*(M; \mathbf{R})$ generated by the †Pontryagin classes of $\nu(\mathcal{F})$ is trivial for degree $> 2q$ (Bott's vanishing theorem).

Let $N^{q+1}$ be a closed $(q+1)$-dimensional Riemannian manifold of †constant negative curvature. The total space $T_1 N$ of the unit tangent sphere bundle of $N$ admits a codimension $q$, $C^\infty$-foliation associated with the †geodesic flow of $T_1 N$ (†Anosov foliation). It has been shown that the Godbillon-Vey class of this foliation is nontrivial (R. Roussarie, F. Kamber and P. Tondeur, K. Yamato). It is known that many of the smooth characteristic classes are also nontrivial (Bott and Haefliger, Thurston, J. Heitsch).

A smooth characteristic class $\alpha \in H^*(B\Gamma_q^r; \mathbf{R})$ is called rigid if for any smooth one-parameter family $\{\mathcal{F}_t\}$ of codimension $q$ foliations on a $C^\infty$-manifold $M$, $d(\alpha(\mathcal{F}_t))/dt = 0$ holds. The elements in the image of the natural homomorphism

$$H^*(WO_{q+1}) \to H^*(WO_q)$$

are rigid (Heitsch). On the other hand, the Godbillon-Vey class is not rigid. In fact, Thurston constructed a one-parameter family of codimension $q$ foliations of a certain
$(2q+1)$-dimensional manifold for which the Godbillon-Vey class varies continuously. The characteristic classes of a simple foliation are often trivial. For example, the Godbillon-Vey class of a codimension 1 foliation of a closed manifold vanishes if it is almost without holonomy (i.e., no noncompact leaves have nontrivial holonomy) (M. Herman [16]; T. Mizutani, S. Morita, and T. Tsuboi [17]).

H. Further Topics

(1) Transverse structures. Let $B$ be a $q$-dimensional manifold and $\mathcal{Y}$ a geometric structure on it, and let $\Gamma_{\mathcal{Y}}$ denote the †pseudogroup generated by the local diffeomorphisms that preserve the structure $\mathcal{Y}$. Replacing $\Gamma_q^r$ by $\Gamma_{\mathcal{Y}}$ in the definition of a $C^r$-Haefliger structure, one obtains definitions of a $\Gamma_{\mathcal{Y}}$-structure and a $\Gamma_{\mathcal{Y}}$-foliation. A $\Gamma_{\mathcal{Y}}$-foliation is called a Riemannian foliation, a transversely real analytic foliation, or a transversely holomorphic foliation if $\mathcal{Y}$ is a Riemannian, real analytic, or complex structure on $\mathbf{R}^q$ ($\mathbf{C}^{q/2}$), respectively. The theories for many such foliations are analogous to those for $C^r$-foliations. For example, many results are known about the characteristic classes of Riemannian or holomorphic foliations. Haefliger showed that there is no codimension 1 real analytic foliation on a simply connected closed manifold and that the classifying space $B\Gamma_1^\omega$ for codimension 1 transversely oriented transversely real analytic foliations has the homotopy type of a †$K(\pi, 1)$-space for some uncountable †perfect group $\pi$ [5].

(2) Foliated cobordism. Two closed oriented $n$-dimensional $C^\infty$-manifolds $M_0$ and $M_1$ with codimension $q$, $C^r$-foliations are said to be foliated cobordant if there exist a compact oriented $(n+1)$-dimensional $C^\infty$-manifold $W$ with boundary $\partial W = M_0 \cup (-M_1)$ and a codimension $q$, $C^r$-foliation of $W$ which is transverse to $\partial W$ and induces the given foliations of $M_0$ and $M_1$. The resulting foliated cobordism classes $\{\mathcal{F}\}$ form a group $\mathcal{F}\Omega^r_{n,q}$ with respect to the disjoint union. It is known that $\mathcal{F}\Omega^r_{2,1} = \{0\}$ and that the Reeb foliation of $S^3$ is cobordant to zero. The characteristic classes of foliations provide invariants of foliated cobordisms. In particular the Godbillon-Vey number $\Gamma_{\mathcal{F}}[M]$ is an invariant of $\mathcal{F}\Omega^r_{2q+1,q}$ ($r \ge 2$, $q \ge 1$), and a result of Thurston mentioned in Section G implies that the homomorphism $\mathcal{F}\Omega^r_{2q+1,q} \to \mathbf{R}$ defined by $\{\mathcal{F}\} \mapsto \Gamma_{\mathcal{F}}[M]$ is surjective.

(3) Growth of leaves, transverse invariant measure. Let $\mathcal{F}$ be a $C^\infty$-foliation of a compact manifold $M$. Fix a †Riemannian metric on $M$. Then each leaf $L$ of $\mathcal{F}$ has the induced metric, and one has a function $f_L(r) = \operatorname{vol}(D(x, r))$, where $D(x, r)$ is the set of points $y \in L$ whose distance along $L$ from a fixed point $x \in L$ is not greater than $r$. The growth type of the function $f_L$ is determined only by $L$. Many papers have been published that deal with the relation between the behavior of leaves and their growth types in codimension 1 foliations (J. Cantwell and L. Conlon, G. Hector, T. Nishimori, N. Tsuchiya). On the other hand, as a generalization of the notion of asymptotic cycles in dynamical systems, the notion of foliation cycles, or equivalently a transverse invariant measure, has been defined (J. Plante [18], D. Ruelle, D. Sullivan [19]). The existence of a transverse invariant measure for a foliation is closely connected to the growth types of the leaves. The example of Denjoy in Section D leads to the study of †minimal sets of foliations and the structures of foliations (Hector, Cantwell and Conlon). The structure of codimension 1 foliations which are almost without holonomy has been fairly well investigated (Sacksteder, Hector, H. Imanishi, R. Moussu, Roussarie).

(4) Compact foliation. A foliation whose leaves are all compact is called a compact foliation. D. Epstein proved that if $\mathcal{F}$ is a codimension 2, $C^2$-compact foliation of a closed 3-manifold $M$, then the leaves of $\mathcal{F}$ are the fibers of a †Seifert fibration of $M$. In higher dimensions, the situation is more complicated (Sullivan, R. Edwards and K. Millett).

(5) Foliated bundles. There are many results on foliated bundles. In particular J. W. Milnor [20] and J. Wood obtained a condition for a circle bundle $\xi$ over a closed surface $\Sigma$ to have a foliation transverse to fibers. More precisely, if $\xi$ and $\Sigma$ are orientable, then $\xi$ admits such a foliation if and only if $|X(\xi)| \le -\min\{0, \chi(\Sigma)\}$, where $X$ denotes the Euler number and $\chi$ the Euler-Poincaré characteristic. Kamber and Tondeur made an extensive study of characteristic classes of foliated bundles [21].

(6) Transverse foliations. Two foliations $\mathcal{F}$ and $\mathcal{G}$ of $M$ are said to be transverse to each other if any two leaves $K$ and $L$ of $\mathcal{F}$ and $\mathcal{G}$ are transverse to each other. A foliated bundle has such foliations. D. Hardorp proved that on every orientable closed 3-manifold, there exists a triple of codimension 1 foliations that are pairwise transverse. Tamura and A. Sato classified the codimension 1 foliations that are transverse to the Reeb component of $D^2 \times S^1$.

References

[1] G. Reeb, Sur certaines propriétés topologiques des variétés feuilletées, Actualités Sci. Ind., 1183, Hermann, 1952.
[2] C. Ehresmann, Sur la théorie des variétés
feuilletées, Univ. Roma Rend. Mat. Appl., 10 (1951), 64-82.
[3] S. P. Novikov, Topology of foliations, Amer. Math. Soc. Transl., 1967, 286-304. (Original in Russian, 1965.)
[4] I. Tamura, Foliations and spinnable structures on manifolds, Ann. Inst. Fourier, 23 (1973), 197-214.
[5] A. Haefliger, Homotopy and integrability, Lecture notes in math. 197, Springer, 1971, 133-163.
[6] A. Denjoy, Sur les courbes définies par les équations différentielles à la surface du tore, J. Math. Pure Appl., 11 (1932), 333-375.
[7] P. Schweitzer, Counterexamples to Seifert conjecture and opening closed leaves of foliations, Ann. Math., (2) 100 (1974), 386-400.
[8] W. Thurston, A generalization of Reeb stability theorem, Topology, 13 (1974), 347-352.
[9] W. Thurston, Existence of codimension one foliations, Ann. Math., (2) 104 (1976), 249-268.
[10] W. Thurston, Foliations and groups of diffeomorphisms, Bull. Amer. Math. Soc., 80 (1974), 304-307.
[11] W. Thurston, Noncobordant foliations of $S^3$, Bull. Amer. Math. Soc., 78 (1972), 511-514.
[12] J. N. Mather, Integrability in codimension 1, Comment. Math. Helv., 48 (1973), 195-233.
[13] C. Godbillon and J. Vey, Un invariant des feuilletages de codimension un, C. R. Acad. Sci. Paris, 273 (1971), 92-95.
[14] R. Bott and A. Haefliger, On characteristic classes of $\Gamma$-foliations, Bull. Amer. Math. Soc., 78 (1972), 1039-1044.
[15] I. Bernstein and B. Rosenfeld, On characteristic classes of foliations, Functional Anal. Appl., 6 (1972), 60-62.
[16] M. Herman, The Godbillon-Vey invariant of foliations by planes of $T^3$, Lecture notes in math. 597, Springer, 1977, 294-307.
[17] T. Mizutani, S. Morita, and T. Tsuboi, The Godbillon-Vey classes of codimension one foliations which are almost without holonomy, Ann. Math., (2) 113 (1981).
[18] J. Plante, Foliations with measure-preserving holonomy, Ann. Math., (2) 102 (1975), 327-362.
[19] D. Sullivan, Cycles for the dynamical study of foliated manifolds and complex manifolds, Inventiones Math., 36 (1976), 225-255.
[20] J. W. Milnor, On the existence of a connection with curvature zero, Comment. Math. Helv., 32 (1958), 215-223.
[21] F. Kamber and P. Tondeur, Foliated bundles and characteristic classes, Lecture notes in math. 493, Springer, 1975.
[22] B. Lawson, Foliations, Bull. Amer. Math. Soc., 80 (1974), 369-418.
[23] B. Lawson, The quantitative theory of foliations, Regional conference series in math. 27, Conference Board of the Mathematical Sciences, 1977.

155 (VI.2)
Foundations of Geometry

A. Introduction

Geometry deals with figures. It depends, therefore, on our spatial intuition, but our intuition lacks objectivity. The Greeks originated the idea of developing geometry logically, based on explicitly formulated axioms, without resorting to intuition. From this intention resulted Euclid's Elements, which was long considered the perfect model of a logical system. As time passed, however, mathematicians came to notice its imperfections. Since the 19th century especially, with the awakening of a more rigorous critical spirit in science and philosophy, more systematic criticism of the Elements began to appear. Non-Euclidean geometry was formulated after reexamination of Euclid's axiom of parallels; but it was also discovered that even as a foundation of Euclidean geometry, Euclid's system of axioms was far from perfect. Various systems of axioms for Euclidean geometry were proposed by mathematicians in the latter half of the 19th century, among them one by D. Hilbert [1], which became the basis of far-reaching studies.

B. Hilbert's System of Axioms

Hilbert took as undefined elements points (denoted by A, B, C, ...), straight lines (or simply lines, denoted by a, b, c, ...), and planes (denoted by $\alpha$, $\beta$, $\gamma$, ...). Between these objects there exist incidence relations (expressed in phrases such as "A lies on a," "a passes through A," etc.); order relations ("B is between A and C"); congruence relations; and parallel relations. The relations are subject to the following five groups of axioms:

(I) Incidence axioms: (1) For two points A, B, there exists a line a through A and B. (2) If $A \ne B$, the line a through A, B is uniquely determined. We write $a = A \cup B$ and call a the join of A, B. (3) Every line contains at least two different points. There exist at least three points that do not lie on a line. (4) If A, B, C are points not on a line, there exists a plane $\alpha$ through A, B, C. (We also say that A, B, C lie on $\alpha$.) For every plane $\alpha$, there exists at least one point A on $\alpha$. (5) If A, B, C are points
not on a line, the plane $\alpha$ through A, B, C is uniquely determined. We write $\alpha = A \cup B \cup C$ and call $\alpha$ the join of A, B, C. (6) If A, B are two different points on a line a and if A, B lie on a plane $\alpha$, then every point on a lies on $\alpha$. (We say that a lies on $\alpha$ or $\alpha$ passes through a.) (7) If a point A lies on two planes $\alpha$, $\beta$, there exists at least one other point B on $\alpha$ and $\beta$. (8) There exist at least four points not lying on a plane.

(II) Ordering axioms: (1) If B is between A and C, then A, B, C are three different points lying on a line; also, B is between C and A. (2) If A, C are two different points, then there exists a point B such that C is between A and B. (3) If B is between A and C, then A is not between B and C.

We define a segment as a set of two different points A, B, denoted as AB or BA, and we call A and B ends of this segment. The set of points between A, B is called the interior of AB, and the set of points of $A \cup B$ that are neither ends nor interior points of AB is called the exterior of AB.

(4) Let A, B, C be three points not lying on a line. If a line a on the plane $A \cup B \cup C$ does not pass through A, B, or C, but passes through a point of the interior of AB, then it also passes through a point of the interior of BC or CA (Pasch's axiom).

The following propositions are proved from the above axioms. Given n points $A_1, A_2, \dots, A_n$ on a line ($n > 2$), we can rearrange them, if necessary, so that the point $A_j$ is between $A_i$ and $A_k$ whenever we have $1 \le i < j < k \le n$. There are exactly two ways of arranging the points in this manner (theorem of linear ordering). Let O be a point on a line a, and let A, B be two points on the line different from O. Write $A \sim B$ when $A = B$ or O is not between A and B; write $A \nsim B$ otherwise. Then $\sim$ is an equivalence relation between points on the line different from O; from $A \nsim B$, $A \nsim C$ it follows that $B \sim C$. We say that A, B are on the same side or on different sides of O on a depending on whether $A \sim B$ or $A \nsim B$. Two subsets $a'$ and $a''$ of a defined by $a' = \{A' \mid A \sim A'\}$, $a'' = \{A'' \mid A \nsim A''\}$ are called half-lines or rays on a with O as the extremity (or starting from O). Denoting by a, for simplicity, the set of points on a, we have $a = a' \cup \{O\} \cup a''$ (disjoint union).

Using axiom II.4, we can also prove the following: Let a be a line on $\alpha$, and let A, B be two points on $\alpha$ not lying on a. If $A = B$ or if the interior of the segment AB has no point in common with a, we say that A, B are on the same side of a on $\alpha$, and write $A \sim B$. Otherwise, we say that A, B are on different sides of a on $\alpha$ and write $A \nsim B$. Then $\sim$ is an equivalence relation between points on $\alpha$ not lying on a, and from $A \nsim B$, $A \nsim C$ follows $B \sim C$. The subsets $\alpha' = \{A' \mid A \sim A'\}$, $\alpha'' = \{A'' \mid A \nsim A''\}$ of $\alpha$ are called half-planes on $\alpha$ bounded by a. Again, denoting by $\alpha$ and a the set of points on $\alpha$ and the set on a, respectively, we obtain $\alpha = \alpha' \cup a \cup \alpha''$ (disjoint union).

(III) Congruence axioms: Two segments AB, A'B' can be in a relation of congruence, expressed symbolically as $AB \equiv A'B'$. (Segments AB and A'B' are then said to be congruent. Since the segment AB is defined as the set $\{A, B\}$, the four relations $AB \equiv A'B'$, $BA \equiv A'B'$, $AB \equiv B'A'$, $BA \equiv B'A'$ are equivalent.) This relation is subject to the following three axioms: (1) Let A, B be two different points on a line a, and $A_1$ a point on a line $a_1$ ($a_1$ may or may not be equal to a). Let $a_1'$ be a ray on $a_1$ starting from $A_1$. Then there exists a unique point $B_1$ on $a_1'$ such that $AB \equiv A_1B_1$. (2) From $A_1B_1 \equiv AB$ and $A_2B_2 \equiv AB$ follows $A_1B_1 \equiv A_2B_2$. (Hence it follows that $\equiv$ is an equivalence relation between segments.) (3) Let A, B, C be three points such that B is between A and C, and let $A_1$, $B_1$, $C_1$ be three points such that $B_1$ is between $A_1$ and $C_1$. Then from $AB \equiv A_1B_1$, $BC \equiv B_1C_1$ follows $AC \equiv A_1C_1$.

Now let h, k be two different lines in a plane $\alpha$ and through a point O, and let $h'$, $k'$ be the rays on h, k starting from O. The set of two such rays $h'$, $k'$ is called an angle in $\alpha$, denoted by $\angle(h', k')$ or $\angle(k', h')$. This angle is also denoted by $\angle AOB$, where A, B are points of $h'$, $k'$, respectively. The rays $h'$, $k'$ are called the sides and the point O is called the vertex of this angle. Then $h'$ is a subset of a half-plane on $\alpha$ bounded by k, and $k'$ is a subset of a half-plane on $\alpha$ bounded by h. The intersection of these two half-planes is called the interior of this angle, and the subset of $\alpha - O$ consisting of points belonging to neither the inside nor the sides of the angle is called the exterior of the angle. Between two angles $\angle(h', k')$, $\angle(h_1', k_1')$ there may exist the relation of congruence, again expressed by the symbol $\equiv$, as in the case of segments, and subject to the following two axioms: (4) Let $\angle(h', k')$ be an angle on a plane $\alpha$ and $h_1$ be a line on $\alpha_1$ ($\alpha_1$ may or may not be equal to $\alpha$). Let $O_1$ be a point on $h_1$, $h_1'$ a ray on $h_1$ starting from $O_1$, and $\alpha_1'$ a half-plane on $\alpha_1$ bounded by $h_1$. Then there exists a unique ray $k_1'$ starting from $O_1$ and lying in $\alpha_1'$ such that $\angle(h', k') \equiv \angle(h_1', k_1')$. Moreover, $\angle(h', k') \equiv \angle(h', k')$ always holds. (Hence it follows that $\equiv$ is an equivalence relation between angles.) (5) Let both A, B, C and $A_1$, $B_1$, $C_1$ be triples of points not lying on a line. Then from $AB \equiv A_1B_1$, $AC \equiv A_1C_1$, and $\angle BAC \equiv \angle B_1A_1C_1$, it follows that $\angle ABC \equiv \angle A_1B_1C_1$.

(IV) Axiom of parallels: Suppose that a, b are two different lines. Then it follows from axiom I.2 that if a and b share a point P, such a point is the unique point lying on both a and
b. In this case we say that a, b intersect at P and write $a \cap b = P$. On the other hand, if a and b have no point in common and if a, b are on the same plane, we say that a, b are parallel and write $a \parallel b$. If A, a are on a plane $\alpha$ and A is not on a, we can prove (utilizing axioms I, II, and III) that there exists a line b passing through A in $\alpha$ such that $a \parallel b$. The axiom of parallels postulates the uniqueness of such a b.

(V) Axioms of continuity: (1) Let AB, CD be two segments. Then there exist a finite number of points $A_1, A_2, \dots, A_n$ on $A \cup B$ such that $CD \equiv AA_1 \equiv A_1A_2 \equiv \dots \equiv A_{n-1}A_n$ and B is between A and $A_n$ (Archimedes' axiom). (2) The set of points on a line a (again denoted for simplicity by a) is "maximal" in the following sense: It should satisfy axioms II.1-II.3, III.1, V.1, and the theorem of linear ordering. If $\bar{a}$ is a set of points satisfying these axioms such that $\bar{a} \supset a$, then $\bar{a}$ should be $= a$ (axiom of linear completeness). Hence follows the theorem of completeness: the set of points, lines, and planes is maximal in the sense that it is not possible to add further points, lines, or planes to this set with the resulting set still satisfying axioms I-IV and V.1.

C. Consistency

In formulating the above axioms and proving their consistency, Hilbert assumed the consistency of the theory of real numbers (→ 156 Foundations of Mathematics). To prove consistency, Hilbert constructed a model for the above axioms using the method of analytic geometry. He defined points as triples of real numbers $(x_1, x_2, x_3)$, lines and planes as sets of points satisfying suitable systems of linear equations, and relations of ordering, congruence, and parallelism in the usual way. It is easy to verify that such a system satisfies all the axioms I-V. Thus the consistency of these axioms is reduced to the consistency of the theory of real numbers (→ 35 Axiom Systems).

A model for I-IV and V.1 can be obtained in the countable field $\mathbf{R}_0$ of all real †algebraic numbers instead of $\mathbf{R}$. Then $\mathbf{R}_0$ can be further restricted to its subfield $P_0$ defined as follows: Let F be an arbitrary field. An †extension of F of the form $F(\sqrt{1 + \lambda^2})$ with $\lambda \in F$ is called a Pythagorean extension of F, and F is said to be a Pythagorean field if any Pythagorean extension of F coincides with F (e.g., $\mathbf{R}_0$ and $\mathbf{R}$ are Pythagorean). It is easily verified that I-IV are satisfied in the "analytic geometry over any Pythagorean field." On the other hand, we can construct a minimal Pythagorean field containing a given field (the Pythagorean closure of the field) in the same way as we construct the †algebraic closure of a field. The field $P_0$ is defined as the Pythagorean closure of the field $\mathbf{Q}$ of rational numbers.

D. Independence of Axioms

In Hilbert's system, the axioms I and II are used to formulate further axioms. On the other hand, it can be shown that each of the groups III, IV, and V is independent from other axioms.

The independence of IV is shown by the consistency of non-Euclidean geometry (→ 285 Non-Euclidean Geometry). The following model shows the independence of III.5: In the analytic model for I-V, we replace the definition of distance between two points $(x_1, x_2, x_3)$, $(y_1, y_2, y_3)$ by

$$\left((x_1 - y_1 + x_2 - y_2)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2\right)^{1/2}.$$

Then III.5 does not hold, while all other axioms remain satisfied. The independence of V.2 is shown by the geometry over $\mathbf{R}_0$ or $P_0$. The independence of V.1 follows from the existence of the non-Archimedean Pythagorean field: the Pythagorean closure of any †non-Archimedean field (e.g., the field of rational functions of one variable over $\mathbf{Q}$ with a †non-Archimedean valuation) is such a field. A geometry in which V.1 does not hold is called a non-Archimedean geometry.

E. Completeness of the System of Axioms and Relations between Axioms

The †completeness of the system of axioms I-V can be shown by introducing coordinates in the geometry with these axioms and representing it as †Euclidean geometry of three dimensions. Axiom group V is essential for the introduction of coordinates over $\mathbf{R}$. Moreover, we have the following results:

(i) The geometry with the axioms I-IV can be represented as "Euclidean geometry" of three dimensions over a Pythagorean field, and the geometry with axioms I.1-I.3, II-IV can be represented as "Euclidean geometry" of two dimensions over a Pythagorean field.

(ii) The geometry with I, II, and a stronger axiom of parallels IV* (given a line a and a point A outside a, there exists one and only one line $a'$ passing through A that is parallel to a) can be represented as an †affine geometry over a field K that is not necessarily commutative.

(iii) The field K is commutative if and only if the following holds: Suppose that in Fig. 1 $A' \cup B \parallel A \cup B'$, $B' \cup C \parallel B \cup C'$. Then it follows that $A' \cup C \parallel A \cup C'$ (Pascal's theorem).
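Pascal's theorem in result (iii) can be checked numerically in a commutative coordinate field. The sketch below works in the affine plane over $\mathbf{Q}$ using exact fractions. It assumes (since Fig. 1 is not reproduced here) that A, B, C lie on one line and A', B', C' on another, and that the two lines meet at the origin. The helper names `scale`, `parallel`, and `pascal_configuration` are illustrative only and do not come from the article.

```python
from fractions import Fraction as Fr

def scale(vec, t):
    """Point t*vec on the line through the origin with direction vec."""
    return (t * vec[0], t * vec[1])

def parallel(p, q, r, s):
    """True if line pq is parallel to line rs (zero 2-D cross product)."""
    return (q[0] - p[0]) * (s[1] - r[1]) - (q[1] - p[1]) * (s[0] - r[0]) == 0

def pascal_configuration(a, b, c, a1, u=(Fr(1), Fr(0)), v=(Fr(1), Fr(2))):
    """Build A, B, C on the line t*u and A', B', C' on the line t*v.

    B' and C' are chosen so that both hypotheses of Pascal's theorem hold:
    A'B // AB' forces b*b1 == a*a1, and B'C // BC' forces c*c1 == b*b1.
    """
    b1 = a * a1 / b
    c1 = b * b1 / c
    A, B, C = (scale(u, t) for t in (a, b, c))
    A1, B1, C1 = (scale(v, t) for t in (a1, b1, c1))
    return A, B, C, A1, B1, C1

A, B, C, A1, B1, C1 = pascal_configuration(Fr(2), Fr(3), Fr(5), Fr(7))
hypotheses = parallel(A1, B, A, B1) and parallel(B1, C, B, C1)
conclusion = parallel(A1, C, A, C1)
print(hypotheses, conclusion)  # prints: True True
```

The conclusion reduces to $c\,c_1 = a\,a_1$, which follows from $b\,b_1 = a\,a_1$ and $c\,c_1 = b\,b_1$ only because the products may be reordered freely. This is where commutativity of K enters.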
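The model used in Section D for the independence of III.5 can also be explored numerically. The sketch below is only an illustration under a stated assumption: segments are compared with the modified distance from Section D, while angles are still measured in the ordinary Euclidean way (the article says only that the distance is replaced). The two triangles and the function names are hypothetical choices made for this example.

```python
import math

def modified_distance(x, y):
    """Modified distance of Section D: the first difference is sheared by the second."""
    d1, d2, d3 = (x[i] - y[i] for i in range(3))
    return math.sqrt((d1 + d2) ** 2 + d2 ** 2 + d3 ** 2)

def euclidean_angle(vertex, p, q):
    """Ordinary Euclidean angle p-vertex-q in degrees (assumed unchanged in the model)."""
    u = [p[i] - vertex[i] for i in range(3)]
    v = [q[i] - vertex[i] for i in range(3)]
    dot = sum(s * t for s, t in zip(u, v))
    norm_u = math.sqrt(sum(s * s for s in u))
    norm_v = math.sqrt(sum(t * t for t in v))
    return math.degrees(math.acos(dot / (norm_u * norm_v)))

# Two triangles with a right angle at A and at A1.
A, B, C = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
A1, B1, C1 = (0.0, 0.0, 0.0), (0.0, 1 / math.sqrt(2), 0.0), (math.sqrt(2), 0.0, 0.0)

# Hypotheses of III.5: AB = A1B1, AC = A1C1, angle BAC = angle B1A1C1.
sides_agree = (math.isclose(modified_distance(A, B), modified_distance(A1, B1))
               and math.isclose(modified_distance(A, C), modified_distance(A1, C1)))
angles_at_A_agree = math.isclose(euclidean_angle(A, B, C), euclidean_angle(A1, B1, C1))

# Conclusion of III.5 would require angle ABC = angle A1B1C1.
angle_ABC = euclidean_angle(B, A, C)
angle_A1B1C1 = euclidean_angle(B1, A1, C1)
print(sides_agree, angles_at_A_agree, round(angle_ABC, 2), round(angle_A1B1C1, 2))
```

Under these assumptions the hypotheses of III.5 hold for the pair of triangles, yet the angles at B and at $B_1$ come out near 45 and 63.4 degrees, so the conclusion fails.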

Fig. 1

(iv) The "two-dimensional geometry" with I.1-I.3, II, and IV* can be embedded in the "three-dimensional geometry" with axioms I, II, and IV* if and only if the following holds: Suppose that in Fig. 2 we have $A \cup B \parallel A' \cup B'$, $B \cup C \parallel B' \cup C'$. Then it follows that $C \cup A \parallel C' \cup A'$ (Desargues's theorem).

Fig. 2

(v) From I.1-I.3, II, IV*, and Pascal's theorem follows Desargues's theorem.

(vi) Desargues's theorem is independent of I.1-I.3, II, III.1-III.4, IV*, and V; that is, we can construct a non-Desarguesian geometry (a geometry in which Desargues's theorem does not hold) in which these axioms are satisfied.

Axioms I, II, and IV*, as well as the theorems of Pascal and Desargues, are propositions in affine geometry. Each has a corresponding proposition in †projective geometry, and the results concerning them can be transferred to the case of projective geometry (→ 343 Projective Geometry).

F. Polygons and Their Areas

Suppose that we are given a finite number of points $A_i$ ($i = 0, 1, \dots, r$) in the geometry with axioms I and II. Then the set of segments (or, more precisely, the union of segments together with their interiors) $A_iA_{i+1}$ ($i = 0, 1, \dots, r-1$) is called a broken line joining $A_0$ with $A_r$. In particular, if $A_0 = A_r$, then this set is called a polygon with vertices $A_i$ and sides $A_iA_{i+1}$. A polygon with r vertices is called an r-gon. (For r = 3, 4, 5, 6, r-gons are called triangles, quadrangles, pentagons, and hexagons, respectively.) A plane polygon is a polygon whose vertices all lie on a plane. A polygon is called simple if any three consecutive vertices do not lie on a line, and two sides $A_iA_{i+1}$ and $A_jA_{j+1}$ ($i \ne j$) meet only when $j = i + 1$ or $i = j + 1$. In this article, we consider only simple plane polygons, and refer to them simply as polygons.

†Jordan's theorem implies that a polygon in the sense just defined divides the plane into two parts, its interior and its exterior. This special case of Jordan's theorem can be proved by I.1-I.3 and II only. A polygon P is divided into two polygons $P_1$, $P_2$ by a broken line joining two points on sides of P and lying in the interior of P (Fig. 3). In this case, we say that P is decomposed into $P_1$, $P_2$ and write $P = P_1 + P_2$. We may again decompose $P_1$, $P_2$ and thereby arrive at a decomposition of the form $P = P_1 + \dots + P_k$. Axiom III is used to introduce the congruence relation $\equiv$ between polygons. Two polygons P, Q are called decomposition-equal if there exist decompositions $P = P_1 + \dots + P_k$, $Q = Q_1 + \dots + Q_k$ such that $P_1 \equiv Q_1, \dots, P_k \equiv Q_k$. This is expressed by $P \mathrel{z} Q$. We call P, Q supplementation-equal if there exist two polygons $P'$, $Q'$ such that $(P + P') \mathrel{z} (Q + Q')$, $P' \mathrel{z} Q'$. This will be expressed by $P \mathrel{e} Q$. If we assume IV, we can use result (i) of Section E. Let K be the ground field of the geometry (K is Pythagorean, hence †ordered). The area of polygon P is defined as the positive element $m(P)$ of K assigned to P such that $m(P + Q) = m(P) + m(Q)$, and $m(P) = m(P')$ if $P \equiv P'$. From $P \mathrel{z} Q$ or $P \mathrel{e} Q$, it follows that $m(P) = m(Q)$. Under these axioms, it is proved that $m(P) = m(Q)$ implies $P \mathrel{e} Q$. If we also assume V.1, then $m(P) = m(Q)$ implies $P \mathrel{z} Q$. Thus the theory of area of polygons can be constructed without assuming axiom V.2, though this result cannot be generalized to higher-dimensional cases. For the case of three dimensions, we can construct two solids of the same volume that are not supplementation-equal [2, 7].

Fig. 3

G. Geometric Construction by Ruler and a Transferrer of Constant Lengths

The geometry with I-IV can be represented as 3-dimensional Euclidean geometry over a Pythagorean field. Conversely, all these axioms are valid in 3-dimensional Euclidean geometry over any Pythagorean field. Thus the minimal system of "quantities" whose existence is assured in geometry with these axioms is the field $P_0$, the Pythagorean closure of $\mathbf{Q}$. Hilbert noticed that the existence of a
geometric abject under axioms I-IV cari be [6] G. Hessenberg and J. Diller, Grundlagen
expressed as its constructibility by ruler (i.e., an der Geometrie, de Gruyter, 1967.
instrument to draw a straight line joining two [7] V. G. Boltyanskiï, Hilbert’s third problem,
points) and a transferrer of constant lengths. Winston, 1978.
The latter, for a constant length x, is an instru-
ment that permits fïnding the point X on the
given ray AB such that AX =x. It is not pos-
sible to construct by ruler and transferrer a11
the points that cari be constructed by means 156 (1.1)
of ruler and compass (→ 179 Geometric Construction). However, it is
possible to construct all the lengths $\lambda x$, where $\lambda$ is any
element of $P$. Hilbert conjectured that an element of $P$ can be
characterized as a †totally positive algebraic number of degree $2^n$,
$n \in \mathbf{N}$. This conjecture was proved by Artin [3].

H. Related Topics

While Hilbert's foundations are concerned with 3-dimensional Euclidean
geometry, it is easy to generalize these results to the case of
n-dimensional Euclidean geometry (→ 139 Euclidean Geometry). Also, for
affine and projective geometries, there are well-organized systems of
axioms (at least for the case of dimensions $\ge 3$). Hilbert [1, Appendix
III] showed that plane †hyperbolic geometry can be constructed on a
modified system of axioms, but for other non-Euclidean geometries (in
particular, †elliptic geometries) there are no known systems of axioms as
good as Hilbert's for the Euclidean case. On the other hand, Hilbert [1,
Appendix IV] gives another method of constructing Euclidean geometry by
characterizing the group of motions as the topological group with certain
properties. G. Thomsen [4] rewrote Hilbert's system of axioms in
group-theoretical language utilizing the fact that the group of motions is
generated by symmetries with respect to points, lines, and planes.
Finally, Hilbert's study of the foundations of geometry led him to
research in the †foundations of mathematics.

References

[1] D. Hilbert, Grundlagen der Geometrie, Teubner, 1899, seventh edition,
1930.
[2] M. Dehn, Über den Rauminhalt, Math. Ann., 55 (1902), 465-478.
[3] E. Artin, Über die Zerlegung definiter Funktionen in Quadrate, Abh.
Math. Sem. Univ. Hamburg, 5 (1926), 100-115 (Collected papers,
Addison-Wesley, 1965, 273-288).
[4] G. Thomsen, Grundlagen der Elementargeometrie, Teubner, 1933.
[5] K. Reidemeister, Grundlagen der Geometrie, Springer, 1968.

Foundations of Mathematics

A. General Remarks

The notion of †set, introduced toward the end of the 19th century, has
proved to be one of the most fundamental and useful ideas in mathematics.
Nonetheless, it has given rise to well-known †paradoxes. Based on this
notion, R. Dedekind developed the theories of natural numbers [2] and real
numbers [3], defining the latter as "cuts" of the set of rational numbers.
Thus set theory served as a unifying principle of mathematics.

It has been noted, however, that some of the most commonly utilized
arguments in set theory, which are at the same time the most useful in
mathematics and belong almost to the basic framework of formal logic
itself, resemble very much those which give rise to paradoxes. This fact
has caused many critical mathematicians to question the very nature of
mathematical reasoning. Thus a new field, foundations of mathematics, came
into being at the beginning of this century. This field was divided at its
inception into different doctrines according to the views of its
initiators: logicism by B. Russell, intuitionism by L. E. J. Brouwer, and
formalism by D. Hilbert. In set theory, which was the origin of this
controversy, it was pointed out that the "definition" of set as given by
G. Cantor was too naive, and axiomatic treatments of this theory were
proposed (→ 33 Axiomatic Set Theory).

B. Logicism

Russell asserted that mathematics is a branch of logic and that paradoxes
come from neglecting the "types" of concepts. According to his opinion,
mathematics deals formally with structures independently of their concrete
meanings. Science of this character has been called logic from antiquity.
According to him, logic is the youth of mathematics, and mathematics is
the manhood of logic. To construct mathematics from this standpoint,
asserted Russell, ordinary language is lengthy and inaccurate, and some
proper system of symbols should be used instead. Thus he tried to
reconstruct mathematics using †symbolic logic.
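
As an illustration of the claim that paradoxes come from neglecting types, the following is a minimal Lean 4 sketch. It is not Russell's own formalism; the names `mem`, `r`, and `no_russell_object` are hypothetical and chosen only for this example. It assumes a single untyped universe in which any object may be asked whether it is a member of any other, and shows that an object collecting exactly the non-self-members cannot exist.

```lean
-- Hypothetical setting: `mem x y` reads "x is a member of y", with members
-- and collections living in the same universe `α` (no type distinction).
-- An object `r` whose members are exactly the non-self-members is absurd.
theorem no_russell_object {α : Type} (mem : α → α → Prop) (r : α)
    (h : ∀ x, mem x r ↔ ¬ mem x x) : False :=
  -- Specialize the defining property of `r` to `r` itself.
  have hr : mem r r ↔ ¬ mem r r := h r
  -- `r` cannot be a member of itself ...
  have hn : ¬ mem r r := fun hm => (hr.mp hm) hm
  -- ... but then the defining property forces it to be one.
  hn (hr.mpr hn)

-- With types, a "set of α" is a predicate `α → Prop`, one level above `α`.
-- The self-application `fun (s : α → Prop) => s s` is rejected by the type
-- checker, so the paradoxical question cannot even be stated.
```

The design point is only that the contradiction needs members and collections to share one type; separating them into levels removes the offending statement rather than refuting it.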
Attempts to reorganize mathematics using logical symbols had formerly been
made by G. Leibniz, who wrote Dissertatio de arte combinatoria in 1666, as
well as by A. de Morgan, G. Boole, C. S. Peirce, E. Schröder, G. Frege,
G. Peano, and others. Symbols used by the last two authors resemble those
of today. Russell studied these works and published his own theory in a
monumental joint work with A. N. Whitehead: Principia mathematica (3 vols.,
1st ed. 1910-1913, 2nd ed. 1925-1927), in which the theories of natural
numbers and real numbers as well as analytic geometry are developed from
the fundamental laws of logic.

If this work had been completely successful, it could have eliminated any
possibility of the intrusion of paradoxes into mathematics. However, the
authors were forced to postulate an "unsatisfactory" axiom in order to
construct mathematics. They introduced the notion of †type as follows: An
object M defined as the set of all objects of a certain type belongs to a
higher type than the types of the elements of M. This serves to eliminate
certain paradoxes but brings about inconveniences such as the following.
Suppose that we are trying to construct the theory of real numbers from
that of rational numbers. Each real number can then be considered a
†predicate about rational numbers. If this predicate contains only
†quantifiers relating to variables running over all rational numbers, then
the corresponding real number is said to be predicative, otherwise
impredicative. According to Russell, the latter should have a higher type
than the former, which makes the theory of real numbers exceedingly
complicated. To avoid this difficulty, Russell proposed the axiom of
reducibility, which says that every predicate can be replaced by a
predicative one. With this rather artificial axiom Russell himself
expressed dissatisfaction. Russell also postulated the †axiom of infinity
and the †axiom of choice, which are also problematic. After examining the
philosophical background of the book, H. Weyl wrote about Principia
mathematica, "Mathematics is no more based on logic than the utopia built
by the logician." Nevertheless, logic as formulated in this book, as well
as the theory of types as developed by F. P. Ramsey in the school of
Russell and Whitehead, is still an important subject of mathematical logic.

C. Intuitionism

The intuitionist claims that mathematical objects or truths do not exist
independently from mathematically thinking spirit or intuition, and that
these objects or truths should be directly seized by mental or intuitional
activity. The philosophical standpoints of mathematicians such as
L. Kronecker and H. Poincaré in the 19th century or E. Borel, H. Lebesgue,
and N. N. Luzin at the turn of this century can be assimilated to
intuitionism, but those of the latter three are often said to belong to
semi-intuitionism or to French empiricism. Brouwer took a narrower
standpoint, strongly antagonistic to Hilbert's formalism. Today the word
"intuitionism" is generally interpreted in Brouwer's sense.

Brouwer sharply criticized the usual way of reasoning in mathematics and
claimed that indiscriminate use of the law of excluded middle (or tertium
non datur) $P \lor \lnot P$ cannot be permitted. According to him, the
proposition "Either there exists a natural number with a given property P,
or else no such number exists" is to be regarded as proved only when an
actual construction of a natural number with the property P is given or
when the absurdity of the existence of such a natural number can be
constructively proved. When neither of these two results can be shown, then
one can say nothing about the truth of the above proposition. Thus the
usual method of proof, known as the method of reductio ad absurdum, i.e.,
of proving a proposition P by proving its double negation
$\lnot\lnot P$, is not generally considered valid. It is a difficult but
important problem of mathematical logic to determine which parts of usual
mathematics can be reconstructed intuitionistically, though it does not
seem easy to reconstruct any part of mathematics elegantly from this
standpoint.

D. Formalism

To eliminate paradoxes, Hilbert tried to apply his axiomatic method. From
Hilbert's standpoint, any part of mathematics is a deductive system based
on its axioms. In the deductive development, however, "logic," including
set theory and elementary number theory, is used. Paradoxes appear already
in such logic. Hilbert's idea was to axiomatize such logic and to prove its
consistency. Thus one must first formalize the most elementary part of
mathematics, including logic proper.

Hilbert proved the consistency of Euclidean geometry by assuming the
consistency of the theory of real numbers. This is an example of a relative
consistency proof, which reduces the consistency proof of one system to
that of another. Such a proof can be meaningful only when the latter system
can somehow be regarded as based on sounder ground than the former. To
carry out the consistency proof of logic proper and set theory, one must
reduce it to that of another system with sounder
ground. For this purpose, Hilbert initiated metamathematics and the
finitary standpoint.

The finitary standpoint recognizes as its foundation only those facts that
can be expressed in a finite number of symbols and only those operations
that can be actually executed in a finite number of steps. Essentially, it
does not differ from the standpoint of intuitionism. The methods based on
this standpoint are also called constructive methods.

Metamathematics is also called proof theory. Its subject of research is
mathematical proof itself. Hilbert was the first to insist on its
importance. The theory is indispensable for consistency proofs of
mathematical systems, but it may also be used for other purposes. In fact,
the same idea can be seen in the †duality principle of projective geometry,
which dates from long before Hilbert's proclamation of formalism. This is
not a theorem of projective geometry deduced from its axioms; rather, it is
a proposition about the theorems in projective geometry, based on the type
of axioms and proofs in this subject.

According to Hilbert's method, one must develop proof theory from the
finitary standpoint with the aim of proving the consistency of axiomatized
mathematics. For this purpose, one must formalize the mathematical theory
in question by means of symbolic logic. A theory thus formalized is called
a formal system.

E. Some Results of Formalist Theory

One of the most remarkable results hitherto obtained with Hilbert's method
is the consistency proof of pure number theory by G. Gentzen [7]. This
consistency proof covers the largest domain for which an explicit
consistency proof has so far been obtained. However, the methods of
formalist proof theory have proved to be most effective in studying the
logical structure of mathematical theories and have led to various results
on the consistency of formalized mathematical systems, on symbolic logic,
and on axiomatic set theory. We give some examples.

(1) Gödel's Incompleteness Theorem. K. Gödel [6] showed that if a system
obtained by formalizing the theory of natural numbers is consistent, then
this system contains a †closed formula A such that neither A nor its
negation $\lnot A$ can be proved within the system. He originally proved
this under the assumption that the system is $\omega$-consistent. This is a
stronger condition for the system than simple consistency, but J. B. Rosser
[13] succeeded in replacing this by the latter. This result shows the
incompleteness not only of the usual theory of natural numbers but of any
consistent theory (from the finitary standpoint) containing the theory of
these numbers.

At the same time, Gödel also obtained the following important result: Let S
be any consistent formal system containing the theory of natural numbers.
Then it is impossible to prove the consistency of S by utilizing only
arguments that can be formalized in S. This means that a consistency proof
from the finitary standpoint of a formal system S inevitably necessitates
some argument that cannot be formalized in S.

(2) Consistency Proofs for Pure Number Theory. Gentzen [7] called pure
number theory the theory of natural numbers not depending on the free use
of set theory (differing consequently from the usual theory of natural
numbers based on †Peano axioms; → 294 Numbers) and proved its consistency.
W. Ackermann [14] proved the consistency of a similar theory admitting the
use of Hilbert's †$\varepsilon$-symbol. G. Takeuti [15] showed that
Gentzen's result can be obtained as a corollary to his theorem extending
†Gentzen's fundamental theorem on †predicate logic of the first order to a
†theory of types of a certain kind.

According to the result of Gödel mentioned in (1) above, some reasoning
outside pure number theory must be used to prove its consistency. In all
consistency proofs of pure number theory mentioned above, †transfinite
induction up to the first †$\varepsilon$-number $\varepsilon_0$ is used,
but all the other reasoning used in these proofs can be presented in pure
number theory. This shows that the legitimacy of transfinite induction up
to $\varepsilon_0$ cannot be proved in this latter theory. A direct proof
of this fact was given by Gentzen [16]. On the other hand, the legitimacy
of transfinite induction up to an ordinal number $< \varepsilon_0$ can be
proved within pure number theory.

Again, transfinite induction is not the only method by which to prove the
consistency of pure number theory. Actually, Gödel [17] carried out the
proof utilizing what he called computable functions of finite type on
natural numbers and what we call primitive recursive functionals of finite
type.

By restricting pure number theory further, one obtains weaker theories of
natural numbers whose consistency can be proved with finitary methods
without recourse to such methods as transfinite induction up to
$\varepsilon_0$. M. Presburger [18] proved the consistency of a theory in
which only the addition of numbers is considered an operation. Ackermann
[19], J. von Neumann [20], J. Herbrand [21], and K. Ono [22] proved the
consistency of theories in which some restrictions are placed
on the use of the axiom of †mathematical induction.

On the other hand, K. Schütte [23] gave a consistency proof for number
theory including what he called "infinite induction" from a stronger
standpoint than Hilbert's finitary one; he attempted to find a basis that
makes such a proof possible.

(3) The Consistency of Analysis. No definitive result has yet been obtained
from the standpoint of formalism, though many attempts are being made,
among which a recent one by C. Spector [24] should be mentioned.

(4) Axiomatic Set Theory. There are different kinds of axiom systems (→ 33
Axiomatic Set Theory). To give a consistency proof for any of these systems
is considered a very difficult problem today, but many interesting results
are known concerning the relative consistency or independence of these
axioms.

(5) The Skolem-Löwenheim Theorem. The metamathematical Skolem-Löwenheim
theorem states: Given a consistent system of axioms stated in the
first-order predicate logic whose cardinality is at most countable, there
always exists an †object domain consisting of countably many objects
satisfying all these axioms. For example, axiomatic set theory is stated in
predicate logic of the first order, and the cardinality of its axioms is
countable. Thus there exists an object domain consisting of countably many
objects satisfying all these axioms, provided that they are consistent.
Such a domain is called a countable model of axiomatic set theory. On the
other hand, from the axioms of this theory one can prove that there exists
a family of sets that is more than countable. This should also hold in a
model of the theory, in which each object represents a set. This situation
is known as the Skolem paradox. This does not imply, however, the
inconsistency of axiomatic set theory. In fact, the term "countable" is to
be interpreted in its mathematical sense when one says "there exists a
family of sets that is more than countable," while it should be interpreted
in its metamathematical sense when one speaks of a countable model of the
theory. It is the confusion of these two different interpretations that
leads to the "paradox."

(6) Skolem's Theorem on the Impossibility of Characterizing the System of
Natural Numbers by Axioms. T. Skolem [27] proved that it is not possible to
characterize the system of natural numbers by a countable system of axioms
stated in the predicate logic of the first order. More precisely, given any
consistent countable system of axioms satisfied by the system of natural
numbers, there always exists another †linearly ordered system satisfying
all these axioms and yet not isomorphic to the system of natural numbers as
an ordered system.

Gödel's incompleteness theorem and the Skolem paradox, as well as this
result, seem to indicate that there is a certain limit to the effectiveness
of the formalist method. On the other hand, nonstandard analysis has
originated in this result.

References

[1] S. C. Kleene, Introduction to metamathematics, Van Nostrand, 1952.
[2] R. Dedekind, Was sind und was sollen die Zahlen? Vieweg, 1888.
[3] R. Dedekind, Stetigkeit und irrationale Zahlen, Vieweg, 1872.
[4] B. Russell, Introduction to mathematical philosophy, Allen & Unwin and
Macmillan, 1919.
[5] A. Heyting, Intuitionism, North-Holland, 1956.
[6] K. Gödel, Über formal unentscheidbare Sätze der Principia Mathematica
und verwandter Systeme I, Monatsh. Math. Phys., 38 (1931), 173-198.
[7] G. Gentzen, Die Widerspruchsfreiheit der reinen Zahlentheorie, Math.
Ann., 112 (1936), 493-565.
[8] D. Hilbert and P. Bernays, Grundlagen der Mathematik, Springer, I,
second edition, 1968; II, 1970.
[9] A. Tarski, Logic, semantics, metamathematics, Clarendon Press, 1956.
[10] S. C. Kleene, Mathematical logic, Wiley, 1967.
[11] J. R. Shoenfield, Mathematical logic, Addison-Wesley, 1967.
[12] O. Becker, Grundlagen der Mathematik in geschichtlicher Entwicklung,
Alber, 1954.
[13] J. B. Rosser, Extensions of some theorems of Gödel and Church, J.
Symbolic Logic, 1 (1936), 87-91.
[14] W. Ackermann, Zur Widerspruchsfreiheit der Zahlentheorie, Math. Ann.,
117 (1940), 162-194.
[15] G. Takeuti, On the fundamental conjecture of GLC I, J. Math. Soc.
Japan, 7 (1955), 249-275.
[16] G. Gentzen, Beweisbarkeit und Unbeweisbarkeit von Anfangsfällen der
transfiniten Induktion in der reinen Zahlentheorie, Math. Ann., 119 (1943),
140-161.
[17] K. Gödel, Über eine bisher noch nicht benützte Erweiterung des finiten
Standpunktes, Dialectica, 12 (1958), 280-287.
[18] M. Presburger, Über die Vollständigkeit eines gewissen Systems der
Arithmetik ganzer Zahlen, in welchem die Addition als einzige Operation
hervortritt, C. R. du I Congrès des Math. des Pays Slaves, Warsaw (1929),
92-101, 395.
[19] W. Ackermann, Begründung des "tertium non datur" mittels der
Hilbertschen Theorie der Widerspruchsfreiheit, Math. Ann., 93 (1924), 1-36.
[20] J. von Neumann, Zur Hilbertschen Beweistheorie, Math. Z., 26 (1927),
1-46.
[21] J. Herbrand, Sur la non-contradiction de l'arithmétique, J. Reine
Angew. Math., 166 (1931), 1-8.
[22] K. Ono, Logische Untersuchungen über die Grundlagen der Mathematik, J.
Fac. Sci. Univ. Tokyo, 3 (1938), 329-389.
[23] K. Schütte, Beweistheoretische Erfassung der unendlichen Induktion in
der Zahlentheorie, Math. Ann., 122 (1951), 369-389.
[24] C. Spector, Provably recursive functionals of analysis: a consistency
proof of analysis by an extension of principles formulated in current
intuitionistic mathematics, Recursive function theory, Amer. Math. Soc.
Proc. Symp. Pure Math., 5 (1962), 1-27.
[25] L. Löwenheim, Über Möglichkeiten im Relativkalkül, Math. Ann., 76
(1915), 447-470.
[26] T. Skolem, Einige Bemerkungen zur axiomatischen Begründung der
Mengenlehre, 5 Congress der Skandinavischen Mathematiker, Helsingfors
(1923), 217-232.
[27] T. Skolem, Über die Nichtcharakterisierbarkeit der Zahlenreihe mittels
endlich oder abzählbar unendlich vieler Aussagen mit ausschliesslich
Zahlenvariablen, Fund. Math., 23 (1934), 150-161.

157 (VI.22)
Four-Color Problem

A. Brief History

It is obvious that four colors are necessary to color some geographical
maps on a sphere (or plane), but are four colors sufficient to color every
map? This is the so-called four-color problem. The precise formulation of
this problem will be given in Section B. The conjecture was made by Francis
Guthrie and communicated through his brother Frederick Guthrie to A. De
Morgan in 1852 [4]. A. Cayley called attention to the problem in 1878. It
became famous after J. P. Heawood pointed out a mistake in the proof by
A. B. Kempe (1879). Heawood also studied the problem of coloring maps on
arbitrary surfaces (→ Section E).

Though there have been various approaches to the four-color problem, the
main stream of investigation has concentrated on obtaining an unavoidable
set consisting of reducible arrangements (→ Section D) in order to correct
the mistake made by Kempe. G. D. Birkhoff [7] first discovered the simplest
nontrivial reducible arrangement, nowadays called Birkhoff's diamond
(Fig. 1). Using all reducible arrangements known up to that time,
P. Franklin showed that every map with up to 25 countries must contain a
reducible arrangement, so that such a map is four-colorable [8]. This limit
has been gradually increased; in 1975 a reducible arrangement for 25
countries was found.

Fig. 1
Birkhoff's diamond.

H. Heesch invented the method of discharging (→ Section D), found criteria
for reducibility, and finally conjectured the existence of an unavoidable
set of reducible arrangements with several thousand elements, but this was
too large to construct by hand. W. Haken and K. Appel with J. Koch worked
with high-speed computers for a total of 1,200 CPU hours over a period of
four and a half years and finally succeeded in constructing and checking an
unavoidable set of reducible arrangements with 1,834 elements, which proved
the four-color problem affirmatively (1976; [9,10]). For early
investigations of the four-color problem → [5]; for results up to the 1930s
→ [3].

B. Precise Formulation of the Problem

To formulate the problem precisely, we must state the following two
conditions: (i) Every country on a map is a †connected domain; a connected
part of the sea is considered to be a country. (ii) Two countries sharing
boundary lines must be colored differently. On the other hand, if two
countries share only a finite number of points, then they may share the
same color.

Actually we can modify the map so that no more than three countries meet at
the same point. A map with this condition is called a trivalent or cubic
map. In the study of the four-color problem, we can restrict ourselves to
trivalent maps without two-sided countries. In the remainder of this
article, we shall assume that the maps considered have these properties.

In recent investigations, it has become customary to convert maps into
their †dual graphs, where each country is replaced by its capital lying
inside, and for each adjacent pair A, B (Fig. 2) the boundary is replaced
by a line connecting the representative points of A and B in such a way
that it meets the boundary only once.

Fig. 2
Dual graph.

The formal extension of the four-color problem to higher dimensions is
trivial, since we can easily construct arrangements needing arbitrarily
many colors for coloring. By converting to a dual graph, one can see that
this is not surprising, since the planarity of a graph is a strong
restriction, while the condition of representability of a linear graph in
3-dimensional space imposes no restriction.

C. Tait's Algorithm

Suppose that a planar trivalent map M is four-colored. Denote the four
colors by 0, 1, 2, and 3. (In Fig. 3, we use A, B, C, and D instead of 0,
1, 2, and 3, respectively.) Here we take the operation $a \oplus b = c$.
Represent the integers a, b = 0, 1, 2, or 3 in the binary (2-adic) number
system as 2-digit numbers $a_1 a_2$ and $b_1 b_2$, where $a_i$ and $b_i$
are 0 or 1. For each digit we take the operation $a_i \oplus b_i = c_i$ to
be binary addition without carrying, i.e., the operation is that of the
logical (exclusive) "or." The number $c = c_1 c_2$ in the binary system
represents the result of the operation. To each boundary line, we give the
number corresponding to $a \oplus b$, where a and b are the colors for the
two countries meeting at the boundary. Then we have the numbering 1, 2, and
3 for each edge, where 1, 2, and 3 are labeled once and only once for each
triple of edges emanating from each vertex. Such numbering for edges is
called Tait coloring for edges. Then we give the signature + or - to each
vertex, according as the order of the edge numbering is counterclockwise or
clockwise. Then the algebraic sum of the signatures along the boundary of
each country in the map is always a multiple of 3. Conversely, if we give
the signature + or - to each vertex in such a way that the algebraic sum of
the signatures is always a multiple of 3 along the boundary of each
country, then we get a four-coloring of the map by reversing the above
procedure. This is called Tait's algorithm, which shows that the
four-coloring of a planar map is an †NP problem. As for five-coloring, it
is known that an algorithm of polynomial complexity exists.

Tait's algorithm also shows that the four-color problem is equivalent to
the following apparently elementary geometric proposition: "For any convex
polyhedron, we can always cut near some of its vertices in such a way that
the resulting polyhedron has only faces such that the number of sides is a
multiple of 3." Many other equivalent statements of the four-color problem
are known [1,2].

Fig. 3
An example of Tait's algorithm. At the vertices, • stands for + and ○ for
-. Within the domains, A, B, C, and D stand for four colors corresponding
to 0, 1, 2, and 3, respectively.

D. Solution of the Four-Color Problem

For a given planar trivalent map, denote by $V_n$ the number of the
countries with n sides ($n \ge 3$). From Euler's theorem on polyhedra, we
have immediately the relation

$\sum_n (6 - n) V_n = 12.$ (1)

We easily see from this that every planar map must contain 3-, 4-, or
5-sided countries. A family F of the arrangements of countries with the
property that every planar map must contain at least one arrangement
belonging to F is called an unavoidable set. The family consisting of 3-,
4-, and 5-sided countries is the simplest example. In order to obtain an
unavoidable set, Heesch invented the method of discharging. Let us assume
the existence of a map M that contains no arrangement of a family F. We
give a signed charge of (6 - n) to every n-gon in M. Next we divide and
move the charges so that the pluses and the minuses cancel out. If all
positive charges disappear,
according to the assumption of the existence of the map M, then we have a
contradiction of (1), so we can conclude that no such map M exists.

A 3-sided country can be ignored in four-coloring. As for a 4-sided country
A, Kempe proved (by means of a definite modification procedure) that after
four-coloring outside A, we also have a total four-coloring including A. An
arrangement of a country or countries A is called reducible if, as in the
above case, after four-coloring outside A, we get a total four-coloring
including A by suitable modifications. If we have an unavoidable set
consisting only of reducible arrangements, then by definition, every planar
map will be four-colorable.

Kempe believed that he had proved the reducibility of a pentagon (5-sided
country), but unfortunately, he missed a particular case. Even though the
pentagon is not reducible, much effort has been given to finding other
reducible arrangements. Many useful criteria for reducibility have also
been studied. If we have enough reducible arrangements, then we can either
eventually obtain an unavoidable set, which proves the four-color problem
affirmatively, or find a "minimal counterexample" without reducible
arrangements, which disproves it.

Haken obtained an unavoidable set consisting of arrangements without
certain kinds of reduction obstructions. (He called these "geographically
good arrangements.") He firmly believed that, applying certain
"probabilistic conjectures" to such arrangements, he would conquer the
problem by considering arrangements of up to 14 countries. With repeated
improvements on the discharging process and on the criteria for
reducibility, he finally was able to conclude that his speculation was
correct, as recounted in Section A. His investigation is not only an
example of the use of high-speed computation in pure mathematics, but also
one inviting reassessment of the meaning of mathematical proof.

E. Coloring Maps on Arbitrary Surfaces

On a †torus, seven colors are sufficient to color any map and that many
colors are necessary to color some maps. Heawood (1890) investigated the
problem of coloring maps on closed surfaces with †Euler characteristic
$\chi < 2$, orientable or not. The least number of colors sufficient to
color any map on a surface is called the chromatic number of the surface.
Heawood proved that the chromatic number is less than or equal to

$p = \left[ \frac{7 + \sqrt{49 - 24\chi}}{2} \right] \quad (\chi < 2),$ (2)

where $[\alpha]$ means the largest integer $a \le \alpha$. After this was
proved in various special cases, J. W. T. Youngs and G. Ringel finally
proved in 1968 that the number p equals the chromatic number except for a
Klein bottle (a nonorientable surface with $\chi = 0$) [6]. Franklin proved
in 1934 that the chromatic number for a Klein bottle is 6.

References

[1] O. Ore, The four-color problem, Academic Press, 1967.
[2] T. Saaty and P. Kainen, The four-color problem: assaults and conquest,
McGraw-Hill, 1977.
[3] N. L. Biggs (ed.), Graph theory 1736-1936, Oxford Univ. Press, 1976.
[4] K. O. May, The origin of four-colour conjecture, Isis, 56 (1965),
346-348.
[5] W. W. R. Ball, Mathematical recreations and essays, Macmillan, 1892;
revised twelfth edition, H. S. M. Coxeter (ed.), Toronto Univ. Press, 1958.
[6] G. Ringel, Map color theorem, Springer, 1974.
[7] G. D. Birkhoff, The reducibility of maps, Amer. J. Math., 35 (1913),
115-128.
[8] P. Franklin, The four-color problem, Amer. J. Math., 44 (1922),
225-236.
[9] K. Appel and W. Haken, Every planar map is four-colorable I.
Discharging, Illinois J. Math., 21 (1977), 429-489.
[10] K. Appel, W. Haken, and J. Koch, Every planar map is four-colorable
II. Reducibility, Illinois J. Math., 21 (1977), 490-567.

158 (XXI.24)
Fourier, Jean Baptiste Joseph

Jean Baptiste Joseph Fourier (March 21, 1768-May 16, 1830) was born in
Auxerre, France, the son of a tailor, and was orphaned at the age of eight.
In 1790, he was appointed professor at the Ecole Polytechnique. In 1798,
Napoleon took him on his Egyptian campaign together with G. Monge. On his
return to France, he was made governor of the department of Isère. With the
downfall of Napoleon, he lost his position; however, he was later appointed
to the French Academy of Science as a result of his research on the
transmission of heat. In 1826 he was elected a member of the Académie
Française.

His research on heat transmission was begun in 1800. In 1811, he presented
a prize-winning solution to a problem put forth by the Academy of Science.
He solved the equation for heat transmission under various †boundary
conditions. Fourier also stated (without rigorous proof) that an arbitrary
function could be represented by †trigonometric series (→ 159 Fourier
Series), a statement that gave rise to subsequent developments in analysis.

References

[1] J. B. J. Fourier, Oeuvres I, II, G. Darboux (ed.), Gauthier-Villars,
1888-1890.

159 (X.22)
Fourier Series

Throughout this article, we assume that f(x), g(x), ... are real-valued
functions and that integrals are always Lebesgue integrals.

A. Introduction

The set of functions

$\frac{1}{\sqrt{2\pi}},\ \frac{\cos x}{\sqrt{\pi}},\ \frac{\sin x}{\sqrt{\pi}},\ \ldots,\ \frac{\cos kx}{\sqrt{\pi}},\ \frac{\sin kx}{\sqrt{\pi}},\ \ldots,$

which is called the trigonometric system, is an †orthonormal system in
$(-\pi, \pi)$ (→ 317 Orthogonal Functions). Let f(x) be an element of
$L_1(-\pi, \pi)$ (i.e., †Lebesgue integrable in $(-\pi, \pi)$). We put

$a_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos kt\,dt, \quad b_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin kt\,dt, \quad k = 0, 1, \ldots,$ (1)

and call $a_k$, $b_k$ the Fourier coefficients of f. The formal series

$\frac{1}{2}a_0 + \sum_{k=1}^{\infty} (a_k\cos kx + b_k\sin kx)$ (2)

is called the Fourier series of f and is often denoted by
$\mathfrak{S}(f)$. To indicate that a formal series $\mathfrak{S}(f)$, as
above, is the Fourier series of a function f, we write

$f(x) \sim \frac{1}{2}a_0 + \sum_{k=1}^{\infty} (a_k\cos kx + b_k\sin kx).$

The sign $\sim$ means that the numbers $a_k$, $b_k$ are connected with f by
the formula (1); it does not imply that the series is convergent, still
less that it converges to f. Generally, trigonometric series are those of
form (2), where the $a_k$, $b_k$ are arbitrary real numbers. Since the
trigonometric series have period $2\pi$, we assume that the functions
considered are extended for all real x by the condition of periodicity
$f(x + 2\pi) = f(x)$. The study of the properties of the series
$\mathfrak{S}(f)$ and the representation of f by $\mathfrak{S}(f)$ are
major objects of the theory of Fourier series. Since
$e^{ix} = \cos x + i\sin x$, if we set $2c_k = a_k - ib_k$,
$c_{-k} = \bar{c}_k$ (k = 0, 1, 2, ...), we have

$c_k = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)e^{-ikt}\,dt, \quad k = 0, \pm 1, \ldots$

Then $\mathfrak{S}(f)$ is represented by the complex form
$\sum_{k=-\infty}^{\infty} c_k e^{ikx}$, and $\{e^{ikx}\}$
($k = 0, \pm 1, \ldots$) is an orthogonal system in $(-\pi, \pi)$. In this
complex form, we take symmetric partial sums such as
$\sum_{k=-n}^{n} c_k e^{ikx}$ (n = 1, 2, ...).

Consider the power series
$\frac{1}{2}a_0 + \sum_{k=1}^{\infty} (a_k - ib_k)z^k$ on the unit circle
$z = e^{ix}$ in the complex plane. Its real part is the trigonometric
series (2) and the imaginary part (with vanishing constant term) is

$\sum_{k=1}^{\infty} (a_k\sin kx - b_k\cos kx),$ (3)

which is called the conjugate series (or allied series) of f and is denoted
by $\tilde{\mathfrak{S}}(f)$. In complex form, the conjugate series is
$-i\sum_{k=-\infty}^{\infty} (\operatorname{sgn} k)\,c_k e^{ikx}$.

If f and g belong to $L_1(-\pi, \pi)$ and

$f(x) \sim \sum_{k=-\infty}^{\infty} c_k e^{ikx}, \quad g(x) \sim \sum_{k=-\infty}^{\infty} d_k e^{ikx},$

then

$\frac{1}{2\pi}\int_{-\pi}^{\pi} f(x - t)g(t)\,dt \sim \sum_{k=-\infty}^{\infty} c_k d_k e^{ikx}.$

The function

$(f * g)(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x - t)g(t)\,dt$

is called the convolution of f and g.

If f is †absolutely continuous, then the derivative $f'(x)$ satisfies

$f'(x) \sim i\sum_{k=-\infty}^{\infty} k c_k e^{ikx} = \sum_{k=1}^{\infty} k(b_k\cos kx - a_k\sin kx).$

If F(x) is an indefinite integral of f, then

$F(x) - \frac{1}{2}a_0 x \sim C + {\sum_{k=-\infty}^{\infty}}' \frac{c_k}{ik}e^{ikx} = C + \sum_{k=1}^{\infty} \frac{a_k\sin kx - b_k\cos kx}{k},$

where C is a constant of integration and the symbol ' indicates that the
term k = 0 is omitted from the sum.

If $f \in L_1(-\pi, \pi)$, then the Fourier coefficients

$c_n$ converge to 0 as $n\to\infty$ (Riemann-Lebesgue theorem). If $f$ satisfies the †Lipschitz condition of order $\alpha$ $(0<\alpha<1)$, then $c_n=O(n^{-\alpha})$, and if $f$ is of †bounded variation, then $c_n=O(n^{-1})$. When $f\in L_2(-\pi,\pi)$,

$$\frac{1}{\pi}\int_{-\pi}^{\pi}f(x)^2\,dx=\frac{a_0^2}{2}+\sum_{k=1}^{\infty}(a_k^2+b_k^2),$$

which is called the Parseval identity. If $\sum|c_k|^2<\infty$, then there exists a function $f\in L_2(-\pi,\pi)$ which has the $c_k$ as its Fourier coefficients. This converse is implied by the †Riesz-Fischer theorem (→ Appendix A, Table 11.I).

B. Convergence Tests

The $n$th partial sums $s_n(x)=s_n(x;f)$ of the Fourier series $\mathfrak{S}(f)$ can be written in the form

$$s_n(x)=\frac{1}{\pi}\int_{-\pi}^{\pi}f(x+t)D_n(t)\,dt,$$

where

$$D_n(t)=\frac{\sin\left((n+\tfrac12)t\right)}{2\sin(t/2)}.$$

The function $D_n(t)$ is called the Dirichlet kernel. For a fixed point $x$ we set $\varphi_x(t)=f(x+t)+f(x-t)-2f(x)$; then

$$s_n(x)-f(x)=\frac{1}{\pi}\int_0^{\pi}\varphi_x(t)D_n(t)\,dt.$$

Hence if the integral on the right-hand side tends to zero as $n\to\infty$, $\lim_{n\to\infty}s_n(x)=f(x)$. If $f$ vanishes in an interval $I=(a,b)$, then $\mathfrak{S}(f)$ converges uniformly in any interval $I'=(a+\varepsilon,b-\varepsilon)$ interior to $I$, and the sum of $\mathfrak{S}(f)$ is 0. This is called the principle of localization.

Here we give four convergence tests. (1) If $f$ is of bounded variation, $\mathfrak{S}(f)$ converges at every point $x$ to the value $\{f(x+0)+f(x-0)\}/2$. In addition, if $f$ is continuous at every point of a closed interval $I$, $\mathfrak{S}(f)$ is uniformly convergent in $I$ (Jordan's test). As a special case of this test, bounded functions having a finite number of maxima and minima and no more than a finite number of points of discontinuity have convergent Fourier series (Dirichlet's test). (2) If the integral $\int_0^{\pi}|\varphi_x(t)|/t\,dt$ is finite, then $\mathfrak{S}(f)$ converges at $x$ to $f(x)$ (Dini's test). (3) If

$$\int_0^{h}|\varphi_x(t)|\,dt=o(h)$$

and

$$\lim_{\eta\to0}\int_{\eta}^{\pi}\frac{|\varphi_x(t)-\varphi_x(t+\eta)|}{t}\,dt=0,$$

then $\mathfrak{S}(f)$ converges at $x$ to $f(x)$ (Lebesgue's test). Jordan's and Dini's tests are mutually independent, and both are included in Lebesgue's test, which, although not as convenient in certain cases, is quite powerful. (4) If $f(x)$ is continuous in $(a,b)$ and its modulus of continuity satisfies the condition $\omega(\delta)\log(1/\delta)\to0$ as $\delta\to0$ in this interval, then $\mathfrak{S}(f)$ converges uniformly in $(a+\varepsilon,b-\varepsilon)$ (Dini-Lipschitz test).

C. Summability

Let $s_n(x)$ be the $n$th partial sum of the Fourier series $\mathfrak{S}(f)$, and $\sigma_n(x)=\sigma_n(x;f)$ be the first arithmetic mean ($(C,1)$-mean) of $s_n(x)$ (i.e., $\sigma_n(x)=(s_0(x)+s_1(x)+\ldots+s_n(x))/(n+1)$). Then we have

$$\sigma_n(x)-f(x)=\frac{1}{\pi}\int_0^{\pi}\varphi_x(t)K_n(t)\,dt,$$

where

$$K_n(t)=\frac{1}{2(n+1)}\left(\frac{\sin((n+1)t/2)}{\sin(t/2)}\right)^2.$$

The expression $K_n(t)$ is called the Fejér kernel, and the $\sigma_n(x)$ are often called Fejér means. If the right and left limits $f(x\pm0)$ exist, $\mathfrak{S}(f)$ is †$(C,1)$-summable at the point $x$ to the value $(f(x+0)+f(x-0))/2$. If $f$ is continuous at every point of a closed interval $I$, $\mathfrak{S}(f)$ is uniformly $(C,1)$-summable in $I$ (Fejér theorem, 1904). As we explain in Section H, there exist continuous functions whose Fourier series are divergent at some points. Thus the summability of $\mathfrak{S}(f)$ is more important than its convergence. Fejér's theorem remains true if we replace $(C,1)$-summability by $(C,\alpha)$-summability $(\alpha>0)$. More generally, if $f\in L_1(-\pi,\pi)$, then $\mathfrak{S}(f)$ is $(C,\alpha)$-summable for $\alpha>0$ to the value $f(x)$ almost everywhere (H. Lebesgue). Since $(C,\alpha)$-summability $(\alpha>0)$ implies †summability by Abel's method, the result of Fejér's theorem is valid for †A-summability. However, the direct study of A-summability is also important. Let $f(r,x)$ be Abel's mean of $\mathfrak{S}(f)$; that is,

$$f(r,x)=\frac{a_0}{2}+\sum_{k=1}^{\infty}(a_k\cos kx+b_k\sin kx)r^k=\frac{1}{\pi}\int_{-\pi}^{\pi}f(x+t)P(r,t)\,dt,$$

where $P(r,t)=(1-r^2)/2(1-2r\cos t+r^2)$, $0\le r<1$. We call $P(r,t)$ the Poisson kernel. The function $f(r,x)$ is †harmonic inside the unit circle and tends to $f(x)$ as $r\to1$ almost everywhere. Hence $f(r,x)$ gives the solution of the †Dirichlet problem for the case of the unit circle.
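
The kernel formulas of Sections B and C can be checked numerically. The following Python sketch is an illustration only: the function names and the choice of the step function sign(x) on $(-\pi,\pi)$ as test function are assumptions made here, not part of the article. It verifies that $D_n(t)=\tfrac12+\sum_{k=1}^{n}\cos kt$, that $K_n$ is the arithmetic mean of $D_0,\ldots,D_n$, and that the Fejér means of a function bounded by 1 stay bounded by 1 (because $K_n\ge0$), whereas the partial sums need not.

```python
import math

def dirichlet_kernel(n, t):
    """D_n(t) = sin((n + 1/2) t) / (2 sin(t/2)), for t not a multiple of 2*pi."""
    return math.sin((n + 0.5) * t) / (2.0 * math.sin(t / 2.0))

def fejer_kernel(n, t):
    """K_n(t) = (1 / (2(n+1))) * (sin((n+1) t / 2) / sin(t/2))**2."""
    ratio = math.sin((n + 1) * t / 2.0) / math.sin(t / 2.0)
    return ratio * ratio / (2.0 * (n + 1))

def b_coeff(k):
    """Sine coefficient b_k of sign(x) on (-pi, pi) from formula (1); every a_k is 0."""
    return 2.0 * (1.0 - math.cos(k * math.pi)) / (math.pi * k)

def partial_sum(n, x):
    """s_n(x) for the test function sign(x)."""
    return sum(b_coeff(k) * math.sin(k * x) for k in range(1, n + 1))

def fejer_mean(n, x):
    """sigma_n(x) = (s_0 + ... + s_n) / (n + 1), i.e. term k gets weight 1 - k/(n+1)."""
    return sum((1.0 - k / (n + 1.0)) * b_coeff(k) * math.sin(k * x)
               for k in range(1, n + 1))

# Sample grid inside (0, pi); the helper values below are illustrative choices.
grid = [i * math.pi / 1000.0 for i in range(1, 1000)]
max_partial = max(partial_sum(50, x) for x in grid)
max_fejer = max(abs(fejer_mean(50, x)) for x in grid)
print("max s_50 on grid:", max_partial)
print("max |sigma_50| on grid:", max_fejer)
```

The weights $1-k/(n+1)$ in `fejer_mean` come from counting how many of the partial sums $s_0,\ldots,s_n$ contain the $k$th term.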

D. The Gibbs Phenomenon

Let $f(x)$ be of bounded variation and not continuous at 0: $f(0)=0$, $f(+0)=l>0$, $f(-0)=-l$. Then the partial sum $s_n(x)$ converges to $f(x)$ in the neighborhood of 0, but not uniformly. Moreover,

$$\lim_{n\to\infty}s_n\!\left(\frac{\pi}{n}\right)=lG,\qquad G=\frac{2}{\pi}\int_0^{\pi}\frac{\sin t}{t}\,dt=1.1789\ldots.$$

Hence as $x$ tends to 0 from above and $n$ tends to $\infty$, the values $y=s_n(x)$ accumulate in the interval $[0,lG]$, while $s_n(-x)=-s_n(x)$ in the neighborhood of 0. This phenomenon is called the Gibbs phenomenon. If $f$ is of bounded variation, then $\mathfrak{S}(f)$ exhibits the Gibbs phenomenon at every point of simple discontinuity of $f$. However, the $(C,1)$-means of $\mathfrak{S}(f)$ do not exhibit this phenomenon.

E. Conjugate Functions

For any integrable $f\in L_1(-\pi,\pi)$, the integral

$$\tilde{f}(x)=-\frac{1}{\pi}\lim_{\varepsilon\to0}\int_{\varepsilon}^{\pi}\frac{f(x+t)-f(x-t)}{2\tan(t/2)}\,dt$$

exists almost everywhere. The function $\tilde{f}(x)$ is called the conjugate function of $f(x)$. The conjugate series $\tilde{\mathfrak{S}}(f)$ is $(C,\alpha)$-summable $(\alpha>0)$ to the value $\tilde{f}(x)$ at almost every point, and a fortiori summable by Abel's method. Even if $f\in L_1(-\pi,\pi)$, $\tilde{f}$ does not always belong to the class $L_1(-\pi,\pi)$. For example, $\sum_{n=2}^{\infty}\cos nx/\log n$ is the Fourier series of a function $f\in L_1(-\pi,\pi)$, but its conjugate series $\sum_{n=2}^{\infty}\sin nx/\log n$ is not the Fourier series of a function in $L_1(-\pi,\pi)$. However, if both $f$ and $\tilde{f}$ are integrable, $\tilde{\mathfrak{S}}(f)=\mathfrak{S}(\tilde{f})$. If $f\in L_p$ $(p>1)$, then $\tilde{f}\in L_p$ and $\|\tilde{f}\|_p\le A_p\|f\|_p$; also, $\tilde{\mathfrak{S}}(f)=\mathfrak{S}(\tilde{f})$. If $|f|\log^+|f|$ is integrable (such a function is said to belong to the Zygmund class), then $\tilde{f}$ is integrable and $\tilde{\mathfrak{S}}(f)=\mathfrak{S}(\tilde{f})$. Moreover, in this case there exist constants $A$ and $B$ such that

$$\int_0^{2\pi}|\tilde{f}|\,dx\le A\int_0^{2\pi}|f|\log^+|f|\,dx+B.$$

If $f$ is merely integrable, so is $|\tilde{f}|^p$ for any $0<p<1$, and $\|\tilde{f}\|_p\le B_p\|f\|_1$ $(0<p<1)$. If $f\in\operatorname{Lip}\alpha$ $(0<\alpha<1)$, then $\tilde{f}\in\operatorname{Lip}\alpha$, but the theorem fails for $\alpha=0$ and $\alpha=1$. The conjugate function is important for convergence of partial sums of Fourier series.

F. Mean Convergence

The theorems on conjugate functions enable us to obtain some results for the †mean convergence of the partial sums $s_n$ of $\mathfrak{S}(f)$ and the partial sums $\tilde{s}_n$ of $\tilde{\mathfrak{S}}(f)$. If $f\in L_p$ $(p>1)$, then $\|f-s_n\|_p\to0$; if $f\in L_1$, then $\|f-s_n\|_p\to0$, $\|\tilde{f}-\tilde{s}_n\|_p\to0$ for every $0<p<1$. Also, if $|f|\log^+|f|\in L_1$, then $\|f-s_n\|_1\to0$, $\|\tilde{f}-\tilde{s}_n\|_1\to0$. As a corollary of this result, we obtain the following theorem, which is a generalization of the Parseval identity: If the Fourier coefficients of functions $f\in L_p$ and $g\in L_q$ $(1/p+1/q=1)$ are $a_n$, $b_n$ and $a_n'$, $b_n'$, respectively, we have the Parseval formula

$$\frac{1}{\pi}\int_0^{2\pi}fg\,dx=\frac{1}{2}a_0a_0'+\sum_{n=1}^{\infty}(a_na_n'+b_nb_n'),$$

where this series is convergent.

G. Analytic Functions of the Class $H_p$

Let $p>0$. A complex function $\varphi(z)$ holomorphic for $|z|<1$ is said to belong to the class $H_p$ (Hardy class) if there exists a constant $M$ such that

$$\int_0^{2\pi}|\varphi(re^{i\theta})|^p\,d\theta\le M\qquad(0\le r<1).$$

When $\varphi(z)\in H_p$, the nontangential limit $\varphi(e^{i\theta})=\lim_{z\to e^{i\theta}}\varphi(z)$ exists for almost all $\theta$. We write this as $\varphi(e^{i\theta})=f(\theta)+i\tilde{f}(\theta)$, where $f(\theta)$, $\tilde{f}(\theta)$ belong to the class $L_p$. Also, $\tilde{f}(\theta)$ coincides with the conjugate function of $f(\theta)$ for $p>1$. For $1<p<\infty$, $H_p$ is isomorphic to $L_p$, but for $p=1$ and $p=\infty$, $H_p$ and $L_p$ are different classes. Using the theory of functions of $H_p$, we can discuss some properties of Fourier series. If $\varphi(e^{i\theta})=f(\theta)+i\tilde{f}(\theta)$ is of bounded variation, then $\varphi(e^{i\theta})$ is absolutely continuous and its Fourier series converges absolutely. We set

$$g(\theta)=\left(\int_0^1(1-r)|\varphi'(re^{i\theta})|^2\,dr\right)^{1/2},$$

$$g^*(\theta)=\left(\int_0^1(1-r)\,dr\int_0^{2\pi}|\varphi'(re^{i(\theta+t)})|^2P(r,t)\,dt\right)^{1/2},$$

where $P(r,t)$ is the Poisson kernel. Then $g(\theta)\le2g^*(\theta)$, and there exist constants $A_p$, $B_p$, $C_p$, ... such that

$$\int_0^{2\pi}|g(\theta)|^p\,d\theta\le A_p\int_0^{2\pi}|\varphi(e^{i\theta})|^p\,d\theta,\qquad p>0,$$

[...], $\qquad p>1$, [...]

Denote by $s_n(\theta)$ and $\sigma_n(\theta)$, respectively, the partial sums and arithmetic means of the Fourier series of $\varphi$, and set

$$\gamma(\theta)=\left(\sum_{n=1}^{\infty}\frac{|s_n(\theta)-\sigma_n(\theta)|^2}{n}\right)^{1/2}.$$

Then $0<A_1\le g^*(\theta)/\gamma(\theta)\le A_2<\infty$. From these relations, we can prove that if the indices $n_k$ satisfy the conditions $\beta>n_{k+1}/n_k>\alpha>1$, $s_{n_k}(\theta)$ converges almost everywhere to $\varphi(e^{i\theta})$ for $\varphi(e^{i\theta})\in H_p$ $(1<p)$. If we set $\Delta_k(\theta)=\sum_{\nu=n_k+1}^{n_{k+1}}c_\nu e^{i\nu\theta}$, $\delta(\theta)=\left(\sum_{k=0}^{\infty}|\Delta_k(\theta)|^2\right)^{1/2}$, where $\varphi(e^{i\theta})\sim\sum_{\nu=0}^{\infty}c_\nu e^{i\nu\theta}$, then $\|\delta(\theta)\|_p\le A_p\|\varphi\|_p$ $(p>1)$. If $\varphi(z)\in H_p$ $(0<p<1)$, then $\sum c_\nu e^{i\nu\theta}$ is $(C,p^{-1}-1)$-summable to $\varphi(e^{i\theta})$ almost everywhere. These functions and relations were introduced mainly by J. E. Littlewood and R. E. A. C. Paley and were later generalized by A. Zygmund. There are more precise results by E. Stein [7], G. Sunouchi [8], S. Yano [9], and others.

H. Almost Everywhere Convergence and Divergence

P. du Bois Reymond (1876) first showed that there exists a continuous function whose Fourier series diverges at a point, but the problem of whether Fourier series of continuous functions converge almost everywhere (the so-called du Bois Reymond problem) remained unsolved for many years. At last in 1966, L. Carleson [10] proved that the Fourier series of a function belonging to $L_2$ converges almost everywhere; hence the du Bois Reymond problem was solved affirmatively. Using Carleson's method, R. A. Hunt [12] proved that

$$\int_0^{2\pi}\Bigl(\sup_n|s_n(x)|\Bigr)^p\,dx\le A_p\int_0^{2\pi}|f(x)|^p\,dx\qquad(1<p<\infty),$$

which implies that the Fourier series of $f\in L_p$ $(1<p<\infty)$ converges almost everywhere. Hunt also proved that

$$\int_0^{2\pi}\Bigl(\sup_n|s_n(x)|\Bigr)\,dx\le A\int_0^{2\pi}|f(x)|\,(\log^+|f(x)|)^2\,dx+A.$$

Moreover, P. Sjölin proved that if

$$\int_0^{2\pi}|f|\cdot\log^+|f|\cdot\log^+\log^+|f|\,dx<\infty,$$

then the Fourier series of $f(x)$ converges almost everywhere.

On the other hand, A. N. Kolmogorov gave an integrable function with Fourier series diverging everywhere (more precisely, $\limsup_{n\to\infty}|s_n(x)|=\infty$ almost everywhere). Using this example, J. Marcinkiewicz showed that there is an $f\in L_1$ such that $s_n(x;f)$ oscillates boundedly almost everywhere. Moreover, there exists an integrable function with integrable conjugate and almost everywhere diverging Fourier series [8]. For any given †null set $E$, we can construct a continuous function whose Fourier series diverges on every $x\in E$.

I. Absolute Convergence

The convergence of the series (1) $\sum(|a_n|+|b_n|)$ implies the absolute convergence of the trigonometric series (2) $a_0/2+\sum(a_n\cos nx+b_n\sin nx)$. Conversely, if the series (2) converges absolutely in a set of positive measure, the series (1) converges (Denjoy-Luzin theorem). For the absolute convergence of Fourier series, we have the following tests: If $f\in\operatorname{Lip}\alpha$ $(\alpha>1/2)$, then $\mathfrak{S}(f)$ converges absolutely, but for $\alpha=1/2$, this is no longer true. If $f(x)$ is of bounded variation and belongs to $\operatorname{Lip}\alpha$ $(\alpha>0)$, $\mathfrak{S}(f)$ converges absolutely.

Suppose that the Fourier series of a function $f(x)$ is absolutely convergent and the value of $f(x)$ belongs to an interval $(a,b)$. If $\varphi(z)$ is a function of a complex variable holomorphic at every point of the interval $(a,b)$, the Fourier series of $\varphi\{f(x)\}$ converges absolutely (Wiener-Lévy theorem). As a corollary we obtain that if $\mathfrak{S}(f)$ converges absolutely and $f(x)\ne0$, then $\mathfrak{S}(1/f)$ converges absolutely. The converse of the Wiener-Lévy theorem was proved by Y. Katznelson [6]. For a given $\varphi(x)$ defined in $[-1,1]$, if the Fourier series of $\varphi\{f(x)\}$ converges absolutely for every $f(x)$ with absolutely convergent Fourier series $(|f(x)|\le1)$, then $\varphi(z)$ is holomorphic at every point of the interval $[-1,1]$.

Many problems concerning this topic still remain unsolved. In particular, the determination of the structure of the functions with absolutely convergent Fourier series has not been completed.

J. Sets of Uniqueness

If $a_n\cos nx+b_n\sin nx$ converges to 0 on a set of positive measure, then $a_n,b_n\to0$ (Cantor-Lebesgue theorem). A point set $E\subset(0,2\pi)$ is called a set of uniqueness (or U-set) if every trigonometric series converging to 0 outside $E$ vanishes identically. A set that is not a U-set is

called a set of multiplicity (or M-set). G. Cantor showed that every finite set is a U-set, and W. H. Young showed that every denumerable set is a U-set. It is clear that any set $E$ of positive measure is an M-set, but D. E. Men'shov showed that there are †perfect M-sets of measure 0. Moreover, N. K. Bari showed that there exist perfect sets of type U. However, the structure problem of sets of uniqueness has not yet been solved completely.

A set $E$ is said to be of type H (or an H-set) if there exists a sequence of positive integers $n_1<n_2<\ldots$ and an interval $I$ such that for each $x\in E$, no point of $\{n_kx\}_{k=1}^{\infty}$ is in $I$ (mod $2\pi$). H-sets are sets of uniqueness, a fact given by A. Rajchman. I. I. Pyatetskiĭ-Shapiro generalized H-sets to $H^{(m)}$-sets [3].

Lacunary trigonometric series are series in which very few terms differ from zero. Such series can be written in the form

$$\sum_{k=1}^{\infty}(a_k\cos n_kx+b_k\sin n_kx)=\sum_{k=1}^{\infty}A_k(x).$$

S. Sidon established some of the characteristic properties of such series; he generalized them further and obtained the notion of Sidon sets (→ 192 Harmonic Analysis). We often define a lacunary series more specifically as a series for which the $n_k$ satisfy Hadamard's gaps; that is, $n_{k+1}/n_k>q>1$. Then if $\sum_{k=1}^{\infty}(a_k^2+b_k^2)$ is finite, the series $\sum_{k=1}^{\infty}A_k(x)$ converges almost everywhere. Conversely, if $\sum_{k=1}^{\infty}A_k(x)$ is convergent in a set of positive measure, then $\sum_{k=1}^{\infty}(a_k^2+b_k^2)$ converges. This theorem is related to the Rademacher series and random Fourier series [4].

K. Multiple Fourier Series

Routine extensions to multiple Fourier series from the case of a single variable are easy, but significant results are difficult to obtain. Recently, however, there have been several important contributions in this field.

References

[1] A. Zygmund, Trigonometrical series, Warsaw, 1935.
[2] A. Zygmund, Trigonometric series I, II, Cambridge Univ. Press, second edition, 1959.
[3] G. H. Hardy and W. W. Rogosinski, Fourier series, Cambridge Univ. Press, 1950.
[4] N. K. Bari, A treatise on trigonometric series, Pergamon, 1964. (Original in Russian, 1958.)
[5] J.-P. Kahane and R. Salem, Ensembles parfaits et séries trigonométriques, Actualités Sci. Ind., Hermann, 1963.
[6] Y. Katznelson, Sur les fonctions opérant sur l'algèbre des séries de Fourier absolument convergentes, C. R. Acad. Sci. Paris, 241 (1958), 404-406.
[7] E. M. Stein, A maximal function with applications to Fourier series, Ann. Math., (2) 68 (1958), 584-603.
[8] G. Sunouchi, Theorems on power series of the class $H^p$, Tôhoku Math. J., (2) 8 (1956), 125-146.
[9] S. Yano, On a lemma of Marcinkiewicz and its applications to Fourier series, Tôhoku Math. J., (2) 11 (1959), 191-215.
[10] L. Carleson, On convergence and growth of partial sums of Fourier series, Acta Math., 116 (1966), 135-157.
[11] J.-P. Kahane and Y. Katznelson, Sur les ensembles de divergence des séries trigonométriques, Studia Math., 26 (1966), 305-306.
[12] R. A. Hunt, On the convergence of Fourier series, Orthogonal expansions and their continuous analogues, Proc. Conf. Southern Illinois Univ., 1967, Southern Illinois Univ. Press, 1968, 235-255.
[13] P. Sjölin, An inequality of Paley and convergence a.e. of Walsh-Fourier series, Ark. Mat., 7 (1968), 551-570.
[14] R. Salem, Algebraic numbers and Fourier analysis, Heath, 1963.
[15] J.-P. Kahane, Some random series of functions, Heath, 1968.


160 (X.23)
Fourier Transform

A. Fourier Integrals

In this article we assume that $f(x)$ is a complex-valued function defined on $\mathbf{R}=(-\infty,\infty)$ and †(Lebesgue) integrable on any finite interval. If the integral

$$\int_{-\infty}^{\infty}f(x)e^{-ixt}\,dx=\lim_{A\to-\infty,\ B\to\infty}\int_A^Bf(x)e^{-ixt}\,dx$$

exists, it is called the trigonometric integral or Fourier integral. We have a general result: If $f(x)\in L_1(-\infty,\infty)$, $K(x)$ is bounded on $(-\infty,\infty)$, and $\int_0^TK(x)\,dx=o(T)$ $(T\to\pm\infty)$, then $\int_{-\infty}^{\infty}f(x)K(xt)\,dx$ exists and

$$\lim_{t\to\pm\infty}\int_{-\infty}^{\infty}f(x)K(xt)\,dx=0.$$

In particular, it follows that if $f(x)\in L_1(-\infty,\infty)$, then $\int_{-\infty}^{\infty}f(x)e^{-itx}\,dx$ exists and

$$\lim_{t\to\pm\infty}\int_{-\infty}^{\infty}f(x)e^{-itx}\,dx=0$$

(Riemann-Lebesgue theorem).
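
As a concrete instance of the Riemann-Lebesgue theorem for Fourier integrals, take $f(x)=e^{-|x|}$, for which $\int_{-\infty}^{\infty}f(x)e^{-itx}\,dx=2/(1+t^2)$. The Python sketch below is only an illustration: the function name `fourier_integral_even`, the truncation point, and the step count are choices made here, and the routine assumes that $f$ is real and even so that the integral reduces to $2\int_0^{\infty}f(x)\cos tx\,dx$.

```python
import math

def fourier_integral_even(f, t, upper=40.0, steps=40000):
    """Approximate the integral of f(x) * exp(-i t x) over the real line
    for a real, even f by the midpoint rule applied to
    2 * int_0^upper f(x) cos(t x) dx  (the tail beyond `upper` is dropped)."""
    h = upper / steps
    total = 0.0
    for j in range(steps):
        x = (j + 0.5) * h          # midpoint of the j-th subinterval
        total += f(x) * math.cos(t * x)
    return 2.0 * h * total

def f(x):
    # Test function chosen for its closed-form transform 2 / (1 + t^2).
    return math.exp(-abs(x))

for t in (0.0, 1.0, 5.0, 20.0):
    approx = fourier_integral_even(f, t)
    exact = 2.0 / (1.0 + t * t)
    print(t, approx, exact)
```

The printed values decrease toward 0 as $t$ grows, in line with the theorem; for this particular $f$ the rate is $O(t^{-2})$, but the theorem itself gives no rate for general $f\in L_1$.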

B. Fourier's Integral Theorems

Suppose that $f(x)$ is of †bounded variation in an interval including $x$, or more generally satisfies the assumption for any one of the convergence tests for Fourier series (→ 159 Fourier Series B). Then Fourier's single integral theorem

$$\frac{1}{2}\bigl(f(x+0)+f(x-0)\bigr)=\lim_{\lambda\to\infty}\frac{1}{\pi}\int_{-\infty}^{\infty}f(t)\frac{\sin\lambda(x-t)}{x-t}\,dt \tag{1}$$

holds if one of the following three conditions is satisfied: (1) $f(x)/(1+|x|)$ belongs to $L_1(-\infty,\infty)$; (2) $f(x)/x$ tends to zero monotonically as $x\to\pm\infty$; (3) $f(x)/x=g(x)\sin(px+q)$, where $g(x)$ tends to zero monotonically as $x\to\pm\infty$ (S. Izumi, 1934). The right-hand side of (1) is called Dirichlet's integral.

Let $f(x)$ be of bounded variation in an interval including $x$ (or satisfy some other convergence test for Fourier series). Then Fourier's double integral theorem

$$\frac{1}{2}\bigl(f(x+0)+f(x-0)\bigr)=\frac{1}{\pi}\int_0^{\infty}dt\int_{-\infty}^{\infty}f(u)\cos t(u-x)\,du \tag{2}$$

holds if one of the following three conditions is satisfied: (4) $f(x)\in L_1(-\infty,\infty)$; (5) $f(x)/(1+|x|)\in L_1(-\infty,\infty)$, and $f(x)$ tends to zero monotonically as $x\to\pm\infty$; (6) $f(x)/(1+|x|)\in L_1(-\infty,\infty)$ and $f(x)=g(x)\sin(px+q)$, where $g(x)$ tends to zero monotonically as $x\to\pm\infty$.

If $f(x)\in L_1(-\infty,\infty)$, the formula

$$f(x)=\lim_{\lambda\to\infty}\frac{1}{\pi}\int_{-\infty}^{\infty}f(t)\frac{\sin^2\lambda(t-x)}{\lambda(t-x)^2}\,dt$$

holds almost everywhere, and in particular at any $x$ where $f(x)$ is continuous. More generally, the formula

$$f(x)=\lim_{\lambda\to\infty}\lambda\int_{-\infty}^{\infty}f(t)K(\lambda(x-t))\,dt \tag{3}$$

holds at any point $x$ where $f(x+0)$ and $f(x-0)$ exist, if $K(t)\in L_1(-\infty,\infty)$, $\int_{-\infty}^{\infty}K(t)\,dt=1$, $|K(t)|<M$, $K(t)=o(t^{-1})$ as $|t|\to\infty$, and $f(x)\in L_1(-\infty,\infty)$, or if $K(t)\in L_1(-\infty,\infty)$, $\int_{-\infty}^{\infty}K(t)\,dt=1$, and $f(x)$ is bounded. Similarly, the formula

$$\lim_{\lambda\to\infty}\int_0^{\infty}f(\lambda x)K(x)\,dx=\mathfrak{M}\{f\}\int_0^{\infty}K(x)\,dx$$

holds if $\mathfrak{M}\{f\}=\lim_{T\to\infty}T^{-1}\int_0^Tf(t)\,dt$ exists, $K(x)$ is differentiable, $|x^2K(x)|<C$ $(1<x)$, and $x^{-1}\int_0^x|f(t)|\,dt<D$, where $C$, $D$ are constants (Wiener's formula).

C. Fourier Transforms (→ Appendix A, Table 11.II)

Let $f(x)\in L_1(0,\infty)$. Then

$$F(t)=\sqrt{\frac{2}{\pi}}\int_0^{\infty}f(u)\cos ut\,du$$

is called the (Fourier) cosine transform of $f(x)$. Under the same condition as for the validity of (2), the inversion formula of the cosine transform $f(x)=\sqrt{2/\pi}\int_0^{\infty}F(t)\cos xt\,dt$ holds, where we suppose that $f(x)=\frac{1}{2}(f(x+0)+f(x-0))$. If we define $f(-x)=f(x)$, then this is equivalent to the formula (2). Analogously,

$$G(t)=\sqrt{\frac{2}{\pi}}\int_0^{\infty}f(u)\sin ut\,du$$

is called the (Fourier) sine transform of $f(x)$. Under the same condition as for the validity of (2), we get the inversion formula $f(x)=\sqrt{2/\pi}\int_0^{\infty}G(t)\sin xt\,dt$. More generally, for any $f(x)\in L_1(-\infty,\infty)$,

$$F(t)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}f(x)e^{-ixt}\,dx$$

is called the Fourier transform of $f(x)$. Under the same condition as for the validity of (2), the Fourier inversion formula

$$f(x)=\frac{1}{\sqrt{2\pi}}\lim_{T\to\infty}\int_{-T}^{T}F(t)e^{ixt}\,dt$$

holds. The cosine transform and the sine transform coincide with the Fourier transform when $f(x)=f(-x)$ and $-f(x)=f(-x)$, respectively.

If for a given $f(x)$ there exists $F(t)\in L_q(-\infty,\infty)$ $(1\le q<\infty)$ for which

$$\int_{-\infty}^{\infty}\left|\frac{1}{\sqrt{2\pi}}\int_{-T}^{T}f(x)e^{-ixt}\,dx-F(t)\right|^q dt\to0\qquad(T\to\infty)$$

is valid (i.e., $(1/\sqrt{2\pi})\int_{-T}^{T}f(x)e^{-ixt}\,dx$ converges to $F(t)$ as $T\to\infty$ in the mean of order $q$), then we say that $f(x)$ has the Fourier transform $F(t)$ in $L_q(-\infty,\infty)$. If $f(x)\in L_p(-\infty,\infty)$ $(1<p\le2)$, then $f(x)$ has the Fourier transform $F(t)$ in $L_q$ $(1/p+1/q=1)$, and $F(t)$ has the Fourier transform $f(-x)$ in $L_p$ (E. C. Titchmarsh). Moreover,

$$\int_{-\infty}^{\infty}|F(t)|^q\,dt\le\frac{1}{(2\pi)^{q/2-1}}\left(\int_{-\infty}^{\infty}|f(x)|^p\,dx\right)^{1/(p-1)}.$$
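
The cosine transform and its inversion formula can be illustrated with the pair $f(x)=e^{-x}$, $F(t)=\sqrt{2/\pi}\,/(1+t^2)$. The Python sketch below is a hedged numerical check, not a method from the article: the function name `cosine_transform`, the truncation points, and the step counts are assumptions chosen here so that a plain midpoint rule is accurate enough.

```python
import math

SQRT_2_OVER_PI = math.sqrt(2.0 / math.pi)

def cosine_transform(func, t, upper, steps):
    """Midpoint-rule approximation of
    sqrt(2/pi) * int_0^upper func(u) cos(u t) du
    (the infinite upper limit is replaced by `upper`)."""
    h = upper / steps
    total = 0.0
    for j in range(steps):
        u = (j + 0.5) * h
        total += func(u) * math.cos(u * t)
    return SQRT_2_OVER_PI * h * total

def f(x):
    # Test function on (0, infinity).
    return math.exp(-x)

def F_exact(t):
    # Closed-form cosine transform of exp(-x) in this normalization.
    return SQRT_2_OVER_PI / (1.0 + t * t)

# Forward transform of f at t = 1.5, compared with the closed form.
forward = cosine_transform(f, 1.5, upper=40.0, steps=40000)
# Inversion: applying the same transform to F should give back f(1) = exp(-1).
backward = cosine_transform(F_exact, 1.0, upper=2000.0, steps=400000)
print(forward, F_exact(1.5))
print(backward, math.exp(-1.0))
```

Because $F(t)$ decays only like $t^{-2}$, the inverse integral needs a much larger truncation point than the forward one; this is why the two calls use different `upper` values.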

If $f(x)$, $g(x)\in L_p(-\infty,\infty)$ $(1<p\le2)$ and their Fourier transforms in $L_q$ are $F(t)$, $G(t)$, respectively, then the Parseval identity

$$\int_{-\infty}^{\infty}F(t)G(t)\,dt=\int_{-\infty}^{\infty}f(x)g(-x)\,dx$$

holds. The Fourier inversion formula holds, in the sense that

[...]

The theory of Fourier transforms is valid for cosine and sine transforms. Specifically for $p=2$, if $f\in L_2(0,\infty)$, then $\sqrt{2/\pi}\int_0^Tf(x)\cos xt\,dx$ †converges in the mean in $L_2(0,\infty)$ to the cosine transform $F(t)$ as $T\to\infty$, and conversely, the cosine transform of $F(t)$ in $L_2$ is $f(x)$. The transforms $f(x)$, $F(x)$ are connected by the formulas

$$\int_0^xF(u)\,du=\sqrt{\frac{2}{\pi}}\int_0^{\infty}f(t)\frac{\sin xt}{t}\,dt,\qquad\int_0^xf(t)\,dt=\sqrt{\frac{2}{\pi}}\int_0^{\infty}F(u)\frac{\sin xu}{u}\,du,$$

$$\int_0^{\infty}|f(x)|^2\,dx=\int_0^{\infty}|F(t)|^2\,dt.$$

The theory of Fourier transforms was generalized as follows by G. N. Watson (Proc. London Math. Soc., 35 (1933)). We suppose that $\chi(x)/x\in L_2(0,\infty)$ and

$$\int_0^{\infty}\frac{\chi(xu)\chi(yu)}{u^2}\,du=\min(x,y). \tag{4}$$

If $f(x)\in L_2(0,\infty)$, then there exists an $F(t)\in L_2(0,\infty)$ such that

$$\int_0^xF(u)\,du=\int_0^{\infty}f(t)\frac{\chi(xt)}{t}\,dt,$$

and the inversion formula

$$\int_0^xf(t)\,dt=\int_0^{\infty}F(u)\frac{\chi(xu)}{u}\,du$$

and the Parseval identity

$$\int_0^{\infty}|f(x)|^2\,dx=\int_0^{\infty}|F(t)|^2\,dt$$

hold. $F(t)$ is called the Watson transform of $f(x)$. For any $f\in L_2$, equality (4) is necessary for the existence of the Watson transform $F(t)$ for which the inversion formula holds. S. Bochner (1934) generalized this theory further to †unitary transformations in $L_2$ (→ 192 Harmonic Analysis).

D. Conjugate Functions

Corresponding to Fourier's double integral theorem (2), the integral

$$\lim_{\lambda\to\infty}\frac{1}{\pi}\int_0^{\lambda}dt\int_{-\infty}^{\infty}\sin t(u-x)\,f(u)\,du$$

is called the conjugate Fourier integral of the integral in the right-hand side of (2). Formally, this is written $\lim_{\lambda\to\infty}\pi^{-1}\int_0^{\infty}(1-\cos\lambda t)\,t^{-1}(f(x+t)-f(x-t))\,dt$. If $f(x)$ is a sufficiently regular function, the part involving $\cos\lambda t$ tends to 0 as $\lambda\to\infty$. Now let

$$g(x)=\frac{1}{\pi}\lim_{A\to\infty,\ \varepsilon\to0}\int_{\varepsilon}^{A}\frac{f(x+t)-f(x-t)}{t}\,dt.$$

For any $f\in L_1(-\infty,\infty)$, the integral exists almost everywhere, and $g(x)$ is called the conjugate function or Hilbert transform of $f(x)$. If $f\in L_p$ $(p>1)$, then $g(x)\in L_p$ also and we have

$$f(x)=-\frac{1}{\pi}\lim_{A\to\infty,\ \varepsilon\to0}\int_{\varepsilon}^{A}\frac{g(x+t)-g(x-t)}{t}\,dt$$

and $\int_{-\infty}^{\infty}|g(x)|^p\,dx\le M_p\int_{-\infty}^{\infty}|f(x)|^p\,dx$, where $M_p$ is a constant depending only on $p$. In particular,

$$\int_{-\infty}^{\infty}|f(x)|^2\,dx=\int_{-\infty}^{\infty}|g(x)|^2\,dx\qquad\text{for }p=2.$$

E. Boundary Functions of Analytic Functions

Suppose that a complex-valued function $f(z)$ $(z=x+iy)$ is †holomorphic for $y>0$, $f(x+iy)$ converges as $y\to0$ for almost all $x$ to $f(x)$ (which is called the boundary function), and $f(x)\in L_p(-\infty,\infty)$ $(p>1)$. Moreover, suppose that $f(z)$ is represented by †Cauchy's integral formula or †Poisson's integral of $f(x)$ on the real line. If either $f\in L_p$ $(p>1)$ and has $F(t)$ as its Fourier transform or $f(x)$ is an $L_p$-Fourier transform of $F(t)\in L_q$ $(q>1)$, then a necessary and sufficient condition for the function $f(x)$ to be the boundary function of an analytic function is that $F(t)$ be 0 almost everywhere for $t>0$ (N. Wiener, R. E. A. C. Paley, E. Hille, J. D. Tamarkin).

F. Generalized Fourier Integrals

Let $|f(x)|/(1+|x|^k)\in L_1(-\infty,\infty)$ for a positive integer $k$ and

$$L_k=L_k(t,x)=\begin{cases}\displaystyle\sum_{\nu=0}^{k-1}\frac{(-itx)^{\nu}}{\nu!},&|x|\le1,\\[2mm]0,&|x|>1,\end{cases}$$

$$E_k(t)=\frac{1}{2\pi}\int_{-\infty}^{\infty}f(x)\frac{e^{-itx}-L_k}{(-ix)^k}\,dx.$$

The function $E_k(t)$ or one that differs from $E_k(t)$ by a polynomial of degree at most $k$ is called the $k$th transform of $f(x)$. We write formally $f(x)=\int_{-\infty}^{\infty}e^{ixt}\,d^kE_k(t)$. Actually, if we give an appropriate meaning to the integral, this formula itself is valid (H. Hahn, Wiener Berichte, 134 (1925); S. Izumi, Tôhoku Science Rep., 23 (1935)). (For the theory and applications of $k$th transforms → [2, ch. 6].)

G. Applications of Fourier Transforms

Suppose that $f(x)\in L_2(-\infty,\infty)$, $F(t)$ is its Fourier transform, and $f(x)=O(e^{-\theta(x)})$, where $\theta(x)$ is positive and increasing. If $\int_1^{\infty}\theta(x)x^{-2}\,dx=\infty$ and $F(t)$ vanishes identically in an interval, then $F(t)=0$ over $(-\infty,\infty)$. If $\int_1^{\infty}\theta(x)x^{-2}\,dx<\infty$, then there exists a function $f(x)$ such that $F(t)$ vanishes identically in an interval, but $F(t)$ does not vanish identically in $(-\infty,\infty)$ (Wiener, Paley, N. Levinson). These results are applicable to the theory of †quasi-analytic functions.

Let $f(x)\in L_1(-\infty,\infty)$. Then a necessary and sufficient condition for any function in $L_1(-\infty,\infty)$ to be approximated as closely as we wish by linear combinations $\sum_{k=1}^{n}a_kf(x+h_k)$ of the translations of $f(x)$ with respect to the $L_1$-norm is that the Fourier transform of $f(x)$ does not vanish at any real number. When $f(x)\in L_2(-\infty,\infty)$, a necessary and sufficient condition that an arbitrary function in $L_2(-\infty,\infty)$ can be approximated as closely as we wish by $\sum_{k=1}^{n}a_kf(x+h_k)$ with respect to the $L_2$-norm is that the zeros of the Fourier transform of $f(x)$ have measure zero (Wiener). This result was used by Wiener to prove the generalized Tauberian theorem: Suppose that $g_1(x)\in L_1(-\infty,\infty)$ and its Fourier transform never vanishes. Moreover, let $g_2(x)\in L_1(-\infty,\infty)$ and $p(x)$ be bounded over $(-\infty,\infty)$. Then $\lim_{x\to\infty}\int_{-\infty}^{\infty}g_1(x-t)p(t)\,dt=A\int_{-\infty}^{\infty}g_1(t)\,dt$ implies that $\lim_{x\to\infty}\int_{-\infty}^{\infty}g_2(x-t)p(t)\,dt=A\int_{-\infty}^{\infty}g_2(t)\,dt$. Another type of Wiener theorem is concerned with †Stieltjes integrals. Suppose that

$$\sum_{n=-\infty}^{\infty}\ \sup_{n\le x\le n+1}|g_1(x)|<\infty$$

(hence $g_1(x)\in L_1$) and that the Fourier transform of $g_1(x)$ never vanishes. Moreover, let

$$\sum_{n=-\infty}^{\infty}\ \sup_{n\le x\le n+1}|g_2(x)|<\infty$$

and let $\int_u^{u+1}|d\alpha(t)|$ be bounded. Then

$$\lim_{x\to\infty}\int_{-\infty}^{\infty}g_1(x-t)\,d\alpha(t)=A\int_{-\infty}^{\infty}g_1(t)\,dt$$

implies

$$\lim_{x\to\infty}\int_{-\infty}^{\infty}g_2(x-t)\,d\alpha(t)=A\int_{-\infty}^{\infty}g_2(t)\,dt.$$

From these general theorems, we can prove various †Tauberian theorems about the summation of series. Also, these results were applied to the proof of the †prime number theorem by S. Ikehara and E. Landau (Wiener [1]). In the general Tauberian theorem, the boundedness of $p(t)$ can be replaced by one-sided boundedness (H. R. Pitt, 1938). In fact, the first form of the theorem still holds if we replace the condition by the following one: $g_1$ and $g_2$ are continuous, $g_1(x)>0$, the Fourier transform of $g_1(x)$ does not vanish, $g_2(x)$ satisfies the conditions of the second theorem, and $p(x)>C$. (Concerning Fourier transforms on the topological groups → 192 Harmonic Analysis; and concerning Fourier transforms of the distributions → 125 Distributions and Hyperfunctions).

H. Fourier Transforms of Distributions

The Fourier transform is defined in higher-dimension $\mathbf{R}^n$ by

$$\mathscr{F}(f)=(\sqrt{2\pi})^{-n}\int e^{-ix\xi}f(x)\,dx,\qquad f\in L_1,$$

$$x\xi=x_1\xi_1+x_2\xi_2+\ldots+x_n\xi_n.$$

One denotes it by $\hat{f}(\xi)$. The inverse transform is defined by

$$\bar{\mathscr{F}}(g)=(\sqrt{2\pi})^{-n}\int e^{ix\xi}g(\xi)\,d\xi.$$

Differentiation under the integral sign gives

$$D^{\alpha}\hat{f}(\xi)=(\sqrt{2\pi})^{-n}\int e^{-ix\xi}(-ix)^{\alpha}f(x)\,dx,$$

$$D^{\alpha}=(\partial/\partial x_1)^{\alpha_1}\ldots(\partial/\partial x_n)^{\alpha_n},$$

under the assumption $(1+|x|)^{|\alpha|}f(x)\in L_1$ $(|\alpha|=\alpha_1+\ldots+\alpha_n)$. Roughly speaking, the decreasing order of $f(x)$ when $|x|\to\infty$ is reflected in the differentiability of $\hat{f}(\xi)$. In the same way,

$$(i\xi)^{\alpha}\hat{f}(\xi)=(\sqrt{2\pi})^{-n}\int e^{-ix\xi}D^{\alpha}f(x)\,dx,$$

which shows that the differentiability of $f(x)$ is reflected in the decreasing order of $\hat{f}(\xi)$. The same statements can evidently be made for the relation between $g(\xi)$ and its inverse Fourier transform.

Let $\mathscr{S}$ be the space of rapidly decreasing functions $\varphi(x)$, i.e., such that for all positive integers $k$ and $\alpha\ge0$, $(1+|x|)^kD^{\alpha}\varphi(x)$ remains bounded (→ 125 Distributions and Hyperfunctions). From the facts above, it follows that the

linear mapping $f(x)\mapsto\mathscr{F}(f)$ is a topological isomorphism from $\mathscr{S}_x$ onto $\mathscr{S}_\xi$. Let $\mathscr{S}'$ be the dual space of $\mathscr{S}$. Usual function spaces in classical analysis are contained in $\mathscr{S}'$, for instance, the $L_p$-space and that of functions increasing at most of polynomial order at infinity.

Let $T\in\mathscr{S}'_x$. Then the mapping $\varphi\in\mathscr{S}_\xi\mapsto T(\mathscr{F}\varphi)$ is continuous, and $\mathscr{F}T\in\mathscr{S}'_\xi$ is defined by $\mathscr{F}T(\varphi)=T(\mathscr{F}\varphi)$. $\mathscr{F}$ is also a topological isomorphism from $\mathscr{S}'_x$ onto $\mathscr{S}'_\xi$. Take as an example the distribution p.v. $1/x$. This is no longer a function; however, its Fourier transform can be calculated as follows:

$$\frac{1}{\sqrt{2\pi}}\lim_{A\to\infty,\ \varepsilon\to0}\int_{\varepsilon\le|x|\le A}\frac{e^{-ix\xi}}{x}\,dx=-i\sqrt{\frac{2}{\pi}}\lim_{A\to\infty,\ \varepsilon\to0}\int_{\varepsilon}^{A}\frac{\sin(x\xi)}{x}\,dx=\begin{cases}-i\sqrt{\pi/2},&\xi>0,\\[1mm] i\sqrt{\pi/2},&\xi<0,\end{cases}$$

where the limit is taken with respect to the topology of $\mathscr{S}'$.

Using the definition above, classical results can be extended to $\mathscr{S}'$ almost automatically. For example, $\mathscr{F}(D^{\alpha}T)=(i\xi)^{\alpha}\mathscr{F}(T)$, $\mathscr{F}((-ix)^{\beta}T)=D^{\beta}\mathscr{F}(T)$. Moreover, if $T\in\mathscr{E}'$, then $\mathscr{F}T=(\sqrt{2\pi})^{-n}T_x(e^{-ix\xi})$. In particular, $\mathscr{F}\delta=(2\pi)^{-n/2}$.

The Plancherel theorem says that if for $f(x)\in L_2$ one defines its Fourier transform by

$$\hat{f}(\xi)=\operatorname*{l.i.m.}_{A\to\infty}\,(\sqrt{2\pi})^{-n}\int_{|x|\le A}e^{-ix\xi}f(x)\,dx,$$

then the correspondence $f\mapsto\hat{f}$ is a unitary mapping from $L_2$ onto $L_2$, i.e.,

$$\int|\hat{f}(\xi)|^2\,d\xi=\int|f(x)|^2\,dx.$$

This result can be extended. For any nonnegative integer $m$, the element of $H^m$ (→ 168 Function Spaces) is characterized by its Fourier transform: $f(x)\in H^m$ if and only if $(1+|\xi|)^m\hat{f}(\xi)\in L_2$. Furthermore, for arbitrary real $s$, the space $H^s$ can be defined as the set of all elements of $\mathscr{S}'$ whose Fourier transform $\hat{f}(\xi)$ satisfies $(1+|\xi|)^s\hat{f}(\xi)\in L_2$. $H^s$ and $H^{-s}$ are dual to each other.

For $f\in L_1$, $g\in L_2$, or $f,g\in L_1$, the convolution makes sense as a function, and it holds that $\mathscr{F}(f*g)=\mathscr{F}(f)\mathscr{F}(g)$. This relation can be extended to distributions, to state

$$\mathscr{F}(S*T)=(\mathscr{F}S)(\mathscr{F}T).$$

This holds for $(S,T)\in\mathscr{E}'\times\mathscr{S}'$, $\mathscr{D}'_{L_2}\times\mathscr{D}'_{L_2}$ ($\mathscr{D}'_{L_2}$ is the dual of $\mathscr{D}_{L_2}$), etc.

Fourier transforms are also often defined by

$$\hat{f}(\xi)=\int e^{-ix\xi}f(x)\,dx\qquad\text{or}\qquad\hat{f}(\xi)=\int e^{-2\pi ix\xi}f(x)\,dx.$$

In the former case, the inversion formula takes the form

$$f(x)=(2\pi)^{-n}\int e^{ix\xi}\hat{f}(\xi)\,d\xi,$$

and the Parseval identity becomes

$$\int|f(x)|^2\,dx=(2\pi)^{-n}\int|\hat{f}(\xi)|^2\,d\xi.$$

References

[1] N. Wiener, The Fourier integral and certain of its applications, Cambridge Univ. Press, 1933.
[2] S. Bochner, Vorlesungen über Fouriersche Integrale, Akademische-Verlag, 1932; English translation, Lectures on Fourier integrals, Ann. Math. Studies, Princeton Univ. Press, 1959.
[3] E. C. Titchmarsh, Introduction to the theory of Fourier integrals, Clarendon Press, 1937.
[4] R. E. A. C. Paley and N. Wiener, Fourier transforms in the complex domain, Amer. Math. Soc. Colloq. Publ., 1934.
[5] S. Bochner and K. Chandrasekharan, Fourier transforms, Ann. Math. Studies, Princeton Univ. Press, 1949.
[6] M. J. Lighthill, Introduction to Fourier analysis and generalized functions, Cambridge Univ. Press, 1958.
[7] I. M. Gel'fand and G. E. Shilov (Šilov), Generalized functions I, Academic Press, 1964. (Original in Russian, 1958.)
[8] S. Bochner, Harmonic analysis and the theory of probability, Univ. of California Press, 1955.
[9] R. R. Goldberg, Fourier transforms, Cambridge Tracts, Cambridge Univ. Press, 1961.
[10] T. Carleman, L'intégrale de Fourier et questions qui s'y rattachent, Almqvist and Wiksell, 1944.
[11] L. Schwartz, Théorie des distributions, Hermann, 1966.
For formulas for Fourier transforms → references to 220 Integral Transforms.


161 (IV.3)
Free Groups

A. General Remarks

A group $F$ is called a free group if it is the †free product (→ 190 Groups M) of †infinite cyclic groups $G_1,\ldots,G_n$, generated by $a_1,\ldots,a_n$, respectively. Then $n$ is called the rank of $F$. A

free product of †semigroups is defined similarly to that of groups, and the free product of infinite cyclic semigroups $G_i=\{1,a_i,a_i^2,\ldots\}$ $(i=1,\ldots,n)$ is called a free semigroup generated by $n$ elements $a_i$ $(i=1,\ldots,n)$.

If a group $G$ is generated by subgroups $H_i$ $(i=1,\ldots,n)$ isomorphic to $G_i$, then $G$ is a homomorphic image of the free product of the groups $G_i$. A subgroup $\ne\{e\}$ of the free product $F$ of groups $G_i$ is itself the free product of a free group and several subgroups, each of which is conjugate in $F$ to a subgroup of some $G_j$ (A. G. Kurosh, 1934). Notably, a subgroup $\ne\{e\}$ of a free group is itself a free group (O. Schreier, Abh. Math. Sem. Univ. Hamburg, 5 (1927)). A subgroup of index $j$ of a free group of rank $n$ is a free group of rank $1+j(n-1)$ (Schreier).

Let $F$ be the free group generated by $n$ elements $a_1,\ldots,a_n$, and let $G$ be a group generated by $n$ elements $b_1,\ldots,b_n$. Then there is a homomorphism of $F$ onto $G$. Let $N$ be its kernel. If the class of a †word $w(a_1,\ldots,a_n)$ belongs to $N$, then we have $w(b_1,\ldots,b_n)=1$. We call $w(b_1,\ldots,b_n)=1$ a relation among the generators $b_1,\ldots,b_n$. If $N$ is the minimal normal subgroup of $F$ containing the classes of words $w_1(a_1,\ldots,a_n),\ldots,w_m(a_1,\ldots,a_n)$, then the relations $w_1(b_1,\ldots,b_n)=1,\ldots,w_m(b_1,\ldots,b_n)=1$ are called defining relations (or fundamental relations). If generators $a_1,\ldots,a_n$ and words $w_1(a_1,\ldots,a_n),\ldots,w_m(a_1,\ldots,a_n)$ are given, then there is a group generated by $a_1,\ldots,a_n$ with defining relations $w_1(a_1,\ldots,a_n)=1,\ldots,w_m(a_1,\ldots,a_n)=1$. In fact, let $F$ be the free group generated by $a_1,\ldots,a_n$ and $N$ the minimal normal subgroup containing the classes of words $w_1(a_1,\ldots,a_n),\ldots,w_m(a_1,\ldots,a_n)$. Then the factor group $F/N$ is such a group. A free group is a group with an empty set of defining relations. In the preceding discussion, $n$ and $m$ are not necessarily finite. If both $n$ and $m$ are finite, then $G$ is called finitely presented.

B. The Word Problem

If a finitely presented group $G$ is given, then a general procedure has to be determined by which it can be decided, in a finite number of computational steps, whether a given word equals the identity element as an element of $G$. This is called the word problem (→ 190 Groups M). A solution to the word problem does not always exist (P. S. Novikov [8], 1955); in fact, [...] groups. W. Magnus (1931) showed that it is solvable for any group with a single defining relation. The word problem is an example of decision problems (→ 97 Decision Problem). The word problem for groups is closely related to that for semigroups (A. M. Turing, 1937; E. L. Post, 1947; A. A. Markov, 1947). Similar problems for other algebraic systems can also be considered. The problem of determining a general procedure by which it can be decided, in a finite number of steps, whether two given words interpreted as elements of $G$ can be transformed into each other by an (inner) automorphism of $G$ is called the transformation problem.

Let $F$ be a free group of rank $n$ and $F=F_1\supset\ldots\supset F_r\supset F_{r+1}\supset\ldots$ be the †lower central series of $F$. Then $F_r/F_{r+1}$ is a †free Abelian group of rank $\psi_n(r)=(1/r)\sum_{d\mid r}\mu(r/d)\,n^d$, where $\mu$ is the †Möbius function (E. Witt). The intersection of all subgroups of $F$ of finite index is the identity element.

C. The Burnside Problem

The original problem of Burnside is: If every element of a group $G$ is of finite order (but not necessarily of bounded order) and $G$ is finitely generated, is $G$ a finite group? E. S. Golod [6] (1964) showed that this problem for $p$-groups has a negative solution. The following is the more usual form of the Burnside problem: If a group $G$ is finitely generated and the orders of elements of $G$ divide a given integer $r$, is $G$ finite? Let $F$ be a free group of rank $n$, $N$ be the normal subgroup of $F$ generated by all the $r$th powers $x^r$ of elements of $F$, and $B(r,n)=F/N$. Then the problem is the same as the question of whether $B(r,n)$ is finite. For $r=2,3,4,6$ the group is certainly finite (I. N. Sanov, M. Hall). The restricted Burnside problem is the question whether the orders of finite factor groups of $B(r,n)$ are bounded. It was solved affirmatively for $r$ a prime (A. I. Kostrikin [5], 1959).

A group generated by two generators $x$, $y$ and satisfying the relations $x^u=y^v=(xy)^w=1$ (where $u$, $v$, $w$ are integers) is infinite if $1/u+1/v+1/w-1\le0$, and is of order $g$ if $0<1/u+1/v+1/w-1=2/g$.

There is also a finitely presented group which is isomorphic to its proper factor group (B. H. Neumann).

References
there is a group with two generators and 32
delïning relations for which the word problem Cl] A. G. Kurosh, The theory of groups 1, II,
cannot be solved (W. Boone [7]). However, it Chelsea, 1960. (Original in Russian, 1953.)
was shown by V. A. Tartakovskii that the [2] M. Hall, The theory of groups, Macmillan,
problem cari be solved for a large class of 1959.
[3] H. S. M. Coxeter and W. O. J. Moser, Generators and relations for discrete groups, Erg. Math., Springer, 1957.
[4] W. Magnus, A. Karrass, and D. Solitar, Combinatorial group theory, Interscience, 1966.
[5] A. I. Kostrikin, The Burnside problem, Amer. Math. Soc. Transl., (2) 36 (1964), 63-100. (Original in Russian, 1959.)
[6] E. S. Golod, On nil-algebras and finitely approximable p-groups, Amer. Math. Soc. Transl., (2) 48 (1965), 103-106. (Original in Russian, 1964.)
[7] W. W. Boone, The word problem, Ann. of Math., (2) 70 (1959), 207-265.
[8] P. S. Novikov, On the algorithmic unsolvability of the word problem in group theory, Amer. Math. Soc. Transl., (2) 9 (1958), 1-122. (Original in Russian, 1955.)
Also → references to 190 Groups.

162 (XII.1)
Functional Analysis

The origin of functional analysis can be traced to the 1887 work of V. Volterra. He stressed the important notion of operations or operators, that is, generalized functions of which the domains as well as the ranges are sets of functions. A typical example is the operator assigning to a function $f$ its derivative $f'$. An operator is called a functional if its values are numbers, as in the case of the operator assigning to a function $f$ the value $f'(a)$ or the value $\int_a^b f(t)\,dt$. In 1896, Volterra considered the operator mapping a continuous function $f$ to a continuous solution $\varphi$ of the integral equation $f(x) = \varphi(x) - \int_a^x K(x, y)\varphi(y)\,dy$, where $K(x, y)$ is a continuous function. Defining $I\varphi = \varphi$ and $(K\varphi)(x) = \int_a^x K(x, y)\varphi(y)\,dy$, he showed that $\varphi$ is given by $\varphi = (I - K)^{-1} f = f + Kf + K^2 f + \dots$, where $K^n f = K(K^{n-1} f)$. Following this lead, I. Fredholm studied in 1900 the integral equation $f(x) = \varphi(x) - \lambda \int_a^b K(x, y)\varphi(y)\,dy$ containing a parameter $\lambda$. He proved the so-called †alternative theorem: For a given $\lambda_0$, the operator equation $(I - \lambda_0 K)\varphi = f$, $(K\varphi)(x) = \int_a^b K(x, y)\varphi(y)\,dy$, either admits a uniquely determined continuous solution $\varphi$ for every continuous function $f$ or else $(I - \lambda_0 K)\varphi_0 = 0$ admits a nontrivial continuous solution $\varphi_0 \neq 0$. D. Hilbert discussed (1904-1910) a †continuous linear operator $K$ defined on the †Hilbert space $L_2$ with values in $L_2$, and he called a complex number $\lambda_0$ a †spectrum of $K$ if $(I - \lambda_0 K)$ does not have a continuous linear inverse. He proved that if $K$ is †Hermitian, then $K$ admits a †spectral resolution with real spectra only. One of his outstanding contributions was the discovery of the †continuous spectrum. In 1918, F. Riesz proved that Fredholm's alternative theorem holds for a †completely continuous linear operator in the space of continuous functions, and later the result was extended to Banach spaces.

In 1932, three important books by S. Banach [1], J. von Neumann [2], and M. H. Stone [3] were published. These books treated †closed linear operators that are not necessarily continuous. The notion of †Banach space was introduced: a †normed linear space complete with respect to the distance $\mathrm{dis}(x, y) = \|x - y\|$. By making use of the †Baire-Hausdorff theorem and the †Hahn-Banach theorem, Banach proved, for closed linear operators in Banach spaces, the fundamental principle consisting of the †open mapping theorem, the †closed graph theorem, the †uniform boundedness theorem, the †resonance theorem, and the †closed range theorem. These theorems were modified to be applicable in locally convex topological linear spaces by the Bourbaki group beginning in the late 1940s.

In 1929, von Neumann proved, as a mathematical foundation of quantum mechanics, that a closed linear operator $T$ in a Hilbert space admits spectral resolution with real spectra if and only if $T$ is a †self-adjoint operator (→ also Stone [3]). The condition that a closed linear operator is a †function of a self-adjoint operator was given by von Neumann, F. Riesz, and Y. Mimura (1934-1936). K. Friedrichs (1934) proved that a †semibounded linear operator admits a self-adjoint extension. T. Kato [4] (1950) proved that a Schrödinger-type Hermitian operator is †essentially self-adjoint.

Von Neumann's †mean ergodic theorem in the Hilbert space (1932) was extended to Banach spaces by K. Yosida, S. Kakutani, and Riesz in 1936. G. D. Birkhoff's †pointwise ergodic theorem (1931) was extended by N. Wiener (1939), Yosida (1940), E. Hopf (1954), N. Dunford (1955), R. V. Chacon and D. S. Ornstein (1960), and others in various ways. The †Abelian ergodic theorems were discussed by E. Hille and R. S. Phillips [5] and Yosida [6].

The notion of †Banach algebra was introduced by M. Nagumo in 1936. I. M. Gel'fand proved that a commutative Banach algebra with multiplicative unit $e$ satisfying $\|e\| = 1$ (= the normed ring) admits a representation by an algebra of complex-valued continuous functions (1941).

The †reflexivity as well as the †duality of Banach spaces were studied by S. Kakutani (1939), V. L. Shmul'yan (1940), and W. T. Eberlein (1941).
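
Volterra's expansion $\varphi = f + Kf + K^2 f + \dots$, described at the beginning of this article, can be illustrated numerically. The following is a minimal sketch and not part of the original text: the grid, the trapezoidal quadrature, the number of terms, and the function names `apply_volterra` and `neumann_series` are all assumptions made for illustration. It uses the test case $K(x, y) = 1$, $f(x) = 1$, $a = 0$, for which the equation $\varphi(x) = 1 + \int_0^x \varphi(y)\,dy$ has the solution $\varphi(x) = e^x$.

```python
import math


def apply_volterra(kernel, g, xs):
    """Approximate (K g)(x_i) = integral from xs[0] to x_i of kernel(x_i, y) g(y) dy
    with the composite trapezoidal rule on the uniform grid xs (an assumed discretization)."""
    h = xs[1] - xs[0]
    out = [0.0] * len(xs)
    for i in range(1, len(xs)):
        vals = [kernel(xs[i], xs[j]) * g[j] for j in range(i + 1)]
        out[i] = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
    return out


def neumann_series(kernel, f, xs, terms=30):
    """Return the partial sum f + K f + ... + K^terms f sampled on the grid xs."""
    term = [f(x) for x in xs]
    total = term[:]
    for _ in range(terms):
        term = apply_volterra(kernel, term, xs)  # next power of K applied to f
        total = [s + t for s, t in zip(total, term)]
    return total


# Hypothetical test case: K(x, y) = 1 and f(x) = 1 on [0, 1], exact solution exp(x).
xs = [i / 200 for i in range(201)]
phi = neumann_series(lambda x, y: 1.0, lambda x: 1.0, xs)
print(phi[-1], math.e)  # the two values should agree closely
```

For a Volterra kernel the terms $K^n f$ decay factorially, which is why a few dozen terms suffice here; for the Fredholm equation with parameter $\lambda$ the same series is only guaranteed to converge for sufficiently small $|\lambda|$.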
The notion of †vector lattice (= †Riesz space) was introduced in analysis by Riesz (1930). This was followed by the work of L. V. Kantorovich (1935) and H. Freudenthal (1936). Kakutani gave two standard types of †Banach lattice: the †abstract (M) space (1940) and the †abstract (L) space (1941). M. Krein and S. Krein (1940) and Yosida and M. Fukamiya (1940) discussed the (M)-type vector lattices, and Yosida (1941), the (L)-type vector lattices. H. Nakano (1940-1941) and T. Ogasawara and F. Maeda (1942) studied the spectral resolution of the Banach lattice.

The connection between †Brownian motion and †potential theory was clarified by Kakutani (1942), and extended by J. L. Doob (1956) and G. A. Hunt (1957-1958).

The †one-parameter semigroup of continuous linear operators in Banach spaces was studied by Hille and Yosida, and they gave in 1948 a characterization of the †infinitesimal generator of such semigroups. Its dual was given by Phillips (1955). The one-parameter semigroup of nonlinear †contractive operators in Hilbert spaces was studied by Y. Komura (1967), who obtained a nonlinear version of the Hille-Yosida theorem. This has been extended considerably in Banach spaces by many scholars, e.g., Kato (1967), M. G. Crandall and A. Pazy (1969), Crandall and T. Liggett (1971), I. Miyadera and S. Oharu (1970), H. Brezis (1973), P. Bénilan (1973), and others.

In 1936, S. Sobolev gave a generalization of the notion of functions and their derivatives through integration by parts. This generalization has been extended by L. Schwartz (1945-) [9] to the notion of †distributions, which are continuous linear functionals defined on the function spaces $\mathcal{D}(\mathbf{R}^n)$ and $\mathcal{S}(\mathbf{R}^n)$, and this extension gives, e.g., a reasonable interpretation of Dirac's $\delta$-function. Since 1959, Gel'fand [10] has been publishing, with his collaborators, books on the distribution theory pertaining to function spaces other than $\mathcal{D}(\mathbf{R}^n)$ and $\mathcal{S}(\mathbf{R}^n)$. M. Sato introduced (1959-1960) [11] the theory of †hyperfunctions, as a generalization of distribution theory. In the case of one independent variable, a hyperfunction $f$ may be defined as a generalized boundary value on the real axis $\mathbf{R}^1$ of a holomorphic function $F$ defined in $\mathbf{C}^1 - \mathbf{R}^1$. Hyperfunction theory has been refined to †microlocal analysis and studied extensively by Sato, A. Martineau, H. Komatsu, P. Schapira, T. Kawai, M. Kashiwara, M. Morimoto, A. Kaneko, and others (→ [12]).

References

[1] S. Banach, Théorie des Opérations Linéaires, Warsaw, 1932.
[2] J. von Neumann, Mathematische Grundlagen der Quantenmechanik, Springer, 1932.
[3] M. H. Stone, Linear transformations in Hilbert spaces and their applications to analysis, Amer. Math. Soc., 1932.
[4] T. Kato, Perturbation theory for linear operators, Springer, second edition, 1976.
[5] E. Hille and R. S. Phillips, Functional analysis and semigroups, Amer. Math. Soc., 1957.
[6] K. Yosida, Functional analysis, Springer, sixth edition, 1980.
[7] N. Dunford and J. Schwartz, Linear operators, Interscience, I, 1958; II, 1963; III, 1971.
[8] H. Brezis, Opérateurs Maximaux Monotones et Semigroupes de Contractions dans les Espaces de Hilbert, American Elsevier, 1973.
[9] L. Schwartz, Théorie des Distributions, Hermann, 1966.
[10] I. M. Gel'fand, Generalized functions I-III (with G. E. Shilov), IV (with N. Ya. Vilenkin), V (with M. I. Graev and N. Ya. Vilenkin), Academic Press, 1964-1966.
[11] M. Sato, Theory of Hyperfunctions I, II, J. Fac. Sci. Univ. Tokyo, sec. I, 8 (1959), 139-193; 8 (1960), 387-437.
[12] M. Sato, T. Kawai, and M. Kashiwara, Microfunctions and pseudodifferential equations, Lecture notes in math. 287, Springer, 1973.

163 (XIII.16)
Functional-Differential Equations

A. General Remarks

In many models, it is assumed that the future behavior of a system under consideration is governed only by its present state and not by past states. However, for various systems arising in practical problems we cannot ignore the effect of the past on the future. Such a phenomenon is often observed in population problems, epidemiology, chemical reactions, system engineering, and so on.

The description of such phenomena may involve difference-differential equations

$$\dot{x}(t) = f(t, x(t), x(t - h_1), \dots, x(t - h_k)), \tag{1}$$

or integrodifferential equations

$$\dot{x}(t) = g(t, x(t)) + \int_0^t f(t, s, x(t), x(s))\,ds. \tag{2}$$

An enormous variety of equations is discussed in the literature, but most of them can be
expressed in the form

$$\dot{x}(t) = f(t, x(s)), \tag{3}$$

which is usually called the functional differential equation; here $s$ varies in an interval $I_t$, and $f$ depends on the function $x$ on $I_t$. If $t$ is the right or left endpoint of the interval $I_t$, then (3) is said to be of retarded or advanced type, respectively. Most of the results obtained have been from work on equations of retarded type. In this case, the maximal length of the interval $I_t$ is called the retardation (delay, lag, deviation, etc.), and (3) is often called the retarded differential equation, delay differential equation, or differential equation with lag (retardation, deviating argument).

B. Historical Remarks

Functional differential equations of a sort were studied by Johann Bernoulli in the 18th century in connection with the string problem. Since then, a good deal of work has been done in this field by many people; some of this work was done before the beginning of this century. Among the investigators, Volterra [1, 2] is noteworthy for his systematic study on rather general equations related to problems of predator-prey populations and viscoelasticity, though his results were ignored by his contemporaries. In the early 1940s, Minorsky [3], in his famous study of ship stabilization, pointed out the importance of delay effects in control theory, with many of the modern issues first appearing in his work [4]. Mishkis [5] studied linear systems extensively, and Driver [6] gave a unified representation for functional differential equations. Important achievements were given by Krasovskii [7] and Bellman and Cooke [8]. They laid the foundation for the qualitative theory of functional differential equations. Inspired by these works, many books (such as [9-14]) were published. Owing to these books together with many articles on this field, the theory of functional differential equations has become an important branch in the theory of differential equations.

Equations of advanced type are treated in, e.g., [5, 8], but qualitative theory for them has hardly been established. The equation

$$\dot{x}(t) = ax(t) + bx(\lambda t) \quad \text{on } 0 < t < \infty \tag{4}$$

is of advanced type if $\lambda > 1$, and its analytic solution has been studied in detail. Equation (3), whose right-hand side involves differential operators, e.g., $\dot{x}(t) = f(t, x(t), x(t - h), \dot{x}(t - h))$, is said to be of neutral type [5, 8, 14], and is generally considered to be of neither retarded nor advanced type because of its distinctive features. However, Hale [14] originated the study of a class of equations of neutral type for which the general qualitative theory can be developed in the same manner as for equations of retarded type.

For the case of infinite retardation, proper choice of the space of functions $x$ contained in the right-hand side of (3) is of great significance. A general treatment of the phase space for equations with infinite retardation in an axiomatic setting is given in [15].

C. Phase Space

For simplicity, let $I_t$ always be the interval $[t - h, t]$ with a finite retardation $h > 0$. A solution of (3) starting at $t = \tau$ is a function defined on $[\tau - h, a)$ for $a > \tau$ such that the solution coincides with a preassigned function (the initial function) on $[\tau - h, \tau]$ and is continuously differentiable and satisfies (3) on $[\tau, a)$. Since a solution starting at $t = \tau$ must be continuous for $t \ge \tau$, it is quite natural, as is done in much of the literature in order to develop a qualitative theory, to choose the space $C([t - h, t], \mathbf{R}^n)$ of continuous $\mathbf{R}^n$-valued functions on $[t - h, t]$ as a space of functions $x$ involved in the right-hand side of (3). Introducing a symbol $x_t$ which represents an element of $C = C([-h, 0], \mathbf{R}^n)$ defined by $x_t(s) = x(t + s)$ ($s \in [-h, 0]$), we can rewrite (3) in a more convenient form:

$$\dot{x}(t) = f(t, x_t), \tag{E}$$

where the function $f(t, \varphi)$ is defined on $\mathbf{R} \times C$ (or on its subspace). Here $C$ is considered to be a Banach space with the norm $\|\varphi\| = \max_{s \in [-h, 0]} |\varphi(s)|$, $|\cdot|$ being a norm in $\mathbf{R}^n$. The space $C$ is said to be the phase space of (E), and the initial condition can be written as

$$x_\tau = \xi, \quad (\tau, \xi) \in \mathbf{R} \times C. \tag{5}$$

D. Initial Value Problem

Let $f(t, \varphi)$ be a continuous function defined on an open domain $D \subset \mathbf{R} \times C$. The initial value problem is to find a solution of (E)-(5) for a given $(\tau, \xi) \in D$. Many fundamental results hold as for ordinary differential equations: (a) There always exists a solution of (E)-(5) under the continuity of $f$ on $D$. Here, the solution means the one to the right. No general existence theorem is given for the solution to the left. (b) The solution of (E)-(5) is unique, if $f$ satisfies the Lipschitz condition: $|f(t, \varphi) - f(t, \psi)| \le L\|\varphi - \psi\|$ in a neighborhood of each point of $D$, where $L$ is a constant depending on the neighborhood. (c) If $f$ varies in the space $C(D, \mathbf{R}^n)$
equipped with the compact open topology and if the solution of (E)-(5) is unique for every $(\tau, \xi) \in D$, then the solution is continuous as a function of $(\tau, \xi, f)$. (d) If $f$ is completely continuous on $D$, that is, $f$ is a continuous mapping of bounded subsets of $D$ into bounded sets in $\mathbf{R}^n$, then every solution can be continued to the right as long as it remains bounded or stays away from the boundary of $D$. This assertion is no longer true in the absence of complete continuity.

To prove the existence theorem, Schauder's †fixed point theorem is utilized; the Picard successive method is also effective under the Lipschitz condition. Consider the case when $f(t, \varphi)$ can be written as $f(t, \varphi) = g(t, \varphi(0), \varphi)$, where $g(t, x, \varphi)$ is defined for $(t, x, \varphi) \in \mathbf{R} \times \mathbf{R}^n \times C$, and $g(t, x, \varphi) = g(t, x, \psi)$ if $\varphi(s) = \psi(s)$ on $[-h, -\delta]$, $\delta > 0$. Equation (1) is one such case, where $h = \max h_k$, $\delta = \min h_k$. Then, $g(t, x, x_t)$ is a function of $(t, x)$ alone on $[\tau, \tau + \delta]$ under the given condition (5), that is, (E) is reduced to an ordinary differential equation. This makes it possible to find a solution of (E)-(5) by matching successively the solutions of ordinary differential equations on $[\tau, \tau + \delta]$, $[\tau + \delta, \tau + 2\delta]$, .... This is the step-by-step method, which is effective even for equations of neutral type with the same property.

Under uniqueness, the solution $x(t)$ of (E) induces a mapping $T(t, \tau): G(t, \tau) \to C$, $t \ge \tau$, which maps $x_\tau \in G(t, \tau)$ to $x_t \in C$, where $G(t, \tau) \subset C$ is the set of $\xi$ for which the solution of (E)-(5) is continuable up to $t$. Then

$$T(t, s)T(s, \tau) = T(t, \tau) \quad \text{for } t \ge s \ge \tau, \tag{6}$$

and $T(t, \tau)$ is strongly continuous in $t$. If (E) is autonomous, that is, $f(t, \varphi)$ is independent of $t$, then there exists a mapping $T(t)$, $t \ge 0$, satisfying $T(t - \tau) = T(t, \tau)$, and $\{T(t)\}_{t \ge 0}$ becomes a one-parameter semigroup.

E. Linear System

When $f(t, \varphi)$ is continuous on $I \times C$ for an interval $I$ and is linear in $\varphi$, equation (E) is said to be a linear system (denoted by (L)). In this case, $f(t, \varphi)$ satisfies the Lipschitz condition with $L = L(t)$ on $I \times C$ for a continuous function $L(t)$ of $t$, which also implies that $f$ is completely continuous. Thus the solution of (L)-(5) uniquely exists over $[\tau, \infty) \cap I$ for any $(\tau, \xi) \in I \times C$, and the mapping (the fundamental or solution operator) $T(t, \tau)$ is a †bounded linear operator on $C$ for any $t, \tau \in I$, $t \ge \tau$, and †compact if $t \ge \tau + h$. $T(t, \tau)$ corresponds to the fundamental matrix for ordinary linear differential equations, but it is not invertible in general.

A continuous function $f(t, \varphi)$ linear in $\varphi$ may be expressed in the form

$$f(t, \varphi) = \int_{-h}^{0} [d_s \eta(t, s)]\,\varphi(s), \tag{7}$$

where $\eta(t, s)$ is an $n \times n$ matrix defined on $I \times [-h, 0]$, which is measurable in $(t, s)$ and of †bounded variation in $s$ with a †total variation less than $L(t)$, and the integration is that of Stieltjes with respect to $s$. From this fact it follows that the range of the initial functions can be extended to the space of †piecewise-continuous functions, and the solution $x(t)$ of the nonhomogeneous linear system $\dot{x}(t) = f(t, x_t) + p(t)$ through $(\tau, 0) \in I \times C$ can be represented by the constant variational formula

$$x(t) = \left[\int_\tau^t T(t, s)\,\Gamma\, p(s)\,ds\right](0),$$

where $\Gamma(s) = E$ (the unit matrix) for $s = 0$, and $\Gamma(s) = 0$ (the zero matrix) for $s < 0$.

F. Autonomous Linear System

Let (L) be an autonomous linear system

$$\dot{x}(t) = f(x_t). \tag{AL}$$

In this case, (7) is not different from the Riesz representation theorem $f(\varphi) = \int_{-h}^{0} [d\eta(s)]\,\varphi(s)$. For (AL) the fundamental operator $T(t)$ plays an extremely important role. As was seen, $\{T(t)\}_{t \ge 0}$ is a one-parameter semigroup of bounded linear operators $T(t)$ which is strongly continuous in $t \ge 0$ and compact for $t \ge h$. Thus the asymptotic behavior of the solutions of (AL) is determined by the distribution of the spectra $\sigma(A)$ of the infinitesimal generator $A$ of $T(t)$, which is given by $A\varphi = \dot{\varphi}$ with the domain $D(A) = \{\varphi \in C \mid \dot{\varphi} \in C \text{ and } \dot{\varphi}(0) = f(\varphi)\}$. The properties of $T(t)$ assert that: (a) $\sigma(A)$ consists of point spectra alone. (b) The number of elements of $\Lambda_\alpha = \{\lambda \in \sigma(A) \mid \operatorname{Re} \lambda > \alpha\}$ is finite for any $\alpha \in \mathbf{R}$. (c) For every $\lambda \in \sigma(A)$ the dimension of the generalized eigenspace of $\lambda$ is finite. (d) The spectra of $A$ coincide with the roots (characteristic roots) of the characteristic equation of (AL),

$$\det\left[\lambda E - \int_{-h}^{0} e^{\lambda s}\,d\eta(s)\right] = 0,$$

together with their multiplicities. (e) $e^{\lambda t} \in \sigma(T(t))$, $t \ge 0$, if and only if $\lambda \in \sigma(A)$.

Let $P_\alpha$, $\alpha \in \mathbf{R}$, be the linear space spanned by the generalized eigenfunctions corresponding to a $\lambda \in \Lambda_\alpha$. Then, (a) $P_\alpha$ is invariant under $T(t)$, that is, $T(t)P_\alpha \subset P_\alpha$ for $t \ge 0$; (b) the restriction of $T(t)$ to $P_\alpha$ is invertible and hence extendable over $t \in \mathbf{R}$; (c) if $\xi \in P_\alpha$, then $[T(t)\xi](0) = \sum_{\lambda \in \Lambda_\alpha} p_\lambda(t, \xi)e^{\lambda t}$, where $p_\lambda(t, \xi)$ are polynomials
in $t$ and linear in $\xi$; (d) there is a direct-sum decomposition $C = P_\alpha + Q_\alpha$ such that $Q_\alpha$ is invariant under $T(t)$, the projection $\pi_\alpha: C \to P_\alpha$ along $Q_\alpha$ is bounded, and there exist positive constants $K$, $\varepsilon$ for which $\|T(t)\xi\| \le Ke^{(\alpha - \varepsilon)t}\|\xi\|$, $t \ge 0$, if $\xi \in Q_\alpha$. Hence for any $\xi \in C$ the solution $x(t)$ of (AL) through $\xi$ at $t = 0$ satisfies

$$\left| x(t) - \sum_{\lambda \in \Lambda_\alpha} p_\lambda(t, \pi_\alpha \xi)\,e^{\lambda t} \right| \le K(1 + \|\pi_\alpha\|)\,e^{(\alpha - \varepsilon)t}\,\|\xi\|, \quad t \ge 0, \tag{8}$$

where $\|\pi_\alpha\|$ denotes the operator norm of $\pi_\alpha$. The set $Q = \bigcap_{\alpha \in \mathbf{R}} Q_\alpha$ may contain an element $\xi$ other than the zero element, but $T(t)\xi$ must be identically zero for $t \ge hn$.

For a linear system (L) with $f(t, \varphi)$ $\omega$-periodic in $t$, similar conclusions result, with $T(\omega, 0)$ in place of $A$; and relation (8) holds, where $\Lambda_\alpha$ is defined similarly by replacing $\sigma(A)$ by $\{(1/\omega)\log\mu \mid \mu \in \sigma(T(\omega, 0))\}$ and $p_\lambda(t, \xi)$ are polynomials with $\omega$-periodic coefficients. This corresponds to †Floquet's theorem for ordinary periodic linear systems. $\mu \in \sigma(T(\omega, 0))$ and $(1/\omega)\log\mu$ are said to be the characteristic multiplier and the characteristic exponent, respectively.

G. Stability Problem

The concept of stability can be defined and studied in the same spirit as for ordinary differential equations, and the Lyapunov second method also turns out to be effective. For instance, the zero solution of (E) with $f$ defined on $D_H = [0, \infty) \times \{\varphi \in C \mid \|\varphi\| < H\}$ for $H > 0$ is said to be uniformly asymptotically stable if there are a constant $\alpha > 0$ and positive functions $\delta(\varepsilon)$ and $\sigma(\varepsilon)$ of $\varepsilon > 0$ such that any solution $x(t)$ of (E) satisfies

$$|x(t)| < \varepsilon \quad \text{as long as } x(t) \text{ exists}, \tag{9}$$

whenever $\|x_\tau\| < \delta(\varepsilon)$ and $t \ge \tau$, or $\|x_\tau\| < \alpha$ and $t \ge \tau + \sigma(\varepsilon)$, for $\tau \ge 0$. In the above definition, if $f$ is completely continuous on $D_H$ and if $0 < \varepsilon < H$, then the phrase "as long as $x(t)$ exists" is redundant, and (9) is equivalent to the claim that $x(t)$ exists for all $t \ge \tau$ and $|x(t)| < \varepsilon$. Under the Lipschitz condition and the complete continuity on $f$ in (E), the zero solution of (E) is uniformly asymptotically stable if and only if there exists a continuous $\mathbf{R}$-valued function (Lyapunov function) $V(t, \varphi)$ defined on $D_{H_1}$ for $0 < H_1 < H$ such that

(i) $a(|\varphi(0)|) \le V(t, \varphi) \le b(\|\varphi\|)$,

(ii) $D^+ V(t, x_t) \le -cV(t, x_t)$

as long as $(t, x_t) \in D_{H_1}$ for a solution $x(t)$ of (E), where $a(r)$, $b(r)$ are continuous functions with $a(r) > 0$ for $r > 0$, $b(0) = 0$, $c > 0$ is a constant, and $D^+$ denotes the upper-right Dini derivative. Furthermore, the Lyapunov function can be endowed with a Lipschitz condition $|V(t, \varphi) - V(t, \psi)| \le L_1\|\varphi - \psi\|$ for some constant $L_1$ if $f$ in (E) is uniformly Lipschitzian, that is, the coefficient $L$ is constant over the domain. The Lipschitz condition together with (ii) makes it possible for a solution $y(t)$ of the perturbed equation

$$\dot{y}(t) = f(t, y_t) + p(t, y_t) \tag{10}$$

to satisfy

$$D^+ V(t, y_t) \le -cV(t, y_t) + L_1|p(t, y_t)|.$$

By using this fact, the stability of (10) can be discussed.

However, for functional differential equations it is quite a difficult problem to construct a suitable Lyapunov function for a given (E).

Many of the attempts to improve the sufficiency condition that have been made up to now are such that a stability property can be verified by means of a simple Lyapunov function. The main effort has been devoted to replacing condition (ii) by another type of condition. One of them is

(ii*) $D^+ V(t, x_t) \le -c(|x(t)|)$

under the uniform boundedness of $f(t, \varphi)$, where $c(r)$ is a continuous function with $c(r) > 0$ for $r > 0$. Another one is for the case when $V(t, \varphi) = W(t, \varphi(0))$, where $W(t, x)$ is defined for $(t, x) \in \mathbf{R} \times \mathbf{R}^n$. In this case, (ii) can be replaced by

(ii**) $D^+ V(t, x_t) \le -cV(t, x_t)$

whenever $V(t + s, x_{t+s}) \le F(V(t, x_t))$ for $s \in [-h, 0]$, where $c > 0$ is a constant and $F(r)$ is a continuous function satisfying $F(r) > r$ for $r > 0$. The condition (ii**) was given by B. S. Razumikhin and provides an easier way to construct a Lyapunov function.

For a linear system, uniform asymptotic stability implies $\|T(t, \tau)\| \le 1/\delta(1)$ for $t \ge \tau$ and $\|T(t, \tau)\| \le 1/2$ for $t \ge \tau + \sigma(\alpha/2) + h$. These facts together with (6) show that the zero solution is exponentially stable, that is, $|x(t)| \le Ke^{-\gamma(t - \tau)}\|x_\tau\|$ for $t \ge \tau$, where $\gamma = (\log 2)/(\sigma(\alpha/2) + h)$, $K = 2/\delta(1)$. From (8) it follows that the zero solution of (AL) is uniformly asymptotically stable if and only if

$$\{\lambda \in \sigma(A) \mid \operatorname{Re} \lambda \ge 0\} = \emptyset. \tag{11}$$

H. Equations of Neutral Type

To deal with an equation of neutral type, such as

$$\dot{x}(t) = f(t, x_t, \dot{x}_t), \tag{12}$$
a phase space such as $C^1 = C^1([-h, 0], \mathbf{R}^n)$ of continuously differentiable functions should be chosen. The solution of (E) will become smooth as time elapses if $f$ in (E) is sufficiently smooth. However, this property cannot be expected for (12), and the solution of (12)-(5) need not belong to $C^1$ even if $\xi \in C^1$ without a specific condition on $\xi$ such as $\dot{\xi}(0) = f(\tau, \xi, \dot{\xi})$. Such a restriction on initial functions is an obstruction to the development of the general theory. Equations of the form

$$\frac{d}{dt}D(t, x_t) = f(t, x_t) \tag{N}$$

cover a fairly general class of equations of neutral type, where $D(t, \varphi)$ and $f(t, \varphi)$ are defined on an open set $D \subset \mathbf{R} \times C$ and a solution of (N)-(5) means a continuous function $x(t)$ defined on $[\tau - h, a)$, $a > \tau$, which satisfies (5) and

$$D(t, x_t) = D(\tau, \xi) + \int_\tau^t f(s, x_s)\,ds \quad \text{on } [\tau, a).$$

(E) corresponds to the case where $D(t, \varphi) = \varphi(0)$, while $\dot{x}(t) = G(t, \dot{x}_t) + g(t, x_t)$ is reduced to the form (N) if $G(t, \varphi)$ is linear in $\varphi$ and continuously differentiable with respect to $t$ by setting $D(t, \varphi) = \varphi(0) - G(t, \varphi)$ and $f(t, \varphi) = g(t, \varphi) - (\partial/\partial t)G(t, \varphi)$.

A continuous function $D(t, \varphi)$ linear in $\varphi$ is said to be atomic at 0 if in the representation

$$D(t, \varphi) = \int_{-h}^{0} [d_s \mu(t, s)]\,\varphi(s) \tag{13}$$

(see equation (7)) $P(t) = \mu(t, 0) - \mu(t, -0)$ exists and is nonsingular, which is equivalent to

$$D(t, \varphi) = P(t)\varphi(0) + \int_{-h}^{0} [d_s \tilde{\mu}(t, s)]\,\varphi(s)$$

with a nonsingular matrix $P(t)$ and a matrix $\tilde{\mu}(t, s)$ whose total variation with respect to $s$ on $[-\sigma, 0]$ tends to 0 as $\sigma \to 0$ locally uniformly in $t$. A nonlinear function $D(t, \varphi)$ is said to be atomic at 0 if $D(t, \varphi)$ has a continuous †Fréchet derivative $D_\varphi(t, \varphi)$ with respect to $\varphi$ and $D_\varphi(t, \varphi)$ is atomic at 0. In equation (N), $D(t, \varphi)$ is always assumed to be atomic at 0, and many of the fundamental theorems for (E) are also valid for (N). Let $T(t)$ be the solution operator of the autonomous linear equation

$$\frac{d}{dt}D(x_t) = f(x_t). \tag{14}$$

Then $\{T(t)\}_{t \ge 0}$ is a strongly continuous semigroup of bounded linear operators, and the corresponding †infinitesimal generator $A$ has the domain $D(A) = \{\varphi \in C \mid \dot{\varphi} \in C \text{ and } D(\dot{\varphi}) = f(\varphi)\}$. The properties of the spectra $\sigma(A)$ are the same as for (AL), except that the characteristic equation is now given by

$$\det\left[\lambda \int_{-h}^{0} e^{\lambda s}\,d\mu(s) - \int_{-h}^{0} e^{\lambda s}\,d\eta(s)\right] = 0$$

and the property (b) is true if $\alpha > a_D$, where $D(\varphi)$, $f(\varphi)$ are assumed to be of the forms (13) and (7) respectively, without the argument $t$, and where $a_D$ is given by

$$a_D = \sup\left\{\operatorname{Re} \lambda \;\Big|\; \det\left[\int_{-h}^{0} e^{\lambda s}\,d\mu(s)\right] = 0\right\}.$$

Similarly, the decomposition $C = P_\alpha + Q_\alpha$ is possible if $\alpha > a_D$. Hence, to obtain the uniform asymptotic stability of the zero solution of (14) the condition $a_D < 0$ should be assumed in addition to (11). The linear function $D(\varphi)$ is said to be stable if $a_D < 0$. If $D(\varphi)$ is stable, then the Lyapunov method is also applicable, as a sufficient condition, to the stability problem of (N) with linear $D(\varphi)$ instead of general $D(t, \varphi)$, but in this case condition (i) for $V(t, \varphi)$ should be replaced by

$$a(|D(\varphi)|) \le V(t, \varphi) \le b(\|\varphi\|).$$

I. Infinite Retardation

There are some basic differences between cases of finite retardation and of infinite retardation. For example, in the definition of stability, the inequality $|x(t)| < \varepsilon$ in (9) can be replaced by $\|x_t\| < \varepsilon$ with no difference in the case of finite retardation, but this replacement yields a different concept deeply connected with the choice of the phase space in case of infinite retardation.

There are several ways to define a phase space for a functional differential equation with infinite retardation. One of those phase spaces generally used is a linear space $X$ of functions $(-\infty, 0] \to \mathbf{R}^n$ with a seminorm $\|\cdot\|_X$ such that if a function $x: (-\infty, a) \to \mathbf{R}^n$ satisfies $x_\tau \in X$ and it is continuous on $[\tau, a)$, then

(i) $x_t \in X$ for all $t \in [\tau, a)$,

(ii) $x_t$ is continuous as a function $[\tau, a) \to X$,

(iii) $m|x(t)| \le \|x_t\|_X \le K \sup_{\tau \le s \le t} |x(s)| + M(t - \tau)\|x_\tau\|_X$,

where $m$ and $K$ are positive constants and $M(t)$ is a continuous function.

If $X$ satisfies the foregoing conditions and if $f(t, \varphi)$ is continuous on an open domain $D \subset \mathbf{R} \times X$, then the local properties (a)-(c) in Section D hold, and so does (d) when $D = \mathbf{R} \times X$. On the other hand, if $X$ has a fading memory, namely,

(iv) $M(t) \le Me^{-\mu t}$ in (iii) for positive constants $M$, $\mu$,
then the procedure in Section F is applicable by restricting $\sigma(A)$ to $\sigma_\mu(A) = \{\lambda \in \sigma(A) \mid \operatorname{Re} \lambda > -\mu\}$ and, hence, the zero solution of (AL) with infinite retardation is uniformly asymptotically stable under (11).

For $0 < h \le \infty$ and $0 \le \gamma < \infty$, both the space $C_{h,\gamma}$ of continuous functions $\varphi: (-h, 0] \to \mathbf{R}^n$ with a finite limit $\lim_{s \to -h} e^{\gamma s}\varphi(s)$ and the space $M_{h,\gamma}$ of measurable functions $\varphi: (-h, 0] \to \mathbf{R}^n$ with $\int_{-h}^{0} e^{\gamma s}|\varphi(s)|\,ds < \infty$ satisfy the foregoing conditions, where the norms are given by $\|\varphi\|_{C_{h,\gamma}} = \sup_{s \in (-h, 0]} e^{\gamma s}|\varphi(s)|$ and $\|\varphi\|_{M_{h,\gamma}} = |\varphi(0)| + \int_{-h}^{0} e^{\gamma s}|\varphi(s)|\,ds$. These spaces have a fading memory if $h < \infty$ or $\gamma > 0$.

In much of the literature in which the equation of infinite retardation has a form like (2) or (4), the space $CB$ of bounded continuous functions or the space $C_0$ of continuous functions with compact supports is sufficient as a phase space under the norm $\|\varphi\| = \sup_{s \le 0} |\varphi(s)|$. However, it is to be noted that $C_0$ is not complete, while $CB$ does not satisfy condition (ii). When $CB$ is chosen as the phase space, in order to show the existence theorem, $f(t, x_t)$ should be continuous for any bounded continuous function $x$ in addition to the continuity in $(t, \varphi)$. This condition is satisfied if $I_t = [g(t), t]$ in (3) for a continuous function $g(t) \le t$. Equation (2) or (4) with $0 < \lambda < 1$ gives rise to such a case, but $g(t)$ must be equal to 0 in (2) and (4) with $\lambda = 0$. If $g(t) \to \infty$ as $t \to \infty$, the Razumikhin condition (ii**) $D^+ V(t, x_t) \le -cV(t, x_t)$ whenever $V(s, x_s) \le F(V(t, x_t))$ ($s \in [g(t), t]$) for a Lyapunov function is effective, but one can conclude only that $x(t) \to 0$ as $t \to \infty$ without uniformity.

References

[1] V. Volterra, Sur la théorie mathématique des phénomènes héréditaires, J. Math. Pures Appl., 7 (1928), 249-298.
[2] V. Volterra, Théorie mathématique de la Lutte pour la Vie, Gauthier-Villars, 1931.
[3] N. Minorsky, Self-excited oscillations in dynamical systems possessing retarded actions, J. Appl. Mech., 9 (1942), 65-71.
[4] A. M. Zverkin, G. A. Kamenskii, S. B. Norkin, and L. E. El'sgol'ts, Differential equations with retarded arguments (in Russian), Uspekhi Mat. Nauk, 17 (1962), 77-164.
[5] A. D. Mishkis, Lineare Differentialgleichungen mit nacheilendem Argument, Deutscher Verlag Wiss., 1955. (Original in Russian, 1957.)
[6] R. D. Driver, Existence and stability of solution of a delay-differential system, Arch. Rational Mech. Anal., 10 (1962), 401-426.
[7] N. N. Krasovskii, Stability of motion, Stanford Univ. Press, 1963. (Original in Russian, 1959.)
[8] R. Bellman and K. L. Cooke, Differential-difference equations, Academic Press, 1963.
[9] L. E. El'sgol'ts and S. B. Norkin, Introduction to the theory and application of differential equations with deviating argument, Academic Press, 1971. (Original in Russian, 1963.)
[10] A. Halanay, Differential equations: Stability, oscillations, time lags, Academic Press, 1965.
[11] M. N. Oğuztöreli, Time-lag control systems, Academic Press, 1966.
[12] T. Yoshizawa, Stability theory by Liapunov's second method, Math. Soc. Japan, 1966.
[13] V. Lakshmikantham and S. Leela, Differential and integral inequalities II, Academic Press, 1969.
[14] J. K. Hale, Theory of functional differential equations, Springer-Verlag, 1977.
[15] J. K. Hale and J. Kato, Phase space for retarded equations with infinite delay, Funkcial. Ekvac., 21 (1978), 11-41.

164 (XII.18)
Function Algebras

A. Definition [1-3]

Let $C(X)$ be the †Banach algebra of all continuous complex-valued functions on a compact Hausdorff space $X$ with pointwise operations and †uniform norm. A function algebra (or uniform algebra) on $X$ is a closed subalgebra $A$ of $C(X)$ containing the constant functions and separating the points of $X$, i.e., for any $x, y \in X$, $x \neq y$, there exists $f \in A$ with $f(x) \neq f(y)$. If $\varphi_x$, $x \in X$, denotes the evaluation mapping $f \to f(x)$ of $A$, the correspondence $x \to \varphi_x$ is a homeomorphism of $X$ into the †maximal ideal space $\mathfrak{M}(A)$ of $A$. Since $\hat{f}(\varphi_x) = f(x)$ for any $x \in X$ and $f \in A$, the †Gel'fand representation of $A$ is an isometric isomorphism. By identifying $x$ with $\varphi_x$, we regard $X$ as a closed subset of $\mathfrak{M}(A)$ and the Gel'fand transform $\hat{f}$, $f \in A$, as a continuous extension of $f$ to $\mathfrak{M}(A)$.

B. Examples [1, 3, 4]

For a compact plane set $K$ let $P(K)$ ($R(K)$) be the subalgebra of all functions in $C(K)$ that can be approximated uniformly on $K$ by polynomials (rational functions with poles off $K$). $A(K)$ denotes the subalgebra of all functions in $C(K)$ that are analytic in the interior of $K$. These are function algebras on $K$ and $P(K) \subseteq R(K) \subseteq A(K)$. When $K$ is the unit circle $T$ =
{|z| = 1}, P(T) is called the disk algebra, the most typical concrete example.
The theory of function algebras emerged from attempts to solve, by means of functional analysis, certain problems in complex analysis, especially problems of uniform approximation, e.g., when does A(K) = P(K) or A(K) = R(K) hold?
Another important example is given by the algebra H_∞(U) of all bounded analytic functions on a bounded plane domain U with the supremum norm ‖f‖_∞ = sup{|f(z)| : z ∈ U} [5]. Since the Gel'fand representation is an isometric isomorphism of H_∞(U) into the algebra C(𝔐(H_∞(U))), H_∞(U) is viewed as a function algebra on the space 𝔐(H_∞(U)). Further examples result if we take K or U in a Riemann surface or in the complex n-space.
We list a few abstract function algebras that reflect certain relevant properties possessed by concrete examples. A is called a Dirichlet (resp. logmodular) algebra if the set Re A = {Re f : f ∈ A} (resp. log|A⁻¹| = {log|f| : f, f⁻¹ ∈ A}) is dense in C_R(X), the space of all continuous real-valued functions on X. It is called hypo-Dirichlet if the closure of Re A has finite codimension in C_R(X), and the linear span of log|A⁻¹| is dense in C_R(X).
In the following, A denotes a function algebra on X unless otherwise specified.

C. Boundary and Representing Measure [1-4]

A subset E of X is a boundary for A if for any f ∈ A there exists x ∈ E such that |f(x)| = ‖f‖. A closed boundary is a boundary closed in X. G. E. Shilov proved that there is a smallest closed boundary, ∂A, which is called the Shilov boundary for A. A positive Borel measure μ on X is a representing measure for φ ∈ 𝔐(A) if f̂(φ) = ∫ f(x) dμ(x) for all f ∈ A. Each φ ∈ 𝔐(A) has a representing measure supported by ∂A. The Choquet boundary, c(A), consists of all x ∈ X such that the evaluation φ_x at x has a unique representing measure. Then c(A) is a boundary, whose closure is ∂A. If X is metrizable, c(A) is a G_δ set in X and supports a representing measure for every member of 𝔐(A).
For φ ∈ 𝔐(A), M_φ denotes the set of representing measures for φ. It is a †weak* compact convex subset of the space of measures on X. M_φ is a singleton if A is Dirichlet or logmodular. It is finite-dimensional if A is hypo-Dirichlet. The case dim M_φ < +∞ has been studied in detail. Extensive studies for the case dim M_φ = +∞ have been done only for concrete examples related to polydisks, infinitely connected domains, etc.
The notion of boundary reflects the †maximum modulus principle for analytic functions, and representing measures come from †Poisson's integral formula. The most relevant maximum principle is H. Rossi's local maximum modulus principle: For any closed set K in 𝔐(A), we have ‖f̂‖_K = ‖f̂‖_{bd K ∪ (K ∩ ∂A)} for all f ∈ A, where bd K is the topological boundary of K and ‖f̂‖_S = sup{|f̂(s)| : s ∈ S}. A corresponding result for function spaces was considered by Y. Hirashita and J. Wada. Closely related to representing measures are the orthogonal measures for A, i.e., complex Borel measures μ on X such that ∫ f dμ = 0 for all f ∈ A; they are often useful in studying function algebras by means of the duality technique. The set of orthogonal measures is denoted by A^⊥.

D. Peak Sets [1-3]

A subset K of X is a peak set for A if there exists f ∈ A such that f(x) = 1 for x ∈ K and |f(y)| < 1 for y ∈ X − K. K is a generalized peak set if it is the intersection of peak sets. A point x ∈ X is a (generalized) peak point if the set {x} is a (generalized) peak set. The set of generalized peak points equals the Choquet boundary for A. If X is metrizable, the peak sets and generalized peak sets coincide. There exist X and A such that X is metrizable: X = 𝔐(A) = c(A), but A ≠ C(X) (B. Cole). A subset E of X is interpolating for A if for any bounded continuous function u on E there exists f ∈ A with f|E = u. Then a closed G_δ set K in X is an interpolating peak set if and only if μ ∈ A^⊥ implies |μ|(K) = 0 (E. Bishop).

E. Antisymmetric Decomposition [1-3]

A subset F of X is a set of antisymmetry for A if every function in A which is real-valued on F is constant on F. Bishop's antisymmetric decomposition then appears as an extension of the †Weierstrass-Stone theorem on uniform approximation. It is a refinement of Shilov's decomposition and reads as follows: If {E_α} is the family of maximal sets of antisymmetry for A, it is a partition of X into generalized peak sets such that f ∈ C(X) with f|E_α ∈ A|E_α for all α belongs to A. An interesting connection was found by J. Tomiyama between the maximal antisymmetric decomposition of X relative to A and that of 𝔐(A) relative to Â.

F. Parts and Analytic Structure [1-4]

A. M. Gleason defined an equivalence relation ~ in 𝔐(A) by setting φ ~ ψ if sup{|f̂(φ) − f̂(ψ)| : f ∈ A, ‖f‖ ≤ 1} < 2. Each equivalence class for this relation is a part (or a Gleason
part) for A. Two points φ, ψ belong to the same part if and only if there exist mutually absolutely continuous representing measures μ for φ and ν for ψ such that c⁻¹ ≤ dμ/dν ≤ c for some constant c > 0 (Bishop). An analytic structure in 𝔐(A) is a pair (V, τ), with an †analytic set V in some open subset of C^n and a nonconstant continuous mapping τ: V → 𝔐(A) such that f̂ ∘ τ is analytic on V for all f ∈ A. For such a structure, τ(V) is always within a part. But there can be a nontrivial part with no analytic structure, as was shown by G. Stolzenberg. A topological characterization of parts was obtained by J. Garnett. Sample results in the positive direction are: (1) If the ideal A_φ = {f ∈ A : f̂(φ) = 0} is finitely generated, there is an analytic structure (V, τ) such that τ is a homeomorphism of V onto an open neighborhood of φ (Gleason); (2) if φ has a unique representing measure and if the part P of φ is not a singleton, there is a bijective continuous mapping τ of the open unit disk onto P such that f̂ ∘ τ is analytic for f ∈ A (J. Wermer, K. Hoffman and G. Lumer); (3) if A is hypo-Dirichlet and if a part P is not a singleton, P can be made into a 1-dimensional †analytic space so that each f̂, f ∈ A, is analytic on P (J. Wermer and B. V. O'Neill). Analytic structures in †polynomially convex hulls of curves in C^n have also been studied [3,4].

G. Abstract Function Theory [1,6-9]

Let φ ∈ 𝔐(A), and choose m ∈ M_φ, which is fixed. The generalized Hardy class H_p(m), 0 < p ≤ ∞, associated with A is the closure (weak* closure, if p = ∞) of A in the †L_p space L_p(m) on the measure space (X, m). Under suitable restrictions on A, φ, or m, we can recapture some of the important classical facts, most of which have their origins in the works of A. Beurling, R. Nevanlinna, F. and M. Riesz, and G. Szegő. In this area, H. Helson and D. Lowdenslager came up with a powerful method using orthogonal projections in Hilbert space and, together with S. Bochner's remark, exerted a strong influence on subsequent development. The modification argument was then devised by Hoffman and Wermer, inspired by F. Forelli. After Hoffman's detailed study of logmodular algebras, Lumer observed that most results remain valid when φ has a unique representing measure. T. P. Srinivasan and J.-K. Wang (→ [8]) introduced the notion of weak* Dirichlet algebra and showed that some major theorems are mutually equivalent and are in fact measure-theoretic. With the Hoffman-Rossi complement, their result now states that for fixed m ∈ M_φ the following are equivalent: (i) A + Ā is weak* dense in L_∞(m), i.e., A is a weak* Dirichlet algebra on (X, m); (ii) if μ ∈ M_φ is absolutely continuous with respect to m, then μ = m; (iii) if a closed subspace M of L_2(m) is simply invariant in the sense that A_φ M ⊆ M and A_φ M is not dense in M, then M = qH_2(m) with q ∈ L_∞(m), |q| = 1 a.e.; (iv) the set log|H_∞(m)⁻¹| coincides with L_∞(m; R), the set of real-valued elements in L_∞(m); (v) for every w ∈ L_1(m), w ≥ 0, we have inf{∫ |1 − f|² w dm : f ∈ A_φ} = exp(∫ log w dm); (vi) the linear functional h → ∫ h dm on H_∞(m) has a unique positive extension to L_∞(m). Further extensions were subsequently made by use of the conjugation operator (→ Section K) by T. Gamelin, H. König, Lumer, K. Yabuta, and others. Some more properties of weak* Dirichlet algebras have been obtained by T. Nakazi.

H. Generalized Analytic Functions [1, 10]

Let Γ be a dense subgroup of the additive group R of the reals with discrete topology, and let G be the †character group of Γ. Each a ∈ Γ, as a character of G, defines a continuous function χ_a on G. Let A be the closed subalgebra of C(G) generated by {χ_a : a ∈ Γ, a ≥ 0}. A is a Dirichlet algebra but is far more difficult to describe than the disk algebra. The study of this algebra, especially that of invariant subspaces, has evolved from papers by Helson and D. Lowdenslager. Let σ be the †Haar measure of G and H_2(σ) the closure of A in L_2(σ). A closed subspace M of L_2(σ) is called invariant if χ_a M ⊆ M for all a ∈ Γ, a ≥ 0. M is called doubly invariant if χ_a M ⊆ M for all a ∈ Γ. Otherwise, it is called simply invariant. In fact, only the latter is interesting. Let e_t ∈ G with t ∈ R be the character of Γ defined by e_t(a) = e^{iat}. Then the mapping t → e_t is a faithful representation of R into G. A cocycle on G is defined to be a Borel function B on G × R such that (i) |B(x, t)| = 1, (ii) B(x + e_s, t) = B̄(x, s) B(x, s + t) for x ∈ G and s, t ∈ R, where the bar denotes complex conjugation. Two cocycles are identified if they differ only on a null set in G × R. For a cocycle B, let M_B be the set of f ∈ L_2(σ) such that B(x, t) f(x + e_t) ∈ H_2(dt/(1 + t²)) for almost all x ∈ G, where H_2(dt/(1 + t²)) is the closure, in the space L_2(dt/(1 + t²)) on R, of the set of boundary value functions on R of bounded analytic functions on the upper half-plane. Then the mapping B → M_B is a bijection from the set of cocycles onto the set of simply invariant subspaces M of L_2(σ) such that M = ∩{χ_a M : a ∈ Γ, a < 0}. Moreover, M_B = qH_2(σ) for some q ∈ L_∞(σ), |q| = 1 a.e. if and only if B(x, t) = q̄(x) q(x + e_t), i.e., B is a coboundary. When Γ ≠ R, there is a cocycle that is not a coboundary. Further studies have been done by Helson,
Gamelin, J. Tanaka, and others. On the other hand, F. Forelli observed that †flows in compact spaces give rise to a kind of analyticity. In the above case, the algebra A consists of all f ∈ C(G) that are analytic with respect to the flow T_t(x) = x + e_t for x ∈ G and t ∈ R. Generalized analytic functions induced by flows in general have been studied by Forelli and P. S. Muhly.

I. The Unit Disk [1,6,7,11,12]

Let A be the disk algebra P(T), D the open unit disk {|z| < 1}, and m the normalized Lebesgue measure on T. The algebra A has been the most important model in the theory of function algebras, and we find here the origin of many abstract results. Some typical results: (i) every orthogonal measure for A is absolutely continuous with respect to m (F. and M. Riesz); (ii) a closed set E in T is an interpolating peak set for the †Gel'fand transform Â of A if and only if m(E) = 0 (W. Rudin and L. Carleson); (iii) A is maximal among the closed subalgebras of C(T) (Wermer); (iv) a function f ∈ C(Cl D) belongs to Â if f is analytic at every z ∈ D with f(z) ≠ 0 (T. Radó).
The generalized Hardy class H_p(m), 0 < p ≤ ∞, associated with A is viewed as the set of nontangential boundary value functions of elements in the classical †Hardy class H_p(D). Here we find the origin of invariant subspace theorems: A closed subspace M (≠ {0}) of H_2(m) is invariant, i.e., AM ⊆ M, if and only if M = qH_2(m) with q ∈ H_∞(m), |q| = 1 a.e. (Beurling).
The algebra H_∞ = H_∞(m) is a weak* Dirichlet algebra, whose Shilov boundary is identified with the maximal ideal space X of L_∞(m). We have L_∞(m; R) = log|(H_∞)⁻¹|, and a fortiori H_∞ is logmodular on X. The mapping z → φ_z embeds the disk D in 𝔐(H_∞) as an open set. The structure of 𝔐(H_∞) was studied in detail by I. J. Schark, Hoffman, and others. We finish with three remarkable results: (i) D is dense in 𝔐(H_∞) (Carleson). This is the corona theorem and was proved in the following equivalent form: For any f_1, ..., f_n ∈ H_∞(D) with |f_1| + ... + |f_n| ≥ δ > 0 on D, there exist g_1, ..., g_n ∈ H_∞(D) with f_1 g_1 + ... + f_n g_n = 1. A simple proof was discovered by T. Wolff [12]. (ii) The convex combinations of †Blaschke products are uniformly dense in the unit ball of H_∞(D) (D. Marshall) [12]. (iii) Every closed subalgebra B between H_∞ and L_∞(m) is a Douglas algebra, i.e., B is generated by H_∞ and the complex conjugates of a family of inner functions (S.-Y. Chang and Marshall) [11]. The proof of (iii) is an interesting application of the theory of †bounded mean oscillation.

J. Rational Approximation [1,2]

The problem of rational (polynomial) approximation on a compact plane set K asks when A(K) = R(K) (A(K) = P(K)) holds. An important tool for such problems is Cauchy's transform of a measure μ: μ̂(ζ) = ∫ (z − ζ)⁻¹ dμ(z). Using this, one can show, for instance, the following: (1) A(K) = P(K) if and only if K has a connected complement (S. N. Mergelyan); (2) f ∈ C(K) belongs to R(K) if each point z ∈ K has a closed neighborhood V with f|K∩V ∈ R(K ∩ V) (Bishop); (3) A(K) = R(K) if the diameters of the components of C − K are bounded away from zero (Mergelyan); (4) C(K) = R(K) if and only if almost all points of K are peak points for R(K) (Bishop). The last result cannot be extended much because there is a set K such that A(K) ≠ R(K), while A(K) and R(K) have the same peak points (A. M. Davie).
A complete characterization for A(K) = R(K) was obtained by A. G. Vitushkin: (5) The following are equivalent: (i) A(K) = R(K); (ii) for any bounded open set D in C, α(D − K) = α(D − K°); (iii) for any z ∈ bd K, there exists r ≥ 1 such that lim sup_{δ→0} α(Δ(z; δ) − K°)/α(Δ(z; rδ) − K) < +∞, where Δ(z; δ) is the disk {w ∈ C : |w − z| < δ} and α(E), for any bounded set E in C, is the continuous analytic capacity of E, which is the supremum of |f′(∞)| for all continuous functions f on the Riemann sphere C ∪ {∞} such that |f| ≤ 1, f(∞) = 0, and f is analytic off a compact subset of E. As for uniform or asymptotic approximation on noncompact closed subsets in C, Carleman's classical study has recently been extended by N. U. Arakelyan, A. A. Nersesyan, A. Stray, and others in an interesting way.
In connection with rational approximation, we should note detailed studies on pointwise bounded approximation in H_∞(U), U being a bounded open set in C, by O. J. Farrell, L. Rubel and A. Shields, Gamelin and J. Garnett, and Davie.

K. Further Topics

(1) Conjugation operator [9,13]. Take any m ∈ M_φ, φ ∈ 𝔐(A). The conjugation operator then associates with each u ∈ Re A the unique element *u in C_R(X) such that u + i*u ∈ A and ∫ *u dm = 0. After the classical inequalities of Kolmogorov and M. Riesz, we consider the following conditions with constants c_p and d_p: (K) (∫ |*u|^p dm)^{1/p} ≤ c_p ∫ |u| dm for 0 < p < 1; (M) (∫ |*u|^p dm)^{1/p} ≤ d_p (∫ |u|^p dm)^{1/p} for 1 < p < ∞. Let m be arbitrary. Then the inequality (K) is valid for u ∈ Re A, u > 0. The inequality (M) is valid for all u ∈ Re A if p is an even integer ≥ 2; it is
valid for all u ∈ Re A, u > 0, if p is not an odd integer. All remaining cases have counterexamples. On the other hand, M_φ always contains an m such that log|f̂(φ)| ≤ ∫ log|f| dm for f ∈ A (Bishop). Such a representing measure is called a Jensen measure for φ. If m is Jensen, (K) is valid for all u ∈ Re A and all 0 < p < 1, and (M) is valid for all u ∈ Re A and all 1 < p < ∞.
(2) Riemann surfaces. For a compact bordered Riemann surface R, let A(R) be the algebra of functions, continuous on Cl R and analytic on R. Then A(R) is hypo-Dirichlet on the border bd R of R, and many results for the disk algebra are extended to A(R). Most of the basic results are described in [14]. The maximality of A(R) in C(bd R) was obtained by H. Royden; extreme points of related Hardy classes were discussed by Gamelin and M. Voichick; and invariant subspaces were determined by Forelli, M. Hasumi, D. Sarason, and Voichick. A further extension to infinitely connected surfaces has been obtained by C. W. Neville, Hasumi, and M. Hayashi in the case of open Riemann surfaces R of Parreau-Widom type, which is defined as follows: Let G(a, z) be †Green's function for R, and let B(a, α), α > 0, be the first †Betti number of the domain {z ∈ R : G(a, z) > α}; then R is of Parreau-Widom type if ∫_0^∞ B(a, α) dα < +∞. For such surfaces the situation looks favorable: For instance, the Cauchy-Read theorem is valid, and the Brelot-Choquet problem concerning Green's lines is solved affirmatively. As for the generalization of approximation theorems of Mergelyan and Arakelyan, we refer to the work of Bishop and L. K. Kodama for compact sets and to that of S. Scheinberg for noncompact closed sets.
(3) Higher-dimensional sets. Much attention has been paid to algebras of analytic functions on domains in C^n, n ≥ 2, e.g., polydisks, unit balls, and general pseudoconvex domains. Polydisk algebras and ball algebras have been studied extensively by W. Rudin, P. Ahern, Forelli, and many others [15,16]. Approximation theorems of Mergelyan type were obtained by G. M. Henkin, N. Kerzman, and I. Lieb for strictly pseudoconvex domains with smooth boundary and by L. Hörmander and Wermer and L. Nirenberg and R. O. Wells, Jr., in the case of totally real manifolds. Further improvements have been obtained by R. M. Range, A. Sakai, and others.

References
[1] T. W. Gamelin, Uniform algebras, Prentice-Hall, 1969.
[2] A. Browder, Introduction to function algebras, Benjamin, 1969.
[3] E. L. Stout, The theory of uniform algebras, Bogden and Quigley, 1971.
[4] J. Wermer, Banach algebras and several complex variables, Markham, 1971.
[5] T. W. Gamelin, Lectures on H^∞(D), Universidad Nacional de La Plata, 1972.
[6] K. Hoffman, Banach spaces of analytic functions, Prentice-Hall, 1962.
[7] H. Helson, Lectures on invariant subspaces, Academic Press, 1964.
[8] F. Birtel (ed.), Function algebras (Proc. Int. Symposium on Function Algebras, Tulane Univ., 1965), Scott, Foresman, 1966.
[9] K. Barbey and H. König, Abstract analytic function theory and Hardy algebras, Lecture notes in math. 593, Springer, 1977.
[10] H. Helson, Analyticity on compact Abelian groups, Algebras in Analysis, J. H. Williamson (ed.), Academic Press, 1975, 1-62.
[11] D. Sarason, Function theory on the unit circle, Virginia Polytech. Inst. and State Univ., 1978.
[12] P. Koosis, Introduction to H_p spaces, Cambridge Univ. Press, 1980.
[13] T. W. Gamelin, Uniform algebras and Jensen measures, Cambridge Univ. Press, 1978.
[14] M. Heins, Hardy classes on Riemann surfaces, Lecture notes in math. 98, Springer, 1969.
[15] W. Rudin, Function theory in polydiscs, Benjamin, 1969.
[16] W. Rudin, Function theory in the unit ball of C^n, Springer, 1980.

165 (II.4)
Functions

A. History

Leibniz used the term function (Lat. functio) in the 1670s to refer to certain line segments whose lengths depend on lines related to curves. Soon the term was used to refer to dependent quantities or expressions. In 1718, Johann Bernoulli used the notation φx, and by 1734 the modern functional notation f(x) had been used by Clairaut and by Euler, who defined functions as analytic formulas constructed from variables and constants (1748) [1]. †Cauchy stated (1821) [2]: "When there is a relation among many variables, which determines along with values of one of them the values of the others, we usually consider the others as expressed by the one. We then call
the one an 'independent variable,' and the others 'dependent variables.'" †Dirichlet considered a function of x ∈ [a, b] in his paper (1837) [3] concerning representations of "completely arbitrary functions" and stated that there was no need for the relation between y and x to be given by the same law throughout an interval, nor was it necessary that the relation be given by mathematical formulas. A function was simply a correspondence in which values of one variable determined values of another.

B. Functions

Today, the word "function" is used generally in mathematics in the same sense as a †mapping (→ 381 Sets C) or, which is the same thing, a †univalent correspondence (→ 358 Relations B). But this word is sometimes used in a wider sense, to mean a general (not necessarily univalent) correspondence, called a many-valued (or multivalued) function; in that case a univalent correspondence is called a single-valued function.
Specialists in each branch of mathematics have their respective ways of using the word. In analysis, values of a function are often considered real or complex numbers; such functions are called real-valued functions or complex-valued functions, respectively. Furthermore, if the domain of the function is also a set of real or complex numbers, then it is called a real function or a complex function, respectively (→ 131 Elementary Functions; 84 Continuous Functions; 198 Holomorphic Functions). If the domain of a real- or complex-valued function is contained in a †function space, the function is often called a functional; the †distribution is an example. In algebra we often fix a †field, †ring, etc., and consider functions whose domains and ranges are in such algebraic systems. Special names are given to functions having special properties, which can be defined according to the structures of the domain and the range. For example, when both domain and range of a function f are sets of real numbers, f is called an even function if f(t) = f(−t), and an odd function if f(t) = −f(−t). A function f that preserves the order relation between real numbers, i.e., such that t_1 < t_2 implies f(t_1) ≤ f(t_2), is called a †monotone increasing function.
A mapping from a set I to a set F of functions, φ: I → F, is called a family of functions indexed by I (or simply family of functions), and is denoted, using the form f_λ instead of φ(λ), by {f_λ}_{λ∈I} or {f_λ} (λ ∈ I). In particular, if I is the set of natural numbers, the family is called a sequence of functions.

C. Variables

A letter x, for which we can substitute a name of an element of a set X, is called a variable, and X is called the domain of the variable. An element of the domain of a variable x is called a value of x. In particular, if the domain is a set of real numbers or complex numbers, the variable is called a real variable or a complex variable, respectively. On the other hand, a letter that stands for a particular element is called a constant.
When the domain and range of a function f are X and Y respectively, a variable x whose domain is X is called the independent variable, and a variable y whose domain is Y is called the dependent variable. Then we say y is a function of x, and write y = f(x). When a concrete method is given by which we make a value of y correspond to each value of x, we say that y is an explicit function of x. When a function is determined only by a †binary relation such as R(x, y) = 0, we say that y is an implicit function of x (→ 208 Implicit Functions).
Given functions f, g with an independent variable t, suppose that y is regarded as a function of x defined by relations x = f(t), y = g(t). Then we say that y is a function of x with the variable t as a parameter. A function whose range is a given set C with variable t as its independent variable is often called a parametric representation of C by t.
If the domain of a function f is contained in a Cartesian product set X_1 × X_2 × ... × X_n, the independent variable is denoted by (x_1, x_2, ..., x_n), and f is often called a function of n variables or a function of many variables (when n ≥ 2).

D. Families and Sequences

A function whose domain is a set I, φ: I → X, is called a family indexed by I (or simply family), and I is called the index set. In the case φ(λ) = x_λ (λ ∈ I) the family is denoted by {x_λ}_{λ∈I} or {x_λ} (λ ∈ I). If the range X of a function φ is a set of points, a set of functions, a set of mappings, or a set of sets, then the family {x_λ}_{λ∈I} is called a family of points, a family of functions, a family of mappings, or a family of sets, respectively. If the set I is a †directed set, the family is called a directed family. Generally, if J is a subset of I, the family {x_λ}_{λ∈J} is called a subfamily of {x_λ}_{λ∈I}. In particular, if I is a finite or infinite set of natural numbers, the family indexed by I is called a finite sequence or infinite sequence, respectively. Sequence is a generic name for both, but in many cases it means an infinite sequence, and usually we
have I = N. Then the value corresponding to n ∈ N is called the nth term or, generally, a term. For convenience, the 0th term is often used as well. If each term of a sequence is a number, a point, a function, or a set, the sequence is called a sequence of numbers, a sequence of points, a sequence of functions, or a sequence of sets, respectively. A sequence is usually denoted by {a_n}. If it is necessary to show the domain of n explicitly, the sequence is denoted by {a_n}_{n∈I}. If J is a subset of I, a sequence {a_n}_{n∈J} is called a subsequence of the sequence {a_n}_{n∈I}. And if I = N, the composite {a_{k_n}} of {a_n} and a sequence {k_n} of natural numbers with k_1 < k_2 < k_3 < ... is usually called a subsequence of {a_n}.

References
[1] L. Euler, Opera omnia, ser. 1, Opera mathematica VIII: Introductio in analysin infinitorum I, Teubner, 1922.
[2] A. L. Cauchy, Cours d'analyse de l'École Royale Polytechnique pt. 1, 1821 (Oeuvres ser. 2, III, Gauthier-Villars, 1897).
[3] G. P. L. Dirichlet, Über die Darstellung ganz willkürlicher Funktionen durch Sinus- und Cosinusreihen, Repertorium der Physik, Bd. 1 (1837), 152-174 (Werke, Bd. 1, G. Reimer, 1889, p. 133-160).

166 (X.5)
Functions of Bounded Variation

A. Monotone Functions

A function (or mapping) f from an †ordered set X to another ordered set Y is called a monotone increasing (monotone decreasing) function if

x_1 < x_2 implies f(x_1) ≤ f(x_2)
(x_1 < x_2 implies f(x_1) ≥ f(x_2)). (1)

A monotone increasing (decreasing) function is also called a nondecreasing (nonincreasing) function. In either case, the function f is called simply a monotone function. If X and Y are †totally ordered sets and the inequality < (>) holds in (1) instead of ≤ (≥), then f is called a strictly (monotone) increasing (strictly (monotone) decreasing) function. In either case, f is called simply a strictly monotone function.
In particular, when X and Y are subsets of the real line R, a monotone function is continuous except for at most a countable number of points. Hence it is †Riemann integrable in a finite interval provided that it is bounded. A continuous real function f(x) defined on an interval in R is †injective if and only if it is strictly monotone. In such a case, the range of the function f(x) is also an interval, and the inverse function is also strictly monotone. Furthermore, a differentiable real function f defined on an interval is monotone if and only if its derivative f′ is always ≥ 0 (monotone increasing) or always ≤ 0 (monotone decreasing). If f′ > 0 (< 0), f is strictly monotone increasing (decreasing).

B. Functions of Bounded Variation

Let f(x) be a real function defined on a closed interval [a, b] in R. Given a subdivision of the interval a = x_0 < x_1 < x_2 < ... < x_n = b, we denote the sum of positive differences f(x_i) − f(x_{i−1}) by P and the sum of negative differences f(x_i) − f(x_{i−1}) by −N. Then we easily obtain

P − N = f(b) − f(a),
P + N = Σ_i |f(x_i) − f(x_{i−1})|.

The suprema of P, N, and P + N for all possible subdivisions of [a, b] are called the positive variation, the negative variation, and the total variation of the function f(x) in the interval [a, b], respectively. If any of these three values is finite, then all three values are finite. In such a case, the function f(x) is called a function of bounded variation. Every function of bounded variation is bounded, but the converse is not true. The positive and negative variations π(t), ν(t) of the function f(x) in the interval [a, t] are monotone increasing functions with respect to t, and we have

f(x) − f(a) = π(x) − ν(x) (2)

if f(x) is a function of bounded variation. Hence every function of bounded variation has both left and right limits at every point. A monotone function is a function of bounded variation, and the sum, the difference, or the product of two functions of bounded variation is also a function of bounded variation. Hence f(x) is a function of bounded variation if and only if it is the difference of two monotone functions. The representation (2) (representing a function of bounded variation as the difference of two monotone increasing functions) is called the Jordan decomposition of the function f(x). A function of bounded variation is Riemann integrable, continuous except for at
most a countable number of points, and differentiable †almost everywhere.
167 (XIV.7)
A continuous function defined on a closed
Functions of Confluent Type
interval is bounded but not necessarily of
bounded variation (e.g., f(x)= xsin(l/x) (XE A. Confluent Hypergeometric Functions
(0, l]), 0 (x = 0)). A discontinuous function
may be a function of bounded variation on If some singularities of an ordinary differential
a closed interval (e.g., sgn(x)). However, an equation of +Fuchsian type are confluent to
tabsolutely continuous function, a differenti- each other, we obtain a confluent differential
able function with bounded derivative, or a equation whose solutions are called functions
function satisfying the +Lipschitz condition is a of confluent type. The equations that appear
function of bounded variation on a closed frequently in practical problems are the con-
interval. fluent bypergeometric differential equations
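The examples of the preceding paragraph can be checked numerically. The following Python sketch computes the sums P and N of Section B for a given subdivision and applies them to sgn(x) on [−1, 1] and to x sin(1/x) on [0, 1]. The helper names (variation_sums, oscillating_subdivision) and the particular subdivisions are assumptions made only for this illustration; a finite computation of this kind suggests, but of course does not prove, that the total variation of x sin(1/x) is infinite.

```python
import math


def variation_sums(f, points):
    """Return (P, N) for the subdivision `points` (assumed strictly increasing).

    P is the sum of the positive differences f(x_i) - f(x_{i-1});
    -N is the sum of the negative ones, so that N >= 0.
    """
    P = 0.0
    N = 0.0
    for left, right in zip(points, points[1:]):
        d = f(right) - f(left)
        if d > 0:
            P += d
        else:
            N -= d
    return P, N


def sgn(x):
    return (x > 0) - (x < 0)


def x_sin_inv(x):
    # f(x) = x sin(1/x) on (0, 1], with f(0) = 0
    return x * math.sin(1.0 / x) if x != 0 else 0.0


def oscillating_subdivision(n):
    # Points x_k = 1 / ((k + 1/2) pi), k = 0, ..., n - 1, where sin(1/x) = +1 or -1,
    # listed in increasing order between the endpoints 0 and 1.
    inner = [1.0 / ((k + 0.5) * math.pi) for k in range(n)]
    return [0.0] + inner[::-1] + [1.0]


# sgn is monotone increasing, so every subdivision of [-1, 1] gives P = 2, N = 0.
P, N = variation_sums(sgn, [-1.0, -0.5, 0.0, 0.5, 1.0])

# For x sin(1/x) the sums P + N keep growing as the subdivision is refined.
totals = [sum(variation_sums(x_sin_inv, oscillating_subdivision(n)))
          for n in (10, 100, 1000)]

# The identity P - N = f(b) - f(a) holds for every subdivision.
P2, N2 = variation_sums(x_sin_inv, oscillating_subdivision(100))
```

With the subdivision chosen here, the values of x sin(1/x) alternate in sign at consecutive points, so each refinement adds terms of size about 2/(kπ) to P + N; this is the harmonic-type growth behind the unbounded variation.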
The notion of functions of bounded vari-
d2w dw
ation was introduced by C. Jordan in connec- zdz2+(y-z)dz-cIw=o (1)
tion with the notion of the length of curves (-
246 Length and Area). and related equations. Equation (1) corre-
sponds to the thypergeometric differential
equation for which a tregular singular point
C. Lebesgue-Stieltjes Integral coincides with the point at inlïnity and is an
tirregular singular point of class 1. For (1)
Let f(x) be a right continuous function of z = 0 is a regular singular point, and a series
bounded variation on a closed interval [a, b], solution (radius of convergence CO) is given by
and f(x) = n(x) - v(x) the Jordan decomposi-
tion of f(x). Then Z(X) and v(x) are monotone
increasing right continuous functions and hence
delïne bounded measures drr(x) and ~V(X) on
[a, h], respectively (- 270 Measure Theory L
(v)). The difference dn - dv of these two mea- where y is not a nonpositive integer. The func-
sures is a tcompletely additive set function tion ,F, in (2) is a tgeneralized hypergeometric
on [a, b] which is often called the (signed) function due to Barnes and is called a byper-
Lebesgue-Stieltjes measure induced by A written geometric function of confluent type or Kum-
d& For every function g integrable with respect mer function. If y is not equal to a positive
to the measure dp = dz + dv, we detïne jEg df integer, the other solution of (1) independent of
to be equal to JEgdn -jEgdv and call it the (2) is given by z ‘-F(l +CC-y,2-y;z) (- Ap-
Lebesgue-Stieltjes integral. In this case, h(x) pendix A, Table 19.1).
= Jtn,bl y df is of bounded variation on [a, b]
and the Lebesgue-Stieltjes measure db(x)
is denoted by g(x)df(x). If f, and f2 are of B. Wbittaker Functions
bounded variation on [a, b], then SO is the
product f = fi fi, and we have Equation (1) with w = er’2z-y’2 W, y - 2a = 2k,
y2 - 2y = 4mZ - 1 reduces to Whittaker’s dif-
df(x)=f,(x~O)df,(x)+f,(xTO)df~,x). (3)
ferential equation
The Lebesgue-Stieltjes integral for a continu-
ous integrand is often called the Riemann- w=o. (3)
Stieltjes integral, because it cari be defïned in
an elementary way similar to the defïnition of If 2m is not equal to an integer, (3) has two
the Riemann integral. series solutions for any lïnite z:
The notion of bounded variation cari also be
delïned for interval functions on R” and set Mk,m(z)=z(1~2)ime~Z~ZF(~+m-k, 1 +2m;z),
functions on an abstract space (- 380 Set Mk -m(z)=z(1~2)-me-2~2F(~-m-k, 1 -2m;z).
Functions).
If 2m is an integer, since the functions n/r,,,
and Mk,-,, are linearly dependent, E. T. Whit-
References taker considered a solution of the form

[l] H. L. Royden, Real analysis, Macmillan,


second edition, 1963.
[2] W. Rudin, Real and complex analysis, (o+)
a> (-t)-k-(1/2)tm
X
McGraw-Hill, second edition, 1974. s
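The series (2) and the definition of $M_{k,m}$ can be checked numerically. The following Python sketch is an illustration only: the function names `kummer`, `whittaker_m`, and `whittaker_residual`, the truncation at 200 terms, and the finite-difference step are assumptions made for this example, and it is restricted to real $z > 0$.

```python
import math

def kummer(alpha, gamma, z, terms=200):
    """Partial sum of the Kummer series 1 + (alpha/gamma) z/1! + ... (series (2))."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        # Ratio of consecutive terms: (alpha + n) / (gamma + n) * z / (n + 1).
        term *= (alpha + n) / (gamma + n) * z / (n + 1)
    return total

def whittaker_m(k, m, z):
    """M_{k,m}(z) = z^(1/2+m) e^(-z/2) F(1/2+m-k, 1+2m; z) for real z > 0."""
    return z ** (0.5 + m) * math.exp(-z / 2) * kummer(0.5 + m - k, 1 + 2 * m, z)

def whittaker_residual(k, m, z, h=1e-3):
    """Central-difference estimate of W'' + (-1/4 + k/z + (1/4 - m^2)/z^2) W."""
    w = lambda t: whittaker_m(k, m, t)
    second = (w(z + h) - 2 * w(z) + w(z - h)) / h ** 2
    return second + (-0.25 + k / z + (0.25 - m * m) / z ** 2) * w(z)

# With alpha = gamma the series reduces to the exponential series.
print(kummer(1.5, 1.5, 2.0), math.exp(2.0))
# The residual of Whittaker's equation should be small (finite-difference error only).
print(whittaker_residual(0.3, 0.2, 1.5))
```

The residual is not exactly zero because the second derivative is approximated by a central difference; it only indicates that $M_{k,m}$ is consistent with equation (3) at the sampled point.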

If $k - \frac{1}{2} - m$ is equal to a negative integer, this integral does not exist. The function

$$W_{k,m}(z) = \frac{e^{-z/2} z^{k}}{\Gamma\left(\frac{1}{2} - k + m\right)} \int_0^{\infty} t^{-k-(1/2)+m} \left(1 + \frac{t}{z}\right)^{k-(1/2)+m} e^{-t}\,dt$$

for $\operatorname{Re}\left(k - \frac{1}{2} - m\right) < 0$ is defined for any $m$, $k$, and for any $z$ except when $z$ is a negative real number. We call $M_{k,m}$ and $W_{k,m}$ the Whittaker functions. †Bessel functions are particular cases of these functions, and the relation

$$J_m(z) = \frac{z^{-1/2}}{2^{2m+(1/2)}\, e^{(m+(1/2))\pi i/2}\, \Gamma(m+1)}\, M_{0,m}(2iz)$$

is satisfied. In Whittaker's differential equation, since $W_{-k,m}(-z)$ is also a solution and $W_{k,m}(z)/W_{-k,m}(-z)$ is not equal to a constant, $W_{k,m}(z)$ and $W_{-k,m}(-z)$ can be considered a pair of fundamental solutions (→ Appendix A, Table 19.II).

C. Parabolic Cylinder Functions

Putting $x = (\xi^2 - \eta^2)/2$ and $y = \xi\eta$, the curves corresponding to $\xi$ = constant and to $\eta$ = constant, respectively, constitute families of orthogonal parabolas. The curvilinear coordinates $(\xi, \eta, z)$ in three dimensions are called parabolic cylindrical coordinates. By using parabolic coordinates, separating variables in Laplace's equation into a product of $f(\xi)$, $g(\eta)$, and an exponential function of $z$, and making a simple transformation, we find that $f$ and $g$ satisfy a differential equation of the form

$$\frac{d^2F}{dz^2} + \left(n + \frac{1}{2} - \frac{1}{4}z^2\right)F = 0. \tag{4}$$

By means of the Whittaker function $W_{k,m}(z)$, a solution $D_n(z)$ of (4) is represented by

$$D_n(z) = 2^{(n/2)+(1/4)}\, z^{-1/2}\, W_{(n/2)+(1/4),\,-1/4}\left(\tfrac{1}{2}z^2\right).$$

Equation (4) is called Weber's differential equation or the Weber-Hermite differential equation, and $D_n(z)$ the Weber function. Another solution of (4) is $D_{-n-1}(iz)$ or $D_{-n-1}(-iz)$. The solutions of (4) are called parabolic cylinder functions. In particular, if $n$ is equal to a nonnegative integer, then

$$H_n(z) = 2^{-n/2} \exp\left(\tfrac{1}{2}z^2\right) D_n\left(\sqrt{2}\,z\right)$$

is the †Hermite polynomial of degree $n$. Solutions of differential equations for harmonic oscillators in quantum mechanics are of this form.

In general, suppose that three regular singular points are confluent to the point at infinity, and that they are reduced to an irregular singular point of class 2. Suppose further that there are no other singularities. Then differential equations of order 2 with these conditions are transformed into the form (4), whose solutions are represented by parabolic cylinder functions. Differential equations of the form (4) are reduced to confluent hypergeometric differential equations if $z^2$ is chosen as an independent variable (→ Appendix A, Table 20.III).

D. Indefinite Integrals of Elementary Functions

Since exponential and trigonometric functions can be represented by particular types of Kummer functions, their indefinite integrals that cannot be represented by elementary functions, e.g., †incomplete $\Gamma$-functions and the error function $\operatorname{Erf} z = \int_0^z \exp(-t^2)\,dt$, can be represented by Kummer or Whittaker functions. They are included in a family of †special functions of confluent type. The functions defined by

$$C(z) = \frac{1}{\sqrt{2\pi}} \int_0^z \frac{\cos t}{\sqrt{t}}\,dt, \qquad S(z) = \frac{1}{\sqrt{2\pi}} \int_0^z \frac{\sin t}{\sqrt{t}}\,dt$$

are called Fresnel integrals; the combination $C(z) - iS(z)$ is also represented in terms of the Whittaker function. Fresnel integrals first appeared in the theory of the diffraction of waves. More recently they have been applied to designing highways for high-speed automobiles. Furthermore, the functions

$$C(u) = \int_0^u \cos\left(\frac{\pi}{2}s^2\right)ds, \qquad S(u) = \int_0^u \sin\left(\frac{\pi}{2}s^2\right)ds$$

(obtained by a change of variables $z = \pi u^2/2$) are also called Fresnel integrals. Numerical tables are available for them. The curves $x = C$ and $y = S$ with a parameter $z$ or $u$ are called Cornu's spiral (Fig. 1). The functions

$$\operatorname{Li} x = \int_0^x \frac{dt}{\log t}, \qquad \operatorname{Ei} x = \int_{-\infty}^x \frac{e^t}{t}\,dt,$$

where a †principal value must be taken at $t = 0$ if $x > 0$,

$$\operatorname{Si} x = \int_0^x \frac{\sin t}{t}\,dt, \quad \text{and} \quad \operatorname{Ci} x = -\int_x^{\infty} \frac{\cos t}{t}\,dt$$

are called the logarithmic integral, exponential integral, sine integral, and cosine integral, or integral logarithm, integral exponent, integral sine, and integral cosine, respectively. They satisfy the relations

$$\operatorname{Ei} x = \operatorname{Li} e^x, \qquad \operatorname{Ei} ix = \operatorname{Ci} x + i\operatorname{Si} x + (\pi/2)i.$$

They have important applications: $\operatorname{Ei} x$ in quantum mechanics, $\operatorname{Si} x$ and $\operatorname{Ci} x$ in electrical engineering, and $\operatorname{Li} x$ in estimating the number of †primes less than $x$ (→ 123 Distribution of Prime Numbers). $\operatorname{Li} x$ is also denoted by $\operatorname{li} x$ (→ Appendix A, Table 19.II).

Fig. 1

E. Stokes's Equation

Consider a linear differential equation of the second order with five regular singular points including the point at infinity such that the difference of the characteristic indices at every singularity is equal to 1/2. Such equations are called generalized Lamé's differential equations. F. Klein and M. Bôcher have shown that every linear differential equation that is commonly treated in mathematical physics is represented by a confluent type of generalized Lamé's equation. Among these equations, if all five singularities are confluent to the point at infinity, the resulting equation is called Stokes's differential equation, which is applied to the investigation of diffraction. This is reduced to †Bessel's differential equation of order 1/3 by suitable transformations of the independent and dependent variables.

References

[1] H. Buchholz, The confluent hypergeometric function, with special emphasis on its applications, translated by H. Lichtblau and K. Wentzel, Springer, 1969. (Original in German, 1953.)
[2] L. J. Slater, Confluent hypergeometric functions, Cambridge Univ. Press, 1960.
For the logarithmic integral, etc.,
[3] N. Nielsen, Theorie des Integrallogarithmus und verwandter Transzendenten, Teubner, 1906.
[4] Mathematical tables I, British Assoc. Adv. Sci., London, 1931.
[5] Nat. Bur. Standards, Tables of sine, cosine and exponential integrals I, II, Washington, D.C., 1940.
[6] L. Lewin, Polylogarithms and associated functions, North-Holland, 1958, revised edition, 1981.
Also → references to 389 Special Functions.

168 (XII.6)
Function Spaces

A. General Remarks

It is a general method in modern analysis to consider a set $X$ of mappings of a space $\Omega$ into another space $A$ as a space (→ 381 Sets) and its elements (namely, mappings of $\Omega$ into $A$) as points of the space $X$, and to investigate them as geometric objects. In particular, it is important to consider the case where $\Omega$ is a †topological space, a †measure space, or a †differentiable manifold and $X$ is a set of real- or complex-valued functions defined on $\Omega$ and satisfying certain conditions, such as continuity, measurability, and differentiability. Such spaces are generally called function spaces; they usually form †topological linear spaces (→ 37 Banach Spaces; 197 Hilbert Spaces; 424 Topological Linear Spaces).

B. Examples of Function Spaces

The following are important examples of function spaces. Throughout this section, all functions are real- or complex-valued, and two functions on a measure space are identified whenever they are equal to each other †almost everywhere.

(1) The Function Spaces $C(\Omega)$, $C_\infty(\Omega)$, and $C_0(\Omega)$. The totality of continuous functions $f(x)$ defined on a compact †Hausdorff space $\Omega$ is denoted by $C(\Omega)$. Let $f + g$ and $\alpha f$ ($\alpha$ a real or complex number) be the functions $f(x) + g(x)$ and $\alpha f(x)$, respectively. Then $C(\Omega)$ forms a †linear space. Furthermore, define the norm of $f$ by $\|f\| = \sup_{x \in \Omega} |f(x)|$. Then $C(\Omega)$ becomes a †Banach space since it is complete in (the

metric defined by) the norm. The norm is called the supremum norm or uniform norm because $\lim_{n\to\infty} \|f_n - f\| = 0$ means that $f_n(x)$ converges to $f(x)$ uniformly on $\Omega$ as $n \to \infty$. Define $f \cdot g$ to be the function $f(x)g(x)$. Then clearly $\|f \cdot g\| \le \|f\|\,\|g\|$. Hence $C(\Omega)$ is also a †Banach algebra.

Suppose that a subset $R$ of $C(\Omega)$ satisfies the following three conditions: (i) $R$ is an †algebra over the complex number field with respect to the addition and multiplication defined above and contains the function identically equal to one. (ii) For any two distinct points $x$ and $y$ of $\Omega$, there exists a function $f \in R$ satisfying $f(x) \ne f(y)$. (iii) For any $f \in R$, there exists an $f^* \in R$ such that $f^*(x) = \overline{f(x)}$ on $\Omega$. Then $R$ is dense in $C(\Omega)$ with respect to the supremum norm (namely, in the sense of uniform convergence). This fact is known as the Weierstrass-Stone theorem (or Stone-Gel'fand theorem). A subset $E$ of $C(\Omega)$ is †precompact (i.e., any sequence of functions in $E$ contains a subsequence that converges uniformly on $\Omega$) if and only if $E$ is †uniformly bounded and †equicontinuous (Ascoli-Arzelà theorem). Since $C(\Omega)$ is not a †reflexive Banach space except for trivial cases (→ Section C), precompact sets and relatively compact sets are different in the †weak topology. For important characterizations of the latter sets → [2,5].

When $\Omega$ is a topological space that is not necessarily compact, the totality of bounded continuous functions on $\Omega$ (denoted by $BC(\Omega)$) is also a Banach space with respect to the supremum norm $\|f\| = \sup_{x\in\Omega} |f(x)|$. Let $\Omega$ be a locally compact Hausdorff space. Then the space $C(\Omega)$ of all continuous functions on $\Omega$ is endowed with the topology of uniform convergence on the compact sets, i.e., the †locally convex topology defined by the †seminorms $\sup_{x\in K} |f(x)|$ as $K$ ranges over the compact sets in $\Omega$. $C(\Omega)$ is always a complete locally convex space. It is a †Fréchet space if $\Omega$ is $\sigma$-compact (i.e., $\Omega$ is the union of a countable family of compact sets). We denote by $C_\infty(\Omega)$ the subspace of all functions $f(x) \in C(\Omega)$ that converge to zero as $x$ tends to infinity (i.e., given an $\varepsilon > 0$, there is a compact set $K$ such that $|f(x)| < \varepsilon$ for $x \notin K$). $C_\infty(\Omega)$ is a Banach space with the norm $\sup_{x\in\Omega} |f(x)|$. It can be regarded as a closed linear subspace of $BC(\Omega)$.

The totality of continuous functions with compact support is denoted by $C_0(\Omega)$ or $\mathcal{K}(\Omega)$, where the support (or carrier) of a function $f$ is the †closure of the set $\{x \mid f(x) \ne 0\}$ in $\Omega$ and is usually denoted by $\operatorname{supp} f$. If $\Omega$ is not compact, $C_0(\Omega)$ is not complete with respect to the supremum norm, but when $\Omega$ is $\sigma$-compact $C_0(\Omega)$ is complete with respect to the strongest locally convex topology with the property that for each compact set $K$ in $\Omega$ the embedding in $C_0(\Omega)$ of the linear subspace $\mathcal{K}_K$ of all functions with support in $K$ equipped with the supremum norm is continuous. Usually $\mathcal{K}(\Omega)$ denotes the space $C_0(\Omega)$ equipped with this topology.

(2) The Lebesgue Spaces $L_p(\Omega)$ ($0 < p < \infty$). Let $(\Omega, \mu)$ be a †measure space. We denote by $L_p(\Omega)$ the totality of †measurable functions $f(x)$ on $\Omega$ such that $|f(x)|^p$ is integrable. A function $f$ defined on $(\Omega, \mu)$ is called square integrable if $f \in L_2(\Omega, \mu)$. If $\Omega$ is the interval $(a, b)$ equipped with the Lebesgue measure, it is sometimes denoted by $L_p(a, b)$. We define the norm $\|f\|_p$ by

$$\|f\|_p = \|f\|_{L_p} = \left(\int_\Omega |f(x)|^p\,d\mu(x)\right)^{1/p}.$$

If $1 \le p < \infty$, then $L_p(\Omega)$ is a Banach space. The triangle inequality for this norm is precisely the †Minkowski inequality. If $0 < p < 1$, the norm no longer satisfies the triangle inequality but does satisfy the quasinorm inequality $\|f + g\|_p \le 2^{(1/p)-1}(\|f\|_p + \|g\|_p)$, and if $\sum \|f_n\|_p^p < \infty$, then $\sum f_n$ converges unconditionally in $L_p(\Omega)$. Hence $L_p(\Omega)$, $0 < p < 1$, is a †quasi-Banach space. If $\lim_{n\to\infty} \|f_n - f\|_p = 0$, we say that the sequence $\{f_n\}$ converges to $f$ in the mean of order $p$ (or in the mean of power $p$), and write $\operatorname{l.i.m.}_{n\to\infty} f_n = f$. If $\{f_n\}$ converges to $f$ in the mean of order 2, we simply say that $\{f_n\}$ converges to $f$ in the mean. (The notation l.i.m. means the limit in the mean and is used mostly when $p = 2$.) For any $f, g \in L_2(\Omega)$, $(f, g) = \int_\Omega f(x)\overline{g(x)}\,d\mu(x)$ is well defined, by the Schwarz inequality, and has the properties of the †inner product. Hence, $L_2(\Omega)$ is a †Hilbert space. If $1 < p < \infty$, then $L_p(\Omega)$ is a †uniformly convex Banach space and is in particular †reflexive. Deepest results on $L_p(\mathbf{R}^n)$, $1 < p < \infty$, are often derived from the Littlewood-Paley theory due to J. E. Littlewood and R. E. A. C. Paley, A. Zygmund [6], and E. M. Stein [7]. Its starting point is the inequality

$$c_p \|f\|_p \le \|g(f)\|_p \le C_p \|f\|_p,$$

where $g(f)$ is the function

$$g(f)(x) = \left(\int_0^\infty |\operatorname{grad} u(x, t)|^2\, t\,dt\right)^{1/2}$$

obtained from the Poisson integral

$$u(x, t) = \frac{\Gamma((n+1)/2)}{\pi^{(n+1)/2}} \int_{\mathbf{R}^n} t\,(t^2 + |x - y|^2)^{-(n+1)/2} f(y)\,dy.$$

The $L_p$ spaces, $1 < p < \infty$, are generalized in the following way. Let $\Phi(s)$ be a convex and nondecreasing function on $[0, \infty)$ satisfying $\Phi(0) = 0$ and $\Phi(s)/s \to \infty$ as $s \to \infty$. Denote

by $L_\Phi(\Omega)$ ($L_\Phi^*(\Omega)$) the set of all functions $f(x)$ such that $\Phi(|f(x)|)$ is integrable ($\Phi(k|f(x)|)$ is integrable for some $k > 0$). $L_\Phi(\Omega) = L_\Phi^*(\Omega)$ if $\Phi(2s) \le C\Phi(s)$. $L_\Phi^*(\Omega)$ is a Banach space, called the Orlicz space, under the norm

$$\|f\| = \inf\left\{\lambda > 0 \;\Big|\; \int_\Omega \Phi(\lambda^{-1}|f(x)|)\,d\mu(x) \le 1\right\}.$$

$L_p(\Omega)$, $1 < p < \infty$, is the Orlicz space for $\Phi(s) = s^p$.

(3) The Function Space $M(\Omega) = L_\infty(\Omega)$. Let $\Omega$ and $\mu$ be as in (2). A measurable function $f(x)$ on $\Omega$ is said to be essentially bounded if there exists a positive number $\alpha$ such that $|f(x)| \le \alpha$ almost everywhere on $\Omega$. The infimum of such $\alpha$ is called the essential supremum of $f$, denoted by $\operatorname{ess\,sup}_{x\in\Omega} |f(x)|$. The totality of essentially bounded measurable functions on $\Omega$ (denoted by $M(\Omega)$) is a Banach space with respect to the norm $\|f\| = \|f\|_\infty = \operatorname{ess\,sup}_{x\in\Omega} |f(x)|$. If $\mu(\Omega) < \infty$, then $M(\Omega) \subset L_p(\Omega)$ for any $p > 0$, and $\|f\|_\infty = \lim_{p\to\infty} \|f\|_p$ for any $f \in M(\Omega)$. From this point of view, $M(\Omega)$ is also denoted by $L_\infty(\Omega)$ even when $\mu(\Omega) = \infty$. This is also the reason why the notation $\|\cdot\|_\infty$ is used for the norm in $M(\Omega)$.

(4) The Lorentz Spaces $L_{(p,q)}(\Omega)$ ($0 < p, q \le \infty$). The Lebesgue spaces $L_p(\Omega)$, $0 < p < \infty$, are rearrangement invariant. Namely, define for a measurable function $f$ on a measure space $(\Omega, \mu)$ the distribution function $\mu_f(s) = \mu\{x \in \Omega \mid |f(x)| > s\}$, $s > 0$, and the rearrangement $f^*(t) = \inf\{s > 0 \mid \mu_f(s) \le t\}$, $t > 0$. Then $f \in L_p(\Omega)$ if and only if $f^* \in L_p(0, \infty)$ and $\|f\|_p = \|f^*\|_p$. Another important class of rearrangement invariant spaces are the Lorentz spaces $L_{(p,q)}(\Omega)$, $0 < p, q \le \infty$ (G. G. Lorentz, 1950; R. A. Hunt [8]), which is defined to be the quasi-Banach space of all measurable functions $f$ on $\Omega$ such that

$$\|f\|_{(p,q)} = \|t^{1/p} f^*(t)\|_{L_q^*} < \infty,$$

where $L_q^*$ is the $L_q$-space on $(0, \infty)$ relative to the measure $dt/t$. $L_{(p,p)}(\Omega) = L_p(\Omega)$ with equal norms. If $1 < p < \infty$ and $1 \le q \le \infty$, then $L_{(p,q)}(\Omega)$ is a Banach space under the equivalent norm $\left\|t^{(1/p)-1} \int_0^t f^*(s)\,ds\right\|_{L_q^*}$. Except for these cases, $L_{(p,q)}(\Omega)$ is not equivalent to a normed space in general [8]. If $q_0 < q_1$, then $L_{(p,q_0)}(\Omega) \subset L_{(p,q_1)}(\Omega)$ with continuous embedding. In case $\mu(\Omega) < \infty$, $L_{(p_0,q_0)}(\Omega) \subset L_{(p_1,q_1)}(\Omega)$ for $p_0 > p_1$ and any $q_0$, $q_1$. The Lorentz spaces play an important role in interpolation and approximation theory (→ 224 Interpolation of Operators).

(5) The Function Space $S(\Omega)$. Let $(\Omega, \mu)$ be a measure space with $\mu(\Omega) < \infty$. Denote by $S(\Omega)$ the totality of measurable functions on $\Omega$ that take finite value almost everywhere. Then

$$\|f\| = \int_\Omega \frac{|f(x)|}{1 + |f(x)|}\,d\mu(x)$$

for $f \in S(\Omega)$ has the properties of the †pseudonorm and $S(\Omega)$ is a †Fréchet space (in the sense of Banach) that is not locally convex in general. We have $\lim_{n\to\infty} \|f_n - f\| = 0$ if and only if

$$\lim_{n\to\infty} \mu\{x \mid |f_n(x) - f(x)| > \varepsilon\} = 0$$

for any positive number $\varepsilon$. Convergence of this type is called convergence in measure (or asymptotic convergence) and is the same notion as †convergence in probability of a sequence of †random variables (→ 342 Probability Theory). If $\{f_n\} \subset L_p(\Omega)$ converges to $f \in L_p(\Omega)$ in the mean of order $p$, then $\{f_n\}$ converges to $f$ asymptotically, but the converse is not true in general. If a sequence $\{f_n\} \subset S(\Omega)$ converges to $f \in S(\Omega)$ almost everywhere, then $\{f_n\}$ converges to $f$ asymptotically. Any sequence $\{f_n\}$ that converges to $f$ asymptotically contains a subsequence $\{f_{n_j}\}$ that converges to $f$ almost everywhere.

(6) The Sequence Spaces $c$, $c_0$, $l_p$ ($0 < p < \infty$), $m = l_\infty$, and $s$. The totality $c$ (resp. $c_0$) of sequences $x = \{\xi_n\}$ that converge (resp. converge to zero) as $n \to \infty$ forms a Banach space with respect to the norm $\|x\| = \sup_n |\xi_n|$. $c_0$ (resp. $c$) is the space $C_\infty(\Omega)$ (resp. $C(\Omega)$) when $\Omega$ is (resp. the one-point compactification of) the discrete locally compact space $\{1, 2, 3, \ldots\}$. The sequence space $l_p$, $0 < p < \infty$ (resp. $m = l_\infty$), is defined to be the space $L_p(\Omega)$ (resp. $M(\Omega)$), where $\Omega$ is the space $\{1, 2, 3, \ldots, n, \ldots\}$, of which each point has unit mass, while $s$ denotes the space $S(\Omega)$, where $\Omega = \{1, 2, \ldots, n, \ldots\}$, provided with the measure assigning mass $1/2^n$ to the point $n$. $s$ is the set of all sequences equipped with the topology of pointwise convergence ($s$ is also used to denote the space of rapidly decreasing sequences; → Section (16)).

Assume that the space $L_2(\Omega)$ mentioned in (2) is †separable and that $\{\varphi_n\}$ is a †complete orthonormal set in $L_2(\Omega)$. Then putting

$$\xi_n = (f, \varphi_n)$$

(†Fourier coefficients) for any $f \in L_2(\Omega)$, we have $\{\xi_n\} \in l_2$ and $\sum_{n=1}^\infty |\xi_n|^2 = \|f\|^2$. Conversely, for any $\{\xi_n\} \in l_2$ there exists an $f = \sum_{n=1}^\infty \xi_n \varphi_n \in L_2(\Omega)$ whose Fourier coefficients are the given $\xi_n$ (Riesz-Fischer theorem). By means of this correspondence, separable spaces $L_2(\Omega)$ and $l_2$ are mutually isomorphic as Hilbert spaces.

Sometimes we denote by $l_p(\Omega)$ the function space $L_p(\Omega)$, where $\Omega$ is an arbitrary set endowed with the measure assigning mass 1 to each point.
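The norms of the sequence spaces can be tried out on finite sequences, which belong to every $l_p$. The following Python sketch is an illustration only; the helper names `lp_norm`, `sup_norm`, and `quasi_constant` and the sample vectors are assumptions made for this example. It shows that for $p = 1/2$ the triangle inequality fails while the quasinorm inequality with constant $2^{(1/p)-1}$ still holds, and that the $l_p$ norm approaches the supremum norm as $p$ grows.

```python
def lp_norm(xs, p):
    """(sum_n |x_n|^p)^(1/p) for a finite sequence xs and 0 < p < infinity."""
    return sum(abs(x) ** p for x in xs) ** (1.0 / p)

def sup_norm(xs):
    """sup_n |x_n|, the norm of m = l_infinity on a finite sequence."""
    return max(abs(x) for x in xs)

def quasi_constant(p):
    """Constant 2^((1/p) - 1) in the quasinorm inequality for 0 < p < 1."""
    return 2.0 ** (1.0 / p - 1.0)

# Two unit vectors with disjoint supports and their sum.
x, y = [1.0, 0.0], [0.0, 1.0]
s = [a + b for a, b in zip(x, y)]

p = 0.5
lhs = lp_norm(s, p)                  # norm of x + y
rhs = lp_norm(x, p) + lp_norm(y, p)  # sum of the norms

print(lhs > rhs)                               # the triangle inequality fails
print(lhs <= quasi_constant(p) * rhs + 1e-12)  # the quasinorm inequality holds

# For large p the l_p norm is close to the supremum norm.
v = [3.0, -4.0, 1.0]
print(lp_norm(v, 200), sup_norm(v))
```

The pair of unit vectors is chosen because it attains equality in the quasinorm inequality for $p = 1/2$, so the constant $2^{(1/p)-1}$ cannot be improved in this case.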

(7) The John-Nirenherg Space BMO. A locally equivalent conditions: (i) MGfcL, for a <p
integrable function f(x) on R” is said to be of with s cp(x)dx # 0; (ii) M:~E L, for a <p with
hounded mean oscillation if s <pdx # 0; (iii) M:~E L, for any <p; (iv) M:~E
L, for any: <p. If a distribution f satistïes (one
llf IlBMO=~~PI~lrl If(4-fsl~x<~, (1) of) these conditions, then its Poisson integral
ss u(x, t) = <Pu*f(x) is a function, where C~(X)=
where the supremum is taken over a11 (solid) const(1 +IX~~))(~+*)~~, and its radial maximal
spheres S in R”, fs is the mean ISJ mlJsf(x)dx, function u+(x)= SUP,,~ I~(X, t)I and nontangen-
and [SI denotes the measure of S (F. John and. tial maximal function u*(x) = s~p,,_~,<, lu(y, t)I
L. Nirenberg, 1961). The set BMO(R”) of aIl both belong to L,. Conversely if u(x, t) is a
functions on R” of bounded mean oscillation harmonie function on the Upper half-space
forms a Banach space under the norm men- t > 0 and if either u+ or u* belongs to L,,
tioned above if two functions f and g are iden- then its boundary value f= u(. ,O) exists in the
tified whenever ,f-g is equal to a constant sense of tempered distribution and f satisfies
almost everywhere. Condition (1) is equivalent the above conditions. Detïne the norm of an
to ,f~ H,(R”) by ~~u*/~~. Then H,(R”) becomes
UP a quasi-Banach space. If 1 <p < CO, then
sup ISI-’ If(x)-f$‘dx <CO H,,(R”)= L,(R”) with equivalent norms. H,(R”)
( 1s > is a Banach space strictly smaller than L, (R”).
for any 1< p < CO. There are constants B and An fi L,(R”) belongs to H,(R”) if and only if
K > 0 such that a11 the Riesz transforms Rif are in L, (R”), and
ll.fllH, is equivalent to IlfIl, +C IiRjfll i. Simi-
lar characterizations of H,(R”) are also known
for p > 0 (Fefferman and Stein [SI).
Let 5, j= 1, . , m, be +proper convex open
for any sphere S and Â.> 0. BM0 is a slightly
cones in R” such that the +Polars IjO caver
larger space than L, (e.g. log 1x1E BMO)
R”. Then an fi Y’(R”) belongs to H,(R”) if
and has better properties. For example, the
and only if there are holomorphic functions
Calderon-Zygmund operators are bounded in
F,(x + iy) on R” + iq such that SU~{ Ile(. +
BMO(R”). We have
iy)ll,ly~I~}<m andf=C4(.+iIjO)(D.L.
Burkholder, R. F. Gundy, and M. L. Silver-
BMO(R”)=L,(R”)+ f RjL,(Rn),
j=l Stein for n = 1 and L. Carleson for n > 1). Let 0
< p d 1. A measurable function u on R” is said
where Rj are the +Riesz transforms [9]. This is
to be a p-atom if there is a sphere S such that
called the Fefferman-Stein decomposition. The suppacSand /lall,,~ISI-“PandifSa(x)x”dx
Riesz transforms cari be replaced by more
=0 for all multi-indices cxwith ICI[ <FI(~-‘- 1).
general families of singular integral operators
Here a multi-index a is an n-tuple (c(i , , cc,,)
(A. Uchiyama, Acta Math. (1982)).
of nonnegative integers, 1a[ = %i + . + a, and
Xm=Xl=’ . ..x a” A distribution ,f belongs to
(8) The Hardy Spaces HP (0 < p < a). The
classical theory of Hardy classes (- 159 H,(R”), 0 < p 2 ;, if and only if there are a
Fourier Series G) has been reconstructed by sequence of p-atoms uj and a sequence of
the real-analysis method and extended to numbers ij > 0 in 1, such that f = C ijuj in the
higher-dimensional cases by E. M. Stein, G. sense of distributions, and the norm IlfliH, is
Weiss, C. Fefferman, and others. According to equivalent to the infimum of llljllr, (R. R. Coif-
man for II = 1 and R. H. Latter for n > 1). The
their terminology the elements of the Hardy
theory of HP and BM0 has been generalized
space H, are (the complex linear combinations
of) the real parts of the boundary values of to more general situations (- Coifman and
Weiss, Bull. Amer. Math. Soc. (1977)).
holomorphic functions of the Hardy class.
Let fc Y’(R”) be a ttempered distribution.
From now on we assume that R is a domain
For a <pE Y(Rn) detïne the radial maximal
in the n-dimensional Euclidean space R” (or
function M,ff and the nontangential maximal
more generally a differentiable manifold). D”
function M,*f relative to <p by
stands for Dl’1 D,,““, where Dj = c~/c?x,.

f>O (9) The Function Spaces C’(0) and Ch(Q) (1=


qf(x)=, ypt*.f(Y)l, 0, 1,2, , CO). The totality of I-times continu-
x , ously differentiable functions in Q (namely,
where C~,(X)= t -“cp(x/t) and * denotes convo- differentiable functions of tclass C’ in fi) is
lution. Then the Hardy space H,(R”), 0 <p < denoted by C’(D). We say that a sequence
10, is detïned to be the space of all tempered {f,} of functions in C’(Q) converges to 0 in
distributions ,f which satisfy the following C’(Q) if ID”~,(X)/ converges to 0 uniformly on

every compact subset of R for every GLsatisfy- (11) The Sobolev Spaces kV’,‘,@), H’(R), and
ingO<laI<I(O<lal<coif1=m).C’(R)isa E@n) (l= 0, 1,2, . . . ) l<p<cc or -co<l<co,
Fréchet space. The totality of functions in l<p<co).Let1>0beanintegerandl<p<
C’(Q) whose supports are compact subsets of CO. The Sobolev space II$@) is the totality of
R is denoted by CA(Q). We say that a sequence functions f(x) such that for a11 a satisfying
{fy} of functions in Ch(n) converges to 0 in JaI < 1, the derivatives D”f(x) in the sense of
Ch(Q) if suppf, (v = 1,2,. . ) is contained in a distribution (- 125 Distributions and Hyper-
compact subset of R independent of Y and {f,} functions) belong to L,(R) with respect to
converges to 0 in C’(Q). Ch(Q) is an t(LF)- Lebesgue measure in Q. II$(Q) is a Banach
space. space with the norm
When R is a locally closed set in R” (or a
differentiable manifold with boundary), we
denote by C’(Q) the totality of functions f(x)
on R together with their continuous forma1 Clearly II$‘(Q) = L,(Q). II’:@) is a Hilbert
derivatives D”~(X), la1 ~1 (la1 < CO if I= 00) on space with respect to the inner product
fl such that for every a and m with 1a1 < m < 1
(m<co if1=co), (.Ld= no<;<lD=fWD"godx.
s . .
D4f(x)-,p,<;-,c, D”+Bf(Y)(x -YYlP!
Sometimes W:(R) is denoted by H’(a). Its
=0(1x-y[“-‘“‘) closed linear subspace obtained as the comple-
tion of C?(0) is denoted by HA(R). We have
locally uniformly in R as lx - y1 tends to 0. Hi(R)= W~(Cl)=L,(Cl). However, if I> 1, we
Convergence in C’(Q) is delïned in the same have H;(o)c Ii$Q), and identity does not
way as above. hold unless R = R”.
If Sz is a closed set in R” with a finite number The delïnition of Sobolev spaces has been
of connected components in each of which extended to those with fractional and also
two points x and y are connected by an arc of negative order -00 <s < CO in many different
length < Clx-y1 with a constant C indepen- ways. When 1 < p < CO, it is natural to delïne
dent of x and y, then every f in C’(Q) cari be W;(R”) to be the space of a11 tempered distri-
extended to an f in C’(R”) (Whitney’s extension butions f on R” such that (1 - A)s@f = F r (( 1
. theorem). +~~~')"'2Ff)~Lp(R"). If s>O, then W;(R”) thus
delïned coincides with the space of a11 fi
(10) The Lipschitz Spaces A”. Let s >O, and
L,(R”) whose Poisson integral u(x, t) satislïes
let k be the least integer greater than s. The
Lipschitz (or Holder) space k(R”) is the totality cc Il2
t4k~2S~‘IAk~(.,t)12dt <co
of functions f(x) on R” which satisfy
Il(s 0 > Il P
for some (and any) integer k > 42.
“fllA”=sx~,p j$o ; (-l)jf(x+jy) lYl”< CO.
1 0 Il The Poisson integral cari be replaced by
other regularizations, and thus the definition
When s< 1, this is exactly the tlipschitz (or
of Sobolev spaces of fractional orders is ex-
Holder) condition of order s. But when s = 1, it
tended to arbitrary open set s2 with the cane
is strictly weaker than the tlipschitz condition.
condition (T. Muramatu [ 131). Here R is said
A function f E A’ is said to be smooth in the
to satisfy the cane condition if there are a
sense of A. Zygmund [6]. Suppose 0 < h <s is
bounded and uniformly continuous mapping
an integer. Then a function f belongs to A” if
Y:R”+R” and an E>O such that for any XER
and only if it is h times continuously differenti-
the convex hull of the e-bal1 with tenter at x +
able and a11 the derivatives O"f of order h are
Y(x) and {x} is included in Q
in IA-~. A”(R”) is a Banach space of functions
modulo the polynomials of degree < k - 1.
Suppose that 1 Q q < CO and f is a measur- (12) The Besov Spaces B”p,q (-CO <s< cql <p,
able function on R” such that q < 00). The effort to make the Sobolev em-
bedding theorem [lO] more precise led S. M.
Nikol’skiï and 0. V. Besov [l l] to the other
classes of “Sobolev spaces” of fractional order.
where the supremum is taken over a11 spheres Let s > 0 and 1 < p, q < CO.The Besov space
in R” and the infimum over a11 polynomials of B&(R”) is the totality of functions f EL,(R”)
degree <s. Then f(x) is equal to a function such that
f(x) in A’(R”) almost everywhere and the su-
premum is equivalent to the norm jjfll*s. If IB;,r= j$o ; (-l)‘f(.+AJ) qdy lk4
Conversely every f E A”(R”) satislïes the above (Ill ( > Il plYlqs+” >
inequality (S. Campanato). <CO

for some (and any) integer k > s. In terms of hood l& of 0 in 9@) is delïned to be the
the Poisson integral an feL,(R”) belongs to totality of functions f(x) such that 11O”f11 p < E
BS,,,(R”) if and only if for any cLsatisfying [xl< 1. In particular, 9$,(Q)
is also denoted by a(Q). (99(Q) is also used to
denote the space of hyperfunctions.)
A function f(x) is called a rapidly decreas-
for some (and any) integer k > 42. The first ing C”-function if it belongs to P(R”) and
definition is obviously extended to an arbi- satislïes
trary domain R in R”. If R satislïes the cane
condition, then the functions in B&(fi) are
characterized similarly as above by using a suit- for any c( and any integer k > 0. The totality of
able regularization u(x, t) of f(x) (Muramatu rapidly decreasing C”-functions is denoted by
[13]). Let R be a domain with the cane condi- Y. The neighborhood v,k,E of 0 in Y is defïned
tion. When 0 <h <s is an integer, an f belongs to be the totality of functions f(x) such that
to B;,,(a) if and only if fE w;(n) and all (1 +Ix~~)~ID~~(x)I<E for any c( satisfying la1<1.
the derivatives O”f of order h are in B;;:(n). The spaces in this section are tnuclear except
BP,,@) is a Banach space with the norm ~~f~~, for 9L and %, and are employed in the theory
+ IflB;,,. If p = 2, then B;,,(Q) coincides with of distributions [14] (- 125 Distributions and
the Sobolev space IV,(Q). But in the other Hyperfunctions). When R = R”, we usually
cases, BP,@) is different from w;(Q) for any q. omit (R”); for example, 9(R”) and gLP(R”) are
However, BP,JR) or Ba, ,(0) is often called a denoted by 9 and 9r+ respectively.
Sobolev space of order s and denoted by
R$(D). Clearly we have B& m =L, f? A”.
(14) The Function Spaces gIMP), gfM,), JYiMpj,
The Besov space B&(S2) of order s < 0 is de-
&(Mp), and d = C”. Let {M,} be a sequence of
fmed to be the totality of distributions f which
positive numbers satisfying the logarithmic
cari be represented as f= &Qk O”f, with
convexity Mz < M,-, M,,, giMpi(Q) (resp.
f, E B”(fi) for some (and any) integer k
&cM,j(Q)) denotes the totality of C”-functions
such that s + k > 0. The norm is delïned to be
f on R such that for any compact set K in fi
XZ, lIf,llB~~~,,,.In terms of the Poisson in- there are constants k and C (resp. for any k > 0
tegrals or other regularizations the same char-
there is a constant C) satisfying
acterization holds as above.
If R is a domain in R” with the cane con- 1Daf(x) 1< Ck’“‘M IH) XEK, lal>O.
dition, then the restriction BL,,(R”)-+BP,,(Q)
G?~,,!)(R) is the totality of keal analytic func-
(resp. W;(R”)+ W$2) for 1< p < CO) is a
tions on R and is denoted by d(Q) or C”(Q).
bounded linear surjection with a bounded
If {M,} satistïes the Denjoy-Carleman condi-
right inverse [ 131.
From now on we denote by B;+ etc., tion C M,IM,+, <CO, then 9jMp,(Q)=9(Q)n
B&(52), etc. for a domain R c R” with the cane gpp)(fi) and %,,)(Q) = W4 n ~cM,)(~) are
dense in 9(Q). Conversely, if 91Mpl(R) is differ-
condition. If q < r, then B& c BP,,. If 1~ p < 2,
ent from {0}, then {M,,} satislïes the Denjoy-
then BP,,c W~CBP,~. If2<p<co, then BP,,c
Carleman condition. In this case an fg
Wp c B&. Ifs > t, then BP, coc BP, 1.
&lMB1(Q) (resp. GcMp)(Q)) is sometimes called
Sobolev-Besov embedding theorems: (i) Let p
an ultradifferentiahle function of class {M,}
<pi and s- n/p = SI- nlp’. Then B& c Bgr,q,
(resp. (M,,)). The most important is the case
KP, c W$, WlcBp.,, and, ifp’< CO, then W;c
where M, = p!” for an s > 1. Then an fi
Wi,‘. (ii) Let 0 <s < n/p and n/p’ = n/p -s.
GrM,l(Q) (resp. G,,P,(s2)) is called a function
Then BSp,qc L(,,,,,, and Wp c Lcp,,pJ( c Lps). (iii)
of Gevrey class {s} (resp. (s)). The topological
Let s = nlp. Then BP, 1 c BC for any p and BP, ~
properties of &(a) (resp. &l,l,(fi), etc.) have
cBMOforp<co.(iv)LetO<n’<nandO<
been discussed by A. Martineau (resp. H.
s’ = s -(n - n’)/p. Then there is a bounded trace
Komatsu).
operator Tr:B~,,(R”)~Bs;.,,(R”‘) that extends
For function spaces of S type - 125 Distri-
the restriction mapping 9’(R”)+9’(R”‘). Tr
butions and Hyperfunctions.
is surjective and has a bounded linear right
inverse [11-131.
(15) Tbe Function Spaces O(Q), OP(Q) = A,,(R),
(13) The Function Spaces 9,8, BLD, 3, and Y. and A(R). Let R be an open set in C”. The
The spaces of inlïnitely differentiable functions totality of tholomorphic functions on s2 is
Cg(Q) and Cm(Q) are also denoted by 9(Q) denoted by 0(Q). 0(Q) is a tnuclear Fréchet
and 8’(n), respectively. The totality of func- space with the topology of uniform conver-
tions f(x) in C”(Q) such that, for a11 c(, D”~(X) gence on compact sets. It is a closed linear
belongs to L,(R) with respect to tlebesgue subspace of C(Q) and also of Cm(Q).
measure is denoted by BLD(a). The neighbor- For any p 2 1, the totality of functions f

holomorphic in Q and satisfying JnIf(z)IPdxdy Banach space BP’(n) of Radon measures of


< CO (z =x + iy; dx dy is Lebesgue mea- bounded variation. On the other hand, the
sure), denoted by UP(Q) or A,(Q), is a Ba- dual space of C,(Q) is the space of a11 Radon
nach space with respect to the norm ilfil,= measures on R. N. Bourbaki takes this fact as
(jnIf(z)[Pdxdy)lip. In particular, it is a Hil- the defmition of measure.
bert space when p = 2. Any bounded linear functional @ on tl(Q)
The totality of functions bounded and con- is expressed as
tinuous on the closure of R and holomorphic
in fi (denoted by A(Q)) is a Banach space
with respect to the norm Ilfil = supZEn If(z)I.
with a suitable (~EM(R); and Il@// = ~I(P~I~.
(16) The Ki3he Spaces n Â(c&~)) and x1” (LY(~)). Conversely, any <pE M(R) defïnes a bounded
Let {teck)} be an increasing sequence of se- linear functional on L1(Q) by means of (3).
quences tltk) = (c(y), CC~), ) of positive numbers. Accordingly, the dual space of L,(Q) is isomor-
The echelon space n I(“(k)) of G. Kothe [ 151 phic to M(R).
is the totality of sequences x=(5,,&, . ..) The dual space of L,(R) (1 <p < CO) is iso-
such that P(~)(X) = C cc{“) 1&l< CO for any k. morphic to L,(R), where q is the real number
It is a Fréchet space with the topology deter- defmed by (l/p) + (l/q) = 1 (accordingly, 1 <q <
mined by the seminorms pck). The co-echelon m) and is called the conjugate exponent of p.
space’x Â.’ (oc) is the totality of sequences y = Any bounded linear functional on L,(R) is
(ql, qZ,. . . ) such that Iqil< Caik) for some C expressible by the formula in (3) (where fi
and k. The space s of rapidly decreasing se- J44) with cp~~%Jfi),and Il@ll= lldlq.
quences and the space s’ of slowly increasing The dual space of M(R) is isomorphic to the
sequences are the echelon space and the co- normed linear space of a11 (real- or complex-
echelon space for the sequence ai”) = ik. If RI”) = valued) lïnitely additive set functions cp detïned
k’, then we obtain the space of power series on a11 measurable sets in R, of bounded vari-
with infinite radius of convergence and the ation over R, and absolutely continuous with
space of convergent power series. More gener- respect to the measure p given in s2 (i.e., p(N) =
ally, let du= (Es) be a sequence of positive num- Oimplies<p(N)=O).Ifl<p<coandl<q<
bers. The echelon spaces for a{“)= exp(ka,) and CO, then the dual space of L~,,,,(Q) is isomor-
exp( - k-lai) are called the infinite type power phic to LcP,.q,)(fi), where p’ and q’ are conjugate
series space and the finite type power series exponents of p and q, respectively.
space and are denoted by A,(E) and Al(a), If R is nonatomic (i.e., R has no set of posi-
respectively. Echelon spaces and co-echelon tive measure that cannot be decomposed into
spaces have been employed to construct exam- two subsets of positive measure), then no con-
ples and counterexamples in the theory of tinuous linear functionals exist other than
tlocally convex spaces by Kothe [ 151, Grothen- zero on S(Q) and on L,(R) and L(,,,,(fl) for
dieck, Y. and T. Komura, E. Dubinsky, D. O<p<l.
Vogt, and others. The sequence spaces c,,, 1, (1~ p < CO), m,
and s (the space defmed in Section B (6)) are
special cases of C,(n), L,(R), M(R), and S(Q),
C. Dual Spaces respectively. Hence their dual spaces cari be
described explicitly. For example, the dual
When 0 is a compact Hausdorff space, any space of cg (resp. II) is 1, (resp. m), and if 1 <
bounded linear functional @ on C(Q) is ex- p < CO, the dual space of 1, is 1, (where (l/p) +
pressed by the tstieltjes integral (l/q) = 1). The coupling of x = (5.) and y =
(II,) is given by C &,r,. The dual space of s is
the totality of sequences (q,) such that q,, = 0
except for a lïnite number of IL.
with a tRadon measure <p, i.e., a (real- or Let VMO(R”) denote the closure of CO(R”) in
complex-valued) tregular tcountably additive BMO(R”). Then the dual space of VMO(R”) is
set function delïned on the Bore1 sets in 0. identifïed with H1 (R”). This cari be considered
Since <p is of bounded variation, the totality of to be a generalization of the TF. and M. Riesz
such cp is denoted by B1/(R). Conversely, any theorem. On the other hand, the dual space of
cpE BP’(n) gives a bounded linear functional on H,(R”) is BMO(R”) [9]. Hardy spaces HJR”),
C(0) detïned by (2), and //@Il = the total varia- 0 <p < 1, are not locally convex but have suffi-
tion of <p over R. Hence the dual space of C(n) ciently many bounded linear functionals, and
is isomorphic to the Banach space Bv(Q) with their dual spaces are identifïed with Lipschitz
the norm /\@Il. spaces A”(R”), where s= n(p-’ - 1).
Let R be a locally compact Hausdorff space. Let 1 <p, q < CO. Then the dual space of
Then the dual space of C,(Q) is again the Wi(R”) is isomorphic to H$“(R”) and that of

B&(R”) to B;f,.(R”), where p’ and q’ are conju- [ 131 T. Muramatu, On Besov spaces and
gate exponents of p and q, respectively. The Sobolev spaces of generalized functions de-
dual space 9’ of 9 is defined to be the space of fïned on a general region, Publ. Res. Inst.
tdistributions; the dual spaces of B, gLL,, and Math. Sci., 9 (1974), 325-396.
9 are (algebraically) linear subspaces of g. [ 143 L. Schwartz, Théorie des distributions,
Similarly, the dual spaces of gjMp) and scM,> Hermann, revised edition, 1966.
are called the spaces of tultradistributions [ 151 G. Kothe, Topological vector spaces 1,
of classes {M,} and (M,), respectively. The Springer, 1969. (Original in German, 1966.)
dual space of & is identifïed with the space of
thyperfunctions with compact support (- 125
Distributions and Hyperfunctions). A continu-
ous linear functional on o(Q) is called an 169 (XI.1 9)
analytic functional. For each analytic func-
Function-Theoretic Nul1 Sets
tional ù> there is a compact set L c R such that
I@(f)[ d C ~up,,~lf(z)I. A compact set K is
called a porter of @ if every compact neighbor- A. General Remarks
hood L of K satisfies this condition. Porters
are similar to supports of generalized func- By a function-theoretic nul1 set we mean an
tions, but an analytic functional does not exceptional set that appears in theorem as the
necessarily have a smallest porter. one asserting that a certain property holds
The dual space of an echelon space is the with a “small exception.” We give below some
corresponding co-echelon space. of the more important examples of exceptional
sets. For simplicity, we limit ourselves to the n-
dimensional Euclidean space R” (n > 2).

References
B. Sets of Harmonie Measure Zero

[l] S. Banach, Théorie des opérations liné-


Denote by xE the characteristic function of
aires, Warsaw, 1932 (Chelsea, 1963).
a set E on the boundary aD of a bounded
[2] N. Dunford and J. T. Schwartz, Line.ar
domain D in R”. We cal1 the thypofunction
operators, Interscience, 1, 1958; II, 1963; III,
Hx, and thyperfunction i?,E (- 120 Dirichlet
1971.
Problem) the inner and outer harmonie mea-
[3] K. Yosida, Functional analysis, Springer,
sures of E (with respect to D), respectively.
1965.
When they coincide, the function is called the
[4] J. Lindenstrauss and L. Tzafriri, Classical
harmonie measure of E. A necessary and suflï-
Banach spaces &II, Erg. Math., 92 (1977), 97
tient condition for E to be of inner harmonie
(1979).
measure zero is that u < 0 hold in D whenever
[S] A. Grothendieck, Critères de compacité
a tsubharmonic function u bounded above in
dans les espaces fonctionneles généraux, Amer.
D satisfies lim sup u(P) < 0 as P tends to any
J. Math., 74 (1952), 168-186.
point of C?D-E. This theorem implies the
[6] A. Zygmund, Trigonometric series, Cam-
following uniqueness theorem: If h is boundeg
bridge Univ. Press, second edition, 1959.
and harmonie in D, if E is a set of inner har-
[7] E. M. Stein, Singular integrals and dif-
monic measure zero on C?D, and if h(P)+0 as P
ferentiability properties of functions, Princeton
tends to any point of C~D-E, then h = 0. A
Univ. Press, 1971.
necessary and sufflcient condition for E to be
[S] R. A. Hunt, On L(p, q) spaces, Enseigne-
of outer harmonie measure zero is that there
ment Math., (2) 12 (1966), 249-276.
exist a positive tsuperharmonic function u in D
[9] C. Fefferman and E. M. Stein, HP spaces of
such that v(P)+ CO as P tends to any point of
several variables, Acta Math., 129 (1972), 137-
E. (Concerning the existence of a limit for a
193.
subharmonic or tharmonic function at every
[lO] S. L. Sobolev, Applications of functional
boundary point except those on a set of har-
analysis in mathematical physics, Amer. Math.
monic measure zero, - 193 Harmonie Func-
Soc., 1963. (Original in Russian, 1950.)
tions and Subharmonic Functions.)
[ 1 l] 0. V. Besov, Investigations of a family of
function spaces in connection with theorems of
imbedding and extension, Amer. Math. Soc. C. Sets of Capacity Zero
Transl., Ser. 2,40 (1964), 85-126. (Original in
Russian, 1961.) Although there are many kinds of capac-
[ 121 H. Triebel, Interpolation theory, function ity (- 48 Capacity), here we consider only
spaces, differential operators, North-Holland, tlogarithmic capacity and cc-capacity (c(> 0).
1978. Let K be a nonempty compact set in R”. Set

W(K)=inf,,jjPQ-“d@)&(Q), where p runs implies H” = 0. Moreover, HZ = @ if and


through the class of nonnegative +Radon mea- only if C,(K) = 0, and Hq = 0 implies C,-,(K)
sures of total mass 1 supported by K, and =Oforq,2<q<cc Cl].
Write C,(K)=(W(K))-““. Defïne C,(a)=0
for an empty set 0. For a general set E c R”,
E. Nul1 Sets Defined with Respect to Families
detïne the inner capacity by supKcE C,(K) and
of Functions
the outer capacity by the infïmum of the inner
capacity of an open set containing E. When
Conversely, L. V. Ahlfors and A. Beurling
the inner and outer capacities coincide, the
characterized the size of sets in a plane by
common value is called the cr-capacity (or
means of families of functions [2]. Let D be a
capacity of order ~1)of E and is denoted by
domain, and let f represent a holomorphic
C,(E). We denote the logarithmic capacity of E
function in D. Fix a point z0 in D. Set
by C,(E). In order that C,(K)=0 (n=2) or the
+Newtonian capacity C,_,(K) = 0 (n 2 3) for a
compact set K, it is necessary and suftïcient
that the harmonie measure of K with respect
to G-K vanish for any bounded domain G
containing K. Then K is removable for any E={fltheareaofR’>n},
harmonie function that is bounded or has a
lïnite +Dirichlet integral in G - K. In general, where RC is the complement of the trange R
K is said to be removable for a family F of of (f(z) -f(z,,))-‘. Denote by 623, 63, 66
functions if for any domain G containing K the families consisting of constants and +univa-
and fi F defined in G-K, there exists gE F lent functions in d, a, 6, respectively. Use
defined in G such that g = f in G-K. Let K be the notation 5 to represent any one of these
a compact set in RZ with C,(K)=0 and G be a six families, and detïne Mg = Mg(zo; D) by
domain containing K. Let f be a holomorphic sup{lf’(z,)l~f~~}.Then M,=M,>M,=
function defined in G-K for which every M,, 2 MG, = MG,, and M8(zo; D) = 0 im-
point of K is an tessential singularity. Then the plies M8(z;D)=0 for any zeD.
set of texceptional values for f at every point Denote by Na the class of compact sets K
of K is of logarithmic capacity zero. If f is such that the complement K’ of K is con-
tmeromorphic in jzj < 1 and nected and M8(z; K’) = 0. We cal1 K E N8 a
nul1 set of class Na. In order for K to be re-
movable for 23 or 9, it is necessary and SU~~I-
tient that K E Nar or EN,, respectively. We in
general have
then f has a fïnite limit in any angular domain
with a vertex at every point of lzl = 1 except for
those belonging to a set of a-capacity zero
(logarithmic capacity zero if CI= 0). If A1 (K) = 0, then K E Ns. There exists a set
K E Nn with A,(K) > 0 (A. G. Vitushkin, Dokl.
Akad. Nuuk SSSR, 127 (1959); J. Garnett, Proc.
D. Hausdorff Measure Amer. Math. Soc., 21 (1970)). When K is a
subset of an analytic arc A, K E NB implies
Let CL> 0 and a set E be given in R”. Denote by A,(K)=O, Nr, is equal to Na,, and K be-
(r a covering of E by a countable number of longs to N, if and only if C,(A)=C,,(A-K).
balls with radii d,, d,, . . . , a11 of which are If an tanalytic function has an essential sin-
smaller than E (> 0). As s-0, inf,C,dk in- gularity at every point of K E N,, then any
creases. The limit is called the Hausdorff mea- compact subset of the set of exceptional values
sure of E of dimension c( and is denoted by at every point of K belongs to NB. A necessary
A,(E). In order for a compact set K to be and sufficient condition for K E ND is either
removable for the family of harmonie func- that the complement of any one-to-one +Con-
tions detïned in a bounded domain and satisfy- forma1 image of K’ be of plane measure zero
ing the +Holder condition of order CI, it is or that any tunivalent analytic function in K’
necessary and suff’cient that Anm2+,(K)=0. be reduced to a tlinear fractional function. If
Next, suppose that K is a compact set in a the union of at most a countable number of
plane and the complement G of K with re- Nar or ND sets is compact, it belongs to the
spect to the plane is connected. Set IlfIl,= same class. But it not true for Naai sets [3].
(JJG1flqdxdy)l’q for f holomorphic in G and
q, 1 <q< CO,and IlfIl, =supelf(. Denote by F. Analytic Capacity
Hq thefamilyofffo with (/fll,<co. Ifpis
defined by l/p+ l/q= 1, then &,(K)< CQ For a compact set K, let D, be the unbounded
impliesHq=@forq,2cq<~,andA,(K)=0 connected component of K’. The quantity

M_𝔅(∞, D_K) is called the analytic capacity of K and is denoted by α(K).
If α(K) > 0, there exists an extremal function f_0(z) = α(K)z^{-1} + ...,
called the †Ahlfors function, that maps D_K onto a covering surface of the
unit disk. In general, α(K) is not greater than the logarithmic capacity
C_0(K). If K is a continuum, then α(K) = C_0(K), and α(K) is attained by
and only by f_0(z), which maps D_K onto |w| < 1 conformally and z = ∞ to
w = 0. For a linear set K, α(K) is equal to a quarter of its length (C.
Pommerenke, Arch. Math., 11 (1960)). The concept of analytic capacity is
basic for the theory of rational approximation on compact sets.


References

[1] L. Carleson, Selected problems on exceptional sets, Van Nostrand, 1967.
[2] L. V. Ahlfors and A. Beurling, Conformal invariants and
function-theoretic null-sets, Acta Math., 83 (1950), 101-129.
[3] L. Sario and K. Oikawa, Capacity functions, Springer, 1969.
[4] L. Sario and M. Nakai, Classification theory of Riemann surfaces,
Springer, 1970.
[5] J. Garnett, Analytic capacity and measure, Lecture notes in math. 297,
Springer, 1970.
[6] L. Zalcman, Analytic capacity and rational approximation, Lecture notes
in math. 50, Springer, 1968.
[7] T. W. Gamelin, Uniform algebras, Prentice-Hall, 1969.


170 (IX.2)

Fundamental Groups

A †continuous mapping f from the interval I = {t | 0 ≤ t ≤ 1} into a
topological space Y is called a path connecting the initial point f(0) and
the terminal point f(1). In particular, a path satisfying f(0) = f(1) = y_0
is called a loop (or closed path) with y_0 as the base point. For a path f,
the inverse path f̄ of f is defined by f̄(t) = f(1 - t). When the terminal
point of f and the initial point of g coincide, the path F defined by
F(t) = f(2t) for 0 ≤ t ≤ 1/2 and F(t) = g(2t - 1) for 1/2 ≤ t ≤ 1 is called
the product (or concatenation) of f and g, and is denoted by f·g. With [f]
standing for the equivalence class of a path f under the relation of
†homotopy relative to 0, 1 (∈ I) (i.e., by homotopy with 0 and 1 being
fixed), the inverse [f]^{-1} = [f̄] and the product [f]·[g] = [f·g] are
defined. In particular, in the set of homotopy classes of loops with one
common base point y_0, the product is always defined, and the set forms a
group π_1(Y, y_0). This group is called the fundamental group (or Poincaré
group) (H. Poincaré, 1895) of Y (with respect to y_0). If Y is †arcwise
connected, then π_1(Y, y_0) ≅ π_1(Y, y_1) for an arbitrary pair of points
y_0, y_1, and the structure of the group is independent of the choice of
the base point. This group is denoted simply by π_1(Y). A continuous
mapping φ: (Y, y_0) → (Y', y'_0) induces a homomorphism
φ_*: π_1(Y, y_0) → π_1(Y', y'_0) by sending [f] to φ_*[f] = [φ∘f], and
(φ'∘φ)_* = φ'_* ∘ φ_* holds for the composite φ'∘φ of mappings. Thus π_1(Y)
is a †topological invariant of Y. If π_1(Y) consists of only one class (the
class of the constant path), we say that Y is simply connected. For
example, cells and spheres S^n (n ≥ 2) are simply connected. The famous
†Poincaré conjecture states that a simply connected 3-dimensional compact
†manifold is homeomorphic to the 3-dimensional sphere.

We have van Kampen's theorem: Let P be a connected polyhedron, P_1 and P_2
be its connected subpolyhedra such that P_1 ∩ P_2 is connected, and
P = P_1 ∪ P_2. Then π_1(P) is isomorphic to the group (†amalgamated
product) obtained from the †free product of π_1(P_1) and π_1(P_2) by giving
the relations that the images of each element of π_1(P_1 ∩ P_2) in π_1(P_1)
and in π_1(P_2) are equivalent. Also, the fundamental group of the †product
spaces is the direct product of the fundamental groups of the spaces
involved. Any group is the fundamental group of some †CW complex. The
Abelization π_1/[π_1, π_1] of the fundamental group π_1 = π_1(Y) (Y arcwise
connected) is isomorphic to the 1-dimensional integral †homology group
H_1(Y). For example, the fundamental groups of a circle S^1 and a †torus
T^n are an infinite cyclic group and a free Abelian group of rank n,
respectively; the fundamental group of a 1-dimensional CW complex is a free
group; and the fundamental group of an orientable 2-dimensional closed
surface of †genus p is a group having 2p generators
{a_1, ..., a_p, b_1, ..., b_p} and a relation
a_1 b_1 a_1^{-1} b_1^{-1} ... a_p b_p a_p^{-1} b_p^{-1} = 1. If x_0 is a
fixed point of the circle S^1, then the fundamental group can be defined as
the set of all homotopy classes of continuous mappings
f: (S^1, x_0) → (Y, y_0).

Extending the definition of the fundamental group by replacing I, S^1 with
I^n, S^n, we obtain the n-dimensional homotopy group (→ 202 Homotopy
Theory; 91 Covering Spaces).


References

[1] H. Seifert and W. Threlfall, Lehrbuch der Topologie, Teubner, 1934
(Chelsea, 1965).

[2] S. Lefschetz, Introduction to topology,


Princeton Univ. Press, 1949.
[3] W. S. Massey, Algebraic topology, an
introduction, Springer, 1977.
[4] E. H. Spanier, Algebraic topology,
McGraw-Hill, 1966.
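
The Fundamental Groups article above states that the Abelization of π_1 is isomorphic to H_1(Y), and gives a presentation of the fundamental group of an orientable closed surface of genus p with 2p generators and one relation. A minimal Python sketch of how these two statements fit together follows; the function names are illustrative and do not come from the article. In the Abelization the generators commute, so a word is determined by the total exponent of each generator. Every generator has exponent sum zero in the surface relation, so the relation imposes no condition and the Abelization is free Abelian of rank 2p.

```python
from collections import Counter


def surface_relator(p):
    """Word a_1 b_1 a_1^-1 b_1^-1 ... a_p b_p a_p^-1 b_p^-1 as a list of
    (generator, exponent) pairs. Generator names are illustrative."""
    word = []
    for i in range(1, p + 1):
        a, b = f"a{i}", f"b{i}"
        word += [(a, 1), (b, 1), (a, -1), (b, -1)]
    return word


def exponent_sums(word):
    """Image of a word in the free Abelian group on the generators:
    the total exponent of each generator occurring in the word."""
    sums = Counter()
    for gen, exp in word:
        sums[gen] += exp
    return dict(sums)


def abelianization_rank(p):
    """Rank of pi_1/[pi_1, pi_1] for the genus-p presentation.
    Counting generators is valid here only because the single relator
    has exponent sum zero in every generator."""
    sums = exponent_sums(surface_relator(p))
    assert all(v == 0 for v in sums.values())
    return len(sums)
```

For instance, `abelianization_rank(1)` counts the two generators of the torus group, in agreement with the statement that the fundamental group of the torus T^2 is free Abelian of rank 2.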

171 (XXI.25)

Galois, Evariste

Evariste Galois (October 25, 1811-May 31, 1832) was born in Bourg-la-Reine,
a suburb of Paris. In 1828, while still in junior high school, he published
a paper on periodic †continued fractions. Although he published four
papers, his most important works were submitted to the French Academy of
Science and either lost or rejected. He was unsuccessful in his attempt to
enter the Ecole Polytechnique and instead entered the Ecole Normale
Supérieure in 1829. Active in political affairs, he was expelled from
school, imprisoned, and died in a duel soon after his release.
The night before the duel, he left his research outline and manuscripts to
his friend, A. Chevalier. These were published by J. Liouville in J. Math.
Pures Appl., first series, 11 (1846). The contents include the concept of
groups and what essentially became the †Galois theory of algebraic
equations. The manuscript also contained expressions such as "theory of
ambiguity," which seems to indicate that Galois intended to study the
theory of algebraic functions along the same lines.


References

[1] E. Galois, Oeuvres mathématiques, E. Picard (ed.), Gauthier-Villars,
1897.
[2] F. Klein, Vorlesungen über die Entwicklung der Mathematik im 19.
Jahrhundert I, Springer, 1926 (Chelsea, 1956).
[3] R. Bourgne and J.-P. Azra, Ecrits et mémoires mathématiques d'Evariste
Galois, Gauthier-Villars, 1962.


172 (III.7)

Galois Theory

A. History

After the discovery of formulas giving the general solutions of algebraic
equations of degrees 3 and 4 in the 16th century, efforts to solve
equations of degree 5 remained unsuccessful. Early in the 19th century P.
Ruffini and N. H. †Abel showed that a general algebraic solution is
impossible. Shortly afterward, E. †Galois established a general principle
concerning the construction of roots of algebraic equations by radicals.
The principle was described in terms of the structure of a certain
permutation group (the Galois group) of the roots of the equation. Even in
this original form of the theory (Galois theory), Galois not only completed
the research started by J. L. Lagrange, P. Ruffini, and Abel, but also made
an epochal discovery that opened the way to modern algebra. J. W. R.
Dedekind (Werke III, 1894) interpreted this result as a duality theorem
concerning the automorphism groups of a field. It was shown later that
Galois theory plays an important role in the general theory of commutative
fields established by E. Steinitz. In the 1920s W. Krull generalized the
idea of Dedekind, using the concept of topological algebraic systems, and
obtained the Galois theory of infinite algebraic extensions (Math. Ann.,
100 (1928)). The Galois theory gives a model for a successful theory
summarizing the essentials of separable algebraic extensions and has led to
analogous theories for other algebraic systems. For example, E. R. Kolchin
constructed an analogous theory for differential fields where the Galois
groups are algebraic groups (→ 113 Differential Rings, Galois theory of
differential fields). Another line of development of this theory, also
originated by Dedekind (Werke III, 1876/77), led to the Galois theory of
rings, an object of active research by N. Jacobson (Ann. Math., 41 (1940)),
T. Nakayama, and others since the 1940s. Also, recently, the Galois theory
for some general algebraic systems containing inseparable fields has been
constructed by M. E. Sweedler, U. S. Chase, and others in which Galois
groups are replaced by Hopf algebras or bialgebras (→ 203 Hopf algebras).


B. Definitions

Given a group G of †automorphisms of a given †field L, the subfield
F(G) = {a ∈ L | a^σ = a, σ ∈ G} is called the invariant field associated
with G. For any extension fields L, L' of K, an isomorphism of L into L'
whose restriction to K is the identity is called a K-isomorphism. If L is a
†normal extension of K, any K-isomorphism of L into L' is a K-automorphism
of L. If an algebraic extension L of K is normal, the group G(L/K) of all
K-automorphisms of L is called the Galois group of L/K. A †separable normal
algebraic extension of K is called a Galois extension of K. Let L/K be a
finite normal extension; then there exist intermediate fields M and N of
L/K such that M/K is a Galois extension and N/K is a †purely inseparable
extension and L = M ⊗_K N (→ 277 Modules J). Further,
G(L/K) = G(L/N) ≅ G(M/K) and the order of G(L/K) is equal to the †separable
degree of L/K, i.e., [M : K]. A necessary and sufficient condition for L/K to

be a Galois extension is that the invariant field associated with G(L/K) be
K. A Galois extension L/K is called an Abelian extension or a cyclic
extension when G(L/K) is †Abelian or †cyclic, respectively (→ 149 Fields).


C. Fundamental Theorem of Galois Theory

Let L/K be a finite Galois extension and G its Galois group. Then there
exists a †dual lattice isomorphism between the set of intermediate fields
of L/K and the set of subgroups of G, under which an intermediate field M
of L/K corresponds to the subgroup H = G(L/M); conversely, a subgroup H of
G corresponds to M = F(H). The degree of extension [L : M] is equal to the
order of the corresponding subgroup H (in particular, [L : K] is the order
of G), and [M : K] coincides with the index (G : H). If subfields M and M'
are †conjugate over K, then the corresponding subgroups G(L/M) and G(L/M')
are conjugate to each other in G, and vice versa. In particular, M/K is a
Galois extension if and only if the subgroup H corresponding to M is a
†normal subgroup of G, and in this case, the Galois group G(M/K) is
isomorphic to the factor group G/H.


D. Extensions of a Ground Field

Let L/K be a finite Galois extension, K'/K any extension, and L' the
†composite field of L and K'. Then L'/K' is also a Galois extension, and
its Galois group is isomorphic to G(L/L ∩ K') by the restriction map.


E. Normal Basis Theorem

Let L/K be a finite Galois extension with Galois group G. Then there exists
an element u of L such that {u^σ | σ ∈ G} forms a basis for L over K called
a normal basis. If we denote by K[G] the †group ring of G over K, a
†K[G]-module structure can be introduced in L by the operation
(Σ a_σ σ)(x) = Σ a_σ x^σ; the existence of a normal basis implies that L is
isomorphic to K[G] itself as a K[G]-module, or in other words, that the
K-linear representation of G by means of L is equivalent to the †regular
representation of G.


F. Examples of Galois Extensions

(1) Cyclotomic Fields. Let m be a positive integer not divisible by the
†characteristic of K; ζ a †primitive mth root of unity, and L = K(ζ). Then
L/K is an Abelian extension, and its Galois group is isomorphic to a
subgroup of the †reduced residue class group (Z/mZ)*; in particular, if
K = Q, the subgroup coincides with (Z/mZ)*, by the irreducibility of
†cyclotomic polynomials. Hence the degree [Q(ζ) : Q] is equal to φ(m),
where φ is †Euler's function.

(2) Finite Fields. A †finite field K has nonzero characteristic p, and the
number q of elements of K is a power of p. Also, K is uniquely determined
by q (up to isomorphism), hence it is denoted by GF(q) or F_q. Thus GF(q^n)
is the only extension of GF(q) of degree n; moreover, it is a cyclic
extension.

(3) Kummer Extensions. Assume that K contains a primitive mth root ζ of
unity and the characteristic of K is 0 or is not a divisor of m. Denote by
K* the multiplicative group of K. An extension L of K can be expressed in
the form L = K(ᵐ√a_1, ..., ᵐ√a_r) (a_i ∈ K) if and only if L/K is an
Abelian extension and all σ ∈ G(L/K) satisfy σ^m = 1; in this case L/K is
called a Kummer extension of exponent m. There exists a one-to-one
correspondence between Kummer extensions L of exponent m over K and finite
subgroups H/(K*)^m of the factor group K*/(K*)^m, given by the relations
H = L^m ∩ K*, L = K(ᵐ√H). Moreover, there exists a canonical isomorphism
between H/(K*)^m and the †character group of G(L/K), so that H/(K*)^m is
isomorphic to G(L/K). Let L = K(θ) be a cyclic Kummer extension of degree m
of K, and let σ be a generator of the Galois group G(L/K). Then the
Lagrange resolvent
(ζ, θ) = θ + ζθ^σ + ... + ζ^{m-1}θ^{σ^{m-1}}
satisfies (ζ, θ)^σ = ζ^{-1}(ζ, θ), (ζ, θ)^m ∈ K, and θ and its conjugates
can be expressed in terms of (ζ, θ). In particular, L is generated by
(ζ, θ) over K.

(4) Artin-Schreier Extensions. Assume that K is of characteristic p ≠ 0.
For any element a of an extension of K, we denote by ℘a the element
a^p - a and by (1/℘)a a root of ℘X - a = 0. A finite extension L of K is of
the form L = K((1/℘)a_1, ..., (1/℘)a_r) (a_i ∈ K) if and only if L/K is a
Galois extension whose Galois group is an Abelian group of †exponent p; in
this case, L/K is called an Artin-Schreier extension. There exists a
one-to-one correspondence between Artin-Schreier extensions L over K and
finite subgroups H/℘K of the additive group K/℘K, given by the relations
H = ℘L ∩ K, L = K((1/℘)H); moreover, H/℘K is isomorphic to the character
group of G(L/K) (therefore also to G(L/K) itself). More generally, for
Abelian extensions L of exponent p^n (i.e., Galois extensions whose Galois
groups are Abelian groups of exponent p^n), we obtain similar descriptions
by using the additive

group of +Witt vectors of length n instead of of a circumference in equal parts (- 179


K (- 449 Witt Vectors). Geometric Construction).
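
Section F (1) states that for K = Q the Galois group of the mth cyclotomic field coincides with the reduced residue class group (Z/mZ)*, so that the degree [Q(ζ) : Q] equals the value of Euler's function at m. The following minimal Python sketch lists that group and counts its elements; the function names are illustrative and are not taken from the text. It also tests whether the group is cyclic, since the extension is always Abelian but need not be cyclic.

```python
from math import gcd


def reduced_residues(m):
    """Representatives of the reduced residue class group (Z/mZ)*."""
    return [a for a in range(1, m + 1) if gcd(a, m) == 1]


def euler_phi(m):
    """Euler's function: the order of (Z/mZ)*, which by F (1) is the
    degree of the m-th cyclotomic field over Q."""
    return len(reduced_residues(m))


def is_cyclic(m):
    """True if a single unit generates (Z/mZ)* under multiplication."""
    units = reduced_residues(m)
    for g in units:
        seen = set()
        x = g % m
        # Powers of a unit are purely periodic, so this collects the
        # whole cyclic subgroup generated by g.
        while x not in seen:
            seen.add(x)
            x = (x * g) % m
        if len(seen) == len(units):
            return True
    return False
```

With these helpers, `euler_phi(7)` and `is_cyclic(7)` describe the cyclic case m = 7, while `is_cyclic(8)` reports that (Z/8Z)* is not cyclic, so the 8th cyclotomic field over Q is an Abelian extension that is not a cyclic one.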

G. Galois Group of an Equation

L/K is a finite Galois extension if and only if L is a †minimal splitting
field of a †separable polynomial f(X) in K[X]. In this case, we call G(L/K)
the Galois group of the polynomial f(X) or of the algebraic equation
f(X) = 0. The equation f(X) = 0 is called an Abelian equation or a cyclic
equation if its Galois group is Abelian or cyclic, respectively, while
f(X) = 0 is called a Galois equation if L is generated by any root of f(X)
over K. Generally, G(L/K) can be †faithfully represented as a permutation
group of roots of f(X) = 0. If this group is †primitive, then f(X) = 0 is
called a primitive equation. The index of the group in the group of all
permutations of roots is called the affect of the equation f(X) = 0; if the
affect is 1, the equation f(X) = 0 is called affectless. Let u_1, ..., u_n
be †algebraically independent elements over K. Then for the polynomial
F_n(X) = X^n - u_1 X^{n-1} + ... + (-1)^n u_n in K(u_1, ..., u_n)[X], the
equation F_n(X) = 0 is called a general equation of degree n. The Galois
group of F_n(X) = 0 is isomorphic to the †symmetric group 𝔖_n of degree n,
and if K is not of characteristic 2, then the quadratic subfield
corresponding to the †alternating group 𝔄_n is the field K(√D) obtained by
adjoining the quadratic root of the †discriminant D of F_n(X).


H. Solvability of an Algebraic Equation

Assume that K is of characteristic 0, f(X) ∈ K[X], and L is the minimal
splitting field of f(X). We say that the equation f(X) = 0 is solvable by
radicals if there is a chain of subfields K = L_0 ⊂ L_1 ⊂ ... ⊂ L_r = L
such that L_i = L_{i-1}(a_i^{1/n_i}) with some a_i ∈ L_{i-1}, and this is
the case if and only if the Galois group of f(X) is †solvable (Galois). In
particular, Abelian equations are solvable by radicals. Cyclic equations
are solved by using the Lagrange resolvent, and theoretically the general
solvable equation can be solved by repeating this procedure. Since 𝔖_n is
solvable if and only if n ≤ 4, it follows that a general equation of degree
n is solvable only if n = 1, 2, 3, 4 (Abel). (For a method of solving these
equations → 10 Algebraic Equations D.) Also, a polynomial is solvable by
square roots if and only if the order of the Galois group is a power of 2.
This fact enables us to answer some questions concerning geometric
construction problems such as trisection of an angle or division of a
circumference in equal parts (→ 179 Geometric Construction).


I. Infinite Galois Extensions

If a Galois extension L/K is infinite, then its Galois group is an infinite
group. Let {M_γ} be the family of intermediate fields of L/K that are
finite and normal over K, and put H_γ = G(L/M_γ). Then by taking {H_γ} as a
†base of a neighborhood system of the unity element, G becomes a
†topological group. This topology is called the Krull topology (Krull,
Math. Ann., 100 (1928)). G is then isomorphic to the †projective limit of
the family of finite groups {G/H_γ} and is †totally disconnected and
†compact. There is a one-to-one correspondence between the set of
intermediate fields of L/K and the set of closed subgroups of G given by
the map (Galois group) ↔ (invariant field), and thus we have a
generalization of Galois theory for finite extensions as described in
Section C. Various theories, including the theory of Kummer extensions, can
be generalized to the case of infinite extensions (→ 423 Topological
Groups).


J. Galois Cohomology

Let L/K be a finite Galois extension and G its Galois group. Then both the
additive group L and the multiplicative group L* have G-module structures.
The †cohomology groups of G with coefficient module L are 0 for all
dimensions because of the existence of a normal basis (→ 200 Homological
Algebra). As for the multiplicative group L*, we have
H^0(G, L*) ≅ K*/N(L*) (N is the †norm N_{L/K}), H^1(G, L*) = 0 (Hilbert's
theorem 90 or the Hilbert-Speiser theorem). In particular, if G is a cyclic
group with generator σ, then every element a such that N(a) = 1 can be
expressed in the form a = b^{1-σ}. H^2(G, L*) is isomorphic to the †Brauer
group of †central simple algebras over K which have L as a †splitting
field. In the case of number fields, a number of G-modules arise, such as
†principal orders, †unit groups, †ideal groups, †idele groups, and so on,
whose cohomological considerations are important (→ 6 Adeles and Ideles; 59
Class Field Theory). Further, let A be a group (not necessarily
commutative), and suppose that G acts on A. We denote by ^σa the action of
σ ∈ G on a ∈ A. A is called a G-group if ^σ(ab) = ^σa ^σb for all a, b ∈ A
and σ ∈ G. Then, in the same way as for G-modules, we can define the 0th
cohomology group H^0(G, A) to be the subgroup A^G of A consisting of all
elements of A left fixed by G. The map a: σ ↦ a_σ from G into A such that
a_{στ} = a_σ ^σa_τ (σ, τ ∈ G) is called a 1-cocycle with
values in A. Denote by $Z^1(G, A)$ the set of the 1-cocycles of G with values in A. Two 1-cocycles a and a' in $Z^1(G, A)$ are called cohomologous if there exists an element $b \in A$ such that $a'_\sigma = b^{-1} a_\sigma\,{}^\sigma b$ for all $\sigma \in G$. This is an equivalence relation in $Z^1(G, A)$, and the 1-cohomology set $H^1(G, A)$ of G with values in A is defined to be the set of cohomologous classes of $Z^1(G, A)$. $H^1(G, A)$ does not have the structure of a group in general, but there does exist an element called an identity, namely, the cohomologous class containing the trivial 1-cocycle. In general, a set X with an element p of X is called a pointed set; for any two pointed sets (X, p) and (Y, q), a map $f: X \to Y$ is called a morphism of pointed sets if f(p) = q. For pointed sets (X, p), (Y, q), and (Z, r), a sequence of morphisms $X \xrightarrow{f} Y \xrightarrow{g} Z$ of pointed sets is called exact if Im f = Ker g. The 1-cohomology set $H^1(G, A)$ with the identity can be regarded as a pointed set. Therefore, exact sequences of these sets make sense and possess some of the properties of cohomology groups in the commutative case. For example, let $1 \to A \to B \to C \to 1$ be an exact sequence of G-groups (and A is central in B); then

  $1 \to H^0(G, A) \to H^0(G, B) \to H^0(G, C) \to H^1(G, A) \to H^1(G, B) \to H^1(G, C)\ (\to H^2(G, A))$

is an exact sequence of pointed sets. For a Galois extension L/K with Galois group G, a linear algebraic L-group defined over K (→ 13 Algebraic Groups) has naturally the structure of a G-group, and we have $H^1(G, GL_n(L)) = 0$. Applying the above exact sequence to $1 \to SL_n(L) \to GL_n(L) \xrightarrow{\det} L^* \to 1$, we have $H^1(G, SL_n(L)) = 0$. Also, we have $H^1(G, Sp_{2n}(L)) = 0$ for any integer $n \ge 1$. It is difficult to define higher cohomology sets naturally, but various methods to define them have been obtained.

In many cases, we are more concerned with the †category of Galois extensions of K with K-isomorphisms between them than with a single extension L/K. In other words, we consider a †functor $L \mapsto \mathfrak{F}(L)$ of the category of Galois extensions of K into the category of (Abelian) groups, and study the cohomology related to G(L/K)-module or G(L/K)-group structures derived from $\mathfrak{F}(L)$. In the case of infinite algebraic extensions, we consider the inductive limit of cohomology of subfields of finite degrees, making use of continuous cocycles of Galois groups relative to the Krull topology [8, 9].

Let L/K be a Galois extension with Galois group G. Consider objects X, Y defined over K on which the extension of the ground field is defined (such as K-linear spaces with certain tensors on them, algebras over K (associative or not), K-varieties, or K-algebraic groups defined over K). When X and Y are isomorphic over L, Y is called an L/K-form of X. Let E(L/K, X) be the set of all K-isomorphism classes of L/K-forms of X. Let A be the group of all automorphisms over L of X. For an L/K-form Y of X and an isomorphism $f: X \to Y$ over L, since G acts on X and Y, one can define the isomorphism ${}^\sigma f: X \to Y$ over L that satisfies ${}^\sigma(f(x)) = {}^\sigma f({}^\sigma x)$. In particular, A is a G-group by this action. Further, for $\sigma \in G$, define $a_\sigma = f^{-1} \cdot {}^\sigma f \in A$; then $a: \sigma \mapsto a_\sigma$ is a 1-cocycle of G with values in A that corresponds to the L/K-form Y of X. This induces a bijection from E(L/K, X) onto $H^1(G, A)$. Thus L/K-forms of X can be classified if one can determine $H^1(G, A)$. For example, $H^1(G, Sp_{2n}(L)) = 0$ is equivalent to saying that an L/K-form of a skew-symmetric bilinear form on a K-linear space of dimension 2n is unique up to K-isomorphisms. L/K-forms of a semisimple Lie algebra $\mathfrak{g}$ over K can be classified by the 1-cohomology set of the algebraic group $\mathrm{Aut}(\mathfrak{g} \otimes_K L)$. (For the L/K-forms of algebraic groups → 13 Algebraic Groups M.) This descent theory can also be discussed for more general categories.

K. Galois Theory of Rings

The theory of †centralizers in simple algebras can be interpreted as the theory of a certain Galois correspondence with respect to inner automorphism groups. Also, by using †crossed products, we can deduce from the theory of centralizers in simple algebras the Galois theory of commutative fields. On the other hand, Jacobson obtained a Galois theory of †division rings with respect to finite groups of outer automorphisms that is similar to the commutative case. Since then, many algebraists have proceeded with investigations that aim either at unifying these two theories by admitting inner automorphisms in the group of automorphisms, at extending the theory from division rings to general rings such as simple rings, †primitive rings, or †semiprimary rings, or at weakening the finiteness conditions. One principal method in these theories lies in considering first the endomorphism ring $\mathrm{Hom}_R(S, S)$ for an extension S/R and then the roles of endomorphisms (or derivations) in it [6] (→ 29 Associative Algebras).

Let K be a field of characteristic p > 0 and L/K be a finite purely inseparable extension such that $L^p \subset K$. The set of all †derivations of L/K forms a restricted Lie algebra D(L/K) over K. Then there exists a one-to-one and dual lattice isomorphic correspondence between the set of intermediate fields M of L/K
and the set of restricted Lie subalgebras H of D(L/K), given by the relations H = D(L/M) and $M = \{a \in L \mid d(a) = 0,\ d \in H\}$ (Jacobson [7]). Namely, in the case of purely inseparable extensions, derivations or higher derivations play the role of automorphisms, and the bialgebras defined by such derivations correspond to Galois groups of Galois extensions. Now, the group algebra KG of a group G over K and the (restricted) universal enveloping algebra of a (restricted) Lie algebra over K both have the structure of a †Hopf algebra. From this fact, unifying the Galois theory of Galois extensions and Jacobson's theory for purely inseparable extensions and using bialgebras or Hopf algebras, one can construct Galois theories of more general objects containing certain nonseparable field extensions (see M. E. Sweedler, Ann. Math., 87 and 88 (1968); S. U. Chase and Sweedler, Lecture notes in math. 97, Springer, 1969).

References

[1] J. W. R. Dedekind, Über die Permutationen des Körpers aller algebraischen Zahlen, Gesammelte mathematische Werke, Vieweg, 1930-1932, vol. 2, 272-291.
[2] E. Artin, Galois theory, Univ. of Notre Dame Press, second edition, 1948.
[3] N. Bourbaki, Eléments de mathématique, Algèbre, ch. 5, Actualités Sci. Ind. 1102b, Hermann, second edition, 1959.
[4] N. G. Chebotarev, Grundzüge der Galois'schen Theorie, Noordhoff, 1950.
[5] B. L. van der Waerden, Algebra I, Springer, seventh edition, 1966.
[6] N. Jacobson, Structure of rings, Amer. Math. Soc. Colloq. Publ., 1968.
[7] N. Jacobson, Lectures in abstract algebra III, Van Nostrand, 1964; Springer, 1975.
[8] J.-P. Serre, Corps locaux, Actualités Sci. Ind., Hermann, 1962; Local fields, Springer, 1979.
[9] J.-P. Serre, Cohomologie galoisienne, Lecture notes in math. 5, Springer, 1974.

173 (XIX.9)
Game Theory

A. Introduction and Historical Highlights

Game theory consists of mathematical models used in the study of decision making in situations involving conflict and cooperation. A conflict arises when each player in a game selects from a list of alternatives one which, possibly together with chance and random events, leads to various outcomes over which the players have different preferences; thus the behavior of one player aiming at his own favorable outcome might induce unfavorable outcomes for others. Although the potential outcomes usually bring about conflicts among the players, there may be room for cooperation among some of them. Game theory attempts to extract that which is common and essential to such situations, to handle them by means of mathematical methods, and to provide a normative guide to rational behavior for each of the players. Game theory thus goes beyond classical theories of probability and decision making that are sufficient to solve games involving just one player and chance.

Modern game theory started in 1944 with the publication of the monumental book by von Neumann and Morgenstern [1]. These authors presented many logical classifications of games, including a distinction between two-person and n-person games, between constant-sum (zero-sum) and general-sum games (depending upon whether the sum of payoffs to the players is constant (zero) or not), and between noncooperative and cooperative games (depending upon whether any collaboration among the players is prohibited or allowed). The second edition included a remarkable expected utility theory, which has become a mainstay of game theory. Von Neumann and Morgenstern also gave three representations of games. The first representation is the so-called extensive form; this representation has been slightly modified by Kuhn [2]. The second representation is the normal-form game; this is the form in which the minimax theorem for two-person zero-sum games was established [3]. This theorem was generalized to n-person general-sum noncooperative games by Nash [4]. Finally, the characteristic-function form was the one in which the authors developed the theory of stable sets. Several other solution concepts, such as the core, the Shapley value, and the bargaining set, have since been defined for games in characteristic-function form. Historically, the development of game theory has been closely related to various areas of pure mathematics, such as analysis, topology, geometry, and the foundations of mathematics. A survey of game theory up to 1957 was presented by Luce and Raiffa [5]. In the 1950s and early 1960s, several major papers appeared in five issues of the Annals of Mathematics Studies [6]. Lucas [7] has presented a good survey of developments to 1972, and Shubik [8] has surveyed the development of the field through 1981. Current articles
appear mainly in the International Journal of Game Theory, and also in the journals in fields such as operations research, management science, economics, political science, and psychology.

B. The Extensive Form

An n-person game in extensive form is represented by a game †tree (i.e., a connected graph with no cycles (→ 186 Graph Theory)) having the following properties: There is one special vertex corresponding to the starting point of the game. Each nonterminal vertex corresponds either to a move of one of the n players or to a chance move. The edges ascending from a vertex denote the alternatives of the player at this vertex. For each terminal vertex there is an n-dimensional vector whose components represent the payoffs to each player. The state of a player's information at any stage can be described by certain subsets of the set of all his vertices, called information sets. At each of his moves, player i knows which information set he is in, but not which vertex he occupies within this set. A local strategy for player i is a †probability distribution over the set of all alternatives at each of his information sets. A behavior strategy for player i is a function which assigns a local strategy to each of his information sets. A pure strategy is a special behavior strategy that assigns a particular choice to each information set. Kuhn [2] showed the existence of pure optimal strategies for n-person general-sum noncooperative games with perfect information (i.e., games in which all information sets contain a single vertex), as a generalization of the result for two-person zero-sum games given in [1]. Kuhn also proved the existence of equilibrium behavior strategies for games with perfect recall (i.e., games in which player i at any of his information sets remembers all his prior moves but is not aware of the prior choices of the other players). The definition of "equilibrium strategy" is given in the next section. Because of the complexity of n-person general-sum games, research on them has been minimal since the work of Kuhn. Recently a new attack on them has begun; for example, the Nash equilibrium point in extensive forms has been reexamined by Selten [9].

C. The Normal Form (or Strategic Form)

An n-person game in normal form is specified by $\{N, \{X^i\}_{i \in N}, \{F^i\}_{i \in N}\}$, where $N = \{1, \ldots, n\}$ is a set of players, $X^i$ is the set of player i's strategies, and $F^i$ is a real-valued function on the Cartesian product $\prod_{i=1}^{n} X^i$, called player i's payoff function. Chance is incorporated by invoking an extra set $X^0$ of chance moves and a probability distribution on $X^0$. A two-person zero-sum game or matrix game is the simplest case in which the existence of an equilibrium point has been established. Equilibrium follows from von Neumann's minimax theorem: Let $M^1 = \{1, \ldots, m_1\}$, $M^2 = \{1, \ldots, m_2\}$ be the sets of pure strategies that may be chosen by two players. Let $a_{ij}$ be player 1's payoff when strategies i and j are taken by players 1 and 2. A mixed strategy for player i is a probability distribution on $M^i$. Sets of mixed strategies for players 1 and 2 are thus given by $X^1 = \{x^1 \in \mathbb{R}^{m_1} \mid \sum_{i=1}^{m_1} x^1_i = 1,\ x^1_i \ge 0\ \forall i \in M^1\}$ and $X^2 = \{x^2 \in \mathbb{R}^{m_2} \mid \sum_{j=1}^{m_2} x^2_j = 1,\ x^2_j \ge 0\ \forall j \in M^2\}$. If players 1 and 2 use the mixed strategies $x^1$ and $x^2$, respectively, the expected payoff for player 1 is $F^1(x^1, x^2) = \sum_{i=1}^{m_1} \sum_{j=1}^{m_2} x^1_i a_{ij} x^2_j$, which is a payoff function for player 1. Von Neumann [3] proved

  $\max_{x^1 \in X^1} \min_{x^2 \in X^2} F^1(x^1, x^2) = \min_{x^2 \in X^2} \max_{x^1 \in X^1} F^1(x^1, x^2)$.

We say that a pair $(\bar{x}^1, \bar{x}^2)$ satisfying the foregoing minimax theorem, i.e., a pair at which this common value is attained, is an equilibrium point. This pair $(\bar{x}^1, \bar{x}^2)$ turns out to be a saddle point (→ 292 Nonlinear Programming A) of $F^1(x^1, x^2)$. The duality theorem (→ 255 Linear Programming B) is mathematically equivalent to this minimax theorem. A generalization of this equilibrium to n-person general-sum noncooperative games was presented by Nash [4]. An n-tuple $(\bar{x}^1, \ldots, \bar{x}^n)$ ($\bar{x}^i \in X^i$) is a Nash equilibrium if, for each i,

  $F^i(\bar{x}^1, \ldots, \bar{x}^{i-1}, \bar{x}^i, \bar{x}^{i+1}, \ldots, \bar{x}^n) \ge F^i(\bar{x}^1, \ldots, \bar{x}^{i-1}, x^i, \bar{x}^{i+1}, \ldots, \bar{x}^n)$ for all $x^i \in X^i$.

That is, no player can improve his payoff by changing his strategy if all other players continue to use the same strategies. Nash demonstrated the existence of this equilibrium for n-person general-sum noncooperative games with finite pure strategies. The existence of Nash equilibria for wider classes of noncooperative games is proved by means of fixed-point theorems (→ 153 Fixed-Point Theorems). Recently, multistage games, such as supergames and stochastic games in which games are played repeatedly, have become a major research topic of noncooperative game theory (with work being done on both extensive and normal forms). In two-person general-sum games or bimatrix games, there arise possibilities of cooperation (or bargaining) between two players. A solution concept for such situations, the Nash bargaining solution, was proposed by Nash [10]; this solution was subsequently discussed by Luce and Raiffa [5].
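As an illustration of the matrix-game notions in Section C, the following Python sketch evaluates the expected payoff $F^1(x^1, x^2)$ and tests the saddle-point inequalities for a candidate pair of mixed strategies. It is only a minimal sketch: the function names and the "matching pennies" payoff matrix are assumptions chosen for the example and do not come from the article. Because $F^1$ is bilinear, it is enough to compare against pure-strategy deviations.

```python
from fractions import Fraction as Fr


def expected_payoff(a, x1, x2):
    """F^1(x^1, x^2) = sum_i sum_j x^1_i * a_ij * x^2_j."""
    return sum(x1[i] * a[i][j] * x2[j]
               for i in range(len(x1)) for j in range(len(x2)))


def pure(k, m):
    """Pure strategy k, written as a degenerate mixed strategy of length m."""
    return [Fr(int(i == k)) for i in range(m)]


def is_saddle_point(a, x1, x2):
    """Check the saddle-point inequalities against every pure deviation."""
    m1, m2 = len(a), len(a[0])
    value = expected_payoff(a, x1, x2)
    # Player 1 (maximizer) cannot gain by switching to any pure row.
    row_ok = all(expected_payoff(a, pure(i, m1), x2) <= value for i in range(m1))
    # Player 2 (minimizer) cannot lower the payoff with any pure column.
    col_ok = all(expected_payoff(a, x1, pure(j, m2)) >= value for j in range(m2))
    return row_ok and col_ok


# Hypothetical example: matching pennies, payoffs to player 1.
matching_pennies = [[Fr(1), Fr(-1)],
                    [Fr(-1), Fr(1)]]
half = [Fr(1, 2), Fr(1, 2)]
value = expected_payoff(matching_pennies, half, half)
```

In this example the uniform mixture is an equilibrium point with value 0, while no pair of pure strategies passes the check, which is why mixed strategies are needed in the minimax theorem.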

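The Nash equilibrium condition of Section C can be checked directly when each player has finitely many pure strategies. The sketch below enumerates the pure-strategy Nash equilibria of a two-person bimatrix game. The helper name and the two payoff tables (a prisoner's dilemma and matching pennies) are hypothetical illustrations, not taken from the article. Nash's existence theorem concerns mixed strategies, so a game may have no equilibrium in pure strategies at all.

```python
from itertools import product


def pure_nash_equilibria(payoffs):
    """Return all (i, j) at which neither player gains by deviating alone.

    payoffs[i][j] = (F1, F2) when player 1 plays i and player 2 plays j.
    """
    rows, cols = len(payoffs), len(payoffs[0])
    equilibria = []
    for i, j in product(range(rows), range(cols)):
        f1, f2 = payoffs[i][j]
        # Player 1 compares against every other row, column j held fixed.
        best_for_1 = all(payoffs[k][j][0] <= f1 for k in range(rows))
        # Player 2 compares against every other column, row i held fixed.
        best_for_2 = all(payoffs[i][l][1] <= f2 for l in range(cols))
        if best_for_1 and best_for_2:
            equilibria.append((i, j))
    return equilibria


# Hypothetical prisoner's dilemma: strategy 0 = cooperate, 1 = defect.
prisoners_dilemma = [[(-1, -1), (-3, 0)],
                     [(0, -3), (-2, -2)]]

# Hypothetical matching pennies written as a bimatrix game.
pennies = [[(1, -1), (-1, 1)],
           [(-1, 1), (1, -1)]]
```

Here mutual defection is the only pure equilibrium of the prisoner's dilemma, and matching pennies has none in pure strategies.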
D. The Characteristic-Function Form (or Coalitional Form)

An n-person cooperative game in characteristic-function form is given by a pair (N, v), where $N = \{1, \ldots, n\}$ is a set of players and v is a real-valued function on the set of all subsets of N with $v(\emptyset) = 0$, called a characteristic function. Sometimes v is assumed to be superadditive, i.e., $v(S \cup T) \ge v(S) + v(T)$ $\forall S, T \subseteq N$ with $S \cap T = \emptyset$. v(S) represents the worth or the power achievable by the subset (coalition) S when its members cooperate regardless of the behavior(s) of the players in the complement of S. An n-dimensional vector $x = (x_1, \ldots, x_n)$ is said to be an imputation if it satisfies (i) $x_i \ge v(\{i\})$ $\forall i \in N$ (individual rationality) and (ii) $\sum_{i \in N} x_i = v(N)$ (group rationality). The set of all imputations, denoted by A, represents all reasonable or realizable ways of distributing the available gains among the n players. An imputation x is said to dominate another imputation y if there is some nonempty coalition S such that (i) $x_i > y_i$ $\forall i \in S$ and (ii) $\sum_{i \in S} x_i \le v(S)$ (the effectiveness of S with respect to x). The original solution concept for games in characteristic-function form given by von Neumann and Morgenstern [1] is now called the stable-set or von Neumann-Morgenstern solution. A subset K of A is a stable set if (i) no dominance relation exists between any two elements of K (internal stability) and (ii) any imputation outside K is dominated by some imputation of K (external stability). The existence of stable sets was settled negatively by the ten-person example of Lucas [11]. This example, however, is rather specialized, and thus the existence of stable sets may yet be proved for a large class of games. Stable sets have been characterized for several classes of games, and these sets accurately reflect the coalition-forming processes among the players. Thus stable-set theory remains a major research topic in the theory of games in characteristic-function form. Many other solution concepts have been developed since that of the stable set. One of these is the core, defined by Gillies in [6] (its naive idea had already appeared in [1]). The core is the set $C = \{x \in A \mid \sum_{i \in S} x_i \ge v(S)\ \forall S \subseteq N\}$. This says that no coalition can protest against or block an imputation in the core on grounds that the coalition can expect more. For superadditive games the core coincides with the set of undominated imputations, and thus the core is a subset of any stable set if both exist. The condition for nonemptiness of the core was derived by Shapley [12] using the duality theorem. Another solution concept, defined by Shapley in [6], is known as the Shapley value. The Shapley value is a function $\varphi(v) = (\varphi_1(v), \ldots, \varphi_n(v))$ sending an arbitrary characteristic function v on N into an n-dimensional Euclidean space satisfying the conditions (i) $\varphi_{\pi(i)}(\pi v) = \varphi_i(v)$, where π is any permutation of N and $\pi v(\pi S) = v(S)$ $\forall S \subseteq N$; (ii) $\sum_{i \in S} \varphi_i(v) = v(S)$ $\forall S \subseteq N$ such that $v(T) = v(S \cap T)$ $\forall T \subseteq N$; and (iii) $\varphi_i(v + w) = \varphi_i(v) + \varphi_i(w)$ $\forall i \in N$ and for any two games v and w. These three axioms uniquely determine the value

  $\varphi_i(v) = \sum_{S \subseteq N,\ S \ni i} \frac{(s-1)!\,(n-s)!}{n!} \bigl( v(S) - v(S - \{i\}) \bigr)$

for each i, where s is the number of players in S. The Shapley value is an imputation, and can be viewed as a fair-division solution since the three axioms are desirable properties for any equitable allocation scheme. From the above formula, the Shapley value can also be interpreted as the average of the marginal contributions of the players in a coalition.

A bargaining-set concept was proposed by Aumann and Maschler in [6]. This set describes what payoffs are stable once a particular coalition structure (a †partition of N) has formed. Briefly, a payoff associated with a coalition structure is stable or in a bargaining set if there is no objection to it from any player; or, even with an objection, if there exists a counterobjection to such an objection from other players. For details → [6]. Since there are many different ways to define objection and counterobjection, there are various types of bargaining sets. Some of them are known to be nonempty. Two additional solution concepts, the kernel and the nucleolus, derive from investigations into particular bargaining sets. The kernel, introduced by Davis and Maschler [13], is always a nonempty subset of its bargaining set. More important is the nucleolus defined by Schmeidler [14], which is a unique imputation in the kernel and thus in the bargaining set. It is also in the core if the latter is nonempty. The "excess" of any nonempty $S \subseteq N$ for an imputation x is defined by $e(x, S) = v(S) - \sum_{i \in S} x_i$. The excess represents the dissatisfaction or the complaint of the coalition S with respect to x. The nucleolus is the imputation which minimizes the largest excess. If we have a tie, i.e., if the maximum excess attains a minimum at several imputations, then the next largest excess is to be compared, and so on. That is, the nucleolus is the †lexicographical minimum in the ordering of these arrangements. It can be computed by solving a series of linear programs.

Several generalizations and variations of the classical von Neumann-Morgenstern formulation of games in characteristic-function form have also been investigated, such as the
games without side payments studied by Aumann and others (→ e.g., M. Shubik (ed.), Essays in Mathematical Economics: In honor of Oskar Morgenstern, Princeton Univ. Press, 1967); the games in partition-function form proposed by Thrall and Lucas (R. M. Thrall and W. F. Lucas, Naval Res. Logistics Quart., 10 (1963), 281-298); and the games with infinitely many players, in connection with which the Shapley value theory has become a major research topic [18].

E. Applications; Related Areas of Mathematics

Game theory has been applied to many fields, such as economics, political science, management science, operations research, information theory, and control theory, as well as to pure mathematics. Games in extensive form are now important tools for analyzing the effects of information, and thus for solving many decision problems with uncertainty. Noncooperative games in normal form and Nash equilibria have been used in the study of many phenomena, including oligopolistic markets (Friedman [15]), bidding processes, electoral competition, resource allocation, and arms control. Cooperative games have been successfully applied to economics, and the relation between the core and competitive equilibria sheds further light on the theory of competitive economy. It is generally true that competitive equilibria are contained in the core. Debreu and Scarf [16] demonstrated that if the number of players approaches infinity in a certain manner, the core shrinks to the set of competitive equilibria. By working in measure-theoretic terms, Aumann [17] was able to identify the core with the set of competitive equilibria. It was also demonstrated by Aumann and Shapley [18] that the set of competitive equilibria (and hence also the core) converges to the Shapley value under such formulations, provided certain conditions are satisfied. Another major application of cooperative game theory has arisen in political science, wherein value-type solutions, such as the Shapley value, are widely used as indices of the power of each participant in various voting situations. Major applications are to problems of cost allocation for public goods such as water resources [19], public transportation systems, and telephone systems; in such applications the core, the Shapley value, and the nucleolus have all been employed. The books [8, 20-22] are good references to the most recent applications of game theory.

Game theory also has many close relations with various areas of pure mathematics. The following are typical examples. The study of certain types of games in extensive form has increased our understanding of the axiom of choice (→ 34 Axiom of Choice and Equivalents A) and other foundational questions. Nash equilibrium is closely related to fixed-point theorems (→ 153 Fixed-Point Theorems) and to separation theorems (→ 89 Convex Sets A). Cooperative game theory also has many connections to functional analysis and to convex analysis.

Finally, we mention that the study of differential games (→ 108 Differential Games) is also highly developed and widely used in areas such as economics and control theory.

References

[1] J. von Neumann and O. Morgenstern, Theory of games and economic behavior, Princeton University Press, 1944; second edition, 1947; third edition, 1953.
[2] H. W. Kuhn, Extensive games, Proc. Nat. Acad. Sci. US, 36 (1950), 570-576.
[3] J. von Neumann, Zur Theorie der Gesellschaftsspiele, Math. Ann., 100 (1928), 295-320.
[4] J. F. Nash, Noncooperative games, Ann. Math., 54 (1951), 286-295.
[5] R. Luce and H. Raiffa, Games and decisions: Introduction and critical survey, Wiley, 1957.
[6] A. W. Tucker et al. (eds.), Contributions to the theory of games, vols. I-IV, and Advanced studies in game theory: Ann. Math. Studies 24, 28, 39, 40, 52, Princeton University Press, 1950, 1953, 1957, 1959, 1964.
[7] W. F. Lucas, An overview of the mathematical theory of games, Management Sci., 18, no. 5, pt. 2 (1972), P-3-P-19.
[8] M. Shubik, Game theory in the social sciences: Concepts and solutions, MIT Press, 1982.
[9] R. Selten, Reexamination of the perfectness concept for equilibrium points in extensive games, Int. J. Game Theory, 4 (1975), 25-55.
[10] J. Nash, The bargaining problem, Econometrica, 18 (1950), 155-162.
[11] W. F. Lucas, A game with no solution, Bull. Amer. Math. Soc., 74 (1968), 237-239.
[12] L. S. Shapley, On balanced sets and cores, Naval Res. Logistics Quart., 14 (1967), 453-460.
[13] M. D. Davis and M. Maschler, The kernel of a cooperative game, Naval Res. Logistics Quart., 12 (1965), 223-259.
[14] D. Schmeidler, The nucleolus of a characteristic function game, SIAM J. Appl. Math., 17 (1969), 1163-1170.
[15] J. W. Friedman, Oligopoly and the theory of games, North-Holland, 1977.
[16] G. Debreu and H. Scarf, A limit theorem on the core of an economy, Int. Econ. Rev., 4 (1963), 235-246.
[17] R. J. Aumann, Markets with a continuum of traders, Econometrica, 32 (1964), 39-50.
[18] R. J. Aumann and L. S. Shapley, Values of non-atomic games, Princeton Univ. Press, 1974.
[19] M. Suzuki and M. Nakayama, The cost allocation of the cooperative resource development: A game-theoretical approach, Management Sci., 22 (1976), 1081-1086.
[20] P. C. Ordeshook (ed.), Game theory and political science, New York Univ. Press, 1978.
[21] S. J. Brams et al. (eds.), Applied game theory, Physica-Verlag, 1979.
[22] M. Shubik, A game-theoretic approach to political economy, MIT Press, 1984.

174 (XIV.4)
Gamma Function

A. The Gamma Function

The function Γ(x) was defined by L. Euler (1729) as the infinite product

  $\Gamma(x) = \frac{1}{x} \prod_{n=1}^{\infty} \left(1 + \frac{1}{n}\right)^{x} \left(1 + \frac{x}{n}\right)^{-1}$.

Legendre later called it the gamma function or Euler's integral of the second kind. The latter name is based on the fact that for positive real x, we have

  $\Gamma(x) = \int_0^{\infty} e^{-t} t^{x-1}\,dt$.

This function satisfies the functional relation

  $\Gamma(x + 1) = x\Gamma(x)$,

and hence for positive integral x, we have Γ(x + 1) = x!. C. F. Gauss denoted the function Γ(x + 1) by Π(x) or x!, even when x is not a positive integer. The function x! is also called the factorial function. The gamma function can also be defined as the solution of the functional equation Γ(x + 1) = xΓ(x) satisfying the conditions

  $\Gamma(1) = 1, \qquad \lim_{n \to \infty} \frac{\Gamma(x + n)}{\Gamma(n)\,n^{x}} = 1$.

Furthermore, we have

  $\frac{1}{\Gamma(x)} = x e^{Cx} \prod_{n=1}^{\infty} \left(1 + \frac{x}{n}\right) e^{-x/n}$.

This expression is known as the Weierstrass canonical form. Here C is Euler's constant

  $C = \lim_{n \to \infty} \left(1 + \frac{1}{2} + \cdots + \frac{1}{n} - \log n\right) = 0.57721566490153286060651209\ldots$,

which is conjectured to be †transcendental, but as yet even its irrationality remains unproved. However, it is known that if C were rational, the numerator and the denominator would be integers of more than 30,000 digits (R. P. Brent, 1980).

The numerical value of C was calculated by Adams (1878) to 260 decimal places, and recently it has been calculated to more than 20,000 decimal places by means of an electronic computer. Seven thousand digits have been computed by W. A. Beyer and M. S. Waterman (Math. Comp., 28 (1974)), and 20,000 digits have been computed by Brent et al. (1980).

Γ(x) is †holomorphic on the complex x-plane except at the points x = 0, -1, -2, ..., where it has simple †poles. When Re x > 0, we have Hankel's integral representation

  $\Gamma(x) = -\frac{1}{2i \sin \pi x} \int_C (-t)^{x-1} e^{-t}\,dt, \qquad x \ne \text{integer}$,

where the contour C lies in the complex plane cut along the positive real axis, starting at ∞, going around the origin once counterclockwise, and ending at ∞ again.

Among various properties of this function (→ Appendix A, Table 17.1), the following two formulas are especially useful for numerical calculations: Binet's formula

  $\log \Gamma(x) = \left(x - \frac{1}{2}\right) \log x - x + \frac{1}{2} \log 2\pi + 2 \int_0^{\infty} \frac{\arctan(t/x)}{e^{2\pi t} - 1}\,dt, \qquad \mathrm{Re}\,x > 0$,

and the †asymptotic expansion formula that holds when $|\arg x| < (\pi/2) - \delta$ ($\delta > 0$),

  $\log \Gamma(x) \sim \left(x - \frac{1}{2}\right) \log x - x + \frac{1}{2} \log 2\pi + \sum_{n=1}^{\infty} \frac{(-1)^{n-1} B_n}{2n(2n-1)\,x^{2n-1}}$

(Stirling's formula), where the $B_n$ are †Bernoulli numbers (taken here with positive sign, $B_1 = 1/6$, $B_2 = 1/30$, ...). This last formula can be rewritten as

  $\Gamma(x + 1) = x! \sim x^{x} e^{-x} \sqrt{2\pi x}$,

which is used for large positive integers x.

The integrals

  $\int_0^{\lambda} e^{-t} t^{x-1}\,dt, \qquad \int_{\lambda}^{\infty} e^{-t} t^{x-1}\,dt, \qquad \mathrm{Re}\,x > 0$,

are known as the incomplete gamma functions and are used in statistics, the theory of molecular structure, and other fields. The †exponential integral and the †error function (→ 167 Functions of Confluent Type D) are special cases of the incomplete gamma function.
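The limit defining Euler's constant and the Weierstrass canonical form can both be checked numerically. The Python sketch below is only an illustration under stated assumptions: the function names, the truncation lengths, and the tolerances are choices made for this example, and `math.gamma` from the standard library is used as the reference value for positive real x.

```python
import math

# Euler's constant, rounded from the decimal expansion quoted in the text.
EULER_C = 0.5772156649015329


def euler_constant_estimate(n):
    """1 + 1/2 + ... + 1/n - log n, which tends to C as n grows."""
    return sum(1.0 / k for k in range(1, n + 1)) - math.log(n)


def weierstrass_reciprocal_gamma(x, terms=100_000):
    """Truncated Weierstrass product x e^{Cx} prod_{n<=terms} (1 + x/n) e^{-x/n}.

    Evaluated through logarithms for positive real x to avoid overflow.
    """
    log_product = math.log(x) + EULER_C * x
    for n in range(1, terms + 1):
        log_product += math.log1p(x / n) - x / n
    return math.exp(log_product)


estimate = euler_constant_estimate(10**6)
reciprocal = weierstrass_reciprocal_gamma(2.5)
```

The harmonic-sum estimate converges slowly, with an error of roughly 1/(2n), which is one reason high-precision values of C are computed by other methods. The truncated product has a relative error of about $x^2/(2N)$ after N factors.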

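Stirling's formula for the gamma function can likewise be compared with library values. In the sketch below, the Bernoulli numbers are assumed to follow the classical positive convention ($B_1 = 1/6$, $B_2 = 1/30$, $B_3 = 1/42$), which is what the alternating sign $(-1)^{n-1}$ in the asymptotic series requires. The function names and the number of correction terms are illustrative choices, not part of the article.

```python
import math

# Classical (positive) Bernoulli numbers B_1, B_2, B_3; assumed convention.
BERNOULLI_CLASSICAL = [1 / 6, 1 / 30, 1 / 42]


def stirling_factorial(x):
    """Leading approximation x! ~ x^x e^{-x} sqrt(2 pi x)."""
    return x ** x * math.exp(-x) * math.sqrt(2 * math.pi * x)


def log_gamma_stirling(x, terms=2):
    """Asymptotic series for log Gamma(x), truncated after `terms` corrections."""
    total = (x - 0.5) * math.log(x) - x + 0.5 * math.log(2 * math.pi)
    for n in range(1, terms + 1):
        b_n = BERNOULLI_CLASSICAL[n - 1]
        total += (-1) ** (n - 1) * b_n / (2 * n * (2 * n - 1) * x ** (2 * n - 1))
    return total


ratio = stirling_factorial(10) / math.factorial(10)
series_error = abs(log_gamma_stirling(10.0) - math.lgamma(10.0))
```

For x = 10 the leading approximation is already within about one percent of 10!, and two correction terms bring the logarithmic series to within roughly the size of the first omitted term.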
B. Polygamma Functions

The derivatives of the logarithm of the gamma function are named the digamma function (or psi function) $\psi(x) = d \log \Gamma(x)/dx$; the trigamma function $\psi'(x)$; the tetragamma function $\psi''(x)$; the pentagamma function $\psi'''(x)$, etc. These functions are called polygamma functions. In particular, ψ(x) is the solution of the functional equation

  $\psi(x + 1) - \psi(x) = 1/x, \qquad \psi(1) = -C, \qquad \lim_{n \to \infty} \bigl(\psi(x + n) - \psi(1 + n)\bigr) = 0$.

C. The Beta Function

Euler's integral of the first kind

  $B(x, y) = \int_0^1 t^{x-1} (1 - t)^{y-1}\,dt, \qquad \mathrm{Re}\,x > 0,\ \mathrm{Re}\,y > 0$,

is called the beta function and is an analytic function of two variables x, y. This function is related to the gamma function as follows:

  $B(x, y) = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x + y)}$.

If the upper limit 1 in the integral is replaced by α, the result is called the incomplete beta function $B_\alpha(x, y)$.

References

[1] N. Nielsen, Die Gammafunktion I, II, Chelsea, 1964 (two volumes in one).
[2] W. Shibagaki, Theory and applications of the gamma function, with tables of complex arguments (in Japanese), Iwanami, 1952.
[3] E. T. Whittaker and G. N. Watson, A course of modern analysis, Cambridge Univ. Press, fourth edition, 1958.
[4] K. Pearson, Tables of the incomplete Γ-function, Cambridge Univ. Press, second edition, 1968.
[5] E. Artin, The gamma function, Holt, Rinehart and Winston, 1964. (Original in German, 1931.)
Also → references to 389 Special Functions.

175 (XXI.26)
Gauss, Carl Friedrich

Carl Friedrich Gauss (April 30, 1777-February 23, 1855) was born into a poor family in Braunschweig, Germany. From childhood, Gauss showed genius in mathematics. He gained the favor of Grand Duke Wilhelm Ferdinand and under his sponsorship attended the University of Göttingen. In 1799, on proving the †fundamental theorem of algebra, he received his doctorate from the University of Helmstedt. From 1807 until his death, he was a professor and director of the Observatory at the University of Göttingen.

On March 30, 1796, he made the discovery that it is possible to draw a 17-sided †regular polygon with ruler and compass, which motivated his decision to devote himself to mathematics. The publication of his Disquisitiones arithmeticae in 1801 opened an entirely new era in number theory. In pure mathematics, he did excellent research on †non-Euclidean geometry, †hypergeometric series, the theory of functions of a complex variable, and the whole theory of †elliptic functions.

In the field of applied mathematics he made outstanding contributions to astronomy, geodesy, and electromagnetism; he also studied the †method of least squares, the theory of surfaces (→ 111 Differential Geometry of Curves and Surfaces), and the theory of †potential. He considered perfection in papers for publication to be of utmost importance; thus his published works are few relative to his amount of research. However, the scope of his work can be seen in his diary and letters, some of which are included in his complete works, comprising 12 volumes. He is generally considered the greatest mathematician of the first half of the 19th century.

References

[1] C. F. Gauss, Werke I-XII, Königliche Gesellschaft der Wissenschaften, Göttingen, 1863-1933.
[2] C. F. Gauss, Disquisitiones arithmeticae. German translation, Untersuchungen über höhere Arithmetik, Springer, 1889 (Chelsea, 1965); English translation, Yale Univ. Press, 1966.
[3] F. Klein, Vorlesungen über die Entwicklung der Mathematik im 19. Jahrhundert I, II, Springer, 1926-1927 (Chelsea, 1956).

176 (XVII.13)
Gaussian Processes

A. Gaussian Systems

A system $X = \{X_\lambda(\omega) \mid \lambda \in \Lambda\}$ of real-valued random variables on a probability space
$(\Omega, \mathbf{B}, P)$ is said to be Gaussian, or X is a Gaussian system, if any finite linear combination of elements $X_\lambda$ of X is a †Gaussian random variable.
Any subsystem of a Gaussian system is again a Gaussian system. In particular, the joint distribution of $(X_1, X_2, \ldots, X_n)$ for any finite subsystem $\{X_j \mid 1 \le j \le n\}$ of a Gaussian system X is a multidimensional Gaussian distribution. This distribution is supported on the whole of $R^n$ or on a hyperplane (at most $(n-1)$-dimensional). Let $m_j$ be the expectation of $X_j$, and let $V = (V_{jk})$ be the (positive definite) †covariance matrix of $\{X_j\}$ given by

$$V_{jk} = E\{(X_j - m_j)(X_k - m_k)\}, \qquad 1 \le j, k \le n.$$

Suppose that the rank of the matrix V is r. Then the distribution of $(X_1, X_2, \ldots, X_n)$ is concentrated on an r-dimensional hyperplane of $R^n$:

$$m + VR^n, \qquad m = (m_1, m_2, \ldots, m_n).$$

When V is nondegenerate, that is, when r = n, the distribution is supported by the whole of $R^n$ and has a density function of the form

$$(2\pi)^{-n/2}\,|V|^{-1/2} \exp\Bigl[-\tfrac{1}{2}(x-m)V^{-1}(x-m)'\Bigr],$$

where $x = (x_1, x_2, \ldots, x_n) \in R^n$, $|V|$ and $V^{-1}$ are the determinant and the inverse, respectively, of V, and where $(x-m)'$ denotes the (column) vector transpose to the (row) vector $(x-m)$. The above expression is the general form of the density of an n-dimensional Gaussian distribution. The characteristic function $\varphi(z)$ of this distribution is given by

$$\varphi(z) = \exp\Bigl[i(m, z) - \tfrac{1}{2}(Vz, z)\Bigr], \qquad z = (z_1, z_2, \ldots, z_n) \in R^n,$$

where $(\ ,\ )$ denotes the inner product on $R^n$. This distribution is denoted by $N(m, V)$. If, in particular, m = 0 and if V is the identity matrix, then it is called the n-dimensional standard Gaussian distribution.
For a general Gaussian system $X = \{X_\lambda \mid \lambda \in \Lambda\}$ we are given the mean vector $m_\lambda = E(X_\lambda)$, $\lambda \in \Lambda$, and the covariance matrix $V_{\lambda,\mu} = E\{(X_\lambda - m_\lambda)(X_\mu - m_\mu)\}$, $\lambda, \mu \in \Lambda$, which is positive definite; for any $n \ge 1$, complex numbers $\alpha_1, \alpha_2, \ldots, \alpha_n \in C$, and $\lambda_1, \lambda_2, \ldots, \lambda_n \in \Lambda$, we have

$$\sum_{j,k=1}^{n} \alpha_j \bar{\alpha}_k V_{\lambda_j,\lambda_k} \ge 0.$$

Conversely, given $m_\lambda$, $\lambda \in \Lambda$, and a positive definite $V = (V_{\lambda,\mu} \mid \lambda, \mu \in \Lambda)$, there exists a Gaussian system $X = \{X_\lambda \mid \lambda \in \Lambda\}$, the mean vector and the covariance matrix of which coincide with $(m_\lambda)$ and $V = (V_{\lambda,\mu})$, respectively. If there exists another system X' with the same property as X, then X and X' have the same distribution.
A necessary and sufficient condition for the $X_\lambda$ in a Gaussian system to be independent is that $V_{\lambda,\mu} = 0$ for every pair $\lambda \ne \mu$. For a Gaussian system $X = \{X_n \mid n \ge 1\}$, the †convergence in probability of the sequence $\{X_n\}$ is equivalent to †convergence in the mean square. The limit of the sequence in this case is again Gaussian in distribution. Also for this system, the †almost sure convergence of the sequence implies the convergence in mean square.
There are many characterizations of Gaussian distributions and of Gaussian systems. (i) A necessary and sufficient condition for a distribution to be Gaussian is that cumulants (†semi-invariants) $\gamma_k$ of all orders exist and satisfy $\gamma_k = 0$ for all $k \ge 3$. (ii) Gaussian distributions have a self-reproducing property: If X and Y are independent Gaussian random variables, then their sum S = X + Y is also Gaussian. A converse to this property holds: If the sum is Gaussian, then, assuming that they are independent, X and Y are both Gaussian as well (P. Lévy, H. Cramér). (iii) If $X_1$ and $X_2$ are independent and if $Y_1 = aX_1 + bX_2$ and $Y_2 = cX_1 + dX_2$ are also independent, then both $X_1$ and $X_2$ are Gaussian except for the trivial case b = c = 0 or a = d = 0. (iv) If for X and Y there exist U independent of X and V independent of Y satisfying Y = aX + U, X = bY + V for some constants a, b, then there are only three possibilities: (1) (X, Y) is Gaussian, (2) X and Y are independent, (3) there is an affine relation between X and Y. (v) Suppose that a distribution has a finite mean m and a density function of the form $f(x - m)$. If the †maximum likelihood estimate of the mean is always given by the arithmetic mean of the samples, then the distribution is Gaussian (C. F. Gauss).
For any element $X_\lambda$ of a Gaussian system X and for a subsystem X' of X, the conditional expectation $E(X_\lambda \mid \mathbf{B}')$ is the orthogonal projection of $X_\lambda$ onto $\overline{X'}$, where $\mathbf{B}'$ is the smallest σ-field with respect to which all the $X_\mu$ in X' are measurable and where $\overline{X'}$ is the closed linear subspace of $L^2(\Omega, P)$ spanned by X'.
A system $X = \{X_\lambda \mid \lambda \in \Lambda\}$, $X_\lambda = (X_\lambda^1, \ldots, X_\lambda^d)$, of d-dimensional random variables is said to be Gaussian if the collection $\{X_\lambda^j \mid \lambda \in \Lambda,\ 1 \le j \le d\}$ is Gaussian.

B. Complex Gaussian Systems

Let Z be a complex-valued random variable with mean m, and denote it in the form $Z = X + iY + m$, $i = \sqrt{-1}$, X, Y real. If X and Y are independent and have the same Gaussian distribution with zero mean, then Z is called a
complex Gaussian random variable. A system $Z = \{Z_\lambda \mid \lambda \in \Lambda\}$ of complex-valued random variables is said to be complex Gaussian if any finite linear combination $\sum c_j Z_{\lambda_j}$, $c_j \in C$, is complex Gaussian. A complex Gaussian system has properties similar to those of a Gaussian system discussed above. For instance, two complex Gaussian random variables are independent if and only if they are uncorrelated. Convergence properties are also similar. Furthermore, one can define complex Gaussian systems consisting of higher-dimensional random variables.

C. Gaussian Processes

A real-valued stochastic process $\{X_t\}$ is called a Gaussian process or a normal process if it forms a Gaussian system. If $\{X_t\}$ is a complex Gaussian system, it is called a complex Gaussian process. The most important example of a Gaussian process is †Brownian motion $\{B_t \mid t \ge 0\}$ with the properties $E(B_t) = 0$ and $E(B_t - B_s)^2 = |t - s|$. There are several generalizations of this process, among them: (i) †Brownian motion with a d-dimensional time parameter, (ii) a †Wiener process with d-dimensional time parameter, which is a Gaussian system $\{X_a \mid a = (a_1, \ldots, a_d),\ \text{all } a_j \ge 0\}$ with $E(X_a) = 0$ and $E(X_{a^1}X_{a^2}) = \prod_j \min\{a_j^1, a_j^2\}$, where $a^i = (a_1^i, \ldots, a_d^i)$, i = 1, 2.
Since a multidimensional Gaussian distribution is completely determined by its †mean vector and the †covariance matrix, a Gaussian process is strongly stationary if it is weakly stationary (→ 395 Stationary Processes); and such a process is called a stationary Gaussian process. The mean value m and the spectral measure $F(d\lambda)$ are associated with a weakly stationary process. The measure $F(d\lambda)$ is symmetric with respect to the origin since each $X_t$ is real-valued. Conversely, given such $F(d\lambda)$ and m, we can construct a real weakly stationary process $\{X_t\}$ with mean value m and spectral measure $F(d\lambda)$. Generally, such a process $\{X_t\}$ is not determined uniquely; however, if $\{X_t\}$ is Gaussian, then there exists only one stationary Gaussian process with given m and $F(d\lambda)$. In view of this fact, stationary Gaussian processes can be regarded as being typical among weakly stationary processes.
The discussions so far on stationary Gaussian processes are generalized to the case of stationary complex Gaussian processes, for which symmetry of $F(d\lambda)$ need not be assumed. The †random measure $\{M(A)\}$ in the †spectral decomposition of a complex Gaussian process $\{X_t\}$ again forms a complex Gaussian system.
Weakly stationary Gaussian random distributions can also be introduced.

D. White Noise and Gaussian Random Distributions

A real-valued weakly stationary process with discrete parameter is called a white noise if the mean m = 0 and the †covariance function $\rho(t) = 1$ (t = 0); $= 0$ ($t \ne 0$). Obviously the †spectral measure $F(d\lambda)$ is the †Lebesgue measure $d\lambda$. In the continuous parameter case, a weakly stationary random distribution (→ 407 Stochastic Processes C) is called a white noise if m = 0 and $\rho = \delta$ ($\delta$ is †Dirac's delta function: $\delta(\varphi) = \varphi(0)$). The spectral measure is, therefore, the Lebesgue measure. In both cases Gaussian white noise $\{X_t\}$ or $\{X_\varphi\}$ is determined uniquely. It has independent values at every point in the following sense: In the discrete parameter case, $X_{t_1}, X_{t_2}, \ldots, X_{t_n}$ are mutually independent for any mutually distinct $t_1, t_2, \ldots, t_n$. For a continuous parameter Gaussian white noise, often called simply white noise if no confusion occurs, $X_{\varphi_1}, X_{\varphi_2}, \ldots, X_{\varphi_n}$ are mutually independent if the supports of $\varphi_1, \varphi_2, \ldots, \varphi_n$ are disjoint. In the latter case Gaussian white noise is realized by taking the derivative of a Brownian motion. The †characteristic functional $C(\varphi)$ is given by

$$C(\varphi) = \exp\Bigl[-\tfrac{1}{2}\|\varphi\|^2\Bigr],$$

where $\|\ \|$ is the $L^2(R^1)$-norm.
Characteristic functionals of general Gaussian random distributions are expressed in the form

$$C(\varphi) = \exp\Bigl[i\,m(\varphi) - \tfrac{1}{2}K(\varphi, \varphi)\Bigr],$$

where m is the mean functional and K is the covariance functional.

E. Representations of Gaussian Processes

The family of †stochastic integrals $X_t = \int_a^t F(t, u)\,dB_u$, $t \ge a$ ($t > a$ if $a = -\infty$), based on Brownian motion, defines a Gaussian process $\{X_t \mid t \ge a\}$. The converse problem is, however, not obvious. Given a Gaussian process $\{X_t \mid t \ge a\}$ with $E(X_t) = 0$, the problem is to find a Brownian motion $\{B_t\}$ and a kernel F that give a representation of the above form (P. Lévy [10]). A more specific problem discussed below is important. If, in addition, the representation is formed in such a way that $\mathbf{B}_t(X) = \mathbf{B}_t(B)$ holds for every t, then the representation is called canonical, where $\mathbf{B}_t(X)$ is the smallest σ-field with respect to which all the $X_s$, $s \le t$, are measurable. The canonical representation, if it exists, is unique up to equivalence, i.e., $F(t, u)^2$ is unique. The existence of the canonical representation is, however, not
always guaranteed. Mean continuous, zero mean, purely nondeterministic stationary Gaussian processes have †backward moving average representations, which are nothing but the canonical representations. Once the canonical representation of $\{X_t\}$ is obtained, the kernel, together with known properties of $\{B_t\}$, tells us important properties of the given process, e.g., sample function properties, Markov properties, and so forth.
Let $\{X_t \mid t \ge a\}$ be a general Gaussian process with $E(X_t) = 0$, and let $M_t(X)$ be the subspace of $L^2(\Omega, P)$ spanned by the $X_s$, $s \le t$. Assume that (i) $\{X_t\}$ is separable, i.e., $M_t(X)$ is separable and (ii) $\{X_t\}$ is purely nondeterministic in the sense that $\bigcap_t M_t(X) = \{0\}$. Then $X_t$ is expressed as a sum of stochastic integrals:

$$X_t = \sum_j \int_a^t F_j(t, u)\,dB_u^j,$$

where the $\{B_t^j\}$, j = 1, 2, ..., N (N can be $\infty$), are mutually independent †additive Gaussian processes. In addition, the $B_t^j$ can be taken in such a way that the measures $dv_j(u) = E(dB_u^j)^2$, j = 1, 2, ..., N, are decreasing in j (i.e., $dv_1 \gg dv_2 \gg \cdots$) and that $\mathbf{B}_t(X) = \bigvee_j \mathbf{B}_t(B^j)$ holds for every t. Such a representation is called a generalized canonical representation of $\{X_t\}$ [3]. It always exists under the above assumptions (i) and (ii), although it is not unique when $N \ge 2$. The number N, called the multiplicity of $\{X_t\}$, is independent of the choice of a generalized canonical representation. It is a good question to ask when $\{X_t\}$ has a simple unit multiplicity or when $\{X_t\}$ has the canonical representation. No interesting answer to this question has been given so far except for stationary Gaussian processes.

F. Gaussian Markov Properties

A †simple Markov Gaussian process $\{X_t \mid t \ge a\}$ has the canonical representation if it is separable and purely nondeterministic and if its covariance function never vanishes. It is of the form

$$X_t = f(t)\int_a^t g(u)\,dB_u, \qquad f(t) \ne 0, \quad \int_a^t g(u)^2\,du \ne 0 \ \text{ for } t > a.$$

A generalization of the simple Markov property of a Gaussian process is given in the following manner (suggested by P. Lévy [10]). If the †conditional expectations $E(X_{t_i} \mid \mathbf{B}_t(X))$, for any choice of distinct $t_1, t_2, \ldots, t_{N+j} \ge t$ ($j \ge 0$), span an N-dimensional subspace of $L^2(\Omega, P)$, then the process $\{X_t\}$ is called an N-ple Markov Gaussian process. If the canonical representation of an N-ple Markov Gaussian process exists, then it is expressed in the form

$$X_t = \sum_{j=1}^{N} f_j(t)\int_a^t g_j(u)\,dB_u,$$

where $\det(f_j(t_i))$ never vanishes for any choice of distinct $t_i$ ($> a$), $1 \le i \le N$, and where the $g_j$ are linearly independent vectors in $L^2([a, t])$ for any t > a. Conversely, a Gaussian process given by the above expression is N-ple Markov. A stationary N-ple Markov Gaussian process has a †spectral density function $f(\lambda)$ of a specific form, namely, it is expressed in the form [3]

$$f(\lambda) = |Q(i\lambda)/P(i\lambda)|^2; \qquad P, Q \text{ polynomials}, \quad \deg P = N \text{ and } \deg Q < N$$

(rational spectral density function). Y. Okabe [14] proved that the roots of P(x) = 0 are all real, and introduced a multiple Markov property of a stationary Gaussian process $\{X_t\}$ in a much wider sense to prove that $\{X_t\}$ enjoys this property if and only if it has a rational spectral density.
A somewhat restricted definition of multiple Markov properties for a Gaussian process uses differential operators. Assume that $X_t$ is (N - 1)-times differentiable with respect to the $L^2(\Omega, P)$-norm. If there is an Nth order differential operator $L = \sum_{k=0}^{N} a_k(t)D^{N-k}$ with $D = d/dt$ such that

$$LX_t = \dot{B}_t, \qquad \dot{B}_t = dB_t/dt \ \text{ white noise},$$

then $\{X_t\}$ is called an N-ple Markov Gaussian process in the restricted sense. Such a process is naturally N-ple Markov, and the canonical representation always exists. The kernel of the representation is the †Riemann function of the differential operator L that was used in the definition. The spectral density function corresponding to this process is of restricted form, namely, it is expressed in the form $1/|P(i\lambda)|^2$, where P is a polynomial of degree N, and $P(i\lambda) = 0$ has no root in the lower half-plane.
Many attempts have been made to define a Markov property of a Gaussian random field, namely, a Gaussian system with a multidimensional time parameter; however, only two significant approaches are mentioned here.
Let $\{X_a \mid a \in R^d\}$ be a Gaussian random field with d-dimensional time parameter. H. P. McKean [12] gave a Markov property in the following manner. Let $\mathbf{B}_U$ be the σ-field generated by the $X_a$, $a \in U$, $U\ (\subset R^d)$ open. For a closed set C, we set $\mathbf{B}_C = \bigcap_\varepsilon \mathbf{B}_{C_\varepsilon}$, where $C_\varepsilon$ is the ε-neighbourhood of C. If, for any open set U, $\mathbf{B}_{\bar{U}}$ and $\mathbf{B}_{U^c}$ become independent under the assumption that $\mathbf{B}_{\partial U}$ is known, then $\{X_a\}$ is called a Markov Gaussian random field in the McKean sense. It has been proved that a Brownian motion with d-dimensional time
parameter is Markov in this sense if d is odd, while it is not Markov if d is even. Further detailed investigations of Markov properties for Gaussian random fields have been given by L. Pitt, S. Kotani, Y. Okabe [17], and others.
Another definition of Markov properties of Gaussian †random fields, in fact those of Gaussian random distributions, has been given by E. Nelson [13] in connection with Euclidean fields (invariant under Euclidean motion) in quantum theory. Let $X_\varphi$ be a Gaussian random field. For a smooth surface $\Gamma$ in $R^d$ let $\mathbf{B}_\Gamma$ be the σ-field generated by $X_\varphi$, $\varphi = \psi \otimes \delta(n)$, $\psi$ being a smooth function with compact support and n being normal to $\Gamma$. If $\mathbf{B}_D$ and $\mathbf{B}_{D^c}$ become independent under the assumption that $\mathbf{B}_\Gamma$, $\Gamma = \partial D$, is given, then $\{X_\varphi\}$ is called a Markov Gaussian random field in the Nelson sense. Given a Gaussian Euclidean field which is Markov in the Nelson sense, then under some reasonable assumptions, one can form a field (in the sense of †quantum field theory) satisfying the †Wightman axiom by changing imaginary time to real time.

G. Sample Function Properties

Some finer results about the smoothness of sample functions have been obtained for stationary Gaussian processes. G. A. Hunt proved that if the spectral measure $F(d\lambda)$ of a stationary Gaussian process $\{X_t\}$ has finite moment of order $\alpha$, then almost all its sample functions satisfy the following †Hölder condition of order $\alpha$ [5]: For every finite interval I and every constant C, there exists a sufficiently small $h_0 = h_0(I, C)$ such that

$$|X_{t+h} - X_t| \le C|h|^{\alpha}$$

for any $t \in I$ and $|h| < h_0$. The following result is due to Yu. K. Belyaev: Sample functions of a stationary Gaussian process are with probability 1 either continuous or unbounded on every finite interval.
Other conditions for continuity of sample functions are given in terms of the †covariance function. Let $v(s, t)$ be the covariance function of a Gaussian process $\{X_t \mid t \in [0, 1]\}$. If v satisfies

$$|v(s_1, t) - v(s_2, t)| \le C|s_1 - s_2|^{\alpha} \qquad (0 < \alpha < 1,\ v(0, s) = 0,\ s_1, s_2 \in [0, 1]),$$

then

$$P\Bigl(\limsup_{\varepsilon \downarrow 0,\ |t_1 - t_2| \le \varepsilon} \frac{|X_{t_1} - X_{t_2}|}{|t_1 - t_2|^{\alpha/2}\,\bigl|\log|t_1 - t_2|\bigr|^{1/2}} \le c_\alpha\sqrt{C}\Bigr) = 1,$$

where $c_\alpha$ is a constant. This result is due to Z. Ciesielski.
Conditions for continuity of sample functions have also been given for Gaussian processes with multidimensional time parameters. For $\{X_t \mid t \in [0, 1]^n\}$, set $d(s, t) = E[(X_s - X_t)^2]^{1/2}$. If there exists a function $\varphi$ monotone increasing on some interval $0 \le u \le \alpha$ such that

$$d(s, t) \le \varphi(\|s - t\|),$$

where $\|\ \|$ is the metric in $R^n$, and such that

$$\int^{+\infty} \varphi(e^{-x^2})\,dx < \infty,$$

then almost all sample functions are continuous (X. Fernique [2]). Conversely, if there exists a monotone increasing function $\varphi$ satisfying $d(s, t) = \varphi(\|s - t\|)$ and if sample functions are continuous with positive probability, then the above integral should be finite.
Suppose that $\{X_t\}$ is a stationary Gaussian field. R. M. Dudley [11] proved that if there exists a number q > 1 and a neighborhood V = V(0) in $R^n$ such that

$$\sum_{n} q^{-n}\bigl(\log N(V, q^{-n})\bigr)^{1/2} < \infty,$$

where $N(V, \varepsilon)$ is the minimum number of ε-balls (relative to the metric d) that cover V, then almost all sample fields of $\{X_t\}$ are continuous, and Fernique [18] proved that the converse statement is also true. Thus the problem of continuity of the sample fields of stationary Gaussian fields has been completely solved.

H. Gaussian Measures

Let $\{X_t, t \in T\}$, T an interval, be a Gaussian process. It gives a probability distribution on the measurable space $(R^T, \mathbf{B})$, where $\mathbf{B}$ is the σ-field of Borel subsets of $R^T$. Such a measure P is called a Gaussian measure. Let $P^i$, i = 1, 2, be Gaussian measures induced by Gaussian processes $\{X_t^i, t \in T\}$, i = 1, 2, respectively. Then for these measures concepts such as †absolute continuity, †equivalence, and †singularity are introduced in the usual manner. The most significant property for Gaussian measures is that two measures $P^1$ and $P^2$ are either equivalent or singular. A powerful criterion for testing this dichotomy is the †entropy of $P^2$ with respect to $P^1$, given by

$$H(P^2 \mid P^1) = \sup_{\mathfrak{A}} \sum_i P^2(A_i)\log\frac{P^2(A_i)}{P^1(A_i)},$$

where the sup is taken over all finite partitions $\mathfrak{A} = \{A_1, \ldots, A_n\}$, $A_i \in \mathbf{B}$, $\bigcup A_i = R^T$ (Yu. A. Rozanov [15]). The measures $P^1$ and $P^2$ are equivalent (resp. singular) if and only if
$H(P^2 \mid P^1) < \infty$ (resp. $= \infty$). This statement can be paraphrased in terms of the Hilbert spaces $H(X^i)$, i = 1, 2, spanned by $\{X_t^i\}$, i = 1, 2, respectively. For simplicity, assume that $E(X_t^i) = 0$. Then, $P^1$ and $P^2$ are equivalent if and only if the following two conditions hold: (i) A mapping A determined by $AX_t^1 = X_t^2$ is extended to an invertible linear transformation from $H(X^1)$ to $H(X^2)$; (ii) There exists a symmetric operator B of †Hilbert-Schmidt type such that $(A\xi, A\eta)_2 = ((I - B)\xi, \eta)_1$, where $(\xi, \eta)_i = E(\xi\eta)$ is the inner product in $H(X^i)$. Next, suppose that $\{X_t^i\}$, i = 1, 2, are not centered. Set $E(X_t^i) = m_t^i$, i = 1, 2. Then $P^1$ and $P^2$ are equivalent if and only if $\{\tilde{X}_t^i\}$ with $\tilde{X}_t^i = X_t^i - m_t^i$ satisfy the following condition: (iii) in addition to conditions (i) and (ii) above, there exists an element $\xi_0$ in $H(\tilde{X}^1)$ such that $m_t^2 - m_t^1 = E[\xi_0\tilde{X}_t^1]$, $t \in T$.
Given equivalent Gaussian measures $P^1$ and $P^2$ on $(R^T, \mathbf{B})$, $T = [0, t_0]$, it is in general not easy to obtain the †Radon-Nikodym derivative; however, if one of them, say $P^1$, is the †Wiener measure, then one can obtain $\varphi = dP^2/dP^1$ explicitly (M. Hitsuda [4]). Assume that $P^2$ is derived from $\{X_t\}$. By assumption, $X_t$ has the †canonical representation

$$X_t = B_t - \int_0^t \Bigl[\int_0^s l(s, u)\,dB_u + a(s)\Bigr]ds, \qquad t \in T,$$

where $\{B_t\}$ is the †Brownian motion from which $P^1$ is assumed to be derived and where $l \in L^2(T^2)$, $a \in L^2(T)$. With this notation, we can write

$$\varphi = \exp\Bigl[\int_0^{t_0}\Bigl(\int_0^s l(s, u)\,dB_u + a(s)\Bigr)dB_s - \frac{1}{2}\int_0^{t_0}\Bigl(\int_0^s l(s, u)\,dB_u + a(s)\Bigr)^2 ds\Bigr].$$

This result yields a remark about general Gaussian measures: If $P^1$ and $P^2$ are equivalent, $\{X_t^1, t \in T\}$ has a canonical representation if and only if $\{X_t^2, t \in T\}$ does.

I. Nonlinear Functionals

Start with nonlinear functionals of a Brownian motion $\{B_t, t \in R^1\}$, and call them Brownian functionals. If $\{\varphi_n\}$ is a complete orthonormal system in $L^2(R^1)$, then the collection $\{X_n = \int \varphi_n(u)\,dB_u\}$ of †stochastic integrals forms a Gaussian system of independent standard Gaussian random variables. A Fourier-Hermite polynomial based on $\{\varphi_n\}$ is a polynomial in the $X_n$ expressible as a finite product in the form

$$c\prod_j H_{n_j}(X_j/\sqrt{2}), \qquad c \text{ a constant}.$$

The sum $n = \sum n_j$ is the degree of this polynomial. If n > 0, its mean is zero and the variance is $|c|^2\prod_j (n_j!\,2^{n_j})$. Taking c to be $\bigl(\prod_j (n_j!\,2^{n_j})\bigr)^{-1/2}$, the collection of all Fourier-Hermite polynomials based on $\{\varphi_n\}$ forms a complete orthonormal system in the Hilbert space $(L^2)$ consisting of all Brownian functionals with finite variance. Denote by $\mathcal{H}_n$ the subspace of $(L^2)$ spanned by all the Fourier-Hermite polynomials of degree n. Then the Hilbert space $(L^2)$ admits a direct-sum decomposition:

$$(L^2) = \bigoplus_{n=0}^{\infty} \mathcal{H}_n,$$

which is called the Wiener-Itô decomposition [7]. The subspace $\mathcal{H}_n$ is eventually independent of the choice of a complete orthonormal system $\{\varphi_n\}$. A member X of $\mathcal{H}_n$ is referred to as a multiple Wiener integral of degree n and is expressed in the form

$$X = \int\cdots\int_{R^n} f(s_1, s_2, \ldots, s_n)\,dB_{s_1}dB_{s_2}\cdots dB_{s_n},$$

where f is a symmetric $L^2(R^n)$-function. In addition, $E(X^2) = n!\,\|f\|^2$, $\|\ \|$ being the $L^2(R^n)$-norm.
Brownian functionals can be expressed as functionals of white noise, so that the Hilbert space $L^2(\mu)$ is viewed as a realization of $(L^2)$, where $\mu$ is the probability distribution of the Gaussian white noise introduced in Section D. Nonlinear functionals of a general Gaussian process can be dealt with in a similar but somewhat more complicated manner. If the process has a canonical representation, then the functionals can easily be rewritten as Brownian functionals.

J. Applications to Prediction Theory

If a Gaussian process $\{X_t \mid t \ge a\}$ has the canonical representation

$$X_t = \int_a^t F(t, u)\,dB_u,$$

then the best †predictor $E(X_t \mid \mathbf{B}_s(X))$, s < t, is given by

$$\int_a^s F(t, u)\,dB_u.$$

This is in agreement with the best †linear predictor.
No systematic approach for nonlinear prediction theory has been established so far. We illustrate this theory only in a typical case that arises from a stationary Gaussian process. Let $\{X_t\}$ be a real-valued stationary Gaussian process. Without loss of generality, we can assume that $E(X_t) = 0$. We consider the continuous parameter case, since the discrete parameter case is easier and can be inferred from it. Let $M_t(X)$ be the subspace of $L^2(\Omega, P)$ de-
fined in the same way as in Section E, and let $L_t = L^2(\Omega, \mathbf{B}_t(X))$. If the process is purely nondeterministic in the sense that $\bigcap_t M_t(X) = \{0\}$, then $X_t$ has the backward moving average representation

$$X_t = \int_{-\infty}^{t} F(t - u)\,dB_u,$$

where $\{B_t\}$ is a Brownian motion. Furthermore, this representation is canonical, i.e., $\mathbf{B}_t(X) = \mathbf{B}_t(B)$. Since $\mathbf{B}$ is assumed to be $\bigvee_t \mathbf{B}_t(X)$, every Y in $(L^2)$ can be expressed in the form

$$Y = \sum_p \int\cdots\int f_p(s_1, \ldots, s_p)\,dB_{s_1}dB_{s_2}\cdots dB_{s_p},$$

in terms of multiple Wiener integrals. The predictor of Y based on the known values $X_s$, $s \le 0$, that is, the projection of Y on $L_0$, is given by truncating the domain of the integral as

$$\sum_p \int_{-\infty}^{0}\cdots\int_{-\infty}^{0} f_p(s_1, \ldots, s_p)\,dB_{s_1}dB_{s_2}\cdots dB_{s_p}.$$

K. Extrapolation and Interpolation

Extrapolation of a stationary Gaussian process is used to find the best estimate of the unknown values of a given process when we have been given only some of the past values of the process. M. G. Kreĭn [8] obtained some results by using the theory of the inverse spectral problem discussed by I. M. Gel'fand and B. Levitan.
In contrast to extrapolation is the interpolation of a stationary Gaussian process. There is an important contribution to this problem by H. Dym and H. P. McKean [1] who developed the Kreĭn theory by applying the de Branges method to the theory of entire functions.

References

[1] H. Dym and H. P. McKean, Gaussian processes, function theory, and the inverse spectral problem, Academic Press, 1976.
[2] X. Fernique, Continuité des processus gaussiens, C. R. Acad. Sci. Paris, 258 (1964), 6058-6060.
[3] T. Hida, Canonical representations of Gaussian processes and their applications, Mem. Coll. Sci. Univ. of Kyoto, (A, Math.) 34 (1960), 109-155.
[4] M. Hitsuda, Representation of Gaussian processes equivalent to Wiener processes, Osaka J. Math., 5 (1968), 299-312.
[5] G. A. Hunt, Random Fourier transforms, Trans. Amer. Math. Soc., 71 (1951), 38-69.
[6] I. A. Ibragimov and Yu. A. Rozanov, Gaussian random processes, Springer, 1978. (Original in Russian, 1970.)
[7] K. Itô, Multiple Wiener integral, J. Math. Soc. Japan, 3 (1951), 157-169.
[8] M. G. Kreĭn, On the fundamental approximation problem in the theory of extrapolation and the filtration of stationary random processes, Selected Transl. in Math. Stat. Prob., 4 (1963), 127-131. (Original in Russian, 1954.)
[9] P. Lévy, Processus stochastiques et mouvement brownien, Gauthier-Villars, 1948; second edition, 1965.
[10] P. Lévy, A special problem of Brownian motion and a general theory of Gaussian random functions, Proc. 3rd Berkeley Symp. Math. Stat. Prob. II, Univ. of Calif. Press, 1956, 133-175.
[11] R. M. Dudley, The size of compact subsets of Hilbert space and continuity of Gaussian processes, J. Functional Anal., 1 (1967), 290-330.
[12] H. P. McKean, Jr., Brownian motion with a several-dimensional time, Teor. Veroyatnost. i Primenen., 8 (1963), 357-378.
[13] E. Nelson, Construction of quantum fields from Markov fields, J. Functional Anal., 12 (1973), 97-112.
[14] Y. Okabe, On the structure of splitting fields of stationary Gaussian processes with finite multiple Markovian property, Nagoya Math. J., 54 (1974), 191-213.
[15] Yu. A. Rozanov, Infinite-dimensional Gaussian distribution, Proc. Steklov Inst. Math., 108 (1971). (Original in Russian, 1968.)
[16] M. Yor, Existence et unicité de diffusion à valeurs dans un espace Hilbert, Ann. Inst. H. Poincaré, (B) 10 (1974), 55-88.
[17] Y. Okabe, Stationary Gaussian processes with Markovian property and M. Sato's hyperfunctions, Japan J. Math., 41 (1973), 69-122.
[18] X. Fernique, Régularité des trajectoires des fonctions aléatoires gaussiennes, Lecture notes in math. 480, Springer, 1974, 1-96.

177 (XIV.2)
Generating Functions

A. General Remarks

A power series $g(t) = \sum_{n=0}^{\infty} a_n t^n$ in t that converges in a certain neighborhood of t = 0 defines the sequence of numbers $\{a_n\}$. The function g(t) is called the generating function of
the sequence. Similarly, the series $K(x, t) = \sum_{n=0}^{\infty} f_n(x)t^n$ that is convergent for x and t in a certain domain in (x, t)-space is called the generating function of the sequence of functions $\{f_n(x)\}$. Sometimes the function $g(t) = \sum_{n=0}^{\infty} (a_n/n!)\,t^n$ is called the exponential generating function of the sequence $\{a_n\}$. For example, the generating functions of the †binomial coefficients $\binom{n}{k}$ and †Legendre polynomials are $(1 + t)^n$ and $(1 - 2tx + t^2)^{-1/2}$, respectively. When a generating function of $\{a_n\}$ or $\{f_n(x)\}$ is given, we can obtain $a_n$ or $f_n(x)$ by integral expressions; for example, in the latter case we have

$$f_n(x) = \frac{1}{2\pi i}\oint_C \frac{K(x, t)}{t^{n+1}}\,dt,$$

where the contour C is a sufficiently small circle going counterclockwise around the origin. A generating function can be continued beyond the domain of convergence of the power series. Simple generating functions are known for many important orthogonal systems of functions (→ 317 Orthogonal Functions). Generating functions are widely used because they enable us to derive analytically the properties of sequences of numbers or functions. For a system of numbers or functions depending on a continuous parameter instead of the integral parameter n, we define the generating function in the form of a †Laplace or †Fourier transform. In particular, for a probability †distribution function F(x), the exponential generating function $f(t) = \int_{-\infty}^{\infty} e^{-tx}\,dF(x)$ for the moments $\{a_n\}$ of F(x) is called the moment-generating function of F(x). There is another kind of generating function for the sequence $a_n$: $\sum a_n n^{-s}$ in the form of †Dirichlet series; it is frequently used in number-theoretic problems.

B. Bernoulli Polynomials

A system of polynomials

$$B_n(x) = \sum_{k=0}^{n} \binom{n}{k} B_k(0)\,x^{n-k}$$

is defined by the generating function

$$\frac{te^{xt}}{e^t - 1} = \sum_{n=0}^{\infty} \frac{B_n(x)}{n!}\,t^n.$$

$B_n(x)$ is called the Bernoulli polynomial of degree n. Since $B_k(0)$ is the coefficient of $t^k/k!$ in

$$\frac{t}{e^t - 1},$$

we have $B_0(0) = 1$, $B_1(0) = -1/2$, $B_2(0) = 1/6$, ..., $B_{2n+1}(0) = 0$ for $n \ge 1$; and $(-1)^{n-1}B_{2n}(0) > 0$ for $n \ge 1$. The nth Bernoulli number $B_n$ (→ Appendix B, Table 3) is usually defined by $|B_{2n}(0)|$. Sometimes other definitions, such as $B_n = B_n(0)$ or $B_n = B_{2n}(0)$, are used. Bernoulli polynomials satisfy the relations

$$B_n(x+1) - B_n(x) = nx^{n-1},$$

$$dB_n(x)/dx = nB_{n-1}(x),$$

which are used in †interpolation. For example, a polynomial solution of the †difference equation $f(x+1) - f(x) = \sum_{n=0}^{m} a_n x^n$ is given by $f(x) = \sum_{n=0}^{m} a_n B_{n+1}(x)/(n+1)$ + (arbitrary constant). In particular, we have $1^n + 2^n + \cdots + p^n = (B_{n+1}(p+1) - B_{n+1}(1))/(n+1)$.

C. Euler Polynomials

A system of polynomials

$$E_n(x) = \sum_{k=0}^{n} \binom{n}{k} a_k x^{n-k}$$

is defined by the generating function

$$\frac{2e^{xt}}{e^t + 1} = \sum_{n=0}^{\infty} \frac{E_n(x)}{n!}\,t^n.$$

We call $E_n(x)$ the Euler polynomial of degree n. Here $a_k$ is defined by $a_k = E_k(0)$ and

$$\frac{2}{e^t + 1} = \sum_{n=0}^{\infty} \frac{a_n}{n!}\,t^n,$$

so that we have $a_0 = 1$, $a_1 = -1/2$, $a_3 = 1/4$, ...; $a_{2n} = 0$ for $n \ge 1$. The nth Euler number is sometimes taken as $a_n$, but more often it is defined by

$$E_n = (-1)^{n/2}\sum_{p=0}^{n} 2^p\binom{n}{p}a_p,$$

i.e., by

$$\frac{2}{e^x + e^{-x}} = \operatorname{sech} x = \sum_{n=0}^{\infty} (-1)^{n/2}\frac{E_n}{n!}\,x^n$$

(→ Appendix B, Table 3). All the $E_n$ are integers, $E_{2m+1} = 0$ (m = 0, 1, ...) and $E_{2m} > 0$ (m = 0, 1, ...); in the decimal expressions for $E_n$ the last digit is 5 for $E_{4m}$ ($m \ge 1$) and 1 for $E_{4m+2}$ ($m \ge 0$). Sometimes $E_{2n}$ is denoted by $E_n$. We have the relations

$$\sec x = \sum_{n=0}^{\infty} \frac{E_n x^n}{n!},$$

$$E_n(x) + E_n(x+1) = 2x^n,$$

$$E_n(1 - x) = (-1)^n E_n(x),$$

$$\frac{dE_n(x)}{dx} = nE_{n-1}(x),$$

and in particular,

$$-1^n + 2^n - 3^n + 4^n - \cdots + (-1)^p p^n = \bigl((-1)^p E_n(p+1) - E_n(1)\bigr)/2.$$
D. Application to Combinatorics speed is by detïnition a geodesic if and only if


every point on y is contained in a nontrivial
. Let p(n) denote the number of ways of dividing subarc whose length realizes the distance be-
n similar abjects into nonempty class. This tween its endpoints. M is equipped with the
is called the number of partitions of n. Euler Levi-Civita connection V (- 80 Connections)
noticed that the following formula is valid for by means of which a geodesic is characterized
the generating function of p(n): as an autoparallel curve. In local coordinates,
$$1 + \sum_{n=1}^{\infty} p(n)x^n = \Bigl[\prod_{n=1}^{\infty}(1 - x^n)\Bigr]^{-1},$$
y is obtained as a solution of an ordinary
differential equation of order two (- 80 Con-
nections L). It follows from the properties of
whence we obtain p(n)=p(n- l)+p(n-2)- solutions of the equation for geodesics that
p(n-5)-p(n-7)+p(n-12)+p(n-15)-...= every point PE M has a neighborhood in which
Z&(-l)k-1[p(n-(3k2-k)/2)+p(n-(3k2+ every two points cari be joined by a unique
k)/2)] (- 328 Partitions of Numbers). Let shortest geodesic that depends smoothly on
B(n) denote the number of ways of dividing n both of the endpoints. A classical result of J.
completely dissimilar abjects into nonempty H. C. Whitehead (Quart. J. Math., 3 (1932))
classes. These are called Bell numbers. The states that every point PE M has a metric bal1
fïrst few are 1, 1, 2, 5, 15, 52, 203, 877,4140, B centered at p such that every two points in B
21147, . . [2]. For this sequence with B(O)= 1, are joined by a unique shortest geodesic whose
the forma1 generating function CE,, B(n)x” image is contained in B. Let M, be the tangent
does not converge except at x = 0. However, space to M at p, and let fi,c M, be the set
the exponential generating function g(x) = of a11 vectors u E M, such that the solution of
X:0 B(ri)(n!)-‘x” is convergent for a11 complex geodesic y with initial condition y(0) = p, y(O) =
numbers x and is equal to eex-‘. The differen- u is well defined on [0, 11. fi, contains an
tial equation g’ = e”g gives rise to a recursive open neighborhood of the origin and is tstar-
formula shaped with respect to the origin. The ex-
ponential mapping exp,: fi,- M is defined to
B(k). be exp, v = y( 1). This mapping exp, is smooth
and has the maximal rank at the origin. Small
balls around p are obtained as the image
References under the exponential mapping of the corre-
sponding halls in fiD centered at the origin,
[l] E. Netto, The theory of substitutions and
and the restriction of exp, to these halls are
its applications to algebra, revised and trans-
diffeomorphisms. Thus the topology of M as a
lated by F. N. Cale, second edition, Chelsea,
metric space is equivalent to the original one
1964. (Original in German, 1882.)
of M. The fundamental Hopf-Rinow theorem
[2] E. T. Bell, Algebraic arithmetic, Amer.
(Comment. Math. He/u., 3 (1931)) states that (1)
Math. Soc., 1927.
M is complete as a metric space if and only if
[3] E. B. McBride, Obtaining generating func-
ap= M, for some PE M; (2) M is complete if
tions, Springer, 1971.
and only if every closed metric bal1 is compact;
Also - references to 389 Special Functions.
and (3) if M is complete, then every two points
cari be joined by a shortest geodesic, namely, a
geodesic with length realizing the distance
between the endpoints.
178 (Vll.6) In the following discussion M is assumed to
Geodesics be connected, complete, and without boundary.
Let y : [0, l] -t M be a geodesic. A piecewise
A. General Remarks smooth 1-parameter variation 1/ along y is a
continuous mapping T/: [0, l] x (-E, E)+M
In the study of global differential geometry, withatïnitepartitionO=t,<t,<...<t,=l
geodesics play an essential role because the such that VJ [ti, ti+l] x (-F, E) is smooth and
behavior of geodesics on a complete Riemann- v(t, 0) = y(t) for all t E [0, 11. Then the fïrst and
ian m-manifold (m 2 2) M without boundary second variation formulas for V are
heavily influences its topological structure. A
L’(O)= < y> Y) l OIW
signifïcant merit of using geodesics is that one
cari make elementary and intuitive observa- and
tions on M (as in Euclidean geometry), which
often yields fruitful results. L”(0) = i ((Y, Yi’)-<Wi,1%4 Yi))dt
i=l
Riemannian structure induces in a natural
way a distance function d on M, and hence M
is a metric space. A smooth curve y of constant
+<V,,,yi>T9l:j-,
Ii L(Y)?
respectively, where $Y(t) = \partial V(t, s)/\partial s |_{s=0}$ for $t \in [t_{i-1}, t_i]$ is the variation vector field associated with $V | [t_{i-1}, t_i] \times (-\varepsilon, \varepsilon)$, $L(s)$ is the length of the variation curve $t \to V(t, s)$, $Y_\perp$ is the component of $Y$ normal to $\dot\gamma$, $Y_\perp' = \nabla_{\dot\gamma} Y_\perp$, and $R$ is the curvature tensor.

A smooth vector field $J$ along a geodesic $\gamma : [0, 1] \to M$ is called a Jacobi field if and only if it satisfies $J'' + R(J, \dot\gamma)\dot\gamma = 0$. The set of all Jacobi fields along $\gamma$ forms a vector space isomorphic to $M_{\gamma(0)} \times M_{\gamma(0)}$ by the natural correspondence $J \to (J(0), J'(0))$. Every Jacobi field is associated with a 1-parameter geodesic variation $V$ along $\gamma$. Namely, every variation curve of $V$ is a geodesic, $V$ is smooth, and $J(t) = \partial V(t, s)/\partial s |_{s=0}$ for all $t \in [0, 1]$. Conversely, the variation vector field of any 1-parameter geodesic variation $V$ along $\gamma$ is a Jacobi field. Especially, if $A$ is a tangent vector to $M_p$ at $\dot\gamma(0)$, then $d(\exp_p)_{\dot\gamma(0)} A = J(1)$, where $J$ is the Jacobi field associated with the 1-parameter geodesic variation $V(t, s) = \exp_p t(\dot\gamma(0) + sA)$, and $A$ is identified under the canonical parallel translation in $M_p$. A point $\gamma(t_0)$ is said to be conjugate to $p = \gamma(0)$ along $\gamma$ if and only if there exists a nontrivial Jacobi field along $\gamma$ that vanishes at $0$ and $t_0$. A theorem of Morse and Schoenberg states that if the sectional curvature $K$ of $M$ satisfies $0 < A \le K \le B < \infty$, then every unit speed geodesic $\gamma : [0, l] \to M$ has a point conjugate to $\gamma(0)$ if $l \ge \pi/\sqrt{A}$ and has no point conjugate to $\gamma(0)$ if $l < \pi/\sqrt{B}$. Especially, every geodesic $\gamma$ on $M$ with $K \le 0$ contains no conjugate point on it, and $\exp_p$ is locally regular for any $p \in M$. Thus a complete and simply connected $M$ with $K \le 0$ is diffeomorphic to $\mathbf{R}^m$ (Hadamard and Cartan).

Let $\gamma$ be a unit speed geodesic with $\gamma(0) = p$. If $t_1 > 0$ satisfies $d(p, \gamma(t)) = t$ for all $t \in [0, t_1]$ and $d(p, \gamma(t)) < t$ for all $t > t_1$, then the point $\gamma(t_1)$ is called a cut point to $p$ along $\gamma$. It appears no later than the first conjugate point to $p$ along $\gamma$. The set $C(p)$ of all cut points to $p$ is called the cut locus of $p$. Let $U$ be the set of vectors $tu \in M_p$, $0 \le t < 1$, where $\exp_p u$ is the cut point to $p$ along $\gamma(t) = \exp_p tu$. $U$ is a nonempty open set and $\exp_p | U : U \to M - C(p)$ is a diffeomorphism, where $M - C(p)$ is diffeomorphic to an $m$-disk. The cut locus possesses essential information on the topology of $M$.

The metric comparison theorems of H. Rauch (Ann. Math., (2) 54 (1951)) and M. Berger (Illinois J. Math., 6 (1962)) are stated as follows: Let $\inf_M K = \delta > -\infty$ and $\sup_M K = \Delta < \infty$, and let $\gamma : [0, 1] \to M$ be a nontrivial geodesic. Let $M(c)$ be a complete and simply connected space form of constant curvature $c$. Fix geodesics $\gamma_\delta, \gamma_\Delta : [0, 1] \to M(\delta), M(\Delta)$, respectively, with the same speed as $\gamma$. If $J$, $J_\delta$, and $J_\Delta$ are Jacobi fields along $\gamma$, $\gamma_\delta$, and $\gamma_\Delta$ respectively such that $J(0) = J_\delta(0) = J_\Delta(0) = 0$ and $\|J'(0)\| = \|J_\delta'(0)\| = \|J_\Delta'(0)\|$, then $\|J_\Delta(t)\| \le \|J(t)\| \le \|J_\delta(t)\|$ holds for all $t \in [0, t_0]$, where $\gamma_\Delta(t_0)$ is the first conjugate point to $\gamma_\Delta(0)$ along $\gamma_\Delta$ (Rauch). If the Jacobi fields satisfy $\|J(0)\| = \|J_\delta(0)\| = \|J_\Delta(0)\|$ and $J'(0) = J_\delta'(0) = J_\Delta'(0) = 0$, then $\|J_\Delta(t)\| \le \|J(t)\| \le \|J_\delta(t)\|$ holds for all $t \in [0, t_1]$, where $t_1$ is the first zero point of $J_\Delta$ (Berger). They are often applied to compare curve lengths as follows. Let $c : I \to \mathbf{R}^m$ be a piecewise smooth curve with $\|c(t)\| < t_0$. Fix points $p \in M$, $p_\delta \in M(\delta)$, and $p_\Delta \in M(\Delta)$ with fixed isometric identifications of the tangent spaces with $\mathbf{R}^m$. Then $L(\exp_{p_\Delta} \circ c) \le L(\exp_p \circ c) \le L(\exp_{p_\delta} \circ c)$. If $P$, $P_\delta$, and $P_\Delta$ are unit parallel fields along corresponding geodesics $\sigma, \sigma_\delta, \sigma_\Delta : I \to M, M(\delta), M(\Delta)$, respectively, which have the same speed, and if $\langle P, \dot\sigma \rangle = \langle P_\delta, \dot\sigma_\delta \rangle = \langle P_\Delta, \dot\sigma_\Delta \rangle$, then for every piecewise smooth function $f : I \to [0, t_1]$ the curves $u(s) = \exp_{\sigma(s)} f(s) P(s)$ and $u_\delta(s)$, $u_\Delta(s)$ defined in the same way on $M(\delta)$, $M(\Delta)$, respectively, have lengths $L(u_\Delta) \le L(u) \le L(u_\delta)$. They play important roles in the theory of geodesics.

A geodesic triangle is a triple $(\gamma_0, \gamma_1, \gamma_2)$ of shortest geodesics (for convenience they are parametrized on $[0, 1]$) such that $\gamma_i(1) = \gamma_{i+1}(0)$ for all $i = 0, 1, 2$ with mod 3. The vertices $p_0, p_1, p_2$ and the angles $\theta_0, \theta_1, \theta_2$ of a geodesic triangle $(\gamma_0, \gamma_1, \gamma_2)$ are defined by $p_{i+1} = \gamma_i(0)$ and $\theta_{i+1} = \cos^{-1} \{ \langle \dot\gamma_i(0), -\dot\gamma_{i-1}(1) \rangle / \|\dot\gamma_i\| \|\dot\gamma_{i-1}\| \}$, where angles are always taken in $[0, \pi]$. The triangle comparison theorem of Toponogov (Amer. Math. Soc. Transl., 37 (1964)) is stated as follows. Let $K \ge \delta > -\infty$ be satisfied on $M$. For any geodesic triangle $(\gamma_0, \gamma_1, \gamma_2)$ on $M$ there is a geodesic triangle $(\tilde\gamma_0, \tilde\gamma_1, \tilde\gamma_2)$ on $M(\delta)$ such that $L(\tilde\gamma_i) = L(\gamma_i)$ and the angles $\tilde\theta_i$ of this triangle satisfy $\tilde\theta_i \le \theta_i$ for $i = 0, 1, 2$. It turns out that if $\delta > 0$ then the circumference of any geodesic triangle on $M$ does not exceed $2\pi/\sqrt{\delta}$.

For details of the basic facts stated above we refer the reader to [1-3].

B. Curvature and Fundamental Groups

The fundamental group $\pi_1(M)$ of $M$ (→ 170 Fundamental Groups) is influenced by the curvature of $M$. A basic idea, going back to Hadamard (J. Math. Pures Appl., 4 (1898)) and Cartan, states that every nontrivial free homotopy class $\delta \in \pi_1(M)$ of loops on a compact $M$ contains a closed geodesic of minimal length whose preimage under the covering projection $\pi : \widetilde{M} \to M$ in the universal Riemannian covering $\widetilde{M}$ is either closed (when $\delta$ is of finite order) or a straight line (when $\delta$ is of infinite order) that is translated along itself by the deck transformation $\delta$.

(1) By using the second variational formula, an even-dimensional compact $M$ with $K > 0$
either is simply connected or has $\pi_1(M) = \mathbf{Z}_2$ (Synge, Quart. J. Math., 7 (1936)). A beautiful result of Myers (Duke Math. J., 8 (1941)) states that if the Ricci curvature Ric of $M$ is bounded below by $\delta > 0$, then the diameter $d(M)$ is not greater than $\pi/\sqrt{\delta}$, and hence $M$ is compact and $\pi_1(M)$ is finite. It is proved in [3] that if $K \ge 0$, then there exists a finite normal subgroup $\Phi$ of $\pi_1(M)$ such that $\pi_1(M)/\Phi$ is a Bieberbach group (→ 92 Crystallographic Groups). The splitting theorem (→ Section F(2)) is used in the proof.

(2) When $K \le 0$ is satisfied on $M$, $\widetilde{M}$ is diffeomorphic to $\mathbf{R}^m$, and hence $\pi_i(M) = 0$ for all $i \ge 2$. A basic tool used in the study of $\pi_1(M)$ in this case is that the displacement function $x \to d(x, \delta(x))$, $x \in \widetilde{M}$, of every isometry $\delta$ on $\widetilde{M}$ is convex [4], where a function on $M$ is said to be convex if and only if its restriction to every geodesic is convex. A classical result of Preissmann (Comment. Math. Helv., 15 (1943)) states that every nontrivial Abelian subgroup of $\pi_1(M)$ of a compact $M$ with $K < 0$ is an infinite cyclic subgroup. It is proved in [4] that if $K < 0$ holds on $M$ and if every deck transformation on $\widetilde{M}$ translates some geodesic along itself, then $\pi_1(M)$ is a disjoint union of infinite cyclic subgroups, and any two commuting elements belong to the same cyclic subgroup. Moreover, if $M$ is compact and $K \le 0$ and if $\pi_1(M)$ is a direct product $\Gamma_1 \times \Gamma_2$ such that $\pi_1(M)$ is centerless, then $M$ is isometric to the Riemannian product $M_1 \times M_2$, and $\pi_1(M_i) = \Gamma_i$ holds for $i = 1, 2$ [5, 6]. It is shown in [7] that the fundamental group of $M$ with $K \le 0$ occurs as that of an $(m + 1)$-dimensional $M'$ with $K \le c < 0$, which is diffeomorphic to $M \times \mathbf{R}$. The warped product [4] is used to construct such a metric on $M \times \mathbf{R}$.

C. Cut Locus

The injectivity radius $i(M)$ of $M$ is defined to be the infimum of the continuous function $x \to d(x, C(x))$, $x \in M$. As is seen in Section D, the estimate of injectivity radii provides many fruitful results on the topology of Riemannian manifolds. It follows from Synge's result that an even dimensional compact and orientable $M$ with $0 < K \le 1$ has its injectivity radius $i(M) \ge \pi$ [3]. However, such an estimate cannot be obtained in odd dimensions. The examples discussed in [8] show that there are infinitely many homotopically distinct homogeneous Riemannian 7-manifolds of positive curvature. The injectivity radii of such examples are estimated precisely in [9], according to which there is no positive lower bound for them. The sphere theorem of Klingenberg (Comment. Math. Helv., 35 (1961)) states that if $M$ is compact and simply connected and if $1/4 < K \le 1$, then $i(M) \ge \pi$, and $M$ is a topological sphere. This extends the pioneering work by H. Rauch (Ann. Math., (2) 54 (1951)). A rigidity theorem of M. Berger (Ann. Scuola Norm. Sup. Pisa, 14 (1962)) states that if $\dim M$ is even, $M$ is simply connected, and $1/4 \le K \le 1$, then $M$ is homeomorphic to $S^m$ (if $d(M) > \pi$) or isometric to one of the symmetric spaces of compact type of rank 1 (if $d(M) = \pi$). A slight generalization of the triangle comparison theorem is used to prove that for given positive constants $d$, $V$, and $S$, there exists a constant $C_m(d, V, S) > 0$ such that if $K$, $d(M)$ and the volume $v(M)$ satisfy $|K| < S$, $d(M) < d$, and $v(M) > V$, then $i(M) > C_m(d, V, S)$ [10].

D. Finiteness Theorem

Finiteness theorems are a natural extension of the sphere theorem. They provide a priori estimates for the number of various topological types of manifolds (for instance, homology, homotopy, homeomorphism, diffeomorphism types, etc.), which admit certain classes of Riemannian metrics characterized by geometric quantities. The basic idea of the estimates is to find a constant $c > 0$ that depends only on geometric information which characterizes the class of manifolds so that if $M$ belongs to the class, then $i(M) > c$. Then a number $N$ is found from the information given a priori such that every element $M$ of the class has an open cover of at most $N$ balls whose radii are all less than $c$. It is proved in [11] that for given $\delta \in (0, 1)$ and $m$, there exists a number $N(\delta, m)$ such that there are at most $N(\delta, m)$ homotopy types for the class of all simply connected $2m$-dimensional manifolds with $\delta \le K \le 1$. Furthermore, for given positive numbers $d$, $V$, and $S$, there are at most finitely many homeomorphism (or diffeomorphism) types for the class of all $m$-manifolds $M$, each of which has the property that $d(M) < d$, $v(M) > V$, and $|K| < S$ [10]. This result is applied to obtain a result in [12], which states that there exists for given $m$, $V > 0$ an $\varepsilon > 0$ such that if $M$ is compact, $-1 - \varepsilon < K < -1$, and $v(M) > V$, then $M$ admits a metric of constant negative curvature.

For an $\varepsilon \ge 0$, $M$ is said to be $\varepsilon$-flat if and only if $\sup_M |K| \cdot d(M)^2 \le \varepsilon$ is satisfied for $M$. Every compact flat manifold is $\varepsilon$-flat for all $\varepsilon \ge 0$. $M$ is called a nilmanifold if and only if it admits a transitive action of a nilpotent Lie group. It is shown in [13] that for a compact $M$ with $\dim M = m$, there exists a number $\varepsilon(m) > 0$ such that if $M$ is $\varepsilon(m)$-flat, then (1) there is a maximal nilpotent normal divisor $N \subset \pi_1(M)$, (2)
the order of $\pi_1(M)/N$ is bounded by a constant which depends only on $m$, and (3) the finite cover of $M$ which corresponds to $N$ is diffeomorphic to a nilmanifold. Moreover, if $M$ is $\varepsilon(m)$-flat then $\widetilde{M}$ is diffeomorphic to $\mathbf{R}^m$. If $M$ is $\varepsilon(m)$-flat and if $\pi_1(M)$ is commutative, then $M$ is diffeomorphic to a torus.

E. Uniqueness Theorem

Uniqueness of topological structures (as in the sphere theorem) of certain classes of compact Riemannian manifolds is discussed here.

The discovery of exotic 7-spheres by J. Milnor (Ann. Math., (2) 64 (1956)) gave rise to the question of whether in the sphere theorem the conclusion (homeomorphism to the sphere) could be replaced by diffeomorphism. Since the number of differentiable structures on a topological sphere depends on its dimension (→ 114 Differential Topology), it has been thought that in order to get the standard sphere in the sphere theorem the best possible restriction for the curvature might also depend upon dimension. The appropriate differentiable pinching problem is to find a sequence $\{A_m\}$ so that if $M$ is compact and simply connected and if $\delta \le K \le 1$ for some $\delta > A_m$, then $M$ is diffeomorphic to $S^m$, and $A_m$ is the least possible with this property. D. Gromoll and Y. Shikata proved independently [14, 15] that $A_m < 1$ holds for all $m \ge 2$. Later it was proved that there exists a $\delta_0 \in (1/4, 1)$ such that $A_m \le \delta_0$ holds for all $m \ge 2$ [16]. The diffeotopy theorem, which plays an essential role in the proof of finding such a $\delta_0$, provides a sufficient condition for a diffeomorphism on $S^{m-1}$ to be isotopic to the identity mapping.

A different idea is put forth in [17], which imitates the Gauss normal mapping of a closed convex hypersurface in $\mathbf{R}^{m+1}$. It is proved that if the curvature operator is sufficiently close to the identity, then $M$ is diffeomorphic to $S^m$. This idea is used to obtain a diffeomorphism between $M$ and a spherical space form. A generalization of the sphere theorem states that if $M$ is compact and if $(d(M)/\pi)^2 \inf_M K > 1/4$, then $M$ is a topological sphere [18].

F. Noncompact Manifolds

Let $M$ be noncompact. It is due to the nature of noncompactness that through each point on $M$ there passes a ray $\gamma : [0, \infty) \to M$, i.e., $\gamma$ is a unit speed geodesic any of whose subarcs is minimizing.

(1) Busemann Functions. A Busemann function $F_\gamma : M \to \mathbf{R}$ with respect to a ray $\gamma$ is defined by $F_\gamma(x) = \lim_{t \to \infty} [t - d(x, \gamma(t))]$. The original definition of it was used by H. Busemann to define a parallel axiom on straight G-spaces (→ Section H). $F_\gamma$ has the property $F_\gamma^{-1}((-\infty, t_1]) = \{ x \in F_\gamma^{-1}((-\infty, t_2]) \mid d(x, \partial F_\gamma^{-1}((-\infty, t_2])) \ge t_2 - t_1 \}$ for any $t_2 > t_1$, and hence $F_\gamma(x) = t_2 - d(x, \partial F_\gamma^{-1}((-\infty, t_2]))$ for all $t_2$ and $x$ with $F_\gamma(x) < t_2$. It has been proved, by using the second variation formula, that if $\mathrm{Ric} \ge 0$, $K \ge 0$ [19] or if, in the case where $M$ is Kählerian, the holomorphic bisectional curvature [20] is nonnegative, then $F_\gamma$ is subharmonic (sh), convex or plurisubharmonic (psh), respectively, where the holomorphic bisectional curvature is defined as $R(X, JX, JY, Y)$ for the complex structure $J$ and orthonormal vectors $X$ and $Y$. If the holomorphic bisectional curvature [20] is positive, then $F_\gamma$ is strictly psh. If $K \ge 0$, then the triangle comparison theorem implies that $F = \sup \{ F_\gamma \mid \gamma(0) = p \}$ is convex and is an exhaustion [21], where the sup is taken over all rays emanating from a point $p \in M$, and a function $f$ is said to be an exhaustion if and only if $f^{-1}((-\infty, a])$ is compact for all $a \in \mathbf{R}$. A well-known theorem of H. Grauert (Math. Ann., 140 (1960)) states that if $M$ admits a strictly psh exhaustion function, then $M$ is a Stein manifold (→ 21 Analytic Functions of Several Complex Variables). In this context, various conditions for curvatures on Kähler manifolds under which they become Stein manifolds have been found [20].

Let $H$ be complete and simply connected, and $K \le 0$. Busemann functions on $H$ are differentiable of class $C^2$, and the horosphere $F_\gamma^{-1}(\{t\})$ is a $C^2$-surface [22]. On a parabolic visibility manifold (→ Section F(3)) the negative of every Busemann function is $C^1$-convex without minimum [7].

(2) Ends and Splitting Theorems. Ends of a noncompact $M$ are defined as follows: If $A_1$ and $A_2$ are compact subsets of $M$ with $A_1 \subset A_2$, then any component of $M - A_2$ is contained in a unique component of $M - A_1$. An end is the limit of an inverse system {components of $M - A$; $A$} directed by the inclusion relation as indicated above and indexed by $\{A \mid A \text{ compact}\}$.

It has been shown that if $\mathrm{Ric} > 0$ [23] (or $\ge 0$ [19]), then $M$ has exactly one end (or at most two ends). A visibility manifold has at most two ends if it is not Fuchsian [7]. If $M$ admits a locally nonconstant convex function, then $M$ has at most two ends [24].

If $M$ has more than one end, then there exists a straight line $\gamma : \mathbf{R} \to M$ which is by definition a nontrivial geodesic any of whose subarcs is minimizing. A classical result of Cohn-Vossen (Mat. Sb., 1 (1936)) states that if $\dim M = 2$, $K \ge 0$ and if $M$ contains a straight
line, then the total curvature is 0 and hence $M$ is isometric to either a plane or a cylinder $S^1 \times \mathbf{R}$. Generalizations of this result state that if $M$ admits $k$ independent straight lines and if $K \ge 0$ (Toponogov, Amer. Math. Soc. Transl., 37 (1964)) (or if $\mathrm{Ric} \ge 0$ [19]), then $M$ is isometric to the Riemannian product $N \times \mathbf{R}^k$.

(3) Structure Theorems. The structure theorem of 2-dimensional $M$ with $K \ge 0$ was proved by Cohn-Vossen (Compositio Math., 2 (1935)). It has been shown that $M$ is diffeomorphic to $\mathbf{R}^m$ if $K > 0$ [23] and that if $K \ge 0$, then there exists a compact totally geodesic submanifold $S$ without boundary such that $M$ is homeomorphic to the total space of the normal bundle over $S$ [21]. Here, homeomorphism can be replaced by diffeomorphism, and if $K > 0$ outside a compact set of $M$, then $M$ is diffeomorphic to $\mathbf{R}^m$ [26].

For a ray $\gamma$ and a point $x$ on $H$, if $\sigma$ is a ray with $\sigma(0) = x$ and if $d(\gamma(t), \sigma(t))$, $t \ge 0$, is bounded above, then $\sigma$ is called an asymptotic ray (a ray asymptotic) to $\gamma$. The asymptotic relation is an equivalence relation on the set of all geodesics on $H$. A point at infinity of $H$ is an equivalence class on geodesics on $H$, and the set of all points at infinity of $H$ is denoted by $H(\infty)$. With a suitable topology, the set of all points at infinity of $H$ constitutes a bounding sphere such that $\overline{H} = H \cup H(\infty)$ is a closed $m$-cell. For a properly discontinuous group $D$ of isometries acting on $H$, a closed $D$-invariant limit set $L(D)$ is obtained in $H(\infty)$ as the set of all accumulation points of $\delta(p)$, $\delta \in D$. $M = H/D$ is a visibility manifold if and only if $H$ satisfies the following axiom: any two distinct points on $H(\infty)$ can be joined by at least one geodesic. If $K \le c < 0$ is satisfied on $M$, then $M$ is a visibility manifold [7]. By investigating limit sets, visibility manifolds are classified into three types [7]. (1) $M$ is parabolic; $M$ is diffeomorphic to $N \times \mathbf{R}$ and is characterized by the fact that it has a convex function without minimum. (2) $M$ is axial; $M$ is a vector bundle over $S^1$, and hence it is diffeomorphic to either $S^1 \times \mathbf{R}^{m-1}$ or the product of a Möbius strip with $\mathbf{R}^{m-2}$. (3) $M$ is Fuchsian; $M$ has more than two ends, and a strong algebraic restriction is imposed on the fundamental group of $M$.

(4) Convex Functions. Some elementary properties of $C^\infty$ convex functions on $M$ have been investigated and it was proved in [4] that if $M$ admits a $C^\infty$ convex function without minimum and if $K \le 0$, then $M$ is diffeomorphic to $N \times \mathbf{R}$, where $N$ is a level hypersurface. When $K > 0$, $F$ can be replaced by a strongly convex exhaustion function. Namely, for every compact set $A \subset M$ there is a $\delta > 0$ such that the second difference quotient along every geodesic at any point on $A$ is bounded below by $\delta$ [20]. A standard convolution smoothing procedure yields smooth convex approximations in neighborhoods of compact sets for every strongly convex function such that their second derivatives along every geodesic are positive. Global approximations can be constructed [27]. Let $\varphi : M \to \mathbf{R}$ be a convex function which is not constant on any open set. It has recently been proved that (1) if $b > \inf_M \varphi > -\infty$, then there is a homeomorphism $H : \varphi^{-1}(\{b\}) \times (\inf_M \varphi, \infty) \to M - \{\text{minimum set of } \varphi, \text{ if any}\}$ such that $\varphi \circ H(y, a) = a$ for all $y \in \varphi^{-1}(\{b\})$ and for all $a \in (\inf_M \varphi, \infty)$; (2) if $\varphi$ takes a minimum, then the continuous extension $H : \varphi^{-1}(\{b\}) \times [\inf_M \varphi, \infty) \to M$ of $H$ is proper and surjective [24].

G. Manifolds All of Whose Geodesics Are Closed

As is well known, every symmetric space of compact type of rank 1 (it is abbreviated CROSS as in [28]) has the property that all geodesics are simply closed and of the same length. Let $M$ be a compact Riemannian manifold with the property that all geodesics on $M$ are simply closed and that they have the same length (SC property). The problem discussed here is whether such an $M$ is isometric (or at least the topology of such an $M$ is equivalent) to some CROSS, or if it is possible to classify all such manifolds.

An example of a nontrivial SC-manifold was first constructed by Zoll (Math. Ann., 57 (1903)) on $S^2$ as a surface of revolution in $\mathbf{R}^3$, which is not isometric to a standard sphere. Blaschke conjectured that every SC-structure on $P\mathbf{R}^2$ is the standard real projective space. This has been solved affirmatively by L. W. Green (Ann. Math., 78 (1963)). In higher dimensions it has been proved that every infinitesimal deformation of the standard SC-structure on $P\mathbf{R}^m$ is trivial [29]. A general result for a SC-manifold $M$ states that the volume of $M$ (with $\dim M = m$) with period $2\pi$ is the integral multiple of the volume of the standard unit $m$-sphere and this integer is a topological invariant [30]. For a point $p \in M$, if every geodesic segment with length $l$ emanating from $p$ is a simple geodesic loop at $p$, then $M$ is called a $SC^p$-manifold. It has been proved that the integral cohomology ring of every $SC^p$-manifold is isomorphic to one of the CROSS [31], which is a generalization of Bott's theorem (Ann. Math., 60 (1954)).

An essential difference between the metrics of a Zoll surface and a standard sphere is seen
on the cut locus and the conjugate locus of a point. The (tangent) cut locus of any fixed point of a CROSS coincides with the (tangent) first conjugate locus; however, this does not hold on a Zoll surface. This observation gives rise to the definition of a Blaschke manifold: For distinct points $p, q \in M$, let $A(p, q) \subset M_q$ be the set of all unit vectors tangent to the minimizing geodesics from $q$ to $p$. $M$ is said to be a Blaschke manifold at a point $p$ if for every $q \in C(p)$, $A(p, q)$ is a great sphere of the unit hypersphere in $M_q$ centered at the origin. $M$ is called a Blaschke manifold if it is so at every point of it. The following statements are equivalent [28]: (1) $M$ is a Blaschke manifold at $p$, (2) the tangent cut locus $C_p$ at $p$ is spherical in $M_p$, (3) along every geodesic starting from $p$, the first conjugate point to $p$ appears at a constant length and the multiplicity of it is independent of the initial direction. Then such an $M$ can be exhibited as $D \cup_a E$, where $D$ is a closed ball with center $p$, $E$ a $C^\infty$-closed $k$-disk bundle over an $(m - k)$-dimensional $C^\infty$-compact manifold with boundary $\partial E$ diffeomorphic to $S^{m-1}$, and $a : \partial D \to \partial E$ an attaching diffeomorphism. Conversely, if $M$ is exhibited as $D \cup_a E$ then there exists a metric $g$ on $M$ such that $(M, g)$ is a pointed Blaschke manifold at $p$, where $p$ is the center of $D$. A generalized Blaschke's conjecture states that every Blaschke manifold is isometric to a CROSS. A partial solution to this conjecture has been obtained by M. Berger and states that if $(S^m, g)$ is a Blaschke manifold, then it is isometric to a standard sphere [28].

H. G-Spaces

G-spaces were created by Busemann to show that many global theorems of differential geometry are independent of the Riemannian character of the metric and also of smoothness. This approach naturally led to novel questions and results. (Many facts hold even when the distance is not symmetric; → [34].) The symbol G suggests that the principal property of G-spaces is the existence of geodesics with the same properties as in complete Riemannian manifolds without boundary except for differentiability. A G-space $X$ is defined as follows: (1) $X$ is metric with (symmetric) distance $d$; (2) every bounded infinite set has at least one point of accumulation; (3) given two distinct points $p, r \in X$, there is a point $q \in X$ such that $p \ne q \ne r$ and $d(p, q) + d(q, r) = d(p, r)$; (4) to each point $x \in X$ a positive number $\rho_x$ is assigned such that for any $p, q \in X$ with $d(p, x) < \rho_x$, $d(q, x) < \rho_x$ there is a point $r$ with $d(p, q) + d(q, r) = d(p, r)$; (5) if $d(p, q) + d(q, r) = d(p, r)$ and if $d(p, q) + d(q, r') = d(p, r')$, then $d(q, r) = d(q, r')$ implies $r = r'$.

From (2) and (3) it follows that any two points on $X$ can be joined by a curve called a segment whose length realizes the distance between them. A geodesic arc (or for simplicity geodesic) $\gamma : [a, b] \to X$ is a curve such that any subarc contained in a $\rho_x$-ball around some point $x \in X$ is a segment. From (4) and (5) it follows that every geodesic arc has a unique infinite extension to both sides. If a geodesic arc is extended infinitely to both sides, then it is called a geodesic line.

The absence of smoothness causes an essential difference in the fact that the distance function to a fixed point on $X$ is not necessarily convex in a small ball around it, in contrast to the Whitehead theorem (→ Section A) for Riemannian manifolds (or Finsler spaces). Here a function on $X$ is called convex if and only if its restriction to every arc length-parametrized geodesic line is convex.

Every two points on $X$ have neighborhoods which are homeomorphic to each other. Thus the topological dimension of $X$ (→ 117 Dimension Theory) is well defined. A 1-dimensional $X$ is either a circle $S^1$ or the whole real line. If $\dim X = 2$ [33] or if $\dim X = 3$ [36], then $X$ is a topological manifold.

It should be noted that G-space theory not only proves known Riemannian theorems under weaker conditions, but also leads to many facts which were either not considered previously or well understood in (or thought to be very different from) the Riemannian case. Among many fruitful topics discussed by Busemann [33], only two are included below.

(1) G-Surfaces. For each point $p$ on a 2-dimensional G-space $\mathcal{S}$, which is called a G-surface, let $S_\rho = \{ x \in \mathcal{S} \mid d(x, p) = \rho \}$, where $0 < \rho < \rho_p/4$ is a fixed number. $S_\rho$ is homeomorphic to a circle and bounds a 2-disk. An angle $A$ at $p$ is the set of all segments emanating from $p$ and passing through the points on a connected subarc of $S_\rho$. An angular measure $|\cdot|$ for the angles at $p$ is a nonnegative function which satisfies: (1) $|A| = \pi$ if and only if $p$ is the midpoint of the segment joining two points on $S_\rho$ which bounds $A$; (2) if the intersection of two angles $A_1$ and $A_2$ is a unique segment, then $|A_1 \cup A_2| = |A_1| + |A_2|$. If a triple of points $(p_0, p_1, p_2)$ on $\mathcal{S}$ are contained in a sufficiently small ball, then the three segments joining them define a triangle, and each vertex $p_i$ has the angle $|p_{i-1} p_i p_{i+1}|$ determined by the two segments. An excess $\varepsilon(p_0 p_1 p_2)$ of a triangle $(p_0, p_1, p_2)$ is defined to be $\varepsilon(p_0 p_1 p_2) = |p_0 p_1 p_2| + |p_1 p_2 p_0| + |p_2 p_0 p_1| - \pi$. A degenerate triangle has excess 0. If a triangle $(a, b, c)$, which consists of geodesic arcs, is simplicially decomposed
into a tïnite number of triangles (a,, bi, ci), i= triangle on 3. Thus x is contractible. Every
1, . . . , k, then +bc) = xf=, c(aibici), and this nontrivial element of the homotopy class of
is independent of the choice of finite decompo- loops on X with base point x6X contains a
sition. If 9 is compact and orientable, the geodesic loop at x which has the minimum
total excess E(Y) of 9 is detïned by E(Y) = length. This fact and the strict convexity of
xi=, E(aibici), where Y is simplicially decom- t*d(cr(t), y(t)) for any two arc length para-
posed into nondegenerate tïnite triangles metrized geodesic lines y, (T on the universal
(ai, bi,ci), i= 1, . . . . k. Then &(9)=2n~(y), covering ri of X with negative curvature imply
where ~(9) is the +Euler characteristic of 9. that if X is compact, then every Abelian sub-
The results obtained by Cohn-Vossen [35] on group of the fundamental group ni(X) of X is
the ttotal curvature of complete open surfaces infinite cyclic. This is a generalization of the
have also been generalized to G-surfaces with Preissmann theorem (- Section B).
angular measure uniform at n. Although no
angular measure leading to a true analog of
the +Gauss-Bonnet theorem exists even in G-
References
surfaces [34], many of its applications, in
particular the results by Cohn-Vossen, cari be
proved by using angular measures which are [ 1] A. D. Alexandrov, Die innere Geometrie
uniform at 7c [33]. der konvexen Flachen, Akademie-Verlag,
If every two points on X cari be joined by a 1955.
unique geodesic line, then all geodesic lines on [2] J. Cheeger and D. Ebin, Comparison
X are simultaneously either straight lines or theorems in Riemannian geometry, North-
circles of the same length 1. If in the latter case Holland, 1975.
dim X > 1, then X has a two-fold universal [3] D. Gromoll, W. Klingenberg, and W.
covering space r? whose geodesic lines are a11 Meyer, Riemannsche Geometrie in Grossen,
circles of the same length 21, and they have the Lecture notes in math. 55, Springer, 1968.
property that a11 geodesics through a point on [4] R. Bishop and B. O’Neill, Manifolds of
2 meet at the same point at length 1. For the negative curvature, Trans. Amer. Math. Soc.,
Riemannian case it is a Blaschke manifold (- 145 (1969), l-49.
Section G). If every two points on Y cari be [S] D. Gromoll and J. Wolf, Some relations
joined by a unique geodesic line, then Y is between ihe metric structure and the algebraic
either a plane or a projective plane. structure of the fundamental group in mani-
folds of nonpositive curvature, Bull. Amer.
(2) G-Spaces with Nonpositive Curvature. Let Math. Soc., 77 (1971), 545-552.
$(p_0, p_1, p_2)$ be a triple of points on X which are [6] B. Lawson and S. T. Yau, On compact
contained in a small ball. X, by definition, is of manifolds of nonnegative curvature, J. Differ-
nonpositive curvature if and only if for any such ential Geometry, 7 (1972), 211-228.
triple of points [7] P. Eberlein and B. O’Neill, Visibility mani-
folds, Pacific J. Math., 46 (1973), 45-109.
$d(p_i, p_{i+1}) \ge 2d(p'_i, p'_{i+1})$ for $i = 0, 1, 2$ (indices modulo 3), (*)
[8] N. Wallach, Homogeneous Riemannian
where $p'_i$ is the midpoint of the unique segment manifolds of strictly positive curvature, Ann.
joining $p_{i-1}$ to $p_{i+1}$. If the above inequality is Math., (2) 75 (1972), 277-295.
strict for any nondegenerate small triangle, [9] T. Sakai, Cut loci of Berger’s sphere, Hok-
then X is said to be of negative curvature. X of kaido Math. J., 10 (1981), 143-155.
curvature 0 is defined in the same way. It [10] J. Cheeger, Finiteness theorems for
should be noted that a G-space of curvature 0 Riemannian manifolds, Amer. J. Math., 92
is a locally Minkowskian space [33]. The (1970), 61-74.
sectional curvature of a Riemannian manifold [11] A. Weinstein, On the homotopy type of
M is nonpositive if and only if (*) holds for all positively pinched manifolds, Arch. Math., 18
small triangles on it. Because of $d(p_0, p'_0) \le$ (1967), 523-524.
$d(p_0, p'_1) + d(p'_1, p'_0) \le (d(p_0, p_1) + d(p_0, p_2))/2$, [12] M. Gromov, Manifolds of negative curva-
the distance function to a point x ∈ X of non- ture, J. Differential Geometry, 13 (1978), 223-
positive curvature is convex on a small ball 230.
around x. X is called straight if and only if all [13] M. Gromov, Almost flat manifolds, J.
nontrivial geodesic lines on X are straight Differential Geometry, 13 (1978), 231-241.
lines. If X has nonpositive curvature, then [14] D. Gromoll, Differenzierbare Struktur
the universal covering space $\tilde{X}$ is straight. und Metriken positiver Krümmung auf
Moreover, for any two geodesic lines γ, σ : R → Sphären, Math. Ann., 164 (1966), 353-371.
X, the function $t \mapsto d(\sigma(t), \gamma(t))$ is convex. In [15] Y. Shikata, On the differentiable pinching
particular, the distance function to every fixed problem, Osaka J. Math., 4 (1967), 279-287.
point on $\tilde{X}$ is convex, and (*) holds for any [16] M. Sugimoto and K. Shiohama, On the

differentiable pinching problem, Math. Ann., 179 (VI.5)


195 (1971), 1-16.
[17] E. Ruh, Curvature and differentiable
Geometric Construction
structures on spheres, Comment Math. Helv.,
46 (1971), 127-136. A. General Remarks
[18] K. Grove and K. Shiohama, A gen-
eralized sphere theorem, Ann. Math., (2) 106 A geometric construction problem is a problem
(1977), 201-211. of drawing a figure satisfying given conditions
[19] J. Cheeger and D. Gromoll, The splitting by using certain prescribed tools only a finite
theorem for manifolds of nonnegative Ricci number of times. If the problem is solvable,
curvature, J. Differential Geometry, 6 (1971), then it is called a possible construction problem;
119-129. if it is unsolvable, even though there exist
[20] R. Greene and H. Wu, Function theory figures satisfying the given conditions, then it
on manifolds which possess a pole, Lecture is an impossible construction problem. If there
notes in math. 699, Springer, 1979. does not exist a figure satisfying the given
[21] J. Cheeger and D. Gromoll, On the struc- conditions, then we say that the problem is
ture of complete manifolds of nonnegative inconsistent.
curvature, Ann. Math., (2) 96 (1972), 413-443. Among problems of geometric construction,
[22] H. Im Hof and E. Heintze, Geometry of the oldest and the best known are those of
horospheres, J. Differential Geometry, 12 constructing plane figures by means of ruler
(1977), 481-492. and compass. In this article, we call these
[23] D. Gromoll and W. Meyer, On complete problems simply problems of elementary geo-
open manifolds of positive curvature, Ann. metric construction. The following are some of
Math., (2) 90 (1969), 75-90. the more famous problems of this kind; the
[24] R. Greene and K. Shiohama, Convex first four are possible construction problems.
functions on complete noncompact manifolds; (1) Suppose that we are given three straight
Topological structures, Inventiones Math., 63 lines l, m, n and three points P, Q, R in a plane.
(1981), 129-157. Draw a triangle ABC in such a way that ver-
[25] W. Poor, Some results on nonnegatively tices A, B, C lie on l, m, n and sides BC, CA,
curved manifolds, J. Differential Geometry, 9 AB pass through P, Q, R (Steiner’s problem).
(1974), 583-600. (2) Suppose that we are given a circle O and
[26] H. Wu, An elementary proof in the study three points P, Q, R not lying on O. Draw a
of nonnegative curvature, Acta Math., 142 triangle ABC inscribed in O in such a way that
(1979), 57-78. the sides BC, CA, AB pass through P, Q, R
[27] R. Greene and H. Wu, $C^\infty$ convex func- (Cramer-Castillon problem).
tions and manifolds of positive curvature, Acta (3) Draw a circle tangent to all three
Math., 137 (1976), 209-245. given circles (Apollonius’ problem).
[28] A. Besse, Manifolds all of whose geo- (4) Suppose that we are given a triangle.
desics are closed, Springer, 1978. Draw three circles inside this triangle in such a
[29] R. Michel, Problèmes d’analyse géomé- way that each is tangent to two sides of the
trique liés à la conjecture de Blaschke, Bull. triangle and any two of the circles are tangent
Soc. Math. France, 101 (1973), 17-69. to each other (Malfatti’s problem).
[30] A. Weinstein, On the volume of manifolds (5) Let n be a natural number. For the divi-
whose geodesics are all closed, J. Differential sion of the circumference of a circle into n
Geometry, 9 (1974), 513-517. equal parts (consequently, the construction of
[31] H. Nakagawa, A note on theorems of a regular n-gon) to be a possible construction
Bott and Samelson, J. Math. Kyoto Univ., 7 problem, it is necessary and sufficient that
(1967), 205-220. the representation of n as a product of prime
[32] C. T. Yang, Odd-dimensional Wieder- numbers take the form $n = 2^{\lambda} p_1 \cdots p_k$, where
sehen manifolds are spheres, J. Differential $\lambda \ge 0$ and $p_1, \ldots, p_k$ are all different prime numbers
Geometry, 15 (1980), 91-96. of the form $2^h + 1$ (Fermat number) (C. F.
[33] H. Busemann, The geometry of geodesics, Gauss, 1801).
Academic Press, 1955. (6) The following are three famous impos-
[34] H. Busemann, Recent synthetic differen- sible construction problems of Greek origin: (i)
tial geometry, Springer, 1970. divide a given angle into three equal parts
[35] S. Cohn-Vossen, Kürzeste Wege und (trisection of an angle); (ii) construct a cube
Totalkrümmung auf Flachen, Compositio whose volume is double that of a given cube
Math., 2 (1935), 63-133. (duplication of a cube or the Delos problem);
[36] B. Krakus, Any 3-dimensional G-space is and (iii) construct a square whose area is that
a manifold, Bull. Acad. Pol. Sci., 16 (1968), of a given circle (quadrature of a circle). P. L.
737-740. Wantzel (1837) proved that problems (i) and

(ii) are impossible except for the special cases the length of a segment we can draw by ruler
of π/2, π/4, etc. for (i); and C. L. F. Lindemann is required to satisfy certain conditions. Also,
(1882) proved the impossibility of (iii) while various considerations have been made con-
proving that the number π is transcendental. cerning cases in which we can use tools other
than ruler and compass. For example, it is
known that although not all possible elemen-
B. Conditions for Constructibility tary geometric construction problems are
solvable by ruler alone, all these problems are
A problem of elementary geometric construc- possible if we have either a pair of parallel
tion amounts to a problem of determining a rulers, a square, or a triangle having a fixed
certain number of points by drawing straight acute angle. If we use a square and a compass,
lines that pass through given pairs of points, then the trisection of an angle and the dupli-
and circles having given points as centers and cation of a cube are possible (L. Bieberbach).
passing through given points. Let $(a_1, b_1)$, Also, when a conic section other than a circle
$(a_2, b_2), \ldots, (a_n, b_n)$ be rectangular coordinates is given, the trisection of an angle and the
of given points, and let K be the smallest duplication of a cube become possible by ruler
number field containing the numbers $a_1$, and compass (H. J. S. Smith and H. Kortum).
$\ldots, b_n$. Straight lines that join given pairs of By ruler and transferrer of constant lengths,
points and circles that have given points as we can solve Malfatti’s problem but not Apol-
centers and that pass through given points are lonius’ problem (Feldblum).
represented by equations of the first or second Even when a problem is possible, the
degree with coefficients belonging to K. Con- method of construction may be rather com-
sequently, the coordinates of points of intersec- plicated and impractical. In these cases, vari-
tion of these straight lines and circles be- ous methods of highly accurate approximate
long to a quadratic extension $K' = K(\sqrt{d})$ of construction have been investigated.
K. Let A be the set of coordinates of the points
that are to be determined. Then the problem is
References
solvable if and only if any number α in A is
contained in a field $L = K(\sqrt{d_1}, \sqrt{d_2}, \ldots, \sqrt{d_r})$,
[1] H. L. Lebesgue, Leçons sur les construc-
where $d_{i+1} \in K(\sqrt{d_1}, \ldots, \sqrt{d_i})$ $(i = 0, 1, \ldots, r-1)$.
tions géométriques, Gauthier-Villars, 1950.
Thus L is a normal extension field of K whose
[2] L. Bieberbach, Theorie der geometrischen
degree over K is a power of 2. Using this
Konstruktionen, Birkhäuser, 1952.
theorem we can prove the impossibility of
trisection of an angle and duplication of a
cube.
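The degree criterion of this section, and Gauss's criterion for regular n-gons in item (5) of Section A, can be explored with a short Python sketch. It rests on two standard facts that are assumptions relative to this article: the trisection of the angle π/3 requires x = cos(π/9), which satisfies 8x³ − 6x − 1 = 0, and the duplication of the cube requires a root of x³ − 2 = 0. A cubic with integer coefficients and no rational root is irreducible over the rationals, so its roots have degree 3, which is not a power of 2. The function names below are illustrative only, and the n-gon test relies on the five Fermat primes known at present.

```python
from fractions import Fraction


def divisors(n):
    """Return the positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def has_rational_root(coeffs):
    """Rational root test for a polynomial with integer coefficients.

    coeffs lists the coefficients from the leading term down to the
    constant term; both ends are assumed to be nonzero.
    """
    lead, const = coeffs[0], coeffs[-1]
    for p in divisors(const):
        for q in divisors(lead):
            for cand in (Fraction(p, q), Fraction(-p, q)):
                value = Fraction(0)
                for c in coeffs:  # Horner evaluation at the candidate
                    value = value * cand + c
                if value == 0:
                    return True
    return False


def is_constructible_ngon(n):
    """Gauss's criterion: n = 2^k times a product of distinct Fermat primes."""
    if n < 3:
        return False
    while n % 2 == 0:
        n //= 2
    for p in (3, 5, 17, 257, 65537):  # the known Fermat primes (assumption)
        if n % p == 0:
            n //= p  # divide once only: the primes must be distinct
    return n == 1


if __name__ == "__main__":
    print(has_rational_root([8, 0, -6, -1]))  # cubic for trisecting pi/3
    print(has_rational_root([1, 0, 0, -2]))   # cubic for doubling the cube
    print([n for n in range(3, 21) if is_constructible_ngon(n)])
```

Both cubics report no rational root, which is consistent with the impossibility results quoted above; the last line lists the constructible regular n-gons with 3 ≤ n ≤ 20.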
Since the 18th century, besides the problem
of construction by ruler and compass, prob- 180 (XX. 15)
lems of construction by ruler alone or by
Geometric Optics
compass alone have also been studied. We
state here some of the more notable results:
(1) If by drawing a straight line we mean the A. General Remarks
process of finding two different points on that
line, then we can solve all the problems of Geometric optics is a mathematical theory of
elementary geometric construction by means light rays. It is not concerned with the prop-
of compass alone (G. Mohr, L. Mascheroni). erties of light rays as waves (e.g., their wave-
(2) If by drawing a circle we mean the process length and frequency), but studies their prop-
of finding its center and a point on its circum- erties as pencils of rays that follow three laws:
ference, and if a circle and its center are given, the law of rectilinear propagation, the law
then we can solve any problem of elementary of reflection (i.e., angles of incidence and re-
geometric construction by means of ruler flection on a smooth plane are equal (Euclid)),
alone. (3) It is not possible to find the center of and the law of refraction (i.e., if θ and θ’ are
a given circle by ruler alone (D. Hilbert). (4) It angles of incidence and refraction of a light
is impossible to bisect a given segment by ruler ray refracted from a uniform medium to a
alone. (5) When two intersecting circles or second uniform medium and if n, n’ are the
concentric circles are given, we can find the refractive indices of the first and the second
centers of these circles by ruler alone. When medium, respectively, then $n \sin\theta = n' \sin\theta'$ (R.
nonintersecting and nonconcentric circles are W. Snell, Descartes)). These three laws follow
given, it is not possible to find their centers by from Fermat’s principle, which states that the
ruler alone (D. Cauer). path of a light ray traveling from a point A’ to
Cases have been considered in which the A in a medium with refractive index n(P) at P
radius of a circle we can draw by compass or is such that the integral $\int_{A'}^{A} n(P)\,ds$ attains its

extremal value, where ds is the line element r denotes the distance from the center of the
along the path. This line integral is called system) and Luneburg’s lens ($n(r) = \sqrt{2 - r^2}$).
the optical distance from A’ to A. Therefore Perfect imaging conserves optical distance,
Fermat’s principle can be taken as a foun- yields the relation $n(A')\,ds' = n(A)\,ds$, and gives
dation of geometric optics and is, in a way, a conformal mapping, with the magnification
similar to the variational principle in particle inversely proportional to the refractive index.
dynamics (Maupertuis’s principle),

B. Gauss Mappings
$\delta \int \sqrt{2h - 2U}\, ds = 0$,
Consider an optical system with a symmetri-
which is satisfied by the path of a particle of
cal axis of rotation, its optical axis. A ray of
unit mass having constant energy h passing
light that is near the optical axis and has a
through a field of potential U(P). The quan-
small inclination to the axis is called a paraxial
tity $\sqrt{2h - 2U}$ corresponds to the refractive
ray. A mapping realizable by paraxial rays
index n.
where the canonical variables x, y, p, q can be
In an optical system, express the position
considered to be infinitesimal variables whose
of a point on the path of a light ray by ortho-
squares are negligible, is called a Gauss map-
gonal coordinates (x, y, z), and define the La-
ping (Gauss map). When the positions of an
grangian $L = n\sqrt{1 + \dot{x}^2 + \dot{y}^2}$ ($\dot{x} = dx/dz$, $\dot{y} =$
object point and its image under a Gauss
$dy/dz$), optical direction cosines $p = \partial L/\partial \dot{x}$, $q =$
mapping are represented by homogeneous co-
$\partial L/\partial \dot{y}$, and the Hamiltonian $H = \dot{x}p + \dot{y}q -$
ordinates, the mapping is represented as a
$L = -\sqrt{n^2 - p^2 - q^2}$. Then the canonical
linear transformation, i.e., a collineation,
equations of the path are obtained as in parti-
which maps a point to a point and a line to a
cle dynamics; x, y and p, q are called canon-
line. A point in one space corresponding to
ical variables. Due to the variational prin-
the point at infinity in the other space is called
ciple, the integral of the linear differential form
a focus. If we take a focus as the origin of a
$p\,dx + q\,dy - H\,dz = \omega$ along a light path is a
coordinate system in each space and use the
function S(A’, A) of the endpoints A’, A of the
homogeneous coordinates $x_i$ such that $x =$
path, and the optical direction cosines and the
$x_1/x_4$, $y = x_2/x_4$, $z = x_3/x_4$, then a Gauss
Hamiltonians of the system at A and A’ are
mapping can be represented as $x_1 = x'_1$,
given by
$x_2 = x'_2$, $x_3 = f x'_4$, $f' x_4 = x'_3$. The ratio of x to
$\partial S/\partial x' = -p'$, $\partial S/\partial y' = -q'$, $\partial S/\partial z' = H'$,
x’, i.e., the lateral magnification, is $x/x' = z/f = f'/z'$,
where x’ is the length of an object ortho-
gonal to the axis and x is the length of its
$\partial S/\partial x = p$, $\partial S/\partial y = q$, $\partial S/\partial z = -H$.
image. The distance f between a focus and a
point where the lateral magnification is 1 (such
a point is called a principal point) is called
Hence we obtain Hamilton-Jacobi differential
focal length in each space. The telescopic
equations
mapping, i.e., $x_1 = x'_1$, $x_2 = x'_2$, $x_3 = a x'_3$,
$x_4 = b x'_4$, is also a Gauss mapping, in which
the lateral magnification is constant.
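The display that originally followed the words "Hamilton-Jacobi differential equations" in Section A is not preserved in this copy. As a hedged reconstruction, assume the relations $\partial S/\partial x = p$, $\partial S/\partial y = q$, $\partial S/\partial z = -H$ at A (and the analogous relations at A’), together with $H = -\sqrt{n^2 - p^2 - q^2}$. Squaring the Hamiltonian then yields the eikonal form below. The symbol $n'$ for the refractive index at A’ is notation introduced here for illustration, not taken from the source.

```latex
% Sketch of the missing display, obtained from H^2 = n^2 - p^2 - q^2;
% a reconstruction under the assumptions stated above.
\[
\left(\frac{\partial S}{\partial x}\right)^{2}
+\left(\frac{\partial S}{\partial y}\right)^{2}
+\left(\frac{\partial S}{\partial z}\right)^{2} = n^{2},
\qquad
\left(\frac{\partial S}{\partial x'}\right)^{2}
+\left(\frac{\partial S}{\partial y'}\right)^{2}
+\left(\frac{\partial S}{\partial z'}\right)^{2} = n'^{2}.
\]
```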

C. Aberration

As a corollary to these relations we obtain When a mapping is realized not only by par-
Malus’s theorem, which states that a pencil of axial rays but also by rays having larger incli-
light rays perpendicular to a common surface nations, a departure from the Gauss mapping
(locally) at a given moment is also perpendic- arises. This departure is generally called aber-
ular to a common surface (locally) after an ration. Suppose that a light ray that passes
arbitrary number of reflections and refractions. through the point (x’, y’, z’) of a plane perpen-
Suppose that light rays travel from an ob- dicular to the optical axis at a fixed z’ and has
ject space into an image space through an optical direction cosines p’, q’ is transformed
optical apparatus. If all the rays starting from by the optical apparatus into a light ray that
any one point of the object space converge to passes through the point (x, y, z) of a plane
a point of the image space and if the mapping perpendicular to the optical axis at a fixed z
given by this correspondence is bijective, then and has optical direction cosines p, q there.
we say that this imaging is perfect. Examples Then by the variational principle, $p\,dx + q\,dy -$
of perfect imaging systems are realized by $p'\,dx' - q'\,dy' = dW$ ($dW$ is an exact differen-
optical apparatus such as Maxwell’s fisheye tial). Therefore the transformation (x’, y’, p’, q’)
(having refractive index $n(r) = a/(b + r^2)$, where → (x, y, p, q) is a canonical transformation. The

mapping can be described by historian Herodotus, who wrote that in an-
cient Egypt people used geometry to restore
$p = \partial W/\partial x$, $q = \partial W/\partial y$, their land after the inundation of the Nile.
Thus the theoretical use of figures for practical
$p' = -\partial W/\partial x'$, $q' = -\partial W/\partial y'$ purposes goes back to pre-Greek antiquity.
Tradition holds that Thales of Miletus knew
some properties of congruent triangles and
in terms of W and can also be represented in used them for indirect measurement, and that
terms of V = W + p’x’ + q’y’ or U = W + p’x’ + the Pythagoreans had the idea of systematiz-
q’y’ - px - qy. For a given optical system, ing this knowledge by means of proofs (→ 24
one of these functions W, U, V (called a char- Ancient Mathematics; 187 Greek Mathemat-
acteristic function or eikonal) can be used to ics). Euclid’s Elements is an outgrowth of this
estimate the aberration. By developing such a idea [1]. In this work, we can see the entire
characteristic function in power series of mathematical knowledge of the time presented
canonical variables and observing its terms as a logical system. It includes a chapter (Book
of less than the fifth power, we can single out V) on the theory of quantity (i.e., the theory of
five kinds of aberration: spherical aberration, positive real numbers in present-day termi-
curvature of image field, distortion, coma, and nology) and chapters on the theory of integers
astigmatism in a rotationally symmetric opti- (Books VII-IX), but for the most part, it treats
cal system. To eliminate these aberrations an figures in a plane or in space and presents
optical system must satisfy Abbe’s sine con- number-theoretic facts in geometric language.
dition (the elimination of spherical aberration Geometry in today’s usage means the
and coma), Petzval’s condition (the elimination branch of mathematics dealing with spatial
of astigmatism and curvature of image field), figures. In ancient Greece, however, all of
and the tangent condition (the elimination of mathematics was regarded as geometry. In
distortion). later times, the French word géomètre or the
The path of a charged particle in an electro- German word Geometer was sometimes used
magnetic field can be treated in the same way as a synonym for mathematician. In a fragment
as the path of a light ray. Let ε represent the of his Pensées, B. Pascal speaks of the esprit de
specific charge of the particle, h the energy, géométrie as opposed to the esprit de finesse.
$A_0$ the electrostatic potential, and $A_x$, $A_y$, $A_z$ The former means simply the mathematical
vector potentials. Then the index of refrac- way of thinking.
tion is $\sqrt{2(h - \varepsilon A_0)} + \varepsilon(A_x\,dx/ds + A_y\,dy/ds +$ Algebra was introduced into Europe from
$A_z\,dz/ds)$. In this case, the index of refraction the Middle East toward the end of the Middle
shows the anisotropy caused by the existence Ages and was further developed during the
of the magnetic field. The paths of paraxial Renaissance. In the 17th and the 18th cen-
rays are determined by a set of linear differ- turies, with the development of analysis, geom-
ential equations of the second order, and the etry achieved parity with algebra and analysis.
Gauss mapping is realized as in geometric As R. Descartes pointed out, however, fig-
optics. ures and numbers are closely related [2].
Geometric figures can be treated algebraically
or analytically by means of coordinates (the
References
method of analytic geometry, so named by
S. F. Lacroix [3]); conversely, algebraic or ana-
[1] C. Carathéodory, Geometrische Optik,
lytic facts can be expressed geometrically. Ana-
Springer, 1937.
lytic geometry was developed in the 18th cen-
[2] M. Herzberger, Modern geometrical op-
tury, especially by L. Euler [4], who for the
tics, Interscience, 1958.
first time established a complete algebraic
[3] R. K. Luneburg, Mathematical theory of
theory of curves of the second order. Previ-
optics, Univ. of California Press, 1964.
ously, these curves had been studied by Apol-
[4] O. N. Stavroudis, The optics of rays, wave-
lonius (262-200? B.C.) as conic sections. The
fronts, and caustics, Academic Press, 1972.
idea of Descartes was fundamental to the
development of analysis in the 18th century.
Toward the end of that century, analysis was
again applied to geometry. For example, G.
181 (VI.1) Monge’s contribution [5] can be regarded as a
forerunner of differential geometry.
Geometry However, we cannot say that the analytic
method is always the best manner of dealing
The Greek word for geometry, which means with geometric problems. The method of treat-
measurement of the earth, was used by the ing figures directly without using coordinates

is called synthetic (or pure) geometry. In this branches of mathematics, and it is sometimes
vein, a new field called projective geometry difficult to distinguish it from algebra or anal-
was created by G. Desargues and B. Pascal in ysis. The importance of geometric intuition,
the 17th century. It was further developed in however, has not diminished from antiquity
the 19th century by J.-V. Poncelet, L. N. Car- until today.
not, and others. In the same century, J. Steiner
insisted on the importance of this field (→ 343
Projective Geometry). References
On the other hand, the axiom of parallels
in Euclid’s Elements has been an object of [1] T. L. Heath, The thirteen books of Euclid’s
criticism since ancient times. In the 19th cen- elements, Cambridge, 1908.
tury, by denying the a priori validity of Eu- [2] R. Descartes, Géométrie, Paris, 1637
clidean geometry, J. Bolyai and N. I. Loba- (Oeuvres, IV, 1901).
chevskiï formulated non-Euclidean geometry, [3] S. F. Lacroix, Traité élémentaire de trigo-
whose logical consistency was shown by nométrie rectiligne et sphérique et d’applica-
models constructed in both Euclidean and tion de l’algèbre à la géométrie, Bachelier,
projective geometry (→ 285 Non-Euclidean 1798-1799.
Geometry). [4] L. Euler, Introductio in analysin infini-
torum, Lausanne, 1748 (Opera omnia VIII, IX,
planes, as we know them, are represented as 1922); French translation, Introduction à
3-dimensional or 2-dimensional Euclidean l’analyse infinitesimale I, II, Paris, 1835; Ger-
spaces $E^3$, $E^2$. It is easy to generalize these man translation, Einleitung in die Analysis des
spaces to n-dimensional Euclidean space $E^n$. A Unendlichen, Springer, 1885.
“point” of $E^n$ is an n-tuple of real numbers [5] G. Monge, Application de l’analyse à la
$(x_1, \ldots, x_n)$, and the distance between two points géométrie, fourth edition, Paris, 1809.
$(x_1, \ldots, x_n)$, $(y_1, \ldots, y_n)$ is [6] B. Riemann, Über die Hypothesen, welche
$((x_1 - y_1)^2 + \cdots + (x_n - y_n)^2)^{1/2}$. The geometries of $E^2$, $E^3$ are called der Geometrie zu Grunde liegen, Habilita-
plane geometry and space (or solid) geometry, tionsschrift, 1854 (Gesammelte mathematische
respectively. The geometry of $E^n$ is called n- Werke, Teubner, 1876, 254-269; Dover, 1953).
dimensional Euclidean geometry. We obtain [7] F. Klein, Vergleichende Betrachtungen
n-dimensional projective or non-Euclidean über neuere geometrische Forschungen, Das
geometries similarly. F. Klein [7] proposed Erlanger Programm, 1872 (Gesammelte math-
systematizing all these geometries in group- ematische Abhandlungen I, Springer, 1921,
theoretic terms. He called a “space” a set S on 460-497).
which a group G operates and a “geometry”
the study of properties of S invariant under the
operations of G (- 137 Erlangen Program).
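Klein's viewpoint can be made concrete with the distance formula for n-dimensional Euclidean space given above. The Python sketch below is an illustration under assumptions: it takes the group G to be generated by rotations and translations of the plane, and checks numerically that the Euclidean distance is invariant under them. The function names are chosen here for illustration and do not come from the source.

```python
import math


def distance(x, y):
    """Euclidean distance between two points given as equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("points must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def rotate_plane(point, angle):
    """Rotate a point of the plane about the origin by `angle` radians."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)


def translate(point, shift):
    """Translate a point by the vector `shift`."""
    return tuple(a + b for a, b in zip(point, shift))


if __name__ == "__main__":
    p, q = (0.0, 0.0), (3.0, 4.0)
    print(distance(p, q))
    # Apply the same motion (rotation followed by translation) to both points.
    moved_p = translate(rotate_plane(p, 0.7), (2.0, -1.0))
    moved_q = translate(rotate_plane(q, 0.7), (2.0, -1.0))
    print(math.isclose(distance(moved_p, moved_q), distance(p, q)))
```

A property such as distance that survives every operation of G belongs to the geometry in Klein's sense; a property such as the value of the first coordinate does not.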
B. Riemann [6] initiated another direction
of geometric research when he investigated n- 182 (V.10)
dimensional manifolds and, in particular, Geometry of Numbers
Riemannian manifolds and their geometries.
Some aspects of Riemannian geometry fall
outside of geometry in the sense of Klein. It A. History
was a starting point for the broad field of
modern differential geometry, that is, the geom- H. Minkowski introduced the notions of lat-
etry of differentiable manifolds of various tice and convex set in the algebraic theory of
types (→ 109 Differential Geometry). numbers. He developed a simple yet powerful
The reexamination of the system of axioms method of arithmetic investigation using these
of Euclid’s Elements led to D. Hilbert’s foun- geometric notions to simplify the analytic
dations of geometry and to the axiomatic ten- theory of Diophantine approximation, which
dency of present-day mathematics. The study had been developed by P. G. L. Dirichlet and
of algebraic curves, which started with the C. Hermite. His theory, the geometry of num-
study of conic sections, developed into the bers, has continued its development and con-
theory of algebraic manifolds, the algebraic tributed to various fields of mathematics (→
geometry that is now developing so rapidly (→ 83 Continued Fractions).
12 Algebraic Geometry). Another branch of
geometry is topology, which has developed
since the end of the 19th century. Its influence B. Lattices
on the whole of mathematics today is con-
siderable (→ 114 Differential Topology; 426 Let $E^n$ be an n-dimensional Euclidean space
Topology). Geometry has now permeated all identified with the linear space $R^n$. For a point

P in $E^n$, we denote the corresponding vector in C. Successive Minima and Minkowski’s


$R^n$ by $v(P) = {}^t(x_1, \ldots, x_n)$. A subset Λ of $E^n$ is Theorem
called an n-dimensional (homogeneous) lattice
if there exists a basis $\{v_1, \ldots, v_n\}$ of $R^n$ such that A subset S of the space $E^n$ is called a bounded
$\Lambda = \{P \in E^n \mid v(P) = \sum_{i=1}^n \lambda_i v_i,\ \lambda_i \in Z\}$. The set of star body (symmetric with respect to the
points $\{X_1, \ldots, X_n\}$ such that $v(X_i) = v_i$ ($i =$ origin) if there exists a continuous function F
$1, \ldots, n$) is called a basis of the lattice Λ. A defined on the space $E^n$ satisfying the following
typical example of a lattice is the point set four conditions: (i) $F(O) = 0$; (ii) if $X \ne O$, then
corresponding to $Z^n$ in $R^n$. The free module $F(X) > 0$; (iii) for an arbitrary real number t
generated by $v_i$ ($i = 1, \ldots, n$) is denoted by $\Lambda^*$ and a point X, we have $F(tX) = |t| F(X)$; (iv)
and is called the lattice group of Λ. We have $\Lambda^*$ $S = \{X \mid F(X) \le 1\}$. A bounded closed con-
$= \{v \in R^n \mid v = v(P),\ P \in \Lambda\}$. If $\{u_1, \ldots, u_n\}$ is an- vex body that is symmetric with respect to the
other basis of the free module $\Lambda^*$, then there origin is a bounded star body. If we are given
exists an element $(\alpha_{ij})$ of GL(n, Z) (i.e., $\alpha_{ij} \in Z$ a star body S, the associated function F, and a
and $|\det(\alpha_{ij})| = 1$) such that $u_i = \sum_{j=1}^n \alpha_{ij} v_j$. lattice Λ, there exist a set of points $\{P_1, \ldots, P_n\}$
Hence the quantity $|\det(v_1, \ldots, v_n)|$ is inde- in Λ and a set of positive numbers $\{\mu_1, \ldots, \mu_n\}$
pendent of the choice of the basis $\{v_1, \ldots, v_n\}$. satisfying the following four conditions: (1)
We denote this quantity by d(Λ) and call it the $v(P_1), \ldots, v(P_n)$ are linearly independent; (2)
determinant of the lattice. We denote the mini- $F(P_i) = \mu_i$ ($i = 1, \ldots, n$); (3) $\mu_1 \le \cdots \le \mu_n$; (4) if P is
mum distance between the points belonging to a point in Λ and v(P) is not contained in the
Λ by δ(Λ). subspace spanned by $\{v(P_1), \ldots, v(P_{i-1})\}$, then
A subset L of the space $E^n$ is called an in- $F(P) \ge \mu_i$. The set $\{\mu_1, \ldots, \mu_n\}$ is uniquely
homogeneous lattice if there exists a homo- determined by S and Λ. The numbers $\mu_i$ are
geneous lattice Λ in $E^n$ and a point $P_0$ in $E^n$ called the successive minima of S in Λ; the
such that $L = \{P \in E^n \mid v(P) - v(P_0) \in \Lambda^*\}$. Thus points $P_i$ are the successive minimum points of
an inhomogeneous lattice is obtained from a S in Λ.
homogeneous lattice by translation. In this Minkowski’s theorem: Let Λ be a lattice in a
article we restrict ourselves to the case of Euclidean space $E^n$ and S a bounded subset of
homogeneous lattices and henceforth omit the $E^n$. Then we have the following:
adjective “homogeneous.” (I) If the volume V(S) is larger than d(Λ),
Suppose we are given a sequence of lattices then there exist points $X_1$ and $X_2$ in S such
$\Lambda_1, \Lambda_2, \ldots$ in $E^n$ with bases $\{X_i^{(1)}\}, \{X_i^{(2)}\}, \ldots$. that $X_1 \ne X_2$ and $v(X_1) - v(X_2) \in \Lambda^*$. Suppose,
If the sequence of points $X_i^{(\nu)}$ converges to moreover, that S is convex and symmetric with
$X_i$ ($i = 1, \ldots, n$) and the set $\{X_1, \ldots, X_n\}$ forms respect to the origin. Then, if $V(S) > 2^n d(\Lambda)$,
a basis of a lattice Λ, we call Λ the limit of there exists a point X in $S \cap \Lambda$ different from
the sequence $\{\Lambda_\nu\}$; we also say that the se- the origin. Hence we have $2^n \Delta(S) \ge V(S)$ ($n =$
quence $\{\Lambda_\nu\}$ converges to the lattice Λ. In this $\dim E^n$).
case we have $d(\Lambda_\nu) \to d(\Lambda)$, $\delta(\Lambda_\nu) \to \delta(\Lambda)$. The (II) Let S be a bounded closed convex body
notion of convergence of lattices gives rise to a that is symmetric with respect to the origin,
topology of the space $M_n$ of all the lattices in and let $\mu_1, \ldots, \mu_n$ be the successive minima of S
$E^n$. A sequence $\{\Lambda_\nu\}$ of lattices is said to be in Λ. Then we have $\mu_1 \cdots \mu_n V(S) \le 2^n d(\Lambda)$.
bounded if there exist positive numbers c and
c’ such that $d(\Lambda_\nu) \le c$, $\delta(\Lambda_\nu) \ge c'$ for all ν. A
bounded sequence of lattices has a convergent D. Minkowski-Hlawka Theorem
subsequence.
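Part (I) of Minkowski's theorem in Section C can be checked numerically in the plane. The sketch below is an illustration under assumptions: the convex symmetric body is taken to be a closed disc centered at the origin, the lattice is given by an integer basis, and the search over coefficients is a brute-force scan whose range (`bound`) is assumed large enough for the chosen basis. The function names are hypothetical.

```python
import math
from itertools import product


def lattice_determinant(b1, b2):
    """Absolute value of det(b1, b2): the determinant of the planar lattice."""
    return abs(b1[0] * b2[1] - b1[1] * b2[0])


def nonzero_points_in_disc(b1, b2, radius, bound=50):
    """Nonzero lattice points m*b1 + n*b2 (|m|, |n| <= bound) in the closed disc."""
    hits = []
    for m, n in product(range(-bound, bound + 1), repeat=2):
        if (m, n) == (0, 0):
            continue  # skip the origin, which always lies in the disc
        x = m * b1[0] + n * b2[0]
        y = m * b1[1] + n * b2[1]
        if x * x + y * y <= radius * radius:
            hits.append((x, y))
    return hits


if __name__ == "__main__":
    b1, b2 = (3, 1), (1, 3)
    d = lattice_determinant(b1, b2)
    radius = 3.2
    area = math.pi * radius ** 2
    # Minkowski's hypothesis for n = 2: area of the disc exceeds 2^2 * d.
    print(d, area > 4 * d)
    print(nonzero_points_in_disc(b1, b2, radius))
```

For this basis the hypothesis holds, and the scan indeed finds nonzero lattice points in the disc, as the theorem predicts.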
Let S be a subset of the space $E^n$. A lattice Λ Suppose that we are given a subset S of the n-
is called S-admissible if we have $\Lambda \cap S^i = \{O\}$, dimensional Euclidean space $E^n$ such that the
where $S^i$ is the interior of S and O is the origin. characteristic function χ(X) of S is integrable
We denote the set of S-admissible lattices by in the sense of Riemann. Then we have the
$\mathcal{A}(S)$. Given a closed subset M of $M_n$, we put Minkowski-Hlawka theorem: (i) If $n \ge 2$ and S
$\Delta(S \mid M) = \inf_{\Lambda \in \mathcal{A}(S) \cap M} d(\Lambda)$ if $\mathcal{A}(S) \cap M$ is non- is open, then $\Delta(S) < V(S)$, and (ii) if, moreover,
empty, while if $\mathcal{A}(S) \cap M$ is empty, we put S is symmetric with respect to the origin,
$\Delta(S \mid M) = \infty$. When $M = M_n$, we write $\Delta(S \mid M)$ then $2\Delta(S) < V(S)$; (iii) if S is a symmetric
$= \Delta(S)$ and call it the critical determinant of S. star body with respect to the origin, then
Generally, a lattice Λ in M is said to be critical $2\zeta(n)\Delta(S) \le V(S)$, where ζ(n) is the Riemann
in M with respect to S if $\Lambda \in \mathcal{A}(S)$ and $d(\Lambda) =$ zeta function.
$\Delta(S \mid M)$. Suppose that we have $0 < \Delta(S \mid M) <$ A proof for the theorem was given by E.
$\infty$. Then for a lattice critical in M with respect Hlawka (1944); (iii) and (iv) were conjectured
to S to exist it is necessary and sufficient that by Minkowski. C. L. Siegel obtained another
there exist a bounded sequence $\{\Lambda_\nu\}$ such that proof (1945), and C. A. Rogers simplified the
$\Lambda_\nu \in M \cap \mathcal{A}(S)$ and $d(\Lambda_\nu) \to \Delta(S \mid M)$. original proof by Hlawka (1947). There are

results concerning the estimation of $\Delta(S)/V(S)$ G. Approximation of Irrational Numbers by


for various subsets S. Rational Numbers

Given an irrational number θ, we have the


E. Siegel’s Mean Value Theorem problem of finding rational integers x (> 0)
and y such that $|\theta - y/x| < \varepsilon/x$, where ε is a
In an attempt to obtain a proof for the latter given positive number. Suppose that we are
half of the Minkowski-Hlawka theorem, Min- given a positive integer N. Using Dirichlet’s
kowski observed the necessity of establishing drawer principle we can show the existence of
the arithmetic theory of the linear transfor- x (< N) and y such that $|\theta - y/x| < 1/(xN)$. Let
mation groups. Siegel was inspired by this ob- M(θ) be the supremum of positive numbers
servation and obtained the following theorem, M such that the inequality $|\theta - y/x| < 1/(Mx^2)$
Siegel’s mean value theorem, which implies the holds for infinitely many pairs of integers x, y.
Minkowski-Hlawka theorem: Let F be a fun- We have $1 \le M(\theta)$ ($\le \infty$). Two irrational num-
damental region of the group SL(n, R) with bers θ and θ’ are said to be equivalent if there
respect to a discrete subgroup SL(n, Z). Let ω exists an element $(a_{ij}) \in GL(2, Z)$ such that
be the invariant measure on SL(n, R) such $\theta' = (a_{11}\theta + a_{12})/(a_{21}\theta + a_{22})$. In this case we
that $\int_F d\omega = 1$ (→ 225 Invariant Measures). Let have $M(\theta) = M(\theta')$.
f be a bounded Riemann integrable function If the irrational number θ satisfies the qua-
with compact support defined on the space dratic equation $a\theta^2 + b\theta + c = 0$ (a, b, c are
$R^n$. Note that the lattice $Z^n$ is stabilized by the rational integers), then we have $M(\theta) =$
subgroup SL(n, Z). We have $k^{-1}\sqrt{b^2 - 4ac}$, where $k = \min\{|ax^2 + bxy +$
$cy^2| \mid x, y \in Z,\ x \ne 0,\ y \ne 0\}$. In general, for an
irrational number θ of degree two, we have
$M(\theta) \ge \sqrt{5}$. The equality $M(\theta) = \sqrt{5}$ holds if θ
where the right-hand side of the equation is is equivalent to $\theta_1 = (1 + \sqrt{5})/2$. If θ is not
equivalent to $\theta_1$, then $M(\theta) \ge \sqrt{8}$; the equality
the usual Riemann integral of the function f.
holds if θ is equivalent to $\theta_2 = 1 + \sqrt{2}$. Simi-
A. Weil considered this theorem in a more
larly, we have $\theta_3, \theta_4, \ldots$; and $M(\theta_n) \to 3$ ($n \to \infty$).
general setting (Summa Brasil. Math., 1 (1946)).
If M(Q) < 3, there exists a 0, such that Q is
equivalent to 0,. The set of irrational numbers
F. Diophantine Approximation B satisfying M(0) = 3 is uncountable (A. A.
Markov [ 133). We have no information about
Minkowski initiated the notion of Diophan- M(B) for the general algebraic irrational num-
tine approximation in reference to the prob- ber 8. Let ~(0) be the supremum of real num-
lem of estimating the absolute value If(x)1 of a bers ,u such that the inequality le-y/xl<l/xP
given function L where x varies in Z or in a holds for intïnitely many pairs of integers x, y.
given ring of talgebraic integers. (A +Diophan- Given a number K > 2, it cari be shown that the
tine equation is an equation f(x)=O, where x tlebesgue measure of the set of real numbers 6’
varies in Z.) Today Diophantine approxima- such that ~(0) > K is zero. If 0 is a real alge-
tion (in the wide sense) refers to the investiga- brait number of degree n, then ~(0) <n (J.
tion of the scheme of values f(x), where x Liouville). Concerning p(e), results have been
varies in a suitable ring of algebraic integers. obtained by A. Thue, Siegel, A. 0. Gel’fond,
The geometry of lattices is a powerful tool in and F. J. Dyson. K. F. Roth (1954) proved that
this investigation. A typical problem in this ~(0) = 2 (Roth’s theorem [12]), which settled
lïeld of study is that of approximating irra- the problem of p(e). Roth’s theorem means
tional numbers by rational numbers; here that if K is larger than 2, then there exist only a
tcontinued fractions play an important role (- lïnite number of pairs x, y satisfying 10 - y/xl
Section G). For the problem of uniform distri- < 1/x”. This cari be generalized to the case of
bution considered by H. Weyl, the analytic the approximation of an element H that is
method, especially that of trigonometric series, algebraic over an A-field k by an element of
is useful (- Section H). Dirichlet’s drawer the lïeld k (S. Lang [SI). (An A-tïeld is either an
principle (to put n abjects in m drawers with algebraic number field of tïnite degree or an
n > m, it is necessary to put more than one algebraic function field in one variable over a
abject in at least one drawer) is one of the tïnite constant lïeld.)
basic principles used in the theory of Diophan- In 1970 W. M. Schmidt [ZO] obtained
tine approximation. Recently, the theory has theorems on simultaneous approximation
been applied to the theory of ttranscendental which generalize Roth’s theorem. Thus, if
numbers and the theory of +Diophantine c(, , . . . , M, are real algebraic numbers such that
equations. 1, x,, ,E, are linearly independent over the
field of rational numbers, then for every $\varepsilon > 0$ there are only finitely many positive integers $q$ with

$$\|q\alpha_1\| \cdots \|q\alpha_n\|\, q^{1+\varepsilon} < 1,$$

where $\|\xi\|$ denotes the distance from a real number $\xi$ to the nearest integer; in particular, we have

$$|\alpha_i - p_i/q| < q^{-(n+1)/n - \varepsilon}, \quad i = 1, \ldots, n,$$

for only finitely many $n$-tuples of rationals $p_1/q, \ldots, p_n/q$. A dual to this result is as follows. Let $\alpha_1, \ldots, \alpha_n$, $\varepsilon$ be as before. Then there are only finitely many $n$-tuples of nonzero integers $q_1, \ldots, q_n$ with

$$\|q_1\alpha_1 + \cdots + q_n\alpha_n\|\, |q_1 \cdots q_n|^{1+\varepsilon} < 1.$$

This last theorem can be used to prove that if $\alpha$ is an algebraic number, $k$ a positive integer, and $\varepsilon > 0$, then there are only finitely many algebraic numbers $\omega$ of degree at most $k$ such that $|\alpha - \omega| < H(\omega)^{-k-1-\varepsilon}$, where $H(\omega)$ denotes the height of $\omega$. See also [16, 22].

The work of Thue, Siegel, and Roth had the basic limitation of noneffectiveness. A. Baker (1968) succeeded in proving that for any algebraic number $\theta$ of degree $n \ge 3$ and any $\kappa > n$, there exists an effectively computable number $c = c(\theta, \kappa) > 0$ such that $|\theta - y/x| > c x^{-n} \exp\left((\log x)^{1/\kappa}\right)$ for all integers $x$, $y$ ($x > 0$) [15]. This result is an immediate consequence of the following effective version of a classical theorem on binary Diophantine equations (Thue, 1909): Let $f = f(x, y)$ be an irreducible binary form of degree $n \ge 3$ with integer coefficients, and suppose that $\kappa > n$. Then for any positive integer $m$, all integer solutions $x$, $y$ of the equation $f(x, y) = m$ satisfy $\max(|x|, |y|) < c \exp\left((\log m)^{\kappa}\right)$, where $c > 0$ is an effectively computable number depending on $n$, $\kappa$, and the coefficients of $f$. Baker obtained this result by making use of his theorems which give effective estimates of moduli of linear forms in the logarithms of algebraic numbers with algebraic coefficients. A typical theorem reads as follows: Let $\alpha_1, \ldots, \alpha_n$ be nonzero algebraic numbers with $\log \alpha_1, \ldots, \log \alpha_n$ linearly independent over the rationals, and let $\beta_0, \ldots, \beta_n$ be algebraic numbers, not all 0, with degrees and heights at most $d$ and $H$, respectively. Then for any $\kappa > n + 1$, we have $|\beta_0 + \beta_1 \log \alpha_1 + \cdots + \beta_n \log \alpha_n| > c \exp\left(-(\log H)^{\kappa}\right)$, where $c > 0$ is an effectively computable number depending only on $n$, $\kappa$, $\log \alpha_1, \ldots, \log \alpha_n$, and $d$ [14, 19].

Results of this kind have many important applications in number theory. For instance, we obtain a generalization of the Gel'fond-Schneider theorem on transcendental numbers. Furthermore, the imaginary quadratic number fields of class number 1 can be completely determined on the basis of Baker's result. This was actually done by Baker (1966) and independently by H. M. Stark (1966) (→ 347 Quadratic Fields).

Refinements and generalizations of Thue's theorem on the finiteness of solutions of binary Diophantine equations have been obtained by Baker and his collaborators (→ [17], and also [16]). Also, $p$- and $\mathfrak{p}$-adic analogs of Baker's results are known [18].

There are a number of unsettled problems on the irrationality of particular numbers, such as the †Euler constant $C$, $\pi^e$, or $\zeta(2k + 1)$ with $k$ a positive integer. R. Apéry (1978) proved that there exist a positive number $\varepsilon$ and a sequence of positive integers $\{q_n\}$ such that $0 < \|q_n \zeta(3)\| < e^{-\varepsilon n}$ for all $n \ge 1$, so that $\zeta(3)$ is irrational.

H. Uniform Distribution

Let $\theta$ be a real number, $x$ a positive integer, and $[\theta x]$ the maximum integer not larger than $\theta x$. We write $(\theta x) = \theta x - [\theta x] \equiv \theta x \pmod 1$. Jacobi showed that if $\theta$ is irrational then the set $\{(\theta x) \mid x \in \mathbf{N}\}$ is densely distributed in the interval $(0, 1)$ ($\mathbf{N}$ is the set of positive integers). In general, let $f$ be a real-valued function defined on $\mathbf{N}$. We say that $f(x) \pmod 1$ is uniformly distributed in the unit interval, or $f(x)$ is uniformly distributed (mod 1), if the following condition is satisfied: Let $\alpha$, $\beta$ be an arbitrary pair of real numbers such that $0 \le \alpha < \beta \le 1$, and let $N$ be a given positive integer. Let $T(N)$ be the number of positive integers $x$ such that $x \le N$, $\alpha \le (f(x)) < \beta$, where $(f(x)) = f(x) - [f(x)]$. Then $\lim_{N \to \infty} T(N)/N = \beta - \alpha$. In order for $f(x) \pmod 1$ to be uniformly distributed, it is necessary and sufficient that $\lim_{N \to \infty} N^{-1} \sum_{x=1}^{N} e^{2\pi i h f(x)} = 0$ for any nonzero integer $h$ (Weyl's criterion, 1914). Weyl proved that if $\theta$ is an irrational number, then $\theta x \pmod 1$ is uniformly distributed.

The following theorem, given by J. G. van der Corput, is often useful: Let $f$ be a real-valued function defined on $\mathbf{N}$. Consider the function $f_h(x) = f(x + h) - f(x)$ for an arbitrary positive integer $h$. If $f_h(x)$ is uniformly distributed (mod 1) for all such $h$, then $f(x)$ is also uniformly distributed (mod 1). Utilizing this theorem, it can be shown that if $f(x) = \theta_r x^r + \theta_{r-1} x^{r-1} + \cdots + \theta_1 x$, where at least one of the coefficients $\theta_i$ is irrational, then $f(x) \pmod 1$ is uniformly distributed.

The notion of uniform distribution of sequences of real numbers has an analog in compact Hausdorff spaces and in various topological groups. A systematic treatment of such generalized notions of uniform distribution can be found in [21].
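
Weyl's criterion in Section H can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the helper names `weyl_sum` and `counting_ratio` are hypothetical (they do not come from the source), floating-point arithmetic stands in for exact real numbers, and a finite N can suggest, but never prove, a limit.

```python
import cmath
import math


def weyl_sum(f, h, N):
    """Return |N^{-1} * sum_{x=1}^{N} exp(2*pi*i*h*f(x))| for a nonzero integer h."""
    total = sum(cmath.exp(2j * math.pi * h * f(x)) for x in range(1, N + 1))
    return abs(total) / N


def counting_ratio(f, a, b, N):
    """Return T(N)/N, the fraction of x in 1..N whose fractional part (f(x)) lies in [a, b)."""
    hits = sum(1 for x in range(1, N + 1) if a <= f(x) % 1.0 < b)
    return hits / N


if __name__ == "__main__":
    theta = math.sqrt(2)  # an irrational number, up to floating-point precision
    N = 20000
    # For theta * x the normalized exponential sums should be close to 0 ...
    for h in (1, 2, 3):
        print(h, weyl_sum(lambda x: theta * x, h, N))
    # ... and T(N)/N should be close to beta - alpha = 0.3.
    print(counting_ratio(lambda x: theta * x, 0.2, 0.5, N))
    # For the rational multiplier 1/4 the sum with h = 4 does not tend to 0.
    print(weyl_sum(lambda x: 0.25 * x, 4, N))
```

With a rational multiplier such as 1/4, every term of the sum for h = 4 equals 1, so the criterion fails, in agreement with the fact that x/4 (mod 1) takes only four values.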

References

[1] H. Minkowski, Geometrie der Zahlen, Teubner, 1910 (Chelsea, 1953).
[2] H. Minkowski, Diophantische Approximationen, Teubner, second edition, 1927 (Chelsea, 1957).
[3] H. Minkowski, Gesammelte Abhandlungen I, II, Teubner, 1911 (Chelsea, 1967).
[4] J. F. Koksma, Diophantische Approximationen, Erg. Math., Springer, 1936 (Chelsea, 1950).
[5] H. Weyl, C. L. Siegel, and K. Mahler, Geometry of numbers, Mimeographed notes, Princeton, 1950.
[6] J. W. S. Cassels, An introduction to Diophantine approximation, Cambridge Univ. Press, 1957.
[7] J. W. S. Cassels, An introduction to the geometry of numbers, Springer, 1959.
[8] T. Schneider, Einführung in die transzendentalen Zahlen, Springer, 1957.
[9] S. Lang, Diophantine geometry, Interscience, 1962.
[10] C. L. Siegel, Über einige Anwendungen diophantischer Approximationen, Abh. Preuss. Akad. Wiss., no. 1 (1929) (Gesammelte Abhandlungen, Springer, 1966, vol. 1, 209-266).
[11] C. L. Siegel, A mean value theorem in geometry of numbers, Ann. Math., (2) 46 (1945), 340-347 (Gesammelte Abhandlungen, Springer, 1966, vol. 3, 39-46).
[12] K. F. Roth, Rational approximations to algebraic numbers, Mathematika, 2 (1955), 1-20.
[13] A. A. Markoff (A. A. Markov), Sur les formes quadratiques binaires indéfinies, Math. Ann., 15 (1879), 381-407.
[14] A. Baker, Linear forms in the logarithms of algebraic numbers I, II, III, IV, Mathematika, 13 (1966), 204-216; 14 (1967), 102-107; 14 (1967), 220-228; 15 (1968), 204-216.
[15] A. Baker, Contributions to the theory of Diophantine equations I, II, Philos. Trans. Roy. Soc. London, A 263 (1967/1968), 173-191, 193-208.
[16] A. Baker, Transcendental number theory, Cambridge Univ. Press, 1975.
[17] A. Baker and J. Coates, Integer points on curves of genus 1, Proc. Cambridge Philos. Soc., 67 (1970), 595-602.
[18] A. Brumer, On the units of algebraic number fields, Mathematika, 14 (1967), 121-124.
[19] N. I. Fel'dman, Linear form of logarithmic algebraic numbers (in Russian), Dokl. Akad. Nauk SSSR, 182 (1968), 1278-1279.
[20] W. M. Schmidt, Simultaneous approximation to algebraic numbers by rationals, Acta Math., 125 (1970), 189-201.
[21] L. Kuipers and H. Niederreiter, Uniform distribution of sequences, Interscience, 1974.
[22] W. M. Schmidt, Diophantine approximation, Lecture notes in math. 785, Springer, 1980.
[23] W. M. Schmidt, Approximation to algebraic numbers, Enseignement Math., 17 (1971), 188-253.
[24] C. F. Osgood (ed.), Diophantine approximation and its applications, Academic Press, 1973.

183 (VII.23)
Global Analysis

Mathematics that treats, by using functional analytical techniques, various problems concerning the †calculus of variations, †singularities, infinite-dimensional Lie groups, or †nonlinear partial differential equations, such as equations of fluids or of gravitation in general relativity, may be called global analysis if it uses as a main tool an infinite-dimensional version of differential geometry and topology analogous to that for finite-dimensional manifolds. The term global analysis therefore has no precise definition. However, it can be said that it is analysis on manifolds, and the concept of †infinite-dimensional manifolds is the central abstract idea in it.

Suppose one considers a nonlinear differential operator on a finite-dimensional manifold. Then by using functional analytical techniques it often happens that its domain is neither a linear space nor its open subset but an infinite-dimensional manifold, and that such a nonlinear operator can be regarded as a †differentiable mapping between infinite-dimensional manifolds. The †differential at a point in that source manifold is called a linearized operator, to which one can apply various theories of linear functional analysis.

"Global analysis" seems to have first appeared in the literature in the late 1960s [1, 2]. However, the phrase "infinite-dimensional manifold" has been used widely since about 1960. By that early date, local theories of infinite-dimensional manifolds, sometimes called "general analysis," such as definitions of differentiability, the †implicit function theorem, †Taylor's theorem, the existence and uniqueness of †ordinary differential equations, and the †Frobenius theorem, had already been established in †Banach spaces [3]. Therefore, with these regarded as a local theory, the concept of infinite-dimensional manifolds could be defined naturally, and the concept of infinite-dimensional Lie groups as well [4]. A few years later, R. Palais and S. Smale [5] and J. Eells and J. Sampson [6] showed that such concepts are useful in the calculus of variations, and R. Abraham and J. Robbin [7] remarked that †transversality theorems initiated by R. Thom can be easily proved by using an infinite-dimensional version of †Sard's theorem [8].

The so-called †Atiyah-Singer index theorem [9], announced in 1963, gave impetus to the field as people sought the theorem's most natural expression; the work finally resulted in the theorem classifying separable †Hilbert manifolds by homotopy type (→ 105 Differentiable Manifolds).

After the appearance of these theorems announcing the nonexistence of differential topology on separable infinite-dimensional Hilbert manifolds, global analysis moved toward more concrete problems and applications to various branches of mathematics. However, many of these applications are formulated not on †Banach or Hilbert manifolds, but on manifolds modeled on Fréchet spaces (†Fréchet manifolds) or on †nuclear spaces. For such situations we have in general no local theories. Neither the implicit function theorem nor the Frobenius theorem holds on such manifolds. However, since these theorems are crucial for nonlinear problems, various kinds of sufficient conditions for the validity of these theorems are being studied by many people (→ 286 Nonlinear Functional Analysis).

As for the calculus of variations, Yamabe's problem is being studied extensively, for this seems to be a typical problem not satisfying the so-called †Condition C. The original paper of H. Yamabe [13], insisting that every compact Riemannian manifold can be conformally deformed into a manifold of constant scalar curvature, contains a serious gap, and the problem is still open, though many cases are known where the statement holds (→ 364 Riemannian Manifolds H). The harmonic mappings defined in [6] are also being studied extensively (→ 195 Harmonic Mappings).

In differential geometry or the †general theory of relativity, definitions of various curvatures, such as Riemannian, Ricci or scalar, or Gauss, can sometimes be regarded as nonlinear differential equations on manifolds. Proof of the global existence of solutions to these equations has long been sought, and several theorems have recently appeared [14-16].

Infinite-dimensional groups such as GL(E), or G&(E) with uniform topology, are called †Banach-Lie groups, and they have been studied in the operator calculus. On the other hand, infinite groups studied by S. Lie and E. Cartan were in fact infinite-dimensional germs of transformation groups defined on a neighborhood of a point in a manifold. These were not groups in the strict sense. Recently, H. Omori [17] has given a definition of abstract infinite-dimensional Lie groups that includes Banach-Lie groups and many infinite-dimensional transformation groups studied by Cartan. An application of these groups to fluid dynamics can be found in [18]. Moreover, †unitary representation theories of these groups are now being constructed [19, 20].

Though global analysis consists at present of a rather disorganized combination of many nonlinear problems in analysis on manifolds and in mathematical physics, it is nevertheless one of the most active branches of mathematics.

References

[1] J. Eells, A setting for global analysis, Bull. Amer. Math. Soc., 72 (1966), 751-807.
[2] R. Palais, Foundations of global nonlinear analysis, Benjamin, 1968.
[3] J. Dieudonné, Foundations of modern analysis, Academic Press, 1960.
[4] G. Birkhoff, Analytic groups, Trans. Amer. Math. Soc., 43 (1938), 61-101.
[5] R. Palais and S. Smale, A generalized Morse theory, Bull. Amer. Math. Soc., 70 (1964), 165-172.
[6] J. Eells and J. Sampson, Harmonic mappings of Riemannian manifolds, Amer. J. Math., 86 (1964), 109-160.
[7] R. Abraham and J. Robbin, Transversal mappings and flows, Benjamin, 1967.
[8] S. Smale, An infinite-dimensional version of Sard's theorem, Amer. J. Math., 87 (1965), 861-866.
[9] M. Atiyah and I. Singer, The index of elliptic operators on compact manifolds, Bull. Amer. Math. Soc., 69 (1963), 422-433.
[10] N. H. Kuiper, The homotopy type of the unitary group of Hilbert space, Topology, 3 (1965), 19-30.
[11] D. Burghelea and N. H. Kuiper, Hilbert manifolds, Ann. Math., (2) 90 (1969), 335-352.
[12] J. Eells and D. Elworthy, Open embeddings of certain Banach manifolds, Ann. Math., (2) 91 (1970), 465-485.
[13] H. Yamabe, On deformation of Riemannian structures on compact manifolds, Osaka Math. J., 12 (1960), 21-37.
[14] T. Aubin, Equations différentielles non linéaires et problème de Yamabe concernant la courbure scalaire, J. Math. Pures Appl., 55 (1976), 269-296.
[15] H. Gluck, The generalized Minkowski problem in differential geometry in the large, Ann. Math., 96 (1972), 245-276.
[16] S. Yau, On the Ricci curvature of a compact Kähler manifold and the complex Monge-Ampère equation I, Comm. Pure Appl. Math., 31 (1978), 339-411.
[17] H. Omori, Infinite-dimensional Lie transformation groups, Lecture notes in math. 427, Springer, 1974.
[18] D. Ebin and J. Marsden, Groups of diffeomorphisms and the motion of an incompressible fluid, Ann. Math., (2) 94 (1970), 101-162.
[19] A. M. Vershik, I. M. Gel'fand, and M. I. Graev, Representations of the group of diffeomorphisms, Russian Math. Surveys, 30 (6) (1975), 1-50. (Original in Russian, 1975.)
[20] R. S. Ismagilov, Unitary representations of the group of diffeomorphisms of the space $\mathbf{R}^n$, $n \ge 2$, Functional Anal. Appl., 9 (1975), 144-155. (Original in Russian, 1975.)

185 (I.8)
Gödel Numbers

A. General Remarks

K. Gödel [1] devised the following method to prove his incompleteness theorems (→ Section C).

Let $\mathfrak{S}$ be a †formal system. In this article, we call its basic symbols, †terms, †formulas, and formal proofs the "constituents" of $\mathfrak{S}$. Let $g$ be an †injection from the constituents of $\mathfrak{S}$ into the natural numbers satisfying the following two conditions: (1) Given a constituent $C$, we can compute the value $g(C)$ in a finite number of steps. (2) Given a natural number $n$, there exists a finitary procedure to find out whether there exists a constituent $C$ of $\mathfrak{S}$ such that $g(C) = n$; furthermore, when such a $C$ exists, it can actually be specified in a finite number of steps.

If such a mapping $g$ is given for the system $\mathfrak{S}$, then the mapping $g$ is called a Gödel numbering and the number $g(C)$ is called the Gödel number of the constituent $C$ (with respect to $g$).
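
Conditions (1) and (2) can be made concrete with the prime-power coding described in Section B. The Python sketch below is a minimal illustration, not the article's own construction: constituents are modeled as nested tuples of odd positive integers, every odd positive integer is assumed to be the code of some basic symbol, and the names `primes`, `encode`, and `decode` are hypothetical.

```python
def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for short sequences)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


def encode(c):
    """Condition (1): compute g(C) in finitely many steps.

    A basic symbol is represented by its odd code; a tuple (F_0, ..., F_k)
    receives the number p_0**g(F_0) * ... * p_k**g(F_k).
    """
    if isinstance(c, int):
        if c < 1 or c % 2 == 0:
            raise ValueError("symbol codes are assumed to be odd and positive")
        return c
    n = 1
    for p, part in zip(primes(), c):
        n *= p ** encode(part)
    return n


def decode(n):
    """Condition (2): return the constituent numbered n, or None if there is none."""
    if n < 1:
        return None
    if n % 2 == 1:
        return n  # assumption: every odd positive integer codes a basic symbol
    parts = []
    for p in primes():
        if n == 1:
            break
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        if e == 0:
            return None  # a prime was skipped, so n is not of the required form
        sub = decode(e)
        if sub is None:
            return None
        parts.append(sub)
    return tuple(parts)


if __name__ == "__main__":
    print(encode((7, 9, 11)))        # code of a triple of symbols with codes 7, 9, 11
    print(decode(encode(((1,), 3))))
```

Because every composite code is even and every symbol code is odd, the two cases never collide, which is what makes the map injective. The numbers grow very quickly: a composite code that appears as an exponent in a larger code (as in a nested formula) already leads to values far too large to write out, so the sketch is only practical for tiny inputs.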

B. An Example of Gödel Numbers

(1) Let $\alpha_0, \alpha_1, \ldots$ be the basic symbols of $\mathfrak{S}$. With each $\alpha_i$ we associate a distinct odd number $q_i$: $g(\alpha_i) = q_i$ ($i = 0, 1, \ldots$). (2) Let $F$ be a constituent of $\mathfrak{S}$. If $F$ is constructed from any other constituents $F_0, F_1, \ldots, F_k$ of $\mathfrak{S}$ by a rule peculiar to $\mathfrak{S}$ (for convenience we write this $F = (F_0, F_1, \ldots, F_k)$), and if, for each $F_i$, $g(F_i)$ is already defined, then we put $g(F) = \langle g(F_0), g(F_1), \ldots, g(F_k) \rangle$, where $\langle a_0, a_1, \ldots, a_k \rangle$ denotes the number $p_0^{a_0} p_1^{a_1} \cdots p_k^{a_k}$ ($p_i$ is the $(i + 1)$st prime number). For example, suppose that $\mathfrak{S}$ contains $0$, $=$, $v_j$ (variables), and $\neg$ (negation) among the basic symbols, and let their Gödel numbers be $7$, $9$, $11^{j+1}$, and $13$, respectively. Since the formula $\neg(0 = v_j)$ can be analyzed in the form $(\neg, (0, =, v_j))$, its Gödel number is $\langle 13, \langle 7, 9, 11^{j+1} \rangle \rangle$. For details → [1, 4].

C. Gödel's Incompleteness Theorems

By means of a Gödel numbering any metamathematical notion about the constituents of a formal system $\mathfrak{S}$ can be interpreted as a number-theoretic notion. For example, the notion "formula" is interpreted as the number-theoretic predicate $\mathrm{Form}(x)$ which means that $x$ is the Gödel number of a formula. The provability of a formula is interpreted as the number-theoretic predicate $\mathrm{Prov}(x)$, which means that $x$ is the Gödel number of a provable formula; accordingly, for any formula $A$ of $\mathfrak{S}$, the proposition $\mathrm{Prov}(g(A))$ means that $A$ is provable. This interpretation is called the arithmetization of metamathematics.

Let the formal system $\mathfrak{S}$ include formal elementary number theory. Then many metamathematically useful number-theoretic predicates can be expressed by the respective formulas of $\mathfrak{S}$; for example, there exist formulas $\mathbf{Form}(x)$ and $\mathbf{Prov}(x)$ expressing the predicates $\mathrm{Form}(x)$ and $\mathrm{Prov}(x)$, respectively. Furthermore, Gödel proved the existence of a closed formula $U$ such that the formula $\neg\mathbf{Prov}(g(U)) \leftrightarrow U$ is provable. This closed formula $U$ is one of the so-called formally undecidable propositions, and in fact it is shown that neither $U$ nor $\neg U$ is provable in $\mathfrak{S}$ if $\mathfrak{S}$ is consistent in a strong sense. This result is called Gödel's first incompleteness theorem.

By use of the formulas $\mathbf{Form}(x)$ and $\mathbf{Prov}(x)$ the consistency of the formal system $\mathfrak{S}$ is expressed by the formula $\mathbf{Consis}$, which is an abbreviation of $\exists x(\mathbf{Form}(x) \wedge \neg\mathbf{Prov}(x))$. Gödel obtained the result that the formula $\mathbf{Consis}$ is not provable in $\mathfrak{S}$ if $\mathfrak{S}$ is consistent, on the basis of the following three facts: (1) $U$ is not provable in $\mathfrak{S}$ if $\mathfrak{S}$ is consistent; (2) $\mathbf{Consis} \to \neg\mathbf{Prov}(g(U))$ is provable in $\mathfrak{S}$; (3) $\neg\mathbf{Prov}(g(U)) \to U$ is provable in $\mathfrak{S}$. (If $\mathbf{Consis}$ were provable, then (2) and (3) would yield a proof of $U$, contradicting (1).) This result is called Gödel's second incompleteness theorem.

The method of arithmetization is important and useful in the study of mathematical logic. The notion of Gödel numbers of recursive functions is one of its applications (→ 356 Recursive Functions).

D. Tarski's Theorem Concerning Truth Definitions

Let a consistent formal system $\mathfrak{S}$ and a †model of $\mathfrak{S}$ be given. By means of a Gödel numbering, the truth notion of closed formula is interpreted as the number-theoretic predicate $\mathrm{Tr}(x)$ which means that $x$ is the Gödel number of a true closed formula; accordingly, for any closed formula $A$ of $\mathfrak{S}$, the proposition $\mathrm{Tr}(g(A))$ means that $A$ is true.

In relation to the foregoing fact, if there exists a formula $\mathbf{Tr}(x)$ of a single variable and the formula $\mathbf{Tr}(g(A)) \leftrightarrow A$ is provable for every closed formula $A$, then that formula $\mathbf{Tr}(x)$ is called a truth definition. A. Tarski proved the fact that there is no truth definition in $\mathfrak{S}$ if the formal system $\mathfrak{S}$ is consistent.

References

[1] K. Gödel, Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I, Monatsh. Math. Phys., 38 (1931), 173-198; English translation, On formally undecidable propositions of Principia mathematica and related systems, Oliver & Boyd, 1962.
[2] A. Tarski, Der Wahrheitsbegriff in den formalisierten Sprachen, Studia Philosophica, 1 (1936), 261-405, translated from the Polish original 1933; English translation, The concept of truth in formalized languages, Logic, semantics, metamathematics, Oxford Univ. Press, 1956.
[3] J. B. Rosser, Extensions of some theorems of Gödel and Church, J. Symbolic Logic, 1 (1936), 87-91.
[4] S. C. Kleene, Introduction to metamathematics, North-Holland and Noordhoff, 1952.
[5] S. C. Kleene, Mathematical logic, Wiley, 1967.
[6] J. R. Shoenfield, Mathematical logic, Addison-Wesley, 1967.

184 (XXI.27)
Gödel, Kurt

Kurt Gödel (April 28, 1906-January 14, 1978) was born in Brno, Czechoslovakia (at that time Brünn, Austria-Hungary). He studied mathematics and physics at the University of Vienna, where he took the Ph.D. degree in 1930. After he had taught mathematics at the University of Vienna from 1933 to 1938, he was invited to the Institute for Advanced Study at Princeton, where he became professor in 1953. In 1976, he was named professor emeritus; he died in Princeton in 1978.

Gödel contributed important fundamental results covering all aspects of mathematical logic. Among his famous works are the proof of the †completeness of the first-order predicate calculus, the incompleteness of the consistent axiomatic system containing Peano's arithmetic (incompleteness theorem), and the †consistency of the axiom of choice and the generalized continuum hypothesis. He also introduced the notion of †recursive functions and found the Gödel solution of Einstein's equations of relativity. In addition to mathematical works, he left philosophical papers on set theory and the foundations of mathematics.

Reference

[1] K. Gödel, The consistency of the axiom of choice and of the generalized continuum hypothesis, Princeton Univ. Press, 1940.

186 (XVI.12)
Graph Theory

A. Overview of Graph Theory

Two aspects of graphs are the object of graph theory. One is that a graph expresses a binary relation over a set $V$, and the other is the fact that a graph is a CW-complex of 1 dimension, an object of study in algebraic topology. Because of its special nature as an object of 1 dimension, we can consider various concrete properties in detail. Hence graph theory has close connection with other areas, such as network theory, system theory, automata theory, and the theory of computational processes, and it has many useful applications.

The notion of a "graph" as currently used in graph theory was first discussed by L. Euler [1]. It is said that J. J. Sylvester coined the word "graph" as we presently understand it. However, up to now, there were few unifying principles, and graph theory seemed like a large collection of miscellaneous problems and ad hoc techniques. The terminology has not yet been standardized; different usages prevail in different schools. The resulting confusion can be seen in reference books such as [2]-[5]. Recently, infinite graphs have also been studied. But here we restrict ourselves to finite graphs, since only they give a typical theory.

B. Definition of Graph

The notion of a graph $G = (V, E, \partial^+, \partial^-)$ is a composite notion of two finite sets $V$ and $E$
and two maps $\partial^+ : E \to V$ and $\partial^- : E \to V$. An element of $V$ is called a vertex, and an element of $E$ is called an edge. The maps $\partial^{\pm}$ are called incidence relations. The terms point, or node instead of vertex, and arc, line, or branch instead of edge have also been used (sometimes with slightly different meanings). For an edge $e \in E$, $\partial^+ e \in V$ is called the initial vertex and $\partial^- e \in V$ is called the terminal vertex. Both are called the end vertices of $e$. The inverse map $\delta^{\pm} : V \to 2^E$ of $\partial^{\pm}$ is defined by $\delta^{\pm} v = \{e \in E \mid \partial^{\pm} e = v\}$ and has the following properties: (i) If $v \neq v'$, then $\delta^+ v \cap \delta^+ v' = \delta^- v \cap \delta^- v' = \emptyset$. (ii) $\bigcup_{v \in V} \delta^+ v = \bigcup_{v \in V} \delta^- v = E$. Conversely, if the maps $\delta^{\pm} : V \to 2^E$ have the properties (i) and (ii), then there exist corresponding maps $\partial^{\pm}$. For a vertex $v \in V$, $|\delta^+ v|$ is called the outdegree or positive degree, $|\delta^- v|$ is called the indegree or negative degree, and the sum $|\delta^+ v| + |\delta^- v|$ is called the degree. We always have the relation $\sum_{v \in V} |\delta^+ v| = \sum_{v \in V} |\delta^- v| = |E|$, and $\sum_{v \in V} (|\delta^+ v| + |\delta^- v|) = 2|E|$. The number of vertices with odd degree is always even. Two end vertices of an edge are called mutually adjacent, and two edges with at least one common end vertex are also called mutually adjacent. An edge satisfying $\partial^+ e = \partial^- e$ is called a self-loop, and a vertex satisfying $\delta^+ v = \delta^- v = \emptyset$ is called an isolated vertex.

When permutation groups $P_V$ operating over $V$ and $P_E$ operating over $E$ are given, we can naturally define the permutation $(\pi_V, \pi_E)$ over the graph $G$ ($\pi_V \in P_V$, $\pi_E \in P_E$). A graph is classified into equivalence classes by $P_V$, $P_E$. When $P_V$, $P_E$ are all the permutations of $V$, $E$, respectively, each equivalence class is called an unlabeled graph, whereas when $P_V$, $P_E$ consist only of an identity transformation, $G$ is called a labeled graph.

For a given graph $G = (V, E, \partial^+, \partial^-)$ and a subset $E' \subset E$, we can define the reoriented graph of edges in $E'$ by $G' = (V, E, \partial'^+, \partial'^-)$, where $\partial'^{\pm} e = \partial^{\pm} e$ if $e \notin E'$ and $\partial'^{\pm} e = \partial^{\mp} e$ if $e \in E'$. We have an equivalence relation by identifying the reoriented graphs. Each class by this equivalence relation is called an undirected graph or an unoriented graph. In contrast, the graph in the original sense is called a directed graph or an oriented graph.

C. Examples of Special Graphs

(1) A complete graph is a graph such that there exists one and only one edge with end vertices $v$ and $v'$ for every different pair of vertices $v$ and $v'$. (2) A bipartite graph is a graph with a partition $V^+ \cup V^- = V$ such that $V^+ \cap V^- = \emptyset$ and for every $e \in E$, we have $\partial^+ e \in V^+$ and $\partial^- e \in V^-$. If there always exists an edge $e$ with $\partial^+ e = v$, $\partial^- e = v'$ for every pair $v \in V^+$ and $v' \in V^-$, it is called a complete bipartite graph. (3) A regular graph is a graph whose degree at each vertex is the same. (4) A partial graph is defined as follows: Let $E'$ be a subset of $E$ in a graph $G = (V, E, \partial^+, \partial^-)$. The graph $G' = (V', E', \partial'^+, \partial'^-)$ is called a partial graph of $G$, where $V' = \partial^+ E' \cup \partial^- E'$, and $\partial'^{\pm}$ are the restrictions of $\partial^{\pm}$ on $E'$, respectively. When $E' = E$, the partial graph is the graph obtained by deleting all isolated vertices from $G$. Similarly, for a subset $V''$ of $V$, the graph $G'' = (V'', E'', \partial''^+, \partial''^-)$, defined by $E'' = \delta^+ V'' \cap \delta^- V''$ (the edges with both end vertices in $V''$) and with $\partial''^{\pm}$ being the restriction of $\partial^{\pm}$ on $E''$, is called a section graph of $G$.

D. Representation of a Graph

When we want to process by computer a problem concerning a graph, we must represent a graph in a suitable form (→ 96 Data Processing). A commonly used method is the following list representation: Let $V = \{v_1, v_2, \ldots, v_M\}$ and $E = \{e_1, e_2, \ldots, e_n\}$. For each $v_\alpha \in V$ ($\alpha = 1, \ldots, M$), we arrange the edges in $\delta^+ v_\alpha$ and in $\delta^- v_\alpha$ in suitable orders. Then (i) for each $v_\alpha \in V$, record $|\delta^+ v_\alpha|$, $|\delta^- v_\alpha|$, $B_\alpha^+$, $\bar{B}_\alpha^+$, $B_\alpha^-$, and $\bar{B}_\alpha^-$, where $B_\alpha^{\pm}$ and $\bar{B}_\alpha^{\pm}$ are the first and the last index of the edges in $\delta^{\pm} v_\alpha$, respectively. We define the corresponding values to be 0 if the $\delta^{\pm} v_\alpha$ are empty. (ii) For each $e_\kappa \in E$, record $\partial_\kappa^+$, $\partial_\kappa^-$, $E_\kappa^+$, $\bar{E}_\kappa^+$, $E_\kappa^-$, and $\bar{E}_\kappa^-$, where $\partial_\kappa^{\pm}$ are the numbers of the vertices $\partial^{\pm} e_\kappa$, and $E_\kappa^{\pm}$, $\bar{E}_\kappa^{\pm}$ are the numbers of the edges immediately after or before $e_\kappa$ among the edges with the same initial (for $+$) or terminal (for $-$) vertex as that of $e_\kappa$. If $e_\kappa$ is the last or the first among such edges, we define the corresponding values to be 0.

E. Deformation of a Graph

Let $G = (V, E, \partial^+, \partial^-)$ be a graph. A graph obtained by opening an edge $e$ is the graph $G' = (V, E - \{e\}, \partial'^+, \partial'^-)$, where $\partial'^{\pm}$ is the restriction of $\partial^{\pm}$ on $E - \{e\}$. The graph obtained by deleting all isolated vertices from $G'$ is a partial graph of $G$ over $E - \{e\}$. A graph obtained by shortening an edge $e$ is the graph $G'' = (V'', E - \{e\}, \partial''^+, \partial''^-)$, where $V'' = (V - \{\partial^+ e, \partial^- e\}) \cup \{\hat{v}\}$, $\hat{v}$ being a new vertex not contained in $V$, and $\partial''^{\pm} = \varphi \circ \partial^{\pm}$, $\varphi : V \to V''$ being defined by $\varphi(v) = v$ for $v \neq \partial^{\pm} e$, and $\varphi(v) = \hat{v}$ when $v = \partial^+ e$ or $v = \partial^- e$. A graph obtained by opening and shortening several different edges is similarly defined, and the result is independent of the order of opening or shortening process. A graph obtained by opening edge(s) or by shortening edge(s) or by both processes is called a subgraph, a contraction, or a subcontraction, respectively.
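
The definitions of Sections B and E translate directly into a small data structure. The Python sketch below is one possible illustration and not the list representation of Section D: a graph is assumed to be given as a set of vertices together with a dictionary sending each edge name to the pair (initial vertex, terminal vertex), and the function names `incidence_lists`, `open_edge`, and `shorten_edge` are hypothetical.

```python
from collections import defaultdict


def incidence_lists(edges):
    """Return (delta_plus, delta_minus): for each vertex, the edges leaving / entering it.

    `edges` maps an edge name e to the pair (initial vertex, terminal vertex).
    """
    delta_plus, delta_minus = defaultdict(list), defaultdict(list)
    for e, (initial, terminal) in edges.items():
        delta_plus[initial].append(e)
        delta_minus[terminal].append(e)
    return delta_plus, delta_minus


def degree(v, delta_plus, delta_minus):
    """Outdegree plus indegree of v (a self-loop therefore counts twice)."""
    return len(delta_plus[v]) + len(delta_minus[v])


def open_edge(vertices, edges, e):
    """Opening e: delete the edge and keep every vertex."""
    return set(vertices), {k: ends for k, ends in edges.items() if k != e}


def shorten_edge(vertices, edges, e, new_vertex):
    """Shortening e: delete the edge and merge its two end vertices into new_vertex."""
    a, b = edges[e]

    def phi(v):
        return new_vertex if v in (a, b) else v

    new_vertices = (set(vertices) - {a, b}) | {new_vertex}
    new_edges = {k: (phi(s), phi(t)) for k, (s, t) in edges.items() if k != e}
    return new_vertices, new_edges


if __name__ == "__main__":
    V = {1, 2, 3, 4}  # vertex 4 is isolated
    E = {"a": (1, 2), "b": (2, 3), "c": (3, 1), "d": (3, 3)}  # "d" is a self-loop
    dp, dm = incidence_lists(E)
    print(sum(len(dp[v]) for v in V), sum(len(dm[v]) for v in V), len(E))
    print(sum(degree(v, dp, dm) for v in V))  # twice the number of edges
    print(shorten_edge(V, E, "a", 0))
```

The example graph can be used to confirm the counting relations of Section B: the outdegrees and the indegrees each sum to the number of edges, and the number of vertices of odd degree is even.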

F. Connectedness

Let $G = (V, E, \partial^+, \partial^-)$ be a graph, and let $v_\alpha$ and $v_\beta$ be two vertices. A path from $v_\alpha$ to $v_\beta$ of length $l$ is a sequence $P = (v_\alpha = v_{\alpha_0}, \varepsilon_1 e_{\kappa_1}, v_{\alpha_1}, \varepsilon_2 e_{\kappa_2}, \dots, \varepsilon_l e_{\kappa_l}, v_{\alpha_l} = v_\beta)$, where the $\varepsilon_i$ are $+1$ or $-1$, and for every $i = 1, \dots, l$, we have $\partial^+ e_{\kappa_i} = v_{\alpha_{i-1}}$, $\partial^- e_{\kappa_i} = v_{\alpha_i}$ if $\varepsilon_i = +1$, and $\partial^+ e_{\kappa_i} = v_{\alpha_i}$, $\partial^- e_{\kappa_i} = v_{\alpha_{i-1}}$ if $\varepsilon_i = -1$. When $v_\alpha = v_\beta$, it is called a closed path considering $P$ as a cyclic sequence. If in the sequence $P$ of a path no edge appears more than once, it is called a simple path. Similarly, if no vertex appears more than once, it is called elementary. If all $\varepsilon_i = +1$, it is called a direct path. Direct closed paths, etc., are similarly defined.

If we define $v \sim v'$ by the existence of a path from $v$ to $v'$, this $\sim$ is an equivalence relation. Let us denote the equivalence classes by $V_1, \dots, V_s$. The section graph $G_i = (V_i, E_i, \partial_i^+, \partial_i^-)$ determined by $V_i$ is called a connected component, or simply a component, of $G$. The sets $E_1, \dots, E_s$ are mutually disjoint and the union is $E$. If we denote by $v \to v'$ the existence of a direct path from $v$ to $v'$, the relation $\to$ is a †pseudo-order. The relation $v \leftrightarrow v'$ defined by $v \to v'$ and $v' \to v$ is an equivalence relation in $V$, and the equivalence classes $\bar V_1, \dots, \bar V_t$ or the section graphs $\bar G_i = (\bar V_i, \bar E_i, \bar\partial_i^+, \bar\partial_i^-)$ determined by $\bar V_i$ are called strongly connected components of $G$. $\bar E_1, \dots, \bar E_t$ are mutually disjoint, but their union is not always $E$. Among the sets $\bar V_1, \dots, \bar V_t$ we can define an order relation $\bar V_i \to \bar V_j$ by the existence of $v \in \bar V_i$, $v' \in \bar V_j$ with $v \to v'$. Classification by $\leftrightarrow$ is a refinement of that by $\sim$. When $s = 1$, the graph $G$ is called connected, and when $t = 1$, $G$ is called strongly connected.

For a different pair $v, v' \in V$, a set $S \subset V - \{v, v'\}$ is called a separator of $v$ and $v'$ if every path from $v$ to $v'$ contains at least one vertex of $S$. When every separator $S$ for any pair $v, v'$ has at least $k$ elements, the graph $G$ is called $k$-connected. For $k > 3$, there are several variations of the definition of $k$-connectedness. 1-connectedness is equivalent to connectedness in the sense defined above.

A simple path containing all edges of a graph is called an Euler path. Euler direct paths, etc., are similarly defined. A graph with an Euler path is called an Euler graph. A graph is an Euler graph if and only if (i) it is connected, and (ii) the number of vertices with odd degrees is 0 or 2 (Euler's unicursal graph theorem [1]). Similar results are known for Euler closed paths or Euler direct paths. An elementary path containing all vertices of a graph is called a Hamilton path. Hamilton closed paths, etc., are similarly defined. No general criterion for the existence of a Hamilton path in a given graph is known. This is known to be an †NP-complete problem (→ 71 Complexity of Computations). Various sufficient conditions or necessary and sufficient conditions for special graphs have been given (→ e.g., [S]).

G. Tieset, Cutset, Tree, and Cotree

For a graph $G = (V, E, \partial^+, \partial^-)$, we define its incidence matrix $[\partial^\alpha_\kappa\ (\alpha = 1, \dots, M\ (= |V|),\ \kappa = 1, \dots, n\ (= |E|))]$ by defining $\partial^\alpha_\kappa$ to be $1$ if $v_\alpha = \partial^+ e_\kappa \ne \partial^- e_\kappa$, $-1$ if $v_\alpha = \partial^- e_\kappa \ne \partial^+ e_\kappa$, and $0$ if $\partial^+ e_\kappa = \partial^- e_\kappa$ or $e_\kappa \notin \delta^+ v_\alpha \cup \delta^- v_\alpha$. Similarly we define its adjacency matrix $[r^\alpha_\beta\ (\alpha, \beta = 1, \dots, M)]$ by defining $r^\alpha_\beta$ to be $0$ if $v_\alpha$ and $v_\beta$ are not mutually adjacent and $1$ if $v_\alpha$ and $v_\beta$ are mutually adjacent.

A set of edges forming a closed path is called a tieset. A set of edges of the form $\{e \mid \partial^+ e \in W,\ \partial^- e \in V - W\} \cup \{e \mid \partial^- e \in W,\ \partial^+ e \in V - W\}$ for a partition $(W, V - W)$ of $V$ is called a cutset. A maximal subset of edges containing no tiesets is called a tree, and a maximal subset of edges containing no cutsets is called a cotree. A tree is sometimes called a spanning tree of $G$. Every tree is the complement (with respect to $E$) of a cotree, and vice versa.

The number of elements of a tree is always the same and equal to the †rank $m$ of the incidence matrix. This value $m$ is called the rank of the graph $G$. Similarly, the number $k$ of elements of a cotree is always the same, which is called the nullity or the cyclomatic number of the graph $G$. Always, $k = n - m$.

Let $K$ be a field, $K^n$, $K^M$ be vector spaces over $K$ of dimensions $n$ and $M$, respectively, and $K^{n*}$, $K^{M*}$ be the †dual spaces of $K^n$, $K^M$. The incidence matrix defines two mutually contragradient linear mappings $\partial: K^n \to K^M$ and $\delta: K^{M*} \to K^{n*}$ with respect to their canonical bases. A minimal set among the family of supports (in $E$) of nonzero vectors in the kernel $\operatorname{Ker} \partial$ is a tieset corresponding to an elementary closed path. Similarly, a minimal set among the family of supports (in $E$) of nonzero vectors of the image $\operatorname{Im} \delta$ is a minimal cutset.

Let $T\ (\subset E)$ be a tree and $\bar T = E - T$ be a cotree. Let us renumber the edges so that $T = \{e_1, \dots, e_m\}$, $\bar T = \{e_{m+1}, \dots, e_n\}$. For each $e_{m+p} \in \bar T$ there is a unique vector $R^p$ $(p = 1, \dots, k\ (= n - m))$ in $\operatorname{Ker} \partial$ whose support is in $\{e_{m+p}\} \cup T$ and whose $e_{m+p}$-component equals $+1$. Similarly, for each $e_a \in T$ $(a = 1, \dots, m)$, there is a unique vector $D^a$ $(a = 1, \dots, m)$ in $\operatorname{Im} \delta$ whose support is in $\{e_a\} \cup \bar T$ and whose $e_a$-component equals $+1$. $\{R^1, \dots, R^k\}$ is a basis of $\operatorname{Ker} \partial$, and $\{D^1, \dots, D^m\}$ is a basis of $\operatorname{Im} \delta$. Both matrices $R^p_\kappa$ $(k \times n)$ and $D^a_\kappa$ $(m \times n)$ are totally unimodular, i.e., every minor determinant is $0$, $+1$, or $-1$. Furthermore, the following relations hold: $R^p_{m+q} = \delta^p_q$ $(p, q = 1, \dots, k)$, $D^a_b = \delta^a_b$ $(a, b = 1, \dots, m)$, $R^p_a +$

$D^a_{m+p} = 0$ $(a = 1, \dots, m;\ p = 1, \dots, k)$. The matrices $R^p_\kappa$ and $D^a_\kappa$ are called the fundamental tieset matrix and the fundamental cutset matrix, respectively, with respect to the tree-cotree pair $(T, \bar T)$. The minor of the matrix $R^p_\kappa$ consisting of all rows and $k$ columns $\kappa_1, \dots, \kappa_k$ is $+1$ or $-1$ if and only if $\{e_{\kappa_1}, \dots, e_{\kappa_k}\}$ is a cotree, and $0$ otherwise. Similarly, the minor of the matrix $D^a_\kappa$ consisting of all rows and $m$ columns $\kappa_1, \dots, \kappa_m$ is $+1$ or $-1$ if and only if $\{e_{\kappa_1}, \dots, e_{\kappa_m}\}$ is a tree, and $0$ otherwise.

H. Planarity of a Graph

Let there be a natural one-to-one correspondence between the sets of edges of two graphs $G_i = (V_i, E_i, \partial_i^+, \partial_i^-)$ $(i = 1, 2)$, and suppose that under the correspondence a tree $T_1$ in $G_1$ corresponds to a tree $T_2$ in $G_2$ and that the fundamental cutset matrices $D_1$ and $D_2$ with respect to $(T_i, E_i - T_i)$ are mutually equal. Then $G_1$ and $G_2$ are said to be 2-isomorphic. The definition is equivalent to the one given by the equality of fundamental tieset matrices. 2-isomorphism is an equivalence relation. If $G_1$ and $G_2$ are 2-isomorphic, the families of trees, cotrees, tiesets and cutsets are mutually corresponding. The coincidence of one of the families is a sufficient condition for the 2-isomorphism of $G_1$ and $G_2$ as undirected graphs. A 3-connected graph has no 2-isomorphic graph other than itself.

Similarly, when a tree $T_1$ of $G_1$ corresponds to a cotree $\bar T_2$ of $G_2$ and if the fundamental cutset matrix of $G_1$ with respect to $(T_1, E_1 - T_1)$ is equal to the fundamental tieset matrix of $G_2$ with respect to $(E_2 - \bar T_2, \bar T_2)$ as matrices, then we say that $G_2$ is dual to $G_1$. In this case, $G_1$ is dual to $G_2$ also. If $G_2$ is dual to $G_1$ and $G_3$ is dual to $G_2$, then $G_1$ and $G_3$ are 2-isomorphic. The duality is a relation among the equivalence classes of graphs by 2-isomorphisms.

Any graph $G = (V, E, \partial^+, \partial^-)$ can be "drawn" in 3-dimensional Euclidean space in the following sense: Each vertex is a (distinct) point, and an edge $e$ is an arc connecting two points $\partial^+ e$ and $\partial^- e$ with a direction pointing from $\partial^+ e$ to $\partial^- e$ in such a way that no two arcs intersect. A graph representable on a plane or on a 2-dimensional sphere in the above sense is called a planar graph. A graph $G$ is planar if and only if it has a dual graph (H. Whitney [7]). Another necessary and sufficient condition is that as an undirected graph, neither the complete five-point graph $K_5$ nor the bipartite complete graph of three-three points $K_{3,3}$ appears in any subcontraction of the graph $G$. (This is a version of the criterion of C. Kuratowski [8].)

I. Coloring of Graphs

A coloring of the vertices of a graph $G = (V, E, \partial^+, \partial^-)$ is a mapping $\psi$ from $V$ to the set of integers $N$ satisfying the condition $\psi(v) \ne \psi(v')$ for all adjacent vertices $v$ and $v'$. $\gamma(G) = \min\{|\psi(V)| \mid \psi \text{ is a coloring of the vertices of } G\}$ is called the chromatic number of $G$. If the graph $G$ is drawn over a 2-dimensional closed surface with a 1-dimensional †Betti number $b$, we have $\gamma(G) \le \lfloor (7 + \sqrt{1 + 24b})/2 \rfloor$, where $\lfloor\ \rfloor$ denotes the integral part. Except for the †Klein bottle (with $b = 2$), where $\gamma(G) \le 6$, this is the best possible, i.e., there exists a graph whose $\gamma(G)$ equals the upper bound on the right-hand side. As for $b \ge 1$, the inequality was first proved by P. J. Heawood (1890), and final results were established by J. W. T. Youngs and G. Ringel [10]. When $b = 0$, $\gamma(G) \le 5$ was shown by A. B. Kempe (1879), but the four color conjecture: "$\gamma(G) \le 4$?" has remained unsolved for more than a hundred years. The conjecture is believed to have been solved affirmatively recently through checking a huge number of cases on a large computer [11, 12].

A subset $W \subset V$ is called an independent set or an internally stable set if no two vertices in $W$ are mutually adjacent. $\alpha(G) = \max\{|W| \mid W \text{ is an independent set of } G\}$ is called the number of independence of $G$. A subset $W$ of $V$ is called a dominating set or externally stable set of $G$ if every vertex $v \in V$ either belongs to $W$ or is adjacent to a vertex of $W$. $\beta(G) = \min\{|W| \mid W \text{ is a dominating set of } G\}$ is called the number of domination of $G$. For every graph $G$, we have $\alpha(G) \ge \beta(G)$ and $\gamma(G) \cdot \alpha(G) \ge |V|$.

J. Decision Problems and Graphs

There are many interesting topics in decision problems concerning graphs, especially from the standpoint of †complexity of computations (→ e.g., [13]). The following are some typical problems: (i) Problems for which algorithms of polynomial order are known: Is the given graph $k$-connected, strongly connected, an Euler graph, or a planar graph? (ii) †NP-complete problems: Is the given graph a Hamilton graph? Do we have $\alpha(G) = k$, $\beta(G) = k$, or $\gamma(G) = k$?

Let $G_1$ and $G_2$ be two graphs. The problem of whether they coincide as unlabeled graphs is called the isomorphism problem, and has been studied for many years in connection with problems concerning the structure of chemical compounds. Unfortunately, no algorithm of polynomial order is known; nor do we know whether this is an NP-complete problem. As for the isomorphism problem for

planar graphs, algorithms of polynomial order are known.

K. Perfectness Theorem

Let the number of independence and the chromatic number of a graph $G = (V, E, \partial^+, \partial^-)$ be $\alpha(G)$ and $\gamma(G)$, respectively. Let $V_1, \dots, V_r$ be a disjoint decomposition of $V$. We denote by $\theta(G)$ the minimal number of $r$ such that every section graph of $G$ over $V_i$ contains a complete graph. Furthermore, we denote by $\omega(G)$ the maximum value of $|W|$ for a subset $W \subset V$ such that the section graph of $G$ over $W$ contains a complete graph. We always have $\alpha(G) \le \theta(G)$ and $\gamma(G) \ge \omega(G)$. $G$ is called $\alpha$-perfect if every section graph $H$ of $G$ satisfies $\alpha(H) = \theta(H)$. Similarly, $G$ is called $\gamma$-perfect if $\gamma(H) = \omega(H)$ for every section graph $H$ of $G$. The conjecture that $\gamma$-perfectness and $\alpha$-perfectness are equivalent has recently been solved affirmatively [14]. This is called the perfectness theorem.

References

[1] L. Euler, Solutio problematis ad geometriam situs pertinentis, Commentarii Academiae Petropolitanae, 8 (1736), 128-140.
[2] F. Harary, Graph theory, Addison-Wesley, 1969.
[3] C. Berge, Graphes et hypergraphes, Dunod, second edition, 1974.
[4] A. A. Zykov, Finite graph theory (in Russian), Nauka, 1969.
[5] M. Iri, Network flow, transportation and scheduling, Academic Press, 1969.
[6] H. Walther and H.-J. Voss, Über Kreise in Graphen, VEB Deutscher Verlag der Wissenschaften, 1974.
[7] H. Whitney, Non-separable and planar graphs, Trans. Amer. Math. Soc., 34 (1932), 339-362.
[8] C. Kuratowski, Sur le problème des courbes gauches en topologie, Fund. Math., 15 (1930), 271-283.
[9] F. Harary and W. Tutte, A dual form of Kuratowski's theorem, Canad. Math. Bull., 8 (1965), 17-20, 373.
[10] G. Ringel, Map color theorems, Springer, 1974.
[11] K. Appel and W. Haken, Every planar map is four colourable, I. Discharging, Illinois J. Math., 21 (1977), 429-489; K. Appel, W. Haken and J. Koch, II. Reducibility, Illinois J. Math., 21 (1977), 490-567.
[12] S. Hitotumatu, Four color problem (in Japanese), Kodansha, 1978.
[13] A. V. Aho, J. E. Hopcroft, and J. D. Ullman, The design and analysis of computer algorithms, Addison-Wesley, 1974.
[14] L. Lovász, Normal hypergraphs and the perfect graph conjecture, Discrete Math., 2 (1972), 253-267.


187 (XXI.2)
Greek Mathematics

It is generally believed that theoretical mathematics originated with the Greeks. The Greeks learned the arts of land surveying and commercial arithmetic from earlier civilizations; but they developed theoretical mathematics themselves, toward the middle of the 4th century B.C. The creation of a mathematics that transcends practical purposes was one of the most remarkable events in the history of human culture, and one that had an immense impact on the development of all branches of science. We owe the reestablishment of important Greek mathematical texts and the reconstruction of the development of Greek mathematics to the historians of the 19th century (the oldest extensive exposition of the history of mathematics before Euclid is due to Proklos (Proclus) (410-485)).

The earliest known Greek mathematicians are Thales of Miletus (c. 639-546 B.C.) and Pythagoras of Samos (fl. 510? B.C.). Both were Ionians, but the latter went to what is now southern Italy and founded a semireligious school whose members called themselves Pythagoreans. Their motto was "Everything is number"; their studies were called mathema ("what is learned") and consisted of music, astronomy, geometry, and arithmetic (the subject group called the quadrivium, which formed the core of medieval and later higher education) "for the purification of soul." Their research delved into the theories of proportion (in relation to music) and †polygonal numbers (triangular numbers, square numbers, etc.), and more generally into the theory of numbers and geometric algebra. It is said that they knew of the irrationality of $\sqrt{2}$, though no evidence of this has been found. Even after the demise of the Pythagorean school, its followers continued to promote mathematics in collaboration with the Academy of Plato.

Another significant school was the Eleatic. Among its members Zeno (c. 490-c. 430 B.C.) is especially important. †Zeno's paradoxes are arguments leading to absurdity. Some see within them the origin of logical reasoning and, consequently, of theoretical mathematics [3]. It is chronologically difficult to attribute

to Zeno the consideration of the continuum and irrational numbers, but we can find in him the impetus toward atomistic reasoning. The computation of the volume of pyramids (by dividing them into "atomistic" laminae) by Democritus (fl. 430? B.C.) and the atomistic calculation of the area of circles by Antiphon (fl. 430 B.C.) came shortly after the time of Zeno.

The middle decades of the 5th century B.C. are known as the Age of Pericles, the Golden Age of Athens. The †trisection of an angle, the †duplication of a cube, and the †quadrature of a circle, known at that time as the "three big problems" (→ 179 Geometric Construction), were studied by the Sophists. Hippias of Elis (fl. 420 B.C.), Hippocrates of Chios (fl. 430 B.C. in Athens), Archytas of Taras (c. 430-365 B.C.), Menaechmus (fl. 350 B.C.), and his brother Dinostratus (fl. 350 B.C.) solved these problems using conic sections and the quadratrix (a transcendental curve whose equation is $y = x \cot(\pi x/2)$).

By 400 B.C. Athens had lost its political influence, but it remained the center of Greek culture. It was during this time that Plato's Academy flourished, and Plato (427-347 B.C.) and his followers laid particular importance on mathematics. Archytas, Menaechmus, and Dinostratus belonged to or were closely associated with this school. During the first fifty years of the Academy, research in the following fields was pursued: methodology of mathematics or science in general (i.e., dialectics, analysis, synthesis); geometric reconstruction of Mesopotamian algebra; the theory of irrationals in relation to the geometrization of algebra (Theodorus of Cyrene (5th century B.C.), who was Plato's teacher, as well as Theaitetus of Athens (415?-369 B.C.) contributed to this study, and the general theory of proportion by Eudoxus of Cnidos (c. 408-c. 355 B.C.) also belongs to this field); the method of exhaustion (by Eudoxus); and studies of the "three big problems" and conic sections. It was this school in which the term mathema came to be used in its present sense of "mathematics" rather than in the sense of disciplines in general.

The conquests of Alexander the Great accelerated the already considerable cultural influence of Athens. Later, during the Ptolemaic period, the center of culture moved to Alexandria. The Mouseion at Alexandria, the combined library and university, is said to have possessed hundreds of thousands of volumes.

At Alexandria, Euclid (c. 300 B.C.) compiled his Elements, which became a model for scientific works for centuries to come; Newton's Principia as well as Spinoza's Ethics are modeled on it. Most historians say that Euclid's method derives from Aristotle (384-322 B.C.), who, after studying at Plato's Academy, founded a new school, the Peripatetics, whose doctrines are at many points opposed to those of the Academy. Others, however, see the origin of Euclid's axiomatic method in the Eleatics [5]; we can find prototypes of some parts of the Elements in both Oenopides, who lived during the time of Zeno, and in Hippocrates.

The third century B.C. was the Golden Age of Greek mathematics. Archimedes of Syracuse (c. 282-212 B.C.) was the greatest mathematician, mechanic, and technician of antiquity. He did important work in mathematics, studying the exact quadrature of the parabola. According to his ephodos (method), he would obtain a result by mechanical experiments and then prove it by the method of "exhaustion." He also computed the value of $\pi$; studied spirals and other curves, spheres, and circular cylinders; contributed to the development of statics and optics and their application; and had a profound influence on later mathematicians. During the same period, Apollonius of Perga (fl. 210 B.C.) wrote Konikon biblia (Books on Conics) in eight books, of which the last has been lost. The geometric theory of †conic sections contained in this work is not much different from the one we know today; it had a great influence on 17th-century scientists especially. Other mathematicians of this period worth noting are Eratosthenes (c. 275-195 B.C.), who conceived the †sieve method of finding prime numbers and who measured the earth, and Hipparchus (fl. 150 B.C.), called the father of astronomy, who made a table of sines.

Hellenistic influence began to decline in the first century B.C., and the influence of Alexandria decreased. The Mouseion burned in 48 B.C., but was rebuilt. Among the mathematicians of this time, we may count Heron (fl. 60? A.D.); Menelaus (fl. 100 A.D.), who wrote Sphaerica; Theon of Smyrna (fl. 125 A.D.); Ptolemy (fl. 150 A.D.), the author of Almagest; Nicomachus (50?-150? A.D.), the author of Arithmetike eisagoge; Diophantus (fl. 250? A.D.), whose career is not fully known but who wrote Arithmetika, of which six of the original thirteen books remained to influence †Fermat; and Pappus (fl. 300 A.D.), the last creative mathematician in Greece, who left eight books of the Synagoge, which influenced †Descartes and which still exist today.

The period following the fall of the Western Roman Empire was a difficult one for Greco-Egyptian science. The Mouseion was destroyed for the second time in 392 A.D. Theon of Alexandria (fl. 380) and his daughter, Hypatia (c. 370-415), were at that time working on commentaries on the classics. Among the few remaining works of the period are Proclus' (410-485) commentaries on the first book of Euclid's Elements. The Athenian Academy was closed in 529 by order of the Emperor Justinian; the last director was Simplicius, who commented on Aristotle. Soon afterward, Alexandria fell into the hands of the Moors, and many scholars fled as refugees to Constantinople, the capital of the Eastern Empire.

References

[1] T. L. Heath, A history of Greek mathematics I, II, Clarendon Press, 1921.
[2] T. L. Heath, A manual of Greek mathematics, Oxford Univ. Press, 1931 (Dover, 1963).
[3] M. Cantor, Vorlesungen über Geschichte der Mathematik I, Teubner, third edition, 1907.
[4] B. L. van der Waerden, Zenon und die Grundlagenkrise der griechischen Mathematik, Math. Ann., 117 (1940), 141-161.
[5] A. Szabó, Anfänge des euklidischen Axiomensystems, Arch. History Exact Sci., 1 (1960), 37-106.
[6] J. L. Heiberg (ed.), Euclidis opera omnia I-XIII and suppl., Leipzig, 1883-1916.
[7] T. L. Heath, The thirteen books of Euclid's Elements I, II, III, Cambridge Univ. Press, 1908 (Dover, 1956).
[8] P. Ver Eecke (trans.), Les oeuvres complètes d'Archimède, Desclée de Brouwer & Cie, 1921 (Blanchard, 1961).
[9] P. Ver Eecke (trans.), Proclus Diadochus, les commentaires sur le premier livre des Eléments d'Euclide, Desclée de Brouwer & Cie, 1948 (Blanchard, 1959).
[10] P. Tannery, La géométrie grecque, comment son histoire nous est parvenue et ce que nous en savons, Paris, 1887.
[11] B. L. van der Waerden, Science awakening, Noordhoff, 1954.
[12] M. Clagett, Greek science in antiquity, Abelard-Schuman, 1955 (Collier, 1963).
[13] O. Becker (ed.), Zur Geschichte der griechischen Mathematik, Darmstadt, 1965.
[14] A. Szabó, Anfänge der griechischen Mathematik, Oldenbourg, 1969.
For the history of mathematics in general,
[15] R. C. Archibald, Outline of the history of mathematics, Mathematical Association of America, sixth edition, 1949.
[16] N. Bourbaki, Eléments d'histoire des mathématiques, Hermann, second edition, 1969.
[17] M. Cantor, Vorlesungen über Geschichte der Mathematik I-IV, Teubner, second edition, 1894-1908.
[18] E. Montucla, Histoire des mathématiques I-IV, Paris, new edition, 1799-1802.
[19] D. E. Smith, A source book in mathematics, McGraw-Hill, 1929 (Dover, 1959).
[20] D. J. Struik, A concise history of mathematics, Dover, third edition, 1967.


188 (XIII.28)
Green's Functions

A. General Remarks

Green's functions are usually considered in connection with †boundary value problems for ordinary differential equations and also with †elliptic and †parabolic partial differential equations. For example, consider boundary value problems for the †Laplacian in 3-dimensional space: $L[u] = (\partial^2/\partial x_1^2 + \partial^2/\partial x_2^2 + \partial^2/\partial x_3^2)u$. Let $D$ be a bounded domain with a smooth boundary $S$ and the boundary condition $B$ on $S$ be either $u(x) = 0$ $(x \in S)$ (the first kind) or $\partial u/\partial n + \beta u = 0$ $(x \in S)$ (the third kind), where $n$ is the outer normal of unit length, $\beta(x) \ge 0$, and $\beta(x) \not\equiv 0$. We say that the function $g(x_1, x_2, x_3; \xi_1, \xi_2, \xi_3)$ is the Green's function of $L$ (or the partial differential equation $L[u] = 0$) relative to the boundary condition $B$, when (i) $g(x, \xi)$ satisfies $L_x[g(x, \xi)] = 0$ except for $x = \xi$; (ii) $g(x, \xi) = -1/4\pi r + \omega(x, \xi)$, where $r = (\sum_{i=1}^{3} (x_i - \xi_i)^2)^{1/2}$ and $\omega(x, \xi)$ is a regular function, i.e., of class $C^\nu$ for a suitable value $\nu$; (iii) $g(x, \xi)$ satisfies the boundary condition $B$, i.e., $g(x, \xi) = 0$, $x \in S$ (the first kind), or $(\partial/\partial n_x + \beta)\, g(x, \xi) = 0$, $x \in S$ (the third kind). Conditions (i) and (ii) mean that $g(x, \xi)$ is a †fundamental solution of $L$, i.e., $L_x[g(x, \xi)] = \delta(x - \xi)$, where $\delta(x - \xi)$ is †Dirac's measure at the point $x = \xi$. Note that if $g(x, \xi)$ is a fundamental solution, then by adding any solution $u$ of the equation $L[u] = 0$ to $g$, we obtain another fundamental solution $g + u$. Thus Green's function is the fundamental solution that satisfies the given boundary condition. To be more precise, in the boundary value problem, if the boundary condition is of the first kind, $g(x, \xi)$ can be obtained by adding to a fundamental solution $-1/4\pi r$ the solution $\omega(x, \xi)$ of the following Dirichlet problem: $\Delta_x \omega(x, \xi) = 0$, $\omega(x, \xi) = 1/4\pi r$ $(x \in S)$. We remark that there are slightly different definitions for Green's function. For example, there are cases where $g(x, \xi)$ is defined by $g(x, \xi) = 1/4\pi r + \omega(x, \xi)$ or by $g(x, \xi) = 1/r + \omega(x, \xi)$.

In the case in the previous paragraph, Green's functions satisfying the boundary conditions are uniquely determined. In general, if we are given a Green's function, then

for any regular function $v(x)$ the function

$$u(x) = \int_D g(x, \xi)\, v(\xi)\, d\xi$$

represents the solution of $L[u] = v$ with the boundary condition $B$. More precisely, if $v(x)$ satisfies the †Hölder condition $|v(x) - v(x')| \le L|x - x'|^\alpha$ $(0 < \alpha < 1)$ ($L$, $\alpha$ positive constants), then $u(x)$ is of class $C^2$. Conversely, if $u(x)$ satisfies the equation $L[u] = v$ and the boundary condition $B$, it is represented by the formula for $u(x)$. This means that if we denote the operator that associates $u$ to $v$ by $G$, then $G$ is the inverse operator of the Laplacian $L$ with the boundary condition $B$, and Green's function is the †integral kernel of the operator $G$. Using this property, the boundary value problem relative to $L$ can be reduced to a problem of †integral equations. For example, the differential equation with the boundary condition $B$ containing the complex parameter $\lambda$, $L[u] + \lambda u = f$, is equivalent to the integral equation $u + \lambda G[u] = G[f]$, which is obtained by letting $G$ act from the left on the above differential equation. In this way, the problem can be simplified.

In the case of general boundary value problems for higher-order elliptic operators, Green's functions are defined in the same way as before (→ 189 Green's Operator). The important case is when $L$ and the boundary condition $B$ define a †self-adjoint operator. In this case, Green's function is symmetric ($g(x, \xi) = g(\xi, x)$). To obtain Green's function is not easy in general. However, in some cases such functions can be obtained fairly easily (→ Appendix A, Table 15.N).

B. Self-Adjoint Ordinary Differential Equations of the Second Order

Consider the operator $L[u] \equiv (p(x)u')' + q(x)u$ $(p(x) > 0)$ defined in the interval $a \le x \le b$, with boundary conditions of the form $\alpha u' + \beta u = 0$ at the two endpoints. Then Green's function $g(x, \xi)$ is defined in the following way: (i) For $x \ne \xi$, $L[g(x, \xi)] = 0$; (ii) $[\partial g(x, \xi)/\partial x]_{x=\xi-0}^{x=\xi+0} = 1/p(\xi)$; (iii) for $\xi$ fixed, $g(x, \xi)$ satisfies the homogeneous boundary conditions at $x = a$ and $x = b$.

Conditions (i) and (ii) mean that $L[g(x, \xi)] = \delta(x - \xi)$. We can construct $g(x, \xi)$ in the following way: Let $u_1$ ($u_2$) be the solution of $L[u] = 0$ satisfying the boundary condition at $x = a$ (at $x = b$). If $u_1$ and $u_2$ are linearly independent, we can satisfy $p(u_1 u_2' - u_1' u_2) = 1$ by choosing the constants suitably. Then Green's function $g(x, \xi)$ is defined by $g(x, \xi) = u_1(x)u_2(\xi)$ for $x \le \xi$, and $g(x, \xi) = u_1(\xi)u_2(x)$ for $\xi \le x$. If $u_1$ and $u_2$ are linearly dependent, there exists no Green's function that satisfies conditions (i), (ii), and (iii). However, by modifying the definition, we can get a generalized Green's function playing a similar role [2]. This method can be applied to the case of ordinary differential equations of higher order.

C. The Laplace Operator

When the domain $D$ is the $n$-dimensional sphere of radius $a$ with center at the origin, Green's function of the Laplacian relative to the boundary condition $u = 0$ is obtained in the following way. Let $E(r)$ be the following fundamental solution of the Laplacian: $E(r) = (2\pi)^{-1} \log r$ for $n = 2$ and $E(r) = -((n - 2)\,\omega_n r^{n-2})^{-1}$ for $n \ge 3$, where $\omega_n = 2\pi^{n/2}/\Gamma(n/2)$ is the $(n - 1)$-dimensional surface area of the $n$-dimensional unit sphere. Then Green's function $g(x, \xi)$ is defined by

$$g(x, \xi) = E(r) - E(\rho r'/a),$$

where $\rho = (\sum_{i=1}^{n} \xi_i^2)^{1/2}$, $r' = (\sum_{i=1}^{n} (x_i - \xi_i')^2)^{1/2}$, $\xi_i' = (a/\rho)^2 \xi_i$.

D. Helmholtz's Differential Equation

Let $D$ be an exterior domain with a smooth boundary $S$ in $R^3$. In mathematical physics, the boundary value problem of finding a solution $u(x)$ of Helmholtz's differential equation $(\Delta + k^2)u(x) = f(x)$ $(k > 0)$ satisfying $u(x) = 0$ $(x \in S)$ is of particular interest. In this case, concerning the behavior of $u(x)$ at infinity, we usually assume Sommerfeld's radiation condition:

$$\text{When } |x| \to +\infty, \quad u(x) = O(|x|^{-1}), \quad \partial u/\partial r - iku = o(|x|^{-1}),$$

where $\partial/\partial r$ is the derivative along the radial direction. It is known that this condition ensures the uniqueness of the solution (Rellich's uniqueness theorem). We can construct Green's function $G(x, \xi)$ for any $k$ $(> 0)$ so that for smooth $f(x)$ with bounded †support,

$$u(x) = \int_D G(x, \xi)\, f(\xi)\, d\xi$$

represents the solution satisfying $u(x) = 0$ $(x \in S)$ and Sommerfeld's radiation condition. Then, with

$$G(x, \xi) = -\frac{e^{ik|x - \xi|}}{4\pi |x - \xi|} + K_c(x, \xi),$$

where

K,(x, 5) cari be obtained by solving an integral in this section for regular functions A cp. The
equation of tFredholm type [3,4]. In this case, function g(x, t; 5, r) is called Green? function
there exists no Green’s operator in L, space, relative to the boundary value problem. De-
and G(x, t) cari be considered to be a general- tailed consideration of such elementary cases is
ized Green’s function [4]. found in [6].

E. Stokes’s Differential Equation G. Kernel Functions

Let D be a bounded domain in R3 with The kernel function is closely related to


smooth boundary S, and consider Stokes’s Green’s function of A (the Laplacian) rela-
differential equation in D tive to the lïrst boundary value problem in a
domain in R’.
pAui=&pxi, i= 1,2,3, 2 3=0, First we explain the general detïnitions of
L j=1 axj
the kernel function. Let E be a general set, and
where p, p are positive constants. In hydro- let 5 be a +Hilbert space of complex-valued
dynamics, we consider the boundary value functions defined on E with a suitable inner
problem of finding solutions (u,(x), u,(x), u3(x), product (A g). Suppose that we are given a
p(x)) of Stokes’s equation satisfying the bound- function K(x, y) defined on E x E satisfying
ary condition ui(x) = 0, x E S (i = 1,2,3). In this the following conditions: (i) For any fixed y,
case, Green% tensors Gij(x, 0, gi(x, <) cari be K (x, y) regarded as a function of x belongs to
constructed, and for smooth functions X,(x) 5; ad (ii) for aw f(x) E 5, (f(x), K(x, Y)), =
(i = 1,2,3) the unique solution of this bound- f(y). Then K(x, y) is called a kernel function
ary value problem is represented by or reproducing kernel. The kernel function, if it
exists, is unique and is tpositive detïnite Her-
u;(x) = P 5 Gij(x, 5)Xj(5)dt, mitian; that is,
j=l s jj

P(X) F P $J gj(x, 5)Xj(5)d5


j=l sD
Conversely, any positive detïnite function is
c51. a reproducing kernel of some Hilbert space.
A necessary and sutficient condition for the
existence of the kernel function is that for
F. Parabolic Equations
any y~ E, the linear functional f-f(y) be
Consider the boundary value problem (the bounded. In this case, the minimum of IlfIl
initial boundary value problem): under the condition f(y) = 1 (fi 3) is attained
by the element K(x, y)/K(y, y), and its value is
^I
t>o, K(y,y)m”2. When 3 is a tseparable space, then
L[u,+2~=f(x;r), a<x<h,
by an torthonormal system {q,(x)}, we cari
represent K (x, y) as
4x, 0)= 444~
where at x = a and x = b, u(x, t) satistïes K(x> Y) = f <IOWA. (2)
some homogeneous boundary conditions. V=I
In this case, we cari construct the function As an example of kernel functions, the fol-
g(x, t; 5, z) (t > r) satisfying the following con- lowing case is of particular importance. Let E
ditions: (i) L [g] = 0 except for x = 5. t = r; be an n-dimensional tcomplex manifold, <pand
(ii) $ be holomorphic tdifferential forms of degree
n on E, and let the following inner product be
g(x t,5 T)=exp(-(x-5)2i4cz(t-~))
/, > given:
2cJ55j

+(regular function)

in a neighborhood of x = 5, t = z; and (iii) Now 3 is given by s={<pl<p,<p)< +co}, and


g(x, t; 5, r) satisfies the given homogeneous the kernel function is called the kernel dif-
boundary conditions at x = a and x = h. Then ferential. When E is a domain in C”, regarding
the coefficients of the differential form as func-
tions, we cal1 the kernel function Bergman%
kernel function. Moreover, if E cari be mapped
onto a bounded domain by a one-to-one
+ *dx,WMi)d5
s ll holomorphic mapping, then

represents the solution of the problem stated



is positive definite and gives a Kähler metric which is called the Bergman metric.

H. Kernel Functions for Domains in the Complex Plane

Let $E$ be a domain $D$ in the complex plane ($z = x + iy$). Let $K(z, \zeta)$ be Bergman's kernel function of $D$, and let $G(z, \zeta)$ be Green's function of $D$ relative to the first boundary condition with a pole at $\zeta$. Then we have

$$K(z, \bar{\zeta}) = -\frac{2}{\pi}\, \frac{\partial^2 G(z, \zeta)}{\partial z\, \partial \bar{\zeta}}. \qquad (3)$$

Next, let $U(z, \zeta)$ be the kernel function of the Hilbert space consisting of the holomorphic differential forms whose integrals are single-valued, and let $N(z, \zeta)$ be Neumann's function of $D$, i.e., the function that is harmonic in $D - \{\zeta\}$, has the same singularity as $G$ at $\zeta$, and whose derivative in the normal direction $\partial N/\partial n$ is constant along the boundary. Then we have

$$U(z, \bar{\zeta}) = \frac{2}{\pi}\, \frac{\partial^2 N(z, \zeta)}{\partial z\, \partial \bar{\zeta}}. \qquad (4)$$

Now the kernel $H(z, \zeta) = (N(z, \zeta) - G(z, \zeta))/2\pi$ is the kernel function relative to the Hilbert space consisting of all real harmonic functions whose integral mean value along the boundary $\Gamma$ is 0 and having the inner product

$$(\varphi, \psi) = \iint_D \left( \frac{\partial \varphi}{\partial x} \frac{\partial \psi}{\partial x} + \frac{\partial \varphi}{\partial y} \frac{\partial \psi}{\partial y} \right) dx\, dy.$$

The kernel $H(z, \zeta)$ is called a harmonic kernel function.

Suppose that the boundary $\Gamma$ is piecewise smooth, and consider the space of all holomorphic functions in $D$ that are continuous on the boundary of $D$. The inner product of such functions $\varphi$ and $\psi$ is given by $(\varphi, \psi) = \int_\Gamma \varphi \overline{\psi}\, ds$ ($ds$ is the element of the arc length of $\Gamma$). Hence we have a Hilbert space. Then the kernel function relative to this Hilbert space is called Szegő's kernel function, which has a close relation with bounded functions.

The kernel functions enable us to represent holomorphic mappings that map the domain $D$ onto various canonical domains (→ 77 Conformal Mappings).

References

[1] R. Courant and D. Hilbert, Methods of mathematical physics, Interscience, I, 1953; II, 1962.
[2] K. Yosida, Lectures on differential and integral equations, Interscience, 1960. (Original in Japanese, 1950.)
[3] V. D. Kupradze, Randwertaufgaben der Schwingungstheorie und Integralgleichungen, Deutscher Verlag der Wiss., 1956. (Original in Russian, 1950.)
[4] S. Mizohata, Theory of partial differential equations, Cambridge Univ. Press, 1973. (Original in Japanese, 1965.)
[5] F. K. G. Odqvist, Über die Randwertaufgaben der Hydrodynamik zäher Flüssigkeiten, Math. Z., 32 (1930), 329-375.
[6] E. E. Levi, Sull'equazione del calore, Ann. Mat. Pura Appl., (3) 14 (1908), 187-264.
For kernel functions,
[7] S. Bergman, The kernel function and conformal mapping, Amer. Math. Soc. Math. Surveys, 1950.
[8] S. Bergman and M. Schiffer, Kernel functions and elliptic differential equations in mathematical physics, Academic Press, 1953.
[9] N. Aronszajn, Theory of reproducing kernels, Trans. Amer. Math. Soc., 68 (1950), 337-404.
[10] H. Meschkowski, Hilbertsche Räume mit Kernfunktion, Springer, 1962.


189 (XIII.29)
Green's Operator

A. General Remarks

Consider the first and the third boundary value problems for the elliptic equation

$$A[u] \equiv -\Delta u + \sum_{i=1}^{n} a_i(x) \frac{\partial u}{\partial x_i} + c(x) u = f(x)$$

(→ 323 Partial Differential Equations of Elliptic Type). Let $D$ be a bounded domain of $\mathbf{R}^n$ whose boundary $S$ consists of a finite number of smooth hypersurfaces. By the method of orthogonal projection, we take the domain $\mathcal{D}(A)$ as follows: (i) $\{u(x) \mid u(x) \in H^2(D)$ and $u(x) = 0$ for $x \in S\}$ or (ii) $\{u(x) \mid u(x) \in H^2(D)$ and $\partial u/\partial n + \beta(x) u = 0$ for $x \in S\}$ according as we are considering the first or third boundary value problem, where $H^2(D)$ is the Sobolev space (→ 168 Function Spaces). If the operator $A$ is a one-to-one mapping from $\mathcal{D}(A)$ onto the function space $L_2(D)$, we call the inverse operator $A^{-1}$ Green's operator relative to the boundary condition, and we denote it by $G$. In general, the existence of $A^{-1}$ is not guaranteed. However, if we take real $t$ large enough, $G_t = (A + tI)^{-1}$ exists.

Consider the general case where $\lambda$ is a complex parameter: $(\lambda I - A)[u] = f(x)$, $f(x) \in L_2(D)$. Letting $G_t$ act from the left, we have $(I - (\lambda + t) G_t)[u] = -G_t f$. Conversely, if $u \in L_2(D)$ is a solution of this equation, clearly $u(x) \in \mathcal{D}(A)$, and $u(x)$ satisfies the first partial differential equation and the boundary condition. Since $G_t$ is a com-

pact operator in $L_2(D)$, the Riesz-Schauder theorem can be applied (→ 68 Compact and Nuclear Operators). In particular, if $(\lambda + t)^{-1}$ is not an eigenvalue of $G_t$, $u(x) = (\lambda I - A)^{-1} f = -(I - (\lambda + t) G_t)^{-1} G_t f$ represents a unique solution.

In the equations in the first paragraph of this section, if $a_i(x) = 0$ and $c(x)$ and $\beta(x)$ are real, then $G_t$ is a self-adjoint operator in $L_2(D)$, and therefore the Hilbert-Schmidt expansion theorem can be applied. Namely, let $\{\lambda_i\}$ be the eigenvalues of $A$ such that $A\omega_i(x) = \lambda_i \omega_i(x)$, where $\{\omega_i(x)\}$ is an orthonormal system in $L_2(D)$. Then for any $f(x) \in L_2(D)$, $f(x) = \sum_{i=1}^{\infty} (f, \omega_i)\, \omega_i(x)$, where the right-hand side is taken in the sense of mean convergence. Furthermore, for $f(x) \in \mathcal{D}(A)$, we have the expansion $(Af)(x) = \sum_{i=1}^{\infty} \lambda_i (f, \omega_i)\, \omega_i(x)$ in the same sense.

When $G_t$ is not self-adjoint, let $G_t^*$ be the adjoint operator of $G_t$ in $L_2(D)$. Then $G_t^*$ represents Green's operator relative to the equation

$$(A^* + tI)[u] \equiv -\Delta u - \sum_{i=1}^{n} \frac{\partial}{\partial x_i}\bigl(a_i(x) u\bigr) + (c(x) + t) u = f(x),$$

corresponding to the boundary conditions (i) $u(x) = 0$, $x \in S$ (first boundary condition) and (ii) $(\partial/\partial n) u + \tilde{\beta}(x) u = 0$, $x \in S$, where

$$\tilde{\beta}(x) = \beta(x) + \sum_{i=1}^{n} a_i(x) \cos(n, x_i)$$

with $n$ the outer normal (third boundary condition) [2].

B. Elliptic Equations of Higher Order

Green's operator can be defined for elliptic equations of higher order. Consider the equation

$$A(x, \partial/\partial x) u(x) = f(x), \quad x \in D;$$
$$B_j(x, \partial/\partial x) u(x) = 0, \quad x \in S, \quad j = 1, 2, \ldots, b \ (= m/2), \qquad (1)$$

where $A$ is an elliptic operator of order $m$ and the boundary operators $\{B_j\}$ satisfy: (i) At every point $x$ of $S$, the normal direction is not characteristic for any $B_j$; and (ii) the order $m_j$ of $B_j$ is less than $m$, and $m_j \ne m_k$ ($j \ne k$). The domain $\mathcal{D}(A)$ of $A$ is defined by

$$\mathcal{D}(A) = \{u(x) \mid u \in H^m(D) \text{ and } B_j(x, \partial/\partial x) u(x) = 0 \text{ for } x \in S,\ j = 1, 2, \ldots, b\}.$$

When $A$ is a one-to-one mapping from $\mathcal{D}(A)$ onto $L_2(D)$, the inverse $G = A^{-1}$ is called a Green's operator.

This general boundary value problem, especially the existence theorem, was treated under some algebraic conditions on $A$ and $\{B_j\}$ by M. Schechter [5], who showed that $G[u] \in H^m(D)$ if $u(x) \in L_2(D)$ and that $G[u]$ depends continuously on $u$. In particular, if $m > n/2$, then by Sobolev's theorem, $G$ is a continuous mapping from $L_2(D)$ into $C^0(D)$, and $G$ is represented by an integral operator of Hilbert-Schmidt type (L. Gårding [6]). Namely, for any $f(x) \in L_2(D)$,

$$(Gf)(x) = \int_D G(x, \xi) f(\xi)\, d\xi.$$

In general, the function $G(x, \xi)$, obtained by the kernel representation of Green's operator $G$, is called Green's function.

On the other hand, consider, for example, the Dirichlet problem of $\Delta$ in $\mathbf{R}^3$. Then Green's function is defined in the following way (→ 188 Green's Functions): $G(x, \xi) = -(4\pi |x - \xi|)^{-1} + u(x, \xi)$, where $u(x, \xi)$ satisfies (i) $\Delta_x u(x, \xi) = 0$, and (ii) $G(x, \xi)|_{x \in S} = 0$. The function defined in this manner coincides with Green's function defined as a kernel representation of Green's operator [1, 2].

Suppose that in problem (1) the elliptic operator $A(x, \partial/\partial x)$ is independent of $x$, and let $E(x)$ be a fundamental solution, i.e., $E(x)$ is a distribution solution of $A(\partial/\partial x) E(x) = \delta(x)$ ($\delta(x)$ is Dirac's $\delta$-function; → 112 Differential Operators). Then $E(x)$ is a $C^\infty$-function away from the origin. Moreover, in a neighborhood of the origin the following estimates hold: For $|\alpha| < m$,

$$\left| \left(\frac{\partial}{\partial x}\right)^{\alpha} E(x) \right| \le \begin{cases} c\,|x|^{m-n-|\alpha|}, & m - n - |\alpha| < 0, \\ c, & m - n - |\alpha| > 0, \end{cases}$$

where $c$ is some positive constant. When problem (1) has Green's operator $G$, we can say the following, using the fundamental solution: Green's function $G(x, \xi)$ exists and can be written as $G(x, \xi) = E(x - \xi) + u(x, \xi)$, where for any fixed $\xi \in D$, $u(x, \xi)$ satisfies (i) $A(\partial/\partial x) u(x, \xi) = 0$ and (ii) $B_j(x, \partial/\partial x) G(x, \xi) = 0$, $x \in S$, $j = 1, 2, \ldots, b$.

C. Hypoelliptic Operators

Let

$$A(x, \partial/\partial x) = \sum_{|\alpha| \le m} a_\alpha(x) \left(\frac{\partial}{\partial x}\right)^{\alpha}, \qquad \left(\frac{\partial}{\partial x}\right)^{\alpha} = \frac{\partial^{|\alpha|}}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}},$$

be a general partial differential operator with $C^\infty$-coefficients. If the kernel $E(x, \xi)$ satisfies $A(x, \partial/\partial x) E(x, \xi) = \delta(x - \xi)$, that is, if we have the relation $\langle E(x, \xi), {}^t\!A(x, \partial/\partial x) \varphi(x) \rangle_x = \varphi(\xi)$ for any $\varphi(x) \in \mathcal{D}$, then $E(x, \xi)$ is called a fundamental solution of $A$, where ${}^t\!A$ is the transposed operator of $A$:

$${}^t\!A(x, \partial/\partial x)\, v(x) = \sum_{|\alpha| \le m} (-1)^{|\alpha|} \left(\frac{\partial}{\partial x}\right)^{\alpha} \bigl(a_\alpha(x) v(x)\bigr).$$

Now if there exists a fundamental solution $E'(x, \xi)$ of ${}^t\!A(x, \partial/\partial x)$ such that (i) $E'(x, \xi)$ defines a kernel that gives rise to two continuous mappings, one of which maps the space $\mathcal{D}_\xi$ into $\mathcal{E}_x$, and the other of which maps the space $\mathcal{D}_x$ into $\mathcal{E}_\xi$ (→ 168 Function Spaces), and (ii) for $x \ne \xi$, $E'(x, \xi)$ is a $C^\infty$-function of $(x, \xi)$, then any distribution $u(x)$ satisfying $A(x, \partial/\partial x) u(x) = g(x)$ is a $C^\infty$-function, where $g(x)$ is of class $C^\infty$. In general, an operator $A$ with the property that any solution $u(x)$ of $A(x, \partial/\partial x) u(x) = g(x)$ is of class $C^\infty$ whenever $g(x)$ is of class $C^\infty$, is called hypoelliptic. Elliptic and parabolic operators are both hypoelliptic. L. Hörmander characterized the hypoelliptic differential operators with constant coefficients [8] (→ 112 Differential Operators).

A kernel $E(x, \xi)$ such that

$$A(x, \partial/\partial x) E(x, \xi) = \delta(x - \xi) + \alpha(x, \xi),$$

with $\alpha(x, \xi)$ a $C^\infty$-function, is called a parametrix of $A$. To prove the hypoellipticity of $A$, it suffices to show the existence of a parametrix $E'(x, \xi)$ of the operator ${}^t\!A$ having the properties (i) and (ii) mentioned in the previous paragraph.

To explain the notion of the fundamental solution for the evolution equations, suppose that we are given an evolution equation

$$L[u] \equiv \frac{\partial^m}{\partial t^m} u(x, t) + \sum_{j < m} a_{\alpha, j}(x, t) \left(\frac{\partial}{\partial x}\right)^{\alpha} \frac{\partial^j}{\partial t^j} u(x, t) = 0,$$

$x \in \mathbf{R}^n$, $t_0 \le t < T$. A kernel $E(x, t; \xi, t_0)$ ($t_0 < t < T$) is called a fundamental solution to the evolution equation if

$$L_{x,t}\bigl(E(x, t; \xi, t_0)\bigr) = 0, \quad t > t_0,$$

and

$$\lim_{t \to t_0 + 0} \frac{\partial^j}{\partial t^j} E(x, t; \xi, t_0) = \begin{cases} 0, & 0 \le j \le m - 2, \\ \delta(x - \xi), & j = m - 1. \end{cases}$$

References

[1] H. G. Garnir, Les problèmes aux limites de la physique mathématique, Birkhäuser, 1958.
[2] S. Mizohata, Theory of partial differential equations, Cambridge Univ. Press, 1973. (Original in Japanese, 1965.)
[3] R. Courant and D. Hilbert, Methods of mathematical physics II, Interscience, 1962.
[4] L. Gårding, Dirichlet's problem for linear elliptic differential equations, Math. Scand., 1 (1953), 55-72.
[5] M. Schechter, General boundary value problems for elliptic partial differential equations, Comm. Pure Appl. Math., 12 (1959), 457-486.
[6] L. Gårding, Applications of the theory of direct integrals of Hilbert spaces to some integral and differential operators, Univ. of Maryland lecture series no. 11, 1954.
[7] L. Schwartz, Théorie des distributions, Hermann, second edition, 1966.
[8] L. Hörmander, On the theory of general partial differential operators, Acta Math., 94 (1955), 161-248.


190 (IV.1)
Groups

A. Definition

Let $G$ be a nonempty set. Suppose that for any elements $a$, $b$ of $G$ there exists a uniquely determined element $c$ of $G$, which is called the product of $a$ and $b$, written $c = ab$. We call $G$ a group or multiplicative group if (i) the associative law $a(bc) = (ab)c$ holds, and (ii) for any elements $a, b \in G$ there exist uniquely determined elements $x, y \in G$ satisfying $ax = b$, $ya = b$. Then the mapping $(a, b) \to ab$ is called multiplication in $G$. Condition (ii) is equivalent to the following two conditions: (iii) There exists an element $e$ (called the identity element or unit element of $G$) such that $ae = ea = a$ for any element $a$ of $G$; and (iv) for any element $a$ of $G$ there exists an element $x$ such that $ax = xa = e$.

The element $x$ in condition (iv) is called the inverse (or inverse element) of $a$, denoted by $a^{-1}$. The uniqueness of the identity element $e$ and the inverse $a^{-1}$ follows readily from the axioms. The identity element of a multiplicative group is sometimes denoted by 1. If $ab = ba$, then we say that $a$ and $b$ commute. The commutative law, $ab = ba$ for any elements $a, b \in G$, is not assumed in general. A group satisfying the commutative law is called an Abelian group (or commutative group) in honor of N. H. Abel, who made use of commutative groups in his study of the theory of equations. The product in a commutative group is often written in the form $a + b$, and in this case the mapping $(a, b) \to a + b$ is called addition. The element $a + b$ is called the sum of $a$ and $b$, and $G$ is called an additive group. In an additive

group the identity element is usually denoted by 0 and called the zero element, and the inverse of $a$ is denoted by $-a$ (→ 2 Abelian Groups; 277 Modules). To describe the law of composition, we sometimes use notation different from multiplication or addition (→ 409 Structures).

B. Examples

A linear space over a field $K$ is an additive group with respect to the usual addition of vectors (→ 256 Linear Spaces). A field is an additive group with respect to the addition, and the set of nonzero elements of a field forms a group with respect to the multiplication, which is called the multiplicative group of the field (→ 149 Fields).

All invertible $n \times n$ matrices over a ring $R$ form a group with respect to the usual multiplication of matrices. This group is called the general linear group of degree $n$ over $R$ (→ 60 Classical Groups).

All one-to-one mappings from a set $M$ onto itself (i.e., all permutations on $M$) form a group with respect to the composition defined by $(f \circ g)(x) = f(g(x))$ ($x \in M$). (Sometimes the product $f \circ g$ is denoted by $gf$ and $f(x)$ by $xf$. Then $x(gf) = (xg)f$.) The group of all permutations on $M$ is called the symmetric group on $M$. A group $G$ is called a permutation group (on $M$) if every element of $G$ is a permutation on $M$. For instance, the general linear group of degree $n$ over a field $K$ may be regarded as a permutation group on the set of $n$-dimensional vectors, and it is also regarded as a permutation group on a tensor space.

All motions in a Euclidean space form a group with respect to the usual composition of motions. All invertible $n \times n$ matrices over $K$ leaving a given quadratic form invariant form a group with respect to the usual multiplication of matrices. This group is called the orthogonal group belonging to the given quadratic form. If $K$ is the complex number field or the real number field, then these groups are Lie groups (→ 13 Algebraic Groups; 151 Finite Groups; 161 Free Groups; 249 Lie Groups; 423 Topological Groups).

C. Fundamental Concepts

If a group $G$ consists of a finite number of elements, then $G$ is called a finite group; otherwise, $G$ is called an infinite group. The number of elements of $G$ is called the order of $G$. A nonempty subset $H$ of $G$ is called a subgroup of $G$ if $H$ is a group with respect to the multiplication of the group $G$. Hence a nonempty subset $H$ is a subgroup of $G$ if and only if $a^{-1}b \in H$ for any $a, b \in H$. For a family $\{H_\lambda\}$ of subgroups of $G$, the intersection $\bigcap_\lambda H_\lambda$ is also a subgroup.

The associative law of multiplication says that elements $a_1, a_2, a_3$ of $G$ determine the product $a_1 a_2 a_3$, which is the common value of $(a_1 a_2) a_3$ and $a_1 (a_2 a_3)$. This law can be generalized to say that any ordered set of $n$ elements $a_1, a_2, \ldots, a_n$ ($n > 2$) of $G$ determines their product $a_1 a_2 \cdots a_n$ (general associative law). When $a_1 = a_2 = \cdots = a_n = a$, we denote the product $a_1 \cdots a_n$ by $a^n$. If we define $a^0 = e$ and $a^{-n} = (a^n)^{-1}$ for $n > 0$, we then have $a^n a^m = a^{n+m}$, $(a^n)^m = a^{nm}$ for any $n, m \in \mathbf{Z}$. If there exists a positive integer $n$ such that $a^n = e$, then the smallest positive integer $d$ with $a^d = e$ is called the order of the element $a$. If there is no such $n$, then $a$ is called an element of infinite order. If $a$ is of infinite order, then its powers $a^0 (= e), a^{\pm 1}, a^{\pm 2}, \ldots$ are all unequal. If $a$ is of order $d$, then the different powers of $a$ are $a^0 (= e), a, a^2, \ldots, a^{d-1}$. All the powers of $a$ form a subgroup $\langle a \rangle$ of $G$, called a cyclic subgroup. The order of an element $a$ is the same as the order of the subgroup $\langle a \rangle$. The group $\langle a \rangle$ itself is called a cyclic group and is an example of an Abelian group (→ 2 Abelian Groups).

Let $S$ be a subset of a group $G$. Then the intersection of all subgroups of $G$ containing $S$ is called the subgroup generated by $S$ and is denoted by $\langle S \rangle$. It is the smallest subgroup containing $S$, and if $S$ is nonempty, $\langle S \rangle$ consists of all the elements of the form $a_1^{m_1} a_2^{m_2} \cdots a_n^{m_n}$ ($a_i \in S$, $m_i \in \mathbf{Z}$). If $\langle S \rangle = G$, the elements of $S$ are called generators of $G$. When $G$ has a finite set of generators, $G$ is said to be finitely generated. When $S = \{a\}$, then $\langle S \rangle$ coincides with $\langle a \rangle$, and the element $a$ is the generator of the cyclic group $\langle a \rangle$. Suppose that elements $a_1, \ldots, a_n$ of $G$ satisfy an equation of the form $a_1^{m_1} a_2^{m_2} \cdots a_n^{m_n} = 1$. This equation is then called a relation among the elements $a_1, \ldots, a_n$. If we have a system of generators and all relations among the generators, then they define a group (→ 161 Free Groups). It is, however, still an open problem to find a general procedure to decide whether the group determined by a given system of generators and the relations among them contains elements other than the identity (→ 161 Free Groups B).

For a subset $S$ and an element $x$ of a group $G$, the set of all elements $x^{-1} s x$ ($s \in S$) is denoted by $x^{-1} S x$ or $S^x$, and $S$ and $S^x$ are called conjugate. We have $(ab)^x = a^x b^x$, $(a^{-1})^x = (a^x)^{-1}$. If $H$ is a subgroup, then $H^x$ is also a subgroup. For a subset $S$, the set of all elements $x$ satisfying $S^x = S$ forms a subgroup $N(S)$, called the normalizer of $S$. The set of all elements that commute with every element of

$S$ forms a subgroup $Z(S)$, called the centralizer of $S$. The centralizer $Z$ of $G$ is called the center of $G$. The set of all elements conjugate to a given element $a$ of $G$ is called a conjugacy class. A group $G$ is the disjoint union of its conjugacy classes.

Let $H$ be a subgroup of a group $G$ and $x$ an element of $G$. The set of elements of the form $hx$ ($h \in H$) is denoted by $Hx$ and is called a right coset of $H$. A left coset $xH$ is defined similarly. $G$ is the disjoint union of left (right) cosets of $H$. The cardinality of the set of left cosets of $H$ equals that of the set of right cosets of $H$; it is called the index of the subgroup $H$ and is denoted by $(G : H)$. Given two subgroups $H$ and $K$ of $G$, the set $HxK = \{hxk \mid h \in H, k \in K\}$ is called the double coset of $H$ and $K$, and $G$ is the disjoint union of different double cosets of $H$ and $K$. If the left cosets of a subgroup $H$ are also the right cosets, i.e., if $Hx = xH$ for every $x \in G$, then $H$ is called a normal subgroup (or invariant subgroup) of $G$. An equivalent condition is that $H = H^x$ for all $x \in G$. The center of $G$ is always a normal subgroup of $G$. If $H$ is a normal subgroup of $G$, the set of all products of an element of $Ha$ and an element of $Hb$ coincides with $Hab$. Thus if we define the product of two cosets $Ha$ and $Hb$ to be $Hab$, then the set of cosets of $H$ forms a group. This group is denoted by $G/H$ and is called the factor group (or quotient group) of $G$ modulo $H$. (When $G$ is an additive group, $G/H$ is also denoted by $G - H$ and is called the difference group.) The group $G$ itself and $\{e\}$ are normal subgroups of $G$. If $G$ has no normal subgroup other than these two, then $G$ is called a simple group. A subgroup of finite index contains a normal subgroup of finite index. If $H$ is a subgroup of finite index, then we can find a common complete system of representatives of the left cosets and the right cosets of $H$. If $G$ is finitely generated, then so is any subgroup of $G$ of finite index.

Let $R$ be an equivalence relation defined in a group $G$. If $xRx'$ and $yRy'$ always imply $(xy)R(x'y')$, then we say that $R$ is compatible with the multiplication. The quotient set $G/R$ is a group with respect to the induced multiplication. This group is called the quotient group of $G$ with respect to $R$. The equivalence class $H$ containing $e$ is a normal subgroup, and $xRx'$ if and only if $x^{-1}x' \in H$, i.e., $x$ and $x'$ are contained in the same coset of $H$. Thus $G/R$ coincides with $G/H$.

If $G$ is a finite group of order $n$, then the order and the index of any subgroup of $G$, the order of any element of $G$, the cardinal number of any conjugacy class of $G$, and the number of different conjugate subgroups of any subgroup of $G$ are all divisors of $n$.

D. Isomorphisms and Homomorphisms

If there is a one-to-one mapping $a \leftrightarrow a'$ of the elements of a group $G$ onto those of a group $G'$ and if $a \leftrightarrow a'$ and $b \leftrightarrow b'$ imply $ab \leftrightarrow a'b'$, then we say that $G$ and $G'$ are isomorphic and write $G \cong G'$. If we put $a' = f(a)$, then $f: G \to G'$ is a bijection satisfying $f(ab) = f(a)f(b)$ ($a, b \in G$). More generally, if a mapping $f: G \to G'$ satisfies $f(ab) = f(a)f(b)$ for all $a, b \in G$, then $f$ is called a homomorphism of $G$ to $G'$. An injective (surjective) homomorphism is also called a monomorphism (epimorphism). If there is a surjective homomorphism $G \to G'$, then we say that $G'$ is homomorphic to $G$. The composite of two homomorphisms is also a homomorphism. If a homomorphism $f: G \to G'$ is a bijection, then $f$ is called an isomorphism. In this case $f^{-1}$ is also an isomorphism, and we have $G \cong G'$.

For a subgroup $H$ of a group $G$, the injective homomorphism $f: H \to G$ defined by $f(a) = a$ ($a \in H$) is called the canonical injection (or natural injection). For a factor group $G/R$ of $G$, the surjective homomorphism $f: G \to G/R$ such that $a \in f(a)$ ($a \in G$) is called the canonical surjection (or natural surjection).

Let $f: G \to G'$ be a homomorphism. Then the image $f(G)$ of $f$ is a subgroup of $G'$, and the kernel $H = \{a \in G \mid f(a) = e'$ (the identity of $G'$)$\}$ of $f$ is a normal subgroup of $G$. The equivalence classes of the equivalence relation given by $f(x) = f(y)$ are just the cosets of $H$, and $f$ induces an isomorphism $\bar{f}: G/H \to f(G)$. The latter proposition is called the homomorphism theorem of groups. This theorem is extended in the following way: For simplicity let $f: G \to G'$ be a surjective homomorphism. (i) If $H'$ is a normal subgroup of $G'$, then the inverse image $H = f^{-1}(H')$ is a normal subgroup of $G$, and $f$ induces the isomorphism $\bar{f}: G/H \to G'/H'$. (ii) If $H$ is a subgroup and $N$ is a normal subgroup of $G$, then $HN = \{hn \mid h \in H, n \in N\}$ is a subgroup of $G$, and the canonical injection $H \to HN$ induces an isomorphism $H/(H \cap N) \to HN/N$. (iii) If $H$ and $N$ are two normal subgroups of $G$ such that $H \supset N$, then the canonical surjection $G \to G/N$ induces an isomorphism $G/H \to (G/N)/(H/N)$. Propositions (i), (ii), and (iii) are called the isomorphism theorems of groups.

A homomorphism of $G$ to itself is called an endomorphism of $G$, and an isomorphism of $G$ to itself is called an automorphism of $G$. The set of automorphisms of $G$ forms a group with respect to the composition of mappings, called the group of automorphisms of $G$. Given an element $a$ of $G$, the mapping $x \to a^{-1}xa$ ($x \in G$) yields an automorphism of $G$ which is called an inner automorphism of $G$. The set of inner automorphisms of $G$ forms a normal subgroup of the group of automorphisms of $G$, called the

group of inner automorphisms of $G$, which is isomorphic to the factor group of $G$ modulo its center. The factor group of the group of automorphisms of $G$ modulo the group of inner automorphisms of $G$ is called the group of outer automorphisms of $G$.

If a mapping $f: G \to G'$ from a group $G$ to another group $G'$ satisfies $f(ab) = f(b)f(a)$ ($a, b \in G$), then $f$ is called an antihomomorphism. A bijective antihomomorphism is called an anti-isomorphism. When $G = G'$, $f$ is called an anti-endomorphism or anti-automorphism (e.g., $f: G \to G$ defined by $f(a) = a^{-1}$ is an anti-isomorphism).

E. Groups with Operator Domain

Let $\Omega$ be a set and $G$ a group. Suppose that for each $\theta \in \Omega$ and $x \in G$, the product $\theta x \in G$ is defined and satisfies $\theta(xy) = (\theta x)(\theta y)$. Then $\Omega$ is called an operator domain of $G$, and $G$ is called a group with operator domain $\Omega$, or simply an $\Omega$-group. (We sometimes write $x^\theta$ instead of $\theta x$.) The mapping $(\theta, x) \to \theta x$ from $\Omega \times G$ to $G$ is called the operation of $\Omega$ on $G$. If $G$ is an $\Omega$-group, then any element $\theta$ of $\Omega$ induces an endomorphism $\theta_G: x \to \theta x$ of $G$. Conversely, if we are given a mapping $\theta \to \theta_G$ of $\Omega$ to the set of endomorphisms of $G$, then we may regard $G$ as an $\Omega$-group. Any group may be regarded as an $\Omega$-group with $\Omega$ equal to the empty set or to the set consisting of the identity automorphism of $G$. Thus the general theory of groups can be extended to the theory of groups with operator domain, and in some cases effective use of suitable operator domains can be fruitful in the investigation of the properties of groups themselves (→ 2 Abelian Groups; 277 Modules).

A subgroup $H$ of an $\Omega$-group $G$ is called an $\Omega$-subgroup (or admissible subgroup) if $\theta x \in H$ for any $\theta \in \Omega$ and $x \in H$. In this case, $H$ is also an $\Omega$-group. If an equivalence relation $R$ defined in $G$ is compatible with the multiplication and also compatible with the operators, namely, if $xRx'$ implies $(\theta x)R(\theta x')$ for any $\theta \in \Omega$, then the quotient group $G/R$ is also an $\Omega$-group. The equivalence class containing $e$ is an admissible normal subgroup. Conversely, if $H$ is an admissible normal subgroup, then the equivalence relation defined by $H$ is compatible with the operators, and the factor group $G/H$ is an $\Omega$-group. A homomorphism $f: G \to G'$ of an $\Omega$-group $G$ to an $\Omega$-group $G'$ is called an $\Omega$-homomorphism (admissible homomorphism or operator homomorphism) if $f(\theta x) = \theta f(x)$ for any $\theta \in \Omega$ and $x \in G$. If $f$ is an isomorphism, then $f$ is called an $\Omega$-isomorphism (admissible isomorphism or operator isomorphism). We have the homomorphism theorem and the isomorphism theorems of $\Omega$-groups if we consider only admissible subgroups and admissible homomorphisms.

F. Sequences of Subgroups

Let $H_1, H_2, \ldots$ be an infinite sequence of (normal) subgroups of a group $G$. If $H_i \subsetneq H_{i+1}$ ($i = 1, 2, \ldots$), then the sequence is called an ascending chain of (normal) subgroups. If $H_i \supsetneq H_{i+1}$ ($i = 1, 2, \ldots$), then it is called a descending chain of (normal) subgroups. If there is no ascending (or descending) chain of (normal) subgroups of $G$, we say that $G$ satisfies the ascending (or descending) chain condition for (normal) subgroups. These conditions are the same as the ascending (or descending) chain condition in the ordered set of all (normal) subgroups of $G$ (→ 311 Ordering C). A group $G$ satisfies the ascending chain condition for subgroups if and only if every subgroup of $G$ is finitely generated. Also, for groups with operator domain we have similar results. The structure of Abelian groups satisfying the ascending (descending) chain condition is completely determined (→ 2 Abelian Groups). It is not known whether there is an infinite group satisfying both the ascending and descending chain conditions for subgroups. A group satisfying the descending chain condition for subgroups has no element of infinite order, but the converse is not true. There is an infinite group which is finitely generated and has no element of infinite order (→ 161 Free Groups C).

G. Normal Chains

A finite sequence $G = G_0 \supset G_1 \supset G_2 \supset \cdots \supset G_r = \{e\}$ of subgroups of a group $G$ is called a normal chain if $G_i$ is a normal subgroup of $G_{i-1}$ for $i = 1, 2, \ldots, r$. We call $r$ the length of the chain. The sequence $G_0/G_1, G_1/G_2, \ldots, G_{r-1}/G_r$ is called the sequence of factor groups of the normal chain. A normal chain $G = H_0 \supset H_1 \supset H_2 \supset \cdots \supset H_s = \{e\}$ is called a refinement of the chain $G = G_0 \supset G_1 \supset \cdots \supset G_r = \{e\}$ if every $G_i$ appears in this chain. Two normal chains with the same length are called isomorphic if there is a one-to-one correspondence between their sequences of factor groups such that corresponding factor groups are isomorphic. Any two normal chains have refinements which are isomorphic to each other (Schreier's refinement theorem). A normal chain is called a composition series (or Jordan-Hölder sequence) if it consists of different subgroups of $G$ and in

any proper refinement there appear two successive subgroups which are the same. The sequence of factor groups of a composition series is called a composition factor series, and the factor groups appearing in this series are called composition factors. Any composition factor is a simple group. As a direct consequence of the refinement theorem we see that if a group $G$ has a composition series, then the composition factor series is unique up to isomorphism and the ordering of the factors. (This theorem is due to O. Hölder. C. Jordan proved that if $G$ is a finite group, then the set of orders of composition factors is independent of the choice of composition series. Hence we call the theorem the Jordan-Hölder theorem.)

For an $\Omega$-group $G$, if we consider only $\Omega$-subgroups, we have definitions and theorems similar to those in this section. When we take the group of inner automorphisms of $G$ as $\Omega$, then a composition series of the $\Omega$-group $G$ is called a principal series. If we take the group of automorphisms of $G$ as $\Omega$, then a composition series is called a characteristic series. An infinite group $G$ does not always have a composition series. Even if $G$ has a composition series, $G$ may have an infinite normal chain $G_1 \subset G_2 \subset \cdots \subset G$ such that each $G_i$ is a normal subgroup of $G_{i+1}$ and $\bigcup G_i = G$. In fact, there is a simple group which has such an infinite normal chain (P. Hall). Two groups which have isomorphic composition series are not necessarily isomorphic. A subgroup of a group $G$ is called a subnormal subgroup of $G$ if it may appear in some normal chain. The intersection of two subnormal subgroups is also subnormal, but their join (i.e., the subgroup generated by both of them) is not necessarily subnormal in an infinite group. The set of subgroups and the set of normal subgroups of a group form lattices with respect to the inclusion relation (for the relationship between these lattices and the group structure → [5]).

H. Commutator Subgroups

Given two elements $a$ and $b$ of a group $G$, we call $a^{-1}b^{-1}ab = [a, b]$ the commutator of $a$ and $b$. The subgroup $C$ generated by all commutators in $G$ is called the commutator subgroup (or derived group) of $G$. The subgroup $C$ is a normal subgroup of $G$, and the factor group $G/C$ is Abelian. On the other hand, if $B$ is a normal subgroup of $G$ and $G/B$ is Abelian, then $B$ contains $C$. For two subsets $A$, $B$ of $G$, the subgroup generated by the commutators $[a, b]$ ($a \in A$, $b \in B$) is called the commutator group of $A$ and $B$ and is denoted by $[A, B]$. If $A$ and $B$ are normal subgroups of $G$, then $C' = [A, B]$ is also a normal subgroup of $G$, and $A/C'$ commutes with $B/C'$ elementwise in the factor group $G/C'$. Furthermore, $[A, B]$ is the minimal normal subgroup with the property. The subgroup $[G, G]$ is the commutator subgroup of $G$.

If the commutator subgroup of $G$ is Abelian, then $G$ is called a meta-Abelian group. If a group $G$ has a normal chain $G (= G_0) \supset G_1 \supset G_2 (= \{e\})$ of length 2 and the factor groups $G/G_1$, $G_1/G_2$ are Abelian, then $G$ is meta-Abelian. Meta-Abelian groups are special cases of solvable groups, discussed in Section I.

I. Solvable Groups

Suppose that we are given a series of subgroups $G_i$ ($i = 0, 1, 2, \ldots$) of $G$ such that $G = G_0$ and $[G_i, G_i] = G_{i+1}$. Then we have a normal chain $G = G_0 \supset G_1 \supset G_2 \supset \cdots$. If $G_r = \{e\}$ for some $r$, then $G$ is called a solvable group. For the normal chain $G (= G_0) \supset G_1 \supset \cdots \supset G_r (= \{e\})$ the factor groups $G_i/G_{i+1}$ ($i = 0, 1, \ldots, r - 1$) are all Abelian. A finite group $G$ is solvable if and only if $G$ has a composition series $G = H_0 \supset H_1 \supset H_2 \supset \cdots \supset H_s = \{e\}$ such that the factor groups $H_i/H_{i+1}$ ($i = 0, 1, \ldots, s - 1$) are all of prime order. An irreducible algebraic equation over a field of characteristic 0 is solvable by radicals if and only if its Galois group is solvable (→ 172 Galois Theory).

J. Nilpotent Groups

The sequence of subgroups $G = G_0 \supset G_1 \supset G_2 \supset \cdots$ defined inductively by setting $G_r = [G, G_{r-1}]$ ($r = 1, 2, \ldots$) is called the lower central series of $G$. If $G_n = \{e\}$ for some $n$, then $G$ is called a nilpotent group, and the least number $n$ with $G_n = \{e\}$ is called the class of the nilpotent group $G$. A nilpotent group is solvable. Let $Z_1$ be the center of $G$, $Z_2/Z_1$ be the center of $G/Z_1$, and so on. Then we have a sequence of subgroups $Z_0 = \{e\} \subset Z_1 \subset Z_2 \subset \cdots$, called the upper central series of $G$. A group $G$ is nilpotent if and only if $Z_m = G$ for some $m$, and the least number $m$ with $Z_m = G$ is the class of $G$. For the subgroups $G_i$ and $Z_i$ ($i = 1, 2, \ldots$), we have $[G_{i-1}, Z_i] = \{e\}$. If $G$ is a Lie group, then $G$ is nilpotent if and only if the corresponding Lie algebra $\mathfrak{g}$ is nilpotent, i.e., $\mathfrak{g}^n = 0$.

K. Infinite Solvable Groups

The concepts of solvability and nilpotency are generalized in several ways for infinite groups. For instance, a group $G$ is called a generalized solvable group if any homomorphic image of $G$

which is unequal to {e} contains an Abelian normal subgroup unequal to {e}, and G is called a generalized nilpotent group if any homomorphic image (≠ {e}) of G has center unequal to {e}. These definitions coincide with the previous ones for finite groups but not for infinite groups [7].

we define the direct product ∏_{λ∈Λ} G_λ of these groups similarly. The set of all elements (..., x_λ, ...) (x_λ ∈ G_λ) such that almost all x_λ (i.e., all except a finite number of λ) are identity elements is a subgroup of the direct product, called the direct sum (or restricted direct product) of {G_λ}.
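The direct sum described above can be illustrated with a small Python sketch. It models the direct sum of copies of the cyclic group Z/n (written additively), one copy for each integer index. The choice of Z/n, the dictionary representation, and the names `multiply` and `inverse` are assumptions made for this illustration only. An element is stored as a dictionary that lists just its finitely many non-identity coordinates, so the condition that almost all coordinates are identity elements holds by construction.

```python
def multiply(x, y, n=3):
    """Componentwise product in the direct sum of copies of Z/n.

    Elements are dicts {index: residue} that list only the coordinates
    different from the identity 0; every unlisted coordinate is 0.
    """
    result = {}
    for i in set(x) | set(y):
        r = (x.get(i, 0) + y.get(i, 0)) % n
        if r:  # identity coordinates are not stored
            result[i] = r
    return result


def inverse(x, n=3):
    """Componentwise inverse; the support of x is unchanged."""
    return {i: (-r) % n for i, r in x.items()}


x = {0: 1, 5: 2}
y = {5: 1, 9: 2}
product = multiply(x, y)           # coordinate 5 becomes the identity
identity = multiply(x, inverse(x))  # the empty dict is the identity element
```

The product of two finitely supported elements is again finitely supported, which is why these elements form a subgroup of the full direct product.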

L. Direct Products

Let G_1, ..., G_n be a finite number of groups. The set G of all elements (x_1, ..., x_n) with x_i ∈ G_i (i = 1, ..., n) is a group if we define the product of two elements x = (x_1, ..., x_n) and y = (y_1, ..., y_n) to be xy = (x_1y_1, ..., x_ny_n). We call G the direct product of groups G_1, ..., G_n and write G = G_1 × ... × G_n. If e_i is the identity element of G_i, then e = (e_1, ..., e_n) is the identity element of G. The mapping (x_1, ..., x_n) → x_i from G to G_i is a surjective homomorphism, called the canonical surjection. The subgroup H_i = {(e_1, ..., e_{i-1}, x_i, e_{i+1}, ..., e_n) | x_i ∈ G_i} is isomorphic to G_i. The subgroups H_i (i = 1, ..., n) satisfy the following conditions: (i) H_i is a normal subgroup of G. (ii) H_i commutes with H_j elementwise if i ≠ j. (iii) Any element of G can be written uniquely as the product of elements of H_1, ..., H_n. Conversely, if a group G has subgroups H_1, ..., H_n satisfying these three conditions, then G is isomorphic to H_1 × ... × H_n. In this case we also write G = H_1 × ... × H_n, and we call this a direct decomposition of G. Each H_i is called a direct factor of G. Conditions (i), (ii), and (iii) are equivalent to condition (i), (ii') G = H_1H_2...H_n, and H_1...H_{i-1} ∩ H_i = {e} (i = 2, ..., n).

A group G is called indecomposable if G cannot be decomposed into the direct product of two subgroups unequal to {e}, and completely reducible if G is the direct product of simple groups. If G satisfies the ascending or descending chain condition for normal subgroups, then G can be decomposed into the direct product of indecomposable groups. Such a decomposition is not unique in general, but if G has two direct product decompositions G = G_1 × ... × G_m = H_1 × ... × H_n, where G_i and H_j are indecomposable and not equal to {e}, then m = n and the factors G_i are isomorphic to the factors H_j for some j; moreover, if G_1 corresponds, say, with H_1, then we have G = H_1 × G_2 × ... × G_m. This fact was first stated by J. H. M. Wedderburn, and a complete proof of the theorem was given by R. Remak and O. Schmidt. Later W. Krull extended it to more general groups (with operator domain), and we call it the Krull-Remak-Schmidt theorem. O. Ore formulated it as a theorem on modular lattices.

For an infinite number of groups G_λ (λ ∈ Λ)

M. Free Products

Given a family of groups {G_λ}_{λ∈Λ}, we define the most general group G generated by these groups, called the free product of {G_λ}, together with canonical injections f_λ: G_λ → G. Let S be the disjoint union of the sets {G_λ}_{λ∈Λ}, and regard G_λ as a subset of S. A word is either void or a finite sequence a_1, a_2, ..., a_n of elements of S, and we denote the set of all words by W. The product of two words w and w' is defined by connecting w with w' so that the associative law holds. We write w > w' when two words w and w' satisfy one of the following two conditions: (i) The word w has successive members a, b which belong to the same group G_λ, and the word w' is obtained from w by replacing a, b by the product ab. (ii) Some member of w is an identity element, and w' is the word obtained from w by deleting this member. For two words w and w', we write w ≡ w' if there is a finite sequence of words w = w_0, w_1, ..., w_n = w' such that for each i (1 ≤ i ≤ n), either w_{i-1} > w_i or w_i > w_{i-1}. This relation is an equivalence relation and is compatible with the multiplication. Thus we may define a multiplication for the quotient set G of W by this equivalence relation, and then G is a group whose identity element is the equivalence class containing the void word. Any x ∈ G_λ is regarded as a word, and we have an injective homomorphism f_λ: G_λ → G by assigning the corresponding class to each element of G_λ. The group G is called the free product of the system of groups {G_λ}_{λ∈Λ}, and f_λ is called the canonical injection.

The free product G of {G_λ}_{λ∈Λ} is characterized by the following universal property: Given a group G' and homomorphisms f'_λ: G_λ → G' (λ ∈ Λ), we can find a unique homomorphism g: G → G' such that g∘f_λ = f'_λ. The free product is the dual concept of direct product and is also called the coproduct (→ 52 Categories and Functors). If each G_λ is an infinite cyclic group generated by a_λ, then the free product of the G_λ is the free group generated by {a_λ} (→ 161 Free Groups).

The concept of free product is generalized in the following way. Let H be a fixed group. We consider the family of pairs (G, j), where G is a group and j: H → G is an injective homomorphism. A homomorphism of pairs f: (G, j) →

(G', j') is defined to be a group homomorphism f: G → G' such that f∘j = j'. For a given family of pairs {(G_λ, j_λ)}, we have the amalgamated product (G, j) of the family and the canonical homomorphism f_λ: (G_λ, j_λ) → (G, j), which is characterized by the following universal property: Given a pair (G', j') and homomorphisms f'_λ: (G_λ, j_λ) → (G', j'), we have a unique homomorphism g: (G, j) → (G', j') such that g∘f_λ = f'_λ. If H = {e}, then the amalgamated product is the same as the free product. Now f_λ is an injection. If we regard G_λ as a subgroup of G, then G is generated by the subgroups G_λ and G_λ ∩ G_μ = j_λ(H) = j_μ(H) (λ ≠ μ).

The notion of the amalgamated product is useful in constructing groups with interesting properties. For instance, we have a group whose nonidentity elements are all conjugate (B. H. Neumann and G. Higman), and a group generated by a finite number of elements such that its homomorphic image (≠ {e}) is always an infinite group (Higman) so that we have an infinite simple group generated by a finite number of elements.

N. Extensions

Let N and F be groups. A group G is called an extension of F by N if G has a normal subgroup N̄ isomorphic to N and G/N̄ ≅ F. The problem of finding all extensions was solved by Schreier (Monatsh. Math. Phys., 34 (1926); Abh. Math. Sem. Univ. Hamburg, 4 (1928)). Suppose that (1) to each σ ∈ F there corresponds an automorphism s_σ of N; (2) there exist elements c_{σ,τ} (σ, τ ∈ F) of N such that s_σ(s_τ(a)) = c_{σ,τ} s_{στ}(a) c_{σ,τ}^{-1} (a ∈ N); and (3) c_{σ,τ} c_{στ,ρ} = s_σ(c_{τ,ρ}) c_{σ,τρ}. Then the set G of all symbols a s_σ (a ∈ N, σ ∈ F) is an extension of F by N if we define multiplication by a s_σ · b s_τ = (a s_σ(b) c_{σ,τ}) s_{στ}. In fact, the set of all elements ā = a c_{1,1}^{-1} s_1 (a ∈ N) is a normal subgroup N̄ of G such that G/N̄ ≅ F. Any extension can be obtained in this way. A system (s_σ, c_{σ,τ}) satisfying (1), (2), and (3) above is called a factor set belonging to F. Two factor sets (s_σ, c_{σ,τ}) and (t_σ, d_{σ,τ}) are said to be associated if there exist elements a_σ (σ ∈ F) of N such that t_σ(a) = a_σ s_σ(a) a_σ^{-1} and d_{σ,τ} = a_σ s_σ(a_τ) c_{σ,τ} a_{στ}^{-1}. In this case, two extensions determined by these factor sets are isomorphic. If (s_σ, c_{σ,τ}) is associated with (t_σ, d_{σ,τ}) (d_{σ,τ} = 1 for any σ, τ ∈ F), then we say that the corresponding extension is a split extension. In this case, the extension G contains a subgroup F̄ isomorphic to F, and G = F̄N̄, F̄ ∩ N̄ = {e}. We call such an extension a semidirect product of N and F.

If N is Abelian, then condition (2) is simply s_σ(s_τ(a)) = s_{στ}(a), since the only inner automorphism of N is the identity mapping. The conditions of associated factor sets and split extension are also simplified. If N is contained in the center of G, then G is called a central extension of N.

O. Transfers

Let H be a subgroup of finite index in G and g_i (i = 1, ..., k) be representatives of the right cosets of H. For b ∈ Hg_i we write \overline{b} = g_i. Then for x ∈ G an element X = H' ∏_{i=1}^{k} g_i x (\overline{g_i x})^{-1} of H/H' is determined uniquely (independent of the choice of representatives), where H' is the commutator subgroup of H. The correspondence G'x → X yields a homomorphism of G/G' to H/H', which is called the transfer from G/G' to H/H'.

P. Generalizations

The concept of group can be generalized in several ways. A set S in which a multiplication (a, b) → ab satisfying (ab)c = a(bc) (the associative law) is defined is called a semigroup. If S is a commutative semigroup in which ax = bx implies a = b (the cancellation law), then S can be embedded in a group G so that the multiplication in S is preserved in G and any x ∈ G is the quotient of two elements of S: x = a^{-1}b = ba^{-1} (a, b ∈ S). Such a group G is determined uniquely by S. We call it the group of quotients of S.

The notion of semigroup is obtained by taking only associativity from the group axioms. On the other hand, if Q is a set with a law of composition (a, b) → ab which is not necessarily associative but satisfies the condition that any two among a, b, c in the equation ab = c determine the third uniquely, then Q is called a quasigroup. A quasigroup with an identity element e such that ea = ae = a for every element a is called a loop. For loops, we have an analog of the structure theory of groups (R. H. Bruck, Trans. Amer. Math. Soc., 60 (1946)).

If we give up the possibility of forming products for all pairs of elements or the uniqueness of the product in the axioms for groups, then we have the following generalizations of groups. A set M with multiplication under which to any elements a, b ∈ M there corresponds a nonempty subset ab of M is called a hypergroupoid. Moreover, if the associative law (ab)c = a(bc) holds and for any elements a, b ∈ M there exist x, y ∈ M such that b ∈ xa, b ∈ ay, then M is called a hypergroup.

A set M is called a mixed group if (1) M can be partitioned into disjoint subsets M_0, M_1, M_2, ...; (2) for a ∈ M_0, b ∈ M_i (i = 0, 1, 2, ...),

elements ab, a\b of M_i are defined such that a(a\b) = b; (3) for b, c ∈ M_i, an element b/c of M_0 is defined such that (b/c)c = b; and (4) the associative law (ab)c = a(bc) (a, b ∈ M_0, c ∈ M) holds (A. Loewy, 1927).

A set M is called a groupoid if (1) M can be partitioned into disjoint subsets M_ij (i, j = 1, 2, ...); (2) for a ∈ M_ij and b ∈ M_jk, an element ab ∈ M_ik is defined; (3) for a ∈ M_ij and b ∈ M_ik, an element a\b ∈ M_jk is defined such that a(a\b) = b; (4) for a ∈ M_ik and b ∈ M_jk, an element a/b ∈ M_ij is defined such that (a/b)b = a; and (5) for a ∈ M_ij, b ∈ M_jk, and c ∈ M_kl, the associative law a(bc) = (ab)c holds (H. Brandt, 1926). These generalized concepts also have some practical applications (→ 2 Abelian Groups; 13 Algebraic Groups; 60 Classical Groups; 69 Compact Groups; 92 Crystallographic Groups; 122 Discontinuous Groups; 151 Finite Groups; 161 Free Groups; 243 Lattices; 249 Lie Groups; 277 Modules; 362 Representations; 422 Topological Abelian Groups; 423 Topological Groups; 437 Unitary Representations).

Q. History

The concept of the group was first introduced in the early 19th century, but its rudiments can be found in antiquity; in fact, it was virtually contained in the concept of motion or transformation used in ancient geometry. From the time it took explicit form in the late 19th century, it has played a fundamental role in all fields of mathematics.

In their study of algebraic equations in the late 18th century, J. L. Lagrange, A. T. Vandermonde, and P. Ruffini saw the importance of the group of permutations of roots; using this idea N. H. Abel showed that a general equation of degree ≥ 5 cannot be solved algebraically. A. L. Cauchy studied the group of permutations of roots for its own interest, but a complete description of the relationship between groups and algebraic equations was first given by E. Galois. C. Jordan developed a detailed exposition of the theory given by Abel and Galois in his Traité des substitutions (1870) [1]. Up to that time, a group meant a permutation group; the axiomatic definition of a group was given by A. Cayley (1854) and L. Kronecker (1870). F. Klein emphasized the significance of group theory in geometry in his Erlangen program (1872), and M. S. Lie developed the theory of Lie groups in the 1880s. In 1897, W. Burnside published his Theory of groups [3], whose second edition (1911) is one of the classics in group theory and is still valuable. Since 1896, G. Frobenius [2] and others have developed the theory of representation of groups by matrices (→ 362 Representations). By that time, the theory of finite groups had acquired all its essential features. Among the branches of abstract algebra, the theory of groups was the first to develop; it led to the progress of abstract algebra in the 1930s. Since the latter half of that decade, the theory of finite groups has been developed further; there has been increased interest in the theory, and many significant results have been obtained, especially since 1955 (→ 151 Finite Groups).

References

[1] C. Jordan, Traité des substitutions et des équations algébriques, Gauthier-Villars, 1870 (Blanchard, 1957).
[2] G. Frobenius, Gesammelte Abhandlungen I-III, Springer, 1968.
[3] W. Burnside, Theory of groups of finite order, Cambridge Univ. Press, second edition, 1911.
[4] H. Weyl, The classical groups, Princeton Univ. Press, revised edition, 1946.
[5] B. L. van der Waerden, Algebra, Springer, I, seventh edition, 1966; II, fifth edition, 1967.
[6] B. L. van der Waerden, Gruppen von linearen Transformationen, Erg. Math., Springer, 1935 (Chelsea, 1948).
[7] A. G. Kurosh, The theory of groups I, II, Chelsea, 1960. (Original in Russian, 1953.)
[8] M. Suzuki, Structure of a group and the structure of its lattice of subgroups, Erg. Math., Springer, 1956.
[9] M. Hall, The theory of groups, Macmillan, 1959.
[10] H. Zassenhaus, Lehrbuch der Gruppentheorie I, Teubner, 1937; English translation, The theory of groups, Chelsea, 1958.
[11] A. Speiser, Die Theorie der Gruppen von endlicher Ordnung, Springer, third edition, 1937.
[12] W. Specht, Gruppentheorie, Springer, 1956.
[13] R. H. Bruck, A survey of binary systems, Erg. Math., Springer, 1958.
[14] A. H. Clifford and G. B. Preston, The algebraic theory of semigroups, Amer. Math. Soc. Math. Surveys, I, 1961; II, 1967.
[15] E. S. Liapin, Semigroups, Amer. Math. Soc. Transl. of Math. Monographs, 1963.

191 (VII.8)
G-Structures

Differential geometry studies differentiable manifolds and geometric objects or structures

on them. Among the geometric structures, the Riemannian and complex structures, with their contacts with other fields of mathematics and with their richness in results, occupy a central position in differential geometry. Perhaps it is impossible to say precisely what a differential geometric structure is or should be. However, the notion of G-structure allows a unified description of many of the interesting known geometric structures, such as Riemannian and complex structures.

A. The Notion of G-Structures

Let M be an m-dimensional C^∞-manifold, and let τ: T(M) → M be its tangent bundle. For x ∈ M, T_x(M) = τ^{-1}(x) is the vector space of tangent vectors at x. Let π: F(M) → M denote the frame bundle of M. It is a GL(m; R)-principal bundle on M and F_x(M) = π^{-1}(x) is the set of linear isomorphisms (called frames at x) of R^m onto T_x(M). Here GL(m; R) is the general linear group of R^m. Hence F_x(M) can be naturally identified with the set of ordered bases of T_x(M). Write θ = (θ^1, ..., θ^m) for the canonical form of F(M), which is the R^m-valued 1-form on F(M) defined by θ(v) = p^{-1}(π_*(v)) for any v ∈ T_p(F(M)). Given a diffeomorphism f: M ≅ N of M onto another C^∞-manifold N, we can naturally define the bundle isomorphism f^(1): F(M) ≅ F(N) by f^(1)(p) = f_* ∘ p.

Let G be a Lie subgroup of GL(m; R). A principal G-subbundle π_P: P → M of the frame bundle π: F(M) → M is called a G-structure over M. Thus P is a regular submanifold of F(M) satisfying the following three conditions:
(A1) π(P) = M.
(A2) for p ∈ P and σ ∈ GL(m; R), pσ ∈ P if and only if σ ∈ G.
(A3) for any x ∈ M, there exist an open neighborhood U of x and a C^∞-mapping s: U → P such that π(s(y)) = y for any y ∈ U.
Conversely, if a regular submanifold P of F(M) satisfies the above three conditions (A1), (A2), and (A3), then π_P = π|_P: P → M is a G-structure over M. When G is closed, then condition (A3) is automatically satisfied. The restriction of θ onto P is called the canonical form of P and is also denoted by θ. For an open subset U of M, the restriction P|U = π_P^{-1}(U) is a G-structure over U.

Let π_P: P → M and π_Q: Q → N be G-structures over M and N, respectively. A diffeomorphism f: M ≅ N is called a G-isomorphism of P onto Q if f^(1)(P) = Q. When such an f exists, we say that the G-structures P and Q are equivalent. A G-isomorphism of P onto itself is called a G-automorphism of P. Write Aut(M, P) for the group of G-automorphisms of P.

Let g be the Lie algebra of G. Since G acts on P from the right, we have the natural Lie algebra homomorphism ι: g → Γ(P, T(P)). Here Γ(P, T(P)) is the Lie algebra of C^∞-vector fields on P. Actually ι is injective, and we write A* for ι(A) (A ∈ g). We call A* the fundamental vector field corresponding to A.

For x ∈ R^m, define p_0(x) ∈ F_x(R^m) by p_0(x)((v^j)) = Σ_{j=1}^{m} v^j (∂/∂x^j)_x, where x^1, ..., x^m are the natural coordinates of R^m. Any open subset D of R^m has the natural G-structure P(D, G) defined by P(D, G) = {p_0(x)·σ ∈ F_x(R^m); x ∈ D, σ ∈ G}. We say that a G-structure π_P: P → M over M is integrable if for each x ∈ M, there exists a local coordinate system (U, φ, D) around x such that φ is a G-isomorphism of P|U onto P(D, G). Equivalently, if we set φ = (x^1, ..., x^m), then the frame {(∂/∂x^1)_y, ..., (∂/∂x^m)_y} belongs to P for each y ∈ U.

B. Examples

The following examples (B1)-(B7) show that some classical geometric structures can be treated uniformly from the viewpoint of G-structures.

(B1) An absolute parallelism on M is, by definition, a system {X_1, ..., X_m} of C^∞-vector fields on M such that at each point x ∈ M {X_1(x), ..., X_m(x)} is a basis of T_x(M). Clearly this is equivalent to giving an {e}-structure π_P: P → M, where e stands for the identity matrix of GL(m; R). Then P is integrable if and only if [X_i, X_j] = 0, 1 ≤ i, j ≤ m. In this case, Aut(M, P) is a finite-dimensional Lie transformation group on M.

(B2) Let E be a k-dimensional distribution, namely, E is a k-dimensional vector subbundle of T(M). Write R^m = R^k ⊕ R^{m-k}. Set GL(k, m; R) = {σ ∈ GL(m; R); σ(R^k) ⊂ R^k} and P = {p ∈ F(M); p(R^k) ⊂ E}. Then P is a GL(k, m; R)-structure over M. This gives a bijective correspondence between the k-dimensional distributions on M and the GL(k, m; R)-structures on M. Moreover, P is integrable if and only if E is involutive (Frobenius theorem).

(B3) Let g be a Riemannian metric on M. Set O(m) = {σ ∈ GL(m; R); ᵗσσ = e}. Define P(g) by P(g) = {(u_1, ..., u_m) ∈ F(M); g(u_i, u_j) = δ_ij}. Then P(g) is an O(m)-structure on M. This gives a bijective correspondence between the Riemannian metrics on M and the O(m)-structures over M. Then Aut(M, P(g)) is the group of isometries of g, and a finite-dimensional Lie transformation group on M. Moreover, P(g) is integrable if and only if g is flat in the sense that the Riemannian curvature tensor of g is zero.

(B4) Two Riemannian metrics g_1 and g_2 are

called conformally equivalent if there exists a C^∞-function ρ > 0 with g_1 = ρg_2. This is an equivalence relation among the Riemannian metrics on M. An equivalence class {g} is called a conformal structure on M. Set CO(m) = {aA; A ∈ O(m), a ∈ R − {0}}. Define P({g}) by P({g}) = {(u_1, ..., u_m) ∈ F(M); g(u_i, u_j) = aδ_ij, a ∈ R − {0}}. Then P({g}) is a CO(m)-structure on M. This gives a bijective correspondence between the conformal structures on M and the CO(m)-structures over M. Then Aut(M, P({g})) is the group of conformal transformations of {g} and is a finite-dimensional Lie transformation group. Moreover P({g}) is integrable if and only if the conformal structure {g} is conformally flat in the sense that at each point x ∈ M, there exists a local coordinate system (x^1, ..., x^m) around x such that g(∂/∂x^i, ∂/∂x^j) = ρδ_ij, where ρ is a positive function.

(B5) Assume that m is even, say m = 2l. By an almost symplectic structure on M, we mean a differential 2-form Ω of maximal rank, i.e., Ω ∧ ... ∧ Ω (l times) never vanishes. Let A be the standard skew-symmetric bilinear form on R^{2l} defined by A((u^i), (v^j)) = (u^1v^2 − u^2v^1) + ... + (u^{2l-1}v^{2l} − u^{2l}v^{2l-1}). The symplectic group Sp(l; R) is defined by Sp(l; R) = {σ ∈ GL(2l; R); A(σu, σv) = A(u, v), u, v ∈ R^{2l}}. Set P(Ω) = {p ∈ F(M); p*Ω_x = A, x = π(p)}. Then P(Ω) is an Sp(l; R)-structure over M. This gives a bijective correspondence between the almost symplectic structures on M and the Sp(l; R)-structures over M. Then Aut(M, P(Ω)) is the group of symplectic transformations of Ω, which is never a finite-dimensional Lie transformation group. Moreover P(Ω) is integrable if and only if Ω is a symplectic structure, i.e., dΩ = 0.

(B6) Assume that m is even, say m = 2l. We identify C^l with R^{2l} as a real vector space. Then GL(l; C) is a Lie subgroup of GL(2l; R). Let J be an almost complex structure on M. Set P(J) = {p ∈ F(M); Jp(u) = p(iu), u ∈ R^{2l} = C^l}. Then P(J) is a GL(l; C)-structure on M. This gives a bijective correspondence between the almost complex structures on M and the GL(l; C)-structures over M. Then Aut(M, P(J)) is the group of J-analytic transformations on M. Furthermore, Aut(M, P(J)) is a finite-dimensional Lie transformation group when M is compact. Moreover P(J) is integrable if and only if J is a complex structure.

(B7) It is easy to see that there is a bijective correspondence between the SL(m; R)-structures over M and the set of volume elements on M. Any SL(m; R)-structure over M is always integrable and Aut(M, P) is the group of diffeomorphisms preserving the volume element Ω, which is never a finite-dimensional Lie transformation group.

C. Structure Functions

Let G be a Lie subgroup of GL(m; R) and g the Lie algebra of G. For convenience, we write V for R^m and V* for the dual space of V. Then V ⊗ Λ²V* can be considered as the space of skew-symmetric bilinear mappings from V × V into V, and g ⊗ V* can be identified with the space of linear mappings from V into g. Define a linear mapping ∂: g ⊗ V* → V ⊗ Λ²V* by

(∂T)(u, v) = −T(v)u + T(u)v for T ∈ g ⊗ V*, u, v ∈ V.

Put

H^{2,1}(g) = V ⊗ Λ²V* / ∂(g ⊗ V*).

Let π_P: P → M be a G-structure over M and θ the canonical form of P. Take any frame p in P. An m-dimensional linear subspace H of T_p(P) is said to be a horizontal subspace of P at p if θ_p: H → V (= R^m) is a linear isomorphism. Then we write f_H for θ_p^{-1}: V → H. Write ℋ(T_p(P)) for the set of horizontal subspaces of P at p. For H ∈ ℋ(T_p(P)), define c(p, H) ∈ V ⊗ Λ²V* by

c(p, H)(u, v) = dθ(f_H(u), f_H(v)) for u, v ∈ V.

Then we have

c(p, H_1) − c(p, H_2) ∈ ∂(g ⊗ V*) for H_1, H_2 ∈ ℋ(T_p(P)).

Therefore we have a well-defined element c(p) ∈ H^{2,1}(g) by c(p) = {c(p, H)}. We call the mapping c: P → H^{2,1}(g) the structure function of the G-structure π_P: P → M.

Let Q → N be another G-structure over N and f: M ≅ N be a G-isomorphism of P onto Q. Then we have c(f^(1)(p)) = c(p) for p ∈ P. In particular, it is easily seen that if P is integrable, then c = 0. However, the converse is not true in general.

D. Prolongation of Linear Lie Algebras and Groups

Let K denote either R or C. Let g be a Lie subalgebra of gl(m; K). For k = 0, 1, 2, ..., let g_k be the space of K-symmetric multilinear mappings

t: K^m × ... × K^m → K^m ((k + 1) times)

such that for each fixed u_1, ..., u_k ∈ K^m, the linear transformation

u ∈ K^m → t(u, u_1, ..., u_k) ∈ K^m

belongs to g. In particular, g_0 = g. We call g_k

the kth prolongation of g. If g_k = 0, then g_{k+1} = g_{k+2} = ... = 0. The first integer k such that g_k = 0 is called the order of g. If g_k ≠ 0 for all k, then g is said to be of infinite type. When g contains a matrix of rank 1 as an element, then g is of infinite type. When K = C, the converse is also true [3]. However, when K = R, the converse is not true in general. In fact, if we consider gl(l; C) as a real Lie subalgebra of gl(2l; R) (example (B6)), then gl(l; C) is of infinite type, having no matrix of rank 1 in it. In general, a Lie subalgebra g of gl(m; R) is said to be of elliptic type if g contains no matrix of rank 1 as an element. We consider g_1 as an Abelian Lie subalgebra of gl(K^m ⊕ g) by the correspondence t ∈ g_1 → t̃ ∈ gl(K^m ⊕ g), where t̃ is defined by

t̃(u) = t(·, u) for u ∈ K^m,
t̃(A) = 0 for A ∈ g.

More generally, we consider g_{k+1} to be an Abelian Lie subalgebra of gl(K^m ⊕ g_0 ⊕ ... ⊕ g_k) by virtue of the correspondence t ∈ g_{k+1} → t̃ ∈ gl(K^m ⊕ g_0 ⊕ ... ⊕ g_k), where t̃ is defined by

t̃(u) = t(·, ..., ·, u) ∈ g_k for u ∈ K^m,
t̃(A) = 0 for A ∈ g_0 ⊕ ... ⊕ g_k.

Identifying gl((K^m ⊕ g_0 ⊕ ... ⊕ g_{k-1}) ⊕ g_k) with gl(K^m ⊕ g_0 ⊕ ... ⊕ g_k), we have the important identity

(D1) (g_k)_1 = g_{k+1}, k = 0, 1, 2, ....

Let G be a Lie subgroup of GL(m; R) and g the Lie subalgebra of gl(m; R) corresponding to G. For k = 1, 2, ..., let G_k be the connected Abelian Lie subgroup of GL(R^m ⊕ g_0 ⊕ ... ⊕ g_{k-1}) corresponding to the Abelian Lie subalgebra g_k; namely, G_k consists of the linear mappings σ(t) (t ∈ g_k) defined by

σ(t)(u) = u + t(·, ..., ·, u) for u ∈ R^m,
σ(t)(A) = A for A ∈ g_0 ⊕ ... ⊕ g_{k-1}.

(D4) An irreducible Lie algebra g of gl(m; R) is of infinite type if and only if g is one of

[...]

where csp(m/2; R) = R ⊕ sp(m/2; R).

(D5) Let S be an m-dimensional irreducible Hermitian symmetric space of compact type, which is different from the complex projective space. Let L(S) be the identity connected component of the group of biholomorphic transformations of S. Take any point o ∈ S and set L_0(S) = {σ ∈ L(S); σ(o) = o}. Let g(S) be the linear isotropy Lie subalgebra of gl(m; C) corresponding to L_0(S). Then g(S) is an irreducible Lie subalgebra of gl(m; C) of order 2. Conversely, every irreducible Lie subalgebra g of gl(m; C) of finite type with order ≥ 2 is equal to g(S) as given above.

(D6) Let S be an m-dimensional irreducible symmetric space of compact type. Assume that S admits a finite-dimensional connected Lie transformation group L(S) which strictly contains the connected component of the group of isometries of S. Take any point o ∈ S, and set L_0(S) = {σ ∈ L(S); σ(o) = o}. Let g(S) be the linear isotropy Lie subalgebra of gl(m; R) corresponding to L_0(S). If (S, L(S)) ≠ (sphere, projective transformation group) and (S, L(S)) ≠ (complex projective space, complex projective transformation group), then g(S) is an irreducible Lie subalgebra of gl(m; R) of order 2. Conversely, every irreducible Lie subalgebra g of gl(m; R) of finite type with order ≥ 2 is equal to g(S) as given above. For example, if we take an m-dimensional sphere as S and the group of conformal transformations of S as L(S), then g(S) = co(m).
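The definition of the first prolongation g_1 in this section is a finite linear-algebra condition, so its dimension can be computed mechanically. The Python sketch below is an illustration under stated assumptions, not part of the article: it identifies an element t of g_1 with a linear map u → t(·, u) from K^m into g satisfying t(u)v = t(v)u, sets up the resulting homogeneous linear system, and returns the dimension of its solution space. The function names (`rank`, `first_prolongation_dim`, `gl`, `so`, `co`) and the use of exact rational arithmetic are choices made here for the example.

```python
from fractions import Fraction
from itertools import combinations


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in rows]
    ncols = len(rows[0]) if rows else 0
    rk = 0
    for col in range(ncols):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][col] != 0:
                factor = rows[r][col] / rows[rk][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def first_prolongation_dim(basis, m):
    """Dimension of g_1 for the Lie algebra spanned by the m x m matrices in `basis`."""
    d = len(basis)
    # Unknown c[a][b] sits at position a*d + b and means t(e_a) = sum_b c[a][b] * basis[b].
    equations = []
    for a, c in combinations(range(m), 2):
        for i in range(m):  # i-th component of t(e_a)e_c - t(e_c)e_a = 0
            row = [0] * (m * d)
            for b, B in enumerate(basis):
                row[a * d + b] += B[i][c]
                row[c * d + b] -= B[i][a]
            equations.append(row)
    return m * d - (rank(equations) if equations else 0)


def unit(m, i, j):
    E = [[0] * m for _ in range(m)]
    E[i][j] = 1
    return E


def gl(m):
    """Basis of gl(m): all matrix units."""
    return [unit(m, i, j) for i in range(m) for j in range(m)]


def so(m):
    """Basis of o(m): skew-symmetric matrices E_ij - E_ji, i < j."""
    basis = []
    for i, j in combinations(range(m), 2):
        B = unit(m, i, j)
        B[j][i] = -1
        basis.append(B)
    return basis


def co(m):
    """Basis of co(m): o(m) together with the identity matrix."""
    identity = [[1 if i == j else 0 for j in range(m)] for i in range(m)]
    return so(m) + [identity]


dims = {
    "gl(2)": first_prolongation_dim(gl(2), 2),
    "o(3)": first_prolongation_dim(so(3), 3),
    "co(3)": first_prolongation_dim(co(3), 3),
}
```

For o(3) the system has only the zero solution, so the prolongation already vanishes at the first step, while co(3) has a nonzero first prolongation, in line with the statement above that co(m) arises as a Lie algebra of order 2. Higher prolongations could be handled by the same routine through identity (D1), at the cost of first realizing g_1 as matrices on K^m ⊕ g.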

We call G_k the kth prolongation of G. We say that G is of finite (resp. infinite) type if g is of finite (resp. infinite) type. We say that G is of elliptic type if g is. From (D1), we have

(D2) (G_k)_1 = G_{k+1}.

We know all irreducible Lie subalgebras g of gl(m; K) with order ≥ 2 [3, 6].

(D3) An irreducible Lie subalgebra g of gl(m; C) is of infinite type if and only if g is one of

gl(m; C), sl(m; C), csp(m/2; C), sp(m/2; C),

where csp(m/2; C) = C ⊕ sp(m/2; C).

E. Prolongation of G-Structures

Let G be a Lie subgroup of GL(m; R) and let g be the Lie subalgebra of gl(m; R) corresponding to G. We choose once and for all a linear subspace C of V ⊗ Λ²V* such that

V ⊗ Λ²V* = ∂(g ⊗ V*) ⊕ C.

In general there is no natural way of choosing such a C.

Let π_P: P → M be a G-structure on M. For each horizontal subspace H of P at p (i.e., H ∈ ℋ(T_p(P))), we define the frame (p, H) ∈ F_p(P)

to be a linear isomorphism (p, H): R^m ⊕ g → T_p(P) given by

(p, H)(u ⊕ A) = f_H(u) ⊕ A*_p for u ∈ R^m, A ∈ g,

where A* is the fundamental vector field on P induced from A ∈ g. Let P_1 = {(p, H) ∈ F(P); H ∈ ℋ(T_p(P)), c(p, H) ∈ C}. Then P_1 → P is a G_1-structure over P. We call P_1 → P the first prolongation of P. The kth prolongation P_k of P is defined inductively by P_k = (P_{k-1})_1 = the first prolongation of P_{k-1}. From (D2), P_k is a G_k-structure over P_{k-1}.

Take any f ∈ Aut(M, P). It can be proved that f^(1) ∈ Aut(P, P_1). Since P_k = (P_{k-1})_1, we can define f^(k) ∈ Aut(P_{k-1}, P_k) inductively by f^(k) = (f^(k-1))^(1). By the correspondence f → f^(k), we can consider Aut(M, P) as a subgroup of Aut(P_{k-1}, P_k). If G is connected, then

(E1) Aut(M, P) = Aut(P_{k-1}, P_k), k = 1, 2, ...

Let Q → N be another G-structure. By a similar argument to that above, we see that P → M and Q → N are equivalent if and only if P_k → P_{k-1} and Q_k → Q_{k-1} are equivalent. In [1], E. Cartan studied general equivalence problems of two G-structures. In that problem the above prolongation procedure plays an important role.

F. Automorphisms of G-Structures of Finite Type

S. Kobayashi proved the following:

(F1) Theorem [4]. Let P → M be an {e}-structure over M. Then Aut(M, P) is a finite-dimensional Lie transformation group on M such that dim Aut(M, P) ≤ dim M. More precisely, for any point x ∈ M, the mapping σ ∈ Aut(M, P) → σ(x) ∈ M is injective, and its image {σ(x); σ ∈ Aut(M, P)} is a closed submanifold of M. The submanifold structure on this image makes Aut(M, P) into a finite-dimensional Lie transformation group on M.

Now let G be a Lie subgroup of GL(m; R) of finite type, say G_k = {e}. Then P_k → P_{k-1} is an {e}-structure. From theorem (F1) above, it follows that Aut(P_{k-1}, P_k) is a finite-dimensional Lie transformation group. As explained in Section E, Aut(M, P) is a subgroup of Aut(P_{k-1}, P_k). Clearly Aut(M, P) is closed in Aut(P_{k-1}, P_k). Since every closed subgroup of a Lie group is again a Lie group, we have the following:

(F2) Theorem. Let P → M be a G-structure over M. If G is of finite type, then Aut(M, P) is a Lie transformation group of dimension ≤ dim(R^m ⊕ g ⊕ g_1 ⊕ ... ⊕ g_{k-1}).

G. Automorphisms of G-Structures of Elliptic Type

The following theorem of R. Palais allows us to prove that the automorphism groups of many general geometric structures are finite-dimensional Lie transformation groups.

(G1) Theorem [8]. Let L be a group of C^∞-transformations of a C^∞-manifold M. Let 𝔩 be the set of all vector fields X on M which generate global 1-parameter groups φ_t = exp tX of transformations of M such that φ_t ∈ L. If the set 𝔩 generates a finite-dimensional Lie algebra of vector fields on M, then L is a finite-dimensional Lie transformation group, and 𝔩 is the Lie algebra of L.

Let P → M be a G-structure over M. For any representation ρ: G → GL(V), we write E(ρ) for the associated vector bundle of P with respect to ρ. Let ρ_1: G → GL(gl(m; R)) be the representation defined by ρ_1(σ)A = σAσ^{-1} for σ ∈ G, A ∈ gl(m; R). Let ρ_2: G → GL(g) be the representation defined by ρ_2(σ)A = σAσ^{-1} for σ ∈ G, A ∈ g. Then ρ_2 is a subrepresentation of ρ_1, since ρ_1(σ)A = ρ_2(σ)A for σ ∈ G, A ∈ g. We remark that E(ρ_1) = Hom(T(M), T(M)). We write g(M) for E(ρ_2). Then g(M) is a vector subbundle of Hom(T(M), T(M)). We write F for the quotient vector bundle Hom(T(M), T(M))/g(M). Let α: Hom(T(M), T(M)) → F be the natural projection. We fix an affine connection ∇ on M which preserves P. We write T for the torsion tensor field of ∇. For each vector field X ∈ Γ(M, T(M)), we define a C^∞-section ∇X + T(X, ·) in Γ(M, Hom(T(M), T(M))) by

u ∈ T_x(M) → ∇_u X + T(X(x), u) ∈ T_x(M).

Then define a first-order linear differential operator

D: Γ(M, T(M)) → Γ(M, F)

by D(X) = α(∇X + T(X, ·)). It is easy to see that

(G2) 𝔩(M, P) ⊂ ker D,

where 𝔩(M, P) is the Lie algebra of all vector fields X ∈ Γ(M, T(M)) which generate global 1-parameter groups φ_t = exp tX of transformations of M such that φ_t ∈ Aut(M, P). Then we can show:

(G3) D is an elliptic operator if and only if G is of elliptic type.

Now suppose M is compact. Then from the standard fact on linear elliptic operators on compact manifolds, we know the dimension of ker D is finite if D is elliptic. Thus from (G2), (G3), and theorem (G1), we have the following:

(G4) Theorem [7]. Let P → M be a G-structure on an m-dimensional compact manifold M. If G is of elliptic type, then Aut(M, P) is a finite-dimensional Lie transformation group.

element E of Σ on which α*θ^1, ..., α*θ^m are linearly independent if and only if G is involutive.
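Theorem (G4) depends on G being of elliptic type, which Section D defines as the Lie algebra g containing no matrix of rank 1. The Python sketch below is a sanity check of that condition on a finite sample, not a proof: it tests integer linear combinations of a given basis with small coefficients and reports whether any of them has rank exactly 1. The helper names (`has_rank_one`, `contains_rank_one_sample`) and the coefficient range are assumptions made for this example. A nonzero matrix has rank 1 exactly when all of its 2 x 2 minors vanish, which is the test used here.

```python
from itertools import combinations, product


def has_rank_one(M):
    """True if M is nonzero and every 2 x 2 minor of M vanishes."""
    n = len(M)
    if all(x == 0 for row in M for x in row):
        return False
    return all(
        M[i][k] * M[j][l] - M[i][l] * M[j][k] == 0
        for i, j in combinations(range(n), 2)
        for k, l in combinations(range(n), 2)
    )


def contains_rank_one_sample(basis, coeffs=range(-2, 3)):
    """Search small integer combinations of `basis` for a rank-1 matrix."""
    n = len(basis[0])
    for cs in product(coeffs, repeat=len(basis)):
        M = [
            [sum(c * B[i][j] for c, B in zip(cs, basis)) for j in range(n)]
            for i in range(n)
        ]
        if has_rank_one(M):
            return True
    return False


# o(3): skew-symmetric 3 x 3 matrices
so3 = [
    [[0, 1, 0], [-1, 0, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, -1, 0]],
]
# gl(1; C) realized inside gl(2; R), as in example (B6)
gl1c = [[[1, 0], [0, 1]], [[0, -1], [1, 0]]]
# gl(2; R): all matrix units
gl2r = [
    [[1, 0], [0, 0]], [[0, 1], [0, 0]],
    [[0, 0], [1, 0]], [[0, 0], [0, 1]],
]

found = {
    "o(3)": contains_rank_one_sample(so3),
    "gl(1;C)": contains_rank_one_sample(gl1c),
    "gl(2;R)": contains_rank_one_sample(gl2r),
}
```

The search finds a rank-1 element only in gl(2; R). For o(3) and gl(1; C) the negative result agrees with the general facts that a real skew-symmetric matrix has even rank and that a nonzero element of gl(1; C) is invertible.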
H. Local Equivalence Problem

Let π_P: P → M and π_Q: Q → N be two G-structures. We say that P → M and Q → N are locally equivalent if the following holds:

(H1) For arbitrarily given (p, q) ∈ P × Q, there exists a local G-isomorphism f: U ≅ V of P|U onto Q|V such that f^(1)(p) = q, where U (resp. V) is an open neighborhood of π_P(p) (resp. π_Q(q)).

From now on, we assume that the structure function c_P (resp. c_Q) of P (resp. Q) is constant. If such an f as in (H1) exists, we must have c_Q(f^(1)(p')) = c_P(p') for p' ∈ P|U. Therefore c_P and c_Q must be the same constant. Cartan reduced the local equivalence problem between P and Q to that of certain differential systems on P × Q. In fact, let θ_P = (θ^1, ..., θ^m) and θ_Q = (ψ^1, ..., ψ^m) be the canonical forms of P and Q respectively. Let α: P × Q → P, β: P × Q → Q be the canonical projections such that α(p, q) = p, β(p, q) = q. Let Σ be the differential system on P × Q generated by {α*θ^1 − β*ψ^1, ..., α*θ^m − β*ψ^m}. If X is the m-dimensional regular submanifold of P × Q given by the graph of f^(1) in (H1), i.e., X = {(p', f^(1)(p')) ∈ P × Q; p' ∈ P|U}, then X is an m-dimensional integral submanifold of Σ satisfying the following conditions:

(H2) (p, q) ∈ X,
(H3) α*θ^1, ..., α*θ^m are linearly independent on T_(p,q)(X).

Conversely any m-dimensional integral submanifold of Σ satisfying (H2) and (H3) is the graph of a local G-isomorphism f required in

Combining proposition (H6) above with the classical Cartan-Kähler theorem, we obtain the following theorems:

(H7) Theorem ([1], also see [8]). Let P → M and Q → N be two real analytic G-structures over M and N, respectively, such that their structure functions are constant and equal. Suppose G is involutive. Then P → M and Q → N are locally equivalent.

(H8) Theorem. Let P → M be a real analytic G-structure over M. Suppose G is involutive. Then P is integrable if and only if the structure function of P is zero everywhere.

Finally, we remark that for a general Lie subgroup G ⊂ GL(m, R), G_k is involutive for k > k_0 [8].

I. Cartan-Kähler Theorem on Differential Systems

Let U be an open subset of R^n. We write A^k(U) for the space of real analytic differential k-forms on U. Put A*(U) = A^0(U) ⊕ ... ⊕ A^n(U). Then A*(U) is a graded R-algebra with respect to the usual exterior product. By a differential system on U we mean an ideal Σ of A*(U) such that

(I1) Σ = A^0(U) ∩ Σ ⊕ A^1(U) ∩ Σ ⊕ ... ⊕ A^n(U) ∩ Σ,
(I2) d(Σ) ⊂ Σ,
(I3) Σ is finitely generated as an ideal.

For convenience we write Σ^(k) = Σ ∩ A^k(U). When Σ^(0) = {0}, we call Σ a restricted differential system.
Let C be a differential system on U c R”. By
(Hl). Therefore the local equivalence problem
between P and Q is equivalent to the problem
a k-dimensional integral manifold of C, we
of fïnding an m-dimensional integral submani- mean a k-dimensional regular real analytic
submanifold of U such that for any acL, we
fold X of Z satisfying (H2) and (H3) for any
have a 1X = 0. In the above detïnition, it is
(P> & P x Q.
suffïcient to know that a 1X = 0 for any CIE Cck).
We say that G is involutive if there exist
For XE U, we Write G,(T,(U)) for the set of
linear subspaces 0 = Vo c VI c c V, = R”
a11 k-dimensional linear subspaces of T,(U).
such that
Set %(T(U))= Uxeu G,(T,(U)) (disjoint union);
(H4) dimVk=k (k=O,...,m), then G,(T( U)) is naturally a real analytic
manifold of dimension m + m(m - k). Let Z
(H5) dimg, = f dimg(b), be a differential system on U. A k-plane EE
k=O
G,( T,( U)) is called a k-dimensional integral ele-
where g is the Lie algebra of G and g( V,) = ment of Z at x if for any a EZ(~), we have CL1E
{Aeg;A(u)=O for UE Vk}. Now we have: = 0. Thus X is a k-dimensional integral sub-
(H6) Proposition. Let P+M and Q-N be manifold if and only if T,(X) is a k-dimensional
two G-structures such that their structure integral element of Z for each XEX. We Write
functions are constant and equal. The dif- I,(L) for the set of k-dimensional integral ele-
ferential system Z on P x Q defïned as above ments of L. In general I,(Z) is a real analytic
is involutive at any m-dimensional integral subset of Gk( T( U)). For any k-dimensional
integral element E at $x \in U$, we define the polar space H(E) by

$H(E) = \{v \in T_x(U);\ \alpha(v, u_1, \dots, u_k) = 0 \text{ for } \alpha \in \Sigma^{(k+1)},\ u_1, \dots, u_k \in E\}.$

Then $E \subset H(E)$. Moreover $F \in G_{k+1}(T_x(U))$ with $F \supset E$ is in $I_{k+1}(\Sigma)$ if and only if $F \subset H(E)$.

Let E be a k-dimensional integral element of $\Sigma$. We choose $\varphi^1, \dots, \varphi^k \in A^1(U)$ such that $\varphi^1|E, \dots, \varphi^k|E$ are linearly independent. Define $\mathcal{U}(\varphi^1, \dots, \varphi^k)$ by

$\mathcal{U}(\varphi^1, \dots, \varphi^k) = \{F \in G_k(T(U));\ \varphi^1|F, \dots, \varphi^k|F \text{ linearly independent}\}.$

Then $\mathcal{U}(\varphi^1, \dots, \varphi^k)$ is an open neighborhood of E in $G_k(T(U))$. For any element $F \in \mathcal{U}(\varphi^1, \dots, \varphi^k)$, define $F_1, \dots, F_k \in F$ by $\varphi^i(F_j) = \delta^i_j$. Thus $\{F_1, \dots, F_k\}$ is the dual basis of $\varphi^1|F, \dots, \varphi^k|F$. Any element $\alpha \in \Sigma^{(k)}$ defines a real analytic function $\alpha_*$ on $\mathcal{U}(\varphi^1, \dots, \varphi^k)$ by $\alpha_*(F) = \alpha(F_1, \dots, F_k)$. We call E regular if there exist $\alpha^1, \dots, \alpha^r \in \Sigma^{(k)}$ and an open neighborhood $\mathcal{V}$ of E in $\mathcal{U}(\varphi^1, \dots, \varphi^k)$ such that

(I4) $I_k(\Sigma) \cap \mathcal{V} = \{F \in \mathcal{V};\ \alpha^1_*(F) = \dots = \alpha^r_*(F) = 0\}$,

(I5) $d\alpha^1_*, \dots, d\alpha^r_*$ are linearly independent on $\mathcal{V}$,

(I6) $\dim H(F)$ is constant for any $F \in \mathcal{V}$.

We say that $E \in I_k(\Sigma)$ is an ordinary element if E contains a (k - 1)-dimensional regular integral element.

The following is the well-known theorem due to Cartan and Kähler [7].

(I7) Theorem. Let $\Sigma$ be a restricted differential system on $U \subset \mathbf{R}^m$. Let X be a k-dimensional integral submanifold of $\Sigma$. Suppose that $T_x(X)$ is regular for a point $x \in X$ and that there exists a (k + 1)-dimensional integral element F at x with $F \supset T_x(X)$. Then there exists a (k + 1)-dimensional integral submanifold Y of $\Sigma$ such that $x \in Y$, $T_x(Y) = F$, and $X \subset Y$ in a neighborhood of x.

We say that $\Sigma$ is involutive at $E \in I_k(\Sigma)$ if there exist $0 = E^0 \subset E^1 \subset \dots \subset E^{k-1} \subset E$ such that $E^j$ is a j-dimensional regular integral element of $\Sigma$ $(j = 0, 1, \dots, k - 1)$. Applying theorem (I7) inductively, we obtain:

Corollary. Let $\Sigma$ be a restricted differential system on $U \subset \mathbf{R}^m$. If E is a k-dimensional involutive integral element of $\Sigma$ at x, there exists a k-dimensional integral submanifold X of $\Sigma$ such that $x \in X$ and $T_x(X) = E$.

In view of this corollary, it is important to know when $\Sigma$ is involutive at $E \in I_k(\Sigma)$.

(I8) Lemma [9]. Let $\Sigma$ be a restricted differential system on $U \subset \mathbf{R}^m$, and E be a k-dimensional integral element of $\Sigma$. Then $\Sigma$ is involutive at E if and only if there exist $0 = E^0 \subset E^1 \subset \dots \subset E^{k-1} \subset E$ such that $I_k(\Sigma)$ is a submanifold of $G_k(T(U))$ near E with dimension $m + k(m - k) - \sum_{j=0}^{k-1} t_j$, where $t_j = m - \dim H(E^j)$.

References

[1] E. Cartan, Les problèmes d'équivalence, Oeuvres, pt. II, vol. 2, 1311-1334.
[2] V. Guillemin and S. Sternberg, An algebraic model of transitive differential geometry, Bull. Amer. Math. Soc., 70 (1964), 16-47.
[3] V. Guillemin, D. Quillen, and S. Sternberg, The classification of the irreducible complex algebras of infinite type, J. Analyse Math., 18 (1967), 107-112.
[4] S. Kobayashi, Le groupe des transformations qui laissent invariant le parallélisme, Colloque de Topologie de Strasbourg, 1954.
[5] S. Kobayashi, Transformation groups in differential geometry, Erg. Math., 70, Springer.
[6] S. Kobayashi and T. Nagano, On filtered Lie algebras and geometric structures, J. Math. Mech. I, 13 (1964), 875-908; II, 14 (1965), 513-522; III, 14 (1965), 679-706; IV, 15 (1966), 163-175; V, 15 (1966), 315-328.
[7] M. Spivak, A comprehensive introduction to differential geometry V, Publish or Perish Inc., 1975.
[8] T. Ochiai, On the automorphism group of a G-structure, J. Math. Soc. Japan, 18 (1966), 189-193.
[9] I. Singer and S. Sternberg, The infinite groups of Lie and Cartan: The transitive groups, J. Analyse Math., 15 (1965), 1-114.
[10] N. Tanaka, On the equivalence problems associated with a certain class of homogeneous spaces, J. Math. Soc. Japan, 17 (1965), 103-139.

192 (X.24)
Harmonic Analysis

A. Fourier Transforms

Let f(x) be an element of the †function space $L_1(-\infty, \infty)$ and t a real number. Then the integral

$\hat f(t) = (2\pi)^{-1/2} \int_{-\infty}^{\infty} f(x) e^{-itx}\,dx$  (1)

converges, and the function $\hat f(t)$ is continuous in $(-\infty, \infty)$. We call $\hat f$ the †Fourier transform of f. If $f(x) \in L_2(-\infty, \infty)$, then $f(x) \in L_1(-a, a)$ for any finite interval (-a, a), and if we set

$\hat f_a(t) = (2\pi)^{-1/2} \int_{-a}^{a} f(x) e^{-itx}\,dx,$

then $\hat f_a$ †converges in the mean of order 2 as $a \to \infty$ to a function $\hat f$ in $L_2$. In this case, we define $\hat f$ to be the †Fourier transform of f ($\in L_2$). Furthermore, in this case, if we set

$f_a(x) = (2\pi)^{-1/2} \int_{-a}^{a} \hat f(t) e^{itx}\,dt,$

then $\mathrm{l.i.m.}_{a \to \infty} f_a(x) = f(x)$ (Plancherel theorem). Moreover, we have †Parseval's identity: $\int_{-\infty}^{\infty} |f(x)|^2\,dx = \int_{-\infty}^{\infty} |\hat f(t)|^2\,dt$ (→ 160 Fourier Transform).

Suppose that f(x) is periodic with period $2\pi$ and $f \in L_2(-\pi, \pi)$. We set

$a_n = (2\pi)^{-1} \int_{-\pi}^{\pi} f(x) e^{-inx}\,dx$

(†Fourier coefficients). The nth partial sum of the †Fourier series $s_n(x) = \sum_{\nu=-n}^{n} a_\nu e^{i\nu x}$ converges in the mean of order 2 to f(x), and Parseval's identity

$\frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x)|^2\,dx = \sum_{n=-\infty}^{\infty} |a_n|^2$

holds. On the other hand, if $\{a_n\}$ is a given sequence such that $\sum_{n=-\infty}^{\infty} |a_n|^2 < \infty$, then $\sum_{\nu=-n}^{n} a_\nu e^{i\nu x}$ converges in the mean of order 2 to a function f(x), the Fourier coefficients of f(x) are $\{a_n\}$, and Parseval's identity holds (→ 159 Fourier Series).

B. Bochner's Theorem and Herglotz's Theorem

A complex-valued function f(x) defined on $(-\infty, \infty)$ is said to be of positive type (or positive definite) if $\sum_{j,k=1}^{n} f(x_j - x_k)\,\xi_j \bar\xi_k \ge 0$ for any finite number of reals $x_1, x_2, \dots, x_n$ and complex numbers $\xi_1, \xi_2, \dots, \xi_n$. If f(x) is measurable on $(-\infty, \infty)$ and of positive type, then there exists a †monotone increasing real-valued bounded function $\alpha(t)$ such that

$f(x) = \int_{-\infty}^{\infty} e^{itx}\,d\alpha(t)$  (2)

for almost all x. If $\alpha(-\infty) = 0$ and $\alpha(t)$ is right continuous, then $\alpha(t)$ is unique (Bochner's theorem). Conversely, if $\alpha(t)$ is nondecreasing and bounded and f(x) is defined by (2), then f(x) (called the Fourier-Stieltjes transform of $\alpha(t)$) is continuous and of positive type.

A sequence $\{a_n\}$ $(-\infty < n < \infty)$ is said to be positive definite (or of positive type) if $\sum_{j,k=1}^{n} a_{j-k}\,\xi_j \bar\xi_k \ge 0$ for finitely many arbitrarily chosen complex numbers $\xi_1, \xi_2, \dots, \xi_n$. If $\{a_n\}$ is of positive type, then there exists a monotone increasing bounded function $\alpha(t)$ on $[-\pi, \pi]$ such that

$a_n = \int_{-\pi}^{\pi} e^{int}\,d\alpha(t)$

(Herglotz's theorem). Conversely, if $\alpha(t)$ is monotone increasing and bounded and $a_n$ is defined by the above integral, then the sequence $\{a_n\}$ is of positive type.

C. Poisson's Summation Formula

If $f(x) \in L_1(-\infty, \infty)$ is of †bounded variation and continuous and if $\hat f(t)$ is its Fourier transform, then we have

$\sqrt{a} \sum_{n=-\infty}^{\infty} f(an) = \sqrt{b} \sum_{n=-\infty}^{\infty} \hat f(bn),$

where $ab = 2\pi$ (a > 0). This is called Poisson's summation formula.

D. Generalized Tauberian Theorems of Wiener

Suppose that we are given a function $f(x) \in L_1(-\infty, \infty)$ whose Fourier transform $\hat f(t)$ is never zero. Then the set of functions given by

$h(x) = \int_{-\infty}^{\infty} f(x - y) g(y)\,dy,$

where $g \in L_1(-\infty, \infty)$, is dense in $L_1(-\infty, \infty)$. Hence we can deduce the †generalized Tauberian theorem of Wiener: If the Fourier transform $\hat k_1(t)$ of $k_1(x) \in L_1(-\infty, \infty)$ does not vanish for any real t and

$\lim_{x \to \infty} \int_{-\infty}^{\infty} k_1(x - y) f(y)\,dy = C \int_{-\infty}^{\infty} k_1(y)\,dy$

for a function f(x) that is bounded and measurable on $(-\infty, \infty)$, then for any $k_2 \in L_1(-\infty, \infty)$,

$\lim_{x \to \infty} \int_{-\infty}^{\infty} k_2(x - y) f(y)\,dy = C \int_{-\infty}^{\infty} k_2(y)\,dy$
(→ 160 Fourier Transform G). Hence we can deduce †Tauberian theorems of the Littlewood type [3].

E. Harmonic Analysis

Let $\alpha(\lambda)$ be a complex-valued function on $(-\infty, \infty)$ that is of bounded variation and right continuous. If we have the expression

$f(t) = \int_{-\infty}^{\infty} e^{i\lambda t}\,d\alpha(\lambda),$  (2')

then we say that the function f(t) is represented by the superposition of harmonic oscillations $e^{i\lambda t}$. Conversely, when f(t) is given, we have the problem of finding a function $\alpha(\lambda)$ as above such that f(t) can be expressed in the form of (2'). When such a function $\alpha(\lambda)$ exists, it is also an important problem (in harmonic analysis) to find the amplitude $\alpha(\lambda) - \alpha(\lambda - 0)$ of the component of a proper oscillation. Concerning this problem, we have the following three theorems:

(I) A necessary and sufficient condition for f(x) to be representable in the form (2') is [condition not recoverable from the source].

(II) For a function f(t) expressed in the form (2'), we have: (i) for any $\lambda_0$,

$\alpha(\lambda_0) - \alpha(\lambda_0 - 0) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} f(t) e^{-i\lambda_0 t}\,dt,$

and (ii) if $\alpha(\lambda)$ is continuous at $\lambda = \lambda_0 - \sigma$ and $\lambda = \lambda_0 + \sigma$ ($\sigma > 0$), then

$\alpha(\lambda_0 + \sigma) - \alpha(\lambda_0 - \sigma) = \lim_{T \to \infty} \frac{1}{\pi} \int_{-T}^{T} f(t)\,\frac{\sin \sigma t}{t}\,e^{-i\lambda_0 t}\,dt.$

(III) In (2'), suppose that the discontinuity points of $\alpha(\lambda)$ are $\lambda_1, \lambda_2, \dots$, and set $a_n = \alpha(\lambda_n) - \alpha(\lambda_n - 0)$ $(n = 1, 2, \dots)$; then

$\lim_{T \to \infty} \frac{1}{T} \int_0^T f(t + s)\,\overline{f(s)}\,ds = \sum_{n=1}^{\infty} |a_n|^2 e^{i\lambda_n t}.$

F. The Paley-Wiener Theorem

In formula (1), if we change the variable from t to a complex variable $\zeta = t + i\sigma$, then we have

$F(\zeta) = (2\pi)^{-1/2} \int_{-\infty}^{\infty} f(x) e^{-ix\zeta}\,dx,$  (3)

which is called the Fourier-Laplace transform of f(x). In particular, if f(x) has bounded †support, then $F(\zeta)$ is an †entire function. Concerning Fourier-Laplace transforms, Paley and Wiener proved the following theorems:

(I) A necessary and sufficient condition for an entire function $F(\zeta)$ to be the Fourier-Laplace transform of a †function of class $C^\infty$ having its support in a finite interval [-B, B] is that for any N, there exists a constant $C_N > 0$ such that $|F(\zeta)| \le C_N (1 + |\zeta|)^{-N} e^{B|\sigma|}$ for all $\zeta = t + i\sigma$.

(II) If $g(t) \in L_2(0, \infty)$, then its one-sided †Laplace transform

$f(z) = (2\pi)^{-1/2} \int_0^{\infty} g(t) e^{-tz}\,dt$

satisfies: (i) f(z) is holomorphic in the right half-plane Re z > 0, and (ii)

$\sup_{x > 0} \int_{-\infty}^{\infty} |f(x + iy)|^2\,dy < \infty.$

Conversely, if f(z) = f(x + iy) (x > 0) satisfies (i) and (ii), then the boundary function $f(iy) \in L_2(-\infty, \infty)$ exists and is such that

$\lim_{x \downarrow 0} \int_{-\infty}^{\infty} |f(iy) - f(x + iy)|^2\,dy = 0,$

its Fourier transform in $L_2$

$g(t) = \mathrm{l.i.m.}_{N \to \infty}\,(2\pi)^{-1/2} \int_{-N}^{N} f(iy) e^{ity}\,dy$

vanishes at almost all negative t, and f(z) is the one-sided Laplace transform of g(t).

G. Harmonic Analysis on Locally Compact Abelian Groups

The general theory of harmonic analysis on the real line was extended to a theory on locally compact Abelian groups by A. Weil, I. M. Gel'fand, D. A. Raikov, and others. The theory of normed rings was utilized for the development of the theory (→ 36 Banach Algebras). This theory is called harmonic analysis on locally compact Abelian groups, and it has been developed as described in the following sections.

H. Group Rings

Let $L_1 = L_1(G)$ be the set of all integrable functions with respect to a †Haar measure on a locally compact Abelian group G. If we define the norm and multiplication in $L_1(G)$ by

$\|f\| = \int_G |f(x)|\,dx,$

$f * g(x) = \int_G f(xy^{-1}) g(y)\,dy,$

respectively, then $L_1$ has the structure of a commutative †Banach algebra. (We call $f * g$ the convolution (or composition product) of f
and g.) If the topology of G is not †discrete, $L_1(G)$ does not have a unity for multiplication. Hence, adjoining a formal unity 1 to $L_1(G)$, we set $R = \{\alpha 1 + f \mid \alpha \text{ is a complex number},\ f \in L_1(G)\}$, and

$(\alpha 1 + f) + (\beta 1 + g) = (\alpha + \beta)1 + (f + g),$

$(\alpha 1 + f)(\beta 1 + g) = \alpha\beta 1 + (\alpha g + \beta f + f * g),$

$\|\alpha 1 + f\| = |\alpha| + \|f\|.$

Then R is a commutative Banach algebra with unity. When G is discrete, $R = L_1(G)$. The Banach algebra R is called the group algebra of G. The group algebra R of G is †semisimple. R is algebraically isomorphic to a subalgebra of $C(\mathfrak{M})$, which is the associative algebra of all continuous functions on the compact Hausdorff space $\mathfrak{M}$ consisting of all maximal ideals in R (→ 36 Banach Algebras). By this correspondence, if $\varphi \in R$ corresponds to $\varphi(M)$, which is a function on $\mathfrak{M}$, then $\sup_{M \in \mathfrak{M}} |\varphi(M)| \le \|\varphi\|$. $L_1(G)$ belongs to $\mathfrak{M}$ if and only if the group G is not discrete.

I. Fourier Transforms

According as the group G is discrete or not, we set $\mathfrak{N} = \mathfrak{M}$ or $\mathfrak{N} = \mathfrak{M} - \{L_1(G)\}$. Then there exists a one-to-one correspondence between the elements $M \in \mathfrak{N}$ and the elements $\chi$ of the †character group $\hat G$ of G such that the following formulas are valid:

$f(M) = \int_G \overline{\chi(x)}\,f(x)\,dx, \quad f \in L_1,$  (4)

$\overline{\chi(y)} = f_y(M)/f(M),$  (5)

where $f_y(x) = f(xy^{-1})$ and f is a function such that $f(M) \ne 0$. This correspondence $M \leftrightarrow \chi$ gives a homeomorphism between the locally compact space $\mathfrak{N}$ and $\hat G$. Hence if we identify M with $\chi$ and set $f(M) = \hat f(\chi)$, then $\hat f$ is a continuous function on $\hat G$, called the Fourier transform of f(x). Since $f \to f(M)$ is an algebraic isomorphism, $(f * g)^\wedge(\chi) = \hat f(\chi)\,\hat g(\chi)$. If $\hat f(\chi) \equiv \hat g(\chi)$, then f is equal to g in $L_1(G)$. This is the uniqueness theorem of Fourier transforms. From it we can deduce the †maximal almost periodicity of locally compact Abelian groups. If G is not discrete, then $L_1(G) \in \mathfrak{M}$, and $f(L_1(G)) = 0$ for $f \in L_1(G)$. Hence $\{\chi \mid |\hat f(\chi)| \ge \varepsilon\}$ is a compact subset of $\hat G$. This means that $\hat f$ is a continuous function vanishing at infinity on $\hat G$. (This is a generalization of the †Riemann-Lebesgue theorem concerning the cases $G = \mathbf{R}^1$ or $T^1 = \mathbf{R}/\mathbf{Z}$.) Any continuous function $u(\chi)$ on $\hat G$ vanishing at infinity is approximated uniformly by $\hat f(\chi)$, which is the Fourier transform of $f \in L_1(G)$.

J. Positive Definite Functions

A function $\varphi(x)$ defined on G is said to be positive definite (or of positive type) if the inequality $\sum_{j,k=1}^{n} \varphi(x_j x_k^{-1})\,\alpha_j \bar\alpha_k \ge 0$ holds for arbitrary elements $x_1, \dots, x_n$ in G and arbitrary complex numbers $\alpha_1, \dots, \alpha_n$. We denote by $P_G$ the set of all positive definite functions on G. If $\varphi \in P_G$, then $\varphi(e) \ge 0$ (e is the identity of G), $|\varphi(x)| \le \varphi(e)$, and $\varphi(x^{-1}) = \overline{\varphi(x)}$. If G is locally compact, we further assume that $\varphi \in P_G$ is measurable with respect to the Haar measure of G. Then for any $f \in L_1(G)$,

$\int_G \int_G \varphi(xy^{-1})\,f(x)\,\overline{f(y)}\,dx\,dy \ge 0.$

Any $\varphi \in P_G$ is equal almost everywhere to a continuous $\varphi_1 \in P_G$. (Concerning positive definite functions on locally compact groups and their relation with unitary representations → 437 Unitary Representations B.)

K. Harmonic Analysis and the Duality Theorem

If G is a locally compact Abelian group, then a function $\varphi(x)$ on G belongs to $P_G$ if and only if there exists a nonnegative measure $\mu$ on $\hat G$ with $\mu(\hat G) < \infty$ such that $\varphi(x) = \int_{\hat G} \chi(x)\,d\mu(\chi)$. (When $G = \mathbf{R}^1$, this theorem is †Bochner's theorem, whereas when $G = \mathbf{Z}$, it is Herglotz's theorem.) Hence we can prove a spectral resolution of unitary representations of G: $U(x) = \int_{\hat G} \chi(x)\,dE(\chi)$ (a generalization of †Stone's theorem). If $f \in L_1(G) \cap P_G$, then $\hat f(\chi) \ge 0$, $\hat f \in L_1(\hat G)$, and the inversion formula of Fourier transforms $f(x) = \int_{\hat G} \hat f(\chi)\,\chi(x)\,d\chi$ holds, provided that the Haar measure on $\hat G$ is suitably chosen. If $f \in L_1(G) \cap L_2(G)$, then $\hat f \in L_2(\hat G)$, and Parseval's identity $\int_G |f(x)|^2\,dx = \int_{\hat G} |\hat f(\chi)|^2\,d\chi$ holds. If we put $Uf = \hat f$ and $V\hat f = f$, then U is extendable uniquely to an isometry of $L_2(G)$ onto $L_2(\hat G)$ and V is extendable uniquely to its inverse transformation, respectively (Plancherel's theorem on locally compact Abelian groups). By the inversion formula and Plancherel's theorem, we can prove that the character group $\hat{\hat G}$ of $\hat G$ is isomorphic to G as a topological group. This is called the Pontryagin duality theorem of locally compact Abelian groups (→ 422 Topological Abelian Groups). In particular, if G is compact, $\hat G$ is a discrete Abelian group. Then we can normalize the Haar measure of G and $\hat G$ so that the measure of G and the measure of each element of $\hat G$ are 1. Plancherel's theorem implies that the set of characters of G is a †complete †orthonormal set in $L_2(G)$.
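The statements of Sections I and K can be checked by hand on the simplest compact Abelian group, the cyclic group $\mathbf{Z}_n$. The following is only an illustrative sketch, not part of the original article: it assumes $G = \mathbf{Z}_n$ written additively, the characters $\chi_k(x) = e^{2\pi i k x/n}$, Haar measure of total mass 1 on G, and measure 1 for each element of the dual group, as in the normalization described for compact G. All function names are hypothetical.

```python
import cmath

def character(k, x, n):
    """Value chi_k(x) = exp(2*pi*i*k*x/n) of the k-th character of Z_n."""
    return cmath.exp(2j * cmath.pi * k * x / n)

def fourier_transform(f):
    """hat f(k) = (1/n) * sum_x conj(chi_k(x)) f(x); Haar measure on Z_n has total mass 1."""
    n = len(f)
    return [sum(character(k, x, n).conjugate() * f[x] for x in range(n)) / n
            for k in range(n)]

def inverse_transform(fhat):
    """f(x) = sum_k hat f(k) chi_k(x); each point of the dual group has measure 1."""
    n = len(fhat)
    return [sum(fhat[k] * character(k, x, n) for k in range(n)) for x in range(n)]

def convolve(f, g):
    """(f * g)(x) = (1/n) * sum_y f(x - y) g(y), with the group law written additively."""
    n = len(f)
    return [sum(f[(x - y) % n] * g[y] for y in range(n)) / n for x in range(n)]

def inner(f, g):
    """Inner product of L_2(Z_n) for the normalized Haar measure."""
    n = len(f)
    return sum(a * b.conjugate() for a, b in zip(f, g)) / n

def max_error(u, v):
    """Largest pointwise distance between two functions on Z_n."""
    return max(abs(a - b) for a, b in zip(u, v))

if __name__ == "__main__":
    f = [1, 2, 0, -1, 3j, 0.5]
    g = [0, 1, 1j, 0, 2, -1]
    fh, gh = fourier_transform(f), fourier_transform(g)
    # Inversion formula, Parseval's identity, and (f * g)^ = hat f * hat g.
    print("inversion error:", max_error(inverse_transform(fh), f))
    print("Parseval gap:", abs(inner(f, f) - sum(abs(c) ** 2 for c in fh)))
    print("convolution gap:",
          max_error(fourier_transform(convolve(f, g)),
                    [a * b for a, b in zip(fh, gh)]))
```

With this normalization the three printed quantities should vanish up to rounding error, and the characters form an orthonormal set for `inner`, which is the finite case of the last statement of Section K.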

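The characterization of positive definite functions at the start of Section K (Bochner's theorem for groups) also has an elementary finite model. The sketch below is an illustration under stated assumptions rather than the article's own construction: it takes $G = \mathbf{Z}_n$, represents a measure on the dual group by a list of weights `mu`, and tests positive definiteness by checking that the recovered weights are nonnegative. The helper names are hypothetical.

```python
import cmath
import random

def character(k, x, n):
    """Value chi_k(x) = exp(2*pi*i*k*x/n) of the k-th character of Z_n."""
    return cmath.exp(2j * cmath.pi * k * x / n)

def from_measure(mu):
    """phi(x) = sum_k mu[k] chi_k(x): the transform of the measure with weights mu."""
    n = len(mu)
    return [sum(mu[k] * character(k, x, n) for k in range(n)) for x in range(n)]

def spectral_weights(phi):
    """Recover mu[k] = (1/n) * sum_x phi(x) conj(chi_k(x)) from phi."""
    n = len(phi)
    return [sum(phi[x] * character(k, x, n).conjugate() for x in range(n)) / n
            for k in range(n)]

def quadratic_form(phi, a):
    """sum_{j,l} phi(j - l) a_j conj(a_l), taking x_j = j to run over all of Z_n."""
    n = len(phi)
    return sum(phi[(j - l) % n] * a[j] * a[l].conjugate()
               for j in range(n) for l in range(n))

def is_positive_definite(phi, tol=1e-9):
    """On Z_n, phi is positive definite exactly when all its weights are >= 0."""
    return all(abs(w.imag) <= tol and w.real >= -tol for w in spectral_weights(phi))

if __name__ == "__main__":
    rng = random.Random(0)
    phi = from_measure([0.5, 0.0, 2.0, 1.0, 0.0])   # nonnegative weights
    psi = from_measure([1.0, -0.5, 0.0, 0.0, 0.0])  # one negative weight
    a = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(5)]
    print("phi positive definite:", is_positive_definite(phi))
    print("psi positive definite:", is_positive_definite(psi))
    print("quadratic form of phi at a random vector:", quadratic_form(phi, a).real)
```

The quadratic form equals $\sum_k \mu_k \bigl|\sum_j a_j \chi_k(j)\bigr|^2$, which is why nonnegative weights force it to be nonnegative, and why a single negative weight can be detected by choosing $a_j = \overline{\chi_k(j)}$ for the offending index k.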
L. Poisson's Summation Formula

Suppose that G is a locally compact Abelian group, H is its discrete subgroup, and G/H is compact. Then the †annihilator $\Gamma$ of H is a discrete subgroup of $\hat G$. For any continuous function f(x) on G, if $\sum_{y \in H} f(xy)$ is convergent absolutely and uniformly (hence $f \in L_1(G)$) and $\sum_{\chi \in \Gamma} \hat f(\chi)$ is convergent absolutely, then $\sum_{y \in H} f(y) = c \sum_{\chi \in \Gamma} \hat f(\chi)$, where c is a constant depending on the Haar measures of G and $\hat G$. This is called Poisson's summation formula on a locally compact Abelian group (a generalization of †Poisson's summation formula on $G = \mathbf{R}$).

M. Closed Ideals in $L_1(G)$

In the following, the group operations in G and $\hat G$ are denoted by +, and the value of the character $\gamma(x)$ ($x \in G$, $\gamma \in \hat G$) is written as $(x, \gamma)$. For a function f defined on G and any $y \in G$, the translation operator $\tau_y$ is defined by $\tau_y f(x) = f(x - y)$. A closed subspace of $L_1(G)$ is an ideal in $L_1(G)$ if and only if it is invariant under all translations (N. Wiener). A closed ideal I coincides with $L_1(G)$ if and only if the set of zeros of I, i.e., $Z(I) = \bigcap_{f \in I} \hat f^{-1}(0)$, is empty (Wiener's Tauberian theorem). A closed ideal I is maximal if and only if Z(I) consists of a single point. If the dual $\hat G$ of G is discrete, then the closed ideals in $L_1(G)$ are completely characterized by the zeros; that is, †spectral synthesis is possible, but this situation does not hold generally. P. Malliavin's theorem states that if $\hat G$ is not discrete, then there exist a set E in $\hat G$ and two different closed ideals I and J such that Z(I) = Z(J) = E. Such a set E is called a non-S-set. For example, if $G = \mathbf{R}^3$, the unit sphere is a non-S-set (L. Schwartz).

N. Operating Functions

Denote by $A(\hat G)$ the set of all Fourier transforms of the functions in $L_1(G)$. Let $\hat f \in A(\hat G)$ and $\Phi$ an analytic function in a neighborhood of the range of $\hat f$. Furthermore, assume that $\Phi(0) = 0$ if $\hat G$ is not discrete. Then there exists a $\hat g \in A(\hat G)$ such that $\hat g(\gamma) = \Phi(\hat f(\gamma))$ for $\gamma \in \hat G$ (Wiener-Lévy theorem). In general a function $\Phi$ defined on a set D in the complex plane is said to operate in a function algebra R or to be an operating function on R if $\Phi(f) \in R$ for all $f \in R$ whose range lies in D. The converse of the Wiener-Lévy theorem holds in the following form. Let G be an infinite Abelian group and $\Phi$ a function on the interval [-1, 1]. If $\Phi$ operates in $A(\hat G)$, then $\Phi$ is analytic in a neighborhood of the origin if G is compact, and analytic in a neighborhood of [-1, 1] if G is not compact [14, 15]. Let E be a subset of $\hat G$, and $I_E$ the set of all $\hat f \in A(\hat G)$ such that $\hat f = 0$ on E. Let $A(E) = A(\hat G)/I_E$ be the quotient algebra. The set E is called a set of analyticity if every operating function on A(E) is analytic. For a characterization of such a set → [11].

O. Measure Algebras

For a locally compact Abelian group G, let M(G) be the set of all †regular bounded complex measures. For $\lambda$ and $\mu$ in M(G), the convolution $\lambda * \mu$ is defined by $(\lambda * \mu)(E) = \int_G \lambda(E - y)\,d\mu(y)$, where E is a Borel set in G. Then M(G) is a semisimple commutative Banach algebra whose product is defined to be the convolution. The Fourier-Stieltjes transform of $\mu \in M(G)$ is defined by

$\hat\mu(\gamma) = \int_G \overline{(x, \gamma)}\,d\mu(x), \quad \gamma \in \hat G.$

A continuous function on $\hat G$ is positive definite if and only if it is the Fourier-Stieltjes transform of a positive measure in M(G) (Bochner's theorem). Assume that G is not discrete. A function on the interval [-1, 1] that operates in the Fourier-Stieltjes transforms of measures in M(G) can be extended to an entire function, and a function on the whole complex plane that operates in the †Gel'fand representation of M(G) is an entire function [15]. From this fact, it follows that M(G) is asymmetric and nonregular. Furthermore, there exists a measure $\mu \in M(G)$ such that $\hat\mu(\gamma) \ge 1$ but $1/\hat\mu$ is not the Fourier-Stieltjes transform of a measure in M(G). (See also Wiener and Pitt [16], Shreider [17], and Hewitt and Kakutani [18]; for the general description of measure algebra, see Rudin [11] and Hewitt and Ross [12].)

P. Idempotent Measures

A measure $\mu \in M(G)$ is called idempotent if $\mu * \mu = \mu$, that is, $\hat\mu(\gamma) = 0$ or 1 for all $\gamma \in \hat G$. Then $\hat\mu$ is the characteristic function of the set $\{\gamma \in \hat G \mid \hat\mu(\gamma) = 1\}$. The smallest ring of subsets of $\hat G$ that contains all open cosets of subgroups of $\hat G$ is called the coset ring of $\hat G$. The characteristic function of a set E in $\hat G$ is the Fourier-Stieltjes transform of an idempotent measure in M(G) if and only if E belongs to the coset ring of $\hat G$ [19]. A simple proof of this theorem is given by T. Ito and I. Amemiya (Bull. Amer. Math. Soc., 70 (1964)). When G is the unit circle, the coset ring consists of sequences periodic except at a finite number of points, and for this case the theorem was obtained by H. Helson. Let
$n_1, n_2, \dots, n_k$ be distinct integers and $d\mu(x) = \sum_{j=1}^{k} e^{i n_j x}\,dx$. Then $\mu$ is an idempotent measure on the unit circle. J. E. Littlewood conjectured that the norm of $\mu$ exceeds $c \log k$, where c is a positive constant not depending on the choice of $\{n_j\}$. A partial answer was given by P. J. Cohen [19] for compact connected Abelian groups and was improved by H. Davenport (Mathematika, 7 (1960)) and E. Hewitt and H. S. Zuckerman (Proc. Amer. Math. Soc., 14 (1963)).

Q. Mappings of Group Algebras

Let G and H be two locally compact Abelian groups and $\varphi$ a nontrivial homomorphism of $L_1(G)$ into M(H). Associated with $\varphi$ there is a mapping $\varphi^*$ of a subset Y of $\hat H$ into $\hat G$ such that $(\varphi(f))^\wedge(\gamma) = \hat f(\varphi^*(\gamma))$ for $\gamma \in Y$ and $= 0$ for $\gamma \notin Y$, or symbolically $(\varphi(f))^\wedge = \hat f(\varphi^*)$. A continuous mapping $\alpha$ of Y into $\hat G$ is said to be piecewise affine if there exist a finite number of mutually disjoint sets $S_j$, $j = 1, \dots, n$, in the coset ring of $\hat H$ and mappings $\alpha_j$ such that (i) $Y = \bigcup_{j=1}^{n} S_j$; (ii) $\alpha_j$ is defined on an open coset $K_j$ of $\hat H$, where $K_j \supset S_j$; (iii) $\alpha_j = \alpha$ on $S_j$; and (iv) $\alpha_j(\gamma + \gamma' - \gamma'') = \alpha_j(\gamma) + \alpha_j(\gamma') - \alpha_j(\gamma'')$ for all $\gamma, \gamma', \gamma'' \in K_j$, $j = 1, \dots, n$. P. J. Cohen's theorem is: If $\varphi$ is a homomorphism of $L_1(G)$ into M(H), then Y belongs to the coset ring of $\hat H$ and $\varphi^*$ is a piecewise affine mapping of Y into $\hat G$. Conversely, for any piecewise affine mapping $\alpha$, there is a homomorphism $\varphi$ of $L_1(G)$ into M(H) such that $\varphi^* = \alpha$. Related theorems have been studied by A. Beurling, H. Helson, J.-P. Kahane, Z. L. Leibenson, and W. Rudin [11].

R. Exceptional Sets

Let G be a locally compact Abelian group. A subset E is said to be independent if $n_1 x_1 + \dots + n_k x_k = 0$, where the $n_j$ are integers and $x_j \in E$, implies $n_j x_j = 0$, $j = 1, \dots, k$. A set E in G is called a Kronecker set if for every continuous function $\varphi$ on E of absolute value 1 and $\varepsilon > 0$ there exists a $\gamma \in \hat G$ such that $|\varphi(x) - (x, \gamma)| < \varepsilon$, $x \in E$. Every Kronecker set is independent and of infinite order, but independent sets are not necessarily Kronecker sets. For a group G whose elements are of finite order p, a set E is called of type $K_p$ if for every continuous function $\varphi$ on E with values $\exp(2\pi i k/p)$, $k = 0, \dots, p - 1$, there is a $\gamma \in \hat G$ such that $\varphi = \gamma$ on E. If E is a compact Kronecker set in G and $\mu$ is a measure with support in E, i.e., $\mu \in M(E)$, then $\|\mu\| = \|\hat\mu\|_\infty$. A compact set E is called a Helson set if there is a constant C such that $\|\mu\| \le C \|\hat\mu\|_\infty$ for $\mu \in M(E)$. Every $K_p$ set is also a Helson set. For a Helson set E, C(E) = A(E). A discrete analog of a Helson set is a Sidon set. A subset F of a discrete group $\hat G$ is called a Sidon set if there is a constant C such that $\sum_{\gamma \in F} |a_\gamma| \le C \sup_x \bigl|\sum_{\gamma \in F} a_\gamma (x, \gamma)\bigr|$ for every polynomial $\sum a_\gamma (x, \gamma)$. For example, a †lacunary sequence $\{n_k\}$, $n_{k+1}/n_k > q > 1$, of integers is a Sidon set. These sets are deeply connected with harmonic analysis on groups, and measures concentrated on these sets have some unexpected pathological properties (→ e.g., [11, 13]).

S. Tensor Algebras and Group Algebras

Let X and Y be compact Hausdorff spaces, and denote by V(X, Y) the projective tensor product $C(X) \mathbin{\hat\otimes} C(Y)$ of continuous function spaces C(X) and C(Y). The norm of $\varphi = \sum_{j=1}^{\infty} f_j(x) g_j(y)$ in V(X, Y) is defined by $\|\varphi\| = \inf \sum_{j=1}^{\infty} \|f_j\|_\infty \|g_j\|_\infty$, where the infimum is taken for all expressions of $\varphi$. If G is an infinite compact group, then there exist two subsets $K_1$ and $K_2$ such that (i) $K_1$ and $K_2$ are homeomorphic to the †Cantor ternary set; (ii) the expression $\gamma_1 + \gamma_2$ of an element of $E = K_1 + K_2$ is unique, where $\gamma_1 \in K_1$ and $\gamma_2 \in K_2$; (iii) $K_1 \cap K_2 = \emptyset$; and (iv) $K_1 \cup K_2$ is a Kronecker set or a set of type $K_p$ for some p. Varopoulos' theorem states that the algebra $V(K_1, K_2)$ is isomorphic to A(E), which denotes the algebra of restrictions of functions in A(G) to the set E. By this theorem, the problems of spectral synthesis and operating functions of group algebras are transformed into problems of tensor algebras. For a more precise discussion → [20].

References

[1] E. C. Titchmarsh, Introduction to the theory of Fourier integrals, Clarendon Press, 1937.
[2] A. Zygmund, Trigonometric series I, Cambridge Univ. Press, second edition, 1959.
[3] N. Wiener, The Fourier integral and certain of its applications, Cambridge Univ. Press, 1933.
[4] R. E. A. C. Paley and N. Wiener, Fourier transforms in the complex domain, Amer. Math. Soc. Colloq. Publ., 1934.
[5] K. Yosida, Functional analysis, Springer, 1965, sixth edition, 1980.
[6] M. A. Naimark, Normed rings, Noordhoff, 1959. (Original in Russian, 1956.)
[7] L. H. Loomis, An introduction to abstract harmonic analysis, Van Nostrand, 1953.
[8] H. Cartan and R. Godement, Théorie de la dualité et analyse harmonique dans les groupes abéliens localement compacts, Ann. Sci. Ecole Norm. Sup., (3) 64 (1947), 79-99.
125 193 c
Harmonie Functions and Suhharmonic Functions

[9] R. Godement, Théorèmes taubériens et Laplace equation, but is not continuous at the
théorie spectrale, Ann. Sci. Ecole Norm. SU~., origin.
(3) 64(1947), 118-138. The fundamental properties of harmonie
[lO] A. Weil, L’intégration dans les groupes functions do not depend essentially on n.
topologiques et ses applications, Actualités Sci. A real-valued function u of class C2 satisfy-
Ind., Hermann, 1940, second edition, 1951. ing the inequality Au 2 0 is called suhharmonic.
[ 1 l] W. Rudin, Fourier analysis on groups, For a more general definition of subharmonic
Interscience, 1962. functions and their properties - Sections P-
[12] E. Hewitt and K. A. Ross, Abstract har- U.
monic analysis, Springer, 1, 1963; II, 1970.
[ 133 J.-P. Kahane and R. Salem, Ensembles
parfaits et séries trigonométriques, Actualités B. Invariance of Harmonicity
Sci. Ind., Hermann, 1963.
[ 141 Y. Katznelson, Sur le calcul symbolique Harmonicity in R2 is invariant under any
dans quelques algèbres de Banach, Ann. Sci. tconformal transformation. Namely, when
Ecole Norm. SU~., (3) 76 (1959) 83-123. there exists a conforma1 bijection sending a
[ 151 H. Helson, J.-P. Kahane, Y. Katznelson, domain D in the xy-plane onto a domain D’ in
and W. Rudin, The functions which operate the (q-plane, every harmonie function u(x, y)
on Fourier transforms, Acta Math., 102 (1959), on D is transformed into a harmonie function
1355157. of (<, 11)on D’. In R” for n 2 3, harmonicity is
[ 161 N. Wiener and H. R. Pitt, On absolutely not generally preserved under conforma1
convergent Fourier-Stieltjes transforms, Duke transformations. However, harmonicity is
Math. J., 4 (1938), 420-436. preserved in the following special case: Let D
[17] Yu. A. Shreïder @eider), The structure be a domain in R” (n > 3), and consider the
of maximal ideals in rings of measures with inversion f:D-D’ defined by f(xr , ,x,) =
convolution, Amer. Math. Soc. Trans]., 81 (X;,...,x;)=(a2x,/r2 ,...,a2X,/r2),r=(X;+
(1953). (Original in Russian, 1950.) + x~)“‘. Let u(xi, , x,,) be a harmonie
[ 183 E. Hewitt and S. Kakutani, Some multi- function on D, and let V(X’, , , XL) be the func-
plicative linear functionals on M(G), Ann. tion on D’ obtained by applying the Kelvin
Math., (2) 79 (1964) 4899505. transformation to u. Namely, V(X;, . , XL) =
[ 191 P. J. Cohen, On a conjecture of Little- (a/r’~~2u(u2x~/r’2 , . , a2x~/r’2), where rf2 =
wood and idempotent measures, Amer. J. xl2 + +XL’. Then the function u is harmonie
Math., 82 (1960), 191-212. on D’. A function u that is harmonie outside
[20] N. T. Varopoulos, Tensor algebras and a compact set is called regular at the point
harmonie analysis, Acta Math., 119 (1967), 51- at infïnity if any Kelvin transform of u is har-
112. monic in a neighborhood of the origin, in
which case u(P)+0 as OP+ GO. Now Iet T:
xk =X~(X’, , , xi), 1 <k < n, be a one-to-one
analytic transformation of a domain D’ onto
another domain D. If there exists a posi-
193 (X.29) tive function V(X’, , . , XL) in D’ such that
Harmonie Functions and V(x;, , xk)u(xl (xi, , xn), . . ,x,(x;, . . . , XL)) is
harmonie for any harmonie function u(xi,
Subharmonic Functions . . . . xn) in D, then T is conformal. A conforma1
transformation as it is known in differential
A. General Remarks geometry is either (i) a tsimilarity transforma-
tion, (ii) an inversion with respect to a sphere
A real-valued function u of tclass Cz delïned in or a plane, or (iii) a fmite combination of trans-
a domain D in the n-dimensional Euclidean formations of types (i) and (ii).
space R” is called harmonie if it satislïes the
Laplace equation C. Examples of Harmonie Functions
ah <i2U
Au(P)=~+...+~=O (P=(x,, . . . . x,)) (l)Ifuisapolynomialinx,,...,x,andhar-
1 n
monic in R”, then the terms of degree k in u
in D. A harmonie function is, by definition, form a harmonie function for each k > 0. A
twice continuously differentiable, but turns out harmonie homogeneous polynomial is said to
to be real analytic. It is not true, however, that be a spherical harmonie. (2) logr in R2 and rzmn
the solutions of the Laplace equation are real in R” (n > 3) are harmonie except at r = 0. (3)
analytic. For example, for the function u(x, y) Every tlogarithmic potential in R2 and every
=Reexp(-z~4)(z=x+iy#O),u(0,0)=0,u,, +Newtonian potential in R” (n > 3) is harmonie
and uyy exist everywhere, and u satisfies the outside the tsupport of the measure. Con-

versely, any harmonic function defined on a domain D is represented in an arbitrary relatively compact domain D' in D as the sum of a logarithmic (n = 2) or Newtonian ($n \ge 3$) potential of a measure on $\partial D'$ and the potential of a †double layer. (4) Both the real part u and the imaginary part v of an analytic function of a complex variable are harmonic. We call v a conjugate harmonic function of u. If u is harmonic on a †simply connected domain D, then the conjugate v of u is given by

$$v(x, y) = \int_{(a,b)}^{(x,y)} \left( -\frac{\partial u}{\partial y}\,dx + \frac{\partial u}{\partial x}\,dy \right),$$

where (a, b) is a fixed point in D and the path of integration is contained in D. When D is a †multiply connected domain, v may take many values in accordance with the †homology classes of the paths of integration.

D. Green's Formulas

In the following one should substitute "curve" for the term "surface" when n = 2. Let D be a bounded domain whose boundary S consists of a finite number of closed surfaces that are piecewise of class $C^1$. Let u and v be harmonic in D, and suppose that all the first-order partial derivatives of u and v have finite limits at every boundary point. We call D the inside of S. Let n be a normal on S toward the outside of D. Then the relation

$$\int_S u\,\frac{\partial v}{\partial n}\,d\sigma = \int_S v\,\frac{\partial u}{\partial n}\,d\sigma \tag{1}$$

follows immediately from †Gauss's formula, where $d\sigma$ is the †surface element on S. In particular, when v is identically equal to 1, formula (1) gives

$$\int_S \frac{\partial u}{\partial n}\,d\sigma = 0. \tag{2}$$

Equations (1) and (2) are called Green's and Gauss's formulas, respectively. Conversely, u is harmonic in D if u is a function of class $C^2$ in D and at every point $P \in D$, there is a sequence $\{r_k\}$ decreasing to zero and such that $\int_{S(P, r_k)} (\partial u/\partial n)\,d\sigma = 0$ (k = 1, 2, ...), where $S(P, r_k)$ is the spherical surface with center at P and radius $r_k$. Another sufficient condition for u to be harmonic is that u is of class $C^1$ and, at every point $P \in D$, there be an $r_P > 0$ such that $\int_{S(P, r)} (\partial u/\partial n)\,d\sigma = 0$ for every r, $0 < r < r_P$ (Koebe, Bôcher).

E. Mean Value Theorems

We assume that u is a harmonic function, D the domain of definition of u, and S the boundary of D. The mean value of u on the surface or the interior of any ball in D is equal to the value of u at the center of the ball. Namely,

$$u(P) = \frac{1}{\sigma_n r^{n-1}} \int_{S(P, r)} u\,d\sigma = \frac{1}{\tau_n r^n} \int_{B(P, r)} u\,d\tau,$$

where $\tau_n$ and $\sigma_n$ are the volume and surface area of a unit ball in $R^n$, respectively, B(P, r) is the open ball with center at P and radius r, and $d\tau$ is the volume element. These relations are called mean value theorems. Conversely, if v is continuous in D and at every point $P \in D$, there is a sequence $\{r_k\}$ decreasing to zero and such that the mean value of v over $B(P, r_k)$ or $S(P, r_k)$ is equal to v(P) for each k, then v is harmonic in D. This result is called Koebe's theorem. From the mean value theorems the maximum principle for harmonic functions follows: Any nonconstant u assumes neither maximum nor minimum in D. If both u and v are harmonic in D and have the same finite boundary value at every point on S, then u = v in D by the maximum principle. This is called the uniqueness theorem.

F. Boundary Value Problems

The first boundary value problem (or Dirichlet problem) is the problem of finding a harmonic function defined on D that assumes boundary values prescribed on S (→ 120 Dirichlet Problem). The second boundary value problem (or Neumann problem) is the problem of finding a harmonic function u whose normal derivative $\partial u/\partial n$ is equal to a function f prescribed on the piecewise smooth boundary S. The solution, if it exists, is uniquely determined up to an additive constant. In order for the solution to exist, f should satisfy the condition $\int_S f\,d\sigma = 0$. The third boundary value problem is the problem of finding a harmonic function u on D that satisfies $\partial u/\partial n = hu + f$ on S, where h and f are functions prescribed on S. All these problems can be reduced to certain †Fredholm integral equations. There is also the boundary value problem of mixed type, in which the boundary values are prescribed in a part of S and the normal derivatives are prescribed on the rest.

G. The Poisson Integral

Let D be a bounded domain with smooth boundary S and u a function harmonic in D and continuous on $D \cup S$. Let G(P, Q) be Green's function in D. Then (1) yields

$$u(P) = -\frac{1}{\sigma_n} \int_S u(Q)\,\frac{\partial G(P, Q)}{\partial n_Q}\,d\sigma(Q).$$
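As a numerical aside on the mean value theorems of Section E, the following minimal sketch checks, for n = 2, that the average of a harmonic function over a circle and over a disk reproduces its value at the center. The sample function `u` (the real part of z^3 + 2z), the helper names `circle_mean` and `disk_mean`, and the quadrature resolutions are illustrative assumptions, not part of the article.

```python
import math


def u(x, y):
    # Hypothetical sample: real part of z**3 + 2*z, hence harmonic in the plane.
    return x**3 - 3.0 * x * y**2 + 2.0 * x


def circle_mean(f, cx, cy, r, m=2000):
    """Mean of f over the circle S(P, r), P = (cx, cy), by the trapezoid rule."""
    total = 0.0
    for k in range(m):
        t = 2.0 * math.pi * k / m
        total += f(cx + r * math.cos(t), cy + r * math.sin(t))
    return total / m


def disk_mean(f, cx, cy, r, m_r=400, m_t=400):
    """Mean of f over the disk B(P, r), using polar coordinates.

    The radial integral uses the midpoint rule; each ring is averaged
    with the trapezoid rule and weighted by rho (the area element).
    """
    total = 0.0
    for i in range(m_r):
        rho = r * (i + 0.5) / m_r
        ring = 0.0
        for k in range(m_t):
            t = 2.0 * math.pi * k / m_t
            ring += f(cx + rho * math.cos(t), cy + rho * math.sin(t))
        total += (ring / m_t) * rho
    integral = total * (r / m_r) * 2.0 * math.pi
    return integral / (math.pi * r * r)


if __name__ == "__main__":
    px, py, radius = 0.3, -0.7, 0.5
    print("u(P)        =", u(px, py))
    print("circle mean =", circle_mean(u, px, py, radius))
    print("disk mean   =", disk_mean(u, px, py, radius))
```

For a function that is not harmonic, such as x^2 + y^2, the circle mean about the origin equals r^2 rather than the central value 0, so the same helpers also illustrate why the mean value property characterizes harmonicity.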

In particular, if D = B(O, r), then

$$u(P) = \frac{r^2 - \overline{OP}^2}{\sigma_n r} \int_{S(O, r)} \frac{u(Q)}{\overline{PQ}^n}\,d\sigma(Q).$$

Conversely, given an integrable function f on S(O, r), we set

$$u(P) = \frac{r^2 - \overline{OP}^2}{\sigma_n r} \int_{S(O, r)} \frac{f(Q)}{\overline{PQ}^n}\,d\sigma(Q).$$

Then u(P) is harmonic in B(O, r) and converges to f(Q) as P tends to any point Q on S(O, r) where f is continuous. We call u a Poisson integral. Sometimes it is possible to represent a harmonic function u in B(O, r) in the following form, which is more general than the Poisson integral:

$$u(P) = \frac{r^2 - \overline{OP}^2}{\sigma_n r} \int_{S(O, r)} \frac{1}{\overline{PQ}^n}\,d\alpha(Q), \tag{3}$$

where $\alpha$ is a signed †Radon measure on S(O, r). In order for u to admit such a representation, it is necessary and sufficient that $\int_{S(O, r')} |u|\,d\sigma$ be a bounded function of r' for $0 < r' < r$, or equivalently, that the †subharmonic function |u| have a †harmonic majorant. Furthermore, if $\alpha$ is absolutely continuous, then the Poisson integral representation of u is possible, and vice versa. A necessary and sufficient condition for the function u to admit the Poisson integral representation is that there exist a positive convex function $\varphi(t)$ on $t > 0$ such that $\varphi(t)/t \to \infty$ as $t \to \infty$ and $\varphi(|u|)$ has a harmonic majorant.

When D is a general domain in which Green's function exists, every positive harmonic function u(P) is represented uniquely as the integral $\int K(P, Q)\,d\mu(Q)$, where K(P, Q) is a †Martin kernel and $\mu$ is a Radon measure on the Martin boundary B whose support is contained in a certain essential part of B, each point of which is called a minimal point. A similar integral representation appears in the theory of Markov processes (→ 260 Markov Chains I). The representation $\int K(P, Q)\,d\mu(Q)$ is a generalization of (3). In terms of a function similar to $\varphi(t)$, we can give a necessary and sufficient condition for u(P) to be represented in the form $\int K(P, Q) f(Q)\,d\nu(Q)$, which corresponds to the Poisson integral representation and in which $\nu$ is determined by $1 = \int K(P, Q)\,d\nu(Q)$. This condition is equivalent to the condition that u(P) be quasibounded, i.e., that there exist an increasing sequence of bounded harmonic functions that converges to u [7].

A positive harmonic function u is said to be singular if any nonnegative bounded harmonic minorant of u vanishes identically. Every positive harmonic function can be expressed as the sum of a quasibounded harmonic function and a singular one.

H. Expansion

Let $P_0 = (x_1^0, \ldots, x_n^0)$ be a point in D, and denote the distance from $P_0$ to S by r. Then a harmonic function u is expanded uniquely into a power series

$$u(P) = \sum a_{k_1 \cdots k_n} (x_1 - x_1^0)^{k_1} \cdots (x_n - x_n^0)^{k_n}, \quad k_1, \ldots, k_n \ge 0,$$

in $B(P_0, (\sqrt{2} - 1) r)$. Thus u is (real) analytic in D. If u vanishes on an open set in D, then $u \equiv 0$ in D. If the power series is written as $\sum_k h_k$ with spherical harmonics $h_k$ of degree k = 1, 2, ..., then this series converges over all of $B(P_0, r)$.

I. Sequences of Harmonic Functions

In this section, $\{u_m\}$ is a sequence of harmonic functions in a bounded domain D. First, if each $u_m$ is bounded and continuous on $D \cup S$ and $\{u_m\}$ converges uniformly on S, then $\{u_m\}$ converges uniformly in D, and the limiting function u is harmonic in D. Moreover, $\partial^{k_1 + \cdots + k_n} u_m / \partial x_1^{k_1} \cdots \partial x_n^{k_n}$ converges to $\partial^{k_1 + \cdots + k_n} u / \partial x_1^{k_1} \cdots \partial x_n^{k_n}$ uniformly on any compact subset of D (Harnack's first theorem). Second, if $u_1 \le u_2 \le \cdots$ in D and there is a point of D at which $\{u_m\}$ is bounded, then $\{u_m\}$ converges uniformly on any compact subset of D (Harnack's second theorem). The following Harnack's lemma is useful: If u is positive harmonic in D, $P_0$ is a point of D, and K is a compact subset of D, then there exist positive constants c and c', depending only on $P_0$ and K, such that $c\,u(P_0) \le u(P) \le c'\,u(P_0)$ on K.

Any family of (locally) uniformly bounded harmonic functions is †normal. A family of positive harmonic functions that is bounded at a point is also normal by Harnack's lemma. If $\int_D |u_k - u_m|^p\,d\tau \to 0$ as $k, m \to \infty$ for $p > 1$, then †Hölder's inequality implies that $\{u_m\}$ converges uniformly on any compact subset of D. It follows that if $\int_D |\operatorname{grad}(u_k - u_m)|^p\,d\tau \to 0$ as $k, m \to \infty$ and $\{u_m\}$ converges at a point in D, then there exists a harmonic function u in D such that $\int_D |\operatorname{grad}(u_m - u)|^p\,d\tau \to 0$ as $m \to \infty$ and $u_m$ converges to u uniformly on any compact subset of D, where $P_0$ is any point in D. Finally, if $\int_D |u_m|^p\,d\tau$ ($p > 1$) are bounded, then $\{u_m\}$ forms a normal family.

J. Level Surfaces and Orthogonal Trajectories

The set $\{P \mid u(P) = \text{constant}\}$ is called a level surface (niveau or equipotential surface). When a is given as the constant, the level surface is called the a-level surface. Assume that u is not a constant. A point where grad u vanishes is called critical. The set of critical points consists of at most countably many †real analytic

manifolds of dimension $\le n - 2$ (n = dim D, and a manifold of dimension 0 is understood to be a point). Any compact subset of D intersects only a finite number of such manifolds; we express this fact by saying that the manifolds do not cluster in D. Each of these manifolds is contained in a certain level surface. The complement of the critical points with respect to any level surface consists of real analytic manifolds of dimension n - 1 that do not cluster in D.

For each noncritical point there exists an analytic curve passing through it such that grad u is parallel to the tangent to the curve at each point on the curve. A maximal curve with this property is called an orthogonal trajectory (or line of force). Along every orthogonal trajectory, u increases strictly in one direction and hence decreases in the other, so that none of the orthogonal trajectories is a closed curve. There is exactly one orthogonal trajectory passing through any noncritical point. Therefore no two orthogonal trajectories intersect, and no orthogonal trajectory terminates at a noncritical point. Moreover, the set of limit points of any orthogonal trajectory in each direction does not contain any noncritical point. When u is a Green's function G(P, Q), every orthogonal trajectory is called a Green line, and a Green line that originates at the pole Q and along which u decreases to 0 is called regular. For any sufficiently large a, the a-level surface $\Sigma_a$ is an analytic closed surface homeomorphic to a spherical surface. Let E be a family of orthogonal trajectories originating at the pole. If the intersection A of E and a closed level surface $\Sigma_a$ is an (n - 1)-dimensional measurable set, then the †harmonic measure of A at Q with respect to the interior of $\Sigma_a$ is called the Green measure of E. M. Brelot and G. Choquet proved that all orthogonal trajectories originating at the pole except those belonging to a family of Green measure zero are regular. Consider a domain D bounded by two compact sets, and denote by u the harmonic measure of one compact set with respect to D. Assume that u is not a constant. Then u changes from 0 to 1 along all orthogonal trajectories except those belonging to a family that is small with respect to a measure similar to the Green measure (see "flux" defined in Section K).

K. Harmonic Flows

Denote by $\Sigma_a$ the a-level surface for a harmonic function u, and by $\Sigma_a'$ the complement of the set of critical points with respect to $\Sigma_a$. Let $\sigma$ be an (n - 1)-dimensional domain in $\Sigma_a'$ such that the (n - 2)-dimensional boundary of $\sigma$ is piecewise of class $C^1$. Suppose that u assumes the value b (> a) on each orthogonal trajectory passing through $\sigma$. Consider the union of orthogonal trajectories that pass through $\sigma$. The subset of this union on which u assumes values between a and b forms a set called a regular tube. The parts of the boundary corresponding to a and b are called the lower and upper bases of the tube; accordingly, $\sigma$ is the lower base. The integral $\int (\partial u/\partial n)\,d\sigma$ on any section (i.e., the part of a level surface in the tube) is constant and is called the flux of the tube. The family of orthogonal trajectories passing through an (n - 1)-dimensional domain (not necessarily bounded by a smooth boundary) in $\Sigma_a'$ is called a harmonic flow, and a subfamily is called a harmonic subflow if its intersection A with $\Sigma_a'$ is measurable (in the (n - 1)-dimensional sense). The flux of a harmonic subflow is defined to be $\int_A (\partial u/\partial n)\,d\sigma$. Then the Green measure of family E of Green lines originating at the pole is equal to the flux of E divided by $\sigma_n$. We can compute the exact value of the †extremal length of any harmonic subflow.

L. Isolated Singularities

Let u be harmonic in an open ball except at the center O. It is expressed as the sum of a function h(P) harmonic in the entire ball and $\sum_{m=0}^{\infty} H_m(P)/\overline{OP}^{2m+1}$, where $H_m$ is a †spherical harmonic of degree m. If $\overline{OP}^{\alpha} u(P) \to 0$ as $P \to O$ for $\alpha > 0$, then u(P) is equal to $h(P) + c/\overline{OP} + \cdots + H_m(P)/\overline{OP}^{2m+1}$ with $m < \alpha - 1$. In particular, if u is bounded in a neighborhood of O, then O is a removable singularity for u. If u is bounded above (below), then $u(P) = h(P) + c/\overline{OP}^{n-2}$, where $c \le 0$ ($c \ge 0$). (When n = 2, we have $u(P) = h(P) + c \log(1/\overline{OP})$. For the removability of a set of capacity zero → 169 Function-Theoretic Null Sets.)

If u is harmonic near the point at infinity, i.e., outside some closed ball, then

$$u(P) = \sum_{m=0}^{\infty} \frac{H_m(P)}{\overline{OP}^{2m+1}} + \sum_{m=0}^{\infty} U_m(P),$$

where the first sum is regular at the point at infinity and $U_m$ is a spherical harmonic of degree m. If $\overline{OP}^{-\alpha} u(P) \to 0$ as $\overline{OP} \to \infty$ with $\alpha \ge 0$, then $U_m = 0$ for all $m \ge \alpha$. If u is bounded above or below, then $U_m = 0$ for all $m \ge 1$. If u is harmonic in $R^n$ and $\overline{OP}^{-\alpha} u(P) \to 0$ as $\overline{OP} \to \infty$ with $\alpha > 0$, then u is a polynomial of degree m ($< \alpha$). If u is harmonic and bounded above or below in $R^n$, then u is constant. Brelot called a function u harmonic at the point at infinity if

$$u(P) = \text{constant} + \sum_{m=1}^{\infty} \frac{H_m(P)}{\overline{OP}^{2m+1}}$$

(note that $m \ge 1$) near the point at infinity [1].
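The singular terms that appear in Section L, namely $c \log(1/\overline{OP})$ for n = 2 and $c/\overline{OP}^{n-2}$ for $n \ge 3$, are harmonic away from O. The sketch below checks this with a central-difference approximation of the Laplacian; the helper names `laplacian_fd` and `radial_singularity`, the step size `h`, and the sample points are illustrative assumptions, and the check is only as accurate as the finite-difference scheme.

```python
import math


def laplacian_fd(f, p, h=1e-3):
    """Second-order central-difference approximation of the Laplacian of f at p.

    f takes a list of n coordinates; p is a list of n floats.
    """
    base = f(p)
    total = 0.0
    for i in range(len(p)):
        plus = list(p)
        minus = list(p)
        plus[i] += h
        minus[i] -= h
        total += (f(plus) - 2.0 * base + f(minus)) / (h * h)
    return total


def radial_singularity(n):
    """Return P -> log(1/|P|) for n = 2 and P -> |P|**(2 - n) for n >= 3."""
    def f(p):
        r = math.sqrt(sum(x * x for x in p))
        return math.log(1.0 / r) if n == 2 else r ** (2 - n)
    return f


if __name__ == "__main__":
    samples = {2: [0.6, 0.8], 3: [0.6, 0.8, 0.5], 4: [0.6, 0.8, 0.5, 0.3]}
    for n, point in samples.items():
        # Expected to be close to zero, up to discretization and rounding error.
        print(n, laplacian_fd(radial_singularity(n), point))
```

By contrast, applying `laplacian_fd` to the non-harmonic function $|P|^2$ returns approximately 2n, which gives a quick sanity check that the approximation is not trivially zero.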

M. Harmonic Continuation

If u vanishes in a subdomain of D, then $u \equiv 0$ in D. If $u_k$ is harmonic in $D_k$ (k = 1, 2), $D_1 \cap D_2 \ne \emptyset$, and $u_1 = u_2$ in $D_1 \cap D_2$, then $u_1$ and $u_2$ define a harmonic function in $D_1 \cup D_2$. If the boundaries of mutually disjoint domains $D_1$ and $D_2$ have a surface $S_0$ of class $C^1$ in common, $u_k$ is harmonic in $D_k$ (k = 1, 2), $u_1 = u_2$ on $S_0$, and $\partial u_1/\partial n$ and $-\partial u_2/\partial n$ exist and coincide on $S_0$, then $u_1$ and $u_2$ define a harmonic function in the domain $D_1 \cup S_0 \cup D_2$. We express this fact by saying that one of $u_1$ and $u_2$ is a harmonic continuation of the other. It follows that $u \equiv 0$ in D if the boundary of D contains a surface $S_0$ of class $C^1$ and $u = \partial u/\partial n = 0$ on $S_0$. Consider the case n = 2. If the boundary of a Jordan domain contains an analytic arc C and u (or $\partial u/\partial n$) vanishes on C, then a harmonic continuation of u into a certain domain beyond C is possible. If n = 3, however, nothing is known except in the case where $S_0$ is a part of a spherical surface or a plane and u = 0 (or $\partial u/\partial n = 0$) on $S_0$.

Boundary values of u do not always exist, but in some special cases, u has limits. For instance, a positive harmonic function in a ball has a finite limit at almost every boundary point Q if the variable is restricted to any angular domain with vertex at Q.

N. Green Spaces

As a generalization of †Riemann surfaces, Brelot and Choquet introduced $\mathcal{E}$-spaces [3]. It is required that $\mathcal{E}$ be a separable connected topological space and satisfy the following two conditions: (i) At each point P there exists a neighborhood $V_P$ of P and a homeomorphism between $V_P$ and an open set $V_P'$ in the †Alexandrov compactification $R^n \cup \{\infty\}$; (ii) if $A = V_{P_1} \cap V_{P_2} \ne \emptyset$ and $A_k'$ is the part of $V_{P_k}'$ that corresponds to A (k = 1, 2), then the correspondence between $A_1'$ and $A_2'$ via A is a conformal (possibly with the sense of angles reversed) transformation when n = 2 and an †isometric transformation (which keeps $\infty$ invariant) when $n \ge 3$. If a †Green's function exists on $\mathcal{E}$, then $\mathcal{E}$ is called a Green space. Harmonic functions and the †Dirichlet problem on a Green space have been discussed from various points of view.

O. Biharmonic Functions

A function u is called polyharmonic if $\Delta^k u = 0$ ($k \ge 2$) and biharmonic if $\Delta\Delta u = 0$; sometimes, polyharmonic functions are also called biharmonic. A biharmonic function in a plane domain D is written as $\operatorname{Re}(\bar{z} f(z) + g(z))$, where f and g are complex analytic in D (Goursat's representation). Biharmonic functions are used in the theory of elasticity and hydrodynamics.

P. Subharmonic Functions

Let D be a †domain in the n-dimensional Euclidean space $R^n$ ($n \ge 2$). A real-valued function u(P) in D is called subharmonic if (1) $-\infty \le u < +\infty$, $u \not\equiv -\infty$; (2) u is †upper semicontinuous; and (3) at every point $P_0$ of D, the mean value of u over the surface of any closed †ball in D with center at $P_0$ is not smaller than $u(P_0)$, i.e.,

$$u(P_0) \le L(P_0, r) = \frac{1}{\sigma_n r^{n-1}} \int_{S(P_0, r)} u\,d\sigma,$$

where $\sigma_n$ is the area of the surface of a unit ball in $R^n$. Condition (3) can be replaced by: (3') The mean value $A(P_0, r)$ of u over the closed ball is $\ge u(P_0)$. In order that an upper semicontinuous function u be subharmonic it is necessary and sufficient that, for any subdomain D' of D and for any harmonic function h in D', the maximum principle hold for u - h. We call -u superharmonic when u is subharmonic. A harmonic function is subharmonic and superharmonic. The converse is also true (→ Section E).

When u is of class $C^2$, then u is subharmonic if and only if

$$\Delta u = \frac{\partial^2 u}{\partial x_1^2} + \cdots + \frac{\partial^2 u}{\partial x_n^2} \ge 0 \quad (P = (x_1, \ldots, x_n)).$$

When u is an upper semicontinuous function that is not necessarily differentiable, u is subharmonic if and only if $\Delta u$ interpreted as a †distribution is a positive †measure.

If $u_1, \ldots, u_k$ are subharmonic and $a_1, \ldots, a_k$ are positive constants, then $a_1 u_1 + \cdots + a_k u_k$ and $\max(u_1(P), u_2(P), \ldots, u_k(P))$ are subharmonic. If a subharmonic function u is replaced by the †Poisson integral for the boundary function u inside a closed ball in D, then the resulting function in D is subharmonic. If f(t) is a monotone increasing convex function, then f(u) is subharmonic. If u > 0 and log u is subharmonic, then u is subharmonic. If f(z) is a holomorphic function of the complex variable z and $\alpha > 0$, then $\alpha \log |f(z)|$ and hence $|f(z)|^{\alpha}$ are subharmonic. If h is harmonic, then |h| is subharmonic. Any †logarithmic potential (n = 2) or †Newtonian potential ($n \ge 3$) is superharmonic in $R^n$.

Q. Properties of Mean Values

Condition (3) (resp. (3')) can be replaced by the condition that there exists an $r(P_0) > 0$ at any

$P_0$ such that $u(P_0) \le L(P_0, r)$ ($u(P_0) \le A(P_0, r)$) for every r, $0 < r < r(P_0)$. The relation $-\infty < A(P_0, r) \le L(P_0, r)$ always holds, and both $A(P_0, r)$ and $L(P_0, r)$ decrease to $u(P_0)$ as $r \downarrow 0$. On any †compact subset of D, u is †integrable. Both $A(P_0, r)$ and $L(P_0, r)$ increase with r and are convex functions of $-\log r$ (n = 2) and $r^{2-n}$ ($n \ge 3$); hence they are continuous functions of r. If D' is relatively compact in D and $r_0$ is the distance between $\partial D'$ and $\partial D$, then A(P, r) is a continuous subharmonic function of P in D', where r is fixed in the interval $(0, r_0)$. By taking the average of A(P, r) k times, a subharmonic function of class $C^k$ is obtained that decreases to u as $r \downarrow 0$. If $\varphi_r((x_1^2 + \cdots + x_n^2)^{1/2})$ is suitably chosen, then the †convolution $u * \varphi_r$ is a subharmonic function of class $C^{\infty}$ and decreases to u as $r \downarrow 0$.

R. Sequences of Subharmonic Functions

The limit of a decreasing sequence or a downward-directed net of subharmonic functions is subharmonic or equal constantly to $-\infty$. The limit of a uniformly convergent sequence of subharmonic functions is subharmonic. If $u_1, u_2, \ldots$ are subharmonic, then $\max(u_1, \ldots, u_k)$ is subharmonic for every k, but $\sup(u_1, u_2, \ldots)$ may not be subharmonic. Let U be a family of subharmonic functions in D that are uniformly bounded above on every compact subset of D. Then the upper envelope of U, i.e., the function defined by $\sup_{u \in U} u$ in D, coincides with a subharmonic function except on a set of †capacity zero.

S. Harmonic Majorants and Riesz Decompositions

Suppose that we are given a subharmonic function u in D. If there is a harmonic function h satisfying $h \ge u$ in D, then h is called a harmonic majorant of u. When there is a harmonic majorant of u, there exists a least one among them, denoted by $h_u$. For any relatively compact subdomain D' of D, $h_u^{D'}$ always exists and equals the †Perron-Brelot solution in D' for the boundary function u. As D' increases to D, $h_u^{D'}$ increases to a function h that is either harmonic or equal constantly to $\infty$. If $h_u$ exists, it coincides with h, and hence h is harmonic. Conversely, if h is harmonic, then $h_u$ exists and equals h. Generally, there is a unique †Radon measure $\mu$ in D with the following property: Let $\delta$ be any subdomain of D such that $h_u^{\delta}$ and the †Green's function $G_{\delta}$ exist in $\delta$ ($\delta$ may coincide with D). Then $h_u^{\delta} - u$ is equal to the potential $\int_{\delta} G_{\delta}\,d\mu$, and $u = h_u^{\delta} - \int_{\delta} G_{\delta}\,d\mu$. In general, a representation of a superharmonic (subharmonic) function as the sum (difference) of a harmonic function and a potential is called a Riesz decomposition.

T. Boundary Values

Let D be a domain in which a Green's function exists, and consider the †fine topology on the †Martin compactification of D. Any negative subharmonic function u in D has a finite limit with respect to the fine topology at every point of the †Martin boundary $\Delta$ except at the points of a subset of $\Delta$ of harmonic measure zero (J. L. Doob). When D is a ball, u has a limit in the ordinary sense along almost every radius. However, it may happen that even if u is bounded there exists no angular limit at any point of the boundary. If the mean value of |u| on every concentric smaller ball in D is bounded, then u can be decomposed into the sum of a nonnegative harmonic function and a negative subharmonic function by the Riesz decomposition, and hence u has a limit both radially and with respect to the fine topology.

Let D be a domain and K be a compact subset of D of †capacity zero. If a function u is subharmonic and bounded above in D - K, then u can be extended to be a subharmonic function in D. A function that is equal to a subharmonic function almost everywhere is called almost subharmonic, and an almost subharmonic function satisfying condition (3') is called submedian.

Subharmonic functions can be discussed in a space more general than $R^n$, e.g., a †Riemann surface (n = 2), or more generally, an †$\mathcal{E}$-space of dimension n ($\ge 2$) in the sense of Brelot and Choquet.

U. The Axiomatic Treatment

†Newtonian potentials were the main object of interest in the early stages of †potential theory. A major part of potential theory can be discussed on the basis of the theory of superharmonic functions [8]. For example, a †polar set is defined as a set on which some superharmonic function assumes the value $\infty$, and a set X is †thin at a point $P_0 \notin X$ if and only if $P_0$ has a positive distance from X or there exists a superharmonic function u(P) in a neighborhood of $P_0$ such that $\liminf u > u(P_0)$ as $P \in X$ tends to $P_0$. Moreover, we can discuss †balayage, define potentials, and obtain Riesz decompositions. Generalizing results of Doob (1954) and starting from a family of harmonic functions defined axiomatically in a locally compact Hausdorff space, M. Brelot defined superharmonic functions and potentials and discussed balayage, Riesz decompositions, and

the †Dirichlet problem (1957). Further progress in axiomatic potential theory has been made by Brelot, H. Bauer, C. Constantinescu, A. Cornea, and others [11, 14].

References

[1] M. Brelot, Sur le rôle du point à l'infini dans la théorie des fonctions harmoniques, Ann. Sci. Ecole Norm. Sup., (3) 61 (1944), 301-332.
[2] M. Brelot, Eléments de la théorie classique du potentiel, Centre de Documentation Universitaire, Paris, third edition, 1965.
[3] M. Brelot and G. Choquet, Espaces et lignes de Green, Ann. Inst. Fourier, 3 (1951), 199-263.
[4] O. D. Kellogg, Foundations of potential theory, Springer, 1929.
[5] M. Nicolescu, Les fonctions polyharmoniques, Actualités Sci. Ind., Hermann, 1936.
[6] M. Ohtsuka, Extremal length of level surfaces and orthogonal trajectories, J. Sci. Hiroshima Univ., 28 (1964), 259-270.
[7] M. Parreau, Sur les moyennes des fonctions harmoniques et analytiques et la classification des surfaces de Riemann, Ann. Inst. Fourier, 3 (1952), 103-197.
[8] M. Brelot, Sur la théorie autonome des fonctions sous-harmoniques, Bull. Sci. Math., (2) 65 (1941), 72-98.
[9] M. Brelot, Lectures on potential theory, Tata Inst., 1960.
[10] T. Radó, Subharmonic functions, Erg. Math., Springer, 1937 (Chelsea, 1970).
[11] Colloque International sur la Théorie du Potentiel, Paris-Orsay, 1964, C.N.R.S. (Ann. Inst. Fourier, 15 (1965), fasc. 1.)
[12] L. Helms, Introduction to potential theory, Wiley-Interscience, 1969.
[13] W. K. Hayman and P. B. Kennedy, Subharmonic functions I, Academic Press, 1976.
[14] C. Constantinescu and A. Cornea, Potential theory on harmonic spaces, Springer, 1972.
Also → references to 120 Dirichlet Problem and 338 Potential Theory.

194 (VII.11)
Harmonic Integrals

A. Introduction

†De Rham's theorem shows that the cohomology group with real coefficients of a †differentiable manifold of class $C^{\infty}$ is isomorphic to the cohomology group of the cochain complex of †differential forms with respect to the exterior derivative d. Thus every element of the cohomology group can be represented by a class of †closed differential forms. Harmonic forms enable us to choose one definite differential form in each cohomology class. The theory of harmonic forms, called the theory of harmonic integrals, is modeled after the theory of holomorphic differentials and their integrals (Abelian integrals) in function theory [2, 4, 5, 8].

B. Definitions

Let X be an oriented n-dimensional differentiable manifold of class $C^{\infty}$ with a †Riemannian metric $ds^2$ of class $C^{\infty}$ (→ 105 Differentiable Manifolds, 364 Riemannian Manifolds A). For every ($C^{\infty}$) p-form $\varphi$ on X we define an (n - p)-form $*\varphi$ on X as follows: First denote the †volume element of X by dv. If we choose a basis $\{\omega_1, \ldots, \omega_n\}$ of the space of 1-forms on an open set U of X such that $ds^2 = \sum_i \omega_i^2$ and $dv = \omega_1 \wedge \cdots \wedge \omega_n$, then $\varphi$ can be expressed on U in the form $\varphi = (1/p!) \sum \varphi_{i_1 \cdots i_p}\,\omega_{i_1} \wedge \cdots \wedge \omega_{i_p}$. If we let $*\varphi = (1/(n-p)!) \sum (*\varphi)_{j_1 \cdots j_{n-p}}\,\omega_{j_1} \wedge \cdots \wedge \omega_{j_{n-p}}$, where $(*\varphi)_{j_1 \cdots j_{n-p}} = (1/p!) \sum \delta^{1 \cdots n}_{i_1 \cdots i_p j_1 \cdots j_{n-p}}\,\varphi_{i_1 \cdots i_p}$, then $*\varphi$ is an (n - p)-form on U that does not depend on the choice of $(\omega_1, \ldots, \omega_n)$ and is determined only by $\varphi$. Since X is covered by open sets as above, $*$ defines a linear mapping that transforms p-forms to (n - p)-forms. If we let $ds^2 = \sum g_{jk}\,dx^j dx^k$ in terms of the local coordinate system $(x^1, \ldots, x^n)$ and $\varphi = (1/p!)\,\varphi_{i_1 \cdots i_p}\,dx^{i_1} \wedge \cdots \wedge dx^{i_p}$, then, in the notation of tensor calculus, we have

$$*\varphi = \frac{1}{(n-p)!}\,(*\varphi)_{j_1 \cdots j_{n-p}}\,dx^{j_1} \wedge \cdots \wedge dx^{j_{n-p}},$$

$$(*\varphi)_{j_1 \cdots j_{n-p}} = \frac{1}{p!}\,\sqrt{g}\,\delta^{1 \cdots n}_{k_1 \cdots k_p j_1 \cdots j_{n-p}}\,\varphi^{k_1 \cdots k_p}.$$

For two p-forms $\varphi$ and $\psi$, we define the inner product by $(\varphi, \psi) = \int_X \varphi \wedge *\psi$ if the right-hand side converges. In order for the inner product $(\varphi, \psi)$ to be defined, it suffices that either $\varphi$ or $\psi$ has compact †support. Then $(\varphi, \psi)$ is a symmetric, positive definite bilinear form.

If we let $\delta = (-1)^{np+n+1} * d\, *$ operate on p-forms, where d is the †exterior derivative, then d and $\delta$ are adjoint to each other with respect to the inner product. That is, if either $\varphi$ or $\psi$ has compact support, we have $(d\varphi, \psi) = (\varphi, \delta\psi)$ (Stokes's theorem). We call $\Delta = d\delta + \delta d$ the Laplace-Beltrami operator, which is a self-adjoint †elliptic differential operator. These operators satisfy relations such as $** = (-1)^{p(n-p)}$, $dd = 0$, $\delta\delta = 0$, $*\Delta = \Delta *$,

$*\delta = (-1)^p d\,*$, and $\delta\,* = (-1)^{p+1} * d$ (when they operate on p-forms).

A differential form $\varphi$ is said to be harmonic if $d\varphi = 0$ and $\delta\varphi = 0$. Then $\Delta\varphi = 0$. Since $\Delta$ is an elliptic operator, a †weak solution $\varphi$ of the equation $\Delta\varphi = \rho$ is an ordinary solution of class $C^{\infty}$ on the domain where $\rho$ is of class $C^{\infty}$. Therefore, if $\varphi$ is harmonic (as a weak solution), $\varphi$ is of class $C^{\infty}$.

C. Harmonic Forms on Compact Manifolds

On a compact manifold X, any $\varphi$ with $\Delta\varphi = 0$ is harmonic, since $(\varphi, \Delta\varphi) = (d\varphi, d\varphi) + (\delta\varphi, \delta\varphi)$. Let $L_p(X)$ be the linear space of p-forms of class $C^{\infty}$ on X, and denote by $\mathfrak{L}_p(X)$ the completion of $L_p(X)$ with respect to the inner product $(\varphi, \psi)$. Then $\mathfrak{L}_p(X)$ is the Hilbert space of square integrable measurable p-forms. Then $\mathfrak{H}_p(X) = \{\varphi \in \mathfrak{L}_p(X) \mid \Delta\varphi = 0 \text{ (in the weak sense)}\}$ is a finite-dimensional subspace of $\mathfrak{L}_p(X)$ and is contained in $L_p(X)$, as we have seen before. Also, $\mathfrak{H}_p(X)$ is closed in $\mathfrak{L}_p(X)$, and the †projection operator $H: \mathfrak{L}_p(X) \to \mathfrak{H}_p(X)$ is an †integral operator with kernel of class $C^{\infty}$. The orthogonal complement of $\mathfrak{H}_p(X)$ in $\mathfrak{L}_p(X)$ is mapped onto itself by $\Delta$ and has the inverse operator G of $\Delta$, which is a continuous operator of the Hilbert space. By letting G = 0 on $\mathfrak{H}_p(X)$, we can extend G to an operator from $\mathfrak{L}_p(X)$ to $\mathfrak{L}_p(X)$ that is called Green's operator. It is also denoted by G, maps $L_p$ into itself, commutes with d and $\delta$, and satisfies $GH = HG = 0$, $H + \Delta G = 1$ (= identity mapping). Therefore, for $\varphi \in L_p(X)$ we have $\varphi = H\varphi + \delta G d\varphi + d G \delta\varphi$, which shows that H is †homotopic to the identity mapping of the †cochain complex $(\sum_p L_p(X), d)$. From this we infer that every cohomology class of de Rham cohomology contains a unique harmonic form that represents the cohomology class. However, since products of harmonic forms are not always harmonic, it is not appropriate to use harmonic forms to study the ring structure of cohomology. G is also an integral operator with kernel of class $C^{\infty}$ outside the diagonal subset in $X \times X$.

D. Harmonic Forms on Noncompact Manifolds

If X is a noncompact manifold, let $L_p(X)$ be the space of p-forms of class $C^{\infty}$ with compact support, and let $\mathfrak{L}_p(X)$ be its completion. Let $\mathfrak{B}_p(X)$ and $\mathfrak{B}_p^*(X)$ be the respective closures of $dL_{p-1}(X)$ and $\delta L_{p+1}(X)$ in $\mathfrak{L}_p(X)$, and let $\mathfrak{Z}_p(X)$ and $\mathfrak{Z}_p^*(X)$ be the respective orthogonal complements of $\mathfrak{B}_p^*(X)$ and $\mathfrak{B}_p(X)$ in $\mathfrak{L}_p(X)$. Then $\mathfrak{Z}_p(X) \cap \mathfrak{Z}_p^*(X) = \mathfrak{H}_p(X)$ is a subspace of the square integrable harmonic forms, and we have the direct sum decomposition $\mathfrak{L}_p(X) = \mathfrak{B}_p(X) + \mathfrak{B}_p^*(X) + \mathfrak{H}_p(X)$. In this decomposition any component of a form of class $C^{\infty}$ is also of class $C^{\infty}$.

If X is an open submanifold of another manifold Y, $\bar{X}$ is compact, and $\partial X = \bar{X} - X$ is a closed submanifold of Y, then the theory in this section is just a generalized potential theory with boundary condition $\varphi = 0$ on $\partial X$. We sometimes treat decompositions of other Hilbert spaces that correspond to other boundary conditions.

E. Generalization to Complex Manifolds

If X is a complex manifold, we consider complex-valued differential forms (→ 72 Complex Manifolds C). Then the space $L_p(X)$ of p-forms is the direct sum of the spaces $L_{r,s}(X)$ of forms of type (r, s), and the exterior derivative d has the expression $d = d' + d''$, where d' is of type (1, 0) (i.e., $L_{r,s}(X) \to L_{r+1,s}(X)$) and d'' is of type (0, 1). If we are given a holomorphic vector bundle E on X, we can define an operator d'' on differential forms with values in E, and we have the generalized †Dolbeault theorem. If X is compact and has a †Hermitian metric, we can define a Hermitian inner product on E as follows: There is an open covering $\{U_j\}$ such that over each $U_j$ the vector bundle E is isomorphic to $U_j \times C^q$. A point of E over $U_j$ is represented by $(x, \xi_j)$, where $x \in U_j$ and $\xi_j \in C^q$. For $x \in U_j \cap U_k$ we have $(x, \xi_j) = (x, \xi_k)$ (the sides are the respective expressions over $U_j$ and $U_k$) if and only if $\xi_j = g_{jk}(x)\xi_k$, where $g_{jk}(x)$ is a holomorphic mapping from $U_j \cap U_k$ to GL(q, C) satisfying $g_{jk} g_{kl} = g_{jl}$ on $U_j \cap U_k \cap U_l$. A differential form $\varphi$ with values in E is expressed as a family $\{\varphi_j\}$ of differential forms on $U_j$ with values in $C^q$ such that $\varphi_j(x) = g_{jk}(x)\varphi_k(x)$ on $U_j \cap U_k$. If we take a positive definite Hermitian matrix $h_j$ whose components are $C^{\infty}$-functions on $U_j$ such that ${}^t g_{jk}\, h_j\, \bar{g}_{jk} = h_k$ on $U_j \cap U_k$, then $\{{}^t \xi_j h_j \bar{\xi}_j\}$ determines a Hermitian inner product on each fiber of E. We can also endow the space $L_{r,s}(E, X)$ (of forms of type (r, s) of class $C^{\infty}$ with values in E) with a Hermitian inner product by setting $(\varphi, \psi) = \int_X \sum_{\alpha, \beta} h_{j\alpha\bar{\beta}}\,\varphi_j^{\alpha} \wedge *\overline{\psi_j^{\beta}}$ for $\varphi, \psi \in L_{r,s}(E, X)$ (where the $\varphi_j^{\alpha}$ ($\alpha = 1, \ldots, q$) are the components of $\varphi_j$). If we denote by $\mathfrak{d}$ the adjoint operator of d'' with respect to this inner product and let $\Box = d''\mathfrak{d} + \mathfrak{d}d''$, then $\Box$ is a self-adjoint elliptic differential operator, and results similar to those for $\Delta$ mentioned above hold for $\Box$. For example, the space $\mathfrak{H}_{r,s}(E, X)$ of harmonic forms of type (r, s) is of finite dimension, and there is a continuous linear operator G on $\mathfrak{L}_{r,s}(E, X)$, the completion of $L_{r,s}(E, X)$, that satisfies $1 = H + \Box G$, $HG = GH = 0$, $d''G = Gd''$, and $\mathfrak{d}G = G\mathfrak{d}$. Here H
denotes the projection $\mathfrak{L}_{r,s}(E, X) \to \mathfrak{H}_{r,s}(E, X)$, which is an integral operator with kernel of class $C^\infty$. Also, $G$ maps $L_{r,s}(E, X)$ into itself. Therefore $H$ is homotopic to the identity on the cochain complex $(\sum_s L_{r,s}(E, X), d'')$, and any element of †Dolbeault's cohomology groups ($d''$-cohomology groups) is represented by a unique harmonic form (→ 232 Kähler Manifolds).

F. Other Generalizations

Even if a manifold $X$ is not of class $C^\infty$, if $X$ is a manifold of class $C^1_1$, we can develop the theory of harmonic forms [6]. We say that $X$ is of class $C^1_1$ if it is of class $C^1$ and has a set of local coordinate systems whose transition functions have derivatives satisfying the †Lipschitz condition.

If $X$ is a real analytic manifold with a real analytic Riemannian metric, then harmonic forms are also real analytic. Using this fact, we can embed real analytically a compact manifold with a real analytic Riemannian metric into a Euclidean space (P. Bidal and G. de Rham; this result is now included in the theorems of C. B. Morrey and H. Grauert).

We can consider the theory of harmonic forms with singularities [4, 5], a generalization of the theory of differential forms of the second and third kinds. Here the notion of †current is very useful.

G. Cohomology Vanishing Theorems

Since the operator $\Delta$ is closely related to the Riemannian metric, some metrics may admit no harmonic forms of certain degrees except zero. This is important since it means that the corresponding cohomology group of the manifold vanishes. The condition for this phenomenon to occur can be described in terms of the curvature of the metric. This study has its origin in S. Bochner's results [2].

Here is an example of a vanishing theorem: Let $B$ be a holomorphic line bundle on a compact complex manifold $X$ of dimension $n$. If the †Chern class of $B$ is expressed by a real closed differential form of type $(1,1)$ as $\omega = \sqrt{-1}\sum a_{\alpha\bar\beta}\, dz^\alpha \wedge d\bar{z}^\beta$, where the Hermitian matrix $(a_{\alpha\bar\beta})$ is positive definite at every point of $X$, then $H^q(X, \Omega^p(B)) = 0$ for $p + q > n$. In this case, $ds^2 = 2\sum a_{\alpha\bar\beta}\, dz^\alpha\, d\bar{z}^\beta$ is a †Hodge metric on $X$ (→ 232 Kähler Manifolds D).

References

[1] W. L. Baily, The decomposition theorem for V-manifolds, Amer. J. Math., 78 (1956), 862-888.
[2] S. Bochner and K. Yano, Curvature and Betti numbers, Ann. Math. Studies 22, Princeton Univ. Press, 1953.
[3] S. I. Goldberg, Curvature and homology, Academic Press, 1962.
[4] W. V. D. Hodge, The theory and applications of harmonic integrals, Cambridge Univ. Press, second edition, 1952.
[5] K. Kodaira, Harmonic fields in Riemannian manifolds (generalized potential theory), Ann. Math., (2) 50 (1949), 587-665.
[6] C. B. Morrey and J. Eells, A variational method in the theory of harmonic integrals I, Ann. Math., (2) 63 (1956), 91-128.
[7] G. de Rham, Variétés différentiables, Actualités Sci. Ind., Hermann, second edition, 1960.
[8] G. de Rham and K. Kodaira, Harmonic integrals, Lecture notes, Inst. for Advanced Study, Princeton, 1950.

195 (VII.15)
Harmonic Mappings

A. General Remarks

The theory of harmonic mappings between Riemannian manifolds has its origin in the study of †Plateau's problem. The basic problem in the theory is to deform a given mapping into a harmonic one, which is a problem of the †calculus of variations and †global analysis (→ 46 Calculus of Variations, 183 Global Analysis). Recently, the theory of harmonic mappings has been applied to problems in various branches of geometry [7-9, 11].

B. Definitions and Examples

Let $(M, g)$ and $(N, h)$ be †Riemannian manifolds with metrics $g = \sum g_{ij}\,dx^i dx^j$ and $h = \sum h_{\alpha\beta}\,dy^\alpha dy^\beta$, respectively. We define the energy of a $C^1$-mapping $f: M \to N$ by

$$E(f) = \frac{1}{2}\int_M |df(x)|^2\,dx,$$

where $|df(x)|$ is the †Hilbert-Schmidt norm of the differential $df_x: T_x(M) \to T_{f(x)}(N)$ of $f$ at $x \in M$ and $dx$ is the canonical Lebesgue measure defined by $g$ on $M$ (assumed compact). Thus $E(f)$ can be considered to be a generalization of the classical †Dirichlet integral for functions. The integrand $e(f)(x) = |df(x)|^2$ is called the energy density of $f$; it measures the sum of the squares of elements of length stretched on a complete set of mutually perpendicular directions.
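As a rough numerical illustration of the energy integral (a sketch under stated assumptions, not part of the theory above): for a map $f$ from the circle $S^1$ with its standard metric into a Euclidean space, the energy reduces to $E(f) = \frac{1}{2}\int_0^{2\pi} |f'(\theta)|^2\,d\theta$, which can be approximated by finite differences. The function names `circle_map_energy` and `winding` are hypothetical, and the $k$-fold winding map $\theta \mapsto (\cos k\theta, \sin k\theta)$ is chosen only because its energy, $\pi k^2$, is known in closed form.

```python
import math

def circle_map_energy(f, n=1000):
    """Approximate E(f) = (1/2) * integral_0^{2 pi} |f'(theta)|^2 d(theta)
    for a map f from the unit circle into Euclidean space, using forward
    differences on n equal arcs."""
    dtheta = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        p = f(i * dtheta)
        q = f((i + 1) * dtheta)
        # |f(theta + dtheta) - f(theta)|^2 / dtheta approximates |f'|^2 * dtheta
        total += sum((b - a) ** 2 for a, b in zip(p, q)) / dtheta
    return 0.5 * total

def winding(k):
    """Hypothetical test map: wraps the circle k times around the unit circle in R^2."""
    return lambda theta: (math.cos(k * theta), math.sin(k * theta))

if __name__ == "__main__":
    for k in (1, 2, 3):
        # Compare the numerical value with the closed form pi * k^2.
        print(k, circle_map_energy(winding(k)), math.pi * k * k)
```

The quadratic growth in $k$ reflects the fact that the energy density is the squared stretching factor of the map, here $k^2$ at every point.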
The †Euler-Lagrange differential equations of the energy functional $E(f)$ yield a vector field $\tau(f)$ along $f$, i.e., a section of the bundle $f^*T(N)$ induced from the tangent bundle $T(N)$ of $N$ by $f$. In fact, given a family $f_t$ of mappings depending differentiably on $t$ with $f_0 = f$, we have

$$\frac{d}{dt}E(f_t)\Big|_{t=0} = -\int_M \Big\langle \tau(f), \frac{\partial f_t}{\partial t}\Big|_{t=0} \Big\rangle\,dx,$$

where $\langle\ ,\ \rangle$ denotes the inner product of tangent vectors along $f$. The vector field $\tau(f)$ is called the tension field of the mapping $f$; it indicates the direction in which the energy of $f$ decreases most rapidly.

The Euler-Lagrange differential equations $\tau(f) = 0$ are a system of †quasilinear elliptic partial differential equations of the second order. In local coordinates, these can be written in the form

$$\Delta f^\gamma + \sum g^{ij}\,\Gamma'^{\gamma}_{\alpha\beta}(f)\,\frac{\partial f^\alpha}{\partial x^i}\frac{\partial f^\beta}{\partial x^j} = 0,$$

where $\Delta$ is the †Laplace-Beltrami operator on $M$ and the $\Gamma'^{\gamma}_{\alpha\beta}(f)(x)$ are the †Christoffel symbols on $N$ at $f(x)$. (The $f^\alpha$ are local coordinates of the point $f(x)$, and $(g^{ij})$ is the inverse matrix of $(g_{ij})$.) A $C^2$-mapping $f: M \to N$ is said to be harmonic if its tension field $\tau(f)$ vanishes. Thus, if $M$ is compact, a $C^2$-mapping $f: M \to N$ is harmonic if and only if it is an extremal of the energy functional $E(f)$.

Examples of harmonic mappings appear in various contexts of differential geometry. For instance:

(1) If $N = \mathbf{R}$, then the harmonic mappings $M \to \mathbf{R}$ are the †harmonic functions on $M$.

(2) If $M$ is the circle $S^1$, then a harmonic mapping $S^1 \to N$ is a closed †geodesic of $N$ parametrized by arc length.

(3) Let $f: M \to N$ be an †isometric immersion of $M$ into $N$. Then $f$ is harmonic if and only if it is a †minimal immersion.

(4) If $M$ and $N$ are †Kähler manifolds, then every †holomorphic or antiholomorphic mapping $M \to N$ is harmonic, where by an antiholomorphic mapping is meant a mapping whose differential mapping carries a differential form of type $(1,0)$ into that of type $(0,1)$. We note that each (anti-)holomorphic mapping is an absolute minimum for the energy in its homotopy class. There are also examples of nonholomorphic (and nonantiholomorphic) harmonic mappings between Kähler manifolds (→ Section D).

C. Fundamental Properties

(1) Regularity. Since a harmonic mapping is a solution of a second-order quasilinear elliptic system of partial differential equations $\tau(f) = 0$ (→ Section B), it is a smooth (i.e., of class $C^\infty$) mapping. More generally, it is known that a continuous mapping that satisfies $\tau(f) = 0$ in a weak sense is smooth [4, 6].

(2) Unique continuation property. The following unique continuation theorem is valid for harmonic mappings: If two harmonic mappings of $M$ into $N$ agree up to infinitely high order at some point of $M$, then they are identical ($M$ being assumed connected). In particular, a harmonic mapping that is constant on an open set is a constant mapping.

The global natures of harmonic mappings are closely related to the curvatures of the manifolds under consideration. For instance, suppose that $M$ and $N$ are compact and that the sectional curvatures of $N$ are nonpositive everywhere. Then we have:

(3) Uniqueness. Let $f: M \to N$ be a harmonic mapping, and assume that there is a point of $f(M)$ where the sectional curvatures of $N$ are negative. Then $f$ is unique in its homotopy class unless $f(M)$ is a closed geodesic $\gamma$ of $N$; and in this case we have uniqueness up to rotation of $\gamma$, i.e., an isometry of $\gamma$ which moves each point of $\gamma$ a fixed oriented distance along $\gamma$.

(4) Degeneracy. Suppose further that the †Ricci tensor $(R_{ij})$ of $M$ is positive semidefinite everywhere. Then the energy density $e(f)$ is a †subharmonic function for every harmonic mapping. This implies that any harmonic mapping $f: M \to N$ is †totally geodesic. Moreover, if $N$ is of negative sectional curvature, then $f$ is either constant or maps $M$ onto a closed geodesic of $N$; if $(R_{ij})$ is positive definite at some point, then $f$ is constant.

(5) Finiteness. Assume now that $N$ is of negative sectional curvature. Then, for each $K \ge 1$, there are only finitely many nonconstant harmonic mappings $f: M \to N$ of dilatation bounded by $K$. Here, we say that the dilatation of $f$ is bounded by $K$ if and only if at each point of $M$ we have $df = 0$ or $(\lambda_1/\lambda_2)^{1/2} \le K$, $\lambda_1 \ge \lambda_2 \ge \dots > 0$ being the positive eigenvalues of the pullback quadratic form $f^*h(x)$ on $T_x(M)$ induced from the metric $h$ of $N$ by $f$.

D. Harmonic Mappings of a Surface

Let $M$ be a compact surface. Then the energy of a mapping $M \to N$ is the †Dirichlet-Douglas functional, and harmonic mappings are closely connected with solutions of Plateau's problem (→ 334 Plateau's Problem). In fact, if a †conformal mapping $M \to N$ minimizes the †area functional, then it also minimizes the energy functional.

Now let $M$ and $N$ be compact orientable
surfaces whose genera are denoted by $p$ and $q$, respectively. Then the problem of existence (or nonexistence) of harmonic mappings is well understood. In fact:

(1) When $q \ne 0$, for any metrics $g$ and $h$ on $M$ and $N$, every homotopy class of mappings $M \to N$ contains a harmonic mapping.

(2) When $q = 0$ (i.e., $N$ is the 2-sphere $S^2$), every harmonic mapping whose †degree $d$ satisfies $|d| \ge p$ is holomorphic or antiholomorphic with respect to the complex structures associated with $g$ and $h$. For example, consider the homotopy classes of mappings from the 2-torus $T^2$ to $S^2$ with any metrics. Then all classes with degree $|d| \ge 2$ have harmonic representatives, and any such is holomorphic or antiholomorphic; and the classes with $d = \pm 1$ have no harmonic representatives.

(3) When $q = 0$ and $|d| \le p - 1$, we have, for every such $p$ and $d$, a surface $M$ of genus $p$ and a metric $h$ on $S^2$ such that there exists a harmonic nonholomorphic (and nonantiholomorphic) mapping of degree $d$ from $M$ to $S^2$.

E. Existence Theorems

The basic problem in the study of harmonic mappings is to prove their existence in general geometric contexts.

(1) In regard to this problem, translating the problem of the elliptic system $\tau(f) = 0$ into the †initial value problem of the corresponding nonlinear †parabolic system $\partial f/\partial t = \tau(f)$, J. Eells and J. H. Sampson [1] proved that if $M$ and $N$ are compact and if $N$ has nonpositive sectional curvature everywhere, then every homotopy class of mappings $M \to N$ contains a harmonic mapping that minimizes the energy in that class. Subsequently, the uniqueness of these harmonic mappings was established by P. Hartman [2] in the form stated in Section C.

(2) For harmonic mappings of surfaces, more general existence results have been known.

First, by the †direct method of the calculus of variations, L. Lemaire (1977) and others proved that if $M$ and $N$ are compact and if $M$ is 2-dimensional, then every conjugacy class of homomorphisms $\pi_1(M) \to \pi_1(N)$ of the fundamental groups is induced by a minimizing harmonic mapping. It follows that if, in particular, the second homotopy group $\pi_2(N)$ of $N$ is zero, then every homotopy class of mappings of a compact surface $M$ to $N$ contains a harmonic representative realizing the minimum of the energy in that class.

On the other hand, by making use of the generalized †Morse theory for a perturbed energy functional, J. Sacks and K. Uhlenbeck [5] succeeded in giving a satisfactory answer to the structure of $\pi_2(N)$, which is a $\pi_1(N)$-module, in terms of harmonic mappings. They proved that there exists a generating set for $\pi_2(N)$ consisting of harmonic mappings of spheres that minimize energy and area in their homotopy classes. We note that these harmonic mappings are minimal immersions with †branch points.

(3) Next, we mention the case of harmonic mappings of manifolds with boundary. In this case, we can naturally formulate the †Dirichlet and the †Neumann boundary value problem for harmonic mappings.

In his study of Plateau's problem on Riemannian manifolds, C. Morrey (1948) discussed the Dirichlet problem for harmonic surfaces with boundary.

The problem in arbitrary dimensions has been studied by R. S. Hamilton [3], who extended the result of Eells and Sampson mentioned above to the case where $M$ and $N$ have boundaries. In fact, let $M$ and $N$ be compact Riemannian manifolds with boundary, and assume that $N$ has nonpositive sectional curvature and that the boundary $\partial N$ of $N$ is †convex (or empty). Then there exists a unique minimizing harmonic mapping in each †relative homotopy class determined by the prescribed Dirichlet boundary value. We note that if $\partial N$ is not convex, then it is easy to formulate Dirichlet problems with no solutions. Hamilton also treated the Neumann problem.

Subsequently, S. Hildebrandt, H. Kaul, and K.-O. Widman [4] gave another existence proof of solutions of the Dirichlet problem that covers the case where $N$ admits positive sectional curvature.

References

[1] J. Eells and J. H. Sampson, Harmonic mappings of Riemannian manifolds, Amer. J. Math., 86 (1964), 109-160.
[2] P. Hartman, On homotopic harmonic maps, Canad. J. Math., 19 (1967), 673-687.
[3] R. S. Hamilton, Harmonic maps of manifolds with boundary, Lecture notes in math. 471, Springer, 1975.
[4] S. Hildebrandt, H. Kaul, and K.-O. Widman, An existence theorem for harmonic mappings of Riemannian manifolds, Acta Math., 138 (1977), 1-16.
[5] J. Sacks and K. Uhlenbeck, The existence of minimal immersions of two-spheres, Ann. Math., (2) 113 (1981), 1-24.
[6] J. Eells and L. Lemaire, A report on harmonic maps, Bull. London Math. Soc., 10 (1978), 1-68.
[7] Y.-T. Siu and S.-T. Yau, Compact Kähler manifolds of positive bisectional curvature, Inventiones Math., 59 (1980), 189-204.
[8] Y.-T. Siu, The complex-analyticity of harmonic maps and the strong rigidity of compact Kähler manifolds, Ann. Math., (2) 112 (1980), 73-111.
[9] S. Nishikawa and K. Shiga, On the holomorphic equivalence of bounded domains in complete Kähler manifolds of nonpositive curvature, J. Math. Soc. Japan, 35 (1983), 273-278.
[10] T. Ishihara, The index of a holomorphic mapping and the index theorem, Proc. Amer. Math. Soc., 66 (1977), 169-174.
[11] T. Sunada, Rigidity of certain harmonic mappings, Inventiones Math., 51 (1979), 297-307.

196 (XXI.28)
Hilbert, David

David Hilbert (January 23, 1862-February 14, 1943) was born in Königsberg, Germany. He attended the University of Königsberg from 1882 to 1885, when he received his doctoral degree with a thesis on the theory of invariants. It was there that he established a lifelong friendship with H. Minkowski. In 1892 he became a professor at the University, and in 1895 he was appointed to a professorship at the University of Göttingen, a position he held until his death. He obtained his basic theorem on invariants between 1890 and 1893, and next began research on the foundations of geometry (→ 155 Foundations of Geometry) and the theory of †algebraic number fields. Concerning the former, he published Grundlagen der Geometrie (first edition, 1899), in which he gave the complete axioms of Euclidean geometry and a logical examination of them. Concerning the latter, he systematized all the important known results of algebraic number theory in his monumental Zahlbericht (1897). In number theory, he enunciated his significant conjecture on †class field theory. At the international congress of mathematicians held in Paris in 1900, he put forth 23 problems as targets for mathematics of the 20th century (Table 1). Between 1904 and 1906 he conducted research on the †Dirichlet principle of †potential theory and on the direct method in the †calculus of variations. Around 1909 he established the foundations of the theory of †Hilbert spaces. After 1910 he was chiefly involved in research on the †foundations of mathematics, and he advocated the standpoint of †formalism. He is one of the greatest mathematicians of the first half of the 20th century.

Table 1. The 23 Problems of Hilbert

(1) To prove the continuum hypothesis (→ 33 Axiomatic Set Theory D).

(2) To investigate the consistency of the axioms of arithmetic (→ 156 Foundations of Mathematics E).

(3) To show that it is impossible to prove the following fact utilizing only congruence axioms: Two tetrahedra having the same altitude and base area have the same volume. Solved by M. Dehn (1900).

(4) To investigate geometries in which the line segment between any pair of points gives the shortest path between the pair (→ 155 Foundations of Geometry).

(5) To obtain the conditions under which a topological group has the structure of a Lie group (→ 423 Topological Groups M). Solved by A. M. Gleason and D. Montgomery and L. Zippin (1952) and H. Yamabe (1953).

(6) To axiomatize those physical sciences in which mathematics plays an important role.

(7) To establish the transcendence of certain numbers (→ 430 Transcendental Numbers B). The transcendence of $2^{\sqrt{2}}$, which was one of the numbers put forth by Hilbert, was shown by A. Gel'fond (1934) and T. Schneider (1935).

(8) To investigate problems concerning the distribution of prime numbers; in particular, to show the correctness of the Riemann hypothesis (→ 450 Zeta Functions). Unsolved.

(9) To establish a general law of reciprocity (→ 59 Class Field Theory A). Solved by T. Takagi (1921) and E. Artin (1927).

(10) To establish effective methods to determine the solvability of Diophantine equations (→ 97 Decision Problem; 182 Geometry of Numbers). Solved affirmatively for equations of two unknowns by A. Baker, Philos. Trans. Roy. Soc. London, (A) 263 (1968); solved negatively for the general case by Yu. V. Matiyasevich (1970).

(11) To investigate the theory of quadratic forms over an arbitrary algebraic number field of finite degree (→ 348 Quadratic Forms).

(12) To construct class fields of algebraic number fields (→ 73 Complex Multiplication).

(13) To show the impossibility of the solution of the general algebraic equation of the seventh degree by compositions of continuous functions of two variables. Solved negatively. In general, V. I. Arnold proved that every real, continuous function $f(x_1, x_2, x_3)$ on $[0, 1]$ can be represented in the form $\sum_{i=1}^{9} h_i(g_i(x_1, x_2), x_3)$, where $h_i$ and $g_i$ are real, continuous functions, and A. N. Kolmogorov proved that $f(x_1, x_2, x_3)$ can be represented
in the form $\sum_{i=1}^{7} h_i(g_{i1}(x_1) + g_{i2}(x_2) + g_{i3}(x_3))$, where $h_i$ and $g_{ij}$ are real, continuous functions and $g_{ij}$ can be chosen once for all independently of $f$ (Dokl. Akad. Nauk SSSR, 114 (1957); Amer. Math. Soc. Transl., 28 (1963)).

(14) Let $k$ be a field, $x_1, \dots, x_n$ be variables, and $f_i(x_1, \dots, x_n)$ be given polynomials in $k[x_1, \dots, x_n]$ ($i = 1, \dots, m$). Furthermore, let $R$ be the ring formed by rational functions $F(X_1, \dots, X_m)$ in $k(X_1, \dots, X_m)$ such that $F(f_1, \dots, f_m) \in k[x_1, \dots, x_n]$. The problem is to determine whether the ring $R$ has a finite set of generators. Solved negatively by M. Nagata, Amer. J. Math., 81 (1959).

(15) To establish the foundations of algebraic geometry (→ 12 Algebraic Geometry). Solved by B. L. van der Waerden (1938-1940), A. Weil (1950), and others.

(16) To conduct topological studies of algebraic curves and surfaces.

(17) Let $f(x_1, \dots, x_n)$ be a rational function with real coefficients that takes a positive value for any real $n$-tuple $(x_1, \dots, x_n)$. The problem is to determine whether the function $f$ can be written as the sum of squares of rational functions (→ 149 Fields O). Solved in the affirmative by E. Artin (1927).

(18) To express Euclidean $n$-space as a disjoint union $\bigcup_\lambda P_\lambda$, where each $P_\lambda$ is congruent to one of a set of given polyhedra.

(19) To determine whether the solutions of regular problems in the calculus of variations are necessarily analytic (→ 323 Partial Differential Equations of Elliptic Type). Solved by S. N. Bernshteĭn, I. G. Petrovskiĭ, and others.

(20) To investigate the general boundary value problem (→ 120 Dirichlet Problem; 323 Partial Differential Equations of Elliptic Type).

(21) To show that there always exists a linear differential equation of the Fuchsian class with given singular points and monodromic group (→ 253 Linear Ordinary Differential Equations (Global Theory)). Solved by H. Röhrl and others (1957).

(22) To uniformize complex analytic functions by means of automorphic functions (→ 367 Riemann Surfaces). Solved for the case of one variable by P. Koebe (1907).

(23) To develop the methodology of the calculus of variations (→ 46 Calculus of Variations).

References

[1] D. Hilbert, Gesammelte Abhandlungen I-III, Springer, 1932-1935 (Chelsea, 1967).
[2] D. Hilbert, Grundlagen der Geometrie, Teubner, seventh edition, 1930.
[3] D. Hilbert, Grundzüge einer allgemeinen Theorie der linearen Integralgleichungen, Teubner, second edition, 1924 (Chelsea, 1953).
[4] D. Hilbert and W. Ackermann, Grundzüge der theoretischen Logik, Springer, third edition, 1949; English translation, Principles of mathematical logic, Chelsea, 1950.
[5] D. Hilbert and P. Bernays, Grundlagen der Mathematik, Springer, second edition, I, 1968; II, 1970.
[6] F. Klein, Vorlesungen über die Entwicklung der Mathematik im 19. Jahrhundert I, Springer, 1926 (Chelsea, 1956).
[7] H. Weyl, David Hilbert and his mathematical work, Bull. Amer. Math. Soc., 50 (1944), 612-654.
[8] C. Reid, Hilbert. With an appreciation of Hilbert's mathematical work by H. Weyl, Springer, 1970.

197 (XII.2)
Hilbert Spaces

A. General Remarks

The theory of Hilbert spaces arose from problems in the theory of †integral equations. D. Hilbert noticed that a linear integral equation can be transformed into an infinite system of linear equations for the †Fourier coefficients of the unknown function. He considered the linear space $l_2$ consisting of all sequences of numbers $\{x_n\}$ for which $\sum_{n=1}^{\infty} |x_n|^2$ is finite, and defined for each pair of elements $x = \{x_n\}$, $y = \{y_n\} \in l_2$ their inner product as $(x, y) = \sum_{n=1}^{\infty} x_n \bar{y}_n$. The space $l_2$ can be regarded as an infinite-dimensional extension of the notion of a Euclidean space. F. Riesz considered the space of functions now termed $L_2$-space and succeeded in giving a satisfactory answer to the Fourier expansion problem. In his book [3], J. von Neumann established a rigorous foundation of quantum mechanics employing Hilbert spaces and the spectral expansion of self-adjoint operators. The following axiomatic definition (→ Section B) of Hilbert spaces is due to von Neumann. H. Weyl later justified the †Dirichlet principle of Riemann by the method of orthogonal projection in a Hilbert space, and thus paved the way for the function-analytic study of differential equations.

B. Definition

Let $K$ be the field of complex or real numbers, the elements of which we denote by $\alpha, \beta, \dots$.
Let $H$ be a †linear space over $K$, and to any pair of elements $x, y \in H$ let there correspond a number $(x, y) \in K$ satisfying the following five conditions: (i) $(x_1 + x_2, y) = (x_1, y) + (x_2, y)$; (ii) $(\alpha x, y) = \alpha(x, y)$; (iii) $(x, y) = \overline{(y, x)}$; (iv) $(x, x) \ge 0$; and (v) $(x, x) = 0 \Leftrightarrow x = 0$. Then we call $H$ a pre-Hilbert space and $(x, y)$ the inner product of $x$ and $y$.

With the norm $\|x\| = \sqrt{(x, x)}$, $H$ is a †normed linear space. If $H$ is †complete with respect to the distance $\|x - y\|$ (i.e., $\|x_n - x_m\| \to 0$ ($m, n \to \infty$) implies the existence of $\lim x_n = x$), then we call $H$ a Hilbert space. According as $K$ is complex or real, we call $H$ a complex or real Hilbert space. A Hilbert space is a †Banach space.

A normed linear space with norm $\|x\|$ can be made a pre-Hilbert space, by defining an inner product $(x, y)$ so that $\|x\| = \sqrt{(x, x)}$, if and only if the equality $\|x + y\|^2 + \|x - y\|^2 = 2(\|x\|^2 + \|y\|^2)$ holds for any $x, y$.

C. Orthonormal Sets

Two elements $x, y \in H$ are said to be mutually orthogonal if $(x, y) = 0$. A subset $\Sigma$ of $H$ is called an orthogonal set (or system) if $0 \notin \Sigma$ and every distinct pair $x, y \in \Sigma$ is mutually orthogonal. If every element of an orthogonal set $\Sigma$ is of norm 1, then $\Sigma$ is called an orthonormal set. Any orthogonal set $\Sigma = \{x_i\}$ can be normalized into an orthonormal set $\{x_i/\|x_i\|\}$. A maximal orthonormal set is called a complete orthonormal set or an orthonormal basis. All the complete orthonormal sets of a given $H$ have the same cardinal number, which we call the dimension of $H$. Two Hilbert spaces are isomorphic if and only if they have the same dimension.

Let $\Sigma = \{x_i\}$ be an orthonormal set. Then for every $x \in H$, its Fourier coefficients $(x, x_i)$ vanish for all but a countable number of $i$, and the Bessel inequality $\|x\|^2 \ge \sum_i |(x, x_i)|^2$ holds. The following three statements are equivalent in a Hilbert space: (i) $\Sigma$ is complete; (ii) Parseval's equality $\|x\|^2 = \sum_i |(x, x_i)|^2$ holds for every $x$; (iii) every $x$ can be expanded in a Fourier series $x = \sum_i (x, x_i)x_i$ (→ 317 Orthogonal Functions).

D. Examples of Hilbert Spaces

The space $l_2$ (→ Section A) is a Hilbert space of dimension $\aleph_0$. The †function space $L_2$ on a measure space $(X, \mu)$ is a Hilbert space if the inner product of $f, g \in L_2$ is defined by $(f, g) = \int_X f\bar{g}\,d\mu$. In the case of the †Lebesgue measure in a Euclidean space, $L_2$ is of dimension $\aleph_0$, so that it is a Hilbert space isomorphic to $l_2$. Further examples of Hilbert spaces are $A_2(\Omega)$, $W_2^l(\Omega)$ ($= H^l(\Omega)$), and $H_0^l(\Omega)$ (→ 168 Function Spaces).

E. Closed Linear Subspaces and Projections

Let $M$ be a closed linear subspace of a Hilbert space $H$, i.e., a linear subspace that is closed in the norm topology of $H$. It is a Hilbert space with respect to the restriction of the inner product in $H$. For a given $M$ the set of all $x \in H$ such that $(x, y) = 0$ for every $y \in M$ forms a closed linear subspace $M^\perp$ called the orthogonal complement of $M$. The orthogonal complement of $M^\perp$ is $M$ (i.e., $M^{\perp\perp} = M$), and $H$ is the direct sum of $M$ and $M^\perp$ (i.e., every $x \in H$ can be uniquely represented as $x = y + z$, $y \in M$, $z \in M^\perp$, and $\|x\|^2 = \|y\|^2 + \|z\|^2$). Thus the quotient space $H/M$ is isomorphic to $M^\perp$ and is also a Hilbert space. The operator $P_M$ that maps $x$ to $y$ is called the projection (or orthogonal projection or projection operator) to $M$. A bounded linear operator $P$ is a projection if and only if it is idempotent ($P^2 = P$) and self-adjoint ($(Px, y) = (x, Py)$ for any $x, y \in H$) (→ 251 Linear Operators).

F. Conjugate Spaces

A linear operator from $H$ to $K$ is called a linear functional. The set $H'$ of all continuous linear functionals $f$ on $H$ forms a Hilbert space with norm $\|f\| = \sup\{|f(x)| : \|x\| = 1\}$. For every $f \in H'$ there exists a unique $y \in H$ such that $f(x) = (x, y)$ for all $x \in H$ (Riesz's theorem), and the correspondence $f \to y$ gives an †antilinear isometric operator from $H'$ onto $H$ (for †linear operators on Hilbert spaces → 68 Compact and Nuclear Operators; 251 Linear Operators; 390 Spectral Analysis of Operators).

References

[1] D. Hilbert, Grundzüge einer allgemeinen Theorie der linearen Integralgleichungen, Teubner, second edition, 1924 (Chelsea, 1953).
[2] E. Hellinger and O. Toeplitz, Integralgleichungen und Gleichungen mit unendlichvielen Unbekannten, Enzykl. Math., Teubner, 1928 (Chelsea, 1953).
[3] J. von Neumann, Mathematische Grundlagen der Quantenmechanik, Springer, 1932.
[4] J. von Neumann, Collected works II, III, Pergamon, 1961.
[5] M. H. Stone, Linear transformations in Hilbert space and their applications to analysis, Amer. Math. Soc. Colloq. Publ., 1932.
[6] B. Sz.-Nagy, Spektraldarstellung linearer The set of a11 functions holomorphic in a


Transformationen des Hilbertschen Raumes, tdomain D forms a +ring.
Erg. Math., Springer, 1942. Suppose ,f(z) is holomorphic on D and
[7] F. Riesz and B. Sz.-Nagy, Leçons d’analyse S’(z,,) # 0, z0 E D. Then two curves that form
fonctionelle, Akademiai Kiado, third edition, an angle at z,, are mapped by f to two curves
1955; English translation, Functional analysis, forming the same angle at f(zc). Because of
Ungar, 1955. this property, the mapping .f’ is said to be
[8] N. 1. Akhiezer and 1. M. Glazman, Theory conforma1 at a11 points z with f’(z) ~0.
of linear operators in Hilbert space, Ungar, 1, The following four conditions are equivalent
1961; II, 1963. (Original in Russian, 1950.) for a function ,f = u + iv defïned on an open
[93 K. Yosida, Functional analysis, Springer, set D. (1) f is holomorphic in D. (2) u = u(x, y)
1965. and u = V(X, y) are ttotally differentiable at
[ 101 R. Courant and D. Hilbert, Methods of each point z = x + iy and satisfy the Cauchy-
mathematical physics 1, Interscience, 1953. Riemann differential equations
[ll] P. R. Halmos, Introduction to Hilbert
space and the theory of spectral multiplicity,
au/ax = ovpy, aulay= -dvpx.
Chelsea, second edition, 1957. (3) fis represented by a tpower series CEo c,,(z
[ 121 P. R. Halmos, A Hilbert space problem -a)” in a neighborhood of each point a of D;
book, Van Nostrand, 1967. that is, f(z) is analytic in D. (4) ,f is continuous
and J,--(z) dz = 0 for every rectifiable Jordan
closed curve C whose interior is contained,
together with C, in D. The proposition that (1)
198 (X1.1) implies (4) is called Cauchy’s integral theorem,
Holomorphic Functions and the proposition that (4) implies (1) is called
Morera’s theorem.
A. Differentiation of Complex Functions The hypothesis of Morera’s theorem cari be
weakened as follows: Let f(z) be continuous in
Let f(z) be a tcomplex-valued function defïned a domain D. If Icf(z) dz = 0 for every rectangle
in an open set D in the +complex plane C. We C in D with sides parallel to the axes and
say that f(z) is differentiable at z if the limit whose interior consists of only points of D, then
f(z) is holomorphic in D. In the statement of
!y (f(z + 4 -f(z)M =.fW (1) this theorem, if we let C be an arbitrary circle,
we get the same conclusion.
exists and is fïnite as the complex number h The following complex differential operators
tends to zero. We call f'(z) the derivative of f(z) at z. This definition is a formal extension of the definition of differentiability of a function of a real variable to that of a complex variable (→ 106 Differential Calculus), but it is a much stronger condition than the differentiability of a real function, since z + h in (1) may be an arbitrary point in a 2-dimensional neighborhood of z. Hence many results essentially different from those for functions of a real variable follow from it.

If a function f(z) is differentiable at each point of an open set D, it is said to be holomorphic (or regular) in D, or f(z) is a holomorphic function on D. (For the definition of holomorphy of a complex-valued function of several complex variables → 21 Analytic Functions of Several Complex Variables C.) Let E be an arbitrary nonempty subset of C. We say that f(z) is holomorphic on E if it is defined in an open set D containing E and is holomorphic on D. Some results valid for differentiable real functions also hold for holomorphic functions. For instance, the derivative of a sum, product, or quotient is given by the usual rules. The derivative of a composite function is determined by the chain rule.

are often useful:

$$\frac{\partial}{\partial z} = \frac{1}{2}\left(\frac{\partial}{\partial x} - i\frac{\partial}{\partial y}\right), \qquad \frac{\partial}{\partial \bar z} = \frac{1}{2}\left(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\right).$$

Generally, $\overline{(\partial\varphi/\partial z)} = \partial\bar\varphi/\partial\bar z$. The Cauchy-Riemann equations above can be expressed in a single equation: $\partial f/\partial\bar z = 0$. If f is holomorphic, $\partial f/\partial z = f'(z)$.

In order to show that f = u + iv is holomorphic in D, assumption (2) can be weakened. Actually, we have the Looman-Men'shov theorem: Suppose that u and v are continuous in D, $\partial u/\partial x$, $\partial u/\partial y$, $\partial v/\partial x$, and $\partial v/\partial y$ exist at every point of D except for at most a countable number of points, and the Cauchy-Riemann equations hold in D except for a set of 2-dimensional measure zero; then f = u + iv is holomorphic in D. D. E. Men'shov extended this theorem and obtained various conditions for holomorphy. For example, he proved the following theorem: If f is a topological mapping of D and f is conformal in D (i.e., $\lim_{h\to 0} \arg\{(f(z+h) - f(z))/h\}$ exists) except for at most a countable number of points, then f is holomorphic in D.
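The single-equation form of the Cauchy-Riemann equations can be checked numerically. The following is a minimal sketch, assuming the standard operators $\partial/\partial z = \tfrac12(\partial/\partial x - i\,\partial/\partial y)$ and $\partial/\partial\bar z = \tfrac12(\partial/\partial x + i\,\partial/\partial y)$; the helper name `wirtinger`, the step size, and the sample point are illustrative choices, not part of the article. It estimates both derivatives by central differences for the holomorphic function exp(z) and for the non-holomorphic function $\bar z$.

```python
import cmath

def wirtinger(f, z, h=1e-6):
    """Central-difference estimates of (df/dz, df/dzbar) at the point z."""
    fx = (f(z + h) - f(z - h)) / (2 * h)            # partial derivative in x
    fy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)  # partial derivative in y
    d_dz = 0.5 * (fx - 1j * fy)
    d_dzbar = 0.5 * (fx + 1j * fy)
    return d_dz, d_dzbar

z0 = 0.3 + 0.7j  # arbitrary sample point

# exp is holomorphic: df/dzbar should vanish and df/dz should equal exp(z0).
dz_exp, dzbar_exp = wirtinger(cmath.exp, z0)

# Complex conjugation is not holomorphic: df/dzbar is 1 and df/dz is 0.
dz_conj, dzbar_conj = wirtinger(lambda z: z.conjugate(), z0)

print(abs(dzbar_exp), abs(dz_exp - cmath.exp(z0)))
print(abs(dzbar_conj - 1), abs(dz_conj))
```

All four printed quantities should be small (limited only by finite-difference error), which illustrates that $\partial f/\partial\bar z = 0$ separates the holomorphic example from the non-holomorphic one.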
198 B 740
Holomorphic Functions

As another type of sufficient condition for holomorphy, we have the proposition: If f is locally Lebesgue integrable in D and satisfies the Cauchy-Riemann equation $f_{\bar z} = 0$ in the sense of distributions, then there exists a holomorphic function g in D such that g = f almost everywhere.

B. Cauchy's Integral Theorem

Cauchy's integral theorem can be stated as follows: If f(z) is a holomorphic function in a simply connected domain D in the complex plane, the equality $\int_C f(z)\,dz = 0$ holds for every (rectifiable) closed curve C in D. In particular, the integral $\int_\alpha^\beta f(z)\,dz$ ($\alpha, \beta \in D$) is uniquely determined by $\alpha$ and $\beta$ provided that its path of integration lies in D. The function $F(z) = \int_\alpha^z f(\zeta)\,d\zeta$ ($z \in D$) is called the indefinite integral of f. F(z) is holomorphic in D and F'(z) = f(z). In the proof of his integral theorem, Cauchy assumed the existence and continuity of the derivative f'(z) in D. However, E. Goursat proved the theorem utilizing only the existence of f'(z). Actually, by virtue of the integral formula (2) in this section, the existence of f'(z) in D implies the continuity of f'(z). This is sometimes called Goursat's theorem.

Let $C, C_1, C_2, \ldots, C_n$ be rectifiable Jordan curves. Suppose that $C_1, C_2, \ldots, C_n$ are in the interior of C and that each one lies in the exterior of the others. If f(z) is holomorphic in the region D bounded by these n + 1 curves and continuous on $D \cup C \cup C_1 \cup \cdots \cup C_n = \bar D$, then we have

$$\int_C f(z)\,dz = \sum_{k=1}^{n} \int_{C_k} f(z)\,dz.$$

Here the curvilinear integrals are taken in the positive direction (i.e., we take the direction such that $\int_C (z-a)^{-1}\,dz = \int_{C_k} (z-a)^{-1}\,dz = 2\pi i$ for a point a in the interior of C or $C_k$, respectively). Henceforth, an integral along a closed curve is taken in the positive direction unless otherwise noted. Cauchy's integral theorem under the assumption that f is holomorphic in D and is continuous on $\bar D$ is sometimes called the stronger form of Cauchy's integral theorem.

Under the same assumptions as in the stronger form, we have Cauchy's integral formula for $z \in D$:

$$f(z) = \frac{1}{2\pi i}\int_C \frac{f(\zeta)}{\zeta - z}\,d\zeta - \sum_{k=1}^{n} \frac{1}{2\pi i}\int_{C_k} \frac{f(\zeta)}{\zeta - z}\,d\zeta. \tag{2}$$

This integral formula expresses the value of f(z) at a point z in the domain D in terms of the values of f on the boundary of D. In particular, when n = 0, the integral formula reads as

$$f(z) = \frac{1}{2\pi i}\int_C \frac{f(\zeta)}{\zeta - z}\,d\zeta. \tag{3}$$

Furthermore, if C is a circle |z| = R (i.e., D is the disk |z| < R), we obtain Poisson's integral formula:

$$f(z) = \frac{1}{2\pi}\int_0^{2\pi} f(Re^{i\varphi})\,\frac{R^2 - r^2}{R^2 + r^2 - 2Rr\cos(\theta - \varphi)}\,d\varphi,$$

$$f(z) = \frac{1}{2\pi}\int_0^{2\pi} \operatorname{Re} f(Re^{i\varphi})\,\frac{Re^{i\varphi} + z}{Re^{i\varphi} - z}\,d\varphi + i\operatorname{Im} f(0), \qquad z = re^{i\theta},\ 0 \le r < R. \tag{3'}$$

Formula (3)' is valid for a harmonic function.

Let C be a rectifiable curve and $f(\zeta)$ be a continuous function defined on C; then the integral of Cauchy type

$$F(z) = \frac{1}{2\pi i}\int_C \frac{f(\zeta)}{\zeta - z}\,d\zeta$$

is holomorphic outside C. The nth derivative $F^{(n)}$ of F is given by $(n!/2\pi i)\int_C f(\zeta)/(\zeta - z)^{n+1}\,d\zeta$; moreover, F can be expanded in a Taylor series about every point a outside C:

$$F(z) = \sum_{n=0}^{\infty} a_n (z - a)^n, \qquad a_n = \frac{F^{(n)}(a)}{n!},$$

which converges in $|z - a| < \rho$, $\rho$ being the distance from a to C. In particular, formula (3) implies that a holomorphic function f is infinitely differentiable and is expanded in a Taylor series about every point of D as above.

Let C be a closed curve not passing through a point a. Then the integral $(1/2\pi i)\int_C dz/(z - a)$ is an integer. It is called the winding number of C about a and is denoted by n(C; a). A cycle $\gamma$ (a finite sum of oriented closed curves) in an open set D is said to be homologous to zero in D if $n(\gamma; a) = 0$ for all points a in the complement of D. The general form of Cauchy's integral theorem is stated as follows. If f is holomorphic in D, then $\int_\gamma f(z)\,dz = 0$ for every cycle $\gamma$ which is homologous to zero in D (E. Artin). From this we have the general form of Cauchy's integral formula: if f is holomorphic in a domain D, then

$$n(\gamma; z)\,f(z) = \frac{1}{2\pi i}\int_\gamma \frac{f(\zeta)}{\zeta - z}\,d\zeta, \qquad z \in D - \gamma,$$

for every cycle $\gamma$ which is homologous to zero in D.
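Cauchy's integral theorem, the integral formula for a single closed curve, and the winding number can all be illustrated numerically on a circle. The sketch below is only an illustration under stated assumptions: the contour is the unit circle traversed in the positive direction, the integral is approximated by the trapezoidal rule in the angle, and the function names (`circle_integral`, `cauchy_value`, `winding_number`) and sample points are hypothetical choices rather than anything prescribed by the article.

```python
import cmath
import math

def circle_integral(g, center=0j, radius=1.0, n=2000):
    """Approximate the integral of g over |zeta - center| = radius
    (positive direction) with the trapezoidal rule in the angle."""
    total = 0j
    for k in range(n):
        theta = 2 * math.pi * k / n
        w = cmath.exp(1j * theta)
        zeta = center + radius * w
        dzeta = 1j * radius * w * (2 * math.pi / n)  # d(zeta) = i r e^{i theta} d(theta)
        total += g(zeta) * dzeta
    return total

def cauchy_value(f, z, center=0j, radius=1.0):
    """(1 / 2 pi i) * integral of f(zeta) / (zeta - z) over the circle."""
    return circle_integral(lambda s: f(s) / (s - z), center, radius) / (2j * math.pi)

def winding_number(a, center=0j, radius=1.0):
    """(1 / 2 pi i) * integral of d(zeta) / (zeta - a) over the circle."""
    return circle_integral(lambda s: 1 / (s - a), center, radius) / (2j * math.pi)

z = 0.2 - 0.3j                       # a point inside the unit circle
approx = cauchy_value(cmath.exp, z)  # should be close to exp(z)
inside = winding_number(0.5j)        # point inside the circle: close to 1
outside = winding_number(2.0 + 0j)   # point outside the circle: close to 0
closed = circle_integral(cmath.exp)  # integral of a holomorphic function: close to 0

print(abs(approx - cmath.exp(z)), inside, outside, abs(closed))
```

The trapezoidal rule is used here because the integrands are smooth and periodic in the angle, so a modest number of nodes is usually enough; for contours other than circles a different parametrization would be needed.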

C. Zero Points

Let f(z) be a holomorphic function not identically equal to zero. If f(a) = 0, we call a a zero point of f. Every zero point of f is an isolated point, and there exists a unique positive integer k and a function h holomorphic at a such that

$$f(z) = (z - a)^k h(z), \qquad h(a) \ne 0. \tag{4}$$

We call k the order of the zero point a and a a zero point of the kth order. The equality (4) implies that the Taylor series of f(z) at a begins with the term $c_k (z - a)^k$. Suppose that a is a zero point of $f(z) - \gamma$ of the kth order; then we call a a $\gamma$-point of the kth order.

For a function f(z) defined in a neighborhood of the point at infinity, we set f(1/w) = g(w) ($f(\infty) = g(0)$) and call f holomorphic at $\infty$ if g is holomorphic at 0; f is said to have a zero of order k at $\infty$ if g has a zero of order k at 0. If two functions f and g are holomorphic in D and f(z) = g(z) on a subset E that has an accumulation point in D, then f is identically equal to g in D (theorem of identity or uniqueness theorem) since the zeros of holomorphic functions must be isolated.

D. Isolated Singularities

Let f(z) be holomorphic in an annulus $D = \{z \mid R_1 < |z - a| < R_2,\ 0 \le R_1 < R_2 \le +\infty\}$. Then f(z) is expanded in the Laurent series

$$f(z) = \sum_{n=-\infty}^{\infty} c_n (z - a)^n. \tag{5}$$

This is called the Laurent expansion of f about a. The coefficients $c_n$ are given by $c_n = (1/2\pi i)\int_C f(\zeta)\,d\zeta/(\zeta - a)^{n+1}$ with $C = \{z \mid |z - a| = r\}$, $R_1 < r < R_2$. In particular, if f(z) is holomorphic in $D = \{z \mid 0 < |z - a| < R\}$ (or, if $a = \infty$, in $D = \{z \mid R < |z| < +\infty\}$) but not holomorphic in $D \cup \{a\}$, we call a an isolated singular point (or isolated singularity) of f. By utilizing the local canonical parameter t = z - a (or t = 1/z for $a = \infty$), the Laurent expansion (5) of f is then written as $f(z) = \sum_{n=-\infty}^{-1} c_n t^n + \sum_{n=0}^{\infty} c_n t^n$. The second sum is an ordinary power series, called the holomorphic part of f(z). The first sum is a power series of 1/t with no constant term, called the singular part of f(z) at a or the principal part of the singularity (or of the Laurent expansion at a).

If we have $\lim_{t \to 0} t f(z) = 0$, the Laurent expansion (5) of f(z) lacks its singular part, and the limit of f(z) exists as $t \to 0$ ($z \to a$) and is equal to $c_0$. If we set $f(a) = c_0$, then the function f(z) is holomorphic in $D \cup \{a\}$. In this case, the point a is called a removable singularity. If f(z) is bounded in a neighborhood of a singularity a, then a is removable (Riemann's theorem). Usually, we assume that the removable singularities of a function have already been removed in this way.

When the singular part of f(z) at a exists and consists of a finite number of terms, the point a is called a pole; when it consists of an infinite number of terms, the point is called an essential singularity. If a is a pole, f(z) is represented by the Laurent series $\sum_{n=-k}^{\infty} c_n t^n$ ($c_{-k} \ne 0$) and $f(z) \to \infty$ as $z \to a$. In this case, the index k is called the order of the pole a. Then a relation such as (4) holds, where the index k is replaced by -k. Hence the point a is sometimes called a zero point of the -kth order. If a is an essential singularity, then for an arbitrary number c there exists a sequence $z_n$ converging to a such that $\lim_{n \to \infty} f(z_n) = c$ (the Casorati-Weierstrass theorem or simply Weierstrass's theorem). Related to Weierstrass's theorem, we have Picard's theorem, which gives a detailed description of the behavior of a function around its singularities.

E. Residues

Let a ($\ne \infty$) be an isolated singularity of f(z). Then the coefficient $c_{-1}$ of $(z - a)^{-1}$ in the Laurent expansion (5) of f(z) is called the residue of f(z) at a and is denoted by $\mathrm{Res}[f]_a$, R(a; f), or R(a) if we need not indicate f. We have

$$R(a) = c_{-1} = \frac{1}{2\pi i}\int_{|\zeta - a| = r} f(\zeta)\,d\zeta,$$

where the integral is taken in the positive direction along the circle $|\zeta - a| = r$ with 0 < r < R. If f(z) is holomorphic at z = a, then R(a) = 0. If f(z) has a pole of the first order at a,

$$R(a) = \lim_{z \to a} (z - a) f(z).$$

The residue at the point at infinity is defined to be $-a_{-1}$, where $a_{-1}$ is the coefficient of 1/z of the Laurent expansion of f(z) at $\infty$: $f(z) = \sum_{n=-\infty}^{\infty} a_n z^n$, and we have

$$-a_{-1} = -\frac{1}{2\pi i}\int_{|z| = r} f(z)\,dz \qquad (r > R),$$

the circle |z| = r being taken in the positive direction. Thus the notion of the residue of f(z) is actually related to the differential form f(z) dz and not to f(z) itself.

From the first formula in this section and the formula for $-a_{-1}$, the residue theorem follows (Cauchy, 1825): Let C be a rectifiable Jordan curve in the complex plane. Let $a_1, \ldots, a_n$ be a finite number of points inside C, and let D be a domain containing C and its
interior. If f(z) is a function holomorphic in $D - \{a_1, \ldots, a_n\}$, we have

$$\frac{1}{2\pi i}\int_C f(z)\,dz = \sum_{k=1}^{n} R(a_k; f).$$

Furthermore, if f(z) is holomorphic in the extended complex plane (including the point at infinity) except for a finite number of poles, the sum of all residues is equal to zero.

F. Calculus of Residues

The calculus of residues is a field of calculus based on application of the notion of residues. For example, we have methods for the calculation of definite integrals. Actually, one of the reasons why Cauchy studied the theory of complex functions was that he believed that the theory would provide a unified method of computing definite integrals. For example, if $\varphi(z)$ is a rational function without poles on the real axis and with a zero point at infinity whose order is at least 2, then we have

$$\int_{-\infty}^{\infty} \varphi(x)\,dx = 2\pi i \sum_{\operatorname{Im} a > 0} R(a; \varphi(z)), \tag{6}$$

$$\int_{-\infty}^{\infty} e^{ix}\varphi(x)\,dx = 2\pi i \sum_{\operatorname{Im} a > 0} R(a; e^{iz}\varphi(z)). \tag{7}$$

Here the sums are taken over all the poles in the upper half-plane. Formula (7) is valid also for a rational function $\varphi(z)$ with a simple zero at infinity. If $\varphi(z)$ has simple poles at $a_k$ (k = 1, ..., n) on the real axis, then we take the principal values of the integrals at those poles and add $\pi i R(a_k)$ (k = 1, ..., n) to the terms on the right-hand side of (6) and (7). Sometimes we use the residue theorem to obtain the value of the sum of a series (e.g., the Gaussian sum) by expressing it as an integral.

Let f(z) be a single-valued function that is meromorphic and not identically equal to zero in a domain D, and let $\varphi(z)$ be a function holomorphic in D. Draw a rectifiable Jordan curve C in D such that the interior of C is contained in D and f(z) has neither zeros nor poles on C. Let $\alpha_1, \ldots, \alpha_N$ and $\beta_1, \ldots, \beta_P$ be the zeros and poles inside C, respectively (where each of them is repeated as often as its order). Then we have

$$\frac{1}{2\pi i}\int_C \varphi(z)\,\frac{f'(z)}{f(z)}\,dz = \sum_{j=1}^{N} \varphi(\alpha_j) - \sum_{k=1}^{P} \varphi(\beta_k).$$

If $\varphi(z) = 1$, we get

$$\frac{1}{2\pi}\int_C d\arg f(z) = N - P.$$

This is called the argument principle. Next, let f(z) be a function meromorphic for $|z| < R < +\infty$ with $f(0) \ne 0, \infty$, and set $\varphi(z) = \log z$. Take C as a closed curve consisting of the boundary of an annulus $0 < \rho \le |z| \le r < R$ (where $\rho$ is sufficiently small) and two sides of a suitable crosscut joining a point of $|z| = \rho$ and $|z| = r$. Then we obtain Jensen's formula:

$$\log\left|\frac{\alpha_1 \cdots \alpha_N}{\beta_1 \cdots \beta_P}\right| = \log|f(0)| + (N - P)\log r - \frac{1}{2\pi}\int_0^{2\pi} \log|f(re^{i\psi})|\,d\psi.$$

The argument principle can be utilized to prove Rouché's theorem: Let f(z) and g(z) be functions holomorphic in a domain D that contains a rectifiable Jordan curve C and its interior. Suppose that $f(z) + \lambda g(z)$ never vanishes on C for any $\lambda$ with $0 \le \lambda \le 1$. Then the number of zeros of f(z) in the interior of C is equal to that of f(z) + g(z). If |f(z)| > |g(z)| on C or $\arg f(z) - \arg g(z) \ne (2n + 1)\pi$ (n is an integer), the hypothesis of Rouché's theorem is satisfied. This theorem is useful in proving the existence of a zero of a complex function (for example, a polynomial) and in finding its position.

G. Analytic Continuations

Let f(z) be a holomorphic function in a domain D of the complex plane C and D* be a domain containing D as a proper subset. If there exists a function F(z) holomorphic in D* that coincides with f(z) in D, then F(z) is called an analytic continuation (or analytic prolongation) of f(z) from D to D*. By the theorem of identity an analytic continuation of f(z) is uniquely determined if it exists.

The function $f_1(z)$ defined by the power series $P(z; a) = \sum_{n=0}^{\infty} a_n (z - a)^n$ with the radius of convergence $r_1 > 0$ is holomorphic in the domain $D_1$: $|z - a| < r_1$, and at a point b of $D_1$ it can be expanded into a power series $P(z; b) = \sum_{n=0}^{\infty} b_n (z - b)^n$ with the radius of convergence $r_2$ ($\ge r_1 - |b - a|$). If $r_2 > r_1 - |b - a|$, the domain $D_2$: $|z - b| < r_2$ is not entirely contained in $D_1$. Let $f_2(z)$ be the function defined in $D_2$ by P(z; b). Then the function F(z) that is equal to $f_1(z)$ in $D_1$ and to $f_2(z)$ in $D_2$ is an analytic continuation of $f_1(z)$ from $D_1$ to $D_1 \cup D_2$ (a direct analytic continuation by power series).

We have the following classical theorems about analytic continuations:

Let $D_1$ and $D_2$ be two disjoint domains, and suppose that their respective boundaries $C_1$ and $C_2$ are rectifiable simple closed curves and that the intersection of $C_1$ and $C_2$ contains an open arc $\Gamma$. If two holomorphic functions $f_1(z)$, $f_2(z)$ defined in $D_1$ and $D_2$, respectively, have finite common boundary values at every
point of $\Gamma$, then there exists an analytic continuation F(z) of $f_1(z)$ and $f_2(z)$ to $D_1 \cup \Gamma \cup D_2$ (Painlevé's theorem). We sometimes call $f_2(z)$ a continuation of $f_1(z)$ beyond $\Gamma$. If $\Gamma$ is not rectifiable, the continuation beyond $\Gamma$ does not exist, in general.

Let f(z) be holomorphic in a Jordan domain D lying in the half-plane on one side of the real axis and containing an open interval I of the real axis in its boundary. If f(z) has finite real boundary values at every point of I, then it can be continued analytically beyond I to the other side of the real axis; there the continued function is given by $\overline{f(\bar z)}$ (Schwarz's principle of reflection). This theorem can be generalized to the case where the real interval is replaced by an analytic curve.

A harmonic continuation of harmonic functions is defined analogously to analytic continuation. Let D be a Jordan domain lying in the half-plane on one side of the real axis and having an open interval I on the real axis as a part of its boundary. If u(z) is harmonic in D and has the boundary value 0 at every point of I, then u(z) has a harmonic continuation beyond I.

For other properties of holomorphic functions → 43 Bounded Functions; 429 Transcendental Entire Functions.

H. Analytic Functions

A real-valued function f(t) of a real variable t is said to be analytic at $t = t_0$ if it can be represented by a power series in $t - t_0$ in a neighborhood of $t_0$ in R. If f(t) is defined on an open set of R at every point of which it is analytic, then f(t) is called an analytic function, or, more precisely, a real analytic function. Analogously, a complex-valued function f(z) of a complex variable z defined on a domain D of the complex plane C is said to be analytic at $z = z_0$ ($\in D$) if it can be represented by a power series in $z - z_0$ in a neighborhood of $z_0$ in C, and f(z) is an analytic function in D if it is analytic at every point of D. In the remainder of this article, we are concerned with analytic functions in this sense. To distinguish them from the real case, they are also called complex analytic functions. A complex analytic function f(z) defined on D is differentiable in D; therefore it is holomorphic in D. The converse is also true. Thus the term "analytic function" is synonymous with "holomorphic function" insofar as it concerns a complex function (i.e., a complex-valued function of a complex variable) on a domain, but in the theory of functions it takes on an additional meaning that is explained in the following section.

I. Analytic Functions in the Sense of Weierstrass

Let a be a point of the z-sphere and t the local canonical parameter at a; i.e., t = z - a if $a \ne \infty$ and $t = z^{-1}$ if $a = \infty$. If a power series $P(z; a) = \sum_{n=0}^{\infty} c_n t^n$ has a positive radius $r_a$ of convergence, we call P(z; a) a function element with center a on the z-sphere, after K. Weierstrass. $P(z; a) = \sum_{n=0}^{\infty} c_n (z - a)^n$ if $a \ne \infty$, and $P(z; a) = \sum_{n=0}^{\infty} c_n z^{-n}$ if $a = \infty$. These represent a holomorphic function in $|z - a| < r_a$ or in $r_a^{-1} < |z| \le \infty$, respectively. If b is a point inside the circle of convergence of the function element P(z; a), by the Taylor expansion of P(z; a) at z = b, we obtain the power series P(z; b) in z - b, which is a direct analytic continuation of P(z; a). Let a and b be two points on the z-sphere, and let C: z = z(s) ($0 \le s \le 1$, z(0) = a, z(1) = b) be a curve joining a and b. We say that P(z; a) is analytically continuable along C and that we obtain P(z; b) at the end point b by the analytic continuation of P(z; a) along C if the following two conditions are satisfied: (i) To every $s \in [0, 1]$ there corresponds a function element P(z; z(s)) with center z(s); (ii) for every $s_0 \in [0, 1]$, we can take a suitable subarc z = z(s) ($|s - s_0| \le \varepsilon$, $\varepsilon > 0$) of C contained inside the circle of convergence of $P(z; z(s_0))$ such that every function element P(z; z(s)) with $|s - s_0| \le \varepsilon$ is a direct analytic continuation of $P(z; z(s_0))$. When P(z; a) and the curve C are given, the analytic continuation along C is uniquely determined (uniqueness theorem of the analytic continuation).

Given a function element P(z; a) with center a, the set of all function elements obtained by every possible analytic continuation along every curve starting from a is called an analytic function in the sense of Weierstrass determined by P(z; a). In this definition, we can restrict the curves to polygonal lines. An analytic function in this sense is completely determined by a single arbitrary function element belonging to it, so two analytic functions are identically equal if they have a common function element.

A germ of a holomorphic function is identical to a function element, and the set of all germs has the natural structure of a sheaf $\mathcal{O}$. In the terminology of sheaves, an analytic function is a connected component of $\mathcal{O}$, and an analytic continuation along a curve C is a continuous curve $\Gamma$ in $\mathcal{O}$ whose projection is C.

J. Values and Branches of Analytic Functions

The value of an analytic function at a point b is, by definition, the value at b of its function elements with center b (whose existence is
assumed; there may be several such elements). An analytic function is, in general, a multiple-valued function because analytic continuations along different curves with the same end points may lead to different function elements. For a given analytic function f(z), if the maximal number of its function elements with the same center is n, we say it is n-valued, and if $n \ge 2$ we say it is multiple-valued (or many-valued). The number of function elements of f(z) with the same center is at most countably infinite, so the value of f(z) at a point is a countable set (Poincaré-Volterra theorem). By introducing a Riemann surface instead of the complex plane as the domain of definition of an analytic function, we can regard multiple-valued analytic functions as single-valued functions defined on a suitable Riemann surface (→ 367 Riemann Surfaces).

Let f(z) be an analytic function and P(z; a) be a function element belonging to f(z), where a is a point of a domain D. The set of all function elements obtained from P(z; a) by every possible analytic continuation along all curves in D is called a branch of f(z) in D determined by P(z; a). When D coincides with the whole complex plane, the branch of f(z) in D is the function f(z) itself. A function holomorphic in a domain D can be expanded in a power series with any point of D as its center, and the set of these power series (function elements) constitutes a branch of an analytic function.

If analytic continuations of a function element are possible along all curves in D, then the analytic continuations along two homotopic curves in D lead to the same result (monodromy theorem). In particular, if D is simply connected and if analytic continuations of P(z; a) are possible along all curves in D starting from a, then the branch of f(z) in D determined by P(z; a) is single-valued.

K. Invariance Theorem of Analytic Relations

Suppose the following four conditions hold: (1) F(z, w) is a holomorphic function of two variables for $z \in \Delta_1$ and $w \in \Delta_2$, where $\Delta_1$, $\Delta_2$ are domains in the complex plane. (2) A curve C: z = z(s) ($0 \le s \le 1$, z(0) = a, z(1) = b) and two sets of function elements P(z; z(s)) and Q(z; z(s)) defined for every s ($0 \le s \le 1$) are given. (3) P(z; a) and Q(z; a) can be continued analytically along C using P(z; z(s)) and Q(z; z(s)), respectively. (4) There exists a positive number R(s) for every s ($0 \le s \le 1$) such that, if |z - z(s)| < R(s), the values of P(z; z(s)) and Q(z; z(s)) belong to $\Delta_1$ and $\Delta_2$, respectively. Under these conditions, if F(P(z; a), Q(z; a)) = 0 holds for |z - a| < R(0), then F(P(z; b), Q(z; b)) = 0 holds for |z - b| < R(1). In other words, an analytic relation between function elements belonging to two analytic functions that holds in a neighborhood of the starting point of a curve C is conserved for function elements with center at the terminal point b of C. This is called the invariance theorem of analytic relations. The same statement is valid for relations among more than two analytic functions and their derivatives (differential equations).

L. Inverse Functions

Suppose that P(z; a) ($a \ne \infty$) belongs to an analytic function f(z) and $P'(a; a) \ne 0$. We consider the inverse function of P(z; a) in a neighborhood of a and let $\mathfrak{P}(w; \alpha)$ ($\alpha = P(a; a)$) be its expansion as the power series in $w - \alpha$. We call $\mathfrak{P}(w; \alpha)$ the inverse function element (or simply inverse element) of P(z; a) and the analytic function determined by $\mathfrak{P}(w; \alpha)$ the inverse analytic function (or simply inverse function) of f(z). The inverse function is completely determined by f(z) and is independent of the choice of P(z; a). For example, analytic functions represented by $\sqrt{w}$ or log w are defined as the inverse function of $z^2$ or $e^z$, respectively.

M. Singularities of Analytic Functions

Hereafter, when we speak of a curve C: z = z(s) ($0 \le s \le 1$), it is always supposed that C is a curve in the complex plane starting at a and ending at $\omega$. Let $K_r$ be the open disk $|z - \omega| < r$; we denote by $C_r$ the connected component of $C \cap K_r$ that contains $\omega$. If analytic continuations of P(z; a) are possible along any subarc of C with a terminal point arbitrarily near $\omega$ but impossible along the whole C, we say that the analytic continuation of P(z; a) along C defines a singularity $\Omega$ of the coordinate $\omega$, and that $\Omega$ lies over $\omega$. For example, if P(z; a) has a finite radius of convergence, for a suitable point $\omega$ on the circle of convergence the analytic continuation of P(z; a) along the radius $a\omega$ defines a singularity over $\omega$. Now take a point $z_r$ on $C_r$ and denote by $F_r(z)$ the branch of an analytic function determined by $P(z; z_r)$ in $K_r$. Let $\Omega$ be a singularity determined by C and P(z; a), and suppose that we are given another singularity $\Omega^*$ over $\omega$ determined by C* and P*(z; a*). If they define the same branch $F_r(z)$ for every $K_r$, by definition, we put $\Omega = \Omega^*$. Thus $F_r(z)$ defines an unramified covering surface $W_r$ of $K_r - \{\omega\}$, and it is single-valued on $W_r$.

Singularities are classified according to the geometric structure of $W_r$ and the value distribution of $F_r(z)$ on it. First, if $W_r$ has no relative boundary over $0 < |z - \omega| < r$ for a suitable r, then $\Omega$ is called an isolated singularity of the
analytic function. In this case, the number k of points of $W_r$ lying over a point z in $K_r - \{\omega\}$ is constant. If $k = \infty$, $W_r$ has a logarithmic branch point over $\omega$, and $\Omega$ is called a logarithmic singularity. If $k < \infty$, $F_r(z)$ can be represented as a single-valued holomorphic function in $0 < |t| < r^{1/k}$ by putting $z = \omega + t^k$. In this case, if we introduce an additional point $P_0$ corresponding to $z = \omega$, then $W_r \cup \{P_0\}$ has only an algebraic branch point over $\omega$. Now, taking into account the value of $w = F_r(z)$, we call $\Omega$ an algebraic singularity if lim w exists. In this case, we have $F_r(z) = \sum_{n=l}^{\infty} c_n t^n$ for some integer l, and if we admit analytic continuations in the wider sense (→ Section O), P(z; a) is analytically continuable along the whole C.

N. The Natural Boundary

Given a domain D and an analytic function f(z) holomorphic in D, if all boundary points of D are singularities of f(z) and f(z) is not continuable to the exterior of D, the boundary of D is called the natural boundary of f(z). This phenomenon was first discovered for elliptic modular functions. Many results are known about power series for which the circle of convergence is the natural boundary (→ 339 Power Series). For any given domain D in C, there exists an analytic function whose natural boundary is the boundary of D. The original proof of this fact, given by Weierstrass, contained a defect that was corrected by J. Besse.

O. Analytic Continuation in the Wider Sense

Let two Laurent series (with parameter t) $z = P(t) = \sum_{n=k}^{\infty} a_n t^n$ and $w = Q(t) = \sum_{n=l}^{\infty} b_n t^n$ (k and l are integers, and $a_k b_l \ne 0$) converge in 0 < |t| < r, and let $(P(t_1), Q(t_1)) \ne (P(t_2), Q(t_2))$ if $t_1 \ne t_2$; then we say that the pair (P, Q) defines a function element in the wider sense. If a change of parameter $\tau = r_1 t + r_2 t^2 + \cdots$ ($r_1 \ne 0$ and the radius of convergence > 0) gives $P(t) = \Pi(\tau)$, $Q(t) = K(\tau)$, we say that $(\Pi, K)$ and (P, Q) define the same function element. By a suitable choice of parameter, any function element can be given in the form $z = t^k + a$ (or $z = t^{-k}$), $w = \sum_{n=l}^{\infty} b_n t^n$, and the elimination of t gives the representation of w as a Puiseux series of z. So if k = 1 and $l \ge 0$, it reduces to a holomorphic function element. When k = 1, with l < 0 not excluded, the above element is called a rational element. If k > 1 it is called a ramified element, and if l < 0 it is called a polar element.

If P', Q' are the direct analytic continuations of P and Q at $t_0$ ($0 < |t_0| < r$), i.e., their Taylor expansions at $t_0$, the function element (P', Q') is called a direct analytic continuation of (P, Q), which is also considered its own direct analytic continuation. For a fixed r, the set of all direct continuations of (P, Q) thus obtained is called an analytic neighborhood of (P, Q), and these neighborhoods define a topology in the set of all function elements. A curve in this topological space is called an analytic continuation in the wider sense, and a connected component of this space is called an analytic function in the wider sense. An analytic function in the wider sense is a set of function elements in the wider sense, but it can also be regarded as a function w = f(z) (with an independent variable z and a dependent variable w) defined by each function element p(z, w): z = P(t), w = Q(t).

An analytic continuation in the wider sense,

$$p(s) = p(z, w; s):\quad z = z(s) + t^{k(s)}, \quad w = \sum_{n=l(s)}^{\infty} c_n(s)\,t^n, \qquad 0 \le s \le 1,$$

is sometimes called an analytic continuation along the curve C: z = z(s) ($0 \le s \le 1$) in the complex plane. If all p(s) are holomorphic function elements, this coincides with the analytic continuation along C in the original sense, but if this is not the case, p(0) and C do not necessarily determine p(1) uniquely. Actually, an analytic function in the wider sense is just an analytic function in the original sense with at most a countable number of ramified or polar elements added.

P. Singularities of Analytic Functions in the Wider Sense

Suppose the following three conditions hold: (1) For every point on C except $\omega$, that is, for z(s) ($0 \le s < 1$), a function element in the wider sense p(z, w; s) is given. (2) For every $\lambda$ (< 1), p(z, w; s) ($0 \le s \le \lambda$) constitutes an analytic continuation in the wider sense. (3) It is impossible to find a function element p(z, w; 1) such that p(z, w; s) ($0 \le s \le 1$) is an analytic continuation in the wider sense. When these three conditions are satisfied, we say that p(z, w; s) ($0 \le s < 1$) defines a transcendental singularity $\Omega$ with $\omega$ as its coordinate. The method of determining a branch $w = F_r(z)$ in an open disk with center $\omega$ is completely parallel to the case of holomorphic analytic functions. Because of the appearance of function elements in the wider sense in $F_r(z)$, the covering surface $W_r$ of $K_r$ defined by $F_r(z)$ may have algebraic branch points. If $W_r$ has a logarithmic branch point over $\omega$, $\Omega$ is called a logarithmic singularity. If $W_r$ has no point over $\omega$ for suitable r, $\Omega$ is called a direct transcendental singularity; otherwise, it is called an indirect transcendental singularity. The logarithmic singularities are
direct singularities. The inverse function of z = w sin w has a direct singularity over $z = \infty$ that is not logarithmic, and the inverse function of z = (sin w)/w has an indirect singularity over z = 0. Taking into account the value of $w = F_r(z)$, if the cluster set of $F_r$ at $\Omega$: $S_\Omega = \bigcap_{r>0} \overline{F_r(W_r)}$ consists of only one point, it is an ordinary singularity; if not, it is an essential singularity.

Q. History

A function of a complex variable is monogenic in the sense of A. L. Cauchy if it is differentiable at every point of its domain of definition. It was B. Riemann who succeeded in developing Cauchy's concept. Riemann considered an analytic function as a function defined on a Riemann surface, that is, a 1-dimensional complex analytic manifold. On the other hand, Weierstrass constructed the theory of analytic functions starting from power series. When we speak of single-valued functions defined in a domain of the complex plane, the monogenic functions of Cauchy and the analytic functions of Weierstrass are identical. Although the analytic functions are very special functions, the study of complex analytic functions is usually called the theory of functions of a complex variable, or simply the theory of functions.

By considering the following point set C, which is more general than a domain, E. Borel showed that a monogenic function on C is not necessarily holomorphic in the ordinary sense. Take a countable dense subset $\{z_n\}$ in a subdomain of a domain D and a double sequence of positive numbers $\{r_n^{(h)}\}$. Put $S_n^{(h)} = \{z \mid |z - z_n| < r_n^{(h)}\}$ and $C^{(h)} = D - \bigcup_{n=1}^{\infty} S_n^{(h)}$. By a suitable choice of $r_n^{(h)}$, we can suppose that the $C^{(h)}$ are connected and monotone increasing with respect to h. Put $C = \bigcup_{h=1}^{\infty} C^{(h)}$. A function defined in C is by definition monogenic if it is differentiable in $C^{(h)}$ for every h. For such a monogenic function, Cauchy's integral formula in a generalized form holds, and the function is infinitely differentiable. If f(z) and g(z) are monogenic in C and coincide on a curve in C, then they are identical in C. Let D be the set $\{z \mid 0 < \operatorname{Re} z < 1,\ 0 < \operatorname{Im} z < 1\}$ and $\{z_n\}$ be all rational points in D ($z_n = (p + iq)/m$). For a natural number h, we define $C^{(h)}$ to be the set D minus the union of open disks with radius $\exp(-e^{m^2})/h$ and center (p + iq)/m. The function

[display formula missing]

is monogenic in C in the above-mentioned sense, but not holomorphic in C. The study of these functions developed into the theory of quasi-analytic functions.

The concept of analytic functions of several complex variables can also be defined analogously to the case of one variable. Then nonuniformizable singularities appear that lead to a generalization of the concept of manifolds (→ 23 Analytic Spaces).

References

[1] E. Borel, Leçons sur les fonctions monogènes uniformes d'une variable complexe, Gauthier-Villars, 1917.
[2] A. Hurwitz and R. Courant, Vorlesungen über allgemeine Funktionentheorie und elliptische Funktionen, Springer, fourth edition, 1964.
[3] L. Bieberbach, Lehrbuch der Funktionentheorie, Teubner, I, 1921; II, 1927 (Johnson Reprint Co., 1969).
[4] E. C. Titchmarsh, The theory of functions, Oxford Univ. Press, second edition, 1939.
[5] C. Carathéodory, Funktionentheorie I, II, Birkhäuser, 1950; English translation, Theory of functions, Chelsea, I, 1958; II, 1960.
[6] S. Saks and A. Zygmund, Analytic functions, Warsaw, 1952.
[7] L. V. Ahlfors, Complex analysis, McGraw-Hill, 1953; third edition, 1979.
[8] H. Behnke and F. Sommer, Theorie der analytischen Funktionen einer komplexen Veränderlichen, Springer, second edition, 1962.
[9] H. Kneser, Funktionentheorie, Vandenhoeck & Ruprecht, 1958.
[10] E. Hille, Analytic function theory, Ginn, I, 1959; II, 1962.
[11] H. P. Cartan, Théorie élémentaire des fonctions analytiques d'une ou plusieurs variables complexes, Hermann, 1961; English translation, Elementary theory of analytic functions of one or several complex variables, Addison-Wesley, 1963.
[12] A. Dinghas, Vorlesungen über Funktionentheorie, Springer, 1961.
[13] R. Nevanlinna and V. Paatero, Einführung in die Funktionentheorie, Birkhäuser, 1964; English translation, Introduction to complex analysis, Addison-Wesley, 1968.
[14] M. Heins, Complex function theory, Academic Press, 1968.
[15] B. A. Fuks and V. I. Levin, Functions of a complex variable and some of their applications I, II, Pergamon, 1961. (Original in Russian, 1951.)
[16] W. Rudin, Real and complex analysis, McGraw-Hill, 1966.
[17] S. Lang, Complex analysis, Addison-Wesley, 1977.
As to the Looman-Men'shov theorem,

[18] D. Men'shov (Menchoff), Les conditions de monogénéité, Actualités Sci. Ind., Hermann, 1936.
[19] D. Men'shov (Menchoff), Sur la généralisation des conditions de Cauchy-Riemann, Fund. Math., 25 (1935), 59-97.
[20] S. Saks, Theory of the integral, Warsaw, 1937, 188-201.
For a new proof of Cauchy's integral theorem,
[21] E. Artin, On the theory of complex functions, Notre Dame mathematical lectures, 1944, 57-70. Also → [7, 17].
For problems,
[22] L. Volkovyskiĭ, G. Lunts, and I. Aramovich, A collection of problems on complex analysis, Addison-Wesley, 1965.
For analytic continuation,
[23] H. Weyl, Die Idee der Riemannschen Fläche, Teubner, 1913; third edition, 1955; English translation, The concept of a Riemann surface, Addison-Wesley, 1964. Also → [1-17].
For the history and concepts of analytic functions,
[24] G. Julia, Essai sur le développement de la théorie des fonctions de variables complexes, Gauthier-Villars, 1933.


199 (IV.12)
Homogeneous Spaces

A. General Remarks

Let $M$ be a †differentiable manifold. If a †Lie group $G$ acts †transitively on $M$ as a †Lie transformation group, the manifold $M$ is said to be a homogeneous space having $G$ as its transformation group (→ 431 Transformation Groups). The †stabilizer (isotropy subgroup) $H_x$ of $G$ at a point $x$ of $M$ is a closed subgroup of $G$, and a one-to-one correspondence between $G/H_x$ and $M$ preserving the action of $G$ is defined by associating the element $sH_x$ ($s \in G$) of $G/H_x$ with the point $s(x)$ of $M$. This correspondence is a †diffeomorphism between the manifold $M$ and the quotient manifold $G/H_x$ if the number of connected components of $G$ is at most countable. Under this condition we may therefore identify a homogeneous space $M$ with the quotient manifold $G/H$ of a Lie group $G$ by a closed Lie subgroup $H$ (→ 249 Lie Groups). However, $H$ is not uniquely determined by $M$, and it may be replaced by $H_{s(x)} = sH_xs^{-1}$ ($s \in G$). Each element $h$ of the stabilizer $H_x$ at a point $x$ induces a linear transformation $\tilde{h}$ on the †tangent space $V_x$ of $M$ at the point $x$. The set $\tilde{H}_x$ of all $\tilde{h}$ is called the linear isotropy group at the point $x$.

If we represent the homogeneous space $M$ as $G/H$, we obtain the canonical map $\pi: s \to sH$ of $G$ onto $M$, which we call the projection of $G$ onto $M$. Let $\mathfrak{g}$ be the †Lie algebra of $G$, and $\mathfrak{h}$ the Lie subalgebra corresponding to the closed subgroup $H$. When we identify $\mathfrak{g}$ with the tangent space at the identity element $e$ of $G$ and $\mathfrak{h}$ with its subspace, the projection $\pi$ induces a linear isomorphism of $\mathfrak{g}/\mathfrak{h}$ with the tangent space $V_x$ of $M$ at the point $x = \pi(e)$. The †adjoint representation of $G$ gives rise to a linear representation $h \to \operatorname{Ad}(h)$ modulo $\mathfrak{h}$ of the group $H$ on the linear space $\mathfrak{g}/\mathfrak{h}$. Through the linear isomorphism between $\mathfrak{g}/\mathfrak{h}$ and the tangent space $V_x$ defined by the projection $\pi$, this representation of $H$ is equivalent to the one which associates with $h$ the linear transformation $\tilde{h}$ defined by $h$ on the tangent space $V_x$.

The homogeneous space $G/H$ is said to be reductive if there exists a linear subspace $\mathfrak{m}$ of $\mathfrak{g}$ such that $\mathfrak{g} = \mathfrak{h} + \mathfrak{m}$ (direct sum as linear spaces) and $(\operatorname{Ad} H)\mathfrak{m} \subset \mathfrak{m}$. $H$ is said to be reductive in $\mathfrak{g}$ if the representation $h \to \operatorname{Ad}(h)$ of $H$ in $\mathfrak{g}$ is †completely reducible.

If a †tensor field $P$ on the homogeneous space $M = G/H$ is $G$-invariant (namely, invariant under the transformations defined by the elements of $G$), then the value of $P$ at the point $x = \pi(e)$ is a †tensor over the tangent space $V_x$ at $x$ which is invariant under the linear isotropy group $\tilde{H}$. Conversely, such a tensor over $V_x$ is uniquely extended to a $G$-invariant tensor field on $M$. If $G/H$ is reductive, then $G$-invariant tensor fields over $M$ are in one-to-one correspondence with $\tilde{H}$-invariant tensors over $\mathfrak{m}$. For instance, if $H$ is compact, then $H$ is reductive in $\mathfrak{g}$ and an $\tilde{H}$-invariant positive definite quadratic form on $\mathfrak{m}$ defines a $G$-invariant †Riemannian metric on $G/H$.

We say that the homogeneous space $M = G/H$ is a Riemannian (linearly connected, complex Hermitian, Kähler) homogeneous space if there exists on $M$ a $G$-invariant Riemannian metric (†linear connection, †Hermitian metric, †Kähler metric). Concerning such homogeneous spaces, there are various results on their structures and geometric properties [1-5] (→ 412 Symmetric Riemannian Spaces and Real Forms; 427 Topology of Lie Groups and Homogeneous Spaces).

B. Examples

Stiefel Manifold. A $k$-frame ($1 \le k \le n$) in a real $n$-dimensional Euclidean vector space $\mathbf{R}^n$ is an ordered system consisting of $k$ linearly independent vectors. If we regard the real †general linear group of degree $n$, $GL(n, \mathbf{R})$, as the regular linear transformation group of $\mathbf{R}^n$, $GL(n, \mathbf{R})$ acts transitively on the set $V'_{n,k}(\mathbf{R})$ of

all $k$-frames in $\mathbf{R}^n$. Therefore, if $H$ denotes the subgroup of $GL(n, \mathbf{R})$ consisting of the elements which leave fixed a given $k$-frame $v_0$, we may identify the set $V'_{n,k}$ and the quotient set $GL(n, \mathbf{R})/H$. Transferring the differentiable manifold structure of $GL(n, \mathbf{R})/H$ to $V'_{n,k}$ through this identification, we see that $V'_{n,k}(\mathbf{R}) = GL(n, \mathbf{R})/H$ becomes a homogeneous space (the differentiable manifold structure of $V'_{n,k}$ is defined independently of the choice of $v_0$). The space $V'_{n,k}(\mathbf{R})$ is called the (real) Stiefel manifold of $k$-frames in $\mathbf{R}^n$.

A $k$-frame is called an orthogonal $k$-frame if the vectors belonging to the frame are of length 1 and are orthogonal to each other. The set $V_{n,k}(\mathbf{R})$ of all orthogonal $k$-frames is a submanifold of $V'_{n,k}(\mathbf{R})$. The †orthogonal group $O(n)$ acts transitively on $V_{n,k}(\mathbf{R})$, which is a homogeneous space represented as $V_{n,k}(\mathbf{R}) = O(n)/I_k \times O(n-k)$. The manifold $V_{n,1}(\mathbf{R})$ is actually the $(n-1)$-dimensional sphere. We call $V_{n,k}(\mathbf{R})$ the (real) Stiefel manifold of orthogonal $k$-frames (or simply Stiefel manifold). The complex Stiefel manifold $V_{n,k}(\mathbf{C}) = U(n)/I_k \times U(n-k)$ is defined analogously.

Grassmann Manifold. Let $M_{n,k}(\mathbf{R})$ ($1 \le k \le n$) be the set of all $k$-dimensional linear subspaces of $\mathbf{R}^n$. The group $O(n)$ acts transitively on $M_{n,k}(\mathbf{R})$, so that we may put $M_{n,k}(\mathbf{R}) = O(n)/O(k) \times O(n-k)$. Here $O(k)$ and $O(n-k)$ are identified with the subgroups of $O(n)$ consisting of all elements leaving fixed every point of a fixed $(n-k)$-dimensional subspace and of its orthogonal complement, respectively. In this way, $M_{n,k}(\mathbf{R})$ is a homogeneous space, which we call the (real) Grassmann manifold. The †proper orthogonal group $SO(n)$ acts transitively on $M_{n,k}(\mathbf{R})$, and $M_{n,k}(\mathbf{R})$ may be represented as a homogeneous space having $SO(n)$ as its transformation group. It follows that $M_{n,k}(\mathbf{R})$ is connected. The homogeneous space $\tilde{M}_{n,k}(\mathbf{R}) = SO(n)/SO(k) \times SO(n-k)$ is called the Grassmann manifold formed by oriented subspaces. $M_{n,1}(\mathbf{R})$ and $\tilde{M}_{n,1}(\mathbf{R})$ may be identified with the $(n-1)$-dimensional real projective space and the $(n-1)$-dimensional sphere, respectively.

Applying the above process for real Grassmann manifolds to the complex Euclidean vector space $\mathbf{C}^n$ instead of $\mathbf{R}^n$, we see that the set $M_{n,k}(\mathbf{C})$ of all $k$-dimensional linear subspaces in $\mathbf{C}^n$ is a homogeneous space with the †unitary group $U(n)$ of degree $n$ as its transformation group, and we represent it as $U(n)/U(k) \times U(n-k)$. This space is called the complex Grassmann manifold. The manifold $M_{n,k}(\mathbf{C})$ is a simply connected complex manifold and has a cellular decomposition as a †CW complex whose cells are Schubert varieties (→ 56 Characteristic Classes E). On the other hand, $M_{n,k}(\mathbf{C})$ may be regarded as the set of all $(k-1)$-dimensional linear subspaces in the $(n-1)$-dimensional complex projective space. Then, by using the †Plücker coordinates of these subspaces, $M_{n,k}(\mathbf{C})$ can be realized as an †algebraic variety without singularity in the projective space of dimension $\binom{n}{k} - 1$ (→ 90 Coordinates B). Sometimes $M_{n,k}(\mathbf{R})$ is denoted by $G_{n,k}(\mathbf{R})$ or $G(n, k)$. In the same way, the homogeneous space represented as $Sp(n)/Sp(k) \times Sp(n-k)$ is called the †quaternion Grassmann manifold and is denoted by $M_{n,k}(\mathbf{H})$.

Flag Manifold. Let $k_1, \ldots, k_r$ be a sequence of integers such that $n > k_1 > \cdots > k_r > 0$, and let $F(k_1, \ldots, k_r)$ be the set of all monotone sequences $V_1 \supset \cdots \supset V_r$, where $V_i$ ($i = 1, \ldots, r$) is a $k_i$-dimensional linear subspace in $\mathbf{R}^n$. For the two sequences $V_1 \supset \cdots \supset V_r$ and $V_1' \supset \cdots \supset V_r'$ belonging to $F(k_1, \ldots, k_r)$, there exists an element $s \in GL(n, \mathbf{R})$ such that $s(V_i) = V_i'$ ($i = 1, \ldots, r$). Therefore $F(k_1, \ldots, k_r)$ is a homogeneous space with $GL(n, \mathbf{R})$ as its transformation group, and is called the proper flag manifold. Since the unitary group $U(n)$ of degree $n$ acts transitively on it, $F(k_1, \ldots, k_r)$ is also regarded as a homogeneous space admitting $U(n)$ as its transformation group. In this case, putting $F(k_1, \ldots, k_r) = U(n)/H$, $H$ is isomorphic to the direct product $U(k_1 - k_2) \times U(k_2 - k_3) \times \cdots \times U(k_r)$. In particular, when $r = n - 1$, $k_i = n - i$, the homogeneous space is the quotient space of the compact Lie group $U(n)$ by a maximal †torus $T$. In general, the quotient space $G/T$ of a compact connected Lie group $G$ by a maximal torus of $G$ is called a flag manifold. If $G$ acts effectively on $G/T$, $G$ is a †semisimple compact Lie group. The complex Lie group $G^{\mathbf{C}}$ is a Lie transformation group of †biregular transformations which acts transitively on the flag manifold $G/T$, a simply connected Kähler homogeneous space. Here $G^{\mathbf{C}}$ is a complex Lie group having $G$ as a maximal compact subgroup. If $B$ is a maximal †solvable Lie subgroup (†Borel subgroup) of $G^{\mathbf{C}}$, $G/T$ is represented as $G^{\mathbf{C}}/B$.


References

[1] A. Borel and R. Remmert, Über kompakte homogene Kählersche Mannigfaltigkeiten, Math. Ann., 145 (1961-1962), 429-439.
[2] Y. Matsushima, Espaces homogènes de Stein des groupes de Lie complexes, Nagoya Math. J., 16 (1960), 205-218.
[3] K. Nomizu, Invariant affine connections on homogeneous spaces, Amer. J. Math., 76 (1954), 33-65.

[4] H. C. Wang, Closed manifolds with homogeneous complex structure, Amer. J. Math., 76 (1954), 1-32.
[5] N. R. Wallach, Harmonic analysis on homogeneous spaces, Dekker, 1973.
[6] S. Helgason, Differential geometry, Lie groups, and symmetric spaces, Academic Press, 1978.


200 (III.24)
Homological Algebra

A. General Remarks

Homological algebra is a new branch of mathematics that developed rapidly after World War II. The introduction of the theory was motivated by the observation that some algebraic ideas and mechanisms that arose in the development of †algebraic topology, in particular, †homology theory, can provide powerful tools for treating from a unified viewpoint various problems in algebra that previously were treated differently. One of its characteristic features lies in emphasizing, from the standpoint of categories and functors (→ 52 Categories and Functors), the functional structure of the objects to be investigated rather than their inner structure. Thus the theory of derived functors constitutes the main theme of homological algebra. This new theory turned out to have wide applications in other areas of mathematics, and the philosophy embodied in the theory has been influential in the general progress of mathematics. For general references → [2, 5, 6]; for †sheaf cohomology → [3, 4, 8].

B. Graded Modules and Graded Objects

Let $A$ be a †ring with unity element and $X$ be a †unitary $A$-module. If we are given a sequence of $A$-submodules $X_n$ ($n \in \mathbf{Z}$) such that $X = \sum_{n \in \mathbf{Z}} X_n$ (†direct sum), we call $X$ a graded $A$-module and $X_n$ the component of degree $n$ of $X$. Each element $x$ of a graded $A$-module $X$ has a unique representation $x = \sum_{n \in \mathbf{Z}} x_n$ ($x_n \in X_n$); we call $x_n$ the component of degree $n$ of $x$. An $A$-submodule $Y$ of a graded $A$-module $X$ is called homogeneous if $x \in Y$ implies $x_n \in Y$ ($n \in \mathbf{Z}$). In this case, $Y = \sum_n Y_n$ and the quotient module $X/Y = \sum_n X_n/Y_n$ are graded $A$-modules, where $Y_n = Y \cap X_n$. Let $X = \sum_n X_n$ and $Y = \sum_n Y_n$ be graded $A$-modules and $f: X \to Y$ be an $A$-homomorphism. If there is a fixed integer $p$ such that $f(X_n) \subset Y_{n+p}$ for any $n \in \mathbf{Z}$, $f$ is called an $A$-homomorphism of degree $p$. In this case, $\operatorname{Ker} f = \sum_n \operatorname{Ker} f_n$ and $\operatorname{Im} f = \sum_n \operatorname{Im} f_{n-p}$ are homogeneous $A$-submodules of $X$ and $Y$, respectively, where $f_n: X_n \to Y_{n+p}$ is the restriction of $f$ on $X_n$.

Sometimes, by a graded module we mean only a sequence $\{X_n\}$ of $A$-modules $X_n$, without considering the direct sum $\sum_n X_n$. Similarly, we have the notion of a graded object $\{C_n\}$ in any category $\mathcal{C}$.

C. Chain Complexes and Homology Modules

By a chain complex $(X, \partial)$ over $A$ we mean a graded $A$-module $X = \sum_n X_n$ together with an $A$-homomorphism $\partial: X \to X$ of degree $-1$ such that $\partial \circ \partial = 0$. Hence a chain complex over $A$ is completely determined by a sequence

$$\cdots \to X_{n+1} \xrightarrow{\partial_{n+1}} X_n \xrightarrow{\partial_n} X_{n-1} \to \cdots$$

of $A$-modules and $A$-homomorphisms such that $\partial_n \circ \partial_{n+1} = 0$ for all $n$. We call $\partial$ the boundary operator. For a chain complex $(X, \partial)$, we write $\operatorname{Ker} \partial = Z(X)$, $\operatorname{Ker} \partial_n = Z_n(X)$, $\operatorname{Im} \partial = B(X)$, $\operatorname{Im} \partial_{n+1} = B_n(X)$. Then $Z(X) = \sum_n Z_n(X)$, $B(X) = \sum_n B_n(X)$ are homogeneous submodules of $X$, called the module of cycles and the module of boundaries, respectively. $B(X)$ is a homogeneous submodule of $Z(X)$, and the quotient modules $Z(X)/B(X)$, $Z_n(X)/B_n(X)$ are denoted by $H(X)$, $H_n(X)$, respectively. We call $H(X) = \sum_n H_n(X)$ the homology module of the chain complex $(X, \partial)$.

If $(X, \partial)$, $(Y, \partial')$ are chain complexes over $A$, an $A$-homomorphism $f: X \to Y$ of degree 0 satisfying $\partial' \circ f = f \circ \partial$ (i.e., $\partial'_n f_n = f_{n-1} \partial_n$) is called a chain mapping of $X$ to $Y$. For a chain mapping $f$, we have $f(Z_n(X)) \subset Z_n(Y)$, $f(B_n(X)) \subset B_n(Y)$, and hence $f$ induces an $A$-homomorphism $f_*: H(X) \to H(Y)$ of degree 0, which is called the homological mapping induced by $f$. We have $(1_X)_* = 1_{H(X)}$ and $(g \circ f)_* = g_* \circ f_*$ for chain mappings $f: X \to Y$ and $g: Y \to Z$.

Let $f, g: X \to Y$ be two chain mappings. If there is an $A$-homomorphism $D: X \to Y$ of degree $+1$ such that $f - g = D \circ \partial + \partial' \circ D$, we say that $f$ is chain homotopic to $g$ and write $f \simeq g$; $D$ is called a chain homotopy of $f$ to $g$. If $f$ is chain homotopic to $g$, we have $f_* = g_*: H(X) \to H(Y)$. For chain complexes $X$ and $Y$, if there are chain mappings $f: X \to Y$ and $g: Y \to X$ such that $f \circ g \simeq 1_Y$ and $g \circ f \simeq 1_X$, we say that $X$ is chain equivalent to $Y$. In this case $f_*: H(X) \to H(Y)$ is an isomorphism and $g_*: H(Y) \to H(X)$ is its inverse.

Let $(X, \partial)$ be a chain complex over $A$ and $Y = \sum_n Y_n$ be a homogeneous $A$-submodule of $X$ such that $\partial Y \subset Y$. Then $Y$ and $X/Y$ are chain complexes over $A$ with the boundary operators induced by $\partial$. $Y$ is called a chain

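subcomplex of $X$, and $X/Y$ is called the quotient chain complex of $X$ by $Y$ or the relative chain complex of $X$ mod $Y$. For a chain complex $X$ and its subcomplex $Y$ we have an †exact sequence $0 \to Y \xrightarrow{i} X \xrightarrow{j} X/Y \to 0$, where $i$ is the †canonical injection and $j$ the †canonical surjection.

Let $(W, \partial')$, $(X, \partial)$, $(Y, \partial'')$ be chain complexes over $A$, and $f: W \to X$, $g: X \to Y$ be chain mappings such that $0 \to W \xrightarrow{f} X \xrightarrow{g} Y \to 0$ is exact. Then an $A$-homomorphism $\partial_*: H(Y) \to H(W)$ of degree $-1$, called the connecting homomorphism, is defined by $\partial_*(y + B(Y)) = f^{-1} \circ \partial \circ g^{-1}(y) + B(W)$ ($y \in Z(Y)$), and we have the exact sequence of homology:

$$\cdots \xrightarrow{\partial_*} H_n(W) \xrightarrow{f_*} H_n(X) \xrightarrow{g_*} H_n(Y) \xrightarrow{\partial_*} H_{n-1}(W) \xrightarrow{f_*} H_{n-1}(X) \xrightarrow{g_*} H_{n-1}(Y) \xrightarrow{\partial_*} \cdots$$

For a †commutative diagram

$$\begin{array}{ccccccccc} 0 & \to & W & \to & X & \to & Y & \to & 0 \\ & & \downarrow{\scriptstyle\varphi} & & \downarrow & & \downarrow{\scriptstyle\psi} & & \\ 0 & \to & W' & \to & X' & \to & Y' & \to & 0 \end{array}$$

consisting of chain complexes and chain mappings in which each row is exact, we have $\partial_* \circ \psi_* = \varphi_* \circ \partial_*: H(Y) \to H(W')$.

For the †inductive limit $\varinjlim X_\lambda$ of chain complexes $X_\lambda$ over $A$, we have

$$H(\varinjlim X_\lambda) = \varinjlim H(X_\lambda).$$

A chain complex $X$ is said to be positive if $X_n = 0$ for all $n < 0$. If $X$ is a positive chain complex over $A$ and $M$ is an $A$-module, then we mean by an augmentation of $X$ over $M$ an $A$-homomorphism $\varepsilon: X_0 \to M$ such that the composition $X_1 \xrightarrow{\partial_1} X_0 \xrightarrow{\varepsilon} M$ is trivial: $\varepsilon \circ \partial_1 = 0$. A positive chain complex $X$ together with an augmentation $\varepsilon$ of $X$ over $M$ is called an augmented chain complex over $M$. It is said to be acyclic if the sequence

$$\cdots \to X_n \xrightarrow{\partial_n} X_{n-1} \to \cdots \to X_0 \xrightarrow{\varepsilon} M \to 0$$

is exact, namely, if $H_n(X) = 0$ ($n \ne 0$) and $\varepsilon$ induces an $A$-isomorphism $H_0(X) \cong M$. In this case $X$ is also called a left resolution of $M$. Moreover, if each $X_n$ is a †projective $A$-module, $X$ is called a left projective resolution. For any $A$-module $M$, there exists a left projective resolution of $M$.

Let $\alpha: M \to M'$ be an $A$-homomorphism of $A$-modules, and $X$, $X'$ be augmented chain complexes over $M$, $M'$ having augmentations $\varepsilon$, $\varepsilon'$, respectively. Then a chain mapping $f: X \to X'$ satisfying $\varepsilon' \circ f_0 = \alpha \circ \varepsilon$ is called a chain mapping over $\alpha$. If $X$, $X'$ are left projective resolutions of $M$, $M'$, respectively, then there exist chain mappings of $X$ to $X'$ over $\alpha$, and any two such mappings are chain homotopic. In particular, a left projective resolution of an $A$-module $M$ is uniquely determined up to chain homotopy.

D. Tor

Given a right $A$-module $M$ and a left $A$-module $N$, $\mathbf{Z}$-modules $\operatorname{Tor}_n^A(M, N)$ ($n = 0, 1, 2, \ldots$), called the torsion products (or Tor groups), are defined as follows: Let

$$Y: \cdots \to Y_n \to Y_{n-1} \to \cdots \to Y_0 \xrightarrow{\varepsilon} N \to 0$$

be a projective resolution of $N$, and consider the chain complex

$$M \otimes_A Y: \cdots \to M \otimes_A Y_n \to M \otimes_A Y_{n-1} \to \cdots \to M \otimes_A Y_0 \to 0$$

obtained by forming the †tensor product of $M$ and $Y$. Then we see that the homology module $H_n(M \otimes_A Y)$ is uniquely determined for any choice of left projective resolution of $N$. We define $\operatorname{Tor}_n^A(M, N) = H_n(M \otimes_A Y)$. In particular, we have $\operatorname{Tor}_0^A(M, N) \cong M \otimes_A N$.

Properties of Tor. (1) If $M$ is a †flat $A$-module, we have $\operatorname{Tor}_n^A(M, N) = 0$ ($n = 1, 2, \ldots$).

(2) An $A$-homomorphism $f: M_1 \to M_2$ induces a homomorphism $f_*: \operatorname{Tor}_n^A(M_1, N) \to \operatorname{Tor}_n^A(M_2, N)$. We have $(1_M)_* = 1$ and $(g \circ f)_* = g_* \circ f_*$ for $f: M_1 \to M_2$, $g: M_2 \to M_3$.

(3) For an exact sequence $0 \to M_1 \xrightarrow{f} M_2 \xrightarrow{g} M_3 \to 0$, we have the following exact sequence of Tor:

$$\cdots \to \operatorname{Tor}_n^A(M_1, N) \xrightarrow{f_*} \operatorname{Tor}_n^A(M_2, N) \xrightarrow{g_*} \operatorname{Tor}_n^A(M_3, N) \xrightarrow{\partial_*} \operatorname{Tor}_{n-1}^A(M_1, N) \to \cdots$$
$$\to \operatorname{Tor}_1^A(M_3, N) \xrightarrow{\partial_*} M_1 \otimes_A N \to M_2 \otimes_A N \to M_3 \otimes_A N \to 0,$$

where $\partial_*$ are the connecting homomorphisms.

(4) For a commutative diagram

$$\begin{array}{ccccccccc} 0 & \to & M_1 & \to & M_2 & \to & M_3 & \to & 0 \\ & & \downarrow{\scriptstyle\varphi} & & \downarrow & & \downarrow{\scriptstyle\psi} & & \\ 0 & \to & M_1' & \to & M_2' & \to & M_3' & \to & 0 \end{array}$$

of $A$-modules and $A$-homomorphisms with exact rows, we have $\partial_* \circ \psi_* = \varphi_* \circ \partial_*$.

(5) $\operatorname{Tor}_n^A(\sum_\lambda M_\lambda, N) = \sum_\lambda \operatorname{Tor}_n^A(M_\lambda, N)$.

(6) $\operatorname{Tor}_n^A(\varinjlim M_\lambda, N) \cong \varinjlim \operatorname{Tor}_n^A(M_\lambda, N)$.

On the other hand, take a left projective resolution $X$ of $M$ and consider the chain complex $X \otimes_A N$. Then we have $H_n(X \otimes_A N) \cong \operatorname{Tor}_n^A(M, N)$ for $n = 0, 1, \ldots$. Therefore properties similar to (1)-(6) hold with respect to the second variable $N$ of $\operatorname{Tor}_n^A(M, N)$.

(7) If $A^0$ is a ring †anti-isomorphic to $A$, then $\operatorname{Tor}_n^A(M, N) \cong \operatorname{Tor}_n^{A^0}(N, M)$. In particular,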

if $A$ is commutative, then $\operatorname{Tor}_n^A(M, N)$ is an $A$-module and we have $\operatorname{Tor}_n^A(M, N) \cong \operatorname{Tor}_n^A(N, M)$.

(8) Let $A$ be a †principal ideal ring. Then $\operatorname{Tor}_n^A(M, N) = 0$ ($n = 2, 3, \ldots$) and $\operatorname{Tor}_1^A(M, N)$ is also denoted by $M *_A N$. For an exact sequence $0 \to M_1 \to M_2 \to M_3 \to 0$, we have the exact sequence $0 \to M_1 *_A N \to M_2 *_A N \to M_3 *_A N \to M_1 \otimes_A N \to M_2 \otimes_A N \to M_3 \otimes_A N \to 0$. In particular, $\mathbf{Z} *_{\mathbf{Z}} N = 0$ and $(\mathbf{Z}/n\mathbf{Z}) *_{\mathbf{Z}} N \cong {}_nN$ ($= \{x \in N \mid nx = 0\}$).

Universal Coefficient Theorem for Homology. If $(X, \partial)$ is a chain complex over $A$ and $N$ is a left $A$-module, then $(X \otimes_A N, \partial \otimes 1)$ is a chain complex. If $A$ is a principal ideal ring and each $X_n$ is a †torsion-free $A$-module, then we have a formula

$$H_n(X \otimes_A N) \cong H_n(X) \otimes_A N + H_{n-1}(X) *_A N,$$

called the universal coefficient theorem.

E. Double Chain Complexes

By a double chain complex $(X_{p,q}, \partial', \partial'')$ over $A$ we mean a family of left $A$-modules $X_{p,q}$ ($p, q \in \mathbf{Z}$) together with $A$-homomorphisms $\partial'_{p,q}: X_{p,q} \to X_{p-1,q}$ and $\partial''_{p,q}: X_{p,q} \to X_{p,q-1}$ such that $\partial'_{p-1,q} \circ \partial'_{p,q} = \partial''_{p,q-1} \circ \partial''_{p,q} = \partial'_{p,q-1} \circ \partial''_{p,q} + \partial''_{p-1,q} \circ \partial'_{p,q} = 0$. We define the associated chain complex $(X_n, \partial)$ by setting $X_n = \sum_{p+q=n} X_{p,q}$, $\partial_n = \sum_{p+q=n} (\partial'_{p,q} + \partial''_{p,q})$. We call $\partial$ the total boundary operator, and $\partial'$, $\partial''$ the partial boundary operators.

Given a chain complex $X$ consisting of right $A$-modules and a chain complex $Y$ consisting of left $A$-modules, a double chain complex $(Z_{p,q}, \partial', \partial'')$ is defined by setting $Z_{p,q} = X_p \otimes_A Y_q$, $\partial'_{p,q} = \partial_p \otimes 1$, $\partial''_{p,q} = (-1)^p\, 1 \otimes \partial_q$, where $\partial_p$, $\partial_q$ are the boundary operators of $X$, $Y$, respectively. It is called the product double chain complex of $X$ and $Y$, and the homology module of its associated chain complex is denoted by $H(X \otimes_A Y)$. With respect to this homology module, the following facts hold. If $X$ is a left projective resolution of a right $A$-module $M$ and $Y$ is that of a left $A$-module $N$, then $H_n(X \otimes_A Y) \cong \operatorname{Tor}_n^A(M, N)$. If $A$ is a principal ideal ring and each $X_n$ is a torsion-free $A$-module, then we have the formula

$$H_n(X \otimes_A Y) \cong \sum_{p+q=n} H_p(X) \otimes_A H_q(Y) + \sum_{p+q=n-1} H_p(X) *_A H_q(Y),$$

the Künneth theorem.

F. Cochain Complexes

By a cochain complex $(X, d)$ over $A$ we mean a graded $A$-module $X$ together with an $A$-homomorphism $d: X \to X$ of degree $+1$ such that $d \circ d = 0$; $d$ is called the coboundary operator or differential. For a cochain complex $(X, d)$, we denote by $X^n$ the component of degree $n$ of $X$, and by $d^n: X^n \to X^{n+1}$ the restriction of $d$ on $X^n$. Then a chain complex $(Y, \partial)$ is defined by $Y_n = X^{-n}$ and $\partial_n: Y_n \to Y_{n-1}$ is equal to $d^{-n}: X^{-n} \to X^{-n+1}$.

For a cochain complex $(X, d)$, we write $\operatorname{Ker} d^n = Z^n(X)$, $\operatorname{Ker} d = Z(X)$ ($Z(X) = \sum Z^n(X)$), $\operatorname{Im} d^{n-1} = B^n(X)$, $\operatorname{Im} d = B(X)$ ($B(X) = \sum B^n(X)$), and $Z^n(X)/B^n(X) = H^n(X)$, $Z(X)/B(X) = H(X)$ ($H(X) = \sum H^n(X)$). These modules $Z(X)$ ($Z^n(X)$), $B(X)$ ($B^n(X)$), and $H(X)$ ($H^n(X)$) are called the module of cocycles, the module of coboundaries, and the cohomology module of $X$, respectively. If we consider the associated chain complex $(Y, \partial)$ of $(X, d)$, then $H_{-n}(Y)$ corresponds to $H^n(X)$. In this way, results on chain complexes give results on cochain complexes. Thus the concepts of cochain mapping, cochain homotopy, cochain equivalence, cochain subcomplex, and relative cochain complex can be defined as in the case of chain complexes in Section C, and we have corresponding results. In particular, given an exact sequence $0 \to W \xrightarrow{f} X \xrightarrow{g} Y \to 0$ of cochain complexes and cochain mappings, the connecting homomorphism $d_*: H^n(Y) \to H^{n+1}(W)$ is defined, and the exact sequence of cohomology

$$\cdots \xrightarrow{d_*} H^n(W) \xrightarrow{f_*} H^n(X) \xrightarrow{g_*} H^n(Y) \xrightarrow{d_*} H^{n+1}(W) \xrightarrow{f_*} H^{n+1}(X) \xrightarrow{g_*} H^{n+1}(Y) \xrightarrow{d_*} \cdots$$

exists. For a commutative diagram

$$\begin{array}{ccccccccc} 0 & \to & W & \to & X & \to & Y & \to & 0 \\ & & \downarrow{\scriptstyle\varphi} & & \downarrow & & \downarrow{\scriptstyle\psi} & & \\ 0 & \to & W' & \to & X' & \to & Y' & \to & 0 \end{array}$$

of cochain complexes and cochain mappings with exact rows, we have $d_* \circ \psi_* = \varphi_* \circ d_*$.

A cochain complex $X$ is said to be positive if $X^n = 0$ for $n < 0$. If $X$ is a positive cochain complex over $A$ and $M$ is an $A$-module, we mean by an augmentation of $X$ over $M$ an $A$-homomorphism $\varepsilon: M \to X^0$ such that the composition $M \xrightarrow{\varepsilon} X^0 \xrightarrow{d^0} X^1$ is trivial. If the sequence

$$0 \to M \xrightarrow{\varepsilon} X^0 \xrightarrow{d^0} \cdots \to X^n \xrightarrow{d^n} X^{n+1} \to \cdots$$

is exact, $X$ is called a right resolution of $M$. Moreover, if each $X^n$ is an †injective $A$-module, $X$ is called a right injective resolution of $M$. For any $A$-module $M$, there exists a right injective resolution of $M$, and any two such resolutions are cochain homotopic.

G. Ext

Given left $A$-modules $M$ and $N$, $\mathbf{Z}$-modules $\operatorname{Ext}_A^n(M, N)$ ($n = 0, 1, 2, \ldots$), called the Ext

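groups, are defined as follows: Let $X: \cdots \to X_n \to X_{n-1} \to \cdots \to X_0 \to M \to 0$ be a projective resolution of $M$, and consider the cochain complex $\operatorname{Hom}_A(X, N)$:

$$0 \to \operatorname{Hom}_A(X_0, N) \to \cdots \to \operatorname{Hom}_A(X_{n-1}, N) \to \operatorname{Hom}_A(X_n, N) \to \cdots$$

obtained by forming the †module of $A$-homomorphisms. Then we can show that the cohomology module $H^n(\operatorname{Hom}_A(X, N))$ is uniquely determined for any choice of projective resolution of $M$. We define $\operatorname{Ext}_A^n(M, N) = H^n(\operatorname{Hom}_A(X, N))$. This can also be defined as the cohomology module $H^n(\operatorname{Hom}_A(M, Y))$ of the cochain complex $\operatorname{Hom}_A(M, Y): 0 \to \operatorname{Hom}_A(M, Y^0) \to \cdots \to \operatorname{Hom}_A(M, Y^{n-1}) \to \operatorname{Hom}_A(M, Y^n) \to \cdots$, where $Y: 0 \to N \to Y^0 \to \cdots \to Y^{n-1} \to Y^n \to \cdots$ is a right injective resolution of $N$. Furthermore, for a left projective resolution $X$ of $M$ and a right injective resolution $Y$ of $N$, we see that $\operatorname{Ext}_A^n(M, N)$ is isomorphic to the cohomology module $H^n(\operatorname{Hom}_A(X, Y))$ of the associated cochain complex of the double cochain complex $\operatorname{Hom}_A(X, Y) = (\operatorname{Hom}_A(X_p, Y^q), d', d'')$, where $d'_{p,q}: \operatorname{Hom}_A(X_p, Y^q) \to \operatorname{Hom}_A(X_{p+1}, Y^q)$ and $d''_{p,q}: \operatorname{Hom}_A(X_p, Y^q) \to \operatorname{Hom}_A(X_p, Y^{q+1})$ are given by $d'_{p,q}(u) = u \circ \partial_{p+1}$, $d''_{p,q}(u) = (-1)^{p+q+1} d^q \circ u$ ($u \in \operatorname{Hom}_A(X_p, Y^q)$) by using the boundary operator $\partial$ of $X$ and the coboundary operator $d$ of $Y$.

Properties of Ext. (1) We have $\operatorname{Ext}_A^0(M, N) \cong \operatorname{Hom}_A(M, N)$.

(2) If $M$ is a projective $A$-module or $N$ is an injective $A$-module, then $\operatorname{Ext}_A^n(M, N) = 0$ ($n = 1, 2, \ldots$).

(3) An $A$-homomorphism $f: M_1 \to M_2$ (resp. $f: N_1 \to N_2$) induces a homomorphism $f^*: \operatorname{Ext}_A^n(M_2, N) \to \operatorname{Ext}_A^n(M_1, N)$ (resp. $f_*: \operatorname{Ext}_A^n(M, N_1) \to \operatorname{Ext}_A^n(M, N_2)$). We have $1_M^* = 1$ and $(g \circ f)^* = f^* \circ g^*$ for $f: M_1 \to M_2$, $g: M_2 \to M_3$ (resp. $(1_N)_* = 1$ and $(g \circ f)_* = g_* \circ f_*$ for $f: N_1 \to N_2$, $g: N_2 \to N_3$).

(4) For an exact sequence $0 \to M_1 \to M_2 \to M_3 \to 0$ (resp. $0 \to N_1 \to N_2 \to N_3 \to 0$), we have the exact sequence of Ext:

$$0 \to \operatorname{Hom}_A(M_3, N) \to \operatorname{Hom}_A(M_2, N) \to \operatorname{Hom}_A(M_1, N) \to \operatorname{Ext}_A^1(M_3, N) \to \operatorname{Ext}_A^1(M_2, N) \to \cdots$$

$$(\text{resp. } 0 \to \operatorname{Hom}_A(M, N_1) \to \operatorname{Hom}_A(M, N_2) \to \operatorname{Hom}_A(M, N_3) \to \operatorname{Ext}_A^1(M, N_1) \to \operatorname{Ext}_A^1(M, N_2) \to \cdots).$$

(5)

(6) If $A$ is a principal ideal ring, then $\operatorname{Ext}_A^n(M, N) = 0$ ($n = 2, 3, \ldots$), and $\operatorname{Ext}_A^1(M, N)$ is also denoted by $\operatorname{Ext}_A(M, N)$. In particular, $\operatorname{Ext}_{\mathbf{Z}}(\mathbf{Z}, N) = 0$, $\operatorname{Ext}_{\mathbf{Z}}(\mathbf{Z}/n\mathbf{Z}, N) \cong N/nN$, $\operatorname{Ext}_{\mathbf{Z}}(M, \mathbf{Q}/\mathbf{Z}) = 0$, $\operatorname{Ext}_{\mathbf{Z}}(M, \mathbf{Z}/n\mathbf{Z}) = \hat{M}/n\hat{M}$, where $\hat{M} = \operatorname{Hom}_{\mathbf{Z}}(M, \mathbf{Q}/\mathbf{Z})$.

Universal Coefficient Theorem for Cohomology. If $X$ is a chain complex over a principal ideal ring $A$ such that each $X_n$ is a free $A$-module, then for any $A$-module $N$ we have the formula

$$H^n(\operatorname{Hom}_A(X, N)) \cong \operatorname{Hom}_A(H_n(X), N) + \operatorname{Ext}_A(H_{n-1}(X), N),$$

the universal coefficient theorem. This is generalized as follows: Let $X$ be a chain complex and $Y$ a cochain complex, both over a principal ideal ring $A$. Assume that each $X_n$ is a free $A$-module or that each $Y^n$ is an injective $A$-module. Then we have the formula

$$H^n(\operatorname{Hom}_A(X, Y)) \cong \sum_{p+q=n} \operatorname{Hom}_A(H_p(X), H^q(Y)) + \sum_{p+q=n-1} \operatorname{Ext}_A(H_p(X), H^q(Y))$$

(→ 201 Homology Theory).

H. Complexes in Abelian Categories

We mainly consider general †Abelian categories $\mathcal{C}$. Consideration may, however, be restricted to the †category (Ab) of Abelian groups (whose †objects are Abelian groups and whose †morphisms are homomorphisms) or the †category $\mathcal{M}_R$ of $R$-modules.

A (cochain) complex $C$ in an Abelian category $\mathcal{C}$ is a graded object $\{C^n\}$ in $\mathcal{C}$ with differentials $d^n: C^n \to C^{n+1}$ subject to the condition that $d^{n+1} \circ d^n = 0$ ($n \in \mathbf{Z}$). The $n$th cohomology $H^n(C)$ of $C$ is defined by the †exact sequence $0 \to B^n(C) \to Z^n(C) \to H^n(C) \to 0$, where $B^n(C)$ and $Z^n(C)$ are objects representing $\operatorname{Im} d^{n-1}$ and $\operatorname{Ker} d^n$, respectively. The complex $C$ is called positive (negative) if $C^n = 0$ for $n < 0$ ($n > 0$). We sometimes interchange positive superscripts and negative subscripts and write $C_{-n}$ instead of $C^n$. Then the differentials become $d_n: C_n \to C_{n-1}$, and $C$ is then called a chain complex. The quotient of $\operatorname{Ker} d_n = Z_n$ by $\operatorname{Im} d_{n+1} = B_n$ is called the $n$th homology $H_n(C)$. Negative complexes are usually described in this manner. When $C^n$, $Z^n$, $B^n$, and $H^n$ are sets, as in the category $\mathcal{M}_R$ of $R$-modules, their elements are called cochains, cocycles, coboundaries, and cohomology classes, respectively. Similarly, in the group $C_n$ of chains, residue classes of cycles ($\in Z_n$) modulo boundaries ($\in B_n$) are called homology classes ($\in H_n$).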
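
The quotient description of the nth homology, Ker d_n modulo Im d_{n+1}, can be made concrete in the simplest setting. The following is a minimal sketch, assuming coefficients in the field Q and finite-dimensional chain groups, so that dim H_n = dim C_n - rank d_n - rank d_{n+1} by rank-nullity. The function names (rank, composes_to_zero, betti_numbers) and the triangle example are illustrative choices and do not come from the text; torsion, which is visible over Z, is not detected by this rank count.

```python
from fractions import Fraction


def rank(matrix):
    """Rank over Q of a matrix (list of rows), by exact Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    rk = 0
    for col in range(ncols):
        # Find a pivot in this column at or below row rk.
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            if factor:
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def composes_to_zero(d_n, d_next):
    """Check d_n o d_{n+1} = 0 for matrices d_n (C_n -> C_{n-1}) and d_next (C_{n+1} -> C_n)."""
    return all(
        sum(d_n[i][k] * d_next[k][j] for k in range(len(d_next))) == 0
        for i in range(len(d_n))
        for j in range(len(d_next[0]))
    )


def betti_numbers(dims, boundaries):
    """dims[n] = dim C_n; boundaries[n] = matrix of d_n: C_n -> C_{n-1} for n >= 1.

    Returns [dim H_0, dim H_1, ...] over Q, using
    dim H_n = dim C_n - rank d_n - rank d_{n+1}, with d_0 = 0.
    """
    ranks = {n: rank(m) for n, m in boundaries.items()}
    return [dims[n] - ranks.get(n, 0) - ranks.get(n + 1, 0) for n in range(len(dims))]


# Hypothetical example: a triangle with vertices v0, v1, v2 and
# edges e01, e12, e02, where d_1(e_ij) = v_j - v_i.
d1 = [
    [-1, 0, -1],  # coefficient of v0 in d_1(e01), d_1(e12), d_1(e02)
    [1, -1, 0],   # coefficient of v1
    [0, 1, 1],    # coefficient of v2
]
# One 2-cell f with d_2(f) = e01 + e12 - e02 fills the triangle.
d2 = [[1], [1], [-1]]

hollow = betti_numbers([3, 3], {1: d1})
filled = betti_numbers([3, 3, 1], {1: d1, 2: d2})
print(hollow, filled)
```

In this sketch the hollow triangle has one nontrivial homology class in degree 1 (the cycle e01 + e12 - e02), and adding the 2-cell turns that cycle into a boundary, so the class vanishes in the filled triangle.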

A morphism (or chain transformation) $f: C \to C'$ is a †natural transformation of the complexes considered as †functors $\mathbf{Z} \to \mathcal{C}$; i.e., $f$ is a family of morphisms $f^n: C^n \to C'^n$ ($n \in \mathbf{Z}$) satisfying $f^{n+1} \circ d^n = d^n \circ f^n$. It induces a morphism of cohomology $H^n(C) \to H^n(C')$. A subcomplex of $C$ is an equivalence class of †monomorphisms $D \to C$, usually denoted by any representative $D$ of the class. A (chain) homotopy between two chain transformations $f, g: C \to C'$ is a family of morphisms $h^n: C^n \to C'^{n-1}$ ($n \in \mathbf{Z}$) satisfying $f^n - g^n = h^{n+1} \circ d^n + d^{n-1} \circ h^n$. If there exists a homotopy between $f$ and $g$, then $f$ and $g$ induce the same morphism of cohomology. A morphism $f: C \to C'$ is called a (chain) equivalence if there exists a morphism $f': C' \to C$ such that $f' \circ f$ and $f \circ f'$ are homotopic to the identities of $C$ and $C'$, respectively. In this case we have $H^n(C) \cong H^n(C')$. An exact sequence of complexes $0 \to C' \to C \to C'' \to 0$ gives rise to the connecting morphisms $H^n(C'') \to H^{n+1}(C')$ ($n \in \mathbf{Z}$), and the resulting sequence $\cdots \to H^{n-1}(C'') \to H^n(C') \to H^n(C) \to H^n(C'') \to H^{n+1}(C') \to \cdots$ is exact (the exact cohomology sequence), and similarly for homology instead of cohomology. An object $A \in \mathcal{C}$ defines a complex (also denoted by $A$) such that $A^0 = A$, $d^0 = 0$. A positive complex $C$ together with a morphism $\varepsilon: A \to C$ is called a complex over $A$, and $\varepsilon$ is the augmentation. A complex $C$ over $A$ is acyclic if $0 \to A \to C^0 \to C^1 \to \cdots$ is exact. An acyclic positive complex over $A$ is called a right resolution of $A$. Let $\{C, \varepsilon\}$, $\{C', \varepsilon'\}$ be complexes over $A$ and $A'$, respectively, and $\alpha$ a morphism $A \to A'$. Then a morphism $f: C \to C'$ satisfying $f \circ \varepsilon = \varepsilon' \circ \alpha$ is called a morphism over $\alpha$. For a negative complex $C$, we define similarly augmentations $\varepsilon: C \to A$, acyclicity, left resolutions, etc.

A bicomplex (or double complex) $C$ in $\mathcal{C}$ consists of objects $C^{p,q}$ ($p, q \in \mathbf{Z}$) and two differentials $d_{\mathrm{I}}: C^{p,q} \to C^{p+1,q}$, $d_{\mathrm{II}}: C^{p,q} \to C^{p,q+1}$ subject to $d_{\mathrm{I}}^2 = d_{\mathrm{II}}^2 = 0$ and $d_{\mathrm{I}} d_{\mathrm{II}} = d_{\mathrm{II}} d_{\mathrm{I}}$ (sometimes replaced by anticommutativity, $d_{\mathrm{I}} d_{\mathrm{II}} + d_{\mathrm{II}} d_{\mathrm{I}} = 0$). Morphisms of bicomplexes are defined as for single complexes. A bicomplex $C$ becomes a (single) complex if we put $C^n = \sum_{p+q=n} C^{p,q}$ (when the sum exists) and define the differential $d$ to be $d_{\mathrm{I}} + (-1)^p d_{\mathrm{II}}$ on $C^{p,q}$. Then $d$ is called the total differential and $d_{\mathrm{I}}$, $d_{\mathrm{II}}$ the partial differentials. On the other hand, $C^{\cdot,q} = \{C^{p,q}\ (p \in \mathbf{Z}),\ d_{\mathrm{I}}\}$ constitutes a complex for each $q$, whose cohomology $H^p(C^{\cdot,q})$ is denoted by $H_{\mathrm{I}}^p(C^{\cdot,q})$. Then $d_{\mathrm{II}}$ induces morphisms $H_{\mathrm{I}}^p(C^{\cdot,q}) \to H_{\mathrm{I}}^p(C^{\cdot,q+1})$, so that we obtain a complex $H_{\mathrm{I}}^p(C)$. The cohomology of $H_{\mathrm{I}}^p(C)$ is denoted by $H_{\mathrm{II}}^q(H_{\mathrm{I}}^p(C))$. We define $H_{\mathrm{I}}^p(H_{\mathrm{II}}^q(C))$ similarly. The cohomology of $C$ with respect to the total differential is denoted simply by $H^n(C)$. Similar constructions are applied to double chain complexes $\{C_{p,q}\}$ and further to multiple complexes, as we shall show in the case of bicomplexes.

Let $T$ be a †bifunctor $\mathcal{C}_1 \times \mathcal{C}_2 \to \mathcal{C}'$ and $C_i$ be complexes in $\mathcal{C}_i$ ($i = 1, 2$). Then $T(C_1, C_2)$ is a bicomplex in $\mathcal{C}'$. For instance, $\operatorname{Hom}(C', C)$ is a positive (bipositive) complex if $C$ ($C'$) is a positive (negative) complex in $\mathcal{C}$. If $C$, $C'$ are complexes in $\mathcal{M}_R$, ${}_R\mathcal{M}$, respectively, the †tensor product $C \otimes_R C'$ is a complex in (Ab) (the product complex). There is a canonical morphism $H_p(C) \otimes H_q(C') \to H_{p+q}(C \otimes C')$. If $C_n$ and $B_n$ are †flat for all $n \in \mathbf{Z}$, we have the following exact sequence (Künneth's formula):

$$0 \to \sum_{p+q=n} H_p(C) \otimes H_q(C') \to H_n(C \otimes C') \to \sum_{p+q=n-1} \operatorname{Tor}_1(H_p(C), H_q(C')) \to 0$$

(for the definition of Tor → Section D). For $C' = A \in {}_R\mathcal{M}$, Künneth's formula reduces to the exact sequence $0 \to H_n(C) \otimes A \to H_n(C \otimes A) \to \operatorname{Tor}_1(H_{n-1}(C), A) \to 0$ (universal coefficient theorem). The corresponding exact sequence for cohomology is

$$0 \to \operatorname{Ext}^1(H_{n-1}(C), A) \to H^n(C, A) \to \operatorname{Hom}(H_n(C), A) \to 0$$

(→ Section G; 201 Homology Theory).

I. Satellites and Derived Functors

Let $\mathcal{C}$ and $\mathcal{C}'$ be Abelian categories. All functors in this section are †additive. A †covariant functor $T: \mathcal{C} \to \mathcal{C}'$ is called exact if $T$ maps every exact sequence in $\mathcal{C}$ to an exact sequence in $\mathcal{C}'$. $T$ is called half-exact, left exact, or right exact if for every short exact sequence $0 \to A \to B \to C \to 0$, the sequence $T(A) \to T(B) \to T(C)$, $0 \to T(A) \to T(B) \to T(C)$, or $T(A) \to T(B) \to T(C) \to 0$, respectively, is exact. Similar definitions apply for †contravariant functors. The functor $\operatorname{Hom}: \mathcal{C} \times \mathcal{C} \to (\mathrm{Ab})$ (which defines the category $\mathcal{C}$) is left exact in both factors. An object $P$ is projective if $h_P(\cdot) = \operatorname{Hom}(P, \cdot)$ is exact, while $Q$ is injective if $h^Q(\cdot) = \operatorname{Hom}(\cdot, Q)$ is exact. If every object $A$ admits an †epimorphism from a projective object $P \to A$ (resp. †monomorphism into an injective object $A \to Q$), $\mathcal{C}$ is said to have enough projectives (injectives). An object $G$ is called a generator (cogenerator) if the natural mapping $\operatorname{Hom}(A, B) \to \operatorname{Hom}(h_G(A), h_G(B))$ ($\operatorname{Hom}(h^G(B), h^G(A))$) is one-to-one.

An Abelian category $\mathcal{C}$ is called a Grothendieck category if (1) $\mathcal{C}$ has a generator, (2) †direct sums always exist, and (3) the identity $(\bigcup A_i) \cap B = \bigcup (A_i \cap B)$ holds for any object $A$, its †subobject $B$, and a †totally ordered family $\{A_i\}$ of subobjects. A Grothendieck cate-

gory has enough injectives (R. Baer, 1940, left derived functors L,F of a contravariant
for (Ab); A. Grothendieck, 1957, for general functor F are deiïned similarly and are isomor-
V). A monomorphism into an injective ob- phic to the left satellites when F is right exact.
ject f: A+Q is called an injective envelope if If (e has enough projectives (instead of injec-
Imf n Im g # 0 for any nonzero monomor- tives), we delïne left (right) derived functors of
phism g: B-Q. Every abject A in a Grothen- covariant (contravariant) functors via projec-
dieck category admits an injective envelope, tive resolutions. For a multifunctor, we define
which is unique up to isomorphism (B. Eck- partial derived functors as well as (total) de-
mann and A. Schopf, 1953, for Ré; B. Mit- rived functors of the functor viewed as a func-
chell, 1960, for general V). tor defined in the tproduct category. For
We say that a covariant &functor %+%?’ is instance, let T(A, B):V, x %-%?’ be contra-
given if we have a sequence of covariant func- variant in A and covariant in B. When $Y2has
tors T= { T’:~&~%?‘} and the connecting mor- enough injectives, we obtain Ri T(A, B) =
phisms d: Ti(A”)+T’+i(A’) for an arbitrary Hi( T(A, Q)) using an injective resolution Q
short exact sequence O+A’+A+A”+O satis- of B. Suppose that T satisfies condition (i)
fying the following conditions: (i) do Ti(f”) A+ T(A, B) is exact for any injective B. Then
= T’+‘(f”) o a for a morphism f of short for a fixed injective B, Ri, T(A, B) is a cohomo-
exact sequences; and (ii) the sequence + logical functor in A. When A has a projec-
T”(A”)~tT’(A’)~T’(A)~T’(A”)-tT’+’(A’)~ tive resolution P in Vi, we obtain Ri T(A, B)
constitutes a complex. T= {T’} is called a = H’(T(P, B)) as well as the equation for the
covariant a*-functor if instead of û there are total derived functor R’T(A, B) = H’(T(P, Q)).
given a* : T’(A”)+ T’-‘(A’) satisfying similar We say that a functor T is right balanced if it
conditions (i*) and (ii*). By taking duals, we satisfies (i) and also (ii) B+ T(A, B) is exact for
define the notion of contravariant a- and a*- any projective A. In this case, the three derived
functors. They are also called connected se- functors are isomorphic. The left balanced
quences of functors. A &(a*-)functor defined functors are defined similarly. When the right
for - cû <i < + CO is called a cohomological derived functors of the functor Horn (which
functor (homological functor) if the sequence in delïnes the category) exist, they are denoted by
condition (ii) (resp. (ii*)) is always exact. A Ext’(A, B).
morphism of a-functors f: S+ T consists of
natural transformations fi: Si-> T’ that com-
mute with the connecting morphisms. A a-
J. Spectral Sequences
functor S delïned for a <i < b is called universal
if for any a-functor T defined in the same in-
terval and any natural transformation <p: Sa* In this section, we deal with cohomology in
T”, there exists one and only one morphism the category Rd of R-modules. A similar
f:S+T such that f"=<p. Let F:W+W’ be a theory for homology is obtained by modifying
covariant functor and b any positive integer. A the theory in a natural way. Similar construc-
universal covariant a-functor S defined for tions are also possible for general Abelian
0 < i < b is called a right satellite of F if Sa = F categories [3, S].
(S is then denoted by {S’F}). If such an S A filtration F of a module A is a family of
exists, then it is unique and satistïes S’+‘(F)= submodules { FP(A) 1PE Z} such that FP(A) 3
S’(S’F). If %?has enough injective abjects, Fpt’(A). We say that the filtration F is con-
the right satellites always exist, and if F is left vergent from above (or exhaustive) if UPFp(A)
exact, then {Si F} is a cohomological functor. =A, and F is bounded from below (or dis-
The universality of d*-functors is delïned by crete) if FP(A) = 0 for some p. The +graded
reversing the arrows; the satellites {SiF} are module G(A)={GP(A)=FP(A)/FP+l(A)Ip~Z}
then written as {S-‘F} and called the left is said to be associated with A. A morphism of
satellites. Iïltered modules ,f: A+ A’ is a module homo-
Let %Ybe an Abelian category with enough morphism such that f(FP(A))c FP(A’). It in-
injectives. An injective resolution of an abject A duces a homomorphism of the graded modules
is a right resolution Q = { Qi} such that a11 Qi G(A)+G(A’). A filtration of a complex C=
are injective. Every A admits an injective reso- {C”, d} consists of subcomplexes FP(C) =
lution, which is unique up to chain equivalence {Fp(Cn)} such that Fp(C)xFP+‘(C). We
(H. Cartan, 1950). For a covariant functor F: assume that the complex C satisfies UP F’(C)
cg+%“, the functor A+H’(F(Q)), called the ith = C, and is bounded from below; i.e., for every
right derived functor R’F of F, is independent n there exists some p such that FP(C”)=O. In
of Q. {R’F} is a cohomological functor. By the particular, if F’(C) = C, Fp+l (Cp) = 0, the com-
universality of satellites, there exists a mor- plex C is called canonically bounded. Writing
phism of a-functors {S’F}+{R’F} which is an Cp,¶ = GP(CP+q), we obtain a tbigraded module
isomorphism if and only if F is left exact. The { Cp,q}, in which p, 4, and p + q are called the
filtration degree, the complementary degree, and the total degree, respectively.

A spectral sequence $\{E_r\}$ with a graded module $D = \{D^n\}$ as its limit (denoted by $E_2^{p,q} \Rightarrow_p D^n$) consists of a family of doubly graded modules $E_r = \{E_r^{p,q} \mid p, q \in \mathbf{Z}\}$ ($r \ge 2$ or sometimes $r \ge 1$) and differentiations $d_r: E_r^{p,q} \to E_r^{p+r,q-r+1}$ ($p, q \in \mathbf{Z}$) of degree $(r, 1 - r)$ satisfying $d_r^2 = 0$ and satisfying the following two conditions: (i) $H(E_r)$ (with respect to $d_r$) is isomorphic to $E_{r+1}$ (hence there exists a sequence of graded submodules of $E_2$: $0 = B_2 \subset B_3 \subset \cdots \subset Z_3 \subset Z_2 = E_2$ such that $Z_r/B_r \cong E_r$); and (ii) there are submodules $Z_\infty$ and $B_\infty$ such that $\bigcup_k B_k \subset B_\infty \subset Z_\infty \subset \bigcap_k Z_k$, and $E_\infty = Z_\infty/B_\infty$ is isomorphic to the doubly graded module associated with a certain filtration $F$ of $D$ (i.e., $E_\infty^{p,q} \cong G^p(D^{p+q})$). We assume that $Z_\infty = \bigcap_k Z_k$ and $B_\infty = \bigcup_k B_k$ (weak convergence). Suppose that $F$ is convergent from above and bounded from below and that $Z_k(E_2^{p,q})$ is stationary for every $p$, $q$. Then $\{E_r\}$ is called regular. $\{E_r\}$ is bounded from below if for every $n$ there exists a $p_0$ such that $E_2^{p,n-p} = 0$ for $p < p_0$. In particular, if $E_2^{p,q} = 0$ ($p < 0$ or $q < 0$), then $\{E_r\}$ is called the first quadrant (or cohomology spectral sequence). In the latter case, the edge homomorphisms $E_2^{p,0} \to E_\infty^{p,0}$, $E_\infty^{0,q} \to E_2^{0,q}$ are defined through base terms $E_r^{p,0}$ and fiber terms $E_r^{0,q}$, respectively. A morphism of spectral sequences $f: \{E_r, D\} \to \{E_r', D'\}$ consists of $f_r: E_r \to E_r'$ of degree $(0, 0)$ and $f: D \to D'$ of degree 0 which preserve the mechanism of spectral sequences. When the spectral sequences are regular, a morphism $f$ is an isomorphism if one of the $f_r$ is an isomorphism. Addition is naturally introduced in the set of morphisms so that spectral sequences form an additive category. An additive functor from an Abelian category $\mathcal{A}$ to this category is called a spectral functor. A filtered complex $\{C, F\}$ gives rise to a spectral sequence with $E_\infty^{p,q} \cong G(H(C))$ if we put $Z_r^p = \{a \in F^p(C) \mid da \in F^{p+r}(C)\}$, $B_r^p = dZ_r^{p-r}$, $E_r^p = Z_r^p/(Z_{r-1}^{p+1} + B_{r-1}^p)$, $E_r = \sum_p E_r^p$. A double complex $C = \{C^{p,q}, d_{I}, d_{II}\}$ admits two natural filtrations $F_{I}$: $F_{I}^p(C) = \sum_{s \ge p} \sum_q C^{s,q}$ and $F_{II}$: $F_{II}^p(C) = \sum_{s \ge p} \sum_q C^{q,s}$. By the procedure above, these filtrations give rise to spectral sequences $H_{I}^p(H_{II}^q(C)) \Rightarrow_p H^n(C)$ and $H_{II}^p(H_{I}^q(C)) \Rightarrow_p H^n(C)$, respectively. Comparison of these sequences yields many useful results. Let $T$ be an additive covariant functor from an Abelian category $\mathcal{A}$ to ${}_R\mathcal{M}$, $C$ be a complex in $\mathcal{A}$, and $Q = \{Q^{p,q}\}$ be an injective resolution of $C$. The double complex $T(Q)$ gives rise to spectral sequences $H^p(R^qT(C)) \Rightarrow_p H(T(Q))$ and $R^pT(H^q(C)) \Rightarrow_p H(T(Q))$. The limit $H(T(Q))$ is independent of $Q$ and is called the hypercohomology of $T$ with respect to $C$ [2, S]. We can similarly define hypercohomology of multifunctors. The theory of spectral sequences was initiated by J. Leray (1946), and suitable algebraic formulations were given by J. L. Koszul (1950).

K. Categories of Modules

The category ${}_R\mathcal{M}$ (resp. $\mathcal{M}_R$) of left (right) $R$-modules over a †unitary ring $R$ is not only an Abelian category but also a Grothendieck category (→ 277 Modules). The †full embedding theorem permits us to deduce many propositions about general Abelian categories from the consideration of ${}_R\mathcal{M}$. An object $P$ of ${}_R\mathcal{M}$ is projective if and only if it is isomorphic to a direct summand of a †free module. Any projective module is the direct sum of countably generated projective modules (I. Kaplansky, 1958). It follows that any projective module over a †local ring is free. Finitely generated projective modules $P_1$ and $P_2$ are said to be equivalent if there exist finitely generated free modules $F_1$, $F_2$ such that $P_1 \oplus F_1 \cong P_2 \oplus F_2$. The equivalence classes form an Abelian group with respect to the direct sum construction called the projective class group of a ring $R$. The category of complex †vector bundles over a compact space $X$ is equivalent to the category of projective modules over $C(X)$, the ring of complex-valued continuous functions on $X$, and similarly for other types of spaces and bundles. Many investigations have been made involving the problem of whether every projective module over a polynomial ring is free (J.-P. Serre, 1955). This problem was settled affirmatively by D. Quillen [13] and independently by A. Suslin. It has been observed that "big" projective modules are often free: for example, nonfinitely generated projective modules over an †indecomposable weakly Noetherian ring are free (Y. Hinohara, 1963).

The $n$th right derived functor of $\operatorname{Hom}_R(A, B)$ is denoted by $\operatorname{Ext}_R^n(A, B)$ (→ Section G). This is a bifunctor ${}_R\mathcal{M} \times {}_R\mathcal{M} \to (\mathrm{Ab})$, contravariant in $A$ and covariant in $B$. $\operatorname{Ext}_R^0$ is isomorphic to and identified with $\operatorname{Hom}_R$. An exact sequence $0 \to A' \to A \to A'' \to 0$ gives rise to the connecting homomorphisms $\Delta^n: \operatorname{Ext}_R^n(A', B) \to \operatorname{Ext}_R^{n+1}(A'', B)$, and the following sequence is exact: $\cdots \to \operatorname{Ext}_R^{n-1}(A', B) \to \operatorname{Ext}_R^n(A'', B) \to \operatorname{Ext}_R^n(A, B) \to \operatorname{Ext}_R^n(A', B) \to \operatorname{Ext}_R^{n+1}(A'', B) \to \cdots$ (the exact sequence of Ext). Similarly, an exact sequence $0 \to B' \to B \to B'' \to 0$ gives rise to $\Delta^n: \operatorname{Ext}_R^n(A, B'') \to \operatorname{Ext}_R^{n+1}(A, B')$ and to an exact sequence of Ext. An extension of $A$ by $B$ (or of $B$ by $A$) is an exact sequence $(E): 0 \to B \to X \to A \to 0$. The set of equivalence classes of extensions of $A$ by $B$ is in one-to-one correspondence with $\operatorname{Ext}_R^1(A, B)$ by assigning to $(E)$ its characteristic class $\chi_E = \Delta^0(1) \in \operatorname{Ext}_R^1(A, B)$,
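As a small numerical illustration of Ext computed from a projective resolution, the following Python sketch (not taken from the source) assumes $R = \mathbf{Z}$, $A = \mathbf{Z}/m$, and $B = \mathbf{Z}/n$. Applying $\operatorname{Hom}(\cdot, \mathbf{Z}/n)$ to the resolution $0 \to \mathbf{Z} \to \mathbf{Z} \to \mathbf{Z}/m \to 0$ (the first map being multiplication by $m$) leaves the two-term complex $\mathbf{Z}/n \to \mathbf{Z}/n$ given by multiplication by $m$, so $\operatorname{Ext}^0$ is its kernel and $\operatorname{Ext}^1$ its cokernel. The function name `ext_orders` is hypothetical.

```python
from math import gcd
from typing import Tuple


def ext_orders(m: int, n: int) -> Tuple[int, int]:
    """Orders of Ext^0 and Ext^1 of (Z/m, Z/n) over Z, found by brute force."""
    # Multiplication by m on Z/n is the only map left after applying Hom(-, Z/n).
    image = {(m * b) % n for b in range(n)}
    kernel = [b for b in range(n) if (m * b) % n == 0]
    ext0 = len(kernel)        # Ext^0 = Hom(Z/m, Z/n) = kernel of m
    ext1 = n // len(image)    # Ext^1 = cokernel of m = (Z/n) / m(Z/n)
    return ext0, ext1


if __name__ == "__main__":
    for m, n in [(4, 6), (5, 7), (12, 18)]:
        print(m, n, ext_orders(m, n), gcd(m, n))
```

Both orders agree with $\gcd(m, n)$ in every case tried, which matches the classification of extensions of $\mathbf{Z}/m$ by $\mathbf{Z}/n$ up to equivalence.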
where 1 denotes the identity of $\operatorname{Hom}_R(B, B)$. In this correspondence, the sum of two extensions is obtained by a construction called Baer's sum of extensions. Similarly, $\operatorname{Ext}_R^n(A, B)$ is interpreted as the set of the equivalence classes of $n$-fold extensions $0 \to B \to X_{n-1} \to \cdots \to X_0 \to A \to 0$ (exact). This point of view permits us to establish a theory of Ext, etc., in more general (additive) categories lacking enough projectives or injectives (N. Yoneda, 1954, 1960).

The tensor product $A \otimes_R B$ is a right exact covariant bifunctor $\mathcal{M}_R \times {}_R\mathcal{M} \to (\mathrm{Ab})$. If the functor $t_P(\cdot) = \cdot \otimes P$ is exact, $P$ is called a †flat module. A projective module is flat. In general, a flat module is the †inductive limit of finitely generated free modules (M. Lazard, 1964). A flat module $P$ is called †faithfully flat if $P \neq \mathfrak{m}P$ for every maximal ideal $\mathfrak{m}$ of $R$. The functors $\otimes$ and Hom are related by †adjointness (→ 52 Categories and Functors). From this viewpoint $\otimes$ can be introduced in more general categories. Left-derived functors of $A \otimes_R B$ are denoted by $\operatorname{Tor}_n^R(A, B)$ and are called $n$th torsion products of $A$ and $B$. $\operatorname{Tor}_1^R(A, B)$ is often denoted by $A *_R B$. The functor $\otimes_R$ is left balanced, hence Tor is calculated by using projective resolutions of $A$, $B$, or both $A$ and $B$. We have $\operatorname{Tor}_0^R = \otimes_R$. An exact sequence $0 \to A' \to A \to A'' \to 0$ gives rise to $\Delta_n: \operatorname{Tor}_{n+1}^R(A'', B) \to \operatorname{Tor}_n^R(A', B)$ and the infinite exact sequence of Tor, and similarly for the second variables.

From the various relations between Hom and $\otimes$ follow the corresponding relations between their derived functors. When $\Lambda$ and $\Gamma$ are algebras over $K$ and $R = \Lambda \otimes \Gamma$, we can define the external product ($\top$-product), which is a mapping

$$\top: \operatorname{Tor}_p^{\Lambda}(A, B) \otimes \operatorname{Tor}_q^{\Gamma}(A', B') \to \operatorname{Tor}_{p+q}^{R}(A \otimes A', B \otimes B').$$

In particular, if $\Lambda$ and $\Gamma$ are $K$-projective and $\operatorname{Tor}_n^K(A, A') = 0$ ($n > 0$), then we can define the wedge product ($\vee$-product) $\vee: \operatorname{Ext}_\Lambda^p(A, B) \otimes \operatorname{Ext}_\Gamma^q(A', B') \to \operatorname{Ext}_R^{p+q}(A \otimes A', B \otimes B')$. The latter is described in terms of the composition of module extensions. When $K = \Lambda = \Gamma = R$, the $\top$-product reduces to the internal product, called the m-product. If $\Lambda$ is a †Hopf algebra over $K$, the †comultiplication $\Lambda \to \Lambda \otimes \Lambda$ induces $\operatorname{Ext}_{\Lambda \otimes \Lambda} \to \operatorname{Ext}_\Lambda$. This, combined with the $\vee$-product, yields the cup product ($\smile$-product) $\smile: \operatorname{Ext}_\Lambda^p(A, B) \otimes \operatorname{Ext}_\Lambda^q(A', B') \to \operatorname{Ext}_\Lambda^{p+q}(A \otimes A', B \otimes B')$. We define similarly $\perp$-product, $\wedge$-product, w-product, and $\frown$-product (cap product) [2]. Let $\Lambda$, $\Gamma$, and $\Sigma$ be algebras over $K$, with $\Lambda$ $K$-projective; let $A \in \mathcal{M}_{\Lambda \otimes \Gamma}$, $B \in {}_\Lambda\mathcal{M}_\Sigma$, $C \in \mathcal{M}_{\Gamma \otimes \Sigma}$, and assume $\operatorname{Tor}_n^\Lambda(A, B) = 0$ ($n > 0$). The natural isomorphism $\operatorname{Hom}_{\Lambda \otimes \Gamma}(A, \operatorname{Hom}_\Sigma(B, C)) \cong \operatorname{Hom}_{\Gamma \otimes \Sigma}(A \otimes_\Lambda B, C)$ then yields a spectral sequence $\operatorname{Ext}_{\Lambda \otimes \Gamma}^p(A, \operatorname{Ext}_\Sigma^q(B, C)) \Rightarrow_p \operatorname{Ext}_{\Gamma \otimes \Sigma}^n(A \otimes_\Lambda B, C)$ by the double complex argument in Section D.

The homological dimension $\operatorname{h\,dim}_R A$, $\operatorname{dh}_R A$, or projective dimension $\operatorname{proj\,dim}_R A$ of $A \in {}_R\mathcal{M}$ is the supremum ($\le \infty$) of $n$ such that $\operatorname{Ext}_R^n(A, B) \neq 0$ for some $B$. The relation $\operatorname{h\,dim}_R A \le 0$ means that $A$ is projective. The injective dimension $\operatorname{inj\,dim}_R B$ of $B \in {}_R\mathcal{M}$ is defined similarly by means of the functor $\operatorname{Ext}_R^n(\cdot, B)$, and the weak dimension $\operatorname{w\,dim}_R C$ of $C \in {}_R\mathcal{M}$ by the functor $\operatorname{Tor}_n^R(\cdot, C)$. The common value $\sup\{\operatorname{proj\,dim}_R A \mid A \in {}_R\mathcal{M}\} = \sup\{\operatorname{inj\,dim}_R B \mid B \in {}_R\mathcal{M}\}$ is called the left global dimension $\operatorname{l\,gl\,dim} R$ of $R$. It is identical with the supremum of homological dimensions of †cyclic modules (M. Auslander, 1955). The right global dimension $\operatorname{r\,gl\,dim} R$ is defined similarly. The common value $\sup\{\operatorname{w\,dim}_R A \mid A \in \mathcal{M}_R\} = \sup\{\operatorname{w\,dim}_R C \mid C \in {}_R\mathcal{M}\}$ is called the weak global dimension $\operatorname{w\,gl\,dim} R$ of $R$. We have $\operatorname{w\,gl\,dim} R \le \operatorname{l\,gl\,dim} R$, $\operatorname{r\,gl\,dim} R$. The equality may fail to hold (Kaplansky, 1958). If $R$ is †Noetherian, the three global dimensions coincide (Auslander, 1955) and are called simply the global dimension of $R$: $\operatorname{gl\,dim} R$. The condition $\operatorname{l\,gl\,dim} R = 0$ (or $\operatorname{r\,gl\,dim} R = 0$) holds if and only if $R$ is an †Artinian semisimple ring, while $\operatorname{w\,gl\,dim} R = 0$ if and only if $R$ is a †regular ring in the sense of J. von Neumann (M. Harada, 1956). A ring $R$ is called left (right) hereditary if $\operatorname{l\,gl\,dim} R \le 1$ ($\operatorname{r\,gl\,dim} R \le 1$), and left (right) semihereditary if every finitely generated left (right) ideal is projective. A left and right (semi)hereditary ring is called a (semi)hereditary ring. Since projectivity and †invertibility of an ideal of a (commutative) integral domain $R$ are equivalent, $R$ is hereditary if and only if $R$ is a †Dedekind ring. In this case, the projective class group of $R$ reduces to the †ideal class group. An integral domain $R$ is semihereditary if and only if $\operatorname{w\,gl\,dim} R \le 1$ (A. Hattori, 1957), and in that case $R$ is called a Prüfer ring. A †maximal order over a Dedekind ring is hereditary. A commutative semihereditary ring $R$ is characterized by the property that flatness of $R$-modules is equivalent to torsion-freeness (S. Endo, 1961). A Noetherian ring $R$ is left self-injective if and only if $R$ is †quasi-Frobenius (M. Ikeda, 1952), and the global dimension of a quasi-Frobenius ring is 0 or $\infty$ (S. Eilenberg and T. Nakayama, 1955). A polynomial ring $R = K[X_1, \ldots, X_n]$ over a commutative ring $K$ satisfies $\operatorname{gl\,dim} R = \operatorname{gl\,dim} K + n$. When $K$ is a field, this is a reformulation of Hilbert's theory of syzygy sequences (→ 369 Rings of Polynomials). In this sense, the study of the global dimension of rings and categories is sometimes called syzygy theory (Eilenberg, 1956). The homological algebra of commutative Noetherian rings has been studied extensively and
is useful in algebraic geometry. Since $\operatorname{gl\,dim} R = \sup_{\mathfrak{m}} \operatorname{gl\,dim} R_{\mathfrak{m}}$ ($R_{\mathfrak{m}}$ is the †ring of quotients relative to $\mathfrak{m}$), with $\mathfrak{m}$ running over the maximal ideals of $R$, the problem of determining $\operatorname{gl\,dim} R$ reduces to the case of †local rings. A finitely generated flat module over a local ring $R$ is free. If $K$ denotes the residue field $R/\mathfrak{m}$, where $\mathfrak{m}$ is the maximal ideal of the local ring $R$, $\operatorname{Tor}^R(K, K)$ has the structure of a Hopf algebra (E. F. Assmus, Jr., 1959). Detailed results concerning the Betti numbers $\dim \operatorname{Tor}_n^R(K, K)$ of $R$ have been obtained (J. Tate, 1957, and others). In particular, $R$ is †regular if and only if $\operatorname{gl\,dim} R < \infty$ (Serre, 1955). A local ring $R$ is called a Gorenstein ring if the injective dimension of the $R$-module $R$ is finite. This is a notion intermediate between regular rings and †Macaulay rings (→ 284 Noetherian Rings).

Consideration of a ring $R$ in relation to a subring $S$ leads to relative homological algebra. Foundations for this theory were established by Hochschild (1956). An exact sequence of $R$-modules that †splits as a sequence of $S$-modules is called an $(R, S)$-exact sequence. An $R$-module $P$ is called an $(R, S)$-projective module if $\operatorname{Hom}_R(P, \cdot)$ maps any $(R, S)$-exact sequence to an exact sequence. $(R, S)$-injective modules are defined similarly. Based on these notions, $\operatorname{Ext}_{(R,S)}$ and $\operatorname{Tor}^{(R,S)}$ are defined as the relative derived functors of $\operatorname{Hom}_R$ and $\otimes_R$, respectively. We also have a relative theory from a different viewpoint (S. Takasu, 1957). Relative theory is extended to general categories from various viewpoints [6, 14] (→ Section Q).

L. Cohomology Theory for Associative Algebras

Let $\Lambda$ be an †algebra over a commutative ring $K$ and $A$ a †two-sided $\Lambda$-module. Let $C^n$ be the module of all $n$-linear mappings of $\Lambda$ into $A$ called $n$-cochains ($C^0 = A$). Define the coboundary operator $\delta^n: C^n \to C^{n+1}$ by

$$(\delta^n f)(\lambda_1, \ldots, \lambda_{n+1}) = \lambda_1 f(\lambda_2, \ldots, \lambda_{n+1}) + \sum_{i=1}^{n} (-1)^i f(\lambda_1, \ldots, \lambda_i\lambda_{i+1}, \ldots, \lambda_{n+1}) + (-1)^{n+1} f(\lambda_1, \ldots, \lambda_n)\lambda_{n+1}.$$

We thus obtain a complex whose cohomology is denoted by $H^n(\Lambda, A)$ and is called the $n$th Hochschild's cohomology group of $\Lambda$ relative to the coefficient module $A$ (Hochschild, 1945). A cochain $f$ is called normalized if $f(\lambda_1, \ldots, \lambda_n) = 0$ whenever one of the $\lambda_i$ is 1. We obtain the same cohomology group $H^n(\Lambda, A)$ from the subcomplex of normalized cochains. $\{H^n(\Lambda, \cdot)\}$ is a cohomological functor from the category ${}_\Lambda\mathcal{M}_\Lambda$ of two-sided $\Lambda$-modules to the category ${}_K\mathcal{M}$. Using the enveloping algebra $\Lambda^e = \Lambda \otimes_K \Lambda^0$, where $\Lambda^0$ is an anti-isomorphic copy of $\Lambda$, ${}_\Lambda\mathcal{M}_\Lambda$ can be identified with ${}_{\Lambda^e}\mathcal{M}$ (and $\mathcal{M}_{\Lambda^e}$). If $\Lambda$ is $K$-projective, $\{H^n(\Lambda, \cdot)\}$ is isomorphic to $\{\operatorname{Ext}_{\Lambda^e}^n(\Lambda, \cdot)\}$. (In [2], $H^n(\Lambda, A)$ is defined as $\operatorname{Ext}_{\Lambda^e}^n(\Lambda, A)$ in general.) We have $H^0(\Lambda, A) = \{a \in A \mid \lambda a = a\lambda,\ \forall \lambda \in \Lambda\}$. We call 1-cocycles derivations (or crossed homomorphisms) of $\Lambda$ in $A$ and 1-coboundaries inner derivations. Thus $H^1(\Lambda, A)$ is the derivation class group and is related to the †ramification. When $K$ is a field, $H^1(\Lambda, \cdot) = 0$ if and only if $\Lambda$ is a †separable algebra. In general, an algebra $\Lambda$ over a commutative ring $K$ is called a separable algebra if $\Lambda$ is $\Lambda^e$-projective, i.e., if $\operatorname{Ext}_{\Lambda^e}^1(\Lambda, \cdot) = 0$ (Auslander and O. Goldman, 1960). This is a generalization of the notion of †maximally central algebras (Nakayama and G. Azumaya, 1948). We have a one-to-one correspondence of $H^2(\Lambda, A)$ to the family of algebra extensions of $\Lambda$ with kernel $A$ (i.e., $K$-algebras $\Gamma$ containing $A$ as a two-sided ideal such that $\Gamma/A = \Lambda$) satisfying $A^2 = 0$. Any extension of an algebra $\Lambda$ over a field $K$ such that $H^2(\Lambda, \cdot) = 0$ splits over a nilpotent kernel (J. H. C. Whitehead and G. Hochschild). This holds in particular for a separable algebra, and we obtain the †Wedderburn-Mal'tsev theorem. There are some interpretations of $H^3(\Lambda, A)$ in terms of extensions.

The supremum ($\le \infty$) of $n$ such that $H^n(\Lambda, A) \neq 0$ for some $A$ is called the cohomological dimension of $\Lambda$ and written $\dim \Lambda$. For a finite-dimensional algebra $\Lambda$ over a field $K$, $\dim \Lambda < \infty$ if and only if $\Lambda/N$ is separable and $\operatorname{gl\,dim} \Lambda < \infty$, where $N$ is the †radical of $\Lambda$ (M. Ikeda, H. Nagao, and Nakayama, 1954).

The homology groups $H_n(\Lambda, A)$ of $\Lambda$ relative to a coefficient module $A$ are defined similarly. If $\Lambda$ is $K$-projective, $\{H_n(\Lambda, \cdot)\}$ is isomorphic to $\{\operatorname{Tor}_n^{\Lambda^e}(\cdot, \Lambda)\}$.

M. Cohomology of Groups

The pair consisting of an algebra $\Lambda$ over $K$ and an algebra homomorphism $\varepsilon: \Lambda \to K$ is called a supplemented algebra [2] (or augmented algebra [S]), of which $\varepsilon$ is the augmentation. The †group algebra $\mathbf{Z}[G]$ of a group $G$ over the ring of rational integers is a supplemented algebra, in which the augmentation is defined by $\varepsilon(x) = 1$ ($x \in G$). The category of left $G$-modules is identified with the category of left $\mathbf{Z}[G]$-modules. For a finite group $G$, a finitely generated projective $G$-module is not necessarily free (D. S. Rim, 1959) and is isomorphic to the direct sum of a free module and a left ideal of $\mathbf{Z}[G]$. It follows that the projective class group of $\mathbf{Z}[G]$ is a finite group (R. G. Swan, 1960). The cohomology groups and homology groups of $G$ relative to $A \in {}_G\mathcal{M}$ (Eilenberg and S. MacLane, 1943) are defined
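The Hochschild coboundary formula of Section L can be checked by brute force on a small example. The Python sketch below is an illustration only, under assumptions that are not in the source: $K = \mathbf{Z}/3$, $\Lambda$ is the algebra of upper triangular $2 \times 2$ matrices over $K$ (stored as triples `(a, b, d)`), and the coefficient module is $A = \Lambda$ with left and right multiplication. It verifies $\delta^2 \circ \delta^1 = 0$ on one linear 1-cochain and computes $H^0(\Lambda, \Lambda)$ as the set of elements commuting with all of $\Lambda$. All function names are hypothetical.

```python
from itertools import product

P = 3  # assumed ground field K = Z/3
ELEMS = list(product(range(P), repeat=3))  # (a, b, d) stands for [[a, b], [0, d]]


def mul(x, y):
    """Product of two upper triangular 2x2 matrices over Z/P."""
    a, b, d = x
    e, f, h = y
    return ((a * e) % P, (a * f + b * h) % P, (d * h) % P)


def add(x, y):
    return tuple((s + t) % P for s, t in zip(x, y))


def scale(c, x):
    return tuple((c * s) % P for s in x)


def coboundary(f, n):
    """Return delta^n f, a function of n + 1 arguments, for an n-cochain f."""
    def df(*lam):
        total = mul(lam[0], f(*lam[1:]))              # lambda_1 f(lambda_2, ...)
        for i in range(1, n + 1):                     # terms merging lambda_i lambda_{i+1}
            merged = lam[:i - 1] + (mul(lam[i - 1], lam[i]),) + lam[i + 1:]
            total = add(total, scale((-1) ** i, f(*merged)))
        last = mul(f(*lam[:n]), lam[n])               # f(lambda_1, ..., lambda_n) lambda_{n+1}
        return add(total, scale((-1) ** (n + 1), last))
    return df


def is_zero_cochain(g, n):
    """True if the n-argument cochain g vanishes on every n-tuple."""
    return all(g(*args) == (0, 0, 0) for args in product(ELEMS, repeat=n))


M = ((1, 2, 0), (0, 1, 1), (2, 0, 1))  # an arbitrary K-linear map Lambda -> Lambda


def f1(x):
    return tuple(sum(M[r][c] * x[c] for c in range(3)) % P for r in range(3))


def center():
    """H^0(Lambda, Lambda): elements a with lambda a = a lambda for all lambda."""
    return [a for a in ELEMS if is_zero_cochain(coboundary(lambda: a, 0), 1)]


if __name__ == "__main__":
    print(is_zero_cochain(coboundary(coboundary(f1, 1), 2), 3))
    print(center())
```

In this example the 0-cocycles are exactly the scalar matrices, in line with the description of $H^0(\Lambda, A)$ as the elements $a$ with $\lambda a = a\lambda$.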
by $H^n(G, A) = \operatorname{Ext}_{\mathbf{Z}[G]}^n(\mathbf{Z}, A)$ and $H_n(G, A) = \operatorname{Tor}_n^{\mathbf{Z}[G]}(\mathbf{Z}, A)$, respectively. Their concrete description is given usually via the $\mathbf{Z}[G]$-standard resolution of $\mathbf{Z}$.

(1) Homogeneous formulation. The group of homogeneous $n$-chains is the free Abelian group with basis $G \times \cdots \times G$ ($n + 1$ times), on which $G$ operates by $x(x_0, \ldots, x_n) = (xx_0, \ldots, xx_n)$, and the boundary operator is defined by $d(x_0, \ldots, x_n) = \sum_{i=0}^{n} (-1)^i (x_0, \ldots, \hat{x}_i, \ldots, x_n)$.

(2) Nonhomogeneous formulation. The group of nonhomogeneous $n$-chains is the $\mathbf{Z}[G]$-free module with basis $G \times \cdots \times G$ ($n$ times), and the boundary operator is defined by $d(x_1, \ldots, x_n) = x_1(x_2, \ldots, x_n) + \sum_{i=1}^{n-1} (-1)^i (x_1, \ldots, x_ix_{i+1}, \ldots, x_n) + (-1)^n (x_1, \ldots, x_{n-1})$. A nonhomogeneous 2-cocycle is sometimes called a factor set. $H^0(G, A)$ is the submodule $A^G$ of $A$ consisting of the $G$-invariant elements, while $H_0(G, A)$ is the largest residue class module $A_G$ of $A$ on which $G$ acts trivially. Given two groups $G$ and $K$, an exact sequence of group homomorphisms $1 \to K \to E \to G \to 1$ is called a group extension of $G$ over the kernel $K$. When $K$ is Abelian, the extension canonically induces a $G$-module structure on $K$, and the deviation of $K$ from being a semidirect factor of $E$ is measured by a factor set. The group $H^2(G, A)$ is thus in one-to-one correspondence with the set of equivalence classes of the †group extensions of $G$ over $A$ which induce the originally given $G$-module structure on $A$ (→ 190 Groups N). This point of view is essential in the proof of the †Schur-Zassenhaus theorem (→ 151 Finite Groups). $H^3(G, A)$ is interpreted as the set of obstructions for extensions (Eilenberg and MacLane, 1947). For a †free group $F$, $H^n(F, A) = 0$ ($n > 1$). If a group $G$ is represented as a factor group $F/R$ of a free group $F$, we have a group extension $1 \to K \to E \to G \to 1$, where $K = R/[R, R]$ and $E = F/[R, R]$. Let $\xi \in H^2(G, K)$ correspond to this extension. Then for any $G$-module $A$, the cup product $\chi \to \chi \smile \xi$ followed by the pairing $\operatorname{Hom}(K, A) \otimes K \to A$ provides isomorphisms $H^n(G, \operatorname{Hom}(K, A)) \cong H^{n+2}(G, A)$ ($n > 0$) (the cup product reduction theorem of Eilenberg and MacLane, 1947); similarly, we have the reduction theorem for the homology. The $\mathbf{Z}$-algebra $H(G, \mathbf{Z}) = \sum_{n=0}^{\infty} H^n(G, \mathbf{Z})$ under the multiplication defined by the cup product is finitely generated if $G$ is a finite group (B. B. Venkov, 1959; L. Evens, 1961).

The following are mappings relative to a subgroup $H$.

(1) The inner automorphism by $x \in G$ induces an isomorphism of $H^n(H, A)$ and $H^n(xHx^{-1}, A)$ which reduces to the identity of $H^n(G, A)$ if $H = G$. Hence if $H$ is a normal subgroup, $H^n(H, A)$ has the structure of a $G/H$-module, and similarly for $H_n(H, A)$.

(2) When $H$ is a normal subgroup of $G$, the mapping $(x_1, \ldots, x_n) \to (x_1H, \ldots, x_nH)$ of nonhomogeneous chains induces the inflation (or lift) $\operatorname{Inf}: H^n(G/H, A^H) \to H^n(G, A)$ and the deflation $\operatorname{Def}: H_n(G, A) \to H_n(G/H, A_H)$.

(3) The embedding of nonhomogeneous chains induces the restriction $\operatorname{Res}: H^n(G, A) \to H^n(H, A)$ and the injection Inj (or corestriction Cor): $H_n(H, A) \to H_n(G, A)$. The theory of †induced representation gives another construction of these mappings; that is, if we put $I^G(A) = \operatorname{Hom}_{\mathbf{Z}[H]}(\mathbf{Z}[G], A)$, Res is obtained by the isomorphism $H^n(G, I^G(A)) \cong H^n(H, A)$ combined with the homomorphism induced by $A \to I^G(A)$; while if we put $I_G(A) = \mathbf{Z}[G] \otimes_{\mathbf{Z}[H]} A$, Inj is obtained by the isomorphism $H_n(H, A) \cong H_n(G, I_G(A))$ followed by the homomorphism induced by $I_G(A) \to A$.

(4) If $(G : H) < \infty$, we have $I_G(A) \cong I^G(A)$. The composition of $H^n(H, A) \to H^n(G, I_G(A)) \to H^n(G, A)$ defines on the cohomology groups the $\operatorname{Inj}: H^n(H, A) \to H^n(G, A)$, while the composition $H_n(G, A) \to H_n(G, I^G(A)) \to H_n(H, A)$ gives the $\operatorname{Res}: H_n(G, A) \to H_n(H, A)$. In particular, $\operatorname{Res}: H_1(G, \mathbf{Z}) \to H_1(H, \mathbf{Z})$ coincides with the †transfer $G/[G, G] \to H/[H, H]$.

(5) Let $H$ be a normal subgroup of $G$. Consider the additive relation $\rho$ (the correspondence) between $h \in Z^n(H, A)^G$ and $f \in Z^{n+1}(G/H, A^H)$ determined by $\rho(h, f)$ if and only if there exists a $g \in C^n(G, A)$ such that $h = \operatorname{Res} g$ and $\operatorname{Inf} f = \delta g$. If the relation induces a homomorphism $H^n(H, A)^G \to H^{n+1}(G/H, A^H)$, then it is called the transgression. If $H^i(H, A) = 0$ ($0 < i < n$), the sequence $0 \to H^n(G/H, A^H) \to H^n(G, A) \to H^n(H, A)^G \to H^{n+1}(G/H, A^H) \to H^{n+1}(G, A)$ composed of inflation, restriction, and transgression mappings is exact (Hochschild and Serre, 1953) and is called the fundamental exact sequence. This exact sequence can be derived from a certain spectral sequence $H^p(G/H, H^q(H, A)) \Rightarrow_p H^n(G, A)$ (R. C. Lyndon, 1948; Hochschild and Serre, 1953).

The relative (co)homology theory relative to a subgroup (I. T. Adamson, 1954) can be dealt with in terms of the relative Ext and the relative Tor (Hochschild, 1956). Many results in the absolute case are generalized to the relative case: for example, the fundamental exact sequence (Nakayama and Hattori, 1958). The relative theory is further generalized to the cohomology theory of †permutation representations of $G$ (E. Snapper, 1964).

Non-Abelian Cohomology. For a non-Abelian $G$-group $A$, the cohomology "set" $H^1(G, A)$ (and $H^0(G, A)$) is defined as in the Abelian case by means of the nonhomogeneous cochains (→ e.g., [9]). Some efforts are being made toward the construction of a more general non-
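The nonhomogeneous description of $H^2(G, A)$ by factor sets can be made concrete for very small groups. The following Python sketch is an illustration under assumptions not found in the source: $G = \mathbf{Z}/n$ is written additively, $A = \mathbf{Z}/m$, and a generator of $G$ acts on $A$ by multiplication by `action` (which must satisfy `action**n % m == 1`). It enumerates all 2-cochains, keeps those satisfying the cocycle condition obtained by dualizing the boundary formula in (2), and divides by the number of 2-coboundaries. The function name `h2_order` is hypothetical.

```python
from itertools import product


def h2_order(n, m, action=1):
    """Order of H^2(Z/n, Z/m) where x in Z/n acts on a by action**x * a (mod m)."""
    group = range(n)
    pairs = list(product(group, repeat=2))

    def act(x, a):
        return (pow(action, x, m) * a) % m

    def is_cocycle(f):
        # (delta f)(x, y, z) = x f(y, z) - f(x + y, z) + f(x, y + z) - f(x, y)
        return all(
            (act(x, f[(y, z)]) - f[((x + y) % n, z)]
             + f[(x, (y + z) % n)] - f[(x, y)]) % m == 0
            for x, y, z in product(group, repeat=3)
        )

    cocycles = 0
    for values in product(range(m), repeat=len(pairs)):
        if is_cocycle(dict(zip(pairs, values))):
            cocycles += 1

    coboundaries = set()
    for g in product(range(m), repeat=n):
        # (delta g)(x, y) = x g(y) - g(x + y) + g(x)
        coboundaries.add(tuple(
            (act(x, g[y]) - g[(x + y) % n] + g[x]) % m for x, y in pairs
        ))

    return cocycles // len(coboundaries)


if __name__ == "__main__":
    print(h2_order(2, 4))            # trivial action
    print(h2_order(2, 4, action=3))  # action by a -> -a
```

For $G = \mathbf{Z}/2$ acting trivially on $\mathbf{Z}/4$ the count is 2, matching the two extension classes represented by $\mathbf{Z}/4 \times \mathbf{Z}/2$ and $\mathbf{Z}/8$. The enumeration is exponential in $n^2$, so it is only practical for tiny groups.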
Abelian theory (J. Giraud, Cohomologie non 0. Cohomology Theory of Lie Algebras
ubeliénne, Springer, 1971).
Let g be a +Lie algebra over a commutative
ring K, and assume that g is K-free. The +en-
veloping algebra U = U(g) is a tsupplemented
N. Finite Groups
algebra over K. For a g-module (= U-module)
A, Extt(K, A) and Tory(K, A) are called the
Let G be a Imite group and A a G-module. cohomology groups H”(g, A) and homology
Define the norm N : A + A by N(a) = ZxeGxa, groups H,(g, A), respectively, of g relative to
and denote KerN by ,,,A. The kernel of the the coefficient module A. They are usually
augmentation E: Z [G] + Z is denoted by 1. Put described by means of the U-free resolution
I?(G,A)=H”(G,A) (n>O), fi’(G,A)=A’/NA, U @ AK(g) of K (called the standard com-
l?‘(G, A)=.A/IA, and fimn(G, A)=H,-,(G, A) plex of g) constructed by C. Chevalley and
(n > 1). Then {@(G, .)} forms a cohomological Eilenberg (1948), where &(g) is the exterior
functor, called the Tate cohomology (E. Artin algebra of the K-module g and (denoting
and J. T. Tate), that cari be described as the set 1 @(x,A...~x,) by (xi, . . . ,x,)) the differenti-
of cohomology groups concerning a certain ation is given by
complex called a complete free resolution of Z.
(Similar arguments are valid more generally 4x ,,"', x,)= i ( -l)‘+‘xi(x,, . ..> xi, . ..> x,)
for quasi-Frobenius rings (Nakayama, 1957), i=l
and a theory of this kind is called complete +I<C,,,(-l)i+j(Cxi,xjl, x1> ...1
, \
cohomology theory.) We have fi”(G, A)=0
ii, .. , x,, . . , X”).
(n E Z) if and only if h dimzlcl A < 1 (Naka-
yama, 1957). If A satisftes the conditions (i) Forn>[g:K], H”(g,A)=H,(g,A)=O.Ifgisa
l?‘(G,, A) = 0 for any Sylow p-subgroup G, tsemisimple Lie algebra over a field K of char-
of G, and (ii) there exists a <E fi’(G, A) such acteristic 0, we have H’ (g, A) = 0, H2( g, A) = 0,
that RestEAZ(Gp, A) has the same order as G, while H3(g, K)#O. H’(g, A)=0 is equivalent to
and generates ah of fi’(G,, A), then the homo- Weyl’s theorem, which asserts the complete
morphisms I?“(H, B)+fi”+‘(H, A @B) (FEZ) reducibility of finite-dimensional representa-
defïned by the cup product with Res 5 are tions (- 248 Lie Algebras E). H2(g,A) and
isomorphisms for every subgroup H and every H3(g, A) are interpreted by means of Lie
G-module B such that Tor(A, B)=O (Naka- algebra extensions as in the cohomology of
yama, 1957; for B=Z, Tate, 1952). If G is groups. The theorem on +Levi decomposition
cyclic, the mappings fi”(A)+fi”‘2(A) (FEZ) is derived from H2(g, A) = 0. Chevalley and
dehned by the cup product with a generator Eilenberg constructed this cohomology by
of fi2(Z) are isomorphisms. (The notation is algebraization of the cohomology of compact
abbreviated by omitting G.) If the orders of +Lie groups. They also introduced the notion
f?‘(A) and f?‘(A) are finite, their ratio is called of relative cohomology groups H”(g, 6, A)
the Herbrand quotient h(A) of A. If O+ A’+ A relative to a Lie subalgebra h of g, which cor-
+A”+0 is exact, then h(A)=h(A’)h(A”). If A is respond to the cohomology of homogeneous
finite, then h(A) = 1. By combining these two spaces. H”(g, h, A) does not always coincide
facts we obtain Herbrand’s lemma: If A’ is a G- with Extbc,),uth)(K, A) (Hochschild, 1956), but
submodule of A of fïnite index and h(A’) exists, does SO in an important case where K is a field
then h(A) also exists and h(A)=h(A’). The of characteristic 0 and b is treductive in g (-
periodicity fin(A)= fin+P(A) (FEZ, AE~&) 248 Lie Algebras).
holds if and only if every Sylow subgroup is For ttransformation spaces of tlinear alge-
cyclic or a tgeneralized quaternion group brait groups G over a field K, the rational
(Artin and Tate; [2]). cohomology groups are introduced using the
Let L/K be a finite +Galois extension with notion of rational injectivity (Hochschild,
the +Galois group G. The cohomology groups 1961). In particular, if G is a Qnipotent alge-
of various types of G-modules related to L/K brait group over a tïeld K of characteristic 0,
are called the Galois cohomology groups (- then H(G, A) is isomorphic to H(g, A), where g
172 Galois Theory). Using continuous cocycles, is the Lie algebra of G. There is also a relative
a cohomology theory (Tate cohomology) is theory.
developed for intïnite Galois extensions as well
[9,10]. By means of Galois cohomology (- 59
Class Field Theory), the cohomology theory of
finite groups and of ttotally disconnected P. Amitsur Cohomology
compact groups (which are tprofïnite groups)
plays an important role in class lïeld theory Let R be a commutative ring, and F a co-
and its related areas. variant functor from the category ?ZRof com-
200 Q 760
Homological Algebra

mutative R-algebras to the category of Abelian $ex X.-t A in Zd,


groups. For SE ‘e, and n = 0, 1,2, , we Write
+X,,b=Xn/...!%X$A, didi+, =O,
S(“) = S @ . . .@ S (n-fold tensor product over
R). Let&i:S(“‘1)~S(n’2)(i=0,1,...,n+1)be is called a Y-projective resolution of A if(i)
Ce,-morphisms detïned by ei(xO @ 0 x,) = X,E~ (n>O), and (ii) it is Sacyclic (i.e., the
x0 @ . . 0 xi-i @ 1 @xi @Q. . . 0 x,. Delïning sequence
d”:F(S(“+‘))+F(S(“+2)) by d”=C;:;(-l)‘F(a,),
~z~(P,x”)b:z~d(P,X,~l)~...
we obtain a cochain complex {F@“+i)), d”}.
This complex and its cohomology groups are rf;Z&‘(P, X,) rf=Z,zZ(P, A)+0
called the Amitsur complex and the Amitsur
is exact for any PEY [16]). Note that the
cohomology groups, and are usually denoted
y-acyclicity in this case implies the existence
by C(S/R, F) and H”(S/R, F), respectively.
of a contracting homotopy
If SIR is a lïnite Galois extension with
Galois group G, then the group H”(S/R, U) h,:Z.d(P, X,)-+Z4P, X.+1),
of the unit group functor U is naturally iso-
na-l,Xm,=A,
morphic to H”(G, U(S)). If SIR is a finite pure-
ly inseparable extension, then H”(S/R, U) =0 satisfying d,+i h, + h,-, d, = 1.
for n 2 3. The group H’(S/R, U) is related to Comparison theorem: Given two %y-
the tBrauer group B(S/R) (- 29 Associative projective resolutions X. + A and Y. -+ A, we
Algebras K). have that the chain complexes X, and Y. are
chain equivalent in Zd.
Let $T : \mathcal{A} \to \mathcal{B}$ be a (covariant) functor with $\mathcal{B}$
Q. Relative Theory an Abelian category. The nth left derived func-
tor $L_n T : \mathcal{A} \to \mathcal{B}$ ($n \geq 0$) of T, with respect to
In the course of the development of homolog- $\mathcal{P}$, is defined by $L_n T(A) = H_n(T X_*)$ for a $\mathcal{P}$-
ical algebra, it has been recognized that the projective resolution $X_* \to A$. The derived func-
notion of projective (resp. injective) resolutions tors $L_n T$ will remain unaffected if we replace
should be generalized ([14,4]; Hochschild, the projective class $\mathcal{P}$ by its enlargement $\bar{\mathcal{P}} =$
1956). In the meantime, a method has been {objects in $\mathcal{P}$ together with their retracts}.
introduced that utilizes simplicial objects in We can easily verify that (i) $L_0 T(P) = T(P)$
order to define the derived functor of an arbi- and $L_n T(P) = 0$ ($n > 0$) for $P \in \mathcal{P}$, and (ii) a
trary functor with Abelian category as its range short exact sequence $0 \to T' \to T \to T'' \to 0$ of
([15,17]; J. Beck, 1967). As a consequence of functors $\mathcal{A} \to \mathcal{B}$ induces a long exact sequence
these developments, there emerged a view- of derived functors
point, which we describe below, making it
possible to unify various known definitions of
(co-)homology theories that have been designed $\cdots \to L_0 T'' \to 0$.
for particular cases.
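Let $\mathcal{A}$ be a category and $\mathcal{P}$ a class of ob- If $\mathcal{A}$ is preadditive and has a zero object
jects in $\mathcal{A}$. In this section, we denote the and kernels, then it is routine to give a $\mathcal{P}$-
set $\mathrm{Hom}_{\mathcal{A}}(A, B)$ of morphisms by $\mathcal{A}(A, B)$. projective resolution of any object. If $\mathcal{A}$ has
A morphism $f : A \to B$ in $\mathcal{A}$ is called a $\mathcal{P}$- finite limits (i.e., finite products and equal-
epimorphism if the induced mapping $\mathcal{A}(P, f)$ izers), it can be proved that there exists a
(= Hom(P, f)) : $\mathcal{A}(P, A) \to \mathcal{A}(P, B)$ is surjective $\mathcal{P}$-projective resolution for any object [17].
for any $P \in \mathcal{P}$. The class $\mathcal{P}$ is called a projective There is a standard functorial construction
class in $\mathcal{A}$ if there exist an object $P \in \mathcal{P}$ and a which provides canonically a projective class
$\mathcal{P}$-epimorphism $f : P \to A$ for each object $A \in \mathcal{A}$. $\mathcal{P}$ in a category $\mathcal{A}$ and a $\mathcal{P}$-projective reso-
To any category $\mathcal{A}$, we associate a preaddi- lution of any object in $\mathcal{A}$. Let $(G, \varepsilon, \delta)$ be a
tive category $\mathbf{Z}\mathcal{A}$, adding a zero object to $\mathcal{A}$ if cotriple (or comonad [18] or functor coalge-
necessary: Put Ob $\mathbf{Z}\mathcal{A}$ = Ob $\mathcal{A}$, $\mathbf{Z}\mathcal{A}(A, B)$ = free bra) in $\mathcal{A}$. Here $G : \mathcal{A} \to \mathcal{A}$ is an endofunctor,
Abelian group generated by the set $\mathcal{A}(A, B)$. $\mathcal{A}$ $\varepsilon : G \to \mathrm{Id}$ and $\delta : G \to G^2 = GG$ are natural trans-
is regarded as a subcategory of $\mathbf{Z}\mathcal{A}$ by the formations such that $G\varepsilon \circ \delta = \varepsilon G \circ \delta = 1_G$ and
natural inclusion $J : \mathcal{A} \to \mathbf{Z}\mathcal{A}$. Any functor T $G\delta \circ \delta = \delta G \circ \delta$. A cotriple usually comes from
from $\mathcal{A}$ into an Abelian category $\mathcal{B}$ has a a pair (F, U) of adjoint functors $U : \mathcal{A} \to \mathcal{C}$,
unique additive extension $\bar{T} : \mathbf{Z}\mathcal{A} \to \mathcal{B}$ such $F : \mathcal{C} \to \mathcal{A}$ with natural bijection $\lambda : \mathcal{A}(F(C), A) \to$
that $T = \bar{T}J$. If $\mathcal{A}$ is an additive category, $\mathcal{C}(C, U(A))$. Putting $G = FU : \mathcal{A} \to \mathcal{A}$, $\varepsilon(A) =$
there is a canonical projection $\theta : \mathbf{Z}\mathcal{A} \to \mathcal{A}$ $\lambda^{-1}(1_{U(A)}) : FU(A) \to A$, $\eta(C) = \lambda(1_{F(C)}) : C \to$
such that $\theta J = \mathrm{Id}$. If furthermore T is additive, $UF(C)$, we have a cotriple $(G = FU, \varepsilon, \delta = F\eta U)$
then $\bar{T} = T\theta$. in $\mathcal{A}$. Conversely, it is known that any cotri-
Now suppose that a projective class $\mathcal{P}$ in $\mathcal{A}$ ple in $\mathcal{A}$ is induced from a suitable pair of
is given. For $A \in \mathcal{A}$, an augmented chain com- adjoint functors.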
761 201 A
Homology Theory

Given a cotriple $(G, \varepsilon, \delta)$ in $\mathcal{A}$, we define a projective class $\mathcal{P}_G = \{G(A) \mid A \in \mathcal{A}\}$ (or its enlargement $\bar{\mathcal{P}}_G$) in $\mathcal{A}$, and an augmented simplicial object $G_*(A) \to A$ for any object A as follows. Put $G_n(A) = G^{n+1}(A)$ ($n \geq 0$), $\partial_i = \partial_i^n = G^i \varepsilon G^{n-i}(A) : G_n(A) \to G_{n-1}(A)$ (face operator) and $\delta_i = \delta_i^n = G^i \delta G^{n-i}(A) : G_n(A) \to G_{n+1}(A)$ (degeneracy operator) for $0 \leq i \leq n$, $G_{-1}(A) = A$. Then $G_*(A) \to A$ gives rise to a $\mathcal{P}_G$ (or equivalently $\bar{\mathcal{P}}_G$)-projective resolution of A in $\mathbf{Z}\mathcal{A}$ with differentials $d_n = \sum_{i=0}^{n} (-1)^i \partial_i$. This is the bar resolution (or standard resolution) in a generalized sense, and there are many (co-)homology theories defined by means of such constructions.

Most of the above definitions and constructions can be dualized so as to give injective classes $\mathcal{I}$, $\mathcal{I}$-injective resolutions, and right derived functors with respect to $\mathcal{I}$, triples (or monads), etc. (See also [19] for generalized (co-)homology).

References

[1] Sém. H. Cartan, 1950-1951, Paris, 1951.
[2] H. P. Cartan and S. Eilenberg, Homological algebra, Princeton Univ. Press, 1956.
[3] A. Grothendieck, Sur quelques points d'algèbre homologique, Tôhoku Math. J., (2) 9 (1957), 119-221.
[4] R. Godement, Topologie algébrique et théorie des faisceaux, Actualités Sci. Ind., Hermann, 1958.
[5] D. G. Northcott, An introduction to homological algebra, Cambridge Univ. Press, 1960.
[6] S. MacLane, Homology, Springer, 1963.
[7] P. Freyd, Abelian categories, Harper, 1964.
[8] A. Grothendieck (and J. Dieudonné), Eléments de géométrie algébrique I-III, Publ. Math. Inst. HES, 1960-1963.
[9] J.-P. Serre, Cohomologie galoisienne, Lecture notes in math. 5, Springer, 1964.
[10] S. Lang, Rapport sur la cohomologie des groupes, Benjamin, 1966.
[11] E. Weiss, Cohomology of groups, Academic Press, 1969.
[12] H. Bass, Algebraic K-theory, Benjamin, 1968.
[13] D. Quillen, Projective modules over polynomial rings, Inventiones Math., 36 (1976) 167-171.
[14] S. Eilenberg and J. C. Moore, Foundation of relative homological algebra, Mem. Amer. Math. Soc., 55 (1965).
[15] A. Dold and D. Puppe, Homologie nicht-additiver Funktoren, Anwendungen, Ann. Inst. Fourier, 11 (1961) 201-312.
[16] M. Barr and J. Beck, Homology and standard constructions, Lecture notes in math. 80, Springer, 1969.
[17] M. Tierney and W. Vogel, Simplicial derived functors, Lecture notes in math. 86, Springer, 1969.
[18] S. MacLane, Categories for the working mathematician, Springer, 1971.
[19] J. F. Adams, Stable homotopy and generalised homology, Lecture notes, Univ. of Chicago, 1971.

201 (IX.6)
Homology Theory

A. History

Homology theory is the oldest and most extensively developed portion of algebraic topology. Historically, it started with measuring the higher-dimensional connectivity of a space in the sense that the 0-dimensional connectivity is the number of connected components of the space. For example, take a 2-sphere $S^2$ and a 2-torus $T^2$. Then $T^2$ is distinguished from $S^2$ by the fact that on $T^2$ a closed curve can be drawn without forming a boundary, while this is not true for $S^2$. In fact, a curve ($c_1$ or $c_1'$ in Fig. 1) can be drawn on $T^2$ so that it does not form a boundary of an embedded 2-disk. On more complicated surfaces there are many kinds of such closed curves. The maximum number of such closed curves is the 1-dimensional connectivity of the surface; this is a topological property of the surface. For example, the 1-dimensional connectivity of $S^2$ is 0, that of $T^2$ is 2, and that of the surface in Fig. 2 is 6. A more general consideration of the bounding properties of q-dimensional closed submanifolds of a manifold led E. Betti (Ann. Mat. Pura Appl., 4 (1871)) to introduce the notion of the q-dimensional connectivity of the manifold, which was a precursor of homology theory.

Fig. 1

Fig. 2

The foundation of homology theory was laid by H. Poincaré [1]. He started his study of homology with analytic treatment of manifolds, which led to a series of complications.
Poincaré then introduced a new method for the study, now called combinatorial topology: He decomposed the manifold into elementary pieces or cells, which adjoin one another in a regular fashion; he then substituted the algebraic notions of cycles and boundary operators for the geometrical notions of closed submanifolds and boundaries. Thereby the notion of homology groups acquired an exact logical meaning, and the fundamental formulas, which are now called Poincaré formulas, were proved.

After Poincaré, much of the development of homology theory centered around the question of the topological invariance of homology groups, that is, the independence of the homology groups of the choice of cellular decomposition. Through the development of simplicial complexes and their techniques, J. W. Alexander (Trans. Amer. Math. Soc., 28 (1926)) gave the first fully satisfactory proof for topological invariance of the homology groups of polyhedra. In those days, the homology groups themselves were barely recognized; instead, one dealt with numerical invariants such as the Betti numbers and the torsion coefficients [2,3].

During the period 1925-1935 there was a gradual shift of interest from the numerical invariants to the homology groups themselves, and homology theory developed intensively [4,5]. S. Lefschetz (Trans. Amer. Math. Soc., 28 (1926)) added the theory of the intersection products to the homology of manifolds. He also invented relative homology theory (Proc. Nat. Acad. Sci. US, 13 (1927)) and generalized the duality theorems of Poincaré and Alexander (ibid., 15 (1929)). G. de Rham (J. Math. Pures Appl., 10 (1931)) obtained a duality theorem that relates the exterior differential forms in a manifold to the homology groups of the manifold. L. S. Pontryagin (Ann. Math., (2) 35 (1934)) proved the complete group-invariant form of the Alexander duality theorem. These duality theorems seemed to reflect the existence of a theory dual to homology theory, and the genesis of this dual theory, now called cohomology, occurred in 1935, in the work of Alexander and A. N. Kolmogorov. It was discovered subsequently by Alexander (Ann. Math., (2) 37 (1936)), E. Čech (ibid.), and H. Whitney (ibid., 39 (1938)) that the cohomology of a polyhedron can be made into a ring.

On the other hand, after L. Vietoris (Math. Ann., 97 (1927)) and P. S. Aleksandrov (Ann. Math., (2) 30 (1928)), many devices were invented to extend the homology theory of polyhedra to general topological spaces, and numerous variants of homology theory appeared at the hands of Čech (1932), Lefschetz (1933), Alexander (1935), and Kolmogorov (1936). This development served to clarify the relations between combinatorial and set-theoretic methods in topology (→ 426 Topology), whereas it produced complexity and confusion in homology theory [6]. S. Eilenberg and N. E. Steenrod (Proc. Nat. Acad. Sci. US, 31 (1945) and [7]) cleared the air by treating homology axiomatically.

Roughly speaking, a homology theory assigns Abelian groups to topological spaces and homomorphisms to continuous mappings of one space to another. In this way, a homology theory is an algebraic image of topology; it converts topological problems to algebraic problems. Starting from this viewpoint, Eilenberg and Steenrod selected some fundamental properties as axioms to characterize homology theory. This unified homology theories and allowed systematic treatment of homological problems which had previously been done separately "by hand" in each case. Moreover, it motivated the birth of a new branch of mathematics, called homological algebra.

B. Homology of Chain Complexes

A chain complex $C = \{C_q, \partial_q\}$ is a collection of (additive) Abelian groups $C_q$, one for each integer q, and of homomorphisms $\partial_q : C_q \to C_{q-1}$ such that $\partial_q \circ \partial_{q+1} = 0$ for each q. Elements of $C_q$ are called q-chains of C, and $\partial_q$ is called the boundary operator. A subcomplex $C' = \{C'_q, \partial'_q\}$ of C is a chain complex such that $C'_q \subset C_q$ and $\partial'_q = \partial_q | C'_q$ for each q.

The kernel of $\partial_q$ is denoted by $Z_q(C)$, and its element is called a q-cycle of C. The image of $\partial_{q+1}$ is denoted by $B_q(C)$, and its element is called a q-boundary of C. The relation $\partial_q \circ \partial_{q+1} = 0$ implies $B_q(C) \subset Z_q(C)$. The quotient group $Z_q(C)/B_q(C)$ is denoted by $H_q(C)$, the qth homology group of C. Elements of $H_q(C)$ are called q-dimensional homology classes of C. Two cycles representing the same homology class are said to be homologous. The direct sum $\sum_q H_q(C)$ is denoted by $H_*(C)$ and is called the homology group of C.

If $C = \{C_q, \partial_q\}$ and $C' = \{C'_q, \partial'_q\}$ are chain complexes, a chain mapping (chain map) $\varphi : C \to C'$ is a sequence of homomorphisms $\varphi_q : C_q \to C'_q$ such that $\partial'_q \circ \varphi_q = \varphi_{q-1} \circ \partial_q$ for each q. If $\varphi : C \to C'$ is a chain mapping, then $\varphi_q$ sends $Z_q(C)$ to $Z_q(C')$ and $B_q(C)$ to $B_q(C')$, and hence $\varphi$ induces a homomorphism of $H_q(C)$ to $H_q(C')$.

If $H_q(C)$ is finitely generated, it can be decomposed into the direct sum of a free Abelian group $B_q(C)$ and a finite Abelian group $T_q(C)$. $B_q(C)$ and $T_q(C)$ are called the qth Betti group
of C and the qth torsion group of C, respectively. The rank $p_q$ of the Betti group $B_q(C)$ is called the qth Betti number of C. $T_q(C)$ is isomorphic to the direct sum of $\tau(q)$ finite cyclic groups of orders $\theta_1^q, \theta_2^q, \ldots, \theta_{\tau(q)}^q$, where $\theta_i^q > 1$ and $\theta_i^q$ divides $\theta_{i+1}^q$ for $i = 1, \ldots, \tau(q) - 1$ (→ 2 Abelian Groups). The numbers $\theta_1^q, \theta_2^q, \ldots, \theta_{\tau(q)}^q$ are called the qth torsion coefficients of C.

If $H_*(C)$ is finitely generated, then the number $\chi(C) = \sum_q (-1)^q p_q$ is called the Euler number, the Euler characteristic, or the Euler-Poincaré characteristic of C. In this case a polynomial $\sum_q p_q t^q$ with variable t is called the Poincaré polynomial of C.

Let C be a chain complex such that, for each q, $C_q$ is a free Abelian group of finite rank. Then the qth Betti number and the qth torsion coefficients are well defined for each q. Denote the ranks of $C_q$ and of the boundary group $B_q(C)$ by $\alpha_q$ and $\beta_q$, respectively. Then it holds that $p_q = \alpha_q - \beta_q - \beta_{q-1}$, and hence $\chi(C) = \sum_q (-1)^q \alpha_q$ (Euler-Poincaré formula). Moreover, there exists a set of bases, one for each $C_q$, with the following properties: For each q, the base for $C_q$ is composed of five types of elements, $a_i^q$ ($1 \leq i \leq \beta_q - \tau_q$), $b_i^q$ ($1 \leq i \leq \tau_q$), $c_i^q$ ($1 \leq i \leq p_q$), $d_i^q$ ($1 \leq i \leq \tau_{q-1}$), and $e_i^q$ ($1 \leq i \leq \beta_{q-1} - \tau_{q-1}$), where $\tau_q = \tau(q)$; $\partial_q$ satisfies $\partial a_i^q = 0$, $\partial b_i^q = 0$, $\partial c_i^q = 0$, $\partial d_i^q = \theta_i^{q-1} b_i^{q-1}$, and $\partial e_i^q = a_i^{q-1}$. Such a set of bases is called the canonical basis of C.

Let C be a chain complex such that each $C_q$ is a free Abelian group with a given base $\{\sigma_i^q\}$. Then the incidence number $[\sigma_i^q : \sigma_j^{q-1}] \in \mathbf{Z}$ is defined by $\partial_q(\sigma_i^q) = \sum_j [\sigma_i^q : \sigma_j^{q-1}] \sigma_j^{q-1}$. This notion was commonly used in the early days of topology.

C. Homology of Simplicial Complexes

Let K be a simplicial complex. An oriented q-simplex $\sigma$ of K is a q-simplex $s \in K$ together with an equivalence class of total orderings of the vertices of s, two orderings being equivalent if they differ by an even permutation of the vertices. If $a_0, \ldots, a_q$ are the vertices of s, then $[a_0, a_1, \ldots, a_q]$ denotes the oriented q-simplex of K consisting of the simplex s together with the equivalence class of the ordering $a_0 < a_1 < \cdots < a_q$ of its vertices. For every vertex a of K there is a unique oriented 0-simplex $[a]$, and to every q-simplex with $q \geq 1$ there correspond exactly two oriented q-simplexes, which are said to be opposites of one another.

Let $C_q(K)$ denote the Abelian group generated by the oriented q-simplexes of K with the relations $\sigma + \sigma' = 0$ if $\sigma'$ is the opposite of $\sigma$. If we choose an oriented q-simplex $\sigma_i^q$ for each q-simplex $s_i^q$ of K, then each element of $C_q(K)$ is written uniquely as a finite sum $\sum_i g_i \sigma_i^q$ with integers $g_i \neq 0$, and $C_q(K)$ is a free Abelian group generated by the set $\{\sigma_i^q\}$. We define a homomorphism $\partial_q : C_q(K) \to C_{q-1}(K)$ by $\partial_q [a_0, a_1, \ldots, a_q] = \sum_{i=0}^{q} (-1)^i [a_0, \ldots, a_{i-1}, a_{i+1}, \ldots, a_q]$. Then $\partial_q \circ \partial_{q+1} = 0$ holds, and we have a chain complex $C(K) = \{C_q(K), \partial_q\}$ ($C_q(K) = \{0\}$ if $q < 0$), called the (oriented) simplicial chain complex. The homology group $H_*(C(K))$ is denoted by $H_*(K)$ and is called the (integral) homology group of the simplicial complex K.

Let $K_1$ and $K_2$ be simplicial complexes, and let $f : K_1 \to K_2$ be a simplicial mapping. Then for each q, a homomorphism $f_{\#q} : C_q(K_1) \to C_q(K_2)$ can be defined by $f_{\#q}([a_0, a_1, \ldots, a_q]) = [f(a_0), f(a_1), \ldots, f(a_q)]$, where the right-hand side is understood to be 0 if $f(a_0), f(a_1), \ldots, f(a_q)$ are not distinct. The sequence $f_\# = \{f_{\#q}\}$ is a chain mapping of $C(K_1)$ to $C(K_2)$, and it induces a homomorphism $f_* : H_*(K_1) \to H_*(K_2)$.

If K is a finite simplicial complex, then the Betti numbers, the torsion coefficients, and the Euler characteristic of K are defined to be those for the chain complex C(K).

Let $K_1$ and $K_2$ be subcomplexes of a simplicial complex K. Then we have the following exact sequence which relates the homology group of $K_1 \cup K_2$ to the homology groups of $K_1$, $K_2$, and $K_1 \cap K_2$:

$$\cdots \to H_q(K_1 \cap K_2) \xrightarrow{\alpha} H_q(K_1) \oplus H_q(K_2) \xrightarrow{\beta} H_q(K_1 \cup K_2) \xrightarrow{\partial_*} H_{q-1}(K_1 \cap K_2) \to \cdots,$$

where $\alpha$, $\beta$, and $\partial_*$ are defined as follows. Let $i_l : K_1 \cap K_2 \to K_l$ and $j_l : K_l \to K_1 \cup K_2$ ($l = 1, 2$) be inclusion mappings; then $\alpha(a) = (i_{1*}(a), -i_{2*}(a))$ and $\beta(a_1, a_2) = j_{1*}(a_1) + j_{2*}(a_2)$. If $z = c_1 + c_2$ ($c_l \in C(K_l)$) is a cycle of $C(K_1 \cup K_2)$ then $\partial(c_1) = -\partial(c_2)$ is a cycle of $C(K_1 \cap K_2)$; $\partial_*$ sends the homology class of z to the homology class of $\partial(c_1)$. The sequence is called the Mayer-Vietoris exact sequence of the couple $\{K_1, K_2\}$, and $\partial_*$ is referred to as the connecting homomorphism. The prototype of the Mayer-Vietoris exact sequence was obtained by W. Mayer (Monatsh. Math. Phys., 36 (1929)) and L. Vietoris (ibid., 37 (1930)). The present form is due to Eilenberg and Steenrod [7].

D. Homology of Polyhedra

If K is a simplicial complex and K' is a subdivision of K, then there exists a canonical isomorphism $H_*(K) \cong H_*(K')$. This proves that if $K_1$ and $K_2$ are simplicial decompositions of a polyhedron then $H_*(K_1)$ and $H_*(K_2)$ are isomorphic, because there exists a common subdivision of $K_1$ and $K_2$. Thus we may define the (integral) homology groups $H_*(X)$ of a polyhedron X to be the homology group $H_*(K)$ of a simplicial decomposition K of X.

Let X and Y be polyhedra, and let $f : X \to Y$ be a continuous mapping. Take a simplicial approximation $\varphi : K \to L$ of f. Then a homo-
morphism of $H_*(X)$ to $H_*(Y)$ given by the induced homomorphism $\varphi_* : H_*(K) \to H_*(L)$ is independent of the choice of $\varphi$, and is denoted by $f_*$. The following properties hold: (i) $1_* = 1 : H_*(X) \to H_*(X)$, where 1 is the identity; (ii) If $f : X \to Y$ and $g : Y \to Z$ are continuous mappings, then $(g \circ f)_* = g_* \circ f_* : H_*(X) \to H_*(Z)$; (iii) If $f, f' : X \to Y$ are homotopic, then $f_* = f'_* : H_*(X) \to H_*(Y)$. These imply the homotopy invariance of the homology group stated as follows: If X and Y are polyhedra which are homotopy equivalent, then $H_*(X)$ and $H_*(Y)$ are isomorphic. Specifically, the homology group is a topological invariant. Thus if X is a triangulable space (for example, if X is a differentiable manifold), then its homology group $H_*(X)$ can be defined to be the homology group $H_*(K)$, where (K, t) is a triangulation of X. This homology group is referred to as the simplicial homology group of X. Similarly, if X is a compact triangulable space, the Betti numbers $p_q(X)$ of X, etc. can be defined to be those for the chain complex C(K).

If we denote by pt a single point, then $H_0(\mathrm{pt}) = \mathbf{Z}$ (the group of integers) and $H_q(\mathrm{pt}) = 0$ if $q \neq 0$. For an n-sphere $S^n$, a 2-torus $T^2$, and a real projective plane $P^2$, the homology groups can be computed as follows by means of their triangulations: (1) $H_0(S^n) \cong H_n(S^n) \cong \mathbf{Z}$, and $H_q(S^n) = 0$ if $q \neq 0, n$; (2) $H_0(T^2) \cong H_2(T^2) \cong \mathbf{Z}$, $H_1(T^2) \cong \mathbf{Z} \oplus \mathbf{Z}$, and $H_q(T^2) = 0$ if $q \neq 0, 1, 2$; (3) $H_0(P^2) \cong \mathbf{Z}$, $H_1(P^2) \cong \mathbf{Z}_2$, and $H_q(P^2) = 0$ if $q \neq 0, 1$, where $\mathbf{Z}_2 = \mathbf{Z}/2\mathbf{Z}$.

Two closed surfaces are homeomorphic if and only if their integral homology groups are isomorphic (→ 410 Surfaces).

E. Singular Homology

There are various devices for defining homology groups of general topological spaces. A familiar one is the singular homology theory initiated by S. Lefschetz (Bull. Amer. Math. Soc., 39 (1933)) and improved by S. Eilenberg (Ann. Math., (2) 45 (1944)).

The standard q-simplex is the convex set $\Delta^q \subset \mathbf{R}^{q+1}$ consisting of all (q + 1)-tuples $(t_0, t_1, \ldots, t_q)$ of real numbers with $t_i \geq 0$, $t_0 + t_1 + \cdots + t_q = 1$. Any continuous mapping of $\Delta^q$ to a topological space X is called a singular q-simplex in X. The ith face of a singular q-simplex $\sigma : \Delta^q \to X$ is the singular (q - 1)-simplex $\sigma \circ \varepsilon_i : \Delta^{q-1} \to X$, where the linear embedding $\varepsilon_i : \Delta^{q-1} \to \Delta^q$ is defined by $\varepsilon_i(t_0, \ldots, t_{i-1}, t_{i+1}, \ldots, t_q) = (t_0, \ldots, t_{i-1}, 0, t_{i+1}, \ldots, t_q)$.

For each integer q, let $S_q(X)$ denote the free Abelian group generated by the singular q-simplexes in X ($S_q(X) = 0$ if $q < 0$), and define a homomorphism $\partial_q : S_q(X) \to S_{q-1}(X)$ by $\partial_q(\sigma) = \sum_{i=0}^{q} (-1)^i \sigma \circ \varepsilon_i$. Then we have a chain complex $S(X) = \{S_q(X), \partial_q\}$, called the singular chain complex of X. The homology group $H_*(S(X))$ is denoted by $H_*(X)$ and is called the integral singular homology group of the topological space X.

Given a continuous mapping $f : X \to Y$, a chain mapping $f_\# : S(X) \to S(Y)$ is defined by sending each singular simplex $\sigma : \Delta^q \to X$ to the singular simplex $f \circ \sigma : \Delta^q \to Y$, and it induces the homomorphism $f_* : H_*(X) \to H_*(Y)$. The properties (i), (ii), (iii) of $f_*$ in Section D hold for continuous mappings of topological spaces, and hence the singular homology group is a homotopy invariant.

The homology group $H_*(K)$ of a simplicial complex K is isomorphic to the singular homology group $H_*(|K|)$ of the polyhedron $|K|$. Therefore the simplicial homology group of a triangulable space is isomorphic to the singular homology group of the space.

If $\{X_i\}$ is the set of arcwise connected components of a topological space X, then $H_*(X) \cong \sum_i H_*(X_i)$. If X is arcwise connected, then $H_0(X) \cong \mathbf{Z}$. If $\{A_\lambda\}$ is the collection of all the compact subsets of X directed by inclusion, then $H_*(X)$ is isomorphic to the inductive limit $\varinjlim H_*(A_\lambda)$. It is not true that there is a Mayer-Vietoris sequence in singular homology for any couple $\{X_1, X_2\}$ of subsets of X. However, for certain couples $\{X_1, X_2\}$, there is a Mayer-Vietoris exact sequence of $\{X_1, X_2\}$:

$$\cdots \to H_q(X_1 \cap X_2) \xrightarrow{\alpha} H_q(X_1) \oplus H_q(X_2) \xrightarrow{\beta} H_q(X_1 \cup X_2) \xrightarrow{\partial_*} H_{q-1}(X_1 \cap X_2) \to \cdots.$$

For example, this holds if $X = \mathrm{Int}\, X_1 \cup \mathrm{Int}\, X_2$, where Int denotes the interior.

Let $c : X \to \mathrm{pt}$ be the mapping of a topological space to a single point. Then the kernel of $c_* : H_*(X) \to H_*(\mathrm{pt})$ is denoted by $\tilde{H}_*(X)$ and is called the reduced homology group of X. It holds that $H_*(X) \cong \tilde{H}_*(X) \oplus H_*(\mathrm{pt})$.

Regard the suspension SX as the union of two copies of the cone over X. Then the connecting homomorphism in the Mayer-Vietoris sequence gives an isomorphism $\tilde{H}_q(SX) \cong \tilde{H}_{q-1}(X)$ for any q. The inverse of this isomorphism is called the suspension isomorphism for homology.

Let M be a $C^\infty$-manifold. A $C^\infty$-singular q-simplex in M is a singular q-simplex $\sigma : \Delta^q \to M$ such that $\sigma$ extends to a $C^\infty$-mapping from an open neighborhood of $\Delta^q$ in $\{(t_0, t_1, \ldots, t_q) \in \mathbf{R}^{q+1} \mid t_0 + t_1 + \cdots + t_q = 1\}$ to M. The totality of $C^\infty$-singular simplexes in M generates a subcomplex of S(M), denoted by $S^\infty(M)$. The inclusion $S^\infty(M) \subset S(M)$ induces an isomorphism $H_*(S^\infty(M)) \cong H_*(M)$.

Let M be an n-dimensional topological manifold. Then $H_q(M) = 0$ unless $0 \leq q \leq n$, and $H_*(M)$ is finitely generated if M is compact.
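
Because simplicial and singular homology agree on polyhedra, the Betti numbers of a triangulated space can be checked by linear algebra on the simplicial boundary operators. The following Python sketch is only an illustration, not part of the original article: it builds the matrices of $\partial_q$ in the ordered-vertex bases and evaluates $p_q = \alpha_q - \beta_q - \beta_{q-1}$ from ranks over the rationals, so it sees Betti numbers but not torsion coefficients. The function names and the two sample triangulations (the boundary of a tetrahedron for $S^2$, and a seven-vertex triangulation of $T^2$) are assumptions chosen for the example.

```python
from fractions import Fraction
from itertools import combinations


def rank(matrix):
    """Rank of an integer matrix over the rationals (Gaussian elimination)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    rk = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def boundary_matrix(q_simplices, faces):
    """Matrix of the boundary operator C_q -> C_{q-1} in the ordered-vertex bases."""
    row_of = {face: i for i, face in enumerate(faces)}
    matrix = [[0] * len(q_simplices) for _ in faces]
    for j, simplex in enumerate(q_simplices):
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]  # omit the i-th vertex
            matrix[row_of[face]][j] = (-1) ** i
    return matrix


def betti_numbers(top_simplices):
    """Betti numbers p_q = alpha_q - beta_q - beta_{q-1} of the generated complex."""
    dim = max(len(s) for s in top_simplices) - 1
    # cells[q] lists all q-simplexes as increasing vertex tuples.
    cells = [
        sorted({face for s in top_simplices for face in combinations(sorted(s), q + 1)})
        for q in range(dim + 1)
    ]
    # beta[q] = rank of the boundary group B_q = rank of the operator C_{q+1} -> C_q.
    beta = [rank(boundary_matrix(cells[q + 1], cells[q])) for q in range(dim)] + [0]
    return [
        len(cells[q]) - beta[q] - (beta[q - 1] if q > 0 else 0)
        for q in range(dim + 1)
    ]


# Boundary of a tetrahedron: a triangulation of the 2-sphere.
sphere = list(combinations(range(4), 3))
# Seven-vertex triangulation of the 2-torus (vertices taken modulo 7).
torus = [(i, (i + 1) % 7, (i + 3) % 7) for i in range(7)] + \
        [(i, (i + 2) % 7, (i + 3) % 7) for i in range(7)]

print(betti_numbers(sphere))  # [1, 0, 1]
print(betti_numbers(torus))   # [1, 2, 1]
```

The two results agree with the groups listed for $S^2$ and $T^2$ in Section D. Recovering the torsion coefficients as well (for instance $H_1(P^2) \cong \mathbf{Z}_2$) would need an integral reduction such as the Smith normal form instead of rational ranks.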
F. Homology of CW Complexes

Homology theory is tractable in the category of CW complexes by virtue of the facts stated below.

Let X be a topological space and A its subset. Then we denote by X/A the quotient space obtained from X by shrinking A to a point, understanding $X/\emptyset$ to be the disjoint union $X \cup \mathrm{pt}$. If $X_1$, $X_2$ are subcomplexes of a CW complex, then the Mayer-Vietoris exact sequence of $\{X_1, X_2\}$ and the excision isomorphism $i_* : \tilde{H}_*(X_1/(X_1 \cap X_2)) \cong \tilde{H}_*((X_1 \cup X_2)/X_2)$ (i: inclusion) are valid. If A is a subcomplex of a CW complex X, then we have the following reduced homology exact sequence of (X, A):

$$\cdots \to \tilde{H}_q(A) \xrightarrow{i_*} \tilde{H}_q(X) \xrightarrow{j_*} \tilde{H}_q(X/A) \xrightarrow{\partial_*} \tilde{H}_{q-1}(A) \to \cdots,$$

where $i_*$ and $j_*$ are induced by the inclusion $i : A \to X$ and the collapsing $j : X \to X/A$, and $\partial_*$ is given by a commutative diagram

$$\begin{array}{ccc} \tilde{H}_q(X/A) & \xrightarrow{\ \partial_*\ } & \tilde{H}_{q-1}(A) \\ \cong \uparrow h_* & & \cong \downarrow S \\ \tilde{H}_q(X \cup CA) & \xrightarrow{\ h'_*\ } & \tilde{H}_q(SA). \end{array}$$

Here CA is the cone over A, S is the suspension isomorphism, and $h : X \cup CA \to (X \cup CA)/CA = X/A$ and $h' : X \cup CA \to (X \cup CA)/X = SA$ are collapsings. More generally, if A, B are subcomplexes of a CW complex X and $A \supset B$, then we have the following reduced homology exact sequence of (X, A, B):

$$\cdots \to \tilde{H}_q(A/B) \xrightarrow{i_*} \tilde{H}_q(X/B) \xrightarrow{j_*} \tilde{H}_q(X/A) \xrightarrow{\partial_*} \tilde{H}_{q-1}(A/B) \to \cdots.$$

Furthermore, the homology group $H_*(X)$ of a CW complex X can be computed in the following manner.

Let $X^q$ denote the q-skeleton of X, i.e., the union of all cells of dimensions $\leq q$. Put $C_q(X) = \tilde{H}_q(X^q/X^{q-1})$, and let $\partial_q : C_q(X) \to C_{q-1}(X)$ be the connecting homomorphism $\partial_* : \tilde{H}_q(X^q/X^{q-1}) \to \tilde{H}_{q-1}(X^{q-1}/X^{q-2})$ in the reduced homology exact sequence of $(X^q, X^{q-1}, X^{q-2})$. Then $C(X) = \{C_q(X), \partial_q\}$ is a chain complex such that $C_q(X)$ is a free Abelian group with one generator for each q-cell of X. If X is a polyhedron $|K|$, then C(X) coincides with the simplicial chain complex C(K). The homology group $H_*(C(X))$ is called the cellular homology group of the CW complex X. This is isomorphic to the singular homology group $H_*(X)$.

Since a CW decomposition frequently requires fewer cells than a simplicial decomposition, the cellular homology groups are useful in calculating the homology groups. For example, the complex n-dimensional projective space $\mathbf{C}P^n$ has a CW decomposition with a single 2i-cell for each $i = 0, 1, \ldots, n$, and hence we see immediately that $H_q(\mathbf{C}P^n) \cong \mathbf{Z}$ if $q = 2i$ ($0 \leq i \leq n$) and $= 0$ otherwise.

If X is a finite CW complex and $\alpha_q$ denotes the number of q-cells of X, then we have the Euler-Poincaré formula $\chi(X) = \sum_q (-1)^q \alpha_q$. In particular, if X is homeomorphic to $S^2$ then we have the Euler theorem on polyhedra: $\alpha_0 - \alpha_1 + \alpha_2 = 2$. This was the first important result in topology (L. Euler, 1752).

G. Homology with Coefficients in Abelian Groups

Given a chain complex C and an Abelian group G, we have a new chain complex $C \otimes G$ given by $(C \otimes G)_q = C_q \otimes G$ and $\partial_q(c \otimes g) = \partial_q c \otimes g$ ($c \in C_q$, $g \in G$), where $\otimes$ is the tensor product of Abelian groups. For a topological space X, the homology group of the chain complex $S(X) \otimes G$ is denoted by $H_*(X; G)$ and is called the singular homology group of X with coefficients in G. The homology group $H_*(K; G)$ of a simplicial complex K with coefficients in G and the cellular homology group $H_*(C(X); G)$ of a CW complex X with coefficients in G are similarly defined. The homology group with coefficients in $\mathbf{Z}$ is the integral homology group.

The previous results for the integral homology groups generalize in a straightforward fashion to homology groups with coefficients in G.

We have the homomorphism $\kappa : H_*(X) \otimes G \to H_*(X; G)$ sending $a \otimes g \in H_*(X) \otimes G$ to the homology class of $z \otimes g \in Z_*(S(X) \otimes G)$, where z is a representative cycle of a. The following theorem is known as the universal coefficient theorem for homology, since it expresses $H_q(X; G)$ in terms of $H_q(X)$, $H_{q-1}(X)$, and G: There is an exact sequence

$$0 \to H_q(X) \otimes G \xrightarrow{\kappa} H_q(X; G) \to \mathrm{Tor}(H_{q-1}(X), G) \to 0,$$

and this sequence is split (→ 200 Homological Algebra). Universal coefficient theorems of this type were first shown by S. Eilenberg and S. MacLane (Ann. Math., (2) 43 (1942)).

Let $\Lambda$ be a ring with a unit 1. Then a chain complex over $\Lambda$ is a chain complex C such that each $C_q$ is a $\Lambda$-module and each $\partial_q$ is a $\Lambda$-homomorphism. The homology groups $H_q(C)$ of a chain complex C over $\Lambda$ are $\Lambda$-modules. If C is a chain complex, $C \otimes \Lambda$ forms naturally a chain complex over $\Lambda$. In particular, if X is a topological space, then $S(X) \otimes \Lambda$ is a chain complex over $\Lambda$, and $H_q(X; \Lambda)$ are $\Lambda$-modules. In this case, the induced homomorphisms $f_* : H_*(X; \Lambda) \to H_*(Y; \Lambda)$ are $\Lambda$-homomorphisms. The homology groups with coefficients in a field k are vector spaces over k and are useful in applications. If $H_*(X)$ is finitely generated, then $\chi(X) = \sum_q (-1)^q \dim_k H_q(X; k)$ holds for any field k.
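
The split exact sequence of the universal coefficient theorem makes homology with coefficients in a prime field computable from the integral groups alone. The Python sketch below is a hedged illustration rather than anything taken from the article: it assumes each $H_q(X)$ is given as a free rank together with a list of torsion coefficients, and it uses the standard facts $\mathbf{Z}_m \otimes \mathbf{Z}_p \cong \mathrm{Tor}(\mathbf{Z}_m, \mathbf{Z}_p) \cong \mathbf{Z}_{\gcd(m, p)}$ and $\mathrm{Tor}(\mathbf{Z}, \mathbf{Z}_p) = 0$. The function names and the input format are invented for the example, and the input list must extend one degree past the last group with torsion so that the Tor term is counted.

```python
def mod_p_betti(groups, p):
    """dim over F_p of H_q(X; F_p), computed from the integral homology of X.

    groups[q] = (rank, torsion) describes H_q(X) as Z^rank plus cyclic groups
    of the orders listed in torsion; p is assumed to be prime (not checked).
    """
    dims = []
    for q, (free_rank, torsion) in enumerate(groups):
        # H_q(X) (x) Z_p: each Z gives one Z_p, each Z_m gives Z_p exactly when p divides m.
        tensor = free_rank + sum(1 for m in torsion if m % p == 0)
        # Tor(H_{q-1}(X), Z_p): only cyclic summands of order divisible by p contribute.
        previous = groups[q - 1][1] if q > 0 else []
        tor = sum(1 for m in previous if m % p == 0)
        dims.append(tensor + tor)
    return dims


def euler_characteristic(dims):
    """Alternating sum of the dimensions."""
    return sum((-1) ** q * d for q, d in enumerate(dims))


# Integral homology of the real projective plane P^2 from Section D:
# H_0 = Z, H_1 = Z_2, H_2 = 0.
projective_plane = [(1, []), (0, [2]), (0, [])]

for p in (2, 3):
    dims = mod_p_betti(projective_plane, p)
    print(p, dims, euler_characteristic(dims))
```

For $P^2$ the dimensions are 1, 1, 1 over $\mathbf{Z}_2$ and 1, 0, 0 over $\mathbf{Z}_3$; the alternating sum is 1 in both cases, in line with the statement that $\chi(X)$ does not depend on the field k.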
H. Cohomology

A cochain complex $C = \{C^q, \delta^q\}$ is a collection of Abelian groups $C^q$, one for each integer q, and of homomorphisms $\delta^q : C^q \to C^{q+1}$ such that $\delta^{q+1} \circ \delta^q = 0$. Elements of $C^q$ are called q-cochains, and $\delta^q$ is called the coboundary operator. The notions of subcomplex of a cochain complex, cocycle, coboundary, cohomology group, and cochain mapping are defined as for chain complexes. Let Hom(A, B) denote the group of homomorphisms from an Abelian group A to an Abelian group B. Given a chain complex C and an Abelian group G, a cochain complex $C^* = \mathrm{Hom}(C, G)$ is defined by $C^q = \mathrm{Hom}(C_q, G)$ and $(\delta^q u)(c) = u(\partial_{q+1} c)$ ($u \in C^q$, $c \in C_{q+1}$).

For a topological space X, the cochain complex Hom(S(X), G) is called the singular cochain complex of X with coefficients in G, and elements of $\mathrm{Hom}(S_q(X), G)$ are called singular q-cochains of X. The cohomology group of Hom(S(X), G) is denoted by $H^*(X; G)$ and is called the singular cohomology group of X with coefficients in G. We write $H^*(X)$ for $H^*(X; \mathbf{Z})$; this is called the integral cohomology group of X. Similarly, the cohomology group $H^*(K; G)$ of a simplicial complex K with coefficients in G and the cellular cohomology group $H^*(C(X); G)$ of a CW complex X with coefficients in G are defined. There are isomorphisms $H^*(K; G) \cong H^*(|K|; G)$ and $H^*(C(X); G) \cong H^*(X; G)$.

If $f : X \to Y$ is a continuous mapping, then a cochain map $f^\# : \mathrm{Hom}(S(Y), G) \to \mathrm{Hom}(S(X), G)$ is defined by $(f^\# u)(c) = u(f_\# c)$ with $u \in \mathrm{Hom}(S_q(Y), G)$ and $c \in S_q(X)$. Therefore f induces the homomorphism $f^* : H^*(Y; G) \to H^*(X; G)$. The following properties hold: (i) $1^* = 1$; (ii) $(g \circ f)^* = f^* \circ g^*$; (iii) If f and f' are homotopic, then $f^* = f'^*$. In particular, the singular cohomology groups are homotopy invariants. The cokernel of $c^* : H^*(\mathrm{pt}; G) \to H^*(X; G)$ induced by the mapping $c : X \to \mathrm{pt}$ is denoted by $\tilde{H}^*(X; G)$ and is called the reduced cohomology group of X.

For $\xi \in H^q(X; G)$ and $a \in H_q(X)$, the Kronecker index $\langle \xi, a \rangle \in G$ is defined naturally in terms of representatives of $\xi$ and a. We have the following universal coefficient theorem for cohomology: There is an exact sequence

$$0 \to \mathrm{Ext}(H_{q-1}(X), G) \to H^q(X; G) \xrightarrow{\kappa} \mathrm{Hom}(H_q(X), G) \to 0,$$

and this sequence is split, where $\kappa$ is given by the Kronecker index (→ 200 Homological Algebra).

If $\Lambda$ is a ring with 1, then a cochain complex over $\Lambda$ is defined analogously to a chain complex over $\Lambda$. The singular cochain complex $\mathrm{Hom}(S(X), \Lambda)$ forms naturally a cochain complex over $\Lambda$, and $H^q(X; \Lambda)$ are $\Lambda$-modules. For $\xi \in H^q(X; \Lambda)$ and $a \in H_q(X; \Lambda)$, the Kronecker index $\langle \xi, a \rangle \in \Lambda$ is defined naturally. If k is a field, then the vector spaces $H^q(X; k)$ are identified with the dual space of $H_q(X; k)$ by means of the Kronecker index.

If M is a $C^\infty$-manifold, then there is a cochain complex $\mathfrak{D}(M) = \{\mathfrak{D}^q(M), d\}$ over the field $\mathbf{R}$ of real numbers, where $\mathfrak{D}^q(M)$ is the real vector space consisting of the differential forms of degree q on M, and $d : \mathfrak{D}^q(M) \to \mathfrak{D}^{q+1}(M)$ is the exterior differentiation. The cochain complex $\mathfrak{D}(M)$ is called the de Rham complex of M, and its cohomology group $H^*(\mathfrak{D}(M))$ is called the de Rham cohomology group of M. A cochain mapping $\psi : \mathfrak{D}(M) \to \mathrm{Hom}(S^\infty(M), \mathbf{R})$ is defined by

$$\psi(\omega)(\sigma) = \int_{\Delta^q} \sigma^* \omega,$$

where $\omega \in \mathfrak{D}^q(M)$, $\sigma : \Delta^q \to M$ is a $C^\infty$ singular q-simplex in M, and $\sigma^* \omega$ denotes the pullback of $\omega$ by $\sigma$. We have isomorphisms

$$H^*(\mathfrak{D}(M)) \xrightarrow{\psi^*} H^*(\mathrm{Hom}(S^\infty(M), \mathbf{R})) \xleftarrow{i^*} H^*(M; \mathbf{R}),$$

where $i^*$ is induced by the inclusion $S^\infty(M) \subset S(M)$. This result is called the de Rham theorem on the cohomology of manifolds (→ 105 Differentiable Manifolds).

I. Cohomology Rings

Given a topological space X and a ring $\Lambda$, the cup product $u \smile v \in \mathrm{Hom}(S_{p+q}(X), \Lambda)$ of cochains $u \in \mathrm{Hom}(S_p(X), \Lambda)$ and $v \in \mathrm{Hom}(S_q(X), \Lambda)$ is defined by $(u \smile v)(\sigma) = u(\sigma \circ \varepsilon)\, v(\sigma \circ \varepsilon')$, where $\sigma : \Delta^{p+q} \to X$ is a singular (p + q)-simplex, and $\varepsilon : \Delta^p \to \Delta^{p+q}$ and $\varepsilon' : \Delta^q \to \Delta^{p+q}$ are given by $\varepsilon(t_0, t_1, \ldots, t_p) = (t_0, t_1, \ldots, t_p, 0, \ldots, 0)$ and $\varepsilon'(t_p, t_{p+1}, \ldots, t_{p+q}) = (0, \ldots, 0, t_p, t_{p+1}, \ldots, t_{p+q})$. The product operation is bilinear, and the formula $\delta(u \smile v) = \delta u \smile v + (-1)^p u \smile \delta v$ holds. Therefore it gives rise to the cup product $\xi \smile \eta \in H^{p+q}(X; \Lambda)$ of cohomology classes $\xi \in H^p(X; \Lambda)$ and $\eta \in H^q(X; \Lambda)$. This cup product operation makes $H^*(X; \Lambda)$ into a ring, which is called the singular cohomology ring of X with coefficients in $\Lambda$. If $\Lambda$ has 1, the cohomology class represented by the 0-cocycle taking the value 1 on each singular 0-simplex serves as 1 of $H^*(X; \Lambda)$. If $\Lambda$ is commutative, then $\xi \smile \eta = (-1)^{pq} \eta \smile \xi$ holds. The induced homomorphism $f^* : H^*(Y; \Lambda) \to H^*(X; \Lambda)$ preserves the product, and hence the cohomology ring is a homotopy invariant.

If K is a simplicial complex, the cup product operation $\smile : H^p(K; \Lambda) \otimes H^q(K; \Lambda) \to H^{p+q}(K; \Lambda)$ is induced from the operation $\smile : \mathrm{Hom}(C_p(K), \Lambda) \otimes \mathrm{Hom}(C_q(K), \Lambda) \to \mathrm{Hom}(C_{p+q}(K), \Lambda)$ defined as follows. Adopting a linear ordering of vertices of K, we write all oriented simplexes in this ordering. Then,
for $u \in \operatorname{Hom}(C_p(K), \Lambda)$ and $v \in \operatorname{Hom}(C_q(K), \Lambda)$, we define $u \smile v \in \operatorname{Hom}(C_{p+q}(K), \Lambda)$ by $(u \smile v)([a_0, a_1, \ldots, a_{p+q}]) = u([a_0, a_1, \ldots, a_p])\,v([a_p, a_{p+1}, \ldots, a_{p+q}])$. The canonical isomorphism from the cohomology of $K$ to the singular cohomology of $|K|$ preserves the cup product.

On the de Rham complex $\mathfrak{D}(M)$ of a $C^\infty$-manifold $M$, we have the †exterior product $\omega \wedge \theta \in \mathfrak{D}^{p+q}(M)$ of $\omega \in \mathfrak{D}^p(M)$ and $\theta \in \mathfrak{D}^q(M)$. This makes $H^*(\mathfrak{D}(M))$ into a ring, which is called the de Rham cohomology ring. The canonical isomorphism of $H^*(\mathfrak{D}(M))$ to $H^*(M; \mathbf{R})$ preserves the product (→ 105 Differentiable Manifolds).

Examples. (1) Let $T^n = S^1 \times \cdots \times S^1$ denote the n-dimensional torus, and let $\pi_i: T^n \to S^1$ denote the projection to the ith factor ($1 \le i \le n$). Take a generator $\xi$ of the $\Lambda$-module $H^1(S^1; \Lambda)$, and put $\xi_i = \pi_i^*(\xi) \in H^1(T^n; \Lambda)$. Then $H^*(T^n; \Lambda)$ is the †exterior algebra over $\Lambda$ generated by $\xi_1, \ldots, \xi_n$. (2) If we denote by $CP^n$ the complex n-dimensional projective space, then $H^q(CP^n; \Lambda)$ is $\Lambda$ if $q = 2i$ ($0 \le i \le n$) and 0 otherwise. If $\xi$ is a generator of the $\Lambda$-module $H^2(CP^n; \Lambda)$, then $\xi^i$ generates the $\Lambda$-module $H^{2i}(CP^n; \Lambda)$ ($0 \le i \le n$). Thus $H^*(CP^n; \Lambda)$ is the quotient ring $\Lambda[\xi]/(\xi^{n+1})$ of the †polynomial ring $\Lambda[\xi]$ by the †ideal $(\xi^{n+1})$. (3) If $P^n$ denotes the real n-dimensional projective space, then $H^*(P^n; \mathbf{Z}_2) \cong \mathbf{Z}_2[\xi]/(\xi^{n+1})$, where $\xi$ is the generator of $H^1(P^n; \mathbf{Z}_2)$.

J. Homology of Product Spaces

If $C$ and $D$ are chain complexes, their tensor product $C \otimes D$ is a chain complex given by $(C \otimes D)_n = \sum_{p+q=n} C_p \otimes D_q$, and $\partial_n(c \otimes d) = \partial_p(c) \otimes d + (-1)^p c \otimes \partial_q(d)$ ($c \in C_p$, $d \in D_q$). The following Eilenberg-Zilber theorem (Amer. J. Math., 75 (1953)) is the link between the algebra of tensor products and the geometry of product spaces: For the product space $X \times Y$ of topological spaces $X$ and $Y$, there is an isomorphism $\rho_*: H_*(X \times Y; G) \cong H_*(S(X) \otimes S(Y) \otimes G)$ induced from a chain mapping $\rho: S(X \times Y) \to S(X) \otimes S(Y)$ defined as follows: Given a singular n-simplex $\sigma: \Delta^n \to X \times Y$, we define for each $p$ ($0 \le p \le n$) a singular p-simplex $\sigma'_p$ in $X$ to be the composite $\Delta^p \xrightarrow{\varepsilon} \Delta^n \xrightarrow{\sigma} X \times Y \xrightarrow{\pi_1} X$, where $\varepsilon(t_0, t_1, \ldots, t_p) = (t_0, t_1, \ldots, t_p, 0, \ldots, 0)$ and $\pi_1$ is the projection to the first factor. Similarly, we define for each $q$ ($0 \le q \le n$) a singular q-simplex $\sigma''_q$ in $Y$ to be the composite $\Delta^q \xrightarrow{\varepsilon'} \Delta^n \xrightarrow{\sigma} X \times Y \xrightarrow{\pi_2} Y$, where $\varepsilon'(t_{n-q}, \ldots, t_n) = (0, \ldots, 0, t_{n-q}, \ldots, t_n)$ and $\pi_2$ is the projection to the second factor. Then $\rho$ is defined by $\rho(\sigma) = \sum_{p+q=n} \sigma'_p \otimes \sigma''_q$ and is called the Alexander-Whitney mapping (Alexander-Whitney map).

Given a ring $\Lambda$, a chain mapping $\mu: (S(X) \otimes \Lambda) \otimes (S(Y) \otimes \Lambda) \to S(X) \otimes S(Y) \otimes \Lambda$ is defined by $\mu((c \otimes \lambda') \otimes (d \otimes \lambda'')) = c \otimes d \otimes \lambda'\lambda''$; it induces homomorphisms $\mu_*: H_p(X; \Lambda) \otimes H_q(Y; \Lambda) \to H_{p+q}(S(X) \otimes S(Y) \otimes \Lambda)$. The cross product $a \times b \in H_{p+q}(X \times Y; \Lambda)$ of $a \in H_p(X; \Lambda)$ and $b \in H_q(Y; \Lambda)$ is defined to be $\rho_*^{-1}(\mu_*(a \otimes b))$. If $\Lambda$ is a commutative ring with 1, then the cross product defines a $\Lambda$-homomorphism $\times: H_p(X; \Lambda) \otimes_\Lambda H_q(Y; \Lambda) \to H_{p+q}(X \times Y; \Lambda)$, and satisfies $(a \times b) \times c = a \times (b \times c)$, $T_*(a \times b) = (-1)^{pq}\, b \times a$, $(f \times g)_*(a \times b) = f_*(a) \times g_*(b)$, where $T: X \times Y \to Y \times X$ is the mapping interchanging factors, and $f: X \to X'$, $g: Y \to Y'$ are continuous mappings. If $\Lambda$ is a †principal ideal domain, there is an exact sequence

$$0 \to \sum_{p+q=n} H_p(X; \Lambda) \otimes_\Lambda H_q(Y; \Lambda) \xrightarrow{\times} H_n(X \times Y; \Lambda) \to \sum_{p+q=n-1} \operatorname{Tor}_\Lambda(H_p(X; \Lambda), H_q(Y; \Lambda)) \to 0,$$

and this sequence is split (→ 200 Homological Algebra). In particular, if $k$ is a field we have the following isomorphism of vector spaces:

$$\times: \sum_{p+q=n} H_p(X; k) \otimes H_q(Y; k) \cong H_n(X \times Y; k).$$

This is called the Künneth theorem, since the prototype was proved by H. Künneth (Math. Ann., 90 (1923); ibid., 91 (1924)). The present form was given by H. Cartan and S. Eilenberg [8].

The tensor product $C \otimes D$ of cochain complexes $C$ and $D$ is defined analogously to that of chain complexes. Given topological spaces $X$, $Y$ and a ring $\Lambda$, a cochain mapping $\mu: \operatorname{Hom}(S(X), \Lambda) \otimes \operatorname{Hom}(S(Y), \Lambda) \to \operatorname{Hom}(S(X) \otimes S(Y), \Lambda)$ is defined by $(\mu(u \otimes v))(c \otimes d) = u(c)v(d)$, where $u \in \operatorname{Hom}(S_p(X), \Lambda)$, $v \in \operatorname{Hom}(S_q(Y), \Lambda)$, $c \in S_s(X)$, $d \in S_t(Y)$, and $u(c)v(d)$ is understood to be 0 if $(p, q) \ne (s, t)$. We then have the composite $H^p(X; \Lambda) \otimes H^q(Y; \Lambda) \xrightarrow{\mu_*} H^{p+q}(\operatorname{Hom}(S(X) \otimes S(Y), \Lambda)) \xrightarrow{\rho^*} H^{p+q}(X \times Y; \Lambda)$, where $\rho$ is the Alexander-Whitney mapping. For $\xi \in H^p(X; \Lambda)$ and $\eta \in H^q(Y; \Lambda)$, the cross product $\xi \times \eta \in H^{p+q}(X \times Y; \Lambda)$ is defined to be $\rho^*\mu_*(\xi \otimes \eta)$. The cohomology cross product satisfies the properties analogous to the homology cross product.

The cup product and the cohomology cross product are given in terms of each other: $\xi \smile \eta = d^*(\xi \times \eta)$, $\xi \times \eta = \pi_1^*(\xi) \smile \pi_2^*(\eta)$, where $d: X \to X \times X$ is given by $d(x) = (x, x)$, and $\pi_1: X \times Y \to X$, $\pi_2: X \times Y \to Y$ are projections.

We have the following Künneth theorem for cohomology: If $\Lambda$ is a principal ideal domain and each $H_q(X; \Lambda)$ is finitely generated over $\Lambda$, then there is an exact sequence

$$0 \to \sum_{p+q=n} H^p(X; \Lambda) \otimes_\Lambda H^q(Y; \Lambda) \xrightarrow{\times} H^n(X \times Y; \Lambda) \to \sum_{p+q=n+1} \operatorname{Tor}_\Lambda(H^p(X; \Lambda), H^q(Y; \Lambda)) \to 0,$$
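As a small illustration of the Künneth isomorphism over a field, the Betti numbers of a product space are obtained by convolving the Betti numbers of the factors (equivalently, the Poincaré polynomials multiply). The following is only a sketch under that assumption; the function names `kunneth_betti`, `torus_betti`, and `complex_projective_betti` are hypothetical and introduced here for illustration, and the input Betti numbers of $S^1$ and $CP^n$ are the standard ones matching Examples (1) and (2).

```python
from typing import List


def kunneth_betti(bx: List[int], by: List[int]) -> List[int]:
    """Betti numbers of X x Y over a field, given those of X and Y.

    Implements dim H_n(X x Y; k) = sum over p+q=n of
    dim H_p(X; k) * dim H_q(Y; k), i.e. a convolution of the two lists.
    """
    out = [0] * (len(bx) + len(by) - 1)
    for p, bp in enumerate(bx):
        for q, bq in enumerate(by):
            out[p + q] += bp * bq
    return out


def torus_betti(n: int) -> List[int]:
    """Betti numbers of the n-torus, built as an n-fold product of circles."""
    circle = [1, 1]  # assumption: dim H_0(S^1; k) = dim H_1(S^1; k) = 1
    betti = [1]      # Betti numbers of a single point
    for _ in range(n):
        betti = kunneth_betti(betti, circle)
    return betti


def complex_projective_betti(n: int) -> List[int]:
    """Betti numbers of CP^n: one class in each even degree 0, 2, ..., 2n."""
    return [1 if q % 2 == 0 else 0 for q in range(2 * n + 1)]
```

For the torus the result is a row of binomial coefficients, in agreement with the exterior algebra on $n$ generators of degree 1 described in Example (1).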
and this sequence is split (→ 200 Homological Algebra). For $\xi \in H^p(X; \Lambda)$, $\eta \in H^q(Y; \Lambda)$, $\xi' \in H^{p'}(X; \Lambda)$, $\eta' \in H^{q'}(Y; \Lambda)$, the formula $(\xi \times \eta) \smile (\xi' \times \eta') = (-1)^{qp'} (\xi \smile \xi') \times (\eta \smile \eta')$ holds. Therefore, if $k$ is a field and $\dim_k H_q(X; k) < \infty$ for each $q$, then the cohomology ring $H^*(X \times Y; k)$ is determined by the cohomology rings $H^*(X; k)$ and $H^*(Y; k)$.

†Fiber bundles can be considered as generalized product spaces. Let $E$ be the total space of a fiber bundle with base $B$ and fiber $F$. The following Leray-Hirsch theorem (J. Math. Pures Appl., 29 (1950)) asserts that, under certain conditions, the cohomology of $E$ is additively isomorphic to that of $B \times F$: Let $\Lambda$ be a principal ideal domain, and assume that $H_*(F; \Lambda)$ is free and finitely generated over $\Lambda$. Furthermore, assume that there is a homomorphism $\theta: H^*(F; \Lambda) \to H^*(E; \Lambda)$ such that the composite $H^*(F; \Lambda) \xrightarrow{\theta} H^*(E; \Lambda) \xrightarrow{i_b^*} H^*(p^{-1}(b); \Lambda)$ is an isomorphism for each $b \in B$, where $p: E \to B$ is the projection and $i_b: p^{-1}(b) \subset E$. Then an isomorphism $\Phi: H^*(B; \Lambda) \otimes_\Lambda H^*(F; \Lambda) \cong H^*(E; \Lambda)$ is given by $\Phi(\xi \otimes \eta) = p^*\xi \smile \theta(\eta)$, where $\xi \in H^*(B; \Lambda)$, $\eta \in H^*(F; \Lambda)$.

A general connection between (co)homology of $E$ and $B \times F$ is given by means of spectral sequences (→ 148 Fiber Spaces).

K. Cap and Slant Products

There are other products closely related to the cup product or the cross product that involve cohomology and homology together.

Given a topological space $X$ and a ring $\Lambda$, the cap product $v \frown c \in S_p(X) \otimes \Lambda$ of a cochain $v \in \operatorname{Hom}(S_q(X), \Lambda)$ and a chain $c = \sum_i \sigma_i \otimes \lambda_i \in S_{p+q}(X) \otimes \Lambda$ is defined by $v \frown c = \sum_i \sigma_i \circ \varepsilon \otimes v(\sigma_i \circ \varepsilon')\lambda_i$, where $\sigma_i$ are singular $(p+q)$-simplexes in $X$, $\lambda_i \in \Lambda$, and $\varepsilon: \Delta^p \to \Delta^{p+q}$, $\varepsilon': \Delta^q \to \Delta^{p+q}$ are the mappings used in the definition of cup product. For any $u \in \operatorname{Hom}(S_p(X), \Lambda)$, the formula $\langle u \smile v, c \rangle = \langle u, v \frown c \rangle$ holds. The cap product satisfies $\partial(v \frown c) = (-1)^p\, \delta v \frown c + v \frown \partial c$, and hence it induces the cap product $\xi \frown a \in H_p(X; \Lambda)$ of a cohomology class $\xi \in H^q(X; \Lambda)$ and a homology class $a \in H_{p+q}(X; \Lambda)$. If $\Lambda$ is a commutative ring with 1, then the cap product operation is bilinear and satisfies the following properties: $(\xi \smile \xi') \frown a = \xi \frown (\xi' \frown a)$, $f_*(f^*\eta \frown a) = \eta \frown f_*(a)$, $1 \frown a = a$, $(\xi \times \eta) \frown (a \times b) = (-1)^{p(s-q)} (\xi \frown a) \times (\eta \frown b)$, where $\xi \in H^p(X; \Lambda)$, $\xi' \in H^{p'}(X; \Lambda)$, $\eta \in H^q(Y; \Lambda)$, $a \in H_r(X; \Lambda)$, $b \in H_s(Y; \Lambda)$, and $f: X \to Y$ is a continuous mapping.

Given topological spaces $X$, $Y$ and a ring $\Lambda$, the slant product $w/d \in \operatorname{Hom}(S_p(X), \Lambda)$ of a cochain $w \in \operatorname{Hom}((S(X) \otimes S(Y))_{p+q}, \Lambda)$ and a chain $d = \sum_i \tau_i \otimes \lambda_i \in S_q(Y) \otimes \Lambda$ is defined by $(w/d)(\sigma) = \sum_i w(\sigma \otimes \tau_i)\lambda_i$, where $\sigma$ is a singular p-simplex in $X$, $\tau_i$ are singular q-simplexes in $Y$, and $\lambda_i \in \Lambda$. The slant operation satisfies $\delta(w/d) = (\delta w)/d - (-1)^p\, w/\partial d$. Therefore, under the identification $H^*(X \times Y; \Lambda) = H^*(\operatorname{Hom}(S(X) \otimes S(Y), \Lambda))$, it induces the slant product $\zeta/b \in H^p(X; \Lambda)$ of a cohomology class $\zeta \in H^{p+q}(X \times Y; \Lambda)$ and a homology class $b \in H_q(Y; \Lambda)$. For any $a \in H_p(X; \Lambda)$, it holds that $\langle \zeta/b, a \rangle = \langle \zeta, a \times b \rangle$.

Let $G$, $G'$ and $G''$ be Abelian groups. Given a homomorphism $G' \otimes G'' \to G$, we write $g'g''$ for the image of $g' \otimes g'' \in G' \otimes G''$ in $G$. Then the cap product $\frown: H^q(X; G') \otimes H_{p+q}(X; G'') \to H_p(X; G)$ can be defined in the same way as before. Similar definitions are valid for the cup product, the cross products for homology and cohomology, and the slant product.

L. Relative Homology

If $C'$ is a subcomplex of a chain complex $C$, then we have a chain complex $C/C' = \{C_q/C'_q, \bar\partial_q\}$, where $C_q/C'_q$ denotes the quotient group and $\bar\partial_q$ is induced from $\partial_q$ by passing to the quotient. $C/C'$ is called the quotient complex of $C$ by $C'$.

A topological pair $(X, A)$ is composed of a topological space $X$ and its subset $A$. Given a topological pair $(X, A)$ and an Abelian group $G$, we have the chain complex $(S(X)/S(A)) \otimes G$ and the cochain complex $\operatorname{Hom}(S(X)/S(A), G)$. The homology group of $(S(X)/S(A)) \otimes G$ is denoted by $H_*(X, A; G)$ and is called the relative singular homology group of $X$ modulo $A$ with coefficients in $G$ or the singular homology group of $(X, A)$ with coefficients in $G$. The homology group $H_*(X; G) = H_*(X, \emptyset; G)$ is sometimes called the absolute homology group. Similar definitions are made for the cohomology group $H^*(X, A; G)$ of the cochain complex $\operatorname{Hom}(S(X)/S(A), G)$.

A simplicial pair $(K, L)$ is composed of a simplicial complex $K$ and its subcomplex $L$, and a CW pair $(X, A)$ is composed of a CW complex $X$ and its subcomplex $A$. The relative homology group $H_*(K, L; G) = H_*((C(K)/C(L)) \otimes G)$ of a simplicial pair $(K, L)$ is isomorphic to $H_*(|K|, |L|; G)$. For a CW pair $(X, A)$, there is an isomorphism $H_*(X, A; G) \cong \tilde H_*(X/A; G)$. Similar statements hold for the cohomology groups.

A continuous mapping $f: (X, A) \to (Y, B)$ of topological pairs is a continuous mapping $f: X \to Y$ such that $f(A) \subset B$. If $f: (X, A) \to (Y, B)$ is a continuous mapping, then $f_\#: S(X) \to S(Y)$ sends $S(A)$ to $S(B)$, and hence $f$ induces homomorphisms $f_*: H_*(X, A; G) \to H_*(Y, B; G)$ and
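As a worked instance of the split Künneth exact sequence for homology in Section J with $\Lambda = \mathbf{Z}$, the following sketch computes the integral homology of $P^2 \times P^2$, where $P^2$ is the real projective plane. The input values $H_0(P^2) = \mathbf{Z}$, $H_1(P^2) = \mathbf{Z}_2$, and $H_q(P^2) = 0$ for $q \ge 2$ are standard facts assumed here rather than taken from the text. The example shows that the Tor term can contribute a class in a degree where every tensor term vanishes.

```latex
% Assumed input: H_0(P^2) = Z, H_1(P^2) = Z_2, H_q(P^2) = 0 for q >= 2.
% Each group is the direct sum of the tensor terms (p + q = n)
% and the Tor terms (p + q = n - 1).
\begin{align*}
H_0(P^2 \times P^2) &\cong \mathbf{Z} \otimes \mathbf{Z} = \mathbf{Z}, \\
H_1(P^2 \times P^2) &\cong (\mathbf{Z} \otimes \mathbf{Z}_2) \oplus (\mathbf{Z}_2 \otimes \mathbf{Z})
                      = \mathbf{Z}_2 \oplus \mathbf{Z}_2, \\
H_2(P^2 \times P^2) &\cong (\mathbf{Z}_2 \otimes \mathbf{Z}_2)
                      \oplus \operatorname{Tor}(\mathbf{Z}, \mathbf{Z}_2)
                      \oplus \operatorname{Tor}(\mathbf{Z}_2, \mathbf{Z})
                      = \mathbf{Z}_2, \\
H_3(P^2 \times P^2) &\cong \operatorname{Tor}(\mathbf{Z}_2, \mathbf{Z}_2) = \mathbf{Z}_2, \\
H_n(P^2 \times P^2) &= 0 \quad (n \ge 4).
\end{align*}
```

The group in degree 3 comes entirely from $\operatorname{Tor}(H_1(P^2), H_1(P^2))$, so it is invisible to the cross product alone.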
$f^*: H^*(Y, B; G) \to H^*(X, A; G)$. Since the properties are analogous, we state them below only for the case of relative homology.

The following six properties are fundamental. (i) $1_* = 1$. (ii) $(g \circ f)_* = g_* \circ f_*$. (iii) Homotopy property: If $f, f': (X, A) \to (Y, B)$ are †homotopic, then $f_* = f'_*$. (iv) Exactness property: There exists a homology exact sequence of $(X, A)$: $\cdots \xrightarrow{\partial_*} H_q(A; G) \xrightarrow{i_*} H_q(X; G) \xrightarrow{j_*} H_q(X, A; G) \xrightarrow{\partial_*} H_{q-1}(A; G) \xrightarrow{i_*} \cdots$, where $i: A \subset X$, $j: (X, \emptyset) \subset (X, A)$, and $\partial_*$ sends the homology class of a cycle of $(S(X)/S(A)) \otimes G$ represented by a chain $c \in S(X) \otimes G$ to the homology class of $\partial c$, which is a cycle of $S(A) \otimes G$. $\partial_*$ is called the boundary homomorphism or the connecting homomorphism. (v) Naturality of $\partial_*$: For any continuous mapping $f: (X, A) \to (Y, B)$, it holds that $\partial_* \circ f_* = (f|A)_* \circ \partial_*$. (vi) Excision property: If $U$ is a subset of $X$ such that the closure $\bar U$ is in $\operatorname{Int} A$, then the excision isomorphism $H_*(X - U, A - U; G) \cong H_*(X, A; G)$ is induced by inclusion.

The exactness property extends to the homology exact sequence of a triple $(X, A, B)$: $\cdots \xrightarrow{\partial_*} H_q(A, B; G) \xrightarrow{i_*} H_q(X, B; G) \xrightarrow{j_*} H_q(X, A; G) \xrightarrow{\partial_*} H_{q-1}(A, B; G) \to \cdots$. A couple $\{X_1, X_2\}$ of subsets of $X$ is said to be excisive if $H_*(X_1, X_1 \cap X_2) \cong H_*(X_1 \cup X_2, X_2)$ is induced by inclusion. If $\{X_1, X_2\}$ is excisive, so is $\{X_2, X_1\}$. For example, if $X = \operatorname{Int} X_1 \cup \operatorname{Int} X_2$ or if $X_1$ and $X_2$ are subcomplexes of a CW complex, then $\{X_1, X_2\}$ is excisive. If $\{X_1, X_2\}$ and $\{A_1, A_2\}$ are excisive couples such that $A_1 \subset X_1$ and $A_2 \subset X_2$, then we have the relative Mayer-Vietoris exact sequence: $\cdots \to H_q(X_1 \cap X_2, A_1 \cap A_2; G) \to H_q(X_1, A_1; G) \oplus H_q(X_2, A_2; G) \to H_q(X_1 \cup X_2, A_1 \cup A_2; G) \to H_{q-1}(X_1 \cap X_2, A_1 \cap A_2; G) \to \cdots$.

For the case of relative cohomology, we use terms such as cohomology exact sequence and coboundary homomorphism, correspondingly.

The universal coefficient theorems are valid for the relative (co)homology groups.

Given a homomorphism $G' \otimes G'' \to G$, if $\{A, B\}$ is excisive in $X$, then the cup product $\smile: H^p(X, A; G') \otimes H^q(X, B; G'') \to H^{p+q}(X, A \cup B; G)$ and the cap products $\frown: H^q(X, B; G') \otimes H_{p+q}(X, A \cup B; G'') \to H_p(X, A; G)$ can be defined. The product $(X, A) \times (Y, B)$ is defined to be the pair $(X \times Y, A \times Y \cup X \times B)$. Given a homomorphism $G' \otimes G'' \to G$, the homology cross products $\times: H_p(X, A; G') \otimes H_q(Y, B; G'') \to H_{p+q}((X, A) \times (Y, B); G)$ and the slant products $/: H^{p+q}((X, A) \times (Y, B); G') \otimes H_q(Y, B; G'') \to H^p(X, A; G)$ can be defined. If $\{A \times Y, X \times B\}$ is excisive, then the cohomology cross products $\times: H^p(X, A; G') \otimes H^q(Y, B; G'') \to H^{p+q}((X, A) \times (Y, B); G)$ can also be defined. If $\{A \times Y, X \times B\}$ is excisive, the Künneth theorems are valid for the relative (co)homology groups [9, 10, 11].

M. Čech Homology Theory

Another homology theory is commonly used along with the singular theory. The theory was originated by E. Čech (Fund. Math., 19 (1932)) and was modified by C. H. Dowker (Ann. Math., (2) 51 (1950)).

Given a topological space $X$ and an Abelian group $G$, the Čech homology group $\check H_*(X; G)$ and the Čech cohomology group $\check H^*(X; G)$ are defined as follows. We take the family of all †open coverings of $X$ directed by †refinement, and we consider the †nerve $K(\mathfrak{U})$ of each open covering $\mathfrak{U}$, that is, the simplicial complex whose simplexes are finite nonempty subsets of $\mathfrak{U}$ with nonempty intersection. If $\mathfrak{U}'$ is a refinement of $\mathfrak{U}$, then a simplicial mapping $\pi(\mathfrak{U}, \mathfrak{U}'): K(\mathfrak{U}') \to K(\mathfrak{U})$ is obtained by assigning to each $U' \in \mathfrak{U}'$ an element $U \in \mathfrak{U}$ such that $U' \subset U$. The induced homomorphisms $\pi(\mathfrak{U}, \mathfrak{U}')_*: H_*(K(\mathfrak{U}'); G) \to H_*(K(\mathfrak{U}); G)$ and $\pi(\mathfrak{U}, \mathfrak{U}')^*: H^*(K(\mathfrak{U}); G) \to H^*(K(\mathfrak{U}'); G)$ are independent of the choice of $\pi(\mathfrak{U}, \mathfrak{U}')$, and we have the †inverse system $\{H_*(K(\mathfrak{U}); G), \pi(\mathfrak{U}, \mathfrak{U}')_*\}$ and the †direct system $\{H^*(K(\mathfrak{U}); G), \pi(\mathfrak{U}, \mathfrak{U}')^*\}$. We now define $\check H_*(X; G) = \varprojlim H_*(K(\mathfrak{U}); G)$ and $\check H^*(X; G) = \varinjlim H^*(K(\mathfrak{U}); G)$.

A continuous mapping $f: X \to Y$ induces homomorphisms $f_*: \check H_*(X; G) \to \check H_*(Y; G)$ and $f^*: \check H^*(Y; G) \to \check H^*(X; G)$ as follows. If $\mathfrak{V}$ is an open covering of $Y$, then a simplicial mapping $f_{\mathfrak{V}}: K(f^{-1}(\mathfrak{V})) \to K(\mathfrak{V})$ is defined by $f_{\mathfrak{V}}(f^{-1}(V)) = V$ ($V \in \mathfrak{V}$). The induced homomorphisms $f_{\mathfrak{V}*}: H_*(K(f^{-1}(\mathfrak{V})); G) \to H_*(K(\mathfrak{V}); G)$ for all open coverings $\mathfrak{V}$ of $Y$ give rise to $f_*: \check H_*(X; G) \to \check H_*(Y; G)$. Similarly $f$ induces $f^*$.

Another approach to Čech cohomology theory is called the Alexander-Kolmogorov construction (Proc. Nat. Acad. Sci., 21 (1935) and C. R. Acad. Sci. Paris, 202 (1936)). The approach was improved by E. H. Spanier (Ann. Math., (2) 49 (1948)) and the theory is now called the Alexander (or Alexander-Kolmogorov-Spanier) cohomology theory. The Alexander cohomology group $\bar H^*(X; G)$ is defined as follows. Let $\Phi^q(X; G)$ be the Abelian group of all functions from the $(q+1)$-fold product space $X^{q+1}$ to $G$ with addition defined pointwise. An element $\varphi \in \Phi^q(X; G)$ is said to be locally zero if there is an open covering $\mathfrak{U}$ of $X$ such that $\varphi(x_0, \ldots, x_q)$ vanishes if $x_0, \ldots, x_q$ are contained simultaneously in some $U \in \mathfrak{U}$. The subgroup of $\Phi^q(X; G)$ consisting of locally zero functions is denoted by $\Phi_0^q(X; G)$. We define a homomorphism $\delta^q: \Phi^q(X; G) \to \Phi^{q+1}(X; G)$ by $(\delta^q\varphi)(x_0, x_1, \ldots, x_{q+1}) = \sum_{i=0}^{q+1} (-1)^i \varphi(x_0, \ldots, x_{i-1}, x_{i+1}, \ldots, x_{q+1})$. Then $\Phi(X; G) = \{\Phi^q(X; G), \delta^q\}$ is a cochain complex, and $\Phi_0(X; G) = \{\Phi_0^q(X; G), \delta^q\}$ is its subcomplex. We now define $\bar H^*(X; G)$ to be
the cohomology group of the quotient complex $\bar\Phi(X; G) = \Phi(X; G)/\Phi_0(X; G)$.

If $f: X \to Y$ is a continuous mapping, then a cochain map $f^\#: \bar\Phi(Y; G) \to \bar\Phi(X; G)$ is defined by $(f^\#\varphi)(x_0, x_1, \ldots, x_q) = \varphi(f(x_0), f(x_1), \ldots, f(x_q))$, and it induces the homomorphism $f^*: \bar H^*(Y; G) \to \bar H^*(X; G)$. There is a natural isomorphism $\check H^*(X; G) \cong \bar H^*(X; G)$.

The (co)homology group of a simplicial complex $K$ is isomorphic to the Čech (co)homology group of $|K|$. If $X$ is a manifold or a CW complex, then its singular (co)homology group and its Čech (co)homology group are isomorphic. However, even for compact metric spaces $X$, the singular (co)homology group of $X$ is not necessarily isomorphic with the Čech (co)homology group of $X$.

If $\{X_\lambda\}$ is an †inverse system of compact Hausdorff spaces and $X = \varprojlim X_\lambda$, then there are isomorphisms $\varprojlim \check H_*(X_\lambda; G) \cong \check H_*(X; G)$ and $\varinjlim \check H^*(X_\lambda; G) \cong \check H^*(X; G)$. This is called the continuity property for Čech theory. If $A$ is any closed subset of a manifold $M$, then there is an isomorphism $\varinjlim H^*(W; G) \cong \check H^*(A; G)$, where $W$ varies over neighborhoods of $A$ in $M$ directed downward by inclusion. If the †covering dimension of $X$ is $n$, $\check H^q(X; G) = 0$ for $q > n$.

The cup product in the Čech cohomology is introduced simply by passing to the limit with cup products in simplicial complexes, and the cup product in the Alexander cohomology is induced from the operation $\smile: \Phi^p(X; G') \otimes \Phi^q(X; G'') \to \Phi^{p+q}(X; G)$ defined by $(\varphi \smile \psi)(x_0, x_1, \ldots, x_{p+q}) = \varphi(x_0, x_1, \ldots, x_p)\,\psi(x_p, x_{p+1}, \ldots, x_{p+q})$.

The relative Čech homology group $\check H_*(X, A; G)$ and the relative Čech cohomology group $\check H^*(X, A; G)$ are defined as follows: An open covering of $(X, A)$ is a pair $(\mathfrak{U}, \mathfrak{N})$ of an open covering $\mathfrak{U}$ of $X$ and an open covering $\mathfrak{N}$ of $A$ such that $\mathfrak{N} \subset \mathfrak{U}$. To such a pair $(\mathfrak{U}, \mathfrak{N})$ we assign a simplicial pair $(K(\mathfrak{U}), K'(\mathfrak{N}))$, where $K'(\mathfrak{N})$ is the nerve of $\mathfrak{N} \cap A = \{N \cap A \mid N \in \mathfrak{N}\}$. Considering the family of all open coverings of $(X, A)$, we define now $\check H_*(X, A; G) = \varprojlim H_*(K(\mathfrak{U}), K'(\mathfrak{N}); G)$ and $\check H^*(X, A; G) = \varinjlim H^*(K(\mathfrak{U}), K'(\mathfrak{N}); G)$.

The relative Alexander cohomology group $\bar H^*(X, A; G)$ is defined to be the cohomology group of the cochain complex which is the kernel of the cochain mapping $i^\#: \bar\Phi(X; G) \to \bar\Phi(A; G)$ induced by inclusion. There is a natural isomorphism $\check H^*(X, A; G) \cong \bar H^*(X, A; G)$.

The relative Čech (co)homology groups satisfy the properties analogous to the relative singular (co)homology groups except the exactness property for homology (→ Section Q). In certain cases, the excision property is strengthened for Čech (co)homology. For example, we have the following theorem: Assume that $X$ and $Y$ are compact Hausdorff spaces, $A$ and $B$ are closed subsets of $X$ and $Y$, respectively, and $f: (X, A) \to (Y, B)$ is a continuous mapping which maps $X - A$ onto $Y - B$ homeomorphically. Then $f_*: \check H_*(X, A; G) \cong \check H_*(Y, B; G)$ and $f^*: \check H^*(Y, B; G) \cong \check H^*(X, A; G)$ hold.

N. Fundamental Classes of Manifolds

For a topological space $X$ and a point $x$ of $X$, the local homology groups $H_*(X, X - x)$ represent a topological property of $X$ around $x$. The notion of †orientation for differentiable manifolds and triangulable manifolds generalizes to †topological manifolds by using local homology groups as follows. Let $M$ be an n-dimensional (topological) manifold with †boundary $\partial M$. If $x$ is a point of the interior $M_0 = M - \partial M$, then $H_q(M, M - x) \cong H_q(\mathbf{R}^n, \mathbf{R}^n - 0)$ is $\mathbf{Z}$ for $q = n$ and is 0 for $q \ne n$. We define a local orientation $o_x$ for $M$ at $x \in M_0$ to be a choice of one of the two possible generators for $H_n(M, M - x)$, and we then define an orientation for $M$ to be a function which assigns to each $x \in M_0$ a local orientation $o_x$ which varies continuously with $x$ in the following sense: For each $x$ there should exist a compact neighborhood $N \subset M_0$ and an element $o_N \in H_n(M, M - N)$ such that $i_{y*}(o_N) = o_y$ for each $y \in N$, where $i_y: (M, M - N) \subset (M, M - y)$. If there is an orientation for $M$, then $M$ is said to be orientable, and the pair of $M$ and an orientation is called an oriented manifold. If $M$ is a nonorientable manifold without boundary, the set of local orientations for $M$ forms an orientable manifold doubly covering $M$, called the orientation manifold of $M$.

If $M$ is an oriented n-dimensional manifold, then for any compact subset $K$ of $M$ there is a unique element $o_K \in H_n(M, (M - K) \cup \partial M)$ such that $i_{x*}(o_K) = o_x$ for each $x \in K \cap M_0$, where $i_x: (M, (M - K) \cup \partial M) \subset (M, M - x)$. The element $o_K$ is called the fundamental homology class around $K$. In particular, if $M$ is itself compact, $o_M \in H_n(M, \partial M)$ is usually denoted by $[M]$ and is called the fundamental homology class of $M$. A connected compact n-dimensional manifold $M$ is orientable if and only if $H_n(M, \partial M) \ne 0$, and in this case $H_n(M, \partial M)$ is a free cyclic group generated by a fundamental class $[M]$. If $M$ is an orientable compact n-dimensional manifold, then $\partial M$ is an orientable compact $(n-1)$-dimensional manifold without boundary, and the boundary homomorphism $\partial_*: H_n(M, \partial M) \to H_{n-1}(\partial M)$ sends a fundamental class $[M]$ to a fundamental class $[\partial M]$.

An n-dimensional manifold $M$ is orientable if and only if there exists an element $U \in H^n(M \times M, M \times M - dM)$ such that, for each $x \in M_0$,
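The Alexander coboundary $\delta^q$ and the induced cochain map $f^\#$ can be checked directly on a finite point set. The following is a minimal sketch, assuming integer coefficients and a finite set of points encoded as integers; the names `coboundary`, `pullback`, and `all_tuples` are hypothetical helpers introduced only for this illustration. It models a cochain as a function on tuples of points and ignores the passage to the quotient by locally zero functions.

```python
from itertools import product
from typing import Callable, Iterable, Iterator, Tuple

Point = int
Cochain = Callable[[Tuple[Point, ...]], int]


def coboundary(phi: Cochain) -> Cochain:
    """Alexander coboundary: alternating sum of phi with one argument omitted.

    If phi takes q+1 points, the result takes q+2 points and equals
    sum over i = 0..q+1 of (-1)^i * phi(x_0, ..., x_i omitted, ..., x_{q+1}).
    """
    def dphi(xs: Tuple[Point, ...]) -> int:
        return sum((-1) ** i * phi(xs[:i] + xs[i + 1:]) for i in range(len(xs)))
    return dphi


def pullback(f: Callable[[Point], Point], phi: Cochain) -> Cochain:
    """Cochain map induced by f: evaluate phi on the image points."""
    def fphi(xs: Tuple[Point, ...]) -> int:
        return phi(tuple(f(x) for x in xs))
    return fphi


def all_tuples(points: Iterable[Point], k: int) -> Iterator[Tuple[Point, ...]]:
    """All k-tuples of points, i.e. the k-fold product of the point set."""
    return product(tuple(points), repeat=k)
```

Evaluating `coboundary(coboundary(phi))` on every tuple from `all_tuples` gives 0 for any `phi`, which is the cochain complex condition, and `coboundary(pullback(f, phi))` agrees with `pullback(f, coboundary(phi))`, which is what makes $f^\#$ a cochain map.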
$j_x^*(U)$ is a generator of $H^n(M, M - x)$, where $dM$ is the diagonal in $M \times M$, and $j_x: (M, M - x) \to (M \times M, M \times M - dM)$ is given by $j_x(y) = (x, y)$ ($y \in M$). In fact, $U$ corresponds to an orientation which assigns to each $x \in M_0$ a local orientation $o_x$ such that $\langle j_x^*(U), o_x \rangle = 1$. The element $U$ is called the orientation cohomology class of $M$. If $M$ is a compact manifold without boundary, it holds that $\langle d^*(U), [M] \rangle = \chi(M)$, where $d^*: H^n(M \times M, M \times M - dM) \to H^n(M)$ is induced by the diagonal mapping. The element $d^*(U) \in H^n(M)$ is called the Euler class of $M$.

If we work with the (co)homology groups with coefficients in $\mathbf{Z}_2$, the fundamental classes are defined for an arbitrary manifold without making any assumption of orientability. If $M$ is connected and compact, then $H_n(M, \partial M; \mathbf{Z}_2) \cong \mathbf{Z}_2$ is generated by $[M]$.

O. Duality in Manifolds

Let $M$ be a compact n-dimensional manifold, and let $M_1$, $M_2$ be compact $(n-1)$-dimensional manifolds such that $M_1 \cup M_2 = \partial M$ and $M_1 \cap M_2 = \partial M_1 = \partial M_2$. Assume either that $M$ is oriented or that $G = \mathbf{Z}_2$. Then for each $q$, an isomorphism $D: H^q(M, M_1; G) \cong H_{n-q}(M, M_2; G)$ is defined by $D(\xi) = \xi \frown [M]$. In particular, there are the isomorphisms $D: H^q(M, \partial M; G) \cong H_{n-q}(M; G)$ and $H^q(M; G) \cong H_{n-q}(M, \partial M; G)$, where the cap product is taken with respect to the homomorphism $G \otimes \mathbf{Z} \to G$ defined by multiplication. This theorem is called the Poincaré-Lefschetz duality theorem, and the special case for $\partial M = \emptyset$ is often referred to as Poincaré duality.

Poincaré duality implies the following consequences for a compact n-dimensional manifold $M$ without boundary. If $M$ is orientable, then the qth Betti number is equal to the $(n-q)$th Betti number, and the qth torsion coefficients are equal to the $(n-q-1)$th torsion coefficients. If $n$ is odd, then $\chi(M) = 0$, and if $M$ is orientable and $n \equiv 2 \bmod 4$, then $\chi(M)$ is even.

Poincaré duality generalizes to the following duality theorem. Let $M$ be an n-dimensional manifold without boundary, and let $K$ be a compact subset of $M$. Assume either that $M$ is oriented or that $G = \mathbf{Z}_2$. Then there is an isomorphism $D: \check H^q(K; G) \cong H_{n-q}(M, M - K; G)$ for any $q$, which is given as follows: For each open neighborhood $W$ of $K$, define $D_W: H^q(W; G) \to H_{n-q}(M, M - K; G)$ by $D_W(\xi) = k_*(\xi \frown k_*^{-1}(o_K))$, where $k_*$ is the excision isomorphism induced by $k: (W, W - K) \subset (M, M - K)$. Now $D$ is defined to be the limit of $D_W$, where $W$ varies over open neighborhoods of $K$. The inverse of $D$ up to sign is given in terms of the slant product with the fundamental cohomology class $U$ of $M$ as follows: For each open neighborhood $W$ of $K$, we define a homomorphism $\gamma_W: H_{n-q}(M, M - W; G) \to H^q(W; G)$ by $\gamma_W(a) = j^*(U)/a$, where $j^*: H^n(M \times M, M \times M - dM) \to H^n(W \times (M, M - W))$ is induced by inclusion. Then, passing to the limit, these $\gamma_W$ define the desired one.

If we take an n-sphere $S^n$ as $M$ in the above duality theorem and use the homology exact sequence of $(S^n, S^n - K)$, then we have the following Alexander duality theorem (Trans. Amer. Math. Soc., 23 (1922)): If $K$ is a closed subset of $S^n$, then the qth reduced Čech cohomology group of $K$ is isomorphic to the $(n-q-1)$th reduced singular homology group of $S^n - K$ for any coefficient group $G$ and any $q$. In particular, if $K$ is a †neighborhood retract, $\tilde H^q(K; G) \cong \tilde H_{n-q-1}(S^n - K; G)$ holds. This shows that $H_*(S^n - K)$ depends only on $K$ and not on the way $K$ is embedded in $S^n$. The Alexander duality theorem for $n = 2$ and $K = S^1$ gives the classical †Jordan curve theorem.

In view of the duality theorems, certain classical definitions in the homology of manifolds can be given in terms of cohomology. For example, if $f: M \to M'$ is a continuous mapping of oriented closed manifolds, then the Umkehr homomorphism or Gysin homomorphism $f_!: H_q(M'; G) \to H_{q+d}(M; G)$ ($d = \dim M - \dim M'$) (W. Gysin, Comment. Math. Helv., 14 (1941)) can be defined by $D^{-1} \circ f_! = f^* \circ D^{-1}$. In cohomology we have $f^!: H^q(M; G) \to H^{q-d}(M'; G)$. Similarly, if $M$ is an oriented n-dimensional closed manifold and $a \in H_p(M)$, $b \in H_q(M)$, then the intersection product $a \cdot b \in H_{p+q-n}(M)$ of Lefschetz can be defined by $a \cdot b = D^{-1}a \frown b = D(D^{-1}a \smile D^{-1}b)$. If $p + q = n$, the number $a \cdot b \in H_0(M) \cong \mathbf{Z}$ is called the intersection number of $a$ and $b$. The classical definitions are still meaningful today, since they are closer to geometric intuition and therefore possess considerable heuristic value. For example, the following fact serves to compute cup products in manifolds. If $M$ is an oriented closed differentiable manifold and $a, b \in H_*(M)$ are represented by closed submanifolds $N_1$, $N_2$ which intersect †transversally, then $\pm a \cdot b$ is represented by $N_1 \cap N_2$ [11]. See [12] for a rigorous discussion of classical intersection theory.

P. Cohomology with Compact Supports

Let $X$ be a topological space. A subset $V$ of $X$ is said to be cobounded if $\overline{X - V}$ is compact. A singular q-cochain $u \in \operatorname{Hom}(S_q(X), G)$ is said to have compact support if there exists a cobounded set $V$ such that $u(\sigma) = 0$ for every singular q-simplex $\sigma$ in $V$. The singular cochains with compact support form a subcomplex
of the cochain complex $\operatorname{Hom}(S(X), G)$. The cohomology group of this subcomplex is denoted by $H_c^*(X; G)$ and is called the singular cohomology group of $X$ with compact supports. There is an isomorphism $H_c^*(X; G) \cong \varinjlim H^*(X, V; G)$, where $V$ varies over cobounded subsets of $X$.

Let $K$ be a simplicial complex. A q-cochain $u \in \operatorname{Hom}(C_q(K), G)$ is called a finite cochain of $K$ if $u(\sigma) = 0$ except for a finite number of oriented q-simplexes $\sigma$ of $K$. If $K$ is a †locally finite simplicial complex, then finite cochains of $K$ form a subcomplex of the cochain complex $\operatorname{Hom}(C(K), G)$ whose cohomology group is isomorphic to $H_c^*(|K|; G)$.

Let $X$ be a †locally compact Hausdorff space, and let $X \cup \{\infty\}$ denote the †one-point compactification of $X$. Then the Čech cohomology group of $X$ with compact supports, denoted by $\check H_c^*(X; G)$, is defined to be the reduced Čech cohomology group of $X \cup \{\infty\}$ with coefficients in $G$. There is an isomorphism $\check H_c^*(X; G) \cong \varinjlim \check H^*(X, V; G)$, where $V$ varies over cobounded subsets of $X$. If $X$ is a manifold or a CW complex, then $H_c^*(X; G) \cong \check H_c^*(X; G)$. If $X$ is a compact Hausdorff space and $A$ is closed in $X$, then $\check H_c^*(X - A; G) \cong \check H^*(X, A; G)$. The Alexander-Kolmogorov construction gives a direct approach to $\check H_c^*(X; G)$ [10, 13].

A †proper continuous mapping $f: X \to Y$ of locally compact Hausdorff spaces induces homomorphisms $f^*: H_c^*(Y; G) \to H_c^*(X; G)$ and $f^*: \check H_c^*(Y; G) \to \check H_c^*(X; G)$, and if $f, f': X \to Y$ are properly homotopic, then they induce the same homomorphisms.

The cohomology with compact supports is useful in order to extend results in the cohomology of compact spaces to noncompact spaces. For example, the conclusion of the duality theorem on a compact set $K \subset M$ in Section O generalizes to the case of a closed set $K \subset M$ as follows: There is an isomorphism $\check H_c^q(K; G) \cong H_{n-q}(M, M - K; G)$ for any $q$. This implies the following generalization of Poincaré duality: $H_c^q(M; G) \cong H_{n-q}(M; G)$ holds for an orientable n-dimensional manifold $M$ without boundary.

There are homology theories associated with the cohomology theories with compact supports [13].

Q. Eilenberg-Steenrod Axioms

Let $H^*$ be a collection of the following three functions: (1) A function assigning to each topological pair $(X, A)$ and each integer $q$ an Abelian group $H^q(X, A)$. (2) A function assigning to each continuous mapping $f: (X, A) \to (Y, B)$ and each integer $q$ a homomorphism $f^*: H^q(Y, B) \to H^q(X, A)$. (3) A function assigning to each topological pair $(X, A)$ and each integer $q$ a homomorphism $\delta^*: H^q(A) \to H^{q+1}(X, A)$. Then $H^*$ is called a cohomology theory on the category of topological pairs if the following seven axioms are satisfied [7]. (i) $1^* = 1$, where 1 is identity. (ii) $(g \circ f)^* = f^* \circ g^*: H^q(Z, C) \to H^q(X, A)$ for continuous mappings $f: (X, A) \to (Y, B)$ and $g: (Y, B) \to (Z, C)$. (iii) Homotopy axiom: If $f, f': (X, A) \to (Y, B)$ are homotopic, then $f^* = f'^*: H^q(Y, B) \to H^q(X, A)$. (iv) Exactness axiom: The sequence $\cdots \xrightarrow{\delta^*} H^q(X, A) \xrightarrow{j^*} H^q(X) \xrightarrow{i^*} H^q(A) \xrightarrow{\delta^*} H^{q+1}(X, A) \xrightarrow{j^*} \cdots$ is exact, where $i: A \subset X$ and $j: (X, \emptyset) \subset (X, A)$. (v) $f^* \circ \delta^* = \delta^* \circ (f|A)^*: H^q(B) \to H^{q+1}(X, A)$ for a continuous mapping $f: (X, A) \to (Y, B)$. (vi) Excision axiom: If $U$ is an open set of $X$ such that $\bar U \subset \operatorname{Int} A$, then $i^*: H^q(X, A) \cong H^q(X - U, A - U)$, where $i$ is the inclusion. (vii) Dimension axiom: $H^q(pt) = 0$ if $q \ne 0$. Axioms (i)-(vii) are called the Eilenberg-Steenrod axioms, and the group $H^0(pt)$ is called the coefficient group of the cohomology theory $H^*$.

A cohomology theory on the category of pairs of compact Hausdorff spaces is defined similarly. A cohomology theory on the category of CW pairs (or finite CW pairs) is defined similarly except that axiom (vi) is replaced by the following excision axiom: If $\{X_1, X_2\}$ is a couple of subcomplexes of a CW complex, then $i^*: H^q(X_1 \cup X_2, X_2) \cong H^q(X_1, X_1 \cap X_2)$, where $i$ is the inclusion. Two cohomology theories $H^*$ and $H'^*$ on the same category are isomorphic if there is an isomorphism $h_q: H^q(X, A) \cong H'^q(X, A)$ for each $(X, A)$ and each $q$, and they commute with $f^*$ and $\delta^*$. A homology theory on various categories is defined similarly by dualization.

A singular (co)homology theory with coefficients in $G$ is an example of a (co)homology theory on the category of topological pairs. The Čech cohomology groups with coefficients in $G$ can be made into a cohomology theory on the category of topological pairs. However, the Čech homology groups do not constitute a homology theory on the category of topological pairs; the homology sequence of any pair $(X, A)$ is defined, but it can be proved only that the composite of any two successive homomorphisms is zero. The Čech homology groups with coefficients in a field constitute a homology theory on the category of compact Hausdorff pairs. The Alexander cohomology groups constitute a cohomology theory on the same category of topological pairs, and it is isomorphic to the Čech cohomology theory if their coefficient groups are isomorphic [10]. The Čech (co)homology constitutes a (co)homology theory on the category of CW pairs, and it is isomorphic to the singular (co)homology theory on the same category if
the coefficient groups are isomorphic. (Co)homology theories on the category of finite CW pairs are determined, up to isomorphisms, by their coefficient groups. This fact is called the uniqueness theorem of homology theory on the category of finite CW pairs. Cohomology theories on the category of pairs of compact Hausdorff spaces which satisfy the following continuity axiom are determined, up to isomorphisms, by their coefficient groups: If $\{(X_\lambda, A_\lambda)\}$ is an inverse system of pairs of compact Hausdorff spaces, then $H^q(\varprojlim X_\lambda, \varprojlim A_\lambda) \cong \varinjlim H^q(X_\lambda, A_\lambda)$. The Čech cohomology theory satisfies this axiom.

During recent years, many (co)homology theories have been developed which satisfy the first six Eilenberg-Steenrod axioms but fail to satisfy the dimension axiom. These are called generalized (co)homology theories, and include various †K-theories, †bordism theories, and †stable homotopy theories (→ 202 Homotopy Theory).

R. Homology with Coefficients in Local Systems

N. E. Steenrod (Ann. Math., (2) 44 (1943)) introduced the (co)homology group with coefficients in a local system of Abelian groups, which is useful in †obstruction theory and in the homology theory of †fiber spaces.

A local system $\mathfrak{G}$ of Abelian groups on a topological space $X$ is a set of Abelian groups $G_x$, one for each $x \in X$, together with an isomorphism $l^*: G_{l(0)} \to G_{l(1)}$ for each †path $l: [0, 1] \to X$ subject to the following conditions: (1) If two paths $l$ and $l'$ are homotopic with endpoints fixed, then $l^* = l'^*$. (2) If $l$ and $m$ are paths such that $l(1) = m(0)$, then $(l \cdot m)^* = m^* \circ l^*$, where $l \cdot m$ denotes the †product of $l$ and $m$. An example is provided by the †homotopy groups $\pi_n(X, x)$ for $n \ge 2$. Let $M$ be an n-dimensional topological manifold. Then $x \mapsto H_n(M, M - x)$ is a local system of infinite cyclic groups. It is called the orientation sheaf of $M$. A local system $\mathfrak{G}$ is said to be trivial if $l^* = l'^*$ for any paths $l$, $l'$ with the same initial and final points.

Given a local system $\mathfrak{G}$ of Abelian groups on a topological space $X$, a chain complex $S(X; \mathfrak{G}) = \{S_q(X; \mathfrak{G}), \partial_q\}$ is defined as follows: If $q < 0$, then $S_q(X; \mathfrak{G}) = 0$, and if $q \ge 0$, then $S_q(X; \mathfrak{G})$ is the Abelian group of formal finite sums $\sum g_\sigma \sigma$, where $\sigma: \Delta^q \to X$ are singular q-simplexes in $X$ and $g_\sigma \in G_{\sigma(1, 0, \ldots, 0)}$; the boundary operator $\partial_q: S_q(X; \mathfrak{G}) \to S_{q-1}(X; \mathfrak{G})$ is given by $\partial_q(g_\sigma \sigma) = l_\sigma^*(g_\sigma)\, \sigma \circ \varepsilon_0 + \sum_{i=1}^{q} (-1)^i g_\sigma\, \sigma \circ \varepsilon_i$, where $\sigma \circ \varepsilon_i$ is the ith face of $\sigma$, and $l_\sigma: [0, 1] \to X$ is given by $l_\sigma(t) = \sigma(1 - t, t, 0, \ldots, 0)$. The homology group of the chain complex $S(X; \mathfrak{G})$ is denoted by $H_*(X; \mathfrak{G})$ and is called the singular homology group of $X$ with coefficients in $\mathfrak{G}$.

Similarly, a cochain complex $S^*(X; \mathfrak{G}) = \{S^q(X; \mathfrak{G}), \delta^q\}$ is defined as follows: If $q < 0$, then $S^q(X; \mathfrak{G}) = 0$, and if $q \ge 0$, then $S^q(X; \mathfrak{G})$ is the Abelian group of functions $u$ assigning to every singular q-simplex $\sigma$ in $X$ an element $u(\sigma) \in G_{\sigma(1, 0, \ldots, 0)}$; the coboundary operator $\delta^q: S^q(X; \mathfrak{G}) \to S^{q+1}(X; \mathfrak{G})$ for $q \ge 0$ is given by $(\delta^q u)(\sigma) = (l_\sigma^*)^{-1} u(\sigma \circ \varepsilon_0) + \sum_{i=1}^{q+1} (-1)^i u(\sigma \circ \varepsilon_i)$, where $\sigma$ is a singular $(q+1)$-simplex in $X$. The cohomology group of the cochain complex $S^*(X; \mathfrak{G})$ is denoted by $H^*(X; \mathfrak{G})$ and is called the singular cohomology group of $X$ with coefficients in $\mathfrak{G}$.

If $\mathfrak{G}$ is trivial, then the (co)homology group with coefficients in $\mathfrak{G}$ coincides with the (co)homology group with coefficients in $G \cong G_x$. The various notions and theorems on the ordinary (co)homology can be extended to (co)homology with coefficients in $\mathfrak{G}$. The Čech (co)homology group with coefficients in $\mathfrak{G}$ is also defined [10]. The cohomology groups with coefficients in $\mathfrak{G}$ are generalized to the cohomology groups with coefficients in a sheaf [10, 14] (→ 383 Sheaves).

References

[1] H. Poincaré, Analysis situs, J. École Polytech., 1 (1895), 1-121; Rend. Circ. Mat. Palermo, 13 (1899), 285-343.
[2] O. Veblen, Analysis situs, Amer. Math. Soc. Colloq. Publ., 1922.
[3] S. Lefschetz, Topology, Amer. Math. Soc. Colloq. Publ., 1930.
[4] H. Seifert and W. Threlfall, Lehrbuch der Topologie, Teubner, 1934 (Academic Press, 1980).
[5] P. S. Aleksandrov and H. Hopf, Topologie I, Springer, 1935 (Chelsea, 1965).
[6] S. Lefschetz, Algebraic topology, Amer. Math. Soc. Colloq. Publ., 1942.
[7] S. Eilenberg and N. E. Steenrod, Foundations of algebraic topology, Princeton Univ. Press, 1952.
[8] H. Cartan and S. Eilenberg, Homological algebra, Princeton Univ. Press, 1956.
[9] P. J. Hilton and S. Wylie, Homology theory, Cambridge Univ. Press, 1960.
[10] E. H. Spanier, Algebraic topology, McGraw-Hill, 1966.
[11] A. Dold, Lectures on algebraic topology, Springer, 1972.
[12] M. Glezerman and L. S. Pontryagin, Intersections in manifolds, Amer. Math. Soc. Transl., 50 (1951). (Original in Russian, 1947.)
[13] W. S. Massey, Homology and cohomology theory, Dekker, 1978.
202 A 774
Homotopy Theory

[14] R. Godement, Topologie algébrique et théorie des faisceaux, Hermann, 1958.
[15] S. Lefschetz, Introduction to topology, Princeton Univ. Press, 1949.
[16] J. G. Hocking and G. Young, Topology, Addison-Wesley, 1961.
[17] D. G. Bourgin, Modern algebraic topology, Macmillan, 1963.
[18] H. Schubert, Topologie, Teubner, 1964 (Allyn and Bacon, 1968).
[19] S. T. Hu, Homology theory, Holden-Day, 1966.
[20] M. J. Greenberg, Lectures on algebraic topology, Benjamin, 1967.
[21] G. E. Cooke and R. L. Finney, Homology of cell complexes, Princeton Univ. Press, 1967.
[22] C. R. F. Maunder, Algebraic topology, Van Nostrand, 1970.
[23] A. H. Wallace, Algebraic topology, Benjamin, 1970.
[24] C. Godbillon, Eléments de topologie algébrique, Hermann, 1971.
[25] J. Vick, Homology theory, Academic Press, 1973.
[26] J. W. Milnor and J. D. Stasheff, Characteristic classes, Princeton Univ. Press, 1974.
[27] W. S. Massey, Singular homology theory, Springer, 1980.

202 (IX.9)
Homotopy Theory

A. General Remarks

Given a topological space $X$, we utilize the concept of homotopy to define the fundamental group, homotopy groups, and cohomotopy groups of $X$. These groups, together with (co)homology groups, are useful tools in topology.
Since the research of H. Hopf, W. Hurewicz, and H. Freudenthal in the 1930s, homotopy theory has made rapid progress and now plays an important role in topology.

B. Homotopy

If a family $f_t : X \to Y$ ($t \in I = \{t \mid 0 \le t \le 1\}$) of continuous mappings from a topological space $X$ into a topological space $Y$ is also continuous with respect to $t$, that is, if the mapping $F$ from the product space $X \times I$ into $Y$ defined by $F(x, t) = f_t(x)$ ($x \in X$, $t \in I$) is continuous, then $\{f_t\}$ or $F$ is called a homotopy. In this case, $f_0$ and $f_1$ are said to be homotopic. This relation between $f_0$ and $f_1$ is indicated by $f_0 \simeq f_1 : X \to Y$, or simply $f_0 \simeq f_1$, and is called the relation of homotopy. Denote by $Y^X$ the set of all continuous mappings from $X$ into $Y$. The homotopy relation is an equivalence relation on $Y^X$, and the equivalence class $[f]$ of a mapping $f : X \to Y$ is called the homotopy class (or mapping class) of $f$. The set of all homotopy classes of mappings of $X$ into $Y$ is called the homotopy set and is denoted by $\pi(X; Y)$ or $[X, Y]$. A function $\gamma$ of continuous mappings $f \in Y^X$ is called a homotopy invariant if $f \simeq g$ implies $\gamma(f) = \gamma(g)$. When $X$ consists of a point $*$ we write $\pi(*; Y) = \pi_0(Y)$. If all continuous mappings in $Y^X$ are homotopic to each other, we write $\pi(X; Y) = 0$; $\pi_0(Y) = 0$ means that $Y$ is arcwise connected. A mapping $f$ from a compact space $X$ into an $n$-dimensional sphere $S^n$ is called essential if any mapping $g$ homotopic to $f$ satisfies $g(X) = S^n$. A mapping is inessential if and only if it is homotopic to the constant mapping.
These concepts are generalized as follows: Let $A_i$ and $B_i$ ($i = 1, 2, \dots$) be subspaces of $X$ and $Y$, respectively, and denote by $Y^X(A_1, A_2, \dots; B_1, B_2, \dots)$ the set of continuous mappings $f \in Y^X$ satisfying $f(A_i) \subset B_i$. If a homotopy $\{f_t\}$ is such that $f_t \in Y^X(A_i; B_i)$, then $\{f_t\}$ is called a restricted homotopy with respect to $A_i$, $B_i$ or a homotopy from a system of spaces $(X, A_1, A_2, \dots)$ into a system of spaces $(Y, B_1, B_2, \dots)$. The notation $f_0 \simeq f_1 : (X, A_1, A_2, \dots) \to (Y, B_1, B_2, \dots)$ and the homotopy set $\pi(X, A_1, A_2, \dots; Y, B_1, B_2, \dots)$ are defined accordingly.
For the composite $g \circ f \in Z^X(A_i; C_i)$ of $f \in Y^X(A_i; B_i)$ and $g \in Z^Y(B_i; C_i)$, $f \simeq f'$ and $g \simeq g'$ imply $g \circ f \simeq g' \circ f'$. Thus the composite $\beta \circ \alpha = [g \circ f] \in \pi(X, A_i; Z, C_i)$ of $[f] = \alpha \in \pi(X, A_i; Y, B_i)$ and $[g] = \beta \in \pi(Y, B_i; Z, C_i)$ is defined. By putting $g_*[f] = [g \circ f] = f^*[g]$ we induce two mappings,
$$g_* : \pi(X, A_i; Y, B_i) \to \pi(X, A_i; Z, C_i),$$
$$f^* : \pi(Y, B_i; Z, C_i) \to \pi(X, A_i; Z, C_i).$$
Then $f \simeq f'$ implies $f^* = f'^*$ and $g \simeq g'$ implies $g_* = g'_*$. Also $(g \circ f)_* = g_* \circ f_*$, $(g \circ f)^* = f^* \circ g^*$, and $h^* \circ g_* = g_* \circ h^*$, where $h \in X^W(Q_i; A_i)$.
The category of pointed topological spaces is defined to be the category in which each object $X$, which is a topological space, has a point fixed as a base point and each mapping $X \to Y$ carries the base point of $X$ to the base point of $Y$. In this category, we define a homotopy set, denoted by $\pi(X; Y)_0$ or $[X; Y]_0$, as follows: Denoting the base points by $*$, we have $\pi(X, A_i, *; Y, B_i, *) = \pi(X, A_i; Y, B_i)_0$. A continuous mapping $f$ homotopic to the constant mapping $X \to * \in Y$ is said to be homotopic to zero (or null-homotopic). This is indicated by $f \simeq 0$, and $\pi(X; Y)_0 = 0$ means that all continuous mappings are homotopic to zero. Let $S^0$ be a set of two points; then

$\pi(X; S^0)_0 = 0$ means that $X$ is connected. In contrast to these specific homotopies, the usual homotopy is sometimes called a free homotopy.
Suppose that a homotopy $\{f_t\}$ ($f_t : X \to Y$) is such that the restriction of $f_t$ to a subspace $A$ of $X$ is stationary, that is, $f_t(a) = f_0(a)$ ($a \in A$, $t \in I$). Then $f_0$, $f_1$ are said to be homotopic relative to $A$, indicated by $f_0 \simeq f_1$ (rel. $A$). If a homotopy $\{f_t\}$ ($f_t : X \to Y$) is such that each $f_t$ is a homeomorphism into $Y$, then $\{f_t\}$ is called an isotopy and $f_0$ is called isotopic to $f_1$ (→ 235 Knot Theory).
Research done by L. E. J. Brouwer, H. Hopf, W. Hurewicz, K. Borsuk, L. S. Pontryagin, and S. Eilenberg has contributed to the theory of homotopy, an important field of topology still in the process of development.

C. Mapping Spaces

We endow the set $Y^X$ of all continuous mappings $f : X \to Y$ with the compact-open topology. The topological space $Y^X$ is called a mapping space. In particular we denote $Y^I(0, 1; *, *)$ ($* \in Y$) by $\Omega(Y) = \Omega(Y, *)$ and call it the space of closed paths (or loop space) of $Y$. Two points $f$, $g$ of $Y^X$ are connected by a path in $Y^X$ if and only if $f \simeq g : X \to Y$. Thus $\pi_0(Y^X) = \pi(X; Y)$ and $\pi_0(Y^X(A_i; B_i)) = \pi(X, A_i; Y, B_i)$.

D. Retracts

Let $A$ be a subspace of a topological space $X$. If there exists an $f \in A^X$ such that the restriction $f|A$ is the identity mapping of $A$, then $A$ is called a retract of $X$, and $f$ a retraction. If $A$ is a retract of $X$, any continuous mapping of $A$ into any topological space can be extended to a continuous mapping of $X$. If $A$ is a retract of some neighborhood $U(A)$, $A$ is called a neighborhood retract or NR of $X$. If for any homeomorphism of a metric space $A$ onto a closed subspace $A_1$ of any metric space $X$, $A_1$ is a retract (neighborhood retract) of $X$, then $A$ is called an absolute retract or AR (absolute neighborhood retract or ANR). For example, an $n$-dimensional simplex or an $n$-dimensional Euclidean space is an AR. If a retraction $f$ is homotopic to the identity mapping of $X$ (resp. $U(A)$), we call $A$ a deformation retract (neighborhood deformation retract) of $X$. Moreover, if $f \simeq 1_X$ (rel. $A$), then $A$ is called a strong deformation retract. In particular, if a point $x_0$ is a (strong) deformation retract of $X$, we say that $X$ is contractible to the point $x_0$. For example, any polyhedron $P$ and any compact $n$-dimensional topological manifold are ANRs; any polyhedron $P_0$ contained in $P$ is a strong deformation retract of some neighborhood in $P$. $X$ is called locally contractible if each point $x$ of $X$ has a contractible neighborhood $U$ of $x$.

E. The Extension Property

Let $X$, $Y$ be topological spaces, $A \subset X$, $f_0, f_1 \in Y^X$, and $\{g_t : A \to Y\}$ a homotopy such that $g_i = f_i|A$ ($i = 0, 1$). We can extend $\{g_t\}$ to a homotopy $\{f_t\}$ of $X$ if and only if the mapping $F : (X \times 0) \cup (A \times I) \cup (X \times 1) \to Y$ defined by $F(x, i) = f_i(x)$, $F(a, t) = g_t(a)$ can be extended to a continuous mapping sending $X \times I$ into $Y$. Therefore the problem of whether $f_0 \simeq f_1$ can be reduced to the problem of whether a continuous mapping defined on a subspace can be extended to the whole space. If for any homotopy $\{g_t : A \to Y\}$ and any continuous mapping $f_0 : X \to Y$ into any topological space $Y$ satisfying $f_0|A = g_0$ there exists a homotopy $\{f_t : X \to Y\}$ satisfying $f_t|A = g_t$, then we say that $(X, A)$ has the homotopy extension property. This occurs if and only if $(X \times 0) \cup (A \times I)$ is a retract of $X \times I$. A pair $(X, A)$ of ANRs, where $A$ is closed in $X$, and a pair $(P, P_0)$ with $P$ a CW complex and $P_0$ a subcomplex of $P$ have this property. Given a continuous mapping $h : B \to A$ of a subspace $B$ of a topological space $Y$ into a topological space $A$, we identify $b \in B$ with $h(b) \in A$ in the direct sum $A \cup Y$ and obtain the identification space denoted by $A \cup_h Y$, which is called an attaching space under $h$. If $(Y, B)$ has the homotopy extension property, then $(Y \times X, B \times X)$ and $(A \cup_h Y, A)$ also have the same property. When $A$ consists of a point $*$, we write $* \cup_h Y = Y/B$ and call the space $Y/B$ a space smashing (shrinking or pinching) $B$ to a point. If $Y = B \times I$, $B = B \times 0$, then we call $A \cup_h (B \times I)$ a mapping cylinder of $h$, $(A \cup_h (B \times I))/(B \times 1)$ a mapping cone of $h$, and the mapping cylinder and mapping cone of $h : B \to *$ the cone over $B$ and suspension of $B$, respectively.

F. Homotopy Type

For systems $(X, A_i)$, $(Y, B_i)$ of topological spaces, if there exist $f \in Y^X(A_i; B_i)$, $g \in X^Y(B_i; A_i)$ such that $g \circ f$ and $f \circ g$ are homotopic to the identity mappings of $(X, A_i)$ and $(Y, B_i)$, respectively, then we say that $(X, A_i)$ and $(Y, B_i)$ have the same homotopy type or are homotopy equivalent. Such mappings $f$ and $g$ are called homotopy equivalences. For a homotopy equivalence $f$, the induced mappings $f_*$ and $f^*$ are bijective. Therefore, in homotopy theory, systems of spaces having the same homotopy type are considered equivalent.
If $A$ is a deformation retract of $X$, then $A$ and $X$ have the same homotopy type, and the

injection of $A$ into $X$ and the retraction of $X$ onto $A$ are homotopy equivalences. A contractible space has the same homotopy type as a point. Spaces having the same homotopy type have isomorphic homotopy groups and (co)homology groups. Since the mapping cylinder $Z_f = Y \cup_f (X \times I)$ of $f \in Y^X$ contains $Y$ as its deformation retract, it has the same homotopy type as $Y$. By this homotopy equivalence, $f$ can be replaced by the injection of $X \times 1$ into $Z_f$. If to each topological space there corresponds a value (which may be some element of $\mathbf{R}$ or some algebraic structure) and the values are the same for homotopy equivalent spaces, then the value is called a homotopy type invariant. A homotopy type invariant is a topological invariant; for example, $\pi(X; Y)$ is a homotopy type invariant of $X$. If a continuous mapping $f : X \to Y$ induces isomorphisms of the homotopy groups of each arcwise connected component, then $f$ is called a weak homotopy equivalence. Conversely, if $X$ and $Y$ are CW complexes, then a weak homotopy equivalence is a homotopy equivalence (J. H. C. Whitehead).
Now we consider the category of pointed topological spaces. Let $A$ and $B$ be pointed topological spaces. Then the direct sum in this category is the one-point union (or bouquet) $A \vee B$ obtained from the disjoint union $A \cup B$ by identifying two base points $*_A$ and $*_B$. $A \vee B$ is identified with the subspace $(A \times *_B) \cup (*_A \times B)$ in $A \times B$. The reduced join (or smash product) of $A$, $B$ is the space obtained from $A \times B$ by smashing its subspace $A \vee B$ to a point and is denoted by $A \wedge B$. We call $A \wedge S^1$ the (reduced) suspension of $A$ and denote it by $SA$. Repeating the suspension $n$ times, we have the $n$-fold reduced suspension of $A$. We call $CA = A \wedge I$ ($I = [0, 1]$) the reduced cone of $A$ ($I$ has the base point 1). For a continuous mapping $f : X \to Y$, the space obtained by identifying each point $(x, 0)$ of the base of $CX$ with $f(x) \in Y$ is called the reduced mapping cone and is denoted by $C_f = Y \cup_f CX$. The reduced join (or smash product) of mappings $f : Y \to X$ and $f' : Y' \to X'$ is the mapping $f \wedge f' : Y \wedge Y' \to X \wedge X'$ induced from the product mapping $f \times f' : Y \times Y' \to X \times X'$. The reduced join of $f : Y \to X$ and $1 : S^1 \to S^1$ (identity mapping) is written as $Sf = f \wedge 1$ and is called the suspension of $f$.

G. Puppe Exact Sequences

For $f : X \to Y$ and $g : Y \to Z$, we have $g \circ f \simeq 0$ if and only if $g$ can be extended to a continuous mapping from $C_f$ into $Z$. In other words, the sequence
$$\pi(C_f; Z)_0 \xrightarrow{i^*} \pi(Y; Z)_0 \xrightarrow{f^*} \pi(X; Z)_0$$
is exact (i.e., $\operatorname{Im} i^* = \operatorname{Ker} f^* = f^{*-1}(0)$, where $i : Y \to C_f$ is the canonical inclusion and 0 is the class of the constant mapping). The inclusion $i : Y \to C_f$ gives rise to the reduced mapping cone $C_i$. We also have the canonical inclusion $i' : C_f \to C_i$. Adding the term $\pi(C_i; Z)_0 \xrightarrow{i'^*}$ to the left-hand side of the sequence above, we have a new exact sequence. Continuing this process, we obtain an exact sequence of infinite length. If $X$, $Y$ satisfy a suitable condition (e.g., $X$, $Y$ are CW complexes), then $C_i$ has the same homotopy type as the reduced suspension $SX$ of $X$; $i'^*$ is equivalent to $p^* : \pi(SX; Z)_0 \to \pi(C_f; Z)_0$ induced by a mapping $p : C_f \to SX$ smashing $Y$ to a point; and furthermore $C_{i'}$ has the same homotopy type as $SY$, and the inclusion $i'' : C_i \to C_{i'}$ is equivalent to the suspension $Sf : SX \to SY$ of $f$. Thus the following Puppe exact sequence is obtained:
$$\cdots \to \pi(SC_f; Z)_0 \xrightarrow{(Si)^*} \pi(SY; Z)_0 \xrightarrow{(Sf)^*} \pi(SX; Z)_0 \xrightarrow{p^*} \pi(C_f; Z)_0 \xrightarrow{i^*} \pi(Y; Z)_0 \xrightarrow{f^*} \pi(X; Z)_0.$$
In this exact sequence, if $Y$ is a CW complex, $X$ is a subcomplex of $Y$, and $f$ is the inclusion $i : X \to Y$, then $C_f = C_i = Y \cup CX$ is homotopy equivalent to the space $C_i/CX = Y/X$ obtained by smashing $CX$ to a point, and an exact sequence of the following type is obtained:
$$\cdots \to \pi(SX; Z)_0 \to \pi(Y/X; Z)_0 \to \pi(Y; Z)_0 \xrightarrow{i^*} \pi(X; Z)_0.$$
A sequence equivalent to the sequence $X \xrightarrow{f} Y \xrightarrow{i} C_f$ is called a cofibering, for which a similar exact sequence is obtained. For a continuous mapping $f : X \to Y$, consider the subspace $E_f = \{(x, \varphi) \mid f(x) = \varphi(0)\}$ of the product space $X \times Y^I$. By identifying $X$ with $\{(x, \varphi_x) \mid \varphi_x(I) = f(x)\}$, we can regard $X$ as a deformation retract of $E_f$. By putting $p_1(x, \varphi) = \varphi(1)$, we obtain a fiber space $(E_f, p_1, Y)$. The fiber $T_f = p_1^{-1}(*)$ is called a mapping track of $f$. Using the covering homotopy property, we see that the sequence
$$\pi(W; T_f)_0 \xrightarrow{p_*} \pi(W; X)_0 \xrightarrow{f_*} \pi(W; Y)_0$$
is exact, where $p(x, \varphi) = x$. This sequence is also extended infinitely to the left as
$$\cdots \to \pi(W; \Omega X)_0 \xrightarrow{(\Omega f)_*} \pi(W; \Omega Y)_0 \xrightarrow{i_*} \pi(W; T_f)_0 \xrightarrow{p_*} \cdots,$$
where $i$ is the inclusion of the loop space $\Omega Y$ into $T_f$ and $\Omega f : \Omega X \to \Omega Y$ is the correspondence of the loops induced from $f$.

H. Homotopy Sets that Form Groups

If $X = SX'$ or $Y = \Omega Y'$ (or, generally, if $Y$ is a homotopy associative H-space having a homotopy inverse), then $\pi(X; Y)_0$ forms a group. In the general case the product of the

loops induces the product of $\pi(X; \Omega Y')_0$. We represent a point of $SX$ by $(x, t)$ ($x \in X$, $t \in I$) and define the mapping $\Omega_0 g : X \to \Omega Y$ for each $g : SX \to Y$ by $\Omega_0 g(x)(t) = g(x, t)$. Hence an isomorphism $\Omega_0 : \pi(SX; Y)_0 \cong \pi(X; \Omega Y)_0$ is obtained. Each of the following pairs of homomorphisms is equivalent: $f_* : \pi(SX; Y)_0 \to \pi(SX; Y')_0$ and $\Omega f_* : \pi(X; \Omega Y)_0 \to \pi(X; \Omega Y')_0$; and $h^* : \pi(X'; \Omega Y)_0 \to \pi(X; \Omega Y)_0$ and $Sh^* : \pi(SX'; Y)_0 \to \pi(SX; Y)_0$.
If $S^n$ is an $n$-dimensional sphere, then $\pi_n(X) = \pi(S^n; X)_0$ is the $n$-dimensional homotopy group (→ Section J). Let $\pi^n(X) = \pi(X; S^n)_0$. If $X$ is a CW complex of dimension less than $2n - 1$, $\pi^n(X)$ is the cohomotopy group isomorphic to $\pi(X; \Omega S^{n+1})_0$ (→ Section I). Let $K_n$ be an Eilenberg-MacLane space of type $(\Pi, n)$. Then we have $K_n = \Omega K_{n+1}$, and if $(X, A)$ is a pair of CW complexes, then $\pi(X/A; K_n)_0$ coincides with the cohomology group $H^n(X, A; \Pi)$. For the classifying space $B_O$ ($B_U$) of the infinite orthogonal group $O$ (infinite unitary group $U$) (→ Section V), $\pi(X/A; B_O)$ ($\pi(X/A; B_U)$) may be considered the KO-group $KO(X, A)$ (K-group $K(X, A)$) (→ 237 K-Theory).

classes $[f]$. Using the notation of homotopy sets, we have $\pi_n(X, *) = \pi(I^n, \dot{I}^n; X, *)$. If we choose the constant mapping as the base point $*$ of $\Omega^n(X, *)$, then $\Omega^m(\Omega^n(X, *), *) = \Omega^{m+n}(X, *)$. Thus $\pi_m(\Omega^n(X, *), *) = \pi_{m+n}(X, *)$. Since $\pi_1$ is the fundamental group, $\pi_n(X, *) = \pi_1(\Omega^{n-1}(X), *)$ is also a group, called the $n$-dimensional homotopy group of $X$ with base point $*$. "Multiplication" in homotopy groups is defined as follows: Given $f_1, f_2 \in \Omega^n(X, *)$ we define $f_1 + f_2 \in \Omega^n(X, *)$ by
$$(f_1 + f_2)(t) = \begin{cases} f_1(2t_1, t_2, \dots, t_n), & 0 \le t_1 \le \tfrac{1}{2}, \\ f_2(2t_1 - 1, t_2, \dots, t_n), & \tfrac{1}{2} \le t_1 \le 1 \end{cases}$$
(Fig. 1). Then the product or sum of $[f_1]$ and $[f_2]$ is given by $[f_1 + f_2]$. The identity is the class of the constant mapping (denoted by 0), and the inverse of $[f]$ is $[\bar{f}]$, represented by $\bar{f}(t) = f(1 - t_1, t_2, \dots, t_n)$. The space $\Omega^n(X, *)$ is an H-space, where multiplication is given by the correspondence $(f_1, f_2) \to f_1 + f_2$. Since the fundamental group of an H-space is commutative, $\pi_n(X, *)$ is an Abelian group for $n \ge 2$.
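
The sum of two $n$-loops described in Section J can be written out explicitly, together with one possible homotopy showing that the constant mapping acts as an identity. The following is only a sketch: the symbol $H_s$ and the particular reparametrization are illustrative choices, not taken from the text, and other homotopies would serve equally well.

```latex
% Sum of f_1, f_2 : (I^n, \dot{I}^n) \to (X, *), as defined in Section J.
\[
(f_1 + f_2)(t_1, \dots, t_n) =
\begin{cases}
  f_1(2t_1, t_2, \dots, t_n),     & 0 \le t_1 \le \tfrac{1}{2}, \\
  f_2(2t_1 - 1, t_2, \dots, t_n), & \tfrac{1}{2} \le t_1 \le 1 .
\end{cases}
\]
% The two branches agree at t_1 = 1/2 because both f_1 and f_2 send
% the boundary \dot{I}^n to the base point *:
%   f_1(1, t_2, ..., t_n) = * = f_2(0, t_2, ..., t_n).

% Illustrative homotopy H_s (0 \le s \le 1) from f + 0 to f, where 0 is
% the constant mapping (H_s is a hypothetical name used only here):
\[
H_s(t_1, \dots, t_n) =
\begin{cases}
  f\!\left(\dfrac{2t_1}{1+s},\, t_2, \dots, t_n\right), & 0 \le t_1 \le \tfrac{1+s}{2}, \\[1ex]
  *, & \tfrac{1+s}{2} \le t_1 \le 1 .
\end{cases}
\]
% s = 0 gives f + 0, s = 1 gives f, and every H_s maps \dot{I}^n to *,
% so [f] + 0 = [f] in \pi_n(X, *).
```

The same kind of reparametrization argument gives $0 + [f] = [f]$ and $[f] + [\bar{f}] = 0$ for the inverse $\bar{f}(t) = f(1 - t_1, t_2, \dots, t_n)$.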

I. Cohomotopy Groups

K. Borsuk defined a sum of mapping classes of $X$ into $S^n$ (1936), which was named Borsuk's cohomotopy group by E. Spanier. Spanier also studied the duality of the cohomotopy group with the homotopy group and its relations to the usual cohomology groups. A cohomotopy group of $(X, A)$ is defined to be $\pi^n(X, A) = \pi(X, A; S^n, *)$, which forms a group if $\dim X/A < 2n - 1$. A mapping $F : X/A \to S^n \times S^n$ given by $F(x) = (f(x), g(x))$ with $f, g : X/A \to S^n$ is homotopic to a mapping into $S^n \vee S^n$. If we compose $F$ with a folding mapping of $S^n \vee S^n$ onto $S^n$, we obtain a mapping that represents the sum $[f] + [g]$. With each homotopy class of a continuous mapping $f$ of an $n$-dimensional polyhedron $K^n$ into an $n$-dimensional sphere $S^n$, we associate the image $f^*(u)$ of the fundamental class $u \in H^n(S^n; \mathbf{Z})$ under the induced homomorphism $f^* : H^n(S^n; \mathbf{Z}) \to H^n(K^n; \mathbf{Z})$. We then obtain a bijective relation $\pi^n(K^n) \to H^n(K^n; \mathbf{Z})$, called Hopf's classification theorem.

J. Homotopy Groups

Let $X$ be a topological space with a base point $*$, $I^n = \{t = (t_1, t_2, \dots, t_n) \mid 0 \le t_1, t_2, \dots, t_n \le 1\}$ be the unit $n$-cube, and $\dot{I}^n$ its boundary. Write $\Omega^n(X, *) = X^{I^n}(\dot{I}^n, *)$ (in particular, $\Omega^1(X, *)$ is the loop space), and denote by $\pi_n(X, *)$ or simply $\pi_n(X)$ the set of arcwise connected components of $\Omega^n(X, *)$, i.e., the homotopy

Fig. 1    Fig. 2

Let $S^n = \{t = (t_1, \dots, t_{n+1}) \mid \sum t_i^2 = 1\}$ be the $n$-sphere, and take $* = (1, 0, \dots, 0)$ as its base point. Suppose that we are given a continuous mapping $\psi_n : (I^n, \dot{I}^n) \to (S^n, *)$ such that $\psi_n : I^n - \dot{I}^n \to S^n - *$ is homeomorphic. Then the correspondence $\psi_n^* : \pi(S^n; X)_0 \to \pi_n(X, *)$ determined by $\psi_n^*[g] = [g \circ \psi_n]$ is bijective. Thus we can identify the homotopy group $\pi_n(X, *)$ with $\pi(S^n; X)_0$.

K. Relative Homotopy Groups

Suppose that we are given a topological space $X$ and a subspace $A$ of $X$ sharing the same base point $*$. Identify $I^{n-1}$ with the face $t_n = 0$ of $I^n$, and let $J^{n-1}$ be the closure of $\dot{I}^n - I^{n-1}$ (Fig. 2). Denote by $\pi_n(X, A, *)$ the set of homotopy classes of continuous mappings $f : (I^n, \dot{I}^n, J^{n-1}) \to (X, A, *)$. Let $\Omega^n(X, A, *)$ be the mapping space consisting of such mappings $f$, and let $\pi_n(X, A, *) = \pi_0(\Omega^n(X, A, *))$. Since $\Omega^m(\Omega^n(X, A, *), *)$ is homeomorphic to $\Omega^{m+n}(X, A, *)$, we have $\pi_m(\Omega^n(X, A, *), *) \cong \pi_{m+n}(X, A, *)$. Thus $\pi_n(X, A, *)$ is a group for $n \ge 2$ and an Abelian group for $n \ge 3$. This group is called the $n$-dimensional relative

homotopy group of $(X, A)$ with respect to the base point $*$, or simply the $n$-dimensional homotopy group of $(X, A)$. In the same manner as in Section J multiplication in this group can be defined using $f_1 + f_2$. Since $\Omega^n(X, *, *)$ and $\Omega^n(X, *)$ are identical, we have $\pi_n(X, *, *) = \pi_n(X, *)$. Hence homotopy groups are special cases of relative homotopy groups.
Let $g : (X, A, *) \to (Y, B, *)$ be a continuous mapping. Then a correspondence $g_* : \pi_n(X, A, *) \to \pi_n(Y, B, *)$ is obtained by $g_*[f] = [g \circ f]$, with $g_*$ a homomorphism of homotopy groups for $n \ge 2$ and for $n = 1$, $A = *$. We call $g_*$ the homomorphism induced by $g$. Let $E^n = \{t = (t_1, \dots, t_n) \mid \sum t_i^2 \le 1\}$ be the unit $n$-cell with boundary $S^{n-1}$. Utilizing a suitable relative homeomorphism $\psi'_n : (I^n, J^{n-1}) \to (E^n, *)$, $\psi'_n(\dot{I}^n) = S^{n-1}$, we obtain a one-to-one correspondence $\psi'^*_n : \pi(E^n, S^{n-1}; X, A)_0 \to \pi_n(X, A, *)$, and $\Omega^n(X, A, *)$ is homeomorphic (via $\psi'_n$) to the mapping space $X^{E^n}(S^{n-1}, *; A, *)$.

L. Homotopy Exact Sequences

Given an element $\alpha = [f] \in \pi_n(X, A, *)$, and letting $\partial\alpha = [f|I^{n-1}] \in \pi_{n-1}(A, *)$, we obtain a homomorphism ($n \ge 2$) $\partial : \pi_n(X, A, *) \to \pi_{n-1}(A, *)$, which is called the boundary homomorphism. Furthermore, we have the following exact sequence involving homomorphisms $i_*$, $j_*$ induced by two inclusions $i : (A, *) \to (X, *)$, $j : (X, *, *) \to (X, A, *)$:
$$\cdots \to \pi_n(A, *) \xrightarrow{i_*} \pi_n(X, *) \xrightarrow{j_*} \pi_n(X, A, *) \xrightarrow{\partial} \pi_{n-1}(A, *) \to \cdots \to \pi_1(X, *) \xrightarrow{j_*} \pi_1(X, A, *) \xrightarrow{\partial} \pi_0(A) \xrightarrow{i_*} \pi_0(X).$$
This sequence is called the homotopy exact sequence of the pair $(X, A)$. A system of topological spaces $X \supset A \supset B \ni *$ is called a triple. In this homotopy exact sequence, if we replace $(A, *)$, $(X, *)$ by $(A, B, *)$, $(X, B, *)$, respectively, we obtain an exact sequence, called the homotopy exact sequence of the triple $(X, A, B)$.
The homotopy group $\pi_n(A \times B)$ of the product space is isomorphic to the direct sum $\pi_n(A) + \pi_n(B)$, and the projections $p$ ($p'$): $A \times B \to A$ ($B$) of the product space induce the projections from $\pi_n(A \times B)$ onto the direct summands $\pi_n(A)$, $\pi_n(B)$. This is a special case of the Hurewicz-Steenrod isomorphism theorem in fiber spaces (→ 148 Fiber Spaces). Setting $A \vee B = (A \times *) \cup (* \times B)$, we obtain a direct sum decomposition $\pi_n(A \vee B) \cong \pi_n(A) + \pi_n(B) + \pi_{n+1}(A \times B, A \vee B)$. Next we consider a fixed pair $(X, A)$ and move the base point $*$ to investigate its effect on the elements of the homotopy group. Suppose that we are given a path $h : I \to A$ with terminal point $* = h(1)$ and an element $\alpha \in \pi_n(X, A, *)$ ($\alpha = [f]$, $f : (I^n, \dot{I}^n, J^{n-1}) \to (X, A, *)$). By the homotopy extension property, we can construct a homotopy $f_s : (I^n, \dot{I}^n) \to (X, A)$ satisfying $f_s(J^{n-1}) = h(s)$ and $f_1 = f$. Then the homotopy class $[f_0]$ of $f_0$ with respect to the base point $*' = h(0)$ is determined only by $\alpha$ and the homotopy class $w$ of the path $h$. We denote the homotopy class $[f_0]$ by $\alpha^w \in \pi_n(X, A, *')$. The correspondence $\alpha \to \alpha^w$ is a group isomorphism, and $(\alpha^w)^{w'} = \alpha^{w'w}$. Thus if $A$ is arcwise connected, $\pi_n(X, A, *)$ is isomorphic to $\pi_n(X, A, *')$. Hence, in this case, we may simply write $\pi_n(X, A)$ instead of $\pi_n(X, A, *)$. When $* = *'$, the correspondence $\alpha \to \alpha^w$ determines the action of the group $\pi_1(A, *)$ on $\pi_n(X, A, *)$. Given an element $\alpha \in \pi_n(X, *)$ and a class $w$ of paths in $X$, we define $\alpha^w \in \pi_n(X, *')$ as for relative homotopy. Specifically, if $w \in \pi_1(X, *)$, then $\alpha^w - \alpha$ coincides with the Whitehead product $[w, \alpha]$ (when $n = 1$, we have $\alpha^w \alpha^{-1} = [w, \alpha] = w \alpha w^{-1} \alpha^{-1}$) (→ Section P).
A pair $(X, A)$ consisting of a topological space $X$ and an arcwise connected subspace $A$ of $X$ is said to be $n$-simple if the operation of $\pi_1(A)$ on $\pi_n(X, A)$ is trivial. Similarly, an arcwise connected space $X$ is called $n$-simple if the operation of $\pi_1(X)$ on $\pi_n(X)$ is trivial. For example, a pair $(X, A)$ consisting of an H-space $X$ and an H-subspace $A$ is simple, i.e., $n$-simple for each $n$. If a topological space $X$ satisfies $\pi_i(X) = 0$ ($0 \le i \le n$), then $X$ is said to be $n$-connected. 0-connectedness coincides with arcwise connectedness and 1-connectedness means simple connectedness. $S^n$ is $(n - 1)$-connected. A pair $(X, A)$ is said to be $n$-connected if $\pi_0(A) = \pi_0(X) = \pi_i(X, A) = 0$ ($1 \le i \le n$), and $(E^n, S^{n-1})$ is $(n - 1)$-connected.

M. Homotopy Groups of Triads

Let $(X; A, B, *)$ be a system, called a triad, of a topological space $X$ and its subspaces $A$, $B$ satisfying $A \cap B \ni *$ (base point). Let $\pi_n(X; A, B, *) = \pi_{n-1}(\Omega^1(X, B), \Omega^1(A, A \cap B), *)$ ($n \ge 2$); $\pi_n(X; A, B, *)$ is a group for $n \ge 3$ and an Abelian group for $n \ge 4$. We call $\pi_n(X; A, B, *)$ the homotopy group of the triad. From the homotopy exact sequence of the pair, we obtain the following homotopy exact sequence of the triad:
$$\cdots \to \pi_i(A, A \cap B, *) \xrightarrow{i_*} \pi_i(X, B, *) \xrightarrow{j_*} \pi_i(X; A, B, *) \xrightarrow{\partial} \pi_{i-1}(A, A \cap B, *) \to \cdots.$$
Assume for simplicity that $A \cap B$ is simply connected, $X = \operatorname{Int} A \cup \operatorname{Int} B$ ($\operatorname{Int} A$ is the interior of $A$), $(A, A \cap B)$ is $m$-connected, and $(B, A \cap B)$ is $n$-connected. Then $(X; A, B)$ is $(m + n)$-connected, i.e., $\pi_j(X; A, B, *) = 0$ ($2 \le j \le m + n$) (Blakers-Massey theorem). Furthermore, in this case we have a replica of the excision isomorphism in homology theory for $j < m + n$; that is, we have the isomorphism $i_* : \pi_j(A, A \cap B, *) \cong \pi_j(X, B, *)$ in-

duced by the inclusion $i : (A, A \cap B) \to (X, B)$. On the other hand, $\pi_{m+n+1}(X; A, B, *)$ is isomorphic to $\pi_{m+1}(A, A \cap B, *) \otimes \pi_{n+1}(B, A \cap B, *)$. This shows that the excision isomorphism does not always hold for homotopy groups, an important difference from homology theory. However, if we replace the excision axiom by the Hurewicz-Steenrod isomorphism theorem, which is valid for fiber spaces (→ 148 Fiber Spaces), then we can construct homotopy theory axiomatically in the same manner as homology theory (→ 201 Homology Theory).

N. The Hurewicz Isomorphism Theorem

The Hurewicz homomorphism $\tau$ of $\pi_n(X, A)$ into the $n$-dimensional integral homology group $H_n(X, A)$ is defined by $\tau([f]) = f_*(\varepsilon_n)$ (where $\varepsilon_n$ is a generator of $H_n(I^n, \dot{I}^n)$). Then we have the Hurewicz isomorphism theorem: Suppose that the pair $(X, A)$ is $n$-simple (e.g., $A = *$) and $(n - 1)$-connected. Then we have $H_i(X, A) = 0$ ($i < n$) and the isomorphism $\tau : \pi_n(X, A) \cong H_n(X, A)$ (for $n = 1$ → 170 Fundamental Groups). Let $X$, $Y$ be simply connected topological spaces, and let $f : X \to Y$ be a continuous mapping. Then the following two conditions are equivalent: (1) $f_* : \pi_i(X) \to \pi_i(Y)$ is injective for $i < n$ and surjective for $i \le n$. (2) $f_* : H_i(X) \to H_i(Y)$ is injective for $i < n$ and surjective for $i \le n$ (J. H. C. Whitehead's theorem).
J.-P. Serre generalized these theorems as follows: A family $\mathcal{C}$ of Abelian groups satisfying condition (i) is called a class of Abelian groups: (i) If a sequence $F \to G \to H$ of Abelian groups is exact and $F, H \in \mathcal{C}$, then $G \in \mathcal{C}$. Furthermore, we consider the following conditions: (ii) The tensor product $G \otimes F$ of an arbitrary Abelian group $F$ with an element $G \in \mathcal{C}$ also belongs to $\mathcal{C}$. (ii') If both $F, G \in \mathcal{C}$, then $F \otimes G$, $\operatorname{Tor}(F, G) \in \mathcal{C}$. (iii) If $G \in \mathcal{C}$, then its homology group $H_i(G) \in \mathcal{C}$ ($i > 0$). Condition (ii') is implied by (ii). A homomorphism $f : F \to G$ is called $\mathcal{C}$-injective if $\operatorname{Ker} f \in \mathcal{C}$, $\mathcal{C}$-surjective if $\operatorname{Coker} f = G/\operatorname{Im} f \in \mathcal{C}$, and a $\mathcal{C}$-isomorphism if $f$ is $\mathcal{C}$-injective and $\mathcal{C}$-surjective. Two Abelian groups $G$ and $G'$ are called $\mathcal{C}$-isomorphic if there exist $\mathcal{C}$-isomorphisms $f : F \to G$ and $f' : F \to G'$. In particular, if the class $\mathcal{C}_0$ consists of only the trivial group 0, then concepts such as $\mathcal{C}_0$-isomorphism coincide with the usual concepts of isomorphism, and so on. Let $\mathcal{C}_p$ be the class of finite Abelian groups whose orders are relatively prime to a fixed prime number $p$. Here, instead of the terms $\mathcal{C}_p$-isomorphism and so on, we use the terms mod $p$ isomorphism and so on. Let $\mathcal{F}$ be the class of finitely generated Abelian groups. Then $\mathcal{C}_p$ satisfies conditions (ii) and (iii), and $\mathcal{F}$ satisfies (ii') and (iii).
We have the following generalized Hurewicz theorem: (A) Suppose that a class $\mathcal{C}$ satisfies (ii) and (iii) and we are given a 2-connected pair $(X, A)$ of simply connected spaces $X$, $A$. If $\pi_i(X, A) \in \mathcal{C}$ ($i < n$), then $H_i(X, A) \in \mathcal{C}$, and $\tau : \pi_n(X, A) \to H_n(X, A)$ is a $\mathcal{C}$-isomorphism. (B) Suppose that $A = *$, $\mathcal{C}$ satisfies conditions (ii') and (iii), and $X$ is simply connected. Then an assertion similar to (A) holds. In particular, a simply connected space $X$ having finitely generated homology groups (e.g., a simply connected finite polyhedron) has finitely generated homotopy groups. As a corollary to theorem (A), we obtain a generalized Whitehead theorem. In particular, applying the theorem to the class $\mathcal{F} \cap \mathcal{C}_p$, we obtain the following frequently used theorem: Suppose that we are given simply connected spaces $X$, $Y$ whose homology groups are finitely generated and $f : X \to Y$ satisfies $f_*\pi_2(X) = \pi_2(Y)$. Then the following two conditions are equivalent: (1) $f_* : \pi_i(X) \to \pi_i(Y)$ is a mod $p$ isomorphism for $i < n$ and a mod $p$ surjection for $i = n$. (2) $f_* : H_i(X, \mathbf{Z}_p) \to H_i(Y, \mathbf{Z}_p)$ is an isomorphism for $i < n$ and a surjection for $i = n$ (where $\mathbf{Z}_p = \mathbf{Z}/p\mathbf{Z}$). The theory above, which makes use of the notion of class $\mathcal{C}$, is an example of Serre's $\mathcal{C}$-theory. Concepts such as spectral sequences for fiber spaces and $n$-connective fiber spaces are important tools in Serre's $\mathcal{C}$-theory (→ 148 Fiber Spaces).
To calculate homotopy groups, we use notions such as exact sequences, fiber spaces, (co)homology groups of $n$-connective fiber spaces, and Postnikov systems. Given an arbitrary group (more generally, a Postnikov system), there exists a CW complex having the given group (system) as its homotopy group (Postnikov system) (realization theorem of homotopy groups). For an arbitrary arcwise connected topological space $X$ there exist topological spaces $(X, n)$ and continuous mappings $p_n : (X, n + 1) \to (X, n)$ ($n = 1, 2, \dots$) satisfying the following two conditions: (i) $((X, n + 1), p_n, (X, n))$ is a fiber space whose fiber is an Eilenberg-MacLane space. (ii) $(X, 1) = X$, and $((X, n + 1), p_1 \circ \dots \circ p_n, X)$ is an $n$-connective fiber space. The method of obtaining the homotopy group $\pi_n(X) \cong H_n((X, n))$ by computing (co)homology groups of $(X, n)$ is called a killing method.

O. Homotopy Operations

Let $X$, $Y$, $X'$, $Y'$ be topological spaces. If to each continuous mapping $f \in Y^X$ there corresponds a homotopy class $\Phi(f) \in \pi(X'; Y')$ that is a homotopy invariant of $f$ (satisfying a certain naturality condition), then $\Phi$ is called a homotopy operation. More generally, we

may consider the case where $\Phi$ is a mapping from $\pi(X_1; Y_1) \times \dots \times \pi(X_n; Y_n)$ into $\pi(X'; Y')$. The naturality of $\Phi$ is defined as follows: Consider the category $\mathcal{C}$ of topological spaces (or its subcategory). Let $Y = Y'$ be an arbitrary object of $\mathcal{C}$, and fix $X$ and $X'$. In this case, the naturality of $\Phi_Y : \pi(X; Y) \to \pi(X'; Y)$ is defined to be the commutativity of the diagram:
$$\begin{array}{ccc} \pi(X; Y) & \xrightarrow{\Phi_Y} & \pi(X'; Y) \\ \downarrow g_* & & \downarrow g_* \\ \pi(X; Z) & \xrightarrow{\Phi_Z} & \pi(X'; Z) \end{array}$$
i.e., $g_* \circ \Phi_Y = \Phi_Z \circ g_*$ for an arbitrary morphism (i.e., continuous mapping) $g : Y \to Z$ of the category $\mathcal{C}$. Similarly, when objects $Y$, $Y'$ of the category $\mathcal{C}$ are fixed and $X = X'$ is an arbitrary object of $\mathcal{C}$, to say that a homotopy operation $\Phi_X : \pi(X; Y) \to \pi(X; Y')$ is natural means that $h^* \circ \Phi_X = \Phi_W \circ h^*$ for an arbitrary morphism $h : W \to X$.
We have the following theorem: In the category of topological spaces and continuous mappings, the homotopy operations $\Phi_Y : \pi(X; Y) \to \pi(X'; Y)$ and the elements of $\pi(X'; X)$ are in one-to-one correspondence. The correspondence is obtained by associating a homotopy operation $\Phi_Y(\beta) = \beta \circ \alpha$ ($\beta \in \pi(X; Y)$) with each $\alpha \in \pi(X'; X)$. Similarly, the homotopy operations $\Phi_X : \pi(X; Y) \to \pi(X; Y')$ and the elements of $\pi(Y; Y')$ are in one-to-one correspondence. This theorem holds also for the case involving several variables if we consider $\pi(X'; X_1 \vee X_2 \vee \dots)$ or $\pi(Y_1 \times Y_2 \times \dots; Y')$ instead of $\pi(X'; X)$ or $\pi(Y; Y')$. The theorem remains valid if we replace the spaces $X$, $Y$ by systems of spaces.

and $\beta$, denoted by $[\alpha, \beta]$ and called the Whitehead product of $\alpha$ and $\beta$ (J. H. C. Whitehead, Ann. Math., (2) 42 (1941)). The Whitehead product is a homotopy operation of type $(m, n; m + n - 1)$. Let $\psi_m : (I^m, \dot{I}^m) \to (S^m, *)$ be a mapping that smashes $\dot{I}^m$ to a point. The product of $\psi_m$ and $\psi_n$ defines a mapping $\psi_{m,n} : S^{m+n-1} \to S^m \vee S^n = (S^m \times *) \cup (* \times S^n)$. Let $\iota \in \pi_m(S^m \vee S^n)$, $\iota' \in \pi_n(S^m \vee S^n)$ be the homotopy classes of the natural inclusions of $S^m$, $S^n$ into $S^m \vee S^n$; then the homotopy class of $\psi_{m,n}$ is $[\iota, \iota']$. G. W. Whitehead showed that a direct sum decomposition $\pi_p(S^m \vee S^n) = \iota_*\pi_p(S^m) + \iota'_*\pi_p(S^n) + [\iota, \iota']_*\pi_p(S^{m+n-1})$ ($\iota_*$, $\iota'_*$, $[\iota, \iota']_*$ are injective) holds for $1 < p < m + n + \min(m, n) - 3$. Furthermore, P. J. Hilton showed that for general $p > 1$, $\pi_p(S^m \vee S^n)$ is the direct sum of the images of injections $\iota_*$, $\iota'_*$, $[\iota, \iota']_*$, $[[\iota, \iota'], \iota]_*$, $[[\iota, \iota'], \iota']_*$, etc. The homotopy operations of type $(m, n; p)$ are in one-to-one correspondence with the elements of $\pi_p(S^m \vee S^n)$; hence such operations can be constructed by means of composition and the Whitehead product. The last proposition is also valid for homotopy operations of type $(m_1, \dots, m_k; p)$. The Whitehead product $[\alpha, \beta]$ ($\alpha \in \pi_m(X)$, $\beta \in \pi_n(X)$) is distributive with respect to $\alpha$ (resp. $\beta$) for $m > 1$ ($n > 1$), and we have $[\beta, \alpha] = (-1)^{mn}[\alpha, \beta]$ and $f_*[\alpha, \beta] = [f_*\alpha, f_*\beta]$ for $f : X \to Y$. Moreover, for $\gamma \in \pi_l(X)$ the Jacobi identity holds: $(-1)^{ml}[[\alpha, \beta], \gamma] + (-1)^{mn}[[\beta, \gamma], \alpha] + (-1)^{nl}[[\gamma, \alpha], \beta] = 0$ (M. Nakaoka and H. Toda; H. Uehara and W. S. Massey; Hilton).
Q. Suspensions and Generalized Hopf
Invariants

P. Homotopy Operations in Homotopy Groups


We denote by CIA fl E n(X A X’; YA Y’), the class
of the reduced join off; g, where f represents
(1) If X, X’ are spheres S”, SP with base points
nsn(X; Y),, and g represents ,~EK(X’; Y’),.
and Y, Y’ are topological spaces with base
We cal1 a A /l the reduced join of a and fi. In
points, a homotopy operation @y:n,,(Y)+
particular, if Y= Y’ = S’, b is the identity
np( Y) is said to be of type (n, p). By the theo-
mapping of S’, and a is represented by 5 then
rem in Section 0, the homotopy operations
a A b is called the suspension of a and is de-
of type (n, p) are in one-to-one correspondence
noted by Sa. Sa is the class of the suspension
with the elements of the homotopy group of
$f off and belongs to n(SX; SY),, where SX
the sphere n,,(Sn).
indicates the reduced suspension of X. The
(2) As an example of the 2-variable homo-
suspension Sa is often denoted by Ea in ref-
topy operations Q: n,(Y) x zLn(y)-+~,( Y) of
erence to the German term Einhtingung. The
type (m, n; p) we have the Whitehead product
identity mapping 1 of SY gives rise to an in-
delïned as follows: Suppose that EE~L,,,( Y),
jection i = Q, 1 sending Y into the loop space
/1~ n,(Y) are elements represented by f: (I”, im)
n(SY) determined by the formula i(y)(t)=(y, t).
-(Y, *) and g:(1”, &( Y, *), respectively.
Then we have
Delïne a continuous mapping F from the
boundary jm+” = (1” x in) U (im x 1”) of 1”+” = i, =R, oS:7r(X; Y),+7c(SX; SY),
I” x I” into Y by F(x, y) =f(x) for (x, y) E 1” x
i” and F(x,y)=g(y) for (x,y)gim x I”. Since
1‘m+” is homeomorphic to Sm+nml, we cari iden- and S and i, are equivalent. Let Y, be the
tify them. The homotopy class represented by identifying space Yk/ -, where Yk is the prod-
F is an element of 7tm+n-l( Y) determined by c( uctspaceYx...xYofkcopiesofYand-

is the equivalence relation determined by the set of elements p m n(SW; Z) such that
p*(B)Ea,i*m’(/?) is denoted by {sc,p,y} and
is called a secondary composition or Toda
-(Y,>...,Yk-,,*). hracket. If 0, q are elements of rc(SW; Y),,
n(SX; Z),, respectively, then we have {a, b, y}
Denote by Y, = lJk yk the limit space with
+“*8={4B,Y}, {“,B,r)+SY*~={C(,B,Y).
respect to the injection Y,-, + Yk given by
Hence we may consider the set {a, 8, y} to be a
(yi, ,y,-,)+(~,, . . . ,y,-,, *) and cal1 it the residue class modula a submodule generated
reduced product space of Y. Let Y be a CW
by c(,@W; Y),, and Sy*n(SX; Z),.
complex of O-+Section * The mapping i: Y=
The secondary composition {a, 8, y} has the
Y, -&SYcan then be extended to i: Y, +
following properties: (i) {a, /l, y} is linear with
fiSY, where iis a weak homotopy equiva-
respect to c(, /l, y (if the sum is defïned); (ii)
lente. If X is also a CW complex, then K&’ o
ao{B,r,6}={~,8,~}o(-~~);(i~i)~{~,8,~}~
i,: n(X; Y,),+x(SX; SY), is bijective. By
-{Sa,SB,Sy};(iv)ao{B,y,6}-{aoB,y,6},
smashing the subset Y of Y,, we have YA Y=
{~oB,Y,6}~{a,BoY,~},...;(v){{~,B,Y},
Y2/Y. This smashing mapping cari be extended
S6, SE} + { % {B, y>6}, Se} + 1% 8, {Y, 6, E} } =
to h: Y, -( YA Y), (1. M. James). Utilizing
0. Suppose that the spaces X, Y, Z, W are
h, : n(X; Y,),-tn(X; ( YA Y),), and the bijec-
spheres. Then by (iii) the secondary composi-
tion fi;’ oi,, we obtain a correspondence
tion ~~,B,Y)~G~+~+~+~I(~oG~+~+~ +~oG,+,+d
H:n(SX;SY),~n(SX;S(Yr\ Y)),. We cal1 H(a)
is defined in the stable homotopy groups G,=
the generalized Hopf invariant of cc When X =
lim n-co n”+,(S”) of spheres. From this we obtain
S’“-‘, Y=S”-‘, H is equivalent to the Hopf
(vi) {y, fi, a} = ( -l)pq+qr+rp+’ {a, b, y} and (vii)
invariant y : 7czn-, (S”)-+Z (- Section U). In
(-1)~‘{~,~,Y}+(-1)~~{B,Y,~}+(-~)rq{Y,~,8}
general, we have Ho S = 0, and the exactness
EO.
of -% 3 holds under various conditions.
Denote also by o the composition of homo-
topy classes; then we have S(a o 8) = Sao S/I
S. Functional Operations
and H(E o SP) = Ha o SP. Also, H@E o fl) =
S(a A a)o H/I. Under the condition i < 3n - 3,
Let ù, be an operation corresponding to CI and
we have (a, +cc,)oB=cc, ofl+sc,op+[cc,,
y be the class of J: We put Qf(fi) = {a, B, y} and
a21 o H(p) for ~(i, a2~nn(X) and Peni(S”) (G.
cal1 Qf a functional @-operation. When @ is a
W. Whitehead). Thus the composition CIo /i’ is
cohomology operation, @, is called a func-
not always left distributive but is always right
tional cohomology operation. Then QI(b) is
distributive, and CIo fl is left distributive if p =
dehned for /J satisfying f*(b) =@(fi) = 0, and
S[y. The composition is defïned over the stable
m,.(b) is determined modulo ImSf* + Im @.
homotopy groups G,. of spheres (- Section U):
For f: Sn+k ->S”, k = 2i(p - 1) - 1, we denote by
ao/?~G,+, (REG,,,BEG~). It is distributive and
H,(f). ~,,+~+i E Hn+k+l(Sn+k+‘; Z,) the image
satisfies ~occ=(-l)pqcco~.
of a generator E, of H”(S”; Z,) under the
When Y and Y’ are Eilenberg-MacLane
functional Vp; operation. Then the Hopf in-
spaces, BO, and B,, we have tcohomology
variant modulo p (or modp Hopf invariant)
operations on cohomology groups H”( ; Z7),
Hp:z,+k(Sn)+Zp is obtained (we use Sq*’ for p
KO groups, and K groups, respectively. As
= 2). The following statements are equivalent:
typical examples there are +Steenrod square
(i) The mod 2 Hopf invariant is not trivial
operations Sq’: H”(X; Z,)-+Hn+i(X; Z,), +Steen-
(Hz #O); (ii) there exists a mapping: SZkil +
rod pth power operations 9’: H”(X; Z,)+
Sk+’ of Hopf invariant 1; (iii) Sk is an H-space;
Hn+2i(p-1)(X; Z,), +Chern characters ch”: K(X)
(iv) the Whitehead product [l, z] of a genera-
+H’“(X; Q) (Q : rational fïeld), +Adams opera-
tor I of nk(Sk) vanishes. Also, H, #O if and
tions @i:KO(X)+KO(X) (K(X)+K(X)). They
only if k = 2,4, 8 (J. Adams), and for an odd
are a11 homomorphisms (- 64 Cohomology
prime p, H, # 0 if and only if k = 2p - 3 (A. L.
Operations; 237 K-Theory).
Liulevicius; N. Shimada and T. Yamanoshita).

R. Secondary Compositions T. Stable Homotopy Groups and Spectra

The homotopy set n(S”X, S’Y), for n-fold


Suppose that aop=O, aoy=O for y~n(W;
iterated suspensions S’X = X A S”= SS’~‘X
X),, /l~n(X; Y),, c(E~L(Y;Z)~. In the com-
and S’Y, forms a group (an Abelian group) if
mutative diagram of Puppe exact sequences
n > 1 (n 3 2). The limit n”(X; Y) = lim $?‘X;
%L(SW;Y),sc(C,; Y),L(X; Y),? S’Y), with respect to the suspension homo-
morphisms S:n(S”X; S”Y),~n(S”“X;S”“Y),
is called a stable homotopy group of X and

Y. For an $r$-connected space $Y$ and a CW-complex $X$, $S: \pi(X; Y)_0 \to \pi(SX; SY)_0$ is
bijective if $\dim X \le 2r$ and surjective if $\dim X \le 2r + 1$ (generalized suspension theorem).
Thus, if $X$ is a finite-dimensional CW-complex, $\pi^s(X; Y)$ is isomorphic to
$\pi(S^nX; S^nY)_0$ for sufficiently large $n$. To discuss stable homotopy groups more generally,
the following concept of spectra is used. A system $E = \{E_k, \varepsilon_k\}$ which consists of
CW-complexes $E_k$ and continuous mappings $\varepsilon_k: SE_k \to E_{k+1}$ is called a spectrum.
When $E_k = S^k$ and $\varepsilon_k = 1_{k+1}: SS^k \to S^{k+1}$, $S = \{S^k, 1_{k+1}\}$ is called a
sphere spectrum. When $E_k = K(G, k)$ (Eilenberg-MacLane complex) and $\varepsilon_k$ induces a
homotopy equivalence $K(G, k) \simeq \Omega K(G, k+1)$, $HG = \{K(G, k), \varepsilon_k\}$ is called an
Eilenberg-MacLane spectrum. As in the latter, a spectrum $E$ in which $\varepsilon_k$ induces a
homotopy equivalence $E_k \simeq \Omega E_{k+1}$ is called an $\Omega$-spectrum. By Bott's periodicity,
$\Omega$-spectra $KU = \{Z \times B_U, U, Z \times B_U, U, \ldots\}$ and
$KO = \{Z \times B_O, U/O, Sp/U, Sp, Z \times B_{Sp}, U/Sp, O/U, O, Z \times B_O, \ldots\}$ are
obtained (→ Section V). Also, using Thom complexes, Thom spectra $MU$, $MO$, etc. are obtained.
Given a spectrum $E$, by putting $E^n(X, A) = \lim_k \pi(S^k(X/A); E_{n+k})_0$ for each pair $(X, A)$
of CW-complexes, we obtain a generalized cohomology theory with $E$-coefficient; and by putting
$E_n(X, A) = \lim_k \pi_k(E_{k-n} \wedge (X/A))$, we obtain a generalized homology theory with
$E$-coefficient. Generalized (co)homology theory on (finite) CW-complexes can be represented by a
suitable spectrum (G. W. Whitehead, E. H. Brown, Adams). Corresponding to $E = S$, $HG$, $KU$,
$MU$, etc., we have stable (co)homotopy groups, $G$-coefficient (co)homology groups, $K$-groups,
(co)bordism groups, etc., respectively (→ 201 Homology Theory).

U. Homotopy Groups of Spheres

The spheres $S^n$ and their homotopy groups are basic objects in homotopy theory. Although much
research has been done concerning these objects, there are still open problems.
$S^n$ is $(n-1)$-connected: $\pi_i(S^n) = 0$ ($i < n$). The fact that $\pi_n(S^n) \cong Z$ (infinite
cyclic group) was obtained from the Brouwer mapping theorem. Also, $\pi_i(S^1) = 0$ ($i > 1$) follows
from the fact that the universal covering space of $S^1$ is contractible. Suppose that we are given a
continuous mapping $f: S^{2n-1} \to S^n$. We approximate it by a simplicial mapping $\varphi$. Then
the inverse image $\varphi^{-1}(*)$ of a point $*$ in the interior of an $n$-simplex of $S^n$ is an
$(n-1)$-dimensional pseudomanifold which is orientable by means of a suitable generator
$\varepsilon \in H_{n-1}(\varphi^{-1}(*))$. The boundary isomorphism
$\partial: H_n(S^{2n-1}, \varphi^{-1}(*)) \cong H_{n-1}(\varphi^{-1}(*))$ and the homomorphism
$\varphi_*: H_n(S^{2n-1}, \varphi^{-1}(*)) \to H_n(S^n, *)$ give rise to an integer $\gamma(\varphi)$
determined by the relation $\varphi_*\partial^{-1}(\varepsilon) = \gamma(\varphi)\varepsilon_n$
($\varepsilon_n$ is an orientation of $S^n$). This integer is independent of the choice of $\varphi$,
so we can set $\gamma(f) = \gamma(\varphi)$. Then $f \simeq g$ implies that $\gamma(f) = \gamma(g)$.
We call $\gamma(f)$ the Hopf invariant of $f$. H. Hopf defined $\gamma$ and showed
$\gamma: \pi_3(S^2) \cong Z$ (1931); $\gamma(\pi_{2n-1}(S^n)) = 0$ for odd $n$;
$\gamma(\pi_{2n-1}(S^n)) \supset 2Z$ for even $n$; and $\gamma(\pi_{2n-1}(S^n)) = Z$ for $n = 4, 8$
(1935). H. Freudenthal defined a homomorphism $E: \pi_i(S^n) \to \pi_{i+1}(S^{n+1})$,
$E[f] = [Sf]$, and proved the Freudenthal theorem: (1) $E$ is an isomorphism for $i < 2n - 1$;
(2) $E$ is a surjection for $i = 2n - 1$; and (3) the image of $E$ coincides with the kernel of
$\gamma$ for $i = 2n$. Furthermore he obtained $\pi_{n+1}(S^n) \cong Z_2$ ($n \ge 3$) (1937). For
$n = 2, 4, 8$, a mapping $f: S^{2n-1} \to S^n$ (Hopf mapping) such that $\gamma(f) = 1$ (given by
Hopf) is the projection of a fiber bundle $S^{2n-1}$ over the base space $S^n$, and the
correspondence $(\alpha, \beta) \mapsto E\alpha + f_*\beta$ gives an isomorphism
$\pi_{i-1}(S^{n-1}) + \pi_i(S^{2n-1})$ (direct sum) $\cong \pi_i(S^n)$. Hence we obtain
$\pi_4(S^2) = Z_2$. It was shown by G. W. Whitehead and L. S. Pontryagin that $\pi_{n+2}(S^n)$
($n \ge 3$) is isomorphic to $Z_2$ (1949). Whitehead also defined a generalized Hopf homomorphism
$H: \pi_i(S^n) \to \pi_i(S^{2n-1})$ for a range of $i < 3n - 3$, and this restriction on the
dimension was removed by P. J. Hilton and I. M. James. Using $H$, many nontrivial results
concerning $\pi_i(S^n)$ have been obtained. Serre obtained the following (1951-1953): $\pi_i(S^n)$
is finite except when $i = n$ or $i = 4m - 1$ and $n = 2m$. Furthermore, $\pi_{4m-1}(S^{2m})$ is the
direct sum of $Z$ and a finite group. Let $p$ be an odd prime and $n$ be even. Then $\pi_i(S^n)$ is
$\mathcal{C}_p$-isomorphic to $\pi_{i-1}(S^{n-1}) + \pi_i(S^{2n-1})$. Let $n$ be odd. Then
$\pi_{n+k}(S^n) \in \mathcal{C}_p$ ($k < 2p - 3$), and $\pi_{n+2p-3}(S^n)$ is
$\mathcal{C}_p$-isomorphic to $Z_p$. Serre and H. Toda determined $\pi_{n+k}(S^n)$ for
$k = 3, 4, 5$, and Serre further determined it for $k = 6, 7, 8$. Utilizing the reduced product
space of $S^n$, James gave the sequence

$\cdots \to \pi_i(S^n) \xrightarrow{E} \pi_{i+1}(S^{n+1}) \xrightarrow{H} \pi_{i+1}(S^{2n+1}) \to \cdots$

and showed that it is an exact sequence if $n$ is odd and an exact sequence mod 2 if $n$ is even
(1953). Using this exact sequence and the secondary composition, Toda determined $\pi_{n+k}(S^n)$
for $k \le 19$ (→ Appendix A, Table 6.VI). By the Freudenthal theorem (1), the $\pi_{n+k}(S^n)$
($n > k + 1$) for a fixed $k$ are isomorphic to each other. We call $\pi_{n+k}(S^n)$ ($n > k + 1$)
the stable homotopy group of the $k$-stem of the sphere and denote it by $G_k$. For
$k = 0, 1, 2, \ldots, 15, \ldots$, $G_k \cong Z$, $Z_2$, $Z_2$, $Z_{24}$, $0$, $0$, $Z_2$, $Z_{240}$,
$Z_2 + Z_2$, $Z_2 + Z_2 + Z_2$, $Z_6$, $Z_{504}$, $0$, $Z_3$, $Z_2 + Z_2$, $Z_{480} + Z_2$, .... For
the computation of $G_k$, the notion of $n$-connective fiber spaces is important. By
783 203 A
Hopf Algebras

utilizing the Adams spectral sequence, we cari ~qB,,)> Bsp x Z+wJ/SP), U/SP+wOlU)3
show that G, is closely related to the cohomol- O/U-&(O). This result is applied to non-
ogy of the +Steenrod algebra. Let p be an odd stable cases; for example, rc,,(U(n)) is a cyclic
prime. There exist the following sequences group of order n! (- Appendix A, Table 6.VI).
of elements of order p: {ccieG, (k=2i(p- l)- The 2-dimensional homotopy group n,(G) of
l)j,{l?i~Gk(k=2(ip+i-l)(p-l)-2))andfor any Lie group is trivial.
p>3{yi~G,(k=2(ip2+(i-l)p+i-2)(p-l)- Let sc~rt,(O(n)), where a=[f],f:S”-O(n).
3)}. The p-component of Gk is determined for We delïne 7: Sk x Y’ +S”-’ by f(x, y) =,f(x) . y
k < 2p2(p- 1) - 3 by using Steenrod algebra. and identify Sk+” with the boundary (Ek+’ x
TO compute G, for higher k, relations such as YL)U (Sk x E”) of Ek+l x E”. We extend f to
ut /If = 0, /$bp = 0 (i > 1) are necessary. In gen- f:sk+n +S” SO that it maps Ek+’ x S’-‘, Sk x
eral, each element of Gk (k # 0) is nilpotent E” into the Upper and lower hemisphere of
(G. Nishida). Let n,(S”: p) be the p-component S”(S”-’ = the equator), respectively. Let J(~)E
of r&S”). TO survey this group for the nonstable z,+~(S”) be the class of the mapping thus ob-
case (i > 2n - l), we utilize Serre’s mod p direct tained. This homomorphism J:rc,(O(n))+
sum decomposition (for n even), and we have ~c,+~(S”) is called a J-homomorpbism of Hopf
the following two exact sequences for the case and Whitehead. For the stable case, J:r~,(o)+
of odd n: Gk is injective for k = 0, 1 (mod S), and the
order of the image of J is the denominator
. ..-7ci(s”)%i+2(Sn+2)‘7Li(R2(Sn+2).Sn)L,
of B,,/4t (i?2t is a +Bernoulli number) or its
. ..+7ci+3(Sp”+p+‘.p)%ci+l(SP”+P-‘:p) double for k = 4t - 1 (Adams).

-+71i(R2(S”+~),S”:p)-+TLi+2(S~n+~+’:p)0,...,
References
where E’=EoE and AE’(cr)=pa(- Ap-
pendix A, Table 6.VI).
[ 1] H. Hopf, Über die Abbildungen von
Spharen auf Spharen niedriger Dimension,
V. Homotopy Groups of Classical Groups Fund. Math., 25 (1935), 427-440.
[2] W. Hurewicz, Beitrage zur Topologie der
Consider the classical group U(n, A), which is Deformationen I-IV, Proc. Acad. Amsterdam,
either the orthogonal group O(n) (A = R); the 38 (1935), 112-229,521-528; 39 (1936) 117-
unitary group U(n) (A = C); or the symplectic 125,2155224.
group Sp(n) (A=H). The infinite classical [3] H. Freudenthal, Uber die Klassen der
group U(co, A) is delïned to be the inductive Spharen-abbildungen, Compositio Math., 5
limit group of {U(n, A) 1n = 1,2,. . } with re- (1937),299-314.
spect to the natural injection U(n, A) c U(n + [4] J.-P. Serre, Groupes d’homotopie et classes
1, A). We cal1 U(ro,A) the infinite orthogonal des groups abeliéns, Ann. Math., (2) 58 (1953),
group, infinite unitary group, and infinite sym- 258-294.
plectic group for A = R, C, and H, respectively. [5] R. Bott, The stable homotopy of the class-
The dimensions of the cells of U( CO,A) - ical groups, Ann. Math., (2) 70 (1959), 3 13-
U(n,A) are >A(n+ 1)- 1, where i=dim,A 337.
(=1 (A=R), =2(A=C), =4(A=H)).It [6] H. Toda, Composition methods in homo-
follows that rck(U(n, A)) is isomorphic to topy groups of spheres, Princeton Univ. Press,
rrk(U(co,A)) for k<Â(n+ 1)-2, which is called 1962.
the kth stable homotopy group of the classical [7] E. H. Spanier, Algebraic topology,
group.LetO=U(oo,R),U=U(co,C),Sp= McGraw-Hill, 1966.
U(c0, H). The homotopy groups of the clas- [S] R. M. Switzer, Algebraic topology,
sical groups are periodic (k > 0): Springer, 1975.
[9] G. W. Whitehead, Elements of homotopy
x,(u)~~,+,(u)~z, k odd,
theory, Springer, 1978.
ZO, k even,

~k(O)~7Lk+4(SP)~71k+s(o),
=Z, k = 3, 7 (mod 8),
203 (111.25)
ZZ,, k=O, 1 (modS),
Hopf Algebras
ZO, k#O, 1, 3, 7 (mod8).

This is called the Bott periodicity theorem. The A. General Remarks


relations are deduced from weak homotopy
equivalences U+R(B,), Bu x Z-O(U), BO x Z The concept of Hopf algebras arose from two
+wJlO), U/O+fi(Sp/U), Sp/U+fi(Sp), SP directions. First in the lïeld of algebraic topol-

ogy the notion of Hopf algebras arose from the study of homology and cohomology of Lie groups or,
more generally, H-spaces. It was introduced by H. Hopf [1], whose basic structure theorem was
generalized and applied to several problems by A. Borel [2]. These Hopf algebras are graded as
algebras and coalgebras, and they are now used as standard tools in algebraic topology. On the
other hand, Hopf algebras without grading were studied in connection with affine algebraic groups
and formal groups. The study of nongraded Hopf algebras as an algebraic system was initiated by
M. E. Sweedler [3], and many results on Hopf algebras of this type have been applied, not only to
the theory of algebraic groups but also to the Galois theory of field extensions and to
combinatorial theory.
Although these two types of Hopf algebras have similar structures, and the same terminology is
used to describe their properties, they are somewhat different from each other. So to avoid
confusion in this article, we distinguish between graded Hopf algebras and Hopf algebras.

B. Graded Algebras

A graded module $A = \sum_{n \ge 0} A_n$ over a field $k$ is said to be of finite type when each
$A_n$ is finite-dimensional; $A$ is connected when an isomorphism $\eta: k \cong A_0$ is given. The
tensor product of two graded modules $A$ and $B$ is a graded module with
$A \otimes B = \sum_n (A \otimes B)_n$, $(A \otimes B)_n = \sum_p A_p \otimes B_{n-p}$. We call
$A^* = \sum A^n$ (where $A^n$ is the dual module of $A_n$) the dual graded module of $A$. When $A$
and $B$ are of finite type, $A \otimes B$ and $A^*$ are also of finite type, and we have
$(A \otimes B)^* = A^* \otimes B^*$ and $A^{**} = A$. When $A$ and $B$ are connected, $A^*$ and
$A \otimes B$ are also connected.
Let $A$ be a graded module. If there exists a degree-preserving linear mapping
$\varphi: A \otimes A \to A$, we call $(A, \varphi)$ a graded algebra, whereas if there exists a
degree-preserving linear mapping $\psi: A \to A \otimes A$, we call $(A, \psi)$ a graded coalgebra.
We call $\varphi$ a multiplication, and $\psi$ a comultiplication (or diagonal mapping). Usually we
write $\varphi(a \otimes b) = ab$ (the product of $a, b \in A$), and call $\psi(a)$ the coproduct of
$a$. Multiplication and comultiplication are dual operations. If $A$ is of finite type and
$(A, \varphi)$ is a graded algebra, then $(A^*, \varphi^*)$ (where $\varphi^*$ is the dual mapping
of $\varphi$) is a graded coalgebra, and vice versa. A multiplication $\varphi$ is called
associative (commutative) if $\varphi(1 \otimes \varphi) = \varphi(\varphi \otimes 1)$
($\varphi \circ T = \varphi$), where $T: A \otimes A \to A \otimes A$ is the mapping defined by
$T(a \otimes b) = (-1)^{pq}\, b \otimes a$ for $a \in A_p$ and $b \in A_q$. Associativity and
commutativity of a comultiplication are defined dually. Let $(A, \psi)$ be a graded connected
coalgebra of finite type, and identify $k$ with $A_0$ via $\eta$. Then the graded algebra
$(A^*, \psi^*)$ has the unity of $k$ as unity if and only if $\psi$ satisfies
$\psi(1) = 1 \otimes 1$ and $\psi(x) = 1 \otimes x + x \otimes 1 + \sum_i x_i' \otimes x_i''$
($0 < \deg x_i' < \deg x$) for $\deg x > 0$. In this case we say that $\psi$ has the unity of $k$
as counity. For graded algebras $(A, \varphi)$ and $(B, \varphi')$, if
$\varphi'' = (\varphi \otimes \varphi') \circ (1 \otimes T \otimes 1): A \otimes B \otimes A \otimes B \to A \otimes B$,
then $(A \otimes B, \varphi'')$ is also a graded algebra, which we denote by
$(A \otimes B, \varphi'') = (A, \varphi) \otimes (B, \varphi')$. The tensor product of graded
coalgebras is defined as the dual notion of $(A, \varphi) \otimes (B, \varphi')$.

C. Graded Hopf Algebras

For simplicity we assume that graded modules are defined over a field, connected, and of finite
type. Let a graded module $A$ be equipped with a multiplication $\varphi$ and a comultiplication
$\psi$. If $\varphi$ and $\psi$ have the unity of $k$ as unity and counity, respectively, and
$\psi: (A, \varphi) \to (A, \varphi) \otimes (A, \varphi)$ is an algebra homomorphism, then we call
$(A, \varphi, \psi)$ a graded Hopf algebra. The last condition for a graded Hopf algebra is
satisfied if and only if $\varphi: (A, \psi) \otimes (A, \psi) \to (A, \psi)$ is a homomorphism of
graded coalgebras. The dual $(A^*, \psi^*, \varphi^*)$ is also a graded Hopf algebra, called the
dual Hopf algebra of $(A, \varphi, \psi)$.

D. H-Spaces

Let $X$ be a topological space. The cohomology group $H^*(X)$ (homology group $H_*(X)$)
considered over a field $k$ has a multiplication $d^*$ (comultiplication $d_*$), which is induced by
the diagonal mapping $d: X \to X \times X$ and becomes a commutative and associative graded algebra
(coalgebra). The groups $H^*(X)$ and $H_*(X)$ are dual to each other (→ 201 Homology Theory I, J).
When $X$ is equipped with a base point $x_0$ and a base point-preserving continuous mapping
$h: X \times X \to X$ such that $h \circ \iota_i \simeq 1_X$ (homotopic) for $i = 1$ and $2$ (where
$\iota_1(x) = (x, x_0)$ and $\iota_2(x) = (x_0, x)$), we call $(X, h)$ an H-space, $h$ a
multiplication, and $x_0$ a homotopy identity of $X$. Then $h$ induces, through a Künneth
isomorphism, a comultiplication $h^*: H^*(X) \to H^*(X) \otimes H^*(X)$ (Hopf comultiplication) and
a multiplication $h_*: H_*(X) \otimes H_*(X) \to H_*(X)$ (Pontryagin multiplication). Then
$h^*(\alpha)$ ($\alpha \in H^*(X)$) is called the Hopf coproduct of $\alpha$, and
$h_*(\beta \otimes \gamma)$ ($\beta, \gamma \in H_*(X)$) is called the Pontryagin product of $\beta$
and $\gamma$. When $X$ is arcwise connected and $H_*(X)$ is of finite type, $(H^*(X), d^*, h^*)$ and
$(H_*(X), h_*, d_*)$ are graded Hopf algebras dual to each other. In particular, when $h$ is
homotopy associative, i.e., $h \circ (h \times 1_X) \simeq h \circ (1_X \times h)$ (homotopy
commutative, i.e., $h \simeq h \circ T$, where

$T(x_1, x_2) = (x_2, x_1)$ for $x_i \in X$), then $h^*$ and $h_*$ are associative (commutative).
Topological groups and loop spaces are homotopy associative H-spaces. If a continuous mapping
$g: X \to X$ satisfies $h \circ (1_X \times g) \simeq h \circ (g \times 1_X) \simeq c$ (constant
mapping $X \to \{x_0\}$), then $g$ is called a homotopy inverse for $X$, $h$.
Suppose that a graded Hopf algebra $A$ is defined over a field $k$ of characteristic $p$ and
equipped with associative and commutative multiplication, and $A$ is generated by a single element
$a \in A_n$. Then $A$ is a polynomial ring $k[a]$ ($n$ is even when $p \ne 2$) or a quotient ring
$k[a]/(a^2)$ ($n$ is odd when $p \ne 2$) or $k[a]/(a^{p^f})$ (only when $p \ne 0$; $n$ is even when
$p \ne 2$). These are called elementary Hopf algebras. Every graded Hopf algebra over a perfect
field $k$ with associative and commutative multiplication is isomorphic (as a graded algebra) to a
tensor product of elementary Hopf algebras (Borel's theorem) [2]. In particular, the cohomology
algebra over a field of characteristic 0 of a compact connected Lie group is isomorphic to a
Grassmann algebra generated by elements of odd degrees [1].

E. Steenrod Algebras

The Steenrod algebra $\mathcal{A}_p$ over $Z_p$ is generated by Steenrod operations $Sq^i$
($p = 2$), $\mathcal{P}^i$ ($p > 2$), and the Bockstein operation $\Delta_p$ ($p > 2$), with
composition of operations defined as multiplication. Then $\mathcal{A}_p$ is a connected
associative graded algebra of finite type (not commutative). Defining a comultiplication $\psi$ of
$\mathcal{A}_p$ by $\psi(Sq^n) = \sum Sq^i \otimes Sq^{n-i}$,
$\psi(\mathcal{P}^n) = \sum \mathcal{P}^i \otimes \mathcal{P}^{n-i}$, and
$\psi(\Delta_p) = 1 \otimes \Delta_p + \Delta_p \otimes 1$, $\mathcal{A}_p$ becomes a graded Hopf
algebra with an associative and commutative comultiplication. Thus its dual $\mathcal{A}_p^*$ is a
graded Hopf algebra with an associative and commutative multiplication, and we can apply Borel's
theorem to $\mathcal{A}_p^*$ in order to investigate the structure of $\mathcal{A}_p$ [4].
Let $(A, \varphi, \psi)$ be a graded Hopf algebra with associative multiplication and
comultiplication. Putting $c(1) = 1$ and $c(a) = -a - \sum a_i'\, c(a_i'')$ for $\deg a > 0$ (where
$\psi(a) = 1 \otimes a + a \otimes 1 + \sum a_i' \otimes a_i''$), we obtain a linear mapping
$c: A \to A$ satisfying $c\varphi = \varphi(c \otimes c)T$. We call $c$ the conjugation mapping of
$A$. When the multiplication or comultiplication is commutative, we obtain the relation $c^2 = 1$,
and $c$ is a bijection. The conjugation mapping is utilized in studying Steenrod algebras [4, 5].

F. Coalgebras

Now we turn to nongraded cases. Let $A$ be a vector space over a field $k$, and let
$\mu: A \otimes_k A \to A$ and $\eta: k \to A$ be linear mappings. Then the triple $(A, \mu, \eta)$
is said to be an algebra over $k$ if $\mu \circ (\mu \otimes 1_A) = \mu \circ (1_A \otimes \mu)$ and
$\mu \circ (1_A \otimes \eta) = \mu \circ (\eta \otimes 1_A) = 1_A$, where $1_A$ is the identity
mapping of $A$, and $A \otimes_k k$ and $k \otimes_k A$ are identified naturally with $A$. We call
$\mu$ the multiplication and $\eta$ the unit mapping of the algebra. Dually a triple
$(C, \Delta, \varepsilon)$ with $C$ a vector space over $k$, linear mappings
$\Delta: C \to C \otimes_k C$, and $\varepsilon: C \to k$ is said to be a coalgebra over $k$ if
$(1_C \otimes \Delta) \circ \Delta = (\Delta \otimes 1_C) \circ \Delta$ and
$(1_C \otimes \varepsilon) \circ \Delta = (\varepsilon \otimes 1_C) \circ \Delta = 1_C$. We call
$\Delta$ the comultiplication or the diagonal mapping and $\varepsilon$ the augmentation or the
counit of the coalgebra.
An algebra $(A, \mu, \eta)$ is called commutative if $\mu \circ T = \mu$, where $T$ is the twist
mapping $a \otimes b \mapsto b \otimes a$. A cocommutative coalgebra is defined dually. Definitions
of the tensor product of two algebras or coalgebras are similar to the definitions in the graded
case. If $D$ is a subspace of a coalgebra $(C, \Delta, \varepsilon)$ over $k$ satisfying
$\Delta(D) \subset D \otimes_k D$, then $(D, \Delta|D, \varepsilon|D)$ is a coalgebra and is said to
be a subcoalgebra of $C$. A subspace $I$ of a coalgebra $(C, \Delta, \varepsilon)$ over $k$ is
called a coideal of $C$ if $\Delta(I) \subset C \otimes_k I + I \otimes_k C$ and
$\varepsilon(I) = 0$. Then the quotient space $C/I$ has a coalgebra structure induced naturally
from $(C, \Delta, \varepsilon)$ and is said to be a quotient coalgebra of $C$. The intersection and
the sum of subcoalgebras of $C$ are again subcoalgebras of $C$. If $S$ is a subset of $C$, then the
intersection of all subcoalgebras containing $S$ is said to be the subcoalgebra generated by $S$.
The subcoalgebra generated by any finite set or finite-dimensional subspace of $C$ is
finite-dimensional.
If $(C, \Delta, \varepsilon)$ is a coalgebra over $k$, then the dual space $C^*$ of $C$ has an
algebra structure over $k$ with multiplication $\mu$ and unit mapping $\eta$ defined naturally from
the dual mappings of $\Delta$ and $\varepsilon$ respectively. $(C^*, \mu, \eta)$ is called the dual
algebra of $(C, \Delta, \varepsilon)$. Suppose that $(A, \mu, \eta)$ is an algebra over $k$, and
let $A^\circ$ be the subset of the dual space $A^*$ of $A$ consisting of elements $f$ whose kernel
contains an ideal $I$ such that $A/I$ is finite-dimensional. Then $(A^\circ, \Delta, \varepsilon)$
is a coalgebra over $k$, where $\Delta$ and $\varepsilon$ are the linear mappings induced from the
dual ones of $\mu$ and $\eta$, respectively. $(A^\circ, \Delta, \varepsilon)$ is called the dual
coalgebra of $(A, \mu, \eta)$. The functors $(\ )^*$ and $(\ )^\circ$ are adjoint to one another in
the sense that there is a natural bijective correspondence between the set of algebra
homomorphisms of $A$ to $C^*$ and that of coalgebra homomorphisms of $C$ to $A^\circ$ for any
coalgebra $C$ and algebra $A$, where coalgebra homomorphisms are defined as the dual notion of
algebra homomorphisms.
A nonzero subcoalgebra $D$ of a coalgebra $C$ is called simple if $D$ has no nonzero proper
subcoalgebras, and the sum of all simple subcoalgebras of $C$ is called the coradical of $C$. If

$C$ coincides with its coradical, then $C$ is said to be cosemisimple. If $C$ has only one simple
subcoalgebra, then $C$ is called irreducible. $C$ is called pointed if all simple subcoalgebras of
$C$ are 1-dimensional. An element $g$ of a coalgebra $(C, \Delta, \varepsilon)$ is called grouplike
if $\Delta(g) = g \otimes g$ and $\varepsilon(g) = 1$. The set $G(C)$ of grouplike elements in $C$
is linearly independent over $k$.

G. Bialgebras

A system $(H, \mu, \eta, \Delta, \varepsilon)$ with an algebra structure $(H, \mu, \eta)$ and a
coalgebra structure $(H, \Delta, \varepsilon)$ is said to be a bialgebra over $k$ if $\Delta$ and
$\varepsilon$ are algebra homomorphisms. This last condition is equivalent to saying that $\mu$ and
$\eta$ are coalgebra homomorphisms. If $K$ is a subspace of a bialgebra
$(H, \mu, \eta, \Delta, \varepsilon)$ which is simultaneously a subalgebra and a subcoalgebra of
$H$, then we call $K$ a subbialgebra of $H$. An ideal $I$ of $(H, \mu, \eta)$ which is also a
coideal of $(H, \Delta, \varepsilon)$ is called a biideal of $H$ and the quotient space $H/I$ has a
bialgebra structure which is said to be a quotient bialgebra of $H$. A linear mapping between
bialgebras is a bialgebra homomorphism if it is simultaneously an algebra homomorphism and a
coalgebra homomorphism. A bialgebra $(H, \mu, \eta, \Delta, \varepsilon)$ is called commutative
(cocommutative) if $(H, \mu, \eta)$ ($(H, \Delta, \varepsilon)$) is commutative (cocommutative).

Examples. Let $kG$ be the vector space with a set $G$ as free basis over a field $k$. If we define
linear mappings $\Delta: kG \to kG \otimes_k kG$ by $\Delta(x) = x \otimes x$ and
$\varepsilon: kG \to k$ by $\varepsilon(x) = 1$ for $x$ in $G$, then $(kG, \Delta, \varepsilon)$ is
a cocommutative coalgebra over $k$ such that the set $G(kG)$ of grouplike elements is equal to $G$.
Moreover if $G$ is a semigroup with unit element, then $(kG, \mu, \eta, \Delta, \varepsilon)$ is a
cocommutative bialgebra over $k$, where $\mu$ ($\eta$) is the multiplication (unit mapping) of the
semigroup algebra $kG$. This bialgebra is called a semigroup bialgebra over $k$. If $L$ is a Lie
algebra over $k$ and $U(L)$ the universal enveloping algebra of $L$ with multiplication $\mu$ and
unit mapping $\eta$, then Lie algebra homomorphisms $x \mapsto x \oplus x$ ($L \to L \oplus L$) and
$x \mapsto 0$ ($L \to \{0\}$) induce algebra homomorphisms
$\Delta: U(L) \to U(L \oplus L) \cong U(L) \otimes U(L)$ and $\varepsilon: U(L) \to k$,
respectively. Then $(U(L), \mu, \eta, \Delta, \varepsilon)$ is a cocommutative bialgebra over $k$,
called the universal enveloping bialgebra of $L$.

H. Hopf Algebras

Let $(A, \mu, \eta)$ be an algebra over $k$ and $(C, \Delta, \varepsilon)$ a coalgebra over $k$. If
$f$ and $g$ are in $R = \mathrm{Hom}_k(C, A)$, then $f * g = \mu \circ (f \otimes g) \circ \Delta$
is called the convolution of $f$ and $g$. Defining $\mu': R \otimes_k R \to R$ and
$\eta': k \to R$ by $\mu'(f \otimes g) = f * g$ and $\eta'(a) = a\,\eta \circ \varepsilon$, then
$(R, \mu', \eta')$ is an algebra over $k$. If $H$ is a bialgebra over $k$ with underlying coalgebra
$H^C$ and algebra $H^A$, then $R_H = \mathrm{Hom}_k(H^C, H^A)$ is an algebra over $k$ in the same
manner as above. If the identity mapping $1_H$ of $H$ has the inverse $S$ in $R_H$ under the
multiplication $\mu'$, then $H$ is said to be a Hopf algebra over $k$ with antipode $S$. Then $S$
satisfies the following: $S(gh) = S(h)S(g)$ for $g, h$ in $H$, $S \circ \eta = \eta$,
$\varepsilon \circ S = \varepsilon$ and $T \circ (S \otimes S) \circ \Delta = \Delta \circ S$, where
$T$ is the twist mapping $a \otimes b \mapsto b \otimes a$. If $H$ is commutative or cocommutative,
then $S \circ S = 1_H$.
If $H'$ is another Hopf algebra over $k$ with antipode $S'$, then a bialgebra homomorphism $f$ of
$H$ to $H'$ such that $S'f = fS$ is called a Hopf algebra homomorphism. If $H$ and $H'$ are both
commutative (cocommutative), then any bialgebra homomorphism of $H$ to $H'$ is a Hopf algebra
homomorphism. The category whose objects are commutative and cocommutative Hopf algebras over a
field $k$ and whose morphisms are Hopf algebra homomorphisms is an Abelian category
(A. Grothendieck). If $H$ is a commutative Hopf algebra over a field of characteristic zero, then
the underlying algebra $H^A$ has no nilpotent elements (P. Cartier).
If $G$ is a group, then the group bialgebra $kG$ given above has an antipode $S$ defined by
$S(x) = x^{-1}$ for $x$ in $G$, and hence $kG$ is a cocommutative Hopf algebra over $k$. Another
example of Hopf algebras is the coordinate ring of an algebraic group defined over $k$. More
generally, let $X = \mathrm{Spec}(A)$ be an affine group scheme over $k$. Then algebra
homomorphisms $\Delta: A \to A \otimes_k A$, $\varepsilon: A \to k$, and $S: A \to A$ are naturally
induced from the group structure of $X$, and $(A, \mu, \eta, \Delta, \varepsilon, S)$ is a
commutative Hopf algebra over $k$, where $\mu$ and $\eta$ are the multiplication and unit mapping
of $A$, respectively. Conversely, if $(A, \mu, \eta, \Delta, \varepsilon, S)$ is a commutative Hopf
algebra over $k$, then $X = \mathrm{Spec}(A)$ is an affine group scheme over $k$ with the group
structure induced from $\Delta$, $\varepsilon$, and $S$. Hence a commutative Hopf algebra over $k$
is nothing but a cogroup object of the category of commutative algebras over $k$ (i.e., a group
object of the dual category). Dually a cocommutative Hopf algebra over $k$ is nothing but a group
object of the category of cocommutative coalgebras over $k$.

I. Hyperalgebras

If $(C, \Delta, \varepsilon)$ is a pointed irreducible coalgebra over $k$, then $C$ contains a
unique grouplike element $g$, and $kg$ is the unique simple subcoalgebra of $C$. An element $a$ of
$C$ satisfying $\Delta(a) = a \otimes g + g \otimes a$ is called primitive. The set $P(C)$ of
primitive elements in $C$ is a vector
787 204 B
Hydrodynamical Equations

subspace of C. A cocommutative coalgebra C is called colocal if C is irreducible, i.e., if the dual algebra C* of C is a quasilocal ring. The dimension of P(C) of a colocal pointed coalgebra C is finite if and only if the dual algebra C* of C is a Noetherian complete local ring. A bialgebra $(H, \mu, \eta, \Delta, \varepsilon)$ over k is said to be a hyperalgebra over k if the underlying coalgebra is colocal. Then the unique simple subcoalgebra (grouplike element) of H is $\eta(k)$ ($\eta(1)$), and P(H) has a Lie algebra structure defined by [x, y] = xy - yx for x, y in P(H).

The universal enveloping bialgebra U(L) of a Lie algebra L over k is a hyperalgebra over k such that the set P(U(L)) of primitive elements in U(L) is equal to L. Conversely, if the characteristic of k is zero, any hyperalgebra H over k is isomorphic to the universal enveloping algebra U(P(H)) of the Lie algebra P(H). But in positive-characteristic cases U(P(H)) is generally a proper subbialgebra of H. Another important example of hyperalgebras is the dual coalgebra $hy(X) = A^{\circ}$ of the stalk A at the neutral point of the structure sheaf of an algebraic group scheme X over k. In addition, hy(X) has an algebra structure defined from the group structure of X and is a hyperalgebra over k such that P(hy(X)) is equal to the Lie algebra L(X) of X. Although L(X) plays an important role in the infinitesimal theory of algebraic groups over a field of characteristic zero, it does not give any information on infinitesimals of orders higher than p in the case of positive characteristic p. In the case of characteristic zero we see hy(X) = U(L(X)) and L(X) = P(hy(X)), and so hy(X) is a natural substitute for L(X) in positive-characteristic cases. From this viewpoint many interesting results on hy(X) of an algebraic group scheme X which are parallel to those on L(X) in the case of characteristic zero have been obtained [6-8].

References

[1] H. Hopf, Über die Topologie der Gruppen-Mannigfaltigkeiten und ihre Verallgemeinerungen, Ann. Math., (2) 42 (1941), 22-52.
[2] A. Borel, Sur la cohomologie des espaces fibrés principaux et des espaces homogènes des groupes de Lie compacts, Ann. Math., (2) 57 (1953), 115-207.
[3] M. E. Sweedler, Hopf algebras, Benjamin, 1969.
[4] J. W. Milnor, The Steenrod algebra and its dual, Ann. Math., (2) 67 (1958), 150-171.
[5] J. W. Milnor and J. C. Moore, On the structure of Hopf algebras, Ann. Math., (2) 81 (1965), 211-264.
[6] J. Dieudonné, Introduction to the theory of formal groups, Dekker, 1973.
[7] M. Takeuchi, Tangent coalgebras and hyperalgebras I, Japan. J. Math., 42 (1974), 1-143.
[8] H. Yanagihara, Theory of Hopf algebras attached to group schemes, Lecture notes in math. 614, Springer, 1977.
[9] E. Abe, Hopf algebras, Cambridge Univ. Press, 1980. (Original in Japanese, 1977.)
[10] G.-C. Rota, Coalgebras and bialgebras in combinatorics, Lecture notes at the umbral calculus conference, University of Oklahoma, 1978.

204 (XX.10)
Hydrodynamical Equations

A. General Remarks

Mathematical analysis of the motion of fluids (→ 205 Hydrodynamics) gives rise to various kinds of mathematical problems; the equations that govern flows are amongst the most important and most extensively studied of the nonlinear partial differential equations. Here we review only basic and noteworthy results. Sections B-E are concerned with incompressible fluids, while Sections F and G deal with compressible fluids.

B. Nonstationary Solutions of the Navier-Stokes Equation

Let $\Omega$ be a bounded domain in $\mathbf{R}^m$ (m = 2 or 3) occupied by a fluid, with smooth boundary $\partial\Omega$. If the fluid is viscous and incompressible, its motion can be described by means of the velocity u = u(t, x) and the pressure p = p(t, x), with $t \ge 0$ and $x \in \Omega$ the time variable and the space variable, respectively. For simplicity, we assume that external forces are absent. Then u and p satisfy the Navier-Stokes equation

$$\frac{\partial u}{\partial t} + (u \cdot \nabla)u = \nu \Delta u - \nabla p, \tag{1}$$

and the equation of continuity

$$\operatorname{div} u = 0, \tag{2}$$

where the positive constant $\nu$ stands for the (kinematic) viscosity. On $\partial\Omega$, u is subject to the boundary condition

$$u|_{\partial\Omega} = \beta(t, x). \tag{3}$$

The initial value of u is also prescribed:

$$u|_{t=0} = u_0(x). \tag{4}$$
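As an illustrative check of (1) and (2), the following Python sketch evaluates the residuals of the Navier-Stokes equation and of the equation of continuity for the classical 2-dimensional Taylor-Green vortex, using central finite differences. The choice of this particular solution (which lives on a periodic domain rather than on a bounded domain with the boundary condition (3)), the value of the viscosity, the function names, and the step size are all assumptions made for the example and are not taken from the text above.

```python
import math

NU = 0.1  # kinematic viscosity; an assumed value for this illustration


def velocity(t, x, y):
    """Taylor-Green vortex velocity (u1, u2); an assumed test solution."""
    decay = math.exp(-2.0 * NU * t)
    return (math.sin(x) * math.cos(y) * decay,
            -math.cos(x) * math.sin(y) * decay)


def pressure(t, x, y):
    """Pressure (per unit density) balancing the convection term."""
    return 0.25 * (math.cos(2.0 * x) + math.cos(2.0 * y)) * math.exp(-4.0 * NU * t)


def d1(f, args, i, h=1e-3):
    """Central first difference of f with respect to its i-th argument."""
    lo, hi = list(args), list(args)
    lo[i] -= h
    hi[i] += h
    return (f(*hi) - f(*lo)) / (2.0 * h)


def d2(f, args, i, h=1e-3):
    """Central second difference of f with respect to its i-th argument."""
    lo, hi = list(args), list(args)
    lo[i] -= h
    hi[i] += h
    return (f(*hi) - 2.0 * f(*args) + f(*lo)) / (h * h)


def residuals(t, x, y):
    """Return (momentum residual 1, momentum residual 2, divergence)."""
    args = (t, x, y)
    u = velocity(*args)
    # Scalar accessors for the two velocity components.
    comps = [lambda *a, k=k: velocity(*a)[k] for k in range(2)]
    momentum = []
    for k in range(2):
        du_dt = d1(comps[k], args, 0)
        convection = u[0] * d1(comps[k], args, 1) + u[1] * d1(comps[k], args, 2)
        laplacian = d2(comps[k], args, 1) + d2(comps[k], args, 2)
        grad_p = d1(pressure, args, k + 1)
        # Residual of du/dt + (u . grad)u - nu * Laplace(u) + grad p, cf. (1).
        momentum.append(du_dt + convection - NU * laplacian + grad_p)
    # Residual of div u, cf. (2).
    divergence = d1(comps[0], args, 1) + d1(comps[1], args, 2)
    return momentum[0], momentum[1], divergence


if __name__ == "__main__":
    print(residuals(0.3, 0.7, 1.1))
```

All three residuals should be small, limited only by the truncation error of the difference quotients; the sample point (0.3, 0.7, 1.1) is arbitrary.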

If $\Omega$ is unbounded, e.g., if $\Omega$ is an exterior domain outside compact surfaces, then

$$u(t, x) \to u_\infty(t) \quad (|x| \to \infty) \tag{3'}$$

is imposed in addition as the boundary condition at infinity.

The problem of finding u and p that satisfy (1)-(4) for given $\beta$ and $u_0$ is called the Navier-Stokes initial value problem (abbreviation, NS initial value problem). Pioneering mathematical studies of this problem were initiated by J. Leray, and since the 1950s various contributions have been made by many authors, including E. Hopf, O. A. Ladyzhenskaya, H. Fujita, T. Kato, and S. Ito (→ [13,17,21]). We state here the result in terms of regular (classical) solutions under the simplifying assumption that $\Omega$ is bounded, $\beta = 0$, and $u_0$ is solenoidal ($\operatorname{div} u_0 = 0$) and smooth. The situation depends nontrivially upon the dimension m. Namely, if m = 2, then a regular solution of the NS initial value problem exists uniquely and globally, i.e., for all time. If m = 3, we can prove only a local existence theorem (i.e., one holding in a finite interval) of a regular solution. This solution is unique in the interval of its existence; however, it can be extended over the whole interval when the Reynolds number is sufficiently small. In other words, the question of well-posedness of the 3-dimensional NS initial value problem is open at present.

C. Weak and Strong Solutions of the Navier-Stokes Equation

In 1951 Hopf introduced the notion of weak solutions of the NS initial value problem and succeeded in proving their global existence (without uniqueness). We here give the definition of Hopf's weak solution: Let $C_{0,\sigma}^\infty$ be the set of vector functions $u \in C_0^\infty(\Omega)$ with div u = 0. By H we denote the closure of $C_{0,\sigma}^\infty$ under the $L_2$-norm, and by V the closure of $C_{0,\sigma}^\infty$ under the $W_2^1(\Omega)$-norm (or equivalently, the Dirichlet norm for bounded $\Omega$). Then an H-valued function u = u(t) is a weak solution of the NS initial value problem with $\beta = 0$ in [0, T) ($0 < T \le +\infty$) if
(i) $u \in L^\infty(0, T; H) \cap L^2(0, T; V)$,
(ii) u is weakly continuous from [0, T) to H, and
(iii) u satisfies the weak equation

$$\int_0^T \left\{ (u, \varphi_t)_{L_2(\Omega)} - \nu (\nabla u, \nabla \varphi)_{L_2(\Omega)} + ((u \cdot \nabla)\varphi, u)_{L_2(\Omega)} \right\} dt = -(u_0, \varphi(0))_{L_2(\Omega)} \tag{5}$$

for all $\varphi \in C_0^\infty([0, T) \times \Omega)$ with $\operatorname{div} \varphi = 0$. Note that the pressure p has been eliminated in the weak equation. In general, the uniqueness of Hopf's weak solution is not known except for m = 2. However, when a regular solution does exist, then the weak solution coincides with it (a.e.), and is unique. Solutions of the NS initial value problem which are stronger than the weak solution to the extent that the uniqueness can be proved have been introduced, for instance, by A. A. Kiselev and Ladyzhenskaya [12] and Fujita and Kato [5]. Actually the existence and uniqueness theorems of the aforementioned regular solutions are proved using existence theorems of such strong solutions. Recently, Y. Giga and T. Miyakawa succeeded in generalizing the Fujita-Kato theory from $L^2$ to $L^p$, and thus they have shown that if $u_0 \in L^m(\Omega)$ and is solenoidal, a unique strong solution exists at least locally for any m.

Those strong solutions that satisfy uniqueness, and hence are regular solutions of the NS initial value problem, turn out to be smooth for t > 0; they are analytic in t > 0 and $x \in \Omega$.

D. Stationary Solutions of the Navier-Stokes Equation

If the flow is steady, u and p are solutions of the boundary value problem consisting of (1) with $\partial u/\partial t$ omitted, (2) and (3) with $\beta$ independent of t (and, in addition, (3') with a constant $u_\infty$ if $\Omega$ is an exterior domain). The existence of solutions of this boundary value problem for the case of bounded $\Omega$ was established by Leray as one of the earliest applications of his fixed-point theorem. For the case of unbounded $\Omega$, Leray's study was completed and extended by R. Finn, H. Fujita, and others to yield theorems on existence, regularity, and asymptotic behavior in the wakes of solutions (→ [4,13,21]). These stationary solutions are unique if the Reynolds number is sufficiently small. On the other hand, under certain circumstances that involve Couette flow, nonuniqueness or bifurcation of stationary solutions for large Reynolds numbers has been positively proved.

E. Euler's Equation

If the fluid is inviscid and incompressible, the Navier-Stokes equation is reduced to Euler's equation

$$\frac{\partial u}{\partial t} + (u \cdot \nabla)u = -\nabla p. \tag{6}$$

Then the boundary condition is replaced by the frictionless boundary condition, which, in

the homogeneous case, takes the form

$$u_n|_{\partial\Omega} = 0, \tag{7}$$

where $u_n$ is the normal component of u. Regular solutions of the initial value problem consisting of (6) with (2), (4), and (7) have been proved to exist for all t if m = 2 and in a finite time if m = 3 [1,11].

F. The General Navier-Stokes Equations

If the fluid is compressible, viscous, and heat-conductive, its motion is described in terms of the density $\rho$, the velocity u, and some thermodynamic quantity, say the absolute temperature $\theta$, and is governed by the following system of equations, sometimes called the general Navier-Stokes equations:

$$\frac{\partial \rho}{\partial t} + \operatorname{div}(\rho u) = 0, \tag{8}$$

$$\frac{\partial u}{\partial t} + (u \cdot \nabla)u = \frac{\mu}{\rho}\Delta u + \frac{\mu}{3\rho}\nabla(\operatorname{div} u) - \frac{1}{\rho}\nabla p, \tag{9}$$

$$\frac{\partial \theta}{\partial t} + (u \cdot \nabla)\theta = \frac{\kappa}{c_v \rho}\Delta\theta - \frac{\theta}{c_v \rho}(\operatorname{div} u)\,p_\theta + \frac{\Psi}{c_v \rho}. \tag{10}$$

Here the pressure p is regarded as a function of $\rho$ and $\theta$ through the equation of state, and $p_\theta = \partial p/\partial\theta$. The viscosity coefficient $\mu$, the coefficient of heat conduction $\kappa$, and $c_v$, the specific heat at constant volume, are positive constants. We have for simplicity assumed that the external force is absent, and that the Stokes condition for viscosity is satisfied. Finally, $\Psi$ is the dissipation function:

$$\Psi = \frac{\mu}{2}\sum_{i,j}\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i}\right)^2 - \frac{2\mu}{3}(\operatorname{div} u)^2.$$

If $\Omega$ is the whole space, then the initial values $\rho_0$, $u_0$, $\theta_0$ of $\rho$, u, $\theta$ are given at t = 0, and we obtain the Cauchy problem for the general Navier-Stokes equation. Mathematical study of this Cauchy problem has become active since J. Nash [19] and N. Itaya [8] proved the existence of unique regular solutions local in time. Following Itaya's argument, A. Tani constructed a unique regular solution local in time for the initial boundary value problem consisting of (8)-(10) and boundary conditions imposed on u and $\theta$. When it comes to the global existence of solutions, only restricted results are known. The global existence of a regular solution in the 1-dimensional version of the Cauchy problem has been established by Ya. Kanel' [10] and Itaya under certain simplifying assumptions, such that the fluid is a barotropic gas, i.e., one obeying $p = C\rho^\gamma$, where C and $\gamma > 1$ are positive constants. The 1-dimensional initial boundary value problem can also be solved globally if the gas is ideal and polytropic, i.e., one for which $p = R\rho\theta$ and with the internal energy proportional to $\theta$ [9]. For the 3-dimensional case, we can only refer to [18], where existence of global solutions of the Cauchy problem has been proved for initial data close to constants under the assumption that the gas is ideal and polytropic.

G. Equations for Inviscid Ideal Gases

When the gas under consideration is inviscid, ideal, and barotropic, we put $\mu = 0$ and $p = C\rho^\gamma$ in (9). The equation thus obtained is combined with (8) to yield the following quasilinear hyperbolic system, which admits conservation laws:

$$\frac{\partial \rho}{\partial t} + \operatorname{div}(\rho u) = 0, \qquad \frac{\partial u}{\partial t} + (u \cdot \nabla)u + \frac{1}{\rho}\nabla(C\rho^\gamma) = 0. \tag{11}$$

If the initial data $\rho_0$ and $u_0$ are smooth to some extent, then the Cauchy problem for (11) has a regular solution local in time. Generally, we cannot expect existence of regular solutions global in time; namely, discontinuity is likely to take place in a finite time, which corresponds to the occurrence of shock waves. Therefore we have to introduce weak solutions that admit discontinuity. By definition a piecewise continuous function $\{\rho, u\}$ is a weak solution of (11) if it satisfies (11) in the distribution sense and if its discontinuity is subject to a certain jump condition, called the Rankine-Hugoniot relation, as well as another condition, called the entropy condition; these two conditions allow us to distinguish a physically realizable solution among many possible discontinuous solutions [3,14]. Global existence of the weak solution has been proved so far only for the 1-dimensional problem with initial data close to constants in the sense that their oscillations and total variations are sufficiently small. Actually in this case we can apply J. Glimm's method [6] to construct weak solutions by means of a difference approximation which involves random numbers. Little is known regarding the uniqueness of weak solutions. Finally, if we are concerned with steady-state solutions of an inviscid compressible fluid and assume that the flow is irrotational, then we are led to a quasilinear partial differential equation of mixed type for the velocity potential $\Phi$, which is elliptic in the subsonic region and hyperbolic in the supersonic region. Classical results concerning these equations may be found in [2,3].
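The jump condition mentioned above can be made concrete for the 1-dimensional barotropic gas. The Python sketch below is only an illustration under stated assumptions: it takes the conservation form of the system, with conserved quantities $\rho$ and $\rho u$ and fluxes $\rho u$ and $\rho u^2 + p$ where $p = C\rho^\gamma$, and the standard form of the Rankine-Hugoniot relation s[U] = [F(U)] for a discontinuity moving with speed s. The compressive-shock test $u_l > u_r$ is used here as a stand-in for the entropy condition. All function names and numerical values are hypothetical.

```python
import math


def pressure(rho, C=1.0, gamma=1.4):
    """Barotropic equation of state p = C * rho**gamma."""
    return C * rho ** gamma


def shock_from_densities(rho_l, u_l, rho_r, C=1.0, gamma=1.4, sign=1.0):
    """Given the left state and the right density, return (s, u_r).

    m is the mass flux through the discontinuity. sign=+1 means that fluid
    crosses from left to right (assumed convention for this sketch).
    Requires rho_l != rho_r.
    """
    p_l = pressure(rho_l, C, gamma)
    p_r = pressure(rho_r, C, gamma)
    # From the two jump relations written in the frame moving with the shock.
    m_squared = (p_r - p_l) / (1.0 / rho_l - 1.0 / rho_r)
    m = sign * math.sqrt(m_squared)
    s = u_l - m / rho_l      # speed of the discontinuity
    u_r = s + m / rho_r      # velocity on the right-hand side
    return s, u_r


def rankine_hugoniot_residuals(rho_l, u_l, rho_r, u_r, s, C=1.0, gamma=1.4):
    """Residuals of s[rho] = [rho u] and s[rho u] = [rho u^2 + p]."""
    mass = s * (rho_r - rho_l) - (rho_r * u_r - rho_l * u_l)
    flux_l = rho_l * u_l ** 2 + pressure(rho_l, C, gamma)
    flux_r = rho_r * u_r ** 2 + pressure(rho_r, C, gamma)
    momentum = s * (rho_r * u_r - rho_l * u_l) - (flux_r - flux_l)
    return mass, momentum


def is_compressive(u_l, u_r):
    """Admissibility test used in this sketch: the velocity drops across the jump."""
    return u_l > u_r


if __name__ == "__main__":
    s, u_r = shock_from_densities(1.0, 0.0, 2.0)
    print(s, u_r, rankine_hugoniot_residuals(1.0, 0.0, 2.0, u_r, s))
    print(is_compressive(0.0, u_r))
```

With the same sign convention, choosing a right density smaller than the left one still satisfies both jump relations but fails the compressive test, which is the kind of discontinuity the entropy condition is meant to exclude.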

References

[1] C. Bardos, Euler equation and Burger equation-Relation to turbulence, Lecture notes in math. 648, Springer, 1978, 1-46.
[2] L. Bers, Mathematical aspects of subsonic and transonic gas dynamics, Wiley, 1958.
[3] R. Courant and K. O. Friedrichs, Supersonic flow and shock waves, Interscience, 1948.
[4] R. Finn, Stationary solutions of the Navier-Stokes equation, Amer. Math. Soc. Proc. Symp. Appl. Math., 17 (1965), 121-153.
[5] H. Fujita and T. Kato, On the Navier-Stokes initial value problem I, Arch. Rational Mech. Anal., 16 (1964), 269-315.
[6] J. Glimm, Solutions in the large for nonlinear hyperbolic systems of equations, Comm. Pure Appl. Math., 18 (1965), 697-715.
[7] E. Hopf, Über die Anfangswertaufgabe für die hydrodynamischen Grundgleichungen, Math. Nachr., 4 (1950-1951), 213-231.
[8] N. Itaya, On the Cauchy problem for the system of fundamental equations describing the movement of compressible viscous fluid, Kodai Math. Sem. Rep., 23 (1971), 60-120; corrigenda and addenda, J. Math. Kyoto Univ., 16 (1976), 239-240.
[9] A. V. Kazhikhov and V. V. Shelukhin, Overall solvability of single-valued functions for the initial boundary problem for one-dimensional equations of viscous gases (in Russian), Prikl. Mat. Mekh., 41 (1977), 282-291.
[10] Ya. I. Kanel', A model system of equations for the movement of gases in one dimension (in Russian), Differentsial'nye Uravneniya, 4 (1968), 721-734.
[11] T. Kato, On classical solutions of two-dimensional nonstationary Euler equation, Arch. Rational Mech. Anal., 25 (1967), 188-200.
[12] A. A. Kiselev and O. A. Ladyzhenskaya, On the existence and uniqueness of solutions of nonstationary problems for viscous, incompressible fluids (in Russian), Izv. Akad. Nauk SSSR, 21 (1957), 655-680.
[13] O. A. Ladyzhenskaya, The mathematical theory of viscous incompressible flows, Gordon & Breach, 1963. (Original in Russian, 1961.)
[14] P. D. Lax, Hyperbolic systems of conservation laws and the mathematical theory of shock waves, SIAM Regional Conf. Ser. Appl. Math., 11 (1973).
[15] J. Leray, Etude de diverses équations intégrales non linéaires et de quelques problèmes que pose l'hydrodynamique, J. Math. Pures Appl., (9) 12 (1933), 1-82.
[16] J. Leray, Sur le mouvement d'un liquide visqueux emplissant l'espace, Acta Math., 63 (1934), 193-248.
[17] J. L. Lions, Quelques méthodes de résolution des problèmes aux limites non linéaires, Dunod, 1969.
[18] A. Matsumura and T. Nishida, The initial value problem for the equations of motion of viscous and heat-conductive gases, J. Math. Kyoto Univ., 20 (1980), 67-104.
[19] J. Nash, Le problème de Cauchy pour les équations différentielles d'un fluide général, Bull. Soc. Math. France, 90 (1962), 487-497.
[20] B. L. Rozhdestvenskii and N. N. Yanenko, Systems of quasilinear equations and their application to gas dynamics (in Russian), Nauka, 1978.
[21] R. Temam, Navier-Stokes equations, North-Holland, 1977.

205 (XX.9)
Hydrodynamics

A. General Remarks

Gases and liquids are easily deformed, and they share many kinetic properties. They are examples of fluids. By definition, a fluid is a continuous substance having the property that when it is not moving, any part of the substance separated from the rest by a surface exerts an outward force that is perpendicular to the given surface.

Hydrodynamics (or fluid dynamics) is concerned with the equilibrium and the motion of gases and liquids without considering their molecular structure. In particular, the branch of the theory concerning fluids in equilibrium is called hydrostatics, and hydrodynamics sometimes refers to the branch concerning fluids in motion.

There are two methods of describing the motion of a fluid. One regards a fluid as a system consisting of an infinite number of particles and discusses the motion of each particle as a function of time. This is Lagrange's method. For example, suppose that a fluid particle with the coordinates (x, y, z) = (a, b, c) at the moment t = 0 has coordinates $x = f_1(a, b, c, t)$, $y = f_2(a, b, c, t)$, $z = f_3(a, b, c, t)$ at an arbitrary time t. Then the motion of the fluid is perfectly determined by the functions $f_1$, $f_2$, and $f_3$.

The other is Euler's method, which discusses the values of the velocity $\mathbf{v}(u, v, w)$, the density $\rho$, the pressure p, etc., of the fluid at arbitrary times and positions. From this standpoint each quantity of the fluid is regarded as a function of a space-time point (x, y, z, t).

The rate at which any physical quantity F varies while moving with the fluid particle is

the Lagrangian derivative DF/Dt, which is related to the ordinary partial derivatives by

$$\frac{DF}{Dt} = \frac{\partial F}{\partial t} + u\frac{\partial F}{\partial x} + v\frac{\partial F}{\partial y} + w\frac{\partial F}{\partial z}.$$

The three components (u, v, w) and the two state quantities $(p, \rho)$ (in general, other state quantities, for example, the temperature T and the entropy S, are assumed to be determined by equations of state such as $T = T(p, \rho)$, $S = S(p, \rho)$) are determined by five (= 1 + 3 + 1) relations derived from the conservation laws of mass, momentum, and energy, namely, the equation of continuity, which corresponds to the conservation of mass,

$$\partial\rho/\partial t + \operatorname{div}(\rho\mathbf{v}) = 0; \tag{1}$$

the equation of motion, which corresponds to the conservation of momentum,

$$\partial(\rho\mathbf{v})/\partial t + \operatorname{div}(\rho\mathbf{v} \otimes \mathbf{v} - \mathbf{p}) = \rho\mathbf{K}, \tag{2}$$

where $\mathbf{K}$ is the external force per unit mass, $\mathbf{p}$ is the stress tensor, and $\otimes$ denotes the tensor product, while divergence is applied to each row vector, and by virtue of (1), the equation (2) can be expressed component-wise as

$$\rho\frac{Dv_i}{Dt} = \rho K_i + \sum_k \frac{\partial p_{ik}}{\partial x_k};$$

and the energy equation, which corresponds to the conservation of energy,

$$\partial(\rho v^2/2 + \rho E)/\partial t + \operatorname{div}(\rho\mathbf{v}(v^2/2 + E) - \mathbf{v} \cdot \mathbf{p} + \mathbf{h}) = \rho\mathbf{v} \cdot \mathbf{K}, \tag{3}$$

or the equation of entropy production, which is another expression of (3),

$$\rho T\,DS/Dt = -\operatorname{div}\mathbf{h} + Q, \tag{3'}$$

where E is the internal energy per unit mass, Q the heat generated per unit time and volume, and $\mathbf{h}$ the heat flux. Here $\mathbf{K}$, $p_{ik}$, $\mathbf{h}$, and Q or their relations with other quantities (e.g., $\mathbf{h} = -\kappa\,\operatorname{grad} T$, where $\kappa$ is the thermal conductivity) are assumed to be known.

B. Perfect Fluids

When there is a velocity gradient in the flow, a tangential stress appears which tends to make the velocity uniform, so that $\mathbf{p}$ is not a diagonal tensor ($-p\delta_{ik}$, i.e., pressure). This property is called fluid viscosity. Generally, Q and $\mathbf{h}$ do not vanish in this case. However, in order to simplify the problem we consider a nonviscous (sometimes also adiabatic) fluid, which is called a perfect fluid and is a good approximation in the large of actual fluids. The motion of a perfect fluid is determined by Euler's equation of motion

$$\rho\,D\mathbf{v}/Dt = -\operatorname{grad} p + \rho\mathbf{K}, \tag{4}$$

which is obtained from (1) and (2) by replacing $p_{ik}$ by the pressure only, and also by the thermodynamic relation DS/Dt = 0 obtained from (3') by putting Q = 0 and $\mathbf{h} = 0$ or its integral S = constant in homentropic flow, which is governed by the adiabatic law $p \propto \rho^\gamma$, where $\gamma$ denotes the ratio of specific heat at constant pressure to that at constant volume. In particular, for a liquid, the density variation can be neglected. Putting $\rho$ = constant in (1), we have

$$\operatorname{div}\mathbf{v} = 0, \tag{5}$$

which, in conjunction with (4), determines four unknowns (u, v, w, p) as functions of (x, y, z, t). A fluid of constant density is called an incompressible fluid, and one of variable density a compressible fluid. Even though it might seem natural to consider gases as examples of compressible fluids, they can be treated as incompressible fluids if the speed of the flow of the gas $q = |\mathbf{v}|$ is small compared with the velocity $c = \sqrt{dp/d\rho}$ of sound propagating in the gas. We call q/c = M the Mach number.

The vector $\boldsymbol{\omega}(\xi, \eta, \zeta)$, which is derived from the velocity vector $\mathbf{v}$ as $\boldsymbol{\omega} = \operatorname{rot}\mathbf{v}$, is called the vorticity. A small part of the fluid rotates with angular velocity $\boldsymbol{\omega}/2$. If $\boldsymbol{\omega} = 0$, the flow is called irrotational, otherwise rotational. The curves dx : dy : dz = u : v : w and $dx : dy : dz = \xi : \eta : \zeta$ are called, respectively, stream lines and vortex lines. The line integral $\oint_C \mathbf{v} \cdot d\mathbf{x}$ along a closed circuit C is called the circulation around C.

In irrotational flow, the velocity is expressed as $\mathbf{v} = \operatorname{grad}\Phi$, where $\Phi$ is called a velocity potential. When the external force $\mathbf{K}$ has a potential $\Omega$ ($\mathbf{K} = -\operatorname{grad}\Omega$) and $\rho$ is a definite function of p, we have the pressure equation

$$\frac{\partial\Phi}{\partial t} + \frac{1}{2}q^2 + \int\frac{dp}{\rho} + \Omega = \text{constant},$$

which is valid everywhere in the flow. In a steady flow,

$$\frac{1}{2}q^2 + \int\frac{dp}{\rho} + \Omega = \text{constant} \tag{6}$$

is valid along each stream line or each vortex line; this is called the Bernoulli theorem. These two equations correspond to energy integrals of the equation of motion. Furthermore, corresponding to the conservation of angular momentum, we have Helmholtz's vorticity theorem: When $\mathbf{K} = -\operatorname{grad}\Omega$ and $\rho = f(p)$, vorticity is neither created nor annihilated in the fluid.
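To see how the Bernoulli theorem fixes the state along a stream line, consider steady flow with $\Omega = 0$ under the adiabatic law $p \propto \rho^\gamma$. Then $\int dp/\rho = c^2/(\gamma - 1)$, so the theorem reads $q^2/2 + c^2/(\gamma - 1) = c_0^2/(\gamma - 1)$, where $c_0$, the sound speed in the fluid at rest, is notation introduced here only for illustration. A minimal Python sketch under these assumptions (the function names and the sample values are hypothetical):

```python
import math


def sound_speed(q, c0, gamma=1.4):
    """Local sound speed c from q^2/2 + c^2/(gamma-1) = c0^2/(gamma-1)."""
    c_squared = c0 * c0 - 0.5 * (gamma - 1.0) * q * q
    if c_squared <= 0.0:
        raise ValueError("speed q exceeds the maximum compatible with the Bernoulli theorem")
    return math.sqrt(c_squared)


def mach_number(q, c0, gamma=1.4):
    """Mach number M = q / c at flow speed q."""
    return q / sound_speed(q, c0, gamma)


def critical_speed(c0, gamma=1.4):
    """Speed at which q equals the local sound speed, i.e., M = 1."""
    return c0 * math.sqrt(2.0 / (gamma + 1.0))


if __name__ == "__main__":
    c0 = 340.0  # assumed rest sound speed, in m/s
    for q in (50.0, 200.0, critical_speed(c0)):
        print(q, sound_speed(q, c0), mach_number(q, c0))
```

The sound speed falls as the flow accelerates, so the Mach number grows faster than q itself and reaches 1 at the critical speed.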

For the irrotational motion of an incompressible fluid, Laplace's equation $\Delta\Phi = 0$ is derived from (5). Hence the problem reduces to the determination of a harmonic function $\Phi$ under appropriate boundary conditions (e.g., for a fixed wall, normal velocity $v_n = \partial\Phi/\partial n = 0$). For the 2-dimensional problem a stream function $\Psi$ is introduced to satisfy (5) by the relation $u = \partial\Psi/\partial y$, $v = -\partial\Psi/\partial x$. Since the Cauchy-Riemann equations $\partial\Phi/\partial x = \partial\Psi/\partial y$, $\partial\Phi/\partial y = -\partial\Psi/\partial x$ are valid in this case, $f = \Phi + i\Psi$ is an analytic function of z = x + iy. Therefore the theory of 2-dimensional irrotational motion is essentially equivalent to the theory of complex analytic functions, and consequently the theory of conformal mapping is a powerful method in the theory of such fluid motion.

For irrotational steady flow of a compressible fluid in which $\Omega = 0$, c is determined from (6) as a function of q. Then (1) and (4) yield a nonlinear partial differential equation for $\Phi$:

$$\left(1 - \frac{u^2}{c^2}\right)\frac{\partial^2\Phi}{\partial x^2} + \left(1 - \frac{v^2}{c^2}\right)\frac{\partial^2\Phi}{\partial y^2} + \left(1 - \frac{w^2}{c^2}\right)\frac{\partial^2\Phi}{\partial z^2} - 2\frac{vw}{c^2}\frac{\partial^2\Phi}{\partial y\,\partial z} - 2\frac{wu}{c^2}\frac{\partial^2\Phi}{\partial z\,\partial x} - 2\frac{uv}{c^2}\frac{\partial^2\Phi}{\partial x\,\partial y} = 0. \tag{7}$$

This equation is elliptic or hyperbolic (→ 326 Partial Differential Equations of Mixed Type) according as M is less than 1 (subsonic) or greater than 1 (supersonic).

For 2-dimensional flow, we can introduce a stream function $\Psi$ from (1) by $u = \partial\Phi/\partial x = (1/\rho)(\partial\Psi/\partial y)$, $v = \partial\Phi/\partial y = -(1/\rho)(\partial\Psi/\partial x)$. By utilizing the idea of Legendre transformation, this system of nonlinear equations for $\Phi$ and $\Psi$ can be reduced to a system of linear equations in the hodograph plane $(q, \theta)$:

$$\frac{\partial\Phi}{\partial q} = q\frac{d}{dq}\left(\frac{1}{\rho q}\right)\frac{\partial\Psi}{\partial\theta} = -\frac{1 - M^2}{\rho q}\frac{\partial\Psi}{\partial\theta}, \qquad \frac{\partial\Phi}{\partial\theta} = \frac{q}{\rho}\frac{\partial\Psi}{\partial q}$$

($d(\rho q)/dq = \rho(1 - M^2)$), where the independent variables q and $\theta$ are the magnitude and the inclination of the velocity, respectively. The treatment of 2-dimensional compressible flow on the basis of this system is called the hodograph method. For a flow of small M, there is a method of successive approximation ($M^2$-expansion method) which starts from Laplace's equation, neglecting the terms of $O(M^2)$ in (7). For uniform flow (velocity U in the x-direction) past a thin wing or slender body where v and w are small, we have thin wing theory or slender body theory, whose first approximation is

$$(1 - M^2)\frac{\partial^2\Phi}{\partial x^2} + \frac{\partial^2\Phi}{\partial y^2} + \frac{\partial^2\Phi}{\partial z^2} = 0. \tag{8}$$

If M < 1 or M > 1 (although not too large) a linearization (Prandtl-Glauert approximation) is possible by replacing M by the Mach number at infinity $M_\infty = U/c_\infty$. For M > 1, (8) has a characteristic surface, which is the Mach cone, whose generators make an angle $\arcsin(c/q) = \arcsin(1/M)$ with the flow direction (the central axis of the cone). This can be interpreted also as an envelope produced by spherical sound waves with velocity c from a source drifting with velocity q. For $M \approx 1$, we put $\Phi = c_* x + \varphi$ ($c_*$ is the fluid velocity when q = c). Then for an adiabatic gas, (8) is approximated by a partial differential equation of mixed type:

$$\frac{\partial^2\varphi}{\partial y^2} + \frac{\partial^2\varphi}{\partial z^2} = \frac{\gamma + 1}{c_*}\frac{\partial\varphi}{\partial x}\frac{\partial^2\varphi}{\partial x^2}. \tag{9}$$

Such a flow in which both domains $M \gtrless 1$ coexist is called the transonic flow, and exact solutions by the hodograph method are known. However, continuous deceleration from M > 1 to M < 1 generally tends to be unstable or impossible, and the appearance of a shock wave, i.e., a discontinuous surface of state quantities, is not unusual. This can be considered as the weak solution of (1), (2), (3) for a perfect fluid. In particular, in the coordinate system fixed to the surface, its integrated form can be obtained as follows: $[\rho v_n] = 0$, $[p\,\delta_{in} + \rho v_i v_n] = 0$, $[q^2/2 + E + p/\rho] = 0$ ([ ] is the jump of the quantity at the surface, and n is the normal component). Supplemented by the entropy increase, these formulas give relations between the fluid velocity and the state variables at the front and back of the shock. In an ideal gas they are called the Rankine-Hugoniot relation. Entropy is not uniform behind a curved shock, and the flow is not irrotational. For a weak shock starting from the tip of a pointed slender body, however, the discontinuity is small and approaches the characteristic surface of (8), i.e., the Mach wave (compressive wave, in this case). Rarefactive Mach waves are found in the supersonic flow of acceleration around a convex surface. Such waves contribute to the drag on an obstacle placed in supersonic flow.

C. Viscous Fluids

A body moving uniformly in a fluid at rest (with velocity less than that of sound) suffers no drag as long as the viscosity of the fluid is negligible and the flow is continuous (d'Alembert's paradox). Hence we must take the viscosity into account in order to discuss the creation and annihilation of vortices, the generation and structure of shock waves, and the drag acting on obstacles. For this purpose, we extend Newton's law stating that frictional stress is proportional to the velocity gradient

and assume that the stress tensor $\mathbf{p}$ is a linear function of the rate-of-strain tensor $\mathbf{e}$:

$$p_{ik} = (-p + \mu'\operatorname{div}\mathbf{v})\,\delta_{ik} + \mu\left(e_{ik} - \frac{2}{3}\delta_{ik}\operatorname{div}\mathbf{v}\right), \qquad e_{ik} = \frac{\partial v_i}{\partial x_k} + \frac{\partial v_k}{\partial x_i}.$$

The proportionality constants $\mu$ and $\mu'$ are called, respectively, the coefficients of (shear) viscosity and bulk viscosity. The bulk viscosity is sometimes neglected (Stokes's assumption) in the usual hydrodynamics, and then the mean value of the normal stress components equals the pressure. When a fluid satisfies this linear relation between $\mathbf{p}$ and $\mathbf{e}$, it is called a Newtonian fluid. Otherwise, it is called a non-Newtonian fluid. Except for a few cases, such as colloid solutions, fluids can be regarded as Newtonian.

If we take the viscosity into account, the equation of motion of an incompressible fluid becomes

$$\rho\,D\mathbf{v}/Dt = \rho\mathbf{K} - \operatorname{grad} p + \mu\Delta\mathbf{v}. \tag{10}$$

This is called the Navier-Stokes equation. A nondimensional quantity $R = \rho U L/\mu$ formed by representative length L, velocity U, density $\rho$, and viscosity $\mu$ of a flow is called the Reynolds number. In order for two flows with geometrically similar boundaries to share similar kinetic properties, their Reynolds numbers must be equal. This is called the Reynolds law of similarity.

For small R, we can approximate the equation of motion (10) by replacing the acceleration $D\mathbf{v}/Dt$ by $\partial\mathbf{v}/\partial t$ (Stokes approximation) or by $\partial\mathbf{v}/\partial t + U\,\partial\mathbf{v}/\partial x$ (Oseen approximation) for a body placed in the uniform flow of velocity U in the x-direction.

For large R, the flow can be regarded as that of a perfect fluid, since we can neglect $\mu\Delta\mathbf{v}$ as long as the velocity gradient is not too large. In the vicinity of a fixed wall, however, the velocity gradient becomes large, because in a very thin layer the velocity decreases rapidly from the value U of a perfect fluid to zero at the wall. This layer is called the boundary layer. For the boundary layer, Prandtl's boundary layer equation

$$\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} = \frac{\partial U}{\partial t} + U\frac{\partial U}{\partial x} + \frac{\mu}{\rho}\frac{\partial^2 u}{\partial y^2}, \qquad \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0 \tag{11}$$

is valid, where x and y are the coordinates parallel and perpendicular to the wall, respectively, and U is the velocity outside the boundary layer.

If $\partial U/\partial x < 0$, it sometimes happens that the boundary layer separates from the surface of the body. In this case a vortex is generated in the flow, as large vorticities in the boundary layer are carried into the flow. For a body without separation of the boundary layer, the d'Alembert paradox holds and is no longer a “paradox,” and the drag is small. Such bodies are called streamlined.

In compressible flow the new problem arises of the interaction of shock waves with the boundary layer. A rapid increase in pressure due to the shock wave formed on the surface of a body invalidates the assumption of a boundary layer and causes its separation. If the Mach number becomes sufficiently large (M > 5, hypersonic flow), the bow shock approaches the body and interferes with the boundary layer. The generation of heat at the boundary layer (e.g., viscous dissipation in Q) requires the consideration of heat transfer as well as viscosity. In this manner, it becomes necessary to treat a complete system of equations which take into account the energy equation (3) as well as the temperature dependence of $\kappa$, $\mu$, and $\mu'$.

D. Laws of Similarity

For such complicated systems, dimensional analysis is often useful (→ 116 Dimensional Analysis). As laws of similarity, we can consider not only those like the Reynolds law but also others for bodies which transform similarly by affine transformations. Corresponding to equation (8), the Prandtl-Glauert law of similarity for subsonic flow is famous: The pressure coefficient (nondimensional pressure change) for a thin wing of chord (i.e., the length in the direction of flow) 1, span L, and thickness $\tau$ is

$$C_p(L, \tau) = \lambda\,C_{p0}\!\left(\sqrt{1 - M_\infty^2}\,L,\ \frac{\tau}{\sqrt{1 - M_\infty^2}\,\lambda}\right),$$

where $\lambda$ is an arbitrary constant and $C_{p0}$ is $C_p$ for a body of scaled length and thickness placed in an incompressible flow. Corresponding to (9), an extension of the famous von Kármán transonic similarity is possible:

$$C_p(L, \tau) = \ldots(\gamma + 1)^{-1/3}\ldots$$

E. Turbulence

For low Reynolds numbers, the flow generally has smooth streamlines. For high Reynolds numbers, however, extremely irregular motion in space and time appears. The former is called

laminar flow, and the latter turbulent flow (→ 433 Turbulence and Chaos). The transition from laminar to turbulent flow is considered to be due to the instability of the laminar flow, and this transition has been studied by the method of small oscillations. Recently, nonlinear effects have also been examined. Regarding the internal structure of turbulence, statistical theories originated by T. von Kármán and G. I. Taylor (Proc. Roy. Soc. London, 151 (1935)) and A. N. Kolmogorov (Dokl. Akad. Nauk SSSR, 30 (1941)) are of central importance.

F. Water Waves

Surface waves that occur on the free surface of water (or other liquids) are called water waves; their restoring forces are gravity and surface tension. If we consider waves generated on still water (assumed to be inviscid and incompressible) in equilibrium, we can regard the flow field associated with the wave motion as irrotational by virtue of Helmholtz's vorticity theorem. Hence the flow velocity can be derived from the velocity potential $\Phi$, which satisfies Laplace's equation $\Delta\Phi = 0$, together with the boundary conditions

$$\frac{D(H + z)}{Dt} = \frac{\partial\Phi}{\partial x}\frac{\partial H}{\partial x} + \frac{\partial\Phi}{\partial y}\frac{\partial H}{\partial y} + \frac{\partial\Phi}{\partial z} = 0 \quad \text{at } z = -H(x, y), \tag{12}$$

$$\frac{D(h - z)}{Dt} = \frac{\partial h}{\partial t} + \frac{\partial\Phi}{\partial x}\frac{\partial h}{\partial x} + \frac{\partial\Phi}{\partial y}\frac{\partial h}{\partial y} - \frac{\partial\Phi}{\partial z} = 0 \quad \text{at } z = h(x, y, t), \tag{13}$$

$$\frac{\partial\Phi}{\partial t} + \frac{1}{2}(\operatorname{grad}\Phi)^2 + gz = \frac{p_0 - p}{\rho} = \frac{\sigma}{\rho R_m} \quad \text{at } z = h(x, y, t), \tag{14}$$

where the Cartesian coordinates x and y are taken in the undisturbed horizontal free surface, while the positive z-axis is vertically upward. The equations z = -H(x, y) and z = h(x, y, t) denote, respectively, the bottom surface (assumed to be known) and the elevation of the free surface measured from the undisturbed level z = 0, while g stands for the gravitational acceleration, $\sigma$ the surface tension, $p_0$ the atmospheric pressure, and $1/R_m$ the mean curvature of the disturbed free surface expressed as

$$\frac{1}{R_m} = \left[\left\{1 + \left(\frac{\partial h}{\partial x}\right)^2\right\}\frac{\partial^2 h}{\partial y^2} + \left\{1 + \left(\frac{\partial h}{\partial y}\right)^2\right\}\frac{\partial^2 h}{\partial x^2} - 2\frac{\partial h}{\partial x}\frac{\partial h}{\partial y}\frac{\partial^2 h}{\partial x\,\partial y}\right]\left\{1 + \left(\frac{\partial h}{\partial x}\right)^2 + \left(\frac{\partial h}{\partial y}\right)^2\right\}^{-3/2}.$$

The conditions (12) and (13) imply, respectively, that the fluid does not cross the bottom and the free surface, while (14) expresses the fact that the difference between atmospheric and fluid pressures at the free surface is equal to the normal force (per unit area) due to the surface tension. Thus the problem is formulated as a nonlinear boundary value problem for Laplace's equation including the unknown boundary z = h(x, y, t).

Let us first consider linear waves for which the wave amplitude of the surface elevation is much smaller than any other characteristic linear dimension such as the wavelength or the water depth H (for simplicity, we assume hereafter H = constant, i.e., a flat horizontal bottom). Linearizing the boundary conditions (12)-(14) with respect to h and $\operatorname{grad}\Phi$, and assuming a sinusoidal wave proportional to $\exp[i(\mathbf{k} \cdot \mathbf{r} - \omega t)]$, $\mathbf{k}(k_x, k_y)$ and $\omega$ being respectively the (2-dimensional) wave number vector and the angular frequency and $\mathbf{r} = (x, y)$, we obtain the dispersion relation

$$\omega^2 = gk\left(1 + \frac{\sigma k^2}{\rho g}\right)\tanh(kH). \tag{15}$$

In a layer of still water the waves are isotropic in the horizontal plane and the dispersion relation involves only the magnitude k of the wave number vector. It is readily seen from (15) that the water waves are typical dispersive waves in which the phase velocity $c_p$ ($= \omega/k$) depends on the wave number k or the wavelength $\lambda$ ($= 2\pi/k$). It is also evident from (15) that the quantity $\sigma k^2/(\rho g)$ measures the relative importance of surface tension and gravity. Hence for waves with wavelengths much larger than $\lambda_m = 2\pi\sqrt{\sigma/(\rho g)}$ ($\approx 1.7$ cm for water), the effect of surface tension is negligible, and we have gravity waves. Conversely, when $\lambda \ll \lambda_m$, the effect of surface tension becomes dominant, and we have capillary waves or ripples. When the water depth H is much larger than the wavelength $\lambda$, we can approximate (15) by $\omega^2 = gk + \sigma k^3/\rho$, since $\tanh(kH) \approx 1$. We call such waves deep water waves. On the other hand, if H is much smaller than the wavelength ($kH \ll 2\pi$), we have shallow (or long) water waves for which (15) can be approximated by $\omega^2 = gHk^2[1 + \{\sigma/(\rho g H^2) - 1/3\}(kH)^2 + \ldots]$ if $\sigma/(\rho g H^2) = O(1)$. In particular, if we neglect $O(kH)^2$, we recover the well-known dispersionless long gravity waves whose phase velocity is simply $\sqrt{gH}$. In all the cases mentioned above, the amplitude function of the velocity potential $\Phi$ is proportional to

$$-i\omega\cosh\{k(z + H)\}/\{k\sinh(kH)\},$$

so that the flow velocity due to deep water waves decreases exponentially as one proceeds vertically downward from the free surface. In the
795 206 A
Hypergeometric Functions

limiting case of long gravity waves, however, [ 101 L. Rosenhead, Laminar boundary layers,
the fluid motion is nearly horizontal through- Clarendon Press, 1963.
out the fluid layer. [ 111 G. K. Batchelor, An introduction to fluid
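The limiting cases of the dispersion relation (15) for water waves, $\omega^2 = (gk + \sigma k^3/\rho)\tanh(kH)$, can be checked numerically. The following is a minimal Python sketch, not part of the original article: the function names are hypothetical, and the constants for water (g = 9.81 m/s², σ = 0.073 N/m, ρ = 1000 kg/m³) are assumed illustrative values.

```python
import math

# Assumed physical constants for clean water near 20 °C (illustrative values).
G = 9.81        # gravitational acceleration g [m/s^2]
SIGMA = 0.073   # surface tension sigma [N/m]
RHO = 1000.0    # density rho [kg/m^3]


def omega(k, depth):
    """Angular frequency from (15): omega^2 = (g k + sigma k^3 / rho) tanh(k H)."""
    return math.sqrt((G * k + SIGMA * k**3 / RHO) * math.tanh(k * depth))


def phase_velocity(k, depth):
    """Phase velocity omega / k for wave number k and uniform depth H."""
    return omega(k, depth) / k


def crossover_wavelength():
    """lambda_m = 2 pi sqrt(sigma / (rho g)), where both restoring forces balance."""
    return 2.0 * math.pi * math.sqrt(SIGMA / (RHO * G))


def tension_to_gravity_ratio(k):
    """sigma k^2 / (rho g): relative importance of surface tension and gravity."""
    return SIGMA * k**2 / (RHO * G)
```

With these assumed constants, `crossover_wavelength()` comes out close to the 1.7 cm quoted for water. For a wave much longer than the depth, `phase_velocity` approaches $\sqrt{gH}$; for a wave much shorter than the depth but much longer than $\lambda_m$, `omega` approaches $\sqrt{gk}$.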
When the wave amplitude becomes larger, nonlinear effects are no longer
negligible. For such waves, various †singular perturbation methods provide
powerful tools. For example, the basic system of equations for weakly nonlinear
(1-dimensional) shallow water waves can be reduced to a simple solvable
nonlinear equation called the †Korteweg-de Vries equation, whose solitary wave
solution is known as a prototype of solitons (→ 387 Solitons). Another
classical example is a (1-dimensional) deep gravity wave called the Stokes
wave, which can be obtained as a power series in the wave steepness (amplitude
× wave number). The first term of the series is of the form of a linear
sinusoidal wave, and the higher-order terms correspond to the higher harmonics,
while the angular frequency is shifted from the linear case and depends not
only on the wave number but also on the amplitude. Similar singular
perturbation methods have also been applied to various kinds of resonant
interactions such as nonlinear self-modulation, higher harmonic resonances, and
multiwave interactions. Finally it should be mentioned that an exact solution
representing a (1-dimensional) deep capillary wave was obtained by G. D.
Crapper (J. Fluid Mech., 2 (1957), 532-540; extended later to the case of
finite depth by W. Kinnersley, J. Fluid Mech., 77 (1976), 229-241). This is one
of the few realistic exact solutions obtained so far; a famous exact solution
of Gerstner's trochoidal wave does not satisfy the irrotationality condition.

206 (XIV.5)
Hypergeometric Functions

A. Hypergeometric Functions

The †power series

$$F(\alpha, \beta, \gamma; z) = \sum_{n=0}^{\infty} \frac{(\alpha)_n(\beta)_n}{n!\,(\gamma)_n}\, z^n, \qquad (\lambda)_n = \lambda(\lambda+1)\cdots(\lambda+n-1),$$

in the complex variable $z$ is called the hypergeometric series or Gauss's
series. It is convergent for any $\alpha$, $\beta$, and $\gamma$ if $|z| < 1$,
and is convergent for $\operatorname{Re}(\alpha + \beta - \gamma) < 0$ if
$|z| = 1$. If $z = 1$, its sum is equal to
$\Gamma(\gamma)\Gamma(\gamma - \alpha - \beta)/(\Gamma(\gamma - \alpha)\Gamma(\gamma - \beta))$
(except when $\gamma$ is a nonpositive integer). The hypergeometric functions
are obtained as analytic continuations of the functions determined by
hypergeometric series that are single-valued analytic functions defined on the
domain obtained from the complex plane by deleting a line connecting branch
points $z = 1$ and $z = \infty$ (→ Appendix A, Table 18.I).
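The convergence statements for Gauss's series can be illustrated with partial sums. The following is a minimal Python sketch, not part of the original article: `hyp2f1_series` and `gauss_sum` are hypothetical helper names, and the truncation length is an arbitrary choice. It sums the series $\sum_n (\alpha)_n(\beta)_n z^n/(n!\,(\gamma)_n)$ term by term and compares the value at $z = 1$ with the closed form $\Gamma(\gamma)\Gamma(\gamma-\alpha-\beta)/(\Gamma(\gamma-\alpha)\Gamma(\gamma-\beta))$, which applies when $\operatorname{Re}(\alpha+\beta-\gamma) < 0$.

```python
import math


def hyp2f1_series(a, b, c, z, terms=200):
    """Partial sum of Gauss's series sum_n (a)_n (b)_n / (n! (c)_n) z^n."""
    total = 0.0
    term = 1.0  # the n = 0 term
    for n in range(terms):
        total += term
        # Ratio of consecutive terms: (a+n)(b+n) / ((c+n)(n+1)) * z
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
    return total


def gauss_sum(a, b, c):
    """Closed-form value of the series at z = 1 (real parameters, a + b - c < 0)."""
    return (math.gamma(c) * math.gamma(c - a - b)
            / (math.gamma(c - a) * math.gamma(c - b)))
```

For $|z| < 1$ a few hundred terms are ample; for instance, $F(1, 1, 2; z) = -\log(1 - z)/z$ can be used as a check. At $z = 1$ the terms decay only algebraically, so many more terms are needed before the partial sum approaches `gauss_sum`.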
A hypergeometric function is a solution of the differential equation

$$z(1 - z)\frac{d^2 w}{dz^2} + \bigl(\gamma - (\alpha + \beta + 1)z\bigr)\frac{dw}{dz} - \alpha\beta w = 0, \tag{1}$$

which is called the hypergeometric differential equation or Gaussian
differential equation. This equation is a differential equation of †Fuchsian
type with †regular singular points at 0, 1, and $\infty$, whose solutions are
expressed, in terms of the †P-function of Riemann, by

$$w = P\begin{Bmatrix} 0 & 1 & \infty & \\ 0 & 0 & \alpha & z \\ 1 - \gamma & \gamma - \alpha - \beta & \beta & \end{Bmatrix}.$$

If any one of the values of $\gamma$, $\gamma - \alpha - \beta$, or
$\alpha - \beta$ is integral, there exists a series containing $\log z$,
representing a solution of the differential equation (1) in a neighborhood of
the corresponding singular points. When none of the $\gamma$,
$\gamma - \alpha - \beta$, or $\alpha - \beta$ values is integral, since the
linear transformations $z' = z$, $z' = 1/z$, $z' = 1 - z$,

$z' = z/(z - 1)$, $z' = (z - 1)/z$, $z' = 1/(1 - z)$ permute singular points,
there exist 24 particular solutions around the singular points. The latter fact
was first proved by E. E. Kummer (1836).

There exist various curves $C$ for which the integral

$$w = \int_C u^{\alpha - 1}(1 - u)^{\gamma - \alpha - 1}(1 - zu)^{-\beta}\,du$$

is a solution of (1). Among them we can take the segment $[0, 1]$ when
$\operatorname{Re}\alpha > 0$, $\operatorname{Re}(\gamma - \alpha) > 0$. Then
the corresponding solution is holomorphic in the interior of the unit circle,
and

$$F(\alpha, \beta, \gamma; z) = \frac{\Gamma(\gamma)}{\Gamma(\alpha)\Gamma(\gamma - \alpha)} \int_0^1 u^{\alpha - 1}(1 - u)^{\gamma - \alpha - 1}(1 - zu)^{-\beta}\,du.$$

Since the integrand has branch points at 0, 1, and $1/z$, we have the following
expression when $\gamma$ is not an integer:

$$F(\alpha, \beta, \gamma; z) = \frac{1}{(1 - e^{2\pi i(\gamma - \alpha)})(1 - e^{2\pi i\alpha})}\,\frac{\Gamma(\gamma)}{\Gamma(\alpha)\Gamma(\gamma - \alpha)} \int^{(1+,\,0+,\,1-,\,0-)} u^{\alpha - 1}(1 - u)^{\gamma - \alpha - 1}(1 - zu)^{-\beta}\,du,$$

where $\operatorname{Re}\alpha > 0$, $\operatorname{Re}(\gamma - \alpha) > 0$;
whereas if $\gamma$ is an integer, then

$$F(\alpha, \beta, \gamma; z) = \frac{1}{1 - e^{-2\pi i\alpha}}\,\frac{\Gamma(\gamma)}{\Gamma(\alpha)\Gamma(\gamma - \alpha)} \int^{(1+,\,0+)} u^{\alpha - 1}(1 - u)^{\gamma - \alpha - 1}(1 - zu)^{-\beta}\,du,$$

where the contour in the first expression encircles successively each of 1, 0,
1, and 0 once with indicated directions. These expressions can be adopted as a
definition of the hypergeometric functions for the general value of $z$. Other
integral expressions are also known (→ Appendix A, Table 18.I).

B. The Ladder Method

A linear ordinary differential equation of the second order having three
regular singular points on the complex sphere is easily transformed into an
equation of the form (1). To solve such an equation with a parameter, it is
often useful to decompose, in two different ways, the main part of the equation
into two factors of the first order, and find a recurrence formula involving
the parameter, as we shall see in the following example. This method is called
the ladder method or factorization method.

For example, †Legendre's differential equation

$$L_n[w] = (1 - z^2)\bigl(((1 - z^2)w')' + n(n + 1)w\bigr) = 0$$

is decomposed as follows:

$$L_n = S_n \circ T_n + n^2 = T_{n+1} \circ S_{n+1} + (n + 1)^2,$$

$$T_n = (1 - z^2)\frac{d}{dz} + nz, \qquad S_n = (1 - z^2)\frac{d}{dz} - nz.$$

If $w_n$ is a solution of $L_n[w] = 0$, then multiplying both sides of
$S_n T_n[w_n] + n^2 w_n = 0$ by $T_n$, we find that
$T_n S_n(T_n[w_n]) + n^2(T_n[w_n]) = 0$, that is, $T_n[w_n]$ is a solution of
$L_{n-1}[w] = 0$. Similarly, we see that $S_{n+1}[w_n]$ is a solution of
$L_{n+1}[w] = 0$. In this sense, $S_n$ and $T_n$ are called, respectively, the
step-up operator, or up-ladder, and the step-down operator, or down-ladder,
with respect to the parameter $n$. The above relation constitutes a recurrence
formula for Legendre functions (→ Appendix A, Table 18.II).

C. Extensions of Hypergeometric Functions

J. Thomae (1870) proposed the series

$$1 + \sum_{n=1}^{\infty} \frac{(\alpha_1)_n(\alpha_2)_n \cdots (\alpha_h)_n}{(\beta_1)_n(\beta_2)_n \cdots (\beta_h)_n}\, z^n, \qquad (\lambda)_n = \lambda(\lambda + 1)\cdots(\lambda + n - 1),$$

as an extension of the hypergeometric series. The sum of this series satisfies
the $h$th-order differential equation

$$(1 - z)\frac{d^h w}{dt^h} + (A_1 - B_1 z)\frac{d^{h-1} w}{dt^{h-1}} + \dots + (A_h - B_h z)w = 0, \qquad t = \log z.$$

When $h = 2$ and $\beta_1 = 1$, it reduces to the ordinary hypergeometric
series. The notation

$${}_pF_q(\alpha_1, \alpha_2, \dots, \alpha_p; \beta_1, \beta_2, \dots, \beta_q; z) = \sum_{n=0}^{\infty} \frac{(\alpha_1)_n(\alpha_2)_n \cdots (\alpha_p)_n}{n!\,(\beta_1)_n(\beta_2)_n \cdots (\beta_q)_n}\, z^n, \tag{2}$$

which is due to L. Pochhammer and modified by E. W. Barnes, is used to denote
the extended hypergeometric series, and the function defined by (2) is often
called Barnes's extended hypergeometric function. For example, Gauss's series
in this notation is ${}_2F_1(\alpha, \beta; \gamma; z)$.

Corresponding to Barnes's integral expression for hypergeometric functions, it
is known that the integral

$$w(z) = \frac{1}{2\pi i}\int_{c - i\infty}^{c + i\infty} K(\zeta)H(\zeta)z^{\zeta}\,d\zeta,$$

where

$$K(\zeta) = K(\zeta + 1)$$

and four kinds of functions [3]:

ni + w-(i + a2)..ui + ah)


H(i)=r(i+l+Pl)r(i+l+P,)...r(iï+l+pJ~
is a solution of the hth-order differential equa-
tion at the beginning of this section. The
hypergeometric function expressed by the
definite integral

r”(i-l)b(i-z)Ca
s
has an obvious format extension

(~-~,)~~(~-u~)~~...(~-a,)~-(~-z)‘-d~.
s
On the other hand, the equation

They are called Appell’s hypergeometric func-


tions of two variables. Each satisfïes a cor-
where
responding system of partial differential
equations:

x(1 -x)r+y(l-x)s+(y-cx)p-/Iyq

-afiz=O,

F, y(l-y)t+x(l-y)s+(y-c’x)q-B’xp

-apz=O,

1 (x-Y)s-FP+Pq=O,
x(1 -x)r-xys+(y-cx)p-/Iyq-aBz=O,
lin
P1(z)=~&) A+& + . +-
( 1 2 z-a, > ’ F2 y(l-y)t-xys+(y’-c’y)q-P’xp

called the Tissot-Pochhammer differential 1 -ajYz=O,


equation, has a solution
x(1-x)r+ys+(y-cx)p-apz=O,
F3
w(z)= (~-a,)P~~1(~-a,)P2~1 ... 1 y(1 -y)t+xs+(y-c”y)q-a’/Yz=O,
sc
x(1 -x)r-yzt-2xys+(y-cx)p-cyq
x ((--a m)fimm1(<-z)h+mm2d[.
-a/Iz=O,
After Pochhammer (1870), this is sometimes F4
called Pochhammer’s generalized hypergeo- y(l-y)t-xzr-2xys+(y’-cy)q-cxp
metric function.
I -apz=O,
As another extension of Gauss’s series, H. E.
Heine (1846) introduced Heine’s series: where

q(u, b, c; q; z) = 1 +
(1-4”)(1-qb) c=a+p+1, c’=a+/Y+ 1, c”=a’+p+l,
(1 -du -43 q’ p = az/ôx, 4 = aziay, y= a2zlax2,
+(l -q”)(l -q”+‘)(l -qb)U -qh+‘lq2’+ .,, s= a2zlaxay, t = a2zlay2.
(1 -d(l-q2)(l -q’)(l -qc+l)
Appell’s hypergeometric functions cari also be
Setting q= 1 +E, z=(l/c)logx, and letting ~‘0, represented by integrals: for example,
we obtain Gauss’s series as the limit of Heine’s
series.

D. Hypergeometric Functions of Several


Variables 1 1
uY)m~) uP-l #-1
F2 =
r(B)r(B’)r(y-p)r(y’-B’) SSo o
P. Appel1 (1880) formally extended Gauss’s
series to the case of two variables and detïned

l-(Y) ub-lull-1 have


F3=r(B)r(~~)r(y-PPI)
SS ,F,(a;Z)=(det(E-Z)))“.
x(1 -u-u)~-~-fi’-l(l -ux)-y1 -uJJ-“‘dudu,
Based on this definition, many special func-
where, for F, and F3, the domain of integra- tions and formulas are extended to the case of
tion is ~30, ~30, ~-U-V>O. E. Picard a matrix argument. For example,
(1881) showed that F, cari also be expressed
A,(Z)=,F,(fi+(m+ 1)/2; -zyr,v+(m+ 1)/2)
by a single integral:
(5)
UY) o’
F’=r(cc)r(x-y) is an extension of the tBessel function, and this
s
u”-l(l -q-1

reduces to

WrdJ,(t) = f4a(w)2)
G. Lauricella (1893) extended the foregoing when m = 1. Formula (5) is applied to the
functions to the case of more than two vari- tnoncentral Wishart distribution in mathe-
ables. More general hypergeometric series of matical statistics.
several variables were defined by R. Mellin,
J. Horn, and J. Kampé de Fériet [3,4]. Every
algebraic equation cari be solved analytically References
in terms of the foregoing functions (Mellin,
R. Birkeland) [4]. Also there are studies con- [l] F. Klein, Vorlesungen über die hypergeo-
cerning +Riemann’s problem and tautomor- metrische Funktionen, Springer, 1933.
phic functions derived from F, (Picard; Also - references to 389 Special Functions.
T. Terada [5]). For applications of the ladder method,
[2] L. Infeld and T. E. Hull, The factorization
method, Rev. Mod. Phys., 23 (1951) 21-68.
E. Hypergeometric Functions with Matrix For hypergeometric functions of several
Argument variables,
[3] P. Appell, Sur les fonctions hypergéo-
For symmetric matrices 2 of degree m, C. S. métriques de plusieurs variables, Mémor. Sci.
Math., Gauthier-Villars, 1925.
Herz delïned hypergeometric functions with
[4] G. Belardinelli, Fonctions hypergéomé-
matrix argument as follows [SI: Denoting by
etr Z the exponential exp(tr Z) of the ttrace of triques de plusieurs variables et résolution
Z, let analytique des équations algébriques générales,
Mémor, Sci. Math., Gauthier-Villars, 1960.
,F,,(Z)=etrZ, [S] T. Terada, Problème de Riemann et fonc-
tions automorphes provenant des fonctions
,+,F,(cc,,...,a,;B,,...,lj4;~;Z)
hypergéométriques de plusieurs variables,
1 J. Math. Kyoto Univ. 13 (1973) 557-578.
ZZZ- et+A),F& ,,... ,a,;fil,...,Bq;
rk+) s A>o For the case of a matrix variable,
[6] C. S. Herz, Bessel functions of matrix argu-
AZ)(det A)?-“dA, i di,, di,,,, (3)
ment, Ann. Math., (2) 61 (1955), 4744523.
,F,+,(a,,...,cc,;B,,...,13,;~;12) [7] L. J. Slater, Generalized hypergeometric

cmsReZ~X,,,OetrzJJF~(E~~
=(2ni)m(m+l)/2
...7cciJ;
functions, Cambridge Univ. Press, 1966.

pi, , &;AZm’)(detZ))Ydz,, dzz2 . ..dz.,, (4)

where

r,(y)=7t ~'~-')'~r(~)r(y-1/2)...
x r(Y -Cm - lV4,
and A > 0 means that A is tpositive definite.
The integral (3) converges for -Z >O if Rey
>(m- 1)/2. If Rey is sulhciently large, then for
suitably chosen X0, (4) converges in a domain
of the space of A and represents an analytic
function of its argument. In particular, we
207 A 800
Ideal Boundaries

207 (XI.13)
Ideal Boundaries

A. Ideal Boundaries

For a given Hausdorff space $R$, a †compact Hausdorff space $R^*$ that contains
$R$ as its dense subspace is called a compactification of $R$, and
$\Delta = R^* - R$ is called an ideal boundary of $R$. In the present article,
we deal mainly with properties (in particular, function-theoretic properties)
of ideal boundaries of †Riemann surfaces $R$.

B. Harmonic Boundaries

By $R$ we mean an open Riemann surface. The set $\Gamma$ of points $p^*$ in
$\Delta$ such that $\liminf_{R \ni p \to p^*} P(p) = 0$ for every †potential
$P$ (i.e., a positive †superharmonic function $P$ for which the class of
nonnegative †harmonic functions smaller than $P$ consists only of the constant
function 0) is a compact subset of $R^*$. The set $\Gamma$ is called the
harmonic boundary of $R$ with respect to $R^*$. For an arbitrary compact subset
$K$ in $\Delta - \Gamma$, there exists a finite-valued potential $P_K$ with
$\lim_{R \ni p \to p^*} P_K(p) = \infty$ ($p^* \in K$). From this, various
kinds of †maximum principles are derived. For instance, if $u$ is a harmonic
function bounded above for which $\limsup_{p \to \Gamma} u(p) \le M$ holds,
then $u \le M$ on $R$.

There are infinitely many compactifications of $R$. For two compactifications
$R_i^*$ ($i = 1, 2$) of $R$, we say that $R_1^*$ is greater than $R_2^*$ or,
equivalently, lies over $R_2^*$, if the identity mapping of $R$ can be extended
to a continuous mapping of $R_1^*$ onto $R_2^*$. In order that deep
function-theoretic studies of $R^*$ can be carried out, various conditions must
be imposed on $R^*$. A compactification $R^*$ is said to be of Stoïlow type (or
of type S) if for every †connected open subset $G^*$ in $R^*$ whose boundary in
$R^*$ is contained in $R$, $G^* - \Delta$ is also connected. Next suppose that
$R \notin O_G$ (→ 367 Riemann Surfaces E). For a given real-valued function $f$
on $\Delta$, let $\overline{\mathcal{U}}_f^{R,R^*}$
($\underline{\mathcal{U}}_f^{R,R^*}$) be the class of †superharmonic functions
$s$ bounded from below (†subharmonic functions $s$ bounded from above) such
that $\liminf_{R \ni p \to p^*} s(p) \ge f(p^*)$
($\limsup_{R \ni p \to p^*} s(p) \le f(p^*)$) for every $p^* \in \Delta$. If
these classes are nonempty, then
$\overline{H}_f^{R,R^*}(p) = \inf\{s(p) \mid s \in \overline{\mathcal{U}}_f^{R,R^*}\}$
and
$\underline{H}_f^{R,R^*}(p) = \sup\{s(p) \mid s \in \underline{\mathcal{U}}_f^{R,R^*}\}$
are harmonic on $R$, and
$\underline{H}_f^{R,R^*} \le \overline{H}_f^{R,R^*}$. In particular, if
$\underline{H}_f^{R,R^*} = \overline{H}_f^{R,R^*}$, then the common function is
denoted by $H_f^{R,R^*}$, and the function $f$ is said to be resolutive with
respect to $R^*$. A compactification such that every bounded continuous
function on $\Delta$ is resolutive is called a resolutive compactification. In
such a case, a point $p^*$ in $\Delta$ is said to be regular with respect to
the †Dirichlet problem if $\lim_{R \ni p \to p^*} H_f^{R,R^*}(p) = f(p^*)$ for
every bounded continuous function $f$ on $\Delta$ (→ 120 Dirichlet Problem).
The set of regular points in $\Delta$ is contained in $\Gamma$. If $R^*$ is a
resolutive compactification, then there exists a unique positive †Borel measure
$\mu_p$ such that $H_f^{R,R^*}(p) = \int_\Delta f(p^*)\,d\mu_p(p^*)$ for every
bounded continuous function $f$ on $\Delta$. This measure is called the
harmonic measure with respect to $p \in R$. There exists a function
$P(p, p^*)$ on $R \times \Delta$ with $d\mu_p(p^*) = P(p, p^*)\,d\mu_o(p^*)$
for an arbitrary fixed point $o$ in $R$ satisfying the following three
conditions: (i) $P(p, p^*)$ is harmonic on $R$ as a function of $p$; (ii)
$P(p, p^*)$ is Borel measurable as a function of $p^*$; (iii)
$k(o, p)^{-1} \le P(p, p^*) \le k(o, p)$, with the Harnack constant $k(o, p)$
of $(o, p)$ relative to $R$ [9].

C. Compactifications Determined by Function Families

A family $F$ of real-valued continuous functions on $R$ admitting infinite
values is called a separating family on $R$ if there exists an $f$ in $F$ such
that $f(p) \ne f(q)$ for any pair of given distinct points $p$ and $q$ in $R$.
A compactification $R^*$ is called an $F$-compactification, denoted by $R_F^*$,
if every function in $F$ can be continuously extended to $R^*$ and the family
of extended functions again constitutes a separating family on $R^*$. The
correspondence $\varphi: F \to R_F^*$ defines a single-valued mapping of all
separating families $F$ on $R$ onto all $F$-compactifications of $R$. If
$F_1 \supset F_2$, then $\varphi(F_1)$ lies over $\varphi(F_2)$. For any $R^*$,
$\varphi^{-1}(R^*)$ contains infinitely many separating families, among which
the separating families constituting †associative algebras are important. The
following are typical examples of compactifications determined by function
families:

(1) The Aleksandrov compactification is the $\mathfrak{U}$-compactification
$R_{\mathfrak{U}}^*$ with the family $\mathfrak{U}$ of bounded continuous
functions on $R$ with compact support. It is the smallest compactification of
$R$ and is often used in function theory in discussing Dirichlet problems for
relatively noncompact subregions in reference to relative boundaries.

(2) The Stone-Čech compactification is the $\mathfrak{C}$-compactification
$R_{\mathfrak{C}}^*$ with the family $\mathfrak{C}$ of bounded continuous
functions on $R$. It is the largest compactification of $R$. It is rarely used
in function theory, but an application is found in the work of M. Nakai [8].

(3) The Kerékjártó-Stoïlow compactification is the
$\mathfrak{S}$-compactification $R_{\mathfrak{S}}^*$ with the family
$\mathfrak{S}$ of bounded continuous functions $f$ on $R$

such that there exist compact sets K, with the on R - Rf that coincide with f on the bound-
property that the f are constants on each ary of R,. The Kuramochi compactification is
connected component of R - K,. This is the the A-compactifïcation RH with the family H of
smallest compactifïcation of Stoïlow type. bounded continuous functions f on R satisfy-
Many applications of this compactitïcation ing (R)af/an = 0. The continuous function
cari be found in function theory, among which k(p, q) on R such that (R)ak/an = 0 vanishes in
the investigation done by M. Ohtsuka on the a fïxed parametric disk R, in R and is har-
Dirichlet problem and the theory of conforma1 monic in R - & except for a positive tlogarith-
mappings is typical. mit singularity at a point q cari be extended
(4) The Royden compactification is the %- continuously to R x RH, which is called the
compactitïcation R,H with the family % of Kuramochi kernel. By the use of this kernel,
bounded C” functions f on R with finite RR is metrizable, as in the case of Martin
Dirichlet integrals iiRdf~ *& It was intro- compactification. This compactifïcation was
duced by H. L. Royden and developed further introduced by Kuramochi, and its important
by S. Mori, M. Ôta, Y. Kusunoki, Nakai, and applications to the study of ND-functions,
others. This compactification has been used potential theory, and cluster sets were made
effectively in the study of HD-functions and by Kuramochi, Constantinescu and Cornea,
the classification problem of Riemann surfaces and others.
(- 367 Riemann Surfaces). Among compactifïcations (l)-(7), no bound-
(5) The Wiener compactification is the 9% ary point in (2), (4), or (5) satisfies the tfirst
compactification R& with the family !II3 of countability axiom (hence they are not metriz-
bounded continuous functions f on R such able), while the others are a11 metrizable. In (4)
that { HPJ converges to a unique harmonie and (5) A, = r. Fig. 1 shows the relationship
function independent of the choice of exhaus- among the seven examples. Here A -rB means
tions {G,} of an arbitrary tïxed subregion G $ that A lies over B, and A #B means that in
Oc, where the G,, are relatively compact sub- general neither A-tB nor B-tA.
regions of G. It is the largest resolytive com-
pactifïcation, and compactitïcations smaller
than R& are always resolutive. This compac-
tification was introduced independently by
Mori, K. Hayashi, Kusunoki, and C. Constan-
tinescu and A. Cornea and is useful for the
study of HB-functions and the classification of
Riemann surfaces.
(6) The Martin compactification is the 9% D. Remarks
compactification R& with the family %JI of
bounded continuous functions f on R such In contrast to topological compactifications
that there exist relatively compact regions (l)-(3), (4)-(7) cari be regarded as potential-
R, with the property that f= Hf9-Rf’RH-R~/ theoretic and have enough ideal boundary
HfCRf,Rd-Rf on R - Rf. Here f* coincides points SO that one cari solve the Dirichlet
with f on Rf and equals 0 on R$- R, and l* problem and introduce varions measures
is similarly defïned. The set R,Y, - R is called there. Utilizing Green’s functions, R. S. Martin
the Martin boundary of R. If +Green’s func- [6] deduced the first important compactifï-
tion g exists on R, then the function m(p, q) = cation and gave the integral representation of
g(p, q)/q(o, q) for an arbitrary tïxed o E R cari positive harmonie functions (the extension of
be extended continuously to R x R,& which the +Poisson integral). Z. Kuramochi [3] ob-
is called the Martin kernel. By the metric tained his compactification similarly by using
&l(q? r, = s”ppsR, Im(P, Ml + m(p, 9)) - m(p, r)/ N-Green functions introduced by himself
(1 + m(p, r))] with a parametric disk R, in R, instead of the usual Green% functions. In this
R& is tmetrizable. This compactifïcation was compactification, the ideal boundary points
introduced by R. S. Martin, and many appli- cari be considered to be interior points of the
cations of it to the study of HP-functions, surfaces in a potential-theoretic sense. In case
potential theory, Markov chains, and cluster of lïnitely connected domains with smooth
sets were obtained by M. H. Heins, Z. Kura- boundaries, both the Martin and Kura-
mochi, J. L. Doob, Constantinescu and Cor- mochi boundaries coincide with the usual
nea, and others. boundaries.
(7) For a function f on R, (R)~~/&I=O The Royden and Wiener compactitïcations
means that there exists a relatively compact were introduced as the maximal ideal spaces
subregion RJ such that f is of class C” on R of respective function algebras. Their ideal
outside R, and the Dirichlet integral off over boundaries contain extremely many points.
R - Rf is not greater than those of functions However, these compactifications have elegant

properties and many applications. An analytic 208 (X.7)


mapping <pof a Riemann surface R into an-
other R’ is called a Dirichlet (Fatou) mapping
Implicit Functions
if <pcari be extended to a continuous map-
ping of R$(R&) to R$(R$). For example, a A. General Remarks
Lindelolïan mapping (AD-function) is a Fatou
(Dirichlet) mapping. The Dirichlet and Fatou Historically, a function y of x was called an
mappings were investigated by Constanti- implicit function of x if there was given a
nescu, Cornea, and others. functional relation f(x, y) = 0 between x and y,
By using the Martin boundary, Z. Kuramo- but no explicit representation of y in terms of
chi and M. Nakai proved the extension of the 165 Functions). Nowadays, however,
xc-
Evans-Selberg theorem to parabolic Riemann the notion of implicit function is rigorously
surfaces [3, S]. The normal derivatives of HD- defined as follows: Suppose that a function
functions on the ideal boundaries and their f(xi, . , x,, y) is of tclass C’ in a domain G in
applications were studied by Constantinescu, the real (n + l)-dimensional Euclidean space
Cornea, F. Maeda, Y. Kusunoki, and others. R”+’ and that ,f(xy, , xf , y’) = 0, &(xT, . . . ,
The theory of compactification cari be gen- xi, y’) # 0 at a point (xy, , xt, y’) in G. Then
eralized to domains in R”, +Green spaces, and there is a unique function ~(x i, . ,x,) of class
harmonie spaces. C’ in a neighborhood of the point (x7, , XII)
that satisIïesf(x, ,..., xn,g(x, ,..., x,))=O,
y0 =g(xl, , xz) (implicit function theorem).
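The implicit function theorem for $n = 1$ can be illustrated numerically. The following is a minimal Python sketch, not part of the original article. It assumes the example $f(x, y) = x^2 + y^2 - 1$ near the point $(0.6, 0.8)$, uses Newton's iteration in $y$ as one possible way to evaluate the implicit function $g$, and compares a finite-difference derivative of $g$ with the implicit-differentiation value $-f_x/f_y$. All function names are hypothetical.

```python
def solve_implicit(f, fy, x, y0, tol=1e-12, max_iter=50):
    """Solve f(x, y) = 0 for y near y0 by Newton's method (requires fy != 0)."""
    y = y0
    for _ in range(max_iter):
        step = f(x, y) / fy(x, y)
        y -= step
        if abs(step) < tol:
            return y
    raise RuntimeError("Newton iteration did not converge")


# Assumed example: the unit circle, with f_y = 2y != 0 at (0.6, 0.8).
def f(x, y):
    return x * x + y * y - 1.0


def fx(x, y):
    return 2.0 * x


def fy(x, y):
    return 2.0 * y


def g(x):
    """Implicit function determined by f = 0 near (0.6, 0.8)."""
    return solve_implicit(f, fy, x, y0=0.8)


def dg_formula(x):
    """dg/dx = -f_x / f_y evaluated on the curve y = g(x)."""
    y = g(x)
    return -fx(x, y) / fy(x, y)


def dg_numeric(x, h=1e-6):
    """Central-difference approximation of dg/dx."""
    return (g(x + h) - g(x - h)) / (2.0 * h)
```

Here $g(x) = \sqrt{1 - x^2}$ is known in closed form, so `g(0.6)` should be close to 0.8 and both derivative estimates close to $-0.75$. The condition $f_y \ne 0$ is what keeps the Newton step well defined; near $y = 0$ it fails and no single-valued $g$ exists.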
References The function g is called the implicit function
determined by f=O. The partial derivatives of
g are given by the relation
[l] C. Constantinescu and A. Cornea, Ideale
Rander Riemannscher Flachen, Springer, 1963. ûYlûxj= -(ûf/axj)l(ûf/aY)~

[2] K. Hayashi, Sur une frontière des surfaces wherey=g(x,,..., x,). If the function f is of
de Riemann, Proc. Japan Acad., 37 (1961) class c’ (1 <r < cc or Y= w), then the function
469-472. g is also of class c’. In particular, when n = 1,
[3] Z. Kuramochi, Mass distributions on the letting x, be x, we have dg/dx = -fJf;.
ideal boundaries of abstract Riemann surfaces
1, II, Osaka Math. J., 8 (1956) 1199137, 145-
186. B. Jacobian Matrices and Jacobian
[4] Y. Kusunoki, On a compactifïcation of Determinants
Green spaces, Dirichlet problem and theorems
of Riesz type, J. Math. Kyoto Univ., 1 (1962) A mapping u from a domain G in R” into R”
385-402.
[S] Y. Kusunoki and S. Mori, On the har-
monic boundary of an open Riemann surface
1, Japan. J. Math., 29 (1959) 52-56. is called a mapping of class C’ if each compo-
[6] R. S. Martin, Minimal positive harmonie nentu,,...,u,isofclassC’(O<r<coorr=cr,)
functions, Trans. Amer. Math. Soc., 49 (1941), in G. Given a mapping u of class C’ from G
137-172. into R”, we consider the following matrix,
[7] S. Mori, On a compactification of an open which gives rise to the differential du, of the
Riemann surface and its application, J. Math. mapping u (- 105 Differentiable Manifolds 1):
Kyoto Univ., 1 (1961), 21-42.
[S] M. Nakai, On Evans potential, Proc. û(“)lû(x)=(auj/ûxk)l Q<m, 1<k<n’ (1)
Japan Acad., 38 (1962) 6244629. This matrix is called the Jacobian matrix of
[9] M. Nakai, Radon-Nikodym densities the mapping u at x. If there is another map-
between harmonie measures on the ideal ping v of class Ci from a domain containing
boundary of an open Riemann surface, the +range U of u into R’, then we have the
Nagoya Math. J., 27 (1966) 71-76. law of composition:
[ 101 H. L. Royden, On the ideal boundary of a
Riemann surface, Contributions to the Theory (a(u)la(u))(a(u)lû(x)) = ~(4/@).
of Riemann Surfaces, Princeton Univ. Press, When n = m, the tdeterminant of the matrix (1)
1953,107~109. is called the Jacobian determinant (or simply
[ 1 l] F. Maeda, M. Ohtsuka, et al., Kuramochi Jacobian), and is denoted by D(u)/D(x),
boundaries of Riemann surfaces, Lecture notes
D(u,, , u,YW, , . , x,1 or
in math. 58, Springer, 1968.
[12] L. Sario and M. Nakai, Classification D(u,,...,u,)
theory of Riemann surfaces, Springer, 1970. Dh,...,x,,)’
803 208 D
Implicit Functions

Sometimes the notation 0 is used instead of D; the components ui, , u, are functionally de-
but in the present article we distinguish the pendent of class C” on every compact set in G
matrix from the determinant, using 8 for the (Knopp-Schmidt theorem) [ 11.
matrix and D for the determinant.
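The distinction between the Jacobian matrix $\partial(u)/\partial(x)$ and the Jacobian determinant $D(u)/D(x)$, the law of composition, and the vanishing of the Jacobian for functionally dependent components can all be checked numerically. The following is a minimal Python sketch, not part of the original article: the example mappings and all function names are assumptions chosen for illustration, and the derivatives are approximated by central differences.

```python
import math


def jacobian(u, x, h=1e-6):
    """Central-difference approximation of the matrix (du_j/dx_k) at the point x."""
    m = len(u(x))
    jac = [[0.0] * len(x) for _ in range(m)]
    for k in range(len(x)):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        up, um = u(xp), u(xm)
        for j in range(m):
            jac[j][k] = (up[j] - um[j]) / (2.0 * h)
    return jac


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][l] * b[l][j] for l in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def det2(mat):
    """Determinant of a 2 x 2 matrix."""
    return mat[0][0] * mat[1][1] - mat[0][1] * mat[1][0]


def polar(x):
    """Assumed example u: (r, theta) -> (r cos theta, r sin theta); D(u)/D(x) = r."""
    r, t = x
    return [r * math.cos(t), r * math.sin(t)]


def square_map(u):
    """Assumed example v: (a, b) -> (a^2 - b^2, 2ab)."""
    a, b = u
    return [a * a - b * b, 2.0 * a * b]


def dependent(x):
    """Components u1 = x + y, u2 = (x + y)^2 satisfy the relation u1^2 - u2 = 0."""
    s = x[0] + x[1]
    return [s, s * s]
```

At $x = (2, 0.3)$ the determinant of `jacobian(polar, x)` should be close to $r = 2$, and the product of the two Jacobian matrices should agree with the Jacobian of the composite mapping. For `dependent`, the determinant is zero at every point, as expected for functionally dependent components.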
If m = n and D(u)/D(x) never vanishes at any
point of the domain G, then u is called a regular D. Implicit Functions Determined by Systems
(or nonsingular) mapping of class C’. If the of Functions
Jacobian D(u)/D(x) is 0 at x, we say that u is
singular at x. A mapping that is singular at
every point in a set S c G is said to be degen- Suppose that the +rank of the Jacobian matrix
erate on S. For a regular mapping u, the sign of (1) is r < m everywhere in G. Suppose that
of the Jacobian is constant in a connected u(x) is a mapping of class C’ from a domain
domain G. If it is positive, the mapping u G in R” into R”, the Jacobian determinant
preserves the orientation of the coordinate Db,, . . ..u.)lD(x,, . . . . x,) never vanishes in G,
system at each point in G, while if it is nega- and further D(u,, , u,, u,)/D(x,, , x,, x,) is
tive, the mapping changes the orientation. A identically 0 in G for each p, g with r-c p d m,
point where u is degenerate is called a critical r < 0 < n. Then the values u,+r (x), , u,(x) are
point of the mapping u, and its image under u determined by the values ui (x), , u,(x), and
is called a critical value. In general, the image each u,, is represented as a function of class C’
of the mapping is “folded” along the set of ofu,,...,u,.
critical points. The set of critical values of a Let u(x) be a mapping of class C’ from a
mapping IA of class C’ (sending a domain in R” domain G in R” into R” and V the tinverse
into R”) is of Lebesgue measure 0 in R” (Sard’s image of a point u”. TO study the properties of
theorem). If u is a regular mapping, then each the set V, we assume, for simplicity, that u” is
point in the domain $G$ of $u$ has a neighborhood $V$ such that the restriction of $u$ on $V$ is a †topological mapping. Its inverse mapping $x(u)$ is also a regular mapping of class $C^1$ and satisfies the relation

$$\frac{\partial x}{\partial u} = \left(\frac{\partial u}{\partial x}\right)^{-1}$$

(inverse mapping theorem). If $u$ is of class $C^r$ ($1 \le r \le \infty$ or $r = \omega$), then so is its inverse mapping.

C. Functional Relations

A function $F(u_1, \ldots, u_m)$ defined on a domain $B$ in $\mathbf{R}^m$ is called a function with scattered zeros if $F$ has a zero point (i.e., there exists a point $u$ for which $F(u) = 0$) and if every open subset of $B$ contains a point $u$ such that $F(u) \ne 0$. Every †analytic function $\not\equiv 0$ has scattered zeros. Let $u(x)$ be a mapping from a domain $G$ in $\mathbf{R}^n$ into $B \subset \mathbf{R}^m$. Suppose that there exists a function $F(u)$ defined in $B$, of class $C^1$, with scattered zeros. If $F(u(x)) = 0$ for every $x$ in $G$, then we say that the components $u_1, \ldots, u_m$ of the mapping $u$ have a functional relation of class $C^1$ or are functionally dependent of class $C^1$. In such a case, we sometimes say simply that $u_1, \ldots, u_m$ are functionally dependent or that they have a functional relation. If the components $u_1, \ldots, u_n$ of a mapping $u$ of class $C^1$ are functionally dependent of class $C^0$, then the Jacobian $D(u)/D(x)$ of $u$ must vanish. Conversely, if the Jacobian $D(u)/D(x)$ of a mapping $u$ of class $C^1$ is identically 0 in the domain $G$,

the origin. Suppose that the rank of the matrix $\partial u/\partial x$ is $r$ for every point $x$ in $G$ and each of $u_{r+1}, \ldots, u_m$ is functionally dependent on $u_1, \ldots, u_r$. Then each $u_\rho$ ($r < \rho \le m$) is a function $u_\rho(u_1, \ldots, u_r)$ of $u_1, \ldots, u_r$. The set $V$ is empty if there is a $\rho$ such that $u_\rho(0, \ldots, 0) \ne 0$. On the other hand, if $u_\rho(0, \ldots, 0) = 0$ for all $\rho$ ($r < \rho \le m$), then $V$ is the set of common zero points of the functions $u_1(x), \ldots, u_r(x)$. Therefore, to study the set $V$, we can assume that $r = m \le n$. If $m = n$, $V$ consists of isolated points only. If $m < n$, then, changing the order of the variables $x_1, \ldots, x_n$ if necessary, we can assume that $D(u_1, \ldots, u_m)/D(x_1, \ldots, x_m) \ne 0$ at a point $(x^0)$ in $V$. In this case, there is a unique function $\xi_\rho(x_{m+1}, \ldots, x_n)$ of class $C^1$ ($1 \le \rho \le m$) in a neighborhood of $(x^0)$ satisfying the following two conditions: (i) $x_\rho^0 = \xi_\rho(x_{m+1}^0, \ldots, x_n^0)$; (ii) if the point $(x_{m+1}, \ldots, x_n)$ is in a neighborhood of $(x_{m+1}^0, \ldots, x_n^0)$, then the point

$$(\xi_1(x_{m+1}, \ldots, x_n), \ldots, \xi_m(x_{m+1}, \ldots, x_n), x_{m+1}, \ldots, x_n) \in V.$$

Each function $\xi_\rho$ is called an implicit function of $x_{m+1}, \ldots, x_n$ determined by the relations $u_1 = \cdots = u_m = 0$. The †total derivatives of the $\xi_\rho$ are determined from the system of linear equations

$$\sum_{\rho=1}^{m} \frac{\partial u_j}{\partial x_\rho}\, d\xi_\rho + \sum_{l=m+1}^{n} \frac{\partial u_j}{\partial x_l}\, dx_l = 0, \qquad j = 1, \ldots, m.$$

The foregoing implicit function theorem is a local one. Among the global implicit function theorems, the following one, due to Hadamard, is useful: Let $x \to y(x)$ be a mapping of class $C^1$ from $\mathbf{R}^n$ into $\mathbf{R}^n$ such that the inverse $J^{-1}(x)$

of its Jacobian matrix $J(x)$ is bounded in $\mathbf{R}^n$; then the mapping is a †diffeomorphism of class $C^1$ from $\mathbf{R}^n$ onto $\mathbf{R}^n$.

The implicit function theorem holds also in complex spaces. Let $f_i(x_1, \ldots, x_n, y_1, \ldots, y_p)$, $1 \le i \le p$, be a system of †holomorphic functions in a domain in the complex $(n+p)$-dimensional space $\mathbf{C}^{n+p}$. If (i) $f_i(x_1^0, \ldots, x_n^0, y_1^0, \ldots, y_p^0) = 0$, $1 \le i \le p$, and (ii) $D(f_1, \ldots, f_p)/D(y_1, \ldots, y_p)\big|_{(x,y)=(x^0,y^0)} \ne 0$, then there exists a unique holomorphic solution $y_i = y_i(x_1, \ldots, x_n)$ ($1 \le i \le p$) in a neighborhood of the point $x^0 = (x_1^0, \ldots, x_n^0)$ that satisfies $f_i(x_1, \ldots, x_n, y_1(x_1, \ldots, x_n), \ldots, y_p(x_1, \ldots, x_n)) = 0$ ($1 \le i \le p$).

E. Linear Relations

Suppose that $u(x)$ is a mapping of class $C^{m-1}$ from $\mathbf{R}^1$ into $\mathbf{R}^m$. Its components $(u_1, \ldots, u_m)$ are functions of class $C^{m-1}$. Then the determinant

$$\begin{vmatrix} u_1 & u_2 & \cdots & u_m \\ u_1' & u_2' & \cdots & u_m' \\ \vdots & \vdots & & \vdots \\ u_1^{(m-1)} & u_2^{(m-1)} & \cdots & u_m^{(m-1)} \end{vmatrix}$$

is called the Wronskian determinant (or simply the Wronskian) of the functions $u_1, \ldots, u_m$ and is denoted by $W(u_1, u_2, \ldots, u_m)$. If the functions $u_1, \ldots, u_m$ are †linearly dependent, i.e., if there exist constants $c_j$ not all zero satisfying $\sum_{j=1}^{m} c_j u_j(x) = 0$ identically, then the Wronskian vanishes identically. Therefore, if $W(u_1, \ldots, u_m) \ne 0$, then the functions $u_1, \ldots, u_m$ are linearly independent. Conversely, if $W(u_1, \ldots, u_m) = 0$ identically, and further if there is at least one nonvanishing Wronskian for $u_1, \ldots, u_{i-1}, u_{i+1}, \ldots, u_m$ ($1 \le i \le m$), then the functions $u_1, \ldots, u_m$ are linearly dependent. The necessity of the additional condition is shown by the following example: $u_1 = x^3$ and $u_2 = |x|^3$ are of class $C^1$ in the interval $[-1, 1]$ and linearly independent, but they satisfy $W(x^3, |x|^3) = 0$ identically. However, the additional condition is unnecessary if the functions $u_1, \ldots, u_m$ are analytic. Similar theorems are valid in a domain in a complex plane.

Furthermore, if $u(x)$ is a continuous mapping from an interval $[a, b]$ in $\mathbf{R}^1$ into $\mathbf{R}^m$, the determinant

$$G(u_1, \ldots, u_m) = \begin{vmatrix} (1,1) & \cdots & (1,m) \\ (2,1) & \cdots & (2,m) \\ \vdots & & \vdots \\ (m,1) & \cdots & (m,m) \end{vmatrix}, \qquad (j,k) = \int_a^b u_j(x) u_k(x)\, dx,$$

is called the Gramian determinant (or simply Gramian). The Gramian is the †discriminant of the quadratic form

$$\int_a^b \Bigl( \sum_{j=1}^{m} \xi_j u_j(x) \Bigr)^2 dx$$

of $\xi_1, \ldots, \xi_m$, and is equal to

$$\frac{1}{m!} \int_a^b \cdots \int_a^b \bigl( \det(u_j(x_k)) \bigr)^2\, dx_1 \cdots dx_m.$$

We always have $G(u_1, \ldots, u_m) \ge 0$, and $G(u_1, \ldots, u_m) = 0$ if and only if $u_1, \ldots, u_m$ are linearly dependent. The Gramian is defined if the functions $u_1, \ldots, u_m$ are †square integrable in the sense of Lebesgue. In that case, the condition $G(u_1, \ldots, u_m) = 0$ holds if and only if $u_1, \ldots, u_m$ are linearly dependent †almost everywhere, i.e., there are constants $c_1, \ldots, c_m$ not all zero such that the relation $c_1 u_1(x) + \cdots + c_m u_m(x) = 0$ holds except on a set of Lebesgue measure 0.

References

[1] K. Knopp and R. Schmidt, Funktionaldeterminanten und Abhängigkeit von Funktionen, Math. Z., 25 (1926), 373-381.
[2] G. Doetsch, Die Funktionaldeterminante als Deformationsmass einer Abbildung und als Kriterium der Abhängigkeit von Funktionen, Math. Ann., 99 (1928), 590-601.
[3] W. Rudin, Principles of mathematical analysis, McGraw-Hill, second edition, 1964.
[4] H. Cartan, Calcul différentiel, Hermann, 1967.
[5] J. T. Schwartz, Nonlinear functional analysis, Gordon & Breach, 1969.

209 (XXI.5)
Indian Mathematics

India was one of the earliest civilizations, but because it has no precise chronological record of ancient times, it is said to possess no history. Indian mathematics seems to have developed under the influence of the cult of Brahma, as did the calendar. It may also have some relation to the mathematics of the Near East and China, but this is difficult to trace.

The word gaṇita (computation) appears in early religious writings; after the beginning of the Christian Era, it was classified into pāṭī-gaṇita (arithmetic), bīja-gaṇita (algebra), and kṣetra-gaṇita (geometry), thus showing some systematization. The Buddhists (notably Nagarjuna) had a kind of logic, but it had no relation to mathematics. Unlike the Greeks, the Indians had no demonstrational geometry, but they had symbolic algebra and a position system of numeration.

Indian geometry was computational; Aryabhata (c. 476-c. 550) computed the value of $\pi$ as 3.1416; Brahmagupta (598-c. 660) had a formula to compute the area of quadrangles inscribed in a circle; and Bhāskara (1114-1185) gave a proof of the Pythagorean theorem. In trigonometry, Aryabhata made a table of sines of angles between 0° and 90° for every 3.75° interval. The name "sine" is related to the Sanskrit jya, which referred to half of the chord of the double arc.

The Indians had a remarkable system of algebra. At the beginning they had no operational symbols and described in words the rules for solving equations. Brahmagupta worked on the †Pell equation $ax^2 + 1 = y^2$. Bhāskara knew that a quadratic equation can have two roots that can be positive and negative, but did not assign any meaning to the negative root in such cases. Bhāskara also introduced algebraic symbols.

The symbol 0 was used in India from about 200 B.C. to denote the void place in the position system of numeration; 0 as a number is found in a book by Bakhshali published in the 3rd century A.D. The number 0 is defined as $a - a = 0$ in our notation, and the rules $a \pm 0 = a$, $0 \times a = 0$, $\sqrt{0} = 0$, $0 \div a = 0$ are mentioned. Brahmagupta prohibited division by 0 in arithmetic, but in algebra he called the "quantity" $a \div 0$ taccheda. Bhāskara called it khahara and made it play a role similar to that of our infinity.

Some historians assert that the Indians had the ideas of infinity and infinitesimal. Some hold that the Indian position system of numeration arose from the circumstance that the names of numbers differed according to their positions. The Indian numeration system was exported to Europe through Arabia, and had great influence on the development of mathematics.

References

[1] M. Cantor, Vorlesungen über Geschichte der Mathematik I, Teubner, third edition, 1907.
[2] G. Sarton, Introduction to the history of science I, From Homer to Omar Khayyam, Carnegie Institute of Washington, 1927.
[3] B. Datta and A. N. Singh, History of Hindu mathematics I, II, Lahore, 1935-1938. (Asia Publishing House, 1962).

210 (II.25)
Inductive Limits and Projective Limits

A. General Remarks

Inductive and projective limits can be defined over any †preordered set $I$ and in any †category. We first explain the definition of these limits in the special case where $I$ is a †directed set and the category is that of sets, of groups, or of topological spaces. The simplest case is when $I$ is the ordered set $\mathbf{N}$ of the natural numbers.

B. The Limit of Sets

Let $I$ be a directed set. Suppose that we are given a set $X_i$ for each $i \in I$ and a mapping $\varphi_{ji}: X_i \to X_j$ for each pair $(i, j)$ of elements of $I$ with $i \le j$, such that $\varphi_{ii} = 1_{X_i}$ (the identity mapping on $X_i$) and $\varphi_{ki} = \varphi_{kj} \circ \varphi_{ji}$ ($i \le j \le k$). Then we denote the system by $(X_i, \varphi_{ji})$ and call it an inductive system (or direct system) of sets over $I$. Let $S$ be the †direct sum $\coprod_i X_i$ of the sets $X_i$ ($i \in I$), and define an equivalence relation in $S$ as follows: $x \in X_i$ and $y \in X_j$ are equivalent if and only if there exists a $k \in I$ such that $i \le k$, $j \le k$, $\varphi_{ki}(x) = \varphi_{kj}(y)$. Let $D$ be the †quotient set of $S$ by this equivalence relation, and let $f_i: X_i \to D$ ($i \in I$) be the canonical mappings. Then we have I(1) $f_j \circ \varphi_{ji} = f_i$ ($i \le j$); I(2) for any set $X$, and for any system of mappings $g_i: X_i \to X$ ($i \in I$) satisfying $g_j \circ \varphi_{ji} = g_i$ ($i \le j$), there exists a unique mapping $f: D \to X$ such that $f \circ f_i = g_i$ ($i \in I$). We call $(D, f_i)$ the inductive limit (or direct limit) of the inductive system $(X_i, \varphi_{ji})$ over $I$, and denote it by $\varinjlim X_i$ or ind lim $X_i$ (more precisely, by $\varinjlim_{i \in I} X_i$ or ind lim$_{i \in I}$ $X_i$).

Suppose, dually, that we are given a set $X_i$ for each $i \in I$ and a mapping $\psi_{ij}: X_j \to X_i$ for each $i \le j$, such that $\psi_{ii} = 1_{X_i}$ and $\psi_{ik} = \psi_{ij} \circ \psi_{jk}$ ($i \le j \le k$). Then we denote the system by $(X_i, \psi_{ij})$ and call it a projective system (or inverse system) of sets over $I$. Let $P$ be the subset of the Cartesian product $\prod X_i$ defined by $P = \{(x_i) \mid \psi_{ij}(x_j) = x_i\ (i \le j)\}$, and let $p_i: P \to X_i$ be the canonical mappings. Then we have P(1) $\psi_{ij} \circ p_j = p_i$ ($i \le j$); P(2) for any set $X$, and for any system of mappings $q_i: X \to X_i$ satisfying $\psi_{ij} \circ q_j = q_i$ ($i \le j$), there exists a unique mapping $p: X \to P$ such that $p_i \circ p = q_i$ ($i \in I$). We call $(P, p_i)$ the projective limit (or inverse limit) of the projective system $(X_i, \psi_{ij})$ over $I$ and denote it by $\varprojlim X_i$ or proj lim $X_i$.
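The construction of the projective limit as a subset of the Cartesian product can be tried out on a small finite system. The following is a minimal sketch, not taken from the article: it assumes the index set $I = \{1, 2, 3\}$, the sets $X_n = \mathbf{Z}/2^n\mathbf{Z}$, and reduction modulo $2^i$ as the mapping $\psi_{ij}$. The function names `psi`, `projective_limit`, and `p` are chosen here for illustration only.

```python
from itertools import product

def psi(i, j, x):
    """Assumed transition map psi_ij : Z/2^j -> Z/2^i (reduction mod 2^i), for i <= j."""
    if i > j:
        raise ValueError("psi_ij is only defined for i <= j")
    return x % (2 ** i)

def projective_limit(levels):
    """All families (x_1, ..., x_levels) with psi_ij(x_j) == x_i for every i <= j."""
    index = range(1, levels + 1)
    candidates = product(*(range(2 ** n) for n in index))
    return [
        fam for fam in candidates
        if all(psi(i, j, fam[j - 1]) == fam[i - 1]
               for j in index for i in range(1, j + 1))
    ]

def p(i, family):
    """Canonical mapping p_i : P -> X_i (projection onto the i-th coordinate)."""
    return family[i - 1]

P = projective_limit(3)
print(len(P))   # number of compatible families
print(P[:3])    # a few elements of P
```

Because this index set has a largest element, every compatible family is determined by its last coordinate, so $P$ has as many elements as $X_3$. Letting the number of levels grow without bound gives the ring of 2-adic integers, which is the kind of profinite example the article refers to.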

Note that we may replace $I$ by any cofinal subset of $I$ without changing the limits.

C. The Limit of Groups and of Topological Spaces

If, in the notation of Section B, $X_i$ is a group and $\varphi_{ji}$ ($\psi_{ij}$) is a homomorphism, then we say that $(X_i, \varphi_{ji})$ ($(X_i, \psi_{ij})$) is an inductive (projective) system of groups. The inductive limit (as a set) $D = \varinjlim X_i$ has the structure of a group for which the canonical mappings $f_i$ are homomorphisms. With this group structure, $D$ is called the inductive limit (group) of the inductive system of groups. It satisfies properties I(1) and I(2) with group $X$ and homomorphisms $g_i$ and $f$. Similarly, the projective limit (as a set) $P = \varprojlim X_i$ has a unique group structure such that each $p_i: P \to X_i$ is a homomorphism, namely, that of a subgroup of the direct product group $\prod X_i$. The group $P$ is called the projective limit (group) of the projective system of groups. When each $X_i$ is a module over a fixed ring $A$, we get entirely similar results by considering $A$-homomorphisms instead of group homomorphisms.

Next, let $X_i$ be a topological space and $\varphi_{ji}$ and $\psi_{ij}$ be continuous mappings. Then $(X_i, \varphi_{ji})$ ($(X_i, \psi_{ij})$) is called an inductive (projective) system of topological spaces. If we introduce in $D = \varinjlim X_i$ the topology of a quotient space of the †topological direct sum of the spaces $X_i$ ($i \in I$), then the $f_i$ are continuous, and I(1) and I(2) hold with sets and mappings replaced by topological spaces and continuous mappings. Similarly, if we view $P = \varprojlim X_i$ as a subspace of the †product space $\prod X_i$, then the $p_i$ are continuous and P(1) and P(2) hold with the same modification as before. The spaces $D$ and $P$ are called the inductive limit (space) and the projective limit (space) of the system of topological spaces, respectively. The projective limit of Hausdorff (compact) spaces is also Hausdorff (compact).

Furthermore, if each $X_i$ ($i \in I$) is a topological group and $\varphi_{ji}$, $\psi_{ij}$ are continuous homomorphisms, then $\varinjlim X_i$ and $\varprojlim X_i$ are topological groups, and properties I(1), I(2), P(1), and P(2) are satisfied for topological groups and continuous homomorphisms (→ 423 Topological Groups). In particular, projective limits of finite groups are †totally disconnected compact groups and are called profinite groups; they occur, e.g., as the ring of †$p$-adic integers and as the †Galois group of an infinite †Galois extension. Conversely, the †germs of continuous functions at a point $x$ in a topological space $X$, and other kinds of germs (→ 383 Sheaves), are important examples of inductive limits of groups.

D. Limits in a Category

Let $I$ be a preordered set and $\mathcal{C}$ a category. If we are given an object $X_i$ of $\mathcal{C}$ for each $i \in I$ and a †morphism $\varphi_{ji}: X_i \to X_j$ of $\mathcal{C}$ for each pair $(i, j)$ of elements of $I$ with $i \le j$, and if the conditions $\varphi_{ii} = 1_{X_i}$, $\varphi_{ki} = \varphi_{kj} \circ \varphi_{ji}$ ($i \le j \le k$) are satisfied, then we call the system $(X_i, \varphi_{ji})$ an inductive system over $I$ in the category $\mathcal{C}$. A projective system in $\mathcal{C}$ is defined dually: It is an inductive system in the †dual category $\mathcal{C}^\circ$. If we view $I$ as a category (→ 52 Categories and Functors B), then an inductive (projective) system in $\mathcal{C}$ over the index set $I$ is a †covariant (†contravariant) functor from $I$ to $\mathcal{C}$. Now if an object $D \in \mathcal{C}$ and morphisms $f_i: X_i \to D$ ($i \in I$) satisfy conditions I(1) and I(2) with the modification that $X$ is an object and $g_i$, $f$ are morphisms in $\mathcal{C}$, then the system $(D, f_i)$ is called the inductive limit of $(X_i, \varphi_{ji})$ and is denoted by $\varinjlim X_i$. Similarly, if an object $P \in \mathcal{C}$ and morphisms $p_i: P \to X_i$ ($i \in I$) satisfy P(1) and P(2) with a similar modification, then $(P, p_i)$ is called the projective limit of $(X_i, \psi_{ij})$ and is written $\varprojlim X_i$. By I(2) and P(2), these limits are unique if they exist.

In the categories of sets, of groups, of modules, and of topological spaces, inductive and projective limits always exist. Note that if the ordering of $I$ is such that $i \le j$ implies $i = j$, i.e., if there is no ordering between two distinct elements of $I$, then the inductive (projective) limit is the †direct sum (†direct product) (→ 52 Categories and Functors E).

Let $(X_i, \varphi_{ji})$, $(X_i', \varphi_{ji}')$ be two inductive systems over the same index set $I$, and let $\varphi_i: X_i \to X_i'$ ($i \in I$) be morphisms satisfying $\varphi_{ji}' \circ \varphi_i = \varphi_j \circ \varphi_{ji}$ ($i \le j$). Then the system $(\varphi_i)$ is called a morphism between the inductive systems. Such a morphism is a †natural transformation between the inductive systems viewed as functors $I \to \mathcal{C}$. If $\varinjlim X_i$ and $\varinjlim X_i'$ exist, then $(\varphi_i)$ induces a morphism $\varinjlim \varphi_i: \varinjlim X_i \to \varinjlim X_i'$ in a natural way, and similarly for projective limits.

For the more abstract notion of limit of a functor → [5]. For the theory of procategories → [4].

References

[1] S. Lefschetz, Algebraic topology, Amer. Math. Soc. Colloq. Publ., 1942.
[2] S. Eilenberg and N. Steenrod, Foundations of algebraic topology, Princeton Univ. Press, 1952.
[3] H. Cartan and S. Eilenberg, Homological algebra, Princeton Univ. Press, 1956.
[4] A. Grothendieck, Technique de descente et théorèmes d'existence en géométrie algébrique II. Le théorème d'existence en théorie formelle

des modules, Sém. Bourbaki, Exposé 195, 1959/1960 (Benjamin, 1966).
[5] M. Artin, Grothendieck topology, Lecture notes, Harvard Univ. 1962.

211 (X.3)
Inequalities

A. General Remarks

In this article we consider various properties of inequalities between real numbers. An inequality that holds for every real number (e.g., $x^2 \ge 0$) is called an absolute inequality; otherwise it is called a conditional inequality. When we are given a conditional inequality (e.g., $x(x - 1) < 0$), the set of all real numbers that satisfy it is called the solution of the inequality. The process of obtaining a solution is called solving the conditional inequality.

B. Solution of a Conditional Inequality

Suppose that a conditional inequality is given by $f(x) > 0$ (or $f(x) \ge 0$), where $f$ is a continuous function defined for every real number. If the equation $f(x) = 0$ has no solution, then we have either $f(x) > 0$ or $f(x) < 0$ for all $x$. On the other hand, if $\alpha$ and $\beta$ ($\alpha < \beta$) are adjacent roots of the equation $f(x) = 0$, the sign of $f(x)$ is unchanged in the open interval $(\alpha, \beta)$. Therefore the solution of the given inequality depends essentially on the solution of the equation $f(x) = 0$. If inequalities involve two variables $x$, $y$ and are given by $f(x, y) > 0$, $g(x, y) > 0$ for continuous functions $f$ and $g$, the solution is, in general, a domain in the $xy$-plane bounded by the curves $f(x, y) = 0$ and $g(x, y) = 0$. Similar results hold for the case of inequalities involving more than two variables.

C. Famous Absolute Inequalities

(1) Inequalities concerning means (or averages): Suppose that we are given an $n$-tuple $a = (a_1, \ldots, a_n)$, $a_\nu \ge 0$. We set

$$M_r = M_r(a) = \Bigl( \frac{1}{n} \sum_{\nu=1}^{n} a_\nu^r \Bigr)^{1/r}.$$

If at least one $a_\nu$ is 0 and $r < 0$, we put $M_r = 0$. In particular, we put

$$A = M_1, \qquad G = M_0 = \lim_{r \to 0} M_r = \Bigl( \prod_{\nu=1}^{n} a_\nu \Bigr)^{1/n}, \qquad H = M_{-1};$$

these are called the arithmetic mean, geometric mean, and harmonic mean of $a_\nu$ ($\nu = 1, \ldots, n$), respectively. Except when either all $a_\nu$ are identical or some $a_\nu$ is 0 and $r \le 0$, the function $M_r$ increases †strictly monotonically as $r$ increases, and $M_r \to \min a_\nu$ ($r \to -\infty$), $M_r \to \max a_\nu$ ($r \to +\infty$). Therefore we always have $\min a_\nu < M_r < \max a_\nu$. In particular, we have $H < G < A$ if the $a_\nu$ are all positive and not all equal.

Let $p(x)$ ($> 0$), $f(x)$ ($\ge 0$) be †integrable functions on a †measurable set $E$. We put

$$M_r(f) = \Bigl( \frac{\int_E p f^r\, dx}{\int_E p\, dx} \Bigr)^{1/r} \qquad (r \ne 0).$$

Furthermore, if $M_r(f)$ is strictly positive for some $r > 0$, we put

$$M_0(f) = \lim_{r \to +0} M_r(f) = \exp \Bigl( \frac{\int_E p \log f\, dx}{\int_E p\, dx} \Bigr).$$

We call $M_r(f)$ the mean of degree $r$ of the function $f(x)$ with respect to the weight function $p(x)$. It has properties similar to those of $M_r(a)$. In particular, when the weight function $p = 1$, the means $M_1(f)$, $M_0(f)$, $M_{-1}(f)$ are called the arithmetic mean, geometric mean, and harmonic mean of $f$, respectively.

(2) The Hölder inequality: Suppose that $p \ne 0, 1$ and $(p - 1)(q - 1) = 1$; that is, $1/p + 1/q = 1$, and $a_\nu > 0$, $b_\nu > 0$. Then, in general, we have the Hölder inequality:

$$\sum_\nu a_\nu b_\nu \gtrless \Bigl( \sum_\nu a_\nu^p \Bigr)^{1/p} \Bigl( \sum_\nu b_\nu^q \Bigr)^{1/q},$$

where the inequality sign is taken to be $>$ or $<$ in accordance with $p < 1$ or $p > 1$. The summation may be infinite if the sums are convergent. The inequality sign is replaced by the equality sign if and only if there exist constant factors $\lambda$ and $\mu$ such that $\lambda a_\nu^p = \mu b_\nu^q$ for all $\nu$. The Hölder inequality for $p = q = 2$ is called the Cauchy inequality (or Cauchy-Schwarz inequality).

For two measurable positive functions $f(x)$, $g(x)$, we have the Hölder integral inequality:

$$\int_E f g\, dx \gtrless \Bigl( \int_E f^p\, dx \Bigr)^{1/p} \Bigl( \int_E g^q\, dx \Bigr)^{1/q} \qquad (p \lessgtr 1),$$

except when there exist two constant factors $\lambda$ and $\mu$ ($(\lambda, \mu) \ne (0, 0)$) such that $\lambda f^p = \mu g^q$ holds †almost everywhere. The above inequality is replaced by equality if and only if we are in this exceptional case. The case where $p = q = 2$ is called the Schwarz inequality (or Bunyakovskiĭ inequality).

(3) The Minkowski inequality: Suppose that $p \ne 0, 1$ and $a_\nu > 0$, $b_\nu > 0$. Then we have the

Minkowski inequality:

$$\Bigl( \sum_\nu (a_\nu + b_\nu)^p \Bigr)^{1/p} \gtrless \Bigl( \sum_\nu a_\nu^p \Bigr)^{1/p} + \Bigl( \sum_\nu b_\nu^p \Bigr)^{1/p}, \qquad p \lessgtr 1,$$

except when $\{a_\nu\}$ and $\{b_\nu\}$ are proportional. The inequality is replaced by equality if and only if we are in the exceptional case.

The corresponding integral inequality for positive functions $f(x)$, $g(x)$ is

$$\Bigl( \int_E (f + g)^p\, dx \Bigr)^{1/p} \gtrless \Bigl( \int_E f^p\, dx \Bigr)^{1/p} + \Bigl( \int_E g^p\, dx \Bigr)^{1/p}, \qquad p \lessgtr 1,$$

except when $f(x)/g(x)$ is constant almost everywhere. The inequality is replaced by equality if and only if we are in the exceptional case.

D. Related Topics

Absolute inequalities are important in analysis, especially in connection with techniques to prove convergence or for error estimates. However, there seldom are general principles for deriving such inequalities, except for a few elementary theorems.

For other famous inequalities → Appendix A, Table 8. For related topics → 88 Convex Analysis; for convex functions and their applications → 255 Linear Programming; for linear inequalities → 212 Inequalities in Physics.

References

[1] G. H. Hardy, J. E. Littlewood, and G. Pólya, Inequalities, Cambridge Univ. Press, first edition, 1934; revised edition, 1952.
[2] V. I. Levin and S. B. Stechkin (Stečkin), Inequalities, Amer. Math. Soc. Transl., (2) 14 (1960), 1-29 (English translation of the appendix to the Russian translation of [1]).
[3] E. F. Beckenbach and R. Bellman, Inequalities, Erg. Math., Springer, second revised printing, 1965.
[4] E. F. Beckenbach and R. Bellman, An introduction to inequalities, Random House, 1961.
[5] N. D. Kazarinoff, Geometric inequalities, Random House, 1961.
[6] N. D. Kazarinoff, Analytic inequalities, Holt, Rinehart and Winston, 1961.
[7] W. Walter, Differential- und Integral-Ungleichungen, Springer, 1964.
[8] O. Shisha (ed.), Inequalities, Academic Press, I, 1967; II, 1970; III, 1972.

212 (XX.36)
Inequalities in Physics

A. Correlation Inequalities

Let $\mu$ be a probability measure on a space $X$ (with $\sigma$-field $\mathcal{B}$) and $f_j$ be measurable functions on $X$, and write

$$\langle f_1 \cdots f_n \rangle = \int_X f_1(x) \cdots f_n(x)\, d\mu(x).$$

A number of inequalities among such expectations under a variety of conditions on the measure $\mu$ and functions $f_j$ are known as correlation inequalities, after their original occurrence for correlation functions in the statistical mechanics of lattice gases.

The earliest results were obtained by Griffiths (J. Math. Phys., 8 (1967)) for the case where $X$ is the product $\prod_{i=1}^{N} I_i$ of two-point sets $I_i = \{1, -1\}$, the $f$ are $k$th coordinate functions $\sigma_k = \sigma_k(x)$ ($= \pm 1$) of $x \in X$, $d\mu(x)$ is the counting measure multiplied by a Gibbs factor $Z^{-1} e^{-\beta H(x)}$ with a ferromagnetic Hamiltonian

$$H(x) = -\sum_{i<j} J_{ij}\, \sigma_i(x) \sigma_j(x), \qquad J_{ij} = J_{ji} \ge 0,$$

$\beta > 0$, and the normalization factor $Z = \sum_x e^{-\beta H(x)}$. His conclusions are

$$\langle \sigma_k \sigma_l \rangle \ge 0 \quad \text{(Griffiths's first inequality)},$$

$$\langle \sigma_k \sigma_l \sigma_m \sigma_n \rangle \ge \langle \sigma_k \sigma_l \rangle \langle \sigma_m \sigma_n \rangle \quad \text{(Griffiths's second inequality)}.$$

These were extended by Kelly and Sherman (J. Math. Phys., 9 (1968)) as

$$\langle \sigma_A \rangle \ge 0 \quad \text{(GKS first inequality)},$$

$$\langle \sigma_A \sigma_B \rangle \ge \langle \sigma_A \rangle \langle \sigma_B \rangle \quad \text{(GKS second inequality)},$$

where $A$ and $B$ are subsets of $\{1, \ldots, N\}$, $\sigma_A = \prod_{i \in A} \sigma_i$, and $H = -\sum_{A \subset \{1, \ldots, N\}} J_A \sigma_A$ with $J_A \ge 0$. A further generalization (for example, $I_i = \mathbf{R}$) can be found in [1, 2].

Under the same situation with

$$H = -\sum_{i<j} J_{ij}\, \sigma_i \sigma_j - \sum_{i=1}^{N} h_i \sigma_i, \qquad J_{ij} \ge 0, \quad h_i \ge 0,$$

the following GHS inequality by Griffiths, Hurst, and Sherman (J. Math. Phys., 11 (1970)) holds:

$$\langle \sigma_j \sigma_k \sigma_l \rangle - \langle \sigma_j \rangle \langle \sigma_k \sigma_l \rangle - \langle \sigma_k \rangle \langle \sigma_j \sigma_l \rangle - \langle \sigma_l \rangle \langle \sigma_j \sigma_k \rangle + 2 \langle \sigma_j \rangle \langle \sigma_k \rangle \langle \sigma_l \rangle \le 0.$$
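Griffiths's first and second inequalities can be checked by brute force on a very small spin system. The sketch below is an illustration under stated assumptions rather than anything from the article: the four-spin coupling matrix `J`, the value `beta = 0.8`, and the helper name `expectation` are all hypothetical choices, and the expectation is computed by summing over all $2^4$ configurations with the Gibbs weight $e^{-\beta H}$.

```python
from itertools import product
from math import exp

def expectation(A, J, beta):
    """<sigma_A> under Z^{-1} exp(-beta H), with H = -sum_{i<j} J[i][j] s_i s_j."""
    n = len(J)
    Z = 0.0
    weighted = 0.0
    for s in product((1, -1), repeat=n):
        H = -sum(J[i][j] * s[i] * s[j] for i in range(n) for j in range(i + 1, n))
        w = exp(-beta * H)          # unnormalized Gibbs weight
        sigma_A = 1
        for i in A:                 # sigma_A = product of the spins indexed by A
            sigma_A *= s[i]
        Z += w
        weighted += w * sigma_A
    return weighted / Z

# Hypothetical ferromagnetic couplings (upper triangle only, all J_ij >= 0).
J = [[0.0, 1.0, 0.5, 0.0],
     [0.0, 0.0, 0.3, 0.2],
     [0.0, 0.0, 0.0, 0.7],
     [0.0, 0.0, 0.0, 0.0]]
beta = 0.8

first = expectation((0, 1), J, beta)
second_gap = (expectation((0, 1, 2, 3), J, beta)
              - expectation((0, 1), J, beta) * expectation((2, 3), J, beta))
print(first, second_gap)  # both should be nonnegative for J_ij >= 0
```

Exhaustive enumeration is only practical for a handful of spins, but it makes the role of the sign condition $J_{ij} \ge 0$ easy to probe: replacing a coupling by a negative value is a quick way to see that the inequalities are not guaranteed without it.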

Further generalizations can be found in [2, 3], and the references quoted therein.

Let $X$ be a finite distributive lattice, $\mu$ be a positive measure satisfying the condition $\mu(x \wedge y)\mu(x \vee y) \ge \mu(x)\mu(y)$, and $f$ and $g$ be both increasing or both decreasing functions on $X$. Then the following FKG inequality by Fortuin, Kasteleyn, and Ginibre (Comm. Math. Phys., 22 (1971)) holds:

$$\langle f g \rangle \ge \langle f \rangle \langle g \rangle.$$

B. Inequalities Involving Traces of Matrices

Let $\operatorname{tr} A$ denote the trace of a matrix $A$. Some of the earlier results applied to statistical mechanics are as follows, where $\rho \ge 0$, $\sigma \ge 0$, $A^* = A$, $B^* = B$:

$$\operatorname{tr}(\rho \log \rho - \rho \log \sigma - \rho + \sigma) \ge 0 \quad \text{(Klein inequality)},$$

$$\operatorname{tr}(e^{A+B}) / \operatorname{tr} e^{A} \ge \exp(\operatorname{tr} e^{A} B / \operatorname{tr} e^{A}) \quad \text{(Peierls-Bogolyubov inequality)},$$

$$\operatorname{tr}(e^{A} e^{B}) \ge \operatorname{tr}(e^{A+B}) \quad \text{(Golden-Thompson inequality)}.$$

The following inequality is related to the last inequality and was proved for a general $m > 0$ by Lieb and Thirring (Studies in Mathematical Physics, Lieb, Simon, and Wightman (eds.), Princeton Univ. Press, 1976):

$$\operatorname{tr}(\rho\sigma)^m \le \operatorname{tr}(\rho^m \sigma^m), \qquad \rho \ge 0, \quad \sigma \ge 0.$$

For the Hilbert-Schmidt norm $\|A\|_{\mathrm{H.S.}} = (\operatorname{tr} A^*A)^{1/2}$ and the trace-class norm $\|A\|_{\mathrm{tr}} = \operatorname{tr}|A|$, where $|A| = (A^*A)^{1/2}$, the Powers-Størmer inequality (Comm. Math. Phys., 16 (1970)) holds:

$$\|\rho^{1/2} - \sigma^{1/2}\|_{\mathrm{H.S.}}^2 \le \|\rho - \sigma\|_{\mathrm{tr}}, \qquad \rho \ge 0, \quad \sigma \ge 0.$$

Araki and Yamagami (Comm. Math. Phys., 81 (1981)) give

$$\bigl\| |C| - |D| \bigr\|_{\mathrm{H.S.}} \le \sqrt{2}\, \|C - D\|_{\mathrm{H.S.}}.$$

If $C^* = C$ and $D^* = D$, then $\sqrt{2}$ can be removed.

The entropy $S(\rho) = -\operatorname{tr} \rho \log \rho$ is a concave function of $\rho \ge 0$ satisfying the triangular inequality (Araki and Lieb, Comm. Math. Phys., 18 (1970)):

$$|S(\rho_1) - S(\rho_2)| \le S(\rho_{12}) \le S(\rho_1) + S(\rho_2)$$

and strong subadditivity (Lieb and Ruskai, J. Math. Phys., 14 (1973)):

$$S(\rho_{123}) + S(\rho_2) \le S(\rho_{12}) + S(\rho_{23}),$$

where $\rho_{123}$ is a matrix on the tensor product space $H_1 \otimes H_2 \otimes H_3$, $\rho_{jk} = \operatorname{tr}_i \rho_{123}$, $\rho_i = \operatorname{tr}_j \operatorname{tr}_k \rho_{123}$ ($\{i, j, k\} = \{1, 2, 3\}$) and $\operatorname{tr}_j$ is a (partial) trace taken only on the space $H_j$. The last inequality is based on the concavity of the function $f(\rho) = \operatorname{tr} \exp(A + \log \rho)$ in $\rho \ge 0$ ($A^* = A$) proved by Lieb [4] (also see Epstein, Comm. Math. Phys., 31 (1973)), who also proved the joint concavity of $\operatorname{tr}(C^* \rho^r C \sigma^s)$ ($r \ge 0$, $s \ge 0$, $r + s \le 1$) in $\rho \ge 0$ and $\sigma \ge 0$. This concavity for the case $r = s = 1/2$ was previously proved by Wigner and Yanase (Proc. Nat. Acad. Sci. US, 49 (1963)), and the general case has been conjectured by Wigner, Yanase, and Dyson. It leads to the joint convexity of the relative entropy $S(\rho, \sigma) = \operatorname{tr} \rho(\log \rho - \log \sigma)$ (defined to be $+\infty$ if $\sigma\psi = 0$ and $\rho\psi \ne 0$ for some vector $\psi$) in $\rho \ge 0$ and $\sigma \ge 0$ as well as its monotonicity $S(\alpha(\rho), \alpha(\sigma)) \le S(\rho, \sigma)$ for any trace-preserving expectation mapping $\alpha$. (For entropy, see also [5].)

The above results have generalizations in the context of von Neumann algebras (Araki, Publ. Res. Inst. Math. Sci., 11 (1975); 13 (1977); Uhlmann, Comm. Math. Phys., 54 (1977); [5]).

C. Operator Monotone and Operator Convex Functions

A real-valued function $f(x)$ defined on an interval $I$ (finite or infinite; open, half-open, or closed) is called matrix monotone increasing (decreasing) of order $m$ if $f(A) \ge f(B)$ whenever $m \times m$ Hermitian matrices $A$ and $B$ with their eigenvalues contained in $I$ satisfy $A \ge B$ ($A \le B$) and is called operator monotone if it is matrix monotone of order $n$ for all positive integers $n$. $f(x)$ is called matrix convex of order $m$ if $f[(1 - t)A + tB] \le (1 - t) f(A) + t f(B)$ for $0 < t < 1$ and all $m \times m$ Hermitian matrices $A$ and $B$ with their eigenvalues in $I$ and is called operator convex if it is matrix convex of order $n$ for any positive integer $n$. The functions $x^\alpha$ ($0 < \alpha \le 1$), $\log x$, and $-(x + a)^{-1}$ ($a \ge 0$) are all operator monotone increasing in the half-line interval $x > 0$. The functions $(x + a)^{-1}$ and $x \log x$ are operator convex in the same interval.

A function $f(x)$ is operator monotone increasing in an open interval $(a, b)$ if and only if it is analytic in $(a, b)$ and has an analytic continuation to the whole upper half-plane where the function has a nonnegative imaginary part [7]. Another necessary and sufficient condition is

$$\sum_{i,k=0}^{N} \frac{f^{(i+k+1)}(x)\, \xi_i \xi_k}{(i + k + 1)!} \ge 0$$

for all $x \in (a, b)$, for all real $\xi_i$, and for all positive integers $N$. An operator monotone function $f(x)$ on an open interval $(-R, R)$ has the integral representation

$$f(x) = f(0) + f'(0) \int \frac{x}{1 - tx}\, d\mu(t),$$

where $\mu$ is a probability measure with its support contained in $[-R^{-1}, R^{-1}]$ and, if $f$ is not a constant, is uniquely determined by $f$. The set of all extremal points of the set of all operator monotone increasing functions on $(-1, 1)$ satisfying $f(0) = 0$ and $f'(0) = 1$ is exactly the set of functions $x/(1 - tx)$, $|t| \le 1$, which appear as integrands in the above formula. An operator monotone function on $\mathbf{R}$ must be linear.

An operator convex function in an open interval $(-R, R)$ has the integral representation

$$f(x) = f(0) + f'(0)\, x + \frac{f''(0)}{2} \int \frac{x^2}{1 - tx}\, d\mu(t),$$

where $\mu$ is a probability measure with its support contained in $[-R^{-1}, R^{-1}]$ and, if $f$ is not linear, is uniquely determined by $f$. An operator convex function in $\mathbf{R}$ must be at most a quadratic polynomial satisfying $f''(0) \ge 0$.

For a continuous real function on an interval $[0, \alpha)$ ($0 < \alpha \le \infty$), the following four conditions are mutually equivalent: (i) $f$ is operator convex and $f(0) \le 0$, (ii) $f(a^*Aa) \le a^* f(A) a$ for any self-adjoint operator $A$ with its spectrum in $[0, \alpha)$ and any operator $a$ with its norm not exceeding 1, (iii) the preceding condition with $a$ limited to projections, (iv) $x^{-1} f(x)$ is operator monotone increasing in an open interval $(0, \alpha)$. (See [8-11] and Hansen and Pedersen, Jensen's inequality for operators and Löwner's theorem, Math. Ann., 258 (1982) 229-241.)

References

[1] J. Ginibre, General formulation of Griffiths' inequality, Comm. Math. Phys., 16 (1970) 310-328.
[2] B. Simon, Functional integration and quantum physics, Academic Press, 1979, ch. 4.
[3] C. Newman, Gaussian correlation inequalities, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 33 (1975) 75-93.
[4] E. H. Lieb, Convex trace functions and the Wigner-Yanase-Dyson conjecture, Advances in Math., 11 (1973) 267-288.
[5] A. Wehrl, General properties of entropy, Rev. Mod. Phys., 50 (1978) 221-260.
[6] H. Araki, Inequalities in von Neumann algebras, CNRS Publication R.C.P., 25 vols., 22 (1975) 1-25.
[7] K. Löwner, Über monotone Matrixfunktionen, Math. Z., 38 (1934) 177-216.
[8] J. Bendat and S. Sherman, Monotone and convex operator functions, Trans. Amer. Math. Soc., 79 (1955), 58-71.
[9] W. Donoghue, Monotone matrix functions and analytic continuation, Springer, 1974.
[10] C. Davis, Notions generalizing convexity for functions defined on spaces of matrices, Amer. Math. Soc. Proc. Symposia in Pure Math., 7 (1963), 187-201.
[11] F. Kraus, Über konvexe Matrixfunktionen, Math. Z., 41 (1936) 18-42.

213 (XIX.12)
Information Theory

A. General Remarks

The mathematical theory of information transmission in communication systems, first developed by C. E. Shannon [1] and now called information theory, is one of the most important fields of mathematical science. It consists of two main themes, channel coding theory and source coding theory. The purpose of channel coding theory is to ascertain the existence of schemes for transmitting information or data reliably over a noisy channel at a fixed transmission rate. The purpose of source coding theory is to show the existence of schemes for compressing data emitted from an information source and reproducing them within tolerable limits of distortion. Both theories are based profoundly on the theory of probability, statistics, and the theory of stochastic processes.

B. Entropy

In information theory it is customary to call a finite set $A = \{\alpha_1, \ldots, \alpha_n\}$ an alphabet and to call its elements the letters of the alphabet. The simplest model of information sources consisting of an alphabet $A$ and a †probability distribution $p$ over $A$ is denoted by $X = [A, p]$, which can be regarded as a †finite probability space, where the probability of outcome of a letter $\alpha_i \in A$ from the source is denoted by $p_i = p(\alpha_i)$. A real-valued function defined by $I(\alpha_i) = -\log p(\alpha_i)$ is called the self-information of the event $\alpha_i \in A$. The †mean value of the self-information, i.e.,

$$H(X) = \sum_{i=1}^{n} -p(\alpha_i) \log p(\alpha_i),$$

is called the entropy of the source. This can be interpreted as a measure of the average a priori uncertainty as to which letter will emanate from the source, or a measure of the average amount of information one obtains upon receiving a single letter from the source. The unit of information or entropy for base $e$ logarithms is called a nat, while that for base 2 logarithms is called a bit.
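The definitions of self-information and entropy translate directly into a few lines of code. The following is a minimal sketch with function names (`self_information`, `entropy`) chosen here for illustration; it assumes the usual convention that letters of probability 0 contribute nothing to the sum, and it takes the logarithm base as a parameter so that the same function returns bits (base 2) or nats (base $e$).

```python
from math import e, log

def self_information(p, base=2):
    """I(alpha) = -log p(alpha), in units fixed by the logarithm base."""
    return -log(p) / log(base)

def entropy(dist, base=2):
    """H(X) = sum_i -p_i log p_i; terms with p_i == 0 are skipped (0 log 0 := 0)."""
    return sum(p * self_information(p, base) for p in dist if p > 0)

fair_coin = [0.5, 0.5]
biased_coin = [0.9, 0.1]

print(entropy(fair_coin))            # entropy in bits
print(entropy(fair_coin, base=e))    # the same source measured in nats
print(entropy(biased_coin))          # less uncertain than the fair coin
```

A fair coin carries one bit per letter, which is $\log 2$ nats; any biased coin has strictly smaller entropy, in line with the interpretation of $H(X)$ as average a priori uncertainty.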

Let $A = \{\alpha_1, \ldots, \alpha_n\}$ and $B = \{\beta_1, \ldots, \beta_m\}$ be two alphabets. Let $r(\alpha_i, \beta_j)$ be a †joint probability distribution defined on the product $AB$, and denote the †probability space by $XY = [AB, r]$. Then the joint distribution gives rise to †marginal distributions $p(\alpha_i) = \sum_{j=1}^{m} r(\alpha_i, \beta_j)$ and $q(\beta_j) = \sum_{i=1}^{n} r(\alpha_i, \beta_j)$ and to †conditional distributions $p(\alpha_i \mid \beta_j) = r(\alpha_i, \beta_j)/q(\beta_j)$ and $q(\beta_j \mid \alpha_i) = r(\alpha_i, \beta_j)/p(\alpha_i)$. Probability spaces $X = [A, p]$ and $Y = [B, q]$ are subspaces of the probability space $XY = [AB, r]$. The entropy of $XY = [AB, r]$ is defined by

$$H(XY) = \sum_{i=1}^{n} \sum_{j=1}^{m} -r(\alpha_i, \beta_j) \log r(\alpha_i, \beta_j)$$

as well. A real-valued function $I(\alpha_i \mid \beta_j) = -\log p(\alpha_i \mid \beta_j)$ is called the conditional self-information. The average of $I(\alpha_i \mid \beta_j)$ over $\alpha_i$ defined by

$$H(X \mid \beta_j) = \sum_{i=1}^{n} p(\alpha_i \mid \beta_j)\, I(\alpha_i \mid \beta_j)$$

is called the conditional entropy of $X$ for given $\beta_j \in B$. The average of $H(X \mid \beta_j)$ over $\beta_j$ defined by

$$H(X \mid Y) = \sum_{j=1}^{m} q(\beta_j)\, H(X \mid \beta_j)$$

is called the conditional entropy of $X$ for given $Y$. The conditional entropy of $Y$ for given $X$, denoted by $H(Y \mid X)$, is defined similarly. Then it can be shown that

$$H(XY) = H(Y) + H(X \mid Y) = H(X) + H(Y \mid X),$$

$$H(XY) \le H(X) + H(Y),$$

$$H(X \mid Y) \le H(X),$$

$$H(Y \mid X) \le H(Y),$$

where equalities in the last three expressions hold if and only if $r(\alpha_i, \beta_j) = p(\alpha_i) q(\beta_j)$ for all $\alpha_i \in A$ and $\beta_j \in B$.

C. Information Sources

Given a finite alphabet $A$, we consider the finite product set $A^N = A \times \cdots \times A$ and the doubly infinite product $A^{\mathbf{Z}} = \prod_{k=-\infty}^{\infty} A_k$, where $A_k = A$, $k = 0, \pm 1, \pm 2, \ldots$. Let $\mathcal{F}_A$ be the $\sigma$-algebra generated by all †cylinder sets in $A^{\mathbf{Z}}$. Given a †probability measure $P$ over $\mathcal{F}_A$, an information source is defined as a probability space $[A^{\mathbf{Z}}, P]$ or as a †random process $X = \ldots X_{-1} X_0 X_1 X_2 \ldots$. When $P$ is invariant under the †shift transformation $T$ on $A^{\mathbf{Z}}$, i.e., $P(E) = P(TE)$ for any $E \in \mathcal{F}_A$, then $[A^{\mathbf{Z}}, P]$ is said to be a stationary source. In particular, if $P(E) = 0$ or 1 whenever $TE = E \in \mathcal{F}_A$, then the information source is said to be ergodic. If $X$ is an †independently and identically distributed random process, the source $X = [A^{\mathbf{Z}}, P]$ is said to be memoryless.

For a given $X$, denote the subsequence $X_1 \cdots X_N$ of $X$ by $X^N$. Then a probability measure $P^N$ on $A^N$ and a finite probability space $X^N = [A^N, P^N]$ are naturally induced. The entropy of the stationary information source $X = [A^{\mathbf{Z}}, P]$ is defined as $H(X) = \lim_{N \to \infty} H(X^N)/N$ or as $H(X) = \lim_{N \to \infty} H(X_N \mid X^{N-1})$, because both limits exist and are identical. If $X$ is memoryless, the entropy of $X$ is equivalent to that of $X^1 = [A^1, P^1]$, i.e., $H(X) = \sum_{i=1}^{n} -p_i \log p_i$, where $p_i = P^1(X_1 = \alpha_i)$ and $H(X^N) = N H(X^1) = N H(X)$.

D. Source Coding Theorem

Let $x^N = x_1 \cdots x_N$ be a sequence of $N$ consecutive letters from the source $[A^{\mathbf{Z}}, P]$. Suppose that we wish to encode such sequences into fixed-length code words $u^L = u_1 \cdots u_L$ consisting of $L$ letters from a code alphabet $U$ of size $v$. The number $R = (\log v^L)/N$ is called the coding rate per source letter. A mapping $\varphi: A^N \to U^L$ is called an encoder and $\psi: U^L \to A^N$ a decoder. The set $U^L$ with specified encoder-decoder pair $[\varphi, \psi]$ is called a code with rate $R = (L \log v)/N$. The error probability of the code is defined by

$$P_e[\varphi, \psi] = P^N\{x^N : \psi(\varphi(x^N)) \ne x^N\}.$$

The fixed-length source coding theorem [1, 2] states: Let $X = [A^{\mathbf{Z}}, P]$ be a stationary ergodic source. Then for any $\delta > 0$, if $R \ge H(X) + \delta$, there exists a code $[\varphi, \psi]$ with rate $R$ such that the error probability $P_e[\varphi, \psi]$ can be made arbitrarily small by making $N$ sufficiently large. Conversely, if $R \le H(X) - \delta$ then for any code with rate $R$, $P_e[\varphi, \psi]$ must become arbitrarily close to 1 as $N \to \infty$.

This theorem is trivial if $R \ge \log n$. For memoryless sources, the theorem follows immediately from the †weak law of large numbers, which implies that for arbitrary $\varepsilon > 0$ and $\delta > 0$ there exists an integer $N_0(\varepsilon, \delta)$ such that for all $N \ge N_0(\varepsilon, \delta)$

$$P^N\Bigl\{ x^N : \Bigl| \frac{-\log P^N(x^N)}{N} - H(X) \Bigr| > \delta \Bigr\} < \varepsilon.$$

The validity of this property for stationary ergodic sources was proved by B. McMillan [2]. In case of memoryless sources, the exact asymptotic form of the error probability for an optimal code with rate $R$ was given by F. Jelinek [3], I. Csiszár and G. Longo [4], Blahut [5], and Longo and A. Sgarro [6] as

$$\lim_{N \to \infty} -\frac{1}{N} \log P_e = \min_{q:\, H(q) \ge R} D(q \,\|\, p),$$

where $p$ is the source distribution, $q$ denotes a
213 E 812
Information Theory

probability distribution on $A$, and

$$D(q \,\|\, p) = \sum_{i=1}^{n} q_i \log \frac{q_i}{p_i},$$

which is called Kullback's discrimination information [7] or the divergence.

Source sequences $x^N$ can be encoded into variable-length code words consisting of letters from alphabet $U$ of size $v$. A set of $n^N$ code words is called a prefix condition code if there is no code word which is equivalent to the prefix of any other code word. Denote the length of the code word corresponding to a source sequence $x^N$ by $L(x^N)$ and the average length of the code words per source letter by

$$\bar{L}_N = \frac{1}{N}\sum_{x^N} P^N(x^N)\, L(x^N).$$

The variable length source coding theorem states: Given a memoryless source $X$ with entropy $H(X)$, there is a prefix condition code such that the average length of the code words per source letter satisfies

$$H(X) \le \bar{L}_N \log v < H(X) + \frac{\log v}{N}.$$

This theorem is valid for stationary ergodic sources if $(\log v)/N$ in the last term is replaced by $\varepsilon(N)$, where $\varepsilon(N) \to 0$ as $N \to \infty$. These two theorems are referred to as noiseless source coding theorems.

E. Source Coding Theorem with Fidelity Criterion

The noiseless source coding theorem implies that the average number of code letters per source letter can be reduced to the source entropy $H(X)$ under the requirement that the source sequence be exactly reproduced from the encoded sequence. If an approximate reproduction of the source sequence to within a given fidelity criterion is required, the coding rate per source letter must be reduced further to a certain value below the source entropy. Suppose that a distortion measure $d(\alpha_i, \alpha_j)$ is defined for $\alpha_i, \alpha_j \in A$, where it is assumed that $d(\alpha_i, \alpha_j) \ge 0$ and $d(\alpha_i, \alpha_i) = 0$. For blocks $x^N = x_1 \dots x_N$ and $y^N = y_1 \dots y_N$, define

$$d_N(x^N, y^N) = \frac{1}{N}\sum_{k=1}^{N} d(x_k, y_k).$$

Any set $\mathcal{C} = \{y_1^N, \dots, y_M^N\}$, $y_m^N \in A^N$, of reproducing words is called a source code of block length $N$. Each source sequence $x^N$ of the source $X = [A^Z, P]$ is mapped into whichever code word $y_m^N \in \mathcal{C}$ minimizes $d_N(x^N, y^N)$, i.e.,

$$d_N(x^N \mid \mathcal{C}) = \min_{y^N \in \mathcal{C}} d_N(x^N, y^N),$$

and the suffix $m$ is transmitted. Hence the coding rate per source letter is $R = (\log M)/N$, and the average distortion is

$$d_N(\mathcal{C}) = \sum_{x^N} P^N(x^N)\, d_N(x^N \mid \mathcal{C}).$$

The problem is how far we can reduce the rate under the condition that the average distortion keeps satisfying a given fidelity criterion, which is specified as a maximum tolerable value $d$ for the average distortion.

The source coding theorem with a fidelity criterion states: Let $X = [A^Z, P]$ be a memoryless source. For any specified $d \ge 0$, any $\varepsilon > 0$, and $\delta > 0$, there exists a source code $\mathcal{C}$ with rate $R > R(d) + \delta$ and with sufficiently large block length $N$ for which the average distortion satisfies

$$d_N(\mathcal{C}) \le d + \varepsilon,$$

where $R(d)$ is the rate distortion function defined by

$$R(d) = \min_{W \in \mathcal{W}_d} I(p; W),$$
$$I(p; W) = \sum_{i=1}^{n}\sum_{j=1}^{n} p_i W(j \mid i) \log \frac{W(j \mid i)}{\sum_{i=1}^{n} p_i W(j \mid i)},$$
$$\mathcal{W}_d = \Big\{ W : \sum_{i=1}^{n}\sum_{j=1}^{n} p_i W(j \mid i)\, d(\alpha_i, \alpha_j) \le d \Big\},$$

where $W(j \mid i)$ denotes a conditional probability distribution referred to as the test channel, and where $I(p; W)$ is called the mutual information. It should be noted that $R(0) = H(X)$. The rate distortion function $R(d)$, which was first defined by Shannon [1, 8], is closely related to the †ε-entropy introduced by A. N. Kolmogorov [9]. $R(d)$ is a monotonically decreasing and convex function.

The theorem was first proved by Shannon [8], and was extended by R. G. Gallager [10] to the case of stationary ergodic sources with discrete alphabets and by T. Berger [11] to stationary ergodic sources with abstract alphabets. More recently, R. M. Gray and L. D. Davisson [12] have proved source coding theorems without the ergodic assumption for stationary sources subject to a fidelity criterion. The proof was based on Rokhlin's ergodic decomposition theorem [13]. Other important source coding theorems have been obtained by F. Jelinek [14] for tree codes, A. J. Viterbi and J. K. Omura [15] for trellis codes, and Gray, D. L. Neuhoff, and D. S. Ornstein [16] for sliding block codes.

The rate distortion function for memoryless Gaussian sources subject to squared error distortion was given by Shannon [8], and that for autoregressive Gaussian sources was determined by Kolmogorov [9], M. S. Pinsker

[17], Berger [11], Gray [18], and T. Hashimoto and S. Arimoto [19].

F. Channel Coding Theory

A mathematical model for a channel over which information is transmitted is specified in terms of the set of possible inputs, the set of outputs, and a probability measure on the output events conditional on each input. The simplest channels are the noiseless ones, for which there is a one-to-one correspondence between input and output and no loss of information in transmission through the channel. The second simplest channels are discrete memoryless channels (DMCs), which are defined as follows: The input and the output are sequences of letters from finite alphabets (say, $\alpha_i \in A$ and $\beta_j \in B$), and the output letter at a given time depends statistically only on the corresponding input letter. That is, a DMC is characterized by a fixed conditional probability distribution $W(j \mid i) = W(\beta_j \mid \alpha_i)$ because the probability measure on the input and output sequences satisfies

$$W^N(y^N \mid x^N) = \prod_{k=1}^{N} W(y_k \mid x_k).$$

By $\Omega = \{1, \dots, M\}$ we denote the set of integers each of which is assigned to each corresponding possible message from the source. A mapping $\varphi: \Omega \to A^N$ induces a collection $\mathcal{C} = \{x_1^N, \dots, x_M^N\}$, called the block code with rate $R = (\log M)/N$; each element is called a code word. Only code words are transmitted over the channel. A mapping $\psi: B^N \to \Omega$ is called the decoding. Thus, given an encoding and decoding pair $[\varphi, \psi]$, the error probability is defined by

$$P_e[\varphi, \psi] = \frac{1}{M}\sum_{m=1}^{M} {\sum}^{*}\, W^N(y^N \mid \varphi(m)),$$

where the summation $\sum^{*}$ is taken with respect to all $y^N$ such that $\psi(y^N) \ne m$. The capacity for a DMC is defined by

$$C = \max_{p} I(p; W) = \max_{p} \sum_{i}\sum_{j} p_i W(j \mid i) \log \frac{W(j \mid i)}{\sum_{i} p_i W(j \mid i)}.$$

The fundamental theorem of channel coding theory states: Given a DMC with capacity $C > 0$, there exist a block code with rate $R$ below capacity $C$, a pair $[\varphi, \psi]$, and a function $E(R) > 0$ for $0 < R < C$ such that the error probability satisfies

$$P_e[\varphi, \psi] \le \exp\{-N E(R)\}.$$

This was first discovered by Shannon [1], and its precise proof was first given by A. Feinstein [20]. The precise expression of $E(R)$ was subsequently given by P. Elias [21], R. M. Fano [22], and Gallager [23]. The best expression of $E(R)$, due to Gallager [23], is

$$E(R) = \max\{E_r(R),\; E_{ex}(R + O(N^{-1}))\},$$

where

$$E_r(R) = \max_{0 \le \rho \le 1}\, \max_{p} \Big[ -\rho R - \log \sum_{j} \Big( \sum_{i} p_i W(j \mid i)^{1/(1+\rho)} \Big)^{1+\rho} \Big],$$
$$E_{ex}(R) = \sup_{\rho \ge 1}\, \max_{p} \Big[ -\rho R - \rho \log \sum_{i,k} p_i p_k \Big( \sum_{j} \sqrt{W(j \mid i) W(j \mid k)} \Big)^{1/\rho} \Big].$$

The converse of the fundamental theorem states: Given a DMC with capacity $C$, for any block code with rate $R$ above $C$ and any pair $[\varphi, \psi]$, the error probability satisfies

$$P_e[\varphi, \psi] \ge 1 - \exp\{-N E(R)\},$$

where $E(R)$ is a function positive for $R > C$. This was first proved by J. Wolfowitz [24], and the precise expression, found by S. Arimoto [25], is described by

$$E(R) = \max_{-1 < \rho < 0}\, \min_{p} \Big[ -\rho R - \log \sum_{j} \Big( \sum_{i} p_i W(j \mid i)^{1/(1+\rho)} \Big)^{1+\rho} \Big].$$

The fundamental theorem of coding theory and its converse imply that the capacity is a critical rate; above the capacity, information cannot be transmitted reliably through the channel. Unfortunately, there is, in general, no direct method for computing the capacity, and therefore an iterative method was proposed by Arimoto [26]. Another iterative method for computing the rate distortion function was given by R. E. Blahut [27].

A discrete channel with memory is, in general, defined by a list of probability measures $\{W_x, x \in A^Z\}$ on a †measurable space $\{B^Z, \mathcal{F}_B\}$ such that for each $F \in \mathcal{F}_B$, $W_x(F)$ is a †measurable function of $x$. A channel $W$ is called stationary if $W_{Tx}(TF) = W_x(F)$ for all $x \in A^Z$ and all $F \in \mathcal{F}_B$. Given an input source $[A^Z, P]$ and a channel $W$, connecting the input to the channel induces a joint process of input and output, denoted by $[A^Z \times B^Z, PW]$, where $PW$ is a measure on $\{A^Z \times B^Z, \mathcal{F}_A \times \mathcal{F}_B\}$. If a joint process $[A^Z \times B^Z, PW]$ is stationary, the mutual information between input and output is defined by

$$I(X; Y) = H(X) + H(Y) - H(XY).$$

A channel $W$ is called ergodic if for every

ergodic input $[A^Z, P]$ the induced joint process $[A^Z \times B^Z, PW]$ is ergodic. The channel capacity for a discrete stationary channel is defined in various ways. The important ones are the ergodic capacity, defined by $C_e = \sup I(X; Y)$ with respect to all stationary ergodic sources $X = [A^Z, P]$, and the stationary capacity, defined by $C_s = \sup I(X; Y)$ with respect to all stationary sources. K. R. Parthasarathy [28] proved $C_e = C_s$ for all discrete stationary channels. In another way J. Nedoma [29] introduced the operational source/channel block coding capacity $C_{scb}$, which is defined as the supremum of the entropies of all admissible stationary ergodic sources in the sense that there exist source/channel block codes such that the error probability $P_e \to 0$ as block length $N \to \infty$. Nedoma [29] also pointed out an example of a stationary channel where $C_e > C_{scb}$. Hence block coding theorems have been proved for various channels: for finite memory channels by A. Ya. Khinchin [30], K. Takano [31], Nedoma [29], and Feinstein [32], for d-continuous channels by Gray and Ornstein [33], and for almost finite memory channels by D. L. Neuhoff and P. C. Shields [34].

References

[1] C. E. Shannon, A mathematical theory of communication, Bell System Tech. J., 27 (1948), 379–423, 623–656.
[2] B. McMillan, The basic theorems of information theory, Ann. Math. Statist., 24 (1953), 196–219.
[3] F. Jelinek, Probabilistic information theory, McGraw-Hill, 1968.
[4] I. Csiszár and G. Longo, On the error exponent for source coding, Studia Math. Acad. Hung., 6 (1971), 181–191.
[5] R. E. Blahut, Hypothesis testing and information theory, IEEE Trans. Information Theory, 20 (1974), 405–417.
[6] G. Longo and A. Sgarro, The source coding theorem revisited: A combinatorial approach, IEEE Trans. Information Theory, 25 (1979), 544–548.
[7] S. Kullback, Information theory and statistics, Wiley, 1957.
[8] C. E. Shannon, Coding theorems for a discrete source with a fidelity criterion, IRE National Convention Record, pt. 4 (1959), 142–163.
[9] A. N. Kolmogorov, On the Shannon theory of information transmission in the case of continuous signals, IRE Trans. Information Theory, 2 (1956), 102–108.
[10] R. G. Gallager, Information theory and reliable communication theory, Wiley, 1968.
[11] T. Berger, Rate distortion theory for sources with abstract alphabets and memory, Information and Control, 13 (1968), 254–273.
[12] R. M. Gray and L. D. Davisson, Source coding theorems without the ergodic assumption, IEEE Trans. Information Theory, 20 (1974), 502–516.
[13] V. A. Rokhlin, On the fundamental ideas of measure theory, Amer. Math. Soc. Transl., 71 (1952).
[14] F. Jelinek, Tree-encoding of memoryless time-discrete sources with a fidelity criterion, IEEE Trans. Information Theory, 15 (1969), 584–590.
[15] A. J. Viterbi and J. K. Omura, Trellis encoding of memoryless discrete-time sources with a fidelity criterion, IEEE Trans. Information Theory, 20 (1974), 325–332.
[16] R. M. Gray, D. L. Neuhoff, and D. S. Ornstein, Nonblock source coding with a fidelity criterion, Ann. Probability, 3 (1975), 478–491.
[17] M. S. Pinsker, Information and information stability of random variables and processes, Holden-Day, 1965.
[18] R. M. Gray, Information rates of autoregressive processes, IEEE Trans. Information Theory, 16 (1970), 412–421.
[19] T. Hashimoto and S. Arimoto, On the rate-distortion function for the nonstationary Gaussian AR process, IEEE Trans. Information Theory, 26 (1980), 478–480.
[20] A. Feinstein, A new basic theorem of information theory, IRE Trans. Information Theory, 4 (1954), 2–22.
[21] P. Elias, Coding for noisy channels, IRE National Convention Record, pt. 4 (1955), 37–46.
[22] R. M. Fano, Transmission of information, MIT Press and Wiley, 1961.
[23] R. G. Gallager, A simple derivation of the coding theorem and some applications, IEEE Trans. Information Theory, 11 (1965), 3–18.
[24] J. Wolfowitz, The coding of messages subject to chance errors, Illinois J. Math., 1 (1957), 591–606.
[25] S. Arimoto, On the converse to the coding theorem for discrete memoryless channels, IEEE Trans. Information Theory, 19 (1973), 357–359.
[26] S. Arimoto, An algorithm for computing the capacity of arbitrary discrete memoryless channels, IEEE Trans. Information Theory, 18 (1972), 14–20.
[27] R. E. Blahut, Computation of channel capacity and rate-distortion functions, IEEE Trans. Information Theory, 18 (1972), 460–473.
[28] K. R. Parthasarathy, On the integral representation of the rate of transmission of a stationary channel, Illinois J. Math., 2 (1961), 299–305.
815 214 A
Insurance Mathematics

[29] J. Nedoma, The capacity of a discrete channel, Trans. 1st Prague Conf. Information Theory (1957), 143–181.
[30] A. Ya. Khinchin, Mathematical foundations of information theory, Dover, 1957.
[31] K. Takano, On the basic theorems of information theory, Ann. Inst. Statist. Math., 9 (1957), 53–77.
[32] A. Feinstein, On the coding theorem and its converse for finite-memory channels, Information and Control, 2 (1959), 25–44.
[33] R. M. Gray and D. S. Ornstein, Block coding for discrete stationary d-continuous noisy channels, IEEE Trans. Information Theory, 25 (1979), 292–306.
[34] D. L. Neuhoff and P. C. Shields, Channels with almost finite memory, IEEE Trans. Information Theory, 25 (1979), 440–447.

214 (XVIII.18)
Insurance Mathematics

A. General Remarks

Insurance is a system in which a large number of people contribute a small precalculated amount of money (called a premium) to fill the economic need that arises when a person meets adversity. The amount of economic need filled by this system is called the amount of insurance (or amount insured). The insurer is the one who implements the system. Actuarial mathematics is the branch of applied mathematics that studies the mathematical basis of insurance, one of the first cases in which mathematics was successfully applied to a social question. Actuarial mathematics can be divided into two branches according to its application. The first includes the calculation of various values of each individual policy, such as premiums or reserves. The second is mainly connected with management of an insurance business and includes the study of reinsurance systems, of the maximum amount of insurance, of the contingency fund, or the analysis of profits. There is only one basic principle in actuarial mathematics, called the principle of equivalence. It determines the premium and reserve in each year so that the present value of future premium income of the insurer is equal to the present value of future benefits for each policy.

The basic factors of actuarial calculations are (1) probabilities of contingencies, (2) an expected rate of interest in the future (often referred to as the assumed rate of interest), and (3) cost of administration of the system. Premiums are calculated using these factors and the principle of equivalence. The following is an example of the classical method of calculation for a life insurance policy with the use of "commutation symbols," which is an old device for the convenience of calculations.

We write $P$ for the net premium (in which the cost of administration is disregarded), $P'$ for the gross premium, $T_t$ for the amount of death benefits payable in the $t$th year after the policy is issued, $E_t$ for the amount of survival benefits payable at the beginning of the $t$th year, $n$ for the period for which the insurance is effective, and $m$ for the period for which premiums are to be paid. Let $\alpha$, $\beta$, and $\gamma$ stand for three positive constants determining the initial expenses $= \alpha(T \text{ or } E)$, the premium collection expenses $= \beta P'$, and the general expenses for maintenance $= \gamma(T_t \text{ or } E_t)$. The factor that comes into consideration next is a model of human death and survival (measurement). Assume that $l_x$ is the number of lives attaining age $x$, and write $q_x$ for the probability that a life of $x$ years will end within one year. Then $d_x$, the number of lives ending within one year out of $l_x$, is $l_x q_x$, and $l_{x+1}$, the number of lives remaining after one year at age $x + 1$, is $l_x - d_x = l_x(1 - q_x)$. The commutation symbols commonly employed are defined as follows: Write $v = 1/(1 + i)$, where $i$ is the assumed rate of interest; then

$$D_x = l_x v^x, \qquad C_x = d_x v^{x+1}, \qquad N_x = \sum_{t \ge 0} D_{x+t}.$$

For a policy issued at an insured person's age $x$, the present value of the insurer's future income can be expressed as $P'(N_x - N_{x+m})/D_x$, and the present value of his future payments can be expressed as

$$\frac{1}{D_x}\Big\{ \sum_{t=1}^{n} T_t C_{x+t-1} + \sum_{t=1}^{n} E_t D_{x+t-1} + \alpha(T \text{ or } E) D_x + \gamma \sum_{t=1}^{n} (T_t \text{ or } E_t) D_{x+t-1} + \beta P'(N_x - N_{x+m}) \Big\}.$$

By assuming that the present value of the future income is equal to the present value of the future payments, the value of the gross premium $P'$ is obtained. (The $P'$ obtained from the assumption $\alpha = \beta = \gamma = 0$ is denoted by $P$ and is called the net premium. The difference $P' - P$ is called the loading.) For a policy in which benefits are payable on disability or contingencies other than death, we have only to obtain a model of contingencies and apply a similar calculation.
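The net premium calculation with commutation symbols can be illustrated numerically. The Python sketch below is only an illustration under stated assumptions: the life table values, the function names `commutation_symbols` and `net_premium`, and the restriction to a level death benefit $T$ with no survival benefits and no expenses ($\alpha = \beta = \gamma = 0$) are all hypothetical choices made for the example. It takes $N_x$ to be the sum of $D_{x+t}$ over $t \ge 0$ (the usual convention, assumed here) and solves $P(N_x - N_{x+m}) = T \sum_{t=1}^{n} C_{x+t-1}$ for $P$, which is the principle of equivalence for this simple policy.

```python
from typing import Dict, Tuple


def commutation_symbols(
    l: Dict[int, float], i: float
) -> Tuple[Dict[int, float], Dict[int, float], Dict[int, float]]:
    """Return (D, C, N) keyed by age for a life table l (age -> l_x).

    Assumption for this sketch: lives at the last tabulated age all end
    within the following year, so l_{x+1} is taken as 0 beyond the table.
    """
    v = 1.0 / (1.0 + i)                       # discount factor v = 1/(1+i)
    ages = sorted(l)
    D = {x: l[x] * v ** x for x in ages}      # D_x = l_x v^x
    d = {x: l[x] - l.get(x + 1, 0.0) for x in ages}   # d_x = l_x - l_{x+1}
    C = {x: d[x] * v ** (x + 1) for x in ages}        # C_x = d_x v^{x+1}
    N: Dict[int, float] = {}
    running = 0.0
    for x in reversed(ages):                  # N_x = D_x + D_{x+1} + ...
        running += D[x]
        N[x] = running
    return D, C, N


def net_premium(
    l: Dict[int, float], i: float, x: int, n: int, m: int, T: float
) -> float:
    """Net annual premium P for a level death benefit T over n years,
    with premiums payable for m years, issued at age x.

    Solves P (N_x - N_{x+m}) = T * sum_{t=1..n} C_{x+t-1}.
    """
    D, C, N = commutation_symbols(l, i)
    benefits = sum(T * C[x + t - 1] for t in range(1, n + 1))
    income_per_unit_premium = N[x] - N.get(x + m, 0.0)
    return benefits / income_per_unit_premium


if __name__ == "__main__":
    # Hypothetical toy life table, not taken from any published table.
    toy_table = {40: 1000.0, 41: 990.0, 42: 978.0, 43: 964.0, 44: 948.0}
    p = net_premium(toy_table, i=0.03, x=40, n=3, m=3, T=1000.0)
    print(f"net annual premium: {p:.4f}")
```

Dividing both sides by $D_x$ recovers the present values in the text, so the same premium can be checked from first principles by discounting expected premiums and expected death benefits year by year.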

B. Liability Reserve

During the term of an insurance contract, it often happens that the present value of the future income is less than the present value of the future payments. If this is the case, the difference is to be held by the insurer as the liability reserve. The source of this fund is the past premium income plus interest. The net premium reserve, which disregards expenses, is calculated as

$$V_t = \frac{1}{D_{x+t}}\Big\{ \sum_{\tau=t+1}^{n} T_\tau C_{x+\tau-1} + \sum_{\tau=t+1}^{n} E_\tau D_{x+\tau-1} - P(N_{x+t} - N_{x+m}) \Big\}.$$

Between the net premium $P$ and the net premium reserve $V_t$ we have the relation

$$P = (vV_t - V_{t-1}) + v q_{x+t-1}(T_t - V_t) + E_t.$$

The first term of the right-hand side of this formula is called the savings premium, since it is the amount left out of the premium income of the $t$th year and added to the reserve. The second term is called the cost of insurance or risk premium and is applied to cover the difference between the amount of insurance and that of the existing reserve in case the contingency of death arises. The third term is applied to the payment of the survival benefits (or annuities in case of an annuity contract). If $T_t - V_t$, the amount of risk insured by the insurer, is positive for all values of $t$ during the period of insurance, the policy is called death insurance. On the other hand, if the value of $T_t - V_t$ is negative for all values of $t$, the policy is called survival insurance. If the value of $T_t - V_t$ varies between positive and negative according to the different values of $t$, the policy is called mixed insurance. If $T_t$ is always equal to $V_t$, the policy constitutes mere savings. Most of the insurance policies issued today are one or another type of death insurance, while life annuity policies are a type of survival insurance. For a long time, studies have been made of the effect on premiums and reserves of changes in the three basic factors (1), (2), and (3) in Section A.

C. Risk Theory

Risk theory occupies a special position in the field of actuarial mathematics. Actuarial mathematics was first born from the theory of probability. Since the modern theory of probability based on measure theory was developed by A. N. Kolmogorov and other mathematicians (→ 342 Probability Theory), new approaches have inevitably been made to actuarial mathematics. An outstanding example is risk theory.

Risk theory can be divided into two branches. One is called classical risk theory (or individual risk theory), in which the profit or loss that may result during a certain term of an insurance contract is regarded as a †random variable. Since the insurer's profit equals the sum of these random variables over all the individual contracts, various probability functions can be obtained by applying the theory of probability. The second, called collective risk theory, pays no attention to each individual contract but studies changes in the insurer's balance as a whole with the lapse of time. The basis of collective risk theory was given by F. Lundberg, H. Cramér, and other mathematicians.

We explain the collective risk theory following Cramér [5]. For simplicity we consider an insurer who issues no policies other than death insurance and makes no expenditures except the policy claims. Suppose that during the time interval $(t, t + \Delta t)$ contingency occurs with probability $\lambda \Delta t + o(\Delta t)$ independently of the past. We denote by $F(x)$ the †distribution function of the amount to be paid by the insurer when a contingency occurs. Then the number of contingencies in $(0, t)$, $N(t)$, is a †Poisson process with parameter $\lambda$, and the total expenditure of the insurer during the time interval $(0, t)$, $X(t)$, is a †compound Poisson process (→ 5 Additive Processes) such that

$$E(e^{izX(t)}) = \exp\Big( \lambda t \int_0^\infty (e^{izx} - 1)\, dF(x) \Big).$$

If $p$ is the premium income per unit time, then we have $p = \lambda \int_0^\infty x\, dF(x)$, because $E(X(t)) = pt$ holds by the principle of equivalence. If $u$ denotes the initial fund of the insurer, then the fund reserved at time $t$, $Y(t)$, equals $u + pt - X(t)$. $Y(t)$ is a †Lévy process (→ 5 Additive Processes) such that

$$E(e^{izY(t)}) = \exp\Big( izu + \lambda t \int_0^\infty (e^{-izx} - 1 + izx)\, dF(x) \Big).$$

The probability that $Y(t) < 0$ for some time $t < \infty$ is called the ruin probability, which depends on the initial fund $u$, denoted by $\psi(u)$. This satisfies the following †integral equation of Volterra type:

$$\frac{p}{\lambda}\,\psi(u) = \int_u^\infty Q(x)\, dx + \int_0^u \psi(u - x) Q(x)\, dx \qquad (Q(x) = 1 - F(x)),$$

from which one can derive an asymptotic relation $\psi(u) \sim Ce^{-Ru}$ as $u \to \infty$, where $R$ and $C$
817 215 C
Integer Programming

are positive constants depending only on $\lambda$ and $F$. See [6] for recent developments in risk theory.

References

[1] C. W. Jordan, Society of Actuaries' textbook on life contingencies, Society of Actuaries, 1952.
[2] W. Saxer, Versicherungsmathematik, Springer, I, 1955; II, 1958.
[3] E. Zwinggi, Versicherungsmathematik, Birkhäuser, 1945.
[4] P. F. Hooker and L. H. Longley-Cook, Life and other contingencies I, Cambridge Univ. Press, 1953.
[5] H. Cramér, Collective risk theory, Nordiska Bokhandeln, 1955.
[6] H. Bühlmann, Mathematical methods in risk theory, Springer, 1970.

215 (XIX.6)
Integer Programming

A. General Remarks

Integer programming, in its broadest sense, addresses itself to either minimization or maximization of a functional $f$ over some discrete set $S$ in $R^n$, but it is usually understood as dealing with questions related to linear programming problems (→ 255 Linear Programming) with additional integrality conditions on the variables, namely, the problem $P_0$: Minimize $\{c'x \mid Ax = b,\ x \ge 0,\ x_j \text{ an integer},\ j = 1, \dots, n_1\ (\le n)\}$, where $A \in R^{m \times n}$, $b \in R^m$, $c \in R^n$ are given data and $x = (x_1, \dots, x_n)' \in R^n$ is a vector. $P_0$ is called a pure integer programming problem or an all-integer programming problem if $n = n_1$, and a mixed integer programming problem if $n_1 < n$. In particular, $P_0$ is called a 0-1 integer programming problem if all the integer variables are restricted to be equal to either 0 or 1. We write

$$X^0 = \{x \in R^n \mid Ax = b,\ x_j \ge 0,\ j = 1, \dots, n\},$$
$$X^I = \{x \in X^0 \mid x_j \text{ an integer},\ j = 1, \dots, n_1\},$$

and assume for simplicity that (i) $X^I \ne \emptyset$, (ii) $X^0$ is bounded, and (iii) all the components of $A$ and $b$ are integers. $P_0$ arises not only as a mathematical model for an optimization problem where some of the decision variables have indivisible minimum units but also as one for many optimization problems with some logical and/or combinatorial constraints [1]. Algorithms for solving $P_0$ can be classified into algebraic methods and enumerative methods. Both are based upon linear programming and relaxation techniques [2] in which some of the constraints of $P_0$ are temporarily relaxed (→ 264 Mathematical Programming).

B. Cutting Plane Methods

The origin of this class of algorithms is the fractional cutting plane algorithm proposed in 1958 by R. E. Gomory [3], the outline of which is described as:
(1) Let $X = X^0$.
(2) Solve a linear programming problem: Minimize $\{c'x \mid x \in X\}$, and let $\bar{x}$ be its †optimal solution. If the $\bar{x}_j$ are integer for all $j \le n_1$, then stop ($\bar{x}$ is optimal to $P_0$). Otherwise, go to (3).
(3) Generate a half-space $H = \{x \in R^n \mid \pi'x \ge \pi_0\}$ ($\pi \in R^n$, $\pi_0 \in R^1$), where $H$ (i) contains $X^I$ and (ii) does not contain $\bar{x}$. Return to (2) by replacing $X$ by $X \cap H$.

Here, a linear programming problem obtained by relaxing integrality conditions is solved, and as long as $\bar{x} \notin X^I$, an inequality satisfying the two conditions (i), (ii) of (3) is introduced. Such an inequality $\pi'x \ge \pi_0$ or equality $\pi'x = \pi_0$ is called a cut or a cutting plane. Gomory devised a Gomory cut using (relaxed) integrality conditions on the variables, and showed that the algorithm above produces a point of $X^I$ in finitely many steps. Some of the other algorithms using cutting planes are the all-integer algorithm (Gomory, 1963) and the primal all-integer algorithm (R. D. Young, Operations Res., 16 (1968)). These algorithms, however, are generally slow and behave erratically, so that it is believed that they cannot in practice serve as general-purpose algorithms.

C. Other Algebraic Methods

Gomory [4], again in 1965, proposed a group-theoretic approach to $P_0$. This method is based upon the following observation: Let $x_B$ be the vector of †basic variables (→ 255 Linear Programming) associated with a †dual feasible basis of a linear programming problem $\bar{P}_0$: Minimize $\{c'x \mid x \in X^0\}$, and let $\tilde{X}^I$ be the set generated from $X^I$ by relaxing the nonnegativity constraints on $x_B$. Then $\tilde{X}^I$ can be shown to have a †cyclic group structure, so that a group minimization problem $P_G$: minimize $\{c'x \mid x \in \tilde{X}^I\}$ can be solved as a †shortest path problem on a directed graph with a special structure (→ 186 Graph Theory). If the optimal solution $\tilde{x}$ of $P_G$ satisfies $\tilde{x}_B \ge 0$, then it is optimal for $P_0$. If, on the other hand, $\tilde{x}_B \not\ge 0$,

then a branch and bound algorithm described later can be applied for a systematic search of integer points near $\tilde{x}$. This algorithm is reported to produce good results when the size of the associated graph is not excessively large (G. A. Gorry et al., Management Sci., 17 (1971)). Some of the other important results in this area are: (i) the theory of subadditive cuts (R. Gomory et al., Math. Prog., 3 (1972)) and disjunctive cuts (E. Balas, in Nonlinear Programming 2, O. Mangasarian et al. (eds.), Academic Press, 1975), (ii) research on facial structures of the integer polyhedron $\mathrm{co}\,X^I$ (convex hull of $X^I$) for some of the more important integer programming problems, such as that of the corner polyhedron $\mathrm{co}\,\tilde{X}^I$ [5], knapsack polytopes (E. Balas, Math. Prog., 8 (1975)), and traveling-salesman polytopes (M. Grötschel et al., Math. Prog., 16 (1979)). It should be pointed out, however, that more difficulties of $P_0$ have been revealed rather than resolved through the intensive research in this area. Incidentally, the 0-1 integer programming problem is known to be †NP-complete (→ 71 Complexity of Computations).

D. Enumerative Methods

Another large class of algorithms for solving $P_0$ consists of the branch and bound methods, first proposed by Land and Doig [6] in 1960. An outline of the improved version (R. J. Dakin, Computer J., 8 (1965)) is as follows.
(1) Let $\mathcal{P} = \{P_0\}$, $z^* = \infty$, $x^* = $ undefined.
(2) If $\mathcal{P} = \emptyset$, then stop ($x^*$ is optimal to $P_0$). Otherwise, choose from $\mathcal{P}$ the problem $P_l$: Minimize $\{c'x \mid x \in X_l^I\}$.
(3) Solve $\bar{P}_l$: Minimize $\{c'x \mid x \in X_l^0\}$, in which the integrality condition is relaxed from the constraints of $P_l$. If $\bar{P}_l$ has an optimal solution $\bar{x}$, then go to (4). Otherwise, return to (2).
(4) If $\bar{x} \in X_l^I$ and $c'\bar{x} < z^*$, then let $\bar{x} \to x^*$, $c'\bar{x} \to z^*$, and return to (2). If $\bar{x} \notin X_l^I$ and $c'\bar{x} \ge z^*$, then return to (2). Otherwise, go to (5).
(5) Choose $j \le n_1$ for which $\bar{x}_j$ is not integral and generate the two subproblems $P_l^+$: minimize $\{c'x \mid x \in X_l^I,\ x_j \ge [\bar{x}_j] + 1\}$, and $P_l^-$: minimize $\{c'x \mid x \in X_l^I,\ x_j \le [\bar{x}_j]\}$, in both of which $[\bar{x}_j]$ represents the largest integer not exceeding $\bar{x}_j$. Let $\mathcal{P} \cup \{P_l^+, P_l^-\} \to \mathcal{P}$ and return to (2).

The best point of $X^I$ identified during the preceding steps is denoted by $x^*$ and called an incumbent. In summary, the branch and bound method chooses one subproblem $P_l$ from the problem list $\mathcal{P}$ and estimates the lower bound of its optimal objective functional value. If the lower bound is worse than the current incumbent, then $P_l$ is discarded, whereas $P_l$ is separated into two subproblems if no conclusion can be reached. This process is continued until the problem list $\mathcal{P}$ becomes empty, thereby implicitly checking all points of $X^I$. The above method is called an LP-based branch and bound method because linear programming techniques are employed to obtain a lower bound. The branch and bound method tends to require a large amount of storage, but many engineering improvements on the method of choosing (i) a subproblem $P_l$ and (ii) a branching variable $x_j$, and several improvements of the bounding techniques, in addition to the substantial progress made in linear programming codes, enable us to solve a problem of size $n \approx 100$. In particular, an improved version of the implicit enumeration method, proposed by E. Balas [7] for 0-1 integer programming problems, which uses logical conditions for obtaining lower bounds, is known to be able to solve rather large 0-1 integer programming problems (A. Geoffrion, Operations Res., 17 (1969)).

E. Other Topics

The partitioning algorithm [8], in which integer variables are varied parametrically, is reported to work well for 0-1 mixed integer problems with relatively few integer variables. As people begin to realize the intrinsic difficulty of $P_0$, they pay more attention to heuristic algorithms or approximate algorithms to obtain a good but not necessarily optimal solution. Among heuristic methods, the interior path method [9], which elaborates simple ideas such as rounding of the optimal solution of $\bar{P}_0$, has been reported to work well for problems in which $X^0$ has a nonempty interior. Also, more emphasis is being placed on special purpose algorithms for solving practical problems, such as †set partitioning and the †traveling salesman problem, etc. [1]. $P_0$ is a typical nonconvex programming problem, and no practically useful †duality theorem is available. Hence it is difficult to perform sensitivity and/or post-optimality analysis. Some research in this area has emerged recently (e.g., C. J. Piper et al., Management Sci., 22 (1976)), but it looks as if it will be several years before a reasonably good procedure becomes available.

References

[1] R. S. Garfinkel and G. L. Nemhauser, Integer programming, Wiley, 1972.
[2] A. M. Geoffrion and R. E. Marsten, Integer programming: A framework and state-of-the-art survey, Management Sci., 18 (1972), 465–491.
819 216 B
Integral Calculus

[3] R. E. Gomory, Outline of an algorithm for integer solutions to linear programs, Bull. Amer. Math. Soc., 64 (1958), 275-278.
[4] R. E. Gomory, On the relation between integer and non-integer solutions to linear programs, Proc. Nat. Acad. Sci. US, 53 (1965), 260-265.
[5] R. E. Gomory, Some polyhedra related to combinatorial problems, Linear Algebra and Appl., 2 (1969), 451-558.
[6] A. H. Land and A. G. Doig, An automatic method for solving discrete programming problems, Econometrica, 28 (1960), 497-520.
[7] E. Balas, An additive algorithm for solving linear programs with zero-one variables, Operations Res., 13 (1965), 517-546.
[8] J. F. Benders, Partitioning procedures for solving mixed-variables programming problems, Numer. Math., 4 (1962), 238-252.
[9] F. S. Hillier, Efficient heuristic procedures for integer linear programming with an interior, Operations Res., 17 (1969), 600-637.

216 (X.10)
Integral Calculus

A. The Riemann Integral

Let $f(x)$ be a bounded real-valued function defined on an interval $[a, b]$. We shall divide this interval $I = [a, b]$ into subintervals $I_i = [x_{i-1}, x_i]$ $(i = 1, \dots, n)$ by a finite number of points $x_i$ $(a = x_0 < x_1 < \dots < x_n = b)$. This division into subintervals is uniquely determined by the set $D = \{x_i\}$, called the partition of $I$. We set $M_i = \sup_{x \in I_i} f(x)$, $m_i = \inf_{x \in I_i} f(x)$, and put $\overline{\sigma}(D) = \sum_{i=1}^{n} M_i (x_i - x_{i-1})$, $\underline{\sigma}(D) = \sum_{i=1}^{n} m_i (x_i - x_{i-1})$. Considering all possible partitions $D$ of $I$, we set $\overline{\int_a^b} f(x)\,dx = \inf_D \overline{\sigma}(D)$, $\underline{\int_a^b} f(x)\,dx = \sup_D \underline{\sigma}(D)$, which are called the Riemann upper integral and Riemann lower integral of $f$, respectively. If they coincide, then the common value is called the Riemann integral of $f$ on $[a, b]$ and is denoted by $\int_a^b f(x)\,dx$. In this case, we say that $f$ is Riemann integrable (or simply integrable) on $[a, b]$ and call $f$ the integrand; $a$ and $b$ are called the lower limit and the upper limit, respectively. In this case, by integrating $f$ from $a$ to $b$ we mean the process of obtaining the value $\int_a^b f(x)\,dx$.

Darboux's theorem: For each $\varepsilon > 0$ there exists a positive $\delta$ such that the inequalities

$$\left| \overline{\sigma}(D) - \overline{\int_a^b} f(x)\,dx \right| < \varepsilon, \qquad \left| \underline{\sigma}(D) - \underline{\int_a^b} f(x)\,dx \right| < \varepsilon$$

hold for any partition $D$ with $\delta(D) = \max_i (x_i - x_{i-1}) < \delta$. In other words, we have

$$\lim_{\delta(D) \to 0} \overline{\sigma}(D) = \overline{\int_a^b} f(x)\,dx, \qquad \lim_{\delta(D) \to 0} \underline{\sigma}(D) = \underline{\int_a^b} f(x)\,dx.$$

From Darboux's theorem it follows that a necessary and sufficient condition for $f(x)$ to be integrable on $[a, b]$ is that for each positive $\varepsilon$ there exist a positive $\delta$ such that $\delta(D) < \delta$ implies $\overline{\sigma}(D) - \underline{\sigma}(D) = \sum_{i=1}^{n} (M_i - m_i)(x_i - x_{i-1}) < \varepsilon$. We call $M_i - m_i$ the oscillation of $f$ on $I_i$ and $\overline{\sigma}(D)$ and $\underline{\sigma}(D)$ the Darboux sums. Obviously, if $f$ is integrable on $[a, b]$, then for each positive $\varepsilon$ there exists a positive $\delta$ such that the following inequality holds for every partition $D = \{x_j\}$ with $\delta(D) < \delta$ and for every set of points $\xi_j \in I_j$ $(j = 1, \dots, n)$:

$$\left| \sum_{j=1}^{n} f(\xi_j)(x_j - x_{j-1}) - \int_a^b f(x)\,dx \right| < \varepsilon.$$

The sum $\sum_{j=1}^{n} f(\xi_j)(x_j - x_{j-1})$ is often called a Riemann sum (or sum of products). A function that is continuous on $[a, b]$, or bounded and continuous except for a finite number of points in the interval, is integrable. Furthermore, a bounded function that is continuous on $[a, b]$ except for an infinite number of points $x_i$ is integrable if for an arbitrary positive number $\varepsilon$ there exist a finite number of intervals $I_i$ of which the total length is less than $\varepsilon$ and if the set $\{x_i\}$ of exceptional points is contained in $\bigcup I_i$. Generally, a necessary and sufficient condition for a bounded function defined on $[a, b]$ to be integrable is that the set of points where the function is not continuous be of †measure 0 (in the sense of Lebesgue). A function that is either †monotonic on $[a, b]$ (and consequently bounded) or of †bounded variation is integrable. A function that is integrable on $[a, b]$ is integrable on any subinterval of $[a, b]$, the integrand being the restriction of the given function to this subinterval.

B. Basic Properties of Integrals

Let $\mathcal{I}$ be the set of all functions integrable on $[a, b]$. If $f, g \in \mathcal{I}$, then for any numbers $\alpha, \beta$, we have $\alpha f + \beta g \in \mathcal{I}$, $fg \in \mathcal{I}$, $\min\{f, g\} \in \mathcal{I}$, $\max\{f, g\} \in \mathcal{I}$, and $f/g \in \mathcal{I}$ provided that there exists a positive constant $A$ such that the inequality $|g| > A$ holds. Furthermore, if $f \in \mathcal{I}$, then $|f| \in \mathcal{I}$; and if $f_n \in \mathcal{I}$ $(n = 1, 2, \dots)$ and $f_n$ converges †uniformly to $f$, then $f \in \mathcal{I}$. Corresponding to these properties, the following formulas hold.

(1) Linearity:

$$\int_a^b (\alpha f(x) + \beta g(x))\,dx = \alpha \int_a^b f(x)\,dx + \beta \int_a^b g(x)\,dx,$$

where $\alpha, \beta$ are constants.

(2) Monotonicity: If $f(x) \geq 0$, then

$$\int_a^b f(x)\,dx \geq 0.$$

If, further, $f$ is continuous at a point $x_0 \in [a, b]$ and $f(x_0) > 0$, then $\int_a^b f(x)\,dx > 0$.

(3) Additivity with respect to intervals: If $a$, $b$, and $c$ are points belonging to an interval on which $f$ is integrable and $a < c < b$, then

$$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx.$$

Adopting the conventions that $\int_a^a f(x)\,dx = 0$ and $\int_b^a f(x)\,dx = -\int_a^b f(x)\,dx$, the additivity formula holds independently of the order of $a$, $b$, and $c$.

It follows from (2) that $\left| \int_a^b f(x)\,dx \right| \leq \int_a^b |f(x)|\,dx$ if $a < b$. Further, if $f_n(x)$ converges to $f(x)$ uniformly on $[a, b]$, then

$$\lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx.$$

Replacing $f_n(x)$ by partial sums of a series, we obtain the following theorem: Let $\sum a_n(x)$ be a series in which each term $a_n(x)$ is integrable on an interval $[a, b]$. If the series converges uniformly on $[a, b]$, then the sum $s(x)$ is integrable on $[a, b]$, and the series is termwise integrable, that is,

$$\int_a^b s(x)\,dx = \sum_{n=1}^{\infty} \int_a^b a_n(x)\,dx.$$

Also, the series $\sum_{n=1}^{\infty} \int_a^x a_n(t)\,dt$ converges uniformly on $[a, b]$ to the integral $\int_a^x s(t)\,dt$. Assume that $\sum a_n(x)$ is convergent but not uniformly convergent. If all $a_n(x)$, together with $s(x) = \sum a_n(x)$, are integrable and there is a constant $M$ independent of $n$ such that $|s_n(x)| < M$ $(x \in [a, b])$ for all $n$, where $s_n(x)$ are partial sums, then the series is termwise integrable (C. Arzelà).

The first mean value theorem: If $f(x)$ is continuous on $[a, b]$ and $\varphi(x)$ is integrable and of constant sign on $[a, b]$, then there exists $\theta$ $(0 < \theta < 1)$ such that

$$\int_a^b f(x)\varphi(x)\,dx = f(a + \theta(b - a)) \int_a^b \varphi(x)\,dx.$$

When $\varphi(x) = 1$, we have

$$\int_a^b f(x)\,dx = f(a + \theta(b - a))(b - a).$$

The second mean value theorem: If $f(x)$ is a positive, monotone decreasing function defined on $[a, b]$ and $\varphi(x)$ is an integrable function, then there exists $\xi$ $(a \leq \xi \leq b)$ such that

$$\int_a^b f(x)\varphi(x)\,dx = f(a + 0) \int_a^{\xi} \varphi(x)\,dx.$$

In the hypothesis of the second mean value theorem, if $f(x)$ is assumed to be monotonic but not necessarily positive, then there exists $\xi$ $(a \leq \xi \leq b)$ such that

$$\int_a^b f(x)\varphi(x)\,dx = f(a + 0) \int_a^{\xi} \varphi(x)\,dx + f(b - 0) \int_{\xi}^b \varphi(x)\,dx.$$

H. Okamura (1947) proved that the condition $a \leq \xi \leq b$ can be replaced by $a < \xi < b$.

In the case $f(x) \geq 0$ on $[a, b]$, we consider the figure $F$ bounded by the graph of $f(x)$, the $x$-axis, and the lines $x = a$ and $x = b$. Then $\overline{\sigma}(D)$ and $\underline{\sigma}(D)$ are areas of polygons of which one encloses $F$ and the other is enclosed by $F$, as shown in Fig. 1. Hence it can be shown that the integrability of $f(x)$ in the sense of Riemann is equivalent to the measurability of $F$ in the sense of Jordan. The Riemann integral $\int_a^b f(x)\,dx$ is the area of $F$ with respect to its †Jordan measure.

Fig. 1

C. Relation between Differentiation and Integration

Suppose that $f(x)$ is integrable on an interval $I$. We fix a point $a$ of $I$ and consider the integration $F(x) = \int_a^x f(t)\,dt$, where $x$ varies in $I$. The function $F(x)$ is called the indefinite integral of $f(x)$. In contrast with this, the integral on a fixed interval, as considered in the previous sections, is often called the definite integral. The indefinite integral $F(x)$ is continuous on the interval $I$ and of bounded variation. If $f(x)$ is continuous at a point $x_0$ in $I$, then $F(x)$ is differentiable at $x_0$, and $F'(x_0) = f(x_0)$. In general, if a function $G(x)$ satisfies $G'(x) = f(x)$ everywhere in $I$, then $G(x)$ is called a primitive function of $f(x)$. If $f(x)$ is continuous, the indefinite integral of $f(x)$ is one of the primitive functions of $f(x)$. Furthermore, if a function $G(x)$ is a primitive function of $f(x)$, then any other primitive function can be written in the form $G(x) + C$, where $C$ is a constant, called an integral constant. For a continuous function $f(x)$ on $[a, b]$ and any one of its primitive functions $G(x)$, we have

$$\int_a^b f(x)\,dx = G(b) - G(a) = [G(x)]_a^b$$

(fundamental theorem of calculus) (→ Appendix A, Table 9). From the differentiation formulas we obtain the following integration formulas:

Integration by parts: If $f(x)$ and $g(x)$ have continuous derivatives on $[a, b]$, then

$$\int_a^b f'(x)g(x)\,dx = [f(x)g(x)]_a^b - \int_a^b f(x)g'(x)\,dx.$$

More generally, if $f(x)$ and $g(x)$ are integrable on $[a, b]$, then

Change of variables: If $f(x)$ is integrable on $[a, b]$ and $x = \varphi(t)$ and $\varphi'(t)$ are continuous on $[\alpha, \beta]$, where $a = \varphi(\alpha)$, $b = \varphi(\beta)$ $(a \leq \varphi(t) \leq b)$, then

$$\int_a^b f(x)\,dx = \int_{\alpha}^{\beta} f(\varphi(t))\varphi'(t)\,dt.$$

D. Improper Integrals

The concept of the integral can be generalized to the case where the integrand or the interval on which integration is accomplished is not bounded. Assume that $f(x)$ is not bounded on $[a, b)$ but is bounded and integrable on any interval $[a, b - \varepsilon]$ $(\subset [a, b))$. If $\int_a^{b - \varepsilon} f(x)\,dx$ has a finite limit for $\varepsilon \to 0$, the limit is denoted by $\int_a^b f(x)\,dx$ and is called the improper Riemann integral (or simply improper integral) of $f(x)$ on $[a, b)$. For example, if $f(x)$ is continuous on $[a, b)$ and $f(x) = O((b - x)^{\alpha})$ for some $\alpha$ $(0 > \alpha > -1)$, where $O$ is the †Landau symbol, then the improper integral $\int_a^b f(x)\,dx$ exists. On the other hand, if $f$ is integrable on $[a + \varepsilon, b]$ for each $\varepsilon > 0$ but not bounded in any neighborhood of $a$, we can define the integral on $(a, b]$ in the same way. If $f$ is not bounded in any neighborhood of $a$ or $b$ and if there exists a point $c$ $(a < c < b)$ for which the improper integrals $\int_a^c f(x)\,dx$ and $\int_c^b f(x)\,dx$ exist, then we define $\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx$, which is independent of the choice of the point $c$. Furthermore, assume that $f(x)$ is not bounded in any neighborhood of each point $c_j$ $(j = 1, \dots, n)$ $(a < c_1 < \dots < c_n < b)$. Then we define

$$\int_a^b f(x)\,dx = \int_a^{c_1} f(x)\,dx + \int_{c_1}^{c_2} f(x)\,dx + \dots + \int_{c_{n-1}}^{c_n} f(x)\,dx + \int_{c_n}^b f(x)\,dx,$$

provided that all improper integrals $\int_a^{c_1} f(x)\,dx, \int_{c_1}^{c_2} f(x)\,dx, \dots, \int_{c_n}^b f(x)\,dx$ exist. Suppose that $f(x)$ is defined on $[a, b]$ and bounded outside any neighborhood of $c \in (a, b)$ but not bounded in either $[c - \varepsilon, c]$ or $[c, c + \varepsilon]$ for any $\varepsilon > 0$. It may well happen that, although neither $\lim_{\varepsilon \to 0} \int_a^{c - \varepsilon} f(x)\,dx$ nor $\lim_{\varepsilon' \to 0} \int_{c + \varepsilon'}^b f(x)\,dx$ exists (accordingly, the improper integral $\int_a^b f(x)\,dx$ does not exist), if we put $\varepsilon = \varepsilon'$, the limit

$$\lim_{\varepsilon \to 0} \left( \int_a^{c - \varepsilon} f(x)\,dx + \int_{c + \varepsilon}^b f(x)\,dx \right)$$

does exist. This limit is called Cauchy's principal value and is denoted by p.v. $\int_a^b f(x)\,dx$ (v.p. in French). For example, p.v. $\int_{-1}^{1} (dx/x) = \lim_{\varepsilon \to 0} \left( \int_{-1}^{-\varepsilon} (1/x)\,dx + \int_{\varepsilon}^{1} (1/x)\,dx \right) = 0$.

E. Integrals on Infinite Intervals

Suppose that we are given a function $f(x)$ defined on an infinite interval $[a, \infty)$ and integrable on any finite interval $[a, b]$. If $\lim_{b \to \infty} \int_a^b f(x)\,dx$ exists and is finite, then this limit is called the improper integral of $f$ on $[a, \infty)$ and is denoted by $\int_a^{\infty} f(x)\,dx$. We define similarly $\int_{-\infty}^b f(x)\,dx = \lim_{a \to -\infty} \int_a^b f(x)\,dx$, where $f$ is defined on $(-\infty, b]$ and integrable on any interval $[a, b]$. Furthermore, $\int_{-\infty}^{\infty} f(x)\,dx$ is, by definition, $\int_{-\infty}^c f(x)\,dx + \int_c^{\infty} f(x)\,dx$, which is independent of the choice of $c$. Suppose that $f(x)$ is integrable on $[a, b]$ for a fixed $a$ and an arbitrary $b$ larger than $a$. If $f(x) = O(x^{\alpha})$ for some $\alpha < -1$, then $\int_a^{\infty} f(x)\,dx$ exists. Generally, for $\alpha, \beta$ such that $-\infty \leq \alpha < \infty$ and $-\infty < \beta \leq \infty$, if the improper integral $\int_{\alpha}^{\beta} f(x)\,dx$ exists, we say that the integral is convergent; otherwise, it is divergent. Improper integrals also satisfy the three basic properties of integrals (1), (2), and (3) (→ Section B). However, the existence of an improper integral of a function $f$ on an interval $I$ does not imply the existence of the improper integral of $|f|$ on the same interval $I$. For example, let $f$ be a function determined by $f(0) = 0$, $f(x) = (1/x)\sin(1/x)$ for $0 < x \leq \pi$. Then $\int_0^{\pi} f(x)\,dx$ exists, but $\int_0^{\pi} |f(x)|\,dx$ does not. On the other hand, if the improper integral of $|f(x)|$ exists,

then the improper integral of $f(x)$ exists, and we have

$$\left| \int_{\alpha}^{\beta} f(x)\,dx \right| \leq \int_{\alpha}^{\beta} |f(x)|\,dx,$$

where $-\infty \leq \alpha < \beta \leq +\infty$. In this case, we say that $f$ is absolutely integrable on the interval $[\alpha, \beta]$. Assume now that $f(x)$ is defined on $(-\infty, \infty)$ and integrable on any finite interval. If $\lim_{a \to \infty} \int_{-a}^{a} f(x)\,dx$ exists, then it is called Cauchy's principal value of the integral of $f$ in $(-\infty, \infty)$.

If $f(x)$ is a monotone decreasing, positive, and continuous function defined on $[k, \infty)$ (where $k$ is an integer), then according as $\sum_{n=k}^{\infty} f(n)$ converges or diverges, so does $\int_k^{\infty} f(x)\,dx$.

Suppose that a series $\sum_{n=1}^{\infty} f_n(x)$, where all the functions $f_n(x)$ $(n = 1, 2, \dots)$ are defined and nonnegative on an infinite interval $[a, \infty)$, satisfies $\int_a^b \left( \sum_{n=1}^{\infty} f_n(x) \right) dx = \sum_{n=1}^{\infty} \int_a^b f_n(x)\,dx$ for arbitrary $b > a$. Then according as $\sum_{n=1}^{\infty} \int_a^{\infty} f_n(x)\,dx$ converges or diverges, so does $\int_a^{\infty} \left( \sum_{n=1}^{\infty} f_n(x) \right) dx$. When they converge, the following equality holds: $\int_a^{\infty} \sum_{n=1}^{\infty} f_n(x)\,dx = \sum_{n=1}^{\infty} \int_a^{\infty} f_n(x)\,dx$. In this theorem, if $\int_a^{\infty} \sum_{n=1}^{\infty} |f_n(x)|\,dx$ or $\sum_{n=1}^{\infty} \int_a^{\infty} |f_n(x)|\,dx$ converges, then the same conclusion as above will follow even when the $f_n(x)$ are not necessarily positive (→ Appendix A, Table 9).

F. Multiple Integrals

Suppose that $f(x, y)$ is a function defined and bounded on an interval $I = \{(x, y) \mid a \leq x \leq b,\ c \leq y \leq d\}$ in the $xy$-plane. Partitions $\{x_j\}$ and $\{y_k\}$ of $[a, b]$ and $[c, d]$ with $a = x_0 < x_1 < \dots < x_m = b$ and $c = y_0 < y_1 < \dots < y_n = d$ determine a "partition," denoted by $D$, of $I$ into subintervals of the form $I_{jk} = \{(x, y) \mid x_{j-1} \leq x \leq x_j,\ y_{k-1} \leq y \leq y_k\}$ $(j = 1, \dots, m;\ k = 1, \dots, n)$. Writing

$$M_{jk} = \sup_{(x, y) \in I_{jk}} f(x, y), \qquad m_{jk} = \inf_{(x, y) \in I_{jk}} f(x, y),$$

we set

$$\overline{\sigma}(D) = \sum_{j=1}^{m} \sum_{k=1}^{n} M_{jk}(x_j - x_{j-1})(y_k - y_{k-1}),$$

$$\underline{\sigma}(D) = \sum_{j=1}^{m} \sum_{k=1}^{n} m_{jk}(x_j - x_{j-1})(y_k - y_{k-1}).$$

Then we obtain $\inf_D \overline{\sigma}(D) \geq \sup_D \underline{\sigma}(D)$. If $\inf_D \overline{\sigma}(D) = \sup_D \underline{\sigma}(D)$, then $f(x, y)$ is called integrable on $I$, and the common value is called the double integral of $f$ on $I$ and is denoted by $\iint_I f(x, y)\,dx\,dy$. Analogously, we can define $n$-tuple integrals (or multiple integrals) and the integrability of functions of $n$ variables.

Let $K$ be a bounded set in the $xy$-plane and $I$ an interval containing $K$. Let $\varphi(x, y)$ be the characteristic function of $K$ defined on $I$, that is, $\varphi$ is determined by

$$\varphi(x, y) = 1 \quad \text{for } (x, y) \in K,$$

$$\varphi(x, y) = 0 \quad \text{for } (x, y) \in I - K.$$

Replacing $f(x, y)$ by this $\varphi(x, y)$, we consider $\inf_D \overline{\sigma}(D)$ ($\sup_D \underline{\sigma}(D)$). These values can be shown to be independent of the choice of such an interval $I$ and are called the outer area (inner area) of $K$, respectively. When these two values coincide, $K$ is said to be of definite area, and the common value is called the area of $K$. A necessary and sufficient condition for $K$ to be of definite area is that the outer area of the †boundary of $K$ be zero. Now consider a bounded function $f(x, y)$ defined on a set $K$ of definite area. Then, taking an interval $I$ containing $K$, define an extension $\varphi(x, y)$ of $f(x, y)$ as follows:

$$\varphi(x, y) = f(x, y) \quad \text{for } (x, y) \in K,$$

$$\varphi(x, y) = 0 \quad \text{for } (x, y) \in I - K.$$

If $\varphi(x, y)$ is integrable on $I$, then $f(x, y)$ is called integrable on $K$, and the integral of $f$ on $K$ is defined by $\iint_K f(x, y)\,dx\,dy = \iint_I \varphi(x, y)\,dx\,dy$, which is independent of the special choice of $I$. The set $K$ is called the domain of integration. Since $K$ is of definite area, the set of boundary points of $K$ at which $\varphi(x, y)$ is not continuous can be contained in a union of intervals whose total area can be made smaller than any preassigned positive number. Consequently, a function bounded on $K$ and continuous at each †interior point of $K$ is integrable on $K$. Like integrals of functions of a single variable, multiple integrals satisfy the three basic properties of integrals (→ Section B).

G. Multiple Integrals and Iterated Integrals

Suppose that we are given a function $f(x, y)$ that is continuous on an interval $I = \{(x, y) \mid a \leq x \leq b,\ c \leq y \leq d\}$. Then, for a fixed $y$ in $[c, d]$, the function $f(x, y)$, regarded as a function of $x$, can be integrated with respect to $x$ on the interval $[a, b]$, and the integral thus obtained is a continuous function of $y$. The integral of the function defined on $[c, d]$, namely, $\int_c^d \left( \int_a^b f(x, y)\,dx \right) dy$, is called the iterated integral (or repeated integral) of $f(x, y)$ and is often written as $\int_c^d dy \int_a^b f(x, y)\,dx$. The following formula gives a representation of a double integral by iterated ones:

$$\iint_I f(x, y)\,dx\,dy = \int_c^d dy \int_a^b f(x, y)\,dx = \int_a^b dx \int_c^d f(x, y)\,dy.$$

More generally, $\varphi_1(x)$ and $\varphi_2(x)$ being continuous on $[a, b]$ and $\varphi_1(x) \leq \varphi_2(x)$, consider the following subset $K = \{(x, y) \mid a \leq x \leq b,\ \varphi_1(x) \leq y \leq \varphi_2(x)\}$ of the $xy$-plane. Suppose further that $f(x, y)$ is continuous on $K$. Then the following equality holds:

$$\iint_K f(x, y)\,dx\,dy = \int_a^b dx \int_{\varphi_1(x)}^{\varphi_2(x)} f(x, y)\,dy.$$

In the case of unbounded integrands or unbounded domains of integration, we can still define integrals under suitable restrictions. For instance, assume the following two properties: (1) There exists a sequence $\{K_n\}$ of sets, each of which is of definite area, satisfying $K_1 \subset K_2 \subset \dots$ and $K = \bigcup_{n=1}^{\infty} K_n$. (2) $f(x, y)$ is bounded and integrable on each $K_n$ $(n = 1, 2, \dots)$. If a finite limit $\lim_{n \to \infty} \iint_{K_n} f(x, y)\,dx\,dy$ exists and is independent of the choice of $\{K_n\}$, then $f(x, y)$ is called integrable on $K$. This limit is called the integral of $f(x, y)$ on $K$ and is denoted by $\iint_K f(x, y)\,dx\,dy$:

$$\lim_{n \to \infty} \iint_{K_n} f(x, y)\,dx\,dy = \iint_K f(x, y)\,dx\,dy.$$

When the integral thus defined exists, we say that the integral is convergent. If a finite limit $\lim_{n \to \infty} \iint_{K_n} |f(x, y)|\,dx\,dy$ exists for some sequence $\{K_n\}$ with property (1), then $f$ is integrable on $K$. Let $f(x, y)$ be continuous and nonnegative on $K = \{(x, y) \mid \alpha < x < \beta,\ \gamma < y < \delta\}$, where $-\infty \leq \alpha < \beta \leq \infty$, $-\infty \leq \gamma < \delta \leq \infty$. Furthermore, let $f(x, y)$ be integrable on $K$, and assume that the improper integral $F(x) = \int_{\gamma}^{\delta} f(x, y)\,dy = \lim_{c \downarrow \gamma,\ d \uparrow \delta} \int_c^d f(x, y)\,dy$ exists and converges uniformly with respect to $x$ as $c \downarrow \gamma$, $d \uparrow \delta$. Then $\int_{\alpha}^{\beta} F(x)\,dx$ is well defined, and we have

$$\iint_K f(x, y)\,dx\,dy = \int_{\alpha}^{\beta} dx \int_{\gamma}^{\delta} f(x, y)\,dy.$$

In particular, if $\alpha = a$, $\gamma = b$, $\beta = \delta = \infty$, then we have

$$\int_a^{\infty} \int_b^{\infty} f(x, y)\,dx\,dy = \int_a^{\infty} dx \int_b^{\infty} f(x, y)\,dy.$$

H. Interchanging the Order of Differentiation and Integration

If both $f(x, y)$ and $\partial f(x, y)/\partial y$ are continuous on an interval $\{(x, y) \mid a \leq x \leq b,\ y_0 - \eta \leq y \leq y_0 + \eta\}$, then we can interchange the order of differentiation and integration as follows:

$$\frac{d}{dy} \int_a^b f(x, y)\,dx = \int_a^b \frac{\partial f}{\partial y}\,dx \quad \text{for } y = y_0.$$

Assume further that this equality holds for every $b$ $(> a)$, the improper integral

$$\int_a^{\infty} f(x, y)\,dx = \lim_{b \to \infty} \int_a^b f(x, y)\,dx$$

converges, and the improper integral

$$\int_a^{\infty} \frac{\partial}{\partial y} f(x, y)\,dx = \lim_{b \to \infty} \int_a^b \frac{\partial}{\partial y} f(x, y)\,dx$$

converges as $b \to \infty$ uniformly for $y$ with $|y - y_0| < \eta$. Then

$$\frac{d}{dy} \int_a^{\infty} f(x, y)\,dx = \int_a^{\infty} \frac{\partial f}{\partial y}\,dx \quad \text{for } y = y_0.$$

Several other similar theorems are known. Though the previous theorems are written in terms of two variables, analogous theorems hold for $n$ variables.

I. Change of Variables in Multiple Integrals

Let $G$ be a bounded domain of definite area in an $n$-dimensional Euclidean space $R^n(x)$. Assume that a mapping $x \to y(x) = (y_1(x_1, \dots, x_n), \dots, y_n(x_1, \dots, x_n))$ is of class $C^1$ from an open set containing the closure $\bar{G}$ of $G$ into an $n$-dimensional Euclidean space $R^n(y)$. We denote the image of $G$ under this mapping by $B$. If $f(y_1, \dots, y_n)$ is continuous on $B$, then the following formula on change of variables holds:

$$\int \cdots \int_B f(y_1, \dots, y_n)\,dy_1 \cdots dy_n = \int \cdots \int_G g(x_1, \dots, x_n) \left| \frac{D(y_1, \dots, y_n)}{D(x_1, \dots, x_n)} \right| dx_1 \cdots dx_n,$$

where $g(x_1, \dots, x_n) = f(y_1(x_1, \dots, x_n), \dots, y_n(x_1, \dots, x_n))$ and $D(y_1, \dots, y_n)/D(x_1, \dots, x_n)$ is the †Jacobian determinant of the mapping $y(x)$. This formula is usually utilized in the case where $y_1, \dots, y_n$ are †functionally independent, though otherwise both sides vanish and the formula still holds. For improper integrals, a similar formula will hold under suitable restrictions, for example, if the integrals converge absolutely.

For related topics → 94 Curvilinear Integrals and Surface Integrals, 221 Integration Theory, and 270 Measure Theory.

References

[1] T. M. Apostol, Mathematical analysis, Addison-Wesley, 1957.
[2] N. Bourbaki, Eléments de mathématique, Fonctions d'une variable réelle, Actualités Sci. Ind., 1074b, 1132a, Hermann, second edition, 1958, 1961.
[3] R. C. Buck, Advanced calculus, McGraw-Hill, second edition, 1965.
217 A 824
Integral Equations

[4] R. Courant, Differential and integral calculus I, II, Nordemann, 1938.
[5] G. H. Hardy, A course of pure mathematics, Cambridge Univ. Press, seventh edition, 1938.
[6] E. Hille, Analysis I, II, Blaisdell, 1964, 1966.
[7] W. Kaplan, Advanced calculus, Addison-Wesley, 1952.
[8] E. Landau, Einführung in die Differentialrechnung und Integralrechnung, Noordhoff, 1934; English translation, Differential and integral calculus, Chelsea, 1965.
[9] J. M. H. Olmsted, Advanced calculus, Appleton-Century-Crofts, 1961.
[10] A. Ostrowski, Vorlesungen über Differential- und Integralrechnung I-III, Birkhäuser, second edition, 1960-1961.
[11] M. H. Protter and C. B. Morrey, Modern mathematical analysis, Addison-Wesley, 1964.
[12] W. Rudin, Principles of mathematical analysis, McGraw-Hill, second edition, 1964.
[13] V. I. Smirnov, A course of higher mathematics. I, Elementary calculus; II, Advanced calculus, Addison-Wesley, 1964.
[14] A. E. Taylor, Advanced calculus, Ginn, 1955.

217 (XIII.30)
Integral Equations

A. General Remarks

Equations including the integrals of unknown functions are called integral equations. The most studied ones are the linear integral equations, i.e., linear in unknown functions.

Let $D$ be a domain of $n$-dimensional Euclidean space and $f(x)$ and $K(x, y)$ be functions defined for $x = (x_1, x_2, \dots, x_n) \in D$, $y = (y_1, y_2, \dots, y_n) \in D$. Integral equations of Fredholm type (or Fredholm integral equations) [1] are those of the forms

$$\int_D K(x, y)\varphi(y)\,dy = f(x), \qquad (1)$$

$$\varphi(x) - \int_D K(x, y)\varphi(y)\,dy = f(x), \qquad (2)$$

$$A(x)\varphi(x) - \int_D K(x, y)\varphi(y)\,dy = f(x), \qquad (3)$$

where $\varphi(x)$ is an unknown function and $\int_D dy$ means the $n$-fold integral $\int \cdots \int_D dy_1 \cdots dy_n$. Equations of the forms (1), (2), and (3) are called equations of the first, second, and third kind, respectively. Equations of the second kind have been investigated in great detail. Equations of the third kind, in many cases, can be reduced formally to those of the second kind. The function $K(x, y)$ is called a kernel (or integral kernel) of the integral equation.

Integral equations of Volterra type (or Volterra integral equations) are those of the forms

$$\int_a^x K(x, y)\varphi(y)\,dy = f(x), \qquad (1')$$

$$\varphi(x) - \int_a^x K(x, y)\varphi(y)\,dy = f(x), \qquad (2')$$

$$A(x)\varphi(x) - \int_a^x K(x, y)\varphi(y)\,dy = f(x), \qquad (3')$$

where $\varphi(x)$ is an unknown function. Equations of the forms (1′), (2′), and (3′) are also called equations of the first, second, and third kind, respectively. Integral equations of Volterra type can be regarded as integral equations of Fredholm type having kernels equal to 0 for $x < y$, but these two types of equations are usually treated separately, since they have considerably different characters.

The kernels in equations (1)-(3) and (1′)-(3′) are frequently written in the form $\lambda K(x, y)$ with a parameter $\lambda$, in particular when the equations are related to eigenvalue problems, which is explained in Section F.

The theory of integral equations was originated in 1823 by N. H. Abel, who investigated the relationship between time and the path of a falling body in the field of gravitation. Let $\varphi(t)$ be a quantity varying with time, which is connected by some law with its values in some time interval of the past or the future. Then the law of variation of $\varphi(t)$ can be described mathematically by an integral equation. The situation is the same even when the variable $t$ is not time but a coordinate of the space. In this way, various problems in physics can be reduced to solutions of integral equations.

B. Relation to Differential Equations

Many problems in differential equations can be reduced to problems related to integral equations. Such reduction often makes the problems easier to handle and clarifies the nature of the solutions. For example, consider the problem of finding a solution of the ordinary second-order linear differential equation $d^2y/dx^2 + \lambda y = 0$ with the boundary condition $y(0) = y(1) = 0$ [4]. Let $d^2y/dx^2 = u(x)$. If we integrate the equation twice, change the order of integration, and make use of the boundary condition, then we have

$$y(x) = -\int_0^1 x(1 - \xi)u(\xi)\,d\xi + \int_0^x (x - \xi)u(\xi)\,d\xi,$$

from which we see that the given differential equation can be written in the form

$$u(x) = \lambda \int_0^1 x(1 - \xi)u(\xi)\,d\xi - \lambda \int_0^x (x - \xi)u(\xi)\,d\xi.$$

Decomposing the first integral on the right-hand side into the sum of an integral over $(0, x)$ and one over $(x, 1)$, and combining the integral over $(0, x)$ with the second integral on the right-hand side, we obtain a Fredholm integral equation of the second kind as follows:

$$u(x) = \lambda \int_0^1 G(x, \xi)u(\xi)\,d\xi,$$

$$G(x, \xi) = \begin{cases} \xi(1 - x) & (0 \leq \xi \leq x), \\ x(1 - \xi) & (x \leq \xi \leq 1). \end{cases}$$

Clearly, the solution of this integral equation is equivalent to that of the original differential equation. The function $G$ is called †Green's function for the boundary value condition $y(0) = y(1) = 0$ in the theory of boundary value problems. Differential equations of higher orders can be treated analogously (→ 315 Ordinary Differential Equations (Boundary Value Problems)). †Initial value problems of linear ordinary differential equations can be reduced to the solution of Volterra integral equations in a similar way.

As another example [5, 6], consider the †Dirichlet problem on a plane, i.e., the problem of finding a function $u$ satisfying the conditions (i) $u$ is †harmonic in the interior of the region $D$ bounded by a closed curve $C$ $(\xi = \varphi(s),\ \eta = \psi(s),\ 0 \leq s \leq l)$; (ii) $u(x, y) \to F(s)$ uniformly with respect to $(x_0, y_0)$ as $(x, y)$ approaches $(x_0, y_0)$ from the inside of $D$, where $(x_0, y_0)$ is an arbitrary point on $C$, $F(s)$ is a continuous function given on $C$, and $s$ is the arc length along $C$. Put $f(s) = F(s)/\pi$ and

Then it is known that a solution $u$ of the problem can be given in the form

$$u(x, y) = \int_0^l \mu(s) \frac{\partial}{\partial n} \log \frac{1}{r}\,ds,$$

where $r^2 = (\varphi(s) - x)^2 + (\psi(s) - y)^2$, $n$ is the inner normal of $C$, and $\mu(s)$ is a continuous solution of the following Fredholm integral equation of the second kind:

$$\mu(s) = f(s) - \int_0^l K(s; t)\mu(t)\,dt.$$

We can treat the †Neumann problem similarly, i.e., the problem in which condition (ii) is replaced by (ii′) $(\partial u/\partial n)(x, y) \to F(s)$ uniformly with respect to $(x_0, y_0)$ as $(x, y)$ approaches $(x_0, y_0)$ from the inside of $D$. In the Neumann problem, put $f(s) = F(s)/\pi$ and

Then we have a solution $u$ in the form

$$u(x, y) = -\int_0^l \mu(s) \log \frac{1}{r}\,ds + c,$$

where $\mu(s)$ is a solution of the following Fredholm integral equation of the second kind:

$$\mu(s) = f(s) - \int_0^l K(s; t)\mu(t)\,dt.$$

A solution of this integral equation, however, exists when and only when $\int_0^l F(s)\,ds = 0$. In the expression of a solution $u(x, y)$ of the Neumann problem, $c$ is an arbitrary additive constant, up to which a solution of the problem is determined uniquely. We can also treat partial differential equations of †elliptic type in an analogous way.

C. Integral Equations with Continuous Kernel

We describe some results for integral equations with $m$-dimensional independent variables, i.e., equations in which $D$ is an $m$-dimensional closed domain. We assume that $K(x, y)$ and $f(x)$ are continuous in Sections D-H.

D. The Method of Successive Iteration

Among methods of solving Fredholm integral equations of the second kind, the simplest is the method of successive iteration, sometimes called the method of successive approximation [7]. In the method of successive iteration, we rewrite (2) in the form

$$\varphi(x) = f(x) + \int_D K(x, y)\varphi(y)\,dy$$

and replace the function $\varphi(y)$ on the right-hand side by the function

$$f(y) + \int_D K(y, z)\varphi(z)\,dz.$$

If we repeat the process successively, then we have

$$\varphi(x) = f(x) + \sum_{i=1}^{\infty} \int_D K_i(x, y)f(y)\,dy,$$

where

$$K_1(x, y) = K(x, y), \qquad K_i(x, y) = \int_D K_{i-1}(x, s)K(s, y)\,ds.$$
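The method of successive iteration described above can be sketched numerically. The following Python example is an illustration under stated assumptions, not the article's own procedure: it discretizes a one-dimensional equation of the second kind with a midpoint quadrature rule and repeats the substitution $\varphi_{m+1} = f + \int K \varphi_m$. The function name `solve_by_iteration`, the grid size, the number of sweeps, and the test problem ($K(x, y) = xy/2$, $f(x) = x$ on $[0, 1]$) are all hypothetical choices made for the example.

```python
def solve_by_iteration(kernel, f, a, b, n=100, sweeps=60):
    """Approximate phi = f + integral of K(x, y) phi(y) dy by successive iteration."""
    h = (b - a) / n
    nodes = [a + (i + 0.5) * h for i in range(n)]       # midpoint quadrature nodes
    k = [[kernel(x, y) for y in nodes] for x in nodes]   # kernel sampled on the grid
    rhs = [f(x) for x in nodes]
    phi = rhs[:]                                         # zeroth approximation: phi_0 = f
    for _ in range(sweeps):
        # phi_{m+1}(x) = f(x) + integral of K(x, y) phi_m(y) dy
        phi = [rhs[i] + h * sum(k[i][j] * phi[j] for j in range(n))
               for i in range(n)]
    return nodes, phi


# Assumed test problem: K(x, y) = x*y/2 and f(x) = x on [0, 1].
# Substituting phi(x) = c*x gives c = 1 + c/6, so the exact solution is 6x/5.
nodes, phi = solve_by_iteration(lambda x, y: 0.5 * x * y, lambda x: x, 0.0, 1.0)
max_error = max(abs(p - 1.2 * x) for x, p in zip(nodes, phi))
print(max_error)
```

The iteration converges here because the kernel is small; for larger kernels the same loop may diverge, which is the situation Fredholm's method addresses.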

The functions $K_i(x, y)$ are called the iterated kernels. Assume that $\sum_{n=1}^{\infty} K_n(x, y)$ converges uniformly. Then, putting

$$R(x, y) = \sum_{n=1}^{\infty} K_n(x, y), \qquad (4)$$

we obtain a solution of (2) in the form

$$\varphi(x) = f(x) + \int_D R(x, y)f(y)\,dy. \qquad (5)$$

The series (4) is called a Neumann series. For a given kernel $K(x, y)$, a function $R(x, y)$ satisfying

$$K(x, y) - R(x, y) + \int_D K(x, s)R(s, y)\,ds = 0$$

and

$$K(x, y) - R(x, y) + \int_D R(x, s)K(s, y)\,ds = 0$$

is called a resolvent of $K(x, y)$ (in some cases $-R(x, y)$ is taken as the resolvent). If a resolvent of $K(x, y)$ exists, the solution of (2) can be given uniquely by (5). If a Neumann series converges uniformly, (4) gives a resolvent of $K(x, y)$.

If we apply a similar process to Volterra integral equations of the second kind, then we have the iterated kernels defined by

$$K_{i+1}(x, y) = \int_y^x K_i(x, s)K(s, y)\,ds \quad (i = 1, 2, \dots).$$

For these iterated kernels, a Neumann series defined by (4) always converges uniformly.

E. Fredholm's Method

Let $D$ be a bounded closed domain and $K(x, y)$ a continuous kernel. A Neumann series (4) converges uniformly and gives a resolvent if $|K(x, y)|$ or the region $D$ is sufficiently small, but otherwise it does not necessarily converge. E. I. Fredholm [1, 7] gave a method of constructing a resolvent for the more general case. Write a kernel in the form $\lambda K(x, y)$, and put

$$K\begin{pmatrix} x_1, \dots, x_n \\ y_1, \dots, y_n \end{pmatrix} = \begin{vmatrix} K(x_1, y_1) & \cdots & K(x_1, y_n) \\ \vdots & & \vdots \\ K(x_n, y_1) & \cdots & K(x_n, y_n) \end{vmatrix}.$$

Define $D(\lambda)$ and $D(x, y; \lambda)$ by

$$D(\lambda) = 1 + \sum_{n=1}^{\infty} \frac{(-\lambda)^n}{n!} \int \cdots \int_D K\begin{pmatrix} s_1, \dots, s_n \\ s_1, \dots, s_n \end{pmatrix} ds_1 \cdots ds_n$$

and

$$D(x, y; \lambda) = K(x, y) + \sum_{n=1}^{\infty} \frac{(-\lambda)^n}{n!} \int \cdots \int_D K\begin{pmatrix} x, s_1, \dots, s_n \\ y, s_1, \dots, s_n \end{pmatrix} ds_1 \cdots ds_n.$$

The series in these two equations both converge uniformly and hence define †entire functions of $\lambda$. The functions $D(\lambda)$ and $D(x, y; \lambda)$ are called Fredholm's determinant and Fredholm's first minor of the kernel $K(x, y)$, respectively. For small $|\lambda|$, we have

where the $K_n(x, y)$ are iterated kernels corresponding to $K(x, y)$. Now if $D(\lambda) \neq 0$, a resolvent $\lambda R(x, y; \lambda)$ of the kernel $\lambda K(x, y)$ can be given by

$$\frac{D(x, y; \lambda)}{D(\lambda)} = R(x, y; \lambda).$$

If $D(\lambda) = 0$, some extension of the method in this section is needed. Fredholm introduced for this purpose Fredholm's $r$th minor

$$D\begin{pmatrix} x_1, \dots, x_r \\ y_1, \dots, y_r \end{pmatrix}(\lambda)$$

defined by

$$D\begin{pmatrix} x_1, \dots, x_r \\ y_1, \dots, y_r \end{pmatrix}(\lambda) = K\begin{pmatrix} x_1, \dots, x_r \\ y_1, \dots, y_r \end{pmatrix} + \sum_{n=1}^{\infty} \frac{(-\lambda)^n}{n!} \int \cdots \int_D K\begin{pmatrix} x_1, \dots, x_r, s_1, \dots, s_n \\ y_1, \dots, y_r, s_1, \dots, s_n \end{pmatrix} ds_1 \cdots ds_n.$$

F. Eigenvalue Problems and Fredholm's Alternative Theorem

Consider a homogeneous integral equation of the second kind

$$\varphi(x) - \lambda \int_D K(x, y)\varphi(y)\,dy = 0, \qquad (6)$$

where $D$ is a bounded closed domain and $K(x, y)$ is continuous in $D \times D$. When (6) has a nontrivial solution $\varphi(x)$ for some $\lambda$, then $\lambda$ is called an eigenvalue corresponding to the kernel $K(x, y)$, and the corresponding nontrivial solution $\varphi(x)$ is called an eigenfunction corresponding to the kernel $K(x, y)$. If $D(\lambda) \neq 0$, then (6) has no nontrivial solution, from which it follows that eigenvalues must be zero points of the entire function $D(\lambda)$. For an arbitrary eigenvalue $\lambda$, there is a set of linearly independent eigenfunctions corresponding to $\lambda$, such that any eigenfunction corresponding to $\lambda$ can be written as a linear combination of the eigenfunctions belonging to the set under consideration. Such a set of linearly independent eigenfunctions corresponding to an eigenvalue $\lambda$ is called a fundamental system corresponding to the eigenvalue $\lambda$. The number of elements of the fundamental system is called

the index of the eigenvalue 1. The index of an bounded closed domain and K(x, y) be a con-
eigenvalue is always tïnite. The homogeneous tinuous symmetric kernel. In this case the as-
integral equation of the form

$$\psi(y) - \lambda \int_D K(x, y)\,\psi(x)\,dx = 0 \tag{6'}$$

is called an associated (or transposed) integral equation of (6). The associated equation has the same eigenvalues as the original equation; moreover, the index of a common eigenvalue is the same for both equations. For any eigenvalue $\lambda$, the order of the zero point $\lambda$ of the entire function $D(\lambda)$ is called the multiplicity of the eigenvalue $\lambda$. If an eigenvalue $\lambda$ is a pole of $R(x, y; \lambda)$, then we have $p + 1 \ge r + q$, where $r$ is the order of the pole, $p$ is the multiplicity of $\lambda$, and $q$ is the index of $\lambda$. In particular, if $\lambda$ is a simple pole of $R(x, y; \lambda)$ we have $p = q$, namely, the multiplicity is equal to the index. An example with this particular property is the integral equation with a symmetric kernel to be discussed in Section G. For the set of eigenvalues there is no finite accumulation point even if there are infinitely many eigenvalues.

If $\lambda$ is not an eigenvalue, the inhomogeneous equation

$$\varphi(x) - \lambda \int_D K(x, y)\,\varphi(y)\,dy = f(x) \tag{7}$$

can be solved uniquely for any continuous function $f(x)$. In this case we have $D(\lambda) \ne 0$, and the resolvent $R(x, y; \lambda)$ of the kernel $\lambda K(x, y)$ exists. If $\lambda$ is an eigenvalue, we have $D(\lambda) = 0$, and equation (7) has a solution if and only if

$$\int_D f(y)\,\psi(y)\,dy = 0$$

for all solutions $\psi(y)$ of (6') (linearly independent solutions $\psi(y)$ are finite in number). The last statement is called Fredholm's alternative theorem (→ 68 Compact and Nuclear Operators).

A kernel of the type

$$K(x, y) = \sum_{j=1}^{n} X_j(x)\,Y_j(y)$$

is called a separated kernel, degenerate kernel, or Pincherle-Goursat kernel. For such a kernel, we have $D(\lambda) = \det\left(\delta_{jk} - \lambda \int_D X_j(t)\,Y_k(t)\,dt\right)$, and hence we can easily obtain eigenvalues and eigenfunctions. A nondegenerate kernel can be studied using the results obtained for separated kernels, since we can regard a kernel of the general form as the limit of a sequence of separated kernels.

G. Symmetric Kernels

A kernel $K(x, y)$ is called a symmetric kernel if it is real and $K(x, y) = K(y, x)$. For a symmetric kernel, the associated equation (6') clearly coincides with the original equation (6) [5, 6, 8].

Corresponding to any nontrivial symmetric kernel $K(x, y)$, there exist at least one eigenvalue and one eigenfunction. The eigenvalues are all real, and the eigenfunctions corresponding to distinct eigenvalues are mutually orthogonal. If we orthonormalize the eigenfunctions belonging to all fundamental systems and number them according to the order of increasing absolute values of the corresponding eigenvalues, then we have an orthonormal system $\{\varphi_n(x)\}$, called a complete orthonormal system of fundamental functions or simply a complete orthogonal (or orthonormal) system. If we number the eigenvalues taking their multiplicities into account and according to the order of increasing absolute values, then we have the equality

$$\sum_{i=1}^{\infty} \frac{1}{\lambda_i^2} = \int_D \int_D K^2(x, y)\,dx\,dy.$$

Corresponding to an iterated kernel $K_m(x, y)$, we have the eigenvalues $\{\lambda_i^m\}$, and we can choose the corresponding orthonormal system so that it coincides with the one corresponding to $K(x, y)$. Eigenvalues and eigenfunctions corresponding to an iterated kernel can be obtained in the following way: Put $\int_D K_n(s, s)\,ds = u_n$; then the following limit exists:

$$\lim_{n \to \infty} \frac{u_{2n}}{u_{2n+2}} = \lambda^2 < +\infty,$$

which gives an eigenvalue of the iterated kernel $K_2(x, y)$. A function $\varphi(x, y)$ defined by

$$\varphi(x, y) = \lim_{n \to \infty} \lambda^{2n} K_{2n}(x, y)$$

(uniformly convergent) gives the corresponding eigenfunction $\varphi(x, c)$ for any constant $c$ satisfying $\varphi(x, c) \not\equiv 0$.

Let $\lambda^n$ be an eigenvalue corresponding to an iterated kernel $K_n(x, y)$ and $\varphi(x)$ be a corresponding eigenfunction. Consider the functions $\psi_j(x)$ $(j = 0, 1, \dots, n-1)$ defined by

$$\psi_j(x) = \varphi(x) + \sum_{k=1}^{n-1} \varepsilon^{kj} \lambda^k \int_D K_k(x, y)\,\varphi(y)\,dy \qquad (j = 0, 1, 2, \dots, n-1),$$

where $\varepsilon$ is one of the $n$th primitive roots of 1. Then for at least one value of $j$, $\varepsilon^j \lambda$ is an eigenvalue corresponding to the kernel $K(x, y)$, and $\psi_j$ is a corresponding eigenfunction. This relationship between eigenvalues and eigenfunctions corresponding to an iterated kernel and those corresponding to the original kernel is valid even for kernels that are not necessarily symmetric [2].
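The trace limit $u_{2n}/u_{2n+2} \to \lambda^2$ described above can be illustrated numerically. The following is a minimal sketch, not a definitive implementation. It assumes the test kernel $K(x, y) = \min(x, y)$ on $[0, 1]$, whose smallest eigenvalue in the convention $\varphi = \lambda \int K \varphi$ is $\lambda_1 = \pi^2/4$, and it replaces the integrals by a midpoint rule. The function name `iterated_kernel_trace_ratio` and its parameters are hypothetical and chosen only for this illustration.

```python
import math


def iterated_kernel_trace_ratio(kernel, a, b, n_nodes=60, n=4):
    """Approximate u_{2n} / u_{2n+2}, where u_m is the integral of K_m(s, s) ds.

    Assumption: the integrals are replaced by a midpoint rule with n_nodes
    nodes, so the matrix A[i][j] = K(x_i, x_j) * h discretizes the kernel
    and A**m discretizes the iterated kernel K_m.
    """
    h = (b - a) / n_nodes
    xs = [a + (i + 0.5) * h for i in range(n_nodes)]
    A = [[kernel(x, y) * h for y in xs] for x in xs]

    def matmul(P, Q):
        # Plain dense matrix product; Qt holds the columns of Q.
        Qt = list(zip(*Q))
        return [[sum(p * q for p, q in zip(row, col)) for col in Qt] for row in P]

    def trace(P):
        return sum(P[i][i] for i in range(len(P)))

    A2 = matmul(A, A)
    P = A2                       # P = A^2
    for _ in range(n - 1):
        P = matmul(P, A2)        # after the loop, P = A^(2n)
    u_2n = trace(P)              # approximates u_{2n}
    u_2n_plus_2 = trace(matmul(P, A2))   # approximates u_{2n+2}
    return u_2n / u_2n_plus_2


if __name__ == "__main__":
    ratio = iterated_kernel_trace_ratio(lambda x, y: min(x, y), 0.0, 1.0)
    print("trace ratio     :", ratio)
    print("lambda_1 squared:", (math.pi ** 2 / 4) ** 2)
```

Because the ratio is dominated by the eigenvalue of smallest absolute value, a moderate $n$ already suffices here; the remaining discrepancy comes mainly from the quadrature rule rather than from the limit.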

Let a kernel $K^{(n)}(x, y)$ be defined by

$$K^{(n)}(x, y) = K(x, y) - \sum_{i=1}^{n} \frac{\varphi_i(x)\,\varphi_i(y)}{\lambda_i},$$

where the $\lambda_i$ $(i = 1, 2, \dots, n)$ are the eigenvalues corresponding to the kernel $K(x, y)$ and the $\varphi_i(x)$ $(i = 1, \dots, n)$ are the corresponding orthonormalized eigenfunctions. Then eigenvalues and eigenfunctions corresponding to $K^{(n)}(x, y)$ are those corresponding to $K(x, y)$, with the exception of $\lambda_1, \dots, \lambda_n$ and $\varphi_1(x), \dots, \varphi_n(x)$.

Let $\varphi(x)$ be any function that satisfies $\int_D (\varphi(x))^2\,dx = 1$. Then the integral

$$J = \int_D \int_D K_2(x, y)\,\varphi(x)\,\varphi(y)\,dx\,dy$$

assumes the maximum value when $\varphi(x)$ is an eigenfunction corresponding to $K_2(x, y)$ with the smallest eigenvalue $\lambda_1^2$. Let the eigenvalues $\lambda_n$ of $K(x, y)$ be numbered in order of increasing absolute values, so that $|\lambda_n| \le |\lambda_{n+1}|$. Let $\varphi(x)$ be any function satisfying

$$\int_D \psi_i(x)\,\varphi(x)\,dx = 0 \quad (i = 1, 2, \dots, n), \qquad \int_D (\varphi(x))^2\,dx = 1$$

for given functions $\psi_i(x)$ $(i = 1, \dots, n)$. Then the maximum value of the integral $J$ above is the least when the set of all linear combinations of $\{\psi_1(x), \dots, \psi_n(x)\}$ coincides with the set of all linear combinations of $\{\varphi_1(x), \dots, \varphi_n(x)\}$, and in this case the maximum of $J$ is attained by some eigenfunction $\varphi(x)$ corresponding to $K_2(x, y)$ with the eigenvalue $\lambda_{n+1}^2$. The results in this paragraph show that we can obtain eigenvalues by solving a variational problem concerning the integral $J$.

H. Expansion Theorems

Let $K(x, y)$ be a continuous symmetric kernel and $h(x)$ be a function square integrable on a bounded closed domain $D$. Then a function $f(x)$ such that

$$f(x) = \int_D K(x, y)\,h(y)\,dy$$

can be expanded in the form

$$f(x) = \sum_{n=1}^{\infty} c_n \varphi_n(x),$$

where $\{\varphi_i(x)\}$ is a complete orthonormal system of fundamental functions corresponding to $K(x, y)$ and

$$c_n = \int_D f(x)\,\varphi_n(x)\,dx \qquad (n = 1, 2, \dots).$$

The series in the expansion of $f(x)$ converges uniformly. These facts are the content of the Hilbert-Schmidt expansion theorem [5, 6, 8]. By using this theorem, for a $\lambda$ that is not an eigenvalue, we can obtain a solution $\varphi(x)$ of the Fredholm integral equation (7) with a symmetric kernel in the form

$$\varphi(x) = f(x) + \lambda \sum_{i=1}^{\infty} \frac{c_i}{\lambda_i - \lambda}\,\varphi_i(x), \qquad c_i = \int_D f(x)\,\varphi_i(x)\,dx.$$

For $m \ge 2$, iterated kernels can be expanded in the form

$$K_m(x, y) = \sum_{i=1}^{\infty} \frac{\varphi_i(x)\,\varphi_i(y)}{\lambda_i^m}$$

(uniformly convergent). If $\lambda R(x, y; \lambda)$ is a resolvent of a symmetric kernel $\lambda K(x, y)$, then $R(x, y; \lambda)$ can be expanded as

$$R(x, y; \lambda) = K(x, y) + \lambda \sum_{i=1}^{\infty} \frac{\varphi_i(x)\,\varphi_i(y)}{\lambda_i(\lambda_i - \lambda)}.$$

If a symmetric kernel $K(x, y)$ satisfies the inequality

$$\int_D \int_D K(x, y)\,\varphi(x)\,\varphi(y)\,dx\,dy \ge 0$$

for all $\varphi(x)$, it is called a positive (semidefinite) kernel. If in this inequality the equality holds only for $\varphi(x) \equiv 0$, $K(x, y)$ is called a positive definite kernel. For a positive definite kernel, eigenvalues are all positive, and the kernel can be expanded in the form

$$K(x, y) = \sum_{i=1}^{\infty} \frac{\varphi_i(x)\,\varphi_i(y)}{\lambda_i}$$

(uniformly convergent). This result is called Mercer's theorem.

When a real continuous kernel $K(x, y)$ is not symmetric, we consider two positive kernels $\bar{K}'(x, y)$ and $\bar{K}''(x, y)$ defined by

$$\int_D K(x, s)\,K(y, s)\,ds = \bar{K}'(x, y), \qquad \int_D K(s, x)\,K(s, y)\,ds = \bar{K}''(x, y).$$

The eigenvalues corresponding to these kernels are the same, and they are all positive. Let $\lambda_i^2$ $(i = 1, 2, \dots)$ be these eigenvalues and $\{\varphi_i(x)\}$ and $\{\psi_i(x)\}$ be the complete orthonormal systems corresponding to $\bar{K}'$ and $\bar{K}''$, respectively. Then we have

$$\lambda_i \int_D K(y, x)\,\varphi_i(y)\,dy = \psi_i(x), \qquad \lambda_i \int_D K(x, y)\,\psi_i(y)\,dy = \varphi_i(x).$$
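Mercer's theorem, stated earlier in this section, can be checked on a concrete kernel. The sketch below is illustrative only. It assumes the positive definite kernel $K(x, y) = \min(x, y)$ on $[0, 1]$, for which the orthonormal eigenfunctions are $\varphi_i(x) = \sqrt{2}\,\sin(\omega_i x)$ with $\omega_i = (i - \tfrac12)\pi$ and the eigenvalues are $\lambda_i = \omega_i^2$. The function name `mercer_partial_sum` is hypothetical.

```python
import math


def mercer_partial_sum(x, y, terms):
    """Partial sum of sum_i phi_i(x) * phi_i(y) / lambda_i for K(x, y) = min(x, y).

    Assumption: the interval is [0, 1], phi_i(x) = sqrt(2) * sin(omega_i * x),
    omega_i = (i - 1/2) * pi, and lambda_i = omega_i ** 2.
    """
    total = 0.0
    for i in range(1, terms + 1):
        omega = (i - 0.5) * math.pi
        # phi_i(x) * phi_i(y) / lambda_i, with the factor sqrt(2) * sqrt(2) = 2
        total += 2.0 * math.sin(omega * x) * math.sin(omega * y) / omega ** 2
    return total


if __name__ == "__main__":
    x, y = 0.3, 0.7
    for terms in (10, 100, 1000):
        approx = mercer_partial_sum(x, y, terms)
        print(terms, approx, abs(approx - min(x, y)))
```

The tail of the series is bounded by $\sum_{i > N} 2/\omega_i^2$ independently of $x$ and $y$, which is the uniform convergence asserted by the theorem; for this kernel the bound decays like $1/N$.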

Let $f(x)$ be an arbitrary function such that

$$f(x) = \int_D K(x, y)\,h(y)\,dy,$$

where $h(x)$ is a function square integrable on $D$. The function $f(x)$ can then be expanded in the form

$$f(x) = \sum_{i=1}^{\infty} c_i \varphi_i(x), \tag{8}$$

where

$$c_i = \int_D f(x)\,\varphi_i(x)\,dx \qquad (i = 1, 2, \dots).$$

The series in the expansion (8) converges uniformly.

The Fredholm integral equation (1) of the first kind with a general kernel (i.e., a kernel that is not necessarily symmetric) has a square integrable solution $\varphi(x)$ if and only if $f(x)$ has a uniformly convergent expansion (8) and $\sum_{i=1}^{\infty} (c_i \lambda_i)^2 < +\infty$. When this condition is satisfied, equation (1) has a solution given by $\sum_{i=1}^{\infty} c_i \lambda_i \psi_i(x)$ that converges in the sense of †mean convergence.

It should be noted that the theory concerning symmetric kernels can be extended to complex-valued Hermitian kernels, i.e., kernels such that $K(x, y) = \overline{K(y, x)}$. Also, we can obtain the theory in Section G and this section, concerning Fredholm integral equations with continuous kernels, by using the methods of functional analysis that treat $\int_D K(x, y)\,\varphi(y)\,dy$ as a †compact operator in the space of continuous functions (→ 68 Compact and Nuclear Operators).

I. Kernels of Hilbert-Schmidt Type

Kernels of Hilbert-Schmidt type are kernels which are square integrable in the sense of Lebesgue over $D \times D$, where $D$ is an arbitrary domain. Most of the results mentioned in the previous section concerning integral equations with kernels continuous on bounded domains are valid also for equations with kernels of Hilbert-Schmidt type, because every operator mentioned in the previous section is also a compact operator in the space concerned, i.e., the space $L_2(D)$ [6] (→ 68 Compact and Nuclear Operators).

J. Singular Kernels

For general kernels that are not necessarily continuous, the theory described in the previous sections does not apply properly, but when an iterated kernel $K_m(x, y)$ has a resolvent $R_m(x, y)$, we can find a resolvent $R(x, y)$ of $K(x, y)$ in the form

$$R(x, y) = H_m(x, y) + R_m(x, y) + \int_D H_m(x, s)\,R_m(s, y)\,ds,$$

where $H_m(x, y) = \sum_{i=1}^{m-1} K_i(x, y)$. When a kernel is of the form $\lambda K(x, y)$, the relationship between eigenvalues and eigenfunctions corresponding to an iterated kernel and those corresponding to the original kernel, which was stated in Sections G and H, is still valid for general kernels. If a kernel $K(x, y)$ is continuous for $x \ne y$ and has a singularity of the form $|x - y|^{-\alpha}$ $(0 < \alpha < 1)$ on $x = y$, the iterated kernels $K_m(x, y)$ are continuous provided that $(1 - \alpha)m \ge 1$. †Green's functions of partial differential equations of elliptic type have this property.

A kernel that is not square integrable is called a singular kernel. An integral equation whose domain of definition is unbounded or whose kernel is singular is called a singular integral equation [9]. Singular integral equations have some particular properties that are not seen in ordinary integral equations, i.e., integral equations with kernels continuous in a bounded closed domain. For example [10], consider the identity

$$\int_0^{\infty} \sqrt{\frac{2}{\pi}}\,\sin xy \left( \sqrt{\frac{\pi}{2}}\,e^{-ay} + \frac{y}{a^2 + y^2} \right) dy = \sqrt{\frac{\pi}{2}}\,e^{-ax} + \frac{x}{a^2 + x^2},$$

where $a$ is an arbitrary real number. This equality shows that for the continuous kernel $\sqrt{2/\pi}\,\sin xy$, 1 is an eigenvalue and $\sqrt{\pi/2}\,e^{-ax} + x/(a^2 + x^2)$ is a corresponding eigenfunction. Since $a$ is arbitrary, the index of the eigenvalue 1 is evidently infinity. As another example, observe the equality

$$\int_{-\infty}^{\infty} e^{-|x - y|}\,e^{-iay}\,dy = \frac{2}{1 + a^2}\,e^{-iax},$$

where $a$ is an arbitrary real number. From this equality, we see that for the continuous kernel $e^{-|x - y|}$ defined on $(-\infty, \infty)$, any number $\lambda = (1 + a^2)/2$ greater than or equal to $1/2$ is an eigenvalue. In this example, the spectrum, i.e., the set of eigenvalues, is a continuum. Such a spectrum is called a continuous spectrum.

In applications, an important role is played by integral equations with kernels of Carleman type:

$$K(x, y) = \frac{G(x, y)}{y - x},$$

where $G(x, y)$ is a bounded function. In integral equations with such kernels, the integral

is taken in the sense of the KJauchy principal where


value [S, 9,11,12]. For example, the Riemann-
O(x) = <pi(X -i+ l), F(x)=J(x-i+l),
Hilbert problem is the following: We are given
a simple closed and smooth curve L in the K(x,y)=K,(x-i+ l,y-j+ 1)
complex plane and real-valued smooth func-
(i-l<x,j-lgy<j;i,j=1,2 ,..., n).
tions a, b, and c defïned on L, a + ib never
vanishing. The problem is to fïnd a function A system of Volterra integral equations of the
<p(z) which is holomorphic in the exterior of L, second kind cari be reduced to a single equa-
at most of polynomial growth at infinity, and tion by eliminating the unknown functions
continuous up to L with boundary value cp+ successively.
such that Re [(a + ib)<p’] = c on L. This prob-
lem is reduced to that of fïnding a function p

Gk
P(Z)0 (~EL)>
defmed on L satisfying an integral equation

-sLZ-iAi)d:=f(4 L. Integral

Consider a Volterra
lïrst kind
Equations of Volterra

integral equation
Type

of the

where G is a smooth kernel determined by


a + ib, f is a known function depending on c,
and the integral is taken in the sense of the x K(x, YMY)~Y =fC4
sa
Cauchy principal value. The integer K defïned
such that K (x, x) # 0 and K,(x, y) and f’(x) are
by K = ( 1/7-c)sLd(arg(a - ib)) is called the in-
continuous. If we differentiate both sides of the
dex of this problem. The full solution of the
equation, then we have a Volterra integral
Riemann-Hilbert problem was given by 1. N.
equation of the second kind:
Vekua [ 111.
A multidimensional analog of the Cauchy
principal value is the singular integral of A. P.
Calderon and A. Zygmund (- 251 Linear

s;~dNy=î(x)
C<cc<l).
Operators). A smooth function k(x) detïned Abel’s integral equation of general form is
in R” except at x = 0 is called a kernel of
Calderh-Zygmund type if k(x) is positively (9)
homogeneous of order -n and if its integral
mean on the unit sphere is zero. Then the If G, G,, and f’ are continuous and G(x, x) # 0,
operator K deiïned by equation (9) cari be reduced to the equation
” Y
Kf(x) = lim WYC-y)dy H(u,yMy)dy= f(x)(U-x)dl-ldx>
E-O s lyi>E s0 s0

is called a Calderbn-Zygmund singular integral where


operator [13]. K is a bounded linear operator ”
WL Wx
inLP(R”)ifl<p<+co.Ifn=landk(x)= H(u, Y) =
s y (U-X)l-=(X-y)a’
l/(nix), K is nothing but the THilbert trans-
formation. The pseudodifferential operator (- Since H(u, u)=(n/sinctx)G(u, u)#O, it follows
345 Pseudodifferential Operators) is an exten- that
sion in some sense of the singular integral
’ ~L(U> Y)
operator. du) + o ,(UdyVy=d4>
s ’
where
K. Systems of Integral Equations
g(u)=H(u,u)-‘; ‘f(x)(u-x).mldx
A system of Fredholm integral equations of s0
the second kind cari always be reduced to a
= H(u, u)-’ u”-‘f(0)

1
single equation. In fact, as is seen easily, a
system of integral equations

<pi(x)-nC
i s 0
Kij(x, Y)Pj(Y)dY=.L(x)

(i=1,2,...,n)
+

Clearly (9a) is a Volterra integral equation of


the second kind. Abel% problem (- Section A)
s0
“(u-x).-‘f’(x)dx
>
.

cari be reduced to a single equation was to find the path of a falling body for a
given time of descent. The problem then cari
n
be reduced to the solution of equation (9) with
Q(x) - 1 K(x,y)W)dy=F(x) (O<xan),
s0 G(x,y)=l and a=1/2. When G(x,y)=l, we

can solve equation (9) explicitly to get

$$\varphi(x) = \frac{\sin \alpha\pi}{\pi} \left( \frac{f(0)}{x^{1-\alpha}} + \int_0^x \frac{f'(y)}{(x - y)^{1-\alpha}}\,dy \right).$$

M. Nonlinear Integral Equations

When a nonlinear integral equation includes a parameter $\lambda$, it may happen that the parameter has a bifurcation point, i.e., a value $\lambda_0$ such that the number of real solutions is changed when $\lambda$ varies through $\lambda_0$ taking real values. For example, consider the equation

$$\varphi(x) - \lambda \int_0^1 \varphi^2(y)\,dy = 1.$$

This equation has real solutions $\varphi(x) = (1 \pm \sqrt{1 - 4\lambda})/2\lambda$ for $\lambda < 1/4$ but no real solution for $\lambda > 1/4$. Hence $\lambda_0 = 1/4$ is a bifurcation point (→ 286 Nonlinear Functional Analysis).

Among nonlinear integral equations, Hammerstein's integral equation has been studied in detail [5, 14]. It is an equation of the form

$$\varphi(x) + \int_D K(x, y)\,f(y, \varphi(y))\,dy = 0. \tag{10}$$

If $K(x, y)$ and $f(y, 0)$ are square integrable and $f(y, u)$ satisfies a †Lipschitz condition in $u$ with a sufficiently small coefficient, then the integral equation (10) can be solved by successive approximations. If $K(x, y)$ is a square integrable positive kernel and $\int_D |K(x, y)|^2\,dy$ is bounded, then we can prove the existence and uniqueness of a solution of (10) under a condition on $f(y, u)$ weaker than a Lipschitz condition. We can prove similar results for equation (10) with a nonsymmetric kernel when $K(x, y)$ is continuous in the mean, that is,

$$\lim_{x' \to x} \int_D |K(x', y) - K(x, y)|^2\,dy = 0, \qquad \lim_{y' \to y} \int_D |K(x, y') - K(x, y)|^2\,dx = 0.$$

A nonlinear Volterra integral equation of the form

$$\varphi(x) = f(x) + \int_a^x F(x, y; \varphi(y))\,dy$$

can be solved by successive approximations if $F(x, y; u)$ and $f(x)$ are square integrable, $F(x, y; u)$ satisfies a Lipschitz condition $|F(x, y; u') - F(x, y; u'')| \le k(x, y)\,|u' - u''|$ with some square integrable function $k(x, y)$, and $\int_a^x F(x, y; f(y))\,dy$ is majorized by some square integrable function of $x$. When $F(x, y; u)$ and $f(x)$ are continuous, we can obtain theorems on the existence and uniqueness of continuous solutions and †comparison theorems similar to those for initial value problems in ordinary differential equations [16].

N. Numerical Solution

For the numerical solution of integral equations, we assume throughout that the functions appearing are all continuous and the solution of every equation is unique. Methods of numerical solution can be divided roughly into two classes. Methods of one class try to evaluate numerically the analytical solution described in the preceding sections, and methods of the other try to obtain a solution by transforming the problem to one that is numerically solvable.

(1) A Method Based on Numerical Quadrature. Consider the integral equation

$$\int_a^b F(x, y, \varphi(x), \varphi(y))\,dy = 0.$$

Let $a = x_1 < x_2 < \dots < x_n = b$ be points on the interval $[a, b]$ and $\varphi_1, \varphi_2, \dots, \varphi_n$ be the values of $\varphi(x)$ at $x_1, x_2, \dots, x_n$. By the use of numerical quadrature, we then have the following system of equations in $\varphi_i$:

$$\sum_{j=1}^{n} w_j\,F(x_i, x_j, \varphi_i, \varphi_j) = 0 \qquad (i = 1, 2, \dots, n),$$

where the $w_j$ are the weights of the quadrature formula. The method corresponds to that of solving ordinary differential equations by their difference equation analogs. Hence the errors involved in the solutions obtained by this method can be analyzed similarly to those in the case of numerical solution of ordinary differential equations (→ 303 Numerical Solution of Ordinary Differential Equations). If the given integral equation is a Fredholm equation of the second kind, then we have a system of linear equations in $\varphi_i$. If we apply quadrature formulas to the integral appearing in the integral equation by using the values of $\varphi_i$ obtained, then we have a formula by which the solution can be evaluated directly, that is, without using an interpolation formula.

(2) A Method Utilizing Recurrence Formulas. Let $d_n$ and $d_n(x, y)$ be the respective coefficients of $\lambda^n$ in the expansions of Fredholm's determinant $D(\lambda)$ and Fredholm's first minor $D(x, y; \lambda)$. They satisfy the recurrence formulas

$$d_n(x, y) = d_n K(x, y) + \int_a^b K(x, s)\,d_{n-1}(s, y)\,ds,$$

$$d_{n+1} = -\frac{1}{n+1} \int_a^b d_n(s, s)\,ds,$$

$$d_0 = 1, \qquad d_0(x, y) = K(x, y).$$

By the use of these formulas, we can compute

$d_n$ and $d_n(x, y)$ successively and hence evaluate $D(\lambda)$ and $D(x, y; \lambda)$ approximately, and by means of these recurrence formulas we can readily obtain a solution of a Fredholm equation of the second kind.

(3) A Method Utilizing Approximate Kernels. If we replace a kernel by an approximate one in a Fredholm integral equation of the second kind, then we have an integral equation that has a solution approximately equal to the solution of the original equation. Hence if we can find an approximate kernel for which an integral equation can be solved numerically or analytically, then we can find an approximation to the desired solution by solving the modified equation. For such solutions, a method of error estimation was given by F. G. Tricomi [17].

(4) An Iterative Method. Consider the integral equation

$$\varphi(x) = \int_a^b F(x, y, \varphi(x), \varphi(y))\,dy.$$

Let $\varphi_0(x)$ be an adequate function, and define $\varphi_n(x)$ successively by

$$\varphi_{n+1}(x) = \int_a^b F(x, y, \varphi_n(x), \varphi_n(y))\,dy.$$

If the sequence $\{\varphi_n(x)\}$ converges, then the limit $\lim_{n \to \infty} \varphi_n(x) = \varphi(x)$ is a solution of the given equation, and hence we can obtain an approximation to a solution by calculating $\varphi_n(x)$ for some finite $n$. This method can be used effectively for Fredholm integral equations of the second kind with a parameter $\lambda$, provided that the absolute value of $\lambda$ is smaller than the least absolute value of the eigenvalues.

(5) Variational Method. If some conditions are fulfilled, an integral equation of the form

$$G(x, \varphi(x)) + \int_a^b F(x, y, \varphi(x), \varphi(y))\,dy = 0$$

can be regarded as an †Euler-Lagrange equation for a variational problem

$$J[u] = \int_a^b \int_a^b E(x, y, u(x), u(y))\,dx\,dy + \int_a^b H(x, u(x))\,dx = \text{extremal}. \tag{11}$$

In this case, we can find a solution of the given integral equation numerically by solving the variational problem (11) numerically [15].

(6) Enskog's Method. Suppose that $\{\varphi_n(x)\}$ is a complete orthonormal system for the Fredholm integral equation (7). If we put $\psi_n(x) = \varphi_n(x) - \lambda \int_a^b K(y, x)\,\varphi_n(y)\,dy$, then from (7) we have

$$\int_a^b \varphi(x)\,\psi_n(x)\,dx = \int_a^b f(x)\,\varphi_n(x)\,dx, \tag{12}$$

and furthermore we see that $\{\psi_n(x)\}$ can be orthonormalized to yield a complete orthonormal system $\{\chi_n(x)\}$. The equality (12) then shows that the Fourier coefficients of a solution $\varphi(x)$ with respect to the system $\{\chi_n(x)\}$ can be obtained readily from the Fourier coefficients of $f(x)$ with respect to the system $\{\varphi_n(x)\}$. This method of obtaining a solution is called Enskog's method.

For Volterra integral equations, methods (1) and (4) can be used effectively. We usually transform equations of the first kind into equations of the second kind by differentiation and then apply the above numerical methods. This is done for the sake of securing stability of the numerical methods.

References

[1] I. Fredholm, Sur une classe d'équations fonctionnelles, Acta Math., 27 (1903), 365-390.
[2] G. Vivanti and F. Schwank, Elemente der Theorie der linearen Integralgleichungen, Helwingsche Verlagsbuchhandlung, 1929.
[3] E. Hellinger and O. Toeplitz, Integralgleichungen und Gleichungen mit unendlichvielen Unbekannten, Teubner, 1928.
[4] R. Courant and D. Hilbert, Methods of mathematical physics I, II, Interscience, 1953, 1962.
[5] F. G. Tricomi, Integral equations, Interscience, 1957.
[6] K. Yosida, Lectures on differential and integral equations, Interscience, 1960. (Original in Japanese, 1950.)
[7] E. Goursat, Cours d'analyse mathématique III, Gauthier-Villars, 1928.
[8] F. Smithies, Integral equations, Cambridge, 1958.
[9] T. Carleman, Sur les équations intégrales singulières à noyau réel et symétrique, Uppsala, 1923.
[10] T. Lalesco, Introduction à la théorie des équations intégrales, Gauthier-Villars, 1912.
[11] N. I. Muskhelishvili, Singular integral equations, Noordhoff, 1953. (Original in Russian, 1946.)
[12] S. G. Mikhlin, Multidimensional singular integrals and integral equations, Pergamon, 1965. (Original in Russian, 1962.)
[13] A. P. Calderón and A. Zygmund, Singular integral operators and differential equations, Amer. J. Math., 79 (1957), 901-921.
833 218 c
Integral Geometry

[14] H. Schaefer, Neue Existenzsätze in der Theorie nichtlinearer Integralgleichungen, Akademie-Verlag, 1955.
[15] L. Lichtenstein, Vorlesungen über einige Klassen nichtlinearer Integralgleichungen und Integro-differentialgleichungen, Springer, 1931.
[16] T. Sato, Sur l'équation intégrale non linéaire de Volterra, Compositio Math., 11 (1953), 271-290.
[17] H. Bückner, Die praktische Behandlung von Integralgleichungen, Springer, 1952.
[18] L. Collatz, The numerical treatment of differential equations, Springer, 1960.
[19] L. V. Kantorovich and V. I. Krylov, Approximate methods of higher analysis, Interscience, 1958. (Original in Russian, 1949.)

218 (VII.19)
Integral Geometry

A. General Remarks

Integral geometry, in the broad sense, is the branch of geometry concerned with integrals on manifolds, but the problems considered in integral geometry are usually of a more limited nature. If a †Lie group $G$ acts on a †differentiable manifold $M$ as a †Lie transformation group, $G$ also acts on various figures on $M$, by which we mean geometric objects such as †submanifolds of $M$, †tangent $r$-frame bundles on $M$, etc. Let $\mathfrak{F}$ be a set of such figures on $M$ invariant under $G$ (i.e., $gF \in \mathfrak{F}$ for $g \in G$, $F \in \mathfrak{F}$). Consider the following problems: (i) to know whether any $G$-invariant †measure $\mu$ on $\mathfrak{F}$ exists, and how to determine $\mu$ if it exists; (ii) to find the integral $\int \varphi(F)\,d\mu(F)$ of functions $\varphi$ on $\mathfrak{F}$ with respect to the measure $\mu$.

The term integral geometry was introduced by W. Blaschke, who considered the special case of problem (ii) in which $\varphi(F)$ is a function representing geometric properties of $F$ and the integral is to be evaluated by means of the geometric invariants concerning $F$ [1]. Problems of so-called geometric probability (such as the problem of Buffon's needle) belong to this category. The measure $\mu$ is called the kinetic measure (or kinetic density), and $d\mu(F)$ is also denoted by $dF$. If $\mathfrak{F}$ has the structure of an $n$-dimensional differentiable manifold and the measure $\mu$ is given by a †volume element $\omega$ (i.e., a positive †differential form of degree $n$), we denote $\omega$ also by $dF$. Problem (i) is simple: If $G$ acts †transitively on $\mathfrak{F}$, then $\mathfrak{F} = G/H$, where $H$ is the †isotropy subgroup of $G$. In this case $\mathfrak{F}$ has the structure of a differentiable manifold, and if a $G$-invariant †(Radon) measure exists, it is unique up to a multiplicative constant. A condition for the existence of a $G$-invariant measure $\mu$ can be given by means of †Haar measures of $G$ and $H$ (→ 225 Invariant Measures). We now consider some examples.

B. Crofton's Formula

Let $G(p, \theta)$ be a straight line defined by the equation $x_1 \cos\theta + x_2 \sin\theta = p$ with respect to orthogonal coordinates in a Euclidean plane. Let $n(p, \theta)$ be the number of intersections of $G(p, \theta)$ with a curve $C$ of length $L$. Then we have Crofton's formula,

$$\int n(p, \theta)\,dp\,d\theta = 2L, \tag{1}$$

where $dp\,d\theta$ is the †exterior product of the differential forms $dp$, $d\theta$ of degree 1, and the integral is extended over $p \in (-\infty, \infty)$ and $\theta \in [0, 2\pi)$.

C. Poincaré's Formula and the Principal Formula of Integral Geometry

The kinetic density $dF$ of a figure $F$ congruent (with the same orientation) to a fixed figure in a Euclidean plane is defined as follows: Let $R$ be an orthogonal frame attached to $F$, $(x_1, x_2)$ be the coordinates of the origin of $R$ with respect to a fixed orthogonal frame $R_0$, and $\theta$ be the angle between the first axis of $R$ and the first axis of $R_0$. If we put $dF = dx_1\,dx_2\,d\theta$ (exterior product), $dF$ has the following invariance properties: (i) $dF$ is not changed by displacements of $F$; (ii) $dF$ is not changed if instead of $R$ we take another orthogonal frame $R'$ attached to $F$.

Let two plane curves $C_1$, $C_2$ of length $L_1$, $L_2$, respectively, be given, and suppose that $C_1$ is fixed and $C_2$ is mobile. If the number of intersections of $C_2$ in an arbitrary position with $C_1$ is finite and equal to $n$, then the integral of $n$ extended over all possible positions of $C_2$ is given by

$$\int n\,dC_2 = 4 L_1 L_2$$

(Poincaré's formula). L. A. Santaló applied this result to give a solution of the †isoperimetric problem (1936).

Let $C_1$, $C_2$ be two plane †Jordan curves of length $L_1$, $L_2$, respectively, and let $S_1$, $S_2$ be the areas of the domains bounded by $C_1$, $C_2$, respectively. Suppose that $C_1$ is fixed and $C_2$ mobile, and let $\chi$ be the number of connected domains common to the domains bounded by $C_1$, $C_2$ for $C_2$ in an arbitrary position. Then the integral of $\chi$ extended over all possible

positions of C, intersecting Ci is given by MI’), A4i2’ be the integrals Mi defined by (4) for

s XdC,=L,L,+2?T(S,+S,)

(Blaschke Cl]). This is the principal formula


(3)

of
C, , .X2. If C, is fixed, Z2 is mobile, and the
+Euler-Poincaré characteristic
lïnite, then the generalization
form
x of D, n D, is
of (3) has the

integral geometry. Many formulas cari be


derived from it as special cases or limiting XdC,=I ,... I,-, M;?,Vz+M;?,V,
cases. s

D. Generalization to Dimension n
(Cher& formula), where dC2 is the kinetic
densityofC,andI,(k=l,...,n-l)isthearea
The kinetic density of subspaces of dimension
of the unit sphere in a Euclidean space of
k in a Euclidean space and in a spherical space
dimension k+ 1, with the integral extended
of dimension n was given by Blaschke, while
over a11 positions of C2 intersecting C,
the generalization of the principal formula
Let dE be the kinetic density of the sub-
(3) to a Euclidean space of dimension n was
spaces E of dimension k intersecting a com-
given by S. S. Chern, applying the methods of
pact orientable hypersurface C of class C2, and
E. Cartan.
let x be the Euler characteristic of the intersec-
Let (e, , , e,) be a positively oriented ortho-
tion of E with the domain bounded by Z. The
normal frame with vertex A. The inlïnitesimal
integral j x dE extended over all hyperplanes of
relative displacements are then given by
dimension k intersecting Z is proportional to
Mk relative to the hypersurface Z defmed by
dA= E mie,, de,= i wijej,
i=1 j=l (4). This fact generalizes (1). Further generali-
zations were obtained by Chern (1966).
where wi = (dA, e,), wij = (de,, e,) = - wji are
differential forms of degree 1 in the orthogonal
coordinates of A and the n(n - 1)/2 variables E. Other Generalizations
that determine e,, , e,. For various posi-
tions of a figure, we take an orthogonal frame For 2-dimensional spaces of constant curva-
(A, e,, , e,) tïxed to this figure and form the ture, Santalo derived formulas analogous to
texterior product those in a Euclidean plane (194221943) and
thus solved the isoperimetric problem in these
I i<j spaces. In 1952, he derived a formula corre-
sponding to (5) in n-dimensional spaces of *
of a11 the wi and 0, (i <j). This has the invar-
iance properties (i) and (ii) of Section C and constant curvature, following Chern’s method.
He investigated further integral geometry in
is, by defmition, the kinetic density of the fig-
affine, projective, and Hermitian spaces.
ure in an n-dimensional Euclidean space.
Chern and others obtained the results of the
Moreover, the kinetic density of dE of k-
previous sections by Cartan’s method of gen-
dimensional subspaces E cari be obtained by
eral moving frames and studied integral geom-
considering the orthogonal frames such that
etry in the setting of the geometry of Lie
the vertex A and e, , , ek lie on E and by
transformation groups in the sense of F. Klein
forming the exterior product of the corre-
(- 137 Erlangen Program).
spondingw,,w,,(a=l,..., k;L=k+l,..., n),
Chern, P. Griffiths, and others studied the
dE = Aw&o,~. value distribution theory of holomorphic
Let Z be a compact orientable hypersurface mappings in several complex variables from
the point of view of integral geometry (1961)
of class C2 in an n-dimensional Euclidean
(- 21 Analytic Functions of Several Complex
space, and let k, (U = 1, , n - 1) be the tprin-
Variables, 124 Distribution of Values of Func-
cipal curvatures at a point on C. Denote by Si
tions of a Complex Variable).
the telementary symmetric form of degree i in
k,(i=l ,..., n-l),andputS,=l.Thencon-
sider the integrals over Z: F. Radon Transforms

Mi=[zSidS/(n;l), i=O,l,...,n-1, (4) Another important topic of integral geome-


try is the theory of Radon transforms. Let 9
where dS denotes the surface element of Z. Let be the set of hyperplanes <(w,p)={x~R”I(x,w)
D,, D, be the domains bounded by two com- = p} in the Euclidean space R”, where w =
pact orientable hypersurfaces C, , Z, of class (a,, ,A,) is a unit vector, (x, w) = C x,1.,, and
C2 with volume k’, , V2, respectively, and let p is real. For a function f defined in R”, detïne

A formula corresponding to +Plancherel’s


theorem and an analog of the +Paley-Wiener
theorem for the Fourier transform are valid
for the Radon transform (1. M. Gel’fand et al.
where LErx is the +Volume element on the CW.
hyperplane 5 such that drx A C Ai dxi= dx, with
dx the volume element of R”. Then f is called
the Radon transform of ,f: For example, if f is G. Horospheres
the tcharacteristic function of a bounded
domain V, the value f(t) of the Radon trans- The theory of the Radon transform is also
form ,f of f at 5 ~9 is the volume of the sec- important in noncompact tsymmetric Rie-
tion of V by 5. Now the group G of tmotions mannian spaces M. The connected component
of R” (the tconnected component of the iden- G of the identity in the group of isometries of
tity of the group of isometries) acts on .F tran- M is isomorphic to the tadjoint group Ad G
sitively. For every XER”, the tisotropy sub- and cari be considered as a linear group. Maxi-
group G, of G with respect to x acts transi- mal tunipotent subgroups of G are conjugate
tively on the set k = { 5 E 9 1x E t}. Since G, is to each other. If N is such a subgroup, we cal1
compact, there exists a unique normalized G,- the +orbits on A4 of gNg-’ horospheres on M
invariant measure p on X such that PL(X) = 1. for g E G. These correspond to the hyperplanes
For a function g on 8, the conjugate Radon in R”. If M is the complex Upper half-plane
transform d cari now be defïned by with the thyperbolic non-Euclidean metric,
the horospheres are precisely the circles tan-
gent to the real axis and the straight lines
parallel to the real axis.
The group G acts transitively on the set
as a function on R”.
.F of horospheres on M, and we have F =
The determination of 8 belongs to problem
(ii) of integral geometry mentioned in Section G/M, N. Here G = KAN is an +Iwasawa de-
A. In particular, it is important to determine composition of G, and Mo is the tcentralizer
9 = (,f)” for y =f^ and to fïnd the relation be- of A in K. For a horosphere CE$-, let drx be
the volume element on 5 with respect to the
tween (f)” and f: These problems were solved
by J. Radon for n = 2,3 and by F. John in the +Riemannian metric on 5 induced by the Rie-
general case. The results cari be formulated as mannian metric on M, and detïne the Radon
follows: transform f of a function 1 on M by (6) as a
In the case of odd n, let Y’ be the space of function on 8. For every XE M, there exists a
trapidly decreasing C”-functions (- 168 unique normalized measure on X = { 5 E 8 (
Function Spaces). Let Y*(p) be the set of XE <} invariant under the (compact) isotropy
g(w,p)EY(F) such that ~Tmg(~~,p)pkdp=O for subgroup G, of G at x (p(X)= 1). The conju-
every natural number k and every w. For every gate Radon transform .4 of a function g on 9
is defined by formula (7) by means of this
fi 9’(R”) and every g E Y’*(F), we then have
measure p. Then there exists an integrodif-
ferential operator A such that if A* is the
adjoint operator, we have the inversion for-
where A is the tlaplacian in R” and Lg(w, p)
mula f=(AA*f)- and Plancherel’s theorem:
=d2g(w,p)/dp2, c=T(n/2)-‘(27~i)‘~“~~“‘~.
In the case of even n, for every fc,Y(R”),
gEy*m,

where dx, d< are G-invariant measures on M,


where g, respectively, and ,f is an arbitrary C”-
function with compact support. If the +Cartan
f(y)lx-A-‘“4 subgroups of G are conjugate to each other, f,

JAd(w>p)=
sghdlp-ql-“dq.
is a differential operator; the inversion formula
cari then be written in the form f=L((f^)‘)
with some differential operator L on M [9].
R
S. Helgason applied the Radon transform
These integrals are in general divergent, and on noncompact symmetric Riemannian spaces
they must be interpreted as regularizations to solve differential equations on these spaces
defined by analytic continuation with respect (1973).
to the powers of Ix-y1 or Ip-q1 [9]. Horospheres and Radon transforms cari be
John applied the Radon transform on R” to defined not only for symmetric Riemannian
the study of partial differential equations in R” spaces G/K, but also for various thomoge-
with constant coefficients [SI. neous spaces G/H of noncompact semisimple
270 H 1002
Measure Theory

$x_k \le b_k$, $k = 1, 2, \ldots, n\}$ ($-\infty < a_k < b_k < \infty$) is
defined by $m(I) = \prod_{k=1}^{n} (b_k - a_k)$. Let $\mathfrak{M}_0$ be the
collection of all sets that can be represented as the finite union of
disjoint left-open intervals, and for such an expression
$A = \bigcup_{j} I_j$ in $\mathfrak{M}_0$ define $m(A) = \sum_{j} m(I_j)$.
Let $\mathfrak{M} = \mathfrak{M}_0 \cup \{\emptyset\}$ and $m(\emptyset) = 0$.
Then $m$ gives a finitely additive measure on $\mathfrak{M}$ that is
completely additive. Therefore $m$ determines an outer measure $\mu^*$,
which in turn determines a measure $\mu$. This $\mu^*$ is called the
Lebesgue outer measure (or simply outer measure). Sets that are measurable
with respect to $\mu^*$ are said to be Lebesgue measurable (or simply
measurable), and the measure $\mu$ is called the (n-dimensional) Lebesgue
measure (or simply measure). Every interval is measurable, and its measure
coincides with its volume. Open sets, closed sets, and Borel sets are all
measurable. More generally, suppose that an outer measure $\mu^*$ defined
on a metric space $X$ with a metric $d$ satisfies the condition: If
$d(A, B) > 0$, then $\mu^*(A \cup B) = \mu^*(A) + \mu^*(B)$. Then every
closed subset of $X$ is $\mu^*$-measurable, and therefore so is every Borel
subset. Here $d(A, B) = \inf\{d(a, b) \mid a \in A, b \in B\}$ denotes the
distance between the two sets $A$ and $B$. A measure $\mu$ defined on the
class of all Borel subsets of a topological space $X$ is called a Borel
measure. The cardinality of the set of all Lebesgue measurable subsets of
$\mathbf{R}^n$ is $2^{\mathfrak{c}}$, while the cardinality of the class of
Borel sets is $\mathfrak{c}$ (here $\mathfrak{c}$ is the cardinal number of
the †continuum, i.e., the cardinal number of $\mathbf{R}$). Therefore there
exists a Lebesgue measurable set that is not a Borel set. It follows
immediately from the definition that the Lebesgue measure is K-regular, so
for every Lebesgue measurable set $A$ we can find an †$F_\sigma$-set $B$
and a †$G_\delta$-set $C$ such that $B \subset A \subset C$ and
$\mu(C - B) = 0$.

Historically, for a bounded subset $A$ of $\mathbf{R}^n$, C. Jordan defined
$\overline{m}(A)$ to be the infimum of all possible values $m(B)$, where
$B \in \mathfrak{M}$ and $B \supset A$, and $\underline{m}(A)$ to be the
supremum of all possible values $m(B)$, where $B \in \mathfrak{M}$ and
$B \subset A$. He called $\overline{m}(A)$ the outer volume of $A$ and
$\underline{m}(A)$ the inner volume of $A$ (in the case of $\mathbf{R}^2$,
the outer area and inner area, respectively). When
$\overline{m}(A) = \underline{m}(A)$, $A$ is called Jordan measurable, and
this common value is defined to be the Jordan measure (Jordan content) of
$A$. Jordan measure is only finitely additive, and was found to be
unsatisfactory in many respects. It was Lebesgue who modified this notion
and introduced completely additive measures. Jordan measurable sets are
always Lebesgue measurable. Using the †axiom of choice, a set that is not
Lebesgue measurable can be constructed (G. Vitali). For example, a set
obtained by choosing exactly one element from each coset of the additive
group of all rationals in the additive group of the reals is not
measurable [3, pp. 67–70].

H. Product Measure

When two σ-finite complete measures $\mu_X$ and $\mu_Y$ are defined on
completely additive classes $\mathfrak{B}_X$ and $\mathfrak{B}_Y$ on $X$
and $Y$, respectively, an element $C$ belonging to the smallest finitely
additive class $\mathfrak{K}$ in the Cartesian product that contains
$\{A \times B \mid A \in \mathfrak{B}_X, B \in \mathfrak{B}_Y\}$ can be
represented as a finite disjoint union $C = \bigcup_{j=1}^{n} (A_j \times B_j)$
($A_j \in \mathfrak{B}_X$, $B_j \in \mathfrak{B}_Y$). If we define
$\nu(C) = \sum_{j=1}^{n} \mu_X(A_j)\mu_Y(B_j)$ (here we agree to put
$0 \cdot \infty = 0$), then this value is independent of the way the set $C$
is represented, and $\nu$ defines a completely additive measure on
$\mathfrak{K}$. By extending $\nu$ by means of Hopf's extension theorem, we
obtain a (complete) measure space, called the (complete) product measure
space obtained from the measure spaces $(X, \mathfrak{B}_X, \mu_X)$ and
$(Y, \mathfrak{B}_Y, \mu_Y)$. The measure obtained in this way on the space
$X \times Y$ is called the product measure of $\mu_X$ and $\mu_Y$ and is
denoted by $\mu_X \times \mu_Y$. If we denote by $\mathfrak{M}_p$ the class
of all Lebesgue measurable subsets of the p-dimensional Euclidean space
$\mathbf{R}^p$ and by $m_p$ the p-dimensional Lebesgue measure, then the
(complete) product measure space of $(\mathbf{R}^p, \mathfrak{M}_p, m_p)$
and $(\mathbf{R}^q, \mathfrak{M}_q, m_q)$ is
$(\mathbf{R}^{p+q}, \mathfrak{M}_{p+q}, m_{p+q})$. The product measure space
of any finite number of measure spaces $(X_i, \mathfrak{B}_i, \mu_i)$
($i = 1, \ldots, n$) is defined similarly.

Let $X_\lambda$ ($\lambda \in \Lambda$) be spaces with an index set of
arbitrary cardinality. For the product space
$X = \prod_{\lambda \in \Lambda} X_\lambda$, an n-cylinder set, or simply a
cylinder set, is a set of the form
$A \times \prod_{\lambda \notin \{\lambda_1, \ldots, \lambda_n\}} X_\lambda$
($A \subset X_{\lambda_1} \times \cdots \times X_{\lambda_n}$). If a
finitely additive class $\mathfrak{A}_\lambda$ is given for each
$X_\lambda$, the class of all subsets that can be represented as the union
of a finite number of cylinder sets of the form
$A_1 \times A_2 \times \cdots \times A_n \times \prod_{\lambda \notin \{\lambda_1, \ldots, \lambda_n\}} X_\lambda$
($A_j \in \mathfrak{A}_{\lambda_j}$, $j = 1, 2, \ldots, n$) is a finitely
additive class in the space $X$. When each $\mathfrak{A}_\lambda$ is
completely additive, the completely additive class $\mathfrak{A}$ generated
by this finitely additive class is called the product of the completely
additive classes $\mathfrak{A}_\lambda$ and is denoted by
$\prod_{\lambda \in \Lambda} \mathfrak{A}_\lambda$. When a measure space
$(X_\lambda, \mathfrak{B}_\lambda, \mu_\lambda)$ with
$\mu_\lambda(X_\lambda) = 1$ is given for $\lambda \in \Lambda$, a measure
$\mu$ can be defined in the following way on the completely additive class
$\mathfrak{B} = \prod_{\lambda \in \Lambda} \mathfrak{B}_\lambda$ in the
product space $X$: To begin with, for a cylinder set of the form
$A_1 \times \cdots \times A_n \times X'$ (here
$A_j \in \mathfrak{B}_{\lambda_j}$ and
$X' = \prod_{\lambda \notin \{\lambda_1, \ldots, \lambda_n\}} X_\lambda$),
we define
$\mu(A_1 \times \cdots \times A_n \times X') = \mu_{\lambda_1}(A_1) \cdots \mu_{\lambda_n}(A_n)$.
If we extend $\mu$ to the finitely additive class $\mathfrak{E}$ consisting
of all sets that can be represented as the finite union of such cylinder
sets, then this extension gives a completely additive measure on
$\mathfrak{E}$, and therefore, by Hopf's extension theorem, there exists a
unique extension to a mea-

sure $\mu$ on $\mathfrak{B}$, and $\mu$ satisfies $\mu(X) = 1$. We denote
this $\mu$ by $\mu = \prod_{\lambda \in \Lambda} \mu_\lambda$.

I. Radon Measure

Let $X$ be a †locally compact Hausdorff space, $\mathfrak{B}$ the
topological σ-algebra on $X$, and $C_0(X)$ the real linear space of all
real-valued continuous functions $f$ on $X$ having †compact support (i.e.,
the closure of the set $\{x \mid f(x) \ne 0\}$ is compact). A (real)
†linear functional $\varphi$ defined on $C_0(X)$ is called a positive Radon
measure if $\varphi(f) \ge 0$ whenever $f \ge 0$. For such a functional
$\varphi$ there corresponds a Borel measure $\mu$ on $X$ for which
$\varphi(f) = \int_X f\,d\mu$ holds for every $f \in C_0(X)$. Lebesgue
measure $m_n$ is regarded as a positive Radon measure on $\mathbf{R}^n$. If
$X$ is σ-compact (i.e., $X$ can be represented as the countable union of
compact sets), then the property $\varphi(f) = \int_X f\,d\mu$ for all
$f \in C_0(X)$ defines the measure $\mu$ uniquely on the class of Borel
sets of $X$. A linear functional on $C_0(X)$ that can be written as the
difference of two positive Radon measures is called a Radon measure. For a
linear functional $\varphi$ defined on $C_0(X)$ to be a Radon measure, it
is necessary and sufficient that for an arbitrary $f \in C_0(X)$, the set
$\{\varphi(g) \mid |g| \le |f|, g \in C_0(X)\}$ be bounded. Equivalently,
it is necessary and sufficient that the restriction of $\varphi$ to the
subspace of all functions in $C_0(X)$ having their support in a fixed
compact subset of $X$ be continuous with respect to the †topology of
uniform convergence. Therefore, if $X$ is compact, an arbitrary continuous
linear functional on $C(X)$ is a Radon measure. L. Schwartz investigated
Radon measures on spaces that are not locally compact [6].

J. Measurable Functions

When a completely additive class $\mathfrak{B}$ on a space $X$ is given, a
function $f$ defined on a $\mathfrak{B}$-measurable set $E$ and taking real
(and possibly $\pm\infty$) values is called a $\mathfrak{B}$-measurable
function on $E$ if for an arbitrary real number $\alpha$, the set
$\{x \mid f(x) > \alpha\}$ is $\mathfrak{B}$-measurable. The condition
$f > \alpha$ may be replaced by $f \ge \alpha$, $f < \alpha$, or
$f \le \alpha$. A function $f$ can also be defined to be
$\mathfrak{B}$-measurable if the †inverse image under $f$ of any Borel set
is $\mathfrak{B}$-measurable. When $f$ and $g$ are
$\mathfrak{B}$-measurable, so are $af + bg$ ($a$, $b$ constants),
$f \cdot g$, $f/g$, $\max(f, g)$, $\min(f, g)$, $|f|^p$ ($p$ a constant),
whenever they are well defined. The superior and inferior limits of a
sequence of $\mathfrak{B}$-measurable functions are also
$\mathfrak{B}$-measurable. In a complete product measure space of two
σ-finite measure spaces, a function $f(x, y)$ may fail to be measurable as
a function of two variables even if it is measurable with respect to each
of the variables $x$ and $y$ separately. For example, let $\mathfrak{F}$ be
the class of all closed subsets $F$ of $\mathbf{R}^2$ satisfying
$m_2(F) > 0$, $F \subset [0, 1] \times [0, 1]$. Using the fact that the
cardinality of $\mathfrak{F}$ does not exceed the cardinal number of the
continuum, we can prove by †transfinite induction that the elements of
$\mathfrak{F}$ can be indexed as $F_\xi$, $\xi < \mathfrak{c}$,
$\mathfrak{c}$ being the cardinality of $\mathfrak{F}$, and that for every
$F_\xi \in \mathfrak{F}$ we can pick two points $z_\xi = (x_\xi, y_\xi)$,
$z'_\xi = (x'_\xi, y'_\xi)$ in such a way that if $F_\xi \ne F_\eta$, then
$x_\xi, x'_\xi, x_\eta, x'_\eta$ are all distinct and
$y_\xi, y'_\xi, y_\eta, y'_\eta$ are all distinct. Furthermore, we can
prove that the set $E$ consisting of all such $z_\xi$ is not measurable.
Therefore, denoting the characteristic function of the set $E$ by
$f(x, y)$, $f(x, y)$ is not measurable; but if we fix $x$ (resp. $y$), then
as a function of $y$ (resp. $x$), $f(x, y)$ is measurable, since
$f(x, \cdot)$ (resp. $f(\cdot, y)$) is always 0 except possibly at one
point. If $\mathfrak{B}$ is the class of all Lebesgue measurable sets or
the class of all Borel sets, then a $\mathfrak{B}$-measurable function is
called a Lebesgue measurable function (or simply measurable function) or a
Borel measurable function, respectively.

In Euclidean space, the class of all Borel measurable functions coincides
with the class of all Baire functions, and an arbitrary Lebesgue measurable
function is equal almost everywhere to a Baire function of at most the
second class. For a function $f$ that is finite almost everywhere on a
Lebesgue measurable set $E$ to be Lebesgue measurable, it is necessary and
sufficient that for an arbitrary $\varepsilon > 0$ we can find a closed
subset $F$ such that $m(E - F) < \varepsilon$ and $f$ is continuous on $F$
(Luzin's theorem).

If a sequence $\{f_n\}$ of $\mathfrak{B}$-measurable functions on a measure
space $(X, \mathfrak{B}, \mu)$ converges almost everywhere to $f$ on a set
$E$ with $\mu(E) < \infty$, then for an arbitrary $\varepsilon > 0$ we can
find a set $F$ ($F \subset E$, $F \in \mathfrak{B}$) such that
$\mu(E - F) < \varepsilon$ and $f_n$ converges uniformly on $F$. If
$X = \mathbf{R}^n$ and $\mathfrak{B}$ is either the class of Borel sets or
the class of Lebesgue measurable sets, then the set $F$ can be chosen to be
a closed set (Egorov's theorem).

For a finite measurable function $f(x)$ defined on the real line, there
exists a sequence $\{h_n\}$ such that
$\lim_{n \to \infty} f(x + h_n) = f(x)$ almost everywhere (H. Auerbach).

The functional equation $f(x + y) = f(x) + f(y)$ has infinitely many
nonmeasurable solutions (G. Hamel; → 388 Special Functional Equations).

Let $\mathfrak{A}_i$, $i = 1, 2$, be σ-algebras on $X_i$, $i = 1, 2$,
respectively. A mapping $f: X_1 \to X_2$ is said to be measurable
$\mathfrak{A}_1/\mathfrak{A}_2$ if $f^{-1}(A_2) \in \mathfrak{A}_1$ for
every $A_2 \in \mathfrak{A}_2$. When $\mathfrak{A}'_2$ generates
$\mathfrak{A}_2$, $f: X_1 \to X_2$ is measurable
$\mathfrak{A}_1/\mathfrak{A}_2$ if $f^{-1}(A'_2) \in \mathfrak{A}_1$ for
every $A'_2 \in \mathfrak{A}'_2$. For example, a mapping
$f: \mathbf{R}^1 \to \mathbf{R}^1$ is Borel (Lebesgue) measurable if and
only if

$f$ is measurable $\mathfrak{B}^1/\mathfrak{B}^1$
($\mathfrak{M}_1/\mathfrak{B}^1$), where $\mathfrak{B}^1$ and
$\mathfrak{M}_1$ denote the Borel subsets and the Lebesgue measurable
subsets of $\mathbf{R}^1$. Measurability of mappings is preserved by
compositions, namely, if $f: X_1 \to X_2$ and $g: X_2 \to X_3$ are
measurable $\mathfrak{A}_1/\mathfrak{A}_2$ and
$\mathfrak{A}_2/\mathfrak{A}_3$, respectively, then
$g \circ f: X_1 \to X_3$ is measurable $\mathfrak{A}_1/\mathfrak{A}_3$. If
the projection $\pi_i: \prod_{i \in I} X_i \to X_i$ is measurable, then
&,?I$&. Since $\pi_i^{-1}(A_i)$ ($i \in I$, $A_i \in \mathfrak{A}_i$)
generate $\prod_{i \in I} \mathfrak{A}_i$, a mapping
$f: X \to \prod_{i \in I} X_i$ is measurable
$\mathfrak{A}/\prod_{i \in I} \mathfrak{A}_i$ if
$\pi_i \circ f: X \to X_i$ is measurable $\mathfrak{A}/\mathfrak{A}_i$ for
every $i \in I$, where $\mathfrak{A}$ is a σ-algebra on $X$.

K. Image Measures

Let $\mu$ be a regular measure on a measurable space $(X, \mathfrak{B}_X)$,
i.e., the completion of a measure on $\mathfrak{B}_X$. Then every mapping
$f: X \to Y$ measurable $\mathfrak{B}_X/\mathfrak{B}_Y$ induces a σ-algebra
on $Y$:

    $\mathfrak{B} = \{B \subset Y \mid f^{-1}(B)$ is $\mu$-measurable$\}$,

and a measure on $\mathfrak{B}$:

    $\nu(B) = \mu(f^{-1}(B))$.

The measure $\nu$ is called the image measure of $\mu$ under the mapping
$f$; this is denoted by $\mu f^{-1}$ or $f.\mu$. Obviously,
$\nu(Y) = \mu(X)$, and $\nu$ is complete. Although $\mathfrak{B}$ includes
$\mathfrak{B}_Y$ by virtue of the measurability of $f$, $\nu$ is not
regular in general. However, $\nu$ is regular if $\mu$ is a bounded measure
and if both $(X, \mathfrak{B}_X)$ and $(Y, \mathfrak{B}_Y)$ are analytic
measurable spaces (→ 22 Analytic Sets I). Therefore the image measure of a
regular probability measure under a measurable mapping is also a regular
probability measure in practically all useful cases.

L. Related Topics

(i) Integration. For a nonnegative measurable function $f(x)$ on
$(X, \mathfrak{B}, \mu)$, we can define the integral of $f(x)$ on a set
$E \in \mathfrak{B}$, denoted by $\int_E f(x)\,d\mu(x)$,
$\int_E f(x)\,\mu(dx)$, $\int_E f\,d\mu$, or $\int_E f$. For a real
measurable function $f(x)$, the integral is defined to be equal to
$\int_E f^+\,d\mu - \int_E f^-\,d\mu$, if at least one of these integrals
is finite, where $f^+$ and $f^-$ are the positive and negative parts of $f$
(→ 221 Integration Theory B).

(ii) Fubini's theorem for measures. Let $(X, \mathfrak{B}, \mu)$ be the
complete product measure space of $(X_i, \mathfrak{B}_i, \mu_i)$
($i = 1, 2$), where both $\mu_1$ and $\mu_2$ are σ-finite and complete. For
$E \in \mathfrak{B}$ we define the sections $E(x_1)$ and $E(x_2)$ by

    $E(x_1) = \{x_2 \mid (x_1, x_2) \in E\}$,
    $E(x_2) = \{x_1 \mid (x_1, x_2) \in E\}$.

Then we have $E(x_1) \in \mathfrak{B}_2$ for almost all ($\mu_1$)
$x_1 \in X_1$, $E(x_2) \in \mathfrak{B}_1$ for almost all ($\mu_2$)
$x_2 \in X_2$, and

    $\mu(E) = \int_{X_1} \mu_2(E(x_1))\,\mu_1(dx_1) = \int_{X_2} \mu_1(E(x_2))\,\mu_2(dx_2)$

(→ 221 Integration Theory E).

(iii) The Radon-Nikodym theorem and the Lebesgue decomposition theorem for
measures. Let $\mu$ and $\nu$ be σ-finite measures on $(X, \mathfrak{B})$.
If $\mu(E) = 0$ implies $\nu(E) = 0$, then $\nu$ is said to be absolutely
continuous with respect to $\mu$, denoted by $\nu \ll \mu$. We have
$\nu \ll \mu$ if and only if $\nu$ is expressible in the form
$\nu(E) = \int_E f\,d\mu$, where $f$ is a nonnegative
$\mathfrak{B}$-measurable function (the Radon-Nikodym theorem). Such an $f$
is uniquely determined up to $\mu$-measure 0 and is called the
Radon-Nikodym derivative of $\nu$ with respect to $\mu$, denoted by
$d\nu/d\mu$. The concept opposite to absolute continuity is singularity.
$\nu$ is said to be singular with respect to $\mu$ if there exists an
$E \in \mathfrak{B}$ such that $\nu(E^c) = \mu(E) = 0$. If $\mu$ and $\nu$
are two arbitrary σ-finite measures, then $\nu$ is expressible uniquely as
the sum of an absolutely continuous measure $\nu_1$ and a singular measure
$\nu_2$ with respect to $\mu$ (Lebesgue decomposition theorem) (→ 221
Integration Theory D, 380 Set Functions C).

(iv) Invariant Measures. Let $(X, \mathfrak{B})$ be a measurable space and
$G$ a group of Borel isomorphic mappings from $(X, \mathfrak{B})$ to
itself. A measure $\mu$ on $(X, \mathfrak{B})$ is said to be invariant
under $G$ if $\mu(E) = \mu(g^{-1}(E))$ for every $E \in \mathfrak{B}$ and
every $g \in G$. For example, the Lebesgue measure $m$ on
$(\mathbf{R}^n, \mathfrak{B}^n)$ is invariant under the group of congruent
transformations (→ 225 Invariant Measures).

(v) The Lebesgue-Stieltjes measure. Let $f$ be a right continuous monotone
increasing function on $\mathbf{R}$. Then there exists a unique measure
$\mu$ on $(\mathbf{R}, \mathfrak{B}^1)$ such that
$\mu((a, b]) = f(b) - f(a)$ for $a < b$. This measure is called the
Lebesgue-Stieltjes measure induced by $f$ (→ 166 Functions of Bounded
Variation B).

(vi) Baire measurable functions and universally measurable functions. Let
$f(x)$ be a real-valued function defined on a topological space $X$. If for
every open set $O$ the inverse image $f^{-1}(O)$ has the †Baire property
(resp. is measurable with respect to the completion of any σ-finite †Borel
measure on $X$), then $f$ is said to be Baire measurable (resp. universally
measurable or absolutely measurable). Universally measurable functions can
be defined similarly on a measurable space.

(vii) Disintegration. Let $(S, \mathfrak{S})$ and $(T, \mathfrak{T})$ be
standard measurable spaces, $\nu$ a σ-finite measure on
$(S, \mathfrak{S})$, and $p: S \to T$ a $\nu$-measurable mapping. If the
image measure $\mu = \nu p^{-1}$ is σ-finite, there exists a unique family
of measures
1005 271 B
Mechanics

$\{\nu_t, t \in T\}$ on $(S, \mathfrak{S})$ such that (1)
$t \mapsto \nu_t(E)$ is universally measurable for every
$E \in \mathfrak{S}$, (2) $\nu_t$ is concentrated on $p^{-1}(t)$ for almost
every $t$ with respect to $\mu$, and (3) the following equality holds:

    $\nu(E) = \int_T \mu(dt)\,\nu_t(E)$,  $E \in \mathfrak{S}$.

This expression is called the disintegration of $\nu$ with respect to $p$.
Every $\nu$-measurable function $f$ on $S$ is $\nu_t$-measurable for almost
every $t$ with respect to $\mu$, and the integral of $f$ is expressible in
the form of iterated integrals as follows:

    $\int_S f(s)\,\nu(ds) = \int_T \mu(dt) \int_S f(s)\,\nu_t(ds)$.

This corresponds to Fubini's theorem on product measures. When $\nu$ is a
probability measure on $(S, \mathfrak{S})$, $p(s)$ is regarded as a
$(T, \mathfrak{T})$-valued random variable on $(S, \mathfrak{S}, \nu)$, and
$\nu_t$ is the conditional probability measure under the condition
$p(s) = t$. (→ 342 Probability Theory E. For disintegration of measures on
a topological space → [3, ch. 9].)

References

[1] H. Lebesgue, Intégrale, longueur, aire, Ann. Mat. Pura Appl., (3) 7
(1902), 231–359.
[2] S. Saks, Theory of the integral, Warsaw, 1937.
[3] P. R. Halmos, Measure theory, Van Nostrand, 1950.
[4] N. Bourbaki, Eléments de mathématique, Intégration, Chap. 1–9,
Actualités Sci. Ind., Hermann, 1965–1969.
[5] H. L. Royden, Real analysis, Macmillan, second edition, 1963.
[6] W. Rudin, Real and complex analysis, McGraw-Hill, second edition, 1974.
[7] L. Schwartz, Probabilités cylindriques et applications radonifiantes,
J. Fac. Sci. Univ. Tokyo, IA, 18 (1971), 139–286.

271 (XX.4)
Mechanics

A. Newton's Three Laws of Motion

The study of laws governing the motion of bodies began with the laws of
falling bodies, of which the first exact formulation was made by Galileo.
But the general relationship between force and acceleration was first
described by †Newton, who established Newton's three laws of motion; the
mechanics based on them is Newtonian mechanics. Newton expounded these laws
in his famous book Principia mathematica philosophiae naturalis
(1686–1687), where the law of gravitation and its application, problems of
fluid motion, motions of the planets in the solar system, etc., were
systematically treated.

Newton's first law. A body continues its state of rest or uniform motion in
a straight line unless it is compelled to change that state by external
action (i.e., force). This is also called the law of inertia.

Newton's second law. The rate of change of momentum is proportional to
force and is in the direction in which the force acts. Here, momentum is
defined as the product $mv$ of the mass $m$ and the velocity $v$. This law
can be expressed as $d(mv)/dt = F$, where $F$ is the force expressed in an
appropriately chosen system of units. Since $dv/dt$ is the acceleration
$a$, the law takes the form $ma = F$ when the mass is constant. These
equations are called equations of motion. The second law is often simply
called the law of motion.

Newton's third law. When two bodies 1 and 2 in the same system interact,
the force exerted on body 1 by body 2 is equal and opposite in direction to
that exerted on body 2 by body 1. This law is called the law of action and
reaction, or simply the law of reaction.

Various attempts at a rigorous axiomatization of Newtonian mechanics have
been made, beginning with E. Mach's work. At the beginning of the 20th
century, it was found that Newtonian mechanics requires modification when
bodies travel at speeds approaching the speed of light or when it is
applied to physical systems of molecular size or smaller. These
modifications led to the establishment of the theory of †relativity and
†quantum mechanics. In contrast to these later theories, Newtonian
mechanics is called classical mechanics.

B. Newton's Law of Gravitation

Kepler discovered the following three laws for the motion of planets around
the sun (valid within the accuracy in observation available at the time):

Kepler's first law. The orbit of a planet is an ellipse with the sun at one
of its foci.

Kepler's second law. The area swept per unit time by the straight line
segment joining the planet and the sun is independent of the position of
the planet in its orbit.

Kepler's third law. The square of the period (the time needed for the
planet to go around the orbit once) is proportional to the cube of the
major axis of the orbit.

From these empirical laws, Newton deduced his law of universal gravitation:
Between any

pair of point particles, with masses $m_1$ and $m_2$ and at a distance $r$,
there arises an attractive force along the line joining the two points; the
magnitude of this force is given by $F = G m_1 m_2 r^{-2}$, where $G$ is a
universal constant (approximately $6.670 \times 10^{-8}$ dyne cm$^2$
g$^{-2}$) called the gravitational constant.

which pushes the particle away from the axis of rotation (along the
perpendicular), and the Coriolis force $2m v \times \omega$ ($\times$
denotes the †vector product), which bends the motion of the particle in a
direction perpendicular to both the axis of rotation and the velocity of
the particle.
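
Newton's law of universal gravitation, $F = G m_1 m_2 r^{-2}$, can be evaluated directly. The following is a minimal Python sketch, not part of the original article: the function name `gravitational_force`, the sample masses, and the sample positions are illustrative assumptions, and the constant is the CGS value quoted in the text.

```python
import math

# Gravitational constant in dyne cm^2 g^-2 (the approximate value quoted in the text).
G_CGS = 6.670e-8


def gravitational_force(m1, m2, x1, x2, G=G_CGS):
    """Force vector on particle 1 (mass m1 at x1) due to particle 2 (mass m2 at x2).

    The force is attractive and directed along the line joining the two points;
    its magnitude is G * m1 * m2 / r**2. Hypothetical helper for illustration.
    """
    d = [b - a for a, b in zip(x1, x2)]        # vector from particle 1 to particle 2
    r = math.sqrt(sum(c * c for c in d))       # distance between the particles
    if r == 0.0:
        raise ValueError("particles must be at distinct points")
    magnitude = G * m1 * m2 / r**2
    return [magnitude * c / r for c in d]      # points from particle 1 toward particle 2


# Two sample particles (grams, centimeters) separated by r = 5 cm.
f12 = gravitational_force(1000.0, 2000.0, (0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
f21 = gravitational_force(2000.0, 1000.0, (3.0, 4.0, 0.0), (0.0, 0.0, 0.0))
```

The two returned vectors are equal in magnitude and opposite in direction, which is what Newton's third law requires of the mutual attraction.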

C. Kinetic and Potential Energies

If the force $F$ on a particle is a function of the position $x$ of the
particle and is the gradient $-\nabla U$ of a time-independent (scalar)
function $U(x)$, called the potential, then

    $E = \tfrac{1}{2} m v^2 + U(x)$

is a constant of the motion of the particle. $E$ is called the total
energy, while the first and second terms on the right-hand side are called
the kinetic energy and potential energy, respectively.

For a system of $n$ particles at points $x^{(1)}, \ldots, x^{(n)}$ with
masses $m_1, \ldots, m_n$, suppose that the force acting on the particle at
$x^{(j)}$ is $-\nabla^{(j)} U$ (the gradient of $U$ relative to $x^{(j)}$)
for a common potential function $U(x^{(1)}, \ldots, x^{(n)})$. Then

    $E = \sum_{j=1}^{n} \tfrac{1}{2} m_j (v^{(j)})^2 + U(x^{(1)}, \ldots, x^{(n)})$,

where $v^{(j)} = dx^{(j)}/dt$, is the constant total energy of the motion.
For example, Newton's gravitational force acting among a number of
particles can be described by the Newtonian potential

    $U = \sum_{i<j} -G m_i m_j |x^{(i)} - x^{(j)}|^{-1}$.

A potential $U$ which is a sum of functions depending only on a pair of
coordinates as in the above example is called a two-body interaction.

D. Apparent Force

The coordinate system in which Newton's three laws of motion hold is called
an inertial system. In some cases (e.g., on a rotating sphere such as the
earth) it is more convenient to use a moving coordinate system, in which
the equation of motion derived from Newton's second law by a coordinate
transformation (from an inertial system to the moving coordinate system)
takes a form similar to the second law except for an additional apparent
force to be added to the force in the original equation of motion. For a
coordinate system rotating at a constant angular velocity $\omega$
(relative to an inertial system), the apparent force consists of the
centrifugal force, of magnitude $m\omega^2\rho$ ($\rho$ being the distance
to the axis of rotation),

E. Dynamics of Rigid Bodies

A rigid body is defined as a system of particles whose mutual distances are
permanently fixed. Like a point particle, it is an ideal concept introduced
into mechanics to simplify the theoretical treatment. Actual solid bodies
can in most cases be regarded as rigid under the action of forces of
ordinary magnitudes. Since a rigid body can be imagined to be made up of an
infinite number of particles, the equations of motion for a system of
particles can also be applied to it. Thus the motion of a rigid body can be
completely determined by the theorems of momentum and angular momentum.

The momentum of a rigid body is defined by

    $Q = \int_K \frac{dr}{dt}\,dm$,

where $dm$ is the mass of the volume element at a point $r$ of the rigid
body $K$ and $dr/dt$ is its velocity. If the external forces acting on $K$
are denoted by $F_i$ ($i = 1, 2, \ldots$), we have

    $dQ/dt = \sum_i F_i$,

which expresses the theorem of momentum. If the velocity and acceleration
of the center of gravity (center of mass or barycenter)
$\int r\,dm / \int dm$ of the rigid body are denoted by $V_G$ and $A_G$,
respectively, and the mass $\int dm$ by $m$, we have $Q = mV_G$, and the
theorem of momentum becomes $m\,dV_G/dt = mA_G = \sum_i F_i$.

The angular momentum of a rigid body $K$ about an arbitrary point $r_0$ is
defined by

    $H = \int_K (r - r_0) \times \frac{dr}{dt}\,dm$.

If $P_i$ is the vector from $r_0$ to the point at which $F_i$ acts, we have

    $dH/dt = \sum_i (P_i \times F_i) = G$.

This is called the theorem of angular momentum.

For the case of a rigid body with one point $r_0$ fixed, the angular
momentum $H$ and the angular velocity $\omega$ are related by

    $H_x = A\omega_x - F\omega_y - E\omega_z$,
    $H_y = -F\omega_x + B\omega_y - D\omega_z$,
    $H_z = -E\omega_x - D\omega_y + C\omega_z$,

where $H_x, H_y, H_z$; $\omega_x, \omega_y, \omega_z$ are the components of
$H$ and $\omega$ in the xyz-coordinate system

fixed in space with its origin at the fixed point $r_0$, and

    $A = \int (y^2 + z^2)\,dm$,  $B = \int (z^2 + x^2)\,dm$,  $C = \int (x^2 + y^2)\,dm$,
    $D = \int yz\,dm$,  $E = \int zx\,dm$,  $F = \int xy\,dm$,

with the integrals taken over the whole rigid body. We call $A$, $B$, and
$C$ the moments of inertia about the x-, y-, and z-axes, respectively, and
$D$, $E$, $F$ the corresponding products of inertia. The rotational motion
of a rigid body with one axis fixed is completely determined by the theorem
of angular momentum. However, for rotation about a fixed point, the theorem
is not very convenient to use, because $A$, $B$, $C$, $D$, $E$, and $F$ are
generally unknown functions of time.

The †quadric

    $Ax^2 + By^2 + Cz^2 - 2Dyz - 2Ezx - 2Fxy = 1$

represents an †ellipsoid with its center at the origin, called the
ellipsoid of inertia. If the principal axes $\xi$, $\eta$, and $\zeta$ are
taken as coordinate axes, the equation of the ellipsoid of inertia becomes
$A\xi^2 + B\eta^2 + \Gamma\zeta^2 = 1$, where $A$, $B$, $\Gamma$ are the
moments of inertia about the $\xi$-, $\eta$-, $\zeta$-axes and are called
the principal moments of inertia, while the $\xi$-, $\eta$-, $\zeta$-axes
themselves are called the principal axes of inertia. If the components of
the angular momentum $H$ and angular velocity in the direction of the
principal axes of inertia are denoted by $(H_1, H_2, H_3)$ and
$(\omega_1, \omega_2, \omega_3)$, respectively, then $H_1 = A\omega_1$,
$H_2 = B\omega_2$, $H_3 = \Gamma\omega_3$. Furthermore, if the $\xi$-,
$\eta$-, $\zeta$-components of the resultant moment of the external forces
$G = \sum_i (P_i \times F_i)$ are denoted by $(G_1, G_2, G_3)$, then
$dH/dt = G$ becomes

    $A\,d\omega_1/dt = G_1 + (B - \Gamma)\,\omega_2\omega_3$,
    $B\,d\omega_2/dt = G_2 + (\Gamma - A)\,\omega_3\omega_1$,
    $\Gamma\,d\omega_3/dt = G_3 + (A - B)\,\omega_1\omega_2$.

These are called Euler's differential equations.

The study of the motion of a rigid body can be reduced mostly to the study
of the motion of its ellipsoid of inertia, since the latter is attached to
the rigid body. The method of describing the motion of a rigid body by
means of its ellipsoid of inertia is known as Poinsot's representation. The
motions of two bodies having equal ellipsoids of inertia are the same if
the external forces acting on them have equal resultant moments, even if
their geometric forms are different.

can be determined by the use of the theorem of momentum. Also, the
rotational motion about the center of gravity can be found from
$dH'/dt = G'$, which is a modification of the theorem of angular momentum.
Here $H'$ and $G'$ are respectively the angular momentum and moment of
external forces about the center of gravity. In this case also,
simplification can be achieved by considering the equation in the reference
system that coincides with the principal axes of the ellipsoid of inertia
with its center at the center of gravity.

F. Analytical Dynamics

Mechanics as originally formulated by Newton was geometric in nature, but
later L. Euler, J. L. Lagrange, and others developed the analytical method
of treating mechanics that is now called analytical dynamics. Lagrange
introduced generalized coordinates $q_j$ ($j = 1, 2, \ldots, f$, where $f$
is the number of degrees of freedom of the system considered), which
uniquely represent the configuration of the dynamical system, and derived
Lagrange's equations of motion:

    $\frac{d}{dt}\frac{\partial \mathcal{L}}{\partial \dot{q}_j} - \frac{\partial \mathcal{L}}{\partial q_j} = 0$,  $j = 1, 2, \ldots, f$,

where $\dot{q}_j = dq_j/dt$, and $\mathcal{L} = T - U$ ($T$ = kinetic
energy, $U$ = potential energy) is a function of $q_j$ and $\dot{q}_j$
called the Lagrangian function. Later, W. R. Hamilton introduced

    $p_j = \partial \mathcal{L}/\partial \dot{q}_j$,
    $H = \sum_j p_j \dot{q}_j - \mathcal{L} = H(p_1, \ldots, p_f; q_1, \ldots, q_f)$

and transformed the equations to Hamilton's canonical equations:

    $\frac{dq_j}{dt} = \frac{\partial H}{\partial p_j}$,  $\frac{dp_j}{dt} = -\frac{\partial H}{\partial q_j}$,  $j = 1, 2, \ldots, f$.

Here $p_j$ is the generalized momentum conjugate to $q_j$, and $q_j$, $p_j$
are called canonical variables. If the functions representing the
configuration of the dynamical system in terms of $q_j$ do not explicitly
contain the time $t$, the Hamiltonian function (or Hamiltonian) $H$
coincides with the total energy of the system $T + U$.

The transformation $(p, q) \to (P, Q)$ under which canonical equations
preserve their form is called a canonical transformation. It is given by

    $p_j = \frac{\partial W}{\partial q_j}$,  $P_j = -\frac{\partial W}{\partial Q_j}$,  $K = H + \frac{\partial W}{\partial t}$,

where $W = W(q_1, \ldots, q_f; Q_1, \ldots, Q_f)$ and $K$ is the
Hamiltonian of the transformed system.
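
Hamilton's canonical equations, $dq_j/dt = \partial H/\partial p_j$ and $dp_j/dt = -\partial H/\partial q_j$, can be integrated numerically. The following is a minimal Python sketch under stated assumptions, none of which come from the article: the system is a one-dimensional harmonic oscillator with $H = p^2/2m + kq^2/2$, the integrator is the leapfrog (Störmer-Verlet) scheme, and all function names and parameter values are illustrative.

```python
def hamiltonian(q, p, m=1.0, k=1.0):
    """H = T + U for an assumed 1-D harmonic oscillator (illustrative choice)."""
    return p * p / (2.0 * m) + 0.5 * k * q * q


def dH_dq(q, p, m=1.0, k=1.0):
    """Partial derivative of H with respect to q."""
    return k * q


def dH_dp(q, p, m=1.0, k=1.0):
    """Partial derivative of H with respect to p."""
    return p / m


def leapfrog(q, p, dt, steps):
    """Integrate dq/dt = dH/dp, dp/dt = -dH/dq with the leapfrog scheme."""
    for _ in range(steps):
        p -= 0.5 * dt * dH_dq(q, p)   # half step in p:  dp/dt = -dH/dq
        q += dt * dH_dp(q, p)         # full step in q:  dq/dt = +dH/dp
        p -= 0.5 * dt * dH_dq(q, p)   # second half step in p
    return q, p


# Start at rest with unit displacement and integrate for about one period (2*pi).
q0, p0 = 1.0, 0.0
q1, p1 = leapfrog(q0, p0, dt=0.001, steps=6283)
energy_drift = abs(hamiltonian(q1, p1) - hamiltonian(q0, p0))
```

Since this $H$ does not contain $t$ explicitly, it coincides with the total energy $T + U$ and should stay nearly constant along the computed trajectory; `energy_drift` measures how well the numerical solution respects that.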
When a rigid body moves under no constraint, the motion of its center of
gravity G

The set of canonical transformations forms a group, called a group of
canonical transforma-

tions. An infinitesimal transformation is given ing the body. We confine ourselves to the
by small-displacement theory of elasticity, in
which the components of the displacement u=
(u, u, w) of an arbitrary point P(x, y, z) of the
body are assumed to be small enough to jus-
where E is an infinitesimal constant. Here S tify the linearization of the governing differen-
is an arbitrary function and is said to be the tial equations and boundary conditions.
generating function of the infinitesimal trans- (2) Stress. Consider an infinitesimal rectan-
formation. Canonical equations can be inter- gular parallelepiped enclosed by the following
preted to mean that the variations of p and 4 six surfaces:
during the time interval E= dt are the intini-
x = const, y = const, z = cons&
tesimal canonical transformations whose gen-
erating function is H(p, q, t). x + dx = const, y + dy = const, z + dz = const,
The variation of an arbitrary function
in the body. The stress at the point P(x, y, z) is
F(p, q) under an infinitesimal transformation
defined as those internal forces per unit area
is given by
acting on the six surfaces of the parallelepiped.
dF=c(F,S), It has nine components, which are usually
represented by
where
0, 7Yx 7,x

Txxy % 7zy >


I 7x, 7yz rr, i
is Poisson’s bracket. Therefore the time rate of
where the 0 and 7 are called normal and shear-
change of a dynamical quantity F(p, q) can be
ing (or tangential) stresses, respectively. These
written as
nine quantities form a tensor called the stress
dF/dt =(F, H). tensor. Since it can be shown that 7xy = z,,, zyz
= 7qJ> bx = 7x,, the stress tensor is a symmetric
Thus the function F(p, q) that satisfies (F, H) = 0
tensor. By considering equilibrium conditions
is an tintegral of the canonical equations.
of the parallelepiped, the equations of equihb-
If a canonical transformation (p,q)+(P, Q)
rium are found to be
such that Pj = aj, Qj = bj are constant is found,
the motion of the system can be determined by
$g+%+%+%o )...)...)

where x, . . are body forces per unit volume.


(3) Strain. The infinitesimal rectangular
where W is the tcomplete solution of the
parallelepiped fixed to the body at the point P
Hamilton-Jacobi differential equation:
before deformation is transformed, after de-
formation, into an infinitesimal parallelepiped
z+ff fJff,
....i’w.
(aql aqf q,>...,qJ>t
>=o which is no longer rectangular. The strain at
the point P is defined as the changes caused by
(- 82 Contact Transformations). the deformation of the parallelepiped: exten-
sions of the three sides and changes from right
G. Theory of Elasticity angles of the three angles formed by the three
sides of the parallelepiped. Thus the strain has
(1) General remarks. Suppose that a solid six components, usually represented by
body is deformed elastically by the action of
(L~y,~,,Y yzr Yzx, Yxy13
external forces. We may inquire about the
magnitudes of deformation, stress, and strain where E and y are called elongation and shear-
caused by the external forces at each point ing strains, respectively. These six quantities,
of the body. The theory of elasticity studies with slight modification, form a symmetric
this problem, assuming the body to be a con- tensor called the strain tensor.
tinuum and utilizing classical mechanics as a (4) Strain-displacement relations. In small-
basis, and “endeavours to obtain results which displacement theory, the strain-displacement
shall be practically important in applications relations are given in linear form by
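$\varepsilon_x = \frac{\partial u}{\partial x}, \quad \varepsilon_y = \frac{\partial v}{\partial y}, \quad \ldots, \quad \gamma_{xy} = \frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}.$
to architecture, engineering and all other use-
ful arts in which the material of construction is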
solid” [ 11.
Cartesian coordinates (x, y, z) are employed (5) Stress-strain relations. In the theory of
for defining the 3-dimensional space contain- elasticity, stress and strain are assumed to

obey a linear relationship called Hooke’s law:

where

and [A] is a symmetric positive definite matrix. For an isotropic body, the relations are

$E\varepsilon_x = \sigma_x - \nu(\sigma_y + \sigma_z), \quad \ldots, \quad G\gamma_{xy} = \tau_{xy},$

where $E$, $\nu$, and $G = E/(2(1 + \nu))$ are elastic constants called the modulus of elasticity in tension or Young’s modulus, Poisson’s ratio, and the modulus of elasticity in shear or the modulus of rigidity, respectively.
(6) Boundary conditions. The surface of the body can be divided into two parts with regard to boundary conditions: the part $S_\sigma$ over which the boundary conditions are prescribed in terms of external forces and the part $S_u$ over which the boundary conditions are prescribed in terms of displacements. Obviously $\partial V = S_\sigma + S_u$, where $\partial V$ is the whole surface of the body.
(7) Small-displacement theory of elasticity. We have seen that the equations which govern the problem are 3 equations of equilibrium, 6 strain-displacement relations, and 6 stress-strain relations in terms of 15 unknowns, namely, 6 stress components, 6 strain components, and 3 displacement components. Thus our problem is reduced to a boundary value problem in which these 15 field equations are to be solved under the specified boundary conditions. Since all the field equations and boundary conditions are linear with respect to the unknowns under the assumption that the displacements are small, we obtain linear relationships between the external load and resulting deformation of the body; this is the small-displacement theory of elasticity. It should be remembered, however, that the assumption of small displacement sets limits to the application of the theory to practical problems.
(8) Variational principles. Several variational principles have been formulated in the small-displacement theory of elasticity. These include the principle of minimum potential energy [$u$], the generalized principle [$\sigma, \varepsilon, u$], the Hellinger-Reissner principle [$\sigma, u$], the principle of minimum complementary energy [$\sigma$], and so forth, where the symbols in the brackets represent independent functions subject to variation. In connection with the aforementioned variational principles, variational principles with relaxed continuity requirements have also been formulated by relaxing the continuity requirements imposed on admissible functions.
One of the practical advantages of these variational principles is that they often provide the problem with approximate formulations and approximate methods of solution, among which the Rayleigh-Ritz method is well known. Theories of beams, plates, shells, and multicomponent structures are typical examples of such approximate formulations. Recently, these variational principles have been found to provide effective bases for the formulations of the †finite element method.
(9) Notation. Various symbols are used for stress and strain. For example, $\sigma$, $\tau$ and $\varepsilon$, $\gamma$ are commonly used in the engineering literature. However, in Love’s treatise [6], $X_x$, $Y_x$, $Z_x$ and $e_{xx}$, $e_{xy}$, $e_{xz}$ are used in place of $\sigma_x$, $\tau_{xy}$, $\tau_{xz}$ and $\varepsilon_x$, $\gamma_{xy}$, $\gamma_{xz}$, respectively. Also, various notations are used for elastic constants. $E$ is widely used, while $G$ and $\nu$ are less common. Love uses $\mu$ and $\sigma$ for $G$ and $\nu$, respectively. In the engineering literature the reciprocal of Poisson’s ratio $m$ ($= 1/\nu$) is used and is called the Poisson number.
(10) Finite-displacement theory of elasticity. When the displacement of the body is no longer small (infinitesimal) but is finite, we should abandon small-displacement theory and instead employ finite-displacement theory, in which stress and strain are defined in a manner different from that in small-displacement theory, keeping in mind the difference between spatial and material variables. Thus equations of equilibrium, strain-displacement relations, and the boundary conditions on $S_\sigma$ become nonlinear equations with stress and displacement components as unknowns, although the stress-strain relations remain linear. Thus the problem is reduced to solving a nonlinear boundary value problem, sometimes called a nonlinear elasticity problem. Variational principles have been formulated for finite-displacement theory and are frequently used in the formulation of approximate methods of solution.
When the stress becomes large enough to exceed the so-called elastic limit, where the linear stress-strain relationships cease to hold, the theory of elasticity is no longer valid and should be replaced by the theory of plasticity [9, 10].

References

[1] E. T. Whittaker, A treatise on the analytical dynamics of particles and rigid bodies, Cambridge Univ. Press, fourth edition, 1927.

[2] A. Sommerfeld, Mechanics, translated by M. O. Stern, Academic Press, 1952. (Original in German, 1943.)
[3] J. B. Marion, Classical dynamics of particles and systems, Academic Press, 1965.
[4] V. I. Arnold, Mathematical methods of classical mechanics, translated by K. Vogtmann and A. Weinstein, Springer, 1978. (Original in Russian, 1974.)
[5] R. Abraham and J. E. Marsden, Foundations of mechanics, Benjamin/Cummings, second edition, 1978.
[6] A. E. H. Love, A treatise on the mathematical theory of elasticity, Cambridge Univ. Press, fourth edition, 1934.
[7] S. Timoshenko and J. N. Goodier, Theory of elasticity, McGraw-Hill, 1951.
[8] A. E. Green and W. Zerna, Theoretical elasticity, Oxford Univ. Press, 1954.
[9] R. Hill, Mathematical theory of plasticity, Oxford Univ. Press, 1950.
[10] K. Washizu, Variational methods in elasticity and plasticity, Pergamon, second edition, 1975.

there exists a meromorphic function of $z$ with $f_k(1/(z - z_k))$ as its †singular part at $z_k$ (Mittag-Leffler’s theorem).

B. Nevanlinna Theory

The theory of meromorphic functions can be considered an extension of the theory of entire functions. In particular, value distribution theory, originating in Picard’s theorem, was studied by many people, and in 1925 R. Nevanlinna published a systematic theory unifying the results obtained until then. This is called Nevanlinna theory.
We let $f(z)$ denote a meromorphic function in $|z| < R \le +\infty$, and when we say that $f(z)$ takes on a value, the value may be $\infty$. For a value $a$, $n(r, a)$ denotes the number of $a$-points of $f(z)$, i.e., points $z$ with $f(z) = a$, in $|z| \le r < R$, where each $a$-point is counted with its multiplicity. We set

$N(r, a) = \int_0^r \frac{n(t, a) - n(0, a)}{t}\, dt + n(0, a) \log r,$

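$m(r, a) = \frac{1}{2\pi} \int_0^{2\pi} \log^+ \frac{1}{|f(re^{i\theta}) - a|}\, d\theta$

if $a \neq \infty$, and

$N(r, \infty) = \int_0^r \frac{n(t, \infty) - n(0, \infty)}{t}\, dt + n(0, \infty) \log r,$

$m(r, \infty) = \frac{1}{2\pi} \int_0^{2\pi} \log^+ |f(re^{i\theta})|\, d\theta$

if $a = \infty$, where $\log^+ \alpha = \max(\log \alpha, 0)$ for $\alpha > 0$.

272 (XI.7)
Meromorphic Functions

A. General Remarks

A single-valued †analytic function in a domain
D in the complex plane C is called meromor-
phic in D if it has no singularities other than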


†poles. A function that is meromorphic in the whole complex plane including the point at infinity is a rational function (Liouville’s theorem). Specifically, if a function is meromorphic in the domain C, then the function is called simply a meromorphic function, and a meromorphic function that is not a rational function is called a transcendental meromorphic function. A meromorphic function $f(z)$ can be represented as a quotient of two †entire functions. Let $\{z_k\}$ ($k = 1, 2, \ldots$) be poles of $f(z)$, and let $f_k(z) = a^{(k)}_{n_k}/(z - z_k)^{n_k} + \cdots + a^{(k)}_{1}/(z - z_k)$ denote the †singular parts of $f(z)$ at $z_k$ ($k = 1, 2, \ldots$). Then $f(z)$ can also be written in the form

The functions $N$ and $m$ are called the counting function and proximity function of $f(z)$, respectively, and $T(r) = T(r, f) = m(r, \infty) + N(r, \infty)$ is the order function (or characteristic function) of $f(z)$. $T(r)$ is an increasing function of $r$ and a †convex function of $\log r$ and is useful for expressing $f(z)$ as an infinite product, etc.
The following relation holds among $T(r)$, $m(r, a)$, and $N(r, a)$ for any $a$:

$T(r) = m(r, a) + N(r, a) + O(1), \quad (1)$

where $O(1)$ is †Landau’s symbol (Nevanlinna’s first fundamental theorem). By this theorem, if a bounded remainder is disregarded, then $m(r, a) + N(r, a)$ is equal to $T(r)$ for all $a$. This equality thus demonstrates a beautifully balanced distribution of $a$-points.
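The balance expressed by the first fundamental theorem can be observed numerically. The sketch below is an illustration added under stated assumptions, not part of the original text: it takes the proximity function in its usual form, the mean over $|z| = r$ of $\log^+ |f|$ for $a = \infty$ and of $\log^+ (1/|f - a|)$ for finite $a$, and approximates it by a midpoint rule. The function names `log_plus` and `proximity` are hypothetical. For $f(z) = e^z$, which has neither zeros nor poles, $N(r, 0) = N(r, \infty) = 0$, so $T(r) = m(r, \infty)$ and $m(r, 0)$ should both equal $r/\pi$.

```python
import cmath
import math


def log_plus(x):
    """log+ x = max(log x, 0) for x > 0."""
    return math.log(x) if x > 1.0 else 0.0


def proximity(f, r, a=None, samples=20000):
    """Approximate the proximity function m(r, a) on the circle |z| = r.

    a=None stands for a = infinity, where the integrand is log+ |f|;
    for finite a the integrand is log+ (1 / |f - a|).
    """
    total = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * (k + 0.5) / samples
        value = f(cmath.rect(r, theta))
        if a is None:
            total += log_plus(abs(value))
        else:
            total += log_plus(1.0 / abs(value - a))
    return total / samples


r = 5.0
T = proximity(cmath.exp, r)         # T(r) = m(r, infinity), since exp has no poles
m_zero = proximity(cmath.exp, r, 0.0)  # m(r, 0); N(r, 0) = 0 because exp has no zeros
print(T, m_zero, r / math.pi)
```

Both computed values agree with $r/\pi$, so here $T(r) = m(r, 0) + N(r, 0)$ holds without any remainder.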
where $g(z)$ is an entire function and the $p_k(z)$ ($k = 1, 2, \ldots$) are rational entire functions (Weierstrass’s theorem). Assume that a sequence $\{z_k\}$ ($k = 1, 2, \ldots$) converges only to the point at infinity and that $f_k(1/(z - z_k))$ ($k = 1, 2, \ldots$) are rational entire functions of $1/(z - z_k)$ which have no constant terms. Then

We see that $N(r, a)$ is in a sense the mean value of the number of $a$-points in $|z| < r$, and $m(r, a)$ is the mean proximity to $a$ of the value $f(z)$ on $|z| = r$. If the term $\log^+$ in the definition of the proximity function is replaced by the logarithm of the reciprocal of the chordal distance between $f(re^{i\theta})$ and $a$ on the complex sphere, then the remainder term in (1) is

eliminated. Hence the definition of the proxim- where p is the smallest integer satisfying p + 1
ity function is sometimes given in this form. > p, the a, and b, are the zeros and poles of
f(z), respectively, k is an integer, and P(z) is a
polynomial of degree at most p (Hadamard’s
C. The Order of Meromorphic Functions theorem).
Let C(~, , c(~(4 > 3) be distinct values. Then
For an entire function f(z), the equality for any meromorphic function in IzI CR < co,
$\limsup_{r \to \infty} \frac{\log T(r)}{\log r} = \limsup_{r \to \infty} \frac{\log \log M(r)}{\log r}$

holds, where $M(r) = M(r, f) = \max_{|z| = r} |f(z)|$.
$0 \le r < R.$


Since the right-hand side is the order of f(z)

=s‘n,(t)--n,(O)
Here

40-1
(- 429 Transcendental Entire Functions),
we define the order (lower order) p of a mero-
morphic function f(z) by

p=limsup-
t 0
dt+n,(O)logr,

n,(r) is the number of tmultiple points in IzI < r


r-m (a multiple point of order k is counted k - 1
times), and D(r) is the remainder such that if R
The order of a meromorphic function in IzI CR
is also defined by = co, then D(r) < K(log T(r) + log r) for some K
except possibly for values of r belonging to the
p = lim sup
log T(r) union of a countable number of intervals with
,-R log(l/(R--1)’ finite total length, and if R < co, then D(r) <
K(log T(r)+ log(l/(R-r))) except possibly for
the union of a countable number of intervals
D. Meromorphic Functions on a Disk {Ii} with xjJ,jd(l/(R-r))<co (Nevanlinna’s
second fundamental theorem).
The order function T(r) is bounded if and Several theorems on value distribution of
only if f(z) can be represented as the quo- meromorphic functions can be obtained di-
tient of two bounded holomorphic functions rectly from this theorem. For instance, if f(z) is
h,(z), h2(z) in IzI -CR (Nevanlinna). If T(r) is a transcendental meromorphic function, the
bounded, lim,,,f(re”) exists and is finite equation f(z) = c( has an infinite number of
for every 8, 0 < 0 <27-c, except possibly for a roots for every value CIexcept for at most
set with tlinear measure zero (P. Fatou and two values called Picard’s exceptional values
Nevanlinna). Among functions f(z) such that (Picard’s theorem). For a meromorphic func-
lim,,, T(r) = co, those satisfying
tion of order p, lim,.,, C,,G,(rj(a))m” (2 <p)
T(r) diverges for every value c( except for at most
lim sup two values (Borel’s theorem). A value c( for
r+R log(l/(R-r))=CO
which the series converges is called a Bore1
have properties similar to those of transcen- exceptional value. We call 6(a) = &a, S) = 1
dental meromorphic functions. - lim supIem N(r, a)/T(r) the defect off: It
always satisfies 0 d 6(a) < 1, and the values
with 6(c() > 0 are called Nevanlinna’s excep-
E. Meromorphic Functions in the Whole Finite
tional values. The number of values c( (may be
Plane
CD) with 6(a) > 0 is at most countable for any
meromorphic function f(z), and CE1 S(ai) < 2.
Any meromorphic function such that
There are many studies concerning the values
t( with 6(a) = 0
limsup%<K
?-R logr
is a rational function. If ,f(z) is a meromorphic
F. Julia Directions
function of order p and {rj(a)}, rj(ct) < rj+l(cc)
(j = 1,2, ) is the set of absolute values of c(-
Among functions that have an essential sin-
points, then CF, (l/rj(cx))“‘” converges for any
gularity at the point at infinity and are mero-
CC.Furthermore,
morphic in the whole plane, there are some
f(z) = zkeP@) that possess no tJulia directions. These func-
tions, called Julia exceptional functions, are
(I--t)exp(t+...+s) of order 0. A necessary and sufficient condi-
tion for f(z) to be a Julia exceptional func-
x”‘&&(l-~)exp(~+...+$)’ tion is that f(z) can be written in the form
z”’ I&( 1 - z/n,)/n,( 1 -z/b,) (A. Ostrowski,

1925). Concerning zeros a, and poles b, of ,f(z), number of asymptotic finite values. Some
the following three properties are obtained results analogous to those for entire functions
using the theory of tnormal families due to P. are obtained for meromorphic functions by
Montel: (1) There are constants K,, K,, and applying the theory of normal families. F.
K, independent of Y such that In@, co)-n(r, O)l Marty established a systematic theory of
<K,, n(2r, co) - n(r, co) < K,, and n(2r, 0) - normal families of meromorphic functions by
n(r, 0) <K,. (2) There are constants K, and using spherical distance.
K, such that for any p and q,

I. Inverse Functions

Generally, the inverse function of a meromor-


phic function w =f(z) is infinitely multiple-
valued. Let P(w, w,,) be a function element of
(3) There exists an E> 0 satisfying lap/b, - 1 I > the inverse function with center at wO, and let
E>Oforanypandq. C be an arbitrary curve starting at w0 and with
G. Valiron gave a precise form of Julia o its terminal point. For any domain S con-
directions that corresponds to Borel’s theorem taining C, P(w, w,,) can be continued analyti-
(Acta Math., 52 (1928), [3]). Namely, if the cally in S up to a point arbitrarily near w
order p of a meromorphic function f(z) in Iz/ < (Iversen’s theorem). Continue the function
co is positive and finite, then there exists a element analytically along each half-line start-
direction J defined by argz = CIsuch that the ing at its center. Then the set of arguments of
zeros z&a, A) of f(z) - a in any angular domain half-lines along which the analytic continu-
A : 1argz - al < S containing J have the prop- ation meets a singularity at a finite point is of
erty C,~Z,(~,A)I~(~-“)= cc for any E>O except zero linear measure (Gross’s theorem).
for at most two values of n. The direction J is By considering the inverse image of the
called a Bore1 direction. suitably cut Riemann surface of the inverse
function, the z-plane can be divided into fun-
damental domains such that each domain is
G. Relations between Two or More
the inverse image of the whole w-plane (with
Meromorphic Functions
suitable slits removed) and has a boundary
each point of which is taccessible from the
Borel’s unicity theorem can be stated as fol-
inside of the domain, and the boundary curves
lows: Let h (j = 1, . , n) be nonvanishing entire
of fundamental domains cluster nowhere in
functions satisfying C;=, h = 1; then for some
the plane.
(cl, . . . . c,)#(O, . . . . 0), Z;=, c,&=O. This is
For a meromorphic function f(z), the set
contained in the following theorem: Let fj
of functions z’ = q(z) defined by f(z’) =f(z)
(j=l 2 .“> n) be transcendental entire functions
(i.e., transformations between points that give
such that Cjnclh= 1; then &,6(0,h)<n- 1.
f(z) the same value) has the property of a
If two meromorphic functions fi (z), f2(z) have
thypergroup. If q(z) is single-valued, then it
the same Ej-points for five distinct values ‘xj (j
is a linear entire function, and if it is finitely
= 1, . . ,5) (where multiplicity is not taken into
multiple-valued, then it is an algebraic func-
account), then they coincide everywhere. If the
tion. The tcluster set of the inverse function at
+Riemann surface of the talgebraic function
a transcendental singularity consists of only
w(z) defined by a polynomial P(z, w) =0 of z, w
one point, co, that is, it is an iordinary sin-
is of tgenus > 1, it is impossible to find mero-
gularity. To an analytic continuation along a
morphic functions z =f([), w = g(i) that satisfy
curve that determines a transcendental sin-
P(f(l), g(c)) = 0 (tuniformization by meromor-
gularity of the inverse function there corre-
phic functions).
sponds a curve in the z-plane terminating at
co. This curve is an asymptotic path of ,f(z).
H. Asymptotic Values Namely, the value f(z) tends to the coordinate
of the transcendental singularity as z+ co
If a meromorphic function ,f(z)+a as z--r co along this path. Each asymptotic value of a
along a curve C, the value TVand the curve transcendental meromorphic function w =f(z)
C are called an asymptotic value and asymp- corresponds to a transcendental singularity of
totic path, respectively. Each (Picard’s) excep- its inverse function z = q(w), and if we consider
tional value of f(z) is an asymptotic value. For two asymptotic paths to be the same if they
meromorphic functions, no simple relation is correspond to the same singularity, then there
known between their order and the number of exists a one-to-one correspondence between
their asymptotic values. There exists a merom- the set of asymptotic paths and the set of
orphic function of order 0 with an infinite transcendental singularities of the inverse

function. The inverse function of any mero- yields


morphic function of order p has at most 2,~
idirect transcendental singularities if p > l/2 $I n(r, Olj) - f nl (r, 'j)
j=*
and at most 1 such singularity if p < l/2 (L. V
Ahlfors). >(q-2)A(r)-0(.4(r)“‘+“)

with some exceptional intervals of values of r.


This latter important inequality corresponds
J. Theory of Covering Surfaces to Nevanlinna’s second fundamental theorem.
Let Dj (j= 1, . . ..q) (q>3) be disjoint simply
connected domains on the w-plane. If every
Ahlfors established the theory of covering
simply connected island over Dj has at least pj
surfaces by a metricotopological method and
sheets, then Cq=, (1 - (1,‘~~)) < 2 (disk theorem).
in applying it, obtained Nevanlinna theory
and many other results on meromorphic It follows from this theorem that given three
disjoint disks Dj on the Riemann sphere, there
functions.
is at least one Dj that has an infinite number
Let F, denote the covering surface of the
Riemann sphere F, with radius l/2; F, is the of islands over it, and also that given five Dj,
image of Iz( Q r under a meromorphic function there exists at least one Dj that has a l-sheeted
island over it (Ahlfors’s five-disk theorem). This
w =f(z). The area of F, divided by Z, where n is
theorem corresponds to tBloch’s theorem for
the area of F,,, is given by
entire functions. These theorems can also be
obtained for meromorphic functions on a disk.
Ahlfors established a more important theory
by introducing a differential metric.

$A(r) = \frac{1}{\pi} \iint_{|z| \le r} \frac{|f'(z)|^2}{(1 + |f(z)|^2)^2}\, \rho\, d\rho\, d\varphi, \quad z = \rho e^{i\varphi},$

and is called the mean number of sheets of $F_r$.
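The mean number of sheets can be evaluated numerically. The following sketch is an added illustration under stated assumptions: it takes $A(r) = \frac{1}{\pi} \iint_{|z| \le r} |f'(z)|^2 / (1 + |f(z)|^2)^2\, \rho\, d\rho\, d\varphi$ and approximates the integral by a midpoint rule in polar coordinates. The function name `mean_sheet_number` and the grid sizes are hypothetical choices. For $f(z) = z^n$ the integral can be done by hand and gives $A(r) = n r^{2n}/(1 + r^{2n})$, which tends to $n$ as $r \to \infty$.

```python
import math


def mean_sheet_number(f, df, r, n_rho=400, n_phi=400):
    """Approximate A(r), the spherical area of the image of |z| <= r divided by pi.

    f is the function, df its derivative; the midpoint rule is applied on a
    polar grid with n_rho radial and n_phi angular cells.
    """
    d_rho = r / n_rho
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_rho):
        rho = (i + 0.5) * d_rho
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            z = complex(rho * math.cos(phi), rho * math.sin(phi))
            density = abs(df(z)) ** 2 / (1.0 + abs(f(z)) ** 2) ** 2
            total += density * rho * d_rho * d_phi
    return total / math.pi


r = 2.0
approx = mean_sheet_number(lambda z: z ** 3, lambda z: 3 * z ** 2, r)
exact = 3 * r ** 6 / (1 + r ** 6)
print(approx, exact)
```

The computed value stays below 3, the number of times $z^3$ covers the sphere, and approaches it as $r$ grows.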
The length of the boundary of $F_r$ is given by

$L(r) = \int_{|z| = r} \frac{|f'(z)|}{1 + |f(z)|^2}\, |dz|.$

The relation

$T(r) = \int_0^r \frac{A(t)}{t}\, dt + O(1)$

holds (T. Shimizu, Ahlfors).

K. Recent Development

The Nevanlinna brothers raised several important problems, which gave strong motivation for later investigations. The first major breakthrough after World War II was given by A. Gol’dberg in 1956. He gave an example which has infinitely many deficient values (Nevanlinna’s exceptional values). W. Hayman proved that $\sum \delta(a, f)^{\alpha}$ converges for $\alpha > 1/3$,
Consider the Riemann surface of the inverse and there are examples of meromorphic func-
function of a meromorphic function w =f(z) in tions for which the series diverges for (Y< l/3.
IzI <R < +co. It has a countable number of Finally, A. Weitsman showed that the series
components Q, over a domain on the w-plane. converges for CI= l/3. The second major break-
Let A, denote the inverse image of Q, on the z- through was given by A. Edrei and W. Fuchs
plane. If A” together with its boundary is con- in 1959, whose works concern the following
tained in IzI CR, the component Q, is called an Nevanlinna theorem: Let K(f) be
island, and otherwise, a peninsula.
NO-, 0) + N(r, ml
Let D be a simply connected domain of the lim sup
r-m T(r,f)
w-plane, n(r, D) be the sum of the sheet num-
bers of the islands of F, over D, and m(r, D) be and K(P) = infK(f), where inf is taken over all
the sum of the areas of the peninsulas of F, meromorphic function f of order p. Then K(P)
over D divided by the area of D. Then > 0 if p is neither a positive integer nor a.
Furthermore, Nevanlinna gave a conjecture
m(r,D)+n(r,D)=A(r)+O(L(r))
for an exact value of K(P). This conjecture is
Let 0, (j = 1, ,q) (q 3 3) be disjoint simply still open, although several estimates have
connected domains on the w-plane. Then appeared. In the case of entire functions hav-
ing only negative zeros the conjecture was
fi n(r,Dj)- i n,(r,Dj)>(q-2)A(r)--(L(r)), positively solved by S. Hellerstein and J. Wil-
j=l
liamson. Edrei and Fuchs proved that K(P) = 1
where n,(D) is the sum of the orders of branch forO<p< l/2 and K(P)=sinnp for 1/2<pdl.
points in all islands of F, over D. They also proved the ellipse theorem: Let f(z)
For a meromorphic function f(z), L(r) < be a transcendental meromorphic function of
A(r) 1’2tE, where r satisfies 0 6 r < cx except order p (06~6 1). Put u= 1-6(a,f), v= l-
for rE ujlj for some intervals lj. Hence in the fi(h,f‘). Then LI, UE[O, l] and u2-22uvcos~p+
case where Dj is a point uj, this inequality u2 > sin’ np. If further u < cos up, then v = 1.

For the sum of deficiencies Edrei proved [2] H. Weyl and J. Weyl, Meromorphic func-
that tions and analytic curves, Princeton Univ.
Press, 1943.
1 -cosnp (O<p<1/2)
C&4f)G [3] G. Valiron, Directions de Bore1 des fonc-
2-sinnp (1/2<p<l)
tions meromorphes, Memor. Sci. Math.,
unless the number of deficient values is one. It Gauthier-Villars, 1938.
was conjectured that C S(a,f) = 2 implies that [4] W. K. Hayman, Meromorphic functions,
the order p off is a half integer, the number v Clarendon Press, 1964.
of deficient values is at most 2p, and the value [S] W. Fuchs, Developments in the classical
of 6(a,f) is a multiple of l/p. For the case of Nevanlinna theory of meromorphic functions,
entire functions this conjecture is true (A. Bull. Amer. Math. Sot., 73 (1967), 2755291.
Pfluger). Weitsman proved that v < 2p. Other [6] A. Edrei and W. Fuchs, The deficiencies of
cases remain open. All the above results still meromorphic functions of order less than one,
hold even if the order is replaced by the lower Duke Math. J., 27 (1960), 233-249.
order. The inverse problem was completely [7] S. Hellerstein and J. Williamson, Entire
solved by D. Drasin. Many of the above re- functions with negative zeros and a problem of
sults depend on the concept of Polya peaks. R. Nevanlinna, J. Anal. Math., 22 (1969) 2333
N. V. Govorov and V. Petrenko proved 267.
independently that [8] A. Edrei, Solution of the deficiency prob-
lem for functions of small order, Proc. Lon-
liminf~ww~~f) <7xp for p>1/2 don Math. Sot., 26 (1973), 4355445.
r-cc T(r>f) [9] A. Weitsman, Meromorphic functions with
for every entire function of order p. This was a maximal deficiency sum and a conjecture of F.
conjecture made by R. Paley. For p < l/2, an Nevanlinna, Acta Math., 123 (1969) 115-l 39.
exact upper bound rep/sin np was given by [lo] A. Weitsman, A theorem on Nevanlinna
Valiron. Furthermore, the following result was deficiencies, Acta Math., 128 (1972) 41-52.
proved by Edrei and Fuchs: [ 1 l] D. Drasin, The inverse problem of Ne-
vanlinna theory, Acta Math., 138 (1977), 833
N(r,O) 151.
lim sup a- (O<p< 1).
1-m logM(rJ-1 =P [ 121 A. Gol’dberg, Meromorphic functions, J.
Soviet Math., 4 (1975) 157-216.
[13] V. Petrenko, Growth of meromorphic
L. History functions of finite lower order (in Russian), Izv.
Akad. Nauk SSSR, 33 (1969), 414-454.
Also → references to 124 Distribution of Values of Functions of a Complex Variable.

The value distribution theory of meromorphic functions had its inception with the classical Picard theorem. It first appeared as the value distribution theory of entire functions and was developed into a well-organized field by way of the Nevanlinna theory and the Ahlfors theory of covering surfaces. In recent years, emphasis has also been placed on the study of meromorphic functions on open Riemann surfaces (→ 367 Riemann Surfaces). The value distribution of a set of several meromorphic functions was studied first by A. Bloch and developed into the study of meromorphic curves by Ahlfors, and H. and J. Weyl [2]. The behavior of meromorphic functions in neighborhoods of general singularities has also been studied. An example of results in that field is the theory of †cluster sets.

273 (II.17)
Metric Spaces

A. General Remarks

The distance between two points $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ in the $n$-dimensional †Euclidean space $\mathbf{R}^n$ is defined by $\rho(x, y) = \sqrt{(y_1 - x_1)^2 + \cdots + (y_n - x_n)^2}$. The function $\rho(x, y)$ is nonnegative for every pair $(x, y)$ and has the following properties: (i) $\rho(x, y) = 0$ if and only if $x = y$; (ii) $\rho(x, y) = \rho(y, x)$; and (iii) $\rho(x, z) \le \rho(x, y) + \rho(y, z)$ for any three points $x, y, z$. Property (iii) is called the triangle inequality.
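Properties (i) to (iii) can be tested on a finite sample of points. The sketch below is an added illustration, not part of the original article; the helper names `euclidean_distance` and `satisfies_metric_axioms` are hypothetical, points are represented as tuples of floats, and a small tolerance absorbs rounding error. A finite check of this kind can refute the axioms for a candidate function but cannot prove them.

```python
import itertools
import math
import random


def euclidean_distance(x, y):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((yi - xi) ** 2 for xi, yi in zip(x, y)))


def satisfies_metric_axioms(distance, points, tol=1e-12):
    """Check nonnegativity and properties (i)-(iii) on all pairs and triples."""
    for x, y in itertools.product(points, repeat=2):
        d = distance(x, y)
        if d < 0:
            return False
        if (d <= tol) != (x == y):          # (i) zero distance exactly for x = y
            return False
        if abs(d - distance(y, x)) > tol:   # (ii) symmetry
            return False
    for x, y, z in itertools.product(points, repeat=3):
        if distance(x, z) > distance(x, y) + distance(y, z) + tol:  # (iii)
            return False
    return True


random.seed(0)
sample = [tuple(random.uniform(-1.0, 1.0) for _ in range(3)) for _ in range(8)]
print(satisfies_metric_axioms(euclidean_distance, sample))


def squared_distance(x, y):
    """Squared Euclidean distance; it violates the triangle inequality."""
    return euclidean_distance(x, y) ** 2


print(satisfies_metric_axioms(squared_distance, [(0.0,), (1.0,), (2.0,)]))
```

The squared distance fails on the points 0, 1, 2 of the real line, since 4 exceeds 1 + 1, which shows that property (iii) is a genuine restriction.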

[1] R. H. Nevanlinna, Eindeutige analytische Funktionen, Springer, 1935, revised edition, 1953; English translation, Analytic functions, Springer, 1970.

B. Definition of Metric Spaces

Abstracting the notion of distance from Euclidean spaces, M. Fréchet defined metric

spaces [1] (1906). A metric on a set $X$ is a nonnegative function $\rho$ on $X \times X$ that satisfies (i), (ii), and (iii) of Section A, and a metric space $(X, \rho)$, or simply $X$, is a set $X$ provided with a metric $\rho$. The members of $X$ are called points, $\rho$ is called the distance function, and $\rho(x, y)$ is called the distance from $x$ to $y$. The distance function is sometimes denoted by $d(x, y)$ or $\mathrm{dis}(x, y)$. If (i) is replaced by its weaker form (i′) $\rho(x, x) = 0$, the function $\rho$ is called a pseudometric (or pseudodistance function), and $X$ is called a pseudometric space.
Examples of metric spaces:
(1) The $n$-dimensional Euclidean space $\mathbf{R}^n$, in particular the real number system $\mathbf{R}$ with $\rho_0(x, y) = |x - y|$.
(2) The †function space $L_p(\Omega)$.
(3) The †function space $C(\Omega)$.
(4) The †sequence space $s$, i.e., the space $\mathbf{R}^{\infty}$ of all sequences of real numbers with metric $\rho(x, y) = \sum_{n=1}^{\infty} 2^{-n} |x_n - y_n| / (1 + |x_n - y_n|)$, where $x = (x_1, x_2, \ldots)$ and $y = (y_1, y_2, \ldots)$.
(5) The †sequence space $m$, i.e., the space of all bounded sequences of real numbers with metric $\rho(x, y) = \sup_n |x_n - y_n|$ for $x = (x_1, x_2, \ldots)$ and $y = (y_1, y_2, \ldots)$.
(6) A Baire zero-dimensional space $(\Omega^{N}, \rho)$, where $\Omega$ is a set and the distance $\rho(x, y)$ between $x = (x_1, x_2, \ldots)$ and $y = (y_1, y_2, \ldots)$ is equal to the reciprocal of the minimum $n$ such that $x_n \neq y_n$. When the cardinal number $\tau$ of $\Omega$ is specified, the Baire space is denoted by $B(\tau)$.
(7) For any set $X$, define $\rho$ by setting $\rho(x, x) = 0$ and $\rho(x, y) = 1$ when $x \neq y$. Then $(X, \rho)$ is a metric space, called a discrete metric space.
(8) For any set $X$, define $\rho$ by setting $\rho(x, y) = 0$ for any members $x$ and $y$. Then $\rho$ is a pseudometric, and the resulting space $X$ is called an indiscrete pseudometric space.
For a subset $M$ of a metric space $X$, $\sup\{\rho(x, y) \mid x, y \in M\}$ is called the diameter of $M$ (denoted by $d(M)$), and $M$ is said to be bounded if its diameter is finite (including $M = \emptyset$). For two subsets $A$, $B$ of $X$, $\inf\{\rho(x, y) \mid x \in A, y \in B\}$ is called the distance between $A$ and $B$, denoted by $\rho(A, B)$. We have $\rho(A, B) = \rho(B, A)$. When a family $\mathfrak{M} = \{M_\lambda \mid \lambda \in \Lambda\}$ of subsets of $X$ is a covering of $X$, i.e., $X = \bigcup_\lambda M_\lambda$, the supremum of the diameters $d(M_\lambda)$ of $M_\lambda$ in $\mathfrak{M}$, $\sup\{d(M_\lambda) \mid \lambda \in \Lambda\}$, is called the mesh of the covering $\mathfrak{M}$. For a positive number $\varepsilon$, a covering whose mesh is less than $\varepsilon$ is called an $\varepsilon$-covering. A metric space $X$ is called totally bounded (or precompact) (F. Hausdorff, 1927) if for each positive number $\varepsilon$ there exists a finite $\varepsilon$-covering of $X$.
A subset $X_1$ of a metric space $X$ becomes a metric space if we define its metric $\rho_1$ by setting $\rho_1(x, y) = \rho(x, y)$ for $x, y \in X_1$, where $\rho$ is the metric of $X$. The space $(X_1, \rho_1)$ is called a metric subspace of $(X, \rho)$. A subset of $X$ is called totally bounded (or precompact) if it is totally bounded as a metric subspace. Any totally bounded subset is bounded. Conversely, in the Euclidean space $\mathbf{R}^n$ any bounded subset is totally bounded.
A bijection $f$ from a metric space $(X_1, \rho_1)$ onto a metric space $(X_2, \rho_2)$ is called an isometric mapping if $f$ preserves the metric, i.e., $\rho_2(f(x), f(y)) = \rho_1(x, y)$ for any points $x, y \in X_1$; and $X_1$ and $X_2$ are called isometric if there is an isometric mapping from $X_1$ onto $X_2$.
Let $X$ be a metric space with metric $\rho$, and let $f$ be an injection from a set $Y$ into $X$. Then the function $\rho'(y_1, y_2) = \rho(f(y_1), f(y_2))$ ($y_1, y_2 \in Y$) is a distance function on $Y$, and with this metric the set $Y$ becomes a metric space called the metric space induced by $f$; $f$ is an isometric mapping from $(Y, \rho')$ onto $(f(Y), \rho)$.
For a finite number of metric spaces $(X_1, \rho_1), \ldots, (X_n, \rho_n)$, we can define a metric $\rho$ on their Cartesian product $X = X_1 \times \cdots \times X_n$ by setting

$\rho(x, y) = \sqrt{\rho_1(x_1, y_1)^2 + \cdots + \rho_n(x_n, y_n)^2}$

for two points $x = (x_1, \ldots, x_n)$, $y = (y_1, \ldots, y_n)$ of $X$. Thus we obtain a metric space $(X, \rho)$, called the product metric space of $(X_1, \rho_1), \ldots, (X_n, \rho_n)$. The $n$-dimensional Euclidean space $\mathbf{R}^n$ is the product metric space of $n$ copies of the real line $(\mathbf{R}, \rho_0)$.

C. Topology for Metric Spaces

For a point $x$ of a metric space $(X, \rho)$ and any positive number $\varepsilon$, the set $U_\varepsilon(x)$ of all points $y$ such that $\rho(x, y) < \varepsilon$ is called the $\varepsilon$-neighborhood (or $\varepsilon$-sphere) of $x$. We can introduce a topology for $X$ by taking the family of all $\varepsilon$-neighborhoods as a †base for the neighborhood system (→ 425 Topological Spaces). Then the following five propositions hold, any one of which can be used to define the same topology:
(i) A subset $O$ is †open if and only if for any point $x$ in $O$ there is a positive number $\varepsilon$ such that the $\varepsilon$-neighborhood of $x$ is contained in $O$.
(ii) A subset $F$ is †closed if and only if any point whose every $\varepsilon$-neighborhood contains at least one point of $F$ is contained in $F$.
(iii) A subset $U$ is a neighborhood of a point $x$ if and only if $U$ contains some $\varepsilon$-neighborhood of $x$.
(iv) A point $x$ is an †interior point of a subset $A$ if and only if some $\varepsilon$-neighborhood of $x$ is contained in $A$; the interior $A^{\circ}$ of $A$ is the set of all such points.
(v) A point $x$ is adherent to a subset $A$ if every $\varepsilon$-neighborhood of $x$ contains at least one point of $A$; the closure $\bar{A}$ of $A$ is the set of all such points, and $x \in \bar{A}$ if and only if $\rho(x, A) = 0$.
Every metric space $X$ satisfies the †first countability axiom. A metric space $X$ is a

†Hausdorff space and, more specifically, a †perfectly normal space; it is also †paracompact. In the same way, we can define a topology for each pseudometric space that satisfies the first countability axiom, but a pseudometric space is not necessarily Hausdorff.

D. Convergence of Sequences

A sequence $\{x_n\}$ of points in a metric space is said to converge to a point $x$ (written $\lim_{n \to \infty} x_n = x$) if $\rho(x_n, x)$ tends to zero as $n \to \infty$. The point $x$ is called the limit of $\{x_n\}$. This convergence is equivalent to convergence with respect to the topology defined in Section C (→ 87 Convergence). As the first countability axiom is satisfied, we may define the topology by means of convergent sequences of points: the closure $\bar{A}$ of a subset $A$ is the set of all limits of sequences of points in $A$.

E. Separable Metric Spaces

For a metric space $X$ the following three conditions are equivalent: (i) There exists a countable family $\mathfrak{O}_0$ of open sets of $X$ such that each open set of $X$ is the union of members of $\mathfrak{O}_0$ (†second countability axiom). (ii) $X$ is †separable, that is, $X$ has a countable subset that is †dense in $X$. (iii) Every open covering of $X$ has a countable subcovering (†Lindelöf space). A metric space with any of these properties is called a separable metric space. The sequence space $s$ is separable. Any separable metric space can be isometrically embedded in the sequence space $m$, i.e., is isometric to a subspace of $m$ (→ 168 Function Spaces B).

separable. In particular, every compact metric space is separable.
Let $\mathfrak{U} = \{U_\lambda\}$ be an open covering of a compact metric space $X$. There exists a positive number $\delta$ such that every set $A$ with $d(A) < \delta$ is contained in some $U_\lambda$. The number $\delta$ is called the Lebesgue number of the open covering $\mathfrak{U}$.
A subset $A$ of a metric space is said to be compact if it is compact as a metric subspace, and $A$ is said to be relatively compact if its closure is compact. Bounded closed sets in $\mathbf{R}^n$, in particular closed intervals of real numbers, are compact. For these sets, conditions (i), (iv), and (v) are called the Heine-Borel theorem (or Borel-Lebesgue theorem), Cantor’s intersection theorem, and the Bolzano-Weierstrass theorem, respectively.

G. Product Spaces of Metric Spaces

Let $(X_1, \rho_1), \ldots, (X_n, \rho_n)$ be metric spaces. Then the Cartesian product $X = X_1 \times \cdots \times X_n$ has distance functions

$\rho_p(x, y) = \left( \sum_{i=1}^{n} \rho_i(x_i, y_i)^p \right)^{1/p} \quad (p \ge 1)$

and

$\rho_\infty(x, y) = \max\{\rho_1(x_1, y_1), \ldots, \rho_n(x_n, y_n)\},$

where $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$. The topology of $X$ induced by each one of these metrics coincides with the product topology. In particular, for the $n$-dimensional Euclidean space $\mathbf{R}^n$, the metrics $\rho_p$ ($p \ge 1$) and $\rho_\infty$ define the same topology.
Let (X,, pl), , (X,,p,), . be a countable
number of metric spaces. If we define a metric
F. Compact Metric Spaces p on the Cartesian product X = n:=, X, by

For a metric space X, the following five con- p(x, y) = f L p,(xd


n=, 2” 1 +P,(xn3Y,)’
ditions are equivalent: (i) X is tcompact, that
is, every open covering of X has a finite sub- where x=(x1,x2,... ) and y = (y, , yz, ), then
covering. (ii) X is tcountably compact, that is, the topology defined by p is identical with the
every countable open covering of X has a product topology. For the Cartesian product
finite subcovering. (iii) X is tsequentially com- of an uncountable number of metric spaces, we
pact, that is, any sequence of points in X has a cannot construct a metric p such that the
convergent subsequence. (iv) Every nested topology induced by p agrees with the product
family F, 3 F2 2 of nonempty closed sets topology in general.
of X has a nonempty intersection. (v) Every
infinite subset A4 of X has an accumulation
point x, i.e., x E M - {x} A metric space satis-
H. Uniformity of Metric Spaces
fying any of these conditions is called a com-
pact metric space (M. FrCchet [ 11). Every real-
valued continuous function defined on a com- Every metric space X is a tuniform space, for
pact metric space has a maximum and a min- which we may take a countable number of
imum. A metric space is compact if and only subsets {(x, y) 1p(x, y) < 2 -“}, n = 1, 2,. , of
if it is totally bounded and complete (- Sec- X x X as a base of tuniformity (- 436 Uni-
tion J). Every totally bounded metric space is form Spaces).
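The product metrics of Section G can be tried out numerically. The following is a minimal sketch, assuming each factor metric is supplied as a Python callable; the names `rho_p`, `rho_inf`, `rho_countable` and the truncation parameter `terms` are illustrative choices, not notation from the text. For the countable product only finitely many terms of the series are summed, which leaves an error smaller than $2^{-\text{terms}}$ because each summand is bounded by $2^{-n}$.

```python
from typing import Callable, Sequence

Metric = Callable[[float, float], float]


def rho_p(metrics: Sequence[Metric], x: Sequence[float], y: Sequence[float],
          p: float = 2.0) -> float:
    """Finite-product metric (sum_i rho_i(x_i, y_i)^p)^(1/p), for p >= 1."""
    return sum(m(a, b) ** p for m, a, b in zip(metrics, x, y)) ** (1.0 / p)


def rho_inf(metrics: Sequence[Metric], x: Sequence[float],
            y: Sequence[float]) -> float:
    """Finite-product metric max_i rho_i(x_i, y_i)."""
    return max(m(a, b) for m, a, b in zip(metrics, x, y))


def rho_countable(metric_n: Callable[[int, float, float], float],
                  x: Callable[[int], float], y: Callable[[int], float],
                  terms: int = 60) -> float:
    """Truncation of sum_{n>=1} 2^(-n) * rho_n / (1 + rho_n).

    metric_n(n, a, b) is the metric of the n-th factor; x(n) and y(n)
    return the n-th coordinates. The neglected tail is below 2^(-terms).
    """
    total = 0.0
    for n in range(1, terms + 1):
        d = metric_n(n, x(n), y(n))
        total += d / (1.0 + d) / 2 ** n
    return total


if __name__ == "__main__":
    absolute = lambda a, b: abs(a - b)  # usual metric on the real line
    x, y = (0.0, 0.0), (3.0, 4.0)
    print(rho_p([absolute, absolute], x, y, p=1))
    print(rho_p([absolute, absolute], x, y, p=2))
    print(rho_inf([absolute, absolute], x, y))
```

For a product of $n$ factors one has $\rho_\infty \le \rho_p \le n^{1/p}\rho_\infty$, which is one way to see why all these metrics induce the same topology.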

I. Uniform Continuity

A mapping $f$ from a metric space $(X, \rho)$ into a metric space $(Y, \sigma)$ is †continuous if for any point $x$ in $X$ and any positive number $\varepsilon$ there is a positive number $\delta$ such that $f(U_\delta(x)) \subset V_\varepsilon(f(x))$, where $U_\delta(x)$ is a $\delta$-neighborhood for $\rho$ and $V_\varepsilon(y)$ is an $\varepsilon$-neighborhood for $\sigma$; that is, $\rho(x, x') < \delta$ implies $\sigma(f(x), f(x')) < \varepsilon$. In this case, we must generally choose $\delta$ depending on $x$ and $\varepsilon$. In the special case where we can choose $\delta$ depending only on $\varepsilon$, independently of $x$, we call $f$ uniformly continuous in $X$. (The notion of uniform continuity may be generalized to uniform spaces.) Not every continuous mapping is necessarily uniformly continuous, but every continuous mapping from a compact metric space into a metric space is uniformly continuous.

J. Complete Metric Spaces

A sequence $\{x_n\}$ of points in a metric space $(X, \rho)$ is called a fundamental sequence (or Cauchy sequence) if $\rho(x_n, x_m) \to 0$ as $n, m \to \infty$. Every convergent sequence is a fundamental sequence, but the converse is not always true. A metric space is called complete if every fundamental sequence in the space converges to some point of the space (M. Fréchet [1]). A topological space that is homeomorphic with a complete separable metric space is sometimes called a Polish space (→ 22 Analytic Sets I). The metric spaces introduced in examples (1) through (5) of Section B are complete. (In example (3) we must assume that the space R is a compact Hausdorff space.) A metric space is compact if and only if it is complete and totally bounded. A †locally compact metric space is homeomorphic to a complete metric space.
For a metric space $X$, we can construct a complete metric space $Y$ such that there is an isometric mapping $\varphi$ from $X$ onto a dense subspace $X_1$ of $Y$ (F. Hausdorff, 1914). Such a pair $(Y, \varphi)$ is called a completion of $X$. If $X$ has two completions $(Y_1, \varphi_1)$ and $(Y_2, \varphi_2)$, then there is an isometric mapping $f$ from $Y_1$ onto $Y_2$ with $\varphi_2 = f \circ \varphi_1$. In this sense the completion of $X$ is unique. By identifying $X$ with $\varphi(X)$ when $(Y, \varphi)$ is the completion of $X$, any metric space can be regarded as a dense subspace of a complete metric space. For example, the completion of the rational number system $\mathbf{Q}$ is the real number system $\mathbf{R}$. A metric space is totally bounded (= precompact) if and only if its completion is compact.
Baire-Hausdorff theorem: In a complete metric space every set of the †first category is a †boundary set. That is, every set that can be expressed as the union of a countable number of sets whose closures have no interior point has no interior point. In other words, if the union $\bigcup_{n=1}^{\infty} F_n$ of closed sets $F_1, F_2, \dots$ of $X$ has an interior point, then at least one of the $F_n$ must have an interior point.

K. The Metrization Problem

A topological space $X$ is called metrizable if we can introduce a suitable metric for $X$ which induces a topology identical to the original one. A †$T_1$-space satisfying the second countability axiom is metrizable if and only if it is †regular (Uryson-Tikhonov theorem; P. S. Uryson, Math. Ann., 94 (1925), A. Tikhonov, Math. Ann., 95 (1925)). However, a metric space does not necessarily satisfy the second countability axiom. Therefore, the Uryson-Tikhonov theorem does not provide a necessary and sufficient condition for metrizability. The following are some necessary and sufficient conditions for a topological space $X$ to be metrizable:
(1) There exists a nonnegative real-valued function $d$ on $X \times X$ satisfying the first two axioms given in Section A and the following condition: There exists a real-valued function $\varphi(u)$ that converges to zero as $u \to 0$ such that, for any three points $x, y, z$ and any positive number $\varepsilon$, $d(x, y) < \varphi(\varepsilon)$ and $d(y, z) < \varphi(\varepsilon)$ imply $d(x, z) < \varepsilon$ (E. W. Chittenden, Trans. Amer. Math. Soc., 18 (1917)).
(2) $X$ is a $T_1$-space that has a countable number of open coverings $\mathfrak{M}_1, \mathfrak{M}_2, \dots$ satisfying the following two conditions: (i) If $U_1, U_2 \in \mathfrak{M}_{n+1}$ have a common point, there is a set $U \in \mathfrak{M}_n$ with $U \supset U_1 \cup U_2$; (ii) for any point $x$ in $X$, if $U_n$ is any member of $\mathfrak{M}_n$ containing $x$, the family $\{U_n\}_{n=1,2,\dots}$ is a base for the neighborhood system of $x$ (P. S. Aleksandrov and Uryson, C. R. Acad. Sci., Paris, 177 (1923), and N. Aronszajn). When $X$ is a uniform space, this amounts to saying that $X$ has a metric compatible with the uniform structure if and only if $X$ is a $T_1$-space and has a countable base of uniformity.
(3) $X$ is a $T_1$-space that admits a countable number of open coverings $\mathfrak{M}_1, \mathfrak{M}_2, \dots$ such that $\{S(S(x, \mathfrak{M}_i), \mathfrak{M}_j) \mid i, j = 1, 2, \dots\}$ is a base for the neighborhood system of $x$ at each point of $X$, where $S(A, \mathfrak{M})$ is the †star of $A$ relative to $\mathfrak{M}$ (R. L. Moore, Fund. Math., 25 (1935); K. Morita, Proc. Japan Acad., 27 (1951); A. H. Stone, Pacific J. Math., 10 (1960); A. V. Arkhangel'skii, Dokl. Akad. Nauk SSSR, 2 (1961)).
(4) $X$ is regular and has a †$\sigma$-locally finite open base (J. Nagata, J. Inst. Polytech. Osaka City Univ., 1 (1950); Yu. M. Smirnov, Uspekhi Mat. Nauk, 6 (1951)).

(5) $X$ is regular and has a †$\sigma$-discrete open base (R. H. Bing, Canad. J. Math., 3 (1951)).
(6) $X$ is a †collectionwise normal Moore space (→ below; Bing, ibid.).
(7) $X$ is a †perfect image of a subspace of a Baire's zero-dimensional space (Morita, Sci. Rep. Tokyo Kyoiku Daigaku, sec. A, 5 (1955)).
(8) $X$ is a Hausdorff †M-space such that the diagonal is a †$G_\delta$-set in the direct product $X \times X$ (A. Okuyama, Proc. Japan Acad., 40 (1964); C. J. R. Borges, Pacific J. Math., 17 (1966); J. Chaber, Fund. Math., 94 (1977)).
A regular space is said to be a Moore space if it has a countable number of open coverings $\mathfrak{M}_n$ such that $\{S(x, \mathfrak{M}_n)\}$ is a base for the neighborhood system of $x$ for any point $x$. A Moore space is not necessarily metrizable. F. B. Jones (Bull. Amer. Math. Soc., 43 (1937)) proved under the assumption $2^{\aleph_0} < 2^{\aleph_1}$ that every separable normal Moore space is metrizable and asked whether or not every normal Moore space is metrizable. (3) and (6) are partial answers to the question. A normal Moore space is metrizable if it is locally compact and †locally connected (G. Reed and P. Zenor). The existence of a nonmetrizable separable normal Moore space is consistent with and independent of the axioms of the usual ZFC set theory, the †Zermelo-Fraenkel set theory with the †axiom of choice (F. D. Tall). W. G. Fleissner (Trans. Amer. Math. Soc., 273 (1982)) constructed a normal nonmetrizable Moore space assuming an axiom weaker than the continuum hypothesis, while P. J. Nyikos (1980) has proved, assuming a strong axiom of set theory, that every normal space with the first countability axiom is collectionwise normal.
In connection with (7) the following result is known. Let $f$ be a †closed continuous mapping from a metric space $X$ onto a topological space $Y$. Then the following conditions are equivalent: (1) $Y$ is metrizable; (2) For each $y \in Y$ the †boundary $\partial f^{-1}(y)$ of the inverse image is compact; (3) $Y$ satisfies the first countability axiom (Morita and S. Hanai, Proc. Japan Acad., 32 (1956); Stone, Proc. Amer. Math. Soc., 7 (1956); I. A. Vainstein, Dokl. Akad. Nauk SSSR, 57 (1947)). In particular, perfect images of metric spaces are metrizable.
For quotient topological spaces of metric spaces → 425 Topological Spaces CC.

References

[1] M. Fréchet, Sur quelques points du calcul fonctionnel, Rend. Circ. Mat. Palermo, 22 (1906), 1-74.
Also → references to 425 Topological Spaces.
For metrization problem see in particular
[2] J. Nagata, Modern general topology, North-Holland, 1968.
[3] R. Engelking, General topology, Polish Scientific Publishers, 1977.
[4] Y. Kodama and K. Nagami, Theory of topological spaces (in Japanese), Iwanami, 1974.
[5] A. V. Arkhangel'skii, Mappings and spaces, Russian Math. Surveys, 21 (1966), 115-162.

274 (XII.16)
Microlocal Analysis

A. General Remarks

Let $X$ be an open set of $\mathbf{R}^n$. Then $X \times (\mathbf{R}^n \setminus 0)$ can be identified with $T^*X \setminus 0$, the †cotangent bundle of $X$ minus the zero-section. To every locally integrable function (or distribution or hyperfunction (→ 125 Distributions and Hyperfunctions)) $f(x)$ defined on $X$, one can assign closed subsets of $X \times (\mathbf{R}^n \setminus 0)$ called the wave front set of $f$ and the singularity spectrum (or analytic wave front set or essential support) of $f$. The wave front set (resp. the singularity spectrum) of $f$ describes in detail the singularity of $f$ modulo the infinitely differentiable functions (resp. the real analytic functions). One can further associate with $f$ more refined objects defined on $X \times (\mathbf{R}^n \setminus 0)$, such as microfunctions. In many cases one can recover knowledge of the structure of $f$ by analyzing these objects defined on $X \times (\mathbf{R}^n \setminus 0)$. Such an analysis on the cotangent bundle of $X$ is called microlocal analysis. Microlocal analysis is particularly successful if $f$ is a solution of a system of linear (pseudo-)differential equations, because in that case one can use various linear transformations, such as differential operators, pseudodifferential operators (→ 345 Pseudodifferential Operators), or microdifferential operators and Fourier integral operators, or quantized contact transformations.

B. Microlocal Analysis for Distributions

Let $u$ be a distribution defined in an open subset $X$ of $\mathbf{R}^n$. The wave front set $WF(u)$ of $u$ is defined as the complement in $X \times (\mathbf{R}^n \setminus 0)$ of the collection of all $(x_0, \xi_0)$ in $X \times (\mathbf{R}^n \setminus 0)$ such that for some neighborhood $U$ of $x_0$, $V$ of $\xi_0$, we have for each $\varphi \in C_0^\infty(U)$ and each $N > 0$,

$$\langle u, \varphi \exp(-i\tau x\cdot\xi)\rangle = O(\tau^{-N})$$

as $\tau \to \infty$, uniformly in $\xi \in V$ (L. Hörmander; [1]). $WF(u)$ is considered to be a subset of

$T^*X \setminus \{0\}$. If $WF(u) = \emptyset$, then $u$ is a $C^\infty$-function. Let $\pi: X \times (\mathbf{R}^n \setminus 0) \to X$ be the natural projection. If $\pi(WF(u))$ contains $x_0$, then $u$ is not a $C^\infty$-function in any small neighborhood of $x_0$. Thus $WF(u)$ is the obstruction for $u$ to be infinitely differentiable. Let $p(x, D)$ be a †linear partial differential operator of order $m$. Assume that $p(x, D)u = f$. Then

$$WF(f) \subset WF(u) \subset WF(f) \cup p_m^{-1}(0),$$

where $p_m: (x, \xi) \mapsto p_m(x, \xi)$ is the †principal symbol of $p(x, D)$. Such a technique of localizing the problem on the cotangent bundle has been used in the form of the estimation of the Fourier transform of $f$ since the advent of the †singular integral operators of A. P. Calderón and A. Zygmund [3,4] (see also S. Mizohata [5,6]). The formula above contains as a special case the classical result that $u$ is a $C^\infty$-function if $p(x, D)$ is †elliptic and if $f$ is a $C^\infty$-function.
In obtaining useful results of microlocal analysis for distributions one often uses Fourier integral operators and pseudodifferential operators (or singular integral operators) (→ 345 Pseudodifferential Operators).

C. Fourier Integral Operators [1,2,7-9]

A Fourier integral operator $A: C_0^\infty(\mathbf{R}^n) \to \mathscr{D}'(\mathbf{R}^n)$ is a locally finite sum of linear operators of the type

$$Af(x) = (2\pi)^{-(n+N)/2} \int_{\mathbf{R}^{n+N}} a(x, \theta, y)\, \exp(i\varphi(x, \theta, y))\, f(y)\, dy\, d\theta.$$

Here $a(x, \theta, y)$ is a $C^\infty$-function satisfying the inequality

$$|D_x^\alpha D_\theta^\beta D_y^\gamma a(x, \theta, y)| \le C(1 + |\theta|)^{m - \rho|\beta| + (1-\rho)(|\alpha| + |\gamma|)}$$

for some fixed $m$ and $\rho$, $1 \ge \rho > 1/2$, and any triple of †multi-indices $\alpha, \beta, \gamma$, and $\varphi(x, \theta, y)$ is a real-valued $C^\infty$-function which is homogeneous of degree 1 in $\theta$ for $|\theta| > 1$. The function $\varphi$ is called the phase function and $a$ the amplitude function.
Let $C_\varphi = \{(x, \theta, y) \mid d_\theta\varphi(x, \theta, y) = 0,\ \theta \ne 0\}$ and $W = \{(x, y) \in \mathbf{R}^n \times \mathbf{R}^n \mid \exists\,\theta \ne 0 \text{ such that } (x, \theta, y) \in C_\varphi\}$. If $d_{x,\theta,y}\varphi(x, \theta, y) \ne 0$ for $\theta \ne 0$, then the kernel distribution $k(x, y)$ of $A$ is of class $C^\infty$ outside $W$. A phase function $\varphi$ is called nondegenerate if the $d_{x,\theta,y}(\partial\varphi(x, \theta, y)/\partial\theta_j)$, $j = 1, 2, \dots, N$, are linearly independent at every point of $C_\varphi$. In this case, $C_\varphi$ is a smooth manifold in $\mathbf{R}^{n+N+n}$ and the mapping $\Phi: C_\varphi \ni (x, \theta, y) \mapsto (x, y, \xi, \eta)$, $\xi = d_x\varphi(x, \theta, y)$, $\eta = -d_y\varphi(x, \theta, y)$, is an immersion of $C_\varphi$ to $T^*(\mathbf{R}^n \times \mathbf{R}^n) \setminus 0$, the cotangent bundle of $\mathbf{R}^n \times \mathbf{R}^n$ minus its zero-section. The image $\Phi C_\varphi = \Lambda_\varphi$ is a conic Lagrangian manifold, i.e., the canonical 2-form $\sigma = \sum_j d\xi_j \wedge dx_j - \sum_j d\eta_j \wedge dy_j$ vanishes on $\Lambda_\varphi$ and the multiplicative group of positive numbers acts on $\Lambda_\varphi$. Let $\lambda_1, \lambda_2, \dots, \lambda_{2n}$ be a system of local coordinates in $\Lambda_\varphi$. These, together with $\partial\varphi/\partial\theta_1, \partial\varphi/\partial\theta_2, \dots, \partial\varphi/\partial\theta_N$, constitute a system of local coordinate functions of $\mathbf{R}^{n+N+n}$ in a neighborhood of $C_\varphi$. Let $J$ denote the Jacobian determinant of the change of coordinates between $(\lambda_1, \dots, \lambda_{2n}, \partial\varphi/\partial\theta_1, \dots, \partial\varphi/\partial\theta_N)$ and $(x, \theta, y)$.
The function $a_{\Lambda_\varphi} = \sqrt{J}\, a|_{C_\varphi} \exp(\pi M i/4)$ is called the symbol of $A$. Here $a|_{C_\varphi}$ is the restriction of $a$ to $C_\varphi$ and $M$ is an integer called the Keller-Maslov index [10,11]. The conic Lagrangian manifold $\Lambda_\varphi = \Lambda_\varphi(A)$ and the symbol $a_{\Lambda_\varphi} = a_{\Lambda_\varphi}(A)$ essentially determine the singularity of the †kernel distribution $k(x, y)$ of the Fourier integral operator $A$. Conversely, given a conic Lagrangian manifold $\Lambda$ in $T^*(\mathbf{R}^n \times \mathbf{R}^n) \setminus 0$ and a function $a_\Lambda$ on it, one can construct locally a Fourier integral operator $A$ such that $\Lambda_\varphi(A) = \Lambda$ and $a_{\Lambda_\varphi}(A) = a_\Lambda$. For global construction of such a Fourier integral operator one requires detailed consideration of the Keller-Maslov index. A globally defined Fourier integral operator $A$ with $\Lambda_\varphi(A) = \Lambda$ and $a_{\Lambda_\varphi}(A) = a_\Lambda$ exists if and only if $a_\Lambda$ is not a function on $\Lambda$ but a section of the complex line bundle $\Omega_{1/2} \otimes L$, where $\Omega_{1/2}$ is the bundle of square roots of the volume elements of $\Lambda$ and $L$ is a $\mathbf{Z}_4$ bundle over $\Lambda$ called the Maslov bundle. The factor $\sqrt{J} \exp(\pi M i/4)$ in the definition of $a_{\Lambda_\varphi}$ above appears as the effect of trivialization of the bundle $\Omega_{1/2} \otimes L$. Those Fourier integral operators whose associated conic Lagrangian manifolds are the graphs of †homogeneous canonical transformations of $T^*(\mathbf{R}^n)$ are most frequently used in the theory of linear partial differential equations. Let $A$ be a Fourier integral operator such that $\Lambda_\varphi(A)$ is the graph of a homogeneous canonical transformation $\chi$. Then the adjoint of $A$ is a Fourier integral operator such that the associated conic Lagrangian manifold is the graph of the inverse transformation $\chi^{-1}$. Let $A_1$ be another such operator; if $\Lambda_\varphi(A_1)$ is the graph of $\chi_1$, then the composed operator $A_1 A$ is also a Fourier integral operator and $\Lambda_\varphi(A_1 A)$ is the graph of the composed homogeneous canonical transformation $\chi_1\chi$.
Consider the kernel distribution $k(x, y)$ of $A$. If the phase function $\varphi$ of $A$ is nondegenerate, then $WF(k)$ is contained in $\Lambda_\varphi(A)$. Moreover, if the symbol $a_{\Lambda_\varphi}(A)$ does not vanish, then $WF(k) = \Lambda_\varphi(A)$. Let $u$ be a distribution and $A$ be a Fourier integral operator such that $\Lambda_\varphi(A)$ is the graph of a homogeneous canonical transformation $\chi$. Then $WF(Au) \subset \chi(WF(u))$.

A pseudodifferential operator of class $S^m_{\rho,1-\rho}(\mathbf{R}^n)$ is a particular type of Fourier integral operator (→ 345 Pseudodifferential Operators). In fact, a Fourier integral operator $A$ is a pseudodifferential operator of class $S^m_{\rho,1-\rho}(\mathbf{R}^n)$ if and only if $\Lambda_\varphi(A)$ is the graph of the identity mapping of $T^*(\mathbf{R}^n)$. Hence for any Fourier integral operator $A$, $A^*A$ and $AA^*$ are pseudodifferential operators.
The following theorem is due to Yu. V. Egorov [12]: Let $P(x, D)$ be a pseudodifferential operator of class $S^m_{\rho,1-\rho}(\mathbf{R}^n)$ with the symbol $p(x, \xi)$, and let $A$ be a Fourier integral operator such that the associated conic Lagrangian manifold $\Lambda_\varphi(A)$ is the graph of a homogeneous canonical transformation $\chi$ of $T^*(\mathbf{R}^n)$. Then there exists a pseudodifferential operator $Q(x, D)$ with the symbol $q(x, \xi) \in S^m_{\rho,1-\rho}(\mathbf{R}^n)$ such that $P(x, D)A = AQ(x, D)$ and $q(x, \xi) - p(\chi(x, \xi)) \in S^{m-2\rho+1}_{\rho,1-\rho}(\mathbf{R}^n)$. Note that $m - 2\rho + 1 < m$.
Assume that $m = 1$, $\rho = 1$, and that $p_1(x, \xi)$ is a real-valued $C^\infty$-function, homogeneous of degree 1 in $\xi$ for $|\xi| > 1$, such that $p(x, \xi) - p_1(x, \xi) \in S^0_{1,0}(\mathbf{R}^n)$ and $d_\xi p_1(x_0, \xi_0) \ne 0$ at $(x_0, \xi_0)$, where $p_1(x_0, \xi_0) = 0$. Then one can find a Fourier integral operator $A$ such that the function $q(x, \xi)$ of Egorov's theorem satisfies the relation $q(x, \xi) - \xi_1 \in S^0_{1,0}(\mathbf{R}^n)$.
The boundedness of Fourier integral operators in the space $L_2(\mathbf{R}^n)$ (or the spaces $H^s(\mathbf{R}^n)$) has also been studied in several cases. Some sufficient conditions for boundedness can be found in [7,8,14-16].
The theory of Fourier integral operators has its origin in the asymptotic representation of solutions of the wave equation (see, e.g., [17,18]). Fourier integral operators were first used by G. I. Eskin [7].

D. Essential Support or Analytic Wave Front Set of a Distribution

Inspired by the physical idea introduced by C. Chandler and H. P. Stapp, J. Bros and D. Iagolnitzer introduced the notion of the essential support of a distribution, which is a closed subset of $X \times (\mathbf{R}^n \setminus 0)$ [19]. Let $u$ be a distribution defined on an open set $X$ of $\mathbf{R}^n$ and $\chi$ be a $C^\infty$-function with compact support around $x_0 \in X$ which is locally analytic and different from 0 at $x_0$. Let $\Sigma_\chi(u)$ be the subset of $\mathbf{R}^n \setminus 0$ of which the complement is defined as follows. A point $\eta$ is in the complement of $\Sigma_\chi(u)$ if there exist a conic neighborhood $U$ of $\eta$, constants $\alpha$, $\gamma_0 > 0$, and $C_N$ such that
for all $\xi \in U$, $0 < \gamma < \gamma_0$, and all positive integers $N$.
The essential support $\Sigma_{x_0}(u)$ of $u$ at $x_0$ is the limit of $\Sigma_\chi(u)$ when the width $|\operatorname{supp}\chi|$ of the support of $\chi$ around $x_0$ tends to 0. The essential support $\Sigma(u)$ of $u$ is the closed subset $\bigcup_{x \in X} \{x\} \times \Sigma_x(u)$ of $X \times \mathbf{R}^n \setminus 0$. $\Sigma(u)$ is the obstruction for $u$ to be real analytic. L. Hörmander [20] also defined the analytic wave front set of a distribution, which is also the obstruction for a distribution to be real analytic. The definition of an analytic wave front set is quite different from that of essential support. However, they coincide with each other [21]. Moreover, both of them coincide with the singularity spectrum of $u$ if the distribution $u$ is regarded as a hyperfunction.

E. Microlocal Analysis for Hyperfunctions [22]

(1) Microfunctions. Let $N$ be a real analytic manifold of dimension $n + d$ and $M$ its submanifold of codimension $d$. In what follows, $T_M N$ and $T_M^* N$ denote the normal bundle of $M$ and the conormal bundle supported by $M$, respectively. Here the normal bundle $T_M N$ is defined to be the quotient bundle $TN|_M / TM$ of the tangent bundle and the conormal bundle $T_M^* N$ to be the subbundle of the cotangent bundle $T^*N|_M$ that annihilates $TM$. Identifying $N$ with $\{(x, v) \in TN \mid v = 0\}$ or $\{(x, \xi) \in T^*N \mid \xi = 0\}$, we define the tangential sphere bundle $SN$ and cotangential sphere bundle $S^*N$ by $(TN \setminus N)/\mathbf{R}_+^\times$ $(= \bigcup_{x \in N} (T_x N \setminus \{0\})/\mathbf{R}_+^\times)$ and $(T^*N \setminus N)/\mathbf{R}_+^\times$ $(= \bigcup_{x \in N} (T_x^* N \setminus \{0\})/\mathbf{R}_+^\times)$, respectively. The normal sphere bundle $S_M N = (T_M N \setminus M)/\mathbf{R}_+^\times$ and conormal sphere bundle $S_M^* N = (T_M^* N \setminus M)/\mathbf{R}_+^\times$ are defined in the same manner. In parallel with the algebraic geometry (→ 16 Algebraic Varieties) we define the real monoidal transform of $N$ with center $M$ to be the manifold $(N \setminus M) \sqcup S_M N$ with boundary, in which the center $M$ is blown up to $S_M N$ by the polar coordinates. We denote it by $\widetilde{{}^{M}N}$. We mainly use this notion when $N$ is a †complexification $X$ of $M$, regarding $X$ as a $2n$-dimensional real manifold. In this case we can canonically identify $T_M X$ with $\sqrt{-1}\,TM$, and hence $S_M X$ with $\sqrt{-1}\,SM$. We denote by $x + \sqrt{-1}\,v0$ the point in $S_M X$ that corresponds to $(x, \sqrt{-1}\,v)$ in $\sqrt{-1}\,SM$ by this identification.
An open subset $W$ of $X \setminus M$ is called a conoidal neighborhood of a subset $U$ of $\sqrt{-1}\,SM$ if $W \cup \sqrt{-1}\,SM$ is a neighborhood of $U$ in $\widetilde{{}^{M}X}$. Let $\varepsilon$, $\tilde{\varepsilon}$, and $\iota$ denote respectively the canonical embedding mappings from $X \setminus M$ to $X$, from $\widetilde{{}^{M}X} - \sqrt{-1}\,SM$ to $\widetilde{{}^{M}X}$ and from $\sqrt{-1}\,SM$
to jv”x. We then define the tsheaves d and .d

by I,&-‘0, and d 1Jo, SM( = f-‘6,). Here 6, on fl S* M. (ii) For each (x, fl <m) in
denotes the sheaf of germs of holomorphic J-1 S* M, there exists a surjective mapping
functions on X. We also define the sheaf 2 on from gM,, to VM,CX,~~:~gm,. This mapping is
J-1 SM by %$~iSM(~~iOJr where r is the denoted by sp. The mapping from BM to
canonical projection from M”x to X and 2: n,%$, is also denoted by sp. (iii) We have the
denotes the pth tderived functor of the functor following exact sequence: O-dM-BMz
I, of taking the sections with support in S. n.+V&,+O. (iv) Rklr,WM=O holds for k#O.
Then 9 is isomorphic to d /r-i& for the The exact sequence (iii) shows that the singu-
sheaf d of real analytic functions on M. The larities of hyperfunctions are dispersed over
sheaf 2 is, so to speak, the sheaf of “boundary J-1 S*M and that the dispersed object
values” of holomorphic functions. Actually is described by the sheaf VM of microfunc-
there exists a canonical mapping b from d to tions. For a hyperfunction DEB,,,, we call
~-i&, where BM denotes the sheaf of thyper- q(f) (~W(fi S*M)) the spectrum of ,f: We
functions on M (- 125 Distributions and denote suppsp(f) by S.S.j’and call it the
Hyperfunctions). Thus we see that S describes singularity spectrum of ,f‘ or the singular spec-
the singularities of the boundary value of a trum. It is known [21] that this coincides
holomorphic function. These sheaves d and with the analytic wave front set off and with
9 are easy to understand intuitively. However, the essential support off if f‘is a distribution.
they are defined on p SM, while J-1. (v) The following sequence is an exact se-
S*M is more important in analysis. Our final quence on J-1 SM: O+,p?,~~~l&?,,,+~~z~lW~
goal, namely, the sheaf of microfunctions, is -0. In the following, a subset A of the (n- l)-
constructed on fi S*M through cohomo- dimensional sphere S”-’ is said to be convex
logical machinery starting from 9. In order to if tY’(A)U {0} is convex, where zn is the ca-
do this, we introduce the disk bundle DM by nonical projection from R” \ {O} to SE-i, and
if u-‘(A) U { 0) is convex and includes no
straight line, A is said to be properly convex.
+iSMx,&iS*MI(v,&<O}. A subset Z of J-1 SM (resp., J-1 S*M) is
also said to be (properly) convex if r-i(x)f’Z
Here J-1 SM xMfi S*M denotes the (resp., Cl (x) n Z) is so for each x in M. For
+fiber product of J-1 SM and J-1 S* M a subset Z of J-1 SM its polar set Z” is, by
over M and the symbol <cc is used to em- definition, {(x, J-1 5, CCJ)EJ-1 ST M 1
phasize that 5 designates the codirection, (v,, 5,) > 0 holds for each x + J-1 U, 0 in
which is dual to the infinitesimally small Z}. The polar set Z” of a subset Z of J-1.
quantity ~0. The canonical projections from S*M is defined in the same way. (vi) Let U
DM to J-1 S*M and from J-1 SM to be an open subset of fl SM such that
M are both denoted by z. Similarly, the pro- T ml (x) n U is a nonvoid connected set for
jections from DM to fi SM and from each x in M. Let I/ denote U”“. Then we
J-1 S*M to M are denoted by rr. We denote have (a) The restriction mapping p: I( V; .g )-
by a the antipodal mapping on J-1 S*M, I( U; .d ) is a bijection. Here I( I/; .d ), etc. de-
namely, a(x, fita)-(x, -fi <co). For notes the space of global sections of d over V,
a sheaf B on J-1 S*M, we also denote etc. (b) O+.~(U)+~(M)%Y(fi S*M - U’)
a,,F( = a ml F) by 9’. In the following, Rjz,, is an exact sequence. (vii) Let f(x) be a hyper-
etc., denotes the jth iright-derived functor of function on M. Then the following two state-
the functor r* of taking the direct image of ments are equivalent: (a) ~p(,f)~~~,~~-j~,~,=O.
sheaves, etc. (- 383 Sheaves). Now the sheaf (b) There exist a finite family of open subsets
V,, of microfunctions is defined on J-1 S* M U, of J-1 SM whose polar set Uy does not
by (Rnm1r,n-‘!2)” @ nmlwM, where uM de- contain (x,, fi &, co) and ‘pj in I( Uj, .E? )
notes the torientation sheaf of M. Note that such that ,f= Ejih((pj). Then we say that f is
Rjz,n-‘9=Oholdsforj#n-1. micro-analytic at (x0, J-l to co).
Remark: Here we have defined the sheaf VM
(3) Operations on Microfunctions. Let M and
of microfunctions on J-1 S*M. However, it
N be real analytic manifolds, and let f be a
is sometimes more convenient to define the
real analytic mapping from N to M. We de-
sheaf on J-1 T*M (= T,*X) by the follow-
note by TN* M the kernel of the natural map-
ing convention: (2M,(x,~~--ir)=%,~x,~ -Irmj if ping from N x,,, T*M to T*N. It is also called
5Z0, and %,,x, if 5 = 0. Several authors (e.g., a conormal bundle supported by N. The as-
[23]) call this sheaf GM the sheaf of micro-
sociated sphere bundle is denoted by S,*M.
functions and denote it by ‘e,.
Denote by p and a the natural mappings from
(2) Basic Properties of Microfunctions. The NxM&iS*M+iS;M to&iS*N
sheaf ‘& defined above has the following and from N x,,,rfl S*M\fi SZM to
properties: (i) The sheaf ‘e, is a tflabby sheaf fi S* M, respectively. (i) Let ,?4h denote the

sheaf {uEaMIS.S. un&lS,*M=(20. Then ported by Py*( Y x X), which is identified with
we have the following two canonical homo- Y x x P*X. Here and in what follows P*X,
morphisms: f*:.%?h-&& and f*:p!~i%“+ etc., denotes the cotangential projective bundle
qN. Here and in what follows, for a continuous of X, etc. Then the sheaf GYP of microdifferen-
mapping cp from N to M and a sheaf 9 on N, tial operators (of infinite order) is, by defini-
~~9 denotes the sheaf on M defined by assign- tion, GFxBx Om, Rx, where R, denotes the
ing {sEr((~-‘(C/);~,--)I~l~~~~~:supps~U is sheaf of holomorphic dim X-forms.
tproper} to each open subset U of M. These Remark. Several notations are used to de-
two homomorphisms are consistent. We call note &T in the literature. For example, [22]
each of them a substitution (homomorphism) uses the symbol Px and calls it the sheaf of
and denote it by (f*n)(y)=u(,f(y)). (ii) Let pseudodifferential operators. As in the case of
uy denote the sheaf of tdensities. Then there microfunctions, some authors use the symbol
exist the following two canonical homo- 8; to denote a sheaf on T*X. In this case
morphisms: f, :A@,,, Q,, u,,+&?~ @,, uy and 82 1x is, by definition, 92, the sheaf of linear
f,:a!p-‘(~~O~~v,)~ce,O~~v,. These two differential operators (of infinite order). One
homomorphisms are consistent. We call each should be careful in these notational confu-
an integration along a fiber and denote it by sions in referring to papers using microdif-
(~*u)(x)=~~~~(,)u. (iii) Using the result in (i), ferential operators. We note also that the sym-
we can define the product u, u2 of two byper- bols 9 and & have nothing to do with the
functions u, and u2, if S.S. u, f’(S.S. uz)” = symbols in distribution theory.
0. Furthermore, the singularity spectrum We now list the basic properties of micro-
of ui u2 is contained in {(x, fi(@, + (l- differential operators.
(i) When X is a complexilication of a real
o)irz)co)l(X,\/-151CX))ES.S.U1,(X,~52CO)
ES.S.z4*,0~.0~l}US.S.U, us.s.u,. analytic manifold M, &T 14ySeM is a subring
of YM.
(ii) Let R be an open subset of P*X. Using
a local coordinate system (x) on X, we define
F. Microdifferential Operators [22] fi by {(x,~)EC”X(C”-{O})~(X,~CO)E~}. Let
{ pj(x, <)}j,z be a sequence of holomorphic
(1) Microlocal Operators. Let M be a real functions on fi satisfying the following condi-
analytic manifold, and define the sheaf tions: (1) pj(x, 5) is homogeneous of degree j
ZMonfiS&(MxM)=fiS*Mby in 5. (2) For each E> 0 and each compact sub-
Jfc&St(M x M)WM x M 0 u,,,). A section K (x, y) dy set K of 6, there exists a constant C,,, such
of yM naturally determines an integral oper- that supKlpj(x, <)I <CE,k.sj/j! (j>O) holds.
ator ~?:u(y)+JK(x,y)u(y)dy. An operator (3) For each compact subset K of fi, there
thus obtained is called a microlocal operator, exists a constant R, such that supk (pj(x, 01~
because it acts on the sheaf %M of microfunc- RKj( -j)!( j< 0) holds. Then there is a one-to-
tions as a sheaf homomorphism. Usually we one correspondence between the space of such
identify an operator ~6” and a kernel function sequences and the space of sections of 8,” over
K (x, y)dy. (i) yM is a sheaf of rings by the n.
natural composition. The unit element of (iii) A sequence satisfying the conditions
yM is 6(x-y)dy. It acts on U, as an identity in (ii) is called a symbol sequence, and the
operator. (ii) Let K (x, y)dy be the kernel func- corresponding section of 8” is denoted by
tion of a microlocal operator x‘ defined near Cjczpj(x, 0,). If we define a subsheaf ~?~(rn) of
(x0, J-1 &, co). Then its adjoint operator Xx* 8x” by {P=Cjpj(x,D~)E~~lPj(X,5)=0(j~
is, by definition, the microlocal operator de- m + l)}, it is independent of the choice of the
fined near (x,, -J-l to co) with the kernel local coordinate systems. A microdifferential
function K(y, x)dy. The operation * is a sheaf operator belonging to &x(m) is said to be of
isomorphism between 58M and .5&, where a order (at most) m. We denote U,gx(m) by
denotes the antipodal mapping (- Section E). 8x and call a section of&x a microdifferential
operator of finite order.
(2) Microdifferential Operators. A micro- (iv) Let QA(z) denote I@)/( iz)“, where
differential operator is an analog of a micro- its branch is chosen so that QA( -1) = I(i).
local operator in the complex domain. By a When i, = 0, -1, -2, . , we consider its tfinite
procedure similar to that used to define the part. Let R be a complex neighborhood of
sheaf of microfunctions we first define the (x,,~~~co)E~S*R”~R”X~S”-~.
sheaf V$ of holomorphic microfunctions for Using a symbol sequence { pj(z, g)} on fi, we
a submanifold Y of X (- [22, definition 1.1.7 define a multivalued holomorphic function
on p. 3191, where it is denoted by %r,,). It K(Z, W, i) by CjPj(Z, O@n+j((Z- W, i>) and
follows from the definition that %$ is sup- consider its boundary value from the domain

Re(z - w, [) < 0. We denote the resulting dependent of the choice of local coordinate
microfunction by system. It gives an isomorphism between
fx(m)/&x(m- 1) and BTs,(m), the sheaf of
~Pj(x,JIS)~.+j(~((x-Y,c)
holomorphic functions on T*X which are
homogeneous of degree m with respect to <.
+&iO)). (viii) (a) Let P be in &x((m)(,O,ro,. Assume that
Then 5,(P)(xo, &,)#O. Then its inverse P-’ (i.e.,
PP-‘=P-‘P= 1) exists in l?x(-m)t,o,e,,. (b)
K(x, y) = (27L)-” Let P and G be in ~x(m)~xo,~,~, ~x(l)~x,~r,~, re-
~Pj(xa~E)@n+j((x-Y.
Si spectively. Suppose that H,$,,(a,,,(P))(x,, to) =
0 (j = 0, , p - 1) and that H,P,&o,,,(P))(x,, to)
#O. (Here H,(g) is, by definition, the ‘Poisson
bracket (1; g} of ,f and g (- 82 Contact Trans-
is a well-defined microfunction in a neighbor- formations)). Then for each S in 8x,(xn,r01 we
hood of (xO,xO;fi(&, -&,)a) whose sup- can find Q and R in Gx,(xO,rO, so that S = QP +
port is contained in the antidiagonal set k =
Rwith(adG) p R=de+\ G,[G ,..., [G,R] ,... ]=0
{(X>Y;J-I(S>)?)4%/=*(~ x wx=y,
5 + q =O}. Here w(c) is the volume element holds. This result is usually referred to as the
of the (n - 1)-dimensional sphere S”-‘. Hence Spith-type division theorem (for microdifferen-
K(x, y) dy defines a microlocal operator. tial operators). In particular, when G = x,
The mapping which associates K(x, y)dy with and (x,,, &,) = (0; l,O, . . , 0), R has the form
{ pj(x, l)} is compatible with the inclusion C{:A Rck)(x, D’)D,“. Here Rck)(x, D’) = Rck’(x, D,,
mapping stated in (i). , Dnml). As a corollary to this expression
(v) By using the tplane wave decomposition we find the following (Weierstrass-type) pre-
of the S-function due to F. John (- 125 Distri- paration theorem (for microdifferential oper-
butions and Hyperfunctions CC), we find that ators): Let P be as above, and let G = x,. Then
microdifferential operators are a natural gen- we can find Q and W in G,,,,; , ,0, .,,, 0j such
eralization of tlinear differential operators: A that P = Q W with invertible Q and W= D,P +
linear differential operator corresponds to a CgzA Wck)(x, D’)D,“, where Wck) belongs to
symbol sequence { pj(x, <)}j,o, where pi is a &.r(p-k) and a,~,(Wck’)(O; l,O, . . . . O)=O.
polynomial with respect to 5. (ix) Quantized contact transformation. (a) Let
(vi) (a) Let P= Cjpj(x, 0,) and Q = X be an n-dimensional complex manifold and
& qk(x, 0,) be microdifferential operators. fi an open subset of P*X. Let 4 (j= 1,2,. , n)
Then their composition R = P o Q is a micro- be in gx( 1) (a) and Qj (j = 1, , n) in &,r(O)(R).
differential operator with the symbol sequence Assume that [I$ Pk] = [Q,, Qk] =0 and [!$ Qk]
= Sjk hold (1 <j, k Q n). Let cp be the contact
{rJlcz given by
transformation from R to P*C” defined by
PH(~Q,)(P)> . ..>~(Q.)(P)> ~I(PI)(PX “‘1
ol(Pn)(p)). Then there exists a unique C-algebra
homomorphism @:cp-‘&c”+&xlo such that
Here Di = @l/at;1 I?(; and s(! = !xr ! . a,! for @(xi) = Qj and @(D,) = 4 (j = 1, , n). Further-
the tmulti-index c(=(c(r, , n,). (b) Let P more, @ is an isomorphism, cDGcn(m) = &x(m)
= cjpj(x, D,) be a microdifferential operator. holds, and 0,(@(R)) = a,,,(R) o cp holds for R in
Let 6(x-y) denote the residue class [l] of the 8&m). We call the pair (cp, @) a quantized con-
left &xx x- Module 8, x xlG%1 8X x Axk -yk) tact transformation. In the above situation,
+ C;=r 8, x x(8/8xk + d/dyk)). (“Module” means the &x x,.-Module
sheaf of modules.) Then there exists a unique
microdifferential operator R = Cl rJy, DJ such
that P(x,D,)h(x-y)= R(y,D$(x-y). Fur-
thermore, r!(x, 5) is given by

is a simple holonomic system (- Section H)


R is called the adjoint operator of P and is whose support is the graph of cp’. Let u be the
denoted by P*. When X is a complexification canonical generator of .,ti, i.e., the residue class
of the real manifold M, it coincides with the of 1 in A. Then R*u=@(R)u holds for Red&.
adjoint operator P* E 3’;. (b) Conversely, let q be a contact transfor-
(vii) For a microdifferential operator P in mation from a neighborhood of p in P*X to
G,(m), we define its principal symbol a,,,(P) by P*c”, and let u be a generator of a simple
p,(x, 5). The principal symbol q,,(P) is in- holonomic system whose support is the graph

of cp”. Then a C-algebra isomorphism D: (1) &~$,(&‘,8~)=0 forj#d. (2) V~fSuppJ


cp~‘&c~-&~ is defined in a neighborhood of p is regular at PE V in the sense that V is non-
through R*u=@(R)u (RE&), and (q,@) singular near p and that w 1Jp) # 0 for the
becomes a quantized contact transformation. +canonical 1-form w. Then, through a quan-
(c) In particular, let p be a point in J-1 S*M tized contact transformation (q, Q), 8’” @ J$?
and 0 its complex neighborhood. Let (9, @) be is isomorphic to a direct summand of a direct
a real quantized contact transformation de- sum of finite copies of partial de Rham sys-
fined on R (i.e., q maps fl S*M to J-1. tem JITO=&;uT(Cjd,, 8Fna/azj) with q(p)=
S*R”). Then, in a neighborhood of (p,(p(p)“)~ (0; 0, ,o, 1)E P*c”.
J-1 S*(M x R”), there exists a microfunction By studying the canonical form of V under
solution K(y,x)#O of the equations xjK(y,x)= real contact transformations, [22] further
QjK(Y>x)>-(d/~Xj)K(Y,X)=~K(Y,X)(j= gives the following structure theorem in a real
1,. , n) (xcR”, YE A4). Such a microfunction is domain, i.e., in J-1 S*M for a real analytic
unique up to constant multiple. The integral manifold M.
operator .~:~(x)HSK(y,x)u(x)dx gives rise Structure theorem 2. Let X be a complexifi-
to a sheaf isomorphism between @‘YRe,. and cation of a real analytic manifold M, and let J!
‘gM in a neighborhood of p. Furthermore, we be as in structure theorem 1. Let p be a point
have :X(Ru)=@(R)(XV) for UG%& and REC?,.. in Vn J-1 S* M. Suppose that the following
This .X is the counterpart of the Fourier inte- three conditions are satisfied: (3) Vf’ V is regu-
gral operator (- Section C). lar at p. (4) T,(V) n T,(V)= 7J T/f’ V) holds for
(x) Algebraic properties of 8; and Ex. (a) & each q in Vn V. Here V denotes the complex
is icoherent as a left &Module, and its stalk is conjugate of V (with respect to J-1 S*M)
a Ueft Noetherian ring. (b) 8; is tfaithfully flat and 7J V), etc., denotes the tangent space of V,
over &. (c) 8x is tflat over gx(0). (d) Rx is flat etc., at q. (5) The generalized Levi form of V is
over Y’&. of constant isignature (a, b) near p, where the
generalized Levi form of V is, by definition, the
Hermitian form
G. Microdifferential Equations 122,241
j,k
(1) Background. A system & of microdifferen-
tial equations (of finite order) is by definition a for the pj such that V= njpj’(0). Then 8” @
-icoherent left (or right) 8x-Module, i.e., there J? is isomorphic to a direct summand of
exists locally an exact sequence of the form a direct sum of finite copies of the system
&~&~JY~O. For a coherent $-Module R Z O8 ,$r considered in a neighborhood of
A, the support SuppA of ,,&’ is an impor- (x&i<)=(O&i(O ,..., OJ))E\/-IS*R”,
tant geometric object associated with .k. It is where .,lr is given by
called the characteristic variety of .,&‘. For i:
a coherent $&Module J%‘, its characteristic z.f=O (.i=I >..., 4,
variety is by definition Supp(b @,A). It is
I
often denoted by S.S. .&‘. Since a microdifferen-
tial operator is a microlocal operator, the
result in (viii) (a) of Section F (3) asserts that 1
8’&i(&‘, WJ is supported by the characteristic ci
-+J-1 xrtlr+g / n f=O
variety of M intersected with J-1 S*M. ( dX r+2s+I >

Now, it is known that V= Supp.4 is tinvolu- (I= I, . . ..a).


tory (= involutive, in involution) in P*X,
namely,f].=g],=Oentails

important problems in microlocal


{J.4}lv=O[22,
theorem 53.2 on p. 4531. One of the most
analysis is
(L-,--ix.+2.,+,g)f=o
? cx,+2s+1 n

(l=a+l,...,a+b).
to study how much information V can give
concerning the structure of &’ itself, and hence Here, r = 2 codim V- codim( Vn V) and s =
that of &:-*,L’,(d, %$,). The epoch-making dis- codim(Vn V)-codim V-(a+h).
covery of [22] is that V determines the struc- The first (resp. second) type of equation in
ture of 6” mfl& at generic points of V as the above are called (partial) de Rham equa-
follows. tions (resp. (partial) Cauchy-Riemann equa-
tions). The third and the fourth are called
(2) Structure Theorems. The fundamental Lewy-Mizohata equations (of type (a, h)) after
result of [22, theorem 5.3.7 on p. 4553 is the these authors’ pioneering works [25,26]. Thus
following theorem for coherent b-Modules: any system is seen to be microlocally isomor-
Structure theorem 1. Let ./Z be a coherent phic to a mixture of these three types of equa-
Rx-Module satisfying the following conditions: tions, generically speaking. As a corollary to

this, structure theorem 2 clarifies the structure system 81.9. Suppose that A = Supp .I is
of microfunction solutions of .4f as follows: nonsingular. Then the principal symbol CT,,(U)is
Structure theorem 3. Let M, X, .k, V’, defined as follows: For P = Cjpj(x, D,)E 8(m),
and p be as in structure theorem 2. Then let L, denote the first-order linear differential
&;.I:c~; (&; @ .&‘, %$,) = 0 (j # a) holds, and the operator
remaining cohomology group 9 =8.&X1 (8; 0
~2, wM) has the following structure in a neigh-
borhood U of p: There exists an s-dimen-
sional complex manifold Y, a real analytic Here, HDm denotes the Hamiltonian vector field
manifold N, and a tsmooth mapping cp from defned by {pm; 1. Let R, and Rx denote the
VfWf$iS*M to Yx,/%*N such sheaf of n-forms of A and X, respectively, and
that~==cp-“3forasheaf’ZIon Yxfi. let a?‘/’ (resp. Qf”’ @ 02 -l/‘) denote a line
S*N that is a direct summand of .XN, with .f bundle L such that Lo2 is isomorphic to !&
being the solution sheaf of the partial Cauchy- (resp. Q,, @ Q-t). Since the line bundles QF”2
Riemann equations associated with Y. and QF112 @ fi$?“’ do not exist globally in
general, all the equations among their sections
should be understood up to a constant multi-
plicative factor. When A is a purely imaginary
H. Holonomic Systems
Lagrangian submanifold of G S*M for
a real analytic manifold M, these line bun-
A coherent (left) &-Module ~4’ is called holo- dles can be constructed globally by using the
nomic if Supp.B is tlagrangian. A coherent Keller-Maslov index (- Section C). Let v
(left) a-Module .1 is called holonomic if and $ denote respectively the first term and
d 0 ,r.,&’ is so. Even though the term “holo- the second term in the right-hand side of the
nomic” is currently used, another term, “maxi- above definition of L,. Then e,:Rf”2+
mally overdetermined,” is used to describe the OF”2 is given by e,(s) ==( l/2s)L,(s2) + *s for
same object in some of the literature, including s~@“~, where L,(s2) denotes the :Lie deriva-
[22]. The importance of such a system lies in tive of s2 along v. One can then prove that the
the fact that the space of its microfunction system of equations L,s = 0 (P E J) admits
solutions is finite-dimensional [27,29]. In this locally one and exactly one nonzero solution
sense it resembles an ordinary differential s in RF1/2 up to a constant factor. Then the
equation. A holonomic system, however, does principal symbol g*(u) of u equals s @ J%e
not satisfy condition (2) of structure theorem 1 @‘j2 @ Q’$-‘j2 by definition. The principal
of Section G, and its structure is rather com- symbol aA is homogeneous with respect to
plicated. A result which corresponds to struc- 5, and its homogeneous degree is called the
ture theorem 1 in Section G is the following: order of u and is denoted by ord,(u). The
Let V be an involutory submanifold of T *X, microlocal structure of a simple holonomic
and let 8” be the subring of&x generated by &-Module with a nonsingular characteristic
{PIz~~(~)~~,(P)]~=O}. Then a coherent 8X- variety is determined by the order of its gen-
Module & defined on an open subset Q of erator as follows: (a) Let (5”~ and 6u denote
7’* X is said to have regular singularities along simple holonomic B-Modules with the same
V if for any point p of R, there exist a neigh- characteristic variety A. Then 6% is isomorphic
borhood LJ of p and an Q-sub-Module ,Mo to &v if and only if ord,(u) - ord,(v) is an
of .B defined on U which is coherent over integer. (b) Let Gu be a simple holonomic
b(O), and which generates ~4’ as an &x-Module. system, and let x denote the order of U. Then
A holonomic system is said to have R.S., through a suitable quantized contact trans-
which is an abbreviation for regular singular- formation, 6% is isomorphic to gcmw, where w
ities, if it has R.S. along its support. Then for satisfies
an arbitrary holonomic I-Module .1 we can
find a holonomic I-Module .Lrreg with R.S. (z,~+(a+;))w=o,
such that 6” @A Jz’~.~ z 6” @A d holds. See
[28] for the proof of this striking result and i?w
related topics on holonomic systems with reg- z=O (j=&...,n).
ular singularities. I
An elementary class of holonomic systems is Thus the microlocal structure of a simple
that of simple holonomic systems. A holonomic holonomic system &? = bu is fairly simple at
&-Module .I is called simple if there exists a nonsingular points of its characteristic variety.
left Ideal .a such that .V =&/-a and that the Moreover a Hartogs-type theorem for micro-
symbol Ideal {a(P) ( P E 3) coincides with the differential equations [28, ch. I, 421 entails
defining Ideal of Supp.4’. Let IA denote the that if Supp.J has the form A1 U A2 with
generator 1 mod.9 of a simple holonomic Lagrangian manifolds A, and AZ such that

$\operatorname{codim}_{\Lambda_1}(\Lambda_1 \cap \Lambda_2) \ge 2$, then $\mathscr{M}$ has the form $\mathscr{M}_1 \oplus \mathscr{M}_2$ with $\operatorname{Supp} \mathscr{M}_j = \Lambda_j$ $(j = 1, 2)$. Hence the case where $\operatorname{codim}_{\Lambda_1}(\Lambda_1 \cap \Lambda_2) = 1$ is important. Suppose $\Lambda_1$ and $\Lambda_2$ are nonsingular in this case and $T_x\Lambda_1 \ne T_x\Lambda_2$ at a point $x \in \Lambda_1 \cap \Lambda_2$. Then, through a quantized contact transformation $(\psi, \Phi)$, the system $\mathscr{M}$ is isomorphic to $\mathscr{E}_{\mathbf{C}^n} w$, with $w$ satisfying the following equations defined near $(0; dz_1\,\infty)$:

$$\cdots, \qquad \frac{\partial w}{\partial z_j} = 0 \quad (j = 3, \dots, n),$$

where $\alpha_j = \operatorname{ord}_{\Lambda_j}(u)$ $(j = 1, 2)$ with $\psi(\Lambda_1) = \{z_1 = 0,\ \zeta_2 = \cdots = \zeta_n = 0\}$ and $\psi(\Lambda_2) = \{z_1 = z_2 = 0,\ \zeta_3 = \cdots = \zeta_n = 0\}$. For more general cases and an application → [30].

The theory of holonomic systems and its applications are being studied most intensively, raising the hope of establishing a unified theory of special functions of several variables; → [23] and references cited therein.

I. History

Microlocal analysis means local analysis on the cotangent bundle. It emphasizes the importance of localization in cotangent bundles in analysis, which was pointed out by S. Mizohata [5, 6] immediately after the advent of singular integral operators in the works of A. P. Calderón and A. Zygmund [3, 4]. Since then, localization in the cotangent bundle has been used frequently in the theory of linear partial differential equations. R. T. Seeley [31] proved that the symbol of a singular integral operator is well defined on the cotangent bundle. Works by J. J. Kohn and L. Nirenberg [32] and L. Hörmander (Comm. Pure Appl. Math., 18 (1965)) strengthened the trend of localizing the problem on the cotangent bundle. Although it seems that the term "microlocal analysis" first appeared in the literature in 1973 (T. Kawai, Astérisque, 2 and 3 (1973)), the basic part of the theory had been constructed during the period from 1969 to 1972 by M. Sato (Proc. Intern. Conf. Functional Anal. and Related Topics, 1969), Yu. V. Egorov [12], Hörmander [8, 20], J. J. Duistermaat and Hörmander (Acta Math., 128 (1972)), M. Kashiwara and Kawai (Proc. Japan Acad., 46 (1970)), and Sato, Kawai, and Kashiwara [22]. Apparently the work of V. P. Maslov [11] had an important influence on the work of Egorov. The most important part of Sato's contribution was the fact that, through the construction of the sheaf of microfunctions, he found that the singularities of hyperfunctions can be canonically dispersed over the cotangent bundle (→ Section E) and that a hyperfunction solution $u$ of a linear differential equation $Pu = 0$ is concentrated on the characteristic variety when its singularities are thus dispersed (→ Section G). The last-stated fact was also formulated by Hörmander [8, 20] in the framework of distribution theory. The most important part of the contribution of Egorov [12] was the discovery that one can use an integral transformation introduced by G. I. Eskin [7] to find a transformation of pseudodifferential operators compatible with a homogeneous canonical transformation, i.e., a contact transformation, so that the commutation relations and the orders of the operators can be preserved. Hörmander (Acta Math., 121 (1969)) independently introduced integral transformations of the same type, calling them Fourier integral operators. Egorov [13] and L. Nirenberg and F. Treves [33] successfully used the transformation of operators to study the regularity and existence of solutions. Subsequently, Hörmander [8] elaborated the theory of Fourier integral operators. Kashiwara and Kawai (Proc. Japan Acad., 46 (1970)) observed that a pseudodifferential operator in their sense (now called a microdifferential operator; → Section F) gives rise to a sheaf homomorphism on the sheaf of microfunctions and that the structure of the microfunction solutions of pseudodifferential equations is determined by the principal symbol of the operator in question if it has simple characteristics. Then Sato, Kawai, and Kashiwara [22] succeeded in amalgamating these two theories, namely, the theory of microfunctions and the theory of the transformation of operators. (They called the transformation a quantized contact transformation in [22].) Such an amalgamation was also done independently by Hörmander [8, 20], who introduced the notion of the wave front set for distributions as a substitute for the support of microfunctions. Incidentally, it is noteworthy that C. Chandler and H. P. Stapp (J. Math. Phys., 10 (1969)) and D. Iagolnitzer and Stapp (Comm. Math. Phys., 14 (1969)) obtained a notion similar to the singularity spectrum in a physical context. Their results were later elaborated (around 1971-1973) by J. Bros and Iagolnitzer [19]. With the aid of the above-mentioned amalgamation of the theories, the works of Sato, Kawai, and Kashiwara [22], Hörmander [20], Duistermaat and Hörmander (Acta Math., 121 (1969)), Kawai (Publ. Res. Inst. Math. Sci., 7 (1971-1972)), and K. G. Andersson (Trans. Amer. Math. Soc., 177 (1973)) have clarified the

importance of the bicharacteristic strip, which is a submanifold of a cotangent bundle, not a base manifold, as a carrier of singularities of solutions of pseudodifferential equations (→ Section G); and the works of Sato, Kawai, and Kashiwara [22], Sato (Actes Congr. Internat. Math. Nice, 1971), and Kawai (Publ. Res. Inst. Math. Sci., 7 (1971-1972)) revealed the hidden mechanism of the celebrated counterexample of H. Lewy [25]. Among these, the contribution of Sato, Kawai, and Kashiwara [22] was most decisive and fundamental in that it first clarified the structure of a general system of pseudodifferential equations at generic points of the characteristic variety and then derived from it the above-quoted results on the structure of the solutions (→ Section G). Microlocal analysis has now become one of the most important and basic concepts in the theory of linear partial differential equations and theoretical physics. For recent developments → Hörmander [34] and V. Guillemin, Kashiwara, and Kawai [23] and references cited therein.

References

[1] J. J. Duistermaat, Fourier integral operators, Lecture notes, Courant Inst. Math. Sci., 1973.
[2] V. Guillemin and S. Sternberg, Geometric asymptotics, Amer. Math. Soc. Math. Surveys 14 (1977).
[3] A. P. Calderón and A. Zygmund, Singular integral operators and differential equations, Amer. J. Math., 79 (1957), 901-921.
[4] A. P. Calderón, Uniqueness in the Cauchy problem for partial differential equations, Amer. J. Math., 80 (1958), 16-36.
[5] S. Mizohata, Systèmes hyperboliques, J. Math. Soc. Japan, 11 (1959), 205-233.
[6] S. Mizohata, Note sur le traitement par les opérateurs d'intégrale singulière de Cauchy, J. Math. Soc. Japan, 11 (1959), 234-240.
[7] G. I. Eskin, The Cauchy problem for hyperbolic systems in convolutions, Math. USSR-Sb., 3 (1967), 243-277.
[8] L. Hörmander, Fourier integral operators I, Acta Math., 127 (1971), 79-183.
[9] J. Chazarain (ed.), Fourier integral operators and partial differential equations, Lecture notes in math. 459, Springer, 1975.
[10] J. B. Keller, Corrected Bohr-Sommerfeld conditions for nonseparable systems, Ann. Phys., 4 (1958), 180-188.
[11] V. P. Maslov, Théorie des perturbations et méthodes asymptotiques, Gauthier-Villars, 1972. (Original in Russian, 1965.)
[12] Yu. V. Egorov, On canonical transformation of pseudodifferential operators (in Russian), Uspekhi Mat. Nauk, 25 (1969), 235-236.
[13] Yu. V. Egorov, Conditions for solvability of pseudodifferential equations, Soviet Math. Dokl., 10 (1969), 1020-1022.
[14] D. Fujiwara, On the boundedness of integral operators with highly oscillatory kernels, Proc. Japan Acad., 51 (1975), 96-99.
[15] H. Kumano-go, A calculus of Fourier integral operators on $\mathbf{R}^n$ and the fundamental solution for an operator of hyperbolic type, Comm. Partial Diff. Eq., 1 (1976), 1-44.
[16] K. Asada and D. Fujiwara, On some oscillatory integral transformations in $L^2(\mathbf{R}^n)$, Japan. J. Math., 4 (1978), 299-361.
[17] A. Sommerfeld, Optics, Lectures on Theoretical Physics, vol. 4, Academic Press, 1964.
[18] P. D. Lax, Asymptotic solution of oscillatory initial value problems, Duke Math. J., 24 (1957), 627-646.
[19] D. Iagolnitzer, Analytic structure of distributions and essential support theory, Structural Analysis and Collision Amplitude, North-Holland, 1976, 295-358.
[20] L. Hörmander, Uniqueness theorem and wave front sets for solution of linear differential equation with analytic coefficients, Comm. Pure Appl. Math., 24 (1971), 671-704.
[21] J. M. Bony, Équivalence des diverses notions de spectre singulier analytique, Séminaire Goulaouic-Schwartz 1976-77, École Polytechnique.
[22] M. Sato, T. Kawai, and M. Kashiwara, Microfunctions and pseudodifferential equations, Lecture notes in math. 287, Springer, 1973, 265-529.
[23] V. Guillemin, M. Kashiwara, and T. Kawai, Seminar on microlocal analysis, Ann. math. studies 93, Princeton Univ. Press, 1979.
[24] J. E. Björk, Rings of differential operators, North-Holland, 1979.
[25] H. Lewy, An example of a smooth linear partial differential equation without solution, Ann. Math., 66 (1957), 155-158.
[26] S. Mizohata, Solutions nulles et solutions nonanalytiques, J. Math. Kyoto Univ., 1 (1962), 271-302.
[27] M. Kashiwara, On the maximally overdetermined system of linear differential equations I, Publ. Res. Inst. Math. Sci., 10 (1974-1975), 563-579.
[28] M. Kashiwara and T. Kawai, On holonomic systems of microdifferential equations III, Systems with regular singularities, Publ. Res. Inst. Math. Sci., 17 (1981), 813-979.
[29] M. Kashiwara and T. Kawai, Finiteness theorem for holonomic systems of microdifferential equations, Proc. Japan Acad., 52 (1976), 341-343.
[30] M. Sato, M. Kashiwara, T. Kimura, and

T. Oshima, Microlocal analysis of prehomogeneous vector spaces, Inventiones Math., 62 (1980), 117-179.
[31] R. T. Seeley, Singular integrals on compact manifolds, Amer. J. Math., 81 (1959), 658-690.
[32] J. J. Kohn and L. Nirenberg, An algebra of pseudodifferential operators, Comm. Pure Appl. Math., 18 (1965), 269-305.
[33] L. Nirenberg and F. Treves, On local solvability of linear partial differential equations II, Comm. Pure Appl. Math., 23 (1970), 459-510.
[34] L. Hörmander, Seminar on singularities of solutions of linear partial differential equations, Ann. math. studies 91, Princeton Univ. Press, 1979.

275 (VII.14)
Minimal Submanifolds

A. General Remarks

The history of the theory of minimal submanifolds goes back to J. L. Lagrange, who studied minimal surfaces in the 3-dimensional Euclidean space $\mathbf{R}^3$. In 1762 he developed his algorithm for the †calculus of variations, which can be applied to higher-dimensional problems and is now known as the †Euler-Lagrange equation. For instance, let $D$ be a domain in the plane $\mathbf{R}^2$ and $z = f(x, y)$, $(x, y) \in D$, the equation of a surface in $\mathbf{R}^3$. As a necessary condition for the surface to have the least area among the surfaces with fixed boundary, Lagrange obtained a †quasilinear elliptic partial differential equation of the second order, called the minimal surface equation:

$$(1 + z_y^2)\,z_{xx} - 2 z_x z_y z_{xy} + (1 + z_x^2)\,z_{yy} = 0.$$

Before this, in 1744, L. Euler had found that a †catenoid is a minimal surface. In 1766, J. B. M. C. Meusnier showed that a right †helicoid is a minimal surface. Besides catenoids and helicoids, in 1834, H. F. Scherk found that the surface defined by $z = \log(\cos y) - \log(\cos x)$ is a minimal surface, which is called Scherk's surface.
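The minimal surface equation can be checked numerically on explicit graphs. The following Python sketch is an illustration added here, not part of the original article. It evaluates the left-hand side $(1+z_y^2)z_{xx} - 2z_xz_yz_{xy} + (1+z_x^2)z_{yy}$ by central finite differences for Scherk's surface and for the right helicoid written as the graph $z = \arctan(y/x)$ on $x > 0$ (a standard representation assumed for this example), and compares them with a paraboloid, which is not minimal. The function names, the step size `h`, and the sample points are assumptions chosen for the illustration.

```python
import math


def minimal_surface_residual(f, x, y, h=1e-3):
    """Approximate (1 + z_y^2) z_xx - 2 z_x z_y z_xy + (1 + z_x^2) z_yy at (x, y).

    All partial derivatives of z = f(x, y) are replaced by central
    finite differences with step h (an illustrative choice).
    """
    zx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    zy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    zxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    zyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    zxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return (1 + zy**2) * zxx - 2 * zx * zy * zxy + (1 + zx**2) * zyy


def scherk(x, y):
    # Scherk's surface; defined where cos x > 0 and cos y > 0.
    return math.log(math.cos(y)) - math.log(math.cos(x))


def helicoid_graph(x, y):
    # Right helicoid as a graph over the half-plane x > 0 (assumed form).
    return math.atan2(y, x)


def paraboloid(x, y):
    # Not a minimal surface; used only as a contrast.
    return x * x + y * y


# Sample points inside the common domain 0 < x < pi/2, |y| < pi/2.
sample_points = [(0.8, -0.4), (0.7, 0.5), (1.0, 0.9)]

for px, py in sample_points:
    print(
        (px, py),
        minimal_surface_residual(scherk, px, py),
        minimal_surface_residual(helicoid_graph, px, py),
        minimal_surface_residual(paraboloid, px, py),
    )
```

For the first two surfaces the printed residuals are small and shrink as `h` decreases, up to rounding effects, while for the paraboloid the exact value is $4 + 8(x^2 + y^2)$, which is never zero. A finite-difference check of this kind is only a sanity test at sample points and does not replace the direct computation with $z_x = \tan x$, $z_y = -\tan y$ for Scherk's surface.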
An immersion $f$ of an $m$-dimensional manifold $M$ with boundary $\partial M$ (possibly empty) into a Riemannian manifold $N$ is called minimal if the †mean curvature vector field $H$ of $M$ with respect to the induced Riemannian metric vanishes identically. Then $M$ is called a minimal submanifold of $N$. This definition comes from the following variational problem: By a smooth †variation of $f$ is meant a smooth mapping $F: I \times M \to N$, where $I = (-1, 1)$, such that each $f_t = F(t, \cdot): M \to N$ is an immersion, $f_0 = f$, and $f_t|\partial M = f|\partial M$ for all $t \in I$. Let $dV_t$ be the volume element of the metric induced by $f_t$, and set $V(t) = \int_M dV_t$, the volume of $M$ at time $t$. Then the first variation of the volume, $dV/dt|_{t=0}$, is expressed, up to sign and a constant factor, as the integral $\int_M \langle H, F_*(\partial/\partial t) \rangle\, dV_0$, where $\partial/\partial t$ denotes the canonical vector field along the $I$ factor in $I \times M$. Thus the mean curvature vector field $H$ of $f$ vanishes identically if and only if $dV/dt|_{t=0} = 0$ for all variations of $f$. Therefore a minimal submanifold gives an extremal of the volume integral, though neither necessarily minimal nor of the least volume.

In the case $N = \mathbf{R}^n$, an immersion $x: M \to \mathbf{R}^n$ is viewed as a vector-valued function, and the mean curvature vector field $H$ is expressed as $H = \Delta x/m$, where $\Delta$ denotes the †Laplace-Beltrami operator $-g^{ij}\nabla_i\nabla_j$. Thus $x$ is minimal if and only if each component function of $x$ is harmonic. In particular, there is no compact minimal submanifold without boundary in $\mathbf{R}^n$.

In the latter half of the 19th century, †Plateau's problem (→ Section C; 334 Plateau's Problem) was studied extensively by O. Bonnet, B. Riemann, K. Weierstrass, A. Enneper, G. Darboux, and others. The problem is stated as follows: Given a †Jordan curve $\Gamma$ in $\mathbf{R}^3$ (or in $\mathbf{R}^n$), find a surface of least area having $\Gamma$ as its boundary. On the other hand, in 1866, Weierstrass gave a general formula, called the Weierstrass-Enneper formula (→ Section B (5)), to express a simply connected minimal surface in terms of a complex analytic function and a meromorphic function with certain properties. The formula allows one to construct a great variety of minimal surfaces by choosing those functions.

The existence of a minimal surface of disk type having a prescribed boundary curve was first obtained in 1930 by J. Douglas and T. Radó independently as a solution to Plateau's problem, admitting singularities. The result was improved by R. Courant for the case of finitely many boundary curves by the method of †Dirichlet integrals. The method was carried out further by C. B. Morrey for the generalized Plateau's problem in a Riemannian manifold (→ Section C (5)). Indeed, the Euclidean space was replaced by any complete Riemannian manifold which is metrically well behaved at infinity. For example, any compact or any †homogeneous Riemannian manifold is in this class.

The existence proof of minimal surfaces cannot in general be applied directly to the case of higher-dimensional minimal submanifolds. The notion of varifolds (→ Section G (2))

and the corresponding generalization of a minimal submanifold, called a minimal variety (→ Section G (2)), were then introduced by F. Almgren, who proved the existence of a minimal variety in a compact Riemannian manifold. It was also proved that a minimal variety is approximated by regular minimal submanifolds. This work has been developed as geometric measure theory by E. R. Reifenberg, W. H. Fleming, H. Federer, F. Almgren, and others (→ Section G).

The study of minimal surfaces given as the graph of a real-valued function of two variables leads to that of solutions of the minimal surface equation. In 1915, S. Bernshtein proved that a minimal surface $z = f(x, y)$ defined on the entire plane $\mathbf{R}^2$ must be a plane. Subsequently, this was generalized to the following problem: Is a minimal hypersurface $x_{n+1} = f(x_1, \dots, x_n)$ in $\mathbf{R}^{n+1}$ defined on the entire space $\mathbf{R}^n$ a hyperplane? The answer turns out to be affirmative for $n \le 7$ and negative for $n \ge 8$ (→ Section F (1)).

In general, when a bounded domain $D$ in $\mathbf{R}^n$ and a continuous function $\varphi$ on its boundary $\partial D$ are given, the problem of finding a minimal hypersurface $M$ defined by the graph of a real-valued function $f$ on $D$ with $f|\partial D = \varphi$ gives rise to a typical †Dirichlet problem. The basic questions are those of existence, uniqueness, and regularity of solutions. These were first studied by Radó for $n = 2$ and later by L. Bers, R. Finn, H. Jenkins, J. Serrin, R. Osserman, and others (→ Section D (1)).

Minimal submanifolds of a sphere have interesting properties; some are analogous to those in Euclidean space but some are not. Among them, J. Simons, in 1968, gave a differential equation that the norm of the second fundamental form of a minimal submanifold in a sphere should satisfy. He then showed that a totally geodesic submanifold is isolated among the compact minimal submanifolds in the sphere. S. S. Chern, M. P. do Carmo, S. Kobayashi, N. Ejiri, T. Ito, T. Otsuki, N. Wallach, and others have made further contributions to this subject (→ Section F (2)).

B. Minimal Surfaces

(1) Branched minimal surfaces. A minimal surface $M$ in $\mathbf{R}^n$ is an immersed surface with vanishing mean curvature vector field. $M$ equipped with an atlas of †isothermal coordinates is viewed as a †Riemann surface. A branch point of a †harmonic mapping $f: M \to \mathbf{R}^n$ is a point at which the differential $df$ is zero. A harmonic mapping $f: M \to \mathbf{R}^n$ of a Riemann surface $M$ is called a branched (or generalized) minimal immersion if it is conformal except at the branch points, and the image $f(M)$ is then called a branched minimal surface. The solution to Plateau's problem given by Douglas and Radó is a branched minimal surface.

(2) Maximum principle. When $n = 3$, the following maximum principle for minimal surfaces holds: If $M_1$ and $M_2$ are two connected branched minimal surfaces in $\mathbf{R}^3$ such that for a point $p \in M_1 \cap M_2$, the surface $M_1$ locally lies on one side of $M_2$ near $p$, then $M_1$ and $M_2$ coincide near $p$.

(3) Convex hull property. In general, every branched minimal surface with boundary in $\mathbf{R}^n$ lies in the convex hull of its boundary curve, the smallest closed convex set containing the boundary.

(4) Reflection principle. If the boundary curve of a branched minimal surface contains a straight line $\gamma$, then the surface can be analytically continued as a branched minimal surface by reflection across $\gamma$. Based on this principle, the following holds: Let $\Gamma$ be an analytic Jordan curve in $\mathbf{R}^n$ and $f: M \to \mathbf{R}^n$ a branched minimal immersion with boundary $\Gamma$. Then $f$ is analytic up to the boundary, i.e., $f(M)$ is contained in the interior of a larger branched minimal surface (H. Lewy). The smooth version of this theorem was given by S. Hildebrandt: If $f: M \to \mathbf{R}^n$ is a branched minimal immersion with smooth boundary curve, then $f$ is smooth up to the boundary.

(5) Weierstrass-Enneper formula. Every simply connected minimal surface in $\mathbf{R}^3$ is represented in the form

$$x_k(\zeta) = \operatorname{Re}\left\{\int_0^{\zeta} \phi_k(z)\,dz\right\} + c_k \quad (k = 1, 2, 3),$$

where $\phi_1 = f(1 - g^2)/2$, $\phi_2 = \sqrt{-1}\,f(1 + g^2)/2$, $\phi_3 = fg$, and $c_k$ is a constant. Here, $g(\zeta)$ is a meromorphic function on a domain $D$ in the complex $\zeta$-plane, and $f(\zeta)$ is an analytic function on $D$ with the property that at each point $\zeta$ where $g(\zeta)$ has a pole of order $m$, $f(\zeta)$ has a zero of order $2m$, $D$ being either the unit disk or the entire plane. This formula is quite useful for constructing various minimal surfaces. For instance, on setting $f = 1$ and $g(\zeta) = \zeta$, Enneper's surface is obtained:

$$(x_1, x_2, x_3) = \left(u - \frac{u^3}{3} + uv^2,\ v - \frac{v^3}{3} + vu^2,\ u^2 - v^2\right), \quad (u, v) \in \mathbf{R}^2.$$

Another feature of the formula is that general theorems about minimal surfaces can be obtained by translating statements about analytic functions into the corresponding minimal-surface ones. For example, a surface $f: M \to \mathbf{R}^3$ is minimal if and only if the †Gauss mapping
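The parametrization of Enneper's surface quoted under the Weierstrass-Enneper formula can be checked numerically. The following is a minimal sketch, not part of the original article: it assumes the standard criterion that a parametrized surface $X(u, v)$ is minimal when $(u, v)$ are isothermal coordinates ($\langle X_u, X_v\rangle = 0$, $|X_u| = |X_v|$) and each component of $X$ is harmonic. The function names `enneper`, `partials`, and `minimality_defects` are hypothetical, and derivatives are approximated by central differences.

```python
def enneper(u, v):
    """Enneper's surface in the (u, v)-parametrization quoted in the text."""
    return (u - u**3 / 3 + u * v**2,
            v - v**3 / 3 + v * u**2,
            u**2 - v**2)


def partials(X, u, v, h=1e-3):
    """Central-difference approximations of X_u, X_v, X_uu, X_vv at (u, v)."""
    p = X(u, v)
    pu, mu = X(u + h, v), X(u - h, v)
    pv, mv = X(u, v + h), X(u, v - h)
    Xu = [(a - b) / (2 * h) for a, b in zip(pu, mu)]
    Xv = [(a - b) / (2 * h) for a, b in zip(pv, mv)]
    Xuu = [(a - 2 * c + b) / h**2 for a, c, b in zip(pu, p, mu)]
    Xvv = [(a - 2 * c + b) / h**2 for a, c, b in zip(pv, p, mv)]
    return Xu, Xv, Xuu, Xvv


def minimality_defects(X, u, v):
    """Return the three quantities that vanish for an isothermal, harmonic X."""
    Xu, Xv, Xuu, Xvv = partials(X, u, v)
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    orthogonality = abs(dot(Xu, Xv))                   # <X_u, X_v>
    equal_length = abs(dot(Xu, Xu) - dot(Xv, Xv))      # |X_u|^2 - |X_v|^2
    laplacian = max(abs(a + b) for a, b in zip(Xuu, Xvv))  # X_uu + X_vv
    return orthogonality, equal_length, laplacian


# Largest defect over a few sample points; it should be at the level of the
# finite-difference error only.
worst = max(max(minimality_defects(enneper, u, v))
            for u in (-1.0, 0.3, 0.8)
            for v in (-0.5, 0.2, 1.1))
print(worst)
```

The check is only a numerical illustration at a handful of sample points; it does not replace the derivation from the representation formula.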

$G: M \to S^2$ is antiholomorphic, i.e., the complex conjugate of the mapping is holomorphic, when $S^2$ is viewed as the Riemann sphere with complex structure by the stereographic projection (from the south pole) onto the complex plane. As a consequence, a complete minimal surface in $\mathbf{R}^3$ is a plane or else the normals to the surface are everywhere dense in $S^2$ (R. Osserman).
(6) Stability. A minimal submanifold $M$ is called stable if for every compact region on $M$ all the second variations of the volume are positive. For instance, if $M \to \mathbf{R}^n$ is a minimal hypersurface and $f\nu$ is a variation vector field, $\nu$ being the unit normal vector field of $M$ and $f$ a smooth function with compact support on $M$, then the second variation formula for the volume $V(t)$ is given by
$$V''(0) = \int_M \left(|\nabla f|^2 - |A|^2 f^2\right) dV,$$
where $|A|$ denotes the norm of the second fundamental form of $M$. In particular, when $n = 3$, then $-|A|^2/2 = K$, the Gauss curvature of $M$. Therefore a minimal surface $M$ in $\mathbf{R}^3$ is stable if and only if
$$\int_M \left(|\nabla f|^2 + 2Kf^2\right) dV > 0$$
for any smooth function $f$ with compact support on $M$. It follows that the stability of a minimal surface $M$ in $\mathbf{R}^3$ is related to the boundary value problem of a linear elliptic operator $L = \Delta - 2K$, $\Delta$ being the Laplacian on $M$. Namely, a minimal surface $M$ in $\mathbf{R}^3$ is stable if and only if the first eigenvalue $\lambda_1(D)$ of $L$ on any bounded domain $D$ in $M$ is nonnegative. In connection with the †Gauss mapping, a sufficient condition for a domain $D$ in a minimal surface to be stable was obtained by H. Schwarz in 1885: If a minimal surface $M$ in $\mathbf{R}^3$ has one-to-one Gauss mapping $G: M \to S^2$, then a domain $D \subset M$ such that $G(D)$ is contained in a hemisphere of $S^2$ is stable. This was generalized by J. L. Barbosa and M. do Carmo as follows: If the area $A(G(D))$ of the spherical image $G(D)$ is less than $2\pi$, then $D$ is stable. This result is sharp in the sense that for any $\varepsilon > 0$ there exists an unstable domain $D$ with $A(G(D)) = 2\pi + \varepsilon$. As an analog to Bernshtein's theorem, the following holds: A complete and stable minimal surface in $\mathbf{R}^3$ is a plane. H. Mori has also obtained results in this regard.
As for the existence of unstable minimal surfaces, it is known that if there are two distinct stable minimal surfaces with the same smooth boundary curve $\Gamma$ in $\mathbf{R}^3$, then there exists an unstable branched minimal surface with $\Gamma$ as its boundary (M. Morse and C. Tompkins, M. Shiffman).

C. Plateau's Problem (→ 334 Plateau's Problem)

Let $\Gamma$ be a †rectifiable †Jordan curve in $\mathbf{R}^n$ and $D = \{(x, y) \in \mathbf{R}^2 \mid x^2 + y^2 < 1\}$. Then there exists a continuous mapping $f: \bar{D} \to \mathbf{R}^n$ such that (a) $f|_{\partial D}$ maps homeomorphically onto $\Gamma$; (b) $f|_D$ is harmonic and almost conformal, i.e., $\langle f_x, f_y \rangle = 0$ and $|f_x| = |f_y|$ in $D$ with $|df| > 0$ except at isolated branch points; (c) the induced area of $f$ is the least among the class of piecewise smooth surfaces bounding $\Gamma$ with (a). This mapping $f$ is called the classical solution or the Douglas-Radó solution to Plateau's problem for $\Gamma$, which may have singularities called †branch points. The resulting surface $S$ is a branched minimal disk. This theorem establishes the existence of a surface of least area among all surfaces homeomorphic to a disk. It has been generalized by R. Courant for $\Gamma$ consisting of finitely many rectifiable Jordan curves.
For a branched minimal disk $S$ bounded by smooth $\Gamma$ in $\mathbf{R}^3$, there is a relation, called the Gauss-Bonnet-Sasaki-Nitsche formula, among the †total curvature $\kappa(\Gamma)$ of $\Gamma$, the total curvature of $S$, and the orders of branch points:
$$1 + \sum_{\alpha} (m_\alpha - 1) + \sum_{\beta} M_\beta \le \frac{1}{2\pi}\kappa(\Gamma) + \frac{1}{2\pi}\int_S K\, dA,$$
where $K$ denotes the Gauss curvature of $S$, $m_\alpha - 1$ the orders of the interior branch points, and $2M_\beta$ the orders of the boundary branch points, which must be even.
(1) Regularity. A minimal disk of least area in $\mathbf{R}^3$ has no boundary branch points when $\Gamma$ is real analytic (R. Gulliver and F. Lesley) or when $\Gamma$ is smooth and the total curvature $\kappa(\Gamma)$ is less than $4\pi$ (J. C. C. Nitsche). In general, a classical solution for smooth $\Gamma$ in $\mathbf{R}^3$ cannot have infinitely many branch points. A remarkable fact is that every classical solution to Plateau's problem in $\mathbf{R}^3$ is free of branch points in its interior, i.e., is a regular immersion (R. Osserman).
(2) Embeddedness. A classical solution is not necessarily an embedding; it may have self-intersections. For instance, if $\Gamma$ is knotted in $\mathbf{R}^3$, then every solution must have self-intersections. It is known, however, that immersed minimal disks of least area in $\mathbf{R}^3$ which can self-intersect only in their interiors are embeddings. In particular, if $\Gamma$ is an extremal Jordan curve, i.e., if $\Gamma$ lies on the boundary of its †convex hull, then the classical solution for $\Gamma$ is an embedding (F. Tomi and A. J. Tromba; W. H. Meeks and S. T. Yau). If the topological type is not specified, then there always exists an embedded minimal surface. That is, if $\Gamma$ is the union of any finite collection of disjoint

smooth Jordan curves in $\mathbf{R}^3$, then there exists a compact embedded minimal surface with boundary $\Gamma$ which is smooth up to the boundary (R. Hardt and L. Simon). However, there exists an unknotted Jordan curve that never bounds an embedded minimal disk.
(3) Uniqueness. In general, the classical solutions to Plateau's problem do not have the uniqueness property. Radó was the first to give a condition on a boundary curve that guarantees the uniqueness of the minimal disk. Namely, if a Jordan curve $\Gamma$ in $\mathbf{R}^n$ admits a one-to-one orthogonal projection onto a convex curve in a plane $\mathbf{R}^2$ in $\mathbf{R}^n$, then the classical solution to Plateau's problem for $\Gamma$ is free of branch points and can be expressed as the graph over this plane. When $n = 3$, the solution is unique. Another geometric condition on $\Gamma$ has been given by Nitsche: An analytic Jordan curve $\Gamma$ in $\mathbf{R}^3$ with total curvature $\kappa(\Gamma) \le 4\pi$ (or a smooth $\Gamma$ with $\kappa(\Gamma) < 4\pi$) bounds a unique immersed minimal disk. Moreover, the generic uniqueness holds: In the space $\mathcal{A}$ of all smooth Jordan curves in $\mathbf{R}^n$ with suitable topology, there exists an open and dense subset $\mathcal{B}$ such that for any $\Gamma$ in $\mathcal{B}$ there exists a unique area-minimizing minimal disk (F. Morgan and A. J. Tromba).
(4) Finiteness. As for the finiteness of the classical solutions to Plateau's problem, several conditions on boundary curves are known. An analytic Jordan curve in $\mathbf{R}^3$ bounds only finitely many minimal disks of least area (F. Tomi). An analytic †extremal Jordan curve in $\mathbf{R}^3$ bounds only finitely many minimal disks with relative minima of area (H. Beeson). A smooth Jordan curve $\Gamma$ with total curvature $\kappa(\Gamma) < 6\pi$ bounds only finitely many minimal disks (Nitsche). Moreover, generically, i.e., for an open and dense subset of boundary curves, there are at most finitely many minimal surfaces with given boundary, relative minima or not (R. Böhme and Tromba).
(5) Generalized Plateau's problem. C. B. Morrey's setting of the generalized Plateau's problem is as follows: A homotopically trivial rectifiable Jordan curve $\Gamma$ is given in an $n$-dimensional Riemannian manifold $N$. Let $D$ denote a disk in $\mathbf{R}^2$. Find a mapping $f: \bar{D} \to N$ such that (a) $f|_{\partial D}$ maps homeomorphically onto $\Gamma$, (b) the induced area of $f$ is the least among the class of piecewise smooth surfaces in $N$ bounded by $\Gamma$ with (a). Obviously when $N = \mathbf{R}^n$, this is the classical Plateau's problem. A solution was given by Morrey under the assumption that $N$ is homogeneously regular; i.e., that there exist $0 < k < K$ such that for any point $y \in N$ there is a local coordinate system $(V, \varphi)$ around $y$ for which $\varphi(V) = \{x \in \mathbf{R}^n \mid \|x\| < 1\}$ and the Riemannian metric $\sum g_{\alpha\beta}\, dx^\alpha dx^\beta$ satisfies
$$k \sum_{\alpha} (u_\alpha)^2 \le \sum_{\alpha, \beta} g_{\alpha\beta}(x)\, u_\alpha u_\beta \le K \sum_{\alpha} (u_\alpha)^2$$
for any $x$ and $(u_1, \ldots, u_n) \in \mathbf{R}^n$. Any compact or any homogeneous Riemannian manifold is homogeneously regular. Morrey's solution is as follows: If $N$ is a homogeneously regular Riemannian manifold and if $\Gamma$ is a homotopically trivial rectifiable Jordan curve in $N$, then there exists a branched minimal immersion $f: \bar{D} \to N$ with least area bounded by $\Gamma$ with (a).
The regularity of Morrey's solution is similar to the classical one. If $\Gamma$ is smooth, then so is the solution $f$ up to the boundary. If, furthermore, $\dim N = 3$, then the solution $f$ is an immersion in its interior. If $N$ and $\Gamma$ are real analytic and $\dim N = 3$, then the solution $f$ is an immersion up to the boundary.

D. Existence Problems of Minimal Submanifolds

(1) The Dirichlet problem for the minimal surface equation. When the graph of a (vector-valued) function is a minimal submanifold in some Euclidean space, the function must satisfy the so-called †minimal surface equation. To be precise, let $D$ be a (bounded) domain in $\mathbf{R}^n$, $f: D \to \mathbf{R}^k$ a (vector-valued) function, and put $F: D \to \mathbf{R}^{n+k}$: $F(x) = (x, f(x))$ for $x \in D$. Let $M$ be the graph of $f$: $M = F(D)$. Then $M$ is minimal if and only if $f$ (or $F$) satisfies one of the following equations:
if $n = 2$ and $k = 1$,
$$(1 + f_y^2) f_{xx} - 2 f_x f_y f_{xy} + (1 + f_x^2) f_{yy} = 0; \quad (1)$$
if $n$ is arbitrary and $k = 1$,
$$\sum_{i=1}^{n} \frac{\partial}{\partial x_i}\left(\frac{1}{W}\frac{\partial f}{\partial x_i}\right) = 0, \quad \text{where } W = \sqrt{1 + |\nabla f|^2}; \quad (2)$$
if $n = 2$ and $k$ is arbitrary,
$$(1 + |f_y|^2) f_{xx} - 2 (f_x \cdot f_y) f_{xy} + (1 + |f_x|^2) f_{yy} = 0; \quad (3)$$
and if $n$ and $k$ are arbitrary,
$$\sum_{i,j=1}^{n} g^{ij} \frac{\partial^2 F}{\partial x_i \partial x_j} = 0, \quad \text{where } g_{ij} = \frac{\partial F}{\partial x_i} \cdot \frac{\partial F}{\partial x_j}, \quad (4)$$
and $(g^{ij})$ is the inverse matrix of $(g_{ij})$.
The basic problem for this class of equations is the †Dirichlet problem. For $n = 2$ and $k = 1$, the following theorem of Radó and Finn is fundamental: There exists a solution $f$ of the Dirichlet problem corresponding to an arbitrary continuous function $\varphi$ on the boundary $\partial D$ of $D$ if and only if $D$ is convex. Since the difference of any two solutions of eq. (1) satisfies the maximum principle, there can be at most one solution $f$ of the Dirichlet problem
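Equation (1) of Section D lends itself to a direct numerical check. The sketch below is an illustration under stated assumptions rather than part of the original article: it evaluates the left-hand side of eq. (1) by central differences and applies it to Scherk's surface, written here in the commonly used graph form $f(x, y) = \log(\cos y / \cos x)$ on $|x|, |y| < \pi/2$ (this explicit formula is not given in the text), and to a paraboloid, which is not minimal. The names `mse_residual`, `scherk`, and `paraboloid` are hypothetical.

```python
import math


def mse_residual(f, x, y, h=1e-3):
    """Left-hand side of eq. (1), (1+f_y^2) f_xx - 2 f_x f_y f_xy + (1+f_x^2) f_yy,
    with all derivatives replaced by central differences of step h."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return (1 + fy**2) * fxx - 2 * fx * fy * fxy + (1 + fx**2) * fyy


def scherk(x, y):
    """Scherk's surface as a graph (assumed standard form, |x|, |y| < pi/2)."""
    return math.log(math.cos(y) / math.cos(x))


def paraboloid(x, y):
    """A non-minimal comparison surface."""
    return x * x + y * y


points = [(0.1, 0.2), (-0.7, 0.4), (0.9, -1.0)]
scherk_max = max(abs(mse_residual(scherk, x, y)) for x, y in points)
paraboloid_value = mse_residual(paraboloid, 0.5, 0.5)
print(scherk_max, paraboloid_value)
```

For the paraboloid at $(0.5, 0.5)$ one has $f_x = f_y = 1$, $f_{xx} = f_{yy} = 2$, $f_{xy} = 0$, so the left-hand side of eq. (1) equals $8$, whereas for Scherk's surface the residual is zero up to discretization error.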

corresponding to a given continuous function $\varphi$ on $\partial D$. As for the removability of isolated singularities, the following is known: Every solution of eq. (1) in $0 < x^2 + y^2 < 1$ extends continuously to the origin. The extended function is smooth at the origin and satisfies eq. (1) in the full disk $x^2 + y^2 < 1$ (Finn).
For arbitrary $n$ and $k = 1$, namely, for the case of minimal hypersurfaces, the following is known: Let $D$ be a bounded domain in $\mathbf{R}^n$ with smooth boundary. The Dirichlet problem for eq. (2) has a solution for any continuous function $\varphi$ on $\partial D$ if and only if the mean curvature of $\partial D$ with respect to the inner normal is nonnegative at every point. As for the uniqueness and regularity, the same results as above hold (Jenkins and Serrin).
For $n = 2$ and arbitrary $k$, namely, for the case of minimal surfaces in $\mathbf{R}^{2+k}$, exactly the same result of existence as in the classical case $k = 1$ holds. A solution exists for an arbitrary continuous vector-valued function on $\partial D$ if and only if $D$ is convex (Osserman). Though the removability of singularities holds under suitable restrictions, the uniqueness fails in this case.
For the last case, $n > 2$ and $k > 1$, there are essentially no results on either existence or uniqueness for the Dirichlet problem.
(2) Existence of minimal surfaces and topology. For a compact Riemannian manifold $N$, there are results on the existence of minimal surfaces related to the homotopy groups of $N$. Let $f$ be a continuous mapping from a †Riemann surface $\Sigma_g$ of †genus $g$ into $N$. If the †induced mapping $f_\#: \pi_1(\Sigma_g) \to \pi_1(N)$ of $f$ on the †fundamental groups is injective, then there exists a branched minimal immersion $h: \Sigma_g \to N$ such that $h_\# = f_\#$ on $\pi_1(\Sigma_g)$ and the induced area of $h$ is the least among all mappings with the same action on $\pi_1(\Sigma_g)$. If furthermore $\pi_2(N) = 0$, then $h$ can be deformed from $f$ continuously (R. Schoen and S. T. Yau). If $\pi_2(N) \ne 0$, then there exists a generating set for $\pi_2(N)$ consisting of branched minimal immersions of spheres that minimize energy and area in their homotopy classes (J. Sacks and K. Uhlenbeck). If, furthermore, $\dim N = 3$, then the above minimal immersion $h$ in the homotopy class is an embedding or a two-to-one covering mapping. In the second case, its image is an embedded real projective plane. If $h'$ is another such least-area mapping, then either $h'$ is equal to $h$ up to a conformal reparametrization or the images of $h$ and $h'$ are disjoint (Meeks and Yau).

E. Gauss Mapping of Minimal Surfaces

On a connected, orientable surface $M$ there exist local isothermal coordinates $(x, y)$ at each point. On putting $z = x + \sqrt{-1}\, y$, the metric has the form $ds^2 = 2F|dz|^2$, and $M$ is viewed as a Riemann surface. A conformal immersion $f: M \to \mathbf{R}^n$ is minimal if and only if $\Delta f = 0$, or equivalently $\partial\bar{\partial} f = 0$. In the case $n = 3$, the †Gauss mapping $G: M \to S^2$ of a surface $f: M \to \mathbf{R}^3$ is defined by assigning to each point $p \in M$ the unit normal vector translated parallel to the origin of $\mathbf{R}^3$. A surface $f: M \to \mathbf{R}^3$ is minimal if and only if the Gauss mapping is antiholomorphic. In the general case, the Gauss mapping $G$ is defined to be a mapping assigning to each point $p \in M$ the oriented tangent space $f_* T_p(M) \subset \mathbf{R}^n$. Thus $G$ is a mapping from $M$ into the †Grassmann manifold $\tilde{G}_{n,2}(\mathbf{R}) = SO(n)/SO(2) \times SO(n-2)$ of oriented planes in $\mathbf{R}^n$, which is naturally diffeomorphic to the †complex quadric $Q_{n-2}$ in the complex projective space $\mathbf{C}P^{n-1}$. An immersion $f: M \to \mathbf{R}^n$ is minimal if and only if its Gauss mapping is antiholomorphic.
Let $f: M \to \mathbf{R}^n$ be a complete orientable minimal surface. Let $\chi$ be the Euler characteristic and $-\pi C = \int_M K\, dA$ the total curvature of $f(M)$. Assume that the total curvature is finite. Then the following results of Chern and Osserman are fundamental: (a) $M$ is conformally a compact Riemann surface $\bar{M}$ with a finite number, say $r$, of points deleted; (b) $C$ is an even integer and satisfies $C \ge 2(r - \chi) = 4g + 4r - 4$, where $g$ is the genus of $\bar{M}$ (= genus of $M$); (c) if $f(M)$ does not lie in any proper affine subspace of $\mathbf{R}^n$, then $C \ge 4g + r + n - 3 \ge 4g + n - 2 \ge n - 2$ (F. Gackstatter); (d) if $f(M)$ is simply connected and nondegenerate, i.e., its image under the Gauss mapping does not lie in a hyperplane of $\mathbf{C}P^{n-1}$, then $C \ge 2n - 2$ and this inequality is sharp; (e) when $n = 3$, $C$ is a multiple of 4, with the minimum value 4 attained only for Enneper's surface and the catenoid; (f) the Gauss mapping $G$ of $M$ extends to a mapping of $\bar{M}$ whose image $G(\bar{M})$ is an algebraic curve in $\mathbf{C}P^{n-1}$ lying in $Q_{n-2}$; the total curvature of $f(M)$ is equal in absolute value to the area of $G(\bar{M})$, counting multiplicity; (g) $G(\bar{M})$ intersects a fixed number of times, say $m$ (counting multiplicity), every hyperplane in $\mathbf{C}P^{n-1}$ except for those hyperplanes containing any of the finite number of points of $G(\bar{M} - M)$; the total curvature of $f(M)$ equals $-2\pi m$.
In particular, if a complete minimal surface in $\mathbf{R}^n$ has $k$ †ends, then the total curvature never exceeds $2\pi(\chi - k)$. Enneper's surface and the catenoid are the only two complete minimal surfaces in $\mathbf{R}^3$ whose Gauss mapping is one-to-one.
If the Gauss mapping of a complete minimal surface of finite total curvature in $\mathbf{R}^3$ omits more than 3 points of $S^2$, then it is a plane (Osserman). F. Xavier proved that the Gauss mapping of any complete nonflat minimal
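Statement (e) of the Chern-Osserman results can be illustrated for the catenoid. The following is a minimal sketch, not part of the original article: it assumes the standard parametrization $(\cosh v \cos u, \cosh v \sin u, v)$, for which $K = -1/\cosh^4 v$ and $dA = \cosh^2 v\, du\, dv$ (these formulas are well known but are not stated in the text), and integrates $K\,dA$ by the midpoint rule over a long but finite piece of the surface. The function name `catenoid_total_curvature` and its parameters are hypothetical.

```python
import math


def catenoid_total_curvature(v_max=20.0, steps=200_000):
    """Approximate the integral of K dA over the catenoid piece |v| <= v_max.

    With K = -1/cosh(v)**4 and dA = cosh(v)**2 du dv, the integrand in v is
    -1/cosh(v)**2, and the u-integration over [0, 2*pi) gives a factor 2*pi.
    """
    h = 2 * v_max / steps
    total = 0.0
    for i in range(steps):
        v = -v_max + (i + 0.5) * h      # midpoint of the i-th subinterval
        total += -1.0 / math.cosh(v) ** 2
    return 2 * math.pi * total * h


total_curvature = catenoid_total_curvature()
C = -total_curvature / math.pi          # the text writes the total curvature as -pi*C
print(total_curvature, C)
```

The computed value is close to $-4\pi$, i.e., $C = 4$, in agreement with (e). It also matches the bound $2\pi(\chi - k)$ with $\chi = 0$ and $k = 2$ ends for the catenoid.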

surface in $\mathbf{R}^3$ can omit at most 6 points of $S^2$. It is an open question whether this can be improved to 4 points. The Gauss mapping of Scherk's surface omits exactly 4 points of $S^2$.

F. Minimal Submanifolds

(1) Bernshtein's problem. As was mentioned in Section A, a minimal surface in $\mathbf{R}^3$ represented as the graph of a function on the entire plane $\mathbf{R}^2$ is a plane. This is known as Bernshtein's theorem (1915). This is, however, not true in $\mathbf{R}^4$; namely, there is a minimal surface defined as the graph of a vector-valued function on the entire plane, which is not a plane. The generalized Bernshtein problem is stated as follows: Let $f: \mathbf{R}^n \to \mathbf{R}$ be a function satisfying the minimal hypersurface equation
$$\sum_{i=1}^{n} \frac{\partial}{\partial x_i}\left(\frac{1}{W}\frac{\partial f}{\partial x_i}\right) = 0,$$
where $W = \sqrt{1 + |\nabla f|^2}$. Is the graph of $f$ an affine hyperplane? The answer is affirmative for $n \le 7$ (E. De Giorgi, F. J. Almgren, J. Simons). For $n \ge 8$, it is negative (E. Bombieri, De Giorgi, E. Giusti). Though the answer to the problem turns out to be negative for $n \ge 8$, no concrete counterexample is known, even for $n = 8$.
(2) Minimal submanifolds in the sphere. Minimal submanifolds in the unit sphere $S^n$ have some special properties. For example, there is no compact minimal submanifold of $S^n$ contained in an open hemisphere. On the other hand, there is no stable minimal closed submanifold in $S^n$ (B. Lawson, Simons). More generally, any closed minimal hypersurface in a Riemannian manifold with positive Ricci curvature is unstable (Simons). As for the existence of minimal surfaces in a sphere, the following theorem of Lawson is general: Any closed surface of any genus, except the real projective plane, can be minimally immersed into the unit sphere $S^3$. As an analog to Bernshtein's theorem, a sphere $S^2$ that is minimally immersed in $S^3$ is necessarily totally geodesic. On the other hand, by making use of higher-order second fundamental forms and the relations among them, a complete description of the minimal immersions of $\mathbf{R}^2$ into $S^n$ has been given by K. Kenmotsu.
As a general theorem which gives a necessary and sufficient condition for a Riemannian manifold to be minimally immersed in the unit sphere $S^n$, the following is fundamental: An isometric immersion $x$ of an $m$-dimensional Riemannian manifold $M$ into $S^n$, viewed as a vector-valued function into $\mathbf{R}^{n+1}$, is minimal if and only if $\Delta x = mx$ (T. Takahashi). As its immediate application, any $m$-dimensional compact homogeneous space whose linear

isotropy group is irreducible can be minimally immersed into the $n$-sphere of curvature $\lambda/m$, corresponding to any eigenvalue $\lambda$ ($\ne 0$) of the Laplacian, where $n + 1$ is the dimension of the eigenspace corresponding to $\lambda$ (W. Y. Hsiang, Takahashi).
Since for the †symmetric spaces of †rank 1 the eigenvalues and the corresponding eigenspaces of the Laplacian are known, the minimal immersions of such spaces into the unit sphere have been determined as well as the †rigidity of such minimal immersions (N. Wallach). For example, if the $m$-sphere of constant curvature $c$ is minimally immersed into the unit $n$-sphere, but not contained in any great hypersphere, then for some nonnegative integer $k$, $c = m/k(k + m - 1)$ and $n + 1 \le (2k + m - 1) \cdot [(k + m - 2)!/k!(m - 1)!]$. The immersion is rigid if and only if $m = 2$ or $k \le 3$. Similar results have been known for the projective spaces over real numbers, complexes, quaternions, or Cayley numbers.
Simons showed that the scalar curvature $\rho$ of a compact minimal submanifold $M$ of dimension $m$ in the unit $(m + p)$-sphere is not greater than $m(m - 1)$. Furthermore, if $\rho \ge m(m - 1) - mp/(2p - 1)$, then either $\rho = m(m - 1)$ or $\rho = m(m - 1) - mp/(2p - 1)$. In the former case, $M$ is totally geodesic and therefore is the unit sphere $S^m$. In the latter case, $M$ is either a hypersurface of the unit $(m + 1)$-sphere which is isometric to the product $S^k(\sqrt{k/m}) \times S^{m-k}(\sqrt{(m-k)/m})$ of spheres of radius $\sqrt{k/m}$ and $\sqrt{(m-k)/m}$, respectively, called the generalized Clifford torus, or the Veronese surface in $S^4$. Here the Clifford torus is the torus $S^1(\sqrt{1/2}) \times S^1(\sqrt{1/2}) \subset S^3 \subset \mathbf{R}^4$, and the Veronese surface is defined as follows: Let $(x, y, z)$ be the natural coordinate system in $\mathbf{R}^3$ and $(u^1, u^2, u^3, u^4, u^5)$ that in $\mathbf{R}^5$. The mapping defined by
$$u^1 = yz/\sqrt{3}, \quad u^2 = zx/\sqrt{3}, \quad u^3 = xy/\sqrt{3},$$
$$u^4 = (x^2 - y^2)/2\sqrt{3}, \quad u^5 = (x^2 + y^2 - 2z^2)/6$$
gives an isometric immersion of $S^2(\sqrt{3})$ into $S^4$. Two points $(x, y, z)$ and $(-x, -y, -z)$ of $S^2(\sqrt{3})$ are mapped into the same point of $S^4$, and thus the mapping defines an embedding of the real projective plane $\mathbf{R}P^2$ into $S^4$. This embedded real projective plane in $S^4$ is called the Veronese surface.
T. Otsuki proved the following: Let $M$ be a complete minimal hypersurface immersed in the unit $(n + 1)$-sphere with two principal curvatures. If their multiplicities are $m$ and $n - m \ge 2$, then $M$ is congruent to the generalized Clifford torus $S^m(\sqrt{m/n}) \times S^{n-m}(\sqrt{(n-m)/n}) \subset S^{n+1} \subset \mathbf{R}^{n+2}$. If one of their multiplicities is 1, then $M$ is a hypersurface of $S^{n+1}$ in $\mathbf{R}^{n+2} = \mathbf{R}^n \times \mathbf{R}^2$ whose orthogonal projection into $\mathbf{R}^2$
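Two of the claims made about the Veronese mapping can be tested numerically: that it sends the sphere of radius $\sqrt{3}$ in $\mathbf{R}^3$ into the unit sphere $S^4 \subset \mathbf{R}^5$, and that antipodal points have the same image. The sketch below is an illustration and not part of the original article; it samples random points and does not check the isometry property. The names `veronese` and `random_point_on_sphere`, the sample size, and the random seed are arbitrary choices made here.

```python
import math
import random


def veronese(x, y, z):
    """The mapping (x, y, z) -> (u1, ..., u5) quoted in the text."""
    s3 = math.sqrt(3.0)
    return (y * z / s3,
            z * x / s3,
            x * y / s3,
            (x * x - y * y) / (2 * s3),
            (x * x + y * y - 2 * z * z) / 6)


def random_point_on_sphere(radius, rng):
    """Sample a point on the sphere of the given radius centred at the origin."""
    while True:
        p = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in p))
        if norm > 1e-12:                 # reject the (unlikely) near-zero vector
            return tuple(radius * c / norm for c in p)


rng = random.Random(0)
max_norm_error = 0.0     # deviation of |u|^2 from 1
max_antipodal_gap = 0.0  # difference between images of p and -p
for _ in range(1000):
    x, y, z = random_point_on_sphere(math.sqrt(3.0), rng)
    u = veronese(x, y, z)
    w = veronese(-x, -y, -z)
    max_norm_error = max(max_norm_error, abs(sum(c * c for c in u) - 1.0))
    max_antipodal_gap = max(max_antipodal_gap,
                            max(abs(a - b) for a, b in zip(u, w)))
print(max_norm_error, max_antipodal_gap)
```

Behind the first check is the identity $\sum_i (u^i)^2 = (x^2 + y^2 + z^2)^2/9$, which equals 1 when $x^2 + y^2 + z^2 = 3$. The second check holds because every component of the mapping is a quadratic form in $(x, y, z)$.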

is a curve of which the †support function $x(t)$ is a solution of the following nonlinear differential equation of the second order:
$$n x (1 - x^2)\frac{d^2 x}{dt^2} + \left(\frac{dx}{dt}\right)^2 + (1 - x^2)(n x^2 - 1) = 0.$$
Furthermore, there are countably many compact minimal hypersurfaces immersed but not embedded in $S^{n+1}$. Only $S^{n-1}(\sqrt{(n-1)/n}) \times S^1(\sqrt{1/n})$ is minimally embedded in $S^{n+1}$. This just corresponds to the trivial solution $x(t) = 1/\sqrt{n}$ of the above equation.
(3) Minimal submanifolds in Riemannian manifolds. In general, it is difficult to determine all the complete stable minimal submanifolds in a given Riemannian manifold. If, however, some curvature conditions are imposed on the Riemannian manifold, then a classification has been given: Let $N$ be a complete 3-dimensional Riemannian manifold with nonnegative scalar curvature $\rho$, and let $M$ be a complete, stable, and orientable minimal surface in $N$.
(a) If $M$ is compact, then either $M$ is conformally equivalent to the †Riemann sphere $S^2$ or else it is a totally geodesic flat torus. Furthermore, if $\rho > 0$, then the latter case never occurs.
(b) If $M$ is not compact, then $M$ is conformally equivalent to the complex plane $\mathbf{C}$ or a cylinder (D. Fischer-Colbrie; R. Schoen).
As a generalization of the procedure for generating periodic minimal surfaces in $\mathbf{R}^3$ with octahedral or tetrahedral symmetry (Schwarz, 1867), T. Nagano and B. Smyth gave a construction procedure to generate periodic minimal surfaces in $\mathbf{R}^n$ or $n$-tori with symmetry corresponding to any †Weyl group of rank $n$.

G. Minimal Varieties

Recent progress in the study of Plateau's problem in higher dimensions is closely related to the point of view of geometric measure theory and the †calculus of variations.
(1) Integral currents. Let $T$ be a †current of degree $k$, or simply a $k$-current, defined on an open set $U$ of $\mathbf{R}^n$, and let $\partial T$ be the boundary of $T$, the exterior derivative of $T$. For simplicity, in the following only currents with compact support are considered.
The mass $\mathbf{M}(T)$ of a $k$-current $T$ is the dual norm $\mathbf{M}(T) = \sup\{T(\varphi) \mid \mathbf{M}(\varphi) \le 1\}$ of the comass $\mathbf{M}(\varphi)$ on $k$-forms, which is defined by $\mathbf{M}(\varphi) = \sup\{\|\varphi(x)\| \mid x \in U\}$ with $\|\varphi(x)\| = \sup\{\langle \varphi(x), v_1 \wedge \cdots \wedge v_k \rangle \mid v_1, \ldots, v_k \text{ are orthonormal vectors at } x \in U\}$. It follows that $\mathbf{M}(T)$ coincides with $k$-dimensional volume when $T$ is a $k$-current defined by integration on a $k$-dimensional simplex. We say that $T$ is normal if $\mathbf{M}(T)$ and $\mathbf{M}(\partial T)$ are finite. Normal currents form a †Banach space with norm $\mathbf{N}(T) = \mathbf{M}(T) + \mathbf{M}(\partial T)$.
A current $T$ is called rectifiable if it can be approximated in the mass norm by currents of type $f_\# \gamma$, where $\gamma$ is a finite polyhedral chain with integer coefficients, $f: \operatorname{supp} \gamma \to U$ is a †Lipschitz mapping of the support of $\gamma$ into $U$, and $f_\# \gamma$ is the current defined by means of $f_\# \gamma(\varphi) = \gamma(f^* \varphi)$. If both $T$ and $\partial T$ are rectifiable, $T$ is called an integral current. Integral currents give the appropriate notion of generalized manifolds with boundary in the study of higher-dimensional Plateau problems.
One of the fundamental properties of integral currents is the compactness, stated as follows: Given a compact subset $K \subset U$ and a number $c > 0$, the set of integral currents $T$ such that $\operatorname{supp} T \subset K$ and $\mathbf{N}(T) \le c$ is sequentially compact in the †weak topology. Since the mass $\mathbf{M}(T)$ is †lower semicontinuous in the weak topology, it follows that Plateau's problem can always be solved in the space of integral currents (→ Section C; 334 Plateau's Problem). The question arises: To what extent is this a satisfactory solution to the problem? For codimension 1, it is known from the work of Federer and others that an integral $(n - 1)$-current of least mass in $\mathbf{R}^n$ is nonsingular in codimension $< 7$. In particular, in $\mathbf{R}^n$, $n \le 7$, every integral $(n - 1)$-current of least mass is an analytic manifold. In general codimensions, it is known that the set of regular points is dense (Reifenberg, Almgren, and others).
The definition of rectifiable and integral currents carries over to those on a Riemannian manifold $M$ in a natural way. Then the space of integral currents $I_*(M)$ on $M$ with the boundary operator $\partial$, for which $\partial^2 = 0$, forms a chain complex. It is then a fundamental theorem, due to Federer and Fleming, in homological integration theory that there is a natural isomorphism $H_*(I_*(M)) \cong H_*(M; \mathbf{Z})$ of the homology of the complex of integral currents with the integral singular homology groups of $M$. From this it follows that if $M$ is a compact Riemannian manifold, then each class $\alpha \in H_*(I_*(M)) \cong H_*(M; \mathbf{Z})$ contains an integral current of least mass among all integral currents in $\alpha$.
For a homology class of codimension 1, it has also been proved by Almgren that if $M$ is a compact Riemannian manifold of dimension $n \le 7$, then for each $\alpha \in H_{n-1}(M; \mathbf{Z})$ there exists a finite collection of mutually disjoint, compact oriented minimal hypersurfaces $S_1, \ldots, S_l$ embedded in $M$ and integers $n_1, \ldots, n_l$ such that the integral current $\sum_{j=1}^{l} n_j S_j$ represents $\alpha$. On the other hand, it has been recently proved

that every compact Riemannian manifold of dimension $\le 6$ contains a nonempty closed embedded minimal hypersurface (J. T. Pitts).
(2) Varifolds. The theory of integral currents developed by Federer, Fleming, and others yields reasonable spaces for purposes of the calculus of variations. However, it is not entirely feasible to use these in the study of the actual soap films that occur in physical experiments. It turns out that we should work in a more set-theoretic fashion and give up the notion of orientability and the boundary operator. A convenient theory for describing these physical experiments is the theory of varifolds developed by Almgren.
A $k$-dimensional varifold on a Riemannian manifold $M$ is a †Radon measure on the bundle $G_k(M)$ of unoriented tangent $k$-planes of $M$. For simplicity, only varifolds with compact support are considered. These are regarded as continuous linear functionals on the space $C(G_k(M))$ of continuous functions on $G_k(M)$. Thus a $k$-dimensional submanifold $S$ of finite volume embedded in $M$ determines a varifold $V_S$ by integration:
$$V_S(f) = \int_S f(T_x S)\, d\mathcal{H}^k(x), \quad f \in C(G_k(M)),$$
where $\mathcal{H}^k$ denotes the $k$-dimensional †Hausdorff measure on $M$. More generally, to any rectifiable current there corresponds an underlying varifold obtained by neglecting orientations.
Let $V_k(M)$ denote the space of $k$-dimensional varifolds on $M$. Given a varifold $V \in V_k(M)$, the mass $\mathbf{M}(V)$ is defined to be the $V$-measure of the total space, i.e., $\mathbf{M}(V) = V(1)$. Let $f: M \to M$ be a $C^1$-mapping. Then $f$ induces naturally a mapping $f_\#$ of $V_k(M)$ into itself. In particular, if $X$ is a smooth vector field on $M$ with associated flow $f_t$, then there is a smooth function $\mathbf{M}(t) = \mathbf{M}(f_{t\#} V)$. A varifold $V \in V_k(M)$ is then called a $k$-dimensional minimal variety in $M$ if the first variation $(d/dt)\mathbf{M}(t)|_{t=0} = 0$ for all smooth vector fields on $M$. For example, minimal submanifolds, complex analytic subvarieties of Kähler manifolds, and integral currents of least mass are minimal varieties.
An appropriate notion of rectifiable and integral varifolds is defined, and an analog of the compactness theorem can be obtained. As a consequence, it was proved by Almgren using †Morse theory methods that if $M$ is an $n$-dimensional compact Riemannian manifold, then for each $p$, $1 \le p \le n - 1$, there exists at least one minimal variety of dimension $p$. As for the regularity of minimal varieties, it is known that if $V$ is a $k$-dimensional variety, then in the support of $V$ there is a relatively open dense subset that is a regular minimal submanifold of dimension $k$ (W. K. Allard).

H. Maximal Hypersurfaces in Minkowski Space

Let $L^{n+1}$ be a Minkowski space, i.e., $L^{n+1} = \{(x_1, \ldots, x_n, t) \mid (x_1, \ldots, x_n) \in \mathbf{R}^n, t \in \mathbf{R}\}$ with the †Lorentzian metric $\sum_{i=1}^{n} (dx_i)^2 - (dt)^2$. Let $M$ be a †spacelike hypersurface of $L^{n+1}$ so that the induced metric on $M$ is Riemannian. If the mean curvature vector field $H$ of $M$, defined in the same way as in the Riemannian case, vanishes identically, then $M$ is called maximal. In contrast to the Riemannian case, by the variation in the normal direction of $H$, the volume increases, provided $H \ne 0$. However, the equation describing a maximal hypersurface is similar to that for the Riemannian case (→ Sections D (1), F (1)). Indeed, let $D$ be a domain in $\mathbf{R}^n$ and $f: D \to \mathbf{R}$ a smooth function. Then the graph of $f$ in $L^{n+1}$ is a maximal hypersurface if and only if $f$ satisfies the following quasilinear partial differential equation of the second order:
$$\sum_{i=1}^{n} \frac{\partial}{\partial x_i}\left(\frac{1}{\sqrt{1 - |\nabla f|^2}}\frac{\partial f}{\partial x_i}\right) = 0.$$
This is †elliptic for $|\nabla f| < 1$, since $M$ is spacelike. E. Calabi proposed a problem similar to Bernshtein's in 1968. Contrary to Bernshtein's case, the answer is affirmative for any $n$. Namely, a maximal hypersurface in $L^{n+1}$ defined as the graph of a function on the entire space $\mathbf{R}^n$ is a hyperplane in $L^{n+1}$. More generally, a maximal hypersurface which is a closed set in $L^{n+1}$ is a hyperplane (S. Y. Cheng and S. T. Yau). It is also known that any maximal hypersurface in $L^{n+1}$ is stable.

References

[1] F. J. Almgren, Plateau's problem, An invitation to varifold geometry, Benjamin, 1966.
[2] R. Courant, Dirichlet's principle, conformal mapping, and minimal surfaces, Interscience, 1950.
[3] H. Federer, Geometric measure theory, Springer, 1969.
[4] H. Federer, Colloquium lectures on geometric measure theory, Bull. Amer. Math. Soc., 84 (1978), 291-338.
[5] D. Hoffman and R. Osserman, The geometry of the generalized Gauss map, Mem. Amer. Math. Soc., 236 (1980).
[6] H. B. Lawson, Minimal varieties, Proc. Symp. Pure Math., 27 (1975), 143-175.
[7] H. B. Lawson, Lectures on minimal submanifolds I, Publish or Perish, 1980.
[8] C. B. Morrey, Multiple integrals in the calculus of variations, Springer, 1966.
276 A 1036
Model Theory

[9] J. C. C. Nitsche, On new results in the theory of minimal surfaces, Bull. Amer. Math. Soc., 71 (1965), 195-270.
[10] J. C. C. Nitsche, Vorlesungen über Minimalflächen, Springer, 1975.
[11] R. Osserman, Minimal varieties, Bull. Amer. Math. Soc., 75 (1969), 1092-1120.
[12] R. Osserman, A survey of minimal surfaces, Van Nostrand, 1969.
[13] T. Radó, On the problem of Plateau, Springer, 1933.
[14] A. J. Tromba, On the number of simply connected minimal surfaces spanning a given curve, Mem. Amer. Math. Soc., 194 (1977).
[15] Minimal submanifolds and geodesics, Proc. Japan-US Seminar, Tokyo, 1977, North-Holland, 1979.
[16] S. Nishikawa, On maximal spacelike hypersurfaces in a Lorentzian manifold, Nagoya Math. J., 95 (1984), 117-124.
[17] O. Kobayashi, Maximal surfaces with conelike singularities, J. Math. Soc. Japan, 36 (1984), 609-617.

276 (I.6)
Model Theory

A. Language

Every mathematical theory has an appropriate language. To determine a language for a theory means to determine a language for the related mathematical system. Such a language consists of the following symbols (the symbols given here are examples of only one notational system).
(1) Symbols that express logical concepts (logical symbols): $\forall$, $\exists$, $\neg$, $\wedge$, $\vee$, $\rightarrow$;
(2) free variables: $a_0, a_1, a_2, \ldots$;
(3) bound variables: $x_0, x_1, x_2, \ldots$;
(4) symbols that denote individual objects (individual constants): $c_0, c_1, c_2, \ldots, c_\alpha, \ldots$;
(5) function symbols: $f_0, f_1, f_2, \ldots, f_\beta, \ldots$;
(6) predicate symbols: $P_0, P_1, P_2, \ldots, P_\gamma, \ldots$.
The cardinalities of the sets of symbols in (4), (5) and (6) are arbitrary, except that there must be at least one predicate symbol. It is assumed that each set of symbols is well ordered. Also, it is understood that to each $f_j$ in (5) there corresponds a positive integer $i_j$, while to each $P_j$ there corresponds a nonnegative integer (these integers are called the number of arguments of $f_j$ and $P_j$, respectively).
In practice, other kinds of languages are also dealt with. One example is a system with infinitely long expressions, which permits transfinite ordinals for the numbers of arguments of $f_j$ and $P_j$, and which has the extended concepts $\bigwedge_{\alpha<\beta}$, $\bigvee_{\alpha<\beta}$, $\exists x_0 \ldots \exists x_\alpha \ldots$ for $\alpha < \beta$, and $\forall x_0 \forall x_1 \ldots \forall x_\alpha \ldots$ for $\alpha < \beta$, where $\alpha$ and $\beta$ are transfinite ordinals. Another language includes variables of higher type as well as $\forall$ and $\exists$ over those variables. Free and bound variables may not be distinguished typographically. In that case a variable not bound by $\forall$ or $\exists$ in a formula is called free. (The notion of a formula is defined later.) To simplify this discussion, however, we restrict ourselves to the first-order predicate language with a typographic distinction between free and bound variables. We also assume that there are only a countable number of variables, and hence we use natural numbers as subscripts.
Set $L_1 = \{\text{logical symbols}\}$, $L_2 = \{a_0, a_1, a_2, \ldots\}$, $L_3 = \{x_0, x_1, x_2, \ldots\}$, $L_4 = \{c_0, c_1, c_2, \ldots\}$, $L_5 = \{f_0, f_1, f_2, \ldots\}$, $L_6 = \{P_0, P_1, P_2, \ldots\}$, $L = (L_1, L_2, L_3, L_4, L_5, L_6)$. To determine a language $L$ is to specify such a list $L$. Since $\forall$, $\exists$, $\neg$, $\wedge$, $\vee$, $\rightarrow$ are normally used for $L_1$, $a_0, a_1, a_2, \ldots$ for $L_2$, and $x_0, x_1, x_2, \ldots$ for $L_3$, we may assume that these are fixed, and hence to determine a language is to determine $(L_4, L_5, L_6)$. We take $(L_1, L_2, L_3)$ just described and assume an arbitrary but fixed $L = (L_4, L_5, L_6)$. First we define the notions term of $L$ (or $L$-term) and formula of $L$ (or $L$-formula).
Definition of the terms of $L$ ($L$-terms): (1) Each free variable $a_j$ is an $L$-term. (2) Each individual constant $c_j$ of $L$ is an $L$-term. (3) If $f_j$ is a function symbol of $L$, $i_j$ is the number of arguments of $f_j$, and each of $t_1, \ldots, t_{i_j}$ is an $L$-term, then $f_j(t_1, \ldots, t_{i_j})$ is also an $L$-term. (4) The $L$-terms are only those constructed by (1)-(3).
A term that does not contain a free variable is called a closed term.
Definition of the formulas of $L$ (or $L$-formulas): (1) Let $P_j$ be a predicate symbol of $L$ and $i_j$ be the corresponding natural number. If each of $t_1, \ldots, t_{i_j}$ is an $L$-term, then $P_j(t_1, \ldots, t_{i_j})$ is an $L$-formula. This type of formula is called a prime formula (or atomic formula). (2) If $A$ and $B$ are $L$-formulas, then each of $\neg(A)$, $(A) \wedge (B)$, $(A) \vee (B)$, and $(A) \rightarrow (B)$ is an $L$-formula. (3) Let $F$ be an $L$-formula and $x_i$ be a bound variable that does not occur in $F$. Then an expression obtained by putting ( ) around $F$, replacing some occurrences in $F$ of a free variable, say $a_j$, by $x_i$, and prefixing $\forall x_i$ or $\exists x_i$ is an $L$-formula. (4) The $L$-formulas are only those constructed by (1)-(3).
A formula that has no occurrence of a free variable is called a closed formula. The parentheses used in the formation of a formula may be omitted if no ambiguity arises thereby.
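
The inductive definitions of L-terms and L-formulas above can be sketched as a small syntax tree. This is only an illustration under stated assumptions: the class names (`FreeVar`, `BoundVar`, `Const`, `Func`, `Atom`, `Not`, `BinOp`, `Quant`) and the helpers `free_vars`, `substitute`, `quantify`, and `is_closed` are hypothetical and do not come from the article. Rule (3) for formulas allows replacing only some occurrences of a free variable; the sketch replaces all of them, and it does not check that the bound variable is new to F.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FreeVar:          # a_j
    index: int

@dataclass(frozen=True)
class BoundVar:         # x_i
    index: int

@dataclass(frozen=True)
class Const:            # c_j
    index: int

@dataclass(frozen=True)
class Func:             # f_j(t_1, ..., t_n)
    index: int
    args: tuple

@dataclass(frozen=True)
class Atom:             # prime formula P_j(t_1, ..., t_n)
    index: int
    args: tuple

@dataclass(frozen=True)
class Not:
    body: object

@dataclass(frozen=True)
class BinOp:            # op is "and", "or", or "implies"
    op: str
    left: object
    right: object

@dataclass(frozen=True)
class Quant:            # kind is "forall" or "exists"
    kind: str
    var: BoundVar
    body: object

def free_vars(e):
    """Indices of the free variables a_j occurring in a term or formula."""
    if isinstance(e, FreeVar):
        return frozenset({e.index})
    if isinstance(e, (BoundVar, Const)):
        return frozenset()
    if isinstance(e, (Func, Atom)):
        return frozenset().union(*[free_vars(a) for a in e.args])
    if isinstance(e, (Not, Quant)):
        return free_vars(e.body)
    if isinstance(e, BinOp):
        return free_vars(e.left) | free_vars(e.right)
    raise TypeError(f"not a term or formula: {e!r}")

def substitute(e, j, x):
    """Replace every occurrence of the free variable a_j in e by x."""
    if isinstance(e, FreeVar):
        return x if e.index == j else e
    if isinstance(e, (BoundVar, Const)):
        return e
    if isinstance(e, Func):
        return Func(e.index, tuple(substitute(a, j, x) for a in e.args))
    if isinstance(e, Atom):
        return Atom(e.index, tuple(substitute(a, j, x) for a in e.args))
    if isinstance(e, Not):
        return Not(substitute(e.body, j, x))
    if isinstance(e, BinOp):
        return BinOp(e.op, substitute(e.left, j, x), substitute(e.right, j, x))
    if isinstance(e, Quant):
        return Quant(e.kind, e.var, substitute(e.body, j, x))
    raise TypeError(f"not a term or formula: {e!r}")

def quantify(kind, i, j, formula):
    """Formula rule (3): replace a_j by x_i in the formula and prefix a quantifier."""
    x = BoundVar(i)
    return Quant(kind, x, substitute(formula, j, x))

def is_closed(e):
    """A closed term or closed formula has no free variable."""
    return not free_vars(e)

# P_0(a_0, c_0) is not closed; quantifying over a_0 closes it.
F = Atom(0, (FreeVar(0), Const(0)))
G = quantify("forall", 0, 0, F)
```

Here `F` stands for the prime formula P_0(a_0, c_0) and `G` for the closed formula obtained from it by rule (3) with the quantifier over x_0.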

B. Structures

Let $L$ be a specific language as described in the previous section. Then $\mathfrak{M} = [M : \rho; \sigma; \tau]$ defined by (1)-(4) below is called a structure for $L$ (or $L$-structure).
(1) $M$ is a nonempty set. ($M$ is called the universe of $\mathfrak{M}$.)
(2) $\rho$ is a mapping from $L_4$ into $M$.
(3) Let $L_5^i = \{f_j \mid \text{the number of arguments of } f_j \text{ is } i\}$. Then $L_5 = L_5^1 \cup L_5^2 \cup \cdots \cup L_5^i \cup \cdots$ provides a partition of $L_5$. Let $\mathcal{F}_i$ be the set of all mappings from $M^i = M \times \cdots \times M$ ($i$ times) into $M$ and $\sigma_i$ be a mapping from $L_5^i$ into $\mathcal{F}_i$. We define $\sigma$ for an arbitrary $f$ of $L_5$ by $\sigma(f) = \sigma_i(f)$, where $i$ is the number of arguments of $f$. Then $\sigma$ is obviously a mapping from $L_5$ into $\bigcup_{i=1}^{\infty} \mathcal{F}_i$.
(4) Decompose $L_6$ into $L_6^0 \cup L_6^1 \cup \cdots \cup L_6^i \cup \cdots$ as in (3). Let $\mathcal{P}_i$ be the set of all subsets of $M^i$ and $\tau_i$ be a mapping from $L_6^i$ into $\mathcal{P}_i$, where $\mathcal{P}_0$ is the set $\{M, \emptyset\}$ ($\emptyset$ is the empty set). Then $\tau$ is defined for every $i$ and for an arbitrary $P$ of $L_6^i$ by $\tau(P) = \tau_i(P)$.
If we denote $\rho(c)$ by $\bar{c}$, $\sigma(f)$ by $\bar{f}$ and $\tau(P)$ by $\bar{P}$, then we may understand that $\rho$ is represented by $\bar{c}_0, \bar{c}_1, \ldots$, $\sigma$ is represented by $\bar{f}_0, \bar{f}_1, \ldots$, and $\tau$ is represented by $\bar{P}_0, \bar{P}_1, \ldots$. Therefore $\mathfrak{M}$ is normally expressed as

$\mathfrak{M} = [M : \bar{c}_0, \bar{c}_1, \ldots; \bar{f}_0, \bar{f}_1, \ldots; \bar{P}_0, \bar{P}_1, \ldots]$.

C. Satisfiability

We fix not only a language $L$ but also a structure $\mathfrak{M}$ for $L$. Then the property that an $L$-formula is satisfiable is defined by the following procedure:
Let $m, n, \ldots$ stand for sequences of the elements of $M$, say $(m_0, m_1, \ldots)$, $(n_0, n_1, \ldots)$, ..., called $\mathfrak{M}$-sequences. We write $m \overset{i}{=} n$ to indicate that each entry of $m$ except the $i$th one is equal to the corresponding entry of $n$. Using these concepts, the value of an $L$-term at an $\mathfrak{M}$-sequence $m$, denoted by $t[m]$, is defined as follows:
(1) If $t$ is a free variable $a_j$, then $t[m] = m_j$.
(2) If $t$ is an individual constant $c_j$, then $t[m] = \bar{c}_j$.
(3) If $t$ is of the form $f_j(t_1, \ldots, t_i)$, then $t[m] = \bar{f}_j(t_1[m], \ldots, t_i[m])$.
If $t$ is an $L$-term, then evidently $t[m]$ is an element of $M$.
Based on this definition of $t[m]$, the relation "$A$ is satisfiable by $m$ in $\mathfrak{M}$", denoted by $\mathfrak{M}, m \models A$, is defined for an arbitrary $L$-formula $A$ and an arbitrary $\mathfrak{M}$-sequence $m$ as follows:
(1) $\mathfrak{M}, m \models P_j(t_1, \ldots, t_i) \Leftrightarrow (t_1[m], \ldots, t_i[m]) \in \bar{P}_j$.
(2) $\mathfrak{M}, m \models \neg B \Leftrightarrow \mathfrak{M}, m \models B$ is false.
(3) $\mathfrak{M}, m \models B \wedge C \Leftrightarrow \mathfrak{M}, m \models B$ and $\mathfrak{M}, m \models C$.
(4) $\mathfrak{M}, m \models B \vee C \Leftrightarrow \mathfrak{M}, m \models B$ or $\mathfrak{M}, m \models C$.
(5) $\mathfrak{M}, m \models B \rightarrow C \Leftrightarrow \mathfrak{M}, m \models B$ implies $\mathfrak{M}, m \models C$.
(6) $\mathfrak{M}, m \models \forall x_j F(x_j) \Leftrightarrow \mathfrak{M}, n \models F(a_i)$ for an arbitrary $n$ that satisfies $m \overset{i}{=} n$, where $a_i$ has the least index among the free variables that do not occur in $F(x_j)$.
(7) $\mathfrak{M}, m \models \exists x_j F(x_j) \Leftrightarrow$ there exists an $n$ such that $n \overset{i}{=} m$ and $\mathfrak{M}, n \models F(a_i)$, where $a_i$ satisfies the same condition as in (6).
Following are some consequences of this definition.
(1) For an arbitrary $L$-formula $A$ and an arbitrary $\mathfrak{M}$-sequence $m$, exactly one of $\mathfrak{M}, m \models A$ and $\mathfrak{M}, m \models \neg A$ holds.
(2) Let $a_{j_1}, \ldots, a_{j_i}$ include all free variables that occur in $A$, and let $m$ and $n$ be $\mathfrak{M}$-sequences for which $m_{j_1} = n_{j_1}, \ldots, m_{j_i} = n_{j_i}$. Then $\mathfrak{M}, m \models A$ and $\mathfrak{M}, n \models A$ are equivalent. So we may write $\mathfrak{M} \models A[m_{j_1}, \ldots, m_{j_i}]$ instead of $\mathfrak{M}, m \models A$, for any formula $A$ whose free variables are among $a_{j_1}, \ldots, a_{j_i}$, and any $\mathfrak{M}$-sequence $m$.
(3) If $A$ is a closed formula, then for an arbitrary pair of sequences $m$ and $n$, $\mathfrak{M}, m \models A$ and $\mathfrak{M}, n \models A$ are equivalent. Therefore, for a closed formula $A$, we may express the statement "for some (or, equivalently, for all) $\mathfrak{M}$-sequence $m$, $\mathfrak{M}, m \models A$ holds" by $\mathfrak{M} \models A$.
(4) Let $a_i$ be an arbitrary variable that does not occur in $\forall x F(x)$ or $\exists x F(x)$. Then $\mathfrak{M}, m \models \forall x F(x)$ is equivalent to $\mathfrak{M}, n \models F(a_i)$ for an arbitrary $n$ such that $n \overset{i}{=} m$. Likewise, $\mathfrak{M}, m \models \exists x F(x)$ is equivalent to the statement that there is an $n$ such that $n \overset{i}{=} m$ and $\mathfrak{M}, n \models F(a_i)$.

D. Models

Here again we fix a language $L$. Let $A$ be a closed $L$-formula and $\mathfrak{M}$ an $L$-structure. If $\mathfrak{M} \models A$, then $\mathfrak{M}$ is called a model of $A$. Furthermore, if $\Gamma = \{A_0, A_1, \ldots\}$ is an arbitrary set of closed formulas and $\mathfrak{M} \models A_i$ for all $A_i$ in $\Gamma$, then the structure $\mathfrak{M}$ is called a model of $\Gamma$.
(1) Consistency. Consider a logical system whose language is $L$. If there is a model of the set of all provable closed formulas of the system, then the system is consistent. In particular, the first-order predicate calculus is consistent.
(2) Completeness. A logical system is said to be complete if every closed formula that is satisfied in every structure is provable in the

system. In particular, the first-order predicate Definition 1. Two L-structures !IR and 9I are
calculus is complete. said to be elementarily (arithmetically) equiva-
K. Godel proved (2). Later L. Henkin gave lent if for an arbitrary closed L-formula A,
an alternative proof whose essential idea con- %R~/4-=%~‘4.
tributed to proving the following proposi-
tion: If a set F of closed L-formulas is con- Definition 2. Let
sistent, then there is a model of F. Henkin
also introduced a (nonstandard) second-order
semantics, relative to which the tsecond- and
order predicate calculus is complete. This
%=[N:r,,r, ,... ;h,,h, ,... ;R,, R ,,... ]
can be shown by extending Henkin’s tech-
nique (for the first order) to the second-order be two structures. Y.R is an elementary exten-
language. sion of ‘9 if the following two conditions are
(3) Here we extend the language slightly by satisfied: (i) M xN; qj=rj (j=O, 1, . ..). the
adding the second-order free predicate vari- restriction of gj to N is identical to hj (j=
ables cc;, cc;, . . , al, (n = 1,2, ) and the 0, 1, . . . ); the restriction of Qj to N is identi-
second-order bound predicate variables cp;, cal to Rj (j = 0, 1, . . . ). (If this condition holds,
~0;, , cpr, . (n = 1,2, . . .), where n indicates then YJI is said to be an extension of YI.) (ii) For
the number of arguments’ of a variable. Other- an arbitrary L-formula A and an arbitrary fl-
wise the definition of the language is the same sequence n, if %, n k A then YJI, n k A.
as for the case of the first-order predicate
calculus. For simplicity, however, we assume Theorem 1. Let %II be an extension of Yl. A
that there are no individual constants, function necessary and sufficient condition for !IJI to be
symbols, or predicate symbols. an elementary extension of !R is that for an
The structure is defined as follows: Put WI = arbitrary L-formula of the form 3xF(x) and
[M: S,, S,, . . , S,, . . 1, where M is a nonempty an arbitrary %-sequence n, if %I& n k 3xF(x),
set and S, is a set of subsets of M x x M (n then there is some element n of N such that
times). An 9X-sequence m is defined as before, for the YJ-sequence m for which m L n and
and 5, denotes (s;, s;, . . . , SF, . ), where each s; mi= n, YJI, m t= F(ai), where a, is an arbitrary
is a member of S,. The concept of satisfiability free variable that does not occur in F(x).
is defined as follows:
mm, (m, 5,, . . . / 5,, . . . I+$(%,, ...,xi
nlo Theorem 2. Here we place a condition on L
(mi,, . . . ,m,JEsjn. that each set of symbols be at most countable
!JJl, (m,e,, . . . . 5,, . ..)/=V(pLA(&!)ofor and arranged in the w-type (- 312 Ordinal
an arbitrary 5; for which 5; L 5,, YJI, (m, 5i, . , Numbers). Let the cardinality of the universe
eb, ) k A(aj”), where aj’ has the smallest index M of %lI be an infinite cardinal a, M’ be a
among the free predicate variables that do not subset of M of cardinality c, and b be an in-
occur in A(&). finite cardinal that satisfies c <b <a. Then
~Wh51>...> ~~,...)/=3&‘.4((~90there there exists an L-structure 9I whose universe
exists an 5; such that 5: i 5, and YJI, N has cardinality b and such that M’c N and
h 5 i, . . . ,5, , . . ) k A(“;), where a; satisfies YJI is an elementary extension of %.
the same condition as in the previous clause.
Satisftability for other cases is defined as for Theorem 3. Suppose that L satisfies the same
first-order predicate language. A structure YJI condition as in Theorem 2. Let the cardinality
is called normal if all axioms of the second- of the universe M of YJI be a (a is an infinite
order predicate calculus are true in )137. cardinal) and b be a cardinal for which a < 6.
Completeness of the second-order predicate Then there exists an L-structure 9I that is a
calculus: Every closed formula that is satisti- proper elementary extension of XII and whose
able in all normal structures is provable in the universe has cardinality 6.
second-order predicate calculus.
(4) Let the cardinality of L, be r, and I be
an arbitrary set of closed L-formulas. If I E. Ultraproducts
has a model, then I has a model of cardinal-
ity max (z, K,). This follows from Henkin’s Assume that for a set of L-structures C and a
method. Historically, however, it was first set of indices I, there is a mapping 0 from I
proved by Th. Skolem and L. Löwenheim for onto Z. If a is a member of I, YJI is a member
a special case, and was later generalized by A. of Z, and 0(a) = !IR, then 9JI may be denoted
I. Mal’tsev and A. Robinson. by YJI’. It should be noted that there may be
(5) The following results are all due to A. more than one 3 corresponding to the same
Tarski and R. L. Vaught. ~ structure. If D is a tmaximal filter of I and %II”

is expressed as

$\mathfrak{M}^\alpha = [M^\alpha : c^\alpha, \ldots; f^\alpha, \ldots; P^\alpha, \ldots]$,

then $\prod_{\alpha \in I} M^\alpha$ is defined by

$\prod_{\alpha \in I} M^\alpha = \{\varphi \mid \varphi \text{ is a mapping from } I \text{ into } \bigcup_{\alpha \in I} M^\alpha, \text{ where } \varphi(\alpha) \in M^\alpha\}$.

For any two elements $\varphi$ and $\psi$ of $\prod_{\alpha \in I} M^\alpha$, $\varphi \equiv \psi$ is defined by

$\varphi \equiv \psi \Leftrightarrow \{\alpha \mid \varphi(\alpha) = \psi(\alpha)\} \in D$.

Then $\varphi \equiv \psi$ is an equivalence relation between the elements of $\prod_{\alpha \in I} M^\alpha$. Furthermore, the set $\prod_{\alpha \in I} M^\alpha$ partitioned by $\equiv$ is expressed by $\prod_{\alpha \in I} M^\alpha / D$, and each element $m$ of $\prod_{\alpha \in I} M^\alpha / D$ is expressed by $m = [\varphi]$, where $\varphi$ is a representing element of $m$.
Next we define an operator that produces a new structure from $\Sigma$. Put $M = \prod_{\alpha \in I} M^\alpha / D$. For an individual constant $c$ of $L$, let $\bar{c} = [\varphi]$, where $\varphi(\alpha) = c^\alpha$ for every $\alpha$. For an $n$-ary function $f$ of $L$ and arbitrary elements $m_1 = [\varphi_1], \ldots, m_n = [\varphi_n]$ of $M$, define $\bar{f}(m_1, \ldots, m_n) = [\psi]$, where $\psi(\alpha) = f^\alpha(\varphi_1(\alpha), \ldots, \varphi_n(\alpha))$ for every $\alpha$. For an $n$-ary predicate $P$ of $L$, define $(m_1, \ldots, m_n) \in \bar{P}$ by

$(m_1, \ldots, m_n) \in \bar{P} \Leftrightarrow \{\alpha \mid (\varphi_1(\alpha), \ldots, \varphi_n(\alpha)) \in P^\alpha\} \in D$.

According to these definitions, put

$\mathfrak{M} = [M : \bar{c}, \ldots; \bar{f}, \ldots; \bar{P}, \ldots]$,

and denote it by $\prod_{\alpha \in I} \mathfrak{M}^\alpha / D$, called the ultraproduct of $\{\mathfrak{M}^\alpha\}_{\alpha \in I}$ (with respect to $D$). $\mathfrak{M}$ is an $L$-structure.

Fundamental Theorem of Ultraproducts. Let $\mathfrak{M} = \prod_{\alpha \in I} \mathfrak{M}^\alpha / D$ be the ultraproduct of $\{\mathfrak{M}^\alpha\}_{\alpha \in I}$, $m = (m_0, m_1, \ldots)$ be an $\mathfrak{M}$-sequence, $\varphi_i$ be a representing element of $m_i$, and $A$ be an arbitrary formula. Then $\mathfrak{M}, m \models A \Leftrightarrow \{\alpha \mid \mathfrak{M}^\alpha, (\varphi_0(\alpha), \varphi_1(\alpha), \ldots) \models A\} \in D$.
By using this fundamental theorem we have the following result: Suppose that $\Gamma$ is a set of closed formulas in $L$ such that every finite subset of $\Gamma$ has a model. Let $I$ be the set of all the finite subsets of $\Gamma$. For each $\alpha \in I$ and each $A \in \Gamma$, let $\mathfrak{M}^\alpha$ be a model of $\alpha$ and $\hat{A}$ the set of all the finite sets in $I$ which contain $A$ as a member. Let $F = \{\hat{A} \mid A \in \Gamma\}$. Since $F$ has the finite intersection property, there is a maximal filter $D$ such that $D \supseteq F$. Let $\mathfrak{M}$ be the ultraproduct $\prod_{\alpha \in I} \mathfrak{M}^\alpha / D$. Since $\hat{A}$ is a subset of the set $\{\alpha \mid \mathfrak{M}^\alpha \models A\}$ and $\hat{A}$ belongs to $D$, the set $\{\alpha \mid \mathfrak{M}^\alpha \models A\}$ belongs to $D$, for each $A \in \Gamma$. Hence we have that $\mathfrak{M}$ is a model of $\Gamma$, by the foregoing fundamental theorem of ultraproducts. Therefore, we have the following theorem.

Compactness Theorem. A set $\Gamma$ of closed formulas has a model if and only if every finite subset of it has a model.

In the case where all the structures $\mathfrak{M}^\alpha$ coincide with the single structure $\mathfrak{N}$, the ultraproduct of $\{\mathfrak{M}^\alpha\}_{\alpha \in I}$ (with respect to $D$) may be written $\mathfrak{N}^I / D$ and called the ultrapower of $\mathfrak{N}$ (with respect to $D$).
Let

$\mathfrak{M} = [M; q_0, q_1, \ldots; g_0, g_1, \ldots; Q_0, Q_1, \ldots]$

and

$\mathfrak{N} = [N; r_0, r_1, \ldots; h_0, h_1, \ldots; R_0, R_1, \ldots]$

be two structures. $\mathfrak{M}$ and $\mathfrak{N}$ are said to be isomorphic if there is a bijection $f$ from $M$ to $N$ such that the following three conditions hold: (i) $f(q_0) = r_0$, $f(q_1) = r_1$, .... (ii) The sequences $g_0, g_1, \ldots$ and $h_0, h_1, \ldots$ are of the same type and $f(g_j(a_1, \ldots, a_n)) = h_j(f(a_1), \ldots, f(a_n))$ holds for every $n$-tuple $a_1, \ldots, a_n$ in $M$. (iii) The sequences $Q_0, Q_1, \ldots$ and $R_0, R_1, \ldots$ are of the same type and $R_j = \{(f(a_1), \ldots, f(a_n)) \mid (a_1, \ldots, a_n) \in Q_j\}$.
Let $j$ be the function from $N$ to $N^I / D$ defined by $j(a) = [\varphi_a]$ for each $a \in N$, where $\varphi_a$ is the constant function from $I$ to $N$ such that $\varphi_a(\alpha) = a$ for each $\alpha \in I$. Let $\mathfrak{N}'$ be the substructure of $\mathfrak{N}^I / D$ whose universe is the range of $j$. Then $j$ is an isomorphism of $\mathfrak{N}$ to $\mathfrak{N}'$. In the following we identify $a$ and $j(a)$ for each $a \in N$. Then $\mathfrak{N}$ is an elementary substructure of $\mathfrak{N}^I / D$ by the fundamental theorem of ultraproducts.
If $\mathfrak{M}$ and $\mathfrak{N}$ are isomorphic, then $\mathfrak{M}$ and $\mathfrak{N}$ are elementarily equivalent. By using this fact and the fundamental theorem of ultraproducts, we have the following result. Let $\mathfrak{M}$ and $\mathfrak{N}$ be two structures. If there is a nonempty set $I$ and a maximal filter $D$ on $I$ such that $\mathfrak{M}^I / D$ and $\mathfrak{N}^I / D$ are isomorphic, then $\mathfrak{M}$ and $\mathfrak{N}$ are elementarily equivalent. H. J. Keisler proved the converse of this proposition by using the G.C.H. (generalized continuum hypothesis), and later S. Shelah proved it without the G.C.H. Keisler-Shelah isomorphism theorem: Let $\mathfrak{M}$ and $\mathfrak{N}$ be two structures. Then $\mathfrak{M}$ and $\mathfrak{N}$ are elementarily equivalent if and only if there is a nonempty set $I$ and a maximal filter $D$ on $I$ such that $\mathfrak{M}^I / D$ and $\mathfrak{N}^I / D$ are isomorphic.
The ultraproduct operation has various applications in number theory, algebraic geometry, and analysis. Here we give an example due to J. Ax and S. Kochen. Let $P$ be the set of prime numbers. Let $Q_p$ and $Z_p((t))$ be the field of $p$-adic numbers and the field of formal

power series over 2, = {0, 1, , p - 1) for each theorem of ultraproducts we can conclude that
p in P, respectively. Ax-Kochen isomorphism there are many nonstandard real numbers a
theorem: Suppose that D is a nonprincipal such that 0 < *c( < *a in R* for any UE R. Such
maximal filter on P. Then $\prod_{p \in P} Q_p / D$ and a nonstandard real number $\alpha$ is called an
$\prod_{p \in P} Z_p((t)) / D$ are isomorphic. infinitesimal real number. Since the norm
As an immediate consequence of this theo- operator 11 11is a mapping from H to R,
rem, we have the following partial solution of I/ II* is a mapping from H* to R*. If 11x II*
Artin’s conjecture on Diophantine equations. (x~ H*) is infinitesimal, then x is said to be
Theorem: For each positive integer d, there infinitesimal in H*. Let S be the set of all linear
exists a finite set Y of primes such that every subspaces of H. For a linear subspace K of H*
homogeneous polynomial f(t,, , t,) of degree that is contained in S* let K” be the set of
d over Q,, with n > d’, has a nontrivial zero elements x E H such that x-x0 is infinitesimal
in Q, for every p 6 Y (- 118 Diophantine for some x0 in K. Then K” is a closed linear
Equations). subspace of H. Let e = {eiJisN be an ortho-
We give another example in nonstandard normal basis of the Hilbert space H; e can be
analysis. A. Robinson developed the general considered as a mapping from N to H, and
theory of nonstandard analysis in [lo]. Here hence e* = {ej}jENL is a mapping from N* to
we explain a theorem due to A. R. Bernstein. H*. For each jeN*, let Hi be the linear sub-
Let X be a nonempty set and U(X) be the space of H* spanned by {ek 1k <j}. For a given
smallest transitive set (i.e., a E b and b E U(X) bounded linear operator T on H, T* is a
imply a~ U(X)) which has X as a member and linear operator on H*. We define 7;= CT* I$
is closed under the following operations: pair- where Pj is the projection from H* to Hj. Since
ing, union, power set, and subset operation dim(Hj)=j is a (nonstandard) natural number,
(i.e., a~ U(X) and b c a imply bE U(X)). Let L there exists a tower JojcJljc . cJjj= Hj of
be the first-order predicate logic with equality closed, Tj-invariant linear subspaces of Hj such
whose set of nonlogical constants consists of a that dim(Jkj) = k(k <j). Then Jk/’ is a closed,
binary predicate symbol E and individual T-invariant linear subspace of H. If there is a
constant symbols c, for a E U(X) (- 411 Sym- polynomial p(x) such that p(T) is a compact
bolic Logic F). Then the first-order structure operator, then we get a nonstandard natural
%= [U(X); a(aE U(X)); E] is an L-structure, number j such that J,” is a proper subspace
where E is the E relation on the set U(X). Let of H for some k <j. This gives the follow-
YJI=[U(X)‘/D:U(UEU(X)); E’/D] be the ing result, which is an affirmative solution of
ultrapower %‘/D of 6% with respect to a non- a problem of K. Smith and P. R. Halmos.
principal ultrafilter D on a set 1. For each Theorem (Bernstein 141): Let T be a bounded
UE U(X), let a* be the set of all elements [p] in linear operator on an infinite-dimensional
U(X)‘/D such that {i~Zl~(i)~a}~D. Then a Hilbert space H over the complex numbers
is a proper subset of a* if a is infinite. Since !R and let p(x) # 0 be a polynomial with complex
and !JIl are elementarily equivalent, these two coefficients such that p(T) is compact. Then T
sets a and a* have common first-order prop- leaves invariant at least one closed linear
erties in the following sense: for each formula subspace of H other than H or {O}.
@(uo) in L,

(VbEu)(% + @[~])+/b~a*)(!M + @[PI).


F. Categoricity in Powers
From this it follows that if r is a relation on a
set a in a, then Y* is a relation on the set a*; Let r be a set of closed formulas in a first-
and if f is a mapping from a to b in %, then S* order language L which has a designated
is a mapping from a* to b*. Hence a* is a binary predicate symbol P,,. In the following,
mathematical object which greatly resembles we assume that the interpretation p, of P,, by
a. By using this type of resemblance between a !JJl is the equality relation on the universe of
and a* we have the following result. 9Jl for every L-structure W. r is said to
Let H be a Hilbert space over the complex be categorical if all the models of I- are iso-
number field C such that dim(H) = ~0 and let T morphic. By Theorem 3 in Section D, any r
be a bounded linear operator on H. Let X = having a model of infinite cardinality is not
H U C and consider the first-order structure categorical. Hence, there exists no interesting
$331as above. Since R (the set of all real num- r which is categorical. Therefore, we consider
bers) and N (the set of all natural numbers) are the weaker notion of categoricity in powers.
infinite sets which belong to U(X), R* and N* Let K be an infinite cardinal and n(r, K) be the
have elements which do not belong to R and number of nonisomorphic models of r of
N, respectively. Such elements are called non- cardinality K. Then r is said to be categorical
standard real numbers and nonstandard natural in IC if n(T, K) = 1, i.e., if all the models of I- of
numbers, respectively. By the fundamental cardinality K are isomorphic. There exist many

interesting $\Gamma$'s which are categorical in $\kappa$ for some $\kappa$. For example, the set of axioms of algebraically closed fields of characteristic 0 is categorical in $\aleph_1$, and the set of axioms of dense linear orderings without endpoints is categorical in $\aleph_0$. With respect to this notion, J. Łoś conjectured that if $\Gamma$ is categorical in $\kappa$ for some $\kappa > \overline{\overline{L}}$ (the cardinality of $L$), then $\Gamma$ is categorical in $\kappa$ for all $\kappa > \overline{\overline{L}}$. This conjecture was solved affirmatively by M. Morley in the case $\overline{\overline{L}} = \aleph_0$, and later by S. Shelah in the general case. Theorem (1): Let $\Gamma$ be a set of closed formulas in $L$. Then $\Gamma$ is categorical in $\kappa$ for some $\kappa > \overline{\overline{L}}$ if and only if $\Gamma$ is categorical in $\kappa$ for all $\kappa > \overline{\overline{L}}$. We also have the following interesting theorem, due to J. T. Baldwin and A. H. Lachlan. Theorem (2): Let $\Gamma$ be a set of closed formulas in $L$ such that $\overline{\overline{L}} = \aleph_0$. If $\Gamma$ is categorical in $\aleph_1$, then $n(\Gamma, \aleph_0) = 1$ or $\aleph_0$.
As for $n(\Gamma, \kappa)$ there are two famous conjectures due to R. Vaught and others. A set $\Gamma$ of closed formulas in $L$ is called complete in $L$ if it has a model and, for any closed formula $A$ in $L$, either $A \in \Gamma$ or $\neg A \in \Gamma$.

Conjecture 1. There is no complete set $\Gamma$ of closed formulas in $L$ such that $\aleph_0 < n(\Gamma, \aleph_0) < 2^{\aleph_0}$.

Conjecture 2. There is no finite set $\Gamma$ of closed formulas in $L$ such that $n(\Gamma, \aleph_0) = \aleph_0$ and $n(\Gamma, \aleph_1) = 1$.

G. Omitting Type Theorem

By $a_0$-formulas (or $a$-formulas), we mean formulas which have no free variables except $a_0$. Suppose that $\mathfrak{M}$ is an $L$-structure and $\Sigma$ is a set of $a$-formulas in $L$. Then, $\mathfrak{M}$ realizes $\Sigma$ if there is an element $m$ of the universe of $\mathfrak{M}$ such that $\mathfrak{M} \models A[m]$ for all $A$ in $\Sigma$, and $\mathfrak{M}$ omits $\Sigma$ if $\mathfrak{M}$ does not realize $\Sigma$. For example, if $L$ is a first-order language such that $L_4 = \{c_0, c_1\}$, $L_5 = \{f_0, f_1\}$, $L_6 = \{P_0, P_1\}$, where $f_0$, $f_1$, $P_0$, $P_1$ are all binary, and $\mathfrak{N}$ is the $L$-structure $[N; 0, 1; +, \times; =, <]$, where $N$ is the set of natural numbers, and $\Sigma = \{P_1(\bar{n}, a) \mid n = 0, 1, 2, \ldots\}$, where $\bar{0} = c_0$, $\bar{1} = f_0(\bar{0}, c_1)$, ..., $\overline{n+1} = f_0(\bar{n}, c_1)$, ..., then clearly $\mathfrak{N}$ omits $\Sigma$. Also, if $\mathfrak{M}$ is an $L$-structure, $m$ is an element of the universe of $\mathfrak{M}$, and $\Sigma_m = \{A \mid A \text{ is an } a\text{-formula such that } \mathfrak{M} \models A[m]\}$, then clearly $\mathfrak{M}$ realizes $\Sigma_m$, where $\Sigma_m$ is called the type of $m$ in $\mathfrak{M}$. Suppose that $\Gamma$ is a set of closed formulas in $L$. Then, by the completeness theorem of $L$, we can easily see that $\Gamma$ has a model realizing $\Sigma$ if and only if $\Gamma \cup \Sigma$ is consistent. On the other hand, it is rather difficult to obtain a necessary and sufficient condition for $\Gamma$ to have a model omitting $\Sigma$. The following is a sufficient condition: $\Gamma$ is said to locally omit $\Sigma$ if there is no $a$-formula $A$ such that $\Gamma \cup \{A\}$ is consistent and, for any formula $B$ in $\Sigma$, $\Gamma \cup \{A, \neg B\}$ is not consistent.

Theorem. Suppose that $\Gamma$ is a consistent set of closed formulas in a countable language $L$, and $\Sigma$ is a set of $a_0$-formulas in $L$. If $\Gamma$ locally omits $\Sigma$, then $\Gamma$ has a countable model which omits $\Sigma$. Also, if $\Gamma$ has a model of power greater than $\beth_\xi$ omitting $\Sigma$ for each $\xi < \omega_1$, then $\Gamma$ has a model omitting $\Sigma$ in each infinite power, where $\beth_\xi$ is defined by $\beth_0 = \aleph_0$, $\beth_{\xi+1} = 2^{\beth_\xi}$, $\beth_\alpha = \sup_{\xi < \alpha} \beth_\xi$ if $\alpha$ is a limit ordinal.

References

[1] J. Ax and S. Kochen, Diophantine problems over local fields I, Amer. J. Math., 87 (1965), 605-630.
[2] J. Ax and S. Kochen, Diophantine problems over local fields II, Amer. J. Math., 87 (1965), 630-648.
[3] J. Ax and S. Kochen, Diophantine problems over local fields III, Ann. Math., (2) 83 (1966), 437-456.
[4] A. R. Bernstein, Non-standard analysis, Studies in Model Theory 8, M. D. Morley (ed.), Mathematical Association of America, 1973, 35-58.
[5] C. C. Chang and H. J. Keisler, Model theory, North-Holland, 1973.
[6] S. C. Kleene, Introduction to metamathematics, Van Nostrand, 1952.
[7] M. Machover and J. Hirschfeld, Lectures on non-standard analysis, Lecture notes in math. 94, Springer, 1969.
[8] M. Morley, Categoricity in powers, Trans. Amer. Math. Soc., 114 (1965), 514-538.
[9] M. Morley, Omitting types of elements, The Theory of Models, J. W. Addison et al. (eds.), North-Holland, 1965, 265-273.
[10] A. Robinson, Non-standard analysis, North-Holland, 1966.
[11] G. E. Sacks, Saturated model theory, Benjamin, 1972.
[12] S. Shelah, Classification theory, North-Holland, 1978.

277 (III.23)
Modules

A. General Remarks

In this article, we consider mainly modules with operator domain (Section C), in particular modules over a ring. Modules over a

field are linear spaces (- 256 Linear Spaces). and XE M there is associated a unique element
Modules over a commutative ring are impor- ax EM satisfying the condition (1) u(x + y) =
tant in algebraic geometry (- 16 Algebraic ax+uy(u~A;x,y~M),wesaythatAisan
Varieties, 67 Commutative Rings, 284 Noe- operator domain of M and M is a module with
therian Rings). The theory of modules over a operator domain A (module over A or A-
igroup ring can be identified with the theory module) (- 190 Groups E). The mapping
of linear representations of a group (- 362 A x M-M given by (u,x)+ux is called the
Representations). Modules without operator operation of A on M. Any a ∈ A induces an
domain may be regarded as modules over the endomorphism CI~:X+UX of M as a module
ring Z of rational integers, and the theory of (not as an A-module). To give the structure of
finitely generated Abelian groups can be gen- an A-module to a module M amounts to
eralized to the theory of modules over a tprin- giving a mapping A+&(M) (u+uM).
cipal ideal domain (- 52 Categories and If N is a subgroup of an A-module M such
Functors, 200 Homological Algebra). that axE N for any UEA and XE N, then N
forms an A-module, called an A-submodule (or
B. Modules allowed submodule) of M. If (N,},,, is a family
of A-submodules of an A-module M, then the
A module (without operator domain) is a tcom- intersection n2,!EA N, and the sum Clt,, NA are
mutative group M whose law of composition both A-submodules of M.
is written additively: u + b = b + a (a, be M); the Let R be an +equivalence relation in an A-
tidentity element is denoted by 0, and the module M such that if UE A and xRy, then
inverse element of a by --a. Every subgroup H axRay. Then R is said to be compatible with
of M is a normal subgroup. For any a EM, the the operation of A. In this case, an operation
left and right cosets of H containing the ele- of A is induced on the quotient set MJR.
ment a are identical: H + a = u + H (- 190 Moreover, if R is compatible with the addition,
Groups A). namely, xRx’ and yRy’ imply (x+ y)R(x’+y’),
In the set NM of all mappings of a set M to then M/R forms an A-module, called a factor
a module N, we define an addition by the sums A-module of M. The equivalence class N con-
of values: (f+g)(x)=f(x)+g(x). Then NM taining 0 is an A-submodule of M, and M/R
forms a module. The set Hom(A4, N) of all coincides with M/N.
homomorphisms of a module M to a module
N forms a subgroup of the module NM, called
the module of homomorphisms of A4 to N. The D. Modules over a Group or a Ring
composite of homomorphisms is a homomor-
phism. Hence the set Hom(M, M) = B(M) of all If a (multiplicative) group structure is given to
endomorphisms of M forms a +ring with re- the operator domain A of a module M, we
spect to the addition and the multiplication always assume (in addition to condition (1)
defined by composition; this is called the endo- in Section C) that the following two condi-
morphism ring of M. The tunity element of tions are satisfied: (2) (ub)x = u(bx); (3) lx = x
B(M) is the identity mapping of M, and the (a,bEA,xeM).
tinvertible elements of d(M) are the automor- If a ring structure is given to A, we always
phisms of M. assume (besides conditions (1) and (2)) that the
Let {xnlAsA be a family of elements in a following condition holds: (4) (a + b)x = ax + bx
module M. The sum Crenxn is well defined if (a, bE A, XE M). This means that the mapping
x_λ = 0 (λ ∈ Λ) except for a finite number of λ. A → ℰ(M) (a → a_M) is a ring homomorphism. If
For any family {N,},,, of subsets of M, Clsr\ the ring A has unity element 1, and l,,, = iden-
NA denotes the set of all elements of the form tity mapping (namely, condition (3) holds),
c A^e,,xi, (xi E NJ, where xi = 0 except for a then the A-module M is called unitary. We
finite number of i. If all the NA are subgroups consider only unitary A-modules. Any module
ofA4, then N=C le,, NA is also a subgroup, M can be regarded as a Z-module or as an
called the sum of {N,},,,. If every element of B(M)-module.
N can be written uniquely in the form &,,x, When M is a module over a ring A, an
(xig N,), N is called the direct sum of {NA}IE,,. element of A is called a scalar, A itself is called
When the NA are subgroups, this is equivalent the ring of scalars (basic ring or ground ring),
to the condition that NA fl Ci ilrcA N, = {0} and the operation A x M--f M is called the
(AEA). scalar multiplication. The elements ax (a~ A)
are called scalar multiples of x, and the totality
C. Modules with Operator Domain of these elements is denoted by Ax. Let {x~}~~~
be a family of elements in M. An element of
Suppose that we are given a set A and a the form C i.EA a Ax 2, where the a, are elements
module M. If with each pair of elements UE A of A and equal to 0 except for a finite number

of I., is called a linear combination of (x~}~~~. The composite of A-homomorphisms is an A-


The set N of linear combinations of (xl}lsiz is homomorphism.
the smallest A-submodule of M containing all Let f: M-tL be an A-homomorphism of A-
the xi @El\) and is equal to the sum CIE,,AxA. modules. The A-submodule Imf=f(M) of L is
The A-module N is said to be generated by called the image of 5 and the A-submodule
i%>r.*9 and ixJlpA is called a system of gen- Kerf={xIxEM,f(x)=O} of M is called the
erators of N. A module having a finite number kernel of $ Coimf= M/Kerf is called the co-
of generators is said to be finitely generated (of image of 5 and Cokerf= LjImf the cokernel
finite type or simply finite). The module Ax of 1: The binary relation xRy on M defined by
generated by a single element x is called mono- f(x) =f(y) (x, YE M) coincides with the equiva-
mial. If A is a Qield (which may be noncom- lence relation defined by x - y E N = Kerf
mutative), an A-module is a linear space over (x, y E M), and the mapping f induces an A-
A (- 256 Linear Spaces). isomorphism f: M/N +f(M).
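
The statement that f induces an A-isomorphism from M/Ker f onto f(M) can be checked by brute force on a small case. The sketch below assumes A = Z, M = Z/12, L = Z/18, and f(x) = 3x mod 18; these concrete choices and all names in the code are illustrative and do not come from the article.

```python
# Z-module homomorphism f: Z/12 -> Z/18, f(x) = 3x mod 18.
# It is well defined on residues because 3 * 12 = 36 is divisible by 18.
M = range(12)

def f(x):
    return (3 * x) % 18

# Additivity on residues: f(x + y) = f(x) + f(y) in Z/18.
additive = all(f((x + y) % 12) == (f(x) + f(y)) % 18 for x in M for y in M)

# Ker f = {x in M | f(x) = 0} and Im f = f(M).
kernel = [x for x in M if f(x) == 0]
image = sorted({f(x) for x in M})

# The cosets x + Ker f are the elements of M / Ker f.
cosets = {frozenset((x + k) % 12 for k in kernel) for x in M}

# The induced map sends a coset to the common value of f on it.
induced = {c: {f(x) for x in c} for c in cosets}
well_defined = all(len(values) == 1 for values in induced.values())
induced_values = sorted(next(iter(values)) for values in induced.values())
```

In this case `kernel` has two elements and both `cosets` and `image` have six, so the induced map is a bijection from M/Ker f onto Im f.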
Let a be an element of an A-module M. If A sequence of A-homomorphisms of A-
there exists a nonzero divisor /I of A such that modules M, (n E Z)
da = 0, then a is called a torsion element of M.
We say M is a torsion A-module if every ele- . ..-+A4 “I _ tiM n 5;M ?I+, +...
ment of M is a torsion element, and M is
is called an exact sequence if Imf,-, =Kerf,
torsion free if M has no torsion element other
for all n. The A-module {0) is denoted by 0.
than 0. An element a of M is called divisible if
Exactness of O-, N L M or M 5 L+O means
for any nonzero divisor i 6 A there exists an
that the mapping f: N + M is injective or the
element b E M such that a = lb. M is called a
mapping g : M -tL is surjective, respectively. In
divisible A-module if every element of M is
an tinductive (tprojective) system { M,,f,,) of
divisible.
A-modules, where every fA,: M,+ M, i < p
Strictly speaking, the A-modules we have
(A> ,u) is an A-homomorphism, the limit M =
considered so far are called left A-modules. If
1% M,! (l@ M,) is also an A-module. If O-,
we define the operation of A on M by xa
L,+M,pN,+O is exact for every E, and
(a E A, x E M) instead of ax and modify con-
ditions (l)-(4) appropriately (in particular,
condition (2) becomes x(ab)=(xa)b), then M is
called a right A-module. If A0 is a group or
ring anti-isomorphic to A, then a left A-
is a tcommutative diagram, then 0-Q LA-t
module can be naturally identified with a right
IiT M,+l$ N,+O is also exact. For the
A”-module. If A is a commutative group or
ring, we can disregard the distinction between projective limit, however, we can only state the
exactness of O-+limL,+lim M,+lim N,.
left and right A-modules.
The set of all zhomo&rphismyof an A-
Let A and B be groups or rings. Sometimes
module M to an A-module N, denoted by
we consider an A-module structure and a B-
module structure simultaneously on the same Hom,(M, N), is a subgroup of the module
module M. If the operations of A and B com- Hom(M, N), and is called the module of A-
mute with each other, namely, a(bx)= b(ax) homomorphisms. The set Hom,(M, M) = &FA(M)
(ae A, bell, xE M), it is convenient to put one of all A-endomorphisms of an A-module M
of the operations to the right. If M has a left forms a tsubring of the ring g(M) and coin-
A-module structure and a right B-module cides with the set of all elements commuting
with any aM (a~ A). We denote by GL(M) the
structure, satisfying condition (5) (ax)b=a(xb),
group of all tinvertible elements in C&(M). If
then M is called an A-B-bimodule. If G is a
A is a commutative ring, Hom,( M, N) can be
group and K is a commutative ring, the G-K-
regarded as an A-module by defining (as)(x) =
bimodule structure is equivalent to the left
af(x), namely, af= uN o$ In particular, G,(M)
K [G]-module structure, where K [Gj is the
is an ‘associative algebra over A. If M is an
tgroup ring.
A-B-bimodule, Hom,(M, N) forms a left B-
module by (bf)(x) =f(xb). If N is an A-B-
bimodule, Hom,(M, N) forms a right B-
E. Operator Homomorphisms
module by (fb)(x)=f(x)b.
A homomorphism f: M + N of A-modules M
and N such that I = us(x) (a E A, x E M) is F. Direct Products and Direct Sums
called an A-homomorphism (operator homo-
morphism or allowed homomorphism). If A is a In the Cartesian product P = &,, M, of a
ring, f is also called an A-linear mapping. family (MJ,,, of A-modules, we define ad-
Regarding A as an A-module, an A-linear dition and an A-operation as follows: {x~} +
mapping M --* A is called a linear form on M. {yl} = {xn+ yl}, a{~>,} = {axI}. Then P forms
an $A$-module. We call $\prod_{\lambda \in \Lambda} M_\lambda$ the direct product of modules $\{M_\lambda\}_{\lambda \in \Lambda}$. The canonical projection assigning $x_\lambda$ to $\{x_\lambda\}$ is denoted by $p_\lambda: P \to M_\lambda$. Suppose that an $A$-module $M$ and $A$-homomorphisms $f_\lambda: M \to M_\lambda$ ($\lambda \in \Lambda$) are given. Then there exists a unique $A$-homomorphism $f: M \to P$ such that $p_\lambda \circ f = f_\lambda$ ($\lambda \in \Lambda$); $f$ is given by $f(x) = \{f_\lambda(x)\}$.

In the direct product $\prod_{\lambda \in \Lambda} M_\lambda$, the set $S$ of all elements whose components $x_\lambda$ are equal to 0 except for a finite number of $\lambda$ is called the direct sum of modules $\{M_\lambda\}_{\lambda \in \Lambda}$ and is denoted by $\sum_{\lambda \in \Lambda} M_\lambda$ (or $\coprod_{\lambda \in \Lambda} M_\lambda$ or $\bigoplus_{\lambda \in \Lambda} M_\lambda$). The canonical injection assigning $\{\ldots, 0, x_\lambda, 0, \ldots\} \in S$ to $x_\lambda \in M_\lambda$ is denoted by $j_\lambda: M_\lambda \to S$. If an $A$-module $M$ and $A$-homomorphisms $f_\lambda: M_\lambda \to M$ are given, then there exists a unique $A$-homomorphism $f: S \to M$ such that $f \circ j_\lambda = f_\lambda$ ($\lambda \in \Lambda$), defined by $f(\{x_\lambda\}) = \sum_{\lambda \in \Lambda} f_\lambda(x_\lambda)$. When $M$ is an $A$-module and $\{N_\lambda\}_{\lambda \in \Lambda}$ is a family of $A$-submodules of $M$, the $A$-homomorphism $f: \sum_{\lambda \in \Lambda} N_\lambda \to M$ defined by $f(\{x_\lambda\}) = \sum_{\lambda \in \Lambda} x_\lambda$ is an $A$-isomorphism if and only if $M$ is the direct sum of $\{N_\lambda\}$.

If $M_\lambda = M$ for all $\lambda \in \Lambda$, $\prod_{\lambda \in \Lambda} M_\lambda$ and $\sum_{\lambda \in \Lambda} M_\lambda$ are denoted by $M^\Lambda$ and $M^{(\Lambda)}$, respectively. $M^\Lambda$ can be regarded as the set of all mappings of $\Lambda$ to $M$. The direct product $M_1 \times \cdots \times M_n$ and direct sum $M_1 \oplus \cdots \oplus M_n$ of a finite number of $A$-modules $M_1, \ldots, M_n$ can be identified with each other and, if $M_i = M$ ($1 \le i \le n$), we simply denote it by $M^n$.

G. Free Modules

Let $A$ be a ring. A family $\{x_\lambda\}_{\lambda \in \Lambda}$ of elements in an $A$-module $M$ is called linearly independent if $\sum_{\lambda \in \Lambda} a_\lambda x_\lambda = 0$ ($a_\lambda \in A$) implies $a_\lambda = 0$ for all $\lambda \in \Lambda$. This is equivalent to saying that the mapping $A^{(\Lambda)} \to M$ that assigns $\sum_{\lambda \in \Lambda} a_\lambda x_\lambda \in M$ to $\{a_\lambda\}$ is injective. A linearly independent family $\{x_\lambda\}_{\lambda \in \Lambda}$ generating $M$ is called a basis of $M$. A family $\{x_\lambda\}_{\lambda \in \Lambda}$ is a basis if and only if every element of $M$ can be written uniquely in the form $\sum_{\lambda \in \Lambda} a_\lambda x_\lambda$ ($a_\lambda \in A$).

An $A$-module that has a basis is called a free module over $A$. If $A$ is a field (which may be noncommutative), every $A$-module is a free module (→ 256 Linear Spaces). The †cardinality of a basis of a free module $M$ over $A$ depends only on $M$ if $A$ is a field (which may be noncommutative) or a commutative ring; this number is called the rank (or dimension) of $M$. Any submodule of a free module over a †principal ideal domain is a free module.

H. Simple Modules and Semisimple Modules

An $A$-module $M$ is called simple if $M \ne 0$ and $M$ has no $A$-submodules except $M$ and 0. If $M$ and $N$ are simple $A$-modules, any $A$-homomorphism of $M$ to $N$ is an isomorphism or the zero homomorphism (i.e., one which sends every element of $M$ to 0) (Schur's lemma). If an $A$-module $M$ is the sum of a family $\{M_\lambda\}_{\lambda \in \Lambda}$ of simple submodules, $M$ is the direct sum of a suitable subfamily $\{M_\lambda\}_{\lambda \in \Lambda'}$ ($\Lambda' \subset \Lambda$). In this case, $M$ is called semisimple (or completely reducible).

If an $A$-module $M$ can be decomposed into the direct sum of $A$-submodules $N$ and $N'$, then $N'$ is called a complementary submodule of $N$. An $A$-module $M$ is semisimple if and only if every $A$-submodule of $M$ has a complementary submodule. Let $A$ be a ring. Then the $A$-module $A$ is semisimple if and only if every $A$-module is semisimple. In this case $A$ is called a †semisimple ring (→ 368 Rings G). Every simple module over a semisimple ring $A$ is $A$-isomorphic to a †minimal left ideal of $A$.

I. Chain Conditions

The set of all $A$-submodules of an $A$-module $M$ forms an †ordered set under the inclusion relation. An $A$-module is called a Noetherian module if the ordered set satisfies the †maximal condition and an Artinian module if it satisfies the †minimal condition (→ 311 Ordering C).

Let $N$ be an $A$-submodule of an $A$-module $M$. Then $M$ is Noetherian (Artinian) if and only if $N$ and $M/N$ are both Noetherian (Artinian). A ring $A$ is called a †left Noetherian ring (†left Artinian ring) if $A$ is Noetherian (Artinian) as a left $A$-module, and similarly for right Noetherian and Artinian rings. Every finitely generated module over a Noetherian (Artinian) ring is Noetherian (Artinian). Over an arbitrary ring $A$, a module $M$ is Noetherian if and only if every $A$-submodule of $M$ is finitely generated.

A finite sequence $\{M_i\}_{0 \le i \le r}$ of $A$-submodules of an $A$-module $M$ is called a †Jordan-Hölder sequence if $M = M_0$, $M_i \supsetneq M_{i+1}$, $M_r = \{0\}$, and the $M_i/M_{i+1}$ ($0 \le i < r$) are simple. If such a sequence exists, $M$ is said to be of finite length. The number $r$, called the length of $M$, depends only on $M$. The quotient modules $M_i/M_{i+1}$ ($0 \le i < r$) are uniquely determined by $M$ up to $A$-isomorphism and permutation of the indices (C. Jordan and O. Hölder). An $A$-module $M$ is of finite length if and only if $M$ is Noetherian and Artinian. A semisimple $A$-module is of finite length if and only if it is finitely generated.

An $A$-module $M$ is called indecomposable if $M$ cannot be decomposed into the direct sum of two $A$-submodules different from $M$ and $\{0\}$. Any $A$-module of finite length can be decomposed into the direct sum of a finite
sequence $N_1, \ldots, N_n$ of indecomposable $A$-submodules different from $\{0\}$. The direct summands $N_i$ ($1 \le i \le n$) are unique up to $A$-isomorphism and permutation of the indices (W. Krull, R. Remak, and O. Schmidt).

J. Tensor Products

Let $A$ be a ring. Given a right $A$-module $M$ and a left $A$-module $N$, we construct a module $M \otimes_A N$ (called the tensor product of $M$ and $N$) and a canonical mapping $\varphi: M \times N \to M \otimes_A N$ as follows. Let $F$ be a free $\mathbf{Z}$-module (free Abelian additive group) generated by $M \times N$, and $R$ be the subgroup generated by the elements of the forms $(x + x', y) - (x, y) - (x', y)$, $(x, y + y') - (x, y) - (x, y')$, $(xa, y) - (x, ay)$ ($x, x' \in M$, $y, y' \in N$, $a \in A$). We define $M \otimes_A N = F/R$, and call the natural projection $\varphi$. If we denote $\varphi(x, y)$ by $x \otimes y$, then we have $(x_1 + x_2) \otimes y = x_1 \otimes y + x_2 \otimes y$, $x \otimes (y_1 + y_2) = x \otimes y_1 + x \otimes y_2$, and $(xa) \otimes y = x \otimes (ay)$. Any element of $M \otimes_A N$ is written in the form $\sum x_i \otimes y_i$ ($x_i \in M$, $y_i \in N$).

The tensor product $M \otimes_A N$ of $M$ and $N$ and the canonical mapping $\varphi: M \times N \to M \otimes_A N$ can be characterized as follows: For a module $L$, a mapping $f: M \times N \to L$ is called biadditive if the conditions $f(x + x', y) = f(x, y) + f(x', y)$, $f(x, y + y') = f(x, y) + f(x, y')$ hold. A biadditive mapping $f$ satisfying the condition $f(xa, y) = f(x, ay)$ is called an $A$-balanced mapping. Then we have (i) the canonical mapping $\varphi: M \times N \to M \otimes_A N$ is $A$-balanced; and (ii) for any module $L$ and any $A$-balanced mapping $f: M \times N \to L$, there exists a unique homomorphism $f_*: M \otimes_A N \to L$ such that $f(x, y) = f_*(x \otimes y)$ ($x \in M$, $y \in N$).

A right (left) $A$-module can be regarded as a left (right) $A^0$-module, where $A^0$ is the ring anti-isomorphic to $A$. In this sense, we have $M \otimes_A N \cong N \otimes_{A^0} M$.

Let $A$ be a commutative ring. For $A$-modules $M$, $N$, and $L$, a mapping $f: M \times N \to L$ is called a bilinear mapping if $f$ is biadditive and satisfies $f(ax, y) = f(x, ay) = af(x, y)$ ($a \in A$, $x \in M$, $y \in N$). The set $\mathcal{L}(M, N; L)$ of all bilinear mappings $M \times N \to L$ forms an $A$-submodule of the $A$-module $L^{M \times N}$. A bilinear mapping $M \times N \to A$ is called a bilinear form on $M \times N$. The tensor product $M \otimes_A N$ becomes an $A$-module if we define $a(x \otimes y) = (ax) \otimes y$ ($= x \otimes (ay)$), and the canonical mapping $M \times N \to M \otimes_A N$ is bilinear. For any $A$-module $L$ and bilinear mapping $f: M \times N \to L$, there exists a unique $A$-linear mapping $f_*: M \otimes_A N \to L$ satisfying $f(x, y) = f_*(x \otimes y)$. By this correspondence $f \to f_*$, we get an $A$-isomorphism $\mathcal{L}(M, N; L) \cong \operatorname{Hom}_A(M \otimes_A N, L)$. If $A$ is a field, $M \otimes_A N$ coincides with the tensor product $M \otimes N$ as a linear space (→ 256 Linear Spaces H, I).

In general, let $M$ be a $B$-$A$-bimodule and $N$ be a left $A$-module. Then $M \otimes_A N$ becomes a left $B$-module if we define $b(x \otimes y) = (bx) \otimes y$. Let $N$ be an $A$-$B$-bimodule and $M$ be a right $A$-module. Then $M \otimes_A N$ becomes a right $B$-module if we define $(x \otimes y)b = x \otimes (yb)$. In particular, we have $A \otimes_A N \cong N$, $M \otimes_A A \cong M$.

Let $M$, $M'$ be right $A$-modules and $N$, $N'$ be left $A$-modules. For $A$-homomorphisms $f: M \to M'$ and $g: N \to N'$, there exists a unique homomorphism $h: M \otimes_A N \to M' \otimes_A N'$ satisfying $h(x \otimes y) = f(x) \otimes g(y)$; $h$ is called the tensor product of $f$, $g$ and is denoted by $f \otimes g$. We give here some simple examples (also → Section L).

Examples. (1) Let $M$, $N$ be free modules (linear spaces, for example) over a commutative ring $A$. If $\{x_i\}_{i \in I}$ and $\{y_j\}_{j \in J}$ are bases of $M$ and $N$, respectively, $M \otimes_A N$ is also a free module with a basis $\{x_i \otimes y_j\}_{i \in I, j \in J}$. If the dimensions $\dim M$, $\dim N$ are finite, $\dim M \otimes_A N = \dim M \dim N$.

(2) For an †ideal $\mathfrak{a}$ of a commutative ring $A$, the †factor ring $M = A/\mathfrak{a}$ can be regarded as an $A$-module, and we have $M \otimes_A N \cong N/\mathfrak{a}N$. For instance, $(\mathbf{Z}/m\mathbf{Z}) \otimes_{\mathbf{Z}} (\mathbf{Z}/n\mathbf{Z}) \cong \mathbf{Z}/(m, n)\mathbf{Z}$, where $(m, n)$ denotes the greatest common divisor of $m$ and $n$.

K. Hom and $\otimes$

We continue to consider modules over a ring $A$. Concerning the direct sum and product, we have

$$\operatorname{Hom}_A\Bigl(\bigoplus_\lambda M_\lambda, \prod_\mu N_\mu\Bigr) \cong \prod_{(\lambda, \mu)} \operatorname{Hom}_A(M_\lambda, N_\mu)$$

and

$$\Bigl(\bigoplus_\lambda M_\lambda\Bigr) \otimes_A \Bigl(\bigoplus_\mu N_\mu\Bigr) \cong \bigoplus_{(\lambda, \mu)} M_\lambda \otimes_A N_\mu.$$

Concerning projective and inductive limits we have

$$\operatorname{Hom}_A\bigl(\varinjlim M_\lambda, \varprojlim N_\mu\bigr) \cong \varprojlim \operatorname{Hom}_A(M_\lambda, N_\mu)$$

and

$$\bigl(\varinjlim M_\lambda\bigr) \otimes_A \bigl(\varinjlim N_\mu\bigr) \cong \varinjlim \bigl(M_\lambda \otimes_A N_\mu\bigr).$$

An $A$-homomorphism $f: M \to M'$ induces a homomorphism $\operatorname{Hom}_A(M', N) \to \operatorname{Hom}_A(M, N)$ by the assignment $g \to g \circ f$. An exact sequence $M' \to M \to M'' \to 0$ gives rise to the exact sequence

$$0 \to \operatorname{Hom}_A(M'', N) \to \operatorname{Hom}_A(M, N) \to \operatorname{Hom}_A(M', N). \tag{1}$$
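A small worked instance may help to show why sequence (1) stops at $\operatorname{Hom}_A(M', N)$ and carries no final $\to 0$. The modules below ($A = \mathbf{Z}$, multiplication by 2 on $\mathbf{Z}$, and $N = \mathbf{Z}$) are chosen here purely for illustration and do not come from the text.

```latex
% Exact sequence of Z-modules:  M' = Z --(x 2)--> M = Z --> M'' = Z/2Z --> 0.
% Applying Hom_Z(-, N) with N = Z gives sequence (1):
\[
0 \to \operatorname{Hom}_{\mathbf{Z}}(\mathbf{Z}/2\mathbf{Z}, \mathbf{Z})
  \to \operatorname{Hom}_{\mathbf{Z}}(\mathbf{Z}, \mathbf{Z})
  \xrightarrow{\;\times 2\;} \operatorname{Hom}_{\mathbf{Z}}(\mathbf{Z}, \mathbf{Z}),
\qquad\text{i.e.}\qquad
0 \to 0 \to \mathbf{Z} \xrightarrow{\;\times 2\;} \mathbf{Z}.
\]
% The sequence is exact as far as written, but the last map has cokernel Z/2Z,
% so it is not surjective and the sequence cannot be continued by "-> 0".
```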

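Sequence (1) can also be used for small computations. Applied to $\mathbf{Z} \xrightarrow{\times m} \mathbf{Z} \to \mathbf{Z}/m\mathbf{Z} \to 0$, it identifies $\operatorname{Hom}_{\mathbf{Z}}(\mathbf{Z}/m\mathbf{Z}, N)$ with the kernel of multiplication by $m$ on $N$. The sketch below checks this by brute force for $N = \mathbf{Z}/n\mathbf{Z}$, where the kernel has $\gcd(m, n)$ elements; the function names `homs_cyclic` and `check_hom_count` are hypothetical and introduced only for this illustration.

```python
from math import gcd


def homs_cyclic(m: int, n: int) -> list[int]:
    """List the images f(1) of all homomorphisms f: Z/mZ -> Z/nZ.

    A homomorphism is determined by a = f(1), and a residue a is
    admissible exactly when m*a is 0 in Z/nZ, i.e. when a lies in the
    kernel of multiplication by m on Z/nZ.
    """
    return [a for a in range(n) if (m * a) % n == 0]


def check_hom_count(limit: int = 12) -> bool:
    """Check |Hom(Z/mZ, Z/nZ)| == gcd(m, n) for 1 <= m, n <= limit."""
    return all(
        len(homs_cyclic(m, n)) == gcd(m, n)
        for m in range(1, limit + 1)
        for n in range(1, limit + 1)
    )
```

For example, `homs_cyclic(4, 6)` lists the residues `a` with `4*a` divisible by 6, which are the two possible images of the generator.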
An $A$-homomorphism $f: N \to N'$ induces a homomorphism $\operatorname{Hom}_A(M, N) \to \operatorname{Hom}_A(M, N')$ by the assignment $g \to f \circ g$, and an exact sequence $0 \to N' \to N \to N''$ gives rise to the exact sequence

$$0 \to \operatorname{Hom}_A(M, N') \to \operatorname{Hom}_A(M, N) \to \operatorname{Hom}_A(M, N''). \tag{2}$$

Let $M$ be a right $A$-module and $N'$, $N$, $N''$ be left $A$-modules. An $A$-homomorphism $f: N \to N'$ induces the homomorphism $1_M \otimes f: M \otimes_A N \to M \otimes_A N'$, and an exact sequence $N' \to N \to N'' \to 0$ gives rise to the exact sequence

$$M \otimes_A N' \to M \otimes_A N \to M \otimes_A N'' \to 0. \tag{3}$$

Exchanging left and right, we obtain similar results (→ 52 Categories and Functors B; 200 Homological Algebra).

Let $Q$ be an $A$-module. If for any exact sequence of $A$-modules

$$0 \to M' \xrightarrow{\varphi} M \xrightarrow{\psi} M'' \to 0, \tag{4}$$

the induced sequence

$$0 \to \operatorname{Hom}_A(M'', Q) \to \operatorname{Hom}_A(M, Q) \to \operatorname{Hom}_A(M', Q) \to 0 \tag{5}$$

is exact, then $Q$ is called an injective $A$-module. This is equivalent to the condition that if $M'$ is an $A$-submodule of an $A$-module $M$, then any $A$-homomorphism $M' \to Q$ can be extended to an $A$-homomorphism $M \to Q$. Any $A$-module is an $A$-submodule of some injective $A$-module, and any injective $A$-module is a divisible $A$-module. If $A$ is a †Dedekind domain, any divisible $A$-module is an injective $A$-module.

Let $P$ be an $A$-module. If for any exact sequence (4), the induced sequence

$$0 \to \operatorname{Hom}_A(P, M') \to \operatorname{Hom}_A(P, M) \to \operatorname{Hom}_A(P, M'') \to 0 \tag{6}$$

is exact, then $P$ is called a projective $A$-module. This is equivalent to the condition that for any surjective $A$-homomorphism $g: M \to M''$ and any $A$-homomorphism $f: P \to M''$, there exists an $A$-homomorphism $h: P \to M$ satisfying $g \circ h = f$. Any $A$-module is a factor $A$-module of some projective $A$-module. A projective $A$-module has no torsion element other than 0. A free $A$-module is a projective $A$-module. In general, an $A$-module is a projective $A$-module if and only if it is a direct summand of a free $A$-module.

Let $R$ be a right $A$-module. If for any exact sequence (4), the induced sequence

$$0 \to R \otimes_A M' \to R \otimes_A M \to R \otimes_A M'' \to 0 \tag{7}$$

is exact, then $R$ is called a flat $A$-module. Any projective $A$-module is a flat $A$-module. A flat $A$-module $R$ is called faithfully flat if $R \otimes_A M = \{0\}$ implies $M = \{0\}$. A flat right $A$-module $R$ is faithfully flat if and only if $R \ne R\mathfrak{A}$ for any left ideal $\mathfrak{A}$ ($\ne A$) of $A$. Let $A$ be a †principal ideal domain. Then an $A$-module $R$ is flat if and only if $R$ has no torsion element other than 0, and $R$ is faithfully flat if and only if $R$ has no torsion element other than 0 and $R \ne Rp$ for any †prime element $p$ of $A$. We have the following important examples:

(1) For a commutative ring $A$ and its multiplicatively closed subset $S$, the †ring of quotients $A_S$ is flat as an $A$-module. However, $A_S$ is not faithfully flat in general. For instance, the field of rational numbers $\mathbf{Q}$ is not faithfully flat as a $\mathbf{Z}$-module.

(2) Let $A$ be a †semilocal ring and $\hat{A}$ be its completion. Then $\hat{A}$ is faithfully flat as an $A$-module (→ 284 Noetherian Rings; also [1, 7]).

In the exact sequence (4), if $\operatorname{Im} \varphi = \operatorname{Ker} \psi$ is a direct summand of the $A$-module $M$, we say that (4) splits. Then (5), (6), and (7) are exact for any $A$-modules $Q$, $P$, $R$. The exact sequence (4) splits if $M'$ is injective or $M''$ is projective.

By ${}_A M$, $M_A$, and ${}_A M_B$, we mean that $M$ is a left $A$-module, a right $A$-module, and an $A$-$B$-bimodule, respectively. As already stated, ${}_A M_B$ and ${}_A N$ imply ${}_B(\operatorname{Hom}_A(M, N))$, and ${}_A M$ and ${}_A N_B$ imply $(\operatorname{Hom}_A(M, N))_B$. Similarly ${}_B M_A$ and $N_A$ imply $(\operatorname{Hom}_A(M, N))_B$, and $M_A$ and ${}_B N_A$ imply ${}_B(\operatorname{Hom}_A(M, N))$. Furthermore, ${}_B M_A$ and ${}_A N$ imply ${}_B(M \otimes_A N)$, and $M_A$ and ${}_A N_B$ imply $(M \otimes_A N)_B$.

With ${}_B L_A$, ${}_A M$, and ${}_B N$, we have

$$\operatorname{Hom}_A(M, \operatorname{Hom}_B(L, N)) \cong \operatorname{Hom}_B(L \otimes_A M, N). \tag{8}$$

Similarly, for ${}_A M_B$, $L_A$, and $N_B$, we have

$$\operatorname{Hom}_A(L, \operatorname{Hom}_B(M, N)) \cong \operatorname{Hom}_B(L \otimes_A M, N). \tag{8'}$$

If $B$ is a commutative ring, (8) and (8') are $B$-isomorphisms. Furthermore, with $L_A$, ${}_A M_B$, and ${}_B N$, we have

$$(L \otimes_A M) \otimes_B N \cong L \otimes_A (M \otimes_B N). \tag{9}$$

We denote by $M^*$ the set $\operatorname{Hom}_A(M, A)$ of all linear forms on an $A$-module $M$. Then ${}_A M$ implies $M^*_A$, and $M_A$ implies ${}_A M^*$; the $A$-module $M^*$ is called the dual module of $M$. $A$ as a left $A$-module is dual to $A$ as a right $A$-module, and vice versa. For a family of $A$-modules $\{M_\lambda\}_{\lambda \in \Lambda}$, we have a canonical correspondence $(\sum_{\lambda \in \Lambda} M_\lambda)^* \cong \prod_{\lambda \in \Lambda} M_\lambda^*$. From this, we have a canonical isomorphism $(M^*)^* \cong M$ for any finitely generated projective $A$-module $M$. Many facts concerning this †duality are similar to those valid for linear spaces (→ 256 Linear Spaces G).

Let $A$ be a commutative ring. Letting $A = B = N$ in (8) and (8'), we have the canonical $A$-
isomorphisms

$$\operatorname{Hom}_A(M, L^*) \cong \operatorname{Hom}_A(L, M^*) \cong (L \otimes_A M)^* = \mathcal{L}(L, M; A),$$

namely, any bilinear form on $L \times M$ is represented by a linear mapping $M \to L^*$ or $L \to M^*$.

L. Extension and Restriction of a Basic Ring

Fix a ring homomorphism $\rho: A \to B$. We regard $B$ as a $B$-$A$-bimodule by defining a right operation of $A$ on $B$ by $b \cdot a = b\rho(a)$ ($a \in A$, $b \in B$). This bimodule is denoted by $B_\rho$.

For every left $A$-module $M$, we construct the left $B$-module $\rho^*(M) = B_\rho \otimes_A M$, which is called the scalar extension of $M$ by $\rho$. Every $A$-homomorphism of $A$-modules $f: M \to M'$ induces the $B$-homomorphism $\rho^*(f) = 1_B \otimes f: \rho^*(M) \to \rho^*(M')$.

For every left $B$-module $N$, we construct the left $A$-module $\rho_*(N) = \operatorname{Hom}_B(B_\rho, N)$, which is called the scalar restriction (or scalar change) of $N$ by $\rho$. By the assignment $h \to h(1)$, we have a module isomorphism $\operatorname{Hom}_B(B_\rho, N) \cong N$. We identify $\rho_*(N)$ with $N$ under this isomorphism. The operation of $A$ on $N$ is then given as $a \cdot y = \rho(a)y$ ($a \in A$, $y \in N$). If $A$ is a subring of $B$ and $\rho$ is the canonical injection, then the operation of $A$ on $\rho_*(N)$ is the restriction of the operation of $B$ on $N$. Every $B$-homomorphism of $B$-modules $f: N \to N'$ induces the $A$-homomorphism $\rho_*(f): \rho_*(N) \to \rho_*(N')$. For any left $A$-module $M$ and left $B$-module $N$, an $A$-linear mapping $f: M \to \rho_*(N) = N$ is called a semilinear mapping with respect to $\rho$. This means that $f$ is an additive homomorphism satisfying $f(ax) = \rho(a)f(x)$ ($a \in A$, $x \in M$).

The extension and the restriction of a basic ring are related by the canonical isomorphism $\operatorname{Hom}_A(M, \rho_*(N)) \cong \operatorname{Hom}_B(\rho^*(M), N)$ for an $A$-module $M$ and a $B$-module $N$ (→ equation (8')). An element $\alpha$ of the left-hand side and an element $\beta$ of the right-hand side are associated by the relation $\alpha(x) = \beta(1 \otimes x)$ ($x \in M$) (→ 52 Categories and Functors).

Let $A$ and $B$ be commutative rings. Then for $A$-modules $M$ and $M'$, we have the canonical $B$-linear mapping $\rho^*: B \otimes_A \operatorname{Hom}_A(M, M') \to \operatorname{Hom}_B(B \otimes_A M, B \otimes_A M')$, which is a $B$-isomorphism if $M$ is a finitely generated free (or more generally, projective) $A$-module. Using the notation $\rho^*$, we have $\rho^*(\operatorname{Hom}_A(M, M')) \cong \operatorname{Hom}_B(\rho^*(M), \rho^*(M'))$.

We now give some examples where the basic rings are noncommutative. Let $G$ be a group and $H$ its subgroup. Let $\rho$ denote the homomorphism of group rings $K[H] \to K[G]$ induced by the canonical injection $H \to G$, where $K$ is a commutative ring. For any $K[H]$-module $M$, $\rho^*(M)$ is called the induced module of $M$. The representation of $G$ associated with $\rho^*(M)$ is the †induced representation of the representation of $H$ associated with $M$. Next, we fix a group $G$ and consider a homomorphism $\rho: K[G] \to \bar{K}[G]$ induced by a homomorphism of commutative rings $\sigma: K \to \bar{K}$. If $\bar{K} = K$ and $\sigma$ is an automorphism, then the representation associated with the "scalar extension" $\rho^*(M)$ of a $K[G]$-module $M$ is the †conjugate representation to the representation associated with $M$. If $\bar{K} = K/\mathfrak{A}$ ($\mathfrak{A}$ is an ideal of $K$) and $\sigma$ is the canonical projection, then the representation over $\bar{K}$ associated with the scalar extension $\rho^*(M)$ of a $K[G]$-module $M$ is the reduction modulo $\mathfrak{A}$ of the representation over $K$ associated with $M$, and $\rho^*(M)$ is canonically isomorphic to $M/\mathfrak{A}M$. Furthermore, the †localization and the †completion can also be treated under the formulation of scalar extension (→ 67 Commutative Rings G, 284 Noetherian Rings B).

References

[1] M. Nagata, Local rings, Interscience, 1962.
[2] H. P. Cartan and S. Eilenberg, Homological algebra, Princeton Univ. Press, 1956.
[3] D. G. Northcott, An introduction to homological algebra, Cambridge Univ. Press, 1960.
[4] S. MacLane, Homology, Springer, 1963.
[5] C. Chevalley, Fundamental concepts of algebra, Academic Press, 1956.
[6] N. Bourbaki, Eléments de mathématique, Algèbre, ch. 2, Actualités Sci. Ind., 1236b, Hermann, third edition, 1962.
[7] N. Bourbaki, Eléments de mathématique, Algèbre commutative, ch. 1, 2, Actualités Sci. Ind., 1290a, Hermann, 1961.
[8] C. W. Curtis and I. Reiner, Representation theory of finite groups and associative algebras, Interscience, 1962.
[9] R. Godement, Cours d'algèbre, Hermann, 1963.
[10] S. Lang, Algebra, Addison-Wesley, 1965.
[11] S. T. Hu, Elements of modern algebra, Holden-Day, 1965.

278 (XIII.23)
Monge-Ampère Equations

A. Monge-Ampère Equations

A Monge-Ampère differential equation is a second-order partial differential equation of the form

$$Hr + 2Ks + Lt + M + N(rt - s^2) = 0, \tag{1}$$
where $H$, $K$, $L$, $M$, and $N$ are functions of $x$, $y$, $z$, $p$, and $q$, and $r$, $s$, $t$, $p$, and $q$ represent the partial derivatives

$$r = \frac{\partial^2 z}{\partial x^2}, \quad s = \frac{\partial^2 z}{\partial x\,\partial y}, \quad t = \frac{\partial^2 z}{\partial y^2}, \quad p = \frac{\partial z}{\partial x}, \quad q = \frac{\partial z}{\partial y}.$$

The characteristic manifolds are integrals of a system of differential equations defined as follows:

Case (i) $N \ne 0$.

$$N\,dp + L\,dx + \lambda_1\,dy = 0, \quad N\,dq + \lambda_2\,dx + H\,dy = 0, \quad dz - p\,dx - q\,dy = 0, \tag{2}$$

$$N\,dp + L\,dx + \lambda_2\,dy = 0, \quad N\,dq + \lambda_1\,dx + H\,dy = 0, \quad dz - p\,dx - q\,dy = 0, \tag{3}$$

where $\lambda_1$ and $\lambda_2$ are the two roots of the equation $\lambda^2 + 2K\lambda + HL - MN = 0$.

Case (ii) $N = 0$, $H \ne 0$.

$$dy = \lambda_1\,dx, \quad H\,dp + H\lambda_2\,dq + M\,dx = 0, \quad dz - p\,dx - q\,dy = 0, \tag{4}$$

$$dy = \lambda_2\,dx, \quad H\,dp + H\lambda_1\,dq + M\,dx = 0, \quad dz - p\,dx - q\,dy = 0, \tag{5}$$

where $\lambda_1$ and $\lambda_2$ are the two roots of the equation $H\lambda^2 - 2K\lambda + L = 0$.

Case (iii) $N = 0$, $H = 0$, $L \ne 0$.

$$dx = 0, \quad M\,dy + 2K\,dp + L\,dq = 0, \quad dz - p\,dx - q\,dy = 0, \tag{6}$$

$$2K\,dy - L\,dx = 0, \quad M\,dy + L\,dq = 0, \quad dz - p\,dx - q\,dy = 0. \tag{7}$$

Case (iv) $N = 0$, $H = 0$, $L = 0$.

$$dx = 0, \quad 2K\,dp + M\,dy = 0, \quad dz - p\,dx - q\,dy = 0, \tag{8}$$

$$dy = 0, \quad 2K\,dq + M\,dx = 0, \quad dz - p\,dx - q\,dy = 0. \tag{9}$$

A manifold $x(\lambda)$, $y(\lambda)$, $z(\lambda)$, $p(\lambda)$, $q(\lambda)$ that satisfies the system (2) or (3) of differential equations for case (i), (4) or (5) for case (ii), (6) or (7) for case (iii), or (8) or (9) for case (iv) is a characteristic manifold of equation (1).

The following result is known concerning Monge-Ampère equations: The union of surface elements of an integral surface of (1) is generated in two ways by characteristic manifolds depending on one parameter. Conversely, if a manifold, each element of which is composed of a point of a surface $S$ and the tangent plane at that point, is generated by a family of characteristic manifolds depending on one parameter, then the surface $S$ is an integral surface of (1).

B. Intermediate Integrals

If a relation $dV(x, y, z, p, q) = 0$ is a consequence of a system of differential equations of characteristic manifolds, for example, in case (i) when $V$ satisfies

$$N\left(\frac{\partial V}{\partial x} + p\frac{\partial V}{\partial z}\right) - L\frac{\partial V}{\partial p} - \lambda_2\frac{\partial V}{\partial q} = 0, \qquad N\left(\frac{\partial V}{\partial y} + q\frac{\partial V}{\partial z}\right) - \lambda_1\frac{\partial V}{\partial p} - H\frac{\partial V}{\partial q} = 0,$$

$V(x, y, z, p, q) = c$ ($c$ an arbitrary constant) is called an integral of the system of differential equations. (i) If $V = c$ is an integral of a system of differential equations of characteristic manifolds, the solution $z(x, y)$ of $V = c$ considered anew as a partial differential equation of the first order is a solution of (1). Conversely, if every solution of $V = c$ (excepting †singular ones) satisfies equation (1), $V = c$ is an integral of a system of differential equations of characteristic manifolds. (ii) If $u(x, y, z, p, q)$ and $v(x, y, z, p, q)$ are two integrals of a system of differential equations of characteristic manifolds, then for any arbitrary function $\varphi$, $\varphi(u, v)$ is also an integral. Thus, as a consequence of (i), every solution $z(x, y)$ of $\varphi(u, v) = 0$ is also a solution of (1). The converse is also true, namely, for every solution $z$ of (1) we can find a function $\varphi$ such that $z(x, y)$ is also a solution of $\varphi(u, v) = 0$. The relation $\varphi(u, v) = 0$ is called an intermediate integral of (1). Sometimes an integral of a system of differential equations of characteristic manifolds is also called an intermediate integral. If each of the two systems of differential equations defining the characteristic manifolds has an intermediate integral, then the two intermediate integrals $\varphi(u, v) = 0$ and $\psi(u, v) = 0$ form a †complete system of partial differential equations of the first order. Integrating this complete system, the †general solution of equation (1) is obtained.
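The role of the quadratic $\lambda^2 + 2K\lambda + HL - MN = 0$ in case (i) can be checked by a direct expansion. The following is a sketch of that computation, using only the symbols of equation (1) together with $dp = r\,dx + s\,dy$ and $dq = s\,dx + t\,dy$ on a surface $z = z(x, y)$; the arrangement of the argument is an illustration and is not taken from the text.

```latex
\begin{align*}
(Nr+L)(Nt+H) - (Ns+\lambda_1)(Ns+\lambda_2)
  &= N\bigl[N(rt-s^2) + Hr + Lt - (\lambda_1+\lambda_2)s\bigr] + HL - \lambda_1\lambda_2 \\
  &= N\bigl[Hr + 2Ks + Lt + M + N(rt-s^2)\bigr],
\end{align*}
% using \lambda_1 + \lambda_2 = -2K and \lambda_1\lambda_2 = HL - MN.
% On the surface,
%   N dp + L dx + \lambda_1 dy = (Nr+L) dx + (Ns+\lambda_1) dy,
%   N dq + \lambda_2 dx + H dy = (Ns+\lambda_2) dx + (Nt+H) dy,
% so the two linear forms vanish for a common direction (dx, dy) \neq (0, 0)
% exactly when the determinant above is zero, that is (for N \neq 0) when (1) holds.
```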

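The characteristic systems of case (ii) of the Monge-Ampère equation ($N = 0$, $H \ne 0$) can be spot-checked with exact rational arithmetic: along $dy = \lambda_1\,dx$ on a surface one has $dp = (r + s\lambda_1)\,dx$ and $dq = (s + t\lambda_1)\,dx$, and the form $H\,dp + H\lambda_2\,dq + M\,dx$ should reduce to $(Hr + 2Ks + Lt + M)\,dx$. The sketch below assumes the coefficients are given as `Fraction` values; the function names are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction


def coefficients_case_ii(H, lam1, lam2):
    """Return (K, L) such that lam1, lam2 are the roots of H*x**2 - 2*K*x + L = 0."""
    K = H * (lam1 + lam2) / 2
    L = H * lam1 * lam2
    return K, L


def equation_lhs(H, K, L, M, r, s, t):
    """Left side of the Monge-Ampere equation when N = 0: H r + 2 K s + L t + M."""
    return H * r + 2 * K * s + L * t + M


def characteristic_form(H, M, lam1, lam2, r, s, t):
    """Coefficient of dx in H dp + H*lam2 dq + M dx along dy = lam1 dx.

    Uses dp = (r + s*lam1) dx and dq = (s + t*lam1) dx on a surface z(x, y).
    """
    dp = r + s * lam1
    dq = s + t * lam1
    return H * dp + H * lam2 * dq + M
```

With any choice of rational inputs, the two functions `equation_lhs` and `characteristic_form` should return the same value once $K$ and $L$ are obtained from `coefficients_case_ii`, so the characteristic form vanishes exactly when the equation holds.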
279 (VII.16)
Morse Theory

A. General Remarks

We are interested in smooth functions on a smooth manifold of dimension $m$ that have only the simplest critical points. Here, "smooth" means differentiable of class $C^\infty$. Such a function $f: M \to \mathbf{R}$ enables us to investigate the topology of $M$. The decomposition of $M$ into level sets of $f$ contains a lot of topological information on $M$. For instance, it is shown that $M$ is †homotopy equivalent to a †CW complex that is determined by critical points of $f$; furthermore the †Euler characteristic of $M$ can be computed by means of $f$, because the critical points are related to the †homology groups of $M$. These types of investigations, called Morse theory, were originated by H. Poincaré [1] and G. D. Birkhoff [2] and then developed into the form we see today by M. Morse [3]. An excellent exposition of this theory has been given by J. Milnor [4]. R. S. Palais [5] has extended the theory to Hilbert manifolds. Morse theory has been fruitfully applied to differential topology and differential geometry.

B. Critical Points of Functions on Manifolds

For a point $p \in M$, $M_p$ is by definition the tangent space to $M$ at $p$. Let $f$ be a smooth real-valued function on $M$. A point $p \in M$ is called a critical point of $f$ if the induced mapping $f_*: M_p \to \mathbf{R}_{f(p)}$ is zero at $p$. For every †local chart $(x_1, \ldots, x_m)$ around a critical point $p$ of $f$ with $p = (0, \ldots, 0)$,

$$\frac{\partial f}{\partial x_1}(0) = \cdots = \frac{\partial f}{\partial x_m}(0) = 0.$$

The value $f(p)$ is then called a critical value of $f$. If the matrix $(\partial^2 f/\partial x_i\,\partial x_j(0))$ is invertible, then the point is called a nondegenerate critical point, and if the matrix is not invertible, then the point is said to be degenerate. These notions are independent of the choice of local charts around $p$. The above matrix is called the Hessian of $f$ at $p$. The nullity and index of the Hessian of $f$ at $p$ are called the nullity and index of $f$ at $p$, respectively. The function $f$ has a local minimum at a nondegenerate critical point of index 0, and a local maximum at a nondegenerate critical point of index $m$.

A smooth function $f$ on $M$ is called a Morse function if it satisfies: (A1) Every critical point of $f$ is nondegenerate. The Morse lemma states that if $p$ is a nondegenerate critical point of $f$ and if the index of $f$ at $p$ is $\lambda$, then there exists a local chart $(y_1, \ldots, y_m)$ around $p$ with $p = (0, \ldots, 0)$ such that $f$ is expressed as

$$f = f(p) - (y_1)^2 - \cdots - (y_\lambda)^2 + (y_{\lambda+1})^2 + \cdots + (y_m)^2.$$

C. The Existence of Morse Functions

Let $M$ be a compact smooth $m$-dimensional manifold without boundary. The existence of Morse functions on $M$ is shown by †embedding $M$ into $\mathbf{R}^n$ for a sufficiently large $n$. For convenience, $M$ will be here identified with its embedded image. At each point $p \in M$ the set $\nu_p$ of all unit normals forms an $(n - m - 1)$-dimensional unit sphere, and $\nu(M) = \bigcup_{p \in M} \nu_p$ is a smooth $(n - 1)$-dimensional compact manifold. Let $S$ be the unit hypersphere in $\mathbf{R}^n$ around the origin. The Gauss normal mapping $\varphi: \nu(M) \to S$ sends each point $(p, v(p)) \in \nu(M)$ to $v(p) \in S$ via a canonical parallel translation of $\mathbf{R}^n$. Since $\varphi$ is smooth, the †Sard theorem implies that the set $E$ of all critical values of $\varphi$ has measure zero on $S$. Thus $\varphi_*$ at every $(p, v(p)) \in \varphi^{-1}(u)$ has maximal rank if $u$ is taken in $S - E$. For every fixed $u \in S - E$ let $h_u: M \to \mathbf{R}$ be defined by $h_u(x) = \langle u, x \rangle$, $x \in M$, where $\langle\ ,\ \rangle$ denotes the canonical inner product of $\mathbf{R}^n$. Then $h_u$ is a Morse function. To see this let $d\Sigma$, $d\nu$, and $dM$ be volume elements of $S$, $\nu_x$, and $M$, respectively. Then $d\nu \wedge dM$ is a volume element of $\nu(M)$, and the function $G: \nu(M) \to \mathbf{R}$ is obtained by the relation $\varphi^* d\Sigma = G(x, v(x))\,d\nu \wedge dM$. $G(x, v(x))$ is called the Lipschitz-Killing curvature at $v(x) \in \nu_x$. If $A(v(x))$ denotes the †second fundamental form of $M$ with respect to $v(x)$, then $G(x, v(x)) = (-1)^m \det(A(v(x)))$ (Chern and Lashof [6]). Every critical point $p \in M$ of $h_u$ for $u \in S - E$ has a normal $v(p)$ to $M$ at $p$, and the Hessian of $h_u$ is given by $A(v(p))$. Thus $G(p, v(p)) \ne 0$ ensures the nondegeneracy at $p$ [6].

Another approach to the construction of Morse functions is to find for a given embedding of $M$ into $\mathbf{R}^n$ an open dense set $U \subset \mathbf{R}^n$ such that the Euclidean distance $d_x$ from any fixed point $x$ of $U$ to points on $M$ has no degenerate critical points on $M$. In this case $d_x^{-1}((-\infty, a])$ is compact for all $a \in \mathbf{R}$. It is also possible to construct Morse functions on compact manifolds with boundary as well as on noncompact manifolds. A striking result of A. Phillips states that on every noncompact manifold there exists a Morse function that has no critical points [7].

For a pair of smooth manifolds $M$, $N$ let $C^\infty(M, N)$ be the set of all smooth mappings from $M$ to $N$. The weak (or compact-open) $C^\infty$
topology on Cm(M, N) is generated by the sets M” we have the following: (3) If c~f(M) is a
defined as follows. Let ~EC~(M, N), and let critical value off with points p1 , , pr with pi
(Il, cp) and (V, $) be charts of M and N, respec- having index ii and if M,‘i: is compact for F.> 0
tively. Let Kc LJ be a compact set such that and contains no critical points other than
S(K) c I/. For E E (0, a), a weak tsubbasic p,, ,pk, then A4”’ has the same homotopy
neighborhoodNm(f;(U,cp),(V,$),K,~)off typeasM’-“Ue”lU...Ue”k,wheree’l(j=
is defined to be the set of all smooth map- 1, . , k) are Aj-cells. (4) Assume that a Morse
pings ~E?(M,N) such that (1) g(K)c V, (2) function f on a (not necessarily compact) M
IIDk(cpo.foK’) (x)-Dk(cPogoti-l) WI <E> satisfies: (A2) M” is compact for all a~ R. Then
x E q(K), k = 1,2, . , where Dk denotes the kth by choosing a sequence a, < a2 < . . of regular
derivative with respect to the coordinates. values of ,f and applying (3) to each Mz:+‘, we
The weak C” topology is thus defined on see that M has the same homotopy type as
Ca(M, N), and the space with this topology is that of a CW complex [, where the number of
denoted by Cz(M, N). It follows from the /l-cells belonging to [ is the same as that of
construction of the embedding in the Whitney critical points of index 1 off:
embedding theorem that the set of all smooth Moreover, we have the following Morse
embeddings of compact M into R” is dense in inequalities: (5) Let S be a Morse function on a
Cz(M, R”) if n > 2dim M. Moreover, the set of compact M. If M, is the number of critical
all Morse functions on a compact M forms points off on M of index i., and if R, is a i-
an open dense set in Cz(M, R) (Nagano [S], dimensional +Betti number of M, then
Hirsh [9], Auslander and MacKenzie [lo]).
1 M,>R,,
M,-M,>RR,-R,,
D. Decomposition of M by a Morse Function M,-MI-,+.,.+(-l)“M,,

To decompose M into levels of a Morse func- bR,-R,m,+...+(-l)“R,, 1 <I<n-1,


tion we now consider the process of tattaching
M,-M,-,+...+(-l)“M,
a handle. Let M be a compact manifold with
tboundary 8M. Let D” be the s-disk, and let =R,-RR,_,+...+(-l)“R,.
g : (8Ds) x D”-“+dM be an ternbedding. Then a
In particular, we have Mk > R, for all k. By
tmanifold X( M; g; s) with a handle attached by
using these facts S. Smale obtained an affrma-
y is defined as the quotient set obtained from
tive solution of the +Poincare conjecture in
the disjoint union M U (D” x D”-‘) by identify-
high dimensions [ 11,131.
ing points in dDs x D”-” and their images
The concept of critical manifolds in the
under y and equipped with a natural differenti-
sense of Bott [14] is stated as follows: Let M
able structure. Similarly, if g,:(aDiL) x D~-“~+
be a compact manifold embedded into an
8M (i= 1, , k) are embeddings with disjoint
open set U c Rk. Let f: U +R be a smooth
images, we can define the thandlebody X(M;
function. M is called a nondegenerate critical
$71,. . . . 9GSI>...>Sk)Clll. manifold off on U if (1) all points of M are
For an ,f6Ca(M, R) and for u, bEf(M) with
critical points off and (2) the nullity of all
a<b let M”={x~Mjf(x)<a) and M,“=
XE M is equal to dim M. When M is such a
{.x E M ) a <S(x) < b}. M,” is called a level set
nondegenerate critical manifold off on U, fis
of 1: For a Morse function f on a compact
constant on M, and the index 1 of S at x is
manifold M without boundary, the following
well defined and is the same at all points on
fundamental facts are known [4]: (1) The first
M. The +Poincare polynomial is expressed as
fundamental theorem. If Mi contains no crit-
P(M; t)=C tkdim Hk(M). The Morse poly-
ical points, then M,b is diffeomorphic to M,” x
nomial is defined by YJI(f; t) = z’N t”~p(N; t),
[a, b]; (2) The second fundamental theorem.
where N runs over all critical manifolds off
Let CE(U, b) be a unique critical value in [a, b],
and I,, is the index of N. Under certain con-
and let p, , , ply M: be critical points off
ditions for orientability along each nondegen-
with pi having index li. Then Mb is diffeo-
erate critical manifold of J we have again the
morphic to X(M”; f,, . . ,fk, 1,, , A,) for
Morse inequality
suitable embeddings f;, , fk. It follows from
the existence of Morse functions and from (1)
and (2) that every compact manifold can be
obtained by successively attaching handles to
a disk (→ 114 Differential Topology).
E. Morse Theory on Hilbert Manifolds
Furthermore, if M admits a Morse function with
only two critical points, then M is homeo- Let M be a +Hilbert manifold, and let f:M-tR
morphic to a sphere (Reeb [ 121). be a smooth function, The Hessian a’f, off at
Concerning the homotopy types of M and a critical point of j’ is a symmetric bilinear
1051 279 F
Morse Theory

form on M, given by lim ,-m c(t) exists such that the limit point c(a)
= lim,,, BE Mb is a critical point of J These
facts imply that the first fundamental theorem
u,=M,, of Morse theory holds if M,” does not contain
any critical point of J and also the second
where cp is a coordinate mapping of a local fundamental theorem holds if M,” has only
chart around p. The index (coindex) off at a nondegenerate critical points on Mf for some
critical point PE M off is defined to be the ce(a, b). Furthermore, assume that f: M-R
supremum of dimensions of subspaces of and M satisfies (Al), (B), and (C), and let a,
M, on which a’f, is negative (positive) defi- bE f(M) be regular values with a < b. If for
nite. The self-adjoint bounded operator A each nonnegative integer i, R, is the /Ith Betti
which represents a2f, is given by 8f,(u, v) = number of M,“, and if M, is the number of
(u, A(u)),. If A is invertible, then ,f is said to critical points of index J” of .f in M,“, then the
be nondegenerate at p. These notions do not Morse inequality holds:
depend on the choice of local charts around p.
The Morse lemma has been generalized to M,>R,,
a Riemannian Hilbert manifold M by R. M,-Mo>R,-Ro,
Palais and S. Smale [ 153 as follows. Let f be a
“‘>

smooth function defined on M. If XE M is a


nondegenerate critical point off then there k&-l)“-kM,Z i (-l)“-kR,,
exist a chart cp around x and a projection P k=O

such that for y near x . ..)


* m
fM=fW+ llWY)l12- ll(~-wP(Y)l12. k;oo(-l)kIMk= f (-l)kR,.
k=O
The above result has been extended to critical
manifolds [16]. A connected submanifold N of In particular, M, > R j, holds for all 1,. If f is
a Hilbert manifold M is called a nondegenerate bounded below, then the ith Betti number RT
critical manifold off: M +R if (1) every point of Mb and the number M,* of all critical points
PE M is a critical point of .f and (2) for each off with index i in Mb satisfy M,* > R: for
pe M there exists a closed subspace E, of M,
all 1..
such that M, = N, @ E, and the Hessian form
restricted to E, is nondegenerate. Let v =
F. Morse Theory of Path Spaces
(n, E, N) be a smooth Hilbert-space bundle
over a compact connected manifold N, and let
Morse theory on Hilbert manifolds applies to
( , ) be a Riemannian structure for v. Assume
the energy functions on Riemannian Hilbert
that a smooth function S: E+R has the zero
manifolds which consist of all HI-curves on a
section of v as a nondegenerate critical mani-
compact Riemannian manifold M, and the
fold. Then there exist a tubular neighborhood
theory is useful for proving the existence of
v,={uEE 1 Ilull <E} of the zero section in v and
closed geodesics on M.
a fiber-preserving diffeomorphism $: v~+$(v,)
Let M be a smooth manifold, and let I =
and an orthogonal bundle projection P such
[O, 11. Let H,(Z, M) be the set of all continuous
that for VE v,
curves 0: I+ M such that for each local chart
fo$(u)=f(N)+ /lPvl/*- II(~-P)412. (V, cp) of M, cpo 0 is absolutely continuous and
ll(qoa)‘// is locally square integrable. In par-
Let us assume that (B) M is a complete ticular, if M = R”, then H,(I, R”) is a Hilbert
Riemannian manifold, and (C) (Palais-Smale space with the inner product
condition) if ,f is bounded on a set S c M and if
the norm IiVf 11of the tgradient vector off
has infimum 0 on S, then there is a critical
point off on the closure S of S. Since M is not
0, P E H, (1, W.
locally compact, condition (C) is required. In
order to prove the first and second funda- For each 0 E H, (I, M), set H, (I, M), = {X E
mental theorems of Morse theory it is neces- H,(I, TM)IX(~)EM,(,),~EI}, where TM is by
sary that the integral curves of Vf exist, and definition the ttangent bundle over M. For any
(C) ensures their existence. Namely, under fixed pair of points p, q~ M, let Q( M; p, q) =
conditions (Al), (B), and (C), one of the follow- {~JE H,(I, M)Irr(O)=p,(~(l)=q}, and for each
ing two facts follows for any bEf(M) and for ~EWM;P, 4, let Q(M;p, &= {XEH~U, W,I
any regular point pi M”: (1) The integral curve X(O)=OEM~,X(I)=OEM~}. Then H,(I, M),
c: [0, r] + Mb of Of with c(0) = p exists, and forms a vector space, and Q(M; p, q), is a sub-
,fo c(r) = b holds for some rE [0, m); (2) the space of H, (I, M),. M can be embedded into
integral curve c: [0, a)+ M” of V’f exists and R” for sufficiently large n by the Whitney

embedding theorem [17]. Then we have that t( 1U x [t,-, , ti] is differentiable for all i =
thefollowingfacts [17]:(l)H,(1,M)={a~ l,..., k;(3)~(u,,u,,O)=p,cc(u,,u,,l)=qfor
H,(I,R”)~~J(I)~M}, (2) H,(Z,M) is a closed all (u,,u,)~U; and (4) &x(0,0, t)/au, = W,(t),
submanifold of H, (I, R”), (3) R(A4; p, q) is a &x(0,0, t)/&, = W2(t), tgl. Then the Hessian
closed submanifold of H, (I, M), (4) the tangent E,, of E at 0 is given by
space to H, (I, M) at each point c E H, (I, M) is
E (w w)_l~2mw42))
H, (I, M),, (5) the tangent space to R(M; p, q) at ** 13 2
2 C3U,i3U2 WJ,’
each point UER(M; p, q) is Q(M; p, q),. Here
M is identified with the embedded image in where Z(u,, u,)~s1 is by definition the curve
R”. Thus H, (I, M) and R(M; p, q) carry the cC(u,, u2)(t)= GL(U~,u2, t). The second variation
structure of a Hilbert manifold, and they are formula (- 178 Geodesics A) then gives
independent of the choice of embeddings [17].
&AW,>W,)= -~<K(t)>AL,F’(tD
The energy functions on H, (I, M) and
R(M; p, q) play important roles in the develop-
- (W,, W;+R(W,,a’)o’)dt,
ment of Morse theory. M is now assumed to sI
be a complete Riemannian manifold. It follows
from the Nash isometric embedding theorem where A,W{(t)= W;(t+O)- W{(t-0). We have
the Morse index theorem, which states: (1) The
(- 365 Riemannian Submanifolds B) that
M is isometrically embedded into R”’ for a +null space of E at a critical point g is the
linear space spanned by all Jacobi fields (-
sufficiently large n’. H,(I, R”‘) carries a com-
178 Geodesics A) along 0 that vanish at 0
plete Riemannian metric in a natural way,
and 1; (2) if {a(s,),r~(s,), . . ..cr(s.)} (O<s, <s2
and H, (I, M) becomes a complete Riemann-
< < si < 1) is the set of all points conjugate
ian manifold with the metric induced from
to a(O) along 0 and if /zi is the multiplicity of
H, (I, RN’). Similarly Q(M; p, q) admits a com-
the conjugate point c(si), then the index of E at
plete Riemannian metric. Then the energy
0 is equal to 1, + + 1,. It follows from the
function E: H,(I, M)+R is defined to be
Sard theorem together with the differentia-
E(u):=; (r~‘,cr’)dt, bility of the exponential mapping on M that
sI except for a set of measure zero in M x M, p, q
can be chosen so that p, q is not a conjugate
where ( , ) is the inner product induced from
the Riemannian structure of M. Then the pair along any geodesic in s1. Then all critical
points of E are nondegenerate, and for any
Palais-Smale condition (C) is satisfied for E
on H,(I, M). That is, if a sequence {Q} on c > 0, R’ = {w E R 1E(w) = c} contains at most
finitely many nondegenerate critical points
H,(I, M) satisfies: (1) {E(Q)} is bounded above;
and (2) VE(o,)+O as k-co, then there is a with finite indices.
subsequence {ok.} of {Q} such that {Q} con-
verges to a critical point 0~ H, (I, M) of E.
Also, condition (C) is fulfilled for the energy
G. Existence of Closed Geodesics
function on R(M; p, q).
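
As a numerical illustration of the energy function E defined above, the following minimal sketch approximates E(σ) = (1/2)∫⟨σ', σ'⟩dt for a sampled curve. It assumes M = R^n with the Euclidean inner product and a curve given at equally spaced parameter values in I = [0, 1]; the function name `discrete_energy` is hypothetical and is not taken from the text.

```python
import math


def discrete_energy(points):
    """Approximate E(sigma) = 1/2 * integral over [0, 1] of |sigma'(t)|^2 dt.

    `points` holds sigma(t_i) for t_i = i/k, i = 0..k, as tuples in R^n.
    The curve is treated as piecewise linear, so the speed is constant
    on each parameter interval of length dt = 1/k.
    """
    k = len(points) - 1
    dt = 1.0 / k
    total = 0.0
    for a, b in zip(points, points[1:]):
        # squared speed on this interval: |b - a|^2 / dt^2
        speed_sq = sum((y - x) ** 2 for x, y in zip(a, b)) / dt ** 2
        total += 0.5 * speed_sq * dt
    return total


# Constant-speed straight segment from p = (0, 0) to q = (3, 4).
straight = [(3 * i / 10, 4 * i / 10) for i in range(11)]
# A piecewise linear detour with the same endpoints.
detour = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]

print(discrete_energy(straight))
print(discrete_energy(detour))
```

In this Euclidean setting the straight segment, which is the geodesic joining p and q, has energy approximately 12.5 = (1/2)·5², and the detour has a strictly larger value. This is consistent with geodesics being the critical points of E on the space of curves with fixed endpoints.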
For a fixed pair of points p, q E M, let 0 =
R(M, p, q). Let 0~ R be a critical point of E. Let M be a compact Riemannian manifold. By
A tangent vector W to R at u:I+M with u(0) replacing I with the circle S’, we consider a
= p, o( 1) = q is a tpiecewise differentiable vec- Hilbert manifold A(M)= H,(S’, M). A(M)
tor field along (T such that W(0) = OE M, and carries the structure of a complete Riemannian
W(l)=OgM,. A proper variation a:(-~,&) xI manifold. Every point PE M is naturally em-
+M along cr which is associated with W satis- bedded in A(M) as a point curve, and M is a
fies the following: (1) ~(0, t) = a(t), t E I; (2) there totally geodesic submanifold of A(M). A point
exists a finite partition 0 = t, < t, < . < t, = 1 0 E A(M) is a critical point of the energy func-
of I such that s(1(-E, E) x [L,-~, ti] is differenti- tion E: A(M)+R if and only if cr is a closed
able for all i = 1, . , IC; (3) c((u, 0) = p and a(u, 1) geodesic of M. It is known that E satisfies the
= q hold for all u E( -E, E); and (4) &(O, t)/& Palais-Smale condition (C). The index and
= W(t), t E I. It follows from the first variation nullity of E at a critical point 0 are finite. Since
formula (- 178 Geodesics A) that (TER is a M is compact, the fundamental length & of
critical point of E if and only if g is a geodesic M is positive and #={~EA(M)I E(a)<c} is a
on M. Let WI, W, ER,, and let U be an open ideformation retract of M c A(M) by means of
set in RZ around the origin. Then a proper the deformation along the integral curves of
variation CL:U x I + M along 0 that is asso- - VE.
ciated with WI and W, satisfies the follow- If M is not simply connected, then each
ing: (1) cr(O,O,t)=cr(t), t~l; (2) there exists a nontrivial element of the tfundamental group
finitepartitionO=t,<t,<...<t,=lofIsuch of M represents a class of homotopic curves in

which there is a closed geodesic whose length realizes the infimum of all these curves. If
$M$ is simply connected, then there is a minimum integer $k$ for which $\pi_k(M) \neq 0$. Clearly,
$\pi_{k-1}(\Lambda(M)) = \pi_k(M) \neq 0$. Suppose that $E$ has no positive critical value. Then
$\Lambda(M)$ deformation retracts onto $M$, a contradiction. Therefore there exists at least one
closed geodesic on every compact Riemannian manifold (Lyusternik and Fet [18]).

Proof of the existence of many closed geodesics on $M$ is difficult due to the following:
(1) There is a continuous $O(2)$ action on $\Lambda(M)$ that assigns $\sigma \in \Lambda(M)$ and
$\alpha \in O(2)$ to the curve $t \to \sigma(t+\alpha)$, $t \in S^1$; (2) for each integer $k$ and
for each critical point $\sigma \in \Lambda(M)$, the curve $t \to \sigma(kt)$ is a critical point
with energy $k^2 E(\sigma)$.

A remarkable result has been obtained by Lyusternik and Shnirel'man [19], which states that
there are at least three closed geodesics on every simply connected compact manifold of
dimension 2. Fet then proved that there exist at least two closed geodesics on a compact
manifold if all critical points of $E$ on $\Lambda(M)$ are nondegenerate [20]. By developing a
precise argument concerning the Morse lemma around an isolated degenerate critical point
$\sigma \in \Lambda(M)$ of $E$, Gromoll and Meyer have proved that there exist infinitely many
closed geodesics if the sequence of Betti numbers $\{b_k(\Lambda(M))\}$ with respect to any field
is unbounded [21]. If $A: M \to M$ is a certain isometry, then there are also infinitely many
$A$-invariant closed geodesics if the Betti numbers of the space of $A$-invariant $H_1$-curves
are not bounded [22, 23]. By investigating the $Z_2$-cohomology of $\Lambda(M)$ of compact
symmetric spaces, Ziller has proved that if $M$ has the same homotopy type as that of a
symmetric space of rank $\geq 2$, then $M$ has infinitely many closed geodesics [24].

Many attempts have been made by W. Klingenberg and others to prove the existence of
infinitely many geometrically distinct closed geodesics on every compact Riemannian
manifold [25].

References

[1] H. Poincaré, Sur les lignes géodésiques des surfaces convexes, Trans. Amer. Math. Soc., 6 (1905), 237-274. (Oeuvres, Gauthier-Villars, 1953, vol. 6, 38-85.)
[2] G. D. Birkhoff, Dynamical systems, Amer. Math. Soc. Colloq. Publ., 1927.
[3] M. Morse, Calculus of variations in the large, Amer. Math. Soc. Colloq. Publ., 1934.
[4] J. Milnor, Morse theory, Ann. Math. Studies 51, Princeton Univ. Press, 1963.
[5] R. S. Palais, Morse theory on Hilbert manifolds, Topology, 2 (1963), 299-340.
[6] S. S. Chern and R. Lashof, On the total curvature of immersed manifolds, Amer. J. Math., 79 (1957), 306-318.
[7] A. Phillips, Submersions of open manifolds, Topology, 6 (1967), 171-206.
[8] T. Nagano, Variational theory in the large (in Japanese), Kyoritu, 1971.
[9] M. W. Hirsch, Immersions of manifolds, Trans. Amer. Math. Soc., 93 (1959), 242-276.
[10] L. Auslander and R. E. MacKenzie, Introduction to differentiable manifolds, McGraw-Hill, 1963.
[11] S. Smale, Generalized Poincaré's conjecture in dimensions greater than four, Ann. Math., (2) 74 (1961), 391-406.
[12] G. Reeb, Sur certaines propriétés topologiques des variétés feuilletées, Actualités Sci. Ind., 1183 (1952), 91-154.
[13] J. Milnor, Lectures on the h-cobordism theorem, Princeton Univ. Press, 1965.
[14] R. Bott, Nondegenerate critical manifolds, Ann. Math., (2) 60 (1954), 248-261.
[15] R. Palais and S. Smale, A generalized Morse theory, Bull. Amer. Math. Soc., 70 (1964), 165-172.
[16] W. Meyer, Kritische Mannigfaltigkeiten in Hilbertmannigfaltigkeiten, Math. Ann., 170 (1967), 45-66.
[17] J. T. Schwartz, Nonlinear functional analysis, Gordon & Breach, 1969.
[18] L. Lyusternik and A. I. Fet, Variational problems on closed manifolds (in Russian), Dokl. Akad. Nauk SSSR, 81 (1951), 17-18.
[19] L. Lyusternik and L. Shnirel'man, Méthodes topologiques dans les problèmes variationnels, Actualités Sci. Ind., Hermann, 1934.
[20] A. I. Fet, Variational problems on closed manifolds, Amer. Math. Soc. Transl., ser. 1, 6 (1962), 147-206. (Original in Russian, 1952.)
[21] D. Gromoll and W. Meyer, Periodic geodesics on compact Riemannian manifolds, J. Differential Geom., 3 (1969), 493-510.
[22] K. Grove, Condition (C) for the energy integral on certain path spaces and applications to the theory of geodesics, J. Differential Geom., 8 (1973), 207-223.
[23] M. Tanaka, On the existence of infinitely many isometry-invariant geodesics, J. Differential Geom., 17 (1982), 171-184.
[24] W. Ziller, Geschlossene Geodätische auf global symmetrischen und homogenen Räumen, Bonner Math. Schr., 85, 1976.
[25] W. Klingenberg, Lectures on closed geodesics, Grundlehren der math. Wiss. 230, Springer, 1978.

280 (XVIII.9)
Multivariate Analysis

A. General Remarks

Multivariate analysis consists of methods of statistical analysis of multivariate observations
represented by a collection of points in a finite-dimensional Euclidean space $R^p$. With the
development of powerful computers, multivariate techniques are beginning to be utilized in
many fields of science and technology.

B. The Multivariate Linear Model

The multivariate linear model is an immediate extension of the univariate linear model.
Suppose that $X = (X^{(1)} \cdots X^{(n)})$ denotes the $p \times n$ matrix of $n$ observations of
$p$-dimensional data. Suppose that it can be expressed as

$$X = BZ + U, \tag{1}$$

where $B$ is a $p \times m$ matrix of unknown parameters, $Z$ is a known $m \times n$ matrix of
independent variables, and $U$ is a $p \times n$ matrix of errors. We assume that (i) the
†expectations of the elements of $U$ are zero, that is, the $p \times n$ matrix $E(U) = 0$, and call
the relation (1) a multivariate linear model. We usually assume further that (ii) the column
vectors $U^{(i)}$, $i = 1, \ldots, n$, of $U$ are independent and identically distributed, and that
(iii) $U^{(i)}$ is distributed according to a multivariate †normal distribution with †covariance
matrix $\Sigma$. Analogous to the univariate case, the †least squares estimator $\hat{B}$ of $B$ is
defined to be the $p \times m$ matrix that minimizes

$$\mathrm{tr}\,(X - \hat{B}Z)(X - \hat{B}Z)'$$

and is given explicitly by

$$\hat{B} = XZ'(ZZ')^{-1} \quad \text{when } |ZZ'| \neq 0.$$

Here the symbol $'$ means the transpose of a matrix. Also, an unbiased estimator of $\Sigma$ is
given by

$$\hat{\Sigma} = Q_e/(n-m), \qquad Q_e = XX' - \hat{B}ZX'.$$

$\hat{B}$ is an unbiased estimator of $B$ under the assumption (i) above and is the †best linear
unbiased estimator under (i) and (ii), while $\hat{\Sigma}$ is unbiased when (i) and (ii) are
assumed. Under the assumptions (i)-(iii), $\hat{B}$ and $\hat{\Sigma}$ form a set of complete
†sufficient statistics; hence they are †uniformly minimum variance unbiased estimators. Also
under (i)-(iii), elements of $\hat{B}$ are normally distributed, and their covariance can be
expressed by $\Sigma \otimes M$ ($\otimes$ denotes the †Kronecker product), where
$M = (ZZ')^{-1}$. Applying †Cochran's theorem for the multivariate case, $Q_e$ is shown to be
distributed according to a †Wishart distribution with $n - m$ degrees of freedom.

To test the hypothesis $B = B_0$ under (i)-(iii), we put
$Q_B = (\hat{B} - B_0)ZZ'(\hat{B} - B_0)'$ and have $(X - B_0Z)(X - B_0Z)' = Q_B + Q_e$, where
$Q_B$ and $Q_e$ are independently distributed. The distribution of $Q_B$ is a Wishart
distribution with $m$ degrees of freedom when the hypothesis is true, and a noncentral Wishart
distribution when $B \neq B_0$. Based on this fact, several procedures have been proposed. If we
require the invariance of procedures with respect to linear transformations of the coordinates
of $p$-dimensional vectors, the roots $\lambda_1, \ldots, \lambda_p$ of the †characteristic
equation $|Q_B - \lambda Q_e| = 0$ form a †maximal invariant statistic; hence the testing
procedures should be defined in terms of these roots (→ 396 Statistic I). Also, the
consideration of †power leads to procedures that reject the hypothesis when these roots are
large. Commonly used test statistics are
(1) the †likelihood ratio test $W = |Q_e|/|Q_B + Q_e| = \prod_i (1 + \lambda_i)^{-1}$ (S. S. Wilks);
(2) $\mathrm{tr}\, Q_e^{-1}Q_B = \sum_i \lambda_i$ (D. N. Lawley and H. Hotelling);
(3) $\max_i \lambda_i$ (S. N. Roy);
(4) $\mathrm{tr}\, Q_B(Q_B + Q_e)^{-1} = \sum_i \lambda_i/(1 + \lambda_i)$ (K. C. S. Pillai).
Pillai's trace test (4) is locally the most powerful invariant; Wilks's likelihood ratio test (1)
has the maximum Bahadur efficiency. The power functions of the tests (1)-(3) have the
monotonicity property, namely, they are monotonically nondecreasing with respect to each
eigenvalue of $\Omega = \Sigma^{-1}(B - B_0)ZZ'(B - B_0)'$, the matrix of noncentrality parameters
for $Q_B$. The monotonicity for Pillai's test (4) is known to hold only for restricted cases where
the critical value $c$ for the acceptance region $\mathrm{tr}\, Q_B(Q_B + Q_e)^{-1} \leq c$ should
satisfy $0 \leq c < 1$. All the tests (1)-(4) are unbiased. With respect to 0-1 loss, the tests (1)
and (4) are †admissible Bayes and the tests (2) and (3) are †admissible. Tests based on
$\min_i \lambda_i$ are inadmissible.

Small-sample distributions of these statistics are complicated, but when $n \to \infty$,
$-n \log W$, $n\,\mathrm{tr}\, Q_BQ_e^{-1}$, and $n\,\mathrm{tr}\, Q_B(Q_B + Q_e)^{-1}$ are
asymptotically distributed according to a chi-square distribution with $pm$ degrees of freedom
under the null hypothesis. Even under the alternative hypothesis that the matrix of
noncentrality parameters $\Omega = o(1)$ for large $n$, the asymptotic distributions remain the
same. When $\Omega = O(1)$, they are noncentral chi-square distributions of $pm$ degrees of
freedom and noncentrality parameter $\delta = \mathrm{tr}\,\Omega$. If $\Omega = O(n)$, they are
normal distributions, namely, $-\sqrt{n}(\log W + \log|I + \Theta|)$,
$\sqrt{n}(\mathrm{tr}\, Q_BQ_e^{-1} - \mathrm{tr}\,\Theta)$ and
$\sqrt{n}(\mathrm{tr}\, Q_B(Q_B + Q_e)^{-1} - \mathrm{tr}\,\Theta(I + \Theta)^{-1})$ for
$\Theta = \lim \Omega/n$ have asymptotically normal distributions of zero means and variances
given by $2\,\mathrm{tr}(I - (I + \Theta)^{-2})$, $2\,\mathrm{tr}(2\Theta + \Theta^2)$, and

$2\,\mathrm{tr}((I + \Theta)^{-2} - (I + \Theta)^{-4})$, respectively. As a special case, if
$m = 1$ there exists only one nonzero $\lambda$, and the procedures in this paragraph all
coincide and are equivalent to one based on
$T^2 = M^{-1}(\hat{B} - B_0)'\hat{\Sigma}^{-1}(\hat{B} - B_0)$. It is known that under the
hypothesis, $(n - p)T^2/((n - 1)p)$ is distributed according to an †F-distribution with
$(p, n - p)$ degrees of freedom. When $p = 2$ (resp. $m = 2$),
$(n - m - 1)(1 - \sqrt{W})/(m\sqrt{W})$ (resp. $(n - p - 1)(1 - \sqrt{W})/(p\sqrt{W})$) is
distributed according to an F-distribution with degrees of freedom $(2m, 2(n - m - 1))$
(resp. $(2p, 2(n - p - 1))$). Simultaneous †confidence regions of $B$ can be derived from the
testing procedures in this paragraph, that is,

$$\mathrm{tr}\, Q_e^{-1}(\hat{B} - B)ZZ'(\hat{B} - B)' \leq c.$$

Moreover, when the matrix $B$ is decomposed as $B = (B_1 \,\vdots\, B_2)$, where $B_1$ and $B_2$
are a $p \times q$ matrix and a $p \times (m - q)$ matrix, respectively, and the hypothesis to be
tested is of the form $B_1 = 0$, the test procedures can be obtained as follows: Decompose $Z$ as

$$Z = \begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix},$$

where $Z_1$ is a $q \times n$ matrix and $Z_2$ is an $(m - q) \times n$ matrix, and put

$$\hat{B}_2^* = XZ_2'(Z_2Z_2')^{-1}, \qquad Q^* = XX' - \hat{B}_2^*Z_2X', \qquad Q_{B_1} = Q^* - Q_e.$$

Then $Q_{B_1}$ and $Q_e$ are independent, $Q_{B_1}$ is distributed according to a Wishart
distribution with $q$ degrees of freedom when the hypothesis is true, and we can apply the
procedures in the previous paragraph, simply replacing $Q_B$ by $Q_{B_1}$.

Such a procedure is called multivariate analysis of variance (or MANOVA, for short). Various
standard situations can be treated in this way (after some linear transformation of variables,
if necessary). Some examples are as follows.

(1) $X = (X^{(1)} \cdots X^{(n)})$, where the $X^{(i)}$ ($i = 1, \ldots, n$) are distributed
independently according to a $p$-dimensional normal distribution $N(\mu, \Sigma)$. We can
express $X$ as $X = \mu\mathbf{1}' + U$, and the estimators are given by
$\hat{\mu} = \bar{X} = X\mathbf{1}/n$,
$\hat{\Sigma} = (X - \bar{X}\mathbf{1}')(X - \bar{X}\mathbf{1}')'/(n - 1)$. In this case, we obtain
a test for the hypothesis $\mu = \mu_0$ based on Hotelling's $T^2$ statistic, i.e., the test with
a †critical region of the form

$$T^2 = n(\bar{X} - \mu_0)'\hat{\Sigma}^{-1}(\bar{X} - \mu_0) > c.$$

(2) Suppose that $p \times n_i$ matrices $X_i$, $i = 1, \ldots, k$, are samples of size $n_i$ from
$p$-dimensional normal distributions $N(\mu_i, \Sigma)$ with common covariance matrix $\Sigma$.
The tests for the hypothesis $\mu_1 = \cdots = \mu_k$ are obtained from the following
observation: Let $Q_e = \sum_i (X_i - \bar{X}_i\mathbf{1}')(X_i - \bar{X}_i\mathbf{1}')'$, where
$\bar{X}_i = X_i\mathbf{1}/n_i$, and
$Q_B = \sum_i n_i(\bar{X}_i - \bar{X})(\bar{X}_i - \bar{X})'$,
$\bar{X} = \sum_i n_i\bar{X}_i/\sum_i n_i$. Then
$\sum_i (X_i - \bar{X}\mathbf{1}')(X_i - \bar{X}\mathbf{1}')' = Q_e + Q_B$. We call $Q_e$ the
matrix of the sum of squares within classes, and $Q_B$ the matrix of the sum of squares between
classes. The latter is distributed according to a Wishart distribution when the hypothesis is
true.

(3) Suppose that $X_{ij}$ are $p$-dimensional vectors, and that

$$X_{ij} = \mu + \alpha_i + \beta_j + U_{ij}, \qquad i = 1, \ldots, m, \quad j = 1, \ldots, n,$$

where $\mu$, $\alpha_i$, $\beta_j$ are $p$-dimensional constant vectors such that
$\sum_i \alpha_i = 0$ and $\sum_j \beta_j = 0$, and the $U_{ij}$ are independently distributed
according to a $p$-dimensional normal distribution $N(0, \Sigma)$. We set

$$Q_A = n\sum_i (\bar{X}_{i\cdot} - \bar{X})(\bar{X}_{i\cdot} - \bar{X})', \qquad
Q_B = m\sum_j (\bar{X}_{\cdot j} - \bar{X})(\bar{X}_{\cdot j} - \bar{X})',$$

$$Q_e = \sum_i\sum_j (X_{ij} - \bar{X}_{i\cdot} - \bar{X}_{\cdot j} + \bar{X})
(X_{ij} - \bar{X}_{i\cdot} - \bar{X}_{\cdot j} + \bar{X})',$$

where

$$\bar{X}_{i\cdot} = \sum_j X_{ij}/n, \qquad \bar{X}_{\cdot j} = \sum_i X_{ij}/m, \qquad
\bar{X} = \sum_i\sum_j X_{ij}/nm.$$

Then we have $\sum_i\sum_j (X_{ij} - \bar{X})(X_{ij} - \bar{X})' = Q_A + Q_B + Q_e$, and $Q_A$,
$Q_B$, $Q_e$ are distributed independently according to (noncentral) Wishart distributions with
degrees of freedom $m - 1$, $n - 1$, and $(n - 1)(m - 1)$, respectively. The tests for the
hypothesis $\alpha_i = 0$ ($i = 1, \ldots, m$) or $\beta_j = 0$ ($j = 1, \ldots, n$) are obtained
from these matrices.

C. Tests for Covariance Matrices

Let $X_{ij}$ ($p \times 1$), $j = 1, \ldots, N_i$, be a random sample from $p$-variate normal
distribution $N(\mu_i, \Sigma_i)$ for $i = 1, \ldots, k$. For testing the hypothesis
$H_0: \Sigma_1 = \cdots = \Sigma_k$ with unknown mean vectors $\mu_i$ against all alternatives,
the likelihood ratio statistic is given by

$$\frac{N^{pN/2}\prod_{i=1}^{k}|S_i|^{N_i/2}}{\prod_{i=1}^{k}N_i^{pN_i/2}\,\bigl|\sum_{i=1}^{k}S_i\bigr|^{N/2}},
\qquad N = N_1 + \cdots + N_k, \tag{2}$$

where $S_i = \sum_{j=1}^{N_i}(X_{ij} - \bar{X}_i)(X_{ij} - \bar{X}_i)'$ for
$\bar{X}_i = \sum_{j=1}^{N_i}X_{ij}/N_i$. If we replace the sample size $N_i$ by the degrees of
freedom $n_i = N_i - 1$ and $N$ by $n = N - k$ in (2), the modified likelihood ratio test is
unbiased for general $p$ and $k$ [16]. For $k = 1$, the hypothesis specifies
$H_0: \Sigma_1 = \Sigma_0$ (a given positive definite matrix) and the likelihood ratio statistic
is given by $|S|^{N/2}\,\mathrm{etr}(-\Sigma_0^{-1}S/2)$ ($\mathrm{etr}(A) = \exp(\mathrm{tr}\,A)$).
Again replacing $N$ by $n = N - 1$ yields an unbiased test. Moreover the power function of this
modified likelihood ratio test depends only on the eigenvalues of $\Sigma_1\Sigma_0^{-1}$ and is
increasing with respect to the absolute deviation of each

eigenvalue from 1, that is, $|\mathrm{ch}_i\,\Sigma_1\Sigma_0^{-1} - 1|$. For $k = 2$, it is
conjectured that the modified likelihood ratio test has a power function monotonically
increasing with respect to $|\mathrm{ch}_i\,\Sigma_1\Sigma_2^{-1} - 1|$. This conjecture has not
yet been proved; we know only that the power is increasing if the maximum eigenvalue
$\mathrm{ch}_1\,\Sigma_1\Sigma_2^{-1}$ increases from 1 or the minimum eigenvalue
$\mathrm{ch}_p\,\Sigma_1\Sigma_2^{-1}$ decreases from 1.

D. Estimation of Mean Vector and Covariance Matrix

Let $X$ ($p \times 1$) have $p$-variate normal distribution $N(\mu, I)$. For estimating the
unknown mean, take the sum of squared errors as a loss $L(\mu, d) = \|\mu - d\|^2$. Since each
component $X_i$ of $X$ is distributed independently according to $N(\mu_i, 1)$ for
$\mu = (\mu_1, \ldots, \mu_p)'$, it is natural to suppose that $X$ is a good estimator. In fact
$X$ is a minimax estimator for $\mu$. However, Stein showed that $X$ is admissible for $p = 1$
and 2 but inadmissible for $p \geq 3$. James and Stein showed that the estimator

$$\left(1 - \frac{p - 2}{\|X\|^2}\right)X \tag{3}$$

dominates $X$ when $p \geq 3$. This estimator is further dominated by Stein's positive part
estimator $(1 - (p - 2)/\|X\|^2)^{+}X$, where $(a)^{+}$ means $a$ if $a$ is nonnegative and zero
if $a$ is negative. The class of estimators $X + \nabla\log f(X)$, where
$\nabla = (\partial/\partial x_1, \ldots, \partial/\partial x_p)'$ for $X' = (x_1, \ldots, x_p)$,
are all minimax for $\mu$ dominating $X$, if $f(X) > 0$ and $\sqrt{f(X)}$ is superharmonic
($\nabla^2\sqrt{f(X)} \leq 0$), satisfying $E[\|\nabla\log f(X)\|^2] < \infty$ and
$E[\sum_i|\partial^2 f(X)/\partial x_i^2|/f(X)] < \infty$. Putting $f(X) = \|X\|^{-(p-2)}$
yields the James-Stein estimator (3). For this problem a class of monotone estimators is
essentially complete, where an estimator $d(X)$ is called monotone if $d(X) \leq d(Y)$ whenever
$X \leq Y$ (defined componentwise). Stein's positive-part estimator is not monotone and is still
inadmissible [8].

Let $X_i$ ($p \times 1$), $i = 1, \ldots, n$, be a random sample taken from the $p$-variate
normal distribution $N(0, \Sigma)$. Then the maximum likelihood estimator for $\Sigma$ is given
by $S/n$, where $S = \sum_{i=1}^{n}X_iX_i'$. $S$ has Wishart distribution $W_p(n, \Sigma)$. To
study the estimation problem of $\Sigma$, take two loss functions
$L_1(\Sigma, d) = \mathrm{tr}\,d\Sigma^{-1} - \log|d\Sigma^{-1}| - p$ and
$L_2(\Sigma, d) = (1/2)\,\mathrm{tr}(d\Sigma^{-1} - I)^2$. Under $L_1$-loss, the best scalar
multiple of $S$ is given by $S/n$, and under $L_2$-loss it is given by $S/(n + p + 1)$. The
James-Stein minimax estimator for $L_1$-loss is given by $d_1(S) = K\Delta K'$, where $K$ is the
lower triangular matrix with positive diagonal elements for which $S = KK'$ and
$\Delta = \mathrm{diag}(\Delta_1, \ldots, \Delta_p)$ with $\Delta_i = 1/(n + p + 1 - 2i)$. A
minimax estimator for $L_2$-loss is given by $d_2(S) = K\Delta K'$ with the same lower
triangular matrix $K$, but the diagonal elements of $\Delta$ form the solution of the linear
equation $A(\Delta_1, \ldots, \Delta_p)' = b$, where $A$ is given by

$$A = \begin{pmatrix}
(n+p-1)(n+p+1) & n+p-3 & \cdots & n-p+1 \\
n+p-3 & (n+p-3)(n+p-1) & \cdots & n-p+1 \\
\vdots & \vdots & \ddots & \vdots \\
n-p+1 & n-p+1 & \cdots & (n-p+1)(n-p+3)
\end{pmatrix}$$

and $b' = (n + p - 1, n + p - 3, \ldots, n - p + 1)$. The minimax estimators $d_1(S)$ and
$d_2(S)$ dominate the best scalar multiples of $S$. However, they are inadmissible. Let $P$ be a
permutation matrix; then the estimator $\sum_P P'd_i(PSP')P/p!$ dominates $d_i(S)$ because of
the convexity of the loss function $L_i$ for $i = 1$ and 2. The estimator
$h_1(S) = (S + ub(u)C)/n$, where $u = 1/\mathrm{tr}\,CS^{-1}$ and $C$ is a fixed positive-definite
matrix, dominates $S/n$ under $L_1$-loss if $0 \leq b(u) \leq 2(p - 1)/n$ and $b(\cdot)$ is
nonincreasing. Under $L_2$-loss $(S + bC/\mathrm{tr}\,CS^{-1})/(n + p + 1)$ dominates
$S/(n + p + 1)$, if $0 \leq b \leq 2(p - 1)/(n - p + 3)$ [7]. The Haff estimator $h_1(S)$ is
dominated by the James-Stein estimator $d_1(S)$ if $p \geq 6$. The estimators
$(S + b(\mathrm{tr}\,CS^{-1}/\mathrm{tr}\,CS^{-2})C)/n$ for $0 \leq b \leq 2(p - 1)/n$ under
$L_1$-loss also dominate $S/n$ and are not dominated by $d_1(S)$. Similar results hold under
$L_2$-loss.

E. Correlation among Variables

In order to represent the interrelationships among the $p$ variates, the population correlation
matrix $P = \{\rho_{ij}\}$ is used. Also, we can calculate the population †multiple correlation
coefficient of the $i$th component $X_i$ and all the rest of the variables by

$$\rho_{i\cdot 1\ldots(i)\ldots p} = \sqrt{1 - |P|/P_{ii}}$$

and the †partial correlation coefficient of $X_i$ and $X_j$, given all others, by

$$\rho_{ij\cdot 1\ldots(i)\ldots(j)\ldots p} = -P_{ij}/\sqrt{P_{ii}P_{jj}},$$

where $P_{ij}$ are the cofactors of $P$ (→ 397 Statistical Data Analysis).

The sample correlation matrix is calculated from the data, and the sample multiple correlation
and the sample partial correlation coefficients are calculated from the sample correlation
matrix in exactly the same way as the population coefficients are calculated from the
population correlation matrix. When $X = (X^{(1)} \cdots X^{(n)})$ is a sample of size $n$ from a
multivariate normal population, the sampling distributions of $R_{i\cdot 1\ldots(i)\ldots p}$ and
$R_{ij\cdot 1\ldots(i)\ldots(j)\ldots p}$ are known (→ 374 Sampling Distributions).
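
The cofactor formulas of Section E for the multiple and partial correlation coefficients can be evaluated directly from a correlation matrix. The following is a minimal sketch using only the Python standard library; the helper names (`det`, `minor`, `cofactor`, `multiple_correlation`, `partial_correlation`) and the example matrix are hypothetical, and the Laplace expansion used for determinants is an assumption suitable only for small p.

```python
import math


def minor(a, i, j):
    """Matrix a (list of rows) with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det(a):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * entry * det(minor(a, 0, j))
               for j, entry in enumerate(a[0]))


def cofactor(a, i, j):
    """Signed cofactor of the (i, j) entry of a."""
    return (-1) ** (i + j) * det(minor(a, i, j))


def multiple_correlation(P, i):
    """Multiple correlation of variable i with all the others: sqrt(1 - |P| / P_ii)."""
    return math.sqrt(1.0 - det(P) / cofactor(P, i, i))


def partial_correlation(P, i, j):
    """Partial correlation of variables i and j given the rest: -P_ij / sqrt(P_ii P_jj)."""
    return -cofactor(P, i, j) / math.sqrt(cofactor(P, i, i) * cofactor(P, j, j))


# Illustrative 3 x 3 correlation matrix (indices are 0-based in the code).
P = [[1.0, 0.5, 0.3],
     [0.5, 1.0, 0.4],
     [0.3, 0.4, 1.0]]

print(multiple_correlation(P, 0))
print(partial_correlation(P, 0, 1))
```

For p = 3 the partial correlation computed this way agrees with the familiar closed form (r12 - r13 r23)/sqrt((1 - r13^2)(1 - r23^2)). Applying the same functions to a sample correlation matrix gives the sample coefficients, as described in the text.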

The determinant of the covariance matrix where x=X1,/n. The variation optimality of
ICI (or IS]), called the (sample) generalized var- the principal component is given by maximiz-
iance, is a measure of the dispersion of a p- ing simultaneously all the eigenvalues of the
dimensional distribution. The distance of two matrix C’(X-ftlb).(X-xlb)‘C, subject to
distributions with mean vectors p, and p2, the condition that C is a p x r matrix such
respectively, and with common variance Z is that C’C = I,. The solution is given by C = A’,
often expressed by namely, C’X = T [15]. If all the correlations
between the components of X are positive, the
largest eigenvalue of the covariance matrix of
which is called the Mahalanobis generalized X is simple and positive. All the coefftcients of
distance. the first principal component (components of
When the data consists of (p + q)- the eigenvector) can be taken to be positive
x (Perron-Frobenius theorem).
dimensional vectors with q < p, the inter- When we assume normality, the eigenvalues
0Y
relation of X and Y as a whole can be ex- of the sample correlation matrix R are the
pressed in the following way: Let the covar- tmaximum likelihood estimators of the eigen-
iance matrix be partitioned as values of the population correlation matrix,
and their sampling distributions can be ob-
tained. A hypothesis relevant to principal
component analysis is, for example, that the
smallest p - r eigenvalues of the correlation
and the nonzero roots of the equation Ip.& -
matrix are equal, which can be tested by the
CvxC& Cxv 1 = 0 be p, , . . . , p4. Then pl”, . ,
statistic
$\rho_q^{1/2}$, called the canonical correlation coefficients, are the maximal invariant parameters with respect to linear transformation of X and Y. Also, if we denote the eigenvector corresponding to a root $\rho_i$ by $\eta_i$, i.e.,

[equation missing in source]

the linear functions $\eta_i'Y$ and $\eta_i'\Sigma_{yx}\Sigma_{xx}^{-1}X$ are called the canonical variates.

F. Principal Components

An important problem in multivariate analysis is to express the variations of many variables by a small number of indices. Principal component analysis is a technique of dealing with this problem.

Let X be a p × n matrix with n column vectors of p-dimensional data. A linear transform of X, T = AX (A an r × p matrix, r < p, T an r × n matrix) is called the principal component if A is chosen so as to maximize the sum of the squares of the sample multiple correlation coefficients of each of the row vectors of X to those of T, namely, if A is an r × p matrix formed by the r eigenvectors of the sample correlation matrix of X corresponding to the r largest eigenvalues or any nonsingular linear transformation of them. This is a characterization of principal components in terms of correlation optimality.

The principal component T = AX is also characterized by the information-loss optimality in that all eigenvalues of $(X - CY - b\mathbf{1}_n')(X - CY - b\mathbf{1}_n')'$ are simultaneously minimized subject to the condition: C is a p × r matrix, b is a p × 1 vector and Y is an r × n matrix. The solution is given by $CY + b\mathbf{1}_n' = A'T + \bar{x}\mathbf{1}_n'$,

$$R_{p-r} = |R| \Big/ \Big(\lambda_1 \cdots \lambda_r \big((p - \lambda_1 - \cdots - \lambda_r)/(p - r)\big)^{p-r}\Big),$$

where $\lambda_1, \ldots, \lambda_r$ are the r largest eigenvalues of R. Under the hypothesis, $-c \log R_{p-r}$ (c a constant) is asymptotically distributed according to a chi-square distribution when $n \to \infty$.

Variations of principal component analysis can be obtained by taking the eigenvectors of the covariance matrix of the raw data or of a multiple of it by some weight matrix.

G. Factor Analysis

Factor analysis is closely related to principal component analysis. We assume a model

$$X = BF + U,$$

where B and F are unknown p × r and r × n matrices of constants (p > r) and U is a p × n matrix of independent errors. F is called the matrix of factor scores and B the matrix of factor loadings. We assume $FF' = nI$. If $E(UU') = n\Phi$ is known, then by applying the least squares principle, we can determine B and F so as to minimize the trace of $(X - BF)'\Phi^{-1}(X - BF)$. Then B is obtained by taking the r eigenvectors of $\Phi^{-1}XX'$ corresponding to the r largest eigenvalues. When $\Phi$ is diagonal but unknown, we can solve the simultaneous equation for B and $\Phi$, whose solutions are the matrix B with columns equal to eigenvectors of $\hat\Phi^{-1}XX'$ and the diagonal matrix $\hat\Phi$ with elements equal to the diagonal elements of $XX'/n - BB'$.

If we assume that U is normal, the procedure in the preceding paragraph for the case
280 H
Multivariate Analysis

when $\Phi$ is known is equivalent to the maximum likelihood method. When $\Phi$ is unknown, we can assume further that the columns in F are also multivariate normal vectors distributed independently of U, which implies that the columns of X are also normal vectors with the covariance matrix $\Sigma = BB' + \Phi$. B and $\Phi$ are estimated from the sample covariance matrix, and the solutions of the simultaneous equation for B and $\Phi$ give the maximum likelihood estimators. However, there is the so-called identification problem of determining whether for given $\Sigma$ and r the decomposition $\Sigma = BB' + \Phi$ is unique. This problem has not yet been completely settled. If the solution $\hat\Phi$ is positive definite and $\hat{B}\hat{B}'$ is positive semidefinite, it is called proper. When $\hat\Phi$ is not positive definite, it is called a Heywood case, and when $\hat{B}\hat{B}'$ is not positive semidefinite it is called a complex case. Sometimes iterative procedures lead to improper solutions.

H. Canonical Correlation Analysis

Canonical correlation analysis can also be used for descriptive purposes. Sample canonical variates have various descriptive implications. Suppose that $\eta_1'Y$ and $\xi_1'X$ are the first canonical variates corresponding to the largest canonical correlation $\rho_1^{1/2}$. Then $\rho_1^{1/2}$ is the largest possible correlation between a linear function of X and a linear function of Y and is actually equal to the correlation of $\eta_1'Y$ and $\xi_1'X$. Similarly, the second canonical correlation is equal to the largest possible correlation between linear functions in X and in Y which are orthogonal to $\xi_1'X$ and $\eta_1'Y$, respectively, and so forth.

As a second interpretation of the canonical variates, we consider the linear regression model

$$Y = BX + U,$$

where Y, B, U are q × n, q × p, q × n matrices, respectively, and rank B = r < q. Then there exist an r × p matrix A and a q × r matrix C satisfying $C'\Sigma^{-1}C = I$ such that B = CA, where $E(UU') = n\Sigma$. Putting T = AX, we get a regression model Y = CT + U, regarding T as a matrix of regressor variables. If $\Sigma$ is known, least squares considerations lead to minimizing $\operatorname{tr}\,(Y - BX)'\Sigma^{-1}(Y - BX)$ with the condition rank B = r. The resulting row vectors of A consist of the r eigenvectors corresponding to the r largest eigenvalues of the matrix $S_{xx}^{-1}S_{xy}\Sigma^{-1}S_{yx}$, and column vectors of C consist of the r eigenvectors corresponding to the r largest eigenvalues of the matrix $S_{yx}S_{xx}^{-1}S_{xy}\Sigma^{-1}$. If $\Sigma$ is replaced by $S_{yy}$, T = AX and $Z = C'S_{yy}^{-1}Y$ are equal to the matrices of canonical variates.

If we assume that U is normal, this procedure is equivalent to the maximum likelihood method. It should be remarked that although the model here is not symmetric in X and Y, the results are symmetric in X and Y, and therefore they will be the same if the roles of X and Y are interchanged in this model.

I. Linear Discriminants and Problems of Classification

Let p × $n_i$ matrices $X_i$ (i = 1, …, k) be the set of observations for k distinct populations with a common covariance matrix. We determine a vector a such that $T_i = a'X_i$ reveals the differences of the k populations as much as possible, or, more precisely, so that the ratio of the sum of squares between classes of T to the sum of squares within classes is maximized. If the matrices of the sums of squares between and within classes are $Q_b$ and $Q_w$, respectively, the ratio is equal to

$$\lambda = a'Q_b a / a'Q_w a,$$

which is maximized when a is equal to the eigenvector of $Q_w^{-1}Q_b$ corresponding to the largest eigenvalue. The linear function $t = a'X$ is called the linear discriminant function. When k = 2, a is given by $a = Q_w^{-1}(\bar{x}_1 - \bar{x}_2)$, where $\bar{x}_1$ and $\bar{x}_2$ are sample mean vectors. When k > 2, we let A be the matrix formed by the r eigenvectors corresponding to the first r largest eigenvalues of $Q_w^{-1}Q_b$, and set $T_i = AX_i$. From this we can construct the r-dimensional discriminant function. These functions can be used to locate the k populations in r-dimensional space, and also to decide to which population a new observation belongs. For the latter problem we can also construct k quadratic functions $s_i = (X - \bar{x}_i)'Q_w^{-1}(X - \bar{x}_i)$, where $\bar{x}_i$ is the sample mean vector of the ith population, and X a new observation. Then we can decide that X belongs to the population corresponding to the minimum $s_i$. Such a method is called a classification procedure.

J. Discrete Multivariate Analysis

Let $X_{ijk}$ be an observed frequency on three characteristics, each belonging to the ijkth class for $1 \le i \le I$, $1 \le j \le J$ and $1 \le k \le K$. Assume that $X_{ijk}$ is a sample from a multinomial distribution having $p_{ijk}$ as a probability of occurrence for an (i, j, k) cell, where $p_{\cdots} = \sum p_{ijk} = 1$. The multinomial observation $X_{ijk}$ with probability $p_{ijk}$ is called an I × J × K contingency table. If we further assume that
with the restrictions on parameters $\sum_i \alpha_i = \sum_i \alpha_{ij} = \sum_j \alpha_{ij} = 0$, similarly on β's and γ's and $\sum_i \delta_{ijk} = \sum_j \delta_{ijk} = \sum_k \delta_{ijk} = 0$, we say that a saturated log-linear model is given. Here the number of observations and the number of parameters are equal, and no errors can be estimated. The parameter $\delta_{ijk}$ is called the three-factor effect, or equivalently, the second-order interaction. Similarly, $\alpha_{ij}$, $\beta_{jk}$, $\gamma_{ik}$ are called the two-factor effects or the first-order interactions. Finally $\alpha_i$, $\beta_j$, $\gamma_k$ are called the main effects. A simple model is obtained when all the first- and second-order interactions vanish. This is equivalent to the independence model $p_{ijk} = p_{i\cdot\cdot}\,p_{\cdot j\cdot}\,p_{\cdot\cdot k}$, and the maximum likelihood estimators for $\mu$, $\alpha_i$, $\beta_j$, $\gamma_k$ are obtained from $\hat{p}_{ijk} = X_{i\cdot\cdot}X_{\cdot j\cdot}X_{\cdot\cdot k}/n^3$, where $n = X_{\cdots}$. A nontrivial model is obtained by putting all the second-order interactions equal to zero. The likelihood equations for this model are given by

$$n\hat{p}_{ij\cdot} = X_{ij\cdot}, \quad n\hat{p}_{i\cdot k} = X_{i\cdot k}, \quad n\hat{p}_{\cdot jk} = X_{\cdot jk}. \qquad (4)$$

Bartlett first described a solution of (4) when I = J = K = 2. For a 2 × 2 × 2 table, the solution can be expressed by $\hat{p}_{ijk} = (X_{ijk} \pm \theta)/n$ because of the constraints for $X_{ij\cdot}$, $X_{i\cdot k}$, and $X_{\cdot jk}$. Putting $\delta_{111} = 0$ for $\hat{p}_{ijk}$ yields a cubic equation for $\theta$:

$$(X_{111} + \theta)(X_{221} + \theta)(X_{122} + \theta)(X_{212} + \theta) = (X_{121} - \theta)(X_{211} - \theta)(X_{112} - \theta)(X_{222} - \theta).$$

For a general I × J × K table, the equations (4) have a unique solution within the no-three-factor effect model if there exists $q_{ijk} > 0$ satisfying (4). The unique solution maximizes the likelihood. To solve the likelihood equations (4), standard iterative procedures, such as the Newton-Raphson method, can be applied. However, the following iterative scaling method of Deming and Stephan is more useful:

$$n p_{ijk}^{(3m+1)} = p_{ijk}^{(3m)} \frac{X_{ij\cdot}}{p_{ij\cdot}^{(3m)}}, \quad n p_{ijk}^{(3m+2)} = p_{ijk}^{(3m+1)} \frac{X_{i\cdot k}}{p_{i\cdot k}^{(3m+1)}}, \quad n p_{ijk}^{(3m+3)} = p_{ijk}^{(3m+2)} \frac{X_{\cdot jk}}{p_{\cdot jk}^{(3m+2)}}. \qquad (5)$$

The first iteration adjusts $p_{ijk}^{(3m)}$ by fitting proportionally with respect to k so that $n p_{ij\cdot}^{(3m+1)}$ is equal to $X_{ij\cdot}$, and similarly for the second and third iterations. Starting with any initial values satisfying the first-order interaction model, the iterative scaling method (5) converges to the unique solution of (4) as $m \to \infty$.

K. Other Problems

The sampling distributions associated with the procedures discussed here are usually very complicated, and often only asymptotic properties are known (→ 374 Sampling Distributions). Nonparametric rank analogs of many multivariate techniques for normal distribution are found in [18]. Robustness of the distributions of test statistics or of latent roots is investigated in [3,10,14]. Multivariate data analysis is found in [5].

References

[1] T. W. Anderson, An introduction to multivariate statistical analysis, Wiley, second edition, 1984.
[2] Y. M. M. Bishop, S. E. Fienberg, and P. W. Holland, Discrete multivariate analysis, MIT Press, 1976.
[3] Y. Fujikoshi, Asymptotic expansions for the distributions of the sample roots under nonnormality, Biometrika, 67 (1980), 45-51.
[4] N. C. Giri, Multivariate statistical inference, Academic Press, 1977.
[5] R. Gnanadesikan, Methods for statistical data analysis of multivariate observations, Wiley, 1977.
[6] S. J. Haberman, The analysis of frequency data, Univ. of Chicago Press, 1974.
[7] L. R. Haff, Empirical Bayes estimation of the multivariate normal covariance matrix, Ann. Statist., 8 (1980), 586-597.
[8] I. Hashimoto, Estimation for distributions with monotone likelihood ratio: Case of vector-valued parameter, Commun. Statist., A10 (1981), 1901-1913.
[9] W. James and C. Stein, Estimation with quadratic loss, Proc. Fourth Berkeley Symp. Math. Statist. and Prob., 1 (1961), 361-379.
[10] T. Kariya, Robustness of multivariate tests, Ann. Statist., 9 (1981), 1267-1275.
[11] M. G. Kendall, Multivariate analysis, Griffin, 1975.
[12] A. M. Kshirsagar, Multivariate analysis, Dekker, 1972.
[13] D. N. Lawley and A. E. Maxwell, Factor analysis as a statistical method, Butterworth, 1963.
[14] R. J. Muirhead and C. M. Waternaux, Asymptotic distributions in canonical correlation analysis and other multivariate procedures for nonnormal populations, Biometrika, 67 (1980), 31-43.
[15] M. Okamoto, Optimality of principal components, Multivariate Analysis II, P. R. Krishnaiah (ed.), Academic Press, 1968, 673-685.
[16] M. D. Perlman, Unbiasedness of the likelihood ratio tests for equality of several covariance matrices and equality of several multivariate normal populations, Ann. Statist., 8 (1980), 247-263.
[17] M. D. Perlman and I. Olkin, Unbiasedness of invariant tests for MANOVA and other multivariate problems, Ann. Statist., 8 (1980), 1326-1341.
[18] M. L. Puri and P. K. Sen, Nonparametric methods in multivariate analysis, Wiley, 1971.
[19] C. R. Rao, Linear statistical inference and its applications, Wiley, 1973.
[20] R. Schwartz, Admissible tests in multivariate analysis of variance, Ann. Math. Statist., 38 (1967), 698-710.
[21] M. S. Srivastava and C. G. Khatri, An introduction to multivariate statistics, North-Holland, 1979.
[22] C. Stein, Estimation of the mean of a multivariate normal distribution, Ann. Statist., 9 (1981), 1135-1151.
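
The iterative scaling method of Deming and Stephan described in section J can be sketched in a few lines. The following is a minimal illustration, not a reference implementation: the function name `iterative_scaling`, the uniform starting table, and the fixed number of sweeps are assumptions made here, and all cell counts are assumed to be strictly positive so that no margin vanishes.

```python
from itertools import product


def iterative_scaling(X, sweeps=200):
    """Fit the no-three-factor-effect model to a three-way table X[i][j][k].

    Hypothetical helper: each sweep rescales the current estimate p so that
    its (i, j), (i, k) and (j, k) margins match those of X / n in turn.
    Assumes every count in X is strictly positive.
    """
    I, J, K = len(X), len(X[0]), len(X[0][0])
    n = sum(X[i][j][k] for i, j, k in product(range(I), range(J), range(K)))
    # The uniform table has no interactions, so it is an admissible start.
    p = [[[1.0 / (I * J * K)] * K for _ in range(J)] for _ in range(I)]
    for _ in range(sweeps):
        # Step 1: fit the (i, j) margins, scaling proportionally over k.
        for i, j in product(range(I), range(J)):
            scale = (sum(X[i][j]) / n) / sum(p[i][j])
            for k in range(K):
                p[i][j][k] *= scale
        # Step 2: fit the (i, k) margins, scaling proportionally over j.
        for i, k in product(range(I), range(K)):
            target = sum(X[i][j][k] for j in range(J)) / n
            scale = target / sum(p[i][j][k] for j in range(J))
            for j in range(J):
                p[i][j][k] *= scale
        # Step 3: fit the (j, k) margins, scaling proportionally over i.
        for j, k in product(range(J), range(K)):
            target = sum(X[i][j][k] for i in range(I)) / n
            scale = target / sum(p[i][j][k] for i in range(I))
            for i in range(I):
                p[i][j][k] *= scale
    return p
```

Because every step multiplies the table by a factor depending on only two of the three indices, the three-factor effect stays at zero throughout, while the two-way margins approach those of the data.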
281 A 1062
Network Flow Problems

281 (XIX.5)
Network Flow Problems

A. Introduction

The network flow problem is a special kind of mathematical-programming problem (→ 255 Linear Programming, 264 Mathematical Programming, 292 Nonlinear Programming), where the variables of the objective function and the constraints are all defined in terms of a graph (→ 186 Graph Theory). Owing to its special structure, the mathematical properties of the network flow problem as well as the solution algorithms have been investigated in detail. Network flow problems have a variety of useful applications in fields such as transportation, scheduling, and resource allocation, and in operations research in general, so that they now constitute an important class of mathematical programming. By the term "network" (Netzwerk in German, réseau in French, set' in Russian) we usually mean a graph with some physical attributes attached to its edges and vertices.

B. Basic Form of the Problem

Let $G = (V, E, \partial^+, \partial^-)$ be a graph with vertex set V, edge set E, and incidence relations $\partial^+, \partial^-: E \to V$, and let R be the field of real numbers. (Most of the statements in the following are valid if we take for R an ordered field or ordered additive group more general than the field of real numbers.) Furthermore, we regard the collection $\Xi$ of all functions $\xi: E \to R$ as a vector space of dimension |E|, denoting $\xi(e_\kappa)$ by $\xi^\kappa$ if $E = \{e_1, \ldots, e_n\}$. Similarly, the collection $\Omega$ of all functions $\omega: V \to R$ is a vector space of dimension |V|. If we define the mapping $\delta^\pm: V \to 2^E$ by $\delta^\pm v = \{e \mid \partial^\pm e = v\}$ (→ 186 Graph Theory), a linear mapping $\partial: \Xi \to \Omega$ is naturally introduced through the relation $(\partial\xi)(v) = \sum_{e \in \delta^+ v} \xi(e) - \sum_{e \in \delta^- v} \xi(e)$. A vector $\xi \in \Xi$ is called a flow on G if $\partial\xi = 0$. The dual space $\Xi^*$ of $\Xi$ and the dual $\Omega^*$ of $\Omega$ are identified with the collection of functions $\eta: E \to R$ and that of $\zeta: V \to R$, respectively, under the obvious correspondence. Then the linear mapping $\delta: \Omega^* \to \Xi^*$, which is contragredient to $\partial$, is defined by means of the relation $(\delta\zeta)(e) = \zeta(\partial^+ e) - \zeta(\partial^- e)$. A vector $\eta \in \Xi^*$, which is the image of $\zeta$ under $\delta$, i.e., which satisfies $\eta = \delta\zeta$, is called a tension on G, and $\zeta$ is called the potential on G corresponding to the tension $\eta$. (Sometimes, $\xi(e)$ is called a flow in edge e, $\eta(e)$ a tension across e, and $\zeta(v)$ a potential at vertex v.)

A continuous curve C (⊂ R × R) on the Euclidean plane R × R is said to be monotone if $(x_1 - x_2)(y_1 - y_2) \ge 0$ for any $(x_1, y_1), (x_2, y_2) \in C$. A monotone curve C is called a characteristic curve if its projection to each coordinate axis, i.e., $C_x = \{x \mid (x, y) \in C\}$ and $C_y = \{y \mid (x, y) \in C\}$, is a closed interval. For a characteristic curve C, two convex functions, $\varphi(x) = \int_C (y - y_0)\,dx$ and $\psi(y) = \int_C (x - x_0)\,dy$, are defined, where $(x_0, y_0)$ is a point fixed on C and the integrals are taken along $(x, y) \in C$. (It is understood that $\varphi(x) = \infty$ if $x \notin C_x$ and $\psi(y) = \infty$ if $y \notin C_y$.) These two functions defined for a fixed C are conjugate to each other in Fenchel's sense, and they satisfy $\varphi(x) + \psi(y) \ge (x - x_0)(y - y_0)$ for any $(x, y) \in R \times R$, where the inequality reduces to an equality if and only if $(x, y) \in C$ (→ 88 Convex Analysis, 292 Nonlinear Programming).

A network N in network flow theory is a graph G in which each edge $e_\kappa$ of the edge set $E = \{e_1, \ldots, e_n\}$ is given a characteristic curve $C^\kappa$. On a network N, the following three problems, PI, PII, and PIII, are defined. PI: Find a flow $\xi$ that minimizes $\Phi(\xi) = \sum_{\kappa=1}^{n} \varphi_\kappa(\xi^\kappa)$ under the constraints $\xi^\kappa = \xi(e_\kappa) \in C_x^\kappa$ (κ = 1, …, n). PII: Find a tension $\eta$ which minimizes $\Psi(\eta) = \sum_{\kappa=1}^{n} \psi_\kappa(\eta_\kappa)$ under the constraints $\eta_\kappa = \eta(e_\kappa) \in C_y^\kappa$ (κ = 1, …, n). PIII: Find a pair $(\xi, \eta)$ of a flow $\xi$ and a tension $\eta$ such that $(\xi^\kappa, \eta_\kappa) \in C^\kappa$ for all κ = 1, …, n. (With respect to PI or PII, a flow or a tension satisfying the "constraints" is ordinarily called a feasible flow or tension, respectively.) Then, as a special case of the (Karush-)Kuhn-Tucker theorem and the duality theorem in nonlinear programming, we have the following theorems (A) and (B).

(A) For a given network N, one and only one of the following four alternatives is the case:
(i) There is a feasible flow in PI, and at the same time, there is a feasible tension in PII. In this case, both PI and PII have a solution, and, for any solution $\hat\xi$ of PI and any solution $\hat\eta$ of PII, the pair $(\hat\xi, \hat\eta)$ is a solution of PIII; and, conversely, for any solution $(\hat\xi, \hat\eta)$ of PIII, the flow $\hat\xi$ is a solution of PI and the tension $\hat\eta$ is a solution of PII.
(ii) There is a feasible flow in PI, whereas there is no feasible tension in PII. In this case, $\Phi(\xi)$ of PI is not bounded below for feasible flows $\xi$.
(iii) There is a feasible tension in PII, whereas there is no feasible flow in PI. In this case, $\Psi(\eta)$ of PII is not bounded below for feasible tensions $\eta$.
(iv) There is neither a feasible flow in PI nor a feasible tension in PII.

(B) Let $C_x^\kappa = [b^\kappa, c^\kappa]$ ($b^\kappa \le c^\kappa$; $b^\kappa$ can be $-\infty$, and $c^\kappa$, $\infty$) and $C_y^\kappa = [d_\kappa, f_\kappa]$ ($d_\kappa$ can be $-\infty$, and $f_\kappa$, $\infty$). Then a necessary and sufficient condition for PI to have a feasible flow is that
$\sum_1 c^\kappa - \sum_2 b^\kappa \ge 0$ for every cutset (cocycle or cocircuit) $\omega$ of G, where the summation $\sum_1$ (resp. $\sum_2$) is taken over all the edges $e_\kappa$ (∈ E) that lie in $\omega$ in the positive (resp. negative) direction. Similarly, a necessary and sufficient condition for PII to have a feasible tension is that $\sum_1 f_\kappa - \sum_2 d_\kappa \ge 0$ for every tieset (cycle or circuit) $\theta$ of G, where $\sum_1$ (resp. $\sum_2$) is taken over all the edges $e_\kappa$ that lie in $\theta$ in the positive (resp. negative) direction.

C. Shortest Paths, Maximum Flows, and Minimum-Cost Flows

We choose one of the edges of G, say $e_1$, as the reference edge and assign it the parametric characteristic curve $C^1(a) = \{(x, y) \mid y = x + a\}$. Then if there is a feasible flow on the network obtained from G by contracting (i.e., short-circuiting) $e_1$, and if at the same time there is a feasible tension on the network obtained from G by deleting (i.e., open-circuiting) $e_1$, then problem PIII has a solution $(\xi(a), \eta(a))$ for every real a, and $\xi^1(a)$ and $\eta_1(a)$ are uniquely determined for each a. The problem of determining these parametric solutions is the two-terminal problem for the two-terminal network $N_1$ (which is obtained from G by deleting $e_1$) with the vertex $\partial^- e_1$ as the entrance (or source) and the vertex $\partial^+ e_1$ as the exit (or sink). The curve $C = \{(\xi^1(a), \eta_1(a)) \mid a \in R\}$ enjoys the properties of a characteristic curve, and is called the two-terminal characteristic of $N_1$ with respect to the entrance $\partial^- e_1$ and exit $\partial^+ e_1$. A two-terminal network for which only the projections to the x-axis $C_x^\kappa = [b^\kappa, c^\kappa]$ of the edge characteristics are specified (κ = 2, …, n) is called a capacitated network, and the maximum-flow problem for a capacitated network $N_1$ can be mathematically formulated as the problem of determining the projection to the x-axis $C_x = [b, c]$ of the two-terminal characteristics of $N_1$. For the maximum-flow problem, the relation $c = \min\{\sum'_1 c^\kappa - \sum'_2 b^\kappa\}$ holds, where the minimum is taken over all the cutsets that contain $e_1$ in the negative direction and where $\sum'_1$ (resp. $\sum'_2$) denotes the summation over all the edges, except $e_1$, lying in a cutset in the positive (resp. negative) direction. This relation is called the maximum-flow minimum-cut theorem. (A similar relation holds also for b.)

Similarly, or dually, the problem of determining the y-projection $C_y = [d, f]$ of the two-terminal characteristic of a two-terminal network $N_1$ for which the y-projections $C_y^\kappa = [d_\kappa, f_\kappa]$ of the characteristics are specified to the nonreference edges $e_\kappa$ (κ = 2, …, n) is a network flow formulation of the shortest-path problem. For the shortest-path problem, the relation $f = \min\{\sum'_1 f_\kappa - \sum'_2 d_\kappa\}$, called the maximum-separation minimum-distance theorem, holds, where the minimum is taken over all the tiesets that contain $e_1$ in the positive direction and where $\sum'_1$ (resp. $\sum'_2$) denotes the summation over all the edges, except $e_1$, lying in a tieset in the positive (resp. negative) direction.

The minimum-cost flow problem is to determine the two-terminal characteristic of $N_1$ when all the $C^\kappa$ (κ ≠ 1) are of staircase form.

A number of algorithms of time complexity (→ 71 Complexity of Computations) $O(|V|^3)$ exist for the shortest-path problem [4]; the algorithm proposed by E. W. Dijkstra [7] for the case $d_\kappa \le 0 \le f_\kappa$ for all κ (≠ 1) is of complexity $O(|V|^2)$. The algorithm of A. V. Karzanov [8, 9] for the maximum-flow problem is of complexity $O(|V|^3)$. The minimum-cost flow problem can be solved by alternately solving the subproblems of the shortest-path type and those of the maximum-flow type, but no algorithm of time complexity polynomial in |E| and |V| has been found.

D. Transportation and Scheduling

Let us make the edges of a graph $G = (V, E, \partial^+, \partial^-)$ correspond to the transportation routes and the flow to the stream of a commodity; impose a capacity constraint of the form $0 \le \xi^\kappa \le c^\kappa$ on the flow $\xi^\kappa$ in each edge $e_\kappa$, and assume a cost function of the form $\varphi_\kappa(\xi^\kappa) = f_\kappa\xi^\kappa$ for each edge. ($c^\kappa$ is a constant called the capacity of edge $e_\kappa$, and $f_\kappa$ is a constant called the unit cost of edge $e_\kappa$.) Furthermore, let us specify a subset $V_1$ (⊂ V) of vertices as the set of entrances and another subset $V_2$ (⊂ V) as the set of exits, where $V_1 \cap V_2 = \emptyset$, and prescribe the amount of inflow q(v) to each entrance $v \in V_1$ and the amount of outflow q(v) from each exit $v \in V_2$, where we must have $\sum_{v \in V_1} q(v) = \sum_{v \in V_2} q(v)$. Planning a transportation plan that satisfies all the above-prescribed conditions and that minimizes the total cost $\sum_{e_\kappa \in E} f_\kappa\xi^\kappa$ can be reduced to finding a minimum-cost flow on the extended network $\tilde{G} = (\tilde{V}, \tilde{E}, \tilde\partial^+, \tilde\partial^-)$, defined as follows, such that the flow in the reference edge is to be maximum: $\tilde{V} = V \cup \{s\} \cup \{t\}$ ($s, t \notin V$), $\tilde{E} = E \cup E_1 \cup E_2 \cup \{e_0\}$ ($E_1 \cap E = E_2 \cap E = \emptyset$, $e_0 \notin E \cup E_1 \cup E_2$; $e_0$ is the reference edge, $\tilde\partial^+ e_0 = t$, $\tilde\partial^- e_0 = s$; $E_1 = \{e \mid \tilde\partial^+ e = s, \tilde\partial^- e = v, v \in V_1\}$, $C(e) = \{(0, y) \mid y \le 0\} \cup \{(x, 0) \mid 0 \le x \le q(v)\} \cup \{(q(v), y) \mid 0 \le y\}$, where $\tilde\partial^- e = v \in V_1$; $E_2 = \{e \mid \tilde\partial^+ e = v, \tilde\partial^- e = t, v \in V_2\}$, $C(e) = \{(0, y) \mid y \le 0\} \cup \{(x, 0) \mid 0 \le x \le q(v)\} \cup \{(q(v), y) \mid 0 \le y\}$, where $\tilde\partial^+ e = v \in V_2$), $\tilde\partial^\pm|_E = \partial^\pm$, $C(e_\kappa) = \{(0, y) \mid y \le f_\kappa\} \cup \{(x, f_\kappa) \mid 0 \le x \le c^\kappa\} \cup \{(c^\kappa, y) \mid y \ge f_\kappa\}$ for $e_\kappa \in E$.

For the project-scheduling problem with an
acyclic graph $G = (V, E, \partial^+, \partial^-)$ as the arrow diagram for which the start node $s \in V$ and the completion node $t \in V$ are specified, we make an edge e correspond to a job (or activity), and a vertex v correspond to the event that all the jobs corresponding to $\delta^- v$ have been finished, whereas those corresponding to $\delta^+ v$ are ready to commence. Then we interpret the negative of the tension of an edge as the time (i.e., duration) spent on the corresponding job and the potential at a vertex as the time (i.e., instant) of the corresponding event taking place. Furthermore, we assume that to each edge $e_\kappa$ (or the corresponding job) are given the normal duration $-d_\kappa$ (> 0), the crash duration $-f_\kappa$ (> 0, ≤ $-d_\kappa$), and the unit cost $c^\kappa$ (> 0) we have to pay in order to decrease the job time by one unit, where the job time necessarily lies between the normal and the crash duration. Under the above-listed specifications, we consider the relation between the total project time (i.e., the duration from the event corresponding to the start node to that corresponding to the completion node) and the extra cost to be paid for decreasing the project time below the normal one. This problem can be reduced to the two-terminal characteristic problem for the following network $\tilde{G}$: $\tilde{G}$ is obtained from G by adding the reference edge $e_0$ ($\partial^+ e_0 = t$, $\partial^- e_0 = s$); $C^\kappa = \{(0, y) \mid y \le d_\kappa\} \cup \{(x, d_\kappa) \mid 0 \le x \le c^\kappa\} \cup \{(c^\kappa, y) \mid d_\kappa \le y \le f_\kappa\} \cup \{(x, f_\kappa) \mid c^\kappa \le x\}$.

E. Applications to Combinatorial Optimization

Many problems in combinatorial optimization can be reduced to network flow problems. The problem of finding a maximum matching on a bipartite graph G (→ 186 Graph Theory) is reduced to the maximum-flow problem for the graph representing the transportation problem on G with one of the two vertex sets as the entrance set and the other as the exit set, where all the edges have a unit capacity and the amount of inflow/outflow to/from each entrance/exit vertex is equal to unity (the cost being irrelevant). (The existence of an integer solution to this kind of maximum-flow problem is proved constructively on the basis of the solution algorithm.) The maximum-flow minimum-cut theorem in this particular case can be stated as follows: The maximum cardinality of matchings on a bipartite graph is equal to the minimum cardinality of vertex subsets which cover all the edges (this is known as the König-Egerváry theorem). The Dilworth theorem for a partially ordered set, the criterion for the existence of a graph with prescribed vertex degrees, etc., are known to be reducible to network flow problems [2].

F. Generalizations

The network flow problem may be generalized in various directions. Replacing a graph by a matroid (→ 66 Combinatorics) or considering stronger conditions on the feasibility of flows and tensions are natural extensions [4,6,10,11]. Another extension is to consider several kinds of flow, instead of a single kind, that simultaneously affect the capacities of edges. This latter problem is called the multicommodity flow problem in contrast to the single-commodity flow problem that was treated above [5]. It can be said that any extension of the network flow problem aims at a mathematical model that has wider application without losing the advantage of having simple effective solution algorithms.

References

[1] G. J. Minty, Monotone networks, Proc. Roy. Soc. London, A257 (1960), 194-212.
[2] L. R. Ford, Jr., and D. R. Fulkerson, Flows in networks, Princeton Univ. Press, 1962.
[3] C. Berge and A. Ghouila-Houri, Programmes, jeux et réseaux de transport, Dunod, 1962.
[4] M. Iri, Network flow, transportation and scheduling - Theory and algorithms, Academic Press, 1969.
[5] T. C. Hu, Integer programming and network flows, Addison-Wesley, 1969.
[6] E. L. Lawler, Combinatorial optimization - Networks and matroids, Holt, Rinehart and Winston, 1976.
[7] E. W. Dijkstra, A note on two problems in connexion with graphs, Numerische Math., 1 (1959), 269-271.
[8] E. A. Dinits, Algorithm solutions to maximum network flow problems by power analysis (in Russian), Dokl. Akad. Nauk SSSR, 194 (1970), 754-757.
[9] A. V. Karzanov, Calculation of maximum network flow by power methods (in Russian), Dokl. Akad. Nauk SSSR, 215 (1974), 49-52.
[10] M. Iri and S. Fujishige, Use of matroids in operations research, circuits and systems theory, Intern. J. Syst. Sci., 12 (1981), 27-54.
[11] C. P. Bruter, Éléments de la théorie des matroïdes, Lecture notes in math. 387, Springer, 1974.
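
The reduction of bipartite matching to a maximum-flow problem described in section E can be illustrated as follows. This is a minimal sketch under stated assumptions: it uses a plain augmenting-path method with breadth-first search rather than the Karzanov algorithm cited in section C, and the names `add_edge`, `max_flow`, and `max_bipartite_matching`, as well as the node labels, are hypothetical choices made for the example.

```python
from collections import deque


def add_edge(cap, u, v, c):
    """Add capacity c on edge u -> v and make sure the reverse edge exists."""
    cap.setdefault(u, {})
    cap.setdefault(v, {})
    cap[u][v] = cap[u].get(v, 0) + c
    cap[v].setdefault(u, 0)


def max_flow(cap, s, t):
    """Augmenting-path maximum flow; cap holds residual capacities in place."""
    flow = 0
    while True:
        # Breadth-first search for a path with positive residual capacity.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow  # no augmenting path is left
        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = t
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck


def max_bipartite_matching(left, right, edges):
    """Size of a maximum matching, via a unit-capacity flow network."""
    s, t = ("source",), ("sink",)
    cap = {s: {}, t: {}}
    for u in left:
        add_edge(cap, s, ("L", u), 1)       # unit inflow to each entrance
    for v in right:
        add_edge(cap, ("R", v), t, 1)       # unit outflow from each exit
    for u, v in edges:
        add_edge(cap, ("L", u), ("R", v), 1)
    return max_flow(cap, s, t)
```

Since all capacities are integers, every augmentation pushes an integer amount, which mirrors the constructive integrality argument mentioned in the text.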
1065 282 C
Networks

282 (XX.17)
Networks

A. Linear Graphs (→ 186 Graph Theory)

A linear graph (or simply graph) is an object composed of (i) a finite set $\{B_\kappa\}$ (κ = 1, …, n) of elements called branches (or edges), (ii) a finite set $\{N_\alpha\}$ (α = 1, …, m) of elements called nodes (or vertices), and (iii) an incidence relation between branches and nodes represented by a function $[B_\kappa : N_\alpha]$ from $\{B_\kappa\} \times \{N_\alpha\}$ to $\{0, 1, -1\}$ such that for every κ, there exist exactly one $N_\alpha$ with $[B_\kappa : N_\alpha] = 1$ and exactly one $N_\beta$ with $[B_\kappa : N_\beta] = -1$, with all the other $[B_\kappa : N_\gamma]$ equal to 0. (Intuitively, a branch $B_\kappa$ starts from the node $N_\alpha$ with $[B_\kappa : N_\alpha] = 1$ and ends at the node $N_\beta$ with $[B_\kappa : N_\beta] = -1$.) In terms of topology, a (linear) graph is a 1-dimensional finite simplicial complex (→ 70 Complexes, 186 Graph Theory). A network in the wide sense is a linear graph whose branches and nodes are endowed with some physical properties.

B. Networks

A contact network, one of the simplest kinds of networks, is an abstraction of a circuit whose branches correspond to contact points of relays and switches that are allowed to take a finite number of physical states, e.g., the two states "on" and "off." The theory of contact networks is developed by means of Boolean algebra and is applied to switching networks, such as telephone exchange networks and the logical networks of digital computers.

In most cases, to the branches $B_\kappa$ of a network two kinds of real quantities $i_\kappa$ and $E_\kappa$ are assigned (which may be variables or functions of time) satisfying the conditions:

(i) $\sum_{\kappa=1}^{n} [B_\kappa : N_\alpha]\, i_\kappa = 0$ (α = 1, …, m);

(ii) there exist $\varepsilon_\alpha$ such that

[equation missing in source]

In the case of electric networks, where $i_\kappa$, $E_\kappa$, and $\varepsilon_\alpha$ are the current in branch $B_\kappa$, the voltage across branch $B_\kappa$, and the potential at node $N_\alpha$, respectively, these conditions are known as Kirchhoff's laws.

The network flow problem, which has a number of practically important applications in operations research (e.g., transportation problems, project-scheduling problems) and is a special case of mathematical programming, can be formulated as the problem of minimizing $\sum_{\kappa=1}^{n} f_\kappa(i_\kappa)$ under condition (i) (or minimizing $\sum_{\kappa=1}^{n} f_\kappa(E_\kappa)$ under condition (ii)), where for each branch $B_\kappa$, $f_\kappa$ is a given convex function defined on a given interval $[a_\kappa, b_\kappa]$.

C. Electric Networks

Since there has been a great deal of research on electric networks, "network" often means "electric network." We call a branch in which the current is a given function of time a current source, and one across which the voltage is a given function of time a voltage source. A branch that is either a current source or a voltage source is called a source branch. A network with M source branches is called an M-port network. If the currents $i_\kappa$ in and the voltages $E_\kappa$ across the non-source branches (n' in number) are related by

$$E_\kappa = \sum_{\lambda=1}^{n'} z_{\kappa\lambda}\, i_\lambda \quad (\kappa = 1, \ldots, n')$$

or

$$i_\kappa = \sum_{\lambda=1}^{n'} y_{\kappa\lambda}\, E_\lambda \quad (\kappa = 1, \ldots, n'),$$

where the $z_{\kappa\lambda}$ or $y_{\kappa\lambda}$ are linear integrodifferential operators, the network is said to be linear. If $z_{\kappa\lambda} = z_{\lambda\kappa}$ or $y_{\kappa\lambda} = y_{\lambda\kappa}$, it is said to be reciprocal or bilateral; if the $z_{\kappa\lambda}$ or $y_{\kappa\lambda}$ are invariant under the change of the origin of time, it is said to be time-invariant; and if a linear time-invariant network satisfies the condition

[equation missing in source]

for every t and for every choice of functions of time for $i_\kappa$ or $E_\kappa$ associated with the source branches $B_\kappa$ in S, provided that the current-voltage relations for non-source branches are satisfied, then the network is said to be passive. Under certain nonsingularity conditions, for the currents in and the voltages across the source branches (denoted by $I_\kappa$ and $e_\kappa$) of a linear M-port network, we have the relations

[equation missing in source]

where the matrices $Z_{\kappa\lambda}$ and $Y_{\kappa\lambda}$ of linear integrodifferential operators are called the port-impedance matrix and the port-admittance matrix of the network, respectively. Analysis determines $Z_{\kappa\lambda}$, $Y_{\kappa\lambda}$ from a given linear graph and given $z_{\kappa\lambda}$, $y_{\kappa\lambda}$, while synthesis finds a network (i.e., $z_{\kappa\lambda}$ or $y_{\kappa\lambda}$, as well as a linear graph) when part of $Z_{\kappa\lambda}$, $Y_{\kappa\lambda}$, or some relations to be satisfied by them are given. (In synthesis, the $z_{\kappa\lambda}$ or $y_{\kappa\lambda}$ are usually confined to some special class.) In analysis as well as synthesis we usu-
ally deal with the Laplace transforms $z_{\kappa\lambda}(s)$, $y_{\kappa\lambda}(s)$, $Z_{\kappa\lambda}(s)$, $Y_{\kappa\lambda}(s)$ instead of $z_{\kappa\lambda}$, $y_{\kappa\lambda}$, $Z_{\kappa\lambda}$, $Y_{\kappa\lambda}$ themselves, where the characteristics $Z_{\kappa\lambda}(s)$, $Y_{\kappa\lambda}(s)$ of a network are determined by the topological properties of its linear graph and the properties of $z_{\kappa\lambda}(s)$, $y_{\kappa\lambda}(s)$ as functions of the complex variable s. (If ω is the angular frequency, s = iω.)

The following fundamental facts are known in regard to analysis and synthesis:

(1) If $Z_{\kappa\lambda}(s)$, $Y_{\kappa\lambda}(s)$ are rational functions of s holomorphic on the open right half-plane, a necessary and sufficient condition for a network to be passive is that for arbitrary real numbers $t_\kappa$,

[equation missing in source]

is a positive real function of s, i.e., a function whose value is real when s is real and lies on the right half-plane when s lies on the right half-plane.

(2) Every passive one-port network can be synthesized by using a finite number of three kinds of branches, i.e., positive resistors ($E_\kappa = R_\kappa i_\kappa$, $R_\kappa > 0$), positive capacitors ($i_\kappa = C_\kappa\, dE_\kappa/dt$, $C_\kappa > 0$), and positive inductors ($E_\kappa = L_\kappa\, di_\kappa/dt$, $L_\kappa > 0$).

(3) Every passive M-port network (M ≥ 2) can be synthesized by using, in addition to the three kinds of branches mentioned in (2), ideal transformers (an ideal transformer is a pair of branches $(B_\kappa, B_\lambda)$ such that $i_\kappa = n i_\lambda$, $E_\lambda = n E_\kappa$, and n = real number) and ideal gyrators (an ideal gyrator is a pair of branches $(B_\kappa, B_\lambda)$ such that $i_\kappa = E_\lambda$, $i_\lambda = -E_\kappa$); ideal gyrators are not needed, however, to synthesize a reciprocal network.

However, very little is known about the synthesis of passive M-port networks without using ideal transformers and gyrators. Topological methods are expected to be powerful for such synthesis problems.

The linear graph structure of a network loses its significance if ideal transformers are admitted.

References

[1] K. Kondo (ed.), RAAG memoirs of the unifying study of basic problems in engineering and the physical sciences by means of geometry, Association for Science Documents Information, Tokyo, I, 1955; II, 1958; III, 1962.
[2] W. Cauer, Synthesis of linear communication networks, McGraw-Hill, second edition, 1958. (Original in German, 1941.)
[3] J. E. Storer, Passive network synthesis, McGraw-Hill, 1957.
[4] L. Weinberg, Network analysis and synthesis, McGraw-Hill, 1962.
[5] O. Brune, Synthesis of a finite two-terminal network whose driving-point impedance is a prescribed function of frequency, J. Math. Phys., 10 (1931), 191-236.
[6] Y. Oono and K. Yasuura, Synthesis of finite passive 2n-terminal networks with prescribed scattering matrices, Mem. Fac. Eng. Kyushu Univ., 14 (1954), 125-177.
[7] S. Seshu and M. B. Reed, Linear graphs and electrical networks, Addison-Wesley, 1961.
[8] W. H. Kim and R. T. Chien, Topological analysis and synthesis of communication networks, Columbia Univ. Press, 1962.
[9] C. Berge and A. Ghouila-Houri, Programming, games and transportation networks, Methuen, 1965. (Original in French, 1962.)
[10] L. R. Ford, Jr., and D. R. Fulkerson, Flows in networks, Princeton Univ. Press, 1962.
[11] R. W. Newcomb, Linear multiport synthesis, McGraw-Hill, 1966.

283 (XXI.37)
Newton, Isaac

Sir Isaac Newton (December 25, 1642-March 20, 1727), the English mathematician and physicist, was born into a family of farmers in Woolsthorpe, Lincolnshire. In 1661, he entered Cambridge University, where he was greatly influenced by the professor who was teaching geometry, I. Barrow, and where he began research in Kepler's optics and Descartes's geometry.

In 1665, he discovered the binomial theorem, and in the same year, during a stay at his birthplace to escape the plague, he began work on his three great discoveries: the spectral decomposition of light, the universal law of gravity, and differential and integral calculus. He returned to Cambridge University in 1667, and in the following year invented the reflecting telescope and proposed his theory of light particles. During this period, he succeeded Barrow as professor and lectured on optics. At the same time, he probed deeper into the calculus. Guided by Barrow's insight that differentiation and integration were inverse operations and also by his own research on infinite series, Newton obtained the fundamental theorem of calculus. Leibniz obtained the same theorem a little later, and a struggle resulted between the two over priority. The two discoveries were independent, but because Leibniz's notation was superior, the later development of calculus owes more to him.

The dynamic elucidation of the heliocentric theory was accomplished in Newton's main
work, Principia mathematica philosophiae naturalis (1686-1687) in which Kepler's law on the movement of the planets, Galileo's theory of movement, and Huygens's theory of oscillation were unified into the three laws of Newtonian dynamics. These natural laws, which deal with all dynamic phenomena in the universe, are the most superlative realization of Descartes's concept of exploring the mathematical structure of nature; they had an essential influence on the later development of the natural sciences. The style of writing is similar to that in Euclid's Stoicheia. In the Principia, Newton also sets forth his philosophical position.
In 1695, Newton moved to London and became engrossed in theology. He was appointed Master of the Mint and was president of the Royal Society from 1703 until his death. While he is sometimes said to have divorced himself from science, many of his notes on geometry date from this time.

References

[1] D. T. Whiteside (ed.), Sir Isaac Newton, Mathematical works, Johnson Reprint, I, 1964; II, 1969.
[2] D. T. Whiteside (ed.), Sir Isaac Newton, Mathematical papers, Cambridge Univ. Press, I, 1967; II, 1968; III, 1969; IV, 1971; V, 1972; VI, 1974; VII, 1976; VIII, 1981.
[3] I. Newton, Philosophiae naturalis principia mathematica, London, 1687; English translation, Mathematical principles of natural philosophy, trans. by A. Motte in 1729, Univ. of California Press, 1934.
[4] D. Brewster, Memoirs of the life, writings, and discoveries of Sir Isaac Newton, Constable, 1855.

284 (III.12)
Noetherian Rings

A. General Remarks

In this article, we mean by ring a commutative ring with unity element. Thus a Noetherian ring is a commutative ring with unity element that satisfies the †maximum condition for its ideals; if it is also an †integral domain, then it is called a Noetherian integral domain or Noetherian domain (for right and left Noetherian rings → 368 Rings F).
A ring is Noetherian if and only if every prime ideal of the ring has a †finite basis (Cohen's theorem). A ring is Noetherian if it is generated by a finite number of elements over a Noetherian ring (Hilbert's basis theorem).
The following three conditions for a ring R are equivalent: (i) R is an Artinian ring, that is, it satisfies the †minimum condition for its ideals (for right and left Artinian rings → 368 Rings F). (ii) R is a Noetherian ring and every prime ideal of R is a maximal ideal. (iii) There exist a finite number of Noetherian rings $R_i$ ($i = 1, \ldots, n$) whose †maximal ideals are †nilpotent such that R is the direct sum of the $R_i$ ($i = 1, \ldots, n$). We say that the restricted minimum condition holds in a ring R if $R/\mathfrak{a}$ is an Artinian ring for every nonzero ideal $\mathfrak{a}$ of R; the latter condition is satisfied if and only if R is either an Artinian ring or a Noetherian domain of †Krull dimension 1. Every ideal of a Noetherian ring R can be expressed as the intersection of a finite number of †primary ideals. Given a ring R and an R-module M, a submodule P of M is said to be a primary submodule of M if every element a of R that is a zero divisor with respect to M/P (i.e., there exists an $m \in M/P$ such that $m \neq 0$ and $am = 0$) is nilpotent with respect to M/P (i.e., there exists a natural number n such that $a^n(M/P) = 0$).
The property of ideals in Noetherian rings stated above can be generalized to the case of Noetherian modules: If an R-module M is a †Noetherian module, then every submodule of M can be expressed as the intersection of a finite number of primary submodules.
Let R be a Noetherian ring, $\mathfrak{a}$ an ideal of R, M a finite R-module, and N and N' submodules of M. Then we have (i) the Artin-Rees lemma: There exists a natural number r such that for all $n > r$, $\mathfrak{a}^n N \cap N' = \mathfrak{a}^{n-r}(\mathfrak{a}^r N \cap N')$. (ii) Krull's intersection theorem: $\bigcap_{n=1}^{\infty} \mathfrak{a}^n M = \{m \in M \mid \exists a \in \mathfrak{a} \text{ such that } (1 - a)m = 0\}$ (hence, in particular, if $\mathfrak{m}$ is the †Jacobson radical of R, then $\bigcap_{n=1}^{\infty} \mathfrak{m}^n M = \{0\}$). (iii) Krull's altitude theorem: If $\mathfrak{a}$ is generated by s elements and $\mathfrak{p}$ is a †minimal prime divisor of $\mathfrak{a}$, then the height of $\mathfrak{p} \leq s$.

B. Topology Defined by an Ideal

Let R be a ring, $\mathfrak{a}$ an ideal of R, and M an R-module. Then the $\mathfrak{a}$-adic topology of M is defined to be the topology on M such that $\{\mathfrak{a}^n M \mid n = 1, 2, \ldots\}$ is a †base for the neighborhood system of zero. In particular, let R be Noetherian, M a finite R-module, and N a submodule of M. Then by the Artin-Rees lemma, the $\mathfrak{a}$-adic topology of N coincides with the topology on N as a subspace of M with the $\mathfrak{a}$-adic topology. Returning to the general case, M is a †$T_0$-space (under the $\mathfrak{a}$-adic topology) if and only if $\bigcap_{n=1}^{\infty} \mathfrak{a}^n M = \{0\}$,
or, in other words, if and only if M is a †metric space, where the †distance $d(a, b)$ between points a, b in M is defined to be $\inf\{2^{-n} \mid a - b \in \mathfrak{a}^n M\}$. Moreover, if N is a submodule of M, then M/N is a $T_0$-space if and only if N is a closed subset of M, that is, if and only if $\bigcap_{n=1}^{\infty}(N + \mathfrak{a}^n M) = N$. A sequence $\{a_n\} = (a_1, a_2, \ldots, a_n, \ldots)$ is called a Cauchy sequence (under the $\mathfrak{a}$-adic topology) if $\forall n\, \exists N\, \forall r\, \forall s$ (with each of them a natural number) $a_{N+r} - a_{N+s} \in \mathfrak{a}^n M$; for this it is sufficient that $\forall n\, \exists N\, \forall r\ a_{N+r+1} - a_{N+r} \in \mathfrak{a}^n M$. If this sequence converges to zero (i.e., $\forall n\, \exists N\, \forall r\ a_{N+r} \in \mathfrak{a}^n M$), it is called a null sequence. The set $\mathfrak{M}$ of all Cauchy sequences in M becomes an R-module if we define their sum and multiplication by an element of R by $\{a_n\} + \{b_n\} = \{a_n + b_n\}$, $c\{a_n\} = \{ca_n\}$. Then the set $\mathfrak{N}$ of all null sequences is a submodule of $\mathfrak{M}$. An element m of M is identified with the sequence $(m, m, \ldots, m, \ldots)$, and we regard M as a submodule of $\mathfrak{M}$. Then the R-module $\hat{M} = \mathfrak{M}/\mathfrak{N}$ is the †completion of $M/(\bigcap_n \mathfrak{a}^n M)$ (as a metric space under the $\mathfrak{a}$-adic topology). $\hat{M}$ is called the $\mathfrak{a}$-adic completion of M. If $\mathfrak{a}$ has a finite basis, then the topology of $\hat{M}$ (as the completion) coincides with its $\mathfrak{a}$-adic topology. If M = R, we define a multiplication in $\mathfrak{M}$ by $\{a_n\}\{b_n\} = \{a_n b_n\}$. In this case, $\mathfrak{M}$ is a ring in which $\mathfrak{N}$ is an ideal, and hence the completion $\hat{R} = \hat{M}$ is a ring. If $\mathfrak{a} = \sum_{i=1}^{r} a_i R$, then considering the †ring of formal power series $\tilde{R} = R[[x_1, \ldots, x_r]]$ and its ideal $\tilde{\mathfrak{n}} = \bigcap_{n=1}^{\infty}\left(\sum_{i=1}^{r}(a_i - x_i)\tilde{R} + \mathfrak{a}^n \tilde{R}\right)$, we have $\hat{R} \cong \tilde{R}/\tilde{\mathfrak{n}}$.

C. Zariski Rings

For a Noetherian ring R and an ideal $\mathfrak{a}$ of R, every element b of R such that $1 - b \in \mathfrak{a}$ has an inverse in R if and only if every ideal of R is a closed subset of R under the $\mathfrak{a}$-adic topology. When this condition is satisfied, the ring R with the $\mathfrak{a}$-adic topology is called a Zariski ring; we often express this by saying that $(R, \mathfrak{a})$ is a Zariski ring. A Zariski ring is called complete if it is a complete topological space. The completion $\hat{R}$ of the Zariski ring $(R, \mathfrak{a})$ has the $\mathfrak{a}\hat{R}$-adic topology, and $(\hat{R}, \mathfrak{a}\hat{R})$ is a Zariski ring. Furthermore, (i) $\hat{R}$ is a †faithfully flat R-module; and (ii) when N is a submodule of a finite R-module M, then N is a closed subspace of M (under their $\mathfrak{a}$-adic topologies), and their completions are identified with $N \otimes_R \hat{R}$, $M \otimes_R \hat{R}$.

D. Local Rings

Suppose that R is a Noetherian ring having only a finite number of maximal ideals and J is the Jacobson radical of R. Then the Zariski ring (R, J) is called a semilocal ring. Furthermore, if R has only one maximal ideal, then (R, J) is called a local ring. A ring that has only a finite number of maximal ideals is called a quasisemilocal ring; if it has only one maximal ideal, it is called a quasilocal ring. (In some literature, the terms local ring and semilocal ring are used under weaker conditions; in the weakest case, quasisemilocal rings and quasilocal rings are simply called semilocal rings and local rings, respectively, and local rings and semilocal rings in our sense are called Noetherian local rings and Noetherian semilocal rings, respectively.)
Assume that R is a semilocal ring with maximal ideals $\mathfrak{m}_1, \ldots, \mathfrak{m}_n$ and the Jacobson radical $J = \mathfrak{m}_1 \cap \cdots \cap \mathfrak{m}_n$. For every finite R-module M, we introduce the J-adic topology as its natural topology. The completion $\hat{R}$ of R is a semilocal ring with maximal ideals $\mathfrak{m}_1\hat{R}, \ldots, \mathfrak{m}_n\hat{R}$ and is naturally isomorphic to the direct sum of the completions of the local rings $R_{\mathfrak{m}_i}$ ($i = 1, \ldots, n$). Since R is a Zariski ring, (1) $\hat{R}$ is faithfully flat; (2) submodules of M are closed subsets of M; and (3) the completion of M is identified with $M \otimes_R \hat{R}$. If $(R, \mathfrak{m})$ is a complete local ring (i.e., a local ring and a complete Zariski ring at the same time), then R contains a subring I with the following properties: (i) I is a complete local ring, and $I/(\mathfrak{m} \cap I) = R/\mathfrak{m}$; and (ii) for the †characteristic p of $R/\mathfrak{m}$ (p is either zero or a prime number), $\mathfrak{m} \cap I = pI$. Therefore, if $\mathfrak{m}$ is generated by n elements, then R is a homomorphic image of the ring of formal power series in n variables over I. This theorem is called the structure theorem of complete local rings, and I is called a coefficient ring of R. If R contains a field, then I is a field, called a coefficient field of R. A complete local ring is a †Hensel ring.
When $(R, \mathfrak{m})$ is a local ring and $\sum_{i=1}^{r} x_i R$ is $\mathfrak{m}$-primary, then we have an inequality $r \geq$ (†Krull dimension of R); if the equality holds, then we say that $x_1, \ldots, x_r$ form a system of parameters of R. Furthermore, if $\mathfrak{m} = \sum_{i=1}^{r} x_i R$, then we say that $x_1, \ldots, x_r$ form a regular system of parameters of R. A local ring that has a regular system of parameters is called a regular local ring (cf. †Jacobian criterion). A regular local ring is a †unique factorization domain. Let d be the Krull dimension of a local ring $(R, \mathfrak{m})$. Then R is a regular local ring if and only if one of the following holds: (1) every R-module has finite †homological dimension, (2) every R-module has homological dimension of at most d, or (3) the homological dimension of $R/\mathfrak{m}$ (as an R-module) is finite (and actually coincides with d). A Noetherian ring R' is called a regular ring if $R'_{\mathfrak{p}'}$ is
a regular local ring for every prime ideal $\mathfrak{p}'$. A regular local ring is a regular ring.
Consider a local ring $(R, \mathfrak{m})$ and an $\mathfrak{m}$-primary ideal $\mathfrak{q}$. Then the length $l(n)$ of $R/\mathfrak{q}^n$ (as an R-module) is a function of n. For a sufficiently large n, the length $l(n)$ can be expressed as a polynomial $f(n)$ in n with rational coefficients. The degree of $f(n)$ coincides with the Krull dimension d of R. The multiplicity of $\mathfrak{q}$ is (coefficient of $n^d$ in $f(n)$) × ($d!$). If $x_1, \ldots, x_d$ form a system of parameters, then (multiplicity of $\sum_{i=1}^{d} x_i R$) ≤ (length of $R/\sum x_i R$); if the equality holds, then we call $x_1, \ldots, x_d$ a distinct system of parameters. A local ring that has a distinct system of parameters is called a Macaulay local ring. A local ring is a Macaulay local ring if and only if one of the following holds: (1) every system of parameters is a distinct system of parameters, or (2) if an ideal $\mathfrak{a}$ of height s is generated by s elements, then every †prime divisor of $\mathfrak{a}$ is of height s. A regular local ring is a Macaulay local ring.
The notion of multiplicity can be also defined in general Noetherian rings [4]. Let R be a Noetherian ring. If $R_{\mathfrak{m}}$ is a Macaulay local ring for every maximal ideal $\mathfrak{m}$, then R is called a locally Macaulay ring. Furthermore, if height $\mathfrak{m}$ = Krull dim R for every maximal ideal $\mathfrak{m}$, then R is called a Macaulay ring. If R is a locally Macaulay ring, then the polynomial ring in a finite number of variables over R is also a locally Macaulay ring. In general, an ideal $\mathfrak{a}$ is called an unmixed ideal (or pure ideal) if the height of every prime divisor of $\mathfrak{a}$ coincides with height $\mathfrak{a}$; otherwise, $\mathfrak{a}$ is called a mixed ideal. Thus if R is a locally Macaulay ring, an ideal $\mathfrak{a}$ of the polynomial ring $R[x_1, \ldots, x_n]$ over R is generated by r elements, and height $\mathfrak{a}$ = r, then $\mathfrak{a}$ is unmixed (unmixedness theorem).
If the completion of a local ring R is a †normal ring, then we say that R is analytically normal. If the completion of a semilocal ring R has no nilpotent element except zero, then we say that R is analytically unramified. A semilocal integral domain R which is a ring of quotients of a finitely generated ring over a field is analytically unramified. If R is a normal local ring, then R is analytically normal (O. Zariski).
Let $(R, \mathfrak{m})$ be a local ring, and let $\mathfrak{q}$ be an $\mathfrak{m}$-primary ideal. Set $F_i = \mathfrak{q}^i/\mathfrak{q}^{i+1}$ ($i = 0, 1, 2, \ldots$; $\mathfrak{q}^0 = R$). Let $a = a' \pmod{\mathfrak{q}^{i+1}} \in F_i$ and $b = b' \pmod{\mathfrak{q}^{j+1}} \in F_j$. We put $ab = a'b' \pmod{\mathfrak{q}^{i+j+1}} \in F_{i+j}$. Then the direct sum of modules $F = \sum_{i=0}^{\infty} F_i$ becomes a graded ring generated by $F_1$ over $F_0$, in which $F_i$ is the module of homogeneous elements of degree i. F, called the form ring (or associated graded ring) of R with respect to $\mathfrak{q}$, plays an important role in the theory of local rings, particularly in the theory of multiplicity.

E. Chains of Prime Ideals

Let R be a Noetherian ring with prime ideals $\mathfrak{p}$, $\mathfrak{q}$ such that $\mathfrak{p} \subset \mathfrak{q}$. Consider the length n of a chain of prime ideals $\mathfrak{p} = \mathfrak{p}_0 \subset \mathfrak{p}_1 \subset \cdots \subset \mathfrak{p}_n = \mathfrak{q}$ which cannot be refined any more. It is not true in general that n is uniquely determined by $\mathfrak{p}$ and $\mathfrak{q}$ (M. Nagata). However, n is uniquely determined for a rather large class of Noetherian rings, for instance the rings that are homomorphic images of locally Macaulay rings and, in particular, finitely generated rings over a †Dedekind domain.

F. Integral Closures

Let R be a Noetherian integral domain with the field of quotients k, let K be a finite algebraic extension of k, and let $\tilde{R}$ be the †integral closure of R in K. Then (i) If R is of Krull dimension 1, then for an arbitrary ring R' such that $R \subset R' \subset K$ and for every nonzero ideal $\mathfrak{a}'$ of R', the quotient $R'/\mathfrak{a}'$ is a finite $R/(\mathfrak{a}' \cap R)$-module. In particular, R' is a Noetherian domain satisfying the restricted minimum condition. (ii) If R is of Krull dimension 2, then $\tilde{R}$ is Noetherian. (iii) In general, $\tilde{R}$ is a †Krull ring, and for an arbitrary prime ideal $\mathfrak{p}$ of R there are only a finite number of prime ideals $\tilde{\mathfrak{p}}$ of $\tilde{R}$ such that $\mathfrak{p} = \tilde{\mathfrak{p}} \cap R$. For any of such $\tilde{\mathfrak{p}}$, the field of quotients of $\tilde{R}/\tilde{\mathfrak{p}}$ is a finite algebraic extension of that of $R/\mathfrak{p}$. Result (i) is called the Krull-Akizuki theorem. We say that R satisfies the finiteness condition for integral extensions if $\tilde{R}$ is a finite R-module for any choice of K. A Noetherian ring R is called a pseudogeometric ring, or a universally Japanese ring, if $R/\mathfrak{p}$ satisfies the finiteness condition for integral extensions for every prime ideal $\mathfrak{p}$. A ring is pseudogeometric if it is generated by a finite number of elements over a pseudogeometric ring.

G. History

J. W. R. Dedekind first introduced the concept of ideals in the theory of integers. The main objects studied in ring theory were subrings of number fields or function fields until M. Sono (Mem. Coll. Sci. Univ. Kyoto, 2 (1917), 3 (1918-1919)) originated an abstract study of Dedekind domains, which was followed by E. Noether (Math. Ann., 83 (1921), 96 (1926)), who originated the theory of Noetherian rings.
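As a numerical aside on the multiplicity defined in section D, the following Python sketch computes the length $l(n)$ of $R/\mathfrak{m}^n$ for two small examples and reads off the multiplicity of $\mathfrak{m}$ as the stable value of the $d$-th finite difference of $l(n)$, which equals (coefficient of $n^d$) × ($d!$). The two example rings ($k[[x, y]]$ and $k[[x, y]]/(xy)$), the function names, and the monomial-counting shortcut (valid only when the defining ideal is generated by monomials) are assumptions chosen for illustration and do not come from the text.

```python
from itertools import product


def length_mod_power(n, num_vars, in_ideal):
    """Length of R/m^n for R = k[[x_1, ..., x_v]]/I with I a monomial ideal.

    A k-basis of R/m^n is formed by the monomials of total degree < n
    that do not lie in I; in_ideal(exponents) tells whether a monomial is in I.
    """
    count = 0
    for exps in product(range(n), repeat=num_vars):
        if sum(exps) < n and not in_ideal(exps):
            count += 1
    return count


def finite_difference(values, order):
    """Apply the forward difference operator `order` times to a list."""
    for _ in range(order):
        values = [b - a for a, b in zip(values, values[1:])]
    return values


def multiplicity(num_vars, in_ideal, dim, n_max=12):
    """Stable value of the dim-th difference of l(n), i.e. (leading coeff) * dim!."""
    lengths = [length_mod_power(n, num_vars, in_ideal) for n in range(1, n_max + 1)]
    return finite_difference(lengths, dim)[-1]


def regular(exps):
    """k[[x, y]]: no relations (a regular local ring of dimension 2)."""
    return False


def node(exps):
    """k[[x, y]]/(xy): a monomial vanishes when both exponents are positive (dimension 1)."""
    return exps[0] >= 1 and exps[1] >= 1


if __name__ == "__main__":
    print(multiplicity(2, regular, dim=2))  # l(n) = n(n+1)/2
    print(multiplicity(2, node, dim=1))     # l(n) = 2n - 1
```

In the first example $l(n) = n(n+1)/2$, a polynomial of degree 2 with leading coefficient 1/2, so the multiplicity is 1; in the second $l(n) = 2n - 1$, of degree 1, so the multiplicity is 2.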
W. Krull made further important contributions to the development of the theory of Noetherian rings and general commutative rings [1]. Many other authors, including E. Artin, Y. Akizuki, and S. Mori, also contributed to the theory. The theory of local rings was originated by Krull (J. Reine Angew. Math., 179 (1938)) and developed by C. Chevalley (Ann. Math., 44 (1943)), I. S. Cohen (Trans. Amer. Math. Soc., 59 (1946)), and Zariski (Ann. Inst. Fourier, 2 (1950)), and later by many authors, including P. Samuel, Nagata, M. Auslander, D. A. Buchsbaum, and J.-P. Serre [4]. The theory of Noetherian rings is applied to algebraic geometry.

References

[1] W. Krull, Idealtheorie, Erg. Math., Springer, second edition, 1968.
[2] B. L. van der Waerden, Algebra I, II, Springer, 1966-1967.
[3] O. Zariski and P. Samuel, Commutative algebra I, II, Van Nostrand, 1958-1960.
[4] M. Nagata, Local rings, Wiley, 1962 (Krieger, 1975).
[5] N. Bourbaki, Eléments de mathématique, Algèbre, ch. 1, 8, Actualités Sci. Ind., 1144b, 1261a, Hermann, 1964, 1958.
[6] N. Bourbaki, Eléments de mathématique, Algèbre commutative, ch. 1-7, Actualités Sci. Ind., Hermann; ch. 1, 2, 1290a, 1961; ch. 3, 4, 1293a, 1967; ch. 5, 6, 1308, 1964; ch. 7, 1314, 1965.
[7] J.-P. Serre, Algèbre locale, multiplicités, Lecture notes in math. 11, Springer, 1965.
[8] D. G. Northcott, Lessons on rings, modules and multiplicities, Cambridge Univ. Press, 1968.
[9] H. Matsumura, Commutative algebra, Benjamin, 1970; second edition, 1980.

285 (VI.16)
Non-Euclidean Geometry

A. History

The validity of the fifth postulate of Euclid's Elements, the †axiom of parallels, has been a subject of argument ever since it was formulated (→ 139 Euclidean Geometry). At the beginning of the 18th century, G. Saccheri tried to prove the postulate by assuming the validity of other axioms. Under the hypothesis that the axiom does not hold, he deduced various extraordinary results. Although he was mistaken in thinking that he had obtained a contradiction, his work is regarded as a forerunner to the study of non-Euclidean geometry.
At the beginning of the 19th century, N. I. Lobachevskii and J. Bolyai opened up the impasse by establishing a geometry based on postulates that contradict the fifth postulate. This geometry is called hyperbolic geometry or Lobachevskii's non-Euclidean geometry. Actually, a similar idea had been conceived by C. F. Gauss, but he refrained from publishing it because of likely misunderstanding by a public still strongly influenced by I. Kant's philosophy. On the other hand, B. Riemann constructed so-called elliptic geometry (or Riemann's non-Euclidean geometry), which is different from both Euclidean and hyperbolic geometry. Euclidean geometry (including the theory of similarity) is sometimes called parabolic geometry. In general, a space satisfying axioms that contradict Euclid's postulates is called a non-Euclidean space.
Around the turn of the 20th century, A. Cayley, F. Klein, and H. Poincaré constructed models of non-Euclidean spaces that are subsets of Euclidean spaces, and E. Beltrami constructed a differential geometric model. By means of these models, it was established that non-Euclidean geometries are consistent as long as there is no inconsistency in the underlying Euclidean geometries. On the other hand, D. Hilbert established a complete system of axioms for Euclidean geometry and showed, by constructing non-Euclidean models, that the axiom of parallels is independent of the other axioms (→ 155 Foundations of Geometry). The logical foundation of non-Euclidean geometries was thus clarified. Moreover, A. Einstein showed in his †theory of relativity that actual space-time does not satisfy Euclidean axioms. Together with Euclidean spaces, non-Euclidean spaces are often used as fundamental models both in the problem of †space forms and in the theory of †symmetric spaces.

B. Axiomatic Considerations

Hilbert's system of axioms of plane Euclidean geometry consists of axioms of incidence, order, congruence, parallels, and continuity (→ 155 Foundations of Geometry). Specifically, the axiom of parallels is stated as follows: Suppose we are given a straight line and a point in a plane. If the straight line does not contain the point, then there exists only one line through the point that does not intersect the line.
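The uniqueness asserted by this axiom can fail in a model, which is the mechanism behind the independence result mentioned in section A. The following Python sketch illustrates this under stated assumptions: it uses the disc model in which the points are the interior points of the unit circle and the lines are the open chords, takes the chord $y = 0$ as the given line and $C = (0, 1/2)$ as the given point, and tests lines through $C$ by their slope. The model, the particular line and point, and the function name are illustrative choices, not taken from the text.

```python
def meets_given_line_inside_disc(slope, c=(0.0, 0.5)):
    """Check whether the line through c with the given slope meets the chord
    y = 0 at a point strictly inside the unit disc.

    Points of the model are interior points of the unit circle, so a crossing
    on or outside the circle does not count as an intersection.
    """
    cx, cy = c
    if slope == 0.0:
        # A horizontal line through c (with cy != 0) never reaches y = 0.
        return False
    x_hit = cx - cy / slope  # abscissa where the line crosses y = 0
    return abs(x_hit) < 1.0


if __name__ == "__main__":
    for m in (0.0, 0.25, 0.5, 1.0):
        print(m, meets_given_line_inside_disc(m))
```

For this choice of line and point, every slope with $|m| < 1/2$ gives a line through $C$ that misses the given chord altogether, and the slopes $m = \pm 1/2$ give the two lines that reach it only on the boundary circle, which is not part of the model. More than one line through $C$ therefore fails to intersect the given line.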
The system of axioms of hyperbolic geometry is obtained by replacing the axiom of parallels by the following: Suppose we are given a straight line and a point in a plane. If the line does not contain the point, then there exist at least two lines that pass through the point without intersecting the line. (The other four groups of axioms are unaltered.) In this case, if l is a given line that does not contain a given point C, then there exist exactly two lines parallel to l that pass through C. We denote them by XY and X'Y', and they are characterized as follows (Fig. 1): Any line that passes through C and lies in ∠X'CY necessarily intersects l; by contrast, neither the two lines XY, X'Y' nor any line in ∠XCX' intersects l. Euclidean geometry can be considered as a "limit" of this geometry where the lines XY and X'Y' coincide.

Fig. 1

In elliptic geometry, the axiom of parallels is replaced by the following: Suppose we are given a straight line and a point in a plane. If the line does not contain the point, then any line passing through the point intersects the line. In this case, the lines are closed curves, and the axioms of order must be modified. Specifically, in Euclidean geometry, the axioms of order are based on the notion of a point A lying between points B and C, where A, B, C are distinct points on a line. In elliptic geometry, however, to define order we utilize the notion of a pair A, C of points separating another pair B, D (and vice versa), where A, B, C, D are distinct points on a line. The axioms of order are modified accordingly.
The sum of inner angles of a triangle is smaller or greater than two right angles according as we use the axioms of hyperbolic or elliptic geometry.

C. The Projective-Geometric Point of View

We take †projective coordinates in an n-dimensional, real †projective space $P^n$ and consider a quadric hypersurface defined by $Q: a x_0^2 + x_1^2 + \cdots + x_n^2 = 0$, $a \neq 0$ (→ 90 Coordinates; 343 Projective Geometry). We denote by G the group consisting of the totality of projective transformations of $P^n$ that leave Q invariant. We call Q the absolute and call G the group of congruent transformations. When $a < 0$, then this Q is a real quadric hypersurface. In this case, there exists a domain $H^n$ (the totality of points inside Q) whose boundary coincides with Q, and the group G acts †transitively on $H^n$. The pair $\{G, H^n\}$ provides a model of hyperbolic geometry, and the n-dimensional hyperbolic space $H^n$ is homeomorphic to an n-dimensional open cell. Points of $H^n$, points on Q, and points outside Q are called ordinary points, points at infinity, and ultrainfinite points (or ideal points), respectively. Two lines on $H^n$ are said to be parallel if they intersect on the absolute Q. Next, when $a > 0$, the absolute Q is an imaginary quadric hypersurface, and the group G acts transitively on $P^n$. The pair $\{G, P^n\}$ provides a model of elliptic geometry. The n-dimensional elliptic space $P^n$ is homeomorphic to n-dimensional real projective space; hence it is compact. In elliptic geometry, any two distinct lines in a plane necessarily intersect at a point. The above models $\{G, P^n\}$ of non-Euclidean geometries are called Klein's models.

Fig. 2

Let $\{G, H^n\}$ be a Klein's model, A, B distinct points in $P^n$, l the line containing A, B, and I, J be two points where the line l meets Q (Fig. 2). If we denote by (A, B, I, J) the †anharmonic ratio of these four points, then the non-Euclidean distance $\rho$ between the points A and B is given by $\rho = \alpha \log(A, B, I, J)$, with $\alpha$ a constant. Next, let l, g be lines in $H^n$ intersecting at a point D. In the plane determined by l and g, we draw two imaginary tangents u, v to the absolute through D. Denoting by (l, g, u, v) the anharmonic ratio of these four lines, the non-Euclidean angle $\theta$ between the lines l and g is given by $\theta = (1/2i)\log(l, g, u, v)$, $i = \sqrt{-1}$. Generally, let O be a point in the projective space $P^n$, and denote by p the †polar of O with respect to the absolute. By counting the polar p doubly, it can be regarded as a quadric hypersurface, which will be denoted by $S_0$. A quadric hypersurface S of $H^n$ is called a non-Euclidean hypersphere if it belongs to the †pencil of quadric hypersurfaces determined by Q and $S_0$. S is called a proper hypersphere, a limiting hypersphere, or an equidistant hypersurface according as the center O is an ordi-
nary point, a point at infinity, or an ultrainfinite point, respectively. In the case of elliptic geometry $\{G, P^n\}$, the distance $\rho$ between two points A and B is given by $\rho = (\alpha/i)\log(A, B, I, J)$ as I, J are imaginary points. Moreover, we may get parabolic geometry as the "limit" ($\alpha \to \infty$) of the geometry given by Klein's models.

D. The Conformal-Geometric Point of View

Let $S^n$ be an n-dimensional †conformal space. If we take suitable (n + 2)-hyperspherical coordinates in $S^n$, the space $S^n$ can be realized as a quadric hypersurface $x_1^2 + x_2^2 + \cdots + x_n^2 - 2x_0 x_\infty = 0$ in (n + 1)-dimensional real projective space $P^{n+1}$. A point in $P^{n+1}$ represents a hypersphere in $S^n$ (→ 76 Conformal Geometry; 90 Coordinates). We denote by $\tilde{G}$ the group consisting of the totality of †Möbius transformations leaving invariant a (real, point, or imaginary) hypersphere Q. When Q is a real hypersphere, the space $S^n$ is divided into Q and the two open cells $H^n$, $H_0^n$. If we denote by G the totality of transformations of the group $\tilde{G}$ that do not interchange $H^n$ and $H_0^n$, then G is a subgroup of index 2 of $\tilde{G}$. In this case, each of the pairs $\{G, H^n\}$ and $\{G, H_0^n\}$ provides a model of hyperbolic geometry. When Q is a point hypersphere, the space $E^n$ obtained from $S^n$ by omitting the point Q is homeomorphic to an open cell, and the pair $\{\tilde{G}, E^n\}$ provides a model of parabolic geometry. On the other hand, when Q is an imaginary hypersphere, $\tilde{G}$ is isomorphic to the †orthogonal group O(n + 1). We call the pair $\{\tilde{G}, S^n\}$ a spherical geometry and $S^n$ an n-dimensional spherical space. In this case two points x and x' are called equivalent if x' is the image of x by symmetry with respect to Q. The space $P^n$ obtained from $S^n$ by identifying equivalent points is homeomorphic to the n-dimensional real projective space. If we denote by G the group obtained from $\tilde{G}$ by making its actions effective on $P^n$, then G is the factor group of $\tilde{G}$ by a cyclic group $Z_2 = Z/2Z$. The pair $\{G, P^n\}$ provides a model of elliptic geometry. These models are called Poincaré's models. They were introduced as a result of research on †automorphic functions in the case n = 2.
In Poincaré's model, every straight line is represented either by a circle orthogonal to Q or by a circle passing through Q according as Q is a (real or imaginary) hypersphere or a point hypersphere. In spherical geometry, however, straight lines are usually called great circles, and two distinct great circles lying on a 2-dimensional sphere necessarily intersect at two points that are symmetric with respect to Q. Also, in Poincaré's model the distance between two points A and B is defined as before, by making use of the anharmonic ratio of four points A, B, I, J on a circle (Fig. 3) (→ 74 Complex Numbers G).

Fig. 3

E. The Differential-Geometric Point of View

An n-dimensional space M of †constant curvature is by definition a †Riemannian manifold whose line element ds is given by

$$ds^2 = \frac{dx_1^2 + dx_2^2 + \cdots + dx_n^2}{\left(1 + \frac{K}{4}\left(x_1^2 + x_2^2 + \cdots + x_n^2\right)\right)^2}$$

with respect to appropriate local coordinates, where K is a constant called the †sectional curvature (→ 364 Riemannian Manifolds). According as K is positive, zero, or negative, M can be considered locally as an elliptic space, Euclidean space, or hyperbolic space, respectively. In this case, lines are †geodesics of M, and the non-Euclidean distance and non-Euclidean angle are those defined in the Riemannian manifold. When n = 2, a †simply connected and †complete space of positive constant curvature is †embedded in 3-dimensional Euclidean space as a sphere, and a space of negative constant curvature is †locally isometric to a pseudosphere (Fig. 4) which is a surface of revolution obtained by rotating a †tractrix around its asymptote.

Fig. 4

†Complete n-dimensional spaces of constant curvature ($n \geq 2$) are called space forms. A simply connected space form is necessarily one of spherical space, Euclidean space, or hyperbolic space. Each of these is a †universal covering manifold of a general connected space form with a curvature of the same sign, and the group of †covering transformations is isomor-
phic to a †discontinuous subgroup of the group of congruent transformations.
Each of the spaces, Euclidean, non-Euclidean, and spherical, is a †homogeneous space on which the corresponding group of congruent transformations acts transitively. Actually, each of these spaces has the structure of a †symmetric Riemannian homogeneous space of rank 1.

References

[1] D. M. Y. Sommerville, The elements of non-Euclidean geometry, Bell, 1914.
[2] F. Klein, Vorlesungen über nicht-Euklidische Geometrie, Springer, 1928 (Chelsea, 1960).
[3] E. Cartan, Leçons sur la géométrie des espaces de Riemann, Gauthier-Villars, second edition, 1963.
[4] H. S. M. Coxeter, Non-Euclidean geometry, Univ. of Toronto Press, fifth edition, 1965.
[5] F. Engel and P. Stäckel, Urkunden zur Geschichte der nicht-Euklidischen Geometrie I, II, Teubner, 1898, 1913.

286 (XII.21)
Nonlinear Functional Analysis

A. General Remarks

At present, the theory of nonlinear problems is still not unified, and many individual results obtained for specific classes of problems are stated in the languages of the corresponding fields of study. However, there are some fundamental facts and methods of a general nature concerning nonlinear problems, which may be referred to as the subject matter of nonlinear functional analysis.

B. Iterative Methods

Let X be a †Banach space. Consider a nonlinear mapping $G: X \to X$ and the equation

$Gx = 0 \quad (x \in X)$. (1)

Set $F = I - G$. Then (1) can be written as

$x = Fx$. (2)

F satisfies the Lipschitz condition if there exists a constant $\alpha$ such that

$\|Fx - Fy\| \leq \alpha\|x - y\|$ (3)

for all x, y in X. In particular, the mapping F is said to be nonexpansive if $0 < \alpha \leq 1$. If $0 < \alpha < 1$, then F is called a contraction. (Sometimes a nonexpansive mapping is also called a contraction.) A contraction F satisfies the contraction principle: F has a unique fixed point $x_0$, and the iteration

$x_{n+1} = Fx_n \quad (n = 1, 2, \ldots)$

with an arbitrary initial element $x_1$ always converges to $x_0$ [1-3]. Similar results hold for a contraction that is defined only on a †ball and leaves the ball invariant. This leads to †Newton's iterative process and the †implicit function theorem.

C. Methods of Monotonicity

By definition, a nonlinear mapping G from a †Hilbert space H to H is a monotone or accretive operator if

$\operatorname{Re}(Gx - Gy, x - y) \geq 0 \quad (x, y \in H)$.

G. Minty [4] proved that if $G: H \to H$ is monotone and continuous, then $\lambda I + G$ is a mapping onto H for any $\lambda > 0$, and its inverse $(\lambda I + G)^{-1}$ is nonexpansive. He has also shown that in the hypothesis of the theorem, we can replace the continuity requirement for G by maximality of G within the class of accretive operators that are possibly multivalued. Various developments of Minty's ideas, including generalization of his results to Banach spaces and applications to partial differential equations, have been obtained by F. Browder (Amer. Math. Soc. Proc. Symposia in Appl. Math., 17 (1965)), J. Leray and J. L. Lions (Bull. Soc. Math. France, 93 (1965)), J. L. Lions [5], W. Strauss, H. Brezis (Amer. Math. Soc. Proc. Symposia in Pure Math., 18 (1970)), and others.
A mapping A is said to be dissipative if -A is accretive. Dissipative mappings play a central role in the theory of nonexpansive semigroups (→ Section X).

D. Topological Methods

In the geometric study of ordinary differential equations [6] some familiar theorems of topology and †differential topology have been strong tools, e.g., †Brouwer's fixed-point theorem is utilized to establish the existence of periodic solutions. However, in order to deal with nonlinear partial differential equations we have to generalize these theorems to infinite-dimensional cases. For example, a fixed-point theorem in an infinite-dimensional space was first obtained by G. D. Birkhoff and O. D.

Kellogg (Trans. Amer. Math. Soc., 23 (1922), 95-115). The theory of the degree of mappings was generalized to the case of Banach spaces by J. Leray and J. Schauder [33] for the class of mappings of the form $I - F$, where $F$ is a compact continuous mapping, i.e., the image by $F$ of any bounded set is relatively compact.
Let $D$ be an open bounded set in a Banach space $X$ and $\partial D$ be its boundary. Let $F: \overline{D} \to X$ be a continuous compact mapping and $\Phi$ denote $I - F$. If a point $p$ in $X$ does not belong to $\Phi(\partial D)$, then we can define the Leray-Schauder degree $d(\Phi, p, D)$ of $\Phi$ relative to $p$ [1-3]. The Leray-Schauder degree $d(\Phi, p, D)$ is an integer with the following properties:
(i) $d(I, p, D) = 1$ if $p \in D$. If $p \notin \Phi(\overline{D})$, then $d(\Phi, p, D) = 0$.
(ii) (Homotopy invariance) $d(\Phi, p, D)$ depends only on the compact homotopy class of $\Phi: \partial D \to X \setminus \{p\}$. More precisely, let $K: [0,1] \times \partial D \to X$ be a continuous compact mapping such that $x + K(t, x) \ne p$ for any $t \in [0,1]$ and any $x \in \partial D$. Let $\Phi_0(x) = x + K(0, x)$ and $\Phi_1(x) = x + K(1, x)$. Then $d(\Phi_0, p, D) = d(\Phi_1, p, D)$.
(iii) If $p$ and $p'$ are in the same component of $X \setminus \Phi(\partial D)$, then $d(\Phi, p, D) = d(\Phi, p', D)$.
(iv) (Continuity) $d(\Phi, p, D)$ is a continuous locally constant function of $\Phi$ (with respect to uniform convergence) and of $p \in X \setminus \Phi(\partial D)$.
(v) (Domain decomposition) If $D$ is the union of a finite number of open disjoint sets $D_j$ ($j = 1, 2, \dots, N$) with $\partial D_j \subset \partial D$ and $\Phi(x) \ne p$ on $\bigcup_{j=1}^{N} \partial D_j$, then $d(\Phi, p, D) = \sum_{j=1}^{N} d(\Phi, p, D_j)$.
(vi) (Excision) If $A$ is a closed subset of $\overline{D}$ on which $\Phi(x) \ne p$, then $d(\Phi, p, D) = d(\Phi, p, D \setminus A)$.
(vii) (Cartesian product formula) If $X = X_1 \oplus X_2$ with $D_i \subset X_i$, $\Phi = (\Phi_1, \Phi_2)$ with $\Phi_i: \overline{D}_i \to X_i$ ($i = 1, 2$), $D = D_1 \times D_2$, and $p = (p_1, p_2)$, then $d(\Phi, p, D) = d(\Phi_1, p_1, D_1) \times d(\Phi_2, p_2, D_2)$, provided that the right-hand side is well defined.
The degree of mapping of $\Phi$ is also defined for some proper Fredholm mapping $\Phi$ (→ Section E; K. D. Elworthy and A. J. Tromba [38]; [32]).
Brouwer's fixed-point theorem (→ 153 Fixed-Point Theorems) is generalized to Schauder's fixed-point theorem: A compact mapping $F$ of a closed bounded convex set $K$ in a Banach space $X$ into itself has a fixed point. Using the Leray-Schauder degree theory, one has the Leray-Schauder fixed-point theorem: Let $D$ be a bounded open set of a Banach space $X$ containing the origin 0. Let $F(x, t): \overline{D} \times [0,1] \to X$ be a compact mapping such that $F(x, 0) = 0$. Suppose that $F(x, t) \ne x$ for any $x \in \partial D$ and $t \in [0,1]$. Then the compact mapping $F(x, 1)$ has a fixed point $x$ in $D$ [1-3].
Other homotopy invariants, such as †homotopy groups and †cohomotopy groups, are also used in nonlinear functional analysis [1, 3]. For instance we have the following theorem of Shvarts (1964) [31]: Let $X$ and $Y$ be two Banach spaces. Let $D = \{x \in X \mid \|x\| \le 1\}$ be the unit ball and $\partial D = \{x \in X \mid \|x\| = 1\}$ be the unit sphere of $X$. Suppose $L$ is a fixed continuous linear †Fredholm operator from $X$ to $Y$ of index $p \ge 0$. Let $P_L$ be the set of compact perturbations of $L$ mapping $\partial D$ into $Y \setminus \{0\}$, i.e., $P_L = \{\Phi = L + K \mid K$ is a continuous compact mapping of $\partial D$ to $Y$ such that $\Phi(x) = Lx + K(x) \ne 0$ for $x \in \partial D\}$. Two mappings $\Phi_0$ and $\Phi_1$ in $P_L$ are said to belong to the same compact homotopy class on $\partial D$ if there exists a continuous compact mapping $h: [0,1] \times \partial D \to Y$ such that $Lx + h(t, x) \ne 0$ for $x$ in $\partial D$, $\Phi_0(x) = Lx + h(0, x)$, and $\Phi_1(x) = Lx + h(1, x)$.
Shvarts's theorem: Let $L$ be a fixed continuous linear Fredholm mapping from $X$ to $Y$ of index $p \ge 0$. Then the compact homotopy classes on $\partial D$ of $P_L$ are in one-to-one correspondence with the elements of the $p$th stable homotopy group $\pi_{n+p}(S^n)$ ($n > p + 1$) (→ 202 Homotopy Theory H).
Warning: The topological structure of an infinite-dimensional Hilbert or Banach space is quite different from that of a finite-dimensional Euclidean space. For instance, let $X$ be a Hilbert space of infinite dimension, and let $D = \{x \in X \mid \|x\| \le 1\}$ be its unit ball and $\partial D = \{x \in X \mid \|x\| = 1\}$ be the unit sphere of $X$. S. Kakutani (Proc. Imp. Acad. Tokyo, 19 (1943)) gave a fixed-point free continuous mapping of $D$ into itself if $X$ is separable [1-3]. Thus naive generalization of the Brouwer fixed-point theorem is no longer true in infinite-dimensional spaces. V. Klee and C. Bessaga [17] proved that the unit sphere $\partial D$ is $C^\infty$-diffeomorphic to $X$ for an arbitrary Hilbert space $X$ of infinite dimension. N. H. Kuiper proved that the group of invertible continuous linear operators on $X$ is contractible if $X$ is separable [3]. All this is in striking contrast to the well-known facts for finite-dimensional spaces (→ 202 Homotopy Theory, 427 Topology of Lie Groups and Homogeneous Spaces). This is the reason why compactness assumptions are made in the theorems mentioned above.

E. Calculus in Banach (or Locally Convex) Spaces

When one considers a nonlinear operator, it often happens that the domain and the range are neither linear spaces nor their open subsets. The domain might be a space of all smooth mappings of a compact manifold into another, and so might be the range. Such spaces have no linear structure, and hence linearity or semilinearity do not make sense in

general. The concept of infinite-dimensional manifolds is therefore introduced of necessity in nonlinear functional analysis.
Definition of differentiable mappings. Let $E$ and $F$ be real Banach spaces and let $L(E, F)$ ($= L(E)$ if $E = F$) be the Banach space of all bounded linear operators with uniform operator norm. Let $U$ be an open subset of $E$ and $x$ a point of $U$. A mapping (= nonlinear operator) $f$ of $U$ into $F$ is called Gâteaux differentiable at $x$ if $\lim_{t \to 0} t^{-1}(f(x + ty) - f(x)) = df(x, y)$ exists for any $y \in E$. $df(x, y)$ is called the Gâteaux derivative of $f$ at $x$. $f$ is called Fréchet differentiable at $x$ if there exists a linear operator $A \in L(E, F)$ such that $\lim_{y \to 0} \|f(x + y) - f(x) - Ay\| / \|y\| = 0$. $A$ is called the Fréchet derivative of $f$ at $x$ and is denoted by $df(x)$ or $f'(x)$. $f$ is Fréchet differentiable in $U$ if and only if it is Gâteaux differentiable, $df(x, y)$ is linear in $y$, and $\sup_{y \ne 0} \|df(x, y)\| / \|y\|$ is bounded [10].
Let $U$ be an open subset of $E$. A mapping $f$ of $U$ into $F$ is said to be of class $C^0$ if it is continuous and to be of class $C^1$ if it is Fréchet-differentiable at each point $x \in U$ and the differential $df(x) \in L(E, F)$ is continuous as a mapping of $U$ into $L(E, F)$. The differential $df(x)$ is also called the linearized operator. If the mapping $df: U \to L(E, F)$ is of class $C^{r-1}$, then $f$ is said to be of class $C^r$. $d(d^{r-1}f)(x)$ is written as $d^r f(x)$, and called the $r$th differential at $x$. $d^r f(x)$ is an $r$-linear, bounded, symmetric operator of $E \times \dots \times E$ ($r$ times) into $F$. $f$ is said to be of class $C^\infty$ if $f$ is of class $C^r$ for every $r$. For an open subset $V$ of $F$, $f: U \to V$ is called a $C^r$ diffeomorphism if $f$ is a bijection and both $f$ and $f^{-1}$ are of class $C^r$.
A $C^1$ mapping $f$ of $U$ into $F$ is called a Fredholm mapping (Fredholm map) if $df(x) \in L(E, F)$ is a linear †Fredholm operator for every $x \in U$. Since $\operatorname{Ind} df(x)$ is constant if $U$ is connected, that integer $\operatorname{Ind} df(x)$ is called the index of $f$.

F. Taylor's Theorem and Its Converse

Let $f: U \to F$ be of class $C^r$ ($r \ge 1$). A generalized Taylor theorem claims that $f$ can be approximated by a polynomial mapping: Let $x \in U$ and $y \in E$ be sufficiently close to 0 so that $x + ty \in U$ for $0 \le t \le 1$. Then

$$f(x + y) = \sum_{k=0}^{r} \frac{1}{k!}\, d^k f(x)(y, \dots, y) + \int_0^1 \frac{(1-t)^{r-1}}{(r-1)!} \{d^r f(x + ty) - d^r f(x)\}(y, \dots, y)\, dt.$$

Let $L^k_{\mathrm{sym}}(E, F)$ be the Banach space of all $k$-linear, bounded, symmetric operators of $E \times \dots \times E$ into $F$ with the uniform topology. If for every $k$, $0 \le k \le r$, there exists a continuous mapping $\varphi_k: U \to L^k_{\mathrm{sym}}(E, F)$ such that


at every $x \in U$ and $y$ sufficiently close to 0, then $f: U \to F$ is of class $C^r$, and $d^k f(x) = \varphi_k(x)$.

G. The Implicit Function Theorem

Using the notation above, let $f: U \to V$ be of class $C^r$, $r \ge 1$, and assume that $0 \in U$, $0 \in V$, and $f(0) = 0$, where 0 is the origin of $E$ or $F$. Suppose there is an $A \in L(F, E)$, called the right inverse of $df(0)$, such that $df(0)A = I_F$ (the identity). Then the following assertions hold: (i) The image of $F$ under $A$, $AF$, is a closed subspace of $E$, and $E = \operatorname{Ker} df(0) \oplus AF$. (ii) There are neighborhoods $U_1$, $U_2$, $V'$ of the zeros of $\operatorname{Ker} df(0)$, $AF$, $F$, respectively, such that $U_1 \oplus U_2 \subset U$ and such that the mapping $g: U_1 \oplus U_2 \to U_1 \oplus V'$ defined by $g(u, v) = (u, f(u, v))$ is a $C^r$ diffeomorphism. Therefore, denoting the inverse of $g$ by $h = (h_1, h_2)$, we have $h_1(u, w) = u$ and $f(u, h_2(u, w)) = w$. The latter means that the nonlinear equation $f(u, v) = w$ can be solved with respect to $v$.

H. Existence and Uniqueness of Integral Curves

Using the notation above, let $f$ be a $C^r$ mapping ($r \ge 1$) of $U$ into $E$. Since $U \times E$ is the †tangent bundle of $U$, $(x, f(x))$ can be regarded as a $C^r$ tangent vector field on $U$. The equation of †integral curves is $(d/dt)x(t) = f(x(t))$. A local existence and uniqueness of solutions is stated as follows: For an arbitrarily fixed $x \in U$, there are $\varepsilon > 0$ and an open neighborhood $W$ of $x$ such that there exists uniquely a $C^r$ mapping $h$ of $W \times (-\varepsilon, \varepsilon)$ into $U$ satisfying $(d/dt)h(w, t) = f(h(w, t))$ and $h(w, 0) = w$.
Using this fact, one can prove the Frobenius theorem: Let $E'$ be a closed linear subspace of $E$ with a direct summand $E''$, and let $f: U \to L(E', E'')$ be a $C^r$ mapping ($r \ge 1$) such that $f(0) = 0$. To each $x \in U$ one associates a closed linear subspace $D_x = \{(u, f(x)u) \mid u \in E'\}$. The disjoint union $D = \bigcup_{x \in U} D_x$ can be regarded as a subbundle of the tangent bundle $U \times E$. A mapping $\tilde{u}$ of $U$ into $E$ is called a cross section of $D$ if $\tilde{u}(x) \in D_x$ for every $x \in U$. $D$ is called involutive if for any two $C^r$ cross sections $\tilde{u}, \tilde{v}$ of $D$, the Lie bracket product $[\tilde{u}, \tilde{v}]$ defined by $[\tilde{u}, \tilde{v}](x) = d\tilde{v}(x)(\tilde{u}(x)) - d\tilde{u}(x)(\tilde{v}(x))$ is again a cross section of $D$. Now suppose $D$ is an involutive subbundle of $U \times E$. Then for an arbitrarily fixed $x \in U$, there are a neighborhood $W$

of x and a c’ diffeomorphism h of W onto a with respect to 1 Ik:


neighborhood of 0 of E such that dh(x)(D,) =
E’. This fact shows that an involutive sub-
bundle can be trivialized by a suitable change +pk(b~kkl)~ulk-l> k>d,

of local coordinate systems.

+lYldl”ldlulk)+Pk(lYlk-l)IUlk~lIulk-l, k>d,
I. Local Theories on Locally Convex Spaces (4)
where I Ik is the norm in Ek or Fk, C is a posi-
All local theories mentioned in Sections E-H tive constant independent of k, and Pk is a
are constructed on Banach spaces. However, it polynomial with positive coefficients, and if a
is important in concrete applications to con- right inverse A of df(0) satisfies the inequality
struct these theories on a wider class of locally of GBrding type,
convex topological linear spaces.
Let E, F be tlocally convex topological IAulktC’lulkfDklulk-,, k > d, (5)
linear spaces, and let U be an open subset of E.
where C’ > 0 independent of k and Dk > 0, then
A mapping f: U + F is said to be of class Co if
the implicit function theorem holds just as in
it is continuous. 1 is said to be of class c’ if f
Section G, and the obtained mapping h satis-
is of class c’-’ and the following is fulfilled: fies the same inequalities as (4) [ 111.
For every XE U, there is an r-linear continuous (iii) Nash-Moser implicit function theorem.
symmetric mapping d’f(x) of E x . x E into F Though linear estimates such as (4) hold for
such that d*f: U x E x . x E + F is continuous.
many differential operators, the second in-
If we put
equality (5) is sometimes out of order, espe-
F(y)=f(x+y)-f(x)--f(x)(y)--... cially if f is a nonlinear hyperbolic operator.
However, one can often obtain instead of (5)
a weaker inequality:

IAUlk~C’IUlk+,+Dklulk+~~~, s > 0.
for every y sufficiently close to OE E, then the
mapping G defined by J. Nash [37] and J. Moser [ 123 approximated
such an operator A by some smoothing oper-
t # 0, ators and proved an implicit function theo-
t=o @eR)
rem under a certain additional condition.
is continuous at (0, O)ER x E. The definitions The Nash-Moser implicit function theorem
of C” mappings and c’ diffeomorphisms are was successfully applied to many difficult
given as in Section E. problems, e.g., the embedding problem of
Riemannian manifolds [37], the small divi-
sor problem of celestial mechanics [36], free
boundary problems (e.g., L. H6rmander Arch.
J. Implicit Function Theorems in Locally Rational Mech. Anal., 62 (1976), l-52), and
Convex Spaces other problems (e.g., S. Klainerman, Comm.
Pure Appl. Math., 33 (1980), 43-101; M. Kura-
The implicit function theorem does not hold nishi, Amer. Math. Sot. Proc. Symposia in Pure
in general locally convex spaces. However, Math., 30 (1977), 97-105).
since it is useful for nonlinear problems, sev- (iv) Analytic implicit function theorem. In
eral sufficient conditions are presently being cases (ii) and (iii), the spaces E, F were given
studied. The following are some of them. We as projective limits of Banach spaces. On the
assume that f: U+ F is a C’ mapping (r > 1) contrary, H. Jacobowitz considered the case
such that f(0) = 0 and that df(0) : E + F has a where E, F are tinductive limits of Banach
continuous right inverse. spaces. For instance, the space E of the
(i) If dim F < co, the implicit function theo- smooth functions can be approximated by a
rem holds just as in Section G. family of Banach spaces {E, I E> 0) of all real
(ii) An implicit function theorem that can analytic functions with E as the radius of con-
be applied to nonlinear elliptic differential vergence. Under this circumstance, certain
operators can be restated as follows. Suppose conditions for f and the right inverse of df(0)
E, F are tprojective limit of families of Banach yield an implicit function theorem [ 131.
spaces {Ek,k>d}, {Fk,k>d} and U=EflUd, (v) Mather’s implicit function theorem [14].
where Ud is an open subset of Ed. If f: U + F The difficulty of implicit function theorems in
can be extended to a c’ mapping (r 3 2) of Frkchet spaces is concentrated in the following
Ek fl Ud into Fk for every k > d, if f satisfies the fact: Even if df(0) has a right inverse and even
following inequalities, called a linear estimate if x is sufficiently close to 0, df(x) may not have

a right inverse. If the implicit function theorem holds, such a phenomenon should not happen. In cases (ii)-(iv), several functional-analytic conditions exclude this pathological phenomenon. In some special cases, these conditions can be replaced by algebraic ones. J. Mather, using his division theorem, proved an implicit function theorem that was applied to the theory of singularities.

K. Infinite-Dimensional Manifolds

A Hausdorff space $M$ is called a $C^r$ Banach manifold modeled on a Banach space $E$ if the following conditions are satisfied: (a) $M$ is covered by a family of open subsets $\{U_\alpha\}_{\alpha \in A}$. For each $U_\alpha$ there are an open subset $V_\alpha$ of $E$ and a homeomorphism $\psi_\alpha$ of $U_\alpha$ onto $V_\alpha$. Such a pair is called a local coordinate system or a local chart of $M$. (b) If $U_\alpha \cap U_\beta \ne \emptyset$, $\psi_\alpha \psi_\beta^{-1}$ is a $C^r$-diffeomorphism of $\psi_\beta(U_\alpha \cap U_\beta)$ onto $\psi_\alpha(U_\alpha \cap U_\beta)$. (c) The index set $A$ is maximal among those that satisfy (a) and (b).
If $M$ (resp. $M'$) is a $C^r$ Banach manifold modeled on $E$ (resp. $E'$), then so is $M \times M'$ modeled on $E \oplus E'$. A mapping $f: M \to M'$ is said to be of class $C^r$ if $f$ expressed through local coordinate systems is of class $C^r$.
Suppose $M$ is covered by a family of local charts $\{(U_\alpha, \psi_\alpha)\}_{\alpha \in A}$. On the disjoint union $\bigcup_{\alpha \in A} U_\alpha \times E$, define an equivalence relation $\sim$ as follows: For $(x, u) \in U_\alpha \times E$, $(y, v) \in U_\beta \times E$, $(x, u) \sim (y, v)$ if and only if $x = y$ and $d(\psi_\alpha \psi_\beta^{-1})(x)v = u$. The set of equivalence classes is called the tangent bundle of $M$, which will be denoted by $T_M$. There is another definition of the tangent bundle which uses the ring of $C^r$ functions on $M$. However, since $E$ is not reflexive in general, the latter gives us a different vector bundle. $T_M$ is a $C^{r-1}$ Banach manifold modeled on $E \oplus E$. The correspondence which sends $(x, u) \in \bigcup_{\alpha \in A} U_\alpha \times E$ to $x$ induces a $C^{r-1}$ mapping of $T_M$ onto $M$, called the projection of $T_M$, denoted by $\pi$.
A topological group $G$ is called a Banach-Lie group if $G$ is a $C^\infty$ Banach manifold and the group operation $(g, h) \to g^{-1}h$ is a $C^\infty$ mapping of $G \times G$ into $G$. If $E$ is a Banach space, then $GL(E)$, the group of all invertible bounded linear operators, is a Banach-Lie group under the uniform topology. If $E$ is a Hilbert space, then the group of all unitary operators is also a Banach-Lie group under the same topology.
The concepts of manifolds and Lie groups are similarly defined when the model space is a Hilbert space or a Fréchet space. These are called a Hilbert manifold and a Fréchet manifold, respectively. For some differential topologies on separable Hilbert manifolds → 279 Morse Theory E.

L. Structures on Infinite-Dimensional Manifolds

Suppose $M$ is a $C^{r+1}$ Hilbert manifold modeled on $E$. At each $x \in M$, the tangent space $T_xM = \pi^{-1}(x)$ is a Hilbert space, linear-homeomorphic to $E$. $M$ is called a $C^r$ Riemannian manifold if there is defined an inner product $\langle u, v \rangle_x$ on each $T_xM$ such that $\langle \cdot, \cdot \rangle_x$ is of class $C^r$ with respect to $x$. Existence of such a structure is ensured by using a partition of unity if $M$ is paracompact.
Let $M$ be a $C^r$ Banach manifold ($r \ge 1$) modeled on a Banach space $E$. Each tangent space $T_xM$ is linear homeomorphic to $E$. $M$ is called a $C^r$ Finsler manifold if there is a norm $|u|_x$ defined on each $T_xM$ such that $|\cdot|_x$ is continuous with respect to $x$. A paracompact $C^r$ Banach manifold can have a $C^r$ Finsler structure.
Let $M$ be a $C^{r+2}$ Fréchet manifold ($r \ge 0$) and $C^{r+1}(M)$, $\Gamma^{r+1}(T_M)$ the spaces of all $C^{r+1}$ functions and of all $C^{r+1}$ vector fields on $M$, respectively. A bilinear mapping $\nabla$ of $\Gamma^{r+1}(T_M) \times \Gamma^{r+1}(T_M)$ into $\Gamma^{r}(T_M)$ is called an affine connection on $M$ if $\nabla$ satisfies $\nabla_{f\tilde{u}}\tilde{v} = f\nabla_{\tilde{u}}\tilde{v}$, $\nabla_{\tilde{u}} f\tilde{v} = (\tilde{u}f)\tilde{v} + f\nabla_{\tilde{u}}\tilde{v}$ for every $\tilde{u}, \tilde{v} \in \Gamma^{r+1}(T_M)$, $f \in C^{r+1}(M)$. For an affine connection $\nabla$, $T(\tilde{u}, \tilde{v}) = \nabla_{\tilde{u}}\tilde{v} - \nabla_{\tilde{v}}\tilde{u} - [\tilde{u}, \tilde{v}]$ is called the torsion tensor, and $R(\tilde{u}, \tilde{v}) = \nabla_{\tilde{u}}\nabla_{\tilde{v}} - \nabla_{\tilde{v}}\nabla_{\tilde{u}} - \nabla_{[\tilde{u}, \tilde{v}]}$ is called the curvature tensor of $\nabla$. If $M$ is a Riemannian manifold, then there exists a unique affine connection without torsion which leaves the Riemannian inner product parallel.

M. Local Linearization Theorems

Let $M$ be a $C^{r+1}$ Banach manifold ($r \ge 1$) modeled on $E$. Let $\tilde{u}$ be a $C^r$ vector field on $M$ such that $\tilde{u}(x) \ne 0$ at $x \in M$. Then there are a neighborhood $U_x$ of $x$ and a $C^r$ diffeomorphism $\psi$ of $U_x$ onto an open subset $V$ of $E$ such that $d\psi\, \tilde{u}(\psi^{-1}(y)) = (y, v)$, $v \in E$, for every $y \in V$, where $v$ does not depend on $y$.

N. Morse Lemma

Let $M$ be a $C^{r+2}$ Hilbert manifold and $f$ be an $\mathbf{R}$-valued $C^{r+2}$ function. Suppose $x$ is a critical point of $f$, i.e., $df(x) = 0$. $x$ is called a nondegenerate critical point if $d^2 f(x)$ is a nondegenerate bilinear form. For such $x$ there are a neighborhood $U_x$ of $x$ and a $C^r$ diffeomorphism $\psi$ of $U_x$ onto an open neighborhood of 0 of the model space $E$ such that $\psi(x) = 0$, and $f(\psi^{-1}(y)) = \|Py\|^2 - \|(1 - P)y\|^2$, where $P$ is an orthogonal

projection in $E$. $i_x = \dim(1 - P)E$ ($0 \le i_x \le \infty$) is called the index of the critical point $x$ of $f$.

O. Submanifolds

A subset $N$ of a $C^\infty$ Banach manifold $M$ modeled on $E$ is called a $C^r$ submanifold if at each point $x \in N$ there are a neighborhood $U_x$ of $x$ and a $C^r$-diffeomorphism $\psi$ of $U_x$ onto an open neighborhood $V$ of 0 of $E$ such that $\psi(x) = 0$ and $\psi(U_x \cap N) = V \cap F$, where $F$ is a closed linear subspace of $E$. There are some other definitions of submanifolds. One of them requires in addition that $F$ be a direct summand of $E$, and another uses instead of $U_x \cap N$ its connected component containing $x$. In the latter definition, a submanifold is not necessarily locally closed.

P. Sard-Smale Theorem

Although it is not easy to define nontrivial measures on infinite-dimensional manifolds, the concept "almost everywhere" can sometimes be replaced by that of residual sets. A subset of a topological space is called a †residual set if it contains an intersection of countably many open dense subsets. A residual set in a complete metric space or in a †Baire space is dense. S. Smale [15] extended Sard's theorem to infinite-dimensional manifolds as follows: Let $M$, $N$ be $C^r$ Banach manifolds and $f: M \to N$ be a $C^r$ Fredholm mapping. If $M$ is separable and $r > \max\{0, \operatorname{Ind}(df(x))\}$ for each $x \in M$, then $R_f = N - f(C)$ is a residual subset of $N$, where $C$ is the set of critical points of $f$, i.e., the points $x$ where $df(x)$ is not surjective.

Q. Calculus of Variations and Infinite-Dimensional Manifolds

Many problems in the calculus of variations can be understood as problems seeking critical points of functions defined on infinite-dimensional manifolds. R. Palais and S. Smale set up the following Condition C and fixed a category of functions where the critical points can be chased through gradient-like vector fields [6].
Palais-Smale Condition C. Let $f$ be a $C^1$ function on a $C^1$ Finsler manifold $M$. If $S$ is any subset of $M$ on which $f$ is bounded but on which $|df(x)|$ is not bounded away from 0, then there is a critical point of $f$ adherent to $S$.
In general, it is not easy to examine Condition C for a concrete $f$. However, many concrete problems, where the Euler equations are nonlinear elliptic, satisfy Condition C.
Morse theory. Let $M$ be a $C^\infty$ complete Riemannian manifold and $f$ a $C^\infty$ function bounded below satisfying Condition C and having only nondegenerate critical points. Then, using the Morse lemma, one can make a †handlebody decomposition of $M$ by the same method as in the case of finite-dimensional manifolds (→ 279 Morse Theory).
Lyusternik-Shnirel'man theory. This theory, constructed on finite-dimensional manifolds, can be extended naturally to Finsler manifolds. Let $M$ be a complete $C^2$ Finsler manifold and $f$ a $C^2$ function satisfying Condition C and bounded below. Then $f$ has at least $\operatorname{cat}(M)$ critical points, where $\operatorname{cat}(M) = m$ means that $M$ can be covered by $m$ closed contractible subsets of $M$ but not by $m - 1$ ones. If there is no such integer, then we set $\operatorname{cat}(M) = \infty$.
Both Morse theory and Lyusternik-Shnirel'man theory have been successfully employed in the global theory of the calculus of variations.

R. Bifurcation Theory

Bifurcation theory concerns itself with the structure of the zeros of the functional equation of $w$ with a parameter $\lambda$:

$$G(\lambda, w) = 0. \tag{6}$$

In general, the state $w$ satisfying (6) represents the equilibrium (time-independent or stationary) solution of the †evolution equation

$$w_t = G(\lambda, w) \quad \text{and} \quad w(0) = w_0. \tag{7}$$

Here the evolution equation itself stems from a mathematical model describing natural phenomena, $w = w(t)$ stands for the state at time $t$, and $\lambda$ is the set of parameters representing the physical environment. For example, in the †Navier-Stokes equation appearing in fluid dynamics, $w(t)$ represents the unknown velocity field at time $t$ and $\lambda$ is the †Reynolds number. It is important to study bifurcation phenomena because they typically accompany the transition to instability of the state when some characteristic parameter passes through a certain value, called a critical value.
Let $X$, $Y$, and $\Lambda$ be real Banach spaces, and let $G(\lambda, w)$ be a mapping from $\Lambda \times X$ to $Y$. Suppose that there exists a mapping $\bar{w}(\lambda): \Lambda \to X$ satisfying $G(\lambda, \bar{w}(\lambda)) = 0$. One calls $(\lambda, \bar{w}(\lambda))$ a trivial solution of (6).
$(\lambda_0, \bar{w}(\lambda_0))$ is called a bifurcation point of $G(\lambda, w)$ (with respect to the trivial solution) if in any neighborhood of $(\lambda_0, \bar{w}(\lambda_0))$ there exists a nontrivial solution of (6). (In general, there may appear another type of solution that is not connected with the trivial solution [20].)
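
The definition of a bifurcation point can be illustrated on a one-dimensional toy problem. The sketch below is not taken from the text: it assumes the scalar model $G(\lambda, w) = \lambda w - w^3$ with $X = Y = \Lambda = \mathbf{R}$ and the trivial solution $\bar{w}(\lambda) = 0$, and all function names are hypothetical. For this model the nontrivial zeros are $w = \pm\sqrt{\lambda}$ for $\lambda > 0$, so every neighborhood of $(0, 0)$ contains a nontrivial solution, and the derivative with respect to $w$ along the trivial branch vanishes only at $\lambda = 0$.

```python
import math


def G(lam, w):
    """Illustrative scalar model G(lam, w) = lam*w - w**3 (an assumption, not from the text)."""
    return lam * w - w ** 3


def dG_dw(lam, w):
    """Derivative of the illustrative model with respect to w."""
    return lam - 3.0 * w ** 2


def nontrivial_zeros(lam):
    """Nonzero solutions w of G(lam, w) = 0: w = -sqrt(lam) and +sqrt(lam) when lam > 0."""
    if lam <= 0.0:
        return []
    root = math.sqrt(lam)
    return [-root, root]


def nontrivial_zero_within(radius):
    """Return a nontrivial zero (lam, w) whose Euclidean distance to (0, 0) is below radius."""
    # With lam = min(radius, 1)**2 / 2 one has lam**2 + lam < radius**2.
    lam = 0.5 * min(radius, 1.0) ** 2
    w = nontrivial_zeros(lam)[1]
    assert math.hypot(lam, w) < radius
    return lam, w


if __name__ == "__main__":
    # Shrinking neighborhoods of (0, 0) still contain nontrivial zeros.
    for radius in (1.0, 1e-2, 1e-4):
        lam, w = nontrivial_zero_within(radius)
        print(radius, lam, w, G(lam, w))
```

In this toy model the point $(0, 0)$ is a bifurcation point in the sense defined above, while no point $(\lambda_0, 0)$ with $\lambda_0 < 0$ is, since there are no nontrivial zeros for $\lambda \le 0$.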

Assume that $G(\lambda, w)$ is of class $C^1$ in some neighborhood of $(\lambda_0, \bar{w}(\lambda_0))$ in $\Lambda \times X$. Then it follows from the †implicit function theorem that $(\lambda_0, \bar{w}(\lambda_0))$ is not a bifurcation point if $G_w(\lambda_0, \bar{w}(\lambda_0))$, the †Fréchet derivative of $G$ with respect to $w$, is nonsingular.

S. The Principle of Linearized Stability

Closely tied to the phenomenon of bifurcation is the property of stability. Suppose that the dynamics of a physical system are governed by (7). Let $w(t; w_0)$ denote the solution of (7). An equilibrium solution $\bar{w} = \bar{w}(\lambda)$ is called stable if for any $\varepsilon > 0$ there exists a $\delta > 0$ such that $\|w(t; w_0) - \bar{w}\| < \varepsilon$ for all $t > 0$ whenever $\|w_0 - \bar{w}\| < \delta$. Furthermore, $\bar{w}$ is said to be asymptotically stable if, in addition, $w(t; w_0) \to \bar{w}$ as $t \to \infty$.
By the principle of linearized stability we mean that the stability of an equilibrium solution $\bar{w}$ is determined formally by the †spectrum of the linearized operator $G_w(\lambda, \bar{w}(\lambda))$. Assume that $G_w(\lambda, \bar{w}(\lambda))$ has only a †point spectrum. Then (i) if $\bar{w}(\lambda)$ is stable, the spectrum of $G_w(\lambda, \bar{w}(\lambda))$ is contained in $\{z \in \mathbf{C} \mid \operatorname{Re} z \le 0\}$, and (ii) if the spectrum of $G_w(\lambda, \bar{w}(\lambda))$ is contained in $\{z \in \mathbf{C} \mid \operatorname{Re} z < 0\}$, $\bar{w}$ is asymptotically stable.
Suppose that as $\lambda$ crosses a certain value $\lambda_0$, one or more eigenvalues of $G_w(\lambda, \bar{w}(\lambda))$ cross the imaginary axis from the left to the right half-plane, where $\bar{w}(\lambda)$ is the known equilibrium solution. This is precisely the situation when $\bar{w}(\lambda)$ becomes unstable. For notational simplicity, we put $u = w - \bar{w}(\lambda)$ and $F(\lambda, u) = G(\lambda, u + \bar{w}(\lambda))$.

T. Bifurcation from Simple Eigenvalues

Take $\Lambda = \mathbf{R}$. Let $F \in C^2(\mathbf{R} \times X, Y)$ be such that $F(\lambda, 0) = 0$ for any real $\lambda$. Set $L_0 = F_u(\lambda_0, 0)$, $L_1 = F_{\lambda u}(\lambda_0, 0)$, and suppose that (i) $\operatorname{Ker}(L_0)$ is spanned by $u_0 \ne 0$; (ii) $\operatorname{codim} R(L_0) = 1$; and (iii) $L_1 u_0 \notin R(L_0)$, where $R(L_0)$ denotes the †range of $L_0$. Then there exists a $C^1$-curve $(\lambda, \varphi): (-\delta, \delta) \to \mathbf{R} \times X$ defined on some interval $(-\delta, \delta)$ such that $\lambda(0) = \lambda_0$, $\varphi(0) = 0$, and $F(\lambda(s), s(u_0 + \varphi(s))) = 0$ for any $s \in (-\delta, \delta)$. Moreover, in a neighborhood of $(\lambda_0, 0)$ any zero of $F$ either lies on this curve or is a trivial solution (M. G. Crandall and P. H. Rabinowitz, J. Functional Anal., 8 (1971)).
In the above case, there exist only three possible situations of the curves $(\lambda, \varphi)$ and $(\lambda, 0)$, called subcritical, supercritical, and transcritical bifurcations. In the third case, there occurs the so-called exchange of stability [19-21].

U. Bifurcation of Periodic Solutions (Hopf Bifurcation Theorem)

If $\bar{w}(\lambda)$ loses stability by virtue of a pair of complex conjugate eigenvalues crossing the imaginary axis, then under suitable conditions one can prove the existence of bifurcating time-periodic solutions of (7). Rewrite equation (7) as

$$u_t + L_0 u + g(\lambda, u) = 0. \tag{8}$$

Suppose (i) $L_0: D(L_0) \subset X \to X$ is a densely defined linear operator on $X$ such that $-L_0$ generates a strongly continuous semigroup on $X$, which is holomorphic on $X_c$ (= the complexification of $X$). $L_0$ has compact resolvent; $i$ is a simple eigenvalue of $L_0$, and $ni \notin \sigma(L_0)$, the spectrum of $L_0$, for $n = 0, 2, 3, \dots$. As a consequence of (i), if $r > -\operatorname{Re} \lambda$ for any $\lambda \in \sigma(L_0)$, then the fractional power $(rI + L_0)^\alpha$ for $\alpha \ge 0$ is well defined. Because their domains are independent of $r$, one can set $X_\alpha = D((rI + L_0)^\alpha)$, which are Banach spaces under the norm $\|u\|_\alpha = \|(rI + L_0)^\alpha u\|$. Suppose (ii) there exist an $\alpha \in [0, 1)$ and a neighborhood $\Omega$ of $(0, 0)$ in $\mathbf{R} \times X$ such that $g \in C^2(\Omega, X_\alpha)$, where $C^k(\Omega, X_\alpha)$ denotes the space of all $X_\alpha$-valued $C^k$ functions defined on $\Omega$. Moreover $g(\lambda, 0) = 0$ if $(\lambda, 0) \in \Omega$, and $g_u(0, 0) = 0$. (iii) Let $\beta = \beta(\lambda)$ be a continuously differentiable function defined in a neighborhood of 0 such that $\beta(\lambda) \in \sigma(L_0 + g_u(\lambda, 0))$ and $\beta(0) = i$. Suppose that $\operatorname{Re} \beta'(0) \ne 0$. If assumptions (i), (ii) and (iii) are satisfied, then there exist a positive $\delta$ and continuously differentiable functions $(\rho, \lambda, u): (-\delta, \delta) \to \mathbf{R}^2 \times C^0(\mathbf{R}, X_\alpha)$ such that (a) for $0 < |s| < \delta$, $u(s)$ is a $2\pi\rho(s)$-periodic solution of (8) corresponding to $\lambda = \lambda(s)$; (b) $\rho(0) = 1$, $\lambda(0) = 0$, $u(0) = 0$, and $u(s) \ne 0$ if $s \ne 0$; and (c) any $2\pi\rho$-periodic solution of (8) in $C^0_{2\pi\rho}(\mathbf{R}, X_\alpha)$ (= the space of $2\pi\rho$-periodic continuous functions with value in $X_\alpha$) with $|\rho - 1|$, $|\lambda|$ and $\|u\|$ sufficiently small is of the above form for some $|s| < \delta$ up to a translation of the real line. Moreover, if $g \in C^{k+1}(\Omega, X_\alpha)$, then the functions $\rho$, $\lambda$, $u$ are of class $C^k$.

V. The Lyapunov-Schmidt Procedure

Suppose that $L(\lambda) = F_u(\lambda, 0)$ has at $\lambda = 0$ an $n$-fold eigenvalue at the origin, i.e., $\dim \operatorname{Ker} L(0) = n$, and assume that $X \subset Y$. Let $\operatorname{Ker} L(0)$ be spanned by $\varphi_1, \dots, \varphi_n$, and let $P$ be the projection onto this linear subspace that commutes with $L(0)$. Then $P$ must take the form $Pu = \sum_{j=1}^{n} \langle u, \varphi_j^* \rangle \varphi_j$, where the $\varphi_j^* \in Y^* \subset X^*$ are null functions of the adjoint operator $L(0)^*$ and $\langle \varphi_j, \varphi_k^* \rangle = \delta_{jk}$. $P$ is a linear operator from $Y$ to $Y$; hence it can be regarded as a mapping from $X$ to $X$ as well, and $Q = I - P$ is a projec-

tion onto the range of $L(0)$ in $Y$. Using $P$ and $Q$, the equation $F(\lambda, u) = 0$ can be decomposed into the system of equations

$$QF(\lambda, v + \psi) = 0 \quad \text{and} \quad PF(\lambda, v + \psi) = 0, \tag{9}$$

where $v = Pu$ and $\psi = Qu$. One solves the first equation for $\psi = \psi(\lambda, v)$ by the implicit function theorem, and then, substituting this into the second, one has the bifurcation equation

$$\mathcal{F}(\lambda, v) = PF(\lambda, v + \psi(\lambda, v)) = 0. \tag{10}$$

Solutions of the bifurcation equation are in one-to-one correspondence with solutions of the original system sufficiently close to the bifurcation point.
There exists another method of reducing an infinite-dimensional problem to a finite-dimensional one, called the center manifold theorem [21, 39].

W. A Global Result

The following global result is due to P. H. Rabinowitz (J. Functional Anal., 7 (1971)). Assume that $F(\lambda, u) = u - \lambda Lu + H(\lambda, u)$, where $L$ is a compact linear operator and $H: \mathbf{R} \times X \to X$ is a compact mapping with $H(\lambda, u) = o(\|u\|)$ at 0 uniformly on bounded $\lambda$-intervals. Then, if $\mu^{-1}$ is an eigenvalue of $L$ of odd multiplicity, $(\mu, 0)$ is a bifurcation point for $F$ with respect to the trivial solution. Moreover, the closure of the set of nontrivial zeros of $F$ contains a component that meets $(\mu, 0)$ and either is unbounded in $\mathbf{R} \times X$ or meets $(\hat{\mu}, 0)$, where $\hat{\mu} \ne \mu$ and $\hat{\mu}^{-1}$ is an eigenvalue of $L$.
The beginning of bifurcation theory seems to be in the celebrated work of H. Poincaré (Acta Math., 7 (1885)).

X. Abstract Cauchy Problems

Suppose that we are given an abstract Cauchy problem

$$\frac{du}{dt} = Au \quad (t > 0), \tag{11}$$

$$u(+0) = a \tag{12}$$

in a Banach space $X$, where $a \in X$ and the nonlinear operator $A$ is assumed, for simplicity, to be independent of $t$. If the domain $D(A)$ of $A$ coincides with $X$ and $A$ is †Lipschitz continuous, we can reduce the abstract Cauchy problem to the †integral equation of Volterra type

$$u(t) = a + \int_0^t Au(s)\, ds,$$

and by applying the iteration procedure we can easily show that the abstract Cauchy

A similar treatment works for a singular but mildly nonlinear $A$ of the form $A = L + N$ if the linear operator $L$ is the generator of a †strongly continuous semigroup $e^{tL}$ in the sense of Hille-Yosida theory (→ 378 Semigroups of Operators and Evolution Equations) and the nonlinear mapping $N$ is Lipschitz continuous. In this case, we reduce the problem to the integral equation

$$u(t) = e^{tL}a + \int_0^t e^{(t-s)L} N u(s)\, ds.$$

Here, if we merely have to assure ourselves of the local existence of the solution, then $N$ can merely be locally Lipschitz continuous; and in this case, $N$ can even be singular to some extent if $e^{tL}$ is †holomorphic in $t$ [22, 23]. A typical application of this procedure was made by P. Sobolevskii, T. Kato, and H. Fujita to the Navier-Stokes equation (→ 204 Hydrodynamical Equations; 205 Hydrodynamics) to construct regular solutions [22].
†Galerkin's method (→ 304 Numerical Solution of Partial Differential Equations) is sometimes quite convenient for obtaining (†weak) solutions of (11) and (12) [5]. Again, typical applications were made to the Navier-Stokes equations by E. Hopf and others [5, 24, 25]. To prove the convergence of approximate solutions constructed by Galerkin's method and subject to so-called energy estimates, we often make use of Aubin's compactness theorem concerning vector-valued functions [5].
Remarkable developments have taken place since 1964 for the case where $A$ is dissipative, i.e., $-A$ is accretive. F. Browder [26] proved that if $X$ is a Hilbert space and $A$ is dissipative, then the mere continuity of $A$ is sufficient for the †well-posedness of (11) and (12). Then Y. Komura [27] brought about a crucial advance by showing that if $A$ is a (possibly multivalued and) maximal dissipative operator with $D(A)$ dense in the Hilbert space $X$, then (11) and (12) are uniquely solvable for any $a \in D(A)$. Furthermore, he founded the theory of nonlinear semigroups of operators by establishing a †nonlinear version of the Hille-Yosida theory for semigroups of †nonexpansive operators in Hilbert spaces [28]. Subsequent developments and applications were made in various directions by T. Kato (J. Math. Soc. Japan, 19 (1967)), M. Crandall and A. Pazy (J. Functional Anal., 3 (1969)), H. Brezis and A. Pazy (J. Functional Anal., 6 (1970)), S. Oharu (J. Math. Soc. Japan, 22 (1970)), M. Crandall and T. Liggett [29], Y. Konishi (Proc. Japan Acad., 47 (1971)), and others. While Komura's original proof was based on the Yosida approximation $A(I - \varepsilon A)^{-1}$ ($\varepsilon \to +0$) for $A$, it is proved in [29] that in any Banach
problem has a uniquely determined solution. space a dissipative operator A that is maxi-
1081 286 Ref.
Nonlinear Functional Analysis

ma1 in a certain sense generates a nonlinear Assume the following conditions on F: (i) For
semigroup 7; by the exponential formula some numbers R > 0,~ > 0, p,, > 0 and every
-n pair of numbers p, p’ such that 0 < p’ < p <
?;x=lim I-fA x. (13) pO, (u, t)+F(u, t) is a continuous mapping of
n-m ( n > {ueB,I llull,<R} x {tl]t)<g} into B,.. (ii) For
The scope of applications of the generating any p’<p<pO and all u, vcl3, with llullp<
theorem (13) can be seen, e.g., in B. K. Quinn R, 11v 11p < R, and for any t, It 1<q, F satisfies
(Comm. Pure Appl. Math., 24 (1971)) M. G. ll~~u,~~-~~~,~~ll,~~~ll~-~ll,l~~-~’~, where C
Crandall (Israel J. Math., 12 (1972)) Y. Koni- is a constant independent oft, u, v, p, or p’. (iii)
shi (Proc. Japan Acud., 48 (1972) and J. Math. F(0, t) is a continuous function of t, 1t( <‘I, with
Sot. Japan, 25 (1973)) and S. Aizawa (Hiro- values in B, for every p < p0 and satisfies, with
shima Math. J., 6 (1976)). a fixed constant K, llF(O, t)llp<K/(po-p), O<
P<Po.
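The exponential formula (13) can be tried out numerically in the simplest possible setting. The sketch below is an illustration only, not part of the theory quoted above: it assumes the one-dimensional space $X = \mathbf{R}$ and the hypothetical dissipative operator $A(y) = -y^3$, for which $du/dt = Au$ has the closed-form solution $u(t) = x/\sqrt{1 + 2x^2 t}$. Each resolvent $(I - hA)^{-1}$ is evaluated by bisection, and the names `resolvent`, `exponential_formula`, `cubic`, and `exact_cubic_flow` are invented for this example.

```python
import math

def resolvent(x, h, A, iterations=100):
    """Solve y - h*A(y) = x for y by bisection.

    Assumes A is a nonincreasing real function with A(0) = 0, so that
    y -> y - h*A(y) is increasing and the root lies between 0 and x.
    """
    lo, hi = min(0.0, x), max(0.0, x)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid - h * A(mid) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def exponential_formula(x, t, n, A):
    """Approximate T_t x by applying the resolvent (I - (t/n) A)^{-1} n times."""
    h = t / n
    for _ in range(n):
        x = resolvent(x, h, A)
    return x

def cubic(y):
    """Hypothetical dissipative operator on R: A(y) = -y**3."""
    return -y ** 3

def exact_cubic_flow(x, t):
    """Closed-form solution of u' = -u**3 with u(0) = x."""
    return x / math.sqrt(1.0 + 2.0 * x * x * t)

if __name__ == "__main__":
    # The error should shrink roughly in proportion to 1/n.
    for n in (10, 100, 1000):
        approx = exponential_formula(1.0, 1.0, n, cubic)
        print(n, approx, abs(approx - exact_cubic_flow(1.0, 1.0)))
```

The iteration is the implicit (backward) Euler scheme with step $t/n$, which is why the limit in (13) is taken over resolvents rather than over powers of $I + (t/n)A$; the resolvent of a dissipative operator is nonexpansive, so the products stay stable as $n$ grows.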
Y. Nonlinear Semigroups in Banach Lattices

Let $X$ be a Banach lattice (→ 310 Ordered Linear Spaces). An operator $A: X \supset D(A) \to X$ is said to be dispersive if for all $x, y \in D(A)$,

$$\|(x - y)^+\| \le \|(x - y - \lambda(Ax - Ay))^+\| \quad (\lambda > 0).$$

When a dispersive operator $A$ satisfies the range condition $R(I - \lambda A) \supset D(A)$ for any $\lambda > 0$, it generates an order-preserving semigroup $T_t = e^{tA}$ on $D(A)$: $\|(T_t x - T_t y)^+\| \le \|(x - y)^+\|$ for $t > 0$. We have therefore the preservation of order: $x \le y$ implies $T_t x \le T_t y$. We can prove in particular that the order of initial data is inherited by the solutions of a nonlinear heat equation (Y. Konishi).

Remark. Various pathological phenomena arise when we do not restrict the form of $A$ in the abstract Cauchy problem (11), (12). We cite merely the "blowing up" of solutions of Cauchy problems for nonlinear heat equations (→ 291 Nonlinear Problems) and the nonlinear wave propagations described by nonlinear Schrödinger equations (J. B. Baillon, T. Cazenave, and M. Figueira).

Z. Abstract Cauchy-Kovalevskaya Theorem in a Scale of Banach Spaces

T. Yamanaka (Comment. Math. Univ. St. Paul, 9 (1960)), L. V. Ovsyannikov (Soviet Math. Doklady, 6 (1965) and 12 (1971)), F. Treves (Trans. Amer. Math. Soc., 150 (1970)), L. Nirenberg (J. Differential Geometry, 6 (1972)), and T. Nishida [30] discussed abstract treatments of the classical Cauchy-Kovalevskaya theorem for partial differential equations: Let $S = \{B_\rho\}_{\rho > 0}$ be a collection of Banach spaces depending on the real parameter $\rho > 0$. Let $\|u\|_\rho$ denote the norm of an element $u \in B_\rho$. The collection $S$ is called a scale of Banach spaces if, for any $\rho$ and $\rho' < \rho$, $B_\rho \subset B_{\rho'}$ and $\|u\|_{\rho'} \le \|u\|_\rho$ for any $u$ in $B_\rho$. Consider in $S$ the initial value problem of the form

$$\frac{du}{dt} = F(u(t), t), \quad |t| < \eta, \quad \text{and} \quad u(0) = 0. \tag{14}$$

Abstract Cauchy-Kovalevskaya theorem. Under the preceding hypotheses there is a positive constant $M$ such that there exists a unique function $u(t)$ which for every positive $\rho < \rho_0$ and $|t| < M(\rho_0 - \rho)$ is a continuously differentiable function of $t$ with values in $B_\rho$, $\|u(t)\|_\rho < R$, and satisfies (14) (→ also [31]).

These results cover a theorem of M. Nagumo (Japan. J. Math., 18 (1948)), which generalizes the classical Cauchy-Kovalevskaya theorem.

References

[1] L. Nirenberg, Topics in nonlinear functional analysis, Lecture notes, New York Univ., 1973-1974.
[2] J. T. Schwartz, Nonlinear functional analysis, Gordon & Breach, 1969.
[3] M. Berger, Nonlinearity and functional analysis, Academic Press, 1977.
[4] G. J. Minty, Monotone (nonlinear) operators in Hilbert space, Duke Math. J., 29 (1962), 341-346.
[5] J. L. Lions, Quelques méthodes de résolution des problèmes aux limites non linéaires, Dunod, 1969.
[6] J. K. Hale, Ordinary differential equations, Interscience, 1969.
[7] M. A. Krasnosel'skii, Topological methods in the theory of nonlinear integral equations, Pergamon, 1964. (Original in Russian, 1956.)
[8] A. Friedman, Partial differential equations of parabolic type, Prentice-Hall, 1964.
[9] O. A. Ladyzhenskaya and N. N. Ural'tseva, Linear and quasilinear elliptic equations, Academic Press, 1968. (Original in Russian, 1956.)
[10] E. Hille and R. S. Phillips, Functional analysis and semigroups, Amer. Math. Soc. Colloq. Publ., 31 (1957).
[11] H. Omori, Infinite-dimensional Lie transformation groups, Lecture notes in math. 427, Springer, 1974.
[12] J. Moser, A new technique for construction of solutions of nonlinear differential equations, Proc. Nat. Acad. Sci. US, 47 (1961), 1824-1831.
[13] H. Jacobowitz, Implicit function theorems and isometric embeddings, Ann. Math., 95 (1972), 191-225.
[14] J. Mather, Stability of $C^\infty$ mappings I, Ann. Math., 87 (1968), 89-104.
[15] S. Smale, An infinite-dimensional version of Sard's theorem, Amer. J. Math., 87 (1965), 861-866.
[16] R. Palais and S. Smale, A generalized Morse theory, Bull. Amer. Math. Soc., 70 (1964), 165-172.
[17] C. Bessaga, Every infinite-dimensional Hilbert space is diffeomorphic with its unit sphere, Bull. Acad. Sci. Polon., 14 (1966), 27-31.
[18] R. Palais, Lusternik-Schnirelman theory on Banach manifolds, Topology, 5 (1966), 115-132.
[19] M. G. Crandall and P. H. Rabinowitz, Mathematical theory of bifurcation, Bifurcation Phenomena in Mathematical Physics and Related Topics, C. Bardos and D. Bessis (eds.), Reidel, 1980.
[20] J. E. Marsden, Qualitative methods in bifurcation theory, Bull. Amer. Math. Soc., 84 (1978), 1125-1148.
[21] D. H. Sattinger, Bifurcation and symmetry breaking in applied mathematics, Bull. Amer. Math. Soc., new ser., 3 (1980), 779-819.
[22] T. Kato, Nonlinear evolution equations in Banach spaces, Amer. Math. Soc. Proc. Symposia in Appl. Math., 17 (1965), 50-67.
[23] P. E. Sobolevskii, Equations of parabolic type in a Banach space, Amer. Math. Soc. Transl., (2) 49 (1966), 1-62. (Original in Russian, 1961.)
[24] E. Hopf, Über die Anfangswertaufgabe für die hydrodynamischen Grundgleichungen, Math. Nachr., 4 (1951), 213-231.
[25] O. A. Ladyzhenskaya, The mathematical theory of viscous incompressible flow, Gordon & Breach, 1963. (Original in Russian, 1961.)
[26] F. Browder, Nonlinear equations of evolution, Ann. Math., 80 (1964), 485-523.
[27] Y. Komura, Nonlinear semigroups in Hilbert space, J. Math. Soc. Japan, 19 (1967), 493-507.
[28] Y. Komura, Differentiability of nonlinear semigroups, J. Math. Soc. Japan, 21 (1969), 375-402.
[29] M. G. Crandall and T. M. Liggett, Generation of semigroups of nonlinear transformations on general Banach spaces, Amer. J. Math., 93 (1971), 265-298.
[30] T. Nishida, A note on a theorem of Nirenberg, J. Differential Geometry, 12 (1977), 629-633.
[31] T. Kano and T. Nishida, Sur les ondes de surface de l'eau avec une justification mathématique des équations des ondes en eau peu profonde, J. Math. Kyoto Univ., 19 (1979), 335-370.
[32] Yu. G. Borisovich, V. G. Zvyagin, and Yu. I. Sapronov, Nonlinear Fredholm maps and the Leray-Schauder theory, Russian Math. Surveys, 32 (1977), 1-54. (Original in Russian, 1977.)
[33] J. Leray and J. Schauder, Topologie et équations fonctionnelles, Ann. Sci. Ecole Norm. Sup., 51 (1934), 45-78.
[34] L. Lyusternik and L. Shnirel'man, Méthode topologique dans les problèmes variationnels, Hermann, 1934. (Original in Russian, 1930.)
[35] M. Morse, Calculus of variations in the large, Amer. Math. Soc. Colloq. Publ., 18 (1934).
[36] J. Moser, A rapidly convergent iteration method and nonlinear partial differential equations I, II, Ann. Scuola Norm. Pisa, 20 (1966), 265-315, 499-535.
[37] J. Nash, The embedding problem for Riemannian manifolds, Ann. Math., 63 (1956), 20-63.
[38] K. D. Elworthy and A. J. Tromba, Differential structures and Fredholm maps, Amer. Math. Soc. Proc. Symposia in Pure Math., 15 (1970), 45-94.
[39] J. Carr, Applications of centre manifold theory, Springer, 1981.

287 (XX.33)
Nonlinear Lattice Dynamics

A. Lattice Dynamics

In order to elucidate certain characteristic features of nonlinear waves, one-dimensional lattice models have been studied. Around 1953, E. Fermi et al. performed computer experiments on nonlinear lattices to verify a generally accepted belief that nonlinear coupling between the normal modes of harmonic oscillators would lead to complete energy sharing between these modes. To their surprise, their nonlinear lattices yielded very little energy sharing at all; on the contrary, the interactions resulted in the recurrence of the initial state. These results were later interpreted in terms of solitons (→ 387 Solitons), i.e., nonlinear waves that preserve identity despite mutual interaction.

The equations of motion for a uniform one-dimensional chain of particles of mass $m$ with
nearest-neighbor interaction can be written as

$$m\frac{d^2 Q_n}{dt^2} = -\varphi'(Q_n - Q_{n-1}) + \varphi'(Q_{n+1} - Q_n) \quad (n = \ldots, 1, 2, \ldots).$$

It was shown that a lattice with a nonlinear interaction of the form

$$\varphi(r) = e^{-r} + r + \text{const.}$$

admits solutions in closed form. Later, it was shown that this lattice (the exponential lattice or the Toda lattice) is a completely integrable system.

It is convenient to introduce $s_n$, which is the generalized momentum canonically conjugate to the mutual displacement $r_n = Q_{n+1} - Q_n$. For a lattice with exponential interaction,

$$e^{-r_n} - 1 = ds_n/dt,$$

and if we introduce

$$S_n = \int^t s_n\,dt,$$

then the equations of motion can be written as

$$\log(1 + d^2 S_n/dt^2) = S_{n+1} + S_{n-1} - 2S_n,$$

and the displacements are given by

$$Q_n = S_n - S_{n+1}.$$

in a Lax representation $dL/dt = BL - LB$, where $L$ and $B$ are $N \times N$ matrices with the elements

$$L_{n,n} = b_n, \quad L_{n,n+1} = L_{n+1,n} = a_n, \quad L_{1,N} = L_{N,1} = a_N,$$

$$B_{n,n+1} = -B_{n+1,n} = -a_n, \quad B_{1,N} = -B_{N,1} = a_N$$

(the other elements of $L$ and $B$ are all zero), with

$$a_n = \tfrac{1}{2}e^{-(Q_{n+1} - Q_n)/2}, \quad b_n = \tfrac{1}{2}P_n \quad (P_n = dQ_n/dt).$$

The eigenvalues $\lambda$ of $L$ can be shown to be independent of time, and so the motion of the lattice is a spectrum-preserving deformation. Now, if we define $\{I_n\}$ by

$$\det(\lambda I - L) = \lambda^N + \lambda^{N-1}I_1 + \cdots + \lambda I_{N-1} + I_N,$$

then the $\{I_n\}$ are polynomials of $a_n$ and $b_n$. These are constants of motion that were discovered independently by M. Hénon and H. Flaschka. Thus the lattice has $N$ conserved quantities; $I_1$ is related to the total momentum and $I_2$ to the total energy, but higher-index conserved quantities have no physical interpretation.
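The claim that the spectrum of $L$ does not change in time can be checked numerically. The following is a minimal sketch under stated assumptions, not the method of the works cited: it takes $m = 1$, $N = 5$ particles on a ring, arbitrary illustrative initial data, and a classical fourth-order Runge-Kutta integrator. Instead of computing eigenvalues it monitors the traces $\operatorname{tr} L^k$ ($k = 1, \ldots, N$), which determine the coefficients $I_n$ of the characteristic polynomial through Newton's identities. The function names are invented for this example.

```python
import math

def accelerations(Q):
    """Periodic exponential lattice with m = 1:
    Q_n'' = exp(-(Q_n - Q_{n-1})) - exp(-(Q_{n+1} - Q_n)), indices mod N."""
    N = len(Q)
    return [math.exp(-(Q[n] - Q[n - 1])) - math.exp(-(Q[(n + 1) % N] - Q[n]))
            for n in range(N)]

def rk4_step(Q, P, dt):
    """One classical Runge-Kutta step for Q' = P, P' = accelerations(Q)."""
    def shifted(base, slope, factor):
        return [b + factor * s for b, s in zip(base, slope)]
    k1q, k1p = P, accelerations(Q)
    k2q, k2p = shifted(P, k1p, 0.5 * dt), accelerations(shifted(Q, k1q, 0.5 * dt))
    k3q, k3p = shifted(P, k2p, 0.5 * dt), accelerations(shifted(Q, k2q, 0.5 * dt))
    k4q, k4p = shifted(P, k3p, dt), accelerations(shifted(Q, k3q, dt))
    new_Q = [q + dt / 6.0 * (a + 2 * b + 2 * c + d)
             for q, a, b, c, d in zip(Q, k1q, k2q, k3q, k4q)]
    new_P = [p + dt / 6.0 * (a + 2 * b + 2 * c + d)
             for p, a, b, c, d in zip(P, k1p, k2p, k3p, k4p)]
    return new_Q, new_P

def lax_matrix(Q, P):
    """Periodic matrix L with L[n][n] = P_n/2 and
    L[n][n+1] = L[n+1][n] = exp(-(Q_{n+1} - Q_n)/2)/2 (requires N >= 3)."""
    N = len(Q)
    L = [[0.0] * N for _ in range(N)]
    for n in range(N):
        m = (n + 1) % N
        a = 0.5 * math.exp(-(Q[m] - Q[n]) / 2.0)
        L[n][n] = 0.5 * P[n]
        L[n][m] = a
        L[m][n] = a
    return L

def power_traces(L, kmax):
    """Return [tr L, tr L^2, ..., tr L^kmax] by repeated multiplication."""
    N = len(L)
    M = [row[:] for row in L]
    traces = []
    for _ in range(kmax):
        traces.append(sum(M[i][i] for i in range(N)))
        M = [[sum(M[i][k] * L[k][j] for k in range(N)) for j in range(N)]
             for i in range(N)]
    return traces

if __name__ == "__main__":
    Q = [0.3, -0.2, 0.5, 0.0, -0.4]   # illustrative initial displacements
    P = [0.1, -0.3, 0.2, 0.4, -0.4]   # illustrative initial momenta
    before = power_traces(lax_matrix(Q, P), len(Q))
    for _ in range(2000):
        Q, P = rk4_step(Q, P, 0.005)
    after = power_traces(lax_matrix(Q, P), len(Q))
    for k, (x, y) in enumerate(zip(before, after), start=1):
        print(k, x, y, abs(x - y))
```

With these conventions $\operatorname{tr} L = \tfrac12\sum P_n$ is half the total momentum and $\operatorname{tr} L^2 = \tfrac14\sum P_n^2 + \tfrac12\sum e^{-(Q_{n+1} - Q_n)}$ is half the total energy, in line with the physical reading of $I_1$ and $I_2$ given in the text; any residual drift in the printed differences reflects only the integrator's truncation error.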
We have a solution

$$S_n = \log\{1 + e^{2(\alpha n + \beta t + \delta)}\},$$

where $\alpha$ and $\delta$ are arbitrary constants and $\beta = \pm\sinh\alpha$. The associated wave

$$e^{-r_n} - 1 = \beta^2\,\mathrm{sech}^2(\alpha n + \beta t + \delta)$$

represents a solitary wave or soliton.

The multisoliton ($N$-soliton) solution is given by

$$S_n = \log\det\Psi_n,$$

where $\Psi_n$ is an $N \times N$ matrix whose elements are

$$(\Psi_n)_{jk} = \delta_{jk} + \frac{c_j c_k}{1 - z_j z_k}(z_j z_k)^{n+1}e^{(\beta_j + \beta_k)t},$$

with

$$z_j = \pm e^{-\alpha_j}, \quad \beta_j = \mp\sinh\alpha_j.$$

Asymptotically the wave reduces as $t \to \mp\infty$ to an assembly of solitons

$$e^{-r_n} - 1 = \sum_{j=1}^{N}\beta_j^2\,\mathrm{sech}^2(\alpha_j n + \beta_j t + \delta_j^{\pm}).$$

B. Conserved Quantities

The equations of motion for a periodic exponential lattice of $N$ particles can be written

C. Method of Integration

Let $L$ and $B$ be the infinite matrices obtained from the foregoing ones in the limit $N \to \infty$. The eigenvalues $\lambda$ of the equation

$$L\varphi = \lambda\varphi$$

are independent of time, and the time evolution of $\varphi$ is given by the equation

$$d\varphi/dt = B\varphi.$$

If the motion in the lattice is restricted to a finite region, we can clearly speak of the scattering of the wave $\varphi$ due to the deformation in the lattice. For a given initial motion $Q_n(0)$ and $P_n(0)$, or $L(0)$, we calculate the initial scattering data of asymptotic form $\varphi \sim z^n$ ($n \to \infty$). The scattering data consist of the reflection coefficient $R(z)$, the bound state eigenvalue $\lambda_j = -(z_j + z_j^{-1})/2$ ($\lambda_j < -1$ or $\lambda_j > 1$), and the coefficient $c_j$ of the normalized bound state eigenfunction of asymptotic form $c_j z_j^n$ for $n \to \infty$. From the initial data and the equations of motion for $n \to \pm\infty$, we get the scattering data at a later time $t$. In effect, we construct the kernel

$$F(m) = \frac{1}{2\pi i}\oint R(z, 0)e^{-(z - z^{-1})t}z^{m-1}\,dz + \cdots$$
of the discrete integral equation (Gel'fand-Levitan-Marchenko equation)

$$\kappa(n, m) + F(n + m) + \sum_{n'=n+1}^{\infty}\kappa(n, n')F(n' + m) = 0, \quad m \ge n + 1.$$

After solving this equation for $\kappa(n, m)$, we calculate $K(n, n)$, given by

$$\frac{1}{[K(n, n)]^2} = 1 + F(2n) + \sum_{n'=n+1}^{\infty}\kappa(n, n')F(n' + n).$$

Then the initial value problem is solved in the form

$$e^{-(Q_n - Q_{n-1})} = \left[\frac{K(n, n)}{K(n-1, n-1)}\right]^2.$$

The solution can be given as

$$dQ_n/dt = s_n - s_{n+1},$$

with

$$s_n = K(n-1, n).$$

The simplest case $R(z) = 0$ yields the multisoliton solution.

For the periodic case also, eigenvalues of the equations $L\varphi = \mu\varphi$ and $d\varphi/dt = B\varphi$, under suitable boundary conditions and for certain initial data, give sufficient information to construct a solution to the initial value problem. Such a method of obtaining a general solution for the periodic lattice was developed by E. Date and S. Tanaka, and independently by M. Kac and P. Van Moerbeke. Following Date and Tanaka, the solution can be written in terms of the multivariable theta function, or the Riemann theta function $\vartheta$, as

$$P_n = \frac{d}{dt}\log\frac{\vartheta(\alpha n + \beta t + \delta)}{\vartheta(\alpha(n + 1) + \beta t + \delta)} + \text{const},$$

where $\alpha$, $\beta$, and $\delta$ are certain vector constants.

There has been much activity recently toward interpreting the integrability of the Toda lattice in terms of Lie algebras (B. Kostant [9]).

References

[1] E. Fermi, J. Pasta, and S. Ulam, Studies of non-linear problems, Los Alamos Report LA-1940 (1955); Collected papers of Enrico Fermi, vol. II, Univ. of Chicago Press, 1965, 978-988.
[2] M. Toda, Vibration of a chain with nonlinear interaction, J. Phys. Soc. Japan, 22 (1967), 431-436.
[3] M. Toda, Studies of a nonlinear lattice, Phys. Rep., 18C (1975), 1-123.
[4] M. Hénon, Integrals of the Toda lattice, Phys. Rev., B9 (1974), 1921-1923.
[5] H. Flaschka, On the Toda lattice. I, Existence of integrals, Phys. Rev., B9 (1974), 1924-1925; II, Inverse-scattering solution, Prog. Theor. Phys., 51 (1974), 703-716.
[6] E. Date and S. Tanaka, Analogue of inverse scattering theory for the discrete Hill's equation and exact solutions for the periodic Toda lattice, Prog. Theor. Phys., 55 (1976), 457-465.
[7] M. Kac and P. Van Moerbeke, A complete solution of the periodic Toda problem, Proc. Nat. Acad. Sci. US, 72 (1975), 2879-2880.
[8] M. Toda, Theory of nonlinear lattices, Springer, 1981. (Original in Japanese, 1978.)
[9] B. Kostant, The solution to a generalized Toda lattice and representation theory, Advances in Math., 34 (1979), 195-338.

288 (XIII.10)
Nonlinear Ordinary Differential Equations (Global Theory)

A. General Remarks

Many well-known functions (with the notable exception of the $\Gamma$-function), such as the exponential, trigonometric, elliptic, and automorphic functions, satisfy ordinary differential equations of simple forms. For the purpose of finding new transcendental functions, P. Painlevé initiated the systematic study of the equation

$$F(x, y, y', \ldots, y^{(n)}) = 0 \tag{1}$$

in the complex domain. To investigate the solution in its whole domain of definition, he assumed that $F$ is a polynomial in $y, y', \ldots, y^{(n)}$ whose coefficients are analytic in $x$. Such an equation is called an algebraic differential equation. If $F$ is linear in $y^{(n)}$, then (1) is written as

$$y^{(n)} = \frac{P(x, y, y', \ldots, y^{(n-1)})}{Q(x, y, y', \ldots, y^{(n-1)})}, \tag{2}$$

where $P$ and $Q$ are polynomials in $y, y', \ldots, y^{(n-1)}$ with coefficients that are analytic functions of $x$. Equation (2) is called a rational differential equation.

If $F$ is linear in $y, y', \ldots, y^{(n)}$, i.e., (1) is a linear differential equation, and if the coefficient of $y^{(n)}$ is 1, then singular points of solutions are situated at the singular points of the coefficients (→ 254 Linear Ordinary Differential Equations (Local Theory)). If $F$ is not linear, then singular points of solutions of (1) are divided into two categories, one consisting of those points whose positions are determined by the equation itself and are independent of individual solutions, and the other consisting of those points whose positions depend on the
choice of particular solutions. In other words, the singularities of the first category appear independently of the choice of arbitrary constants involved in the general solution, while those of the second category depend on the choice of the arbitrary constants. The former are called fixed singularities and the latter movable singularities. The linear differential equation (1) has fixed singularities only, which are situated at the singularities of the coefficients. In the same way, branch points of solutions can be classified into two kinds, fixed branch points and movable branch points.

B. Algebraic Differential Equations of the First Order

Consider the equation

$$y' = \frac{P(x, y)}{Q(x, y)}, \tag{3}$$

where $P$ and $Q$ are relatively prime polynomials in $x$, $y$. The fixed singular points of (3) are defined to be points $\xi$, $\xi'$ with the following properties: (i) $Q(\xi, y) \equiv 0$. (ii) $Q(\xi', y) \not\equiv 0$, $P(\xi', y) = Q(\xi', y) = 0$ have a root $y = \eta'$. (iii) If, in (ii), we substitute $1/z$ for $y$ and if the same relation (ii) holds for $x = \xi'$ and $z = 0$, we count such a value $\xi'$ as a fixed singular point. (iv) We transform equation (3) by setting $x = 1/t$. If the value $t = 0$ satisfies (i) or (ii) for this transformed equation, we count $\infty$ as a singular point $\xi$ or $\xi'$ accordingly. The points $\xi$, $\xi'$ are, in general, transcendental singularities of solutions, and the points $\xi'$ cannot be essential singularities of solutions but may be ordinary transcendental singularities. A singular point of a solution different from $\xi$ and $\xi'$ is an algebraic singular point, and for any point distinct from $\xi$ and $\xi'$, equation (3) admits a solution with an algebraic singularity at this point. A necessary and sufficient condition that (3) has no movable branch point is that (3) be a Riccati equation.

Consider the algebraic differential equation of the first order

$$F(x, y, y') = 0. \tag{4}$$

After defining the fixed singular points $\xi$ and $\xi'$ of (4), where the algebraic function of $x$, $y$ defined by (4) has bad singularities, Painlevé proved that movable singularities of solutions are algebraic. Before Painlevé's work, L. Fuchs gave a necessary and sufficient condition that (4) has no movable branch points, and then H. Poincaré showed that if this condition is satisfied, then (4) is either reducible to the Riccati equation if $g = 0$, integrable by the use of elliptic functions if $g = 1$, or algebraically integrable if $g > 1$, where, with $x$ the independent variable, $g$ denotes the genus of the algebraic curve defined by (4). Painlevé found that there were gaps in the proofs of Fuchs and Poincaré and completed these by proving his theorem and the following one: Let $\varphi(x, y_0, x_0)$ be the solution of (4) satisfying the initial condition $y(x_0) = y_0$. Let $\bar{x}$, $x_0$ be points different from $\xi$ and $\xi'$, and let $L$ be a curve connecting $x_0$ to $\bar{x}$ and not passing through any $\xi$ or $\xi'$. If we denote by $\varphi_L(\bar{x}, y_0, x_0)$ the value at $x = \bar{x}$ of the branch obtained by continuing $\varphi(x, y_0, x_0)$ analytically in a neighborhood of $L$ and regard $\varphi_L(\bar{x}, y_0, x_0)$ as a function of $y_0$, then $\varphi_L(\bar{x}, y_0, x_0)$ coincides, in a neighborhood of every point $y_0 = b$, with several branches of an algebroid function of $y_0$. Painlevé studied the case when the general solution is finitely many-valued and gave a condition for $\varphi(x, y_0, x_0)$ to be an algebraic function of $y_0$.

When equation (4) does not contain $x$ explicitly, no movable branch points appear if and only if all solutions are single-valued, and then the solutions are expressible in terms of rational, exponential, and elliptic functions. Such an equation is called a Briot-Bouquet differential equation.

J. Malmquist proved, by using P. Boutroux's method of studying the behavior of solutions in the neighborhood of a fixed singularity, that if equation (4) admits at least one solution that has an essential singularity and is finitely many-valued and free from movable branch points around this singularity, then (4) is an equation without movable branch points. If (4) admits a solution that is a finitely many-valued transcendental function, then an algebraic transformation may be applied to (4) so that it will become an equation without movable branch points. It is an immediate consequence of the first assertion that if (3) admits a solution that has an essential singularity and is finitely many-valued and free from movable branch points around the singularity, then (3) is a Riccati equation.

Later, equations (3) and (4) were studied by M. Hukuhara, K. Yosida, T. Sato, T. Kimura, and T. Matuda. The following results are due to Kimura. If a solution $\varphi(x)$ of (3) has an essential singularity at $x = \xi$, then, in an arbitrary neighborhood of $\xi$, $\varphi(x)$ assumes every value with the exception of the roots of $P(\xi, y) = 0$. If (3) is not a Riccati equation, it is determined by a finite number of algebraic processes whether or not (3) admits a solution that has an essential singularity at $x = \xi$ and has no movable branch point around $\xi$. If (3) admits such a solution, then the singularity is a logarithmic branch point. For an essential singularity of a solution there exists, in general, a direction similar to a Julia direction, which was investigated by Hukuhara and
Kimura. Matuda studied in detail the behavior of solutions as $x$ tends to $\xi$ along a half-line and concluded that, except for some special cases, any solution tends to a certain value as $x$ tends to $\xi$ along a half-line. To obtain algebraic solutions C. Briot and J. C. Bouquet devised a method similar to Puiseux expansion in the theory of algebraic functions. Hukuhara improved their method and succeeded in reducing (3) to several differential equations of standard forms in a neighborhood of $x = \xi$. This enables us to apply the local theory to the global study. Hukuhara's method was used by Kimura and Matuda to obtain the results discussed in this paragraph.

C. Algebraic Differential Equations of the Second Order

For second-order algebraic differential equations we can pose the same problems as for first-order equations: When do these equations have single-valued or finitely many-valued general solutions? What new transcendental functions are needed to integrate such equations? These problems, studied by E. Picard and Painlevé, are difficult because of the existence of movable transcendental singularities. However, Painlevé succeeded in determining rational differential equations of the second order without movable branch points. Such equations, with the exception of those that are integrated by the use of solutions of the first-order and linear differential equations, can be transformed by rational transformations into one of the following six differential equations:

$$\text{(I)} \quad y'' = 6y^2 + x,$$

$$\text{(II)} \quad y'' = 2y^3 + xy + \alpha,$$

$$\text{(III)} \quad y'' = \frac{y'^2}{y} - \frac{y'}{x} + \frac{\alpha y^2 + \beta}{x} + \gamma y^3 + \frac{\delta}{y},$$

$$\text{(IV)} \quad y'' = \frac{y'^2}{2y} + \frac{3}{2}y^3 + 4xy^2 + 2(x^2 - \alpha)y + \frac{\beta}{y},$$

$$\text{(V)} \quad y'' = \left(\frac{1}{2y} + \frac{1}{y - 1}\right)y'^2 - \frac{y'}{x} + \frac{(y - 1)^2}{x^2}\left(\alpha y + \frac{\beta}{y}\right) + \frac{\gamma y}{x} + \frac{\delta y(y + 1)}{y - 1},$$

$$\text{(VI)} \quad y'' = \frac{1}{2}\left(\frac{1}{y} + \frac{1}{y - 1} + \frac{1}{y - x}\right)y'^2 - \left(\frac{1}{x} + \frac{1}{x - 1} + \frac{1}{y - x}\right)y' + \frac{y(y - 1)(y - x)}{x^2(x - 1)^2}\left(\alpha + \frac{\beta x}{y^2} + \frac{\gamma(x - 1)}{(y - 1)^2} + \frac{\delta x(x - 1)}{(y - x)^2}\right),$$

where $\alpha$, $\beta$, $\gamma$, and $\delta$ are constants. These equations and their solutions are called Painlevé equations and Painlevé transcendental functions, respectively. Equation (VI) was discovered by B. Gambier, who found an omission in Painlevé's calculations. All solutions of (I) are single-valued, and their properties were investigated by Boutroux. The solutions of (VI) have, in general, logarithmic branch points at $x = 0, 1, \infty$ and were studied by R. Garnier.

The case when the equation is of degree 2 with respect to $y''$ was studied by Malmquist and F. Tricomi.

The following facts are known concerning movable transcendental singularities of the rational equation $y'' = P(x, y, y')/Q(x, y, y')$, where $P$ and $Q$ are relatively prime polynomials (Kimura). Let $p$, $q$ be the degrees of $P$ and $Q$ with respect to $y'$. If $p > q + 2$, then any solution $\varphi(x)$ admits no movable essential singularities, but its derivative $\varphi'(x)$ may admit such singularities. If $p > q + 2$ and $Q$ is not decomposable as $Q_1(x, y)Q_2(x, y, y')$, then neither $\varphi$ nor $\varphi'$ admits movable essential singularities. If $p < q + 2$, then both $\varphi$ and $\varphi'$ may have movable essential singularities. If $p < q + 2$ and $Q$ is not decomposable as above, then $\varphi'$ has no movable essential singularities. If, for a solution $\varphi(x)$, $x = a$ is a movable essential singularity of $\varphi(x)$ or $\varphi'(x)$, then $\varphi(x)$ or $\varphi'(x)$ assumes all values other than a finite number of exceptional values in an arbitrary neighborhood of $x = a$. If $p > q + 2$, then every solution possesses Iversen's property, and hence the set of movable singularities is not a continuum.

D. Higher-Order Equations and Other Equations

Painlevé's method of obtaining the second-order equations without movable branch points is applicable to higher-order equations. The determination of third-order equations without movable branch points was attempted by Painlevé, J. Chazy, and Garnier by the use of this method, but is not yet complete. Chazy studied in detail an equation of the form

$$y''' = \left(1 - \frac{1}{n}\right)\frac{y''^2}{y'} + b(y)y'y'' + c(y)y'^3$$

and showed that when $n = -2$ and $b(y) = 0$, Fuchsian and Kleinian functions are obtained as solutions (→ 32 Automorphic Functions).

R. Fuchs, a son of L. Fuchs, derived equation (VI), at almost the same time as Gambier, from the study of monodromy groups. He showed that the monodromy group of the
equation

$$\frac{d^2 z}{dt^2} = \left(\frac{\alpha}{t^2} + \frac{\beta}{(t - 1)^2} + \frac{\gamma}{(t - x)^2} + \frac{\delta}{t(t - 1)} + \frac{3}{4(t - y)^2} + \frac{a}{t(t - 1)(t - x)} + \frac{b}{t(t - 1)(t - y)}\right)z$$

remains invariant as the singularity $x$ varies if and only if $\alpha$, $\beta$, $\gamma$, and $\delta$ remain constant; the singularity $y$, considered as a function of $x$, satisfies equation (VI); and $a$ and $b$ are rational functions of $x$ and $y$ and $y'$. Investigating a second-order linear differential equation of Fuchsian type with regular singularities $0, 1, \infty, x_1, \ldots, x_n, y_1, \ldots, y_n$, where $y_1, \ldots, y_n$ are apparent fixed singularities, Garnier was led, under the hypothesis that the monodromy group of this equation remains invariant as $x_1, \ldots, x_n$ vary, to a completely integrable system of partial differential equations and showed that a symmetric function of $y_1, \ldots, y_n$, considered as a function of any one of $x_1, \ldots, x_n$, satisfies an equation without movable branch points (→ 253 Linear Ordinary Differential Equations (Global Theory)).

For nonalgebraic equations, movable branch points appear even in the first-order case. Kimura obtained a sufficient condition that for an equation $F(x, y, y') = 0$, where $F$ is a polynomial of $y'$ with meromorphic coefficients in $x$, $y$, every solution has Iversen's property in a domain of the complex plane.

It was shown by O. Hölder that the $\Gamma$-function satisfies no algebraic differential equation. Also, if a function meromorphic in the unit circle is of order $\infty$, then it satisfies no algebraic differential equation.

References

[1] P. Painlevé, Oeuvres de Paul Painlevé, 1, 2, 3, Centre Nat. Recherche Sci. France, 1972, 1974, 1975.
[2] P. Boutroux, Leçons sur les fonctions définies par les équations différentielles du premier ordre, Gauthier-Villars, 1908.
[3] E. L. Ince, Ordinary differential equations, Longmans-Green, 1927.
[4] L. Bieberbach, Theorie der gewöhnlichen Differentialgleichungen, Springer, second edition, 1965.
[5] M. Hukuhara, T. Kimura, and T. Matuda, Equations différentielles ordinaires du premier ordre dans le champ complexe, Publ. Math. Soc. Japan, 1961.
[6] J. Malmquist, Sur les fonctions à un nombre fini de branches définies par les équations différentielles du premier ordre, Acta Math., 36 (1913), 297-343; 74 (1941), 175-196.
[7] E. Hille, Ordinary differential equations in the complex domain, Wiley, 1976.

289 (XIII.9)
Nonlinear Ordinary Differential Equations (Local Theory)

A. General Remarks

Consider a system of $n$ differential equations

$$dy_j/dx = f_j(x, y_1, \ldots, y_n), \quad j = 1, \ldots, n, \tag{1}$$

where the $f_j$ are analytic functions of $x, y_1, \ldots, y_n$. To simplify the notation, we use vector notation $y$ instead of $(y_1, \ldots, y_n)$. If all the $f_j$ are holomorphic at a point $(x, y) = (a, b)$, there exists one and only one solution $y(x)$ for (1) such that $y \to b$ as $x \to a$ and $y(x)$ is holomorphic at $x = a$. We say that a point $(a, b)$ is a singular point of the system (1) if it is a singular point of $f_j$ for some $j$. Cauchy's well-known existence theorem can no longer be applied to the case when $(a, b)$ is a singular point of system (1). In this case, the following three problems arise naturally: (i) to determine whether solutions $y(x)$ such that $y(x) \to b$ as $x \to a$ exist, and if they exist, to determine the number of independent solutions; (ii) to construct analytic expressions for solutions $y(x)$ such that $y(x) \to b$ as $x \to a$ or, in a slightly more general way, analytic expressions for bounded solutions $y(x)$ such that the values of $(x, y(x))$ stay in a neighborhood of the singular point $(a, b)$; (iii) to investigate the properties of these solutions. These three problems are called local problems, since only those solutions in a neighborhood of the singular point $(a, b)$ are considered. However, even when $n = 1$, the study of local problems is very difficult except for the case of singular points of particular types at which the functions $f_j$ are meromorphic.

When $n > 1$, the problem becomes even harder; research on this case lags behind that for $n = 1$. In the subsequent discussion we assume without loss of generality that $a = 0$, $b = 0$.

B. The Case of a Single Equation

Consider the equation

$$dy/dx = Y(x, y)/X(x, y), \tag{2}$$

where $X$ and $Y$ are holomorphic functions of $(x, y)$ at $(0, 0)$, $x, y \in \mathbf{C}$. When $X(0, 0) = 0$ and
289 C 1088
Nonlinear ODES (Local Theory)

$Y(0,0) \neq 0$, we can rewrite equation (2) in the form $dx/dy = X(x,y)/Y(x,y)$, and we see that (i) if $X(0,y) \not\equiv 0$, equation (2) has one and only one solution $y(x)$ which is algebraic at $x = 0$ and tends to 0 as $x \to 0$; (ii) if $X(0,y) \equiv 0$, there is no solution of equation (2) such that $y \to 0$ as $x \to 0$.

The case where $X(0,0) = 0$ and $Y(0,0) = 0$ was first studied by C. A. A. Briot and J. C. Bouquet. In order to obtain algebraic solutions they introduced a method similar to Puiseux expansion in the theory of algebraic functions. A. R. Forsyth and J. Malmquist studied a problem of reduction for equation (2) by using the Briot-Bouquet method. The theory of reduction was completed by M. Hukuhara, who divided a neighborhood of $(0,0)$ into a finite number of subdomains in such a way that (i) the union of these subdomains covers the given neighborhood of $(0,0)$ completely; (ii) in each of these subdomains equation (2) takes one of eight canonical forms. Hukuhara investigated the properties of the solutions for these canonical forms. Among them, the following two are well known as the Briot-Bouquet differential equations:

$x\,dy/dx = f(x,y), \quad f(0,0) = 0,$  (3)

$x^{\sigma+1}\,dy/dx = f(x,y), \quad f(0,0) = 0, \quad \sigma \ge 1$ is an integer.  (4)

C. Systems of Differential Equations

When $y$ is a vector with components $(y_j)$, equations (3) and (4) can be written as

$x\,dy_j/dx = f_j(x, y_1, \ldots, y_n), \quad j = 1, \ldots, n,$  (5)

$x^{\sigma+1}\,dy_j/dx = f_j(x, y_1, \ldots, y_n), \quad j = 1, \ldots, n.$  (6)

After Briot and Bouquet, the singular points of these types were studied by many authors, including Poincaré, E. Picard, H. Dulac, Malmquist, W. J. Trjitzinski, and Hukuhara. A method of constructing solutions of these equations consists of two parts: (i) formally transforming the given equations into simplified or reduced equations of the simplest possible form by applying a formal transformation of the type

$y = \sum p_{jk}\, x^{j} z^{k}$  (7)

or

$y_j = \sum p_{j k_0 k_1 \cdots k_n}\, x^{k_0} z_1^{k_1} \cdots z_n^{k_n}, \quad j = 1, 2, \ldots, n;$  (8)

(ii) verifying convergence or the validity of asymptotic expansions for the formal solutions of the given equations, which are obtained by substituting bounded solutions of the equations satisfied by $z$ or $z_j$ into (7) or (8). By studying these analytic expressions, the properties of the solutions can be clarified.

D. Properties of Solutions of Briot-Bouquet Differential Equations

For equation (3), the character of the simplified equation depends on the value of $\lambda = f_y(0,0)$. We have the following four cases:

(i) $\lambda$ is neither 0 nor negative. A suitable formal transformation (7) changes (3) to

$x\,dz/dx = \lambda z + b x^{\lambda}.$

In particular, if $\lambda$ is not equal to a positive integer, then $b = 0$. The double power series (7) is uniformly convergent. There exists a function $\varphi(x,z)$ of $(x,z)$ holomorphic at $(0,0)$ such that $y = \varphi(x, x^{\lambda}(b\log x + C))$ is a general solution of (3) with an integration constant $C$.

(ii) $\lambda = 0$. The equation satisfied by $z$ takes the form

$x\,dz/dx = z^{m+1}(b + b' z^{m}), \quad m \ge 1.$

If $b = 0$, $b'$ necessarily vanishes and $x = 0$ is a holomorphic point of (3). A general solution is given by

$Z(x) = (C - mb\log x)^{-1/m}, \quad b \neq 0, \ b' = 0,$

and

[...], $\quad bb' \neq 0.$

Here, $\zeta = \Omega(t)$ is the branch (of the inverse function of $\zeta - \log\zeta = t$) such that $\Omega(t) - t - \log t \to 0$ as $t \to \infty$. Then there exists a holomorphic function $\varphi(x,z)$ of $(x,z)$ for $|x| < \delta$, $|m\arg z + \arg b| < 3\pi/2 - \varepsilon$, $|z| < \Delta$ such that $\varphi(x, Z(x))$ is a general solution of (3).

(iii) $\lambda$ is a negative rational number $-\mu/\nu$. The equation satisfied by $z$ is written as

$\nu x\,dz/dx = z\left(-\mu + b(x^{\mu}z^{\nu})^{m} + b'(x^{\mu}z^{\nu})^{2m}\right).$

A general solution has the form

$Z(x) = x^{-\mu/\nu}(C - mb\log x)^{-1/(m\nu)}, \quad b' = 0,$

and

[...], $\quad bb' \neq 0.$

In this case, there exists a holomorphic function $\varphi(x,z)$ of $(x,z)$ for $|m\mu\arg x + m\nu\arg z + \arg b \pm \pi/2| < \pi - \varepsilon$, $|x| < \delta$, $|z| < \Delta$ such that $\varphi(x, Z(x))$ is a general solution of (3). M. Iwano expressed this solution in the form $\psi((x^{\mu}Z(x)^{\nu})^{m}, x, Z(x))$, where $\psi(w,x,z)$ is holomorphic for $|\arg w + [...]| < \pi - \varepsilon$, $0 < |w| < \delta$, $|x| < \Delta$, $|z| < \Delta$. In particular, if $b = b' = 0$, $\varphi(x,z)$ is holomorphic at $(0,0)$.

(iv) $\lambda$ is a negative

irrational number. The equation in z has the not generally be integrated by quadratures,
form but as Z,+,(x)--tO, the power series expansions
coincide with the expressions obtained in (i).
x dz/dx = iz.
(iii) In the case when the Jacobian matrix (ijk)
The formal transformation (7) may either is the zero matrix, Iwano constructed, under
diverge or converge. Dulac proved that if (7) additional assumptions, a convergent analytic
diverges and if there exists a solution y(x) such expression of a general solution for equations
that y(x)+0 as x-r0 along a suitable path L, (5).
then Ix”y(x)@argxl--tco and Ix’y(~)~argy(x)l Let the right-hand side of (6) be holomor-
--t co as x+0, x E L for any c( and /r. However, phic at (O,O, , 0). In 1939 Trjitzinski proved
the existence of such a solution is not yet the existence of solutions that admit asymp-
verified. C. L. Siegel proved that if 3, satisfies totic expansions in powers of n arbitrary con-
certain inequalities, the formal transformation stants. In 1940 and 1941, Malmquist proved,
(7) is divergent. An example such that (7) is under strong conditions, the existence of
divergent was first given by Dulac. A very solutions that are expressed as uniformly
simple example for such a case was given by convergent power series of Z,(x), , Z,(x):
Y. Sibuya. Cpjk =,,,kp(~) Z,(X)~Q . Z&X)~S. Here the Z,(x) are
In the case f(O,O) = 0 and 1. =f,(O, 0) # 0 for polynomials of x and logx of the form
(4) the most complete result was obtained by
Z,(x) = eh~(x)~A^~(C, + a polynomial of
Hukuhara, namely: (i) A suitable formal trans-
formation (7) reduces equation (4) to ck-,,~~~,c,,logx), k = ct, , p,

x”+‘dz/dx=z(cc,+x,x+...+cc,x”), Lx0= 2. where the coefficients admit asymptotic expan-


sions in powers of x. Trjitzinski’s result is
A general solution is given by Z(x) = C.
contained in Malmquist’s result as a special
~“~e“‘~‘, where A(x) is a polynomial in l/x of
case. On the other hand, under much weaker
degree (r. (ii) A general solution of (4) is ex-
assumptions than Malmquist’s, Hukuhara
pressed by a uniformly convergent power
solved a problem on the formal simplification
series of the form C ~(x)Z(x)~, where the (pk(x)
of equations (6) and formal solutions. Iwano
are holomorphic functions of x for a certain
improved Hukuhara’s result on formal solu-
sectorial neighborhood of x = 0 and have
tions and discussed the convergence of formal
+asymptotic expansions Cpjkxk as x+0.
solutions under weaker conditions than Malm-
Assume fi(0, 0, ,O) = 0 for equations (5).
quist’s (Ann. Mat. Pura Appl., 1957, 1959).
Put A,, = ahf;.layk(O, 0, , 0). Denote by i, , ,
The pjk,,,,kb(~) are holomorphic functions of
1”. the eigenvalues of an n x n matrix with
x in a sectorial neighborhood of x = 0 and
elements {3Ljk}. Then (i) if an angle w can be
admit asymptotic expansions in powers of x as
chosen so that for some m < n, all of 1WI,
x-0. The angle of the sector in which the
largl”, -WI, . . ..larg3.,-wl are less than x/2,
asymptotic expansions are valid is largest for
equations (5) possess solutions that are
Iwano’s method.
expressed as uniformly convergent (m + l)-
If C gj < n, equations of the form
tuple power series of x, Z,(x), , Z,(x):
x”ldyj/dx=fj(x,y ,,..., y,), j=l,..., n,

possess at least n-C aj solutions that are


Here the Z,(x) are general solutions of the
holomorphic for Ixj<6 (R. W. Bass, Amer. J.
simplified equations and have the expression
Math., 77 (1955)). This result is analogous to
Zk(x)=xAx (C,+a polynomial of those obtained by 0. Perron, F. Lettenmyer,
and Hukuhara and Iwano in the linear case
c,,...,ck-,,logx), k=l,2 ,..., m. (- 254 Linear Ordinary Differential Equa-
(ii) Moreover, Iwano extended the result of(i) tions (Local Theory)).
as follows: When there exists one and only one
zero among the other n-m eigenvalues, equa- E. Singular Perturbations
tions (5) have solutions, depending on m + 1
arbitrary constants, that are expressed as (m + The terms on the right-hand side of the non-
I)-tuple uniformly convergent power series linear differential equations

~“JdYjldx=f;.(x,Y~ 1 .. >Yn, ~1, j=1,2 ,...,n,


(9)
Here the coefficients pjk,k,,..k,(~m+,) are holo-
morphic functions of z,,,+, in a sectorial neigh- are holomorphic functions of (x, y, E) for lx I <a,
borhood of z,+, = 0 and admit asymptotic 11y((<b, O<~E(<C, largsl<d, and admit uni-
expansions in powers of z,+i as z,+i -0. In formly convergent expansions in powers of y
this case, the functions Z,(x), , Z,+,(x) can- with coefficients asymptotically developable in

powers of $\varepsilon$. The $\sigma_j$ are nonnegative integers. W. R. Wasow, W. A. Harris, Sibuya, and Iwano and T. Saito discussed problems on constructing asymptotic or convergent expansions for bounded solutions that are dependent on several arbitrary constants.

In equations (9), the $f_j$ and $\partial f_j/\partial y_k$ are continuous functions of $(x, y, \varepsilon)$ for $-\infty < x < +\infty$, $\|y\| < b$, $|\varepsilon| < c$, and periodic functions of period $T$ with respect to $x$. Moreover, assume that a system of degenerate algebraic equations

$0 = f_j(x, y_1, \ldots, y_n, 0), \quad j = 1, \ldots, n,$  (10)

has a periodic solution $y_j = p_j(x)$ of period $T$ for $-\infty < x < +\infty$. Then if equations (9) have periodic solutions $y_j = p_j(x, \varepsilon)$ of period $T$ such that $y_j \to p_j(x)$ as $\varepsilon \to 0$, the $p_j(x, \varepsilon)$ are called singular perturbations of $p_j(x)$ for equations (9). Concerning this problem, see I. M. Volk (Prikl. Mat. Mekh. SSSR, 10 (1946)). The work of Wasow (1950) on a single equation

$\varepsilon^{\sigma} y^{(n)} = f(x, y, y', \ldots, y^{(m)}, \varepsilon), \quad \sigma > 0, \quad n > m \ge 0,$  (11)

is remarkable.

References

[1] H. Dulac, Points singuliers des équations différentielles, Mémor. Sci. Math., 61, Gauthier-Villars, 1934.
[2] E. Picard, Traité d'analyse III, Gauthier-Villars, 1896.
[3] E. L. Ince, Ordinary differential equations, Dover reprint, 1956.
[4] M. Hukuhara, T. Kimura, and T. Matuda, Équations différentielles ordinaires du premier ordre dans le champ complexe, Math. Soc. Japan, 1961.
[5] E. Hille, Ordinary differential equations in the complex domain, Wiley, 1976.

290 (XIII.11)
Nonlinear Oscillation

A. General Remarks

By nonlinear oscillation we usually mean oscillation described by periodic or almost periodic solutions of nonlinear ordinary differential equations. The theory of nonlinear oscillation is sometimes called nonlinear mechanics. In connection with oscillations in dynamical systems and electrical circuits, the theory of nonlinear oscillation has been studied intensively in the Soviet Union under the direction of N. M. Krylov and N. N. Bogolyubov since around 1930, and after World War II research in this field became active in Western countries also.

Written as first-order systems, the differential equations in this theory take one of the forms

$dx/dt = X(x)$  (1)

or

$dx/dt = X(x, t),$  (2)

where $x$ is a vector and $t$ is a scalar. A differential equation of the form (1) is said to be autonomous. In a differential equation of the form (2), $X(x, t)$ is usually assumed to be periodic or almost periodic in $t$. In the former case, the differential equation (2) is said to be periodic, and in the latter case, almost periodic.

Oscillations in physical systems are described mostly by periodic or almost periodic differential equations; therefore it is a principal problem in the theory of nonlinear oscillation to find a periodic or almost periodic solution of these differential equations. However, an oscillation described by a solution of a differential equation can be actually realized only when the solution is stable (→ 394 Stability) under a small variation of the initial value. Therefore it is important to investigate the stability of periodic or almost periodic solutions. In view of the fact that an actual phenomenon may be only approximately described by mathematical equations, sometimes it is necessary to require a certain stability of the system so that solutions of a periodic or almost periodic equation stay stable under a small variation of the equation itself. Such stability is called structural stability, the investigation of which is also important.

It may happen that a differential equation possesses neither a periodic solution nor an almost periodic solution, but that it has an almost periodic integral manifold (i.e., a manifold $x = f(t, \theta)$ in $tx$-space, where $\theta$ is a parameter, such that $f(t, \theta)$ is periodic or almost periodic in $t$ and periodic in $\theta$) containing the trajectory of the differential equation passing through an arbitrary point of the manifold. In this case, we can consider that a solution corresponding to a trajectory lying on the manifold describes an oscillation. Therefore it is important to find a periodic or almost periodic integral manifold of a differential equation and to investigate the stability of such a manifold.

The methods used most frequently in research on nonlinear oscillation are: (i) geometric methods, (ii) analytic methods, and (iii) numerical methods.
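As a small illustration of the remarks above on first-order systems and numerical methods, the following sketch rewrites a second-order oscillator as an autonomous system of the form (1) and integrates it numerically. The choice of the van der Pol equation $\ddot{x} - \lambda(1 - x^2)\dot{x} + x = 0$, the value $\lambda = 0.5$, the classical fourth-order Runge-Kutta scheme, and all function names are assumptions made for this example only; they are not prescribed by the text.

```python
def van_der_pol(state, lam):
    """Right-hand side X(x) of the autonomous system dx/dt = X(x).

    The state is (x, v) with v = dx/dt, so the second-order equation
    x'' - lam*(1 - x^2)*x' + x = 0 becomes a first-order system.
    """
    x, v = state
    return (v, lam * (1.0 - x * x) * v - x)


def rk4_step(f, state, dt, lam):
    """One classical fourth-order Runge-Kutta step (illustrative choice)."""
    k1 = f(state, lam)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), lam)
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), lam)
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)), lam)
    return tuple(
        s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def late_amplitude(lam=0.5, x0=0.1, t_end=100.0, dt=0.01, t_tail=80.0):
    """Largest |x(t)| observed for t_tail <= t <= t_end, starting at (x0, 0)."""
    state = (x0, 0.0)
    amplitude = 0.0
    n_steps = int(round(t_end / dt))
    for i in range(n_steps):
        state = rk4_step(van_der_pol, state, dt, lam)
        if (i + 1) * dt >= t_tail:
            amplitude = max(amplitude, abs(state[0]))
    return amplitude


if __name__ == "__main__":
    # A small and a large initial value should settle on the same oscillation.
    print(late_amplitude(x0=0.1), late_amplitude(x0=3.0))
```

Under these assumptions, trajectories started inside and outside the closed orbit are expected to settle onto an oscillation of the same amplitude, which is a numerical hint (not a proof) of a stable periodic solution.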

B. Linear Oscillations

Let $S_\omega$ and $S_\omega^{*}$ be the spaces of all $\omega$-periodic solutions of the $\omega$-periodic linear system

$dx/dt = A(t)x, \quad A(t + \omega) = A(t),$  (3)

and its adjoint system, respectively. Then the space $P_\omega$ of continuous $\omega$-periodic functions $\mathbf{R} \to \mathbf{R}^n$ has direct sum decompositions $P_\omega = S_\omega + S_\omega^{\perp}$ and $P_\omega = S_\omega^{*} + S_\omega^{*\perp}$ so that $(x, y) = 0$ for every $x \in S_\omega$, $y \in S_\omega^{\perp}$ or $x \in S_\omega^{*}$, $y \in S_\omega^{*\perp}$, where $(x, y) = (1/\omega)\int_0^{\omega} x(s)y(s)\,ds$. Let $\{\xi^1, \ldots, \xi^m\}$ and $\{\eta^1, \ldots, \eta^m\}$ ($m$ may be 0) be bases of $S_\omega$ and $S_\omega^{*}$ orthonormal with respect to $(\cdot, \cdot)$. Then for $p \in P_\omega$ there is a unique $\omega$-periodic solution $x$ of

$dx/dt = A(t)x + p(t)$  (4)

belonging to $S_\omega^{\perp}$ under the condition $(\eta^k, p) = 0$ ($k = 1, \ldots, m$), which is represented in the form $x = G[p]$ by a bounded linear operator $G: P_\omega \to P_\omega$. Also, if (3) and hence its adjoint system have no nontrivial $\omega$-periodic solution, then (4) always has a unique $\omega$-periodic solution given by $x = G[p]$, and this is true even if the $\omega$-periodicity is replaced by almost periodicity.

The almost periodic system (3) is said to be regular if (4) has an almost periodic solution for an arbitrary almost periodic function $p(t)$. A necessary and sufficient condition for (3) to be regular is that (3) induce an exponential dichotomy, that is, the solution space $S$ of (3) have a direct sum decomposition $S = S_{-} + S_{+}$ such that $|x(t)| \le M e^{-\gamma|t-s|}|x(s)|$ holds for $-\infty < s \le t < \infty$ if $x \in S_{+}$ and for $-\infty < t \le s < \infty$ if $x \in S_{-}$, where $M$ and $\gamma$ are positive constants. An autonomous (resp. periodic) system (3) is regular if and only if no characteristic roots (resp. characteristic exponents) have zero real parts.

C. Geometric Methods

Geometric methods are used frequently for finding a periodic solution in an autonomous or periodic case. In the autonomous case (1), a periodic solution describes a closed orbit ($t$ is a parameter of a curve) in $x$-space, which is usually called a phase space. The geometric method is used to show the existence and the stability of a closed orbit by investigating geometrically the behavior of orbits in the phase space. In such an approach, the properties of critical points and limit sets (→ 126 Dynamical Systems) are utilized frequently. This method is effective especially for 2-dimensional cases on the basis of the Poincaré-Bendixson theorem, and various results are given for a generalized Liénard's (or Duffing's) differential equation $\ddot{x} + f(x)\dot{x} + g(x) = 0$, which includes the van der Pol differential equation $\ddot{x} - \lambda(1 - x^2)\dot{x} + x = 0$ [1, 2].

In the periodic case (2), let $\omega$ ($> 0$) be a period of $X(x, t)$ with respect to $t$ and $x = \varphi(t, \alpha)$ be a solution of (2) such that $\varphi(0, \alpha) = \alpha$. Then a periodic solution of (2) is given by $x = \varphi(t, \alpha_0)$, where $\alpha_0$ satisfies $\varphi(\omega, \alpha_0) = \alpha_0$. Thus the existence of a periodic solution can be shown by investigating geometrically the existence of a fixed point of the mapping $x \to x' = \varphi(\omega, x)$ in the phase space. In such an approach, Brouwer's fixed-point theorem is utilized frequently. In the mapping $x \to x'$, it may happen that there is no fixed point but that there exists an invariant manifold. In this case, we get a periodic integral manifold.

Geometric methods give information on qualitative properties, but usually not on quantitative properties like the shape of an oscillation. Thus these methods are in general not sufficient for the analysis of the phenomena met in practice.

D. Analytic Methods

At present analytic methods are used most frequently in the study of nonlinear oscillations because, in comparison with geometric methods, they enable us to get many quantitative results in addition to the qualitative ones. However, these methods are usually efficient only for weakly nonlinear differential equations, that is, differential equations differing only slightly from linear differential equations (general nonlinear differential equations are sometimes called strongly nonlinear differential equations). In this sense, analytic methods are all perturbation methods in the wider sense [3], and the variety of the methods lies in the form of perturbation and the method of calculation.

To make use of analytic methods, we always reduce the given differential equation to a differential equation of the form

$\dot{x} = Ax + \varepsilon X(x, t, \varepsilon).$  (5)

Here $\varepsilon$ is a parameter with small absolute value, $A$ is a matrix of the form $A = \mathrm{diag}(0_p, B)$, where $0_p$ is a $p \times p$ zero matrix and $B$ is a matrix all of whose eigenvalues have nonzero real parts, and $X(x, t, \varepsilon)$ is periodic or almost periodic in $t$.

(i) When $X(x, t, \varepsilon)$ is periodic in $t$, Poincaré's perturbation method is used frequently.

(ii) In practical problems, we frequently meet the case $A = 0$, that is, the case where (5) is of the form

$\dot{x} = \varepsilon X(x, t, \varepsilon).$  (6)

In this case, the average $X_0(x, \varepsilon) = \lim_{T \to \infty}(1/T)\int_0^T X(x, t, \varepsilon)\,dt$ exists. If $dx/dt = X_0(x, 0)$ has a periodic solution $\xi(t)$ and the related variational linear system is regular, then (6) has an (almost) periodic integral manifold $x = f_\varepsilon(t, \theta)$ such that $f_\varepsilon(t, \theta) \to \xi(\theta)$ uniformly as $\varepsilon \to 0$. The method of averaging based on this fact was devised by Bogolyubov and Mitropol'skii [4] and is used frequently. The equation

$\ddot{x} + \omega^2 x = \varepsilon f(x, \dot{x}, t, \varepsilon)$  (7)

is one of the examples that can be reduced to an equation of the form (6). Equation (7) appears frequently in practical problems, and hence various convenient techniques are devised, such as the method of linearization, the asymptotic method, the method of harmonic balance [4], etc., through which we can apply the method of averaging directly to the given equation (7).

(iii) Consider an $\omega$-periodic system

$dx/dt = A(t)x + f(x, t).$  (8)

For any $x(t) \in P_\omega$, $dx/dt = A(t)x + f(x(t), t)$ is of the form (4) and it has an $\omega$-periodic solution $\sum_{k=1}^{m} a_k \xi^k + G[Nx]$, or

$Tx = \sum_{k=1}^{m} a_k \xi^k + G\left[Nx - \sum_{k=1}^{m} (\eta^k, Nx)\eta^k\right]$  (9)

under the condition $(\eta^k, Nx) = 0$ ($k = 1, \ldots, m$) as in Section B, where $N: P_\omega \to P_\omega$ is defined by $Nx = f(x(\cdot), \cdot)$, and the $\omega$-periodic solutions of (8) correspond to the fixed points of the transformation $T: P_\omega \to P_\omega$ induced by (9), where the $a_k$ in (9) are given by $a_k = (\xi^k, x)$, as is required when $x$ is a fixed point. This is the alternative or bifurcation method [6-9], which is extendable to the case of almost periodic systems under the regularity of (3) [10]. In order to obtain a fixed point of $T$, various kinds of fixed point theorems can be utilized.

(iv) In Section C and (iii) above, the search for a fixed point of an appropriate mapping is a principal device. However, the choice of a suitable domain for the mapping is a crucial problem, and usually an a priori bound for solutions is looked for. The concept of stability and hence Lyapunov's second method are effective in such situations [11].

E. Numerical Methods

Numerical methods are used for obtaining explicit forms of the oscillations. They are convenient in practical applications, since they can be used efficiently whether or not the nonlinearity of the system is weak. For an autonomous or periodic system, one can utilize the following methods as efficient means of computing periodic solutions: Newton's iterative method, the finite-element method, the Lindstedt-Poincaré method, the method of multiple scales, the method of harmonic balance, the Galerkin method, etc. [12-15].

For weakly nonlinear cases, analytic methods (that is, perturbation methods) are very efficient. However, when a parameter is fixed beforehand, it is not easy to know whether the conclusion obtained by the perturbation method is valid for the given value. For strongly nonlinear cases, it is very difficult to analyze the problems by analytic methods, and at present hardly any efficient methods exist. Numerical methods will therefore become exceedingly important in research on nonlinear oscillations. These are now of tolerable efficiency and reliability, due to progress in high-speed machine computation.

F. Nonstationary Oscillations

Physically speaking, an oscillation is a stationary state. However, when a system contains a parameter varying slowly with time, the oscillation also varies slowly with time (for example, when the length of a pendulum varies slowly, the amplitude of the pendulum also varies slowly). Such variation of oscillations in the course of time is represented by $dx/dt = \varepsilon f(x, t, \varepsilon t, \varepsilon)$, where $f(x, t, s, \varepsilon)$ is (almost) periodic in $t$, $s$. Such cases provide important problems in the theory of nonlinear oscillations; one such problem has been posed by Mitropol'skii [16] under the name nonstationary oscillations and has been investigated by means of the method of multiple scales [15].

References

[1] G. Sansone and R. Conti, Nonlinear differential equations, Pergamon, 1964. (Original in Italian, 1956.)
[2] S. Lefschetz, Differential equations: Geometric theory, Interscience, 1959.
[3] G. E. O. Giacaglia, Perturbation methods in non-linear systems, Springer, 1972.
[4] N. N. Bogolyubov and Yu. A. Mitropol'skii, Asymptotic methods in the theory of nonlinear oscillations, Gordon & Breach, 1961. (Original in Russian, 1958.)
[5] N. Minorsky, Nonlinear oscillations, Van Nostrand, 1962.
[6] J. K. Hale, Ordinary differential equations, Wiley, 1969.
[7] M. A. Krasnosel'skii, Translation along trajectories of differential equations, Amer. Math. Soc., 1968. (Original in Russian, 1966.)

[8] L. Cesari, R. Kannan, and J. D. Schuur, Nonlinear functional analysis and differential equations, Dekker, 1976.
[9] N. Rouche and J. Mawhin, Équations différentielles ordinaires II, Masson, 1973.
[10] M. A. Krasnosel'skii, V. Sh. Burd, and Yu. S. Kolesov, Nonlinear almost periodic oscillations, Wiley, 1973. (Original in Russian, 1970.)
[11] T. Yoshizawa, Stability theory and the existence of periodic solutions and almost periodic solutions, Springer, 1975.
[12] C. Hayashi, Nonlinear oscillations in physical systems, McGraw-Hill, 1964.
[13] A. Andronov, A. Vitt, and S. Khaikin, Theory of oscillations, Addison-Wesley, 1966. (Original in Russian, 1959.)
[14] M. Urabe, Nonlinear autonomous oscillations, Academic Press, 1967.
[15] A. H. Nayfeh and D. T. Mook, Nonlinear oscillations, Wiley, 1979.
[16] Yu. A. Mitropol'skii, Problems of the asymptotic theory of nonstationary vibrations, Daniel, 1965. (Original in Russian, 1964.)

291 (XIII.12)
Nonlinear Problems

A. General Remarks

Nonlinear problems deal with nonlinear mappings or operators and the related equations. Until recently it was customary to consider nonlinear problems as belonging to applied mathematics and the physical sciences. However, nonlinear problems now belong to modern mathematics. Many phenomena in mathematical physics are essentially described by nonlinear equations, e.g., the motions of several particles or of viscous or compressible fluids (→ 420 Three-Body Problem, 204 Hydrodynamical Equations). Some of these equations are approximated by linear equations only when the variables appearing in the equations are restricted to very small domains; they are treated by perturbation methods when the variables stay in comparatively small domains. If these requirements cannot be met and we have to deal with equations in which the variation of the variables is not negligible, nonlinear problems certainly arise.

The methods of solution of nonlinear problems are not as powerful or general as those for linear differential equations. For instance, the principle of superposition of solutions does not hold for nonlinear problems, and therefore Fourier methods are no longer applicable. Indeed, some nonlinear problems can be dealt with only by means of very particular or ad hoc devices.

B. Methods Used in Nonlinear Problems

Consider a nonlinear equation

$G(x) = 0,$  (1)

where $G$ is a nonlinear mapping or operator of a subset $S$ of a linear space $X$ into itself. If we put $G = I - F$ ($I$ is the identity), equation (1) becomes

$x = F(x).$  (2)

Then a solution of (2) is a fixed point of $F$. Therefore fixed-point theorems of various kinds are useful for solving (2) (→ 286 Nonlinear Functional Analysis).

Let $x^{(0)} \in S$, and suppose that we can define $x^{(k)}$, $k = 1, 2, \ldots$, by

$x^{(k)} = F(x^{(k-1)}), \quad k = 1, 2, \ldots$

If $F$ is continuous and the sequence $x^{(k)}$ converges to a point $x \in S$, then $x$ is a solution of (2), that is, a fixed point of $F$. Such a method of constructing an approximate sequence by iteration is called the iterative method. Newton's iterative process, given by

$x^{(k)} = x^{(k-1)} - [G'(x^{(k-1)})]^{-1} G(x^{(k-1)}), \quad k = 1, 2, \ldots,$

is one such method, where $G'$ denotes the Fréchet derivative of $G$.

If $X$ is an infinite-dimensional space, many concepts and methods of the theory of functional analysis can be used for a number of nonlinear problems (→ 286 Nonlinear Functional Analysis).

We note that there exist nonlinear transformations that change nonlinear equations into linear ones. For example, the hodograph method, which is often applied in hydrodynamics, consists of reducing a system of quasilinear partial differential equations of the form

$A_i(u, v)u_x + B_i(u, v)u_y + C_i(u, v)v_x + D_i(u, v)v_y = 0 \quad (i = 1, 2)$

to linear differential equations

$A_i y_v - B_i x_v - C_i y_u + D_i x_u = 0 \quad (i = 1, 2)$

by means of the hodograph transformation, which changes the independent variables from $x, y$ to $u, v$ (→ 205 Hydrodynamics).

C. Nonlinear Algebraic and Transcendental Equations

Consider a system of equations

$f_i(x_1, \ldots, x_n) = 0 \quad (i = 1, \ldots, n).$  (3)
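The iterative method of Section B, $x^{(k)} = F(x^{(k-1)})$, and Newton's iterative process can both be tried on small instances of (2) and (3). The sketch below is only illustrative: the scalar map $F(x) = \cos x$, the two-equation system $x^2 + y^2 - 4 = 0$, $xy - 1 = 0$, the tolerances, and all function names are assumptions introduced for this example, and the linear system for the Newton correction is solved by Cramer's rule purely for brevity.

```python
import math


def fixed_point(F, x0, tol=1e-12, max_iter=1000):
    """Iterate x_k = F(x_{k-1}) until two successive iterates agree to tol."""
    x = x0
    for _ in range(max_iter):
        x_new = F(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")


def f_example(x, y):
    """Hypothetical system f_1 = x^2 + y^2 - 4, f_2 = x*y - 1."""
    return (x * x + y * y - 4.0, x * y - 1.0)


def jac_example(x, y):
    """Jacobian matrix of f_example, rows (df_j/dx, df_j/dy)."""
    return ((2.0 * x, 2.0 * y), (y, x))


def newton_2d(f, jac, start, tol=1e-12, max_iter=50):
    """Newton's process for two equations in two unknowns.

    Each step solves J(x_{k-1}) * (x_k - x_{k-1}) = -f(x_{k-1}).
    """
    x, y = start
    for _ in range(max_iter):
        f1, f2 = f(x, y)
        (a, b), (c, d) = jac(x, y)
        det = a * d - b * c
        if det == 0.0:
            raise ZeroDivisionError("singular Jacobian")
        # Cramer's rule for a*dx + b*dy = -f1, c*dx + d*dy = -f2.
        dx = (-f1 * d + b * f2) / det
        dy = (-a * f2 + c * f1) / det
        x, y = x + dx, y + dy
        if max(abs(dx), abs(dy)) < tol:
            return x, y
    raise RuntimeError("Newton iteration did not converge")


if __name__ == "__main__":
    print(fixed_point(math.cos, 1.0))
    print(newton_2d(f_example, jac_example, (2.0, 0.5)))
```

Both routines depend on a starting point close enough to the solution; convergence is not guaranteed in general, which is why each loop is capped and raises an error instead of returning a doubtful value.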

Newton's iterative process can be applied as follows. Starting with a point $x^{(0)} = (x_1^{(0)}, \ldots, x_n^{(0)})$ lying near the desired solution, we define $x^{(k)} = (x_1^{(k)}, \ldots, x_n^{(k)})$, $k = 1, 2, \ldots$, by solving the system of equations

$\sum_{i=1}^{n} \frac{\partial f_j}{\partial x_i}(x^{(k-1)})\,(x_i^{(k)} - x_i^{(k-1)}) = -f_j(x^{(k-1)}) \quad (j = 1, \ldots, n).$

Under some conditions the iteration $\{x^{(k)}\}$ converges to the solution (→ 301 Numerical Solution of Algebraic Equations).

If the system (3) is a real one, (3) is equivalent to $\sum f_i^2 = 0$. It is clear that a method of obtaining a minimum of a function $f(x_1, \ldots, x_n)$ is applicable to solving $\sum f_i^2 = 0$. Taking a point $x^{(0)}$, we define

$x^{(k)} = x^{(k-1)} - \lambda_{k-1}\nabla f(x^{(k-1)}), \quad k = 1, 2, \ldots,$

where $\nabla f = (\partial f/\partial x_1, \ldots, \partial f/\partial x_n)$, and let $\lambda_{k-1}$ satisfy

[...] $\quad (\lambda > 0).$

Under suitable assumptions, a subsequence $x^{(k_j)}$ converges to $x$ such that $\nabla f(x) = 0$ and $f(x^{(k_j)})$ decreases monotonically to $f(x)$ [1].

Let $f$ be a continuous mapping from a domain $D$ of $\mathbf{R}^n$ or $\mathbf{C}^n$ into itself. Then for any $x^{(0)} \in D$, the iteration $x^{(k)}$ can always be defined by $x^{(k)} = f(x^{(k-1)})$. The problem of the behavior of $x^{(k)}$ is an interesting and important one in pure mathematics; recent work in the physical sciences is yielding many new concepts related to this problem (→ 126 Dynamical Systems, 433 Turbulence and Chaos).

D. Nonlinear Differential Equations

Consider a nonlinear system of ordinary differential equations

$dx/dt = f(t, x) \quad (x \in \mathbf{R}^n).$

The initial value problem with initial condition $x(\tau) = \xi$ is equivalent to the problem of solving the nonlinear integral equation

$x(t) = \xi + \int_{\tau}^{t} f(s, x(s))\,ds.$  (4)

The solution of (4) is a fixed point of the operator $T: \varphi(t) \to \xi + \int_{\tau}^{t} f(s, \varphi(s))\,ds$ defined for a suitable function space. Therefore fixed-point theorems are applicable to $T$, and the iterative method for $T$ is called the method of successive approximation. Also the method of the Cauchy polygon is useful (→ 316 Ordinary Differential Equations (Initial Value Problems)). These ideas are extended to an abstract Cauchy problem,

$du/dt = A(t)u \quad (t > 0), \quad u(+0) = a,$

in a Banach space $X$, where $a \in X$ and $A(t)$ is a nonlinear operator (→ 286 Nonlinear Functional Analysis).

For extensive studies of nonlinear ordinary and partial differential equations → 314 Ordinary Differential Equations (Asymptotic Behavior of Solutions), 290 Nonlinear Oscillation, 394 Stability, 321 Partial Differential Equations (Initial Value Problems), 323 Partial Differential Equations of Elliptic Type, 325 Partial Differential Equations of Hyperbolic Type.

Nonlinear differential equations of special types appear in many fields of pure and applied mathematics, e.g., the Monge-Ampère equation and the equation for minimal surfaces in differential geometry (→ 183 Global Analysis, 275 Minimal Submanifolds), and the Toda lattice equation and the Korteweg-de Vries equation in mathematical physics (→ 287 Nonlinear Lattice Dynamics, 387 Solitons).

E. Nonlinear Problems of Control Systems

The basic equation for a control system in which the state of the controlled object can be represented by an $n$-vector $x$ is given by the following system of differential equations [3]:

$\dot{x} = Ax - \varphi(\sigma)b, \quad \dot{\xi} = \varphi(\sigma), \quad \sigma = c'x - \gamma\xi,$  (5)

where $\dot{x}$ and $\dot{\xi}$ stand for $dx/dt$ and $d\xi/dt$, respectively, $\xi$ is a scalar function representing the control, $b$, $c$ are constant $n$-vectors, $\gamma$ is a constant number, $c'$ is the transpose of $c$, and $A$ is a constant $n \times n$ matrix whose characteristic roots have negative real parts. Furthermore, we assume that when the control has no effect upon the system, $x$ is determined by $\dot{x} = Ax$. The quantities under consideration are all real. Finally, the function $\varphi = \varphi(\sigma)$ is a scalar function characteristic of the control mechanism. Generally, $\varphi$ is nonlinear in $\sigma$, and hence equation (5) is nonlinear. Normally we assume that $\varphi$ has the following properties: (i) $\varphi(\sigma)$ is a real-valued continuous function on $(-\infty, \infty)$ with $\varphi(0) = 0$ and $\sigma\varphi(\sigma) > 0$ for $\sigma \neq 0$; (ii) $\int_0^{\pm\infty} \varphi(\sigma)\,d\sigma = +\infty$. We say that (5) is absolutely stable if

$x(t) \to 0, \quad \xi(t) \to 0 \quad$ as $t \to +\infty$

for any choice of $\varphi$ subject to (i) and (ii) and for every solution of (5). In the study of control systems an important problem is to obtain a necessary and sufficient condition for the system to be absolutely stable. In this connection, we have the following result, due to M. V.

Popov: (5) is absolutely stable if there exists a nonnegative $q$ such that

$\mathrm{Re}\{(1 + i\omega q)(c'(i\omega I - A)^{-1}b)\} + q\gamma > 0$  (6)

for any real $\omega$, where $I$ is the $n \times n$ identity matrix. Conversely, if the absolute stability of (5) is given by means of the Lyapunov function (→ 394 Stability)

$V(x, \sigma) = x'Bx + \alpha\sigma^2 + \sigma f'x + \beta\int_0^{\sigma} \varphi(\sigma)\,d\sigma,$

where $B$ is a constant matrix and $f$ is a constant vector, then there exists a nonnegative $q$ for which (6) is satisfied for all real $\omega$.

F. Nonlinear Equations in Applied Mathematics

Some examples of nonlinear problems are given here (→ also 205 Hydrodynamics; 318 Oscillations).

(1) The nonlinear differential-difference equation

$du(t)/dt = (a - u(t - 1))u(t)$

is called the Cherwell-Wright differential equation [4]. Given the initial condition $u(t) = g(t)$ ($0 \le t \le 1$, $g(t)$ is a given continuous function), its solution is uniquely determined for $0 \le t < \infty$. If $a < 0$ and $g(1) > 0$, then $u(t) \to 0$ as $t \to +\infty$; if $a = 0$ and $g(1) < 0$, then $u(t) \to -\infty$ as $t \to \infty$. For $a > 0$, $u(t) \to -\infty$ as $t \to \infty$ if $g(1) < 0$, while $u(t)$ either approaches $a$ monotonically ($0 < a \le 1/e$) or oscillates (boundedly) around $a$ ($a > 1/e$) if $g(1) > 0$. In particular, we have damping oscillations for $a < 3/2$, while oscillatory solutions without damping appear for $a > \pi/2$.

(2) If we regard a star as a gas sphere and assume the polytropic relation $p = K\rho^{\gamma}$ ($K$ and $\gamma$ are constants) between the pressure $p$ and the density $\rho$ at each point inside the star, then we have a differential equation of the second order that determines the density distribution. This equation constitutes the basis of the classical theory of the internal structure of the stars and is called Emden's differential equation or the polytropic differential equation. It reads:

$\frac{1}{\xi^2}\frac{d}{d\xi}\left(\xi^2\frac{d\theta}{d\xi}\right) = -\theta^{n},$

where $\gamma = 1 + 1/n$, $\rho = \lambda\theta^{n}$, $r = a\xi$, $a = \left((n + 1)K\lambda^{\gamma - 2}/(4\pi G)\right)^{1/2}$, with $r$ the distance from the center, $G$ the universal gravitational constant, and $\lambda$ an arbitrary constant. The solution that satisfies the conditions $\theta = 1$ and $d\theta/d\xi = 0$ at $\xi = 0$ is called the Lane-Emden function of index $n$. Emden's equation is invariant under the transformation $\xi \to A\xi$, $\theta \to A^{-2/(n-1)}\theta$ (with $A$ an arbitrary constant). Although Emden's equation can be reduced to an equation of the first order, it has not been possible to solve it analytically except for the cases $n = 0$, 1, and 5. For certain values of $n$ between 0.5 and 6, Emden gave numerical solutions, which were refined later by Green, D. H. Sadler, and D. C. Miller [5].

(3) Caianiello's differential equations, which describe the state of a network of neurons [6], are

$x_i(t + \tau) = Y\left[\sum_j \sum_r a_{ij}^{(r)} x_j(t - r\tau) - \theta_i\right],$

where the function $x_i(t)$ represents the state of the $i$th neuron at the time $t$ and takes only the values 0 and 1. The state of the system is to be considered at the discrete times $t = 0, \tau, 2\tau, \ldots$ The (real) coefficient $a_{ij}^{(r)}$ represents the weight of the effect of hysteresis in the relay process from the cell $j$ to the cell $i$. The nonnegative integer $\theta_i$ is the threshold value of the cell $i$. $Y[x]$ is the unit step function that is equal to 1 for $x > 0$ and vanishes for $x < 0$.

(4) The following Hodgkin-Huxley differential equation arises in the study of conduction and excitation in nerve systems [7]:

$\frac{\partial^2 V}{\partial x^2} = \frac{2r_0}{R_0}\left\{C_0\frac{\partial V}{\partial t} + g_1 m^3 h(V - V_1) + g_2 n^4 (V - V_2) + g_3 (V - V_3)\right\},$

$\frac{\partial m}{\partial t} = -(\alpha_1(V) + \beta_1(V))m + \alpha_1(V),$

$\frac{\partial h}{\partial t} = -(\alpha_2(V) + \beta_2(V))h + \alpha_2(V),$

$\frac{\partial n}{\partial t} = -(\alpha_3(V) + \beta_3(V))n + \alpha_3(V),$

where $\alpha_i$, $\beta_i$ ($1 \le i \le 3$) are given functions of $V$, and $r_0$, $R_0$, $C_0$, $g_i$, $V_i$ ($1 \le i \le 3$) are constants. The unknown function $V$ is sought in the domain $0 \le x < \infty$, $0 \le t < \infty$, while the initial values of $V$, $m$, $h$, $n$, and the boundary value of $V$ are given at $t = 0$ and $x = 0$, respectively.

References

[1] T. L. Saaty and J. Bram, Nonlinear mathematics, McGraw-Hill, 1964 (Dover, 1981).
[2] T. L. Saaty, Modern nonlinear equations, McGraw-Hill, 1967 (Dover, 1981).
[3] S. Lefschetz, Stability of nonlinear control systems, Academic Press, 1965.
[4] S. Kakutani and L. Markus, On the non-linear difference-differential equation $y'(t) = (A - By(t - \tau))y(t)$, Contributions to the theory of nonlinear oscillations, Ann. Math. Studies, Princeton Univ. Press, 1958, vol. 4, 1-18.
292 A 1096
Nonlinear Programming

[5] R. Bellman, Stability theory of differential equations, McGraw-Hill, 1953.
[6] E. R. Caianiello, Outline of a theory of thought-process and thinking machines, J. Theoret. Biol., 1 (1961), 204-235.
[7] A. L. Hodgkin and A. F. Huxley, A quantitative description of membrane current and its application to conduction and excitation in nerve, J. Physiol. (London), 117 (1952), 500-544.

292 (XIX.4)
Nonlinear Programming

A. Problems

A nonlinear programming problem is a type of mathematical programming problem where it is required to minimize or maximize a nonlinear function $\theta(x)$ of an $n$-vector variable $x$ defined in a closed connected set $X^0$ with a set of linear or nonlinear constraints. Minimization or maximization of a continuously differentiable function $\theta(x)$ under the equality condition expressed by a set of continuously differentiable functions has been traditionally dealt with by the method of Lagrange multipliers. Hence a typical nonlinear programming problem is usually formulated as follows.

(NLP) Minimize $\theta(x)$ under the condition $x \in X^0 \subset R^n$ and $g_i(x) \le 0$ for $i = 1, 2, \ldots, m$.

Or, equivalently, determine the set of all $\bar{x}$ such that $\theta(\bar{x}) = \min_{x \in C} \theta(x)$, where $C = \{x \mid x \in X^0 \text{ and } g(x) \le 0\}$.

Here we need only consider minimization, since maximization problems can be converted to minimization problems by virtue of the obvious relation $\max \theta = -\min(-\theta)$.

The set $C$ is known as the feasible region or the constraint set, and $\bar{x}$ is called an optimal solution or simply a solution. In many nonlinear programming problems $X^0$ is $R^n$. If $X^0 = R^n$ and $\theta$ and $g$ are linear functions on $R^n$, then the problem becomes a linear programming problem (→ 255 Linear Programming). The problem of minimizing a quadratic function subject to linear constraints is called the quadratic programming problem (→ 349 Quadratic Programming). If $X^0$ is a convex set, $\theta$ is convex (or concave), and the $g_i$ are convex on $X^0$, then the minimization (or maximization) problem is known as a convex (or concave) programming problem. Convex and concave functions are important in nonlinear programming, because they admit reasonably straightforward sufficient conditions for optimality and also because they constitute the only important class of functions for which necessary optimality conditions can be given without the differentiability condition. Subject to suitable modifications, the method of Lagrange multipliers can also be applied to the solution of nonlinear programming problems.

The Lagrangian function $\psi$ associated with the minimization problem (NLP) is defined by

$$\psi(x, u) = \theta(x) + u'g(x),$$

where $u = (u_1, \ldots, u_m)$ is the vector of Lagrange multipliers and $u'$ denotes its transpose.

A pair $(\bar{x}, \bar{u})$ is called a saddle point of $\psi(x, u)$, provided $\bar{x} \in X^0$, $\bar{u} \in R^m$, $\bar{u} \ge 0$, and $\psi(\bar{x}, u) \le \psi(\bar{x}, \bar{u}) \le \psi(x, \bar{u})$ for all $x \in X^0$ and all $u \in R^m$ such that $u \ge 0$. It follows easily that:

(1) (H. Uzawa [18]) If $(\bar{x}, \bar{u})$ is a saddle point of $\psi(x, u)$, then $\bar{x}$ is an optimal solution.

(2) (Kuhn and Tucker [11]) Assume that $X^0$ is open and convex, and that $\theta$ and $g$ are differentiable and convex. If there exists a pair $(\bar{x}, \bar{u})$ such that

$$\nabla\theta(\bar{x}) + \bar{u}'\nabla g(\bar{x}) = 0, \quad \bar{x} \in C, \quad \bar{u}'g(\bar{x}) = 0, \quad \bar{u} \ge 0$$

(where $\nabla\theta(x)$ and $\nabla g(x)$ denote, respectively, the gradient vector of $\theta$ at $x$ and the Jacobian matrix of $g$ at $x$), then $\bar{x}$ is an optimal solution.

B. Necessary Conditions for Optimality

In Section A sufficient conditions for optimality were given. Some necessary conditions are also known.

(3) When $\theta(x)$ is continuously differentiable and all the $g_i$ are linear, that is, when the condition is expressed as $Ax \le b$ and $x \ge 0$, where $A$ is an $m \times n$ matrix and $b$ an $m$-vector, $\bar{x}$ is an optimal solution only if $\bar{x}$ is an optimal solution of the following linear programming problem.

(LP') Minimize $v'x$ under the condition that $Ax \le b$, $x \ge 0$, where $v = \nabla\theta(\bar{x})$.

By applying the duality theorem of linear programming, it can be proved that there exists a vector $\bar{u} \ge 0$ such that

$$\nabla\psi(\bar{x}, \bar{u}) = \nabla\theta(\bar{x}) + \bar{u}'A = 0.$$

(4) (F. John [9]) Assume that $X^0$ is open, and that $\theta$ and $g$ are differentiable. Then, if $\bar{x}$ is a solution of the problem (NLP), there exist $\bar{u}_0 \ge 0$ and $\bar{u} \in R^m$ such that

$$\bar{u}_0\nabla\theta(\bar{x}) + \bar{u}'\nabla g(\bar{x}) = 0, \quad g(\bar{x}) \le 0, \quad \bar{u}'g(\bar{x}) = 0, \quad \bar{u} \ge 0.$$

Now we consider the following: The vector-valued function $g$ is said to satisfy Guignard's constraint qualification at an inner point $\bar{x}$ of $X^0$ if any vector $y$ satisfying the linear inequalities $\nabla g_i(\bar{x}) \cdot y \le 0$ for $i \in I = \{i \mid g_i(\bar{x}) = 0\}$ is in the convex hull spanned by the vectors tangent to the set $C$ at $\bar{x}$.
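As a hedged illustration of the Kuhn-Tucker conditions stated in (2) of Section A, the following Python sketch checks stationarity, feasibility, complementary slackness, and nonnegativity of the multipliers at a candidate pair. The function name `kkt_satisfied`, the tolerance, and the toy problem (minimize $(x_1-2)^2 + (x_2-2)^2$ subject to $x_1 + x_2 - 2 \le 0$) are assumptions made for this example; they do not come from the article.

```python
def kkt_satisfied(grad_theta, g, jac_g, x, u, tol=1e-8):
    """Return True if (x, u) satisfies the Kuhn-Tucker conditions up to tol."""
    gx = g(x)              # constraint values g_i(x)
    jac = jac_g(x)         # jac[i][k] = d g_i / d x_k (Jacobian of g)
    grad = grad_theta(x)   # gradient of the objective theta
    # Stationarity: grad theta(x) + u' grad g(x) = 0
    stationarity = [
        grad[k] + sum(u[i] * jac[i][k] for i in range(len(u)))
        for k in range(len(x))
    ]
    stationary = all(abs(s) <= tol for s in stationarity)
    feasible = all(gi <= tol for gi in gx)                               # x in C
    complementary = abs(sum(ui * gi for ui, gi in zip(u, gx))) <= tol    # u'g(x) = 0
    nonnegative = all(ui >= -tol for ui in u)                            # u >= 0
    return stationary and feasible and complementary and nonnegative


# Toy convex problem (hypothetical, for illustration only).
def grad_theta(x):
    return [2.0 * (x[0] - 2.0), 2.0 * (x[1] - 2.0)]


def g(x):
    return [x[0] + x[1] - 2.0]


def jac_g(x):
    return [[1.0, 1.0]]


print(kkt_satisfied(grad_theta, g, jac_g, [1.0, 1.0], [2.0]))  # True
print(kkt_satisfied(grad_theta, g, jac_g, [0.0, 0.0], [0.0]))  # False
```

Since the toy objective and constraint are convex and differentiable, the pair $(1, 1)$, $\bar{u} = 2$ passing the check is consistent with $(1, 1)$ being optimal; the origin fails only the stationarity test.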

(5) (Guignard [7]) Let $X^0$ be an open set, let $\bar{x}$ be an optimal solution of (NLP), let $\theta$ and $g$ be differentiable at $\bar{x}$, and assume that $g$ satisfies the Guignard constraint qualification. Then there exists $\bar{u} \in R^m$ such that $\nabla\theta(\bar{x}) + \bar{u}'\nabla g(\bar{x}) = 0$, $g(\bar{x}) \le 0$, and $\bar{u}'g(\bar{x}) = 0$, $\bar{u} \ge 0$.

Guignard's constraint qualification is satisfied if either of the following conditions holds:

(L) The vectors $\nabla g_i(\bar{x})$ for $i \in I$ are linearly independent.

(S) (Slater's constraint qualification) The $g_i$ are convex and $X^0$ is a convex set; and there exists a vector $x$ such that $g_i(x) < 0$ for all $i$.

With convexity of $\theta$ and $g_i$, differentiability is not required:

(6) (Kuhn and Tucker [11]) Let $X^0$ be a convex set, let $\theta$ and $g$ be convex on $X^0$, and assume that $g$ satisfies Slater's constraint qualification. If $\bar{x}$ is an optimal solution, then there exists $\bar{u} \in R^m$, $\bar{u} \ge 0$, such that $\bar{u}'g(\bar{x}) = 0$ and $(\bar{x}, \bar{u})$ is a saddle point of $\psi(x, u) = \theta(x) + u'g(x)$.

C. Sensitivity Analysis

Now consider the following class of problems.

($P_c$) Minimize the function $\theta(x)$ under the condition that $g(x) \le c$, where $c$ is a real $m$-vector.

We denote by $X_c$ the set of the solutions of the problem ($P_c$) and denote $\theta_c^* = \theta(x_c)$, $x_c \in X_c$. Suppose that Guignard's condition is satisfied for each $c$ and that the set of Lagrange multipliers $\Lambda_c$ is nonempty. Then for any real vector $a$, we have

$$\inf a'u_c \le \lim_{h \to +0} \frac{1}{h}\bigl(\theta^*_{c+ha} - \theta^*_c\bigr) \le \sup a'u_c, \qquad u_c \in \Lambda_c.$$

Therefore, if the Lagrange multiplier is uniquely determined for some $c$, then $u_c$ represents the vector of the rates of increments of the objective function to the small increments of the components of the constraints vector. Hence the components of the Lagrange-multiplier vector are called the imputed prices or shadow costs of the constraints; these have important economic implications, especially when the objective function is expressed in terms of money or profits.

More generally, we can consider the following class of problems.

($GP_c$) Minimize the function $\theta(x, c)$ under the condition that $g(x, c) \le 0$, where $c$ is a real parameter.

Denote by $\bar{x}(c)$ and $\bar{u}(c)$ the solution and the Lagrange multiplier of ($GP_c$) corresponding to the parameter $c$, respectively, and let $\theta^*(c) = \theta(\bar{x}(c), c)$. Then under a set of regularity conditions we have

$$\nabla\theta^*(c) = \nabla_c\theta(\bar{x}(c), c) + \bar{u}(c)'\nabla_c g(\bar{x}(c), c),$$

where $\nabla_c$ denotes the gradient of $\theta$ or $g$ with respect to the parameter.

D. Duality

A duality theorem in mathematical programming is the statement of a certain relationship between two problems. This relationship has the following two aspects: (i) one problem is a constrained minimization problem and the other is a constrained maximization problem; (ii) the existence of a solution to one of these problems ensures the existence of a solution to the other, and in this case their respective values are equal.

Let $\psi(x, u)$ be the Lagrangian form of (NLP), and define $\omega(u) = \inf_{x \in X^0} \psi(x, u)$. Then we can state the following two problems.

(P) (primary problem) Minimize $\theta(x)$ under the condition that $x \in X^0$ and $g(x) \le 0$.

(D) (dual problem) Maximize $\omega(u)$ under the condition $u \ge 0$.

If $(\bar{x}, \bar{u})$ gives a saddle point of $\psi(x, u)$, then

$$\theta(\bar{x}) = \psi(\bar{x}, \bar{u}) = \omega(\bar{u}).$$

The dual problem can be formulated alternatively as follows.

(D) Maximize $\psi(x, u) = \theta(x) + u'g(x)$ subject to $(x, u) \in Y = \{(x, u) \mid x \in X^0,\ u \in R^m,\ u \ge 0,\ \nabla_x\psi(x, u) = 0\}$ (where $\nabla_x\psi(x, u)$ denotes the vector whose components are the partial derivatives $\partial\psi(x, u)/\partial x_i$ for $i = 1, \ldots, n$).

There are a number of duality theorems related to problems (P) and (D); two such theorems are as follows:

(1) (P. Wolfe [20]) Suppose that $X^0$ is open and convex, that $\theta$ and $g$ are differentiable and convex, and that $g$ satisfies the Kuhn-Tucker constraint qualification. Then, if $\bar{x}$ is a solution of (P), there exists a $\bar{u} \in R^m$ such that $(\bar{x}, \bar{u})$ is a solution of (D) and $\theta(\bar{x}) = \psi(\bar{x}, \bar{u})$.

(2) (O. L. Mangasarian and J. Ponstein [13]) Suppose that $X^0$ is open and convex, that $\theta$ and $g$ are differentiable and convex, and that $(\hat{x}, \hat{u})$ is a solution of (D). If $\psi(x, \hat{u})$ is strictly convex in some neighborhood of $\hat{x}$, then $\hat{x}$ is a solution of (P) and $\theta(\hat{x}) = \psi(\hat{x}, \hat{u})$.

The above two problems (P) and (D) are not symmetric. The notion of symmetric duality was introduced by G. B. Dantzig, E. Eisenberg, and R. W. Cottle:

Primary: Minimize

$$F(x, u) = K(x, u) - u'\nabla_u K(x, u),$$

subject to the constraints

$$\nabla_u K(x, u) \le 0, \quad x \ge 0, \quad \text{and} \quad u \ge 0;$$

Dual: Maximize

$$G(x, u) = K(x, u) - x'\nabla_x K(x, u),$$

subject to the constraints

$$\nabla_x K(x, u) \ge 0, \quad x \ge 0, \quad \text{and} \quad u \ge 0,$$

where $K$ is continuously differentiable in $(x, u) \in R^n \times R^m$.

(3) Dantzig et al. proved [6] the existence of a common optimal solution $(\bar{x}, \bar{u})$ to both the primary and dual problems, provided (i) an optimal solution $(\bar{x}, \bar{u})$ to the primary problem exists, (ii) $K$ is convex in $x$ for each $u$ and concave in $u$ for each $x$, and (iii) $K$ is twice differentiable and the matrix of second partials $(\partial^2 K/\partial u_i \partial u_j)$ is negative definite at $(\bar{x}, \bar{u})$.

Rockafellar [15] gave another expression of the duality relation: Define $F(x, y) = \theta(x)$ if $g(x) \le y$ and $= \infty$ otherwise, and denote $\varphi(y) = \inf_{x \in X^0} F(x, y)$. Then $\theta(\bar{x}) = \varphi(0)$. For any nonlinear function $\varphi(y)$, the conjugate $\varphi^*(\lambda)$ is defined by

$$\varphi^*(\lambda) = \sup_y \{\lambda'y - \varphi(y)\}.$$

(4) (Rockafellar [15]) $\sup \omega(u) = \varphi^{**}(0) = \operatorname{clco} \varphi(0)$, where $\operatorname{clco} \varphi(y)$ denotes the closed convex hull or maximum convex minorant of $\varphi(y)$ defined by $\operatorname{clco} \varphi(y) = \sup_{\lambda} \{\lambda'y \mid \lambda'z \le \varphi(z) \text{ for all } z\}$. It follows from the above that $\inf \theta(x) = \sup \omega(u)$ if and only if $\varphi(0) = \operatorname{clco} \varphi(0)$, which holds true if $\varphi(y)$ is convex.

Further forms of the duality theorem hold for linear or quadratic programming problems (→ 255 Linear Programming, 349 Quadratic Programming).

E. Algorithms

In a limited class of problems, i.e., when the objective function is quadratic and the constraints are linear (→ 349 Quadratic Programming), the optimal solution can be obtained by solving a system of linear equations by the simplex method or other algorithms; but in most nonlinear programming the solution is calculated by some kind of iterative procedure. Note that even when the constraints are given as equalities and all the functions involved are continuously differentiable, so that the optimal solution is explicitly given as a solution of a set of simultaneous equations, we usually require some iteration procedure, such as the Newton-Raphson algorithm, to obtain numerically the solution with preassigned accuracy; and the iteration is not always easy if the functions are sufficiently complex.

Several iterative procedures for solving nonlinear programming problems have been proposed. Since the simplex method is a powerful tool in linear programming, one type of approach is to obtain an approximately optimal solution by approximating the objective and the constraint functions by piecewise linear functions and then applying linear programming techniques to get the approximately optimal solution within each region. Another is to change the constrained problem to a nonconstrained one by introducing a sufficiently large number $M$, called the penalty, and then minimizing

$$\theta_M(x) = \theta(x) + M \sum_i \max(g_i(x), 0)$$

without the constraint. This is called the penalty method.

The third and most generally applicable technique is the gradient method, of which several variations are known:

(i) Arrow-Hurwicz-Uzawa gradient method [4]. Concave or convex programming problems can be solved by finding a saddle point of the Lagrangian function $\psi(x, u)$. Let $\varphi(x, u)$ be strictly concave and of class $C^2$ in the $n$-vector $x \ge 0$ and convex and of class $C^2$ in the $m$-vector $u \ge 0$ and possess a saddle point $(\bar{x}, \bar{u})$. To approach a saddle point of $\varphi(x, u)$ it is natural to devise a gradient process of the form

$$\frac{dx_i}{dt} = \frac{\partial\varphi}{\partial x_i}, \qquad \frac{du_j}{dt} = -\frac{\partial\varphi}{\partial u_j}. \tag{1}$$

To keep the variables in the positive orthant, we need to modify (1), and we consider the following system of differential equations:

$$\frac{dx_i}{dt} = \begin{cases} 0 & \text{if } x_i = 0 \text{ and } \dfrac{\partial\varphi}{\partial x_i} < 0, \\[1ex] \dfrac{\partial\varphi}{\partial x_i} & \text{otherwise} \end{cases} \qquad (i = 1, \ldots, n),$$

$$\frac{du_j}{dt} = \begin{cases} 0 & \text{if } u_j = 0 \text{ and } \dfrac{\partial\varphi}{\partial u_j} > 0, \\[1ex] -\dfrac{\partial\varphi}{\partial u_j} & \text{otherwise} \end{cases} \qquad (j = 1, \ldots, m).$$

Under certain regularity hypotheses, there exists a unique solution $(x(t), u(t))$ of the system with any initial point $(x^0, u^0)$, and the $x$ component $x(t)$ of the solution converges to $\bar{x}$ as $t \to \infty$.

Applying the above results to the Lagrangian function $\psi(x, u)$ we can solve the concave or convex programming problem.

(ii) Rosen's gradient projection method [16]. If a point $x^0$ of the feasible region does not give a solution for the minimization problem (P), then we look for a feasible point with a lower function value by proceeding from $x^0$ in the direction of the gradient of the function $-\theta(x)$. The method fails if $x^0$ is a boundary point and if the gradient vector points toward the exterior of the feasible region. Rosen's method [16] is to project the gradient onto the boundary of the feasible region and then proceed in the direction of this projection. In this manner, we remain on the boundary of the feasible region.
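The projection idea described in (ii) can be sketched in Python under strong simplifying assumptions: a single linear constraint $a'x \le b$, a fixed step length, and a smooth convex objective. This is not Rosen's full method, which handles several active constraints and decides when to leave the boundary; the names `gradient_projection` and `project_onto_boundary`, the step size, and the toy problem are all hypothetical choices made for illustration.

```python
def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))


def project_onto_boundary(d, a):
    """Remove from d its component along the constraint normal a."""
    s = dot(a, d) / dot(a, a)
    return [di - s * ai for di, ai in zip(d, a)]


def gradient_projection(grad_theta, a, b, x, step=0.1, max_iter=500, tol=1e-8):
    """Minimize theta over {x : a'x <= b}, starting from a feasible point x."""
    x = list(x)
    for _ in range(max_iter):
        d = [-gk for gk in grad_theta(x)]   # direction of the gradient of -theta
        slack = b - dot(a, x)               # distance budget to the boundary
        outward = dot(a, d)                 # > 0 means d points toward the exterior
        if slack <= tol and outward > 0:
            # On the boundary with an outward-pointing direction:
            # project the direction onto the boundary hyperplane.
            d = project_onto_boundary(d, a)
            t = step
        elif outward > 0:
            # Interior point: do not step past the boundary.
            t = min(step, slack / outward)
        else:
            t = step
        if dot(d, d) < tol * tol:           # (projected) gradient vanishes: stop
            break
        x = [xi + t * di for xi, di in zip(x, d)]
    return x


# Toy problem (assumed): minimize (x1-2)^2 + (x2-2)^2 subject to x1 + x2 <= 2.
def grad_theta(x):
    return [2.0 * (x[0] - 2.0), 2.0 * (x[1] - 2.0)]


x_opt = gradient_projection(grad_theta, a=[1.0, 1.0], b=2.0, x=[0.0, 0.5])
print(x_opt)  # close to [1.0, 1.0]
```

In this example the iterates first move through the interior, hit the line $x_1 + x_2 = 2$, and then slide along it until the projected gradient vanishes, which mirrors the remark that the method remains on the boundary of the feasible region.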

(iii) Methods of feasible directions. These were first described by G. Zoutendijk [21]. Consider the problem of minimizing $\theta(x)$ subject to the constraint $x \in S \subset R^n$, where $S$ is a closed, connected set satisfying certain regularity conditions and $\theta$ is a continuously differentiable function of the $n$-vector $x$, such that, for some $\alpha$, the set $\{x \in S \mid \theta(x) \le \alpha\}$ is bounded and nonempty.

A method of feasible directions is any recipe for solving this problem by proceeding along the following lines: (1) Start with some $x^0 \in S$ such that $\theta(x^0) \le \sup \alpha$. (2) Pass from the $k$th iteration point $x^k$ to $x^{k+1}$ by first determining a direction $s^k$ in $x^k$ such that the ray $x^k + \lambda s^k$ lies in $S$ for all sufficiently small $\lambda > 0$. (3) Then determine the step length $\lambda_k$, thus obtaining the $(k+1)$st iterate $x^{k+1} = x^k + \lambda_k s^k$. (4) Repeat this procedure until some prescribed stopping condition is satisfied.

There are many methods of determining the $s^k$, and in most cases the $\lambda_k$ are then determined by solving a one-dimensional minimum problem along the direction so obtained. Zoutendijk has unified the various possible methods of feasible directions and the relevant normalization rules that yield the optimal directions $s^k$, and has investigated this subject in detail from the viewpoint of computational technique [21, 22].

F. Generalizations

The extension of the Kuhn-Tucker theory to linear topological spaces is due to L. Hurwicz [8]. Let $\mathcal{X}$ be a linear space, $\mathcal{Y}$, $\mathcal{Z}$ be linear topological spaces, $P_y$, $P_z$ the nonnegativity cones of $\mathcal{Y}$, $\mathcal{Z}$, respectively, which are closed convex cones containing inner points, $D$ a convex set in $\mathcal{X}$, and $F$, $G$ concave mappings (→ 88 Convex Analysis A) from $D$ into $\mathcal{Y}$, $\mathcal{Z}$, respectively, such that $G(D)$ contains an inner point of $P_z$. If $F(X)$ attains its maximal point when $X = X_0$ and $X_0$ satisfies $G(X_0) \ge 0$, $X_0 \in D$, then there exist $Y_0^* \ge 0$, $Z_0^* \ge 0$ such that $\Phi(X, Z^*) = Y_0^*(F(X)) + Z^*(G(X))$ has a saddle point at $(X_0, Z_0^*)$. The condition that $P_y$ and $P_z$ have inner points can be weakened to cover the cases of $(l_p)$, $(L_p)$, $(s)$, $(S)$ (Hurwicz and Uzawa). P. P. Varaiya [19] considered the following nonlinear programming problem in Banach space.

(B) Maximize $f(x)$ subject to $x \in A$, $g(x) \in A_Y$,

where $X$, $Y$ are real Banach spaces, $x \in X$, $g: X \to Y$ is a Fréchet differentiable mapping, $f$ is a real-valued differentiable function, $A$ is a subset of $X$, and $A_Y$ is a convex set in $Y$.

The main results are similar to the Kuhn-Tucker necessary conditions. Varaiya also exhibited a saddle value problem related to (B), when $A_Y$ is a closed convex cone. L. W. Neustadt [14] investigated nonlinear programming problems in linear vector spaces and gave an application to the theory of optimal control. The main results are: (i) Kuhn-Tucker type conditions which are both necessary and sufficient for optimality, (ii) a duality theory for obtaining multipliers in the generalized Kuhn-Tucker conditions, and (iii) an application to optimal control theory.

References

[1] J. Abadie (ed.), Nonlinear programming, North-Holland, 1967.
[2] J. Abadie (ed.), Integer and nonlinear programming, North-Holland, 1970.
[3] K. J. Arrow, L. Hurwicz, and H. Uzawa (eds.), Studies in linear and nonlinear programming, Stanford Univ. Press, 1958.
[4] K. J. Arrow and L. Hurwicz, Gradient method for concave programming I, in [3, pp. 117-126]; H. Uzawa, Gradient method for concave programming II, in [3, pp. 127-132].
[5] E. M. Beale, On minimizing a convex function subject to linear inequalities, J. Roy. Statist. Soc., (B) 17 (1955), 173-184.
[6] G. B. Dantzig, E. Eisenberg, and R. W. Cottle, Symmetric dual nonlinear programs, Pacific J. Math., 15 (1965), 809-812.
[7] M. Guignard, Generalized Kuhn-Tucker conditions for mathematical programming problems in a Banach space, SIAM J. Control, 7 (1969), 232-241.
[8] L. Hurwicz, Programming in linear spaces, in [3, pp. 38-102].
[9] F. John, Extremum problems with inequalities as subsidiary conditions, Studies and Essays: Courant Anniversary Volume, Interscience, 1948, 187-204.
[10] H. P. Künzi and W. Krelle, Nichtlineare Programmierung, Springer, 1962.
[11] H. W. Kuhn and A. W. Tucker, Nonlinear programming, Proc. 2nd Berkeley Symp. Math. Statist. Prob., 1951, 481-492.
[12] O. L. Mangasarian, Nonlinear programming, McGraw-Hill, 1969.
[13] O. L. Mangasarian and J. Ponstein, Minimax and duality in nonlinear programming, J. Math. Anal. Appl., 11 (1965), 504-518.
[14] L. W. Neustadt, Sufficiency conditions and a duality theory for mathematical programming problems in arbitrary linear spaces, in [17, pp. 323-348].
[15] R. T. Rockafellar, Augmented Lagrange multiplier functions and duality in nonconvex programming, SIAM J. Control, 12 (1974), 293-322.
[16] J. B. Rosen, The gradient projection method for nonlinear programming, J. Soc.

Indust. Appl. Math., 8 (1960), 181-217; 9 (1961), 514-553.
[17] J. B. Rosen, O. L. Mangasarian, and K. Ritter (eds.), Nonlinear programming, Academic Press, 1970.
[18] H. Uzawa, The Kuhn-Tucker theorem in concave programming, in [3, pp. 32-37].
[19] P. P. Varaiya, Nonlinear programming in Banach space, SIAM J. Appl. Math., 15 (1967), 284-293.
[20] P. Wolfe, A duality theorem for nonlinear programming, Quart. Appl. Math., 19 (1961), 239-244.
[21] G. Zoutendijk, Methods of feasible directions, Elsevier, 1960.
[22] G. Zoutendijk, Nonlinear programming, computational methods, in [2, pp. 37-86].
[23] G. Zoutendijk, Some algorithms based on the principle of feasible directions, in [17, pp. 93-121].

293 (I.7)
Nonstandard Analysis

A. General Remarks

Nonstandard analysis is a new field of research that has branched off from model theory and that provides a powerful method applicable in almost all fields of mathematical science. About 1960, A. Robinson successfully used a nonstandard model of the real number field $R$ to justify Leibniz-type infinitesimal calculus. After this, using higher-order logic, he developed a stronger theory of nonstandard analysis, and applied it to other mathematical fields [1].

In this article, we adopt a first-order logic over a universe. In the final section we present a theory called nonstandard set theory; this is a conservative extension of Zermelo-Fraenkel set theory with the axiom of choice.

B. Axioms for Nonstandard Analysis

A nonempty set $U$ is called a universe if the following four conditions are satisfied:

(a) $x \in U,\ y \in x \Rightarrow y \in U$;
(b) $x, y \in U \Rightarrow \{x, y\} \in U$;
(c) $x \in U \Rightarrow \bigcup x \in U$;
(d) $x \in U \Rightarrow P(x) \in U$.

In what follows, $U$ will be a universe containing the field $R$ of real numbers.

We construct a language $\mathcal{L}$ describing mathematics in $U$. The alphabet consists of the following symbols: (1) countably many variables. (2) constants corresponding to all elements in $U$. (3) predicate symbols $=$ and $\in$. (4) logical symbols $\neg$, $\wedge$, $\vee$, $\to$, $\exists$, $\forall$. (5) auxiliary symbols $[\ ,\ ]$. The last two symbols will be omitted where there is no danger of confusion.

Definition of formulas. (1) If $t$ and $s$ are terms (variables or constants), then $t = s$ and $t \in s$ are formulas (atomic formulas). (2) If $\phi$ and $\psi$ are formulas, then $\neg\phi$, $\phi \wedge \psi$, $\phi \vee \psi$, $\phi \to \psi$ are formulas. (3) If $\phi$ is a formula and $x$ is a variable, then $\exists x[\phi]$ and $\forall x[\phi]$ are formulas. (4) The formulas are those that can be constructed by the above procedure.

A formula $\phi$ containing at most $n$ free variables is called an $n$-ary formula. In this case, we often write $\phi(x_1, \ldots, x_n)$ for $\phi$. A sentence is a 0-ary formula. We write $\models \phi$ if a sentence $\phi$ is true under the usual interpretation.

Consider a quadruple $(U, {}^*U, {}^*{\in}, *)$ where ${}^*U$ is a set, ${}^*{\in}$ is a binary relation on ${}^*U$, and $*$ is a mapping $a \mapsto {}^*a$ from $U$ to ${}^*U$. An element $\alpha$ of ${}^*U$ is called standard if $\alpha = {}^*a$ for some $a \in U$, and nonstandard if not. Adjoin to $\mathcal{L}$ constants representing all nonstandard elements of ${}^*U$. Then we have a language ${}^*\mathcal{L}$ for ${}^*U$. A sentence in $\mathcal{L}$ is a sentence in ${}^*\mathcal{L}$. Taking ${}^*U$ as the scope of quantifiers, we can interpret every ${}^*\mathcal{L}$-sentence in ${}^*U$. We write ${}^*{\models}\, \phi$ if an ${}^*\mathcal{L}$-sentence $\phi$ is true under this interpretation.

Axiom 1 (transfer): For every sentence $\phi$ in $\mathcal{L}$, we have

$$\models \phi \iff {}^*{\models}\, \phi.$$

Definition: Let $\phi(x, y)$ be a binary formula in $\mathcal{L}$ (resp. in ${}^*\mathcal{L}$). We say that $\phi(x, y)$ is concurrent in $U$ (resp. in ${}^*U$), if, for every finite number of elements $a_1, \ldots, a_n$ in $U$ (resp. in ${}^*U$), there exists an element $b$ in $U$ (resp. in ${}^*U$) such that $\models \phi(a_i, b)$ (resp. ${}^*{\models}\, \phi(a_i, b)$) for $1 \le i \le n$.

Axiom 2 (enlargement): If a binary formula $\phi(x, y)$ in $\mathcal{L}$ is concurrent in $U$, there exists an element $\beta$ in ${}^*U$ such that ${}^*{\models}\, \phi({}^*a, \beta)$ for all $a$ in $U$.

A quadruple $(U, {}^*U, {}^*{\in}, *)$ or simply ${}^*U$ is called an enlargement of $U$ if it satisfies Axioms 1 and 2.

These axioms are strong enough to develop the basic theory of nonstandard analysis, but sometimes we require a stronger axiom. Let $\kappa$ be an infinite cardinal.

Axiom 3 ($\kappa$-saturation): Let $\phi(x, y)$ be a binary ${}^*\mathcal{L}$-formula concurrent in ${}^*U$. Then, for every subset $A$ of ${}^*U$ with cardinality at most $\kappa$, there exists an element $\beta$ of ${}^*U$ such that ${}^*{\models}\, \phi(\alpha, \beta)$ for all $\alpha \in A$.

${}^*U$ is called a $\kappa$-saturated model if it satisfies Axioms 1 and 3, and a $\kappa$-saturated enlargement if it satisfies Axioms 1, 2, and 3. If $\kappa$ is not less than the cardinality of $U$, then a $\kappa$-saturated model is necessarily an enlargement.

In the remainder of this article, ${}^*U$ will be an enlargement of $U$, if the contrary is not explicitly mentioned.

For $\alpha$ in ${}^*U$, $\hat{\alpha}$ is the set of $\xi \in {}^*U$ such that $\xi\ {}^*{\in}\ \alpha$: $\hat{\alpha} = \{\xi \in {}^*U \mid \xi\ {}^*{\in}\ \alpha\}$. For simplicity, we write $\in$ for ${}^*{\in}$ and identify $\hat{\alpha}$ with $\alpha$. Under this identification, a subset $A$ of ${}^*U$ is called internal if it belongs to ${}^*U$ and external if not.

Axioms 1 and 2 imply the following results.

(1) There exist infinite hypernatural numbers, namely, elements of ${}^*N$ bigger than all ${}^*n$, $n \in N$.

(2) There are positive infinitesimal hyperreal numbers.

(3) For every set $A$ in $U$, there exists a hyperfinite internal subset of ${}^*A$ containing all ${}^*a$, $a \in A$. Here, $\Gamma$ is called hyperfinite if we have ${}^*{\models}\, \phi(\Gamma)$, where $\phi(x)$ is a formula that says "$x$ is a finite set."

(4) Let $A$ be a set in $U$. Then ${}^*A$ has nonstandard elements if and only if $A$ is an infinite set.

(5) If a family $\mathcal{F}$ of sets in $U$ has the finite intersection property, then the family $\{{}^*A \mid A \in \mathcal{F}\}$ in ${}^*U$ has a nonempty intersection.

(6) Let $a: {}^*N \to {}^*R$ be an internal hypersequence. If $a({}^*n)$ is infinitesimal for every $n \in N$, then there exists an infinite hypernatural number $\lambda$ such that $a(\nu)$ is infinitesimal for every hypernatural number $\nu$ less than $\lambda$.

If we moreover assume Axiom 3, we have the following results.

(7) The cardinality of an internal set is either finite or more than $\kappa$.

(8) If a family $\mathcal{F}$ of internal sets in ${}^*U$ has the finite intersection property and cardinality $\kappa$ at most, $\mathcal{F}$ has a nonempty intersection.

(9) Introduce the order topology in ${}^*R$. Then every subset of ${}^*R$ with cardinality at most $\kappa$ is bounded and discrete.

(10) Let $\alpha$, $\beta$ be internal sets and $C$ a subset of $\alpha$ with cardinality at most $\kappa$. Then every mapping from $C$ to $\beta$ can be prolonged to an internal mapping from $\alpha$ to $\beta$.

(10') In particular, every external sequence $N \to \beta$ can be prolonged to an internal sequence ${}^*N \to \beta$. This follows from the assumption of countable saturation. This fact plays an essential role in nonstandard probability theory, which is now in the process of rapid development (→ Section D (3)).

C. Construction of Ultrapower Models

Let $I$ be an infinite set. A mapping $\alpha$ from $I$ to $U$ is a family of elements in $U$ with indices in $I$. So we write $(\alpha(i))_{i \in I}$ or simply $(\alpha(i))$ for $\alpha$.

Let $\mathcal{F}$ be a nonprincipal ultrafilter on $I$ and define an equivalence relation $\sim_{\mathcal{F}}$ on the set $U^I$ of all mappings from $I$ to $U$:

$$(\alpha(i)) \sim_{\mathcal{F}} (\beta(i)) \iff \{i \in I \mid \alpha(i) = \beta(i)\} \in \mathcal{F}.$$

Denote by ${}^*U$ the quotient set of $U^I$ under the relation $\sim_{\mathcal{F}}$. The class of $(\alpha(i))_{i \in I}$ will be denoted by $[\alpha(i)]_{i \in I}$ or simply $[\alpha(i)]$.

Define a binary relation ${}^*{\in}$ on ${}^*U$:

$$[\alpha(i)]\ {}^*{\in}\ [\beta(i)] \iff \{i \in I \mid \alpha(i) \in \beta(i)\} \in \mathcal{F}.$$

Let $*$ be the diagonal mapping from $U$ to ${}^*U$ and consider the quadruple $(U, {}^*U, {}^*{\in}, *)$.

Theorem (Łoś): The foregoing quadruple satisfies Axiom 1. Moreover, ${}^*U$ is a countably saturated model.

In particular, let $I$ be the set of all finite subsets of $U$. For $i \in I$, put $\mu(i) = \{j \in I \mid i \subset j\}$. Then the family of sets $B = \{\mu(i) \mid i \in I\}$ has the finite intersection property, and therefore there exists an ultrafilter $\mathcal{F}$ on $I$ including $B$ (use Zorn's lemma). If we construct ${}^*U$ from the ultrafilter $\mathcal{F}$, then ${}^*U$ is a countably saturated enlargement of $U$ [1].

It is not easy to construct a $\kappa$-saturated enlargement for an arbitrary cardinal $\kappa$. We have two ways: to use a $\kappa$-good ultrafilter (the proof of its existence is difficult) [2, 3] or to use an ultralimit (iteration of ultrapowers) [4, 16].

D. Applications

(1) Infinitesimal Calculus. A hyperreal number $\alpha$ (element of ${}^*R$) is called infinitesimal if $|\alpha|$ is smaller than every positive real number. If $\alpha - \beta$ is infinitesimal, $\alpha$ and $\beta$ are said to be infinitely close to each other; this is written $\alpha \approx \beta$. If $1/\alpha$ is not infinitesimal, $\alpha$ is called finite. Every finite hyperreal number $\alpha$ is infinitely close to a real number $a$ (completeness of $R$). We call $a$ the standard part of $\alpha$ and write $\operatorname{st}(\alpha)$. Every finite hyperinteger is an integer.

Let $f$ be a real-valued function on a real interval $I$. Then ${}^*f$ is a hyperreal-valued function on the hyperreal interval ${}^*I$. The function $f$ is continuous if and only if $f(x) \approx {}^*f(\eta)$ for every $x \in I$, $\eta \in {}^*I$ with $x \approx \eta$. The function $f$ is uniformly continuous if ${}^*f(\xi) \approx {}^*f(\eta)$ for every $\xi, \eta \in {}^*I$ with $\xi \approx \eta$.

Let $\delta x$ be a variable ranging over nonzero infinitesimals. The function $\delta f = {}^*f(a + \delta x) - f(a)$ of $\delta x$ will be called the infinitesimal increment of $f$ at $a$. $f$ is differentiable at $a$ if and only if the quotient $\delta f/\delta x$ is of infinitesimal variation. The common standard part of $\delta f/\delta x$ is the derivative $f'(a)$.

The higher-order differential $\delta^n f$ is also justified as a higher-order infinitesimal difference, with the standard part of $\delta^n f/\delta x^n$ being

$f^{(n)}(a)$. We can define the Riemann integral as the standard part of a Riemann hypersum with respect to a hyperfinite partition of infinitesimal width.

This type of reformulation permits us to rewrite the whole of calculus, and many teachers are trying to adapt this theory to elementary calculus [5, 6].

(2) Topological Spaces. Let $X$ be a topological space in $U$ and let $a$ be a point of $X$. The intersection of ${}^*A$, $A$ varying over the neighborhoods of $a$, is called the monad of $a$ and is denoted by $\operatorname{Mon}(a)$. There exist hyperneighborhoods of $a$ contained in $\operatorname{Mon}(a)$ (infinitesimal neighborhoods). The topology is determined by the system of monads $(\operatorname{Mon}(a))_{a \in X}$. We can thus rewrite the theory of topological spaces. For example, $X$ is Hausdorff if and only if $\operatorname{Mon}(a) \cap \operatorname{Mon}(b) = \emptyset$ for every $a \ne b$ in $X$. $X$ is compact if and only if every point of ${}^*X$ belongs to the monad of some element in $X$. This characterization is very useful; using it, Robinson and Bernstein were able to solve the invariant subspace problem for a special class of operators on a Hilbert space [7]. We can also construct a Haar measure very naturally and simply [2].

(3) Measures and Probability Theory. Let $(X, B, m)$ be a measure space. Then there exist a hyperfinite subset $\Gamma$ of ${}^*X$ and a positive hyperreal-valued internal function $\varphi$ such that we have

$$\int f\, dm = \operatorname{st}\Bigl(\sum_{\xi \in \Gamma} {}^*f(\xi)\,\varphi(\xi)\Bigr)$$

for every integrable function $f$ [8]. The right-hand side is a hyperfinite sum. It follows that the measure $m$ can be extended to a finitely additive measure defined over all subsets of $X$. If in particular every measurable finite set is of measure 0, then there exist a hyperfinite subset $\Gamma$ of ${}^*X$ including $X$ and a hypernatural number $\mu$ such that we have

$$\int f\, dm = \operatorname{st}\Bigl(\frac{1}{\mu}\sum_{\xi \in \Gamma} {}^*f(\xi)\Bigr)$$

for every integrable function $f$ [8]. If $m(X) = 1$, then $\mu$ can be taken as the hypercardinality of $\Gamma$. Here, the right-hand side is nothing but the mean value of ${}^*f$ on a hyperfinite set. This idea leads us to a simple description of probability theory.

The above method is rather formal; but P. Loeb [9] has pioneered a new approach in probability theory by constructing an external measure space from a hyperfinitely additive internal measure space $(X, \mathcal{A}, \nu)$ in ${}^*U$. In the following, ${}^*U$ is supposed to be countably saturated.

Let $\sigma(\mathcal{A})$ be the external countably additive algebra of subsets of $X$ generated by $\mathcal{A}$. Then the mapping $A \mapsto \operatorname{st}(\nu(A))$ from $\mathcal{A}$ to $R$ can be extended to a measure on $\sigma(\mathcal{A})$. The completion of this measure space is denoted $(X, L(\mathcal{A}), L(\nu))$. In cases where $X$ is hyperfinite, $\mathcal{A}$ is the totality of internal subsets of $X$, and $\nu(X) = 1$, then the space $(X, L(\mathcal{A}), L(\nu))$ is called a Loeb space and $L(\nu)$ a Loeb measure.

Every Radon probability space can be represented as the image of a Loeb space by a measure-preserving mapping. Therefore the probability theory on a Radon probability space can be reduced to that on a Loeb space. For example, for Lebesgue measure on the interval $[0, 1]$, take an infinite hypernatural number $\lambda$ and put $X = \{\mu/\lambda \mid \mu \in {}^*N,\ 0 \le \mu \le \lambda - 1\}$. If we assign $1/\lambda$ to every point of $X$, we have an internal probability measure $\nu$ on $X$. Then the mapping $\operatorname{st}: x \mapsto \operatorname{st}(x)$ from $X$ to $[0, 1]$ serves as a measure-preserving mapping from the Loeb space $(X, L(\mathcal{A}), L(\nu))$ onto the Lebesgue measure space on $[0, 1]$.

The notion of lifting plays a key role in probability theory on Loeb spaces. Let $f$ be a real-valued function on $X$, and $F$ an internal hyperreal-valued function on $X$. $F$ is called a lifting of $f$ if we have $f(x) = \operatorname{st}(F(x))$ almost everywhere with respect to $L(\nu)$. Hence $f$ is measurable if and only if it has a lifting.

Shuttling between an internal probability space and its Loeb space by lifting and standard-part mapping, we can develop, simply and partially hyperfinitely, probability theory on the Loeb space. Among others, Anderson [10, 11] and Keisler [12] applied this method quite successfully to Brownian motions, Itô integrals, stochastic differential equations, etc. [17].

E. Nonstandard Set Theory

In our formulation in B, we must construct ${}^*U$ from a fixed universe $U$. Nelson [13], Hrbacek [14] and Čuda [15] independently invented theories that nonstandardize the whole of set theory. In these theories, there is only one real number field $R$, and $R$ already contains infinitesimal numbers. Compare this to Robinson-type infinitesimal analysis, where infinitesimal numbers are introduced as an ad hoc tool. The new theories may demand a reflection on mathematical description in the natural sciences.

In the following, we outline (intuitively) Hrbacek's theory, as strengthened and improved by Kawai [16].

We start from ZFC, the Zermelo-Fraenkel set theory plus the axiom of choice. The language of Kawai's theory NST is that of ZFC
plus two constants S and I. We understand S as the totality of standard sets and I as the totality of internal sets.

Let φ be a formula of ZFC, that is, a formula of NST without S and I. We write ^Sφ (resp. ^Iφ) for the formula of NST obtained by restricting the scope of variables in φ to S (resp. I).

The mathematical axioms and axiom schemes of NST are substantially as follows.

(1) If φ is an axiom of ZFC, then ^Sφ is an axiom of NST. In other words, all the axioms of ZFC are valid in the universe of standard sets.

(2) In the universe of all sets, all the axioms of ZFC are valid except the regularity axiom.

(3) Every standard set is internal. Every element of an internal set is internal.

(4) (transfer) For every n-ary formula φ(x_1, …, x_n) of ZFC, we have

\[ \forall x_1 \in S \cdots \forall x_n \in S\ \bigl[\,{}^{S}\phi(x_1, \dots, x_n) \leftrightarrow {}^{I}\phi(x_1, \dots, x_n)\,\bigr]. \]

(5) (saturation) For every set of size at most that of S, the scheme corresponding to Axiom 3 (κ-saturation) in Section B holds. Therefore the scheme corresponding to Axiom 2 (enlargement) holds also.

(6) (standardization) For every set a included in a standard set, there exists a standard set b such that we have ∀x ∈ S [x ∈ a ↔ x ∈ b].

This completes the description of NST.

Theorem (Nelson-Hrbacek-Kawai): NST is a conservative extension of ZFC. That is, let φ be a sentence of ZFC. If ^Sφ can be proved in NST, then φ can be proved in ZFC.

Corollary: If ZFC is consistent, so is NST.

References

[1] A. Robinson, Non-standard analysis, North-Holland, 1966; second edition, 1974.
[2] M. Saito, Ultraproducts and non-standard analysis (in Japanese), Tokyo Tosho, 1976.
[3] C. C. Chang and H. J. Keisler, Model theory, North-Holland, 1973.
[4] K. D. Stroyan and W. A. J. Luxemburg, Introduction to the theory of infinitesimals, Academic Press, 1976.
[5] H. J. Keisler, Elementary calculus, Prindle, Weber & Schmidt, 1976.
[6] H. J. Keisler, Foundations of infinitesimal calculus, Prindle, Weber & Schmidt, 1977.
[7] M. Davis, Applied nonstandard analysis, Wiley, 1976.
[8] M. Saito, On the non-standard representation of linear mappings from a function space, Comm. Math. Univ. Sancti Pauli, 26 (1977), 165-185.
[9] P. A. Loeb, Conversion from nonstandard to standard measure spaces and applications in probability theory, Trans. Amer. Math. Soc., 211 (1975), 113-122.
[10] R. M. Anderson, A nonstandard representation for Brownian motion and Itô integration, Israel J. Math., 25 (1976), 15-46.
[11] R. M. Anderson, Star-finite probability theory, Dissertation, Yale Univ., 1977.
[12] H. J. Keisler, An infinitesimal approach to stochastic analysis, Mem. Amer. Math. Soc. 297 (1984).
[13] E. Nelson, Internal set theory, Bull. Amer. Math. Soc., 83 (1977), 1165-1198.
[14] K. Hrbacek, Axiomatic foundations for nonstandard analysis, Fund. Math., 98 (1978), 1-19.
[15] K. Čuda, A nonstandard set theory, Comm. Math. Univ. Carolinae, 17 (1976), 647-663.
[16] T. Kawai, Nonstandard analysis by axiomatic method, Southeast Asian Conference on Logic, C.-T. Chong and M. J. Wicks (eds.), Elsevier, 1983.
[17] N. J. Cutland, Nonstandard measure theory and its applications, Bull. London Math. Soc., 15 (1983), 529-589.

294 (II.9)
Numbers

A. General Remarks

From counting, a primitive mental activity, came the natural numbers (N) (→ Section B), which serve to denote the number of items or the order in which these items are arranged. We can extend this concept to define, step by step, the †integers (Z), †rational numbers (Q), †real numbers (R), and †complex numbers (C) (→ 355 Real Numbers; 74 Complex Numbers).

The extensions up to the rationals are carried out to attain a domain within which the operations of addition, subtraction, multiplication, and division, namely, the four arithmetic operations (or rational operations) can be performed indefinitely, with division by 0 the only exception. To develop the theory of natural numbers there are two well-known methods, one being G. Peano's system of axioms [4], which will be stated below, and the other being R. Dedekind's set-theoretic treatment [5]. The domain of rational numbers can be extended, taking continuity into consideration, to that of real numbers by several methods of which the best known are those of the Dedekind cut (1872) [8], which will be described below, and of G. Cantor's
fundamental sequences (1872) [9]. There is also a way based on infinite series, given by K. Weierstrass in his lectures (1859-1860).

Though in the domain of real numbers (1) the four arithmetic operations can be performed indefinitely and (2) an order relation for magnitude is defined, the equation x² + 1 = 0 has no root. Introducing numbers expressible as a + ib (i = √−1), it is possible to solve every equation of the second order. Such numbers, once called imaginary, have been used since the days of G. Cardano [6] in the 16th century. L. Euler also made good use of complex numbers as convenient tools in many calculations and obtained among other things his formula exp iθ = cos θ + i sin θ. Indeed, the notation i = √−1 was used for the first time by Euler (1777). Furthermore, C. F. Gauss, giving imaginary numbers the name complex numbers, showed that any algebraic equation with numerical coefficients always has roots in the domain of complex numbers. The discovery of the geometric representation of complex numbers by several mathematicians in the late 18th and early 19th centuries and their use in many applications have made complex numbers indispensable in mathematics.

Though there are extensions of complex numbers, such as †Hamilton's quaternions or †Cayley numbers, it is generally accepted that when we speak of a number we usually mean a complex number.

B. Natural Numbers

Peano, basing his system on a specific natural number, 1, and a function such that to each natural number x corresponds a natural number x + 1 (hereafter denoted by x′ and called the successor of x), formulated the fundamental properties of the set N of natural numbers in the following five axioms, called the Peano postulates: (1) 1 ∈ N; (2) if x ∈ N, then x′ ∈ N; (3) if x ∈ N, then x′ ≠ 1; (4) if x′ = y′ (x, y ∈ N), then x = y; and (5) if a set M satisfies the two conditions: 1 ∈ M, and x ∈ M implies x′ ∈ M, then N ⊂ M. Since these postulates determine N uniquely up to isomorphism, they can be regarded as a definition of N. Elements of N are called natural numbers.

Owing to Peano's fifth postulate, regarding a certain property P(n) for natural numbers n we can deduce that P(n) is true for every n if we prove both of the following conditions: (i) P(1) is true; and (ii) for any natural number k, if P(k) is true, then P(k + 1) is true. Such reasoning is called mathematical induction (or complete induction). Accordingly, Peano's fifth postulate is called the axiom of mathematical induction. Double mathematical induction, a generalization of mathematical induction, is as follows: to show that the property P(m, n) is true for every pair of natural numbers m and n, we have only to show that (iii) P(m, 1) and P(1, n) are true for every m and every n; and (iv) for every pair of natural numbers k and l, if P(k + 1, l) and P(k, l + 1) are true, then P(k + 1, l + 1) is true. This axiom can be formulated in several other ways and can be generalized further to n-tuple mathematical inductions (n = 2, 3, 4, …), generically called multiple mathematical inductions.

Assume for a set M that a mapping f from the Cartesian product N × M into M is given. Then a mapping φ from N into M such that (v) φ(1) = a; and (vi) φ(x′) = f(x, φ(x)) (x ∈ N) exists and is unique. Defining φ by (v) and (vi) is called the definition of φ by mathematical induction.

In particular, given a natural number a, the mapping φ: N → N defined by (vii) φ(1) = a′; and (viii) φ(x′) = (φ(x))′ is called addition by a. We shall write φ(b) = a + b, whence x′ = x + 1. Addition thus defined obeys the following laws: a + b = b + a (commutative law); (a + b) + c = a + (b + c) (associative law). Peano's postulates are thus equivalent to the following: (1′) 1 ∈ N; (2′) for each pair a, b ∈ N, a + b ∈ N is defined so that addition obeys the commutative and associative laws; (3′) for each pair of natural numbers a and b, one and only one of the following three relations holds: a = b + c (c ∈ N); a = b; a + c = b (c ∈ N); (4′) the same as (5) (mathematical induction). From (1′)-(4′) follows the cancellation law: a + c = b + c ⇒ a = b. Define a ≥ b if and only if a = b or a = b + c (a, b, c ∈ N). Then, from (3′), N becomes a †totally ordered set and a ≥ b ⇒ a + c ≥ b + c.

For each a ∈ N, the mapping φ: N → N defined by φ(1) = a and φ(x′) = φ(x) + a is called multiplication by a, and we write φ(b) = ab (or a·b). Multiplication obeys the following laws: ab = ba (commutative law); (ab)c = a(bc) (associative law); a(b + c) = ab + ac, (a + b)c = ac + bc (distributive laws); and ac = bc ⇒ a = b (cancellation law). The statement a·1 = 1·a = a also holds.

Natural numbers, which have been introduced thus far as †ordinal numbers, also have the properties of †cardinal numbers. Denoting {1, 2, …, n} = M_n, we have M_m ~ M_n ⇔ m = n, M_m + M_n ~ M_{m+n}, M_m × M_n ~ M_{mn} (→ 49 Cardinal Numbers).

C. Integers

Introducing new numbers which are not in N = {1, 2, …}, represented by the notations 0, −1, −2, …, −n, …, we write Z = {…,
−n, …, −2, −1, 0, 1, 2, …, n, …}. An element of Z is called an integer (or rational integer). Algebraically we can construct Z from N as follows: Let the set of all †ordered pairs (k, l) of natural numbers k, l be M = N × N, and define in M an †equivalence relation (k, l) ~ (m, n) by k + n = m + l. Let the equivalence class of (k, l) be K(k, l), and construct the †quotient space M/~ = M*. Then the mapping φ: Z → M* defined by φ(n) = K(k + n, k), φ(0) = K(k, k), φ(−n) = K(k, k + n) is bijective. Setting K(k, l) + K(m, n) = K(k + m, l + n), addition can be defined in M* (and accordingly in Z) which is an extension of that in N. Since K(k, l) − K(m, n) = K(k + n, l + m), subtraction can be defined in Z. This makes Z an †Abelian group with respect to addition. An order relation in Z is defined by K(k, l) ≥ K(m, n) ⇔ k + n ≥ m + l, which makes Z a totally ordered set. This order relation is an extension of that in N. In particular, N = {a ∈ Z | a > 0}. Furthermore, setting K(k, l) × K(m, n) = K(km + ln, kn + lm), multiplication in Z can be defined. It is an extension of that in N and obeys commutative, associative, and distributive laws. Also, for each a, b ∈ Z, we have ab = 0 ⇔ (a = 0 or b = 0). Thus, Z becomes an †integral domain.

D. Rational Numbers

Let P be the set of all ordered pairs (a, b) of integers a, b with b ≠ 0, and define in P an equivalence relation (a, b) ~ (c, d) by ad = bc. Each equivalence class determined by this relation is called a rational number. Denoting by L(a, b) the equivalence class to which (a, b) belongs, we can define the sum x + y, the difference x − y, the product xy, and the quotient x/y of rational numbers x = L(a, b), y = L(c, d), as in the cases of addition and multiplication of integers, in the following way: x + y = L(ad + bc, bd), x − y = L(ad − bc, bd), xy = L(ac, bd), x/y = L(ad, bc), where the quotient is defined only when c ≠ 0. Thus the set Q of all rational numbers becomes a †field. For the same reason that we have identified integers of a special type with natural numbers, we now identify a rational number expressible as L(a, 1) with an integer a. Henceforth, any rational number L(a, b) (b ≠ 0) can be expressed in the form of a quotient a/b (b ≠ 0) of integers a and b.

From L(a, b) = L(ca, cb) (c ≠ 0), it is always possible to assume b > 0 in the representation L(a, b) of a rational number x. For any two rational numbers x = L(a, b), y = L(c, d) with b > 0, d > 0, define an ordering in Q by x ≥ y ⇔ ad ≥ bc, which is an extension of the ordering of integers. Thus (i) Q becomes totally ordered, and we have (ii) x ≥ y ⇒ x + z ≥ y + z and (iii) x ≥ y and z > 0 ⇒ xz ≥ yz. The rational x is called positive if x > 0 and negative if x < 0.

E. Real Numbers

Two typical methods of constructing real numbers from rational numbers are those of Dedekind and of Cantor.

Dedekind's Theory of Real Numbers. We call a pair (A_1, A_2) of subsets A_1, A_2 of the set Q of all rational numbers a cut of Q if they satisfy the following conditions: (i) A_1 ≠ ∅, A_2 ≠ ∅; (ii) Q = A_1 ∪ A_2; (iii) a_1 ∈ A_1, a_2 ∈ A_2 ⇒ a_1 < a_2. Then the following three cases are distinguished: there is (i) a maximum in A_1 with no minimum in A_2; (ii) no maximum in A_1 with a minimum in A_2; or (iii) no maximum in A_1 with no minimum in A_2. A cut with either condition (i) or (iii) is called a real number (in the sense of Dedekind); condition (ii) can be converted to (i). The set of all real numbers is denoted hereafter by R and each real number by α or β or …. A real number with property (i) is called a rational real number, and a real number with property (iii) an irrational real number. Any rational real number is uniquely determined by the maximum a of A_1, and the mapping a → a* = (A_1, A_2) from the set Q of rational numbers onto the set Q* of rational real numbers is bijective.

I. For real numbers α = (A_1, A_2) and β = (B_1, B_2) we define α ≤ β if and only if A_1 ⊂ B_1. By this ordering ≤, R becomes a totally ordered set.

II. For real numbers α = (A_1, A_2) and β = (B_1, B_2), put C_2 = {a + b | a ∈ A_2, b ∈ B_2} and C_1 = Q − C_2; then (C_1, C_2) = γ is a real number. Define the sum α + β by setting α + β = γ. Addition thus defined obeys commutative and associative laws, and R becomes an †Abelian group with 0* as its zero element. Furthermore, for real numbers α = (A_1, A_2) and β = (B_1, B_2) with 0* < α, 0* < β, put D_2 = {ab | a ∈ A_2, b ∈ B_2}, D_1 = Q − D_2; then (D_1, D_2) = δ becomes a real number. Define the product αβ by setting αβ = δ. According as 0* > α, 0* < β; 0* < α, 0* > β; or 0* > α, 0* > β, define αβ = −((−α)β); αβ = −(α(−β)); and αβ = (−α)(−β), respectively. Multiplication thus defined obeys the commutative, associative, and distributive laws, and R becomes a field with 1* as its unity element.

III. For ordering and arithmetic operations, we have (1) α ≥ β ⇒ α + γ ≥ β + γ; and (2) α ≥ β, γ ≥ 0* ⇒ αγ ≥ βγ.

By letting each rational number a correspond to a rational real number a*, we can set up a bijection between the set Q of all rational numbers and Q*. Furthermore, in this correspondence, the sum, product, 0, and 1 of Q are mapped to the sum, product, zero element, and unity element of Q*, respectively; in addition, the ordering is preserved. Thus Q and Q* are isomorphic with respect to both arithmetic operations and ordering. We shall hereafter identify the element a* with a. Accordingly we call rational real numbers simply rational numbers and similarly irrational real numbers simply irrational numbers.

IV. Similarly as for Q, we can define a cut (A_1, A_2) of the set of all real numbers R (precisely, by the conditions A_1 ≠ ∅, A_2 ≠ ∅; R = A_1 ∪ A_2; α_1 ∈ A_1, α_2 ∈ A_2 ⇒ α_1 < α_2). For each cut (A_1, A_2) of R, either there is a maximum in A_1 and no minimum in A_2 or no maximum in A_1 and a minimum in A_2 (Dedekind). This property is called the connectedness (or continuity) of real numbers.

By using definitions (I)-(IV), any real number α can be represented (i) as the †least upper bound of some set A of rational numbers: α = sup A; and also (ii) as the limit of a sequence {a_n} of rational numbers: α = lim a_n.

Cantor's Theory of Real Numbers. A sequence {a_n} of rational numbers is called a fundamental sequence (or Cauchy sequence) if it satisfies the following condition: for each positive rational number ε, there exists a natural number n_0 such that −ε < a_n − a_m < ε for every a_m and a_n with m > n_0 and n > n_0. For two Cauchy sequences {a_n} and {b_n}, we can write {a_n} ~ {b_n} if {a_1, b_1, a_2, b_2, …, a_n, b_n, …} is again a Cauchy sequence. The relation ~ is an equivalence relation. Let R′ be the set of all equivalence classes obtained by classifying, with respect to ~, the set of all Cauchy sequences, and call an element of R′ (an equivalence class) a real number (in the sense of Cantor). We shall write [{a_n}] hereafter for the equivalence class of {a_n}. In particular, {a_n} with a_n = a (n = 1, 2, …) being a Cauchy sequence, we denote a real number [{a_n}] by a**. An element of R′ of this type is called a rational real number (in the sense of Cantor), while one not of this type is called an irrational real number (in the sense of Cantor).

For each pair of real numbers α = [{a_n}] and β = [{b_n}], their sum and product are uniquely defined by α + β = [{a_n + b_n}] and αβ = [{a_n b_n}], where both {a_n + b_n} and {a_n b_n} are shown to be Cauchy sequences. Further, for α and β, define α ≤ β if a_n ≤ b_n holds for all n larger than some number n_0. As regards arithmetic operations and ordering, R′ has the properties (I)-(IV) of Dedekind's theory. Further, any Cauchy sequence of which the terms all are real numbers has a limit in R′ (completeness of real numbers).

The sets of real numbers obtained by the above two methods are isomorphic with respect to arithmetic operations and ordering (→ 355 Real Numbers).

F. Complex Numbers

To construct complex numbers from real ones, several methods have been devised, among which the one stated below is due to W. R. Hamilton.

An ordered pair (a, b) of real numbers a and b will be called a complex number. Arithmetic operations among complex numbers are defined as follows: (a, b) + (c, d) = (a + c, b + d), (a, b) − (c, d) = (a − c, b − d), (a, b)·(c, d) = (ac − bd, bc + ad), and

\[ \frac{(a, b)}{(c, d)} = \left( \frac{ac + bd}{c^2 + d^2},\ \frac{bc - ad}{c^2 + d^2} \right), \qquad (c, d) \ne (0, 0). \]

According to these definitions, the set C of all complex numbers becomes a †field with (0, 0) and (1, 0) as its †zero element and †unity element, respectively. R* = {(a, 0) | a ∈ R} is a †subfield of C, and the mapping φ: R → R* defined by φ(a) = (a, 0) proves to be a field †isomorphism. Thus, identifying the element (a, 0) of R* with the element a of R, C can be regarded as an †overfield of R (R ⊂ C). Hence, the zero element of C is the real number 0 and the unity element is the real number 1. The complex number (0, 1) is called the imaginary unit and denoted by i. Thus we can write a + bi as usual for a complex number (a, b) (→ 74 Complex Numbers).

References

[1] O. Perron, Irrationalzahlen, de Gruyter, second edition, 1929 (Chelsea, 1948).
[2] E. G. H. Landau, Grundlagen der Analysis, Akademische Verlag., 1934; English translation, Foundations of analysis, Chelsea, 1960.
[3] N. Bourbaki, Eléments de mathématique, I. Théorie des ensembles, ch. 3, Actualités Sci. Ind., 1243b, second edition, 1967; III. Topologie générale, ch. 4, 1143c, third edition, 1960; ch. 8, 1235b, third edition, 1963; English translation, Theory of sets, Addison-Wesley, 1968; General topology, pt. I, Addison-Wesley, 1966.
[4] G. Peano, Arithmetices principia nova methodo exposita, Bocca, 1889 (Opere scelte II, Cremona, 1958).
[5] R. Dedekind, Was sind und was sollen die Zahlen? Vieweg, 1887 (Gesammelte mathematische Werke 3, Vieweg, 1932); English translation, Essays on the theory of numbers, Open Court, 1901.
[6] G. Cardano, Artis magnae sive de regulis algebraicis, Nuremberg, 1545.
[7] W. R. Hamilton, Lectures on quaternions, Hodges and Smith, 1853.
[8] R. Dedekind, Stetigkeit und irrationale Zahlen, Vieweg, 1872 (Gesammelte mathematische Werke 3, Vieweg, 1932); English translation, Essays on the theory of numbers, Open Court, 1901.
[9] G. Cantor, Über die Ausdehnung eines Satzes aus der Theorie der trigonometrischen Reihen, Math. Ann., 5 (1872), 123-132 (Gesammelte Abhandlungen, Springer, 1932).

295 (V.4)
Number-Theoretic Functions

A. Recurrent Sequences

A (complex-valued) function that has the set of nonnegative integers (or the set N of natural numbers) as its †domain is called a number-theoretic (or arithmetic) function. Thus it can be regarded as a †sequence of numbers. We first consider recurrent sequences. Let f(x_0, x_1, …, x_{r−1}) be a complex-valued function of r variables. Put u_0 = a_0, u_1 = a_1, …, u_{r−1} = a_{r−1}, and successively define u_{i+r} = f(u_i, u_{i+1}, …, u_{i+r−1}) (i = 0, 1, 2, …). The sequence {u_n} thus defined is called a recurrent sequence of order r determined by the initial values a_0, a_1, …, a_{r−1} and the function f. In particular, when the defining function f is given by Σ_{i=0}^{r−1} b_i x_i, {u_n} is called a linear recurrent sequence. The Fibonacci sequence is a special linear recurrent sequence with initial values a_0, a_1 and the defining function x_0 + x_1. Let α = (1 + √5)/2, β = (1 − √5)/2 be two roots of 1 + x = x², and let c_1 and c_2 be determined by c_1 + c_2 = a_0 and c_1 α + c_2 β = a_1. Then the Fibonacci sequence with the initial values a_0, a_1 is given by putting u_n = c_1 α^n + c_2 β^n (n = 0, 1, 2, …). If we put a_0 = 1, a_1 = 1, then we obtain Binet's formula:

\[ u_n = \frac{1}{\sqrt{5}} \left\{ \left( \frac{1 + \sqrt{5}}{2} \right)^{n+1} - \left( \frac{1 - \sqrt{5}}{2} \right)^{n+1} \right\}. \]

B. Multiplicative Functions

A number-theoretic function f with domain N is said to be multiplicative if f(1) = 1, f(mn) = f(m)f(n) for (m, n) = 1, and to be completely multiplicative if f(1) = 1, f(mn) = f(m)f(n) for any m and n (∈ N). Similarly, f is said to be additive if f(1) = 0, f(mn) = f(m) + f(n) for (m, n) = 1, and to be completely additive if f(1) = 0 and f(mn) = f(m) + f(n) for any m and n (∈ N). If f is multiplicative and Σ_{n=1}^∞ f(n) is absolutely convergent, then Σ_{k=0}^∞ f(p^k) is absolutely convergent for any prime p. Moreover, we have

\[ \sum_{n=1}^{\infty} f(n) = \prod_{p} \Bigl( \sum_{k=0}^{\infty} f(p^k) \Bigr), \]

where the infinite product on the right-hand side is also absolutely convergent. Furthermore, if f(n) is completely multiplicative, then we have

\[ \sum_{n=1}^{\infty} f(n) = \prod_{p} \bigl( 1 - f(p) \bigr)^{-1}. \]

In particular, for f(n) = n^{−s}, which is completely multiplicative (with s = σ + it a complex variable), we get the †Euler product formula for the †Riemann zeta function:

\[ \zeta(s) = \sum_{n=1}^{\infty} n^{-s} = \prod_{p} \bigl( 1 - p^{-s} \bigr)^{-1} \qquad \text{for } \sigma > 1. \]

C. Convolutions

If f and g are number-theoretic functions, the convolution f * g is defined by

\[ (f * g)(n) = \sum_{d \mid n} f(d)\, g(n/d), \]

with the summation carried over all divisors d of n. For any number-theoretic functions f, g, h, we have f * g = g * f and (f * g) * h = f * (g * h). If f and g are multiplicative number-theoretic functions, f * g is again a multiplicative number-theoretic function.

The Möbius function μ(n) is defined as follows: μ(1) = 1, μ(n) = 0 if n is divisible by the square of a prime, and μ(n) = (−1)^r if n is the product of r distinct primes. It can easily be proved that μ is multiplicative and that

\[ \sum_{d \mid n} \mu(d) = \begin{cases} 1 & (n = 1), \\ 0 & (n > 1). \end{cases} \]

Let e and ρ be functions defined by e(1) = 1, e(n) = 0 (n > 1), and ρ(n) = 1 for every n. Then e is the identity element for the convolution *, and μ * ρ = ρ * μ = e. It follows that f * ρ = g is equivalent to f = g * μ, that is,

\[ g(n) = \sum_{d \mid n} f(d) \]

implies that

\[ f(n) = \sum_{d \mid n} \mu(d)\, g(n/d), \]

and vice versa. We call the latter the Möbius inversion formula. Similarly, for complex-valued functions F, G defined on [1, +∞),

\[ G(x) = \sum_{n \le x} F(x/n) \]
is equivalent to

\[ F(x) = \sum_{n \le x} \mu(n)\, G(x/n). \]

Let φ(n) be the number of integers m not greater than n and such that (m, n) = 1. The function φ(n), called the Euler function, is multiplicative, and we have

\[ \varphi(n) = n \prod_{p \mid n} \Bigl( 1 - \frac{1}{p} \Bigr) \]

and φ(p^n) = p^n − p^{n−1} for every prime p. Let ν be the function defined by ν(n) = n for every n. Then we have μ * ν = φ and ν = ρ * φ, and hence Σ_{d|n} φ(d) = n.

The generalized divisor function d_k(n) is the number of ways of expressing n as a product of k factors. Thus we have

\[ d_k(n) = \sum_{n_1 n_2 \cdots n_k = n} 1 \]

and d_k = ρ * ⋯ * ρ (k factors). Therefore, d_k is multiplicative. For simplicity, we write d(n) instead of d_2(n). Then d(n) is the number of divisors of n, and we call it the divisor function. For example, d(12) = 6. We denote by σ_α(n) the sum of αth powers of divisors of n. If n = Π_{p|n} p^l, then we have

\[ \sigma_\alpha(n) = \prod_{p \mid n} \frac{p^{(l+1)\alpha} - 1}{p^{\alpha} - 1}, \]

and σ_α is also multiplicative. We write σ_1(n) = σ(n). A number n satisfying σ(n) = 2n is said to be a †perfect number. Let n = Π_{p|n} p^{l_p}. The functions ω(n) = Σ_{p|n} 1 (the number of distinct prime factors of n) and Ω(n) = Σ_{p|n} l_p (the number of prime factors of n) are often used, the former being additive and the latter completely additive.

D. Residue Characters and Gaussian Sums

Let k be a positive integer and χ be a completely multiplicative function such that χ(n) = 0 for (n, k) > 1 and χ(n_1) = χ(n_2) for n_1 ≡ n_2 (mod k). The function χ is called a Dirichlet character (or residue character) with modulus k (or modulo k). There exist φ(k) distinct characters modulo k for a given k. The character satisfying χ(n) = 1 for every n coprime to k is called the principal character modulo k, and we denote it by χ_0. It is easily proved that

\[ \sum_{n\ (\mathrm{mod}\ k)} \chi(n) = \begin{cases} \varphi(k) & (\chi = \chi_0), \\ 0 & (\chi \ne \chi_0), \end{cases} \qquad \sum_{\chi} \chi(n) = \begin{cases} \varphi(k) & (n \equiv 1 \ (\mathrm{mod}\ k)), \\ 0 & (n \not\equiv 1 \ (\mathrm{mod}\ k)). \end{cases} \]

Suppose that k = k_1 k_2 ⋯ k_r ((k_i, k_j) = 1, i ≠ j). Then there exist r characters χ_i modulo k_i permitting the unique decomposition χ = χ_1 χ_2 ⋯ χ_r.

Let χ be a character modulo k and ρ = exp(2πi/k). The Gaussian sum modulo k is defined by

\[ G(a, \chi) = \sum_{n\ (\mathrm{mod}\ k)} \chi(n)\, \rho^{an}, \]

where n runs over a complete system modulo k (→ 297 Number Theory, Elementary, G). Hence if a ≡ b (mod k), then G(a, χ) = G(b, χ). Suppose that k = k_1 k_2 ⋯ k_r ((k_i, k_j) = 1, i ≠ j) and a_i ≡ a (mod k_i). Then we have

\[ G(a, \chi) = \prod_{i=1}^{r} \chi_i(k/k_i)\, G(a_i, \chi_i), \]

where χ = χ_1 χ_2 ⋯ χ_r is the decomposition of χ mentioned above.

Let k_1 be a divisor of k. If χ(n) = 1 whenever (n, k) = 1 and n ≡ 1 (mod k_1), then we say that χ is defined mod k_1. The least positive integer f = f(χ) modulo which χ is defined is called the conductor of χ. If the conductor of χ is k itself, then χ is said to be a primitive character modulo k. When χ has the decomposition χ = χ_1 χ_2 ⋯ χ_r, the conductor of χ is also decomposed as f(χ) = f(χ_1) f(χ_2) ⋯ f(χ_r). If D is a square-free integer, then the †discriminant d of the quadratic number field Q(√D) (where Q is the rational number field) is either D (D ≡ 1 (mod 4)) or 4D (D ≡ 2, 3 (mod 4)). The integers d so represented are called the fundamental discriminants. The †Kronecker symbol (d/n) (→ 347 Quadratic Fields) is defined only for such fundamental discriminants. Z. Suetuna and A. Z. Walfisz (1936) proved that if χ(n) is a real primitive character modulo k, then we necessarily have one of the following cases: (i) k = p_1 p_2 ⋯ p_r; (ii) k = 4 p_1 p_2 ⋯ p_r; or (iii) k = 8 p_1 p_2 ⋯ p_r (with the p_i distinct odd primes). In case (i),

\[ \chi(n) = \begin{cases} (k/n) & (k \equiv 1 \ (\mathrm{mod}\ 4)), \\ (-k/n) & (k \equiv -1 \ (\mathrm{mod}\ 4)); \end{cases} \]

in case (ii),

\[ \chi(n) = \begin{cases} (-k/n) & (k/4 \equiv 1 \ (\mathrm{mod}\ 4)), \\ (k/n) & (k/4 \equiv -1 \ (\mathrm{mod}\ 4)); \end{cases} \]

and in case (iii), χ(n) is either (k/n) or (−k/n).

If χ is a primitive character modulo k, then G(a, χ) = χ̄(a) G(1, χ) and G(1, χ) \overline{G(1, χ)} = k. In particular, if χ is a real primitive character, then

\[ G(1, \chi) = \begin{cases} \sqrt{k} & (\chi(-1) = 1), \\ i\sqrt{k} & (\chi(-1) = -1). \end{cases} \]

Sometimes S(a, k) = Σ_{n (mod k)} ρ^{an²} is called the Gaussian sum, where ρ = exp(2πi/k). If (a, k_i) = 1 (i = 1, 2) and (k_1, k_2) = 1, then S(a, k_1 k_2) = S(a k_1, k_2) S(a k_2, k_1). If (a, 2) = 1, then

\[ S(a, 2^r) = \begin{cases} 0 & (r = 1), \\ (1 + i^a)\, 2^{r/2} & (r \text{ even}), \\ 2^{(r+1)/2} e^{\pi i a/4} & (r \text{ odd}, > 1). \end{cases} \]
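The quadratic Gaussian sum S(a, k) is a finite sum, so its stated properties are easy to check numerically. The following is a minimal sketch in Python, not part of the original article: the function names (`gauss_sum`, `power_of_two_closed_form`, `check_multiplicativity`) are hypothetical, and the closed form used for k = 2^r with odd a (0 for r = 1, (1 + i^a)2^{r/2} for even r, 2^{(r+1)/2}e^{πia/4} for odd r > 1) is the standard textbook evaluation, taken here as an assumption.

```python
import cmath
from math import gcd


def gauss_sum(a: int, k: int) -> complex:
    """Quadratic Gaussian sum S(a, k) = sum over n mod k of exp(2*pi*i*a*n^2/k)."""
    return sum(cmath.exp(2j * cmath.pi * a * n * n / k) for n in range(k))


def power_of_two_closed_form(a: int, r: int) -> complex:
    """Assumed closed form of S(a, 2^r) for odd a (standard textbook values)."""
    if r == 1:
        return 0j
    if r % 2 == 0:
        return (1 + 1j ** a) * 2 ** (r // 2)
    return 2 ** ((r + 1) // 2) * cmath.exp(1j * cmath.pi * a / 4)


def is_close(z: complex, w: complex, tol: float = 1e-9) -> bool:
    """Compare two complex numbers up to floating-point rounding."""
    return abs(z - w) < tol


def check_multiplicativity(a: int, k1: int, k2: int) -> bool:
    """Check S(a, k1*k2) == S(a*k1, k2) * S(a*k2, k1) for coprime k1, k2."""
    assert gcd(k1, k2) == 1, "k1 and k2 must be coprime"
    left = gauss_sum(a, k1 * k2)
    right = gauss_sum(a * k1, k2) * gauss_sum(a * k2, k1)
    return is_close(left, right)


if __name__ == "__main__":
    print("multiplicativity holds:", check_multiplicativity(3, 5, 7))
    for r in range(1, 6):
        print(r, gauss_sum(1, 2 ** r), power_of_two_closed_form(1, r))
```

The comparison uses a small tolerance because the sums are evaluated in floating point; an exact check would require arithmetic in a cyclotomic field, which is beyond this sketch.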
Let p be an odd prime and (a, p) = 1. Then we have

\[ S(a, p^r) = \begin{cases} (a/p)\, S(1, p) & (r = 1), \\ p^{r/2} & (r \text{ even}), \\ p^{(r-1)/2} S(a, p) & (r \text{ odd} > 1), \end{cases} \]

where (a/p) is the †Legendre symbol (→ 297 Number Theory, Elementary, H). The well-known Gauss formula is stated as follows:

\[ S(1, p) = \begin{cases} \sqrt{p} & (p \equiv 1 \ (\mathrm{mod}\ 4)), \\ i\sqrt{p} & (p \equiv 3 \ (\mathrm{mod}\ 4)). \end{cases} \]

Various proofs of this formula have been given by many authors [5].

Let

\[ G(a, \chi_0) = \sum_{h\ (\mathrm{mod}\ k)}{}' \exp(2\pi i a h / k), \]

called the Ramanujan sum, be denoted by c_k(a). The sum Σ′_{h (mod k)} means that h runs through a reduced residue system modulo k (→ 297 Number Theory, Elementary, G). It follows that c_k(1) = μ(k) and

\[ c_k(a) = \sum_{d \mid (k, a)} \mu\Bigl( \frac{k}{d} \Bigr) d. \]

E. Analytic Methods

Let f be a number-theoretic function. One of the problems in analytic number theory is to estimate Σ_{n≤x} f(n) or to expand it into series. We assume that f(x) and g(x) are real-valued functions defined for x ≥ 1 and g(x) is of class C¹. Then we have

\[ \sum_{n \le x} f(n) g(n) = F(x) g(x) - \int_1^x F(t)\, g'(t)\, dt, \]

where F(x) = Σ_{n≤x} f(n). If we take f(x) = 1, g(x) = log x or 1/x in this formula, then we obtain

\[ \sum_{n \le x} \log n = x \log x - x + O(\log x) \]

or

\[ \sum_{n \le x} \frac{1}{n} = \log x + C + O\Bigl( \frac{1}{x} \Bigr) \]

(where C is the †Euler constant), respectively.

We now construct from the function f the Dirichlet series

\[ F(s) = \sum_{n=1}^{\infty} f(n)\, n^{-s} \]

or the power series

\[ \sum_{n=1}^{\infty} f(n)\, x^{n}. \]

These are called the generating functions of f. The consideration of these functions makes possible the application of function-theoretic methods to the problem in this section. Sometimes more complicated functions are used. We mainly consider the generating function represented by the Dirichlet series. The generating functions of ρ, μ, d, χ (→ Sections C, D) are ζ(s) = Σ_{n=1}^∞ n^{−s}, Σ_{n=1}^∞ μ(n) n^{−s}, Σ_{n=1}^∞ d(n) n^{−s}, L(s, χ) = Σ_{n=1}^∞ χ(n) n^{−s}, respectively. The †abscissa of absolute convergence of each of these Dirichlet series is 1. The function L(s, χ) is called the †Dirichlet L-function (→ 123 Distribution of Prime Numbers; 450 Zeta Functions). When k = 1 and χ = χ_0, L(s, χ) is precisely ζ(s).

Let F(s), G(s), and H(s) be the generating functions of f, g, and h = f * g. Let f and g be multiplicative, so that h is also multiplicative. Moreover, if F(s) and G(s) are absolutely convergent for σ > σ_0, then H(s) is also absolutely convergent for σ > σ_0, and F(s)G(s) = H(s) for σ > σ_0. It follows from this that Σ_{n=1}^∞ μ(n) n^{−s} = ζ(s)^{−1}, Σ_{n=1}^∞ μ(n)χ(n) n^{−s} = L(s, χ)^{−1}, Σ_{n=1}^∞ d_k(n) n^{−s} = ζ(s)^k, and Σ_{n=1}^∞ d²(n) n^{−s} = ζ⁴(s)/ζ(2s). The last result was given by S. Ramanujan. More generally, Σ_{n=1}^∞ d^r(n) n^{−s} = ζ(s)^{2^r} φ(s), where φ(s) is an analytic function for σ > 1/2. By utilizing analytic methods, we can deduce from this that Σ_{n≤x} d^r(n) ~ x(c_1 (log x)^{2^r − 1} + ⋯ + c_{2^r}) (B. M. Wilson, 1923) (→ 123 Distribution of Prime Numbers; 242 Lattice-Point Problems).

There are many formulas known as the Euler summation formula. Among them the following one is convenient to use (E. Landau, L. J. Mordell, H. Davenport): Let f_0(x) = x − [x] − 1/2 (when x is not an integer), = 0 (when x is an integer). We successively construct continuous functions f_1(x), f_2(x), … of period 1 such that f_r′(x) = f_{r−1}(x) (r ≥ 1) if x is not an integer, and ∫_0^1 f_r(x) dx = 0. If F(x) is of class C^h on [a, b], then using these auxiliary functions, we have

\[ \sum_{a \le m \le b}{}' F(m) = \int_a^b F(x)\, dx + \sum_{r=0}^{h-1} (-1)^{r+1} \bigl[ f_r(x) F^{(r)}(x) \bigr]_a^b + (-1)^{h-1} \int_a^b f_{h-1}(x)\, F^{(h)}(x)\, dx, \]

where Σ′ means that the term corresponding to m = a or m = b in the sum is to be replaced by F(a)/2 or F(b)/2 whenever a or b is an integer. Since the †Fourier series of f_0(x) is

\[ -\sum_{n=1}^{\infty} \frac{\sin(2\pi n x)}{\pi n} \]

(which is convergent), f_r(x) can be expressed by
\[ f_r(x) = -\sum_{n=-\infty}^{\infty}{}' \frac{e^{2\pi i n x}}{(2\pi i n)^{r+1}}, \]

where Σ′ means that the term with n = 0 is omitted and this sum actually means

\[ \lim_{N \to \infty} \sum_{n=-N}^{N}{}'. \]

Hence if h = 1, we have

\[ \sum_{a \le m \le b}{}' F(m) = \int_a^b F(x)\, dx - \bigl[ f_0(x) F(x) \bigr]_a^b - \sum_{n=1}^{\infty} \frac{1}{\pi n} \int_a^b F'(x) \sin(2\pi n x)\, dx. \]

For instance, if we put h = 1, a = 1, b = N, F(x) = x^{−s} and let N tend to ∞, then we have the formula

\[ \zeta(s) = \frac{1}{s - 1} + \frac{1}{2} - s \int_1^{\infty} \frac{f_0(x)}{x^{s+1}}\, dx \qquad \text{for } \sigma > 1. \]

Utilizing the following integration by parts, the integral on the right-hand side can be extended so as to become holomorphic in s in the whole complex plane:

\[ \int_1^{\infty} \frac{f_0(x)}{x^{s+1}}\, dx = -f_1(1) + (s + 1) \int_1^{\infty} \frac{f_1(x)}{x^{s+2}}\, dx = \cdots. \]

Probabilistic considerations are also used for the study of various number-theoretic functions. If f(n) is ω(n) or Ω(n), then we have Σ_{n≤x} f(n) = x log log x + cx + o(x). Therefore the average order of ω(n) or Ω(n) is estimated as log log n. Let A(x; α, β) be the number of n satisfying n ≤ x and log log n + α√(log log n) < f(n) < log log n + β√(log log n). Then

\[ \lim_{x \to \infty} \frac{A(x; \alpha, \beta)}{x} = \frac{1}{\sqrt{2\pi}} \int_{\alpha}^{\beta} e^{-u^2/2}\, du. \]

For f(n) = ω(n), the result was proved by P. Erdős and M. Kac (1940) by using the †central limit theorem and V. Brun's †sieve method. Further general formulas were obtained by M. Tanaka (1955). Further development and the present state of probabilistic number theory can be seen in [7, 8].

Ω means that there exist infinitely many k satisfying |S_{m,χ}| > c√k log log k by taking m and χ suitably, with c any positive constant.

Let χ be a character modulo k, and χ⁰ be a primitive character associated with χ. Then we have L(s, χ) = L(s, χ⁰) Π_{p|k} (1 − χ⁰(p) p^{−s}). It follows that

\[ \sum_{n \le x} \chi(n) = \sum_{d \mid k} \mu(d)\, \chi^{0}(d) \sum_{m \le x/d} \chi^{0}(m) \]

if the conductor of χ⁰ is f. A. G. Postnikov investigated (1956) the sum of characters. The least integer that is not a quadratic residue modulo p does not exceed p^{1/(2√e)} (log p)², where p is a sufficiently large prime (I. M. Vinogradov, 1926). The least integer that is a †primitive root modulo p does not exceed 2^{m+1}√p, where m is the number of distinct prime divisors of p − 1, with p a prime (L. K. Hua, 1942). D. A. Burgess (1962) deals with the latter results. We conclude with Artin's conjecture: If w is a given square-free integer, then there exist infinitely many primes p such that w is a primitive root modulo p. In 1967, C. Hooley proved this conjecture subject to the assumption that the general †Riemann hypothesis holds for †Dedekind zeta functions over certain †Kummer extensions of Q. He also obtained the asymptotic formula for the number of such primes not exceeding x. Recently the †sieve method has been widely applied to the various investigations in the theory of number-theoretic functions (→ 123 Distribution of Prime Numbers E).

References

[1] R. G. Ayoub, An introduction to the analytic theory of numbers, Amer. Math. Soc. Math. Surveys, 1963.
[2] A. O. Gel'fond and Yu. V. Linnik, Elementary methods in analytic number theory, Rand McNally, 1965. (Original in Russian, 1962.)
[3] G. H. Hardy and E. M. Wright, An introduction to the theory of numbers, Clarendon Press, fourth edition, 1965.
Finally, we mention the well-known estima- [4] H. Hasse, Vorlesungen iiber Zahlentheorie,
tion formulas. Let E be an arbitrary positive Springer, 1950.
number and n be a positive integer. Then d(n) [S] E. G. H. Landau, Vorlesungen iiber
= O(n’), where 0 (+Landau’s symbol) depends Zahlentheorie I, Hirzel, 1927 (Chelsea, 1946).
on E. We know that lim sup,,,(logd(n)) [6] W. J. Leveque, Topics in number theory,
(log log n)/log n = log 2 (S. Wigert, 1907), and Addison-Wesley, 1956.
lim inf,,, cp(n).(loglogn)/n= e?. The result [7] J. Kubilius, Probabilistic methods in the
w(n) = O(log n/log log n) is often used. Let x be theory of numbers, Amer. Math. Sot. Transl.
a primitive character modulo k, and let S,,, = Math. Monographs, 1964. (Original in Russian,
C,“=i x(n). We can prove that IS,1 < & log k for 1962.)
all m (I. Schur, 1918) and S,,,=n($ loglogk) [S] P. D. T. A. Elliott, Probabilistic number
(R. Paley, 1932). This formula, with the symbol theory I, II, Springer, 1979.

296 (V.1)
Number Theory

A. History

Simple and curious relations among integers were discovered and admired from antiquity. For example, we have the relation $3^2 + 4^2 = 5^2$, which has geometric meaning concerning right triangles. The Pythagoreans (→ 187 Greek Mathematics) sought similar relations. They were also interested in †perfect numbers (numbers equal to the sum of their divisors, such as 28 = 1 + 2 + 4 + 7 + 14). Modern arithmetic inherits from the Greeks the proof of the existence of an infinite number of primes, the †Euclidean algorithm to obtain the greatest common divisor of two integers (both given in Euclid's Elements), and Eratosthenes' †sieve for finding primes. In the 3rd century A.D., Diophantus of Alexandria discovered a method of solving indeterminate equations of degrees 1 and 2; this marked the origin of Diophantine analysis. The ancient Chinese also knew how to solve equations of the first degree in some special cases. Arithmetic also developed in India from an early period; its relation to Greek mathematics is not yet entirely clear. In the 12th century, the Indian mathematician Bhāskara knew how to solve †Pell equations using a method much like Lagrange's.

Interest in integers was reborn in Europe in the 17th century. During this period, Bachet de Méziriac (1581-1638) rediscovered the solution of the †Diophantine equation of the first degree and published it in his famous book on mathematical recreations [1]. Primes of the form $2^p - 1$, closely related to perfect numbers and called †Mersenne numbers, attracted considerable interest. †Fermat, sometimes called the father of number theory, announced numerous results without giving proofs; the most famous among them is the so-called Fermat's last theorem (→ 145 Fermat's Problem). Another famous conjecture of his is that every integer is expressible as the sum of at most $n$ $n$-gonal numbers, i.e., numbers of the form $k(k-1)n/2 - k(k-2)$, $k \in \mathbf{N}$. This was proved by Gauss (for the case $n = 3$), Jacobi ($n = 4$), and Cauchy (for the general case).

In the 18th century, remarkable progress was made by Euler and Lagrange. The second part of Euler's Algebra [2] contains rich results of miscellaneous sorts in the field. Lagrange developed the theory of †continued fractions and applied it to arithmetic. Toward the end of the 18th century, Legendre compiled his comprehensive book [3], from whose title originates the term number theory.

Gauss's Disquisitiones [4] appeared at about the same time as Legendre's book. The theoretical arithmetic of today originates from this work of Gauss. The book includes the theories of †quadratic residues, †quadratic forms, and cyclotomy (i.e., arithmetic theory of the roots of unity in the field of complex numbers), all of which appeared as well-developed theories of a remarkably high standard. The work was received with more respect than comprehension. Dirichlet made a lifelong effort to popularize the Disquisitiones; he also applied analytical methods to compute the †class number of quadratic forms, thus giving number theory a new direction, the analytic theory of numbers. Gauss treated only binary quadratic forms; Eisenstein, Minkowski, and Siegel generalized the theory to the case of $n$ variables. The algebraic theory of numbers has its origin in Gauss's paper on biquadratic residues. (→ 14 Algebraic Number Fields; 59 Class Field Theory; 118 Diophantine Equations; 182 Geometry of Numbers; 297 Number Theory, Elementary; 347 Quadratic Fields; 430 Transcendental Numbers.)

B. Analytic Methods

Analytic methods are sometimes used to solve arithmetic problems. For example, Legendre conjectured that any arithmetic progression of integers $a, a + d, a + 2d, \ldots$ contains an infinite number of primes if $a$ and $d$ are relatively prime. The conjecture was first proved by Dirichlet in 1837; in the proof he used an †$L$-function. Recently, proofs that do not use $L$-functions have been obtained, although these proofs are still not purely arithmetical. Analysis is indispensable in the formulation of some arithmetic problems. For example, let $x \in \mathbf{R}$ and $\pi(x)$ be the number of primes not exceeding $x$. Euclid's Elements already give the result $\pi(x) \to \infty$ as $x \to \infty$, but to describe the behavior of $\pi(x)$ as $x \to \infty$, notions of analysis are needed. Gauss conjectured that

$$\lim_{x \to \infty} \frac{\pi(x)}{x/\log x} = 1.$$

This †prime number theorem was first proved in 1896, using the results of the theory of functions of a complex variable; more elementary proofs have been obtained in recent years.

Analysis is sometimes needed to solve or simplify certain problems in number theory. The branch of mathematics treating such problems is called analytic number theory. The question of the extent to which analysis is really needed in dealing with such problems is itself an interesting one.
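The prime number theorem quoted in Section B can be explored numerically. The following is a minimal sketch in Python, not part of the original article: it counts primes with a simple sieve of Eratosthenes and forms the ratio $\pi(x)/(x/\log x)$. The function names `primes_up_to`, `prime_count`, and `pnt_ratio` are illustrative choices, and the sieve is assumed to be adequate only for moderate $x$ (it stores one flag per integer up to $x$).

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: return the list of all primes p <= limit."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            # Discard the multiples of p, starting at p*p (smaller ones
            # were already discarded by smaller primes).
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


def prime_count(x):
    """pi(x): the number of primes not exceeding x."""
    return len(primes_up_to(int(x)))


def pnt_ratio(x):
    """The ratio pi(x) / (x / log x) appearing in Gauss's conjecture."""
    return prime_count(x) / (x / math.log(x))


if __name__ == "__main__":
    for x in (10**3, 10**4, 10**5, 10**6):
        print(x, prime_count(x), round(pnt_ratio(x), 4))
```

For these sample values the ratio stays above 1 and decreases only slowly, which illustrates why the limit statement cannot be guessed from small tables alone.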

In the 20th century, analytic number theory has made rapid progress. The problem of distribution of primes has been generalized to the case of algebraic number fields. †Additive number theory dealing with †Waring's problem, †Goldbach's problem, and other problems has developed and formed a new field. We also have geometric number theory, which deals with †lattice-point problems. (→ 4 Additive Number Theory; 123 Distribution of Prime Numbers; 242 Lattice-Point Problems; 295 Number-Theoretic Functions; 328 Partitions of Numbers; 450 Zeta Functions.)

References

[1] C. G. Bachet de Méziriac, Problèmes plaisants et délectables qui se font par les nombres, Lyon, 1612.
[2] L. Euler, Vollständige Anleitung zur Algebra, Petersburg, 1770.
[3] A. M. Legendre, Essai sur la théorie des nombres, Paris, 1798.
[4] K. F. Gauss, Disquisitiones arithmeticae, Leipzig, 1801; English translation, Yale Univ. Press, 1966.
[5] M. B. Cantor, Vorlesungen über Geschichte der Mathematik, Teubner, 1894-1908.
[6] L. E. Dickson, History of the theory of numbers I, II, III, Carnegie Institution of Washington, 1919-1923 (Chelsea, 1952).
[7] P. G. L. Dirichlet, Vorlesungen über Zahlentheorie, herausgegeben und mit Zusätzen versehen von R. Dedekind, Braunschweig, fourth edition, 1894.
[8] J. LeVeque, Reviews of papers in number theory, vols. 1-6, Amer. Math. Soc., 1974.

297 (V.2)
Number Theory, Elementary

A. The Euclidean Algorithm

We denote the set of natural numbers 1, 2, 3, ... by $\mathbf{N}$ and the set of rational integers $0, \pm 1, \pm 2, \ldots$ by $\mathbf{Z}$. Evidently $\mathbf{Z}$ is an ordered †commutative ring and also an †integral domain with respect to ordinary addition and multiplication. For any $a \in \mathbf{Z}$ and $b \in \mathbf{N}$, there exists a unique pair of integers $q$ and $r$ such that $a = qb + r$ ($0 \le r < b$) (division algorithm); $q$ is called the quotient and $r$ the remainder of the division of $a$ by $b$. When the remainder $r$ is zero, we say that $a$ is a multiple of $b$, $b$ is a divisor of $a$, and $a$ is divisible by $b$. We denote this relation by $b \mid a$. After a finite number of divisions $a = q_1 b + r_1$, $b = q_2 r_1 + r_2$, $r_1 = q_3 r_2 + r_3, \ldots$ ($b > r_1 > r_2 > \cdots > 0$), we reach an equation of the form $r_k = q_{k+2} r_{k+1}$. The remainder $r = r_{k+1}$, uniquely determined in this manner by $a$ and $b$, is the greatest common divisor (G.C.D.) of $a$ and $b$. It can be expressed as $r = ax + by$ with integers $x$ and $y$. This method of obtaining the G.C.D. is called the Euclidean algorithm. The greatest common divisor of $a$ and $b$ is denoted by $(a, b)$. If an integer $c$ divides both $a$ and $b$, then $c \mid (a, b)$. When $(a, b) = 1$, we say that $a$ and $b$ are relatively prime, and there are integral solutions $x$, $y$ of the equation $ax + by = 1$. Given a pair of positive integers $a$ and $b$, there exists a unique positive common multiple $c$ of $a$ and $b$ that divides any common multiple of $a$ and $b$, called the least common multiple (L.C.M.) of $a$ and $b$. If $d$ and $l$ denote the G.C.D. and L.C.M. of $a$ and $b$, respectively, we have $ab = dl$.

B. Prime Numbers

An integer $p$ is called a prime number if $p$ is larger than 1 and has no positive divisors other than 1 and itself. A positive integer is called a composite number if it has positive divisors other than 1 and itself. The following method of selecting the prime numbers from the sequence 2, 3, 4, 5, ..., known to the Greeks, is called Eratosthenes' sieve: First we discard multiples of two, thus reaching the second prime number, 3. Then we discard multiples of three and reach the next prime, 5. Continuing this process, we find the sequence of primes 2, 3, 5, 7, 11, ... by sifting out all multiples.

C. Decomposition into Prime Factors

Every integer $n > 1$ can be uniquely expressed as the product of primes; the resulting decomposition can be written as $n = p^{\alpha} q^{\beta} r^{\gamma} \cdots$ with distinct prime factors $p, q, r, \ldots$ and corresponding exponents $\alpha, \beta, \gamma, \ldots$ uniquely determined by $n$. This theorem is called the fundamental theorem of elementary number theory.

D. Perfect Numbers

We denote by $\sigma(n)$ the sum of positive divisors of $n$ (including 1 and $n$ itself). According as $\sigma(n)$ is greater than, equal to, or less than $2n$, we call $n$ an abundant number, perfect number, or deficient number. An even number is perfect if and only if it can be represented as $n = 2^{e-1}(2^e - 1)$ with prime $2^e - 1$ (L. Euler). The existence of an odd perfect number still constitutes an open question. It is well known that, if $n$ is odd and perfect, then $n$ must be of the

form $n = p_0^{a_0} p_1^{a_1} \cdots p_t^{a_t}$, where $p_0 \equiv a_0 \equiv 1 \pmod 4$, and $a_i$ is even for $i > 0$. Recently it has been proved that $t$ must be $\ge 7$ (P. H. Hagis, Jr., Math. Comp., 35 (1980)). Many necessary conditions for an odd number to be perfect have been given and repeatedly improved. It has been recently proved that there exists no odd perfect number less than $10^{50}$ (Hagis, Math. Comp., 27 (1973)).

Two numbers $m$ and $n$ are said to form a pair of amicable numbers if $\sigma(n) - n = m$ and $\sigma(m) - m = n$ (e.g., $m = 220$ and $n = 284$). Euler listed 61 such pairs. The following two numbers $m$, $n$, both having 152 digits, constitute the largest amicable pair currently known:

$$m = 3^4 \cdot 5 \cdot 11 \cdot 5281^{19} \cdot 29 \cdot 89\,(2 \cdot 1291 \cdot 5281^{19} - 1),$$
$$n = 3^4 \cdot 5 \cdot 11 \cdot 5281^{19}\,(2^3 \cdot 3^3 \cdot 5^2 \cdot 1291 \cdot 5281^{19} - 1)$$

(H. J. J. te Riele, Math. Comp., 28 (1974)).

E. Lionnet considered the numbers $n$ such that the product $\prod_{d \mid n} d$ is equal to $n^2$ and called such numbers perfect numbers of the second kind, e.g., $n = p^3$, $n = pp'$, where $p$ and $p'$ are unequal prime numbers. It is also known that there are numbers $n$ such that $\sigma(n)/n$ is an integer. For example, $2^5 \cdot 3 \cdot 7$ and $2^{15} \cdot 3^5 \cdot 5^2 \cdot 7^2 \cdot 11 \cdot 13 \cdot 17 \cdot 19 \cdot 31 \cdot 43 \cdot 257$ have this property.

E. Mersenne Numbers

A number of the form $2^e - 1$, where $e$ is a prime, is called a Mersenne number. For the number $2^e - 1$ to be prime, it is necessary but not sufficient that $e$ be prime. If a Mersenne number is a prime, it is called a Mersenne prime. It is not known whether there are infinitely many Mersenne primes; it has been verified, however, that for $e < 44500$, there are 27 cases of $e$ such that $2^e - 1$ is a Mersenne prime; $e$ = 2, 3, 5, 7, 13, 17, 19, 31, 61, 89, 107, 127, 521, 607, 1279, 2203, 2281, 3217, 4253, 4423, 9689, 9941, 11213, 19937, 21701, 23209, and 44497. Until the 18th century, the verifications for $e \le 31$ were done by direct calculation. For $61 \le e \le 127$, the "Lucas test" was utilized to execute the computation. The remaining cases were calculated by means of electronic computers. The number $2^{44497} - 1$, which has 13395 digits, is at present the largest known prime.

F. Fermat Numbers

Numbers of the form $2^{2^\nu} + 1$ are called Fermat numbers. For a number $p = 2^e + 1$ to be a prime, it is necessary that $e$ be a power of 2. Fermat conjectured that numbers of the form $2^{2^\nu} + 1$ are all primes. In fact, for $\nu = 0, 1, 2, 3, 4$, the corresponding $p = 3, 5, 17, 257, 65537$ are primes; however, $2^{2^5} + 1$ is divisible by 641. It is not yet known whether there exist Fermat primes other than these five primes. For further details → [10]. Fermat numbers are closely connected with the problem of †geometric construction of regular polygons.

G. Congruence

Let $m$ be a positive integer. Two integers $a$ and $b$ are said to be congruent modulo $m$ if their difference is divisible by $m$; we denote this relation by the congruence $a \equiv b \pmod m$ (or simply by $a \equiv b\ (m)$) and call $m$ the modulus of this congruence. Congruence modulo $m$ is an †equivalence relation, which is compatible with the ring operations and classifies $\mathbf{Z}$ into $m$ classes. We thus obtain the †residue class ring $\mathbf{Z}/m\mathbf{Z}$ with $m$ elements. If $p$ is a prime, then $\mathbf{Z}/p\mathbf{Z}$ is a field which is isomorphic to the †prime field with †characteristic $p$. A complete system of representatives of the quotient set $\mathbf{Z}/m\mathbf{Z}$ is called a complete residue system modulo $m$. On the other hand, a set of $\varphi(m)$ elements $n_i$ ($\varphi$ is †Euler's function) such that $n_i \not\equiv n_j\ (m)$ ($i \ne j$) and $(n_i, m) = 1$ is called a reduced residue system modulo $m$. The set of all residue classes represented by a reduced residue system modulo $m$ forms a multiplicative Abelian group of order $\varphi(m)$; we denote this group by $(\mathbf{Z}/m\mathbf{Z})^*$. If $(a, m) = 1$, then $a^{\varphi(m)} \equiv 1 \pmod m$. If $p$ is a prime and $(a, p) = 1$, then $a^{p-1} \equiv 1 \pmod p$ (Fermat's theorem), since $\varphi(p) = p - 1$. When $m$ is 2, 4, $p^k$ or $2p^k$ ($p \ne 2$; $k = 1, 2, \ldots$), $(\mathbf{Z}/m\mathbf{Z})^*$ is a †cyclic group, whose †generator $g$ is called a primitive root modulo $m$ (→ Appendix B, Table 1). For any $a$ prime to $m$ and generator $g$ of $(\mathbf{Z}/m\mathbf{Z})^*$, there exists a unique number $\rho$ ($1 \le \rho \le \varphi(m)$) such that $a \equiv g^{\rho} \pmod m$. We call $\rho$ the index of $a$ with respect to the basis $g$, and write $\rho = \operatorname{Ind}_g a$ (→ Appendix B, Table 2). The group $(\mathbf{Z}/2^k\mathbf{Z})^*$ ($k \ge 3$) is Abelian of type $(2, 2^{k-2})$, a basis of which is formed by the residue classes represented by $-1$ and 5. From this follows $(p - 1)! \equiv -1 \pmod p$ (Wilson's theorem). Generally, if $m = \prod_{i=1}^{r} m_i$, $(m_i, m_j) = 1$ for $i \ne j$, then we have $\mathbf{Z}/m\mathbf{Z} \cong \mathbf{Z}/m_1\mathbf{Z} + \cdots + \mathbf{Z}/m_r\mathbf{Z}$ (direct sum) and $(\mathbf{Z}/m\mathbf{Z})^* \cong (\mathbf{Z}/m_1\mathbf{Z})^* \times \cdots \times (\mathbf{Z}/m_r\mathbf{Z})^*$ (direct product).

The congruence $ax \equiv b \pmod m$ with $(a, m) = d$ is solvable if and only if $d \mid b$, and when it is solvable, the solution is unique modulo $m/d$. The simultaneous congruences $x \equiv a_i \pmod{m_i}$ ($i = 1, 2, \ldots, k$) are solvable if and only if $a_i \equiv a_j \pmod{(m_i, m_j)}$ ($i, j = 1, \ldots, k$), and when they are solvable, the solution is unique modulo the

least common multiple of $m_1, \ldots, m_k$. In particular, when $(m_i, m_j) = 1$ ($i \ne j$), the solution is unique modulo $m_1 m_2 \cdots m_k$ (Chinese remainder theorem). If $m = p_1^{e_1} \cdots p_k^{e_k}$ is the factorization of $m$, solving the congruence $f(x) \equiv 0 \pmod m$, where $f(x)$ is a polynomial with integral coefficients, can be reduced to solving $f(x) \equiv 0 \pmod{p_i^{e_i}}$ ($i = 1, \ldots, k$). Also, solving a quadratic congruence can be reduced to solving a linear congruence and a congruence of the form $x^2 \equiv a \pmod m$.

H. Quadratic Residues

When the congruence $x^2 \equiv a \pmod m$, where $(a, m) = 1$, is solvable, $a$ is said to be a quadratic residue modulo $m$; otherwise, $a$ is said to be a quadratic nonresidue modulo $m$. The following two conditions are necessary and sufficient for $a$ to be a quadratic residue modulo $m$: (i) $a$ is a quadratic residue with respect to each of the prime factors $p$ ($\ne 2$); and (ii) $a \equiv 1 \pmod 4$ or $a \equiv 1 \pmod 8$ according as $4 \mid m$ or $8 \mid m$.

Given a prime number $p$ and integer $a$ prime to $p$, the Legendre symbol $(a/p)$ is by definition 1 or $-1$ according as $a$ is a quadratic residue modulo $p$ or not. The value of this symbol is determined by $a \pmod p$, and we have $(ab/p) = (a/p)(b/p)$. Hence the symbol determines a †character of $(\mathbf{Z}/p\mathbf{Z})^*$ of order 2. Furthermore, the congruence $(a/p) \equiv a^{(p-1)/2} \pmod p$ holds (Euler's criterion).

I. Reciprocity Law

For odd primes $p$ and $q$ ($p \ne q$), the formulas

$$\left(\frac{q}{p}\right)\left(\frac{p}{q}\right) = (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}, \qquad \left(\frac{-1}{p}\right) = (-1)^{(p-1)/2}, \qquad \left(\frac{2}{p}\right) = (-1)^{(p^2-1)/8}$$

are called the law of quadratic reciprocity of the Legendre symbol, the first complementary law, and the second complementary law, respectively. These laws were conjectured by Euler and first proved by Gauss, who gave seven different proofs. We now have more than fifty different proofs of these laws. P. Bachmann (Niedere Zahlentheorie I (1902)) lists the 47 different proofs of the laws known at the time. Gauss's first proof was elementary; his second proof used quadratic forms. The latter has been reformulated utilizing the theory of quadratic fields [4]. The fourth proof used †Gaussian sums, and the sixth proof used algebraic congruences with integral coefficients. His seventh proof, contained in his posthumous works, used congruences of higher degree [4]. His fourth, sixth, and seventh proofs are related to the arithmetic of †cyclotomic fields [4]. The third and fifth proofs are the most elementary and simple. They are based on Gauss's lemma: Let $r_1, r_2, \ldots, r_{(p-1)/2}$ be the residues of divisions of $1a, 2a, \ldots, (p-1)a/2$ by an odd prime $p$, and let $n$ be the number of these residues that are greater than $p/2$. Then we have $(a/p) = (-1)^n$. T. Takagi gave a simplified exposition of the third proof using geometric figures (1904). The same method was rediscovered by G. Frobenius (1914).

When $m$ is an odd integer such that $m = \pm\prod_i p_i^{e_i}$, $(m, n) = 1$, we call $(n/m) = \prod_i (n/p_i)^{e_i}$ Jacobi's symbol. If $m$ has no square factor, it is a character of $(\mathbf{Z}/m\mathbf{Z})^*$. If we put $\operatorname{sgn} m = +1$ or $-1$ according as $m > 0$ or $m < 0$, we have the following law of quadratic reciprocity of Jacobi's symbol and its complementary laws: If $m$ and $n$ are relatively prime odd numbers, then

$$\left(\frac{m}{n}\right)\left(\frac{n}{m}\right) = (-1)^{\frac{m-1}{2}\cdot\frac{n-1}{2} + \frac{\operatorname{sgn} m - 1}{2}\cdot\frac{\operatorname{sgn} n - 1}{2}},$$
$$\left(\frac{-1}{n}\right) = (-1)^{(n-1)/2 + (\operatorname{sgn} n - 1)/2},$$
$$\left(\frac{2}{n}\right) = (-1)^{(n^* - 1)/4},$$

where

$$n^* = (-1)^{(n-1)/2}\, n.$$

Furthermore, †Kronecker's symbol, another generalization of the Legendre symbol, is used in the theory of quadratic number fields (→ 347 Quadratic Fields).

References

[1] C. F. Gauss, Werke 2, Königliche Gesellschaft der Wissenschaften, Göttingen, 1863.
[2] C. F. Gauss, Sechs Beweise des Fundamentaltheorems über quadratische Reste, Ostwald's Klassiker Nr. 122, Wilhelm Engelmann, 1901.
[3] L. E. Dickson, History of the theory of numbers I, Carnegie Institution of Washington, 1919 (Chelsea, 1952).
[4] P. G. L. Dirichlet, Vorlesungen über Zahlentheorie, herausgegeben und mit Zusätzen versehen von R. Dedekind, Vieweg, fourth edition, 1894 (Chelsea, 1969).
[5] H. Hasse, Vorlesungen über Zahlentheorie, Springer, 1950.
[6] E. G. H. Landau, Vorlesungen über Zahlentheorie I, Hirzel, 1927 (Chelsea, 1946).
[7] W. J. LeVeque, Topics in number theory, Addison-Wesley, 1956.
[8] H. Rademacher, Lectures on elementary number theory, Blaisdell, 1964.
[9] Z. I. Borevich and I. R. Shafarevich, Number theory, Academic Press, 1966. (Original in Russian, 1964.)
1115 298 B
Numerical Computation of Eigenvalues

[10] W. Sierpinski, Elementary theory of numbers, Hafner, 1964.

298 (XV.6)
Numerical Computation of Eigenvalues

A. General Remarks

Numerical computation of †eigenvalues and †eigenvectors of a matrix provides a basic technique for the numerical solution of various eigenvalue problems. Roughly speaking, there are two kinds of method. In methods of the first kind, one determines the †characteristic polynomial $p(\lambda) = \det(\lambda I - A)$ (where $I$ is the unit matrix) of $A$ (or gives an algorithm to calculate the value of $p(\lambda)$ for an arbitrary $\lambda$), then solves the algebraic equation $p(\lambda) = 0$ numerically to obtain the eigenvalues $\lambda_p$ ($p = 1, 2, \ldots$) (→ 301 Numerical Solution of Algebraic Equations), and finally one determines the eigenvectors $x_p$ by means of the equations $(\lambda_p I - A)x_p = 0$. In methods of the second kind, one obtains eigenvalues and eigenvectors directly without resorting to the solution of an algebraic equation, relying instead upon repeated application of similarity transformations. (The power method does not fit into this classification, being a different approach altogether; → Section C.) In particular, a real symmetric or complex Hermitian matrix is reduced approximately to a diagonal matrix. In this article we denote a given square matrix of order $n$ by $A = (a_{ij})$ ($i, j = 1, \ldots, n$).

B. The Jacobi Method

The Jacobi method is an iterative technique for determining all the eigenvalues and eigenvectors of a real symmetric matrix [7]. Before the advent of high-speed computers it was not considered practical, but at present it is one of the most compact and elegant methods. The following algorithm can be extended to Hermitian matrices by replacing †orthogonal transformations by suitable †unitary transformations.

Roughly, the method transforms a given matrix $A = (a_{ij})$ ($a_{ji} = a_{ij}$; $i, j = 1, \ldots, n$) into a diagonal one by repeated application of 2-dimensional rotations of the reference axes. We first put $A^{(0)} = A$, $U^{(0)} = I$, and compute $A^{(1)}, A^{(2)}, \ldots, U^{(1)}, U^{(2)}, \ldots$ successively as follows:

(1) Select an off-diagonal element of $A^{(l)} = (a_{ij}^{(l)})$ with the maximum absolute value and denote it by $a_{pq}^{(l)}$.

(2) Compute $\tan\theta = \cdots$ (where $\operatorname{sgn} x = 1$, $0$, or $-1$ according as $x > 0$, $= 0$, or $< 0$), $\cos\theta = (1 + \tan^2\theta)^{-1/2}$, and $\sin\theta = \tan\theta \cdot \cos\theta$; and form $T^{(l)} = (t_{ij}^{(l)})$, where $t_{pp}^{(l)} = t_{qq}^{(l)} = \cos\theta$, $t_{ii}^{(l)} = 1$ for $i \ne p, q$, $-t_{pq}^{(l)} = t_{qp}^{(l)} = \sin\theta$, and $t_{ij}^{(l)} = 0$ for all other $(i, j)$.

(3) Determine $A^{(l+1)}$ and $U^{(l+1)}$ by $A^{(l+1)} = T^{(l)\prime} A^{(l)} T^{(l)}$ ($'$ denotes the transpose) and $U^{(l+1)} = U^{(l)} T^{(l)}$.

In this process, $T^{(l)}$ represents an orthogonal transformation (rotation) in the plane spanned by the $p$th and $q$th coordinate axes such that $a_{pq}^{(l+1)} = a_{qp}^{(l+1)}$ is nullified. If we put $N(B) = \sum_{i,j} b_{ij}^2$ and $M(B) = \sum_{i \ne j} b_{ij}^2$, then $N(B)$ is invariant under an orthogonal transformation, so that $N(A^{(l)}) = N(A)$. Furthermore since $a_{ii}^{(l+1)} = a_{ii}^{(l)}$ ($i \ne p, q$) and $(a_{pp}^{(l+1)})^2 + (a_{qq}^{(l+1)})^2 = (a_{pp}^{(l)})^2 + (a_{qq}^{(l)})^2 + 2(a_{pq}^{(l)})^2$, we have $M(A^{(l+1)}) = M(A^{(l)}) - 2(a_{pq}^{(l)})^2$. Since $a_{pq}^{(l)}$ has the maximum absolute value among all the off-diagonal elements, we have $(a_{pq}^{(l)})^2 \ge M(A^{(l)})/(n^2 - n)$. Therefore

$$M(A^{(l+1)}) \le \bigl(1 - 2/(n^2 - n)\bigr) M(A^{(l)}) \le \bigl(1 - 2/(n^2 - n)\bigr)^{l+1} M(A) < M(A)\exp\bigl(-2(l+1)/(n^2 - n)\bigr).$$

It has been proved a fortiori [13] that, after $M(A^{(l)})$ comes down to below a certain threshold value, the convergence of the iteration process becomes quadratic, i.e., there is a number $c$ determined by the order $n$ of $A$ and the arrangement of the eigenvalues of $A$ such that $M(A^{(l + n(n-1)/2)}) \le c\,(M(A^{(l)}))^2$. Since the set of eigenvalues of $A^{(l)}$ coincides with that of $A$ and the eigenvalues of an arbitrary symmetric matrix $B$ can be made to correspond in a one-to-one way to its diagonal elements in such a way that the difference between an eigenvalue and the corresponding diagonal element is not greater than $M(B)^{1/2}$, $a_{ii}^{(l)}$ tends to $\lambda_i$ ($i = 1, \ldots, n$) as $l$ tends to infinity. Moreover, as $l$ tends to infinity, each column vector of $U^{(l)} = (u_{ij}^{(l)})$ tends to the corresponding eigenvector, in the sense that $\sum_{k=1}^{n} a_{ik} u_{kj}^{(l)} - \lambda_j u_{ij}^{(l)} \to 0$.

The number of arithmetic computations required to obtain $A^{(l+1)}$ and $U^{(l+1)}$ from $A^{(l)}$ and $U^{(l)}$ is at most proportional to $n$, so that for a given $\varepsilon > 0$, the arithmetic required to reduce $\max_i |a_{ii}^{(l)} - \lambda_i|$ below $\varepsilon (M(A))^{1/2}$ is at most proportional to $n^3$ (because $l$ is at most

proportional to n’). On the other hand, the approximate eigenvalues 1, and 1, as the two
search for an off-diagonal element of A(‘) with roots of the quadratic equation in 1:
the maximum absolute value, if it is done by
simply comparing all the elements, will require
effort proportional to n*, so that the amount
of work required by the searching process is
proportional to n4. To bypass this searching (i and j arbitrary, i #j).
process, the cyclic Jacobi method and the
threshold Jacobi method are often used. The The corresponding eigenvectors are given by
former method adopts as a:b the off-diagonal x =A x(‘+1)-x(‘+*) and x2 =~lx(‘+‘)Lxx(‘+*)~
element for which 4 > p and l= (p - 1) (n-p/2) &mpiex conjugate pairs of eigenvalues can be
+(q-p), that is, a(,:), a\*J, . . . . a(;,-‘), a#, a(,zl), dealt with in this manner. This is useful also
. are adopted in this sequence. The latter when Iill + Ii,\. The extension to the case of
method adopts as c$, off-diagonal elements more than two eigenvalues with the same
in a sequence similar to the one above as long maximum absolute value is obvious.
as they exceed a given threshold value; but if In order to determine the remaining eigen-
an element is less than that threshold value, values by the power method we have to com-
then the next element in the sequence is a can- bine it with the deflation or transformation of
didate for adoption as ati, where the thresh- matrices, as discussed later in this section and
old value is made to decrease gradually as in the following section. The amount of com-
the iteration process proceeds. However, the putation depends on the arrangement of the
search for an element with the maximum eigenvalues of A and the required accuracy.
absolute value can be done more effectively by We note that the multiplication of a matrix by
taking account of the fact that only the matrix a vector requires an amount of computation
elements lying in rows p and q and in columns proportional to n*.
p and q change their values when we transform Improvement in the approximations can be
A”’ into ,4(“1). In fact, we can record for each incorporated into the power method. If u = xP
row the value as well as the position of the + O(E) is an approximation to the eigenvector
(off-diagonal) element with the maximum xp corresponding to an eigenvalue 1, of a real
absolute value in that row. By so doing, the symmetric matrix A, then the Rayleigh quo-
effort of searching for an off-diagonal element tient I, =(u, Av)/(v, u) affords a good approxi-
with the maximum absolute value can be mation to i,. In fact, we have 11, - 1, I = O(E*).
reduced to something proportional to n on the If Izi (i = 1, , n) is an eigenvalue of A and
average. xi the corresponding eigenvector, then P(&)
(i = 1, . , n) is an eigenvalue of P(A) and xi
the corresponding eigenvector, where P(t) is
C. The Power Method a polynomial in 5 and 5 ml, This fact can be
utilized to transform the magnitudes of eigen-
The power method is suitable for obtaining values to accelerate the convergence of the
only the eigenvalue of maximum absolute power method, to separate eigenvalues with
value [6]. Let us assume that the eigenvalues the same absolute value, or to obtain inter-
i,,...,l,ofAarearrangedsothatli,I> mediate eigenvalues. The choice P(i) = 1 -c
I~,I~I~,I~...BI1,I,withi, real,andde- or (2-c) -’ is particularly useful, where c is
note by y, the left eigenvector corresponding an appropriate constant. In fact, an efficient
to 1, (which means y,(l,I-A)=O). Starting algorithm known as inverse iteration exists
from an arbitrary (real) vector x(O) such that for computing an approximate eigenvector
( yl, x(O)) #O and $‘) = 1 for a prescribed i,, when a good approximation to an eigenvalue
we compute Q(‘), Q(l), . . . and x(l), xc*), . by is known. Given a trial eigenvector x cor-
Ax(‘) = Q(‘)x(‘+~) (1 = 0, 1,2, ; xji”) = 1). Then responding to a computed eigenvalue c, one
we have lim,,, O(‘)= I 1 and lim r+mX(‘)=X computes an improved approximate eigen-
(the eigenvector corresponding to the eigin- vector y by solving (A - cl)y = x or y = (A -
value &). The rate of convergence depends on cl) -1 x.
11,/1,1 if the telementary divisor of A cor- Aitken’s 6*-method is also efficient in accel-
responding to I, is linear, but in the case of a erating the convergence of the power method.
nonlinear elementary divisor, the convergence When an eigenvalue-eigenvector pair is
is too slow for practical purposes. If i, # 1, known, another eigenvalue-eigenvector pair
and~i,l=li,l>~1,~>...>~1,~,thenthese- can be computed using the process known as
quences of /3(l) and x(l) computed by the for- deflation. If an eigenvalue 1, and the corre-
mulas above do not converge but in general sponding eigenvector x,, (and also the corre-
oscillate. However, from Q(l), @“‘), x(l), x(“‘), sponding left eigenvector ‘y,, if necessary) of A
x(‘+~) for a sufficiently large I, we can obtain are known, it is possible to “subtract” them

from $A$ to get a problem containing only the remaining eigenvalues. Such deflations are often used in combination with the power method. The following are two examples of deflation methods.

(1) Assuming that $x_p$ and $y_p$ are normalized in such a way that $(y_p, x_p) = 1$, form $B = A - \lambda_p x_p y_p^{*}$. Then $B$ has the same set of eigenvalues and eigenvectors as $A$ except for $\lambda_p$. The eigenvalue and the eigenvector of $B$ corresponding to $\lambda_p$ of $A$ are 0 and $x_p$, respectively. This kind of deflation process can be generalized to the case of nonlinear elementary divisors, but that becomes somewhat more complicated.

(2) After normalizing $x_p$ so that its $n$th component $x_{pn}$ is equal to 1, form $B = (b_{ij})$: $b_{ij} = a_{ij} - x_{pi}a_{nj}$ ($i, j = 1, \dots, n-1$). Then $B$ has the same set of eigenvalues as $A$ except for $\lambda_p$. If $w_k$ is the eigenvector corresponding to the eigenvalue $\lambda_k$ of $B$, then the corresponding eigenvector $x_k$ of $A$ is given by $x_{ki} = w_{ki} + d_k x_{pi}$ ($i = 1, \dots, n-1$), $x_{kn} = d_k$, where $d_k$ is determined from $\sum_{i=1}^{n-1} a_{ni}w_{ki} = (\lambda_k - \lambda_p)d_k$ if $\lambda_k \neq \lambda_p$. If $\lambda_k = \lambda_p$ and $r = \sum_{i=1}^{n-1} a_{ni}w_{ki} = 0$, then we can put $d_k = 0$. If $\lambda_k = \lambda_p$ and $r \neq 0$, $A$ has a nonlinear elementary divisor for $\lambda_p = \lambda_k$, and the $x_k$ defined by $x_{ki} = w_{ki}/r$ and $x_{kn} = 0$ is a generalized eigenvector of $A$ in the sense that $Ax_k = \lambda_k x_k + x_p$.

D. Transformation of Matrices

There are a number of methods of transforming a given matrix $A$ by means of a suitable similarity transformation $A \to B = S^{-1}AS$ into another matrix $B$ for which it is easier to solve the eigenvalue problem. The Givens method [9] transforms a symmetric matrix $A$ into a tridiagonal matrix $B$ (i.e., a matrix such that $b_{ij} = 0$ for $|i - j| \geq 2$) by means of a matrix $S$ which is the product of 2-dimensional rotation matrices. The Householder method [10] also transforms a symmetric $A$ into a tridiagonal $B$ by means of an orthogonal matrix $S$ of special type, i.e., a reflection matrix $I - 2uu^{*}$ ($u^{*}u = 1$, $u^{*}$ = conjugate transpose of $u$), and the Lanczos method [8] transforms a general $A$ into a tridiagonal $B$. To general matrices the following methods are also applicable: (1) The Danilevskii method [1] transforms $A$ into its companion matrix $B$ by repeated application of elimination operations. Here, row interchanges can be combined to increase numerical stability [14]. (2) The Hessenberg method [3] transforms $A$ into a $B$ such that $b_{ij} = 0$ for $i - j \geq 2$ with a triangular $S$. (3) The Givens method [9] transforms $A$ into a $B$ of the same form as in (2) by repeated application of 2-dimensional rotations. This method now tends to give way to the Householder method or to the elimination method with row interchange. All these methods require an amount of computation proportional to $n^3$. In general, special treatment is necessary for the case of multiple eigenvalues.

As an example, we explain the Givens method for general matrices. Let $N = (n-1)(n-2)/2$. For $l = 0, 1, \dots, N-1$, choose $(p, q) = (3,2), (4,2), \dots, (n,2); (4,3), (5,3), \dots, (n,3); \dots; (n-1,n-2), (n,n-2); (n,n-1)$ in this order. Using $T^{(l)}$ of the same form as in the Jacobi method, and setting $\tan\theta = a^{(l)}_{p,q-1}/a^{(l)}_{q,q-1}$ in this case, calculate $A^{(0)} = A$, $U^{(0)} = I$, $A^{(l+1)} = T^{(l)\prime}A^{(l)}T^{(l)}$, $U^{(l+1)} = U^{(l)}T^{(l)}$ ($l = 0, 1, \dots, N-1$), and put $B = A^{(N)}$. Then we have $b_{ij} = 0$ for $i - j \geq 2$. We can solve the eigenvalue problem for this simplified $B$ and then retransform the eigenvectors of $B$ thus obtained into those of $A$ by means of $U^{(N)}$. It should be noted that the method of bisection based on Sturm's theorem is effectively used to solve the characteristic equation of a real tridiagonal matrix [14]. This method is remarkably stable numerically. It is used when the number of eigenvalues to be computed is small relative to the order of the given matrix. If all eigenvalues are desired, an alternative method such as the QR method (→ Section F) is recommended.

E. The Lanczos Method

A more detailed exposition of the Lanczos method is now given. Let $A$ be a given real matrix of order $n$. Pick two vectors $c_1$ and $\tilde{c}_1$. Determine recursively the vectors $c_{i+1}$ and $\tilde{c}_{i+1}$, $i = 1, \dots, n$, that satisfy the following conditions: (i) $\gamma_{i+1}c_{i+1} = Ac_i - \alpha_i c_i - \beta_i c_{i-1} \equiv b_{i+1}$, $\tilde{\gamma}_{i+1}\tilde{c}_{i+1} = A'\tilde{c}_i - \tilde{\alpha}_i\tilde{c}_i - \tilde{\beta}_i\tilde{c}_{i-1} \equiv \tilde{b}_{i+1}$, $i = 2, \dots, n$; (ii) $c_{i+1}$ is orthogonal to $\tilde{c}_{i-1}$ and $\tilde{c}_i$; (iii) $\tilde{c}_{i+1}$ is orthogonal to $c_{i-1}$ and $c_i$, where $\alpha_i$, $\beta_i$, $\tilde{\alpha}_i$, and $\tilde{\beta}_i$ are scalars, and where $\gamma_i$ and $\tilde{\gamma}_i$ are normalizing scalars. Actually, $\alpha_i = \tilde{c}_i'Ac_i/\tilde{c}_i'c_i = \tilde{\alpha}_i$, $\beta_i = \tilde{\gamma}_i\tilde{c}_i'c_i/\tilde{c}_{i-1}'c_{i-1}$, $\tilde{\beta}_i = \gamma_i\tilde{c}_i'c_i/\tilde{c}_{i-1}'c_{i-1}$, where $\tilde{c}_i'c_i$, $i = 1, \dots, n$, are assumed nonzero. It can be shown that $c_{i+1}$ is orthogonal to every $\tilde{c}_j$, $1 \leq j \leq i$, and that $\tilde{c}_{i+1}$ is orthogonal to every $c_j$, $1 \leq j \leq i$ ($c_{n+1} = \tilde{c}_{n+1} = 0$). The conclusion is that the given matrix $A$ is similar to the tridiagonal matrix $H = (h_{ij})$, where $h_{ii} = \alpha_i$ ($i = 1, \dots, n$), $h_{i,i+1} = \beta_{i+1}$, $h_{i+1,i} = \gamma_{i+1}$ ($i = 1, \dots, n-1$). In fact, if $C$ denotes the matrix whose $j$th column equals $c_j$ ($j = 1, \dots, n$), one obtains $C^{-1}AC = H$. Thus the eigenvalues of $A$ are identical to those of $H$.

In principle, the Lanczos method transforms a given matrix to a tridiagonal matrix similar to it if every $\tilde{c}_i'c_i$ is nonzero ($i = 1, \dots, n$). If $\tilde{c}_i'c_i$ does vanish for a certain $i$, one selects $c_1$ and $\tilde{c}_1$ again and restarts. On the other hand, if $b_{i+1} = 0$ one chooses an arbitrary vector $c_{i+1}$ orthogonal to $\tilde{c}_1, \dots, \tilde{c}_i$, and if $\tilde{b}_{i+1} = 0$ one selects an arbitrary vector $\tilde{c}_{i+1}$ orthogonal to
$c_1, \dots, c_i$. In the actual numerical computation, the distinction between zero vectors and nonzero vectors is usually blurred by rounding errors.

In the application of the Lanczos method one often observes the loss of the orthogonality $\tilde{c}_i'c_j = 0$ ($i \neq j$). This usually results from cancellation errors in the computation of $b_{i+1}$ and $\tilde{b}_{i+1}$. If the orthogonality is lost, $C^{-1}AC$ may significantly deviate from a tridiagonal matrix. As a practical means for preserving the indicated orthogonality one can reorthogonalize the vectors $c_i$ and $\tilde{c}_i$. Indeed, one can add to the computed $c_{i+1}$ a linear combination of $c_1, \dots, c_i$ so that the sum is orthogonal to $\tilde{c}_1, \dots, \tilde{c}_i$, then take the sum as $c_{i+1}$ after properly normalizing. A similar process applies to $\tilde{c}_{i+1}$.

The Lanczos method is often applied in double precision. It has been suggested that one first reduces the given matrix to an upper Hessenberg form by using the Hessenberg method with row interchange before applying the Lanczos method [15].

A further remark on the Lanczos method is in order. Recall that $\gamma_i$ is determined from $Ac_i$, $c_{i-1}$, $\alpha_{i-1}$, and $\beta_{i-1}$; $\beta_i$ from $c_i$, $\tilde{c}_i$, $c_{i-1}$, $\tilde{c}_{i-1}$, and $\gamma_i$; and $\alpha_i$ from $Ac_i$, $c_i$, and $\tilde{c}_i$. This shows that the Lanczos method applied to a sparse matrix (a matrix whose elements are almost all zero) requires only a memory proportional to $n$. The Householder method, on the other hand, requires memory proportional to $n^2$ even for a sparse matrix.

For a real symmetric matrix one can modify the Lanczos method for a general matrix so that $C$ is orthogonal and $H$ is real, symmetric, and tridiagonal (the details are omitted).

F. The QR Method

The QR method was discovered independently by J. G. F. Francis and by V. N. Kublanovskaya in 1961 [15]. The method has been improved and extended considerably since then. In the usual application to matrix eigenvalue problems, the QR method provides the most efficient iterative process for finding all eigenvalues of a given matrix. The matrix is reduced by means of a similarity transformation to Hessenberg form or to tridiagonal form before application of the QR method; the reduction process may be effected by the Householder method or by the elimination method with row interchange. (If the given matrix is a complex matrix, the latter is preferable.) The reason is that one step of the QR process applied to a full matrix is prohibitively expensive, requiring a number of operations proportional to $n^3$, while one step of the QR process applied to a Hessenberg matrix requires a number of operations proportional to $n^2$.

We now describe one of the most useful versions of the QR method. Let $A = A_0$ be a given matrix, and define a sequence of matrices $A_1, A_2, \dots$ obtained from $A_0$ as follows. At the $i$th ($i = 0, 1, \dots$) step, choose an appropriate constant $s_i$, called an origin shift, according to the process described below, and decompose $A_i - s_i I$ into the product of a unitary matrix $Q_i$ and an upper triangular matrix $R_i$: $A_i - s_i I = Q_i R_i$. The matrix $A_{i+1}$ is then defined by $A_{i+1} = R_i Q_i + s_i I = Q_i^{*}A_i Q_i$. Hence $A_{i+1}$ is similar to $A_i$. The QR method is closely related to the power method and to the inverse power method. We describe this relationship in order to obtain an insight into the nature of the QR method. To this end, we first state a well-known theorem: Let $A$ be diagonalizable, and let its eigenvalues $\lambda_i$ ($i = 1, \dots, n$) have distinct moduli, say $|\lambda_1| > |\lambda_2| > \dots > |\lambda_n|$. Let $X^{-1}AX = \operatorname{diag}\{\lambda_1, \lambda_2, \dots, \lambda_n\}$, and let $X$ have an LU-decomposition $X = LU$, where $L$ and $U$ are, respectively, a lower triangular matrix and an upper triangular matrix. Then the QR algorithm without an origin shift ($s_i = 0$, $i = 0, 1, 2, \dots$) behaves as follows: $a^{(i)}_{pq} \to 0$ ($p > q$); $a^{(i)}_{pp} \to \lambda_p$; $a^{(i)}_{pq}$ oscillates ($p < q$), $p, q = 1, 2, \dots, n$, $i \to \infty$, where $a^{(i)}_{pq}$ denotes the $(p, q)$th element of $A_i$. In other words, $A_i$ approaches an upper triangular matrix as $i \to \infty$ so that the diagonal elements of $A_i$ converge to the eigenvalues of $A$.

The relationship between the QR algorithm and the power method is given by $(Q_0 Q_1 \cdots Q_i)(R_i R_{i-1} \cdots R_0) = (A - s_0 I)(A - s_1 I) \cdots (A - s_i I)$. If $s_0 = s_1 = \dots = s_i = 0$, then the right-hand side reduces to $A^{i+1}$, and since the product $R_i \cdots R_0$ is upper triangular, the first column of $\hat{Q}_i = Q_0 Q_1 \cdots Q_i$ equals a scalar multiple of $A^{i+1}e_1$, where $e_1 = (1, 0, 0, \dots, 0)'$. Therefore, by the power method, the first column of $\hat{Q}_i$ converges to an eigenvector corresponding to the eigenvalue $\lambda_1$ of $A$ that has the largest modulus, under a certain fairly mild condition. Since $A_{i+1} = \hat{Q}_i^{*}A\hat{Q}_i$, $A_{i+1}e_1 = \hat{Q}_i^{*}A\hat{Q}_i e_1 \approx \lambda_1 e_1$ for large $i$. In other words, the first column of $A_i$ converges to the vector $(\lambda_1, 0, 0, \dots, 0)'$ as $i \to \infty$.

The relationship of the QR method to the inverse power method and to the Rayleigh quotient is now explained. From the definition of the QR algorithm with an origin shift, we have $Q_i = [(A_i - s_i I)^{-1}]^{*}R_i^{*}$. Since $R_i^{*}$ is a lower triangular matrix, the $n$th column of $Q_i$ equals $Q_i e_n = [(A_i - s_i I)^{-1}]^{*}R_i^{*}e_n = [(A_i - s_i I)^{-1}]^{*} \cdot$ (a scalar multiple of $e_n$), where $e_n = (0, 0, \dots, 0, 1)'$. This last process of obtaining the last column of $Q_i$ represents a process known as the inverse power method. If $s_i$ is close to an eigenvalue of
$A_i$ (and hence to an eigenvalue of $A$), the last column of $A_i$ gives a good approximate eigenvector of $A$ corresponding to the eigenvalue under consideration. Now, if $x$ is a given column vector such that $x^{*}x = 1$, the value of $\lambda$ which minimizes $(Ax - \lambda x)^{*}(Ax - \lambda x)$ is given by the Rayleigh quotient $x^{*}Ax$. If $x$ ($x^{*}x = 1$) happens to be an eigenvector of $A$, the Rayleigh quotient equals an eigenvalue of $A$. If we take $x = e_n$, the corresponding Rayleigh quotient equals $a_{nn}$. Therefore if we take the $(n, n)$th element of $A_i$ as the origin shift $s_i$, then $s_i$ can be regarded as the best approximation to an eigenvalue of $A_i$ (hence of $A$) when $e_n$ is taken as an approximate eigenvector of $A_i$, in the sense that $\lambda = s_i$ minimizes the functional $(A_i e_n - \lambda e_n)^{*}(A_i e_n - \lambda e_n)$.

Under the same condition as in the preceding theorem, the rate of convergence of the QR method with an origin shift is given as follows: $a^{(i)}_{pq}$ ($n \geq p > q \geq 1$) is asymptotically proportional to $(\lambda_p/\lambda_q)^{i}$ for $s_i = 0$ ($i = 0, 1, \dots$) (no origin shift); and with the origin shift $s_i$, the behavior of $a^{(i)}_{pq}$ at the $i$th step is determined by $(\lambda_p - s_i)/(\lambda_q - s_i)$. If $s_i \approx \lambda_n$ (= the eigenvalue of $A$ with the least modulus), each element in the $n$th row of $A_{i+1}$ except $a^{(i+1)}_{nn}$ exhibits rapid decrease in modulus. A natural and practical choice of the origin shift $s_i$ is given as follows: (i) for a real tridiagonal matrix, $s_i$ is taken to be that eigenvalue of the 2 x 2 matrix situated at the lower right corner of $A_i$ that is closer to $a^{(i)}_{nn}$ [14]; (ii) for a real upper Hessenberg matrix, the two eigenvalues of the 2 x 2 matrix situated at the lower right corner of $A_i$ are taken as $s_i$ and $s_{i+1}$ [14]; (iii) for a complex upper Hessenberg matrix, $s_i$ is chosen as in (i) [17, COMQR].

The QR method with origin shift would eventually make each element in the $n$th row, except $a^{(i)}_{nn}$, smaller in modulus than a prescribed positive number. At this stage of iteration, $a^{(i)}_{nn}$ is taken as an approximate eigenvalue of $A$. The QR method is then applied anew to the $(n-1) \times (n-1)$ matrix obtained from $A_i$ by deleting the $n$th row and the $n$th column of $A_i$, and another approximate eigenvalue of $A$ is obtained. The method proceeds similarly until all the eigenvalues of $A$ are computed. For maximum accuracy the eigenvalues of the given matrix should be computed in the order of increasing modulus. A word of caution is in order. When the given matrix $A$ has elements of greatly varying modulus, rearrangement of elements of $A$ may be necessary before applying the QR method with an explicit origin shift, where $A_i - s_i I$ ($i = 0, 1, \dots$) is explicitly computed.

The reader is referred to [14-16] for details of the QR method.

G. Generalized Eigenvalue Problem $Ax = \lambda Bx$

An eigenvalue problem of the type $Ax = \lambda Bx$ is called a generalized eigenvalue problem and is often encountered in applied mathematics. A necessary and sufficient condition for $\lambda$ to be an eigenvalue is $\det(A - \lambda B) = 0$. If $B^{-1}$ exists, then $Ax = \lambda Bx$ is equivalent to $B^{-1}Ax = \lambda x$ and hence has $n$ eigenvalues, where $n$ is the order of $A$. If $B^{-1}$ does not exist, the eigenvalue problem $Ax = \lambda Bx$ has at most $n - 1$ eigenvalues. We restrict ourselves to one of the most important cases: that where $A$ is real and symmetric and $B$ is real, symmetric, and positive definite. In this case one could solve the problem by reformulating it as $B^{-1}Ax = \lambda x$, where $B^{-1}A$ is explicitly computed. However, $B^{-1}A$ is not in general symmetric. Moreover, when $B$ has eigenvalues of widely different moduli, the elements of $B^{-1}$ may have widely different moduli as well, which would in turn make the computation of those eigenvalues of $B^{-1}A$ of smaller moduli difficult. An efficient and numerically stable method is known which obviates the aforementioned difficulty by exploiting the symmetry of $A$ and $B$. An outline of this procedure is now given. Since $B$ is positive definite, a lower triangular matrix $L$ exists such that $LL' = B$. This is called the Cholesky decomposition of $B$. The elements of $L$ can be computed by equating the corresponding elements in $LL' = B$. The eigenvalue problem $Ax = \lambda Bx$ is now equivalent to $L^{-1}A(L')^{-1}y = \lambda y$ with $y = L'x$, where the matrix $L^{-1}A(L')^{-1} = P$ is computed in two stages by solving $LX = A$ and $PL' = X$. In this last equation it is enough to compute the upper right half of $P$, because $P$ is symmetric while $X$ is not generally symmetric. The upper right half of $P$ can be computed by equating the corresponding elements in $PL' = X$. The eigenvalues $\lambda$ of $Ax = \lambda Bx$ are given by the eigenvalues of the matrix $P$. Other types of eigenvalue problems, such as $y'AB = \lambda y'$, $BAy = \lambda y$, and $x'BA = \lambda x'$, often appear in practice, where $A$ is real and symmetric and $B$ is real, symmetric, and positive definite. By using the Cholesky decomposition of $B$, one can reduce any one of these eigenvalue problems to the ordinary eigenvalue problem for a real symmetric matrix [4].

If $A$ and $B$ are general matrices in $Ax = \lambda Bx$, the following method is known to be effective [18]. First, reduce $B$ to an upper triangular matrix by applying $n - 1$ Householder transformations from the left. This reduces the eigenvalue problem $Ax = \lambda Bx$ to the case where $B$ is upper triangular. Next, apply to $A$ a sequence of plane rotations of Givens type from the left, thereby reducing the eigenvalue problem $Ax = \lambda Bx$ to the case where $A$ is an
upper Hessenberg matrix and $B$ is an upper triangular matrix. Then apply the QR method to $B^{-1}Ax = \lambda x$ without explicitly computing $B^{-1}A$ to reduce the eigenvalue problem $Ax = \lambda Bx$ to the case where $A$ and $B$ are both approximately upper triangular. The eigenvalues of $Ax = \lambda Bx$ are then easily computed as ratios of corresponding diagonal elements. This method is called the QZ method [18].

A collection of about 50 excellent FORTRAN subroutines for various types of matrix eigenvalue problems is contained in [17]. These subroutines are for the most part translations from ALGOL procedures given in [14].

References

[1] H. Wayland, Expansion of determinantal equations into polynomial form, Quart. Appl. Math., 2 (1945), 277-306.
[2] P. S. Dwyer, Linear computation, Wiley, 1951.
[3] R. Zurmühl, Matrizen, Springer, 1950.
[4] L. Collatz, Eigenwertaufgaben mit technischen Anwendungen, Akademische Verlag., 1949.
[5] R. V. Mises and H. Geiringer, Praktische Verfahren der Gleichungsauflösung, Z. Angew. Math. Mech., 9 (1929), 58-77, 152-164.
[6] E. Bodewig, Matrix calculus, Interscience, 1956.
[7] C. G. J. Jacobi, Über ein leichtes Verfahren, die in der Theorie der Säkularstörungen vorkommenden Gleichungen numerisch aufzulösen, J. Reine Angew. Math., 30 (1846), 51-95.
[8] C. Lanczos, An iteration method for the solution of the eigenvalue problem of linear differential and integral operators, J. Res. Nat. Bur. Standards, 45 (1950), 255-282.
[9] W. Givens, Computation of plane unitary rotations transforming a general matrix to triangular form, SIAM J., 6 (1958), 26-50.
[10] J. H. Wilkinson, Householder's method for symmetric matrices, Numer. Math., 4 (1963), 354-361.
[11] P. A. White, The computation of eigenvalues and eigenvectors of a matrix, SIAM J., 6 (1958), 393-437.
[12] S. H. Crandall, Engineering analysis, McGraw-Hill, 1956.
[13] A. Schönhage, Zur quadratischen Konvergenz des Jacobi-Verfahrens, Numer. Math., 6 (1964), 410-412.
[14] J. H. Wilkinson and C. Reinsch, Handbook for automatic computation II, Linear algebra, Springer, 1971.
[15] J. H. Wilkinson, The algebraic eigenvalue problem, Clarendon Press, 1965.
[16] G. W. Stewart, Introduction to matrix computations, Academic Press, 1973.
[17] B. T. Smith, J. M. Boyle, J. J. Dongarra, B. S. Garbow, Y. Ikebe, V. C. Klema, and C. B. Moler, Matrix eigensystem routines - EISPACK guide, Springer, second edition, 1976.
[18] C. B. Moler and G. W. Stewart, An algorithm for generalized matrix eigenvalue problems, SIAM J. Numer. Anal., 10 (1973), 241-256.

299 (XV.7)
Numerical Integration

A. Interpolatory Integration Formulas

Numerical integration is a method of finding an approximate numerical value of a definite integral of a given function $f(x)$. Usually the integral $\int_a^b f(x)\,dx$ or $\int_a^b f(x)w(x)\,dx$ ($w(x)$ is the weight function) is approximated by a linear combination $\sum_{i=1}^{n} w_i f(x_i)$ of the values of the integrand at the points $x_1, x_2, \dots, x_n$. Integration formulas are divided into two groups, the interpolatory formulas and the formulas based on variable transformation.

In order to obtain an interpolatory formula, we interpolate over the integrand $f(x)$ at $n$ points $x_1, x_2, \dots, x_n$ by means of the Lagrange interpolation polynomial of degree $n - 1$, and then integrate the polynomial over $[a, b]$. Depending on the selection of the points $x_i$ and the weights $w_i$, we have several kinds of formulas.

(1) Newton-Cotes Formulas. We assume $w(x)$ = constant and $x_i = x_0 + ih$ ($i = 0, 1, \dots, n$). The weights $w_i$ are so determined that the value of the integral can be calculated accurately if the integrand $f(x)$ is a polynomial whose degree does not exceed $n$. These formulas are called the Newton-Cotes formulas: $\int_{x_0}^{x_1} f(x)\,dx = (f_0 + f_1)h/2$ for $n = 1$ (trapezoidal rule), $\int_{x_0}^{x_2} f(x)\,dx = (f_0 + 4f_1 + f_2)h/3$ for $n = 2$ (Simpson's 1/3 rule), and $\int_{x_0}^{x_3} f(x)\,dx = (f_0 + 3f_1 + 3f_2 + f_3)3h/8$ for $n = 3$ (Simpson's 3/8 rule). The truncation errors of these three formulas are given by $h^3 f^{(2)}(\xi)/12$, $h^5 f^{(4)}(\xi)/90$, $3h^5 f^{(4)}(\xi)/80$, respectively, where $\xi$ is a number in the interval of integration and $f^{(i)}$ denotes the $i$th derivative of $f$ (differentiability is assumed). For an even $n$, the polynomial of degree $n + 1$ can also be integrated accurately by these formulas.

In Table 1 the coefficients $A$ and $B_i$ of the Newton-Cotes formulas $hA\sum_{i=0}^{n} B_i f(x_i)$, $h = (x_n - x_0)/n$, and the coefficients $C$ of the error
term $Ch^{p}f^{(p)}(\xi)$, where $p = n + 2$ if $n$ is even and $p = n + 1$ if $n$ is odd, are given.

Table 1

n   A         B_0    B_1    B_2    B_3    B_4    B_5    B_6    B_7    B_8    C
1   1/2       1      1                                                       -1/12
2   1/3       1      4      1                                                -1/90
3   3/8       1      3      3      1                                         -3/80
4   2/45      7      32     12     32     7                                  -8/945
5   5/288     19     75     50     50     75     19                          -275/12096
6   1/140     41     216    27     272    27     216    41                   -9/1400
7   7/17280   751    3577   1323   2989   2989   1323   3577   751           -8183/518400
8   4/14175   989    5888   -928   10496  -4540  10496  -928   5888   989    -2368/467775

When the interval $[a, b]$ of integration is large, we usually divide it into small subintervals and apply formulas for small $n$ for each part rather than formulas for large $n$ for the whole interval. For example, if the interval is divided into $m$ equal subintervals, we get the following formula by applying the trapezoidal rule for each subinterval:

$$\int_a^b f(x)\,dx = h\bigl((f_0 + f_m)/2 + (f_1 + f_2 + \dots + f_{m-1})\bigr),$$

where $x_0 = a$, $x_m = b$, $h = (b - a)/m$, with truncation error $(b - a)^3 f^{(2)}/(12m^2)$. (Here, as in the rest of the article, $f^{(i)}$ stands for $f^{(i)}(\xi)$, with differentiability assumed as before.) By applying Simpson's 1/3 rule we obtain the formula

$$\int_a^b f(x)\,dx = \frac{2h}{3}\bigl((f_0 + f_{2m})/2 + 2(f_1 + f_3 + \dots + f_{2m-1}) + (f_2 + f_4 + \dots + f_{2m-2})\bigr),$$

where $x_0 = a$, $x_{2m} = b$, $h = (b - a)/2m$, with truncation error $(b - a)^5 f^{(4)}/(180(2m)^4)$.

For the integral of a periodic analytic function over a single period the trapezoidal rule with equally spaced points gives the best result asymptotically, as the number of points tends to infinity.

Newton-Cotes formulas can also be obtained by integrating over the interval $[a, b]$ the interpolation polynomial for equally spaced points. The formulas mentioned so far are called closed formulas since they use values at the two endpoints of the interval. We can also use open formulas, which do not use values at the ends. For example, we have $\int_{x_0}^{x_4} f(x)\,dx = 4h(2f_1 - f_2 + 2f_3)/3$ with truncation error $14h^5 f^{(4)}/45$. Open formulas are useful for numerical solution of differential equations. There are also formulas that include values outside the interval, for example, $(-f_{-1} + 13f_0 + 13f_1 - f_2)h/24$ for $\int_{x_0}^{x_1} f(x)\,dx$. Applying these formulas to the $m$ subintervals of $[a, b]$, we obtain a trapezoidal formula with the correction term $h(f_1 - f_{-1})/24 + h(f_{m-1} - f_{m+1})/24 = h(\Delta f_0 + \Delta f_{-1})/24 - h(\Delta f_{m-1} + \Delta f_m)/24$, where $\Delta f_i$ means $f_{i+1} - f_i$.

(2) Chebyshev Formulas. The Chebyshev formulas are a family of integration formulas in which all the weights $w_i$ of $\sum_{i=1}^{n} w_i f(x_i)$ are equal, while the abscissas $x_i$ are chosen so that the integral can be evaluated exactly when $f(x)$ is an arbitrary polynomial whose degree does not exceed $n$. When $\int_{-1}^{1} f(x)\,dx \approx W\sum_{i=1}^{n} f(x_i)$, it is easy to see that $W = 2/n$ since the right-hand side must be equal to the left-hand side when $f(x) = 1$. It is known that the abscissas $x_i$ for $n \leq 7$ and $n = 9$ are real, while for $n = 8$ and $n \geq 10$ at least one of the abscissas becomes complex. It is easy to see that Chebyshev formulas are interpolatory.

(3) Gauss Formulas. In the Gauss formulas both the weights $W_k$ and the abscissas $x_i$ are chosen so that we obtain the accurate value of the integral when the integrand is any polynomial whose degree does not exceed $2n - 1$. If we put $\Pi(x) = \prod_{i=1}^{n}(x - x_i)$, an arbitrary polynomial of degree $2n - 1$ can be expressed in the form

$$f(x) = \sum_{k=1}^{n} \frac{\Pi(x)}{(x - x_k)\Pi'(x_k)} f_k + \Pi(x)\sum_{k=0}^{n-1} c_k x^k,$$

where $f(x_k) = f_k$, the $c_k$ are constants, and the first term is the Lagrange interpolation polynomial. By the assumption that the integral of $f(x)$ with weight $w(x)$ equals $\sum_{k=1}^{n} W_k f_k$, we obtain $W_k = \int_a^b \bigl(w(x)\Pi(x)/((x - x_k)\Pi'(x_k))\bigr)\,dx$ and the relations

$$\int_a^b w(x)\Pi(x)x^k\,dx = 0, \qquad k = 0, 1, \dots, n-1.$$

Accordingly, the abscissas $x_i$ are determined as the roots of the polynomial $\Pi(x)$ of degree $n$ that is orthogonal to $x^k$ ($k = 0, 1, \dots, n-1$) with respect to the weight function $w(x)$.

The following are typical examples of integration formulas of Gaussian type.

(i) Gauss integration formulas (in the narrow sense). For $w(x) = 1$ and the interval $[-1, 1]$, $\Pi(x)$ is the Legendre polynomial $P_n(x) = (1/2^n n!)\,d^n(x^2 - 1)^n/dx^n$. The error is $(n!)^4 2^{2n+1} f^{(2n)}/((2n + 1)((2n)!)^3)$.
(ii) Gauss-Laguerre formulas. For the weight $w(x) = \exp(-x)$ and the interval $[0, \infty)$, $\Pi(x)$ is the Laguerre polynomial $L_n(x) = (\exp x)\,d^n(x^n\exp(-x))/dx^n$.

(iii) Gauss-Hermite formulas. For the weight function $w(x) = \exp(-x^2)$ and the interval $(-\infty, \infty)$, $\Pi(x)$ is the Hermite polynomial $H_n(x) = (-1)^n \exp(x^2) \cdot d^n\exp(-x^2)/dx^n$.

(iv) Gauss-Chebyshev formulas. For the weight function $w(x) = (1 - x^2)^{-1/2}$ and the interval $[-1, 1]$, we use the Chebyshev polynomial $T_n(x) = 2^{-(n-1)}\cos(n\arccos x)$. In this case, the $W_k$ are all equal to $\pi/n$.

From the definition of $W_k$ we see that the Gauss formulas are interpolatory.

(4) Clenshaw-Curtis Formulas. Although the Gauss formula is in general more accurate than the Newton-Cotes formula with the same number of points of interpolation, the points of interpolation for a Gauss formula of any order are distinct from those of any other order except the point zero, which appears in all formulas of odd order. The Clenshaw-Curtis formulas are interpolatory, with the points chosen so that the distribution of the points is similar to that of the Gauss formula and such that, in proceeding from a computation of order $n$ to that of order $2n$, all the function values evaluated in the former computation can be used in the latter. For $w(x) = 1$, the interval $[-1, 1]$, and even $n$, the points $x_k$ and the weights $W_k$ are given by

$$x_k = \cos\frac{k\pi}{n}, \qquad k = 0, 1, \dots, n;$$

$$W_0 = W_n = \frac{1}{n^2 - 1}, \qquad W_s = W_{n-s} = \frac{4}{n}\,{\sum_{j=0}^{n/2}}''\,\frac{1}{1 - 4j^2}\cos\frac{2\pi js}{n}, \qquad s = 1, 2, \dots, \frac{n}{2}.$$

$\sum''$ means that the first and the last terms in the sum are to be halved. There are some different types of Clenshaw-Curtis formulas depending on the selection of the points $x_k$ [1].

B. Integration Formulas Based on Variable Transformation

If the integrand has some singularity at the endpoint, any interpolatory formula based on interpolation with a polynomial does not give a good result. In such cases integration formulas based on variable transformation are quite effective.

(1) IMT Formula. In its standard form, with $h = (b - a)/n$, the Euler-Maclaurin formula is given by

$$\int_a^b f(x)\,dx = h\Bigl(\tfrac{1}{2}f(a) + \sum_{j=1}^{n-1} f(a + jh) + \tfrac{1}{2}f(b)\Bigr) - \sum_{r=1}^{m}\frac{h^{2r}B_{2r}}{(2r)!}\bigl(f^{(2r-1)}(b) - f^{(2r-1)}(a)\bigr) + \frac{h^{2m}}{(2m)!}\int_a^b \tilde{B}_{2m}\Bigl(\frac{x - a}{h}\Bigr)f^{(2m)}(x)\,dx,$$

where $\tilde{B}_{2m}(t)$ is the periodic extension of the Bernoulli polynomial $B_{2m}(t)$ of degree $2m$ and the $B_r$ are Bernoulli numbers ($B_0 = 1$, $B_1 = -1/2$, $B_2 = 1/6$, $B_3 = 0$, $B_4 = -1/30$, ...). This formula suggests that if the higher derivatives of the integrand vanish at both endpoints, the error of the trapezoidal rule with equally spaced points becomes very small. The IMT formula is based on the idea of transforming the variable $x$ of $\int_0^1 f(x)\,dx$ in such a way that all the derivatives of the new integrand vanish at both endpoints by taking $x = \varphi(t)$, $\varphi(t) = K^{-1}\int_0^t \psi(\tau)\,d\tau$, $K = \int_0^1 \psi(\tau)\,d\tau$, $\psi(\tau) = \exp(-\tau^{-1} - (1 - \tau)^{-1})$. Then the trapezoidal rule with $h = 1/n$ is applied to the transformed integral to obtain $(1/Kn)\sum_{j=1}^{n-1}\psi(j/n)f(\varphi(j/n))$ [1]. The asymptotic expression of the error for the IMT formula is proportional to $\exp(-C\sqrt{n})$ with a positive constant $C$.

(2) Double Exponential Formula. The trapezoidal rule with equally spaced points applied to the integral of an analytic function over $(-\infty, \infty)$ gives in general a result of high accuracy. The double exponential formula is based on the idea of transforming $\int_{-1}^{1} f(x)\,dx$ to $\int_{-\infty}^{\infty} f(\varphi(t))\varphi'(t)\,dt$ with $x = \varphi(t) = \tanh(\tfrac{\pi}{2}\sinh t)$ and applying the trapezoidal rule with a mesh size $h$, which results in $h\sum_{n=-\infty}^{\infty} f(\varphi(nh))\varphi'(nh)$ [4]. The name of the double exponential formula is attributed to the decay of $\varphi'(t)$ at $t \to \pm\infty$, which is approximately proportional to $\exp(-C\exp|t|)$ with a positive constant $C$. The transformations $x = \exp(\pi\sinh t)$ and $x = \sinh(\tfrac{\pi}{2}\sinh t)$ give the double exponential formulas for the infinite integrals $\int_0^{\infty} f(x)\,dx$ and $\int_{-\infty}^{\infty} f(x)\,dx$, respectively. In the actual computation the infinite summation is truncated at appropriate upper and lower bounds. The asymptotic expression of the error for the double exponential formula in terms of the number $N$ of the sampling points actually used is proportional to $\exp(-CN/\log N)$ with a positive constant $C$. The IMT formula and the double exponential
formula are robust against the singularities at the endpoints.

C. Automatic Integration

By an automatic integration scheme we mean a computer program for numerical integration of $\int_a^b f(x)\,dx$ in which the user gives the limits of integration $a$ and $b$, a subroutine for computing $f(x)$, and an error tolerance $\varepsilon$. Then the program gives a value of the integral which is expected to be correct within the tolerance $\varepsilon$. Usually in an automatic integration scheme the mesh size is halved until the desired accuracy is attained. Automatic integration schemes are classified into two groups, nonadaptive schemes and adaptive schemes. In a nonadaptive scheme, the sequence of integration points is chosen according to some fixed rule independent of the shape of the integrand. Newton-Cotes formulas, Clenshaw-Curtis formulas, IMT formulas, and double exponential formulas can be used as base formulas for nonadaptive schemes.

From the historical point of view, Romberg integration should be mentioned; it is a kind of nonadaptive automatic integrator. Consider an integral $I = \int_a^b f(x)\,dx$. Divide the interval $[a, b]$ of integration into $2^k$ subintervals and apply the trapezoidal rule with the mesh size $h = (b - a)/2^k$, which we denote by $T_0^{(k)}$. Then, starting from the values obtained for $T_0^{(k)}$, $k = 0, 1, \dots$, we compute the sequence

$$T_m^{(k)} = \frac{4^m T_{m-1}^{(k+1)} - T_{m-1}^{(k)}}{4^m - 1}, \qquad m = 1, 2, \dots.$$

From the Euler-Maclaurin formula the asymptotic error for the trapezoidal rule $T_0^{(k)}$ can be expressed as $I - T_0^{(k)} = c_1 h^2 + c_2 h^4 + \dots + c_m h^{2m} + \dots$, where $c_m = \text{const} \times (f^{(2m-1)}(b) - f^{(2m-1)}(a))$ does not depend on $h$. If we compute $T_1^{(k)} = (4T_0^{(k+1)} - T_0^{(k)})/3$ using the values $T_0^{(k+1)}$ and $T_0^{(k)}$, then we see that the asymptotic error expression of $T_1^{(k)}$ becomes $I - T_1^{(k)} = c_2' h^4 + c_3' h^6 + \dots + c_m' h^{2m} + \dots$. Romberg integration is based on the idea of eliminating the term with $h^{2m}$ in the expression of the error by successive application of $T_m^{(k)}$. This is an application of Richardson's extrapolation procedure.

In the adaptive scheme, the points are chosen in a way that depends on the shape of the integrand. The Newton-Cotes formula of order 8, for example, is used as the base formula for the adaptive scheme.

D. Approximate Multiple Integration

If a region is a product region, such as a rectangular parallelepiped, a product rule obtained by forming the product of 1-dimensional formulas is useful. For integrals over a cube or a sphere there are monomial rules which are exact for a certain family of monomials. These rules can be used for integrals of dimension lower than 5 or 6. For integrals of higher dimension only methods based on sampling make sense.

E. Numerical Differentiation

In order to find the numerical value of the derivative $f_p' = f'(x_p)$ at a point $x_p$ from the tabulated values $f_k = f(x_k)$, we usually use the derivative of the Lagrange interpolation formula. This gives

$$f_p' = \sum_{k \neq p} \frac{\Pi'(x_p)}{(x_p - x_k)\Pi'(x_k)} f_k + f_p\sum_{k \neq p} \frac{1}{x_p - x_k},$$

where $\Pi(x) = \prod_{k=1}^{n}(x - x_k)$.

When we compute the derivative of a function which can be evaluated at any point in a given interval, the approximation

$$f'(x) \approx \frac{f(x + h) - f(x - h)}{2h}$$

is useful; similarly, we can use

$$f''(x) \approx \frac{f(x + h) - 2f(x) + f(x - h)}{h^2}.$$

It must be noted that, as $h$ tends to zero, the difference $f(x + h) - f(x - h)$ comes to contain fewer significant digits, so that it is meaningless to carry out $\{f(x + h) - f(x - h)\}/2h$ beyond a reasonable value of $h$.

References

[1] P. J. Davis and P. Rabinowitz, Methods of numerical integration, Academic Press, 1975.
[2] P. J. Davis and I. Polonsky, Numerical interpolation, differentiation and integration, Handbook of Mathematical Functions, M. Abramowitz and I. A. Stegun (eds.), Nat. Bur. Stand. Appl. Math. Ser., 55 (1964), 875-924.
[3] V. I. Krylov, Approximate calculation of integrals, translated by A. H. Stroud, Macmillan, 1962.
[4] H. Takahasi and M. Mori, Double exponential formulas for numerical integration, Publ. Res. Inst. Math. Sci., 9 (1974), 721-741.
[5] J. R. Rice, A meta-algorithm for adaptive quadrature, J. ACM, 22 (1975), 61-82.
[6] A. H. Stroud and D. H. Secrest, Gaussian quadrature formulas, Prentice-Hall, 1966.
[7] A. H. Stroud, Approximate calculation of multiple integrals, Prentice-Hall, 1971.
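
As an illustration of the Romberg scheme described under Automatic Integration in article 299, the following is a minimal Python sketch rather than a definitive implementation. The function name `romberg`, the parameters `max_k` and `tol`, and the stopping test are assumptions introduced here for illustration; they do not come from the article. The table entry `row[m]` on refinement level `k` corresponds to $T_m^{(k-m)}$ in the notation of the article.

```python
import math


def romberg(f, a, b, max_k=10, tol=1e-12):
    """Approximate the integral of f over [a, b] by Romberg integration.

    rows[k][0] is the trapezoidal rule with 2**k subintervals; rows[k][m]
    is obtained by the recurrence
        T_m = (4**m * T_{m-1}(finer) - T_{m-1}(coarser)) / (4**m - 1).
    """
    h = b - a
    rows = [[0.5 * h * (f(a) + f(b))]]  # level 0: a single trapezoid
    for k in range(1, max_k + 1):
        h *= 0.5
        # Halving the mesh only requires the new midpoints.
        new_points = sum(f(a + (2 * j + 1) * h) for j in range(2 ** (k - 1)))
        row = [0.5 * rows[-1][0] + h * new_points]
        # Richardson extrapolation along the row.
        for m in range(1, k + 1):
            factor = 4 ** m
            row.append((factor * row[m - 1] - rows[-1][m - 1]) / (factor - 1))
        rows.append(row)
        # Illustrative stopping test: two successive diagonal entries agree.
        if abs(row[-1] - rows[-2][-1]) <= tol * max(1.0, abs(row[-1])):
            break
    return rows[-1][-1]


if __name__ == "__main__":
    print(romberg(math.sin, 0.0, math.pi))  # close to 2
    print(romberg(math.exp, 0.0, 1.0))      # close to e - 1
```

Each halving of the mesh reuses all previously computed function values, which is the practical attraction of the scheme; the agreement of successive diagonal entries is only a heuristic error indicator, not a guaranteed bound.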
300 (XV.1)
Numerical Methods

In the earlier history of mathematics, the development of methods of numerical calculation was one of the main purposes of research. Until the beginning of this century, logarithmic calculation played a central role in numerical calculation, and the main topic of this field was to make tables of values of functions. The digital electronic computer (→ 75 Computers), which made its first appearance in the 1940s and has been developing at an exponential rate, has caused drastic changes in numerical technique. Problem solving by numerical methods has now become one of the fundamental means of research in the physical sciences and engineering, and also in the social sciences and humanities. In this article we give examples of the changes in numerical methods brought about by the availability of digital computers and portable calculators.

Computers may be effectively utilized for calculating individual values of functions. This has led to the reexamination and, in some cases, modification of approximate formulas for evaluating functions (→ 142 Evaluation of Functions). For familiar functions, such as †logarithmic, †exponential, and †trigonometric functions, tables have been almost completely replaced by function keys on electronic calculators. Microprogramming algorithms for obtaining values of these functions have also been devised [1]. A complex function-theoretic error-estimation method for use with numerical integration formulas is given in [2]. In this method, graphical outputs of the computer are utilized. For problems in which the existence and uniqueness of solutions have been established, as for algebraic equations and ordinary differential equations, fairly good numerical calculation methods and their error estimates have been established (→ 301 Numerical Solution of Algebraic Equations; 303 Numerical Solution of Ordinary Differential Equations; [3, 4]). A method for estimating arithmetic †accumulation errors for operations involving finite numbers of digits has been systematized (→ 138 Error Analysis; [5]).

It is not unusual nowadays for linear equations with 10,000 or more unknowns to be solved. As new computing systems, such as the virtual memory system and the vector operation system, come into existence, new algorithms are examined; a numerical method that is optimal for today's technology may well be suboptimal for the next generation of computers. General-purpose program packages of linear problems, including eigenvalue problems, have been developed (→ 298 Numerical Computation of Eigenvalues; [6]).

Partial differential equations seem to have become more familiar because of the visualization of their numerical solutions in graphical computer outputs. In [7], which appeared much earlier than computers, partial difference equations for the fundamental linear problems in mathematical physics were discussed with a suggested variational treatment (→ 304 Numerical Solution of Partial Differential Equations; [8]). The †finite element method, which started as a calculation technique in structural mechanics and is based on the †calculus of variations, is widely accepted as an efficient approximation method for partial differential equations [9, 10]. The term †simulation is used often to describe procedures in which partial differential equations describing time-dependent phenomena are discretized; the resulting difference system can be solved for long periods of time [11].

Numerical analysis has heretofore long been considered to be the only numerical method (for error analysis in particular), and has been carried out mainly by means of the discretization of equations. Nowadays, however, mathematical modeling, taking into account both the phenomena to be described and the capabilities of the computers to be used, has become an important numerical method.

References

[1] J. S. Walther, A unified algorithm for elementary functions, Conference Proceedings, Spring Joint Computer Conference, AFIPS 38 (1971), AFIPS Press, 1971, 379-385.
[2] H. Takahashi and M. Mori, Error estimation in the numerical integration of analytic functions, Rep. Computing Centre, Univ. of Tokyo, 3 (1970), 41-108.
[3] P. Henrici, Elements of numerical analysis, Wiley, 1974.
[4] P. Henrici, Discrete variable methods in ordinary differential equations, Wiley, second edition, 1968.
[5] J. H. Wilkinson, Rounding errors in algebraic processes, HMSO, 1963.
[6] J. H. Wilkinson and C. Reinsch (eds.), Handbook for automatic computation 2, pt. II, Linear algebra, Springer, 1971.
[7] R. Courant, K. Friedrichs, and H. Lewy, On the partial difference equations of mathematical physics, IBM J. Res. Develop., 11 (1967), 215-234. (Original in German, 1928.)
[8] F. John, Lectures on advanced numerical analysis, Gordon & Breach, 1967.
[9] G. Strang and G. J. Fix, An analysis of the finite element method, Prentice-Hall, 1973.
[10] P. G. Ciarlet, The finite element method for elliptic problems, North-Holland, 1978.
[11] P. J. Roache, Computational fluid dynamics, Hermosa, 1972.

301 (XV.5)
Numerical Solution of Algebraic Equations

A. General Remarks

Methods for finding roots of an equation $f(x) = 0$ (where $f(x)$ is not necessarily a polynomial, but is assumed to be a function with some regularity) by numerical calculation can in general be divided into the following two types: (i) The first has as its goal the finding of good approximate values of the roots; examples are the Bernoulli method (Section J) and the Graeffe method (Section N). (ii) The second improves the accuracy of estimates of the roots; an example is the Newton-Raphson method (Section D). The methods belonging to (i) and (ii) can be used separately. However, in (i) convergence of the approximations may be excessively slow when a pair of nearly equal roots is present, and the size of numbers involved in the calculation may grow exponentially as we proceed through the iterations. In (ii) convergence of the approximate values is not assured unless the initial approximate value is suitably chosen. Accordingly, when an electronic computer is utilized, it is advisable to combine both types of method. Because of the development of computers, solutions with global convergence have become important [1, 2, 15, 17, 21].

B. Successive Substitutions

When an equation $f(x) = 0$ is transformed into $x = F(x)$ and the roots are obtained by iterative calculation of $x_{i+1} = F(x_i)$ ($i = 0, 1, 2, \dots$), sufficient conditions for its convergence are as follows: Let one of the roots of the equation be $\alpha$. Then $x_{i+1} - \alpha = F(x_i) - \alpha$, and therefore $(x_{i+1} - \alpha)/(x_i - \alpha) = F'(\xi)$ ($x_i < \xi < \alpha$ or $x_i > \xi > \alpha$). Accordingly, $x_i$ converges monotonically if $0 < F'(\xi) < 1$, while it converges with oscillation if $0 > F'(\xi) > -1$. If $|F'(\xi)| > 1$ and $F^{-1}(x)$ denotes the inverse function of $F(x)$, the iteration $x_{i+1} = F^{-1}(x_i)$ converges. To define the speed of convergence by iterative processes the following notion of order is utilized: When $\lim_{i\to\infty} x_i = \alpha$, the speed of convergence of the sequence $\{x_i\}$ is said to be of the $k$th order if $\lim_{i\to\infty} (x_{i+1} - \alpha)/(x_i - \alpha)^k = c \neq 0$. Necessary and sufficient conditions for the speed of convergence of $\{x_i\}$ to be of the $k$th order are $\alpha = F(\alpha)$, $F'(\alpha) = F''(\alpha) = \dots = F^{(k-1)}(\alpha) = 0$, $F^{(k)}(\alpha) \neq 0$.

The main iterative processes used in numerical calculation are described in the following sections.

C. Regula Falsi

Regula falsi (or the method of false position) is a process of obtaining the real root $\alpha$ of an equation $f(x) = 0$ by approaching the root from both sides. The calculation procedure is as follows: Assume that $f(x_p) > 0$ and $f(x_q) < 0$, where $x_p$ and $x_q$ are approximate values of $\alpha$ such that $\alpha$ lies between them. A new approximate value $x$ is then obtained from $x = (x_q f(x_p) - x_p f(x_q))/(f(x_p) - f(x_q))$. If $f(x) > 0$, then $x \to x_p$, and if $f(x) < 0$, then $x \to x_q$, and the procedure is repeated (the symbol $\to$ means replacement). The conditions and speed of convergence of regula falsi are as follows: Let $F(x) = (x_q f(x) - x f(x_q))/(f(x) - f(x_q))$; then $F'(\alpha) = (f(x_q) + (\alpha - x_q) f'(\alpha))/f(x_q)$. If $f'$ and $f''$ exist and are continuous near $\alpha$, then $f(x_q) = f(\alpha) + (x_q - \alpha) f'(\alpha) + (1/2)(x_q - \alpha)^2 f''(\xi)$, where $\alpha > \xi > x_q$ or $\alpha < \xi < x_q$. Accordingly, assume that $F'(\alpha) = (1/2)(x_q - \alpha)^2 f''(\xi)/f(x_q) \neq 0$. Then $|F'(x)| < 1$ can be satisfied if $x$ and $x_q$ are near $\alpha$. Therefore, if the initial value is appropriate, the convergence is of the first order. In regula falsi, only calculation of $f(x)$ is necessary, while that of $f'(x)$ is unnecessary. Furthermore, if $f(x) = 0$ has nearby real roots, i.e., $\alpha, \beta, \dots$ close to each other and $f'(x)$ small near the roots, there can be no mistake such as the neighboring root $\beta$ being obtained in the process of finding the root $\alpha$ by this method. This method belongs to the inverse linear interpolation method. This kind of inverse interpolation includes Muller's method [3], which uses †Lagrange's interpolation formula, the Torii-Miyakoda method [4], which uses †Hermite's interpolation formula, and Whittaker's method [5], which uses †Stirling's interpolation formula. These methods converge more rapidly but have more complicated formulas than inverse linear interpolation. The Sturm method (in which the interval where the roots exist is narrowed by †Sturm's theorem) and the Horner method (which obtains the decimal part digit by digit), both used to obtain the real roots of a high-order algebraic equation $f(x) = 0$, are also of this type.

D. The Newton-Raphson Method

In obtaining the real roots of an equation $f(x) = 0$, the Newton-Raphson method (or the Newton iterative process), which converges rapidly, is used when $f'(x) \neq 0$ is computable. Let $x_i$ be a sufficiently close $i$th approximation of the root $\alpha$; then $x_{i+1} = x_i - f(x_i)/f'(x_i)$ is closer than $x_i$ to the true solution. The process is repeated until $|x_{i+1} - x_i|$ is small enough. Conditions and the speed of convergence are as follows [6, 7]: Let $F(x) = x - f(x)/f'(x)$; then $F'(x) = f(x) f''(x)/(f'(x))^2$ and $F'(\alpha) = 0$. Accordingly, $F''(\alpha) \neq 0$ if $f'(\alpha) \neq 0$ and $f''(\alpha) \neq 0$, and the convergence is of the second order if it is possible to determine an appropriate approximate value $x_0$ that is close to $\alpha$ and satisfies $|F'(x)| < 1$. In particular, if $f(x_0) f'(x_0) \neq 0$, $h_0 = -f(x_0)/f'(x_0)$, $|f''(x)| \le M$, and $|f'(x_0)| \ge 2|h_0| M$, then every Newton approximation $x_i$ starting with $x_0$ is contained in the interval $I = [x_0 - |h_0|, x_0 + |h_0|]$, the equation has only one root $\alpha$ in $I$, and $x_i \to \alpha$. Besides,

$$|\alpha - x_{i+1}| \le M |x_i - x_{i-1}|^2 / (2 |f'(x_i)|), \quad i = 1, 2, \dots.$$

With regard to convergence and evaluation of the error, including roundoff error in practical computations, M. Urabe's studies should be consulted [8, 9]. In general, the convergence is of the third order in the Newton-Raphson method, which uses not only $f'(x)$ but also $f''(x)$ [10].

When $f'(x) = 0$, we must assume $f''(x) \neq 0$ and use the value $f''(x)$. With regard to this case, the study by S. Hitotumatu should be consulted [11]. In addition, W. Kizner (SIAM J. Appl. Math., 12 (1964)) reported an iterative process in which the convergence is of the fifth order without using any derived function higher in order than $f''(x)$. The essential part of his method lies in a numerical integration of the integral part by the †Runge-Kutta formula, based on the fact that if $x_1$ is the first approximate value of a root $\bar{x}$ of $f(x) = 0$, then

$$\bar{x} = \int_{f(x_1)}^{0} \frac{dx}{df}\, df + x_1.$$

There may be several iterative processes for solving $f(x) = 0$, even if the order $k$ of the convergence is fixed. For example, in obtaining the positive root of $f(x) = x^2 - a = 0$ ($a > 0$), i.e., the square root $\alpha = a^{1/2}$, both iterative processes $x_{i+1} = (x_i + a/x_i)/2$ and $x_{i+1} = 2x_i^3/(3x_i^2 - a)$ give convergence of the second order. In obtaining the real root of $f(x) = x^3 - a = 0$, i.e., the cube root $\alpha = a^{1/3}$, $x_{i+1} = x_i + (a/x_i^2 - x_i)/3$ converges with the second order, while $x_{i+1} = x_i/2 + (a + a/2)/(2x_i^2 + a/x_i)$ converges with the third order.

The Newton-Raphson method is applicable also to holomorphic functions of complex variables.

Take simultaneous equations of two variables, $f(x, y) = 0$ and $g(x, y) = 0$. If it is possible to transform the equations into $x = \varphi(x, y)$ and $y = \psi(x, y)$, and if $|\partial\varphi/\partial x| + |\partial\psi/\partial x| < 1$, $|\partial\varphi/\partial y| + |\partial\psi/\partial y| < 1$, then the iterative processes $x_{i+1} = \varphi(x_i, y_i)$ and $y_{i+1} = \psi(x_i, y_i)$ are applicable. In the Newton-Raphson method, let the corrections to the $i$th approximate values $x_i$ and $y_i$ be $\Delta x_i$ and $\Delta y_i$, and solve

$$f_x(x_i, y_i)\Delta x_i + f_y(x_i, y_i)\Delta y_i = -f(x_i, y_i),$$
$$g_x(x_i, y_i)\Delta x_i + g_y(x_i, y_i)\Delta y_i = -g(x_i, y_i).$$

Then we can take $x_{i+1} = x_i + \Delta x_i$ and $y_{i+1} = y_i + \Delta y_i$ as the next approximate values. This process is repeated until both $|\Delta x_i|$ and $|\Delta y_i|$ are sufficiently small. More generally, the Newton-Raphson method for solving a system of $n$ nonlinear equations $f_i(x_1, \dots, x_n) = 0$, $i = 1, 2, \dots, n$, is defined by

$$x^{(k+1)} = x^{(k)} - J(x^{(k)})^{-1} F(x^{(k)}), \quad k = 0, 1, 2, \dots,$$

where $x^{(k)} = (x_1^{(k)}, \dots, x_n^{(k)})^t$, $F(x) = (f_1(x_1, \dots, x_n), \dots, f_n(x_1, \dots, x_n))^t$, $J(x) = (\partial f_i(x_1, \dots, x_n)/\partial x_j)$ (the †Jacobian matrix of $F$), and where $x^{(0)}$ is chosen appropriately [23].

E. The Bairstow Method

Corresponding to a pair of complex roots $\alpha \pm i\beta$ of an algebraic equation $f(x) = a_0 x^n + a_1 x^{n-1} + \dots + a_n = 0$ with real coefficients, there is a real quadratic factor $x^2 + p^* x + q^*$. The Bairstow method (or Hitchcock method) obtains the coefficients $p^*$ and $q^*$ of this quadratic factor.

To do this, first choose an appropriate quadratic factor $x^2 + px + q$ as a candidate. Then through synthetic division by the quadratic factor, $b_j$ and $c_j$ are computed by means of the formulas

$$b_j = a_j - p b_{j-1} - q b_{j-2}, \quad j = 0, 1, \dots, n,$$

and

$$c_j = b_j - p c_{j-1} - q c_{j-2}, \quad j = 0, 1, \dots, n-1,$$

where

$$b_{-1} = b_{-2} = c_{-1} = c_{-2} = 0.$$

By solving the simultaneous equations

$$c_{n-2}\Delta p + c_{n-3}\Delta q = b_{n-1},$$
$$\bar{c}_{n-1}\Delta p + c_{n-2}\Delta q = b_n,$$

where

$$\bar{c}_{n-1} = c_{n-1} - b_{n-1},$$

the quantities $\Delta p$ and $\Delta q$ are obtained. Then taking $\bar{p} = p + \Delta p$, $\bar{q} = q + \Delta q$, we have $x^2 + \bar{p} x + \bar{q}$ as a new candidate. This operation is repeated until $\Delta p$ and $\Delta q$ are sufficiently small. Since this method corresponds to the Newton-Raphson method in the case of two variables, and, accordingly, the convergence is of the
second order, a choice of suitable initial values of $p$ and $q$ leads to rapid convergence. The key to this method lies in choosing $p$ and $q$ so that $R_1(p, q) = 0$ and $R_2(p, q) = 0$, where $R_1$ and $R_2$ are such that $R_1 x + R_2$ is the remainder of $f(x)$ divided by the trial quadratic factor $x^2 + px + q$. This method was generalized by A. A. Grau (SIAM J. Appl. Math., 11 (1963)). Namely, when $f(x) = (x^2 + px + q) g(x) + r(x)$ with $r(x) = r_1 x^{k+1} + r_2 x^k$, the functions $r_1$ and $r_2$ can be used instead of $R_1$ and $R_2$. The process corresponds to the Bairstow method when $k = 0$, and to the McAuley method (SIAM J. Appl. Math., 10 (1962)) when $k = n - 2$.

F. The Durand-Kerner Method

The Durand-Kerner (DK) method [12] for solving an algebraic equation $f(z) = z^n + a_1 z^{n-1} + \dots + a_n = 0$ ($a_n \neq 0$) with complex coefficients is an iterative method defined by

$$z_i^{(k+1)} = z_i^{(k)} - \frac{f(z_i^{(k)})}{\prod_{j=1,\, j \neq i}^{n} (z_i^{(k)} - z_j^{(k)})}, \quad i = 1, 2, \dots, n, \quad k = 0, 1, 2, \dots. \tag{1}$$

A feature of this method is that it can determine simultaneously all the roots of $f(z) = 0$. Let $\varphi_m(z_1, \dots, z_n)$, $m = 1, 2, \dots, n$, be the $m$th elementary symmetric functions with respect to $z_1, \dots, z_n$:

$$\varphi_m(z_1, \dots, z_n) = \sum_{i_1 < i_2 < \dots < i_m} z_{i_1} z_{i_2} \cdots z_{i_m}.$$

Set $f_m(z) = (-1)^m \varphi_m(z_1, \dots, z_n) - a_m$. Then $\alpha = (\alpha_1, \dots, \alpha_n)^t$, a set of the roots of $f(z) = 0$, is a solution of a system of the $n$ equations $f_m(z) = 0$, $m = 1, 2, \dots, n$. I. O. Kerner [12] showed that the DK method is the Newton-Raphson method applied to the system of nonlinear equations $f_m(z) = 0$, $m = 1, 2, \dots, n$. Therefore the speed of convergence of (1) is of the second order, provided that it converges. The initial values $z_1^{(0)}, \dots, z_n^{(0)}$ for the DK method are usually chosen as follows: Let $g(z) = f(z - a_1/n) = z^n + c_2 z^{n-2} + \dots + c_n$ and $h(z) = z^n - |c_2| z^{n-2} - \dots - |c_n|$. If $(c_2, \dots, c_n) \neq (0, \dots, 0)$ ($n \ge 2$), then it can be shown that $h(z) = 0$ has exactly one positive root $r$, and all the roots of $f(z) = 0$ lie in the disk $|z + a_1/n| \le r$. Now, with a positive $r_0 > r$ and $\theta = \pi/(2n)$, put

$$z_i^{(0)} = -\frac{a_1}{n} + r_0 \exp\left[\left(\frac{2(i-1)\pi}{n} + \theta\right)\sqrt{-1}\right], \quad i = 1, 2, \dots, n. \tag{2}$$

Such a choice was proposed by O. Aberth [13]. Hence the process (1) together with (2) can be called the Durand-Kerner-Aberth (DKA) method. It is shown for the DKA method that, if $r_0$ is large enough, then

$$z_i^{(k+1)} + \frac{a_1}{n} \approx \left(1 - \frac{1}{n}\right)\left(z_i^{(k)} + \frac{a_1}{n}\right), \quad i = 1, 2, \dots, n,$$

hold nearly for the first several steps. Thus the DKA method has a certain kind of global convergence property that renders it one of the most powerful methods for solving algebraic equations.

A variant of the Durand-Kerner method is the Ehrlich-Aberth method [13, 14]; it is defined for $i = 1, 2, \dots, n$, $k = 0, 1, 2, \dots$ by

$$z_i^{(k+1)} = z_i^{(k)} - \frac{f(z_i^{(k)})}{f'(z_i^{(k)}) - f(z_i^{(k)}) \sum_{j=1,\, j \neq i}^{n} 1/(z_i^{(k)} - z_j^{(k)})}.$$

The speed of convergence of this method is of the third order. Further variants of the Durand-Kerner and the Ehrlich-Aberth methods have been proposed by M. Iri et al. [15] and A. W. M. Nourein [16].

G. The Dejon-Nickel Method

Let $f(z)$ be a polynomial of degree $n$ with complex coefficients, and take $z_0$ such that $f(z_0) \neq 0$. Then we can write

$$f(z_0 + h) = f(z_0)\{1 + b_i h^i + b_{i+1} h^{i+1} + \dots + b_n h^n\} \quad (b_i \neq 0)$$
$$= f(z_0)\{1 + b_i h^i (1 + \theta)\}, \quad \theta = O(h).$$

Now, choose $h$ sufficiently small so that $|b_i h^i| < 1$, $|\theta| < 1$, and $\arg(b_i h^i) = \pi$. Then $|1 + b_i h^i| = 1 - |b_i h^i|$ and $|b_i h^i \theta| < |b_i h^i|$ so that $|f(z_0 + h)| \le |f(z_0)|\{|1 + b_i h^i| + |b_i h^i \theta|\} < |f(z_0)|$. Hence, if $z_0$ is chosen so that $\min |f(z)| = |f(z_0)|$, then we must have $f(z_0) = 0$. This is the outline of Cauchy's existence proof for the roots of algebraic equations. The Dejon-Nickel method [17] is a method in which $h$ is chosen as follows:

$$h = (-1/b_k)^{1/k},$$

where

$$|b_k|^{-1/k} = \min\{|b_j|^{-1/j} : i \le j \le n,\ b_j \neq 0\} \quad (= r, \text{ say}).$$

If several such $k$ exist, take the smallest one. The branch of the multivalued function $(-1/b_k)^{1/k}$ is chosen such that $z^{1/k}$ is positive for $z$ positive. If $|f(z_0 + h)| < (1 - \varepsilon)|f(z_0)|$ with a preassigned constant $\varepsilon$ such that $0 < \varepsilon < 1$, then put $z_1 = z_0 + h$. If the inequality does not hold, then for each integer $m \ge 1$ choose the smallest integer $l = l(m)$ such that $\max\{|b_j|(r/2^m)^j : i \le j \le n\} = |b_l|(r/2^m)^l$, and put $\hat{z}_m = z_0 + (r/2^m)(-|b_l|/b_l)^{1/l}$. Find the smallest integer $m \ge 1$ such that $|f(\hat{z}_m)| < (1 - 2\varepsilon (r/2^m)^l |b_l|)|f(z_0)|$, and put
$z_1 = \hat{z}_m$. By continuing this process, a sequence $\{z_j\}$ is constructed such that $|f(z_0)| > |f(z_1)| > |f(z_2)| > \dots$. It converges toward some root of the equation $f(z) = 0$. S. Hirano has proposed a similar method.

H. Methods for Finding Good First Estimates of Roots

Some of the principal methods for obtaining good first approximate values of the roots are given in the following Sections I-N.

I. Matrix Methods

The problem of solving an algebraic equation $f(z) = z^n + a_1 z^{n-1} + \dots + a_n = 0$ with complex coefficients is equivalent to that of finding the eigenvalues of the companion matrix

$$A = \begin{pmatrix} -a_1 & -a_2 & \cdots & -a_{n-1} & -a_n \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix}.$$

Therefore numerical methods for solving the eigenvalue problems for nonsymmetric matrices are applicable (→ 298 Numerical Computation of Eigenvalues). However, it should be remarked that this might be inefficient because most of the elements of the matrix $A$ are zero [18].

J. The Bernoulli Method

In the Bernoulli method the iterative formulas $S_1 = -a_1$, $S_k = -(a_1 S_{k-1} + a_2 S_{k-2} + \dots + a_{k-1} S_1 + k a_k)$ ($k = 2, 3, \dots, n$), $S_k = -a_1 S_{k-1} - a_2 S_{k-2} - \dots - a_n S_{k-n}$ ($k = n+1, n+2, \dots$) are repeatedly applied to an algebraic equation $f(x) = x^n + a_1 x^{n-1} + a_2 x^{n-2} + \dots + a_n = 0$. When the roots of $f(x) = 0$ are $\alpha_1, \alpha_2, \dots, \alpha_n$ ($|\alpha_1| \ge |\alpha_2| \ge \dots \ge |\alpha_n|$), $S_k/S_{k-1} \to \alpha_1$ if $\alpha_1$ is a real and simple root and $|\alpha_1| > |\alpha_2|$. When $\alpha_1$, $\alpha_2$ are complex roots, we put $\alpha_1 = Re^{i\theta}$, $\alpha_2 = Re^{-i\theta}$. If $|\alpha_3| < R$, then we have

$$\frac{S_k^2 - S_{k+1} S_{k-1}}{S_{k-1}^2 - S_k S_{k-2}} \to R^2,$$

$$\frac{S_k S_{k-1} - S_{k+1} S_{k-2}}{S_{k-1}^2 - S_k S_{k-2}} \to 2R\cos\theta,$$

and hence the two roots of $x^2 - (2R\cos\theta)x + R^2 = 0$ are $\alpha_1$ and $\alpha_2$. $S_k$ is the sum of the $k$th powers of the $n$ roots of the equation, $\alpha_1, \alpha_2, \dots, \alpha_n$. C. Lanczos used $x f'(x)/f(x) = n + S_1/x + S_2/x^2 + \dots + S_k/x^k + \dots$ to compute $S_k$ [10, pp. 26-30]. Let the eigenvalues of the companion matrix $A$ be $\lambda_1, \lambda_2, \dots, \lambda_n$ ($|\lambda_1| \ge |\lambda_2| \ge \dots \ge |\lambda_n|$). Then the Bernoulli method is the same as the †power method for obtaining the maximum characteristic value $\lambda_1$ (→ 298 Numerical Computation of Eigenvalues C).

K. The Lehmer Method

The Lehmer method [19] is a general method for finding the $n$ roots of an algebraic equation $f(z) = 0$ with complex coefficients in the complex plane by repeating the following procedure: First, draw a circle whose center is at the origin and whose radius is $R = r 2^{-\theta}$, where $r$ is an arbitrary given number and $\theta$ an arbitrary given integer. Then, utilizing the Lehmer theorem, observe whether a root $\alpha$ of $f(z) = 0$ lies inside the circle. If there is no such root, replace $R$ with $2R$, whereas if such a root exists, replace $R$ with $R/2$. Continue this process until an $R$ for which a root $\alpha$ exists within the annulus $R < |z| < 2R$ is obtained.

Second, draw circles with the centers $\beta_k = (5R/3)\exp(2\pi i k/8)$ ($k = 0, 1, \dots, 7$) and common radius $\rho = (5R/3)/2$ and cover the annulus obtained by the first step. Then find out which of the circles has the root $\alpha$ in its interior. If $\alpha$ is in the interior of the circle for $k = j$, the origin is shifted to $\beta_j$ and the operation is started again from the first step. If $R_1$ satisfying $R_1 < |z - \beta_j| < 2R_1$ is obtained by the first step, we have $R_1 < (5/12)R$. Therefore, when the first step is repeated $N$ times, the root $\alpha$ is confined in a small circle whose radius is smaller than $2(5/12)^N R$. Then the center $\beta$ of this small circle gives a good approximate value of $\alpha$.

L. The Downhill Method

The downhill method is a method for obtaining the extreme values of a function of many variables. It is also applicable in obtaining approximate solutions of a system of equations. Let us consider the downhill method in the case of two variables. The problem of obtaining the real roots of simultaneous equations $f(x, y) = 0$ and $g(x, y) = 0$ can be reduced to the problem of obtaining the coordinates $(\alpha, \beta)$ which give the extreme value 0 of $\Phi(x, y)$, where $\Phi(x, y) = f^2 + g^2$. The values of $\Phi(x, y)$ are calculated at $3^2$ points obtained by the combination of $x = x_1, x_1 \pm h$; $y = y_1, y_1 \pm h$, where $(x_1, y_1)$ are arbitrary approximate values of $(\alpha, \beta)$ constituting the centers of those sets of points, and $h$ is the given step size. Utilizing the values of the function at the $3^2$ points,
$\Phi(x, y)$ is approximated by a quadratic surface

$$b_0 + b_1 x + b_2 y + b_{11}(3x^2 - 2) + b_{22}(3y^2 - 2) + b_{12} xy.$$

The values of $b_0, b_1, b_2, b_{11}, b_{22}, b_{12}$ are calculated by the †method of least squares, and the center $(x_1^*, y_1^*)$ of the approximate quadratic surface is obtained. Then the first approximations $x_1$ and $y_1$ are replaced by the second approximations $x_1 + x_1^*$ and $y_1 + y_1^*$. This process is repeated while the step size $h$ is made appropriately smaller. This method is an improvement on the method of successive experimental planning used by G. E. P. Box and K. B. Wilson (1954) to obtain optimum conditions in the exploration of response curved surfaces.

By using another minimization technique, J. A. Grant and G. D. Hitchins [21] gave an ever-convergent algorithm for determining initial approximations to the roots of algebraic equations with real coefficients. The practical implementation of this method is given in J. A. Grant and G. D. Hitchins (Comput. J., 18 (1975)).

M. Continuation Methods

Let $F(x) = (f_1(x), \dots, f_n(x))^t = 0$ ($x = (x_1, \dots, x_n)^t$) be a system of $n$ equations. Suppose that no reasonable approximation for a solution exists. Then, take arbitrary $x^{(0)}$ and define a one-parameter family of equations

$$H(x, t) \equiv F(x) - (1 - t) F(x^{(0)}) = 0, \quad 0 \le t \le 1. \tag{3}$$

Suppose that for each $t$ the equation has a solution $x(t)$ which depends continuously on $t$. Observe then that $x(0) = x^{(0)}$ and $x(1)$ is a solution for the equation $F(x) = 0$. Partition the interval $[0, 1]$ by the points $0 = t_0 < t_1 < \dots < t_N = 1$. First, solve $H(x, t_1) = 0$ by some iterative method using $x^{(0)}$ as a first approximation. Let $x^{(1)}$ be a solution thus obtained. Next, solve $H(x, t_2) = 0$ by some iterative method using $x^{(1)}$ as a first approximation, and so on. Finally, solve $H(x, t_N) = 0$ by some iterative method which uses $x^{(N-1)}$ as a first approximation. Then $x^{(N)}$, a solution thus obtained, can be used as a first approximation in an iterative method applied to the equation $F(x) = 0$. This method is called a continuation method [22-24].

As another approach, differentiate (3) with respect to $t$. Then $J(x(t)) x'(t) + F(x^{(0)}) = 0$, where $J(x(t))$ is the Jacobian matrix of $F$, evaluated at $x = x(t)$. Hence $x(t)$ is the solution of the system of ordinary differential equations

$$x'(t) = -J(x(t))^{-1} F(x^{(0)}), \quad 0 \le t \le 1,$$

subject to the initial condition $x(0) = x^{(0)}$, provided that $J(x(t))$ is nonsingular. Therefore numerical solution at $t = 1$ gives a good approximation for a solution of $F(x) = 0$. This is called Davidenko's method of differentiation with respect to a parameter. These methods can be used to find initial approximations of the Durand-Kerner and the Ehrlich-Aberth methods. They are also applicable to a single algebraic equation $f(x) = 0$, which is the special case $n = 1$.

N. Other Methods

To obtain the first approximate values of the roots of an algebraic equation, the Graeffe method has been used, in which the roots of the equation are separated by successively forming an equation whose roots are the squares of the roots of the preceding equation. In computer calculations, however, the other methods described in previous sections are more convenient than the Graeffe method. The Lanczos method [10] is a method in which $g(x)$, an approximate function of $f(x)$, is obtained by the process of approximating functions, and the roots of $g(x) = 0$ are taken as approximate values of the roots of $f(x) = 0$. The Garside-Jarratt-Mack method [25] is a modification of the Lanczos method and approximates $f(x)/f'(x)$ by a rational function $g(x) = (x - a)/(b + cx)$.

O. Error Bounds for Computed Solutions

Let $z_1, \dots, z_n$ be computed solutions of an algebraic equation $f(z) = 0$ which were obtained by some method. If $z_1, \dots, z_n$ are distinct, then the following result due to B. T. Smith [26] is quite useful for estimating the errors of $z_i$: Let

$$\gamma_i = \frac{n |f(z_i)|}{\prod_{j \neq i} |z_i - z_j|}, \qquad \Gamma_i = \{z : |z - z_i| \le \gamma_i\}, \quad i = 1, \dots, n,$$

where the leading coefficient of $f$ is taken to be 1. Then the union of the disks $\Gamma_i$ contains all the roots $\alpha_1, \dots, \alpha_n$ of $f(z) = 0$. Any connected component of $\bigcup_{i=1}^{n} \Gamma_i$, which consists of just $m$ disks $\Gamma_i$, contains exactly $m$ roots of $f(z) = 0$. Hence if $\Gamma_i \cap \Gamma_j = \emptyset$ for every $j \neq i$, then $|\alpha_i - z_i| \le \gamma_i$. Smith obtained a more general result for the case where $z_1, \dots, z_n$ are not necessarily distinct.

References

[1] J. F. Traub, A class of globally convergent iteration functions for the solution of polynomial equations, Math. Comp., 20 (1966), 113-138.
[2] B. Dejon and P. Henrici (eds.), Constructive aspects of the fundamental theorem of algebra, Wiley-Interscience, 1969.
[3] D. E. Muller, A method for solving algebraic equations using an automatic computer, Math. Tables Aids Comput., 10 (1956), 208-215.
[4] T. Torii and T. Miyakoda, A practical root-finding algorithm based on the cubic Hermite interpolation, Information Processing in Japan, 13 (1973), 143-148.
[5] E. T. Whittaker and G. Robinson, Calculus of observations, Van Nostrand, 1924.
[6] L. B. Rall, Computational solution of nonlinear operator equations, Krieger, 1979.
[7] A. M. Ostrowski, Solution of equations in Euclidean and Banach spaces, Academic Press, 1973.
[8] M. Urabe, Error estimation in numerical solution of equations by iteration process, J. Sci. Hiroshima Univ., ser. A, 26 (1962), 77-91.
[9] M. Urabe, Component-wise error analysis of iterative methods practiced on a floating-point system, Mem. Fac. Sci. Kyushu Univ., 27 (1973), 23-64.
[10] C. Lanczos, Applied analysis, Prentice-Hall, 1956.
[11] S. Hitotumatu, A method of successive approximation based on the expansion of second order, Math. Japonicae, 7 (1962), 31-50.
[12] I. O. Kerner, Ein Gesamtschrittverfahren zur Berechnung der Nullstellen von Polynomen, Numer. Math., 8 (1966), 290-294.
[13] O. Aberth, Iteration methods for finding all zeros of a polynomial simultaneously, Math. Comp., 27 (1973), 339-344.
[14] L. W. Ehrlich, A modified Newton method for polynomials, Comm. ACM, 10 (1967), 107-108.
[15] M. Iri, H. Yamashita, T. Terano, and H. Ono, An algebraic-equation solver with global convergence property, Publ. Res. Inst. Math. Sci., 339 (1978), 43-69.
[16] A. W. M. Nourein, An improvement on two iteration methods for simultaneous determination of the zeros of a polynomial, Int. J. Computer Math., 6 (1977), 241-252.
[17] B. Dejon and K. Nickel, A never failing, fast convergent root-finding algorithm, Constructive Aspects of the Fundamental Theorem of Algebra, B. Dejon and P. Henrici (eds.), Wiley-Interscience, 1969, 1-35.
[18] D. M. Young and R. T. Gregory, A survey of numerical mathematics I, Addison-Wesley, 1972.
[19] D. H. Lehmer, A machine method for solving polynomial equations, J. ACM, 8 (1961), 151-162.
[20] O. L. Rasmussen, Solution of polynomial equations by the method of D. H. Lehmer, BIT, 4 (1964), 250-260.
[21] J. A. Grant and G. D. Hitchins, An always convergent minimization technique for the solution of polynomial equations, J. Inst. Math. Appl., 8 (1971), 122-129.
[22] J. M. Ortega, Numerical analysis, a second course, Academic Press, 1972.
[23] J. M. Ortega and W. C. Rheinboldt, Iterative solution of nonlinear equations in several variables, Academic Press, 1970.
[24] R. Wait, The numerical solution of algebraic equations, Wiley, 1979.
[25] G. R. Garside, P. Jarratt, and C. Mack, A new method for solving polynomial equations, Comput. J., 11 (1968), 87-89.
[26] B. T. Smith, Error bounds for zeros of a polynomial based upon Gerschgorin's theorems, J. ACM, 17 (1970), 661-674.
[27] A. S. Householder, The numerical treatment of a single nonlinear equation, McGraw-Hill, 1970.
[28] J. H. Wilkinson, Rounding errors in algebraic processes, HMSO, 1963.
[29] J. H. Wilkinson, Algebraic eigenvalue problem, Oxford Univ. Press, 1965.
[30] P. Rabinowitz (ed.), Numerical methods for nonlinear algebraic equations, Gordon & Breach, 1970.

302 (XV.4)
Numerical Solution of Linear Equations

A. Condition of Linear Systems

The solution of the system of linear algebraic equations

$$\sum_{j=1}^{n} a_{ij} x_j = b_i; \quad i = 1, \dots, n, \quad a_{ij}, b_i \text{ real}, \tag{1}$$

which may be written in the matrix form

$$Ax = b; \quad A = (a_{ij}), \quad x = (x_i), \quad b = (b_i), \tag{1'}$$

is expressed as quotients of determinants by †Cramer's rule. In practice, however, this form of the solution is of little value for numerical computation, because the direct evaluation of the determinants involves $(n+1)!(n-1)$ multiplications which, even for a moderate-sized system, amount to a prohibitive number of arithmetic operations to be executed, even with modern high-speed computers; besides, it requires high-precision calculation. There are a variety of practical methods of solving efficiently the system (1) with finite-precision calculation. These numerical methods are generally divided into two classes: direct methods and iterative methods.
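To make the cost argument concrete, the following minimal sketch tabulates the multiplication count $(n+1)!(n-1)$ quoted above for Cramer's rule next to the leading-order count of about $n^3/3$ that this article gives for forward elimination in Section B. The function names are illustrative assumptions, not part of the source, and the elimination figure is only the leading term, not an exact count.

```python
from math import factorial


def cramer_multiplications(n: int) -> int:
    """Multiplications for direct determinant evaluation: (n + 1)! * (n - 1)."""
    return factorial(n + 1) * (n - 1)


def elimination_multiplications(n: int) -> float:
    """Leading-order multiplication count for forward elimination: n^3 / 3."""
    return n ** 3 / 3


if __name__ == "__main__":
    # Print n, the Cramer's-rule count, and the rounded elimination estimate.
    for n in (5, 10, 20):
        print(n, cramer_multiplications(n), round(elimination_multiplications(n)))
```

Even at $n = 10$ the determinant-based count is in the hundreds of millions while the elimination estimate is a few hundred, which is the gap the paragraph above describes.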
1131 302 B
Numerical Solution of Linear Equations

Whatever method is used, inherent diff- Starting with A (‘I = A and b(O) = b, a system
culties are encountered when the solution of of linear equations with an upper-triangular
the system is unstable. The instability is usu- coefficient matrix,
ally measured by the condition number of the
coefficient matrix, which is defined by
$$\operatorname{cond} A = \sigma_1/\sigma_n \quad (= \|A\|\,\|A^{-1}\| \ge 1), \tag{2}$$

where the $\sigma_i$ ($\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n > 0$) are the nonnegative square roots of the †eigenvalues of $A'A$ (called the singular values of $A$), $\|A\| = \max\{\|Ax\|/\|x\| : x \ne 0\}$, and $\|x\| = \sqrt{x'x}$. This definition is valid also for an $m \times n$ matrix $A$ ($m > n$). The condition number cond $A$ satisfies

$$\frac{\|\delta x\|}{\|x\|} \le \frac{\operatorname{cond} A}{1 - \dfrac{\|\delta A\|}{\|A\|}\operatorname{cond} A}\left(\frac{\|\delta A\|}{\|A\|} + \frac{\|\delta b\|}{\|b\|}\right),$$

where $Ax = b$, $(A + \delta A)(x + \delta x) = b + \delta b$, and $\|\delta A\|\,\|A^{-1}\| < 1$. As the condition number increases, solution processes become more susceptible to errors. If cond $A$ is large, the system is called ill-conditioned.

B. Direct Methods

A direct method is one that yields an exact solution in a finite number of arithmetic operations if they are performed without roundoff error. Among the existing direct methods, the one known as Gaussian elimination with pivoting, which is based on systematic elimination of unknowns of the equation (1), is found to be the best with respect to time or accuracy. Its dependability was re-established by means of backward error analysis (→ 138 Error Analysis; [1-3]). The method generates successive vectors $b^{(k)} = (b_i^k)$ and matrices $A^{(k)} = (a_{ij}^k)$, typically of the form (with $\times$ denoting a possibly nonzero entry)

$$A^{(k)} = \begin{pmatrix} \times & \times & \times & \times & \times \\ 0 & \times & \times & \times & \times \\ 0 & 0 & \times & \times & \times \\ 0 & 0 & \times & \times & \times \\ 0 & 0 & \times & \times & \times \end{pmatrix}$$

when $n = 5$ and $k = 2$. At the $k$th stage, a pivotal element $a_{ij}^{k-1} \ne 0$ ($i, j \ge k$) is chosen in one way or another. Then the $i$th and $k$th rows and the $j$th and $k$th columns of $A^{(k-1)}$ are interchanged so that $a_{ij}^{k-1}$ becomes $a_{kk}^{k}$. The $i$th and $k$th elements of $b^{(k-1)}$ are also interchanged. The pivot $a_{kk}^{k}$ is then used to eliminate all the nonzero entries in its column below the diagonal as follows:

$$a_{ij}^{k} = a_{ij}^{k-1}, \quad i = 1, \ldots, k, \; j = 1, \ldots, n;$$
$$b_{i}^{k} = b_{i}^{k-1}, \quad i = 1, \ldots, k;$$
$$a_{ij}^{k} = a_{ij}^{k-1} - (a_{ik}^{k-1}/a_{kk}^{k})\,a_{kj}^{k}, \quad i = k+1, \ldots, n, \; j = k, \ldots, n;$$
$$b_{i}^{k} = b_{i}^{k-1} - (a_{ik}^{k-1}/a_{kk}^{k})\,b_{k}^{k}, \quad i = k+1, \ldots, n.$$

In this way an upper triangular system $A^{(n-1)}x^* = b^{(n-1)}$ is produced at the stage $k = n - 1$, where $x^*$ is a permutation of $x$, caused by interchanging columns. This part of the process is called forward elimination. If $a_{ij}^{k-1} = 0$ for all $i, j \ge k$ at some stage, then $A$ is a singular matrix of †rank $k - 1$ and the system (1) admits infinitely many solutions if $b_i^{k-1} = 0$ for $i = k, \ldots, n$, and no solution otherwise. If this is not the case, $A$ is a nonsingular matrix with $\det A = \pm a_{11}^{1} a_{22}^{2} \cdots a_{nn}^{n-1}$ (the sign being determined by the number of interchanges), and the solution of (1) is given by

$$x_i^* = \Bigl(b_i^{n-1} - \sum_{j=i+1}^{n} a_{ij}^{n-1} x_j^*\Bigr)\Big/ a_{ii}^{n-1}$$

for $i = n, n-1, \ldots, 1$, which is called back substitution. Taking the pivot to be an element of the largest absolute value among column elements $a_{ik}^{k-1}$ ($i \ge k$) at each stage is called the partial pivoting strategy. In complete pivoting, the pivot is taken to be an element of the largest absolute value among $a_{ij}^{k-1}$ ($i, j \ge k$). These pivotings are introduced to prevent loss of accuracy due to rapid growth of elements of successive $A^{(k)}$. Although a smaller bound for the growth factor is obtained for complete pivoting, in practice partial pivoting appears to be entirely adequate.

If rows or columns are not interchanged in the process, forward elimination effectively produces a factorization of $A$ into the product of a lower triangular matrix $L$ and an upper triangular matrix $U$, i.e.,

$$A = LU, \tag{3}$$

where $L$ has unit diagonal elements and $U = A^{(n-1)}$. This factorization is computed directly by the Doolittle method without calculating the intermediate $A^{(k)}$. The Crout method also produces a similar factorization (3) in which $U$ has unit diagonal elements. The Cholesky method determines a similar factorization (3) of a †positive definite matrix $A$, in which $U = L'$. Once the triangular factorizations (3) are formed, the solution of the system (1) is determined by solving the two triangular systems $Ly = b$ and $Ux = y$ successively.

The number of multiplicative operations required for these factorizations is about $n^3/3$ for forward elimination, Doolittle's method, and Crout's method, and about $n^3/6$ for Cholesky's method. The solution of each triangular system requires about $n^2/2$ multiplicative operations. Special properties of $A$, such as a banded structure, can be exploited to reduce the number of operations and memory requirements considerably [4].
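The forward elimination and back substitution steps of Section B can be sketched in a few lines of Python. This is a minimal illustration with partial pivoting (row interchanges only), not a production solver; the function name `solve_gauss` and the list-of-lists matrix representation are assumptions made for this example.

```python
def solve_gauss(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    `a` is a square matrix given as a list of rows, `b` a list of numbers.
    Both are copied, so the caller's data is left untouched.
    """
    n = len(a)
    a = [list(map(float, row)) for row in a]
    b = [float(v) for v in b]

    # Forward elimination: stage k clears column k below the diagonal.
    for k in range(n - 1):
        # Partial pivoting: pick the row with the largest |a[i][k]|, i >= k.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            raise ValueError("matrix is singular")
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]          # multiplier for row i
            for j in range(k, n):
                a[i][j] -= m * a[k][j]
            b[i] -= m * b[k]

    if a[n - 1][n - 1] == 0.0:
        raise ValueError("matrix is singular")

    # Back substitution on the upper triangular system, i = n, n-1, ..., 1.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


if __name__ == "__main__":
    print(solve_gauss([[2, 1, -1], [-3, -1, 2], [-2, 1, 2]], [8, -11, -3]))
```

Because only rows are interchanged here, the unknowns keep their original order and no permutation $x^*$ of $x$ is needed.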
Gauss-Jordan elimination is similar to Gaussian elimination, but the elements above the diagonal are also eliminated to dispense with back substitution. However, it requires about $n^3/2$ multiplications. Modern computers can easily handle problems of size $n = 100$ by these direct methods.

C. Iterative Methods

An iterative method is a dynamical process that generates a sequence of approximations $\{x^k\}$ converging to the exact solution. At each step of the iteration, an improved approximation is obtained from the previous ones. The accuracy of the solution depends on the number of iterations performed. Most iterative methods retain the coefficient matrix in its original form throughout the process and hence have the advantage of requiring minimal memory. They are suitable for solving large sparse systems arising in finite-dimensional approximations to †partial differential equations, where $A$ is sparse if most of its elements are zero.

Linear stationary iterative processes are the most frequently used iterative methods. A method in this class is written in the form

$$x^k = x^{k-1} + R(b - Ax^{k-1}), \quad k = 1, 2, \ldots, \tag{4}$$

where $R$ is chosen to approximate the inverse of $A$. Usually, $R$ and $b - Ax^{k-1}$ are not expressed explicitly in the actual algorithms. If the †spectral radius of the iteration matrix $I - RA$ is less than one, the method converges to the solution for an arbitrary initial approximation $x^0$ [5]. The key matrix $R$ can be chosen quite freely as long as this condition is satisfied. In the Richardson method, $R = \alpha A'$, $0 < \alpha < 2/\|A\|^2$. $R = (A + E)^{-1}$ is the usual choice, in which $E$ is a perturbing matrix that makes $A + E$ easily invertible. In the Gauss-Seidel method and the Jacobi method, $R$ is chosen to be the inverse of the lower triangular and the diagonal submatrices of $A$, respectively, i.e., $R = (L + D)^{-1}$ and $R = D^{-1}$, where $L = (a_{ij})$ ($i > j$) and $D = (a_{ii})$.

Direct methods can be combined with the iterative method (4) to obtain an approximate inverse $R$. In the course of factorization by a direct method, an artificial perturbation $E$ is introduced to produce an incomplete factorization,

$$\tilde L \tilde U = A + E, \tag{5}$$

so that $\tilde L$ and $\tilde U$ are of low †complexity of computation. A combined method is then constructed by putting $R = \tilde U^{-1}\tilde L^{-1}$. Even if $E$ is not added intentionally, each of the direct methods actually produces an incomplete factorization (5) in which $E$ is a matrix of small elements accounting for the effect of roundoff errors [1-4]. In this case, the combined method is called the iterative improvement of the direct method. If applied to a first solution $x^0 = \tilde U^{-1}\tilde L^{-1}b$, it produces a more accurate solution in a few steps with only a modest increase in computation time, when the system is not too ill-conditioned [4]. It is essential, however, that the residual $b - Ax^0$ be computed with higher precision.

Various convergence criteria have been established for a variety of methods [5, 6]. For example, the Gauss-Seidel method is convergent for a symmetric positive definite matrix $A$. The smaller the spectral radius, the faster the convergence of the method. In general, a larger amount of computation is required in each iteration to get faster convergence. If the spectral radius $\rho(I - RA)$ is close to one, the convergence is slow, and an acceleration of the process is needed. SOR (successive overrelaxation) is an accelerated version of the Gauss-Seidel method, in which $R_\omega = (L + \omega^{-1}D)^{-1}$ and the optimum acceleration parameter $\omega$ ($0 < \omega < 2$) is chosen to minimize $\rho(I - R_\omega A)$ (→ 304 Numerical Solution of Partial Differential Equations; [6]). There is an adaptive acceleration method [7], which, if applied to a scalar sequence, reduces to Aitken's $\delta^2$-method.

D. The Conjugate Gradient Method

The conjugate gradient (CG) method is a nonlinear stationary iterative method for solving a system with a symmetric positive definite coefficient matrix. The method generates $\{x^k\}$, $\{r^k\}$, and $\{p^k\}$ by means of the formulas

$$\begin{aligned}
\alpha_k &= (r^k, r^k)/(p^k, Ap^k),\\
x^{k+1} &= x^k + \alpha_k p^k,\\
r^{k+1} &= b - Ax^{k+1} = r^k - \alpha_k Ap^k,\\
\beta_k &= (r^{k+1}, r^{k+1})/(r^k, r^k),\\
p^{k+1} &= r^{k+1} + \beta_k p^k,
\end{aligned} \tag{6}$$

where $x^0$ is arbitrary and $p^0 = r^0 = b - Ax^0$, $(r, s) = s'r$.

The CG method shares a feature with the direct method. In theory, $\{x^k\}$ converges to the solution in at most $n$ steps, and the $p^i$ are mutually conjugate, i.e., $(p^i)'Ap^j = 0$ ($i \ne j$). When Hestenes and Stiefel [9] proposed the method in 1952, they created a great sensation because of the method's theoretical elegance. The CG method turned out, however, to be highly sensitive to roundoff errors. In practice, nice theoretical properties, such as finite termination, do not hold in the presence of error.
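The recurrences (6) translate almost line by line into code. The following Python sketch is only an illustration for small dense systems; the names `conjugate_gradient`, `tol`, and `max_iter`, and the stopping test on the residual norm, are assumptions of this example and are not part of the method as stated in the text.

```python
import math


def dot(u, v):
    """Inner product (u, v) of two vectors given as lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(a, v):
    """Product of a dense matrix (list of rows) with a vector."""
    return [dot(row, v) for row in a]


def conjugate_gradient(a, b, x0=None, tol=1e-12, max_iter=None):
    """Solve a x = b for a symmetric positive definite matrix `a`."""
    n = len(b)
    x = [0.0] * n if x0 is None else list(x0)
    r = [bi - axi for bi, axi in zip(b, matvec(a, x))]  # r^0 = b - A x^0
    p = r[:]                                            # p^0 = r^0
    rr = dot(r, r)
    if max_iter is None:
        max_iter = 10 * n  # allow extra steps, since roundoff spoils finite termination
    for _ in range(max_iter):
        if math.sqrt(rr) <= tol:
            break
        ap = matvec(a, p)
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


if __name__ == "__main__":
    print(conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0]))
```

The matrix enters only through the product `matvec(a, p)`, which is why the method suits large sparse systems: a sparse matrix-vector product can be substituted without changing the rest of the loop.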
Recently, the CG method has regained its popularity as an iterative method for solving large sparse systems. The iteration (6) is usually restarted periodically to accelerate the convergence. It may converge in a small number of steps, as it takes advantage of the distribution of eigenvalues of $A$ adaptively. The CG method is most effective when it is used to accelerate linear stationary iterative methods or when it is applied to the preconditioned system

$$\tilde L^{-1}A(\tilde L^{-1})'y = \tilde L^{-1}b, \quad x = (\tilde L^{-1})'y, \tag{7}$$

where $\tilde L$ is computed by the incomplete Cholesky factorization $\tilde L\tilde L' = A + E$. The matrix $\tilde L^{-1}A(\tilde L^{-1})'$ is not to be formed explicitly in the CG method.

E. Linear Least Squares Problem

Let $Ax = b$ be an overdetermined system of linear equations, where $A$ is an $m \times n$ matrix ($m > n$). The linear least squares problem is to find the $x$ that minimizes the Euclidean norm $\|b - Ax\|$. We assume here that $A$ is of rank $n$. The most straightforward method for solving the problem is to apply the Cholesky method to the normal equation

$$(A'A)x = A'b. \tag{8}$$

This may result in ill-conditioning of the system, since cond $A'A = (\operatorname{cond} A)^2$. Ill-conditioning can sometimes be avoided by forming the factorization $A = LU$ by Gaussian elimination, where $L$ is an $m \times n$ lower trapezoidal matrix and $U$ is an upper triangular matrix. The least squares solution is then obtained by solving successively the systems $L'Ly = L'b$ and $Ux = y$.

Another approach to avoiding ill-conditioning is based on orthogonalization. The modified Gram-Schmidt orthogonalization method produces in an effective manner the factorization

$$A = QU, \tag{9}$$

where $Q$ is an $m \times n$ matrix whose columns are a set of †orthonormal vectors and $U$ is an upper triangular matrix [10, 11]. The least squares solution is obtained by solving $Ux = Q'b$. A sequence of Householder transformations, $P_k = I - 2w_k w_k'/\|w_k\|^2$, can produce the factorization

$$P_n P_{n-1} \cdots P_2 P_1 A = \begin{pmatrix} U \\ 0 \end{pmatrix}, \tag{10}$$

where $U$ is an $n \times n$ upper triangular matrix [12]. The least squares solution is obtained by solving $Ux = \tilde b$, where $\tilde b$ is formed by the first $n$ elements of the vector $P_n P_{n-1} \cdots P_2 P_1 b$. A similar factorization can also be produced by using a sequence of Givens transformations, which have the advantage of exploiting the sparseness of $A$. The matrices $U$ in (9) and (10) are unique and coincide with the transpose $L'$ of the lower triangular matrix $L$ of the Cholesky factorization, $A'A = LL'$, of $A'A$. The multiplicative operations required for the factorizations (9) and (10) are about $mn^2$ and $mn^2 - n^3/3$, respectively.

When the rank of the matrix is unknown, we can determine an "effective rank" $p$ of $A$, based on the singular value decomposition (SVD):

$$A = UDV',$$

where $U$ and $V$ are †orthogonal matrices of dimensions $m$ and $n$, respectively, and $D$ is an $m \times n$ diagonal matrix whose diagonal elements $d_{ii}$ are the singular values $\sigma_i$ of $A$ [13]. If singular values smaller than $\sigma_p$ can be ignored, the approximate least squares solution is given by $\tilde x_p = V D_p^{+} U'b$, where the $i$th element $d_{ii}^{+}$ of the diagonal matrix $D_p^{+}$ is $\sigma_i^{-1}$ if $i \le p$ and 0 otherwise.

These methods certainly solve the system (1) but generally require more computational work than those based on the triangular factorization. The least squares solution is computed also by applying the CG method (6) to the normal equation (8), in which $A'A$ is not to be formed explicitly [9]. Iterative methods for solving linear systems with singular and rectangular coefficient matrices are characterized in terms of the †range of $RA$ and the †null space of $AR$ [15].

Besides the methods discussed here, numerous other methods have been proposed for solving the system (1) [16-18]. Recently, various sparse techniques have been developed for direct methods to control the growth of nonzero entries (fill-in) in the process of matrix factorization [19]. In general, direct methods have an advantage over iterative methods; however, their relative computational efficiency varies according to the scale, sparsity, and type of the coefficient matrix and to the available computational devices.

References

[1] J. H. Wilkinson, Error analysis of direct methods of matrix inversion, J. ACM, 8 (1961), 281-330.
[2] J. H. Wilkinson, Rounding errors in algebraic processes, Prentice-Hall, 1963.
[3] J. H. Wilkinson, The algebraic eigenvalue problem, Oxford Univ. Press, 1965.
[4] G. E. Forsythe and C. B. Moler, Computer solution of linear algebraic systems, Prentice-Hall, 1967.
[5] R. S. Varga, Matrix iterative analysis, Prentice-Hall, 1962.
[6] D. M. Young, Iterative solution of large linear systems, Academic Press, 1971.
[7] K. Tanabe, An adaptive acceleration of general linear iterative process for solving a system of linear equations, Proc. Fifth Hawaii Int. Conf. System Sci., (1972), 116-118.
[8] C. Lanczos, An iteration method for the solution of the eigenvalue problem of linear differential and integral operators, J. Res. Nat. Bur. Standards, sec. B, 45 (1950), 255-282.
[9] M. R. Hestenes and E. Stiefel, Methods of conjugate gradients for solving linear systems, J. Res. Nat. Bur. Standards, sec. B, 49 (1952), 409-436.
[10] F. L. Bauer, Elimination with weighted row combinations for solving linear equations and least squares problems, Numer. Math., 7 (1965), 338-352.
[11] A. Björck, Solving linear least squares problems by Gram-Schmidt orthogonalization, BIT, 7 (1967), 1-21.
[12] G. H. Golub, Numerical methods for solving linear least squares problems, Numer. Math., 7 (1965), 206-216.
[13] G. Golub and W. Kahan, Calculating the singular values and pseudoinverse of a matrix, SIAM J. Numer. Anal., 2 (1965), 205-224.
[14] C. L. Lawson and R. J. Hanson, Solving least squares problems, Prentice-Hall, 1974.
[15] K. Tanabe, Characterization of linear stationary iterative processes for solving a singular system of linear equations, Numer. Math., 22 (1974), 349-359.
[16] D. K. Faddeev and V. N. Faddeeva, Computational methods of linear algebra, Freeman, 1963. (Original in Russian, 1963.)
[17] G. E. Forsythe, Solving linear algebraic equations can be interesting, Bull. Amer. Math. Soc., 59 (1953), 299-329.
[18] J. R. Westlake, A handbook of numerical matrix inversion and solution of linear equations, Wiley, 1968.
[19] J. K. Reid (ed.), Large sparse sets of linear equations, Academic Press, 1971.

303 (XV.8)
Numerical Solution of Ordinary Differential Equations

A. General Remarks

Applications of the numerical solution of †ordinary differential equations include †initial value problems, †boundary value problems, and †eigenvalue problems. Although solution by an †analog computer is among the numerical methods in the wide sense, we usually mean by "numerical solution" a solution obtained by approximating a problem of infinite degrees of freedom and a continuous variable by one of finite degrees of freedom. The approximation of an arbitrary function by a linear combination of a given finite system of functions is an example (→ Section I). Successive substitution and power series expansion are also used but have no particular advantages over other methods. Among various methods of approximation, the so-called difference methods (or discrete variable methods) are the most flexible and have the largest field of application. A difference method reduces a given problem to an approximate problem in which we deal only with a systematically chosen set of discrete values of the variable and with values of the unknown functions at the chosen points. A difference method is usually carried out by a simple iterative computation, so that it is suitable for †digital computers. We confine ourselves almost exclusively to difference methods [1-7].

B. Initial Value Problems

To solve numerically general initial value problems, it suffices to consider the problem of determining numerically the values on an interval $[a, b]$ of $m$ functions $y^i(x)$ ($i = 1, \ldots, m$) that satisfy a system of ordinary differential equations of the form $(y^i)'(x) = f^i(x, y^1(x), \ldots, y^m(x))$ and the initial conditions $y^i(a) = \eta^i$ ($i = 1, \ldots, m$), where the $f^i$ are given functions with appropriate smoothness, and $a$, $b$, and $\eta^i$ are given constants. We define the mesh points $x_n = a + nh$ ($n = 0, 1, \ldots$), call $h$ the step size, and determine numerically the values $y_n^i$ approximating $y^i(x_n)$.

Assuming that the $y_n^i$ were computed with infinite accuracy, i.e., without †roundoff error, we call $e_n^i = y_n^i - y^i(x_n)$ the global truncation error (or global discretization error), while we call $r_n^i = \tilde y_n^i - y_n^i$ the global roundoff error, where the $\tilde y_n^i$ are the values we actually have by means of finite-accuracy computation. On the other hand, if the assumption holds locally that the solutions at the previous steps are exact, we call the $e_n^i$ and $r_n^i$ the local truncation error (or local discretization error) and the local roundoff error, respectively. To avoid confusion, we denote the local truncation error by $t_n^i$. The numerical solution must have the property that, for every $x \in [a, b]$, $\lim_{h \to 0,\, x_n = x} e_n^i = 0$, which is called the condition of convergence. In particular, if $t_n^i = O(h^{p+1})$ ($x_n = x$, $h \to 0$, where $O(h^{p+1})$ is Landau's symbol), then $p$ is called the order of the solution process.
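The notion of order can be checked numerically. The sketch below applies Euler's method, for which $p = 1$, to the test problem $y' = y$, $y(0) = 1$ on $[0, 1]$, and estimates $p$ from the global errors at step sizes $h$ and $h/2$. The test problem and the helper names `euler` and `observed_order` are choices made for this illustration only.

```python
import math


def euler(f, a, b, y0, n):
    """Approximate y(b) for y' = f(x, y), y(a) = y0 using n Euler steps."""
    h = (b - a) / n
    y = y0
    for i in range(n):
        x = a + i * h           # mesh point x_i = a + i h
        y = y + h * f(x, y)     # y_{i+1} = y_i + h f(x_i, y_i)
    return y


def observed_order(f, exact, a, b, y0, n):
    """Estimate the order p from the errors with n and 2n steps.

    If the global error behaves like C h^p, halving h divides it by 2^p,
    so p is approximately log2(e_h / e_{h/2}).
    """
    e_coarse = abs(euler(f, a, b, y0, n) - exact)
    e_fine = abs(euler(f, a, b, y0, 2 * n) - exact)
    return math.log2(e_coarse / e_fine)


if __name__ == "__main__":
    p = observed_order(lambda x, y: y, math.e, 0.0, 1.0, 1.0, 100)
    print("observed order:", p)
```

For a process of order $p$ the local truncation error is $O(h^{p+1})$ while the global error over a fixed interval is typically $O(h^p)$, which is why the estimate is taken from global errors at a fixed point $x$.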
We shall show the typical methods of numerical solution, using the abbreviation $f_n^i$ for $f^i(x_n, y_n^j)$.

C. Overall Approximation

If we rewrite the differential equations on the interval $[x_n, x_{n+P}]$ as $y^i(x_{n+q}) - y^i(x_n) = \int_{x_n}^{x_{n+q}} f^i(x, y^j(x))\,dx$, $q = 1, \ldots, P$, and substitute suitable †numerical integration formulas for the integrals on the right-hand side, then we have the overall approximation formulas:

$$y_{n+q}^i - y_n^i = h\sum_{r=0}^{P} c_{qr} f_{n+r}^i, \quad q = 1, \ldots, P,$$

from which we can obtain $y_{n+1}^i, \ldots, y_{n+P}^i$ when the $y_n^i$ are given. Since $f_{n+r}^i$ contains $y_{n+r}^j$, these equations are nonlinear in $y_{n+r}^i$, so that we have to use, for example, the iterative procedure of starting from suitable 0th approximations and adopting as the $l$th approximations of the $y_{n+q}^i$ values computed from the overall approximation formulas by substituting the $(l-1)$th approximations for the $y_{n+r}^j$ in the $f_{n+r}^i$ on the right-hand sides. These formulas are often used to determine a set of starting values for a multistep method. The truncation error of the formulas depends on that of the numerical integration formula substituted for the integral. A few examples: $P = 1$, $(c_{10}, c_{11}) = (1/2, 1/2)$ (error $h^3 y^{i(3)}(x_n)/12 + O(h^4)$); $P = 2$, $(c_{10}, c_{11}, c_{12}) = (5/12, 8/12, -1/12)$ (error $-h^4 y^{i(4)}(x_n)/24 + O(h^5)$), $(c_{20}, c_{21}, c_{22}) = (1/3, 4/3, 1/3)$ (error $h^5 y^{i(5)}(x_n)/90 + O(h^6)$); $P = 3$, $(c_{10}, c_{11}, c_{12}, c_{13}) = (9/24, 19/24, -5/24, 1/24)$, $(c_{20}, c_{21}, c_{22}, c_{23}) = (1/3, 4/3, 1/3, 0)$, $(c_{30}, c_{31}, c_{32}, c_{33}) = (3/8, 9/8, 9/8, 3/8)$.

D. Runge-Kutta Methods

By a general Runge-Kutta method we mean a method of determining $y_0^i = \eta^i, y_1^i, y_2^i, \ldots$ successively by means of a formula $y_{n+1}^i - y_n^i = h\Phi^i(x_n, y_n^j; h)$, where the functions $\Phi^i$ are defined in terms of parameters $\alpha_r$ and $\beta_{rs}$ as

$$\Phi^i(x, y^j; h) = \sum_{r=0}^{P} \alpha_r k_r^i, \quad k_0^i = f^i(x, y^j),$$
$$k_r^i = f^i\Bigl(x + \beta_{r0}h,\; y^j + h\Bigl(\beta_{r0} - \sum_{s=1}^{P}\beta_{rs}\Bigr)k_0^j + h\sum_{s=1}^{P}\beta_{rs}k_s^j\Bigr) \quad (r = 1, \ldots, P).$$

A Runge-Kutta method is called explicit if $\beta_{rs} = 0$ for $r \le s$, and implicit otherwise. In the latter case, if $\beta_{rs} = 0$ for $r < s$, the method is called semi-explicit (or semi-implicit). Unless otherwise stated, the Runge-Kutta methods considered below are assumed to be explicit. In order to proceed one step from $y_n^i$ to $y_{n+1}^i$ we have to compute each function $f^i$ $(P + 1)$ times. Hence this is called the $(P + 1)$-stage method. If we denote by $z^i(t)$ the solutions of $(z^i)'(t) = f^i(t, z^j(t))$ with the initial conditions $z^i(x) = y^i$ and put $h\Delta^i(x, y^j; h) = z^i(x + h) - z^i(x)$, then we have $\Phi^i(x, y^j; h) - \Delta^i(x, y^j; h) = h^p\varphi^i(x, y^j) + O(h^{p+1})$, where the $\varphi^i(x, y^j)$ are expressible in terms of the $f^i(x, y^j)$ and their (partial) derivatives. Various Runge-Kutta methods have been devised by searching for values of the $\alpha_r$ and $\beta_{rs}$ that make $p$ as large as possible for a given $P$ ($p$ is called the order of the method). When searching for these values of $\alpha_r$ and $\beta_{rs}$, we usually impose at least one of the following conditions: (1) $\alpha_r$ and $\beta_{rs}$ are simple; (2) the truncation error is small; (3) the region of absolute stability (→ Section G) is large; (4) $\alpha_r$ and $\beta_{rs}$ give smaller roundoff errors; (5) only a small computer memory is required. The ordered pairs $(P, p)$, where $p$ is the highest order that can be attained by the $(P + 1)$-stage method, are $(P \le 3, P + 1)$, $(4, 4)$, $(5, 5)$, $(6, 6)$, $(7, 6)$, $(8, 7)$, $(9\text{-}10, 8)$, $(p + 2 \le (P + 1) \le (1/2)(p^2 - 2p + 4),\ p \ge 9)$. The following formulas are frequently used since they are accurate and have simple parameters: for $P = 1$ and $p = 2$, the formulas with $\alpha_0 = 0$, $\alpha_1 = 1$, $\beta_{10} = 1/2$ (modified Euler method), with $\alpha_0 = \alpha_1 = 1/2$, $\beta_{10} = 1$ (improved Euler method) and with $\alpha_0 = 1/4$, $\alpha_1 = 3/4$, $\beta_{10} = 2/3$ (Ralston's second-order method); for $P = 2$ and $p = 3$, the formulas with $\alpha_0 = 2/9$, $\alpha_1 = 1/3$, $\alpha_2 = 4/9$, $\beta_{10} = 1/2$, $\beta_{20} = 3/4$, $\beta_{21} = 3/4$ (Ralston's third-order method), and with $\alpha_0 = 1/4$, $\alpha_1 = 0$, $\alpha_2 = 3/4$, $\beta_{10} = 1/3$, $\beta_{20} = 2/3$, $\beta_{21} = 2/3$ (Heun's third-order method). For $P = 3$, $p = 4$, the formula with $\alpha_0 = \alpha_3 = 1/6$, $\alpha_1 = \alpha_2 = 1/3$, $\beta_{10} = \beta_{20} = \beta_{21} = 1/2$, $\beta_{30} = \beta_{32} = 1$, $\beta_{31} = 0$ is well known and is frequently referred to as "the fourth-order Runge-Kutta method" or "the Runge-Kutta method." It has various desirable features [8]. Gill's modification of the classical Runge-Kutta method (sometimes called the Runge-Kutta-Gill method) has some advantage in regard to roundoff errors and to the necessary computer memory size, while Ralston's second-, third-, and fourth-order methods have minimum error bounds in Lotkin's sense for the local truncation error [7]. There exist "substantially fifth-order" methods for $P = 4$ [9]. Today we frequently prefer to use higher-order methods, especially fifth-order methods, instead of lower-order methods, because the former require less computation time for solutions of given accuracy [10].

For the estimation of the local truncation error, the following two practical techniques are well known: (i) one-step-two-half-steps error estimate. If we let $y_{n+1}^{*i}$ and $y_{n+1}^i$ denote the approximations that are computed with an $m$th-order method by taking two half-steps and one full step ($= h_n$), respectively, then an estimate of the local truncation error per half-step associated with $y_{n+1}^{*i}$ is given by the
formula

[displayed formula not recoverable from the source]

where we assume that $y_n^i$ does not vary much over the interval $[x_n, x_{n+1}]$. (ii) The formula of embedding form. The following formula of embedding form is more effective than the traditional method above. By use of the $p$th-order method $y_{n+1}^i = y_n^i + h\sum_{r=0}^{P}\alpha_r k_r^i$ and the $(p+1)$th-order method $y_{n+1}^{*i} = y_n^i + h\sum_{r=0}^{P^*}\alpha_r^* k_r^i$, we obtain an estimate of the local truncation error in $y_{n+1}^i$ as $t_{n+1}^i \approx y_{n+1}^i - y_{n+1}^{*i}$ [11-13]. At present, formulas of this type are known for $p = 2$-$8$, where the method for $p = 5$ is said to be the most efficient. Recently, a similar method for the estimation of the global truncation error has been investigated [14]. In addition, multistep generalizations of explicit Runge-Kutta methods, which are known as pseudo-Runge-Kutta methods, have also been investigated [15].

E. Multistep Methods

A linear multistep method approximates given differential equations by difference equations $\rho(E)y_n^i = h\sigma(E)f_n^i$, where $E$ is the operator of increasing $n$ by 1, $\rho(\zeta) = \alpha_k\zeta^k + \alpha_{k-1}\zeta^{k-1} + \cdots + \alpha_1\zeta + \alpha_0$, $\sigma(\zeta) = \beta_k\zeta^k + \beta_{k-1}\zeta^{k-1} + \cdots + \beta_1\zeta + \beta_0$, $\alpha_k \ne 0$, and $|\alpha_0| + |\beta_0| \ne 0$. (If $\rho$ and $\sigma$ are of degree $k$ in $\zeta$, we speak of a linear $k$-step method.) By means of these difference equations we can determine $y_{n+k}^i$ from $y_{n+k-1}^i, y_{n+k-2}^i, \ldots, y_n^i$. Since the $y_{n+k}^i$ are determined explicitly if $\beta_k = 0$ and implicitly if $\beta_k \ne 0$, the difference equations are called explicit and implicit, respectively. In the implicit case, if $|h| < |\alpha_k/\beta_k|/L$, where $L$ is a †Lipschitz constant for $f^i(x, y^j)$, then the $y_{n+k}^i$ can be calculated by successive substitutions. In order to obtain the $y_n^i$ successively by means of a $k$-step method with $k > 1$, it is necessary, to begin with, to give the $k - 1$ sets of $m$ values $y_1^i, \ldots, y_{k-1}^i$ (called the starting values) in addition to the initial values $y_0^i = \eta^i$. To determine the starting values, Runge-Kutta methods, overall approximation formulas, etc., are ordinarily used. The number of times we have to compute the $f^i$ to proceed one step from $y_n^i$ to $y_{n+1}^i$ is only 1 for an explicit case, while for an implicit case it is equal to the number of iterations required to secure the convergence of the successive substitutions (ordinarily, the step size $h$ as well as the 0th approximation for $y_{n+k}^i$ are chosen so that the convergence is attained after a few iterations). We have $\lim_{h \to 0,\, x_n = x} e_n^i = 0$ ($x \in [a, b]$) for any $f^i$ and $\eta^i$ and any starting values such that $\lim_{h \to 0} y_\mu^i = \eta^i$ ($\mu = 0, 1, \ldots, k - 1$), if and only if the polynomials $\rho$ and $\sigma$ satisfy the following two conditions: (i) consistency: $\rho(1) = 0$, $\rho'(1) = \sigma(1)$; (ii) zero stability: Every root of $\rho(\zeta) = 0$ lies on or inside the unit circle, and every root lying on the unit circle is simple. We call $t_n^i = (\rho(E)y^i(x_n) - h\sigma(E)(y^i)'(x_n))/\sigma(1)$ the local truncation error. For a consistent and stable method determined by $(\rho, \sigma)$ there exist a real constant $C_{p+1}$ ($\ne 0$) and an integer $p$ ($\ge 1$) such that $t_n^i = h^{p+1}C_{p+1}y^{i(p+1)}(x_n)/\sigma(1) + O(h^{p+2})$; $p$ is called the order of the method and $C = C_{p+1}/\sigma(1)$ the error constant. For a given $\rho$ satisfying (ii), $\sigma$ is chosen to make the order as large as possible and then to make the error constant as small as possible. However, $p$ cannot exceed $k + 1$ if $k$ is odd or $k + 2$ if $k$ is even; moreover, $p$ can be equal to $k + 2$ only if all the roots of $\rho(\zeta) = 0$ lie on the unit circle.

The following are examples of multistep methods.

(1) Explicit Methods.
(a) Adams-Bashforth methods.
$k = 1$, $\rho(\zeta) = \zeta - 1$, $\sigma(\zeta) = 1$, $p = 1$, $C = 1/2$ (Euler method);
$k = 2$, $\rho(\zeta) = \zeta(\zeta - 1)$, $\sigma(\zeta) = (3\zeta - 1)/2$, $p = 2$, $C = 5/12$;
$k = 3$, $\rho(\zeta) = \zeta^2(\zeta - 1)$, $\sigma(\zeta) = (23\zeta^2 - 16\zeta + 5)/12$, $p = 3$, $C = 3/8$;
etc.
(b) Midpoint rule.
$k = 2$, $\rho(\zeta) = \zeta^2 - 1$, $\sigma(\zeta) = 2\zeta$, $p = 2$, $C = 1/6$.
(c) Milne's predictor.
$k = 4$, $\rho(\zeta) = \zeta^4 - 1$, $\sigma(\zeta) = (8\zeta^3 - 4\zeta^2 + 8\zeta)/3$, $p = 4$, $C = 7/90$.

(2) Implicit Methods.
(a) Adams-Moulton methods.
$k = 1$, $\rho(\zeta) = \zeta - 1$, $\sigma(\zeta) = (\zeta + 1)/2$, $p = 2$, $C = -1/12$ (trapezoidal rule);
$k = 2$, $\rho(\zeta) = \zeta(\zeta - 1)$, $\sigma(\zeta) = (5\zeta^2 + 8\zeta - 1)/12$, $p = 3$, $C = -1/24$;
etc.
(b) Milne's corrector (or the Milne-Simpson formula).
$k = 2$, $\rho(\zeta) = \zeta^2 - 1$, $\sigma(\zeta) = (\zeta^2 + 4\zeta + 1)/3$, $p = 4$, $C = -1/180$.
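As a concrete instance of the explicit case, the sketch below implements the two-step Adams-Bashforth method ($\rho(\zeta) = \zeta(\zeta - 1)$, $\sigma(\zeta) = (3\zeta - 1)/2$), i.e. $y_{n+2} = y_{n+1} + h(3f_{n+1} - f_n)/2$, for a scalar equation. The single starting value $y_1$ is produced by one step of the classical fourth-order Runge-Kutta method of Section D. The function names and the scalar test problem $y' = -y$ are assumptions of this example.

```python
import math


def rk4_step(f, x, y, h):
    """One step of the classical fourth-order Runge-Kutta method."""
    k0 = f(x, y)
    k1 = f(x + h / 2, y + h * k0 / 2)
    k2 = f(x + h / 2, y + h * k1 / 2)
    k3 = f(x + h, y + h * k2)
    return y + h * (k0 + 2 * k1 + 2 * k2 + k3) / 6


def adams_bashforth2(f, a, b, y0, n):
    """Return [y_0, ..., y_n] for y' = f(x, y), y(a) = y0, with h = (b - a)/n."""
    h = (b - a) / n
    ys = [y0, rk4_step(f, a, y0, h)]   # starting value y_1 from RK4
    f_prev = f(a, ys[0])               # f_0
    for i in range(1, n):
        f_curr = f(a + i * h, ys[i])   # the only new evaluation of f per step
        ys.append(ys[i] + h * (3 * f_curr - f_prev) / 2)
        f_prev = f_curr
    return ys


if __name__ == "__main__":
    f = lambda x, y: -y
    exact = math.exp(-1.0)
    e1 = abs(adams_bashforth2(f, 0.0, 1.0, 1.0, 100)[-1] - exact)
    e2 = abs(adams_bashforth2(f, 0.0, 1.0, 1.0, 200)[-1] - exact)
    print("errors:", e1, e2, "observed order:", math.log2(e1 / e2))
```

Saving $f_n$ from the previous step keeps the cost at one evaluation of $f$ per step, in line with the count given in the text for explicit formulas.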
When an implicit formula $(\rho, \sigma)$ is used to obtain $y_{n+k}^i$, the 0th approximation $y_{n+k}^{*i}$ for $y_{n+k}^i$ is usually determined by an explicit formula $(\rho^*, \sigma^*)$ of the same order as $(\rho, \sigma)$ (where $(\rho^*, \sigma^*)$ itself may not necessarily be stable). This kind of combination of implicit and explicit methods is called a predictor-corrector method (or PC method), where the formula $(\rho^*, \sigma^*)$ is called the predictor and $(\rho, \sigma)$ the corrector. Typical combinations are the midpoint rule and the trapezoidal rule, Milne's predictor and Milne's corrector (this combination is called Milne's method), and an Adams-Bashforth method and an Adams-Moulton method of the same order. A predictor-corrector method has the advantage that the local truncation error (of the corrector) can be estimated in the course of calculation without extra computation. In fact, if the orders of the predictor $(\rho^*, \sigma^*)$ and the corrector $(\rho, \sigma)$ are equal to $p$ and their error constants to $C^*$ and $C$, respectively, then the local truncation error $t_n^i$ can be estimated by $t_n^i = KD_n^i + O(h^{p+2})$ ($n = 0, 1, \ldots$), where $D_n^i = y_{n+k}^{*i} - y_{n+k}^i$ is the difference between the value $y_{n+k}^{*i}$ at $x_{n+k}$ obtained by the predictor and the $y_{n+k}^i$ obtained by the corrector, and $K = \sigma(1)C/[(C - C^*)\sigma^*(1)]$ [16]. The term multistep methods originates from the fact that these use the values of the dependent variables at more than two different mesh points in order to proceed one step. These are also called multivalue methods since they use more than one value of the dependent variable. The multivalue method is, however, a more general concept than the multistep method.

Linear multistep methods are not only examples of the PC method; they are also examples of the variable-step variable-order algorithms (VSVO algorithms), where the order of the formula as well as the step size are automatically chosen according to the behavior of the solution. In practical VSVO algorithms, the Adams-Bashforth-Moulton family of PC pairs of order 1 to 13 are usually used, and for solving stiff systems (→ Section G) those correctors with large regions of absolute stability are used. In these algorithms we use the multivalue method, which saves the information at different steps in a form convenient for the change of order and of step size. The multivalue method also facilitates error estimation and stability, and results in an efficient use of memory and reduction of the computational cost. For example, in Gear's algorithm the information required for computing $y_{n+1}$ is saved in the following form: $\mathbf{y}_n = (y_n, hy_n', \ldots, (h^k/k!)y_n^{(k)})$, where $y'(x)$ is the polynomial $P_{k,n}(x)$ interpolating $f_n, f_{n-1}, \ldots, f_{n-k+1}$, and $y_n^{(s)} = d^{s-1}P_{k,n}(x)/dx^{s-1}|_{x = x_n}$ holds. All the local truncation error estimators are based essentially on Milne's device. In the VSVO algorithms heuristics play an important role [2, 17, 18].

F. Extrapolation Methods

Let us assume that the numerical solution with the step size $h$ at some fixed point $x$ has the form

$$y(x, h) = y(x) + \sum_{i=1}^{m} \tau_i h^{ri} + O(h^{r(m+1)}),$$

where $r$ is a positive integer. Then we can approximate $y(x, h)$ by a function $R_m(x, h)$ with $(m + 1)$ unknowns determined by the requirement that $R_m(x, h_j) = y(x, h_j)$, $j = 0, 1, 2, \ldots, m$, $h_i > h_j$ ($i < j$) in order to approximate $y(x)$ by $R_m(x, 0)$. If $R_m(x, h)$ is a polynomial of degree $mr$ in $h$, we call the foregoing method a polynomial extrapolation method, where $R_m(x, 0) = y(x) + O(h^{r(m+1)})$ as $h$ tends to 0. Defining $R_0^i = y(x, h_i)$ and $R_m(x, 0) = R_m^i$ for $j = i, i + 1, \ldots, i + m$, we can calculate the $R_m^i$ by the recursion relation $R_m^i = R_{m-1}^{i+1} + (R_{m-1}^{i+1} - R_{m-1}^i)/((h_i/h_{i+m})^r - 1)$, $m \ge 1$. It is well known that, for the following method (Gragg's method), there exists an asymptotic expansion with $r = 2$ [19]: $x_n = nh$, $\eta(0, h) = y(0)$, $\eta(x_1, h) = y_0 + hf(y_0, 0)$, $\eta(x_{n+1}, h) = \eta(x_{n-1}, h) + 2hf(\eta(x_n, h), x_n)$, $n = 1, 2, \ldots, N - 1$, where $x_N = x$, $y(x, h) = \tfrac{1}{2}[\eta(x_{N-1}, h) + \eta(x_N, h) + hf(\eta(x_N, h), x_N)]$. If $R_m(x, h)$ is a rational function

$$R_m(x, h) = \frac{p_0 + p_1 h^r + \cdots + p_\mu h^{r\mu}}{q_0 + q_1 h^r + \cdots + q_\nu h^{r\nu}},$$

where $\mu = [m/2]$, $\nu = m - \mu = [(m + 1)/2]$, $j = i, i + 1, \ldots, i + m$, we call this method a rational extrapolation method. For $r = 2$, the following formulas were derived by Bulirsch and Stoer [20]:

$$R_m^i = R_{m-1}^{i+1} + \frac{R_{m-1}^{i+1} - R_{m-1}^{i}}{(h_i/h_{i+m})^2\left[1 - \dfrac{R_{m-1}^{i+1} - R_{m-1}^{i}}{R_{m-1}^{i+1} - R_{m-2}^{i+1}}\right] - 1},$$

$$m \ge 1, \quad R_{-1}^i = 0, \quad R_0^i = y(x, h_i).$$

G. Stability

Consider the application of the general $k$-step method at the $n$th point (which is consistent and 0-stable)

$$\sum_{j=0}^{k}\alpha_j y_{n+j} = h\sum_{j=0}^{k}\beta_j f_{n+j}$$

to $y' = f(x, y)$, $y(x_0) = y_0$. Let $\tilde y_n$ be the numerical solution at $x = x_n$. Let $\tilde e_n = y(x_n) - \tilde y_n$ be the global error, and let $\varphi_n$ be the total error at the $n$th application. Then we find
$$\sum_{j=0}^{k}\Bigl[\alpha_j - h\beta_j\frac{\partial f}{\partial y}(x_{n+j}, \xi_{n+j})\Bigr]\tilde e_{n+j} = \varphi_n,$$

where $\xi_{n+j}$ lies in the open interval whose endpoints are $\tilde y_{n+j}$ and $y(x_{n+j})$. If we make two assumptions, $\partial f/\partial y = \lambda$ (const), $\varphi_n = \varphi$ (const), the above equation reduces to $\sum_{j=0}^{k}(\alpha_j - h\lambda\beta_j)\tilde e_{n+j} = \varphi$, whose general solution is given by

$$\tilde e_n = \sum_{s=1}^{k} d_s r_s^n - \frac{\varphi}{h\lambda\sum_{j=0}^{k}\beta_j},$$

where the $d_s$ are arbitrary constants and the $r_s$ are the roots, assumed distinct, of the polynomial equation

$$\sum_{j=0}^{k}(\alpha_j - h\lambda\beta_j)r^j = 0.$$

We call the linear $k$-step method absolutely stable for a given $h\lambda$ if $|r_s| < 1$, $s = 1, 2, \ldots, k$. On the other hand, we call the linear $k$-step method relatively stable for a given $h\lambda$ if $|r_s| < |r_1|$ (or $|r_s| < e^{\lambda h}$), $s = 2, 3, \ldots, k$, where $r_1$ is the root corresponding to the theoretical solution. We call the region $S_A = \{h\lambda \mid |r_s| < 1,\ s = 1, 2, \ldots, k\}$ (or $S_R = \{h\lambda \mid |r_s| < e^{\lambda h}\ (\text{or } |r_1|),\ s = 2, 3, \ldots, k\}$) in the complex plane the region of absolute stability (or the region of relative stability) of the linear $k$-step method. Also we call $S_A \cap \mathbf{R}$ (or $S_R \cap \mathbf{R}$) the interval of absolute stability (or the interval of relative stability) of the linear multistep method, where $\mathbf{R}$ is the real line. The explicit methods and the PC methods have a finite interval of absolute stability. The implicit methods usually have larger intervals of absolute stability than the corresponding explicit methods. The higher-order PC methods have smaller intervals, while Runge-Kutta methods do not. The PECE methods usually have larger intervals of stability than the corresponding PEC or P(EC)$^2$ methods, where P indicates an application of predictor, C an application of corrector, and E an evaluation of $f$ [21, 22].

Let us consider the linear system $y' = Ay + \varphi(x)$, where $A$ is an $m \times m$ constant matrix and $y(x), y'(x), \varphi(x) \in \mathbf{R}^m$. If $A$ possesses $m$ distinct eigenvalues $\lambda_t = \mu_t + i\nu_t$, $t = 1, 2, \ldots, m$, the theoretical solution of this system is given by

$$y(x) = \sum_{t=1}^{m}\kappa_t \exp((\mu_t + i\nu_t)x)\,c_t + \psi(x),$$

where $\kappa_t$ and $c_t$, $t = 1, 2, \ldots, m$, are, respectively, arbitrary constants and the eigenvectors corresponding to $\lambda_t$, and $\psi(x)$ is a particular solution. We call the linear system stiff if (i) $\mu_t < 0$, $t = 1, 2, \ldots, m$, (ii) $s \gg 1$, where $s = \max_{1 \le t \le m}|\mu_t| / \min_{1 \le t \le m}|\mu_t|$. We call the nonlinear system $y' = f(x, y)$, $y(x), y'(x), f(x, y) \in \mathbf{R}^m$ stiff in an interval $I$ of $x$ if, for every $x \in I$, the eigenvalues $\lambda_t(x)$ of the †Jacobian matrix of $f$ satisfy (i) and (ii). The ratio $s$ is called the stiffness ratio. To solve stiff systems effectively a variety of methods with infinite stability regions have been proposed. Some of them are:
(1) A-stability (Dahlquist): $S_A \supseteq \{h\lambda \mid \operatorname{Re}(h\lambda) < 0\}$
(2) Stiff-stability (Gear): $S_A \supseteq S_1 \cup S_2$, $S_1 = \{h\lambda \mid \operatorname{Re}(h\lambda) < -a < 0\}$, $S_2 = \{h\lambda \mid -a \le \operatorname{Re}(h\lambda) \le b,\ -c \le \operatorname{Im}(h\lambda) \le c,\ b > 0,\ c > 0\}$
(3) A($\alpha$)-stability (Widlund): $S_A \supseteq \{h\lambda \mid -\alpha < \pi - \arg(h\lambda) < \alpha,\ \alpha \in (0, \pi/2)\}$
(4) A$_0$-stability (Cryer): $S_A \supseteq \{h\lambda \mid \operatorname{Re}(h\lambda) < 0,\ \operatorname{Im}(h\lambda) = 0\}$.

The order $p$ and the step number $k$ of linear multistep methods are restricted by the following stability requirements:
(i) A-stability: implicit, $p \le 2$; trapezoidal rule is the most accurate method.
(ii) Stiff-stability: implicit, $p \le k$; backward differentiation formulas $\sum_{j=0}^{k}\alpha_j y_{n+j} = hf_{n+k}$ are stiffly stable for $p = k = 1, 2, \ldots, 6$, 0-unstable for $k > 6$.
(iii) A($\alpha$)-stability: implicit; there exist high-order A(0)-stable linear multistep methods. (A method is said to be A(0)-stable if it is A($\alpha$)-stable for some sufficiently small $\alpha \in (0, \pi/2)$.)

For the Runge-Kutta $(P, p)$ methods, we can write

$$\tilde e_{n+1} = \Bigl[1 + h\lambda + (h\lambda)^2/2! + \cdots + (h\lambda)^p/p! + \sum_{q=p+1}^{P+1}\gamma_q(h\lambda)^q\Bigr]\tilde e_n + \varphi_{n+1},$$

where $\partial f/\partial y = \lambda$ (const) and $\gamma_q$ are functions of the coefficients of the method in use. We call a region $R = \{h\lambda \mid |1 + h\lambda + (h\lambda)^2/2! + \cdots + (h\lambda)^p/p! + \sum_{q=p+1}^{P+1}\gamma_q(h\lambda)^q| < 1\}$ a region of absolute stability of the Runge-Kutta $(P, p)$ method. The implicit Runge-Kutta methods have a larger region of absolute stability than the corresponding explicit Runge-Kutta methods. Therefore the implicit Runge-Kutta methods are suitable for stiff equations, and the explicit Runge-Kutta methods are suitable for nonstiff or mildly stiff equations.

Recently, Yamaguti and others have pointed out that the instabilities occurring in the numerical solution of ordinary differential equations are closely connected to the phenomena of chaos studied by Li, Yorke, and others [23, 24].

H. Boundary Value Problems

A boundary value problem is generally formulated as the problem of obtaining functions $y^i(x)$ that satisfy the differential equations $(y^i)'(x) = f^i(x, y^1(x), \ldots, y^m(x))$ and boundary conditions $B_i(y^1(x_1), \ldots, y^m(x_1); y^1(x_2), \ldots, y^m(x_2); \ldots; y^1(x_k), \ldots, y^m(x_k)) = 0$ ($i = 1, \ldots, m$), where the $f^i$ and $B_i$ are given functions. If the $f^i$ and $B_i$ are linear in the $y^i$, we can
1139 303 Ref.
Numerical Solution of ODES

first calculate the solutions $y_0^i(x)$ of the differential equations under an appropriate initial condition, as well as a set of $m$ independent solutions $y_l^i(x)$ $(l=1,\dots,m)$ of the homogeneous differential equations (say, under the initial conditions $y_l^i(a)=\delta_l^i$ at some point $x=a$), and then substitute the expression for the desired solution $y^i(x)=y_0^i(x)+\sum_{l=1}^{m}\alpha_l y_l^i(x)$ in the boundary conditions to obtain simultaneous linear equations for the unknowns $\alpha_l$ (→ 302 Numerical Solution of Linear Equations). No universally powerful method is known for the case where the $f^i$ or the $B_k$ or both are nonlinear. Ordinarily, we resort to the trial-and-error method of solving the differential equations iteratively under different initial conditions until the solution fully satisfies the boundary conditions, or to approximating the differential operators by suitable difference operators to obtain a set of (generally nonlinear) simultaneous equations approximating at the same time the differential equations and the boundary conditions. For related topics → 298 Numerical Computation of Eigenvalues; 315 Ordinary Differential Equations (Boundary Value Problems); [25-27].

I. Methods Other than Difference Methods

Besides difference methods there are frequently used methods for finding in a given finite-dimensional function space a function that best satisfies the differential equations as well as the initial or boundary conditions. Denote the equations by $L[y(x)]=0$ and assume the conditions to be linear. We choose a function $y_0(x)$ that satisfies the conditions and that is considered to approximate the exact solution, and also functions $y_l(x)$ $(l=1,\dots,q)$ that are considered to represent the typical deviations of $y_0(x)$ from the exact solution and each of which satisfies the homogeneous conditions. We then determine the $\alpha_l$ in such a way that $y(x)=y_0(x)+\sum_{l=1}^{q}\alpha_l y_l(x)$ best satisfies the differential equations. Corresponding to different interpretations of the words "best satisfy" there are different methods. The collocation method determines the $\alpha_l$ so as to nullify the values of $L[y(x)]$ at some prescribed points $x_i$; the method of least squares minimizes the integral of $|L[y(x)]|^2$ over the considered region; the Galerkin method makes $L[y(x)]$ orthogonal to every $y_l(x)$ $(l=1,\dots,q)$; and the Ritz method considers the variational problem $\delta J[y(x)]=0$ whose Euler equation is $L[y(x)]=0$ (→ 46 Calculus of Variations F), if such a problem exists, and determines the $\alpha_l$ so as to have $J[y(x)]$ take an extremum value with $y(x)=y_0(x)+\sum_{l=1}^{q}\alpha_l y_l(x)$.

References

[1] J. D. Lambert, Computational methods in ordinary differential equations, Wiley, 1973.
[2] C. W. Gear, Numerical initial value problems in ordinary differential equations, Prentice-Hall, 1971.
[3] P. Henrici, Discrete variable methods in ordinary differential equations, Wiley, 1962.
[4] G. Hall and J. M. Watt (eds.), Modern numerical methods for ordinary differential equations, Clarendon Press, 1976.
[5] H. J. Stetter, Analysis of discretization methods for ordinary differential equations, Springer, 1970.
[6] L. Lapidus and J. H. Seinfeld, Numerical solution of ordinary differential equations, Academic Press, 1971.
[7] A. Ralston and P. Rabinowitz, A first course in numerical analysis, McGraw-Hill, second edition, 1978.
[8] M. Tanaka, T. Arakawa, and S. Yamashita, On the characteristics of Runge-Kutta methods, Information Processing in Japan, 17 (1977), 175-181.
[9] E. B. Shanks, Solutions of differential equations by evaluations of functions, Math. Comp., 20 (1966), 21-38.
[10] K. R. Jackson, W. H. Enright, and T. E. Hull, A theoretical criterion for comparing Runge-Kutta formulas, SIAM J. Numer. Anal., 15 (1978), 618-641.
[11] M. Tanaka, On Kutta-Merson process and its allied processes, Information Processing in Japan, 8 (1968), 44-52.
[12] J. H. Verner, Explicit Runge-Kutta methods with estimates of the local truncation error, SIAM J. Numer. Anal., 15 (1978), 772-790.
[13] H. Shintani, Two step processes by one step methods of order 3 and order 4, J. Sci. Hiroshima Univ., ser. A-I, 30 (1966), 183-195.
[14] P. Merluzzi and C. Brosilow, Runge-Kutta integration algorithms with built-in estimates of the accumulated truncation error, Computing, 20 (1978), 1-16.
[15] G. D. Byrne and R. J. Lambert, Pseudo-Runge-Kutta methods involving two points, J. Assoc. Comput. Mach., 13 (1966), 114-123.
[16] H. J. Stetter, Global error estimation in Adams PC codes, ACM Trans. Math. Software, 5 (1979), 415-430.
[17] L. F. Shampine and M. K. Gordon, Computer solution of ordinary differential equations, the initial value problem, Freeman, 1975.
[18] F. T. Krogh, A variable-step variable-order multistep method for the numerical solution of ordinary differential equations, Proc. Information Processing 68, North-Holland, 1969, vol. 1.
304 A 1140
Numerical Solution of PDEs

[19] W. B. Gragg, On extrapolation algorithms for ordinary initial value problems, SIAM J. Numer. Anal., 2 (1965), 384-403.
[20] R. Bulirsch and J. Stoer, Numerical treatment of ordinary differential equations by extrapolation methods, Numer. Math., 8 (1966).
[21] G. Dahlquist, Stability questions for some numerical methods for ordinary differential equations, Amer. Math. Soc. Proc. Symp. in Applied Math. 15, 1963.
[22] M. Iri, A stabilizing device for unstable numerical solutions of ordinary differential equations: Design principle and applications of a "filter," Information Processing in Japan, 4 (1964), 65-73.
[23] T. Y. Li and J. A. Yorke, Period three implies chaos, Amer. Math. Monthly, 82 (1975), 985-992.
[24] M. Yamaguti and H. Matano, Euler's finite difference scheme and chaos, Proc. Japan Acad., ser. A, 55 (1979), 78-80.
[25] A. K. Aziz (ed.), Numerical solutions of boundary-value problem, Academic Press, 1975.
[26] H. B. Keller, Numerical methods for two-point boundary-value problems, Blaisdell, 1968.
[27] H. B. Keller, Numerical solutions of two-point boundary-value problems, SIAM Regional Conf. Ser. in Appl. Math., 1976.

304 (XV.9)
Numerical Solution of Partial Differential Equations

A. General Remarks

The numerical solution of partial differential equations became practical in the 1950s with the advent of automatic digital computers. Nowadays, by means of modern high-performance computers, the numerical solution of partial differential equations is carried out extensively and often on a very large scale for problems in physics, engineering, and other fields of applied analysis, in order to obtain approximate solutions of rigorous equations or to simulate real phenomena by means of numerical experiments. Various numerical-solution schemes have been proposed, applied, and studied, and programming techniques required to implement numerical solutions are being invented regularly; some are of a general nature, and others are devised for particular problems. The variational method, which includes the Ritz-Galerkin method and the finite element method, and the difference method deserve special mention here because of their generality and scope of application.

B. The Ritz-Galerkin Method

Having been applied mainly to boundary value problems for elliptic partial differential equations, the Ritz method is a classical method, which has been used since the old era of mechanical calculators, and is mathematically nothing other than the direct method in the calculus of variations (→ 46 Calculus of Variations F) applied to variational problems that are equivalent to the original boundary value problems. In order to exemplify the method, let $\Omega$ be a bounded domain in $\mathbf{R}^N$ with piecewise smooth boundary $S$, and consider the Dirichlet boundary value problem comprised of the Poisson equation

$$\Delta u=-f \tag{1}$$

and the boundary condition

$$u|_S=\beta. \tag{2}$$

Here $f$ and $\beta$ are given functions on $\Omega$ and $S$, respectively. (In the following, given functions are assumed to be sufficiently smooth.) The boundary value problem (1), (2) is equivalent to the variational problem of minimizing the functional

$$J[u]=\tfrac{1}{2}a(u,u)-(f,u) \tag{3}$$

within the set $D(J)$ of admissible functions $u$ subject to (2). In (3) the following notation is used:

$$a(u,v)=(\nabla u,\nabla v)_{L^2(\Omega)}=\int_\Omega \nabla u\cdot\nabla v\,dx$$

and

$$(u,v)=(u,v)_{L^2(\Omega)}=\int_\Omega uv\,dx,$$

while

$$D(J)=\{u\in L^2(\Omega)\mid \nabla u\in L^2(\Omega),\ u|_S=\beta\}.$$

This variational problem is further reduced to the condition:

$$a(u,\varphi)=(f,\varphi)\quad(\forall\varphi\in V), \tag{4}$$

where $V$ stands for the Sobolev space $H_0^1(\Omega)$ (→ 168 Function Spaces B), that is,

$$V=H_0^1(\Omega)=\{u\in L^2(\Omega)\mid \nabla u\in L^2(\Omega),\ u|_S=0\}.$$

Note that $V$ is equal to $D(J)$ with $\beta\equiv 0$. The condition (4) is often called a weak form of the boundary value problem. In applying the Ritz
method to this problem, we introduce unknown parameters $\alpha_1,\alpha_2,\dots,\alpha_n$ and set the approximate solution $u_n$ in the form

$$u_n=F(x;\alpha_1,\alpha_2,\dots,\alpha_n)\in D(J)$$

and minimize $J[u_n]$ as a function of the $n$ variables $\alpha_1,\alpha_2,\dots,\alpha_n$. A standard way to form $u_n$ is to choose first a $b\in D(J)$ subject to the inhomogeneous boundary condition and $\varphi_1,\varphi_2,\dots,\varphi_n\in V$ subject to the homogeneous boundary condition, and then to set

$$u_n=b+\alpha_1\varphi_1+\alpha_2\varphi_2+\dots+\alpha_n\varphi_n. \tag{5}$$

The functions $\varphi_j$ $(j=1,2,\dots,n)$ are called basis functions or coordinate functions in the Ritz method. Denoting by $D_n$ the set of all possible functions appearing on the right-hand side of (5) and by $V_n$ the linear space generated by $\varphi_1,\varphi_2,\dots,\varphi_n$, we can state the condition to determine the approximate solution $u_n$ as:

$$a(u_n,\varphi)=(f,\varphi)\quad(\forall\varphi\in V_n). \tag{6}$$

Thus the Ritz method is a kind of projective approximation method which approximates the weak equation (4) by its projection on a finite-dimensional space $V_n$. The condition (6) is equivalent to the equations

$$a(u_n,\varphi_j)=(f,\varphi_j)\quad(j=1,2,\dots,n). \tag{7}$$

Particularly when $\beta\equiv 0$ in (2), $D_n$ coincides with $V_n$, and the equations (7) that determine the coefficients $\alpha_j$ are reduced to a linear equation

$$K\alpha=\gamma \tag{8}$$

for $\alpha={}^t(\alpha_1,\alpha_2,\dots,\alpha_n)$, where the matrix $K=(K_{ij})$ and the vector $\gamma={}^t(\gamma_1,\gamma_2,\dots,\gamma_n)$ are given respectively by $K_{ij}=a(\varphi_i,\varphi_j)$ and $\gamma_j=(f,\varphi_j)$. Solving (8) numerically by the Gauss elimination method or by iteration, one eventually obtains the approximate solution $u_n$ by the Ritz method. The Ritz method is also applicable, for instance, to the eigenvalue problem $-\Delta u=\lambda u$, $u|_S=0$, since this eigenvalue problem is equivalent to the variational problem of finding a stationary Rayleigh's quotient $R[u]=a(u,u)/(u,u)$ with $V$ as the set of admissible functions. In the approximation to restrict the admissible functions to $V_n$, the approximate eigenvalue $\lambda^{(n)}$ is determined through the matrix eigenvalue problem

$$K\alpha=\lambda^{(n)}M\alpha, \tag{9}$$

where the matrix $K$ is as above and the matrix $M=(M_{ij})$ is given by $M_{ij}=(\varphi_i,\varphi_j)$. Sometimes, optimization of $J[u_n]$ or $R[u_n]$ is carried out more directly, e.g., by the gradient method, particularly when $u_n$ contains free parameters in some nonlinear way.

Many studies have been made of convergence and error estimation for the Ritz method in general (→ e.g., [1-4]). For instance, it is known that, if $\beta=0$, $f\in L^2(\Omega)$, and $\{\varphi_1,\varphi_2,\dots,\varphi_n,\dots\}$ spans a linear subspace dense in $V=H_0^1(\Omega)$, then the approximate solution $u_n$ obtained through (6) (or through (12) below) converges to the rigorous solution in the $H^1$-topology (and hence in the $L^2$-topology). Nevertheless, success in using the method in practical applications can be gained only by a clever choice of trial functions. A typical but systematic example of such a choice is made by the finite element method (→ Section C).

We proceed to the Galerkin method [1]. Suppose that we are to solve the equation

$$L[u]=\Delta u+(b\cdot\nabla)u=-f, \tag{10}$$

which is the equation in (1) perturbed by a lower-order term (the convection term). Here $b=(b_1(x),\dots,b_N(x))$ and $(b\cdot\nabla)u=\sum_j b_j(x)(\partial u/\partial x_j)$. For simplicity, the boundary condition (2) is assumed to be homogeneous (i.e., $\beta=0$). Although the boundary value problem (10), (2) cannot be reduced to a variational problem since (10) is not symmetric, it is equivalent to its weaker form

$$a(u,\varphi)-((b\cdot\nabla)u,\varphi)=(f,\varphi)\quad(\forall\varphi\in V), \tag{11}$$

provided that $u\in V$. Note that (11) is obtained from $(L[u]+f,\varphi)=0$ by transforming $(L[u],\varphi)$ through integration by parts. Actually, this boundary value problem has a unique solution $u$ for any $f\in L^2(\Omega)$ if $|\operatorname{div} b|$ is sufficiently small. According to the Galerkin method as applied to the present problem, one replaces $V$ by $V_n$ in (11) and determines the approximate solution $u_n=\alpha_1\varphi_1+\dots+\alpha_n\varphi_n\in V_n$ by the condition

$$a(u_n,\varphi)-((b\cdot\nabla)u_n,\varphi)=(f,\varphi)\quad(\forall\varphi\in V_n), \tag{12}$$

which is equivalent to the equations

$$a(u_n,\varphi_j)-((b\cdot\nabla)u_n,\varphi_j)=(f,\varphi_j)\quad(j=1,2,\dots,n). \tag{13}$$

Sometimes it is convenient to project $(L[u]+f,\varphi)=0$ onto the finite-dimensional space and to deal directly with $(L[u_n]+f,\varphi_j)=0$ $(j=1,2,\dots,n)$. When the coefficients of the equation and the given function $f$ are periodic in space variables and when sine or cosine functions are chosen as basis functions, the Galerkin method turns out to be the same as the so-called Fourier approximation method and can be efficiently implemented with the aid of a fast Fourier transform (FFT) to yield the required Fourier coefficients numerically [5].

The Galerkin method can be applied to the numerical solution of evolution equations, namely, in order to approximate time-
dependent solutions. This is exemplified through the following initial-boundary value problem for the diffusion equation (heat equation):

$$\frac{\partial u}{\partial t}=\Delta u \tag{14}$$

with the homogeneous boundary condition

$$u|_S=0 \tag{15}$$

and the initial condition

$$u|_{t=0}=a(x)\quad(x\in\Omega). \tag{16}$$

We first set the required approximate solution $u_n=u_n(t,x)$ in terms of the basis functions $\varphi_1,\varphi_2,\dots,\varphi_n$ of $V_n$ as

$$u_n=\alpha_1(t)\varphi_1+\alpha_2(t)\varphi_2+\dots+\alpha_n(t)\varphi_n, \tag{17}$$

and then determine the coefficients $\alpha_1(t),\alpha_2(t),\dots,\alpha_n(t)$ from the conditions

$$\Bigl(\frac{d}{dt}u_n,\varphi\Bigr)=-a(u_n,\varphi)\quad(\forall\varphi\in V_n) \tag{18}$$

and

$$(u_n(0),\varphi)=(a,\varphi)\quad(\forall\varphi\in V_n). \tag{19}$$

Since (18) is equivalent to the equations

$$\frac{d}{dt}(u_n,\varphi_j)=-a(u_n,\varphi_j)\quad(j=1,2,\dots,n),$$

the vector function $\alpha(t)={}^t(\alpha_1(t),\alpha_2(t),\dots,\alpha_n(t))$ is obtained by solving the ordinary differential equation

$$M\frac{d}{dt}\alpha=-K\alpha, \tag{20}$$

where the matrices $M$ and $K$ are those in (8) and (9). The initial value $\alpha(0)$ of $\alpha$ should be given in accordance with (19). Sometimes the procedure above is called a semidiscrete approximation based on the Galerkin method. Obviously, this method can be applied to more general evolution equations, for instance, to the diffusion-convection equation

$$\frac{\partial u}{\partial t}=L[u]=\Delta u+(b\cdot\nabla)u, \tag{21}$$

which governs time-dependent states corresponding to stationary states subject to (10). As a concrete example, we can refer to the finite element approximation for evolution equations described in Section D. If the geometry of $\Omega$ is simple enough, e.g., for 1-dimensional $\Omega$, one can adopt eigenfunctions of $L[\ ]$ as the basis functions; the resulting version of the semidiscrete Galerkin method is called a spectral method.

In order to obtain $u_n(t)$ one has to carry out numerical integration for (20), discretizing the time variable. The Galerkin approximation with discretization of the space and time variables is called a full discrete approximation.

Finally, an advantage of the Ritz-Galerkin method is that if the boundary condition imposed on rigorous solutions is a natural boundary condition, then the admissible functions and particularly the basis functions need not satisfy it [4].

C. Finite Element Methods for Boundary Value Problems

In recent years the method of numerical solution of partial differential equations most extensively employed in structural mechanics and many other fields of engineering has been the finite element method. In its standard form, the finite element method can be regarded, at least mathematically, as a type of Ritz-Galerkin method that adopts as its basis functions piecewise polynomials of low degree and with narrow supports. Although the idea leading to the finite element method can be traced back to a paper by R. Courant in 1943 (Bull. Amer. Math. Soc.), the method acquired its popularity in the late 1950s when it was rediscovered by engineers on the basis of mechanical considerations [6]. Here we apply the method to the boundary value problem (1), (2), assuming that $\beta=0$ and $\Omega$ a polygonal domain in $\mathbf{R}^2$ [7-9]. First, $\bar\Omega=\Omega\cup S$ is divided into small triangles of which the length of any side does not exceed $h>0$. Each triangle $T$ appearing in this decomposition of $\bar\Omega$ is called a triangular element, or simply an element (following the terminology in structural mechanics), and we denote the set of all triangular elements by $\mathcal{T}_h$, $h$ being equal to $\max_{T\in\mathcal{T}_h}\{\text{the longest side of } T\}$. The set of all nodal points, i.e., the vertices of the triangular elements, is denoted by $N$, and we put

$$N_i=\{P\in N\mid P\in\Omega\}=\{P_1,P_2,\dots,P_n\}. \tag{22}$$

Then a standard choice of admissible functions is to adopt as $V_n$ in (6) the following $V_h\subset V=H_0^1(\Omega)$:

$$V_h=\{u_h\in C(\bar\Omega)\mid u_h \text{ is linear on each element and } u_h|_S=0\}, \tag{23}$$

namely, $V_h$ is the set of all elementwise linear continuous functions satisfying the boundary condition. Now the approximate solution $u_h\in V_h$ is determined by the condition

$$a(u_h,\varphi_h)=(f,\varphi_h)\quad(\forall\varphi_h\in V_h). \tag{24}$$

A basis of $V_h$ is formed by pyramidal functions $\varphi_j$ $(j=1,2,\dots,n)\in V_h$ such that

$$\varphi_j(P_j)=1\quad\text{and}\quad\varphi_j(Q)=0\quad(Q\in N\setminus\{P_j\}). \tag{25}$$
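The construction (22)-(25) is easiest to see in one space dimension. The following is a minimal sketch under stated assumptions, not the procedure of any particular package: it takes the model problem $-u''=f$ on $(0,1)$ with $u(0)=u(1)=0$ as a 1-dimensional stand-in for (1), (2) with $\beta=0$, uses a uniform mesh, and replaces the pyramidal functions by hat functions. The function names `assemble_1d` and `solve_poisson_1d` are hypothetical, and the load integrals $(f,\varphi_j)$ are approximated by $h\,f(P_j)$, which is an additional simplification.

```python
import numpy as np


def assemble_1d(n, f):
    """Assemble K and gamma of (8) for -u'' = f on (0, 1), u(0) = u(1) = 0.

    n interior nodes P_1..P_n on a uniform mesh; the basis functions are
    piecewise-linear hat functions with phi_j(P_j) = 1 and phi_j = 0 at
    every other node (the 1-dimensional analogue of (25)).
    """
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1.0 - h, n)  # interior nodes
    # K_ij = a(phi_i, phi_j) = integral of phi_i' * phi_j':
    # 2/h on the diagonal, -1/h for neighbouring nodes, 0 otherwise.
    K = (2.0 * np.eye(n)
         - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / h
    # gamma_j = (f, phi_j), approximated here by h * f(P_j).
    gamma = h * f(x)
    return x, K, gamma


def solve_poisson_1d(n, f):
    """Return the nodes and the nodal values alpha solving K alpha = gamma."""
    x, K, gamma = assemble_1d(n, f)
    alpha = np.linalg.solve(K, gamma)
    return x, alpha


if __name__ == "__main__":
    f = lambda s: np.pi ** 2 * np.sin(np.pi * s)  # exact solution sin(pi x)
    x, alpha = solve_poisson_1d(63, f)
    print("max nodal error:", np.max(np.abs(alpha - np.sin(np.pi * x))))
```

Because each hat function overlaps only its two neighbours, $K$ is tridiagonal here; a dense solver is used only to keep the sketch short, and a banded or sparse solver would be the natural choice for larger $n$.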
In order to obtain $u_h$, one has to solve (8) numerically for $\alpha$. Since the support of $\varphi_j$ is confined to elements adjacent to $P_j$, the matrix $K$ turns out to be a sparse matrix of multidiagonal type, and hence particular elimination procedures can be applied with high efficiency. In the finite element method, the matrix $K$ is often called the stiffness matrix. The convergence of $u_h$ to $u$ as $h\to 0$ is guaranteed if $\mathcal{T}_h$ $(0<h\le h_0)$ satisfies a certain regularity condition. For instance, if any angle of a triangle $T$ is not smaller than $\theta_0>0$, then $u_h$ converges in the $H^1(\Omega)$-topology. Moreover, when $\Omega$ is convex, the error estimate (26) holds [7,9]. Uniform convergence of $u_h$ to $u$ and $L_\infty$-estimation of the error have also been established (e.g., [7, chap. 3]). $u_h$ is subject to the maximum principle, provided that the triangulation $\mathcal{T}_h$ is of acute type. Here, $\mathcal{T}_h$ is said to be of acute type (or of strongly acute type) if any angle $\theta$ of $T\in\mathcal{T}_h$ is in $0<\theta\le\pi/2$ ($0<\theta\le\theta_1<\pi/2$). To acquire higher accuracy, more sophisticated admissible functions that are piecewise polynomials of higher degree are used. To take account of curved boundaries, several devices, including the isoparametric method, have been proposed.

In dealing with biharmonic equations or other equations of higher order, one can conveniently apply a modified version of the finite element method of mixed type or of nonconforming type; in the latter the approximate solution is sought within a class of functions which are of less regularity at the interface of elements and hence do not belong to the domain of the original variational problem, i.e., which are not admissible in the classical sense [10,11].

D. Finite Element Methods for Initial Value Problems

The most basic type of finite element method applicable to evolution problems is described here, and concerns the initial boundary value problem (14)-(16), assuming that $\Omega$ is the same as in Section B [8,9]. In the semidiscrete finite element approximation one uses the functions $\varphi_j$ of the preceding section as the basis for the approximate solution, now denoted by $u_h$, and then determines $u_h$ through (17)-(19) with $V_n$ replaced by $V_h$. Then $u_h$ converges to $u$ as $h\to 0$, provided that the triangulation $\mathcal{T}_h$ is regular. Furthermore, under the same condition that makes (26) hold, the error is given by the following estimate [12]:

$$\|u-u_h\|_{L^2}\le Ch^2\|a\|_{L^2}/t\quad(t>0), \tag{27}$$

which reflects the smoothing property of the diffusion equation. In the terminology of the finite element method, the matrix $M$ in (20) is called the mass matrix. In order to gain a fully discrete finite element approximation $u_{h,\tau}$ for $u$, one discretizes the time variable with the mesh length $\tau$ and solves the difference analog of (20), i.e., either

$$M(\alpha(t+\tau)-\alpha(t))/\tau=-K\alpha(t) \tag{28}$$

or

$$M(\alpha(t+\tau)-\alpha(t))/\tau=-K\alpha(t+\tau), \tag{29}$$

where $t$ is restricted to $t=k\tau$ $(k=0,1,2,\dots)$. The difference schemes (28) and (29) are of forward type and of backward type, respectively. From (29) follows $\|\alpha(t)\|\le\|\alpha(0)\|$, and consequently $\|u_{h,\tau}(t)\|_{L^2}\le C\|a\|_{L^2}$. In the generally accepted terminology, the approximation is stable if for each $T>0$ there exists a positive constant $C_T$ depending on $T$ such that

$$\|u_{h,\tau}(t)\|\le C_T\|a\|\quad(0\le t=k\tau\le T). \tag{30}$$

Thus the approximation through (29) is unconditionally stable, while the one through (28) is stable only under certain conditions on the triangulation $\mathcal{T}_h$ and on the time mesh $\tau$. For instance, the forward scheme (28) is stable if $\mathcal{T}_h$ is regular and satisfies the inverse assumption

$$\sup_{0<h\le h_0,\ T\in\mathcal{T}_h}\frac{h}{(\text{longest side of } T)}<+\infty,$$

and if the ratio $\tau/h^2$ is sufficiently small [13,14].

Sometimes, the mass matrix $M$ in (28) and (29) is modified by the following procedure, called mass lumping: For each $P_j$, one joins alternately the center of gravity of the elements with $P_j$ as one of their vertices and the middle point of the sides with $P_j$ as one of their endpoints, thus forming a closed broken line $\Gamma_j$ surrounding $P_j$. We denote by $\bar\varphi_j$ the characteristic function of the polygonal domain $B_j$ bounded by $\Gamma_j$. When mass lumping is applied, the matrix $M$ is replaced by the matrix $\bar M=((\bar\varphi_i,\bar\varphi_j))$. Generally, mass lumping relaxes the conditions for the stability and the maximum principle to hold, although it may lower the accuracy to some extent (H. Fujii, Proc. US-Japan Seminar, Univ. Tokyo Press, 1973). Usually, schemes without mass lumping are called consistent mass schemes.

In principle, there is no difficulty in applying the finite element method to the diffusion-convection equation (21). However, when $\|b\|$ is large, the stability condition becomes strin-
gent. To meet these difficulties, M. Tabata, T. Ikeda [15], and others have devised some schemes that enjoy better stability, the maximum principle, and even the law of conservation of mass by improving mass lumping and introducing the idea of an upstream approximation or an artificial viscosity for discretizing the convection term. Moreover, finite element methods are currently being applied to wave equations, the Navier-Stokes equations, and various other evolution equations.

E. Difference Methods for Boundary Value Problems

In difference methods for solving partial differential equations we reduce the original equation to its difference analog, replacing differential quotients of the unknown function by the corresponding difference quotients. The difference analog, usually a system of algebraic equations, is then solved numerically, yielding the desired approximate solution. Since the last part of the method, i.e., the numerical solution of the difference analog, requires a large amount of computation, difference methods were not feasible until the development of automatic computers.

In terms of the variable $x$, difference approximations for the derivative $df/dx$ are the forward difference

$$(D_k^{+}f)(x)=(f(x+k)-f(x))/k,$$

the backward difference

$$(D_k^{-}f)(x)=(f(x)-f(x-k))/k,$$

and the central difference (symmetric difference)

$$(D_k^{0}f)(x)=(f(x+k)-f(x-k))/2k.$$

In order to exemplify the difference method applied to boundary value problems, we return to equations (1) and (2) supposing that $\Omega$ is a 2-dimensional square $\Omega=\{(x,y)\mid 0<x<1,\ 0<y<1\}$ [1,16-18]. Then we cover $\Omega\cup S$ by a square net (lattice, grid) with mesh length $h=1/N$, for $N$ a large integer. The set of net points $P_{m,n}=(mh,nh)$ lying in $\Omega$ is denoted by $\Omega_h$, and $S_h$ is the set of net points on $S$. We want to have a net function $u_h$ on $\bar\Omega_h=\Omega_h\cup S_h$ that approximates the solution $u$ of (1) and (2), where $u_h$ is determined by the difference equation

$$\Delta_h u_h(x,y)=-f(x,y)\quad((x,y)\in\Omega_h) \tag{31}$$

and the boundary condition

$$u_h(x,y)=\beta(x,y)\quad((x,y)\in S_h). \tag{32}$$

Here $\Delta_h$ is a difference operator, the 5-point difference analog of $\Delta$, which is defined by

$$\Delta_h\varphi=D_x^{+}D_x^{-}\varphi+D_y^{+}D_y^{-}\varphi=\frac{\varphi(x+h,y)+\varphi(x-h,y)+\varphi(x,y+h)+\varphi(x,y-h)-4\varphi(x,y)}{h^2},$$

where $D_x^{\pm}$ and $D_y^{\pm}$ denote the forward and backward differences with mesh length $h$ in $x$ and $y$, respectively.

In view of the maximum principle for net functions on $\bar\Omega_h$, it is seen that (31) and (32) admit only $u_h=0$ as a solution if $f=\beta=0$, which implies that the solution $u_h$ of (31) and (32) exists uniquely for all $f$ and $\beta$. When we are concerned with the convergence of $u_h$ to $u$ as $h\to 0$, we have to consider (31) and (32) for all small $h>0$. Such a family of difference equations is called a difference scheme. Actually, $u_h$ obtained through (31) and (32) converges to $u$ as $h\to 0$, and we can say that the difference scheme (31) and (32) is convergent. In general, the order of the error $u-u_h$ depends on the smoothness of $u$. For instance, if $u\in C^4(\bar\Omega)$, then $|u-u_h|\le Ch^2$ holds uniformly on $\Omega_h$. When the asymptotic behavior of the error is known to satisfy $u-u_h=wh^2+o(h^2)$ for some $w=w(x,y)$, it is possible to attain a higher accuracy by eliminating the $h^2$-order term if we compute $u_h$ for two different values of $h$ and form an appropriate linear combination of the $u_h$ thus obtained (Richardson's extrapolation; → e.g., [17]). The convergence of difference schemes applied to boundary value problems is not affected if rectangular nets are used instead of square nets. Many techniques have been devised to treat curved boundaries. There have been attempts recently to map $\Omega$ into a domain $\tilde\Omega$ of simpler geometry, say, a rectangular one, and then discretize the arising partial differential equations with variable coefficients in $\tilde\Omega$ (e.g., J. F. Thompson et al., J. Comput. Phys., 1982). Moreover, when information regarding the location and nature of singularities of solutions is available, one can refine the net near these points.

The difference analog of elliptic equations described above is a system of linear equations with the values of the $u_h$ as its unknowns. Since the coefficient matrix of this system is of multidiagonal type, direct elimination methods can be efficiently applied to solve it. Particularly, some elimination algorithms employing vector computations have been invented [19]. Iteration methods of the Gauss-Seidel type can also be applied, where acceleration of convergence by SOR is effective (→ 302 Numerical Solution of Linear Equations C).

F. Difference Methods for Initial Value Problems

To demonstrate characteristics of the difference method applied to evolution equa-
tions [17,20,21], we first consider the initial boundary value problem (14)-(16) for the diffusion equation supposing that $\Omega=(0,1)$, a 1-dimensional interval. Then we cover $\bar Q=[0,\infty)\times\bar\Omega$ by a rectangular net $t=n\tau$ $(n=0,1,2,\dots)$ and $x=jh$ $(j=0,1,\dots,N)$ with mesh lengths $\Delta t=\tau$ and $\Delta x=h$. The value of the approximate function $u_{h,\tau}$ at the net point $(n\tau,jh)$ is denoted by $U_j^n$. Then the simplest difference scheme of forward type for (14) is written as

$$\frac{U_j^{n+1}-U_j^n}{\tau}=\frac{U_{j+1}^n+U_{j-1}^n-2U_j^n}{h^2}. \tag{33}$$

Namely, we adopt the difference operator $L_{\tau,h}[\ ]=D_\tau^{+}-D_h^{+}D_h^{-}$ (the forward difference in $t$ minus the second-order difference in $x$) as the difference analog of $(\partial/\partial t)-(\partial^2/\partial x^2)$. Equation (33) is consistent in the sense that $L_{\tau,h}[u]=$ "the residual of $u$" $\to 0$ $(\tau,h\to 0)$ for any smooth solution $u$ of (14). This is the case also with the difference scheme of backward type

$$\frac{U_j^{n+1}-U_j^n}{\tau}=\frac{U_{j+1}^{n+1}+U_{j-1}^{n+1}-2U_j^{n+1}}{h^2}. \tag{34}$$

To compute the approximate solution $U^n=\{U_j^n\mid 0\le j\le N\}$, (33) or (34) must be combined with the approximate initial condition corresponding to (16), say $U_j^0=a(jh)$, and with the boundary condition $U_0^n=U_N^n=0$. Then one can proceed from $U^0$ to $U^1$, from $U^1$ to $U^2$, and so on. Actually, (33) is rewritten in the form

$$U_j^{n+1}=\lambda U_{j+1}^n+\lambda U_{j-1}^n+(1-2\lambda)U_j^n, \tag{35}$$

with $\lambda=\tau/h^2$, which gives $U^{n+1}$ explicitly in terms of $U^n$. Hence (33) is an explicit scheme. On the other hand, in order to acquire $U^{n+1}$ from $U^n$ according to (34), it is necessary to solve a system of linear equations with $U^{n+1}$ as its unknown. Hence (34) is an implicit scheme. Since the coefficient matrix of the system of linear equations to determine $U^{n+1}$ according to (34) is tridiagonal, one can employ a particularly efficient elimination algorithm.

We introduce the maximum norm $\|\varphi_h\|_h=\max_j|\varphi_h(jh)|$ for net functions $\varphi_h$ on $\bar\Omega_h=\Omega_h\cup S_h=\{x=jh\mid 0\le j\le N\}$. Then, as is obvious from (35), the approximate solution $U^n=u_{h,\tau}(n\tau)$ obtained through (33) satisfies $\|U^n\|_h\le\|U^0\|_h$ $(n=0,1,\dots)$, provided that

$$0<\lambda\le\tfrac{1}{2}. \tag{36}$$

Generally, a difference scheme approximating an initial value problem is said to be stable if the approximate solution $u_{h,\tau}$ for the initial value $a_h=u_{h,\tau}(0)$ satisfies, for small $\tau$ and $h$,

$$\|u_{h,\tau}(t)\|_h\le M_T\|a_h\|_h\quad(0\le t=n\tau\le T) \tag{37}$$

for any $T>0$, $M_T$ being a constant depending on $T$. Lax's equivalence theorem asserts that if the original initial value problem is well-posed and if the approximating difference scheme is consistent, then the stability of the difference scheme is necessary and sufficient for the convergence of the approximate solution $u_{h,\tau}$ to the rigorous solution $u$ as $\tau,h\to 0$. For instance the explicit difference scheme (33) for the 1-dimensional diffusion equation is convergent under the mesh condition (36). On the other hand, the implicit scheme (34) is unconditionally stable and hence is convergent. Many difference schemes more sophisticated than (33) and (34), of higher accuracy or with other favorable properties, have been proposed and studied. All these difference schemes can be generalized to the case of many-dimensional $\Omega$. For instance, if $\Omega\subset\mathbf{R}^2$, then by means of the difference operator $\Delta_h$ in (31), the forward scheme and the backward scheme are given by $D_\tau^{+}U^n=\Delta_hU^n$ and $D_\tau^{+}U^n=\Delta_hU^{n+1}$, respectively. The former is stable if $0<\lambda=\tau/h^2\le 1/4$, while the latter is unconditionally stable. Moreover, in a method called the ADI method (alternating direction implicit method) one introduces $U^{n+1/2}$ on fractional steps $t=n\tau+\tau/2$ and discretizes (14) as

$$\frac{U^{n+1/2}-U^n}{\tau/2}=D_x^{+}D_x^{-}U^n+D_y^{+}D_y^{-}U^{n+1/2},\qquad \frac{U^{n+1}-U^{n+1/2}}{\tau/2}=D_x^{+}D_x^{-}U^{n+1}+D_y^{+}D_y^{-}U^{n+1/2}. \tag{38}$$

Then, starting from $U^0$, the computation goes as $U^0\to U^{1/2}\to U^1\to\dots\to U^n\to U^{n+1/2}\to\dots$ by solving at each step alternatively a difference equation implicit with respect to $x$ and $y$, for which a particular elimination method can be used with good efficiency. Furthermore, the ADI method is unconditionally stable. Various types of fractional-step method that generalize the ADI method have been proposed [12,17].

We now proceed to hyperbolic equations. First consider the Cauchy problem where the spatial domain is the whole $\mathbf{R}^1$. Noting that the wave equation $u_{tt}=u_{xx}$, for instance, can be rewritten as

$$\frac{\partial u}{\partial t}=\begin{pmatrix}0&1\\1&0\end{pmatrix}\frac{\partial u}{\partial x}$$

by putting $u={}^t(u_t,u_x)$, we here deal with a hyperbolic system of the form

$$\frac{\partial u}{\partial t}=A\frac{\partial u}{\partial x}, \tag{39}$$

where the unknown $u$ is an $m$-vector $u={}^t(u_1,u_2,\dots,u_m)$, and $A$ is a constant $m\times m$ matrix that is supposed to be real and symmetric. For hyperbolic equations, explicit schemes are generally preferred, and as a typ-
304 Ref. 1146
Numerical Solution of PDEs

ical one we mention the Friedrichs scheme. hyperbolic equations to be stable, and is
In the Friedrichs scheme applied to (39), the known as the CFL condition (Courant-
approximate solution U = ‘(U, , r/,, , &i,) is Friedrichs-Lewy condition) [22]. For a scheme
determined by more accurate than the Friedrichs scheme we
refer to the Lax-Wendroff scheme, which pro-
U(t+z,x)-#J(t,x+h)+U(t,x-h))
ceeds by way of the following amplification
operator S,:
=AQU(t,x). (40)
Here t is restricted to t = m (n = 0, 1, . . ), while
x is regarded as a variable ranging over R’.
Introducing translation operators Th and T-, =I+;A(T,+T-,)+;(Th-Z+T-,).
by
K:(P(x)+(P(x+~) and T-,,:(P(x)+(P(x-h), The Lax-Wendroff scheme is stable under (43).
There are many other difference schemes
we obtain from (40) to approximate (39) most of which are appli-
U(t+z;)=S*U(t;), (41) cable to higher-dimensional spaces. Some of
these schemes can be conveniently employed
where, with the mesh ratio 1= z/h, S,, is given to solve nonlinear hyperbolic equations, for
by instance, those’arising in the gas dynamics of
compressible fluids, for which the Friedrichs
scheme and the Lax-Wendroff scheme were
originally intended [23]. Concerning difference
Generally, if a difference scheme is reduced to schemes with variable coefficients which ap-
the form of (41), then the operator S,,, which proximate a tregularly hyperbolic system with
yields evolution of the solution of the dif- an x-dependent principal part, criteria for the
ference equation for one time step, is called stability of the scheme in terms of the symbol
the amplification operator of the scheme. The S,,(x, 0 of S, have been obtained, making use of
stability of the difference scheme is implied the theory of tpseudodifferential operators, by
by the uniform boundedness of the operator P. D. Lax and L. Nirenberg (Comm. Pure Appl.
norm jjS:ll of S” acting in the Hilbert space Math., 1966) M. Yamaguti and T. Nogi (P&l.
(L,(R’))” for 0~ n < T/z. Consequently, the Res. Inst. Math. Sci., 1967), R. Vaillancourt
scheme is stable if llShll < 1 + Ct, C being a (Math. Comput., 1970), Z. Koshiba (J. Math.
constant. The tsymbol s,,(t) of S, with respect Anal. Appl., 1981) and others.
to the Fourier transform is called the amplifi-
cation matrix of the scheme. The &-stability
of the scheme mentioned above is equivalent References
to the uniform boundedness of the matrix
[l] L. V. Kantorovich and V. I. Krylov, Ap-
norm ~~~~(~)ll forO<n~<Tand t~Rl.There-
proximate methods of higher analysis, Inter-
fore, denoting by r,,(t) the spectral radius of
science and Noordhoff, 1958. (Original in Rus-
$(<), one can give a necessary condition for
sian, 1950.)
the stability by lr,,(<)[ < 1 + Cz, which is called
[2] J. Necas, Les methodes directes en theorie
the von Neumann condition. Concerning gen-
des equations elliptiques, Masson, 1967.
eral critera for the uniform boundedness of
[3] R. Temam, Numerical analysis, Reidel,
powers of S,(l) when ,$,({) is not necessarily
1973.
normal, a fundamental theorem was given by
[4] R. Courant and D. Hilbert, Methods of
H. 0. Kreiss in 1962 [21].
mathematical physics, lnterscience, I, 1953; II,
If the largest-modulus eigenvalue of A is
1962.
denoted by pO, then the Friedrichs scheme is
[S] J. W. Cooley and J. W. Tukey, An algo-
stable and hence is convergent under the con-
rithm for the machine calculation of complex
dition on the mesh ratio /z= z/h given by
Fourier Series, Math. Comp., 19 (1964), 297-
I< md (43) 301.
[6] M. J. Turner, R. W. Clough, H. C. Martin,
In particular, (43) implies that for the stable
and L. J. Topp, Stiffness and deflection anal-
Friedrichs scheme
ysis of complex structures, J. Aero. Sci., 23
propagation speed of the difference scheme (1956), 8055823.
[7] P. G. Ciarlet, The finite element method
> propagation speed of the original equation.
for elliptic problems, North-Holland, 1978.
(9 [S] A. R. Mitchell and R. Wait, The finite
Condition (44) is a necessary condition for element method in partial differential equa-
general difference schemes approximating tions, Wiley, 1977.
[9] G. Strang and G. Fix, An analysis of the finite element method, Prentice-Hall, 1973.
[10] F. Kikuchi and Y. Ando, On the convergence of a mixed finite scheme for plate bending, Nuclear Eng. Design, 24 (1973), 357-373.
[11] T. Miyoshi, A mixed finite element method for the solution of the von Kármán equations, Numer. Math., 26 (1976), 255-269.
[12] H. Fujita and A. Mizutani, On the finite element method for parabolic equations I, Approximation of holomorphic semi-groups, J. Math. Soc. Japan, 28 (1976), 749-771.
[13] T. Kato, Perturbation theory for linear operators, Springer, second edition, 1976.
[14] T. Ushijima, Approximation theory for semi-groups of linear operators and its application to approximation of the wave equation, Japan. J. Math., 1 (1975), 185-224.
[15] T. Ikeda, Maximum principle in finite element models for convection-diffusion phenomena, Kinokuniya and North-Holland, 1983.
[16] G. E. Forsythe and W. R. Wasow, Finite-difference methods for partial differential equations, Wiley, 1960.
[17] G. I. Marchuk, Methods of numerical mathematics, Springer, second edition, 1982.
[18] G. D. Smith, Numerical solution of partial differential equations: Finite difference methods, Clarendon Press, second edition, 1978.
[19] D. L. Book (ed.), Finite-difference techniques for vectorized fluid dynamics calculations, Springer, 1981.
[20] F. John, Lectures on advanced numerical analysis, Gordon & Breach, 1967.
[21] R. D. Richtmyer and K. W. Morton, Difference methods for initial value problems, Interscience, 1957.
[22] R. Courant, K. Friedrichs, and H. Lewy, Über die partiellen Differenzengleichungen der mathematischen Physik, Math. Ann., 100 (1928), 32-74.
[23] R. Peyret and T. D. Taylor, Computational methods for fluid flow, Springer, 1983.
[24] B. Carnahan, H. A. Luther, and J. O. Wilkes, Applied numerical methods, Wiley, 1969.
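
As an illustration of the explicit scheme (35) and the bound (36), the following is a minimal sketch in Python. It assumes the underlying problem is the heat equation u_t = u_xx on an interval with zero boundary values; the function names, the hat-shaped initial value, and the grid sizes are hypothetical choices made only for this example. It records the maximum norm of U^n at every step, once with a mesh ratio satisfying (36) and once with a mesh ratio violating it.

```python
# Sketch of explicit scheme (35), assuming u_t = u_xx with zero boundary values.
# All names and parameter values are illustrative assumptions.

def explicit_step(u, lam):
    """One step of (35):
    U_j^{n+1} = lam*U_{j+1}^n + lam*U_{j-1}^n + (1 - 2*lam)*U_j^n."""
    new = [0.0] * len(u)  # boundary condition U_0 = U_N = 0 is kept
    for j in range(1, len(u) - 1):
        new[j] = lam * u[j + 1] + lam * u[j - 1] + (1.0 - 2.0 * lam) * u[j]
    return new


def max_norm(u):
    """Maximum norm of a net function given by its nodal values."""
    return max(abs(v) for v in u)


def run(lam, n_points=21, n_steps=200):
    """Advance a hat-shaped initial value n_steps times.

    Returns the list of maximum norms of U^0, U^1, ..., U^{n_steps}.
    """
    N = n_points - 1
    # Initial value a(jh): a hat function that vanishes at both end points.
    u = [min(j, N - j) / (N / 2) for j in range(n_points)]
    norms = [max_norm(u)]
    for _ in range(n_steps):
        u = explicit_step(u, lam)
        norms.append(max_norm(u))
    return norms


stable = run(lam=0.5)    # mesh ratio satisfying (36)
unstable = run(lam=0.6)  # mesh ratio violating (36)
```

With the mesh ratio 0.5 every coefficient in (35) is nonnegative and the coefficients sum to 1, so each new value is an average of old values and the maximum norm cannot grow. With the mesh ratio 0.6 the coefficient of U_j^n is negative and the highest-frequency components of the initial value are amplified at each step.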

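The Friedrichs scheme (40) and the mesh-ratio condition (43) can be sketched in the same spirit. The code below is an illustration under stated assumptions, not a definitive implementation: it replaces the whole line by a periodic grid so that the computation is finite, takes A to be the 2 x 2 matrix of the rewritten wave equation (eigenvalues of modulus 1, so (43) reads "mesh ratio at most 1"), and uses hypothetical function names and a single-point initial value.

```python
import math

# Sketch of the Friedrichs scheme (40) on a periodic grid (an assumption made
# here to keep the computation finite). All names are illustrative.

def friedrichs_step(U, A, lam):
    """One Friedrichs step.

    U   : list of m-vectors, one per grid point
    A   : m x m matrix as a list of rows
    lam : mesh ratio tau / h
    Computes (1/2)(U_{j+1} + U_{j-1}) + (lam/2) * A (U_{j+1} - U_{j-1}).
    """
    n, m = len(U), len(A)
    new = []
    for j in range(n):
        right, left = U[(j + 1) % n], U[j - 1]  # periodic neighbours
        avg = [0.5 * (right[i] + left[i]) for i in range(m)]
        diff = [right[i] - left[i] for i in range(m)]
        new.append([
            avg[i] + 0.5 * lam * sum(A[i][k] * diff[k] for k in range(m))
            for i in range(m)
        ])
    return new


def l2_norm(U, h):
    """Discrete L2 norm of a vector-valued net function."""
    return math.sqrt(h * sum(v * v for u in U for v in u))


def run_friedrichs(lam, n_points=40, n_steps=100):
    """Return the discrete L2 norms of U at every step for mesh ratio lam."""
    h = 1.0 / n_points
    A = [[0.0, 1.0], [1.0, 0.0]]           # matrix of the rewritten wave equation
    U = [[0.0, 0.0] for _ in range(n_points)]
    U[n_points // 2] = [1.0, 0.0]           # initial value concentrated at one node
    norms = [l2_norm(U, h)]
    for _ in range(n_steps):
        U = friedrichs_step(U, A, lam)
        norms.append(l2_norm(U, h))
    return norms


norms_ok = run_friedrichs(0.8)   # mesh ratio satisfying (43)
norms_bad = run_friedrichs(1.2)  # mesh ratio violating (43)
```

Since A is symmetric here, the amplification matrix is normal, and for the mesh ratio 0.8 the discrete L2 norm does not increase. For the mesh ratio 1.2 the Fourier modes near a quarter of the sampling frequency are multiplied by roughly 1.2 per step, which is the instability that the CFL condition rules out.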