Random Variables and Probability H Cramer (CUP 1962 125s)

RANDOM VARIABLES
AND
PROBABILITY DISTRIBUTIONS
BY
HARALD CRAMER
Chancellor of the Swedish Universities
and Professor in the University of Stockholm
CAMBRIDGE
AT THE UNIVERSITY PRESS
1962
PUBLISHED BY
THE SYNDICS OF THE CAMBRIDGE UNIVERSITY PRESS
Bentley House, 200 Euston Road, London, N.W.1
American Branch: 32 East 57th Street, New York 22, N.Y.
West African Office: P.O. Box 33, Ibadan, Nigeria
First printed 1937
Second edition 1962
First printed at the University Press, Cambridge
Reprinted by offset-lithography by Bradford & Dickens, London, W.C.1
CONTENTS
Preface to the First Edition
Preface to the Second Edition
Abbreviations
FIRST PART. PRINCIPLES
Chap. I. Introductory remarks
II. Axioms and preliminary theorems
SECOND PART. DISTRIBUTIONS IN $R_1$
III. General properties. Mean values
IV. Characteristic functions
V. Addition of independent variables. Convergence "in probability". Special distributions
VI. The normal distribution and the central limit theorem
VII. Liapounoff's theorem. Asymptotic expansions
VIII. A class of stochastic processes
THIRD PART. DISTRIBUTIONS IN $R_k$
IX. General properties. Characteristic functions
X. The normal distribution and the central limit theorem
Bibliography
Some Recent Works on Mathematical Probability
PREFACE
The Mathematical Theory of Probability has lately become of
growing importance owing to the great variety of its applications,
and also to its purely mathematical interest. The subject of this
tract is the development of the purely mathematical side of the
theory, without any reference to the applications. The axiomatic
foundations of the theory have been chosen in agreement with
the theory given by A. Kolmogoroff in his work Grundbegriffe der
Wahrscheinlichkeitsrechnung, to which I am greatly indebted. In
accordance with this theory, the subject has been treated as a
branch of the theory of completely additive set functions. The
method principally used has been that of characteristic functions
(or Fourier-Stieltjes transforms).
The limitation of space has made it necessary to restrict the
programme somewhat severely. Thus in the first place it has
proved necessary to consider exclusively probability distributions
in spaces of a finite number of dimensions. With respect to
the advanced part of the theory, I have found it convenient to
confine myself almost entirely to problems connected with the
so-called Central Limit Theorem for sums of independent variables,
and with some of its generalizations and modifications in
various directions. This limitation permits a certain uniformity
of method, but obviously a great number of important and
interesting problems will remain unmentioned.
My most sincere thanks are due to my friends W. Feller,
O. Lundberg and H. Wold for valuable help with the preparation
of this work. In particular the constant assistance and criticism
of Dr Feller has been very helpful to me.
H. C.
Department of Mathematical Statistics
University of Stockholm
December 1936
PREFACE TO THE SECOND EDITION
This Tract has now been out of print for a number of years. Since
there still seems to be some demand for it, the Syndics of the
Cambridge University Press have judged it desirable to publish
a new edition.
However, owing to the vigorous development of Mathematical
Probability Theory since 1937, any attempt to bring the book up
to date would have meant rewriting it completely, a task that
would have been utterly beyond my possibilities under present
conditions. Thus I have had to restrict myself in the main to a
number of minor corrections, otherwise leaving the work,
including the Bibliography, where it was in 1937.
Besides the minor corrections, most of which are concerned
with questions of terminology, there are, in fact, only two major
alterations. In the first place, a serious error in the statement
and proof of Theorem 11 has been put right. Further, the contents
of Chapter IV, 4, which are fundamental for the theory of
asymptotic expansions, etc., developed in Chapter VII, have been
revised and simplified. This permits a new formulation of the
important Lemma 4, on which the proofs of Theorems 24-26
are based. Finally a brief list of recent works on the subject in
the English language has been added.
H. C.
University Chancellor's Office
Stockholm
March 1960
ABBREVIATIONS AND NOTATIONS
Symbol                           Signification                                                   Page
d.f.                             Distribution function                                           11
pr.f.                            Probability function                                            11
s.d.                             Standard deviation                                              21
E(X)                             Mean value (or mathematical expectation) of X                   20
D(X)                             Standard deviation of X                                         21
c.f.                             Characteristic function                                         24
$F(x) = F_1(x) * F_2(x)$         $F(x) = \int_{-\infty}^{\infty} F_1(x-t)\,dF_2(t)$              37
convergence i.pr.                Convergence in probability                                      39
$(F(x))^{n*}$                    $F(x) * F(x) * \ldots * F(x)$ ($n$ times)                       53
The union or sum of any finite or enumerable sequence of sets
$S_1, S_2, \ldots$ is denoted by
$S = S_1 + S_2 + \ldots$.
The intersection or product of the sets $S_1, S_2, \ldots$ is denoted by
$S = S_1 S_2 \ldots$.
The inclusion sign $\subset$ is used in relations of the type $S_1 \subset S$
indicating that $S_1$ is a subset of $S$, and also in relations of the type
$x \subset S$ to express the fact that $x$ is an element of the set $S$.
FIRST PART
PRINCIPLES
CHAPTER I
INTRODUCTORY REMARKS
1. In the most varied fields of practical and scientific experience,
cases occur where certain observations or trials may be
repeated a large number of times under similar circumstances.
Our attention is then directed to a certain quantity, which may
assume different numerical values at successive observations.
In many cases each observation yields not only one, but a certain
number of quantities, say $k$, so that generally we may say that
the result of each observation is a definite point $X$ in a space of
$k$ dimensions ($k \ge 1$), while the result of the whole series of
observations is a sequence of points: $X_1, X_2, \ldots$.
Thus if we make a series of throws with a given number of
dice, we may observe the sum of the points obtained at each
throw. We are then concerned with a variable quantity, which
may assume every integral value between $m$ and $6m$ (both limits
inclusive), where $m$ is the number of dice. On the other hand, in a
series of measurements of the state of some physical system, or
of the size of certain organs in a number of individuals belonging
to the same biological species, each observation furnishes a
certain number of numerical values, i.e. a definite point in a space
of a fixed number of dimensions.
In certain cases, the observed characteristic is only indirectly
expressed as a number. Thus if, in a mortality investigation, we
observe during one year a large number of persons, we may at
each observation (i.e. for each person) note the number of deaths
which take place during the year, so that in this case the observed
quantity assumes the value 0 or 1 according as the corresponding
person is alive at the end of the year or not.
In a given class of observations, let $R$ denote the set of points
which are a priori possible positions of our variable point $X$, and
let $S$ be a sub-set of $R$. Further, let a series of $n$ observations be
made, and count the number $\nu$ of those observations, where the
following event takes place: the point $X$ determined by the
observation belongs to $S$. Then the ratio $\nu/n$ is called the frequency of that
event or, as we may shortly put it, the frequency of the relation
(or event) $X \subset S$. Obviously any such frequency always lies
between 0 and 1, both limits inclusive. If $S = S_1 + S_2$, where $S_1$
and $S_2$ have no common point, and if $\nu_1/n$ and $\nu_2/n$ are the
frequencies corresponding to $S_1$ and $S_2$, we obviously have
$\nu = \nu_1 + \nu_2$ and thus
(1)    $\nu/n = \nu_1/n + \nu_2/n$.
When we are dealing with such frequencies, a certain peculiar
kind of regularity very often presents itself. This regularity may
be roughly described by saying that, for any given sub-set $S$,
the frequency of the relation (or event) $X \subset S$ tends to become more
or less constant as $n$ increases. In certain cases, such as e.g. cases
of biological measurements, our observations may be regarded
as samples from a very large or even infinite population, so that
for indefinitely increasing $n$ the frequency would ultimately
reach an ideal value, characteristic of the total population.
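The regularity just described can be illustrated numerically. The following sketch is not part of the original text: it assumes throws of a single fair die simulated with Python's `random` module, and the helper name `frequencies` is purely illustrative. It prints the frequency of the event $X \subset S$ with $S = \{5, 6\}$ for increasing $n$, next to the sum of the frequencies for $S_1 = \{5\}$ and $S_2 = \{6\}$, which agree as in relation (1).

```python
import random

def frequencies(n, subsets, seed=0):
    """Throw one simulated die n times; return the frequency nu/n for each subset."""
    rng = random.Random(seed)          # fixed seed so that runs are repeatable
    counts = [0] * len(subsets)
    for _ in range(n):
        x = rng.randint(1, 6)          # one observation: a face 1..6
        for i, s in enumerate(subsets):
            if x in s:
                counts[i] += 1
    return [c / n for c in counts]

S1, S2 = {5}, {6}                      # two sub-sets with no common point
S = S1 | S2                            # S = S1 + S2

for n in (100, 10_000, 100_000):
    f, f1, f2 = frequencies(n, [S, S1, S2])
    # f and f1 + f2 agree (up to rounding), as in relation (1)
    print(n, f, f1 + f2)
```

Under the assumption of a fair die, the printed frequencies should settle near 1/3 as $n$ grows; the simulation only illustrates the empirical regularity and proves nothing about it.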
It is thus suggested that in cases where the above-mentioned
type of regularity appears, we should try to introduce a number
$P(S)$ to represent such an ideal value of the frequency $\nu/n$,
corresponding to the sub-set $S$. The number $P(S)$ is then called the
probability of the sub-set $S$, or of the event $X \subset S$. It follows from
(1) that we should obviously choose $P(S)$ such that
(2)    $P(S_1 + S_2) = P(S_1) + P(S_2)$
for any two sub-sets $S_1$ and $S_2$ of $R$ which have no common point.
Further, it is obvious that we should always have $P(S) \ge 0$ and
that for the particular set $S = R$ we should have $P(R) = 1$.
The investigation of set functions of the type $P(S)$ and their
mutual relations is the object of the Mathematical Theory of
Probability. This theory should be considered as a branch of
Pure Mathematics, founded on an axiomatic basis, in the same
sense as Geometry or Theoretical Mechanics.¹ Once the fundamental
conceptions have been introduced and the axioms have
been laid down (and in this procedure we are, of course, guided
by empirical considerations), the whole body of the theory should
be constructed by purely mathematical deductions from the
axioms. The practical value of the theory will then have to be
tested by experience, just in the same way as a theorem in
euclidean geometry, which is intrinsically a purely mathematical
proposition, obtains a practical value because experience shows
that euclidean geometry really conforms with sufficient accuracy
to a large group of empirical facts.
2. The axiomatic basis of a theory may, of course, always be
constructed in many different ways, and it is well known that,
with respect to the foundations of the Theory of Probability,
there has been a great diversity of opinions.
The type of statistical regularity indicated above was first
observed in connection with ordinary games of chance with
cards, dice, etc., and this gave occasion to the origin and early
development of the theory.² In every game of this character, all
the results that are a priori possible may be arranged in a finite
number of cases which are supposed to be perfectly symmetrical.
This led to the famous principle of equally possible cases which,
after having been more or less tacitly assumed by earlier writers,
was explicitly framed by Laplace [1], as the fundamental principle
of the whole theory. Throughout the whole century following
the publication of Laplace's classical treatise, a large amount of
work has been spent on the discussion of this principle.
¹ This view seems to have been first explicitly expressed by v. Mises [2].
² Cf. Todhunter [1].
During the course of this discussion, it has been maintained
by various authors that the validity of the principle of equally
possible cases is necessarily restricted to the field of games of
chance, so that it is wholly incapable of serving as the basic
principle of the theory. Attempts have been made¹ to establish
the theory on an essentially different basis, the probabilities
being directly defined as ideal values of statistical frequencies.
The most successful attempt on this line is due to v. Mises [2, 3],
who endeavours to reach in this way an axiomatic foundation of
the theory in the modern sense.
The fundamental conception of the v. Mises theory is that of a
"Kollektiv", by which is meant an unlimited sequence $K$ of
similar observations, each furnishing a definite point belonging
to an a priori given space $R$ of a finite number of dimensions.
The first axiom of v. Mises then postulates the existence of the
limit
(3)    $\lim_{n \to \infty} \nu/n = P(S)$
for every simple sub-set $S \subset R$, while the second axiom requires
that the analogous limit should still exist and have the same
value $P(S)$ for every sub-sequence $K'$ that can be formed from
$K$ according to a rule such that it can always be decided whether
the $n$th observation of $K$ should belong to $K'$ or not, without
knowing the result of this particular observation.² It does, however,
seem difficult to give a precise mathematical meaning to the
condition printed in italics, and the attempts to express the
second axiom in a more rigorous way do not, so far, seem to have
reached satisfactory and easily applicable results. Though fully
recognizing the value of a system of axioms based on the properties
of statistical frequencies, I think that these difficulties
must be considered sufficiently grave to justify, at least for the
time being, the choice of a fundamentally different system.
¹ For the history of these attempts, cf. Keynes [1], chaps. VII-VIII.
² The second axiom as given by v. Mises [3], p. 18, is somewhat more
complicated. It can, however, be shown that this is equivalent to the simpler statement
given above.
The underlying idea of the system that will be adopted here
may be roughly described in the following simple way: The
probability of an event is a definite number associated with that event;
and our axioms have to express the fundamental rules for operations
with such numbers.
Following Kolmogoroff we take as our starting-point the
observation made above (cf. (2)) that the probability $P(S)$ may
be regarded as an additive function of the set $S$. We shall, in fact,
content ourselves by postulating mainly the existence of a
function of this type, defined for a family of sets $S$ in the
$k$-dimensional space $R_k$ to which our point $X$ is restricted,
and such that $P(S)$ denotes the probability of the relation $X \subset S$.
Thus the question of the validity of the relation (3) will not
at all enter into the mathematical theory. For the empirical
verification of the theory it will, on the other hand, become a
matter of fundamental importance to know if, in a given case,
(3) is satisfied with a practically sufficient approximation.¹
Questions of verification and application fall, however, outside
the scope of the present work, which will be exclusively concerned
with the development of the purely mathematical part of the
subject.
3. Before giving the explicit statement of our axioms, it will
be convenient to discuss here a few preliminary questions related
to the theory of point sets and (generalized) Stieltjes integrals in
spaces of a finite number of dimensions.²
¹ Cf. Cantelli [2], Tornier [1]. The foundations of the theory laid down by
these authors present certain analogies with the principles here used.
² Reference may be made to treatises by Hobson [1], Lebesgue [1] and de la
Vallée Poussin [1].
In the first place, we must define the family $F$ of sets $S$, for
which we shall want our additive set function $P(S)$ to be given.
If $X = (\xi_1, \ldots, \xi_k)$ belongs to the $k$-dimensional euclidean space
$R_k$, the family $F$ should obviously contain every $k$-dimensional
interval $J$ defined by inequalities of the form
$a_i < \xi_i \le b_i$    ($i = 1, 2, \ldots, k$),
as we may always want to know the probability of the relation
$X \subset J$. It is also obvious that $F$ should contain every set $S$
constructed by performing on intervals $J$ a finite number of additions,
subtractions and multiplications. It is even natural to require
that it should be possible to perform these operations an infinite
number of times without ever arriving at a set $S$ such that the
value of $P(S)$ is not defined. Accordingly, we shall assume that
$P(S)$ is defined for all Borel sets¹ $S$ of $R_k$.
The family of Borel sets consists precisely of all sets that can
be constructed from intervals $J$ by applying a finite or infinite
number of times the three elementary operations. If $S_1, S_2, \ldots$
are Borel sets in $R_k$, this also holds true for the two sets
$\limsup S_n = \lim (S_n + S_{n+1} + \ldots)$,
$\liminf S_n = \lim (S_n S_{n+1} \ldots)$.
If $\limsup S_n$ and $\liminf S_n$ are identical, we put
$\lim S_n = \limsup S_n = \liminf S_n$,
and thus $\lim S_n$ is also a Borel set. In particular, the sum and
product of an infinite sequence of Borel sets are always Borel sets.
If no two of the sets $S_i$ have a common point, it follows from
the additive property (2) that
$P(S_1 + \ldots + S_n) = P(S_1) + \ldots + P(S_n)$
for every finite $n$. Since the limit $S_1 + S_2 + \ldots$ always exists and
is a Borel set, it is natural to require that this relation should
hold even as $n \to \infty$, so that we should have
$P(S_1 + S_2 + \ldots) = P(S_1) + P(S_2) + \ldots$.
A set function with this property will be called completely
additive, and it will be assumed that the function $P(S)$ is of
this type.
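A small numerical illustration of complete additivity may be helpful here. The sketch below is not from the original text: it assumes a distribution that places the mass $2^{-j}$ at the point $j = 1, 2, \ldots$, and the names `P_point` and `P_first` are illustrative. It computes $P(S_1 + \ldots + S_n)$ for the one-point sets $S_j = \{j\}$ and shows the partial sums approaching the total mass 1.

```python
from fractions import Fraction

def P_point(j):
    """Assumed mass 2**-j at the point j = 1, 2, ..."""
    return Fraction(1, 2 ** j)

def P_first(n):
    """P(S_1 + ... + S_n) for the disjoint one-point sets S_j = {j}."""
    return sum((P_point(j) for j in range(1, n + 1)), Fraction(0))

for n in (1, 5, 20):
    # The finite sums equal 1 - 2**-n and increase towards 1 as n grows.
    print(n, float(P_first(n)))
```

Exact rational arithmetic is used so that the finite sums can be compared with $1 - 2^{-n}$ without rounding error.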
Consider now a real-valued point function $g(X)$, defined for
all points $X = (\xi_1, \ldots, \xi_k)$ in $R_k$. $g(X)$ is said to be measurable B²
if, for all real $a$ and $b$, the set of points $X$ such that $a < g(X) \le b$
is a Borel set. Similarly, a vector function $Y = f(X)$, where
$Y = (\eta_1, \ldots, \eta_{\mathfrak{k}})$ belongs to a certain $\mathfrak{k}$-dimensional space
$\mathfrak{R}_{\mathfrak{k}}$, is measurable B if every component $\eta_i$, regarded as a function
of $X$, is measurable B. If $\mathfrak{S}$ denotes any Borel set in $\mathfrak{R}_{\mathfrak{k}}$, and if $S$
is the set of all points $X$ in $R_k$ such that $f(X) \subset \mathfrak{S}$, then $S$ is also
a Borel set. (If $f(X)$ never assumes a value belonging to $\mathfrak{S}$,
$S$ is of course the empty set.) If $f_1, f_2, \ldots$ are measurable B, so are
$f_1 + f_2$, $f_1 f_2$, $\limsup f_n$, $\liminf f_n$, and, in the case of
convergence, $\lim f_n$.
¹ Cf. Hobson [1], I, p. 179; Lebesgue [1], p. 117; de la Vallée Poussin [1], p. 33.
² Cf. Hobson [1], I, p. 563; de la Vallée Poussin [1], p. 34.
All sets of points with which we shall have to deal in the sequel are
Borel sets, while all point functions are measurable B. Generally
this will not be explicitly mentioned, and should then always be
tacitly understood.
A Lebesgue-Stieltjes integral with respect to the completely
additive set function $P(S)$ is, for every bounded and non-negative
$g(X)$ and for every set $S$, uniquely defined by the
postulates
(A)    $\int_{S_1 + S_2} g\,dP = \int_{S_1} g\,dP + \int_{S_2} g\,dP$,
$S_1$ and $S_2$ having no common point, and
(B)    $\int_S (g_1 + g_2)\,dP = \int_S g_1\,dP + \int_S g_2\,dP$,
(C)    $\int_S g\,dP \ge 0$,
(D)    $\int_S 1 \cdot dP = P(S)$.
If $g$ is not bounded, we put $g_M = \min(g, M)$ and define $\int_S g\,dP$
as the limit of $\int_S g_M\,dP$ as $M \to \infty$. If the limit is finite, $g$ is said
to be integrable over $S$ with respect to $P(S)$. The extension to
functions $g$ which are not of constant sign is performed by putting
$2 \int_S g\,dP = \int_S (|g| + g)\,dP - \int_S (|g| - g)\,dP$.
For any $g$ such that $|g| < G$ throughout the set $S$, we then have
the mean value theorem
$\left| \int_S g\,dP \right| \le G\,P(S)$.
Let $g_1, g_2, \ldots$ be a sequence of functions such that for all points
of $S$ we have $|g_n| < g$, where $g$ is integrable. Then if $\lim g_n$ exists
for every point of $S$, except possibly for a certain set of points
$S_1 \subset S$ such that $P(S_1) = 0$, we have
$\lim \int_S g_n\,dP = \int_S \lim g_n\,dP$.
It follows that the theorems on continuity, differentiation and
integration with respect to a parameter, etc., which are known
from elementary integration theory, extend themselves
immediately to integrals of the type $\int_S g(X, t)\,dP$, where $t$ is a
parameter.
The ordinary theorems on repeated integrals¹ are also easily
extended to integrals of the type here considered. In particular
we have the following result, which will be used in Chapter III.
Let $P(S)$ be defined in a space $R_2$ and such
that for every interval $J$ ($a_1 < \xi_1 \le b_1$, $a_2 < \xi_2 \le b_2$)
we have
$P(J) = P_1(J_1)\,P_2(J_2)$,
where $P_1(S)$ and $P_2(S)$ are completely additive set functions in
$R_1$, while $J_i$ denotes the one-dimensional interval $a_i < \xi_i \le b_i$.
Then if the function $g_1(\xi_1)\,g_2(\xi_2)$ is integrable over $R_2$ with
respect to $P(S)$, we have
$\int_{R_2} g_1(\xi_1)\,g_2(\xi_2)\,dP = \int_{R_1} g_1(\xi_1)\,dP_1 \int_{R_1} g_2(\xi_2)\,dP_2$.
¹ Cf. Hobson [1], I, p. 626; de la Vallée Poussin [1], p. 50.
CHAPTER II
AXIOMS AND PRELIMINARY THEOREMS
1. We now proceed to the explicit statement of our axioms.¹
In accordance with the preceding chapter, we denote by $R_k$
a $k$-dimensional euclidean space with the variable point
$X = (\xi_1, \ldots, \xi_k)$, and we consider the family of all Borel sets $S$
in $R_k$.
Axiom 1. To every $S$ corresponds a non-negative number
$P(S)$, which is called the probability of the relation (or event) $X \subset S$.
Axiom 2. We have $P(R_k) = 1$.
Axiom 3. $P(S)$ is a completely additive set function, i.e. we have
$P(S_1 + S_2 + \ldots) = P(S_1) + P(S_2) + \ldots$,
where $S_1, S_2, \ldots$ are Borel sets, no two of which have a common point.
The variable point $X$ is then called a random variable (or
random point, random vector). The set function $P(S)$ is called
the probability function of $X$, and is said to define the probability
distribution in $R_k$ which is attached to the variable $X$. It is often
convenient to use a concrete interpretation of a probability
distribution as a distribution of mass of the total amount 1 over $R_k$,
the quantity of mass allotted to any Borel set $S$ being equal
to $P(S)$.
It follows immediately from the axioms that we always have
$0 \le P(S) \le 1$,
and    $P(S) + P(S^*) = 1$,
where $S$ and $S^*$ are complementary sets. Further, if $S_1$ and $S_2$
are two sets such that $S_1 \supset S_2$, we have $S_1 = S_2 + (S_1 - S_2)$ and thus
(4)    $P(S_1) \ge P(S_2)$.
¹ The fact that we restrict ourselves here to Borel sets in $R_k$ permits some formal
simplification of the system of axioms given by Kolmogoroff [4], and of the
immediate conclusions drawn from the axioms.
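These consequences of the axioms can be checked on a finite example. The following sketch is not part of the original text: it assumes a distribution with mass 1/6 at each of the points 1, ..., 6, and the names `MASS` and `P` are illustrative only.

```python
from fractions import Fraction

# Assumed example: unit mass spread evenly over six points.
R = frozenset(range(1, 7))
MASS = {x: Fraction(1, 6) for x in R}

def P(S):
    """Probability function: total mass allotted to the set S."""
    return sum((MASS[x] for x in S), Fraction(0))

S1 = {1, 2, 3, 4}
S2 = {2, 3}              # S2 is a sub-set of S1
S_star = R - S1          # complementary set of S1

assert P(R) == 1                        # Axiom 2
assert P(S1) + P(S_star) == 1           # complementary sets
assert P(S1) == P(S2) + P(S1 - S2)      # S1 = S2 + (S1 - S2)
assert P(S1) >= P(S2)                   # relation (4)
```

All four assertions pass for this particular choice of masses; they are instances of the general statements above, not a proof of them.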
Theorem 1. For any sequence of Borel sets $S_1, S_2, \ldots$ in $R_k$,
we have
$P(\limsup S_n) \ge \limsup P(S_n)$,
$P(\liminf S_n) \le \liminf P(S_n)$.
Hence, if $\lim S_n$ exists, so does $\lim P(S_n)$, and we have
(5)    $P(\lim S_n) = \lim P(S_n)$.
In order to prove this theorem, we shall first show that (5)
holds for any monotone sequence $\{S_n\}$. If $\{S_n\}$ is an increasing
sequence, we may in fact write
$\lim S_n = S_1 + (S_2 - S_1) + (S_3 - S_2) + \ldots$,
and thus obtain from Axiom 3
$P(\lim S_n) = P(S_1) + P(S_2 - S_1) + P(S_3 - S_2) + \ldots$
$= P(S_1) + (P(S_2) - P(S_1)) + (P(S_3) - P(S_2)) + \ldots$
$= \lim P(S_n)$.
For a decreasing sequence $\{S_n\}$, the same thing is shown by
considering the increasing sequence formed by the complementary
sets $S_n^*$.
For any sequence $\{S_n\}$, whether monotone or not, we have
(cf. I, 3) $\limsup S_n = \lim (S_n + S_{n+1} + \ldots)$. Now, $S_n + S_{n+1} + \ldots$ is
obviously the general element of a decreasing sequence, so that
(6)    $P(\limsup S_n) = \lim P(S_n + S_{n+1} + \ldots)$.
For every $r = 0, 1, \ldots$, we have $S_n + S_{n+1} + \ldots \supset S_{n+r}$, and thus
by (4)
$P(S_n + S_{n+1} + \ldots) \ge P(S_{n+r})$,
$P(S_n + S_{n+1} + \ldots) \ge \limsup_{r \to \infty} P(S_{n+r})$.
We thus obtain from (6)
$P(\limsup S_n) \ge \limsup P(S_n)$.
Hence the inequality for $P(\liminf S_n)$ is obtained by considering
the sequence $\{S_n^*\}$ of complementary sets and using the identity
$\liminf S_n = (\limsup S_n^*)^*$. Thus Theorem 1 is proved.
In the particular case when every point $X$ of $R_k$ belongs at
most to a finite number of the sets $S_n$, $\lim S_n$ is the empty set,
and it follows that we have $\lim P(S_n) = 0$.
2. Consider now the particular set $S_{x_1, \ldots, x_k}$ defined by the
inequalities
(7)    $\xi_i \le x_i$    ($i = 1, 2, \ldots, k$).
For all real values of the $x_i$, we define a point function $F(x_1, \ldots, x_k)$
by putting
$F(x_1, \ldots, x_k) = P(S_{x_1, \ldots, x_k})$,
so that according to Axiom 1 $F(x_1, \ldots, x_k)$ represents the
probability of the joint existence of the relations (7). Then $F(x_1, \ldots, x_k)$
is called the distribution function¹ of the probability distribution
defined by $P(S)$.
In the sequel, the terms probability function and distribution
function will usually be abbreviated to pr.f. and d.f. respectively.
Let $J$ denote the half-open $k$-dimensional interval defined by
the inequalities $a_i < \xi_i \le b_i$ for $i = 1, 2, \ldots, k$. The corresponding
probability $P(J)$ is then easily seen to be given by the $k$-th order
difference of the d.f. $F(x_1, \ldots, x_k)$ associated with the interval $J$.
We thus have, writing only the first and last terms of the
expression for this difference,
$P(J) = \Delta_J^k F(x_1, \ldots, x_k)$
$= F(b_1, \ldots, b_k) - \ldots + (-1)^k F(a_1, \ldots, a_k)$.
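The terms omitted from the expression above run over all $2^k$ vertices of $J$, each taken with the sign $(-1)^r$, where $r$ is the number of lower end-points $a_i$ among its coordinates. A minimal sketch of this computation follows; it is not from the original text, and the helper names and the example d.f. (mass spread uniformly over the unit square) are assumptions chosen for illustration.

```python
from itertools import product

def interval_probability(F, a, b):
    """k-th order difference of F over the half-open interval a_i < xi_i <= b_i."""
    k = len(a)
    total = 0.0
    for choice in product((0, 1), repeat=k):
        # choice[i] == 1 selects the lower end-point a_i and contributes a factor -1
        vertex = [a[i] if c else b[i] for i, c in enumerate(choice)]
        total += (-1) ** sum(choice) * F(*vertex)
    return total

def clip01(x):
    return min(max(x, 0.0), 1.0)

def F_uniform(x, y):
    """Assumed d.f.: unit mass spread uniformly over the unit square."""
    return clip01(x) * clip01(y)

p = interval_probability(F_uniform, a=(0.2, 0.1), b=(0.5, 0.4))
print(p)   # approximately 0.09, the area of the rectangle
```

For $k = 2$ the sum reduces to $F(b_1, b_2) - F(a_1, b_2) - F(b_1, a_2) + F(a_1, a_2)$.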
Theorem 2. Every d.f. $F(x_1, \ldots, x_k)$ possesses the following
properties:
(a) In each variable $x_i$, $F$ is a never decreasing function, which
is everywhere continuous to the right and tends to the limit 0 as
$x_i \to -\infty$.
(b) As all the variables $x_i$ tend (independently or not) to $+\infty$, $F$
tends to the limit 1.
(c) For any half-open $k$-dimensional interval $J$, the associated
$k$-th order difference of $F$ is non-negative, i.e. $\Delta_J^k F \ge 0$.
Further, every function $F(x_1, \ldots, x_k)$ which possesses the
properties (a), (b) and (c) determines uniquely a probability
distribution in $R_k$, such that $F$ represents the probability of the relations (7).
¹ The use here made of the terms probability function and distribution function
corresponds to the terminology of Kolmogoroff [4]. The latter term was used, with
the same significance, already by v. Mises [1, 2].
That $F$ is a never decreasing function of each $x_i$ follows
immediately from (4), since the set $S_{x_1, \ldots, x_k}$ increases steadily with
each $x_i$. Further, we have for every $h > 0$
$F(x_1 + h, x_2, \ldots, x_k) - F(x_1, x_2, \ldots, x_k) = P(S_{x_1 + h, x_2, \ldots, x_k} - S_{x_1, x_2, \ldots, x_k})$.
If $h$ runs through a sequence of values tending to zero, the sequence
of point sets appearing in the second member obviously tends to
a definite limit, viz. the empty set. Thus by Theorem 1 the first
member tends to zero, and $F$ is continuous to the right in $x_1$.
The same argument evidently applies to every $x_i$. In the same
way it is seen that $F$ tends to zero as any given $x_i \to -\infty$, since
the set $S_{x_1, \ldots, x_k}$ tends then to the empty set.
As, on the other hand, all the variables $x_i$ tend simultaneously
to $+\infty$, the set $S_{x_1, \ldots, x_k}$ tends to the whole space $R_k$, and
consequently $F$ tends to the limit 1.
Further, it is obvious that any d.f. will satisfy the property
(c), as we must have $P(S) \ge 0$ for any Borel set $S$.
The last part of Theorem 2, which asserts that every d.f.
uniquely determines a non-negative set function $P(S)$ satisfying
our axioms, is equivalent to a well-known proposition in the
theory of Lebesgue integration.¹ We have already seen that the
d.f. immediately determines the value of $P(S)$ for every half-open
$k$-dimensional interval $a_i < \xi_i \le b_i$ ($i = 1, 2, \ldots, k$). Now every
Borel set can be constructed from such intervals by means of
repeated passages to the limit, and the corresponding value of
the set function has then to be determined according to (5). That
this procedure leads to a uniquely determined result for every
Borel set is precisely asserted by the proposition referred to above.
¹ Lebesgue [1], pp. 168-169 (one-dimensional case); de la Vallée Poussin [1],
chap. VI.
According to Theorem 2, we are at liberty to define a
probability distribution either by its pr.f. (which is a set function)
or by its d.f. (which is a point function). Though of course the
distinction between the two methods is only formal, it will
sometimes be found convenient to prefer one of them to the
other. It is particularly in the case of distributions in a
one-dimensional space ($k = 1$) that we shall use the d.f., while for
general values of $k$ the pr.f. will be used.
In the one-dimensional case ($k = 1$), the property (c) is implied
by (a), and thus it follows from Theorem 2 that every
non-decreasing function $F(x)$ which is always continuous to the right
and is such that $F(x) \to 0$ as $x \to -\infty$, and $F(x) \to 1$ as $x \to +\infty$,
defines a probability distribution. As soon as $k > 1$, however, (c)
is no longer implied by (a), and already for $k = 2$ it is in fact easy
to construct examples of functions $F$ satisfying (a) and (b), but
not (c). Accordingly, these functions are not distribution
functions.²
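A function of this kind, taken from the tract's own footnote example, is $F(x, y) = 0$ for $x < 1$, $y < 1$, for $x < 0$ and for $y < 0$, with $F = 1$ elsewhere. The sketch below is an added illustration with hypothetical helper names; it evaluates the second-order difference of this $F$ over a small interval containing the point $x = y = 1$ in its interior.

```python
def F(x, y):
    """F = 0 for x < 1, y < 1, for x < 0 and for y < 0; F = 1 elsewhere."""
    if x < 0 or y < 0 or (x < 1 and y < 1):
        return 0
    return 1

def second_difference(F, a1, b1, a2, b2):
    """Second-order difference of F over a1 < x <= b1, a2 < y <= b2."""
    return F(b1, b2) - F(a1, b2) - F(b1, a2) + F(a1, a2)

d = second_difference(F, 0.9, 1.1, 0.9, 1.1)
print(d)   # -1, so property (c) fails for this F
```

A negative difference would correspond to a negative mass on the interval, which is why such an $F$ cannot be a d.f. even though it is never decreasing and continuous to the right in each variable.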
3. Let $X = (\xi_1, \ldots, \xi_k)$ be a random variable in $R_k$ with the
pr.f. $P(S)$, and $Y = f(X) = (\eta_1, \ldots, \eta_{\mathfrak{k}})$ be a B-measurable function
which is finite and uniquely defined for all points $X$ of $R_k$,
and such that its values belong to a certain space $\mathfrak{R}_{\mathfrak{k}}$. Then
if $\mathfrak{S}$ is a Borel set in $\mathfrak{R}_{\mathfrak{k}}$, the set $S$ of all points $X$ in $R_k$, such that
$Y = f(X) \subset \mathfrak{S}$, is (cf. I, 3) also a Borel set. If, now, we define a
set function $\mathfrak{P}(\mathfrak{S})$ in $\mathfrak{R}_{\mathfrak{k}}$ by the relation
$\mathfrak{P}(\mathfrak{S}) = P(S)$,
it is readily seen that our Axioms 1-3 are satisfied by $\mathfrak{P}(\mathfrak{S})$, so
that $\mathfrak{P}(\mathfrak{S})$ determines a probability distribution in $\mathfrak{R}_{\mathfrak{k}}$. This, by
definition, is the probability distribution of the random variable
$Y = f(X)$. The condition that $f$ should be finite and uniquely
defined for all points of $R_k$ may obviously be replaced by the more
general condition that the points $X$, where $f$ is not finite or not
uniquely defined, should form a set $Z$ such that $P(Z) = 0$.
For a set $\mathfrak{S}$ such that the corresponding set $S$ contains no point
$X$ we obtain, of course, $\mathfrak{P}(\mathfrak{S}) = 0$.
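For a distribution concentrated in a finite number of points, the relation $\mathfrak{P}(\mathfrak{S}) = P(S)$ amounts to collecting the mass of all points $X$ that $f$ carries to the same value. The following sketch is an illustration only: the distribution of $X$, the function $f(x) = x^2$ and the name `pushforward` are assumptions, not taken from the text.

```python
from fractions import Fraction
from collections import defaultdict

# Assumed distribution of X: equal mass on the points -2, -1, 0, 1, 2.
P_X = {x: Fraction(1, 5) for x in range(-2, 3)}

def f(x):
    return x * x   # illustrative choice of the function Y = f(X)

def pushforward(P, f):
    """Give each value y of Y the mass of the set of points x with f(x) = y."""
    Q = defaultdict(Fraction)   # missing values start from mass 0
    for x, p in P.items():
        Q[f(x)] += p
    return dict(Q)

P_Y = pushforward(P_X, f)
print(P_Y)   # masses 1/5, 2/5, 2/5 at the points 0, 1, 4
```

The total mass is unchanged by the construction, in agreement with Axiom 2 for the distribution of $Y$.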
² A simple example is the function $F(x, y)$ defined by $F = 0$ for $x < 1$, $y < 1$, for
$y < 0$, and for $x < 0$, and by $F = 1$ elsewhere. For this function, the
difference associated with a sufficiently small interval $J$ containing the point
$x = y = 1$ in its interior is seen to be negative, so that (c) is not satisfied.
Take, e.g., $Y = (\xi_1, \ldots, \xi_{\mathfrak{k}})$, where $\mathfrak{k} < k$, so that $Y$ is simply the
projection of the point $X$ on a certain $\mathfrak{k}$-dimensional sub-space $R_{\mathfrak{k}}$.
The pr.f. of $Y$ is then $\mathfrak{P}(\mathfrak{S}) = P(S)$, where $S$ is the cylinder set in
$R_k$ defined by the relation $(\xi_1, \ldots, \xi_{\mathfrak{k}}) \subset \mathfrak{S}$. This may be concretely
interpreted by saying that the distribution of $Y$ is formed by
projecting the mass in the original distribution on the sub-space $R_{\mathfrak{k}}$.
In particular ($\mathfrak{k} = 1$), every component $\xi_i$ of $X$ is itself a random
variable, and the corresponding distribution is found by
projecting the original distribution on the axis of $\xi_i$.
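The projection of mass on a coordinate axis can be sketched for a distribution concentrated in a few points of $R_2$. The example below is an added illustration under that assumption; the masses and the helper name `project` are hypothetical.

```python
from fractions import Fraction
from collections import defaultdict

# Assumed distribution in R_2: masses at a few points (xi_1, xi_2).
P = {
    (0, 0): Fraction(1, 2),
    (0, 1): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
}

def project(P, axis):
    """Project the mass onto one coordinate axis (a single retained component)."""
    Q = defaultdict(Fraction)
    for point, mass in P.items():
        Q[point[axis]] += mass
    return dict(Q)

print(project(P, 0))   # distribution of xi_1: mass 3/4 at 0 and 1/4 at 1
print(project(P, 1))   # distribution of xi_2: mass 1/2 at 0 and 1/2 at 1
```

Each projected mass is the probability of a cylinder set: for instance the mass at $\xi_1 = 0$ is the mass of all points of $R_2$ whose first coordinate is 0.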
4. Two random variables $X_1 = (\xi_1, \ldots, \xi_{k_1})$ in $R_{k_1}$ and
$X_2 = (\eta_1, \ldots, \eta_{k_2})$ in $R_{k_2}$ being given, it often occurs that we have to
consider also the "combined" variable $\mathfrak{X} = (X_1, X_2)$ as a random
variable. The "values" of $\mathfrak{X}$ are all pairs of "values" of $X_1$ and
$X_2$, so that $\mathfrak{X} = (X_1, X_2) = (\xi_1, \ldots, \xi_{k_1}, \eta_1, \ldots, \eta_{k_2})$ is defined in the
product space $\mathfrak{R}_{\mathfrak{k}} = R_{k_1} \cdot R_{k_2}$, where $\mathfrak{k} = k_1 + k_2$. Obviously the
probability distribution of $\mathfrak{X}$ in $\mathfrak{R}_{\mathfrak{k}}$ must be such that its projections
on $R_{k_1}$ and $R_{k_2}$ coincide with the distributions of $X_1$ and $X_2$
respectively.¹ Similar remarks apply to the "combined" variable
formed with any number of random variables.
Let the probability functions of $X_1$, $X_2$ and $\mathfrak{X}$ be $P_1$, $P_2$ and
$\mathfrak{P}$, while the corresponding distribution functions are $F_1$, $F_2$ and $\mathfrak{F}$.
Then $F_1(x_1, \ldots, x_{k_1})$ and $F_2(y_1, \ldots, y_{k_2})$ denote the probabilities
of the relations
$\xi_i \le x_i$    ($i = 1, 2, \ldots, k_1$)
and    $\eta_j \le y_j$    ($j = 1, 2, \ldots, k_2$)
respectively, while $\mathfrak{F}(x_1, \ldots, x_{k_1}, y_1, \ldots, y_{k_2})$ denotes the
probability of the joint existence of all these $\mathfrak{k} = k_1 + k_2$ relations.
¹ Any distribution satisfying this condition is, of course, logically possible.
We now introduce the following important definition: The
variables $X_1$ and $X_2$ are called mutually independent if, for all
values of the $x_i$ and $y_j$,
(8)    $\mathfrak{F}(x_1, \ldots, x_{k_1}, y_1, \ldots, y_{k_2}) = F_1(x_1, \ldots, x_{k_1})\,F_2(y_1, \ldots, y_{k_2})$.
If $S_1$ and $S_2$ are given sets in $R_{k_1}$ and $R_{k_2}$ respectively, and if we
consider the set $\mathfrak{S}$ in $\mathfrak{R}_{\mathfrak{k}}$ formed by all pairs $\mathfrak{X} = (X_1, X_2)$ such that
(9)    $X_1 \subset S_1$ and $X_2 \subset S_2$,
then it follows from (8) by the basic property of Borel sets that
$\mathfrak{P}(\mathfrak{S}) = P_1(S_1)\,P_2(S_2)$.
Thus for two independent variables the probability of the joint
existence of the relations (9) is equal to the product of the
probabilities for each relation separately. The validity of this
multiplicative rule for the particular sets connected with the
distribution functions is thus equivalent to the validity of the
same rule for all Borel sets.
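The multiplicative rule, and its failure for a dependent pair with the same projections, can be checked on a small example. The sketch below is an added illustration: it assumes two variables each with mass 1/6 on the points 1, ..., 6, and compares the independent combination with the combination in which $X_2$ always equals $X_1$; all names are hypothetical.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
P1 = {x: Fraction(1, 6) for x in faces}   # assumed distribution of X_1
P2 = {y: Fraction(1, 6) for y in faces}   # assumed distribution of X_2

# Two combined distributions with the same projections P1 and P2.
independent = {(x, y): P1[x] * P2[y] for x, y in product(faces, faces)}
dependent = {(x, y): (P1[x] if x == y else Fraction(0))
             for x, y in product(faces, faces)}

def joint_prob(joint, S1, S2):
    """Probability of the joint existence of X_1 in S1 and X_2 in S2."""
    return sum((p for (x, y), p in joint.items() if x in S1 and y in S2),
               Fraction(0))

def prob(P, S):
    return sum((P[x] for x in S), Fraction(0))

S1, S2 = {1, 2}, {3, 4}
product_rule = prob(P1, S1) * prob(P2, S2)                # 1/3 * 1/3 = 1/9
print(joint_prob(independent, S1, S2) == product_rule)    # True
print(joint_prob(dependent, S1, S2) == product_rule)      # False
```

Both combined distributions project onto the same pair of one-dimensional distributions, which shows that the projections alone do not determine whether the variables are independent.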
5. Let Xl' XI' ... , X
n
be random variables with the pr.f.'s.
... ,Pn and the lfi, ...,F
n
defined in the spaces B', ..., H:n)
of any number of dimensions. Consider the combined variable
11. =(Xl' .. *' Xfl) with the pr.f. '71. and the d.f. fYn' defined in the
produot space ffi,<n).
Xl' ... , Xft are then called mut'Ual''ll indepe:ndent if
trn=F1F" ... F
n
,
which is the straightforward generalization of (8). As in the case
of two variables, this is equivalent to the relation
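(10) $\mathfrak{P}_n(\mathfrak{S}_n) = P_1(S_1) \cdots P_n(S_n),$
where $S_1, \ldots, S_n$ are given sets in $R', \ldots, R^{(n)}$ respectively, while
$\mathfrak{S}_n$ denotes the set in $\mathfrak{R}^{(n)}$ which consists of all points
$\mathfrak{X}_n = (X_1, \ldots, X_n)$ such that $X_i \subset S_i$ for $i = 1, 2, \ldots, n$.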
If, in (10), we put $S_n = R^{(n)}$, we obtain
$\mathfrak{P}_{n-1}(\mathfrak{S}_{n-1}) = P_1(S_1) \cdots P_{n-1}(S_{n-1}),$
where $\mathfrak{P}_{n-1}$ is the pr.f. of $\mathfrak{X}_{n-1} = (X_1, \ldots, X_{n-1})$ and $\mathfrak{S}_{n-1}$
is the set of all points $\mathfrak{X}_{n-1}$ in $\mathfrak{R}^{(n-1)}$ such that $X_i \subset S_i$ for
$i = 1, 2, \ldots, n-1$. Thus we infer that the variables $X_1, \ldots, X_{n-1}$
are independent, and in the same way we obviously find that any
group of $m < n$ among the variables $X_i$ are mutually independent.
Further, it is easily found that, if the variables $X_1, \ldots, X_m$,
$Y_1, \ldots, Y_n$ are all mutually independent, then the combined
variables $\mathfrak{X}_m = (X_1, \ldots, X_m)$ and $\mathfrak{Y}_n = (Y_1, \ldots, Y_n)$ are also
independent.
6. Any B-measurable vector function $f(X_1, \ldots, X_m)$ of $m$
random variables may be considered as a B-measurable function
of the combined variable $\mathfrak{X}_m$. Thus according to II, 3, the
probability distribution of $f$ is uniquely determined by the
distribution of $\mathfrak{X}_m$.
Theorem 3. Let $X_1, \ldots, X_m$, $Y_1, \ldots, Y_n$ be independent random
variables, and $f(X_1, \ldots, X_m)$ and $g(Y_1, \ldots, Y_n)$ be B-measurable
vector functions of the assigned arguments. Then $f$ and $g$ are mutually
independent random variables.
We have $f = f(\mathfrak{X}_m)$ and $g = g(\mathfrak{Y}_n)$, where, according to the
preceding paragraph, $\mathfrak{X}_m$ and $\mathfrak{Y}_n$ are independent. The
probability that $f$ belongs to a given set $S$ is then, by definition, equal
to the probability that $\mathfrak{X}_m$ belongs to the set of all points which,
by the relation $f = f(\mathfrak{X}_m)$, correspond to values of $f$ belonging to $S$.
For $g$, and for the combined variable $(f, g)$, the analogous relations
hold. The independence of $f$ and $g$ thus follows from the
independence of $\mathfrak{X}_m$ and $\mathfrak{Y}_n$.
7. If $X$ and $Y$ are random variables with the pr.f.'s $P(S)$ and
$Q(T)$, where $S$ and $T$ are variable sets in the spaces of $X$ and $Y$
respectively, and if the pr.f. of the combined variable $(X, Y)$
is $\mathfrak{P}$, we can form the probability of the joint existence of
the relations $X \subset S$, $Y \subset T$. This is a function of two variable sets,
say $G(S, T)$. Now let us consider the expression
(11) $P_T(S) = \dfrac{G(S, T)}{Q(T)}$
for a fixed set $T$ such that $Q(T) > 0$. Then $P_T(S)$ becomes a
function of the variable set $S$, and it is immediately seen that this
function satisfies Axioms 1-3. For every fixed $T$ in the space of
$Y$ such that $Q(T) > 0$, $P_T(S)$ thus defines a probability
distribution in the space of $X$. This distribution is called the probability
distribution of $X$ relative to the hypothesis $Y \subset T$, and the quantity
$P_T(S)$ is known as the probability of the relation (event) $X \subset S$
relative to the hypothesis $Y \subset T$. Similarly, we define a distribution
in the space of $Y$ relative to the hypothesis $X \subset S$:
(12) $Q_S(T) = \dfrac{G(S, T)}{P(S)}.$
If $S_1, \ldots, S_n$ are such that $S_i$ and $S_k$ have no common point for
$i \ne k$, while $S_1 + \ldots + S_n$ coincides with the whole space of $X$, we
obviously have by (12)
$Q(T) = G(S_1, T) + \ldots + G(S_n, T) = P(S_1)\, Q_{S_1}(T) + \ldots + P(S_n)\, Q_{S_n}(T),$
and so obtain from (11)
$P_T(S_\nu) = \dfrac{P(S_\nu)\, Q_{S_\nu}(T)}{\sum_{i=1}^{n} P(S_i)\, Q_{S_i}(T)}.$
This relation is known under the name of Bayes' theorem, and is
considered as giving the probability a posteriori, i.e. calculated
after the "event" $Y \subset T$ has been observed, of the particular
"hypothesis" $X \subset S_\nu$, when $P(S_1), \ldots, P(S_n)$ are the a priori
probabilities of the various hypotheses $X \subset S_i$.
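As a numerical illustration of Bayes' theorem, the following sketch computes the a posteriori probabilities from the a priori probabilities $P(S_i)$ and the relative probabilities $Q_{S_i}(T)$. The function name `posterior` and the three-hypothesis data are hypothetical and chosen only for the example.

```python
from fractions import Fraction

def posterior(prior, likelihood):
    """A posteriori probabilities P_T(S_i).

    prior[i] plays the role of P(S_i), likelihood[i] that of Q_{S_i}(T).
    """
    total = sum(p * q for p, q in zip(prior, likelihood))  # this is Q(T)
    if total == 0:
        raise ValueError("Q(T) must be positive")
    return [p * q / total for p, q in zip(prior, likelihood)]

# Hypothetical data: three mutually exclusive hypotheses S_1, S_2, S_3.
prior = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
likelihood = [Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)]

post = posterior(prior, likelihood)
print(post)
```

With these numbers $Q(T) = 19/50$, and the least probable hypothesis a priori becomes the most probable a posteriori.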
If $X$ and $Y$ are independent variables, we have
$G(S, T) = P(S)\, Q(T),$
and thus by (11) and (12)
$P_T(S) = P(S), \quad Q_S(T) = Q(T),$
so that in this case the relative probabilities coincide with the
"total" probabilities $P(S)$ and $Q(T)$.
SECOND PART
DISTRIBUTIONS IN $R_1$
All random variables and distributions considered in this part
are, unless explicitly stated otherwise, defined in a one-dimensional
space $R_1$.
CHAPTER III
GENERAL PROPERTIES. MEAN VALUES
1. According to Theorem 2, the d.f. $F(x)$ of a probability
distribution in $R_1$ is always a non-decreasing function of $x$, which
is everywhere continuous to the right and tends to 0 as $x \to -\infty$,
and to 1 as $x \to +\infty$. Conversely, any $F(x)$ with these properties
determines a probability distribution.
Any d.f. $F(x)$ being a monotone function, we can at once state
a number of general properties of $F(x)$, for the proofs of which we
refer to standard treatises on the Theory of Functions of a Real
Variable.¹
Theorem 4. A d.f. $F(x)$ has at most a finite number of points
at which the saltus is $\ge k > 0$, and consequently at most an enumerable
set of points of discontinuity. The derivative $F'(x)$ exists for
"almost all" values of $x$ (i.e. the points of exception form a set of
measure zero).
$F(x)$ can always be represented as a sum of three components
(13) $F(x) = a_I F_I(x) + a_{II} F_{II}(x) + a_{III} F_{III}(x),$
where $a_I$, $a_{II}$, $a_{III}$ are non-negative numbers with the sum 1, while
$F_I$, $F_{II}$, $F_{III}$ are distribution functions such that:
$F_I(x)$ is absolutely continuous: $F_I(x) = \int_{-\infty}^{x} F_I'(t)\,dt$ for all values
of $x$.
1 Hobson [1], I, p. 338 and p. 603.
$F_{II}(x)$ is a "step-function": $a_{II} F_{II}(x)$ = the sum of the saltuses of
$F(x)$ at all discontinuities $\le x$.
$F_{III}(x)$, the "singular" component, is a continuous function
having almost everywhere a derivative $= 0$.
The three components $a_I F_I$, $a_{II} F_{II}$, $a_{III} F_{III}$ are uniquely
determined by $F(x)$.
Let us consider in particular the cases when $a_I$ or $a_{II}$ is equal
to 1, so that $F(x)$ coincides with $F_I$ or $F_{II}$. We shall say in
these cases, which are those usually occurring in applications,
that $F$ is of type I or II respectively.
I. If $F(x) = F_I(x)$, we have for all values of $x$
$F(x) = \int_{-\infty}^{x} F'(t)\,dt,$
and thus the probability that the random variable $X$ with the
d.f. $F(x)$ assumes a value belonging to the given set $S$ is
$\int_S F'(t)\,dt.$
The derivative $F'(x)$ is then called the frequency function or
probability density of $X$.
II. If $F(x) = F_{II}(x)$, there is a finite or enumerable set of
points $x_i$ such that every $x_i$ is a point of discontinuity of $F(x)$,
while $F(x)$ is constant on every closed interval which contains
no $x_i$. If $p_i$ is the saltus of $F(x)$ at the point $x_i$, we have $\sum_i p_i = 1$.
The probability that $X$ belongs to the given set $S$ is zero, if $S$
does not contain any $x_i$, and is otherwise equal to the sum of all
those $p_i$ which correspond to points $x_i$ belonging to $S$. Thus in
this case the distribution is completely described by saying that
we have the probability $p_i$ that $X$ assumes the value $x_i$ $(i = 1, 2, \ldots)$
and the probability 0 that $X$ differs from all the $x_i$.
2. A Lebesgue-Stieltjes integral $\int_S g\,dP$ with respect to the
pr.f. $P(S)$ has been defined in I, 3. We now define the
corresponding integral with respect to the d.f. $F(x)$ simply by putting
$\int_S g\,dF = \int_S g\,dP.$
If $X$ is a random variable with the d.f. $F(x)$, the integral
$\int_a^b x\,dF(x)$ has a uniquely determined value for all finite $a$ and $b$.
If this integral tends to a finite limit as $a \to -\infty$ and $b \to +\infty$
independently (i.e. if the integral is absolutely convergent), we
denote this limit by¹
(14) $E(X) = \int_{-\infty}^{\infty} x\,dF(x)$
and call it the mean value or expectation of the
random variable $X$.
A B-measurable function $g(X)$ of $X$ may, according to II, 3,
be considered as a random variable. If the d.f. of this variable
is denoted by $F^*(x)$, we have by II, 3,
$F^*(x) = \int_{S_x} dF(t),$
where $S_x$ denotes the set of all points $t$ such that $g(t) \le x$. Thus
we obtain
$\int_a^b x\,dF^*(x) = \int g(x)\,dF(x),$
where the integral in the second member is extended to the set
$S_b - S_a$. Now if the integral
$\int_{-\infty}^{\infty} |g(x)|\,dF(x)$
is convergent, we may allow $a$ and $b$ to tend to $-\infty$ and $+\infty$, and
so obtain according to (14) for the mean value of $g(X)$
(15) $E\{g(X)\} = \int_{-\infty}^{\infty} g(x)\,dF(x).$
1 In the particular cases when $F(x)$ is of type I or type II, we have
$E(X) = \int_{-\infty}^{\infty} x F'(x)\,dx$ and $E(X) = \sum_i p_i x_i$ respectively.
In the same way we obtain, if $g(\mathfrak{X})$ is a real-valued function of
a random variable $\mathfrak{X}$ which is defined in a space $R_r$ of any number
of dimensions,
(15a) $E\{g(\mathfrak{X})\} = \int_{R_r} g\,d\mathfrak{P},$
where $\mathfrak{P}(S)$ is the pr.f. of $\mathfrak{X}$, and the integral is assumed to be
absolutely convergent. If, in particular, $g$ depends only on a
certain number $k < r$ of the co-ordinates of $\mathfrak{X}$, the integral is, by
II, 3, directly reduced to an integral over the corresponding
sub-space $R_k$.
The mean value of the particular function $(X - E(X))^2$ is
called the variance of $X$. The non-negative square root of this
mean value is called the standard deviation (abbreviated s.d.) of
$X$ and is denoted by $D(X)$, so that we have, assuming the
convergence of the integral,
(16) $D^2(X) = \int_{-\infty}^{\infty} (x - E(X))^2\,dF(x) = E(X^2) - E^2(X).$
The square root D (X) is always to be given a non-negative value.
We have $D(X) = 0$ if and only if $F(x)$ is constant on every
closed interval which does not contain the point $x = E(X)$. In
this extreme case, we have the probability 1 that the variable $X$
assumes the value $E(X)$, and we have $F(x) = \epsilon(x - E(X))$,
where $\epsilon(x)$ denotes the particular d.f. given by
(17) $\epsilon(x) = 0$ for $x < 0$, $\quad \epsilon(x) = 1$ for $x \ge 0$.
In all other cases, the standard deviation $D(X)$ is positive.
If $X$ is a random variable with a finite mean value, we
obviously have by (15)
(18) $E(aX + b) = aE(X) + b$
for any constants $a$ and $b$. Further, if the s.d. is also finite, we
have
(19) $D(aX + b) = |a|\,D(X).$
In particular, the normalized variable $\dfrac{X - E(X)}{D(X)}$ has the mean
value 0 and the s.d. 1.
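Relations (18) and (19) are easy to verify on a distribution of type II. The sketch below assumes a hypothetical three-point distribution, stored as a dictionary `{x_i: p_i}`; the helper names `mean`, `sd` and `affine` are introduced only for this illustration.

```python
import math

def mean(dist):
    """E(X) = sum of p_i * x_i for a type II distribution {x_i: p_i}."""
    return sum(p * x for x, p in dist.items())

def sd(dist):
    """D(X), the non-negative square root of the variance."""
    m = mean(dist)
    return math.sqrt(sum(p * (x - m) ** 2 for x, p in dist.items()))

def affine(dist, a, b):
    """Distribution of aX + b."""
    out = {}
    for x, p in dist.items():
        out[a * x + b] = out.get(a * x + b, 0.0) + p
    return out

# Hypothetical distribution and constants.
dist = {-1.0: 0.2, 0.0: 0.5, 3.0: 0.3}
a, b = -2.0, 5.0

transformed = affine(dist, a, b)
normalized = affine(dist, 1.0 / sd(dist), -mean(dist) / sd(dist))

print(mean(transformed), a * mean(dist) + b)   # both sides of (18)
print(sd(transformed), abs(a) * sd(dist))      # both sides of (19)
print(mean(normalized), sd(normalized))        # approximately 0 and 1
```

The negative value of `a` is chosen deliberately, to show that the absolute value in (19) is needed.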
The moments $\alpha_\nu$ and the absolute moments $\beta_\nu$ of the variable $X$
are the mean values of $X^\nu$ and $|X|^\nu$ for $\nu = 0, 1, 2, \ldots$:
$\alpha_\nu = \int_{-\infty}^{\infty} x^\nu\,dF(x),$
$\beta_\nu = \int_{-\infty}^{\infty} |x|^\nu\,dF(x).$
($\beta_\nu$ is, of course, hereby defined also for non-integral $\nu > 0$.) It
is immediately seen that, if $\beta_k$ is finite, both $\alpha_\nu$ and $\beta_\nu$ are finite
for $\nu \le k$. Further, we have $\alpha_{2\nu} = \beta_{2\nu}$ and $|\alpha_{2\nu+1}| \le \beta_{2\nu+1}$. From
(14) and (16) we obtain
$E(X) = \alpha_1, \quad D^2(X) = \alpha_2 - \alpha_1^2.$
If $\beta_k$ is finite, it follows from well-known inequalities¹ that we have
(20) $\beta_1 \le \beta_2^{1/2} \le \beta_3^{1/3} \le \ldots \le \beta_k^{1/k}.$
In the sequel it will always be tacitly understood that the mean
values occurring in our considerations are assumed to be finite
even in the rigorous sense that the corresponding integrals are
absolutely convergent.
3. Theorem 5.² Let $\psi(x)$ denote a non-negative function such
that $\psi(x) \ge M > 0$ for all $x$ belonging to a certain set $S$. Then if $X$
is a random variable, the probability that $X$ assumes a value
belonging to $S$ is $\le \dfrac{E\{\psi(X)\}}{M}$.
This follows directly from the relation
$E\{\psi(X)\} = \int_{-\infty}^{\infty} \psi(x)\,dF(x) \ge M \int_S dF(x) = M\,P(S).$
Taking here in particular $\psi(x) = (x - E(X))^2$, $M = k^2$, we
obtain for every $k > 0$ the Bienaymé-Tchebycheff inequality:
The probability of the relation $|X - E(X)| \ge k$ is $\le \dfrac{D^2(X)}{k^2}$.
Taking further $\psi(x) = |x|^\nu$, $M = k^\nu \beta_\nu$, it follows that the
probability of $|X| \ge k\,\beta_\nu^{1/\nu}$ is $\le \dfrac{1}{k^\nu}$.
1 Cf. Hardy-Littlewood-Pólya [1], p. 157.
2 This is an obvious generalization of theorems due to Tchebycheff and Markoff.
Cf. Kolmogoroff [4], p. 37.
Choosing finally $\psi(x) = e^{cx}$, $M = e^{ca}$, where $c > 0$, we conclude
that the probability of $X \ge a$ is $\le \dfrac{E(e^{cX})}{e^{ca}}$.
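The inequalities of this paragraph can be tested directly on a small example. The sketch below assumes a hypothetical four-point distribution and uses exact rational arithmetic; `expect` and `prob` are illustrative helper names, not notation of the text.

```python
from fractions import Fraction as Fr

# Hypothetical type II distribution {x_i: p_i}.
dist = {0: Fr(1, 2), 1: Fr(1, 4), 4: Fr(1, 8), 10: Fr(1, 8)}

def expect(g):
    """E{g(X)} for the distribution above."""
    return sum(p * g(x) for x, p in dist.items())

def prob(event):
    """Probability that X satisfies the given relation."""
    return sum(p for x, p in dist.items() if event(x))

m = expect(lambda x: x)                 # E(X)
var = expect(lambda x: (x - m) ** 2)    # D^2(X)

# Bienayme-Tchebycheff: P(|X - E(X)| >= k) <= D^2(X) / k^2.
for k in range(1, 10):
    lhs = prob(lambda x: abs(x - m) >= k)
    print(k, lhs, var / k ** 2)

# Theorem 5 with psi(x) = x^2, S = {x >= 4}, M = 16.
general_bound = expect(lambda x: x * x) / 16
```

For small $k$ the bound exceeds 1 and is therefore trivial; it becomes informative only for $k$ larger than $D(X)$.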
4. Let $X$ and $Y$ be random variables in $R_1$, such that the
combined variable $Z = (X, Y)$ has a certain pr.f. $P(S)$ in $R_2$.
Then $X + Y$ is a one-dimensional vector function of $Z$, which
according to II, 6 has a distribution uniquely determined by
$P(S)$. By (15a) we then have $E(X + Y) = \int_{R_2} (X + Y)\,dP$. The
integrals $\int_{R_2} X\,dP$ and $\int_{R_2} Y\,dP$ reduce, however, according to
the remark made in connection with (15a), to the one-dimensional
integrals representing $E(X)$ and $E(Y)$. As soon as these
two mean values exist, we thus have the important formula
(21) $E(X + Y) = E(X) + E(Y),$
which is evidently hereby proved without any assumption con-
cerning the nature of the dependence between X and Y. Ob-
viously (21) is immediately generalized to any finite number
of terms.
Treating in the same way the product $XY$, we obtain
$E(XY) = \int_{R_2} XY\,dP.$
If, in particular, $X$ and $Y$ are mutually independent, we have by
II, 4, $P = P_1 P_2$, $P_1$ and $P_2$ being the pr.f.'s of $X$ and $Y$. It then
follows from I, 3, that, if $X$ and $Y$ are independent, we have¹
(22) $E(XY) = E(X)\,E(Y).$
From (16), (21) and (22) we obtain further, if $X$ and $Y$ are
independent,
(23) $D^2(X + Y) = D^2(X) + D^2(Y),$
which is immediately generalized to any finite number of mutually
independent terms.
1 If we restrict ourselves to variables with finite variances, the necessary and
sufficient condition for the validity of (22) and (23) is that the correlation coefficient
of X and Y vanishes.
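The difference between (21), which needs no assumption on the dependence, and (22) and (23), which are stated for independent variables, can be seen on two small joint distributions. The following sketch is an illustration only: the distributions `independent` and `dependent` are hypothetical, and the second one is the extreme case $Y = X$.

```python
from fractions import Fraction as Fr

def expect(joint, g):
    """Mean value of g(X, Y) for a joint distribution {(x, y): p}."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

def variance(joint, g):
    """Variance of g(X, Y)."""
    m = expect(joint, g)
    return expect(joint, lambda x, y: (g(x, y) - m) ** 2)

# Independent case: the joint probabilities are products of the marginals.
px = {0: Fr(1, 2), 1: Fr(1, 2)}
py = {1: Fr(1, 3), 2: Fr(2, 3)}
independent = {(x, y): px[x] * py[y] for x in px for y in py}

# Dependent case: Y coincides with X.
dependent = {(0, 0): Fr(1, 2), (1, 1): Fr(1, 2)}

X = lambda x, y: x
Y = lambda x, y: y
S = lambda x, y: x + y
XY = lambda x, y: x * y

for name, joint in (("independent", independent), ("dependent", dependent)):
    print(name,
          expect(joint, S) == expect(joint, X) + expect(joint, Y),          # (21)
          expect(joint, XY) == expect(joint, X) * expect(joint, Y),         # (22)
          variance(joint, S) == variance(joint, X) + variance(joint, Y))    # (23)
```

In the dependent case only the first comparison holds, since $D^2(2X) = 4D^2(X)$.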
CHAPTER IV
CHARACTERISTIC FUNCTIONS
1. The mean value of a real function $g(X)$ of a random
variable $X$ has been defined in III, 2. For a complex function
$g(X) + ih(X)$, we put
$E(g + ih) = E(g) + iE(h) = \int_{-\infty}^{\infty} (g + ih)\,dF(x).$
With this definition, the rules for operations with mean values
given in the preceding Chapter hold true even for mean values
of complex functions. If, in particular, $X$ and $Y$ are complex
functions of mutually independent variables, we obtain from (22)
and Theorem 3
$E(XY) = E(X)\,E(Y).$
The mean value of the particular function $e^{itX}$, where $t$ is an
auxiliary variable, will be called the characteristic function
(abbreviated c.f.) of the corresponding distribution.¹ Denoting
this function by $f(t)$, we have
(24) $f(t) = E(e^{itX}) = \int_{-\infty}^{\infty} e^{itx}\,dF(x).$
Unless explicitly stated otherwise, $f(t)$ will be considered for
real values of $t$ only.
The integral in (24) is absolutely and uniformly convergent
for all real $t$, so that $f(t)$ is uniformly continuous. Obviously
$f(0) = 1$, and
$|f(t)| \le \int_{-\infty}^{\infty} dF(x) = 1.$
The variable $aX + b$ (with $a > 0$) has the d.f. $F\left(\dfrac{x - b}{a}\right)$ and the
c.f. $e^{bit} f(at)$. Thus in particular, putting $E(X) = m$ and $D(X) = \sigma$,
the normalized variable (cf. III, 2) $\dfrac{X - m}{\sigma}$ has the c.f.
$e^{-mit/\sigma} f\left(\dfrac{t}{\sigma}\right)$. Further, the variable $-X$ has the c.f.
$f(-t) = \overline{f(t)}$.
1 The first use of an analytical instrument substantially equivalent to the
characteristic function seems to be due to Lagrange [1]. (Cf. Todhunter [1], pp.
309-313.) Similar functions were then systematically employed by Laplace in his
great work [1].
Theorem 6.¹ For every real $\xi$, the limit
$\lim_{T \to \infty} \dfrac{1}{2T} \int_{-T}^{T} f(t)\,e^{-it\xi}\,dt$
exists and is equal to the saltus of $F(x)$ at $x = \xi$. Thus if $F(x)$ is
continuous at $x = \xi$, the limit is zero.
We have
$\dfrac{1}{2T} \int_{-T}^{T} f(t)\,e^{-it\xi}\,dt = \dfrac{1}{2T} \int_{-T}^{T} dt \int_{-\infty}^{\infty} e^{it(x - \xi)}\,dF(x) = \int_{-\infty}^{\infty} \dfrac{\sin Tx}{Tx}\,d_x F(x + \xi).$
The contribution to the last integral which is due to the domain
$|x| \ge h > 0$ tends to zero as $T \to \infty$, whatever the value of $h$.
Let $h$ be so chosen that the variation of $F(x)$ in the interval
$|x - \xi| < h$ exceeds the saltus at $x = \xi$ by less than $\epsilon$. Then, since
$\dfrac{\sin Tx}{Tx} = 1$ for $x = 0$, and $\left|\dfrac{\sin Tx}{Tx}\right| \le 1$ always, it is seen that, for all
sufficiently large $T$, the last integral differs from the saltus of $F$ at
the point $\xi$ by less than $2\epsilon$. Thus the theorem is proved.
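For a distribution of type II the averaged integral of Theorem 6 can be written down term by term, exactly as in the last member of the relation used in the proof. The sketch below assumes a hypothetical three-point distribution; it evaluates the closed form for increasing $T$ and cross-checks it against a plain midpoint-rule integration of $f(t)\,e^{-it\xi}$. The function names are illustrative only.

```python
import cmath
import math

# Hypothetical step distribution: saltus probs[j] at the point points[j].
points = [-1.0, 0.5, 2.0]
probs = [0.2, 0.5, 0.3]

def cf(t):
    """c.f. f(t) = sum_j p_j exp(i t x_j) of the step distribution."""
    return sum(p * cmath.exp(1j * t * x) for x, p in zip(points, probs))

def averaged_closed_form(xi, T):
    """(1/2T) * integral of f(t) exp(-i t xi) over (-T, T), computed as
    sum_j p_j * sin(T (x_j - xi)) / (T (x_j - xi))."""
    total = 0.0
    for x, p in zip(points, probs):
        u = T * (x - xi)
        total += p * (1.0 if u == 0.0 else math.sin(u) / u)
    return total

def averaged_numeric(xi, T, steps=20000):
    """The same average by the midpoint rule, as a cross-check."""
    dt = 2.0 * T / steps
    total = 0.0 + 0.0j
    for j in range(steps):
        t = -T + (j + 0.5) * dt
        total += cf(t) * cmath.exp(-1j * t * xi)
    return (total * dt / (2.0 * T)).real

for T in (10.0, 1e3, 1e6):
    # xi = 0.5 is a discontinuity with saltus 0.5; xi = 1.0 is a continuity point.
    print(T, averaged_closed_form(0.5, T), averaged_closed_form(1.0, T))
```

The convergence is slow, of order $1/T$, since each term belonging to a point $x_j \ne \xi$ only decreases like $1/(T|x_j - \xi|)$.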
1 Bochner [1], p. 79.
Representing $F(x)$ as the sum of three components according
to (13), we have
$f(t) = a_I f_I(t) + a_{II} f_{II}(t) + a_{III} f_{III}(t),$
each term containing the c.f. of the corresponding component of
$F(x)$. We shall consider the behaviour of these three terms
separately.
I. Since $F_I$ is absolutely continuous, $f_I(t) = \int_{-\infty}^{\infty} e^{itx} F_I'(x)\,dx$,
and hence $f_I(t) \to 0$ as $|t| \to \infty$ by the Riemann-Lebesgue
theorem.¹ It follows that $\dfrac{1}{2T} \int_{-T}^{T} |f_I(t)|^2\,dt \to 0$ as $T \to \infty$. If the
$n$th derivative $F_I^{(n)}(x)$ exists for all $x$ and is absolutely integrable,
a partial integration shows that $f_I(t) = O(|t|^{-(n-1)})$ as $|t| \to \infty$.
II. $a_{II} f_{II}(t) = \sum_\nu p_\nu e^{itx_\nu}$ (using the notations of III, 1) is the
sum of an absolutely convergent trigonometric series, and is thus
an almost periodic function,² which comes as close to $a_{II}$ as we
please for arbitrarily large values of $t$, so that $\limsup_{|t| \to \infty} |f_{II}(t)| = 1$.
We have³ $\dfrac{1}{2T} \int_{-T}^{T} |a_{II} f_{II}(t)|^2\,dt \to \sum_\nu p_\nu^2$ as $T \to \infty$.
III. $f_{III}(t)$ is the c.f. of a d.f. $F_{III}(x)$ which is continuous and
has almost everywhere a derivative equal to zero. It is possible
to show by examples⁴ that $f_{III}(t)$ does not necessarily tend to 0
as $|t| \to \infty$. We have, however, always $\dfrac{1}{2T} \int_{-T}^{T} |f_{III}(t)|^2\,dt \to 0$ as
$T \to \infty$. It will, in fact, be shown in V, 1, that if $f(t)$ is the c.f. of
a continuous d.f., the same holds true for $|f(t)|^2$. Thus the desired
result follows by applying Theorem 6 to $|f_{III}(t)|^2$ and putting
$\xi = 0$.
We are thus able to state the following theorems.
Theorem 7. If, in the representation of the d.f. $F(x)$ according
to (13), we have $a_I > 0$, then $\limsup_{|t| \to \infty} |f(t)| < 1$.
If $a_I = 1$, then $\lim_{|t| \to \infty} f(t) = 0$.
If $a_{II} = 1$, then $\limsup_{|t| \to \infty} |f(t)| = 1$.
Theorem 8.⁵ For every c.f. $f(t)$ we have
$\lim_{T \to \infty} \dfrac{1}{2T} \int_{-T}^{T} |f(t)|^2\,dt = \sum_\nu p_\nu^2,$
the $p_\nu$ being the saltuses of the corresponding d.f. $F(x)$ at all its
discontinuities.
1 Hobson [1], II, p. 514. 2 Besicovitch [1], p. 6. 3 Besicovitch [1], p. 19.
4 Cf. e.g. Jessen-Wintner [1]. 5 Lévy [1], p. 171.
Remark. It is easily seen that we cannot have $|f(t_0)| = 1$
for any $t_0 \ne 0$ unless $F(x)$ is of type II and any two
discontinuities $x_\nu$ differ by a multiple of $2\pi/t_0$. Hence it follows that, if
$\limsup_{|t| \to \infty} |f(t)| < 1$, then $|f(t)| < k < 1$ for $|t| \ge \epsilon$, however
small $\epsilon > 0$ is chosen.
For a later purpose (cf. Theorem 26), we shall in this
connection prove the following lemma.
Lemma 1. If $f(t)$ is a c.f. such that $|f(t)| \le k < 1$ as soon as
$b \le |t| \le 2b$, then we have for $|t| < b$
$|f(t)| \le 1 - (1 - k^2)\,\dfrac{t^2}{8b^2}.$
From the elementary inequality $\cos t \le \tfrac{3}{4} + \tfrac{1}{4} \cos 2t$ we obtain
$|f(t)|^2 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{it(x - y)}\,dF(x)\,dF(y)$
$= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \cos t(x - y)\,dF(x)\,dF(y)$
$\le \tfrac{3}{4} + \tfrac{1}{4}\,|f(2t)|^2.$
For $b/2 \le |t| < b$ we thus have by hypothesis
$|f(t)|^2 \le 1 - \tfrac{1}{4}(1 - k^2).$
Repeating the same argument, we conclude that for
$b/2^n \le |t| < b/2^{n-1},$
where $n$ is an arbitrary positive integer, we have
$|f(t)|^2 \le 1 - \dfrac{1 - k^2}{4^n} < 1 - (1 - k^2)\,\dfrac{t^2}{4b^2},$
and thus $|f(t)| < 1 - (1 - k^2)\,\dfrac{t^2}{8b^2}$.
As $n$ is arbitrary, this proves our assertion for any $t$ such that
$0 < |t| < b$. For $t = 0$, we have $f(0) = 1$, and thus the lemma is
proved.
2. If the absolute moment $\beta_k$ is finite for some positive integer
$k$, (24) may be differentiated $k$ times, and it follows that $f^{(\nu)}(t)$
exists as a bounded and uniformly continuous function for
$\nu = 1, 2, \ldots, k$. Obviously $f^{(\nu)}(0) = i^\nu \alpha_\nu$, and so we obtain by
MacLaurin's theorem¹ for small values of $|t|$
(25) $f(t) = 1 + \sum_{\nu=1}^{k} \dfrac{\alpha_\nu}{\nu!}\,(it)^\nu + o(t^k).$
For sufficiently small values of $|t|$, the branch of $\log f(t)$ which
tends to zero with $t$ may be developed in MacLaurin's series up
to the term of order $k$, and thus we have, introducing a new
sequence of parameters,
(26) $\log f(t) = \sum_{\nu=1}^{k} \dfrac{\gamma_\nu}{\nu!}\,(it)^\nu + o(t^k).$
A comparison of (25) and (26) shows that $\gamma_n$ is a polynomial in
$\alpha_1, \alpha_2, \ldots, \alpha_n$, and that $\gamma_1 = \alpha_1$, $\gamma_2 = \alpha_2 - \alpha_1^2$. In the particularly
important case $\alpha_1 = 0$, we have
$\gamma_1 = 0, \quad \gamma_2 = \alpha_2, \quad \gamma_3 = \alpha_3, \quad \gamma_4 = \alpha_4 - 3\alpha_2^2, \ldots.$
The $\gamma_\nu$ are called the semi-invariants of the distribution.²
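The comparison of (25) and (26) can be carried out mechanically. The sketch below does not expand the logarithm; it uses instead the recurrence $\gamma_n = \alpha_n - \sum_{j=1}^{n-1} \binom{n-1}{j-1} \gamma_j \alpha_{n-j}$, which is not stated in the text but follows from differentiating the identity $f = \exp(\log f)$. The function name `semi_invariants` and the sample moment sequences are assumptions made for the illustration.

```python
from math import comb

def semi_invariants(alpha):
    """Semi-invariants gamma_1..gamma_k from moments alpha_1..alpha_k.

    alpha[n - 1] holds alpha_n. The recurrence used here is the standard
    moment/semi-invariant relation, obtained from f'(t) = f(t) (log f)'(t).
    """
    k = len(alpha)
    m = [1] + list(alpha)              # m[0] = alpha_0 = 1
    gamma = [0] * (k + 1)
    for n in range(1, k + 1):
        gamma[n] = m[n] - sum(
            comb(n - 1, j - 1) * gamma[j] * m[n - j] for j in range(1, n)
        )
    return gamma[1:]

# Moments with alpha_1 = 0: gamma_4 should equal alpha_4 - 3 * alpha_2^2.
print(semi_invariants([0, 1, 0, 3]))     # [0, 1, 0, 0]

# A hypothetical sequence with alpha_1 != 0: gamma_2 = alpha_2 - alpha_1^2.
print(semi_invariants([2, 6, 22, 94]))   # [2, 2, 2, 2]
```

The first sequence has vanishing $\gamma_3$ and $\gamma_4$; the second gives equal semi-invariants of all four orders.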
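For any $n \le k$, it follows from (25) and (26) that $\gamma_n/n!$ is equal
to the coefficient of $z^n$ in the development of $\log\left(1 + \sum_{\nu=1}^{k} \dfrac{\alpha_\nu}{\nu!}\,z^\nu\right)$
as a power series in $z$. According to (20), this series is majorated
by the series
$-\log\left[1 - \sum_{\nu=1}^{\infty} \dfrac{(\beta_n^{1/n} z)^\nu}{\nu!}\right] = \sum_{\nu=1}^{\infty} \dfrac{(e^{\beta_n^{1/n} z} - 1)^\nu}{\nu},$
and so we have
$\dfrac{|\gamma_n|}{n!} \le \left(\text{coeff. of } z^n \text{ in } \sum_{\nu=1}^{n} \dfrac{1}{\nu}\,e^{\nu \beta_n^{1/n} z}\right)$
or
(27) $|\gamma_n| \le n^n \beta_n.$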
3. According to (24), the c.f. $f(t)$ is uniquely determined by
the d.f. $F(x)$. We now proceed to prove a group of theorems
which show inter alia that, conversely, $F(x)$ is uniquely
determined by $f(t)$.
1 A form of the remainder in MacLaurin's series which yields (25) does not often
occur in text-books. It is, however, easily deduced from the ordinary Lagrange form.
2 Thiele [1].
Theorem 9.¹ If $F(x)$ is continuous for $x = \xi$ and for $x = \xi + h$,
we have
$F(\xi + h) - F(\xi) = \lim_{T \to \infty} \dfrac{1}{2\pi} \int_{-T}^{T} \dfrac{1 - e^{-ith}}{it}\,e^{-it\xi} f(t)\,dt.$
Before proving the theorem, we shall use it to prove the
identity of any two d.f.'s $F_1(x)$ and $F_2(x)$ having the same c.f.
$f(t)$. As a matter of fact, Theorem 9 shows that the differences
$F_1(x) - F_1(y)$ and $F_2(x) - F_2(y)$ coincide for almost all values of
$x$ and $y$. If $y \to -\infty$, it follows that $F_1(x) = F_2(x)$ for almost all $x$.
Since every d.f. is continuous to the right, the equality must hold
generally.
In order to prove the theorem, we may clearly suppose $h > 0$.
Then
$\dfrac{1}{2\pi} \int_{-T}^{T} \dfrac{1 - e^{-ith}}{it}\,e^{-it\xi} f(t)\,dt = \int_{-\infty}^{\infty} \psi\,dF(x),$
where
$\psi = \psi(x, \xi, h, T) = \dfrac{1}{\pi} \int_0^T \dfrac{\sin t(x - \xi)}{t}\,dt - \dfrac{1}{\pi} \int_0^T \dfrac{\sin t(x - \xi - h)}{t}\,dt.$
Given any $\epsilon > 0$, we now choose $\delta$ such that the sum of the
variations of $F(x)$ over the intervals $|x - \xi| \le \delta$ and $|x - \xi - h| \le \delta$
is less than $\epsilon$. This is possible, since by hypothesis $F(x)$ is
continuous for $x = \xi$ and for $x = \xi + h$.
If $T \to \infty$, while $\xi$ and $h$ remain fixed, $\psi$ tends uniformly to 0
in the intervals $x < \xi - \delta$ and $x > \xi + h + \delta$, and to 1 in the interval
$\xi + \delta < x < \xi + h - \delta$. In the remaining intervals $|x - \xi| \le \delta$ and
$|x - \xi - h| \le \delta$ we have $|\psi| < 2$.
It thus follows that, for all sufficiently large values of $T$, the
integral $\int_{-\infty}^{\infty} \psi\,dF(x)$ differs from $\int_{\xi + \delta}^{\xi + h - \delta} dF(x)$ by a quantity of
modulus less than $3\epsilon$. If $\delta$ is sufficiently small, the last integral
comes, however, as close as we please to $F(\xi + h) - F(\xi)$. Thus
the theorem is proved.²
1 Lévy [1], p. 166.
2 It is easy to show that, if the definition of a d.f. is modified, so that in a point
of discontinuity we put $F(x) = \tfrac{1}{2}[F(x + 0) + F(x - 0)]$, then Theorem 9 holds for
all values of $\xi$ and $h$.
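The inversion formula of Theorem 9 lends itself to a direct numerical check. The sketch below is an illustration under stated assumptions: it takes the c.f. $e^{-t^2/2}$ of the normal distribution with mean 0 and s.d. 1 (a distribution not yet introduced at this point of the text), truncates the integral at a hypothetical $T = 12$, and uses the trapezoidal rule. The result is compared with the value of $F(\xi + h) - F(\xi)$ obtained from the error function.

```python
import cmath
import math

def normal_cf(t):
    """c.f. of the normal distribution with mean 0 and s.d. 1 (assumed example)."""
    return math.exp(-0.5 * t * t)

def inversion_increment(cf, xi, h, T=12.0, steps=6000):
    """Trapezoidal approximation of
    (1/2 pi) * integral over (-T, T) of (1 - e^{-ith})/(it) * e^{-it xi} * cf(t) dt."""
    dt = 2.0 * T / steps
    total = 0.0 + 0.0j
    for j in range(steps + 1):
        t = -T + j * dt
        if abs(t) < 1e-12:
            kernel = complex(h, 0.0)       # limit of (1 - e^{-ith})/(it) as t -> 0
        else:
            kernel = (1.0 - cmath.exp(-1j * t * h)) / (1j * t)
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * kernel * cmath.exp(-1j * t * xi) * cf(t)
    return (total * dt).real / (2.0 * math.pi)

xi, h = -1.0, 2.0
approx = inversion_increment(normal_cf, xi, h)
exact = 0.5 * (math.erf((xi + h) / math.sqrt(2)) - math.erf(xi / math.sqrt(2)))
print(approx, exact)
```

Here the integrand decreases so rapidly that the truncation at finite $T$ is harmless; for a c.f. which does not tend to zero, the integral is only conditionally convergent and such a simple quadrature would not be adequate.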
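The integral appearing in Theorem 9 is, in the general case,
only conditionally convergent as $T \to \infty$. We shall now prove a
similar theorem which contains an absolutely convergent integral.
For any given d.f. $F(x)$ and for any $h > 0$, the function
$\dfrac{1}{h} \int_x^{x+h} F(u)\,du$
is obviously a continuous d.f. The corresponding c.f. is found by
(24) to be $\dfrac{1 - e^{-ith}}{ith}\,f(t)$. Replacing in Theorem 9 $F(x)$ by this
new d.f., we thus obtain for all values of $\xi$ and $h$
(28) $\dfrac{1}{h} \int_{\xi+h}^{\xi+2h} F(u)\,du - \dfrac{1}{h} \int_{\xi}^{\xi+h} F(u)\,du = \dfrac{1}{2\pi h} \int_{-\infty}^{\infty} \left(\dfrac{1 - e^{-ith}}{it}\right)^2 e^{-it\xi} f(t)\,dt.$
Substituting here $\xi - h$ for $\xi$, we obtain after an easy
transformation of the integral on the right-hand side the following theorem.
Theorem 10. For all real $\xi$ and for all $h > 0$, we have
$\dfrac{1}{h} \int_{\xi}^{\xi+h} F(u)\,du - \dfrac{1}{h} \int_{\xi-h}^{\xi} F(u)\,du = \dfrac{1}{\pi} \int_{-\infty}^{\infty} \left(\dfrac{\sin t}{t}\right)^2 e^{-2it\xi/h} f\left(\dfrac{2t}{h}\right) dt.$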
This can, of course, also be proved directly, without the use
of Theorem 9. We are now in a position to prove the following
important theorem.
Theorem 11.¹ Let $\{F_n(x)\}$ be a sequence of d.f.'s, and $\{f_n(t)\}$
the corresponding sequence of c.f.'s. A necessary and sufficient
condition for the convergence of $\{F_n(x)\}$ to a d.f. $F(x)$, in every
continuity point of the latter, is that the sequence $\{f_n(t)\}$ of c.f.'s
converges for every $t$ to a limit $f(t)$, which is continuous for the
special value $t = 0$.
When this condition is satisfied, the limit $f(t)$ is identical with the
c.f. of the limiting d.f. $F(x)$, and $f_n(t)$ converges to $f(t)$ uniformly in
every finite $t$-interval. This implies, in particular, that the limit $f(t)$
is then continuous for all $t$.
1 In a slightly less precise form, this theorem was first proved by Lévy [1], pp.
195-197. Cf. also Bochner [1], p. 72. It should be observed that the theorem
becomes false if we omit the assumption that the limit $f(t)$ is continuous at $t = 0$.
Choosing, in fact, $f_n(t) = e^{-nt^2}$, we have $f(t) = 1$ for $t = 0$, and $f(t) = 0$ for $t \ne 0$, so
that $f(t)$ is discontinuous at $t = 0$. Accordingly, the corresponding sequence of d.f.'s
$\{F_n(x)\}$ tends for every $x$ to the limit $F(x) = \tfrac{1}{2}$, which is not a d.f.
That the condition is necessary follows almost immediately
from the definition (24) of a c.f. In fact, if $F_n(x) \to F(x)$ in every
continuity point of $F(x)$, where $F(x)$ is a d.f., we can choose
$M = M(\epsilon)$ such that $\left|\int_{|x| > M} e^{itx}\,dF_n(x)\right| < \epsilon$ for all $n$, $\epsilon > 0$ being
given. In particular $M$ can be so chosen that $F(x)$ is continuous
for $x = \pm M$. According to the theory of Stieltjes integrals, we
then have
$\int_{-M}^{M} e^{itx}\,dF_n(x) \to \int_{-M}^{M} e^{itx}\,dF(x),$
uniformly in every finite $t$-interval. The last integral differs,
however, from the c.f. $f(t)$ of $F(x)$ by a quantity of modulus less
than $\epsilon$, if $M$ is sufficiently large. Thus $f_n(t) \to f(t)$ as $n \to \infty$,
uniformly in every finite $t$-interval.
The main difficulty lies in the proof that the condition is
sufficient. We then assume that $f_n(t)$ tends for every $t$ to a limit
$f(t)$ which is continuous for $t = 0$, and we shall prove that under
this hypothesis $F_n(x) \to F(x)$ in every continuity point of $F(x)$,
where $F(x)$ is a d.f. If this is proved it follows from the first
part of the proof that the limit $f(t)$ is identical with the c.f. of
$F(x)$, and that $f_n(t)$ converges to $f(t)$ uniformly in every finite
$t$-interval.
In order to prove this, we choose from the sequence $\{F_n(x)\}$
a sub-sequence $F_{n_1}(x), F_{n_2}(x), \ldots$, such that $F_{n_\nu}(x)$ converges to
a never decreasing function $F(x)$, in every continuity point of
$F(x)$. It is well known that this can always be done, and
obviously we may suppose that $F(x)$ is everywhere continuous to
the right. We shall now prove that $F(x)$ is a d.f. As we already
know that $F(x)$ is never decreasing, and we obviously have
$0 \le F(x) \le 1$ for all $x$, it is sufficient to prove that
$F(+\infty) - F(-\infty) = 1.$
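From Theorem 10 we obtain, putting $\xi = 0$,
(29) $\dfrac{1}{h} \int_0^h F_{n_\nu}(u)\,du - \dfrac{1}{h} \int_{-h}^0 F_{n_\nu}(u)\,du = \dfrac{1}{\pi} \int_{-\infty}^{\infty} \left(\dfrac{\sin t}{t}\right)^2 f_{n_\nu}\left(\dfrac{2t}{h}\right) dt.$
On both sides of this relation, we may allow $\nu$ to tend to infinity
under the integral signs. In fact, it is easily seen that the
convergence conditions for Lebesgue-Stieltjes integrals given in
I, 3, are satisfied by the integrals occurring here, and we thus
obtain
(30) $\dfrac{1}{h} \int_0^h F(u)\,du - \dfrac{1}{h} \int_{-h}^0 F(u)\,du = \dfrac{1}{\pi} \int_{-\infty}^{\infty} \left(\dfrac{\sin t}{t}\right)^2 f\left(\dfrac{2t}{h}\right) dt.$
Let now $h \to \infty$. As $F(x)$ is a never decreasing function, the
first member of (30) tends to $F(+\infty) - F(-\infty)$. By assumption
$f(t)$ is continuous for $t = 0$, so that $f(2t/h)$ tends for every fixed $t$ to
the limit $f(0)$. We have, however, $f(0) = \lim_{n \to \infty} f_n(0)$, and $f_n(0) = 1$
for all $n$, since $f_n(t)$ is a c.f. Hence $f(0) = 1$. Applying once more
the convergence properties of integrals (I, 3), we thus obtain
$F(+\infty) - F(-\infty) = \dfrac{1}{\pi} \int_{-\infty}^{\infty} \left(\dfrac{\sin t}{t}\right)^2 dt = 1.$
(The value of the last integral may be obtained, e.g. by letting
$h \to \infty$ in (29).)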
We have thus proved that the sub-sequence $\{F_{n_\nu}(x)\}$ tends to
a d.f. $F(x)$, in every continuity point of $F(x)$. By the first part
of the proof it then follows that the limit $f(t)$ of the corresponding
c.f.'s must be identical with the c.f. of $F(x)$.
Consider now another convergent sub-sequence of $\{F_n(x)\}$, and
denote the limit of the new sub-sequence by $F^*(x)$, always
assuming this function to be determined so as to be everywhere
continuous to the right. In the same way as before, it is then
shown that $F^*(x)$ is a d.f. By hypothesis the c.f.'s of the new
sub-sequence have, however, for all values of $t$ the same limit
$f(t)$ as before, so that $f(t)$ is the c.f. of both $F(x)$ and $F^*(x)$. But
then it follows from the remarks made in connection with
Theorem 9 that we have $F(x) = F^*(x)$ for all $x$. Thus every
convergent sub-sequence of $\{F_n(x)\}$ has the same limit $F(x)$. This
is, however, equivalent to the statement that the sequence
$\{F_n(x)\}$ converges to $F(x)$, and since we have shown that $F(x)$
is a d.f., our theorem is proved.
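The uniform convergence of the c.f.'s asserted in Theorem 11 can be observed numerically. The sketch below is an illustration that is not taken from this section: it assumes the c.f. $(1 + \lambda(e^{it} - 1)/n)^n$ of the binomial distribution with $n$ trials and probability $\lambda/n$, and the c.f. $\exp(\lambda(e^{it} - 1))$ of the Poisson distribution, with a hypothetical value $\lambda = 2$, and measures the largest difference on a grid covering one period.

```python
import cmath
import math

LAM = 2.0   # hypothetical parameter of the limiting Poisson distribution

def cf_binomial(t, n):
    """c.f. of a binomial variable with n trials and success probability LAM/n."""
    return (1.0 + LAM * (cmath.exp(1j * t) - 1.0) / n) ** n

def cf_poisson(t):
    """c.f. of the Poisson distribution with parameter LAM."""
    return cmath.exp(LAM * (cmath.exp(1j * t) - 1.0))

# Both c.f.'s have period 2*pi, so a grid on (-pi, pi) covers every t.
grid = [-math.pi + 2.0 * math.pi * j / 400 for j in range(401)]

def sup_error(n):
    """Largest value of |f_n(t) - f(t)| on the grid."""
    return max(abs(cf_binomial(t, n) - cf_poisson(t)) for t in grid)

errors = [sup_error(n) for n in (10, 100, 1000)]
print(errors)
```

The errors decrease roughly like $1/n$, in agreement with the convergence of the corresponding d.f.'s at every continuity point of the limit.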
4. Let us now consider a function $R(x)$ which is of bounded
variation in $(-\infty, +\infty)$, but not necessarily monotone. The
integral
(31) $r(t) = \int_{-\infty}^{\infty} e^{itx}\,dR(x)$
is then bounded and uniformly continuous for all real $t$. In
Chapter VII, we shall require the following theorem.
Theorem 12. Let $R(x)$ be of bounded variation in $(-\infty, +\infty)$,
and suppose that $R(x) \to 0$ as $x \to \pm\infty$, so that
(32) $r(0) = \int_{-\infty}^{\infty} dR(x) = 0.$
Suppose further that the integral
$\int_{-\infty}^{\infty} |x|\,|dR(x)|$
is convergent. For $0 < \omega < 1$, for all real $x$ and all $h > 0$ we then have
(33) $\int_x^{x+h} (y - x)^{\omega - 1} R(y)\,dy = -\dfrac{1}{2\pi i} \int_{-\infty}^{\infty} \dfrac{r(t)}{t}\,e^{-itx}\,dt \int_0^h u^{\omega - 1} e^{-itu}\,du.$
If, moreover, the integral
(34) $\int_{-\infty}^{\infty} \left|\dfrac{r(t)}{t}\right| dt$
is convergent, we have
(35) $R(x) = -\dfrac{1}{2\pi i} \int_{-\infty}^{\infty} \dfrac{r(t)}{t}\,e^{-itx}\,dt.$
We observe that the conditions of the first part of the theorem
are satisfied, in particular, whenever $R(x)$ is the difference between
two d.f.'s with finite mean values. In this case, $r(t)$ is the
difference between the corresponding c.f.'s.
In order to prove the theorem, we shall first show that both
members of (33) are continuous functions of $x$ and $h$, when $\omega$ is
fixed between 0 and 1. In respect of the first member, this is
readily seen by writing this member in the form
$h^\omega \int_0^1 y^{\omega - 1} R(x + hy)\,dy.$
In respect of the second member, we have already remarked
that $r(t)$ is bounded for all $t$, and by the argument used in IV, 2,
we have $r(t) = O(t)$ as $t \to 0$. Moreover, we have for $t \ne 0$
(36) $\left|\int_0^h u^{\omega - 1} e^{-itu}\,du\right| = \dfrac{1}{|t|^\omega} \left|\int_0^{h|t|} u^{\omega - 1} e^{-iu}\,du\right| < \dfrac{C}{\omega\,|t|^\omega},$
where $C$ is an absolute constant. It follows that the integral
with respect to $t$ in the second member of (33) is absolutely and
uniformly convergent for all $x$ and $h$, and accordingly represents
a continuous function.
Without restricting the generality, we may thus assume for
the proof of (33) that $x$ and $x + h$ are continuity points of $R(x)$.
The second member of (33) is the limit, as $M \to \infty$, of the
expression
$-\dfrac{1}{2\pi i} \int_{-M}^{M} \dfrac{r(t)}{t}\,e^{-itx}\,dt \int_0^h u^{\omega - 1} e^{-itu}\,du$
$= -\dfrac{1}{2\pi i} \int_0^h u^{\omega - 1}\,du \int_{-M}^{M} \dfrac{e^{-it(x + u)}}{t}\,dt \int_{-\infty}^{\infty} e^{ity}\,dR(y)$
$= -\dfrac{1}{\pi} \int_{-\infty}^{\infty} dR(y) \int_0^h u^{\omega - 1}\,du \int_0^M \dfrac{\sin t(y - x - u)}{t}\,dt.$
According to the convergence properties of integrals (I, 3), we
may here allow $M$ to tend to infinity under the integral. Using
the well-known properties of trigonometric integrals and
observing that by assumption we have $R(\pm\infty) = 0$, we then find
that the second member of (33) is equal to
$-\dfrac{1}{\omega} \int_x^{x+h} (y - x)^\omega\,dR(y) + \dfrac{h^\omega}{\omega}\,R(x + h).$
An integration by parts now yields (33).
On the other hand, replacing the first member of (33) by the
expression just obtained, we obtain the relation
$\int_x^{x+h} (y - x)^\omega\,dR(y) - h^\omega R(x + h)$
$= \dfrac{\omega}{2\pi i} \int_{-\infty}^{\infty} \dfrac{r(t)}{t}\,e^{-itx}\,dt \int_0^h u^{\omega - 1} e^{-itu}\,du$
$= \dfrac{1}{2\pi i} \int_{-\infty}^{\infty} \dfrac{r(t)}{t}\,e^{-itx}\,dt \left(h^\omega e^{-ith} + it \int_0^h u^\omega e^{-itu}\,du\right).$
If we assume the convergence of (34), this gives (35) as $\omega \to 0$.
Thus Theorem 12 is proved.
CHAPTER V
ADDITION OF INDEPENDENT VARIABLES.
CONVERGENCE "IN PROBABILITY".
SPECIAL DISTRIBUTIONS
1. If $X$ and $Y$ are mutually independent random variables
with given d.f.'s $F_1(x)$ and $F_2(y)$, then by II, 4, the d.f. of the
combined variable $(X, Y)$ is $F_1(x)\,F_2(y)$. Thus the pr.f. $\mathfrak{P}(\mathfrak{S})$ of
$(X, Y)$ is, according to Theorem 2, uniquely determined by $F_1$
and $F_2$ for all two-dimensional Borel sets $\mathfrak{S}$.
The sum $X + Y$ is a one-dimensional vector function of the
variable $(X, Y)$, so that according to II, 6, its d.f. $F(z)$ is uniquely
determined by $\mathfrak{P}(\mathfrak{S})$, i.e. by $F_1$ and $F_2$.
Let $\mathfrak{S}_z$ denote the set of points $(X, Y)$ such that $X + Y \le z$.
Then by definition $F(z) = \mathfrak{P}(\mathfrak{S}_z)$.
Further, let $x_n$ be a sequence of real numbers steadily
increasing with $n$ from $-\infty$ to $+\infty$ and such that $x_{n+1} - x_n < h$ for
all $n$, where $h$ is a given number $> 0$. Denote by $\mathfrak{R}_n$ the infinite
rectangle defined by the inequalities $x_n < X \le x_{n+1}$, $Y \le z - x_n$,
and by $\mathfrak{r}_n$ the rectangle defined by $x_n < X \le x_{n+1}$, $Y \le z - x_{n+1}$.
Obviously $\sum \mathfrak{r}_n \subset \mathfrak{S}_z \subset \sum \mathfrak{R}_n$, while $\sum (\mathfrak{R}_n - \mathfrak{r}_n) \subset \mathfrak{S}_{z+h} - \mathfrak{S}_{z-h}$, the
sums being extended from $n = -\infty$ to $n = +\infty$.
Since $\mathfrak{P}(\mathfrak{S})$ is a pr.f., this gives according to (4)
$\sum \mathfrak{P}(\mathfrak{r}_n) \le \mathfrak{P}(\mathfrak{S}_z) \le \sum \mathfrak{P}(\mathfrak{R}_n)$
and
$\sum \mathfrak{P}(\mathfrak{R}_n) - \sum \mathfrak{P}(\mathfrak{r}_n) \le \mathfrak{P}(\mathfrak{S}_{z+h}) - \mathfrak{P}(\mathfrak{S}_{z-h}).$
The former inequality is equivalent to
$\sum F_2(z - x_{n+1})\,(F_1(x_{n+1}) - F_1(x_n)) \le F(z) \le \sum F_2(z - x_n)\,(F_1(x_{n+1}) - F_1(x_n)),$
while the latter shows that the difference between the limits thus
obtained for $F(z)$ does not exceed $F(z + h) - F(z - h)$. In every
continuity point of $F(z)$, both sums thus tend to $F(z)$ as $h \to 0$,
and by the ordinary definition of a Riemann-Stieltjes integral,¹
this limit is equal to $\int_{-\infty}^{\infty} F_2(z - x)\,dF_1(x)$. According to the
definition of a Lebesgue-Stieltjes integral given above (I, 3, and
III, 2), the last integral exists, however, for all values of $z$ and
is everywhere continuous to the right, so that it always represents
$F(z)$. Obviously $F_1$ and $F_2$ may be interchanged without altering
the value of the integral.
By Theorem 3, any two functions $g_1(X)$ and $g_2(Y)$ are mutually
independent, so that we have by (22) $E(g_1 g_2) = E(g_1)\,E(g_2)$.
As pointed out in IV, 1, this holds also if $g_1$ and $g_2$ are complex.
Thus in particular
$E(e^{it(X + Y)}) = E(e^{itX}) \cdot E(e^{itY}),$
so that we have proved the following theorem.
Theorem 13.² If $X$ and $Y$ are mutually independent random
variables with the d.f.'s $F_1$ and $F_2$, and the c.f.'s $f_1$ and $f_2$, then the
sum $X + Y$ has the d.f.
(37) $F(x) = \int_{-\infty}^{\infty} F_1(x - v)\,dF_2(v) = \int_{-\infty}^{\infty} F_2(x - v)\,dF_1(v),$
and the c.f.
(38) $f(t) = f_1(t)\,f_2(t).$
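For distributions of type II both statements of Theorem 13 reduce to finite sums and can be verified directly. In the sketch below the two component distributions (a die and a coin) are hypothetical, and the helper names `compose`, `df` and `cf` are introduced only for the illustration; `compose` forms the distribution of the sum of two independent variables.

```python
import cmath
from fractions import Fraction as Fr

def compose(p1, p2):
    """Distribution of X + Y for independent X, Y given as {value: probability}."""
    out = {}
    for x, p in p1.items():
        for y, q in p2.items():
            out[x + y] = out.get(x + y, 0) + p * q
    return out

def df(dist, x):
    """Distribution function: probability of a value <= x."""
    return sum(p for v, p in dist.items() if v <= x)

def cf(dist, t):
    """Characteristic function of a discrete distribution."""
    return sum(float(p) * cmath.exp(1j * t * v) for v, p in dist.items())

# Hypothetical components: a fair die and a fair coin scoring 0 or 1.
die = {k: Fr(1, 6) for k in range(1, 7)}
coin = {0: Fr(1, 2), 1: Fr(1, 2)}
total = compose(die, coin)

# (37): F(x) = sum over v of F_1(x - v) * (saltus of F_2 at v).
x = 3
via_37 = sum(df(die, x - v) * q for v, q in coin.items())
print(df(total, x), via_37)

# (38): the c.f. of the sum is the product of the c.f.'s.
t = 0.7
print(cf(total, t), cf(die, t) * cf(coin, t))
```

Interchanging the two components in `compose` gives the same result, in accordance with the symmetry of (37).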
When three d.f.'s satisfy (37), we shall say³ that $F$ is composed of the components $F_1$ and $F_2$, and we shall use the abbreviation

(37a)  $F = F_1 * F_2 = F_2 * F_1$.
According to (38) this symbolical multiplication of the d.f.'s corresponds to a genuine multiplication of the c.f.'s.

If the three variables $X_1$, $X_2$ and $X_3$ are mutually independent, then by Theorem 3 any $X_r$ is independent of the sum of the other two variables. A repeated application of Theorem 13 then shows that the sum $X_1 + X_2 + X_3$ has the d.f. $(F_1 * F_2) * F_3 = F_1 * (F_2 * F_3)$, and the c.f. $f_1 f_2 f_3$.

¹ Cf. Hobson [1], I, p. 538.
² A rigorous proof of this long used theorem, which expresses the fundamental property of the characteristic functions, has not been given until comparatively recently. Cf. Lévy [1], Bochner [1, 2], Wintner [1], Haviland [2].
³ Some writers use the expression: $F$ is the convolution (German: Faltung) of $F_1$ and $F_2$.
Obviously this may be generalized to any number of components, and it is thus seen that the operation of composition is commutative and associative. For the sum $X_1 + X_2 + \dots + X_n$ of $n$ mutually independent variables we have the d.f.

(39)  $F = F_1 * F_2 * \dots * F_n$

and the c.f.

(40)  $f = f_1 f_2 \dots f_n$.
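The composition rule (37) and the product rule (40) can be checked numerically in the purely discontinuous case. The following is a minimal sketch, not part of the original text: the three component distributions and the helper names `compose` and `cf` are hypothetical, chosen only for illustration. Probabilities are kept as exact fractions so that associativity and commutativity can be tested by equality.

```python
import cmath
from fractions import Fraction


def compose(p1, p2):
    """Compose two discrete distributions given as {value: probability}.

    Discrete form of (37): the mass of X + Y at z is the sum of
    p1[x] * p2[y] over all pairs with x + y = z.
    """
    out = {}
    for x, px in p1.items():
        for y, py in p2.items():
            out[x + y] = out.get(x + y, 0) + px * py
    return out


def cf(p, t):
    """Characteristic function E(exp(itX)) of a discrete distribution."""
    return sum(float(prob) * cmath.exp(1j * t * x) for x, prob in p.items())


# Hypothetical components (illustrative values only).
F1 = {0: Fraction(1, 2), 1: Fraction(1, 2)}
F2 = {0: Fraction(1, 3), 2: Fraction(2, 3)}
F3 = {-1: Fraction(1, 4), 0: Fraction(1, 4), 3: Fraction(1, 2)}

left = compose(compose(F1, F2), F3)    # (F1 * F2) * F3
right = compose(F1, compose(F3, F2))   # F1 * (F3 * F2)
```

Here `left` and `right` agree exactly, and `cf(left, t)` agrees with the product `cf(F1, t) * cf(F2, t) * cf(F3, t)` up to rounding, as (40) asserts.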
If at least one of the components $F_\nu$ is continuous, it follows from (37) that the composite $F$ is also continuous.¹ Similarly, if at least one of the $F_\nu$ is absolutely continuous, this holds also for $F$. If, on the other hand, all the $F_\nu$ have discontinuities, then $F$ has also discontinuities, and the set of discontinuity points of $F$ consists of all points $x$ representable in the form

$x = x^{(1)} + x^{(2)} + \dots + x^{(n)}$,

where $x^{(\nu)}$ is a discontinuity point of $F_\nu$.
2. Suppose that the absolute moments of order $k$ are finite for all the mutually independent variables $X_1, X_2, \dots, X_n$. The inequality $|X_1 + \dots + X_n|^k \le n^k\,(|X_1|^k + \dots + |X_n|^k)$, which holds in every point of the space of the variables $X_1, \dots, X_n$, then shows that the $k$th absolute moment of the sum $X_1 + \dots + X_n$ is also finite.
Further, if $\alpha_1^{(\nu)}, \alpha_2^{(\nu)}, \dots$ are the moments of $F_\nu$, while $\alpha_1, \alpha_2, \dots$ are those of the d.f. $F$ composed according to (39), it follows from (25) and (40) that the coefficients of $t, t^2, \dots, t^k$ in the polynomials

$1 + \sum_{r=1}^{k} \frac{\alpha_r}{r!}\,t^r$  and  $\prod_{\nu=1}^{n} \Bigl(1 + \sum_{r=1}^{k} \frac{\alpha_r^{(\nu)}}{r!}\,t^r\Bigr)$

are identical. Using a symbolical notation, we may write

(41)  $\alpha_r = (\alpha^{(1)} + \alpha^{(2)} + \dots + \alpha^{(n)})^r$,

where after the expansion of the $r$th power, every $(\alpha^{(\nu)})^p$ should be replaced by $\alpha_p^{(\nu)}$. In particular, we have $\alpha_1 = \sum_\nu \alpha_1^{(\nu)}$ and $\alpha_2 - \alpha_1^2 = \sum_\nu \bigl(\alpha_2^{(\nu)} - (\alpha_1^{(\nu)})^2\bigr)$, in accordance with the relations already obtained in III, 4.

¹ Hence follows the truth of the statement made in IV, 1: if $f$ is the c.f. of a continuous distribution, then the same holds for $|f|^2 = f \cdot \bar{f}$.
The semi-invariants introduced in IV, 2, behave in a very simple way when the corresponding probability distributions are composed. Let $\gamma_1^{(\nu)}, \dots, \gamma_k^{(\nu)}$ denote the $k$ first semi-invariants of $F_\nu$, which are by hypothesis all finite. Then if $\gamma_1, \dots, \gamma_k$ are the corresponding semi-invariants of $F$, it follows from (26) and (40) that

(42)  $\gamma_r = \gamma_r^{(1)} + \gamma_r^{(2)} + \dots + \gamma_r^{(n)}$  $(r = 1, 2, \dots, k)$.
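The additivity (42) may be illustrated for the first three semi-invariants, which are expressed by the moments through the standard relations $\gamma_1 = \alpha_1$, $\gamma_2 = \alpha_2 - \alpha_1^2$, $\gamma_3 = \alpha_3 - 3\alpha_1\alpha_2 + 2\alpha_1^3$. The sketch below is an illustration under assumptions: the two discrete distributions and all function names are hypothetical, and exact fractions are used so that (42) can be verified as an identity.

```python
from fractions import Fraction


def moment(p, r):
    """Moment of order r of a discrete distribution {value: probability}."""
    return sum(prob * x**r for x, prob in p.items())


def semi_invariants(p):
    """First three semi-invariants, computed from the moments."""
    a1, a2, a3 = (moment(p, r) for r in (1, 2, 3))
    return (a1, a2 - a1**2, a3 - 3 * a1 * a2 + 2 * a1**3)


def compose(p1, p2):
    """Distribution of the sum of two independent discrete variables."""
    out = {}
    for x, px in p1.items():
        for y, py in p2.items():
            out[x + y] = out.get(x + y, 0) + px * py
    return out


# Hypothetical components (illustrative values only).
G1 = {-2: Fraction(1, 5), 0: Fraction(3, 10), 1: Fraction(1, 2)}
G2 = {0: Fraction(2, 3), 4: Fraction(1, 3)}

gamma_sum = semi_invariants(compose(G1, G2))
gamma_parts = tuple(a + b for a, b in zip(semi_invariants(G1),
                                          semi_invariants(G2)))
```

The tuples `gamma_sum` and `gamma_parts` coincide exactly.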
3. Let $X_1, X_2, \dots$ be a sequence of random variables. We shall say that $X_n$ converges in probability¹ (briefly: "converges i.pr.") to a constant $A$ if, for every $\varepsilon > 0$, the probability of the relation $|X_n - A| > \varepsilon$ tends to zero as $n \to \infty$. We shall also say that $X_n$ converges i.pr. to a random variable $X$, if the variable $X_n - X$ converges i.pr. to zero.
A proper treatment of the questions connected with this mode of convergence cannot be given without introducing probability distributions in spaces of an infinite number of dimensions. A few simple theorems will, however, be given here.
A necessary and sufficient condition that $X_n$ converges i.pr. to a constant $A$ is obviously that the d.f. of $X_n - A$ tends, for any $x \ne 0$, to the particular d.f. $\epsilon(x)$ defined by (17). By Theorem 11, an equivalent condition is that the corresponding c.f. tends to 1 for all $t$.
If, for a sequence of variables $Z_1, Z_2, \dots$, the mean value $E(Z_n)$ and the s.d. $D(Z_n)$ are finite for all $n$, and if $D(Z_n) \to 0$ as $n \to \infty$, it follows immediately from the Bienaymé-Tchebycheff inequality (III, 3) that $Z_n - E(Z_n)$ converges i.pr. to zero. From this remark, we deduce at once the following theorem.

¹ Cantelli [1], Slutsky [1], Fréchet [1], Kolmogoroff [4]. A full discussion of the various modes of convergence of sequences of random variables is contained in the recently published treatise by Fréchet [2].
Theorem 14. Let $X_1, X_2, \dots$ be independent variables such that $E(X_n) = m_n$ and $D(X_n) = \sigma_n$, and put

$Z_n = \frac{1}{n}(X_1 + \dots + X_n)$,  $M_n = \frac{1}{n}(m_1 + \dots + m_n)$.

If $\sigma_1^2 + \dots + \sigma_n^2 = o(n^2)$, then $Z_n - M_n$ converges i.pr. to zero.
We have, in fact, $E(Z_n) = M_n$ and $D^2(Z_n) = \frac{1}{n^2}(\sigma_1^2 + \dots + \sigma_n^2)$. In the particular case when all the $X_n$ have the same probability distribution, we have $M_n = m_n = m$, say, and $\sigma_1^2 + \dots + \sigma_n^2 = n\sigma^2 = o(n^2)$, so that $Z_n$ converges i.pr. to $m$.
If, for the independent variables $X_n$ considered in Theorem 14, the existence of finite mean values and variances is not assumed, we may still ask if it is possible to find constants $M_n$ such that $\frac{1}{n}(X_1 + \dots + X_n) - M_n = Z_n - M_n$ converges i.pr. to zero. When all the $X_n$ have the same distribution, the following theorem holds.
Theorem 15.¹ Let $X_1, X_2, \dots$ be independent variables all having the same d.f. $F(x)$, and put $Z_n = \frac{1}{n}(X_1 + \dots + X_n)$. A necessary and sufficient condition for the existence of a sequence of constants $M_1, M_2, \dots$ such that $Z_n - M_n$ converges i.pr. to zero, is that

$\int_{|x| > z} dF(x) = o(1/z)$

as $z \to \infty$. This condition being satisfied, we can always take²

$M_n = \int_{-n}^{n} x\,dF(x)$.
1. The condition is necessary. Denoting as usual by $f(t)$ the c.f. corresponding to $F(x)$, the c.f. of $Z_n - M_n$ is

$e^{-M_n it}\,[f(t/n)]^n = 1 + \lambda_n(t)$.

¹ Kolmogoroff [1], and [4], p. 57. Cf. also Khintchine [4].
² If, in addition, the generalized mean value $\mu = \lim_{z \to \infty} \int_{-z}^{z} x\,dF(x)$ exists, it follows that $Z_n$ converges i.pr. to $\mu$. If the ordinary mean value as defined in III, 2 exists, it is easily seen that the condition of the theorem is always satisfied.

If $Z_n - M_n$ converges i.pr. to zero, then according to the remark made above the corresponding d.f. tends to $\epsilon(x)$ and thus by Theorem 11 $\lambda_n(t)$ tends to zero, uniformly in every finite t-interval. Taking the $n$th root we have therefore as soon as
JAn(t) I< 1
I _M",U (t) I I I J
je "!;;:, -1
while the left side is bounded by 2 for all $n$ and $t$. Thus

$f(t/n) = e^{M_n it/n} + \frac{\theta(n,t)}{n}$,

where $|\theta(n,t)| \le 2n$ for all $n$ and $t$, and $\theta(n,t)$ tends to zero as $n \to \infty$, uniformly in every finite t-interval. Since $f(t/n) \to 1$ as $n \to \infty$, it follows that we have $M_n = o(n)$. From Theorem 10 we then obtain, putting $h = n$,

(43)  $\frac{1}{n}\int_{\xi}^{\xi+n} F(v)\,dv - \frac{1}{n}\int_{\xi-n}^{\xi} F(v)\,dv = \frac{1}{\pi}\int_{-\infty}^{\infty} \Bigl(\frac{\sin t}{t}\Bigr)^2 e^{-2\xi it/n}\Bigl(e^{2M_n it/n} + \frac{\theta(n,2t)}{n}\Bigr)dt$.

Now $e^{M_n it}$ is the c.f. of the d.f. $\epsilon(x - M_n)$, where $\epsilon(x)$ is defined by (17). The contribution to the second member of (43) arising from the term $e^{2M_n it/n}$ is thus by Theorem 10 equal to the value assumed by the first member if $F(v)$ is replaced by $\epsilon(v - M_n)$. This value is, however, equal to zero if $|\xi - M_n| > n$. For all $\xi$ satisfying this condition, we thus obtain, since $\theta(n,t)$ tends uniformly to zero,

$\frac{1}{n}\int_{\xi}^{\xi+n} F(v)\,dv - \frac{1}{n}\int_{\xi-n}^{\xi} F(v)\,dv < \frac{\eta(n)}{n}$,

where $\eta(n) \to 0$ as $n \to \infty$. On the other hand we have

$\frac{1}{n}\int_{\xi}^{\xi+n} F(v)\,dv - \frac{1}{n}\int_{\xi-n}^{\xi} F(v)\,dv = \int_{\xi-n}^{\xi+n} \Bigl(1 - \frac{|v - \xi|}{n}\Bigr)dF(v) \ge \tfrac{1}{2}\{F(\xi + \tfrac{1}{2}n) - F(\xi - \tfrac{1}{2}n)\}$,

and thus

$F(\xi + \tfrac{1}{2}n) - F(\xi - \tfrac{1}{2}n) < \frac{2\eta(n)}{n}$

for all $\xi$ such that $|\xi - M_n| > n$. Since $M_n = o(n)$, we may for all sufficiently large $n$ put $\xi = \pm\tfrac{3}{2}n$, so that we obtain

$F(2n) - F(n) < \frac{2\eta(n)}{n}$  and  $F(-n) - F(-2n) < \frac{2\eta(n)}{n}$.

Replacing here successively $n$ by $2n, 2^2 n, \dots$ and adding, we obtain the desired result, as the restriction of $n$ to integral values is obviously not essential.
2. The condition is sufficient. Taking $M_n = \int_{-n}^{n} x\,dF(x)$, we have by a partial integration

$|M_n| \le \int_{-n}^{n} |x|\,dF(x) = -n\int_{|v|>n} dF(v) + \int_0^n dx \int_{|v|>x} dF(v) = O(1) + o\Bigl(\int_1^n \frac{dx}{x}\Bigr) = o(\log n)$,

and in the same way

$\int_{-n}^{n} x^2\,dF(x) = -n^2\int_{|v|>n} dF(v) + 2\int_0^n x\,dx \int_{|v|>x} dF(v) = o(n)$.

The c.f. of $Z_n - M_n$ may be written

(44)  $\Bigl[1 + \int_{-\infty}^{\infty} \bigl(e^{it(x - M_n)/n} - 1\bigr)\,dF(x)\Bigr]^n$.

Now we have by hypothesis

$\int_{-\infty}^{\infty} \bigl(e^{it(x - M_n)/n} - 1\bigr)\,dF(x) = \int_{-n}^{n} \bigl(e^{it(x - M_n)/n} - 1\bigr)\,dF(x) + \frac{\theta(n,t)}{n}$,

where $\theta(n,t)$ tends to zero as $n \to \infty$, uniformly in every finite t-interval. According to the definition of $M_n$, we may write

(45)  $n\int_{-\infty}^{\infty} \bigl(e^{it(x - M_n)/n} - 1\bigr)\,dF(x) = n\int_{-n}^{n} \Bigl(e^{it(x - M_n)/n} - 1 - \frac{it(x - M_n)}{n}\Bigr)dF(x) + \theta(n,t)$.

The first term in the second member of (45) is, however, of modulus less than

$\frac{t^2}{2n}\int_{-n}^{n} (x - M_n)^2\,dF(x)$,

and according to the above inequalities this tends to zero as $n \to \infty$, uniformly in every finite t-interval. Since $(1 + \alpha_n)^n \to 1$ if $n\alpha_n \to 0$, it thus follows from (44) and (45) that the c.f. of $Z_n - M_n$ tends to 1 as $n \to \infty$, uniformly in every finite t-interval. Then by Theorem 11 $Z_n - M_n$ converges i.pr. to zero.
4. Let $X_1, X_2, \dots$ be independent variables, and put

$Y_n = X_1 + X_2 + \dots + X_n$.

If $F_\nu(x)$ is the d.f. and $f_\nu(t)$ the c.f. of the variable $X_\nu$, the d.f. of $Y_n$ is $F_1 * F_2 * \dots * F_n$ and the corresponding c.f. is $f_1 f_2 \dots f_n$. By Theorem 11, a necessary and sufficient condition for the convergence of $F_1 * F_2 * \dots * F_n$ to a d.f. is the convergence of the infinite product

$f(t) = \prod_{\nu=1}^{\infty} f_\nu(t)$,

for all $t$, where $f(t)$ is continuous for $t = 0$.¹ If this condition is satisfied, it follows from Theorem 11 that the infinite product converges even uniformly in every finite t-interval, and that $f(t)$ is the c.f. of a d.f. $F(x)$ such that

$F(x) = \lim_{n \to \infty} F_1 * F_2 * \dots * F_n$

in every continuity point of $F$. For any $n'$, the product $\prod_{\nu=n+1}^{n+n'} f_\nu(t)$ tends uniformly to 1 as $n \to \infty$, and consequently the difference $Y_{n+n'} - Y_n$ converges i.pr. to zero. It would be natural to conclude that there is a variable $Y$ with the d.f. $F(x)$, such that $Y_n$ converges i.pr. to $Y$. Then $Y$ would be the sum of an infinite series of random variables: $Y = X_1 + X_2 + \dots$. In order to give a precise meaning to a statement of this character it is necessary to consider a probability distribution in the space of the combined variable $(X_1, X_2, \dots)$, which has an infinite number of dimensions. This falls, however, outside the scope of the present work.

¹ A sufficient condition for this convergence is the convergence of the two series $\sum E(X_\nu)$ and $\sum D^2(X_\nu)$.
5. We shall now consider some particular examples of probability distributions. In the first place, we consider a variable $X$ which can assume only the values 1 and 0, the corresponding probabilities being $p$ and $q = 1 - p$. The d.f. of this variable is a "step-function" with steps in the points 1 and 0, of the height $p$ and $q$ respectively, while the corresponding c.f. is equal to $pe^{it} + q$. We have further $E(X) = p$ and $D(X) = \sqrt{pq}$. If $X_1, X_2, \dots, X_n$ are independent variables all having the same distribution as $X$, the sum $\nu = X_1 + X_2 + \dots + X_n$ is equal to the number of those $X_r$ which assume the value 1. The c.f. of the variable $\nu$ is $(pe^{it} + q)^n$, and $\nu$ may assume the values $0, 1, \dots, n$, the probability of a given value $\nu$ being $\binom{n}{\nu} p^\nu q^{n-\nu}$. This distribution is usually called a binomial or Bernoulli distribution, and $\nu$ may be concretely interpreted as the number of white balls obtained in a set of $n$ drawings from an urn, the probability of drawing a white ball being each time equal to $p$. By III, 4, we have $E(\nu) = np$ and $D(\nu) = \sqrt{npq}$. According to Theorem 14, the "frequency" $\nu/n$ converges i.pr. to $p$. This result coincides with the classical Bernoulli's theorem as originally proved by Bernoulli.
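Bernoulli's theorem can be observed directly on the binomial probabilities. The following sketch is an illustration under assumptions (the values $p = 0.3$, $\varepsilon = 0.05$ and the function names are hypothetical): it evaluates the exact probability of the relation $|\nu/n - p| > \varepsilon$ and compares it with the Bienaymé-Tchebycheff bound $pq/(n\varepsilon^2)$.

```python
from math import comb


def tail_probability(n, p, eps):
    """Exact P(|nu/n - p| > eps) for a binomial variable nu with parameters n, p."""
    q = 1.0 - p
    return sum(comb(n, v) * p**v * q**(n - v)
               for v in range(n + 1) if abs(v / n - p) > eps)


def chebyshev_bound(n, p, eps):
    """Bienayme-Tchebycheff bound D^2(nu/n) / eps^2 = p q / (n eps^2)."""
    return p * (1.0 - p) / (n * eps**2)


p, eps = 0.3, 0.05                      # illustrative values only
sizes = (50, 200, 800)
tails = [tail_probability(n, p, eps) for n in sizes]
bounds = [chebyshev_bound(n, p, eps) for n in sizes]
```

The exact probabilities in `tails` decrease with $n$ and stay below the corresponding entries of `bounds`, which themselves tend to zero like $1/n$.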
If we allow the quantity $p$ to vary from one $X_r$ to another, the c.f. of the sum $\nu$ becomes

(46)  $\prod_{r=1}^{n} (p_r e^{it} + q_r) = \prod_{r=1}^{n} \bigl(1 + p_r(e^{it} - 1)\bigr)$.

In this case, we have $E(\nu) = \sum_1^n p_r$ and $D(\nu) = \sqrt{\sum_1^n p_r q_r}$, and by Theorem 14 the variable $\frac{1}{n}\bigl(\nu - \sum_1^n p_r\bigr)$ converges i.pr. to zero. If the series $\sum_1^\infty p_r$ is convergent, it is seen that the c.f. (46) tends to a limit as $n \to \infty$, for all real $t$, so that the case considered at the end of the preceding paragraph presents itself.
Another case of convergence is obtained if, in (46), we allow the $p_r$ to depend on $n$ in such a way that, when $n \to \infty$, each $p_r$ tends to zero, while $\sum_1^n p_r$ tends to a constant $\lambda > 0$. (We may e.g. take $p_r = \lambda/n$ for $r = 1, 2, \dots, n$.) Then the c.f. (46) tends to the limit

(47)  $e^{\lambda(e^{it} - 1)}$.

This is the c.f. of a variable which may assume the values $0, 1, 2, \dots$, the probability of any given $\nu$ being $\frac{\lambda^\nu}{\nu!}e^{-\lambda}$. The mean value of this variable is $\lambda$, and the s.d. is $\sqrt{\lambda}$. The semi-invariants $\gamma_\mu$ defined by (26) are all equal to $\lambda$. This distribution is usually called a Poisson distribution. If $X_1$ and $X_2$ are independent variables both having Poisson distributions with the parameter values $\lambda_1$ and $\lambda_2$, the expression (47) of the c.f. shows that the sum $X_1 + X_2$ has a distribution of the same kind with the parameter $\lambda_1 + \lambda_2$. If we denote by $F(x, \lambda)$ the d.f. corresponding to the Poisson distribution, we thus have the relation

(48)  $F(x, \lambda_1) * F(x, \lambda_2) = F(x, \lambda_1 + \lambda_2)$.
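Both statements of this paragraph, the passage from the binomial to the Poisson probabilities for $p_r = \lambda/n$ and the relation (48), can be checked on the probabilities themselves. The sketch below is only an illustration: the parameter values and function names are hypothetical.

```python
from math import comb, exp, factorial


def poisson_pmf(v, lam):
    """Probability lam^v / v! * exp(-lam) of the value v."""
    return lam**v / factorial(v) * exp(-lam)


def binomial_pmf(v, n, p):
    """Probability C(n, v) p^v q^(n-v) of the value v."""
    return comb(n, v) * p**v * (1.0 - p)**(n - v)


def max_gap(n, lam, v_max=10):
    """Largest difference between binomial(n, lam/n) and Poisson(lam) for v <= v_max."""
    return max(abs(binomial_pmf(v, n, lam / n) - poisson_pmf(v, lam))
               for v in range(v_max + 1))


def composed_pmf(v, lam1, lam2):
    """Probability of the value v for the sum of two independent Poisson variables."""
    return sum(poisson_pmf(j, lam1) * poisson_pmf(v - j, lam2)
               for j in range(v + 1))


lam = 2.0                               # illustrative value only
gaps = [max_gap(n, lam) for n in (10, 100, 1000)]
```

The entries of `gaps` shrink as $n$ grows, and `composed_pmf(v, lam1, lam2)` reproduces `poisson_pmf(v, lam1 + lam2)` up to rounding, in agreement with (48).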
6. The probability distribution defined by the d.f. $F(x/a)$, where $a > 0$ and

$F(x) = \frac{1}{2} + \frac{1}{\pi}\arctan x$,  $F'(x) = \frac{1}{\pi}\cdot\frac{1}{1 + x^2}$,

is sometimes called Cauchy's distribution.¹ This distribution has not a finite mean value, since the integral $\int_{-\infty}^{\infty} \frac{|x|}{1 + x^2}\,dx$ does not converge, although the "generalized mean value" $\lim_{z \to \infty}\int_{-z}^{z} x\,dF(x)$ does exist and is equal to zero. By an easy application of Cauchy's theorem we find the c.f.

$f(t) = \frac{1}{\pi}\int_{-\infty}^{\infty} \frac{e^{itx}}{1 + x^2}\,dx = e^{-|t|}$.

The c.f. corresponding to the d.f. $F(x/a)$ is then obviously $f(at) = e^{-a|t|}$. We thus have

$f(a_1 t)\,f(a_2 t) = f((a_1 + a_2)t)$,

or

$F(x/a_1) * F(x/a_2) = F(x/(a_1 + a_2))$,

so that the Cauchy distribution reproduces itself at the addition of independent variables. If $X_1, X_2, \dots, X_n$ are independent variables all having the Cauchy d.f. $F(x/a)$, the arithmetic mean $Z_n = (X_1 + \dots + X_n)/n$ thus has the same d.f. $F(x/a)$. Hence we cannot in this case find constants $M_n$ such that $Z_n - M_n$ converges i.pr. to zero. It is easily seen that, accordingly, the condition of Theorem 15 is not satisfied.

¹ Cf. Lévy [1], p. 179.
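That the condition of Theorem 15 fails for the Cauchy distribution can be made explicit: $z\int_{|x|>z} dF(x)$ tends to $2a/\pi$ instead of zero. The sketch below, with hypothetical function names and $a = 1$ as an illustrative choice, evaluates this quantity from the arctangent form of the d.f. and also checks that the c.f. of the arithmetic mean, $[f(at/n)]^n$, equals $f(at)$.

```python
from math import atan, exp, pi


def cauchy_df(x, a=1.0):
    """D.f. F(x/a) = 1/2 + arctan(x/a)/pi."""
    return 0.5 + atan(x / a) / pi


def tail_mass(z, a=1.0):
    """Integral of dF over |x| > z."""
    return cauchy_df(-z, a) + (1.0 - cauchy_df(z, a))


def cauchy_cf(t, a=1.0):
    """C.f. exp(-a|t|) of the d.f. F(x/a)."""
    return exp(-a * abs(t))


def cf_of_mean(t, n, a=1.0):
    """C.f. of (X_1 + ... + X_n)/n for independent Cauchy variables."""
    return cauchy_cf(t / n, a) ** n


scaled_tails = [z * tail_mass(z) for z in (10.0, 1e3, 1e5)]
```

The values in `scaled_tails` approach $2/\pi \approx 0.6366$, so the tail is of order $1/z$ exactly and not $o(1/z)$.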
As our next example we take a d.f. $F(x; \alpha, \lambda)$ which is equal to zero for $x \le 0$, and for $x > 0$ is defined by

(49)  $F(x; \alpha, \lambda) = \frac{\alpha^\lambda}{\Gamma(\lambda)}\int_0^x v^{\lambda - 1} e^{-\alpha v}\,dv$,  $(\alpha > 0,\ \lambda > 0)$.

This is a distribution of "type III" according to the classification introduced by K. Pearson.¹ All moments of the distribution are finite; the mean value is $\lambda/\alpha$, and the s.d. is $\sqrt{\lambda}/\alpha$. The c.f. is

$f(t; \alpha, \lambda) = \frac{\alpha^\lambda}{\Gamma(\lambda)}\int_0^\infty x^{\lambda - 1} e^{-(\alpha - it)x}\,dx = \frac{1}{(1 - it/\alpha)^\lambda}$.

This shows that we have the expression $\gamma_\mu = (\mu - 1)!\,\lambda/\alpha^\mu$ for the semi-invariant $\gamma_\mu$ of the distribution, and that, for a fixed value of $\alpha$, the d.f. satisfies with respect to the parameter $\lambda$ the same relation (48) as the Poisson distribution.
7. In many applications, it is required to find the distribution of the quotient of two random variables. In certain cases, the following theorem enables us to express this distribution in terms of the c.f.'s of the two variables.

¹ Cf. e.g. Elderton [1].

Theorem 16. Let $X_1$ and $X_2$ be independent variables with finite mean values, the corresponding d.f.'s being $F_1(x)$ and $F_2(x)$, with the c.f.'s $f_1(t)$ and $f_2(t)$. If $F_2(0) = 0$, and if the integral

$\int_1^\infty \Bigl|\frac{f_2(t)}{t}\Bigr|\,dt$

is convergent, then the d.f. $G(x)$ of the quotient $X_1/X_2$ is given by the relation

$G(x) = \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{f_2(t) - f_1(t)\,f_2(-tx)}{t}\,dt$.

If the integral obtained by formal differentiation of this relation with respect to $x$ is uniformly convergent in a certain interval, we thus have in this interval for the frequency function $G'(x)$

$G'(x) = \frac{1}{2\pi i}\int_{-\infty}^{\infty} f_1(t)\,f_2'(-tx)\,dt$.
By definition $G(x)$ is equal to the probability of the relation $X_1/X_2 \le x$. Since $F_2(0) = 0$, we need only consider positive values of $X_2$, so that the last inequality is equivalent to $X_1 - xX_2 \le 0$, and if $H(\xi)$ denotes the d.f. of the variable $X_1 - xX_2$, we thus obtain $G(x) = H(0) = H(0) - F_2(0)$. By hypothesis the difference $H(\xi) - F_2(\xi)$ satisfies the conditions of the Remark to Theorem 12, and so we obtain, since the c.f. of $H(\xi)$ is $f_1(t)\,f_2(-tx)$,

$H(\xi) - F_2(\xi) = \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{e^{-it\xi}}{t}\,\bigl(f_2(t) - f_1(t)\,f_2(-tx)\bigr)\,dt$.

Putting here $\xi = 0$, we obtain the theorem.
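We shall give two examples for the application of this theorem. In the first place, we consider two variables $X_1$ and $X_2$, both distributed according to (49) with the parameters $\alpha_1, \lambda_1$ and $\alpha_2, \lambda_2$ respectively. In this case the theorem gives

$G(x) = \frac{1}{2\pi i}\int_{-\infty}^{\infty} \Biggl(\frac{1}{\bigl(1 - \frac{it}{\alpha_2}\bigr)^{\lambda_2}} - \frac{1}{\bigl(1 - \frac{it}{\alpha_1}\bigr)^{\lambda_1}\bigl(1 + \frac{itx}{\alpha_2}\bigr)^{\lambda_2}}\Biggr)\frac{dt}{t}$.

If $\lambda_2$ is an integer, the integral may be calculated, and we find by an easy application of Cauchy's theorem $G(x) = 0$ for $x \le 0$ and

$G(x) = \Bigl(\frac{\alpha_1 x}{\alpha_1 x + \alpha_2}\Bigr)^{\lambda_1} \sum_{k=0}^{\lambda_2 - 1} \frac{\Gamma(\lambda_1 + k)}{k!\,\Gamma(\lambda_1)}\Bigl(\frac{\alpha_2}{\alpha_1 x + \alpha_2}\Bigr)^{k}$

for $x > 0$. In the particular case $\lambda_2 = 1$, the last expression reduces to

$G(x) = \Bigl(\frac{\alpha_1 x}{\alpha_1 x + \alpha_2}\Bigr)^{\lambda_1}$.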
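For our second example, we shall anticipate the discussion of the normal distribution that will be given in the following Chapter. We shall consider a quotient of the form $X_1/\sqrt{X_2}$, where $X_1$ is normally distributed with the mean value 0 and the s.d. $\sigma$, while $X_2$ is distributed according to (49). Denoting by $f_1$ the c.f. of $X_1$ and by $f_2$ the c.f. of $\sqrt{X_2}$, we then have (cf. (51))

$f_1(t) = e^{-\sigma^2 t^2/2}$

and

$f_2(t) = \frac{\alpha^\lambda}{\Gamma(\lambda)}\int_0^\infty x^{\lambda - 1} e^{-\alpha x + it\sqrt{x}}\,dx = \frac{2\alpha^\lambda}{\Gamma(\lambda)}\int_0^\infty v^{2\lambda - 1} e^{-\alpha v^2 + itv}\,dv$,

$f_2'(t) = \frac{2i\alpha^\lambda}{\Gamma(\lambda)}\int_0^\infty v^{2\lambda} e^{-\alpha v^2 + itv}\,dv$.

In this case we may apply the last formula of Theorem 16, and so obtain for the frequency function $G'$ of the variable $X_1/\sqrt{X_2}$

$G'(x) = \frac{\alpha^\lambda}{\pi\Gamma(\lambda)}\int_{-\infty}^{\infty} e^{-\sigma^2 t^2/2}\,dt\int_0^\infty v^{2\lambda} e^{-\alpha v^2 - itvx}\,dv$
$\quad = \frac{\alpha^\lambda}{\pi\Gamma(\lambda)}\int_0^\infty v^{2\lambda} e^{-\alpha v^2}\,dv\int_{-\infty}^{\infty} e^{-\sigma^2 t^2/2 - itvx}\,dt$
$\quad = \frac{2\alpha^\lambda}{\sigma\sqrt{2\pi}\,\Gamma(\lambda)}\int_0^\infty v^{2\lambda} e^{-(\alpha + x^2/(2\sigma^2))v^2}\,dv$
$\quad = \frac{\Gamma(\lambda + \frac{1}{2})}{\sqrt{2\pi\alpha\sigma^2}\,\Gamma(\lambda)}\Bigl(1 + \frac{x^2}{2\alpha\sigma^2}\Bigr)^{-(\lambda + \frac{1}{2})}$.

This is a distribution of type VII according to the classification of K. Pearson. In the particular case when $2\alpha\sigma^2 = n$, $\lambda = n/2$, we obtain a distribution defined by

$G'(x) = \frac{\Gamma\bigl(\frac{n+1}{2}\bigr)}{\sqrt{n\pi}\,\Gamma\bigl(\frac{n}{2}\bigr)}\Bigl(1 + \frac{x^2}{n}\Bigr)^{-\frac{n+1}{2}}$,

which is known under the name of "Student's" distribution.¹

¹ "Student" [1]. Cf. also e.g. Rider [1].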
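The reduction of the type VII frequency function to "Student's" form can be checked numerically. The sketch below is an illustration under assumptions: the values $n = 5$, $\sigma = 1.7$ and all function names are hypothetical, and the total mass is computed by a simple trapezoidal rule on a wide finite interval.

```python
from math import gamma, pi, sqrt


def quotient_density(x, alpha, lam, sigma):
    """Frequency function of X1/sqrt(X2) as obtained above (type VII)."""
    c = 2.0 * alpha * sigma**2
    return (gamma(lam + 0.5) / (sqrt(pi * c) * gamma(lam))
            * (1.0 + x * x / c) ** (-(lam + 0.5)))


def student_density(x, n):
    """"Student's" frequency function with parameter n."""
    return (gamma((n + 1) / 2) / (sqrt(n * pi) * gamma(n / 2))
            * (1.0 + x * x / n) ** (-(n + 1) / 2))


def integrate(f, lo, hi, steps):
    """Trapezoidal rule with equally spaced points."""
    h = (hi - lo) / steps
    inner = sum(f(lo + i * h) for i in range(1, steps))
    return h * (0.5 * (f(lo) + f(hi)) + inner)


n, sigma = 5, 1.7                       # illustrative values only
alpha = n / (2.0 * sigma**2)            # so that 2 * alpha * sigma^2 = n
total_mass = integrate(lambda x: student_density(x, n), -200.0, 200.0, 40000)
```

With $\lambda = n/2$ the two functions agree for every $x$, and `total_mass` is equal to 1 up to the error of the quadrature.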
CHAPTER VI

THE NORMAL DISTRIBUTION AND THE CENTRAL LIMIT THEOREM

1. The normal distribution function¹ $\Phi(x)$ is defined by the relation

$\Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-t^2/2}\,dt$.

The corresponding normal frequency function is

$\Phi'(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$.

The mean value of this distribution is 0, and the s.d. is 1, as shown by the relations

(50)  $\int_{-\infty}^{\infty} x\,d\Phi(x) = 0$,  $\int_{-\infty}^{\infty} x^2\,d\Phi(x) = 1$.

The moments of odd order $\alpha_{2\nu+1}$ all vanish, while

$\alpha_{2\nu} = \int_{-\infty}^{\infty} x^{2\nu}\,d\Phi(x) = 1 \cdot 3 \cdots (2\nu - 1)$.

The c.f. is, by a well-known integral formula,

(51)  $\int_{-\infty}^{\infty} e^{itx}\,d\Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{itx - x^2/2}\,dx = e^{-t^2/2}$.

Hence we obtain, for $\nu = 1, 2, \dots$, by partial integration

(52)  $\int_{-\infty}^{\infty} e^{itx}\,d\Phi^{(\nu)}(x) = (-it)^\nu e^{-t^2/2}$,

and by differentiation

$\Phi^{(\nu)}(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} (it)^{\nu - 1} e^{itx - t^2/2}\,dt$.

A random variable $X$ is said to be normally distributed, if its d.f. is $\Phi\bigl(\frac{x - m}{\sigma}\bigr)$, where $\sigma > 0$ and $m$ are constants. (The case $\sigma = 0$ is, of course, a degenerated limiting case which might be called an improper normal distribution. $\Phi\bigl(\frac{x - m}{0}\bigr)$ should always be interpreted as $\epsilon(x - m)$, where $\epsilon(x)$ is defined by (17).) The normalized variable $\frac{X - m}{\sigma}$ has then the d.f. $\Phi(x)$ and we obtain from (50)

$E(X) = m$,  $D(X) = \sigma$,

while (51) shows (cf. also IV, 1) that the c.f. of the variable $X$ is

$E(e^{itX}) = e^{mit - \sigma^2 t^2/2}$.

The semi-invariants of $X$, as defined in IV, 2, are

$\gamma_1 = m$,  $\gamma_2 = \sigma^2$,  $\gamma_3 = \gamma_4 = \dots = 0$.

¹ The normal distribution was discussed already in 1733 by De Moivre in the second supplement to his Miscellanea Analytica. Cf. K. Pearson [1]. It was afterwards treated by Gauss and Laplace, and is often referred to as the Gauss or Gauss-Laplace distribution.
2. We now proceed to prove a number of theorems which show that the normal distribution plays a fundamental part in a great number of questions connected with the addition of mutually independent random variables.

Let $X_1$ and $X_2$ be independent and normally distributed variables, the parameter values being $m_1, \sigma_1$ and $m_2, \sigma_2$ respectively. Then the sum $X_1 + X_2$ has the composed d.f. (cf. V, 1)

$\Phi\Bigl(\frac{x - m_1}{\sigma_1}\Bigr) * \Phi\Bigl(\frac{x - m_2}{\sigma_2}\Bigr)$,

while the corresponding c.f. is

$e^{m_1 it - \sigma_1^2 t^2/2}\,e^{m_2 it - \sigma_2^2 t^2/2} = e^{(m_1 + m_2)it - (\sigma_1^2 + \sigma_2^2)t^2/2}$.

This is, however, obviously the c.f. of a normal distribution, and so we have the following theorem.

Theorem 17.¹ The sum of two independent and normally distributed variables is itself normally distributed. Thus

$\Phi\Bigl(\frac{x - m_1}{\sigma_1}\Bigr) * \Phi\Bigl(\frac{x - m_2}{\sigma_2}\Bigr) = \Phi\Bigl(\frac{x - m}{\sigma}\Bigr)$,

where $m = m_1 + m_2$, $\sigma^2 = \sigma_1^2 + \sigma_2^2$.

¹ This theorem is sometimes attributed to d'Ocagne, but it seems to have been known already to Poisson and Cauchy, and possibly also to Gauss.

Obviously this theorem is immediately generalized to the composition of any finite number of normal distributions.

We shall now prove three theorems which attach themselves in a natural way to Theorem 17 and reveal further remarkable properties of the normal distribution.
According to Theorem 17, the d.f.'s of the type $\Phi\bigl(\frac{x - m}{\sigma}\bigr)$ form a closed family (the "normal family") with respect to the operation of composition. Now, any d.f. with a finite mean value $m$ and a finite s.d. $\sigma$ may be written in the form $F\bigl(\frac{x - m}{\sigma}\bigr)$, where $F(x)$ is a d.f. with the mean value 0 and the s.d. 1. For any given $F(x)$ with these properties, all functions $F\bigl(\frac{x - m}{\sigma}\bigr)$ may be considered as a family generated by $F(x)$. Our next three theorems then assert (1) that no $F(x)$ different from $\Phi(x)$ generates in this way a closed family; (2) that the composition of any two d.f.'s which do not both belong to the normal family never produces a member of that family; and (3) that every d.f. with a finite s.d. gives, by n-fold composition with itself, a d.f. which for all sufficiently large $n$ comes (uniformly for all real $x$) as near as we please to a member of the normal family. We shall first give the formal statements of the three theorems and then proceed to the proofs.
Theorem 18.¹ Let $F(x)$ be a d.f. with the mean value 0 and the s.d. 1. If, to any constants $m_1, m_2$ (real) and $\sigma_1, \sigma_2$ (positive), we can find $m$ and $\sigma$ such that

(53)  $F\Bigl(\frac{x - m_1}{\sigma_1}\Bigr) * F\Bigl(\frac{x - m_2}{\sigma_2}\Bigr) = F\Bigl(\frac{x - m}{\sigma}\Bigr)$,

then $F(x) = \Phi(x)$.

¹ Pólya [1]. The example of Cauchy's distribution (V, 6) shows that, in this theorem, it is essential that we consider only d.f.'s with finite dispersions. Further examples of non-normal d.f.'s satisfying (53) have been discussed by Pólya and Lévy [1].

Theorem 19.¹ If the sum of two independent random variables is normally distributed, then each variable is itself normally distributed. Thus if $F_1(x)$ and $F_2(x)$ are d.f.'s such that

(54)  $F_1(x) * F_2(x) = \Phi\Bigl(\frac{x - m}{\sigma}\Bigr)$,

then

$F_1(x) = \Phi\Bigl(\frac{x - m_1}{\sigma_1}\Bigr)$,  $F_2(x) = \Phi\Bigl(\frac{x - m_2}{\sigma_2}\Bigr)$,

where $m_1 + m_2 = m$, $\sigma_1^2 + \sigma_2^2 = \sigma^2$.
Before stating the third theorem, some preliminary remarks are necessary. Denoting the composition $F * F * \dots * F$ of $n$ equal components by $F^{n*}$, we obtain from Theorem 17

$\Bigl(\Phi\Bigl(\frac{x - m}{\sigma}\Bigr)\Bigr)^{n*} = \Phi\Bigl(\frac{x - mn}{\sigma\sqrt{n}}\Bigr)$,

and in particular for $m = 0$, $\sigma = 1/\sqrt{n}$,

(55)  $(\Phi(x\sqrt{n}))^{n*} = \Phi(x)$.

The last relation expresses that if $X_1, \dots, X_n$ are independent variables, all with the same d.f. $\Phi(x)$, then the variable $(X_1 + \dots + X_n)/\sqrt{n}$ has the d.f. $\Phi(x)$.
Theorem 20.² Let $F(x)$ be a d.f. with the mean value 0 and the s.d. 1. If $X_1, X_2, \dots$ are independent variables all having the d.f. $F(x)$, then the d.f. of the variable $(X_1 + \dots + X_n)/\sqrt{n}$ tends to $\Phi(x)$ as $n \to \infty$, uniformly for all real $x$. Thus

(55a)  $(F(x\sqrt{n}))^{n*} \to \Phi(x)$

uniformly in $x$. Hence it follows also that

(56)  $\Bigl(F\Bigl(\frac{x - m}{\sigma}\Bigr)\Bigr)^{n*} - \Phi\Bigl(\frac{x - mn}{\sigma\sqrt{n}}\Bigr) \to 0$

uniformly in $x$, for all fixed $m$ and $\sigma$.

¹ Cramér [5]. The theorem had been conjectured by Lévy [2], [3]. It will be observed that in this theorem it is not assumed that the moments of any order are finite.
² Lindeberg [1], Lévy [1], p. 233.
Theorem 20 is a particular case of the famous "Central Limit Theorem" in the theory of probability, which will be more fully treated in the following paragraph. We shall now first prove Theorem 20, which will then be used for the proof of Theorem 18. Finally, we shall prove Theorem 19.

Proof of Theorem 20. If $f(t)$ is the c.f. of a d.f. $F(x)$ with $\alpha_1 = 0$ and $\alpha_2 = 1$, it follows from formula (25) of IV, 2, that $f(t) = 1 - \frac{1}{2}t^2 + o(t^2)$ for small values of $|t|$. Thus we have uniformly in every finite t-interval

$f\Bigl(\frac{t}{\sqrt{n}}\Bigr) = 1 - \frac{t^2}{2n} + o\Bigl(\frac{1}{n}\Bigr)$

as $n \to \infty$. The c.f. of the variable $(X_1 + \dots + X_n)/\sqrt{n}$ is

$\Bigl(f\Bigl(\frac{t}{\sqrt{n}}\Bigr)\Bigr)^n = \Bigl(1 - \frac{t^2}{2n} + o\Bigl(\frac{1}{n}\Bigr)\Bigr)^n$.

As $n \to \infty$, this tends uniformly in every finite t-interval to the limit $e^{-t^2/2}$, which is the c.f. of $\Phi(x)$. Thus by Theorem 11 the d.f. of $(X_1 + \dots + X_n)/\sqrt{n}$ tends to $\Phi(x)$. The uniformity of the convergence follows easily from the fact that $\Phi(x)$ is continuous. Thus (55a) is proved, and (56) follows immediately from the remark that $\bigl(F\bigl(\frac{x - m}{\sigma}\bigr)\bigr)^{n*}$ is the d.f. of the variable

$mn + \sigma\sqrt{n}\cdot\frac{X_1 + \dots + X_n}{\sqrt{n}}$.
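Theorem 20 can be followed on a simple example in which every d.f. is known exactly. The sketch below is an illustration, not the author's method: it assumes that each $X_\nu$ takes the values $+1$ and $-1$ with probability $\frac{1}{2}$ (mean value 0, s.d. 1), so that the sum is a linear function of a binomial variable, and it computes the largest distance between the d.f. of $(X_1 + \dots + X_n)/\sqrt{n}$ and $\Phi(x)$. The function names are hypothetical.

```python
from math import comb, erf, sqrt


def Phi(x):
    """Normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def sup_distance(n):
    """sup over x of |d.f. of (X_1 + ... + X_n)/sqrt(n) - Phi(x)| for X = +-1.

    The d.f. is a step-function, so the supremum is attained at a jump,
    either just to the left of it or at the jump itself.
    """
    total = 2**n
    cumulative = 0
    worst = 0.0
    for v in range(n + 1):              # v = number of variables equal to +1
        x = (2 * v - n) / sqrt(n)
        left = cumulative / total       # value just to the left of the jump
        cumulative += comb(n, v)
        right = cumulative / total      # value at the jump
        worst = max(worst, abs(left - Phi(x)), abs(right - Phi(x)))
    return worst


distances = [sup_distance(n) for n in (4, 16, 64, 256)]
```

The entries of `distances` decrease roughly like $1/\sqrt{n}$, each being close to half the central jump of the step-function, in accordance with the uniform convergence asserted by the theorem.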
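Proof of Theorem 18. Both members of the relation (53) are d.f.'s, and the first order moments are $m_1 + m_2$ and $m$ respectively, while the variances are $\sigma_1^2 + \sigma_2^2$ and $\sigma^2$, so that we obtain $m = m_1 + m_2$, $\sigma^2 = \sigma_1^2 + \sigma_2^2$. Putting $m_1 = m_2 = \dots = 0$, we obtain by iteration

$F\Bigl(\frac{x}{\sigma_1}\Bigr) * F\Bigl(\frac{x}{\sigma_2}\Bigr) * \dots * F\Bigl(\frac{x}{\sigma_n}\Bigr) = F\Bigl(\frac{x}{\sqrt{\sigma_1^2 + \dots + \sigma_n^2}}\Bigr)$,

and thus in particular

$(F(x\sqrt{n}))^{n*} = F(x)$.

From (55a) it then follows that $F(x) = \Phi(x)$ for all $x$.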
Proof of Theorem 19. Let $X_1$ and $X_2$ be two independent variables with the d.f.'s $F_1$ and $F_2$, and the c.f.'s $f_1$ and $f_2$, and suppose that $X_1 + X_2$ has the d.f. $\Phi\bigl(\frac{x - m}{\sigma}\bigr)$. Since the quadrant $X_1 \le x$, $X_2 \le y$ is a sub-set of the half-plane $X_1 + X_2 \le x + y$, we have for all values of $x$ and $y$

$F_1(x)\,F_2(y) \le \Phi\Bigl(\frac{x + y - m}{\sigma}\Bigr)$.

Here we choose for $y$ any fixed value such that $F_2(y) > 0$, and use the inequality

$\Phi(x) < \frac{1}{\sqrt{2\pi}\,|x|}\,e^{-x^2/2}$,

which holds for all $x < 0$ and is easily proved by partial integration. It then follows that we can determine $A$ and $B$ independent of $x$, such that for all $x < 0$

$F_1(x) < A\,e^{-\frac{x^2}{2\sigma^2} + B|x|}$.

Similarly we can determine $A'$ and $B'$ such that for all $x > 0$

$1 - F_1(x) < A'\,e^{-\frac{x^2}{2\sigma^2} + B'x}$.

From the two last inequalities it follows that the integral

(57)  $J = \int_{-\infty}^{\infty} e^{\frac{x^2}{4\sigma^2}}\,dF_1(x)$

is convergent. If, now, we consider the function

$f_1(t) = \int_{-\infty}^{\infty} e^{itx}\,dF_1(x)$

for complex values of the variable $t$, it follows from the convergence of (57) that the integral which represents $f_1(t)$ is absolutely and uniformly convergent in every finite domain in the t-plane. Thus $f_1(t)$ is an integral function of the complex variable $t$. For the modulus of this function we obtain by means of the elementary inequality $|tx| \le \sigma^2|t|^2 + \frac{x^2}{4\sigma^2}$,

$|f_1(t)| \le e^{\sigma^2|t|^2}\int_{-\infty}^{\infty} e^{\frac{x^2}{4\sigma^2}}\,dF_1(x) = J\,e^{\sigma^2|t|^2}$,

so that the order¹ of the integral function $f_1(t)$ does not exceed 2. In the same way it is proved that $f_2(t)$ is an integral function of $t$, the order of which does not exceed 2. According to (54) we have, however,

$f_1(t)\,f_2(t) = e^{mit - \sigma^2 t^2/2}$,

which shows that $f_1$ and $f_2$ are integral functions without zeros. By the classical factorization theorem² of Hadamard it then follows that

(58)  $f_1(t) = e^{q_1(t)}$,  $f_2(t) = e^{q_2(t)}$,

where $q_1(t)$ and $q_2(t)$ are polynomials of degree not greater than 2.

The convergence of (57) implies that all moments and semi-invariants of $X_1$ are finite. Denoting the mean value by $m_1$ and the s.d. by $\sigma_1$, we then obtain from (58) according to IV, 2,

$f_1(t) = e^{m_1 it - \sigma_1^2 t^2/2}$,

and similarly

$f_2(t) = e^{m_2 it - \sigma_2^2 t^2/2}$.

This is, however, equivalent to

$F_1(x) = \Phi\Bigl(\frac{x - m_1}{\sigma_1}\Bigr)$,  $F_2(x) = \Phi\Bigl(\frac{x - m_2}{\sigma_2}\Bigr)$.

Then obviously $m_1 + m_2 = m$ and $\sigma_1^2 + \sigma_2^2 = \sigma^2$, and the theorem is proved.³

¹ Cf. e.g. Titchmarsh [1], p. 248.
² Cf. e.g. Titchmarsh [1], p. 250.
³ $\sigma_1$ or $\sigma_2$ may be equal to zero. If, e.g., $\sigma_1 = 0$, we have by VI, 1 to interpret $\Phi\bigl(\frac{x - m_1}{\sigma_1}\bigr)$ as $\epsilon(x - m_1)$, and so obtain the trivial solution of (54): $F_1(x) = \epsilon(x - m_1)$, $F_2(x) = \Phi\bigl(\frac{x - m_2}{\sigma}\bigr)$.
3. The Central Limit Theorem¹ in the theory of probability asserts that, under certain general conditions, the sum of a large number of independent variables is approximately normally distributed. In Theorem 20, we have already met with a particular case of the general theorem, viz. the composition of $n$ equal components with a finite s.d. We shall now consider the case when the components are not necessarily equal. Throughout this paragraph and the immediately following one, we shall suppose that every component has a finite s.d. and a mean value equal to zero. The assumption that the mean value is zero may obviously be made without loss of generality, since it is equivalent to the simple addition of a constant to each variable.

We thus consider a sequence of independent random variables $X_1, X_2, \dots$, such that $X_\nu$ has the mean value 0 and the s.d. $\sigma_\nu$. The d.f. of $X_\nu$ will be denoted by $F_\nu(x)$ and the c.f. by $f_\nu(t)$. If the d.f. of the sum $X_1 + \dots + X_n$ is denoted by $\bar{F}_n(x)$, we have

(59)  $\bar{F}_n(x) = F_1(x) * F_2(x) * \dots * F_n(x)$,

and $\bar{F}_n(x)$ has the mean value zero and the variance $s_n^2$ given by

(60)  $s_n^2 = \sigma_1^2 + \sigma_2^2 + \dots + \sigma_n^2$.

The variable $(X_1 + \dots + X_n)/s_n$ then has the d.f.

(61)  $\mathfrak{F}_n(x) = \bar{F}_n(s_n x)$

with the mean value 0 and the s.d. 1. It is possible to show that under fairly general conditions $\mathfrak{F}_n(x)$ tends to the normal d.f. $\Phi(x)$ as $n$ tends to infinity. The most important case is that in which the following two conditions are satisfied:

(62)  $s_n \to \infty$,  $\sigma_n/s_n \to 0$.

This means that the total s.d. of $\sum_1^n X_\nu$ tends to infinity, while each component contributes only a small fraction of the total s.d.¹ In this case, we have the following theorem.

¹ This theorem was first stated by Laplace, and was further treated by several mathematicians during the nineteenth century, notably Tchebycheff and Markoff. A complete and rigorous proof under fairly general conditions was first given in 1901 by Liapounoff [1], [2]. Cf. VI, 4, and VII, 4. A comprehensive account of the modern development of the subject is given by Khintchine [2]. The central position which the Limit Theorem occupies in the Theory of Probability is well brought out in this beautiful treatise.
Theorem 21.² Let $X_1, X_2, \dots$ be a sequence of independent random variables with vanishing mean values and finite s.d.'s satisfying (62), and denote by $\mathfrak{F}_n(x)$ the d.f. of the variable $(X_1 + \dots + X_n)/s_n$ as defined by (59) and (61). Then a necessary and sufficient condition for the validity of the relation

(63)  $\lim_{n \to \infty} \mathfrak{F}_n(x) = \Phi(x)$

for all $x$ is that, for any given $\varepsilon > 0$,

(64)  $\lim_{n \to \infty} \frac{1}{s_n^2}\sum_{\nu=1}^{n}\int_{|x| > \varepsilon s_n} x^2\,dF_\nu(x) = 0$.

It is readily seen that Theorem 20 is contained as a particular case in Theorem 21. The condition (64) is known as the Lindeberg condition. It is here given in a slightly simpler form than that originally given by Lindeberg.
In order to prove the theorem, we denote by $\mathfrak{f}_n(t)$ the c.f. which corresponds to the d.f. $\mathfrak{F}_n(x)$, and then obtain from (59) and (61)

(65)  $\mathfrak{f}_n(t) = f_1(t/s_n)\,f_2(t/s_n) \cdots f_n(t/s_n)$.

Now, for any integer $k > 0$ and for any real $a$ we have

(66)  $e^{ia} = \sum_{\nu=0}^{k-1} \frac{(ia)^\nu}{\nu!} + \vartheta\,\frac{a^k}{k!}$,

using $\vartheta$ as a general notation for a real or complex quantity of modulus not exceeding unity. We shall first prove that the condition is sufficient, and thus assume that (64) is satisfied for any given $\varepsilon > 0$. Taking in (66) $k = 2$ for $|x| > \varepsilon s_n$ and $k = 3$ for $|x| \le \varepsilon s_n$, we obtain for $|t| < T$, where $T > 1$,

$f_\nu(t/s_n) = \int_{-\infty}^{\infty} e^{itx/s_n}\,dF_\nu$
$\quad = 1 - \frac{t^2}{2s_n^2}\int_{|x| \le \varepsilon s_n} x^2\,dF_\nu + \vartheta\,\frac{T^3}{6s_n^3}\int_{|x| \le \varepsilon s_n} |x|^3\,dF_\nu + \vartheta\,\frac{T^2}{2s_n^2}\int_{|x| > \varepsilon s_n} x^2\,dF_\nu$
$\quad = 1 - \frac{\sigma_\nu^2 t^2}{2s_n^2} + \frac{\vartheta T^3}{s_n^2}\Bigl(\varepsilon\sigma_\nu^2 + \int_{|x| > \varepsilon s_n} x^2\,dF_\nu\Bigr)$,

bearing in mind that the mean value of $X_\nu$ is equal to zero.

¹ Excepting the trivial case when $\sigma_n = 0$ for all $n$, it is easily seen that (62) is equivalent to $F_\nu(s_n x) \to \epsilon(x)$ for every fixed $x \ne 0$, uniformly for $\nu = 1, 2, \dots, n$, as $n \to \infty$, where $\epsilon(x)$ is defined by (17).
² Lindeberg [1], Lévy [1], Feller [1]. It can be shown without difficulty that condition (64) implies (62). Thus as a sufficient condition (64) is independent of (62).

From (62) we obtain easily $\sigma_\nu/s_n \to 0$ as $n \to \infty$, uniformly for $\nu = 1, 2, \dots, n$. Thus $f_\nu(t/s_n) \to 1$, uniformly for $|t| < T$ and $\nu = 1, 2, \dots, n$. It follows that we have

$\log f_\nu(t/s_n) = (1 + \eta)\,(f_\nu(t/s_n) - 1)$,

where $|\eta| < \varepsilon$ for all sufficiently large $n$. As we may obviously suppose $0 < \varepsilon < \frac{1}{2}$, we thus obtain

$\log f_\nu(t/s_n) = -\frac{\sigma_\nu^2 t^2}{2s_n^2} + \frac{2\vartheta T^3}{s_n^2}\Bigl(\varepsilon\sigma_\nu^2 + \int_{|x| > \varepsilon s_n} x^2\,dF_\nu\Bigr)$.

Summing over $\nu = 1, 2, \dots, n$, we obtain according to (65) for $0 < \varepsilon < \frac{1}{2}$ and $|t| < T$

$\log \mathfrak{f}_n(t) = -\frac{t^2}{2} + 2\vartheta T^3\Bigl(\varepsilon + \frac{1}{s_n^2}\sum_{\nu=1}^{n}\int_{|x| > \varepsilon s_n} x^2\,dF_\nu\Bigr)$.

$\varepsilon$ being arbitrary, it then follows from (64) that we have as $n \to \infty$

(67)  $\log \mathfrak{f}_n(t) = \sum_{\nu=1}^{n} \log f_\nu(t/s_n) \to -\frac{t^2}{2}$

uniformly for $|t| < T$, and by Theorem 11 this is equivalent to (63). Thus the condition (64) is sufficient.
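In order to prove that (64) is also necessary, we assume that (63) and thus also (67) is satisfied. From (62) we obtain as above $\sigma_\nu/s_n \to 0$. Using (66) with $k = 2$, we have

$f_\nu(t/s_n) = 1 + \vartheta\,\frac{\sigma_\nu^2 t^2}{2s_n^2}$,

so that $f_\nu(t/s_n) \to 1$, uniformly in the same sense as above, while $\sum_{\nu=1}^{n} |f_\nu(t/s_n) - 1|$ is bounded for every fixed $t$. Since $\frac{\log z}{z - 1} \to 1$ as $z \to 1$, it then follows from (67) that

(68)  $\sum_{\nu=1}^{n} (f_\nu(t/s_n) - 1) \to -\frac{t^2}{2}$

for every real $t$. According to the Bienaymé-Tchebycheff inequality (III, 3) we have, however, paying regard to (60),

$\sum_{\nu=1}^{n}\int_{|x| > \varepsilon s_n} dF_\nu \le \frac{1}{\varepsilon^2}$,

and so obtain from (68), taking the real part,

(69)  $\liminf_{n \to \infty} \sum_{\nu=1}^{n}\int_{|x| \le \varepsilon s_n} \Bigl(1 - \cos\frac{tx}{s_n}\Bigr)dF_\nu \ge \frac{t^2}{2} - \frac{2}{\varepsilon^2}$.

On the other hand, we have

$\sum_{\nu=1}^{n}\int_{|x| \le \varepsilon s_n} \Bigl(1 - \cos\frac{tx}{s_n}\Bigr)dF_\nu \le \frac{t^2}{2s_n^2}\sum_{\nu=1}^{n}\int_{|x| \le \varepsilon s_n} x^2\,dF_\nu = \frac{t^2}{2} - \frac{t^2}{2s_n^2}\sum_{\nu=1}^{n}\int_{|x| > \varepsilon s_n} x^2\,dF_\nu$.

Introducing this in (69), we obtain

$\frac{t^2}{2}\,\limsup_{n \to \infty} \frac{1}{s_n^2}\sum_{\nu=1}^{n}\int_{|x| > \varepsilon s_n} x^2\,dF_\nu \le \frac{2}{\varepsilon^2}$.

Since $t$ may be taken arbitrarily large, it follows that the condition (64) must be satisfied, and thus Theorem 21 is proved.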
If the conditions (62) are not satisfied, one of the following two
cases must occur:
(A) $\lim_{n \to \infty} s_n = s$ exists; or
(B) $s_n \to \infty$, $\sigma_n/s_n > \alpha > 0$ for an infinity of values of $n$.
In the case (A), it follows from (61) that the relation $\mathfrak{F}_n(x) \to \Phi(x)$
is equivalent to $F_1(x) * \cdots * F_n(x) \to \Phi(x/s)$. Thus by (59) the infinite
composition $F_1(x) * F_2(x) * \cdots$ converges (V, 4) to $\Phi(x/s)$. Putting
$G(x) = F_2(x) * F_3(x) * \cdots$, we then have $F_1(x) * G(x) = \Phi(x/s)$.
From Theorem 19 we then obtain $F_1(x) = \Phi(x/\sigma_1)$, and it
follows that, in case (A) the necessary and sufficient condition for
$\mathfrak{F}_n(x) \to \Phi(x)$ is that each variable $X_\nu$ is normally distributed.
In case (B), on the other hand, for values of $n$ such that
$\sigma_n/s_n > \alpha > 0$, the s.d. of $X_n$ is not small compared to the total
s.d. of $\sum_{\nu=1}^{n} X_\nu$. It is then easily understood that (63) cannot be
satisfied unless the d.f.'s of these "large" $X_n$ tend to the normal
type. We shall, however, not enter upon a detailed discussion of
this case.
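4. A sufficient condition for the validity of (63), which is often
useful, has been given by Liapounoff.¹ Let $\beta_{k\nu} = E(|X_\nu|^k)$
denote the absolute moment of order $k$ of the d.f. $F_\nu(x)$, so that in
particular $\beta_{2\nu} = \sigma_\nu^2$. Suppose that for some $k > 2$ (not necessarily
integral) $\beta_{k\nu}$ is finite for all $\nu$ and is such that
(70) $\lim_{n \to \infty} \frac{1}{s_n^k} \sum_{\nu=1}^{n} \beta_{k\nu} = 0.$
We then have for every $\epsilon > 0$
$\frac{1}{s_n^2} \sum_{\nu=1}^{n} \int_{|x| > \epsilon s_n} x^2\, dF_\nu \le \frac{1}{\epsilon^{k-2} s_n^k} \sum_{\nu=1}^{n} \beta_{k\nu} \to 0,$
and thus the Lindeberg condition (64) is satisfied, so that by
Theorem 21 (cf. also p. 57, footnote 2) we have $\mathfrak{F}_n(x) \to \Phi(x)$.
If, in particular, there are two positive constants $M$ and $m$,
such that for all $\nu$ we have $\beta_{3\nu} < M$ and $\beta_{2\nu} = \sigma_\nu^2 > m$, it is obvious
that the Liapounoff condition (70) is satisfied, and thus $\mathfrak{F}_n(x)$
tends to $\Phi(x)$.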
5. We shall now apply the results of the two preceding paragraphs
to some particular examples.
As a first example, we take the variables $X_r = Y_r - p_r$, where
$Y_r$ has the simple distribution considered in V, 5:
$Y_r = 1$ with the probability $p_r$, and $Y_r = 0$ with the probability
$q_r = 1 - p_r$. We then have $E(X_r) = 0$, $D^2(X_r) = \sigma_r^2 = p_r q_r$ and
$E(|X_r|^3) = p_r q_r (p_r^2 + q_r^2) \le p_r q_r$, so that
$\frac{1}{s_n^3} \sum_{r=1}^{n} \beta_{3r} \le \left( \sum_{r=1}^{n} p_r q_r \right)^{-1/2}.$
Putting, as in V, 5, $\nu = Y_1 + \cdots + Y_n$, so that $\nu$ represents the
number of those $Y_r$ which assume the value 1, we have
$U_n = \frac{X_1 + \cdots + X_n}{s_n} = \frac{\nu - \sum_{r=1}^{n} p_r}{\left( \sum_{r=1}^{n} p_r q_r \right)^{1/2}}.$
1 Liapounoff [2]. By means of the condition (70), Liapounoff obtained an upper
limit for the modulus of the difference $\mathfrak{F}_n(x) - \Phi(x)$. This result will be proved in
the following Chapter (cf. Theorem 24).
If the series $\sum p_r q_r$ is divergent, the Liapounoff condition (70) is
satisfied, and thus the d.f. of the variable $U_n$ tends to $\Phi(x)$ as
$n \to \infty$. If, on the other hand, $\sum p_r q_r$ is convergent, it follows from
the above discussion (case (A), p. 59) that the d.f. of $U_n$ does not
tend to $\Phi(x)$, since the variables $X_r$ are not normally distributed.
In the particular case when all $p_r$ are equal to $p$, where $0 < p < 1$,
the series is obviously divergent, so that the d.f. of the
variable $U_n = (\nu - np)/\sqrt{npq}$ tends to $\Phi(x)$. It follows that for
any fixed $\lambda_1$ and $\lambda_2$ the probability of the relation
$\lambda_1 < (\nu - np)/\sqrt{npq} < \lambda_2$
tends to the limit $\frac{1}{\sqrt{2\pi}} \int_{\lambda_1}^{\lambda_2} e^{-t^2/2}\, dt$. This is the extended form of
Bernoulli's theorem proved by De Moivre and Laplace.
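The limit relation of De Moivre and Laplace may be illustrated numerically. The following is only a sketch under stated assumptions: the function names, the choice $p = 0.3$ and the values of $n$ are illustrative and do not occur in the text, and the binomial probabilities are evaluated in logarithmic form merely to avoid overflow.

```python
import math


def normal_cdf(x):
    """Standard normal d.f. Phi(x), expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def binom_pmf(n, k, p):
    """P(nu = k) for nu ~ Binomial(n, p), computed in log space."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_pmf)


def interval_probability(n, p, lam1, lam2):
    """Exact probability of lam1 < (nu - n p)/sqrt(n p q) < lam2."""
    sd = math.sqrt(n * p * (1.0 - p))
    total = 0.0
    for k in range(n + 1):
        u = (k - n * p) / sd
        if lam1 < u < lam2:
            total += binom_pmf(n, k, p)
    return total


def normal_limit(lam1, lam2):
    """The limiting value (1/sqrt(2 pi)) * integral of exp(-t^2/2) over (lam1, lam2)."""
    return normal_cdf(lam2) - normal_cdf(lam1)


if __name__ == "__main__":
    # Illustrative values only: p = 0.3, interval (-1, 1).
    for n in (20, 200, 2000):
        print(n, interval_probability(n, 0.3, -1.0, 1.0), normal_limit(-1.0, 1.0))
```

For increasing $n$ the printed exact probabilities should approach the normal limit, in agreement with the theorem; the rate of approach is the subject of Chapter VII.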
As a second example we consider the variables $X_r$ with the
distribution
$X_r = r^\alpha$ with the probability $\frac{1}{2 r^{2\alpha}}$,
$X_r = 0$ with the probability $1 - \frac{1}{r^{2\alpha}}$,
$X_r = -r^\alpha$ with the probability $\frac{1}{2 r^{2\alpha}}$.
Obviously $E(X_r) = 0$ and $D^2(X_r) = \sigma_r^2 = 1$, so that $s_n^2 = n$.
Thus (62) is satisfied, and by Theorem 21 a necessary and sufficient
condition for $\mathfrak{F}_n(x) \to \Phi(x)$ is
$\lim_{n \to \infty} \frac{1}{n} \sum_{r \le n,\; r^\alpha > \epsilon \sqrt{n}} 1 = 0.$
It is readily seen that this condition is satisfied for $\alpha < \frac{1}{2}$, but not
for $\alpha \ge \frac{1}{2}$. For $\alpha > \frac{1}{2}$, it is indeed obvious that the distribution
cannot tend to the normal type, as in this case we have a probability
greater than $\prod_{r=2}^{\infty} (1 - r^{-2\alpha}) > 0$ that any sum $X_2 + \cdots + X_n$
assumes the value zero. The Liapounoff condition (70) is satisfied
for $\alpha < \frac{1}{2}$, but not for $\alpha \ge \frac{1}{2}$.
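The behaviour of the Lindeberg sum in this second example can be checked by direct counting. In the sketch below, the function name and the numerical values of $\alpha$, $n$ and $\epsilon$ are assumptions chosen for illustration; the code relies only on the facts stated above, namely that $s_n^2 = n$ and that each $X_r$ with $r^\alpha > \epsilon\sqrt{n}$ contributes exactly 1 to the sum.

```python
import math


def lindeberg_ratio(alpha, n, eps):
    """(1/n) * #{r <= n : r**alpha > eps*sqrt(n)}.

    For the three-point distribution of the example, the integral of x^2
    over |x| > eps*s_n equals 1 when r**alpha > eps*sqrt(n) and 0 otherwise.
    """
    threshold = eps * math.sqrt(n)
    count = sum(1 for r in range(1, n + 1) if r ** alpha > threshold)
    return count / n


if __name__ == "__main__":
    # Illustrative values only.
    for alpha in (0.25, 0.5, 1.0):
        print(alpha, [lindeberg_ratio(alpha, n, 0.1) for n in (10**3, 10**4, 10**5)])
```

For $\alpha = \frac{1}{4}$ the ratio vanishes as soon as $n^\alpha < \epsilon\sqrt{n}$, while for $\alpha = \frac{1}{2}$ it stays near $1 - \epsilon^2$, which is the dichotomy asserted in the text.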
6. If, for the independent variables $X_n$ considered in Theorem
21, the existence of finite mean values and variances is not
assumed, we may still ask if it is possible to find constants $a_n$ and
$b_n$, such that the d.f. of $(X_1 + \cdots + X_n)/a_n - b_n$ tends to $\Phi(x)$ as
$n \to \infty$. The same question may, of course, be asked in a case
when finite mean values and variances do exist, but the Lindeberg
condition (64) is not satisfied. We shall not enter upon a
detailed discussion of the problems belonging to this order of
ideas, but shall content ourselves with proving the following two
theorems.
Theorem 22.¹ Let $X_1, X_2, \ldots$ be a sequence of independent
random variables, and denote by $F_\nu(x)$ the d.f. of $X_\nu$. If, for a
sequence $a_1, a_2, \ldots$ of positive numbers, the conditions
(71) $\lim_{n \to \infty} \sum_{\nu=1}^{n} \int_{|x| > \epsilon a_n} dF_\nu(x) = 0,$
(72) $\lim_{n \to \infty} \frac{1}{a_n^2} \sum_{\nu=1}^{n} \int_{|x| \le \epsilon a_n} x^2\, dF_\nu(x) = 1,$
(73) $\lim_{n \to \infty} \frac{1}{a_n} \sum_{\nu=1}^{n} \left| \int_{|x| \le \epsilon a_n} x\, dF_\nu(x) \right| = 0$
are satisfied for every $\epsilon > 0$, then the d.f. of the variable
$(X_1 + \cdots + X_n)/a_n$ tends to $\Phi(x)$ as $n \to \infty$.
1 Feller [1]. It is there further shown that (71)-(73) are necessary for the
convergence to $\Phi(x)$ of the d.f. of any variable $(\epsilon_1 X_1 + \cdots + \epsilon_n X_n)/a_n$, where $\epsilon_\nu = \pm 1$.
If, in the particular case when every $X_\nu$ has a finite s.d. and
a mean value equal to zero, we take $a_n = s_n$, as in Theorem 21,
it is easily found that the conditions (71)-(73) reduce to the
Lindeberg condition (64).
In order to prove the theorem, we denote by $f_\nu(t)$ the c.f. of
$X_\nu$. According to (71) we may, to any given $\epsilon > 0$, choose $n_0 = n_0(\epsilon)$
such that, for all $n > n_0$,
$\sum_{\nu=1}^{n} \int_{|x| > \epsilon a_n} dF_\nu < \epsilon.$
for all sufficiently large $n$. Thus we have, as $n \to \infty$,
$f_\nu(t/a_n) - 1 \to 0$
uniformly for $\nu = 1, 2, \ldots, n$,
$\sum_{\nu=1}^{n} (f_\nu(t/a_n) - 1) \to -\frac{t^2}{2},$
and $\limsup_{n \to \infty} \sum_{\nu=1}^{n} |f_\nu(t/a_n) - 1| \le t^2.$
It follows that for every $t$
$\sum_{\nu=1}^{n} \log f_\nu(t/a_n) \to -\frac{t^2}{2},$
or $\prod_{\nu=1}^{n} f_\nu(t/a_n) \to e^{-t^2/2}.$
The first member of the last relation is, however, the c.f. of the
variable $(X_1 + \cdots + X_n)/a_n$, and thus by Theorem 11 the theorem
is proved.
We shall now consider the case when all the variables $X_\nu$ have
the same probability distribution.
Theorem 23.¹ Let $X_1, X_2, \ldots$ be a sequence of independent
variables all having the same d.f. $F(x)$. If we have
(74) $\frac{z^2 \int_{|x| > z} dF(x)}{\int_{|x| \le z} x^2\, dF(x)} \to 0$
as $z \to \infty$, then:
(I) The absolute moment $\beta_r = \int_{-\infty}^{\infty} |x|^r\, dF(x)$ is finite for $0 \le r < 2$,
so that in particular a finite mean value $m = \int_{-\infty}^{\infty} x\, dF(x)$ exists.
(II) It is possible to find a sequence $a_1, a_2, \ldots$ of positive numbers
such that the d.f. of the variable
(75) $U_n = \frac{X_1 + \cdots + X_n - nm}{a_n}$
tends to $\Phi(x)$ as $n \to \infty$.
1 Feller [1], [2], Khintchine [3], Lévy [3]. It is shown by these authors that (74)
is also a necessary condition for the existence of two sequences $\{a_n\}$ and $\{b_n\}$ such
that the d.f. of $(X_1 + \cdots + X_n)/a_n - b_n$ tends to $\Phi(x)$. On the other hand, (74) is not
a necessary condition for the convergence of $\beta_r$ for $0 \le r < 2$.
For the proof of this theorem, we may obviously assume that
$\beta_2$ is not finite, as otherwise (I) is trivial and (II) is an immediate
corollary of Theorem 20.
We shall first prove that $\beta_r$ is finite for $0 \le r < 2$. The function
$\psi(z) = \int_{|x| \le z} x^2\, dF(x) = -z^2 \int_{|x| > z} dF(x) + 2 \int_0^z v\, dv \int_{|x| > v} dF(x)$
is never decreasing for $z > 0$ and tends to infinity with $z$. By (74)
we have ($z \to \infty$)
$\psi(z) \sim 2 \int_0^z v\, dv \int_{|x| > v} dF(x) = o\left( \int_1^z \frac{\psi(v)}{v}\, dv \right).$
$\epsilon > 0$ being given, we denote by $M(z)$ the upper bound of $v^{-\epsilon} \psi(v)$
in the interval $1 \le v \le z$, and then obtain
$\int_1^z \frac{\psi(v)}{v}\, dv \le M(z) \int_1^z v^{\epsilon - 1}\, dv < \frac{1}{\epsilon}\, z^\epsilon M(z).$
Thus we have $z^{-\epsilon} \psi(z) = o(M(z))$, which shows that $\psi(z) = O(z^\epsilon)$
for every $\epsilon > 0$. It follows that, for any fixed $r$ such that $0 \le r < 2$
and for all sufficiently large $z$,
$\int_{z < |x| \le 2z} |x|^r\, dF(x) \le z^{r-2} \psi(2z) < z^{r/2 - 1},$
and this obviously implies that $\beta_r$ is finite. Hence in particular
the mean value $m$ is finite.
We now proceed to prove the assertion (II). As by hypothesis
$\beta_2$ is not finite, the first member of (74) is positive for all $z > 0$,
and the function
(76) $z(u) =$ lower bound of all $z > 0$ such that $\int_{|x| > z} dF(x) \le u$,
is a positive and never increasing function of $u$, uniquely defined
for 0 0
(77) f tiP(x) <'Y/ f :.r;tdF (x).
1a:1>* z
Let {An} denote 8, decreasing sequence of numbers sucb that
0<"1&<1 and
(78)
We put
;\,.,,40,
(79) $z_n = z(\lambda_n/n), \qquad a_n^2 = n \int_{|x| \le z_n} x^2\, dF(x),$
and are now going to show that, with this definition of $a_n$, the
d.f. of the variable $U_n$ defined by (75) tends to $\Phi(x)$. Putting
$\bar{X}_\nu = X_\nu - m$, we have $U_n = (\bar{X}_1 + \cdots + \bar{X}_n)/a_n$, the d.f. of each $\bar{X}_\nu$
being $F(x + m)$. We now apply Theorem 22 to the sequence
$\bar{X}_1, \bar{X}_2, \ldots$, and then only have to show that the conditions
(71)-(73) are satisfied if we put $F_\nu(x) = F(x + m)$ and define $a_n$
according to (79).
By means of (76) and (79) we obtain
(80) f tiP(x) ,f tiP(x) > An ,
n 12:1>1.. fl,
and further according to (77)-(79)
zldF(x) > n f rJ,F (x)
4TJ (lz,,) l:cf>P'a
> A.
=- 400,
so that Z1f, == 0 (a,.,). E" > 0 being given, we now choose no such that
for all n >ito we have Zn < !fra
n
and Im J < tc:a
n
, and then obtain
by (80)
nI dF(x+m)<ft,f dF(x)-+-O,
lad>... Ixf>Ztt
so that (71) is satisfied. We have further for 11,:> no

x2dF(x+m)-11
n I
II (x-m)2
dF
(x>-j dcSdF(x) I
an I
< IlfJ (m
2
-2mx)dF (x) 1+ 2a:f
an I an
< (m
2
+2,s1 i m I) +e2n f elF (x).
an .. l.itl>%ft
According to (79) and (80) the last expression tends, however, to
zero as n-..+oo, so that (72) is also satisfied.
Thus it only remains to show that (73) is satisfied. By (74) we
have for every fixed S> 0 and for all sufficiently large z
zJ Ix IelF(X)=zsf dF (:t)+zjtDdvf dF (x)
Izi '>z ixl>$ 11 Ixl>'"
<8+(z) +8zJ'".o+ ail
v
=2&P(z)+8z f IxldF(x),
eJ
a.nd consequently, putting z= lEan'
for every fixed E>O, aa n-+co. By (79) and (80) we have, how-
ever, for all n >
and thus by (81)
Ix IdF (x)-+O.
aft Ia: I >icGtt
Finally we have, the mean va.lue of each Xv being equal to zero;
r xdF(x+1n) I
ani .. an l:ct>4Ed,w I
2nf 2nf
<-
an 1:I>fa,. an
Thus (73) is satisfied, and the proof of Theorem 23 is completed.
CHAPTER VII
LIAPOUNOFF'S THEOREM.
ASYMPTOTIC EXPANSIONS
1. In VI, 3, we have considered a sequence of independent
variables $\{X_n\}$ such that $X_n$ has the d.f. $F_n(x)$ with the mean
value zero and the s.d. $\sigma_n$. As in VI, 3, we put
$s_n^2 = \sigma_1^2 + \cdots + \sigma_n^2$
and
(82) $\mathfrak{F}_n(x) = F_1(s_n x) * \cdots * F_n(s_n x),$
so that $\mathfrak{F}_n(x)$ is the d.f. of the variable $(X_1 + \cdots + X_n)/s_n$. The
corresponding c.f. is then
(83) $\mathfrak{f}_n(t) = f_1(t/s_n) \cdots f_n(t/s_n).$
If the Lindeberg condition (64) is satisfied, it follows from
Theorem 21 that $\mathfrak{F}_n(x)$ tends to the normal function $\Phi(x)$ as
$n \to \infty$. It is then natural to try to investigate the asymptotic
behaviour of the difference $\mathfrak{F}_n(x) - \Phi(x)$. In this respect, it
might be desired: (I) to find an upper limit for the modulus of
the difference $\mathfrak{F}_n(x) - \Phi(x)$, and (II) to obtain some kind of
asymptotic expansion of this difference for large values of $n$.
In the present Chapter, both these questions will be treated.
In the first place it will be shown (Theorem 24) that, under fairly
general conditions, we have $|\mathfrak{F}_n(x) - \Phi(x)| < K \log n/\sqrt{n}$, where
$K$ is independent of $n$ and $x$. It will then be shown (Theorems
25, 26) that, subject to conditions of a somewhat more restrictive
character, an asymptotic expansion of $\mathfrak{F}_n(x) - \Phi(x)$ in powers of
$n^{-1/2}$ can be obtained. From this expansion follows, in particular,
the relation $|\mathfrak{F}_n(x) - \Phi(x)| < K/\sqrt{n}$, which is an improvement
of the preceding inequality. In the last paragraph of the Chapter,
we shall make some remarks concerning the relations between
our asymptotic expansions and the expansions in series of
Hermite polynomials which have been widely used in applications
to mathematical statistics.
(84)
2. Throughout the whole Chapter, we shall consider a sequence
$X_1, X_2, \ldots$ of independent random variables such that $X_n$ has the
mean value zero and the s.d. $\sigma_n$. The trivial case when all the $\sigma_n$ are
equal to zero will always be excluded. The $\nu$th order moment, absolute
moment and semi-invariant (cf. IV, 2) of the variable $X_n$ will be
denoted by $\alpha_{\nu n}$, $\beta_{\nu n}$ and $\gamma_{\nu n}$ respectively. Thus in particular
$\alpha_{1n} = \gamma_{1n} = 0, \qquad \alpha_{2n} = \beta_{2n} = \gamma_{2n} = \sigma_n^2.$
Throughout the whole Chapter it will be assumed that there exists
an integer $k \ge 3$ such that $\beta_{kn}$ is finite for all $n = 1, 2, \ldots$. It then follows
that $\alpha_{\nu n}$, $\beta_{\nu n}$ and $\gamma_{\nu n}$ are finite for $\nu = 1, 2, \ldots, k$. In the particular
case when all moments are finite, $k$ may be chosen as large as we
please.
We shall use the letters $\vartheta$ and $\Theta_k$ to denote unspecified quantities
such that $|\vartheta| \le 1$, while $|\Theta_k|$ is less than a number depending
only on $k$.
All the results of this Chapter take a particularly simple form
in the case when all the variables $X_n$ have the same d.f. We shall
refer to this case as the case of equal components, and the common
d.f. of the variables $X_n$ will be denoted by $F(x)$. If, in this case,
$\sigma$ denotes the s.d. of $X_n$, we have $s_n = \sigma\sqrt{n}$, and the relations
(82) and (83) become
$\mathfrak{F}_n(x) = (F(\sigma x \sqrt{n}))^{n*}, \qquad \mathfrak{f}_n(t) = (f(t/(\sigma\sqrt{n})))^n.$
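3. In this paragraph we shall deduce some lemmas that are
required for the proofs of the results indicated in 1. We put for
$\nu = 2, 3, \ldots, k$
(84) $B_{\nu n} = \frac{1}{n} (\beta_{\nu 1} + \cdots + \beta_{\nu n}), \qquad \Gamma_{\nu n} = \frac{1}{n} (\gamma_{\nu 1} + \cdots + \gamma_{\nu n}),$
$\rho_{\nu n} = \frac{B_{\nu n}}{B_{2n}^{\nu/2}}, \qquad \lambda_{\nu n} = \frac{\Gamma_{\nu n}}{\Gamma_{2n}^{\nu/2}}.$
Thus for $\nu = 2$ we have $B_{2n} = \Gamma_{2n} = s_n^2/n$, $\rho_{2n} = \lambda_{2n} = 1$. $B_{\nu n}$ is the
$\nu$th absolute moment of the d.f. $(F_1(x) + \cdots + F_n(x))/n$, and thus
by (20) $B_{\nu n}^{1/\nu}$ never decreases as $\nu$ increases from 2 to $k$, so that we
have for $\nu = 2, 3, \ldots, k$
(85) $1 \le \rho_{\nu n}^{1/\nu} \le \rho_{kn}^{1/k}.$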
(90)
It follows from (42) that $n^{-(\nu-2)/2} \lambda_{\nu n}$ is the $\nu$th order semi-invariant
of $\mathfrak{F}_n(x)$. Further, it follows from (27) that $|\Gamma_{\nu n}| \le \nu^\nu B_{\nu n}$,
and hence
(86) $|\lambda_{\nu n}| \le \nu^\nu \rho_{\nu n} \le (k^k \rho_{kn})^{\nu/k}.$
In the particular case of equal components $B_{\nu n}$, $\Gamma_{\nu n}$, $\rho_{\nu n}$ and
$\lambda_{\nu n}$ are all independent of $n$, and we have $B_{\nu n} = \beta_\nu$, $\Gamma_{\nu n} = \gamma_\nu$, where
$\beta_\nu$ and $\gamma_\nu$ denote the $\nu$th order absolute moment and the $\nu$th
order semi-invariant of the common d.f. $F(x)$.
Besides the case of equal components, we shall also sometimes
consider the case when the following condition is satisfied: it is
possible to find two positive constants $g$ and $G$ such that for all $n$
(87) $B_{2n} > g, \qquad B_{kn} < G.$
Obviously this case includes the case of equal components. If
(87) is satisfied, it follows from (85) and (86) that $\rho_{\nu n}$ and $\lambda_{\nu n}$ are
uniformly bounded for all $n \ge 1$ and for $\nu = 2, 3, \ldots, k$.
We now consider the c.f. $\mathfrak{f}_n(t)$ of the variable $(X_1 + \cdots + X_n)/s_n$
as defined by (83). Putting
(88) $T_{kn} = \frac{\sqrt{n}}{4 \rho_{kn}^{3/k}},$
our first object will be to show that in the interval $|t| \le \sqrt[3]{T_{kn}}$
there exists a certain expansion of $\mathfrak{f}_n(t)$ which, in the particular
case of equal components, becomes an ordinary asymptotic
expansion in powers of $n^{-1/2}$. (In the case of equal components,
$\rho_{kn}$ is independent of $n$ and thus $T_{kn}$ is, for large values of $n$, of the
same order of magnitude as $\sqrt{n}$.)
Lemma 2.¹ For $|t| \le \sqrt[3]{T_{kn}}$ we have
(89) $e^{t^2/2}\, \mathfrak{f}_n(t) = 1 + \sum_{\nu=1}^{k-3} \frac{P_{\nu n}(it)}{n^{\nu/2}} + \frac{\Theta_k}{T_{kn}^{k-2}} \left( |t|^k + |t|^{3(k-2)} \right),$
where
(90) $P_{\nu n}(it) = \sum_{j=1}^{\nu} c_{j\nu n}\, (it)^{\nu + 2j}$
is a polynomial of degree $3\nu$ in $(it)$, the coefficient $c_{j\nu n}$ being a polynomial
in $\lambda_{3n}, \lambda_{4n}, \ldots, \lambda_{\nu-j+3,n}$ with numerical coefficients, such that
(91) $c_{j\nu n} = \Theta_k\, \rho_{kn}^{(\nu+2j)/k}.$
Thus in the case of equal components $P_{\nu n}(it)$ is independent of $n$,
while in the more general case when (87) is satisfied the coefficients
of $P_{\nu n}(it)$ are bounded for all $n$.
1 Cramér [2].
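For every $r = 1, 2, \ldots, n$ we have by (66)
(92) $f_r(t/s_n) = 1 + U, \qquad U = \sum_{\nu=2}^{k-1} \frac{\alpha_{\nu r}}{\nu!} \left( \frac{it}{s_n} \right)^\nu + \vartheta \frac{\beta_{kr}}{k!} \left( \frac{|t|}{s_n} \right)^k.$
For $|t| \le \sqrt[3]{T_{kn}}$ we obtain, however, by (84) and (88)
(93) $\frac{\beta_{kr}^{1/k} |t|}{s_n} \le \frac{(n B_{kn})^{1/k} |t|}{(n B_{2n})^{1/2}} = n^{\frac{1}{k} - \frac{1}{2}} \rho_{kn}^{1/k} |t| \le 1,$
and thus we obtain from (20)
$|U| \le \sum_{\nu=2}^{k} \frac{1}{\nu!} \left( \frac{\beta_{kr}^{1/k} |t|}{s_n} \right)^\nu \le e - 2 < \frac{3}{4}.$
For $|U| < \frac{3}{4}$ we have, however,
(94) $\log (1 + U) = \sum_{j=1}^{\infty} (-1)^{j-1} \frac{U^j}{j}.$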
According to (92) U formally, a polynomial in t (in reality. the
factor.& depends of course on t), and the series
co !- It f)V

v. \ 8
n
is a Dlajorating expression for this polynomial. For any power
Vi, where 1 < k/2, we thus obtain from (92) the expansion
; _ k-"Ql CO! It 1)11
U - ::E +,& L t
v=2, 8
n
v:::-k
V
\ 8
n
k-l I it)V 8
k
t
k
= 3
vir
\f:-
",.2; 811./ 8
n
with coefficients 8
vjr
whioh are independent of t. From (92) and
(94). we thus obtain an expansion of logfr (tj8
71
J in power$ of it,
up to the term containing (it)k-l, and with an error term of the
74 ASYMPTOTIC EXPANSIONS
order tIc. According to (26), the coefficient of (it/8,,)11 inthis expan-
sion however, equal to Ywlv 1, 80 that we have for t , I
k-l y (it)V p..,.t
k
logJr(tjstt,)= -T - +@k--.:r
v-2 JI. 8
n
N"fi,
Summing here over r= 1, 2, ... , 11" we obtain according to (83)
and (84)
k-l",r (it)V nBlmtk
logf,,(t)= --r - +@k----:r-
v-I v. 8
n

= +0
k
'n
Pk
'A(-t )k.
2 v-3 vI yn v'n
Substituting tz for t and dividing by Z2, we have
t
l
1 k-3 A (it)V+2 ( Z )11 {k ( Z )k-o.
V:=log{e
2
(f (tz)i'}=!; v+l.tJ, - +e!kft -
n v-I (v+ 2)1 Vn le! \in
If we regard here t and n as fixed, and z as a, real variable such
that Iz I 1, we thus have for the function V =V (z) an expansion
inpowers ofz, withan error termofthe order zk-i. Thenobviously
there is a similar expansion for the function e
V
, 80 that we may
write for Iz I 1
!. k-3 ( Z )"
(95) e
Y
==e
9
(l" (tz-' =1+ J +B(z),
V" 1 :v n
where R(Z)==O(zk-l) %-+0. It is then readily seen that the
coefficient PVA (it) is a polynomial of degree 3v in it, which may be
put in the form (90).
According to (86), & majorating series for V== V(z) is
(96) 8&(PJ/: It I)s1=1 !: Itz I)V,
V'np=ro v. v'11,
and thus VI is, for j == 1, 2, ... , k - 2, majorated by
(97) 0
k
(pl':l t l)31(J!1)1 i
'\1'11, 11-0 v.. V
n
From the development
k-S Vi
eY:=:E -:;-+.&Vk-2e1YI
; .. 0 J.
we thus obtain, since the majorating series (96) shows that
I IVI< arc for It I
R (z) =0","'i;' (Pi': It 1)S1 (l!l)1 i (jPM: Itz J)V
:1-=1 v'n v-k-2-1 v. v
n
== Elk(pJJ: 1tz J)Tc-ll 'j;' (Pit: It/)2/ i; I, ('!:-)'V
1-1 v-ov.
= 0", (.;,,)1&-2{(pJ/: It l)k+(Pi': It 1>3(k-t>}
= (I t I'" + It 1
8
(k-1.
Putting $z = 1$ in (95), we thus obtain (89). Finally, the relation
(91) for the coefficients $c_{j\nu n}$ follows immediately from the majorating
series (97) if we observe that, in the expansion (95), a term
containing the product $(it)^{\nu+2j} (z/\sqrt{n})^\nu$ can only arise from the
development of the term $V^j/j!$. Thus Lemma 2 is proved.
We next consider the following Lemma 3, which gives an upper
limit of $|\mathfrak{f}_n(t)|$, valid in the interval $|t| \le T_{kn}$. If the behaviour
of the absolute moments $\beta_{2n}$ and $\beta_{kn}$ for large values of $n$ is not
too irregular, $T_{kn}$ as defined by (88) tends to infinity with $n$, so
that the interval $|t| \le \sqrt[3]{T_{kn}}$ of Lemma 2 is, for all sufficiently
large $n$, contained in the interval $|t| \le T_{kn}$.
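Lemma 3.¹ For $|t| \le T_{kn}$ we have
$|\mathfrak{f}_n(t)| \le e^{-t^2/3}.$
We have
$|f_r(t)|^2 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \cos t(x - y)\, dF_r(x)\, dF_r(y),$
but
$\cos t(x - y) \le 1 - \tfrac{1}{2} t^2 (x - y)^2 + \tfrac{1}{6} |t|^3 |x - y|^3 \le 1 - \tfrac{1}{2} t^2 (x - y)^2 + \tfrac{2}{3} |t|^3 (|x|^3 + |y|^3),$
and thus for $|t| \le T_{kn}$ we obtain
$|f_r(t/s_n)|^2 \le 1 - \frac{\sigma_r^2 t^2}{s_n^2} + \frac{4}{3} \frac{\beta_{3r} |t|^3}{s_n^3} \le \exp \left( -\frac{\sigma_r^2 t^2}{s_n^2} + \frac{4}{3} \frac{\beta_{3r} |t|^3}{s_n^3} \right),$
$|\mathfrak{f}_n(t)|^2 = \prod_{r=1}^{n} |f_r(t/s_n)|^2 \le \exp \left( -t^2 + \frac{4}{3} \frac{\rho_{3n} |t|^3}{\sqrt{n}} \right),$
$|\mathfrak{f}_n(t)| \le \exp \left( -\frac{t^2}{2} \left( 1 - \frac{4}{3} \frac{\rho_{3n} |t|}{\sqrt{n}} \right) \right) \le e^{-t^2/3}.$
Thus Lemma 3 is proved.
1 Liapounoff [2].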
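The inequality of Lemma 3 can be checked numerically in a simple case. The sketch below assumes the case of equal components with the Bernoulli variable $X = Y - p$ of VI, 5, takes $k = 3$, and reads (88) as $T_{3n} = \sqrt{n}/(4\rho_{3n})$ with $\rho_{3n} = (1 - 2pq)/\sqrt{pq}$; the function names and the grid of values of $t$ are illustrative assumptions.

```python
import cmath
import math


def bernoulli_cf(t, p):
    """c.f. of X = Y - p, where Y = 1 with probability p and 0 with probability q."""
    q = 1.0 - p
    return q * cmath.exp(-1j * p * t) + p * cmath.exp(1j * q * t)


def normalized_sum_cf(t, n, p):
    """c.f. of (X_1 + ... + X_n)/(sigma sqrt(n)) in the case of equal components."""
    sigma = math.sqrt(p * (1.0 - p))
    return bernoulli_cf(t / (sigma * math.sqrt(n)), p) ** n


def t_3n(n, p):
    """T_{3n} = sqrt(n) / (4 rho_3), with rho_3 = (1 - 2pq)/sqrt(pq)."""
    q = 1.0 - p
    rho3 = (1.0 - 2.0 * p * q) / math.sqrt(p * q)
    return math.sqrt(n) / (4.0 * rho3)


def max_violation(n, p, steps=400):
    """Largest value of |f_n(t)| - exp(-t^2/3) on a grid of 0 <= t <= T_{3n}."""
    big_t = t_3n(n, p)
    worst = -math.inf
    for i in range(steps + 1):
        t = big_t * i / steps
        gap = abs(normalized_sum_cf(t, n, p)) - math.exp(-t * t / 3.0)
        worst = max(worst, gap)
    return worst


if __name__ == "__main__":
    # Illustrative values only; the lemma predicts a non-positive result
    # up to rounding error.
    print(max_violation(50, 0.3), max_violation(10, 0.5))
```

Since both sides are even in $t$ in modulus, it is sufficient to examine $t \ge 0$.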
If, in the polynomial $P_{\nu n}(it)$, we replace each power $(it)^{\nu+2j}$ by
$(-1)^{\nu+2j} \Phi^{(\nu+2j)}(x)$, we obtain a linear aggregate of the derivatives
of the normal function $\Phi(x)$, that will be symbolically denoted by
$P_{\nu n}(-\Phi)$. Thus by Lemma 2
(98) $P_{\nu n}(-\Phi) = \sum_{j=1}^{\nu} (-1)^{\nu+2j} c_{j\nu n}\, \Phi^{(\nu+2j)}(x),$
where $c_{j\nu n}$ is a polynomial in the quantities $\lambda_{\nu n}$ such that
$c_{j\nu n} = \Theta_k\, \rho_{kn}^{(\nu+2j)/k}.$
Obviously we may write
(99) $P_{\nu n}(-\Phi) = p_{3\nu-1,n}(x)\, e^{-x^2/2},$
where $p_{3\nu-1,n}(x)$ is a polynomial of degree $3\nu - 1$ in $x$. In the case
of equal components, $P_{\nu n}(-\Phi)$ and $p_{3\nu-1,n}(x)$ are independent
of $n$, and in the more general case when (87) is satisfied, the
coefficients $c_{j\nu n}$ as well as the coefficients of $p_{3\nu-1,n}(x)$ are bounded
for all $n$. According to (62) we have
(100) $P_{\nu n}(it)\, e^{-t^2/2} = \int_{-\infty}^{\infty} e^{itx}\, dP_{\nu n}(-\Phi).$
We now define two "error terms" $R_{kn}(x)$ and $r_{kn}(t)$ by writing
the following expansions for the d.f. $\mathfrak{F}_n(x)$ and the c.f. $\mathfrak{f}_n(t)$
(101) $\mathfrak{F}_n(x) = \Phi(x) + \sum_{\nu=1}^{k-3} \frac{P_{\nu n}(-\Phi)}{n^{\nu/2}} + R_{kn}(x)$
$= \Phi(x) + \sum_{\nu=1}^{k-3} \frac{p_{3\nu-1,n}(x)}{n^{\nu/2}}\, e^{-x^2/2} + R_{kn}(x),$
(102) $\mathfrak{f}_n(t) = e^{-t^2/2} + \sum_{\nu=1}^{k-3} \frac{P_{\nu n}(it)}{n^{\nu/2}}\, e^{-t^2/2} + r_{kn}(t).$
From (100) we then obtain
$r_{kn}(t) = \int_{-\infty}^{\infty} e^{itx}\, dR_{kn}(x).$
Lemma 2 shows that we have $r_{kn}(t) = O(t^k)$ in the vicinity of $t = 0$,
and by the argument used in IV, 2, we conclude that
$\int_{-\infty}^{\infty} x^\nu\, dR_{kn}(x) = 0$
for $\nu = 0, 1, \ldots, k - 1$. Thus in particular $R_{kn}(x)$ satisfies the
conditions of Theorem 12.
We now proceed to the proof of the following lemma, which is
fundamental for the rest of the chapter.
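Lemma 4.¹ For $0 < \omega < 1$, we have for all real $x$ and all $h > 0$
(103) $\omega \int_x^{x+h} (y - x)^{\omega - 1} R_{kn}(y)\, dy = \Theta_k \left( \int_{T_{kn}}^{\infty} \frac{|\mathfrak{f}_n(t)|}{t^{\omega+1}}\, dt + \frac{1}{T_{kn}^{k-2}} \right).$
If the integral in the second member of this relation is convergent for
$\omega = 0$, we further have
$R_{kn}(x) = \Theta_k \left( \int_{T_{kn}}^{\infty} \frac{|\mathfrak{f}_n(t)|}{t}\, dt + \frac{1}{T_{kn}^{k-2}} \right).$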
For the proof of this lemma, we shall suppose that $T_{kn} > 1$,
so that $\sqrt[3]{T_{kn}} < T_{kn}$. If this does not hold, only trivial modifications
are necessary. (It will appear below that the conclusions
which will be drawn from Lemma 4 are all trivial in the case
$T_{kn} \le 1$, so that this case is not really interesting.) (33)
l In the brat edItlon of tillS 'rrac1, LLmlna 4 was stnted In a dlftclont ftlrlll
Impbcs, ill that if the first nu:nlOCu of (103) is leph\('cd by
'" (.G' - 1/1'"-1 R411 III) dlj.
we ttbt.aut a, rolation valid for U -1. Tn thltt form, the J..,emms, ,rus glven by
(""raJner [2], and a,pphed to the study of the asymptotic I)ropcltie$ of certain integral
averages of }'n (Cf. below, p. 84.)
we obtain
and further, using the inequality (36),
ICdJ:+h(y - X)W-l Rim(y)dy I< 0f: I IdJ
o(S:I I
dt
+A1+Al!+A
s
),
where $C$ denotes an absolute constant. For $A_1$, $A_2$ and $A_3$ we
have, on account of Lemmas 2 and 3,
which completes the proof of the first part of the Lemma. The
second part is proved in exactly the same way, using (35) instead
of (33).
4. In this paragraph, we shall use Lemma 4 to prove the
following theorem, which is due to Liapounoff.
Theorem 24.¹ Let $X_1, X_2, \ldots, X_n$ be independent variables
such that $X_\nu$ has the mean value zero and the s.d. $\sigma_\nu$, and put
$s_n^2 = \sigma_1^2 + \cdots + \sigma_n^2$. If the absolute moment $\beta_{3\nu} = E(|X_\nu|^3)$ is finite
for $\nu = 1, 2, \ldots, n$, the d.f. $\mathfrak{F}_n(x)$ of the variable $(X_1 + \cdots + X_n)/s_n$
satisfies for $n > 1$ the inequality
$|\mathfrak{F}_n(x) - \Phi(x)| < C \rho_{3n} \frac{\log n}{\sqrt{n}},$
where $C$ is an absolute constant, and $\rho_{3n}$ is defined by (84).
(It will be remembered that, in the particular case of equal
components, $\rho_{3n}$ is independent of $n$, while in the more general
case when (87) is satisfied, $\rho_{3n}$ is bounded for all $n$.)
1 Liapounoff [1], [2]. It is possible to show (cf. Cramér [1], and p. 19) that
we may take $C = 3$. In the works of Esseen and Gnedenko-Kolmogoroff quoted on
p. 119, it is shown that the factor $\log n$ in the evaluation of the error given in
Theorem 24 may be omitted. Cf. the Remark on p. 82 below. The evaluation thus
obtained is, in a certain sense, a best possible one.
Without loss of generality, we may in the proof of this theorem
assume $T_{3n} > 100$, as in the opposite case we have
$\rho_{3n}/\sqrt{n} = 1/(4 T_{3n}) \ge 1/400,$
so that the theorem is then trivial.
Let us denote by $X_{n+1}$ an auxiliary variable independent of
$X_1, X_2, \ldots, X_n$ and having the d.f. $F_{n+1}(x) = \Phi(x/(\epsilon s_n))$, where
$\epsilon$ is a parameter such that $0 < \epsilon < 1$. In the notation used in
the preceding paragraphs, we then have for the sequence
$X_1, X_2, \ldots, X_{n+1}$
$s_{n+1}^2 = s_n^2 (1 + \epsilon^2),$
$\mathfrak{F}_{n+1}(x) = \mathfrak{F}_n(x \sqrt{1 + \epsilon^2}) * \Phi \left( \frac{x \sqrt{1 + \epsilon^2}}{\epsilon} \right),$
$\mathfrak{f}_{n+1}(t) = \mathfrak{f}_n \left( \frac{t}{\sqrt{1 + \epsilon^2}} \right) e^{-\frac{\epsilon^2 t^2}{2 (1 + \epsilon^2)}}.$
We now apply Lemma 4 to the variables $X_1, \ldots, X_{n+1}$, putting
$k = 3$. It is then obviously permitted to use the second part of
the Lemma as, according to the above expression for $\mathfrak{f}_{n+1}(t)$,
the integral occurring in the second member of (103) is absolutely
convergent for $\omega = 0$. Replacing $x \sqrt{1 + \epsilon^2}$ by $x$, we then
obtain
(104) lit" (z).<1l -$(v-=' 2) 'I < To 0 +f'" e-2(r:..)dt
E 1 +E 3.n+1 TSft +
1
t
<0 e-1ctTa..+I) ,
:.L8,.n.+l E 3.n+l
where $C$ is an absolute constant. During the rest of this proof,
we shall use the letter $C$ to denote an unspecified absolute constant.
We have further
f:>
and hence deduce, denoting by A> 1 a new parameter,
(105) f:49(;)
o
<A
e
s
(106)
()
2.
From (104)-(106) we obtain
x )-o(i
e
-
f
+
p
--!.-+
1+E
t
. .111 3, n+1 3, 1&+...
(
%) (1 1 1 )
tT,,(x-hE)<4l 4/1+1' +0 he ! +Pa,n+l +E1Tl'Hl e-lf':7"",,+.
Replacing a: in the first inequality by %-1H:, in the second by
a:+Jw, and using the relation we have
further
(107)
(
1 1 1 )
<0 1H+h:6 I + e.-."7'1"+1 it
3,1f,+1 ".+1 I
We now dispose of the parameters hand 4! by taking
.. /--- Vlog1:
A-v 210gT
hJ
E==3 In.
a_
From the assumption $T_{3n} > 100$ it then follows that we have
$h > 1$ and $0 < \epsilon < 1$. Further, according to (84) and (88) we have
T. v;;.:+:I (1 +111)1 > Pan. ..
"_+1= 4p"1H-l - 3It 1+8J;c
3
T8ft,
>
1+ 1001
and hence 1'Tl.+1> log Paa-
Introducing in (101), we then obtain, since
!() (v'l:ot=)-4l (a:) I<
. logn
! (:1:) -4l Cal)! < 0 T
k
3r. < apIA v'n
and the theorem is proved.
This theorem is directly applicable, e.g., to the Bernoulli
distribution considered in V, 5, and VI, 5, in which case we have
$\rho_{3n} = (1 - 2pq)/\sqrt{pq}$. Thus if $\mathfrak{F}_n(x)$ denotes the probability of the
relation $(\nu - np)/\sqrt{npq} \le x$, where $\nu$ is the number of white balls
obtained in a set of $n$ drawings, the probability of drawing a
white ball being each time equal to $p = 1 - q$, we have for all $n > 1$
$|\mathfrak{F}_n(x) - \Phi(x)| < C \frac{\log n}{\sqrt{npq}},$
where $C$ is an absolute constant.
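For the Bernoulli case the left-hand member of this inequality can be computed exactly, since the supremum of the difference between a step-function and a continuous increasing function is attained at a jump. The following sketch compares it with the quantity $\rho_{3n} \log n/\sqrt{n}$; the function names and the values of $n$ and $p$ are illustrative assumptions, and the absolute constant $C$ of the theorem is deliberately not included in the comparison.

```python
import math


def normal_cdf(x):
    """Standard normal d.f. Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def binom_pmf(n, k, p):
    """P(nu = k) for nu ~ Binomial(n, p), computed in log space."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_pmf)


def sup_distance(n, p):
    """sup over x of |F_n(x) - Phi(x)| for the d.f. of (nu - np)/sqrt(npq)."""
    sd = math.sqrt(n * p * (1.0 - p))
    cdf = 0.0
    worst = 0.0
    for k in range(n + 1):
        x = (k - n * p) / sd
        target = normal_cdf(x)
        worst = max(worst, abs(cdf - target))  # left-hand limit at the jump
        cdf += binom_pmf(n, k, p)
        worst = max(worst, abs(cdf - target))  # value at the jump
    return worst


def liapounoff_scale(n, p):
    """rho_3n * log(n) / sqrt(n), with rho_3n = (1 - 2pq)/sqrt(pq); C is omitted."""
    q = 1.0 - p
    rho3 = (1.0 - 2.0 * p * q) / math.sqrt(p * q)
    return rho3 * math.log(n) / math.sqrt(n)


if __name__ == "__main__":
    # Illustrative values only.
    for n in (10, 100, 1000):
        print(n, sup_distance(n, 0.3), liapounoff_scale(n, 0.3))
```

In these examples the exact distance lies well below the scale $\rho_{3n} \log n/\sqrt{n}$, which is consistent with the remark in the footnote to Theorem 24 that the factor $\log n$ may be omitted.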
5. We now return to Lemma 4 with an arbitrary $k \ge 3$. In
this paragraph, we shall consider the particular case of equal
components. It will be shown that, in this case, it is possible to
give a very simple sufficient condition for the existence of an
asymptotic expansion of the difference $\mathfrak{F}_n(x) - \Phi(x)$ in powers
of $n^{-1/2}$.
In the case of equal components, the moments etc. introduced
in 2-3 are independent of $n$, so that we may write $\rho_\nu$, $P_\nu$ and
$p_{3\nu-1}$ in the place of $\rho_{\nu n}$, $P_{\nu n}$ and $p_{3\nu-1,n}$.
We shall say that a d.f. $F(x)$ satisfies the condition (C) if, for
the corresponding c.f. $f(t)$, we have
(C) $\limsup_{|t| \to \infty} |f(t)| < 1.$
By Theorem 7, the condition (C) is certainly satisfied if, in the
standard decomposition of $F(x)$ according to (13), the coefficient
$a_I$ of the absolutely continuous component is different from zero.
Theorem 25.¹ Let $X_1, X_2, \ldots$ be a sequence of independent
variables all having the same d.f. $F(x)$ with the mean value zero,
the s.d. $\sigma$, and a finite absolute moment $\beta_k$ of order $k \ge 3$.
By $\mathfrak{F}_n(x) = (F(\sigma x \sqrt{n}))^{n*}$ we denote the d.f. of the variable
$(X_1 + \cdots + X_n)/(\sigma\sqrt{n})$. If $F(x)$ satisfies the condition (C), we then
have the expansion
(108) $\mathfrak{F}_n(x) = \Phi(x) + \sum_{\nu=1}^{k-3} \frac{P_\nu(-\Phi)}{n^{\nu/2}} + R_{kn}(x)$
$= \Phi(x) + \sum_{\nu=1}^{k-3} \frac{p_{3\nu-1}(x)}{n^{\nu/2}}\, e^{-x^2/2} + R_{kn}(x),$
with
(109) $|R_{kn}(x)| < \frac{M}{n^{(k-2)/2}},$
where $M$ depends on $k$ and on the given function $F$, but is independent
of $n$ and $x$.
Remark. For $k = 3$ this theorem shows that, as soon as $\beta_3$ is
finite and condition (C) is satisfied, we have
$|\mathfrak{F}_n(x) - \Phi(x)| < \frac{M}{\sqrt{n}},$
where $M$ is independent of $n$ and $x$. Thus the condition (C)
enables us to improve the Liapounoff limit of the error as given
by Theorem 24.²
1 Cramér [2].
2 It is shown in the works of Esseen and Gnedenko-Kolmogoroff quoted on
page 119 that this improvement holds even without the condition (C).
Proof. From Lemma 4 we obtain, using (88) and substituting
$\sigma t \sqrt{n}$ for $t$ in the integral,
(110) $\omega \int_x^{x+h} (y - x)^{\omega-1} R_{kn}(y)\, dy = \Theta_k \left( \sigma^{-\omega} n^{-\omega/2} \int_{1/(4\sigma\rho_k^{3/k})}^{\infty} \frac{|\mathfrak{f}_n(\sigma t \sqrt{n})|}{t^{\omega+1}}\, dt + \frac{\rho_k^{3(k-2)/k}}{n^{(k-2)/2}} \right).$
Given any d.f. $F(x)$ satisfying the condition (C), it follows from
the Remark p. 26 that we can find $c > 0$ such that $|f(t)| < e^{-c}$ for
$t > 1/(4\sigma\rho_k^{3/k})$. By (83), however, $\mathfrak{f}_n(\sigma t \sqrt{n}) = (f(t))^n$, and thus we
obtain from (110)
(111) $\left| \omega \int_x^{x+h} (y - x)^{\omega-1} R_{kn}(y)\, dy \right| < M \left( \frac{e^{-cn}}{\omega} + n^{-(k-2)/2} \right).$
$M$ denotes here, as during the rest of this proof, an unspecified
quantity depending only on $k$ and on the given function $F$, but
independent of $n$, $x$, $h$ and $\omega$.
Now $R_{kn}(y)$ is the difference between the never decreasing
function $\mathfrak{F}_n(y)$ and the function $U(y) = \Phi + \sum_{\nu=1}^{k-3} n^{-\nu/2} P_\nu(-\Phi)$.
The derivative $U'(y)$ obviously satisfies the relation $|U'(y)| < M$,
so that we have for every $y$ in the interval of integration
$R_{kn}(x) - Mh < R_{kn}(y) < R_{kn}(x + h) + Mh.$
By means of these inequalities, we obtain from (111)
$R_{kn}(x) < M \left( h + h^{-\omega} \frac{e^{-cn}}{\omega} + h^{-\omega} n^{-(k-2)/2} \right),$
$R_{kn}(x + h) > -M \left( h + h^{-\omega} \frac{e^{-cn}}{\omega} + h^{-\omega} n^{-(k-2)/2} \right).$
Replacing in the last inequality $x + h$ by $x$, we thus have generally
(112) $|R_{kn}(x)| < M \left( h + h^{-\omega} \frac{e^{-cn}}{\omega} + h^{-\omega} n^{-(k-2)/2} \right).$
Taking here $h = n^{-(k-2)/2}$, $\omega = 1/\log n$, we obtain (109), and the
theorem is proved.
It is easily shown by examples that Theorem 25 does not hold
true without the condition (C). Let, e.g., $F(x)$ be the step-function
connected with the simple Bernoulli distribution (V, 5):
$F(x) = 0$ for $x < -p$,
$F(x) = q$ for $-p \le x < q$,
$F(x) = 1$ for $x \ge q$.
$F(x)$ being of type II (cf. III, 1), the condition (C) is obviously
not satisfied. Taking $k = 4$, Theorem 25 would give the expansion
$\mathfrak{F}_n(x) = \Phi(x) + \frac{p - q}{6\sqrt{npq}}\, \Phi^{(3)}(x) + O\left( \frac{1}{n} \right).$
This can, however, not be true, as it is readily seen that $\mathfrak{F}_n(x)$
has, in the vicinity of $x = 0$, discontinuities where the saltus is
of the same order of magnitude as $n^{-1/2}$.
However, it can be shown (Cramér [2], p. 56) that, even without
condition (C), an asymptotic expansion of the form given in
Theorem 25 holds for an appropriately weighted average of the
function $\mathfrak{F}_n(x)$ over any given interval. In the second member
of (108), we shall then have to introduce the corresponding
average of $\Phi(x)$ and its derivatives, while the order of the error
term will only differ by a factor $(\log n)^2$ from the order given
by (109).
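The improvement obtained from the first term of the expansion (108) can be observed numerically in a case where condition (C) holds. The sketch below assumes components $X = Z - 1$ with $Z$ exponentially distributed with mean 1, an example chosen here and not taken from the text; for this distribution $\sigma = 1$, the third semi-invariant is 2, so that $\lambda_3 = 2$, and the sum of $n$ components has an explicitly known d.f. With $k = 4$ the expansion reduces to $\Phi(x) - \lambda_3 \Phi^{(3)}(x)/(6\sqrt{n})$, where $\Phi^{(3)}(x) = (x^2 - 1)\, e^{-x^2/2}/\sqrt{2\pi}$. All function names are illustrative.

```python
import math


def normal_pdf(x):
    """Derivative of Phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Standard normal d.f. Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def erlang_cdf(n, y):
    """P(Z_1 + ... + Z_n <= y) for independent exponential Z_i with mean 1."""
    if y <= 0.0:
        return 0.0
    term = math.exp(-y)
    total = term
    for j in range(1, n):
        term *= y / j
        total += term
    return 1.0 - total


def exact_df(n, x):
    """d.f. of (X_1 + ... + X_n)/sqrt(n) with X_i = Z_i - 1 (mean 0, s.d. 1)."""
    return erlang_cdf(n, n + x * math.sqrt(n))


def one_term_expansion(n, x, lambda3=2.0):
    """Phi(x) + P_1(-Phi)/sqrt(n), where P_1(-Phi) = -(lambda_3/6) Phi'''(x)."""
    third_derivative = (x * x - 1.0) * normal_pdf(x)
    return normal_cdf(x) - lambda3 * third_derivative / (6.0 * math.sqrt(n))


def max_errors(n, grid=601):
    """Largest errors of Phi and of the one-term expansion on a grid of [-3, 3]."""
    xs = [-3.0 + 6.0 * i / (grid - 1) for i in range(grid)]
    err_normal = max(abs(exact_df(n, x) - normal_cdf(x)) for x in xs)
    err_one_term = max(abs(exact_df(n, x) - one_term_expansion(n, x)) for x in xs)
    return err_normal, err_one_term


if __name__ == "__main__":
    # Illustrative values only.
    for n in (20, 80):
        print(n, max_errors(n))
```

The first error should decrease roughly like $n^{-1/2}$ and the second roughly like $n^{-1}$, as (109) indicates for $k = 3$ and $k = 4$ respectively.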
6. We shall now prove an analogue to Theorem 25 for the case
of unequal components. We shall then have to lay down certain
conditions which, roughly speaking, may be interpreted by saying
that the d.f.'s of the variables $X_n$ will be required to satisfy the
condition (C) on the average in a certain specified sense.
According to (13), any d.f. $F_r(x)$ may be uniquely
represented in the form
(113) $F_r(x) = \kappa_r G_r(x) + (1 - \kappa_r) H_r(x), \qquad (0 \le \kappa_r \le 1),$
where $G_r(x)$ is a d.f. of type I (absolutely continuous), while
$H_r(x)$ is a d.f. which does not contain any component of type I.
We now proceed to prove the following theorem.
(115)
(114)
Theorem 26. Let $X_1, X_2, \ldots$ be independent variables such
that $X_r$ has the d.f. $F_r(x)$ with the mean value zero, the s.d. $\sigma_r$,
and a finite absolute moment $\beta_{kr}$ of order $k \ge 3$. Let $F_r(x)$ be represented
according to (113) and suppose that the derivative $G_r'(x)$ is
of bounded total variation $v_r$ in $(-\infty, +\infty)$. Suppose further that
we have for infinitely increasing $n$
1 'If, I(
-- -'--+00
lognr-=11 ,
.. 1 ICr
- -- A.A ---+-oo
8
110
logn"_ll
8
11
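$s_n$ and $T_{kn}$ being defined as in the preceding paragraphs. For the
d.f. $\mathfrak{F}_n$ of the variable $(X_1 + \cdots + X_n)/s_n$ we then have the
expansion
$\mathfrak{F}_n(x) = \Phi(x) + \sum_{\nu=1}^{k-3} \frac{P_{\nu n}(-\Phi)}{n^{\nu/2}} + R_{kn}(x)$
with an error term $R_{kn}(x)$ satisfying the relation
(116) $|R_{kn}(x)| < \frac{M}{T_{kn}^{k-2}},$
where $M$ is independent of $n$ and $x$.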
Remark. An important particular case is the case when
(a) the conditions (87) are satisfied, and (b) the variations $v_r$ are
uniformly bounded for all $r = 1, 2, \ldots$. As we have
== B'n/( t
the conditions (114:) and (115) are in this case equivalent and
reduce to the single condition
1 "
-I- 1::
og 7t"llIDl
$T_{kn}$ is in this case of the same order of magnitude as $\sqrt{n}$, so that
(116) becomes
$|R_{kn}(x)| < \frac{M}{n^{(k-2)/2}}.$
For $k = 3$ we obtain here the same improvement of Liapounoff's
theorem as in Theorem 25.
Proof. :From (91) and (98) we obtain

This shows that for T
kn
;$ 1 the assertion of the theoremis trivial,
so that we may assume throughout the proof Pie,,> 1. From
Lemma 4 we obtain, using (83), for 0 <Q) < 1,
wJ:-"(y-:r:)_1Rim(g)tlg=e" .
n
where Z= upper bound of II II" (t) I for t>T
Im
/8
n
-
r-=l
Hence we obtain by the same argument as that used for the
deduction of (112)
(
k-fJJZ )
IR
kn
(x) I< Elk A+-;- +k-tDTkJ,"-t) -
(For this deduction we require the result that the derivative of
k-8
the function U(t)=4l+ I; n,-v/IP
n
(-4 satisfies, for 1'1:.> 1, the
v-I
relation IU' (t) I< e
k
This is easily proved by means of (91)
and (98).)
Taking $h = T_{kn}^{-(k-1)}$, $\omega = 1/\log T_{kn}$, we now obtain
$|R_{kn}(x)| < \Theta_k \left( T_{kn}^{-(k-2)} + Z \log T_{kn} \right).$
So far we have made no use of the assumptions (114) and (115).
If we can now show that, owing to these assumptions, we have
for every fixed $A > 0$
M
(117) Z<
where $M$ is independent of $n$, the theorem will obviously be
proved.
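By hypothesis we have, denoting by $g_r(t)$ the c.f. of $G_r$,
$|f_r(t)| \le \kappa_r |g_r(t)| + 1 - \kappa_r$
and
$|g_r(t)| \le \frac{v_r}{|t|}.$
For $|t| \ge 2v_r$ we thus have
$|f_r(t)| \le 1 - \tfrac{1}{2}\kappa_r,$
and hence for $|t| < 2v_r$ by Lemma 1
$|f_r(t)| \le 1 - \left( \kappa_r - \tfrac{1}{4}\kappa_r^2 \right) \frac{t^2}{32 v_r^2} \le 1 - \kappa_r \frac{t^2}{64 v_r^2}.$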
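It follows that we have for all $t > 0$
$|f_r(t)| \le 1 - \frac{\kappa_r}{64} \operatorname{Min}\left( 1, \frac{t^2}{v_r^2} \right),$
and consequently for $t > T_{kn}/s_n$
$|f_r(t)| \le 1 - \frac{\kappa_r}{64} \operatorname{Min}\left( 1, \frac{T_{kn}^2}{s_n^2 v_r^2} \right) \le \exp\left\{ -\frac{\kappa_r}{64} \operatorname{Min}\left( 1, \frac{T_{kn}^2}{s_n^2 v_r^2} \right) \right\},$
$\prod_{r=1}^{n} |f_r(t)| \le \exp\left\{ -\frac{1}{64} \sum_{r=1}^{n} \kappa_r \operatorname{Min}\left( 1, \frac{T_{kn}^2}{s_n^2 v_r^2} \right) \right\}.$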
According to (114) and (115) the last expression is, however,
for any fixed $A > 0$ and for all sufficiently large $n$ less than
so that (117) holds true, and the theorem is
proved.
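7. It has been proved in the preceding paragraphs that,
subject to certain conditions, the series
(101a)  $\mathfrak{F}_n(x) = \Phi(x) + \frac{P_{1n}(-\Phi)}{n^{1/2}} + \frac{P_{2n}(-\Phi)}{n} + \frac{P_{3n}(-\Phi)}{n^{3/2}} + \dots$
gives an asymptotic expansion¹ of $\mathfrak{F}_n(x)$ for large values of $n$.
According to (95) and (98), the $P_{\nu n}(-\Phi)$ are for $\nu = 1, 2, \dots, k-3$
defined in the following manner. We first define an ordinary
polynomial $P_{\nu n}(t)$ by the relation
$\exp\left( \sum_{\nu=1}^{k-2} \frac{\lambda_{\nu+2,n}}{(\nu+2)!}\, t^{\nu+2} z^{\nu} \right) = 1 + \sum_{\nu=1}^{k-3} P_{\nu n}(t)\, z^{\nu} + \dots$
Here $\lambda_{\nu n}$ denotes the quantity defined by (84), so that $n^{-(\nu-2)/2} \lambda_{\nu n}$
is the $\nu$th order semi-invariant of $\mathfrak{F}_n(x)$; $k$ is an integer such that
the $k$th order absolute moments are known to be finite for all
the components of $\mathfrak{F}_n(x)$; and finally $z$ is an auxiliary variable
which varies in the vicinity of $z = 0$. To obtain $P_{\nu n}(-\Phi)$ we then
replace in $P_{\nu n}(t)$ each power $t^{\mu}$ by the function $(-1)^{\mu} \Phi^{(\mu)}(x)$. In
this way we obtain the expressions
$P_{1n}(-\Phi) = -\frac{\lambda_{3n}}{3!} \Phi^{(3)}(x),$
$P_{2n}(-\Phi) = \frac{\lambda_{4n}}{4!} \Phi^{(4)}(x) + \frac{10 \lambda_{3n}^2}{6!} \Phi^{(6)}(x),$
$P_{3n}(-\Phi) = -\frac{\lambda_{5n}}{5!} \Phi^{(5)}(x) - \frac{35 \lambda_{3n} \lambda_{4n}}{7!} \Phi^{(7)}(x) - \frac{280 \lambda_{3n}^3}{9!} \Phi^{(9)}(x)$
for the first terms of the development (101a). It will be remembered
that in the case of equal components the $\lambda_{\nu n}$ (and thus also
the $P_{\nu n}$) are independent of $n$.
1 The formal definition of this series was given by Edgeworth [1].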
On the other hand, a development of the type
(101b)  $\mathfrak{F}_n(x) = \Phi(x) + \frac{c_{3n}}{3!} \Phi^{(3)}(x) + \frac{c_{4n}}{4!} \Phi^{(4)}(x) + \dots$
has been much used by writers on mathematical statistics (cf.
e.g. works by Charlier, Bruns, Gram and Thiele), and it has been
claimed (without correct proof) that this expansion should
possess asymptotic properties similar to those discussed above
for the expansion (101a). The coefficients $c_{\nu n}$ are here determined
by the relation
$c_{\nu n} = (-1)^{\nu} \int_{-\infty}^{\infty} H_{\nu}(x)\, d\mathfrak{F}_n(x),$
where $H_{\nu}(x)$ is the $\nu$th Hermite polynomial:
$H_{\nu}(x) = (-1)^{\nu} e^{x^2/2} \frac{d^{\nu}}{dx^{\nu}} e^{-x^2/2}.$
From these expressions we obtain, by means of the relations
between moments and semi-invariants (IV, 2),
$c_{3n} = -\frac{\lambda_{3n}}{n^{1/2}}, \quad c_{4n} = \frac{\lambda_{4n}}{n}, \quad c_{5n} = -\frac{\lambda_{5n}}{n^{3/2}}, \quad c_{6n} = \frac{\lambda_{6n}}{n^{2}} + \frac{10 \lambda_{3n}^2}{n}, \quad \dots$
For larger values of $\nu$, the expressions of the $P_{\nu n}$ and the $c_{\nu n}$
become increasingly complex, but it will be seen from the above
that the two expansions (101a) and (101b) may be regarded as
rearrangements of one another. It follows from our theorems
that it is only (101a) which gives, in the ordinary sense, an
asymptotic expansion of $\mathfrak{F}_n(x)$. On the other hand, the expansion
(101b) may be considered formally simpler, as the $c_{\nu n}$
are determined by the simple relation given above, which rests on the
orthogonality of the Hermite polynomials.¹
1 For a more detailed analysis of the relations between the two types of
expansions cf. Cramér [2].
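The first terms of (101a) can be tried out numerically. The following is a minimal Python sketch, not part of the original text, under the assumption that the components are exponentially distributed with mean 1 (so that, after centring, $\lambda_3 = 2$ and $\lambda_4 = 6$, and the sum has an Erlang distribution whose d.f. is known in closed form). All function names are hypothetical.

```python
import math

def phi(x):
    """Density of the normal function Phi."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def edgeworth_cdf(x, n, lam3, lam4):
    """Phi + P_1(-Phi)/n^(1/2) + P_2(-Phi)/n for equal components."""
    d3 = (x ** 2 - 1.0) * phi(x)                           # Phi^(3)
    d4 = -(x ** 3 - 3.0 * x) * phi(x)                      # Phi^(4)
    d6 = -(x ** 5 - 10.0 * x ** 3 + 15.0 * x) * phi(x)     # Phi^(6)
    p1 = -lam3 / 6.0 * d3
    p2 = lam4 / 24.0 * d4 + 10.0 * lam3 ** 2 / 720.0 * d6
    return Phi(x) + p1 / math.sqrt(n) + p2 / n

def exact_cdf(x, n):
    """d.f. of (X_1 + ... + X_n - n)/sqrt(n) for X_r exponential with mean 1."""
    y = n + x * math.sqrt(n)
    if y <= 0.0:
        return 0.0
    term, total = 1.0, 1.0
    for j in range(1, n):
        term *= y / j
        total += term
    return 1.0 - math.exp(-y) * total

n = 10
for x in (0.0, 0.5):
    exact = exact_cdf(x, n)
    print(x, abs(Phi(x) - exact), abs(edgeworth_cdf(x, n, 2.0, 6.0) - exact))
```

In this example the two correction terms reduce the error of the plain normal approximation considerably, in agreement with the asymptotic character of (101a).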
CHAPTER VIII
A CLASS OF STOCHASTIC PROCESSES
1. In the preceding Chapters, we have been concerned with
distributions of sums of the type $Z_n = X_1 + \dots + X_n$, where the
$X_r$ are independent random variables. $Z_n$ is then a variable
depending on a discontinuous parameter $n$, and the passage from
$Z_n$ to $Z_{n+1}$ means that $Z_n$ receives the additive contribution
$X_{n+1}$, so that we have $Z_{n+1} = Z_n + X_{n+1}$, where $Z_n$ and $X_{n+1}$ are
independent.
Consider now the formation of $Z_n$ by successive addition of
the independent contributions $X_1, X_2, \dots$, and let us
assume that each addition of a new contribution takes a finite
time $\delta$. (In a concrete interpretation the $X_r$ might e.g. be the
gains of a certain player during a series of games, every game
requiring the time $\delta$, so that $Z_n$ is the total gain realized after $n$
games, or after the time $n\delta$.)
The sum $Z_n$ then arises after the time $n\delta$, and the d.f. of $Z_n$ is
thus the d.f. of the sum that has been formed during the time
interval $(0, n\delta)$. Suppose now that we allow $\delta$ to tend to zero and
$n$ to tend to infinity, in such a way that $n\delta$ tends to a finite limit $\tau$.
It is conceivable that the distribution of $Z_n$ may then tend to a
definite limit, which will depend on the continuous time parameter
$\tau$. Thus instead of the variable $Z_n$ with a discontinuous parameter
$n$ we should have a variable $Z_\tau$ with a continuous parameter $\tau$, and
such that the increment of $Z_\tau$ during the time interval $(\tau_1, \tau_1 + \tau_2)$
is independent of $Z_{\tau_1}$:
(118)  $Z_{\tau_1 + \tau_2} = Z_{\tau_1} + U_{\tau_1 \tau_2},$
where $Z_{\tau_1}$ and $U_{\tau_1 \tau_2}$ are independent.
It is, in fact, possible to give an exact meaning to the limit
passage which has thus been roughly indicated. We shall, however,
prefer to consider directly a random variable which depends
on a continuous parameter and which behaves in the general way
described above.¹
2. Let $\tau$ be a continuous parameter which may be thought of
as representing time. Suppose that, for every $\tau \ge 0$, we have a
random variable $Z_\tau$ with the d.f. $F(x, \tau)$ and the c.f.
$f(t, \tau) = \int_{-\infty}^{\infty} e^{itx}\, d_x F(x, \tau).$
$Z_0$ will be supposed to be identically equal to zero, so that
$F(x, 0)$ coincides with the d.f. $\varepsilon(x)$ defined by (17).
The set of variables $Z_\tau$ will be said to define a random or
stochastic process with stationary and independent increments
(briefly: a s.i.i. process) if, for $\tau_1 \ge 0$, $\tau_2 > 0$, the difference
$U_{\tau_1 \tau_2} = Z_{\tau_1 + \tau_2} - Z_{\tau_1}$ is a random variable which is independent of
the variable $Z_{\tau_1}$ and has a d.f. which is independent of $\tau_1$. We can
then say that the increment of the variable $Z_\tau$ during any time
interval is independent of the value assumed by the variable at
the beginning of the interval, and also independent of the position
of the interval on the time scale (but not, of course, independent
of the length of the interval).
If $Z_\tau$ defines a s.i.i. process, it is seen from (118) that the d.f.
of $Z_{\tau_1 + \tau_2}$ is composed by the d.f.'s of $Z_{\tau_1}$ and $U_{\tau_1 \tau_2}$. The latter d.f.
is, however, by hypothesis independent of $\tau_1$, and for $\tau_1 = 0$ we
have $U_{0, \tau_2} = Z_{\tau_2} - Z_0 = Z_{\tau_2}$, so that the d.f. of $U_{\tau_1 \tau_2}$ is identical with
$F(x, \tau_2)$. This gives us the following relations which may serve
as an analytical definition of the s.i.i. process:
(119)  $F(x, \tau_1 + \tau_2) = F(x, \tau_1) * F(x, \tau_2),$
(120)  $f(t, \tau_1 + \tau_2) = f(t, \tau_1)\, f(t, \tau_2).$
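As an illustration of (119) and (120), the following minimal Python sketch checks both relations numerically for a Poisson counting process, for which $F(x, \tau)$ is a Poisson distribution with mean proportional to $\tau$. The intensity `lam`, the truncation length `K` and the helper names are assumptions made for this example only. The process is not centred, but the relations (119) and (120) do not depend on the mean value.

```python
import cmath
import math

def poisson_pmf(mean, K):
    """Probabilities P(N = 0), ..., P(N = K - 1) of a Poisson law."""
    probs = [math.exp(-mean)]
    for j in range(1, K):
        probs.append(probs[-1] * mean / j)
    return probs

def convolve_head(p, q):
    """First len(p) terms of the composition of two laws on 0, 1, 2, ..."""
    return [sum(p[i] * q[j - i] for i in range(j + 1)) for j in range(len(p))]

def poisson_cf(t, mean):
    """c.f. of a Poisson law with the given mean."""
    return cmath.exp(mean * (cmath.exp(1j * t) - 1.0))

lam, tau1, tau2, K = 1.5, 0.7, 1.9, 40

# (119): composing the laws for tau1 and tau2 gives the law for tau1 + tau2.
lhs = poisson_pmf(lam * (tau1 + tau2), K)
rhs = convolve_head(poisson_pmf(lam * tau1, K), poisson_pmf(lam * tau2, K))
max_pmf_gap = max(abs(a - b) for a, b in zip(lhs, rhs))

# (120): the c.f.'s multiply.
t = 0.83
cf_gap = abs(poisson_cf(t, lam * (tau1 + tau2))
             - poisson_cf(t, lam * tau1) * poisson_cf(t, lam * tau2))

print(max_pmf_gap, cf_gap)
```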
1 Particular cases of variables of this character were first studied by Bachelier
[1, 2] and Lundberg [1, 2]. Further contributions were given, inter alia, by Cramér
[3] and Esscher [1], in connection with the mathematical theory of insurance risk.
A complete and mathematically rigorous theory, which embraces also cases much
more general than the s.i.i. process, was first given by Kolmogoroff [2]. The theory
of the s.i.i. process was developed by Lévy [2] under more general conditions than
those considered here.
For the moments of $Z_\tau$ we use the notation
$\alpha_\nu(\tau) = E(Z_\tau^{\nu}) = \int_{-\infty}^{\infty} x^{\nu}\, dF(x, \tau).$
(Throughout the Chapter it will be understood that the variable
of integration is always the first variable occurring in the function
behind the sign $d$, so that we may omit the index of this sign.)
Theorem 27.¹ Let $Z_\tau$ define a s.i.i. process such that $\alpha_1(\tau) = 0$
and $\alpha_2(\tau)$ is finite for all $\tau > 0$. We then have
(121)  $\log f(t, \tau) = -\tfrac{1}{2} \sigma_0^2 \tau t^2 + \tau \int_{-\infty}^{\infty} \frac{e^{itx} - 1 - itx}{x^2}\, d\Omega(x),$
where $\sigma_0$ is a constant and $\Omega(x)$ is a bounded and never decreasing
function which is continuous at $x = 0$. Conversely, given any constant
$\sigma_0 \ge 0$ and any bounded and never decreasing function $\Omega(x)$ continuous
at $x = 0$, (121) defines the c.f. $f(t, \tau)$ of a variable $Z_\tau$ corresponding
to a s.i.i. process.
Before proceeding to the proof of this theorem, we shall consider
some simple particular cases. Suppose first that $\Omega(x)$
reduces to a constant, so that the last term in the second member
of (121) disappears. Then it follows from (121) that
$F(x, \tau) = \Phi\left( x / (\sigma_0 \sqrt{\tau}) \right),$
so that $Z_\tau$ is, for every $\tau > 0$, normally distributed with the mean
value 0 and the s.d. $\sigma_0 \sqrt{\tau}$. This case is often called the Brownian
movement process, a name referring to one of its important
physical applications. Suppose on the other hand $\sigma_0 = 0$ and
$\Omega(x) = \lambda c^2 \varepsilon(x - c)$, where $\lambda > 0$ and $c \ne 0$ are constants, and $\varepsilon(x)$
is defined by (17). Then (121) gives
$\log f(t, \tau) = \lambda \tau \left( e^{cit} - 1 - cit \right),$
1 Kolmogoroff; also de Finetti [2]. If the hypothesis $\alpha_1(\tau) = 0$ is omitted,
we may apply the theorem to the variable $Z_\tau - \alpha_1(\tau)$, and choose for $\alpha_1(\tau)$ any
solution (continuous or not) of the functional equation
$\alpha_1(\tau_1 + \tau_2) = \alpha_1(\tau_1) + \alpha_1(\tau_2).$
If we assume, e.g., that $\alpha_1(\tau)$ is bounded in some interval, however small, we
necessarily have $\alpha_1(\tau) = c\tau$, where $c$ is a real constant. Lévy [2] studies the s.i.i.
process without assuming the existence of the moments $\alpha_1(\tau)$ and $\alpha_2(\tau)$.
so that the variable $Z_\tau + \lambda c \tau$ has the c.f.
$e^{\lambda \tau (e^{cit} - 1)}.$
According to (47) this corresponds to a distribution of the Poisson
type. The corresponding process, which has important applications,
e.g. in the theory of insurance risk, is known as the Poisson
process.
More generally, let $\Omega(x)$ be a step-function with a finite number
of steps, none of which is situated at the point $x = 0$, and put
$b = \int_{-\infty}^{\infty} \frac{1}{x}\, d\Omega(x)$. Then it follows from (121) that the distribution
of the variable $Z_\tau + b\tau$ may be regarded as composed of one
normal component (arising from the term containing $\sigma_0$) and a
number of independent Poisson distributions, each of which
corresponds to one step of $\Omega(x)$.
In the general case, the distribution of $Z_\tau$ is always composed
of the normal component $\Phi(x / (\sigma_0 \sqrt{\tau}))$ and another component
corresponding to the term containing $\Omega(x)$ in (121).
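The particular cases just discussed can be reproduced with a few lines of code. The following minimal Python sketch, which is an illustration and not part of the original text, evaluates the second member of (121) when $\Omega(x)$ is a step-function, represented (as an assumption of this example) by a list of pairs (position of the step, height of the step). It checks the Poisson case against the closed expression given above, and checks by a finite difference that the variance equals $(\sigma_0^2 + \Omega(+\infty) - \Omega(-\infty))\tau$.

```python
import cmath

def log_cf(t, tau, sigma0, steps):
    """Second member of (121) for a step-function Omega.

    steps is a list of (x_j, w_j): Omega has a step of height w_j > 0
    at the point x_j != 0 (hypothetical representation for this sketch).
    """
    value = -0.5 * sigma0 ** 2 * tau * t ** 2
    for x, w in steps:
        value += tau * w * (cmath.exp(1j * t * x) - 1.0 - 1j * t * x) / x ** 2
    return value

def poisson_log_cf(t, tau, lam, c):
    """Closed form for Omega(x) = lam * c^2 * eps(x - c), sigma0 = 0."""
    return lam * tau * (cmath.exp(1j * c * t) - 1.0 - 1j * c * t)

lam, c, tau, t = 0.8, 2.0, 3.0, 0.4
poisson_gap = abs(log_cf(t, tau, 0.0, [(c, lam * c * c)])
                  - poisson_log_cf(t, tau, lam, c))

# One normal component plus two Poisson components.
sigma0, steps = 0.7, [(-1.0, 0.5), (2.0, 1.2)]
h = 1e-3
second_diff = (log_cf(h, tau, sigma0, steps) + log_cf(-h, tau, sigma0, steps)
               - 2.0 * log_cf(0.0, tau, sigma0, steps)) / h ** 2
variance_estimate = -second_diff.real
variance_formula = (sigma0 ** 2 + sum(w for _, w in steps)) * tau

print(poisson_gap, variance_estimate, variance_formula)
```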
We now proceed to the proof of Theorem 27. Let us first
consider the s.d. $\sqrt{\alpha_2(\tau)}$. From the fundamental relations (119)
and (120) it follows that we have
$\alpha_2(\tau_1 + \tau_2) = \alpha_2(\tau_1) + \alpha_2(\tau_2).$
The only non-negative solution of this functional equation is,
however,¹
(122)  $\alpha_2(\tau) = \sigma^2 \tau,$
where $\sigma^2 \ge 0$ is a constant. From (122) we deduce
(123)  $f(t, \Delta\tau) = 1 - \tfrac{1}{2} \theta \sigma^2 t^2 \Delta\tau$
with $|\theta| \le 1$, so that $f(t, \Delta\tau) \to 1$ as $\Delta\tau \to 0$. According to (120)
it then follows that, for every fixed $t$, $f(t, \tau)$ is a continuous
function of $\tau$.
From (120) we obtain further $f(t, 1/n) = \{f(t, 1)\}^{1/n}$, and hence
for all rational $m/n$ we have $f(t, m/n) = \{f(t, 1)\}^{m/n}$. By continuity
this result extends immediately to all $\tau > 0$, so that we have
generally
1 Oi.. Hamel (1]" Hauadorti [1], p. 17;;.
(124)  $f(t, \tau) = \{f(t, 1)\}^{\tau}.$
According to (123), the expression
(125)  $\frac{f(t, \Delta\tau) - 1}{\Delta\tau} = \frac{\{f(t, 1)\}^{\Delta\tau} - 1}{\Delta\tau}$
is, for every fixed $t$, bounded as $\Delta\tau \to 0$. It follows that $f(t, 1) \ne 0$
for all real $t$, and thus the expression (125) converges uniformly
in every finite $t$-interval to the limit
(126)  $\lim_{\Delta\tau \to 0} \frac{f(t, \Delta\tau) - 1}{\Delta\tau} = \log f(t, 1),$
where $\log f(t, 1)$ denotes that branch of the multi-valued function
which vanishes for $t = 0$ and is for all real $t$ uniquely determined
by continuity.
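On the other hand we have
(127)  $\frac{f(t, \Delta\tau) - 1}{\Delta\tau} = \frac{1}{\Delta\tau} \int_{-\infty}^{\infty} \left( e^{itx} - 1 - itx \right) dF(x, \Delta\tau).$
Putting
(128)  $H(x, \Delta\tau) = \frac{1}{\Delta\tau} \int_{-\infty}^{x} y^2\, dF(y, \Delta\tau),$
$H(x, \Delta\tau)$ is a never decreasing function of $x$ such that
$H(-\infty, \Delta\tau) = 0, \quad H(+\infty, \Delta\tau) = \sigma^2.$
For every fixed $\Delta\tau > 0$, $H(x, \Delta\tau)$ is continuous at $x = 0$, and
we have
$\frac{1}{\Delta\tau} \int_{-\infty}^{\infty} \left( e^{itx} - 1 - itx \right) dF(x, \Delta\tau) = \int_{-\infty}^{\infty} \frac{e^{itx} - 1 - itx}{x^2}\, dH(x, \Delta\tau),$
where, for $x = 0$, $(e^{itx} - 1 - itx)/x^2$ is to be interpreted as $-t^2/2$.
According to (124), (126) and (127) we thus obtain
(129)  $\log f(t, \tau) = \tau \lim_{\Delta\tau \to 0} \int_{-\infty}^{\infty} \frac{e^{itx} - 1 - itx}{x^2}\, dH(x, \Delta\tau).$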
Consider now the function $H(x, \Delta\tau)$ for a sequence of values
$\Delta_1\tau, \Delta_2\tau, \dots$ tending to zero. It is then always possible to choose
a sub-sequence $\Delta_{n_1}\tau, \Delta_{n_2}\tau, \dots$ such that the corresponding
functions $H(x, \Delta_{n_i}\tau)$ tend to a limit $H(x)$, in all continuity points $x$
of the latter. From (129) we then obtain
(130)  $\log f(t, \tau) = \tau \int_{-\infty}^{\infty} \frac{e^{itx} - 1 - itx}{x^2}\, dH(x).$
Obviously $H(x)$ is a never decreasing function such that
$H(-\infty) \ge 0, \quad H(+\infty) \le \sigma^2.$
We can, however, show that in both these relations the sign of
equality must hold. We obtain in fact from (130) for small values
of $t$
$\log f(t, \tau) = -\tfrac{1}{2} \tau t^2 \{ H(+\infty) - H(-\infty) \} + o(t^2),$
but on the other hand (122) gives
$\log f(t, \tau) = -\tfrac{1}{2} \sigma^2 \tau t^2 + o(t^2),$
so that we must have
$H(-\infty) = 0, \quad H(+\infty) = \sigma^2.$
Let, now, $\sigma_0^2$ denote the saltus of $H(x)$ at the point $x = 0$ (thus
$0 \le \sigma_0^2 \le \sigma^2$) and put
(131)  $\Omega(x) = H(x) - \sigma_0^2\, \varepsilon(x),$
$\varepsilon(x)$ being defined by (17). Then we have
$\Omega(-\infty) = 0, \quad \Omega(+\infty) = \sigma_1^2 = \sigma^2 - \sigma_0^2.$
Further, $\Omega(x)$ is bounded, never decreasing and continuous at
$x = 0$, and (121) follows immediately from (130), so that the first
part of the theorem is proved.
The latter part of the theorem is obvious in the particular case
when $\Omega(x)$ is a step-function with a finite number of steps.
(Cf. the remarks made above.) Further, if $\Omega(x)$ is any function
satisfying the conditions of the theorem, the second member of
(121) may be uniformly approximated by means of a sequence of
step-functions converging to the limit $\Omega(x)$. By Theorem 11, the
corresponding d.f.'s tend to a limit which is itself a d.f., and the
second member of (121) is equal to the logarithm of the c.f. of
this limit. Thus (121) determines uniquely a d.f. $F(x, \tau)$, and it
follows immediately from the form of (121) that the fundamental
relations (119) and (120) are satisfied, so that the proof of
Theorem 27 is completed.
Since $\alpha_2(\tau)$ is finite, (130) may be twice differentiated with
respect to $t$, and we obtain
$\tau \int_{-\infty}^{\infty} e^{itx}\, dH(x) = -\frac{\partial^2}{\partial t^2} \log f(t, \tau).$
But $H(x)/\sigma^2$ is a d.f. which is uniquely determined by its c.f.
It follows that we must reach the same limit $H(x)$ for every
sequence $\Delta_1\tau, \Delta_2\tau, \dots$ tending to zero. This implies, however, that
we have $\lim_{\Delta\tau \to 0} H(x, \Delta\tau) = H(x)$ in every continuity point of $H(x)$.
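This leads to an interesting interpretation of Theorem 27. For
$x < 0$, we have by (128) and (131) in every continuity point of
$\Omega$, as $\Delta\tau \to 0$,
$\frac{1}{\Delta\tau} F(x, \Delta\tau) = \int_{-\infty}^{x} \frac{dH(y, \Delta\tau)}{y^2} \to \int_{-\infty}^{x} \frac{d\Omega(y)}{y^2} = \Omega_1(x),$
and for $x > 0$
$\frac{1}{\Delta\tau} \left( 1 - F(x, \Delta\tau) \right) = \int_{x}^{\infty} \frac{dH(y, \Delta\tau)}{y^2} \to \int_{x}^{\infty} \frac{d\Omega(y)}{y^2} = \Omega_2(x).$
This may be written
$F(x, \Delta\tau) = \Omega_1(x)\, \Delta\tau + o(\Delta\tau), \qquad (x < 0),$
$1 - F(x, \Delta\tau) = \Omega_2(x)\, \Delta\tau + o(\Delta\tau), \qquad (x > 0).$
The probability that, during the infinitely small time $\Delta\tau$, a
variation $< x < 0$ occurs in the value of the variable $Z_\tau$ is thus
asymptotically equal to $\Omega_1(x)\, \Delta\tau$, while the probability of a
variation $> x > 0$ is asymptotically equal to $\Omega_2(x)\, \Delta\tau$.
Thus the function $\Omega(x)$ determines the discontinuous part of
the variation of $Z_\tau$, while obviously the constant $\sigma_0$ determines the
continuous part.¹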
Further we have
$\int_{-\infty}^{0} x^2\, d\Omega_1(x) + \int_{0}^{\infty} x^2\, |d\Omega_2(x)| = \int_{-\infty}^{\infty} d\Omega(x) = \sigma_1^2,$
$\alpha_2(\tau) = \sigma^2 \tau = \sigma_0^2 \tau + \sigma_1^2 \tau,$
so that the variance $\alpha_2(\tau)$ of $Z_\tau$ is the sum of one term due to the
continuous part of the variation and one term due to the
discontinuous part.
1 It should be noted that the d.f. $F(x, \tau)$ is always continuous with respect to $\tau$,
although the variable $Z_\tau$ may suffer discontinuous changes of value, if $\Omega$ is not
identically zero.
3. By means of the remarks made in 1, it will be easily
understood that the s.i.i. process, as defined in 2, presents a
great analogy with the "case of equal components" in the
problem of addition of independent variables treated in Chapters
VI-VII.¹ Roughly speaking, we are here concerned not with a
sum, but with an integral, the elements of which are independent
random variables (cf. Lévy [2]).
It is then fairly obvious that our theorems bearing on
the case of equal components, such as Theorems 20 and 25, should
hold true, mutatis mutandis, also for the case of a s.i.i. process. In
fact, the variable $Z_\tau / (\sigma \sqrt{\tau})$ with the d.f.
$\mathfrak{F}(x, \tau) = F(\sigma \sqrt{\tau}\, x, \tau)$
and the c.f.  $\mathfrak{f}(t, \tau) = f(t / (\sigma \sqrt{\tau}), \tau)$
is directly analogous to the previously considered variable
$(X_1 + \dots + X_n) / (\sigma \sqrt{n})$ with the d.f. $\mathfrak{F}_n(x)$ and the c.f. $\mathfrak{f}_n(t)$.
Instead of the discontinuous parameter $n$, we are here concerned
with the continuous parameter $\tau$.
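The relation (121) may be written
$\log f(t, \tau) = -\tfrac{1}{2} \sigma^2 \tau t^2 + \tau \int_{-\infty}^{\infty} \frac{e^{itx} - 1 - itx - \tfrac{1}{2}(itx)^2}{x^2}\, d\Omega(x).$
Substituting here $t / (\sigma \sqrt{\tau})$ for $t$, we obtain
(132)  $\log \mathfrak{f}(t, \tau) = -\frac{t^2}{2} + \tau \int_{-\infty}^{\infty} \frac{e^{\frac{itx}{\sigma\sqrt{\tau}}} - 1 - \frac{itx}{\sigma\sqrt{\tau}} - \frac{1}{2} \left( \frac{itx}{\sigma\sqrt{\tau}} \right)^2}{x^2}\, d\Omega(x).$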
1 If we omit the condition laid down at the beginning of 2 that the distribution
of the increase $Z_{\tau_1 + \tau_2} - Z_{\tau_1}$ should be independent of $\tau_1$, we arrive at a more general
kind of random process related to the general problem of addition of independent
variables in the same way as the process here considered is related to the particular
case of equal components. Subject to appropriate conditions, Theorems 27-30 can
be generalized to this case. (For a generalization of Theorem 27 along these lines
cf. Lévy [2], who considers also the case when $\alpha_2(\tau)$ is not finite.)
In a way which is closely similar to the proof of Theorem 20, it is
now easily shown that the last term of this expression tends to
zero as $\tau \to \infty$, uniformly in every finite $t$-interval. We thus have
the following theorem directly analogous to Theorem 20.
Theorem 28.¹ As $\tau \to \infty$, the d.f. $\mathfrak{F}(x, \tau)$ of the variable $Z_\tau / (\sigma \sqrt{\tau})$
tends to the normal function $\Phi(x)$.
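The convergence asserted by Theorem 28 can be observed on the level of c.f.'s. The following minimal Python sketch, an illustration and not part of the original text, takes the Poisson case $\sigma_0 = 0$, $\Omega(x) = \lambda c^2 \varepsilon(x - c)$, for which $\sigma^2 = \lambda c^2$ and (132) reduces to $\log \mathfrak{f}(t, \tau) = \lambda\tau(e^{icu} - 1 - icu)$ with $u = t/(\sigma\sqrt{\tau})$. The numerical values of `lam`, `c` and `t` are arbitrary assumptions.

```python
import cmath
import math

def log_normalized_cf(t, tau, lam, c):
    """log of the c.f. of Z_tau / (sigma sqrt(tau)) in the Poisson case."""
    sigma = math.sqrt(lam) * abs(c)        # sigma^2 = Omega(+inf) = lam * c^2
    u = t / (sigma * math.sqrt(tau))
    return lam * tau * (cmath.exp(1j * c * u) - 1.0 - 1j * c * u)

def gap_to_normal(t, tau, lam, c):
    """Distance from -t^2/2, the logarithm of the normal c.f."""
    return abs(log_normalized_cf(t, tau, lam, c) + 0.5 * t * t)

lam, c, t = 2.0, 1.0, 1.0
gaps = [gap_to_normal(t, tau, lam, c) for tau in (1e2, 1e4, 1e6)]
print(gaps)
```

In this example the gap decreases roughly like $1/\sqrt{\tau}$, which is the order of the first correction term in the expansion considered below.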
In order to obtain an asymptotic expansion of $\mathfrak{F}(x, \tau)$ for
large values of $\tau$ analogous to the expansion given by Theorem 25,
we shall suppose henceforth that there is an integer $k \ge 3$, such
that the absolute moment of order $k - 2$ of the function $\Omega(x)$
occurring in (121) is finite. We put for $\nu = 3, \dots, k$
(133)  $\lambda_\nu = \frac{1}{\sigma^{\nu}} \int_{-\infty}^{\infty} x^{\nu-2}\, d\Omega(x),$
$\rho_\nu = \frac{1}{\sigma^{\nu}} \int_{-\infty}^{\infty} |x|^{\nu-2}\, d\Omega(x),$
VT
Tm=4 Ilk
Pk
These notations are analogous to those introduced in VII, 3, by
(84) and (88). We can now prove the following lemmas, which
are directly analogous to Lemmas 2 and 3.
Lemma 5. For It I iYP/w we have
t
l
k-SP.. (t) e
ejf(t,T)==l+
p-1 7" k-r
where $P_\nu(it)$ is the polynomial of degree $3\nu$ in $(it)$, which is obtained
by replacing in the polynomial $P_{\nu n}(it)$ of Lemma 2 the quantities
$\lambda_{\nu n}$ defined by (84) by the quantities $\lambda_\nu$ defined by (133).
Lemma 6. For It I T
kT
we M/ve
t'

1 Lévy [2].
The proofs of these lemmas, which are based on the relation
(132), are so closely similar to the proofs of Lemmas 2 and 3 that
they need not be explicitly given here. Finally, putting in analogy
to (101)
k-3 E (-cz,)
(134) tf (x, T) =cJ) (x) + "T*/I + R
k
(:v, T)
v-I
where $P_{3\nu-1}(x)$ is a polynomial of degree $3\nu - 1$ independent
of $\tau$, we obtain in the same way as in VII, 3 the following
fundamental Lemma corresponding to Lemma 4.
Lemma 7. For $0 < \omega < 1$, we have for all real $x$ and all $h > 0$
(U f:+A (1/-X)oo-1 I Idt+
If the integral in the second member of this relation is convergent
for $\omega = 0$, we further have

1f{t,T) I 1 )
Ric (x, 'T') == ail: TM t dt +Tt"l!
Proceeding in the same way as in VII, 4-5, we can now use
Lemma 7 to obtain information as to the behaviour of $\mathfrak{F}(x, \tau)$ for
large values of $\tau$. In the first place, we have the following theorem,
the proof of which is directly analogous to that of Theorem 24
and need not be given here.
Theorem 29. If the quantity $\rho_3$ defined by (133) is finite,
we have
$|\mathfrak{F}(x, \tau) - \Phi(x)| < C \rho_3 \frac{\log \tau}{\sqrt{\tau}},$
where $C$ is an absolute constant.
Further, we can now prove the following theorem which gives
an asymptotic expansion of $\mathfrak{F}(x, \tau)$ analogous to that obtained
in Theorem 25.
Theorem 30.¹ Suppose that the variable $Z_\tau$ considered in
Theorem 27 satisfies the following conditions:
(I) The absolute moment $\rho_k$ as defined by (133) is finite for some
integer $k \ge 3$;
(II) For some $\tau > 0$, the d.f. $F(x, \tau)$ satisfies the condition (C)
of VII, 5.
For the d.f. $\mathfrak{F}(x, \tau)$ of the variable $Z_\tau / (\sigma \sqrt{\tau})$, we then have the
expansion (134) with
(135)  $|R_k(x, \tau)| < \frac{M}{\tau^{(k-2)/2}},$
$M$ being independent of $\tau$ and $x$.
Further, any of the following conditions (IIa) and (IIb) is
sufficient for the validity of (II):
(IIa) $\sigma_0 > 0$;
(IIb) $\Omega(x) = \Omega_1(x) + \Omega_2(x)$, where $\Omega_1(x)$ and $\Omega_2(x)$ are both
never decreasing, while $\Omega_1(x)$ is absolutely continuous and does not
reduce to a constant.
1 Cramér [4].
If (II) is satisfied for a single $\tau > 0$, it follows from (121) that
the same thing holds for every $\tau > 0$, and thus in particular for
$\tau = 1$. From Lemma 7 we obtain according to (124)
r.c+h
wJ.c
(
i
CC> If (t) 1) I'" pI )
= E)Ie ('f-t.lJ.r-tll tQJ+l til +7._1)/1 ,
T
which corresponds to (110). By means of the condition (C) we
then obtain
jwJ:+1s (lI- X)-1BA;(Y,T)d,1 <M(S:+.,-<-I)'I) ,
and the rest of the proof of (135) is perfectly similar to the proof
of Theorem 25. The last part of the theorem is easily proved by
considering the real part of $\log f(t, \tau)$ according to (121).
THIRD PART
DISTRIBUTIONS IN $R_k$
The object of this Part is to show that many of the results
obtained above for distributions in a one-dimensional space can
be generalized to any number of dimensions. We shall, in the
main, restrict ourselves to a brief discussion of some typical
generalizations of this kind.
CHAPTER IX¹
GENERAL PROPERTIES. CHARACTERISTIC FUNCTIONS
1. For a distribution in a one-dimensional space, the only
possible discontinuities arise from discrete points which, in terms
of the mechanical interpretation used in Chapter II, are bearers
of positive quantities of mass. As soon as the number of dimensions
exceeds unity, the question of the discontinuities becomes
more complicated. Thus in a $k$-dimensional space, the
whole mass may be concentrated to a sub-space of less than $k$
dimensions (line, surface, ...), though there is no single point that
carries a positive quantity of mass.
Given a random variable $X = (\xi_1, \dots, \xi_k)$ in the $k$-dimensional
space $R_k$, we denote as in Chapter II the corresponding pr.f. by
$P(S)$ and the d.f. by $F(x_1, \dots, x_k)$. Just as in the case $k = 1$, there
can at most be a finite number of points $A$ such that $P(A) > a > 0$,
and hence at most an enumerable set of points $B$ such that
$P(B) > 0$. We shall call this set the point spectrum of the
distribution.
1 The general theory of completely additive set functions in a $k$-dimensional space
has been developed by Radon [1], Bochner [2] and Haviland [1, 2, 3]. A comprehensive
account of the principal results of the theory is given by Jessen-Wintner [1].
According to II, 3, every component $\xi_i$ of $X$ is itself a random
variable, and the corresponding (one-dimensional) distribution
is found by projecting the original distribution on the axis of $\xi_i$.
Let $Q_i$ be the set of real numbers which are discontinuities of
the distribution of $\xi_i$, and form the (at most enumerable) set
$Q = Q_1 + \dots + Q_k$. Further, let $J$ denote a $k$-dimensional interval

and consider the probability $P(J)$ of the "event" $X \subset J$ as a
function of the variables $a_i$ and $b_i$. It is then obvious that, as
long as no $a_i$ and no $b_i$ belong to the set $Q$, $P(J)$ is a continuous
function of these variables.
Any interval $J$ such that no $a_i$ and no $b_i$ belongs to $Q$ will be
called a continuity interval of the distribution. If two distributions
coincide for every interval which is a continuity interval for both
distributions, it follows from Theorem 2 that the corresponding
d.f.'s are always equal, and thus by the same theorem the
distributions are identical.
If a sequence of pr.f.'s $\{P_n(S)\}$ converges to a completely
additive set function $P^*(S)$ in every continuity interval of the
latter, we shall say simply that $\{P_n(S)\}$ converges to $P^*(S)$. The
symbol $P_n(S) \to P^*(S)$ will be used only in this sense. From every
sequence $\{P_n(S)\}$ it is possible¹ to choose a sub-sequence which
converges in this way to a limit $P^*(S)$. Obviously we cannot in
general assert that $P^*(S)$ is a probability function, as we only
know that $0 \le P^*(R_k) \le 1$.
Any pr.f. can always² (cf. Theorem 4) be uniquely represented
as a sum of three components
(136)  $P(S) = a_{I} P_{I}(S) + a_{II} P_{II}(S) + a_{III} P_{III}(S),$
where $a_{I}, a_{II}, a_{III}$ are non-negative numbers with the sum 1, while
$P_{I}, P_{II}, P_{III}$ are pr.f.'s such that:
$P_{I}(S)$ is absolutely continuous; $P_{I}(S) = \int_S D(X)\, dX$, where
$D(X)$ is a non-negative point function in $R_k$ which is called the
probability density or density function of the distribution defined
by $P_{I}(S)$.
$P_{II}(S)$ is purely discontinuous; $P_{II}(S) = 1$ if $S$ coincides with
the point spectrum of $P(S)$.
$P_{III}(S)$ is "singular"; the point spectrum of $P_{III}(S)$ is empty
and there exists a Borel set $S$ of measure zero such that $P_{III}(S) = 1$.
1 Radon [1]. This is proved in practically the same way as in the one-dimensional
case.
2 Radon [1].
2. A real-valued function $g(X)$ which is finite and uniquely
defined for all points¹ of $R_k$ is, according to II, 3, a random
variable with a uniquely defined one-dimensional distribution.
By (15a) we have for the mean value of this variable the expression
$E(g(X)) = \int_{R_k} g(X)\, dP,$
subject to the condition that the integral is absolutely convergent.
The mean values of the particular functions
$g(X) = \xi_1^{\nu_1} \cdots \xi_k^{\nu_k} \qquad (\nu_i = 0, 1, 2, \dots)$
are called the moments of the distribution. We shall use the
notations
$m_i = E(\xi_i),$
$\mu_{ij} = E\left( (\xi_i - m_i)(\xi_j - m_j) \right),$
$\sigma_i^2 = D^2(\xi_i) = \mu_{ii} = E\left( (\xi_i - m_i)^2 \right).$
Putting
$r_{ij} = \frac{\mu_{ij}}{\sigma_i \sigma_j},$
it is then easily shown that we have $-1 \le r_{ij} \le 1$, and that the
extreme values $r_{ij} = \pm 1$ can only be reached if, in the two-dimensional
distribution of the "combined" variable $(\xi_i, \xi_j)$,
the whole mass is situated on one of the straight lines
$(\xi_i - m_i)/\sigma_i = \pm (\xi_j - m_j)/\sigma_j.$
$r_{ij}$ is called the coefficient of correlation between $\xi_i$ and $\xi_j$, and
plays an important part in the statistical applications.
More generally, the quadratic form
$\sum_{i,j} \mu_{ij} u_i u_j = \int_{R_k} \left( \sum_i u_i (\xi_i - m_i) \right)^2 dP$
is never negative, which implies that the determinant $\| \mu_{ij} \|$, as
well as all its principal minors, is $\ge 0$.
1 Except possibly for points forming a set $\Sigma$ such that $P(\Sigma) = 0$.
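The statements on the moments $\mu_{ij}$ are easily checked for a purely discontinuous distribution. The following minimal Python sketch, which is an illustration and not part of the original text, assumes a distribution given by a finite list of points with attached probabilities; the sample values and all function names are hypothetical. It computes the $\mu_{ij}$, the correlation coefficients and all principal minors of $\| \mu_{ij} \|$.

```python
from itertools import combinations

def moment_matrix(points, probs):
    """Mean values m_i and second-order central moments mu_ij."""
    k = len(points[0])
    m = [sum(p * x[i] for x, p in zip(points, probs)) for i in range(k)]
    mu = [[sum(p * (x[i] - m[i]) * (x[j] - m[j]) for x, p in zip(points, probs))
           for j in range(k)] for i in range(k)]
    return m, mu

def correlation(mu, i, j):
    """Coefficient of correlation r_ij = mu_ij / (sigma_i sigma_j)."""
    return mu[i][j] / (mu[i][i] * mu[j][j]) ** 0.5

def det(a):
    """Determinant by expansion along the first row (small matrices only)."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** col * a[0][col]
               * det([row[:col] + row[col + 1:] for row in a[1:]])
               for col in range(len(a)))

def principal_minors(mu):
    """All principal minors of the matrix mu."""
    k = len(mu)
    return [det([[mu[i][j] for j in idx] for i in idx])
            for size in range(1, k + 1)
            for idx in combinations(range(k), size)]

points = [(0, 0, 1), (1, 2, 0), (2, 1, 3), (3, 5, 1)]
probs = [0.1, 0.2, 0.3, 0.4]
m, mu = moment_matrix(points, probs)
minors = principal_minors(mu)
r12 = correlation(mu, 0, 1)

# Whole mass on a straight line: the extreme value r = 1 is reached.
_, mu_line = moment_matrix([(0, 1), (1, 3), (2, 5)], [0.2, 0.5, 0.3])
r_line = correlation(mu_line, 0, 1)

print(minors, r12, r_line)
```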
3. The characteristic function of a distribution in $R_k$ is the mean
value
(137)  $f(t_1, \dots, t_k) = E\left( e^{i(t_1 \xi_1 + \dots + t_k \xi_k)} \right) = \int_{R_k} e^{i(t_1 \xi_1 + \dots + t_k \xi_k)}\, dP.$
Unless explicitly stated otherwise, the $t_i$ will be considered as
real variables, so that $f$ may be considered as a function of the
real point $(t_1, \dots, t_k)$ in $R_k$.
Obviously $f$ is a uniformly continuous function of the $t_i$ in the
whole space, and we have always $|f| \le 1$. The generalization of
Theorems 6-8 to any number of dimensions is comparatively
easy, and will not be dealt with here.
If all moments up to a certain order are finite, we have for
small values of $|t_i|$ an expansion of $f$ analogous to (25). If, in
particular, all $\mu_{ij}$ are finite and all $m_i$ are equal to zero, we have
(138)  $f(t_1, \dots, t_k) = 1 - \tfrac{1}{2} \sum_{i,j} \mu_{ij} t_i t_j + o\left( \sum_i t_i^2 \right).$
We shall now consider the generalization of Theorem 9, and
for the sake of formal simplicity we take first the case of a
two-dimensional space $R_2$. The generalized theorem will then be as
follows: If the interval $J$ defined by
$x_1 < \xi_1 < x_1 + h_1, \quad x_2 < \xi_2 < x_2 + h_2$
is a continuity interval of the distribution, we have
(139)  $P(J) = F(x_1 + h_1, x_2 + h_2) - F(x_1 + h_1, x_2) - F(x_1, x_2 + h_2) + F(x_1, x_2)$
$= \lim_{T \to \infty} \frac{1}{4\pi^2} \int_{-T}^{T} \int_{-T}^{T} \frac{1 - e^{-it_1 h_1}}{it_1} \cdot \frac{1 - e^{-it_2 h_2}}{it_2}\, e^{-i(t_1 x_1 + t_2 x_2)} f(t_1, t_2)\, dt_1\, dt_2.$
We have, in fact, for the quantity behind the sign $\lim_{T \to \infty}$, the
expression
$\int_{R_2} \psi_1(\xi_1)\, \psi_2(\xi_2)\, dP,$
where
$\psi_i(\xi_i) = \frac{1}{\pi} \int_0^T \frac{\sin t(\xi_i - x_i)}{t}\, dt - \frac{1}{\pi} \int_0^T \frac{\sin t(\xi_i - x_i - h_i)}{t}\, dt.$
As $T \to \infty$, the product $\psi_1(\xi_1)\, \psi_2(\xi_2)$ tends to unity for every
interior point $(\xi_1, \xi_2)$ of $J$, and to zero for every point $(\xi_1, \xi_2)$
outside $J$.
It is then obvious that the proof of (139) can be performed by
an easy extension of the argument used in the proof of Theorem 9.
As in the case of Theorem 9, we deduce immediately the corollary
that a two-dimensional d.f. is uniquely determined by its c.f.
It is clear that the argument is perfectly general, so that we
may state the following theorem.
Theorem 9a.¹ If the $k$-dimensional interval $J$ defined by
$x_i < \xi_i < x_i + h_i$ $(i = 1, 2, \dots, k)$ is a continuity interval for the
probability function $P(S)$, we have
$P(J) = \lim_{T \to \infty} \frac{1}{(2\pi)^k} \int_{-T}^{T} \cdots \int_{-T}^{T} \frac{1 - e^{-it_1 h_1}}{it_1} \cdots \frac{1 - e^{-it_k h_k}}{it_k}\, e^{-i(t_1 x_1 + \dots + t_k x_k)} f(t_1, \dots, t_k)\, dt_1 \dots dt_k.$
Hence it follows that a probability distribution in $R_k$ is uniquely
determined by its characteristic function.
4. Before proceeding to further generalizations of a similar
kind, we shall now introduce a method of induction² which can
often be used for the extension of theorems on one-dimensional
distributions to any number of dimensions.
Consider a random variable $X = (\xi_1, \dots, \xi_k)$ in $R_k$ with the
pr.f. $P(S)$. Let $T = (t_1, \dots, t_k)$ denote a fixed point in $R_k$ such that
$T \ne (0, \dots, 0)$, and consider the one-dimensional random variable
$U = t_1 \xi_1 + \dots + t_k \xi_k.$
The c.f. of this variable is by (137)
(140)  $E(e^{itU}) = E\left( e^{it(t_1 \xi_1 + \dots + t_k \xi_k)} \right) = f(tt_1, \dots, tt_k).$
This is a relation between the c.f. of a $k$-dimensional variable $X$
and the c.f. of a certain associated one-dimensional variable $U$.
Since both $t$ and $t_1, \dots, t_k$ are arbitrary, it will in many cases be
possible to use this relation for the purpose in view.
1 Romanovsky [1], Haviland [3].
2 Cramér-Wold [1].
Denoting by $S_{T,x}$ the half-space defined by the inequality
(141)  $U = t_1 \xi_1 + \dots + t_k \xi_k \le x,$
we observe that $P(S_{T,x})$, considered as a function of the real
variable $x$, is the d.f. of the random variable $U$. From (140), we
now obtain in the first place the following theorem, the
one-dimensional case of which is, of course, trivial.
Theorem 31.¹ If two probability functions in $R_k$ coincide for
every half-space $S_{T,x}$, they are identical.
In order to prove this theorem it is sufficient to remark that,
by hypothesis, the associated variable $U$ has one and the same
pr.f. in both cases. Thus in the relation (140) the first member,
being the c.f. of $U$, has the same value in both cases. Putting
$t = 1$, it then follows that the c.f.'s of both distributions coincide
for $T \ne (0, \dots, 0)$. For $T = (0, \dots, 0)$, both c.f.'s assume the
value 1. Thus the c.f.'s are always equal, and then by Theorem 9a
the corresponding distributions are identical.
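The relation (140), on which the proof of Theorem 31 rests, can be illustrated for a purely discontinuous distribution in $R_2$. The following minimal Python sketch is an illustration and not part of the original text; the three mass points, the direction `T` and all function names are assumptions of this example. It computes the c.f. of the projected variable $U$ directly, compares it with $f(tt_1, tt_2)$, and evaluates the probability of a half-space $S_{T,x}$.

```python
import cmath

def joint_cf(ts, points, probs):
    """f(t_1, ..., t_k) according to (137) for a discrete distribution."""
    return sum(p * cmath.exp(1j * sum(t * x for t, x in zip(ts, pt)))
               for pt, p in zip(points, probs))

def projection_cf(t, T, points, probs):
    """c.f. of U = t_1 xi_1 + ... + t_k xi_k, computed from the values of U."""
    return sum(p * cmath.exp(1j * t * sum(a * x for a, x in zip(T, pt)))
               for pt, p in zip(points, probs))

def half_space_probability(T, x, points, probs):
    """P(S_{T,x}): total mass of the points with t_1 xi_1 + ... + t_k xi_k <= x."""
    return sum(p for pt, p in zip(points, probs)
               if sum(a * xi for a, xi in zip(T, pt)) <= x)

points = [(0, 0), (1, 2), (2, 1)]
probs = [0.2, 0.3, 0.5]
T = (1.0, -1.0)
t = 0.37

lhs = projection_cf(t, T, points, probs)
rhs = joint_cf([t * a for a in T], points, probs)
print(abs(lhs - rhs), half_space_probability(T, 0.0, points, probs))
```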
5. We now proceed to the generalization of the important
Theorem 11.
Theorem 11a.² Let $\{P_n(S)\}$ be a sequence of pr.f.'s in $R_k$,
and $\{f_n(t_1, \dots, t_k)\}$ the corresponding sequence of c.f.'s. A necessary
and sufficient condition for the convergence of $\{P_n(S)\}$ to a pr.f.
$P(S)$, in every continuity interval of the latter, is that the sequence
$\{f_n(t_1, \dots, t_k)\}$ converges for every $T = (t_1, \dots, t_k)$ to a limit $f(t_1, \dots, t_k)$,
which is continuous at the point $T = (0, \dots, 0)$.
When this condition is satisfied, the limit $f(t_1, \dots, t_k)$ is identical
with the c.f. of $P(S)$, and $\{f_n\}$ converges to $f$ uniformly in every
finite interval.
1 Cramér-Wold [1].
2 Romanovsky [1], Bochner [2], Haviland [3], Cramér-Wold [1].
That the condition is necessary is proved by a straightforward
generalization of the argument used in the one-dimensional case.
It thus only remains (cf. the proof of Theorem 11) to prove that,
if $f_n(t_1, \dots, t_k)$ converges to a limit $f^*(t_1, \dots, t_k)$, uniformly in
$|t_i| < a$, then $P_n(S) \to P(S)$, where $P(S)$ is a pr.f.
l
, .. , t
A
) be a given point in R
k
sucah that T# (0, ... ,0),
and consider the sequence In (ttl' ...., tl1&)' where t is a real vari-
a.ble. By hypothesis this converges for all t to a limit, which is
continuous at 1=0. According to the preceding paragrapb...
fit (ttl' ... , tt
k
) is, however, the c.f. of the d.f. P
n
(ST.:.c). Thus by
Theorem 11 we have Pn. ill every continuity point
of F
p
(x), where F
p
(x) is a d.f.
From $\{P_n(S)\}$, we now choose a sub-sequence which converges
to a limit $P^*(S)$, in every continuity interval of the latter. Then
it follows from the above that, in every continuity point of
$F_T(x)$, we have $P^*(S_{T,x}) = F_T(x)$. Allowing here $x$ to tend to
infinity, it follows that $P^*(R_k) = 1$, and thus $P^*(S)$ is a pr.f.,
which we denote by $P(S)$.
In exactly the same way as in the proof of Theorem 11 we can
now show (using, of course, Theorem 9a instead of Theorem 9)
that every convergent sub-sequence of $\{P_n(S)\}$ converges to the
same limit $P(S)$. This is, however, equivalent to the statement
that the sequence $\{P_n(S)\}$ converges to $P(S)$. Thus Theorem 11a
is proved.
We shall not enter here upon the question of a k-dimensional
generalization of Theorem 12.
6. Let us consider two mutually independent variables $X_1$
and $X_2$ in $R_k$. The pr.f.'s will be denoted by $P_1$ and $P_2$, and the
c.f.'s by $f_1$ and $f_2$ respectively. The sum $X_1 + X_2$, formed according
to the ordinary rule of vector addition, is a k-dimensional vector
function of the combined variable $(X_1, X_2)$, and thus according
to II, 6 (cf. V, 1), $X_1 + X_2$ is a random variable in $R_k$, with a
probability distribution uniquely determined by $P_1$ and $P_2$. We
shall now prove the following theorem, which corresponds to
Theorem 13.
Theorem 13a.¹ If $X_1$ and $X_2$ are mutually independent random
variables in $R_k$ with the pr.f.'s $P_1$ and $P_2$, and the c.f.'s $f_1$ and $f_2$,
then the sum $X_1 + X_2$ has the c.f.
(142) $f(t_1, \dots, t_k) = f_1(t_1, \dots, t_k)\, f_2(t_1, \dots, t_k)$,
and the pr.f.
(143) $P(S) = \int_{R_k} P_1(S - X)\, dP_2 = \int_{R_k} P_2(S - X)\, dP_1$,
where $S - X$ denotes the set of all points $Y - X$, where $Y$ belongs to $S$.
As in the one-dimensional case, the relation (142) is
an immediate consequence of the definition of the c.f. according
to (137).
Let us now consider the second member of (143). For every
Borel set $S$, $P_1(S - X)$ is a bounded, non-negative and B-measurable
function of the point $X$, so that according to I, 3, the
integral always exists.² Obviously the value $P(S)$ of this integral
is a completely additive non-negative function of the set $S$, which
for $S = R_k$ assumes the value 1, i.e. a pr.f. If, now, we can
show that the c.f. of $P(S)$, which we denote by $\bar{f}$, is identical with
$f$ as given by (142), it follows that $P(S)$ must be the pr.f. of the
variable $X_1 + X_2$, and then by reasons of symmetry $P(S)$ will
also be equal to the third member of (143), so that the theorem
will be proved.
If the set $S$ is a half-space $S_{T,x}$ as defined by (141), it follows
from the expression of $P(S)$ as an integral that the one-dimensional
d.f. $P(S_{T,x})$ is the composition of $P_1(S_{T,x})$ and $P_2(S_{T,x})$, or
in the notation of V, 1,
$P(S_{T,x}) = P_1(S_{T,x}) * P_2(S_{T,x})$.
1 Bochner [2], Haviland [2], Cramér-Wold [1].
2 This may be shown in the following way. If, in particular, $S$ is an interval of
the type $S_{x_1, \dots, x_k}$ considered in II, 2, we have, putting
$X = (\xi_1, \dots, \xi_k)$, $P_1(S - X) = F_1(x_1 - \xi_1, \dots, x_k - \xi_k)$,
where $F_1$ is the d.f. corresponding to the pr.f. $P_1$. Thus in this case $P_1(S - X)$, regarded as
a function of $\xi_1, \dots, \xi_k$, is bounded and B-measurable. From the complete
additivity of $P_1(S - X)$ with respect to $S$ it then follows that the same properties
must hold for every Borel set $S$ in $R_k$.
Thus according to Theorem 13 the c.f. of the first member is the
product of the c.f.'s of both components in the second member,
which gives according to (140)
$\bar{f}(tt_1, \dots, tt_k) = f_1(tt_1, \dots, tt_k)\, f_2(tt_1, \dots, tt_k)$.
Putting here $t = 1$, we obtain the desired result.
As in the one-dimensional case, we shall say that $P(S)$ is
composed of the components $P_1(S)$ and $P_2(S)$, and we shall use
the abbreviation
(143a) $P = P_1 * P_2$.
For the sum $X_1 + X_2 + \dots + X_n$ of $n$ mutually independent
random variables in $R_k$ we have the pr.f.
$P = P_1 * P_2 * \dots * P_n$
and the c.f.
$f = f_1 f_2 \dots f_n$.
CHAPTER X
THE NORMAL DISTRIBUTION AND
THE CENTRAL LIMIT THEOREM
1. In order to generalize the normal distribution to the space
$R_k$, it is convenient to begin with a discussion of the characteristic
functions. For a normally distributed one-dimensional variable
with the mean value $m$ and the s.d. $\sigma$, we have, according to
VI, 1, the c.f.
$f(t) = e^{mit - \frac{1}{2}\sigma^2 t^2}$.
A perfectly natural generalization of this to $k$ variables
is obtained by putting
(144) $f(t_1, \dots, t_k) = e^{i \sum_r m_r t_r - \frac{1}{2} \sum_{r,s} \mu_{rs} t_r t_s}$.
Assuming that $f(t_1, \dots, t_k)$ as defined by this expression is the
c.f. of a probability distribution in $R_k$, it is seen by expansion in
MacLaurin's series that $m_r$ and $\mu_{rs}$ are the first and second order
moments introduced in IX, 2. We have, in fact, the following
theorem.
Theorem 32. For any real $m_r$ and $\mu_{rs}$ such that the
quadratic form $Q = \sum_{r,s} \mu_{rs} t_r t_s$ is never negative, $f(t_1, \dots, t_k)$ as
defined by (144) is the c.f. of a probability distribution in $R_k$ which
will be called a normal distribution. The following two cases may
occur:
(A) If the form $Q$ is definite positive, the corresponding distribution
is absolutely continuous and will be called a proper normal
distribution. The density function of this distribution is
(145) $\frac{1}{(2\pi)^{k/2} \sqrt{\Delta}}\, e^{-\frac{1}{2} q}$,
where $\Delta = \|\mu_{rs}\| > 0$ and $q = \sum_{r,s} \frac{\Delta_{rs}}{\Delta} (\xi_r - m_r)(\xi_s - m_s)$ is the reciprocal
form of $Q$, with the variables $\xi_r - m_r$ ($\Delta_{rs}$ being the cofactor of $\mu_{rs}$ in $\Delta$).
(B) If the form $Q$ is only semi-definite, the distribution is of the
singular type and will be called an improper normal distribution.
For this distribution, the whole mass is situated in a certain sub-space
of less than $k$ dimensions, defined by one or more linear
relations between the $\xi_r$ (straight line, plane, hyperplane). Every
improper normal distribution may be represented as the limit of a
sequence of proper normal distributions.¹
1 The distinction of proper and improper normal distributions is, of course,
relative to the space in which they are considered. A proper normal
distribution in $R_k$ becomes improper as soon as it is considered as a
distribution in a space $R_K$ with $K > k$.

In the case (A) we have to show that (145) is a density function,
the c.f. of which is identical with (144). Consider the integral
$G(u_1, \dots, u_k) = \frac{1}{(2\pi)^{k/2} \sqrt{\Delta}} \int_{R_k} e^{\sum_r u_r \xi_r - \frac{1}{2} q}\, d\xi_1 \dots d\xi_k$,
where, until further notice, $u_1, \dots, u_k$ are real variables. By
means of the substitution
$\xi_r - m_r = \sum_s \mu_{rs} (\eta_s + u_s)$,
we obtain
$\sum_r u_r \xi_r - \frac{1}{2} q = \sum_r m_r u_r + \frac{1}{2} \sum_{r,s} \mu_{rs} u_r u_s - \frac{1}{2} \sum_{r,s} \mu_{rs} \eta_r \eta_s$,
and hence
(146) $G(u_1, \dots, u_k) = e^{\sum_r m_r u_r + \frac{1}{2} \sum_{r,s} \mu_{rs} u_r u_s}$.
For $u_r = 0$ we obtain $G = 1$, which shows that (145) is a density
function. Since both members of (146) are integral functions
of the $u_r$, (146) holds also for complex values of the $u_r$. Substituting
$it_r$ for $u_r$, $G$ becomes the c.f. corresponding to the
density function (145), and (146) shows that this is identical
with (144).
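As a numerical illustration of case (A), the density (145) can be evaluated for $k = 2$ and integrated on a grid. This is a minimal sketch, assuming hypothetical values of $m_r$ and $\mu_{rs}$; the function name `normal_density_2d` and the midpoint-rule integration over a large square are choices made here for illustration, not part of the text.

```python
import math


def normal_density_2d(xi, m, mu):
    """Density (145) for k = 2; mu = ((mu11, mu12), (mu12, mu22))."""
    delta = mu[0][0] * mu[1][1] - mu[0][1] * mu[1][0]
    if delta <= 0:
        raise ValueError("the form Q must be definite positive")
    # Cofactors of mu_rs in the determinant delta.
    d11, d22, d12 = mu[1][1], mu[0][0], -mu[0][1]
    a, b = xi[0] - m[0], xi[1] - m[1]
    # Reciprocal form q in the variables xi_r - m_r.
    q = (d11 * a * a + 2.0 * d12 * a * b + d22 * b * b) / delta
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(delta))


# Hypothetical parameters (assumed for this example only).
m = (1.0, -0.5)
mu = ((2.0, 0.6), (0.6, 1.0))

# Midpoint rule on the square |xi_r - m_r| < 10.
n, half = 200, 10.0
h = 2.0 * half / n
total = 0.0
mu12_estimate = 0.0
for i in range(n):
    x = m[0] - half + (i + 0.5) * h
    for j in range(n):
        y = m[1] - half + (j + 0.5) * h
        w = normal_density_2d((x, y), m, mu) * h * h
        total += w
        mu12_estimate += (x - m[0]) * (y - m[1]) * w

print(total, mu12_estimate)
```

The first number approximates the total mass of the distribution and the second approximates the mixed second order moment $\mu_{12}$ used to build the density.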
In order to prove the case (B) of Theorem 32, it is sufficient to
consider a sequence of proper normal distributions, allowing the
$\mu_{rs}$ to tend to the coefficients of a semi-definite form $Q$, while the
$m_r$ are being kept constant. Obviously the corresponding c.f.'s
converge uniformly to a limit of the form (144) with a semi-definite
$Q$, and then Theorem 11a shows that this limit is the
c.f. of a certain probability distribution in $R_k$. The determinant
$\Delta$ is, of course, equal to zero for the limiting distribution, and it
then follows from (145) that the whole mass is concentrated in
the set of points defined by¹ $\sum_{r,s} \Delta_{rs} (\xi_r - m_r)(\xi_s - m_s) = 0$. Now the
determinant $\|\Delta_{rs}\|$ is zero, so that this relation is equivalent to
a certain number $k_1 < k$ of linear relations between $\xi_1, \dots, \xi_k$.
2. We now proceed to the generalization of Theorems 17-20.
From the expression (144) of the c.f., the following theorem is
immediately deduced.
Theorem 17a. The sum of two independent and normally
distributed variables in $R_k$ is itself normally distributed. If at least
one of the components has a proper normal distribution, the same
holds true for the sum.
It should be noted that, of course, the sum may have a proper
normal distribution even if both components have improper
distributions, since the sum of two semi-definite forms $Q$ may
well be a definite form.
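A small example makes this remark concrete. The sketch below takes, as an assumption for illustration, the two semi-definite forms $(t_1 + t_2)^2$ and $(t_1 - t_2)^2$ in $R_2$, whose sum $2t_1^2 + 2t_2^2$ is definite. The helper names `add_forms` and `classify` are hypothetical, and the test used (diagonal coefficients and determinant) applies to forms in two variables only.

```python
def add_forms(a, b):
    """Coefficient matrix of the sum of two quadratic forms in two variables."""
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))


def classify(mu, tol=1e-12):
    """Classify a symmetric 2 x 2 coefficient matrix of a quadratic form Q."""
    det = mu[0][0] * mu[1][1] - mu[0][1] * mu[1][0]
    if mu[0][0] < -tol or mu[1][1] < -tol or det < -tol:
        return "not non-negative"
    return "definite positive" if det > tol else "semi-definite"


Q1 = ((1.0, 1.0), (1.0, 1.0))      # (t1 + t2)^2, determinant zero
Q2 = ((1.0, -1.0), (-1.0, 1.0))    # (t1 - t2)^2, determinant zero
Q_sum = add_forms(Q1, Q2)          # 2 t1^2 + 2 t2^2

kinds = [classify(Q1), classify(Q2), classify(Q_sum)]
print(kinds)
```

Each component alone corresponds to an improper normal distribution concentrated on a straight line, while the sum corresponds to a proper normal distribution in the plane.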
If $\sigma$ is a positive quantity, and $M = (m_1, \dots, m_k)$ is a point in
$R_k$, we denote by $(S - M)/\sigma$ the set of all points $(X - M)/\sigma$, where
$X$ belongs to $S$. If $P(S)$ is the pr.f. of a random variable $X$ in $R_k$,
it is then clear that $P((S - M)/\sigma)$ is the pr.f. of the variable
$M + \sigma X$.
1 In order to avoid trivial difficulties, we assume here that there is at least one
minor $\Delta_{rs} \neq 0$.

Theorem 18a. Let $P(S)$ be a pr.f. in $R_k$ with finite second
order moments. If, to any points $M_1$, $M_2$ in $R_k$ and any positive
constants $\sigma_1$, $\sigma_2$, we can find $M$ and $\sigma$ such that
$P\left(\frac{S - M_1}{\sigma_1}\right) * P\left(\frac{S - M_2}{\sigma_2}\right) = P\left(\frac{S - M}{\sigma}\right)$,
then $P(S)$ is a normal pr.f.
Theorem 19a. If the sum of two independent variables in $R_k$
is normally distributed, then each variable is itself normally
distributed.
Theorem 20a. Let $P(S)$ be a pr.f. in $R_k$ such that the first
order moments $m_r$ are all equal to zero while the second order moments
$\mu_{rs}$ are finite. If $X_1, X_2, \dots$ are independent variables all having the
pr.f. $P(S)$, then the pr.f. $(P(S\sqrt{n}))^{n*}$ of the variable
$(X_1 + \dots + X_n)/\sqrt{n}$
tends, as $n \to \infty$, to that normal pr.f. which has the same first and
second order moments as $P(S)$.
As in the one-dimensional case, we prove first Theorem 20a,
from which Theorem 18a is deduced as a corollary. We then prove
Theorem 19a by means of the induction method indicated in
IX, 4.
The c.f. of $(P(S\sqrt{n}))^{n*}$ is $(f(t_1/\sqrt{n}, \dots, t_k/\sqrt{n}))^n$. From the
relation (138) it then follows, in the same way as in the one-dimensional
case, that this tends to the limit
$e^{-\frac{1}{2} \sum_{r,s} \mu_{rs} t_r t_s}$
as $n \to \infty$, uniformly in every finite interval. According to
Theorem 11a, this proves Theorem 20a. Hence Theorem 18a is
deduced in a perfectly similar way as in the one-dimensional case.
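The convergence of $(f(t_1/\sqrt{n}, \dots, t_k/\sqrt{n}))^n$ can be observed numerically for a simple distribution. The following sketch assumes a hypothetical variable in $R_2$ taking the values $(1, 1)$ and $(-1, -1)$ with probability 0.4 each and $(1, -1)$ and $(-1, 1)$ with probability 0.1 each, so that the first order moments vanish and $\mu_{11} = \mu_{22} = 1$, $\mu_{12} = 0.6$. The function names are illustrative only.

```python
import math


def f(t1, t2):
    """c.f. of the assumed four-point distribution (real by symmetry)."""
    return 0.8 * math.cos(t1 + t2) + 0.2 * math.cos(t1 - t2)


def cf_normalized_sum(t1, t2, n):
    """c.f. of (X_1 + ... + X_n) / sqrt(n) for independent copies of X."""
    r = math.sqrt(n)
    return f(t1 / r, t2 / r) ** n


def normal_limit(t1, t2):
    """Limit c.f. with mu11 = mu22 = 1 and mu12 = 0.6."""
    return math.exp(-0.5 * (t1 * t1 + 2.0 * 0.6 * t1 * t2 + t2 * t2))


t = (1.0, 0.5)
errors = [abs(cf_normalized_sum(t[0], t[1], n) - normal_limit(t[0], t[1]))
          for n in (10, 100, 10000)]
print(errors)
```

The printed differences should decrease as $n$ grows, in agreement with the limit relation used in the proof of Theorem 20a.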
In order to prove Theorem 19a, we suppose that
$P_1(S) * P_2(S) = P(S)$,
where $P(S)$ is a normal pr.f., and thus have for the corresponding
c.f.'s the relation
(147) $f_1 f_2 = f = e^{iL - \frac{1}{2} Q}$,
where $L$ is a linear form and $Q$ a non-negative quadratic form in
the variables $t_r$.
Consider now the one-dimensional d.f.'s $P_1(S_{T,x})$, $P_2(S_{T,x})$,
$P(S_{T,x})$, where $S_{T,x}$ is the half-space defined by (141). By (140)
the corresponding c.f.'s are $f_1(tt_1, \dots, tt_k)$, $f_2(tt_1, \dots, tt_k)$ and
$f(tt_1, \dots, tt_k) = e^{iLt - \frac{1}{2} Q t^2}$. Thus $P(S_{T,x})$ is a normal d.f., and since
according to (147) we have
$P_1(S_{T,x}) * P_2(S_{T,x}) = P(S_{T,x})$,
it follows from Theorem 19 that $P_1(S_{T,x})$ and $P_2(S_{T,x})$ are both
normal. We thus have
(148) $f_1(tt_1, \dots, tt_k) = e^{iL_1 t - \frac{1}{2} Q_1 t^2}$,
where $L_1$ and $Q_1$ are functions of $t_1, \dots, t_k$.
It follows from (140) that $f_1(tt_1, \dots, tt_k)$ is the c.f. of the variable
$U = t_1 \xi_1 + \dots + t_k \xi_k$, if we put $X_1 = (\xi_1, \dots, \xi_k)$. The $n$th order
moment of $U$ is thus a homogeneous polynomial of order $n$ in the
$t_r$. Thus in (148) $L_1$ is a linear form and $Q_1$ is a quadratic form in
the $t_r$. Since (148) is a c.f., the form $Q_1$ must be non-negative.
Putting $t = 1$ in (148), it then follows that $f_1(t_1, \dots, t_k)$ is the c.f.
of a normal distribution in $R_k$. The same holds, of course, for
$f_2(t_1, \dots, t_k)$, and thus Theorem 19a is proved.
3. Theorem 20a constitutes the simplest case of the Central
Limit Theorem for random variables in $R_k$. It is possible to find
also k-dimensional analogues of the more general theorems
proved in Chapter VI, 3-6, and of the theorems on asymptotic
expansions, etc. given in Chapters VII-VIII. We shall content
ourselves here with giving the statement of a theorem which
corresponds to Theorem 21, though it does not represent a
complete generalization of what has been proved in the
one-dimensional case.
Theorem 21a.¹ Let $X_1, X_2, \dots$ be a sequence of independent
random variables in $R_k$ such that every $X_n$ has the pr.f. $P_n(S)$ with
vanishing first order moments and finite second order moments
$\mu_{rs}^{(n)}$. Suppose that, as $n \to \infty$, the following two conditions are
satisfied:
(149) $\frac{1}{n} \sum_{\nu=1}^{n} \mu_{rs}^{(\nu)} \to \mu_{rs}$, $(r, s = 1, 2, \dots, k)$,
where the $\mu_{rs}$ are not all equal to zero, and
(150) $\frac{1}{n} \sum_{\nu=1}^{n} \int_{|X| > \epsilon\sqrt{n}} |X|^2\, dP_\nu \to 0$
for every $\epsilon > 0$, where $|X|$ denotes $\sqrt{\xi_1^2 + \dots + \xi_k^2}$.
Then the pr.f. of the variable $(X_1 + \dots + X_n)/\sqrt{n}$ converges to
that normal pr.f. which has the first order moments zero and the
second order moments $\mu_{rs}$.

1 Theorems of a similar kind have been given by Bernstein [1], Castelnuovo [1]
and Khintchine [1, 3].

This can be proved by a direct generalization of the
proof of Theorem 21, which, of course, requires a little more
calculation than in the one-dimensional case, but does not involve
any new difficulty of principle. Obviously the condition (150)
is analogous to the Lindeberg condition (64). It should be observed
that the limiting distribution may well be an improper
normal distribution, viz. if the corresponding form $Q$ is semi-definite.
Obviously this may occur even in a case when all the
functions $P_n(S)$ are absolutely continuous.
BIBLIOGRAPHY
The following list contains only works actually referred to in the text.
[1] BACHELIER, L. Théorie de la spéculation. Annales École Norm. Sup.
17 (1900), 21.
[2] BACHELIER, L. Calcul des probabilités. Paris, 1912.
[1] BERNSTEIN, S. Sur l'extension du théorème limite du calcul des
probabilités aux sommes de quantités dépendantes. Math. Annalen,
97 (1927), 1-59.
[1] BESICOVITCH, A. S. Almost periodic functions. Cambridge, 1932.
[1] BOCHNER, S. Vorlesungen über Fouriersche Integrale. Leipzig, 1932.
[2] BOCHNER, S. Monotone Funktionen, Stieltjessche Integrale und
harmonische Analyse. Math. Annalen, 108 (1933), 378-410.
[1] CANTELLI, F. P. La tendenza ad un limite nel senso del calcolo delle
probabilità. Rend. Circ. Mat. Palermo, 16 (1916), 191-201.
[2] CANTELLI, F. P. Una teoria astratta del calcolo delle probabilità.
Giorn. Ist. Ital. Attuari, 3 (1932), 257-265.
[1] CASTELNUOVO, G. Calcolo delle probabilità. Second ed. Bologna,
1926-28.
[1] CRAMÉR, H. Das Gesetz von Gauss und die Theorie des Risikos.
Skand. Aktuarietidskr. 6 (1923), 209-237.
[2] CRAMÉR, H. On the composition of elementary errors. Skand.
Aktuarietidskr. 11 (1928), 13-74 and 141-180.
[3] CRAMÉR, H. On the mathematical theory of risk. Skandia-Festskrift.
Stockholm, 1930.
[4] CRAMÉR, H. Sur les propriétés asymptotiques d'une classe de
variables aléatoires. C.R. Acad. Sci. Paris, 201 (1935), 441-443.
[5] CRAMÉR, H. Über eine Eigenschaft der normalen Verteilungsfunktion.
Math. Zeitschrift, 41 (1936), 405-414.
[1] CRAMÉR, H. and WOLD, H. Some theorems on distribution functions.
Journ. London Math. Soc. 11 (1936), 290-294.
[1] EDGEWORTH, F. Y. The law of error. Camb. Phil. Soc. Proc. 20 (1905),
36-141.
[1] ELDERTON, W. P. Frequency curves and correlation. Second ed.
London, 1927.
[1] ESSCHER, F. On the probability function in the collective theory of
risk. Skand. Aktuarietidskr. 15
[1] FELLER, W. Über den zentralen Grenzwertsatz der Wahrscheinlichkeitsrechnung.
Math. Zeitschrift, 40 (1935), 521-559.
[2] FELLER, W. Über den zentralen Grenzwertsatz der Wahrscheinlichkeitsrechnung,
II. Math. Zeitschrift, 42 (1937).
[1] DE FINETTI, B. Sulle funzioni a incremento aleatorio. Rend. R.
Accad. Lincei, (6), 10 (1929), 163-168.
[2] DE FINETTI, B. Le funzioni caratteristiche di legge istantanea. Rend.
R. Accad. Lincei, (6), 12 (1930), 278-282.
[1] FRÉCHET, M. Sur la convergence en probabilité. Metron, 8 (1930), 1-48.
[2] FRÉCHET, M. Recherches théoriques modernes. Traité du calcul des
probabilités, par E. Borel, tome I, fasc. 3, Paris, 1937.
[1] GLIVENKO, V. Sul teorema limite della teoria delle funzioni caratteristiche.
Giorn. Ist. Ital. Attuari, 7 (1936), 160-167.
[1] HAMEL, G. Eine Basis aller Zahlen und die unstetigen Lösungen der
Funktionalgleichung f(x+y) = f(x)+f(y). Math. Annalen, 60 (1905),
459-462.
[1] HARDY, G. H., LITTLEWOOD, J. E. and PÓLYA, G. Inequalities.
Cambridge, 1934.
[1] HAUSDORFF, F. Mengenlehre. Second ed. Berlin-Leipzig, 1927.
[1] HAVILAND, E. K. On distribution functions and their Laplace-Fourier
transforms. Proc. National Acad. Sci. 20 (1934), 50-57.
[2] HAVILAND, E. K. On the theory of absolutely additive distribution
functions. Amer. Journ. Math. 56 (1934), 625-658.
[3] HAVILAND, E. K. On the inversion formula for Fourier-Stieltjes
transforms in more than one dimension. Amer. Journ. Math. 57
(1935), 94-100 and 382-388.
[1] HOBSON, E. W. The theory of functions of a real variable. Vol. I, third
ed. 1927, Vol. II, second ed. 1926.
[1] JESSEN, B. and WINTNER, A. Distribution functions and the
Riemann zeta function. Trans. Amer. Math. Soc. 38 (1936), 48-88.
[1] KEYNES, J. M. A treatise on probability. London, 1921.
[1] KHINTCHINE, A. Begründung der Normalkorrelation nach der
Lindebergschen Methode. Nachr. Forschungsinst. Moskau, 1 (1928).
[2] KHINTCHINE, A. Asymptotische Gesetze der Wahrscheinlichkeitsrechnung.
Berlin, 1933.
[3] KHINTCHINE, A. Sul dominio di attrazione della legge di Gauss.
Giorn. Ist. Ital. Attuari, 6 (1935), 378-393.
[4] KHINTCHINE, A. Su una legge dei grandi numeri generalizzata.
Giorn. Ist. Ital. Attuari, 7 (1936), 365-377.
[1] KOLMOGOROFF, A. Bemerkungen zu meiner Arbeit "Über die
Summen zufälliger Grössen". Math. Annalen, 102 (1929), 484-488.
[2] KOLMOGOROFF, A. Über die analytischen Methoden in der
Wahrscheinlichkeitsrechnung. Math. Annalen, 104 (1931), 415-458.
[3] KOLMOGOROFF, A. Sulla forma generale di un processo stocastico
omogeneo. Rend. R. Accad. Lincei, (6), 15 (1932), 805-808 and 866-869.
[4] KOLMOGOROFF, A. Grundbegriffe der Wahrscheinlichkeitsrechnung.
Berlin, 1933.
[1] LAGRANGE, J. L. Mémoire sur l'utilité de la méthode de prendre
le milieu entre les résultats de plusieurs observations. Misc.
Taurinensia, 5 (1770-73), 167-232. Œuvres, 2, Paris, 1868.
[1] LAPLACE, P. S. Théorie analytique des probabilités. First ed. 1812,
second ed. 1814, third ed. 1820.
[1] LEBESGUE, H. Leçons sur l'intégration. Second ed. Paris, 1928.
[1] LÉVY, P. Calcul des probabilités. Paris, 1925.
[2] LÉVY, P. Sur les intégrales dont les éléments sont des variables
aléatoires indépendantes. Annali R. Scuola Norm. Sup. Pisa, (2),
3 (1934), 337-366.
[3] LÉVY, P. Propriétés asymptotiques des sommes de variables
aléatoires indépendantes ou enchaînées. Journ. Math. pures appl.
(7), 14 (1935), 347-402.
[1] LIAPOUNOFF, A. Sur une proposition de la théorie des probabilités.
Bull. Acad. Sci. St-Pétersbourg, (5), 13 (1900), 359-386.
[2] LIAPOUNOFF, A. Nouvelle forme du théorème sur la limite de
probabilité. Mém. Acad. Sci. St-Pétersbourg, (8), 12 (1901), No. 5.
[1] LINDEBERG, J. W. Eine neue Herleitung des Exponentialgesetzes
in der Wahrscheinlichkeitsrechnung. Math. Zeitschrift, 15 (1922),
211-225.
[1] LUNDBERG, F. Über die Theorie der Rückversicherung. Verhandl.
6. intern. Kongr. Vers.-Wiss., Wien, 1909, 1, 877-956.
[2] LUNDBERG, F. Försäkringsteknisk riskutjämning, 1-2. Stockholm,
1926-28.
[1] v. MISES, R. Fundamentalsätze der Wahrscheinlichkeitsrechnung.
Math. Zeitschrift, 4 (1919), 1-97.
[2] v. MISES, R. Grundlagen der Wahrscheinlichkeitsrechnung. Math.
Zeitschrift, 5 (1919), 52-99.
[3] v. MISES, R. Wahrscheinlichkeitsrechnung. Leipzig-Wien, 1931.
[1] PEARSON, K. Historical note on the origin of the normal curve of
errors. Biometrika, 16 (1924), 402-404.
[1] PÓLYA, G. Herleitung des Gauss'schen Gesetzes aus einer
Funktionalgleichung. Math. Zeitschrift, 18 (1923), 96-108.
[1] RADON, J. Theorie und Anwendung der absolut additiven
Mengenfunktionen. Sitzungsber. Akad. Wien, 122 (1913), 1295-1438.
[1] RIDER, P. R. A survey of the theory of small samples. Annals of
Math. (2), 31 (1930), 577-628.
[1] ROMANOVSKY, V. Sur un théorème limite du calcul des probabilités.
Rec. Soc. Math. Moscou, 36 (1929), 36-64.
[1] SLUTSKY, E. Über stochastische Asymptoten und Grenzwerte.
Metron, 5 (1925), 1-90.
[1] "STUDENT". The probable error of a mean. Biometrika, 6 (1908-9),
1-25.
[1] THIELE, T. N. Theory of observations. London, 1903.
[1] TITCHMARSH, E. C. The theory of functions. Oxford, 1932.
[1] TODHUNTER, I. A history of the mathematical theory of probability.
Cambridge-London, 1865.
[1] TORNIER, E. Wahrscheinlichkeitsrechnung. Leipzig-Berlin, 1936.
[1] DE LA VALLÉE POUSSIN, C. Intégrales de Lebesgue, fonctions d'ensemble,
classes de Baire. Second ed. Paris, 1934.
[1] WINTNER, A. On the addition of independent distributions. Amer.
Journ. Math. 56 (1934), 8-16.
SOME RECENT WORKS ON MATHEMATICAL PROBABILITY
BARTLETT, M. S. An Introduction to Stochastic Processes. Cambridge,
1955.
BOCHNER, S. Harmonic Analysis and the Theory of Probability. Berkeley,
1955.
DOOB, J. L. Stochastic Processes. New York, 1953.
ESSEEN, C. G. Fourier analysis of distribution functions. Acta Mathematica,
77, 1945, 1-125.
FELLER, W. An Introduction to Probability Theory and its Applications,
Vol. I, 2nd ed. New York, 1957.
GNEDENKO, B. V. and KOLMOGOROFF, A. N. Limit Distributions for
Sums of Independent Random Variables. (Translated from the Russian
by K. L. Chung.) Cambridge, Mass., 1954.
LOÈVE, M. Probability Theory. Foundations-Random Sequences. New
York, 1955.
LUKACS, E. Characteristic Functions. London, 1960.
