TOPICS IN PROBABILITY

Narahari Prabhu
Cornell University, USA

World Scientific
New Jersey · London · Singapore · Beijing · Shanghai · Hong Kong · Taipei · Chennai
Published by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

Copyright © 2011 by World Scientific Publishing Co. Pte. Ltd.

All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

ISBN-13 978-981-4335-47-8
ISBN-10 981-4335-47-9

Typeset by Stallion Press
Email: enquiries@stallionpress.com

Printed in Singapore.
May 12, 2011 14:38 9in x 6in Topics in Probability b1108-fm
Now I've understood
Time's magic play:
Beating his drum he rolls out the show,
Shows different images
And then gathers them in again

— Kabir (1450–1518)
CONTENTS

Preface ix
Abbreviations xi

1. Probability Distributions 1
   1.1. Elementary Properties 1
   1.2. Convolutions 4
   1.3. Moments 6
   1.4. Convergence Properties 8

2. Characteristic Functions 11
   2.1. Regularity Properties 11
   2.2. Uniqueness and Inversion 15
   2.3. Convergence Properties 17
        2.3.1. Convergence of types 19
   2.4. A Criterion for c.f.s 21
   2.5. Problems for Solution 24

3. Analytic Characteristic Functions 27
   3.1. Definition and Properties 27
   3.2. Moments 30
   3.3. The Moment Problem 31
   3.4. Problems for Solution 40

4. Infinitely Divisible Distributions 43
   4.1. Elementary Properties 43
   4.2. Feller Measures 46
   4.3. Characterization of Infinitely Divisible Distributions 50
   4.4. Special Cases of Infinitely Divisible Distributions 54
   4.5. Lévy Processes 57
   4.6. Stable Distributions 58
   4.7. Problems for Solution 66

5. Self-Decomposable Distributions; Triangular Arrays 69
   5.1. Self-Decomposable Distributions 69
   5.2. Triangular Arrays 72
   5.3. Problems for Solution 78

Bibliography 79
Index 81
PREFACE

In this monograph we treat some topics that have been of particular importance and interest in probability theory. These include analytic characteristic functions, the moment problem, and infinitely divisible and self-decomposable distributions.

We begin with a review of the measure-theoretical foundations of probability distributions (Chapter 1) and characteristic functions (Chapter 2).

In many important special cases the domain of characteristic functions can be extended to a strip surrounding the imaginary axis of the complex plane, leading to analytic characteristic functions. It turns out that distributions that have analytic characteristic functions are uniquely determined by their moments. This is the essence of the moment problem. The pioneering work in this area is due to C. C. Heyde. This is treated in Chapter 3.

Infinitely divisible distributions are investigated in Chapter 4. The final Chapter 5 is concerned with self-decomposable distributions and triangular arrays. The coverage of these topics as given by Feller in his 1971 book is comparatively modern (as opposed to classical) but is still somewhat diffused. We give a more compact treatment.

N. U. Prabhu
Ithaca, New York
January 2010
ABBREVIATIONS

Term                                  Abbreviation
characteristic function               c.f.
distribution function                 d.f.
if and only if                        iff
Laplace transform                     L.T.
probability generating function       p.g.f.
random variable                       r.v.

Terminology: We write x =ᵈ y if the r.v.s x, y have the same distribution.
Chapter 1

Probability Distributions
1.1. Elementary Properties

A function F on the real line is called a probability distribution function if it satisfies the following conditions:

(i) F is non-decreasing: F(x + h) ≥ F(x) for h > 0;
(ii) F is right-continuous: F(x+) = F(x);
(iii) F(−∞) = 0, F(∞) ≤ 1.

We shall say that F is proper if F(∞) = 1, and F is defective otherwise.

Every probability distribution induces an assignment of probabilities to all Borel sets on the real line, thus yielding a probability measure P. In particular, for an interval I = (a, b] we have P{I} = F(b) − F(a). We shall use the same letter F both for the point function and the corresponding set function, and write F{I} instead of P{I}. In particular

    F(x) = F{(−∞, x]}.

We shall refer to F as a probability distribution, or simply a distribution.

A point x is an atom if it carries positive probability (weight). It is a point of increase iff F{I} > 0 for every open interval I containing x.
A distribution F is concentrated on the set A if F(A^c) = 0, where A^c is the complement of A. It is atomic if it is concentrated on the set of its atoms. A distribution without atoms is continuous.

As a special case of the atomic distribution we have the arithmetic distribution, which is concentrated on the set {kλ : k = 0, ±1, ±2, ...} for some λ > 0. The largest λ with this property is called the span of F.

A distribution is singular if it is concentrated on a set of Lebesgue measure zero. Theorem 1.1 (below) shows that an atomic distribution is singular, but there exist singular distributions which are continuous.

A distribution F is absolutely continuous if there exists a function f such that

    F(A) = ∫_A f(x) dx.

If there exists a second function g with the above property, then it is clear that f = g almost everywhere, that is, except possibly on a set of Lebesgue measure zero. We have F′(x) = f(x) almost everywhere; f is called the density of F.
Theorem 1.1. A probability distribution has at most countably many atoms.

Proof. Suppose F has n atoms x_1, x_2, ..., x_n in I = (a, b] with a < x_1 < x_2 < ··· < x_n ≤ b and weights p(x_k) = F{x_k}. Then

    Σ_{k=1}^{n} p(x_k) ≤ F{I}.

This shows that the number of atoms with weights > 1/n is at most equal to n. Let

    D_n = {x : p(x) > 1/n};

then the set D_n has at most n points. Therefore the set D = ∪ D_n is at most countable. □
Theorem 1.2 (Jordan decomposition). A probability distribution F can be represented in the form

    F = pF_a + qF_c                                              (1.1)

where p ≥ 0, q ≥ 0, p + q = 1, F_a and F_c are both distributions, F_a being atomic and F_c continuous.

Proof. Let {x_n, n ≥ 1} be the atoms and p = Σ p(x_n), q = 1 − p. If p = 0 or if p = 1, the theorem is trivially true. Let us assume that 0 < p < 1 and for −∞ < x < ∞ define the two functions

    F_a(x) = (1/p) Σ_{x_n ≤ x} p(x_n),    F_c(x) = (1/q)[F(x) − pF_a(x)].    (1.2)

Here F_a is a distribution because it satisfies the conditions (i)–(iii) above. For F_c we find that for h > 0

    q[F_c(x + h) − F_c(x)] = F(x + h) − F(x) − Σ_{x < x_n ≤ x+h} p(x_n) ≥ 0,    (1.3)

which shows that F_c is non-decreasing. Letting h → 0 in (1.3) we find that

    q[F_c(x⁺) − F_c(x)] = F(x⁺) − F(x) = 0,

so that F_c is right-continuous. Finally, F_c(−∞) = 0, while F_c(∞) = 1. Therefore F_c is a distribution. □
Theorem 1.3 (Lebesgue decomposition). A probability distribution can be represented as the sum

    F = pF_a + qF_sc + rF_ac                                     (1.4)

where p ≥ 0, q ≥ 0, r ≥ 0, p + q + r = 1, F_a is an atomic distribution, F_sc is a continuous but singular distribution and F_ac is an absolutely continuous distribution.
Proof. By the Lebesgue decomposition theorem on measures we can express F as

    F = aF_s + bF_ac,                                            (1.5)

where a ≥ 0, b ≥ 0, a + b = 1, F_s is a singular distribution and F_ac is an absolutely continuous distribution. Applying Theorem 1.2 to F_s we find that F_s = p_1 F_a + q_1 F_sc, where p_1 ≥ 0, q_1 ≥ 0, p_1 + q_1 = 1. Writing p = ap_1, q = aq_1, r = b we arrive at the desired result (1.4). □

Remark. Although it is possible to study distribution functions and measures without reference to random variables (r.v.), as we have done above, it is convenient to start with the definition

    F(x) = P{X ≤ x}

where X is a random variable defined on an appropriate sample space.
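To make the decomposition concrete, here is a small numerical sketch. The particular mixture below — an atom of weight 1/2 at the origin plus a uniform density on (0, 1) — is our own illustrative choice, not an example from the text:

```python
# Jordan decomposition F = p*F_a + q*F_c, illustrated with a hypothetical
# mixture: an atom of weight p = 0.5 at x = 0 and a uniform density on (0, 1).

p, q = 0.5, 0.5

def F_a(x):
    # atomic component: all mass at the single atom x = 0
    return 1.0 if x >= 0 else 0.0

def F_c(x):
    # continuous component: uniform distribution on (0, 1)
    return min(max(x, 0.0), 1.0)

def F(x):
    return p * F_a(x) + q * F_c(x)

# F jumps by exactly p at the atom and is continuous elsewhere
jump_at_0 = F(0.0) - F(-1e-12)
print(jump_at_0)   # weight of the atom: 0.5
print(F(1.0))      # F is proper: F(1) = 1.0
```

The same pattern extends to the Lebesgue decomposition (1.4) by adding a singular-continuous component such as the Cantor distribution.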
1.2. Convolutions

Let F_1, F_2 be distributions and F be defined by

    F(x) = ∫ F_1(x − y) dF_2(y)                                  (1.6)

where the integral obviously exists. We call F the convolution of F_1 and F_2 and write F = F_1 * F_2. Clearly F_1 * F_2 = F_2 * F_1.
Theorem 1.4. The function F is a distribution.

Proof. For h > 0 we have

    F(x + h) − F(x) = ∫ [F_1(x − y + h) − F_1(x − y)] dF_2(y) ≥ 0    (1.7)

so that F is non-decreasing. As h → 0,

    F_1(x − y + h) − F_1(x − y) → F_1(x − y+) − F_1(x − y) = 0;

since

    |F_1(x − y + h) − F_1(x − y)| ≤ 2,    ∫ 2 dF_2(y) = 2,

the right side of (1.7) tends to 0 by the dominated convergence theorem. Therefore F(x⁺) − F(x) = 0, so that F is right-continuous. Since F_1(∞) = 1 the dominated convergence theorem gives F(∞) = 1. Similarly F(−∞) = 0. Therefore F is a distribution. □
Theorem 1.5. If F_1 is continuous, so is F. If F_1 is absolutely continuous, so is F.

Proof. We have seen in Theorem 1.4 that the right-continuity of F_1 implies the right-continuity of F. Similarly the left-continuity of F_1 implies that of F. It follows that if F_1 is continuous, so is F.

Next let F_1 be absolutely continuous, so there exists a function f_1 such that

    F_1(x) = ∫_{−∞}^{x} f_1(u) du.

Then

    F(x) = ∫ dF_2(y) ∫_{−∞}^{x} f_1(u − y) du = ∫_{−∞}^{x} du ∫ f_1(u − y) dF_2(y)

so that F is absolutely continuous, with density

    f(x) = ∫ f_1(x − y) dF_2(y).                                 (1.8)  □
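Formula (1.8) is easy to check numerically. The sketch below (grid size and the choice of two uniform densities on (0, 1) are ours) approximates the convolution integral by a Riemann sum; the result is the well-known triangular density of the sum of two independent uniforms:

```python
# Density of F = F1 * F2 via f(x) = integral of f1(x - y) f2(y) dy,
# with both components uniform on (0, 1); the exact answer is triangular.

def f1(x):
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

f2 = f1

def f(x, n=10_000):
    # midpoint Riemann-sum approximation of the convolution over (0, 1)
    h = 1.0 / n
    return sum(f1(x - (k + 0.5) * h) * f2((k + 0.5) * h) for k in range(n)) * h

print(round(f(0.5), 3))   # triangular density: min(x, 2 - x) = 0.5
print(round(f(1.0), 3))   # peak value 1.0
```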
Remarks.

1. If X_1, X_2 are independent random variables with distributions F_1, F_2, then the convolution F = F_1 * F_2 is the distribution of their sum X_1 + X_2. For

    F(z) = P{X_1 + X_2 ≤ z} = ∬_{x+y≤z} dF_1(x) dF_2(y)
         = ∫ dF_2(y) ∫_{−∞}^{z−y} dF_1(x) = ∫ F_1(z − y) dF_2(y).

However, it should be noted that dependent random variables X_1, X_2 may have the property that the distribution of their sum is given by the convolution of their distributions.

2. The converse of Theorem 1.5 is false. In fact two singular distributions may have a convolution which is absolutely continuous.
2. The converse of Theorem 1.5 is false. In fact two singular distri-
butions may have a convolution which is absolutely continuous.
3. The conjugate of any distribution F is dened as the distribution

F, where

F(x) = 1 F(x

).
If F is the distribution of the random variable X, then

F is the
distribution of X. The distribution F is symmetric if F =

F.
4. Given any distribution F, we can symmetrize it by dening the
distribution

F, where

F = F

F.
It is seen that

F is a symmetric distribution. It is the distribution
of the dierence X
1
X
2
, where X
1
, X
2
are independent variables
with the same distribution F.
1.3. Moments

The moment of order α > 0 of a distribution F is defined by

    μ_α = ∫ x^α dF(x)

provided that the integral converges absolutely, that is,

    ν_α = ∫ |x|^α dF(x) < ∞;

ν_α is called the absolute moment of order α. Let 0 < β < α. Then for |x| ≤ 1 we have |x|^β ≤ 1, while for |x| > 1 we have |x|^β ≤ |x|^α. Thus we can write |x|^β ≤ |x|^α + 1 for all x and so

    ∫ |x|^β dF(x) ≤ ∫ (1 + |x|^α) dF(x) = 1 + ∫ |x|^α dF(x).

This shows that the existence of the moment of order α implies the existence of all moments of order β < α.
Theorem 1.6. The moment μ_α of a distribution F exists iff

    x^{α−1}[1 − F(x) + F(−x)]                                    (1.9)

is integrable over (0, ∞).

Proof. For t > 0 an integration by parts yields the relation

    ∫_{−t}^{t} |x|^α dF(x) = −t^α [1 − F(t) + F(−t)]
        + α ∫_0^t x^{α−1}[1 − F(x) + F(−x)] dx.                  (1.10)

From this we find that

    ∫_{−t}^{t} |x|^α dF(x) ≤ α ∫_0^t x^{α−1}[1 − F(x) + F(−x)] dx

so that if (1.9) is integrable over (0, ∞), ν_α (and therefore μ_α) exists. Conversely, if ν_α exists, then since

    ∫_{|x|>t} |x|^α dF(x) > |t|^α [1 − F(t) + F(−t)],

the first term on the right side of (1.10) vanishes as t → ∞ and the integral there converges as t → ∞. □
Theorem 1.7. Let

    β(t) = ∫ |x|^t dF(x) < ∞

for t in some interval I. Then log β(t) is a convex function of t ∈ I.

Proof. Let a ≥ 0, b ≥ 0, a + b = 1. Then for two functions ψ_1, ψ_2 we have the Hölder inequality

    ∫ |ψ_1(x) ψ_2(x)| dF(x) ≤ [∫ |ψ_1(x)|^{1/a} dF(x)]^a [∫ |ψ_2(x)|^{1/b} dF(x)]^b

provided that the integrals exist. In this put ψ_1(x) = x^{at_1}, ψ_2(x) = x^{bt_2}, where t_1, t_2 ∈ I. Then

    β(at_1 + bt_2) ≤ β(t_1)^a β(t_2)^b                           (1.11)

or, taking logarithms,

    log β(at_1 + bt_2) ≤ a log β(t_1) + b log β(t_2)

which establishes the convexity property of log β. □

Corollary 1.1 (Lyapunov's inequality). Under the hypothesis of Theorem 1.7, β_t^{1/t} is non-decreasing for t ∈ I.

Proof. Let α, β ∈ I and choose a = β/α, t_1 = α, b = 1 − a, t_2 = 0. Then (1.11) reduces to

    β_β ≤ β_α^{β/α}    (β ≤ α)

where we have written β_t = β(t). □
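Lyapunov's inequality is easy to verify numerically. For instance, for the unit exponential distribution (our own choice of example) the absolute moment of order t is β_t = Γ(t + 1), and β_t^{1/t} should be non-decreasing in t:

```python
import math

# For X ~ Exp(1), the absolute moment of order t is beta_t = Gamma(t + 1).
# Lyapunov's inequality says beta_t ** (1/t) is non-decreasing in t.

def beta(t):
    return math.gamma(t + 1.0)

ts = [0.5, 1.0, 2.0, 3.0, 4.0]
vals = [beta(t) ** (1.0 / t) for t in ts]
print(vals)
assert all(a <= b for a, b in zip(vals, vals[1:]))  # non-decreasing
```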
1.4. Convergence Properties

We say that I is an interval of continuity of a distribution F if I is open and its end points are not atoms of F. The whole line (−∞, ∞) is considered to be an interval of continuity.

Let {F_n, n ≥ 1} be a sequence of proper distributions. We say that the sequence converges to F if

    F_n{I} → F{I}                                                (1.12)

for every bounded interval of continuity I of F. If (1.12) holds for every (bounded or unbounded) interval of continuity of F, then the convergence is said to be proper, and otherwise improper. Proper convergence implies in particular that F(∞) = 1.

Examples

1. Let F_n be uniform in (−n, n). Then for every bounded interval I contained in (−n, n) we have

    F_n{I} = ∫_I dx/(2n) = |I|/(2n) → 0    as n → ∞

where |I| is the length of I. This shows that the convergence is improper.

2. Let F_n be concentrated on {1/n, n} with weight 1/2 at each atom. Then for every bounded interval I we have

    F_n{I} → 0 or 1/2

according as I does not or does contain the origin. Therefore the limit F is such that it has an atom at the origin, with weight 1/2. Clearly F is not a proper distribution.
3. Let F_n be the convolution of a proper distribution F with the normal distribution with mean zero and variance n⁻². Thus

    F_n(x) = ∫ F(x − y) (n/√(2π)) e^{−n²y²/2} dy
           = ∫ F(x − y/n) (1/√(2π)) e^{−y²/2} dy.

For finite a, b we have

    ∫_a^b dF_n(x) = ∫ [F(b − y/n) − F(a − y/n)] (1/√(2π)) e^{−y²/2} dy
                  → F(b⁻) − F(a⁻)    as n → ∞

by the dominated convergence theorem. If a, b are points of continuity of F we can write

    F_n{(a, b)} → F{(a, b)}                                      (1.13)

so that the sequence {F_n} converges properly to F.
If X is a random variable with the distribution F and Y_n is an independent variable with the above normal distribution, then we know that F_n is the distribution of the sum X + Y_n. As n → ∞, it is obvious that the distribution of this sum converges to that of X. This justifies the definition of convergence which requires (1.13) to hold only for points of continuity a, b.

Theorem 1.8 (Selection theorem). Every sequence {F_n} of distributions contains a subsequence {F_{n_k}, k ≥ 1} which converges (properly or improperly) to a limit F.

Theorem 1.9. A sequence {F_n} of proper distributions converges to F iff

    ∫ u(x) dF_n(x) → ∫ u(x) dF(x)                                (1.14)

for every function u which is bounded, continuous and vanishing at ±∞. If the convergence is proper, then (1.14) holds for every bounded continuous function u.

The proofs of these two theorems are omitted.
Chapter 2

Characteristic Functions
2.1. Regularity Properties

Let F be a probability distribution. Then its characteristic function (c.f.) is defined by

    φ(ω) = ∫ e^{iωx} dF(x)                                       (2.1)

where i = √−1 and ω is real. This integral exists, since

    ∫ |e^{iωx}| dF(x) = ∫ dF(x) = 1.                             (2.2)
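For distributions with a density, (2.1) can be evaluated by ordinary quadrature. The sketch below — our own example, for the uniform density on (0, 1) — compares a Riemann sum for ∫ e^{iωx} dF(x) with the closed form (e^{iω} − 1)/(iω):

```python
import cmath

# c.f. of the uniform distribution on (0, 1):
# phi(w) = integral over (0, 1) of e^{iwx} dx = (e^{iw} - 1) / (iw)

def phi_numeric(w, n=20_000):
    h = 1.0 / n
    return sum(cmath.exp(1j * w * (k + 0.5) * h) for k in range(n)) * h

def phi_exact(w):
    return (cmath.exp(1j * w) - 1.0) / (1j * w)

w = 2.7
print(abs(phi_numeric(w) - phi_exact(w)) < 1e-6)  # True
print(abs(phi_numeric(w)) <= 1.0)                 # True: |phi| never exceeds 1
```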
Theorem 2.1. A c.f. has the following properties:

(a) φ(0) = 1 and |φ(ω)| ≤ 1 for all ω.
(b) φ(−ω) = φ̄(ω), and φ̄ is also a c.f.
(c) Re φ is also a c.f.

Proof. (a) We have

    φ(0) = ∫ dF(x) = 1,    |φ(ω)| ≤ ∫ |e^{iωx}| dF(x) = 1.

(b) φ̄(ω) = ∫ e^{−iωx} F{dx} = φ(−ω). Moreover, let F̄(x) = 1 − F(−x⁻). Then

    ∫ e^{iωx} F̄{dx} = ∫ e^{−iωx} F{dx} = φ̄(ω).

Thus φ̄(ω) is the c.f. of F̄, which is a distribution.

(c) Re φ = (1/2)φ + (1/2)φ̄ = c.f. of (1/2)F + (1/2)F̄, which is a distribution. □
Theorem 2.2. If φ_1, φ_2 are c.f.s, so is their product φ_1 φ_2.

Proof. Let φ_1, φ_2 be the c.f.s of F_1, F_2 respectively and consider the convolution

    F(x) = ∫ F_1(x − y) dF_2(y).

We know that F is a distribution. Its c.f. is given by

    φ(ω) = ∫ e^{iωx} dF(x) = ∬ e^{iωx} dF_1(x − y) dF_2(y)
         = ∫ e^{iωy} dF_2(y) ∫ e^{iω(x−y)} dF_1(x − y) = φ_1(ω) φ_2(ω).

Thus the product φ_1 φ_2 is the c.f. of the convolution F_1 * F_2. □

Corollary 2.1. If φ is a c.f., so is |φ|².

Proof. We can write |φ|² = φ φ̄, where φ̄ is a c.f. by Theorem 2.1(b). □
Theorem 2.3. A distribution F is arithmetic iff there exists a real ω_0 ≠ 0 such that φ(ω_0) = 1.

Proof. (i) Suppose that the distribution is concentrated on {kλ : λ > 0, k = 0, ±1, ±2, ...} with the weight p_k at kλ. Then the c.f. is given by

    φ(ω) = Σ p_k e^{iωkλ}.

Clearly φ(2π/λ) = 1.

(ii) Conversely, let φ(ω_0) = 1 for ω_0 ≠ 0. This gives

    ∫ (1 − e^{iω_0 x}) dF(x) = 0.

Therefore

    ∫ (1 − cos ω_0 x) dF(x) = 0

which shows that the points of increase of F are among 2kπ/ω_0 (k = 0, ±1, ±2, ...). Thus the distribution is arithmetic. □

Corollary 2.2. If φ(ω) = 1 for all ω, then the distribution is concentrated at the origin.
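Theorem 2.3 can be illustrated with the Poisson distribution, which is arithmetic with span λ = 1; its c.f. is exp(μ(e^{iω} − 1)), which should equal 1 at ω = 2π. The Poisson choice and the rate μ below are ours:

```python
import cmath, math

# Poisson(mu) is concentrated on {0, 1, 2, ...}: arithmetic with span 1.
# Its c.f. is phi(w) = exp(mu * (e^{iw} - 1)), so phi(2*pi) = exp(0) = 1.

mu = 1.7

def phi(w):
    return cmath.exp(mu * (cmath.exp(1j * w) - 1.0))

print(abs(phi(2 * math.pi) - 1.0) < 1e-12)  # True: phi returns to 1 on the lattice
print(abs(phi(1.0)) < 1.0)                  # True: strictly inside the unit disc elsewhere
```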
Remarks.

1. If F is the distribution of a random variable X, then we can write

    φ(ω) = E(e^{iωX})

so that the c.f. is the expected value of e^{iωX}. We have φ(−ω) = E(e^{−iωX}), so that φ(−ω) is the c.f. of the random variable −X. This is Theorem 2.1(b).

2. If X_1, X_2 are two independent random variables with c.f.s φ_1, φ_2, then

    φ_1(ω) φ_2(ω) = E[e^{iω(X_1 + X_2)}]

so that the product φ_1 φ_2 is the c.f. of the sum X_1 + X_2. This is only a special case of Theorem 2.2, since the convolution F_1 * F_2 is not necessarily defined in terms of independent random variables.

3. If φ is the c.f. of the random variable X, then |φ|² is the c.f. of the symmetrized variable X_1 − X_2, where X_1, X_2 are independent variables with the same distribution as X.
Theorem 2.4. (a) φ is uniformly continuous.

(b) If the n-th moment μ_n exists, then the n-th derivative φ^(n) exists and is a continuous function given by

    φ^(n)(ω) = ∫ e^{iωx} (ix)^n dF(x).                           (2.3)

(c) If the n-th moment μ_n exists, then φ admits the expansion

    φ(ω) = 1 + Σ_{r=1}^{n} μ_r (iω)^r / r! + o(ω^n)    (ω → 0).  (2.4)

Proof. (a) We have

    φ(ω + h) − φ(ω) = ∫ e^{iωx} (e^{ihx} − 1) dF(x)              (2.5)

so that

    |φ(ω + h) − φ(ω)| ≤ ∫ |e^{ihx} − 1| dF(x) = 2 ∫ |sin(hx/2)| dF(x).

Now

    ∫_{x<−A, x>B} |sin(hx/2)| dF(x) ≤ ∫_{x<−A, x>B} dF(x) < ε

by taking A, B large, while

    ∫_{−A}^{B} |sin(hx/2)| dF(x) ≤ ε ∫_{−A}^{B} dF(x) < ε

since |sin(hx/2)| < ε for h small. Therefore |φ(ω + h) − φ(ω)| → 0 as h → 0, which proves uniform continuity.

(b) We shall prove (2.3) for n = 1, the proof being similar for n > 1. We can write (2.5) as

    [φ(ω + h) − φ(ω)]/h = ∫ e^{iωx} (e^{ihx} − 1)/h dF(x).       (2.5′)

Here

    |e^{iωx} (e^{ihx} − 1)/h| = |(e^{ihx} − 1)/h| ≤ |x|

and

    ∫ |x| dF(x) < ∞

by hypothesis. Moreover (e^{ihx} − 1)/h → ix as h → 0. Therefore, letting h → 0 in (2.5′), we obtain by the dominated convergence theorem that

    [φ(ω + h) − φ(ω)]/h → ∫ ix e^{iωx} dF(x)

as required. Clearly, this limit is continuous.

(c) We have

    e^{iωx} = Σ_{r=0}^{n} (iωx)^r / r! + o(ω^n x^n)    (ω → 0)

so that

    ∫ e^{iωx} dF(x) = 1 + Σ_{r=1}^{n} (iω)^r μ_r / r! + ∫ o(ω^n x^n) dF(x),

where the last term on the right side is seen to be o(ω^n). □

Remark. The converse of (b) is not always true: thus φ′(ω) may exist, but the mean may not. A partial converse is the following: Suppose that φ^(n)(ω) exists. If n is even, then the first n moments exist, while if n is odd, the first n − 1 moments exist.
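The relation φ′(0) = iμ_1 from Theorem 2.4(b) can be checked with a central difference. The three-point distribution below is a hypothetical example of ours:

```python
import cmath

# Distribution on {0, 1, 2} with weights 0.2, 0.5, 0.3; mean = 1.1.
xs = [0, 1, 2]
ps = [0.2, 0.5, 0.3]
mean = sum(x * p for x, p in zip(xs, ps))

def phi(w):
    return sum(p * cmath.exp(1j * w * x) for x, p in zip(xs, ps))

h = 1e-6
deriv0 = (phi(h) - phi(-h)) / (2 * h)   # central difference at w = 0
print(abs(deriv0 - 1j * mean) < 1e-6)   # True: phi'(0) = i * mean
```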
2.2. Uniqueness and Inversion

Theorem 2.5 (uniqueness). Distinct distributions have distinct c.f.s.

Proof. Let F have the c.f. φ, so that

    φ(ω) = ∫ e^{iωx} dF(x).

We have for a > 0

    ∫ (a/√(2π)) e^{−a²ω²/2 − iωy} φ(ω) dω
        = ∫ (a/√(2π)) e^{−a²ω²/2 − iωy} dω ∫ e^{iωx} dF(x)
        = ∫ dF(x) ∫ e^{iω(x−y)} (a/√(2π)) e^{−a²ω²/2} dω,

the inversion of integrals being clearly justified. The last integral is the c.f. (evaluated at x − y) of the normal distribution with mean 0 and variance a⁻², and therefore equals e^{−(x−y)²/(2a²)}. We therefore obtain the identity

    (1/2π) ∫ e^{−a²ω²/2 − iωy} φ(ω) dω = ∫ (1/(a√(2π))) e^{−(y−x)²/(2a²)} dF(x)    (2.6)

for all a > 0. We note that the right side of (2.6) is the density of the convolution F * N_a, where N_a is the normal distribution with mean 0 and variance a². Now if G is a second distribution with the c.f. φ, it follows from (2.6) that F * N_a = G * N_a. Letting a → 0⁺ we find that F ≡ G as required. □
Theorem 2.6 (inversion). (a) If the distribution F has c.f. φ and |φ(ω)/ω| is integrable, then for h > 0

    F(x + h) − F(x) = (1/2π) ∫ e^{−iωx} (1 − e^{−iωh})/(iω) φ(ω) dω.    (2.7)

(b) If |φ| is integrable, then F has a bounded continuous density f given by

    f(x) = (1/2π) ∫ e^{−iωx} φ(ω) dω.                            (2.8)

Proof. (b) From (2.6) we find that the density f_a of F_a = F * N_a is given by

    f_a(x) = (1/2π) ∫ e^{−a²ω²/2 − iωx} φ(ω) dω.                 (2.9)

Here the integrand is bounded by |φ(ω)|, which is integrable by hypothesis. Moreover, as a → 0⁺, the integrand → e^{−iωx} φ(ω). Therefore by the dominated convergence theorem as a → 0⁺,

    f_a(x) → (1/2π) ∫ e^{−iωx} φ(ω) dω = f(x) (say).

Clearly, f is bounded and continuous. Now for every bounded interval I we have

    F_a{I} = ∫_I f_a(x) dx.

Letting a → 0⁺ in this we obtain

    F{I} = ∫_I f(x) dx

if I is an interval of continuity of F. This shows that f is the density of F, as required.
(a) Consider the uniform distribution with density

    u_h(x) = 1/h    for −h < x < 0,    = 0 elsewhere.

Its convolution with F has the density

    f_h(x) = ∫ u_h(x − y) dF(y) = ∫_x^{x+h} (1/h) dF(y) = [F(x + h) − F(x)]/h

and c.f.

    φ_h(ω) = φ(ω) ∫ e^{iωx} u_h(x) dx = φ(ω) (1 − e^{−iωh})/(iωh).

By (b) we therefore obtain

    [F(x + h) − F(x)]/h = (1/2π) ∫ e^{−iωx} φ(ω) (1 − e^{−iωh})/(iωh) dω

provided that |φ(ω)(1 − e^{−iωh})/(iω)| is integrable. This condition reduces to the condition that |φ(ω)/ω| is integrable. □
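Formula (2.8) can be tested on the standard normal, whose c.f. e^{−ω²/2} is integrable. The quadrature limits and grid below are arbitrary choices of ours:

```python
import math

# Inversion f(x) = (1/2pi) * integral of e^{-iwx} phi(w) dw for the
# standard normal, phi(w) = exp(-w^2/2). At x = 0 the density is 1/sqrt(2pi).

def f(x, lim=12.0, n=24_000):
    h = 2 * lim / n
    s = 0.0
    for k in range(n):
        w = -lim + (k + 0.5) * h
        s += math.cos(w * x) * math.exp(-w * w / 2)  # real part; imaginary part cancels
    return s * h / (2 * math.pi)

print(abs(f(0.0) - 1 / math.sqrt(2 * math.pi)) < 1e-9)               # True
print(abs(f(1.0) - math.exp(-0.5) / math.sqrt(2 * math.pi)) < 1e-9)  # True
```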
2.3. Convergence Properties

Theorem 2.7 (continuity theorem). A sequence {F_n} of distributions converges properly to a distribution F iff the sequence {φ_n} of their c.f.s converges to a limit φ which is continuous at the origin. In this case φ is the c.f. of F.
Proof. (i) If {F_n} converges properly to F, then

    ∫ u(x) dF_n(x) → ∫ u(x) dF(x)

for every continuous and bounded function u. For u(x) = e^{iωx} it follows that φ_n(ω) → φ(ω), where φ is the c.f. of F. From Theorem 2.4(a) we know that φ is uniformly continuous.

(ii) Conversely suppose that φ_n(ω) → φ(ω), where φ is continuous at the origin. By the selection theorem there exists a subsequence {F_{n_k}, k ≥ 1} which converges to F, a possibly defective distribution. Using (2.6) we have

    (a/√(2π)) ∫ e^{−iωy − a²ω²/2} φ_{n_k}(ω) dω = ∫ e^{−(y−x)²/(2a²)} dF_{n_k}(x).

Letting k → ∞ in this we obtain

    (a/√(2π)) ∫ e^{−iωy − a²ω²/2} φ(ω) dω = ∫ e^{−(y−x)²/(2a²)} dF(x)
                                          ≤ F(∞) − F(−∞).        (2.10)

Writing the first expression in (2.10) as

    (1/√(2π)) ∫ e^{−i(y/a)ω − ω²/2} φ(ω/a) dω                    (2.11)

and applying the dominated convergence theorem we find that (2.11) converges to φ(0) = 1 as a → ∞. By (2.10) it follows that F(∞) − F(−∞) ≥ 1, which gives F(−∞) = 0, F(∞) = 1, so that F is proper. By (i) φ is the c.f. of F, and by the uniqueness theorem F is unique. Thus every subsequence {F_{n_k}} converges to F. □
Theorem 2.8 (weak law of large numbers). Let {X_n, n ≥ 1} be a sequence of independent random variables with a common distribution and finite mean μ. Let S_n = X_1 + X_2 + ··· + X_n (n ≥ 1). Then as n → ∞, S_n/n → μ in probability.

Proof. Let φ be the c.f. of X_n. The c.f. of S_n/n is then

    E(e^{iω(S_n/n)}) = φ(ω/n)^n = [1 + iμ(ω/n) + o(1/n)]^n → e^{iμω}

as n → ∞. Here e^{iμω} is the c.f. of a distribution concentrated at the point μ. By the continuity theorem it follows that the distribution of S_n/n converges to this degenerate distribution. □
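A seeded simulation shows S_n/n settling near μ; the exponential distribution, its mean and the sample size are arbitrary choices of ours:

```python
import random

# Weak law of large numbers: S_n / n -> mu in probability.
# Simulate exponential variables with mean mu = 2.0 (our choice).

random.seed(42)
mu, n = 2.0, 200_000
s = sum(random.expovariate(1.0 / mu) for _ in range(n))
print(abs(s / n - mu) < 0.05)  # sample mean close to mu — True for this seed
```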
Theorem 2.9 (central limit theorem). Let {X_n, n ≥ 1} be a sequence of independent random variables with a common distribution and

    E(X_n) = μ,    Var(X_n) = σ²

(both being finite). Let S_n = X_1 + X_2 + ··· + X_n (n ≥ 1). Then as n → ∞, the distribution of (S_n − nμ)/(σ√n) converges to the standard normal.

Proof. The random variables (X_n − μ)/σ have mean zero and variance unity. Let their common c.f. be φ. Then the c.f. of (S_n − nμ)/(σ√n) is

    φ(ω/√n)^n = [1 − ω²/2n + o(1/n)]^n → e^{−ω²/2}

where the limit is the c.f. of the standard normal distribution. The desired result follows by the continuity theorem. □

Remark. In Theorem 2.7 the convergence φ_n → φ is uniform with respect to ω in every finite interval [−Ω, Ω].
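The central limit theorem can likewise be watched numerically: standardized sums of uniform variables land near the standard normal. The seed, the uniform choice and the sample sizes below are ours:

```python
import random, math

# CLT: (S_n - n*mu) / (sigma * sqrt(n)) is approximately standard normal.
# Use U(0, 1): mu = 1/2, sigma^2 = 1/12.

random.seed(7)
n, reps = 48, 20_000
mu, sigma = 0.5, math.sqrt(1.0 / 12.0)

def z():
    s = sum(random.random() for _ in range(n))
    return (s - n * mu) / (sigma * math.sqrt(n))

zs = [z() for _ in range(reps)]
frac_neg = sum(1 for v in zs if v <= 0) / reps
print(abs(frac_neg - 0.5) < 0.02)   # P(Z <= 0) = 1/2 — True for this seed
print(abs(sum(zs) / reps) < 0.03)   # sample mean of the z-values near 0
```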
2.3.1. Convergence of types

Two distributions F and G are said to be of the same type if

    G(x) = F(ax + b)                                             (2.12)

with a > 0, b real.
Theorem 2.10. If for a sequence {F_n} of distributions we have

    F_n(α_n x + β_n) → G(x),    F_n(a_n x + b_n) → H(x)          (2.13)

for all points of continuity, with α_n > 0, a_n > 0, and G and H non-degenerate distributions, then

    α_n/a_n → a,    (β_n − b_n)/a_n → b    and    G(x) = H(ax + b)    (2.14)

(0 < a < ∞, |b| < ∞).

Proof. Let H_n(x) = F_n(a_n x + b_n). Then we are given that H_n(x) → H(x) and also H_n(ρ_n x + σ_n) = F_n(α_n x + β_n) → G(x), where

    ρ_n = α_n/a_n,    σ_n = (β_n − b_n)/a_n.                     (2.15)

With the obvious notations (η_n, η, γ being the c.f.s of H_n, H, G) we are given that

    η_n(ω) → η(ω),    γ_n(ω) = e^{−iωσ_n/ρ_n} η_n(ω/ρ_n) → γ(ω)

uniformly in ω. Let {ρ_{n_k}} be a subsequence of {ρ_n} such that ρ_{n_k} → a (0 ≤ a ≤ ∞). If a = ∞, then

    |γ(ω)| = lim |γ_{n_k}(ω)| = lim |η_{n_k}(ω/ρ_{n_k})| = |η(0)| = 1

uniformly in [−Ω, Ω], so that γ is degenerate, which is not true. If a = 0, then

    |η(ω)| = lim |η_{n_k}(ω)| = lim |γ_{n_k}(ρ_{n_k} ω)| = |γ(0)| = 1,

so that η is degenerate, which is not true. So 0 < a < ∞. Now

    e^{−iω(σ_{n_k}/ρ_{n_k})} = γ_{n_k}(ω)/η_{n_k}(ω/ρ_{n_k}) → γ(ω)/η(ω/a)

so that σ_{n_k}/ρ_{n_k} → a limit, b/a (say). Also

    γ(ω) = e^{−iω(b/a)} η(ω/a).                                  (2.16)

It remains to prove the uniqueness of the limit a. Suppose there are two subsequences of {ρ_n} converging to a and a′, and assume that a < a′. Then the corresponding subsequences of {σ_n/ρ_n} converge to limits b/a and b′/a′ (say). From (2.16) we obtain

    e^{−iω(b/a)} η(ω/a) = e^{−iω(b′/a′)} η(ω/a′)

and hence |η(ω/a)| = |η(ω/a′)|, or

    |η(ω)| = |η(aω/a′)| = |η(a²ω/a′²)| = ··· = |η((a/a′)^n ω)| → |η(0)| = 1.

This means that η is degenerate, which is not true. So a ≥ a′. Similarly a ≤ a′. Therefore a = a′, as required. Since we have proved (2.16), the theorem is completely proved. □
2.4. A Criterion for c.f.s

A function f of a real variable is said to be non-negative definite in (−∞, ∞) if for all real numbers ω_1, ω_2, ..., ω_n and complex numbers a_1, a_2, ..., a_n

    Σ_{r,s=1}^{n} f(ω_r − ω_s) a_r ā_s ≥ 0.                      (2.17)

For such a function the following properties hold.

(a) f(0) ≥ 0. If in (2.17) we put n = 2, ω_1 = ω, ω_2 = 0, a_1 = a, a_2 = 1 we obtain

    f(0)(1 + |a|²) + f(ω)a + f(−ω)ā ≥ 0.                         (2.18)

When ω = 0 and a = 1 this reduces to f(0) ≥ 0.

(b) f(−ω) = f̄(ω). We see from (2.18) that f(ω)a + f(−ω)ā is real. This gives f(−ω) = f̄(ω).

(c) |f(ω)| ≤ f(0). In (2.18) let us choose a = λ f̄(ω), where λ is real. Then

    f(0) + 2λ|f(ω)|² + λ²|f(ω)|² f(0) ≥ 0.

This is true for all λ, so |f(ω)|⁴ ≤ |f(ω)|² [f(0)]², or |f(ω)| ≤ f(0), as required.

Theorem 2.11. A function φ of a real variable is the c.f. of a distribution iff it is continuous and non-negative definite.
Proof. (i) Suppose φ is a c.f.; that is,

    φ(ω) = ∫ e^{iωx} dF(x)

where F is a distribution. By Theorem 2.4(a), φ is continuous. Moreover,

    Σ_{r,s=1}^{n} φ(ω_r − ω_s) a_r ā_s = Σ_{r,s=1}^{n} a_r ā_s ∫ e^{i(ω_r − ω_s)x} dF(x)
        = ∫ (Σ_{r=1}^{n} a_r e^{iω_r x}) (Σ_{s=1}^{n} ā_s e^{−iω_s x}) dF(x)
        = ∫ |Σ_{r=1}^{n} a_r e^{iω_r x}|² dF(x) ≥ 0

which shows that φ is non-negative definite.
(ii) Conversely, let φ be continuous and non-negative definite. Then, considering the integral as the limit of a sum, we find that

    ∫_0^τ ∫_0^τ e^{−i(ω−ω′)x} φ(ω − ω′) dω dω′ ≥ 0               (2.19)

for τ > 0. Now consider

    P_τ(x) = (1/τ) ∫_0^τ ∫_0^τ e^{−i(ω−ω′)x} φ(ω − ω′) dω dω′ = ∫ e^{−isx} φ_τ(s) ds    (2.20)

where

    φ_τ(t) = (1 − |t|/τ) φ(t)    for |t| ≤ τ,
           = 0                   for |t| > τ.

From (2.20) we obtain

    (1/2π) ∫_{−λ}^{λ} (1 − |ω|/λ) e^{iωt} P_τ(ω) dω
        = (1/2π) ∫ φ_τ(s) ds ∫_{−λ}^{λ} (1 − |ω|/λ) e^{iω(t−s)} dω
        = (1/2π) ∫ [4 sin²(λ(s − t)/2) / (λ(s − t)²)] φ_τ(s) ds
        → φ_τ(t)    as λ → ∞.

On account of (2.19) we have P_τ(x) ≥ 0, so that for each λ the left side above is a c.f., and the limit φ_τ is continuous at the origin. By the continuity theorem φ_τ is a c.f. Again

    φ_τ(t) → φ(t)    as τ → ∞,

and since φ is continuous at the origin it follows that φ is a c.f., as was to be proved. □
Remark. This last result is essentially a theorem due to S. Bochner.
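Bochner's criterion can be probed numerically: for a genuine c.f. the quadratic form (2.17) is non-negative for any choice of points and coefficients. A sketch for the standard normal c.f. with a few pseudo-random trial vectors (the test harness is ours, not the book's):

```python
import cmath, random

# Check the quadratic form (2.17) for phi(w) = exp(-w^2/2), the standard
# normal c.f., at randomly chosen frequencies and complex coefficients.

def phi(w):
    return cmath.exp(-w * w / 2)

random.seed(1)
for _ in range(50):
    ws = [random.uniform(-3, 3) for _ in range(6)]
    a = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(6)]
    q = sum(phi(w1 - w2) * a1 * a2.conjugate()
            for w1, a1 in zip(ws, a) for w2, a2 in zip(ws, a))
    # q is real up to rounding and must be >= 0 (non-negative definiteness)
    assert abs(q.imag) < 1e-9 and q.real >= -1e-9
print("quadratic form non-negative in all trials")
```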
Remark on Theorem 2.7. If a sequence {F_n} of distributions converges properly to a distribution F, then the sequence {φ_n} of their c.f.s converges to φ, which is the c.f. of F, and the convergence is uniform in every finite interval.

Proof. Let A < 0, B > 0 be points of continuity of F. We have

    φ_n(ω) − φ(ω) = ∫ e^{iωx} F_n{dx} − ∫ e^{iωx} F{dx}
        = ∫_{x<A, x>B} e^{iωx} F_n{dx} − ∫_{x<A, x>B} e^{iωx} F{dx}
          + [∫_A^B e^{iωx} F_n{dx} − ∫_A^B e^{iωx} F{dx}]
        = I_1 + I_2 + I_3 (say).

Integrating by parts, we have

    I_3 = ∫_A^B e^{iωx} F_n{dx} − ∫_A^B e^{iωx} F{dx}
        = {e^{iωx} [F_n(x) − F(x)]}_A^B − iω ∫_A^B e^{iωx} [F_n(x) − F(x)] dx

and so

    |I_3| ≤ |F_n(B) − F(B)| + |F_n(A) − F(A)| + |ω| ∫_A^B |F_n(x) − F(x)| dx.

Given ε > 0 we can make

    |F_n(B) − F(B)| < ε/9,    |F_n(A) − F(A)| < ε/9

for n sufficiently large. Also, since |F_n(x) − F(x)| ≤ 2 and F_n(x) → F(x) at points of continuity of F, we have for |ω| ≤ Ω

    |ω| ∫_A^B |F_n(x) − F(x)| dx ≤ Ω ∫_A^B |F_n(x) − F(x)| dx < ε/9.

Thus |I_3| < ε/3. Also, for A, B sufficiently large,

    |I_1| ≤ |∫_{x<A, x>B} e^{iωx} F_n{dx}| ≤ 1 − F_n(B) + F_n(A) < ε/3,

    |I_2| ≤ |∫_{x<A, x>B} e^{iωx} F{dx}| ≤ 1 − F(B) + F(A) < ε/3.

The result follows from the last three inequalities. □
2.5. Problems for Solution

1. Consider the family of distributions with densities f_a (−1 ≤ a ≤ 1) given by

    f_a(x) = f(x)[1 + a sin(2π log x)]

where f(x) is the log-normal density

    f(x) = (1/√(2π)) x^{−1} e^{−(log x)²/2}    for x > 0,
         = 0                                   for x ≤ 0.

Show that f_a has exactly the same moments as f. (Thus the log-normal distribution is not uniquely determined by its moments.)
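The claim in Problem 1 reduces, after the substitution x = e^u, to showing that the perturbation term integrates to zero against every power x^n. A numerical check (the quadrature limits and step count are our own choices):

```python
import math

# Problem 1: show that the integral over (0, inf) of x^n f(x) sin(2*pi*log x) dx
# vanishes for every n, where f is the log-normal density. Substituting x = e^u
# turns this into a Gaussian-weighted sine integral, evaluated here by quadrature.

def perturbation_moment(n, lim=15.0, steps=60_000):
    h = 2 * lim / steps
    s = 0.0
    for k in range(steps):
        u = -lim + (k + 0.5) * h
        s += math.exp(n * u - u * u / 2) * math.sin(2 * math.pi * u)
    return s * h / math.sqrt(2 * math.pi)

for n in range(4):
    print(n, abs(perturbation_moment(n)) < 1e-6)   # True for every n
```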
2. Let {p
k
, k 0} be a probability distribution, and {F
n
, n 0} a
sequence of distributions. Show that

n=0
p
n
F
n
(x)
is also a distribution.
3. Show that () = e
(e
||
1)
is a c.f., and nd the corresponding
density.
4. A distribution is concentrated on {2, 3, . . .} with weights
p
k
=
c
k
2
log |k|
(k = 2, 3, . . .)
where c is such that the distribution is proper. Find its c.f. and
show that

exists but the mean does not.


5. Show that the function () = e
||

( > 2) is not a c.f.


6. If a c.f. is such that ()
2
= (c) for some constant c, and
the variance is nite, show that is the c.f. of the normal distri-
bution.
7. A degenerate c.f. is factorized in the form =
1

2
, where
1
and
2
are c.f.s. Show that
1
and
2
are both degenerate.
8. If the sequence of c.f.s {
n
} converges to a c.f. and
n

0
,
show that
n
(
n
)
n
(
0
).
9. If {
n
} is a sequence of c.f.s such that
n
() 1 for < <,
then
n
() 1 for all .
10. A sequence of distributions {F
n
} converges properly to a non-
degenerate distribution F. Prove that the sequence {F
n
(a
n
x +
b
n
)} converges to a distribution degenerate at the origin i
a
n
and b
n
= 0(a
n
).
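The claim in Problem 1 can be checked numerically. Substituting y = \log x turns the n-th moment of the perturbation term f(x)\,a\sin(2\pi\log x) into \int e^{ny}\varphi(y)\sin(2\pi y)\,dy with \varphi the standard normal density, and shifting y \to y + n makes the integrand odd, so the integral vanishes for every integer n. The quadrature below is a sketch added here (not from the book) that confirms this.

```python
import math

# n-th moment of the perturbation f(x)*sin(2*pi*log x), written in the
# variable y = log x; analytically it is exactly zero for integer n.
def perturbation_moment(n, lo=-12.0, hi=20.0, steps=100000):
    # trapezoidal rule over a range wide enough for the shifted Gaussian peak
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        y = lo + k * h
        g = (math.exp(n * y - 0.5 * y * y) / math.sqrt(2 * math.pi)
             * math.sin(2 * math.pi * y))
        total += g if 0 < k < steps else g / 2
    return total * h

for n in range(4):
    print(n, perturbation_moment(n))
```

Each value is negligible relative to the n-th log-normal moment e^{n^2/2}, so f_a and f share all moments.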
Chapter 3

Analytic Characteristic Functions

3.1. Definition and Properties

Let F be a probability distribution and consider the transform

\phi(\theta) = \int_{-\infty}^{\infty} e^{\theta x} dF(x)   (3.1)

for \theta = \sigma + i\omega, where \sigma, \omega are real and i = \sqrt{-1}. This certainly exists for \theta = i\omega. Since

\left|\int_A^B e^{\theta x} dF(x)\right| \le \int_A^B e^{\sigma x} dF(x),   (3.2)

\phi(\theta) exists if \int_{-\infty}^{\infty} e^{\sigma x} dF(x) is finite. Clearly, the integrals

\int_0^{\infty} e^{\sigma x} dF(x),  \int_{-\infty}^0 e^{\sigma x} dF(x)   (3.3)

converge for \sigma < 0, \sigma > 0 respectively. Suppose there exist numbers \alpha, \beta (0 < \alpha, \beta \le \infty) such that the first integral in (3.3) converges for \sigma < \beta and the second for \sigma > -\alpha; then

\int_{-\infty}^{\infty} e^{\sigma x} dF(x) < \infty  for  -\alpha < \sigma < \beta.   (3.4)

In this case \phi(\theta) converges in the strip -\alpha < \sigma < \beta of the complex plane, and we say (in view of Theorem 3.1 below) that F has an analytic c.f. \phi. If \alpha = \beta = \infty the c.f. is said to be entire (analytic on the whole complex plane).
The following examples show that a distribution need not have an analytic c.f. and also that there are distributions with entire c.f.s. The conditions under which an analytic c.f. exists are stated in Theorem 3.5.

Examples

Distribution                                              c.f.                          Region of existence
Binomial: f(n, k) = \binom{n}{k} p^k q^{n-k}              (q + pe^{\theta})^n           whole plane
Normal:   f(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2}   e^{\frac{1}{2}\theta^2}    whole plane
Cauchy:   f(x) = \frac{1}{\pi}\,\frac{1}{1+x^2}           e^{-|\omega|}                 \sigma = 0
Gamma:    f(x) = e^{-x}\,\frac{x^{\lambda-1}}{\Gamma(\lambda)}   (1-\theta)^{-\lambda}  \sigma < 1
Laplace:  f(x) = \frac{1}{2} e^{-|x|}                     (1-\theta^2)^{-1}             -1 < \sigma < 1
Poisson:  f(k) = e^{-\lambda}\,\frac{\lambda^k}{k!}       e^{\lambda(e^{\theta}-1)}     whole plane
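The Laplace entry can be verified directly: for real \theta = \sigma with |\sigma| < 1 the two half-line integrals give \frac{1}{2}\left[\frac{1}{1+\sigma} + \frac{1}{1-\sigma}\right] = \frac{1}{1-\sigma^2}. The quadrature below is a quick numerical sketch added here (not in the book) that confirms this on a grid of \sigma values.

```python
import math

# int e^{sigma*x} * (1/2) e^{-|x|} dx should equal 1/(1 - sigma^2) for |sigma| < 1
def laplace_transform(sigma, hi=120.0, steps=120000):
    # trapezoidal rule; the integrand decays like e^{-(1-|sigma|)|x|}
    h = 2 * hi / steps
    total = 0.0
    for k in range(steps + 1):
        x = -hi + k * h
        g = 0.5 * math.exp(sigma * x - abs(x))
        total += g if 0 < k < steps else g / 2
    return total * h

for sigma in (-0.9, -0.5, 0.0, 0.5, 0.9):
    print(sigma, laplace_transform(sigma), 1 / (1 - sigma ** 2))
```

As \sigma approaches \pm 1 the integrand decays ever more slowly, which is exactly why the strip of convergence stops there.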
Theorem 3.1. The c.f. \phi is analytic in the interior of the strip of its convergence.

Proof. Let

I = \frac{\phi(\theta+h) - \phi(\theta)}{h} - \int_{-\infty}^{\infty} x e^{\theta x} dF(x)

where the integral converges in the interior of the strip of convergence, since for \delta > 0,

\left|\int x e^{\theta x} dF(x)\right| \le \int |x| e^{\sigma x} dF(x) \le \int_{-\infty}^{\infty} e^{\delta|x| + \sigma x} dF(x)

and the last integral is finite for -\alpha + \delta < \sigma < \beta - \delta. We have

I = \int_{-\infty}^{\infty} e^{\theta x} \left(\frac{e^{hx} - 1 - hx}{h}\right) dF(x)
  = \int_{-\infty}^{\infty} e^{\theta x} (h(x^2/2!) + h^2 x^3/3! + \cdots) dF(x).

Therefore

|I| \le \int_{-\infty}^{\infty} e^{\sigma x} |h|\,|x|^2 (1 + |hx|/1! + |hx|^2/2! + \cdots) dF(x)
    \le |h| \int_{-\infty}^{\infty} e^{\sigma x + \delta|x| + |h||x|} dF(x) < \infty

in the interior of the strip of convergence. As |h| \to 0 the last expression tends to zero, so

\frac{\phi(\theta+h) - \phi(\theta)}{h} \to \int_{-\infty}^{\infty} x e^{\theta x} dF(x).

Thus \phi'(\theta) exists for \theta in the interior of the strip, which means that \phi(\theta) is analytic there.
Theorem 3.2. The c.f. \phi is uniformly continuous along vertical lines that belong to the strip of convergence.

Proof. We have

|\phi(\sigma + i\omega_1) - \phi(\sigma + i\omega_2)| = \left|\int e^{\sigma x}(e^{i\omega_1 x} - e^{i\omega_2 x}) dF(x)\right|
  \le \int e^{\sigma x} |e^{i(\omega_1 - \omega_2)x} - 1| dF(x)
  = 2 \int_{-\infty}^{\infty} e^{\sigma x} |\sin(\omega_1 - \omega_2)(x/2)| dF(x).

Since the integrand is uniformly bounded by e^{\sigma x} and approaches 0 as \omega_1 \to \omega_2, uniform continuity follows.
Theorem 3.3. An analytic c.f. is uniquely determined by its values on the imaginary axis.

Proof. \phi(i\omega) is the c.f. discussed in Chapter 2 and the result follows by the uniqueness theorem of that chapter.
Theorem 3.4. The function \log \phi(\sigma) is convex in the interior of the strip of convergence.

Proof. We have

\frac{d^2}{d\sigma^2} \log \phi(\sigma) = \frac{\phi(\sigma)\phi''(\sigma) - \phi'(\sigma)^2}{\phi(\sigma)^2}

and by the Schwarz inequality

\phi'(\sigma)^2 = \left(\int_{-\infty}^{\infty} x e^{\sigma x} dF(x)\right)^2
  = \left(\int_{-\infty}^{\infty} e^{\frac{1}{2}\sigma x} \cdot x e^{\frac{1}{2}\sigma x} dF(x)\right)^2
  \le \int_{-\infty}^{\infty} e^{\sigma x} dF(x) \int_{-\infty}^{\infty} x^2 e^{\sigma x} dF(x) = \phi(\sigma)\phi''(\sigma).

Therefore \frac{d^2}{d\sigma^2} \log \phi(\sigma) \ge 0, which shows that \log \phi(\sigma) is convex.

Corollary 3.1. If F has an analytic c.f. and \phi'(0) = 0, then \phi(\sigma) is minimal at \sigma = 0. If \phi is an entire function, then \phi(\sigma) \to \infty as \sigma \to \pm\infty, unless F is degenerate.
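A concrete check of Theorem 3.4 and Corollary 3.1 (an example added here, not in the text): for X = \pm 1 with probability \frac{1}{2} each, \phi(\sigma) = \cosh\sigma is entire and \phi'(0) = E(X) = 0, so \log\phi should be convex and \phi minimal at \sigma = 0.

```python
import math

def phi(s):
    # analytic c.f. of the symmetric +-1 distribution, restricted to real sigma
    return math.cosh(s)

grid = [k * 0.1 for k in range(-50, 51)]
h = 0.01
# second differences of log phi: nonnegative iff log phi is (discretely) convex
second_diffs = [math.log(phi(s + h)) - 2 * math.log(phi(s)) + math.log(phi(s - h))
                for s in grid]
print(min(second_diffs))
print(min(grid, key=phi))
```

The smallest second difference is positive, and the grid point minimizing \phi is \sigma = 0, as the corollary predicts.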
3.2. Moments

Recall that

\mu_n = \int_{-\infty}^{\infty} x^n dF(x),  \nu_n = \int_{-\infty}^{\infty} |x|^n dF(x)

have been defined as the ordinary moment and absolute moment of order n respectively. If F has an analytic c.f. \phi, then \mu_n = \phi^{(n)}(0), and

\phi(\theta) = \sum_{0}^{\infty} \mu_n \frac{\theta^n}{n!},

the series being convergent in |\theta| < \delta = \min(\alpha, \beta). The converse is stated in the following theorem.

Theorem 3.5. If all moments \mu_n of F exist and the series \sum \mu_n \frac{\theta^n}{n!} has a nonzero radius of convergence \rho, then \phi exists in |\sigma| < \rho, and inside the circle |\theta| < \rho,

\phi(\theta) = \sum_{0}^{\infty} \mu_n \frac{\theta^n}{n!}.
Proof. We first consider the series \sum \nu_n \frac{\theta^n}{n!} and show that it also converges in |\theta| < \rho. From Lyapunov's inequality

\nu_n^{1/n} \le \nu_{n+1}^{1/(n+1)}

we obtain

\limsup \frac{\nu_n^{1/n}}{n} = \limsup \frac{\nu_{2n}^{1/2n}}{2n} = \limsup \frac{\mu_{2n}^{1/2n}}{2n} \le \limsup \frac{|\mu_n|^{1/n}}{n}.

Also, since |\mu_n| \le \nu_n we have

\limsup \frac{|\mu_n|^{1/n}}{n} \le \limsup \frac{\nu_n^{1/n}}{n}.

Therefore

\limsup \frac{|\mu_n|^{1/n}}{n} = \limsup \frac{\nu_n^{1/n}}{n},

which shows that the series \sum \nu_n \frac{\theta^n}{n!} also has radius of convergence \rho. For arbitrary A > 0 we have

\infty > \sum_n \nu_n \frac{|\theta|^n}{n!} \ge \sum_n \frac{|\theta|^n}{n!} \int_{-A}^{A} |x|^n dF(x) = \int_{-A}^{A} e^{|\theta||x|} dF(x)

for |\theta| < \rho. So

\left|\int_{-A}^{A} e^{\theta x} dF(x)\right| \le \int_{-A}^{A} e^{|\theta||x|} dF(x) < \infty

for |\theta| < \rho. Since A is arbitrary, this implies that \phi(\theta) converges in the strip |\sigma| < \rho.
3.3. The Moment Problem

The family of distributions given by

F_{\alpha\beta}(x) = k \int_{-\infty}^{x} e^{-|y|^{\alpha}} \{1 + \beta \sin(|y|^{\alpha} \tan \alpha\pi)\}\,dy

for -1 \le \beta \le 1, 0 < \alpha < 1 has the same moments of all orders. This raises the question: under what conditions is a distribution uniquely determined by its moments?
Theorem 3.6. If F has an analytic c.f. then it is uniquely determined by its moments.

Proof. If F has an analytic c.f., then the series \sum \mu_n \frac{\theta^n}{n!} converges in |\theta| < \delta = \min(\alpha, \beta) and \phi(\theta) is given by this series there. If there is a second d.f. G with the same moments \mu_n, then by Theorem 3.5, G has an analytic c.f. \psi(\theta), and \psi(\theta) is also given by that series in |\theta| < \delta. Therefore \phi(\theta) = \psi(\theta) in the strip |\sigma| < \delta and hence F = G.
The cumulant generating function

The principal value of \log \phi is called the cumulant generating function K(\theta). It exists at least on the imaginary axis between \omega = 0 and the first zero of \phi(i\omega). The cumulant of order r is defined by

K_r = i^{-r} \left[\left(\frac{d}{d\omega}\right)^r \log \phi(i\omega)\right]_{\omega=0}.

This exists if, and only if, \mu_r exists; K_r can be expressed in terms of \mu_r. We have

K(i\omega) = \sum_{0}^{\infty} K_r \frac{(i\omega)^r}{r!}

whenever the series converges.
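The cumulants can be obtained from the moments through the standard recursion \mu_n = \sum_{k=1}^{n} \binom{n-1}{k-1} K_k \mu_{n-k} (with \mu_0 = 1). The sketch below (an illustration added here, not from the book) applies it to the Poisson(\lambda) distribution, whose cumulants all equal \lambda.

```python
import math

lam = 2.0
N = 6

# raw moments of Poisson(lam); the series is truncated far out in the tail
moments = [sum(k ** n * math.exp(-lam) * lam ** k / math.factorial(k)
               for k in range(150)) for n in range(N + 1)]

# invert  mu_n = sum_k C(n-1, k-1) K_k mu_{n-k}  to get the cumulants K_r
cumulants = [0.0] * (N + 1)
for n in range(1, N + 1):
    s = sum(math.comb(n - 1, k - 1) * cumulants[k] * moments[n - k]
            for k in range(1, n))
    cumulants[n] = moments[n] - s

print(cumulants[1:])
```

All six computed cumulants come out equal to \lambda = 2, matching K(i\omega) = \lambda(e^{i\omega} - 1) for the Poisson distribution.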
Theorem 3.7. Let \phi(\theta) = \phi_1(\theta)\phi_2(\theta), where \phi(\theta), \phi_1(\theta), \phi_2(\theta) are c.f.s. If \phi(\theta) is analytic in -\alpha < \sigma < \beta, so are \phi_1(\theta) and \phi_2(\theta).

Proof. We have (with the obvious notations)

\int_{-\infty}^{\infty} e^{\sigma x} dF(x) = \int_{-\infty}^{\infty} e^{\sigma x} dF_1(x) \int_{-\infty}^{\infty} e^{\sigma x} dF_2(x),

and since \phi(\sigma) is convergent, so are \phi_1(\sigma) and \phi_2(\sigma).
Theorem 3.8 (Cramer). If X_1 and X_2 are independent r.v. such that their sum X = X_1 + X_2 has a normal distribution, then X_1, X_2 have normal distributions (including the degenerate case of the normal with zero variance).

Proof. Assume without loss of generality that E(X_1) = E(X_2) = 0. Then E(X) = 0. Assume further that E(X^2) = 1. Let \phi_1(\theta), \phi_2(\theta) be the c.f.s of X_1 and X_2. Then we have

\phi_1(\theta)\phi_2(\theta) = e^{\frac{1}{2}\theta^2}.   (3.5)

Since the right side of (3.5) is an entire function without zeros, so are \phi_1(\theta) and \phi_2(\theta). By the convexity property (Theorem 3.4) we have \phi_1(\sigma) \ge 1, \phi_2(\sigma) \ge 1 as \sigma moves away from zero. Then (3.5) gives

e^{\frac{1}{2}\sigma^2} = \phi_1(\sigma)\phi_2(\sigma) \ge \phi_1(\sigma) \ge |\phi_1(\theta)|.   (3.6)

Similarly |\phi_2(\theta)| \le e^{\frac{1}{2}\sigma^2}. Therefore

e^{\frac{1}{2}\sigma^2} |\phi_1(\theta)| \ge |\phi_1(\theta)\phi_2(\theta)| = e^{\frac{1}{2}\mathrm{Re}(\theta^2)} = e^{\frac{1}{2}(\sigma^2 - \omega^2)},

so that

|\phi_1(\theta)| \ge e^{-\frac{1}{2}\omega^2} \ge e^{-\frac{1}{2}|\theta|^2}.   (3.7)

From (3.6) and (3.7) we obtain

-\frac{1}{2}|\theta|^2 \le -\frac{1}{2}\omega^2 \le \log |\phi_1(\theta)| \le \frac{1}{2}\sigma^2 \le \frac{1}{2}|\theta|^2,

or, setting K_1(\theta) = \log \phi_1(\theta),

|\mathrm{Re}\,K_1(\theta)| \le \frac{1}{2}|\theta|^2.   (3.8)

From a strengthened version of Liouville's theorem (see Lemma 3.1) it follows that K_1(\theta) = a_1\theta + a_2\theta^2. Similarly K_2(\theta) = b_1\theta + b_2\theta^2. Thus X_1 and X_2 are normal.
Theorem 3.9 (Raikov). If X_1 and X_2 are independent r.v. such that their sum X = X_1 + X_2 has a Poisson distribution, then X_1, X_2 also have Poisson distributions.

Proof. The points of increase of X are k = 0, 1, 2, \ldots, so all points of increase x_1 and x_2 of X_1 and X_2 are such that x_1 + x_2 = some k, and moreover the first points of increase of X_1 and X_2 are \mu and -\mu, where \mu is some finite number. Without loss of generality we take \mu = 0, so that X_1 and X_2 have k = 0, 1, 2, \ldots as the only possible points of increase. Their c.f.s are then of the form

\phi_1(\omega) = \sum_{0}^{\infty} a_k e^{ik\omega},  \phi_2(\omega) = \sum_{0}^{\infty} b_k e^{ik\omega}   (3.9)

with a_0, b_0 > 0, a_k, b_k \ge 0 (k \ge 1) and \sum a_k = \sum b_k = 1. Let z = e^{i\omega} and \phi_1(\omega) = f_1(z), \phi_2(\omega) = f_2(z). We have

f_1(z)f_2(z) = e^{\lambda(z-1)}.   (3.10)

Therefore

a_0 b_k + a_1 b_{k-1} + \cdots + a_k b_0 = e^{-\lambda}\frac{\lambda^k}{k!}  (k = 0, 1, \ldots),   (3.11)

which gives

a_k \le \frac{1}{b_0} e^{-\lambda}\frac{\lambda^k}{k!},  |f_1(z)| \le \frac{1}{b_0} e^{\lambda(|z|-1)}.   (3.12)

Similarly |f_2(z)| \le \frac{1}{a_0} e^{\lambda(|z|-1)}. Hence

\frac{1}{a_0} e^{\lambda(|z|-1)} |f_1(z)| \ge |f_1(z)f_2(z)| = e^{\lambda(u-1)}

where u = \mathrm{Re}(z). This gives

|f_1(z)| \ge a_0 e^{-\lambda(|z|-u)} \ge a_0 e^{-2\lambda|z|}.   (3.13)

From (3.12) and (3.13), noting that a_0 b_0 = e^{-\lambda}, we find that

-2\lambda|z| \le \log |f_1(z)| - \log a_0 \le 2\lambda|z|,

or setting K_1(z) = \log f_1(z), and \log a_0 = -\lambda_1 < 0,

|\mathrm{Re}\,K_1(z) + \lambda_1| \le 2\lambda|z|.   (3.14)

Proceeding as in the proof of Theorem 3.8, we obtain \lambda_1 + K_1(z) = cz, where c is a constant. Since f_1(1) = 1, K_1(1) = 0, so c = \lambda_1 and f_1(z) = e^{\lambda_1(z-1)}, which is the transform of the Poisson distribution.
Theorem 3.10 (Marcinkiewicz). Suppose a distribution has a c.f. \phi such that \phi(\omega) = e^{P(i\omega)}, where P is a polynomial. Then (i) \phi(\theta) = e^{P(\theta)} in the whole plane, and (ii) \phi is the c.f. of a normal distribution (so that P(\theta) = a_1\theta + a_2\theta^2 with a_2 \ge 0).

Proof. Part (i) is obvious. For Part (ii) let

P(\theta) = \sum_{k=1}^{n} a_k \theta^k,  n finite, a_k real (cumulants).

From |\phi(\theta)| \le \phi(\sigma) we obtain |e^{P(\theta)}| \le e^{P(\sigma)} or e^{\mathrm{Re}\,P(\theta)} \le e^{P(\sigma)}. Therefore \mathrm{Re}\,P(\theta) \le P(\sigma). Put \theta = re^{i\gamma}, so that \sigma = r\cos\gamma, \omega = r\sin\gamma. Then

a_n r^n \cos n\gamma + a_{n-1} r^{n-1} \cos(n-1)\gamma + \cdots \le a_n r^n \cos^n\gamma + a_{n-1} r^{n-1} \cos^{n-1}\gamma + \cdots.

Suppose a_n \ne 0. Dividing both sides of this inequality by r^n and letting r \to \infty we obtain a_n \cos n\gamma \le a_n \cos^n\gamma. Putting \gamma = \frac{\pi}{2n} we obtain

0 \le a_n \cos^n\frac{\pi}{2n},

so a_n \ge 0 for n \ge 2. Similarly, putting \gamma = \frac{2\pi}{n} we find that

a_n \le a_n \cos^n\frac{2\pi}{n},

and since \cos^n\frac{2\pi}{n} < 1 for n > 2 we obtain a_n \le 0. Therefore a_n = 0 for n > 2, P(\theta) = a_1\theta + a_2\theta^2, and \phi(\theta) is the c.f. of a normal distribution, the case a_2 = 0 being the degenerate case of zero variance.
Theorem 3.11 (Bernstein). Let X_1 and X_2 be independent r.v. with unit variances. Then if

Y_1 = X_1 + X_2,  Y_2 = X_1 - X_2   (3.15)

are independent, all four r.v. X_1, X_2, Y_1, Y_2 are normal.

This is a special case of the next theorem (with n = 2, a_1 = b_1 = a_2 = 1, b_2 = -1). For a more general result see [Feller (1971), pp. 77-80, 525-526]. He considers the linear transformation Y_1 = a_{11}X_1 + a_{12}X_2, Y_2 = a_{21}X_1 + a_{22}X_2 with \Delta \ne 0, where \Delta is the determinant

\Delta = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}.

If a_{11}a_{21} + a_{12}a_{22} = 0 then the transformation represents a rotation. Thus (3.15) is a rotation.
Theorem 3.12 (Skitovic). Let X_1, X_2, \ldots, X_n be n independent r.v. such that the linear forms

L_1 = a_1 X_1 + a_2 X_2 + \cdots + a_n X_n,
L_2 = b_1 X_1 + b_2 X_2 + \cdots + b_n X_n  (a_i \ne 0, b_i \ne 0),

are independent. Then all the (n + 2) r.v. are normal.

Proof. We shall first assume that (i) the ratios a_i/b_i are all distinct, and (ii) all moments of X_1, X_2, \ldots, X_n exist. Then for \alpha, \beta real we have (with obvious notations)

\phi_{\alpha L_1 + \beta L_2}(\omega) = \phi_{\alpha L_1}(\omega)\,\phi_{\beta L_2}(\omega)

so that

\prod_{i=1}^{n} \phi_{(\alpha a_i + \beta b_i)X_i}(\omega) = \prod_{i=1}^{n} \phi_{\alpha a_i X_i}(\omega) \prod_{i=1}^{n} \phi_{\beta b_i X_i}(\omega).

Taking logarithms of both sides and expanding in powers of \omega we obtain

\sum_{i=1}^{n} K_r^{(\alpha a_i + \beta b_i)X_i} = \sum_{i=1}^{n} K_r^{\alpha a_i X_i} + \sum_{i=1}^{n} K_r^{\beta b_i X_i}

or, since K_r^{(cX)} = c^r K_r^{(X)},

\sum_{i=1}^{n} K_r^{(X_i)} \{(\alpha a_i + \beta b_i)^r - (\alpha a_i)^r - (\beta b_i)^r\} = 0

for all r \ge 1. This can be written as

\sum_{i=1}^{n} K_r^{(X_i)} \sum_{s=1}^{r-1} \binom{r}{s} (\alpha a_i)^s (\beta b_i)^{r-s} = 0

for all r \ge 1 and all \alpha, \beta. Hence

\sum_{i=1}^{n} a_i^s b_i^{r-s} K_r^{(X_i)} = 0  (s = 1, 2, \ldots, r-1;\ r \ge 1).

Let r \ge n + 1. Then for s = 1, 2, \ldots, n, i = 1, 2, \ldots, n we can write the above equations as

A_r \kappa_r = 0   (3.16)

where A_r = (a_i^s b_i^{r-s}), 1 \le s, i \le n, and \kappa_r is the column vector with elements K_r^{(X_1)}, K_r^{(X_2)}, \ldots, K_r^{(X_n)}. Since (writing c_i = a_i/b_i)

|A_r| = (a_1 a_2 \cdots a_n)(b_1 b_2 \cdots b_n)^{r-1} \prod_{j>i} (c_j - c_i) \ne 0,

the only solution of (3.16) is \kappa_r = 0. Therefore

K_r^{(X_i)} = 0  for r \ge n + 1, i = 1, 2, \ldots, n.   (3.17)

Thus all cumulants of X_i of order \ge n + 1 vanish, and K^{(X_i)}(\omega) reduces to a polynomial of degree at most n. By the theorem of Marcinkiewicz, each X_i has a normal distribution. Hence L_1 and L_2 have normal distributions.
Next suppose that some of the a_i/b_i are the same. For example, let a_1/b_1 = a_2/b_2, and let Y_1 = a_1X_1 + a_2X_2. Then

L_1 = Y_1 + a_3X_3 + \cdots + a_nX_n,
L_2 = \frac{b_1}{a_1}Y_1 + b_3X_3 + \cdots + b_nX_n.

Repeat this process till all the ratios are distinct. Then by what has just been proved, the Y_i are normal. By Cramer's theorem the X_i are normal.
Finally it remains to prove that the moments of X_i exist. This follows from the fact that L_1 and L_2 have finite moments of all orders. To prove this, we note that since a_i \ne 0, b_i \ne 0 we can take c > 0 such that |a_i|, |b_i| \ge c > 0. Also, let us standardize the a_i and b_i so that |a_i| \le 1, |b_i| \le 1. Now if |L_1| = |a_1X_1 + a_2X_2 + \cdots + a_nX_n| \ge nM, then at least one |X_i| \ge M. Therefore

P\{|L_1| \ge nM\} \le \sum_{i=1}^{n} P\{|X_i| \ge M\}.   (3.18)

Further, if c|X_i| \ge nM and |X_j| < M for all j \ne i, then |L_1| \ge M, |L_2| \ge M. Thus

P\{|L_1| \ge M, |L_2| \ge M\} \ge P\left\{|X_i| \ge \frac{nM}{c}\right\} \prod_{j \ne i} P\{|X_j| < M\}
  \ge P\left\{|X_i| \ge \frac{nM}{c}\right\} \prod_{j=1}^{n} P\{|X_j| < M\}.

Summing this over i = 1, 2, \ldots, n we obtain, using (3.18),

nP\{|L_1| \ge M, |L_2| \ge M\} \ge P\left\{|L_1| \ge \frac{n^2M}{c}\right\} \prod_{j=1}^{n} P\{|X_j| < M\}.

Since L_1 and L_2 are independent, this gives

\frac{P\{|L_1| \ge n^2M/c\}}{P\{|L_1| \ge M\}} \le \frac{n\,P\{|L_2| \ge M\}}{\prod_{1}^{n} P\{|X_j| < M\}} \to 0   (3.19)

as M \to \infty. We can write (3.19) as follows. Choose n^2/c = \gamma > 1. Then

\frac{P\{|L_1| \ge \gamma M\}}{P\{|L_1| \ge M\}} \to 0  as  M \to \infty.   (3.20)

By a known result (Lemma 3.2), L_1, and similarly L_2, has finite moments of all orders.
Lemma 3.1 (see [Hille (1962)]). If f(\theta) is an entire function and |\mathrm{Re}\,f(\theta)| \le c|\theta|^2, then f(\theta) = a_0 + a_1\theta + a_2\theta^2.

Proof. We have f(\theta) = \sum_{0}^{\infty} a_n \theta^n, the series being convergent on the whole plane. Here

a_n = \frac{n!}{2\pi i} \int_{|\theta|=r} \frac{f(\theta)}{\theta^{n+1}} d\theta  (n = 0, 1, 2, \ldots).   (3.21)

Also, since there are no negative powers,

0 = \frac{n!}{2\pi i} \int_{|\theta|=r} f(\theta)\theta^{n-1} d\theta  (n = 1, 2, \ldots).   (3.22)

From (3.21) we obtain

a_n = \frac{n!}{2\pi i} \int_{0}^{2\pi} \frac{f(re^{i\gamma})}{r^{n+1}e^{i(n+1)\gamma}}\,re^{i\gamma}\,i\,d\gamma

or

a_n r^n = \frac{n!}{2\pi} \int_{0}^{2\pi} f(re^{i\gamma}) e^{-in\gamma} d\gamma  (n = 0, 1, \ldots).   (3.23)

Similarly from (3.22) we obtain

0 = \frac{n!}{2\pi} \int_{0}^{2\pi} f(re^{i\gamma}) e^{in\gamma} d\gamma

or, taking conjugates,

0 = \frac{n!}{2\pi} \int_{0}^{2\pi} \overline{f(re^{i\gamma})} e^{-in\gamma} d\gamma  (n = 1, 2, \ldots).   (3.24)

From (3.23) and (3.24) we obtain

a_n r^n = \frac{n!}{\pi} \int_{0}^{2\pi} \mathrm{Re}\,f(re^{i\gamma})\,e^{-in\gamma} d\gamma  (n \ge 1).

Therefore

|a_n| r^n \le \frac{n!}{\pi} \int_{0}^{2\pi} c r^2 d\gamma = 2cn!\,r^2

or

|a_n| \le \frac{2cn!}{r^{n-2}} \to 0  as  r \to \infty  for n > 2.

This gives f(\theta) = a_0 + a_1\theta + a_2\theta^2.
Lemma 3.2 (see [Loeve (1963)]). For \gamma > 1, if

\frac{1 - F(\gamma x) + F(-\gamma x)}{1 - F(x) + F(-x)} \to 0  as  x \to \infty

then F has moments of all orders.

Proof. Given \epsilon > 0 choose A so large that for x > A

\frac{1 - F(\gamma x) + F(-\gamma x)}{1 - F(x) + F(-x)} < \epsilon  and  1 - F(A) + F(-A) < \epsilon.

Then for any positive integer r,

\frac{1 - F(\gamma^r A) + F(-\gamma^r A)}{1 - F(A) + F(-A)} = \prod_{s=1}^{r} \frac{1 - F(\gamma^s A) + F(-\gamma^s A)}{1 - F(\gamma^{s-1} A) + F(-\gamma^{s-1} A)} < \epsilon^r

so that

1 - F(\gamma^r A) + F(-\gamma^r A) < \epsilon^{r+1}.

Therefore 1 - F(x) + F(-x) < \epsilon^{r+1} for x > \gamma^r A. Now

\int_{A}^{\infty} nx^{n-1} [1 - F(x) + F(-x)]\,dx = \sum_{r=0}^{\infty} \int_{\gamma^r A}^{\gamma^{r+1} A} nx^{n-1} [1 - F(x) + F(-x)]\,dx
  < \sum_{r=0}^{\infty} \epsilon^{r+1} \int_{\gamma^r A}^{\gamma^{r+1} A} nx^{n-1}\,dx = A^n (\gamma^n - 1)\,\epsilon \sum_{0}^{\infty} (\epsilon\gamma^n)^r

and the series converges for \epsilon < \gamma^{-n}. Since \epsilon may be chosen less than \gamma^{-n}, the n-th moment is finite for every n.
3.4. Problems for Solution

1. If 1 - F(x) + F(-x) = O(e^{-cx}) as x \to \infty for some c > 0, show that F is uniquely determined by its moments.

2. Show that the distribution whose density is given by

f(x) = \frac{1}{2\sqrt{x}} e^{-\sqrt{x}}  for x > 0,  f(x) = 0  for x \le 0,

does not have an analytic c.f.
3. Proof of Bernstein's theorem. Introduce a change of scale so that Y_1 = \frac{1}{\sqrt{2}}(X_1 + X_2), Y_2 = \frac{1}{\sqrt{2}}(X_1 - X_2). Then prove that

K_s^{(Y_1)} = \left(\frac{1}{\sqrt{2}}\right)^s \left[1^s K_s^{(X_1)} + 1^s K_s^{(X_2)}\right],

K_s^{(Y_2)} = \left(\frac{1}{\sqrt{2}}\right)^s \left[1^s K_s^{(X_1)} + (-1)^s K_s^{(X_2)}\right],

and similarly for K_s^{(X_1)}, K_s^{(X_2)} in terms of K_s^{(Y_1)}, K_s^{(Y_2)}. Hence show that

\left|K_s^{(X_i)}\right| \le \frac{1}{2^s} \left[2\left|K_s^{(X_1)}\right| + 2\left|K_s^{(X_2)}\right|\right]  (i = 1, 2).

This gives K_s^{(X_i)} = 0 for s > 2, i = 1, 2.

4. If X_1, X_2 are independent and there exists one rotation (X_1, X_2) \to (Y_1, Y_2) such that Y_1, Y_2 are also independent, then show that Y_1, Y_2 are independent for every rotation.
Chapter 4

Infinitely Divisible Distributions

4.1. Elementary Properties

A distribution and its c.f. \phi are called infinitely divisible if for each positive integer n there exists a c.f. \phi_n such that

\phi(\omega) = \phi_n(\omega)^n.   (4.1)

It is proved below (Corollary 4.1) that if \phi is infinitely divisible, then \phi(\omega) \ne 0. Defining \phi^{1/n} as the principal branch of the n-th root, we see that the above definition implies that \phi^{1/n} is a c.f. for every n \ge 1.

Examples

(1) A distribution concentrated at a single point is infinitely divisible, since for it we have

\phi(\omega) = e^{ia\omega} = (e^{ia\omega/n})^n

where a is a real constant.

(2) The Cauchy density f(x) = \frac{a}{\pi}[a^2 + (x - \mu)^2]^{-1} (a > 0) has \phi(\omega) = e^{i\mu\omega - a|\omega|}. The relation (4.1) holds with \phi_n(\omega) = e^{i\mu\omega/n - a|\omega|/n}. Therefore the Cauchy density is infinitely divisible.

(3) The normal density with mean m and variance \sigma^2 has c.f. \phi(\omega) = e^{im\omega - \frac{1}{2}\sigma^2\omega^2} = (e^{im\omega/n - \frac{1}{2}\frac{\sigma^2}{n}\omega^2})^n. Thus the normal distribution is infinitely divisible.
(4) The gamma distribution (including the exponential) is infinitely divisible, since its c.f. is

\phi(\omega) = (1 - i\omega/\lambda)^{-\alpha} = \left[(1 - i\omega/\lambda)^{-\alpha/n}\right]^n.

The discrete counterparts, the negative binomial and geometric distributions, are also infinitely divisible.

(5) Let N be a random variable with the (simple) Poisson distribution e^{-\lambda}\lambda^k/k! (k = 0, 1, 2, \ldots). Its c.f. is given by

\phi(\omega) = e^{\lambda(e^{i\omega}-1)},

which is clearly infinitely divisible. Now let \{X_k\} be a sequence of independent random variables with a common c.f. \chi and let these be independent of N. Then the sum X_1 + X_2 + \cdots + X_N - b has the c.f.

\phi(\omega) = e^{-ib\omega + \lambda[\chi(\omega)-1]},

which is the compound Poisson. Clearly, this is also infinitely divisible.
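Example (4) can be watched in simulation. If the gamma c.f. factors as the n-th power of the c.f. of Gamma(\alpha/n, \lambda), then a sum of n independent Gamma(\alpha/n, \lambda) variables must again be Gamma(\alpha, \lambda). The sketch below (added here as an illustration, not from the text) checks the first two moments of such a sum against the gamma values \alpha/\lambda and \alpha/\lambda^2.

```python
import random

# sum of n independent Gamma(alpha/n, rate lam) variables should be
# Gamma(alpha, rate lam); gammavariate takes (shape, scale) with scale = 1/rate
random.seed(7)
alpha, lam, n = 3.0, 2.0, 5
samples = [sum(random.gammavariate(alpha / n, 1 / lam) for _ in range(n))
           for _ in range(20000)]
mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
print(mean, var)   # theory: alpha/lam = 1.5, alpha/lam**2 = 0.75
```

The sample mean and variance agree with the Gamma(\alpha, \lambda) values to within Monte Carlo error.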
Lemma 4.1. Let \{\phi_n\} be a sequence of c.f.s. Then \phi_n^n \to \phi continuous iff n(\phi_n - 1) \to \psi with \psi continuous. In this case \phi = e^{\psi}.

Theorem 4.1. A c.f. \phi is infinitely divisible iff there exists a sequence \{\phi_n\} of c.f.s such that \phi_n^n \to \phi.

Proof. If \phi is infinitely divisible, then by definition there exists a c.f. \phi_n such that \phi_n^n = \phi (n \ge 1). Therefore the condition is necessary. Conversely, let \phi_n^n \to \phi. Then by Lemma 4.1, n[\phi_n(\omega) - 1] \to \psi = \log \phi. Now for t > 0,

e^{nt[\phi_n(\omega)-1]} \to e^{t\psi(\omega)}  as  n \to \infty.

Here the expression on the left side is the c.f. of a compound Poisson distribution and the right side is a continuous function. Therefore for each t > 0, e^{t\psi} is a c.f. and

\phi = e^{\psi} = (e^{\psi/n})^n,

which shows that \phi is infinitely divisible.
Corollary 4.1. If \phi is infinitely divisible, \phi \ne 0.

This was proved in the course of the proof of Theorem 4.1.

Corollary 4.2. If \phi is infinitely divisible, so is \phi(\omega)^a for each a > 0.

Proof. We have \phi^a = e^{a\psi} = (e^{a\psi/n})^n.

Proof of Lemma 4.1. (i) Suppose n(\phi_n - 1) \to \psi which is continuous. Then \phi_n \to 1 and the convergence is uniform in [-\Omega, \Omega]. Therefore |1 - \phi_n(\omega)| < \frac{1}{2} for \omega \in [-\Omega, \Omega] and n > N. Thus \log \phi_n exists for \omega \in [-\Omega, \Omega] and n > N, and is continuous and bounded. Now

\log \phi_n = \log[1 + (\phi_n - 1)]
  = (\phi_n - 1) - \frac{1}{2}(\phi_n - 1)^2 + \frac{1}{3}(\phi_n - 1)^3 - \cdots
  = (\phi_n - 1)[1 + o(1)]

and therefore

n \log \phi_n = n(\phi_n - 1)[1 + o(1)] \to \psi

or \phi_n^n \to e^{\psi}.

(ii) Suppose \phi_n^n \to \phi. We shall first prove that \phi has no zeros. It suffices to prove that |\phi_n|^{2n} \to |\phi|^2 implies |\phi|^2 > 0. Assume that this symmetrization has been carried out, so that \phi_n^n \to \phi with \phi_n \ge 0, \phi \ge 0. Since \phi is continuous with \phi(0) = 1, there exists an interval [-\Omega, \Omega] in which \phi does not vanish and therefore \log \phi exists and is bounded. Therefore \log \phi_n exists and is bounded for \omega \in [-\Omega, \Omega] and n > N, so n \log \phi_n \to \log \phi. Thus \log \phi_n \to 0 or \phi_n \to 1. As in (i), n(\phi_n - 1) \to \log \phi = \psi.

Theorem 4.2. If \{\phi_n\} is a sequence of infinitely divisible c.f.s and \phi_n \to \phi which is continuous, then \phi is an infinitely divisible c.f.

Proof. Since \phi_n is infinitely divisible, \phi_n^{1/n} is a c.f. Since (\phi_n^{1/n})^n \to \phi continuous, \phi is an infinitely divisible c.f. by Theorem 4.1.

Theorem 4.3 (De Finetti). A distribution is infinitely divisible iff it is the limit of compound Poisson distributions.
Proof. If \phi_n is the c.f. of a compound Poisson distribution, and \phi_n \to \phi which is continuous, then by Theorem 4.2, \phi is an infinitely divisible c.f. Conversely, let \phi be an infinitely divisible c.f. Then by Theorem 4.1 there exists a sequence \{\phi_n\} of c.f.s such that \phi_n^n \to \phi. By Lemma 4.1

e^{n[\phi_n(\omega)-1]} \to e^{\psi} = \phi.

Here e^{n[\phi_n(\omega)-1]} is the c.f. of a compound Poisson distribution.
4.2. Feller Measures

A measure M is said to be a Feller measure if M\{I\} < \infty for every finite interval I, and the integrals

M^+(x) = \int_{x}^{\infty} \frac{1}{y^2} M\{dy\},  M^-(x) = \int_{-\infty}^{-x} \frac{1}{y^2} M\{dy\}   (4.2)

converge for all x > 0.

Examples

(1) A finite measure M is a Feller measure, since

\int_{|y|>x} \frac{1}{y^2} M\{dy\} \le \frac{1}{x^2}\left[M\{(-\infty, -x)\} + M\{(x, \infty)\}\right].

(2) The Lebesgue measure is a Feller measure, since

\int_{|y|>x} \frac{1}{y^2} dy = \frac{2}{x}  (x > 0).

(3) Let F be a distribution measure and M\{dx\} = x^2 F\{dx\}. Then M is a Feller measure with

M^+(x) = 1 - F(x-),  M^-(x) = F(-x+).
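Example (3) can be confirmed numerically. Taking F to be the standard exponential distribution (an illustration added here, not in the text), the factor y^{-2} cancels the y^2 in M\{dy\} = y^2 F\{dy\}, so M^+(x) reduces to the tail 1 - F(x) = e^{-x}.

```python
import math

# M+(x) = int_x^inf y^{-2} * (y^2 e^{-y}) dy  for F standard exponential;
# the integrand collapses to e^{-y}, so M+(x) should equal e^{-x}
def M_plus(x, hi=50.0, steps=100000):
    h = (hi - x) / steps
    total = 0.0
    for k in range(steps + 1):
        g = math.exp(-(x + k * h))
        total += g if 0 < k < steps else g / 2
    return total * h

for x in (0.5, 1.0, 2.0):
    print(x, M_plus(x), math.exp(-x))
```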
Theorem 4.4. Let M be a Feller measure, b a real constant and

\psi(\omega) = i\omega b + \int_{-\infty}^{\infty} \frac{e^{i\omega x} - 1 - i\omega \sin x}{x^2} M\{dx\}   (4.3)

(the integral being convergent). Then corresponding to a given \psi there is only one measure M and one constant b.

Proof. Consider

\bar{\psi}(\omega) = \psi(\omega) - \frac{1}{2h}\int_{-h}^{h} \psi(\omega + s)\,ds  (h > 0).   (4.4)

We have

\bar{\psi}(\omega) = \int_{-\infty}^{\infty} e^{i\omega x} \Lambda\{dx\}   (4.5)

where

\Lambda\{dx\} = \left(1 - \frac{\sin hx}{hx}\right)\frac{1}{x^2} M\{dx\}   (4.6)

and it is easily verified that \Lambda is a finite measure. Therefore \bar{\psi}(\omega) determines \Lambda uniquely, and so M uniquely. Since b = \mathrm{Im}\,\psi(1), the constant b is uniquely determined.

Convergence of Feller measures. Let \{M_n\} be a sequence of Feller measures. We say that M_n converges properly to a Feller measure M if M_n\{I\} \to M\{I\} for all finite intervals I of continuity of M, and

M_n^+(x) \to M^+(x),  M_n^-(x) \to M^-(x)   (4.7)

at all points x of continuity of M. In this case we write M_n \to M.
Examples

(1) Let M_n\{dx\} = nx^2 F_n\{dx\} where F_n is a distribution measure with weights \frac{1}{2} at each of the points \pm\frac{1}{\sqrt{n}}. Then

M_n\{I\} = \int_I nx^2 F_n\{dx\} = n\left(\frac{1}{n}\cdot\frac{1}{2} + \frac{1}{n}\cdot\frac{1}{2}\right) = 1

if \{-\frac{1}{\sqrt{n}}, \frac{1}{\sqrt{n}}\} \subset I. Also M_n^+(x) = M_n^-(x) = 0 for x > \frac{1}{\sqrt{n}}. Therefore M_n \to M where M is a distribution measure concentrated at the origin. Clearly, M is a Feller measure.

(2) Let F_n be a distribution measure with Cauchy density \frac{1}{\pi}\frac{n}{1+n^2x^2} and consider M_n\{dx\} = \pi n x^2 F_n\{dx\}. We have

M_n\{(a, b)\} = \int_a^b \frac{n^2x^2}{1+n^2x^2} dx \to |b - a|,

M_n^+(x) = \int_x^{\infty} \frac{n^2}{1+n^2y^2} dy \to \int_x^{\infty} \frac{dy}{y^2},

M_n^-(x) = \int_{-\infty}^{-x} \frac{n^2}{1+n^2y^2} dy \to \int_{-\infty}^{-x} \frac{dy}{y^2}.

Therefore M_n \to M where M is the Lebesgue measure.
Theorem 4.5. Let \{M_n\} be a sequence of Feller measures, \{b_n\} a sequence of real constants and

\psi_n(\omega) = i\omega b_n + \int_{-\infty}^{\infty} \frac{e^{i\omega x} - 1 - i\omega \sin x}{x^2} M_n\{dx\}.   (4.8)

Then \psi_n \to \psi continuous iff there exists a Feller measure M and a real constant b such that M_n \to M and b_n \to b. In this case

\psi(\omega) = i\omega b + \int_{-\infty}^{\infty} \frac{e^{i\omega x} - 1 - i\omega \sin x}{x^2} M\{dx\}.   (4.9)

Proof. As suggested by (4.4)-(4.6) let

\Lambda_n\{dx\} = K(x) M_n\{dx\},  where  K(x) = x^{-2}\left(1 - \frac{\sin hx}{hx}\right),   (4.10)

\lambda_n = \Lambda_n\{(-\infty, \infty)\} < \infty.   (4.11)

Then

M_n^*\{dx\} = \frac{1}{\lambda_n}\Lambda_n\{dx\}   (4.12)

is a distribution measure. We can write

\psi_n(\omega) = i\omega b_n + \lambda_n \int_{-\infty}^{\infty} \frac{e^{i\omega x} - 1 - i\omega\sin x}{x^2} K(x)^{-1} M_n^*\{dx\}.   (4.13)

(i) Let M_n \to M and b_n \to b. Then

\lambda_n \to \lambda = \int_{-\infty}^{\infty} K(x) M\{dx\} > 0

and

M_n^* \to M^*,  where  M^*\{dx\} = \frac{1}{\lambda} K(x) M\{dx\}.

Therefore from (4.13) we find that

\psi_n(\omega) \to i\omega b + \lambda\int_{-\infty}^{\infty} \frac{e^{i\omega x} - 1 - i\omega\sin x}{x^2} K(x)^{-1} M^*\{dx\} = \psi(\omega).

(ii) Conversely, let \psi_n(\omega) \to \psi(\omega) continuous. Then with \bar{\psi}_n(\omega) defined as in (4.4), \bar{\psi}_n(\omega) \to \bar{\psi}(\omega); that is,

\int_{-\infty}^{\infty} e^{i\omega x} \Lambda_n\{dx\} \to \bar{\psi}(\omega).   (4.14)

In particular

\lambda_n = \Lambda_n\{(-\infty, \infty)\} \to \bar{\psi}(0).

If \bar{\psi}(0) = 0, then \Lambda_n\{I\} and M_n\{I\} tend to 0 for every finite interval I and by (i) \psi(\omega) = i\omega b with b = \lim b_n. We have thus proved the required results in this case. Let \lambda = \bar{\psi}(0) > 0. Then (4.14) can be written as

\lambda_n \int_{-\infty}^{\infty} e^{i\omega x} M_n^*\{dx\} \to \bar{\psi}(\omega).

Therefore M_n^* \to M^* where M^* is the distribution measure corresponding to the c.f. \bar{\psi}(\omega)/\bar{\psi}(0). Thus

\lambda_n \int_{-\infty}^{\infty} \frac{e^{i\omega x} - 1 - i\omega\sin x}{x^2} K(x)^{-1} M_n^*\{dx\} \to \lambda\int_{-\infty}^{\infty} \frac{e^{i\omega x} - 1 - i\omega\sin x}{x^2} K(x)^{-1} M^*\{dx\}

(the integrand being a bounded continuous function), and b_n \to b. Clearly,

M\{dx\} = \lambda K(x)^{-1} M^*\{dx\}

is a Feller measure and

\psi(\omega) = i\omega b + \int_{-\infty}^{\infty} \frac{e^{i\omega x} - 1 - i\omega\sin x}{x^2} M\{dx\}

as required.
4.3. Characterization of Infinitely Divisible Distributions

Theorem 4.6. A distribution is infinitely divisible iff its c.f. \phi is of the form \phi = e^{\psi}, with

\psi(\omega) = i\omega b + \int_{-\infty}^{\infty} \frac{e^{i\omega x} - 1 - i\omega\sin x}{x^2} M\{dx\},   (4.15)

M being a Feller measure, and b a real constant.

Proof. (i) Let \phi = e^{\psi} with \psi given by (4.15). We can write

\psi(\omega) = i\omega b - \frac{1}{2}\omega^2 M\{0\} + \lim_{\epsilon\to 0+} \psi_{\epsilon}(\omega)   (4.16)

where

\psi_{\epsilon}(\omega) = \int_{|x|>\epsilon} \frac{e^{i\omega x} - 1 - i\omega\sin x}{x^2} M\{dx\}
  = -i\omega\beta_{\epsilon} + c_{\epsilon}\int_{|x|>\epsilon} (e^{i\omega x} - 1)\,G_{\epsilon}\{dx\}

with c_{\epsilon} x^2 G_{\epsilon}\{dx\} = M\{dx\} for |x| > \epsilon, and

\beta_{\epsilon} = \int_{|x|>\epsilon} \sin x\,\frac{M\{dx\}}{x^2},

c_{\epsilon} being determined so that G_{\epsilon} is a distribution measure. Let \chi_{\epsilon} denote the c.f. of G_{\epsilon}; then

e^{\psi_{\epsilon}(\omega)} = e^{-i\omega\beta_{\epsilon} + c_{\epsilon}[\chi_{\epsilon}(\omega)-1]}

is the c.f. of a compound Poisson distribution. As \epsilon \to 0, \psi_{\epsilon} \to \psi_0, where

\psi_0(\omega) = \int_{|x|>0} \frac{e^{i\omega x} - 1 - i\omega\sin x}{x^2} M\{dx\}

is clearly a continuous function. By Theorem 4.3, e^{\psi_0} is an infinitely divisible c.f. Now we can write

e^{\psi(\omega)} = e^{i\omega b - \frac{1}{2}\omega^2 M\{0\}} \cdot e^{\psi_0(\omega)},

so that \phi is the product of e^{\psi_0(\omega)} and the c.f. of a normal distribution. Therefore \phi is infinitely divisible.

(ii) Conversely, let \phi be an infinitely divisible c.f. Then by Theorem 4.3, \phi is the limit of a sequence of compound Poisson c.f.s. That is,

e^{c_n[\chi_n(\omega) - 1 - i\omega\beta_n]} \to \phi(\omega)

or

c_n \int_{-\infty}^{\infty} (e^{i\omega x} - 1 - i\omega\beta_n)\,F_n\{dx\} \to \log \phi(\omega)

where c_n > 0, \beta_n is real and F_n is the distribution measure corresponding to the c.f. \chi_n. We can write this as

\int_{-\infty}^{\infty} \frac{e^{i\omega x} - 1 - i\omega\sin x}{x^2} M_n\{dx\} + i\omega c_n\left[\int_{-\infty}^{\infty} \sin x\,F_n\{dx\} - \beta_n\right] \to \log \phi(\omega)

where M_n\{dx\} = c_n x^2 F_n\{dx\}. Clearly, M_n is a Feller measure. By Theorem 4.5 it follows that

M_n \to M  and  c_n\left[\int_{-\infty}^{\infty}\sin x\,F_n\{dx\} - \beta_n\right] \to b

where M is a Feller measure, b a real constant and

\log \phi(\omega) = i\omega b + \int_{-\infty}^{\infty} \frac{e^{i\omega x} - 1 - i\omega\sin x}{x^2} M\{dx\}.

This proves that \phi = e^{\psi}, with \psi given by (4.15).
Remarks.

(a) The centering function \sin x is such that

\int_{-\infty}^{\infty} \frac{e^{ix} - 1 - i\sin x}{x^2} M\{dx\}

is real. Other possible centering functions are

(i) \tau(x) = \frac{x}{1 + x^2}

and

(ii) \tau(x) = -a for x < -a;  x for -a \le x \le a;  a for x > a, with a > 0.

(b) The measure \nu (Levy measure) is defined as follows: \nu\{0\} = 0 and \nu\{dx\} = x^{-2}M\{dx\} for x \ne 0. We have

\int_{-\infty}^{\infty} \min(1, x^2)\,\nu\{dx\} < \infty,

as can be easily verified. The measure K\{dx\} = (1+x^2)^{-1}M\{dx\} is seen to be a finite measure. This was used by Khintchine.

(c) The spectral function H is defined as follows:

H(x) = -\int_{x}^{\infty} \frac{M\{dy\}}{y^2}  for x > 0,   H(x) = \int_{-\infty}^{x+} \frac{M\{dy\}}{y^2}  for x < 0,

H being undefined at x = 0. We can then write

\psi(\omega) = i\omega b - \frac{1}{2}\sigma^2\omega^2 + \int_{0+}^{\infty}[e^{i\omega x} - 1 - i\omega\tau(x)]\,dH(x) + \int_{-\infty}^{0-}[e^{i\omega x} - 1 - i\omega\tau(x)]\,dH(x),

where the centering function is usually \tau(x) = x(1+x^2)^{-1}. This is the so-called Levy-Khintchine representation. Here H is non-decreasing in (-\infty, 0) and (0, \infty), with H(-\infty) = 0, H(\infty) = 0. Also, for each \epsilon > 0

\int_{|x|<\epsilon} x^2\,dH(x) < \infty.
Theorem 4.7. A distribution concentrated on (0, \infty) is infinitely divisible iff its c.f. is given by \phi = e^{\psi}, with

\psi(\omega) = i\omega b + \int_{0}^{\infty} \frac{e^{i\omega x} - 1}{x} P\{dx\}   (4.17)

where b \ge 0 and P is a measure on (0, \infty) such that (1 + x)^{-1} is integrable with respect to P.

Theorem 4.8. A function P is an infinitely divisible probability generating function (p.g.f.) iff P(1) = 1 and

\log \frac{P(s)}{P(0)} = \sum_{1}^{\infty} a_k s^k   (4.18)

where a_k \ge 0 and \sum_{1}^{\infty} a_k = \lambda < \infty.

Proof. Let (4.18) hold and P(1) = 1. Put a_k = \lambda f_k where f_k \ge 0, \sum_{1}^{\infty} f_k = 1. Let F(s) = \sum_{1}^{\infty} f_k s^k. Then

P(s) = P(0)e^{\lambda F(s)}  and  P(1) = P(0)e^{\lambda}.

Therefore

P(s) = e^{-\lambda + \lambda F(s)},

which is the p.g.f. of a compound Poisson distribution, and is therefore infinitely divisible.

Conversely, let P be an infinitely divisible p.g.f. Then Definition (4.1) implies P(s)^{1/n} is also a p.g.f. (see [Feller (1968)]) for each n \ge 1. Let

P(s)^{1/n} = q_{0n} + q_{1n}s + q_{2n}s^2 + \cdots = Q_n(s) (say)

or

P(s) = (q_{0n} + q_{1n}s + q_{2n}s^2 + \cdots)^n.

In particular q_{0n}^n = P(0) = p_0. If p_0 = 0, then q_{0n} = 0 and P(s) = s^n(q_{1n} + q_{2n}s + q_{3n}s^2 + \cdots)^n. This implies that p_0 = p_1 = p_2 = \cdots = p_{n-1} = 0 for each n \ge 1, which is absurd. Therefore p_0 > 0. It follows that P(s) > 0 and therefore P(s)^{1/n} \to 1 for 0 \le s \le 1. Now

\frac{\log P(s) - \log P(0)}{-\log P(0)} = \frac{\log \sqrt[n]{\frac{P(s)}{P(0)}}}{\log \sqrt[n]{\frac{1}{P(0)}}} \sim \frac{\sqrt[n]{\frac{P(s)}{P(0)}} - 1}{\sqrt[n]{\frac{1}{P(0)}} - 1} = \frac{\sqrt[n]{P(s)} - \sqrt[n]{P(0)}}{1 - \sqrt[n]{P(0)}} = \frac{Q_n(s) - Q_n(0)}{1 - Q_n(0)}.

Thus

\frac{Q_n(s) - Q_n(0)}{1 - Q_n(0)} \to \frac{\log P(s) - \log P(0)}{-\log P(0)}.

Here the left side is seen to be a p.g.f. By the continuity theorem the limit is the generating function of a non-negative sequence \{f_j\}. Thus

\frac{\log P(s) - \log P(0)}{-\log P(0)} = \sum_{1}^{\infty} f_j s^j = F(s) (say).

Putting s = 1 we find that F(1) = 1. Putting \lambda = -\log P(0) > 0 we obtain

P(s) = e^{-\lambda[1 - F(s)]},

which is equivalent to (4.18).
4.4. Special Cases of Infinitely Divisible Distributions

(A) Let the measure M be concentrated at the origin, with weight \sigma^2 > 0. Then (4.15) gives \phi(\omega) = e^{i\omega b - \frac{1}{2}\sigma^2\omega^2}, which is the c.f. of the normal distribution.

(B) Let M be concentrated at h (\ne 0) with weight \lambda h^2. Then

\phi(\omega) = e^{i\omega r + \lambda(e^{i\omega h}-1)},  r = b - \lambda\sin h.

Thus \phi is the c.f. of the random variable hN + r, where N has the (simple) Poisson distribution e^{-\lambda}\lambda^k/k! (k = 0, 1, 2, \ldots).

(C) Let M\{dx\} = \lambda x^2 G\{dx\} where G is the distribution measure with the c.f. \chi. Clearly, M is a Feller measure and

\phi(\omega) = e^{i\omega r + \lambda[\chi(\omega)-1]},  r = b - \lambda\int_{-\infty}^{\infty}\sin x\,G\{dx\}.

We thus obtain the c.f. of a compound Poisson distribution.

(D) Let M be concentrated on (0, \infty) with density \alpha e^{-\lambda x}x (x > 0). It is easily verified that M is a Feller measure. We have

\int_{0}^{\infty} \frac{e^{i\omega x} - 1}{x^2} M\{dx\} = \alpha\int_{0}^{\infty} \frac{e^{(i\omega-\lambda)x} - e^{-\lambda x}}{x}\,dx = -\alpha\log\left(1 - \frac{i\omega}{\lambda}\right).

Choosing

b = \alpha\int_{0}^{\infty} \frac{\sin x}{x}\,e^{-\lambda x}\,dx < \infty

we find that

\phi(\omega) = \left(1 - \frac{i\omega}{\lambda}\right)^{-\alpha}.

This is the c.f. of the gamma density e^{-\lambda x}\lambda^{\alpha}x^{\alpha-1}/\Gamma(\alpha).
) (x > 0, y > 0)
where C > 0, p 0, q 0, p + q = 1, 0 < 2. If = 2,
M is concentrated at the origin, and the distribution is the
normal, as discussed in (A). Let 0 < < 2, and denote by

the corresponding expression . In evaluating it we choose


May 12, 2011 14:38 9in x 6in Topics in Probability b1108-ch04
56 Topics in Probability
an appropriate centering function

(x) depending on . This


changes the constant b and we obtain

() = i +
_

e
ix
1 i

(x)
x
2
M{dx}
where
= b +
_

(x) sin x
x
2
M{dx} (|r| < )
and

(x) =
_
_
_
sinx if = 1
0 if 0 < < 1
x if 1 < < 2.
Substituting for M we nd that

() = i +c(2 )[pI

() +q

()]
where
I

() =
_

0
e
ix
1 i

(x)
x
+1
dx.
Evaluating the integral I

we nd that

() = i c|w|

_
1 +i

||
(||, )
_
where c > 0, || 1 and
(||, ) =
_

_
tan

2
if = 1
2

log |w| if = 1.
In Sec. 4.6 we shall discuss the detailed properties of stable dis-
tributions. We note that when = 0 and = 1 we obtain

() = i c||, so that is the c.f. of the Cauchy distribution.
4.5. Lévy Processes

We say a stochastic process {X(t), t ≥ 0} has stationary independent increments if it satisfies the following properties:
(i) For 0 ≤ t_1 < t_2 < ··· < t_n (n ≥ 2) the random variables
X(t_1), X(t_2) − X(t_1), X(t_3) − X(t_2), . . . , X(t_n) − X(t_{n−1})
are independent.
(ii) The distribution of the increment X(t_p) − X(t_{p−1}) depends only on the difference t_p − t_{p−1}.
For such a process we can take X(0) ≡ 0 without loss of generality. For if X(0) ≢ 0, then the process Y(t) = X(t) − X(0) has stationary independent increments, and Y(0) ≡ 0.
If we write
X(t) = Σ_{k=1}^{n} [X(kt/n) − X((k − 1)t/n)]  (4.19)
then X(t) is seen to be the sum of n independent random variables all of which are distributed as X(t/n). Thus a process with stationary independent increments is the generalization to continuous time of sums of independent and identically distributed random variables.
A Lévy process is a process with stationary independent increments that satisfies the following additional conditions:
(iii) X(t) is continuous in probability. That is, for each ε > 0
P{|X(t)| > ε} → 0 as t → 0.  (4.20)
(iv) There exist left and right limits X(t−) and X(t+), and we assume that X(t) is right-continuous: that is, X(t+) = X(t).

Theorem 4.9. The c.f. of a Lévy process is given by E[e^{iωX(t)}] = e^{tψ(ω)}, where ψ is given by Theorem 4.6.

Proof. Let φ_t(ω) = E[e^{iωX(t)}]. From (4.19) we find that φ_t(ω) = [φ_{t/n}(ω)]^n, so for each t > 0, φ_t is infinitely divisible and φ_t = e^{ψ_t}. Also, from the relation X(t + s) =_d X(t) + X(s) we obtain the functional equation ψ_{t+s} = ψ_t + ψ_s. On account of (4.20), ψ_t → 0 as t → 0, so we must have ψ_t(ω) = tψ_1(ω). Thus φ_t(ω) = e^{tψ(ω)} with ψ = ψ_1 in the required form.
Special cases: Each of the special cases of infinitely divisible distributions discussed in Sec. 4.4 leads to a Lévy process with c.f. φ_t(ω) = e^{tψ(ω)}, with ψ in the prescribed form. Thus for appropriate choices of the measure M we obtain the Brownian motion, simple and compound Poisson processes, the gamma process and stable processes (including the Cauchy process).
A Lévy process with non-decreasing sample functions is called a subordinator. Thus the simple Poisson process and the gamma process are subordinators.
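For instance, for a gamma process with ψ(ω) = −α log(1 − iω/λ) as in example (D), the relation φ_t(ω) = [φ_{t/n}(ω)]^n of Theorem 4.9 can be checked directly; the following sketch uses arbitrary parameter values:

```python
alpha, lam = 1.2, 3.0     # arbitrary gamma-process parameters
t, n = 2.5, 7             # arbitrary time and number of increments
omega = 0.9

def phi(t, w):
    # c.f. of X(t) for the gamma process: (1 - i w / lam)^(-alpha t)
    return (1 - 1j * w / lam) ** (-alpha * t)

lhs = phi(t, omega)       # c.f. of X(t)
rhs = phi(t / n, omega) ** n   # n independent increments of length t/n
print(abs(lhs - rhs))     # agrees to machine precision
```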
4.6. Stable Distributions

A distribution and its c.f. φ are called stable if for every positive integer n there exist real numbers c_n > 0, d_n such that
φ(ω)^n = φ(c_nω) e^{iωd_n}.  (4.21)
If X, X_1, X_2, . . . are independent random variables with the c.f. φ, then the above definition is equivalent to
X_1 + X_2 + ··· + X_n =_d c_nX + d_n.  (4.22)

Examples

(A) If X has a distribution concentrated at a single point, then (4.22) is satisfied with c_n = n, d_n = 0. Thus a degenerate distribution is (trivially) stable. We shall exclude it from our consideration.
(B) If X has the Cauchy density f(x) = (a/π)[a² + (x − r)²]^{−1} (a > 0), then φ(ω) = e^{irω − a|ω|}. The relation (4.21) holds with c_n = n, d_n = 0. Thus the Cauchy distribution is stable.
(C) If X has a normal density with mean m and variance σ², then (4.22) holds with c_n = √n and d_n = m(n − c_n). Thus the normal distribution is stable.
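Relation (4.21) for these two examples can be verified numerically from the closed-form c.f.s (the parameter values below are arbitrary):

```python
import cmath, math

omega, n = 1.3, 5

# Cauchy: phi(w) = e^{i r w - a|w|}, with c_n = n, d_n = 0
r, a = 0.5, 2.0
phi_c = lambda w: cmath.exp(1j * r * w - a * abs(w))
assert abs(phi_c(omega) ** n - phi_c(n * omega)) < 1e-12

# Normal: phi(w) = e^{i m w - s^2 w^2/2}, with c_n = sqrt(n), d_n = m(n - sqrt(n))
m, s = 0.5, 1.5
phi_n = lambda w: cmath.exp(1j * m * w - s**2 * w**2 / 2)
c_n, d_n = math.sqrt(n), m * (n - math.sqrt(n))
assert abs(phi_n(omega) ** n - phi_n(c_n * omega) * cmath.exp(1j * omega * d_n)) < 1e-12
```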
The concept of stable distributions is due to Lévy (1924), who gave a second definition (see Problem 11).
Theorem 4.10. Stable distributions are infinitely divisible.

Proof. The relation (4.21) can be written as
φ(ω) = [φ(ω/c_n) e^{−iωd_n/(nc_n)}]^n = φ_n(ω)^n
where φ_n is clearly a c.f. By definition φ is infinitely divisible.
Domains of attraction. Let {X_k, k ≥ 1} be a sequence of independent random variables with a common distribution F, and S_n = X_1 + X_2 + ··· + X_n (n ≥ 1). We say that F belongs to the domain of attraction of a distribution G if there exist real constants a_n > 0, b_n such that the normed sum (S_n − b_n)/a_n converges in distribution to G.
It is clear that a stable distribution G belongs to its own domain of attraction, with a_n = c_n, b_n = d_n. Conversely, we shall prove below that the only non-empty domains of attraction are those of stable distributions.
Theorem 4.11. If the normed sum (S_n − b_n)/a_n converges in distribution to a non-degenerate limit, then
(i) as n → ∞, a_n → ∞, a_{n+1}/a_n → 1 and (b_{n+1} − b_n)/a_n → b with |b| < ∞, and
(ii) the limit distribution is stable.

Proof. (i) With the obvious notation we are given that
[φ(ω/a_n) e^{−iωb_n/(na_n)}]^n → γ(ω)  (4.23)
uniformly in [−Ω, Ω]. By Lemma 4.1 we conclude that
n[φ(ω/a_n) e^{−iωb_n/(na_n)} − 1] → ψ(ω),
where γ = e^{ψ}. Therefore
δ_n(ω) = φ(ω/a_n) e^{−iωb_n/(na_n)} → 1.
Let {a_{n_k}} be a subsequence of {a_n} such that a_{n_k} → a (0 ≤ a ≤ ∞). If 0 < a < ∞, then
1 = lim |φ(ω/a_{n_k})| = |φ(ω/a)|,
while if a = 0, then
1 = lim |δ_{n_k}(ωa_{n_k})| = |φ(ω)|.
Both implications here would mean that φ is degenerate, which is not true. Hence a = ∞ and a_n → ∞. From (4.23) we have
[φ(ω/a_{n+1})]^{n+1} e^{−iωb_{n+1}/a_{n+1}} → γ(ω),
which can be written as
[φ(ω/a_{n+1})]^n e^{−iωb_{n+1}/a_{n+1}} → γ(ω),  (4.24)
since φ(ω/a_{n+1}) → 1. By Theorem 2.10 it follows from (4.23) and (4.24) that a_{n+1}/a_n → 1 and (b_{n+1} − b_n)/a_n → b.
(ii) For fixed m ≥ 1 we have
[φ(ω/a_n)]^{mn} e^{−imωb_n/a_n} = {[φ(ω/a_n)]^n e^{−iωb_n/a_n}}^m → γ^m(ω).
Again by Theorem 2.10 it follows that a_{mn}/a_n → c_m and (b_{mn} − mb_n)/a_n → d_m, where c_m > 0 and d_m is real, while
γ(ω) = γ^m(ω/c_m) e^{−iωd_m/c_m}, or γ^m(ω) = γ(c_mω) e^{iωd_m}.
This shows that γ is stable.
Theorem 4.12. A c.f. φ is stable iff φ = e^{ψ}, with
ψ(ω) = iβω − c|ω|^α [1 + iγ(ω/|ω|) z(|ω|, α)]  (4.25)
where β is real, c > 0, 0 < α ≤ 2, |γ| ≤ 1 and
z(|ω|, α) = tan(πα/2) if α ≠ 1,  (2/π) log|ω| if α = 1.  (4.26)
Here α is called the characteristic exponent of φ.

Proof. (i) Suppose ψ is given by (4.25) and (4.26). Then for a > 0 we have
aψ(ω) − ψ(a^{1/α}ω) = iβω(a − a^{1/α}) − ac|ω|^α iγ(ω/|ω|)[z(|ω|, α) − z(a^{1/α}|ω|, α)]
= iβω(a − a^{1/α}) if α ≠ 1,  iγ(2c/π) aω log a if α = 1.
This shows that φ is stable.
(ii) Conversely, let φ be stable. Then by Theorem 4.11 it possesses a domain of attraction; that is, there exist a c.f. λ and real constants a_n > 0, b_n such that as n → ∞
[λ(ω/a_n) e^{−iωb_n}]^n → φ(ω).
Therefore by Lemma 4.1,
n[λ(ω/a_n) e^{−iωb_n} − 1] → ψ(ω)
where φ = e^{ψ}. Let F be the distribution corresponding to λ. We first consider the case where F is symmetric; then b_n = 0. Let M_n{dx} = nx² F{a_n dx}. Then by Theorem 4.5 it follows that there exist a Feller measure M and a constant b such that
ψ(ω) = ibω + ∫_{−∞}^{∞} [e^{iωx} − 1 − iω sin x] x^{−2} M{dx}.  (4.27)
Let
U(x) = ∫_{−x}^{x} y² F{dy}  (x > 0).  (4.28)
Then
M_n{(−x, x)} = (n/a_n²) U(a_nx) → M{(−x, x)}  (4.29a)
n[1 − F(a_nx)] = ∫_x^∞ y^{−2} M_n{dy} → M^+(x)  (4.29b)
nF(−a_nx) = ∫_{−∞}^{−x} y^{−2} M_n{dy} → M^−(x).  (4.29c)
By Theorem 4.11 we know that a_n → ∞, a_{n+1}/a_n → 1. Therefore U(x) varies regularly at infinity and M{(−x, x)} = Cx^{2−α} where C > 0, 0 < α ≤ 2. If α = 2 the measure M is concentrated at the origin. If 0 < α < 2 the measure M is absolutely continuous.
In the case where F is unsymmetric we have
n[1 − F(a_nx + a_nb_n)] → M^+(x),  nF(−a_nx + a_nb_n) → M^−(x)
and an analogous modification of (4.29a). However it is easily seen that b_n → 0, and so these results are fully equivalent to (4.29).
Considering (4.29b) we see that either M^+(x) ≡ 0 or 1 − F(x) varies regularly at infinity and M^+(x) = Ax^{−α}. Similarly F(−x) and 1 − F(x) + F(−x) vary regularly at infinity and the exponent α is the same for both M^+ and M^−. Clearly 0 < α ≤ 2.
If M^+ and M^− vanish identically, then clearly M is concentrated at the origin. Conversely, if M has an atom at the origin, then a symmetrization argument shows that M is concentrated at the origin, and M^+, M^− vanish identically. Accordingly, when α < 2 the measure M is uniquely determined by its density, which is proportional to |x|^{1−α}. For each interval (−y, x) containing the origin we therefore obtain
M{(−y, x)} = C(px^{2−α} + qy^{2−α})  (4.30)
where p + q = 1. For α = 2, M is concentrated at the origin. For 0 < α < 2 we have already shown in Sec. 4.4 that the measure (4.30) yields the required expression (4.25) for ψ.
Corollary 4.3. If G_α is the stable distribution with the characteristic exponent α, then as x → ∞
x^α[1 − G_α(x)] → Cp(2 − α)/α,  x^α G_α(−x) → Cq(2 − α)/α.  (4.31)

Proof. Clearly, G_α belongs to its own domain of attraction with the norming constants a_n = n^{1/α}. For 0 < α < 2, choosing n^{1/α}x = t in (4.29b) we find that t^α[1 − G_α(t)] → Cp(2 − α)/α as t → ∞. For α = 2, G_α is the normal distribution and for it we have a stronger result, namely, x²[1 − G₂(x)] → 0 as x → ∞.
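For the Cauchy case the tail estimate can be seen concretely: for the standard Cauchy distribution (α = 1) one has x[1 − G₁(x)] → 1/π. A quick check with scipy (the cutoff 10⁴ is an arbitrary large value):

```python
import math
from scipy.stats import cauchy

x = 1e4
tail = cauchy.sf(x)              # 1 - G_1(x) for the standard Cauchy
print(x * tail, 1 / math.pi)     # both close to 0.31831
```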
Theorem 4.13. (i) All stable distributions are absolutely continuous.
(ii) Let 0 < α < 2. Then moments of order β < α exist, while moments of order β > α do not.

Proof. (i) We have |φ(ω)| = e^{−c|ω|^α}, with c > 0. Since this function is integrable over (−∞, ∞), the result (i) follows by Theorem 2.6(b).
(ii) For t > 0 an integration by parts gives
∫_{−t}^{t} |x|^β F{dx} = −t^β[1 − F(t) + F(−t)] + β ∫_0^t x^{β−1}[1 − F(x) + F(−x)] dx
≤ β ∫_0^t x^{β−1}[1 − F(x) + F(−x)] dx.
If β < α this last integral converges as t → ∞, since by Corollary 4.3 we have x^α[1 − F(x) + F(−x)] ≤ M for x > t with t large. It follows that the absolute moment (and therefore the ordinary moment) of order β < α is finite. Conversely, if the absolute moment of order β > α exists, then for ε > 0 we have
ε > ∫_{|x|>t} |x|^β F{dx} > t^β[1 − F(t) + F(−t)],
or t^α[1 − F(t) + F(−t)] < εt^{α−β} → 0 as t → ∞, which is a contradiction. Therefore absolute moments of order β > α do not exist.
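A concrete instance of (ii), offered as an illustration: for the standard Cauchy distribution (α = 1) the mean does not exist, but E|X|^β is finite for β < 1; the closed form E|X|^β = sec(πβ/2) is a standard integral, not derived in the text. A quadrature check for β = 1/2 (an arbitrary choice):

```python
import math
from scipy.integrate import quad

beta = 0.5
# E|X|^beta for the standard Cauchy: (2/pi) Integral_0^inf x^beta/(1+x^2) dx,
# which converges since the integrand decays like x^{beta-2}
val = quad(lambda x: (2 / math.pi) * x**beta / (1 + x**2), 0, math.inf)[0]
print(val, 1 / math.cos(math.pi * beta / 2))   # both equal sqrt(2)
```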
Remarks

(1) From the proof of Theorem 4.12 it is clear that
φ(ω)^a = φ(c_aω) e^{iωd_a}  for all a > 0,
and the functions c_a and d_a are given by
(i) c_a = a^{1/α} with 0 < α ≤ 2, and
(ii) d_a = β(a − a^{1/α}) if α ≠ 1,  (2cγ/π) a log a if α = 1.
(2) If in the definition (4.21), d_n = 0, then the distribution is called strictly stable. However, the distinction between strict and weak stability matters only when α = 1, because when α ≠ 1 we can take d_n = 0 without loss of generality. To prove this we note that d_n = β(n − n^{1/α}) for α ≠ 1, and consider the c.f.
φ₀(ω) = φ(ω) e^{−iβω}.
We have
φ₀(ω)^n = φ(ω)^n e^{−inβω} = φ(c_nω) e^{iω(d_n − nβ)} = φ(c_nω) e^{−iβc_nω} e^{iω(βc_n + d_n − nβ)} = φ₀(c_nω),
since βc_n + d_n − nβ = 0, which shows that φ₀ is strictly stable.
(3) Let α ≠ 1 and assume that β = 0. Then we can write
ψ(ω) = −a|ω|^α for ω > 0, and −ā|ω|^α for ω < 0  (4.32)
where a is a complex constant (ā being its conjugate). Choosing a scale so that |a| = 1 we can write a = e^{iπδ/2}, where tan(πδ/2) = γ tan(πα/2). Since |γ| ≤ 1 it follows that
|δ| ≤ α if 0 < α < 1, and |δ| ≤ 2 − α if 1 < α < 2.  (4.33)
Theorem 4.14. Let α ≠ 1 and let the c.f. of a stable distribution be expressed in the form
φ(ω) = e^{−|ω|^α e^{±iπδ/2}}  (4.34)
where in ± the upper sign prevails for ω > 0 and the lower sign for ω < 0. Let the corresponding density be denoted by f(x; α, δ). Then
f(−x; α, δ) = f(x; α, −δ) for x > 0.  (4.35)
For x > 0 and 0 < α < 1,
f(x; α, δ) = (1/πx) Σ_{k=1}^{∞} [Γ(kα + 1)/k!] (−x^{−α})^k sin(kπ(δ − α)/2)  (4.36)
and for x > 0 and 1 < α < 2,
f(x; α, δ) = (1/πx) Σ_{k=1}^{∞} [Γ(kα^{−1} + 1)/k!] (−x)^k sin(kπ(δ − α)/(2α)).  (4.37)

Corollary 4.4. A stable distribution is concentrated on (0, ∞) if 0 < α < 1, δ = −α, and on (−∞, 0) if 0 < α < 1, δ = α.

Proofs are omitted.
Theorem 4.15. (a) A distribution F belongs to the domain of attraction of the normal distribution iff
U(x) = ∫_{−x}^{x} y² F{dy}  (4.38)
varies slowly.
(b) A distribution F belongs to the domain of attraction of a stable distribution with characteristic exponent α < 2 iff
1 − F(x) + F(−x) ∼ x^{−α} L(x)  (x → ∞)  (4.39)
and
[1 − F(x)]/[1 − F(x) + F(−x)] → p,  F(−x)/[1 − F(x) + F(−x)] → q  (4.40)
where p ≥ 0, q ≥ 0 and p + q = 1. Here L is a slowly varying function on (0, ∞); that is, for each x > 0
L(tx)/L(t) → 1 as t → ∞.  (4.41)
The proof is omitted.
Theorem 4.16. Let F be a proper distribution concentrated on (0, ∞) and F_n the n-fold convolution of F with itself. If F_n(a_nx) → G(x), where G is a non-degenerate distribution, then G = G_α, the stable distribution concentrated on (0, ∞), with exponent α (0 < α < 1), and moreover 1 − F(t) ∼ t^{−α}L(t)/Γ(1 − α). Conversely, if 1 − F(t) ∼ t^{−α}L(t)/Γ(1 − α), we can find constants a_n such that F_n(a_nx) → G_α(x). Here L is a slowly varying function.

Proof. (i) Suppose that F_n(a_nx) → G(x), and γ is the L.T. of G. Denote by F* the L.T. of F. Then F*(θ/a_n)^n → γ(θ), or
−n log F*(θ/a_n) → −log γ(θ).
This shows that −log F*(θ) is of regular variation at the origin, that is, −log F*(θ) ∼ θ^α L(1/θ) (θ → 0+), with α ≥ 0. Since −log(1 − z) ∼ z for small z, we find that 1 − F*(θ) ∼ θ^α L(1/θ). This gives 1 − F(t) ∼ t^{−α}L(t)/Γ(1 − α), as required. Moreover, −log γ(θ) = cθ^α (c > 0) or γ(θ) = e^{−cθ^α}, so that G is the stable distribution with exponent α. Here 0 < α < 1 since G is non-degenerate.
(ii) Conversely, let 1 − F(t) ∼ t^{−α}L(t)/Γ(1 − α) (t → ∞). This gives 1 − F*(θ) ∼ θ^α L(1/θ) (θ → 0+). Let us choose constants a_n so that n[1 − F(a_n)] → c/Γ(1 − α) for 0 < c < ∞. Then as n → ∞,
na_n^{−α} L(a_n) = {a_n^{−α} L(a_n)/[1 − F(a_n)]Γ(1 − α)} · n[1 − F(a_n)]Γ(1 − α) → c
and also
na_n^{−α} L(a_n/θ) = na_n^{−α} L(a_n) · [L(a_n/θ)/L(a_n)] → c.
Therefore 1 − F*(θ/a_n) ∼ cθ^α/n and
F*(θ/a_n)^n = [1 − cθ^α/n + o(1/n)]^n → e^{−cθ^α}.
This shows that F_n(a_nx) → G_α(x).
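For the one-sided stable law with exponent α = 1/2, whose L.T. is γ(θ) = e^{−√θ} (taking c = 1), the scaling in the theorem is explicit: a_n = n^{1/α} = n², and F*(θ/a_n)^n reproduces γ(θ) exactly, since the law lies in its own domain of attraction. A numerical sketch with arbitrary θ and n:

```python
import math

theta, n = 1.7, 9

lt = lambda t: math.exp(-math.sqrt(t))   # L.T. of the stable law, exponent 1/2

a_n = n ** 2                              # a_n = n^{1/alpha} with alpha = 1/2
lhs = lt(theta / a_n) ** n                # L.T. of S_n / a_n
print(lhs, lt(theta))                     # equal: e^{-n sqrt(theta)/n} = e^{-sqrt(theta)}
```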
4.7. Problems for Solution

1. Show that if F and G are infinitely divisible distributions, so is their convolution F ⋆ G.
2. If φ is an infinitely divisible c.f., prove that |φ| is also an infinitely divisible c.f.
3. Show that the uniform distribution is not infinitely divisible. More generally, a distribution concentrated on a finite interval is not infinitely divisible, unless it is concentrated at a point.
4. Let 0 < r_j < 1 and Σ r_j < ∞. Prove that for arbitrary a_j the infinite product
φ(ω) = Π_{j=1}^{∞} (1 − r_j)/(1 − r_j e^{iωa_j})
converges, and represents an infinitely divisible c.f.
5. Let X = Σ_{k=1}^{∞} X_k/k, where the random variables X_k are independent and have the common density (1/2)e^{−|x|}. Show that X is infinitely divisible, and find the associated Feller measure.
6. Let P be an infinitely divisible p.g.f. and φ the c.f. of an arbitrary distribution. Show that P(φ(ω)) is an infinitely divisible c.f.
7. If 0 ≤ a < b < 1 and φ is a c.f., then show that
[(1 − b)/(1 − a)] · [(1 − aφ(ω))/(1 − bφ(ω))]
is an infinitely divisible c.f.
8. Prove that a probability distribution with a completely monotone density is infinitely divisible.
9. Mixtures of exponential (geometric) distributions. Let
f(x) = Σ_{k=1}^{n} p_k λ_k e^{−λ_k x}
where p_k > 0, Σ p_k = 1 and for definiteness 0 < λ_1 < λ_2 < ··· < λ_n. Show that the density f(x) is infinitely divisible. (Similarly a mixture of geometric distributions is infinitely divisible.) By a limit argument prove that the density
f(x) = ∫_0^∞ λe^{−λx} G{dλ},
where G is a distribution concentrated on (0, ∞), is infinitely divisible.
10. If X, Y are two independent random variables such that X > 0 and Y has an exponential density, then prove that XY is infinitely divisible.
11. Show that a c.f. φ is stable if and only if given c′ > 0, c″ > 0 there exist constants c > 0, d such that
φ(c′ω)φ(c″ω) = φ(cω) e^{iωd}.
12. Let the c.f. φ be given by log φ(ω) = Σ_{k=−∞}^{∞} 2^{−k}(cos 2^kω − 1). Show that φ(ω)^n = φ(nω) for n = 2, 4, 8, . . . , and that φ(ω) is infinitely divisible, but not stable.
13. If φ(ω)² = φ(cω) and the variance is finite, show that φ(ω) is stable (in fact normal).
14. If φ(ω)² = φ(aω) and φ(ω)³ = φ(bω) with a > 0, b > 0, show that φ(ω) is stable.
15. If F and G are stable with the same exponent α, so is their convolution F ⋆ G.
16. If X, Y are independent random variables such that X is stable with exponent α, while Y is positive and stable with exponent β (< 1), show that XY^{1/α} is stable with exponent αβ.
17. The Holtsmark distribution. Suppose that n stars are distributed in the interval (−n, n) on the real line, their locations d_i (i = 1, 2, . . . , n) being independent r.v. with a uniform density. Each star has mass unity, and the gravitational constant is also unity. The force which will be exerted on a unit mass at the origin (the gravitational field) is then
Y_n = Σ_{r=1}^{n} sgn(d_r)/d_r².
Show that as n → ∞, the distribution of Y_n converges to a stable distribution with exponent α = 1/2.
18. Let {X_k, k ≥ 1} be a sequence of independent random variables with the common density
f(x) = 2|x|^{−3} log|x| for |x| ≥ 1,  0 for |x| < 1.
Show that (X_1 + X_2 + ··· + X_n)/√(n log n) is asymptotically normal.
Chapter 5
Self-Decomposable Distributions; Triangular Arrays
5.1. Self-Decomposable Distributions

A distribution F and its c.f. φ are called self-decomposable if for every c in (0, 1) there exists a c.f. φ_c such that
φ(ω) = φ(cω) φ_c(ω).  (5.1)
We shall call φ_c the component of φ. Restriction of c to (0, 1) is explained in Problem 5.1.
If φ is self-decomposable, then it can be proved that φ ≠ 0 (Problem 5.2). Thus the above definition implies that φ(ω)/φ(cω) is a c.f. for every c in (0, 1).

Examples.
1. Degenerate distributions are (trivially) self-decomposable, and all their components are also degenerate.
2. A stable c.f. is self-decomposable, since by P. Lévy's second definition (Problem 4.11) we have
φ(ω) = φ(cω) φ(c′ω) e^{iωd}
with 0 < c < 1, 0 < c′ < 1. Here φ_c(ω) = φ(c′ω) e^{iωd} with c′ and d depending on c; the component is also self-decomposable.

The concept of self-decomposable distributions is due to Khintchine (1936); they are also called distributions of class L.
Theorem 5.1. If φ is self-decomposable, it is infinitely divisible, and so is its component φ_c.

Proof. (i) Let {X_k, k ≥ 1} be independent random variables with X_k having the c.f. φ_{(k−1)/k}(kω). Let S_n = X_1 + X_2 + ··· + X_n (n ≥ 1). Then the c.f. of S_n/n is given by
E(e^{iω(S_n/n)}) = Π_{k=1}^{n} φ_{(k−1)/k}(kω/n) = Π_{k=1}^{n} φ(kω/n)/φ((k − 1)ω/n) = φ(ω)
so that φ is the c.f. of X_1/n + X_2/n + ··· + X_n/n. By the theorem on triangular arrays φ is infinitely divisible.
(ii) We also have
φ(ω) = φ(mω/n) Π_{k=m+1}^{n} φ_{(k−1)/k}(kω/n).
In this let m → ∞, n → ∞ in such a way that m/n → c (0 < c < 1). Then
φ(ω) = φ(cω) lim_{m,n→∞} Π_{k=m+1}^{n} φ_{(k−1)/k}(kω/n),
which shows that φ_c(ω) = lim_{m,n→∞} Π_{k=m+1}^{n} φ_{(k−1)/k}(kω/n). Again by the theorem on triangular arrays φ_c is infinitely divisible.
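The telescoping product in the proof can be made concrete for the standard exponential distribution, φ(ω) = (1 − iω)^{−1}, which is self-decomposable with component φ_c(ω) = φ(ω)/φ(cω) = (1 − icω)/(1 − iω); this example is an illustration, not from the text:

```python
n, omega = 50, 1.3                      # arbitrary

phi = lambda w: 1 / (1 - 1j * w)        # c.f. of the standard exponential

# product of the components phi_{(k-1)/k} evaluated at k*omega/n
prod = 1
for k in range(1, n + 1):
    prod *= phi(k * omega / n) / phi((k - 1) * omega / n)

print(abs(prod - phi(omega)))           # telescopes exactly to phi(omega)
```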
As a converse of Theorem 5.1 we ask whether, given a sequence {X_k, k ≥ 1} of independent random variables, there exist suitable constants a_n > 0, b_n such that the normed sums (S_n − b_n)/a_n converge in distribution. It is clear that in order to obtain this convergence we have to impose reasonable restrictions on the X_k. We require that each component X_k/a_n become uniformly asymptotically negligible (uan) in the sense that given ε > 0 and Ω > 0 one has for all sufficiently large n
|1 − E(e^{iωX_k/a_n})| < ε for ω ∈ [−Ω, Ω], k = 1, 2, . . . , n.  (5.2)
Theorem 5.2. If the normed sums (S_n − b_n)/a_n converge in distribution, then
(i) as n → ∞, a_n → ∞, a_{n+1}/a_n → 1; and
(ii) the limit distribution is self-decomposable.

Proof. (i) Let φ_k be the c.f. of X_k. We are given that
Φ_n(ω) = e^{−iωb_n/a_n} Π_{k=1}^{n} φ_k(ω/a_n) → φ(ω)  (5.3)
as n → ∞. Take a subsequence {a_{n_k}} of {a_n} such that a_{n_k} → a (0 ≤ a ≤ ∞). If 0 < a < ∞, then for each k, |1 − φ_k(ω/a)| ≤ ε for ω ∈ [−Ω, Ω], or
|1 − φ_k(ω)| ≤ ε for ω ∈ [−Ω/a, Ω/a],
on account of the uan condition (5.2). Therefore φ_k(ω) ≡ 1 in [−Ω/a, Ω/a], and (5.3) gives |φ(ω)| ≡ 1. This means that φ is degenerate, which is not true. If a = 0, then
1 = |φ(0)| = lim_k |Φ_{n_k}(ωa_{n_k})| = lim_k Π_j |φ_j(ω)|.
This gives |φ_j(ω)| = 1 for all ω and again leads to a degenerate φ. Therefore a = ∞, which means that a_n → ∞. Proceeding as in the proof of Theorem 4.11 we find that a_{n+1}/a_n → 1.
(ii) Given c in (0, 1), for every integer n we can choose an integer m < n such that a_m/a_n → c, and m → ∞, n − m → ∞ as n → ∞. We can write (5.3) as
Φ_n(ω) = e^{−iωb_m/a_n} Π_{k=1}^{m} φ_k((a_m/a_n)(ω/a_m)) · e^{−iω(b_n−b_m)/a_n} Π_{k=m+1}^{n} φ_k(ω/a_n)
= Φ_m((a_m/a_n)ω) Φ_{mn}(ω) (say).  (5.4)
Here Φ_n(ω) → φ(ω) and Φ_m((a_m/a_n)ω) → φ(cω). If we prove that φ ≠ 0, then the c.f. Φ_{mn}(ω) → φ(ω)/φ(cω), a continuous function. It follows that φ(ω)/φ(cω) is a c.f., which means that φ is self-decomposable. To show that φ ≠ 0, note that φ(ω_0) = 0 for some ω_0 implies that φ(cω_0) = 0. By induction φ(c^nω_0) = 0, so that φ(0) = 0, which is absurd.
Theorem 5.3. A c.f. φ is self-decomposable iff it is infinitely divisible and its Feller measure M is such that the two functions M^+_c, M^−_c, where
M^+_c(x) = M^+(x) − M^+(x/c),  M^−_c(x) = M^−(x) − M^−(x/c),
are monotone for every c in (0, 1).
5.2. Triangular Arrays

For each n ≥ 1 let the random variables X_{1n}, X_{2n}, . . . , X_{r_n n} be independent, with X_{kn} having the distribution F_{kn} and c.f. φ_{kn}. The double sequence {X_{kn}, 1 ≤ k ≤ r_n, n ≥ 1}, where r_n → ∞, is called a triangular array. Let
S_{nn} = X_{1n} + X_{2n} + ··· + X_{r_n n}.
We are interested in the limit distribution of S_{nn} + γ_n, where {γ_n} is a sequence of real constants. The array {X_{kn}} will be called a null array if it satisfies the uniformly asymptotically negligible (uan) condition: for each ε > 0 and η > 0
P{|X_{kn}| > ε} < η  (k = 1, 2, . . . , r_n)  (5.5)
for n sufficiently large. In terms of c.f.s we can express this as follows: given ε > 0 and Ω > 0 we have
|1 − φ_{kn}(ω)| < ε for |ω| < Ω, k = 1, 2, . . . , r_n  (5.6)
for n sufficiently large. As special cases of the limit results we seek we have the following:
(i) Let X_{kn} = (X_k − b_n/n)/a_n (k = 1, 2, . . . , n; n ≥ 1) where {X_k} is a sequence of independent random variables with a common distribution. The problem is to find norming constants a_n > 0, b_n such that the distribution of the random variables (S_n − b_n)/a_n converges properly. We have seen that the limit distribution is (a) stable if the X_k are identically distributed, and (b) self-decomposable in general.
(ii) Let P{X_{kn} = 1} = p_n and P{X_{kn} = 0} = 1 − p_n, and suppose that p_n → 0, np_n → λ > 0. Then we know that
P{S_{nn} = k} → e^{−λ} λ^k/k!  (k = 0, 1, 2, . . .).
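Special case (ii) can be checked numerically with scipy (n = 10⁴, λ = 2 and k = 3 are arbitrary choices for this illustration):

```python
from scipy.stats import binom, poisson

n, lam, k = 10_000, 2.0, 3
p_n = lam / n                         # p_n -> 0 with n p_n = lam

approx = binom.pmf(k, n, p_n)         # P{S_nn = k} for the Bernoulli array
limit = poisson.pmf(k, lam)           # e^{-lam} lam^k / k!
print(approx, limit)                  # very close for large n
```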
We introduce the Feller measure M_n by setting
M_n{dx} = Σ_{k=1}^{r_n} x² F_{kn}{dx}.  (5.7)
For this we have
M^+_n(x) = Σ_{k=1}^{r_n} [1 − F_{kn}(x)],  M^−_n(x) = Σ_{k=1}^{r_n} F_{kn}(−x)  (5.8)
for x > 0. We also introduce the truncation procedure by which a random variable X is replaced by τ(X), where
τ(x) = −a for x < −a,  x for −a ≤ x ≤ a,  a for x > a.  (5.9)
It is seen that Eτ(X + t) is a continuous monotone function of t and therefore vanishes for some t.
For each pair (k, n) there exists a constant t_{kn} such that Eτ(X_{kn} + t_{kn}) = 0. We can therefore center the random variable X_{kn} so that
b_{kn} = Eτ(X_{kn}) = 0.  (5.10)
Assume this has been done. Let
A_n = Σ_{k=1}^{r_n} Eτ²(X_{kn}).  (5.11)
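The centering step can be sketched numerically: for a given distribution of X one solves Eτ(X + t) = 0 for t, which is possible since Eτ(X + t) is continuous and monotone in t. A small illustration for a two-point distribution (the distribution, the level a, and the use of scipy's brentq root-finder are all choices made for this sketch):

```python
from scipy.optimize import brentq

a = 1.0                                   # truncation level in (5.9)
xs, ps = [-1.0, 2.0], [0.5, 0.5]          # X = -1 or 2 with probability 1/2 each

def tau(x):
    return max(-a, min(x, a))             # the truncation function (5.9)

def mean_tau_shift(t):
    # E tau(X + t): continuous and non-decreasing in t
    return sum(p * tau(x + t) for x, p in zip(xs, ps))

t0 = brentq(mean_tau_shift, -5, 5)        # a centering constant t with E tau(X + t) = 0
print(t0, mean_tau_shift(t0))
```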
Proposition 5.1. As n → ∞ we have
log E e^{iω(S_{nn}+γ_n)} = Σ_{k=1}^{r_n} [φ_{kn}(ω) − 1] + iωγ_n + o(A_n)
for |ω| < Ω.

Proof. We have
φ_{kn}(ω) − 1 = ∫_{−∞}^{∞} (e^{iωx} − 1) F_{kn}{dx} = ∫_{−∞}^{∞} [e^{iωx} − 1 − iωτ(x)] F_{kn}{dx}
= ∫_{−a}^{a} (e^{iωx} − 1 − iωx) F_{kn}{dx} + ∫_{−∞}^{−a} (e^{iωx} − 1 + iωa) F_{kn}{dx} + ∫_{a}^{∞} (e^{iωx} − 1 − iωa) F_{kn}{dx}
so that
|φ_{kn}(ω) − 1| ≤ ∫_{−a}^{a} (1/2)ω²x² F_{kn}{dx} + ∫_{|x|>a} (2 + a|ω|) F_{kn}{dx}
≤ (1/2)ω² Eτ²(X_{kn}) + (2 + a|ω|) Eτ²(X_{kn}) a^{−2} = c(ω) Eτ²(X_{kn}),
where c(ω) = ω²/2 + 2/a² + |ω|/a. Summing this over k = 1, 2, . . . , r_n we obtain
Σ_{k=1}^{r_n} |φ_{kn}(ω) − 1| ≤ c(ω) A_n.  (5.12)
The uan condition implies that φ_{kn}(ω) ≠ 0 in (−Ω, Ω), so that log φ_{kn} exists in (−Ω, Ω) for n sufficiently large. Therefore
log φ_{kn}(ω) = log[1 + φ_{kn}(ω) − 1] = [φ_{kn}(ω) − 1] + Σ_{r=2}^{∞} [(−1)^{r−1}/r] [φ_{kn}(ω) − 1]^r
and
|Σ_{k=1}^{r_n} log φ_{kn}(ω) − Σ_{k=1}^{r_n} [φ_{kn}(ω) − 1]| ≤ Σ_{k=1}^{r_n} Σ_{r=2}^{∞} (1/r) |φ_{kn}(ω) − 1|^r
≤ Σ_{k=1}^{r_n} (1/2) |φ_{kn}(ω) − 1|²/[1 − |φ_{kn}(ω) − 1|] ≤ sup_{1≤k≤r_n} |φ_{kn}(ω) − 1| · Σ_{k=1}^{r_n} |φ_{kn}(ω) − 1| < ε c(ω) A_n  (5.13)
by the uan condition and (5.12). From (5.12) and (5.13) it follows that
log E e^{iω(S_{nn}+γ_n)} = log [e^{iωγ_n} Π_{k=1}^{r_n} φ_{kn}(ω)] = Σ_{k=1}^{r_n} log φ_{kn}(ω) + iωγ_n = Σ_{k=1}^{r_n} [φ_{kn}(ω) − 1] + iωγ_n + o(A_n)
as required.
Theorem 5.4. Let {X_{kn}} be a null array, centered so that b_{kn} = 0, and {γ_n} a sequence of real constants. Then S_{nn} + γ_n converges in distribution iff γ_n → b and M_n → M, a Feller measure. In this case the limit distribution is infinitely divisible, its c.f. being given by φ = e^{ψ}, with
ψ(ω) = ibω + ∫_{−∞}^{∞} [e^{iωx} − 1 − iωτ(x)] x^{−2} M{dx}.  (5.14)

Proof. The desired result is a consequence of Theorem 4.5. In order to apply this theorem we define the distribution
F_n{dx} = (1/r_n) Σ_{k=1}^{r_n} F_{kn}{dx}  (5.15)
and its c.f.
φ_n(ω) = (1/r_n) Σ_{k=1}^{r_n} φ_{kn}(ω).  (5.16)
Then M_n{dx} = r_n x² F_n{dx}, the associated c.f. being e^{ψ_n(ω)}, where
ψ_n(ω) = iωγ_n + r_n[φ_n(ω) − 1] = iωγ_n + Σ_{k=1}^{r_n} [φ_{kn}(ω) − 1].  (5.17)
Using Proposition 5.1 we can therefore write
log E e^{iω(S_{nn}+γ_n)} = ψ_n(ω) + o(A_n)  (n → ∞).  (5.18)
(i) Let M_n → M and γ_n → b. By Theorem 4.5 it follows that ψ_n → ψ, where
ψ(ω) = ibω + ∫_{−∞}^{∞} [e^{iωx} − 1 − iωτ(x)] x^{−2} M{dx}.  (5.19)
Furthermore
A_n = Σ_{k=1}^{r_n} Eτ²(X_{kn}) = Σ_{k=1}^{r_n} ∫_{−a}^{a} x² F_{kn}{dx} + Σ_{k=1}^{r_n} ∫_{|x|≥a} a² F_{kn}{dx}
= ∫_{−a}^{a} M_n{dx} + ∫_{|x|≥a} (a²/x²) M_n{dx} → M{(−a, a)} + [M^+(a) + M^−(a)]a²
if (−a, a) is an interval of continuity of the measure M. Thus A_n tends to a finite limit and (5.18) gives log E e^{iω(S_{nn}+γ_n)} → ψ(ω), with ψ given by (5.19).
(ii) Conversely, suppose that S_{nn} + γ_n converges in distribution. Then by (5.18) there is a c.f. φ such that
ψ_n(ω) + o(A_n) → log φ(ω)  (5.20)
for |ω| < Ω. Since the convergence is uniform we integrate (5.20) over (−h, h), where 0 < h < Ω. The left side gives
∫_{−h}^{h} dω Σ_{k=1}^{r_n} ∫_{−∞}^{∞} (e^{iωx} − 1) F_{kn}{dx} + 2h·o(A_n) + γ_n ∫_{−h}^{h} iω dω
= Σ_{k=1}^{r_n} ∫_{−∞}^{∞} [2 sin hx/x − 2h] F_{kn}{dx} + 2h·o(A_n)
and so
Σ_{k=1}^{r_n} ∫_{−∞}^{∞} [1 − sin hx/(hx)] F_{kn}{dx} + o(A_n) → −(1/2h) ∫_{−h}^{h} log φ(ω) dω.  (5.21)
Now take h < 2; then
1 − sin hx/(hx) ≥ (1/10)h²x² for |x| < 1,  1 − sin hx/(hx) > 1/2 for |x| ≥ 1.
Then the left side of (5.21) is
≥ Σ_{k=1}^{r_n} (h²/10) ∫_{−1}^{1} x² F_{kn}{dx} + Σ_{k=1}^{r_n} (1/2) ∫_{|x|>1} F_{kn}{dx} + o(A_n)
≥ (h²/10) [Σ_{k=1}^{r_n} ∫_{−1}^{1} x² F_{kn}{dx} + Σ_{k=1}^{r_n} ∫_{|x|>1} F_{kn}{dx}] + o(A_n)
= (h²/10) A_n + o(A_n)
(taking a = 1 in the truncation (5.9)). This shows that A_n is bounded as n → ∞. We can therefore write (5.20) as
ψ_n(ω) → log φ(ω)  (5.22)
uniformly in |ω| < Ω. The required result now follows from Theorem 4.5.
5.3. Problems for Solution

1. If φ(ω) = φ(cω)φ_c(ω) for c ≥ 1, where φ_c(ω) is a c.f., then either φ_c(ω) is degenerate or φ(ω) is degenerate.
[If c = 1, then φ_c(ω) ≡ 1. If c > 1, then since |φ(ω)| ≤ |φ(cω)| we obtain
1 ≥ |φ(ω)| ≥ |φ(ω/c)| ≥ |φ(ω/c²)| ≥ ··· ≥ |φ(ω/c^n)| → |φ(0)| = 1,
which gives |φ(ω)| ≡ 1.]
2. A self-decomposable c.f. never vanishes.
[If φ(2a) = 0 and φ(ω) ≠ 0 for 0 ≤ ω < 2a, then φ_c(2a) = 0. We have
|φ_c(a)|² = |φ_c(2a) − φ_c(a)|² ≤ 2[1 − Re φ_c(a)].
Here φ_c(a) = φ(a)/φ(ca) → 1 as c → 1, so we have a contradiction.]
Bibliography
Feller, W., An Introduction to Probability Theory and its Applications, Vol. 1,
3rd Ed. (Wiley, New York, 1968).
Feller, W., An Introduction to Probability Theory and its Applications, Vol. 2, 2nd
Ed. (Wiley, New York, 1971).
Hille, E., Analytic Function Theory, Vol. II (Boston, 1962).
Loève, M., Probability Theory, 3rd Ed. (Van Nostrand, New York, 1963).
Index
absolutely continuous, 2
analytic c.f., 27
arithmetic distribution, 2
atomic, 2
binomial, 28
Cauchy, 28
central limit theorem, 19
characteristic function, 11
continuity theorem, 17
continuous, 2
convergence of types, 19
convolution, 4
cumulant generating function, 32
defective, 1
density, 2
distributions of class L, 69
domains of attraction, 59
entire, 27
Feller measure, 46
gamma, 28
infinitely divisible, 43
Jordan decomposition, 3
Lévy measure, 52
Lévy processes, 57
Lévy–Khintchine representation, 52
Laplace, 28
Lebesgue decomposition, 3
Lyapunov's inequality, 8
moment problem, 31
moments, 6
non-negative denite, 21
normal, 28
Poisson, 28
probability distribution, 1
probability distribution function, 1
proper, 1
random variables, 4
Schwarz inequality, 30
selection theorem, 10
self-decomposable distributions, 69
singular, 2
stable distributions, 55
subordinator, 58
triangular arrays, 72
weak law of large numbers, 18