
Chapter 6
Moment Generating Functions
6.1 Definition and Properties
Our previous discussion of probability generating functions was in the context of discrete r.v.s.
Now we introduce a more general form of generating function which can be used (though not
exclusively so) for continuous r.v.s.
The moment generating function (MGF) of a random variable $X$ is defined as

$$M_X(\theta) = E(e^{\theta X}) = \begin{cases} \displaystyle\sum_x e^{\theta x} P(X = x) & \text{if } X \text{ is discrete} \\[1ex] \displaystyle\int_{-\infty}^{\infty} e^{\theta x} f_X(x)\,dx & \text{if } X \text{ is continuous} \end{cases} \qquad (6.1)$$

for all real $\theta$ for which the sum or integral converges absolutely. In some cases the existence of $M_X(\theta)$ can be a problem for non-zero $\theta$: henceforth we assume that $M_X(\theta)$ exists in some neighbourhood of the origin, $|\theta| < \theta_0$. In this case the following can be proved:

(i) There is a unique distribution with MGF $M_X(\theta)$.
(ii) Moments about the origin may be found by power series expansion: thus we may write

$$M_X(\theta) = E(e^{\theta X}) = E\left[\sum_{r=0}^{\infty} \frac{(\theta X)^r}{r!}\right] = \sum_{r=0}^{\infty} \frac{\theta^r}{r!}\,E(X^r) \quad \text{[i.e. interchange of } E \text{ and } \textstyle\sum \text{ valid]},$$

i.e.

$$M_X(\theta) = \sum_{r=0}^{\infty} \mu'_r\,\frac{\theta^r}{r!} \quad \text{where } \mu'_r = E(X^r). \qquad (6.2)$$

So, given a function which is known to be the MGF of a r.v. $X$, expansion of this function in a power series of $\theta$ gives $\mu'_r$, the $r$th moment about the origin, as the coefficient of $\theta^r/r!$.
(iii) Moments about the origin may also be found by differentiation: thus

$$\frac{d^r}{d\theta^r}\{M_X(\theta)\} = \frac{d^r}{d\theta^r}\left[E(e^{\theta X})\right] = E\left[\frac{d^r}{d\theta^r}(e^{\theta X})\right] \quad \text{(i.e. interchange of } E \text{ and differentiation valid)} = E\left[X^r e^{\theta X}\right].$$

So

$$\left[\frac{d^r}{d\theta^r}\{M_X(\theta)\}\right]_{\theta=0} = E(X^r) = \mu'_r. \qquad (6.3)$$
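For illustration (an added sketch, not part of the original notes), properties (ii) and (iii) can be checked symbolically. The snippet below assumes the exponential($\lambda$) distribution, whose MGF is $M_X(\theta) = \lambda/(\lambda - \theta)$ for $\theta < \lambda$, and recovers $E(X^r) = r!/\lambda^r$ by differentiating at $\theta = 0$:

```python
import sympy as sp

theta = sp.symbols('theta')
lam = sp.symbols('lambda', positive=True)

# Assumed example: MGF of the exponential(lambda) distribution,
# valid for theta < lambda.
M = lam / (lam - theta)

# Property (iii): the r-th moment about the origin is the r-th
# derivative of the MGF evaluated at theta = 0.
for r in range(1, 5):
    moment = sp.simplify(sp.diff(M, theta, r).subs(theta, 0))
    print(f"E(X^{r}) =", moment)   # prints r!/lambda^r
```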
(iv) If we require moments about the mean, $\mu_r = E[(X - \mu)^r]$, we consider $M_{X-\mu}(\theta)$, which can be obtained from $M_X(\theta)$ as follows:

$$M_{X-\mu}(\theta) = E\left[e^{\theta(X-\mu)}\right] = e^{-\mu\theta}\,E(e^{\theta X}) = e^{-\mu\theta} M_X(\theta). \qquad (6.4)$$

Then $\mu_r$ can be obtained as the coefficient of $\frac{\theta^r}{r!}$ in the expansion

$$M_{X-\mu}(\theta) = \sum_{r=0}^{\infty} \mu_r\,\frac{\theta^r}{r!} \qquad (6.5)$$

or by differentiation:

$$\mu_r = \left[\frac{d^r}{d\theta^r}\{M_{X-\mu}(\theta)\}\right]_{\theta=0}. \qquad (6.6)$$

(v) More generally:

$$M_{a+bX}(\theta) = E\left[e^{\theta(a+bX)}\right] = e^{a\theta} M_X(b\theta). \qquad (6.7)$$
Example
Find the MGF of the $N(0, 1)$ distribution and hence of $N(\mu, \sigma^2)$. Find the moments about the mean of $N(\mu, \sigma^2)$.
Solution If $Z \sim N(0, 1)$,

$$\begin{aligned} M_Z(\theta) = E(e^{\theta Z}) &= \int_{-\infty}^{\infty} e^{\theta z}\,\frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}z^2}\,dz \\ &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp\{-\tfrac{1}{2}(z^2 - 2\theta z + \theta^2) + \tfrac{1}{2}\theta^2\}\,dz \\ &= \exp(\tfrac{1}{2}\theta^2)\,\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp\{-\tfrac{1}{2}(z - \theta)^2\}\,dz. \end{aligned}$$

But here $\frac{1}{\sqrt{2\pi}}\exp\{\dots\}$ is the p.d.f. of $N(\theta, 1)$, so

$$M_Z(\theta) = \exp(\tfrac{1}{2}\theta^2). \qquad (6.8)$$

If $X = \mu + \sigma Z$, then $X \sim N(\mu, \sigma^2)$, and

$$M_X(\theta) = M_{\mu + \sigma Z}(\theta) = e^{\mu\theta} M_Z(\sigma\theta) \ \text{ by (6.7)} \ = \exp(\mu\theta + \tfrac{1}{2}\sigma^2\theta^2).$$
Then

$$M_{X-\mu}(\theta) = e^{-\mu\theta} M_X(\theta) = \exp(\tfrac{1}{2}\sigma^2\theta^2) = \sum_{r=0}^{\infty} \frac{(\tfrac{1}{2}\sigma^2\theta^2)^r}{r!} = \sum_{r=0}^{\infty} \frac{\sigma^{2r}}{2^r r!}\,\theta^{2r} = \sum_{r=0}^{\infty} \frac{\sigma^{2r}}{2^r}\cdot\frac{(2r)!}{r!}\cdot\frac{\theta^{2r}}{(2r)!}.$$

Using property (iv) above, we obtain

$$\mu_{2r+1} = 0, \quad r = 1, 2, \dots; \qquad \mu_{2r} = \frac{\sigma^{2r}(2r)!}{2^r r!}, \quad r = 0, 1, 2, \dots \qquad (6.9)$$

e.g. $\mu_2 = \sigma^2$; $\mu_4 = 3\sigma^4$.
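As a quick symbolic check of (6.9) (an added sketch, assuming the SymPy library), differentiating $M_{X-\mu}(\theta) = \exp(\tfrac{1}{2}\sigma^2\theta^2)$ at $\theta = 0$, as in (6.6), reproduces the vanishing odd central moments and $\mu_2 = \sigma^2$, $\mu_4 = 3\sigma^4$:

```python
import sympy as sp

theta = sp.symbols('theta')
sigma = sp.symbols('sigma', positive=True)

# MGF of X - mu for X ~ N(mu, sigma^2), from (6.4) and (6.8).
M_central = sp.exp(sp.Rational(1, 2) * sigma**2 * theta**2)

# Central moments via (6.6): mu_r = [d^r/dtheta^r M_{X-mu}(theta)]_{theta=0}.
for r in range(1, 7):
    mu_r = sp.simplify(sp.diff(M_central, theta, r).subs(theta, 0))
    print(f"mu_{r} =", mu_r)
# Output: 0, sigma**2, 0, 3*sigma**4, 0, 15*sigma**6 -- matching (6.9).
```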
6.2 Sum of independent variables
Theorem
Let $X, Y$ be independent r.v.s with MGFs $M_X(\theta)$, $M_Y(\theta)$ respectively. Then

$$M_{X+Y}(\theta) = M_X(\theta) \cdot M_Y(\theta). \qquad (6.10)$$
Proof

$$M_{X+Y}(\theta) = E\left[e^{\theta(X+Y)}\right] = E\left[e^{\theta X} \cdot e^{\theta Y}\right] = E(e^{\theta X}) \cdot E(e^{\theta Y}) \ \text{[independence]} \ = M_X(\theta) \cdot M_Y(\theta).$$
Corollary If $X_1, X_2, \dots, X_n$ are independent r.v.s,

$$M_{X_1 + X_2 + \cdots + X_n}(\theta) = M_{X_1}(\theta) \cdot M_{X_2}(\theta) \cdots M_{X_n}(\theta). \qquad (6.11)$$
Note: If $X$ is a count r.v. with PGF $G_X(s)$ and MGF $M_X(\theta)$,

$$M_X(\theta) = G_X(e^{\theta}); \quad G_X(s) = M_X(\log s). \qquad (6.12)$$
Here the PGF is generally preferred, so we shall concentrate on the MGF applied to continuous
r.v.s.
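To illustrate (6.12) (an added sketch, not from the notes), the Poisson($\lambda$) PGF $G_X(s) = e^{\lambda(s-1)}$ of (3.7) converts to the corresponding MGF and back under the stated substitutions:

```python
import sympy as sp

theta = sp.symbols('theta')
s, lam = sp.symbols('s lambda', positive=True)

# Poisson(lambda) PGF, as in (3.7).
G = sp.exp(lam * (s - 1))

# (6.12): M_X(theta) = G_X(e^theta) ...
M = G.subs(s, sp.exp(theta))
print(M)                                        # exp(lambda*(exp(theta) - 1))

# ... and G_X(s) = M_X(log s) recovers the PGF.
print(sp.simplify(M.subs(theta, sp.log(s))))    # exp(lambda*(s - 1))
```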
Example
Let $Z_1, \dots, Z_n$ be independent $N(0, 1)$ r.v.s. Show that

$$V = Z_1^2 + \cdots + Z_n^2 \sim \chi^2_n. \qquad (6.13)$$
Solution Let $Z \sim N(0, 1)$. Then

$$M_{Z^2}(\theta) = E\left[e^{\theta Z^2}\right] = \int_{-\infty}^{\infty} e^{\theta z^2}\,\frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}z^2}\,dz = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\{-\tfrac{1}{2}(1 - 2\theta)z^2\}\,dz.$$

Assuming $\theta < \tfrac{1}{2}$, substitute $y = \sqrt{1 - 2\theta}\,z$. Then

$$M_{Z^2}(\theta) = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}y^2} \cdot \frac{1}{\sqrt{1 - 2\theta}}\,dy = (1 - 2\theta)^{-\frac{1}{2}}, \quad \theta < \tfrac{1}{2}. \qquad (6.14)$$
Hence

$$M_V(\theta) = (1 - 2\theta)^{-\frac{1}{2}} \cdot (1 - 2\theta)^{-\frac{1}{2}} \cdots (1 - 2\theta)^{-\frac{1}{2}} = (1 - 2\theta)^{-n/2}, \quad \theta < \tfrac{1}{2}.$$
Now $\chi^2_n$ has the p.d.f.

$$\frac{1}{2^{n/2}\,\Gamma(\tfrac{n}{2})}\,w^{\frac{n}{2}-1} e^{-\frac{1}{2}w}, \quad w \geq 0; \ n \text{ a positive integer.}$$
Its MGF is

$$\int_0^{\infty} e^{\theta w}\,\frac{1}{2^{n/2}\,\Gamma(\tfrac{n}{2})}\,w^{\frac{n}{2}-1} e^{-\frac{1}{2}w}\,dw = \int_0^{\infty} \frac{1}{2^{n/2}\,\Gamma(\tfrac{n}{2})}\,w^{\frac{n}{2}-1} \exp\{-\tfrac{1}{2}w(1 - 2\theta)\}\,dw$$

(substituting $t = \tfrac{1}{2}w(1 - 2\theta)$, $\theta < \tfrac{1}{2}$)

$$= (1 - 2\theta)^{-\frac{n}{2}}\,\frac{1}{\Gamma(\tfrac{n}{2})} \int_0^{\infty} t^{\frac{n}{2}-1} e^{-t}\,dt = (1 - 2\theta)^{-\frac{n}{2}}, \quad \theta < \tfrac{1}{2},$$

which equals $M_V(\theta)$.
So we deduce that $V \sim \chi^2_n$. Also, from $M_{Z^2}(\theta)$ we deduce that $Z^2 \sim \chi^2_1$.
If $V_1 \sim \chi^2_{n_1}$, $V_2 \sim \chi^2_{n_2}$ and $V_1, V_2$ are independent, then

$$M_{V_1 + V_2}(\theta) = M_{V_1}(\theta) \cdot M_{V_2}(\theta) = (1 - 2\theta)^{-\frac{n_1}{2}} (1 - 2\theta)^{-\frac{n_2}{2}} \quad (\theta < \tfrac{1}{2}) \quad = (1 - 2\theta)^{-(n_1 + n_2)/2}.$$

So $V_1 + V_2 \sim \chi^2_{n_1 + n_2}$. [This was also shown in Example 3, §5.8.2.]
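The χ² result can also be seen by simulation (an added sketch; the choices of $n$ and sample size are arbitrary): sums of $n$ squared independent $N(0, 1)$ draws should match the quantiles of $\chi^2_n$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, reps = 5, 100_000

# V = Z_1^2 + ... + Z_n^2 for independent standard normals, as in (6.13).
v = (rng.standard_normal((reps, n)) ** 2).sum(axis=1)

# Compare empirical quantiles with the chi-squared_n quantiles.
for q in (0.25, 0.50, 0.90):
    print(q, round(np.quantile(v, q), 3), round(stats.chi2.ppf(q, df=n), 3))
```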
6.3 Bivariate MGF
The bivariate MGF (or joint MGF) of the continuous r.v.s $(X, Y)$ with joint p.d.f. $f(x, y)$, $-\infty < x, y < \infty$, is defined as

$$M_{X,Y}(\theta_1, \theta_2) = E\left[e^{\theta_1 X + \theta_2 Y}\right] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{\theta_1 x + \theta_2 y} f(x, y)\,dx\,dy, \qquad (6.15)$$
provided the integral converges absolutely (there is a similar definition for the discrete case). If $M_{X,Y}(\theta_1, \theta_2)$ exists near the origin, for $|\theta_1| < \theta_{10}$, $|\theta_2| < \theta_{20}$ say, then it can be shown that

$$\left[\frac{\partial^{r+s} M_{X,Y}(\theta_1, \theta_2)}{\partial\theta_1^r\,\partial\theta_2^s}\right]_{\theta_1 = \theta_2 = 0} = E(X^r Y^s). \qquad (6.16)$$
The bivariate MGF can also be used to find the MGF of $aX + bY$, since

$$M_{aX+bY}(\theta) = E\left[e^{\theta(aX+bY)}\right] = E\left[e^{(a\theta)X + (b\theta)Y}\right] = M_{X,Y}(a\theta, b\theta). \qquad (6.17)$$
Example Bivariate Normal distribution
Using MGFs:
(i) show that if $(U, V) \sim N(0, 0; 1, 1; \rho)$, then $\rho(U, V) = \rho$, and deduce $\rho(X, Y)$, where $(X, Y) \sim N(\mu_x, \mu_y; \sigma_x^2, \sigma_y^2; \rho)$;
(ii) for the variables $(X, Y)$ in (i), find the distribution of a linear combination $aX + bY$, and generalise the result obtained to the multivariate Normal case.
Solution
(i) We have

$$\begin{aligned} M_{U,V}(\theta_1, \theta_2) &= E(e^{\theta_1 U + \theta_2 V}) \\ &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{\theta_1 u + \theta_2 v}\,\frac{1}{2\pi\sqrt{1-\rho^2}} \exp\left\{-\frac{1}{2(1-\rho^2)}\left[u^2 - 2\rho uv + v^2\right]\right\} du\,dv \\ &= \frac{1}{2\pi\sqrt{1-\rho^2}} \int\!\!\int \exp\{\dots\}\,du\,dv \\ &= \dots = \exp\{\tfrac{1}{2}(\theta_1^2 + 2\rho\theta_1\theta_2 + \theta_2^2)\}. \end{aligned}$$
Then

$$\frac{\partial M_{U,V}(\theta_1, \theta_2)}{\partial\theta_1} = \exp\{\dots\}\,(\theta_1 + \rho\theta_2),$$

$$\frac{\partial^2 M_{U,V}(\theta_1, \theta_2)}{\partial\theta_2\,\partial\theta_1} = \exp\{\dots\}\,(\theta_1 + \rho\theta_2)(\rho\theta_1 + \theta_2) + \rho\exp\{\dots\}.$$

So

$$E(UV) = \left[\frac{\partial^2 M_{U,V}(\theta_1, \theta_2)}{\partial\theta_2\,\partial\theta_1}\right]_{\theta_1 = \theta_2 = 0} = \rho.$$
Since $E(U) = E(V) = 0$ and $\text{Var}(U) = \text{Var}(V) = 1$, we have that the correlation coefficient of $U, V$ is

$$\rho(U, V) = \frac{\text{Cov}(U, V)}{\sqrt{\text{Var}(U)\cdot\text{Var}(V)}} = \frac{E(UV) - E(U)E(V)}{1} = \rho.$$
Now let

$$X = \mu_x + \sigma_x U, \quad Y = \mu_y + \sigma_y V.$$

Then, as we have seen in Example 1, §5.8.2,

$$(U, V) \sim N(0, 0; 1, 1; \rho) \implies (X, Y) \sim N(\mu_x, \mu_y; \sigma_x^2, \sigma_y^2; \rho).$$

It is readily shown that a correlation coefficient remains unchanged under a linear transformation of variables, so $\rho(X, Y) = \rho(U, V) = \rho$.
(ii) We have that

$$M_{X,Y}(\theta_1, \theta_2) = E\left[e^{\theta_1(\mu_x + \sigma_x U) + \theta_2(\mu_y + \sigma_y V)}\right] = e^{(\theta_1\mu_x + \theta_2\mu_y)}\,M_{U,V}(\theta_1\sigma_x, \theta_2\sigma_y)$$

$$= \exp\{(\theta_1\mu_x + \theta_2\mu_y) + \tfrac{1}{2}(\theta_1^2\sigma_x^2 + 2\rho\theta_1\theta_2\sigma_x\sigma_y + \theta_2^2\sigma_y^2)\}.$$
So, for a linear combination of $X$ and $Y$,

$$M_{aX+bY}(\theta) = M_{X,Y}(a\theta, b\theta) = \exp\{(a\mu_x + b\mu_y)\theta + \tfrac{1}{2}(a^2\sigma_x^2 + 2ab\,\text{Cov}(X, Y) + b^2\sigma_y^2)\theta^2\},$$

which is the MGF of $N(a\mu_x + b\mu_y,\ a^2\sigma_x^2 + 2ab\,\text{Cov}(X, Y) + b^2\sigma_y^2)$, i.e.

$$aX + bY \sim N(aE(X) + bE(Y),\ a^2\text{Var}(X) + 2ab\,\text{Cov}(X, Y) + b^2\text{Var}(Y)). \qquad (6.18)$$
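The result (6.18) is easy to check by simulation (an added sketch; the parameter values below are arbitrary choices, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(1)
mu = [1.0, -2.0]
cov = [[4.0, 1.5],        # Var(X) = 4, Cov(X, Y) = 1.5
       [1.5, 9.0]]        # Var(Y) = 9
a, b = 2.0, 3.0

xy = rng.multivariate_normal(mu, cov, size=200_000)
w = a * xy[:, 0] + b * xy[:, 1]

# Empirical mean and variance vs the values predicted by (6.18).
print(w.mean(), a * mu[0] + b * mu[1])                                       # ~ -4
print(w.var(), a**2 * cov[0][0] + 2 * a * b * cov[0][1] + b**2 * cov[1][1])  # ~ 115
```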
More generally, let $(X_1, \dots, X_n)$ be multivariate normally distributed. Then, by induction,

$$\sum_{i=1}^{n} a_i X_i \sim N\left(\sum_{i=1}^{n} a_i E(X_i),\ \sum_{i=1}^{n} a_i^2\,\text{Var}(X_i) + 2\sum_{i<j} a_i a_j\,\text{Cov}(X_i, X_j)\right). \qquad (6.19)$$
(If the Xs are also independent, the covariance terms vanish but then there is a simpler
derivation (see HW 8).)
6.4 Sequences of r.v.s
6.4.1 Continuity theorem
First we state (without proof) the following:
Theorem
Let $X_1, X_2, \dots$ be a sequence of r.v.s (discrete or continuous) with c.d.f.s $F_{X_1}(x), F_{X_2}(x), \dots$ and MGFs $M_{X_1}(\theta), M_{X_2}(\theta), \dots$, and suppose that, as $n \to \infty$,

$$M_{X_n}(\theta) \to M_X(\theta) \quad \text{for all } \theta,$$

where $M_X(\theta)$ is the MGF of some r.v. $X$ with c.d.f. $F_X(x)$. Then

$$F_{X_n}(x) \to F_X(x) \quad \text{as } n \to \infty$$

at each $x$ where $F_X(x)$ is continuous.
Example
Using MGFs, discuss the limit of $\text{Bin}(n, p)$ as $n \to \infty$, $p \to 0$ with $np = \lambda > 0$ fixed.
Solution Let $X_n \sim \text{Bin}(n, p)$, with PGF $G_{X_n}(s) = (ps + q)^n$. Then

$$M_{X_n}(\theta) = G_{X_n}(e^{\theta}) = (pe^{\theta} + q)^n = \left\{1 + \frac{\lambda}{n}(e^{\theta} - 1)\right\}^n \quad \text{where } \lambda = np.$$

Let $n \to \infty$, $p \to 0$ in such a way that $\lambda$ remains fixed. Then

$$M_{X_n}(\theta) \to \exp\{\lambda(e^{\theta} - 1)\} \quad \text{as } n \to \infty,$$

since

$$\left(1 + \frac{a}{n}\right)^n \to e^a \quad \text{as } n \to \infty, \ a \text{ constant}, \qquad (6.20)$$

i.e.

$$M_{X_n}(\theta) \to \text{MGF of Poisson}(\lambda) \qquad (6.21)$$

(use (6.12), replacing $s$ by $e^{\theta}$ in the Poisson PGF (3.7)). So, invoking the above continuity theorem,

$$\text{Bin}(n, p) \to \text{Poisson}(\lambda) \qquad (6.22)$$

as $n \to \infty$, $p \to 0$ with $np = \lambda > 0$ fixed. Hence in large samples, the binomial distribution can be approximated by the Poisson distribution. As a rule of thumb: the approximation is acceptable when $n$ is large, $p$ small, and $\lambda = np \leq 5$.
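Numerically (an added sketch; the parameter values are arbitrary), the $\text{Bin}(n, p)$ and $\text{Poisson}(np)$ pmfs are already very close in this regime:

```python
import numpy as np
from scipy import stats

n, p = 1000, 0.003          # lambda = np = 3: n large, p small
k = np.arange(15)

binom_pmf = stats.binom.pmf(k, n, p)
pois_pmf = stats.poisson.pmf(k, n * p)

# Largest pointwise discrepancy between the two pmfs.
print(np.abs(binom_pmf - pois_pmf).max())   # of order 1e-4
```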
6.4.2 Asymptotic normality
Let $\{X_n\}$ be a sequence of r.v.s (discrete or continuous). If two quantities $a$ and $b$ can be found such that

$$\text{c.d.f. of } \frac{(X_n - a)}{b} \to \text{c.d.f. of } N(0, 1) \quad \text{as } n \to \infty, \qquad (6.23)$$

$X_n$ is said to be asymptotically normally distributed with mean $a$ and variance $b^2$, and we write

$$\frac{X_n - a}{b} \xrightarrow{a} N(0, 1) \quad \text{or} \quad X_n \stackrel{a}{\sim} N(a, b^2). \qquad (6.24)$$

Notes: (i) $a$ and $b$ need not be functions of $n$; but often $a$ and $b^2$ are the mean and variance of $X_n$ (and so are functions of $n$).
(ii) In large samples we use $N(a, b^2)$ as an approximation to the distribution of $X_n$.
6.5 Central limit theorem
A restricted form of this celebrated theorem will now be stated and proved.
Theorem
Let $X_1, X_2, \dots$ be a sequence of independent identically distributed r.v.s, each with mean $\mu$ and variance $\sigma^2$. Let

$$S_n = X_1 + X_2 + \cdots + X_n, \quad Z_n = \frac{(S_n - n\mu)}{\sigma\sqrt{n}}.$$

Then

$$Z_n \xrightarrow{a} N(0, 1), \quad \text{i.e. } P(Z_n \leq z) \to P(Z \leq z) \text{ as } n \to \infty, \text{ where } Z \sim N(0, 1),$$

and $S_n \stackrel{a}{\sim} N(n\mu, n\sigma^2)$.
Proof Let $Y_i = X_i - \mu$ ($i = 1, 2, \dots$). Then $Y_1, Y_2, \dots$ are i.i.d. r.v.s, and

$$S_n - n\mu = X_1 + \cdots + X_n - n\mu = Y_1 + \cdots + Y_n.$$

So

$$M_{S_n - n\mu}(\theta) = M_{Y_1}(\theta) \cdot M_{Y_2}(\theta) \cdots M_{Y_n}(\theta) = \{M_Y(\theta)\}^n,$$
and

$$M_{Z_n}(\theta) = M_{\frac{S_n - n\mu}{\sigma\sqrt{n}}}(\theta) = E\left[\exp\left\{\frac{S_n - n\mu}{\sigma\sqrt{n}}\,\theta\right\}\right] = E\left[\exp\left\{(S_n - n\mu)\left(\frac{\theta}{\sigma\sqrt{n}}\right)\right\}\right] = M_{S_n - n\mu}\left(\frac{\theta}{\sigma\sqrt{n}}\right) = \left\{M_Y\left(\frac{\theta}{\sigma\sqrt{n}}\right)\right\}^n.$$

Note that

$$E(Y) = E(X - \mu) = 0; \quad E(Y^2) = E\{(X - \mu)^2\} = \sigma^2.$$
Then

$$M_Y(\theta) = 1 + E(Y)\,\frac{\theta}{1!} + E(Y^2)\,\frac{\theta^2}{2!} + E(Y^3)\,\frac{\theta^3}{3!} + \cdots = 1 + \tfrac{1}{2}\sigma^2\theta^2 + o(\theta^2)$$
(where $o(\theta^2)$ denotes a function $g(\theta)$ such that $\frac{g(\theta)}{\theta^2} \to 0$ as $\theta \to 0$). So

$$M_{Z_n}(\theta) = \left\{1 + \tfrac{1}{2}\sigma^2\left(\frac{\theta^2}{n\sigma^2}\right) + o\left(\tfrac{1}{n}\right)\right\}^n = \left\{1 + \tfrac{1}{2}\theta^2 \cdot \tfrac{1}{n} + o\left(\tfrac{1}{n}\right)\right\}^n$$

(where $o(\tfrac{1}{n})$ denotes a function $h(n)$ such that $\frac{h(n)}{1/n} \to 0$ as $n \to \infty$).

Using the standard result (6.20), we deduce that

$$M_{Z_n}(\theta) \to \exp(\tfrac{1}{2}\theta^2) \quad \text{as } n \to \infty,$$

which is the MGF of $N(0, 1)$. So

$$\text{c.d.f. of } Z_n = \frac{S_n - n\mu}{\sigma\sqrt{n}} \to \text{c.d.f. of } N(0, 1) \quad \text{as } n \to \infty,$$

i.e.

$$Z_n \xrightarrow{a} N(0, 1) \quad \text{or} \quad S_n \stackrel{a}{\sim} N(n\mu, n\sigma^2). \qquad (6.25)$$

$\blacksquare$
Corollary
Let $\bar{X}_n = \dfrac{1}{n}\displaystyle\sum_{i=1}^{n} X_i$. Then

$$\bar{X}_n \stackrel{a}{\sim} N\left(\mu, \frac{\sigma^2}{n}\right). \qquad (6.26)$$
Proof $\bar{X}_n = W_1 + \cdots + W_n$, where $W_i = \frac{1}{n}X_i$ and $W_1, \dots, W_n$ are i.i.d. with mean $\frac{\mu}{n}$ and variance $\frac{\sigma^2}{n^2}$. So

$$\bar{X}_n \stackrel{a}{\sim} N\left(n \cdot \frac{\mu}{n},\ n \cdot \frac{\sigma^2}{n^2}\right) = N\left(\mu, \frac{\sigma^2}{n}\right).$$
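A simulation sketch of the corollary (added here; exponential(1), for which $\mu = \sigma = 1$, is an arbitrary choice): the standardised sample mean is compared with the $N(0, 1)$ c.d.f. at a few points.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, reps = 50, 100_000

# Sample means of n i.i.d. exponential(1) variables (mu = sigma = 1).
xbar = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)
z = (xbar - 1.0) / (1.0 / np.sqrt(n))      # (mean - mu) / (sigma / sqrt(n))

# Empirical P(Z_n <= z) vs the N(0, 1) c.d.f.
for q in (-1.645, 0.0, 1.645):
    print(q, round((z <= q).mean(), 4), round(stats.norm.cdf(q), 4))
```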
(Note: The theorem can be generalised to (a) independent r.v.s with different means and variances, and (b) dependent r.v.s, but extra conditions on the distributions are required.)
Example 1
Using the central limit theorem, obtain an approximation to Bin(n, p) for large n.
Solution Let $S_n \sim \text{Bin}(n, p)$. Then

$$S_n = X_1 + X_2 + \cdots + X_n,$$

where

$$X_i = \begin{cases} 1, & \text{if the } i\text{th trial yields a success} \\ 0, & \text{if the } i\text{th trial yields a failure.} \end{cases}$$

Also, $X_1, X_2, \dots, X_n$ are independent r.v.s with

$$E(X_i) = p, \quad \text{Var}(X_i) = pq.$$

So

$$S_n \stackrel{a}{\sim} N(np, npq),$$

i.e., for large $n$, the binomial c.d.f. is approximated by the c.d.f. of $N(np, npq)$.
[As a rule of thumb: the approximation is acceptable when $n$ is large and $p \approx \tfrac{1}{2}$, such that $np > 5$.]
Example 2
As Example 1, but for the $\chi^2_n$ distribution.
Solution Let $V_n \sim \chi^2_n$. Then we can write

$$V_n = Z_1^2 + \cdots + Z_n^2,$$

where $Z_1^2, \dots, Z_n^2$ are independent r.v.s and

$$Z_i \sim N(0, 1), \quad Z_i^2 \sim \chi^2_1; \quad E(Z_i^2) = 1, \quad \text{Var}(Z_i^2) = 2.$$

So

$$V_n \stackrel{a}{\sim} N(n, 2n).$$
Note: These are not necessarily the best approximations for large $n$. Thus

(i)

$$P(S_n \leq s) \approx P\left(Z \leq \frac{s + \tfrac{1}{2} - np}{\sqrt{npq}}\right) = \Phi\left(\frac{s + \tfrac{1}{2} - np}{\sqrt{npq}}\right), \quad \text{where } Z \sim N(0, 1).$$

The $\tfrac{1}{2}$ is a continuity correction, to take account of the fact that we are approximating a discrete distribution by a continuous one.

(ii)

$$\sqrt{2V_n} \overset{\text{approx}}{\sim} N(\sqrt{2n - 1},\ 1).$$
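The effect of the continuity correction in (i) can be seen numerically (an added sketch; parameter values are arbitrary):

```python
import numpy as np
from scipy import stats

n, p, s = 40, 0.5, 23
mu, sd = n * p, np.sqrt(n * p * (1 - p))

exact = stats.binom.cdf(s, n, p)                 # exact P(S_n <= s)
plain = stats.norm.cdf((s - mu) / sd)            # normal approx, no correction
corrected = stats.norm.cdf((s + 0.5 - mu) / sd)  # with the +1/2 correction

print(exact, plain, corrected)   # the corrected value lies closer to exact
```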
6.6 Characteristic function
The MGF does not exist unless all the moments of the distribution are finite, so many distributions (e.g. t, F) do not have MGFs. Another generating function is therefore often used.
The characteristic function of a continuous r.v. $X$ is

$$C_X(\theta) = E(e^{i\theta X}) = \int_{-\infty}^{\infty} e^{i\theta x} f(x)\,dx, \qquad (6.27)$$

where $\theta$ is real and $i = \sqrt{-1}$. $C_X(\theta)$ always exists, and has similar properties to $M_X(\theta)$. The CF uniquely determines the p.d.f.:

$$f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} C_X(\theta)\,e^{-i\theta x}\,d\theta \qquad (6.28)$$
(cf. Fourier transform). The CF is particularly useful in studying limiting distributions. How-
ever, we do not consider the CF further in this module.