
Lecture 23: Variance and Covariance

1.) Expectations of Independent RVs


Proposition 4.1: If X and Y are independent, then for any real-valued functions h and g,
E[h(X)g(Y)] = E[h(X)] E[g(Y)],
provided that the expectations shown in this formula exist.
Proof: We will prove this under the assumption that X and Y are jointly continuous with joint
density p_{X,Y} = p_X p_Y. Then

E[h(X)g(Y)] = ∫∫ h(x)g(y) p_{X,Y}(x, y) dx dy
            = ∫∫ h(x)g(y) p_X(x) p_Y(y) dx dy
            = ( ∫ p_X(x)h(x) dx ) ( ∫ p_Y(y)g(y) dy )
            = E[h(X)] E[g(Y)].

Remark: Taking h(x) = x and g(y) = y gives the important result:


E[XY] = E[X] E[Y].
Generalization: If X_1, . . . , X_n are independent RVs, then a simple induction argument on n
shows that

E[ ∏_{i=1}^{n} f_i(X_i) ] = ∏_{i=1}^{n} E[f_i(X_i)],

again provided that all of the expectations appearing in the formula exist.
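As a quick numerical illustration (a sketch, not part of Ross's text), the product rule can be checked by Monte Carlo simulation in Python; the particular distributions and the test functions h and g below are arbitrary choices made only for this example.

import numpy as np

# Monte Carlo sketch of E[h(X)g(Y)] = E[h(X)] E[g(Y)] for independent X and Y.
# The distributions and the functions h and g are arbitrary choices for illustration.
rng = np.random.default_rng(0)
n = 1_000_000

X = rng.exponential(scale=2.0, size=n)        # X and Y are drawn independently
Y = rng.normal(loc=1.0, scale=3.0, size=n)

h = lambda x: np.sqrt(x)                      # real-valued functions whose expectations exist
g = lambda y: np.cos(y)

lhs = np.mean(h(X) * g(Y))                    # estimates E[h(X)g(Y)]
rhs = np.mean(h(X)) * np.mean(g(Y))           # estimates E[h(X)] E[g(Y)]
print(lhs, rhs)                               # the two estimates agree up to Monte Carlo error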
2.) Covariance:
Definition: If X and Y are RVs with means μ_X = E[X] and μ_Y = E[Y], then the covariance
between X and Y is the quantity

Cov(X, Y) = E[(X − μ_X)(Y − μ_Y)]
          = E[XY] − μ_X μ_Y.
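The two expressions in this definition can be compared numerically; the following short Python sketch does so on simulated data, with an arbitrary (assumed) joint distribution chosen only for illustration.

import numpy as np

# Sketch: the two formulas for Cov(X, Y) give the same answer on simulated data.
# The correlated pair below (true covariance 0.5) is an arbitrary choice for illustration.
rng = np.random.default_rng(1)
n = 1_000_000

X = rng.normal(size=n)
Y = 0.5 * X + rng.normal(size=n)                  # Y tends to be large when X is large

mu_X, mu_Y = X.mean(), Y.mean()
cov_centered = np.mean((X - mu_X) * (Y - mu_Y))   # E[(X - mu_X)(Y - mu_Y)]
cov_product  = np.mean(X * Y) - mu_X * mu_Y       # E[XY] - mu_X mu_Y
print(cov_centered, cov_product)                  # both close to the true value 0.5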
Remark: The covariance of two random variables is a measure of their association. The
covariance is positive if Y tends to be larger than μ_Y when X is larger than μ_X, while the
covariance is negative if the opposite pattern holds. In particular, if X and Y are
independent, then
Cov(X, Y) = E[X − μ_X] E[Y − μ_Y] = 0,
because knowledge of how X compares with its mean provides no information about Y or vice
versa. Notice that the converse is not true: there are random variables that have zero covariance
but which are not independent. For example, suppose that X has the distribution
P{X = 0} = P{X = 1} = P{X = −1} = 1/3,
and that the random variable Y is defined by

Y = 0 if X ≠ 0,
Y = 1 if X = 0.

Then E[X] = 0 and also E[XY ] = 0 because XY = 0 with probability 1, so that


Cov(X, Y) = E[XY] − E[X] E[Y] = 0,
but X and Y are clearly not independent. For example,
P{X = 0 | Y = 1} = 1 ≠ 1/3 = P{X = 0}.
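Because the example involves only three outcomes, it can be checked by exact enumeration; the following Python sketch computes the covariance and the relevant probabilities directly from the definitions above.

# Exact check of the example: Cov(X, Y) = 0 even though X and Y are dependent.
# X takes the values -1, 0, 1 with probability 1/3 each, and Y = 1 if X = 0, else Y = 0.
p = {-1: 1/3, 0: 1/3, 1: 1/3}
y = lambda x: 1 if x == 0 else 0

E_X  = sum(x * p[x] for x in p)
E_Y  = sum(y(x) * p[x] for x in p)
E_XY = sum(x * y(x) * p[x] for x in p)

print(E_XY - E_X * E_Y)          # covariance: 0.0
print(1, "vs", p[0])             # P{X = 0 | Y = 1} = 1, but P{X = 0} = 1/3, so not independent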

3.) Properties of Covariances
Proposition 4.2 (Ross).
(a) Cov(X, Y) = Cov(Y, X)
(b) Cov(X, X) = Var(X)
(c) Cov(aX, Y) = a Cov(X, Y)

(d) Cov( Σ_{i=1}^{n} X_i , Σ_{j=1}^{m} Y_j ) = Σ_{i=1}^{n} Σ_{j=1}^{m} Cov(X_i, Y_j).

Each identity can be proved directly from the definition of the covariance and the linearity
properties of expectations. (See Ross pp. 323-324.)
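Part (d), the bilinearity of the covariance, also holds exactly for sample covariances computed with a common normalization, which gives a simple numerical check; the Python sketch below uses arbitrary simulated data chosen only for illustration.

import numpy as np

# Sketch: part (d) also holds exactly for sample covariances computed with a common
# normalization, because the sample covariance is bilinear. The data are arbitrary.
rng = np.random.default_rng(2)
n, m, n_samples = 3, 4, 1000

X = rng.normal(size=(n, n_samples))                 # rows play the role of X_1, ..., X_n
Y = rng.normal(size=(m, n_samples)) + X[0]          # rows Y_1, ..., Y_m, correlated with X_1

def scov(a, b):
    return np.mean((a - a.mean()) * (b - b.mean()))

lhs = scov(X.sum(axis=0), Y.sum(axis=0))
rhs = sum(scov(X[i], Y[j]) for i in range(n) for j in range(m))
print(lhs, rhs)                                     # equal up to floating-point round-off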
Another useful formula follows from parts (b) and (d) of the preceding proposition:
Var( Σ_{i=1}^{n} X_i ) = Σ_{i=1}^{n} Var(X_i) + Σ_{1 ≤ i ≠ j ≤ n} Cov(X_i, X_j).
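As with part (d), this identity is exact for sample variances and sample covariances computed with the same normalization, so it can be verified numerically; the correlated data in the Python sketch below are an arbitrary choice.

import numpy as np

# Sketch: Var(X_1 + ... + X_n) equals the sum of the variances plus the sum of the
# covariances over i != j. The correlated data below are arbitrary choices.
rng = np.random.default_rng(3)
n, n_samples = 4, 1000

common = rng.normal(size=n_samples)
X = rng.normal(size=(n, n_samples)) + common        # a shared component makes the rows correlated

def scov(a, b):
    return np.mean((a - a.mean()) * (b - b.mean()))

lhs = X.sum(axis=0).var()
rhs = sum(scov(X[i], X[i]) for i in range(n)) \
    + sum(scov(X[i], X[j]) for i in range(n) for j in range(n) if i != j)
print(lhs, rhs)                                     # equal up to floating-point round-off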

In particular, if the X_i's are pairwise independent, then Cov(X_i, X_j) = 0 whenever i ≠ j, so
that

Var( Σ_{i=1}^{n} X_i ) = Σ_{i=1}^{n} Var(X_i).

If, in addition to being pairwise independent, the X_i's are identically distributed, say with
variance Var(X_i) = σ², then

Var( Σ_{i=1}^{n} X_i ) = n σ².

Example (Ross, 4b): If X_1, . . . , X_n are independent Bernoulli RVs, each with parameter p,
then X = X_1 + · · · + X_n is a Binomial RV with parameters (n, p). Because the variance of each
Bernoulli RV is p(1 − p), the preceding formula for the variance of a sum of IID RVs shows that

Var(X) = np(1 − p),
confirming our previous calculation using probability generating functions.
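This can also be seen in a short simulation; the Python sketch below sums independent Bernoulli trials and compares the sample variance of the resulting Binomial values with np(1 − p). The values of n and p are arbitrary choices for illustration.

import numpy as np

# Sketch: the sample variance of many simulated Binomial(n, p) values is close to np(1 - p).
rng = np.random.default_rng(4)
n, p, reps = 20, 0.3, 500_000

bernoullis = rng.random(size=(reps, n)) < p         # reps rows of n independent Bernoulli(p) trials
X = bernoullis.sum(axis=1)                          # each row sum is Binomial(n, p)

print(X.var())                                      # empirical variance
print(n * p * (1 - p))                              # theoretical value np(1 - p) = 4.2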

4.) Statistical Applications


Example (Ross, 4a): If X_1, . . . , X_n is a collection of IID (independent, identically-distributed)
random variables with mean μ and variance σ², then the sample mean X̄ and the sample
variance S² are the quantities defined by

X̄ = (1/n) Σ_{i=1}^{n} X_i,        S² = (1/(n − 1)) Σ_{i=1}^{n} (X_i − X̄)².

If we think of the X_i's as the outcomes of a sequence of independent replicates of some experiment, then the sample mean and sample variance are often used to estimate the true mean and
true variance of the underlying distribution. Recall that we previously showed that the sample
mean is an unbiased estimator for the true mean:
E[X̄] = μ.
Although unbiasedness is a desirable property of an estimator, we also care about the variance
of the estimator about its expectation. For the sample mean, we can calculate
Var(X̄) = Var( (1/n) Σ_{i=1}^{n} X_i )
        = (1/n)² Σ_{i=1}^{n} Var(X_i)
        = σ²/n,

which shows that the variance of the sample mean is inversely proportional to the number of
independent replicates that have been performed.
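A simulation makes the 1/n scaling of Var(X̄) concrete; the Python sketch below uses an Exponential distribution with variance 4 as an arbitrary choice of underlying distribution.

import numpy as np

# Sketch: the variance of the sample mean shrinks like sigma^2 / n.
rng = np.random.default_rng(5)
sigma2, reps = 4.0, 200_000

for n in (5, 20, 80):
    samples = rng.exponential(scale=2.0, size=(reps, n))   # each row is one sample of size n
    sample_means = samples.mean(axis=1)
    print(n, sample_means.var(), sigma2 / n)               # empirical vs. theoretical sigma^2 / n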

We can also show that the sample variance is an unbiased estimator of the true variance. We
first observe that
S² = (1/(n − 1)) Σ_{i=1}^{n} ((X_i − μ) − (X̄ − μ))²
   = (1/(n − 1)) Σ_{i=1}^{n} (X_i − μ)² − (2/(n − 1)) (X̄ − μ) Σ_{i=1}^{n} (X_i − μ) + (n/(n − 1)) (X̄ − μ)²
   = (1/(n − 1)) Σ_{i=1}^{n} (X_i − μ)² − (n/(n − 1)) (X̄ − μ)²,

where the last equality uses the identity Σ_{i=1}^{n} (X_i − μ) = n(X̄ − μ).

Taking expectations gives


 
E[S²] = (1/(n − 1)) Σ_{i=1}^{n} E[(X_i − μ)²] − (n/(n − 1)) Var(X̄)
      = (n/(n − 1)) σ² − (n/(n − 1)) (σ²/n)
      = σ².
To understand why we divide by n − 1 rather than n when computing the sample variance,
note that the deviations between the X_i's and the sample mean tend to be smaller than the
deviations between the X_i's and the true mean.
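The effect of the divisor can be seen directly in simulation: averaging S² over many replicate samples recovers σ², while dividing by n instead of n − 1 systematically underestimates it. The Python sketch below uses a Normal distribution with σ² = 9 and sample size n = 10 as arbitrary choices.

import numpy as np

# Sketch: dividing by n - 1 makes S^2 unbiased, while dividing by n underestimates sigma^2.
rng = np.random.default_rng(6)
n, reps, sigma2 = 10, 200_000, 9.0

samples = rng.normal(loc=2.0, scale=3.0, size=(reps, n))
S2_unbiased = samples.var(axis=1, ddof=1)           # divide by n - 1
S2_biased   = samples.var(axis=1, ddof=0)           # divide by n

print(S2_unbiased.mean(), S2_biased.mean(), sigma2)
# the n - 1 version averages close to 9; the n version averages close to (n-1)/n * 9 = 8.1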
5.) Correlation
The following definition is motivated by the need for a dimensionless measure of the association between two random variables.
Definition: The correlation of two random variables X and Y with variances σ_X² > 0 and
σ_Y² > 0 is the quantity

ρ(X, Y) = Cov(X, Y) / (σ_X σ_Y).

Because, for any positive constants a and b,

ρ(aX, bY) = ab Cov(X, Y) / (a σ_X · b σ_Y) = ρ(X, Y),

the correlation between two RVs does not depend on the units in which they are measured, i.e.,
the correlation is dimensionless.
Another important property of the correlation is that
−1 ≤ ρ(X, Y) ≤ 1.

This can be proved as follows. First note that

0 ≤ Var( X/σ_X + Y/σ_Y )
  = Var(X)/σ_X² + Var(Y)/σ_Y² + 2 Cov(X, Y)/(σ_X σ_Y)
  = 2(1 + ρ(X, Y)),

which implies that ρ(X, Y) ≥ −1. A similar calculation involving the variance of the difference
of X/σ_X and Y/σ_Y shows that ρ(X, Y) ≤ 1.
It can be shown that if ρ(X, Y) = 1, then with probability one

Y = μ_Y + (σ_Y/σ_X)(X − μ_X),

while if ρ(X, Y) = −1, then with probability one

Y = μ_Y − (σ_Y/σ_X)(X − μ_X).

Thus the correlation coefficient can also be treated as a measure of the linearity of the
relationship between two random variables.
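The two facts above, that the correlation always lies in [−1, 1] and that it equals ±1 exactly when Y is a linear function of X, can be illustrated numerically; the data in the Python sketch below are arbitrary choices for illustration.

import numpy as np

# Sketch: the sample correlation always lies in [-1, 1], and it equals +1 or -1 exactly
# when Y is a linear function of X.
rng = np.random.default_rng(7)
X = rng.normal(size=100_000)

def corr(a, b):
    return np.mean((a - a.mean()) * (b - b.mean())) / (a.std() * b.std())

Y_noisy  = 2.0 * X + rng.normal(size=X.size)        # linear trend plus noise
Y_linear = 5.0 - 3.0 * X                            # an exact, decreasing linear function of X

print(corr(X, Y_noisy))                             # strictly between -1 and 1 (about 0.89 here)
print(corr(X, Y_linear))                            # -1.0 up to floating-point round-off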
