STATISTICS: MODULE 12122


Chapter 3 - Bivariate or joint probability distributions
In this chapter we consider the distribution of two random variables, where both
random variables are discrete (considered first) and, probably more importantly, where
both random variables are continuous. Bivariate or joint distributions model the way
two random variables vary together.
A. DISCRETE VARIABLES
Example 3.1
Here we have a probability model of the demand and supply of a perishable
commodity. The probability model/distribution is defined as follows:
                           Supply of commodity (SP)
                              1        2        3
    Demand for       0      0.015    0.025    0.010
    commodity        1      0.045    0.075    0.030
    (D)              2      0.195    0.325    0.130
                     3      0.030    0.050    0.020
                     4      0.015    0.025    0.010
This is known as a discrete bivariate or joint probability distribution since there are
two random variables which are "demand for commodity (D)" and "supply of
commodity (SP)".
The sample space S consists of 15 outcomes (d, s) where d and s are the values of D
and SP.
The probabilities in the table are joint probabilities, namely P(D = d and SP = s), or
P(D = d ∩ SP = s) using set notation.
Examples
Note: The sum of the 15 probabilities is 1.
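As a quick illustration (a sketch, assuming Python with numpy is available), the table can be stored as an array and checked:

    import numpy as np

    # Joint probabilities p(d, s): rows are demand d = 0,...,4, columns are supply s = 1, 2, 3
    p = np.array([[0.015, 0.025, 0.010],
                  [0.045, 0.075, 0.030],
                  [0.195, 0.325, 0.130],
                  [0.030, 0.050, 0.020],
                  [0.015, 0.025, 0.010]])

    print(p.sum())    # = 1 (up to rounding): the 15 joint probabilities sum to 1
    print(p[2, 1])    # 0.325 = P(D = 2 and SP = 2)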
3.2 Joint probability function
Suppose the random variables are X and Y; then the joint probability function is
denoted by p(x, y) and is defined as follows:
p(x, y) = P(X = x and Y = y), or P(X = x ∩ Y = y) in set notation.
Also Σ_x Σ_y p(x, y) = 1.
3.3 Marginal probability distributions
The marginal distributions are the distributions of X and Y considered separately
and model how X and Y vary separately from each other. Suppose the probability
functions of X and Y are p_X(x) and p_Y(y) respectively, so that
p_X(x) = P(X = x) and p_Y(y) = P(Y = y).
Also Σ_x p_X(x) = 1 and Σ_y p_Y(y) = 1.
It is quite straightforward to obtain these from the joint probability distribution, since
p_X(x) = Σ_y p(x, y) and p_Y(y) = Σ_x p(x, y).
In regression problems we are very interested in conditional probability distributions,
such as the conditional distribution of X given Y = y and the conditional distribution
of Y given X = x.
3.4 Conditional probability distributions
The conditional probability function of X given Y = y is denoted by p(x|y) and is
defined as
p(x|y) = P(X = x | Y = y) = P(X = x and Y = y) / P(Y = y) = p(x, y) / p_Y(y),
whereas the conditional probability function of Y given X = x is denoted by p(y|x)
and defined as
p(y|x) = P(Y = y | X = x) = P(Y = y and X = x) / P(X = x) = p(x, y) / p_X(x).
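Continuing the numpy sketch from Example 3.1 (reusing the array p defined there), the marginal and conditional probability functions fall straight out of the joint table:

    p_D  = p.sum(axis=1)    # marginal p.f. of demand:  p_D(d)  = sum over s of p(d, s)
    p_SP = p.sum(axis=0)    # marginal p.f. of supply:  p_SP(s) = sum over d of p(d, s)

    # conditional p.f. of D given SP = 2 (column index 1): p(d|s) = p(d, s) / p_SP(s)
    p_D_given_SP2 = p[:, 1] / p_SP[1]
    print(p_D_given_SP2)          # 0.05  0.15  0.65  0.10  0.05
    print(p_D_given_SP2.sum())    # 1.0 -- a conditional distribution sums to 1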
3.5 Joint probability distribution function
The joint (cumulative) probability distribution function (c.d.f.) is denoted by F(x, y)
and is defined as
F(x, y) = P(X ≤ x and Y ≤ y), where 0 ≤ F(x, y) ≤ 1.
The marginal c.d.f.s are denoted by F_X(x) and F_Y(y) and are defined as follows:
F_X(x) = P(X ≤ x) and F_Y(y) = P(Y ≤ y)
(see Chapter 1, section 1.12).
3.6 Are X and Y independent?
If either (a) F(x, y) = F_X(x) F_Y(y) for all x and y, or (b) p(x, y) = p_X(x) p_Y(y)
for all x and y, then X and Y are independent random variables.
Example 3.2 The joint distribution of X and Y is

                              X
                   -2      -1       0       1       2
      Y    10     0.09    0.15    0.27    0.25    0.04
           20     0.01    0.05    0.08    0.05    0.01

(a) Find the marginal distributions of X and Y.
(b) Find the conditional distribution of X given Y = 20.
(c) Are X and Y independent?
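Parts (a)-(c) should be done by hand, but here is a sketch of how the answers can be checked in Python (assuming numpy):

    import numpy as np

    # joint probabilities p(x, y): rows y = 10, 20; columns x = -2, -1, 0, 1, 2
    p = np.array([[0.09, 0.15, 0.27, 0.25, 0.04],
                  [0.01, 0.05, 0.08, 0.05, 0.01]])

    p_X = p.sum(axis=0)    # (a) marginal of X: 0.10  0.20  0.35  0.30  0.05
    p_Y = p.sum(axis=1)    # (a) marginal of Y: 0.80  0.20

    print(p[1] / p_Y[1])   # (b) conditional of X given Y = 20: 0.05  0.25  0.40  0.25  0.05

    # (c) X and Y are independent iff p(x, y) = p_X(x) p_Y(y) in every cell
    print(np.allclose(p, np.outer(p_Y, p_X)))    # False, so X and Y are not independent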
B. CONTINUOUS VARIABLES
3.7 Joint probability density function
The joint p.d.f. is denoted by f(x, y) (where f(x, y) ≥ 0 for all x and y) and defines
a probability surface in 3 dimensions. Probability is a volume under this surface, and
the total volume under the p.d.f. surface is 1, as the total probability is 1, i.e.
∫ ∫ f(x, y) dx dy = 1 (integrating over all x and y),
and P(a ≤ X ≤ b and c ≤ Y ≤ d) = ∫_{y=c}^{y=d} ∫_{x=a}^{x=b} f(x, y) dx dy.
As before with discrete variables, the marginal distributions are the distributions of
X and Y considered separately and model how X and Y vary separately from each other.
Whereas with discrete random variables we speak of marginal probability functions,
with continuous random variables we speak of marginal probability density functions.
Example 3.3
An electronics system has one of each of two different types of components in joint
operation. Let X and Y denote the random lengths of life of the components of type 1
and 2, respectively. Their joint density function is given by

f(x, y) = (1/8) x e^(-(x+y)/2),   x > 0, y > 0
        = 0 otherwise
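As a numerical check (a sketch, assuming scipy), the total volume under this p.d.f. surface is 1:

    import numpy as np
    from scipy import integrate

    # dblquad integrates func(y, x); x runs over the outer limits
    f = lambda y, x: (1/8) * x * np.exp(-(x + y) / 2)

    vol, err = integrate.dblquad(f, 0, np.inf, 0, np.inf)
    print(vol)    # ≈ 1.0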
Example 3.4
The random variables X and Y have a bivariate normal distribution if
f(x, y) = a e^(-b),
where
a = 1 / (2π σ_X σ_Y √(1 - ρ²))
and
b = [ ((x - μ_X)/σ_X)² - 2ρ ((x - μ_X)/σ_X)((y - μ_Y)/σ_Y) + ((y - μ_Y)/σ_Y)² ] / (2(1 - ρ²)),
where -∞ < x < ∞, -∞ < y < ∞, -∞ < μ_X < ∞, -∞ < μ_Y < ∞, -1 < ρ < 1, σ_X > 0 and σ_Y > 0.
The p.d.f. surface is bell-shaped.
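The formula can be checked against a library implementation; a minimal sketch (assuming scipy, with illustrative parameter values chosen here: μ_X = 0, μ_Y = 1, σ_X = 1, σ_Y = 2, ρ = 0.5):

    import numpy as np
    from scipy.stats import multivariate_normal

    mx, my, sx, sy, rho = 0.0, 1.0, 1.0, 2.0, 0.5    # illustrative values
    x, y = 0.3, 1.7

    # the formula above
    a = 1 / (2 * np.pi * sx * sy * np.sqrt(1 - rho**2))
    zx, zy = (x - mx) / sx, (y - my) / sy
    b = (zx**2 - 2 * rho * zx * zy + zy**2) / (2 * (1 - rho**2))
    print(a * np.exp(-b))

    # the same density via scipy
    cov = [[sx**2, rho * sx * sy], [rho * sx * sy, sy**2]]
    print(multivariate_normal([mx, my], cov).pdf([x, y]))    # same value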
3.8 Marginal probability density function
The marginal p.d.f. of X is denoted by f_X(x) and is the equation of a curve called
the p.d.f. curve of X. P(a ≤ X ≤ b) is an area under the p.d.f. curve, and so
P(a ≤ X ≤ b) = ∫_a^b f_X(x) dx
(as in Chapter 1, section 1.8).
It can be obtained from the joint p.d.f. by a single integration, as follows:
f_X(x) = ∫ f(x, y) dy (integrating over all y).
The marginal p.d.f. of Y is denoted by f_Y(y) and is the equation of a curve called
the p.d.f. curve of Y. P(c ≤ Y ≤ d) is an area under the p.d.f. curve, and so
P(c ≤ Y ≤ d) = ∫_c^d f_Y(y) dy.
It can be obtained from the joint p.d.f. by a single integration, as follows:
f_Y(y) = ∫ f(x, y) dx (integrating over all x).
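For instance, for the density of Example 3.3 the single integration can be done numerically (a sketch, assuming scipy); the analytic answer here is f_X(x) = (1/4) x e^(-x/2) for x > 0:

    import numpy as np
    from scipy import integrate

    f = lambda x, y: (1/8) * x * np.exp(-(x + y) / 2)    # joint p.d.f. of Example 3.3

    def f_X(x):
        # marginal p.d.f. of X: integrate the joint p.d.f. over all y
        return integrate.quad(lambda y: f(x, y), 0, np.inf)[0]

    print(f_X(2.0))                      # numerical value
    print((1/4) * 2.0 * np.exp(-1.0))    # analytic value (1/4) x e^(-x/2) at x = 2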
3.9 Conditional probability density functions
The conditional p.d.f. of X given Y = y is denoted by f(x|y) and defined as
f(x|y) = f(x | Y = y) = f(x, y) / f_Y(y),
whereas the conditional p.d.f. of Y given X = x is denoted by f(y|x) and defined as
f(y|x) = f(y | X = x) = f(x, y) / f_X(x).
3.10 Joint probability distribution function
As in 3.5, the joint (cumulative) probability distribution function (c.d.f.) is denoted by
F(x, y) and is defined as F(x, y) = P(X ≤ x and Y ≤ y), but in the continuous case
F(x, y) is the volume under the p.d.f. surface from X = -∞ to X = x and from Y = -∞ to
Y = y, so that
F(x, y) = ∫_{v=-∞}^{v=y} ∫_{u=-∞}^{u=x} f(u, v) du dv.
The marginal c.d.f.s are defined as in 3.5 and can be obtained from the joint
distribution function F(x, y) as follows:
F_X(x) = F(x, y_MAX), where y_MAX is the largest value of y, and
F_Y(y) = F(x_MAX, y), where x_MAX is the largest value of x.
3.11 Important connections between the p.d.f.s and the c.d.f.s
(i) The joint p.d.f. f(x, y) = ∂²F(x, y) / ∂x ∂y.
(ii) The marginal p.d.f.s can be obtained from the marginal c.d.f.s as follows:
the marginal p.d.f. of X is f_X(x) = dF_X(x)/dx or F_X'(x);
the marginal p.d.f. of Y is f_Y(y) = dF_Y(y)/dy or F_Y'(y).
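A minimal sympy sketch of these connections, using an illustrative c.d.f. (not from the notes): F(x, y) = (1 - e^(-x))(1 - e^(-y)) for x, y > 0, the joint c.d.f. of two independent unit exponentials:

    import sympy as sp

    x, y = sp.symbols('x y', positive=True)
    F = (1 - sp.exp(-x)) * (1 - sp.exp(-y))    # illustrative joint c.d.f.

    f = sp.diff(F, x, y)           # (i) joint p.d.f. = second mixed partial of F
    print(f)                       # exp(-x)*exp(-y)

    F_X = sp.limit(F, y, sp.oo)    # marginal c.d.f. F_X(x), taking y as large as possible
    print(sp.diff(F_X, x))         # (ii) marginal p.d.f. f_X(x) = exp(-x)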
3.12 Are X and Y independent?
X and Y are independent random variables if either
(a) F(x, y) = F_X(x) F_Y(y); or
(b) f(x, y) = f_X(x) f_Y(y); or
(c) f(x|y) = a function of x only or, equivalently, f(y|x) = a function of y only.
Example 3.5 The joint distribution function of X and Y is given by
F(x, y) = (xy/3)(x + 2y),   0 ≤ x ≤ 1, 0 ≤ y ≤ 1
        = 0 otherwise
(i) Find the marginal distribution and density functions.
(ii) Find the joint density function.
(iii) Are X and Y independent random variables?
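The differentiations in (i) and (ii) are routine; a sympy sketch, taking the distribution function above as given:

    import sympy as sp

    x, y = sp.symbols('x y', nonnegative=True)
    F = (x * y / 3) * (x + 2 * y)          # joint c.d.f. on 0 <= x <= 1, 0 <= y <= 1

    F_X, F_Y = F.subs(y, 1), F.subs(x, 1)  # (i) marginal c.d.f.s (y_MAX = x_MAX = 1)
    f_X = sp.expand(sp.diff(F_X, x))       # (i) marginal densities
    f_Y = sp.expand(sp.diff(F_Y, y))
    f = sp.expand(sp.diff(F, x, y))        # (ii) joint density
    print(f_X, f_Y, f)

    # (iii) independent only if f(x, y) = f_X(x) f_Y(y) identically
    print(sp.simplify(f - f_X * f_Y) == 0)   # False -> not independent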
Example 3.6 X and Y have the joint probability density function
f(x, y) = (8/7)(x²/y³),   1 ≤ x ≤ 2, 1 ≤ y ≤ 2
        = 0 otherwise
(a) Derive the marginal distribution function of X.
(b) Derive the conditional density function of X given Y = y.
(c) Are X and Y independent?
Given:                                            Given:
Joint density fn. f(x, y)                         Joint distribution fn. F(x, y)
     | Integrate w.r.t.                                | Differentiate (partially)
     | x and y                                         | w.r.t. x and y
     v                                                 v
Joint distribution fn.                            Joint density fn.
F(x, y) = ∫_{v=-∞}^{y} ∫_{u=-∞}^{x} f(u, v) du dv      f(x, y) = ∂²F / ∂x ∂y
Example 3.6(b) and (c)
Solution From 3.9, the conditional p.d.f. of X given Y = y is denoted by f(x|y) and
defined as
f(x|y) = f(x | Y = y) = f(x, y) / f_Y(y),
where f(x, y) is the joint p.d.f. of X and Y and f_Y(y) is the marginal p.d.f. of Y.
We know f(x, y) = (8/7)(x²/y³), so we need to find f_Y(y).
There are two ways you can find f_Y(y): the first involves integration and the second
involves differentiation. I will do both, to show you how to use the different results
we have here, but you should always choose the way you find easiest, i.e. you would
not be expected to find f_Y(y) both ways in any assessed work.
Method 1
From 3.8, f_Y(y) = ∫ f(x, y) dx, so
f_Y(y) = ∫_1^2 (8/7)(x²/y³) dx = (8/7)(1/y³) ∫_1^2 x² dx = (8/7)(1/y³) [x³/3]_1^2
       = (8/7)(1/y³)(2³ - 1³)/3 = (8/7)(7/(3y³)) = 8/(3y³).
Method 2
From 3.11, f_Y(y) = dF_Y(y)/dy, where F_Y(y) is the marginal c.d.f. of Y.
From 3.10, F_Y(y) = F(x_MAX, y), where x_MAX is the largest value of x, so
F_Y(y) = F(2, y), and from part (a),
F(x, y) = (4/21)(x³ - 1)(1 - 1/y²),
so F(2, y) = (4/21)(2³ - 1)(1 - 1/y²) = (4/3)(1 - 1/y²),
hence F_Y(y) = (4/3)(1 - 1/y²).
Hence f_Y(y) = d/dy [ (4/3)(1 - 1/y²) ] = (4/3)(2/y³) = 8/(3y³), as with Method 1.
Therefore the conditional density function of X given Y = y, f(x|y), is given by
f(x|y) = f(x, y) / f_Y(y) = (8/7)(x²/y³) / (8/(3y³)) = 3x²/7.
So
f(x|y) = 3x²/7,   1 ≤ x ≤ 2 and 1 ≤ y ≤ 2
       = 0 otherwise
(c) Now f(x|y) is a function of x only, so using result 3.12(c), X and Y are
independent. Notice also that f(x|y) = f_X(x), which you would expect if X and Y
are independent.
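Both methods, and the conditional density, can be reproduced in a few lines of sympy (a sketch of the check, not a substitute for the working above):

    import sympy as sp

    x, y = sp.symbols('x y', positive=True)
    f = sp.Rational(8, 7) * x**2 / y**3                   # joint p.d.f., 1 <= x, y <= 2

    f_Y1 = sp.integrate(f, (x, 1, 2))                     # Method 1: integrate out x
    F = sp.Rational(4, 21) * (x**3 - 1) * (1 - 1/y**2)    # joint c.d.f. from part (a)
    f_Y2 = sp.diff(F.subs(x, 2), y)                       # Method 2: differentiate F_Y(y) = F(2, y)

    print(sp.simplify(f_Y1), sp.simplify(f_Y2))           # both 8/(3*y**3)
    print(sp.simplify(f / f_Y1))                          # f(x|y) = 3*x**2/7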
3.13 Expectations and variances
Discrete random variables

E(X^r) = Σ_x Σ_y x^r p(x, y) = Σ_x x^r p_X(x),   r = 1, 2, ...
E(Y^r) = Σ_x Σ_y y^r p(x, y) = Σ_y y^r p_Y(y),   r = 1, 2, ...
Examples
Hence Var(X) = E(X²) - (E(X))² and Var(Y) = E(Y²) - (E(Y))², etc.
Continuous random variables
E(X^r) = ∫ ∫ x^r f(x, y) dx dy = ∫ x^r f_X(x) dx,   r = 1, 2, ...
E(Y^r) = ∫ ∫ y^r f(x, y) dx dy = ∫ y^r f_Y(y) dy,   r = 1, 2, ...
Examples
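For example, for the joint density of Example 3.3, E(X) can be found numerically (a sketch, assuming scipy):

    import numpy as np
    from scipy import integrate

    f = lambda y, x: (1/8) * x * np.exp(-(x + y) / 2)    # dblquad expects func(y, x)

    # E(X) = double integral of x f(x, y) over all x and y
    EX, _ = integrate.dblquad(lambda y, x: x * f(y, x), 0, np.inf, 0, np.inf)
    print(EX)    # ≈ 4.0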
3.14 Expectation of a function of the r.v.'s X and Y
For continuous X and Y:  E[g(X, Y)] = ∫ ∫ g(x, y) f(x, y) dx dy.
For discrete X and Y:  E[g(X, Y)] = Σ_x Σ_y g(x, y) p(x, y).
e.g. E[X/Y] = ∫ ∫ (x/y) f(x, y) dx dy and E[XY] = ∫ ∫ x y f(x, y) dx dy.
3.15 Covariance and correlation
Covariance of X and Y is defined as follows: Cov(X, Y) = σ_XY = E(XY) - E(X)E(Y).
Notes
(a) If the random variables increase together or decrease together, then the covariance
will be positive, whereas if one random variable increases as the other decreases, then
the covariance will be negative.
(b) If X and Y are independent r.v.'s, then E(XY) = E(X)E(Y), so cov(X, Y) = 0.
However, if cov(X, Y) = 0, it does not follow that X and Y are independent unless
X and Y are Normal r.v.'s.
Correlation coefficient ρ = corr(X, Y) = Cov(X, Y) / (σ_X σ_Y).
Notes
(a) The correlation coefficient is a number between -1 and 1, i.e. -1 ≤ ρ ≤ 1.
(b) If the random variables increase together or decrease together, then ρ will be
positive, whereas if one random variable increases as the other decreases, then ρ will
be negative.
(c) ρ measures the degree of linear relationship between the two random variables X
and Y. If X and Y are independent, then ρ = 0; note that ρ can also be 0 when X and Y
are related in a purely non-linear way, so ρ = 0 does not imply independence.
You will study correlation in more detail in the Econometric part of the course with
David Winter.
Example 3.7 In Example 3.2 are X and Y correlated?
Solution Below is the joint or bivariate probability distribution of X and Y:
                              X
                   -2      -1       0       1       2
      Y    10     0.09    0.15    0.27    0.25    0.04
           20     0.01    0.05    0.08    0.05    0.01

The marginal distributions of X and Y are

      x                      -2      -1       0       1       2    Total
      P(X = x) or p_X(x)    0.10    0.20    0.35    0.30    0.05    1.00

and

      y                      10      20    Total
      P(Y = y) or p_Y(y)    0.80    0.20    1.00
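The remaining arithmetic is a direct application of 3.13 and 3.15; a numpy sketch:

    import numpy as np

    xs = np.array([-2, -1, 0, 1, 2])
    ys = np.array([10, 20])
    p = np.array([[0.09, 0.15, 0.27, 0.25, 0.04],     # y = 10
                  [0.01, 0.05, 0.08, 0.05, 0.01]])    # y = 20

    p_X, p_Y = p.sum(axis=0), p.sum(axis=1)
    EX, EY = (xs * p_X).sum(), (ys * p_Y).sum()
    EXY = (np.outer(ys, xs) * p).sum()                # E(XY) = sum of x y p(x, y)

    cov = EXY - EX * EY
    sdX = np.sqrt((xs**2 * p_X).sum() - EX**2)
    sdY = np.sqrt((ys**2 * p_Y).sum() - EY**2)
    print(cov, cov / (sdX * sdY))    # both ≈ 0: X and Y are uncorrelated (though not independent)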
Example 3.8 In Example 3.6
(i) Calculate E(X), Var(X), E(Y) and cov(X,Y).
(ii) Are X and Y independent?
3.16 Useful results on expectations and variances
(i) E(aX + bY) = aE(X) + bE(Y), where a and b are constants.
(ii) Var(aX + bY) = a²Var(X) + b²Var(Y) + 2ab cov(X, Y).
Result (i) can be extended to any n random variables X_1, X_2, ..., X_n:
E(a_1 X_1 + a_2 X_2 + ... + a_n X_n) = a_1 E(X_1) + a_2 E(X_2) + ... + a_n E(X_n).
When X and Y are independent, then
(iii) Var(aX + bY) = a²Var(X) + b²Var(Y)
(iv) E(XY) = E(X)E(Y), so cov(X, Y) = 0.
Results (iii) and (iv) can be extended to any n independent random variables
X_1, X_2, ..., X_n:
(iii)* Var(a_1 X_1 + a_2 X_2 + ... + a_n X_n) = a_1² Var(X_1) + a_2² Var(X_2) + ... + a_n² Var(X_n)
(iv)* E(X_1 X_2 ... X_n) = E(X_1) E(X_2) ... E(X_n)
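Results (i) and (ii) are easy to see by simulation; a Monte Carlo sketch (illustrative distributions chosen here, with X and Y deliberately dependent):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000
    X = rng.normal(2.0, 1.5, n)                # illustrative X
    Y = 0.5 * X + rng.exponential(1.0, n)      # Y deliberately correlated with X
    a, b = 3.0, -2.0

    lhs = np.var(a * X + b * Y)
    rhs = a**2 * np.var(X) + b**2 * np.var(Y) + 2 * a * b * np.cov(X, Y)[0, 1]
    print(lhs, rhs)    # agree up to sampling error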
3.17 Combinations of independent Normal random variables
Suppose X_1 ~ N(μ_1, σ_1²), X_2 ~ N(μ_2, σ_2²), X_3 ~ N(μ_3, σ_3²), ...,
X_n ~ N(μ_n, σ_n²), and X_1, X_2, ..., X_n are independent random variables. Then if
Y = a_1 X_1 + a_2 X_2 + a_3 X_3 + ... + a_n X_n, where a_1, a_2, ..., a_n are constants,
Y ~ N(a_1 μ_1 + a_2 μ_2 + a_3 μ_3 + ... + a_n μ_n,  a_1² σ_1² + a_2² σ_2² + ... + a_n² σ_n²),
i.e. Y ~ N(Σ a_i μ_i, Σ a_i² σ_i²).
In particular, suppose X_1, X_2, ..., X_n form a random sample from a Normal
population with mean μ and variance σ², i.e.
μ_1 = μ_2 = μ_3 = ... = μ_n = μ and σ_1² = σ_2² = ... = σ_n² = σ²,
so that Y ~ N(μ Σ a_i, σ² Σ a_i²).
Further, suppose that a_1 = a_2 = a_3 = ... = a_n = 1/n; then
Y = (X_1 + X_2 + ... + X_n)/n = X̄
and X̄ ~ N(μ, σ²/n).
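This last result is easy to demonstrate by simulation; a sketch (assuming numpy) drawing many samples of size n and looking at the sample means:

    import numpy as np

    rng = np.random.default_rng(1)
    mu, sigma, n = 5.0, 2.0, 25    # illustrative population and sample size

    xbars = rng.normal(mu, sigma, size=(100_000, n)).mean(axis=1)
    print(xbars.mean(), xbars.var())    # ≈ mu = 5 and ≈ sigma²/n = 0.16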
