
Two Random Variables

W&W, Chapter 5

Joint Distributions
So far we have been talking about the probability of a single variable, or a variable conditional on another. We often want to determine the joint probability of two variables, such as X and Y. Suppose we are able to determine the following information for education (X) and age (Y) for all U.S. citizens based on the census.

Joint Distributions
                      Age (Y)
Education (X)         25-35 (y=30)   35-55 (y=45)   55-100 (y=70)
None (x=0)                .01            .02            .05
Primary (x=1)             .03            .06            .10
Secondary (x=2)           .18            .21            .15
College (x=3)             .07            .08            .04

Joint Distributions
Each cell is the relative frequency (f/N).
We can define the joint probability distribution as:
p(x,y) = Pr(X=x and Y=y)
Example: what is the probability of getting a 30-year-old college graduate?

Joint Distributions
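p(3, 30) = Pr(X=3 and Y=30) = .07
The probability of a single value of X is found by summing the joint probabilities over all values of Y:
$p(x) = \sum_y p(x,y)$
For example, p(x=1) = .03 + .06 + .10 = .19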
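The lookup and the row sum above can be sketched in a few lines of Python. The dictionary layout and the variable names are assumptions made for illustration; only the probabilities come from the table.

```python
# Joint distribution p(x, y) from the education/age table.
# Keys are (x, y): x = education code (0-3), y = representative age (30, 45, 70).
joint = {
    (0, 30): 0.01, (0, 45): 0.02, (0, 70): 0.05,
    (1, 30): 0.03, (1, 45): 0.06, (1, 70): 0.10,
    (2, 30): 0.18, (2, 45): 0.21, (2, 70): 0.15,
    (3, 30): 0.07, (3, 45): 0.08, (3, 70): 0.04,
}

# Joint probability of a 30-year-old college graduate: p(3, 30).
p_college_30 = joint[(3, 30)]

# Marginal probability of primary education: sum p(1, y) over all ages y.
p_primary = sum(p for (x, y), p in joint.items() if x == 1)

print(p_college_30)
print(round(p_primary, 2))
```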

Marginal Probability
We call this the marginal probability because it is calculated by summing across rows or columns and is thus reported in the margins of the table.
We can calculate this for our entire table.

Marginal Probability Distribution


Education (X)       y=30    y=45    y=70    p(x)
None (x=0)           .01     .02     .05     .08
Primary (x=1)        .03     .06     .10     .19
Secondary (x=2)      .18     .21     .15     .54
College (x=3)        .07     .08     .04     .19
p(y)                 .29     .37     .34    1.00
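As a rough check on the margins of this table, the sketch below accumulates both marginal distributions in one pass over the joint probabilities. The names `joint`, `p_x` and `p_y` are illustrative assumptions, not notation from the text.

```python
from collections import defaultdict

# Joint distribution p(x, y): x = education code, y = representative age.
joint = {
    (0, 30): 0.01, (0, 45): 0.02, (0, 70): 0.05,
    (1, 30): 0.03, (1, 45): 0.06, (1, 70): 0.10,
    (2, 30): 0.18, (2, 45): 0.21, (2, 70): 0.15,
    (3, 30): 0.07, (3, 45): 0.08, (3, 70): 0.04,
}

p_x = defaultdict(float)  # marginal distribution of education (row totals)
p_y = defaultdict(float)  # marginal distribution of age (column totals)
for (x, y), p in joint.items():
    p_x[x] += p
    p_y[y] += p

for x in sorted(p_x):
    print("p(x=%d) = %.2f" % (x, p_x[x]))
for y in sorted(p_y):
    print("p(y=%d) = %.2f" % (y, p_y[y]))
```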

Independence
Two random variables X and Y are independent if the events (X=x) and (Y=y) are independent, or:
p(x,y) = p(x)p(y) for all x and y
Note that this parallels the definition for events (Eq. 3-21): event E is independent of F if:
Pr(E and F) = Pr(E)Pr(F)

Example
Are education and age independent? Start with the upper-left cell:
p(x,y) = .01
p(x) = .08
p(y) = .29
They are not independent, because (.08)(.29) = .0232, which is not equal to .01. A single cell that fails the test is enough, since independence requires p(x,y) = p(x)p(y) in every cell.
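The same test can be run over every cell. The following is a minimal sketch, assuming the table is stored as a dictionary; the helper name `is_independent` and the tolerance are hypothetical choices for illustration.

```python
from collections import defaultdict

joint = {
    (0, 30): 0.01, (0, 45): 0.02, (0, 70): 0.05,
    (1, 30): 0.03, (1, 45): 0.06, (1, 70): 0.10,
    (2, 30): 0.18, (2, 45): 0.21, (2, 70): 0.15,
    (3, 30): 0.07, (3, 45): 0.08, (3, 70): 0.04,
}

# Marginal distributions from the row and column totals.
p_x = defaultdict(float)
p_y = defaultdict(float)
for (x, y), p in joint.items():
    p_x[x] += p
    p_y[y] += p

def is_independent(joint, p_x, p_y, tol=1e-9):
    """True only if p(x, y) == p(x) * p(y) in every cell (up to tol)."""
    return all(abs(p - p_x[x] * p_y[y]) <= tol for (x, y), p in joint.items())

# Upper-left cell: product of the marginals versus the joint probability.
product_upper_left = p_x[0] * p_y[30]
print(round(product_upper_left, 4), joint[(0, 30)])
print(is_independent(joint, p_x, p_y))
```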

Independence
In a table like this, if X and Y are independent, then the rows of the table p(x,y) will be proportional and so will the columns (see Example 5-1, page 158).
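One way to look at proportionality is to divide each row by its total, which gives the conditional distribution of age within each education level. Under independence every row would then be identical. The sketch below does this for the education/age table; the variable names are assumptions for illustration.

```python
joint = {
    (0, 30): 0.01, (0, 45): 0.02, (0, 70): 0.05,
    (1, 30): 0.03, (1, 45): 0.06, (1, 70): 0.10,
    (2, 30): 0.18, (2, 45): 0.21, (2, 70): 0.15,
    (3, 30): 0.07, (3, 45): 0.08, (3, 70): 0.04,
}
ages = (30, 45, 70)

# Divide each row by its total p(x) to get p(y | x).
conditional = {}
for x in range(4):
    row_total = sum(joint[(x, y)] for y in ages)
    conditional[x] = {y: joint[(x, y)] / row_total for y in ages}

# If X and Y were independent, these rows would all be the same.
for x, row in conditional.items():
    print(x, [round(row[y], 3) for y in ages])
```

The normalized rows differ from one education level to the next, which is another way of seeing that the rows are not proportional.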

Covariance
It is useful to know how two variables vary together, or how they co-vary. We begin with the familiar concept of variance (E is expectation):
$\sigma^2 = E(X - \mu)^2 = \sum_x (x - \mu)^2 \, p(x)$
The covariance of X and Y is defined analogously:
$\sigma_{X,Y} = E(X - \mu_X)(Y - \mu_Y) = \sum_x \sum_y (x - \mu_X)(y - \mu_Y) \, p(x,y)$

Covariance
Let's calculate the covariance for education (X) and age (Y). First we need to calculate the mean for X and Y:
$\mu_X = \sum_x x \, p(x)$ = (0)(.08) + (1)(.19) + (2)(.54) + (3)(.19) = 1.84
$\mu_Y = \sum_y y \, p(y)$ = (30)(.29) + (45)(.37) + (70)(.34) = 49.15

Now, for each cell in the table, multiply the deviation of x from its mean by the deviation of y from its mean and by the joint probability, then sum over all cells.

Covariance
$\sigma_{X,Y} = \sum_x \sum_y (x - \mu_X)(y - \mu_Y) \, p(x,y)$
= (0-1.84)(30-49.15)(.01) + (0-1.84)(45-49.15)(.02) + (0-1.84)(70-49.15)(.05)
+ (1-1.84)(30-49.15)(.03) + (1-1.84)(45-49.15)(.06) + (1-1.84)(70-49.15)(.10)
+ (2-1.84)(30-49.15)(.18) + (2-1.84)(45-49.15)(.21) + (2-1.84)(70-49.15)(.15)
+ (3-1.84)(30-49.15)(.07) + (3-1.84)(45-49.15)(.08) + (3-1.84)(70-49.15)(.04)
= -3.636
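The twelve-term sum is easy to get wrong by hand, so here is a short sketch that reproduces it directly from the definition. The dictionary representation and the names `mu_x`, `mu_y` and `cov` are assumptions for illustration.

```python
joint = {
    (0, 30): 0.01, (0, 45): 0.02, (0, 70): 0.05,
    (1, 30): 0.03, (1, 45): 0.06, (1, 70): 0.10,
    (2, 30): 0.18, (2, 45): 0.21, (2, 70): 0.15,
    (3, 30): 0.07, (3, 45): 0.08, (3, 70): 0.04,
}

# Means of X and Y (summing x * p(x, y) over all cells equals summing x * p(x)).
mu_x = sum(x * p for (x, y), p in joint.items())
mu_y = sum(y * p for (x, y), p in joint.items())

# Covariance by definition: product of deviations weighted by p(x, y).
cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())

print(round(mu_x, 2), round(mu_y, 2), round(cov, 3))
```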

Covariance
The covariance is negative, which tells us that as age increases, education tends to decrease (and vice versa). It is negative because when one variable is above its mean, the other is, on average, below its mean. We can calculate covariance alternatively as

$\sigma_{X,Y} = E(XY) - \mu_X \mu_Y = \sum_x \sum_y x y \, p(x,y) - \mu_X \mu_Y$
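As a check that the alternative formula gives the same answer for the education/age table, the sketch below computes E(XY) and subtracts the product of the means. The names `e_xy` and `cov_shortcut` are illustrative assumptions.

```python
joint = {
    (0, 30): 0.01, (0, 45): 0.02, (0, 70): 0.05,
    (1, 30): 0.03, (1, 45): 0.06, (1, 70): 0.10,
    (2, 30): 0.18, (2, 45): 0.21, (2, 70): 0.15,
    (3, 30): 0.07, (3, 45): 0.08, (3, 70): 0.04,
}

mu_x = sum(x * p for (x, y), p in joint.items())
mu_y = sum(y * p for (x, y), p in joint.items())

# E(XY): sum of x * y * p(x, y) over all cells.
e_xy = sum(x * y * p for (x, y), p in joint.items())

# Alternative formula: E(XY) minus the product of the means.
cov_shortcut = e_xy - mu_x * mu_y

print(round(e_xy, 3), round(mu_x * mu_y, 3), round(cov_shortcut, 3))
```

Here E(XY) = 86.8 and the product of the means is (1.84)(49.15) = 90.436, so the difference is again -3.636.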

Covariance and Independence


If X and Y are independent, then they are uncorrelated, or their covariance is zero:

$\sigma_{X,Y} = 0$
The value for covariance depends on the units in which X and Y are measured. If X, for example, were measured in inches instead of feet, each X deviation, and hence $\sigma_{X,Y}$ itself, would increase by a factor of 12.
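The effect of changing units can be illustrated with the education/age table by multiplying every X value by 12 while leaving the probabilities alone. This rescaling is purely hypothetical (education is not measured in feet) and the helper `covariance` is an assumed name.

```python
def covariance(joint):
    """Covariance of X and Y for a joint distribution given as {(x, y): p}."""
    mu_x = sum(x * p for (x, y), p in joint.items())
    mu_y = sum(y * p for (x, y), p in joint.items())
    return sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())

joint = {
    (0, 30): 0.01, (0, 45): 0.02, (0, 70): 0.05,
    (1, 30): 0.03, (1, 45): 0.06, (1, 70): 0.10,
    (2, 30): 0.18, (2, 45): 0.21, (2, 70): 0.15,
    (3, 30): 0.07, (3, 45): 0.08, (3, 70): 0.04,
}

# Hypothetical change of units: every X value is 12 times larger.
rescaled = {(12 * x, y): p for (x, y), p in joint.items()}

cov_original = covariance(joint)
cov_rescaled = covariance(rescaled)
print(round(cov_original, 3), round(cov_rescaled, 3))
```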

Correlation
We can calculate the correlation instead:
$\rho = \frac{\sigma_{X,Y}}{\sigma_X \sigma_Y}$
Correlation does not depend on the units in which X and Y are measured, and is always bounded:

$-1 \le \rho \le 1$

Correlation
A perfect positive correlation ($\rho = 1$): all (x, y) points fall on a straight line with positive slope.
A perfect negative correlation ($\rho = -1$): all (x, y) points fall on a straight line with negative slope.
A correlation of zero indicates no linear relationship between X and Y. Independence implies zero correlation, but zero correlation does not by itself imply independence.
Positive correlations: as X increases, Y tends to increase.
Negative correlations: as X increases, Y tends to decrease.
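A small counterexample helps show that independence guarantees zero covariance but not the other way around. The distribution below is not from the text: it is an assumed toy example in which X takes the values -1, 0, 1 with equal probability and Y = X squared, so Y is completely determined by X.

```python
from fractions import Fraction

# Toy joint distribution (an illustrative assumption): Y = X**2.
third = Fraction(1, 3)
joint = {(-1, 1): third, (0, 0): third, (1, 1): third}

mu_x = sum(x * p for (x, y), p in joint.items())
mu_y = sum(y * p for (x, y), p in joint.items())
cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())

# Marginal probabilities of X = 0 and Y = 0.
p_x0 = sum(p for (x, y), p in joint.items() if x == 0)
p_y0 = sum(p for (x, y), p in joint.items() if y == 0)

print("covariance:", cov)
print("p(0,0):", joint[(0, 0)], "p(x=0)p(y=0):", p_x0 * p_y0)
```

The covariance is exactly zero, yet p(0,0) differs from p(x=0)p(y=0), so X and Y are uncorrelated but not independent.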

Example of Correlation
Calculate the correlation between education and age:

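$\rho = \frac{\sigma_{X,Y}}{\sigma_X \sigma_Y} = \frac{-3.636}{(.8212)(16.14)} = -0.2743$

Here $\sigma_X = \sqrt{.6744} = .8212$ and $\sigma_Y = \sqrt{260.53} = 16.14$ are the standard deviations of education and age, computed from the marginal distributions p(x) and p(y).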
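The correlation for the education/age table can be reproduced with the sketch below, which also repeats the calculation after multiplying every X value by 12 to illustrate that the result does not depend on units. The function name `correlation` and the rescaling are assumptions for illustration.

```python
from math import sqrt

def correlation(joint):
    """Correlation of X and Y for a joint distribution given as {(x, y): p}."""
    mu_x = sum(x * p for (x, y), p in joint.items())
    mu_y = sum(y * p for (x, y), p in joint.items())
    cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())
    var_x = sum((x - mu_x) ** 2 * p for (x, y), p in joint.items())
    var_y = sum((y - mu_y) ** 2 * p for (x, y), p in joint.items())
    return cov / (sqrt(var_x) * sqrt(var_y))

joint = {
    (0, 30): 0.01, (0, 45): 0.02, (0, 70): 0.05,
    (1, 30): 0.03, (1, 45): 0.06, (1, 70): 0.10,
    (2, 30): 0.18, (2, 45): 0.21, (2, 70): 0.15,
    (3, 30): 0.07, (3, 45): 0.08, (3, 70): 0.04,
}

rho = correlation(joint)

# Hypothetical change of units for X; the correlation should not move.
rho_rescaled = correlation({(12 * x, y): p for (x, y), p in joint.items()})

print(round(rho, 4), round(rho_rescaled, 4))
```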

Interpretation
There is a weak, negative correlation between education and age, which means that older people tend to have less education. Later on we will learn how to conduct a hypothesis test to determine if $\rho$ is significantly different from zero.
