To deal with this type of data we need a distribution for more than one variable. For two discrete RV's X and Y, the joint probability mass function is
$$p(x, y) = P(X = x, Y = y)$$
where p(x, y) ≥ 0 for all (x, y) and $\sum_x \sum_y p(x, y) = 1$.
Two tire-quality experts examine stacks of tires and assign quality ratings
to each tire on a 3-point scale. Let X denote the grade given by expert
A and let Y denote the grade given by B . The following table gives the
joint distribution of X and Y:
X \ Y     1     2     3
  1      ___   ___   ___
  2      ___   ___   ___
  3      ___   ___   ___
• P (X = 1, Y = 2) =
• P (X ≥ 2, Y ≥ 3) =
The marginal probability mass functions of X and Y, denoted by
p_X(x) and p_Y(y) respectively, are given by:
$$p_X(x) = \sum_y p(x, y) \qquad p_Y(y) = \sum_x p(x, y)$$

X        1     2     3              Y        1     2     3
p_X(x)  ___   ___   ___             p_Y(y)  ___   ___   ___
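As a computational sketch (the joint probabilities below are hypothetical placeholders, not the lecture's values), the marginals are just row and column sums of the joint table:

```python
# Hypothetical joint PMF p(x, y); illustration only
joint = {
    (1, 1): 0.10, (1, 2): 0.05, (1, 3): 0.05,
    (2, 1): 0.10, (2, 2): 0.20, (2, 3): 0.10,
    (3, 1): 0.05, (3, 2): 0.15, (3, 3): 0.20,
}

# p_X(x) = sum over y of p(x, y);  p_Y(y) = sum over x of p(x, y)
p_X, p_Y = {}, {}
for (x, y), p in joint.items():
    p_X[x] = p_X.get(x, 0.0) + p
    p_Y[y] = p_Y.get(y, 0.0) + p

print(p_X)  # ≈ {1: .20, 2: .40, 3: .40}
print(p_Y)  # ≈ {1: .25, 2: .40, 3: .35}
```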
1.2 Continuous Joint PDF for Two Variables
For two continuous random variables X and Y, the joint PDF f(x, y) has the
following properties:
1. f(x, y) ≥ 0 for all x and y.
2. $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$.
Example:
$$f(x, y) = \frac{x(1 + 3y^2)}{4} \qquad 0 < x < 2, \; 0 < y < 1$$
Show that condition 1 holds:
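On the support, x > 0 and 1 + 3y² > 0, so f(x, y) > 0 there; elsewhere f(x, y) = 0. Hence f(x, y) ≥ 0 everywhere.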
Show that condition 2 holds:
$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dx\,dy =$$
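Carrying out the computation over the support:
$$\int_0^2 \int_0^1 \frac{x(1 + 3y^2)}{4}\,dy\,dx = \int_0^2 \frac{x}{4}\Big[y + y^3\Big]_0^1\,dx = \int_0^2 \frac{x}{2}\,dx = \Big[\frac{x^2}{4}\Big]_0^2 = 1$$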
The marginal probability density functions of X and Y, denoted by
f_X(x) and f_Y(y) respectively, are given by:
$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy \qquad f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx$$
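For the example PDF above, these work out to:
$$f_X(x) = \int_0^1 \frac{x(1 + 3y^2)}{4}\,dy = \frac{x}{2}, \quad 0 < x < 2 \qquad f_Y(y) = \int_0^2 \frac{x(1 + 3y^2)}{4}\,dx = \frac{1 + 3y^2}{2}, \quad 0 < y < 1$$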
1.4 Conditional Distributions
Given that a student scored > 1200 on the SAT, what is the
probability the student will have a college GPA ≥ B ?
Let X and Y be two continuous RV's with joint pdf f(x, y) and marginal
pdf of X, f_X(x). Then for any x value for which f_X(x) > 0, the
conditional probability density function of Y given that X = x is:
$$f_{Y|X}(y|x) = \frac{f(x, y)}{f_X(x)} \qquad -\infty < y < \infty$$
For the discrete case:
$$p_{Y|X}(y|x) = \frac{p(x, y)}{p_X(x)}$$
Example: What is the probability that tire expert A (rv X ) will assign a
grade of 2 given that tire expert B (rv Y ) assigned a grade of
1?
(Joint distribution table of X and Y, as above.)
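In symbols, with the roles in the definition reversed, the quantity asked for is
$$p_{X|Y}(2|1) = \frac{p(2, 1)}{p_Y(1)}$$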
Example: For the joint PDF:
$$f(x, y) = \frac{x(1 + 3y^2)}{4} \qquad 0 < x < 2, \; 0 < y < 1$$
what is $P(Y \ge \frac{1}{2} \mid X = 1)$?
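Using the marginal f_X(x) = x/2 computed earlier, f_X(1) = 1/2, so
$$f_{Y|X}(y|1) = \frac{f(1, y)}{f_X(1)} = \frac{(1 + 3y^2)/4}{1/2} = \frac{1 + 3y^2}{2}$$
$$P\Big(Y \ge \tfrac{1}{2} \mid X = 1\Big) = \int_{1/2}^{1} \frac{1 + 3y^2}{2}\,dy = \frac{1}{2}\Big[y + y^3\Big]_{1/2}^{1} = \frac{1}{2}\Big(2 - \frac{5}{8}\Big) = \frac{11}{16}$$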
1.5 Joint PDF’s of n Random Variables
The concept of a joint PDF for 2 RV's can be extended to that of a joint
PDF of n RV's.
Joint PMF:
$$p(x_1, x_2, \ldots, x_n) = P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n)$$
Joint PDF:
$$P(a_1 \le X_1 \le b_1, a_2 \le X_2 \le b_2, \ldots, a_n \le X_n \le b_n) = \int_{a_1}^{b_1} \int_{a_2}^{b_2} \cdots \int_{a_n}^{b_n} f(x_1, x_2, \ldots, x_n)\,dx_n \cdots dx_2\,dx_1$$
An important discrete example is the multinomial distribution:
$$p(x_1, x_2, \ldots, x_r) = \frac{n!}{x_1!\,x_2! \cdots x_r!}\, p_1^{x_1} p_2^{x_2} \cdots p_r^{x_r}$$
$$x_i = 0, 1, 2, \ldots \qquad x_1 + x_2 + \cdots + x_r = n$$
where
n = number of trials,
p_i = probability of outcome i on any one trial,
x_i = number of trials resulting in outcome i.
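A quick computational sketch of the multinomial PMF (the grade probabilities here are made up for illustration):

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """p(x1, ..., xr) = n!/(x1! ... xr!) * p1^x1 * ... * pr^xr."""
    n = sum(counts)
    coef = factorial(n)
    for x in counts:
        coef //= factorial(x)  # multinomial coefficient stays an integer
    return coef * prod(p ** x for p, x in zip(probs, counts))

# Probability that, in n = 10 graded tires with hypothetical grade
# probabilities .2/.5/.3, exactly two get grade 1, five get grade 2,
# and three get grade 3:
print(multinomial_pmf([2, 5, 3], [0.2, 0.5, 0.3]))  # 0.08505
```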
2 Expected Values
PMF:
$$E[h(X, Y)] = \sum_x \sum_y h(x, y) \cdot p(x, y)$$
PDF:
$$E[h(X, Y)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(x, y) \cdot f(x, y)\,dx\,dy$$
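For instance, with the joint PDF f(x, y) = x(1 + 3y²)/4 used earlier and h(X, Y) = XY:
$$E[XY] = \int_0^2 \int_0^1 xy \cdot \frac{x(1 + 3y^2)}{4}\,dy\,dx = \int_0^2 \frac{x^2}{4}\,dx \int_0^1 y(1 + 3y^2)\,dy = \frac{2}{3} \cdot \frac{5}{4} = \frac{5}{6}$$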
2.1 Covariance
Often the RV's X, Y are related to each other (i.e., they are not
independent) and we want to know something about the strength of the
linear relationship between X and Y .
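The measure used is the covariance (with µ_X = E[X] and µ_Y = E[Y]):
$$\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = E[XY] - \mu_X \mu_Y$$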
(Two scatterplots illustrating different strengths of linear relationship between X and Y.)
Example: Calculate the covariance between the two tire experts:
(Joint distribution table of X and Y, as above.)
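A computational sketch using the same hypothetical joint table as in the marginals example above (not the lecture's actual values):

```python
# Hypothetical joint PMF p(x, y); illustration only
joint = {
    (1, 1): 0.10, (1, 2): 0.05, (1, 3): 0.05,
    (2, 1): 0.10, (2, 2): 0.20, (2, 3): 0.10,
    (3, 1): 0.05, (3, 2): 0.15, (3, 3): 0.20,
}

EX = sum(x * p for (x, y), p in joint.items())
EY = sum(y * p for (x, y), p in joint.items())
EXY = sum(x * y * p for (x, y), p in joint.items())
cov = EXY - EX * EY  # Cov(X, Y) = E[XY] - E[X]E[Y]
print(EX, EY, cov)   # ≈ 2.2, 2.1, 0.18
```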
2.2 Correlation
$$\rho_{X,Y} = \mathrm{Corr}(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} \qquad -1 \le \rho_{X,Y} \le 1$$
Properties:
1. Corr(aX + b, cY + d) = Corr(X, Y) provided a and c have the same sign.
2. If ρ_{X,Y} = 1 or ρ_{X,Y} = −1 then Y = aX + b for some a ≠ 0.
3. If X, Y are independent, then ρ_{X,Y} = 0. However,
ρ_{X,Y} = 0 does not imply independence, as the simulation sketch below illustrates:
(Scatterplot of a relationship with zero correlation but clear dependence.)
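A minimal simulation sketch (assuming NumPy) of zero correlation without independence, taking Y = X² with X symmetric about 0:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=100_000)
y = x ** 2                        # Y is completely determined by X

# Correlation is (near) zero even though X and Y are dependent:
print(np.corrcoef(x, y)[0, 1])    # ≈ 0.0
```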
Example: Calculate the correlation between the two tire experts:
(Joint distribution table of X and Y, as above.)
2.3 Correlation and Causation
Just because two variables are highly correlated does not mean that one
“causes the other”.
Example: “Pet a day keeps doctor away.” This headline appeared in the
August 2, 1990 issue of the Santa Barbara News-Press. It was
referring to a study of Medicare enrollees in an HMO.
Participants were followed for 1 year and the frequency of
doctor contacts noted. Those who owned pets had contact
with doctors an average of 8.42 times, while those without pets
had an average of 9.49 contacts. The study's conclusion was
that pet ownership has a moderating role in helping the elderly
through times of stress.
Example: In October 1994, The New York Times ran a front-page article
indicating that “men from traditional families, in which the
wives stay at home to care for children, earn more and get
higher raises than men from two-career families.” The
statement is based on studies of the salary histories of male
managers. Does this study suggest cause and effect: to make
more money, have a traditional family?
Example: In a German town it was found that the number of births and
stork sightings were positively correlated.
If two variables are correlated, the explanation could be any of the following:
2.4 Establishing Cause and Effect
However, more recently, many scientists have come to believe that in the
absence of such experimentation a good case can be made for cause
and effect if:
Example: Let X be the amount a customer is charged for a tune-up, with pmf:
x      40    45    50    55
p(x)   .1    .2    .3    .4
(Bar chart of this pmf.)
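For later reference, the mean and variance of X are:
$$\mu = E[X] = 40(.1) + 45(.2) + 50(.3) + 55(.4) = 50$$
$$\sigma^2 = V[X] = E[X^2] - \mu^2 = 2525 - 2500 = 25 \qquad \sigma = 5$$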
Continuing with our tune-up example, suppose that we sample two
customers and take the average of what they are charged. The
outcomes for that statistic are
x1    x2    x̄      p(x̄)
40 40 40 (.1)(.1) = .01
40 45 42.5 (.1)(.2) = .02
40 50 45 (.1)(.3) = .03
40 55 47.5 (.1)(.4) = .04
45 40 42.5 (.2)(.1) = .02
45 45 45 (.2)(.2) = .04
45 50 47.5 (.2)(.3) = .06
45 55 50 (.2)(.4) = .08
50 40 45 (.3)(.1) = .03
50 45 47.5 (.3)(.2) = .06
50 50 50 (.3)(.3) = .09
50 55 52.5 (.3)(.4) = .12
55 40 47.5 (.4)(.1) = .04
55 45 50 (.4)(.2) = .08
55 50 52.5 (.4)(.3) = .12
55 55 55 (.4)(.4) = .16
So, the table above represents every possible outcome of averaging two
bills.
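A short computational sketch in plain Python that reproduces this enumeration and scales to larger sample sizes:

```python
from itertools import product

# pmf of a single tune-up bill, from the example above
pmf = {40: 0.1, 45: 0.2, 50: 0.3, 55: 0.4}

def sampling_dist_of_mean(pmf, n):
    """Enumerate all ordered samples of size n and accumulate P(x̄ = v)."""
    dist = {}
    for sample in product(pmf, repeat=n):
        xbar = sum(sample) / n
        p = 1.0
        for x in sample:
            p *= pmf[x]
        dist[xbar] = dist.get(xbar, 0.0) + p
    return dict(sorted(dist.items()))

print(sampling_dist_of_mean(pmf, 2))  # matches the 16-row table above
print(sampling_dist_of_mean(pmf, 4))  # the 13-value distribution shown below
```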
What is the distribution of the statistic?
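Collecting the rows of the table above by the value of x̄:
x̄      40     42.5   45     47.5   50     52.5   55
p(x̄)   .01    .04    .10    .20    .25    .24    .16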
Suppose that we sample four customers and take the average of what
they are charged. The (partial) outcomes for that statistic are
x1   x2   x3   x4   x̄       p(x̄)
40   40   40   40   40.00    .0001
40   45   40   40   41.25    .0002
...
55   50   40   40   46.25    .0012
55   55   40   40   47.50    .0016
...
What is the distribution of the statistic?
x̄        p(x̄)
40.00 .0001
41.25 .0008
42.50 .0036
43.75 .0120
45.00 .0310
46.25 .0648
47.50 .1124
48.75 .1608
50.00 .1905
51.25 .1840
52.50 .1376
53.75 .0768
55.00 .0256
(Bar chart of the sampling distribution of x̄ for n = 4.)
If we sample 10 customers (over 1 million different possible samples!)
and take the average of what they are charged the distribution of the
mean would be:
(Bar chart of the sampling distribution of x̄ for n = 10, over the range 40 to 55.)
(Four panels comparing the sampling distributions of x̄ for the sample sizes considered above, each over the range 40 to 55.)
This pattern is not a coincidence. It turns out that, for most underlying
distributions, the distribution of the sample mean will be approximately
normal if we have enough observations. This is known as the Central
Limit Theorem (CLT).
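A quick simulation sketch (assuming NumPy) of the CLT for the tune-up pmf:

```python
import numpy as np

rng = np.random.default_rng(0)
values = np.array([40, 45, 50, 55])
probs = np.array([0.1, 0.2, 0.3, 0.4])

n = 30                                          # sample size
xbars = rng.choice(values, size=(100_000, n), p=probs).mean(axis=1)

mu = (values * probs).sum()                     # E[X] = 50
sigma2 = (((values - mu) ** 2) * probs).sum()   # V[X] = 25
print(xbars.mean(), xbars.std())                # ≈ 50 and ≈ sqrt(25/30) ≈ 0.91
```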
Notice how the mean and variance of the distributions changed: in each case
E[X̄] = µ, while V[X̄] = σ²/n shrinks as n grows. If X1, . . . , Xn is a random
sample from a distribution with mean µ and variance σ², then for large n
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \text{ (approximately)}$$
How large is “large”? The answer depends on the underlying distribution.
A good rule of thumb is that if n ≥ 30 the CLT generally applies.
Example: Breaking strength of a rivet has µ = 10,000 psi and
σ = 500 psi. What is the probability that the sample mean
breaking strength for a random sample of 40 rivets is between
9,900 and 10,200 psi?
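A sketch of the standard CLT calculation:
$$\sigma_{\bar{X}} = \frac{500}{\sqrt{40}} \approx 79.06$$
$$P(9900 \le \bar{X} \le 10200) \approx \Phi\left(\frac{10200 - 10000}{79.06}\right) - \Phi\left(\frac{9900 - 10000}{79.06}\right) = \Phi(2.53) - \Phi(-1.26) \approx 0.9943 - 0.1038 = 0.8905$$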
If X1, . . . , Xn are RV's with:
• mean: µ1, . . . , µn
• variance: σ1², . . . , σn²
then:
1. E[a1 X1 + . . . + an Xn ] = a1 E[X1 ] + . . . + an E[Xn ]
2. $V[a_1 X_1 + \ldots + a_n X_n] = \sum_{i=1}^{n} \sum_{j=1}^{n} a_i a_j \,\mathrm{Cov}(X_i, X_j)$
Note that the RV's Xi do not have to be independent for the above to
hold. However, if they are mutually independent, then:
$$V[a_1 X_1 + \ldots + a_n X_n] = \sum_{i=1}^{n} a_i^2 \sigma_i^2$$
This follows from the fact that:
• Cov(Xi, Xj) = 0 for i ≠ j when Xi and Xj are independent.
• Cov(Xi, Xi) = V(Xi).
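As a check, these rules recover the variance of the sample mean used earlier: with ai = 1/n and the Xi independent with common variance σ²,
$$V[\bar{X}] = \sum_{i=1}^{n} \frac{1}{n^2} \sigma^2 = \frac{\sigma^2}{n}$$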