Chapter 5
True Story
In the 1980s, the Hughes Aircraft Co. (HAC) bought cryogenic
coolers for their Bradley Fighting Vehicle night vision assemblies from
two different vendors:
Hughes Aircraft Co. (HAC) Santa Barbara (in-house)
BAC (out-of-house)
After installation into the Bradley Fighting Vehicles, the HAC coolers
seemed to be failing sooner than the BAC coolers.
How could the engineers determine if there was a significant
difference between the mean lives of the HAC coolers and the BAC
coolers?
[Figures: Bradley Fighting Vehicle; cooler. Image sources: www.us-army-info.com, www.rdysales.com]
Case 1. Hypothesis tests on the
difference between two means
Variances Known
The statistic

$$Z = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0, 1)$$
Case 1: Test of Hypothesis on Difference Between Two Means
Variances Known
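$H_0: \mu_1 - \mu_2 = \Delta_0$
$H_1: \mu_1 - \mu_2 \neq \Delta_0$ (two-sided), $H_1: \mu_1 - \mu_2 > \Delta_0$ (one-sided upper), $H_1: \mu_1 - \mu_2 < \Delta_0$ (one-sided lower)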
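Two-Sided Confidence Interval on the Difference in
Population Means, Variances Known
A $100(1-\alpha)\%$ confidence interval on the true
difference between means is given by L and U:
L: lower value
U: upper value

$$P(L \le \mu_1 - \mu_2 \le U) = 1 - \alpha$$

$$L = \bar{x}_1 - \bar{x}_2 - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

$$U = \bar{x}_1 - \bar{x}_2 + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

TI-83/84: STAT → TESTS → 9: 2-SampZInt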
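One-Sided Confidence Intervals on the Difference in
Population Means, Variances Known
A $100(1-\alpha)\%$ confidence interval on the true
difference between means is given by L or U:
L: lower value
U: upper value

$$P(L \le \mu_1 - \mu_2) = 1 - \alpha$$
or
$$P(\mu_1 - \mu_2 \le U) = 1 - \alpha$$
where
$$L = \bar{x}_1 - \bar{x}_2 - z_{\alpha}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

$$U = \bar{x}_1 - \bar{x}_2 + z_{\alpha}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

Note: There is no $\Delta_0$ in this equation, because we would only be
interested in the difference between the means for a CI.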
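Accommodating $\Delta_0$
TI-83, 84
$H_0: \mu_1 - \mu_2 = \Delta_0$
is the same as
$H_0: \mu_1 - \mu_2 - \Delta_0 = 0$
which is the same as
$H_0: \mu_1 - (\mu_2 + \Delta_0) = 0$

$$z_0 = \frac{\bar{x}_1 - (\bar{x}_2 + \Delta_0)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

When you enter Stats data into the 2-SampZTest option, add $\Delta_0$ to the value
you input for $\bar{x}_2$.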
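The Case 1 test statistic and two-sided confidence interval can also be checked away from the calculator. The following is a minimal sketch using only the Python standard library; the function name `two_sample_z` and the summary statistics are hypothetical and are not data from the cooler study.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2, delta0=0.0, alpha=0.05):
    """Return z0, the two-sided p-value, and the two-sided CI on mu1 - mu2."""
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)   # standard error of xbar1 - xbar2
    z0 = (xbar1 - xbar2 - delta0) / se
    std_normal = NormalDist()                    # N(0, 1)
    p_two_sided = 2 * (1 - std_normal.cdf(abs(z0)))
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    diff = xbar1 - xbar2                         # the interval does not use delta0
    return z0, p_two_sided, (diff - z_crit * se, diff + z_crit * se)


# Hypothetical summary statistics (hours of cooler life); not from the slides.
z0, p_value, (lower, upper) = two_sample_z(
    xbar1=1200.0, xbar2=1300.0, sigma1=150.0, sigma2=120.0, n1=25, n2=30
)
print(f"z0 = {z0:.3f}, p-value = {p_value:.4f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```

Passing a nonzero `delta0` shifts only the test statistic, which mirrors the calculator trick of adding $\Delta_0$ to $\bar{x}_2$.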
Case 5. Hypothesis Tests on the
Ratio of Two Variances
Note that I am teaching this case out of order from the text.
Notation
Chapter 3:
F(x) denotes the cumulative distribution function
(CDF) of both discrete and continuous random
variables.
Chapter 5:
The letter F is now used as a RANDOM
VARIABLE.
The letter f is now used as a specific value of F.
Analogy: X and x.
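The F Probability Distribution
Function (pdf)
W and Y are independent random variables
that are each distributed $\chi^2$ with u and v
degrees of freedom, respectively.
The statistic

$$F = \frac{W/u}{Y/v} \sim F_{u,v}$$

with pdf f(x).
[Figure: pdf curves for $F_{2,10}$ and $F_{200,10}$, with upper-tail area $\alpha$ and the percentage points $f_{1-\alpha,\,n_1-1,\,n_2-1}$ and $f_{\alpha,\,n_1-1,\,n_2-1}$ marked]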
Practice with the F table
X is ~ F with 2 and 4 degrees of freedom
Find A such that
P(X > A) = .01
P( X > A) = .05
P(X > A) = .95
P(X < A) = .025
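The table lookups above can be cross-checked numerically. This is a sketch under one stated assumption: when the numerator has exactly 2 degrees of freedom, the upper-tail probability of the F distribution has the closed form $P(X > x) = (1 + 2x/v)^{-v/2}$, which can be inverted directly. The helper name `f_upper_point_2_v` is hypothetical, and the shortcut does not apply to other numerator degrees of freedom (use the F table or a statistics package there).

```python
def f_upper_point_2_v(p_upper, v):
    """A such that P(X > A) = p_upper, for X ~ F with 2 and v degrees of freedom.

    Uses the closed form P(X > x) = (1 + 2x/v) ** (-v/2), valid only when the
    numerator degrees of freedom equal 2.
    """
    return (v / 2) * (p_upper ** (-2 / v) - 1)


a_01 = f_upper_point_2_v(0.01, 4)          # P(X > A) = .01
a_05 = f_upper_point_2_v(0.05, 4)          # P(X > A) = .05
a_95 = f_upper_point_2_v(0.95, 4)          # P(X > A) = .95
a_lower_025 = f_upper_point_2_v(0.975, 4)  # P(X < A) = .025 means P(X > A) = .975

for label, value in [("f(.01)", a_01), ("f(.05)", a_05),
                     ("f(.95)", a_95), ("f(.975)", a_lower_025)]:
    print(f"{label}, 2 and 4 df: {value:.4f}")
```

The last case shows the usual conversion: a lower-tail probability of .025 is looked up as an upper-tail probability of .975.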
The Test Procedure
A hypothesis testing procedure for the equality of two
variances is based on the following result.
The statistic

$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \sim F_{n_1-1,\,n_2-1}$$

If $\sigma_1^2 = \sigma_2^2$, then

$$F = \frac{S_1^2}{S_2^2} \sim F_{n_1-1,\,n_2-1}$$
Case 5: Test of Hypothesis on Equality of Two Variances
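$H_0: \sigma_1^2 = \sigma_2^2$

Two-sided, $H_1: \sigma_1^2 \neq \sigma_2^2$:
Reject $H_0$ if $f_0 > f_{\alpha/2,\,n_1-1,\,n_2-1}$ or $f_0 < f_{1-\alpha/2,\,n_1-1,\,n_2-1}$

One-sided upper, $H_1: \sigma_1^2 > \sigma_2^2$:
Reject $H_0$ if $f_0 > f_{\alpha,\,n_1-1,\,n_2-1}$

One-sided lower, $H_1: \sigma_1^2 < \sigma_2^2$:
Reject $H_0$ if $f_0 < f_{1-\alpha,\,n_1-1,\,n_2-1}$

Test statistic:
$$f_0 = \frac{s_1^2}{s_2^2}$$

TI-84: STAT → TESTS → E: 2-SampFTest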
One-Sided Confidence Interval on Equality of Population
Variances
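L: lower value
U: upper value

$$P\left(L \le \frac{\sigma_1^2}{\sigma_2^2}\right) = 1 - \alpha$$
or
$$P\left(\frac{\sigma_1^2}{\sigma_2^2} \le U\right) = 1 - \alpha$$
where
$$L = \frac{s_1^2}{s_2^2}\, f_{1-\alpha,\,n_2-1,\,n_1-1}$$

$$U = \frac{s_1^2}{s_2^2}\, f_{\alpha,\,n_2-1,\,n_1-1}$$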
Case 5: Computing the p-value
1. Clearly state the null and alternative hypotheses, H0 and H1.
2. Determine the numerical value of the test statistic, $f_0$.
3. Plot the pdf of F. (Recall that the lowest value is 0.)

Two-sided ($H_0: \sigma_1^2 = \sigma_2^2$, $H_1: \sigma_1^2 \neq \sigma_2^2$)
If you have a two-sided hypothesis test and the test statistic is closer
to zero than to the right tail of the pdf, the p-value is 2 x (area below
the test statistic).
If you have a two-sided hypothesis test and the test statistic is closer
to the right tail of the pdf, the p-value is 2 x (area above the test
statistic).

One-sided upper ($H_0: \sigma_1^2 = \sigma_2^2$, $H_1: \sigma_1^2 > \sigma_2^2$)
If you have a one-sided upper hypothesis test, the p-value is the area
above the test statistic.

One-sided lower ($H_0: \sigma_1^2 = \sigma_2^2$, $H_1: \sigma_1^2 < \sigma_2^2$)
If you have a one-sided lower hypothesis test, the p-value is the area
below the test statistic.
Case 2. Hypothesis tests on the
difference between two means
Pooled standard deviation:

$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$
Case 2: Test of Hypothesis on Difference Between Two Means
Variances Unknown and Equal
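$H_0: \mu_1 - \mu_2 = \Delta_0$

Two-sided, $H_1: \mu_1 - \mu_2 \neq \Delta_0$: Reject $H_0$ if $|t_0| > t_{\alpha/2,\,n_1+n_2-2}$
One-sided upper, $H_1: \mu_1 - \mu_2 > \Delta_0$: Reject $H_0$ if $t_0 > t_{\alpha,\,n_1+n_2-2}$
One-sided lower, $H_1: \mu_1 - \mu_2 < \Delta_0$: Reject $H_0$ if $t_0 < -t_{\alpha,\,n_1+n_2-2}$

Test statistic:
$$t_0 = \frac{\bar{x}_1 - \bar{x}_2 - \Delta_0}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}, \qquad s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$

Reject H0 if p-value is < $\alpha$

TI-83/84:
STAT → TESTS → 4: 2-SampTTest
Pooled? Yes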
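As a check on the pooled calculation, here is a minimal sketch that computes $s_p$, $t_0$ and the degrees of freedom from summary statistics. The function name `pooled_t` and the numbers are hypothetical. The critical value 2.101 is the t-table entry for $t_{.025,18}$, hard-coded because the Python standard library has no t distribution.

```python
from math import sqrt


def pooled_t(xbar1, s1, n1, xbar2, s2, n2, delta0=0.0):
    """Case 2: pooled standard deviation, test statistic t0, degrees of freedom."""
    df = n1 + n2 - 2
    sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)
    t0 = (xbar1 - xbar2 - delta0) / (sp * sqrt(1 / n1 + 1 / n2))
    return sp, t0, df


# Hypothetical summary statistics; not from the slides.
sp, t0, df = pooled_t(xbar1=92.5, s1=3.0, n1=10, xbar2=90.0, s2=3.0, n2=10)

T_CRIT_025_18 = 2.101  # t table value for alpha/2 = .025 and 18 df
reject_two_sided = abs(t0) > T_CRIT_025_18
print(f"sp = {sp:.3f}, t0 = {t0:.3f}, df = {df}, reject H0: {reject_two_sided}")
```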
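Two-Sided Confidence Interval on the Difference in Population
Means, Variances Unknown and Equal

$$L = \bar{x}_1 - \bar{x}_2 - t_{\alpha/2,\,n_1+n_2-2}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

$$U = \bar{x}_1 - \bar{x}_2 + t_{\alpha/2,\,n_1+n_2-2}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$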
Case 3: Test of Hypothesis on difference between two means
Variances Unknown and Unequal
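$H_0: \mu_1 - \mu_2 = \Delta_0$

Two-sided, $H_1: \mu_1 - \mu_2 \neq \Delta_0$: Reject $H_0$ if $|t_0| > t_{\alpha/2,\,\nu}$
One-sided upper, $H_1: \mu_1 - \mu_2 > \Delta_0$: Reject $H_0$ if $t_0 > t_{\alpha,\,\nu}$
One-sided lower, $H_1: \mu_1 - \mu_2 < \Delta_0$: Reject $H_0$ if $t_0 < -t_{\alpha,\,\nu}$

Test statistic and degrees of freedom:
$$t_0 = \frac{\bar{x}_1 - \bar{x}_2 - \Delta_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}, \qquad \nu = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$

Reject H0 if p-value is < $\alpha$

TI-83/84:
STAT → TESTS → 4: 2-SampTTest
Pooled? No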
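The unequal-variance statistic and its degrees of freedom are easy to get wrong by hand, so a short sketch may help. It assumes the $n - 1$ form of the degrees-of-freedom formula; the function name `unequal_variance_t` and the summary statistics are hypothetical.

```python
from math import sqrt


def unequal_variance_t(xbar1, s1, n1, xbar2, s2, n2, delta0=0.0):
    """Case 3: test statistic t0 and approximate degrees of freedom nu."""
    v1 = s1**2 / n1  # estimated variance of xbar1
    v2 = s2**2 / n2  # estimated variance of xbar2
    t0 = (xbar1 - xbar2 - delta0) / sqrt(v1 + v2)
    nu = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t0, nu


# Hypothetical summary statistics; not from the slides.
t0, nu = unequal_variance_t(xbar1=50.0, s1=2.0, n1=8, xbar2=46.0, s2=6.0, n2=12)
print(f"t0 = {t0:.3f}, nu = {nu:.2f} (round down to {int(nu)} for a t table)")
```

The value of `nu` is usually not an integer. Rounding down for a table lookup is the conservative choice, while a calculator may work with the fractional value directly.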
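Two-Sided Confidence Interval on the Difference in Population Means,
Variances UNKNOWN and UNEQUAL

$$L = \bar{x}_1 - \bar{x}_2 - t_{\alpha/2,\,\nu}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

$$U = \bar{x}_1 - \bar{x}_2 + t_{\alpha/2,\,\nu}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

where $\nu$ is the Case 3 degrees of freedom.
Note: There is no $\Delta_0$ in this equation, because we would only be
interested in the difference between the means for a CI.
TI-83/84:
STAT → TESTS → 0: 2-SampTInt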
One-Sided Confidence Intervals on the Difference in Population Means,
Variances UNKNOWN and UNEQUAL
Cases 1, 2 and 3:
Computing the p-value
1. Determine the numerical value of the test statistic, $z_0$ for Case 1
and $t_0$ for Cases 2 and 3.
2. For Case 1, draw the pdf of Z, the standard normal random
variable. For Cases 2 and 3, draw the pdf of the T distribution.

Two-sided ($H_0: \mu_1 - \mu_2 = \Delta_0$, $H_1: \mu_1 - \mu_2 \neq \Delta_0$)
If you have a two-sided hypothesis test and the test statistic is < 0, the
p-value is 2 x (area below the test statistic).
If you have a two-sided hypothesis test and the test statistic is > 0, the
p-value is 2 x (area above the test statistic).

One-sided upper ($H_0: \mu_1 - \mu_2 = \Delta_0$, $H_1: \mu_1 - \mu_2 > \Delta_0$)
If you have a one-sided upper hypothesis test, the p-value is the area
above the test statistic. (The test statistic could be < 0 or > 0.)

One-sided lower ($H_0: \mu_1 - \mu_2 = \Delta_0$, $H_1: \mu_1 - \mu_2 < \Delta_0$)
If you have a one-sided lower hypothesis test, the p-value is the area
below the test statistic. (The test statistic could be < 0 or > 0.)
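For Case 1, the area rules above translate directly into code. This is a sketch for the standard normal only, since the Python standard library has no t distribution for Cases 2 and 3. The function name `z_p_value` and the labels for the three alternatives are hypothetical.

```python
from statistics import NormalDist


def z_p_value(z0, alternative):
    """p-value for a Case 1 test statistic z0 under the stated alternative."""
    below = NormalDist().cdf(z0)  # area below the test statistic
    above = 1 - below             # area above the test statistic
    if alternative == "two-sided":
        # Double the smaller tail: below if z0 < 0, above otherwise.
        return 2 * (below if z0 < 0 else above)
    if alternative == "upper":
        return above
    if alternative == "lower":
        return below
    raise ValueError("alternative must be 'two-sided', 'upper' or 'lower'")


p_two = z_p_value(-1.96, "two-sided")
p_upper = z_p_value(1.645, "upper")
p_lower = z_p_value(1.645, "lower")
print(f"two-sided: {p_two:.4f}, upper: {p_upper:.4f}, lower: {p_lower:.4f}")
```

The last call illustrates the parenthetical remark in the rules: a one-sided lower test can have a positive test statistic, in which case the p-value is large.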
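Ch. 5: Cases 1, 2 and 3 Comparison
                        Case 1                             Case 2                             Case 3
Null                    $H_0: \mu_1 - \mu_2 = \Delta_0$    $H_0: \mu_1 - \mu_2 = \Delta_0$    $H_0: \mu_1 - \mu_2 = \Delta_0$
Alternative (2-sided)   $H_1: \mu_1 - \mu_2 \neq \Delta_0$ $H_1: \mu_1 - \mu_2 \neq \Delta_0$ $H_1: \mu_1 - \mu_2 \neq \Delta_0$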
Example Situation 2
From the Journal of the American Medical
Association (JAMA):
Ten adult males ages 45 - 55 are recruited to
study a new cholesterol medication.
Analysts wish to determine if the medication
decreases the total cholesterol levels in this
population.
Data
The sample is of adult males ages 45-55 (n = 10).
X1 is the total cholesterol level before medication.
X2 is the total cholesterol level after 6 months on the medication.
X1 and X2 are random variables, so D = X1 - X2 is a random variable.

Person   x1    x2    d = x1 - x2
1        240   196    44   (d1)
2        275   232    43   (d2)
3        220   214     6   etc.
4        222   226    -4
5        256   200    56
6        311   301    10
7        212   215    -3
8        220   220     0
9        284   280     4
10       306   241    65

Sample average: $\bar{d}$ = 22.1
Sample std. dev.: $s_d$ = 27.17
$\bar{D}$, $S_D^2$ and $S_D$ are random variables; $\bar{d}$ and $s_d$ are their observed values.
$\bar{d}$ will be the estimate of $\mu_D$.
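The statistic

$$T = \frac{\bar{D} - \mu_D}{S_D/\sqrt{n}} \sim T_{n-1}$$

where $S_D$ is the sample standard deviation of the difference
in paired observations,
n is the sample size, and
n-1 is the degrees of freedom.

$H_0: \mu_D = \Delta_0$
Two-sided, $H_1: \mu_D \neq \Delta_0$: Reject $H_0$ if $|t_0| > t_{\alpha/2,\,n-1}$
One-sided upper, $H_1: \mu_D > \Delta_0$: Reject $H_0$ if $t_0 > t_{\alpha,\,n-1}$
One-sided lower, $H_1: \mu_D < \Delta_0$: Reject $H_0$ if $t_0 < -t_{\alpha,\,n-1}$

$$t_0 = \frac{\bar{d} - \Delta_0}{s_d/\sqrt{n}}, \qquad s_d = \sqrt{\frac{\sum_{i=1}^{n}(d_i - \bar{d})^2}{n - 1}}$$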
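The paired calculation can be reproduced from the ten before and after cholesterol readings in the data table. The sketch below assumes $\Delta_0 = 0$ and a one-sided upper alternative (the medication lowers cholesterol, so before minus after is positive); both are assumptions, as is the hard-coded t-table value 1.833 for $t_{.05,9}$. One caution: the sample standard deviation computed from the ten listed differences with the $n - 1$ formula is about 26.75, not the 27.17 printed with the data, and the source does not explain the gap.

```python
from math import sqrt
from statistics import mean, stdev

# Total cholesterol before (x1) and after (x2) medication, persons 1 to 10.
before = [240, 275, 220, 222, 256, 311, 212, 220, 284, 306]
after = [196, 232, 214, 226, 200, 301, 215, 220, 280, 241]

d = [x1 - x2 for x1, x2 in zip(before, after)]  # paired differences
n = len(d)
d_bar = mean(d)   # sample mean of the differences
s_d = stdev(d)    # sample standard deviation (n - 1 in the denominator)

delta0 = 0        # assumed hypothesized mean difference
t0 = (d_bar - delta0) / (s_d / sqrt(n))

T_CRIT_05_9 = 1.833  # t table value for alpha = .05 and 9 df
reject_upper = t0 > T_CRIT_05_9
print(f"d_bar = {d_bar:.1f}, s_d = {s_d:.2f}, t0 = {t0:.3f}, reject H0: {reject_upper}")
```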
Two-Sided Confidence Interval on Mean Difference
(one population)
Variance Unknown
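A $100(1-\alpha)\%$ confidence interval on the true
difference is given by L and U:
L: lower value
U: upper value

$$P(L \le \mu_D \le U) = 1 - \alpha$$

$$L = \bar{d} - t_{\alpha/2,\,n-1}\frac{s_d}{\sqrt{n}}$$

$$U = \bar{d} + t_{\alpha/2,\,n-1}\frac{s_d}{\sqrt{n}}$$

TI-83/84:
STAT → TESTS → 8: TInterval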
Case 4: Computing the p-value

Two-sided ($H_0: \mu_D = \Delta_0$, $H_1: \mu_D \neq \Delta_0$)
If you have a two-sided hypothesis test and the test statistic is < 0, the
p-value is 2 x (area below the test statistic).
If you have a two-sided hypothesis test and the test statistic is > 0, the
p-value is 2 x (area above the test statistic).

One-sided upper ($H_0: \mu_D = \Delta_0$, $H_1: \mu_D > \Delta_0$)
If you have a one-sided upper hypothesis test, the p-value is the area
above the test statistic. (The test statistic could be < 0 or > 0.)

One-sided lower ($H_0: \mu_D = \Delta_0$, $H_1: \mu_D < \Delta_0$)
If you have a one-sided lower hypothesis test, the p-value is the area
below the test statistic. (The test statistic could be < 0 or > 0.)
Case 6. Hypothesis Tests on the
Equality of Two Proportions
Notation
n1 and n2 are sample sizes from two different populations.
x1 and x2 are the number of observations from n1 and n2 belonging
to a class of interest. (x1 and x2, in this application, are not random
variables, but integers.)
p1 and p2 are the population proportions (0 < p1,p2 < 1) belonging
to the class of interest.
$$\hat{p}_1 = \frac{x_1}{n_1} \quad \text{and} \quad \hat{p}_2 = \frac{x_2}{n_2}$$
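The statistic

$$Z = \frac{\hat{P}_1 - \hat{P}_2 - (p_1 - p_2)}{\sqrt{\hat{P}(1 - \hat{P})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \sim N(0, 1)$$

where $\hat{P} = \dfrac{X_1 + X_2}{n_1 + n_2}$ is the pooled sample proportion.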
Case 6: Test of Hypothesis on the Equality of Two
Proportions
$H_0: p_1 - p_2 = 0$
$H_1: p_1 - p_2 \neq 0$ (two-sided), $H_1: p_1 - p_2 > 0$ (one-sided upper), $H_1: p_1 - p_2 < 0$ (one-sided lower)
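A small sketch of the Case 6 calculation, using the pooled proportion in the standard error as in the Z statistic above. The function name `two_proportion_z` and the counts are hypothetical, not data from the slides.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Case 6: sample proportions, pooled proportion, z0, two-sided p-value."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_hat = (x1 + x2) / (n1 + n2)  # pooled estimate, used because H0 says p1 = p2
    se = sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))
    z0 = (p1_hat - p2_hat) / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z0)))
    return p1_hat, p2_hat, p_hat, z0, p_two_sided


# Hypothetical counts: 30 of 100 versus 20 of 100 in the class of interest.
p1_hat, p2_hat, p_hat, z0, p_value = two_proportion_z(x1=30, n1=100, x2=20, n2=100)
print(f"p_hat = {p_hat:.3f}, z0 = {z0:.3f}, two-sided p-value = {p_value:.4f}")
```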
Two-Sided Confidence Interval on the Difference of
Population Proportions
One-Sided Confidence Interval on the Difference of
Population Proportions
Two-sided ($H_0: p_1 - p_2 = 0$, $H_1: p_1 - p_2 \neq 0$)
If you have a two-sided hypothesis test and the test statistic is < 0, the
p-value is 2 x (area below the test statistic).
If you have a two-sided hypothesis test and the test statistic is > 0, the
p-value is 2 x (area above the test statistic).

One-sided upper ($H_0: p_1 - p_2 = 0$, $H_1: p_1 - p_2 > 0$)
If you have a one-sided upper hypothesis test, the p-value is the area
above the test statistic. (The test statistic could be < 0 or > 0.)

One-sided lower ($H_0: p_1 - p_2 = 0$, $H_1: p_1 - p_2 < 0$)
If you have a one-sided lower hypothesis test, the p-value is the area
below the test statistic. (The test statistic could be < 0 or > 0.)
Analysis of Variance (ANOVA):
Hypothesis Tests for More Than
Two Populations
P-value reminders and something
new
If the p-value
is very, very small (<< .01), we REJECT H0
in favor of H1
is < $\alpha$, we REJECT H0
If the p-value
is very big (> .20), we FAIL TO REJECT H0
is > $\alpha$, we FAIL TO REJECT H0
Completely Randomized
Experiment
ANOVA
ANOVA Notation:
Completely Randomized Experiment
Treatments: The aspects about which you are trying to determine a
difference or not.
$\tau_i$ is the ith treatment effect, typically measured as a deviation from an
overall process mean
n is the number of observations in a treatment.
When n is not the same for each treatment, we say the design is unbalanced.
a is the number of treatments
Assumptions
Each observation is obtained randomly
The Hypothesis Test
$H_0: \tau_1 = \tau_2 = \tau_3 = \dots = \tau_a = 0$
$H_1: \tau_i \neq 0$ for at least one i
Reject H0 if p-value is < $\alpha$
H0 in English.
There is no difference between the treatments with respect to
measured observations
H1 in English.
There is at least one treatment that is yielding statistically different
observations
Problem Using Excel
1. Enter the data into an Excel sheet
Temperature
100     125     150     175
21.8    21.7    21.9    21.9
21.9    21.4    21.8    21.7
21.7    21.5    21.8    21.8
21.6    21.5    21.6    21.7
21.7            21.5    21.6
21.5                    21.8
21.8
There are 22 separate and distinct items.
3. Click in the Input Range window
4. Highlight the treatment labels and all the data (in this case,
cell A2 through cell D9)
5. Select Grouped By Columns
6. Select Labels in First Row
7. Select the desired Type I error level, Alpha ($\alpha$)
8. Click in the Output Range window and select any cell in the sheet
where you want your results to show (here cell A11)
9. Click OK
The p-value is .08 > .05, so we
FAIL TO REJECT H0
ANOVA
Source of Variation   SS            df   MS           F          P-value    F crit
Between Groups        0.13911039     3   0.04637013   2.615911   0.082655   3.159911
Within Groups         0.319071429   18   0.01772619
Total                 0.458181818   21
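The Excel output can be reproduced by hand to see where each number comes from. This is a sketch, with one assumption flagged: the slide lists the 22 temperature observations in ragged rows, and the assignment of the short rows to columns below is the one that reproduces the SS values in the output (7, 4, 5 and 6 observations at 100, 125, 150 and 175). The F crit value is copied from the Excel output rather than computed.

```python
# Observations by temperature; column assignment of the ragged rows is inferred.
groups = {
    100: [21.8, 21.9, 21.7, 21.6, 21.7, 21.5, 21.8],
    125: [21.7, 21.4, 21.5, 21.5],
    150: [21.9, 21.8, 21.8, 21.6, 21.5],
    175: [21.9, 21.7, 21.8, 21.7, 21.6, 21.8],
}

all_obs = [y for ys in groups.values() for y in ys]
grand_mean = sum(all_obs) / len(all_obs)

# Between-groups SS: group size times squared deviation of group mean from grand mean.
ss_between = sum(len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2 for ys in groups.values())
# Within-groups SS: squared deviations of observations from their own group mean.
ss_within = sum((y - sum(ys) / len(ys)) ** 2 for ys in groups.values() for y in ys)

df_between = len(groups) - 1             # a - 1
df_within = len(all_obs) - len(groups)   # N - a
f0 = (ss_between / df_between) / (ss_within / df_within)

F_CRIT = 3.159911  # from the Excel output, alpha = .05 with 3 and 18 df
print(f"SS between = {ss_between:.6f}, SS within = {ss_within:.6f}, F = {f0:.6f}")
print("reject H0" if f0 > F_CRIT else "fail to reject H0")
```

Comparing `f0` with F crit gives the same decision as comparing the p-value with $\alpha$.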
What you should conclude
Randomized Complete Block
Experiment
ANOVA
How is this application different from a
completely randomized experiment
(previous slides)?
If the same subjects receive every
treatment, then we have a
randomized block design.
An analogy is the paired t-test we did with the
mazes.
A block design removes differences among
experimental subjects.
Example of difference
Completely Randomized Experiment:
I have three arthritis medications. I want to know if there is
a significant difference between them.
I enlist 12 people for my study and randomly assign them
one of the three meds.
They report their pain indices (while on the medication) as
scores between 0 (no pain) and 10 (most pain).
I will name these people Abe, Bob, Cal, Dave, Ed, Fred,
Gary, Hal, Jon, Lee, Mike, Ned
Randomized Block Experiment:
Same study, but I want to eliminate the inherent
differences between person 1, person 2, ..., person 12.
For this study, I enlist four men: Rob, Sal, Ted and Vic
Each man gets to try each medication in sequence and
reports his pain index from 0 (no pain) to 10 (high pain)
while on each medication
Completely Randomized vs.
Randomized Block
Observations
Med 1 8 (Mike) 7 (Dave) 6 (Lee) 3 (Gary)
Med 2 6 (Fred) 7 (Abe) 4 (Hal) 9 (Cal)
Med 3 6 (Jon) 5 (Ned) 8 (Bob) 7 (Ed)
ANOVA Notation:
Randomized Block Experiment
Treatments: The aspects about which you are trying to determine a difference or
not.
$\tau_i$ is the ith treatment effect, typically measured as a deviation from an overall
process mean
n is the number of observations in a treatment.
When n is not the same for each treatment, we say the design is unbalanced.
a is the number of treatments
b is the number of blocks
The Hypothesis Test
$H_0: \tau_1 = \tau_2 = \tau_3 = \dots = \tau_a = 0$
$H_1: \tau_i \neq 0$ for at least one i
H0 in English.
There is no difference between the treatments with respect to
measured observations
H1 in English.
There is at least one treatment that is yielding statistically different
observations
Example using Excel
3. Click in the Input Range window
4. Highlight everything (in this case,
cell A1 through cell D5)!
5. Select Labels
6. Select the desired Type I error level, Alpha ($\alpha$)
7. Click in the Output Range window and select any cell in the sheet
where you want your results to show (here cell A7)
8. Click OK
The p-value is .125 > .05, so we
FAIL TO REJECT H0
ANOVA
Source of Variation   SS         df   MS            F             P-value    F crit
Rows                  0.021225    3   0.007075      30.32142857   0.000506   4.757055
Columns               0.0014      2   0.0007        3             0.125      5.143249
Error                 0.0014      6   0.000233333
Total                 0.024025   11
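The slides do not include the raw data behind this output, so the sketch below shows the same randomized block arithmetic on hypothetical pain indices for the four men in the arthritis example (rows are blocks, columns are the three medications). All the numbers in `pain` are invented for illustration. The F crit value for 2 and 6 degrees of freedom is the one shown in the Excel output above.

```python
# Hypothetical pain indices: one row per man (block), one column per medication.
pain = {
    "Rob": [8, 6, 7],
    "Sal": [5, 4, 6],
    "Ted": [7, 5, 6],
    "Vic": [4, 3, 5],
}

rows = list(pain.values())
b = len(rows)      # number of blocks
a = len(rows[0])   # number of treatments
grand_mean = sum(sum(r) for r in rows) / (a * b)

block_means = [sum(r) / a for r in rows]
treat_means = [sum(r[j] for r in rows) / b for j in range(a)]

ss_total = sum((y - grand_mean) ** 2 for r in rows for y in r)
ss_block = a * sum((m - grand_mean) ** 2 for m in block_means)   # "Rows" in Excel
ss_treat = b * sum((m - grand_mean) ** 2 for m in treat_means)   # "Columns" in Excel
ss_error = ss_total - ss_block - ss_treat

df_treat, df_error = a - 1, (a - 1) * (b - 1)
f_treat = (ss_treat / df_treat) / (ss_error / df_error)

F_CRIT_2_6 = 5.143249  # alpha = .05 with 2 and 6 df, from the Excel output
print(f"SS rows = {ss_block:.3f}, SS columns = {ss_treat:.3f}, SS error = {ss_error:.3f}")
print(f"F (treatments) = {f_treat:.3f}:", "reject H0" if f_treat > F_CRIT_2_6 else "fail to reject H0")
```

Removing the block sum of squares from the error term is what lets the block design detect treatment differences that person-to-person variation would otherwise hide.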
What you should conclude
Since the p-value > $\alpha$, we fail to reject H0 and
conclude that there is no significant difference among the
arsenic test procedures.
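Chapter 5 Summary

Case 1
Null hypothesis: $H_0: \mu_1 - \mu_2 = \Delta_0$
Test statistic: $z_0 = \dfrac{\bar{x}_1 - \bar{x}_2 - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
2-sided confidence interval: $L, U = \bar{x}_1 - \bar{x}_2 \mp z_{\alpha/2}\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$

Case 5
Null hypothesis: $H_0: \dfrac{\sigma_1^2}{\sigma_2^2} = 1$
Test statistic: $f_0 = \dfrac{s_1^2}{s_2^2}$
2-sided confidence interval: $L = \dfrac{s_1^2}{s_2^2}\, f_{1-\alpha/2,\,n_2-1,\,n_1-1}$, $U = \dfrac{s_1^2}{s_2^2}\, f_{\alpha/2,\,n_2-1,\,n_1-1}$

Case 2
Null hypothesis: $H_0: \mu_1 - \mu_2 = \Delta_0$
Test statistic: $t_0 = \dfrac{\bar{x}_1 - \bar{x}_2 - \Delta_0}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$
2-sided confidence interval: $L, U = \bar{x}_1 - \bar{x}_2 \mp t_{\alpha/2,\,n_1+n_2-2}\; s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$

Case 3
Null hypothesis: $H_0: \mu_1 - \mu_2 = \Delta_0$
Test statistic: $t_0 = \dfrac{\bar{x}_1 - \bar{x}_2 - \Delta_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
2-sided confidence interval: $L, U = \bar{x}_1 - \bar{x}_2 \mp t_{\alpha/2,\,\nu}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$

Case 4
Null hypothesis: $H_0: \mu_D = \Delta_0$
Test statistic: $t_0 = \dfrac{\bar{d} - \Delta_0}{s_d/\sqrt{n}}$
2-sided confidence interval: $L, U = \bar{d} \mp t_{\alpha/2,\,n-1}\dfrac{s_d}{\sqrt{n}}$

Case 6
Null hypothesis: $H_0: p_1 - p_2 = 0$
Test statistic: $z_0 = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$, where $\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$
2-sided confidence interval: $L, U = \hat{p}_1 - \hat{p}_2 \mp z_{\alpha/2}\sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$

In each interval, L takes the minus sign and U takes the plus sign.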