Data: Sample 1 = {667, 653, 614, 612, 604}; n1 = 5    Sample 2 = {593, 525, 520}; n2 = 3
Analysis via T-test (if equivariance holds):
Point estimates: $\bar{y} = \sum y_i / n$
Group Means: $\bar{y}_1 = \frac{667 + 653 + 614 + 612 + 604}{5} = 630$, $\bar{y}_2 = \frac{593 + 525 + 520}{3} = 546$
NOTE: $\bar{y}_1 - \bar{y}_2 = 84 > 0$
Group Variances ($s^2 = SS/df$): $s_1^2 = \frac{(667 - 630)^2 + \cdots + (604 - 630)^2}{5 - 1} = 788.5$, $s_2^2 = \frac{(593 - 546)^2 + \cdots + (520 - 546)^2}{3 - 1} = 1663$
$F = \frac{1663}{788.5} = 2.11 < 4$
$SS_{Err} = SS_1 + SS_2 = 6480$
Pooled Variance: $s_{pooled}^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2} = \frac{(5 - 1)(788.5) + (3 - 1)(1663)}{5 + 3 - 2} = 1080$, $df_{Err} = 6$
The pooled variance is a weighted average of the group variances, using the degrees of freedom as the weights.
Standard Error: $\text{s.e.}_0 = \sqrt{s_{pooled}^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} = \sqrt{1080 \left( \frac{1}{5} + \frac{1}{3} \right)} = 24$
p-value $= 2 P(\bar{Y}_1 - \bar{Y}_2 \ge 84) = 2 P\left( T_6 \ge \frac{84 - 0}{24} \right) = 2 P(T_6 \ge 3.5)$
> 2 * (1 - pt(3.5, 6))
[1] 0.01282634
Reject $H_0$ at $\alpha = .05$: stat signif, Hosp > Clinic
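The hand calculation above can be retraced step by step in R. This is a minimal sketch using base R only; the variable names (`sp2`, `se0`, `t_stat`, `p_val`) are illustrative and do not come from the slides.

```r
# Data from the example
y1 <- c(667, 653, 614, 612, 604)   # Sample 1 (hospital)
y2 <- c(593, 525, 520)             # Sample 2 (clinic)
n1 <- length(y1)
n2 <- length(y2)

# Group variances, then the pooled variance (weights = degrees of freedom)
s1_sq <- var(y1)
s2_sq <- var(y2)
sp2   <- ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

# Standard error of the difference in means under H0
se0 <- sqrt(sp2 * (1 / n1 + 1 / n2))

# Test statistic and two-sided p-value on n1 + n2 - 2 = 6 degrees of freedom
t_stat <- (mean(y1) - mean(y2) - 0) / se0
p_val  <- 2 * (1 - pt(t_stat, df = n1 + n2 - 2))

print(c(sp2 = sp2, se0 = se0, t = t_stat, p = p_val))
```

The printed values should agree with the pooled variance, standard error, test statistic, and p-value worked out by hand.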
R code:
> y1 = c(667, 653, 614, 612, 604)
> y2 = c(593, 525, 520)
>
> t.test(y1, y2, var.equal = TRUE)

	Two Sample t-test

data:  y1 and y2
t = 3.5, df = 6, p-value = 0.01283
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  25.27412 142.72588
sample estimates:
mean of x mean of y
      630       546

Formal Conclusion
p-value < $\alpha$ = .05
Reject $H_0$ at this level.

Interpretation
The samples provide evidence that the difference between mean costs is (moderately) statistically significant, at the 5% level, with the hospital being higher than the clinic (by an average of $84).
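The 95 percent confidence interval reported by `t.test` can be reproduced from three quantities: the difference in means (84), the standard error (24), and the t quantile on 6 degrees of freedom. A minimal sketch, with illustrative variable names:

```r
diff_means <- 630 - 546            # difference in sample means
se0        <- 24                   # pooled standard error
t_crit     <- qt(0.975, df = 6)    # two-sided 95% critical value

# Interval = estimate -/+ critical value * standard error
ci <- diff_means + c(-1, 1) * t_crit * se0
print(ci)
```

The interval excludes 0, which is consistent with rejecting the null hypothesis at the 5% level.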
Alternate method ~
[Figure: k treatment groups $Y_1, Y_2, \ldots, Y_k$]
Null Hypothesis? $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$
$H_A$: At least one treatment mean $\mu_i$ differs from the others.
Example: Y = $ Cost of a certain medical service
Assume Y is known to be normally distributed at each of k = 2 health care facilities (groups).
Hospital: $Y_1 \sim N(\mu_1, \sigma_1)$    Clinic: $Y_2 \sim N(\mu_2, \sigma_2)$
Null Hypothesis $H_0: \mu_1 = \mu_2$, i.e., $\mu_1 - \mu_2 = 0$ ("No difference exists.")
2-sided test at significance level $\alpha = .05$
Data: Sample 1 = {667, 653, 614, 612, 604}; n1 = 5    Sample 2 = {593, 525, 520}; n2 = 3
ANOVA F-test (if equivariance holds):
Point estimates: $\bar{y} = \sum y_i / n$
Group Means: $\bar{y}_1 = \frac{667 + 653 + 614 + 612 + 604}{5} = 630$, $\bar{y}_2 = \frac{593 + 525 + 520}{3} = 546$
NOTE: $\bar{y}_1 - \bar{y}_2 = 84 > 0$
Grand Mean: $\bar{y} = \frac{667 + 653 + 614 + 612 + 604 + 593 + 525 + 520}{5 + 3} = \frac{5(630) + 3(546)}{5 + 3} = 598.50$
$SS_{Tot} = (667 - 598.5)^2 + (653 - 598.5)^2 + (614 - 598.5)^2 + (612 - 598.5)^2 + (604 - 598.5)^2 + (593 - 598.5)^2 + (525 - 598.5)^2 + (520 - 598.5)^2 = 19710$
How can we measure this? Imagine zero variability within groups
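One way to read this thought experiment, offered as an assumption about the intended argument rather than a statement from the slides: if every observation $y_{ij}$ in group $i$ equaled its group mean $\bar{y}_i$, the only variability left would be the spread of the group means around the grand mean $\bar{y}$.

```latex
\sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y})^2
  \;=\; \sum_{i=1}^{k} n_i \, (\bar{y}_i - \bar{y})^2
  \qquad \text{when } y_{ij} = \bar{y}_i \text{ for all } i, j.
```

On this reading, the right-hand side measures how far apart the treatment means are, and any within-group variability is what remains to be compared against it.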
$SS_{Err} = (667 - 630)^2 + (653 - 630)^2 + (614 - 630)^2 + (612 - 630)^2 + (604 - 630)^2 + (593 - 546)^2 + (525 - 546)^2 + (520 - 546)^2 = 6480$
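The treatment (between-group) sum of squares can be worked out the same way. The sketch below assumes the usual one-way ANOVA definition and uses the group means 630 and 546 and the grand mean 598.5. It also checks that the two pieces add up to the total.

```latex
\begin{align*}
SS_{Trt} &= \sum_{i=1}^{k} n_i \, (\bar{y}_i - \bar{y})^2
          = 5\,(630 - 598.5)^2 + 3\,(546 - 598.5)^2 \\
         &= 4961.25 + 8268.75 = 13230 \\
SS_{Trt} + SS_{Err} &= 13230 + 6480 = 19710 = SS_{Tot}
\end{align*}
```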
$H_A: \sigma_1^2 \ne \sigma_2^2$
$s_1^2 = \frac{SS_1}{df_1}$, $s_2^2 = \frac{SS_2}{df_2}$
Test Statistic: $F = \frac{s_1^2}{s_2^2}$
Sampling Distribution = ?
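For the example data, the variance ratio can be computed directly. This sketch assumes the standard result that, for normal samples with equal population variances, the ratio of sample variances follows an F distribution with the two samples' degrees of freedom. The larger variance goes in the numerator to match F = 1663 / 788.5. The variable names are illustrative.

```r
y1 <- c(667, 653, 614, 612, 604)
y2 <- c(593, 525, 520)

# Ratio of sample variances, larger over smaller: 1663 / 788.5
F_ratio <- var(y2) / var(y1)

# Upper-tail probability on (n2 - 1, n1 - 1) = (2, 4) degrees of freedom,
# doubled (and capped by the smaller tail) for a two-sided test
p_upper     <- pf(F_ratio, df1 = length(y2) - 1, df2 = length(y1) - 1,
                  lower.tail = FALSE)
p_two_sided <- 2 * min(p_upper, 1 - p_upper)

print(c(F = F_ratio, p = p_two_sided))
```

A p-value above .05 here would give no evidence against equal variances, in line with the informal F < 4 check.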
$SS_{Tot} = SS_{Trt} + SS_{Err}$    $df_{Tot} = df_{Trt} + df_{Err}$

ANOVA Table
Source      df    SS       MS = SS/df                   F = MSTrt/MSErr    p-value
Treatment   1     13230    13230 ($= s_{between}^2$)    12.25              .01282634 (on $F_{1,6}$)
Error       6     6480     1080 ($= s_{within}^2$)
Total       7     19710
Note: MSErr = 1080 is also $s_{pooled}^2$.
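Each entry of the ANOVA table can be rebuilt from the raw data. This is a minimal base-R sketch; the group labels "hospital" and "clinic" and all variable names are chosen here for illustration.

```r
y <- c(667, 653, 614, 612, 604, 593, 525, 520)
g <- factor(rep(c("hospital", "clinic"), times = c(5, 3)))

grand_mean  <- mean(y)
group_means <- tapply(y, g, mean)     # one mean per factor level
group_n     <- tapply(y, g, length)   # one sample size per factor level

# Sums of squares: total, treatment (between), error (within)
ss_tot <- sum((y - grand_mean)^2)
ss_trt <- sum(group_n * (group_means - grand_mean)^2)
ss_err <- ss_tot - ss_trt

# Degrees of freedom, mean squares, F statistic, p-value
df_trt <- nlevels(g) - 1
df_err <- length(y) - nlevels(g)
ms_trt <- ss_trt / df_trt
ms_err <- ss_err / df_err             # equals the pooled variance
f_stat <- ms_trt / ms_err
p_val  <- pf(f_stat, df_trt, df_err, lower.tail = FALSE)

print(c(SSTrt = ss_trt, SSErr = ss_err, SSTot = ss_tot, F = f_stat, p = p_val))
```

With k = 2 groups the F statistic is the square of the pooled t statistic (12.25 = 3.5^2), so the two tests give the same p-value.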
Thus, the treatment accounts for $\frac{13230}{19710} = 67.1\%$ of the total variability in the response Y.
R code:
# ANOVA FOR UNBALANCED DESIGN
> y1 = c(667, 653, 614, 612, 604)
> y2 = c(593, 525, 520)
>
> Data = data.frame(
+   Y = c(y1, y2),
+   X = factor(rep(c("y1", "y2"), times = c(length(y1), length(y2))))
+ )
>
> var.test(Y ~ X, data = Data)   # EQUIVARIANCE?
>
> anova(lm(Y ~ X, data = Data))  # ONE-WAY ANOVA TABLE
Analysis of Variance Table

Response: Y
          Df Sum Sq Mean Sq F value  Pr(>F)
X          1  13230   13230   12.25 0.01283 *
Residuals  6   6480    1080
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$
Idea: Test all possible pairwise comparisons, each via a two-sample t-test.
Example: Suppose there are k = 5 treatment groups.
(1, 2)  (1, 3)  (1, 4)  (1, 5)  (2, 3)  (2, 4)  (2, 5)  (3, 4)  (3, 5)  (4, 5)
p = ...  p = ...  p = ...  p = ...  p = ...  p = ...  p = ...  p = ...  p = ...  p = ...
There are $\binom{5}{2} = 10$ such comparisons. PROBLEM??? SPURIOUS SIGNIFICANCE!!!
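A quick calculation shows how large the problem can be. The sketch assumes all ten null hypotheses are true and, as a simplification, that the ten tests are independent. Pairwise t-tests that share groups are not strictly independent, so the result is a rough guide rather than an exact figure.

```r
k     <- 5
alpha <- 0.05
m     <- choose(k, 2)    # number of pairwise comparisons

# Probability of at least one false rejection among m independent tests,
# each run at level alpha, when every null hypothesis is true
fwer <- 1 - (1 - alpha)^m

print(c(comparisons = m, familywise_error = fwer))
```

Under these assumptions the chance of at least one spurious "significant" result is about 40%, far above the nominal 5%.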
BONFERRONI CORRECTION: Make each comparison at level $\alpha^* = \alpha / 10$ (here, $\alpha^* = .05 / 10$).
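The correction can be applied in two equivalent ways: compare each raw p-value with the reduced level, or inflate the p-values with base R's `p.adjust` and compare with the original level. The ten p-values below are hypothetical, invented purely for illustration; they are not results from any data in these slides.

```r
alpha      <- 0.05
m          <- choose(5, 2)   # 10 pairwise comparisons
alpha_star <- alpha / m      # Bonferroni per-comparison level

# HYPOTHETICAL p-values for the 10 comparisons (illustration only)
p_raw <- c(0.001, 0.004, 0.012, 0.020, 0.049,
           0.080, 0.150, 0.300, 0.620, 0.910)

# Option 1: compare raw p-values with the reduced level
reject_raw <- p_raw < alpha_star

# Option 2: multiply p-values by m (capped at 1) and compare with alpha
p_adj      <- p.adjust(p_raw, method = "bonferroni")
reject_adj <- p_adj < alpha

print(data.frame(p_raw, p_adj, reject = reject_adj))
```

With these made-up values only the two smallest p-values remain significant after correction, although five of the ten fall below .05 before it. Base R's `pairwise.t.test` accepts a `p.adjust.method` argument that applies the same adjustment to all pairwise t-tests in one call.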
Alternate method ~
MODEL ASSUMPTIONS?
$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$