S09 Handout - Inference from Small Samples

6, page 1

The overall F-test is only one step in the comparison of several groups. In Chapter 5, we saw

how the pooled standard deviation could be used in confidence intervals and hypothesis tests for

the difference of any pair of group means. We can generalize this to confidence intervals and

hypothesis tests for any linear combination of group means.

$\gamma = C_1\mu_1 + C_2\mu_2 + \dots + C_I\mu_I$

Mice example:

Suppose I wanted to examine the difference in the mean lifetimes of the two control diets:
$\mu_1 - \mu_2$. Then $C_1 = 1$, $C_2 = -1$, $C_3 = 0$, $C_4 = 0$, $C_5 = 0$, $C_6 = 0$.

More complicated combinations are sometimes of interest: suppose I wanted to compare the

average of the two control diets to the average of the four reduced calorie diets: what are the

$C_i$'s for it?

In the above two examples, the sum of the $C_i$'s is 0 in each case. A linear combination of the

means in which the coefficients sum to 0 is called a contrast because it compares or contrasts

some means with others. Specific contrasts are often of interest in ANOVA, but we can create a

confidence interval for any linear combination of means; it does not need to be a contrast.
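As a small illustration of the definition above, the following sketch checks whether a set of coefficients sums to zero. The function name `is_contrast` and the three coefficient lists are illustrative assumptions, not part of the handout; exact fractions are used so the sum is not affected by floating-point rounding.

```python
from fractions import Fraction

def is_contrast(coefs):
    """Return True when the coefficients of a linear combination sum to 0."""
    return sum(coefs) == 0

# Difference between the two control diets: mu1 - mu2
pair_difference = [1, -1, 0, 0, 0, 0]

# Average of the two control diets minus the average of the four reduced calorie diets
controls_vs_reduced = [Fraction(1, 2)] * 2 + [Fraction(-1, 4)] * 4

# Plain average of all six means: a linear combination, but not a contrast
overall_average = [Fraction(1, 6)] * 6

print(is_contrast(pair_difference))
print(is_contrast(controls_vs_reduced))
print(is_contrast(overall_average))
```

The first two lists are contrasts; the third sums to 1, so it is a linear combination for which a confidence interval can still be formed, but it is not a contrast.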

Which of the following linear combinations of means are contrasts?

a) $(\mu_1 - \mu_3) + (\mu_4 - \mu_5)$

b) $\dfrac{\mu_1 + \mu_2 + \mu_3 + \mu_4 + \mu_5}{5}$

c) $2\mu_1 - \mu_2 - \mu_3$

d) $\dfrac{\mu_2 + \mu_3 + \mu_4 + \mu_5}{4} - \mu_1$

e) $\mu_1 + \mu_2 + \mu_5 - \mu_3 - \mu_4$

f) $\dfrac{\mu_3 - \mu_2}{35} - \dfrac{\mu_6 - \mu_3}{10}$

(this is from the mice example on p. 157 and compares the increase in mean life
expectancy per calorie of going from N/N85 to N/R50 to the increase per calorie of
going from N/R50 to N/R40)

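Estimate $\gamma = C_1\mu_1 + \dots + C_I\mu_I$ by $g = C_1\bar{Y}_1 + \dots + C_I\bar{Y}_I$.

$SD(g) = \sigma\sqrt{\dfrac{C_1^2}{n_1} + \dfrac{C_2^2}{n_2} + \dots + \dfrac{C_I^2}{n_I}}$

We estimate $\sigma$ by the pooled standard deviation $s_p$. Plugging this estimate into SD(g) gives the
estimated standard deviation of g, which is the standard error of g:

$SE(g) = s_p\sqrt{\dfrac{C_1^2}{n_1} + \dfrac{C_2^2}{n_2} + \dots + \dfrac{C_I^2}{n_I}}$

A $100(1-\alpha)\%$ confidence interval for $\gamma$ is

$g \pm t_{df}(1-\alpha/2)\,SE(g)$

where df is the degrees of freedom for $s_p$ (that is, $n - I$). A test of the hypothesis
$H_0: \gamma = 0$ is carried out using the test statistic $t = \dfrac{g}{SE(g)}$.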
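The form of SD(g) follows from the rule for the variance of a linear combination of independent random variables. A brief sketch, assuming the I samples are independent and share a common standard deviation $\sigma$ (the usual one-way ANOVA model), with $n_i$ denoting the size of group i:

```latex
\begin{aligned}
\operatorname{Var}(g)
  &= \operatorname{Var}\!\left(\sum_{i=1}^{I} C_i \bar{Y}_i\right)
   = \sum_{i=1}^{I} C_i^2 \operatorname{Var}(\bar{Y}_i)
   = \sigma^2 \sum_{i=1}^{I} \frac{C_i^2}{n_i}, \\
SD(g) &= \sigma \sqrt{\sum_{i=1}^{I} \frac{C_i^2}{n_i}}, \qquad
SE(g) = s_p \sqrt{\sum_{i=1}^{I} \frac{C_i^2}{n_i}} .
\end{aligned}
```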

Examples:

1. Compute a confidence interval for the difference between mean lifetimes for the laboratory
control (N/N85) and the unrestricted controls (NP):

$SE(g) = 6.678\sqrt{\dfrac{1}{49} + \dfrac{1}{57}} = 1.301$


2. Compute an estimate of the contrast which is the average of the two control diets minus the
average of the four reduced calorie diets, along with a confidence interval.

$g = \dfrac{27.402 + 32.691}{2} - \dfrac{42.297 + 42.886 + 39.686 + 45.117}{4} = -12.45$

$SE(g) = 6.678\sqrt{\dfrac{(1/2)^2}{49} + \dfrac{(1/2)^2}{57} + \dfrac{(1/4)^2}{71} + \dfrac{(1/4)^2}{56} + \dfrac{(1/4)^2}{56} + \dfrac{(1/4)^2}{60}} = 0.7800$

We are 95% confident that the mean lifetime for the two control diets is from 10.9 to 14.0
months less than the mean lifetime for the four restricted diets.
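The two examples can be checked numerically with a short sketch. The group means and sizes are taken from the SPSS output reproduced later in this handout, the pooled standard deviation 6.678 and the multiplier $t_{343}(.975) = 1.967$ are the handout's values, and the function name `contrast_interval` and the direction of the difference in Example 1 (N/N85 minus NP) are assumptions made for illustration.

```python
from math import sqrt

# Sample means (months) and group sizes as reported in the handout's SPSS output,
# in the order NP, N/N85, N/R50, R/R50, N/R lopro, N/R40.
means = [27.402, 32.691, 42.297, 42.886, 39.686, 45.117]
sizes = [49, 57, 71, 56, 56, 60]
s_p = 6.678      # pooled standard deviation used in the handout
t_mult = 1.967   # t_343(.975), as quoted in the handout

def contrast_interval(coefs, means, sizes, s_p, multiplier):
    """Return (g, SE(g), lower, upper) for the linear combination sum(C_i * mean_i)."""
    g = sum(c * m for c, m in zip(coefs, means))
    se = s_p * sqrt(sum(c * c / n for c, n in zip(coefs, sizes)))
    half_width = multiplier * se
    return g, se, g - half_width, g + half_width

# Example 1 (assumed direction): N/N85 minus NP
ex1 = contrast_interval([-1, 1, 0, 0, 0, 0], means, sizes, s_p, t_mult)

# Example 2: average of the two controls minus average of the four reduced calorie diets
ex2 = contrast_interval([0.5, 0.5, -0.25, -0.25, -0.25, -0.25], means, sizes, s_p, t_mult)

for label, (g, se, lo, hi) in [("Example 1", ex1), ("Example 2", ex2)]:
    print(f"{label}: g = {g:.3f}, SE = {se:.4f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

For Example 2 this reproduces g = -12.45 and SE(g) = 0.7800, and the interval endpoints round to the 10.9 and 14.0 months quoted above.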

Simultaneous Inferences

Fishing Expeditions and Data Snooping: tests based on how the data turned out.

Say you would like to study a particular stream in Chile that is being used to dispose of waste

from pulp manufacture.

i) Measure the concentrations of contaminants in the water

ii) Measure different stream characteristics such as species diversity (plants and

animals), species richness (plants and animals), brood size for several species of fish,

asymmetry in fish, prevalence of bacterial infection (plants and animals) etc., in the

stream and in several other uncontaminated streams in the region.

iii) Keep measuring stream characteristics until you find a significant difference between
the pulp waste stream and the others at the 5% significance level (95% confidence).

This example, and the one found in your text, which you should study until you understand it,
illustrate the need for family-wise control of the significance level α, or equivalently the confidence level.


If we form individual 95% confidence intervals for a set of linear combinations of means, then

we cannot be 95% confident that they all include the true parameters they’re estimating. The

actual confidence that all the intervals in a family are simultaneously correct is called the

familywise confidence level.

Example: Say we conduct an experiment where we make 10 pairwise comparisons and control
each of them at the 95% confidence level. What is the probability of at least one Type I error
occurring in the experiment?

Let $p = \Pr(\text{success}) = 0.95$ and $q = \Pr(\text{failure}) = 0.05$, where success means that we do not make a
Type I error, and let $x$, the number of Type I errors among the 10 comparisons, be a binomial random
variable. Also, assume that each comparison is independent of every other. Then the probability of
at least one Type I error is given by

$\Pr(x \ge 1) = 1 - \Pr(x = 0) = 1 - \binom{10}{0} p^{10} q^{0} = 1 - 0.5987 = 0.4013$
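The calculation above can be reproduced in a few lines. This is only a sketch under the same independence assumption as the example; the function name `prob_at_least_one_type_i_error` is made up for illustration.

```python
from math import comb

def prob_at_least_one_type_i_error(k, alpha):
    """P(at least one Type I error) in k independent comparisons, each at level alpha."""
    p = 1 - alpha   # probability of no Type I error in a single comparison
    q = alpha       # probability of a Type I error in a single comparison
    prob_no_errors = comb(k, 0) * p**k * q**0
    return 1 - prob_no_errors

familywise_error = prob_at_least_one_type_i_error(k=10, alpha=0.05)
print(round(familywise_error, 4))
```

Changing `k` shows how quickly the familywise error rate grows as more comparisons are made at the same individual level.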

Hence, if we would like to control our probability of type I error for our entire experiment, we

must make some adjustments. Unfortunately, the above calculation depends on the trials being

independent, which is probably not the case for most experiments. So we cannot calculate

family-wise probabilities exactly. For this reason, there are several different versions of alpha

correction techniques.

The most common form of family-wise correction uses the Bonferroni inequality to create
simultaneous confidence intervals with any desired familywise confidence level. To create
$100(1-\alpha)\%$ simultaneous confidence intervals for k parameters, we make each confidence
interval an individual $100(1-\alpha/k)\%$ confidence interval.
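A minimal sketch of the Bonferroni adjustment, reusing the earlier example of k = 10 comparisons at a familywise level of 95%. The function name is hypothetical, and the independence calculation at the end is only an illustration: the Bonferroni guarantee itself does not require independence.

```python
def bonferroni_individual_level(alpha, k):
    """Individual confidence level needed for k intervals at familywise level 1 - alpha."""
    return 1 - alpha / k

alpha, k = 0.05, 10
individual = bonferroni_individual_level(alpha, k)

# Illustration only: IF the k intervals happened to be independent, the chance
# that all of them cover their parameters would be individual**k.
familywise_if_independent = individual ** k

print(f"each interval at {100 * individual:.1f}% confidence")
print(f"familywise confidence under independence: {familywise_if_independent:.4f}")
```

Each interval is built at 99.5% confidence, and in the independent case the familywise confidence comes out slightly above the 95% target, which is the sense in which Bonferroni is conservative.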

Bonferroni guarantees that the familywise confidence level is at least 1-α, but it can be overkill,

especially when k is large. There are several ways that have been developed for creating

simultaneous confidence intervals among means that can be less drastic.

• Planned comparisons: contrasts which the researcher decides are of interest before the

data are collected. We can control the familywise confidence level using the Bonferroni

inequality or one of the other methods listed below.

• Unplanned comparisons: contrasts which the researcher decides are of interest after

examining the data. These may be chosen from a larger set of contrasts which have been

examined or may be chosen after looking at the data to suggest contrasts of interest. The

confidence intervals must take into account that you actually (in the first case) or

essentially (in the second case) examined a large number of contrasts and picked out the

most “significant” one or ones.


In the specific case of all pairwise comparisons of group means, a number of procedures have

been developed to control the familywise error rate. The primary one is

• Tukey-Kramer procedure (for all planned or unplanned pairwise comparisons)

In the general case of contrasts (or any linear combinations of the means) which are not

necessarily pairwise comparisons, there are two main choices. These methods can also be used

for pairwise comparisons.

• Planned comparisons: Bonferroni

• Unplanned comparisons: Scheffe (can also be used for planned comparisons)

In all these cases, the confidence interval for a contrast γ always has the form:

$g \pm (\text{multiplier}) \times SE(g)$

The specific method used determines only the multiplier. If you have a legitimate choice

between two or more procedures, you can choose the one with the smaller multiplier.

In SPSS, the standard errors of one or more contrasts can be calculated by selecting the Contrasts

button on the One-way ANOVA window. You will have to find the value of the appropriate

multiplier to create a confidence interval for the contrast.

Pairwise comparisons between all pairs of means can be obtained by clicking the Post Hoc

button in the One-Way ANOVA window. It will automatically give you confidence intervals for

the difference between each pair of means. There are a multitude of options there; the ones

corresponding to the ones mentioned here are:

Procedure       SPSS option              Multiplier
Bonferroni      Bonferroni               $t_{n-I}(1 - \alpha/(2k))$, where $k = \binom{I}{2}$ is the number of pairwise comparisons of means
Tukey-Kramer    Tukey (not Tukey's-b)    $q_{I,n-I}(1-\alpha)/\sqrt{2}$ (q is from Table A.5)


There are I = 6 groups and n − I = 343 d.f. within groups. There are $\binom{6}{2} = \dfrac{6!}{2!\,4!} = 15$ pairwise
comparisons. The coefficients or multipliers for 95% confidence intervals for the difference
between each pair of means are:

1. LSD: $t_{343}(.975) = 1.967$ (approx. 1.984 using Table A.2 with 100 d.f.)

3. Scheffe: $\sqrt{(I-1)\,F_{I-1,\,n-I}(.95)} = \sqrt{5\,F_{5,343}(.95)}$ (approx. $\sqrt{5(2.26)} = 3.36$ using Table A.4 with df2 = 200)

4. Tukey-Kramer: $\dfrac{q_{6,343}(.95)}{\sqrt{2}} = 2.866$ (approx. $\dfrac{4.10}{\sqrt{2}} = 2.90$ using Table A.5 with 120 df)
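To see how the choice of multiplier changes the width of an interval, here is a short sketch for the NP versus N/N85 comparison. The mean difference (-5.289) and its standard error (1.301) are taken from the SPSS output in this handout, the multipliers are the LSD and Tukey-Kramer values listed above, and the helper name `interval` is an assumption made for illustration.

```python
def interval(estimate, se, multiplier):
    """Confidence interval of the form estimate +/- multiplier * SE."""
    half_width = multiplier * se
    return estimate - half_width, estimate + half_width

estimate = -5.289   # mean difference NP minus N/N85 from the SPSS output
se = 1.301          # its standard error from the SPSS output

multipliers = {"LSD": 1.967, "Tukey-Kramer": 2.866}
intervals = {name: interval(estimate, se, m) for name, m in multipliers.items()}

for name, (lo, hi) in intervals.items():
    print(f"{name}: ({lo:.3f}, {hi:.3f})")
```

Up to rounding of the inputs, the Tukey-Kramer interval agrees with the bounds -9.018 and -1.561 that SPSS reports for this pair, while the LSD interval is narrower because it makes no adjustment for the 15 comparisons.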

If I had just been interested in all pairwise comparisons a priori, then I would use Tukey-Kramer.
If there were other pre-planned contrasts of interest in addition to all pairwise
comparisons, then I would use either Bonferroni (but I would have to increase k to reflect the
additional contrasts) or Scheffe, whichever had the smaller multiplier. If there were additional
unplanned comparisons, then I would use Scheffe for all comparisons.


Post Hoc Tests

Multiple Comparisons

Tukey HSD

                          Mean Difference                        95% Confidence Interval
(I) Diet     (J) Diet     (I-J)            Std. Error    Sig.    Lower Bound    Upper Bound
NP           N/N85         -5.289*         1.301         .001      -9.018         -1.561
             N/R50        -14.895*         1.240         .000     -18.450        -11.341
             R/R50        -15.484*         1.306         .000     -19.228        -11.740
             N/R lopro    -12.284*         1.306         .000     -16.028         -8.540
             N/R40        -17.715*         1.286         .000     -21.400        -14.029
N/N85        NP             5.289*         1.301         .001       1.561          9.018
             N/R50         -9.606*         1.188         .000     -13.010         -6.202
             R/R50        -10.194*         1.257         .000     -13.796         -6.593
             N/R lopro     -6.994*         1.257         .000     -10.596         -3.393
             N/R40        -12.425*         1.235         .000     -15.965         -8.885
N/R50        NP            14.895*         1.240         .000      11.341         18.450
             N/N85          9.606*         1.188         .000       6.202         13.010
             R/R50          -.589          1.194         .996      -4.009          2.832
             N/R lopro      2.611          1.194         .246       -.809          6.032
             N/R40         -2.819          1.171         .156      -6.176           .537
R/R50        NP            15.484*         1.306         .000      11.740         19.228
             N/N85         10.194*         1.257         .000       6.593         13.796
             N/R50           .589          1.194         .996      -2.832          4.009
             N/R lopro      3.200          1.262         .117       -.417          6.817
             N/R40         -2.231          1.241         .468      -5.787          1.325
N/R lopro    NP            12.284*         1.306         .000       8.540         16.028
             N/N85          6.994*         1.257         .000       3.393         10.596
             N/R50         -2.611          1.194         .246      -6.032           .809
             R/R50         -3.200          1.262         .117      -6.817           .417
             N/R40         -5.431*         1.241         .000      -8.987         -1.875
N/R40        NP            17.715*         1.286         .000      14.029         21.400
             N/N85         12.425*         1.235         .000       8.885         15.965
             N/R50          2.819          1.171         .156       -.537          6.176
             R/R50          2.231          1.241         .468      -1.325          5.787
             N/R lopro      5.431*         1.241         .000       1.875          8.987

*. The mean difference is significant at the .05 level.

Homogeneous Subsets

Months survived

Tukey HSD (a, b)

                      Subset for alpha = .05
Diet         N        1         2         3         4
NP           49       27.402
N/N85        57                 32.691
N/R lopro    56                           39.686
N/R50        71                           42.297    42.297
R/R50        56                           42.886    42.886
N/R40        60                                     45.117
Sig.                  1.000     1.000     .108      .212

Means for groups in homogeneous subsets are displayed.

a. Uses Harmonic Mean Sample Size = 57.462.

b. The group sizes are unequal. The harmonic mean of the group sizes is used. Type I error levels are not guaranteed.
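The harmonic mean sample size in footnote a can be checked directly from the group sizes in the table. A minimal sketch using Python's standard library (the variable names are illustrative):

```python
from statistics import harmonic_mean

# Group sizes from the Homogeneous Subsets table
group_sizes = {
    "NP": 49,
    "N/N85": 57,
    "N/R lopro": 56,
    "N/R50": 71,
    "R/R50": 56,
    "N/R40": 60,
}

# Harmonic mean = (number of groups) / (sum of reciprocals of the group sizes)
hm = harmonic_mean(list(group_sizes.values()))
print(round(hm, 3))
```

The result agrees with the 57.462 reported by SPSS to within rounding.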

