
Chap. 6, page 1

The overall F-test is only one step in the comparison of several groups. In Chapter 5, we saw

how the pooled standard deviation could be used in confidence intervals and hypothesis tests for

the difference of any pair of group means. We can generalize this to confidence intervals and

hypothesis tests for any linear combination of group means.

γ = C1 µ1 + C2 µ2 + … + CI µI

Mice example:

Suppose I wanted to examine the difference in the mean lifetimes of the two control diets:

µ1 − µ2. Then C1 = 1, C2 = −1, C3 = 0, C4 = 0, C5 = 0, C6 = 0.

More complicated combinations are sometimes of interest: suppose I wanted to compare the

average of the two control diets to the average of the four reduced calorie diets: what are the

Ci's for it?

In the above two examples, the sum of the Ci's is 0 in each case. A linear combination of the

means in which the coefficients sum to 0 is called a contrast because it compares or contrasts

some means with others. Specific contrasts are often of interest in ANOVA, but we can create a

confidence interval for any linear combination of means; it does not need to be a contrast.
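Since everything that follows hinges on this coefficients-sum-to-zero condition, here is a quick check in Python (a sketch, not part of the original handout; the helper name is mine):

```python
def is_contrast(coeffs, tol=1e-12):
    """A linear combination sum(Ci * mu_i) is a contrast iff sum(Ci) == 0."""
    return abs(sum(coeffs)) < tol

# Mice example: comparing the two control diets, mu1 - mu2.
print(is_contrast([1, -1, 0, 0, 0, 0]))                   # True
# Average of the two controls minus average of the four restricted diets.
print(is_contrast([1/2, 1/2, -1/4, -1/4, -1/4, -1/4]))    # True
# A plain average of five means is not a contrast (coefficients sum to 1).
print(is_contrast([1/5] * 5))                             # False
```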

Which of the following linear combinations of means are contrasts?

a) (µ1 − µ3) + (µ4 − µ5)

b) (µ1 + µ2 + µ3 + µ4 + µ5)/5

c) 2µ1 − µ2 − µ3

d) (µ2 + µ3 + µ4 + µ5)/4 − µ1

e) µ1 + µ2 + µ5 − µ3 − µ4

f) (µ3 − µ2)/35 − (µ6 − µ3)/10


(this is from the mice example on p. 157 and compares the increase in mean life

expectancy per calorie of going from N/R85 to N/R50 to the increase per calorie of

going from N/R50 to N/R40)
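Transcribing coefficient vectors on (µ1, …, µ6) for a)–f) above (my reading of the expressions, not from the handout) and applying the sum-to-zero test:

```python
def is_contrast(coeffs, tol=1e-12):
    """A linear combination sum(Ci * mu_i) is a contrast iff sum(Ci) == 0."""
    return abs(sum(coeffs)) < tol

# Coefficients on (mu1, ..., mu6) for each expression above.
examples = {
    "a": [1, 0, -1, 1, -1, 0],
    "b": [1/5, 1/5, 1/5, 1/5, 1/5, 0],
    "c": [2, -1, -1, 0, 0, 0],
    "d": [-1, 1/4, 1/4, 1/4, 1/4, 0],
    "e": [1, 1, -1, -1, 1, 0],
    "f": [0, -1/35, 1/35 + 1/10, 0, 0, -1/10],
}
for label, c in examples.items():
    print(label, "contrast" if is_contrast(c) else "not a contrast")
```

The coefficients in a), c), d), and f) sum to 0, so those are contrasts; in b) they sum to 1 and in e) they sum to 1, so those are not.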

Estimate γ = C1 µ1 + … + CI µI by g = C1 Ȳ1 + … + CI ȲI.

SD(g) = σ √(C1²/n1 + C2²/n2 + … + CI²/nI)

We estimate σ by the pooled standard deviation sp. Plugging this estimate into SD(g) gives the estimated standard deviation of g, which is the standard error of g:

SE(g) = sp √(C1²/n1 + C2²/n2 + … + CI²/nI)

A 100(1−α)% confidence interval for γ is

g ± t_df(1 − α/2) SE(g)

where df is the degrees of freedom for sp (that is, n − I). A test of the hypothesis H0: γ = 0 is carried out using the test statistic t = g / SE(g).
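The estimate, standard error, interval, and test statistic can be wrapped into one small routine; a Python sketch (the function is mine, not from the notes), checked against the NP vs. N/N85 comparison using the group means and sizes reported in the SPSS output at the end of these notes and the handout's sp = 6.678:

```python
import math

def contrast_inference(coeffs, means, ns, sp, t_mult):
    """g = sum(Ci * Ybar_i), SE(g) = sp*sqrt(sum(Ci^2/ni)),
    the interval g +/- t_mult*SE(g), and t = g/SE(g) for H0: gamma = 0."""
    g = sum(c * m for c, m in zip(coeffs, means))
    se = sp * math.sqrt(sum(c * c / n for c, n in zip(coeffs, ns)))
    return g, se, (g - t_mult * se, g + t_mult * se), g / se

# NP minus N/N85; means and sizes from the SPSS output, sp and
# t_343(.975) = 1.967 as used in these notes.
g, se, ci, t_stat = contrast_inference(
    coeffs=[1, -1, 0, 0, 0, 0],
    means=[27.402, 32.691, 42.297, 42.886, 39.686, 45.117],
    ns=[49, 57, 71, 56, 56, 60],
    sp=6.678, t_mult=1.967)
print(round(g, 3), round(se, 3))   # -5.289 1.301
```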

Examples:

1. Compute a confidence interval for the difference between mean lifetimes for the laboratory

control (N/N85) and the unrestricted controls (NP):

SE(g) = 6.678 √(1/49 + 1/57) = 1.301


2. Compute an estimate of the contrast which is the average of the two control diets minus the average of the four reduced calorie diets, along with a confidence interval.

g = (27.402 + 32.691)/2 − (42.297 + 42.886 + 39.686 + 45.117)/4 = −12.45

SE(g) = 6.678 √((1/2)²/49 + (1/2)²/57 + (1/4)²/71 + (1/4)²/56 + (1/4)²/56 + (1/4)²/60) = 0.7800

We are 95% confident that the mean life expectancy of the two control diets is from 10.9 to 14.0 months less than the mean lifetime of the four restricted diets.
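The worked example above can be reproduced numerically (a Python sketch, not part of the handout; means and group sizes are taken from the SPSS output at the end of these notes):

```python
import math

# Group means and sizes from the SPSS output; sp = 6.678 and
# t_343(.975) = 1.967 are the values used in these notes.
means = {"NP": 27.402, "N/N85": 32.691, "N/R50": 42.297,
         "R/R50": 42.886, "N/R lopro": 39.686, "N/R40": 45.117}
ns    = {"NP": 49, "N/N85": 57, "N/R50": 71,
         "R/R50": 56, "N/R lopro": 56, "N/R40": 60}
sp = 6.678

# Average of the two controls minus average of the four restricted diets.
coeffs = {"NP": 1/2, "N/N85": 1/2,
          "N/R50": -1/4, "R/R50": -1/4, "N/R lopro": -1/4, "N/R40": -1/4}

g  = sum(c * means[k] for k, c in coeffs.items())
se = sp * math.sqrt(sum(c**2 / ns[k] for k, c in coeffs.items()))
t_mult = 1.967
lo, hi = g - t_mult * se, g + t_mult * se
print(round(g, 2), round(se, 4))      # -12.45 0.78
print(round(lo, 1), round(hi, 1))     # -14.0 -10.9
```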

Simultaneous Inferences

Fishing Expeditions and Data Snooping: tests based on how the data turned out.

Say you would like to study a particular stream in Chile that is being used to dispose of waste

from pulp manufacture.

i) Measure the concentrations of contaminants in the water

ii) Measure different stream characteristics such as species diversity (plants and

animals), species richness (plants and animals), brood size for several species of fish,

asymmetry in fish, prevalence of bacterial infection (plants and animals) etc., in the

stream and in several other uncontaminated streams in the region.

iii) Keep measuring stream characteristics until you find a difference between the pulp waste stream and the others that is significant at the 5% level.

This example, and the one found in your text (which you should study until you understand it), illustrate the need for family-wise control of the alpha level, or confidence level.
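A small Monte Carlo sketch (my own illustration, not from the text) of why step iii) is a problem: even when the waste stream truly differs in nothing, testing 20 characteristics makes a "significant" finding very likely:

```python
import random
from statistics import mean, stdev

random.seed(1)

def two_sample_t(x, y):
    """Pooled two-sample t statistic."""
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * stdev(x) ** 2 + (ny - 1) * stdev(y) ** 2) / (nx + ny - 2)
    return (mean(x) - mean(y)) / (sp2 * (1 / nx + 1 / ny)) ** 0.5

crit = 2.02                     # approx. t_38(.975) for two samples of 20
trials, n_tests = 200, 20       # 20 characteristics tested per "study"
hits = 0
for _ in range(trials):
    # All characteristics are pure noise: both streams drawn from N(0, 1).
    significant = any(
        abs(two_sample_t([random.gauss(0, 1) for _ in range(20)],
                         [random.gauss(0, 1) for _ in range(20)])) > crit
        for _ in range(n_tests))
    hits += significant
print(hits / trials)            # roughly 1 - 0.95**20, i.e. about 0.64
```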


If we form individual 95% confidence intervals for a set of linear combinations of means, then

we cannot be 95% confident that they all include the true parameters they’re estimating. The

actual confidence that a family of confidence intervals is simultaneously correct is called the

familywise confidence level.

Example: Say we conduct an experiment where we make 10 pairwise comparisons and control

each of them at the 95% confidence level. What is the probability of at least one Type I error occurring in the experiment?

Let p = Pr(success) = 0.95 and q = Pr(failure) = 0.05, where "success" means that we do not make a Type I error, and let x be a binomial random variable. Also, assume that each comparison is independent of every other. Then the probability of at least one Type I error is given by

Pr(x ≥ 1) = 1 − Pr(x = 0) = 1 − C(10, 0) p^10 q^0 = 1 − 0.5987 = 0.4013
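The calculation can be replicated directly:

```python
from math import comb

p, q, k = 0.95, 0.05, 10                   # 10 comparisons, each at 95% confidence
pr_no_error = comb(k, 0) * p**k * q**0     # Pr(x = 0) = 0.95^10
pr_at_least_one = 1 - pr_no_error          # Pr(x >= 1)
print(round(pr_no_error, 4))               # 0.5987
print(round(pr_at_least_one, 4))           # 0.4013
```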

Hence, if we would like to control our probability of Type I error for the entire experiment, we must make some adjustments. Unfortunately, the above calculation depends on the comparisons being independent, which is probably not the case for most experiments, so we cannot calculate family-wise probabilities exactly. For this reason, there are several different versions of alpha correction techniques.

The most common form of family-wise correction uses the Bonferroni inequality to create simultaneous confidence intervals with any desired familywise confidence level. To create 100(1−α)% simultaneous confidence intervals for k parameters, we make each confidence interval an individual 100(1−α/k)% confidence interval.
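A minimal sketch of the Bonferroni arithmetic (the union bound in the comment is why the guarantee needs no independence assumption):

```python
# Bonferroni: for familywise confidence of at least 100(1 - alpha)% over
# k intervals, run each one at the individual 100(1 - alpha/k)% level.
alpha, k = 0.05, 10
individual_alpha = alpha / k              # each interval allowed to miss 0.5% of the time
individual_conf = 1 - individual_alpha    # so each is a 99.5% interval
# Union bound: Pr(at least one interval misses) <= k * (alpha/k) = alpha,
# regardless of any dependence between the intervals.
worst_case_familywise_error = k * individual_alpha
print(individual_conf, worst_case_familywise_error)   # 0.995 0.05
```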

Bonferroni guarantees that the familywise confidence level is at least 1 − α, but it can be overkill, especially when k is large. Several less drastic methods have been developed for creating simultaneous confidence intervals among means.

• Planned comparisons: contrasts which the researcher decides are of interest before the

data are collected. We can control the familywise confidence level using the Bonferroni

inequality or one of the other methods listed below.

• Unplanned comparisons: contrasts which the researcher decides are of interest after

examining the data. These may be chosen from a larger set of contrasts which have been

examined or may be chosen after looking at the data to suggest contrasts of interest. The

confidence intervals must take into account that you actually (in the first case) or

essentially (in the second case) examined a large number of contrasts and picked out the

most “significant” one or ones.


In the specific case of all pairwise comparisons of group means, a number of procedures have

been developed to control the familywise error rate. The primary one is

• Tukey-Kramer procedure (for all planned or unplanned pairwise comparisons)

In the general case of contrasts (or any linear combinations of the means) which are not

necessarily pairwise comparisons, there are two main choices. These methods can also be used

for pairwise comparisons.

• Planned comparisons: Bonferroni

• Unplanned comparisons: Scheffe (can also be used for planned comparisons)

In all these cases, the confidence interval for a contrast γ always has the form

g ± (multiplier) × SE(g)

The specific method used determines only the multiplier. If you have a legitimate choice between two or more procedures, you can choose the one with the smaller multiplier.

In SPSS, the standard errors of one or more contrasts can be calculated by selecting the Contrasts

button on the One-way ANOVA window. You will have to find the value of the appropriate

multiplier to create a confidence interval for the contrast.

Pairwise comparisons between all pairs of means can be obtained by clicking the Post Hoc

button in the One-Way ANOVA window. It will automatically give you confidence intervals for

the difference between each pair of means. There are a multitude of options there; the ones corresponding to the methods discussed here are:

Method          SPSS option               Multiplier
Bonferroni      Bonferroni                t_{n−I}(1 − α/2k), where k = C(I, 2) is the number of pairwise comparisons of means
Tukey-Kramer    Tukey (not Tukey's-b)     q_{I,n−I}(1 − α)/√2   (q is from Table A.5)


There are I = 6 groups and n − I = 343 d.f. within groups. There are C(6, 2) = 6!/(2! 4!) = 15 pairwise comparisons. The multipliers for 95% confidence intervals for the difference between each pair of means are:

1. LSD: t_343(.975) = 1.967 (approx. 1.984 using Table A.2 with 100 d.f.)

3. Scheffé: √((I − 1) F_{I−1, n−I}(.95)) (approx. √(5(2.26)) = 3.36 using Table A.4 with df2 = 200)

4. Tukey-Kramer: q_{6,343}(.95)/√2 = 2.866 (approx. 4.10/√2 = 2.90 using Table A.5 with 120 d.f.)
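These multipliers can also be computed directly instead of from the printed tables; a sketch using scipy (assumed available; `studentized_range` needs scipy ≥ 1.7). The Bonferroni value here is for k = 15, i.e. all pairwise comparisons:

```python
import math
from scipy.stats import t, f, studentized_range

I_groups, df = 6, 343          # I = 6 diets, n - I = 343 d.f. within groups
alpha = 0.05
k = math.comb(I_groups, 2)     # 15 pairwise comparisons

lsd        = t.ppf(1 - alpha / 2, df)
bonferroni = t.ppf(1 - alpha / (2 * k), df)
scheffe    = math.sqrt((I_groups - 1) * f.ppf(1 - alpha, I_groups - 1, df))
tukey      = studentized_range.ppf(1 - alpha, I_groups, df) / math.sqrt(2)
print(round(lsd, 3), round(bonferroni, 3), round(scheffe, 3), round(tukey, 3))
```

As expected, LSD gives the smallest multiplier (about 1.967) and Scheffé the largest (about 3.35), with Tukey-Kramer (about 2.87) the best choice when only pairwise comparisons are of interest.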

If I had just been interested in all pairwise comparisons a priori, then I would use Tukey-Kramer. If there were other pre-planned contrasts of interest in addition to all pairwise comparisons, then I would use either Bonferroni (but I would have to increase k to reflect the additional contrasts) or Scheffé, whichever multiplier were smaller. If there were additional unplanned comparisons, then I would use Scheffé for all comparisons.


Post Hoc Tests

Multiple Comparisons

Tukey HSD

(I) Diet    (J) Diet     Mean Difference (I-J)   Std. Error   Sig.    Lower Bound   Upper Bound
NP          N/N85        -5.289*                 1.301        .001    -9.018        -1.561
NP          N/R50        -14.895*                1.240        .000    -18.450       -11.341
NP          R/R50        -15.484*                1.306        .000    -19.228       -11.740
NP          N/R lopro    -12.284*                1.306        .000    -16.028       -8.540
NP          N/R40        -17.715*                1.286        .000    -21.400       -14.029
N/N85       NP           5.289*                  1.301        .001    1.561         9.018
N/N85       N/R50        -9.606*                 1.188        .000    -13.010       -6.202
N/N85       R/R50        -10.194*                1.257        .000    -13.796       -6.593
N/N85       N/R lopro    -6.994*                 1.257        .000    -10.596       -3.393
N/N85       N/R40        -12.425*                1.235        .000    -15.965       -8.885
N/R50       NP           14.895*                 1.240        .000    11.341        18.450
N/R50       N/N85        9.606*                  1.188        .000    6.202         13.010
N/R50       R/R50        -.589                   1.194        .996    -4.009        2.832
N/R50       N/R lopro    2.611                   1.194        .246    -.809         6.032
N/R50       N/R40        -2.819                  1.171        .156    -6.176        .537
R/R50       NP           15.484*                 1.306        .000    11.740        19.228
R/R50       N/N85        10.194*                 1.257        .000    6.593         13.796
R/R50       N/R50        .589                    1.194        .996    -2.832        4.009
R/R50       N/R lopro    3.200                   1.262        .117    -.417         6.817
R/R50       N/R40        -2.231                  1.241        .468    -5.787        1.325
N/R lopro   NP           12.284*                 1.306        .000    8.540         16.028
N/R lopro   N/N85        6.994*                  1.257        .000    3.393         10.596
N/R lopro   N/R50        -2.611                  1.194        .246    -6.032        .809
N/R lopro   R/R50        -3.200                  1.262        .117    -6.817        .417
N/R lopro   N/R40        -5.431*                 1.241        .000    -8.987        -1.875
N/R40       NP           17.715*                 1.286        .000    14.029        21.400
N/R40       N/N85        12.425*                 1.235        .000    8.885         15.965
N/R40       N/R50        2.819                   1.171        .156    -.537         6.176
N/R40       R/R50        2.231                   1.241        .468    -1.325        5.787
N/R40       N/R lopro    5.431*                  1.241        .000    1.875         8.987

The last two columns are the 95% confidence interval.

*. The mean difference is significant at the .05 level.

Homogeneous Subsets

Months survived
Tukey HSD a,b

                        Subset for alpha = .05
Diet         N       1         2         3         4
NP           49      27.402
N/N85        57                32.691
N/R lopro    56                          39.686
N/R50        71                          42.297    42.297
R/R50        56                          42.886    42.886
N/R40        60                                    45.117
Sig.                 1.000     1.000     .108      .212

Means for groups in homogeneous subsets are displayed.
a. Uses Harmonic Mean Sample Size = 57.462.
b. The group sizes are unequal. The harmonic mean of the group sizes is used. Type I error levels are not guaranteed.
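The first row of the Tukey HSD table can be reproduced (up to rounding) from the summary numbers in these notes, using the Tukey-Kramer multiplier q_{6,343}(.95)/√2 = 2.866 computed earlier (a Python sketch, not part of the handout):

```python
import math

sp = 6.678                    # pooled standard deviation from these notes
mean_NP, n_NP = 27.402, 49
mean_NN85, n_NN85 = 32.691, 57

diff = mean_NP - mean_NN85                       # mean difference (I-J)
se = sp * math.sqrt(1 / n_NP + 1 / n_NN85)       # standard error of the difference
mult = 2.866                                     # Tukey-Kramer multiplier
lo, hi = diff - mult * se, diff + mult * se
print(round(diff, 3), round(se, 3))              # -5.289 1.301
print(round(lo, 3), round(hi, 3))                # about -9.018 and -1.560
```

This matches the NP vs. N/N85 row of the SPSS output (difference -5.289, standard error 1.301, interval roughly -9.018 to -1.561) up to rounding of the multiplier.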
