ANOVA

Analysis of Variance

Prof. M.K. Tiwari
Department of Industrial Engineering and Management
IIT Kharagpur

Analysis of variance (ANOVA)

 ANOVA assesses the extent to which the distributions of two or more variables overlap.
 The more the distributions overlap, the less likely it is that they are different.

Analysis of Variance

 Analysis of variance (ANOVA), also called the F test, was designed to overcome the shortcomings of the t test when more than two means are compared.
 An ANOVA with one independent variable (IV) that has only two levels is equivalent to a t test.
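
The equivalence with the t test can be checked numerically. The sketch below is a minimal illustration, assuming two small made-up samples and the pooled-variance (equal-variance) form of the t test; the helper names `pooled_t` and `one_way_f` are hypothetical. It shows that the squared t statistic matches the one-way ANOVA F statistic for two groups.

```python
from statistics import mean, variance

# Hypothetical illustrative samples (not taken from the slides).
group_a = [4, 6, 5, 7, 8]
group_b = [9, 7, 10, 8, 11]

def pooled_t(x, y):
    """Two-sample t statistic, assuming equal population variances."""
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / (sp2 * (1 / nx + 1 / ny)) ** 0.5

def one_way_f(groups):
    """F statistic of a one-way ANOVA: MS between / MS within."""
    values = [v for g in groups for v in g]
    grand = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

t_stat = pooled_t(group_a, group_b)
f_stat = one_way_f([group_a, group_b])
print(t_stat ** 2, f_stat)  # the two values agree up to rounding error
```

With more than two groups there is no single t statistic to square, which is where the F test becomes necessary.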
The Logic of ANOVA

 Hypothesis testing in ANOVA is about whether the means of the samples differ more than you would expect if the null hypothesis were true.
 This question about means is answered by analyzing variances.
 Among other reasons, you focus on variances because when you want to know how several means differ, you are asking about the variance among those means.
Two Sources of Variability

 In ANOVA, an estimate of variability between groups is compared with variability within groups.
 Between-group variation is the variation among the means of the different treatment conditions due to chance (random sampling error) and treatment effects, if any exist.
 Within-group variation is the variation due to chance (random sampling error) among individuals given the same treatment.

ANOVA: Total Variation Among Scores

    Within-Groups Variation: variation due to chance.
    Between-Groups Variation: variation due to chance and treatment effect (if any exists).
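
The split of total variation into the two sources can be verified on a small data set. The following is a minimal sketch, assuming three made-up groups of scores; it computes the total, between-groups, and within-groups sums of squares and shows that the first equals the sum of the other two.

```python
from statistics import mean

# Hypothetical scores for three treatment groups (illustration only).
groups = [[3, 5, 4], [6, 8, 7], [9, 11, 10]]

scores = [y for g in groups for y in g]
grand_mean = mean(scores)

# Total variation: every score against the grand mean.
ss_total = sum((y - grand_mean) ** 2 for y in scores)
# Between-groups variation: group means against the grand mean.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
# Within-groups variation: every score against its own group mean.
ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)

print(ss_total, ss_between, ss_within)
```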
Variability Between Groups

 There is a lot of variability from one mean to the next.
 Large differences between means probably are not due to chance.
 It is difficult to imagine that all six groups are random samples taken from the same population.
 The null hypothesis is rejected, indicating a treatment effect in at least one of the groups.
Variability Within Groups

 Same amount of variability between group means.
 However, there is more variability within each group.
 The larger the variability within each group, the less confident we can be that we are dealing with samples drawn from different populations.
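
The effect of within-group variability can be illustrated with two made-up data sets that share the same group means but differ in spread. This is a sketch under those assumptions (the data and the helper `f_ratio` are hypothetical): the between-groups variation is identical in both cases, yet the F ratio drops sharply when the spread inside each group grows.

```python
from statistics import mean

def f_ratio(groups):
    """One-way ANOVA F ratio: between-groups MS over within-groups MS."""
    values = [v for g in groups for v in g]
    grand = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    ms_between = ss_between / (len(groups) - 1)
    ms_within = ss_within / (len(values) - len(groups))
    return ms_between / ms_within

# Both data sets have group means 10, 15 and 20 (hypothetical values).
tight = [[9, 10, 11], [14, 15, 16], [19, 20, 21]]    # little spread within groups
spread = [[5, 10, 15], [10, 15, 20], [15, 20, 25]]   # large spread within groups

f_tight = f_ratio(tight)
f_spread = f_ratio(spread)
print(f_tight, f_spread)  # the second F ratio is much smaller
```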
Completely Randomized Experiment and
Analysis of Variance

Suppose we have a different levels of a single factor to be compared (Table 1), where yij represents the jth observation taken under treatment i.

Table 1: Typical data for a single-factor experiment

Treatment (level)   Observations               Totals   Averages
1                   y11   y12   . . .   y1n    y1.      ȳ1.
2                   y21   y22   . . .   y2n    y2.      ȳ2.
.                   .     .     . . .   .      .        .
.                   .     .     . . .   .      .        .
.                   .     .     . . .   .      .        .
a                   ya1   ya2   . . .   yan    ya.      ȳa.
                                               y..      ȳ..
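
The dot subscripts in the Totals and Averages columns follow the usual convention that a dot replaces the index being summed over. Written out, assuming n observations per treatment and N = an observations in total, the entries are:

```latex
\begin{aligned}
y_{i.} &= \sum_{j=1}^{n} y_{ij}, &
\bar{y}_{i.} &= \frac{y_{i.}}{n}, & i &= 1, 2, \ldots, a \\
y_{..} &= \sum_{i=1}^{a} \sum_{j=1}^{n} y_{ij}, &
\bar{y}_{..} &= \frac{y_{..}}{N}, & N &= an
\end{aligned}
```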
Completely Randomized Experiment and
Analysis of Variance

• The levels of the factor are sometimes called treatments.

• Each treatment has n observations, or replicates.

• The runs are performed in random order.
The observations may be described by the linear statistical model

$$y_{ij} = \mu + \tau_i + \varepsilon_{ij}, \qquad i = 1, 2, \ldots, a; \quad j = 1, 2, \ldots, n$$

where
$\mu$ : overall mean
$\tau_i$ : parameter associated with the ith treatment (ith treatment effect)
$\varepsilon_{ij}$ : random error component

The model can also be written as

$$y_{ij} = \mu_i + \varepsilon_{ij}, \qquad i = 1, 2, \ldots, a; \quad j = 1, 2, \ldots, n$$

where
$\mu_i = \mu + \tau_i$ : mean of the ith treatment
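
To see how the model generates data, the sketch below simulates observations from it. Everything numeric here is an assumption made for illustration: the overall mean, the treatment effects, the error standard deviation, and the choice of normally distributed errors are not given in the slides.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable

mu = 15.0                           # overall mean (hypothetical)
tau = [-5.0, 0.0, 2.0, 6.0, -3.0]   # treatment effects (hypothetical, sum to zero)
sigma = 3.0                         # error standard deviation (hypothetical)
n = 5                               # replicates per treatment

# y[i][j] = mu + tau[i] + eps_ij, with eps_ij assumed Normal(0, sigma^2)
y = [[mu + t + rng.gauss(0.0, sigma) for _ in range(n)] for t in tau]

for i, row in enumerate(y, start=1):
    mu_i = mu + tau[i - 1]  # treatment mean, mu_i = mu + tau_i
    print(f"treatment {i}: mu_i = {mu_i:.1f}, sample mean = {mean(row):.2f}")
```

Each sample mean scatters around its treatment mean, and the size of that scatter is governed by the error term alone.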
Completely Randomized Experiment and
Analysis of Variance

Example 1
A development engineer is interested in determining whether the cotton weight percentage in a synthetic fiber affects the tensile strength. She has run a completely randomized experiment with five levels of cotton weight percentage and five replicates. The data are given below.

Observed tensile strength (lb/in2)

Cotton weight (%)    1    2    3    4    5    Totals   Averages
15                   7    7   15   11    9      49       9.8
20                  12   17   12   18   18      77      15.4
25                  14   18   18   19   19      88      17.6
30                  19   25   22   19   23     108      21.6
35                   7   10   11   15   11      54      10.8
                                               376     15.04
Completely Randomized Experiment and
Analysis of Variance

• We have to test the hypothesis

$$H_0 : \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$$

against H1 : some means are different.

The sums of squares are computed as follows:

$$SS_T = \sum_{i=1}^{5} \sum_{j=1}^{5} y_{ij}^2 - \frac{y_{..}^2}{N} = 636.96$$

$$SS_{Treatments} = \frac{1}{n} \sum_{i=1}^{5} y_{i.}^2 - \frac{y_{..}^2}{N} = 475.76$$
Completely Randomized Experiment and
Analysis of Variance

$$SS_E = SS_T - SS_{Treatments} = 161.20$$
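
The three sums of squares can be reproduced directly from the tensile strength data of Example 1. The sketch below is one way to do the arithmetic; the variable names are illustrative.

```python
# Tensile strength data from Example 1, keyed by cotton weight percentage.
data = {
    15: [7, 7, 15, 11, 9],
    20: [12, 17, 12, 18, 18],
    25: [14, 18, 18, 19, 19],
    30: [19, 25, 22, 19, 23],
    35: [7, 10, 11, 15, 11],
}

a = len(data)   # number of treatments
n = 5           # replicates per treatment
N = a * n       # total number of observations

grand_total = sum(sum(obs) for obs in data.values())   # y.. = 376
correction = grand_total ** 2 / N                      # y..^2 / N

ss_total = sum(y ** 2 for obs in data.values() for y in obs) - correction
ss_treatments = sum(sum(obs) ** 2 for obs in data.values()) / n - correction
ss_error = ss_total - ss_treatments

print(round(ss_total, 2), round(ss_treatments, 2), round(ss_error, 2))
# prints: 636.96 475.76 161.2
```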


ANOVA table for the above data

Source of variation        Sum of squares   Degrees of freedom   Mean square   F0      P-value
Cotton weight percentage   475.76           4                    118.94        14.76   <0.01
Error                      161.20           20                   8.06
Total                      636.96           24
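
The remaining columns of the table follow from the sums of squares and degrees of freedom. The sketch below recomputes the mean squares and F0, and checks the reported P-value bound by integrating the F density numerically. The helpers `f_pdf` and `f_upper_tail` are hypothetical and written for this illustration only; a statistics library would normally supply the tail probability.

```python
import math

ss_treatments, ss_error = 475.76, 161.20   # sums of squares from the table
a, N = 5, 25                               # treatments and total observations

df_treatments = a - 1                      # 4
df_error = N - a                           # 20
ms_treatments = ss_treatments / df_treatments
ms_error = ss_error / df_error
f0 = ms_treatments / ms_error

def f_pdf(x, d1, d2):
    """Density of the F distribution with (d1, d2) degrees of freedom.

    Returning 0 at x = 0 is correct only for d1 > 2, which holds here.
    """
    if x <= 0:
        return 0.0
    log_beta = (math.lgamma(d1 / 2) + math.lgamma(d2 / 2)
                - math.lgamma((d1 + d2) / 2))
    log_pdf = ((d1 / 2) * math.log(d1 / d2)
               + (d1 / 2 - 1) * math.log(x)
               - ((d1 + d2) / 2) * math.log1p(d1 * x / d2)
               - log_beta)
    return math.exp(log_pdf)

def f_upper_tail(f, d1, d2, steps=20000):
    """P(F > f) via Simpson's rule on [0, f]; steps must be even."""
    h = f / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(f, d1, d2)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f_pdf(k * h, d1, d2)
    return 1.0 - total * h / 3

p_value = f_upper_tail(f0, df_treatments, df_error)
print(round(ms_treatments, 2), round(ms_error, 2), round(f0, 2))
print(p_value < 0.01)
```

Since the tail probability falls well below 0.01, the null hypothesis of equal means is rejected, in line with the table.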