Rankit
1. Sort the data (x1 ... xn) in order of increasing magnitude.
2. Write down the cumulative frequency for each value (i.e. how many data points have a lesser or equal value).
3. Calculate the cumulative % frequency = 100 x cumulative frequency / (n + 1).
4. Plot the z-score associated with the cumulative % frequency against x [in Excel: =NORMSINV(% frequency/100)].
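The rankit steps can be sketched in Python as follows. This is a minimal illustration, not part of the original notes: the function name `rankits` and the sample values are assumptions, and `statistics.NormalDist().inv_cdf` stands in for Excel's NORMSINV.

```python
from statistics import NormalDist


def rankits(data):
    """Return (x, z) pairs for a rankit plot, following the four steps."""
    xs = sorted(data)                              # step 1: increasing magnitude
    n = len(xs)
    pairs = []
    for x in xs:
        cum_freq = sum(1 for v in xs if v <= x)    # step 2: points with value <= x
        pct = 100.0 * cum_freq / (n + 1)           # step 3: cumulative % frequency
        z = NormalDist().inv_cdf(pct / 100.0)      # step 4: like NORMSINV(pct/100)
        pairs.append((x, z))
    return pairs


# Hypothetical sample, chosen only to show how a tied value is handled.
sample = [4, 1, 6, 2, 5, 4]
for x, z in rankits(sample):
    print(f"{x:>3}  {z:+.3f}")
```

Dividing by n + 1 rather than n keeps the largest value below 100%, so its z-score stays finite. Tied values share one cumulative frequency and therefore one z-score.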
1 2 4 4 5 6
[Figure: rankit plot of z-score against value of effect, with labelled points M and MC.]
ANOVA
(Analysis of Variance)
ANOVA partitions the total variability in a set of data, taken under different conditions, into the variability attributable to each source of difference. It is often used to determine whether a source of variance is significant compared with random uncertainty, for example:
- Which of many variables are important for a method.
- Whether a linear relationship between variables is significant.
- For a round-robin analysis between several laboratories, what the inter-laboratory precision (reproducibility) and the intra-laboratory precision (repeatability) are, and whether the inter-laboratory precision is significantly greater than the intra-laboratory precision.
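$$
\begin{array}{cccc}
x_{1,1} & x_{1,2} & \cdots & x_{1,j} \\
x_{2,1} & x_{2,2} & \cdots & x_{2,j} \\
\vdots & \vdots & & \vdots \\
x_{i,1} & x_{i,2} & \cdots & x_{i,j}
\end{array}
$$
Different variables of interest go in COLUMNS. Replicates (or whatever groups each variable) go in ROWS.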
Calculation
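1. Subtract the global mean from each value:
$$ x_{i,j} - \bar{x}, \qquad \bar{x} = \frac{\sum_j \sum_i x_{i,j}}{\sum_j n_j} $$
2. Square each difference and sum. Total sum of squares = SST:
$$ SS_T = \sum_i \sum_j \left( x_{i,j} - \bar{x} \right)^2 $$
3. For the columns of differences:
3.1 Average each column
3.2 Square the average and multiply by the number of rows
3.3 Sum = SSc
$$ SS_C = \sum_j n_j \left( \frac{\sum_i \left( x_{i,j} - \bar{x} \right)}{n_j} \right)^2 = \sum_j n_j \left( \bar{x}_j - \bar{x} \right)^2 $$
4. The residual sum of squares is the remainder: SSR = SST - SSc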
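The calculation can be sketched in Python as below. The function name `sums_of_squares` is an assumption made for illustration, and the residual is taken as the remainder SST - SSc, which agrees with the totals in ANOVA Table #1. The input is the drum data from Example #1 later in these notes.

```python
def sums_of_squares(columns):
    """Return (SST, SSc, SSR) for a list of columns (one list per variable)."""
    values = [x for col in columns for x in col]
    grand_mean = sum(values) / len(values)

    # Steps 1-2: subtract the global mean, then square and sum.
    ss_t = sum((x - grand_mean) ** 2 for x in values)

    # Step 3: average the deviations in each column, square, weight by n_j.
    ss_c = 0.0
    for col in columns:
        n_j = len(col)
        mean_dev = sum(x - grand_mean for x in col) / n_j
        ss_c += n_j * mean_dev ** 2

    # Residual: what is left after removing the between-column part.
    ss_r = ss_t - ss_c
    return ss_t, ss_c, ss_r


# Drum data from Example #1.
drum_a = [49, 44, 70, 50, 58]
drum_b = [44, 57, 34, 48, 50]
ss_t, ss_c, ss_r = sums_of_squares([drum_a, drum_b])
print(f"SST = {ss_t:.1f}, SSc = {ss_c:.1f}, SSR = {ss_r:.1f}")
# prints: SST = 844.4, SSc = 144.4, SSR = 700.0
```

Weighting each squared column average by its own n_j means the same sketch also works when columns hold different numbers of replicates.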
Names
- SST is the total sum of squares. Also called the corrected sum of squares.
- SSc is the sum of squares due to the factor studied. Also called the treatment sum of squares, the heterogeneity sum of squares, or the between-column sum of squares.
- SSR is the residual sum of squares. Also called the within-column sum of squares.
ANOVA Table
Source              Sum of squares   Degrees of freedom   Mean squares   Expected mean squares
Between variables   SSc              k - 1                SSc/(k - 1)    $\sigma^2 + n_j \sigma_c^2$
Within variables    SSR              N - k                SSR/(N - k)    $\sigma^2$
TOTAL               SST              N - 1
Decisions
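Use a one-tailed F-test to compare the mean square values and decide whether the variance between columns is significantly greater than the residual variance:
$$ F = \frac{SS_C/(k-1)}{SS_R/(N-k)}, \quad \text{which estimates} \quad \frac{\sigma^2 + n_j \sigma_c^2}{\sigma^2} $$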
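As a rough sketch of how the ANOVA table entries and the decision fit together, the Python below derives the degrees of freedom, mean squares and F from the two sums of squares. The names `anova_table` and `is_significant` are assumptions for illustration. The critical F value is passed in from tables, not computed.

```python
def anova_table(ss_c, ss_r, k, n_total):
    """Build the one-way ANOVA table entries from the sums of squares."""
    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_c / df_between
    ms_within = ss_r / df_within
    return {
        "df_between": df_between,
        "df_within": df_within,
        "df_total": n_total - 1,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


def is_significant(f_value, f_critical):
    """One-tailed decision: significant only if F exceeds the tabulated value."""
    return f_value > f_critical


# Example #1 from these notes: SSc = 144.4, SSR = 700, k = 2 columns, N = 10 values.
table = anova_table(144.4, 700.0, k=2, n_total=10)
# 5.3 is the tabulated F(0.05; 1, 8) quoted in the notes.
print(round(table["F"], 2), is_significant(table["F"], 5.3))
# prints: 1.65 False
```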
F-test
- The Fisher F distribution is used to compare variances.
- For two sets of data with standard deviations s1 and s2, where s1 > s2:
$$ F = \frac{s_1^2}{s_2^2} $$
- The critical value of F is taken at a given probability level (e.g. 0.05, i.e. 95% confidence) and at the relevant number of degrees of freedom (n - 1) for the numerator and the denominator.
- Because we know s1 > s2, a one-tailed test is used.
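A minimal Python sketch of the variance ratio follows. The function name `f_ratio` and the two sets of readings are hypothetical. The sketch puts the larger variance in the numerator so that F >= 1, and returns the degrees of freedom needed to look up the critical value in tables.

```python
from statistics import variance


def f_ratio(sample_1, sample_2):
    """Return (F, df_numerator, df_denominator) with the larger variance on top."""
    v1, v2 = variance(sample_1), variance(sample_2)   # sample variances, n - 1 divisor
    n1, n2 = len(sample_1), len(sample_2)
    if v1 >= v2:
        return v1 / v2, n1 - 1, n2 - 1
    return v2 / v1, n2 - 1, n1 - 1


# Hypothetical readings, for illustration only.
method_1 = [10.2, 9.8, 10.5, 9.9, 10.1]
method_2 = [10.0, 10.1, 9.9, 10.0, 10.2]
f_value, df_num, df_den = f_ratio(method_1, method_2)
print(f"F = {f_value:.2f} with ({df_num}, {df_den}) degrees of freedom")
```

The sketch assumes both samples have non-zero variance. A sample of identical values would need separate handling.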
SST = 844.4
ANOVA Table #1
Source              Sum of squares   Degrees of freedom   Mean squares   Expected mean squares
Between variables   144.4            1                    144.4          $\sigma^2 + 5\sigma_c^2$
Within variables    700              8                    87.5           $\sigma^2$
TOTAL               844.4            9
F test #1
$$ F = \frac{144.40}{87.50} = 1.65, \qquad F_{0.05,1,8} = 5.3 $$
F < Ftable. Therefore the difference is NOT significant at 95% probability.
Calculation of s
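$$ \overline{SS}_C = SS_C/(k-1) = \sigma^2 + n_j \sigma_c^2, \qquad \overline{SS}_R = SS_R/(N-k) = \sigma^2 $$
$$ \sigma = \sqrt{\overline{SS}_R}, \qquad \sigma_c = \sqrt{\frac{\overline{SS}_C - \overline{SS}_R}{n_j}} $$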
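As a worked illustration, not taken from the original slides, the mean squares of ANOVA Table #1 (144.4 and 87.5, with n_j = 5) give the following estimates:

```latex
\sigma \approx \sqrt{87.5} \approx 9.35,
\qquad
\sigma_c \approx \sqrt{\frac{144.4 - 87.5}{5}} = \sqrt{11.38} \approx 3.37
```

Because F test #1 found the between-column variance not significant, this estimate of the between-column standard deviation cannot be distinguished from zero at the 95% level.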
DRUM A   DRUM B
49       44
44       57
70       34
50       48
58       50
In Excel, use the Data Analysis add-in: one-way ANOVA (listed as "Anova: Single Factor"). Check the option for labels (headers) in the first row.
Excel output:
Source              SS      df
Between variables   144.4   1
Within variables    700     8
TOTAL               844.4   9

F: =144.4/87.5
P-value: probability of 144.4 being greater than 87.5 by chance
F crit: F value at 95% probability
ANOVA Example #2
The following data show the stability of a fluorescent molecule under different storage conditions. Determine which storage conditions lead to significantly different signals.

Storage method             Fluorescence
A: Freshly prepared        102, 100, 101
B: 1 hr in dark            101, 101, 104
C: 1 hr in subdued light   97, 95, 99
D: 1 hr in bright light    90, 92, 94
ANOVA calculations #2
k = 4, nj = 3

Deviations from the global mean (98):
        A    B    C    D
        4    3   -1   -8
        2    3   -3   -6
        3    6    1   -4
Mean    3    4   -1   -6

SSc = 186
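The arithmetic behind these values can be written out as follows. This is a reconstruction from the fluorescence data, not from the original slides, and it assumes the residual is obtained as the remainder SST - SSc, which matches ANOVA Table #2.

```latex
\bar{x} = \frac{303 + 306 + 291 + 276}{12} = 98

SS_C = 3\left[3^2 + 4^2 + (-1)^2 + (-6)^2\right] = 3 \times 62 = 186

SS_T = (16 + 4 + 9) + (9 + 9 + 36) + (1 + 9 + 1) + (64 + 36 + 16) = 210

SS_R = SS_T - SS_C = 210 - 186 = 24
```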
ANOVA Table #2
Source              Sum of squares   Degrees of freedom   Mean squares   Expected mean squares
Between variables   186              3                    62             $\sigma^2 + 3\sigma_c^2$
Within variables    24               8                    3              $\sigma^2$
TOTAL               210              11
F test #2
$$ F = \frac{62}{3} = 20.7, \qquad F_{0.05,3,8} = 4.07 $$
F > Ftable. Therefore there is a SIGNIFICANT difference at 95% probability.
Excel output:
Source              SS    df
Between variables   186   3
Within variables    24    8
TOTAL               210   11

F crit: 4.06618
1. Arrange the mean of each variable in ascending order.
2. Calculate the least significant difference, LSD = $s \, t \sqrt{2/n}$, where s is the within-variables estimate of $\sigma$, n is the number of rows, and t is the Student's t value for the degrees of freedom of this s at the 95% confidence level.
3. If the difference between two means is > the calculated LSD, it is SIGNIFICANT.
4. If the difference between two means is < the calculated LSD, it is NOT SIGNIFICANT.
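The least significant difference procedure can be sketched in Python for Example #2. The function names are assumptions for illustration, and the Student's t value of 2.306 (two-tailed, 95%, 8 degrees of freedom) is an assumed value read from standard tables, not from these notes.

```python
from math import sqrt


def least_significant_difference(ms_within, t_value, n_rows):
    """LSD = s * t * sqrt(2 / n), with s the within-variables estimate of sigma."""
    s = sqrt(ms_within)
    return s * t_value * sqrt(2.0 / n_rows)


def compare_adjacent_means(means, lsd):
    """Sort means ascending and flag each neighbouring pair against the LSD."""
    ordered = sorted(means.items(), key=lambda item: item[1])
    results = []
    for (name_lo, mean_lo), (name_hi, mean_hi) in zip(ordered, ordered[1:]):
        diff = mean_hi - mean_lo
        results.append((name_lo, name_hi, diff, diff > lsd))
    return results


# Example #2: within-variables mean square 3 (8 degrees of freedom), 3 rows.
# t = 2.306 is an assumed table value, see the note above.
lsd = least_significant_difference(ms_within=3.0, t_value=2.306, n_rows=3)
means = {"A": 101.0, "B": 102.0, "C": 97.0, "D": 92.0}
for lo, hi, diff, significant in compare_adjacent_means(means, lsd):
    print(f"{hi} - {lo} = {diff:.1f}  significant: {significant}")
print(f"LSD = {lsd:.2f}")
```

With these inputs the LSD comes out at about 3.26, so the steps of 5 (D to C) and 4 (C to A) are flagged as significant while the step of 1 (A to B) is not.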
Interpretation
- D & C differ significantly from each other and from A & B.
- A & B do not differ significantly from each other.