Some Basic Formula in Statistics

2012
Formula related to STAT310
FORMULA AND OTHER PRACTICAL THINGS TO KNOW

RAJU RIMAL
NORWEGIAN UNIVERSITY OF LIFE SCIENCES | Ås, Norway
For testing the differences in means:

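Hypothesis: $H_0: \mu_1 = \mu_2$ against $H_1: \mu_1 \neq \mu_2$

Test Statistic:

$$t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

Where,

$$S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$$

Fundamental Decomposition:

$$SS_T = SS_{Treatments} + SS_E$$

Where, for $a$ treatments with $n$ observations each,

$$SS_T = \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2$$

$$SS_{Treatments} = n \sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2$$

$$SS_E = \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2$$

The confidence interval for the difference between two treatment means is given as,

$$\bar{y}_{i.} - \bar{y}_{j.} \pm t_{\alpha/2,\, N-a} \sqrt{\frac{2\, MS_E}{n}}$$

Where $MS_E = SS_E / (N - a)$ and $N = an$ is the total number of observations.

Testing a two-way ANOVA model

The model is given as,

$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$$

Where,
$i = 1, 2, \ldots, a$ (For Factor A)
$j = 1, 2, \ldots, b$ (For Factor B)
$k = 1, 2, \ldots, n$ (Replication)

The estimates for the unknown effects $\alpha_i$, $\beta_j$, $(\alpha\beta)_{ij}$ and the residual are given as,

Parameter               Estimate
$\mu$                   $\bar{y}_{...}$
$\alpha_i$              $\bar{y}_{i..} - \bar{y}_{...}$
$\beta_j$               $\bar{y}_{.j.} - \bar{y}_{...}$
$(\alpha\beta)_{ij}$    $\bar{y}_{ij.} - \bar{y}_{i..} - \bar{y}_{.j.} + \bar{y}_{...}$
$\varepsilon_{ijk}$     $y_{ijk} - \bar{y}_{ij.}$

ANOVA Table

In the table, a dot in a subscript of $y$ denotes the total over that index, and a bar denotes the corresponding mean.

Source of Variation              Degrees of Freedom    Sum of Square
Factor A ($\alpha$)              $a - 1$               $SS_A = \frac{1}{bn} \sum_{i=1}^{a} y_{i..}^2 - \frac{y_{...}^2}{abn}$
Factor B ($\beta$)               $b - 1$               $SS_B = \frac{1}{an} \sum_{j=1}^{b} y_{.j.}^2 - \frac{y_{...}^2}{abn}$
Interaction ($\alpha\beta$)      $(a - 1)(b - 1)$      $SS_{AB} = \frac{1}{n} \sum_{i=1}^{a} \sum_{j=1}^{b} y_{ij.}^2 - \frac{y_{...}^2}{abn} - SS_A - SS_B$
Error                            $ab(n - 1)$           $SS_E = SS_T - SS_A - SS_B - SS_{AB}$
Total                            $abn - 1$             $SS_T = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} y_{ijk}^2 - \frac{y_{...}^2}{abn}$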
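The sums of squares of the two-way ANOVA table can be computed directly from the factor, cell and grand totals. The following is a minimal sketch, assuming a balanced design and the usual shortcut formulas (for example $SS_A = \frac{1}{bn}\sum_i y_{i..}^2 - \frac{y_{...}^2}{abn}$); the function name `two_way_anova_ss` and the small data set are hypothetical and only serve as illustration.

```python
def two_way_anova_ss(y):
    """Sums of squares for a balanced two-way layout.

    y[i][j][k] is replicate k at level i of factor A and level j of factor B.
    """
    a, b, n = len(y), len(y[0]), len(y[0][0])
    values = [v for row in y for cell in row for v in cell]

    # Correction term y...^2 / (abn), shared by all shortcut formulas.
    correction = sum(values) ** 2 / (a * b * n)

    ss_t = sum(v ** 2 for v in values) - correction

    # Totals over the remaining indices: y_i.., y_.j. and y_ij.
    a_totals = [sum(sum(cell) for cell in row) for row in y]
    b_totals = [sum(sum(y[i][j]) for i in range(a)) for j in range(b)]
    cell_totals = [sum(cell) for row in y for cell in row]

    ss_a = sum(t ** 2 for t in a_totals) / (b * n) - correction
    ss_b = sum(t ** 2 for t in b_totals) / (a * n) - correction
    ss_ab = sum(t ** 2 for t in cell_totals) / n - correction - ss_a - ss_b
    ss_e = ss_t - ss_a - ss_b - ss_ab
    return {"A": ss_a, "B": ss_b, "AB": ss_ab, "E": ss_e, "T": ss_t}


# Hypothetical data: a = 2, b = 2, n = 2.
data = [
    [[1, 3], [5, 7]],
    [[2, 4], [10, 12]],
]
ss = two_way_anova_ss(data)
print(ss)
```

Dividing each sum of squares by its degrees of freedom gives the mean squares used in the F-tests.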
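Post-hoc Test
Multiple Testing
When comparing the levels of a factor with multiple pairwise t-tests, the chance of falsely rejecting at least one hypothesis grows with the number of tests. For instance, a factor with 4 levels gives 6 pairwise t-tests, so that, with each test at the 5% level,

$$P(\text{at least one false rejection}) = 1 - (1 - 0.05)^6 \approx 0.265$$

This calculation treats the six tests as independent, which they are not, but it shows how far the overall error rate can exceed 5%. Tukey's post-hoc test adjusts for this.

Tukey's HSD Test

Based on,

$$q = \frac{|\bar{y}_1 - \bar{y}_2|}{\sqrt{MS_E / n}}$$

The two means are declared significantly different if,

$$|\bar{y}_1 - \bar{y}_2| > T_{0.05} = q_{0.05}(a, f) \sqrt{\frac{MS_E}{n}}$$

Where,
$a$ = Number of groups
$f$ = Degrees of freedom of error
$q_{0.05}(a, f)$ = From table

The confidence interval for $\mu_i - \mu_j$ is given as,

$$\bar{y}_{i.} - \bar{y}_{j.} \pm \frac{q_{\alpha}(a, f)}{\sqrt{2}} \sqrt{MS_E \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}$$

Residual
The residual is given by,

$$e_{ij} = y_{ij} - \hat{y}_{ij}$$

Standardized Residual
The standardized residual is given as,

$$r_{ij} = \frac{e_{ij}}{\sqrt{MS_E \, (n - 1) / n}}$$

Outliers
As a common rule of thumb, observations with a standardized residual greater than 3 or smaller than -3 are considered outliers.
Normality
Normality is checked primarily with graphs. The points in the Residuals vs Fitted plot should be scattered at random and should not follow any pattern. Further, in the Normal Q-Q plot (Theoretical Quantiles vs Standardized Residuals) the points should lie close to the reference line. Points that are far from the line are considered outliers.
Contrast
Contrast is defined as,

$$\Gamma = \sum_{i=1}^{a} c_i \mu_i, \qquad \sum_{i=1}^{a} c_i = 0$$

For example, while testing the treatment totals, the contrast can be constructed as,

$$C = \sum_{i=1}^{a} c_i y_{i.}$$

The variance of $C$ is,

$$\text{var}(C) = n \sigma^2 \sum_{i=1}^{a} c_i^2$$

For testing the contrast hypothesis $H_0: \Gamma = 0$, the test statistic is constructed as,

$$t_0 = \frac{\sum_{i=1}^{a} c_i y_{i.}}{\sqrt{n \sigma^2 \sum_{i=1}^{a} c_i^2}}$$

However, $\sigma^2$ is unknown, so it is replaced by $MS_E$. An alternative approach is to construct the statistic as,

$$F_0 = t_0^2 = \frac{\left( \sum_{i=1}^{a} c_i y_{i.} \right)^2}{n \, MS_E \sum_{i=1}^{a} c_i^2}$$

Confidence interval for $\sigma^2$

The confidence interval for $\sigma^2$ is given as,

$$\left[ \frac{SS_E}{\chi^2_{\alpha/2,\, f}}, \; \frac{SS_E}{\chi^2_{1 - \alpha/2,\, f}} \right]$$

Where $\chi^2_{p,\, f}$ is the upper $p$ percentage point of the chi-square distribution with $f$ error degrees of freedom.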
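The confidence interval for $\sigma^2$ is a plain ratio once the two chi-square percentage points have been looked up. The sketch below assumes the interval has the form $[SS_E / \chi^2_{\alpha/2, f},\ SS_E / \chi^2_{1-\alpha/2, f}]$ with upper-tail percentage points; the function name `sigma2_interval` and the value of $SS_E$ are hypothetical, and the two quantiles are the standard table values for 20 degrees of freedom at $\alpha = 0.05$.

```python
def sigma2_interval(ss_e, chi2_upper, chi2_lower):
    """Confidence interval for sigma^2 from the error sum of squares.

    chi2_upper: upper alpha/2 point of chi-square with the error df
    chi2_lower: upper (1 - alpha/2) point of the same distribution
    Both values are taken from a table, as with q in Tukey's test.
    """
    if not 0 < chi2_lower < chi2_upper:
        raise ValueError("expected 0 < chi2_lower < chi2_upper")
    return ss_e / chi2_upper, ss_e / chi2_lower


# Hypothetical example: SS_E = 120 with f = 20 error degrees of freedom.
# Table values: chi2(0.025, 20) = 34.170 and chi2(0.975, 20) = 9.591.
low, high = sigma2_interval(ss_e=120.0, chi2_upper=34.170, chi2_lower=9.591)
print(round(low, 3), round(high, 3))
```

The point estimate $MS_E = 120 / 20 = 6$ lies inside the interval, but not at its centre, because the chi-square distribution is skewed.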
Power of Test
The power of a test is the probability of rejecting the null hypothesis when it is false. In any test, the possible outcomes are,

                 $H_0$ is true              $H_0$ is false
Accept $H_0$     Correct ($1 - \alpha$)     Type II error ($\beta$)
Reject $H_0$     Type I error ($\alpha$)    Correct ($1 - \beta$)

It is given as,

$$\text{Power} = 1 - \beta = P(\text{reject } H_0 \mid H_1 \text{ is true})$$

This expression can be solved for the sample size needed to reach a given power.

Latin Square Design

This is a special case in which two or more factors are regarded as blocks and there are too few observations to run a complete randomized block experiment. In this design, each treatment level is tested exactly once in each block of the first blocking factor (rows) and exactly once in each block of the second blocking factor (columns). Example:

A B C
B C A
C A B

$2^k$ Factorial Design
The full factorial design with 3 factors is written as,

$$y_{ijkl} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \gamma_k + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk} + \varepsilon_{ijkl}$$

The effect and the sum of squares of an effect are computed from its contrast as,

$$\text{Effect} = \frac{\text{Contrast}}{n 2^{k-1}}, \qquad \text{Sum of Square} = \frac{\text{Contrast}^2}{n 2^k}$$

Where $n$ is the number of replications and $k$ is the number of factors.

Before fitting the full model, we can check which factors and interactions have a significant effect. This can be done even with a single replication by using the Normal Probability Plot of the effects. In the Normal Probability Plot, negligible effects tend to fall along the line, while significant effects have non-zero mean and fall off the line. Non-significant effects are candidates for removal from the model.
Partial F-Test
To test the effect of a factor in an experiment, we reduce the model and compare it with the full model. The factor being tested is removed along with all of its interactions, and the reduced model is then fitted. For example, if we are testing the effect of factor C, the hypothesis is set as,

$$H_0: \gamma_1 = \gamma_2 = \cdots = \gamma_c = 0$$

and likewise for all interaction terms involving C. The test statistic is,

$$F = \frac{\left( SS_E(\text{Reduced}) - SS_E(\text{Full}) \right) / r}{MS_E(\text{Full})}$$

This is F-distributed with $r$ and the error degrees of freedom of the full model, where $r$ is the number of parameters in $H_0$, i.e. the difference in the error degrees of freedom between the full and reduced models.
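The partial F-test only needs the error sum of squares and error degrees of freedom of the two fitted models. A minimal sketch follows, assuming the statistic has the form $F = \frac{(SS_E(\text{Reduced}) - SS_E(\text{Full}))/r}{MS_E(\text{Full})}$ with $r$ equal to the difference in error degrees of freedom; the function name `partial_f` and the numbers are hypothetical.

```python
def partial_f(sse_reduced, df_reduced, sse_full, df_full):
    """Partial F statistic for comparing a reduced model with the full model.

    Returns (F, r, df_full), where r is the number of parameters removed,
    i.e. the difference in error degrees of freedom.
    """
    r = df_reduced - df_full
    if r <= 0:
        raise ValueError("the reduced model must have more error df than the full model")
    mse_full = sse_full / df_full
    f_stat = ((sse_reduced - sse_full) / r) / mse_full
    return f_stat, r, df_full


# Hypothetical fits: dropping a factor frees 4 degrees of freedom.
f_stat, r, df_error = partial_f(sse_reduced=150.0, df_reduced=20,
                                sse_full=90.0, df_full=16)
print(f_stat, r, df_error)
```

The resulting value is compared with the upper percentage point of the F distribution with `r` and `df_error` degrees of freedom.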
Fractional Factorial Design

A full design with $k$ factors requires $2^k$ experiments per replication, which increases rapidly for large $k$, while only a few degrees of freedom are used to estimate the main effects and low-order interactions. For instance, a $2^6$ experiment needs 64 runs, but the main effects use only 6 degrees of freedom and the two-factor interactions use 15. The other 42 degrees of freedom are associated with higher-order interactions, which may have a negligible effect on the response. Running only a fraction of the full experiment can therefore save a lot of work and cost.
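The degree-of-freedom count for the $2^6$ example can be checked with binomial coefficients: there are $\binom{k}{1}$ main effects and $\binom{k}{2}$ two-factor interactions among the $2^k - 1$ effects of one replicate. The helper name `df_breakdown` below is hypothetical.

```python
from math import comb


def df_breakdown(k):
    """Degrees of freedom used by effects of each order in one 2^k replicate."""
    runs = 2 ** k
    main = comb(k, 1)          # one df per main effect
    two_factor = comb(k, 2)    # one df per two-factor interaction
    higher_order = (runs - 1) - main - two_factor
    return {"runs": runs, "main": main,
            "two_factor": two_factor, "higher_order": higher_order}


breakdown = df_breakdown(6)
print(breakdown)
```

For `k = 6` this reproduces the 64 runs and the 6, 15 and 42 degrees of freedom quoted in the text.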
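Aliases
When running a fractional factorial design, some effects are confounded with others. For instance, if BC is confounded with A, then when estimating A we are actually estimating A + BC. Thus A and BC are aliases.
Design Resolution
The resolution of a design is equal to the length of the shortest word in the defining relation, i.e. the smallest number of factors in any of its terms.
Random Effect Model

When a factor in a model is considered random, its effects are assumed to be independent and identically normally distributed. In the model,

$$y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$$

It is assumed that,

$$\alpha_i \sim N(0, \sigma_\alpha^2), \qquad \varepsilon_{ij} \sim N(0, \sigma^2)$$

The terms $\sigma_\alpha^2$ and $\sigma^2$ are called variance components. Thus,

$$\text{var}(y_{ij}) = \text{var}(\mu + \alpha_i + \varepsilon_{ij}) = \sigma_\alpha^2 + \sigma^2$$

Also, the covariance between two observations $y_{ij}$ and $y_{ik}$ ($j \neq k$) from the same level is given as,

$$\text{cov}(y_{ij}, y_{ik}) = \text{cov}(\mu + \alpha_i + \varepsilon_{ij},\; \mu + \alpha_i + \varepsilon_{ik})$$
$$= \text{cov}(\alpha_i, \alpha_i) + \text{cov}(\alpha_i, \varepsilon_{ik}) + \text{cov}(\varepsilon_{ij}, \alpha_i) + \text{cov}(\varepsilon_{ij}, \varepsilon_{ik})$$
$$= \sigma_\alpha^2 + 0 + 0 + 0 = \sigma_\alpha^2$$

Thus, the correlation between the two is given as,

$$\text{cor}(y_{ij}, y_{ik}) = \frac{\text{cov}(y_{ij}, y_{ik})}{\sqrt{\text{var}(y_{ij})} \sqrt{\text{var}(y_{ik})}} = \frac{\sigma_\alpha^2}{\sigma_\alpha^2 + \sigma^2}$$

Nested Design
The designs discussed so far are all crossed designs. A design where the levels of one factor are nested under the levels of another factor is called a nested design.

Sire 1    Herd 1    Cow 1, Cow 2
          Herd 2    Cow 3, Cow 4
Sire 2    Herd 3    Cow 5, Cow 6
          Herd 4    Cow 7, Cow 8

A two-stage nested model is,

$$y_{ijk} = \mu + \alpha_i + \beta_{j(i)} + \varepsilon_{(ij)k}$$

Here, factor B with $b$ levels is nested under factor A with $a$ levels. Interaction is not possible under a nested design. The sum of squares for a nested design can be obtained from a crossed model with interaction as,

$$SS_{B(A)} = SS_B + SS_{AB}$$

Expected Mean Square

Crossed Designs
Two Factor, both Fixed

$$E(MS_A) = \sigma^2 + \frac{bn \sum_{i=1}^{a} \alpha_i^2}{a - 1}$$

$$E(MS_B) = \sigma^2 + \frac{an \sum_{j=1}^{b} \beta_j^2}{b - 1}$$

$$E(MS_{AB}) = \sigma^2 + \frac{n \sum_{i=1}^{a} \sum_{j=1}^{b} (\alpha\beta)_{ij}^2}{(a - 1)(b - 1)}$$

$$E(MS_E) = \sigma^2$$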
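The expected mean squares show why the F-ratios work: under the null hypothesis every fixed-effect term vanishes and each mean square has expectation $\sigma^2$. The sketch below evaluates them for the two-factor fixed model, assuming the standard forms $E(MS_A) = \sigma^2 + bn\sum_i \alpha_i^2/(a-1)$, $E(MS_B) = \sigma^2 + an\sum_j \beta_j^2/(b-1)$ and $E(MS_{AB}) = \sigma^2 + n\sum_i\sum_j (\alpha\beta)_{ij}^2/((a-1)(b-1))$. The function name and the effect values are hypothetical.

```python
def expected_mean_squares_fixed(sigma2, alpha, beta, alpha_beta, n):
    """Expected mean squares for a two-factor model with both factors fixed.

    alpha, beta: main effects (each list sums to zero)
    alpha_beta:  a-by-b table of interaction effects
    n:           number of replications per cell
    """
    a, b = len(alpha), len(beta)
    ems_a = sigma2 + b * n * sum(x ** 2 for x in alpha) / (a - 1)
    ems_b = sigma2 + a * n * sum(x ** 2 for x in beta) / (b - 1)
    ems_ab = sigma2 + n * sum(x ** 2 for row in alpha_beta for x in row) / ((a - 1) * (b - 1))
    return {"A": ems_a, "B": ems_b, "AB": ems_ab, "E": sigma2}


# Hypothetical effects: a = 2, b = 3, n = 3, no interaction.
ems = expected_mean_squares_fixed(
    sigma2=4.0,
    alpha=[-1.0, 1.0],
    beta=[-2.0, 0.0, 2.0],
    alpha_beta=[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
    n=3,
)
print(ems)
```

With no interaction, the expected interaction mean square equals the expected error mean square, so their ratio is expected to be close to 1.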
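Two Factor, both random

$$E(MS_A) = \sigma^2 + n \sigma_{\alpha\beta}^2 + bn \sigma_\alpha^2$$

$$E(MS_B) = \sigma^2 + n \sigma_{\alpha\beta}^2 + an \sigma_\beta^2$$

$$E(MS_{AB}) = \sigma^2 + n \sigma_{\alpha\beta}^2$$

$$E(MS_E) = \sigma^2$$

Two Factor, one random

With A fixed and B random,

$$E(MS_A) = \sigma^2 + n \sigma_{\alpha\beta}^2 + \frac{bn \sum_{i=1}^{a} \alpha_i^2}{a - 1}$$

$$E(MS_B) = \sigma^2 + an \sigma_\beta^2$$

$$E(MS_{AB}) = \sigma^2 + n \sigma_{\alpha\beta}^2$$

$$E(MS_E) = \sigma^2$$

Nested Design
Three Stage Nested- A and B Fixed C Random

$$E(MS_A) = \sigma^2 + n \sigma_\gamma^2 + \frac{bcn \sum_{i=1}^{a} \alpha_i^2}{a - 1}$$

$$E(MS_{B(A)}) = \sigma^2 + n \sigma_\gamma^2 + \frac{cn \sum_{i=1}^{a} \sum_{j=1}^{b} \beta_{j(i)}^2}{a (b - 1)}$$

$$E(MS_{C(B)}) = \sigma^2 + n \sigma_\gamma^2$$

$$E(MS_E) = \sigma^2$$

ANCOVA
ANCOVA combines regression with an ANOVA-type linear model, so that a single model can contain both categorical and continuous variables. For example, the weight of a person can depend on sex and height. The model including these variables along with their interaction is,

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \varepsilon$$

Here, $x_1 = 1$ for female and $x_1 = -1$ for male, and $x_2$ is the measurement of height. Then the separate regression lines for female and male can be obtained as,

For Female: $y = (\beta_0 + \beta_1) + (\beta_2 + \beta_{12}) x_2 + \varepsilon$
For Male: $y = (\beta_0 - \beta_1) + (\beta_2 - \beta_{12}) x_2 + \varepsilon$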
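The separate regression lines of the ANCOVA model follow from substituting the sex coding into the model. A minimal sketch, assuming the coding $x_1 = 1$ for female and $x_1 = -1$ for male in $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \varepsilon$; the function names and coefficient values are hypothetical and not estimated from any data.

```python
def group_lines(b0, b1, b2, b12):
    """Intercept and slope of the height regression for each sex.

    Assumes x1 = +1 for female and x1 = -1 for male.
    """
    return {
        "female": (b0 + b1, b2 + b12),
        "male": (b0 - b1, b2 - b12),
    }


def predict(lines, sex, height):
    """Expected weight for a given sex and height."""
    intercept, slope = lines[sex]
    return intercept + slope * height


# Hypothetical coefficients (weight in kg, height in cm).
lines = group_lines(b0=-60.0, b1=-2.0, b2=0.75, b12=-0.125)
print(lines)
print(predict(lines, "female", 170), predict(lines, "male", 180))
```

A non-zero `b12` gives the two groups different slopes; with `b12 = 0` the lines are parallel and differ only in intercept.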
