
Formulae

Addition rule: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.


Conditional probability: $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$, provided $P(B) > 0$.

Independence: $A$ and $B$ are independent if $P(A \cap B) = P(A)\,P(B)$.
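The three rules above can be checked on a small finite sample space. The sketch below assumes one roll of a fair six-sided die; the events `A` and `B` and the helper `prob` are illustrative choices, not part of the formula sheet.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die (an illustrative assumption).
omega = set(range(1, 7))

def prob(event):
    """Probability of an event (a subset of omega) with equally likely outcomes."""
    return Fraction(len(event & omega), len(omega))

A = {2, 4, 6}      # the roll is even
B = {1, 2, 3, 4}   # the roll is at most 4

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
addition_rule_holds = prob(A | B) == prob(A) + prob(B) - prob(A & B)

# Conditional probability: P(A | B) = P(A and B) / P(B), valid because P(B) > 0
p_A_given_B = prob(A & B) / prob(B)

# Independence: A and B are independent if P(A and B) = P(A) P(B)
independent = prob(A & B) == prob(A) * prob(B)

print(addition_rule_holds, p_A_given_B, independent)
```

Exact fractions are used so that the equality checks are not affected by floating-point rounding.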


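Discrete random variable: $\mu_x = \sum_x x\,p(x)$, $\sigma_x^2 = \sum_x x^2 p(x) - \mu_x^2$.

Continuous random variable: $\mu_x = \int_{-\infty}^{\infty} x f(x)\,dx$, $\sigma_x^2 = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu_x^2$.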

If $X_1, \ldots, X_n$ are random variables, the mean of any linear combination is given by
$$\mu_{c_1 X_1 + \cdots + c_n X_n} = c_1 \mu_{X_1} + \cdots + c_n \mu_{X_n}.$$
If $X_1, \ldots, X_n$ are independent random variables, the variance of any linear combination is given by
$$\sigma^2_{c_1 X_1 + \cdots + c_n X_n} = c_1^2 \sigma^2_{X_1} + \cdots + c_n^2 \sigma^2_{X_n}.$$

Binomial distribution: $p(x) = \binom{n}{x} p^x (1-p)^{n-x}$; $x = 0, 1, \ldots, n$. $\mu_x = np$, $\sigma_x^2 = np(1-p)$. Bernoulli is a special case with $n = 1$.
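As a quick check, the binomial mean and variance can be recovered from the general discrete formulas $\mu_x = \sum_x x\,p(x)$ and $\sigma_x^2 = \sum_x x^2 p(x) - \mu_x^2$. This is a minimal sketch; the values `n = 10` and `p = 0.3` are arbitrary illustrative choices.

```python
from math import comb

def binomial_pmf(x, n, p):
    """p(x) = C(n, x) p^x (1 - p)^(n - x) for x = 0, 1, ..., n."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.3  # illustrative values, not taken from the formula sheet
support = range(n + 1)

# Mean and variance from the general discrete-variable formulas.
mean = sum(x * binomial_pmf(x, n, p) for x in support)
variance = sum(x**2 * binomial_pmf(x, n, p) for x in support) - mean**2

# Both should agree with the closed forms np and np(1 - p) up to rounding.
print(mean, n * p)
print(variance, n * p * (1 - p))
```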

Poisson distribution: $p(x) = e^{-\lambda} \lambda^x / x!$; $x = 0, 1, \ldots$. $\mu_x = \lambda$, $\sigma_x^2 = \lambda$.


Hypergeometric: $p(x) = \dfrac{\binom{R}{x}\binom{N-R}{n-x}}{\binom{N}{n}}$, $\mu_x = \dfrac{nR}{N}$, $\sigma_x^2 = \dfrac{nR}{N}\left(1 - \dfrac{R}{N}\right)\dfrac{N-n}{N-1}$.

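Geometric: $p(x) = (1-p)^{x-1} p$; $x = 1, 2, \ldots$. $\mu_x = \dfrac{1}{p}$, $\sigma_x^2 = \dfrac{1-p}{p^2}$.

Negative binomial: $p(x) = \binom{x-1}{r-1} p^r (1-p)^{x-r}$; $x = r, r+1, \ldots$. $\mu_x = \dfrac{r}{p}$, $\sigma_x^2 = \dfrac{r(1-p)}{p^2}$.

Normal distribution: If $X \sim N(\mu, \sigma^2)$, then $Z = \dfrac{X - \mu}{\sigma} \sim N(0, 1)$.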
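The standardization $Z = (X - \mu)/\sigma$ is what allows a single standard normal table to serve every normal distribution. A minimal sketch using Python's standard-library `statistics.NormalDist`; the function name and the numbers are illustrative assumptions.

```python
from statistics import NormalDist

def normal_cdf_via_z(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2), computed by standardizing to Z ~ N(0, 1)."""
    z = (x - mu) / sigma
    return NormalDist().cdf(z)  # NormalDist() defaults to mean 0, standard deviation 1

# Illustrative values: X ~ N(100, 10^2), so x = 110 corresponds to z = 1.
p_below = normal_cdf_via_z(x=110.0, mu=100.0, sigma=10.0)
print(round(p_below, 4))
```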

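Lognormal distribution: If $Y \sim LN(\mu, \sigma^2)$, then $X = \log_e(Y) \sim N(\mu, \sigma^2)$. $\mu_y = e^{\mu + \sigma^2/2}$, $\sigma_y^2 = e^{2\mu + 2\sigma^2} - e^{2\mu + \sigma^2}$.

Exponential: $f(x) = \lambda e^{-\lambda x}$, $x > 0$. $\mu_x = \dfrac{1}{\lambda}$, $\sigma_x^2 = \dfrac{1}{\lambda^2}$.

Uniform distribution: $f(x) = \dfrac{1}{b-a}$, $a < x < b$. $\mu_x = \dfrac{a+b}{2}$, $\sigma_x^2 = \dfrac{(b-a)^2}{12}$.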

Central Limit Theorem: If $X_1, \ldots, X_n$ are independent random variables each with mean $\mu$ and standard deviation $\sigma$, then the following hold approximately:
$$S = X_1 + \cdots + X_n \sim N(n\mu,\, n\sigma^2)$$
$$\bar{X} = \frac{X_1 + \cdots + X_n}{n} \sim N\!\left(\mu,\, \frac{\sigma^2}{n}\right)$$
provided $n$ is large ($n > 30$).
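The approximation can be applied directly by building the two normal distributions named in the theorem. The sketch below is illustrative only: the helper name and the values $\mu = 5$, $\sigma = 2$, $n = 64$ are assumptions made for the example.

```python
from math import sqrt
from statistics import NormalDist

def clt_sum_and_mean(mu, sigma, n):
    """Approximate distributions of S and X-bar given by the Central Limit Theorem."""
    s_dist = NormalDist(mu=n * mu, sigma=sqrt(n) * sigma)  # S ~ N(n mu, n sigma^2)
    xbar_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))   # X-bar ~ N(mu, sigma^2 / n)
    return s_dist, xbar_dist

# Illustrative values with n > 30.
s_dist, xbar_dist = clt_sum_and_mean(mu=5.0, sigma=2.0, n=64)

# Approximate probability that the sample mean exceeds 5.5 (z = 2 here).
p_mean_exceeds = 1 - xbar_dist.cdf(5.5)
print(round(p_mean_exceeds, 4))
```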


Large sample confidence interval for $\mu$: $\bar{X} \pm z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$.

Sample size needed to get a desired confidence bound $B$: $n = \dfrac{z_{\alpha/2}^2\, \sigma^2}{B^2}$.
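Both formulas need only the critical value $z_{\alpha/2}$, which the standard library can supply. In this sketch the function names and numbers are illustrative, and rounding the sample size up to the next integer is an assumption added here, since the sheet gives only the real-valued formula.

```python
from math import ceil, sqrt
from statistics import NormalDist

def z_crit(alpha):
    """Upper alpha/2 critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def large_sample_ci(xbar, sigma, n, alpha=0.05):
    """X-bar +/- z_{alpha/2} * sigma / sqrt(n)."""
    half_width = z_crit(alpha) * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

def sample_size_for_bound(sigma, B, alpha=0.05):
    """n = z_{alpha/2}^2 sigma^2 / B^2, rounded up (the rounding is an assumption)."""
    return ceil((z_crit(alpha) * sigma / B) ** 2)

# Illustrative values.
lower, upper = large_sample_ci(xbar=50.0, sigma=8.0, n=100)
n_needed = sample_size_for_bound(sigma=8.0, B=1.0)
print(round(lower, 3), round(upper, 3), n_needed)
```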

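Confidence interval for $p$: $\tilde{p} \pm z_{\alpha/2} \sqrt{\dfrac{\tilde{p}(1 - \tilde{p})}{\tilde{n}}}$, where $\tilde{n} = n + 4$, $\tilde{p} = \dfrac{X + 2}{n + 4}$.

Sample size needed for a desired confidence bound $B$: $n = \dfrac{z_{\alpha/2}^2\, p^*(1 - p^*)}{B^2}$, where $p^*$ is a guess of $p$.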
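The adjusted interval adds two successes and two failures before applying the usual normal formula. A minimal sketch, assuming `x` successes observed in `n` trials; the function name and sample values are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def adjusted_proportion_ci(x, n, alpha=0.05):
    """CI for p using n~ = n + 4 and p~ = (x + 2) / (n + 4)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    n_tilde = n + 4
    p_tilde = (x + 2) / n_tilde
    half_width = z * sqrt(p_tilde * (1 - p_tilde) / n_tilde)
    return p_tilde - half_width, p_tilde + half_width

# Illustrative values: 18 successes in 46 trials gives p~ = 20 / 50 = 0.4.
lower, upper = adjusted_proportion_ci(x=18, n=46)
print(round(lower, 4), round(upper, 4))
```

The function returns the interval exactly as the formula gives it; it does not clip the endpoints to the range from 0 to 1.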

Small sample confidence interval for $\mu$: $\bar{X} \pm t_{n-1,\alpha/2} \dfrac{s}{\sqrt{n}}$.

Large sample CI for $\mu_X - \mu_Y$ based on independent samples:
$$\bar{X} - \bar{Y} \pm z_{\alpha/2} \sqrt{\frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}}.$$

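CI for $p_X - p_Y$:
$$\tilde{p}_X - \tilde{p}_Y \pm z_{\alpha/2} \sqrt{\frac{\tilde{p}_X(1 - \tilde{p}_X)}{\tilde{n}_X} + \frac{\tilde{p}_Y(1 - \tilde{p}_Y)}{\tilde{n}_Y}}$$
where $\tilde{p}_X = \dfrac{X + 1}{n_X + 2}$, $\tilde{p}_Y = \dfrac{Y + 1}{n_Y + 2}$, $\tilde{n}_X = n_X + 2$, $\tilde{n}_Y = n_Y + 2$.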

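Small-sample CI for $\mu_X - \mu_Y$ based on independent samples, when $\sigma_X^2 \neq \sigma_Y^2$:
$$\bar{X} - \bar{Y} \pm t_{\nu,\alpha/2} \sqrt{\frac{s_X^2}{n_X} + \frac{s_Y^2}{n_Y}}$$
where
$$\nu = \frac{\left(\dfrac{s_X^2}{n_X} + \dfrac{s_Y^2}{n_Y}\right)^2}{\dfrac{(s_X^2/n_X)^2}{n_X - 1} + \dfrac{(s_Y^2/n_Y)^2}{n_Y - 1}} \qquad (1)$$
rounded down to the nearest integer.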
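The degrees of freedom in (1) are tedious to compute by hand, so a short sketch may help. The standard library has no Student $t$ quantile function, so the critical value `t_crit` is assumed to be looked up in a $t$ table by the caller; the function names and the sample values are illustrative.

```python
from math import floor, sqrt

def welch_df(s2x, nx, s2y, ny):
    """Degrees of freedom from equation (1), rounded down to the nearest integer."""
    a, b = s2x / nx, s2y / ny
    nu = (a + b) ** 2 / (a**2 / (nx - 1) + b**2 / (ny - 1))
    return floor(nu)

def unequal_variance_ci(xbar, ybar, s2x, nx, s2y, ny, t_crit):
    """(X-bar - Y-bar) +/- t_crit * sqrt(s_X^2/n_X + s_Y^2/n_Y)."""
    half_width = t_crit * sqrt(s2x / nx + s2y / ny)
    diff = xbar - ybar
    return diff - half_width, diff + half_width

# Illustrative values: sample variances 4 and 9, sample sizes 10 and 12.
nu = welch_df(s2x=4.0, nx=10, s2y=9.0, ny=12)

# t_crit = 2.093 is the tabulated t_{19, 0.025}, supplied by hand.
lower, upper = unequal_variance_ci(20.0, 18.0, 4.0, 10, 9.0, 12, t_crit=2.093)
print(nu, round(lower, 3), round(upper, 3))
```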


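Small-sample CI for $\mu_X - \mu_Y$ based on independent samples, when $\sigma_X^2 = \sigma_Y^2$:
$$\bar{X} - \bar{Y} \pm t_{n_X + n_Y - 2,\,\alpha/2}\; s_p \sqrt{\frac{1}{n_X} + \frac{1}{n_Y}}$$
where
$$s_p = \sqrt{\frac{(n_X - 1)s_X^2 + (n_Y - 1)s_Y^2}{n_X + n_Y - 2}}. \qquad (2)$$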

CI for paired data: $\bar{D} \pm t_{n-1,\alpha/2} \dfrac{s_D}{\sqrt{n}}$, where $D = X - Y$.
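Large sample test for $\mu$: Test statistic $z = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$.
$$P\text{-value} = \begin{cases} P(Z > z) & \text{if } H_1: \mu > \mu_0 \\ P(Z < z) & \text{if } H_1: \mu < \mu_0 \\ 2P(Z > |z|) & \text{if } H_1: \mu \neq \mu_0. \end{cases}$$

Test for $p$: Test statistic $z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$ (assume $np_0 > 10$, $n(1 - p_0) > 10$).
$$P\text{-value} = \begin{cases} P(Z > z) & \text{if } H_1: p > p_0 \\ P(Z < z) & \text{if } H_1: p < p_0 \\ 2P(Z > |z|) & \text{if } H_1: p \neq p_0. \end{cases}$$

Small sample test for $\mu$: Test statistic $t = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}}$.
$$P\text{-value} = \begin{cases} P(t_{n-1} > t) & \text{if } H_1: \mu > \mu_0 \\ P(t_{n-1} < t) & \text{if } H_1: \mu < \mu_0 \\ 2P(t_{n-1} > |t|) & \text{if } H_1: \mu \neq \mu_0. \end{cases}$$

Large sample test for $\mu_X - \mu_Y$: Test statistic: $z = \dfrac{\bar{X} - \bar{Y} - \Delta_0}{\sqrt{\sigma_X^2/n_X + \sigma_Y^2/n_Y}}$.
$$P\text{-value} = \begin{cases} P(Z > z) & \text{if } H_1: \mu_X - \mu_Y > \Delta_0 \\ P(Z < z) & \text{if } H_1: \mu_X - \mu_Y < \Delta_0 \\ 2P(Z > |z|) & \text{if } H_1: \mu_X - \mu_Y \neq \Delta_0. \end{cases}$$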
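All of the $z$-based tests share the same three-case $P$-value rule, so it can be written once and reused. The sketch below applies it to the one-sample large sample test for $\mu$; the labels `"greater"`, `"less"` and `"two-sided"` and the numbers are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def z_p_value(z, alternative):
    """P-value of a z statistic for the three forms of H1."""
    Z = NormalDist()
    if alternative == "greater":      # H1: parameter > null value
        return 1 - Z.cdf(z)
    if alternative == "less":         # H1: parameter < null value
        return Z.cdf(z)
    if alternative == "two-sided":    # H1: parameter != null value
        return 2 * (1 - Z.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")

def one_sample_z(xbar, mu0, sigma, n):
    """z = (X-bar - mu0) / (sigma / sqrt(n))."""
    return (xbar - mu0) / (sigma / sqrt(n))

# Illustrative values giving z = 2 exactly.
z = one_sample_z(xbar=52.0, mu0=50.0, sigma=8.0, n=64)
p_two_sided = z_p_value(z, "two-sided")
p_greater = z_p_value(z, "greater")
print(z, round(p_two_sided, 4), round(p_greater, 4))
```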

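Large sample test for $p_X - p_Y$: Test statistic: $z = \dfrac{\hat{p}_X - \hat{p}_Y}{\sqrt{\hat{p}(1 - \hat{p})(1/n_X + 1/n_Y)}}$, where $\hat{p}_X = \dfrac{X}{n_X}$, $\hat{p}_Y = \dfrac{Y}{n_Y}$ and $\hat{p} = \dfrac{X + Y}{n_X + n_Y}$.
$$P\text{-value} = \begin{cases} P(Z > z) & \text{if } H_1: p_X > p_Y \\ P(Z < z) & \text{if } H_1: p_X < p_Y \\ 2P(Z > |z|) & \text{if } H_1: p_X \neq p_Y. \end{cases}$$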
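The two-proportion statistic uses the pooled estimate $\hat{p}$ in the standard error. A minimal sketch with illustrative counts (60 successes in 100 trials versus 45 in 100); the function name is an assumption.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x, nx, y, ny):
    """z statistic for H0: p_X = p_Y using the pooled proportion."""
    px_hat, py_hat = x / nx, y / ny
    p_hat = (x + y) / (nx + ny)  # pooled estimate
    return (px_hat - py_hat) / sqrt(p_hat * (1 - p_hat) * (1 / nx + 1 / ny))

z = two_proportion_z(x=60, nx=100, y=45, ny=100)

# Two-sided P-value, 2 P(Z > |z|), for H1: p_X != p_Y.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(z, 3), round(p_value, 3))
```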

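Small-sample test for $\mu_X - \mu_Y$, independent samples (assume $\sigma_X^2 \neq \sigma_Y^2$): Test statistic: $t = \dfrac{\bar{X} - \bar{Y} - \Delta_0}{\sqrt{s_X^2/n_X + s_Y^2/n_Y}}$, with $\Delta_0 = 0$ for the alternatives listed.
$$P\text{-value} = \begin{cases} P(t_\nu > t) & \text{if } H_1: \mu_X > \mu_Y \\ P(t_\nu < t) & \text{if } H_1: \mu_X < \mu_Y \\ 2P(t_\nu > |t|) & \text{if } H_1: \mu_X \neq \mu_Y \end{cases}$$
where $\nu$ is as in (1).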
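Small-sample test for $\mu_X - \mu_Y$, independent samples (assume $\sigma_X^2 = \sigma_Y^2$): Test statistic: $t = \dfrac{\bar{X} - \bar{Y} - \Delta_0}{s_p \sqrt{1/n_X + 1/n_Y}}$, with $\Delta_0 = 0$ for the alternatives listed, where $s_p$ is as in (2).
$$P\text{-value} = \begin{cases} P(t_{n_X + n_Y - 2} > t) & \text{if } H_1: \mu_X > \mu_Y \\ P(t_{n_X + n_Y - 2} < t) & \text{if } H_1: \mu_X < \mu_Y \\ 2P(t_{n_X + n_Y - 2} > |t|) & \text{if } H_1: \mu_X \neq \mu_Y. \end{cases}$$

Test for paired data: Test statistic: $t = \dfrac{\bar{D} - \mu_0}{s_D/\sqrt{n}}$, where $D = X - Y$.
$$P\text{-value} = \begin{cases} P(t_{n-1} > t) & \text{if } H_1: \mu_D > \mu_0 \\ P(t_{n-1} < t) & \text{if } H_1: \mu_D < \mu_0 \\ 2P(t_{n-1} > |t|) & \text{if } H_1: \mu_D \neq \mu_0. \end{cases}$$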

Chi-Square test of goodness-of-fit: $H_0: p_1 = p_{10}, \ldots, p_k = p_{k0}$. Test statistic:
$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
where $E_i = N p_{i0}$. $P$-value $= P(\chi^2_{k-1} > \chi^2)$.
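The statistic itself is a simple sum over categories. The sketch below computes only $\chi^2$ and the degrees of freedom $k - 1$; the $P$-value would still have to come from a chi-square table or a statistics library, since the Python standard library does not provide that distribution. The counts describe a hypothetical die rolled 120 times.

```python
def chi_square_gof(observed, p0):
    """Chi-square goodness-of-fit statistic with E_i = N * p_i0."""
    N = sum(observed)
    expected = [N * p for p in p0]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: counts for faces 1..6, with H0 that every face has probability 1/6.
observed = [18, 22, 20, 25, 15, 20]
p0 = [1 / 6] * 6

chi2 = chi_square_gof(observed, p0)
df = len(observed) - 1  # k - 1 degrees of freedom
print(round(chi2, 3), df)
```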


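Chi-Square test of homogeneity: $H_0: p_{1j} = p_{2j} = \cdots = p_{Ij}$ for each $j$ ($j = 1, \ldots, J$). Test statistic:
$$\chi^2 = \sum_{i=1}^{I} \sum_{j=1}^{J} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
where $E_{ij} = \dfrac{O_{i.}\, O_{.j}}{O_{..}}$. $P$-value $= P(\chi^2_{(I-1)(J-1)} > \chi^2)$.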

Correlation between $X$ and $Y$:
$$r = \frac{\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}}{\sqrt{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}\; \sqrt{\sum_{i=1}^{n} y_i^2 - n\bar{y}^2}}$$

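Test for $\rho = \rho_0$ (where $\rho_0 \neq 0$): Use the test statistic $W = \dfrac{1}{2} \log_e \dfrac{1 + r}{1 - r}$, which is approximately normally distributed with mean $\mu_W = \dfrac{1}{2} \log_e \dfrac{1 + \rho}{1 - \rho}$ and variance $\sigma_W^2 = \dfrac{1}{n - 3}$.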

To test for $\rho = 0$, use $U = \dfrac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}$, which has a $t_{n-2}$ distribution under $H_0$.
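The correlation coefficient and both test statistics can be computed from raw data with a few sums. A minimal sketch; the function names and the five data points are illustrative assumptions.

```python
from math import log, sqrt

def correlation(xs, ys):
    """Sample correlation r from the computational formula."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    numerator = sum(x * y for x, y in zip(xs, ys)) - n * xbar * ybar
    denominator = (sqrt(sum(x * x for x in xs) - n * xbar**2)
                   * sqrt(sum(y * y for y in ys) - n * ybar**2))
    return numerator / denominator

def w_statistic(r):
    """W = (1/2) log_e((1 + r) / (1 - r)), used to test rho = rho_0 with rho_0 != 0."""
    return 0.5 * log((1 + r) / (1 - r))

def u_statistic(r, n):
    """U = r sqrt(n - 2) / sqrt(1 - r^2), which is t_{n-2} under H0: rho = 0."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)

# Illustrative data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
w = w_statistic(r)
u = u_statistic(r, len(xs))
print(round(r, 4), round(w, 4), round(u, 4))
```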

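Least squares regression coefficients: $\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}}{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}$, $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$.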
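The least squares coefficients follow directly from the same sums used for the correlation. A minimal sketch; the function name and data are illustrative assumptions.

```python
def least_squares(xs, ys):
    """Return (beta0_hat, beta1_hat) for the line y = beta0 + beta1 * x."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum(x * y for x, y in zip(xs, ys)) - n * xbar * ybar  # numerator of beta1_hat
    sxx = sum(x * x for x in xs) - n * xbar**2                  # denominator of beta1_hat
    beta1 = sxy / sxx
    beta0 = ybar - beta1 * xbar
    return beta0, beta1

# Illustrative data lying close to the line y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.2, 4.1, 6.2, 7.9, 10.1]

beta0, beta1 = least_squares(xs, ys)
print(round(beta0, 4), round(beta1, 4))
```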

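The $100(1 - \alpha)\%$ CIs for $\beta_0$ and $\beta_1$ are $\hat{\beta}_0 \pm t_{n-2,\alpha/2}\, s_{\hat{\beta}_0}$ and $\hat{\beta}_1 \pm t_{n-2,\alpha/2}\, s_{\hat{\beta}_1}$, where
$$s_{\hat{\beta}_0} = s \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}, \qquad s_{\hat{\beta}_1} = \frac{s}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}}$$
and
$$s = \sqrt{\frac{(1 - r^2) \sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 2}}.$$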

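The $100(1 - \alpha)\%$ CI for the mean predicted value at $x$ is $\hat{\beta}_0 + \hat{\beta}_1 x \pm t_{n-2,\alpha/2}\, s_{\hat{y}}$, where
$$s_{\hat{y}} = s \sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}.$$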

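The $100(1 - \alpha)\%$ prediction interval at $x$ is given by $\hat{\beta}_0 + \hat{\beta}_1 x \pm t_{n-2,\alpha/2}\, s_{\text{pred}}$, where
$$s_{\text{pred}} = s \sqrt{1 + \frac{1}{n} + \frac{(x - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}.$$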

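Regression SS (SSR) $= \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$, Error SS (SSE) $= \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$ and Total SS (SST) $= \sum_{i=1}^{n} (y_i - \bar{y})^2$.

Analysis of variance identity: SST = SSR + SSE.

$s^2 = \dfrac{SSE}{n - p - 1}$.

Coefficient of determination $R^2 = \dfrac{SSR}{SST}$.

To test $H_0: \beta_1 = \cdots = \beta_p = 0$, use $F = \dfrac{SSR/p}{SSE/(n - p - 1)}$. Under $H_0$, $F \sim F_{p,\, n-p-1}$.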

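$F$-test for one-way ANOVA:
$$SSTr = \sum_{i=1}^{I} J_i \bar{X}_{i.}^2 - N \bar{X}_{..}^2, \qquad SSE = \sum_{i=1}^{I} \sum_{j=1}^{J_i} X_{ij}^2 - \sum_{i=1}^{I} J_i \bar{X}_{i.}^2$$
$$MSTr = \frac{SSTr}{I - 1}, \qquad MSE = \frac{SSE}{N - I}, \qquad F = \frac{MSTr}{MSE}$$
Under $H_0: \mu_1 = \cdots = \mu_I$, $F \sim F_{I-1,\, N-I}$.

The $100(1 - \alpha)\%$ CI for $\mu_i$ is $\bar{X}_{i.} \pm t_{N-I,\alpha/2} \sqrt{\dfrac{MSE}{J_i}}$.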
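The sums of squares above translate into a few lines of code. The sketch below assumes the data arrive as a list of groups (one list of observations per treatment), which is an illustrative layout; it stops at the $F$ statistic, because the $P$-value requires an $F$ table or a statistics library.

```python
def one_way_anova(groups):
    """Return (F, MSTr, MSE) for a list of groups of observations."""
    I = len(groups)                              # number of treatments
    N = sum(len(g) for g in groups)              # total number of observations
    grand_mean = sum(sum(g) for g in groups) / N
    group_means = [sum(g) / len(g) for g in groups]

    weighted_sq_means = sum(len(g) * m**2 for g, m in zip(groups, group_means))
    sstr = weighted_sq_means - N * grand_mean**2                      # treatment SS
    sse = sum(x**2 for g in groups for x in g) - weighted_sq_means    # error SS

    mstr = sstr / (I - 1)
    mse = sse / (N - I)
    return mstr / mse, mstr, mse

# Illustrative data: three treatments with three observations each.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
F, mstr, mse = one_way_anova(groups)
print(round(F, 3), round(mstr, 3), round(mse, 3))
```

Under $H_0$ this $F$ would be compared with the $F_{I-1,\,N-I}$ distribution, here $F_{2,6}$.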

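Fisher's Least Significant Difference Method:

The $100(1 - \alpha)\%$ CI for $(\mu_i - \mu_j)$ is $\bar{X}_{i.} - \bar{X}_{j.} \pm t_{N-I,\alpha/2} \sqrt{MSE\left(\dfrac{1}{J_i} + \dfrac{1}{J_j}\right)}$.

To test $H_0: \mu_i = \mu_j$, reject $H_0$ at level $\alpha$ if
$$|\bar{X}_{i.} - \bar{X}_{j.}| > t_{N-I,\alpha/2} \sqrt{MSE\left(\frac{1}{J_i} + \frac{1}{J_j}\right)}.$$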

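Bonferroni Method with $C$ simultaneous comparisons:

The $100(1 - \alpha)\%$ CI for $(\mu_i - \mu_j)$ is $\bar{X}_{i.} - \bar{X}_{j.} \pm t_{N-I,\alpha/(2C)} \sqrt{MSE\left(\dfrac{1}{J_i} + \dfrac{1}{J_j}\right)}$.

To test $H_0: \mu_i = \mu_j$, reject $H_0$ at level $\alpha$ if
$$|\bar{X}_{i.} - \bar{X}_{j.}| > t_{N-I,\alpha/(2C)} \sqrt{MSE\left(\frac{1}{J_i} + \frac{1}{J_j}\right)}.$$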

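Tukey-Kramer Method for all possible comparisons:

The $100(1 - \alpha)\%$ CI for $(\mu_i - \mu_j)$ is $\bar{X}_{i.} - \bar{X}_{j.} \pm q_{I,N-I,\alpha} \sqrt{\dfrac{MSE}{2}\left(\dfrac{1}{J_i} + \dfrac{1}{J_j}\right)}$.

To test $H_0: \mu_i = \mu_j$, reject $H_0$ at level $\alpha$ if
$$|\bar{X}_{i.} - \bar{X}_{j.}| > q_{I,N-I,\alpha} \sqrt{\frac{MSE}{2}\left(\frac{1}{J_i} + \frac{1}{J_j}\right)}.$$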
