
Statistics II

Fact sheet for exams


Confidence intervals and hypothesis testing in one and two populations.
Notation:

$\mu_X$ and $\sigma_X^2$: population mean and variance of a random variable/population $X$; $\bar{X}$ and $s_X^2$: sample mean and quasi-variance

$p_X$: population proportion if $X \sim \text{Bernoulli}(p_X)$; $\hat{p}_X$: sample proportion

$X_n$: simple random sample (SRS) of size $n$ from $X$

$(1-\alpha)$: confidence level; $\alpha$: significance level

$z_\alpha$: an upper $\alpha$ quantile of the $N(0,1)$ distribution; $t_{n-1;\alpha}$: an upper $\alpha$ quantile of a $t_{n-1}$ distribution
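Parameter | Assumptions: SRS(s) and | Pivotal quantity and distribution

$\mu_X$ | Normal population, known variance | $\dfrac{\bar{X} - \mu_X}{\sigma_X/\sqrt{n}} \sim N(0,1)$

$\mu_X$ | Normal population, unknown variance | $\dfrac{\bar{X} - \mu_X}{s_X/\sqrt{n}} \sim t_{n-1}$

$\mu_X$ | Nonnormal population, large sample size | $\dfrac{\bar{X} - \mu_X}{s_X/\sqrt{n}} \sim_{\text{approx.}} N(0,1)$

$p_X$ | Bernoulli population, large sample size | $\dfrac{\hat{p}_X - p_X}{\sqrt{\hat{p}_X(1-\hat{p}_X)/n}} \sim_{\text{approx.}} N(0,1)$

$\sigma_X^2$ and $\sigma_X$ | Normal population | $\dfrac{(n-1)s_X^2}{\sigma_X^2} \sim \chi^2_{n-1}$

$\mu_X - \mu_Y$ | Normal difference $D_i = X_i - Y_i$, matched pairs | $\dfrac{\bar{D} - \mu_D}{s_D/\sqrt{n}} \sim t_{n-1}$

$\mu_X - \mu_Y$ | Normal populations, common variance | $\dfrac{\bar{X} - \bar{Y} - (\mu_X - \mu_Y)}{s_p\sqrt{\frac{1}{n_X} + \frac{1}{n_Y}}} \sim t_{n_X+n_Y-2}$, where $s_p^2 = \dfrac{(n_X-1)s_X^2 + (n_Y-1)s_Y^2}{n_X+n_Y-2}$

$\mu_X - \mu_Y$ | Normal populations, known variances | $\dfrac{\bar{X} - \bar{Y} - (\mu_X - \mu_Y)}{\sqrt{\frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}}} \sim N(0,1)$

$\mu_X - \mu_Y$ | Nonnormal populations, large sample sizes | $\dfrac{\bar{X} - \bar{Y} - (\mu_X - \mu_Y)}{\sqrt{\frac{s_X^2}{n_X} + \frac{s_Y^2}{n_Y}}} \sim_{\text{approx.}} N(0,1)$

$p_X - p_Y$ | Bernoulli populations, large sample sizes | $\dfrac{\hat{p}_X - \hat{p}_Y - (p_X - p_Y)}{\sqrt{\hat{p}_0(1-\hat{p}_0)\left(\frac{1}{n_X} + \frac{1}{n_Y}\right)}} \sim_{\text{approx.}} N(0,1)$, where $\hat{p}_0 = \dfrac{n_X\hat{p}_X + n_Y\hat{p}_Y}{n_X+n_Y}$

$\sigma_X^2/\sigma_Y^2$ and $\sigma_X/\sigma_Y$ | Normal populations | $\dfrac{s_X^2/\sigma_X^2}{s_Y^2/\sigma_Y^2} \sim F_{n_X-1,\,n_Y-1}$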
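As an illustration of how one row of the table turns into a computation, the following is a minimal sketch of the pooled-variance statistic for $\mu_X - \mu_Y$ (normal populations, common variance). The function name `pooled_t_statistic`, the argument `delta0` and the two samples are hypothetical, chosen only for this example.

```python
from math import sqrt
from statistics import mean, variance  # variance() divides by n - 1 (quasi-variance)


def pooled_t_statistic(x, y, delta0=0.0):
    """Return (t, s_p^2, df) for the hypothesised difference mu_X - mu_Y = delta0."""
    n_x, n_y = len(x), len(y)
    # Pooled quasi-variance s_p^2
    s2_p = ((n_x - 1) * variance(x) + (n_y - 1) * variance(y)) / (n_x + n_y - 2)
    t = (mean(x) - mean(y) - delta0) / (sqrt(s2_p) * sqrt(1 / n_x + 1 / n_y))
    return t, s2_p, n_x + n_y - 2


# Hypothetical samples, for illustration only
x = [5.0, 7.0, 9.0]
y = [2.0, 4.0, 6.0, 8.0]
t_stat, s2_p, df = pooled_t_statistic(x, y)
print(t_stat, s2_p, df)
```

The statistic would then be compared with a quantile of the $t_{n_X+n_Y-2}$ distribution taken from a table.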

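Example: To construct a $(1-\alpha)$ confidence interval for $\mu_X$ if $X \sim N(\mu_X, \sigma_X^2)$ with $\sigma_X^2$ unknown we have:

$$CI_{1-\alpha}(\mu_X) = \left( \bar{x} - t_{n-1;\alpha/2}\,\frac{s_x}{\sqrt{n}} \; ; \; \bar{x} + t_{n-1;\alpha/2}\,\frac{s_x}{\sqrt{n}} \right)$$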
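To perform a lower-tail test $H_0: \mu_X \geq \mu_0$ versus $H_1: \mu_X < \mu_0$, the rejection region at significance level $\alpha$, $RR_\alpha$, is:

$$RR_\alpha = \left\{ t : \overbrace{\frac{\bar{x} - \mu_0}{s_x/\sqrt{n}}}^{t} < t_{n-1;1-\alpha} \right\}$$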
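The decision rule can be sketched as follows. Since the quantiles are upper quantiles, symmetry of the $t$ distribution gives $t_{n-1;1-\alpha} = -t_{n-1;\alpha}$, so the sketch takes the tabulated $t_{n-1;\alpha}$ as input. The sample, the values of $\mu_0$ and the function name are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def lower_tail_t_test(sample, mu_0, t_upper):
    """Return (t statistic, reject H0?) where t_upper is t_{n-1; alpha} from a table."""
    n = len(sample)
    t_stat = (mean(sample) - mu_0) / (stdev(sample) / sqrt(n))
    # Reject when t < t_{n-1; 1-alpha} = -t_{n-1; alpha}
    return t_stat, t_stat < -t_upper


# Hypothetical sample of size n = 5 and table value t_{4; 0.05} = 2.132
sample = [4.9, 5.1, 5.0, 5.2, 4.8]
t_upper = 2.132
t_a, reject_a = lower_tail_t_test(sample, mu_0=5.2, t_upper=t_upper)
t_b, reject_b = lower_tail_t_test(sample, mu_0=5.1, t_upper=t_upper)
print(t_a, reject_a)
print(t_b, reject_b)
```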

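Sample covariance and correlation based on bivariate observations $(x_1, y_1), \ldots, (x_n, y_n)$:

$$\overbrace{\text{cov}(x,y)}^{s_{xy}} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n-1} = \frac{\sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}}{n-1}$$

$$\overbrace{\text{cor}(x,y)}^{r_{(x,y)}} = \frac{\text{cov}(x,y)}{s_x s_y} = \frac{\sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}}{\sqrt{\left(\sum_{i=1}^{n} x_i^2 - n\,\bar{x}^2\right)\left(\sum_{i=1}^{n} y_i^2 - n\,\bar{y}^2\right)}}$$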
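A short sketch of the computational forms above, written with plain Python sums. The data and the function name `sample_cov_cor` are hypothetical.

```python
from math import sqrt


def sample_cov_cor(x, y):
    """Sample covariance s_xy (divisor n - 1) and correlation r, via the shortcut sums."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sum_xy = sum(a * b for a, b in zip(x, y)) - n * x_bar * y_bar
    sum_xx = sum(a * a for a in x) - n * x_bar ** 2
    sum_yy = sum(b * b for b in y) - n * y_bar ** 2
    s_xy = sum_xy / (n - 1)
    r = sum_xy / sqrt(sum_xx * sum_yy)
    return s_xy, r


# Hypothetical bivariate observations
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
s_xy, r = sample_cov_cor(x, y)
print(s_xy, r)
```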

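Slope and intercept estimates in the simple linear regression model $y_i = \beta_0 + \beta_1 x_i + u_i$, where $u_i \sim$ iid $N(0, \sigma^2)$, to obtain the fitted line $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$:

$$\hat{\beta}_1 = \frac{\text{cov}(x,y)}{s_x^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}}{\sum_{i=1}^{n} x_i^2 - n\,\bar{x}^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$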
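The estimates can be computed directly from the centred sums, as in this minimal sketch. The data and the function name `fit_line` are hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y_i = b0 + b1 * x_i + u_i."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    b1 = s_xy / s_xx          # slope: cov(x, y) / s_x^2 (the n - 1 factors cancel)
    b0 = y_bar - b1 * x_bar   # intercept: the line passes through (x_bar, y_bar)
    return b0, b1


# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = fit_line(x, y)
print(f"fitted line: y = {b0:.2f} + {b1:.2f} x")
```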

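Pivotal quantities for $\beta_1$, $\beta_0$, $\sigma^2$, with residuals $e_i = y_i - \hat{y}_i$ and residual variance $s_R^2 = \dfrac{\sum_{i=1}^{n} e_i^2}{n-2}$:

$$\frac{\hat{\beta}_1 - \beta_1}{\sqrt{\dfrac{s_R^2}{(n-1)s_X^2}}} \sim t_{n-2}, \qquad \frac{\hat{\beta}_0 - \beta_0}{\sqrt{s_R^2\left(\dfrac{1}{n} + \dfrac{\bar{x}^2}{(n-1)s_X^2}\right)}} \sim t_{n-2}, \qquad \frac{(n-2)\,s_R^2}{\sigma^2} \sim \chi^2_{n-2}$$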
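A minimal sketch of the residual variance and the standard errors appearing in these pivotal quantities, followed by the $t$ statistic for the hypothetical null value $\beta_1 = 0$. The data are made up for illustration, and the variable names are not from the fact sheet.

```python
from math import sqrt

# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((a - x_bar) ** 2 for a in x)  # equals (n - 1) * s_X^2
b1 = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar

# Residuals e_i = y_i - y_hat_i and residual variance s_R^2 with n - 2 degrees of freedom
residuals = [b - (b0 + b1 * a) for a, b in zip(x, y)]
s2_R = sum(e * e for e in residuals) / (n - 2)

# Denominators of the two t_{n-2} pivotal quantities
se_b1 = sqrt(s2_R / s_xx)
se_b0 = sqrt(s2_R * (1 / n + x_bar ** 2 / s_xx))

# t statistic under the hypothetical null value beta_1 = 0
t_b1 = (b1 - 0.0) / se_b1
print(s2_R, se_b1, se_b0, t_b1)
```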

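Confidence intervals for the mean and individual response for $y_0$ given $X = x_0$:

$$\hat{y}_0 \pm t_{n-2,\alpha/2} \sqrt{s_R^2 \left( \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{(n-1)\,s_X^2} \right)}, \qquad \hat{y}_0 \pm t_{n-2,\alpha/2} \sqrt{s_R^2 \left( 1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{(n-1)\,s_X^2} \right)}$$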
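The two intervals differ only by the extra "1 +" term under the square root, which this sketch makes explicit. The data, the point $x_0 = 4$ and the function name are hypothetical, and $t_{3;0.025} = 3.182$ is assumed to come from a $t$ table.

```python
from math import sqrt


def response_intervals(x, y, x0, t_crit):
    """Intervals for the mean response and an individual response at x0.

    t_crit is t_{n-2; alpha/2}, supplied from a table.
    """
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((a - x_bar) ** 2 for a in x)  # (n - 1) * s_X^2
    b1 = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    s2_R = sum((b - b0 - b1 * a) ** 2 for a, b in zip(x, y)) / (n - 2)

    y0_hat = b0 + b1 * x0
    leverage = 1 / n + (x0 - x_bar) ** 2 / s_xx
    half_mean = t_crit * sqrt(s2_R * leverage)
    half_indiv = t_crit * sqrt(s2_R * (1 + leverage))
    return ((y0_hat - half_mean, y0_hat + half_mean),
            (y0_hat - half_indiv, y0_hat + half_indiv))


# Hypothetical data with n = 5, so the quantile has n - 2 = 3 degrees of freedom
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
mean_ci, indiv_ci = response_intervals(x, y, x0=4, t_crit=3.182)
print(mean_ci)
print(indiv_ci)
```

The interval for an individual response is always the wider of the two, since it also accounts for the error term of the new observation.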
ANOVA table for the simple linear regression model (R-squared $R^2 = SSM/SST$):

Source of variability | SS | DF | Mean | F ratio
Model | $SSM = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$ | $1$ | $SSM/1$ | $SSM/s_R^2$
Residuals/errors | $SSR = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} e_i^2$ | $n-2$ | $SSR/(n-2) = s_R^2$ |
Total | $SST = SSM + SSR$ | $n-1$ | |

To test $H_0: \beta_1 = 0$ vs. $H_1: \beta_1 \neq 0$, the test statistic is $F = SSM/s_R^2 \sim F_{1,n-2}$ and $RR_\alpha = \{F > F_{1,n-2;\alpha}\}$.
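A minimal sketch of the ANOVA decomposition and the F test for a simple regression. The data are hypothetical, and $F_{1,3;0.05} = 10.13$ is assumed to be read from an F table.

```python
# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((a - x_bar) ** 2 for a in x)
b1 = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * a for a in x]

# Sums of squares from the ANOVA table
SSM = sum((f - y_bar) ** 2 for f in y_hat)
SSR = sum((b - f) ** 2 for b, f in zip(y, y_hat))
SST = SSM + SSR

s2_R = SSR / (n - 2)   # residual mean square
F = (SSM / 1) / s2_R   # F ratio with (1, n - 2) degrees of freedom
R2 = SSM / SST

f_crit = 10.13         # F_{1,3; 0.05}, table value
reject_h0 = F > f_crit
print(SSM, SSR, SST, F, R2, reject_h0)
```

In simple regression the F ratio equals the square of the $t$ statistic for $\beta_1 = 0$, so both tests lead to the same decision.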


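Model formulation, estimates, fitted model and residuals in the multiple linear regression model $y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + u_i$, where $u_i \sim$ iid $N(0, \sigma^2)$, in matrix notation:

$$y = X\beta + u, \qquad \hat{\beta} = (X^T X)^{-1} X^T y, \qquad \hat{y} = X\hat{\beta}, \qquad e = y - \hat{y},$$

where

$$y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad X = \begin{pmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1k} \\ 1 & x_{21} & x_{22} & \cdots & x_{2k} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & x_{n2} & \cdots & x_{nk} \end{pmatrix}, \quad \beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{pmatrix}, \quad u = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}$$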

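Pivotal quantities for $\sigma^2$ and $\beta_j$, $j = 0, 1, \ldots, k$, with residual variance $s_R^2 = \sum_{i=1}^{n} e_i^2/(n-k-1)$:

$$\frac{(n-k-1)\,s_R^2}{\sigma^2} \sim \chi^2_{n-k-1}, \qquad \frac{\hat{\beta}_j - \beta_j}{s(\hat{\beta}_j)} \sim t_{n-k-1},$$

where $s(\hat{\beta}_j) = \sqrt{s^2(\hat{\beta}_j)}$ and $s^2(\hat{\beta}_j)$ is the $j$-th diagonal element of the (estimated) variance-covariance matrix of $\hat{\beta}$, with the matrix defined as $S = s_R^2 (X^T X)^{-1}$.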
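The matrix formulas can be sketched without external libraries for a small problem with $k = 2$ regressors. The data and the helper names (`transpose`, `matmul`, `inverse`) are hypothetical, and the Gauss-Jordan inverse is only a convenience for this illustration; in practice a numerical linear algebra library would normally be used.

```python
from math import sqrt


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    b_t = transpose(b)
    return [[sum(p * q for p, q in zip(row, col)) for col in b_t] for row in a]


def inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [[float(v) for v in row] + [float(i == j) for j in range(n)]
         for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[p][c]) < 1e-12:
            raise ValueError("matrix is singular")
        m[c], m[p] = m[p], m[c]
        pivot = m[c][c]
        m[c] = [v / pivot for v in m[c]]
        for r in range(n):
            if r != c:
                factor = m[r][c]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[c])]
    return [row[n:] for row in m]


# Hypothetical data: n = 5 observations, k = 2 regressors
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [3, 4, 8, 8, 13]
n, k = len(y), 2

X = [[1.0, a, b] for a, b in zip(x1, x2)]   # design matrix with a column of ones
Xt = transpose(X)
XtX_inv = inverse(matmul(Xt, X))
beta_hat = [row[0] for row in matmul(XtX_inv, matmul(Xt, [[v] for v in y]))]

y_hat = [sum(b * v for b, v in zip(beta_hat, row)) for row in X]
e = [yi - fi for yi, fi in zip(y, y_hat)]
s2_R = sum(r * r for r in e) / (n - k - 1)

S = [[s2_R * v for v in row] for row in XtX_inv]   # estimated covariance matrix of beta_hat
se = [sqrt(S[j][j]) for j in range(k + 1)]         # s(beta_hat_j) from the diagonal of S
t_stats = [b / s for b, s in zip(beta_hat, se)]    # statistics for the null values beta_j = 0
print(beta_hat, s2_R, se, t_stats)
```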
ANOVA table for the multiple linear regression model:

Source of variability | SS | DF | Mean | F ratio
Model | $SSM$ | $k$ | $SSM/k$ | $(SSM/k)/s_R^2$
Residuals/errors | $SSR$ | $n-k-1$ | $SSR/(n-k-1) = s_R^2$ |
Total | $SST$ | $n-1$ | |
