
MATH5315 Applied Statistics and Probability 2011-2012

Lecturer: Andrew J. Baczkowski, room 9.21i, email: sta6ajb@leeds.ac.uk


Regularly updated information about the module is available on the internet at:
http://www.maths.leeds.ac.uk/sta6ajb/math5315/math5315.html
Module Objective: The aim of the module is to provide a grounding in the aspects of statistics,
in particular statistical modelling, that are of relevance to actuarial and financial work. The
module introduces and develops the fundamental concepts of probability and statistics used in
applied financial analysis.
Provisional Detailed Syllabus:
Part I: Fundamentals of Probability (11 lectures)
Summarising data; Introduction to probability; Random variables; Probability distributions; Generating functions; Joint distributions; The central limit theorem; Conditional expectation.
Part II: Fundamentals of Statistics (9 lectures)
Sampling and statistical inference; Point estimation; Confidence intervals; Hypothesis testing.
Part III: Applied Statistics (10 lectures)
Correlation and regression (OLS); Analysis of variance (ANOVA); Univariate time series analysis
and forecasting (ARMA); Multivariate time series analysis (VAR); Cointegration; Volatility models
(ARCH/GARCH).
Booklist: Sections from the two books will form the course notes for this module.
1. Subject CT3 Probability and Mathematical Statistics Core Technical, Core Reading, published
by the Institute of Actuaries, price £45. Referred to as CT3.
2. Introductory Econometrics for Finance (2nd edition) by C. Brooks, published by Cambridge
University Press, 2008, price £40. Referred to as IEF.
It is ESSENTIAL that you have access to these books. You MUST prepare the material BEFORE
the lectures, which will consist of examples and further explanation to illustrate the book material.
Timetable:
Lectures (weeks 1-5): Tuesdays 10-11 in RSLT14 and Tuesdays 12-1 in RSLT08.
Lectures (weeks 1-4, 6-11): Fridays 1-3 in RSLT08.
Seminar (weeks 1-4, 6): Fridays 3-4 in E.C.Stoner Building, room 9.90.
Practical (weeks 7-11): Fridays 3-4 in Irene Manton North cluster.
(RSLT is the Roger Stevens Lecture Theatre Block).
Assessment:
70% of marks for two hour examination at end of semester.
30% of marks for continuous assessment practical work.
Examination Paper: Format currently planned as follows for the TWO hour paper. Eleven
section A questions each worth TWO marks. Eleven section B questions each worth THREE
marks. Eleven section C questions each worth FIVE marks. You attempt TEN section A questions,
TEN section B questions, and TEN section C questions.
Exercise Sheets for MATH5315: None. I will introduce examples as we need them. There are
some questions (and answers) available on the module web-page.



Lecture 1: Summarising Data
References: CT3 Unit 1.
2 Tabular and graphical methods.
2.1 Types of data. Discrete and continuous data.
2.2 Frequency distribution.

A line chart is better for discrete data than a bar chart!

2.3 Histograms.
2.5 Lineplots.

Dotplots. Cumulative frequency at x is the frequency of observations ≤ x.

3 Measures of Location.
3.1 The mean. Sample mean x̄.
3.2 The median.
4 Measures of spread.
4.1 The standard deviation. Sample standard deviation s, sample variance s².
4.2 Moments. Sample moments about the origin m_k = (1/n) Σ_{i=1}^n x_i^k; sample central moments (1/n) Σ_{i=1}^n (x_i − x̄)^k.

4.3 The range.


4.4 The interquartile range. More often people use the semi-interquartile range SIQ = ½(Q₃ − Q₁).

5 Symmetry and skewness.


5.1 Boxplots.
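
A minimal R sketch (hypothetical data; R is the software used in the practicals) illustrating these summaries:

  x <- c(28, 32, 35, 37, 41, 44, 46, 50, 53, 59)   # hypothetical sample
  mean(x); median(x)                               # measures of location
  sd(x); var(x)                                    # sample standard deviation s and variance s^2
  q <- quantile(x, c(0.25, 0.75))                  # Q1 and Q3
  unname(diff(q)) / 2                              # semi-interquartile range (Q3 - Q1)/2
  boxplot(x)                                       # boxplot for symmetry/skewness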



Lecture 2: Introduction to Probability
References: CT3 Unit 2.
1 Introduction to sets. Sample space S, event A.
1.1 Complementary sets. Complement Ā of A; more usually Aᶜ might be used.
1.2 Set operations. Union ∪ and intersection ∩.

2 Probability axioms and the addition rule.
2.1 Basic probability axioms. P{S} = 1, 0 ≤ P{A} ≤ 1. If A and B are mutually exclusive, P{A ∪ B} = P{A} + P{B}.
2.2 The addition rule. In general P{A ∪ B} = P{A} + P{B} − P{A ∩ B}.

3 Conditional probability. P{A|B} = P{A ∩ B}/P{B}.
3.1 Independent events. A and B are independent if and only if P{A ∩ B} = P{A} P{B}.
3.2 Theorem of total probability. P{A} = Σ_{j=1}^n P{A ∩ E_j}.
3.3 Bayes' Theorem. P{E_i|A} = P{A|E_i} P{E_i} / Σ_{j=1}^n P{A|E_j} P{E_j}.
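
A minimal R sketch (hypothetical probabilities) of the total probability and Bayes' theorem calculations:

  prior <- c(0.5, 0.3, 0.2)            # P{E_j}, hypothetical
  like  <- c(0.02, 0.05, 0.10)         # P{A | E_j}, hypothetical
  pA <- sum(prior * like)              # theorem of total probability: P{A}
  pA
  prior * like / pA                    # Bayes' theorem: posterior P{E_j | A}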



Lecture 3: Random Variables
References: CT3 Unit 3.
1 Discrete random variables. Random variable X. Probability function fX(x) = P{X = x}; notation pX(x) would be better!
Cumulative distribution function (cdf) FX(x) = P{X ≤ x} = Σ_{xi ≤ x} fX(xi).

2 Continuous random variables. Probability density function (pdf) fX(x).
P{a ≤ X ≤ b} = ∫_a^b fX(x) dx. Cdf FX(x) = ∫_{−∞}^x fX(t) dt, so fX(x) = dFX(x)/dx.
3 Expected values.
3.1 Mean. E[X] or μ. More generally E[g(X)].
3.2 Variance and standard deviation. Variance V[X] often denoted σ². (I prefer Var[X] as notation!) Standard deviation σ.
3.3 Linear functions of X. E[aX + b] = aμ + b. V[aX + b] = a²σ².
3.4 Moments. μ_k = E[(X − μ)^k] is the kth central moment of X about μ. Can measure skewness using μ₃/σ³.

4 Functions of a random variable. Y = u(X).
4.1 Discrete random variables. If we have a one-to-one mapping, P{Y = y₁} = P{X = x₁} where y₁ = u(x₁).
4.2 Continuous random variables. Y = u(X) so X = w(Y). fY(y) = fX(x) |dx/dy|.



Lecture 4: Probability Distributions I
References: CT3 Unit 4.
2 Discrete distributions.
2.1 Uniform distribution. P{X = x} = 1/k for x = 1, 2, ..., k.
2.2 Bernoulli distribution. Bernoulli trial.
2.3 Binomial distribution. If X is the number of successes in n Bernoulli trials, X ~ Bin(n, θ); P{X = x} = (n choose x) θ^x (1 − θ)^(n−x) for x = 0, 1, 2, ..., n.
2.4 Geometric distribution. If X is the number of Bernoulli trials until the first success, X ~ geometric(θ); P{X = x} = θ(1 − θ)^(x−1) for x = 1, 2, 3, ....
2.7 Poisson distribution. If X ~ Poisson(λ), then P{X = x} = λ^x e^(−λ)/x! for x = 0, 1, 2, ....
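
These probability functions are built into R; a small sketch with hypothetical parameter values:

  dbinom(3, size = 10, prob = 0.2)   # P{X = 3} for X ~ Bin(10, 0.2)
  dpois(2, lambda = 1.5)             # P{X = 2} for X ~ Poisson(1.5)
  ppois(2, lambda = 1.5)             # cdf P{X <= 2}
  # note: R's dgeom counts failures before the first success, so in the notes' numbering
  dgeom(4, prob = 0.2)               # P{5 trials to first success} = 0.2 * 0.8^4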



Lecture 5: Probability Distributions II
References: CT3 Unit 4.
3 Continuous distributions.
3.1 Uniform distribution. If X ~ uniform(α, β), fX(x) = 1/(β − α) for α < x < β.
3.2 Gamma distribution. Gamma function Γ(α): Γ(1) = 1, Γ(α) = (α − 1)Γ(α − 1), Γ(n) = (n − 1)! for positive integer n.
If X ~ gamma(α, λ), fX(x) = λ^α x^(α−1) e^(−λx) / Γ(α) for x > 0. E[X] = α/λ, V[X] = α/λ².
Exponential distribution: X ~ exponential(λ) ≡ gamma(1, λ).
Chi-squared distribution: X ~ χ²_ν ≡ gamma(α = ½ν, λ = ½).
3.3 Beta distribution. (NOT needed for the exam.)
fX(x) = x^(α−1) (1 − x)^(β−1) / B(α, β) for 0 < x < 1, where B(α, β) = Γ(α)Γ(β)/Γ(α + β).
Mean is α/(α + β). Variance is αβ / [(α + β)²(α + β + 1)].
3.4 Normal distribution. If X ~ N(μ, σ²), fX(x) = (1/√(2πσ²)) exp{−(x − μ)²/(2σ²)} for −∞ < x < ∞.
If Z = (X − μ)/σ, then Z ~ N(0, 1). Values P{Z < z} = Φ(z) are tabulated.
3.6 t-distribution. If X ~ χ²_ν and Z ~ N(0, 1) independently, then T = Z/√(X/ν) ~ t_ν.
3.7 F-distribution. If X ~ χ²_{n₁} and Y ~ χ²_{n₂} and are independent, then F = (X/n₁)/(Y/n₂) ~ F_{n₁,n₂}.



Lecture 6: Generating Functions
References: CT3 Unit 5.
1 Probability generating functions. If X is discrete taking value k with probability p_k (k = 0, 1, 2, ...), then GX(t) = E[t^X] = Σ_{k=0}^∞ p_k t^k.
GX(1) = 1, GX(0) = p₀.
1.1 Important examples. Uniform. Binomial. Geometric (as special case of negative binomial). Poisson.
1.2 Evaluating moments. G′X(t) = Σ_{k=1}^∞ k p_k t^(k−1) so G′X(1) = Σ_{k=1}^∞ k p_k = E[X].
G″X(t) = Σ_{k=2}^∞ k(k − 1) p_k t^(k−2) so G″X(1) = Σ_{k=2}^∞ k(k − 1) p_k = E[X(X − 1)].
2 Moment generating function. mX(t) = E[e^(tX)].
In continuous case mX(t) = ∫_x e^(tx) fX(x) dx.
mX^(r)(t) = ∫_x x^r e^(tx) fX(x) dx so mX^(r)(0) = ∫_x x^r fX(x) dx = E[X^r].
2.1 Important examples. gamma(α, λ). N(μ, σ²).
4 Linear functions. If Y = a + bX, GY(t) = E[t^Y] = E[t^(a+bX)] = t^a E[(t^b)^X] = t^a GX(t^b).
Similarly, mY(t) = e^(at) mX(bt).
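
A numerical check of the moment results, using the Poisson(λ) pgf GX(t) = exp{λ(t − 1)} and mgf mX(t) = exp{λ(e^t − 1)}; a sketch with a hypothetical λ:

  lam <- 2
  G <- function(t) exp(lam * (t - 1))          # Poisson pgf
  m <- function(t) exp(lam * (exp(t) - 1))     # Poisson mgf
  h <- 1e-3
  (G(1 + h) - G(1 - h)) / (2 * h)              # ~ G'(1) = E[X] = lambda
  (m(h) - 2 * m(0) + m(-h)) / h^2              # ~ m''(0) = E[X^2] = lambda + lambda^2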



Lecture 7: Joint Distributions I
References: CT3 Unit 6.
1 Joint distributions.
1.1 Joint probability (density) functions.
Discrete case f(x, y) = P{X = x, Y = y}, though a better notation might be pX,Y(x, y) as in 1.3!
Continuous case f(x, y) is joint pdf. P{x₁ < X < x₂, y₁ < Y < y₂} = ∫_{y₁}^{y₂} ∫_{x₁}^{x₂} f(x, y) dx dy.
1.2 Marginal probability (density) functions.
Discrete case fX(x) = P{X = x} = Σ_y f(x, y), though a better notation might be pX(x) = P{X = x} as in 1.3!
Continuous case, marginal pdf is fX(x) = ∫_y f(x, y) dy.
1.3 Conditional probability (density) functions. Recall P{A|B} = P{A ∩ B}/P{B}.
Discrete case P{X = x|Y = y} = pX,Y(x, y)/pY(y). Continuous case fX|Y=y(x|y) = fX,Y(x, y)/fY(y).
1.4 Independence of random variables.
If X and Y are independent, then fX,Y(x, y) = fX(x) fY(y) for all x and y.
If X and Y are independent, then g(X) and h(Y) will be independent.
2 Expectations of functions of two random variables.
2.1 Expectations. Discrete case E[g(X, Y)] = Σ_x Σ_y g(x, y) P{X = x, Y = y}. Continuous case E[g(X, Y)] = ∫∫ g(x, y) fX,Y(x, y) dx dy.

2.2 Expectations of sums and products. E[ag(X) + bh(Y )] = aE[g(X)] + bE[h(Y )].
If X and Y are independent, E[g(X)h(Y )] = E[g(X)]E[h(Y )].



Lecture 8: Joint Distributions II
References: CT3 Unit 6.
2.3 Covariance and correlation coefficient.
cov(X, Y) = E[(X − μX)(Y − μY)] = E[XY] − μX μY. cov(X, X) = V[X].
corr(X, Y) = cov(X, Y)/√(V[X] V[Y]), often denoted ρ, −1 ≤ ρ ≤ 1.
If ρ = 0, X and Y are uncorrelated.
2.3.1 Useful results on handling covariances.
cov(aX + b, cY + d) = ac cov(X, Y). cov(X, Y + Z) = cov(X, Y) + cov(X, Z).
If X and Y are independent, cov(X, Y) = 0. Converse not necessarily true.
2.4 Variance of a sum. V[X + Y] = V[X] + V[Y] + 2 cov(X, Y).
3 Convolutions. Suppose Z = X + Y.
Discrete case P{Z = z} = Σ_x P{X = x, Y = z − x}.
Continuous case fZ(z) = ∫_x fX,Y(x, z − x) dx.
Simplifies in independence case.



Lecture 9: More on generating functions
References: CT3 Units 5 and 6.
3 (Unit 5) Cumulant generating function.
CX(t) = log mX(t). C′X(0) = E[X]. C″X(0) = V[X].
CX(t) = κ₁t + κ₂t²/2! + κ₃t³/3! + ···.
3.1 (Unit 6) Moments of linear combinations of random variables.
E[Σ_{i=1}^n c_i X_i] = Σ_{i=1}^n c_i E[X_i].
V[Σ_{i=1}^n c_i X_i] = Σ_{i=1}^n c_i² V[X_i] + 2 Σ_{1≤i<j≤n} c_i c_j cov(X_i, X_j). Special case is (mutual) independence case.
3.2 (Unit 6) Distributions of linear combinations of independent random variables.
Discrete case via probability generating function (pgf). Let S = c₁X + c₂Y.
GS(t) = E[t^(c₁X + c₂Y)] = E[(t^c₁)^X] E[(t^c₂)^Y] = GX(t^c₁) GY(t^c₂).
Binomial case: If X ~ Bin(m, θ) and Y ~ Bin(n, θ), X + Y ~ Bin(m + n, θ).
Poisson case: If X ~ Poisson(λ) and Y ~ Poisson(μ), then X + Y ~ Poisson(λ + μ).
Continuous case via moment generating functions (mgf). Let S = c₁X + c₂Y.
mS(t) = E[e^((c₁X + c₂Y)t)] = E[e^((c₁t)X)] E[e^((c₂t)Y)] = mX(c₁t) mY(c₂t).
Exponential case: If X_i ~ exponential(λ) independently for i = 1, 2, ..., k, then Σ_{i=1}^k X_i ~ gamma(k, λ).
Gamma case: If X ~ gamma(α, λ) and Y ~ gamma(β, λ), then X + Y ~ gamma(α + β, λ).
Chi-square case: If X ~ χ²_m and Y ~ χ²_n, then X + Y ~ χ²_{m+n}.
Normal case: If X ~ N(μX, σX²) and Y ~ N(μY, σY²), then X + Y ~ N(μX + μY, σX² + σY²).



Lecture 10: Central Limit Theorem
References: CT3 Unit 7.
1 The central limit theorem. For X₁, X₂, ..., X_n iid with common mean μ and common variance σ² (< ∞), (X̄ − μ)/(σ/√n) ≈ N(0, 1) for large n.
2 Normal approximations.
2.1 Binomial distribution. If X ~ Bin(n, θ) and n is large, X ≈ N(nθ, nθ(1 − θ)).
2.2 Poisson distribution. If X_i ~ Poisson(λ) independently, from lecture 8, Σ_{i=1}^n X_i ~ Poisson(nλ). For large n, this is approximately N(nλ, nλ).
2.3 Gamma distribution. From lecture 8, if X_i ~ exponential(λ) independently, Σ_{i=1}^n X_i ~ gamma(n, λ). For large n, this is approximately N(n/λ, n/λ²).
Since χ²_k ≡ gamma(½k, ½), χ²_k ≈ N(k, 2k) for large k.
3 The continuity correction. If X ~ Poisson(μ) and μ is large, X ≈ N(μ, σ² = μ).
P{X ≤ x} ≈ P{Z ≤ (x + ½ − μ)/σ}, where Z ~ N(0, 1).
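
A quick R check of the approximations (hypothetical parameter values):

  lam <- 20; x <- 24
  ppois(x, lambda = lam)                                # exact P{X <= 24}, X ~ Poisson(20)
  pnorm((x + 0.5 - lam) / sqrt(lam))                    # N(mu, mu) approximation with continuity correction
  n <- 50; th <- 0.3; y <- 18
  pbinom(y, size = n, prob = th)                        # exact P{X <= 18}, X ~ Bin(50, 0.3)
  pnorm((y + 0.5 - n * th) / sqrt(n * th * (1 - th)))   # normal approximation with continuity correction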



Lecture 11: Conditional Expectation
References: CT3 Unit 14.
1 The conditional expectation E[Y|X = x].
2 The random variable E[Y|X]. If g(x) = E[Y|X = x], then consider this as the observed value of a random variable g(X).
E[E[Y|X]] = E[Y].
3 The random variable V[Y|X] and the E[V] + V[E] result.
V[Y|X = x] = E[{Y − g(x)}²|X = x] = E[Y²|X = x] − g(x)².
V[Y|X] = E[{Y − g(X)}²|X] = E[Y²|X] − g(X)².
E[V[Y|X]] = E[E[Y²|X]] − E[g(X)²] = E[Y²] − E[g(X)²] so E[Y²] = E[V[Y|X]] + E[g(X)²].
V[Y] = E[Y²] − {E[Y]}² = E[V[Y|X]] + E[g(X)²] − {E[g(X)]}² = E[V[Y|X]] + V[g(X)] so that
V[Y] = E[V[Y|X]] + V[E[Y|X]].
5 Compound distributions. S = X₁ + X₂ + ··· + X_N.
E[S|N = n] = nμX, V[S|N = n] = nσX².
E[S] = E[E[S|N]] = E[NμX] = E[N]μX.
V[S] = E[V[S|N]] + V[E[S|N]] = E[NσX²] + V[NμX] = μN σX² + σN² μX².
Mgf of S: mS(t) = E[e^(tS)] = E[E[e^(tS)|N]].
E[e^(tS)|N = n] = {mX(t)}^n.
mS(t) = E[{mX(t)}^N] = GN(mX(t)) in terms of the pgf of N.
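
A simulation sketch of a compound distribution (hypothetical choices: N ~ Poisson(5), X_i ~ N(10, 3²)) checking the mean and variance formulae:

  set.seed(1)
  lamN <- 5; muX <- 10; sigX <- 3
  S <- replicate(1e5, sum(rnorm(rpois(1, lamN), mean = muX, sd = sigX)))
  c(mean(S), lamN * muX)                     # E[S] = E[N] mu_X
  c(var(S), lamN * sigX^2 + lamN * muX^2)    # V[S] = mu_N sig_X^2 + sig_N^2 mu_X^2 (both equal lamN here)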



Lecture 12: Poisson process and simulating random variables
References: CT3 Unit 4.
4 The Poisson process. Point-like events occur randomly and independently in time at an average rate λ per unit time. Let N(t) be the number of events in interval [0, t] with N(0) = 0.
Let p_n(t) = P{N(t) = n}.
p_n(t + h) ≈ p_{n−1}(t)[λh] + p_n(t)[1 − λh] so p_n(t + h) − p_n(t) ≈ λh[p_{n−1}(t) − p_n(t)].
As h → 0, p′_n(t) = λ[p_{n−1}(t) − p_n(t)].
Similarly p′₀(t) = −λp₀(t).
Define G(s, t) = Σ_{n=0}^∞ s^n p_n(t).
Thus ∂G(s, t)/∂t = λsG(s, t) − λG(s, t) so log G(s, t) = λt(s − 1) as G(s, 0) = 1.
Thus G(s, t) = exp{λt(s − 1)} so N(t) ~ Poisson(λt).
Time T₁ to first event satisfies T₁ ~ exponential(λ).
Time between events has the same exponential distribution.
5 Random number simulation.
5.1 Basic simulation method. Generate U ~ uniform(0, 1). Then use the inverse transformation method.
5.2 Continuous distributions. If F(x) = P{X ≤ x}, let x = F⁻¹(u). X ~ exponential(λ) example.
5.3 Discrete distributions.
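
A sketch of the inverse transformation method for the exponential(λ) case (hypothetical λ):

  set.seed(1)
  lam <- 0.5
  u <- runif(1e5)               # U ~ uniform(0, 1)
  x <- -log(1 - u) / lam        # x = F^{-1}(u) since F(x) = 1 - exp(-lam * x)
  c(mean(x), 1 / lam)           # sample mean vs theoretical mean 1/lambda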



Lecture 13: Sampling and Statistical Inference I
References: CT3 Unit 8.
1 Basic definitions.
2 Moments of the sample mean and variance.
2.1 The sample mean. X̄ = (1/n) Σ_{i=1}^n X_i. E[X̄] = μ, V[X̄] = σ²/n.
2.2 The sample variance. S² = (1/(n − 1)) Σ_{i=1}^n (X_i − X̄)² = (1/(n − 1)) [Σ_{i=1}^n X_i² − nX̄²]. E[S²] = σ².
3 Sampling distributions for the normal.
3.1 The sample mean. If X_i ~ N(μ, σ²) independently, then Z = (X̄ − μ)/(σ/√n) ~ N(0, 1) for all n. In general Z = (X̄ − μ)/(σ/√n) ≈ N(0, 1) for large n.
3.2 The sample variance. If X_i ~ N(μ, σ²) independently, then (n − 1)S²/σ² ~ χ²_{n−1}.
For U ~ χ²_k, tabulated values χ²_k(α) satisfy P{U > χ²_k(α)} = α.
3.3 Independence of the sample mean and variance. X_i ~ N(μ, σ²) case.
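
A simulation sketch (hypothetical μ, σ, n) checking the sample-mean and sample-variance results:

  set.seed(1)
  n <- 10; mu <- 5; sig <- 2
  xbar <- replicate(1e4, mean(rnorm(n, mu, sig)))
  s2   <- replicate(1e4, var(rnorm(n, mu, sig)))
  c(var(xbar), sig^2 / n)                              # V[Xbar] = sigma^2 / n
  c(mean(s2), sig^2)                                   # E[S^2] = sigma^2
  mean((n - 1) * s2 / sig^2 > qchisq(0.95, n - 1))     # should be close to 0.05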



Lecture 14: Sampling and Statistical Inference II
References: CT3 Unit 8.
4 The t distribution. t_k = N(0, 1)/√(χ²_k/k) for independent N(0, 1) and χ²_k distributions.
If X_i ~ N(μ, σ²) independently, then t = (X̄ − μ)/(S/√n) ~ t_{n−1}.
Properties of t_k distribution. Tabulated values t_k(α) satisfy P{T > t_k(α)} = α for T ~ t_k. Also P{T < −t_k(α)} = α.
5 The F result for variance ratios.
If U ~ χ²_{ν₁} and V ~ χ²_{ν₂} are independent, then F = (U/ν₁)/(V/ν₂) ~ F_{ν₁,ν₂}.
If S₁² and S₂² are based on samples of size n₁ and n₂ respectively from normal populations with variances σ₁² and σ₂² respectively, then F = (S₁²/σ₁²)/(S₂²/σ₂²) ~ F_{n₁−1,n₂−1}.
Tabulated values F_{ν₁,ν₂}(α) satisfy P{F > F_{ν₁,ν₂}(α)} = α for F ~ F_{ν₁,ν₂}. Also P{F < 1/F_{ν₂,ν₁}(α)} = α.


Lecture 15: Point Estimation I
References: CT3 Unit 9.
1 The method of moments.
1.1 The one-parameter case.
1.2 The two-parameter case.
3 Unbiasedness. Let g(X) be an estimator of a parameter θ.
Bias is Bias(g(X)) = E[g(X)] − θ.
Unbiasedness.
4 Mean square error. Let g(X) be an estimator of a parameter θ. MSE(g(X)) = E[(g(X) − θ)²].
MSE(g(X)) = V[g(X)] + Bias(g(X))².



Lecture 16: Point Estimation II
References: CT3 Unit 9.
2 The method of maximum likelihood.
2.1 The one-parameter case. Likelihood L(θ) = Π_{i=1}^n f(x_i; θ).
2.1.1 Example.
2.2 The two-parameter case.
5 Asymptotic distribution of MLE.
For large n, θ̂ ≈ N(θ, v) where v = 1 / ( n E[ (∂ log f(X; θ)/∂θ)² ] ).
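
A sketch of numerical maximum likelihood for an exponential(λ) sample (hypothetical data), compared with the analytic MLE and its asymptotic variance:

  set.seed(1)
  x <- rexp(200, rate = 2)                                  # hypothetical sample
  loglik <- function(lam) sum(dexp(x, rate = lam, log = TRUE))
  optimize(loglik, interval = c(0.01, 10), maximum = TRUE)$maximum
  1 / mean(x)                                               # analytic MLE
  (1 / mean(x))^2 / length(x)                               # estimated asymptotic variance v = lambda^2 / n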



Lecture 17: Confidence Intervals I
References: CT3 Unit 10.
1 Confidence intervals in general. Want values θ̂₁ and θ̂₂ such that P{θ̂₁ < θ < θ̂₂} = 0.95.
2 Distribution of confidence intervals.
2.1 The pivotal method. Look for a pivotal quantity g(X, θ) such that, for example, if g(X, θ) increases as θ increases, then g(X, θ) < g₂ ⟺ θ < θ₂ and g₁ < g(X, θ) ⟺ θ₁ < θ.
For example, if X_i ~ N(μ, σ²) independently, then g(X, μ) = (X̄ − μ)/(σ/√n).
2.2 Confidence limits. The interval (X̄ − 1.96σ/√n, X̄ + 1.96σ/√n) can be written as X̄ ± 1.96σ/√n.
2.3 Sample size.
3 Confidence intervals for the normal distribution.
3.1 The mean. Recall t = (X̄ − μ)/(S/√n) ~ t_{n−1}, so a 95% confidence interval for μ is X̄ ± t_{n−1}(2.5%) S/√n.
3.2 The variance. Recall (n − 1)S²/σ² ~ χ²_{n−1}, so a 95% confidence interval for σ² is (n − 1)S²/χ²_{n−1}(2.5%) < σ² < (n − 1)S²/χ²_{n−1}(97.5%).
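
An R sketch (hypothetical data) of the 95% intervals for μ and σ²:

  set.seed(1)
  x <- rnorm(20, mean = 10, sd = 3)
  n <- length(x)
  mean(x) + c(-1, 1) * qt(0.975, n - 1) * sd(x) / sqrt(n)   # 95% CI for mu
  t.test(x)$conf.int                                        # same interval from t.test
  (n - 1) * var(x) / qchisq(c(0.975, 0.025), n - 1)         # 95% CI for sigma^2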



Lecture 18: Confidence Intervals II
References: CT3 Unit 10.
4 Confidence intervals for binomial and Poisson.
4.1 The binomial. Course text a bit muddled here I think! If P{h₁(θ) < X < h₂(θ)} ≥ 0.95, then P{X ≥ h₂(θ)} ≤ 0.025 and P{X ≤ h₁(θ)} ≤ 0.025. Now X < h₂(θ) if θ > θ₂(X) and similarly X > h₁(θ) if θ < θ₁(X). Thus interval is θ₂ < θ < θ₁, where, for example, P{X ≥ x|θ = θ₂} = 0.025.
4.1.1 The normal approximation.
5 Confidence intervals for two sample problems.
5.1 Two normal means.
Confidence interval with known variances based on fact that X̄₁ − X̄₂ ~ N(μ₁ − μ₂, σ₁²/n₁ + σ₂²/n₂) so
(X̄₁ − X̄₂ − (μ₁ − μ₂)) / √(σ₁²/n₁ + σ₂²/n₂) ~ N(0, 1).
If σ₁² = σ₂² = σ² is unknown, σ² can be estimated using s_p² = [(n₁ − 1)s₁² + (n₂ − 1)s₂²]/(n₁ + n₂ − 2) and then
(X̄₁ − X̄₂ − (μ₁ − μ₂)) / (s_p √(1/n₁ + 1/n₂)) ~ t_{n₁+n₂−2}.
5.2 Two population variances.
Recall (S₁²/σ₁²)/(S₂²/σ₂²) ~ F_{n₁−1,n₂−1}, so (S₁²/S₂²)/(σ₁²/σ₂²) ~ F_{n₁−1,n₂−1}.
Also, if F ~ F_{ν₁,ν₂} with P{F > F_{ν₁,ν₂}(α)} = α, then
P{F_{n₁−1,n₂−1}(0.975) < (S₁²/S₂²)/(σ₁²/σ₂²) < F_{n₁−1,n₂−1}(0.025)} = 0.95 re-arranges to give
(S₁²/S₂²)(1/F_{n₁−1,n₂−1}(0.025)) < σ₁²/σ₂² < (S₁²/S₂²)(1/F_{n₁−1,n₂−1}(0.975)), where 1/F_{n₁−1,n₂−1}(0.975) = F_{n₂−1,n₁−1}(0.025).
5.3 Two population proportions.
X₁ ~ Bin(n₁, θ₁) ≈ N(n₁θ₁, n₁θ₁(1 − θ₁)) and X₂ ~ Bin(n₂, θ₂) ≈ N(n₂θ₂, n₂θ₂(1 − θ₂)).
Thus θ̂_i = X_i/n_i ≈ N(θ_i, θ_i(1 − θ_i)/n_i) for i = 1, 2 and so θ̂₁ − θ̂₂ ≈ N(θ₁ − θ₂, θ₁(1 − θ₁)/n₁ + θ₂(1 − θ₂)/n₂).
In practice we assume θ̂₁ − θ̂₂ ≈ N(θ₁ − θ₂, θ̂₁(1 − θ̂₁)/n₁ + θ̂₂(1 − θ̂₂)/n₂) so can obtain confidence interval for θ₁ − θ₂ by assuming the variance is known.
6 Paired data. (NOT needed for the exam.) Form pairs D_i = X₁ᵢ − X₂ᵢ. Then (D̄ − μ_D)/(S_D/√n) ~ t_{n−1}.



Lecture 19: Hypothesis Testing I
References: CT3 Unit 11.
1 Hypotheses, test statistics, decisions and errors.
Null hypothesis H₀. Alternative hypothesis H₁. Critical region.
α = P{Type I error} = P{Reject H₀ when H₀ true}.
β = P{Type II error} = P{Accept H₀ when H₀ false}.
Power = 1 − β = P{Reject H₀ when parameter is θ}.
2 Classical testing, significance and P-values.
2.1 Best tests. Neyman-Pearson Lemma: for H₀: θ = θ₀ vs. H₁: θ = θ₁ the best test is based on the likelihood ratio, with critical region C satisfying L₀/L₁ ≤ k.
2.2 P-values. P = P{A value occurs as or more extreme than the one observed | H₀ true}.
3 Basic tests: single parameter.
3.1 Testing the value of a population mean. Testing H₀: μ = μ₀.
Test based on Z = (X̄ − μ₀)/(σ/√n) ~ N(0, 1) or T = (X̄ − μ₀)/(S/√n) ~ t_{n−1} if H₀ true.
3.2 Testing the value of a population variance. Testing H₀: σ² = σ₀².
Test based on (n − 1)S²/σ₀² ~ χ²_{n−1} if H₀ true.
3.3 Testing the value of a population proportion. Testing H₀: θ = θ₀.
Test based on X ~ Bin(n, θ₀) ≈ N(nθ₀, nθ₀(1 − θ₀)) if H₀ true.
4 Basic tests: two independent samples.
4.1 Testing the value of the difference between two population means. Testing H₀: μ₁ − μ₂ = δ.
Test based on Z = (X̄₁ − X̄₂ − δ)/√(σ₁²/n₁ + σ₂²/n₂) ~ N(0, 1) or T = (X̄₁ − X̄₂ − δ)/(S_p√(1/n₁ + 1/n₂)) ~ t_{n₁+n₂−2} if H₀ true.
4.2 Testing the value of the ratio of two population variances. Testing H₀: σ₁² = σ₂².
Test based on S₁²/S₂² ~ F_{n₁−1,n₂−1} if H₀ true.
4.3 Testing the value of the difference between two population proportions. Testing H₀: θ₁ = θ₂ (= θ).
Test based on (θ̂₁ − θ̂₂)/√(θ̂(1 − θ̂)/n₁ + θ̂(1 − θ̂)/n₂) ≈ N(0, 1) if H₀ true.
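
R's built-in tests cover most of these; a sketch with hypothetical samples:

  set.seed(1)
  x <- rnorm(15, mean = 12, sd = 2); y <- rnorm(12, mean = 10, sd = 2)
  t.test(x, mu = 11)                      # one-sample t-test of H0: mu = 11
  t.test(x, y, var.equal = TRUE)          # pooled two-sample t-test of H0: mu1 = mu2
  var.test(x, y)                          # F-test of H0: sigma1^2 = sigma2^2
  prop.test(c(30, 18), c(100, 90))        # approximate test of H0: theta1 = theta2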


Lecture 20: Hypothesis Testing II
References: CT3 Unit 11.
7 χ² tests.
Test statistic is Σ_i (f_i − e_i)²/e_i where e_i are expected values under H₀ and f_i are observed values.
(I prefer the notation Σ_i (O_i − E_i)²/E_i !)
7.1 Goodness of fit.
7.1.1 Degrees of freedom.
Number of groups − Number of constraints on e_i − Number of fitted parameters.
7.1.2 The accuracy of the χ² approximation.
Ensure all e_i > 5 by combining groups (cells).
7.1.3 Example.
7.2 Contingency tables.
For an r × c table, degrees of freedom is (r − 1)(c − 1).
Expected frequencies for any cell are (Row total × Column total)/(Grand Total).
Tests of homogeneity and independence not clearly distinguished.
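
A sketch of both χ² tests in R (hypothetical counts):

  obs <- c(18, 55, 27)
  chisq.test(obs, p = c(0.25, 0.5, 0.25))    # goodness of fit against specified probabilities
  tab <- matrix(c(20, 30, 25, 25, 15, 35), nrow = 2, byrow = TRUE)   # hypothetical 2 x 3 contingency table
  chisq.test(tab)                            # df = (r - 1)(c - 1) = 2
  chisq.test(tab)$expected                   # row total x column total / grand total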



Lecture 21: Correlation and Regression I
References: CT3 Unit 12; IEF chapter 2, pages 27-38, 44-51.
0 (CT3) Introduction. Scatter plot and summary statistics Sxx, Syy, Sxy.
1 (CT3) Correlation analysis.
1.1 (CT3) Data summary. Sample correlation r = Sxy/√(Sxx Syy).
1.2 (CT3) The normal model and inference.
If ρ = 0, r√(n − 2)/√(1 − r²) ~ t_{n−2}.
If W = ½ log[(1 + r)/(1 − r)], then W ≈ N(½ log[(1 + ρ)/(1 − ρ)], 1/(n − 3)).
Can re-write this as W = tanh⁻¹ r, so that W ≈ N(tanh⁻¹ ρ, 1/(n − 3)).
2 (CT3) Regression analysis: the simple linear model.
2.1 (CT3) Introduction. Y_i = α + βx_i + e_i, i = 1, 2, ..., n. E[e_i] = 0, V[e_i] = σ².
2.2 (CT3) Fitting the model. Least squares derivation: minimise q = Σ_{i=1}^n e_i² = Σ_{i=1}^n (y_i − α − βx_i)².
Fitted line is ŷ = α̂ + β̂x where α̂ = ȳ − β̂x̄ and β̂ = Sxy/Sxx.
E[β̂] = β, V[β̂] = σ²/Sxx. σ̂² = (1/(n − 2)) Σ_{i=1}^n (y_i − ŷ_i)².
2.3 (CT3) Partitioning the variability of the responses.
Σ_i (y_i − ȳ)² = Σ_i (y_i − ŷ_i)² + Σ_i (ŷ_i − ȳ)², i.e., SSTOT = SSRES + SSREG.
SSTOT = Syy. SSREG = Sxy²/Sxx.
E[SSTOT] = (n − 1)σ² + β²Sxx, E[SSREG] = σ² + β²Sxx, E[SSRES] = (n − 2)σ².
Coefficient of determination R² = SSREG/SSTOT.
Cases where line closely fits the data and where line a poor fit to the data.
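
An lm() sketch (hypothetical data) matching the quantities above:

  set.seed(1)
  x <- 1:20
  y <- 3 + 0.8 * x + rnorm(20, sd = 2)       # hypothetical data from the simple linear model
  fit <- lm(y ~ x)
  coef(fit)                                  # alpha-hat and beta-hat
  summary(fit)$r.squared                     # R^2 = SSREG / SSTOT
  anova(fit)                                 # SSREG and SSRES
  plot(x, residuals(fit))                    # residual plot (see lecture 22, 2.7)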



Lecture 22: Correlation and Regression II
References: CT3 Unit 12; IEF chapter 2, pages 38-39, 51-66.
2.4 (CT3) The full normal model and inference.
Assumptions. β̂ ~ N(β, σ²/Sxx), independently of (n − 2)σ̂²/σ² ~ χ²_{n−2}.
2.5 (CT3) Inferences on the slope parameter β.
(β̂ − β)/√(σ²/Sxx) divided by √(((n − 2)σ̂²/σ²)/(n − 2)) gives (β̂ − β)/√(σ̂²/Sxx) ~ t_{n−2}.
2.6 (CT3) Estimating a mean response and predicting an individual response.
μ₀ = E[Y|x₀] = α + βx₀ estimated by μ̂₀ = α̂ + β̂x₀. E[μ̂₀] = μ₀. V[μ̂₀] = [1/n + (x₀ − x̄)²/Sxx]σ².
(μ̂₀ − μ₀)/√(V̂[μ̂₀]) ~ t_{n−2}.
Estimate individual response y₀ = α + βx₀ + e₀ by ŷ₀ = α̂ + β̂x₀.
Since E[ŷ₀ − y₀] = 0 and V[ŷ₀ − y₀] = σ² + V[μ̂₀], (ŷ₀ − y₀)/√(σ̂²[1 + 1/n + (x₀ − x̄)²/Sxx]) ~ t_{n−2}.
2.7 (CT3) Checking the model. Residuals ê_i = y_i − ŷ_i. Residual plot ê_i vs. x_i.
2.8 (CT3) Extending the scope of the linear model. (NOT needed for the exam.)
Transformations to give linearity.
3 (CT3) The multiple linear regression model. (NOT needed for the exam.)
Y_i = α + β₁x_i1 + ··· + β_k x_ik + e_i, i = 1, 2, ..., n.



Lecture 23: Analysis of Variance
References: CT3 Unit 13.
0 (CT3) Introduction.
1 (CT3) One-way analysis of variance.
1.1 (CT3) The model. Y_ij = μ + τ_i + e_ij for i = 1, 2, ..., k, j = 1, 2, ..., n_i, n = Σ_{i=1}^k n_i.
e_ij ~ N(0, σ²) independently, so Y_ij ~ N(μ + τ_i, σ²).
If Σ_{i=1}^k n_i τ_i = 0, then τ_i is the ith treatment effect, and μ is the overall mean.
1.2 (CT3) Estimation of the parameters. Minimise q = Σ_{i=1}^k Σ_{j=1}^{n_i} (Y_ij − μ − τ_i)² to give
μ̂ = Ȳ, τ̂_i = Ȳ_i − Ȳ. If S_i² = (1/(n_i − 1)) Σ_{j=1}^{n_i} (Y_ij − Ȳ_i)², then (n_i − 1)S_i²/σ² ~ χ²_{n_i−1} independently.
Hence σ̂² = (1/(n − k)) Σ_{i=1}^k (n_i − 1)S_i² satisfies (n − k)σ̂²/σ² ~ χ²_{n−k}.
1.3 (CT3) Partitioning the variability.
Σ_{i=1}^k Σ_{j=1}^{n_i} (Y_ij − Ȳ)² = Σ_{i=1}^k Σ_{j=1}^{n_i} (Y_ij − Ȳ_i)² + Σ_{i=1}^k n_i (Ȳ_i − Ȳ)²,
i.e., SST = SSR + SSB where SST is total sum of squares, SSR is residual sum of squares (within-treatments sum of squares), SSB is between-treatments sum of squares.
If H₀: τ₁ = τ₂ = ··· = τ_k = 0 is true, then MSB/MSR ~ F_{k−1,n−k}, where MSB = SSB/(k − 1) is the mean square between treatments and MSR = SSR/(n − k) is the residual mean square.
1.4 (CT3) Example.
1.5 (CT3) Checking the model. Residuals are r_ij = ê_ij = Y_ij − μ̂ − τ̂_i = Y_ij − Ȳ_i.
1.6 (CT3) Estimating the treatment means.
95% confidence interval for μ + τ_i is Ȳ_i ± t_{n−k}(2.5%) σ̂/√n_i.
95% confidence interval for τ_i − τ_j is Ȳ_i − Ȳ_j ± t_{n−k}(2.5%) σ̂ √(1/n_i + 1/n_j).
1.7 (CT3) Further comments. Linear regression model Y_i = a + bx_i + e_i, i = 1, 2, ..., n can be analysed as
Σ_{i=1}^n (Y_i − Ȳ)² = Σ_{i=1}^n (Y_i − Ŷ_i)² + Σ_{i=1}^n (Ŷ_i − Ȳ)², i.e., SST = SSR + SSREG.
If H₀: b = 0 is true, SSREG/(SSR/(n − 2)) ~ F_{1,n−2}.
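
A one-way ANOVA sketch in R (hypothetical data, k = 3 treatments with 5 observations each):

  set.seed(1)
  yield <- c(rnorm(5, 20, 2), rnorm(5, 23, 2), rnorm(5, 19, 2))
  treatment <- factor(rep(c("A", "B", "C"), each = 5))
  fit <- aov(yield ~ treatment)
  summary(fit)                        # SSB, SSR and the F statistic MSB/MSR on (k-1, n-k) df
  model.tables(fit, "means")          # treatment means
  plot(fitted(fit), residuals(fit))   # model check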



Lecture 24: Univariate Time Series Analysis and Forecasting I
References: IEF chapter 5, pages 206-223.
5.1 Introduction.
5.2 Some notation and concepts.
5.2.2 A weakly stationary process. E[Y_t] = μ, V[Y_t] = σ², cov(Y_{t₁}, Y_{t₂}) = γ_{t₁−t₂}.
Autocovariance function is cov(Y_t, Y_{t−s}) = γ_s. Autocorrelation function is τ_s = γ_s/γ₀.
5.2.3 A white noise process. E[Y_t] = μ, V[Y_t] = σ², γ_s = 0 for s ≠ 0.
If Y_t ~ N(μ, σ²) independently for t = 1, 2, ..., T, then τ̂_s ≈ N(0, 1/T).
Box-Pierce test that τ₁ = ··· = τ_m = 0; if H₀ true, Q = T Σ_{k=1}^m τ̂_k² ~ χ²_m.
5.3 Moving average processes. MA(q) is Y_t = μ + U_t + θ₁U_{t−1} + ··· + θ_qU_{t−q} with U_t ~ (0, σ²) independently.
Backshift operator L (most textbooks would use B!) satisfies LY_t = Y_{t−1}.
Thus Y_t = μ + θ(L)U_t with θ(L) = 1 + θ₁L + ··· + θ_qL^q.
V[Y_t] = γ₀ = (1 + θ₁² + ··· + θ_q²)σ², γ_s = (θ_s + θ_{s+1}θ₁ + ··· + θ_qθ_{q−s})σ² for s = 1, 2, ..., q.
Example 5.2.
5.4 Autoregressive processes. AR(p) is Y_t = μ + φ₁Y_{t−1} + φ₂Y_{t−2} + ··· + φ_pY_{t−p} + U_t.
Can write as φ(L)Y_t = μ + U_t with φ(L) = 1 − φ₁L − ··· − φ_pL^p.
5.4.1 The stationarity condition. AR(p) process stationary if roots of φ(z) = 0 lie outside the unit circle. Can then write AR(p) process as MA(∞): Y_t = φ⁻¹(L)U_t.
Example 5.3.
5.4.2 Wold's decomposition theorem. (NOT needed for the exam.)
All we really need here is that (1 − φ₁ − φ₂ − ··· − φ_p)E[Y_t] = μ and the autocorrelation function satisfies the Yule-Walker equations τ_r = φ₁τ_{r−1} + φ₂τ_{r−2} + ··· + φ_pτ_{r−p} for r = 1, 2, ..., p with τ_{−s} = τ_s.
Example 5.4.
5.5 The partial autocorrelation function. The pacf φ_{kk} can be found from fitting the model Y_t = μ + φ_{k,1}Y_{t−1} + ··· + φ_{k,k−1}Y_{t−k+1} + φ_{kk}Y_{t−k} + U_t.
5.5.1 The invertibility condition. MA(q) process is invertible if roots of θ(z) = 0 lie outside the unit circle. The process Y_t = θ(L)U_t can then be written as an AR(∞) process θ⁻¹(L)Y_t = U_t.



Lecture 25: Univariate Time Series Analysis and Forecasting II
References: IEF chapter 5, pages 223-238, 247-251.
5.6 ARMA processes. ARMA(p, q) process is φ(L)Y_t = μ + θ(L)U_t.
Mean satisfies (1 − φ₁ − φ₂ − ··· − φ_p)E[Y_t] = μ.
AR(p) process: acf (oscillatory) geometric decay, pacf zero after lag p.
MA(q) process: acf zero after lag q, pacf (oscillatory) geometric decay.
ARMA(p, q) process: acf like AR, pacf like MA.
5.6.1 Sample acf and pacf plots for standard processes. Correlogram (acf plot) has lines drawn at ±1.96/√n to indicate significant τ̂_k; pacf plot also has lines drawn at ±1.96/√n.
5.7 Building ARMA models: the Box-Jenkins approach. Determine model order; estimate parameters; check model validity. Parsimonious models best!
5.7.1 Information criteria for ARIMA model selection. AIC widely used (with SBIC).
5.7.3 ARIMA modelling. Data differenced to give stationarity.
5.8 Constructing ARMA models in EViews. (NOT needed for the exam.) We use R.
5.11.4 Forecasting with time series versus structural models. (NOT needed for the exam.) Conditional expectation is E[Y_{t+1}|Ω_t] = E[Y_{t+1}|Y₁, Y₂, ..., Y_t].
5.11.5 Forecasting with ARMA models. (NOT needed for the exam.) ARMA(p, q) model Y_t = Σ_{i=1}^p a_iY_{t−i} + Σ_{j=1}^q b_jU_{t−j}. Forecast at time t + s is Ŷ_{t+s} = Σ_{i=1}^p a_iŶ_{t+s−i} + Σ_{j=1}^q b_jÛ_{t+s−j}, where Ŷ_k = Y_k for k ≤ t, Û_k = U_k for k ≤ t, and Û_k = 0 for k > t. IEF uses notation f_{t,s} ≡ Ŷ_{t+s}.
5.11.6 Forecasting the future value of an MA(q) process. (NOT needed for the exam.)
MA(3) process Y_t = μ + U_t + θ₁U_{t−1} + θ₂U_{t−2} + θ₃U_{t−3}.
e.g., Y_{t+2} = μ + U_{t+2} + θ₁U_{t+1} + θ₂U_t + θ₃U_{t−1}. Thus E[Y_{t+2}|Ω_t] = μ + θ₂U_t + θ₃U_{t−1}.
5.11.7 Forecasting the future value of an AR(p) process. (NOT needed for the exam.)
AR(2) process Y_t = μ + φ₁Y_{t−1} + φ₂Y_{t−2} + U_t.
e.g., Y_{t+2} = μ + φ₁Y_{t+1} + φ₂Y_t + U_{t+2}. Thus E[Y_{t+2}|Ω_t] = μ + φ₁Ŷ_{t+1} + φ₂Y_t.
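
A Box-Jenkins sketch in base R on a simulated ARMA(1,1) series (hypothetical parameters):

  set.seed(1)
  y <- arima.sim(model = list(ar = 0.7, ma = 0.4), n = 200)
  acf(y); pacf(y)                           # correlogram and pacf plot with +/- 1.96/sqrt(n) bands
  fit <- arima(y, order = c(1, 0, 1))       # fit ARMA(1,1); order = (p, d, q)
  fit
  Box.test(residuals(fit), lag = 10, type = "Ljung-Box")   # portmanteau check of the residuals
  predict(fit, n.ahead = 5)$pred            # forecasts for t+1, ..., t+5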



Lecture 26: Multivariate Models I
References: IEF chapter 3, pages 88-93; chapter 6, pages 265-271, 276-277.
3.1 Generalising the simple model to multiple linear regression.
3.2 The constant term. Writing the linear model in the form y = Xβ + u.
3.3 How are the parameters (the elements of the β vector) calculated?
β̂ = (X′X)⁻¹X′y, found by minimising (y − Xβ)′(y − Xβ).
6.1 Motivations. Structural equations. Reduced form equations.
6.2 Simultaneous equations bias.
6.3 So how can simultaneous equations models be validly estimated?
6.4 Can the original coefficients be retrieved from the πs?
6.4.1 What determines whether an equation is identified or not?
6.8 Estimation procedures for simultaneous equations systems.
6.8.1 Indirect least squares (ILS). (NOT needed for the exam.)
6.8.2 Estimation of just identified and overidentified systems using 2SLS. (NOT needed for the exam.)
Using the R systemfit command.
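
A sketch of the matrix formula (hypothetical data), checked against lm():

  set.seed(1)
  n <- 50
  x1 <- rnorm(n); x2 <- rnorm(n)
  y <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n)
  X <- cbind(1, x1, x2)                    # design matrix including the constant term
  solve(t(X) %*% X) %*% t(X) %*% y         # beta-hat = (X'X)^{-1} X'y
  coef(lm(y ~ x1 + x2))                    # same estimates from lm()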



Lecture 27: Multivariate Models II
References: IEF chapter 6, pages 290-293, 294-296, 298, 308-315.
6.11 Vector autoregressive models.
6.11.1 Advantages of VAR modelling.
6.11.2 Problems with VARs.
6.11.3 Choosing the optimal lag length for a VAR.
6.11.5 Information criteria for VAR lag length selection.
6.12 Does the VAR include contemporaneous terms?
6.14 VARs with exogenous variables.
6.17 VAR estimation in EViews. (NOT needed for the exam.)
We use the R package vars.
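
A small sketch using the vars package (assumed installed) on a simulated two-variable system; the data and lag choice are hypothetical:

  set.seed(1)
  y1 <- arima.sim(list(ar = 0.5), n = 200)
  y2 <- 0.3 * y1 + arima.sim(list(ar = 0.4), n = 200)
  y <- cbind(y1, y2)
  vars::VARselect(y, lag.max = 8, type = "const")$selection   # information criteria for lag length
  fit <- vars::VAR(y, p = 1, type = "const")
  summary(fit)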



Lecture 28: Cointegration
References: IEF chapter 7, pages 318-329, 335-341.
7.1 Stationarity and unit root testing.
7.1.1 Why are tests for non-stationarity necessary?
7.1.2 Two types of non-stationarity.
7.1.3 Some more definitions and terminology.
7.1.4 Testing for a unit root.
7.3 Cointegration.
7.3.1 Definition of cointegration.
7.4 Equilibrium correction or error correction models.
7.5 Testing for cointegration in regression: a residuals-based approach.
7.6 Methods of parameter estimation in cointegrated systems. (NOT needed for the
exam.)
7.6.1 The Engle-Granger 2-step method. (NOT needed for the exam.)
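
One option for unit root and cointegration testing in R is the tseries package (an assumption; it is not prescribed by the course text). A sketch with simulated I(1) series:

  set.seed(1)
  x <- cumsum(rnorm(300))                 # random walk, I(1)
  y <- 2 * x + rnorm(300)                 # cointegrated with x: the residual is stationary
  tseries::adf.test(x)                    # augmented Dickey-Fuller test; H0: unit root
  res <- residuals(lm(y ~ x))
  tseries::adf.test(res)                  # residuals-based (Engle-Granger style) check; strictly, different critical values apply
  tseries::po.test(cbind(y, x))           # Phillips-Ouliaris cointegration test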



Lecture 29: ARCH Models
References: IEF chapter 8, pages 379-381, 383-384, 385-389.
8.1 Motivations: an excursion into non-linearity land.
8.1.1 Types of non-linear models.
8.2 Models for volatility.
8.3 Historical volatility.
8.6 Autoregressive volatility models.
8.7 Autoregressive conditionally heteroscedastic (ARCH) models.
8.7.1 Another way of expressing ARCH models.
8.7.2 Non-negativity constraints.
8.7.3 Testing for ARCH effects.
8.7.5 Limitations of ARCH(q) models.



Lecture 30: GARCH Models
References: IEF chapter 8, pages 392-399.
8.8 Generalised ARCH (GARCH) models.
8.8.1 The unconditional variance under a GARCH specification.
8.9 Estimation of ARCH/GARCH models.
8.9.1 Parameter estimation using maximum likelihood. (NOT needed for the exam.)
8.9.2 Non-normality and maximum likelihood. (NOT needed for the exam.)
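
A GARCH(1,1) sketch: simulate the variance recursion directly, then fit with the tseries package (an assumption; any GARCH-capable package would do). Parameter values are hypothetical:

  set.seed(1)
  n <- 1000; omega <- 0.1; alpha <- 0.2; beta <- 0.7
  h <- numeric(n); r <- numeric(n)
  h[1] <- omega / (1 - alpha - beta)                       # unconditional variance
  for (t in 2:n) {
    h[t] <- omega + alpha * r[t - 1]^2 + beta * h[t - 1]   # conditional variance recursion
    r[t] <- sqrt(h[t]) * rnorm(1)
  }
  Box.test(r^2, lag = 12, type = "Ljung-Box")              # crude check for ARCH effects in squared returns
  summary(tseries::garch(r, order = c(1, 1)))              # fitted omega, alpha, beta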
