
Probability Theory and Statistics

Lecture 6
October 7, 2014
Robert Dahl Jacobsen
robert@math.aau.dk
Department of Mathematical Sciences
Aalborg University

Robert DJ | Probability Theory and Statistics

Agenda

Estimation
Two means
Likelihoods
Matlab


Statistics in a nutshell

Model:
  X_i ~ N(μ, σ²)

Estimation:
  μ̂ = x̄,   σ̂² = s²

Hypothesis test:
  H₀: μ = μ₀,   σ² = σ₀²

Estimation

  population: μ, σ²        sample: μ̂, σ̂²

Point estimate: Estimate of a population parameter (θ) from the sample (θ̂).
Estimator: The corresponding random variable (Θ̂).

  parameter | estimate | estimator
  ----------|----------|----------
  μ         | x̄        | X̄
  σ²        | s²       | S²

Unbiased estimate

Unbiased estimator:
  E(θ̂) = θ

Example:
  X_i ~ N(μ, σ²),  i = 1, ..., n

  X̄ = (1/n) Σ_{i=1}^n X_i ~ N(μ, σ²/n)

  S² = (1/(n−1)) Σ_{i=1}^n (X_i − X̄)² ~ (σ²/(n−1)) · χ²(n−1)

Then:
  X̄ and S² are independent
  E(X̄) = μ
  E(S²) = σ²

Confidence interval

Repeat 10 times:
- Draw 100 samples from N(0, 1).
- Compute the average x̄:
  0.015, 0.085, 0.036, 0.088, 0.0043,
  0.015, 0.015, 0.0067, 0.081, 0.14

Questions:
- Is it okay that all x̄ ≠ 0?
- How far from 0 can x̄ be before it is not okay?

Confidence interval: An interval that is pretty certain to contain μ.
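The sampling experiment above is easy to reproduce. A minimal Python/NumPy sketch (the seed is arbitrary; the deck itself uses Matlab):

```python
import numpy as np

# Repeat the slide's experiment: 10 times, draw 100 samples
# from N(0, 1) and record the sample average.
rng = np.random.default_rng(0)
averages = [rng.standard_normal(100).mean() for _ in range(10)]

# None of the averages is exactly 0, but each should be within a
# few standard errors (sigma/sqrt(n) = 1/10) of 0.
print([round(a, 4) for a in averages])
```

Running this gives ten small but nonzero averages, which is exactly the situation the confidence interval formalizes.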

Confidence interval for mean
Known variance

Sample:
  X_i ~ N(μ, σ²),  i = 1, ..., n

Notation:
  z_α = α fractile of N(0, 1)
  X̄ ~ N(μ, σ²/n)

(1 − α)·100% confidence interval for μ:

  x̄ + z_{α/2} σ/√n ≤ μ ≤ x̄ + z_{1−α/2} σ/√n

Shorthand:
  x̄ ± z_{1−α/2} σ/√n

Confidence interval: Interpretation

We are (1 − α)·100% confident that μ is in the CI.

20 samples with 100 observations from N(0, 2):

   1: x_{1,1}, x_{1,2}, ..., x_{1,100}    →  x̄₁
   2: x_{2,1}, x_{2,2}, ..., x_{2,100}    →  x̄₂
   ...
  20: x_{20,1}, x_{20,2}, ..., x_{20,100} →  x̄₂₀

95% confidence interval (one for each of the 20 samples):

  x̄ ± 1.96 · √2/10

Expect one confidence interval without 0:
  http://xkcd.com/882

Confidence intervals

[Figure: the 20 confidence intervals plotted against sample index 1–20; omitted]

Chocolate bars

In a sample of 20 chocolate bars the amount of calories has been measured. We have:
- the corresponding random variable is approx. normally distributed.
- the population standard deviation is 10 calories.
- the sample mean is 224 calories.

Calculate 90% and 95% confidence intervals for the mean. Which one is larger?

Solutions:
  90% confidence interval: 224 ± 1.64 · 10/√20 ≈ [220.3; 227.7].
  95% confidence interval: 224 ± 1.96 · 10/√20 ≈ [219.6; 228.4].
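The course software is Matlab, but the slide's numbers are easy to cross-check with an equivalent Python/SciPy sketch:

```python
import math
from scipy.stats import norm

# Chocolate-bar data from the slide: n = 20 bars, known
# standard deviation sigma = 10, sample mean 224.
n, sigma, xbar = 20, 10.0, 224.0

def ci_known_variance(xbar, sigma, n, alpha):
    """CI for the mean: xbar -/+ z_{1-alpha/2} * sigma / sqrt(n)."""
    z = norm.ppf(1 - alpha / 2)          # upper fractile of N(0, 1)
    half = z * sigma / math.sqrt(n)
    return xbar - half, xbar + half

lo90, hi90 = ci_known_variance(xbar, sigma, n, 0.10)
lo95, hi95 = ci_known_variance(xbar, sigma, n, 0.05)
print(round(lo90, 1), round(hi90, 1))   # slide: [220.3; 227.7]
print(round(lo95, 1), round(hi95, 1))   # slide: [219.6; 228.4]
```

The 95% interval is the larger one: a higher confidence level demands a wider interval.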

Confidence interval for mean
Unknown variance

Sample:
  X_i ~ N(μ, σ²),  i = 1, ..., n

Notation:
  t_α = α fractile of t(n − 1)

  s² = (1/(n−1)) Σ_{i=1}^n (x_i − x̄)²

(1 − α)·100% confidence interval for μ:

  x̄ + t_{α/2} s/√n ≤ μ ≤ x̄ + t_{1−α/2} s/√n

or

  x̄ ± t_{1−α/2} s/√n

Note: t_{α/2} < z_{α/2}, so the t interval is wider.

Normal or t distribution?

General form of confidence interval for mean:

  x̄ ± fractile · std/√n

Situation 1: Observations from N(μ, σ²) (unknown mean and variance)
  Estimate: mean = x̄, variance = s²
  Use:
  - fractile from t distribution
  - std = √s²

Situation 2: Observations from N(μ, σ²) (unknown mean)
  Estimate: mean = x̄
  Use:
  - fractile from normal distribution
  - std = √σ²

More chocolate bars

In a sample of 20 chocolate bars the amount of calories has been measured. We have:
- the corresponding random variable is approx. normally distributed.
- the sample standard deviation is 10 calories.
- the sample mean is 224 calories.

Calculate 90% and 95% confidence intervals for the mean.

Solutions:
  90% confidence interval: 224 ± 1.73 · 10/√20 ≈ [220.1; 227.9].
  95% confidence interval: 224 ± 2.1 · 10/√20 ≈ [219.3; 228.7].
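Again a Python/SciPy cross-check of the slide's numbers; the only change from the known-variance case is that the fractile comes from t(n − 1):

```python
import math
from scipy.stats import t

# Same chocolate-bar data, but now 10 is the *sample* standard
# deviation, so we use the t distribution with n - 1 = 19 df.
n, s, xbar = 20, 10.0, 224.0

def ci_unknown_variance(xbar, s, n, alpha):
    """CI for the mean: xbar -/+ t_{1-alpha/2}(n-1) * s / sqrt(n)."""
    tq = t.ppf(1 - alpha / 2, n - 1)
    half = tq * s / math.sqrt(n)
    return xbar - half, xbar + half

lo90, hi90 = ci_unknown_variance(xbar, s, n, 0.10)
lo95, hi95 = ci_unknown_variance(xbar, s, n, 0.05)
print(round(lo90, 1), round(hi90, 1))   # slide: [220.1; 227.9]
print(round(lo95, 1), round(hi95, 1))   # slide: [219.3; 228.7]
```

Both t intervals are slightly wider than their known-variance counterparts, reflecting the extra uncertainty from estimating σ.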

Confidence interval for variance

Sample:
  X_i ~ N(μ, σ²),  i = 1, ..., n

Notation:
  s² = (1/(n−1)) Σ_{i=1}^n (x_i − x̄)²

  (n − 1)S²/σ² ~ χ²(n − 1)

  χ²_{α,n−1} = α fractile of χ²(n − 1)

(1 − α)·100% confidence interval for σ²:

  (n − 1)s²/χ²_{1−α/2,n−1} ≤ σ² ≤ (n − 1)s²/χ²_{α/2,n−1}

Varying chocolate bars

In a sample of 20 chocolate bars the amount of calories has been measured. We have:
- the sample standard deviation is 10 calories.

Calculate 90% and 95% confidence intervals for the variance.

Solutions:
  90% confidence interval: [19·10²/30.1 ; 19·10²/10.1] ≈ [63.0; 187.8]
  95% confidence interval: [19·10²/32.9 ; 19·10²/8.9] ≈ [57.8; 213.3]
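A Python/SciPy cross-check of the variance intervals, using the χ² fractiles instead of the rounded table values on the slide:

```python
from scipy.stats import chi2

# Chocolate-bar data: n = 20, sample variance s^2 = 10^2.
n, s2 = 20, 10.0 ** 2

def ci_variance(s2, n, alpha):
    """CI for sigma^2: (n-1)s^2 divided by the upper/lower chi^2 fractiles."""
    lo = (n - 1) * s2 / chi2.ppf(1 - alpha / 2, n - 1)
    hi = (n - 1) * s2 / chi2.ppf(alpha / 2, n - 1)
    return lo, hi

lo90, hi90 = ci_variance(s2, n, 0.10)
lo95, hi95 = ci_variance(s2, n, 0.05)
print(round(lo90, 1), round(hi90, 1))   # slide: [63.0; 187.8]
print(round(lo95, 1), round(hi95, 1))   # slide: [57.8; 213.3]
```

Note the interval is not symmetric around s² = 100: the χ² distribution is skewed.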

Difference in means
Known variances

Two populations:
  X_{1,i} ~ N(μ₁, σ₁²)
  X_{2,i} ~ N(μ₂, σ₂²)

Two samples:
  x_{1,1}, x_{1,2}, ..., x_{1,n₁}
  x_{2,1}, x_{2,2}, ..., x_{2,n₂}

Estimate of μ₁ − μ₂:

  x̄₁ − x̄₂ = (1/n₁) Σ_{i=1}^{n₁} x_{1,i} − (1/n₂) Σ_{i=1}^{n₂} x_{2,i}

Confidence interval:

  (x̄₁ − x̄₂) + z_{α/2} √(σ₁²/n₁ + σ₂²/n₂) ≤ μ₁ − μ₂ ≤ (x̄₁ − x̄₂) + z_{1−α/2} √(σ₁²/n₁ + σ₂²/n₂)
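A sketch of the two-sample interval in Python; all the sample numbers below (means, standard deviations, sample sizes) are hypothetical, chosen only to exercise the formula:

```python
import math
from scipy.stats import norm

# Hypothetical two-sample data with *known* standard deviations.
xbar1, sigma1, n1 = 224.0, 10.0, 20
xbar2, sigma2, n2 = 218.0, 12.0, 25

alpha = 0.05
z = norm.ppf(1 - alpha / 2)
# Standard error of xbar1 - xbar2: sqrt(sigma1^2/n1 + sigma2^2/n2).
se = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
diff = xbar1 - xbar2
print(round(diff - z * se, 2), round(diff + z * se, 2))
```

If the interval contains 0, the data are consistent with μ₁ = μ₂ at the chosen level.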

Test of two means
Unknown & equal variances

Degrees of freedom: ν = n₁ + n₂ − 2

Pooled variance estimate:

  s_p² = ((n₁ − 1)s₁² + (n₂ − 1)s₂²) / (n₁ + n₂ − 2)

Confidence interval:

  (x̄₁ − x̄₂) + t_{α/2,ν} s_p √(1/n₁ + 1/n₂) ≤ μ₁ − μ₂ ≤ (x̄₁ − x̄₂) + t_{1−α/2,ν} s_p √(1/n₁ + 1/n₂)
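The pooled interval, sketched in Python with hypothetical sample numbers (the formulas are the slide's; the data are not):

```python
import math
from scipy.stats import t

# Hypothetical two-sample data; variances unknown but assumed equal.
xbar1, s1, n1 = 224.0, 10.0, 20
xbar2, s2, n2 = 218.0, 11.0, 25

nu = n1 + n2 - 2                       # degrees of freedom
# Pooled variance: a weighted average of the two sample variances.
sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / nu
se = math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)
half = t.ppf(0.975, nu) * se
diff = xbar1 - xbar2
print(round(diff - half, 2), round(diff + half, 2))
```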

Test of two means
Unknown & unequal variances

Degrees of freedom:

  ν = (s₁²/n₁ + s₂²/n₂)² / ( (s₁²/n₁)²/(n₁ − 1) + (s₂²/n₂)²/(n₂ − 1) )

Confidence interval:

  (x̄₁ − x̄₂) + t_{α/2,ν} √(s₁²/n₁ + s₂²/n₂) ≤ μ₁ − μ₂ ≤ (x̄₁ − x̄₂) + t_{1−α/2,ν} √(s₁²/n₁ + s₂²/n₂)
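The degrees-of-freedom formula (Welch–Satterthwaite) typically gives a non-integer ν; a small Python sketch with hypothetical sample numbers:

```python
# Welch-Satterthwaite degrees of freedom for unequal variances.
# The sample standard deviations and sizes below are hypothetical.
s1, n1 = 10.0, 20
s2, n2 = 15.0, 25

a = s1 ** 2 / n1                       # variance of xbar1
b = s2 ** 2 / n2                       # variance of xbar2
nu = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
print(round(nu, 1))
```

SciPy's `scipy.stats.ttest_ind(..., equal_var=False)` uses this same ν internally for the Welch test.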

Likelihood approach: Motivation

- Flip coin 10 times.
- X = # heads ~ B(10, p)
- We observe 3 heads.
- What is p / which p explains our data best?

  p ↦ b(3; 10, p)

[Plot of the likelihood function p ↦ b(3; 10, p); omitted]

Likelihood approach: Motivation

- Flip coin 10 times.
- X = # heads ~ B(10, p)
- We observe 8 heads.
- What is p / which p explains our data best?

  p ↦ b(3; 10, p)
  p ↦ b(8; 10, p)

[Plot of the two likelihood functions; omitted]
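The curves on these slides can be reproduced numerically: evaluate the binomial pmf on a grid of p values and pick the maximiser. A Python/SciPy sketch:

```python
import numpy as np
from scipy.stats import binom

# Evaluate the two likelihood curves from the slides on a grid of
# p values and locate their maximisers (k/n in both cases).
p = np.linspace(0.001, 0.999, 999)
lik3 = binom.pmf(3, 10, p)    # 3 heads in 10 flips
lik8 = binom.pmf(8, 10, p)    # 8 heads in 10 flips

print(round(p[np.argmax(lik3)], 2))   # 0.3
print(round(p[np.argmax(lik8)], 2))   # 0.8
```

The maximiser is the observed frequency k/n in both cases, previewing the maximum likelihood estimate introduced below.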

Likelihood, cont'd

Keeping the observed frequency of heads fixed at 0.3 while the number of flips grows:

  p ↦ b(3; 10, p)
  p ↦ b(6; 20, p)
  p ↦ b(9; 30, p)
  p ↦ b(12; 40, p)
  p ↦ b(15; 50, p)
  p ↦ b(18; 60, p)

[Plots of the six likelihood functions: each peaks at p = 0.3 and narrows as n grows; omitted]

Likelihood function
The general approach

Joint density function of X₁, X₂, ..., X_n:

  f(x₁, x₂, ..., x_n; θ)

θ is the parameter (vector) of f = parameter of interest.

The likelihood function:

  L(θ; x₁, x₂, ..., x_n) = f(x₁, x₂, ..., x_n; θ)

The log-likelihood function:

  l(θ; x₁, x₂, ..., x_n) = log L(θ; x₁, x₂, ..., x_n)

Notice:
  Density:    (x₁, x₂, ..., x_n) ↦ f(x₁, x₂, ..., x_n; θ)  (θ fixed)
  Likelihood: θ ↦ f(x₁, x₂, ..., x_n; θ)  (data fixed)

Likelihood function

Maximum likelihood estimate (MLE):

  θ̂ = argmax_θ f(x₁, x₂, ..., x_n; θ)

- MLE is not necessarily unique.
- Exact optimization can be difficult.
- Numerical optimization can be
  - time consuming to run
  - time consuming to program

Easier with independent observations:

  f(x₁, x₂, ..., x_n; θ) = Π_{i=1}^n f(x_i; θ)
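One practical payoff of the factorisation: the log-likelihood becomes a sum, which is numerically much better behaved than a product of many small densities. A Python sketch with hypothetical data:

```python
import numpy as np
from scipy.stats import norm

# With independent observations the likelihood factorises, so the
# log-likelihood is a sum of log-densities.  Hypothetical data and
# a fixed parameter theta = (mu, sigma):
x = np.array([1.2, -0.7, 0.3, 2.1, -1.5])
mu, sigma = 0.0, 1.0

loglik_sum = norm.logpdf(x, loc=mu, scale=sigma).sum()
loglik_prod = np.log(norm.pdf(x, loc=mu, scale=sigma).prod())
print(np.isclose(loglik_sum, loglik_prod))
```

For large n the product underflows to 0 in floating point while the sum of logs stays finite, which is why numerical MLE always works on the log scale.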

Likelihood function: Example

- Independent observations: x₁, x₂, ..., x_n,  X_i ~ N(μ, σ²)
- Parameter vector: θ = (μ, σ²).
- Likelihood function:

    L(θ; x₁, x₂, ..., x_n) = Π_{i=1}^n f(x_i; θ)
                           = Π_{i=1}^n (1/√(2πσ²)) exp(−(x_i − μ)²/(2σ²))
                           = (2πσ²)^(−n/2) exp(−(1/(2σ²)) Σ_{i=1}^n (x_i − μ)²)

Log-likelihood function:

  l(θ; x₁, x₂, ..., x_n) = −(n/2) log(2π) − (n/2) log(σ²) − (1/(2σ²)) Σ_{i=1}^n (x_i − μ)²

Maximum likelihood estimate:

  μ̂ = x̄,   σ̂² = (1/n) Σ_{i=1}^n (x_i − x̄)² ≠ s²
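The ≠ s² point is worth checking numerically: the MLE of the variance divides by n, the unbiased estimator s² by n − 1. A Python sketch with hypothetical data:

```python
import numpy as np

# The normal MLE of the variance uses 1/n, while s^2 uses 1/(n-1).
# Hypothetical data:
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

mu_hat = x.mean()
sigma2_hat = ((x - mu_hat) ** 2).sum() / len(x)   # MLE, divides by n
s2 = ((x - mu_hat) ** 2).sum() / (len(x) - 1)     # unbiased, divides by n-1

print(mu_hat, sigma2_hat, s2)
print(np.isclose(sigma2_hat, np.var(x)))          # np.var defaults to 1/n
print(np.isclose(s2, np.var(x, ddof=1)))          # ddof=1 gives 1/(n-1)
```

The gap between the two estimates shrinks as n grows, but for small samples the MLE systematically underestimates σ².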

Good vs Best

MLE gives the best parameter within the chosen model class.
This does not guarantee that the model is good.

Fit with normal distribution: x̄ ≈ 0, s² ≈ 9.8

[Figure: histogram of the data with the fitted normal density; omitted]


Matlab

(1 − α)·100% confidence interval for the mean with known variance
(sigma denotes the known standard deviation):

  mean(x) + [-1 1] * norminv(1-alpha/2) * sigma / sqrt(length(x))

(1 − α)·100% confidence interval for the mean with unknown variance:

  n = length(x);
  mean(x) + [-1 1] * tinv(1-alpha/2, n-1) * std(x) / sqrt(n)

(1 − α)·100% confidence interval for the variance:

  n = length(x);
  (n-1)*std(x)^2 ./ chi2inv( [1-alpha/2 alpha/2], n-1 )
