May 2012
Outline
Introduction
Sampling of independent observations
Sampling without replacement
Stratified sampling
Taking samples
In a population of size $N$, consider the sample mean
$$\bar X = \frac{X_1 + X_2 + \dots + X_n}{n}.$$
If $n = N$, the sample mean would verify $\bar X = m$: all parameters can, in principle, be known with certainty! With $n \neq N$, $\bar X$ is only an estimate of $m$.
We will see:
- What makes sampling without replacement more complex.
In independent sampling,
$$\bar X = \frac{X_1 + \dots + X_n}{n} \sim N(m, \sigma^2/n).$$
Then,
$$\mathrm{Prob}\left(\bar X - z_{\alpha/2}\sqrt{\frac{\sigma^2}{n}} \leq m \leq \bar X + z_{\alpha/2}\sqrt{\frac{\sigma^2}{n}}\right) = 1 - \alpha$$
and, for the population total $T = Nm$,
$$\mathrm{Prob}\left(N\bar X - Nz_{\alpha/2}\sqrt{\frac{\sigma^2}{n}} \leq T \leq N\bar X + Nz_{\alpha/2}\sqrt{\frac{\sigma^2}{n}}\right) = 1 - \alpha.$$
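These two intervals are easy to compute in a few lines of Python; the function name and the numerical values below are ours, chosen only for illustration.

```python
import math

def ci_mean_and_total(xbar, sigma2, n, N, z=1.96):
    """(1 - alpha) intervals for the mean m and for the total T = N*m."""
    half = z * math.sqrt(sigma2 / n)
    return (xbar - half, xbar + half), (N * (xbar - half), N * (xbar + half))

# Hypothetical survey: sample mean 50, sigma^2 = 100, n = 400, N = 10000.
(mean_lo, mean_hi), (tot_lo, tot_hi) = ci_mean_and_total(50.0, 100.0, 400, 10_000)
print(mean_lo, mean_hi)
print(tot_lo, tot_hi)
```

Note that the interval for the total is just the interval for the mean scaled by $N$, exactly as in the formula above.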
Estimation of a proportion
From
$$\mathrm{Prob}\left(\bar X - z_{\alpha/2}\sqrt{\frac{\sigma^2}{n}} \leq m \leq \bar X + z_{\alpha/2}\sqrt{\frac{\sigma^2}{n}}\right) = 1 - \alpha$$
we see that we will be off the true value $m$ by less than $z_{\alpha/2}\sqrt{\sigma^2/n}$ with probability $1 - \alpha$.
- This is called the $1 - \alpha$ (sampling) error.
- "Sampling error" is also used to mean the standard deviation of the estimate.
Example:
What $n$ do we need so that with confidence 0.95 the error in the estimation of a proportion is less than 0.03?

Solution:
- The error is less than $z_{\alpha/2}\sqrt{\sigma^2/n}$ with confidence $1 - \alpha$.
- Confidence 0.95 means $z_{\alpha/2} = 1.96$.
- For a proportion, $\sigma^2 = p(1-p) \leq 0.25$.
- Therefore,
$$n \geq \frac{(1.96)^2 \times 0.25}{0.03^2} = 1067.1\ldots,$$
so $n = 1068$.
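The arithmetic above can be checked with a short Python sketch (the function name is ours):

```python
import math

def sample_size_proportion(error, confidence_z, p=0.5):
    """Smallest n with confidence_z * sqrt(p*(1-p)/n) <= error.

    p = 0.5 gives the conservative bound sigma^2 = p*(1-p) <= 0.25.
    """
    return math.ceil(confidence_z**2 * p * (1 - p) / error**2)

print(sample_size_proportion(error=0.03, confidence_z=1.96))  # 1068
```

Loosening the error to 0.05 drops the requirement to 385, which is why many opinion polls use samples of roughly that size.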
In independent sampling,
$$E[\bar X] = E\left[\frac{X_1 + \dots + X_n}{n}\right] = \frac{m + m + \dots + m}{n} = \frac{nm}{n} = m.$$
Theorem 1
In a finite population of size $N$ with $m = \sum_{i=1}^{N} y_i / N$, for samples $Y_1, \dots, Y_n$ without replacement of size $n < N$ we have:
$$E[\bar Y] = m.$$

Proof
- There are $\binom{N}{n} = \frac{N!}{(N-n)!\,n!}$ different samples.
- Of those, $\binom{N-1}{n-1}$ contain any given unit $y_i$.
- Clearly,
$$\sum_{\text{samples}} (Y_1 + Y_2 + \dots + Y_n) = \binom{N-1}{n-1}\,(y_1 + y_2 + \dots + y_N).$$
- Indeed,
$$\frac{\sum_{\text{samples}} (Y_1 + Y_2 + \dots + Y_n)}{\binom{N}{n}} = \frac{\binom{N-1}{n-1}}{\binom{N}{n}}\,(y_1 + y_2 + \dots + y_N) = \frac{n}{N}\,(y_1 + y_2 + \dots + y_N).$$
Therefore,
$$E[\bar Y] = \frac{\sum_{\text{samples}} (Y_1 + \dots + Y_n)/n}{\binom{N}{n}} = \frac{y_1 + \dots + y_N}{N} = E[y] = m.$$
We have
$$(Y_1 + Y_2 + \dots + Y_n) = (y_1 Z_1 + y_2 Z_2 + \dots + y_N Z_N)$$
where $Z_i$ is a binary variable which takes value 1 if $y_i$ belongs to a given sample.
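Theorem 1 can be verified exactly on a toy population by enumerating all $\binom{N}{n}$ equally likely samples; the data below are made up for illustration.

```python
from itertools import combinations

# Tiny population; the y-values are arbitrary illustration data.
y = [3.0, 7.0, 1.0, 4.0, 10.0]
N, n = len(y), 2
m = sum(y) / N

# Average of Ybar over all C(N, n) equally likely samples without replacement.
samples = list(combinations(y, n))
mean_of_means = sum(sum(s) / n for s in samples) / len(samples)

print(mean_of_means, m)  # both 5.0
```

The average of the $\binom{5}{2} = 10$ sample means equals the population mean, as the theorem asserts.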
Recall the different variance notions:
- Population variance and quasi-variance:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (y_i - \bar y)^2}{N}, \qquad \tilde\sigma^2 = \frac{\sum_{i=1}^{N} (y_i - \bar y)^2}{N-1}.$$
- Sample quasi-variance:
$$s^2 = \frac{\sum_{i=1}^{n} (Y_i - \bar Y)^2}{n-1}.$$
Variance of Y (I)

Theorem 2
In a finite population of size $N$, the estimator $\bar Y$ of $m = \sum_{i=1}^{N} y_i/N$ based on a sample of size $n < N$ without replacement $Y_1, \dots, Y_n$ has variance:
$$\mathrm{Var}[\bar Y] = \frac{\tilde\sigma^2}{n}\left(1 - \frac{n}{N}\right).$$

- The factor $\left(1 - \frac{n}{N}\right)$ is usually called the finite population correction factor, or simply correction factor.
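Theorem 2 can likewise be checked by exhaustive enumeration on a toy population (made-up data; the two values agree up to floating-point rounding):

```python
from itertools import combinations

y = [3.0, 7.0, 1.0, 4.0, 10.0]           # illustration data
N, n = len(y), 2
m = sum(y) / N
sigma2_tilde = sum((v - m) ** 2 for v in y) / (N - 1)

# Exact variance of Ybar over all C(N, n) equally likely samples.
samples = list(combinations(y, n))
means = [sum(s) / n for s in samples]
var_exact = sum((x - m) ** 2 for x in means) / len(samples)

# Theorem 2: quasi-variance over n, times the correction factor.
var_theorem = sigma2_tilde / n * (1 - n / N)
print(var_exact, var_theorem)
```

With $\sigma^2$ in place of $\tilde\sigma^2$ (and no correction factor) the two numbers would not match, which is the point of the remarks below.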
Variance of Y (II)

Remarks:
- It is the same expression as in independent random sampling, with i) $\sigma^2$ replaced by $\tilde\sigma^2$, and ii) corrected with the factor $(1 - n/N)$.
- If $n = N$, the variance $\mathrm{Var}(\bar Y)$ is 0 (why?).
Variance of Y (III)

Proof
$$\mathrm{Var}(\bar Y) = \mathrm{Var}\left(\frac{y_1 Z_1 + \dots + y_N Z_N}{n}\right) = \frac{1}{n^2}\left[\sum_{i=1}^{N} y_i^2\, \mathrm{Var}(Z_i) + \sum_{i=1}^{N} \sum_{j \neq i} y_i y_j\, \mathrm{Cov}(Z_i, Z_j)\right]$$
Variance of Y (IV)

- Since $Z_i$ takes value 1 with probability $n/N$,
$$\mathrm{Var}(Z_i) = \frac{n}{N}\left(1 - \frac{n}{N}\right).$$
- Since $E[Z_i Z_j] = \frac{n(n-1)}{N(N-1)}$,
$$\mathrm{Cov}(Z_i, Z_j) = \frac{n(n-1)}{N(N-1)} - \frac{n^2}{N^2} = -\frac{n(1 - n/N)}{N(N-1)}.$$
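Both moments of the indicators can be confirmed by enumerating all samples of a toy population (a sketch; the sizes are arbitrary):

```python
from itertools import combinations

N, n = 6, 3
samples = list(combinations(range(N), n))
M = len(samples)  # C(N, n) equally likely samples

def z(i):
    """Indicator of unit i over all samples."""
    return [1.0 if i in s else 0.0 for s in samples]

z0, z1 = z(0), z(1)
e0 = sum(z0) / M                                   # should be n/N
var0 = sum((v - e0) ** 2 for v in z0) / M
cov01 = sum((a - e0) * (b - e0) for a, b in zip(z0, z1)) / M

print(var0, (n / N) * (1 - n / N))
print(cov01, -n * (1 - n / N) / (N * (N - 1)))
```

The negative covariance reflects that including one unit makes including another slightly less likely, which is exactly what sampling without replacement does.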
Variance of Y (V)

$$\mathrm{Var}(\bar Y) = \frac{1}{n^2}\left[\sum_{i=1}^{N} y_i^2 \underbrace{\mathrm{Var}(Z_i)}_{(n/N)(1-n/N)} + \sum_{i=1}^{N}\sum_{j \neq i} y_i y_j \underbrace{\mathrm{Cov}(Z_i, Z_j)}_{-\frac{n(1-n/N)}{N(N-1)}}\right]
= \frac{1}{n}\left(1 - \frac{n}{N}\right)\frac{1}{N}\left[\sum_{i=1}^{N} y_i^2 - \frac{1}{N-1}\sum_{i=1}^{N}\sum_{j \neq i} y_i y_j\right]$$
Variance of Y (VI)

Remark that
$$\sum_{i=1}^{N} (y_i - m)^2 = \sum_{i=1}^{N} y_i^2 - \frac{\left(\sum_{i=1}^{N} y_i\right)^2}{N} = \frac{N-1}{N}\left[\sum_{i=1}^{N} y_i^2 - \frac{1}{N-1}\sum_{i=1}^{N}\sum_{j \neq i} y_i y_j\right].$$
Variance of Y (VII)

Therefore,
$$\mathrm{Var}(\bar Y) = \frac{1}{n}\left(1 - \frac{n}{N}\right)\frac{1}{N}\cdot\frac{N}{N-1}\sum_{i=1}^{N} (y_i - m)^2 = \frac{1}{n}\left(1 - \frac{n}{N}\right)\frac{\sum_{i=1}^{N} (y_i - m)^2}{N-1} = \frac{\tilde\sigma^2}{n}\left(1 - \frac{n}{N}\right).$$
The $(1 - \alpha)$ error is
$$\varepsilon = z_{\alpha/2}\sqrt{\frac{\tilde\sigma^2}{n}\left(1 - \frac{n}{N}\right)}.$$
Solving for $n$,
$$n = \frac{N z_{\alpha/2}^2\, \tilde\sigma^2}{N \varepsilon^2 + z_{\alpha/2}^2\, \tilde\sigma^2}.$$
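A small helper evaluates this sample-size formula; the function name and the numbers are ours, chosen for illustration.

```python
import math

def sample_size_fpc(N, error, z, sigma2):
    """n = N z^2 sigma^2 / (N error^2 + z^2 sigma^2), rounded up."""
    return math.ceil(N * z**2 * sigma2 / (N * error**2 + z**2 * sigma2))

# Hypothetical case: population 10000, 95% confidence, error 0.03,
# conservative sigma^2 = 0.25 for a proportion.
print(sample_size_fpc(N=10_000, error=0.03, z=1.96, sigma2=0.25))  # 965
```

Compare with $n = 1068$ obtained earlier without the finite population correction: for a population of 10000 the correction already saves about a hundred units.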
- To use these formulas, $\sigma^2$ or $\tilde\sigma^2$ are required.
- We either use an upper bound or a conservative estimate for $\sigma^2$.
- Failing that, we estimate $\sigma^2$ or $\tilde\sigma^2$.
Why strata?
Example 1
[Figure: scatterplot of Expenditure against Sample unit; the observations fall into two groups, labelled X1 and X2.]
With $h$ strata of sizes $N_1, \dots, N_h$, define the stratum means and variances
$$m_i = \frac{1}{N_i}\sum_{j=1}^{N_i} y_{ij} \qquad \text{and} \qquad \sigma_i^2 = \frac{1}{N_i}\sum_{j=1}^{N_i} (y_{ij} - m_i)^2.$$
Clearly,
$$m = \sum_{i=1}^{h} \frac{N_i}{N}\, m_i \qquad \text{and} \qquad \sigma^2 = \sum_{i=1}^{h} \frac{N_i}{N}\,\sigma_i^2 + \sum_{i=1}^{h} \frac{N_i}{N}\,(m_i - m)^2.$$
- In simple random sampling without replacement, the estimator of $m$ has variance $\frac{\tilde\sigma^2}{n}(1 - n/N)$.
- In stratified sampling, the estimate $\bar Y_i$ of each stratum mean $m_i$ has variance $\frac{\tilde\sigma_i^2}{n_i}(1 - n_i/N_i)$.
- The stratified estimator $\sum_{i=1}^{h} \frac{N_i}{N}\bar Y_i$ of $m$ therefore has variance
$$\sum_{i=1}^{h} \left(\frac{N_i}{N}\right)^2 \frac{\tilde\sigma_i^2}{n_i}\left(1 - \frac{n_i}{N_i}\right).$$
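The stratified-variance formula transcribes directly into code; the function name and the strata data are ours, chosen for illustration.

```python
def stratified_variance(strata, n_alloc):
    """Sum over strata of (Ni/N)^2 * s2i/ni * (1 - ni/Ni).

    strata: list of (Ni, sigma2_tilde_i) pairs; n_alloc: list of ni.
    """
    N = sum(Ni for Ni, _ in strata)
    return sum((Ni / N) ** 2 * s2 / ni * (1 - ni / Ni)
               for (Ni, s2), ni in zip(strata, n_alloc))

# Hypothetical strata: (size, within-stratum quasi-variance).
strata = [(600, 4.0), (400, 9.0)]
print(stratified_variance(strata, n_alloc=[60, 40]))
```

Here the allocation is proportional ($n_i = n N_i/N$ with $n = 100$), the case compared against simple random sampling below.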
With proportional allocation, $n_i = n N_i/N$, the difference with respect to simple random sampling is
$$\frac{\tilde\sigma^2}{n}\left(1 - \frac{n}{N}\right) - \sum_{i=1}^{h}\left(\frac{N_i}{N}\right)^2 \frac{\tilde\sigma_i^2}{n_i}\left(1 - \frac{n_i}{N_i}\right) = \left(1 - \frac{n}{N}\right)\frac{1}{n}\left[\sum_{i=1}^{h}\left(\frac{N_i - 1}{N - 1} - \frac{N_i}{N}\right)\tilde\sigma_i^2 + \sum_{i=1}^{h}\frac{N_i}{N-1}(m_i - m)^2\right].$$
The first sum is negligible for large $N$, so the gain is essentially $\left(1 - \frac{n}{N}\right)\frac{1}{n}\sum_{i=1}^{h}\frac{N_i}{N-1}(m_i - m)^2 \geq 0$: the more the stratum means differ, the more stratification helps.
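The identity above can be checked numerically on arbitrary data (a sketch; the strata values are made up):

```python
# Two made-up strata; proportional allocation gives integer sizes here.
strata = [[1.0, 2.0, 6.0, 3.0], [10.0, 14.0, 12.0, 8.0, 11.0, 9.0]]
Ns = [len(s) for s in strata]
N = sum(Ns)
n = 5
ns = [n * Ni / N for Ni in Ns]            # proportional allocation: 2 and 3

pop = [v for s in strata for v in s]
m = sum(pop) / N
s2 = sum((v - m) ** 2 for v in pop) / (N - 1)          # sigma2_tilde
ms = [sum(s) / len(s) for s in strata]                 # stratum means
s2s = [sum((v - mi) ** 2 for v in s) / (len(s) - 1)    # sigma2_tilde_i
       for s, mi in zip(strata, ms)]

lhs = s2 / n * (1 - n / N) - sum(
    (Ni / N) ** 2 * s2i / ni * (1 - ni / Ni)
    for Ni, s2i, ni in zip(Ns, s2s, ns))
rhs = (1 - n / N) / n * (
    sum(((Ni - 1) / (N - 1) - Ni / N) * s2i for Ni, s2i in zip(Ns, s2s))
    + sum(Ni / (N - 1) * (mi - m) ** 2 for Ni, mi in zip(Ns, ms)))
print(lhs, rhs)
```

The two sides agree up to floating-point rounding, and the difference is positive: for this population stratification does reduce the variance.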
Abraham Wald (1902-1950)
Hungarian-born. Graduated (Ph.D. Mathematics) from University of Vienna, 1931.
Never do!
- Do not let the survey taker choose the units.