Finite Population Sampling

Fernando TUSELL

May 2012

Outline
Introduction
Sampling of independent observations
Sampling without replacement
Stratified sampling
Taking samples

Introduction

Sampling of independent observations

- We have been assuming samples $X_1, X_2, \ldots, X_n$ made of independent observations.
- This makes sense:
  - When we sample an infinite population: seeing one value does not affect the probability of seeing the same or another value.
  - When we sample with replacement.
- With finite populations sampled without replacement, what we see affects the probability of what is yet to be seen.
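The contrast between the two sampling schemes can be sketched numerically. The urn of 3 red and 2 blue units and the function name `prob_second_red` are assumptions made for this illustration only; they do not come from the slides.

```python
from fractions import Fraction

def prob_second_red(red, blue, first_was_red, replacement):
    """P(second draw is red | colour of the first draw) for an urn of red/blue units."""
    total = red + blue
    if replacement:
        # The urn is restored after each draw, so the first draw is irrelevant.
        return Fraction(red, total)
    # Without replacement the first unit drawn is no longer in the urn.
    remaining_red = red - 1 if first_was_red else red
    return Fraction(remaining_red, total - 1)

with_repl = [prob_second_red(3, 2, f, True) for f in (True, False)]
without_repl = [prob_second_red(3, 2, f, False) for f in (True, False)]
print(with_repl)      # same probability whatever was seen first
print(without_repl)   # probability depends on what was seen first
```

With replacement both conditional probabilities coincide; without replacement they differ, which is exactly the loss of independence described above.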

Finite versus infinite populations (I)

- With infinite populations, precision depends only on sample size.
- Usually, the standard error of estimation is $\sqrt{\sigma^2/n}$, where $n$ is the sample size and $\sigma^2$ the population variance.
- If the estimator is consistent, we approach (but never quite hit with certainty) the true value of the parameter.

Finite versus infinite populations (II)

- If the population is finite of size $N$, we could inspect all units and estimate anything with certainty:
  \[ \hat m = \frac{X_1 + X_2 + \ldots + X_n}{n} \]
  would verify $\hat m = m$ if $n = N$.
- All parameters can, in principle, be known with certainty!
- With $n \neq N$:
  - If $n/N \approx 0$, independent sampling is a good approximation.
  - If $n/N \gg 0$, we have to take into account that we are looking at a substantial portion of the population.

An overview of things to come

We will see:
- What makes sampling without replacement more complex.
- What relationship there is between independent and non-independent sampling.
- What other types of sampling exist.

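Sampling of independent observations

The central approximation

- Requirement: sampling with replacement, or a large population size $N$.
- If $n$ is large and $X_1, \ldots, X_n$ are nearly independent,
  \[ \bar X = \frac{X_1 + \ldots + X_n}{n} \sim N(m, \sigma^2/n) \quad \text{(approximately).} \]
- Then,
  \[ \mathrm{Prob}\left( \bar X - z_{\alpha/2}\sqrt{\frac{\sigma^2}{n}} \le m \le \bar X + z_{\alpha/2}\sqrt{\frac{\sigma^2}{n}} \right) = 1 - \alpha \]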
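A minimal sketch of the interval for $m$ under the normal approximation. The data, the assumed known variance `sigma2` and the function name are hypothetical; the sample is far too small for the approximation to be trusted and only serves to show the arithmetic.

```python
from math import sqrt
from statistics import NormalDist, fmean

def mean_confidence_interval(sample, sigma2, alpha=0.05):
    """Interval (sample mean) +/- z_{alpha/2} * sqrt(sigma2 / n)."""
    n = len(sample)
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}, about 1.96 for alpha = 0.05
    half_width = z * sqrt(sigma2 / n)
    centre = fmean(sample)
    return centre - half_width, centre + half_width

# Hypothetical observations and a hypothetical known variance.
sample = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6, 5.4, 4.9]
low, high = mean_confidence_interval(sample, sigma2=0.25)
print(low, high)
```

Multiplying both ends of the interval by the population size gives the corresponding interval for a population total.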

Estimation of the population total

- Since $T = Nm$, we just have to multiply by $N$ the extremes of the interval for $m$.
- Hence,
  \[ \mathrm{Prob}\left( N\bar X - N z_{\alpha/2}\sqrt{\frac{\sigma^2}{n}} \le T \le N\bar X + N z_{\alpha/2}\sqrt{\frac{\sigma^2}{n}} \right) = 1 - \alpha \]

Estimation of a proportion

- If $X_i$ is a binary variable, $\bar X$ is the sample proportion.
- We have $\bar X \sim N(p, pq/n)$ approximately, with $q = 1 - p$.
- The usual estimate of the variance is $\hat p(1 - \hat p)/n$.
- Sometimes we use a (conservative) estimate: $p = q = 0.5$, which maximises $pq$ at $0.25$; hence a bound for the variance $pq/n$ is $0.25/n$.

Sampling error with confidence $1 - \alpha$

- From
  \[ \mathrm{Prob}\left( \bar X - z_{\alpha/2}\sqrt{\frac{\sigma^2}{n}} \le m \le \bar X + z_{\alpha/2}\sqrt{\frac{\sigma^2}{n}} \right) = 1 - \alpha \]
  we see that we will be off the true value $m$ by less than $z_{\alpha/2}\sqrt{\sigma^2/n}$ with probability $1 - \alpha$.
- This is called the $1 - \alpha$ (sampling) error.
- "Sampling error" is also used to mean the standard deviation of the estimate.

Finding the required sample size n

Example:
What $n$ do we need so that with confidence 0.95 the error in the estimation of a proportion is less than 0.03?

Solution:
- The error is less than $z_{\alpha/2}\sqrt{\sigma^2/n}$ with confidence $1 - \alpha$.
- Confidence 0.95 means $z_{\alpha/2} = 1.96$.
- We want $0.03 > 1.96\sqrt{\sigma^2/n}$.
- The worst case scenario is $\sigma^2 = 0.25$.
- Therefore, $n > \frac{(1.96)^2 \times 0.25}{0.03^2} = 1067.11$ will do. We will take $n = 1068$.
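The arithmetic of this example can be reproduced in a few lines. The function name and its default arguments are illustrative choices, not part of the slides.

```python
from math import ceil

def sample_size_independent(max_error, z=1.96, sigma2=0.25):
    """Return the bound z^2 * sigma2 / max_error^2 and that bound rounded up."""
    bound = z ** 2 * sigma2 / max_error ** 2
    return bound, ceil(bound)

bound, n = sample_size_independent(0.03)
print(round(bound, 2), n)
```

Note that the population size does not enter the computation at all, only the variance bound, the confidence level and the desired precision.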

Interesting facts (I)

- Under independent sampling (infinite population or sampling with replacement), the required sample size depends only on the variance and the precision required.
- Questions like "Is a sample of 4% enough?" are badly posed.
- $n$ = 4% of a population with $N = 10000$ is insufficient to give a precision of 0.03 with confidence 0.95.
- ... but $n$ = 0.3% of a population with $N = 1000000$ will be more than enough!
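The two cases can be checked under the same worst-case assumptions used for the proportion example (variance 0.25, $z = 1.96$, independent sampling). This is only a sketch; the helper name is hypothetical.

```python
from math import sqrt

def worst_case_error(n, z=1.96, sigma2=0.25):
    """Sampling error z * sqrt(sigma2 / n) under independent sampling."""
    return z * sqrt(sigma2 / n)

n_small = round(0.04 * 10_000)        # 4% of N = 10000
n_large = round(0.003 * 1_000_000)    # 0.3% of N = 1000000
print(n_small, worst_case_error(n_small))
print(n_large, worst_case_error(n_large))
```

The 4% sample has only 400 units and misses the 0.03 target, while the 0.3% sample has 3000 units and comfortably meets it: what matters is the absolute sample size, not the sampling fraction.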

Interesting facts (II)

- As long as populations are large, detail is expensive!
- To estimate a proportion in the CAPV with the precision stated requires about $n = 1068$.
- To estimate the same proportion for each of the three Territories with the same precision requires three times as large a sample!
- Subpopulation estimates have much lower precision than those for the whole population.

Sampling without replacement

Estimation of the mean (I)

- In independent sampling,
  \[ E[\bar X] = E\left[\frac{X_1 + \ldots + X_n}{n}\right] = \frac{m + m + \ldots + m}{n} = \frac{nm}{n} = m \]
- $E[X_i] = m$ irrespective of what other values are in the sample.
- Without replacement, the distribution of $X_i$ depends on what other values are already present in the sample.
- The same result as for independent sampling is true!

Estimation of the mean (II)

Theorem 1
In a finite population of size $N$ with $m = \sum_{i=1}^N y_i / N$, for samples $Y_1, \ldots, Y_n$ without replacement of size $n < N$ we have:
\[ E[\bar Y] = m \]

Proof
- $Y_1, Y_2, \ldots, Y_n$ are the elements of the sample.
- $y_1, y_2, \ldots, y_N$ are the elements of the population.

Estimation of the mean (III)

- There are $\binom{N}{n} = \frac{N!}{(N-n)!\,n!}$ different samples.
- Of those, $\binom{N-1}{n-1}$ contain each of the values $y_1, y_2, \ldots, y_N$.
- Clearly,
  \[ \sum (Y_1 + Y_2 + \ldots + Y_n) = \binom{N-1}{n-1} (y_1 + y_2 + \ldots + y_N) \]
  where the sum on the left is taken over all $\binom{N}{n}$ different samples. Dividing by $\binom{N}{n}$ finishes the proof.

Estimation of the mean (IV)

- Indeed,
  \[ \frac{\sum (Y_1 + Y_2 + \ldots + Y_n)}{\binom{N}{n}} = \frac{\binom{N-1}{n-1}(y_1 + y_2 + \ldots + y_N)}{\binom{N}{n}} = \frac{n}{N}(y_1 + y_2 + \ldots + y_N) \]
- Therefore,
  \[ E[\bar Y] = \frac{\sum (Y_1 + \ldots + Y_n)/n}{\binom{N}{n}} = \frac{(y_1 + \ldots + y_N)}{N} = \bar y = m \]
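The counting argument can be checked by brute force on a tiny population. The values below are hypothetical; exact fractions are used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction
from itertools import combinations

population = [3, 7, 8, 12, 20]    # hypothetical y_1, ..., y_N
N, n = len(population), 3

samples = list(combinations(population, n))           # all C(N, n) samples
sample_means = [Fraction(sum(s), n) for s in samples]

# Every sample is equally likely, so the expectation is a plain average.
expected_sample_mean = sum(sample_means) / len(samples)
population_mean = Fraction(sum(population), N)
print(len(samples), expected_sample_mean, population_mean)
```

The average of the sample mean over all possible samples coincides with the population mean, as the theorem states.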

The indicator variable method

- We have
  \[ (Y_1 + Y_2 + \ldots + Y_n) = (y_1 Z_1 + y_2 Z_2 + \ldots + y_N Z_N) \]
  where $Z_i$ is a binary variable which takes value 1 if $y_i$ belongs to a given sample.
- The probability of that happening is $n/N$. Then,
  \[ E[(Y_1 + Y_2 + \ldots + Y_n)] = \frac{n}{N}(y_1 + y_2 + \ldots + y_N), \]
  which again gives the previous result $E[\bar Y] = \bar y = m$.

Population variance and quasi-variance

- They are defined as:
  \[ \sigma^2 = \frac{\sum_{i=1}^N (y_i - \bar y)^2}{N} \qquad \tilde\sigma^2 = \frac{\sum_{i=1}^N (y_i - \bar y)^2}{N - 1} \]
- Similarly for the sample analogues:
  \[ s^2 = \frac{\sum_{i=1}^n (Y_i - \bar Y)^2}{n} \qquad \tilde s^2 = \frac{\sum_{i=1}^n (Y_i - \bar Y)^2}{n - 1} \]
- It turns out that some formulae are simpler in terms of quasi-variances.
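Python's standard library happens to offer both divisors: `statistics.pvariance` divides by the number of values and `statistics.variance` by that number minus one. A short sketch with hypothetical population values:

```python
from statistics import pvariance, variance

y = [3, 7, 8, 12, 20]          # hypothetical population values
N = len(y)

sigma2 = pvariance(y)          # sum of squared deviations divided by N
quasi_sigma2 = variance(y)     # sum of squared deviations divided by N - 1
print(sigma2, quasi_sigma2)
```

The two quantities differ only by the factor $N/(N-1)$, so for large populations they are nearly equal.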

Variance of $\bar Y$ (I)

Theorem 2
In a finite population of size $N$, the estimator $\bar Y$ of $m = \sum_{i=1}^N y_i / N$ based on a sample of size $n < N$ without replacement $Y_1, \ldots, Y_n$ has variance:
\[ \mathrm{Var}[\bar Y] = \frac{\tilde\sigma^2}{n}\left(1 - \frac{n}{N}\right) \]
where $\tilde\sigma^2$ is the population quasi-variance.

- The factor $\left(1 - \frac{n}{N}\right)$ is usually called the finite population correction factor, or simply the correction factor.


Variance of $\bar Y$ (II)

Remarks:
- It is the same expression as in independent random sampling with i) the variance $\sigma^2$ replaced by the quasi-variance $\tilde\sigma^2$, and ii) corrected with the factor $(1 - n/N)$.
- If $n = N$, the variance $\mathrm{Var}(\bar Y)$ is 0 (why?).
- The formula covers the middle ground between infinite populations ($n/N = 0$) and census sampling ($n/N = 1$).
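Theorem 2 can be verified exactly on a small example by enumerating every sample without replacement. The population values are hypothetical; the variance of the sample mean is computed directly from its definition and compared with the quasi-variance over $n$ times the correction factor.

```python
from fractions import Fraction
from itertools import combinations

population = [3, 7, 8, 12, 20]    # hypothetical population
N, n = len(population), 2
m = Fraction(sum(population), N)

# Quasi-variance: squared deviations divided by N - 1.
quasi_var = sum((y - m) ** 2 for y in population) / (N - 1)

# Exact variance of the sample mean over all equally likely samples.
means = [Fraction(sum(s), n) for s in combinations(population, n)]
var_of_mean = sum((ybar - m) ** 2 for ybar in means) / len(means)

formula = quasi_var / n * (1 - Fraction(n, N))
print(var_of_mean, formula)
```

Setting `n` equal to `N` leaves a single possible sample, so both quantities become 0, which answers the "why?" in the remarks.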

Variance of $\bar Y$ (III)

Proof
\[
\mathrm{Var}(\bar Y) = \mathrm{Var}\left(\frac{y_1 Z_1 + \ldots + y_N Z_N}{n}\right)
= \frac{1}{n^2}\left[ \sum_{i=1}^N y_i^2\,\mathrm{Var}(Z_i) + \sum_{i=1}^N \sum_{j \neq i} y_i y_j\,\mathrm{Cov}(Z_i, Z_j) \right]
\]

- We only need expressions for $\mathrm{Var}(Z_i)$ and $\mathrm{Cov}(Z_i, Z_j)$.

Variance of $\bar Y$ (IV)

- Since $Z_i$ is binary with probability $n/N$, $\mathrm{Var}(Z_i) = (n/N)(1 - n/N)$.
- But $E[Z_i Z_j] = P(Z_i = 1, Z_j = 1) = \frac{n(n-1)}{N(N-1)}$, so
  \[ \mathrm{Cov}(Z_i, Z_j) = \frac{n(n-1)}{N(N-1)} - \left(\frac{n}{N}\right)^2 = -\frac{n(1 - n/N)}{N(N-1)} \]
- Replacing in the expression for $\mathrm{Var}(\bar Y)$ will lead to the result.
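The moments of the inclusion indicators can also be checked by enumeration. In this sketch the sizes `N = 6`, `n = 3` and the helper `expectation` are arbitrary illustrative choices; units are identified by their index.

```python
from fractions import Fraction
from itertools import combinations

N, n = 6, 3
samples = list(combinations(range(N), n))    # samples as tuples of unit indices
total = len(samples)

def expectation(f):
    """Average of f(sample) over all equally likely samples."""
    return sum(Fraction(int(f(s))) for s in samples) / total

# Indicators for units 0 and 1 (by symmetry any two distinct units behave alike).
e_z0 = expectation(lambda s: 0 in s)
e_z1 = expectation(lambda s: 1 in s)
e_z0z1 = expectation(lambda s: 0 in s and 1 in s)

var_z0 = e_z0 - e_z0 ** 2            # Z is binary, so E[Z^2] = E[Z]
cov_z0z1 = e_z0z1 - e_z0 * e_z1
print(e_z0, var_z0, cov_z0z1)
```

The covariance is negative: knowing that one unit is in the sample makes it slightly less likely that another one is.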

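Variance of $\bar Y$ (V)

\[
\mathrm{Var}(\bar Y) = \frac{1}{n^2}\left[ \sum_{i=1}^N y_i^2 \underbrace{\mathrm{Var}(Z_i)}_{(n/N)(1-n/N)} + \sum_{i=1}^N \sum_{j \neq i} y_i y_j \underbrace{\mathrm{Cov}(Z_i, Z_j)}_{-\frac{n(1-n/N)}{N(N-1)}} \right]
\]
\[
= \frac{1}{n^2}\,\frac{n}{N}\left(1 - \frac{n}{N}\right)\left[ \sum_{i=1}^N y_i^2 - \frac{1}{N-1}\sum_{i=1}^N \sum_{j \neq i} y_i y_j \right]
\]

- We will rewrite the expression in brackets.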

Variance of $\bar Y$ (VI)

- Remark that
  \[ \sum_{i=1}^N (y_i - m)^2 = \sum_{i=1}^N y_i^2 - \frac{\left(\sum_{i=1}^N y_i\right)^2}{N} = \frac{N-1}{N}\left[ \sum_{i=1}^N y_i^2 - \sum_{i=1}^N \sum_{j \neq i} \frac{y_i y_j}{N-1} \right] \]
- The expression in square brackets on the r.h.s. is therefore $\frac{N}{N-1}\sum_{i=1}^N (y_i - m)^2$.

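Variance of $\bar Y$ (VII)

- We are now done!
  \[
  \mathrm{Var}(\bar Y) = \frac{1}{n^2}\,\frac{n}{N}\left(1 - \frac{n}{N}\right) \underbrace{\left[ \sum_{i=1}^N y_i^2 - \frac{1}{N-1}\sum_{i=1}^N \sum_{j \neq i} y_i y_j \right]}_{\frac{N}{N-1}\sum_{i=1}^N (y_i - m)^2}
  \]
  \[
  = \frac{1}{n}\left(1 - \frac{n}{N}\right)\frac{\sum_{i=1}^N (y_i - m)^2}{N - 1}
  = \left(1 - \frac{n}{N}\right)\frac{\tilde\sigma^2}{n}
  \]
  where $\tilde\sigma^2$ is the population quasi-variance.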

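Sample size for given precision (I)

- The $(1 - \alpha)$ error is
  \[ \delta = z_{\alpha/2} \sqrt{\frac{\tilde\sigma^2}{n}\left(1 - \frac{n}{N}\right)} \]
  with $\tilde\sigma^2$ the population quasi-variance.
- Solving for $n$ we obtain
  \[ n = \frac{N z_{\alpha/2}^2\, \tilde\sigma^2}{N\delta^2 + \tilde\sigma^2 z_{\alpha/2}^2} \]
- In terms of the variance $\sigma^2$, it can be written as:
  \[ n = \frac{N z_{\alpha/2}^2\, \sigma^2}{(N - 1)\delta^2 + \sigma^2 z_{\alpha/2}^2} \]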
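A sketch of the sample size computation in terms of the population variance, reusing the earlier proportion example (error 0.03, confidence 0.95, worst-case variance 0.25). The function name, the argument `max_error` for the target error and the population sizes tried are illustrative assumptions.

```python
from math import ceil

def sample_size_fpc(N, max_error, z=1.96, sigma2=0.25):
    """n = N z^2 sigma2 / ((N - 1) max_error^2 + sigma2 z^2); sigma2 is the population variance."""
    return N * z ** 2 * sigma2 / ((N - 1) * max_error ** 2 + sigma2 * z ** 2)

for N in (1_000, 10_000, 1_000_000):
    print(N, ceil(sample_size_fpc(N, 0.03)))
```

For small populations the required sample is noticeably below the 1068 needed under independent sampling, and it approaches that figure as the population grows.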

Sample size for given precision (II)

- The variance $\sigma^2$ or the quasi-variance $\tilde\sigma^2$ is required.
- We either substitute an upper bound or a conservative estimate for $\sigma^2$.
- Failing that, we estimate $\sigma^2$ or $\tilde\sigma^2$.

Stratified sampling

Why strata?

- Sometimes we know something about the composition of the population, knowledge that can be put to use.
- Example: We might know that males and females have different spending on, e.g., tobacco or cosmetics.
- To estimate average spending, it makes sense to sample males and females separately, and combine the estimates.
- Sometimes, the target quantity might be similar, but the variance quite different. It also makes sense to differentiate.

Example 1

[Figure: Expenditure plotted by Sample unit, with the observations and the levels X1 and X2 marked.]

Makes sense to estimate the mean in each subpopulation.

Definitions and notation

- We assume the population is divided in $h$ strata. Total size is $N = N_1 + N_2 + \ldots + N_h$.
- The $i$-th stratum has mean $m_i = \frac{1}{N_i}\sum_{j=1}^{N_i} y_{ij}$ and variance $\sigma_i^2 = \frac{1}{N_i}\sum_{j=1}^{N_i} (y_{ij} - m_i)^2$.
- Clearly,
  \[ m = \sum_{i=1}^h \frac{N_i}{N}\, m_i \]
  \[ \sigma^2 = \sum_{i=1}^h \frac{N_i}{N}\, \sigma_i^2 + \sum_{i=1}^h \frac{N_i}{N}\, (m_i - m)^2 \]
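Both identities (the mean as a weighted average of stratum means, and the variance as a within-strata part plus a between-strata part) can be checked exactly on a toy population. The two strata below are hypothetical.

```python
from fractions import Fraction

strata = [[1, 2, 3, 4], [11, 12, 13, 14, 15, 16]]    # hypothetical strata

def mean(values):
    return Fraction(sum(values), len(values))

def pop_variance(values):
    """Variance with divisor equal to the number of values."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

everything = [v for stratum in strata for v in stratum]
N = len(everything)
m = mean(everything)

weights = [Fraction(len(s), N) for s in strata]       # N_i / N
m_from_strata = sum(w * mean(s) for w, s in zip(weights, strata))
within = sum(w * pop_variance(s) for w, s in zip(weights, strata))
between = sum(w * (mean(s) - m) ** 2 for w, s in zip(weights, strata))
print(m, m_from_strata, pop_variance(everything), within + between)
```

When the stratum means are far apart, most of the total variance sits in the between-strata term, which is the part stratification removes.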

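Estimation of the mean

- The estimate of the mean obtained by sampling the whole population without replacement has variance
  \[ \frac{\tilde\sigma^2}{n}\left(1 - \frac{n}{N}\right), \]
  with $\tilde\sigma^2$ the population quasi-variance.
- Similarly, the estimate of the mean of each stratum has variance
  \[ \sigma_{*i}^2 = \frac{\tilde\sigma_i^2}{n_i}\left(1 - \frac{n_i}{N_i}\right), \]
  with $\tilde\sigma_i^2$ the quasi-variance of stratum $i$ and $n_i$ the size of the sample taken from it.
- The variance of the global mean reconstituted from the estimated means of the strata is
  \[ \sigma_*^2 = \sum_{i=1}^h \left(\frac{N_i}{N}\right)^2 \frac{\tilde\sigma_i^2}{n_i}\left(1 - \frac{n_i}{N_i}\right) \]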

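Does the estimation of $m$ improve?

- Yes. If we sample each stratum in proportion to its size (i.e., $n_i/N_i = n/N$ for all $i$), then:
  \[
  \frac{\tilde\sigma^2}{n}\left(1 - \frac{n}{N}\right) - \sigma_*^2
  = \frac{1}{n}\left(1 - \frac{n}{N}\right) \sum_{i=1}^h \left[\frac{N_i - 1}{N - 1} - \frac{N_i}{N}\right] \tilde\sigma_i^2
  + \frac{1}{n}\left(1 - \frac{n}{N}\right) \sum_{i=1}^h \frac{N_i}{N - 1}\,(m_i - m)^2
  \]
  where $\sigma_*^2$ is the variance of the stratified estimate of the mean, and $\tilde\sigma^2$ and $\tilde\sigma_i^2$ are the quasi-variances of the whole population and of stratum $i$.
- Marked improvement when the $m_i$'s are very different.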
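The gain from proportional stratification can be sketched on a toy population whose two strata have very different means. The data and the allocation are hypothetical; the variances are computed from the finite-population formulas, with quasi-variances (divisor one less than the group size) throughout.

```python
from fractions import Fraction

strata = [[1, 2, 3, 4], [11, 12, 13, 14]]    # hypothetical strata, very different means
n_per_stratum = [2, 2]                        # proportional allocation: n_i / N_i = n / N

def mean(values):
    return Fraction(sum(values), len(values))

def quasi_variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

everything = [v for s in strata for v in s]
N, n = len(everything), sum(n_per_stratum)

# Sampling without replacement from the whole population.
var_srs = quasi_variance(everything) / n * (1 - Fraction(n, N))

# Stratified estimate: sum of (N_i/N)^2 * quasi-variance_i / n_i * (1 - n_i/N_i).
var_strat = sum(
    Fraction(len(s), N) ** 2 * quasi_variance(s) / n_i * (1 - Fraction(n_i, len(s)))
    for s, n_i in zip(strata, n_per_stratum)
)
print(var_srs, var_strat)
```

With these numbers the stratified estimate has a variance many times smaller than the unstratified one, because almost all the variability lies between the strata.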

Taking samples

Abraham Wald on sample selection

Abraham Wald (1902-1950)

- Hungarian-born. Graduated (Ph.D. Mathematics) from the University of Vienna, 1931.
- Fled to the USA in 1938, as Nazi persecution intensified in Austria.
- Important contributions to the war effort as a statistician (notably sequential analysis).
- Was consulted about aircraft armoring.

What Wald saw that the others did not

- "Mark hits in B-29 bombers as they come back."
- "Pretty obvious! Will armor the most beaten areas."
- "I didn't tell you to do that!"
- "Do you want us to protect the areas with no hits?"
- "That's exactly what I suggest!"

Sample selection is ubiquitous!

- If you ask for volunteers in a field study, there is no chance you will get a truly random sample.
- Never do!
- Do not let the survey taker choose the units.
- A random sample is not a grab set.
- Build a census, randomize properly, address the chosen units and no others.
- If you use systematic sampling (every n-th unit with random start), make sure no periodicities exist that will destroy randomness.
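A minimal sketch of drawing the units mechanically from a census, so that nobody gets to pick them. The frame of 100 identifiers, the seed and the function names are hypothetical; `random.Random.sample` draws without replacement.

```python
import random

def simple_random_sample(frame, n, rng):
    """Draw n distinct units from the frame, every subset equally likely."""
    return rng.sample(frame, n)

def systematic_sample(frame, step, rng):
    """Every step-th unit, starting from a random position among the first step units."""
    start = rng.randrange(step)
    return frame[start::step]

rng = random.Random(2012)                  # fixed seed only to make the run repeatable
frame = list(range(1, 101))                # hypothetical census of 100 unit identifiers
srs = simple_random_sample(frame, 10, rng)
sys_sample = systematic_sample(frame, 10, rng)
print(sorted(srs))
print(sys_sample)
```

If the frame were ordered with a period matching `step` (say, every tenth unit being a corner house), the systematic sample would consist only of units of one kind, which is the periodicity problem mentioned above.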
