May 2012
Outline
Introduction
Sampling of independent observations
Sampling without replacement
Stratified sampling
Taking samples
In a population of size $N$, consider the sample mean
$$\bar X = \frac{X_1 + X_2 + \dots + X_n}{n}.$$
If $n = N$, the sample mean would verify $\bar X = m$: all parameters can, in principle, be known with certainty! With $n \neq N$, $\bar X$ is only an estimate of $m$.
We will see:
- What makes sampling without replacement more complex.
In independent sampling,
$$\bar X = \frac{X_1 + \dots + X_n}{n} \sim N(m, \sigma^2/n).$$
Then,
$$\mathrm{Prob}\left(\bar X - z_{\alpha/2}\sqrt{\frac{\sigma^2}{n}} \leq m \leq \bar X + z_{\alpha/2}\sqrt{\frac{\sigma^2}{n}}\right) = 1 - \alpha$$
and, for the population total $T = Nm$,
$$\mathrm{Prob}\left(N\bar X - Nz_{\alpha/2}\sqrt{\frac{\sigma^2}{n}} \leq T \leq N\bar X + Nz_{\alpha/2}\sqrt{\frac{\sigma^2}{n}}\right) = 1 - \alpha.$$
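These two intervals are easy to compute in a few lines of Python; the function name and the numerical values below are ours, chosen only for illustration.

```python
import math

def ci_mean_and_total(xbar, sigma2, n, N, z=1.96):
    """(1 - alpha) intervals for the mean m and for the total T = N*m."""
    half = z * math.sqrt(sigma2 / n)
    return (xbar - half, xbar + half), (N * (xbar - half), N * (xbar + half))

# Hypothetical survey: sample mean 50, sigma^2 = 100, n = 400, N = 10000.
(mean_lo, mean_hi), (tot_lo, tot_hi) = ci_mean_and_total(50.0, 100.0, 400, 10_000)
print(mean_lo, mean_hi)
print(tot_lo, tot_hi)
```

Note that the interval for the total is just the interval for the mean scaled by $N$, exactly as in the formula above.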
Estimation of a proportion
From
$$\mathrm{Prob}\left(\bar X - z_{\alpha/2}\sqrt{\frac{\sigma^2}{n}} \leq m \leq \bar X + z_{\alpha/2}\sqrt{\frac{\sigma^2}{n}}\right) = 1 - \alpha$$
we see that we will be off the true value $m$ by less than $z_{\alpha/2}\sqrt{\sigma^2/n}$ with probability $1 - \alpha$.
- This is called the $1 - \alpha$ (sampling) error.
- "Sampling error" is also used to mean the standard deviation of the estimate.
Example:
What $n$ do we need so that with confidence 0.95 the error in the estimation of a proportion is less than 0.03?

Solution:
- The error is less than $z_{\alpha/2}\sqrt{\sigma^2/n}$ with confidence $1 - \alpha$.
- Confidence 0.95 means $z_{\alpha/2} = 1.96$.
- For a proportion, $\sigma^2 = p(1-p) \leq 0.25$.
- Therefore,
$$n \geq \frac{(1.96)^2 \times 0.25}{0.03^2} = 1067.1\ldots,$$
so $n = 1068$.
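The arithmetic above can be checked with a short Python sketch (the function name is ours):

```python
import math

def sample_size_proportion(error, confidence_z, p=0.5):
    """Smallest n with confidence_z * sqrt(p*(1-p)/n) <= error.

    p = 0.5 gives the conservative bound sigma^2 = p*(1-p) <= 0.25.
    """
    return math.ceil(confidence_z**2 * p * (1 - p) / error**2)

print(sample_size_proportion(error=0.03, confidence_z=1.96))  # 1068
```

Loosening the error to 0.05 drops the requirement to 385, which is why many opinion polls use samples of roughly that size.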
In independent sampling,
$$E[\bar X] = E\left[\frac{X_1 + \dots + X_n}{n}\right] = \frac{m + m + \dots + m}{n} = \frac{nm}{n} = m.$$
Theorem 1
In a finite population of size $N$ with $m = \sum_{i=1}^{N} y_i / N$, for samples $Y_1, \dots, Y_n$ without replacement of size $n < N$ we have:
$$E[\bar Y] = m.$$

Proof
- There are $\binom{N}{n} = \frac{N!}{(N-n)!\,n!}$ different samples.
- Of those, $\binom{N-1}{n-1}$ contain any given unit $y_i$.
- Clearly,
$$\sum_{\text{samples}} (Y_1 + Y_2 + \dots + Y_n) = \binom{N-1}{n-1}\,(y_1 + y_2 + \dots + y_N).$$
- Indeed,
$$\frac{\sum_{\text{samples}} (Y_1 + Y_2 + \dots + Y_n)}{\binom{N}{n}} = \frac{\binom{N-1}{n-1}}{\binom{N}{n}}\,(y_1 + y_2 + \dots + y_N) = \frac{n}{N}\,(y_1 + y_2 + \dots + y_N).$$
Therefore,
$$E[\bar Y] = \frac{\sum_{\text{samples}} (Y_1 + \dots + Y_n)/n}{\binom{N}{n}} = \frac{y_1 + \dots + y_N}{N} = E[y] = m.$$
We have
$$(Y_1 + Y_2 + \dots + Y_n) = (y_1 Z_1 + y_2 Z_2 + \dots + y_N Z_N)$$
where $Z_i$ is a binary variable which takes value 1 if $y_i$ belongs to a given sample.
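Theorem 1 can be verified exactly on a toy population by enumerating all $\binom{N}{n}$ equally likely samples; the data below are made up for illustration.

```python
from itertools import combinations

# Tiny population; the y-values are arbitrary illustration data.
y = [3.0, 7.0, 1.0, 4.0, 10.0]
N, n = len(y), 2
m = sum(y) / N

# Average of Ybar over all C(N, n) equally likely samples without replacement.
samples = list(combinations(y, n))
mean_of_means = sum(sum(s) / n for s in samples) / len(samples)

print(mean_of_means, m)  # both 5.0
```

The average of the $\binom{5}{2} = 10$ sample means equals the population mean, as the theorem asserts.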
Recall the different variance notions:
- Population variance and quasi-variance:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (y_i - \bar y)^2}{N}, \qquad \tilde\sigma^2 = \frac{\sum_{i=1}^{N} (y_i - \bar y)^2}{N-1}.$$
- Sample quasi-variance:
$$s^2 = \frac{\sum_{i=1}^{n} (Y_i - \bar Y)^2}{n-1}.$$
Variance of Y (I)

Theorem 2
In a finite population of size $N$, the estimator $\bar Y$ of $m = \sum_{i=1}^{N} y_i/N$ based on a sample of size $n < N$ without replacement $Y_1, \dots, Y_n$ has variance:
$$\mathrm{Var}[\bar Y] = \frac{\tilde\sigma^2}{n}\left(1 - \frac{n}{N}\right).$$

- The factor $\left(1 - \frac{n}{N}\right)$ is usually called the finite population correction factor, or simply correction factor.
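Theorem 2 can likewise be checked by exhaustive enumeration on a toy population (made-up data; the two values agree up to floating-point rounding):

```python
from itertools import combinations

y = [3.0, 7.0, 1.0, 4.0, 10.0]           # illustration data
N, n = len(y), 2
m = sum(y) / N
sigma2_tilde = sum((v - m) ** 2 for v in y) / (N - 1)

# Exact variance of Ybar over all C(N, n) equally likely samples.
samples = list(combinations(y, n))
means = [sum(s) / n for s in samples]
var_exact = sum((x - m) ** 2 for x in means) / len(samples)

# Theorem 2: quasi-variance over n, times the correction factor.
var_theorem = sigma2_tilde / n * (1 - n / N)
print(var_exact, var_theorem)
```

With $\sigma^2$ in place of $\tilde\sigma^2$ (and no correction factor) the two numbers would not match, which is the point of the remarks below.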
Variance of Y (II)

Remarks:
- It is the same expression as in independent random sampling, with i) $\sigma^2$ replaced by $\tilde\sigma^2$, and ii) corrected with the factor $(1 - n/N)$.
- If $n = N$, the variance $\mathrm{Var}(\bar Y)$ is 0 (why?).
Variance of Y (III)

Proof
$$\mathrm{Var}(\bar Y) = \mathrm{Var}\left(\frac{y_1 Z_1 + \dots + y_N Z_N}{n}\right) = \frac{1}{n^2}\left[\sum_{i=1}^{N} y_i^2\, \mathrm{Var}(Z_i) + \sum_{i=1}^{N} \sum_{j \neq i} y_i y_j\, \mathrm{Cov}(Z_i, Z_j)\right]$$
Variance of Y (IV)

- Since $Z_i$ takes value 1 with probability $n/N$,
$$\mathrm{Var}(Z_i) = \frac{n}{N}\left(1 - \frac{n}{N}\right).$$
- Since $E[Z_i Z_j] = \frac{n(n-1)}{N(N-1)}$,
$$\mathrm{Cov}(Z_i, Z_j) = \frac{n(n-1)}{N(N-1)} - \frac{n^2}{N^2} = -\frac{n(1 - n/N)}{N(N-1)}.$$
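Both moments of the indicators can be confirmed by enumerating all samples of a toy population (a sketch; the sizes are arbitrary):

```python
from itertools import combinations

N, n = 6, 3
samples = list(combinations(range(N), n))
M = len(samples)  # C(N, n) equally likely samples

def z(i):
    """Indicator of unit i over all samples."""
    return [1.0 if i in s else 0.0 for s in samples]

z0, z1 = z(0), z(1)
e0 = sum(z0) / M                                   # should be n/N
var0 = sum((v - e0) ** 2 for v in z0) / M
cov01 = sum((a - e0) * (b - e0) for a, b in zip(z0, z1)) / M

print(var0, (n / N) * (1 - n / N))
print(cov01, -n * (1 - n / N) / (N * (N - 1)))
```

The negative covariance reflects that including one unit makes including another slightly less likely, which is exactly what sampling without replacement does.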
Variance of Y (V)

$$\mathrm{Var}(\bar Y) = \frac{1}{n^2}\left[\sum_{i=1}^{N} y_i^2 \underbrace{\mathrm{Var}(Z_i)}_{(n/N)(1-n/N)} + \sum_{i=1}^{N}\sum_{j \neq i} y_i y_j \underbrace{\mathrm{Cov}(Z_i, Z_j)}_{-\frac{n(1-n/N)}{N(N-1)}}\right]
= \frac{1}{n}\left(1 - \frac{n}{N}\right)\frac{1}{N}\left[\sum_{i=1}^{N} y_i^2 - \frac{1}{N-1}\sum_{i=1}^{N}\sum_{j \neq i} y_i y_j\right]$$
Variance of Y (VI)

Remark that
$$\sum_{i=1}^{N} (y_i - m)^2 = \sum_{i=1}^{N} y_i^2 - \frac{\left(\sum_{i=1}^{N} y_i\right)^2}{N} = \frac{N-1}{N}\left[\sum_{i=1}^{N} y_i^2 - \frac{1}{N-1}\sum_{i=1}^{N}\sum_{j \neq i} y_i y_j\right].$$
Variance of Y (VII)

Therefore,
$$\mathrm{Var}(\bar Y) = \frac{1}{n}\left(1 - \frac{n}{N}\right)\frac{1}{N}\cdot\frac{N}{N-1}\sum_{i=1}^{N} (y_i - m)^2 = \frac{1}{n}\left(1 - \frac{n}{N}\right)\frac{\sum_{i=1}^{N} (y_i - m)^2}{N-1} = \frac{\tilde\sigma^2}{n}\left(1 - \frac{n}{N}\right).$$
The $(1 - \alpha)$ error is
$$\varepsilon = z_{\alpha/2}\sqrt{\frac{\tilde\sigma^2}{n}\left(1 - \frac{n}{N}\right)}.$$
Solving for $n$,
$$n = \frac{N z_{\alpha/2}^2\, \tilde\sigma^2}{N \varepsilon^2 + z_{\alpha/2}^2\, \tilde\sigma^2}.$$
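A small helper evaluates this sample-size formula; the function name and the numbers are ours, chosen for illustration.

```python
import math

def sample_size_fpc(N, error, z, sigma2):
    """n = N z^2 sigma^2 / (N error^2 + z^2 sigma^2), rounded up."""
    return math.ceil(N * z**2 * sigma2 / (N * error**2 + z**2 * sigma2))

# Hypothetical case: population 10000, 95% confidence, error 0.03,
# conservative sigma^2 = 0.25 for a proportion.
print(sample_size_fpc(N=10_000, error=0.03, z=1.96, sigma2=0.25))  # 965
```

Compare with $n = 1068$ obtained earlier without the finite population correction: for a population of 10000 the correction already saves about a hundred units.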
- To use these formulas, $\sigma^2$ or $\tilde\sigma^2$ are required.
- We either use an upper bound or a conservative estimate for $\sigma^2$.
- Failing that, we estimate $\sigma^2$ or $\tilde\sigma^2$.
Why strata?
Example 1
[Figure: scatterplot of Expenditure against Sample unit; the observations fall into two groups, labelled X1 and X2.]
With $h$ strata of sizes $N_1, \dots, N_h$, define the stratum means and variances
$$m_i = \frac{1}{N_i}\sum_{j=1}^{N_i} y_{ij} \qquad \text{and} \qquad \sigma_i^2 = \frac{1}{N_i}\sum_{j=1}^{N_i} (y_{ij} - m_i)^2.$$
Clearly,
$$m = \sum_{i=1}^{h} \frac{N_i}{N}\, m_i \qquad \text{and} \qquad \sigma^2 = \sum_{i=1}^{h} \frac{N_i}{N}\,\sigma_i^2 + \sum_{i=1}^{h} \frac{N_i}{N}\,(m_i - m)^2.$$
- In simple random sampling without replacement, the estimator of $m$ has variance $\frac{\tilde\sigma^2}{n}(1 - n/N)$.
- In stratified sampling, the estimate $\bar Y_i$ of each stratum mean $m_i$ has variance $\frac{\tilde\sigma_i^2}{n_i}(1 - n_i/N_i)$.
- The stratified estimator $\sum_{i=1}^{h} \frac{N_i}{N}\bar Y_i$ of $m$ therefore has variance
$$\sum_{i=1}^{h} \left(\frac{N_i}{N}\right)^2 \frac{\tilde\sigma_i^2}{n_i}\left(1 - \frac{n_i}{N_i}\right).$$
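The stratified-variance formula transcribes directly into code; the function name and the strata data are ours, chosen for illustration.

```python
def stratified_variance(strata, n_alloc):
    """Sum over strata of (Ni/N)^2 * s2i/ni * (1 - ni/Ni).

    strata: list of (Ni, sigma2_tilde_i) pairs; n_alloc: list of ni.
    """
    N = sum(Ni for Ni, _ in strata)
    return sum((Ni / N) ** 2 * s2 / ni * (1 - ni / Ni)
               for (Ni, s2), ni in zip(strata, n_alloc))

# Hypothetical strata: (size, within-stratum quasi-variance).
strata = [(600, 4.0), (400, 9.0)]
print(stratified_variance(strata, n_alloc=[60, 40]))
```

Here the allocation is proportional ($n_i = n N_i/N$ with $n = 100$), the case compared against simple random sampling below.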
With proportional allocation, $n_i = n N_i/N$, the difference with respect to simple random sampling is
$$\frac{\tilde\sigma^2}{n}\left(1 - \frac{n}{N}\right) - \sum_{i=1}^{h}\left(\frac{N_i}{N}\right)^2 \frac{\tilde\sigma_i^2}{n_i}\left(1 - \frac{n_i}{N_i}\right) = \left(1 - \frac{n}{N}\right)\frac{1}{n}\left[\sum_{i=1}^{h}\left(\frac{N_i - 1}{N - 1} - \frac{N_i}{N}\right)\tilde\sigma_i^2 + \sum_{i=1}^{h}\frac{N_i}{N-1}(m_i - m)^2\right].$$
The first sum is negligible for large $N$, so the gain is essentially $\left(1 - \frac{n}{N}\right)\frac{1}{n}\sum_{i=1}^{h}\frac{N_i}{N-1}(m_i - m)^2 \geq 0$: the more the stratum means differ, the more stratification helps.
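The identity above can be checked numerically on arbitrary data (a sketch; the strata values are made up):

```python
# Two made-up strata; proportional allocation gives integer sizes here.
strata = [[1.0, 2.0, 6.0, 3.0], [10.0, 14.0, 12.0, 8.0, 11.0, 9.0]]
Ns = [len(s) for s in strata]
N = sum(Ns)
n = 5
ns = [n * Ni / N for Ni in Ns]            # proportional allocation: 2 and 3

pop = [v for s in strata for v in s]
m = sum(pop) / N
s2 = sum((v - m) ** 2 for v in pop) / (N - 1)          # sigma2_tilde
ms = [sum(s) / len(s) for s in strata]                 # stratum means
s2s = [sum((v - mi) ** 2 for v in s) / (len(s) - 1)    # sigma2_tilde_i
       for s, mi in zip(strata, ms)]

lhs = s2 / n * (1 - n / N) - sum(
    (Ni / N) ** 2 * s2i / ni * (1 - ni / Ni)
    for Ni, s2i, ni in zip(Ns, s2s, ns))
rhs = (1 - n / N) / n * (
    sum(((Ni - 1) / (N - 1) - Ni / N) * s2i for Ni, s2i in zip(Ns, s2s))
    + sum(Ni / (N - 1) * (mi - m) ** 2 for Ni, mi in zip(Ns, ms)))
print(lhs, rhs)
```

The two sides agree up to floating-point rounding, and the difference is positive: for this population stratification does reduce the variance.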
Abraham Wald (1902-1950)
Hungarian-born. Graduated (Ph.D. Mathematics) from University of Vienna, 1931.
Never do!
- Do not let the survey taker choose the units.