Outline

Ratio Estimator with Simple Random Sampling.


Small population example illustrating bias.
Derivations and Approximations for the Ratio Estimator.
Compare the two estimates ȳ and µ̂r under simple random sample design.
Finite-Population Central Limit Theorem for the Ratio Estimator.
Ratio estimation with unequal probability design.

STAT 440 Chapter 7:


Auxiliary Data and Ratio Estimation.

Natalia Tchetcherina

February 26.

Natalia Tchetcherina STAT 440 Chapter 7: Auxiliary Data and Ratio Estimation.

1 Ratio Estimator with Simple Random Sampling.

2 Small population example illustrating bias.

3 Derivations and Approximations for the Ratio Estimator.


4 Compare the two estimates ȳ and µ̂r under simple random sample design.

5 Finite-Population Central Limit Theorem for the Ratio Estimator.

6 Ratio estimation with unequal probability design.


Auxiliary Variables

In addition to the variable of interest $y_i$, one or more auxiliary
variables $x_i$ may be associated with the i-th unit of the population.
Example
1 $y_i$ is the number of animals in unit i;
  $x_i$ is the size of unit i.
2 $y_i$ is the sales of a given book title at the i-th bookstore;
  $x_i$ is the size of the i-th bookstore.
3 $y_i$ is the volume of the i-th tree;
  $x_i$ is the diameter of the i-th tree.


Auxiliary Variables.
Auxiliary information may be used either in the sampling design or
in estimation.
Example (use in the design)
1 Stratification based on vegetation type or elevation.
2 Sampling with replacement with selection probabilities
proportional to size of plot or size of tree.

Example (use in estimation)
1 Ratio estimator.
2 Regression estimator.

In some situations, the x-values may be known for the entire
population, while in other situations, the x-values are known only

The Ratio Estimator with Simple Random Sampling.


1 Suppose that the x-values are known for the whole population,
that the relationship between the x's and the y's is linear, and
that $y_i$ is zero when $x_i$ is zero.
Example
As plot size goes to zero, the number of animals on the plot will
almost certainly go to zero.
2 Let $\tau_x$ be the population total of the x's:
\[ \tau_x = \sum_{i=1}^{N} x_i . \]
Let $\mu_x$ be the population mean of the x's: $\mu_x = \tau_x / N$.
3 $\tau_x$ and $\mu_x$ are assumed known. The object of inference is to
estimate the population mean $\mu$ or total $\tau$ of the y-values.
4 For a simple random sample of n of the units, the sample
y-values are recorded along with the associated x-values.




The Ratio Estimator.


1 The population ratio R is
\[ R = \frac{\sum_{i=1}^{N} y_i}{\sum_{i=1}^{N} x_i} = \frac{\tau_y}{\tau_x} . \]
The sample ratio r is
\[ r = \frac{\sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} x_i} = \frac{\bar{y}}{\bar{x}} . \]

Definition
The ratio estimator of the population mean $\mu$ is $\hat{\mu}_r = r \mu_x$.
2 Since the ratio estimator is not unbiased, its mean square
error will be of interest for comparing its efficiency relative to
other estimators:
\[ \operatorname{mse}(\hat{\mu}_r) = E(\hat{\mu}_r - \mu)^2 . \]
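
The two definitions above can be sketched in a few lines of Python. The function name `ratio_estimate` is a hypothetical helper, not part of the lecture, and the sample values are an assumption: they are sites 1 and 2 of the fish example that appears later in these notes.

```python
def ratio_estimate(y_sample, x_sample, mu_x):
    """Return (r, mu_hat_r) for a simple random sample.

    r        = sum(y) / sum(x), the sample ratio
    mu_hat_r = r * mu_x, the ratio estimator of the population mean
    """
    if len(y_sample) != len(x_sample):
        raise ValueError("y and x samples must have the same length")
    r = sum(y_sample) / sum(x_sample)
    return r, r * mu_x


# Assumed illustration: sites 1 and 2 of the fish example (N = 4, tau_x = 23).
N, tau_x = 4, 23
r, mu_hat_r = ratio_estimate([200, 300], [4, 5], mu_x=tau_x / N)
tau_hat_r = N * mu_hat_r  # ratio estimate of the total, equal to r * tau_x
print(r, mu_hat_r, tau_hat_r)
```

Only the known population mean of the x's enters the estimator; the y-values are needed for the sampled units alone.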

The Mean Square Error of the Ratio Estimator.


1 For an unbiased estimator the mean square error equals the
variance, but for a biased estimator the mean square error is
\[ \operatorname{mse}(\hat{\mu}_r) = \operatorname{var}(\hat{\mu}_r) + \left(E(\hat{\mu}_r) - \mu\right)^2 . \]
2 With the ratio estimator, the squared bias is small relative to
the variance, so the first-order approximation to the mean
square error is the same as for the variance.
3 The approximate mean square error or variance of the ratio
estimator is
\[ \operatorname{var}(\hat{\mu}_r) \approx \left(\frac{N-n}{N}\right) \frac{\sigma_r^2}{n}, \quad \text{where} \quad \sigma_r^2 = \frac{1}{N-1} \sum_{i=1}^{N} (y_i - R x_i)^2 . \]
4 The ratio estimator thus tends to be more precise than the
sample mean of the y-values for populations for which $\sigma_r^2$ is
less than $\sigma^2$, the population variance of the y-values. This is
the case for populations for which the y's and x's are highly
correlated, with roughly a linear relationship through the origin.
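
As a rough numerical illustration of item 4, the sketch below compares $\sigma_r^2$ with $\sigma^2$. The helper names are hypothetical, and the data are an assumption: the four-site fish population (nets and fish) given later in these notes.

```python
def pop_variance(values):
    """Finite-population variance with divisor N - 1."""
    N = len(values)
    mean = sum(values) / N
    return sum((v - mean) ** 2 for v in values) / (N - 1)


def sigma_r2(y, x):
    """sigma_r^2 = sum (y_i - R x_i)^2 / (N - 1), with R = tau_y / tau_x."""
    R = sum(y) / sum(x)
    return sum((yi - R * xi) ** 2 for yi, xi in zip(y, x)) / (len(y) - 1)


def srs_factor(N, n):
    """(N - n) / (N n), the multiplier in the SRS variance formulas."""
    return (N - n) / (N * n)


# Assumed data: the fish population from the later example.
x = [4, 5, 8, 6]
y = [200, 300, 500, 400]
N, n = len(y), 2

approx_var_ratio = srs_factor(N, n) * sigma_r2(y, x)   # approximate var of mu_hat_r
var_sample_mean = srs_factor(N, n) * pop_variance(y)   # var of y_bar under SRS
print(approx_var_ratio, var_sample_mean)
```

For this population the residual variance is far smaller than the variance of the y-values, which is the situation in which the ratio estimator is expected to help.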

The Mean Square Error of the Ratio Estimator.


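1 The traditional estimator of the mean square error or variance
of the ratio estimator is
\[ \widehat{\operatorname{var}}(\hat{\mu}_r) = \left(\frac{N-n}{N}\right) \frac{s_r^2}{n}, \quad \text{where} \quad s_r^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - r x_i)^2 . \]
2 An approximate 100(1 − α)% confidence interval for $\mu$, based
on the normal approximation, is
\[ \hat{\mu}_r \pm t_{n-1}(\alpha/2) \sqrt{\widehat{\operatorname{var}}(\hat{\mu}_r)} . \]
3 The ratio estimate of the population total $\tau$ is $\hat{\tau}_r = N \hat{\mu}_r = r \tau_x$.
4 For estimating the population ratio R, the sample ratio r may
be used. Although not unbiased, it is approximately so with
large sample size.
5 The approximate variance is
\[ \operatorname{var}(r) \approx \left(\frac{N-n}{N \mu_x^2}\right) \frac{\sigma_r^2}{n} . \]
An estimate of this variance is
\[ \widehat{\operatorname{var}}(r) = \left(\frac{N-n}{N \mu_x^2}\right) \frac{s_r^2}{n} . \]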
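
The variance estimator and interval above might be computed as in the following sketch. The function `ratio_ci` is a hypothetical helper; the critical value `t_crit` is passed in by the caller (here 12.706, the tabulated upper 0.025 point of the t distribution with 1 degree of freedom), and the sample is an assumption: sites 1 and 3 of the fish example from later in these notes.

```python
import math


def ratio_ci(y_sample, x_sample, mu_x, N, t_crit):
    """Ratio estimate of the mean, its estimated variance, and a CI.

    var_hat = ((N - n) / N) * s_r^2 / n,
    s_r^2   = sum (y_i - r x_i)^2 / (n - 1).
    """
    n = len(y_sample)
    r = sum(y_sample) / sum(x_sample)
    mu_hat = r * mu_x
    s_r2 = sum((yi - r * xi) ** 2
               for yi, xi in zip(y_sample, x_sample)) / (n - 1)
    var_hat = (N - n) / N * s_r2 / n
    half_width = t_crit * math.sqrt(var_hat)
    return mu_hat, var_hat, (mu_hat - half_width, mu_hat + half_width)


# Assumed illustration: sites 1 and 3 of the fish example, N = 4, mu_x = 23/4.
mu_hat, var_hat, (lower, upper) = ratio_ci(
    [200, 500], [4, 8], mu_x=23 / 4, N=4, t_crit=12.706)
print(mu_hat, var_hat, lower, upper)
```

With n = 2 the interval is very wide; the example only shows the arithmetic, not a realistic sample size.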


Example. Estimation of the Total Number of Fish Caught.

The bias and mean square error under simple random sampling of
the ratio estimator can be illustrated by the sampling of a very
small population and looking at the sample space, that is, the set
of all possible samples.


Example. Estimation of the Total Number of Fish Caught.

Example
We would like to estimate the total number of fish caught, in a
given day, along a river on which fish are caught in nets fixed
adjacent to established fishing sites. Suppose that there are N = 4
sites along the river and suppose that the number of nets xi at
each site in the population is readily observed, say from an aircraft
flying the length of the river. The number yi of fish caught can be
supposed to be obtained only with more difficulty, by visiting a
given site at the end of the fishing day. A simple random sample of
n = 2 sites will be selected and ratio estimation used to estimate
the total number of fish caught.


Example. Estimation of the Total Number of Fish Caught.

Site, i        1    2    3    4
Nets, x_i      4    5    8    6
Fish, y_i    200  300  500  400

1 The actual population total is
$\tau = 200 + 300 + 500 + 400 = 1400$ fish caught.
2 The population total for the auxiliary variable, which is known
to the samplers, is $\tau_x = 4 + 5 + 8 + 6 = 23$ nets.
3 The number of possible samples is
\[ \binom{N}{n} = \binom{4}{2} = \frac{4!}{2!\,(4-2)!} = 6 . \]
4 Each possible sample has the same probability, $P(s) = 1/6$.
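
Because the sample space has only six points, the bias and mean square error can be computed by direct enumeration. The sketch below does this for the ratio estimator of the total, $\hat{\tau}_r = r \tau_x$, and, for comparison, for the expansion estimator $N \bar{y}$; the variable names are illustrative choices, not taken from the lecture.

```python
from itertools import combinations

x = [4, 5, 8, 6]            # nets at each site
y = [200, 300, 500, 400]    # fish caught at each site
N, n = len(x), 2
tau_x, tau = sum(x), sum(y)

ratio_estimates = []
expansion_estimates = []
for s in combinations(range(N), n):      # all 6 equally likely samples
    ys = [y[i] for i in s]
    xs = [x[i] for i in s]
    ratio_estimates.append(sum(ys) / sum(xs) * tau_x)   # r * tau_x
    expansion_estimates.append(N * sum(ys) / n)         # N * y_bar


def mean(values):
    return sum(values) / len(values)


expected_ratio = mean(ratio_estimates)
bias_ratio = expected_ratio - tau
mse_ratio = mean([(t - tau) ** 2 for t in ratio_estimates])

expected_expansion = mean(expansion_estimates)
var_expansion = mean([(t - tau) ** 2 for t in expansion_estimates])

print(expected_ratio, bias_ratio, mse_ratio)
print(expected_expansion, var_expansion)
```

The ratio estimator is slightly biased for this population, while $N \bar{y}$ is unbiased; even so, the mean square error of the ratio estimator is much smaller than the variance of $N \bar{y}$.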

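Estimating E(r).

1 An exact expression for the bias of r can be obtained as
follows. Since $r \bar{x} = \bar{y}$, with $E(\bar{y}) = \mu$ and $E(\bar{x}) = \mu_x$ under
simple random sampling,
\[ \operatorname{cov}(r, \bar{x}) = E(r \bar{x}) - E(r) E(\bar{x}) = \mu - \mu_x E(r) . \]
2 So
\[ E(r) = \frac{\mu - \operatorname{cov}(r, \bar{x})}{\mu_x} = R - \frac{\operatorname{cov}(r, \bar{x})}{\mu_x} . \]
3 Since the covariance of two random variables cannot exceed in
absolute value the product of their standard deviations,
\[ |E(r) - R| \le \frac{\sqrt{\operatorname{var}(r) \operatorname{var}(\bar{x})}}{\mu_x} , \]
so that
\[ \frac{|E(r) - R|}{\sqrt{\operatorname{var}(r)}} \le \frac{\sqrt{\operatorname{var}(\bar{x})}}{\mu_x} . \]
That is, the magnitude of the bias relative to the standard
deviation of the estimator is no greater than the coefficient of
variation of $\bar{x}$.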
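
The exact bias expression and the bound can be checked numerically by enumerating every sample. This is only a sketch: it assumes the four-site fish population from the earlier example, and the variable names are illustrative.

```python
import math
from itertools import combinations

x = [4, 5, 8, 6]
y = [200, 300, 500, 400]
N, n = len(x), 2
mu_x, mu = sum(x) / N, sum(y) / N
R = mu / mu_x

samples = list(combinations(range(N), n))   # equally likely under SRS
x_bars = [sum(x[i] for i in s) / n for s in samples]
r_vals = [sum(y[i] for i in s) / sum(x[i] for i in s) for s in samples]


def expect(values):
    """Expectation over the equally likely samples."""
    return sum(values) / len(values)


E_r = expect(r_vals)
E_xbar = expect(x_bars)
cov_r_xbar = expect([r * xb for r, xb in zip(r_vals, x_bars)]) - E_r * E_xbar
var_r = expect([(r - E_r) ** 2 for r in r_vals])
var_xbar = expect([(xb - E_xbar) ** 2 for xb in x_bars])

bias = E_r - R                                  # exact bias of r
bias_from_cov = -cov_r_xbar / mu_x              # same quantity via item 2
bound = math.sqrt(var_r * var_xbar) / mu_x      # bound from item 3
print(bias, bias_from_cov, bound)
```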


The Approximate Mean Square Error of the Ratio Estimator $\hat{\mu}_r$.

1 Use a linear approximation for the (nonlinear) function
$f(x, y) = y/x$. The linear approximating function is
$(y_i - R x_i)/\mu_x$, obtained as the first term in a Taylor series of
f about the point $(\mu_x, \mu)$.
2 Since $E(y_i) = \mu$, $E(x_i) = \mu_x$ and $R \mu_x = \mu$, the expected
value of $y_i - R x_i$ under simple random sampling is
\[ E(y_i - R x_i) = 0 . \]
3 Thus, the mean square error of the ratio estimator $\hat{\mu}_r$ is
approximated by the variance of the sample mean of the variables
$y_i - R x_i$, which under simple random sampling is
\[ \operatorname{var}(\hat{\mu}_r) \approx \left(\frac{N-n}{N}\right) \frac{\sigma_r^2}{n} . \]

The Taylor Series Expansion for r.

1 The Taylor series expansion of a function $g(x, y)$ about the point
$(a, b)$ is
\[ g(x, y) = g(a, b) + g_x(a, b)(x - a) + g_y(a, b)(y - b) + \tfrac{1}{2} g_{xx}(a, b)(x - a)^2 + g_{xy}(a, b)(x - a)(y - b) + \tfrac{1}{2} g_{yy}(a, b)(y - b)^2 + \cdots + \frac{1}{p!\,q!} g_{x^p y^q}(a, b)(x - a)^p (y - b)^q + \cdots \]
2 For the sample ratio $r = \bar{y}/\bar{x}$, let $g(x, y) = y/x$ and
approximate $r = g(\bar{x}, \bar{y})$ by expanding about the point
$(\mu_x, \mu_y)$:
\[ r = g(\bar{x}, \bar{y}) \approx \frac{\mu_y}{\mu_x} - \frac{\mu_y}{\mu_x^2} (\bar{x} - \mu_x) + \frac{1}{\mu_x} (\bar{y} - \mu_y) = R - \frac{\mu_y}{\mu_x^2} (\bar{x} - \mu_x) + \frac{1}{\mu_x} (\bar{y} - \mu_y) . \]
3 Under simple random sampling,
\[ E(r) \approx E\!\left(R - \frac{\mu_y}{\mu_x^2} (\bar{x} - \mu_x) + \frac{1}{\mu_x} (\bar{y} - \mu_y)\right) = E(R) - \frac{\mu_y}{\mu_x^2} E(\bar{x} - \mu_x) + \frac{1}{\mu_x} E(\bar{y} - \mu_y) = R . \]

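The approximate variance of r.

1 The approximate variance from the first-order approximation is
\[ \operatorname{var}(r) \approx \left(\frac{\mu_y}{\mu_x^2}\right)^2 \operatorname{var}(\bar{x}) + \frac{1}{\mu_x^2} \operatorname{var}(\bar{y}) - 2 \frac{\mu_y}{\mu_x^3} \operatorname{cov}(\bar{x}, \bar{y}) = \frac{1}{\mu_x^2} \operatorname{var}(\bar{y} - R \bar{x}) . \]
2 With simple random sampling,
\[ \operatorname{var}(\bar{x}) = \frac{N-n}{Nn} \sigma_x^2 , \qquad \operatorname{var}(\bar{y}) = \frac{N-n}{Nn} \sigma_y^2 , \qquad \operatorname{cov}(\bar{x}, \bar{y}) = \frac{N-n}{Nn} \sum_{i=1}^{N} \frac{(x_i - \mu_x)(y_i - \mu_y)}{N-1} . \]
3 Approximations using more than one term of Taylor's formula
are used in examining the bias, mean square error, and higher
moments of the ratio estimator.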
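
The equality between the term-by-term expression and the compact form $\operatorname{var}(\bar{y} - R\bar{x})/\mu_x^2$ can be verified numerically. The sketch below plugs the simple random sampling formulas from item 2 into both sides; the data are an assumption (the fish population from the earlier example) and the names are illustrative.

```python
x = [4, 5, 8, 6]
y = [200, 300, 500, 400]
N, n = len(x), 2
mu_x, mu_y = sum(x) / N, sum(y) / N
R = mu_y / mu_x
f = (N - n) / (N * n)   # SRS factor (N - n) / (N n)

# Finite-population variances and covariance, divisor N - 1.
sigma_x2 = sum((xi - mu_x) ** 2 for xi in x) / (N - 1)
sigma_y2 = sum((yi - mu_y) ** 2 for yi in y) / (N - 1)
sigma_xy = sum((xi - mu_x) * (yi - mu_y) for xi, yi in zip(x, y)) / (N - 1)

var_xbar, var_ybar, cov_bar = f * sigma_x2, f * sigma_y2, f * sigma_xy

# Left-hand side: the three first-order terms.
term_by_term = ((mu_y / mu_x ** 2) ** 2 * var_xbar
                + var_ybar / mu_x ** 2
                - 2 * mu_y / mu_x ** 3 * cov_bar)

# Right-hand side: var(y_bar - R x_bar) / mu_x^2, using sigma_r^2.
sigma_r2 = sum((yi - R * xi) ** 2 for xi, yi in zip(x, y)) / (N - 1)
compact = f * sigma_r2 / mu_x ** 2

print(term_by_term, compact)
```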


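Approximate variance of $\hat{\mu}_r$.

1 Since $\hat{\mu}_r = r \mu_x$ and
\[ \operatorname{var}(r) \approx \left(\frac{\mu_y}{\mu_x^2}\right)^2 \operatorname{var}(\bar{x}) + \frac{1}{\mu_x^2} \operatorname{var}(\bar{y}) - 2 \frac{\mu_y}{\mu_x^3} \operatorname{cov}(\bar{x}, \bar{y}) , \]
multiplying by $\mu_x^2$ and using $R = \mu_y/\mu_x$ gives
\[ \operatorname{var}(\hat{\mu}_r) \approx R^2 \operatorname{var}(\bar{x}) + \operatorname{var}(\bar{y}) - 2 R \operatorname{cov}(\bar{x}, \bar{y}) . \]
2 The corresponding estimate is
\[ \widehat{\operatorname{var}}(\hat{\mu}_r) \approx R^2 \frac{N-n}{Nn} s_x^2 + \frac{N-n}{Nn} s_y^2 - 2 R \operatorname{corr}(X, Y)\, s_x s_y \frac{N-n}{Nn} . \]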


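Compare the two estimates $\bar{y}$ and $\hat{\mu}_r$ under simple random sample design.

1 The ratio of the estimated variances is
\[ \frac{\widehat{\operatorname{var}}(\bar{y})}{\widehat{\operatorname{var}}(\hat{\mu}_r)} = \frac{\frac{N-n}{Nn} s_y^2}{R^2 \frac{N-n}{Nn} s_x^2 + \frac{N-n}{Nn} s_y^2 - 2 R \operatorname{corr}(X, Y)\, s_x s_y \frac{N-n}{Nn}} = \frac{s_y^2}{s_y^2 + R^2 s_x^2 - 2 R \operatorname{corr}(X, Y)\, s_x s_y} . \]
2 $\widehat{\operatorname{var}}(\bar{y}) > \widehat{\operatorname{var}}(\hat{\mu}_r)$ if $R^2 s_x^2 - 2 R \operatorname{corr}(X, Y)\, s_x s_y < 0$, or
\[ \operatorname{corr}(X, Y) > \frac{1}{2} \, \frac{s_x/\mu_x}{s_y/\mu_y} . \]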
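
The condition in item 2 compares the correlation with half the ratio of the two coefficients of variation. A small sketch of the check follows. As an assumption for illustration it uses the full fish population from the earlier example, so the population standard deviations stand in for the sample quantities $s_x$ and $s_y$.

```python
import math

x = [4, 5, 8, 6]
y = [200, 300, 500, 400]
N = len(x)
mu_x, mu_y = sum(x) / N, sum(y) / N
R = mu_y / mu_x

# Standard deviations and covariance with divisor N - 1.
s_x = math.sqrt(sum((xi - mu_x) ** 2 for xi in x) / (N - 1))
s_y = math.sqrt(sum((yi - mu_y) ** 2 for yi in y) / (N - 1))
s_xy = sum((xi - mu_x) * (yi - mu_y) for xi, yi in zip(x, y)) / (N - 1)

rho = s_xy / (s_x * s_y)                        # corr(X, Y)
threshold = 0.5 * (s_x / mu_x) / (s_y / mu_y)   # (1/2) * cv(x) / cv(y)
ratio_is_better = rho > threshold

# Equivalent form of the condition from the slide.
lhs = R ** 2 * s_x ** 2 - 2 * R * rho * s_x * s_y
print(rho, threshold, ratio_is_better, lhs)
```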


Finite-Population Central Limit Theorem for the Ratio Estimator.

The finite-population CLT for the ratio estimator is proved in Scott
and Wu.
Law
Consider a sequence of populations, with both population size
N and sample size n increasing. Then the standardized ratio
estimator
\[ \frac{\hat{\mu}_r - \mu}{\sqrt{\widehat{\operatorname{var}}(\hat{\mu}_r)}} \]
has an asymptotic standard normal distribution as n and N − n
tend to infinity, provided that certain conditions are satisfied. Two
of them require that the proportion of $\sigma_r^2$ due to outliers should
not be too large and the coefficient of variation of $\bar{x}$ should not be
too large.

Definition of Generalized Ratio Estimator.


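1 Let $\pi_i$ be the probability that unit i is included in the sample.
2 Let the variable of interest y have a linear relationship through
the origin with an auxiliary variable x.
Definition
The generalized ratio estimator is
\[ \hat{\tau}_G = \frac{\hat{\tau}_y}{\hat{\tau}_x} \tau_x . \]
The components $\hat{\tau}_y$ and $\hat{\tau}_x$ are Horvitz-Thompson estimators of $\tau$
and $\tau_x$ respectively,
\[ \hat{\tau}_y = \sum_{i=1}^{\nu} \frac{y_i}{\pi_i} \quad \text{and} \quad \hat{\tau}_x = \sum_{i=1}^{\nu} \frac{x_i}{\pi_i} , \]
where $\nu$ is the number of distinct units in the sample.
3 Note that the usual ratio estimator is a special case of $\hat{\tau}_G$
under simple random sampling, in which $\pi_i = n/N$ for all
units.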
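
A minimal sketch of the definition, with hypothetical helper names. The usage lines illustrate item 3: with $\pi_i = n/N$ the estimator reduces to $r \tau_x$. The sample used is an assumption (sites 1 and 2 of the fish example from earlier in these notes).

```python
def horvitz_thompson_total(values, pi):
    """Horvitz-Thompson estimate: sum of value_i / pi_i over distinct units."""
    return sum(v / p for v, p in zip(values, pi))


def generalized_ratio_total(y, x, pi, tau_x):
    """tau_hat_G = (tau_hat_y / tau_hat_x) * tau_x."""
    tau_hat_y = horvitz_thompson_total(y, pi)
    tau_hat_x = horvitz_thompson_total(x, pi)
    return tau_hat_y / tau_hat_x * tau_x


# Special case: simple random sampling, pi_i = n / N for every unit.
N, n = 4, 2
y_s, x_s = [200, 300], [4, 5]
tau_hat_G = generalized_ratio_total(y_s, x_s, [n / N] * n, tau_x=23)
usual_ratio = sum(y_s) / sum(x_s) * 23   # r * tau_x
print(tau_hat_G, usual_ratio)
```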

Properties of Generalized Ratio Estimator.

1 τ̂G is not unbiased.


2 It is recommended in cases in which the y -values are roughly
proportional to the x-values, so that the variance of the
residuals yi − Rxi is much smaller than the variance of the
y -values themselves.


An Approximate Formula for the MSE of the Generalized Ratio Estimator.

1 Use the variance formula for the Horvitz-Thompson estimator
with $y_i - R x_i$ as the variable of interest, where $R = \tau_y/\tau_x$ is
the population ratio. For estimating the variance, the
corresponding Horvitz-Thompson formula with the estimate
$\hat{R} = \hat{\tau}_y/\hat{\tau}_x$ can be used.
2 We can rewrite the generalized ratio estimator as
\[ \hat{\tau}_G = \hat{\tau}_y + \hat{R} (\tau_x - \hat{\tau}_x) . \]
The first term in Taylor's formula, expanding about the point
$(\tau_x, \tau)$, gives the approximation
\[ \hat{\tau}_G \approx \hat{\tau}_y + R (\tau_x - \hat{\tau}_x) . \]


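An Approximate Formula for the MSE of the Generalized Ratio Estimator.

1 The MSE or variance of $\hat{\tau}_G$ may be approximated by
substituting the population ratio R for the estimate $\hat{R}$, giving
\[ \hat{\tau}_G - \tau \approx \hat{\tau}_y + R (\tau_x - \hat{\tau}_x) - \tau = \hat{\tau}_y - R \hat{\tau}_x , \]
since $R \tau_x = \tau$.
2 Since $E(\hat{\tau}_y - R \hat{\tau}_x) = 0$, we get
\[ \operatorname{var}(\hat{\tau}_G) \approx E(\hat{\tau}_y - R \hat{\tau}_x)^2 = \operatorname{var}(\hat{\tau}_y - R \hat{\tau}_x) = \operatorname{var}\!\left(\sum_{i=1}^{\nu} \frac{y_i - R x_i}{\pi_i}\right) . \]
3 The approximate variance is thus the variance of a
Horvitz-Thompson estimator based on the variables $y_i - R x_i$.
Denote $y'_i = y_i - R x_i$. The approximate formula is
\[ \operatorname{var}(\hat{\tau}_G) \approx \sum_{i=1}^{N} \frac{1 - \pi_i}{\pi_i} \, {y'_i}^2 + \sum_{i=1}^{N} \sum_{j \ne i} \frac{\pi_{ij} - \pi_i \pi_j}{\pi_i \pi_j} \, y'_i y'_j . \]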


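An Approximate Formula for the MSE of the Generalized Ratio Estimator.

1 An estimator of this variance is obtained using $\hat{y}_i = y_i - \hat{R} x_i$ in
the Horvitz-Thompson variance estimation formula:
\[ \widehat{\operatorname{var}}(\hat{\tau}_G) = \sum_{i=1}^{\nu} \frac{1 - \pi_i}{\pi_i^2} \, \hat{y}_i^2 + \sum_{i=1}^{\nu} \sum_{j \ne i} \frac{\pi_{ij} - \pi_i \pi_j}{\pi_i \pi_j} \, \frac{\hat{y}_i \hat{y}_j}{\pi_{ij}} = \sum_{i=1}^{\nu} \left(\frac{1}{\pi_i^2} - \frac{1}{\pi_i}\right) \hat{y}_i^2 + 2 \sum_{i=1}^{\nu} \sum_{j < i} \left(\frac{1}{\pi_i \pi_j} - \frac{1}{\pi_{ij}}\right) \hat{y}_i \hat{y}_j . \]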
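
The second form of the estimator translates fairly directly into code. The sketch below uses hypothetical function names and expects the joint inclusion probabilities as a square table `pij`. As a sanity check under stated assumptions, it is run on a simple random sample (sites 1 and 2 of the fish example), where $\pi_i = n/N$ and $\pi_{ij} = n(n-1)/(N(N-1))$, and compared with $N^2$ times the traditional estimator $((N-n)/N)\, s_r^2/n$.

```python
def ht_variance_estimate(resid, pi, pij):
    """Horvitz-Thompson variance estimate for the values in `resid`.

    pi[i]     : inclusion probability of sampled unit i
    pij[i][j] : joint inclusion probability of units i and j (i != j)
    """
    v = len(resid)
    total = sum((1 / pi[i] ** 2 - 1 / pi[i]) * resid[i] ** 2 for i in range(v))
    for i in range(v):
        for j in range(i):   # pairs with j < i, counted twice via the factor 2
            total += 2 * (1 / (pi[i] * pi[j]) - 1 / pij[i][j]) * resid[i] * resid[j]
    return total


def generalized_ratio_variance_estimate(y, x, pi, pij):
    """Apply the HT variance estimate to y_hat_i = y_i - R_hat * x_i."""
    R_hat = sum(yi / p for yi, p in zip(y, pi)) / sum(xi / p for xi, p in zip(x, pi))
    resid = [yi - R_hat * xi for yi, xi in zip(y, x)]
    return ht_variance_estimate(resid, pi, pij)


# Assumed check case: simple random sampling with N = 4, n = 2.
N, n = 4, 2
y_s, x_s = [200, 300], [4, 5]
pi = [n / N] * n
p2 = n * (n - 1) / (N * (N - 1))
pij = [[None, p2], [p2, None]]   # diagonal entries are never used
v_hat = generalized_ratio_variance_estimate(y_s, x_s, pi, pij)

# Traditional SRS estimator for the total: N^2 * ((N - n) / N) * s_r^2 / n.
r = sum(y_s) / sum(x_s)
s_r2 = sum((yi - r * xi) ** 2 for yi, xi in zip(y_s, x_s)) / (n - 1)
v_srs = N ** 2 * (N - n) / N * s_r2 / n
print(v_hat, v_srs)
```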


Ratio Estimation with Unequal Probability Designs. Example.

Example
Large mammals in open habitat are often surveyed from aircraft.
As the aircraft flies over the selected strip, all animals of the
species within a prescribed distance of the aircraft path are
counted, the distance sometimes being determined by markers on
the wing struts of the aircraft. Because of irregularities in the
shape of the study area, the strips to be flown may be of varying
lengths. One may select units (strips) with probability proportional
to their lengths by randomly selecting n points on a map of the
study region and including in the sample any strip that contains a
selected point. The draw-by-draw selection probability for any strip
equals its area (1× length of strip) divided by the area of the study

Ratio Estimation with Unequal Probability Designs. Example (continued).

Example
A strip is selected more than once if it contains more than one of
the points selected. A sample of size n = 4 was selected from a
population of N = 100 units.

Sample Observations.
y_i   Length   p_i
 60      5     0.05
 60      5     0.05
 14      2     0.02
  1      1     0.01
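
One way the generalized ratio estimator might be applied to these observations is sketched below. Several choices here are assumptions, not stated on the slide: the two identical rows are treated as one strip drawn twice, so there are three distinct units; the auxiliary variable is taken to be strip length; and the total $\tau_x = 100$ is inferred from $p_i = \text{length}_i / \tau_x$ (for example 5 / 0.05). The inclusion probabilities use the standard relation $\pi_i = 1 - (1 - p_i)^n$ for n independent draws with replacement.

```python
n_draws = 4

# Distinct strips only (assumption: rows 1 and 2 are the same strip).
y = [60, 14, 1]          # animals counted
length = [5, 2, 1]       # strip lengths, used as the auxiliary variable (assumption)
p = [0.05, 0.02, 0.01]   # draw-by-draw selection probabilities

# Inclusion probability: chance the strip is hit by at least one of n points.
pi = [1 - (1 - pk) ** n_draws for pk in p]

# Horvitz-Thompson estimates of the two totals.
tau_hat_y = sum(yk / pik for yk, pik in zip(y, pi))
tau_hat_x = sum(xk / pik for xk, pik in zip(length, pi))

tau_x = 100              # assumed known total length, inferred from p_i = length_i / tau_x
tau_hat_G = tau_hat_y / tau_hat_x * tau_x
print(pi, tau_hat_y, tau_hat_x, tau_hat_G)
```

A different auxiliary variable would change $\hat{\tau}_x$ and $\tau_x$ but not the structure of the calculation.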

Next time

Read Chapter 8 from Thompson.
