
CHAPTER 7
ESTIMATION

[Diagram: Statistical Methods divide into Descriptive Statistics and
Inferential Statistics; Inferential Statistics comprises Estimation and
Hypothesis Testing.]
Estimation Process

[Diagram: a random sample (sample mean X̄ = 50) is drawn from a population
whose mean µ is unknown, leading to the statement "I am 95% confident that
µ is between 40 & 60."]

ESTIMATION
 Estimation is the process of using information from the sample (often
X̄ and s) to make inferences about population parameters such as µ. This
reverses the process examined so far, where we used information about µ
and σ to make probability statements about X̄.

 An estimator is a statistic that specifies how to use the sample data
to estimate an unknown parameter of the population.

 Note that an estimator is a random variable that takes many possible
values.

 The sample mean is a natural candidate for an estimator of the
population mean.
Point Estimators
 In attempting to obtain point estimates of population parameters, the
following questions arise:

 What is a point estimate of the population mean?

 How good an estimate do we obtain through the methodology we follow?

 Example: We said that the sample mean is a good estimate of the
population mean.
 The sample mean is an estimator.
 A particular value of the sample mean is an estimate.

 An estimator is a rule which tells us how to use the sample data to
obtain an estimate. Since there are many possible estimators, we need
some criteria to differentiate between good and bad estimators.

 Two types of estimate are illustrated: point and interval estimates.


 A point estimate provides a single value, a best guess at the truth.

 An interval estimate gives a range around the point estimate and


gives some idea of the reliability of the estimate.
 Question: Is there a unique estimator for a population parameter? For
example, is there only one estimator for the population mean?

 The answer is that there may be many possible estimators

 Those estimators must be ranked in terms of some desirable properties


that they should exhibit.

 The choice of point estimator is based on the following criteria

 Unbiasedness
 Efficiency
 Consistency
Rules and criteria - Bias
 An unbiased estimator is one which, on average, gives the right
answer. The expected value of the estimator (i.e. averaged over many
applications of the estimator) is equal to the population parameter.

 A point estimator θ̂ is said to be an unbiased estimator of the
population parameter θ if its expected value (the mean of its sampling
distribution) is equal to the population parameter it is trying to
estimate:

E(θ̂) = θ

The sample mean, variance and proportion are unbiased estimators of the
corresponding population parameters.

Generally speaking, the sample standard deviation is not an unbiased
estimator of the population standard deviation.

We can also define the bias of an estimator as follows:

Bias(θ̂) = E(θ̂) − θ
Properties of Point Estimators
 Usually, if there is an unbiased estimator of a population parameter,
several others exist as well.

 To select the “best unbiased” estimator, we use the criterion of


efficiency.

 An unbiased estimator is efficient if no other unbiased estimator of the


particular population parameter has a lower sampling distribution
variance.
Properties of Point Estimators
 If θ̂1 and θ̂2 are two unbiased estimators of the population
parameter θ, then θ̂1 is more efficient than θ̂2 if:

V(θ̂1) < V(θ̂2)

 The unbiased estimator of a population parameter with the lowest


variance out of all unbiased estimators is called the most efficient or
minimum variance unbiased estimator.

 We say that an estimator is consistent if the probability of obtaining


estimates close to the population parameter increases as the sample
size increases.
Unbiased estimators
 Which of the following estimators of µ are unbiased?
 a) use x̄
 b) use a random xi in the sample
 c) use the smallest observation in the sample
 d) use x̄ + 1/n
 (a) and (b) are unbiased. (b) is self-evident since E(xi) = µ for all i (the expected
value of a single observation is the mean). For the mean:

E(x̄) = E((x1 + x2 + … + xn)/n) = (1/n)(E(x1) + E(x2) + … + E(xn))
     = (1/n)(µ + µ + … + µ) = µ

 Since E(xs) < E(x̄), where xs is the smallest sample observation,
(c) is biased downwards.
 For (d), E(x̄ + 1/n) = µ + 1/n > µ, so (d) is biased upwards.
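These claims are easy to check by simulation. The sketch below (an arbitrary N(10, 2²) population, chosen only for illustration) averages each of the four estimators over many repeated samples:

```python
import random

random.seed(0)
mu, sigma, n, reps = 10.0, 2.0, 25, 20000

sum_mean = sum_single = sum_min = sum_shifted = 0.0
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(sample) / n
    sum_mean += xbar                # (a) sample mean
    sum_single += sample[0]        # (b) a single observation
    sum_min += min(sample)         # (c) smallest observation
    sum_shifted += xbar + 1.0 / n  # (d) x-bar + 1/n

print(sum_mean / reps)     # close to mu = 10 (unbiased)
print(sum_single / reps)   # close to mu = 10 (unbiased)
print(sum_min / reps)      # well below 10 (biased downwards)
print(sum_shifted / reps)  # above the average of (a) by exactly 1/n
```

The averages of (a) and (b) settle near µ, while (c) sits well below it and (d) sits 1/n above the sample-mean average, matching the algebra above.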
EXAMPLES
 The sample mean X̄ and sample variance S² are unbiased estimators of
the population mean µ and the population variance σ², respectively.

 We've seen previously that the sample mean is an unbiased estimator
for the population mean.

 The sample variance is S² = Σ(xi − X̄)² / (n − 1).

 We also know that E(X̄²) = V(X̄) + [E(X̄)]² = σ²/n + µ².
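A small simulation illustrates why the n − 1 divisor matters: dividing the sum of squared deviations by n − 1 averages out close to σ², while dividing by n underestimates it by the factor (n − 1)/n. The population values below are arbitrary choices for illustration.

```python
import random

random.seed(1)
mu, sigma2, n, reps = 0.0, 4.0, 10, 20000

avg_s2 = avg_biased = 0.0
for _ in range(reps):
    x = [random.gauss(mu, sigma2 ** 0.5) for _ in range(n)]
    xbar = sum(x) / n
    ss = sum((xi - xbar) ** 2 for xi in x)
    avg_s2 += ss / (n - 1) / reps   # divide by n-1: unbiased for sigma^2
    avg_biased += ss / n / reps     # divide by n: biased low

print(avg_s2)      # close to sigma^2 = 4
print(avg_biased)  # close to (n-1)/n * 4 = 3.6
```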


EXAMPLE
 Among all unbiased estimators, we choose the most efficient estimator,
called the minimum variance unbiased estimator (MVUE). The MVUE is the
unbiased estimator with the smallest variance, and hence the most
efficient estimator. An efficient estimator θ̂ will produce estimates
closer to the true parameter θ. X̄ is the MVUE for µ.
HOMEWORK
 7.1
 7.2
 7.3
Unknown Population Parameters Are Estimated

Estimate Population Parameter...   with Sample Statistic
Mean          µ                    x̄
Variance      σ²                   s²
Differences   µ1 − µ2              x̄1 − x̄2
Confidence Intervals - The Single Sample Case
 After learning something about the mean and the variance of an
estimator θ̂, we would like to know how small the distance between the
estimator θ̂ and the parameter θ is likely to be.

 To find this out, we need to discuss confidence intervals.

 Intuitively, a confidence interval is the range where you expect
something to be. By saying "expect" you leave open the possibility of
being wrong. The degree of confidence measures the probability that the
expectation is true.
Confidence Intervals
 The interval is calculated with the use of information from a sample.
You have seen the form of a confidence interval before, although it was
not called such. When we have large samples, we know that the sampling
distribution of the sample mean is approximately normal.
 Therefore we can use the Empirical Rule:
 In 68% of samples, the sample mean will fall within µ ± σ/√n.
 In 95% of samples, the sample mean will fall within µ ± 2σ/√n.
 In 99.7% of samples, the sample mean will fall within µ ± 3σ/√n.
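A quick simulation (arbitrary population with µ = 50, σ = 12, and samples of size n = 36) confirms these coverage rates for the sample mean:

```python
import random

random.seed(2)
mu, sigma, n, reps = 50.0, 12.0, 36, 10000
se = sigma / n ** 0.5  # standard deviation of the sample mean

hits = [0, 0, 0]
for _ in range(reps):
    xbar = sum(random.gauss(mu, sigma) for _ in range(n)) / n
    for k in (1, 2, 3):
        if abs(xbar - mu) <= k * se:
            hits[k - 1] += 1

print([h / reps for h in hits])  # roughly [0.68, 0.95, 0.997]
```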
Idea

If θ̂ is an estimator of θ which has a known sampling distribution, and
we can find two quantities that depend on θ̂, say g1(θ̂) and g2(θ̂),
such that

 P[g1(θ̂) ≤ θ ≤ g2(θ̂)] = 1 − α, for some positive α,

then we can say that (g1(θ̂), g2(θ̂)) forms an interval that has
probability (1 − α) of capturing the true θ.
The interval above is referred to as a confidence interval, with
confidence coefficient (1 − α).
 g1(θ̂) - Lower Confidence Limit (LCL)
 g2(θ̂) - Upper Confidence Limit (UCL)


Large-Sample Confidence Intervals for a Population Mean

Recall that Z = (X̄ − µ) / (σ/√n).

 Take a random sample of size n from a population that has UNKNOWN
mean µ and known s.d. σ.

 By the Central Limit Theorem (CLT), the sampling distribution of the
sample mean X̄ based on a random sample of size n is approximately
Normal with mean µ and s.d. σ/√n for large sample size (n ≥ 30).

 Therefore, P(−zα/2 ≤ Z ≤ +zα/2) = 1 − α
Derivations

1 − α = P(−zα/2 ≤ (X̄ − µ)/(σ/√n) ≤ +zα/2)
      = P(−zα/2 · σ/√n ≤ X̄ − µ ≤ +zα/2 · σ/√n)
      = P(X̄ − zα/2 · σ/√n ≤ µ ≤ X̄ + zα/2 · σ/√n)

Therefore, the interval (X̄ − zα/2 · σ/√n, X̄ + zα/2 · σ/√n)
is a large-sample confidence interval for µ with confidence coefficient
(1 − α).
How to construct Confidence Intervals
 Select a confidence level. The confidence level refers to the likelihood that the true
population parameter lies within the range specified by the confidence interval. The
confidence level is usually expressed as a percentage. Thus, a 95% confidence level
suggests that the probability that the true population parameter lies within the
confidence interval is 0.95.
 Compute alpha. Alpha refers to the likelihood that the true population parameter lies
outside the confidence interval. Alpha is usually expressed as a proportion. Thus, if
the confidence level is 95%, then alpha equals 1 − 0.95 = 0.05.
 Identify a sample statistic (e.g., mean, standard deviation) to serve as a point estimate
of the population parameter.
 Specify the sampling distribution of the statistic.
 Based on the sampling distribution of the statistic, find the value for which the
cumulative probability is 1 – α/2. That value is the upper limit of the confidence
interval.
 In a similar way, find the value for which the cumulative probability is alpha/2. That
value is the lower limit of the confidence interval.
Large-Sample Confidence Interval for a Population Mean

 The confidence coefficient is equal to 1 − α, and is split between the
two tails of the distribution.

zα/2 Values for Commonly Used Confidence Levels:
 90% of samples fall within µ ± 1.645 σx̄
 95% of samples fall within µ ± 1.96 σx̄
 99% of samples fall within µ ± 2.58 σx̄
Example
 Example: Suppose we draw a sample of 100 observations of returns on
the Gini index, assumed to be normally distributed, with sample mean 4%
and standard deviation 6%.

 What is the 95% confidence interval for the population mean?

 The standard error is .06/√100 = .006

 The confidence interval is .04 ± 1.96(.006), i.e. (.0282, .0518)
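The arithmetic of this example can be checked directly, using Python's standard library for the z-value:

```python
from statistics import NormalDist

xbar, s, n = 0.04, 0.06, 100
se = s / n ** 0.5                # standard error = 0.006
z = NormalDist().inv_cdf(0.975)  # z_{0.025}, about 1.96

lower, upper = xbar - z * se, xbar + z * se
print(round(se, 4), round(lower, 4), round(upper, 4))
# 0.006, with the interval roughly (0.0282, 0.0518)
```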


Definitions
 The sample size for establishing a confidence interval of the form
X̄ ± B with confidence coefficient (1 − α) is given by

n = (zα/2 · σ / B)²

 Example. It is desired to estimate the average distance traveled to work for
employees of a large manufacturing firm. Past studies of this type indicate that
the standard deviation of these distances should be in the neighborhood of 2
miles. How many employees should be sampled if the estimate is to be within
0.1 mile of the true average, with confidence coefficient 0.95?
Solution

The resulting interval is to be of the form X̄ ± 0.1, with 1 − α = 0.95.
Thus, B = 0.1 and z0.025 = 1.96. It follows that

n = [(zα/2 σ)/B]² = [(1.96 × 2)/0.1]² = 1,536.64, or 1,537.

 Thus 1,537 employees would have to be sampled to achieve the desired
result. If a sample of this size is too costly, then either B must be
increased or (1 − α) must be decreased.
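The same computation in Python, rounding up since a sample size must be an integer:

```python
from math import ceil
from statistics import NormalDist

sigma, B = 2.0, 0.1              # planning s.d. and target bound
z = NormalDist().inv_cdf(0.975)  # z_{0.025}, about 1.96

n = (z * sigma / B) ** 2
print(n, ceil(n))  # about 1536.6, rounded up to 1537
```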
One-sided confidence intervals
 The confidence intervals discussed so far have been two-sided
intervals. Sometimes we are interested in only a lower or upper limit
for a parameter, but not both. A one-sided confidence interval can be
formed in such cases.

 Suppose we want to study the mean life length of batteries, and only
the lower limit of the mean is of interest.

We need to find a statistic g(X̄) so that P[µ > g(X̄)] = 1 − α.

 Again using the definition of Z, we have

P(X̄ − zα · σ/√n ≤ µ) = 1 − α

 so the lower confidence limit is X̄ − zα · σ/√n.
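A sketch of the one-sided lower limit, with made-up battery numbers (x̄ = 52 hours, σ = 6, n = 36, chosen purely for illustration). Note that all of α goes into one tail, so zα replaces zα/2:

```python
from statistics import NormalDist

# Hypothetical numbers: 36 batteries, mean life 52 hours, known sigma = 6
xbar, sigma, n, alpha = 52.0, 6.0, 36, 0.05
z = NormalDist().inv_cdf(1 - alpha)  # z_alpha, about 1.645 (one tail)

lower = xbar - z * sigma / n ** 0.5
print(round(lower, 2))  # roughly 50.36 hours
```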
Binomial Distribution - Large Sample CI for p
 Estimating the parameter p of the Binomial distribution is analogous
to estimating a population mean.

 We've seen in Chapter 3 that a random variable Y having a binomial
distribution with n trials can be written as Y = ∑ Xi (i = 1, …, n),
where the Xi are independent Bernoulli random variables with common
mean p.

 Therefore, E(Y) = np, or E(Y/n) = p, and we say that Y/n is an
unbiased estimator of p. Because V(Y) = np(1 − p), we have
V(Y/n) = p(1 − p)/n.

 Note that Y/n = ∑ Xi/n = X̄, with Xi having E(Xi) = p and
V(Xi) = p(1 − p). From the CLT, we have that Y/n is approximately
normally distributed with mean p and variance p(1 − p)/n.
Binomial Distribution - Large Sample CI for p
 The confidence interval is constructed in a similar manner as the
confidence interval for µ. The observed fraction of successes y/n will
be used as the estimate of p and will be written as p̂.

 The CI for p is of the form p̂ ± zα/2 √(p̂(1 − p̂)/n), with
confidence (1 − α).

 Example. Suppose that 20 out of 100 selected samples of water tested
for microorganisms show presence of a particular organism. Estimate the
true probability p of finding this microorganism in a sample of the
same volume, with a 90% confidence interval.
Binomial Distribution - Large Sample CI for p
 We have that 1 − α = 0.90 and zα/2 = 1.645, y = 20 and n = 100.

 Therefore the interval estimate of p is

 0.20 ± 1.645 × sqrt[(0.2 × 0.8)/100],

 which is 0.20 ± 0.066.

 We are about 90% confident that the true probability p is somewhere
between 0.134 and 0.266.
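The same interval computed in Python:

```python
from statistics import NormalDist

y, n, alpha = 20, 100, 0.10
p_hat = y / n
z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{0.05}, about 1.645

half = z * (p_hat * (1 - p_hat) / n) ** 0.5
print(round(p_hat - half, 3), round(p_hat + half, 3))
# roughly 0.134 and 0.266
```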
Normal Distribution - Confidence Interval for µ
 Two problems are presented by sample sizes of less than 30:

 The CLT no longer applies.

 The population standard deviation is almost always unknown, and s may
provide a poor estimate of it when n is small.

 If we can assume that the sampled population is approximately normal,
then the sampling distribution of X̄ can be assumed to be approximately
normal.

 Instead of z = (x̄ − µ)/(σ/√n) we use t = (x̄ − µ)/(s/√n).

 This t is referred to as the t-statistic.
Normal Distribution - Confidence Interval for µ
 The t-statistic has:
 a sampling distribution very similar to z
 variability dependent on n, the sample size.

 Variability is expressed as (n − 1) degrees of freedom (df). As df
gets smaller, variability increases.

 The exact CI for µ is then x̄ ± tα/2 (s/√n),
basing tα/2 on n − 1 degrees of freedom.
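A minimal sketch of the exact t-interval with made-up data (n = 10) and the table value t0.025,9 = 2.262; the data and scenario are assumptions for illustration, not from the text:

```python
# Hypothetical small sample (n = 10) from an approximately normal population
sample = [9.8, 10.2, 10.4, 9.9, 10.1, 10.3, 9.7, 10.0, 10.2, 9.8]

n = len(sample)
xbar = sum(sample) / n
s = (sum((x - xbar) ** 2 for x in sample) / (n - 1)) ** 0.5  # sample s.d.

t = 2.262  # t_{0.025} with n - 1 = 9 df, from a t-table
half = t * s / n ** 0.5
print(round(xbar - half, 3), round(xbar + half, 3))
```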
Normal Distribution – CI for µ - Small
Sample
 Comparing t and z distributions for the same α, with df=4 for the t-distribution,
you can see that the t-score is larger, and therefore the confidence interval will
be wider.
 The closer df (df=n-1) gets to 30, the more closely the t-distribution
approximates the normal distribution.
Normal Distribution - Confidence Interval for σ
 Suppose we have obtained a random sample of n observations from a
normal population with variance σ² and that the sample variance is s².
A 100(1 − α)% confidence interval for the population variance is

(n − 1)s² / χ²(n−1, α/2)  <  σ²  <  (n − 1)s² / χ²(n−1, 1−α/2)
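A sketch with hypothetical numbers (n = 20, s² = 9) and chi-square quantiles for 19 df taken from a table:

```python
# Hypothetical: n = 20 observations with sample variance s2 = 9.
n, s2 = 20, 9.0
chi2_hi = 32.852  # chi-square_{19, 0.025}, from a table
chi2_lo = 8.907   # chi-square_{19, 0.975}, from a table

lower = (n - 1) * s2 / chi2_hi
upper = (n - 1) * s2 / chi2_lo
print(round(lower, 2), round(upper, 2))  # roughly 5.21 and 19.20
```

Note that the larger chi-square quantile gives the lower limit, so the interval is not symmetric around s².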
Homework
Confidence Intervals - The Multiple Sample Case

 It's often necessary to compare two population means, and this is
conveniently accomplished by estimating the difference between the two
means. Suppose the two means are denoted by µ1 and µ2 and the two
variances by σ1² and σ2², respectively, with µ1 − µ2 the parameter to
be estimated.

 Two samples are drawn, one from each population, and we have the
following:

E(X̄1) = µ1,  V(X̄1) = σ1²/n1
E(X̄2) = µ2,  V(X̄2) = σ2²/n2
Independent Populations

E(X̄1 − X̄2) = µ1 − µ2
V(X̄1 − X̄2) = V(X̄1) + V(X̄2) = σ1²/n1 + σ2²/n2

Independence

 The difference between the two sample means has an approximately
normal distribution.
Confidence Intervals

(X̄1 − X̄2) ~ N(µ1 − µ2, σ1²/n1 + σ2²/n2)

⇒ [(X̄1 − X̄2) − (µ1 − µ2)] / √(σ1²/n1 + σ2²/n2) ~ N(0, 1)

−zα/2 ≤ [(X̄1 − X̄2) − (µ1 − µ2)] / √(σ1²/n1 + σ2²/n2) ≤ zα/2

(X̄1 − X̄2) − zα/2 √(σ1²/n1 + σ2²/n2) ≤ (µ1 − µ2)
                ≤ (X̄1 − X̄2) + zα/2 √(σ1²/n1 + σ2²/n2)
Example
 A farm-equipment manufacturer wants to compare the average daily
downtime for two sheet-metal stamping machines located in two different
factories. Investigation of company records for 100 randomly selected
days on each of the two machines gave the following results:
 n1 = 100, n2 = 100
 X̄1 = 12 minutes, X̄2 = 9 minutes
 s1² = 6, s2² = 4

 Construct a 90% Confidence Interval for µ1 − µ2.
Solution
 Using the above formula, we have (x̄1 − x̄2) ± z0.05 √(s1²/n1 + s2²/n2)
 Giving (12 − 9) ± (1.645) sqrt(6/100 + 4/100), which is
 3 ± 0.52

 That is, we are about 90% confident that µ1 − µ2 is between
3 − 0.52 = 2.48 and 3 + 0.52 = 3.52. This evidence suggests that µ1 is
larger than µ2.
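The downtime example, computed step by step:

```python
from statistics import NormalDist

x1, x2 = 12.0, 9.0             # sample mean downtimes (minutes)
s1_sq, s2_sq = 6.0, 4.0        # sample variances
n1, n2 = 100, 100
z = NormalDist().inv_cdf(0.95)  # z_{0.05}, about 1.645 (90% interval)

half = z * (s1_sq / n1 + s2_sq / n2) ** 0.5
print(round(x1 - x2 - half, 2), round(x1 - x2 + half, 2))
# roughly 2.48 and 3.52
```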
Normal Distribution - Same Variance - 2 Samples

E(X̄1 − X̄2) = µ1 − µ2
V(X̄1 − X̄2) = V(X̄1) + V(X̄2) = σ²/n1 + σ²/n2

 We need a good estimator for the common variance, which should be a
function of the two sample variances, but unbiased.

 Therefore the pooled sample variance is the appropriate estimator:

sp² = [(n1 − 1)s1² + (n2 − 1)s2²] / (n1 + n2 − 2)
Confidence Interval for the Difference Between Two Means

Variance unknown, σ1 = σ2:

[(X̄1 − X̄2) − (µ1 − µ2)] / √(sp²/n1 + sp²/n2) ~ t(n1 + n2 − 2)

−tα/2, n1+n2−2 ≤ [(X̄1 − X̄2) − (µ1 − µ2)] / √(sp²/n1 + sp²/n2) ≤ tα/2, n1+n2−2

(X̄1 − X̄2) − tα/2, n1+n2−2 · sp √(1/n1 + 1/n2) ≤ (µ1 − µ2)
                ≤ (X̄1 − X̄2) + tα/2, n1+n2−2 · sp √(1/n1 + 1/n2)
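A sketch of the pooled-variance interval with hypothetical summary statistics and the table value t0.025,20 = 2.086 (all numbers below are assumptions for illustration):

```python
# Hypothetical summary statistics; equal population variances assumed
n1, x1, s1_sq = 10, 24.2, 4.0
n2, x2, s2_sq = 12, 21.9, 5.0
t = 2.086  # t_{0.025} with n1 + n2 - 2 = 20 df, from a t-table

# Pooled sample variance: weighted average of the two sample variances
sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

half = t * (sp_sq * (1 / n1 + 1 / n2)) ** 0.5
print(round(x1 - x2 - half, 2), round(x1 - x2 + half, 2))
```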
Confidence Interval for the Difference Between Two Binomial Proportions

[(p̂1 − p̂2) − (p1 − p2)] / √(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2) ~ N(0, 1)

−zα/2 ≤ [(p̂1 − p̂2) − (p1 − p2)] / √(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2) ≤ zα/2

(p̂1 − p̂2) − zα/2 √(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2) ≤ (p1 − p2)
                ≤ (p̂1 − p̂2) + zα/2 √(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2)
Example
 We want to compare the proportion of defective electric motors turned
out by two shifts of workers. From the large number of motors produced
in a given week, n1 = 50 motors were selected from the output of shift
I, and n2 = 40 motors were selected from the output of shift II. The
sample from shift I revealed four to be defective, and the sample from
shift II showed six faulty motors. Estimate the true difference between
the proportions of defective motors with a 95% confidence interval.

 Using the previous formula, the interval is −0.07 ± 0.13.

 Note. Since the interval contains zero, there does not appear to be
any significant difference between the rates of defectives for the two
shifts. Zero cannot be ruled out as a plausible value of the true
difference between proportions of defective motors.
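The example's interval, computed explicitly:

```python
from statistics import NormalDist

y1, n1 = 4, 50   # defectives, shift I
y2, n2 = 6, 40   # defectives, shift II
p1, p2 = y1 / n1, y2 / n2
z = NormalDist().inv_cdf(0.975)  # about 1.96

half = z * (p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2) ** 0.5
print(round(p1 - p2, 2), round(half, 2))  # roughly -0.07 and 0.13
```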
Confidence Interval for the Ratio of Two Variances

(s1²/σ1²) / (s2²/σ2²) ~ F(n1 − 1, n2 − 1)

F(1−α/2, n1−1, n2−1) ≤ (s1²/s2²) / (σ1²/σ2²) ≤ F(α/2, n1−1, n2−1)

(s1²/s2²) · (1/F(α/2, n1−1, n2−1)) ≤ σ1²/σ2²
                ≤ (s1²/s2²) · (1/F(1−α/2, n1−1, n2−1))
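A sketch with hypothetical samples (n1 = n2 = 16, s1² = 12, s2² = 8) and the table value F0.025,15,15 ≈ 2.86; with equal degrees of freedom, F0.975,15,15 = 1/F0.025,15,15:

```python
# Hypothetical: two samples of size 16 with s1^2 = 12 and s2^2 = 8
n1, n2 = 16, 16
s1_sq, s2_sq = 12.0, 8.0
f_hi = 2.86        # F_{0.025, 15, 15}, from an F-table
f_lo = 1.0 / f_hi  # F_{0.975, 15, 15}; valid because the two df are equal

ratio = s1_sq / s2_sq
print(round(ratio / f_hi, 3), round(ratio / f_lo, 3))
# roughly 0.524 and 4.29: the interval contains 1, so equal variances
# cannot be ruled out at the 95% level
```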
Homework

DATA iron;
INPUT brand $ dist @@;
CARDS;
A 251.2 B 263.2 C 269.7 D 251.6
A 245.1 B 262.9 C 263.2 D 248.6
A 248.0 B 265.0 C 277.5 D 249.4
A 251.1 B 254.5 C 267.4 D 242.0
A 260.5 B 264.3 C 270.5 D 246.5
A 250.0 B 257.0 C 265.5 D 251.3
A 253.9 B 262.8 C 270.7 D 261.8
A 244.6 B 264.4 C 272.9 D 249.0
A 254.6 B 260.6 C 275.6 D 247.1
A 248.8 B 255.9 C 266.5 D 245.9
;
PROC GLM DATA=iron;
CLASS brand;
MODEL dist=brand;
MEANS brand / cldiff tukey;
RUN;
QUIT;


 The GLM Procedure

 Tukey's Studentized Range (HSD) Test for dist


 NOTE: This test controls the Type I experimentwise error rate.

 Alpha 0.05
 Error Degrees of Freedom 36
 Error Mean Square 21.17503
 Critical Value of Studentized Range 3.80880
 Minimum Significant Difference 5.5424

 Comparisons significant at the 0.05 level are indicated by ***.

 Difference
 brand Between Simultaneous 95%
 Comparison Means Confidence Limits

 C -B 8.890 3.348 14.432 ***


 C -A 19.170 13.628 24.712 ***
 C -D 20.630 15.088 26.172 ***
 B -C -8.890 -14.432 -3.348 ***
 B -A 10.280 4.738 15.822 ***
 B -D 11.740 6.198 17.282 ***
 A -C -19.170 -24.712 -13.628 ***
 A -B -10.280 -15.822 -4.738 ***
 A -D 1.460 -4.082 7.002
 D -C -20.630 -26.172 -15.088 ***
 D -B -11.740 -17.282 -6.198 ***
 D -A -1.460 -7.002 4.082
a. Variable - This column lists the dependent variable(s). In our example, the dependent variable is
write.

b. female - This column gives the values of the class variable, in our case female. This variable is
necessary for doing the independent-group t-test and is specified by the CLASS statement.

c. N - This is the number of valid (i.e., non-missing) observations in each group defined by the
variable listed on the CLASS statement (often called the independent variable).

d. Lower CL Mean and Upper CL Mean - These are the lower and upper confidence limits of the
mean. By default, they are 95% confidence limits.

e. Mean - This is the mean of the dependent variable for each level of the independent variable. On
the last line the difference between the means is given.

f. Lower CL Std Dev and Upper CL Std Dev - These are the lower and upper 95% confidence limits
for the standard deviation of the dependent variable for each level of the independent variable.

g. Std Dev - This is the standard deviation of the dependent variable for each of the levels of the
independent variable. On the last line the standard deviation for the difference is given.

h. Std Err - This is the standard error of the mean. It is used in calculating the t-value.
We compare the mean writing score between the group of female students
and the group of male students. Ideally, these subjects are randomly
selected from a larger population of subjects. Depending on whether we
assume that the variances of the two populations are the same or not,
the standard error of the difference between the group means and the
degrees of freedom are computed differently.
