SAMPLING

DISTRIBUTIONS
& CONFIDENCE
INTERVAL
CHAPTER 3
BPF 3313 / BUM 2413
CONTENT
3.1 Sampling Distribution
3.2 Estimate, Estimation and Estimator
3.3 Confidence Interval for the mean μ
3.4 Confidence Interval for the Difference
between Two Means
3.5 Confidence Interval for the Proportion
3.6 Confidence Interval for the Difference
between Two Proportions
3.7 Confidence Interval for Variances and
Standard Deviations
3.8 Confidence Interval for Two Variances
and Standard Deviations
3.1 Sampling Distributions

OBJECTIVE

 After completing this chapter, you should be able to

1. Identify the sampling distribution for the sample mean, the
difference between two sample means, the sample
proportion and the difference between two sample
proportions.
Sampling Distribution
 A sampling distribution is the probability distribution, under
repeated sampling of the population, of a given statistic
(a numerical quantity calculated from the data values in a
sample).

 The formula for the sampling distribution depends on the


distribution of the population, the statistic being
considered, and the sample size used. A more precise
formulation would speak of the distribution of the statistic
for all possible samples of a given size, not just "under
repeated sampling".

 In other words, the sampling distribution of a statistic S for
samples of size n is defined as follows:
– The experiment consists of choosing a sample of size n from
the population and measuring the statistic S. The sampling
distribution is the resulting probability distribution.
Sampling Distribution of Means
 Imagine carrying out the following procedure:

– Take a random sample of n independent observations from a
population.

– Calculate the mean of these n sample values (the sample
mean).

– Repeat the procedure until you have taken all possible
samples of size n, calculating the sample mean of each one.

– Form a distribution of all the sample means.

 This distribution is called the sampling distribution of
means.
Example
 Imagine that our population consists of only three
numbers: the number 2, the number 3 and the number 4.
Our plan is to draw an infinite number of random
samples of size n = 2 and form a sampling distribution of
the sample means.
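
The example can be checked by brute force. The sketch below assumes sampling with replacement (the slide does not state the sampling scheme), so there are 3² = 9 equally likely ordered samples of size 2. The variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [2, 3, 4]
n = 2

# All ordered samples of size n drawn with replacement (assumed scheme).
samples = list(product(population, repeat=n))
sample_means = [Fraction(sum(s), n) for s in samples]

# Sampling distribution: each distinct sample mean with its probability.
counts = Counter(sample_means)
distribution = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

for m, p in distribution.items():
    print(f"sample mean = {float(m):.1f}, probability = {p}")

# Mean of the sampling distribution, to compare with the population mean.
mean_of_means = sum(m * p for m, p in distribution.items())
print("mean of sample means:", mean_of_means)
```

Under this assumption the mean of the sample means equals the population mean, 3.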
Sampling Distribution for Means
 The mean of the sampling distribution of means is equal to the
population mean: $\mu_{\bar{X}} = \mu$

 The standard deviation of the sampling distribution of means is

$\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$ for an infinite population

$\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}$ for a finite population of size N

 If the population is normally distributed, the sampling distribution
is normal regardless of sample size.

 By the Central Limit Theorem, $\bar{X} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$:
if the population distribution is not necessarily normal, and has
mean μ and standard deviation σ, then, for sufficiently large n,
the sampling distribution of $\bar{X}$ is approximately normal,
with mean $\mu_{\bar{X}} = \mu$ and standard deviation $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$.
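
As a rough check of the two standard-deviation formulas, the sketch below applies them to the three-number population 2, 3, 4 from the earlier example with n = 2. The function name `standard_error` and its `N=None` convention for an infinite population are assumptions made for illustration.

```python
import math

def standard_error(sigma, n, N=None):
    """Standard deviation of the sample mean.

    N=None is used here to mean an infinite population (or sampling
    with replacement); otherwise the finite population factor applies.
    """
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se

population = [2, 3, 4]
mu = sum(population) / len(population)
# Population standard deviation (divide by N, not N - 1).
sigma = math.sqrt(sum((x - mu) ** 2 for x in population) / len(population))

se_infinite = standard_error(sigma, n=2)      # variance 1/3
se_finite = standard_error(sigma, n=2, N=3)   # variance 1/6

print(round(se_infinite ** 2, 4), round(se_finite ** 2, 4))
```

The finite-population value matches what you get by listing the three possible samples without replacement (means 2.5, 3 and 3.5).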
EXERCISE 3.1
1. At a college, the masses of the male
students can be modeled by a normal
distribution with mean mass 70kg and
standard deviation 5kg. Four male
students are chosen at random. Find the
probability that their mean mass is less
than 65kg.

TIPS: Use the Central Limit Theorem
Sampling Distribution for the
Difference between Two Means

For $\bar{X}_1 \sim N\left(\mu_1, \dfrac{\sigma_1^2}{n_1}\right)$ and $\bar{X}_2 \sim N\left(\mu_2, \dfrac{\sigma_2^2}{n_2}\right)$,

$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2, \dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}\right)$

for samples of $n_1$ and $n_2$ independent observations.
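
A minimal sketch of how the distribution of the difference can be used to find a probability. The helper name and all numeric inputs below are hypothetical, chosen only so the arithmetic is easy to follow; they are not taken from the exercises.

```python
import math
from statistics import NormalDist

def diff_means_distribution(mu1, sigma1, n1, mu2, sigma2, n2):
    """Normal distribution of X1bar - X2bar for two independent samples."""
    mean = mu1 - mu2
    sd = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return NormalDist(mean, sd)

# Hypothetical populations: variance of the difference is 9/9 + 16/16 = 2.
d = diff_means_distribution(mu1=10, sigma1=3, n1=9, mu2=8, sigma2=4, n2=16)

# P(X1bar - X2bar >= 0): the first sample mean is at least the second.
p_at_least_zero = 1 - d.cdf(0)
print(d.mean, round(d.stdev, 4), round(p_at_least_zero, 4))
```

Note that the variances add even though the means are subtracted.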


EXERCISE 3.1
2. The elasticity of a polymer is affected by the
concentration of a reactant. When low
concentration is used, the true mean elasticity
is 55, and when high concentration is used, the
true mean elasticity is 60. The standard
deviation of elasticity is 4, regardless of
concentration.
If two random samples of size 16 are taken,
find the probability that the difference between the
mean elasticity for high and low concentration is at
least 2.
Sampling Distribution for
Proportions
 p: the population proportion (also read as a probability or a percent)

 $\hat{p} = \dfrac{X}{n}$: the sample proportion of X successes in a sample of
size n

 $\hat{q} = 1 - \hat{p}$: the sample proportion of failures in a sample of size n

 X is the binomial random variable created by counting the
number of successes picked by drawing n times from the
population: $X \sim \text{Bin}(n, p)$

 The shape of the binomial distribution looks fairly Normal as
long as n is large and/or p is not too extreme (not close to 0
or 1).
If $np(1 - p) \ge 5$, then $\hat{p} \sim N\left(p, \dfrac{p(1 - p)}{n}\right)$
Sampling Distribution for
Proportions

 The sampling distribution for proportions is the distribution of
the sample proportions of all possible samples of size n that could
be taken in a given situation.

 That is, the sample proportion (percent of successes in a
sample) is approximately Normally distributed with

– mean p, and
– standard deviation $\sqrt{\dfrac{p(1 - p)}{n}}$
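
The Normal approximation for a sample proportion can be sketched as below. The function name, the optional continuity correction of 1/(2n), and the numeric inputs are illustrative assumptions, not values from the slides.

```python
import math
from statistics import NormalDist

def prob_proportion_at_least(p, n, threshold, continuity=True):
    """Approximate P(p_hat >= threshold) using p_hat ~ N(p, p(1-p)/n).

    With continuity=True the threshold is lowered by 1/(2n), the usual
    correction when a discrete count is approximated by a Normal curve.
    """
    sd = math.sqrt(p * (1 - p) / n)
    x = threshold - (1 / (2 * n) if continuity else 0.0)
    return 1 - NormalDist(p, sd).cdf(x)

# Hypothetical inputs: true proportion 0.2, sample of 100, threshold 25%.
with_cc = prob_proportion_at_least(p=0.2, n=100, threshold=0.25)
without_cc = prob_proportion_at_least(p=0.2, n=100, threshold=0.25,
                                      continuity=False)
print(round(with_cc, 4), round(without_cc, 4))
```

The corrected probability is slightly larger because the correction widens the upper tail by half a count.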
EXERCISE 3.1
3. It is known that 3% of frozen pies
delivered to a canteen are broken. What
is the probability that, on a morning
when 500 pies are delivered, 5% or
more are broken?

TIPS:
Use the Normal approximation to the Binomial distribution
OR
Use $Z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{p(1 - p)}{n}}}$ with continuity correction $\pm \dfrac{1}{2n}$
Sampling Distribution for the
Difference between Two Proportions

If $\hat{p}_1 \sim N\left(p_1, \dfrac{p_1(1 - p_1)}{n_1}\right)$ and $\hat{p}_2 \sim N\left(p_2, \dfrac{p_2(1 - p_2)}{n_2}\right)$, then

$\hat{p}_1 - \hat{p}_2 \sim N\left(p_1 - p_2, \dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}\right)$

for samples of $n_1$ and $n_2$ independent observations.


3.2 Estimate, Estimation,
Estimator
OBJECTIVE

 After completing this chapter, you should be able to

1. Define and understand the general formula of interval


estimate (confidence interval) for a parameter.
Estimator

 Probability functions are actually families of models,
in the sense that each includes one or more
parameters.

 Examples: Poisson, Binomial, Normal

 Any function of a random sample whose objective
is to approximate a parameter is called a statistic
or an estimator.

 $\hat{\theta}$ (a statistic) is the estimator for $\theta$ (a parameter).
Properties of Good Estimator
 Unbiased

 Efficient

 Sufficient

 Consistent

Estimations & Estimate
 Estimation – Is the entire process of using an
estimator to produce an estimate of the
parameter
 2 types of estimation
1. Point Estimate
• A single number used to estimate a population parameter

2. Interval Estimate
• A spread of values used to estimate a population
parameter
• The interval is usually written (a, b), where a and b are
known as the confidence limits
• a – lower confidence limit
• b – upper confidence limit
Definitions
 Confidence Interval
– Range of numbers that have a high probability of
containing the unknown parameter as an interior
point.
– By looking at the width of a confidence interval, we
can get a good sense of the estimator precision.
– Width = b – a

 Confidence Coefficient, $1 - \alpha$
– The probability of correctly including the population
parameter being estimated in the interval that is
produced
Definitions
 Level of Confidence
– The confidence coefficient expressed as a
percent: $(1 - \alpha) \times 100\%$
Definitions

OR
(1 – α)100% confidence interval for θ

= $\hat{\theta} \pm (\text{value from the distribution of } \hat{\theta}) \times (\text{s.d. of } \hat{\theta})$


3.3 Confidence Interval for
Mean
OBJECTIVES

 After completing this chapter, you should be able to

1. Find the confidence interval for the mean.

2. Find the confidence interval for the mean when σ is


known and unknown.
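Confidence Interval for the Mean

The (1 – α)100% confidence interval for μ

When σ is known:
$\left(\bar{X} - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}\right)$

When σ is unknown (large sample):
$\left(\bar{X} - z_{\alpha/2}\dfrac{s}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\dfrac{s}{\sqrt{n}}\right)$

When σ is unknown (small sample from a normal population):
$\left(\bar{X} - t_{\alpha/2,\,n-1}\dfrac{s}{\sqrt{n}},\ \bar{X} + t_{\alpha/2,\,n-1}\dfrac{s}{\sqrt{n}}\right)$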
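
A small sketch of the z-based interval for the case where σ is known. The function name and the numbers are hypothetical; the critical value comes from the standard library's `NormalDist().inv_cdf`, whereas the t-based interval would need a t table or an external library, so it is not shown here.

```python
import math
from statistics import NormalDist

def z_interval_mean(xbar, sigma, n, confidence=0.95):
    """(1 - alpha)100% confidence interval for the mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}, about 1.96 for 95%
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical sample: mean 50.0 from n = 64 observations, known sigma = 4.0.
low, high = z_interval_mean(xbar=50.0, sigma=4.0, n=64, confidence=0.95)
print(round(low, 2), round(high, 2))
```

The interval width is twice the margin, so quadrupling n halves the width.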
t- Distributions

Degrees of freedom: the number of values that are free
to vary after a sample statistic has
been computed. For the confidence interval for a mean,
the degrees of freedom are n – 1.
Rounding Rule
 When you are computing a confidence
interval for a population mean by using raw
data, round off to one more decimal place
than the number of decimal places in the
original data.

 When you are computing a confidence


interval for a population mean by using a
sample mean and standard deviation, round
off to the same number of decimal places
as given for the mean.
EXERCISE 3.3

1. The mass of vitamin E in a capsule


manufactured by a certain drug company is
normally distributed with standard deviation
0.042 mg.
A random sample of 5 capsules was
analyzed and the mean mass of vitamin E
was found to be 5.12 mg.
Find the 95% confidence interval for the
population mean mass of vitamin E per
capsule.
EXERCISE 3.3

2. A plant produces steel sheets whose


weights are known to be normally
distributed with a standard deviation of
2.4 kg.
A random sample of 36 sheets had a
mean weight of 3.14 kg.
Find the 99% confidence interval for the
population mean weight.
EXERCISE 3.3

3. A random sample of 100 pieces of wood
is cut using a machine.
The sample mean length is 1.06
cm and the standard deviation is 0.08 cm.
a. Find the 90% confidence interval for the mean
length of all the wood cut by the machine.
b. What is the width of this confidence
interval?
EXERCISE 3.3

4. The mean IQ score for 25 UMP students is


115 with standard deviation 10.
If the IQ score for all UMP students is
normally distributed, find the 95%
confidence interval for the mean IQ score
for all UMP students.
EXERCISE 3.3
5. The result X of a stress test is known to be
a normally distributed random variable with
mean μ and standard deviation 1.3.
It is required to have a 95% confidence
interval for μ with total width less than 2.
Find the least number of tests that should
be carried out to achieve this.
EXERCISE 3.3

6. Eight UMP students are randomly chosen


and the value of their CPA has been
collected as below.

3.20 2.76 2.94 3.41


2.92 2.99 3.01 3.11

Find the 99% confidence interval for the


CPA mean for all UMP students.
EXERCISE 3.3

7. The heights of men in a particular district


are distributed with mean μ cm and the
standard deviation σ cm.
On the basis of the results obtained from a
random sample of 100 men from the
district, the 95% confidence interval for μ
was calculated and found to be (177.22 cm
, 179.18 cm).
Calculate the value of sample mean and
standard deviation.
EXERCISE 3.3

8. A 90% confidence interval for a


population mean based on 144
observations is computed to be (2.7, 3.4).
How many observations must be made
so that a 90% confidence interval will
specify the mean to within ±0.2?
3.4 Confidence Interval for the
Difference Between Two Means

OBJECTIVES

 After completing this chapter, you should be able to

1. Find the confidence interval for the difference between


two means when σ’s are known.

2. Find the confidence interval for the difference between


two means when σ’s are unknown and equal.

3. Find the confidence interval for the difference between


two means when σ’s are unknown and not equal.
Confidence Interval for the
Difference Between Two Means
EXERCISE 3.4

1. Find the 95% confidence interval for the
difference between the mean sleep time of children
and of adults, given that the
variance of children's sleep time is 0.81
hours² while that of adults is 0.25 hours².
The sample mean sleep time for 30
children is 10 hours, while for 40 adults
it is 7 hours.
EXERCISE 3.4

2. The mean sleep time for 50 IPTS students
is 7 hours with a standard deviation of 1 hour.
The mean sleep time for 60 IPTA students
is 6 hours with a standard deviation of 0.7 hour.
Find the 99% confidence interval for the
difference between the mean sleep times of the
IPTS and IPTA students.
a. Assume the population variances are the same
b. Assume the population variances are different
EXERCISE 3.4

3. The mean sleep time for 20 IPTA students
is 7.5 hours with a variance of 1.11 hours².
The mean sleep time for 10 IPTS students
is 6.3 hours with a variance of 0.67 hours².
Find the 95% confidence interval for the
difference between the mean sleep times of the
IPTA and IPTS students.
a. Assume the population variances are the same
b. Assume the population variances are not equal
EXERCISE 3.4
4. Two groups of students are given a problem
solving test, and the results are compared. The
data are as follows:
Mathematics majors        Computer Science majors
$n_1 = 29$                $n_2 = 28$
$\bar{x}_1 = 83.6$        $\bar{x}_2 = 79.2$
$s_1 = 3.3$               $s_2 = 2.8$
Find the 98% confidence interval for the difference
between the mean test marks of the two groups of
students. Assume the population variances of the test
marks are the same for both groups.
EXERCISE 3.4
5. A medical researcher wishes to see whether the
pulse rates of nonsmokers are lower than the pulse
rates of smokers. Samples of 110 smokers and 120
nonsmokers are selected. The results are shown
here.
Smokers                   Nonsmokers
$\bar{X}_1 = 90$          $\bar{X}_2 = 88$
$s_1 = 5$                 $s_2 = 6$
$n_1 = 110$               $n_2 = 120$
Find the 90% confidence interval for the difference
between the mean pulse rates of nonsmokers and
smokers. Assume that the variances of the
pulse rates of the two populations are not the same.
3.5 Confidence Interval

for the Proportion


OBJECTIVE

 After completing this chapter, you should be able to

1. Find the confidence interval for a proportion


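The (1 – α)100% confidence interval for the proportion p

$\left(\hat{p} - z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}},\ \hat{p} + z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}}\right)$

NOTES:
Sample proportion: $\hat{p} = \dfrac{X}{n}$ and $\hat{q} = 1 - \hat{p}$,
where X is the number of successes in the given sample and n is the sample size,
and where $n\hat{p} \ge 5$ and $n\hat{q} \ge 5$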
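
The interval for a proportion can be sketched as follows. The function name and the counts used are hypothetical, and the sketch simply trusts that the large-sample condition in the notes holds rather than checking it.

```python
import math
from statistics import NormalDist

def proportion_interval(x, n, confidence=0.95):
    """(1 - alpha)100% confidence interval for a population proportion."""
    p_hat = x / n
    q_hat = 1 - p_hat
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_{alpha/2}
    margin = z * math.sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin

# Hypothetical data: 40 successes observed in a sample of 100.
low, high = proportion_interval(x=40, n=100, confidence=0.95)
print(round(low, 3), round(high, 3))
```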
EXERCISE 3.5
1. 23 out of 100 families in a village are poor. Find
the 99% confidence interval for the poverty rate of this
village.
2. A survey was undertaken of the use of the
internet by residents in a large city and it was
discovered that in a random sample of 150
residents, 45 logged on to the internet at least
once a day. Calculate an approximate 90%
confidence interval for p, the proportion of
residents in the city that log on to the internet at
least once a day.

3. Given $\hat{p} = 0.3590$. What sample size is needed
to obtain a 95% confidence interval for p that is
accurate to within ± 0.08?
EXERCISE 3.5
4. A researcher wishes to estimate, with 90%
confidence, the proportion of people who own a
home computer. A previous study shows that 40%
of those interviewed had a computer at home. The
researcher wishes to be accurate within 5% of the
true proportion. Find the minimum sample size
necessary.
5. A manufacturer wants to assess the proportion of
defective items in a large batch produced by a
particular machine. He tests a random sample of
300 items and finds that 45 items are defective. If
200 such tests are performed and a 95%
confidence interval is calculated for each, what is the
probability that more than 194 of the confidence
intervals cover the true proportion?
3.6 Confidence Interval for the
Difference between
Two Proportions
OBJECTIVE
 After completing this chapter, you should be able to

1. Find the confidence interval for the difference between


two proportions.
The (1 – α)100% confidence interval for the difference between
two proportions, $p_1 - p_2$

$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}$
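
A minimal sketch of the interval for the difference between two proportions, using hypothetical sample proportions and sizes. The function name is illustrative.

```python
import math
from statistics import NormalDist

def diff_proportions_interval(p1_hat, n1, p2_hat, n2, confidence=0.95):
    """(1 - alpha)100% confidence interval for p1 - p2."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_{alpha/2}
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z * se, diff + z * se

# Hypothetical samples: standard error is sqrt(0.0025 + 0.0024) = 0.07.
low, high = diff_proportions_interval(p1_hat=0.5, n1=100, p2_hat=0.4, n2=100)
print(round(low, 4), round(high, 4))
```

An interval that contains 0, as this one does, is consistent with the two population proportions being equal.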
EXERCISE 3.6
1. Given $\hat{p}_1 = 0.6$, $n_1 = 75$, $\hat{p}_2 = 0.3$, $n_2 = 100$.
Find the 95% confidence interval for $p_1 - p_2$.

2. In a sample of 200 surgeons, 15% thought


the government should control health care.
In a sample of 200 general practitioners,
21% felt the same way. Find 99%
confidence interval for the difference of
proportions for surgeons and practitioners.
3.7 Confidence Interval
for Variances and
Standard Deviations

OBJECTIVE
 After completing this chapter, you should be able to

1. Find the confidence interval for a variance and a


standard deviation.
The (1 – α)100% confidence interval for the variance $\sigma^2$

$\left(\dfrac{(n - 1)s^2}{\chi^2_{\alpha/2,\,n-1}},\ \dfrac{(n - 1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}\right)$

The (1 – α)100% confidence interval for the standard deviation $\sigma$

$\left(\sqrt{\dfrac{(n - 1)s^2}{\chi^2_{\alpha/2,\,n-1}}},\ \sqrt{\dfrac{(n - 1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}}\right)$

Where $\dfrac{(n - 1)s^2}{\sigma^2} \sim \chi^2_{n-1}$ (Chi-square distribution)
EXERCISE 3.7

1. A random sample of 10 rulers produced by
a machine gives the data below, in
cm.

100.13, 100.07, 100.02, 99.99, 99.88,
100.14, 100.03, 100.10, 99.92, 100.21

Find the 95% confidence interval for the
variance and standard deviation of the height of
all the rulers produced by the machine.
EXERCISE 3.7
2. A factory has a machine that is designed to fill
boxes with an average of 24 ounces of cereal,
and the population standard deviation for this
filling process is expected to be 0.1 ounce.
Thus, if the machine is working properly, the
population variance should be 0.01 square
ounce.
To estimate the value of the population variance, an
employee selected a random sample of 15 boxes
from a supply filled by the machine and found
that the sample variance was 0.008 square
ounce.
What is the 98% confidence interval for the
population variance and standard deviation?
3.8 Confidence Interval
for Two Variances and
Standard Deviations

OBJECTIVE
 After completing this chapter, you should be able to

1. Find the confidence interval for the ratio of two
variances and of two standard deviations.
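The (1 – α)100% confidence interval
for the variance proportion (ratio) $\dfrac{\sigma_1^2}{\sigma_2^2}$

$\left(\dfrac{s_1^2}{s_2^2} \cdot \dfrac{1}{F_{\alpha/2,\,n_1-1,\,n_2-1}},\ \dfrac{s_1^2}{s_2^2} \cdot F_{\alpha/2,\,n_2-1,\,n_1-1}\right)$

Where $\dfrac{s_1^2 / \sigma_1^2}{s_2^2 / \sigma_2^2} \sim F_{n_1-1,\,n_2-1}$ (F distribution)

and $F_{1-\alpha/2,\,n_1-1,\,n_2-1} = \dfrac{1}{F_{\alpha/2,\,n_2-1,\,n_1-1}}$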
EXERCISE 3.8
1. The machine in EXERCISE 3.7.1 is
serviced. A random sample of 12 rulers
produced by the machine after the service
gives the data below.
100.03, 100.01, 100.02, 100.04,
99.90, 99.96, 100.04, 100.06,
100.08, 99.98, 100.11, 100.05
Find the 95% confidence interval for the
variance proportion for all rulers produced
by the machine before and after the service.
EXERCISE 3.8
2. Before service, a machine packed
10 packets of sugar with a weight
variance of 64 g², while after service the
weight variance for 5 packets of sugar
is 25 g².
Find the 90% confidence interval for the
variance proportion for all sugar packets
produced by the machine before and
after the service.
Conclusion
 An important aspect of inferential statistics is
estimation

 Estimations of parameters of populations are


accomplished by selecting a random sample from
that population and choosing and computing a
statistic that is the best estimator of the parameter

 Statisticians prefer to use the interval estimate
rather than the point estimate because they can state
how confident they are (95%, 99% or another level) that
the interval contains the true parameter, and can also
determine the minimum sample size necessary.
Thank You
NEXT: CHAPTER 4 HYPOTHESIS TESTING
