
COMPLETE BUSINESS STATISTICS
by AMIR D. ACZEL & JAYAVEL SOUNDERPANDIAN
7th edition.
Prepared by Lloyd Jaisingh, Morehead State University

Chapter 5

Sampling and Sampling Distributions


McGraw-Hill/Irwin

Copyright 2009 by The McGraw-Hill Companies, Inc. All

5-2

5 Sampling and Sampling Distributions


Using Statistics
Sample Statistics as Estimators of Population Parameters
Sampling Distributions
Estimators and Their Properties
Degrees of Freedom
The Template

5-3

5 LEARNING OBJECTIVES
After studying this chapter you should be able to:

Take random samples from populations


Distinguish between population parameters and sample statistics
Apply the central limit theorem
Derive sampling distributions of sample means and proportions
Explain why sample statistics are good estimators of population parameters
Judge one estimator as better than another based on desirable properties of
estimators
Apply the concept of degrees of freedom
Identify special sampling methods
Compute sampling distributions and related results using templates

5-4

5-1 Using Statistics

Statistical Inference:
Predict and forecast values of population parameters...
Test hypotheses about values of population parameters...
Make decisions...

On the basis of sample statistics derived from limited and incomplete sample information.

Make generalizations about the characteristics of a population...

On the basis of observations of a sample, a part of a population.

5-5

The Literary Digest Poll (1936)


Figure: two panels, each showing a population of Democrats and Republicans together with the sample drawn from it.

Unbiased Sample: unbiased, representative sample drawn at random from the entire population.
Biased Sample: biased, unrepresentative sample drawn from people who have cars and/or telephones and/or read the Digest.

5-6

5-2 Sample Statistics as Estimators of Population Parameters

A sample statistic is a numerical measure of a summary characteristic of a sample.

A population parameter is a numerical measure of a summary characteristic of a population.

An estimator of a population parameter is a sample statistic used to estimate or predict the population parameter.
An estimate of a parameter is a particular numerical value of a sample statistic obtained through sampling.
A point estimate is a single value used as an estimate of a population parameter.

5-7

Estimators
The sample mean, $\bar{X}$, is the most common estimator of the population mean, $\mu$.
The sample variance, $s^2$, is the most common estimator of the population variance, $\sigma^2$.
The sample standard deviation, $s$, is the most common estimator of the population standard deviation, $\sigma$.
The sample proportion, $\hat{p}$, is the most common estimator of the population proportion, $p$.
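As a minimal sketch of how these four estimators turn a sample into point estimates, the following uses Python's standard `statistics` module. The data values and the variable names are invented for illustration and do not come from the slides.

```python
import statistics

# Hypothetical measurements, invented for illustration only.
sample = [12.0, 15.5, 9.8, 14.2, 11.1, 13.4]
# Hypothetical yes/no responses (1 = belongs to the category of interest).
responses = [1, 0, 0, 1, 1, 0, 1, 0, 0, 0]

x_bar = statistics.mean(sample)          # point estimate of the population mean
s_squared = statistics.variance(sample)  # sample variance, divides by n - 1
s = statistics.stdev(sample)             # sample standard deviation
p_hat = sum(responses) / len(responses)  # sample proportion x / n

print(x_bar, s_squared, s, p_hat)
```

Each printed number is an estimate (a particular value); the functions that produced them play the role of the estimators.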

5-8

Estimators

5-9

Population and Sample Proportions

The population proportion is equal to the number of elements in the population belonging to the category of interest, divided by the total number of elements in the population:

$p = \frac{X}{N}$

The sample proportion is the number of elements in the sample belonging to the category of interest, divided by the sample size:

$\hat{p} = \frac{x}{n}$

5-10

A Population Distribution, a Sample from a Population, and the Population and Sample Means

Figure: frequency distribution of the population with the population mean ($\mu$) marked, and the sample points (X) drawn from it with the sample mean ($\bar{X}$) marked.

5-11

Other Sampling Methods

Stratified sampling: in stratified sampling, the population is partitioned into two or more subpopulations called strata, and from each stratum a sample of the desired size is selected at random.
Cluster sampling: in cluster sampling, the population is divided into groups called clusters, a random sample of the clusters is selected, and then samples from these selected clusters are obtained.
Single-stage cluster sampling: in single-stage cluster sampling, a cluster is chosen at random and every item or person in that cluster is sampled.

5-12

Other Sampling Methods

Two-stage cluster sampling: in two-stage cluster sampling, a cluster is chosen at random and random samples of the items or persons are taken from that cluster.
Multistage cluster sampling: in multistage cluster sampling, a random sample of the clusters is selected, then samples of items or persons from these clusters are taken, and then samples from these items and persons are selected.
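The sampling methods described on these two slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the group labels, and the data are hypothetical, and a real survey design would add weighting and sample-size rules that are not shown here.

```python
import random

def stratified_sample(strata, sizes, rng):
    """Draw a simple random sample, without replacement, from every stratum."""
    return {name: rng.sample(members, sizes[name]) for name, members in strata.items()}

def single_stage_cluster_sample(clusters, rng):
    """Choose one cluster at random and keep every item in it."""
    name = rng.choice(sorted(clusters))
    return name, list(clusters[name])

def two_stage_cluster_sample(clusters, k, rng):
    """Choose one cluster at random, then sample k of its items at random."""
    name = rng.choice(sorted(clusters))
    return name, rng.sample(clusters[name], k)

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical strata and clusters, invented for illustration.
strata = {"urban": list(range(100)), "rural": list(range(100, 140))}
sizes = {"urban": 5, "rural": 2}
clusters = {"store_A": [1, 2, 3], "store_B": [4, 5], "store_C": [6, 7, 8, 9]}

stratified = stratified_sample(strata, sizes, rng)
chosen, everyone = single_stage_cluster_sample(clusters, rng)
chosen2, some = two_stage_cluster_sample(clusters, 2, rng)
print(stratified, chosen, everyone, chosen2, some)
```

The contrast to notice is that stratified sampling draws from every group, while cluster sampling draws from only the groups that were randomly selected.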

5-13

5-3 Sampling Distributions

The sampling distribution of a statistic is the probability distribution of all possible values the statistic may assume, when computed from random samples of the same size, drawn from a specified population.
The sampling distribution of $\bar{X}$ is the probability distribution of all possible values the random variable $\bar{X}$ may assume when a sample of size n is taken from a specified population.

5-14

Sampling Distributions (Continued)


Uniform population of integers from 1 to 8:

$X$     $P(X)$   $XP(X)$   $(X-\mu)$   $(X-\mu)^2$   $P(X)(X-\mu)^2$
1       0.125    0.125     -3.5        12.25         1.53125
2       0.125    0.250     -2.5         6.25         0.78125
3       0.125    0.375     -1.5         2.25         0.28125
4       0.125    0.500     -0.5         0.25         0.03125
5       0.125    0.625      0.5         0.25         0.03125
6       0.125    0.750      1.5         2.25         0.28125
7       0.125    0.875      2.5         6.25         0.78125
8       0.125    1.000      3.5        12.25         1.53125
Total   1.000    4.500                               5.25000

Figure: Uniform Distribution (1,8), P(X) plotted against X.

$E(X) = \mu = 4.5$
$V(X) = \sigma^2 = 5.25$
$SD(X) = \sigma = 2.2913$
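The population mean, variance, and standard deviation of the uniform population on the integers 1 to 8 can be checked with exact arithmetic. This is a small sketch using Python's `fractions` module; the variable names are illustrative.

```python
from fractions import Fraction
import math

values = range(1, 9)   # the integers 1 through 8
prob = Fraction(1, 8)  # each value is equally likely

mu = sum(x * prob for x in values)               # E(X)
var = sum(prob * (x - mu) ** 2 for x in values)  # V(X)
sd = math.sqrt(var)                              # SD(X)

print(float(mu), float(var), round(sd, 4))
```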

5-15

Sampling Distributions (Continued)

There are 8*8 = 64 different but equally likely samples of size 2 that can be drawn (with replacement) from a uniform population of the integers from 1 to 8:

Samples of Size 2 from Uniform (1,8)

      1    2    3    4    5    6    7    8
1   1,1  1,2  1,3  1,4  1,5  1,6  1,7  1,8
2   2,1  2,2  2,3  2,4  2,5  2,6  2,7  2,8
3   3,1  3,2  3,3  3,4  3,5  3,6  3,7  3,8
4   4,1  4,2  4,3  4,4  4,5  4,6  4,7  4,8
5   5,1  5,2  5,3  5,4  5,5  5,6  5,7  5,8
6   6,1  6,2  6,3  6,4  6,5  6,6  6,7  6,8
7   7,1  7,2  7,3  7,4  7,5  7,6  7,7  7,8
8   8,1  8,2  8,3  8,4  8,5  8,6  8,7  8,8

Each of these samples has a sample mean. For example, the mean of the sample (1,4) is 2.5, and the mean of the sample (8,4) is 6.

Sample Means from Uniform (1,8), n = 2

      1    2    3    4    5    6    7    8
1   1.0  1.5  2.0  2.5  3.0  3.5  4.0  4.5
2   1.5  2.0  2.5  3.0  3.5  4.0  4.5  5.0
3   2.0  2.5  3.0  3.5  4.0  4.5  5.0  5.5
4   2.5  3.0  3.5  4.0  4.5  5.0  5.5  6.0
5   3.0  3.5  4.0  4.5  5.0  5.5  6.0  6.5
6   3.5  4.0  4.5  5.0  5.5  6.0  6.5  7.0
7   4.0  4.5  5.0  5.5  6.0  6.5  7.0  7.5
8   4.5  5.0  5.5  6.0  6.5  7.0  7.5  8.0

5-16

Sampling Distributions (Continued)


The probability distribution of the sample mean is called the sampling distribution of the sample mean.

Sampling Distribution of the Mean

$\bar{X}$   $P(\bar{X})$   $\bar{X}P(\bar{X})$   $\bar{X}-\mu_{\bar{X}}$   $(\bar{X}-\mu_{\bar{X}})^2$   $P(\bar{X})(\bar{X}-\mu_{\bar{X}})^2$
1.0     0.015625   0.015625   -3.5   12.25   0.191406
1.5     0.031250   0.046875   -3.0    9.00   0.281250
2.0     0.046875   0.093750   -2.5    6.25   0.292969
2.5     0.062500   0.156250   -2.0    4.00   0.250000
3.0     0.078125   0.234375   -1.5    2.25   0.175781
3.5     0.093750   0.328125   -1.0    1.00   0.093750
4.0     0.109375   0.437500   -0.5    0.25   0.027344
4.5     0.125000   0.562500    0.0    0.00   0.000000
5.0     0.109375   0.546875    0.5    0.25   0.027344
5.5     0.093750   0.515625    1.0    1.00   0.093750
6.0     0.078125   0.468750    1.5    2.25   0.175781
6.5     0.062500   0.406250    2.0    4.00   0.250000
7.0     0.046875   0.328125    2.5    6.25   0.292969
7.5     0.031250   0.234375    3.0    9.00   0.281250
8.0     0.015625   0.125000    3.5   12.25   0.191406
Total   1.000000   4.500000                  2.625000

Figure: Sampling Distribution of the Mean, $P(\bar{X})$ plotted against $\bar{X}$ from 1.0 to 8.0.

$E(\bar{X}) = \mu_{\bar{X}} = 4.5$
$V(\bar{X}) = \sigma^2_{\bar{X}} = 2.625$
$SD(\bar{X}) = \sigma_{\bar{X}} = 1.6202$
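The sampling distribution tabulated here can be rebuilt by brute force: list all 64 ordered samples of size 2, compute each sample mean, and count how often each mean occurs. The sketch below does this with exact fractions; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = range(1, 9)

# All 8 * 8 = 64 ordered samples of size 2, drawn with replacement.
samples = list(product(population, repeat=2))

# Count how many samples produce each possible sample mean.
counts = Counter(Fraction(a + b, 2) for a, b in samples)
dist = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

mean_of_means = sum(m * p for m, p in dist.items())
var_of_means = sum(p * (m - mean_of_means) ** 2 for m, p in dist.items())

for m, p in dist.items():
    print(float(m), float(p))
print(float(mean_of_means), float(var_of_means))
```

The variance of the sample mean comes out as exactly half the population variance of 5.25, which is what a sample size of 2 should give.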

5-17

Properties of the Sampling Distribution of the Sample Mean

Comparing the population distribution and the sampling distribution of the mean:
The sampling distribution is more bell-shaped and symmetric.
Both have the same center.
The sampling distribution of the mean is more compact, with a smaller variance.

Figure: Uniform Distribution (1,8), P(X) against X, shown above the Sampling Distribution of the Mean, $P(\bar{X})$ against $\bar{X}$ from 1.0 to 8.0.

5-18

Relationships between Population Parameters and the Sampling Distribution of the Sample Mean

The expected value of the sample mean is equal to the population mean:

$E(\bar{X}) = \mu_{\bar{X}} = \mu$

The variance of the sample mean is equal to the population variance divided by the sample size:

$V(\bar{X}) = \sigma^2_{\bar{X}} = \frac{\sigma^2}{n}$

The standard deviation of the sample mean, known as the standard error of the mean, is equal to the population standard deviation divided by the square root of the sample size:

$SD(\bar{X}) = \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
5-19

Sampling from a Normal Population


When sampling from a normal population with mean $\mu$ and standard deviation $\sigma$, the sample mean, $\bar{X}$, has a normal sampling distribution:

$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$

This means that, as the sample size increases, the sampling distribution of the sample mean remains centered on the population mean, but becomes more compactly distributed around that population mean.

Figure: Sampling Distribution of the Sample Mean, showing the density $f(\bar{X})$ for the normal population and for the sampling distributions with n = 2, n = 4, and n = 16.

5-20

The Central Limit Theorem


When sampling from a population with mean $\mu$ and finite standard deviation $\sigma$, the sampling distribution of the sample mean will tend to a normal distribution with mean $\mu$ and standard deviation $\frac{\sigma}{\sqrt{n}}$ as the sample size becomes large (n > 30).

For large enough n: $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$

Figure: sampling distributions of the sample mean for n = 5, n = 20, and large n.
5-21

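The Central Limit Theorem Applies to Sampling Distributions from Any Population

Figure: four columns (Normal, Uniform, Skewed, and General populations), each showing the population distribution and the sampling distribution of the sample mean for n = 2 and n = 30.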

The Central Limit Theorem (Example 5-1)

5-22

Mercury makes a 2.4 liter V-6 engine, the Laser XRi, used in speedboats. The company's engineers believe the engine delivers an average power of 220 horsepower and that the standard deviation of power delivered is 15 HP. A potential buyer intends to sample 100 engines (each engine is to be run a single time). What is the probability that the sample mean will be less than 217 HP?

$P(\bar{X} < 217) = P\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{217 - \mu}{\sigma/\sqrt{n}}\right)$

$= P\left(Z < \frac{217 - 220}{15/\sqrt{100}}\right) = P\left(Z < \frac{217 - 220}{15/10}\right)$

$= P(Z < -2) = 0.0228$
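The same calculation can be reproduced with `statistics.NormalDist` from the Python standard library. This is a sketch of the arithmetic in Example 5-1, not the template the slides refer to; the variable names are illustrative.

```python
from statistics import NormalDist
import math

mu, sigma, n = 220, 15, 100

se = sigma / math.sqrt(n)      # standard error of the mean: 15 / 10
z = (217 - mu) / se            # standardized value of 217 HP
prob = NormalDist().cdf(z)     # P(Z < z) under the standard normal

print(se, z, round(prob, 4))
```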

The Central Limit Theorem (Example 5-1) Using the Template

5-23

5-24

Example 5-2
Figure: EPS Mean Distribution, a histogram of Frequency (0 to 25) by Range, with classes 2.00 - 2.49, 2.50 - 2.99, 3.00 - 3.49, 3.50 - 3.99, 4.00 - 4.49, 4.50 - 4.99, 5.00 - 5.49, 5.50 - 5.99, 6.00 - 6.49, 6.50 - 6.99, 7.00 - 7.49, and 7.50 - 7.99.

5-25

Student's t Distribution

If the population standard deviation, $\sigma$, is unknown, replace $\sigma$ with the sample standard deviation, $s$. If the population is normal, the resulting statistic:

$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$

has a t distribution with (n - 1) degrees of freedom.

The t is a family of bell-shaped and symmetric distributions, one for each number of degrees of freedom.
The expected value of t is 0.
The variance of t is greater than 1, but approaches 1 as the number of degrees of freedom increases. The t is flatter and has fatter tails than does the standard normal.
The t distribution approaches a standard normal as the number of degrees of freedom increases.

Figure: density curves for the standard normal, t with df=20, and t with df=10.
5-26

The Sampling Distribution of the Sample Proportion, $\hat{p}$

The sample proportion is the percentage of successes in n binomial trials. It is the number of successes, X, divided by the number of trials, n.

Sample proportion: $\hat{p} = \frac{X}{n}$

As the sample size, n, increases, the sampling distribution of $\hat{p}$ approaches a normal distribution with mean $p$ and standard deviation $\sqrt{\frac{p(1-p)}{n}}$.

Figure: binomial distributions P(X) against X for n=2, p=0.3; n=10, p=0.3; and n=15, p=0.3. For n=15 the horizontal axis is also labeled with $\hat{p}$ = 0/15, 1/15, ..., 15/15.

5-27

Sample Proportion (Example 5-3)

In recent years, convertible sports coupes have become very popular in Japan. Toyota is currently shipping Celicas to Los Angeles, where a customizer does a roof lift and ships them back to Japan. Suppose that 25% of all Japanese in a given income and lifestyle category are interested in buying Celica convertibles. A random sample of 100 Japanese consumers in the category of interest is to be selected. What is the probability that at least 20% of those in the sample will express an interest in a Celica convertible?

$n = 100, \quad p = 0.25$

$np = (100)(0.25) = 25 = E(X)$, so $E(\hat{p}) = p = 0.25$

$\frac{p(1-p)}{n} = \frac{(0.25)(0.75)}{100} = 0.001875 = V(\hat{p})$

$\sqrt{\frac{p(1-p)}{n}} = \sqrt{0.001875} = 0.04330127 = SD(\hat{p})$

$P(\hat{p} \geq 0.20) = P\left(\frac{\hat{p} - p}{\sqrt{p(1-p)/n}} \geq \frac{0.20 - p}{\sqrt{p(1-p)/n}}\right)$

$= P\left(z \geq \frac{0.20 - 0.25}{\sqrt{(0.25)(0.75)/100}}\right) = P\left(z \geq \frac{-0.05}{0.0433}\right)$

$= P(z \geq -1.15) = 0.8749$
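The normal approximation used in Example 5-3 can be checked numerically with `statistics.NormalDist`. This is a sketch with illustrative variable names. Note that the slide rounds z to -1.15 before looking up the probability; carrying the unrounded z gives a slightly larger answer, about 0.876.

```python
from statistics import NormalDist
import math

n, p = 100, 0.25

sd_p_hat = math.sqrt(p * (1 - p) / n)  # standard deviation of the sample proportion
z = (0.20 - p) / sd_p_hat              # standardized value of 0.20
prob = 1 - NormalDist().cdf(z)         # P(p_hat >= 0.20) under the normal approximation

print(round(sd_p_hat, 8), round(z, 4), round(prob, 4))
```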

Sample Proportion (Example 5-3) Template Solution

5-28

5-29

5-4 Estimators and Their Properties


An estimator of a population parameter is a sample statistic used to estimate the parameter. The most commonly-used estimator of the:

Population Parameter                      Sample Statistic
Mean ($\mu$)                     is the   Mean ($\bar{X}$)
Variance ($\sigma^2$)            is the   Variance ($s^2$)
Standard Deviation ($\sigma$)    is the   Standard Deviation ($s$)
Proportion ($p$)                 is the   Proportion ($\hat{p}$)

Desirable properties of estimators include:
Unbiasedness
Efficiency
Consistency
Sufficiency

5-30

Unbiasedness
An estimator is said to be unbiased if its expected value is equal to the population parameter it estimates.
For example, $E(\bar{X}) = \mu$, so the sample mean is an unbiased estimator of the population mean. Unbiasedness is an average or long-run property. The mean of any single sample will probably not equal the population mean, but the average of the means of repeated independent samples from a population will, in the long run, equal the population mean.
Any systematic deviation of the estimator from the population parameter of interest is called a bias.

5-31

Unbiased and Biased Estimators

Bias

An unbiased estimator is on
target on average.

A biased estimator is
off target on average.

5-32

Efficiency
An estimator is efficient if it has a relatively small variance (and standard deviation).

An efficient estimator is, on average, closer to the parameter being estimated.

An inefficient estimator is, on average, farther from the parameter being estimated.

5-33

Consistency and Sufficiency


An estimator is said to be consistent if its probability of being close to the parameter it estimates increases as the sample size increases.

Figure: Consistency, comparing the sampling distributions for n = 10 and n = 100.

An estimator is said to be sufficient if it contains all the information in the data about the parameter it estimates.

5-34

Properties of the Sample Mean


For a normal population, both the sample mean and sample median are unbiased estimators of the population mean, but the sample mean is both more efficient (because it has a smaller variance), and sufficient. Every observation in the sample is used in the calculation of the sample mean, but only the middle value is used to find the sample median.
In general, the sample mean is the best estimator of the population mean. The sample mean is the most efficient unbiased estimator of the population mean. It is also a consistent estimator.
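The efficiency comparison between the sample mean and the sample median can be illustrated by simulation. The sketch below assumes a standard normal population, samples of size 25, and a fixed seed; all three are choices made for this illustration rather than values from the slides.

```python
import random
import statistics

rng = random.Random(3)  # fixed seed so the run is repeatable
n, reps = 25, 4000

means, medians = [], []
for _ in range(reps):
    sample = [rng.gauss(0, 1) for _ in range(n)]  # normal population, mean 0, sd 1
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

var_mean = statistics.variance(means)      # theory: sigma^2 / n = 0.04
var_median = statistics.variance(medians)  # noticeably larger for normal data

print(round(statistics.fmean(means), 3), round(statistics.fmean(medians), 3))
print(round(var_mean, 4), round(var_median, 4))
```

Both estimators center on the population mean of 0, which reflects unbiasedness, while the sample mean varies less from sample to sample, which reflects its greater efficiency.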

5-35

Properties of the Sample Variance


The sample variance (the sum of the squared deviations from the sample mean divided by (n-1)) is an unbiased estimator of the population variance. In contrast, the average squared deviation from the sample mean is a biased (though consistent) estimator of the population variance.

$E(s^2) = E\left[\frac{\sum (x - \bar{x})^2}{n - 1}\right] = \sigma^2$

$E\left[\frac{\sum (x - \bar{x})^2}{n}\right] < \sigma^2$
5-36

5-5 Degrees of Freedom


Consider a sample of size n=4 containing the following data points:

x1=10   x2=12   x3=16   x4=?

and for which the sample mean is:

$\bar{x} = \frac{\sum x}{n} = 14$

Given the values of three data points and the sample mean, the value of the fourth data point can be determined:

$\bar{x} = \frac{\sum x}{n} = \frac{10 + 12 + 16 + x_4}{4} = 14$

$10 + 12 + 16 + x_4 = 56$

$x_4 = 56 - 10 - 12 - 16$

$x_4 = 18$
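The idea that the sample mean removes one degree of freedom can be sketched as a tiny function: once n - 1 points and the mean are fixed, the last point has no freedom left. The function name and the numbers below are hypothetical and chosen only for illustration.

```python
def missing_point(known, mean, n):
    """Recover the one unknown data point from n - 1 known points and the sample mean."""
    if len(known) != n - 1:
        raise ValueError("exactly n - 1 points must be known")
    return n * mean - sum(known)

# Hypothetical numbers: three known points and a sample mean of 6 for n = 4.
print(missing_point([3, 5, 10], mean=6, n=4))
```

With only n - 2 points known the function refuses to answer, mirroring the fact that two unknown points cannot be uniquely determined from the mean alone.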

5-37

Degrees of Freedom (Continued)


If only two data points and the sample mean are known:

x1=10   x2=12   x3=?   x4=?   $\bar{x} = 14$

The values of the remaining two data points cannot be uniquely determined:

$\bar{x} = \frac{\sum x}{n} = \frac{10 + 12 + x_3 + x_4}{4} = 14$

$10 + 12 + x_3 + x_4 = 56$

5-38

Degrees of Freedom (Continued)


The number of degrees of freedom is equal to the total number of measurements (these are not always raw data points), less the total number of restrictions on the measurements. A restriction is a quantity computed from the measurements.
The sample mean is a restriction on the sample measurements, so after calculating the sample mean there are only (n-1) degrees of freedom remaining with which to calculate the sample variance.
The sample variance is based on only (n-1) free data points:

$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
5-39

Example 5-4
A sample of size 10 is given below. We are to choose three different numbers from which the deviations are to be taken. The first number is to be used for the first five sample points; the second number is to be used for the next three sample points; and the third number is to be used for the last two sample points.

Sample #        1    2    3    4    5    6    7    8    9    10
Sample Point    93   97   60   72   96   83   59   66   88   53

i. What three numbers should we choose in order to minimize the SSD (sum of squared deviations from the mean)?

Note: $SSD = \sum (x - \bar{x})^2$

5-40

Example 5-4 (continued)


Solution: Choose the means of the corresponding sample points. These are: 83.6, 69.33, and 70.5.
ii. Calculate the SSD with chosen numbers.
Solution: SSD = 2030.367. See table on next slide for calculations.
iii. What is the df for the calculated SSD?
Solution: df = 10 - 3 = 7.
iv. Calculate an unbiased estimate of the population variance.
Solution: An unbiased estimate of the population variance is SSD/df = 2030.367/7 = 290.05.

5-41

Example 5-4 (continued)


Sample #   Sample Point   Mean     Deviations   Deviation Squared
1          93             83.6       9.4          88.36
2          97             83.6      13.4         179.56
3          60             83.6     -23.6         556.96
4          72             83.6     -11.6         134.56
5          96             83.6      12.4         153.76
6          83             69.33     13.6667      186.7778
7          59             69.33    -10.3333      106.7778
8          66             69.33     -3.3333       11.1111
9          88             70.5      17.5         306.25
10         53             70.5     -17.5         306.25

SSD       2030.367
SSD/df    290.0524
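The calculations in Example 5-4 can be reproduced in a few lines. This is a sketch using the ten sample points from the example; the variable names are illustrative.

```python
data = [93, 97, 60, 72, 96, 83, 59, 66, 88, 53]

# First five points, next three points, last two points.
groups = [data[0:5], data[5:8], data[8:10]]

# The group means minimize the sum of squared deviations within each group.
means = [sum(g) / len(g) for g in groups]

ssd = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
df = len(data) - len(groups)  # 10 measurements less 3 restrictions
estimate = ssd / df           # unbiased estimate of the population variance

print([round(m, 2) for m in means], round(ssd, 3), df, round(estimate, 4))
```

Each group mean used as a reference point costs one degree of freedom, which is why the divisor is 7 rather than 9.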

5-42

5-6 Using the Computer


Using the Template

Sampling Distribution of a Sample Mean

5-43

Using the Template

Sampling Distribution of a Sample Mean (continued)

5-44

Using the Template

Sampling Distribution of a Sample Proportion

5-45

Using the Template

Sampling Distribution of a Sample Proportion (continued)

5-46

Using Excel to Generate Random Data


Constructing a sampling distribution of the mean from a uniform
population (n = 10) using EXCEL (use RANDBETWEEN(0, 1)
command to generate values to graph):

CLASS MIDPOINT   FREQUENCY
0.15             0
0.2              0
0.25             3
0.3              26
0.35             64
0.4              113
0.45             183
0.5              213
0.55             178
0.6              128
0.65             65
0.7              20
0.75             3
0.8              3
0.85             0
Total            999

Figure: Histogram of Sample Means, plotting Frequency (0 to 250) against Sample Means (Class Midpoints).
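A comparable experiment can be run outside Excel. The sketch below assumes a continuous uniform population on [0, 1), which is an assumption suggested by the class midpoints in the table rather than something the slide states, and it uses 999 samples of size 10 to match the total frequency. The seed and names are illustrative, so the individual counts will differ from the table.

```python
import random
import statistics
from collections import Counter

rng = random.Random(1)  # fixed seed so the run is repeatable
n, reps = 10, 999

means = [statistics.fmean([rng.random() for _ in range(n)]) for _ in range(reps)]

# Assign each sample mean to the nearest class midpoint on a 0.05 grid.
freq = Counter(round(m * 20) / 20 for m in means)
for midpoint in sorted(freq):
    print(f"{midpoint:.2f}  {freq[midpoint]}")

center = statistics.fmean(means)
spread = statistics.stdev(means)
theory_sd = (1 / 12) ** 0.5 / n ** 0.5  # sigma / sqrt(n) for a uniform on [0, 1)
print(round(center, 3), round(spread, 4), round(theory_sd, 4))
```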

5-47

Using Minitab to Generate Random Data


Constructing a sampling distribution of the mean from any distribution using MINITAB can be achieved by selecting CALC > RANDOM DATA and then generating the data from a selected distribution. For example, we can generate data from a normal distribution with mean = 100 and standard deviation = 5.

Using Minitab to Look at the Sampling Distribution of the sample Mean (n = 50)

If we would like to look at the distribution of the sample means for the previous simulation, we can select different sample sizes from the columns (C1 to C50). If for instance we select samples of size n = 50, then we will have 1000 of these based on the previous simulation.
In this simulation $\mu$ = 100 and $\sigma$ = 5.
We can compute the row means for these 50 columns and save them in the next available column.
Thus for the sampling distribution of the sample means, its mean will be 100 and its standard deviation will be $5/\sqrt{50} = 0.7071$.
The next slide shows this situation. Observe that the distribution for these simulated sample means is approximately normally distributed with a mean of 100.02 and a standard deviation of 0.70. These values are very close to the theoretical values.
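The same experiment can be sketched in Python: 1000 rows of 50 normal draws each, followed by the row means. The seed and variable names are illustrative, and the simulated values will not match the Minitab output digit for digit.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable
rows, n = 1000, 50

# Each row plays the role of one sample of size 50 from N(100, 5).
row_means = [
    statistics.fmean([rng.gauss(100, 5) for _ in range(n)])
    for _ in range(rows)
]

theory_se = 5 / n ** 0.5  # 5 / sqrt(50)
sim_mean = statistics.fmean(row_means)
sim_sd = statistics.stdev(row_means)

print(round(theory_se, 4), round(sim_mean, 2), round(sim_sd, 2))
```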

5-48

Using Minitab to Look at the Sampling Distribution of the sample Mean (n = 50)

Figure: histogram of the simulated sample means (horizontal axis from 97.8 to 102.0), with 95% confidence intervals for the mean and median (axis from 99.950 to 100.075).

Anderson-Darling Normality Test
A-Squared       0.31
P-Value         0.566

Mean            100.02
StDev           0.70
Variance        0.50
Skewness        -0.0604654
Kurtosis        0.0706591
N               1000

Minimum         97.74
1st Quartile    99.54
Median          100.00
3rd Quartile    100.50
Maximum         102.09

95% Confidence Interval for Mean      99.98    100.06
95% Confidence Interval for Median    99.95    100.05
95% Confidence Interval for StDev     0.67     0.74

5-49
