
# Estimating the Mean and Variance of a Normal Distribution

Learning Objectives

After completing this module, the student will be able to

- explain the value of repeating experiments
- explain the role of the law of large numbers in estimating population means
- describe the effect of increasing the sample size or reducing measurement errors or other sources of variability

Knowledge and Skills

- Properties of the arithmetic mean
- Estimating the mean of a normal distribution
- Law of Large Numbers
- Estimating the variance of a normal distribution
- Generating random variates in EXCEL

Prerequisites

1. Calculating sample mean and arithmetic average
2. Calculating sample variance and standard deviation
3. Normal distribution

Citation: Neuhauser, C. Estimating the Mean and Variance of a Normal Distribution.
Created: September 9, 2009 Revisions:
others to translate, make remixes, and produce new stories based on this work, provided the original author and source are
Funding: This work was partially supported by a HHMI Professors grant from the Howard Hughes Medical Institute.

Pretest

platinum by repeating the measurements three times. To save time, they decide to only measure
the density once. Explain the consequences of this shortcut.
2. Tom and Bao Yu measured the density of solid platinum three times: 19.8, 21.4, and 21.9 g/cm³.
Determine the arithmetic average of these three measurements accurate to three decimal places.
3. The following graphs are densities of probability distributions. Which represent the density of a
normal distribution?

[Figure: three density plots, panels (a), (b), and (c), each drawn against t for t from 0 to 6; the vertical axes run from 0 to 0.5, 2.5, and 0.35, respectively.]

4. Which two parameters are typically used to describe the normal distribution?
a. Median
b. Variance
c. Standard deviation
d. Mean
5. Suppose $X$ is normally distributed with mean 3 and standard deviation 1, that is, $X \sim N(3,1)$. Use
EXCEL to (a) find $P(X > 3)$, (b) find $P(1 < X < 4)$, and (c) determine $a$ so that $P(X > a) = 0.74$.
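The three kinds of calculation asked for in the last question (an upper-tail probability, an interval probability, and a cutoff for a given upper-tail probability) can also be sketched outside EXCEL. The following is a minimal Python sketch, not part of the original module. The helper names are hypothetical, and it is demonstrated on a standard normal distribution (an assumption chosen for illustration) rather than on the distribution in the question. The call `inv_cdf` plays the role of EXCEL's NORMINV.

```python
from statistics import NormalDist


def upper_tail(dist, a):
    """Return P(X > a) using the cumulative distribution function."""
    return 1.0 - dist.cdf(a)


def interval_prob(dist, lo, hi):
    """Return P(lo < X < hi) as a difference of two cumulative probabilities."""
    return dist.cdf(hi) - dist.cdf(lo)


def upper_cutoff(dist, p):
    """Return a such that P(X > a) = p, using the inverse cumulative distribution."""
    return dist.inv_cdf(1.0 - p)


# Illustration on a standard normal (mean 0, standard deviation 1); assumed values.
z = NormalDist(mu=0.0, sigma=1.0)
p_tail = upper_tail(z, 0.0)
p_mid = interval_prob(z, -1.0, 1.0)
a_cut = upper_cutoff(z, 0.025)
print(p_tail, p_mid, a_cut)
```

To adapt the sketch to another normal distribution, only the `mu` and `sigma` arguments of `NormalDist` need to change.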

Estimating the Mean of a Normally Distributed Population

Suppose an experiment is repeated $n$ times under identical conditions. Denote by $x_i$, $i = 1, 2, \ldots, n$, the outcome of each individual experiment. The arithmetic average $\bar{x}_n$ is calculated as

$$\bar{x}_n = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

When outcomes are not all distinct, we can count the number of times each value occurs. Suppose again
that an experiment is repeated $n$ times under identical conditions. But now, we assume that there are
only $k$ distinct values $x_j$, $j = 1, 2, \ldots, k$, and that $x_j$ occurs $f_j$ times. Then the arithmetic average $\bar{x}_n$ is
calculated as

$$\bar{x}_n = \frac{1}{n} \left( x_1 f_1 + x_2 f_2 + \cdots + x_k f_k \right) = \frac{1}{n} \sum_{j=1}^{k} x_j f_j$$

Example

Suppose that the following data represent the ages of patients in a study: 17, 19, 19, 20, 21, 24, 26, 26,
26, and 27. We find for the arithmetic average

$$\bar{x}_{10} = \frac{17 + 19 + 19 + 20 + 21 + 24 + 26 + 26 + 26 + 27}{10} = \frac{225}{10} = 22.5$$

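Since some of the values occur more than once, we can also use the frequency distribution:

| $x_j$ | 17 | 19 | 20 | 21 | 24 | 26 | 27 |
|-------|----|----|----|----|----|----|----|
| $f_j$ | 1  | 2  | 1  | 1  | 1  | 3  | 1  |

For the arithmetic average we find

$$\bar{x}_{10} = \frac{1}{10} \big( (17)(1) + (19)(2) + (20)(1) + (21)(1) + (24)(1) + (26)(3) + (27)(1) \big) = \frac{225}{10} = 22.5$$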
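The two ways of computing the arithmetic average in this example can be checked with a short script. This is a minimal Python sketch, not part of the original module. The function names are hypothetical, and `Counter` is used here only as a convenient way to build the frequency distribution.

```python
from collections import Counter

# Ages of the ten patients from the example.
ages = [17, 19, 19, 20, 21, 24, 26, 26, 26, 27]


def average(values):
    """Arithmetic average: sum of all outcomes divided by their number."""
    return sum(values) / len(values)


def average_from_frequencies(freq):
    """Arithmetic average from a mapping {distinct value: frequency}."""
    n = sum(freq.values())  # total number of observations
    return sum(x * f for x, f in freq.items()) / n


freq = Counter(ages)  # frequency distribution, e.g. 26 occurs 3 times
direct = average(ages)
weighted = average_from_frequencies(freq)
print(direct, weighted)
```

Both routes use the same total of 225 over 10 observations, so they agree.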

In-class Activity

We will explore the properties of the arithmetic mean when measurements are taken from a normal
variates from a normal distribution with mean 3 and variance 1. Recall that the function
=NORMINV(probability,mean,standard_dev) returns the inverse of the normal cumulative distribution
for the specified mean and standard deviation. Column C calculates the cumulative sum and Column D
has the corresponding arithmetic averages. The Figure plots Column D against Column A.

Use the F9 key to explore the arithmetic average. What do you observe?
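For readers without the spreadsheet, the columns described above can be imitated in a few lines. This is a minimal Python sketch under stated assumptions, not the spreadsheet itself: it assumes that each variate is obtained by feeding a uniform random number into the inverse normal cumulative distribution (the role NORMINV plays in EXCEL), and the function name `running_averages` is hypothetical. The sample size of 100 and the parameters (mean 3, standard deviation 1) follow the text.

```python
import random
from statistics import NormalDist


def running_averages(n, mean=3.0, sd=1.0, seed=None):
    """Return the arithmetic averages of the first 1, 2, ..., n variates."""
    rng = random.Random(seed)
    dist = NormalDist(mean, sd)
    total = 0.0          # cumulative sum (Column C in the text)
    averages = []        # running arithmetic averages (Column D in the text)
    for i in range(1, n + 1):
        u = rng.random()
        while u == 0.0:  # inv_cdf needs a probability strictly between 0 and 1
            u = rng.random()
        total += dist.inv_cdf(u)  # inverse-CDF draw, analogous to NORMINV
        averages.append(total / i)
    return averages


avgs = running_averages(100, seed=1)
print(avgs[0], avgs[9], avgs[99])
```

Calling the function again with a different `seed` corresponds to pressing F9: the early averages jump around, while the later ones tend to settle.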

Theory

In Explore 1, you observed that the arithmetic mean stabilizes around the mean of the normal
distribution, regardless of the variance, as you increase the sample size. This is a consequence of the
Law of Large Numbers. While we do not yet have the background to completely understand its
mathematical formulation, we will give it here anyway so that you can see how a mathematical result
expressing this property is formulated. We will come back to this result later in the course when we
have more background.

Law of Large Numbers

If $X_1, X_2, \ldots, X_n$ are independent and identically distributed with $E|X_i| < \infty$, then as $n$
tends to infinity, $\bar{X}_n$ converges to $EX_1$ in probability.
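For reference, the phrase "converges in probability" has a precise meaning, given here as the standard definition rather than something stated in the original module: for every tolerance $\varepsilon > 0$, however small, the probability that the arithmetic average misses $EX_1$ by more than $\varepsilon$ shrinks to zero.

$$\lim_{n \to \infty} P\left( \left| \bar{X}_n - EX_1 \right| > \varepsilon \right) = 0 \quad \text{for every } \varepsilon > 0$$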

Problems

1. A random variate is a particular outcome of a random variable. Assume that random variates are
drawn repeatedly from a normal distribution with mean 4 and variance 9. If you calculated the
arithmetic average for a large number of variates from this distribution, what would you expect the
arithmetic average to be close to?
2. The Law of Large Numbers holds quite generally. Without going more deeply into the theory, can

Based on our observations in Explore 1, we conclude that the mean of a normal distribution can be
estimated by repeatedly sampling from the normal distribution and calculating the arithmetic average of
the sample. This arithmetic average serves as an estimate for the mean of the normal distribution.

Properties of the Arithmetic Average

Explore 2

When you compare the arithmetic averages of 100 random variates in Explore 1, you will realize that
different runs of the simulation result in slightly different averages. Arithmetic averages are random
variables and we will explore their distribution as a function of the sample size. Again, we will use
normally distributed random variables.

A simulation is set up under the tab Explore 2 that simulates arithmetic averages of normally distributed
explore the effect of the sample size on the arithmetic average. What do you observe?
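The effect of the sample size can also be explored with a short script. This is a minimal Python sketch, not a reproduction of the spreadsheet: the sample sizes (5, 20, 80), the number of repeated runs (2000), the parameters (mean 3, standard deviation 1), and the function name `simulate_averages` are all assumptions chosen for illustration. It uses `random.gauss` to draw the normal variates.

```python
import random
from statistics import stdev


def simulate_averages(sample_size, runs, mu=3.0, sigma=1.0, seed=None):
    """Return one arithmetic average per run, each from a fresh sample."""
    rng = random.Random(seed)
    averages = []
    for _ in range(runs):
        sample = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        averages.append(sum(sample) / sample_size)
    return averages


# Spread (standard deviation) of the averages for each assumed sample size.
spread = {}
for n in (5, 20, 80):
    spread[n] = stdev(simulate_averages(n, runs=2000, seed=n))
    print(n, round(spread[n], 3))
```

The printed spreads indicate how much the arithmetic average varies from run to run at each sample size.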

Explore 3

The variation in the arithmetic mean comes from the fact that the random variates in each sample vary
from run to run. The more the random variates vary, the more the arithmetic mean varies. The degree
of variation is described by the standard deviation. To explore the effect of the variation, we simulate
we calculate arithmetic means for random variates that are normally distributed with mean 3 and
standard deviation 1; in the second scenario, we calculate arithmetic means for random variates that
are normally distributed with mean 3 and standard deviation 0.5. Details are explained in the
What do you observe?
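The two scenarios can be compared side by side in a script as well. This is a minimal Python sketch under stated assumptions: the mean of 3 and the standard deviations of 1 and 0.5 come from the text, while the sample size of 20, the 2000 repeated runs, and the function name `spread_of_averages` are assumptions made for illustration.

```python
import random
from statistics import stdev


def spread_of_averages(sigma, sample_size=20, runs=2000, mu=3.0, seed=None):
    """Standard deviation of the arithmetic averages over repeated runs."""
    rng = random.Random(seed)
    averages = []
    for _ in range(runs):
        sample = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        averages.append(sum(sample) / sample_size)
    return stdev(averages)


wide = spread_of_averages(sigma=1.0, seed=11)    # first scenario
narrow = spread_of_averages(sigma=0.5, seed=12)  # second scenario
print(round(wide, 3), round(narrow, 3))
```

With the sample size held fixed, any difference between the two printed values is due to the variability of the individual variates alone.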

Problems (cont.)

3. Based on your observations in Explore 2 and 3, what is the effect on the arithmetic mean when you
(a) increase sample size and (b) reduce variation? What does this imply for experiments?

The following result quantifies the effect on variance when we increase the sample size $n$. The larger the
sample size, the smaller the variance of the arithmetic mean. That is, the larger the sample size of a
sample drawn from a normal distribution, the more accurately we can estimate the mean of the
underlying normal distribution.

Theory

If $X$ is normally distributed with mean $\mu$ and standard deviation $\sigma$, one can show that the
arithmetic mean $\bar{X}_n$ is normally distributed with mean $\mu$ and standard deviation $\sigma / \sqrt{n}$.
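One way to see where the factor $1/\sqrt{n}$ comes from is sketched below. The derivation is not part of the original module. It assumes that $X_1, \ldots, X_n$ are independent, each with mean $\mu$ and variance $\sigma^2$, and it uses the standard rules for the expectation and variance of a sum. It does not show that $\bar{X}_n$ is itself normally distributed, which is a separate fact.

$$E[\bar{X}_n] = \frac{1}{n} \sum_{i=1}^{n} E[X_i] = \mu, \qquad \operatorname{Var}(\bar{X}_n) = \frac{1}{n^2} \sum_{i=1}^{n} \operatorname{Var}(X_i) = \frac{n \sigma^2}{n^2} = \frac{\sigma^2}{n}$$

Taking the square root of the variance gives the standard deviation $\sigma / \sqrt{n}$. For example, quadrupling the sample size halves the standard deviation of the arithmetic mean.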

Estimating the Variance of a Normally Distributed Population

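Suppose an experiment is repeated $n$ times under identical conditions. Denote by $x_i$, $i = 1, 2, \ldots, n$, the outcome of each individual experiment. The sample variance $s_n^2$ is calculated as

$$s_n^2 = \frac{(x_1 - \bar{x}_n)^2 + (x_2 - \bar{x}_n)^2 + \cdots + (x_n - \bar{x}_n)^2}{n - 1} = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x}_n)^2$$

The sample standard deviation $s_n$ is the square root of the sample variance: $s_n = \sqrt{s_n^2}$.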

The sample variance serves as an estimate for the variance of a normally distributed population. This
implies that if we wish to estimate the variance of a normally distributed population, we take a sample
and calculate the sample variance. As with estimating the mean, the larger the sample is, the better the
estimate will be. We will learn later in the course why we divide by $n - 1$ and not by $n$ when we calculate
the sample variance.
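To make the calculation concrete, the sample variance and sample standard deviation can be computed for the patient ages from the earlier example. This is a minimal Python sketch, not part of the original module. The function name `sample_variance` is hypothetical, and the result is compared against `statistics.variance`, which also divides by $n - 1$.

```python
import math
import statistics

# Ages of the ten patients from the earlier example.
ages = [17, 19, 19, 20, 21, 24, 26, 26, 26, 27]


def sample_variance(values):
    """Sum of squared deviations from the average, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are needed")
    avg = sum(values) / n
    return sum((x - avg) ** 2 for x in values) / (n - 1)


s2 = sample_variance(ages)          # sample variance
s = math.sqrt(s2)                   # sample standard deviation
library_s2 = statistics.variance(ages)
print(s2, s, library_s2)
```

Here the squared deviations from the average 22.5 add up to 122.5, which is then divided by 9 rather than 10.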

To gain some familiarity with the concept of estimation, we will simulate normally distributed variates
and estimate the mean and the variance from the simulated data.

Explore 4

The spreadsheet under the tab Explore 4 is set up to simulate 20 random variates from a normal
distribution with mean $\mu$ (Cell J3) and standard deviation $\sigma$ (Cell J4). In Cell H10, we estimate the mean
by calculating the arithmetic average (=AVERAGE(number1,[number2],...)). In Cell H11, we estimate
the variance by calculating the sample variance (=VAR(number1,[number2],...)). In Cell H12, we
calculate the sample standard deviation by taking the square root of the sample variance
(=SQRT(number)).

(a) Use the F9 key to explore how the estimates for the mean and the variance change from run to run.

(b) Change the simulation so that it simulates 40 random variates instead of 20.
Calculate the arithmetic mean, the sample variance, and the sample standard deviation. How
does increasing the sample size change your estimates?
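The same exercise can be tried outside the spreadsheet. This is a minimal Python sketch, not the Explore 4 worksheet itself: the values $\mu = 3$ and $\sigma = 1$ are assumptions (the text does not state the contents of Cells J3 and J4), the function name `estimate` is hypothetical, and `statistics.mean` and `statistics.variance` stand in for EXCEL's AVERAGE and VAR.

```python
import math
import random
import statistics


def estimate(n, mu=3.0, sigma=1.0, seed=None):
    """Simulate n normal variates and return (average, variance, std. deviation)."""
    rng = random.Random(seed)
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    avg = statistics.mean(sample)       # estimate of the mean
    var = statistics.variance(sample)   # sample variance, divides by n - 1
    return avg, var, math.sqrt(var)     # square root gives the sample std. deviation


# A few runs with 20 and with 40 variates; each seed plays the role of one F9 press.
for n in (20, 40):
    for seed in range(3):
        avg, var, sd = estimate(n, seed=seed)
        print(n, round(avg, 3), round(var, 3), round(sd, 3))
```

Comparing the printed rows for 20 and 40 variates over many seeds gives a feel for how much the estimates move from run to run at each sample size.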