
Estimating the Mean and Variance of a Normal Distribution

Learning Objectives

After completing this module, the student will be able to

explain the value of repeating experiments
explain the role of the law of large numbers in estimating population means
describe the effect of increasing the sample size or reducing measurement errors or other sources of variability

Knowledge and Skills

Properties of the arithmetic mean
Estimating the mean of a normal distribution
Law of Large Numbers
Estimating the variance of a normal distribution
Generating random variates in EXCEL

Prerequisites

1. Calculating sample mean and arithmetic average
2. Calculating sample variance and standard deviation
3. Normal distribution

Citation: Neuhauser, C. Estimating the Mean and Variance of a Normal Distribution.
Created: September 9, 2009. Revisions:
Copyright: 2009 Neuhauser. This is an open access article distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike License, which permits unrestricted use, distribution, and reproduction in any medium, and allows others to translate, make remixes, and produce new stories based on this work, provided the original author and source are credited and the new work will carry the same license.
Funding: This work was partially supported by a HHMI Professors grant from the Howard Hughes Medical Institute.

Pretest

1. Laura and Hamid are late for Chemistry lab. The lab manual asks them to determine the density of solid platinum by repeating the measurement three times. To save time, they decide to measure the density only once. Explain the consequences of this shortcut.
2. Tom and Bao Yu measured the density of solid platinum three times: 19.8, 21.4, and 21.9 g/cm^3. Determine the arithmetic average of these three measurements accurate to three decimal places.
3. The following graphs are densities of probability distributions. Which represent the density of a normal distribution?
[Figure: three probability density curves, panels (a), (b), and (c), each plotted against t from 0 to 6.]


4. Which two parameters are typically used to describe the normal distribution?
   a. Median
   b. Variance
   c. Standard deviation
   d. Mean
5. Suppose $X$ is normally distributed with mean 3 and standard deviation 1, that is, $X \sim N(3,1)$. Use EXCEL to (a) find $P(X > 3)$, (b) find $P(1 < X < 4)$, and (c) determine $a$ so that $P(X > a) = 0.74$.


Estimating the Mean of a Normally Distributed Population

Suppose an experiment is repeated $n$ times under identical conditions. Denote by $x_i$, $i = 1, 2, \ldots, n$, the outcome of each individual experiment. The arithmetic average $\bar{x}_n$ is calculated as

$$\bar{x}_n = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

When outcomes are not all distinct, we can count the number of times each value occurs. Suppose again that an experiment is repeated $n$ times under identical conditions, but now assume that there are only $k$ distinct values $x_j$, $j = 1, 2, \ldots, k$, and that $x_j$ occurs $f_j$ times. Then the arithmetic average $\bar{x}_n$ is calculated as

$$\bar{x}_n = \frac{1}{n}\left(x_1 f_1 + x_2 f_2 + \cdots + x_k f_k\right) = \frac{1}{n} \sum_{j=1}^{k} x_j f_j$$

Example

Suppose that the following data represent the ages of patients in a study: 17, 19, 19, 20, 21, 24, 26, 26, 26, and 27. We find for the arithmetic average

$$\bar{x}_{10} = \frac{17 + 19 + 19 + 20 + 21 + 24 + 26 + 26 + 26 + 27}{10} = \frac{225}{10} = 22.5$$

Since some of the values occur more than once, we can also use the frequency distribution:

$x_j$   17   19   20   21   24   26   27
$f_j$    1    2    1    1    1    3    1

For the arithmetic average we find

$$\bar{x}_{10} = \frac{1}{10}\big((17)(1) + (19)(2) + (20)(1) + (21)(1) + (24)(1) + (26)(3) + (27)(1)\big) = \frac{225}{10} = 22.5$$
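The two ways of computing the arithmetic average can also be checked outside of a spreadsheet. The following is a minimal Python sketch using the ages from the example above; the function names `mean_direct` and `mean_from_frequencies` are illustrative and not part of the module.

```python
from collections import Counter

# Ages of the ten patients from the example.
ages = [17, 19, 19, 20, 21, 24, 26, 26, 26, 27]

def mean_direct(values):
    """Sum all n outcomes and divide by n."""
    return sum(values) / len(values)

def mean_from_frequencies(values):
    """Count how often each distinct value x_j occurs (f_j), then
    compute (1/n) * sum of x_j * f_j over the distinct values."""
    frequencies = Counter(values)      # maps x_j -> f_j
    n = sum(frequencies.values())      # total number of outcomes
    return sum(x * f for x, f in frequencies.items()) / n

print(mean_direct(ages))            # 22.5
print(mean_from_frequencies(ages))  # 22.5
```

Both functions return the same value because the frequency form only groups equal terms of the original sum.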


In-class Activity

We will explore the properties of the arithmetic mean when measurements are taken from a normal distribution. Open the first tab (Explore 1) on the accompanying spreadsheet. Column B has 100 random variates from a normal distribution with mean 3 and variance 1. Recall that the function =NORMINV(probability, mean, standard_dev) returns the inverse of the normal cumulative distribution for the specified mean and standard deviation. Column C calculates the cumulative sum and Column D has the corresponding arithmetic averages. The Figure plots Column D against Column A.

Use the F9 key to explore the arithmetic average. What do you observe?
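For readers without the spreadsheet, the layout of Explore 1 can be approximated in Python. This is only a sketch under assumptions: it assumes the spreadsheet feeds a uniform random number into NORMINV (the module does not say how Column B is filled), and the helper names and the fixed seed are illustrative.

```python
import random
from itertools import accumulate
from statistics import NormalDist

def normal_variate(rng, mean, standard_dev):
    """Inverse-CDF draw, analogous to NORMINV applied to a uniform random number."""
    u = rng.random()
    while u == 0.0:  # the inverse CDF is undefined at 0, so redraw in that rare case
        u = rng.random()
    return NormalDist(mean, standard_dev).inv_cdf(u)

def running_averages(values):
    """Cumulative sums (Column C) divided by the count so far (Column D)."""
    return [total / count
            for count, total in enumerate(accumulate(values), start=1)]

rng = random.Random(2009)  # fixed seed (an assumption) so a run can be repeated
variates = [normal_variate(rng, 3.0, 1.0) for _ in range(100)]  # like Column B
averages = running_averages(variates)

# Print the running average after 1, 10, 50, and 100 variates.
for count in (1, 10, 50, 100):
    print(count, round(averages[count - 1], 3))
```

Changing the seed plays the role of pressing F9: each seed gives a new set of 100 variates and a new path of running averages.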

Theory

In Explore 1, you observed that the arithmetic mean stabilizes around the mean of the normal distribution, regardless of the variance, as you increase the sample size. This is a consequence of the Law of Large Numbers. While we do not yet have the background to completely understand its mathematical formulation, we will give it here anyway so that you can see how a mathematical result expressing this property is formulated. We will come back to this result later in the course when we have more background.

Law of Large Numbers

If $X_1, X_2, \ldots, X_n$ are independent and identically distributed with $E|X_i| < \infty$, then as $n$ tends to infinity, $\bar{X}_n$ converges to $EX_1$ in probability.
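For readers curious about the phrase "converges in probability", the statement can be written out as follows. This is the standard definition, which the module itself does not spell out; the symbol $\varepsilon$ (an arbitrary positive tolerance) is introduced here only for illustration.

```latex
\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i,
\qquad
\lim_{n \to \infty} P\left( \left| \bar{X}_n - EX_1 \right| > \varepsilon \right) = 0
\quad \text{for every } \varepsilon > 0 .
```

In words: however small a tolerance is chosen, the chance that the arithmetic average misses $EX_1$ by more than that tolerance shrinks to zero as the number of observations grows.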

Problems

1. A random variate is a particular outcome of a random variable. Assume that random variates are drawn repeatedly from a normal distribution with mean 4 and variance 9. If you calculated the arithmetic average for a large number of variates from this distribution, what would you expect the arithmetic average to be close to?
2. The Law of Large Numbers holds quite generally. Without going more deeply into the theory, can you guess the answer to the following problem? Suppose you repeatedly tossed a biased coin where heads occur with probability 0.2. What percentage of the time would you expect to see heads?

Based on our observations in Explore 1, we conclude that the mean of a normal distribution can be estimated by repeatedly sampling from the normal distribution and calculating the arithmetic average of the sample. This arithmetic average serves as an estimate for the mean of the normal distribution.


Properties of the Arithmetic Average

Explore 2

When you compare the arithmetic averages of 100 random variates in Explore 1, you will realize that different runs of the simulation result in slightly different averages. Arithmetic averages are random variables and we will explore their distribution as a function of the sample size. Again, we will use normally distributed random variables.

A simulation is set up under the tab Explore 2 that simulates arithmetic averages of normally distributed random variables. We vary the sample sizes. Details are explained in the spreadsheet. Use the F9 key to explore the effect of the sample size on the arithmetic average. What do you observe?
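A comparable experiment can be sketched in Python. The details of the Explore 2 tab are not given in this text, so the mean (3), standard deviation (1), sample sizes (5, 20, 80), and number of runs (2000) below are all assumptions chosen for illustration, as are the function and variable names.

```python
import random
from statistics import fmean, stdev

def simulate_averages(rng, sample_size, runs, mean=3.0, standard_dev=1.0):
    """Return `runs` arithmetic averages, each computed from a fresh
    sample of `sample_size` normally distributed variates."""
    return [
        fmean(rng.gauss(mean, standard_dev) for _ in range(sample_size))
        for _ in range(runs)
    ]

rng = random.Random(1)  # fixed seed (an assumption) for a repeatable run
spread_by_size = {}
for sample_size in (5, 20, 80):
    averages = simulate_averages(rng, sample_size, runs=2000)
    # The standard deviation of the averages measures how much they
    # vary from run to run for this sample size.
    spread_by_size[sample_size] = stdev(averages)
    print(sample_size,
          round(fmean(averages), 3),
          round(spread_by_size[sample_size], 3))
```

Each printed row shows the sample size, the average of the simulated averages, and their run-to-run spread, so the three sample sizes can be compared side by side.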

Explore 3

The variation in the arithmetic mean comes from the fact that the random variates in each sample vary from run to run. The more the random variates vary, the more the arithmetic mean varies. The degree of variation is described by the standard deviation. To explore the effect of the variation, we simulate arithmetic means for two different scenarios in the spreadsheet under tab Explore 3: in one simulation, we calculate arithmetic means for random variates that are normally distributed with mean 3 and standard deviation 1; in the second scenario, we calculate arithmetic means for random variates that are normally distributed with mean 3 and standard deviation 0.5. Details are explained in the spreadsheet. Use the F9 key to explore the effect of the standard deviation on the arithmetic average. What do you observe?

Problems (cont.)

3. Based on your observations in Explore 2 and 3, what is the effect on the arithmetic mean when you (a) increase the sample size and (b) reduce the variation? What does this imply for experiments?


The following result quantifies the effect on the variance when we increase the sample size $n$. The larger the sample size, the smaller the variance of the arithmetic mean. That is, the larger the sample drawn from a normal distribution, the more accurately we can estimate the mean of the underlying normal distribution.

Theory

If $X$ is normally distributed with mean $\mu$ and standard deviation $\sigma$, one can show that the arithmetic mean $\bar{X}_n$ of $n$ independent observations of $X$ is normally distributed with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.
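The mean and standard deviation in this result can be checked with a short calculation. The sketch below assumes that $X_1, \ldots, X_n$ are independent, each with mean $\mu$ and variance $\sigma^2$; it does not show that $\bar{X}_n$ is itself normally distributed, which requires more background.

```latex
\begin{aligned}
E\bar{X}_n &= \frac{1}{n}\sum_{i=1}^{n} EX_i = \frac{1}{n}\, n\mu = \mu, \\
\operatorname{Var}(\bar{X}_n) &= \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}(X_i)
  = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}, \\
\operatorname{SD}(\bar{X}_n) &= \sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}} .
\end{aligned}
```

For instance, with $\sigma = 1$ and $n = 100$, the values used in Explore 1, the arithmetic mean has standard deviation $1/\sqrt{100} = 0.1$.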

Estimating the Variance of a Normally Distributed Population

Suppose an experiment is repeated $n$ times under identical conditions. Denote by $x_i$, $i = 1, 2, \ldots, n$, the outcome of each individual experiment. The sample variance $s_n^2$ is calculated as

$$s_n^2 = \frac{(x_1 - \bar{x}_n)^2 + (x_2 - \bar{x}_n)^2 + \cdots + (x_n - \bar{x}_n)^2}{n - 1} = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x}_n)^2$$

where $\bar{x}_n$ denotes the arithmetic average of the $n$ outcomes $x_j$, $j = 1, 2, \ldots, n$. The sample standard deviation $s_n$ is the square root of the sample variance: $s_n = \sqrt{s_n^2}$.

The sample variance serves as an estimate for the variance of a normally distributed population. This implies that if we wish to estimate the variance of a normally distributed population, we take a sample and calculate the sample variance. As with estimating the mean, the larger the sample is, the better the estimate will be. We will learn later in the course why we divide by $n - 1$ and not by $n$ when we calculate the sample variance.
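As a worked illustration of the formula, the Python sketch below computes the sample variance of the patient ages from the earlier example. The module does not carry out this calculation itself, and the function name `sample_variance` is illustrative.

```python
from statistics import variance

def sample_variance(values):
    """Sum of squared deviations from the arithmetic average,
    divided by n - 1 (not n)."""
    n = len(values)
    average = sum(values) / n
    return sum((x - average) ** 2 for x in values) / (n - 1)

# Ages of the ten patients from the earlier example (average 22.5).
ages = [17, 19, 19, 20, 21, 24, 26, 26, 26, 27]

s2 = sample_variance(ages)   # squared deviations sum to 122.5, divided by 9
s = s2 ** 0.5                # sample standard deviation

print(round(s2, 4))          # 13.6111
print(round(s, 4))

# The standard library's variance() also divides by n - 1.
print(abs(s2 - variance(ages)) < 1e-9)  # True
```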

To gain some familiarity with the concept of estimation, we will simulate normally distributed variates and estimate the mean and the variance from the simulated data.

Explore 4

The spreadsheet under the tab Explore 4 is set up to simulate 20 random variates from a normal distribution with mean $\mu$ (Cell J3) and standard deviation $\sigma$ (Cell J4). In Cell H10, we estimate the mean by calculating the arithmetic average (=AVERAGE(number1, [number2], ...)). In Cell H11, we estimate the variance by calculating the sample variance (=VAR(number1, [number2], ...)). In Cell H12, we calculate the sample standard deviation by taking the square root of the sample variance (=SQRT(number)).

(a) Use the F9 key to explore how the estimates for the mean and the variance change from run to run.

(b) Change the simulation so that it simulates 40 random variates instead of 20. Calculate the arithmetic mean, the sample variance, and the sample standard deviation. How does increasing the sample size change your estimates?
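The same exercise can be tried in Python. This is a sketch under assumptions, not the spreadsheet's actual setup: the values $\mu = 3$ and $\sigma = 1$ stand in for whatever is entered in Cells J3 and J4, and the function name `estimate` and the seed are illustrative.

```python
import random
from statistics import mean, variance

def estimate(rng, sample_size, mu, sigma):
    """Draw one sample and return the estimates of the mean,
    the variance, and the standard deviation."""
    sample = [rng.gauss(mu, sigma) for _ in range(sample_size)]
    s2 = variance(sample)          # divides by n - 1, like =VAR(...)
    return mean(sample), s2, s2 ** 0.5

rng = random.Random(4)             # fixed seed (an assumption) for repeatability
mu, sigma = 3.0, 1.0               # assumed stand-ins for Cells J3 and J4

for sample_size in (20, 40):
    for run in range(3):           # three runs, like pressing F9 three times
        m, s2, s = estimate(rng, sample_size, mu, sigma)
        print(sample_size, round(m, 3), round(s2, 3), round(s, 3))
```

Each row lists the sample size followed by the three estimates, so runs with 20 and with 40 variates can be compared against the true values used in the simulation.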

Homework (Reading Assignments are from C. Neuhauser, Calculus for Biology and Medicine, 3rd edition, Prentice Hall)

Read Section 12.7.1.
Do Problems 1-8 and 11 in Section 12.7.
