
Stat 250 Gunderson Lecture Notes

7: Learning about a Population Mean

Part 1: Distribution for a Sample Mean

Recall: Parameters, Statistics, and Sampling Distributions
We go back to the scenario where we have one population of interest, but now the response being measured is quantitative (not categorical). We want to learn about the value of the population mean $\mu$. We take a random sample and use the sample statistic, the sample mean $\bar{X}$, to estimate the parameter. When we do this, the sample mean may not be equal to the population mean; in fact, it could change every time we take a new random sample.

So recall that a statistic is a random variable and it will have a probability distribution. This probability distribution is called the sampling distribution of the statistic.

We turn to understanding the sampling distribution of the sample mean, which will be used to construct a confidence interval estimate for the population mean and to test hypotheses about the value of a population mean.

Sampling Distribution for One Sample Mean

Many responses of interest are measurements: height, weight, distance, reaction time, scores. We want to learn about a population mean, and we will do so using the information provided by a sample from the population.

Example: How many hours per week do you work?
A poll was conducted by a Center for Workforce Development. A probability sample of 1000 workers resulted in a mean number of hours worked per week of 43.

Population = all full-time workers in the U.S.

Parameter = $\mu$ = population mean number of hours worked per week (unknown)

Sample = the 1000 workers polled (and their responses)

Statistic = $\bar{x}$ = sample mean number of hours worked per week = 43 (known for a given selected sample)

Can anyone say how close this observed sample mean $\bar{x}$ of 43 is to the population mean $\mu$? No.

If we were to take another random sample of the same size, would we get the same value for the sample mean? Probably NOT.

So what are the possible values for the sample mean $\bar{x}$ if we took many random samples of the same size from this population? What would the distribution of the possible $\bar{x}$ values look like? What can we say about the distribution of the sample mean?


Distribution of the Sample Mean: Main Results
Let $\mu$ = mean for the population of interest and $\sigma$ = standard deviation for that population.
Let $\bar{x}$ = the sample mean for a random sample of size $n$.

If all possible random samples of the same size $n$ are taken and $\bar{x}$ is computed for each, then:

• The average of all of the possible sample mean values is equal to __the population mean $\mu$__.
  Thus the sample mean is an __unbiased__ estimator of the population mean.

• The standard deviation of all of the possible sample mean values is equal to the original population standard deviation divided by $\sqrt{n}$.
  Standard deviation of the sample mean is given by: s.d.($\bar{x}$) = $\sigma/\sqrt{n}$

What about the shape of the sampling distribution? The first two bullets above provide what the mean and the standard deviation are for the possible sample mean values. The final two bullets tell us that the shape of the distribution will be (approximately) normal.

• If the parent (original) population has a normal distribution, then the distribution of the possible values of $\bar{x}$, the sample mean, is normal.

• If the parent (original) population is not necessarily normally distributed but the sample size $n$ is large, then the distribution of the possible values of $\bar{x}$, the sample mean, is approximately normal.

This last result is called the CENTRAL LIMIT THEOREM (CLT).
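The main results above can be checked by simulation. The sketch below is only an illustration, not part of the course materials: it assumes a hypothetical right-skewed population (exponential with mean 2, so its standard deviation is also 2), draws many random samples of size n = 50, and compares the mean and standard deviation of the resulting sample means with $\mu$ and $\sigma/\sqrt{n}$. The helper name `simulate_sample_means` is made up for this example.

```python
import random
import statistics

def simulate_sample_means(draw, n, reps, seed=0):
    """Return `reps` sample means, each from a random sample of size n."""
    rng = random.Random(seed)
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(reps)]

# Hypothetical skewed population: exponential with mean 2 (its s.d. is also 2).
mu, sigma, n = 2.0, 2.0, 50
means = simulate_sample_means(lambda rng: rng.expovariate(1 / mu), n, reps=5000)

print("average of the sample means:", round(statistics.fmean(means), 3))
print("population mean:            ", mu)
print("s.d. of the sample means:   ", round(statistics.stdev(means), 3))
print("sigma / sqrt(n):            ", round(sigma / n ** 0.5, 3))
```

A histogram of `means` would look roughly bell-shaped even though the parent population is skewed, which is what the CLT describes for a large sample size.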


The "C" in CLT is for CENTRAL. The CLT is an important or central result in statistics. As it turns out, many normal curve approximations for various statistics are really applications of the CLT. The Stat 250 formula card summarizes the distribution of a sample mean as follows:

[Formula card excerpt: $\bar{X}$ is (approximately) $N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$]

Try It! SRT Test Scores
A particular test for measuring various aspects of verbal memory is known as the Selective Reminding Task (SRT) test. It is based on hearing, recalling, and learning 12 words presented to the client. Scores for various aspects of verbal memory are combined to give an overall score. Let $X$ represent the overall score for 20-year-old females. Such scores are normally distributed with a mean of 126 and a standard deviation of 10.

a. What is the probability that a randomly selected 20-year-old female will have a score above 134?

$P(X > 134) = P\left(Z > \frac{134 - 126}{10}\right) = P(Z > 0.8) = 1 - 0.7881 = 0.2119$

b. A random sample of 9 such females will be selected. What is the probability that all nine will score below 134?

Using independence: $(0.7881)^9 = 0.1173$ (or could have viewed it as a binomial).

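c. A random sample of 9 such females will be selected. What is the probability that their sample mean score will be above 134?

Here we have that $\bar{X}$ is $N\left(126, \frac{10}{\sqrt{9}}\right)$.

$P(\bar{X} > 134) = P\left(Z > \frac{134 - 126}{10/\sqrt{9}}\right) = P(Z > 2.4) = 1 - 0.9918 = 0.0082$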
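The three SRT answers can be reproduced without a table. This is a minimal sketch that uses `NormalDist` from Python's standard `statistics` module in place of the normal table; the variable names are chosen for this example only.

```python
from statistics import NormalDist

# X = overall SRT score for one 20-year-old female: N(126, 10)
X = NormalDist(mu=126, sigma=10)

# (a) one randomly selected female scores above 134
p_a = 1 - X.cdf(134)

# (b) all nine independently selected females score below 134
p_b = X.cdf(134) ** 9

# (c) the sample mean of n = 9 scores is above 134; X-bar is N(126, 10/sqrt(9))
n = 9
Xbar = NormalDist(mu=126, sigma=10 / n ** 0.5)
p_c = 1 - Xbar.cdf(134)

print(round(p_a, 4))  # about 0.2119
print(round(p_b, 4))  # about 0.1173
print(round(p_c, 4))  # about 0.0082
```

Comparing (a) and (c) shows how much less variable a sample mean is than a single observation: the same cutoff of 134 is 0.8 standard deviations above the mean for one score but 2.4 standard deviations above the mean for the average of nine.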


Try It! Actual Flight Times
Suppose the random variable $X$, the actual flight time (in minutes) for Delta Airlines flights from Cincinnati to Tampa, follows a uniform distribution over the range of 110 minutes to 130 minutes.

a. Sketch the distribution for $X$ (include axes labels and some values on the axes).

b. Suppose we were to repeatedly take a random sample of size 100 from this distribution and compute the sample mean for each sample. What would the histogram of the sample mean values look like? Provide a smoothed-out sketch of the distribution of the sample mean; include all details that you can.

By the CLT, the histogram should resemble approximately a normal distribution with a mean of 120 and a standard deviation of $\sigma/\sqrt{n}$. For a uniform distribution on 110 to 130, $\sigma = (130 - 110)/\sqrt{12} \approx 5.8$ minutes, so for this example the standard deviation of the sample mean is $5.8/10 = 0.58$ minutes.
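The answer to part (b) can be checked numerically. The sketch below computes $\sigma$ for the uniform model, then simulates 2000 samples of size 100 to see whether the sample means behave as the CLT predicts. The seed and the number of repetitions are arbitrary choices for this illustration.

```python
import math
import random
import statistics

a, b, n = 110, 130, 100

# Standard deviation of a uniform distribution on [a, b] is (b - a) / sqrt(12).
sigma = (b - a) / math.sqrt(12)
sd_xbar = sigma / math.sqrt(n)
print(round(sigma, 2), round(sd_xbar, 2))  # about 5.77 and 0.58

# Simulate repeated samples of size n and record each sample mean.
rng = random.Random(250)
means = [statistics.fmean(rng.uniform(a, b) for _ in range(n))
         for _ in range(2000)]

print(round(statistics.fmean(means), 2))  # should be near 120
print(round(statistics.stdev(means), 2))  # should be near sd_xbar
```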

Try It! True or False
Determine whether each of the following statements is True or False. A true statement is always true. Clearly circle your answer.

a. The central limit theorem is important in statistics because for a large random sample, it says the sampling distribution of the sample mean is approximately normal.

True    False

b. The sampling distribution of a parameter is the distribution of the parameter value if repeated random samples are obtained.

True    False


More on the Standard Deviation of $\bar{X}$

The standard deviation of the sample mean is given by: s.d.($\bar{x}$) = $\sigma/\sqrt{n}$

This quantity would give us an idea about how far apart a sample mean $\bar{x}$ and the true population mean $\mu$ are expected to be on average.

We can interpret the standard deviation of the sample mean as approximately the average distance of the possible sample mean values (for repeated samples of the same size $n$) from the true population mean $\mu$.

Note: If the sample size increases, the standard deviation decreases, which says the possible sample mean values will be closer to the true population mean (on average).

The s.d.($\bar{x}$) is a measure of the accuracy of the process of using a sample mean to estimate the population mean. This quantity $\sigma/\sqrt{n}$ does not tell us exactly how far away a particular observed $\bar{x}$ value is from $\mu$.

In practice, the population standard deviation $\sigma$ is rarely known, so the sample standard deviation $s$ is used. As with proportions, when making this substitution we call the result the standard error of the mean, s.e.($\bar{x}$) = $s/\sqrt{n}$. This terminology makes sense, because this is a measure of how much, on average, the sample mean is in error as an estimate of the population mean.

Standard error of the sample mean is given by: s.e.($\bar{x}$) = $s/\sqrt{n}$

This quantity is an estimate of the standard deviation of $\bar{x}$.

So we can interpret the standard error of the sample mean as estimating, approximately, the average distance of the possible $\bar{x}$ values (for repeated samples of the same size $n$) from the true population mean $\mu$.

Moreover, we can use this standard error to create a range of values that we are very confident will contain the true population mean $\mu$, namely $\bar{x} \pm$ (few)s.e.($\bar{x}$). This is the basis for the confidence interval for the population mean discussed in Part 2.


Preparing for Statistical Inference: Standardized Statistics

In our SRT Test Scores and Actual Flight Times examples earlier, we have already constructed and used a standardized z-statistic for a sample mean.
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ has (approximately) a standard normal distribution $N(0,1)$.

Dilemma = __we rarely know the value of $\sigma$__

If we replace the population standard deviation $\sigma$ with the sample standard deviation $s$, then $\frac{\bar{x} - \mu}{s/\sqrt{n}}$ won't be approximately $N(0,1)$; instead it has a t-distribution with $n - 1$ degrees of freedom.

Student's t-Distribution
A little about the family of t-distributions ...
• They are symmetric, unimodal, centered at 0.
• They are flatter with heavier tails compared to the $N(0,1)$ distribution.
• As the degrees of freedom (df) increases ... the t-distribution approaches the $N(0,1)$ distribution.
• We can still use the ideas about standard scores for a frame of reference.
• Tables A.2 and A.3 summarize percentiles for various t-distributions.

From Utts, Jessica M. and Robert F. Heckard. Mind on Statistics, Fourth Edition. 2012. Used with permission.

We will see more on t-distributions when we do inference about population mean(s).


Every Statistic has a Sampling Distribution

The sampling distribution of a statistic is the distribution of possible values of the statistic for repeated samples of the same size from a population.

So far we have discussed the sampling distribution of a sample proportion, the sampling distribution of the difference between two sample proportions, and the sampling distribution of the sample mean. In all cases, under specified conditions the sampling distribution was approximately normal.

Every statistic has a sampling distribution, but the appropriate distribution may not always be normal, or even be approximately bell-shaped.

You can construct an approximate sampling distribution for any statistic by actually taking repeated samples of the same size from a population and constructing a histogram for the values of the statistic over the many samples.
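The repeated-sampling recipe can be sketched in a few lines. The example below is an illustration only: it uses a statistic not discussed in these notes, the sample maximum of n = 10 draws from a hypothetical uniform(0, 1) population, to show a sampling distribution that is not bell-shaped. The function name `sampling_distribution` is made up for this sketch.

```python
import random
import statistics

def sampling_distribution(statistic, draw, n, reps, seed=0):
    """Values of `statistic` over `reps` random samples of size n."""
    rng = random.Random(seed)
    return [statistic([draw(rng) for _ in range(n)]) for _ in range(reps)]

# Statistic: the sample maximum; population: uniform on (0, 1) (an assumption).
maxima = sampling_distribution(max, lambda rng: rng.random(), n=10, reps=5000)

print("mean of the maxima:  ", round(statistics.fmean(maxima), 3))
print("median of the maxima:", round(statistics.median(maxima), 3))
```

The mean falls below the median here, a sign that the histogram of the maxima is skewed to the left (piled up near 1) rather than symmetric.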

Additional Notes
A place to jot down questions you may have and ask during office hours, take a few extra notes, write out an extra problem or summary completed in lecture, create your own summary about these concepts.


Part 2: Confidence Interval for a Population Mean

"Do not put faith in what statistics say until you have carefully considered what they do not say." William W. Watt

Earlier we studied confidence intervals for estimating a population proportion and the difference between two population proportions. Recall it is important to understand how to interpret an interval and how to interpret what the confidence level really means.

• The interval provides a range of reasonable values for the parameter with an associated high level of confidence. For example, we can say, "We are 95% confident that the proportion of Americans who do not get enough sleep at night is somewhere between 0.325 and 0.395, based on a random sample of n = 935 American adults."

• The 95% confidence level describes our confidence in the procedure we used to make the interval. If we repeated the procedure many times, we would expect about 95% of the intervals to contain the population parameter.
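The meaning of the confidence level can be illustrated by simulation. In the sketch below, the true proportion 0.36 is a hypothetical value chosen for illustration (it is not known in practice), the sample size matches the n = 935 above, and the interval used is the large-sample form $\hat{p} \pm 1.96\sqrt{\hat{p}(1-\hat{p})/n}$. The function names are made up for this example.

```python
import random

def proportion_interval(successes, n, z_star=1.96):
    """Approximate 95% confidence interval for a population proportion."""
    p_hat = successes / n
    se = (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - z_star * se, p_hat + z_star * se

def coverage(true_p, n, reps, seed=1):
    """Fraction of simulated intervals that contain the true proportion."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        successes = sum(rng.random() < true_p for _ in range(n))
        lower, upper = proportion_interval(successes, n)
        hits += lower <= true_p <= upper
    return hits / reps

rate = coverage(true_p=0.36, n=935, reps=1000)
print(rate)  # expected to land near 0.95
```

Any single interval either contains the true proportion or it does not; the 95% describes how often the procedure succeeds over many repetitions.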

Confidence Interval for a Population Mean

Consider a study on the design of a highway sign. A question of interest is: What is the mean maximum distance at which drivers are able to read the sign? A highway safety researcher will take a random sample of n = 16 drivers and measure the maximum distances (in feet) at which each can read the sign.

Population parameter
$\mu$ = __population__ mean maximum distance to read the sign for __all drivers__.

Sample estimate
$\bar{x}$ = sample mean maximum distance to read the sign for the random sample of drivers.

But we know the sample estimate $\bar{x}$ may not equal $\mu$; in fact, the possible $\bar{x}$ values vary from sample to sample. Because the sample mean is computed from a random sample, it is a random variable, with a probability distribution.

Sampling Distribution of the Sample Mean

If $\bar{x}$ is the sample mean for a random sample of size $n$, and either the original population of responses has a normal model or the sample size is large enough, the distribution of the sample mean is (approximately) $N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$.


So the possible $\bar{x}$ values vary normally around $\mu$ with a standard deviation of $\sigma/\sqrt{n}$. The standard deviation of the sample mean, $\sigma/\sqrt{n}$, is roughly the average distance of the possible sample mean values from the population mean $\mu$. Since we don't know the population standard deviation $\sigma$, we will use the sample standard deviation $s$, resulting in the standard error of the sample mean.

Standard Error of the Sample Mean

s.e.($\bar{x}$) = $s/\sqrt{n}$, where $s$ = sample standard deviation

The standard error of $\bar{x}$ estimates, roughly, the average distance of the possible $\bar{x}$ values from $\mu$. The possible $\bar{x}$ values result from considering all possible random samples of the same size $n$ from the same population.

So we have our estimate of the population mean, the sample mean $\bar{x}$, and we have its standard error. To make our confidence interval, we need to know the multiplier.

Sample Estimate ± Multiplier x Standard error

The multiplier for a confidence interval for the population mean is denoted by $t^*$, which is the value in a Student's t-distribution with df = $n - 1$ such that the area between $-t^*$ and $t^*$ equals the desired confidence level. The value of $t^*$ will be found using Table A.2. First let's give the formal result.

One-sample t Confidence Interval for $\mu$

$\bar{x} \pm t^* \, \text{s.e.}(\bar{x})$
where $t^*$ is an appropriate value for a $t(n - 1)$ distribution.

This interval requires we have a random sample from a normal population. If the sample size is large (n > 30), the assumption of normality is not so crucial and the result is approximate.

Important items:
• be sure to check the conditions
• know how to interpret the confidence interval
• be able to explain what the confidence level of, say, 95% really means


Try It! Using Table A.2 to find t*
(a) Find $t^*$ for a 90% confidence interval based on n = 12 observations.

df = $n - 1$ = 11
$t^*$ = 1.80

(b) Find $t^*$ for a 95% confidence interval based on n = 30 observations.

df = $n - 1$ = 29
$t^*$ = 2.05

(c) Find $t^*$ for a 95% confidence interval based on n = 54 observations.

df = 53; be conservative and use df = 50
$t^*$ = 2.01

(d) What happens to the value of $t^*$ as the sample size (and thus the degrees of freedom) gets larger?

$t^*$ gets smaller and approaches the corresponding $z^*$ value.
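To see the comparison in part (d), the sketch below computes the $z^*$ multipliers with the standard library and sets them beside the $t^*$ values read from Table A.2 in parts (a) to (c). The $t^*$ values are copied from the answers above, not computed; the helper name `z_star` is made up for this example.

```python
from statistics import NormalDist

def z_star(confidence):
    """Standard normal multiplier with central area equal to `confidence`."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

# (confidence level, df) -> t* as read from Table A.2 in parts (a)-(c)
table_t_star = {(0.90, 11): 1.80, (0.95, 29): 2.05, (0.95, 50): 2.01}

for (confidence, df), t_star in table_t_star.items():
    print(f"{confidence:.0%} level, df = {df}: "
          f"t* = {t_star:.2f}, z* = {z_star(confidence):.3f}")
```

In each case $t^*$ is a bit larger than $z^*$, and the gap narrows as the degrees of freedom grow (2.05 at df = 29 versus 2.01 at df = 50, against a $z^*$ of about 1.96).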

From Utts, Jessica M. and Robert F. Heckard. Mind on Statistics, Fourth Edition. 2012. Used with permission.


Try It! Confidence Interval for the Mean Maximum Distance
Recall the study on the design of a highway sign. The researcher wanted to learn about the mean maximum distance at which drivers are able to read the sign. The researcher took a random sample of n = 16 drivers and measured the maximum distances (in feet) at which each can read the sign. The data are provided below.

440 490 600 540 540 600 240 440
360 600 490 400 490 540 440 490

a. Verify the necessary conditions for computing a confidence interval for the population mean distance. We are told that the sample was a random sample, so we just need to check if a normal model for the response, maximum distance, for the population is reasonable.

[Figure: boxplot, histogram, and QQ plot of the maximum distances]

Comments:
Is the response normally distributed? The boxplot, histogram, and QQ plot show that a low outlier is present. It can't be deleted unless there is a clear reason. If there is no clear reason, we might look at the results with and without the low outlier. Here, the outlier is someone with poor distance vision who forgot their glasses.

b. Compute the sample mean maximum distance and the standard error (without the outlier).
The sample mean maximum distance = $\bar{x}$ = 497.3 feet.
The sample standard deviation = $s$ = 73.4 feet.
The standard error of the mean = $s/\sqrt{n}$ = $73.4/\sqrt{15}$ = 19.0 feet.

The average distance of the possible sample mean values from the population mean is roughly 19 feet.

c. Use a 95% confidence interval to estimate the population mean maximum distance at which all drivers can read the sign. Write a paragraph that interprets this interval and the confidence level.

$\bar{x} \pm t^* \, \text{s.e.}(\bar{x})$ => $497.3 \pm (2.14)(19.0)$ => $497.3 \pm 40.6$

(456.7, 537.9)
We can say that in the population of drivers represented by the sample, the mean maximum sign-reading distance is estimated to be between 456.7 feet and 537.9 feet. (*This applies to drivers with adequate distance vision.)
If we repeated this procedure many times, we'd expect 95% of the resulting confidence intervals to contain the population mean maximum distance $\mu$.
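The hand calculation in parts (b) and (c) can be reproduced from the raw data. This sketch drops the low outlier of 240, as in part (b), and takes $t^*$ = 2.14 (df = 14) from the table rather than computing it, so the endpoints may differ from the hand answer in the last digit because of rounding.

```python
import statistics

distances = [440, 490, 600, 540, 540, 600, 240, 440,
             360, 600, 490, 400, 490, 540, 440, 490]
without_outlier = [d for d in distances if d != 240]  # drop the low outlier

n = len(without_outlier)                 # 15 drivers remain
xbar = statistics.mean(without_outlier)  # sample mean
s = statistics.stdev(without_outlier)    # sample standard deviation
se = s / n ** 0.5                        # standard error of the mean

t_star = 2.14  # from the t table: df = n - 1 = 14, 95% confidence
lower, upper = xbar - t_star * se, xbar + t_star * se

print(f"n = {n}, mean = {xbar:.1f}, s = {s:.1f}, s.e. = {se:.1f}")
print(f"95% CI: ({lower:.1f}, {upper:.1f})")
```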

Using R Commander we would use the Single-Sample t-Test to produce the following results. Both the confidence interval and a test of hypotheses will be provided. We will discuss hypothesis testing for a mean in Part 3.

One Sample t-test


data: MaxDist
t = 26.23, df = 14, p-value = 2.651e-13
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
456.6676 537.9991
sample estimates:
mean of x
497.3333


Part 3: Testing about a Population Mean

Introduction to Hypothesis Tests for Means

We have already been introduced to the logic and steps of hypothesis testing for learning about a population proportion and for the difference between two population proportions. Recall the big idea that we declare "statistical significance" and reject the null hypothesis if the p-value is less than or equal to the significance level $\alpha$. Now we will extend these ideas to testing about means, focusing first on hypothesis testing about a single population mean.

A few notes: Hypotheses and conclusions apply to the larger population(s) represented by the sample(s). And if the distribution of a quantitative variable is highly skewed, we should consider analyzing the median rather than the mean. Methods for testing hypotheses about medians are a special case of nonparametric methods, which we will not cover in detail, but which exist should the need arise.

Next let's review the Basic Steps in Any Hypothesis Test.

Step 1: Determine the null and alternative hypotheses.
The hypotheses are statements about the population(s), not the sample(s).
The null hypothesis defines a specific value of a population parameter, called the null value.

Step 2: Verify necessary data conditions, and if met, summarize the data into an appropriate test statistic.

A relevant statistic is calculated from sample information and summarized into a "test statistic." We measure the difference between the sample statistic and the null value using the standardized statistic:

$\frac{\text{Sample statistic} - \text{Null value}}{\text{(Null) standard error}}$

For hypotheses about proportions, the standardized statistic is called a __z-statistic__ and the __standard normal distribution__ is used to find the p-value.

For hypotheses about means, the standardized statistic is called a __t-statistic__ and the __t-distribution__ is used to find the p-value.

Step 3: Assuming the null hypothesis is true, find the p-value.

A p-value is computed based on the standardized "test statistic." The p-value is calculated by temporarily assuming the null hypothesis to be true and then calculating the probability that the test statistic could be as large in magnitude as it is (or larger) in the direction(s) specified by the alternative hypothesis.



Step 4: Decide if the result is statistically significant based on the p-value.

Based on the p-value, we either reject or fail to reject the null hypothesis. The most commonly used criterion (level of significance) is that we reject the null hypothesis when the p-value is less than or equal to the significance level (generally 0.05). In many research articles, p-values are simply reported and readers are left to draw their own conclusions. Remember that a p-value measures the strength of the evidence against the null hypothesis, and the smaller the p-value, the stronger the evidence against the null (and for the alternative). Reject H0 if p-value $\le \alpha$.

The Beauty of p-values: Suppose the significance level is set at 5% for testing H0: status quo versus Ha: the "new theory."

If p-value is   Statistical Decision   Feasible Conclusion about the New Theory
0.462           Fail to Reject H0      Not even close to significance; maybe throw out that new theory
0.063           Fail to Reject H0      Just missed being significant at 5%. Practically important? New study? Larger n?
0.041           Reject H0              Significant at the 5% level; sufficient support for the new theory
0.003           Reject H0              Highly significant; strong support for the new theory

Step 5: Report the conclusion in the context of the situation.

The decision is to reject or fail to reject the null hypothesis, but the conclusion should go back to the original question of interest being asked. It should be stated in terms of the particular scenario or situation.

Testing Hypotheses about One Population Mean

We have one population and a response that is quantitative. We wish to test about the value of the mean response for the population, $\mu$. The data are assumed to be a random sample. The response is assumed to be normally distributed for the population (but if the sample size is large, this condition is less crucial).

Step 1: Determine the null and alternative hypotheses.

1. H0: $\mu = \mu_0$ versus Ha: $\mu > \mu_0$ (one-sided to the right)
2. H0: $\mu = \mu_0$ versus Ha: $\mu < \mu_0$ (one-sided to the left)
3. H0: $\mu = \mu_0$ versus Ha: $\mu \ne \mu_0$ (two-sided)


Step 2: Verify necessary data conditions, and if met, summarize the data into an appropriate test statistic.

How would you check the conditions as stated in the scenario above?

Random sample: maybe a time plot (for the identically distributed part) if the data were collected over time

Normality: histogram and a QQ plot

Test statistic = (Sample statistic − Null value) / Standard error

$t = \frac{\bar{x} - \mu_0}{\text{s.e.}(\bar{x})} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$

If H0 is true, this test statistic has a __t(n − 1)__ distribution.

Step 3: Assuming the null hypothesis is true, find the p-value.

Basic steps for finding a p-value:
• Draw the distribution for the test statistic under H0. For t-tests it will be a t-distribution with a certain df.
• Locate the observed test statistic value on the axis.
• Shade in the area that corresponds to the p-value. Look at the alternative hypothesis for the direction of extreme.
• Use the appropriate table to find (bounds for) the p-value. For t-tests we use Table A.3.

Step 4: Decide whether or not the result is statistically significant based on the p-value.
The level of significance $\alpha$ is selected in advance. We reject the null hypothesis if the p-value is less than or equal to $\alpha$. In this case, we say the results are statistically significant at the level $\alpha$.

Step 5: Report the conclusion in the context of the situation.
Once the decision is made, a conclusion in the context of the problem can be stated.

From the Stat 250 formula card:

Population Mean
Parameter: $\mu$
Statistic: $\bar{x}$
Standard Error: s.e.($\bar{x}$) = $s/\sqrt{n}$
One-Sample t-Test: $t = \frac{\bar{x} - \mu_0}{\text{s.e.}(\bar{x})} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$, df = $n - 1$

From Utts, Jessica M. and Robert F. Heckard. Mind on Statistics, Fourth Edition. 2012. Used with permission.


Try It! Using Table A.3 to find a p-value for a one-sided test
We are testing H0: $\mu = 0$ versus Ha: $\mu > 0$ with n = 15 observations and the observed test statistic is $t = 1.97$.

• Draw the distribution for the test statistic under H0.
• Locate the observed test statistic value on the axis.
• Shade in the area that corresponds to the p-value. Look at the alternative hypothesis for the direction of extreme.
• Use the appropriate table to find (bounds for) the p-value. For t-tests we will use Table A.3.

The p-value is between 0.033 and 0.047: 0.033 < p-value < 0.047 (it is closer to 0.033, and the exact p-value is 0.0345).
Is the value of $t = 1.97$ significant at the 5% level? __yes__ At the 1% level? __no__
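The exact p-value quoted above can be approximated without a table. Python's standard library has no t-distribution, so this sketch integrates the t density numerically with Simpson's rule; the function names `t_pdf` and `t_upper_tail` are made up for this illustration, and statistical software such as R would normally be used instead.

```python
import math

def t_pdf(t, df):
    """Density of Student's t-distribution with `df` degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for T ~ t(df), using Simpson's rule on [0, |t|]."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3  # P(T > |t|), by symmetry about 0
    return upper if t >= 0 else 1 - upper

# One-sided (right-tailed) test with n = 15, so df = 14, and observed t = 1.97
p_value = t_upper_tail(1.97, df=14)
print(round(p_value, 4))  # should agree with the exact value quoted above
```

For a two-sided alternative, the same tail area for the absolute value of t would be doubled.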

Try It! Using Table A.3 to find a p-value for a two-sided test

We are testing H0: $\mu = 64$ versus Ha: $\mu \ne 64$ with n = 30 observations and the observed test statistic is $t = 1.12$. How would you report the p-value for this test?


Try It! Classical Music
A researcher wants to test if HS students complete a maze more quickly while listening to classical music. For the general HS population, the time to complete the maze is assumed to follow a normal distribution with a mean of 40 seconds. Use a 5% significance level.

Define the parameter of interest: Let $\mu$ represent the population mean completion time when listening to classical music for all HS students.

State the hypotheses:
H0: $\mu = 40$
Ha: $\mu < 40$

A random sample of 100 HS students is timed while listening to classical music. The mean time was 39.1 seconds and the standard deviation was 4 seconds. Conduct the test.

Ask: Is 39.1 a sample mean or a population mean? Is 4 a sample standard deviation or a population standard deviation?

$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{39.1 - 40}{4/\sqrt{100}} = -2.25$

Our sample mean was 2.25 standard errors BELOW the hypothesized (or null) value of 40.

Our p-value is between 0.011 and 0.024: 0.011 < p-value < 0.024.

Are the results statistically significant at the 5% level? Yes, the p-value is < 0.05, so we reject H0.

State the conclusion at the 5% level in terms of the problem.

At the 5% level, there is sufficient evidence to say that listening to classical music helps HS students complete the maze more quickly on average.

Comment about the assumptions required for this test to be valid:

We are told we have a random sample; how about a normal model? It is not mentioned, but are we concerned? n = 100 is large!
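The test statistic for this example is a one-line calculation. The sketch below wraps the one-sample t formula in a small helper (the name `one_sample_t` is made up for this illustration) and applies it to the summary values given above.

```python
import math

def one_sample_t(xbar, mu0, s, n):
    """t = (sample mean - null value) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / math.sqrt(n))

# Classical music example: x-bar = 39.1, null value 40, s = 4, n = 100
t_stat = one_sample_t(xbar=39.1, mu0=40, s=4, n=100)
print(round(t_stat, 2))  # -2.25
```

The negative sign carries the direction: the sample mean is below the null value, which is the direction of extreme for the one-sided-to-the-left alternative.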


Try It! Calcium Intake
A bone health study looked at the daily intake of calcium (mg) for 38 women. They are concerned that the mean calcium intake for the population of such women is not meeting the RDA level of 1200 mg, that is, the population mean is less than the 1200 mg level. They wish to test this theory using a 5% significance level.
a. State the hypotheses about the mean calcium intake for the population of such women.
H0: __$\mu = 1200$__ versus Ha: __$\mu < 1200$__

Summary Statistics
Mean     Std. Dev (s)   Sample Size (n)   Std. Error
926.03   427.23         38                69.31

Below are the t-test results generated using R Commander and selecting Statistics > Means > Single-Sample T Test. A test value of 1200 was entered and the correct direction for the alternative hypothesis was selected. Notice that a 95% one-sided confidence bound is provided since our test alternative was one-sided to the left. If you wanted to also report a regular 95% confidence interval, you would run a two-sided hypothesis test in R.

One-Sample T Results
t        df   p-value    95% CI Lower   95% CI Upper
-3.953   37   0.000165   ***            1043.16

b. Interpret the Std. Error of the mean (SEM):
We would estimate the average distance of the possible sample mean values from the population mean calcium intake $\mu$ to be about 69.31 mg. Note: the possible $\bar{x}$ values would arise from repeatedly taking a random sample of size n = 38 from the same population.

c. Give the observed test statistic value: __t__ = __-3.953__
Interpret this value in terms of a difference from the hypothesized mean of 1200.
Our sample mean was almost 4 standard errors BELOW the hypothesized mean of 1200 mg.

d. Sketch a picture of the p-value in terms of an area under a distribution.
Degrees of freedom: n − 1 = 38 − 1 = 37 df.

e. Give the p-value and the conclusion using a 5% significance level.
The p-value is 0.000165. We would reject H0 and conclude there is sufficient evidence at the 5% level to say the average calcium intake for women is below the RDA level of 1200 mg.


The Relationship between Significance Tests and Confidence Intervals
Earlier we discussed using confidence intervals to guide decisions. A confidence interval provides a range of plausible (reasonable) values for the parameter. The null hypothesis gives a null value for the parameter. So:
• If this null value is one of the "reasonable" values found in the confidence interval, the null hypothesis would not be rejected.
• If this null value is not found in the confidence interval of acceptable values for the parameter, then the null hypothesis would be rejected.

Notes:
(1) The alternative hypothesis should be two-sided. However, sometimes you can reason through the decision for a one-sided test.

(2) The significance level of the test should coincide with the confidence level (e.g., $\alpha = 0.05$ with a 95% confidence level). However, sometimes you can still determine the decision if these do not exactly correspond (see part (c) of the next Try It!).

(3) This relationship holds exactly for tests about a population mean or difference between two population means. In most cases, the correspondence will hold for tests about a population proportion or difference between two population proportions.

Try It! Time Spent Watching TV
A study looked at the amount of time that teenagers are spending watching TV. Based on a representative sample, the 95% confidence interval for the mean amount of time (in hours) spent watching TV on a weekend day was given as: 2.6 hours ± 2.1 hours. So the interval goes from 0.5 hours to 4.7 hours.

a. Test H0: $\mu = 5$ hours versus Ha: $\mu \ne 5$ hours at $\alpha = 0.05$.
Reject H0    Fail to reject H0    Can't tell
Why? Since the value of 5 IS NOT in the 95% CI for $\mu$.

b. Test H0: $\mu = 4$ hours versus Ha: $\mu \ne 4$ hours at $\alpha = 0.05$.
Reject H0    Fail to reject H0    Can't tell
Why? Since the value of 4 IS in the 95% CI for $\mu$.

c. Test H0: $\mu = 4$ hours versus Ha: $\mu \ne 4$ hours at $\alpha = 0.01$.
Reject H0    Fail to reject H0    Can't tell
Why? Since the 99% confidence interval would be wider and still have the value of 4 in it.

d. Test H0: $\mu = 4$ hours versus Ha: $\mu \ne 4$ hours at $\alpha = 0.10$.
Reject H0    Fail to reject H0    Can't tell
Why? Since the 90% confidence interval would be narrower and may or may not still have the value of 4 in it.
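The reasoning in parts (a) and (b) is a simple membership check. The sketch below is an illustration under the conditions stated in the notes above (two-sided alternative, significance level matching the confidence level); the function name `decide_from_interval` is made up for this example.

```python
def decide_from_interval(null_value, interval):
    """Two-sided test decision at the level matching the interval's confidence."""
    lower, upper = interval
    if lower <= null_value <= upper:
        return "Fail to reject H0"
    return "Reject H0"

ci_95 = (0.5, 4.7)  # 95% CI for the mean TV hours: 2.6 +/- 2.1

print(decide_from_interval(5, ci_95))  # part (a), alpha = 0.05
print(decide_from_interval(4, ci_95))  # part (b), alpha = 0.05
```

Parts (c) and (d) cannot be settled by this check alone, since the 99% and 90% intervals are not given; they need the wider/narrower argument from the answers above.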


Try It! MBA Grads' Salaries
"It's a good year for MBA grads" was the title of an article. One of the parameters of interest was the population mean expected salary, $\mu$ (in dollars). A random sample of 1000 students who finished their MBA this year (from 129 business schools) resulted in a 95% confidence interval for $\mu$ of (83700, 84800).

a. What is the value of the sample mean? Include your units.
sample mean = midpoint = (83700 + 84800)/2 = 84250, so $84,250

b. For each statement determine if it is true or false. Clearly circle your answer.
If repeated samples of 1000 such students were obtained, we would expect 95% of the resulting intervals to contain the population mean.

True    False

There is a 95% probability that the population mean lies between $83,700 and $84,800.

True    False

c. The expected average earnings for such graduates in the past year was $76,100. Suppose we wish to test the following hypotheses at the 10% significance level:

H0: $\mu = 76100$ versus Ha: $\mu \ne 76100$.

Our decision would be: Fail to reject H0    Reject H0    Can't tell
Because:
The value of 76100 is not in the 95% confidence interval, and the 90% confidence interval will be narrower and thus will also not include 76100. Or: Since 76100 is not in the 95% confidence interval, we know the p-value is $\le 0.05$. If the p-value is $\le 0.05$, then it is also $\le 0.10$.

d. Several plots of the expected salary data were constructed to help verify some of the data conditions. A QQ plot is provided for checking the assumption that the response is normally distributed. This plot shows some departure from a straight line with a positive slope. Is this cause for concern that inference based on our confidence interval and hypothesis test would not be valid? Explain.

No, since the sample size n = 1000 is large, we can rely on the CLT for the approximate normality of the sample mean.
