Week 3
Administrivia
The first fortnightly quiz was put up on MyStatLab last Wednesday. Get onto it!
Learning Catalytics
Your tutor will use it in your tutorial workshop this week (and will provide the session ID)
Your homework will (sometimes) use it
Week 3 topics
Introduction to probability
Probability distributions
Marginal, conditional and joint distributions
Pdfs and cdfs
Examples of how to use them
Probability
"American professional football players have four times greater risk of dying from ALS or Alzheimer's." National Institute for Occupational Safety and Health, 2012.
Probability is the mathematical means of studying uncertainty
It provides the logical foundation of statistical inference
Studying probability helps us make judgement calls to support decisions on the basis of partial information
E.g.: what is the probability of a sell-out at an upcoming concert, given data on past concerts, weather forecasts, etc.?
Probability
To understand probability formally, you need to understand what is meant by:
Independence
Mutual exclusivity
Joint, marginal and conditional probability
Counting rules: combinations and permutations
In high school, you concentrated on probabilities of individual events/outcomes
We will concentrate on probability distributions
A probability is assigned to every possible outcome within a sample space
Probability review
Let e and f be two events in a sample space S.
Then: P(e) ≥ 0; P(f) ≥ 0; P(S) = 1
If e and f are mutually exclusive:
P(e ∪ f) = P(e) + P(f)
Sharpe writes P(e ∪ f) as P(e or f)
If e and f are not mutually exclusive, then the general addition rule states:
P(e ∪ f) = P(e) + P(f) − P(e ∩ f)
Sharpe writes P(e ∩ f) as P(e and f)
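The two addition rules can be checked by counting outcomes in a small sample space. The sketch below assumes one roll of a fair six-sided die; the die and the events e, f and g are illustrative choices, not examples from the slides.

```python
from fractions import Fraction

# Assumed sample space: one roll of a fair six-sided die.
S = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a subset of S) when outcomes are equally likely."""
    return Fraction(len(event & S), len(S))

e = {2, 4, 6}   # even number
f = {4, 5, 6}   # greater than 3 (overlaps with e)
g = {1, 3}      # mutually exclusive with e

# General addition rule: P(e or f) = P(e) + P(f) - P(e and f)
lhs_general = prob(e | f)
rhs_general = prob(e) + prob(f) - prob(e & f)

# Mutually exclusive case: P(e or g) = P(e) + P(g)
lhs_exclusive = prob(e | g)
rhs_exclusive = prob(e) + prob(g)
```

Using exact fractions avoids rounding noise, so both sides of each rule can be compared with `==`.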
Probability review
Often it's easier to work with the probability of the event (or set of events) complementary to event e, which we call not e, or e^c
P(e^c) = 1 − P(e)
The idea of joint probability is captured in the expression P(e ∩ f), written as P(e and f)
To understand it, we need to understand the idea of conditional probability
In any S where f has already been observed to occur or not occur, the (marginal) probability that e occurs may change. It may go up or down.
Probability review
The conditional probability that e occurs, given that f has occurred, is defined by:
P(e | f) = P(e and f) / P(f)
Similarly, P(f | e) = P(e and f) / P(e)
Rearranging yields:
P(e and f) = P(e | f) P(f)
This is the multiplication rule
If P(e | f) = P(e) and P(f | e) = P(f), then:
Conditioning has no effect
e and f are said to be independent events
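The definitions of conditional probability and independence translate directly into code. This is a minimal sketch that again assumes a fair six-sided die; the events e, f and h and the helper names `cond_prob` and `independent` are made up for illustration.

```python
from fractions import Fraction

# Assumed sample space: one roll of a fair six-sided die.
S = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event when all outcomes in S are equally likely."""
    return Fraction(len(event & S), len(S))

def cond_prob(a, b):
    """P(a | b) = P(a and b) / P(b); only defined when P(b) > 0."""
    return prob(a & b) / prob(b)

def independent(a, b):
    """True when P(a and b) = P(a) * P(b), i.e. conditioning has no effect."""
    return prob(a & b) == prob(a) * prob(b)

e = {2, 4, 6}      # even number
f = {1, 2, 3, 4}   # at most 4
h = {4, 5, 6}      # greater than 3
```

Here knowing f leaves the probability of e unchanged, while knowing h raises it, so e and f are independent but e and h are not.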
Probability review
Recall the conditional probabilities of e and f:
P(e | f) = P(e and f) / P(f) and P(f | e) = P(e and f) / P(e)
Rearranging both yields:
P(e and f) = P(e | f) P(f); and (1)
P(e and f) = P(f | e) P(e). (2)
Now substitute (1) into (2) and solve for P(e | f):
\[ P(e \mid f) = \frac{P(f \mid e)\,P(e)}{P(f)} \]
This is Bayes' Rule!
If e and f are independent events, this equation simplifies to:
P(e | f) = P(e).
Can you see why?
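One way to write out the algebra behind Bayes' Rule and its independent-events special case is sketched here, assuming P(e) > 0 and P(f) > 0 so that both conditional probabilities are defined:

```latex
\begin{align*}
P(e \mid f)\,P(f) &= P(e \text{ and } f) = P(f \mid e)\,P(e)
  && \text{equate (1) and (2)} \\
P(e \mid f) &= \frac{P(f \mid e)\,P(e)}{P(f)}
  && \text{divide by } P(f) \\
P(e \mid f) &= \frac{P(f)\,P(e)}{P(f)} = P(e)
  && \text{if independent, } P(f \mid e) = P(f)
\end{align*}
```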
An example
Suppose we draw a student randomly from this class and ask their height.
We then ring up my friend Paul and say, "hey Paul, with what probability do you think the person we've picked from the class is female?" [uh, I don't know. .5?] "And what about the probability that s/he is less than 160 cm tall?" [why are you asking me this? I don't know, maybe .3?]
Then we say, "OK, well as it happens, this person is 155 cm tall. What is your revised guess about the probability s/he is female?" [you are nuts. Maybe .8. Goodbye.]
Is P(female) = P(female | less than 160 cm)? Not by Paul's guesses: .5 versus .8.
P(female | less than 160 cm) = P(female and less than 160 cm) / P(less than 160 cm)
Is P(female and less than 160 cm) = P(female) × P(less than 160 cm)? Only if the two events are independent.
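Paul's three guesses are enough to work through the multiplication rule and Bayes' Rule numerically. The sketch below takes his answers (.5, .3 and .8) at face value; the variable names are illustrative.

```python
# Paul's guesses from the phone call.
p_female = 0.5
p_short = 0.3                 # P(less than 160 cm)
p_female_given_short = 0.8    # P(female | less than 160 cm)

# Multiplication rule: P(female and short) = P(female | short) * P(short)
p_joint = p_female_given_short * p_short

# What the joint probability would be if the two events were independent.
p_joint_if_independent = p_female * p_short

# Bayes' Rule: P(short | female) = P(female | short) * P(short) / P(female)
p_short_given_female = p_female_given_short * p_short / p_female
```

On these guesses the joint probability (about 0.24) exceeds the product of the marginals (about 0.15), so Paul is implicitly treating sex and height as dependent.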
What is the relationship between having private health insurance (PHI) and being admitted to hospital as a public or private patient?
Data from the ABS 2001 National Health Survey were used to derive the following table of relative frequencies:
Treat relative frequencies as probabilities
Find the probability of the following events:
Have PHI
Have PHI and admitted to hospital
Admitted as a private patient, given have PHI
Admitted as a public patient, given have PHI
Are "admitted as a private patient" and "have PHI" independent events?
Original table treated as a joint bivariate distribution of admission and PHI
What is the marginal distribution of hospital admission?
What is the conditional distribution of admission given PHI?
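The mechanics of moving from a joint distribution to marginal and conditional distributions can be sketched in a few lines. The survey table is not reproduced in this text, so the counts below are HYPOTHETICAL numbers invented purely to show the calculation. They are not the ABS figures, and the function names are illustrative.

```python
from fractions import Fraction

# HYPOTHETICAL counts per 100 people, invented for illustration only.
# These are NOT the ABS 2001 National Health Survey figures.
counts = {
    ("PHI", "not admitted"): 40,
    ("PHI", "public"): 2,
    ("PHI", "private"): 5,
    ("no PHI", "not admitted"): 46,
    ("no PHI", "public"): 6,
    ("no PHI", "private"): 1,
}
total = sum(counts.values())
joint = {key: Fraction(n, total) for key, n in counts.items()}

def marginal_admission(joint):
    """Marginal distribution of admission: sum the joint over PHI status."""
    out = {}
    for (_, admission), p in joint.items():
        out[admission] = out.get(admission, 0) + p
    return out

def conditional_admission(joint, phi_status):
    """Distribution of admission given PHI status: joint / P(PHI status)."""
    p_phi = sum(p for (phi, _), p in joint.items() if phi == phi_status)
    return {adm: p / p_phi for (phi, adm), p in joint.items() if phi == phi_status}

marginal = marginal_admission(joint)
given_phi = conditional_admission(joint, "PHI")
```

Both results are proper distributions (each sums to 1). In this made-up table they differ, which is what dependence between admission and PHI would look like.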
Independence
Covariance and correlation are measures of linear association or linear dependence
Dependence (and its opposite, independence) is a more general concept of association between two variables
Suppose admission and PHI were independent
What would we expect to find when comparing the marginal distribution of admission with the conditional distribution given have PHI?
Is that what we find?
What are the potential confounding factors in this case?
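A small example shows why dependence is a more general idea than covariance. The sketch below uses an assumed toy distribution (not from the slides) in which Y is completely determined by X, yet the covariance is exactly zero.

```python
from fractions import Fraction

# Assumed toy distribution: X is -1, 0 or 1 with equal probability,
# and Y = X**2, so Y is a deterministic function of X.
joint = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}

def expect(g):
    """Expected value of g(x, y) under the joint distribution."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

# Covariance: E[XY] - E[X] E[Y]
cov = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

# Independence would require P(X=0 and Y=0) = P(X=0) * P(Y=0).
p_x0 = sum(p for (x, _), p in joint.items() if x == 0)
p_y0 = sum(p for (_, y), p in joint.items() if y == 0)
p_x0_y0 = joint.get((0, 0), Fraction(0))
```

The covariance is zero because the relationship is not linear, but the joint probability of (X = 0, Y = 0) is not the product of the marginals, so X and Y are dependent.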
Auditing example
Suppose you work at a local auditing firm
Your firm services 100 companies
10 of these companies are known to have overdue accounts
If two of these 100 client firms are chosen at random from the client list, then what is the probability distribution applicable to that (small!) sample for number of accounts overdue?
Auditing example
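Let e_i denote the i-th firm having an overdue account.
\[
\begin{aligned}
P(\text{none overdue}) &= P(e_1^c \text{ and } e_2^c) = P(e_1^c)\,P(e_2^c \mid e_1^c) = \frac{90}{100}\cdot\frac{89}{99} = 0.809 \\
P(\text{one overdue}) &= \frac{10}{100}\cdot\frac{90}{99} + \frac{90}{100}\cdot\frac{10}{99} = 0.182 \\
P(\text{two overdue}) &= 1 - 0.809 - 0.182 = 0.009
\end{aligned}
\]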
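The three probabilities can be reproduced with exact fractions by applying the multiplication rule to each ordering of draws. This is a sketch only; the function name and its default arguments (100 firms, 10 overdue) are illustrative.

```python
from fractions import Fraction

def overdue_distribution(n_firms=100, n_overdue=10):
    """Distribution of the number of overdue firms in a sample of two,
    drawn WITHOUT replacement, built from the multiplication rule."""
    n_ok = n_firms - n_overdue
    p_none = Fraction(n_ok, n_firms) * Fraction(n_ok - 1, n_firms - 1)
    p_two = Fraction(n_overdue, n_firms) * Fraction(n_overdue - 1, n_firms - 1)
    # One overdue: overdue then not overdue, or not overdue then overdue.
    p_one = (Fraction(n_overdue, n_firms) * Fraction(n_ok, n_firms - 1)
             + Fraction(n_ok, n_firms) * Fraction(n_overdue, n_firms - 1))
    return {0: p_none, 1: p_one, 2: p_two}

dist = overdue_distribution()
```

Computing P(two overdue) directly, rather than as a remainder, gives an independent check that the three probabilities sum to 1.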
A tease on sampling
This example illustrates the distinction between sampling with and without replacement
Because the problem implied that firms were chosen at the same time, sampling was done without replacement
Such sampling induces dependence across events
The probability of e2 depends on the type of firm sampled in the first draw (i.e., on whether e1 occurred)
Independence, and hence random sampling, requires sampling with replacement
This distinction is not of practical importance if the population from which we draw the sample is large
At home, re-do the auditing example assuming sampling with replacement
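The claim that the distinction matters little in large populations can be checked numerically. The sketch below compares P(neither firm overdue) with and without replacement for the 100-firm client list and for an assumed, much larger list with the same 10% overdue rate; the function name and the larger population size are illustrative.

```python
from fractions import Fraction

def p_none_overdue(n_firms, n_overdue, replace):
    """P(neither of two sampled firms has an overdue account)."""
    n_ok = n_firms - n_overdue
    first = Fraction(n_ok, n_firms)
    if replace:
        # The first firm goes back on the list, so the second draw is unchanged.
        second = first
    else:
        # One non-overdue firm has been removed from the list.
        second = Fraction(n_ok - 1, n_firms - 1)
    return first * second

# Gap between the two sampling schemes for a small and a large population.
small_gap = p_none_overdue(100, 10, True) - p_none_overdue(100, 10, False)
large_gap = p_none_overdue(100_000, 10_000, True) - p_none_overdue(100_000, 10_000, False)
```

With 100 firms the two schemes differ in the third decimal place; with 100,000 firms the difference is below one in a million.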
Probability trees
Events in draws such as that made in the auditing example may be represented by probability trees
These are diagrams (that resemble trees)
Suppose we select 2 students without replacement from a group of 10 students, of whom 3 are female and 7 are male:
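The branches of this tree can be enumerated in code: each branch probability is the first-draw probability times the conditional second-draw probability. This is a minimal sketch; the function name and the "FF", "FM", "MF", "MM" labels are illustrative.

```python
from fractions import Fraction

def two_draw_tree(n_female=3, n_male=7):
    """Probabilities of the four branches when two students are drawn
    without replacement from the group."""
    total = n_female + n_male
    remaining = total - 1
    branches = {}
    for first, n_first in (("F", n_female), ("M", n_male)):
        p_first = Fraction(n_first, total)
        for second, n_second in (("F", n_female), ("M", n_male)):
            # One student of the first draw's gender is no longer available.
            left = n_second - 1 if second == first else n_second
            branches[first + second] = p_first * Fraction(left, remaining)
    return branches

tree = two_draw_tree()
```

The four branch probabilities sum to 1, and the two mixed branches ("FM" and "MF") are equally likely even though the draws are dependent.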
Another example:
Gender composition
Suppose we are interested in the gender composition of families
Define X = number of boys in families with 3 children
Thus X = 0, 1, 2 or 3
Assume:
Birth outcomes are independent events
Males and females are equally likely to be born
What is the probability distribution of X?
We have determined outcomes, so now we need to calculate associated probabilities.
Gender composition
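\[
\begin{aligned}
P(X = 0) &= P(FFF) = 0.5^3 = 0.125 \\
P(X = 1) &= P(MFF) + P(FMF) + P(FFM) = 3 \times 0.5^3 = 0.375 \\
P(X = 2) &= P(MMF) + P(MFM) + P(FMM) = 3 \times 0.5^3 = 0.375 \\
P(X = 3) &= P(MMM) = 0.5^3 = 0.125
\end{aligned}
\]
This is called a probability distribution function (pdf)
Notes:
We could have used a probability tree to isolate the 8 outcomes and their associated probabilities
The resultant probability distribution satisfies the requirements:
P(X = x) ≥ 0 for all x, and
\[ \sum_{\text{all } x} P(X = x) = 1 \]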
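The pdf can be built by listing all 8 birth sequences and adding up the probability of each, as a probability tree would. This sketch relies on the two stated assumptions (independent births, boys and girls equally likely); the function name and parameters are illustrative.

```python
from fractions import Fraction
from itertools import product

def boys_distribution(n_children=3, p_boy=Fraction(1, 2)):
    """P(X = x) for X = number of boys, assuming independent births."""
    pmf = {x: Fraction(0) for x in range(n_children + 1)}
    for family in product("MF", repeat=n_children):
        n_boys = family.count("M")
        # Independence lets us multiply the per-birth probabilities.
        pmf[n_boys] += p_boy ** n_boys * (1 - p_boy) ** (n_children - n_boys)
    return pmf

pmf = boys_distribution()
```

Changing `p_boy` shows how the distribution loses its symmetry when the equal-likelihood assumption is dropped.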
Gender composition
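We can also represent distributions in terms of the cumulative distribution function (cdf)
Defined as F(x) = P(X ≤ x) for all x
\[
F(x) =
\begin{cases}
0 & \text{if } x < 0 \\
0.125 & \text{if } 0 \le x < 1 \\
0.5 & \text{if } 1 \le x < 2 \\
0.875 & \text{if } 2 \le x < 3 \\
1 & \text{if } x \ge 3
\end{cases}
\]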
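The cdf is a running total of the pdf, held constant between the possible values of X. A minimal sketch of that step function follows, using the pdf values for the 3-child example; the names `values`, `probs` and `F` are illustrative.

```python
from bisect import bisect_right
from itertools import accumulate

# pdf of X = number of boys in a family with 3 children.
values = [0, 1, 2, 3]
probs = [0.125, 0.375, 0.375, 0.125]

# Running totals give the cdf at each possible value of X.
cumulative = list(accumulate(probs))

def F(x):
    """cdf F(x) = P(X <= x): a step function that jumps at 0, 1, 2 and 3."""
    k = bisect_right(values, x)   # number of possible values that are <= x
    return cumulative[k - 1] if k > 0 else 0.0
```

Because F is flat between the jump points, F(0.5) equals F(0) and F(2.9) equals F(2).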
Gender composition
What is P(0 < X < 3)?
P(0 < X < 3) = P(X = 1) + P(X = 2) = 0.75
or P(0 < X < 3) = F(2) − F(0) = 0.875 − 0.125 = 0.75
What is P(X > 0)?
P(X > 0) = 1 − P(X ≤ 0) = 1 − F(0) = 0.875
Be careful!
P(X > 0) ≠ P(X ≥ 0)
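Both routes to P(0 < X < 3), and the strict-versus-weak inequality warning, can be verified in a few lines. This sketch hard-codes the pdf of the 3-child example; the variable names are illustrative.

```python
# pdf of X = number of boys in a family with 3 children.
pmf = {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}

def cdf(x):
    """F(x) = P(X <= x), summing the pdf over values not exceeding x."""
    return sum(p for value, p in pmf.items() if value <= x)

# P(0 < X < 3): sum the pdf directly, or difference the cdf.
# For integer-valued X, P(0 < X < 3) = P(X <= 2) - P(X <= 0).
direct = sum(p for value, p in pmf.items() if 0 < value < 3)
via_cdf = cdf(2) - cdf(0)

# Strict versus weak inequality matters for a discrete variable.
p_greater = 1 - cdf(0)                                         # P(X > 0)
p_at_least = sum(p for value, p in pmf.items() if value >= 0)  # P(X >= 0)
```

The gap between `p_at_least` and `p_greater` is exactly P(X = 0), the probability mass sitting on the boundary.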
Odds? (at 26/7/07)
Win prices for $1 bet: $1.67 ALP, $2.15 Coalition
Let pA = the probability of an ALP win. Then, assuming that the prices come from a fair game,
0.67 pA − (1 − pA) = 0 ⇒ pA = 1/1.67 = 0.599
But
1.15 pC − (1 − pC) = 0 ⇒ pC = 1/2.15 = 0.465 ≠ 1 − pA
Also notice 0.599 + 0.465 = 1.064 > 1
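The fair-game equation has a simple closed form: the break-even probability is the reciprocal of the win price. A short sketch using the two quoted prices follows; the function name is illustrative.

```python
def implied_probability(win_price):
    """Break-even win probability for a $1 bet that returns `win_price`.

    Solves (win_price - 1) * p - (1 - p) = 0, which gives p = 1 / win_price.
    """
    return 1 / win_price

p_alp = implied_probability(1.67)
p_coalition = implied_probability(2.15)

# If both prices were fair, these two probabilities would sum to exactly 1.
total = p_alp + p_coalition
```

The total comes out above 1, which is the sign that the prices cannot both be fair.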
What's happening?
We can show that probability pA = 0.563 and the profit margin of Portlandbet is about 6%
This explains why the bookie goes home in a Mercedes and the punter on a bus
What if the win price for ALP was $2 and for Coalition $2.15? Take the bet!
Sum of reciprocals is 0.5 + 0.465 = 0.965 < 1
$100 bet on each party would yield a break-even if the ALP wins and a profit of $15 if the Coalition does
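One way to arrive at pA = 0.563 is to rescale the two implied probabilities so that they sum to 1; the excess of the unscaled sum over 1 is then roughly the bookmaker's margin. The sketch below assumes that interpretation and also checks the payoff of the $100-each bet at the hypothetical prices. The function names are illustrative.

```python
def normalised_probabilities(prices):
    """Rescale the implied probabilities 1/price so that they sum to 1.

    Returns the rescaled probabilities and the unscaled sum of reciprocals.
    """
    implied = {name: 1 / price for name, price in prices.items()}
    total = sum(implied.values())
    return {name: p / total for name, p in implied.items()}, total

def profit_by_winner(prices, stake_each):
    """Profit for each possible winner when `stake_each` is bet on every party."""
    outlay = stake_each * len(prices)
    return {name: stake_each * price - outlay for name, price in prices.items()}

actual = {"ALP": 1.67, "Coalition": 2.15}          # quoted prices
hypothetical = {"ALP": 2.00, "Coalition": 2.15}    # the "what if" prices

probs, overround = normalised_probabilities(actual)
profits = profit_by_winner(hypothetical, 100)
```

When the sum of reciprocals is below 1, as with the hypothetical prices, betting on every outcome cannot lose money, which is why the slide says to take the bet.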
Review
Probability: What it is and how to work with it
Marginal, conditional, and joint probability
Independence
Probabilities in action
A tease about sampling (more details up soon)