Assignment 1: Probs & Stats review
Due: 2/26, in class
(8 questions, 80 points total)

1. Poisson distribution (Total 10 points)
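The Poisson distribution, X ~ Poisson($\lambda$), is a discrete distribution with p.m.f. given by:
$$\Pr\{X = i\} = \frac{e^{-\lambda} \lambda^{i}}{i!}, \quad i = 0, 1, 2, \ldots$$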
(a) Ensure that the p.m.f. adds up to 1. (2 points)
    (Hint: You will need to use the infinite series expansion of the exponential function.)
(b) Find E[X]. (3 points)
(c) Find Var[X]. (5 points)

2. Pareto distribution (Total 10 points)
The Pareto distribution, X ~ Pareto($\alpha$), $1 < \alpha < 2$, is a continuous distribution with p.d.f. given by:
$$f_X(x) = \alpha x^{-\alpha - 1}, \quad x \geq 1$$
(a) Ensure that the p.d.f. integrates to 1. (2 points)
(b) Find E[X]. (3 points)
(c) Find Var[X]. (5 points)
    (Hint: Note the range of the parameter $\alpha$.)

3. Hadoop with replication (Total 10 points)
Let us revisit the Hadoop example discussed in class. There are n servers organized into r racks of k servers each, such that n = rk. There are n tasks in the system, and each task has an associated data set that it must work on. Each data set is replicated f times among the n nodes. Further, each server has exactly f data slots, each of which can store one data set. Thus, nf data sets are distributed among the available nf data slots. A task is data local if it is assigned to a server that has its data set (at least one copy). A task is rack local if it is assigned to a rack of servers that has its data set. The n tasks are randomly assigned to servers, such that each server has exactly one task. Further, the nf data sets are also randomly assigned to the nf available data slots.
(a) What is the expected number of data local tasks? (5 points)
(b) What is the expected number of rack local tasks? (5 points)
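
An analytical answer to parts (a) and (b) can be sanity-checked numerically. The following is a minimal Monte Carlo sketch, not part of the assignment itself. It assumes that task t needs data set t, that slot j sits on server j // f, and that server s sits in rack s // k; the function name `simulate_locality` and its parameters are illustrative.

```python
import random


def simulate_locality(r, k, f, trials=2000, seed=0):
    """Estimate the mean number of data-local and rack-local tasks.

    Assumptions (illustrative): task t needs data set t, slot j belongs to
    server j // f, and server s belongs to rack s // k.
    """
    rng = random.Random(seed)
    n = r * k
    # n data sets, each replicated f times: n*f copies for n*f slots.
    copies = [d for d in range(n) for _ in range(f)]
    servers = list(range(n))
    data_local_total = 0
    rack_local_total = 0
    for _ in range(trials):
        rng.shuffle(copies)   # random placement of copies into slots
        rng.shuffle(servers)  # task t runs on server servers[t]
        # Data sets stored on each server, then on each rack.
        on_server = [set(copies[s * f:(s + 1) * f]) for s in range(n)]
        on_rack = [set().union(*on_server[q * k:(q + 1) * k]) for q in range(r)]
        for task, server in enumerate(servers):
            if task in on_server[server]:
                data_local_total += 1
            if task in on_rack[server // k]:
                rack_local_total += 1
    return data_local_total / trials, rack_local_total / trials


if __name__ == "__main__":
    data_local, rack_local = simulate_locality(r=4, k=5, f=3)
    print("mean data-local tasks:", data_local)
    print("mean rack-local tasks:", rack_local)
```

Because a data-local task is always rack local, the first estimate should never exceed the second; with a single rack (r = 1), every task is rack local.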

4. Alternative expression for expectation (Total 10 points)
(a) Let X be a non-negative, integer-valued RV. Prove that: (5 points)
    $$E[X] = \sum_{x=0}^{\infty} \Pr\{X > x\}$$
    (Hint: One approach is to consider double summations and carefully flip the order of summation.)
(b) Let X be a non-negative, continuous RV. Prove that: (5 points)
    $$E[X] = \int_{0}^{\infty} \Pr\{X > x\} \, dx$$
    (Hint: Again, one approach is to interchange the order of integration.)

5. Bounds and Inequalities (Total 10 points)
(a) Given that the average task size for a given Hadoop workload is 300 s, show that fewer than half of the tasks in a given job can have a size greater than 600 s. (2 points)
(b) If the minimum task size is 100 s, provide a tighter upper bound on the percentage of tasks that have size greater than 600 s. (3 points)
(c) Let us now assume that each task is composed of two separate, independent subtasks. That is, the size of the task is the sum of the sizes of the first and second subtasks. The first subtask is Exponentially distributed with mean 200 s, and the second subtask is Uniformly distributed with lower and upper limits of 50 s and 150 s, respectively. The mean task size is clearly 300 s. What fraction of tasks has size greater than 300 s? Why is this not 1/2? (5 points)
    (Hint: Part (c) has nothing to do with parts (a) and (b).)
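
The fraction asked for in part (c) can be estimated by simulation before (or after) deriving it. The sketch below is a rough numerical check under the stated distributions, not a substitute for the derivation; the function name `fraction_above` and its defaults are illustrative.

```python
import random


def fraction_above(threshold=300.0, trials=200_000, seed=0):
    """Estimate Pr{task size > threshold} for the two-subtask model."""
    rng = random.Random(seed)
    count = 0
    for _ in range(trials):
        first = rng.expovariate(1 / 200)  # Exponential with mean 200 s
        second = rng.uniform(50, 150)     # Uniform on [50, 150] s
        if first + second > threshold:
            count += 1
    return count / trials


if __name__ == "__main__":
    print("estimated fraction above 300 s:", fraction_above())
```

Note that `random.expovariate` takes the rate, so a mean of 200 s corresponds to a rate of 1/200.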

6. Rat in a maze (Total 10 points)
A rat is trapped in a maze and is trying to find a way out. Initially, if it goes right, then it will wander around for 10 minutes in the maze and return to its starting point. If it goes left, then it will arrive at a fork. From this fork, if it goes left, it will depart the maze after 10 minutes of travel. If it goes right from this fork, it will wander around for 20 minutes and return to the starting point. Let X denote the time spent by the rat, in minutes, in the maze before it finds its way out. Assume that, at all times, the rat is equally likely to go left versus right.
(a) What is E[X]? (3 points)
(b) What is Var[X]? (7 points)
    (Hint: Be careful with Var[X]. You want to use conditioning.)
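
The conditioning argument can be checked against a direct simulation of the rat's walk. The sketch below is one possible reading of the maze, assuming that travelling from the starting point to the fork takes no time (the problem does not give a duration for that step); the names `time_in_maze` and `estimate_moments` are illustrative.

```python
import random
import statistics


def time_in_maze(rng):
    """Simulate one escape and return the minutes spent in the maze.

    Assumption: reaching the fork from the starting point takes no time.
    """
    total = 0
    while True:
        if rng.random() < 0.5:
            # Right at the start: wander 10 minutes, back to the start.
            total += 10
            continue
        # Left at the start: the rat is now at the fork.
        if rng.random() < 0.5:
            # Left at the fork: exits after 10 more minutes.
            return total + 10
        # Right at the fork: wander 20 minutes, back to the start.
        total += 20


def estimate_moments(trials=200_000, seed=0):
    """Return the sample mean and sample variance of the escape time."""
    rng = random.Random(seed)
    samples = [time_in_maze(rng) for _ in range(trials)]
    return statistics.fmean(samples), statistics.variance(samples)


if __name__ == "__main__":
    mean, var = estimate_moments()
    print("sample mean of X:", mean)
    print("sample variance of X:", var)
```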

7. RVs whose parameters are themselves RVs (Total 10 points)
Let X ~ Exponential($\lambda$). Let Y ~ Uniform(0, X). You are given that $E[X] = 1/\lambda$ and $E[X^2] = 2/\lambda^2$.
(a) What is E[Y]? (4 points)
(b) What is Var[Y]? (6 points)

8. The coupon collector problem (Total 10 points)
Let us say there are d distinct coupons that you are trying to collect. Every day, you receive one coupon in the mail. The coupon that you receive could be any one of the d coupons with equal probability. Your goal is to collect at least one copy of all d distinct coupons. Let X denote the number of days needed to complete your goal.
(a) What is E[X]? (5 points)
(b) What is Var[X]? (5 points)
Closed-form expressions are not required for parts (a) and (b).
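
Since the answers here are sums rather than closed forms, a simulation is a convenient way to check them for a specific d. The following is a minimal sketch, with coupons labeled 0 to d - 1; the names `days_to_collect` and `estimate_collector` are illustrative.

```python
import random
import statistics


def days_to_collect(d, rng):
    """Simulate one run: days until all d distinct coupons are seen."""
    seen = set()
    days = 0
    while len(seen) < d:
        seen.add(rng.randrange(d))  # each coupon equally likely
        days += 1
    return days


def estimate_collector(d, trials=20_000, seed=0):
    """Return the sample mean and sample variance of the collection time."""
    rng = random.Random(seed)
    samples = [days_to_collect(d, rng) for _ in range(trials)]
    return statistics.fmean(samples), statistics.variance(samples)


if __name__ == "__main__":
    mean, var = estimate_collector(d=10)
    print("sample mean of X for d = 10:", mean)
    print("sample variance of X for d = 10:", var)
```

Evaluating the summation from part (a) for the same d and comparing it to the sample mean gives a quick check on the derivation.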