
09/05/2012

2012 Updates in Therapeutics:
The Pharmacotherapy Preparatory Review &
Recertification Course
Biostatistics: A Refresher
Kevin M. Sowinski, Pharm.D., FCCP
Purdue University, College of Pharmacy
Indiana University, School of Medicine
West Lafayette and Indianapolis, IN

Conflict of Interest Disclosures
No conflicts of interest to disclose related to this
presentation

Outline

Purpose: What this is and isn't
Introduction: What do I need to know?
Variables
Descriptive statistics
Inferential statistics
Hypothesis testing
Statistical tests
Decision errors


Statistics
...collecting, classifying, summarizing, and
analyzing data (demystifying?)
Tools for quantifying clinical and laboratory
data in a meaningful way
Assists in determining whether/how much a
treatment or procedure affects a group
Why do pharmacists need to know statistics?
Hopefully obvious to this group
More importantly: WHAT do I need to know?
Page 2-126

What do you need to know?
Descriptive statistics/simple statistics
Mean, median, frequency, SD, range, CI

Chi-square; Fisher exact test
t-test(s)
Kaplan-Meier, Cox proportional hazards
Analysis of variance
Correlation
Regression (linear, multiple, logistic, other)
Multivariate analysis
Wilcoxon rank sum test (nonparametric)

Pages 2-126-7

Summarized from: JAMA 2007; 298:1010-22

Statistics: WHY do you need to know it?
Domain 2: Retrieval, Generation,
Interpretation, and Dissemination of
Knowledge in Pharmacotherapy (25%)
Interpret biomedical literature with respect to study design and
methodology, statistical analysis, and significance of reported
data and conclusions.
Knowledge of biostatistical methods, clinical and statistical
significance, research hypothesis generation, research design
and methodology, and protocol and proposal development

Page 2-126


Types of Variables/Data
Discrete variables
Can only take a limited number of values within a
given range
Nominal: Classified into groups in an unordered
manner and with no indication of relative severity
Sex (M/F), mortality (yes/no), disease state (present/absent)

Ordinal: Ranked in a specific order but with no
consistent level of magnitude of difference between
ranks
NYHA functional class: I, II, III, IV

COMMON ERROR:
Use of means (SDs) with ordinal data.
Page 2-127

Types of Variables/Data
Continuous Variables
Counting variables; can take on any value
within a given range
Interval Scaled: Data ranked in a specific order
with a consistent change in magnitude
between units; the zero point is arbitrary
Degrees Fahrenheit

Ratio Scaled: Like interval but with an
absolute zero
Degrees Kelvin, pulse, BP, time, distance
Page 2-127-8

Types of Statistics: Descriptive statistics
Visual methods of describing data
Frequency distribution
Histogram
Scatterplot

Page 2-128


Types of Statistics: Descriptive statistics
Histogram
[Figure: histogram of frequency (y-axis, 0 to 16) versus antifactor Xa concentrations (U/mL; x-axis, 0 to 1.1)]

Descriptive statistics: Numerical methods
Measures of Central Tendency

Mean
Used only for continuous and normally distributed
data
Very sensitive to outliers (tends toward the tail)
Most commonly used/well understood

Median (a.k.a. 50th percentile)
Midpoint of the values when placed in order from
highest to lowest. Half above and below.
Used for ordinal or continuous data (especially for
skewed populations)
Insensitive to outliers
Page 2-128

Descriptive statistics: Numerical methods
Measures of Central Tendency

Mode
Most common value in a distribution
Used for nominal, ordinal, or continuous data
Data may have >1 mode (bimodal, trimodal)
Describes meaningful distributions with a large
range of values

Page 2-128


Measures of Data Spread and Variability
Standard Deviation
Measure of the variability about the mean
Applied to continuous data that are ~normally
distributed or transformed to be
Empirical rule: about 68% within ±1 SD, 95% within
±2 SD, and 99.7% within ±3 SD
Coefficient of variation (CV) relates the mean
to the SD
(SD/mean × 100%)

Variance = SD²
Page 2-128
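
The empirical rule can be checked numerically. The sketch below uses only the Python standard library and assumes an exactly normal population; the helper name fraction_within is illustrative and does not come from the slides.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, SD 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def fraction_within(k_sd: float) -> float:
    """Fraction of a normal population lying within k_sd SDs of the mean."""
    return standard_normal.cdf(k_sd) - standard_normal.cdf(-k_sd)


# Fractions within 1, 2, and 3 SDs of the mean.
within = {k: fraction_within(k) for k in (1, 2, 3)}
for k, frac in within.items():
    print(f"within {k} SD: {frac:.1%}")
```

Real data are only approximately normal, so observed fractions will differ somewhat from these values.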

Measures of Data Spread and Variability
Range

Difference between the smallest and largest values
Applied to parametric and nonparametric data
Easy to compute
Size of range is very sensitive to outliers
Often reported as the actual values rather than
the difference between the two extreme values

Page 2-128-9

Measures of Data Spread and Variability
Percentiles
Point in a distribution at which a value is larger
than some percentage of the other values
75th percentile: 75% of the values are smaller
Does not assume the population has a normal
or any other distribution
IQR: percentile range that describes the middle 50%;
encompasses the 25th-75th percentile.

Page 2-128-9


Example: Pharm.D. students were asked the
following questions...

                  2006 (n=119)     2007 (n=127)     2008 (n=134)
The examination questions in this course were appropriate to the material that
was covered
Mean (SD)         2.48 (1.08)      3.80 (0.91)      4.04 (0.82)
Median (IQR)      2.0 (2.0-3.0)    4.0 (3.0-4.0)    4.0 (4.0-5.0)

I understand the importance of this course to the profession of Pharmacy
Mean (SD)         3.51 (1.08)      3.57 (0.92)      3.90 (0.90)
Median (IQR)      4.0 (3.0-4.0)    4.0 (3.0-4.0)    4.0 (4.0-4.0)

1=S. Disagree; 2=Disagree; 3=Neutral; 4=Agree; 5=S. Agree


Curr Pharm Teach Learn 2010; 2:171-9

Measures of Data Spread and Variability
Summary
Measures of central tendency should be
presented along with measures of variability
What measures of central tendency should be
presented with
Continuous, interval scaled data?
Ordinal data?

What measures of spread and variability
should be presented with
Means?
Medians?
Page 2-128-9

Dataset
HDL cholesterol example
20 HDL concentrations measured as part of a
clinical study...
64  60  59  65  64  62  54
54  68  67  79  55  48  65
59  65  87  49  46  46

Calculate the mean, median, and mode of the
above data set.
Calculate the range, SD, and SEM
Evaluate the visual presentation of the data
Page 2-129


Dataset
HDL cholesterol example
Measure of Central Tendency    Measure of Spread
Mean    60.8                   SD     10.4
Median  61                     Range  41 (46-87)
Mode    65                     IQR    (54-65)

SEM: 2.3
Evaluate the visual presentation of the
data...
Page 2-129
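
The summary values for the 20 HDL concentrations can be reproduced with a short script. This is a minimal sketch using the Python standard library; the quartiles use the default method of statistics.quantiles, which is one of several conventions and an assumption here, since the slides do not state which method was used.

```python
import math
import statistics

# The 20 HDL concentrations (mg/dL) from the data set slide.
hdl = [64, 54, 59, 60, 68, 65, 59, 67, 87, 65,
       79, 49, 64, 55, 46, 62, 48, 46, 54, 65]

mean = statistics.mean(hdl)
median = statistics.median(hdl)
mode = statistics.mode(hdl)
sd = statistics.stdev(hdl)              # sample SD (n - 1 in the denominator)
sem = sd / math.sqrt(len(hdl))          # SEM = SD / sqrt(n)
cv_percent = sd / mean * 100            # coefficient of variation
data_range = max(hdl) - min(hdl)
q1, _, q3 = statistics.quantiles(hdl, n=4)  # 25th, 50th, 75th percentiles

print(f"mean={mean:.1f} median={median} mode={mode}")
print(f"SD={sd:.1f} SEM={sem:.1f} CV={cv_percent:.1f}%")
print(f"range={data_range} ({min(hdl)}-{max(hdl)}) IQR=({q1:g}-{q3:g})")
```

The median (61) and mean (60.8) are close, which is the quick check for approximate normality; the CV is not reported on the slide and is included only to illustrate the formula.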

HDL cholesterol example
Grouped Histogram
[Figure: grouped histogram of frequency versus HDL concentrations (mg/dL), bins from 40 to 88. Normally distributed?]

Inferential statistics
Conclusions made about a population from a
study of a sample of that population
Choosing/evaluating statistical methods
depends on the type of data used
Educated statement about an unknown
population is commonly referred to as an
inference
Statistical inference can be made by estimation
or hypothesis testing
Page 2-129


Population Distributions
Discrete
Binomial distribution
Poisson distribution

Page 2-129

Population Distributions
Normal (Gaussian) Distribution
Most common model for population
distributions
Symmetric or bell shaped
Important landmarks
μ: Population mean is equal to zero.
σ: Population SD is equal to 1.
x̄ and s represent the sample mean and SD.

Page 2-129-30

Population Distributions
Normal (Gaussian) Distribution
[Figure: normal curve with the x-axis marked at -2 SD, -1 SD, mean, +1 SD, and +2 SD]


HDL cholesterol example
Grouped Histogram
[Figure: grouped histogram of frequency versus HDL concentrations (mg/dL), bins from 40 to 88]

Normal (Gaussian) Distribution
How do we assess?
Frequency distribution and histograms
Median ~ mean (most practical and easiest to
use)
HDL Example: 61 vs. 60.8 mg/dL
Formal test: Kolmogorov-Smirnov test
Challenging to evaluate when you are reading a
paper
Mean/SD define a normal distribution...termed
parametric
Page 2-129-30

Normal (Gaussian) Distribution
Estimation and sampling variability
Separate samples from a population will give
different estimates
Distribution of means approximates a normal
distribution.
Mean of this distribution of means = μ (population mean)
SD of means is estimated by the SEM.
95% of the sample means lie within ~2 SEM of μ

Distribution of means from these random
samples is ~normal regardless of the
underlying population distribution
Pages 2-129-30


Normal (Gaussian) Distribution
Standard Error of the Mean (SEM)
SEM = SD/sqrt(n)
The SEM quantifies uncertainty in the
estimate of the mean, not variability in the
sample.
Why is all of this worth knowing about the
difference between the SEM and SD?
Application: 95% CI is ~mean ± 2 SEM
Deception?
Page 2-130

Dataset: HDL cholesterol example
SD or SEM?
A = Mean (SD)
B = Mean (SEM)

Yes ?
No ?

Confidence Intervals
95% CIs are the most commonly reported CIs
In repeated samples, 95% of all CIs include the true
population value.
Why are 95% CIs most often reported?

Assume a baseline birth weight in a group with
a mean ± SD of 1.18 ± 0.4 kg
95% CI ~ mean ± 1.96 × SEM (or 2 × SEM)
What is the 95% CI? (1.07, 1.29)

SD, SEM, and CIs are often used interchangeably
(incorrectly)
Page 2-130
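
The birth weight interval can be worked through in code. The slide does not state the group size, so the value n = 51 below is an assumption, chosen only because it reproduces the reported interval of (1.07, 1.29); the function name ci95 is likewise illustrative.

```python
import math


def ci95(mean: float, sd: float, n: int) -> tuple[float, float]:
    """Approximate 95% CI for a mean: mean +/- 1.96 x SEM, with SEM = SD / sqrt(n)."""
    sem = sd / math.sqrt(n)
    half_width = 1.96 * sem
    return mean - half_width, mean + half_width


# ASSUMPTION: n is not given on the slide; 51 reproduces the stated interval.
N_ASSUMED = 51

low, high = ci95(mean=1.18, sd=0.4, n=N_ASSUMED)
print(f"95% CI: ({low:.2f}, {high:.2f}) kg")
```

Note how much narrower this interval is than mean ± 2 SD (0.38 to 1.98 kg), which is why presenting SEM in place of SD can make data look less variable than they are.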


CIs Instead of Standard Hypothesis Testing?
Hypothesis testing and calculation of p-values tell us
whether there is (or is not) a statistically significant
difference, but nothing about the magnitude
CIs
Help to determine the importance of a finding and its
application
Provide an idea of the magnitude of the difference
Difference between two continuous variables:
CI that includes 0 (no diff) is not statistically significant (p>0.05)
There is no need to show both the 95% CI and the p-value

CIs for OR and RR are evaluated differently
Page 2-131

Hypothesis Testing
Null hypothesis (H0):
No difference between comparator groups (Tx A = Tx B)

Alternative hypothesis (Ha):
States that there is a difference (Tx A ≠ Tx B)

Results of hypothesis testing will indicate whether
there is enough evidence to reject H0
H0 is rejected = statistically significant (SS) difference
H0 is not rejected = no SS difference
We are not concluding that the treatments are equal.

Pages 2-131-2

Statistical Tests and Choosing a Statistical
Test
Dependent on:
Type of data (nominal, ordinal, continuous)
Distribution of data (normal, etc.)
Study design (parallel, crossover, etc.)
Presence of confounding variables
One-tailed versus two-tailed

Parametric vs. nonparametric tests
Page 2-132


Parametric vs. Nonparametric
Parametric tests assume
Data being investigated have an underlying
~normal distribution
Data are continuous
Data being investigated have variances that are ~
equal
Nonparametric tests
Data are not normally distributed
Data do not meet other criteria (discrete data)

Page 2-132

Parametric Tests
Student's t-test
One-sample test:
Compares the mean of the study sample with the
population mean
Group 1 mean vs. known population mean

Page 2-132

Parametric Tests
Student's t-test(s)
Two-sample, independent samples, or
unpaired test:
Compares the means of two independent samples.
Group 1 vs. Group 2

Pages 2-132-3


Parametric Tests
Student's t-test(s)
Two-sample, independent samples, or unpaired
test:
Equal variance test
Rule of thumb for variances: if the ratio of larger to smaller
variance is greater than 2, we conclude the variances are
different
Formal test for differences in variances: F test
Adjustments can be made for cases of unequal variance.

Unequal variance test
Correction employed to account for variances
Pages 2-132-3

Parametric Tests
Student's t-test(s)
Paired test: Compares the mean difference of
paired or matched samples. This is a related
samples test.
Group 1: Measurement 1 vs. Measurement 2

Pages 2-132-3


Parametric Tests
Student's t-test(s)
COMMON ERROR:
Use of multiple t-tests to compare more than two
groups

Pages 2-132-3
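
Why multiple t-tests are an error can be shown with simple arithmetic. The sketch below assumes the pairwise comparisons are independent (an approximation, because pairwise tests share data) and each is run at α = 0.05; the function names are illustrative.

```python
from math import comb


def familywise_error(n_groups: int, alpha: float = 0.05) -> float:
    """Chance of at least one false-positive result across all pairwise
    comparisons, ASSUMING the comparisons are independent."""
    n_comparisons = comb(n_groups, 2)
    return 1 - (1 - alpha) ** n_comparisons


def bonferroni_alpha(n_groups: int, alpha: float = 0.05) -> float:
    """Per-comparison alpha after a Bonferroni correction."""
    return alpha / comb(n_groups, 2)


for groups in (3, 4, 5):
    print(f"{groups} groups: {comb(groups, 2)} comparisons, "
          f"familywise error ~{familywise_error(groups):.1%}, "
          f"Bonferroni alpha = {bonferroni_alpha(groups):.4f}")
```

With three groups the chance of at least one false-positive result is already about 14% rather than 5%, which is the reason ANOVA followed by a post hoc test is used instead.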

Parametric Tests
Analysis of Variance (ANOVA)
One-way (single-factor) ANOVA:
Compares the means of ≥3 groups
Independent samples test
Young: Group 1, Group 2, Group 3

Two-way (two-factor) ANOVA:
Additional factor added
Young: Group 1, Group 2, Group 3
Elderly: Group 1, Group 2, Group 3

Page 2-133

Parametric Tests
Analysis of Variance (ANOVA)
Repeated-Measures ANOVA:
Related samples test, extension of paired t-test

Young (Group 1)
Related Measurements: Measurement 1, Measurement 2, Measurement 3

Page 2-133


Parametric Tests
Post hoc tests

Remember multiple t-test error
Maintains appropriate error rate
Determine which groups actually differ
Conducted if ANOVA statistically significant
Post hoc tests (examples):

Tukey HSD (Honestly Significant Difference)
Bonferroni
Scheffé
Newman-Keuls

Page 2-133

Nonparametric Tests
Tests for ordinal data or continuous data (that do
not meet appropriate assumptions for parametric
tests)
Tests for independent samples
Wilcoxon rank sum and Mann-Whitney U test
Compares 2 independent samples (analogous to the independent samples
t-test)

Kruskal-Wallis one-way ANOVA by ranks
Compares ≥3 independent groups (analogous to one-way ANOVA)
Post hoc testing

Page 2-133

Nonparametric Tests
Tests for related or paired samples
Sign test and Wilcoxon signed rank test: Compares
2 matched or paired samples (analogous to the paired t-test)
Friedman ANOVA by ranks: Compares ≥3
matched/paired groups

Page 2-133


Nonparametric Tests
Nominal Data
Chi-square (χ²) test: Compares expected and
observed proportions between ≥2 groups
Test of independence
Test of goodness of fit

Fisher exact test: Form of the chi-square test used for small
groups (cells) containing <5 observations
McNemar: Paired samples
Mantel-Haenszel: Controls for the influence of
confounders
Page 2-134

Choosing the Most Appropriate Statistical
Test: Example
Group                 Baseline LDL   p-value      Final LDL   p-value
                      (mg/dL)        (Baseline)   (mg/dL)     (Final)
Rosuvastatin (n=25)   152 ± 5        > 0.05       138 ± 7     > 0.05
Simvastatin (n=25)    151 ± 4                     135 ± 5

Page 2-134

Choosing the Most Appropriate Statistical
Test: Example
                         Rosuvastatin (n=25)   Simvastatin (n=25)
Men/Women                12/13                 10/15
Smokers                  10                    13
Baseline LDL-C (mg/dL)   152 ± 5               151 ± 4

Which is the appropriate statistical test to determine baseline
differences in:
Sex distribution?
Low-density lipoprotein cholesterol?
Percentage of smokers and nonsmokers?
Page 2-134


Appropriate test to determine baseline
differences in...
1. Sex distribution?
2. Low-density lipoprotein cholesterol?
3. Percentage of smokers and nonsmokers?
A. Wilcoxon signed rank test
B. Chi-square test
C. ANOVA
D. Two-sample t-test
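
Sex distribution and smoking status are nominal data in two independent groups, so a chi-square test of independence fits. The sketch below computes the Pearson chi-square statistic by hand, without a continuity correction, and compares it with 3.841, the critical value for 1 degree of freedom at α = 0.05. Nonsmoker counts are derived as 25 minus the smoker counts, which assumes every patient was classified.

```python
def chi_square_statistic(table: list[list[int]]) -> float:
    """Pearson chi-square statistic for a contingency table
    (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


CRITICAL_VALUE_DF1 = 3.841  # chi-square critical value, df = 1, alpha = 0.05

# Rows: rosuvastatin, simvastatin. Columns: men, women.
sex_table = [[12, 13], [10, 15]]
# Columns: smokers, nonsmokers (nonsmokers derived as 25 - smokers).
smoking_table = [[10, 15], [13, 12]]

chi2_sex = chi_square_statistic(sex_table)
chi2_smoking = chi_square_statistic(smoking_table)
print(f"sex: chi2 = {chi2_sex:.3f}, significant = {chi2_sex > CRITICAL_VALUE_DF1}")
print(f"smoking: chi2 = {chi2_smoking:.3f}, significant = {chi2_smoking > CRITICAL_VALUE_DF1}")
```

Neither statistic exceeds the critical value, so the groups do not differ significantly in these baseline characteristics. All expected cell counts here are above 5, so the Fisher exact test is not required.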

Choosing the Most Appropriate Statistical
Test: Example
                       Rosuvastatin (n=25)   Simvastatin (n=25)
Baseline LDL (mg/dL)   152 ± 5               151 ± 4
Final LDL (mg/dL)      138 ± 7               135 ± 5
Δ LDL (mg/dL)          14 ± 6                16 ± 5

Which is the appropriate statistical test to determine:
The effect of rosuvastatin on LDL-C
The primary end point: 3-month change in LDL-C
The authors concluded that rosuvastatin is similar to
simvastatin. What else would you like to know?
Page 2-134

Appropriate test to determine

Effect of rosuvastatin on LDL-C
Primary end point: 3-month change in LDL-C

A. Wilcoxon signed rank test
B. Chi-square test
C. ANOVA
D. Two-sample t-test
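
For the primary end point, the change in LDL-C is continuous and the two groups are independent, so a two-sample t-test applies. The sketch below works from the summary values for the change (14 and 6 vs. 16 and 5 mg/dL, n = 25 each), assuming the second number in each pair is an SD and the data are approximately normal. The critical value 2.011 is the two-tailed t value for 48 degrees of freedom at α = 0.05.

```python
import math


def pooled_two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance (pooled) two-sample t statistic from summary data.
    Returns (t, degrees of freedom)."""
    df = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Rule of thumb: a variance ratio below 2 supports the equal-variance test.
variance_ratio = max(6 ** 2, 5 ** 2) / min(6 ** 2, 5 ** 2)

# ASSUMPTION: the values after the means are SDs, not SEMs.
t_stat, df = pooled_two_sample_t(14, 6, 25, 16, 5, 25)

T_CRITICAL_DF48 = 2.011  # two-tailed, alpha = 0.05, df = 48
print(f"variance ratio = {variance_ratio:.2f}")
print(f"t = {t_stat:.2f}, df = {df}, significant = {abs(t_stat) > T_CRITICAL_DF48}")
```

The difference is not statistically significant, but that alone does not show the drugs are similar: before accepting the authors' conclusion one would want the power of the study and the CI for the difference. The effect of rosuvastatin alone (baseline vs. final in the same patients) is a paired comparison and would need the individual paired differences, which the slide does not provide.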


Decision Errors
Type I Error
Probability of making a type I error = significance level
(α)
Convention is to set α to 0.05
5.0% of the time, we will conclude there is a SS difference
when actually one does not exist.
Calculated chance that a type I error has occurred is called
the p-value.
A lower p-value does not suggest more importance, only SS
and less likely attributable to chance

Pages 2-134-5

Decision Errors
Type II Error
Probability of making a type II error = β
Convention: β of 0.10-0.20
Concluding that no difference exists when one truly does
(not rejecting H0 when it should be rejected)

Pages 2-134-5

Decision Errors
Power (1 − β)
Ability to detect differences between groups if one actually
exists
Dependent on the following factors:

Predetermined α
Sample size
Size of the difference between the outcomes you wish to detect
Variability of the outcomes that are being measured

Power is decreased by...
As above and
Poor study design
Incorrect statistical tests (use of nonparametric tests when parametric
tests are appropriate)
Pages 2-134-5


Decision Errors: Statistical power
analysis and sample size calculation
Should be performed in all studies a priori
Necessary components for estimating appropriate sample
size
Acceptable type II error rate (usually 0.10-0.20)
Observed difference in predicted study outcomes that is clinically
significant
Expected variability in above
Acceptable type I error rate (usually 0.05)

Pages 2-134-5
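
The four components combine into a sample size estimate. The sketch below uses the common normal-approximation formula for comparing two means, n per group = 2 × ((z for α/2 + z for β) × SD / difference)², which is a swapped-in textbook formula rather than anything stated on the slides. The difference of 5 mg/dL and SD of 6 mg/dL are hypothetical inputs chosen only for illustration.

```python
import math
from statistics import NormalDist


def n_per_group(difference: float, sd: float,
                alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate sample size per group for a two-sided comparison of
    two means (normal approximation, equal group sizes and SDs)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # 1.96 for alpha = 0.05
    z_beta = z.inv_cdf(power)            # 0.84 for power = 0.80
    return math.ceil(2 * ((z_alpha + z_beta) * sd / difference) ** 2)


# HYPOTHETICAL inputs: detect a 5 mg/dL difference with an SD of 6 mg/dL.
n_80 = n_per_group(difference=5, sd=6, power=0.80)
n_90 = n_per_group(difference=5, sd=6, power=0.90)
print(f"80% power: {n_80} per group; 90% power: {n_90} per group")
```

Raising power, shrinking the difference to detect, or increasing variability all raise the required sample size, which matches the list of factors that determine power.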

Statistical significance versus clinical
significance
Size of the p-value is not related to the
importance of the result.
Statistically significant is not necessarily clinically
significant
Lack of statistical significance does not mean
results are not important.
With nonsignificant findings, consider sample
size, estimated power, and observed
variability
Pages 2-134-5

Correlation and Regression
Introduction
Correlation examines the strength of the association
between two variables.
It does not necessarily assume that one variable is useful
in predicting the other.

Regression examines the ability of one or more
variables to predict another variable.

Pages 2-135-6


Correlation
Pearson Correlation
Strength of the relationship between two variables
that are...
normally distributed
ratio or interval scaled
linearly related

Often referred to as the degree of association
between the two variables
Does not necessarily imply that one variable is
dependent on the other
Pages 2-135-6

Correlation Coefficient
Pearson correlation coefficient (r) ranges from -1 to +1 and can take any
value in between.
-1: Perfect negative linear relationship
0: No linear relationship
+1: Perfect positive linear relationship

Hypothesis testing is performed to determine whether the correlation
coefficient is different from zero. This test is highly influenced by sample
size
Spearman Rank Correlation: Nonparametric test that does not assume a
normal distribution or continuous data. Can be used for ordinal data or
non-normally distributed continuous data

Pages 2-135-6
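
The Pearson coefficient is the covariation of the two variables divided by the product of their individual variations. The sketch below computes it by hand with the standard library; the five (x, y) pairs are hypothetical values invented for illustration and are not from any study cited in the slides.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient for paired observations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# HYPOTHETICAL paired data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(f"r = {r:.3f}, r squared = {r ** 2:.2f}")
```

With only five pairs, even an r of about 0.77 may not differ significantly from zero, which illustrates how strongly the hypothesis test depends on sample size.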

Correlation Coefficient
[Figure: three scatterplots labeled +1 (perfect positive linear relationship), 0 (no linear relationship), and -1 (perfect negative linear relationship)]


Correlation Pearls
The closer r is to 1 (either + or -), the more highly
correlated the two variables
No consistent interpretation of the value of r
Pay more attention to the magnitude of the
correlation than to the p-value
VIEW the relationship between the two
variables

Pages 2-135-6

Regression
Statistical technique related to correlation
There are many different types
Simple linear regression:
continuous outcome (dependent) variable
continuous independent (causative) variable

Two main purposes of regression:
Development of prediction model
Accuracy of prediction
Pages 2-136-7

Regression
Development of prediction model
Making predictions of the dependent variable
from the independent variable
Y = mx + b (dependent variable = slope ×
independent variable + intercept)

Pages 2-136-7


Regression
Accuracy of prediction: How well the independent
variable predicts the dependent variable.
Determines the extent of variability in the dependent
variable that can be explained by the independent
variable.
Coefficient of determination (r²) describes this
relationship. Values of r² can range between 0 and 1.

An r² of 0.80: 80% of the variability in Y is
explained by the variability in X.
Statistical tests associated with regression
Pages 2-136-7

Coefficient of determination (r²)
[Figure: three scatterplots with r² ~0.25, r² ~0.5, and r² ~0.80]

Types of Regression

Simple linear regression
Multiple linear regression
Simple logistic regression
Multiple logistic regression
Nonlinear regression
Polynomial regression

Pages 2-136-7


Regression
Example

Y = 0.227 (X) + 0.097
r² = 0.31; p<0.05

What you should know
Slope and intercept?
Required assumptions?
r² interpretation?
Predict antifactor Xa
concentrations at doses of
2 and 3.75 mg/kg
What does the p<0.05
value indicate?

Page 2-138
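
The prediction questions can be answered by substituting into the fitted line. The sketch below assumes X is the dose in mg/kg and Y is the antifactor Xa concentration; the unit U/mL is an assumption carried over from the earlier histogram axis label, and the function name is illustrative.

```python
import math

SLOPE = 0.227        # change in Y per 1 mg/kg increase in dose
INTERCEPT = 0.097    # predicted Y at a dose of 0
R_SQUARED = 0.31     # coefficient of determination from the slide


def predict_anti_xa(dose_mg_per_kg: float) -> float:
    """Predicted antifactor Xa concentration from Y = 0.227 (X) + 0.097."""
    return SLOPE * dose_mg_per_kg + INTERCEPT


predictions = {dose: predict_anti_xa(dose) for dose in (2.0, 3.75)}
for dose, conc in predictions.items():
    print(f"dose {dose} mg/kg -> predicted {conc:.2f} U/mL (unit assumed)")

# Only the magnitude of r can be recovered from r squared; the positive
# slope implies r is positive.
r_magnitude = math.sqrt(R_SQUARED)
print(f"|r| = {r_magnitude:.2f}; {R_SQUARED:.0%} of the variability in Y is explained by X")
```

Because r² is only 0.31, about 69% of the variability in concentration is not explained by dose, so individual predictions carry substantial uncertainty even though p<0.05 indicates the slope differs from zero. Predictions are also only defensible within the range of doses actually studied, which the slide does not give.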

Survival Analysis
Studies the time between entry in a study and
some event (e.g., death, myocardial infarction)
Censoring makes survival methods unique
Subjects do not enter the study at the same time

Page 2-138

Survival Analysis
Kaplan-Meier method
Uses survival times to estimate the proportion of
people who would survive a length of time

Log-Rank Test
Compares the survival distributions of ≥2 groups

Cox proportional hazards model
Evaluates the impact of covariates on survival in
two or more groups
Allows calculation of a hazard ratio (and CI)
Pages 2-138-9
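
How censoring enters the Kaplan-Meier estimate is easiest to see in a small calculation. The sketch below implements the product-limit estimator by hand; the five follow-up times and event flags are hypothetical values invented for illustration, with 1 marking an event and 0 marking a censored subject.

```python
def kaplan_meier(times: list[float], events: list[int]) -> list[tuple[float, float]]:
    """Product-limit survival estimate.
    Returns (time, survival probability) at each time with an event."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect every subject whose follow-up ends at this time.
        while i < len(data) and data[i][0] == current_time:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= (at_risk - n_events) / at_risk
            curve.append((current_time, survival))
        # Censored subjects leave the risk set without changing survival.
        at_risk -= n_leaving
    return curve


# HYPOTHETICAL follow-up times (months) and event flags (1 = event, 0 = censored).
times = [2, 3, 5, 7, 8]
events = [1, 0, 1, 1, 0]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t = {t}: estimated survival = {s:.3f}")
```

The subject censored at month 3 contributes to the risk set up to that point and then drops out of the denominator, which is why the estimate at month 5 is 0.8 × 2/3 rather than 0.8 × 3/4.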


Survival Analysis
Kaplan-Meier method
Log-rank test
HR: 0.54 (0.23-1.00)
p=0.05

Pages 2-138-9

Pharmacotherapy 2010; 30:1117-26

Survival Analysis
Cox proportional hazards model
Most popular method to evaluate the impact
of covariates
Investigates several variables at a time
Actual method of construction/calculation is
complex
Compares survival in two or more groups after
adjusting for other variables
Allows calculation of a hazard ratio (and CI)
Page 2-139
