
BASIC BIOSTATISTICS

Diane Flynn, LTC, MC
Colin Greene, LTC, MC

Objectives
  Overview of Biostatistical Terms and Concepts
  Application of Statistical Tests

Why Use Statistics?

Descriptive Statistics
  identify patterns
  leads to hypothesis generating

Inferential Statistics
  distinguish true differences from random variation
  allows hypothesis testing

Why Use Statistics?


[Figure: Cardiovascular Mortality in Males. SMR (y-axis, 0 to 1.2) for Bangor and Roseto across the groups '35-'44, '45-'54, '55-'64, '65-'74, and '75-'84. Source: AJPH 1992]

Types of Data
  Numerical
    Continuous
    Discrete
  Categorical
    Ordinal
    Nominal

Descriptive Statistics
  Identifies patterns in the data
  Identifies outliers
  Guides choice of statistical test

Percentage of Specimens Testing Positive for RSV

            Jul  Aug  Sep  Oct  Nov  Dec  Jan  Feb  Mar  Apr  May  Jun
South         2    2    5    7   20   30   15   20   15    8    4    3
Northeast     2    3    5    3   12   28   22   28   22   20   10    9
West          2    2    3    3    5    8   25   27   25   22   15   12
Midwest       2    2    3    2    4   12   12   12   10   19   15    8

Descriptive Statistics
[Figure: Percentage of Specimens Testing Positive for RSV, 1998-99. Line chart by month (Jul through Jul) with a y-axis from 0 to 35, one line each for South, Northeast, West, and Midwest]

Describing the Data with Numbers


Measures of Central Tendency

MEAN -- average
MEDIAN -- middle value
MODE -- most frequently observed value(s)
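As a minimal sketch of the three measures above, the following uses Python's standard `statistics` module. The blood pressure values are invented for illustration and do not come from any study in this presentation.

```python
from statistics import mean, median, multimode

# Hypothetical systolic blood pressures (mmHg), invented for illustration
sbp = [118, 122, 122, 130, 135, 141, 150]

print("mean:  ", round(mean(sbp), 1))   # arithmetic average
print("median:", median(sbp))           # middle value of the sorted data
print("mode:  ", multimode(sbp))        # most frequently observed value(s)
```

`multimode` returns a list because a data set can have more than one most frequent value, which matches the "value(s)" wording above.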

[Figure: Distribution of Course Grades. Bar chart of Number of Students (y-axis, 0 to 14) by Grade (x-axis, A through F)]

Describing the Data with Numbers


Measures of Dispersion

RANGE
STANDARD DEVIATION
SKEWNESS

Measures of Dispersion
RANGE -- highest to lowest values
STANDARD DEVIATION -- how closely do values cluster around the mean value
SKEWNESS -- refers to symmetry of curve
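The three measures listed here can be sketched in a few lines of Python. The exam scores are hypothetical, and the `skewness` helper is an illustrative function written for this example (the population Fisher-Pearson coefficient), not a standard-library call.

```python
from statistics import mean, pstdev, stdev

def skewness(values):
    """Population skewness: mean cubed deviation divided by the SD cubed."""
    m = mean(values)
    sd = pstdev(values)
    return sum((x - m) ** 3 for x in values) / (len(values) * sd ** 3)

# Hypothetical exam scores, invented for illustration; the single low
# score creates a tail on the left side of the distribution
scores = [55, 78, 80, 82, 84, 85, 86, 88, 90]

value_range = max(scores) - min(scores)   # distance from highest to lowest
print("range:", value_range)
print("standard deviation:", round(stdev(scores), 1))
print("skewness:", round(skewness(scores), 2))
```

A skewness below zero indicates a longer left tail (negative skew), a value above zero a longer right tail, and a value of zero a symmetric distribution.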


Standard Deviation
[Figure: two distribution curves, labeled Curve A and Curve B]


Skewness
[Figure: Curve A and Curve B, with the positions of the mode, median, and mean marked; labeled "negative skew"]

The Normal Distribution


Mean = median = mode
Skew is zero
About 68% of values fall within ±1 SD of the mean
About 95% of values fall within ±2 SDs of the mean

[Figure: normal curve with the mean, median, and mode marked at the same point]
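The 68% and 95% figures can be checked directly from the normal curve. This short sketch uses `NormalDist` from Python's standard library; the helper name `fraction_within` is chosen here for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def fraction_within(k):
    """Fraction of a normal distribution lying within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2):
    print(f"within {k} SD of the mean: {fraction_within(k):.1%}")
```

The result does not depend on the particular mean or SD, since any normal distribution can be rescaled to the standard one.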

Inferential Statistics
Used to determine the likelihood that a conclusion based on data from a sample is true

Terms
p value: the probability that a difference at least as large as the one observed could have occurred by chance alone, if there were truly no difference
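One way to make the idea of "occurring by chance" concrete is a permutation test: shuffle the group labels many times and count how often chance alone produces a difference at least as large as the one observed. The sketch below is one possible approach, not the method used in any trial cited here; the function name and the blood pressure values are hypothetical.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation p value for a difference in group means."""
    rng = random.Random(seed)          # fixed seed so the result is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)            # reassign values to groups at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations

# Hypothetical follow-up systolic blood pressures (mmHg), invented for illustration
drug_a = [132, 128, 135, 126, 131, 129, 133, 127]
drug_b = [124, 127, 122, 129, 121, 126, 125, 123]

p = permutation_p_value(drug_a, drug_b)
print(f"p = {p:.3f}")
```

A small p value means that random relabeling rarely reproduces a difference as large as the observed one.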

Hypertension Trial

DRUG   Baseline mean SBP   F/u mean SBP
A      150                 130
B      150                 125

Terms
confidence interval: The range of values we can be reasonably certain includes the true value.

30 Day % Mortality

Study      IC STK   Control   p      N
Khaja      5.0      10.0      0.55   40
Anderson   4.2      15.4      0.19   50
Kennedy    3.7      11.2      0.02   250

95% Confidence Intervals
[Figure: 95% confidence intervals for Khaja (n=40), Anderson (n=50), and Kennedy (n=250), plotted on a horizontal axis running from -.40 to .20]
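To illustrate why the larger study yields a narrower interval, the sketch below computes an approximate (Wald) 95% confidence interval for a difference in mortality proportions. The 30-day mortality table gives only the total N for each study, so the code assumes, purely for illustration, that each study split its patients equally between the two arms. The function name is hypothetical, and this is not necessarily how the intervals in the original figure were calculated.

```python
from math import sqrt
from statistics import NormalDist

def risk_difference_ci(p_treated, n_treated, p_control, n_control, level=0.95):
    """Approximate (Wald) confidence interval for p_treated - p_control."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    diff = p_treated - p_control
    se = sqrt(p_treated * (1 - p_treated) / n_treated
              + p_control * (1 - p_control) / n_control)
    return diff - z * se, diff + z * se

# ASSUMPTION: equal arm sizes (total N / 2); the table does not report them
khaja = risk_difference_ci(0.050, 20, 0.100, 20)
anderson = risk_difference_ci(0.042, 25, 0.154, 25)
kennedy = risk_difference_ci(0.037, 125, 0.112, 125)

for name, (low, high) in [("Khaja", khaja), ("Anderson", anderson), ("Kennedy", kennedy)]:
    print(f"{name:9s} {low:+.3f} to {high:+.3f}")
```

Under these assumptions, an interval that includes zero is consistent with no difference in mortality, while an interval lying entirely below zero suggests lower mortality in the treated group.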

Types of Errors

                               Truth
Conclusion        No difference        Difference
No difference     (correct)            TYPE II ERROR (β)
Difference        TYPE I ERROR (α)     (correct)

Power = 1 - β
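The relationship between α, β, and power can be sketched for a simple two-sample comparison of means. The function below assumes a two-sided z test with a known, common standard deviation; the function name, the 5 mmHg difference, and the SD of 15 mmHg are hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_power(true_difference, sd, n_per_group, alpha=0.05):
    """Power (1 - beta) of a two-sided, two-sample z test with known SD."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)      # rejection threshold set by alpha
    se = sd * sqrt(2 / n_per_group)        # standard error of the difference
    shift = true_difference / se           # true difference in SE units
    # Probability that the test statistic lands in either rejection region
    return z.cdf(-z_crit + shift) + z.cdf(-z_crit - shift)

# Hypothetical scenario: true difference of 5 mmHg, SD of 15 mmHg
for n in (20, 50, 150):
    power = two_sample_power(5, 15, n)
    print(f"n per group = {n:3d}: power = {power:.2f}, beta = {1 - power:.2f}")
```

When the true difference is zero, the function returns α itself, which is the Type I error rate. As the sample size grows, β shrinks and power rises.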

What Test Do I Use?


1. What type of data?
2. How many samples?
3. Are the data normally distributed?
4. What is the sample size?
