
Data is Everywhere

Research Literature
Hypothesis: Surgeon-directed institutional peer review, associated with positive physician feedback, can decrease the morbidity and mortality rates associated with carotid endarterectomy.
Results: The stroke rate decreased from 3.8% (1993-1994) to 0% (1997-1998). The mortality rate decreased from 2.8% (1993-1994) to 0% (1997-1998). The average length of stay decreased from 4.7 days (1993-1994) to 2.6 days (1997-1998). The average total cost decreased from $13,344 (1993-1994) to $9,548 (1997-1998).
Archives of Surgery, August 2000

Popular Press
"For the first time, an influential doctors group is recommending that some children as young as 8 be given cholesterol-fighting drugs to ward off future heart problems... With one-third of U.S. children overweight and about 17 percent obese, the new recommendations are important, said Dr. Jennifer Li, a Duke University children's heart specialist."
cnn.com, July 8, 2008

Data Provides Information

Good data can be analyzed and summarized to provide useful information.
Bad data can be analyzed and summarized to provide incorrect, harmful, or non-informative information.

Steps in a Research Project
- Planning
- Design
- Data Collection
- Data Analysis
- Presentation
- Interpretation

Biostatistics
- Design of Studies
  Sample size
  Selection of study participants
  Role of randomization
- Data Collection
- Variability
  Important patterns in data are obscured by variability.
  Distinguish real patterns from random variation.
- Inference
  Draw general conclusions from limited data, e.g., a survey.
- Summarize
  What summary measures will best convey the results?
  How do we convey uncertainty in results?
- Interpretation
  What do the results mean in terms of practice, the program, the population, etc.?

1954 Salk Polio Vaccine Trial

School children were randomly assigned to vaccine (n = 200,745) or placebo (n = 201,229).

Polio Cases
Vaccine: 82 out of 200,745
Placebo: 162 out of 201,229

Reference: Meier P, "The Biggest Public Health Experiment Ever: The 1954 Field Trial of the Salk Poliomyelitis Vaccine," in Statistics: A Guide to the Unknown, 1972.

There were almost twice as many polio cases in the placebo group as in the vaccine group.

Design: Features of the Polio Trial
- Comparison group
- Randomized
- Placebo controls
- Double blind

Objective: The groups should be equivalent except for the factor (vaccine) being investigated.

Question: Could the results be due to chance? Could we get such great imbalance by chance? p-value = ?
Statistical methods tell us how to make these probability calculations.

Types of Data

There are different statistical methods for different types of data.

Binary (dichotomous) data
- Polio: Yes/No
- Cure: Yes/No
- Gender: Male/Female
To compare the number of polio cases in the two treatment arms of the Salk polio vaccine trial, you could use Fisher's Exact Test or the Chi-Square Test.

Categorical data
- Race/ethnicity (nominal: no ordering)
- Country of birth (nominal: no ordering)
- Degree of agreement (ordinal: ordered)

Continuous data (finer measurements)
- Blood pressure, weight, height, age
To compare blood pressure in a clinical trial evaluating two blood-pressure-lowering medications, you could use the 2-sample t-test or the Wilcoxon Rank Sum (nonparametric) test.

Time-to-event data
- Time in remission

Sample Mean (X̄)

Notes on the Sample Mean (X̄)

Add up the data, then divide by the sample size (n). The sample size n is the number of observations (pieces of data).

Formula:
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$

- Also called the sample average or arithmetic mean.
- Sensitive to extreme values: one data point could make a great change in the sample mean.
- Why is it called the sample mean? To distinguish it from the population mean.

Example: n = 5 systolic blood pressures (mmHg):
X1 = 120, X2 = 80, X3 = 90, X4 = 110, X5 = 95

X̄ = (120 + 80 + 90 + 110 + 95)/5 = 99 mmHg

Population Versus Sample

Population: the entire group about which you want information, e.g., the blood pressures of all 18-year-old male college students in the U.S. The population mean is μ.

Sample: a part of the population from which we actually collect information, used to draw conclusions about the whole population, e.g., blood pressures from a sample of n = 5 18-year-old male college students in the U.S. The sample mean is X̄.

The sample mean X̄ is not the population mean μ.
- We don't know the population mean μ, but we would like to.
- We draw a sample from the population and calculate the sample mean X̄.
- How close is X̄ to μ? Statistical theory will tell us how close X̄ is to μ.

STATISTICAL INFERENCE IS THE PROCESS OF TRYING TO DRAW CONCLUSIONS ABOUT THE POPULATION FROM THE SAMPLE. We will return to this later.

Sample Median

The median is the middle number:
80  90  95  110  120
Median = 95

Not sensitive to extreme values: if 120 became 200, the median would not change, but the mean would change a great deal (it becomes 115).

If the sample size is an even number, average the two middle numbers:
80  90  95  110  120  125
Median = (95 + 110)/2 = 102.5 mmHg
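To make the arithmetic concrete, here is a minimal Python sketch (not part of the original lecture; it assumes numpy is installed) that computes the sample mean and median for these blood pressures and shows the median's insensitivity to an extreme value.

```python
import numpy as np

bp = np.array([120, 80, 90, 110, 95])  # n = 5 systolic BPs (mmHg)

print(np.mean(bp))    # 99.0 -> sample mean
print(np.median(bp))  # 95.0 -> sample median (middle value)

# Replace 120 with an extreme 200: the mean jumps, the median does not
bp_extreme = np.array([200, 80, 90, 110, 95])
print(np.mean(bp_extreme))    # 115.0
print(np.median(bp_extreme))  # 95.0
```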

Describing Variability

How can we describe the spread of the distribution?
- Minimum and maximum
- Range = Max − Min

The sample variance is essentially the average of the squared deviations about the sample mean (dividing by n − 1 rather than n):

$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$$

Why n − 1? Stay tuned.

The sample standard deviation (s or SD) is the square root of s².

Calculating s

Example: n = 5 systolic blood pressures (mmHg):
X1 = 120, X2 = 80, X3 = 90, X4 = 110, X5 = 95

Sample mean: X̄ = 99 mmHg
Sample variance: s² = 255
Sample standard deviation: s = 15.97 mmHg

Notes on s
- The bigger s is, the more variability there is.
- s measures the spread about the mean.
- s can equal 0 only if there is no spread, i.e., all n observations have the same value.
- The units of s are the same as the units of the data (e.g., mmHg).
- Often abbreviated SD.
- s is the best estimate of the population standard deviation σ.

Interpretation: most of the population will be within about 2 standard deviations (s) of the mean. For a normally (Gaussian) distributed population, "most" is about 95%.
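As a quick check of the arithmetic above, here is a short Python sketch (again assuming numpy; not from the original slides) that reproduces s² = 255 and s = 15.97 for the five blood pressures.

```python
import numpy as np

bp = np.array([120, 80, 90, 110, 95])

xbar = bp.mean()                               # 99.0 mmHg
s2 = ((bp - xbar) ** 2).sum() / (len(bp) - 1)  # divide by n - 1
s = np.sqrt(s2)

print(xbar, s2, s)         # 99.0 255.0 15.97...
print(np.var(bp, ddof=1))  # 255.0 -- ddof=1 gives the n-1 (sample) variance
print(np.std(bp, ddof=1))  # 15.97...
```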

More Notes about SD: Why do we divide by n − 1 instead of n?

We want to replace the population mean μ with X̄ in the formula for s²:

$$s^2 = \frac{\sum (X_i - \bar{X})^2}{n-1}$$

Because we don't know μ, we use X̄. But Σ(Xi − X̄)² tends to be smaller than Σ(Xi − μ)². So, to compensate, we divide by a smaller number: n − 1 instead of n.

n − 1 is called the degrees of freedom of the variance. Why?
- The sum of the deviations is zero.
- The last deviation can be found once we know the other n − 1.
- So only n − 1 of the squared deviations can vary freely.

The term "degrees of freedom" arises in other statistics. It is not always n − 1, but it is in this case.

Other Measures of Variation
- Standard deviation (SD or s)
- Minimum and maximum observation
- Range = Max − Min

What happens to these as the sample size increases? Do they tend to increase? Tend to decrease? Remain about the same?

Continuous Variables: Histograms

Means and medians do not tell the whole story:
- Differences in spread (variability)
- Differences in shape of the distribution

Histograms are a way of displaying the distribution of a set of data by charting the number (or percentage) of observations whose values fall within pre-defined numerical ranges.

How to Make a Histogram

Divide the data into (equal) intervals, then count the number in each interval.

Table 20: Resident Population by Age and State (2000); values are the percent of residents aged 65 and over.

State          %      State            %      State            %
Alabama       13.0    Louisiana       11.6    Ohio            13.3
Alaska         5.7    Maine           14.4    Oklahoma        13.2
Arizona       13.0    Maryland        11.3    Oregon          12.8
Arkansas      14.0    Massachusetts   13.5    Pennsylvania    15.6
California    10.6    Michigan        12.3    Rhode Island    14.5
Colorado       9.7    Minnesota       12.1    South Carolina  12.1
Connecticut   13.8    Mississippi     12.1    South Dakota    14.3
Delaware      13.0    Missouri        13.5    Tennessee       12.4
Florida       17.6    Montana         13.4    Texas            9.9
Georgia        9.6    Nebraska        13.6    Utah             8.5
Hawaii        13.3    Nevada          11.0    Vermont         12.7
Idaho         11.3    New Hampshire   12.0    Virginia        11.2
Illinois      12.1    New Jersey      13.2    Washington      11.2
Indiana       12.4    New Mexico      11.7    West Virginia   15.3
Iowa          14.9    New York        12.9    Wisconsin       13.1
Kansas        13.3    North Carolina  12.0    Wyoming         11.7
Kentucky      12.5    North Dakota    14.7

Source: Statistical Abstract of the United States, 2001.
www.census.gov/prod/2002pubs/01statab/stat-ab01.html

Count the observations in each class. Here are the counts:

Class         Count    Class          Count    Class          Count
4.1 to 5.0      0      9.1 to 10.0      3      14.1 to 15.0     5
5.1 to 6.0      1      10.1 to 11.0     2      15.1 to 16.0     2
6.1 to 7.0      0      11.1 to 12.0     9      16.1 to 17.0     0
7.1 to 8.0      0      12.1 to 13.0    14      17.1 to 18.0     1
8.1 to 9.0      1      13.1 to 14.0    12      18.1 to 19.0     0

How to Make a Simple Histogram
- Divide the range of the data into intervals (bins) of equal width.
- Count the number of observations in each class.
- Draw the histogram.
- Label the scales.

[Figure: frequency histogram of the percent of residents over 65, by state; y-axis: number of states.]

Pictures of Data: Histograms
[Figure: three histograms of systolic blood pressure (mm Hg) for the same population of men, drawn with bin widths of 5 mm Hg, 1 mm Hg, and 20 mm Hg; y-axis: number of men in population.]

How many intervals (bins) should you have in a histogram?
There is no perfect answer to this; it depends on the sample size n.
Rough guideline: number of intervals ≈ √n

n       Number of Intervals
10      about 3
50      about 7
100     about 10
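Here is a minimal Python sketch of these steps (not from the original lecture; it assumes numpy and matplotlib are installed, and uses simulated blood pressures purely for illustration), applying the √n bin guideline.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: simulate 100 systolic BPs for illustration only
rng = np.random.default_rng(0)
bp = rng.normal(loc=125, scale=14, size=100)

# Rough guideline: number of bins ~ sqrt(n)
n_bins = int(round(np.sqrt(len(bp))))  # 10 bins for n = 100

plt.hist(bp, bins=n_bins, edgecolor="black")  # count observations per bin
plt.xlabel("Systolic Blood Pressure (mm Hg)")  # label the scales
plt.ylabel("Number of Men")
plt.show()
```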

Other Types of Histograms

[Figure: IgM concentrations (g/l) in 324 children, displayed as a relative frequency histogram and as a relative frequency polygon; y-axis: relative frequency (%).]

Histogram applet at http://www.stat.sc.edu/~west/javahtml/Histogram.html

Stem and Leaf Plot

 9 | 79
10 | 1166788999
11 | 0112333444555667777889
12 | 00111111233445555566777777889
13 | 0011123333456667788999
14 | 0111224446
15 | 003
16 | 05

Boxplots

[Figure: boxplot of values ranging from about 100 to 160, annotated with the largest non-outlier, 75th percentile, sample median, 25th percentile, smallest non-outlier, and an outlier.]

Shapes of the Distribution
- Symmetrical and bell-shaped
- Positively skewed, or skewed to the right
- Negatively skewed, or skewed to the left
- Bimodal
- Reverse J-shaped
- Uniform

Many Distributions Are Not Symmetric
[Figure: skewed distributions from 1976 and 1988. Source: Silbergeld, Annual Rev. Public Health, 1997.]

Distribution Characteristics
- Right skewed (positively skewed): long right tail; mean > median; e.g., hospital stays.
- Left skewed (negatively skewed): long left tail; mean < median; e.g., humidity (can't get over 100%).
- Symmetric: right and left sides are mirror images; the left tail looks like the right tail; mean ≈ median ≈ mode.

Outlier: an individual observation that falls outside the overall pattern of the graph.

Note on Shapes of Distributions
[Figure: a right-skewed curve with the mode, median, and mean marked from left to right.]
The mean is the balancing point of the distribution.

The Histogram and the Probability Density
[Figure: histograms of systolic blood pressure (mm Hg), y-axis: number of men in population, for a medium sample, a large sample, and the entire population; as the sample grows, the histogram approaches the smooth probability density of the population.]

The Probability Density

- The probability density is a smooth idealized curve that shows the shape of the distribution in the population.
- This is generally a theoretical distribution that we can never see; we can only estimate it from the distribution presented by a representative (random) sample from the population.
- Areas in an interval under the curve represent the percent of the population in the interval.

[Figure: histogram of serum albumin (g/l), roughly 25 to 50, with frequency on the y-axis.]

What is the most well-known distribution? The Normal (Gaussian) Distribution.

The Normal (Gaussian) Distribution
- Symmetric
- Bell-shaped
- Mean ≈ Median ≈ Mode

The Normal Distribution

There are lots of normal distributions. You can tell which normal distribution you have by knowing the mean and standard deviation:
- The mean (μ) is the center.
- The standard deviation (σ) measures the spread (variability).

Applet at http://stat-www.berkeley.edu/~stark/Java/Html/StandardNormal.htm

The Normal Distribution

Areas under a normal curve represent the proportion of the total values described by the curve that fall in that range.
[Figure: a normal curve with a shaded region; the shaded area is approximately 29% of the total area under the curve.]

The 68-95-99.7 Rule

In any normal distribution, approximately:
- 68% of the observations fall within one standard deviation of the mean.
- 95% of the observations fall within two standard deviations* of the mean.
- 99.7% of the observations fall within three standard deviations of the mean.
*more precisely, 1.96

Distribution of Heights in Females Age 18-24
- Approximately normal
- Mean: 65 inches
- Standard deviation: 2.5 inches
[Figure: normal curve over heights from 57.5 to 72.5 inches, with the central 68% shaded.]

The rule says that if a population is normally distributed, then approximately 68% of the population will be within 1 SD of the mean.
It doesn't guarantee that exactly 68% of your sample of data will fall within 1 SD of X̄. Why? The rule works better if the sample size is big.

Standard Normal Distribution
μ = 0 and σ = 1

Standard Normal Scores (Z-Scores)

How many standard deviations from the population mean are you?

$$Z = \frac{\text{observation} - \text{population mean}}{\text{standard deviation}}$$

A standard score of:
- Z = 1 means the observation lies one SD above the mean.
- Z = 2 means the observation lies two SD above the mean.
- Z = −1 means the observation lies one SD below the mean.
- Z = −2 means the observation lies two SD below the mean.

Z-Scores

Example: female heights, mean = 65 inches, s = 2.5 inches.

Height = 72.5 inches:
Z = (72.5 − 65)/2.5 = +3.0

Height = 60 inches:
Z = (60 − 65)/2.5 = −2.0

What's the usefulness of standard normal scores?
- It tells you how many SDs an observation is from the mean.
- Thus, it is a way of quickly assessing how unusual an observation is.

Suppose the mean height is 65 inches and s = 2.5. Is 72.5 inches unusually tall? If we know Z = 3.0, does that help us?
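A short Python sketch (not part of the slides; it assumes scipy is installed) computes these Z-scores and uses the standard normal distribution to answer "how unusual?" directly.

```python
from scipy.stats import norm

mean, sd = 65, 2.5             # female heights (inches)

z_tall = (72.5 - mean) / sd    # +3.0
z_short = (60 - mean) / sd     # -2.0

# How unusual is Z = 3.0? Fraction of a normal population above it:
print(1 - norm.cdf(z_tall))    # ~0.00135, i.e., about 0.13%
# Fraction below Z = -2.0:
print(norm.cdf(z_short))       # ~0.0228, i.e., about 2.28%
```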

Assuming the population has a normal distribution, the fraction of the population that is:

Z     Within Z SDs    More than Z SDs   More than Z SDs   More than Z SDs
      of the mean     above the mean    below the mean    above or below the mean
0.5   38.29%          30.85%            30.85%            61.71%
1.0   68.27%          15.87%            15.87%            31.73%
1.5   86.64%           6.68%             6.68%            13.36%
2.0   95.45%           2.28%             2.28%             4.55%
2.5   98.76%           0.62%             0.62%             1.24%
3.0   99.73%           0.13%             0.13%             0.27%
3.5   99.95%           0.02%             0.02%             0.05%

Normal probability applet at www-stat.stanford.edu/~naras/jsm/FindProbability.html

Problems

Suppose the population is normally distributed:
- If you have a standard score of Z = 3, what % of the population would have scores greater than you?
- If you have a standard score of Z = 1.5, what % of the population would have scores less than you?
- If you have a standard score of Z = −2, what % of the population would have scores greater than you?
- If you have a standard score of Z = −2, what % of the population would have scores less than you?

Suppose we call "unusual" those observations that are either at least 2 SD above the mean or at least 2 SD below the mean. What % are unusual? In other words, what % of the observations will have a standard score either Z > +2.0 or Z < −2.0?

- What % of the observations would have |Z| > 3.0?
- What % of observations would have |Z| > 1.15?
- What % would have |Z| > 2?
- What % of the observations would have |Z| > 1.0 (i.e., more than 1 SD away from the mean)?

Normal Distribution

Tabulated values are the proportion of the standard Normal distribution outside the range ±z, where z is a standard Normal deviate (also called two-sided p-values). Abridged table; the full table runs from z = 0.00 to 2.99 in steps of 0.01:

z      P         z      P         z      P
0.00   1.0000    1.00   0.3173    2.00   0.0455
0.10   0.9203    1.10   0.2713    2.10   0.0357
0.20   0.8415    1.20   0.2301    2.20   0.0278
0.30   0.7642    1.30   0.1936    2.30   0.0214
0.40   0.6892    1.40   0.1615    2.40   0.0164
0.50   0.6171    1.50   0.1336    2.50   0.0124
0.60   0.5485    1.60   0.1096    2.58   0.0099
0.70   0.4839    1.70   0.0891    2.60   0.0093
0.80   0.4237    1.80   0.0719    2.70   0.0069
0.90   0.3681    1.90   0.0574    2.80   0.0051
                 1.96   0.0500    2.90   0.0037

The above results will turn out to be very important later in our discussion of p-values.
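These tabulated two-sided p-values can be reproduced directly; here is a minimal sketch (not from the slides, assuming scipy is available).

```python
from scipy.stats import norm

# Proportion of the standard Normal outside +/- z (a two-sided p-value)
def two_sided_p(z: float) -> float:
    return 2 * (1 - norm.cdf(abs(z)))

print(two_sided_p(1.00))  # 0.3173
print(two_sided_p(1.96))  # 0.0500
print(two_sided_p(2.58))  # 0.0099
```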

Is every variable normally distributed?

Absolutely not. Then why do we spend so much time studying the normal distribution?
1. Some variables are normally distributed.
2. A bigger reason is the Central Limit Theorem (next lecture).

Population versus Sample

The population of interest could be:
- All women between ages 30 and 40
- All patients with a particular disease

The sample is a small number of individuals from the population; the sample is a subset of the population.

Population versus Sample

A parameter is a number that describes the population. A parameter is a fixed number, but in practice we do not know its value. Examples: the population mean μ, the population proportion p.

A statistic is a number that describes a sample of data. A statistic can be calculated. We often use a statistic to estimate an unknown parameter. Examples: the sample mean X̄, the sample proportion p̂.

Sample mean (X̄) versus population mean (μ), e.g., mean blood pressure:
- We know the sample mean (e.g., X̄ = 99 mmHg).
- We don't know the population mean μ, but we would like to.

Sample proportion versus population proportion, e.g., the proportion of individuals with health insurance:
- We know the sample proportion (e.g., 80%).
- We don't know the population proportion.

Key Question: How close is the sample mean (or proportion) to the population mean (or proportion)?

Sources of Error

Errors from biased sampling: the study systematically favors certain outcomes.
- Voluntary response
- Non-response
- Convenience sampling
Solution: random sampling.

Errors from (random) sampling:
- Caused by chance occurrence.
- You can get a bad sample because of bad luck (by "bad" I mean not representative).
- Can be controlled by taking a larger sample.
- Using mathematical statistics, we can figure out how much potential error there is from random sampling (the standard error).

Some Examples of Potentially Biased Sampling

Example: a blood pressure study of women age 30-40.
- Volunteers: non-random; selection bias.
- Family members: non-random; not independent.
- Telephone survey with random digit dialing: random or non-random sample?

Example: a clinic population of 100 consecutive patients.
- Random or non-random sample?
- Convenience samples are sometimes assumed to be random.

Example: the Literary Digest poll of the 1936 presidential election.
- Election result: 62% voted for Roosevelt.
- Digest prediction: 43% voted for Roosevelt.
- Problem: sampling bias.

Selection bias: a mail questionnaire was sent to 10 million people, drawn from sources such as telephone books and club lists. Poor people were unlikely to have a telephone (only 25% had telephones).

Non-response bias: only about 20% responded (2.4 million), and responders differed from non-responders.

Bottom Line
- When a selection procedure is biased, taking a larger sample does not help; this just repeats the mistake on a larger scale.
- Non-respondents can be very different from respondents. When there is a high non-response rate, look out for non-response bias.

Random Sample

- When a sample is randomly selected from a population, it is called a random sample.
- In a simple random sample, each individual in the population has an equal chance of being chosen for the sample.
- Random sampling helps control systematic bias.
- But even with random sampling, there is still sampling variability or error.

Sampling Variability

If we repeatedly choose samples from the same population, a statistic will take different values in different samples.

IDEA: If the statistic does not change much when you repeat the study (you get the same answer each time), then it is fairly reliable (not a lot of variability).

Example

Estimate the proportion of persons in a population who have health insurance. Choose a sample of size n = 1373.

Sample 1: p̂ = 1100/1373 = .8012
Sample 2: p̂ = 1090/1373 = .7939
Sample 3: p̂ = .8347
Sample 4: p̂ = .7786
and so on

Is the sample proportion reliable? If we took another sample of another 1373 persons, would the answer bounce around a lot?

The Sampling Distribution

The spread of the sampling distribution depends on the sample size.
[Figure: histograms of 1000 sample proportions with health insurance, based on samples of size n = 300, n = 1000, and n = 1373; the larger the sample size, the narrower the spread.]

Let's explore this...

Population distribution of health insurance: each person either has health insurance or does not, and the population proportion with insurance is p = .80.

Let's do an experiment...

Take 500 separate random samples from this population of patients, each with n = 20 patients. For each of the 500 samples, we will plot the health insurance status and record the sample proportion. Ready, set, go...
[Figure: two example samples of n = 20, with sample proportions p̂ = 0.9 and p̂ = 0.6.]

So we did this 500 times... Looking at a histogram of the 500 sample proportions, each based on a sample of size 20: it is centered at p̂ = 0.8 with standard deviation s = 0.11.

Let's do ANOTHER experiment... Take 500 separate random samples, each with n = 50 patients, again plotting the health insurance status and recording the sample proportion.
[Figure: two example samples of n = 50, with sample proportions p̂ = 0.8 and p̂ = 0.7.]

So we did this 500 times... The histogram of the 500 sample proportions, each based on a sample of size 50, is centered at p̂ = 0.8 with standard deviation s = 0.06.

Let's do ANOTHER experiment... Take 500 separate random samples, each with n = 100 patients.
[Figure: two example samples of n = 100, with sample proportions p̂ = 0.76 and p̂ = 0.83.]

So we did this 500 times... The histogram of the 500 sample proportions, each based on a sample of size 100, is centered at p̂ = 0.8 with standard deviation s = 0.04.

Let's Review

Population: p = .8 (health insurance / no health insurance). A sketch of this simulation follows below.

n       Mean of 500 sample proportions    SD of the sample proportions
 20     0.799                             0.11
 50     0.803                             0.06
100     0.798                             0.04
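The experiment above is easy to re-run yourself. Here is a minimal Python sketch (not part of the lecture; it assumes numpy, and the random seed and exact output values are illustrative) of drawing 500 samples at each sample size and summarizing the sample proportions.

```python
import numpy as np

rng = np.random.default_rng(1)
p_true = 0.80  # population proportion with health insurance

for n in (20, 50, 100):
    # 500 samples of size n; each binomial count, divided by n, is one p-hat
    phats = rng.binomial(n, p_true, size=500) / n
    print(n, phats.mean().round(3), phats.std(ddof=1).round(3))
# The SD of the 500 sample proportions shrinks roughly like
# sqrt(p(1-p)/n) as n grows, matching the pattern on the review slide
```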

Let's do an experiment...

Population distribution of blood pressures: normal with μ = 125 mm Hg and σ = 14 mm Hg.
[Figure: normal curve of systolic blood pressure (mm Hg), roughly 80 to 160; y-axis: percentage of men in population.]

Take 500 separate random samples from this population of men, each with n = 20 subjects. For each of the 500 samples, we will plot a histogram of the sample BP values and record the sample mean and sample standard deviation. Ready, set, go...
[Figure: two example samples of n = 20, with sample means 125.17 and 124.3 and sample SDs 12.36 and 11.65.]

So we did this 500 times... The histogram of the 500 sample means, each based on a sample of size 20, is centered at X̄ = 125 with standard deviation s_X̄ = 3.07.

Let's do ANOTHER experiment... Take 500 separate random samples, each with n = 50 subjects, again recording the sample mean and sample standard deviation.
[Figure: two example samples of n = 50, with sample means 124.98 and 126.72 and sample SDs 14.05 and 13.64.]

So we did this 500 times... The histogram of the 500 sample means, each based on a sample of size 50, is centered at X̄ = 125.01 with s_X̄ = 1.93.

Let's do ANOTHER experiment... Take 500 separate random samples, each with n = 100 subjects.
[Figure: two example samples of n = 100, with sample means 125.06 and 127.32 and sample SDs 13.15 and 14.93.]

So we did this 500 times... The histogram of the 500 sample means, each based on a sample of size 100, is centered at X̄ = 124.93 with s_X̄ = 1.41.

Let's Review

Population: normal with μ = 125 mm Hg, σ = 14 mm Hg.

n       Mean of 500 sample means    SD of the sample means (s_X̄)
 20     124.997                     3.07
 50     125.015                     1.93
100     124.934                     1.41

Population distribution of hospital length of stay: right-skewed, with μ = 4 days and σ = 3 days.
[Figure: right-skewed distribution of length of stay (in days), ranging from 0 to about 30; y-axis: percentage.]

Let's do an experiment...

Take 500 separate random samples from this population of hospital admissions, each with n = 16 patients. For each of the 500 samples, we will plot a histogram of the sample LOS values and record the sample mean and sample standard deviation. Ready, set, go...
[Figure: two example samples of n = 16, with sample means 4.7 and 5.01 and sample SDs 2.88 and 2.73.]

So we did this 500 times... The histogram of the 500 sample means, each based on a sample of size 16, is centered at X̄ = 4.08 with s_X̄ = 0.74.

Let's do ANOTHER experiment... Take 500 separate random samples of hospital admissions, each with n = 64 patients, again recording the sample mean and sample standard deviation.
[Figure: two example samples of n = 64, with sample means 4.26 and 4.08 and sample SDs 2.72 and 2.45.]

So we did this 500 times... The histogram of the 500 sample means, each based on a sample of size 64, is centered at X̄ = 4.1 with s_X̄ = 0.37.

Let's do ANOTHER experiment... Take 500 separate random samples, each with n = 256 patients.
[Figure: two example samples of n = 256, with sample means 4.48 and 4.29 and sample SDs 3.32 and 2.76.]

So we did this 500 times... Looking at a histogram of the 500 sample means, each based on a sample of size 256:

Let's Review

Population: right-skewed, μ = 4 days, σ = 3 days.

n       Mean of 500 sample means    SD of the sample means (s_X̄)
 16     4.081                       0.74
 64     4.104                       0.37
256     4.100                       0.19

Even though the population distribution of length of stay is skewed, the histograms of the sample means look increasingly normal as n grows; a simulation sketch follows below.
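To see this pattern yourself, here is a minimal Python sketch (not part of the lecture). As a stand-in for the skewed length-of-stay population, it assumes a gamma distribution chosen to have mean 4 and SD 3; the seed and exact outputs are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

# Skewed stand-in for length of stay: gamma with mean 4 and SD 3
# (shape * scale = 4 and shape * scale**2 = 9)
shape, scale = 16 / 9, 9 / 4

for n in (16, 64, 256):
    samples = rng.gamma(shape, scale, size=(500, n))
    means = samples.mean(axis=1)  # 500 sample means
    print(n, means.mean().round(2), means.std(ddof=1).round(2))
# The SD of the sample means is about 3/sqrt(n): it halves each time n
# quadruples, matching the 0.74 / 0.37 / 0.19 pattern on the review slide
```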

Variation in sample mean values is tied to the size of each sample, NOT the number of samples.

The Sampling Distribution

The sampling distribution of a sample statistic refers to what the distribution of the statistic would look like if we chose a large number of samples from the same population.

Sampling distribution applet: www.ruf.rice.edu/~lane/stat_sim/sampling_dist/index.html

[Figure: histograms of sample means based on 500 versus 5000 simulations, for n = 16, n = 64, and n = 256; increasing the number of simulations does not change the spread.]

Sampling Distribution of a Sample Mean

The sampling distribution of a sample mean is a theoretical probability distribution. It describes the distribution of all sample means, from all possible random samples of the same size, taken from a population.

In real research it is impossible to estimate the sampling distribution of a sample mean by actually taking multiple random samples from the same population; no research would ever happen if a study needed to be repeated multiple times to understand this sampling behavior. Simulations are useful to illustrate a concept, but not as a practical approach! Luckily, there is some mathematical machinery that generalizes some of the patterns we saw in the simulation results.

Amazing Result

Mathematical statisticians have figured out how to predict what the sampling distribution will look like without actually repeating the study numerous times and having to choose a sample each time. Often, the sampling distribution will look normal.

[Figure: histogram of sample proportions with health insurance from samples of size n = 1373; the histogram looks normal.]

The Big Idea

It's not practical to keep repeating a study to evaluate sampling variability and to determine the sampling distribution. Mathematical statisticians have figured out how to calculate it without doing multiple studies.

The sampling distribution of a statistic is often normally distributed. This mathematical result comes from the CENTRAL LIMIT THEOREM. For the theorem to work, it requires the sample size (n) to be large (usually n > 60 suffices).

Statisticians have derived formulas to calculate the standard deviation of the sampling distribution; it is called the standard error of the statistic.

Central Limit Theorem

If the sample size is large, the distribution of sample means approximates a normal distribution.

Illustration of the Central Limit Theorem
[Figure: distributions of the mean value when rolling one die, two dice, and five dice, and of means based on n = 16, n = 32, and n = 64; as the number of dice (or the sample size) increases, the distribution of the mean looks more and more normal.]

Why is the normal distribution so important in the study of statistics?

It's not because things in nature are always normally distributed (although sometimes they are). It's because of the Central Limit Theorem: the sampling distribution of statistics (like a sample mean) often follows a normal distribution if the sample sizes are large.

Why is the sampling distribution so important?

If a sampling distribution has a lot of variability (i.e., a big standard error), then if you took another sample it's likely you would get a very different result. About 95% of the time, the sample mean (or proportion) will be within 2 standard errors of the population mean (or proportion). This tells us how close the sample statistic should be to the population parameter.

Standard Errors (SE)

The standard deviation IS NOT the standard error of a statistic.

The standard deviation measures the variability among individual observations.

The standard error measures the precision of a statistic, such as a sample mean or proportion, that is calculated from a number (n) of different observations; the sample mean and sample proportion are trying to estimate the population mean or population proportion. A small SE means the statistic is more precise. The SE is the standard deviation of the sampling distribution of the statistic.

Mathematical statisticians have come up with formulas for the standard error. There are different formulae for:
- the standard error of the mean (SEM)
- the standard error of a proportion

These formulae always involve the sample size n. As the sample size gets bigger, the standard error gets smaller.

Standard Error of the Mean (SEM)

This is a measure of the precision of the sample mean:

$$SEM = \frac{s}{\sqrt{n}}$$

Example: measure systolic blood pressure on a random sample of 100 students.
Sample size: n = 100
Sample mean: X̄ = 123.4 mmHg
Sample SD: s = 14.0 mmHg
SEM = 14/√100 = 1.4 mmHg

Notes on SEM
- The smaller SEM is, the more precise X̄ is.
- SEM depends on n and s. SEM gets smaller if s gets smaller or n gets bigger.

Question: How close to the population mean (μ) is the sample mean (X̄)?

ANSWER: The standard error of the sample mean tells us that 95% of the time, the population mean will lie within about 2 standard errors of the sample mean.

95% Confidence Interval for the Population Mean

X̄ ± 2 SEM
(More accurately: X̄ ± 1.96 SEM)

The CI gives the range of plausible values for μ.

Example: blood pressure, with n = 100, X̄ = 123.4 mmHg, s = 14. The 95% CI is
123.4 ± 2 × 1.4
123.4 ± 2.8

Why is this true? Because of the Central Limit Theorem.

INTERPRETATION: We are 95% confident that the sample mean is within 2.8 mmHg of the population mean. The 95% error bound is 2.8.

Ways to write a confidence interval:
120.6 to 126.2
(120.6, 126.2)
(120.6-126.2)

We are highly confident that the population mean falls in the range 120.6 to 126.2.
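A minimal Python sketch of this calculation (not from the slides; it uses only the standard library, and the ±2 multiplier follows the slide's rule of thumb, with 1.96 being the more precise value):

```python
import math

n, xbar, s = 100, 123.4, 14.0  # blood pressure example
sem = s / math.sqrt(n)         # 1.4 mmHg

z = 2                          # slide's rule of thumb; 1.96 more precisely
lo, hi = xbar - z * sem, xbar + z * sem
print(f"SEM = {sem:.2f}; 95% CI = ({lo:.1f}, {hi:.1f})")
# SEM = 1.40; 95% CI = (120.6, 126.2)
```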

Notes on Confidence Intervals

Interpretation: plausible values for the population mean μ, with high confidence.

Are all CIs 95%? No; 95% is just the most commonly used.
- A 99% CI is wider; a 90% CI is narrower.
- To be more confident you need a bigger interval:
  for a 99% CI you need ±2.576 SEM;
  for a 95% CI you need ±2 SEM (actually ±1.96 SEM);
  for a 90% CI you need ±1.645 SEM.
- Where do these come from?

The length of a CI decreases when:
- n increases
- s decreases
- the level of confidence decreases (e.g., 90% or 80% vs. 95%)

A confidence interval accounts only for random sampling error, not for other systematic sources of error or bias. Examples:
- BP measurement is always +5 too high.
- Only those with high BP agree to participate (non-response bias).

Technical interpretation: the CI "works" (includes μ) 95% of the time.

X̄ ± 2 SEM, i.e., X̄ ± 2s/√n

Underlying Assumptions for a 95% CI for the Population Mean
- Random sample of the population.
- Sample size n is at least 60 to use ±2 SEM (the Central Limit Theorem requires large n).

Confidence interval applet: www.stat.sc.edu/~west/javahtml/ConfidenceInterval.html

What if the sample size is smaller than 60?

There needs to be a small correction in the formula X̄ ± 2 SEM: the 2 needs to be slightly bigger. How much bigger depends on the sample size.

Computers or statistical tables refer to the degrees of freedom = n − 1. One looks up the correct number in a t-table, or t-distribution, with n − 1 degrees of freedom. You can think of degrees of freedom like a corrected sample size; in this case it's n − 1 because we had to estimate one parameter by X̄. But it's not always n − 1.

The interval becomes X̄ ± t SEM, i.e., X̄ ± t·s/√n.

Value of t.95 Used for a 95% Confidence Interval for the Mean

df    t         df     t
 1    12.706    12     2.179
 2     4.303    13     2.160
 3     3.182    14     2.145
 4     2.776    15     2.131
 5     2.571    20     2.086
 6     2.447    25     2.060
 7     2.365    30     2.042
 8     2.306    40     2.021
 9     2.262    60     2.000
10     2.228    120    1.980
11     2.201    ∞      1.960

Student's t-Distribution

Tabulated values correspond to a given two-tailed p-value for different degrees of freedom. Abridged table; the full table runs df = 1 to 60:

df     p=0.2   p=0.1   p=0.05   p=0.02   p=0.01   p=0.001
 1     3.078   6.314   12.706   31.821   63.657   636.619
 2     1.886   2.920    4.303    6.965    9.925    31.599
 3     1.638   2.353    3.182    4.541    5.841    12.924
 4     1.533   2.132    2.776    3.747    4.604     8.610
 5     1.476   2.015    2.571    3.365    4.032     6.869
10     1.372   1.812    2.228    2.764    3.169     4.587
15     1.341   1.753    2.131    2.602    2.947     4.073
20     1.325   1.725    2.086    2.528    2.845     3.850
30     1.310   1.697    2.042    2.457    2.750     3.646
40     1.303   1.684    2.021    2.423    2.704     3.551
60     1.296   1.671    2.000    2.390    2.660     3.460
∞      1.282   1.645    1.960    2.326    2.576     3.291

- Most people use t = 2 once n gets above 60 or so.
- Sometimes people use 1.96 when n gets bigger (> 120).
- The value of t depends on the level of confidence and the sample size.

t-distribution applets:
http://www.stat.sc.edu/~west/applets/tdemo.html
http://www.econtools.com/jevons/java/Graphics2D/tDist.html

Example: blood pressure, with n = 5, X̄ = 99 mmHg, s = 15.97, so SEM = 15.97/√5 = 7.142.

The 95% CI is X̄ ± 2.776 SEM (t.95 with 4 degrees of freedom):
99 ± 2.776 × 7.142
99 ± 19.83

The 95% CI for mean blood pressure is (79.17, 118.83), written (79.17, 118.83) or (79.17-118.83). Rounding off is okay too: (79, 119).
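Here is a minimal Python sketch of the small-sample CI (not from the lecture; it assumes scipy), looking up the t multiplier instead of using 2.

```python
import math
from scipy import stats

n, xbar, s = 5, 99.0, 15.97  # blood pressure example, small n
sem = s / math.sqrt(n)       # 7.142

t_crit = stats.t.ppf(0.975, df=n - 1)  # 2.776 for df = 4
lo, hi = xbar - t_crit * sem, xbar + t_crit * sem
print(f"t = {t_crit:.3f}; 95% CI = ({lo:.2f}, {hi:.2f})")
# t = 2.776; 95% CI = (79.17, 118.83)
```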

Confusion between SD and SEM

- Standard deviation (s) measures the spread in the data.
- Standard error (s/√n) measures the precision of the sample mean.
- The standard error of the sample mean depends on the sample size. Does the standard deviation depend on the sample size too?

PROPORTIONS (p)

- Proportion of individuals with health insurance
- Proportion of patients who became infected
- Proportion of patients who are cured
- Proportion of individuals who are hypertensive
- Proportion of individuals positive on a blood test
- Proportion of adverse drug reactions
- Proportion of premature infants who survive

On each individual in the study, we record a binary outcome (Yes/No; Success/Failure) rather than a continuous measurement.

Proportions

Example: n = 200 patients; X = 90 had an adverse drug reaction. The estimated proportion who experience an adverse drug reaction is
p̂ = 90/200 = .45, or 45%.

How accurate an estimate of the population proportion is the sample proportion? What is the standard error of a proportion?

NOTES
- There is uncertainty about this rate because it involved only n = 200 patients. If we had studied a much larger number of patients, would we have gotten a much different answer?
- The sample proportion is p̂ = .45, but it is not the true rate of adverse drug reactions in the population.

The Sampling Distribution of a Proportion

[Figure: histogram of sample proportions centered near .45, ranging from about .35 to .55; y-axis: number of samples.]

The standard error of a sample proportion is

$$SE(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

where p̂ is the sample proportion and n is the sample size.

95% CI for a Proportion

$$\hat{p} \pm 1.96\, SE(\hat{p}) = \hat{p} \pm 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

Example: n = 200 patients, X = 90 adverse drug reactions, p̂ = 90/200 = .45.

.45 ± 1.96 √(.45 × .55/200)
.45 ± 1.96 × 0.035
.45 ± 0.07

The 95% confidence interval is (.38, .52).
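This calculation takes only a few lines; here is a minimal sketch (not from the slides; standard library only).

```python
import math

n, x = 200, 90  # patients, adverse drug reactions
phat = x / n    # 0.45

se = math.sqrt(phat * (1 - phat) / n)  # standard error of the proportion
lo, hi = phat - 1.96 * se, phat + 1.96 * se
print(f"SE = {se:.3f}; 95% CI = ({lo:.2f}, {hi:.2f})")
# SE = 0.035; 95% CI = (0.38, 0.52)
```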

Interpreting a 95% CI for a Proportion

- Plausible range of values for the population proportion.
- We are highly confident that the population proportion is in the interval.
- The method works 95% of the time.

Notes on 95% CI for Proportions

- Requires a random (or representative) sample. Suppose the 200 patients were sicker than usual? Suppose the 200 patients were consecutive?
- The confidence interval does not address your definition of a drug reaction or whether that's a good or bad definition. It accounts only for sampling variation.
- You can also have CIs with different levels of confidence.

Sometimes 1.96 SE(p̂) is called the 95% error bound, or the margin of error.

The formula for a 95% CI is ONLY APPROXIMATE. It works well if the numbers of failures (drug reactions) and successes (non-reactions) are both at least 5. Otherwise, you need to use a computer to perform something called exact binomial calculations. You do NOT use the t-correction for small sample sizes like we did for sample means; we use exact binomial calculations.

Example

Study of survival of premature infants: all premature babies born at Johns Hopkins during a 3-year period (Allen, et al., NEJM, 1993).

n = 39 infants born at 25 weeks gestation; 31 survived 6 months.

p̂ = 31/39 = 0.79

95% CI: .63 to .91 (based on exact binomial calculations)

Source: Motulsky, Intuitive Biostatistics

Are confidence intervals needed even though all infants were studied? Are the 39 infants a sample? It seems like it's the whole population.

It makes sense to calculate a CI when the sample is representative of a larger population about which you wish to make inferences. It is reasonable to think that these data from several years at one hospital are representative of data from other years at other hospitals, at least at big-city university hospitals in the United States.

Comparison of 2 Groups

Are the population means different (continuous data)? Two situations:

1. Paired design
   Before-after data
   Twin data

2. Two independent sample design
   [Diagram: subjects are randomized to Treatment A or Treatment B.]

Paired Design: Before → After

Why pairing?
- Controls extraneous noise.
- Everyone acts as their own control.

Example: Blood Pressure and Oral Contraceptive Use

Subjects: Ten non-pregnant, pre-menopausal women 16-49 years old who were beginning a regimen of oral contraceptive (OC) use.
Methods: Measure blood pressure prior to starting OC use, and three months after consistent OC use.
Goal: Identify any changes in average blood pressure associated with OC use in such women.

Rosner, Fundamentals of Biostatistics (2005).

Example: Blood Pressure and Oral Contraceptive Use

Woman    BP Before OC    BP After OC    Difference (After − Before)
 1       115             128            13
 2       112             115             3
 3       107             106            −1
 4       119             128             9
 5       115             122             7
 6       138             145             7
 7       126             132             6
 8       105             109             4
 9       104             102            −2
10       115             117             2
mean     115.6           120.4           4.8

Notes
- The sample average of the differences is 4.8.
- The sample standard deviation (s) of the differences is s = 4.57.

Calculate a 95% CI for the Expected Change in Blood Pressure

95% CI for the population mean BP change:
X̄ ± t.95,df=9 × SEM
4.8 ± 2.262 × 4.57/√10
4.8 ± 2.262 × 1.445
= 1.53 mm Hg to 8.07 mm Hg

Where does 2.262 come from? See the t-distribution with 9 degrees of freedom.

The BP change could be due to factors other than oral contraceptives. A control group of comparable women who were not taking oral contraceptives would strengthen this study.
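Here is a minimal Python sketch of this paired analysis (not part of the lecture; it assumes numpy and scipy), reproducing the CI above and previewing the paired t-test developed in the next sections.

```python
import numpy as np
from scipy import stats

before = np.array([115, 112, 107, 119, 115, 138, 126, 105, 104, 115])
after  = np.array([128, 115, 106, 128, 122, 145, 132, 109, 102, 117])

d = after - before
print(d.mean(), d.std(ddof=1))  # 4.8, 4.57

# 95% CI for the mean change, using t with n - 1 = 9 df
sem = d.std(ddof=1) / np.sqrt(len(d))
t9 = stats.t.ppf(0.975, df=9)   # 2.262
print(d.mean() - t9 * sem, d.mean() + t9 * sem)  # ~1.53 to 8.07

# The paired t-test gives the same t and a two-sided p-value
t_stat, p = stats.ttest_rel(after, before)
print(t_stat, p)                # t = 3.32..., p = 0.0089...
```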

Hypothesis Testing, Significance Testing, and p-values

We want to draw a conclusion about a population parameter: in a population of women who use oral contraceptives, is the average (expected) change in blood pressure (After − Before) 0 or not? Sometimes statisticians use the term "expected" for the population average. Let Δ be the expected (population) mean change in blood pressure.

We choose between two competing possibilities using a single imperfect (paired) sample:
- Null hypothesis H0: Δ = 0
- Alternative hypothesis H1: Δ ≠ 0
We reject H0 if the sample mean is far away from 0.

The number 0 is NOT in the confidence interval (1.53, 8.07). Because 0 is not in the interval, this suggests there is a significant change in BP over time: a significant increase in blood pressure.

The Hypotheses

We set up mutually exclusive, exhaustive possibilities for the truth:

- The null hypothesis H0 typically represents the hypothesis that there is no effect or difference; it represents current beliefs or the state of knowledge. For example, there is no effect of oral contraceptives on blood pressure: H0: Δ = 0.
- The alternative hypothesis H1 typically represents what you are trying to prove. For example, oral contraceptives affect blood pressure: H1: Δ ≠ 0.

Do we have sufficient evidence to reject H0 and claim H1 is true?
- If X̄ is close to zero, it is consistent with H0.
- If X̄ is far from zero, it is consistent with H1.
- How do we decide whether X̄ = 4.8 is more consistent with H0 or H1?

The p-value

What is the probability of observing an extreme sample mean, like 4.8 mm Hg, if the null hypothesis (H0: Δ = 0) were true? The answer is called the p-value.

- If that probability (p-value) is small, it suggests the observed result was unlikely if H0 is true. This provides evidence against H0.
- If that probability (p-value) is large, it suggests the observed result is quite probable if H0 is true. This provides evidence for H0.

http://xkcd.com/892/

How are p-values calculated?

1. First, measure the distance between the sample mean and what you would expect the sample mean to be if H0: Δ = 0 were true:

$$t = \frac{\text{sample mean} - 0}{SEM} = \frac{\bar{X} - 0}{SEM}$$

$$t = \frac{4.8}{4.57/\sqrt{10}} = \frac{4.8}{1.45} = 3.31$$

The value t = 3.31 is called the test statistic. We observed a sample mean that was 3.31 standard errors (SEMs) away from what we would have expected the mean to be if OC had no effect (i.e., Δ = 0).

The t-statistic is analogous to the Z-score introduced earlier:

$$Z = \frac{\text{observation} - \text{mean}}{SD}$$

For Z, the pieces are an observation, the mean, and the standard deviation; for t, they are the sample mean, 0 (because we are calculating p-values under the scenario that H0: Δ = 0), and the standard error of the mean.

How are p-values calculated?

2. Next, calculate the probability of getting a test statistic as or more extreme than what you observed (t = 3.31, in either direction) if H0 were true.

For large samples, this p-value comes from the normal distribution: how unusual is it to get a standard normal score as extreme as ±3.31? Not likely at all (p < .01). Use the normal table.

If the sample size is small (n < 60), a small t-correction must be made: instead of a normal distribution, a t-distribution is used with n − 1 degrees of freedom, and the p-value gets a little larger.

This procedure is called a paired t-test with n − 1 degrees of freedom. In the oral contraceptive example, we performed a paired t-test with 9 degrees of freedom.
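As a sketch of this step (not from the slides; scipy assumed), here is the two-sided p-value for t = 3.31 from the t-distribution with 9 df, alongside the large-sample normal version for comparison.

```python
from scipy import stats

t_obs, df = 3.31, 9
p_two_sided = 2 * (1 - stats.t.cdf(t_obs, df))
print(p_two_sided)  # ~0.009 -- the paired t-test p-value

# With a large n the normal approximation would be used instead;
# note the t-based p-value is larger, as the slide says:
print(2 * (1 - stats.norm.cdf(t_obs)))  # ~0.0009
```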

Interpreting the p-value

The p-value in the blood pressure/OC example is .0089.
Interpretation: If the true before-OC/after-OC blood pressure difference is 0 among all women taking OCs, then the chance of seeing a mean difference as extreme as, or more extreme than, 4.8 in a sample of 10 women is .0089.

- p-values are probabilities (numbers between 0 and 1).
- Small p-values are measures of evidence against H0 in favor of H1.
- The p-value is the probability of obtaining a result as or more extreme than you did by chance alone, assuming the null hypothesis H0 is true.

Using the p-value to make a decision

If the p-value is small, either:
(a) a very rare event occurred and H0 is true, OR
(b) H0 is false.

Using the p-value to make a decision

The p-value in the blood pressure/OC example is .0089.
- This p-value is small.
- So there is a small probability of observing our data (or something more extreme) if H0 is true.
- We reject H0.

The p-value is a continuum of evidence. Guidelines?
- p = .10: suggestive
- p = .05: "magical" cutoff
- p = .01: strong evidence

How precise should p-values be?
- Two decimal places suffice (p = .07).
- Sometimes three decimal places if p < .01 (e.g., p = .007).
- If the p-value is really small, p < .001 is fine.
- If the p-value is really big, p > .20 is fine.

Blood Pressure-OC Example: Summary

Methods: The changes in blood pressure after oral contraceptive use were calculated for 10 women. A paired t-test was used to determine if there was a significant change in blood pressure, and a 95% confidence interval was calculated for the mean blood pressure change (after − before).

Result: Blood pressure measurements increased on average 4.8 mm Hg, with standard deviation 4.57. The 95% confidence interval for the mean change was 1.5 to 8.1. There was evidence that blood pressure measurements after oral contraceptive use were significantly higher than before oral contraceptive use (p = .0089).

Discussion: A limitation of this study is that there was no comparison group of women who did not use oral contraceptives. We do not know if blood pressures might have risen even without oral contraceptive use.

Summary: Paired t-test
1. Designate null and alternative hypotheses.
2. Collect data.
3. Compute the change for each paired set of observations.
4. Compute X̄d, the sample mean of the paired differences, and sd, the sample standard deviation of the differences.
5. Calculate the test statistic:

$$t = \frac{\bar{X}_d - 0}{SEM} = \frac{\bar{X}_d - 0}{s_d/\sqrt{n}}$$

6. Compare t to a t-distribution (with n − 1 degrees of freedom) to get a p-value.
7. If p is small, reject H0. If p is large, fail to reject H0.

Two Types of Errors

- Type I error: claim H1 is true when in fact H0 is true.
- Type II error: do not claim H1 is true when in fact H1 is true.

- The probability of making a Type I error is called the α-level.
- The probability of making a Type II error is called the β-level.
- The probability of NOT making a Type II error is called the power.

The p-value and the α-level

Some people will only call a p-value significant if it is less than some preset cutoff (e.g., .05). This cutoff is called the α-level. The α-level is the probability of a Type I error, i.e., the probability of falsely rejecting H0.

Statistically significant: the p-value is less than a preset threshold value, α.

Do not say merely "The result is statistically significant." Say "The result is statistically significant at α = .05," or "The result is significant (p < .05)."

One-Sided versus Two-Sided p-values

- Two-sided p-value (p = .009): the probability of a result as or more extreme than observed, in either direction (either X̄ ≤ −4.8 or X̄ ≥ 4.8).
- One-sided p-value (p = .0045): the probability of a more extreme positive result than observed (X̄ ≥ 4.8).

You never know what direction study results will go... In this course, we will use two-sided p-values exclusively. This is what is typically done in the scientific/medical literature.

Connection between CIs and HTs

- The CI gives plausible values for the population parameter: "data, take me to the truth."
- Hypothesis testing postulates two choices for the population parameter: "here are two possibilities for the truth; data, help me choose one."

Connection between CIs and HTs

If 0 is not in the 95% CI, then we reject H0 that Δ = 0 at level α = .05 (the p-value < .05). In the BP-OC example, the 95% CI is (1.53, 8.07), which excludes 0.

Why?
- The CI starts at X̄d and captures 2 standard errors in either direction.
- If 0 is not in the 95% CI, then X̄d is more than 2 standard errors from 0 (either above or below).
- So the distance (t) will be > 2 or < −2, and the resulting p-value < .05.

The confidence interval and the p-value are complementary. In this BP-OC example, the 95% confidence interval tells us that the p-value is less than .05, but it doesn't tell us that it is p = .009. You can't get an exact p-value from just looking at a confidence interval. I like to report both.

More on the p-value

STATISTICAL SIGNIFICANCE IS NOT THE SAME AS SCIENTIFIC SIGNIFICANCE.

Example: blood pressure and oral contraceptives, with
n = 100,000; X̄ = .03 mmHg; s = 4.57; p-value = .04

A big n can sometimes produce a small p-value even though the magnitude of the effect is very small (not scientifically significant). Supplement with a CI: the 95% CI is .002 to .058 mmHg.

STATISTICAL SIGNIFICANCE DOES NOT IMPLY CAUSATION.

Blood pressure example: there could be other factors that could explain the change in blood pressure. A significant p-value is only ruling out random sampling as the explanation. We need a comparison group:
- Self-selected (may be okay)
- Randomized (better)

The Language of Hypothesis (Significance) Testing

NOT REJECTING H0 IS NOT THE SAME AS ACCEPTING H0.

Example: blood pressure and oral contraceptives, with
n = 5; X̄ = 5.0 mmHg; s = 4.57; p-value = .07

We cannot reject H0 at significance level α = .05.
- Are we convinced there is no effect of OC on BP? Maybe we should have taken a bigger sample.
- An interesting trend, but not proven beyond a reasonable doubt.
- Look at the confidence interval: the 95% CI is (−.67, 10.7).

More on the p-value

Suppose the p-value is p = .40. How might this result be described?
- "Not statistically significant"
- "Do not reject H0"
Can we also say "Accept H0" or "Claim H0 is true"? No. Statisticians much prefer the double negative: do not reject H0. Innocent until proven guilty.

Comparing Two Independent Groups

Controlled Trial in Peru of Bismuth Subsalicylate (Pepto Bismol)
in Infants with Diarrheal Disease

Infants randomized:
Treatment n = 85
Controls  n = 84

           n    Mean stool output (ml/kg)   SD
Control   84    260                         254
Tx        85    182                         197

Scientific Question: Is there a treatment effect?

Note

The data are not paired: there are different infants in each
group, i.e., 2 independent groups.
How do we calculate
a confidence interval for the difference?
a p-value to determine if the difference in the two groups is
significant? (the 2-sample (unpaired) t-test)

95% CI for the Difference in Means of Two Independent (Unpaired) Groups

Generic CI formula: estimate ± 1.96 × SE
(X̄1 - X̄2) ± 1.96 × SE(X̄1 - X̄2)
SE(X̄1 - X̄2) = standard error of the difference of 2 sample means

The standard error of the difference for two independent
samples is calculated differently than we did for paired designs.
Statisticians have developed formulae for the standard error of
the difference. These formulae depend on the sample sizes in both
groups and the standard deviations in both groups.

The SE of the Difference in Sample Means

Principle: Variation from independent sources can be added:
Variance(X̄1 - X̄2) = (SE(X̄1))² + (SE(X̄2))²
SE(X̄1 - X̄2) = √( (SE(X̄1))² + (SE(X̄2))² )

The formula depends on n1, n2, s1, s2.
There are other slightly different equations for SE(X̄1 - X̄2),
but they all give similar answers.

The SE of the Difference in Sample Means
Example: Pepto Bismol RCT, 95% CI for the Difference in Means

           n    Mean stool output (ml/kg)   SD
Control   84    260                         254
Tx        85    182                         197

SE(X̄1 - X̄2) = √( (254/√84)² + (197/√85)² )
             = √( 27.71² + 21.37² )
             = 34.94

78 ± 1.96 × SE(X̄1 - X̄2)
78 ± 1.96 × 34.94
78 ± 68.48
→ 9 to 147

Note: The confidence interval does not include 0. Thus, p < .05.
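As a quick numerical check, here is a minimal Python sketch of the
slide's calculation (the variable names, such as se_diff, are chosen
here for illustration):

    # Sketch: 95% CI for the difference in mean stool output,
    # from the summary statistics in the table above.
    import math

    n1, mean1, sd1 = 84, 260, 254   # Control
    n2, mean2, sd2 = 85, 182, 197   # Treatment

    se1 = sd1 / math.sqrt(n1)             # SE of each sample mean
    se2 = sd2 / math.sqrt(n2)
    se_diff = math.sqrt(se1**2 + se2**2)  # variances from independent sources add

    diff = mean1 - mean2                  # 78 ml/kg
    lo, hi = diff - 1.96 * se_diff, diff + 1.96 * se_diff
    print(se_diff, (lo, hi))              # SE ~ 35; CI roughly (9, 147)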

Hypothesis Test to Compare Two Independent Groups

Two-Sample (Unpaired) t-test
Are the expected stool outputs equal in the two groups?

H0: μ1 = μ2
H1: μ1 ≠ μ2

t = (difference in means - 0) / (SE of the difference)
t = (260 - 182) / 34.94 = 78 / 34.94 = 2.23

Notes on the 2-sample t-test

This is a 2-sample (unpaired) t-test.
The value t = 2.23 is the test statistic.
We calculate a p-value, which is the probability of obtaining a
test statistic as extreme as we did (beyond ±2.23) if H0 were true.
How is the probability computed? If sample sizes are large
(both greater than 60), a normal distribution is used.
If sample sizes are small, a t correction is required (a t
distribution is used with n1 + n2 - 2 degrees of freedom; that
is, the degrees of freedom is the total sample size from both
groups minus 2).
An assumption that is also required is that both populations
are approximately normally distributed. (Results can be highly
influenced by wild observations or outliers.)
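If the raw data are unavailable, the same test can be reproduced in
Python from the summary statistics alone; a hedged sketch using scipy's
ttest_ind_from_stats (which assumes equal variances by default):

    # Sketch: unpaired t-test from summary statistics only.
    from scipy.stats import ttest_ind_from_stats

    t, p = ttest_ind_from_stats(mean1=260, std1=254, nobs1=84,
                                mean2=182, std2=197, nobs2=85)
    print(t, p)  # t ~ 2.23, p ~ .03, matching the slides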

Diarrhea-Pepto Bismol-Summary

Question: Is there a difference in mean stool output between the two
treatment groups?
Methods: The stool output was calculated for 84 infants randomized to
placebo and 85 infants randomized to Pepto Bismol. A 95%
confidence interval was calculated for the difference in mean
stool output between the two groups, and a two-sample t-test
was used to determine if there was a significant difference
between the two groups.
Result: The mean stool outputs in the treated and control groups were
182 and 260 ml/kg respectively. The control group stool output was
significantly higher than the treated group (p = .03). The
control group mean was 78 ml/kg higher than the treated group
(95% confidence interval 9 to 147 ml/kg).

Nonparametric Alternative to the 2-sample t-test:
Mann-Whitney-Wilcoxon Rank Sum Test

Objective: Assess whether the two populations are different.
Advantages:
Does not assume the populations are normally distributed (the
two-sample t-test requires that assumption with small sample sizes).
Uses only ranks; does not need precise numerical outcomes.
Not sensitive to outliers.
Disadvantages of the Nonparametric Test:
Nonparametric methods are often less sensitive (powerful) for
finding true differences because they throw away information
(they use only ranks).
Need the full data set, not just summary statistics.

Example: Health Intervention Study

Evaluate an intervention to educate high school students about
health and lifestyle.
Y = Post - Pretest score
Randomize. Only 5 individuals in each sample:
Intervention (I):  5   0   7   2  19
Control (C):       6  -5  -6   1   4

p-value calculations:
With such a small sample, we need to be sure score
improvements are normally distributed if we want to use a t-test.
BIG assumption.
Alternative: Mann-Whitney-Wilcoxon non-parametric test!

Rank the pooled data:

Order:  -6  -5   0   1   2   4   5   6   7  19
Rank:    1   2   3   4   5   6   7   8   9  10
Group:   C   C   I   C   I   C   I   C   I   I

Find the average rank in the 2 groups:
Intervention group average rank: (3+5+7+9+10)/5 = 6.8
Control group average rank:      (1+2+4+6+8)/5 = 4.2

Statisticians have developed formulae and tables to determine the
probability of observing such an extreme discrepancy (6.8 vs 4.2)
by chance alone. That's the p-value.
In the example, p = .22.
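For reference, a small Python sketch of the same rank-sum test (scipy's
mannwhitneyu; the exact method requires scipy 1.7 or later):

    # Sketch: Mann-Whitney-Wilcoxon test on the score changes above.
    from scipy.stats import mannwhitneyu

    intervention = [5, 0, 7, 2, 19]
    control      = [6, -5, -6, 1, 4]

    u, p = mannwhitneyu(intervention, control,
                        alternative="two-sided", method="exact")
    print(u, p)  # exact two-sided p ~ .22, as on the slide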

Health Intervention Study-Summary

Question: Is there a difference in test score change between the
intervention and control groups?
Design: 10 high school students were randomized to either receive a
two-month health and lifestyle education program or no program.
Each student was administered a test regarding health and lifestyle
issues prior to randomization and after the two-month period.
Statistics: Differences in the two test scores (after - before) were
computed for each student. Mean and median test score changes were
computed for each study group. A Mann-Whitney rank sum test was used
to determine if there was a statistically significant difference in
test score change between the intervention and control groups at the
end of the two-month study period.
Result: The median score change was four points higher in the
intervention group than in the control group. The difference in
test score improvements between the intervention and control
groups was not statistically significant (p = .17).

Note

In the health intervention study, the p-value was .22:
no significant difference in test scores between the intervention
and control group (p = .22).
The two-sample t-test would give a different answer (p = .11).
Different statistical procedures can give different p-values.
If the largest observation, 19, were changed to 100, the p-value
based on the Mann-Whitney test would not change, but the
two-sample t-test p-value would.

The t-test or the nonparametric test?

Statisticians will not always agree, but there are some guidelines:
Use the nonparametric test if the sample size is small and the
distribution looks skewed. (You might also do a t-test, too, and
compare.)
Use it when only ranks are available.
Otherwise, use the t-test.

Example: Exposure of Young Infants to Environmental Tobacco Smoke

Objective: This study examined the degree to which breast-feeding and
cigarette smoking by mothers and smoking by other household
members contributed to the exposure of infants to the
products of tobacco smoke (urinary cotinine level).
Method: We report median values and interquartile ranges for each
group. Comparisons between groups are made with the
Wilcoxon rank sum test because the distributions of urine
cotinine values are positively skewed.

Source: Mascola et al., AJPH, 1998, 88:893-895.

Extension of the 2-sample t-test:
Analysis of Variance (One-Way ANOVA)

The t-test compares two populations.
Analysis of variance is a generalization of the two-sample
t-test to compare three or more populations.
Are there any differences among the populations?
The test statistic from the ANOVA calculations is called the
F-statistic (the F-test). A p-value is then calculated.

An alternative strategy is to perform lots of two-sample
t-tests (pairwise). That could be a lot of statistical testing!
Instead, perform an ANOVA:
No significant differences → Stop. No further analysis necessary.
Significant differences → Do two-sample t-tests to find them.

Example: Pulmonary Disease

Goal: Does passive smoking have a measurable effect on
pulmonary health?
Methods: Measure mid-expiratory flow (FEF) in liters per
second (amount of air expelled per second) in six smoking groups:
Nonsmokers (NS)
Passive Smokers (PS)
Noninhaling Smokers (NI)
Light Smokers (LS)
Moderate Smokers (MS)
Heavy Smokers (HS)

White and Froeb, "Small-Airways Dysfunction in Non-Smokers Chronically
Exposed to Tobacco Smoke," NEJM 302:13 (1980).

One strategy is to perform lots of two-sample t-tests...

Group number   Group name   Mean FEF (L/s)   SD FEF (L/s)    n
1              NS           3.78             0.79           200
2              PS           3.30             0.77           200
3              NI           3.32             0.86            50
4              LS           3.23             0.78           200
5              MS           2.73             0.81           200
6              HS           2.59             0.82           200

In this example, there would be 15 pairwise comparisons...
It would be nice to have one catch-all test which would tell
you whether there were any differences amongst the six groups.

Mean FEF ± 2 SE

[Figure: mean FEF ± 2 SE plotted by group (NS, PS, NI, LS, MS, HS);
FEF axis from 2.5 to 3.7 L/s]

Based on a one-way analysis of variance, there are significant
differences in pulmonary function among these groups (p < .001).
Pairwise two-sample t-tests show very significant differences
between nonsmokers and all other groups.
There were no significant differences between passive smokers,
noninhalers and light smokers; and between moderate and
heavy smokers.

Smoking and FEF-Summary

Subjects: A sample of over 3,000 persons was classified into one of six
smoking categorizations based on responses to smoking-related questions.
Methods: 200 men were randomly selected from each of five smoking
classification groups (non-smokers, passive smokers, light smokers,
moderate smokers, and heavy smokers), as well as 50 men classified as
non-inhaling smokers, for a study designed to analyze the relationship
between smoking and respiratory function.

Smoking and FEF-Summary

Statistics:
ANOVA was used to test for any differences in FEF levels
amongst the six groups of men.
Individual group comparisons were performed with a series of
two-sample t-tests, and 95% confidence intervals were constructed
for the mean difference in FEF between each combination of groups.
Results:
Analysis of variance showed statistically significant (p < .001)
differences in FEF between the six groups of smokers.
Non-smokers had the highest mean FEF value, 3.78 L/s, and this was
statistically significantly larger than the five other
smoking-classification groups.
The mean FEF value for non-smokers was 1.19 L/s higher than the
mean FEF for heavy smokers (95% CI 1.03-1.35 L/s), the largest
mean difference between any two smoking groups.

What's the rationale behind analysis of variance?

H0: μ1 = μ2 = ... = μk
H1: at least one mean is different

The variation in the sample means between groups is compared to
the variation within a group.
If the between-group variation is a lot bigger than the within-group
variation, that suggests there are some differences among the
populations.

[Figure: mean FEF ± 2 SE by group (NS, PS, NI, LS, MS, HS), as above]

http://www.ruf.rice.edu/~lane/stat_sim/one_way/index.html

Overuse of Hypothesis Tests-Bad Statistics!!

Age       n    Sample Mean
< 20     97    17.8
≥ 20     88    24.6
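The between-versus-within comparison can be carried out directly from
the FEF summary table; a sketch (the formulas are the standard one-way
ANOVA sums of squares, computed here from the published means, SDs, and
group sizes):

    # Sketch: one-way ANOVA F statistic from summary statistics.
    import numpy as np
    from scipy.stats import f as f_dist

    n    = np.array([200, 200, 50, 200, 200, 200])
    mean = np.array([3.78, 3.30, 3.32, 3.23, 2.73, 2.59])
    sd   = np.array([0.79, 0.77, 0.86, 0.78, 0.81, 0.82])

    grand = np.sum(n * mean) / np.sum(n)          # overall mean FEF
    ss_between = np.sum(n * (mean - grand) ** 2)  # variation of group means
    ss_within  = np.sum((n - 1) * sd ** 2)        # variation inside the groups
    df_b, df_w = len(n) - 1, np.sum(n) - len(n)

    F = (ss_between / df_b) / (ss_within / df_w)
    p = f_dist.sf(F, df_b, df_w)
    print(F, p)  # F ~ 58 on (5, 1044) df; p < .001, as reported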

Comparing Two Proportions

Study: Clinical trial of AZT to prevent maternal-infant
transmission of HIV.

Randomize:
AZT      n = 121  →   9 infected infants
Placebo  n = 127  →  31 infected infants

Notes on Design

Random assignment of Tx:
Helps ensure the 2 groups are comparable.
Patient & physician could not request a particular Tx.
Double blind:
Patient & physician did not know the Tx assignment.
Definition of infection:
Two positive cultures (infant > 32 weeks).

Conner et al., New England J. of Medicine 331:1173-1190 (1994)

HIV Transmission Rates

AZT       9/121  =  .074  (7.4%)
Placebo  31/127  =  .244  (24.4%)

Note:
These are NOT the true population parameters for the
transmission rates.
There is sampling variability.
Is the difference significant, or can it be explained by chance?

HIV Transmission Rates

95% confidence intervals:
AZT      95% CI  .03 to .14
Placebo  95% CI  .17 to .32

As the CIs do not overlap, this suggests significance. But what's
the p-value?
Note: if the CIs did overlap, it would still be possible to get
p < .05.
We want a direct method for testing 2 independent proportions.

Display the Data in a 2 × 2 Table
(2 rows and 2 columns)

HIV transmission     AZT   Placebo
(infected)
Yes                    9        31     40
No                   112        96    208
                     121       127    248

Hypothesis Testing

H0: p1 = p2
H1: p1 ≠ p2

p1 = Proportion infected on AZT
p2 = Proportion infected on placebo

1. Fisher's Exact Test
2. (Pearson's) Chi-Square Test (χ²)

Fisher's Exact Test

As with all hypothesis tests, start by assuming H0 is true:
AZT is not effective.
Imagine putting 40 red balls (the infected) and 208 blue balls
(the non-infected) in a jar. Shake it up.
Now choose 121 balls: that's the AZT group.
The remaining balls are the placebo group.
We can calculate the probability you get 9 or fewer red balls
among the 121. That is the one-sided p-value.
The two-sided p-value is just about (but not exactly) twice
the one-sided p-value. It accounts for the probability of
getting either extremely few red balls or a lot of red balls in
the AZT group.
The p-value is the probability of obtaining a result as or more
extreme (more imbalance) than you did by chance alone.

Notes on Fisher's Exact Test

Calculations are difficult; computers are usually used.
Always appropriate to test equality of two proportions:
Exact p-value (no approximations)
No minimum sample size requirements
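A sketch of the test in Python (scipy's fisher_exact, applied to the
observed 2 × 2 table):

    # Sketch: Fisher's exact test for the AZT trial.
    from scipy.stats import fisher_exact

    table = [[9, 31],     # infected:     AZT, placebo
             [112, 96]]   # not infected: AZT, placebo
    odds_ratio, p = fisher_exact(table, alternative="two-sided")
    print(p)  # p < .001, as in the summary slide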

HIV-AZT-Summary

Study: We conducted a randomized, double-blind, placebo-controlled
trial of the efficacy and safety of zidovudine (AZT) in reducing
the risk of maternal-infant HIV transmission.
Methods: HIV transmission rates for both the placebo and AZT groups
were calculated as the ratio of HIV-infected infants (based on
cultures at 32 weeks) divided by the total number of infants, and
95% confidence intervals were calculated. The transmission rates
for the two groups were compared by Fisher's Exact Test.
Results:
The maternal-infant HIV transmission rate for the AZT group was
7.4% (95% CI 3.5% to 13.7%).
The maternal-infant HIV transmission rate for the placebo group
was 24.4% (95% CI 17.2% to 32.8%).
AZT significantly reduced the rate of HIV transmission compared
to placebo (p < .001).

The Chi-Square Approximate Method

Works for big sample sizes:
If all 4 numbers in the 2 × 2 table are 5 or more, it is okay.
The only advantage of this method over Fisher's Exact Test is
that you don't need a computer to do it.

The Chi-Square Approximate Method

Looks at discrepancies between the observed and expected counts.
O = observed
"Expected" refers to the values for the cell counts that would be
expected if the null hypothesis were true:
expected = (row total × column total) / grand total

The Chi-Square Approximate Method

Calculate expected counts assuming H0 is true.
Calculate a test statistic to measure the difference between
what we observe and what we expect:
Test statistic: χ² = Σ over the 4 cells of (O - E)² / E
Use a chi-square table with 1 degree of freedom to get a p-value.
How likely is it to get such a big discrepancy between the
observed and expected?

χ² Distribution with 1 Degree of Freedom

[Figure: χ² density with 1 degree of freedom, with 3.84 marked
(the cutoff for p = .05)]

Performing the χ² Test for a 2 × 2 Table

Observed:
HIV        AZT    Placebo
Yes          9         31     40
No         112         96    208
           121        127    248

Observed = 9
Expected = 121 × 40 / 248 = 19.52

Expected:
HIV        AZT    Placebo
Yes      19.52      20.48     40
No      101.48     106.52    208
           121        127    248

χ² = (9 - 19.52)²/19.52 + (112 - 101.48)²/101.48
   + (31 - 20.48)²/20.48 + (96 - 106.52)²/106.52
   = 13.19

The p-value is about p = .0003.

It is NOT a coincidence that the square of Z on page 141 is almost
the χ²: one is nearly the square of the other, 3.63² ≈ 13.19.
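The same arithmetic in Python (note that scipy applies the Yates
continuity correction to 2 × 2 tables by default, so it is turned off
here to match the hand-computed 13.19):

    # Sketch: Pearson chi-square test for the AZT trial.
    from scipy.stats import chi2_contingency

    table = [[9, 31],
             [112, 96]]
    chi2, p, df, expected = chi2_contingency(table, correction=False)
    print(chi2, p)   # ~13.2, p ~ .0003
    print(expected)  # 19.52, 20.48, 101.48, 106.52 (the expected counts)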

χ² Distribution with 1 Degree of Freedom

This table assumes that you have one degree of freedom (the case when
analyzing a 2 × 2 table):

 χ²     P        χ²     P        χ²     P        χ²     P        χ²      P       χ²      P
0.0  1.0000     2.5  0.1138     5.0  0.0253     7.5  0.0062    10.0  0.0016    12.5  0.0004
0.1  0.7518     2.6  0.1069     5.1  0.0239     7.6  0.0058    10.1  0.0015    12.6  0.0004
0.2  0.6547     2.7  0.1003     5.2  0.0226     7.7  0.0055    10.2  0.0014    12.7  0.0004
0.3  0.5839     2.8  0.0943     5.3  0.0213     7.8  0.0052    10.3  0.0013    12.8  0.0003
0.4  0.5271     2.9  0.0886     5.4  0.0201     7.9  0.0049    10.4  0.0013    12.9  0.0003
0.5  0.4795     3.0  0.0833     5.5  0.0190     8.0  0.0047    10.5  0.0012    13.0  0.0003
0.6  0.4386     3.1  0.0783     5.6  0.0180     8.1  0.0044    10.6  0.0011    13.1  0.0003
0.7  0.4028     3.2  0.0736     5.7  0.0170     8.2  0.0042    10.7  0.0011    13.2  0.0003
0.8  0.3711     3.3  0.0693     5.8  0.0160     8.3  0.0040    10.8  0.0010    13.3  0.0003
0.9  0.3428     3.4  0.0652     5.9  0.0151     8.4  0.0038    10.9  0.0010    13.4  0.0003
1.0  0.3173     3.5  0.0614     6.0  0.0143     8.5  0.0036    11.0  0.0009    13.5  0.0002
1.1  0.2943     3.6  0.0578     6.1  0.0135     8.6  0.0034    11.1  0.0009    13.6  0.0002
1.2  0.2733     3.7  0.0544     6.2  0.0128     8.7  0.0032    11.2  0.0008    13.7  0.0002
1.3  0.2542     3.8  0.0513     6.3  0.0121     8.8  0.0030    11.3  0.0008    13.8  0.0002
1.4  0.2367     3.9  0.0483     6.4  0.0114     8.9  0.0029    11.4  0.0007    13.9  0.0002
1.5  0.2207     4.0  0.0455     6.5  0.0108     9.0  0.0027    11.5  0.0007    14.0  0.0002
1.6  0.2059     4.1  0.0429     6.6  0.0102     9.1  0.0026    11.6  0.0007    14.1  0.0002
1.7  0.1923     4.2  0.0404     6.7  0.0096     9.2  0.0024    11.7  0.0006    14.2  0.0002
1.8  0.1797     4.3  0.0381     6.8  0.0091     9.3  0.0023    11.8  0.0006    14.3  0.0002
1.9  0.1681     4.4  0.0359     6.9  0.0086     9.4  0.0022    11.9  0.0006    14.4  0.0001
2.0  0.1573     4.5  0.0339     7.0  0.0082     9.5  0.0021    12.0  0.0005    14.5  0.0001
2.1  0.1473     4.6  0.0320     7.1  0.0077     9.6  0.0019    12.1  0.0005    14.6  0.0001
2.2  0.1380     4.7  0.0302     7.2  0.0073     9.7  0.0018    12.2  0.0005    14.7  0.0001
2.3  0.1294     4.8  0.0285     7.3  0.0069     9.8  0.0017    12.3  0.0005    14.8  0.0001
2.4  0.1213     4.9  0.0269     7.4  0.0065     9.9  0.0017    12.4  0.0004    14.9  0.0001

Chi-Square for Associations in r × c Tables

Summary of Methods for Comparing Proportions

Fisher's Exact Test:
Always works, with large or small sample sizes
Highly computational; need a computer
χ²-Test:
Works with larger sample sizes
Calculations easy to do
One of the most popular statistical methods in scientific literature
Extends to larger tables

Note on p-Values and Sample Size

Will the p-value change if we have smaller sample sizes but the
proportions remain about the same?
Suppose our sample size were about 1/4 the original:

HIV transmission    AZT   Placebo
Yes                   2         8     10
No                   28        24     52
                     30        32     62

AZT      2/30 = 6.7%
Placebo  8/32 = 25%
p = .083

Note

The p-value depends not only on the observed difference
between the proportions, but also on the sample sizes.
If the sample sizes that the two proportions were based on were
bigger, the p-value would get smaller.

Relative Risk

Ratio of proportions: Relative risk = p1 / p2

AZT Example:
The risk of HIV with AZT relative to placebo:
Relative risk = p̂1 / p̂2 = .074 / .244 = .30
The risk of HIV with placebo relative to AZT:
Relative risk = p̂2 / p̂1 = 3.29
The risk of HIV transmission with placebo is more than 3
times higher compared to AZT.

The Relative Risk versus the p-Value

The relative risk tells you the magnitude of the disease-exposure
association.
The p-value (calculated using either Fisher's exact test or the χ²
statistic) tells you if the observed result can be explained by chance.
You can test
H0: Relative Risk = 1
H1: Relative Risk ≠ 1
using any of the methods for comparing proportions (χ², Fisher's).
A big relative risk does not necessarily mean that the p-value is
small.
The p-value depends both on the magnitude of the relative risk as
well as the sample size.

Describing the Association Between Two Continuous Variables

Scatter plot
Correlation coefficient
Simple linear regression

Association between body weight (X) and plasma volume (Y)

Subject   Body Weight (kg)   Plasma Volume (l)
1         58.0               2.75
2         70.0               2.86
3         74.0               3.37
4         63.5               2.76
5         62.0               2.62
6         70.5               3.49
7         71.0               3.05
8         66.0               3.12

The Scatter Plot

[Figure: scatter diagram of plasma volume (l, 2.6-3.4) against body
weight (kg, 60-70), showing the linear regression line]

The Correlation Coefficient

Measures the direction and magnitude of the linear association
between X and Y.
The correlation coefficient is between -1 and +1.

[Figure: example scatterplots with r = -1, r = -.7, r = 0, r = .7,
and r = 1]

Examples of the Correlation Coefficient

[Figure: scatterplots labeled Perfect Positive, Weak Positive,
Uncorrelated, and Weak Negative]

Properties of Correlation Coefficient

Corr(X, Y) = r
-1 ≤ r ≤ 1
r = 1:  Perfect positive association
r > 0:  Positive association
r = 0:  No association
r < 0:  Negative association
r = -1: Perfect negative association

Closer to -1 and 1: stronger relationship
Sign: direction of association
r = 0: no linear association

Correlation measures linear association

[Figure: a strong relationship along a curve for which r = 0]

Correlation Slider:
noppa5.pc.helsinki.fi/koe/corr/cor7.html
Correlation Guessing Game:
http://istics.net/stat/Correlations/

Four Scatterplots-all have r = .7

[Figure: Anscombe's Data, four scatterplots with the same r = .7]

NOTES AND CAVEATS ON CORRELATION COEFFICIENT

Measures only linear relationships.
Other kinds of relationships are also important:
look at and graph the data.
Sensitive to outliers.
X values are measured, not controlled by the experimental design.
That is, X and Y are random.
Example where r is appropriate:
X = height, Y = weight
Example where r is not appropriate:
Clinical study at different doses:
X = dose of drug, Y = response

Body Weight-Blood Plasma

[Figure: scatter of plasma volume (l, 2.6-3.4) against body weight
(kg, 60-70); r = .76]

How Close Do the Points Fall to the Line?

This is measured by the correlation coefficient.
But what line? This is measured by regression.
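The r = .76 on the plot can be reproduced from the eight subjects in
the earlier table; a minimal sketch:

    # Sketch: Pearson correlation of body weight and plasma volume.
    import numpy as np

    weight = np.array([58.0, 70.0, 74.0, 63.5, 62.0, 70.5, 71.0, 66.0])
    plasma = np.array([2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12])

    r = np.corrcoef(weight, plasma)[0, 1]
    print(r)  # ~0.76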

Simple Linear Regression

We try to predict Y from X.
Y is the dependent variable.
X is the independent variable:
Predictor
Regressor
Covariate
Called simple because there is only one independent variable.
If there are several independent variables, it's called multiple
linear regression.

Simple Linear Regression

Fit a straight line to the data.

[Figure: plasma volume (l, 2.2-3.6) against body weight (kg, 55-75)
with a fitted straight line]

Least Squares Line

The linear regression line minimizes the sum of squares of the
vertical deviations: the least squares line.
Each distance is y_i - ŷ_i = y_i - (a + b·x_i);
this is computed for each data point in the sample.

Regression by Eye:
www.ruf.rice.edu/~lane/stat_sim/reg_by_eye/index.html

The Equation of a Line

[Figure: a line with intercept a and slope b (Y rises by b for each
unit increase in X)]

Intercept: The expected value of Y when X is 0.
Warning: The intercept is not always easily interpretable.
It is only meaningful if X can take the value 0.
Is body weight ever really 0?

Slope: The slope b is the expected increase in Y
corresponding to a unit increase in X.
b = 0: No association
b > 0: Positive association
(as X increases, Y tends to increase)
b < 0: Negative association
(as X increases, Y tends to decrease)

Simple Linear Regression Model

Y = a + bX + ε
Points don't fall exactly on the line, so to represent that we add ε.
ε is:
Noise
Error
Scatter
Assumptions about ε:
Random noise
Sometimes positive, sometimes negative,
but, on average, it's 0
Normally distributed about 0

Body Weight-Blood Plasma

Plasma volume = 0.0857 + .0436 × weight
Estimate of the intercept = 0.0857
Estimate of the slope = 0.0436
For each kilogram difference in body weight, we expect plasma
volume to differ by .0436 liters.

Prediction

Measurement of plasma volume is time-consuming.
Use the equation and body weight to estimate plasma volume.

Estimate the plasma volume for a 60 kg man:
Ŷ = 0.0857 + .0436 × 60 = 2.7 liters
Estimate the plasma volume for a 70 kg man:
Ŷ = 0.0857 + .0436 × 70 = 3.1 liters
Estimate the plasma volume for a 50 kg man?

Estimated Slope versus Population Parameter Slope

1. The estimated slope is .0436 and is based on a sample (n = 8)
of data.
2. On the other hand, the true slope is a population parameter
and represents the true relation between Y and X based on an
infinite number of observations.

Is the estimated slope (.0436) a good estimate?
Is it close to the true population slope?
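A sketch of the fit and the two predictions (numpy's polyfit returns
the slope and intercept of the least squares line):

    # Sketch: least squares fit and prediction for the plasma volume data.
    import numpy as np

    weight = np.array([58.0, 70.0, 74.0, 63.5, 62.0, 70.5, 71.0, 66.0])
    plasma = np.array([2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12])

    b, a = np.polyfit(weight, plasma, 1)  # slope ~ .0436, intercept ~ .0857
    for x in (60, 70):
        print(x, a + b * x)               # ~2.7 l and ~3.1 l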

Precision of Estimated Slope

The estimated slope 0.0436 is not the true population parameter
slope b.
Standard Error of the Estimated Slope:
Measures the precision of the estimated slope.
The standard error depends on:
n
Spread in the X's
How close the points are to the line
Statisticians have developed formulas to calculate the SE of
the estimated slope.
Example:
Slope = 0.0436
Standard error of slope = 0.0153

Statistical Inference on Slope

Random sampling behavior of estimated regression coefficients is
normal for large samples (n > 60), and centered at the true values.
Confidence interval
Hypothesis test

95% Confidence Interval for the Slope:
Slope ± t × standard error,
where t is the t-value with n - 2 degrees of freedom.
Example:
0.0436 ± 2.447 × .0153
0.0436 ± .0374
(.0062, .081)
Note: 0 is not in the confidence interval.

t Values Used for Confidence Intervals for Linear Regression Parameters

Critical Values of t for 95% Confidence

df       t          df       t
 1    12.706        12    2.179
 2     4.303        13    2.160
 3     3.182        14    2.145
 4     2.776        15    2.131
 5     2.571        20    2.086
 6     2.447        25    2.060
 7     2.365        30    2.042
 8     2.306        40    2.021
 9     2.262        60    2.000
10     2.228       120    1.980
11     2.201         ∞    1.960

Notes

If 0 is not in the 95% confidence interval, it means
b = 0 is not plausible;
thus we would reject H0: b = 0 at the α = .05 level.
That is, p < .05.
If we want the actual p-value, we would have to go a step further
and perform a significance test.

Hypothesis Testing

H0: b = 0 (slope is zero)
H1: b ≠ 0

Assume the null is true, and calculate the standardized distance of
b from 0:

test statistic = t = (slope - 0) / SE(slope) = 0.0436 / 0.0153 = 2.85

Hypothesis Testing

The p-value represents the probability of observing a slope as
extreme as .0436 if the true population slope were really 0.
The p-value is the probability of being 2.85 or more standard errors
away from the mean of 0 on a normal curve (beyond ±2.85).
p-value = .03
Plasma volume is positively associated with body weight (p = .03).
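scipy's linregress packages the slope, its standard error, and this
two-sided p-value in one call; a sketch on the same data:

    # Sketch: inference on the slope for the plasma volume data.
    import numpy as np
    from scipy.stats import linregress

    weight = np.array([58.0, 70.0, 74.0, 63.5, 62.0, 70.5, 71.0, 66.0])
    plasma = np.array([2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12])

    fit = linregress(weight, plasma)
    print(fit.slope, fit.stderr, fit.pvalue)  # ~.0436, ~.0153, p ~ .03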

Where does the p-value come from?

In a normal population, what is the probability of being at least
2.85 standard deviations away from the population mean?
Statisticians have calculated these probabilities.
This is fine for larger sample sizes (n > 60).
Actually, a t-correction must be used because the sample size is
small, with n - 2 degrees of freedom.
In the example, n - 2 = 6 degrees of freedom.

Some Notes and Assumptions about Simple Linear Regression

The model predicts Y from X.
The relationship between Y and X is a straight line.
Use the equation in the range of the data.
Beware of extrapolation.
Variability of values about the line is approximately normal.
Pairs of data points (X, Y) are independent.
The variability (standard deviation) about the line is the same
at all places on the line.

The Slope (b) and the Correlation Coefficient (r)

Both indicate the direction of association (positive or negative).
BUT:
Slope b is the expected difference in Y per unit difference
(higher versus lower) in X.
A larger slope does not necessarily mean a stronger linear
association.
The correlation coefficient is scaled between -1 and +1.
The correlation coefficient measures how close the points fall
to a straight line.

Testing for an Association

We can test whether the correlation coefficient is 0 or not.
It turns out this is equivalent to testing whether the slope b is 0.

The Coefficient of Determination, R²

R² is the square of the correlation coefficient.
R² is the fraction of the observed variability in Y that can be
explained by X.
R² is a number between 0 and 1.
Idea:
When there is a straight-line relation, some of the variation in Y
is accounted for by the fact that as X changes, it pulls Y along
with it.
Example:
In the plasma volume and body weight example,
r = .76, so R² = .76² = .58.

What's a Good R²?

There are a couple of important things about R² and r:
These quantities are both estimates based on the sample of data.
They are frequently reported without some recognition of sampling
variability, for example a 95% confidence interval.
A low R² or r is not necessarily bad:
many outcomes cannot/will not be fully or close to fully explained,
in terms of variability, by any one single predictor.
The higher the R² value, the better X predicts Y for individuals
in a sample/population:
individual Y-values vary less about their estimated means based
on X.
However, there may be important overall associations between the
mean of Y and X even though there is still a lot of individual
variability in Y-values about their means estimated by X.

Caveat

CORRELATION IS NOT CAUSATION

Observation:
Negative correlation between death rates from ovarian cancer and
family size, based on 20 countries?
Faulty Interpretation?
Does this mean that having a large family will protect you from
ovarian cancer?

Spurious Correlations with Time

May occur when two variables are recorded over time
and correlated with time.

[Figure: Death Rate from Drowning against Amount of Ice Cream Sold]

Spurious Correlation

Students who received radiation therapy are most likely to die.
Mornings on which it is hard to get out of bed are the mornings
with the most car accidents.

Ecological Fallacy-Height and Dietary Intake in 3 Countries

[Figure: Blood Pressure against Dietary Salt]

Multiple Linear Regression

Finds an equation to predict Y from multiple independent variables
(the X's).

Example:
Y  = Number of bed days
X1 = HMO or fee-for-service plan
X2 = Mental health score at baseline
X3 = Functional status at baseline
X4 = Bed days in year prior to enrollment
X5 = Age
X6 = Gender

Multiple Regression Equation

y = b0 + b1·X1 + b2·X2 + b3·X3 + ε

Data                                 Parameters
Y   dependent variable               b0  intercept
X1  first independent variable       b1  regression coefficient (slope) for X1
X2  second independent variable      b2  regression coefficient for X2
X3  third independent variable       b3  regression coefficient for X3

ε just represents random noise (scatter) and is in the equation to
remind us that actual data points won't fall perfectly on the line.

Interpretation of Regression Coefficients from Multiple Linear Regression

b1 is the expected difference in Y per unit difference
(higher versus lower) in X1 if all the other X's
(independent variables) are held constant.
b2 is the expected difference in Y per unit difference
(higher versus lower) in X2 if all the other X's are held constant.

Uses of Multiple Linear Regression

To look for relationships between variables and adjust for
confounding variables.
To develop models to predict the expected value of Y from the X's.

Example:
Is there an association between hemoglobin and packed cell volume?

Example: Hemoglobin and Packed Cell Volume

Hemoglobin level, packed cell volume, age, and menopausal status
for 20 women:

Subject   Hb (g/dl)   PCV (%)   Age (yrs)   Menopause (0 = No)
 1        11.1        35        20          0
 2        10.7        45        22          0
 3        12.4        47        25          0
 4        14.0        50        28          0
 5        13.1        31        28          0
 6        10.5        30        31          0
 7         9.6        25        32          0
 8        12.5        33        35          0
 9        13.5        35        38          0
10        13.9        40        40          0
11        15.1        45        45          1
12        13.9        47        49          0
13        16.2        49        54          1
14        16.3        42        55          1
15        16.8        40        57          1
16        17.1        50        60          1
17        16.6        46        62          1
18        16.9        55        63          1
19        15.7        42        65          1
20        16.5        46        67          1

Source: data from Campbell et al. (1985)

[Figure: scatter of Hb (g/dl, 10-16) against PCV (%, 25-55)]

[Figures: PCV (%, 25-55) against Age (years, 20-60), and
Hb (g/dl, 10-16) against Age (years, 20-60)]

Is the relationship between hemoglobin and PCV only apparent
because maybe they both increase with age?
Is age a confounder?
Can we control for age?

Multiple Linear Regression

A statistical procedure that tries to evaluate the association
between two variables while controlling for the effects of other
variables.
One way of thinking about how it works is this:
Imagine sorting the individuals into age groups (for example
21-25, 26-30, 31-35, etc.).
For each age group, perform a simple linear regression of
hemoglobin (Y) versus PCV (X).
Get the slope.
Calculate a sort of average of the slopes from each age group.
Basically, this is what multiple linear regression does, but it
doesn't actually need to sort into age groups.

Multiple Linear Regression Results

Regression of Hemoglobin on Age and PCV:
Predicted hemoglobin = 5.24 + 0.110 × Age + 0.097 × PCV

Variable        Coefficient   SE      t-value   p-value
Constant (b0)   5.24          1.21    4.34      0.0004
Age (b1)        0.110         0.016   6.74      0.0001
PCV (b2)        0.097         0.033   2.98      0.0085

Interpretation

MLR: For a given age, hemoglobin differs by 0.097 g/dl for every
unit difference in PCV.
SLR: The simple linear regression suggested hemoglobin differed
by 0.121 per unit difference in PCV (not controlling for age).

Conclusion: Hemoglobin is positively associated with PCV even
after accounting for age (p = .0085).
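A hedged sketch of the same fit with numpy (data transcribed from the
20-subject table above; the coefficients should come out near the
published 5.24, 0.110, and 0.097):

    # Sketch: multiple linear regression of Hb on age and PCV by least squares.
    import numpy as np

    hb  = np.array([11.1, 10.7, 12.4, 14.0, 13.1, 10.5, 9.6, 12.5, 13.5, 13.9,
                    15.1, 13.9, 16.2, 16.3, 16.8, 17.1, 16.6, 16.9, 15.7, 16.5])
    age = np.array([20, 22, 25, 28, 28, 31, 32, 35, 38, 40,
                    45, 49, 54, 55, 57, 60, 62, 63, 65, 67])
    pcv = np.array([35, 45, 47, 50, 31, 30, 25, 33, 35, 40,
                    45, 47, 49, 42, 40, 50, 46, 55, 42, 46])

    X = np.column_stack([np.ones_like(hb), age, pcv])  # intercept column first
    coef, *_ = np.linalg.lstsq(X, hb, rcond=None)
    print(coef)  # intercept, age, and PCV coefficients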

Inference for MLR Coefficients

A parameter estimate (regression coefficient) has an associated
standard error (SE).
Statisticians have developed formulas for the standard errors of
multiple linear regression coefficients.
The SE represents the imprecision in the parameter estimate.
Generally, the standard errors get smaller with larger sample size.

Inference for MLR Coefficients

Hypothesis test:
H0: b = 0
H1: b ≠ 0
Test statistic: t = b / SE
To calculate a p-value, use a t-distribution with degrees of
freedom = sample size - number of coefficients.
Example: df = 20 - 3 = 17
To test H0: b2 = 0:
t = .097 / .033 = 2.98
The p-value is .0085.

Example

Question: Do post-menopausal women have a different hemoglobin
level than pre-menopausal women?
Simple Analysis: A two-sample t-test comparing pre- vs
post-menopausal women is highly significant.
Concern: Post-menopausal women are older AND hemoglobin rises
with age.
Are we just seeing the effects of age?

Solution

Use multiple linear regression to account for the effects of age:

Y = b0 + b1·X1 + b2·X2
X1 = Age (years)
X2 = Menopause = 0 if NO (pre), 1 if YES (post)

Note

The X's do not have to be continuous.
X2 is called an indicator or dummy variable.
Indicator variables are used to define groups.

How is Multiple Linear Regression Adjusting for Age?

Imagine sorting individuals into age groups.
In each age group, divide into pre-menopausal and post-menopausal.
Then perform a two-sample t-test to calculate the difference in
hemoglobin sample means between pre- and post-menopausal women.
Multiple linear regression kind of averages these two-sample
t-tests across all age groups: it averages the differences
across all age groups.

Results:
MLR of Hb against age and menopausal status

Variable          Coefficient   SE      t-value   p-value
Constant (b0)     9.74          1.11    8.77      <0.001
Age (b1)          0.081         0.033   2.41      0.03
Menopausal (b2)   1.88          1.03    1.82      0.09

Source: Campbell and Machin

Confidence Interval for b2

b2 ± t × SE
1.88 ± 2.11 × 1.03

Interpretation:
After accounting for age, post-menopausal women have, on average,
hemoglobin levels about 1.88 g/dl higher than pre-menopausal women
(95% CI -0.28 to 4.08, p = .09).

Power and Sample Size

Used to determine how many subjects are needed to answer the
research question.

Example: Thrombolysis and Acute MI

Clinicians thought thrombolysis would benefit AMI.
Successive studies failed to prove an association.
Finally, a mega-trial adequately powered to prove the association
was done.
Emerg Med J 2003;20:453-458

                            Truth
                       H0                    H1
Decision
Reject H0              Type I error          Power
                       (significance level α)
Fail to reject H0                            Type II error

Statistical Power

Power = the chance you reject H0 when H0 is false.
That is, you correctly conclude there is a treatment effect when
there really is a treatment effect.
Power is a measure of doing the right thing when H1 is true!
Higher power is better (the closer the power is to 1.0, or 100%).

Power

[Figure: power illustrated with overlapping sampling distributions;
x-axis from 10 to 25]

Effect of Effect Size

[Figures: pairs of sampling distributions with small versus large
separation between the means; x-axes from 10 to 25]

Which is harder?
To detect very small differences, or
to find a large (obvious) difference?
Detecting small differences is harder: the smaller the true effect,
the lower the power.

Effect of Amount of Variability

[Figures: power illustrations for small versus large variability;
x-axes from 10 to 25]

Effect of α, or How Certain We Want to Be to Avoid a Type I Error

Conventionally, we choose a probability of .05 for a type I error.
If we lower the significance level, the power will be lower.

[Figures: power illustrations for different α-levels; x-axes from
10 to 25]

Effect of Sample Size

[Figures: power illustrations for different sample sizes; x-axes
from 10 to 25]

What influences power?

Effect size
Variability in measurements
Chosen significance level (α)
Sample size

Blood pressure and oral contraceptives

Is oral contraceptive use associated with higher blood pressure
among 35-39 year olds?

H0: μOC = μNO-OC
H1: μOC ≠ μNO-OC

Pilot Study:
               n    Sample mean systolic BP   Sample SD (s)
OC users        8   132.8                     15.3
Non-OC users   21   127.4                     18.2

2-sample t-test: p = .46

But...

The sample mean difference in blood pressures is 132.8 - 127.4 = 5.4.
This could be considered scientifically significant; however, the
result is not statistically significant (or even close to it!) at
the α = .05 level.
95% CI for the difference in means (OC - Non-OC): -9.5 to +20.3.
Suppose, as a researcher, you were concerned about detecting a
population difference of this magnitude if it truly existed.
This OC/blood pressure study has power of only .13 to detect a
difference in blood pressure of 5.4 or more, if this difference
truly exists in the population of women 35-39 years old!

Design

Determine the sample sizes needed to detect
about a 5 mm increase in blood pressure in OC users
with 80% power
at significance level α = .05.
Using the pilot data, we estimate that the standard deviations are
15.3 and 18.2 in OC and non-OC users respectively.

In order to detect a difference in BP of 5.4 units (if it really
exists in the population) with high (80%) certainty,
we would need to enroll 153 OC users and 153 non-users.
This assumed that we wanted equal numbers of women in each group...

Key Determinants of Sample Size

Specify:
α-level of the test: the probability of a type I error;
p-values below this are called statistically significant.
Power: the power you desire for detecting this treatment effect.
Effect Size: your best estimate of the true difference;
Δ = μ1 - μ2 is the treatment effect.
Pilot data
Smallest effect of scientific interest
Variability: your best estimate of the true SDs.
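The 153-per-group figure above can be reproduced with the usual
normal-approximation formula for comparing two means; a sketch (the
function name n_per_group is chosen here for illustration):

    # Sketch: sample size per group for a two-sample comparison of means.
    from scipy.stats import norm

    def n_per_group(delta, sd1, sd2, alpha=0.05, power=0.80):
        """Subjects per group to detect a mean difference delta."""
        z = norm.ppf(1 - alpha / 2) + norm.ppf(power)  # 1.96 + 0.84 here
        return (sd1**2 + sd2**2) * z**2 / delta**2

    print(n_per_group(5.4, 15.3, 18.2))  # ~152, round up to 153 per group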

Designing Your Own Study

When designing a study, there is a tradeoff between:
Power
α-level
Sample size
Minimum detectable difference (specific H1)

Industry standard: 80% power, α = .05.

Sample size calculations are an important part of a study proposal.
Study funders want to know that the researcher can detect a
relationship with a high degree of certainty (should it really exist).

Designing Your Own Study

What if the sample size calculation yields group sizes that are too big?
Increase the minimum difference of interest
Increase the α-level
Decrease the desired power
Decrease the SD???

Accounting for confounders requires more information, and sample size
calculations have to be done via computer simulation: consult a
statistician!

Formulae

X̄ = ( Σ X_i ) / n

s = √( Σ (X_i - X̄)² / (n - 1) )

SE(X̄) = s / √n

SE(X̄1 - X̄2) = √( s1²/n1 + s2²/n2 )

SE(p̂) = √( p̂(1 - p̂) / n )

χ² = Σ over the 4 cells of (O - E)² / E

Steps in a Research Project

Planning
Design
Data Collection
Data Analysis
Presentation
Interpretation

Types of Data

Binary data
Categorical data
Continuous data
Survival data

Measures of Center

Measures of Spread

Pictures of Data

Shapes of Distributions

Right skew
Left skew
Symmetric
Uniform
Bimodal

Normal Distribution

Completely described by: ____
____ Rule?

Constructing Intervals

Standard Normal Scores

Z-score = ____
Measures ____

Sampling Distribution

Refers to the distribution of a ____ when multiple samples
have been taken.

Standard Error

Central Limit Theorem

Means and proportions (and differences in means and differences
in proportions) are distributed normally when the sample size
is ____.
Variability of the distribution of means is characterized by
____ = ____.
Variability of the distribution of proportions is characterized by
____ = ____.

Confidence Intervals

95% of the time, the population mean will lie within about two
standard errors of the sample mean.
To construct a Confidence Interval: ____

Hypothesis Testing

Unified Theory of CI/HT

Step 0: Situation, Parameter, H0, Statistic, SE, Distribution
Step 1: ____
Step 2: ____
Step 3: ____
Step 4: ____

Type I error
Type II error
Power
p-value

Power

Sample Size
Difference to Detect
Variability
Significance level (α)

Population versus Sample

Parameter versus Statistic

Correlation

r = -1
r < 0
r = 0
r > 0
r = 1

Simple Linear Regression

Assumptions
CAUTION:
Prediction
Coefficient of Determination
Inference on Slope

Multiple Linear Regression

Association not Causation