Staying Current

Adv Physiol Educ 33: 286–292, 2009; doi:10.1152/advan.00062.2009.

Explorations in statistics: the bootstrap


Douglas Curran-Everett
Division of Biostatistics and Bioinformatics, National Jewish Health, and Department of Biostatistics and Informatics
and Department of Physiology and Biophysics, School of Medicine, University of Colorado Denver, Denver, Colorado
Submitted 20 July 2009; accepted in final form 4 September 2009

Curran-Everett D. Explorations in statistics: the bootstrap. Adv Physiol Educ 33: 286–292, 2009; doi:10.1152/advan.00062.2009.–Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This fourth installment of Explorations in Statistics explores the bootstrap. The bootstrap gives us an empirical approach to estimate the theoretical variability among possible values of a sample statistic such as the sample mean. The appeal of the bootstrap is that we can use it to make an inference about some experimental result when the statistical theory is uncertain or even unknown. We can also use the bootstrap to assess how well the statistical theory holds: that is, whether an inference we make from a hypothesis test or confidence interval is justified.

Central Limit Theorem; R; sample mean; software; standard error

THIS FOURTH PAPER in Explorations in Statistics (see Refs. 3–5) explores the bootstrap, a recent development in statistics (8, 11–16) that evolved from the jackknife (25, 28, 36). Despite its brief history, the bootstrap is discussed in textbooks of statistics (26) and used in manuscripts published by the American Physiological Society (2, 10, 17–24, 35, 37).

The bootstrap¹ gives us an empirical approach to estimate the theoretical variability among possible values of a sample statistic.² In our previous explorations (3–5) we calculated the standard error of the mean, our estimate of the theoretical variability among sample means, as SE{ȳ} = s/√n, where s was the sample standard deviation and n was the number of observations in the sample. This computation was grounded in theory (7). In our last exploration (4) we derived a confidence interval for the population mean. This too was grounded in theory. The beauty of the bootstrap is that it can transcend theory: we can use the bootstrap to make an inference about some experimental result when the theory is uncertain or even unknown (14, 26). We can also use the bootstrap to assess how well the theory holds: that is, whether an inference we make from a hypothesis test or confidence interval is justified.

R: Basic Operations

In the first article (3) of this series, I summarized R (29) and outlined its installation.³ For this exploration, there are two additional steps: download Advances_Statistics_Code_Boot.R⁴ to your Advances folder and install the extra package boot.

To install boot, open R and then click Packages | Install package(s) . . .⁵ Select a CRAN mirror⁶ close to your location and then click OK. Select boot and then click OK. When you have installed boot, you will see

package 'boot' successfully unpacked and MD5 sums checked

in the R Console.

To run R commands. If you use a Mac, highlight the commands you want to submit and then press command key+enter. If you use a PC, highlight the commands you want to submit, right-click, and then click Run line or selection. Or, highlight the commands you want to submit and then press Ctrl+R.

The Simulation: Data for the Bootstrap

In our early explorations (3–5) we drew random samples of 9 observations from a standard normal distribution with mean μ = 0 and standard deviation σ = 1. These were the observations–the data–for samples 1, 2, and 1000:

> # Sample   Observations
     [1]   0.422  1.103  1.006  1.034  0.285 −0.647  1.235  0.912  1.825
     [2]   0.154 −0.654 −0.147  1.715  0.720  0.804  0.256  1.155  0.646
      :
  [1000]   0.560 −1.138  0.485 −0.864 −0.277  2.198  0.050  0.500  0.587

Each time we drew a random sample we calculated some sample statistics. These were the statistics for samples 1, 2, and 1000:

> # Sample   Mean     SD     SE      t     LCI     UCI
       1    0.797  0.702  0.234  3.407   0.362   1.232
       2    0.517  0.707  0.236  2.193   0.079   0.955
       :
    1000    0.233  0.975  0.325  0.718  −0.371   0.838

In contrast to our early explorations that used 1000 samples from our standard normal population, our exploration of the bootstrap uses just the 9 observations from sample 1:

0.422  1.103  1.006  1.034  0.285  −0.647  1.235  0.912  1.825

In our previous exploration (4) we used these observations to calculate a 90% confidence interval for the population mean:

[0.362, 1.232] ≐ [0.36, 1.23].

Because this interval excluded 0, we declared, with 90% confidence, that 0 was not a plausible value of the population mean.

¹ The APPENDIX reviews the origin of the name bootstrap.
² For example, a sample mean, a sample standard deviation, or a sample correlation.
³ I developed the scripts for the early explorations (3–5) using R-2.6.2. I developed the script for this exploration using R-2.8.2 (deployed 22 Dec 2008), but it will run in R-2.6.2.
⁴ This file is available through the Supplemental Material for this article at the Advances in Physiology Education website.
⁵ The notation click A | B means click A, then click B.
⁶ CRAN stands for the Comprehensive R Archive Network. A mirror is a duplicate server.

Address for reprint requests and other correspondence: D. Curran-Everett, Div. of Biostatistics and Bioinformatics, M222, National Jewish Health, 1400 Jackson St., Denver, CO 80206 (e-mail: EverettD@NJHealth.org).
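The sample statistics above can be reproduced from the raw observations. The article computes them in R (Advances_Statistics_Code_Boot.R); as a language-neutral sketch, the same arithmetic for sample 1 can be done in a few lines of Python. The critical value t_{0.05,8} = 1.860 is hard-coded here to keep the sketch dependency-free; the variable names are illustrative, not part of the article's code.

```python
import math
import statistics

# The 9 observations of sample 1 (from the article)
y = [0.422, 1.103, 1.006, 1.034, 0.285, -0.647, 1.235, 0.912, 1.825]

n = len(y)
mean = statistics.fmean(y)      # sample mean
sd = statistics.stdev(y)        # sample standard deviation, s
se = sd / math.sqrt(n)          # standard error of the mean, s/sqrt(n)
t = mean / se                   # t statistic for H0: mu = 0

# 90% confidence interval: mean +/- t_{alpha/2, n-1} * SE, with t_{0.05, 8} = 1.860
t_crit = 1.860
lci, uci = mean - t_crit * se, mean + t_crit * se

print(round(mean, 3), round(sd, 3), round(se, 3), round(t, 3))
print(round(lci, 3), round(uci, 3))
```

Run on sample 1, this reproduces the Mean, SD, SE, t, and 90% interval reported in the table above.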

286 1043-4046/09 $8.00 Copyright © 2009 The American Physiological Society



This inference was consistent with the result of the hypothesis test in our second exploration (5) in which we realized that when we draw a single random sample from some population–when we do a single experiment–we can have enough unusual observations so that it appears the observations came from a different population. This is what happened with our first sample.

With this brief review of the observations from our first sample, we are ready to explore the bootstrap.

The Bootstrap

In our previous exploration (4) we used the standard error of the mean, our estimate of the theoretical variability among possible values of the sample mean, to calculate a confidence interval for the mean of the underlying population. Rather than use theory (7) to develop the notion of the standard error of the mean, we used a simulation: we drew 1000 random samples from our population and–for each sample–we calculated the mean (3). When we treated those 1000 sample means as observations, we calculated their standard deviation:

SD{ȳ} = 0.326.

The standard deviation of sample means is the standard error of the sample mean SE{ȳ}. The bootstrap estimates the standard error of a statistic using not a whole bunch of random samples from some theoretical population but actual sample observations.

Suppose we want to bootstrap the mean using the observations from the first sample: 0.422, 1.103, . . . , 1.825. How do we do this? We draw at random–with replacement⁷–a sample of size 9 from these 9 actual observations (Table 1). We then repeat this process until we have drawn a total of b bootstrap samples.⁸ For each bootstrap sample, we calculate its mean. We use the notation ȳ*j to represent the mean of bootstrap sample j (11, 16).

Table 1. Illustration of bootstrap samples

            Bootstrap Sample j
    y        1       2      …      b
  0.422   −0.647   0.285        1.034
  1.103    0.285   1.235        0.285
  1.006    0.912   1.825        1.103
  1.034    0.912   1.034        0.912
  0.285    1.825   1.006        1.103
 −0.647    1.825   1.103        0.422
  1.235    0.285   0.912       −0.647
  0.912    1.006   1.103        0.912
  1.825   −0.647   0.422        1.825
  0.797    0.640   0.992        0.772
    ȳ       ȳ*1     ȳ*2          ȳ*b

y, Actual observations from sample 1. Because the bootstrap procedure samples with replacement, an original observation can appear more than once in a bootstrap sample. The bootstrap sample means ȳ*1, ȳ*2, . . . , ȳ*b cluster around 0.797, the observed sample mean ȳ. The commands in lines 40–49 of Advances_Statistics_Code_Boot.R create three bootstrap samples from the observations in sample 1. To generate three bootstrap samples as in this table, highlight and submit the lines of code from Table 1: first line to Table 1: last line.

Suppose we generate 10,000 bootstrap replications of the sample mean (Fig. 1). If we treat these 10,000 sample means as observations, we can calculate their average and standard deviation:

Ave{ȳ*} = 0.798 and SD{ȳ*} = 0.219.

The standard deviation SD{ȳ*} describes the variability among the b bootstrap means and estimates the standard deviation of the theoretical distribution of the sample mean (16). The commands in lines 85–86 of Advances_Statistics_Code_Boot.R return these values. Your values will differ slightly.

Fig. 1. Bootstrap distribution of 10,000 sample means, each with 9 observations. The bootstrap replications were drawn at random from the observations in sample 1 (see Table 1). The average of these sample means, Ave{ȳ*}, is 0.798, nearly identical to the observed sample mean ȳ = 0.797. The standard deviation of these sample means, SD{ȳ*}, is 0.219, which estimates the standard deviation of the theoretical distribution of the sample mean, σ/√n = 0.333 (3). The commands in lines 65–78 of Advances_Statistics_Code_Boot.R create this data graphic. To generate this data graphic, highlight and submit the lines of code from Figure 1: first line to Figure 1: last line.

We can do more than just calculate Ave{ȳ*} and SD{ȳ*}: we can assess whether the distribution of these 10,000 bootstrap means is consistent with a normal distribution. How? By using a normal quantile plot (Fig. 2). It turns out that these 10,000 bootstrap means are not consistent with a normal distribution (Fig. 3). On the other hand, 10,000 bootstrap replications of the sample mean using the observations from sample 1000 are consistent with a normal distribution (Fig. 4).

And last, we can use bootstrap replications of the sample mean to estimate different kinds of confidence intervals for the population mean: for example, normal-theory, percentile, and bias-corrected-and-accelerated confidence intervals (16, 26).

⁷ After an observation is drawn from the pool of 9 values, it is returned to the pool. The consequence: the observation can appear more than once in a bootstrap sample.
⁸ The number of bootstrap replications can vary from 1,000 to 10,000 (11, 16, 26).

Advances in Physiology Education • VOL 33 • DECEMBER 2009
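The resampling scheme of Table 1, scaled up to b = 10,000 replications, is short enough to sketch outside of R. This stdlib-only Python stand-in (the seed and names are arbitrary choices of the sketch, not the article's code) draws each bootstrap sample with replacement and then summarizes the bootstrap means:

```python
import random
import statistics

random.seed(1)  # any seed; your values will differ slightly, as the article notes

# The 9 observations of sample 1 (from the article)
y = [0.422, 1.103, 1.006, 1.034, 0.285, -0.647, 1.235, 0.912, 1.825]

b = 10_000  # number of bootstrap replications

# For each bootstrap sample: draw 9 values with replacement, keep the mean
boot_means = [statistics.fmean(random.choices(y, k=len(y))) for _ in range(b)]

ave_star = statistics.fmean(boot_means)  # Ave{y*}: close to the sample mean 0.797
sd_star = statistics.stdev(boot_means)   # SD{y*}: the bootstrap standard error

print(round(ave_star, 3), round(sd_star, 3))
```

With replacement is the essential detail: random.choices resamples the same pool of 9 values for every replication, which is exactly what lets an original observation appear more than once in a bootstrap sample.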



Normal-theory confidence interval. In our last exploration (4) we calculated a 100(1 − α)% confidence interval for the population mean as

[ȳ − a, ȳ + a],

where the allowance a was

a = z_{α/2} · SD{ȳ}.

Recall that z_{α/2} is the 100[1 − (α/2)]th percentile from the standard normal distribution and SD{ȳ} is the standard deviation of the sample means, σ/√n.

A normal-theory confidence interval based on bootstrap replications of the sample mean is similar. We just replace ȳ with Ave{ȳ*}, a with a*, and SD{ȳ} with SD{ȳ*}:

[Ave{ȳ*} − a*, Ave{ȳ*} + a*].

The bootstrap allowance a* is

a* = z_{α/2} · SD{ȳ*}.

Suppose we want to calculate a 90% bootstrap confidence interval for the population mean. In this situation, α = 0.10 and z_{α/2} = 1.645. Therefore, the allowance a* is

a* = z_{α/2} · SD{ȳ*} = 1.645 · 0.219 = 0.360,

and the resulting 90% confidence interval is

[0.438, 1.158] ≐ [0.44, 1.16].

This bootstrap confidence interval resembles the confidence interval of [0.36, 1.23] that we calculated before (4). The commands in lines 196–199 of Advances_Statistics_Code_Boot.R return these values. Your values will differ slightly.

Percentile confidence interval. In the bootstrap distribution of 10,000 sample means (see Fig. 1), 90% of the means are covered by the interval

[0.426, 1.143] ≐ [0.43, 1.14].

The commands in lines 196–199 of Advances_Statistics_Code_Boot.R return these values.

Fig. 2. Normal quantile plots (right) of 100 observations drawn from a normal (top) or nonnormal (bottom) distribution. If observations are consistent with a normal distribution, then they will follow a rough straight line. If observations are inconsistent with a normal distribution, then they will deviate systematically from a straight line. The commands in lines 109–156 of Advances_Statistics_Code_Boot.R create this data graphic. To generate this data graphic, highlight and submit the lines of code from Figure 2: first line to Figure 2: last line.

Fig. 3. Normal quantile plot of 10,000 sample means, each with 9 observations. The bootstrap replications were drawn at random from the observations in sample 1 (see Table 1). The bootstrap sample means are inconsistent with a normal distribution. The commands in lines 165–171 of Advances_Statistics_Code_Boot.R create this data graphic. To generate this data graphic, highlight and submit the lines of code from Figure 3: first line to Figure 3: last line.

Fig. 4. Normal quantile plot of 10,000 sample means, each with 9 observations. The bootstrap replications were drawn at random from the observations in sample 1000. These bootstrap sample means are consistent with a normal distribution. The commands in lines 183–189 of Advances_Statistics_Code_Boot.R create this data graphic. To generate this data graphic, highlight and submit the lines of code from Figure 4: first line to Figure 4: last line.
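Both kinds of interval can be computed directly from the vector of bootstrap means. A stdlib-only Python sketch, assuming the same 9 observations and z_{0.05} = 1.645; the percentile indexing here is a simple approximation (R's quantile function interpolates more carefully), and the exact numbers will drift with the seed:

```python
import random
import statistics

random.seed(1)

# The 9 observations of sample 1 (from the article)
y = [0.422, 1.103, 1.006, 1.034, 0.285, -0.647, 1.235, 0.912, 1.825]
b = 10_000

# Sorted bootstrap means, so percentiles can be read off by index
boot_means = sorted(statistics.fmean(random.choices(y, k=len(y))) for _ in range(b))

ave_star = statistics.fmean(boot_means)
sd_star = statistics.stdev(boot_means)

# Normal-theory interval: Ave{y*} +/- z_{alpha/2} * SD{y*}, with z_{0.05} = 1.645
a_star = 1.645 * sd_star
normal_ci = (ave_star - a_star, ave_star + a_star)

# Percentile interval: the 5th and 95th percentiles of the bootstrap means
percentile_ci = (boot_means[int(0.05 * b)], boot_means[int(0.95 * b) - 1])

print(normal_ci, percentile_ci)
```

With a large b, the two intervals land near the article's [0.44, 1.16] and [0.43, 1.14]; they differ because the percentile interval follows the skew of the bootstrap distribution while the normal-theory interval is forced to be symmetric about Ave{ȳ*}.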




Bias-corrected-and-accelerated confidence interval. If the number of observations in the actual sample is too small, if the average Ave{ȳ*} of the bootstrap sample means differs from the sample mean ȳ, or if the bootstrap distribution of the sample mean is skewed, then normal-theory and percentile confidence intervals are likely to be inaccurate (16, 26). A bias-corrected-and-accelerated confidence interval adjusts percentiles of the bootstrap distribution to account for bias and skewness (16, 26). In this exploration, the 90% bias-corrected-and-accelerated confidence interval is

[0.376, 1.113] ≐ [0.38, 1.11].

The commands in lines 196–199 of Advances_Statistics_Code_Boot.R return these values. This confidence interval is shifted slightly to the left of the percentile interval of [0.43, 1.14]. In many situations, a bias-corrected-and-accelerated confidence interval will provide a more accurate estimate of our uncertainty about the true value of a population parameter such as the mean (16, 26).

Limitations. As useful as the bootstrap is, it cannot always salvage the statistical analysis of a small sample. Why not? If the sample is too small, then it may be atypical of the underlying population. When this happens, the bootstrap distribution will not mirror the theoretical distribution of the sample statistic. The trouble is, it may not be obvious how small too small is. A statistician (see Ref. 6, guideline 1) and an estimate of power can help.

In our second and third explorations (4, 5) we concluded that the observations from sample 1 were consistent with having come from a population that had a mean other than 0. Only because we had defined the underlying population (see Ref. 3) could we have known that we had erred. When we do a single experiment, we can have enough unusual observations so that it just appears the observations came from a different population (5). A small sample size exacerbates the potential for this phenomenon. This is what happened when we bootstrapped the sample mean using observations from the first sample. We know the theoretical distribution of the sample mean is exactly normal (3), but the distribution of those bootstrap means is clearly not normal (see Fig. 3). This happens because the observations from the first sample are atypical of the underlying population. The message? All bets are off if you bootstrap a sample statistic using observations from a sample that is too small.

The Bootstrap in Data Transformation

In our previous exploration (4) we delved into confidence intervals by drawing random samples from a normal distribution. That we chose a normal distribution for our population was no accident. For normal-theory confidence intervals to be meaningful, one thing must be approximately true: if the random variable Y represents the physiological thing we care about, then the theoretical distribution of the sample mean Ȳ with n observations must be distributed normally with mean μ and standard deviation σ/√n.⁹ In our simulations (3–5) this assumption was satisfied exactly (7). In a real experiment we never know if this assumption is satisfied, but we know it will be satisfied at least roughly–regardless of the population distribution from which the sample observations came–as long as the sample size n is big enough (27).

What happens, however, if we doubt that the theoretical distribution of the sample mean is consistent with a normal distribution? Our suspicion would be natural: this distribution is, after all, a theoretical one. One approach is to transform the sample observations. Common transformations include the logarithm and the inverse. Box and Cox (1) described a family of power transformations in which an observed variable y is transformed into the variable w using the parameter λ:

w = (y^λ − 1)/λ   for λ ≠ 0, and
w = ln y          for λ = 0.

The logarithmic, λ = 0, and inverse, λ = −1, transformations belong to this family.

Draper and Smith (9) summarized the steps needed to estimate λ and its approximate 100(1 − α)% confidence interval. These steps include, for each trial value of λ, the calculation of the maximum likelihood ℓmax:

ℓmax = −(n/2) ln (SSresidual/n) + (λ − 1) Σ_{i=1}^{n} ln y_i,   (1)

where SSresidual is the residual sum of squares in the fit of a general linear model to the observations. The optimum estimate of λ maximizes ℓmax (Fig. 5).

Fig. 5. Maximum likelihood approach to data transformation. For each value of λ, the maximum likelihood ℓmax is calculated according to Eq. 1. For the C-reactive protein values (see Table 2), the maximum likelihood estimate of λ that maximizes ℓmax is −0.1, and an approximate 95% confidence interval for λ is [−0.4, +0.1]. Because this interval includes 0, a log transformation of the C-reactive protein values is reasonable.

Without question, transformation can be useful (1, 9). But really, who wants to identify the optimum transformation by solving Eq. 1? The bootstrap provides another way.

The textbook I use in my statistics course provides the data: measurements of C-reactive protein (Table 2). Problem 7.26 (Ref. 26, p. 442–443) asks students to study the distribution of these 40 observations and to calculate a 95% confidence interval for the true C-reactive protein mean. The real value of problem 7.26 is that it then asks students if a confidence interval is even appropriate for these data.

Cursory examinations of the histogram and normal quantile plot reveal that the C-reactive protein values are skewed and inconsistent with a normal distribution (Fig. 6, top).

⁹ In our second exploration (5) we used the test statistic t to investigate hypothesis tests, test statistics, and P values. A t statistic shares this assumption.
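Eq. 1 is easy to evaluate over a grid of trial values of λ. The Python sketch below applies it to the C-reactive protein values of Table 2 shifted by +1; the shift is an assumption of this sketch, made because the Box–Cox family requires positive values and the article's own transformation is ln (y + 1). For a single sample, the fitted general linear model is just the mean, so SSresidual is the sum of squared deviations of w from its mean.

```python
import math

# C-reactive protein observations (mg/l) from Table 2
y = [0, 3.90, 5.64, 8.22, 0, 5.62, 3.92, 6.81, 0, 0,
     73.20, 0, 46.70, 0, 0, 26.41, 22.82, 0, 30.61, 3.49,
     0, 0, 4.81, 9.57, 5.36, 0, 5.66, 0, 59.76, 12.38,
     15.74, 0, 0, 0, 0, 9.37, 20.78, 7.10, 7.89, 5.53]

# Shift by +1 so every value is positive (assumption; mirrors ln(y + 1))
yp = [v + 1 for v in y]
n = len(yp)
sum_log = sum(math.log(v) for v in yp)

def l_max(lam):
    """Eq. 1 for one trial value of lambda: for a single sample the fitted
    general linear model is the mean, so SS_residual = sum((w - w_bar)^2)."""
    if lam == 0:
        w = [math.log(v) for v in yp]
    else:
        w = [(v ** lam - 1) / lam for v in yp]
    w_bar = sum(w) / n
    ss_res = sum((wi - w_bar) ** 2 for wi in w)
    return -0.5 * n * math.log(ss_res / n) + (lam - 1) * sum_log

# Grid search over trial values of lambda, as in Fig. 5
grid = [round(-1 + 0.1 * i, 1) for i in range(21)]
best_lam = max(grid, key=l_max)
print(best_lam)
```

The grid maximum lands near 0, consistent with Fig. 5's estimate of −0.1 with an approximate 95% confidence interval of [−0.4, +0.1], which is what justifies a log transformation.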




Still, if the theoretical distribution of the sample mean is roughly normal–if the Central Limit Theorem holds–then the 95% confidence interval [4.74, 15.33] mg/l will be meaningful. But we have an intractable problem: we have no way of knowing if the sample size of 40 is big enough for the theoretical distribution of the sample mean to be roughly normal.

Table 2. C-reactive protein data

Observations, n = 40

   y      w        y      w        y      w        y      w
   0     0       73.20  4.307     0     0       15.74  2.818
   3.90  1.589    0     0         0     0        0     0
   5.64  1.893   46.70  3.865     4.81  1.760    0     0
   8.22  2.221    0     0         9.57  2.358    0     0
   0     0        0     0         5.36  1.850    0     0
   5.62  1.890   26.41  3.311     0     0        9.37  2.339
   3.92  1.593   22.82  3.171     5.66  1.896   20.78  3.081
   6.81  2.055    0     0         0     0        7.10  2.092
   0     0       30.61  3.453    59.76  4.107    7.89  2.185
   0     0        3.49  1.502    12.38  2.594    5.53  1.876

y, Actual C-reactive protein observations (in mg/l) (26). The average, standard deviation, and standard error of these observations are Ave{y} = 10.03, SD{y} = 16.56, and SE{ȳ} = 2.62. The transformed observations, w, result from ln (y + 1). The average, standard deviation, and standard error of the transformed observations are Ave{w} = 1.495, SD{w} = 1.391, and SE{w̄} = 0.220. For each set of observations, if we want to calculate a 95% confidence interval for the true mean, then α = 0.05, the degrees of freedom v = 39, and t_{α/2,v} = 2.023 (see Ref. 4).

Problem 7.27 (Ref. 26, p. 443) tells students a log transformation¹⁰ decreases skewness and asks them to transform the actual C-reactive protein observations by adding 1 to each observation and taking the logarithm of that number:

w = ln (y + 1).

Problem 7.27 then asks students to study the distribution of the 40 transformed observations and to calculate a 95% confidence interval for the true transformed C-reactive protein mean.

A histogram and normal quantile plot show that the transformed values are less skewed but still inconsistent with a normal distribution (Fig. 6, bottom). As my students tell me, "Better, but not great." The 95% confidence interval [1.05, 1.94] reverts to

[e^1.05 − 1, e^1.94 − 1] ≐ [1.86, 5.96] mg/l.

At this point, we have another problem: we have assumed the log transformation ln (y + 1) is useful–that it gives us a meaningful confidence interval–but what evidence do we have that it really is? You guessed it: the bootstrap.

A bootstrap distribution estimates the theoretical distribution of some sample statistic. In this problem, the bootstrap sample means from the actual observations are inconsistent with a normal distribution (Fig. 7, top). This means the sample size of 40 is not big enough for the theoretical distribution of the sample mean to be roughly normal; as a result, the confidence interval [4.74, 15.33] mg/l is misleading. On the other hand, the bootstrap sample means from the transformed C-reactive protein observations are consistent with a normal distribution (Fig. 7, bottom): the confidence interval [1.05, 1.94], which reverts to [1.86, 5.96] mg/l, is a useful tool for inference.

Fig. 6. Distributions (left) and normal quantile plots (right) of the actual and transformed C-reactive protein observations. The transformation changed the distribution of the observations, but the transformed values remain inconsistent with a normal distribution. The commands in lines 232–252 of Advances_Statistics_Code_Boot.R create this data graphic. To generate this data graphic, highlight and submit the lines of code from Figure 6: first line to Figure 6: last line.

Fig. 7. Bootstrap distributions (left) and normal quantile plots (right) of 10,000 sample means, each with 40 observations. The bootstrap replications were drawn at random from the actual or transformed C-reactive protein observations in Table 2. The bootstrap sample means from the actual observations are inconsistent with a normal distribution. In contrast, the bootstrap sample means from the transformed observations are consistent with a normal distribution. The commands in lines 259–279 of Advances_Statistics_Code_Boot.R create this data graphic. To generate this data graphic, highlight and submit the lines of code from Figure 7: first line to Figure 7: last line.

¹⁰ The Box and Cox method identifies a log transformation of the C-reactive protein values as the optimal transformation (see Fig. 5).
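This bootstrap check of the transformation can be sketched numerically: bootstrap the mean of the actual and of the transformed observations, then ask which set of bootstrap means looks more normal. Lacking a graphics device for quantile plots, the Python sketch below uses sample skewness as a crude stand-in for a normal quantile plot (an assumption of the sketch, not the article's method), and back-transforms the interval [1.05, 1.94]:

```python
import math
import random
import statistics

random.seed(1)

# C-reactive protein observations (mg/l) from Table 2
y = [0, 3.90, 5.64, 8.22, 0, 5.62, 3.92, 6.81, 0, 0,
     73.20, 0, 46.70, 0, 0, 26.41, 22.82, 0, 30.61, 3.49,
     0, 0, 4.81, 9.57, 5.36, 0, 5.66, 0, 59.76, 12.38,
     15.74, 0, 0, 0, 0, 9.37, 20.78, 7.10, 7.89, 5.53]
w = [math.log(v + 1) for v in y]   # the transformed observations

def boot_means(data, b=10_000):
    """b bootstrap replications of the sample mean."""
    return [statistics.fmean(random.choices(data, k=len(data))) for _ in range(b)]

def skewness(x):
    """Sample skewness: 0 for a symmetric (e.g., normal) distribution."""
    m = statistics.fmean(x)
    s = statistics.pstdev(x)
    n = len(x)
    return sum((v - m) ** 3 for v in x) / (n * s ** 3)

skew_raw = skewness(boot_means(y))   # clearly right-skewed
skew_log = skewness(boot_means(w))   # much closer to 0

# Back-transform the 95% CI for the transformed mean, [1.05, 1.94], to mg/l
lci, uci = math.exp(1.05) - 1, math.exp(1.94) - 1
print(round(skew_raw, 2), round(skew_log, 2))
print(round(lci, 2), round(uci, 2))
```

The bootstrap means of the raw values carry visible right skew, echoing Fig. 7 (top), while those of the transformed values sit near zero skew, echoing Fig. 7 (bottom); the back-transformed interval reproduces [1.86, 5.96] mg/l.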




Summary

As this exploration has demonstrated, the bootstrap gives us an approach we can use to assess whether an inference we make from a normal-theory hypothesis test or confidence interval is justified: if the bootstrap distribution of a statistic such as the sample mean is roughly normal, then the inference is justified. But the bootstrap gives us more than that. We can use a bootstrap confidence interval to make an inference about some experimental result when the statistical theory is uncertain or even unknown (14, 26). Although we have explored the bootstrap using the sample mean, we can use the bootstrap to make an inference about other sample statistics such as the standard deviation or correlation coefficient.

In the next installment of this series, we will explore power, a concept we mentioned in our exploration of hypothesis tests (5). Power is the probability that we reject the null hypothesis given that the null hypothesis is false. The notion of power is integral to hypothesis testing, confidence intervals, and grant applications.

APPENDIX

In 1993, Efron and Tibshirani (16) wrote that bootstrap, a term coined originally by Efron in 1979 (11), was inspired by the colloquialism pull yourself up by your bootstraps, a nonsensical notion generally attributed to an escapade of the fictitious Baron Munchausen (30). When I learned this, I embarked on a search for the escapade. With the generous assistance of Dr. Sarah Kareem (Department of English, University of California, Los Angeles, CA) and Bernhard Wiebel (Munchausen Library, Zurich, Switzerland), this is what I discovered.

The complete title of the Baron's adventures (30), written by Rudolf Erich Raspe, was Baron Munchausen's Narrative of His Marvellous Travels and Campaigns in Russia. Humbly Dedicated and Recommended to Country Gentlemen; and, If They Please, to be Repeated as Their Own, After a Hunt at Horse Races, in Watering-Places, and Other Such Polite Assemblies; Round the Bottle and Fire-Side.

In chapter VI, the Baron throws a silver hatchet at two bears in hopes of rescuing a bee.¹¹ The hatchet misses both bears and ends up on the moon. To retrieve the hatchet, the Baron grows and promptly climbs a Turkey-bean after it had attached itself to the moon. He finds his hatchet but then discovers the bean has dried up, so he braids a rope of straw to use for the descent. The Baron is partway down the straw rope when the next catastrophe hits. The details of the Baron's escape differ according to the account you happen to read:

   I was still a couple of miles in the clouds when it broke, and with such violence I fell to the ground that I found myself stunned, and in a hole nine fathoms under grass, when I recovered, hardly knowing how to get out again. There was no other way than to go home for a spade and to dig me out by slopes, which I fortunately accomplished, before I had been so much as missed by the steward.
   Refs. 30 (1785), 33 (1948), and 34 (1952)

   I was four or five miles from the earth at least, when it broke; I fell to the ground with such amazing violence, that I found myself stunned, and in a hole nine fathoms deep at least, made by the weight of my body falling from so great a height: I recovered, but knew not how to get out again; however, I dug slopes or steps with my [finger] nails (the Baron's nails were then of forty years' growth), and easily accomplished it.
   Refs. 31 (1786) and 32 (2001)

It was from this 9-fathom-deep hole that the Baron is rumored to have extricated himself by his bootstraps. Although the notion of pulling yourself up by your bootstraps is entirely consistent with Baron Munchausen's flair for the dramatic, I failed to find any evidence that the Baron availed himself of this life-saving technique.

¹¹ It is a long story.

ACKNOWLEDGMENTS

I thank John Ludbrook (Department of Surgery, The University of Melbourne, Victoria, Australia) and Matthew Strand (National Jewish Health, Denver, CO) for giving helpful comments and suggestions, Sarah Kareem (Department of English, University of California, Los Angeles, CA) for humoring my questions and for graciously providing a lot of information about Baron Munchausen, and Bernhard Wiebel (Munchausen Library, Zurich, Switzerland) for searching more than 80 English editions of the Baron's adventures for mention of bootstraps.

REFERENCES

1. Box GEP, Cox DR. An analysis of transformations. J R Stat Soc Ser B 26: 211–243, 1964.
2. Carra J, Candau R, Keslacy S, Giolbas F, Borrani F, Millet GP, Varray A, Ramonatxo M. Addition of inspiratory resistance increases the amplitude of the slow component of O2 uptake kinetics. J Appl Physiol 94: 2448–2455, 2003.
3. Curran-Everett D. Explorations in statistics: standard deviations and standard errors. Adv Physiol Educ 32: 203–208, 2008.
4. Curran-Everett D. Explorations in statistics: confidence intervals. Adv Physiol Educ 33: 87–90, 2009.
5. Curran-Everett D. Explorations in statistics: hypothesis tests and P values. Adv Physiol Educ 33: 81–86, 2009.
6. Curran-Everett D, Benos DJ. Guidelines for reporting statistics in journals published by the American Physiological Society. Physiol Genomics 18: 249–251, 2004.
7. Curran-Everett D, Taylor S, Kafadar K. Fundamental concepts in statistics: elucidation and illustration. J Appl Physiol 85: 775–786, 1998.
8. DiCiccio TJ, Efron B. Bootstrap confidence intervals. Stat Sci 11: 189–228, 1996.
9. Draper NR, Smith H. Applied Regression Analysis (2nd ed.). New York: Wiley, 1981, p. 225–227.
10. Durheim MT, Slentz CA, Bateman LA, Mabe SK, Kraus WE. Relationships between exercise-induced reductions in thigh intermuscular adipose tissue, changes in lipoprotein particle size, and visceral adiposity. Am J Physiol Endocrinol Metab 295: E407–E412, 2008.
11. Efron B. Bootstrap methods: another look at the jackknife. Ann Stat 7: 1–26, 1979.
12. Efron B. Better bootstrap confidence intervals. J Am Stat Assoc 82: 171–185, 1987.
13. Efron B. Discussion: theoretical comparison of bootstrap confidence intervals. Ann Stat 16: 969–972, 1988.
14. Efron B. The bootstrap and modern statistics. J Am Stat Assoc 95: 1293–1296, 2000.
15. Efron B. Second thoughts on the bootstrap. Stat Sci 18: 135–140, 2003.
16. Efron B, Tibshirani RJ. An Introduction to the Bootstrap. New York: Chapman & Hall, 1993.
17. Esbaugh AJ, Perry SF, Gilmour KM. Hypoxia-inducible carbonic anhydrase IX expression is insufficient to alleviate intracellular metabolic acidosis in the muscle of zebrafish, Danio rerio. Am J Physiol Regul Integr Comp Physiol 296: R150–R160, 2009.
18. Fanini A, Assad JA. Direction selectivity of neurons in the macaque lateral intraparietal area. J Neurophysiol 101: 289–305, 2009.
19. Gillis TE, Marshall CR, Tibbits GF. Functional and evolutionary relationships of troponin C. Physiol Genomics 32: 16–27, 2007.




20. Grenier AL, Abu-ihweij K, Zhang G, Ruppert SM, Boohaker R, Slepkov ER, Pridemore K, Ren JJ, Fliegel L, Khaled AR. Apoptosis-induced alkalinization by the Na+/H+ exchanger isoform 1 is mediated through phosphorylation of amino acids Ser726 and Ser729. Am J Physiol Cell Physiol 295: C883–C896, 2008.
21. Kelley GA, Kelley KS, Tran ZV. Exercise and bone mineral density in men: a meta-analysis. J Appl Physiol 88: 1730–1736, 2000.
22. Kristensen M, Hansen T. Statistical analyses of repeated measures in physiological research: a tutorial. Adv Physiol Educ 28: 2–14, 2004.
23. Lemley KV, Boothroyd DB, Blouch KL, Nelson RG, Jones LI, Olshen RA, Myers BD. Modeling GFR trajectories in diabetic nephropathy. Am J Physiol Renal Physiol 289: F863–F870, 2005.
24. Lott MEJ, Hogeman C, Herr M, Bhagat M, Sinoway LI. Sex differences in limb vasoconstriction responses to increases in transmural pressures. Am J Physiol Heart Circ Physiol 296: H186–H194, 2009.
25. Miller RG. The jackknife–a review. Biometrika 61: 1–15, 1974.
26. Moore DS, McCabe GP, Craig BA. Introduction to the Practice of Statistics (6th ed.). New York: Freeman, 2009.
27. Moses LE. Think and Explain with Statistics. Reading, MA: Addison-Wesley, 1986.
28. Quenouille M. Approximate tests of correlation in time-series. J R Stat Soc Ser B 11: 68–84, 1949.
29. R Development Core Team. R: a Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing, 2008; http://www.R-project.org.
30. Raspe RE. Baron Munchausen's Narrative of his Marvellous Travels and Campaigns in Russia. Oxford: Smith, 1785, p. 45.
31. Raspe RE. Gulliver Revived; or The Singular Travels, Campaigns, Voyages, and Adventures of Baron Munikhouson, commonly called Munchausen. London: Kearsley, 1786, p. 45.
32. Raspe RE. The Surprising Adventures of Baron Munchausen. Rockville, MD: Wildside, 2001, p. 58–59.
33. Raspe RE. Singular Travels, Campaigns and Adventures of Baron Munchausen, edited by Carswell JP, others. London: Cresset, 1948, p. 22.
34. Raspe RE. The Singular Adventures of Baron Munchausen, edited by Carswell JP, others. New York: Heritage, 1952, p. 18.
35. Rozengurt N, Wu SV, Chen MC, Huang C, Sternini C, Rozengurt E. Colocalization of the α-subunit of gustducin with PYY and GLP-1 in L cells of human colon. Am J Physiol Gastrointest Liver Physiol 291: G792–G802, 2006.
36. Tukey J. Bias and confidence in not-quite large samples (abstract). Ann Math Statistics 29: 614, 1958.
37. Yoshimura H, Jones KA, Perkins WJ, Kai T, Warner DO. Calcium sensitization produced by G protein activation in airway smooth muscle. Am J Physiol Lung Cell Mol Physiol 281: L631–L638, 2001.

