McElduff F, Cortina-Borja M, Chan SK, Wade A. When t-tests or Wilcoxon-Mann-Whitney tests won’t do. Adv Physiol Educ 34: 128–133, 2010; doi:10.1152/advan.00017.2010. t-Tests are widely used by researchers to compare the average values of a numeric outcome between two groups. If there are doubts about the suitability

test: a general form and the location-shift model, both of which can be problematic when skewed discrete data are analyzed. The general form of the WMW test assesses the probability (P) that any observation drawn at random from the first group (X) will exceed any observation from the second group (Y) i.e.,
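As a rough illustration of the quantity the general form of the WMW test is concerned with, the sketch below estimates P(X > Y) by comparing every observation in one group with every observation in the other, and also reports the share of tied pairs. The function name and the two small samples are hypothetical and are not the study data.

```python
from itertools import product


def prob_x_exceeds_y(x, y):
    """Estimate P(X > Y) and P(X = Y) over all cross-group pairs."""
    pairs = list(product(x, y))
    greater = sum(1 for a, b in pairs if a > b)
    ties = sum(1 for a, b in pairs if a == b)
    return greater / len(pairs), ties / len(pairs)


# Hypothetical count data for two groups (illustration only).
x = [0, 0, 1, 2, 5]
y = [0, 0, 0, 1, 1]

p_greater, p_tie = prob_x_exceeds_y(x, y)
print(f"P(X > Y) = {p_greater:.2f}, P(X = Y) = {p_tie:.2f}")
```

With discrete data containing many zeros, a sizeable share of the pairs is tied, which is why ties need explicit handling when this probability is interpreted.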
This is a training article to illustrate common misconceptions about the t-test and WMW test and to propose the use of discrete regression modeling as an alternative. The data set we used represents data resulting from a typical experiment found in research. It provides a simple example of a situation where the assumptions of the commonly applied t-test and WMW test are violated. Although these tests are clearly inappropriate for the real-life data presented here, we include them for illustrative purposes.

METHODS

Data. The example data set originated from a study (3a) investigating the effect of a corticosteroid on cyst formation in mice fetuses

to produce cysts and those kidneys that can produce cysts but haven’t during this experiment.

Model parameters. Parameter estimates in linear regression give the difference in means between the groups. In contrast, the Poisson and negative binomial models yield rate ratios (RRs), which estimate the change in the relative (rather than absolute) mean number of events between the groups. RRs can be expressed in different ways. For example, RR = 1.25 indicates that the mean in one group is, on average, 1.25 times higher or, alternatively, that there is a 25% increase in one group compared with the other. On the other hand, RR = 0.83 indicates a 17% decrease in one group compared with the other.

Zero-inflated regression models additionally yield the relative odds of the proportions of zeros in each group. The odds and probability are calculated by dividing the number of events by the number without
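The two ways of reading a rate ratio described under "Model parameters" amount to simple arithmetic, sketched below. The helper names are hypothetical. The odds and probability helper uses the standard definitions (odds = events / non-events, probability = events / total), and the counts passed to it are made up for illustration.

```python
def rr_to_percent_change(rr):
    """Express a rate ratio as a percent change in the mean count."""
    return (rr - 1.0) * 100.0


def odds_and_probability(events, non_events):
    """Return (odds, probability) for a number of events and non-events."""
    odds = events / non_events
    probability = events / (events + non_events)
    return odds, probability


# The two rate ratios used as examples in the text.
increase = rr_to_percent_change(1.25)   # about +25%
decrease = rr_to_percent_change(0.83)   # about -17%

# Hypothetical counts: 3 units with the event, 1 without.
odds, probability = odds_and_probability(3, 1)
print(f"{increase:+.0f}%, {decrease:+.0f}%, odds={odds}, p={probability}")
```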
cyst counts; hence, the distributions were highly skewed to the right and zero inflated. The overall mean number of cysts was 0.87 (SD: 2.28), indicating overdispersion since the variance (2.28² ≈ 5.20) is almost six times larger than the mean. The values showed a larger number of kidneys with low counts in the control group. The steroid-treated group had 1 kidney with 19 cysts, which was much higher than the maximum number of cysts found in the group of control kidneys (maximum: 3). Not surprisingly, the Shapiro-Wilk test for normality gave P < 0.01, indicating that the counts did not follow a normal distribution. Similarly, Bartlett’s F-test for homogeneity of variances yielded P < 0.01, which implied a significant difference in the variances between the two groups. Note that this test assumes normality. The Ansari-Bradley test for equal scale parameters (9) also gave P < 0.01, indicating a significant difference in dispersion between the groups.

Table 1 shows summary statistics for the steroid-treated and control groups. The means were 1.55 and 0.15 for the steroid-treated and control groups, with SDs of 2.98 and 0.51, respectively. The medians for the two groups were both 0, with the steroid-treated group having an interquartile range of 2 and the control group an interquartile range of 0. The mean and SD in the steroid-treated group were much higher than those in the control group, although the medians were equal.

Table 1. Summary statistics for counts of cysts for the steroid-treated and control groups

                                               Steroid-Treated Group   Control Group
Mean                                           1.55                    0.15
SD                                             2.98                    0.51
Median                                         0                       0
Interquartile range (quartile 1, quartile 3)   2 (0, 2)                0 (0, 0)
Minimum                                        0                       0
Maximum                                        19                      3
Percent zeroes                                 58.56                   91.26

Tests of comparisons for the two groups. Clearly, the assumptions for the application of a two-sample t-test were not met, but we present the results purely for comparative purposes. A t-test yielded a highly significant difference between the means of 1.40 [95% CI: (0.82, 1.99), P < 0.01]. The WMW test under the general null hypothesis P(X > Y) = 0.5 produced P < 0.01, similarly indicating a significant difference. A median difference of 0 and a value of 0 for both the upper and lower 95% CI for the median difference were also calculated, and this contrasts with the results from the WMW test, where a significant difference was detected. A comparison of the number of kidneys with zero counts showed that there were significantly more (χ²-statistic = 28.23, degrees of freedom: 1, P < 0.01) kidneys with zero counts in the control group than in the steroid-treated group [percent difference: 32.7%, 95% CI: (21.1%, 44.3%)].

Regression models. Table 2 shows the parameter estimates, CIs, P values, and BIC for each of the regression models described above. As expected, linear regression gave the same information as the t-test but additionally yielded a BIC value that could be compared with those obtained from other regression models. The Poisson model showed that the average number of cysts in kidneys in the steroid-treated group was almost 11 times [RR: 10.64, 95% CI: (6.28, 18.03), P < 0.01] higher than in kidneys not treated with steroids.

The negative binomial model gave the same rate ratio as the Poisson model but with larger SEs, resulting in a wider CI than the Poisson model. As expected, the overdispersion parameter

the BIC values clearly indicated that the negative binomial model resulted in an improvement in fit to the data (450.68 compared with 667.07 for the Poisson model).

For the ZIP model, the zero inflation parameter was significant (model 4: P = 0.04), and the odds of a kidney having zero cysts in the steroid-treated group were 0.20 times [95% CI (0.08, 0.51), P < 0.01] those in the control group (model 5). Given that there are cysts, the rate was over three times [RR: 3.23, 95% CI: (1.51, 6.94), P < 0.01] higher on average for kidneys in the steroid-treated group compared with the control group. The additional parameter to model the zero-inflated component in the Poisson model reduced the BIC values greatly (507.94) and even further when the variation between zero inflation of the steroid-treated and control groups was included (505.98), with both providing a
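The comparison of zero counts between the groups can be checked by hand from the group sizes and the numbers of kidneys with zero cysts (94 of 103 controls, 65 of 111 steroid-treated, consistent with the percentages in Table 1). The sketch below assumes a continuity-corrected (Yates) chi-square and a continuity-corrected Wald interval for the difference in proportions. The text does not state which variant was used; this choice is an assumption that happens to agree with the reported figures. Function names are hypothetical.

```python
import math


def yates_chi_square(a, b, c, d):
    """Continuity-corrected chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def diff_in_proportions_ci(x1, n1, x2, n2, z=1.96):
    """Difference p1 - p2 with a continuity-corrected Wald 95% interval."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    half_width = z * se + 0.5 * (1 / n1 + 1 / n2)  # assumed correction term
    return diff, diff - half_width, diff + half_width


control_zero, control_n = 94, 103
steroid_zero, steroid_n = 65, 111

chi2 = yates_chi_square(control_zero, control_n - control_zero,
                        steroid_zero, steroid_n - steroid_zero)
diff, lower, upper = diff_in_proportions_ci(control_zero, control_n,
                                            steroid_zero, steroid_n)
print(f"chi-square = {chi2:.2f}")
print(f"difference = {100 * diff:.1f}% "
      f"({100 * lower:.1f}%, {100 * upper:.1f}%)")
```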
Table 2. Estimates, 95% confidence intervals, P values, and BIC values for the various models for the control and steroid-treated groups

Estimate   95% Confidence Intervals for the Estimate   P Value   BIC Value
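Because a lower BIC indicates a better trade-off between fit and complexity, the BIC values quoted in the surrounding text can be ranked directly. The sketch below does only that. The dictionary labels are shorthand introduced here, and the BIC for the linear model is omitted because it is not quoted in the text.

```python
# BIC values as quoted in the text (linear regression BIC not quoted).
bic = {
    "Poisson": 667.07,
    "ZIP, constant zero inflation": 507.94,
    "ZIP, zero inflation by group": 505.98,
    "negative binomial": 450.68,
    "ZINB, constant zero inflation": 457.80,
    "ZINB, zero inflation by group": 455.20,
}

# Rank models from best (lowest BIC) to worst.
ranking = sorted(bic, key=bic.get)
best = ranking[0]

for name in ranking:
    print(f"{name:32s} BIC = {bic[name]:7.2f}  "
          f"(difference from best: {bic[name] - bic[best]:6.2f})")
```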
The basic negative binomial model (model 3) gave a better fit than any of the Poisson models (BIC: 450.68), with significant overdispersion, as expected. The constant zero inflation term (model 6) was nonsignificant, and the model that allowed zero inflation to vary according to group showed the difference to be almost significant [odds ratio: 0.18, 95% CI (0.03, 1.02), P = 0.05]. However, increasing the complexity of the negative binomial model resulted in a higher BIC value (455.20 and 457.80 for ZINB with and without zero inflation varying by group) as it was penalized by the incorporation of the additional parameters, indicating models that do not provide a more parsimonious fit to the data.

Comparison of regression models. The observed and predicted numbers of cysts for models 1–3, 5, and 7 are shown in

tion can result in a misleading difference between means being detected. Value inflation (namely, zero inflation) can lead to a high proportion of tied values in the location-shift version of the WMW test, resulting in incorrect conclusions being drawn from P values. In these cases, tests may lead to conflicting results and may not agree with observations made from exploratory data analyses where the medians do not indicate a difference between groups. While an advantage of t-tests and WMW tests is their simplicity and ease of application, they are greatly disadvantaged when applied to count data due to these erroneous assumptions (14). It can be shown using simulation that the WMW test is particularly vulnerable to zero inflation, especially when the probability of belonging to the zero class is >0.75, and less so to overdispersion. More details
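To give a sense of how zero inflation produces ties, the sketch below computes the share of all cross-group pairs of kidneys in which both counts are zero, using the zero counts implied by Table 1 (65 of 111 steroid-treated and 94 of 103 control kidneys). This counts ties at zero only, so it is a lower bound on the overall proportion of tied pairs. The function name is hypothetical.

```python
def share_of_pairs_tied_at_zero(zeros_x, n_x, zeros_y, n_y):
    """Fraction of all (x, y) pairs in which both observations are zero."""
    return (zeros_x * zeros_y) / (n_x * n_y)


# Zero counts and group sizes for the steroid-treated and control groups.
share = share_of_pairs_tied_at_zero(65, 111, 94, 103)
print(f"{100 * share:.1f}% of cross-group pairs are tied at zero")
```

Under these counts, a little over half of all pairwise comparisons between the groups are ties at zero.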
Table 3. Observed and fitted numbers of cysts for the control and steroid-treated groups
Frequency of Cysts
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Control group
Observed numbers 94 4 4 1
Fitted numbers
Model 1: normal 77 20
Model 2: Poisson 89 13 1
Model 3: negative binomial 94 4 2 1 1
Model 5: zero-inflated Poisson 94 5 3 1
Model 7: zero-inflated negative binomial 94 6 2 1
Steroid-treated group
Observed numbers 65 14 10 6 4 2 2 2 1 1 1 2 1
Fitted numbers
Model 1: normal 13 15 15 13 11 8 5 3 1 1
Model 2: Poisson 24 37 28 15 6 2
Model 3: negative binomial 64 17 9 6 4 3 2 2 1 1 1 1
Model 5: zero-inflated Poisson 65 5 8 10 9 7 4 2 1
Model 7: zero-inflated negative binomial 65 14 9 6 4 3 2 2 1 1 1 1
n = 103 mice in the control group and 111 mice in the steroid-treated group.
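The Poisson rows of Table 3 can be approximated by hand. With a single binary covariate, the fitted Poisson mean in each group equals that group's sample mean, so the expected frequencies are n times the Poisson probabilities. The sketch below uses the rounded group means from Table 1 (0.15 and 1.55) and the group sizes above. Using rounded means is an assumption made here, so small discrepancies from the published fitted values would not be surprising. The function name is hypothetical.

```python
import math


def poisson_fitted_counts(mean, n, max_count):
    """Expected number of units with 0..max_count events under Poisson(mean)."""
    return [
        round(n * math.exp(-mean) * mean ** k / math.factorial(k))
        for k in range(max_count + 1)
    ]


# Rounded group means from Table 1; group sizes from the table footnote.
control = poisson_fitted_counts(mean=0.15, n=103, max_count=3)
steroid = poisson_fitted_counts(mean=1.55, n=111, max_count=6)

print("Control, counts 0-3:", control)
print("Steroid, counts 0-6:", steroid)
```

The resulting frequencies fall well short of the observed numbers of zeros in both groups, which is the lack of fit that the zero-inflated and negative binomial models address.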
uous (as opposed to the single binary covariate, steroid/control, used in our example data set). In this example, the number of groups could be extended to include more than one type of steroid or a continuous variable, e.g., the growth of the kidneys over the number of days cultured. Interactions between covariates can also be investigated in the models, e.g., we could look at the interaction between the growth of kidneys and steroid-treated groups. A further use of discrete regression is in modeling rates. For counts of events measured in a certain time period, where the time period may vary across observations, dividing the counts of events by the length of a time period gives a rate. In discrete regression modeling, an offset, which has no parameter estimates or P values, can be included to adjust for the different time periods.

analysing discrete data that avoids making erroneous assumptions and facilitates comparisons between models.

ACKNOWLEDGMENTS

The authors are grateful to Adrian Woolf and David Long for their helpful comments on a previous version of this article.

GRANTS

The Institute of Child Health of University College London receives a portion of funding from the Department of Health’s National Institute of Health Research Biomedical Research Centres funding scheme. The Centre for Paediatric Epidemiology and Biostatistics also benefits from funding support from the Medical Research Council (MRC) in its capacity as the MRC Centre of Epidemiology for Child Health (G0400546). F. McElduff acknowledges the support of an MRC Capacity Building Studentship. S.-K. Chan is funded by Kidney Research UK.