
letter

Meta-analysis of genetic association studies supports a contribution of common variants to susceptibility to common disease
Kirk E. Lohmueller1,2, Celeste L. Pearce3, Malcolm Pike3, Eric S. Lander1,4 & Joel N. Hirschhorn1,5,6
©2003 Nature Publishing Group http://www.nature.com/naturegenetics

Published online 13 January 2003; doi:10.1038/ng1071

Association studies offer a potentially powerful approach to identify genetic variants that influence susceptibility to common disease (refs 1–4), but are plagued by the impression that they are not consistently reproducible (refs 5,6). In principle, the inconsistency may be due to false positive studies, false negative studies or true variability in association among different populations (refs 4–8). The critical question is whether false positives overwhelmingly explain the inconsistency. We analyzed 301 published studies covering 25 different reported associations. There was a large excess of studies replicating the first positive reports, inconsistent with the hypothesis of no true positive associations (P < 10⁻¹⁴). This excess of replications could not be reasonably explained by publication bias and was concentrated among 11 of the 25 associations. For 8 of these 11 associations, pooled analysis of follow-up studies yielded statistically significant replication of the first report, with modest estimated genetic effects. Thus, a sizable fraction (but under half) of reported associations have strong evidence of replication; for these, false negative, underpowered studies probably contribute to inconsistent replication. We conclude that there are probably many common variants in the human genome with modest but real effects on common disease risk, and that studies using large samples will convincingly identify such variants.

From 166 frequently studied associations between common variants and common diseases (ref. 9), we selected a subset of 25 associations for meta-analysis (Table 1). After excluding the first positive reports (defined as the earliest study to achieve P < 0.05), we identified 301 additional published studies across these 25 associations, or approximately 12 potential replication studies per association. We considered the hypothesis that the original reported associations are largely false positive reports. In the absence of publication bias, this hypothesis predicts that roughly 5% of these 301 studies should be statistically significant at P < 0.05, approximately 1% of studies should be significant at P < 0.01 and so on. Of the 301 studies, however, 59 achieved P < 0.05 (versus 15 such studies expected by chance), 26 studies had P < 0.01 (versus 3 expected), and 10 studies had P < 0.001 (versus <1 expected). These rates are in vast excess of those expected by chance under the hypothesis of no true associations (P < 10⁻¹⁴ for the 0.05 and 0.01 significance thresholds and P < 10⁻¹² for the 0.001 threshold; P values determined using the appropriate Poisson distribution for each threshold).

This result could, in theory, be explained by publication bias, the preferential publication of studies that achieve statistical significance and non-publication of studies that do not. But several lines of evidence suggest that this is not the case. First, 47 of 59 studies with P < 0.05 showed an effect in the same direction as that shown in the original study (inconsistent with the random directionality predicted under the scenario of no true associations; P < 10⁻⁵ by sign test). Second, we calculated the number of hypothetical unpublished non-significant studies that would be required to explain this large excess of replications. The likelihood that any given follow-up study would by chance both achieve significance at P < 0.05 and show an effect in the same direction as the original report is 1 in 40. Thus, to account for the observed 47 replications, they would need to have been drawn from a pool of studies forty times larger, that is, from 1,880 total published and unpublished studies (95% confidence interval (c.i.) = 1,381–2,501). As there are only 301 published studies, we must postulate the existence of an additional 1,579 unpublished, non-significant studies, or an average of 63 unpublished studies for each association (95% c.i. = 43–88). Given the relative ease with which even fairly small non-significant association studies have been published (especially if the association in question is controversial), and because an average of only 13 total studies were actually published per association, this degree of publication bias seems implausible.

Similarly, we were able to estimate the actual amount of publication bias in our sample by considering the number of studies that achieved P < 0.05 but showed an effect in the opposite direction to that shown in the original report. There were 12 such studies; as above, these should also occur by chance once in 40 times, and therefore be drawn from a pool of 480 studies (95% c.i. = 250–839). As there are only 301 published studies, this corresponds to 179 unpublished studies, or roughly 7 unpublished studies per association (95% c.i. = 0–21). This small degree of publication bias is plausible but also insufficient to explain the large number of replications we observed. Finally, we carried out ‘funnel plot’ analysis (ref. 10), a graphical test for publication bias, and found that only three associations showed significant evidence of publication bias (Table 1 and Fig. 1). Thus, the excess of replications we observed cannot be explained by publication bias.
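The arithmetic behind the excess of significant studies, the sign test and the implied pool of unpublished studies can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the function names are hypothetical, the expected counts are taken as the threshold multiplied by 301, and the sign test is computed two-sided, which the text does not specify. The results should be read as agreeing with the reported values in order of magnitude only.

```python
import math


def poisson_upper_tail(observed: int, expected: float, terms: int = 500) -> float:
    """P(X >= observed) for X ~ Poisson(expected).

    The tail is summed directly (rather than as 1 - CDF) so that very
    small probabilities are not lost to floating-point rounding.
    """
    total = 0.0
    for k in range(observed, observed + terms):
        log_pmf = -expected + k * math.log(expected) - math.lgamma(k + 1)
        total += math.exp(log_pmf)
    return total


def sign_test_two_sided(successes: int, n: int) -> float:
    """Exact two-sided sign test against a 50:50 split (assumed two-sided)."""
    extreme = max(successes, n - successes)
    upper = sum(math.comb(n, k) for k in range(extreme, n + 1)) / 2 ** n
    return min(1.0, 2 * upper)


def implied_pool(observed: int, chance_rate: float, published: int, associations: int):
    """Total studies needed to yield `observed` hits at `chance_rate`,
    the unpublished remainder, and the unpublished count per association."""
    total = round(observed / chance_rate)
    unpublished = total - published
    return total, unpublished, unpublished / associations


N_STUDIES = 301  # follow-up studies, from the text
for alpha, observed in [(0.05, 59), (0.01, 26), (0.001, 10)]:
    expected = alpha * N_STUDIES
    print(alpha, observed, round(expected, 2), poisson_upper_tail(observed, expected))

print(sign_test_two_sided(47, 59))               # same-direction studies among P < 0.05
print(implied_pool(47, 1 / 40, N_STUDIES, 25))   # replications in the same direction
print(implied_pool(12, 1 / 40, N_STUDIES, 25))   # significant studies, opposite direction
```

The last two calls reproduce the pool sizes quoted above (1,880 and 480 studies) and the corresponding unpublished totals (1,579 and 179); the confidence intervals in the text are not reproduced here.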

1Whitehead/Massachusetts Institute of Technology Center for Genome Research, Cambridge, Massachusetts 02139, USA. 2Georgetown University, Washington, DC 20057, USA. 3Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, California, USA. 4Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA. 5Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA. 6Divisions of Genetics and Endocrinology, Enders 561, Children’s Hospital, 300 Longwood Avenue, Boston, Massachusetts 02115, USA. Correspondence should be addressed to J.N.H. (e-mail: joel.hirschhorn@tch.harvard.edu).

nature genetics • volume 33 • february 2003 177


Notably, only 11 of the 25 associations had multiple replications of the original report, and these account for 44 of the 47 replications (Table 1). This degree of clustering was non-random (P = 0.002 by simulation). Thus, over half of reported associations have little or no evidence for replication, but a sizable fraction of associations (11 of 25 in this analysis) have been replicated far more frequently than would be expected by chance.

We next explored the possible reasons for inconsistency in replication, particularly among the 11 associations with multiple replications and non-replications. One possibility is that associations are specific to certain populations. Alternatively, false positives due to ethnic admixture and population stratification (described in ref. 1, for example) could contribute to inconsistency. Finally, studies with small sample sizes could lack power to reliably detect modest genetic effects, leading to intermittent replication.

If the inconsistency in replication were due to population-specific effects for some associations, one should expect heterogeneity across studies (true association in some populations and no association in others). Ten associations showed significant heterogeneity. In six of these cases, however, the evidence for heterogeneity disappeared after we removed either a single outlying study or outliers representing fewer than 5% of studies (Table 2). Thus, for most associations, there is only limited evidence for heterogeneity across

Table 1 • Number of statistically significant follow-up studies for 25 associations

Columns: Total, number of studies; P<0.05, number of studies with P < 0.05; Same, number of studies with P < 0.05 in the same direction as the original report; Opp., number of studies with P < 0.05 in the opposite direction to the original report; Bias P, P value for publication bias; First report, citation for first positive report.

Associated gene, phenotype      Variant (associated allele)     Total  P<0.05  Same  Opp.  Bias P  First report
ABCC8, type 2 diabetes          Intron 24 –3T/C (C)             9      2       1     1     0.11    ref. 19
ABCC8, type 2 diabetes          Exon 22 C/T (T)                 4      2       2     0     0.62    ref. 19
ADD1, hypertension              G460W (W)                       18     7       5     2     0.63    ref. 20
APOE, schizophrenia             ε2/3/4 (ε4)                     12     1       0     1     0.80    ref. 21
BLMH, Alzheimer disease         I443V (V/V)                     5      0       0     0     0.31    ref. 22
COL1A1, osteoporotic fracture   Intron 1 G/T (T)                12     5       5     0     0.28    ref. 23
COMT, bipolar disorder          V158M (M)                       12     0       0     0     0.73    ref. 24
COMT, schizophrenia             V158M (V)                       9      1       0     1     0.99    ref. 25
CTLA4, type 1 diabetes          T17A (A)                        20     8       8     0     0.30    ref. 26
DRD2, schizophrenia             –141C ins/del (ins)             8      3       2     1     0.43    ref. 27
DRD3, schizophrenia             S9G (S/S)                       48     7       5     2     0.04    ref. 28
GSTM1, breast cancer            Null allele (null/null)         15     0       0     0     0.18    ref. 29
GSTM1, head/neck cancer         Null allele (null/null)         25     4       3     1     0.90    ref. 30
GYS1, type 2 diabetes           XbaI RFLP (A2, site present)    3      1       0     1     0.65    ref. 31
HTR2A, schizophrenia            102T/C (C)                      28     4       3     1     0.27    ref. 32
INSR, type 2 diabetes           SstI RFLP (5.8-kb allele)       4      1       1     0     0.01    ref. 33
INSR, type 2 diabetes           V985M (M)                       4      0       0     0     0.19    ref. 34
KCNJ11, type 2 diabetes         E23K (K)                        6      0       0     0     0.45    ref. 35
NTF3, schizophrenia             5′ dinucl. rept. (A3, 147 bp)   7      0       0     0     0.61    ref. 36
PON1, CAD                       R192Q (R)                       14     5       5     0     0.01    ref. 37
PPARG, type 2 diabetes          P12A (P)                        14     5       4     1     0.38    ref. 38
SERPINE1, MI                    Promoter 4G/5G (4G/4G)          13     1       1     0     0.36    ref. 39
SLC2A1, type 2 diabetes         XbaI RFLP (6.2-kb allele)       3      2       2     0     0.79    ref. 40
SLC2A2, type 2 diabetes         TaqI RFLP (13-kb allele)        3      0       0     0     0.15    ref. 41
TPH, bipolar disorder           Intron 7 218A/C (C)             5      0       0     0     0.63    ref. 42
Total                                                           301    59      47    12
CAD, coronary artery disease; MI, myocardial infarction; ins, insertion; del, deletion; RFLP, restriction-fragment length polymorphism; dinucl. rept., dinucleotide repeat. P value for publication bias was calculated as described (ref. 10). The standard normal deviate (the log of the odds ratio divided by its standard error) for each study is regressed against its precision (the inverse of the standard error). If the 90% c.i. for the intercept excludes zero, reflected in this table as P < 0.10, there is evidence of publication bias.
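The regression described in this legend can be sketched as an ordinary least-squares fit. The code below is a minimal illustration under stated assumptions: the function name and the example data are hypothetical and are not taken from any of the 25 associations. It returns the intercept and its standard error; forming the 90% c.i. additionally needs a t quantile with n − 2 degrees of freedom, which is left to a statistical table or package.

```python
import math


def egger_regression(log_odds_ratios, standard_errors):
    """Least-squares fit of SND = intercept + slope * precision.

    SND is log(OR) / SE and precision is 1 / SE for each study.
    Returns (intercept, slope, intercept_se); intercept_se is None
    when there are fewer than three studies.
    """
    x = [1.0 / se for se in standard_errors]
    y = [b / se for b, se in zip(log_odds_ratios, standard_errors)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if n < 3:
        return intercept, slope, None
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = rss / (n - 2)
    intercept_se = math.sqrt(sigma2 * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, slope, intercept_se


# Hypothetical data: every study estimates the same log odds ratio (no bias).
ses = [0.5, 0.4, 0.3, 0.2, 0.1]
unbiased = egger_regression([0.2] * 5, ses)

# Hypothetical data: smaller studies (larger SE) report larger effects.
biased = egger_regression([0.9, 0.7, 0.5, 0.3, 0.15], ses)

print(unbiased)
print(biased)
```

In the unbiased example the fitted line passes through the origin and its slope equals the common log odds ratio; in the second example the intercept moves away from zero, which is the pattern the test looks for.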

Fig. 1 Funnel plot analysis to detect publication bias. Each data point represents a separate study for the indicated association. For each study, the odds ratio is plotted on a logarithmic scale against the precision (the reciprocal of the standard error). If bias is absent, small studies will have odds ratios that are widely scattered but still centered around the odds ratio estimates provided by larger, more precise studies. In this case, the plot will resemble a funnel on its side with the mouth towards the y axis (top). If bias is present, part of the funnel mouth (small studies with odds ratios close to 1) will be missing (bottom). Formal statistical criteria are described in the legend to Table 1. Panels (y axis, odds ratio on a logarithmic scale; x axis, precision): top, no evidence of publication bias: APOE with schizophrenia; bottom, evidence of publication bias: PON1 with CAD.

studies (although the power to detect heterogeneity is limited with small numbers of studies). For the remaining four heterogeneous associations, separating the studies by ethnicity did not remove the heterogeneity (Table 3), and we could not identify any clear methodological discrepancies that could account for the heterogeneity (data not shown). Thus, population-specific effects seem unlikely to explain the excess of replications.

Repeated false positives due to population stratification also seem unlikely to explain most of the excess replications, as 9 of the 11 multiply replicated associations were replicated using family-based controls or in multiple ethnic groups.

By contrast, underpowered non-significant studies of real associations with modest genetic effects can reasonably account for much of the variability in replication. Consistent with this model, 8 of the 25 original positive reports were replicated in a meta-analysis of follow-up studies, with modest genetic effects that would be hard to replicate in small studies (Table 2). As expected, all 8 were among the 11 associations with multiple replications. Most estimated odds ratios were between 1.1 and 2; the most extreme case, a relative risk of 1.07 for HTR2A and schizophrenia, would require a study of 6,900 case–control pairs to achieve 80% power at P < 0.05, much larger than typical association studies with hundreds of individuals.

One of the prominent features of this analysis is that the estimate of the genetic effect in the first positive report is biased upward. Indeed, for 24 of 25 associations, the odds ratio in the first

Table 2 • Pooled odds ratios for follow-up studies of 25 associations

Columns: Studies, number of studies; Removed, number of studies removed to achieve homogeneity; OR values are given with 95% c.i. in parentheses.

Associated gene, phenotype              Studies  Removed  Fixed effects OR      Random effects OR

Associations for which meta-analysis replicates original report
ABCC8, exon 22, type 2 diabetes (a)     4        1        2.28 (1.27–4.10)      2.23 (1.21–4.11)
COL1A1, osteoporotic fracture (a)       12       0        1.59 (1.36–1.86)      1.61 (1.31–1.96)
CTLA4, type 1 diabetes (a)              20       6        1.27 (1.17–1.37)      1.31 (1.18–1.46)
DRD3, schizophrenia (a)                 48       2        1.12 (1.02–1.23)      1.13 (1.02–1.25)
GSTM1, head/neck cancer (a)             25       1        1.20 (1.09–1.33)      1.20 (1.08–1.33)
HTR2A, schizophrenia (a)                28       0        1.07 (1.01–1.14)      1.07 (1.01–1.15)
PPARG, type 2 diabetes (a)              14       1        1.22 (1.08–1.37)      1.21 (1.07–1.37)
SLC2A1, type 2 diabetes (a)             3        1        1.76 (1.35–2.31)      1.82 (1.23–2.69)

Associations for which meta-analysis does not provide replication for original report
ABCC8, intron 24, type 2 diabetes       9        2        1.02 (0.92–1.14)      1.10 (0.96–1.26)
ADD1, hypertension                      18       5        1.09 (0.98–1.20)      1.09 (0.95–1.25)
APOE, schizophrenia                     12       0        1.0 (0.85–1.18)       0.99 (0.82–1.20)
BLMH, Alzheimer disease                 5        0        0.92 (0.67–1.26)      0.93 (0.53–1.64)
COMT, bipolar disorder                  12       0        1.08 (0.95–1.23)      1.08 (0.95–1.23)
COMT, schizophrenia                     9        0        0.99 (0.87–1.12)      0.99 (0.84–1.16)
DRD2, schizophrenia                     8        1        0.84 (0.70–1.01)      0.85 (0.67–1.09)
SLC2A2, type 2 diabetes                 3        0        0.96 (0.63–1.45)      0.96 (0.63–1.46)
GSTM1, breast cancer                    15       0        1.07 (0.98–1.18)      1.07 (0.98–1.18)
GYS1, type 2 diabetes (b)               3        0        0.56 (0.35–0.90)      0.55 (0.29–1.07)
INSR RFLP, type 2 diabetes              4        0        1.32 (0.90–1.94)      1.39 (0.83–2.32)
INSR Met985, type 2 diabetes            4        0        0.89 (0.49–1.64)      0.81 (0.48–1.70)
KCNJ11, type 2 diabetes                 6        0        1.09 (0.95–1.24)      1.11 (0.92–1.33)
NTF3, schizophrenia                     7        0        1.01 (0.80–1.29)      1.02 (0.78–1.32)
PON1, CAD                               14       3        1.06 (0.98–1.14)      1.07 (0.98–1.16)
SERPINE1, MI                            13       0        1.04 (0.95–1.15)      1.04 (0.95–1.15)
TPH, bipolar disorder                   5        0        1.12 (0.98–1.28)      1.12 (0.98–1.28)

(a) Statistically significant in fixed and random effects models. (b) Statistically significant in fixed effects model only; opposite direction from original report. OR, estimated pooled odds ratio; CAD, coronary artery disease; MI, myocardial infarction.
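To make the fixed effects and random effects columns concrete, the following sketch pools 2 × 2 allele-count tables by inverse-variance weighting and applies the DerSimonian–Laird adjustment for between-study variance. This is a swapped-in, simplified technique chosen for brevity: the Methods cite a Mantel–Haenszel-type fixed effects model and a linear mixed-effects model computed with a statistical package, so the numbers produced here would not match the table exactly. All counts and names below are hypothetical.

```python
import math


def log_or_and_var(a, b, c, d):
    """Log odds ratio and its Woolf variance for one 2x2 table.

    a, b: risk / non-risk allele counts in cases;
    c, d: risk / non-risk allele counts in controls.
    """
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pool(tables):
    """Inverse-variance fixed-effect and DerSimonian-Laird random-effects pooling.

    Returns pooled OR with 95% c.i. for each model, Cochran's Q and tau^2.
    """
    stats = [log_or_and_var(*t) for t in tables]
    w = [1 / v for _, v in stats]
    fixed = sum(wi * y for wi, (y, _) in zip(w, stats)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, (y, _) in zip(w, stats))
    k = len(tables)
    denom = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / denom) if denom > 0 else 0.0
    w_re = [1 / (v + tau2) for _, v in stats]
    random_ = sum(wi * y for wi, (y, _) in zip(w_re, stats)) / sum(w_re)

    def summary(est, weights):
        se = math.sqrt(1 / sum(weights))
        return math.exp(est), math.exp(est - 1.96 * se), math.exp(est + 1.96 * se)

    return {"fixed": summary(fixed, w), "random": summary(random_, w_re),
            "Q": q, "tau2": tau2}


# Hypothetical allele counts for three follow-up studies of one association.
example = [(30, 70, 20, 80), (45, 155, 40, 160), (120, 380, 100, 400)]
print(pool(example))
```

Q is compared with a χ² distribution on k − 1 degrees of freedom to judge heterogeneity; when Q is no larger than k − 1, the estimated between-study variance is zero and the two models coincide, which is why many rows in Table 2 show nearly identical fixed and random effects estimates.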

positive report exceeded the genetic effect estimated by meta-analysis of the remaining studies. This result is consistent with the main conclusion of a recent review of association studies (ref. 11). We hypothesized that this consistent overestimation is a consequence of the ‘winner’s curse’ phenomenon, recently described in the setting of linkage studies (ref. 12) and perhaps most clearly described for auctions (see ref. 13 for example). If all auction participants place bids that are unbiased but imprecise estimates of an item’s true value, the final selling price will be an upwardly biased estimate of the value, because the winning bid’s success is conditional on that bid being the highest of a set of unbiased estimates. By a similar logic, the genetic effect in an association study will be biased upward, conditional on that study being the first to reach statistical significance and be published. In either case, the expected degree of upward bias can be mathematically modeled and predicted.

For 23 of 25 initial positive studies, the overestimation of genetic effects was consistent with a winner’s curse phenomenon (only 3 of the 23 P values under our winner’s curse model were <0.1, and none were <0.01). One consequence of the winner’s curse is that the odds ratio in the first positive study cannot be used to estimate the strength of the genetic effect, as emphasized by previous authors (refs 11,12). Notably, two first positive studies in our analysis (GYS1 and ABCC8 intron 24 with type 2 diabetes) greatly overestimated the actual genetic effect, to an extent that was inconsistent with a winner’s curse (P < 10⁻⁵ in both cases). These two studies could represent strong associations limited to a very specific population, but could also simply be flawed studies.

Our results support two complementary conclusions: false positive associations are abundant in the literature, but there are also many real associations lurking among the data. These true associations probably confer a modestly higher risk of common disease and thus are difficult to detect. Furthermore, some of these true associations are scattered among already published associations. Our analysis was limited to associations with common variants, as there are not enough published association studies with rare variants to permit meta-analysis, probably owing to the greater challenges of studying rare variants (ref. 14). Thus, although our results support a contribution of common variants to common disease, they do not address one way or the other the contribution of rare variation to common disease.

A recent study also reviewed associations between common variants and common disease (ref. 11). It is difficult to directly compare their results to ours because they included first positive reports in their analyses, and because our review seems to be more comprehensive (for example, we identified 48 studies of the association between DRD3 and schizophrenia, compared with 25 in their study (ref. 11)). For 8 of 36 associations in their review, the pooled odds ratio exceeded 1, although they did not rule out publication bias as a possible explanation. Despite the difficulties in comparison, the overall picture is encouragingly similar: when all published studies are combined, approximately 20–30% of genetic association studies are statistically significant, with modest estimated genetic effects.

We propose three general recommendations. First, in light of the seemingly high proportion of false positive reports in the literature, more stringent criteria for interpreting association studies are needed. A single, nominally significant association should be viewed as tentative until it has been independently replicated at least once and preferably twice. A review of these 25 associations suggests that two studies with P < 0.01 or a single study (other than the first positive) with P < 0.001 is strongly predictive of future replication (data not shown).

Second, large studies should be encouraged, with collaborative efforts probably required to achieve the sample size of many thousands of case–control pairs that is necessary for definitive studies of common variants with modest genetic effects. Even larger samples will be required to detect gene–gene or gene–environment interactions or associations specific to defined subgroups or to correct for testing association to multiple phenotypes. To help increase the effective sample size, reports of association would ideally include a meta-analysis of all available published data to give a more robust estimate of the genetic effect. To facilitate such meta-analyses and minimize publication bias, all disease association studies that meet minimal quality standards should be published. Such standards could include explicit phenotype definitions, complete listing of all phenotypes analyzed, precise localization of the polymorphism(s), low genotyping error rate, analysis that avoids overlap with previous studies and availability of genotype counts for cases and controls (or equivalent data for family-based studies). Publication of non-significant results could be encouraged using forums such as a common association study database or ‘brief reports’ and ‘negative results’ sections of specialty journals, with sufficient credit to provide incentives for publication.

Finally, our results suggest a possible productive avenue for human disease genetics in the near future. It seems likely that a fraction (perhaps a quarter) of previously published associations represent real associations with common disease. Thus, using large samples to test all previously reported associations, perhaps focusing initially on those associations that have already been replicated at least once, would probably identify a significant number of variants that affect the risk of common disease.
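The winner’s curse argument made earlier in this section can be illustrated with a small simulation. This is a simplified sketch and not the model used in the analysis: it conditions only on a study reaching nominal significance in the positive direction, whereas the Methods also condition on criteria for early publication. The true effect, the standard error and the function name are hypothetical choices made for illustration.

```python
import math
import random
import statistics


def simulate_first_positive(true_log_or, se, n_studies, seed=0):
    """Draw unbiased but imprecise estimates of a log odds ratio and compare
    the mean of all estimates with the mean of those that reach z > 1.96."""
    rng = random.Random(seed)
    estimates = [rng.gauss(true_log_or, se) for _ in range(n_studies)]
    significant = [e for e in estimates if e / se > 1.96]
    return statistics.mean(estimates), statistics.mean(significant)


true_effect = math.log(1.2)  # hypothetical modest effect (odds ratio 1.2)
all_mean, sig_mean = simulate_first_positive(true_effect, se=0.15, n_studies=20000)
print(math.exp(all_mean), math.exp(sig_mean))
```

Averaged over all simulated studies the estimate is close to the true odds ratio, but the subset that reaches significance necessarily reports estimates above the significance threshold, and therefore above the true effect when power is low.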

Table 3 • Grouping studies by ethnicity generally does not remove heterogeneity

Columns: Studies, number of studies in ethnic group; OR values are given with 95% c.i. in parentheses; P het., P value for heterogeneity.

Gene, phenotype / ethnic group    Studies  Fixed effects OR      Random effects OR     P het.

ABCC8 intron 24, type 2 diabetes
  of European descent             7        0.96 (0.86–1.07)      1.0 (0.79–1.27)       <0.01
ADD1, hypertension
  African-American                3        0.98 (0.73–1.32)      1.04 (0.44–2.45)      0.02
  Japanese                        4        1.13 (0.96–1.32)      1.20 (0.82–1.74)      <0.01
  of European descent             10       1.01 (0.90–1.14)      1.01 (0.82–1.25)      0.01
CTLA4, type 1 diabetes
  Japanese                        4        1.10 (0.93–1.31)      1.12 (0.87–1.44)      0.10
  of European descent             10       1.25 (1.14–1.36)      1.34 (1.06–1.70)      0.01
PON1, CAD
  Indian                          3        2.0 (1.63–2.46)       2.18 (1.26–3.77)      0.01
  Japanese                        3        1.36 (1.13–1.62)      1.40 (0.90–2.16)      0.01
  of European descent             6        1.02 (0.94–1.11)      1.02 (0.94–1.11)      0.63

For each of the four associations with unexplained heterogeneity, studies were grouped by the ethnic/geographic origin of the study population, and the P value for heterogeneity, estimated genetic effect and 95% confidence interval were calculated separately for each group. OR, estimated pooled odds ratio; CAD, coronary artery disease.

Methods

Selection of associations for analysis. In a previous study, we identified 160 associations between disease risk and common polymorphisms (defined as having at least two alleles with frequency >1% in the population) where the association was not consistently reproducible (ref. 9). Specifically, these associations had three or more separate published studies in total, at least one of which achieved statistical significance (a ‘positive’ study) and at least 25% of which did not achieve significance (‘negative’ studies). To minimize problems arising from multiple hypothesis testing, we limited our analysis to the 130 studies of variants with a single associated allele or genotype and dichotomous phenotypes such as diseased versus healthy. We also treated reports involving different polymorphisms in the same gene as separate associations. Although associations at multiple markers in a gene might support the general notion that variation in that gene is associated with disease susceptibility, it is not possible to fully evaluate whether an association at one marker provides supporting evidence for another marker without knowing the pattern of linkage disequilibrium (correlation between markers) in the population being studied.

We selected 25 of these associations for further analysis, comprising 8 associations chosen at random plus all identified associations for three diseases of interest to us: type 2 diabetes (9 associations), bipolar disorder (2 associations) and schizophrenia (6 associations). We attempted to identify all published studies for these 25 associations by several independent searches of literature published through approximately May 2001 (taking care to identify instances in which multiple papers discussed the same data set and choosing only one representative paper).

Statistical analysis. For each study, we compared the number of risk alleles (or genotypes) in cases and controls and calculated P values by χ² analysis with 1 degree of freedom (or by Fisher’s exact test if any of the cells in the expected table were less than 5). The situation for transmission-disequilibrium testing in parent–offspring trios is algebraically equivalent to treating the number of transmissions of each allele as the number of occurrences of that allele in cases and considering the controls to be a very large population with equal numbers of each allele (to reflect the expected 50:50 transmission ratio). We used allele counts unless the original positive report showed a stronger association to disease risk with a particular diploid genotype rather than with an allele. We analyzed all studies using the same phenotype that was reported to be associated in the original positive study for that association. We compared the number of observed studies with P values below a given threshold with the number of such studies expected by chance using a Poisson distribution. We carried out funnel plot analysis as described (ref. 10).

To determine whether there was non-random clustering of replication studies (follow-up reports with P < 0.05 and association in the same direction as the original report), we carried out 10,000 simulations of our data. In each simulation, we randomly assigned each of 47 replication studies to one of the 25 associations with a probability proportional to the total number of studies published for each association. To obtain an empiric P value, we counted the number of simulations in which the degree of clustering of replication studies matched or exceeded the actual observed data.

For each association, we tested the collection of studies (excluding the first positive report) for homogeneity using a Pearson χ² goodness-of-fit test (ref. 15); associations were considered homogeneous for P > 0.05. For homogeneous associations, we used the Epilog statistical package (Epicenter Software) to calculate the pooled odds ratio and 95% c.i. for the odds ratio using both a fixed effects model (refs 16,17) and a random effects model (ref. 18). For heterogeneous associations, we ranked the studies according to their individual χ² values from the goodness-of-fit test, removed the study with the highest χ² value and repeated this process until homogeneity was achieved. If the association became homogeneous after removing either one study or fewer than 5% of studies, we recalculated the overall odds ratio and 95% c.i., and no further action was taken. Meta-analyses that required the removal of more than a limited number of outlying studies to achieve homogeneity were examined further using two approaches.

For these persistently heterogeneous associations, we considered two potential sources of heterogeneity. First, we stratified the studies by ethnicity and repeated the analysis separately for each ethnic group. Second, the methods section of each study was reviewed by an epidemiologist who was blind to the outcome of each study to determine whether methodological discrepancies were apparent in any of the studies. We removed from the meta-analysis any studies with obvious methodological flaws or differences and repeated the analysis. If neither of these approaches resulted in homogeneity, we successively removed studies with the greatest contribution to heterogeneity as above until homogeneity was achieved. Although the studies removed in this way cannot be considered outliers, removing studies that contribute most to heterogeneity is an unbiased way of achieving the homogeneity required for meta-analysis. Meta-analyses were considered to have replicated the original report if the 95% c.i. around the pooled odds ratio for the homogeneous studies excluded 1 using both fixed and random-effects models.

For the winner’s curse analysis, we calculated the probability of observing an odds ratio at least as large as that reported in each initial positive study, given the sample size and allele frequencies in the study, and assuming that the real genetic effect was accurately estimated by the remaining studies (Table 2). We then refined the calculation by assuming that an initial positive report would meet at least one of three conditions that make early publication likely: P < 0.01, odds ratio ≥ 2 or use of family-based controls. (These conditions are reasonable, as 24 of the 25 first positive reports met at least one of these three conditions.) The P value under the winner’s curse model is the P value for the actual data in the first report divided by the P value for data that minimally achieve one of these three conditions.

Acknowledgments
We thank E. Byrne for help in obtaining manuscripts and D. Stram for help with analysis. J.N.H. is the recipient of a Burroughs Wellcome Career Award in Biomedical Sciences. This work was supported in part by research grants from Bristol-Myers Squibb, Millennium Pharmaceuticals and Affymetrix to E.S.L. and by the University of Southern California/Norris Comprehensive Cancer Center Core Grant from the US National Cancer Institute.

Competing interests statement
The authors declare that they have no competing financial interests.

Received 2 July; accepted 15 November 2002.

1. Lander, E.S. & Schork, N.J. Genetic dissection of complex traits. Science 265, 2037–2048 (1994).
2. Risch, N. & Merikangas, K. The future of genetic studies of complex human diseases. Science 273, 1516–1517 (1996).
3. Collins, F.S., Guyer, M.S. & Chakravarti, A. Variations on a theme: cataloging human DNA sequence variation. Science 278, 1580–1581 (1997).
4. Risch, N.J. Searching for genetic determinants in the new millennium. Nature 405, 847–856 (2000).
5. Freely associating. Nat. Genet. 22, 1–2 (1999).
6. Cardon, L.R. & Bell, J.I. Association study designs for complex diseases. Nat. Rev. Genet. 2, 91–99 (2001).
7. Altshuler, D., Kruglyak, L. & Lander, E. Genetic polymorphisms and disease. N. Engl. J. Med. 338, 1626 (1998).
8. Tabor, H.K., Risch, N.J. & Myers, R.M. Opinion: candidate-gene approaches for studying complex genetic traits: practical considerations. Nat. Rev. Genet. 3, 391–397 (2002).
9. Hirschhorn, J.N., Lohmueller, K., Byrne, E. & Hirschhorn, K. A comprehensive review of genetic association studies. Genet. Med. 4, 45–61 (2002).
10. Egger, M., Davey Smith, G., Schneider, M. & Minder, C. Bias in meta-analysis detected by a simple, graphical test. BMJ 315, 629–634 (1997).
11. Ioannidis, J.P., Ntzani, E.E., Trikalinos, T.A. & Contopoulos-Ioannidis, D.G. Replication validity of genetic association studies. Nat. Genet. 29, 306–309 (2001).
12. Goring, H.H., Terwilliger, J.D. & Blangero, J. Large upward bias in estimation of locus-specific effects from genome-wide scans. Am. J. Hum. Genet. 69, 1357–1369 (2001).
13. Bazerman, M.H. & Samuelson, W.F. I won the auction but don’t want the prize. J. Conflict Resolut. 27, 618–634 (1983).
14. Hirschhorn, J.N. & Altshuler, D. Once and again—issues surrounding replication in genetic association studies. J. Clin. Endocrinol. Metab. 87, 4438–4441 (2002).
15. Reis, I.M., Hirji, K.F. & Afifi, A.A. Exact and asymptotic tests for homogeneity in several 2 × 2 tables. Stat. Med. 18, 893–906 (1999).
16. Breslow, N.E. & Day, N.E. Statistical Methods in Cancer Research: 1. The Analysis of Case–Control Studies (International Agency for Research on Cancer, Lyon, 1980).
17. Robins, J., Breslow, N. & Greenland, S. Estimators of the Mantel–Haenszel variance consistent in both sparse data and large-strata limiting models. Biometrics 42, 311–323 (1986).
18. Stram, D.O. Meta-analysis of published data using a linear mixed-effects model. Biometrics 52, 536–544 (1996).
19. Inoue, H. et al. Sequence variants in the sulfonylurea receptor (SUR) gene are associated with NIDDM in Caucasians. Diabetes 45, 825–831 (1996).
20. Cusi, D. et al. Polymorphisms of α-adducin and salt sensitivity in patients with essential hypertension. Lancet 349, 1353–1357 (1997).
21. Harrington, C.R., Roth, M., Xuereb, J.H., McKenna, P.J. & Wischik, C.M. Apolipoprotein E type epsilon 4 allele frequency is increased in patients with schizophrenia. Neurosci. Lett. 202, 101–104 (1995).
22. Montoya, S.E. et al. Bleomycin hydrolase is associated with risk of sporadic Alzheimer’s disease. Nat. Genet. 18, 211–212 (1998).
23. Grant, S.F. et al. Reduced bone density and osteoporosis associated with a polymorphic Sp1 binding site in the collagen type I alpha 1 gene. Nat. Genet. 14, 203–205 (1996).
