
The Candidate Gene Approach
Jennifer M. Kwon, M.D., and Alison M. Goate, D.Phil.

Alcoholism has a significant genetic basis, and identifying genes that confer a susceptibility to
alcoholism will aid clinicians in preventing and effectively treating the disease. One commonly
used technique to identify genetic risk factors for complex disorders such as alcoholism is the
candidate gene approach, which directly tests the effects of genetic variants of a potentially
contributing gene in an association study. These studies, which may include members of an
affected family or unrelated cases and controls, can be performed relatively quickly and
inexpensively and may allow identification of genes with small effects. However, the candidate
gene approach is limited by how much is known of the biology of the disease being investigated.
As researchers identify potential candidate genes using animal studies or linking them to DNA
regions implicated through other analyses, the candidate gene approach will continue to be
commonly used.

KEY WORDS: genetic theory of AODU (AOD [alcohol or other drug] use, abuse,
and dependence); genetic linkage; genetic polymorphism; nucleotides; apolipoproteins;
quantitative trait locus; alcohol dehydrogenases; aldehyde dehydrogenases; Alzheimer’s disease

JENNIFER M. KWON, M.D., is an assistant professor of neurology and ALISON M. GOATE, D.PHIL., is a professor of genetics in psychiatry at Washington University School of Medicine, St. Louis, Missouri. Both authors are researchers in the Collaborative Study of the Genetics of Alcoholism.

Family, twin, and adoption studies have indicated that alcoholism has a strong genetic component (Reich et al. 1999). Although researchers are still investigating its exact nature, several genes of varying effect that confer a susceptibility to alcoholism (i.e., susceptibility genes) likely play a role. Identification of these genes will aid researchers and clinicians in preventing and effectively treating this disorder.

The search for alcoholism susceptibility genes centers on two major techniques, linkage mapping and the candidate gene approach. Linkage mapping, also called positional cloning, is the process of systematically scanning the entire DNA contents (i.e., the genomes) of various members of families affected by the disorder using regularly spaced, highly variable (i.e., polymorphic) DNA segments whose exact position is known (i.e., genetic markers). Using those families, investigators can identify genetic regions associated or “in linkage” with the disease by observing that affected family members share certain marker variants (i.e., alleles) located in those regions more frequently than would be expected by chance. These regions can then be isolated, or cloned, for further analysis and characterization of the responsible genes. Linkage mapping techniques have already resulted in the identification of several potential DNA regions that may contain susceptibility genes for alcoholism (Reich et al. 1999). (For a review of the analogous approach in mice, see the article in this issue on quantitative trait locus [QTL] analysis by Grisel, pp. 169–174.) The primary advantage of linkage mapping is that investigators need no prior knowledge of the physiology or biology underlying the disorder being studied, which is important for complex disorders, such as alcoholism.

Whereas the linkage mapping approach is an unbiased search of the entire genome without any preconceptions about the role of a certain gene, the candidate gene approach allows researchers to investigate the validity of an “educated guess” about the genetic basis of a disorder. This approach involves assessing the association between a particular allele (or set of alleles) of a gene that may be involved in the disease (i.e., a candidate gene) and the disease itself. In other words, this type of association

164 Alcohol Research & Health



study tries to answer the question, “Is one allele of a candidate gene more frequently seen in subjects with the disease than in subjects without the disease?” The major difficulty with this approach is that in order to choose a potential candidate gene, researchers must already have an understanding of the mechanisms underlying the disease (i.e., disease pathophysiology). In contrast with linkage mapping studies, however, studies of candidate genes do not require large families with both affected and unaffected members, but can be performed with unrelated cases and control subjects or with small families (e.g., a proband and parents). Furthermore, candidate gene studies are better suited for detecting genes underlying common and more complex diseases where the risk associated with any given candidate gene is relatively small (Collins et al. 1997; Risch and Merikangas 1996). This article describes the methods used in candidate gene studies, including associated methodological and technical considerations, and reviews examples of this approach that are both related and unrelated to alcoholism. Because many of the best-known examples of this approach come from studies in humans, those studies are highlighted here. However, the overall approach is similar in other model organisms, and the differences will be reviewed at the end of this article.

Strategies Used in the Candidate Gene Approach

Selecting a Candidate Gene

The first critical step in conducting candidate gene studies is the choice of a suitable candidate gene that may plausibly play a relevant role in the process or disease under investigation. For example, when studying alcoholism, genes encoding enzymes that act in various pathways of alcohol metabolism, such as alcohol dehydrogenase (ADH) and aldehyde dehydrogenase (ALDH), are logical choices. Both enzymes are encoded by more than one gene (i.e., by gene families), and each of these genes exists in several variants, or alleles, allowing for its use in the candidate gene approach.

In alcoholism, as in other addictive disorders, the pathways through which the brain chemicals (i.e., neurotransmitters) and other signaling molecules act may also play a role in the development and maintenance of addictive behaviors. For example, researchers using the fruit fly Drosophila melanogaster as an animal model of alcohol sensitivity found that a fly mutant called cheapdate exhibited enhanced vulnerability to alcohol (Moore et al. 1998). This mutant carries a specific allele of a gene important in cellular signaling pathways; consequently, genes involved in this and other signaling pathways would be reasonable choices for candidate genes influencing alcohol sensitivity and possibly alcoholism. (For more information on alcohol-related studies in Drosophila, see the article in this issue by Heberlein, pp. 185–188.)

The selection of particular genes for further analysis as candidate genes could be facilitated if some of the potentially important genes were located in DNA regions that could be linked to alcoholism in genome screens.

Choosing a DNA Polymorphism

Once investigators have selected a candidate gene, they must decide which polymorphism would be most useful for testing in an association study. To this end, they must identify existing gene variants and determine which of those variants result in proteins with altered functions that might influence the trait of interest.¹ (For more information on the relationship between mutations in the DNA and variations in protein function, see the sidebar, p. 167.) In the case of ALDH, several well-known polymorphisms result in the substitution of certain protein building blocks (i.e., amino acids) and thus can lead to proteins with biologically relevant changes in function. In many cases, however, researchers may know a gene’s DNA sequence but may not have any information about functional variation in the gene.

¹ Both the genes and the proteins they encode frequently are abbreviated with the same letters; however, the names of the genes are usually typed in italics and the names of the proteins in regular letters.

Detecting genetic variants is a laborious process that often involves sequencing the entire gene, that is, determining its sequence of DNA building blocks (i.e., nucleotides), in both affected and unaffected individuals to look for consistent differences.² Alternatively, researchers can employ screening procedures during which they isolate small gene sections from many individuals and compare their mobility in a gelatinous material. Differences in mobility in these analyses may indicate nucleotide variations (Malhotra and Goldman 1999).³ To confirm that a potential nucleotide variation exists and to determine its exact location in the genome, investigators then must conduct additional studies, typically involving direct sequencing of the DNA section in question. This information also allows researchers to determine whether the nucleotide variation is likely to have functional significance, either because it actually results in amino acid changes in the resulting protein or because it occurs in DNA regions controlling the gene’s activity. Finally, to be useful for candidate gene studies, the variant should occur with sufficient frequency to allow detection of differences between individuals with and without the trait under investigation.

² A typical gene can span 10,000 to 100,000 or more nucleotides of the human genome, of which approximately 2 to 5 percent (i.e., a few thousand nucleotides) consist of the coding sequence and the rest, intronic sequence.

³ Two such techniques are single strand conformation analysis (SSCP) and denaturing high performance liquid chromatography (DHPLC). For a discussion of other candidate gene variant selection techniques, see Malhotra and Goldman 1999; Collins et al. 1997.

Vol. 24, No. 3, 2000 165

Not all genes, however, have an easily identifiable common functional variant that can be exploited in association studies, and in many cases researchers have identified only changes in individual nucleotides (i.e., single nucleotide polymorphisms [SNPs]) that have no known functional significance. Nevertheless, SNPs can be potentially useful in narrowing a linkage region. In addition, they may show a statistically significant association with a disease susceptibility gene if they are located within or near that gene by virtue of linkage disequilibrium (see the sidebar for a description of this phenomenon).

SNPs can be of particular benefit in studies of complex disorders for which many potential candidate genes exist. For example, linkage mapping studies have suggested several genomic areas that may contain susceptibility genes for alcoholism. Each of these areas, however, is so large that it may contain dozens or hundreds of genes depending on the size and gene density of each region.⁴ Because it would be prohibitively difficult to sequence all these genes, publicly available SNP data are a great resource for candidate gene and association studies. For example, researchers recently analyzed several SNPs in the DNA region containing a candidate gene for Alzheimer’s disease (AD) and demonstrated that two SNPs closely flanking that gene indeed showed strong association with AD (Martin et al. 2000). (For more information on this candidate gene for AD, see the section “Examples of the Candidate Gene Approach in Humans.”)

⁴ Genes are not equally spaced throughout the genome, and some DNA regions may contain more genes than others. For example, although the human chromosomes 21 and 22 are of similar size (approximately 33.5 million nucleotides), their gene density differs substantially. Thus, there are on average 6 genes per million nucleotides on chromosome 21 (Hattori et al. 2000) but 16 genes per million nucleotides on chromosome 22 (Dunham et al. 1999).

Testing the Candidate Gene

Once investigators have chosen a candidate gene and suitable polymorphism, they commonly test the role of this gene in a sample of randomly chosen subjects with the disease (i.e., cases) and without the disease (i.e., controls). Such subject groups are relatively easy to obtain, giving the candidate gene approach an important advantage over the linkage mapping approach, which requires the analysis of families with multiple affected members. Additional advantages of the case-control study design over linkage-based methods include the following (Malhotra and Goldman 1999):

• Researchers can more easily obtain large numbers of cases and control subjects.

• The effect of disease heterogeneity (i.e., that a disease may have multiple genetic causes despite a similar disease phenotype) is less problematic.

• Researchers do not need to make assumptions about the exact mode of disease transmission before conducting their analyses.

The major problem associated with the case-control design is that it may result in spurious associations if the controls are not appropriately matched to the cases with respect to ethnicity or other factors that influence an individual’s genetic composition.

Examples of the Candidate Gene Approach in Humans

A widely cited example of the usefulness of the candidate gene approach involves AD, the most common cause of dementia in the elderly. AD typically is a late-onset disorder (i.e., the earliest symptoms occur after age 60) with a complex inheritance pattern. The disease often appears to occur sporadically, even when there is an underlying genetic predisposition. One of the pathologic hallmarks of AD is the presence of microscopic aggregates, or plaques, of a small protein-like molecule called β-amyloid peptide. These β-amyloid plaques also contain several other proteins, including one called apolipoprotein E (ApoE), whose gene (APOE) is located on chromosome 19. ApoE was implicated in the development of AD by findings that it binds tightly to β-amyloid in the fluid surrounding the brain and spinal cord (i.e., the cerebrospinal fluid) (Strittmatter et al. 1993). Furthermore, prior linkage data had indicated that a gene for late-onset AD was located on chromosome 19 (Pericak-Vance et al. 1991), in a region that included the APOE gene. Based on these findings, researchers conducted an association study comparing the frequency of three APOE alleles called E2, E3, and E4 in 30 cases and 91 unrelated controls (Strittmatter et al. 1993). The investigators found that whereas all alleles occurred in the controls, the APOE*E4 allele was greatly overrepresented in the AD cases, indicating that this allele is a major risk factor for the development of AD. This robust association between APOE*E4 and AD has been confirmed in many subsequent studies (for a review, see St. George-Hyslop 2000).

With respect to alcoholism, researchers have used the candidate gene approach to investigate the association between certain ADH and ALDH alleles and an altered risk of alcoholism. Studies have found that the enzyme encoded by an ALDH allele called ALDH2*2 degrades acetaldehyde more slowly than normal, resulting in the prolongation of certain unpleasant alcohol effects, such as facial flushing, racing of the heart (i.e., palpitations), and nausea. Not surprisingly, this allele appears to have a protective effect against alcoholism; that is, people carrying the allele are less likely to consume alcohol and to develop alcoholism (for a review, see Reich et al. 1999). The frequency of the ALDH2*2 allele is particularly high in some Asian populations, in which carriers of the allele consume less alcohol and are much less likely to develop alcoholism than are people without the allele.

The Candidate Gene Approach in Mouse Studies

Quantitative trait loci (QTLs) are DNA regions that may contain one or more genes related to the development of a


Genes and Mutations

DNA, the genetic material contained in each cell, encodes the information for all the proteins needed to create and maintain an organism. The information for each protein is contained within one gene. Genes represent only a small portion of a cell’s entire DNA (i.e., the genome), however, and stretches of DNA both between and within genes are not converted into proteins. Some of these “noncoding” DNA stretches (e.g., promoters) regulate the activity of the genes and determine which gene is turned “on” or “off” in a given cell at a given time. This regulation is necessary, because not all cells need to generate all proteins at all times, and excessive or untimely protein production can lead to disease. For example, only blood cells need to produce the protein hemoglobin, which carries oxygen from the lungs to the tissues. Noncoding DNA stretches within genes are called introns. They are cut (i.e., spliced) out of an intermediary molecule called messenger RNA during the conversion of the genetic information in the DNA into a protein. This splicing process must be highly accurate in order to ensure that the resulting protein is functional.

DNA is a long, thread-like molecule whose building blocks, the nucleotides, consist of sugar molecules linked to organic bases. There are four such bases: adenine (A), guanine (G), cytosine (C), and thymine (T). The order, or sequence, in which the nucleotides are arranged specifies the order in which the building blocks of the resulting proteins (i.e., the amino acids) are combined. Because there are 20 amino acids but only 4 different nucleotides, a triplet of three nucleotides (i.e., a codon) represents one specific amino acid. However, the 4 nucleotides can be arranged into 64 different triplets, far more than the 20 codons needed to represent each amino acid. As a result, the genetic code is redundant, which means that more than one codon can represent a particular amino acid. For example, only the codon ATG represents the amino acid methionine; however, four different codons (GCA, GCC, GCG, and GCT) represent the amino acid alanine.

Both during the DNA duplication that occurs when cells divide and as the result of external factors (e.g., exposure to radiation or certain chemicals), changes in the nucleotide sequence (i.e., mutations) can occur. If these changes result in altered proteins that contribute to the development of a different phenotype, they represent polymorphisms that can be useful in candidate gene studies. Many mutations do not result in amino acid changes, however, and therefore do not alter the resulting protein or its function. For example, because of the redundant nature of the DNA code, some mutations result in codons that still specify the same amino acid. Thus, if a mutation occurred in the last nucleotide of the GCA triplet, all three possible new triplets (GCC, GCG, and GCT) would still encode the amino acid alanine. Nevertheless, these single nucleotide polymorphisms (SNPs) can be useful in linkage studies, as described in the main article.

Furthermore, many mutations occur in noncoding DNA regions and therefore do not result in protein variants that are associated with an altered phenotype or increased disease risk. Under two conditions, however, even mutations in noncoding regions might result in an altered phenotype and therefore be useful in candidate gene studies. First, mutations that occur in regulatory regions, such as promoters or intron splice sites, could alter gene activity and, consequently, the phenotype determined by that gene. Second, noncoding mutations that occur in an intron or outside a gene could be associated with an altered phenotype if they are positioned close to (i.e., typically within 200,000 nucleotides) a functional mutation and are therefore almost always inherited together with the functional mutation. This phenomenon is known as “linkage disequilibrium.” In all other cases, an observed association between a noncoding mutation and a disease may be a consequence of population stratification (that is, general differences between cases and controls if both subject groups are drawn from different underlying populations, e.g., ethnic groups or animal strains) or a chance event.

Jennifer M. Kwon and Alison M. Goate
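The redundancy of the genetic code described in this sidebar can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the table below is deliberately partial and lists just the five codons named in the sidebar (ATG for methionine; GCA, GCC, GCG, and GCT for alanine), and the names `CODON_TO_AMINO_ACID`, `all_triplets`, `third_position_variants`, and `is_synonymous` are hypothetical helpers introduced here, not part of any library.

```python
from itertools import product

BASES = "ACGT"  # adenine, cytosine, guanine, thymine

# Partial, illustrative table: only the codons named in the sidebar.
# A complete table would assign all 64 triplets.
CODON_TO_AMINO_ACID = {
    "ATG": "methionine",
    "GCA": "alanine",
    "GCC": "alanine",
    "GCG": "alanine",
    "GCT": "alanine",
}


def all_triplets():
    """Enumerate every possible three-nucleotide triplet (4 * 4 * 4 = 64)."""
    return ["".join(t) for t in product(BASES, repeat=3)]


def third_position_variants(codon):
    """Return the three triplets that differ from `codon` only in the last nucleotide."""
    return [codon[:2] + base for base in BASES if base != codon[2]]


def is_synonymous(original, mutated):
    """True if both codons are in the partial table and encode the same amino acid.

    Codons missing from the partial table are treated as "not known to be
    synonymous", so the function returns False for them.
    """
    before = CODON_TO_AMINO_ACID.get(original)
    after = CODON_TO_AMINO_ACID.get(mutated)
    return before is not None and before == after


if __name__ == "__main__":
    print(len(all_triplets()), "possible triplets")
    for variant in third_position_variants("GCA"):
        print("GCA ->", variant, "synonymous:", is_synonymous("GCA", variant))
```

Running the script lists the three single-nucleotide changes at the last position of GCA and reports each as synonymous, which mirrors the alanine example in the text. Because the table is partial, a result of `False` for codons outside it means only that the sketch has no entry for them.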



certain quantitative trait. Mapping of QTLs in animal models of alcohol-related phenotypes has identified multiple genomic areas that potentially contain candidate genes for these phenotypes. (For more information on QTL mapping, see the article in this issue by Grisel, pp. 169–174.) The methods of identifying these candidate genes and any potential functional variants are essentially the same as those used in humans. Once functional variants are found, however, any positive association between a variant and the trait of interest must be interpreted with caution. For example, because of the way mouse strains are bred, mice that have a trait (analogous to human cases) and mice that do not have that trait (analogous to controls) may possess different alleles at a particular gene even if that gene is unrelated to the disease (or trait) under consideration. The gene that actually confers the risk for the disease or trait under investigation may be located near the gene showing the allelic polymorphism, but may be difficult to identify positively using association methods alone.

Conclusion

A combination of linkage mapping and a candidate gene approach has been the most successful method of identifying disease genes to date. The candidate gene approach is useful for quickly determining the association of a genetic variant with a disorder and for identifying genes of modest effect. This approach has certain advantages over traditional linkage mapping or positional cloning approaches. The current methods for evaluating risk associated with candidate genes complement traditional linkage efforts in identifying susceptibility genes for alcoholism. As more SNPs are identified throughout the genome, some of those SNPs also will be located within candidate genes, thereby allowing researchers the use of the candidate gene approach on a genome-wide scale.

References

COLLINS, F.S.; GUYER, M.S.; AND CHAKRAVARTI, A. Variations on a theme: Cataloging human DNA sequence variation. Science 278:1580–1581, 1997.

DUNHAM, I.; SHIMIZU, N.; ROE, B.A.; ET AL. The DNA sequence of human chromosome 22. Nature 402:489–495, 1999.

MALHOTRA, A.K., AND GOLDMAN, D. Benefits and pitfalls encountered in psychiatric genetic association studies. Biological Psychiatry 45:544–550, 1999.

MARTIN, E.R.; GILBERT, J.R.; LAI, E.H.; ET AL. Analysis of association at single nucleotide polymorphisms in the APOE region. Genomics 63:7–12, 2000.

MOORE, M.S.; DEZAZZO, J.; LUK, A.Y.; ET AL. Ethanol intoxication in Drosophila: Genetic and pharmacological evidence for regulation by the cAMP signaling pathway. Cell 93:997–1007, 1998.

PERICAK-VANCE, M.A.; BEBOUT, J.L.; GASKELL, P.C.; ET AL. Linkage studies in familial Alzheimer disease: Evidence for chromosome 19 linkage. American Journal of Human Genetics 48:1034–1050, 1991.

REICH, T.; HINRICHS, A.; CULVERHOUSE, R.; AND BIERUT, L. Genetic studies of alcoholism and substance dependence. American Journal of Human Genetics 65:599–605, 1999.

RISCH, N., AND MERIKANGAS, K. The future of genetic studies of complex human diseases. Science 273:1516–1517, 1996.

ST. GEORGE-HYSLOP, P.H. Molecular genetics of Alzheimer’s disease. Biological Psychiatry 47:183–199, 2000.

STRITTMATTER, W.J.; SAUNDERS, A.M.; SCHMECHEL, D.; ET AL. Apolipoprotein E: High avidity binding to β-amyloid and increased frequency of type 4 allele in late-onset familial Alzheimer disease. Proceedings of the National Academy of Sciences USA 90:1977–1981, 1993.
