Human DNA consists of ~3 billion base pairs (bp) of DNA per haploid genome.
DNA length is normally measured in units of 1000 bp (kilobases, kb) or
1,000,000 bp (megabases, Mb). Not all DNA encodes genes. In fact, genes
account for only ~10–15% of DNA. Much of the remaining DNA consists of
highly repetitive sequences, the function of which is poorly understood. These
repetitive DNA regions, along with nonrepetitive sequences that do not encode
genes, may serve a structural role in the packaging of DNA into chromatin, i.e.,
DNA bound to histone proteins, and chromosomes (Fig. 62-1). If only 10% of
DNA is expressed and there are 30,000 genes, the average gene would be ~10
kb in length. Although many genes are about this size, the range is quite broad.
For example, some genes are only a few hundred bp, whereas others, such as
the DMD gene, are extraordinarily large (2 Mb).
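The average-gene estimate above is simple arithmetic, sketched here in Python. The constants are the round figures quoted in the text; the function name `average_gene_length_bp` is introduced only for illustration.

```python
# Back-of-the-envelope estimate using the round figures quoted in the text.
GENOME_BP = 3_000_000_000   # ~3 billion bp per haploid genome
EXPRESSED_FRACTION = 0.10   # assumption from the text: 10% of DNA is expressed
GENE_COUNT = 30_000         # assumption from the text: 30,000 genes


def average_gene_length_bp(genome_bp, expressed_fraction, gene_count):
    """Expressed DNA divided evenly among all genes, in base pairs."""
    return genome_bp * expressed_fraction / gene_count


avg_bp = average_gene_length_bp(GENOME_BP, EXPRESSED_FRACTION, GENE_COUNT)
print(f"Average gene length: {avg_bp / 1_000:.0f} kb")
```

This is only an average. As the text notes, real genes range from a few hundred bp to about 2 Mb.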
Figure 62-1
Structure of chromatin and chromosomes. Chromatin is composed of double-
strand DNA that is wrapped around histone and nonhistone proteins forming
nucleosomes. The nucleosomes are further organized into solenoid structures.
Chromosomes assume their characteristic structure, with short (p) and long (q)
arms at the metaphase stage of the cell cycle.
Structure of DNA
Figure 62-2
The presence of four different bases provides surprising genetic diversity. In the
protein-coding regions of genes, the DNA bases are arranged into codons,
triplets of bases that each specify a particular amino acid. It is possible to arrange
the four bases into 64 different triplet codons (4^3 = 64). Each codon specifies 1 of the
20 different amino acids, or a regulatory signal, such as initiation and stop of
translation. Because there are more codons than amino acids, the genetic code
is degenerate; that is, most amino acids can be specified by several different
codons. By arranging the codons in different combinations and in various
lengths, it is possible to generate the tremendous diversity of primary protein
structure.
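The codon count and the degeneracy of the code can be checked with a short Python sketch. It assumes the standard genetic code, written as a 64-letter string in T, C, A, G order with `*` marking stop codons. The variable names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Enumerate every triplet of the four bases: 4 * 4 * 4 = 64 codons.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
CODON_TABLE = dict(zip(codons, AMINO_ACIDS))

# Count how many codons specify each amino acid (degeneracy).
codons_per_residue = Counter(CODON_TABLE.values())
n_stop = codons_per_residue.pop("*")

print("codons:", len(codons))
print("amino acids:", len(codons_per_residue))
print("stop codons:", n_stop)
print("codons for leucine:", codons_per_residue["L"])
print("codons for methionine:", codons_per_residue["M"])
```

In this table leucine is specified by six codons, whereas methionine and tryptophan have one each. That is why "most", not all, amino acids have several codons.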
Mechanisms that regulate gene expression play a critical role in the function of
genes. The transcription of genes is controlled primarily by transcription factors
that bind to DNA sequences in the regulatory regions of genes. As described
below, mutations in transcription factors cause a significant number of genetic
disorders. Gene expression is also influenced by epigenetic events, such as X-
inactivation and imprinting, processes in which DNA methylation or histone
modifications are associated with gene silencing. Several genetic disorders,
such as Prader-Willi syndrome (neonatal hypotonia, developmental delay,
obesity, short stature, and hypogonadism) and Albright hereditary
osteodystrophy (resistance to parathyroid hormone, short stature,
brachydactyly, resistance to other hormones in certain subtypes), exhibit the
consequences of genomic imprinting. Most studies of gene expression have
focused on the regulatory DNA elements of genes that control transcription.
However, it should be emphasized that gene expression requires a series of
steps, including mRNA processing, protein translation, and posttranslational
modifications, all of which are actively regulated (Fig. 62-2).
Structure of Genes
A gene product is usually a protein but can occasionally consist of RNA that is
not translated (e.g., microRNAs). Exons are the portions of genes that are
eventually spliced together to form mRNA. Introns are the spacing regions
between the exons that are spliced out of precursor RNAs during RNA
processing (Fig. 62-2).
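The exon and intron relationship can be shown with a toy Python sketch. The sequence is hypothetical and is not taken from any real gene. Exons are written in uppercase and introns in lowercase purely as a labeling convenience, so that exon coordinates can be read off the string. The function name `splice` is illustrative.

```python
import re


def splice(pre_mrna, exons):
    """Join exon intervals (0-based, end-exclusive) in order; introns are dropped."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical toy precursor RNA: UPPERCASE = exon, lowercase = intron.
pre_mrna = "AUGGCC" + "guaaguuuuucag" + "GAAUUU" + "guaugcccuuag" + "UGA"

# Exon coordinates recovered from the uppercase runs (a convention of this toy
# example only; real splice sites are recognized by the splicing machinery).
exons = [(m.start(), m.end()) for m in re.finditer(r"[A-Z]+", pre_mrna)]

mature_mrna = splice(pre_mrna, exons)
print(exons)
print(mature_mrna)
```

Passing a subset of the exon intervals to `splice` gives a crude picture of alternative splicing, in which one precursor RNA yields more than one mature transcript.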
The gene locus also includes regions that are necessary to control its
expression. The regulatory regions most commonly involve sequences
upstream (5') of the transcription start site, although there are also examples of
control elements within introns or downstream of the coding regions of a gene.
The upstream regulatory regions are also referred to as the promoter. The
minimal promoter usually consists of a TATA box (which binds TATA-binding
protein, TBP) and initiator sequences that enhance the formation of an active
transcription complex. A gene may generate various transcripts through the use
of alternative promoters and/or alternative splicing of exons, mechanisms that
contribute to the enormous diversity of proteins and their functions.
Transcriptional termination signals reside downstream, or 3', of a gene. Specific
sequences, such as the AAUAAA sequence at the 3' end of the mRNA,
designate the site for polyadenylation (poly-A tail), a process that influences
mRNA transport to the cytoplasm, stability, and translation efficiency. A rigorous
test of the regulatory region boundaries involves expressing a gene in a
transgenic animal to determine whether the isolated DNA flanking sequences
are sufficient to recapitulate the normal developmental, tissue-specific, and
signal-responsive features of the endogenous gene. This has been
accomplished for only a few genes; there are many examples in which large
genomic fragments only partially reconstitute normal gene regulation in vivo,
implying the presence of distant regulatory sequences. Genome-wide analyses
of selected transcription factor binding sites, such as for the estrogen receptor,
reveal that the majority of regulatory sites are very distant from the transcription
start sites of genes. A detailed understanding of mechanisms that regulate
genes is also relevant for gene therapy strategies that require normal gene
regulation (Chap. 65).
Mutations can occur in all domains of a gene (Fig. 62-4). A point mutation
occurring within the coding region leads to an amino acid substitution if the
altered codon specifies a different amino acid. Point mutations that introduce a premature stop codon result
in a truncated protein. Large deletions may affect a portion of a gene or an
entire gene, whereas small deletions and insertions alter the reading frame if
they do not represent a multiple of three bases. These "frameshift" mutations
lead to an entirely altered carboxy terminus. Mutations occurring in regulatory or
intronic regions may result in altered expression or splicing of genes. Examples
are shown in Fig. 62-5.
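The effects of these mutation classes on the encoded protein can be sketched in Python. The six-codon reading frame below is hypothetical and chosen only for illustration. The helpers `translate`, `substitute`, and `delete` are illustrative names, and the standard genetic code is assumed.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip(("".join(c) for c in product(BASES, repeat=3)), AMINO_ACIDS))


def translate(coding_dna):
    """Translate from the first base until a stop codon or the last full codon."""
    protein = []
    for i in range(0, len(coding_dna) - 2, 3):
        residue = CODON_TABLE[coding_dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def substitute(seq, pos, base):
    """Point mutation: replace the base at 0-based position pos."""
    return seq[:pos] + base + seq[pos + 1:]


def delete(seq, pos, n):
    """Deletion of n bases starting at 0-based position pos."""
    return seq[:pos] + seq[pos + n:]


# Hypothetical reading frame (not a real gene): Met-Glu-Val-Gln-Leu-stop.
reference = "ATGGAAGTTCAACTGTAA"

variants = {
    "reference": reference,
    "silent (GAA>GAG)": substitute(reference, 5, "G"),
    "missense (GAA>AAA)": substitute(reference, 3, "A"),
    "nonsense (CAA>TAA)": substitute(reference, 9, "T"),
    "in-frame deletion (3 bp)": delete(reference, 3, 3),
    "frameshift deletion (1 bp)": delete(reference, 3, 1),
}
proteins = {name: translate(seq) for name, seq in variants.items()}

for name, protein in proteins.items():
    print(f"{name:28s} {protein}")
```

In this toy frame the 1-bp deletion changes every residue downstream of the deletion and removes the original stop codon from the reading frame, whereas the 3-bp deletion removes a single residue and leaves the rest of the sequence intact.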
Figure 62-4
Point mutations causing β-thalassemia as an example of allelic heterogeneity.
The β-globin gene is located in the globin gene cluster. Point mutations can
be located in the promoter, the CAP site, the 5'-untranslated region, the initiation
codon, each of the three exons, the introns, or the polyadenylation signal. Many
are missense or nonsense mutations, whereas others cause
defective RNA splicing. Not shown here are deletion mutations of the
β-globin gene or larger deletions of the globin locus that can also result in
Figure 62-5
A. Examples of mutations. The coding strand is shown with the encoded amino
acid sequence. B. Chromatograms of sequence analyses after amplification of
genomic DNA by polymerase chain reaction.
HNF1α, HNF1β
Paired box | PAX3 | Waardenburg syndrome types 1 and 3
T-box | TBX5 | Holt-Oram syndrome (thumb anomalies, atrial or ventricular septum defects, phocomelia)
Cell cycle control proteins | P53 | Li-Fraumeni syndrome, other cancers
Coactivators | CREB binding protein (CBP) | Rubinstein-Taybi syndrome
General transcription factors | TATA-binding protein (TBP) | Spinocerebellar ataxia 17 (CAG expansion)
Transcription elongation factor | VHL | Von Hippel–Lindau syndrome (renal cell carcinoma, pheochromocytoma, pancreatic tumors, hemangioblastomas); autosomal dominant inheritance, somatic inactivation of second allele (Knudson two-hit model)
Runt | CBFA2 | Familial thrombocytopenia with propensity to acute myelogenous leukemia
Chimeric proteins due to translocations | PML-RAR | Acute promyelocytic leukemia, t(15;17)(q22;q11.2-q12) translocation
Note: Selected abbreviations include: SRY, sex determining region Y; HNF,
hepatocyte nuclear factor; CREB (cAMP responsive element binding) binding
protein; VHL, Von Hippel–Lindau; PML, promyelocytic leukemia; RAR, retinoic
acid receptor.
Of course, these mechanisms are not mutually exclusive, and most genes are
activated by some combination of these events.
Suppression of gene expression is as important as gene activation in the control
of cell differentiation and function. Some mechanisms of repression are the
corollary of activation. For example, repression is often associated with histone
deacetylation or protein dephosphorylation. For nuclear hormone receptors,
transcriptional silencing involves the recruitment of repression complexes that
contain histone deacetylase activity. Aberrant expression of repressor proteins
is sometimes associated with neoplasia. The t(15;17) chromosomal
translocation that occurs in promyelocytic leukemia fuses the PML gene to a
portion of the retinoic acid receptor (RARα) gene (Table 62-2). This
event causes unregulated transcriptional repression in a manner that precludes
normal cellular differentiation. The addition of the RAR ligand, retinoic acid,
activates the receptor, thereby relieving repression and allowing cells to
differentiate and ultimately undergo apoptosis. This mechanism has therapeutic
importance as the addition of retinoic acid to treatment regimens induces a
higher remission rate in patients with promyelocytic leukemia (Chap. 104).
Methylation of promoter regions is frequently found in neoplasms and silences
gene expression.