
Pseudogenes

Sean D. Pitman M.D.


© January 2005
Latest Update: May 2008

Table of Contents
• Pseudogenes
• Shared Pseudogenes
• Signs of Function
• One Man's Junk
• Shared Mistakes
• A New Paradigm
• The Human Genome in Numbers
• Pyknons
• The Key Human-Ape Differences
• Wade Schauer Essay

Pseudogenes are DNA sequences that resemble functional genes but are generally
thought to have no purpose. In fact many scientists think that pseudogenes are nothing
more than discarded genetic fossils of a bygone era when they did have some sort of
important function. Of course, it logically follows that similar pseudogenes that are
shared by different species give evidence of common ancestry and even potential times
of divergence.11 For example, the eta-globin pseudogene, which is found in both
humans and chimps, has been used as an argument for the common ancestry of the
two species.
The first pseudogene was reported in 1977.1 Since that time, a large number of
these genes have been reported and described in humans and many other species.


There are two types of pseudogenes known as "processed" and "unprocessed"
pseudogenes.2,11
Processed pseudogenes are found on different chromosomes from their functional
counterparts. They lack introns and certain regulatory sequences, often terminate in a
run of adenines, and are flanked by direct repeats (which are associated with movable
genetic elements). They may be complete or incomplete copies of genes or mixtures of
several genes. They are believed to have arisen through a 3-step process: copying DNA
into RNA, editing out the introns to make mRNA, and then turning the code in the mRNA
back into DNA through a reverse transcription process. This process is thought to have
created the "L1 family of pseudogenes."2 Other theories propose retroviruses as a means
of pseudogene transport between different organisms.
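The 3-step process described above can be sketched in a few lines of Python. This is only a toy illustration: the gene layout, the sequences, the function names and the length of the adenine run are all made up, and the reverse transcription step is simplified to return the coding-strand sequence directly.

```python
# Toy model of the three-step route to a processed pseudogene.
# The gene layout and sequences below are hypothetical, chosen for illustration.

def transcribe(gene):
    """Step 1: copy DNA into a primary RNA transcript (T becomes U)."""
    return [(kind, seq.replace("T", "U")) for kind, seq in gene]

def splice(pre_mrna, poly_a_length=8):
    """Step 2: drop the introns, join the exons, and add a run of adenines."""
    exons = "".join(seq for kind, seq in pre_mrna if kind == "exon")
    return exons + "A" * poly_a_length

def reverse_transcribe(mrna):
    """Step 3: turn the mRNA code back into DNA (U becomes T).

    Simplification: this returns the coding-strand sequence directly and
    skips the complementary-strand intermediate made by a real enzyme.
    """
    return mrna.replace("U", "T")

gene = [
    ("exon", "ATGGCC"),
    ("intron", "GTAAGTTTTCAG"),
    ("exon", "TTCGAATAA"),
]

processed_copy = reverse_transcribe(splice(transcribe(gene)))
print(processed_copy)
```

The resulting copy has the two features noted above for processed pseudogenes: the intron is gone and the sequence ends in a series of adenines.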
Unprocessed pseudogenes are usually found in clusters of similar functional
sequences on the same chromosome. They usually have introns and associated
regulatory sequences. Their expression is usually prevented by a "misplaced" stop
codon or codons. There may be other changes from the "original" as the result of
deletions, insertions, and point mutations. Some form of mRNA may or may not be
produced depending on the damage to the gene. Many of these are believed to have
arisen by gene duplication, which produced an extra copy of the gene. The extra copy
could then accumulate mutations without harming the organism since it would still have
a completely functional original copy.2 (The evolutionary gene duplication hypothesis
suggests that over time, random mutations may produce a new gene with new functions
by using this gene duplicate while maintaining the original gene function5).
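To make the idea of a "misplaced" stop codon concrete, here is a minimal sketch that scans a coding sequence codon by codon and reports the first in-frame stop. The two short sequences are invented for the example and do not come from any real gene.

```python
# Minimal sketch: locate the first in-frame stop codon in a coding sequence.
# The example sequences are hypothetical.

STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_stop_codon(coding_dna):
    """Return the 0-based codon index of the first in-frame stop, or None."""
    for i in range(0, len(coding_dna) - 2, 3):
        if coding_dna[i:i + 3] in STOP_CODONS:
            return i // 3
    return None

intact = "ATGGCCAAATTCTAA"   # ATG GCC AAA TTC TAA: stop only at the end
mutated = "ATGGCCTAATTCTAA"  # ATG GCC TAA TTC TAA: one A->T change, early stop

print(first_stop_codon(intact))
print(first_stop_codon(mutated))
```

A single point mutation is enough to move the stop from the last codon to the middle of the sequence, which is the kind of change that would cut expression short in an unprocessed pseudogene.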

Shared Pseudogenes

Many, especially evolutionary biologists, hold that shared pseudogenes, which
have no function in any form in different species, are evidence of common ancestry.
Comparison of DNA sequences from humans, chimps, and other mammals shows a
great number of shared pseudogenes. Perhaps the best-known example of a shared
pseudogene is the eta-globin gene.
The eta gene is located on chromosome 11 in humans and is fourth in a series of 6
beta globin genes (five are functional).4 It has no start codon (AUG) and it has several
stop codons. So obviously, no mRNA is made and therefore no protein. Humans,
chimps, and gorillas have the same number of beta-globin genes arranged in the same
sequence. The exon sequences within these genes are also similar - as are the exons
of the eta gene.4 It is thought that the eta-globin gene originated by a duplication of the
gamma-A-globin gene because of the high similarity of the sequences. Also, both
genes are present in primates.
The history of the eta-globin pseudogene is thought to have begun some 140
million years ago, among the ancestors of marsupials and placental mammals. After the "evolutionary
divergence" of marsupials, the gamma-globin gene formed by duplication of an existing
gene in the beta-globin family. Later, but before radiation of the orders of placental
mammals, the eta-globin gene formed from a duplication of the gamma-globin gene.
Gamma and eta genes must therefore have been present in ancestral placentals, but
presumably gamma was lost by goats (which do not have gamma) and eta was lost by
rabbits (which do not have eta).
According to this scenario, the eta gene must have been functional at first, because
it is functional in goats today. 2 It is non-functional in all primates, which is interpreted
to mean it was already non-functional in ancestral primates some 70-80 million years
ago. This interpretation implies that the eta-globin gene has been maintained for more
than 70 million years without being converted to a useful new gene and without being
eliminated through random mutations.

Signs of Function?

So, the persistence of a non-functional DNA sequence in an entire lineage for such a
supposedly long period of time seems remarkable in the context of the gene duplication
hypothesis. The very fact that pseudogenes are still present and recognizable after
tens of millions of years without any beneficial function just doesn't seem to make
sense. Certainly, without some beneficial function, natural selection would not have
maintained their sequences for such long periods of time. There is in fact a cost to
maintain non-functional DNA. It takes energy to replicate and maintain DNA that
doesn't pay for its keep. Although this cost might seem small over the short term, an
extremely small cost compounded over the course of millions of generations starts to
turn into a significant disadvantage. So, the fact that pseudogenes have any
recognizable gene-like structure at all suggests that they do in fact serve some kind of
purpose.

The persistence of pseudogenes is in itself evidence for their activity.
This is a serious problem for evolution, as it is expected that natural
selection would remove this type of DNA if it were useless, since DNA
manufactured by the cell is energetically costly. Because of the lack of
selective pressure on this neutral DNA, one would expect that ‘old’
pseudogenes would be scrambled beyond recognition as a result of
accumulated random mutations. Moreover, a removal mechanism for
neutral DNA is now known.6

“Typically when people say that the human genome contains 27,000 genes or so,
they are referring to genes that code for proteins,” points out Michel Georges, a
geneticist at the University of Liège in Belgium. But even though that number is still
tentative—estimates range from 20,000 to 40,000—it seems to confirm that there is no
clear correspondence between the complexity of a species and the number of genes in
its genome. “Fruit flies have fewer coding genes than roundworms, and rice plants have
more than humans,” notes John S. Mattick, director of the Institute for Molecular
Bioscience at the University of Queensland in Brisbane, Australia. “The amount of
noncoding DNA, however, does seem to scale with complexity.". . .
"Increasingly we are realizing that there is a large collection of ‘genes’ that are
clearly functional even though they do not code for any protein” but produce only RNA,
Georges remarks. The term “gene” has always been somewhat loosely defined; these
RNA-only genes muddle its meaning further. To avoid confusion, says Claes Wahlestedt
of the Karolinska Institute in Sweden, “we tend not to talk about ‘genes’ anymore; we
just refer to any segment that is transcribed [to RNA] as a ‘transcriptional unit.’” Based
on detailed scans of the mouse genome for all such elements, “we estimate that there
will be 70,000 to 100,000,” Wahlestedt announced at the International Congress of
Genetics, held this past July in Melbourne. “Easily half of these could be noncoding.” If
that is right, then for every DNA sequence that generates a protein, another works solely
through active forms of RNA—forms that are not simply intermediate blueprints for
proteins but, rather, directly alter the behavior of cells.” . . .
“I think this will come to be a classic story of orthodoxy derailing objective analysis
of the facts, in this case for a quarter of a century,” Mattick says. “The failure to
recognize the full implications of this particularly the possibility that the intervening
noncoding sequences may be transmitting parallel information in the form of RNA
molecules—may well go down as one of the biggest mistakes in the history of
molecular biology." [emphasis added] 16
Given this, it is not known if all of what are currently thought of as pseudogenes
have absolutely no function. In fact, some pseudogenes are believed to function as
sources of information for producing genetic diversity. It is thought that partial
pseudogenes are copied into functional genes during genetic recombination, producing
variants of the functional gene. This phenomenon has been reported many times,
including for various immunoglobulins in mice and birds, mouse histone genes, horse
globin genes, and human beta-globin genes. It is not known if this could be a possible
role for the eta-globin gene as well. However, the fact that the eta-globin pseudogene is
located between the fetal and adult genes suggests that it might play a role in gene
switching (there seems to be some preliminary evidence to this effect although the eta
gene sequence’s part in this is still unknown).
It seems, then, that the protein-coding genes are actually rather simple informationally
- that the real informational complexity and functionality lies in the non-coding portion of
the genome. This portion of the genome directs when and where the protein building
blocks are placed and therefore is vitally important to the overall structure and ultimate
function of the resulting creature. It was because of the evolutionary bias that these
non-coding regions of DNA were assumed to be junk for so long - and therefore
overlooked and unrecognized as key informational components in the genome.
Interestingly enough, such findings actually support the predictions of intelligent design
theory while countering long-held evolutionary assumptions. Of course, there are
always ad hoc modifications to explain such failed predictions resulting from an
evolutionary bias.

One Man's Junk . . .

Other pseudogenes and so-called transposons, such as the “Alu element” (once
thought to be completely useless), are being found to have important functions.

There is a growing body of evidence that Alu (a SINE – Short
Interspersed Nuclear Element) sequences are involved in gene regulation,
such as in enhancing and silencing gene activity, or can act as a receptor-
binding site… This is surely a precedent for the functionality of other types
of pseudogenes. 6, 7

Around 1998 Carl Schmid, a molecular biologist at the University of California at
Davis, started advancing what seemed like a nutty idea to explain Alu’s unusual affinity
for genes. Schmid suggested Alu sequences resided near genes because they are not
really “junk” sequences, but are rather useful sequences involved with a mechanism
that helps cells repair themselves. With the entire genome map in front of them,
showing so many instances of Alu sequences around genes, scientists are beginning to
take Schmid seriously. “It looks pretty convincing,” Francis Collins said. Others such as
M.I.T. geneticist Eric Lander agree.8
More recently in 2001, a team of molecular geneticists discovered two “hot spots”
where the same SINEs inserted independently:
Vertebrate retrotransposons have been used extensively for
phylogenetic analyses and studies of molecular evolution. Information can
be obtained from specific inserts either by comparing sequence
differences that have accumulated over time in orthologous copies of that
insert or by determining the presence or absence of that specific element
at a particular site. The presence of specific copies has been deemed to
be an essentially homoplasy-free phylogenetic character because the
probability of multiple independent insertions into any one site has been
believed to be nil. . . . We have identified two hot spots for SINE insertion
within mys-9 and at each hot spot have found that two independent SINE
insertions have occurred at identical sites. These results have major
repercussions for phylogenetic analyses based on SINE insertions,
indicating the need for caution when one concludes that the existence of a
SINE at a specific locus in multiple individuals is indicative of common
ancestry. Although independent insertions at the same locus may be rare,
SINE insertions are not homoplasy-free phylogenetic markers.9

Even more recently, in the May 2003 issue of Nature, Jeannie Lee published an
article entitled, "Complicity of Gene and Pseudogene" in which some interesting findings
from work done by Hirotsune et al.13 were presented:

"Dysfunctional in the sense that they cannot be used as a template for producing a
protein, pseudogenes are in fact nearly as abundant as functional genes. Why have
mammals allowed their accumulation on so large a scale? One proposed answer is
that, although pseudogenes are often cast as evolutionary relics and a nuisance to
genomic analysis, the processes by which they arise are needed to create whole gene
families, such as those involved in immunity and smell. But, are pseudogenes
themselves merely byproducts of this process? Or do apparent evolutionary pressures
to retain them [natural selection] hint at some hidden biological function? For one
particular pseudogene, the latter seems to be true . . . Hirotsune and colleagues report
the unprecedented finding that the Makorin1-p1 pseudogene [located on chromosome 5
in mice] performs a specific biological task [it regulates the expression of the Makorin1
gene which is located on a completely different chromosome - chromosome 6 in mice].
The work of Hirotsune et al. is provocative for revealing the first biological function of
any pseudogene. It challenges the popular belief that pseudogenes are simply
molecular fossils -- the evidence of Mother Nature's experiments gone awry." 12,13

In yet another recent Science article by Wojciech Makalowski, the following
comments are made that seem to echo what design theorists have been saying for a
very long time:

"Although catchy, the term "junk DNA" for many years repelled mainstream
researchers from studying noncoding DNA. Who, except a small number of genomic
clochards, would like to dig through genomic garbage? However, in science as in
normal life, there are some clochards who, at the risk of being ridiculed, explore
unpopular territories. Because of them, the view of junk DNA, especially repetitive
elements, began to change in the early 1990s. Now, more and more biologists regard
repetitive elements as genomic treasure." 14

Then, as recently as the December 2003 issue of Annual Review of Genetics,
Balakirev and Ayala published a paper entitled, "Pseudogenes: Are They 'Junk' or
Functional DNA?" Consider just a few of their conclusions and see if they do not again
remind you of what design theorists have been claiming for a long time: that
pseudogenes surely have important functions and therefore are not really "pseudo" after
all:

Pseudogenes have been defined as nonfunctional sequences of genomic DNA
originally derived from functional genes. It is therefore assumed that all pseudogene
mutations are selectively neutral and have equal probability to become fixed in the
population. Rather, pseudogenes that have been suitably investigated often exhibit
functional roles, such as gene expression, gene regulation, generation of genetic
(antibody, antigenic, and other) diversity. Pseudogenes are involved in gene conversion
or recombination with functional genes. Pseudogenes exhibit evolutionary conservation
of gene sequence, reduced nucleotide variability, excess synonymous over
nonsynonymous nucleotide polymorphism, and other features that are expected in
genes or DNA sequences that have functional roles. . .
An extensive and fast-increasing literature does not justify a sharp division between
genes and pseudogenes that would place pseudogenes in the class of genomic "junk"
DNA that lacks function and is not subject to natural selection. Pseudogenes are often
extremely conserved and transcriptionally active. . .
There seems to be the case that some functionality has been discovered in all
cases, or nearly, whenever this possibility has been pursued with suitable
investigations. One may well conclude that most pseudogenes retain or acquire some
functionality and, thus, that it may not be appropriate to define pseudogenes as
nonfunctional sequences of genomic DNA originally derived from functional genes, or as
"genes that are no longer expressed but bear sequence similarity to active genes".
Rather, pseudogenes might be defined as DNA sequences derived by duplication or
retroposition from functional genes that are often subject to natural
selection and therefore retain much of the original sequence and
structure because they have acquired new regulatory or other
functions, or may serve as reservoirs of genetic variability.15

Shared Mistakes

Another interesting argument is that various pseudogenes in different species often
have certain shared "mistakes" - that "must have originated in a common ancestor." 11
However, there is some evidence that nucleotide changes may not be completely random
in certain gene locations. Mutational "hotspots" have been identified in many genes as
well as pseudogenes. In these locations, point mutations, even specific types of point
mutations, are much more common than elsewhere in the gene.

Consider the GULOP (or GULO) pseudogene for example. In most mammals this is
an active gene encoding the enzyme L-gulono-γ-lactone oxidase (LGGLO), the enzyme
that catalyzes the last step in the synthesis of ascorbic acid (vitamin C). In humans,
GULO is located on chromosome 8 at p21.1 in a region that is rich in genes (see figure).
As it turns out, this particular gene is defective in humans and other primates as well as
in several other creatures, including guinea pigs, bats and certain kinds of fish. Compared
to the rat GULO gene, the human version, as well as the great ape version, has large or
functionally significant deletions involving exons I-III, V-VI, VIII, and XI (see figure above).18-21
Compare this with the significant deletions in the guinea pig GULO sequence, which
involve exons I, V, and VI, all of which match losses in the primate sequence.
In addition to this, all four functionally detrimental stop codons (three TGA and
one TAA) that are identified in the guinea pig are shared at the same sites
in the primate GULO pseudogene.

Of course, it seems that we humans are able to get along just fine without this gene
because we eat a lot of foods that are rich in vitamin C, like citrus fruits. So, what's the
big deal? Well, the argument goes something like this (as per a popular Talk.Origins
essay by Edward E. Max, Ph.D.):

In most mammals functional GLO genes are present, inherited - according to the
evolutionary hypothesis - from a functional GLO gene in a common ancestor of
mammals. According to this view, GLO gene copies in the human and guinea pig
lineages were inactivated by mutations. Presumably this occurred separately in guinea
pig and primate ancestors whose natural diets were so rich in ascorbic acid that the
absence of GLO enzyme activity was not a disadvantage--it did not cause selective
pressure against the defective gene.
Molecular geneticists who examine DNA sequences from an evolutionary
perspective know that large gene deletions are rare, so scientists expected that non-
functional mutant GLO gene copies--known as "pseudogenes"--might still be present in
primates and guinea pigs as relics of the functional ancestral gene. . . [Beyond this], the
theory of evolution would make the strong prediction that primates [like apes and
monkeys] would carry similar crippling mutations to the ones found in the human
pseudogene. A test of this prediction has recently been reported. A small section of the
GLO pseudogene sequence was recently compared from human, chimpanzee,
macaque and orangutan; all four pseudogenes were found to share a common crippling
single nucleotide deletion that would cause the remainder of the protein to be translated
in the wrong triplet reading frame (Ohta and Nishikimi BBA 1472:408, 1999). 11,20

Now, it is interesting that many, though not all, of the various substitution mutations
in the "GLO" pseudogene are shared, including a single
deletion mutation that is shared by all primates (when compared to the rat of course). If
not for common descent, why would the sequences of human, chimpanzee, gorilla and
orangutan reveal a single nucleotide deletion at position 97 in the coding region of Exon
X? What are the odds that out of 165 base pairs the same one would be mutated in all
these primates by random chance? Pretty slim - right? Is this not then overwhelming
evidence of common evolutionary ancestry?
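The "odds" question can be put into numbers with a short calculation. The sketch below assumes that each of the four primates acquired its deletion independently and that every one of the 165 positions is equally likely to be hit. Both are simplifying assumptions made only for this example, and the function name is hypothetical.

```python
# Back-of-the-envelope odds for the same site being hit in several lineages.
# Assumes independent lineages and a uniform chance per site (illustrative only).
from fractions import Fraction

def same_site_probability(sites, lineages):
    """Chance that one deletion in each lineage lands on the same site.

    The first lineage can hit any site; each further lineage must match it,
    which happens with probability 1/sites.
    """
    return Fraction(1, sites) ** (lineages - 1)

p = same_site_probability(sites=165, lineages=4)
print(p, float(p))
```

Under these assumptions the odds are 1 in 165 cubed, which is the sense in which the coincidence looks "pretty slim". The figure depends entirely on the uniform-chance assumption.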
This would indeed seem to be the case at first approximation. However, in 2003, the
same Japanese group published the complete sequence of the guinea pig GLO
pseudogene, which is thought to have evolved independently, and compared it to that of
humans [Inai et al, 2003]. 21 Surprisingly, they reported many shared mutations
(deletions and substitutions) present in both humans and guinea pigs. Remember now
that humans and guinea pigs are thought to have diverged at the time of the common
ancestor with rodents. Therefore, a mutational difference between a guinea pig and a
rat should not be shared by humans with better than random odds. But, this was not
what was observed. Many mutational differences were shared by humans, including the
one at position 97. According to Inai et al, this indicated some form of non-random bias
that was independent of common descent or evolutionary ancestry. The probability of
the same substitutions in both humans and guinea pigs occurring at the observed
number of positions was calculated, by Inai et al, to be 1.84 x 10^-12 - consistent with
mutational hotspots.

What is interesting here is that the mutational hot spots found in guinea pigs and
humans exactly match the mutations that set humans and primates apart from the rat
(see figure below). 21,22 This particular feature has given rise to the obvious argument
that Inai et al got it wrong. Reed Cartwright, a population geneticist, has noted a
methodological flaw in the Inai paper:

"However, the sections quoted from Inai et al. (2003) suffer from a major
methodological error; they failed to consider that substitutions could have occurred in
the rat lineage after the splits from the other two. The researchers actually clustered
substitutions that are specific to the rat lineage with separate substitutions shared by
guinea pigs and humans. . .
If I performed the same analysis as Inai et al. (2003), I would conclude that there are
ten positions where humans and guinea pigs experienced separate substitutions of the
same nucleotide, otherwise known as shared, derived traits. These positions are 1, 22,
31, 58, 79, 81, 97, 100, 109, 157. However, most of these are shown to be substitutions
in the rat lineage when we look at larger samples of species.
When we look at this larger data table, only one position of the ten, 81, stands out as
a possible case of a shared derived trait, one position, 97, is inconclusive, and the other
eight positions are more than likely shared ancestral sites. With this additional
phylogenetic information, I have shown that the "hot spots" Inai et al. (2003) found are
not well supported." (see Link)
It does indeed seem like a number of the sequence differences noted by Cartwright
are fairly unique to the rat - especially when one includes several other species in the
comparison. However, I do have a question regarding this point. It seems to me that
there simply are too many loci where the rat is the only odd sequence out in Exon X
(i.e., there are seven and arguably eight of these loci). Given the published estimate on
mutation rates (Drake) of about 2 x 10^-10 per locus per generation, one should expect to
see only 1 or 2 mutations in the 164 nucleotide exon in question (Exon X) over the
course of the assumed time of some 30 Ma (million years). Therefore, the argument of
the mutational differences being due to mutations in the rat lineage pre-supposes a
much greater mutation rate in the rat than in the guinea pig. The same thing is true if
one compares the rat with the mouse (i.e., the rat's evident mutation rate is much higher
than that of the mouse).
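The expectation of "1 or 2 mutations" can be reproduced with simple arithmetic: rate per site per generation, times the number of sites, times the number of generations. The text gives the rate, the exon length and the time span but not a generation time, so the two generation times used below (one year and six months) are assumptions introduced for this sketch, as is the function name.

```python
# Expected number of substitutions in Exon X over the assumed time span.
# Rate, exon length and time span are from the text; generation times are assumed.

def expected_substitutions(rate_per_site_per_generation, sites, years,
                           generation_time_years):
    """Expected substitutions = rate x sites x number of generations."""
    generations = years / generation_time_years
    return rate_per_site_per_generation * sites * generations

RATE = 2e-10        # per site per generation (Drake figure cited above)
SITES = 164         # nucleotides in Exon X
YEARS = 30e6        # assumed 30 Ma

for gen_time in (1.0, 0.5):  # hypothetical generation times in years
    n = expected_substitutions(RATE, SITES, YEARS, gen_time)
    print(f"generation time {gen_time} yr: about {n:.2f} expected substitutions")
```

With these assumed generation times the expectation comes out close to 1 and close to 2 respectively, well short of the seven or eight rat-specific differences under discussion.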
This is especially interesting since many of the DNA mutations are synonymous (see
Link). Why should essentially neutral mutations become fixed to a much greater extent
in the rat gene pool as compared to the other gene pools? Wouldn't this significant
mutation rate difference, by itself, seem to suggest a mutationally "hot" region - at least
in the rat?
Beyond this, several loci differences are not exclusive to the rat/mouse gene pools
and therefore suggest mutational hotspots beyond the general overall "hotness" or
propensity for mutations in this particular genetic sequence.
Some have noted that although the shared mutations may be the result of hotspots,
there are many more mutational differences between humans and rats/guinea pigs as
compared to apes. Therefore, regardless of hotspots, humans and apes are clearly
more closely related than are humans and rats/guinea pigs.
The problem with this argument is that the rate at which mutations occur is related to
the average generation time. Those creatures that have a shorter generation time have
a correspondingly higher mutation rate over the same absolute period of time - like 100
years. Therefore, it is only to be expected that those creatures with very long
generation times, like humans and apes, would have fewer mutational differences
relative to each other over the same period of time than those creatures with much
shorter generation times, like rats and guinea pigs.
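The generation-time point can be illustrated with the 100-year example mentioned above. The generation times in this sketch (20 years for the long-lived lineage, six months for the short-lived one) are round numbers assumed for illustration, not measured values.

```python
# How many generations fit into the same absolute time span?
# Generation times are hypothetical round numbers.

def generations_elapsed(years, generation_time_years):
    """Number of generations that pass in a given number of years."""
    return years / generation_time_years

span_years = 100
long_lived = generations_elapsed(span_years, 20.0)   # assumed 20-year generation
short_lived = generations_elapsed(span_years, 0.5)   # assumed 6-month generation
ratio = short_lived / long_lived

print(long_lived, short_lived, ratio)
```

If the mutation rate per generation is the same in both lineages, the short-lived lineage has that many times more opportunities to accumulate mutations over the same period.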
As an aside, many other genetic mutations that result in functional losses are known
to commonly affect the same genetic loci in the same or similar manner outside of
common descent. For example, achondroplasia is a spontaneous mutation in humans
in about 85% of the cases. In humans achondroplasia is due to mutations in the FGFR2
gene. A remarkable observation on the FGFR2 gene is that the major part of the
mutations are introduced at the same two spots (755 C->G and 755-757 CGC->TCT)
independent of common descent. The short legs of the Dachshund are also due to the
same mutation(s). The same allelic mutation has occurred in sheep as well.
What is interesting about many of these mutational losses is that they often share
the same mutational changes. It is at least reasonably plausible then that the GULO
mutation could also be the result of a similar genetic instability that is shared by similar
creatures (such as humans and the great apes).
Another interesting example of this phenomenon has been studied in detail in more
rapidly reproducing organisms, such as viruses. For example, an interesting study was
published by Bull et al., on replicate lineages of the bacteriophage φX174.
Numerous mutations occurred in each genome during propagation. Across nine
separate lineages 119 independent substitutions occurred at 68 nucleotide sites.
What is interesting here is that over half of these substitutions at 1/3 of the sites were
identical in the different lineages. Some convergent substitutions were specific to
specific hosts while others were shared between the two separate hosts. Phylogenetic
reconstruction using the complete genome sequence not only failed to recover the
correct evolutionary history because of these convergent changes, but the true history
was rejected as being a significantly inferior fit to the data (see Link).
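A tally of this kind is easy to sketch: list each substitution as (lineage, site, new base) and count those that recur in more than one lineage. The data below are a small invented set with three lineages, not the Bull et al. results, and the function name is hypothetical.

```python
# Counting convergent substitutions across replicate lineages.
# The substitution list is toy data made up for this example.
from collections import defaultdict

def convergent_changes(substitutions):
    """Group substitutions by (site, new_base) and keep those seen in
    more than one lineage."""
    lineages_by_change = defaultdict(set)
    for lineage, site, new_base in substitutions:
        lineages_by_change[(site, new_base)].add(lineage)
    return {change: sorted(lineages)
            for change, lineages in lineages_by_change.items()
            if len(lineages) > 1}

toy = [
    ("A", 101, "T"),
    ("B", 101, "T"),
    ("C", 101, "T"),
    ("A", 250, "G"),
    ("B", 377, "C"),
    ("C", 377, "C"),
    ("C", 412, "A"),
]

shared = convergent_changes(toy)
shared_count = sum(len(lineages) for lineages in shared.values())
print(shared, f"{shared_count} of {len(toy)} substitutions are shared")
```

In this toy set more than half of the substitutions fall at just two of the four changed sites, the same general pattern as the one reported for the phage lineages.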

This same sort of thing is seen to a fairly significant degree in the GULO region.
Many of the same significant mutations are shared between humans and guinea pigs.
Consider the following illustration yet again:
Why would both humans and guinea pigs share major deletions of exons I, V and VI
as well as four stop codons if these mutations were truly random? In addition to this, a
mutant group of Danish pigs has also been found to show a loss of GULO
functionality. And, guess what, the key mutation in these pigs was a loss of a sizable
portion of exon VIII. This loss also matches the loss of primate exon VIII. In addition,
there is a frame shift in intron 8 which results in a loss of correct coding for exons 9-12.
This also reflects a very similar loss in this region in primates (see Link). That's quite a
few key similarities that were clearly not the result of common ancestry for the GULO
region. This seems to be very good evidence that many if not all of the mutations of the
GULO region are indeed the result of similar genetic instabilities that are prone to
similar mutations - especially in similar animals.

Back to mutational hotspots, what makes hotspots so "hot"? Perhaps the answer
lies in the chemical nature of the hotspot region. The type of molecular bonds, their
stability or instability, or other molecular interactions may lend themselves to specific
nucleotide pair switches, especially given certain environmental changes. No one really
knows for sure except to say that mutational hot spots do exist. So, given that they do
exist, similar genes should be expected to function in similar ways and this includes
having similar mutational "hotspots" and/or "shared mistakes." 3 In any case, it is
interesting to note that there are no such examples of "shared errors" between
mammals and other groups of animals (although there are plenty of common "errors"
that are shared by widely divergent mammalian groups).

There are no examples of 'shared errors' that link mammals to other
branches of the genealogic tree of life on earth. . . Therefore, the
evolutionary relationships between distant branches on the evolutionary
genealogic tree must rest on other evidence besides 'shared errors.' 11

Of course the argument used to explain this fact is that mammals split off from other
groups of animals over 200 million years ago. Given this amount of time, random
mutations would have obliterated any trace of common genetic errors. 11 This is a very
good point. The question remains, however, as to why some identifiable genetic
errors are maintained as long as they are if they are in fact functionless. Also,
"processed pseudogenes" are very similar to "movable genetic elements" which are
often transmitted from animal to animal by viruses. Certain interspecies pseudogenes
of this type might in fact share a common ancestor while the various types of animals
themselves, that harbor certain of these genetic sequences, may not be related through
common descent so much as they are partially related through common infection.
In any case, there really are no "foolproof" genetic markers of common descent. All
of the ones proposed so far to be foolproof have been shown to have significant flaws.
The prediction that pseudogenes, transposons (SINEs and LINEs) and other shared
mutational mistakes are conclusive evidence for common descent has not held up over
recent years. For example, consider the following excerpt from David Hillis' paper
entitled, "SINEs of the perfect character," published in the Proceedings of the National
Academy of Sciences, 1999:

What of the claim that the SINE/LINE insertion events are perfect
markers of evolution (i.e., they exhibit no homoplasy)? Similar claims
have been made for other kinds of data in the past, and in every case
examples have been found to refute the claim. For instance, DNA-DNA
hybridization data were once purported to be immune from convergence,
but many sources of convergence have been discovered for this
technique. Structural rearrangements of genomes were thought to be
such complex events that convergence was highly unlikely, but now
several examples of convergence in genome rearrangements have been
discovered. Even simple insertions and deletions within coding regions
have been considered to be unlikely to be homoplastic, but numerous
examples of convergence and parallelism of these events are now
known. Although individual nucleotides and amino acids are widely
acknowledged to exhibit homoplasy, some authors have suggested that
widespread simultaneous convergence in many nucleotides is virtually
impossible. Nonetheless, examples of such convergence have been
demonstrated in experimental evolution studies. 10

A New Paradigm

Obviously then, the old notion that pseudogenes and other forms of shared "junk"
DNA give clear evidence of common ancestry, rather than of common functional need, will
have to be discarded. If organisms share similar environments and have similar
morphologic appearances and needs, should one be surprised to find similar functional
genetic elements shared between them? Such sequences cannot be used to
clearly establish evolutionary trees or to estimate divergence times, since beneficial
sequences would be maintained over time by natural selection without any significant
change. The similarities and differences would not reflect evolutionary
change since a shared common ancestor so much as they would reflect
similarities and differences in functional needs that have always been there, maintained
by the forces of natural selection, since these creatures came to be.

No one knows yet just what the big picture of genetics will look like once this
hidden layer of information is made visible. "Indeed, what was damned as junk because
it was not understood may, in fact, turn out to be the very basis of human complexity,"
Mattick suggests. Pseudogenes, riboswitches and all the rest aside, there is a good
reason to suspect that is true. Active RNA, it is now coming out, helps to control the
large-scale structure of the chromosomes and some crucial chemical modifications to
them—an entirely different, epigenetic layer of information in the genome.16
In fact, the most detailed probe yet into the workings of the human genome has led
scientists to conclude [as of June 14, 2007] that a cornerstone concept about the
chemical code for life is badly flawed. Reporting in the British journal Nature and the
US journal Genome Research on Thursday [June 14, 2007], they suggest that an
established theory about the genome should be consigned to history.
In between the genes and the sequences known to regulate their activity are long,
tedious stretches that appear to do nothing. The term for them is "junk" DNA, reflecting
the presumption that they are merely driftwood from our evolutionary past and have no
biological function. But the work by the ENCODE (ENCyclopaedia of DNA Elements)
consortium implies that this nuggets-and-dross concept of DNA should be, well, junked.
The genome turns out to be a highly complex, interwoven machine with very few
inactive stretches, the researchers report. Genes, it transpires, are just one of many
types of DNA sequences that have a functional role. And "junk" DNA turns out to have
an essential role in regulating the protein-making business. Previously written off as
silent, it emerges as a singer with its own discreet voice, part of a vast, interacting
molecular choir.
"The majority of the genome is copied, or transcribed, into RNA, which is the active
molecule in our cells, relaying information from the archival DNA to the cellular
machinery," said Tim Hubbard of the Wellcome Trust Sanger Institute, a British research
group that was part of the team. "This is a remarkable finding, since most prior research
suggested only a fraction of the genome was transcribed."
Francis Collins, director of the US National Human Genome Research Institute
(NHGRI), which corralled 35 scientific groups from around the world into the ENCODE
project, said the scientific community "will need to rethink some long-held views about
what genes are and what they do."17

The Human Genome in Numbers 26

• 1.5% of the genome translated into proteins

• 27% of the genome transcribed as part of protein-coding gene expression but not
translated into proteins

• 25% of the genome that is transcribed but not translated, and is not associated with
protein-coding genes

• 250 microRNAs currently identified (as of June 2005)
o ~1,000 as of 2007 (Link)

• 10,000 protein-coding genes estimated to be regulated by microRNAs; each microRNA can
target several genes, and a particular gene may be regulated by several microRNAs

• 98% of genomic output that is non-coding RNA

• 9% of genes that appear to have associated antisense transcripts

• ~20,000 "pseudogenes" in the genome
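
As a quick check on how the figures in the list above add up, the minimal sketch below sums the three transcription-related percentages. It assumes the three categories do not overlap, which the list does not state explicitly, and the variable names are purely illustrative.

```python
# Percentages of the genome taken from the list above (reference 26).
# Assumption: the three categories are mutually exclusive.
translated = 1.5               # translated into proteins
transcribed_with_genes = 27.0  # transcribed with protein-coding genes, not translated
transcribed_other = 25.0       # transcribed, not associated with protein-coding genes

# Everything that is translated must first be transcribed, so all three count.
total_transcribed = translated + transcribed_with_genes + transcribed_other

# Fraction of the transcribed sequence that never becomes protein.
noncoding_share = (transcribed_with_genes + transcribed_other) / total_transcribed

print(f"Total transcribed: {total_transcribed:.1f}% of the genome")
print(f"Non-coding share of transcribed sequence: {noncoding_share:.1%}")
```

Under that assumption, a little over half of the genome is transcribed, and nearly all of that transcribed sequence is non-coding, which is roughly in line with the 98% figure in the list.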

This is very interesting. Who would have thought that the majority of the
genome would be copied or transcribed into RNA, and that it would in fact be
functional? Only a few years ago the scientific community believed that less than 5% of
the genome was actually functional and that the rest consisted of non-functional
evolutionary remnants. After all, "noncoding genomic regions account for 98% to 99% of the human
genome and consist of introns found within protein-coding transcripts and the intergenic
regions between them."25 Add to these numbers the very surprising finding that many
genetic sequences that do not produce either proteins or RNA are also being found to
be functional (see the discussion of Pyknons below).
Who would have predicted this, besides creationists and intelligent design
theorists, that is? Creationists and intelligent design theorists have been claiming for
many years that the concept of "junk DNA" (as well as vestigial structures) was not
entirely correct. I myself have been promoting this idea for over 11 years (as of June
2008). Yet only now are mainstream scientists finally starting to realize the significant
errors in their long-cherished beliefs about the ill-conceived notion of junk
DNA, an idea based on ardently held evolutionary presuppositions that
blinded mainstream science and kept it from searching out the hidden
treasures of so-called "junk DNA" for a fairly long time.
When are scientists going to start realizing that the creationist paradigm does
indeed have very good predictive scientific value when it comes to accurately
understanding and investigating the physical world and universe?

Pyknons

To add to this, consider the fairly recent finding (2006) of "pyknons" by Rigoutsos et
al.24 Pyknons are variable-length patterns within DNA sequences that have identically
conserved copies and multiplicities above what is expected by chance. They are also not
transcribed into RNA (unlike miRNAs noted above) or translated into protein. Among the
millions of discovered patterns, Rigoutsos et al. found a subset of 127,998 patterns,
which they termed pyknons, that have additional nonoverlapping instances in the
untranslated and protein-coding regions of 30,675 transcripts from 20,059 human
genes. The pyknons arrange combinatorially in the untranslated and coding regions of
numerous human genes where they form mosaics. Consecutive instances of pyknons in
these regions show a strong bias in their relative placement, favoring distances of ~22
nucleotides.
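
To make the idea of "patterns with more copies than expected" concrete, here is a toy sketch that finds fixed-length substrings recurring at least a given number of times in a sequence. It is only an illustration: the function name, the fixed pattern length, and the thresholds are assumptions made for this example, and it is not the variable-length pattern discovery method used by Rigoutsos et al.

```python
from collections import Counter

def repeated_patterns(sequence, length, min_copies):
    """Return fixed-length substrings that occur at least min_copies times.

    Occurrences are counted with a sliding window, so instances of a
    self-overlapping pattern may overlap (a simplification).
    """
    counts = Counter(
        sequence[i:i + length] for i in range(len(sequence) - length + 1)
    )
    return {pattern: n for pattern, n in counts.items() if n >= min_copies}

# Hypothetical toy sequence, far shorter than any real genome.
toy_sequence = "ACGTACGTTTACGTAGGACGT"
print(repeated_patterns(toy_sequence, length=4, min_copies=3))
```

A real analysis would also have to compare the observed copy numbers against what chance alone would produce in a sequence of that size and composition, which this sketch does not attempt.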
Pyknons are also very common in the human genome. They form 1/6th of the
human intergenic and intronic regions for a total of 127,998 pyknons covering
898,424,004 DNA nucleotide positions on the forward and reverse strands of the human
genome.
What is interesting here, of course, is that pyknons are associated with specific
biologic processes - i.e., they are functional. Cross-genome comparisons reveal that
many of the pyknons have instances in the 3' UTRs of genes from other vertebrates and
invertebrates where they are overrepresented in similar biological processes, as in the
human genome. This "unexpected finding" suggests, according to the authors, potential
unique functional connections between the coding and noncoding parts of the human
genome - such as a possible link with posttranscriptional gene silencing and RNA
interference.

"Human pyknons are also present in other genomes, where they associate with
similar biological processes. Notably, >600 million nucleotides that are associated with
nongenic copies of pyknons in the human genome are absent from the mouse and rat
genomes. Interestingly, the human pyknons have many instances in the intergenic and
intronic regions of the phylogenetically distant worm and fruit fly genomes, covering ~1.6
million nucleotides in each."24

Given that genetic sequences that are transcribed or translated or both seem to
account for the "majority" of the genome, and are thought to be functionally beneficial, it
is interesting that certain types of genetic sequences that are neither translated nor
transcribed are also being found to be functional. Taken together, it seems that the
significant majority of the genome is indeed functional to at least some degree: well
over 50%, and perhaps closer to 85-90% or even higher.

The Key Human-Ape Differences

It is becoming more and more clear that the key functional differences between
living things, like humans and apes, are found not so much in protein-coding genes as
in the non-coding regions of DNA once thought to be functionless "junk DNA":
evolutionary remnants of past mistakes that are shared between various creatures.
This notion is being shed as more and more discoveries show that many
of these same regions are not just functional; they carry the vast majority of the genetic
information. The "genes" that were once thought to be so important for genetic
function are turning out to be equivalent to the most basic, low-level building blocks
within the genome, like bricks and mortar. Surprisingly, it is the non-coding regions of
DNA that control what is done with these building blocks and that determine what kind of
"house" to build, so to speak. The following article is very interesting in this regard:

"Seventy-five percent of known human miRNAs [microRNAs] cloned in this study
were conserved in vertebrates and mammals, 14% were conserved in invertebrates,
10% were primate specific and 1% are human specific. The new miRNAs have a
different conservation distribution: more than half of the human miRNAs were
conserved only in primates, about 30% in mammals and 9% in nonmammalian
vertebrates or invertebrates; 8% were specific to humans. We saw a similar distribution
for the chimpanzee miRNAs.

The different miRNA repertoire, as well as differences in expression levels of
conserved miRNAs, may contribute to gene expression differences observed in human
and chimpanzee brain. Although the physiological relevance of miRNAs expressed
at low levels remains to be shown, it is tempting to speculate that a pool of such
miRNAs may contribute to the diversity of developmental programs and cellular
processes . . . For example, miRNAs recently have been implicated in synaptic
development and in memory formation. As the species specific miRNAs described here
are expressed in the brain, which is the most complex tissue in the human body, with an
estimated 10,000 different cell types, these miRNAs could have a role in establishing or
maintaining cellular diversity and could thereby contribute to the differences in human
and chimpanzee brain ... function." 23

Pseudogenes are also being found to have functionality similar to that of miRNAs.
"Transcripts of processed pseudogenes can contain regions with significant antisense
homology, which may suggest a regulatory role for transcribed pseudogenes through an
RNAi-like mechanism" (see Link). Two recent studies have demonstrated that such
transcribed pseudogenes can regulate transcription of homologous protein-coding
genes. Transcription of a pseudogene in Lymnaea stagnalis that is homologous to the
nitric oxide synthase gene decreases the expression levels for the gene through
formation of an RNA duplex; this is thought to arise via a reverse-complement sequence
found at the 5′ end of the pseudogene transcript (Link). In a second example,
transcription of the makorin1-p1 TPΨg in mouse was required for the stability of the
mRNA from a homologous gene, makorin1. This regulation was deduced to arise from
an element in the 5′ areas of both the gene and the pseudogene (Link). More recently,
Weil et al. discovered that the murine FGFR-3 pseudogene is transcribed in fetal tissues
in an antisense direction. This prompted the following consideration:

'As the regions of exact identity between FGFR-3 and its pseudogene can be up to
60 nt long, it may be envisioned that FGFR-3 transcripts could play a regulatory role in
FGFR-3 expression. If these antisense transcripts could hybridize to sense FGFR-3
transcripts inside the cells, this may lead to either rapid degradation or inhibition of
translation.' (Link)

As Yao et al. predict, "Further studies on transcribed pseudogenes will add to our
understanding of their potential roles as non-coding RNA genes or other new types of
functional elements." (Link) It seems like many transcribed pseudogenes may act as
giant miRNAs to regulate the function of protein-coding genes and other genetic
elements.
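
The antisense pairing described above can be illustrated with a short sketch. It computes the reverse complement of a sequence and then finds the longest stretch of an antisense sequence that could pair perfectly with a sense sequence. The function names and the example sequences are hypothetical, the DNA alphabet (A, C, G, T) is used for simplicity, and real hybridization also depends on mismatches, structure, and cellular conditions that this sketch ignores.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.translate(COMPLEMENT)[::-1]

def longest_antisense_match(sense, antisense):
    """Length of the longest stretch of `antisense` that pairs perfectly with `sense`.

    A region of the antisense strand can pair with the sense strand exactly
    where its reverse complement equals a substring of the sense strand, so
    this is a longest-common-substring search against that reverse complement.
    """
    target = reverse_complement(antisense)
    best = 0
    prev = [0] * (len(target) + 1)
    for a in sense:
        curr = [0] * (len(target) + 1)
        for j, b in enumerate(target, start=1):
            if a == b:
                curr[j] = prev[j - 1] + 1  # extend the current matching run
                best = max(best, curr[j])
        prev = curr
    return best

# Hypothetical example: the antisense strand is built from part of the sense strand.
sense = "ATGGCCATTGTA"
antisense = reverse_complement("GCCATTG")
print(antisense, longest_antisense_match(sense, antisense))
```

In the FGFR-3 example quoted above, the comparable quantity would be the length of the regions of exact identity (up to 60 nt) between the gene and its pseudogene.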

Additional information dealing with this most interesting topic is listed in a fairly extensive essay by
Wade Schauer (used with permission).

1. Jacq C, Miller JR, Brownlee GG. A pseudogene structure in 5S DNA of Xenopus laevis,
Cell 12:109-120. 1977.
2. Gibson L. J., Pseudogenes and Origins, Origins 21(2):91-108. 1994.
3. Menotti R.M., Starmer W.T., Sullivan D.T., Characterization of the structure and evolution
of the Adh region of Drosophila hydei, Genetics 127:355-366. 1991.
4. Lalley P.A., Davisson M.T., Graves J.A.M., O’Brien S.J., Womack J.E., Roderick T.H.,
Creau-Goldberg N., Hillyard A.L., Doolittle D.P., Rogers J.A., Report of the committee on
comparative mapping, Cytogenetics and Cell Genetics 51:503-532. 1989.
5. Long M., Langley C.H., Natural selection and the origin of jingwei, a chimeric processed
functional gene in Drosophila, Science 260:91-95. 1993.
6. Jerlstrom, Pierre. 2000. Pseudogenes. Creation Ex Nihilo Technical Journal 14 (no. 3):15.
7. Woodmorappe, John. 2000. Are Pseudogenes 'Shared Mistakes' Between Primate
Genomes? Creation Ex Nihilo Technical Journal 14 (no. 3):58-71.
8. Abate, Tom. 2001. Genome Discovery Shocks Scientists. San Francisco Chronicle
(February 11).
9. Cantrell, Michael A. and others. 2001. An Ancient Retrovirus-like Element Contains Hot
Spots for SINE Insertion. Genetics 158:769-777.
10. Hillis, David M. 1999. SINEs of the perfect character. Proceedings of the National
Academy of Sciences 96:9979-9981.
11. Max, Edward. Plagiarized Errors and Molecular Genetics. Creation/Evolution (XIX, p.34)
1986-2003. ( http://www.talkorigins.org/faqs/molgen/ )
12. Lee, Jeannie T., Complicity of the gene and pseudogene, Nature 423:26-28. 2003
13. Hirotsune, Shinji et al., An expressed pseudogene regulates the messenger-RNA stability
of its homologous coding gene, Nature 423:91-96. 2003
14. Makalowski, Wojciech. 2003. Not Junk After All, Science 300:1246-1247
15. Balakirev, Evgeniy S., Ayala, Francisco J., PSEUDOGENES: Are They "Junk" or
Functional DNA? Annual Review of Genetics, Vol. 37, pp. 123-151, December 2003 (
http://arjournals.annualreviews.org/doi/abs/10.1146%2Fannurev.genet.37.040103.103949 )
16. Wyatt Gibbs, The Unseen Genome: Gems among the Junk, Scientific American,
November 2003, pp 45-53 ( Link )
17. ENCODE Project Consortium et al., Identification and analysis of functional elements in
1% of the human genome by the ENCODE pilot project, Nature 447, 799-816 (14 June
2007); Richard Ingham, Landmark study prompts rethink of genetic code, Yahoo News,
accessed June 15, 2007 (Link1, Link2)
18. Nishikimi, M. and Yagi, K. (1991) Molecular basis for the deficiency in humans of
gulonolactone oxidase, a key enzyme for ascorbic acid biosynthesis. Am. J. Clin. Nutr.
54(6 Suppl):1203S-1208S.
19. Nishikimi, M., Fukuyama, R., Minoshima, S., Shimizu, N. and Yagi. K. (1994) Cloning and
chromosomal mapping of the human nonfunctional gene for L-gulono-gamma-lactone
oxidase, the enzyme for L-ascorbic acid biosynthesis missing in man. J. Biol. Chem.
269:13685-13688.
20. Ohta, Y. and Nishikimi, M. (1999) Random nucleotide substitutions in primate
nonfunctional gene for L-gulono-gamma-lactone oxidase, the missing enzyme in L-
ascorbic acid biosynthesis. Biochim. Biophys. Acta. 1472:408-411.
21. Inai, Y., Ohta. Y., and Nishikimi, M. (2003) The whole structure of the human
nonfunctional L-gulono-gamma-lactone oxidase gene--the gene responsible for scurvy--
and the evolution of repetitive sequences thereon. J Nutr Sci Vitaminol (Tokyo) 49:315-
319.
22. Peter Borger, Shared mutations: Common descent or common mechanism?, The
Independent Research Institute on Origins, Accessed 8/10/07 ( Link )
23. Eugene Berezikov, Fritz Thuemmler, Linda W van Laake, Ivanela Kondova, Ronald
Bontrop, Edwin Cuppen & Ronald H A Plasterk, "Diversity of microRNAs in human and
chimpanzee brain", Nature Genetics, Vol 38 | Number 12 | December 2006 pp. 1375-
1377. ( Link )
24. Isidore Rigoutsos, Tien Huynh, Kevin Miranda, Aristotelis Tsirigos, Alice McHardy, and
Daniel Platt, Short blocks from the noncoding parts of the human genome have instances
within nearly all known genes and relate to biological processes, PNAS | April 25, 2006 |
vol. 103 | no. 17 | 6605-6610 ( Link )
25. Jill Cheng, Philipp Kapranov, Jorg Drenkow, Sujit Dike, Shane Brubaker, Sandeep Patel,
Jeffrey Long, David Stern, Hari Tammana, Gregg Helt, Victor Sementchenko, Antonio
Piccolboni, Stefan Bekiranov, Dione K. Bailey, Madhavan Ganesh, Srinka Ghosh, Ian Bell,
Daniela S. Gerhard, Thomas R. Gingeras, Transcriptional Maps of 10 Human
Chromosomes at 5-Nucleotide Resolution, Science 20 May 2005: Vol. 308. no. 5725, pp.
1149 - 1154 ( Link )
26. Richard Twyman, Small RNA: BIG NEWS, The Human Genome, January 2005 ( Link )