
Manual on Application

of Molecular Tools in
Aquaculture and Inland
Fisheries Management

Part 1

Conceptual basis of
population genetic
approaches

NACA Monograph 1

www.enaca.org
Contributors
Thuy Nguyen
Network of Aquaculture Centres in Asia-Pacific

David Hurwood, Peter Mather


School of Natural Resource Sciences, Queensland University of Technology

Uthairat Na-Nakorn
Kasetsart University, Thailand

Wongpathom Kamonrat
Department of Fisheries, Thailand

Devin Bartley
Food and Agriculture Organization of the United Nations

Queensland University
of Technology
Brisbane, Australia
NACA MONOGRAPH SERIES
NACA is an intergovernmental organization that
promotes rural development through sustainable
aquaculture. NACA seeks to improve rural income,
increase food production and foreign exchange
earnings and to diversify farm production. The
ultimate beneficiaries of NACA activities are farmers
and rural communities.

Visit NACA online at www.enaca.org for hundreds of freely downloadable publications on aquaculture and aquatic resource management.

© Network of Aquaculture Centres in Asia-Pacific


PO Box 1040, Kasetsart University Post Office
Ladyao, Jatujak
Bangkok 10903
Thailand
Email: info@enaca.org

Nguyen, T.T.T., Hurwood, D., Mather, P., Na-Nakorn, N., Kamonrat, W. and Bartley, D. 2006. Manual on applications of molecular tools in aquaculture and inland fisheries management, Part 1: Conceptual basis of population genetic approaches. NACA Monograph No. 1, 80p.

ISBN 978-974-88246-1-1

Printed by Scandmedia, Bangkok.


Contents

Preface
Acknowledgements
Background
Target audiences
Aims, scope and format of the manual
Abbreviations

Section 1. The fundamental nature of DNA
1.1 Basic DNA structure
1.2 Where does variation in DNA sequences come from?

Section 2. Genetic variation in nature

Section 3. Basic concepts in population genetics

Section 4. Natural selection

Section 5. Genetic drift

Section 6. Non-random mating and population structure

Section 7. Environmental influences on population processes

Section 8. Ecological influences on population processes

Glossary
Bibliography

Preface

The mandate of NACA is to support member governments in their endeavours to achieve long-term sustainability of inland fishery resource utilisation and aquaculture development. In this regard, NACA plays a major role in developing human capacity in these aspects in the member countries.

In the current millennium, inland fisheries resource utilisation and aquaculture development have to go hand in hand with maintaining environmental integrity and biodiversity. Conserving biodiversity has become an important consideration worldwide. Nations that import aquaculture products often stress that the production processes must not negatively affect natural biodiversity. Furthermore, conservation of biodiversity is an integral component of responsible fisheries and is enshrined in the FAO Code of Conduct for Responsible Fisheries. Consequently, NACA, as mandated by its Governing Council, is embarking on a program that attempts to sustain genetic diversity in relation to inland fisheries management and aquaculture development in the region.

One of the initial steps is to assist member nations to achieve the above broad objectives and to develop human capacity in the current methodologies used to assess genetic diversity and its applications to biodiversity issues in inland fishery resource utilisation and aquaculture development. This manual is produced to facilitate training processes that NACA will undertake in the ensuing years to enable the member nations to achieve the overall objectives in regard to maintaining biodiversity in relation to development of aquatic resources utilisation.

We accept the fact that a number of textbooks are available for reference in this field. Most, however, are expensive for many users, and some of the techniques provided in them are not always suitable for many of the molecular laboratories in the region. This has prompted us to prepare this manual, which is designed to be less expensive, more “user friendly” and of direct relevance to the region.

Acknowledgements

T. T. T. Nguyen would like to thank Mr. Pedro Bueno, former Director General
of NACA, without whose support the manual would not have been possible.
Encouragement from Prof. Sena De Silva, Deakin University (current Director
General of NACA) is very much appreciated. P. Mather and D. Hurwood would
like to acknowledge the Australian Centre for International Agricultural Research
(ACIAR) for funding support.

Background

It has generally been accepted that aquaculture can contribute significantly to narrowing the gap between demand for and supply of aquatic food. Currently, aquaculture production is estimated to be 51.4 million tonnes annually, valued at US$60 billion. More importantly, developing countries, particularly in Asia, account for over 85% of current production. It is most likely that the dominance of Asian countries in aquaculture production will be maintained into the foreseeable future.

With increasing developments in aquaculture, however, the sector has also had to face public concern in regard to environmental effects. Aquaculture development with no regard for social and environmental issues is no longer acceptable to the public, be it in developed or developing countries. Aquaculture development increasingly needs to take environmental impacts into account. It is in this regard that maintaining and sustaining the environment has become paramount. Attention to genetic diversity and biodiversity in aquaculture development and aquatic resource management is therefore a crucial element of sustainable environments.

Introduction of new species/strains can affect biodiversity via impacts on the native gene pool. New species/strains can hybridise with native stocks, and hence alter the natural genetic architecture. This may be expressed as a loss of valuable genetic material, such as locally adapted genes or gene complexes, or as homogenisation of previously structured populations via flooding with exogenous genes. In Thailand, one example of such impacts is the outcome of hybridisation between the Thai walking catfish, Clarias macrocephalus, and the African catfish, C. gariepinus (Senanan et al., 2004). While the long-term impact of this hybridisation is still to be determined, there has been a general loss of genetic diversity in the native species. Similarly, it has been suggested that hybrid Clarias are contributing to the decline of native C. batrachus in the Mekong Delta (Welcomme and Vidthayanon, 2003). A parallel situation appears to be occurring elsewhere in Viet Nam, but as yet no genetic analyses have been conducted (personal observation).

Stock enhancement is a common fishery practice in the freshwaters of many Asian nations, and is considered to be a means by which fish food supplies can be significantly enhanced (Petr, 1998; De Silva, 2004). Many enhancement practices, except those in China and perhaps India, are however dependent primarily on exotic species, with little understanding of their effects on genetic diversity in the native species. A limited study conducted in Thailand appears to indicate that stock enhancement together with escapees from aquaculture operations have brought
about a decrease in genetic diversity in the silver barb Puntius gonionotus populations (Kamonrat, 1996). Indeed, the observation itself indicates a need to step up the number of similar studies in the region to enable measures to be adopted that ensure levels of genetic diversity and biodiversity can be sustained for the long term.

The major regional genetic program initiatives in Asia have thus far largely been confined to selective breeding programs, a much needed area of work for aquaculture development in the region. None of these programs, however, was directly related to contributing to aquatic resource diversity. On the other hand, at recent regional workshops (Gupta and Acosta, 2001) in which most Asian nations were represented, ongoing and planned genetics-related work was discussed and some consideration was given to biodiversity and conservation issues. Unfortunately, only a very limited number of biodiversity-related studies were reported.

To date only a limited number of studies have addressed biodiversity issues in freshwater species in the region. These studies have, however, raised important concerns regarding the potential negative impacts of aquaculture on biodiversity. Of particular concern is the ongoing practice of translocation and importation of exotic strains/species for culture. Senanan et al. (2004) and Na-Nakorn et al. (2004) have provided evidence that African catfish (Clarias gariepinus) genes have introgressed into wild and broodstock populations of native C. macrocephalus in Thailand, while Kamonrat (1996) demonstrated that a similar situation has resulted for silver barb Puntius gonionotus.

Another major concern is poor stock management practices in hatcheries, especially with respect to broodstock management, which may lead to losses of genetic variation in culture stocks due to genetic drift and inbreeding. Although the number of published works on this matter is limited (e.g. Eknath and Doyle, 1990), there is anecdotal evidence for genetic erosion of cultured stocks, especially with regard to the major carp species.

Asian nations at the meeting of the International Network of Genetics in Aquaculture in 2000 (Gupta and Acosta, 2001) recognised that more attention needs to be paid to biodiversity and conservation issues. Thus while attention should be paid to genetic improvement of important cultured species, increasing awareness of the potential impacts of aquaculture and fisheries (and related activities) on biodiversity is also very important at this stage. There is a need to build the capacity of regional fisheries agencies in molecular genetic techniques to address this issue. Genetic diversity studies in the region should therefore focus on:

• Genetic improvement of important cultured species

• Assisting management practices in aquaculture operations, especially broodstock management

• Resolving taxonomic uncertainties and phylogenetic relationships, especially for those species or populations that are endangered and/or commercially important

• Documenting patterns of natural genetic diversity and identifying management units

• Assessing genetic impacts of cultured stocks on indigenous stocks

In the light of the major aquaculture developments taking place in Asia, urgent attention is needed on biodiversity and genetic integrity issues of cultured as well as indigenous wild stocks; issues that are increasingly raised by the public and nations that import aquatic products. It is in this regard that there is a great need to build capacity in applied molecular genetics at the national and regional levels. This will allow characterisation of the genetic resources of relevant species important to aquaculture and inland fisheries in the respective nations/sub-region. Knowledge of the applications of molecular genetics to aquaculture and fisheries management will help reduce the negative impacts of many current activities on biodiversity, and allow development of suitable strategies for maintaining and sustaining diversity. It will also help to provide a useful guide to the identification and conservation of genetic integrity of aquatic species within the region.

Target audiences

This manual is expected to enable NACA member country personnel to be trained to undertake molecular genetic studies in their own institutions, and as such is aimed at middle and higher level technical grades. The manual can also provide useful teaching material for specialised advanced level university courses in the region and for postgraduate students.

The manual has gone through two development/improvement stages. The initial material was tested at a regional workshop and at the second stage feedback from participants was used to improve the contents.

Aims, scope and format of the manual

The aim of this manual is to provide a comprehensive practical tool for the generation and analysis of genetic data for subsequent application in aquatic resources management in relation to genetic stock identification in inland fisheries and aquaculture.

The material only covers general background on genetics in relation to aquaculture and fisheries resource management, together with the techniques and relevant methods of data analysis that are commonly used to address questions relating to genetic resource characterisation and population genetic analyses. No attempt is made to include applications of genetic improvement techniques, e.g. selective breeding or producing genetically modified organisms (GMOs).

The manual includes two ‘stand-alone’ parts:

• Part 1 – Conceptual basis of population genetic approaches: will provide a basic foundation on genetics in general, and concepts of population genetics. Issues on the choices of molecular markers and project design are also discussed.

• Part 2 – Laboratory protocols, data management and analysis: will provide step-by-step protocols of the most commonly used molecular genetic techniques utilised in population genetics and systematic studies. In addition, a brief discussion and explanation of how these data are managed and analysed is also included.

Abbreviations

A Adenine
AA Amino Acid
AFLP Amplified fragment length polymorphism
AMOVA Analysis of molecular variance
ANOVA Analysis of variance
C Cytosine
DGGE Denaturing gradient gel electrophoresis
DNA Deoxyribonucleic acid
dsDNA Double stranded DNA
G Guanine
GD Genetic drift
HWE Hardy-Weinberg Equilibrium
IBD Isolation-by-distance or identical-by-descent
kb 1000 nucleotide base pairs (kilobase)
LHT Life history traits
MDS Multidimensional scaling ordinations
MHC Major histocompatibility complex
mRNA Messenger ribonucleic acid
MSN Minimum spanning network
mtDNA Mitochondrial deoxyribonucleic acid
MU Management units
NCA Nested clade analysis
nDNA Nuclear deoxyribonucleic acid
Nm Effective number of migrants (where N = effective population size and m = migration rate)
NS Natural selection
PCR Polymerase chain reaction
RAPD Random amplified polymorphic DNA
RE Restriction enzyme
RNA Ribonucleic acid
SCR Semi-conservative replication
SSCP Single strand conformational polymorphism
SSR Simple sequence repeats
T Thymine
TGGE Temperature gradient gel electrophoresis
U Uracil

SECTION 1

The fundamental nature of DNA

Traditional approaches in fisheries for identifying populations that should be managed separately (i.e. management units) have relied on documenting population life history traits, including reproductive condition both temporally and spatially, breeding and feeding sites, population-specific behaviours, and movement patterns, to infer similarity or independence of gene pools. While the results often are in accord with subsequent population genetic analyses of the same populations, this may not always be the case (i.e. observations of morphological similarity do not necessarily mean individuals belong to the same reproductive unit, and observations of mating do not necessarily imply successful reproductive input into the population). Molecular analyses (either direct or indirect) have the capacity to test directly whether morphological similarity corresponds with genetic similarity, or whether breeding actually results in genetic exchange. This is because a large amount of essentially ecological and life history information is retained in the DNA and is expressed as variation in DNA sequences. So the basis of using Population Genetic approaches for identifying management units in fisheries is to understand the basic attributes of DNA, how it changes (evolves) and the limitations on storage of life history information in DNA sequences.

1.1 Basic DNA structure

DNA is a polymer and a macromolecule. It consists of three building blocks: Nitrogenous bases, a Pentose sugar (Deoxyribose in DNA and Ribose in RNA) and a Phosphate group. The three components are bound covalently and when joined are called a Nucleotide. There are four kinds of nucleotide present in any DNA strand. Essentially, the sugar and phosphate form the backbone of the molecule, and the backbone is identical in all DNA molecules (and, with ribose in place of deoxyribose, in all RNA molecules). The only potential difference between any two DNA or RNA molecules is the sequence of nitrogenous bases, so it is this sequence that encodes the genetic traits in an organism. There are four bases in both DNA and RNA: Thymine (T), Guanine (G), Adenine (A) and Cytosine (C) in DNA, with Uracil (U) replacing T in RNA. For a long time the idea that all genetic diversity could be explained by sequence variation in four nucleotides was disputed because scientists could not comprehend how the diversity observable in nature could be explained by variation in only four bases. This was because biologists already knew that there are 20 common Amino Acids (AAs) that are the building blocks of the proteins present in living organisms, and it was difficult to see how four bases could encode the diversity of amino acids unless groups of bases were read together. The discovery of the genetic code, whereby bases are read in groups of three (Codons, giving 4 × 4 × 4 = 64 possible combinations, more than enough for 20 amino acids) and then decoded into amino acids, solved this problem.

Other important aspects of DNA (RNA) structure to consider include the Base-Pairing Rule, whereby, because of their chemical structure and the physical structure of the DNA molecule, A binds to T (U) and C binds to G
and the fact that the bases in a DNA molecule are held on the inside of the helix and joined by hydrogen bonds. This allows for DNA replication, a necessary attribute for reproduction (both cellular and whole organism) and thus for near-faithful transmission of genetic information from cell to cell and organism to organism. DNA replication is Semi-conservative (SCR): when DNA replicates, the two strands separate, with each old strand acting as a template for the production of a new strand that should have the complementary sequence to the strand that was used to generate it, because replication occurs according to the base-pair rule (A-T and G-C). This is important to recognise because this attribute provides the basis for later proof-reading of new strands of DNA, whereby the sequence along the new strand is checked by special enzymes to see if the correct base has been incorporated. Where an incorrect base has been incorporated in the new strand and this is detected by the repair enzyme relative to the old strand, it can be corrected. If, however, a change occurs in both strands simultaneously, then repair enzymes have no reference point to correct the change.

The mechanism of DNA replication that occurs naturally in all cells forms the basis of a very powerful technique, developed in the 1980s, called the Polymerase Chain Reaction (PCR). Essentially, PCR mimics in a test tube (in vitro) what happens naturally in the cell during DNA replication (in vivo). We will discuss this in more detail later, but to demonstrate the similarity, all of the ingredients placed into a test tube at the start of a PCR are basically identical to what is present in the nucleus of a cell during DNA replication, except that we use short artificial pieces of DNA (called Primers) to specify the sequence of DNA we wish to amplify. The other components are the same: target chromosomal DNA, a catalytic enzyme (DNA Polymerase), building blocks of new DNA strands (free nucleotides) and a buffer to stabilise the reactions. By cycling the reaction repeatedly, millions of copies of the target sequence are generated, so we can easily separate it from the limited copies of other DNA sequences. So DNA replication provides us with a method for producing a specific target DNA sequence from a mix of all sequences in the cell.

1.2 Where does variation in DNA sequences come from?

When we compare the same DNA sequence from two individuals we may detect a different base at the identical point along the sequence. This difference is referred to as a Mutation or base-pair substitution. Mutations are the result of ‘rare’ errors during DNA replication but are a basic requirement for Evolution as a process of change, because without mutation all DNA sequences would be identical to the first DNA sequence(s) that evolved originally. Potential for accumulating mutational change is an attribute of DNA and RNA molecules and is the basis for the differences we see in living and extinct organisms. Mutations can occur anywhere in the DNA (both in coding and non-coding
DNA) but where they occur in coding DNA they may produce changes in the AA sequence and be expressed as new phenotypes. If the mutation is present in some individuals in the population and not in others, then the differential expression of the two phenotypes in a particular environment allows the environment to select the most appropriate form. The effect may be to change the relative frequency of the two different forms of the gene in the population over time. Mutation rates vary widely among DNA sequences in an organism’s genome, and relative mutation rate is determined to a large extent by what role (if any) a particular DNA sequence serves in the organism. So, the more important the role of a sequence is to the individual, the more slowly the sequence is likely to accumulate mutations and therefore evolve.

There are a number of different ways in which DNA can be modified by mutations, from simple base-pair substitutions involving individual nucleotides to changes in whole blocks of DNA, to loss or gain of a sequence, or larger changes that could include loss or gain of one or more individual chromosomes or even whole chromosome sets (polyploidy). When they occur, their probability of long-term incorporation (survival) depends on their impact on Gene function. Non-coding DNA will tend to accumulate more mutations and so evolve faster than coding DNA because mutations in this type of DNA do not directly affect gene function. As a general rule, the more important the gene function, the lower the rate at which mutations accumulate. The types of mutations most relevant to analyses of population structure are point substitutions (e.g. GAG to GTG) and deletions or insertions (Indels) of bases in a sequence (e.g. GAG to GAGG).

The effects that mutations can have vary widely, from no effect on the individual to death, and there are no simple rules that we can apply to say what the likely impact of a particular type of mutation is going to be. The impact is determined by where they occur in the genome and what changes they produce. The simple fact is, however, that because mutations are random changes to DNA, when they occur in coding sequences they are likely to be deleterious (i.e. produce poor outcomes). Ultimately, however, the environment is the key as to whether a new mutation in coding DNA will provide better or poorer phenotypes.

Until the development of molecular technologies for examining variation in natural populations, the most common characters used to document variation were external morphological phenotypes. While mutations in genes that code for morphological phenotypes can produce different outcomes, often these mutations do not survive because there can be strong functional constraints on many morphological traits. Thus morphological evolution can be a relatively slow process and populations may diverge genetically without any changes appearing in their external morphology. Many morphological traits are also polygenic,
meaning that they are the product of the combined effect of a few or, more commonly, many gene loci that may be expressed differently in different environments. This means that many systematic or population variation studies based solely on examination of variation in external morphological traits may underestimate the real extent of underlying genetic variation and hence population structure. Simply put, population studies based only on morphology are unlikely to detect all of the significant population structuring that may exist in a species. Molecular systematic studies, in contrast, are not limited in the same way.

Molecular markers can provide a more fundamental data set than morphology for examining relationships among populations and higher taxonomic levels. One important difference is that they are not complicated by any potential effect of the environment because they are fixed at fertilisation. If we target areas of DNA that do not encode phenotypes (i.e. non-coding DNA), these markers are usually neutral in respect to potential effects of Natural Selection (NS). Thus they should accumulate mutations at a constant rate determined by their locus-specific mutation rates. What this means effectively is that the absolute number of mutations between homologous sequences in two individuals provides an estimate of the time since they shared a common ancestor, after allowing for the locus-specific mutation rate. Where DNA sequences evolve neutrally, they allow phylogenetic relationships between individuals, populations, species etc. to be constructed without the need to consider complicating factors like transient impacts of natural selection or environmental effects on sequence divergence. Thus neutral markers can provide more fundamental information about phylogenetic relationships than can studies of morphology alone.

But molecular markers, like morphological markers, can also have associated problems that we need to address if we are going to use them constructively. For example, protein-based markers such as allozymes often show low levels of polymorphism, hence they may not be suitable for detecting genetic differentiation in organisms having weak population structure, such as many marine organisms. Also, allozyme studies only detect a portion of the actual genetic variation because not all nucleotide changes lead to amino acid changes and not all amino acid substitutions result in electrophoretically detectable mobility differences. A major problem with DNA markers can be Homoplasy. Since there are only four potential character states at any point along a DNA sequence (A, T, C or G), eventually by chance mutations can change a single base many times, but we are only able to determine the base that is present at a particular location when we sequence that region (i.e. now), not what may have been there in the past. If we compare two individuals and find that they both share the same base at a particular point along a DNA sequence, we interpret this as similarity due to common descent. If, however, they share the same base due to
homoplasy we have no way of knowing this. Thus, it is essential to choose DNA markers carefully. Appropriate DNA markers need to evolve fast enough so that populations or species show differences, because without variation there is no basis for phylogenetic inference, but they must also evolve slowly enough so that there is little chance of character convergence (homoplasy), where we will score similarity incorrectly. For any DNA marker there will be a point reached when homoplasy will become an issue, so we should choose a DNA marker appropriate for the time frame we are examining to reduce possible confounding effects of homoplasy. This point, theoretically, is when sufficient evolutionary time has elapsed, given the mutation rate at the locus, for all four character states (A, T, C and G) to have been expressed at a single point, with the final outcome being a return to a character state that previously occurred there. When this point is reached, we are likely to underestimate the real divergence time between two individuals with the same genotype.

The recognition that DNA sequences evolve at a constant rate as a function of their locus-specific mutation rates implies that the same gene sequence in two populations should evolve at the same rate. This is the basis for the idea of the ‘Molecular Clock’. Essentially, what this means is that, assuming population sizes are similar and remain constant over time, individuals in different populations will accumulate mutations at approximately the same rate, and so the absolute minimum number of mutations between any two individuals in each population can be used to calculate the evolutionary time that has elapsed since they last shared a common ancestor, once we have calibrated for mutation rate. Put another way, the absolute number of mutational differences between the two individuals with the most similar genotypes in each population is a direct reflection of how closely related the two forms are. Once we have this information we can correlate estimated divergence times with past geological or climatic events that may have impacted on the evolution of the different forms.

An example of this is the evolution of the flightless ratite birds (Emus, Ostriches, Rheas, etc.). Ratites are an ancient order of birds that now are limited to a few relict species confined to the southern continents (Australia, Africa and South America, respectively). Molecular analyses confirm both the relationship between the three surviving families and the fact that they last shared a common ancestor in the Cretaceous. The simplest explanation for their evolution is that the common ancestor evolved when the three continents were part of a super continent (Gondwana) that also included Antarctica. This giant land mass was subsequently fractured by tectonic plate movement, which led to the sequential rafting of the three continents northward, carrying their ancestral flightless ratites with them (first Africa, then South America and finally Australasia). The relatives of the ancient ratites (Emus, Ostriches
and Rheas etc.) are now living evidence both for the existence of giant super continents in the past and for how Molecular Systematics can help to tease out the processes that have influenced modern biodiversity. Vicariance due to the splitting of the Gondwanan land mass has also been invoked to partially explain the distribution of freshwater galaxiids among southern hemisphere continents (Waters et al., 2000).

22
SECTION 2

Genetic variation in nature


Genetic variation is essential for evolutionary change in populations. While we are well accustomed to detecting variation in our own species, we are less perceptive at detecting fine scale variation in other species. But the level of phenotypic variation we can detect in ourselves is also present in most other species. The amount of variation in a population will influence its relative rate of evolution. Thus large populations should contain higher levels of genetic variation than small populations. What this means, at least conceptually, is that populations with little or no genetic variation have little potential to respond to environmental disturbance and therefore have a higher probability of going extinct when environments change. As a consequence most extant populations are variable.

The genetic variation present in natural populations comes from three fundamental sources: Mutation, Genetic Recombination and Sexual Reproduction. All variation ultimately arises from mutation but this variation is mixed among individual chromosome strands by genetic recombination, and mixed among diploid individuals by sexual reproduction when individuals combine their gametes to produce a zygote. The extent of this variation means that no two individuals in a population (except for monozygotic twins or clones) will be genetically identical. Since most individuals are genetically unique and population size determines how much mutational variation can accumulate in a population, larger populations of sexually-reproducing individuals accumulate more genetic variation than will smaller populations.

Mutations are the ultimate source of genetic variation and occur randomly along DNA sequences. Mutation rates vary widely among DNA sequences (over 1000 fold among genetic loci). An optimal rate exists, however, for each DNA sequence, so populations of a species and closely related species are likely to share the same mutation rate for each common DNA sequence.

A question that has long interested evolutionary biologists is ‘how much genetic variation actually exists in natural populations?’ For a long time (based on observations of external morphological variation in natural populations) biologists believed that genetic diversity in natural populations was relatively low because most con-specifics looked morphologically very similar. So, when biologists quantified variation in individuals in a population for morphological traits (e.g. colour variants), they tended to regard it as unusual and not typical of most traits that they examined. This led to a view that Evolution as a process in general was conservative and hence slow, and they argued that this could be explained by the fact that if mutations were random, then changes in genes were likely to produce poor (deleterious) outcomes in mutants and hence would be lost as a result of ‘purifying selection’. This view, however, developed at a time when we did not
know about non-coding DNA and had no molecular data on levels and patterns of genetic variation.

The modern view of how much genetic variation exists in natural populations is quite different to the ‘Classical View’ described above and resulted from the development and application of molecular analyses of genetic diversity in natural populations that commenced in the late 1960s. The early studies exploited the technique of Allozyme Electrophoresis, first developed for human disease diagnosis in the early 1960s. This was extended to molecular analyses of DNA markers in the late 1970s and early 1980s. What these studies have shown is that genetic diversity in most DNA sequences is in fact much higher than had been predicted from earlier morphological studies and so required a new explanation. This led to the idea that the high genetic variation evident in most DNA sequences could result from two evolutionary mechanisms: Natural Selection (NS) in the form of balancing selection, and the random accumulation of neutral mutations.

Evolution via NS (or Darwinian evolution) results from the difference in relative fitness of the possible phenotypes present in a population. Evolutionary biologists now recognise a variety of different types of natural selection that can change gene frequencies in a population including heterozygote advantage, effect of patchy environments, frequency dependent selection and epistatic interactions, to name but a few. For NS to directly influence allele frequencies at a locus, however, the locus must be coding and produce different phenotypes. Neutral evolution, in contrast, is where changes in allele frequencies occur simply as a consequence of accumulation of neutral mutations at a locus and their frequencies change as a result of random genetic drift (a function of population size) over time. This idea (developed by Kimura in the 1960s into the ‘Neutral Theory of Evolutionary Change’) argues that genetic drift acting in populations of different size (i.e. chance) will determine the fate of most individual mutations at a locus over time. Kimura showed that both coding and non-coding DNA sequences can, in theory, evolve by genetic drift alone and that there is no absolute requirement for NS to influence gene frequencies at a locus for populations to diverge. Even though the effect of genetic drift (GD) will be greater in small populations, over evolutionary time isolated populations are likely to diverge simply as a consequence of GD. This process will occur regardless of whether NS is affecting the frequency of alleles at the locus or not (unless strong balancing selection, i.e. heterozygote advantage, is present). Under this model the amount of genetic diversity at a locus is largely determined by a balance between how many new alleles are entering the population by mutation and the number being lost by genetic drift (the balance is referred to as ‘mutation-drift equilibrium’). Since loss of alleles via genetic drift occurs more rapidly in small populations, large populations will be more genetically diverse simply because there are more
individuals in the population in which mutations can occur and fewer alleles will be lost by drift.

Thus developed two contrasting hypotheses that attempted to explain genetic change in populations over time: evolution via NS or neutral evolution. Both attempt to explain why there is so much genetic variation present in most natural populations, but do so from essentially opposing positions. Proponents of Darwinian evolution (i.e. evolution via natural selection) argue that genetic variation results from accumulation of mutants that produce phenotypes that have higher fitness than alternative phenotypes at the locus, driven by a variety of different selective processes acting singly or in concert. In contrast, proponents of the neutral model of evolutionary change argue that genetic variation accumulates in populations simply due to the fact that most new mutations entering a population are selectively neutral, i.e. they affect fitness very little if at all, and so it is chance and population size that are the most important factors that will affect their long-term fate. Since modelling has shown that both mechanisms in theory can change gene frequencies and there are practical examples in nature of both processes, this led to a debate as to which mechanism was responsible for the majority of observable genetic variation in nature. This is the so-called ‘Neutralist - Selectionist debate’, which is still to be finally resolved.

So is most genetic variation (read: evolution) in nature determined largely by adaptive or neutral processes? The first point to recognise is that different types of DNA are more likely to be affected by one or other mechanism. Non-coding DNA is likely to be influenced by neutral evolution because no phenotypes are expressed, so the only way that non-coding DNA can be affected by NS is if by chance it occurs in close proximity to a coding sequence on a chromosome and genetic variation at the non-coding sequence is influenced indirectly by selection acting at the adjacent coding locus. This is called a ‘hitch-hiking’ effect. On the other hand, coding DNA can be affected by NS and the more important the functional role of a coding locus, the more likely it is that NS has affected, is affecting or will affect genetic variation at the locus over time.

While evolutionary biologists agree that genetic variation levels in nature are generally high and large populations contain on average more genetic diversity, the relative importance of NS and genetic drift in generating diversity remains a matter of ongoing debate.

SECTION 3

Basic concepts in population genetics

Basic concepts in population genetics are central to understanding the processes that influence development of population structure in natural populations. The science of population genetics focuses on heredity in groups of individuals and populations and aims to describe the genetic composition of populations and to document and understand the forces that change their genetic composition over time. Thus at its heart, population genetics seeks to understand the process of evolution.

The fundamental starting point to understanding population genetics is to recognise the relationship between DNA and the phenotype. Encrypted in the DNA of all organisms is the genetic information necessary to encode phenotypes, but first in eukaryotes it has to be transcribed into a carrier molecule (mRNA) and this molecule then moves to the cytoplasm of eukaryote cells where it can be translated into the encoded polypeptide chain at the ribosome. Essentially there are three steps in the process: transcription, translation and gene expression, of which only gene expression can be influenced by external environmental factors. While mutations can affect any stage of the process, only mutations in the DNA have the potential to be passed among generations.

From a population genetic perspective, Evolution can be defined as any change in phenotypic frequency in a population over time. To demonstrate that evolution has occurred in a population we need to satisfy three requirements: first that the trait in question is heritable, secondly what evolutionary mechanism(s) may have caused the change and thirdly, how the evolutionary mechanism(s) are having their effect. In nature, however, satisfying the three requirements is often not easy and so we may have to be satisfied by inference rather than direct evidence for some of the requirements.

If we believe that a phenotypic trait in a population may be evolving and wish to test the hypothesis and attempt to understand the process, we need a foundation (essentially a ‘null hypothesis’). This is simply because we can never prove the hypothesis that evolutionary processes have changed the gene frequencies at the locus (or loci) coding for the trait, but we can refute the null hypothesis that evolutionary processes have not changed the gene frequencies. To set up this null hypothesis test we need to employ the Hardy-Weinberg Principle. The principle was formulated independently in 1908 by G. H. Hardy, a mathematician, and Wilhelm Weinberg, a physician, and it became the basic mathematical platform for modern population genetics. They were interested in how gene frequencies can change in natural populations and recognised that before this process can be explored there has to be an a priori reason for focusing on a particular trait. Put simply, there is no reason to try to understand what forces are changing gene frequencies at a locus or how they have their effect without first having reasonable evidence that gene frequencies have, in fact, changed! So they modelled the effects of gene frequency change in populations over time and developed what is now referred to as
the ‘Hardy-Weinberg Equation’ (H/W). The Hardy-Weinberg Principle can be defined as: ‘in the absence of migration, mutation and natural selection, gene frequencies and genotypic frequencies remain constant in a large, randomly mating population’.

The Hardy-Weinberg equation essentially states the null hypothesis of gene frequency change, i.e. that if no evolutionary mechanisms are affecting the frequencies of alleles at a locus, then the frequencies should not change over time or among generations. Thus, if we have the necessary information and data to test the hypothesis for a locus of interest and we are able to refute the null hypothesis that no change in gene frequencies has occurred, then we are in a position to justify searching for a mechanism(s) that may be causing the change and to attempt to understand the process. Hardy and Weinberg recognised, however, that there are qualifications on the attributes of populations in which their principle would hold. They defined this population as a ‘Mendelian population’ and recognised that it must have the following attributes: be diploid, sexual, outbreeding, randomly mating and large. Populations that satisfy these conditions are considered to have reached H/W equilibrium and this means that every reproductive individual has an equal chance of mating, all genotypes at a locus have equal fitness, each new generation is a random sample of the previous generation’s gametes and no new alleles appear in the population. We now recognise that few (if any!) natural populations will fully satisfy all of these attributes but unless we have evidence to the contrary we can assume that most large natural populations will approach H/W status. Characteristics of populations at H/W equilibrium are that allele frequencies at autosomal loci will not change across generations, genotype frequencies will also remain constant and if H/W equilibrium is disturbed, it can be re-established within one generation of random mating.

By inference, populations that do not satisfy H/W equilibrium must be experiencing changes in gene frequency due to some evolutionary mechanism. An example would be changes across generations in the frequency of a recessive allele that causes a genetic disease in humans. When the allele is expressed in the homozygous state (rr), individuals suffer from the genetic disorder and may die before they can reproduce; when this happens, the frequency of the ‘r’ allele in the population will decline. Assuming that no other factors affect survival of the mutant recessive allele over time, we should expect that this allele will eventually go extinct because it is not favoured by natural selection. An example in humans is the mutant allele that causes haemophilia, although inheritance of this allele is complicated by the fact that the locus is sex-linked and so is inherited in a different manner in males and females.

Where we have data on gene frequencies in a population at a locus we can test to see if the population conforms to H/W equilibrium by comparing
the distribution of genotypes against those expected if the population was at equilibrium. To do this we use the H/W equation, i.e. in the simplest case, if the trait in question is determined by a single genetic locus with two alleles, then if we let the frequency of the (a) allele equal p and the frequency of the alternative allele (b) equal q then:

$p + q = 1$

But in diploid organisms for most nuclear genes we inherit two copies of each gene, one from each parent, so Hardy and Weinberg realised that their equation needed to take the diploid condition into consideration and to recognise that there are three ways that an individual can carry a and b alleles if they are diploid. To address this issue they expanded the equation to deal with diploid genotypes so that $p^2$ is the probability of receiving a copy of the a allele from both parents, $2pq$ is the probability of being a heterozygote and $q^2$ is the probability of receiving a copy of the b allele from both parents; then:

$p^2 + 2pq + q^2 = 1$

The equation can also be expanded to deal with cases where there are more than 2 alleles at the locus, e.g. $p + q + r = 1$. Once we have data for the observed allele frequencies constituting the genotypes in the population we can then use the H/W equation to calculate the expected genotypic frequencies if the population was at equilibrium. The expected frequencies of genotypes can be compared with observed frequencies in a simple $\chi^2$ test with $\sum (\text{observed} - \text{expected})^2 / \text{expected}$ (with degrees of freedom equal to number of genotypes minus number of alleles) (Example 1).

The difference between the observed and expected values can be tested for statistical significance, using a $\chi^2$ test for goodness of fit.

$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$

If after completing such an analysis the result is that the population does not conform to H/W equilibrium then we can look for information that can help to identify the likely causative agent (evolutionary mechanism), with the results of the test providing some insight into the possible cause. For example, an excess of heterozygotes may indicate balancing selection (i.e. heterozygote advantage), whereas a deficiency of heterozygotes may reflect disruptive selection or non-random (assortative) mating.

While Darwin was aware of only a single class of causative agent, for which he coined the term ‘natural selection’, modern evolutionary biologists recognise at least six different mechanisms that can cause populations to deviate from H/W equilibrium (mutation, migration/gene flow, non-random mating, genetic drift, natural selection and ‘molecular drive’). Of these, natural selection, genetic drift and migration/gene flow are the mechanisms most commonly considered to be the most
important ones that can affect population structure, at least over shorter evolutionary time frames.

Example 1. Observed distribution and expected Hardy-Weinberg equilibrium distribution of genotypes can be summarised in the table below. Here the frequency of allele A is p = (2 × 3 + 2)/12 = 0.67 and the frequency of allele B is q = 0.33, so the expected numbers among 6 individuals are $6p^2$, $6(2pq)$ and $6q^2$.

Genotypes        AA      AB      BB
Observed (O)     3       2       1
Expected (E)     2.67    2.67    0.67
(O - E)^2        0.11    0.44    0.11
(O - E)^2 / E    0.04    0.17    0.17

$\chi^2 = 0.04 + 0.17 + 0.17 = 0.38$

The degrees of freedom (df) in a test involving n classes are usually equal to n - 1. That is, if the total number of individuals (6 in this example) is divided among n classes (3 genotypic classes in the example), then once the expected numbers have been computed for n - 1 classes, the expected number of the last class is set. In a H/W test one further degree of freedom is lost because the allele frequency p is estimated from the same data, which is why the degrees of freedom equal the number of genotypes minus the number of alleles (3 - 2 = 1). Thus in the above example there is only one degree of freedom in the analysis.

Checking the $\chi^2$ value of 0.38 at df = 1 in Table 1 gives a P-value > 0.05 (the value is well below the critical value of 3.84), and therefore we do not reject the null hypothesis of Hardy-Weinberg equilibrium in the population in our example.

Table 1. Chi-square Probabilities.

Probabilities
df 0.95 0.90 0.70 0.50 0.30 0.20 0.10 0.05 0.01 0.001
1 0.004 0.016 0.15 0.46 1.07 1.64 2.71 3.84 6.64 10.83
2 0.10 0.21 0.71 1.39 2.41 3.22 4.61 5.99 9.21 13.82
3 0.35 0.58 1.42 2.37 3.67 4.64 6.25 7.82 11.35 16.27
4 0.71 1.06 2.20 3.36 4.88 5.99 7.78 9.49 13.28 18.47
5 1.15 1.61 3.00 4.35 6.06 7.29 9.24 11.07 15.09 20.52
6 1.64 2.20 3.83 5.35 7.23 8.56 10.65 12.59 16.81 22.46
7 2.17 2.83 4.67 6.35 8.38 9.80 12.02 14.07 18.48 24.32
8 2.73 3.49 5.53 7.34 9.52 11.03 13.36 15.51 20.09 26.13
9 3.33 4.17 6.39 8.34 10.66 12.24 14.68 16.92 21.67 27.88
10 3.94 4.87 7.27 9.34 11.78 13.44 15.99 18.31 23.21 29.59
11 4.58 5.58 8.15 10.34 12.90 14.63 17.28 19.68 24.73 31.26
12 5.23 6.30 9.03 11.34 14.01 15.81 18.55 21.03 26.22 32.91
13 5.89 7.04 9.93 12.34 15.12 16.99 19.81 22.36 27.69 34.53
14 6.57 7.79 10.82 13.34 16.22 18.15 21.06 23.69 29.14 36.12
15 7.26 8.55 11.72 14.34 17.32 19.31 22.31 25.00 30.58 37.70
20 10.85 12.44 16.27 19.34 22.78 25.04 28.41 31.41 37.57 45.32
25 14.61 16.47 20.87 24.34 28.17 30.68 34.38 37.65 44.31 52.62
30 18.49 20.60 25.51 29.34 33.53 36.25 40.26 43.77 50.89 59.70
50 34.76 37.69 44.31 49.34 54.72 58.16 63.17 67.51 76.15 86.66
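A calculated χ2 value smaller than the entry in the 0.05 column means the null hypothesis is not rejected at the 0.05 level; a value equal to or larger than that entry means it is rejected.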
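The test in Example 1 can be reproduced with a few lines of Python. This is a minimal sketch rather than part of the original manual: the function name `hardy_weinberg_chi2` is hypothetical, the locus is assumed to have exactly two alleles, and the critical value 3.84 is simply read from the 0.05 column of Table 1 at df = 1.

```python
def hardy_weinberg_chi2(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit statistic for a two-allele locus.

    Arguments are the observed numbers of AA, AB and BB individuals.
    Assumes both alleles are present (otherwise an expected count is zero).
    """
    n = n_aa + n_ab + n_bb
    # Allele frequencies are estimated from the genotype counts:
    # each AA individual carries two A copies, each AB individual one.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    # Expected genotype numbers under H/W equilibrium: n*p^2, n*2pq, n*q^2
    expected = [n * p * p, n * 2 * p * q, n * q * q]
    observed = [n_aa, n_ab, n_bb]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return p, q, expected, chi2


# Critical value for df = 1 at the 0.05 level, taken from Table 1
CRITICAL_0_05_DF1 = 3.84

p, q, expected, chi2 = hardy_weinberg_chi2(3, 2, 1)
reject_null = chi2 >= CRITICAL_0_05_DF1

print("p =", round(p, 2), "q =", round(q, 2))
print("expected =", [round(e, 2) for e in expected])
print("chi-square =", round(chi2, 3), "reject H/W equilibrium:", reject_null)
```

Without intermediate rounding the statistic is 0.375, which agrees with the hand calculation in Example 1 up to rounding. With only 6 individuals the expected numbers are very small, so the example illustrates the arithmetic rather than a sample size that would be adequate in practice.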

SECTION 4

Natural selection
Charles Darwin and a colleague, Alfred Wallace, established that evolution could result from the effects of natural selection changing the frequency of genetically determined traits in nature. Since this was the only mechanism that had been proposed to drive evolutionary change in nature from the middle of the 19th century until the 1930s, it has attracted considerable interest from evolutionary biologists over time and continues to do so. Simply put, NS acts on heritable variation and is the relative ability of individuals with different phenotypes to survive and pass on their genes to their offspring. Where NS is affecting allele frequencies at a locus, over time individuals with superior phenotypes (and hence superior underlying genotypes) in a particular environment will tend to have more surviving offspring and so their alleles will increase in frequency in the population at the expense of individuals with poorer performing phenotypes. Differences in reproductive output were termed ‘relative fitness’ by Darwin and by this he meant that individuals in the population with high relative fitness would on average provide more surviving offspring to the next generation compared with another individual with a poorer phenotype. The comparison is always ‘relative’ because it is population specific and is made against the best-performing genotype in a particular environment. This is an important point because the best performing genotype may not always be the same genotype if populations of the species are found in different environments. We can estimate the relative fitness of different genotypes at a locus in a population where we have data on the average number of surviving offspring per genotype across at least two generations. Relative fitness varies from 0 to 1 because the calculation is made in such a way that the best performing genotype of all possible ones at the locus is always given a fitness value of 1 and poorer genotypes a value less than 1. A genotype that does not produce any surviving offspring across generations will have a relative fitness of 0.

Sometimes different genotypes can have equal fitness in a particular environment or multiple niches are available for different genotypes and where this occurs, multiple phenotypes may do well over time. This is called a ‘balanced polymorphism’ and is one way in which evolutionary biologists who support the notion that NS is the most important evolutionary mechanism believe that NS can maintain high levels of genetic diversity in natural populations. Balanced polymorphisms may evolve for a number of different reasons, including that across the natural distribution of the species there may be different habitat patches that favour different phenotypes (and hence influence the frequency of their underlying alleles). The ‘peppered moth’ is a classic example of this kind of balanced polymorphism because in polluted environments in Britain where it occurs, the dark morph of the moth is favoured because it is more cryptic to predators than is the light form. In contrast, in pristine environments, where there is little air pollution from
coal dust, the light coloured form is favoured because the resting places moths use during the day (trunks of oak trees) are covered with lichens that are white and light grey in colour and provide more protection to the light coloured morph than to the dark morph. Lichens are very sensitive to air pollution, particularly coal dust, so in polluted areas they do not thrive and the trunks of oak trees are basically dark brown to black, the natural colour of the tree bark. An alternative way balanced polymorphisms may evolve is where multiple niches are available in the same place (environment). Shell colour variation in English land snails (Genus Cepaea) can be influenced by this process. There are patches within a single habitat type where different colour morphs may be more cryptic (e.g. certain patches favour banded snails and other patches may favour un-banded snails), so overall both the alleles for ‘banded’ and ‘un-banded’ remain in the population. Thus selection can favour one allele in a single place or multiple alleles in the same place so that a balanced polymorphism evolves for a particular trait.

Relative fitness can vary temporally, geographically and ecologically for a population. If one or more of these effects are evident then polymorphisms will be common at the locus. Temporal variation in fitness is where fitness may be affected in different ways at different life history stages (e.g. eggs vs fingerlings) or with different seasons within a single life history stage. Geographical variation in fitness occurs where factors that determine the fitness of individuals vary geographically across the natural distribution of the species, while ecological variation in fitness may occur where factors such as substrate type, depth, canopy cover etc. influence fitness.

Once evolutionary biologists had recognised that selection can act in a variety of ways, attempts were made to model the impact of selection on traits with different modes of inheritance. These models attempt to predict the outcome of selection. The most important factor in the models is the time to ‘fixation’, i.e. when one allele (the one favoured by NS) reaches 100% and allelic variation at the locus is lost. Time to fixation will depend on the starting frequency of the allele favoured by NS, differences in relative fitness among genotypes and the mode of inheritance of the locus. For simplicity, most selection models assume that selection pressure remains constant over time but in reality this may not always be true. Most models are essentially H/W models that incorporate selection co-efficients, i.e. an estimate of how much advantage the favoured genotype has over alternative genotype(s) at the locus. There are four basic kinds of selection model: (a) selection against the recessive homozygote, (b) selection that favours the heterozygote, (c) selection against a single allele at a locus and (d) selection that acts against heterozygotes. A special case of type (a) is referred to as a recessive lethal, where only the recessive homozygote is affected and always dies pre-reproduction, so the relative fitness of this genotype is 0. Cystic Fibrosis in humans used to be an example of this type of mutation until modern medicine devised ways to prolong the life of some affected individuals.

Once we have data on the mode of inheritance of the mutant we can use relative fitness estimates incorporated into a H/W Model to determine the likely time to fixation under different selection intensities and look at the effect of the change with different starting gene frequencies. While the outcomes can be very diverse, one obvious characteristic is that the time required to purge a recessive allele that produces even extremely poor fitness outcomes for a sufferer is much longer (in terms of generation time) than for an equivalent allele that shows dominant inheritance. Equally it will take a much longer time for a new recessive mutant that provides higher relative fitness than pre-existing allelic forms of the gene to reach fixation than an equivalent dominant favoured mutation. The simple explanation for these phenomena relates to the differences in the mode of inheritance and the fact that NS can only act when a mutation is expressed as a phenotype, so deleterious recessive mutations can remain hidden in the population with no effect simply because an individual requires two copies of the gene to express the phenotype. This is one reason why so many mutations that cause ‘nasty’ genetic disorders can remain in the genomes of species for many thousands of generations. In contrast, if a ‘nasty’ mutation shows dominant inheritance it can quickly be eliminated by natural selection as long as it does not provide better outcomes than pre-existing forms of the gene in certain environments, as is the case for the allele that codes for ‘sickle-cell’ anaemia in humans. While individuals that carry the sickle-cell allele in either the homozygous or heterozygous state have lower fitness in most environments where humans occur, where Plasmodium falciparum malaria is a problem (i.e. many tropical and sub-tropical environments) heterozygotes that carry the sickle-cell allele have higher fitness than homozygous normal individuals because they have higher resistance to infection from the malarial parasite. In these areas the cost of mortality due to malaria outweighs the lower reproductive capacity that is associated with the sickle-cell allele, so essentially the environment (presence or absence of malaria) changes the fitness effect of an allele from negative to positive.

As discussed earlier we now recognise that NS can take many forms and affect individuals and hence populations in a diversity of ways. The evolutionary effects of NS can often be very complicated and even in opposition when different types of NS act in concert on a population at the one time. For example sexual selection may be favouring alleles in males in a completely different way to females (e.g. favouring more conspicuous males that just happen to be more obvious to predators as well) while other forms of NS (e.g. NS favouring cryptic colouration to avoid predation) are acting on both sexes. What we measure is
the cumulative effects of both types
of NS on colouration patterns not the
individual effects of each process. Esti-
mating the relative effect of individual
NS agents even when they are known
can be very difficult, because to do this
we need to know; how many different
selective agents are affecting a trait
at one time, what are their individual
effects and how they interact. In most
cases this is not possible, so we simply
look at their cumulative impact on
phenotypes over time.

It is obvious, however, that NS can
be a very powerful mechanism for
evolutionary change in natural
populations and in certain situations
can influence how populations are
structured in space and time. This
is especially true when populations
have been in the past or are currently
isolated from other populations so
that gene exchange is either restricted
or completely disrupted for extended
periods of time. When this occurs,
isolated populations are likely to
experience their local environments
differently because conditions will
not be identical and so local selective
agents may produce unrelated changes
in gene frequencies and so result in
population divergence. Geographical
speciation models argue that this is the
simplest and most widely accepted way
in which new species can evolve from
ancestral types (isolation leading to
populations experiencing local environ-
ments differently and hence NS driving
their divergence).
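
The contrast drawn earlier in this section between purging a recessive and a dominant deleterious allele can be illustrated numerically. The following is a minimal sketch, not taken from the manual: it assumes the standard textbook recursion for one locus with two alleles under constant viability selection and random mating, and the function names, the starting frequency (0.5) and the threshold (0.01) are all hypothetical choices made for illustration.

```python
def next_allele_freq(q, w_AA, w_Aa, w_aa):
    """Frequency of allele a after one generation of selection.

    q is the current frequency of a; w_AA, w_Aa and w_aa are the relative
    fitnesses of the three genotypes (best genotype = 1, lethal = 0).
    """
    p = 1.0 - q
    # Mean fitness: H/W genotype frequencies weighted by relative fitness
    mean_fitness = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    # Allele a is carried by aa individuals and by half of the Aa gene copies
    return (q * q * w_aa + p * q * w_Aa) / mean_fitness


def generations_until_below(q0, threshold, w_AA, w_Aa, w_aa, max_generations=100000):
    """Count generations until the frequency of a falls to the threshold."""
    q = q0
    for generation in range(1, max_generations + 1):
        q = next_allele_freq(q, w_AA, w_Aa, w_aa)
        if q <= threshold:
            return generation
    return None  # threshold not reached within max_generations


# Recessive lethal: only aa individuals die, heterozygotes are unaffected
recessive_lethal = generations_until_below(0.5, 0.01, 1.0, 1.0, 0.0)
# Dominant lethal: every carrier of a (Aa and aa) dies before reproducing
dominant_lethal = generations_until_below(0.5, 0.01, 1.0, 0.0, 0.0)

print("recessive lethal:", recessive_lethal, "generations")
print("dominant lethal:", dominant_lethal, "generation(s)")
```

Under these assumptions the dominant lethal disappears in a single generation, whereas the recessive lethal needs on the order of a hundred generations to fall from 0.5 to 0.01, because most copies are hidden in heterozygotes where selection cannot act on them.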

SECTION 5

Genetic drift
The process of random genetic drift is a powerful evolutionary force and is central to our understanding of population genetics. Random genetic drift (GD) refers to the random fluctuations of allele frequencies from one generation to the next. Sometimes it is referred to as a sampling error of gametes between generations. In a randomly mating population, the probability of two particular alleles coming together at fertilisation is a function of the relative frequencies of each allele in that population and therefore should conform to Hardy-Weinberg expectations. In practice, these expectations are rarely realised due to stochastic events that may affect random mating, for example, unequal offspring numbers from individual females.

Because of the random nature of genetic drift, it is impossible to predict absolutely the fate of a particular allele. Over time, however, genetic drift will cause an allele either to increase or to decrease in frequency in the population. Given sufficient time, the allele in question will increase in frequency until it reaches fixation or alternatively decrease until it becomes extinct. Either way, the locus in question is heading towards a homozygous state. So even if we are unable to forecast the fate of an individual allele, we can state that the overall effect of drift is to reduce genetic variation (push polymorphisms towards homozygosity).

In a population of N diploid individuals there are 2N alleles at any particular locus. Therefore, if a new mutation arises, it will start in the population with a frequency of 1/2N. This is also roughly the probability of the new mutation being passed on to the next generation. It is also the probability of the new allele (if selectively neutral) increasing in frequency to fixation. From this simple relationship it can be seen that the eroding force of genetic drift on genetic variability is purely a function of population size (N). The greater the population size, the smaller the effect. Similarly in small populations, a new mutation will initially exist at a relatively high frequency (relative to a large population of the same species) thereby having a greater chance of being passed on to the next generation. Effects of small population sizes on genetic drift will be further explained later.

Kimura and Ohta (1971) calculated that it would take, on average, 4N generations for the frequency of a newly mutated allele to reach fixation in a population. That is, if p is the frequency of allele a:

Time to fixation of allele a (i.e. p=1):

T1(p) = 4N

Conversely, the time it takes for allele a to be lost from the population is:

Time to loss of allele a (i.e. p=0):

T0(p) = 2ln(2N)
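
As a rough numerical check on the two approximations above, the following Python sketch evaluates both times for a chosen population size. The function names are illustrative and are not part of the original manual; N is taken to be the number of diploid individuals and both times are expressed in generations.

```python
import math


def time_to_fixation(n):
    """Approximate generations for a new neutral allele to reach fixation (4N)."""
    return 4 * n


def time_to_loss(n):
    """Approximate generations for a new neutral allele to be lost (2 ln(2N))."""
    return 2 * math.log(2 * n)


if __name__ == "__main__":
    n = 500  # hypothetical population size
    t_fix = time_to_fixation(n)
    t_loss = time_to_loss(n)
    print("generations to fixation:", t_fix)
    print("generations to loss:", round(t_loss, 1))
    print("fixation / loss:", round(t_fix / t_loss, 1))
```

For N = 500 this gives about 2000 generations to fixation against about 14 generations to loss, which illustrates how much faster a new variant is expected to disappear than to become fixed.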

The ratio between these two times, T1(p)/T0(p), is:

2N/[ln(2N)]

It can be seen therefore, that in a population of 500 individuals, it takes approximately 145 times longer for the allele to go to fixation than it does for it to be lost from the population. It follows that the majority of new genetic variants are more likely to go extinct than to become established in the population.

Given enough time, every locus will become homozygous as a single allele will have drifted to fixation at each gene locus within the population (i.e. a total lack of genetic variation). Because of the potentially huge timescale required for an allele to reach fixation, however, this rarely (if ever) occurs. During the time it takes for an allele to head towards fixation, new mutations are continuing to arise that are subject to the same pressure of drift (with their own respective probabilities of increasing in frequency). Hence, a population is in a constant heterozygous state for many loci with the persistence of the polymorphism dependent on population size.

In the absence of selection and assuming that each mutation results in a unique allele, the level of genetic variation (heterozygosity) can be considered as a balance between the force of drift (that erodes variability) and mutation (generating variability). Kimura and Crow (1964) defined the equilibrium of heterozygosity as:

H = 4Nµ/(4Nµ + 1)

where N is the population size and µ is the mutation rate. It can be seen from this equation that if either population size or mutation rate is low then heterozygosity will also be low. This is intuitive as small population size results in elevated drift thereby reducing genetic variation. A general rule regarding the relationship between these two opposing processes is that if 4Nµ is much larger than one, then mutation is the dominant process and heterozygosity is high. If 4Nµ is much lower than one, then drift is the dominant force and heterozygosity will be low.

Natural populations also rarely remain stable in size over time. Many populations at some stage experience a sudden crash in numbers (usually due to some extrinsic disturbance such as an outbreak of disease). A rapid decline in size is referred to as a ‘Population Bottleneck’ and has a two-fold effect on genetic variability. Firstly, only relatively few individuals manage to pass their alleles on to the following generation and what genetic diversity does survive is subject to a greatly elevated pressure of drift. Ultimately, the degree to which genetic variation is lost is a function of two factors: i) the magnitude of the bottleneck (i.e. how few individuals survive to reproduce)

and ii) the duration of the bottleneck (i.e. how long the population remains at a low number). It can be argued that the duration of the bottleneck has a greater impact with respect to the loss of genetic variability. That is, if a population that has undergone a bottleneck can recover numbers rapidly, then the loss of variation will be attenuated.

Another form of population bottleneck is when a few individuals colonise an environment previously unoccupied by the species. This is known as a ‘Founder Event’ with the new population being subject to the forces of drift in the same way as seen with a bottleneck. The founding population will lose genetic variability much faster than will the parent population.

So far we have discussed the fate of genetic variation due to the forces of drift within single populations. Genetic drift also plays an important role in leading isolated populations to become genetically differentiated. This concept is central to population genetic theory. Because the process of drift is random, alternative alleles within different populations will increase (or decrease) in frequency. Eventually, populations will become fixed at particular loci for different alleles (i.e. total differentiation). It should be recognised however, that due to the random nature of drift, two populations could also become fixed for the same allele by chance. The probability of this occurring reduces rapidly as the number of alleles at the locus increases. For example, for a locus where there are only two alleles (at equal starting frequencies) there is a 50% chance of two populations drifting to fixation of the same allele. If there were five equally frequent alleles present, then the probability would drop to a 20% chance. Also the probability of two populations going to fixation for the same allele would be very low when considered across many loci. Therefore, the net effect of drift is to cause populations to differentiate (diverge genetically).

We have seen the relationship between mutation and genetic drift and how their interaction determines genetic variation. These predictions are only valid under the neutral theory of evolution. Although many (most) point mutations are selectively neutral, particular mutations may bestow a significant fitness advantage on the individual. In this case, new mutations may increase in frequency at a much higher rate than may be expected under neutral theory. The question is, which evolutionary force is stronger? Balancing selection will tend to keep multiple alleles in relatively high frequencies at the locus under selection thereby maintaining high heterozygosity. This opposes the effects of drift, which reduces variability. The relationship between these two evolutionary forces is once again a function of population size. The smaller the population the greater will be the probability that the effects of drift will outweigh the effects of selection. The general rule is that if 4Ns (where s is the selection coefficient) is much less than one, then drift is the most important process determining variability. If 4Ns is much greater than one, then selection is likely to be the dominant force. It should be noted

however, that other forms of natural selection (e.g. directional selection) can also lead to reduced heterozygosity.

It is clear from the preceding discussion that the force of drift in influencing genetic variability depends on population size (N). Given this fact, it is important to briefly explore the value N. When we think of a population size, we merely see it as the number of individuals present at a location at a particular time. In terms of population genetics however, this value can be misleading. The important point is that we are interested in the probability of alleles being transferred successfully to subsequent generations. Therefore we are only interested in the number of individuals that contribute their genes to the next generation (i.e. individuals that breed successfully). These individuals constitute what is called the ‘Effective Population Size’ (Ne). In nearly all cases, the effective population size is significantly smaller than the census population size. So even a population that appears to be very large may in fact have a relatively small Ne and therefore be subject to an elevated pressure of drift. The concept of Ne is particularly significant to conservation genetics. That is, how small can the effective population be until levels of inbreeding reduce the overall fitness of the population?

Another important concept regarding Ne is that it will vary depending on which gene locus we look at. All autosomal genes in the nuclear genome will follow the rule listed above. However, genes on the Y chromosome or in cytoplasmic genomes (e.g. mitochondrial or chloroplast DNA) will exist at lower Ne than for nuclear genes. For example, mtDNA is maternally inherited (compared to bi-parental inheritance for nDNA) and is a haploid molecule (compared to diploidy of nDNA), therefore half the number of parents times half the ploidy results in a four-fold reduction in Ne. Therefore, the effects of drift will be four times greater on mtDNA genes than on nDNA genes in the same population. This concept and its implications for assessing population structure will be developed later.

Effective population size can also be influenced by other factors such as unequal sex ratios or particular breeding behaviours where one or a few males breed with many females. The simple formula for calculating Ne in these cases is:

Ne = (4NmNf)/(Nm + Nf)

where Nm is the number of breeding males and Nf is the number of breeding females. Ne is affected more (in terms of reduced numbers) by the rarer sex. This is because individuals of the rarer sex constitute less than half of the breeding population (sometimes significantly so) yet they still contribute 50% of the gametes to the next generation.
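
The two closed-form expressions given in this section, the Kimura and Crow equilibrium heterozygosity and the sex-ratio formula for Ne, can be evaluated with a short Python sketch. The function names and all numerical inputs below are hypothetical and chosen only for illustration; they do not come from the manual.

```python
def equilibrium_heterozygosity(n, mu):
    """Equilibrium heterozygosity H = 4N*mu / (4N*mu + 1)."""
    theta = 4 * n * mu
    return theta / (theta + 1)


def effective_size_sex_ratio(n_males, n_females):
    """Effective population size Ne = 4*Nm*Nf / (Nm + Nf)."""
    return 4 * n_males * n_females / (n_males + n_females)


if __name__ == "__main__":
    # Hypothetical mutation rate of 1e-5 per locus per generation.
    for n in (1_000, 100_000):
        print(n, round(equilibrium_heterozygosity(n, 1e-5), 3))

    # 1000 breeders with an equal sex ratio versus a strongly skewed one.
    print(effective_size_sex_ratio(500, 500))
    print(effective_size_sex_ratio(10, 990))
```

With an equal sex ratio the 1000 breeders give Ne = 1000, whereas 10 males and 990 females give Ne = 39.6, showing how strongly the rarer sex limits the effective size.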

SECTION 6

Non-random mating and
population structure
A Gene Pool is the collection of genotypes present in all individuals that constitute a reproducing population, so essentially it comprises all individuals who potentially could exchange genes. Sometimes a gene pool is connected directly, i.e. reproducing individuals can meet and exchange genes directly, or exchange may be indirect via intermediates because individuals choose not to move large distances or individual dispersal distances are not large enough to allow contact with all members of the gene pool. While one assumption of the H/W theorem is that individuals within a gene pool mate at random, this is seldom if ever the case in nature both for intrinsic and extrinsic reasons. Thus individuals that belong to a discrete gene pool are often distributed as ‘demes’, local populations, subpopulations or populations and share more genes in common with members of their own sub-group than with the rest of the gene pool. When dispersal is limited, the population will inevitably become subdivided because complete interbreeding may not be possible, so mating will not be at random. This results in genes being structured spatially across the natural distribution of the gene pool.

A number of different types of non-random mating have been recognised by evolutionary biologists. One form is ‘inbreeding’ where individuals share more genes by common descent than would be expected by chance. Different levels of inbreeding exist from one extreme of self-fertilisation where the population is essentially an assembly of clones. This situation is relatively rare in nature however, because self-fertilising species have little genetic variation and hence lose most of the advantages of diversity. Even species that are capable of self-fertilisation may not necessarily engage in it (e.g. some mollusc species). More common in nature is the situation where organisms within a gene pool practise some level of inbreeding (i.e. non-random mating). The consequences of this can be that populations will be structured spatially and/or temporally.

Inbreeding may result from both intrinsic (e.g. behavioural traits) and/or extrinsic factors (e.g. physical barriers to dispersal). If individuals mate assortatively, that is they either choose other individuals as mates that are phenotypically similar to themselves (positive assortative mating) or individuals that are phenotypically different to themselves (negative assortative mating), this can affect the level of inbreeding in the population. An example of positive assortative mating may be the fact that in human populations, individuals more often than at random select individuals of the opposite sex of similar height, while examples of negative assortative mating include mate choice in mice and self incompatibility factors in some plants. Female mice have been shown to select males with different odours to themselves as mates when the opportunity exists. Odour in mice is, in part, determined by Major Histocompatibility Complex (MHC) genes that provide a major component of the body’s defence system against disease, parasites and pathogens. It is thought

Once modellers had determined that inbreeding increases homozygosity, they worked out that where genetic data were available, this effect could be used to estimate the level of inbreeding occurring in a population, essentially by comparing the observed heterozygosity against that predicted under H/W equilibrium given the observed allele frequencies. The level of inbreeding is expressed by the inbreeding coefficient (F), where F is the probability that two alleles in an individual are identical by descent. F varies from 0, where the population is completely outbred, to 1, where the population is completely inbred and so consists of only AA and aa homozygotes for a two allele system. For an individual with a known pedigree, the level of inbreeding can be calculated using:

FX = Σ [(1/2)^(n1+n2+1) × (1 + FA)]

Where:

• FX is the inbreeding coefficient of the individual in question

• FA is the inbreeding coefficient of the common ancestor, and

• n1 and n2 are the number of generations from the sire and the dam to the
common ancestor, respectively.
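
The calculation can be sketched in Python as follows. This is a minimal illustration that assumes the standard product form of Wright's path formula, in which each common ancestor contributes (1/2)^(n1+n2+1) × (1 + FA) and the contributions are summed. The function name and the representation of each path as a tuple (n1, n2, FA) are choices made for this example only.

```python
def inbreeding_coefficient(paths):
    """Sum (1/2)^(n1+n2+1) * (1 + FA) over all paths through common ancestors.

    Each path is a tuple (n1, n2, f_a): generations from the sire and from the
    dam to the common ancestor, and that ancestor's own inbreeding coefficient.
    """
    return sum(0.5 ** (n1 + n2 + 1) * (1 + f_a) for n1, n2, f_a in paths)


if __name__ == "__main__":
    # Offspring of a full-sib mating: two non-inbred common ancestors
    # (the shared parents of sire and dam), each one generation back.
    full_sib = [(1, 1, 0.0), (1, 1, 0.0)]
    print(inbreeding_coefficient(full_sib))

    # Offspring of a half-sib mating: one non-inbred common ancestor.
    half_sib = [(1, 1, 0.0)]
    print(inbreeding_coefficient(half_sib))
```

The full-sib case gives F = 0.25 and the half-sib case F = 0.125, the values usually quoted for these matings.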

The statistics of inbreeding were developed by Sewall Wright from 1922 onwards. He modelled the effect of various processes on gene frequencies in natural populations and related this to what was expected under H/W equilibrium. As a result, the modern statistics of inbreeding take his name, i.e. Wright’s (F) statistics, and a variety of versions are available for analysis with genetic markers that possess different modes of inheritance and, in theory, mutate in different ways. The general equation is:

(1 – FIS)(1 – FST) = (1 – FIT)

Where:

• FIT is the correlation of uniting gametes relative to gametes drawn at random from the entire population

• FIS is the correlation of uniting gametes relative to gametes drawn at random from within a subpopulation, and

• FST is the correlation of gametes drawn at random within subpopulations relative to gametes drawn at random from the entire population.

The statistic of real interest in studies of population structuring is FST because in essence it measures the extent to which the populations under examination are subdivided, or put another way, how much gene flow is occurring among subpopulations.

FST varies between 0 and 1, where an FST of 0 implies that the populations under examination have the same set of alleles in identical frequencies and an FST of 1 implies that the populations are fixed for different alleles and so share no alleles in common. In practice FST among populations is rarely larger than 0.5 and is often much less. Wright proposed, for a simple two-allele system at a locus, that FST > 0.25 constitutes ‘very great differentiation’, the range 0.15 to 0.25 ‘great differentiation’ and the range 0.05 to 0.15 ‘moderate differentiation’. The actual interpretation of FST is more complex. For example, it has recently been shown that for hypervariable genetic markers (e.g. microsatellites), which often possess many alleles per locus, FST estimates among populations may be considerably lower than for traditional markers with fewer alleles per locus (e.g. allozymes).
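
The relationship among the three statistics can be illustrated with a short Python sketch. It assumes the commonly used heterozygosity-based definitions (FIS = 1 − HI/HS, FST = 1 − HS/HT, FIT = 1 − HI/HT), which are not spelled out in this manual, where HI is the mean observed heterozygosity, HS the mean expected heterozygosity within subpopulations and HT the expected heterozygosity of the total population. The function name and the input values are hypothetical.

```python
def f_statistics(h_i, h_s, h_t):
    """Return (FIS, FST, FIT) from observed and expected heterozygosities."""
    f_is = 1 - h_i / h_s  # deficit of heterozygotes within subpopulations
    f_st = 1 - h_s / h_t  # deficit due to subdivision among subpopulations
    f_it = 1 - h_i / h_t  # total deficit relative to the whole population
    return f_is, f_st, f_it


if __name__ == "__main__":
    f_is, f_st, f_it = f_statistics(h_i=0.30, h_s=0.40, h_t=0.50)
    print(round(f_is, 3), round(f_st, 3), round(f_it, 3))
    # Check the general equation (1 - FIS)(1 - FST) = (1 - FIT).
    print(abs((1 - f_is) * (1 - f_st) - (1 - f_it)) < 1e-12)
```

With these definitions the general equation holds for any set of heterozygosities, because (HI/HS) × (HS/HT) = HI/HT.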

that choice of a male with different odour type by female mice increases the probability that their offspring will be more heterozygous at MHC loci and this attribute may increase overall fitness of the offspring. Both inbreeding and positive assortative mating will increase homozygosity while negative assortative mating will increase heterozygosity above that predicted under the H/W model.

As discussed earlier, a major factor that keeps populations evolving as a unit is when they are connected by ongoing dispersal. In theory, the more effective dispersal that occurs (where individuals move among populations and reproduce in the new site), the more similar populations should be, genetically. Effective dispersal is called ‘Gene Flow’ and is another method apart from mutation by which new genes can enter a population. Gene flow is a very powerful force for homogenising gene frequencies among demes or populations and the more gene flow that occurs the lower will be the level of inbreeding. Essentially, gene flow is a force that opposes development of population differentiation and hence population sub-structuring. As gene flow increases it should also increase heterozygosity in the receiving population. This effect results from crosses among individuals from different populations that did not have identical gene frequencies at all loci at the start of the process. So gene flow and inbreeding are essentially opposing forces that largely determine the extent of population structure that

will evolve among demes. If gene flow is high among subpopulations, then population structuring will be low because inbreeding is reduced. If gene flow is low among subpopulations, then population structuring will be high because inbreeding will increase.

Once the relationship between inbreeding and gene flow was understood, interest focused on the diversity of potential patterns of population structure that could result in nature. So migration models were devised to describe patterns of population subdivision that were possible. Essentially because they are population genetic models, they describe the relative contribution that migrants make to demes that they enter (i.e. the extent of effective dispersal).

The simplest migration model is an ‘Island Model’ where subpopulations of equal size over a geographical area interact in such a way that they can exchange genes with equal probability. An example could be subpopulations of a fish species confined to a large lake. The relationship between FST and gene flow (Nm) for the island model is:

FST = 1/(1 + 4Nm)

A second kind of model is ‘Isolation by Distance’ where relative gene flow among subpopulations of one large population is affected by distance among subpopulations and/or possible alternative paths by which individuals can disperse. An example could be populations of a species of fish that occur widely across an ocean. While dispersal is possible either directly or indirectly via generational connections, individuals disperse more commonly at a relatively local scale so that subpopulation differentiation is greatest at the largest spatial scale. ‘Stepping Stone Models’ are mathematically more complex and describe situations where dispersal is only possible between adjacent populations and the greater the geographic distance between populations the less chance there is of gene flow, so there is genetic isolation by distance. In this case the relationship between FST and gene flow is:

FST = 1/[1 + 4Nm(2Nµ/Nm)^(1/2)]

Notice that the stepping stone model approaches the island model when populations become very large. Also, the stepping stone model is a function of not only gene flow but the mutation rate (µ) as well.

An example could be catadromous fish species that spend much of their life cycles in freshwater, but can have limited dispersal via the marine environment and hence reach neighbouring rivers. Complexity of stepping stone models can be increased by spatial and temporal effects of the environment and this will have consequential effects on the relative complexity of the mathematical equations used to describe the relationships.

The models discussed above attempt to describe the patterns of population structure that can exist in specific situations in nature. All rely on the association between gene flow and

As biologists began to apply migration models to aquatic species they quickly realised that in some instances (e.g. riverine freshwater systems) the existing models were not adequate to explain all possible limitations on gene flow. Riverine systems are unique in that they can impose a hierarchical structure on the potential for gene flow in species that are obligate in that environment (e.g. some freshwater invertebrates and fishes). This led Meffe and Vrijenhoek (1988) to develop a specific model to address this situation, which they called the ‘Stream Hierarchy Model’ (SHM). The model attempts to describe the fact that rivers and streams are essentially dendritic spatial systems for the organisms that are obligate users of them. Consequently, their patterns of genetic diversity should reflect the dendritic nature of the habitat, with genetic diversity likely to increase down the system because of water flow effects on relative dispersal, and with gene flow structured hierarchically. Thus gene flow is structured according to the following hierarchy, within stream > among streams > among drainages, so that:

HT = HC + DCR + DRS + DST

Where:

• HC = within population diversity

• DCR = differences among populations in a river

• DRS = differences among rivers in a drainage

• DST = differences among drainages

With the expectation that: DCR < DRS < DST

inbreeding, i.e. as gene flow increases, the level of inbreeding should decline. This means that when we have data on how differentiated two or more populations are from each other, in theory this tells us how much gene flow is occurring (or has occurred historically) between them. This information is used to calculate Nm, a statistic that equates to the number of migrants moving between populations, or the migration rate. Nm is based on an Island model of population structure and estimates recurrent gene flow among subpopulations and is equivalent to ‘the probability that an allele randomly chosen from the population comes from a migrant’. Nm can be difficult to measure in nature, but if we know allele frequencies in both donor and recipient populations before gene flow and the change in allele frequency in the recipient population after gene flow, then we can estimate Nm. This is because the change in allele frequency over time following gene flow is proportional to the difference in frequencies between donor and recipient populations.

The outcome of modelling of the effect of different levels of gene flow among populations has shown quite clearly that even very limited gene flow is sufficient to keep populations essentially genetically homogeneous. As little as a single migrant per generation is sufficient, in theory, to homogenise gene frequencies among populations. So only very limited dispersal is capable of restricting divergence that results from local selection and genetic drift effects.
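
Both ways of quantifying gene flow described here can be sketched in Python. The first pair of functions simply rearranges the island model relationship FST = 1/(1 + 4Nm). The third assumes a simple one-way, single-generation migration model in which the recipient frequency changes by m × (donor frequency − recipient frequency); that model, the function names and the example values are assumptions made for illustration and are not taken from the manual.

```python
def fst_from_nm(nm):
    """Island model: FST = 1 / (1 + 4Nm)."""
    return 1 / (1 + 4 * nm)


def nm_from_fst(fst):
    """Island model rearranged: Nm = (1/FST - 1) / 4, for 0 < FST <= 1."""
    if not 0 < fst <= 1:
        raise ValueError("FST must be greater than 0 and at most 1")
    return (1 / fst - 1) / 4


def migration_rate(p_recipient_before, p_recipient_after, p_donor):
    """Estimate m from the shift in recipient allele frequency.

    Assumes one generation of one-way gene flow:
    p_after = p_before + m * (p_donor - p_before).
    """
    difference = p_donor - p_recipient_before
    if difference == 0:
        raise ValueError("donor and recipient frequencies must differ")
    return (p_recipient_after - p_recipient_before) / difference


if __name__ == "__main__":
    print(fst_from_nm(1))         # one migrant per generation
    print(nm_from_fst(0.05))      # migrants implied by a low FST
    print(round(migration_rate(0.20, 0.26, 0.80), 3))
```

Under the island model one migrant per generation corresponds to FST = 0.2, and estimates of Nm rise steeply as FST approaches zero, so small errors in a low FST translate into large uncertainty in Nm.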

SECTION 7

Environmental influences on
population processes
In the previous sections we have discussed the genetic processes that operate at the population level that principally determine population structure (i.e. mutation, genetic drift, gene flow and selection). These processes however, must operate within a framework shaped by the environment (extrinsic factors) and the ecology and life history traits of the species (intrinsic factors). In fact it is rather meaningless to interpret population genetic data (especially for management purposes) in isolation without taking intrinsic and extrinsic factors into account. In this section we will look at the role that the environment can play in shaping genetic variation in natural populations, with the emphasis on freshwater systems. Firstly we can disregard mutation, as the effect of the environment on the mutation process largely results in somatic mutations which are rarely heritable (e.g. solar radiation causing skin cancer).

The environment can either promote or inhibit gene flow among populations and as such a heterogeneous environment (as is usually the case) will result in varying levels of population connectivity. An important consideration is that the environment or habitat of a species is rarely stable over time and therefore its impact on shaping population structure will consist of a historical and a contemporary component. That is, how the environment is affecting structure today (on an ecological time scale), and how it affected population structure in the past (on an evolutionary time scale).

Change in the physical environment that has affected levels of gene flow among populations on an evolutionary time scale has been immense. For example, the continents that we know today once were part of super-continents (Pangaea, Gondwana). As the continents drifted apart (via plate tectonics) gene flow ceased, leaving populations isolated from each other (unless they were very good at swimming or flying). The separation of the super-continents happened so long ago that most species affected by it have since gone extinct or isolated populations have evolved into different species. There are however, still several closely related taxa that share a Gondwanan distribution (e.g. marsupials, ratite birds, lungfish). Because the separation occurred so long ago, it bears little application to intraspecific level processes.

On a more recent evolutionary time scale however, many events have shaped the population structure of extant species, particularly during the Pleistocene. Many populations became isolated due to the expansion of ice sheets during the most recent ice age. In fact many populations still bear the genetic signatures of these vicariant events in North America and Europe even though levels of gene flow are significantly higher today than 10,000 years ago. Mountain uplift due to tectonics or volcanism (geomorphological change) also has resulted in much habitat fragmentation leading to population differentiation and many of these populations still remain isolated today. The rise and fall

of sea levels (eustasy) also connected and isolated landmasses and hence populations repeatedly. For example much of the terrestrial fauna shared among the Indonesian islands and between Australia and New Guinea can be explained through this process. Over this sort of time scale there were also significant fluctuations in temperature which played a significant role in shaping genetic variation in populations. Changes in temperature generally led to reduced habitat availability with intervening regions often inhospitable to dispersal.

All of the historical environmental fluctuations mentioned above have had significant impacts on the population structure of freshwater fauna through the modification of dispersal pathways. One of the most significant effects resulted from geomorphological change through the rearrangement of drainage channels (e.g. river capture). Under this scenario, where a stream flowing to one river system is ‘captured’ and begins flowing in another direction, populations that had previously been connected through a high level of gene flow became totally isolated while populations that may have been isolated started exchanging genes. The geomorphological evolution of drainage channels is generally seen as the primary factor that influences the distribution of most obligate freshwater species.

Sea level fluctuations have also had a significant influence on shaping population structure of freshwater fauna. For example, towards the end of the Pleistocene, low sea levels resulted in freshwater connection between Australia and New Guinea via ‘Lake Carpentaria’. Several freshwater species (e.g. gudgeons, rainbowfish, freshwater prawns) still have a distribution that reflects this history. Another effect of eustatic change has been on river systems that are currently isolated by the marine environment but historically had a freshwater confluence at times of low sea level. This phenomenon also explains the distribution of genetic variation of a southeast Asian freshwater catfish (Hemibagrus nemurus) among currently isolated river drainages. During the Pleistocene, low sea levels resulted in freshwater confluences on the Sunda Shelf that facilitated interdrainage gene flow (Dodson et al. 1995).

Sometimes climate has changed so rapidly (e.g. temperature) that species fail to evolve in situ and are forced to move to more suitable habitat. This movement may take the form of latitudinal or altitudinal shifts. For example, freshwater crayfish in Australia (Euastacus sp.) historically had a widespread lowland distribution. As temperatures began to increase in the Miocene, they were forced to retreat further and further up mountains where cool moist conditions still remained. Eventually, connectivity among mountain top populations was cut as the intervening lowlands became uninhabitable for crayfish.

On an ecological time scale, there are also many factors that can affect levels of gene flow, either promoting

or restricting it. Firstly, it must be recognised that due to the nature of river systems, freshwater populations are expected to be highly structured especially among drainages. The terrestrial environment and the marine habitat that separate rivers inherently dictate that gene flow will be highly restricted. Climatic fluctuations however, can overcome these barriers to dispersal. High rainfall can result in freshwater plumes around the mouths of rivers (e.g. the freshwater plume at the mouth of the Amazon River sometimes extends hundreds of kilometres into the Atlantic Ocean). Depending on the scale of the plume and the proximity of the neighbouring river mouths, connectivity among normally isolated rivers may exist and for a short period of time a small degree of dispersal may result. Also, flooding caused by high rainfall can lead to a high degree of connectivity amongst normally isolated drainages resulting in massive interdrainage dispersal events, especially in areas of low elevation (e.g. inland eastern Australia).

Within a single drainage there also exist several natural barriers to gene flow, some of which are influenced climatically. For example, headwater streams that are continuous during the wet season may be transformed into a series of isolated waterholes during the dry season. Also dispersal vectors such as water currents may change seasonally (e.g. Tonle Sap River, Cambodia). These processes can affect gene flow. Generally, most species have evolved to cope with seasonal environmental fluctuations but random catastrophic disturbances (e.g. tropical cyclones, earthquakes, volcanic eruptions) that alter the landscape will probably affect the qualities necessary for continued connectivity. For example, volcanic eruptions in New Zealand some 2,000 years ago, with associated lava flows and ash deposits, resulted in small isolated populations of freshwater fish species with little or no potential for gene flow among them.

Other natural instream barriers to dispersal include waterfalls, rapids and cascades. It is not uncommon for upland populations to be totally isolated from downstream populations that are divided by a significant and rapid change in stream profile. Stream flow itself dictates that gene flow downstream is going to be significantly greater than in an upstream direction, unless species have evolved dispersal mechanisms to counteract this effect (e.g. positive rheotaxis). Another important barrier to dispersal in some freshwater systems is simply physical distance. In extensive drainages such as the Mekong River, it may not be physically possible for an individual to traverse the entire distance of the river in a single lifetime.

As can be seen from this discussion, there are many environmental factors that can influence gene flow (either promoting or restricting) that operate over various temporal scales. For management purposes, an important goal is to understand the magnitude of gene flow that is occurring today. As such, one of the challenges of population genetics is to be able to

59
differentiate the effects of historical versus contemporary gene flow on the
observed population structure. Early population genetic studies based on
allozymes were largely unable to accomplish this. Allozymes (and to a certain
extent mtDNA haplotypic frequency data) can distinguish between high gene flow
and total isolation, but the interpretation of situations in between these
extremes can only ever be an educated guess. The development of more sensitive
techniques (i.e. DNA sequencing, microsatellites) has provided tools that
allow us to determine more confidently the relative contributions of both
historical and contemporary processes to gene flow.

Finally, humans in recent times have had a substantial impact on gene flow in
freshwater systems. Over the past few hundred years, anthropogenic
modifications to natural water courses have been significant. Mostly these
modifications, such as dams, pollution and stream channel alteration, have
resulted in dispersal being restricted further. Because most anthropogenic
disturbance has occurred relatively recently, any cessation of gene flow is
unlikely to be detected in the data from a population genetics survey
(although some population structuring has been recorded either side of some
dams in the United States that have only been in existence for 50 years). On
the other hand, human mediated gene flow via interbasin transfers of water or
direct translocation of species among drainages can be detected if the newly
mixed populations were genetically divergent in isolation (mtDNA is a
particularly powerful marker for this application).

We know that the population process of random genetic drift is largely a
function of population size: the smaller the population, the greater will be
the effect of drift. It is also commonly known that populations naturally
fluctuate in size over time, with much of the fluctuation a result of
environmental influences. Once again, these environmental fluctuations have a
historical and a contemporary component. For example, during the Pleistocene
much of the freshwater habitat in the northern hemisphere was locked up as ice
and what suitable habitat was left tended to fragment large populations,
sending many subpopulations extinct. The surviving individuals existed as
small populations in small habitat refugia. During this time much genetic
variation would have been lost. Subsequent climatic warming re-opened much
habitat, allowing the small populations to rapidly increase their range and
expand into areas previously unavailable to them, with an associated increase
in population size.

On an ecological time scale, natural seasonal shifts result in fluctuations of
available habitat (as seen in the previous section). When habitat is reduced,
such as in the dry season, population size generally decreases. Similarly,
seasonal fluctuation can affect resource availability. If periods of poor
habitat and low resources coincide, local populations may crash
and possibly become extinct. Because seasonal fluctuations are short lived, it
is expected that the level of genetic variation in the population will be
determined by the duration of poor conditions and hence the effects of drift
when the population is at its smallest size. If only a few individuals make it
through the bottleneck, regardless of the rate of recovery, genetic variation
will have been lost and recovery can take a long time.

While most species are well adapted to their environments and have life
history traits that are well suited to seasonal environmental fluctuations,
catastrophic events can have a devastating effect on population numbers and
even result in local extinctions. It may take many generations before
population size recovers and many more before genetic variation reaches
pre-disturbance levels. The amount of genetic variation a population can
maintain over time is determined by the population size at its lowest level,
not at the highest.

When new mutations arise, they are either beneficial, deleterious or neutral.
Their relative fitness is purely a function of the environment. Much genetic
variation may exist in a population that is essentially selectively neutral. A
sudden change in environmental conditions, however, may result in a particular
genetic variant having a significant selective advantage (or disadvantage).
This will lead to strong directional selection that will inherently reduce
genetic variation. In association with genetic drift, localized differential
selection pressures due to varying environments play a major role in
influencing genetic differentiation among populations.

On the other hand, selection may act to homogenize allele frequencies among
populations, even in the absence of gene flow. If the local environments of
populations are similar, alleles may be under similar selection pressure (i.e.
the same alleles are favoured in both populations), thereby creating a
population structure that would be expected under a model of high gene flow.
This highlights the necessity for choosing neutral markers for population
studies that are capable of revealing any population structure that may be
present.
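The statement that genetic variation is governed by the smallest population size, not the largest, can be illustrated numerically. The sketch below is not taken from this manual: it assumes two standard textbook approximations, namely that the long-term effective size over several generations is the harmonic mean of the per-generation sizes, and that expected heterozygosity declines by a factor of 1 - 1/(2N) in each generation of drift. The function names and the population sizes are hypothetical and chosen only for illustration.

```python
def harmonic_mean_ne(sizes):
    """Long-term effective size, taken here as the harmonic mean of
    the per-generation sizes (a standard approximation, assumed)."""
    return len(sizes) / sum(1.0 / n for n in sizes)


def heterozygosity_retained(sizes):
    """Expected fraction of the initial heterozygosity remaining after
    drift, assuming a loss of 1/(2N) in each generation of size N."""
    retained = 1.0
    for n in sizes:
        retained *= 1.0 - 1.0 / (2.0 * n)
    return retained


# Hypothetical scenarios: ten generations at a constant size of 1000,
# versus the same span with a two-generation crash to 10 individuals.
stable = [1000] * 10
bottleneck = [1000] * 4 + [10, 10] + [1000] * 4

for label, sizes in (("stable", stable), ("bottleneck", bottleneck)):
    print(label,
          round(harmonic_mean_ne(sizes), 1),
          round(heterozygosity_retained(sizes), 3))
```

Under these assumptions, two generations at a size of 10 pull the long-term effective size down to roughly 48, even though the population numbers 1000 in the other eight generations, which is the sense in which the lowest point dominates.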

SECTION 8

Ecological influences on population processes
It is difficult to discuss ecological influences on population processes
without incorporating environmental factors because a species’ life history
traits (LHT) will adjust over time to local environmental conditions. However,
certain LHTs will inherently influence the effects of gene flow and genetic
drift.

Gene flow can be achieved by individuals at all life history stages (i.e. from
fertilized eggs through to adult) or as gametes (eggs or sperm). Most species
have evolved a dispersal phase in their life history in order to avoid
inbreeding and competition with close relatives. Dispersal can be either
passive or active. Passive dispersal is usually undertaken as gametes or as
planktonic larvae, but exceptions do exist (e.g. some adult spiders disperse
large distances in the wind by producing ‘silk parachutes’). Passive dispersal
has the advantage that minimum energy is required; however, a dispersal vector
is required (e.g. a water current). The disadvantage of this form of dispersal
is that the individual may end up in unsuitable habitat.

In most river systems, passive dispersal is always in a downstream direction.
This presents a problem: how do upstream reaches of a stream remain colonized?
Many freshwater taxa with a passive dispersal stage also have a compensatory
behaviour at some stage of their life history. For example, many freshwater
crustaceans display a positive rheotactic response as adults (i.e. they
actively swim or walk against the current). Similarly, many insect species
that have freshwater larvae fly upstream as adults to breed. Some species,
rather than compensating for downstream dispersal, have evolved physiological
or behavioural traits that assist them to avoid displacement in the first
place. Most freshwater crustaceans have an abbreviated larval phase, thereby
reducing the time in the plankton. Some species are dorso-ventrally flattened,
which makes them less ‘visible’ to the water current. Others have the ability
to adhere to the substrate, and some species glue their eggs to the substrate.
Behavioural adaptations include brooding of eggs or larvae, remaining at the
edge of the stream (where the current has less velocity), hiding under large
immovable objects (rocks, snags, etc.) and burrowing into the substrate.

Irrespective of compensatory, physiological or behavioural adaptations, the
majority of gene flow in freshwater systems is in a downstream direction.
Therefore downstream populations tend to act as ‘sinks’ for genetic variation
and should display higher levels of diversity than populations further
upstream. Furthermore, confluence sites should have a mixture of all alleles
present in the river branches that lead to them. This effect is further
accentuated if there is a barrier to dispersal such as a waterfall that
significantly restricts upstream movement. Therefore, new mutations that arise
in upstream populations can disperse downstream but new variants from
downstream may not be found upstream.
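The ‘sink’ effect described above can be sketched with a deliberately simple toy model. It is an illustration only, not a method from this manual: it tracks which alleles are present at each site along a single unbranched river, moves alleles strictly downstream one site per generation, and ignores drift, selection and new mutation. The function name, the allele labels and the three-site river are all hypothetical.

```python
def spread_downstream(sites, generations):
    """Toy one-way dispersal model (assumption: alleles only move
    from each site to the next site downstream, never upstream).

    sites: list of sets of allele labels, ordered from the headwater
    (index 0) to the river mouth (last index).
    """
    current = [set(s) for s in sites]
    for _ in range(generations):
        following = [set(s) for s in current]
        for i in range(1, len(current)):
            # Each site receives every allele present immediately upstream.
            following[i] |= current[i - 1]
        current = following
    return current


# Hypothetical river: a shared allele "A" plus one private allele
# that arose at each of the three sites.
river = [{"A", "u1"}, {"A", "m1"}, {"A", "d1"}]
final = spread_downstream(river, generations=2)
richness = [len(site) for site in final]
print(richness)
```

In this sketch the upstream allele `u1` reaches the river mouth after two generations, while the downstream allele `d1` never appears at the headwater, so allelic richness rises from headwater to mouth.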
Species that have evolved an active dispersal phase are particularly
vulnerable to anthropogenically modified environments, especially migratory
species. For example, dams or impoundments can interrupt long established
dispersal pathways to breeding or feeding grounds. Disruptions to the natural
life history of the species in this manner will result in a marked reduction
in the potential for long-term population persistence.

Many species display sex-biased dispersal in their life history. That is,
either males or females, but not both, are the principal dispersers. This has
significant implications for mtDNA studies due to the maternal inheritance of
the molecule. If dispersal is male mediated, then there is no effective
dispersal of mitochondrial genes (i.e. gene flow is zero). Therefore an mtDNA
survey may indicate strong genetic structuring while nuclear markers may
reveal panmixia. A similar pattern may be seen in philopatric species (those
that return to their natal site to reproduce). Even though these species may
disperse over great distances (e.g. across oceans), if the female is
philopatric, mtDNA gene flow is nil (e.g. this pattern is seen in sea otters).

Species evolve to maximise their reproductive output. This may be through
breeding at a time of maximum resource quantity/quality, iteroparity (multiple
breedings over time) or through modifying the reproductive allocation to suit
prevailing environmental conditions. Irrespective of these adaptations to
maintain high population numbers, fluctuations in environmental conditions
(both predictable seasonal and catastrophic change) mean that most populations
will go through declines and expansions (boom/bust). As discussed in previous
sections, the severity and the duration of population declines will largely
determine the level of genetic variation that can be maintained in the gene
pool. The most extreme form of population size fluctuation is that of
extinction and recolonisation. Depending on the source and magnitude of the
recolonisation, genetic diversity may either increase or decrease, both within
and among populations.

Evolution of LHT in some species has resulted in breeding systems where
certain sexually mature individuals (usually the males) gain a high percentage
of matings (e.g. harem system in many pinniped species). This behaviour by
itself reduces Ne significantly. From Section 5 we know that unequal numbers
of breeding males and females will result in the effective population size
being significantly smaller than the total number of breeders. In some
breeding systems, selection has favoured mate choice (‘good genes’ hypothesis)
where the reduction in Ne is offset by the increased genetic quality of the
offspring (e.g. co-operative breeding in birds).

An important outcome of size fluctuations is that many natural populations
will rarely achieve mutation/drift
equilibrium, a condition that forms a
common assumption underlying many
statistical analyses.
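
The reduction in Ne caused by unequal numbers of breeding males and females, mentioned in this section, can be made concrete with a short calculation. The sketch below assumes the standard textbook expression Ne = 4·Nm·Nf / (Nm + Nf); that formula is not quoted in this part of the manual, so it is an assumption here, and the function name and breeder counts are hypothetical.

```python
def effective_size_sex_ratio(n_males, n_females):
    """Effective population size for unequal numbers of breeding
    males and females, using Ne = 4*Nm*Nf / (Nm + Nf) (assumed
    standard formula, not quoted from this manual)."""
    if n_males <= 0 or n_females <= 0:
        raise ValueError("both sexes must contribute at least one breeder")
    return 4.0 * n_males * n_females / (n_males + n_females)


# Hypothetical populations, each with 100 breeders in total.
balanced = effective_size_sex_ratio(50, 50)  # equal sex ratio
harem = effective_size_sex_ratio(5, 95)      # a few males gain most matings
print(balanced, harem)
```

With 100 breeders in both cases, the balanced population has an Ne of 100 under this formula, whereas the harem-style population has an Ne of only 19.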

Glossary

Aestivation: Dormancy during summer or dry season.

Allele: An alternative form of a gene occurring at the gene locus.

Allopatric: Relating to the geographic distribution of populations/species with
distributions that do not overlap.

Allozymes: Alternative forms of an enzyme coded for by different DNA sequences
at a single genetic locus.

Ancestral retention: Isolated populations having the same allele from a time prior
to isolation.

Autosomal loci: Gene sequences on non sex linked chromosomes.

Balanced polymorphism: Where multiple alleles exist at a single locus over
evolutionary time.

Balancing selection: Process by which multiple alleles are maintained by selection at
a coding locus.

Base-Pairing Rule: Where A binds to T (U) and G binds to C.

Bonferroni correction: Adjustment of the significance (α) of a statistical test to
reduce the probability of committing a Type I error through multiple comparisons.

Bootstrapping: Permutation method for testing the reliability of a node in a gene
tree.

Coding: Region of DNA that can be transcribed and translated to produce
functional polypeptide product.

Co-dominant: Locus where heterozygotes express a unique phenotype.

Codon: Nucleotide triplet in DNA or RNA that specifies the amino acid to be
inserted in a specific position of a polypeptide.

Confluence: A place where two water channels join.

Conspecific: Of the same species.

Cytoplasm: All cell contents excluding nucleus.

Denature: Process of breaking the bonds between the two complementary strands
of DNA through chemical or temperature stress.

Dendritic: Resembling a dendrite (nerve cell) which is characterised by many paths
of connection.

Diploid: Referring to an organism having two sets of chromosomes, one from each
parent.

Directional selection: Process by which one phenotype is favoured by selection,
producing an increase in the relative frequency of the underlying allele.

Dispersal: The movement of individuals.

Dispersal vectors: Extrinsic entities that facilitate dispersal; physical (wind, water) or
biological.

DNA markers: DNA sequences that can characterise individuals, populations,
species, etc.

DNA polymerase: Enzyme that catalyses production of new DNA molecules.

DNA replication: Process of generating new DNA strands.

Dominant: Locus where heterozygotes cannot be distinguished from dominant
homozygotes.

Duplex: Single stranded DNA that has re-annealed to its complementary strand or
another strand with different sequence (homoduplex and heteroduplex
respectively).

Effective population size: The number of breeding adults in a population that
contribute their genes to the next generation (Ne).

Electrophoresis: Procedure for separation of molecules (based on charge and/or
structure) in an electric field.

Epistatic interaction: Interaction of nonallelic genes to produce phenotypes.

Eukaryote: Referring to the superkingdom that contains organisms whose cells
contain membrane-bound nuclei and mitochondria (Protista, Fungi, Animalia, and
Plantae).

Eustasy: Rise and fall of sea levels.

Evolution: Changes in genetic composition that occur within populations from one
generation to the next.

Evolutionary mechanism: A process that results in changes in gene frequencies in
populations.

Extant: Currently in existence, not extinct.

Extrinsic factors: Factors outside the basic nature of something.

Fixation: The point when an allele reaches a frequency of 100% in a population.

Founder event: The formation of a new population by one or a few individuals.

Gamete: Germ cell (egg, sperm) with a haploid genome.

Gene: A hereditary unit that occupies a specific location (locus) on a chromosome,
the physical entity that is transmitted from parent to offspring.

Gene expression: When a coding locus produces a phenotype (protein).

Gene flow: Successful movement of genes among populations via dispersal of
individuals or gametes.

Gene frequency: The frequency at which an allele occurs within a population.

Gene function: The role that a specific coding DNA sequence plays.

Gene tree: A branching diagram depicting the inferred relationships among a
group of genes or other DNA fragments.

Genetic Drift: Random changes in gene frequency due to chance.

Genetic recombination: Reassortment of DNA sequences on homologous
chromosomes at meiosis.

Genome: The total genetic material within a cell or individual.

Genotype: Genetic constitution of a cell or individual.

Geographical speciation: Where two species evolve from a common ancestor as a
result of geographical isolation and independent evolution in different
environments.

Geomorphology: Study of the evolution of physical landscapes.

Gondwana: Supercontinent (consisting of South America, Africa, Australia and
Antarctica) existing ~200 million years ago.

Haploid: Single copy of each gene.

Haplotype: Genetic constitution of a haploid cell.

Heritable: Capable of being passed from one generation to the next.

Heredity: The genetic transmission of characteristics from parent to offspring.

Heteroduplex: Combining DNAs from different sources together.

Heterogeneous: Variable, made up of different elements.

Heterologous: Non-identical DNA sequences.

Heteroplasmic: Contains more than a single DNA sequence per cell.

Heterosis: Also known as hybrid vigour and heterozygote advantage. Occurs when
the fitness of individuals with two different alleles at a locus is greater than the
fitness of individuals with two identical alleles at a locus.

Heterozygote: An individual that carries two different alleles at a diploid locus.

Homoduplex: Single DNA strand bound to its mirror image.

Homoplasmic: Contains a single DNA sequence per cell.

Homoplasy: Denotes parallel or convergent evolution of DNA sequence
information, same allelic state not resulting from descent from a common ancestor.

Homozygote: An individual that carries two copies of the same allele at a diploid
locus.

In vitro: In an artificial environment outside an organism.

In vivo: Within a living organism.

Indels: Changes in DNA sequence, specifically insertion or deletion of nucleotides.

Inbreeding: Mating among closely related individuals.

Intraspecific: Pertaining to interactions among individuals of the same species.

Intrinsic factors: Factors belonging to the basic nature of something.

Introgression: Mixing of discrete entities/populations.

Iteroparity: The characteristic of breeding more than once in a lifetime.

Life history trait (LHT): Significant feature of the life cycle through which an
organism passes.

Lineage sorting: Process of particular genetic variants going extinct over
evolutionary time in a population through random drift.

Locus: The site that a gene or molecular sequence occupies on a chromosome
(plural loci).

Maternal: Relating to the mother.

Microsatellites: Tandem repeats of short DNA motifs (non-coding DNA).

Migration: The mass directional movement of large numbers of individuals of a
species as part of their life history.

Miocene: Geological time scale (epoch) from 23.8-5.3 million years ago.

Mismatch distribution: Distribution of the pairwise nucleotide differences among
all DNA sequences in a sample, used for determining historical demographic
change.

Molecular clock: Consistent accumulation/loss of mutations at a locus that occurs at
the DNA level.

Molecular drive: A process of DNA ‘turnover’.

Molecular systematics: Use of DNA sequence data to characterise relationships
among organisms.

Morphology: Shape, form, external structure or arrangement of an organism.

Multidimensional scaling: Multivariate statistical method that represents the
multidimensional similarity of samples in two or three dimensions.

Mutation: Alteration in the arrangement or amount of genetic material of a cell.

Mutation/drift equilibrium: The increase in genetic variation in a population
through mutation is offset by the reduction in genetic variation due to random
drift.

Mutation rate: The frequency with which new mutations arise in a population.

Natural selection: Change in gene frequencies due to differences in individual
fitness.

Neutral markers: DNA sequences that evolve solely due to genetic drift and
mutation.

Neutral Theory: Theory proposed by M. Kimura to account for the high level of
genetic variation in populations; most point mutations are selectively neutral.

Niche: The position or role of a plant or animal species within its community or
environment.

Nitrogenous bases: Bases that code for variation at the DNA level.

Non-coding: Region of DNA that is not transcribed and translated.

Nucleotides: Building blocks of DNA molecules.

Nucleotide diversity: Measure of genetic variation, the mean number of base pair
differences in a sample.

Null alleles: An allele that produces no functional product, a sequence that is not
amplified in PCR-based analysis because a variation in the DNA sequence annealing
to the 3’ end of a primer results in nonamplification of the expected segment.

Outbreeding: Breeding with individuals from another sub-population.

Pangaea: Supercontinent (all continents joined) existing ~ 225 million years ago.

Panmixia: A single gene pool, no barriers to gene flow.

Pentose sugar: Part of DNA ‘skeleton’.

Permutation test: Statistical procedure that repeatedly randomises a data set to
create a null distribution for testing the significance of a parameter estimated
from the data.

Phenotype: The physical manifestation of a genotype, e.g. the colouration pattern
of a fish.

Philopatric: Returning to the natal site to reproduce.

Phosphate group: Part of DNA ‘skeleton’.

Phylogenetic: Relating to the hypothesised evolutionary relationships of
individuals, populations or species.

Phylogeography: Analysis of genealogy, population genetics or evolution within a
geographical context.

Point substitutions: Mutations at a single base pair site.

Polygenic: Refers to a trait or phenotype whose expression is the result of the
interaction of numerous genes.

Polymorphic: The occurrence of different forms, stages, or types in individual
organisms or in organisms of the same species, independent of sexual variations.

Polyphyletic: The term for a group of organisms, when despite their being
classified together as one taxonomic category, it is thought that not all have
descended from a common ancestor.

Polyploidy: The situation of cells or individuals having additional complete sets of
chromosomes.

Ploidy: Number of sets of homologous chromosomes.

Population bottlenecks: Severe reduction in population size that reduces
population genetic variation.

Post hoc: Planned after the fact.

Plate tectonics: Movement of continental plates.

Pleistocene: Geological time scale (epoch) from 1.8 million – 10,000 years ago.

Primer: A short oligonucleotide fragment from where nucleotide extension is
initiated during PCR.

Proof reading: Process of scanning new DNA strands for replication “errors”.

Purifying selection: Process of removal of deleterious alleles from gene pools.

Recolonisation: The arrival of a number of individuals to re-establish a population
that had gone extinct.

Refugia: Places of suitable habitat generally surrounded by inhospitable habitat.

Relative fitness: Measure of relative reproductive success of different phenotypes.

Restriction enzymes: Prokaryote enzymes that cut DNA strands.

Rheotaxis: Active dispersal response when subject to a current; positive (against
the current) or negative (with the current).

Ribosome: Site of protein synthesis in cells.

River capture: Drainage rearrangement where a stream flowing into one river
system is diverted (captured) by an adjacent system that has a higher erosional rate.

Segregating site: Nucleotide position in a series of homologous DNA sequences
where a mutation has occurred.

Semi-conservative replication: Where new DNA molecules are synthesised using an
‘old’ strand as a template.

Selection coefficient: The relative fitness of a particular genetic variant(s).

Selectively neutral: Not affected by natural selection.

Sex-biased dispersal: Dispersal predominantly by one sex.

Simple sequence repeats: Microsatellites.

Sink: A population that accumulates genetic variation through gene flow from
several other source populations.

Somatic: Of cells of the body as opposed to germ cells.

Stochastic: Involving chance or probability.

Stock: A group of organisms that shares the same genetic and demographic
parameters.

Stream profile: 2 dimensional cross section of a stream reflecting elevation.

Substrate: Ground or other solid surface on which animals walk or to which they
are attached.

Temporal: Related to time.

Transcription: Process of encrypting DNA gene message onto a ‘carrier’ molecule
(mRNA).

Transition: Point mutation where a purine base is replaced by another purine (A or
G) or a pyrimidine is replaced by another pyrimidine (C or T).

Translation: Decoding of mRNA into polypeptide.

Transversion: A point mutation where a purine (A or G) is replaced by a pyrimidine
(C or T) or vice versa.

Vicariance: Separation.

χ2 Test: A test that uses the chi-square statistic to test the fit between a theoretical
frequency distribution and a frequency distribution of observed data for which
each observation may fall into one of several classes.

Zygote: Offspring (2n) that results at fusion of egg (n) and sperm.

Also see additional glossary in “Glossary of biotechnology and genetic
engineering”, FAO Research and Technology Paper, No. 7 at:
http://www.enaca.org/modules/wfdownloads/singlefile.php?cid=63&lid=769

78
Bibliography

De Silva, S. S. (2004). Fisheries in inland waters in the Asian region with special reference to
stock enhancement practices. Bangkok, Thailand.

Dodson, J.J., Colombani, F. & Ng, P.K.L. (1995). Phylogeographic structure in mitochondrial
DNA of a South-East Asian freshwater fish, Hemibagrus nemurus (Siluroidei, Bagridae) and
Pleistocene sea-level changes on the Sunda shelf. Molecular Ecology 4: 331-346.

Eknath, A. E. & Doyle, R. W. (1990). Effective population size and rate of inbreeding in
aquaculture of Indian major carps. Aquaculture 85: 293-305.

Gupta, M. V. & Acosta, B. O., eds. (2001). Fish genetics research in member countries and
institutions of the International Network on Genetics in Aquaculture.

Kamonrat, W. (1996). Spatial genetic structure of Thai silver barb Puntius gonionotus
(Bleeker) population in Thailand. Halifax, Canada: Dalhousie University.

Petr. T. (ed.) (1998). Inland fishery enhancements. Papers presented at the FAO/DFID Expert
Consultation on Inland Fishery Enhancements. Dhaka, Bangladesh, 7–11 April 1997. FAO
Fisheries Technical Paper. No. 374. Rome, FAO. 1998. 463p.

Kimura, M. and Crow J. F., (1964). The number of alleles that can be maintained in a finite
population. Genetics 49: 725-738.

Rhymer, J. M. & Simberloff, D. (1996). Extinction by hybridisation and introgression. Annual


Review of Ecological Systematics 27: 83-109.

Senanan, W., Kapuscinski, A. R., Na-Nakorn, U. & Miller, L. (2004). Genetic impacts of hybrid
catfish farming (Clarias macrocephalus x C. gariepinus) on native catfish populations in
central Thailand. Aquaculture 235: 167-184.

Waters, J. M., Lopez, J. A. & Wallis, G. P. (2000). Molecular phylogenetics and biogeography
of galaxiid fishes (Osteichthyes: Galaxiidae): dispersal, vicariance and the position of
Lepidogalaxias salamandroides. Systematic Biology 49, 777-795.

79
Welcomme, R. L. & Vidthayanon, C. (2003). The impacts of introduction and stocking of
exotic species in the Mekong basin and policies for their control. Cambodia: Mekong River
Commission.

80