
Molecular Ecology (2008) 17, 3565–3582

doi: 10.1111/j.1365-294X.2008.03714.x

Identifying footprints of directional and balancing selection
in marine and freshwater three-spined stickleback
(Gasterosteus aculeatus) populations
Blackwell Publishing Ltd

H. S. MÄKINEN, J. M. CANO and J. MERILÄ
Ecological Genetics Research Unit, Department of Biological and Environmental Sciences, PO Box 65, FI-00014 University of Helsinki,
Helsinki, Finland

Abstract
Natural selection is expected to leave an imprint on the neutral polymorphisms at the adjacent
genomic regions of a selected gene. While directional selection tends to reduce within-population genetic diversity and increase among-population differentiation, the reverse
is expected under balancing selection. To identify targets of natural selection in the
three-spined stickleback (Gasterosteus aculeatus) genome, 103 microsatellite and two
indel markers, including expressed sequence tags (EST) and quantitative trait loci (QTL)-associated loci, were genotyped in four freshwater and three marine populations. The
results indicated that a high proportion of loci (14.7%) might be affected by balancing selection
and a lower proportion (2.8%) by directional selection. The strongest signatures of
directional selection were detected in a microsatellite locus and two indel markers located
in the intronic regions of the Eda-gene coding for the number of lateral plates. Yet, other
microsatellite loci previously found to be informative in QTL-mapping studies revealed no
signatures of selection. Two novel microsatellite loci (Stn12 and Stn90) located in chromosomes I and VIII, respectively, showed signals of directional selection and might be linked
to genomic regions containing gene(s) important for adaptive divergence. Although the
coverage of the total genomic content was relatively low, the predominance of balancing
selection signals is in agreement with the contention that balancing, rather than directional,
selection is the predominant mode of selection in the wild.
Keywords: balancing selection, directional selection, Gasterosteus aculeatus, genome scan, hitchhiking,
microsatellite
Received 25 October 2007; revision accepted 22 January 2008

Introduction
The amount of sequence information available for various
organisms has recently increased tremendously (Benson
et al. 2007), but the functional properties of the sequence
data remain mostly unresolved. For example, the identity
of genes underlying evolutionary change is still largely
unknown especially in wild populations (e.g. Mackay 2001;
Orr 2005a, but see Orr 2005b). One promising approach to
locate genes involved in adaptation is to use the hitchhiking-mapping approach to detect genomic regions showing
footprints of natural selection (Schlötterer 2003; Storz 2005;
Vasemägi & Primmer 2005). Population genetic theory
Correspondence: H. S. Mäkinen, Fax: +358-9-191 57694; E-mail:
hannu.makinen@helsinki.fi
© 2008 The Authors
Journal compilation © 2008 Blackwell Publishing Ltd

predicts that when the frequency of a beneficial allele in
a population is increased by selection, the patterns in
differentiation and diversity at linked sites are changed as
well in a predictable fashion (Maynard-Smith & Haigh
1974). Directional selection is expected to decrease within-population diversity and increase between-population
differentiation in comparison to neutral expectations. In
contrast, balancing selection tends to homogenize allele
frequencies and increase the within-population diversity
(Nielsen 2005; Charlesworth 2006). Thus, genomic regions
showing such patterns of genetic diversity could be considered as candidates for containing loci involved in
evolutionary change (Schlötterer 2003).
Separating the footprint of selection from the population
history poses a major problem for identifying genomic
regions under selection (e.g. Vitalis et al. 2001; Nielsen 2005;

Teshima et al. 2006). Especially in single-locus studies,
population expansions or bottlenecks can have a similar
effect on neutral polymorphisms as expected under selection (Simonsen et al. 1995). However, selection will affect only
locus-specific patterns in neutral polymorphisms as compared to the genome-wide effects of population history and
demographic events. Consequently, genotyping a large
number of neutral polymorphisms scattered throughout
the organism's genome is an effective way to tell apart the
effect of selection from the confounding effects of population
history (Schlötterer 2003).
Empirical and theoretical studies have shown that the
probability to detect signatures of selection depends largely
on the chromosomal distance between the selected site and
a neighbouring locus, as well as on the strength of selection
(Storz 2005; Cano et al. 2006; De Kovel 2006). The signal of
selection decays as recombination breaks down the linkage
disequilibrium between the selected site and a linked
locus. In wild populations, the footprint of directional
selection can be lost when many generations have elapsed
since the selection event (Storz 2005; De Kovel 2006). Therefore, tight linkage between the marker locus and the actual
target of selection would improve the probability to detect
targets of natural selection. Utilizing molecular markers
such as microsatellites found within expressed sequence
tags (EST) would be a logical starting point for the screening of gene-associated polymorphisms (Vasemägi et al.
2005).
Northern Hemisphere fish populations are suitable
models for screening of genomic regions underlying adaptation. During the northward recolonization after the last
glaciation (c. 10 000 years ago) different fish species adapted
to numerous newly formed freshwater habitats, leading to
rapid phenotypic diversification (Taylor & McPhail 2000;
Østbye et al. 2006; Rogers & Bernatchez 2007). For example,
in the Fennoscandian Atlantic salmon populations several
genomic regions showed indications of divergent selection
as a result of adaptation to the salt, brackish and freshwater
habitats (Vasemägi et al. 2005). Furthermore, comparisons
between quantitative genetic and neutral genetic differentiation have shown that the phenotypic differences
might have been shaped by directional selection even in
very short timescales (Leinonen et al. 2008).
Fennoscandian three-spined stickleback (Gasterosteus
aculeatus) populations provide an excellent system for
genome scan studies. Marine three-spined sticklebacks have
post-glacially colonized numerous freshwater habitats in
Fennoscandia showing extensive adaptive divergence in
morphological traits (Cano et al. 2006; Leinonen et al. 2006;
Mäkinen et al. 2006) as has been documented for the species
throughout its distribution range (Bell & Foster 1994;
McKinnon & Rundle 2002). In addition, there is a large
amount of genomic information available on the genetic
basis of the morphological differentiation in three-spined
stickleback populations (Peichel et al. 2001; Colosimo et al.
2004, 2005; Shapiro et al. 2004). For example, the number
of lateral plates in marine and freshwater populations is
largely determined by full and low plated alleles segregating
in a quantitative trait locus (QTL) known as Eda (Colosimo
et al. 2005). In the fully plated marine three-spined sticklebacks, the frequency of the full plated allele is close to fixation
and the frequency of the low plated allele is extremely low.
The frequencies are more or less reversed in the freshwater
populations, where the number of lateral plates is decreased
due to increased frequency of the low plated allele (Colosimo
et al. 2005). However, QTL-mapping and candidate gene
approaches might miss some relevant genes underlying
adaptive divergence as they require a priori information
about phenotypic evolution, which may not be available
(Schlötterer 2003). In addition, QTLs with weak phenotypic effects might not be detected due to the fairly low
resolution of the linkage maps available for nonmodel
organisms (Schlötterer 2003; Storz 2005). Therefore, a genome
scan for targets of natural selection could complement these approaches and
identify additional genomic regions involved in the adaptive divergence of three-spined stickleback populations.
The main aim of this study is to detect signatures of
directional and balancing selection in Fennoscandian
freshwater and marine three-spined stickleback populations. To maximize the chances of finding targets of natural
selection, 103 microsatellite and two indel markers spaced
across the stickleback genome were genotyped, consisting
of a set of EST-associated markers and QTL-linked loci
coding for morphological traits in experimental crosses.
This approach allowed evaluation of the efficiency of using
microsatellites linked to QTLs in experimental crosses to
identify the footprints of selection, as well as search for
novel genomic regions of adaptive importance.

Material and methods


Study populations
Altogether seven populations inhabiting both marine and
freshwater environments were analysed (Fig. 1, Table 1).
Three populations were of marine origin: one each from the
Baltic Sea (Merirastila), the North Sea (Orrevatnet) and the
Barents Sea. Merirastila and Orrevatnet were coastal
sampling sites whereas the Barents Sea samples were
collected from the pelagic area. The freshwater sampling
sites comprised a lake (L.) population in southern
Sweden (L. Vättern) and two lake populations in Finnish
Lapland (L. Pulmanki and L. Kevo). One distantly located
river (R.) population (R. Neretva) in the Adriatic Sea region
was included for comparative purposes. The Scandinavian
freshwater populations originate from independent colonizations by marine three-spined sticklebacks after the last
Fig. 1 The geographical locations of the six
Scandinavian study populations are shown
in the larger map and the distantly located
R. Neretva population in the inset map. The
plate morphs are indicated next to the
population labels.

Table 1 Basic information of the sample sites and genetic diversity (HE), allelic richness (AR) and inbreeding coefficient (FIS) estimates

Population    Coordinates        Drainage                Habitat             Plate morph  HE    AR   FIS
Merirastila   60°10′N, 25°00′E   Baltic Sea              Marine (coastal)    Full         0.76  8.8  0.028
Orrevatnet    60°19′N, 05°22′E   North Sea               Marine (coastal)    Full         0.66  5.1  0.007
Barents       74°58′N, 37°08′E   Barents Sea             Marine (pelagic)    Full         0.73  7.7  0.011
L. Vättern    58°54′N, 14°24′E   S Sweden/Baltic Sea     Freshwater (lake)   Partial      0.74  8.0  0.033
L. Pulmanki   69°58′N, 27°58′E   N Finland/Barents Sea   Freshwater (lake)   Low          0.67  6.1  0.0
L. Kevo       69°45′N, 27°00′E   N Finland/Barents Sea   Freshwater (lake)   Full         0.55  5.2  0.026
R. Neretva    43°06′N, 17°43′E   W Bosnia/Adriatic Sea   Freshwater (river)  Low          0.58  5.7  0.026

glaciation c. 10 000 years ago (Mäkinen et al. 2006). The R.
Neretva population belongs to a separate lineage, which
diverged from the marine ancestors already during the
late Pleistocene (Mäkinen et al. 2006; Mäkinen & Merilä
2008). The plate morphs were determined from alizarin-red
stained fish according to the number of lateral and keel
plates per side: full plated fish had 30–32 plates, partially
plated fish 6–7 anterior plates and 3–5 keel plates and the low
plated fish only anterior plates (6–7). The study populations
were virtually monomorphic with respect to the plate
counts. Adult fish were collected during 2003–04 with
seine nets or minnow traps. The fish were killed with an
overdose of MS-222 and stored in 95% ethanol for subsequent molecular and plate count analyses. In total, 24
individuals from each population were screened for 103
microsatellite loci and two indel markers.

Microsatellite selection and EST-library screen for microsatellite motifs
Fifty-seven microsatellite loci were selected from the stickleback linkage map (Largiader et al. 1999; Peichel et al. 2001).


Fig. 2 A physical map of 21 three-spined stickleback chromosomes showing the genomic positions of the microsatellite loci used in this
study. The map was drawn from the genome sequence available at http://www.ensembl.org/Gasterosteus_aculeatus/index.html.
The positions of the QTLs from published studies are shown along the chromosomes (Peichel et al. 2001; Colosimo et al. 2004, 2005; Shapiro
et al. 2004; Kimmel et al. 2005). Note that the position of the markers and QTLs may have changed in comparison to the original linkage map
(Peichel et al. 2001).

Two indel markers (Stn380 and Stn381) located in the
intronic regions of the Eda-gene were also genotyped
(Colosimo et al. 2005). Fourteen loci were targeted to the
genomic regions linked to QTLs affecting various morphological traits in experimental crosses (Fig. 2, Appendix 3). In
addition, 46 novel microsatellites were developed from the
published stickleback expressed sequence tag library for
the purposes of this study (Kingsley et al. 2004; Appendix
2). In order to identify ESTs containing di- or trinucleotide
microsatellite motifs, 103 582 ESTs (GenBank accession
nos CD492713–510291 and DN652663–738667) were
analysed using the troll-module (Martins et al. 2006)
implemented in the subprogram pregap4 in the staden
package (Staden et al. 2000). The minimum number of
repeats was set to 10 for dinucleotide repeats and eight for
trinucleotide repeats in the search for microsatellite motifs.
This resulted in 1662 ESTs containing a microsatellite
repeat. To check for the redundancy of the database, the
ESTs were assembled in the gap4 subprogram in the
staden package using a 95% similarity criterion. In total, 94
contigs (2–7 reads per contig) and 1568 singletons were
identified. Eighty-seven new primer pairs were developed
in a modified version of the program primer3 (http://
wsmartins.net/primerdesign) from contigs and 46 amplified
a specific polymerase chain reaction (PCR) product and
were chosen for genotyping. The primer sequences of the
new EST-associated microsatellite loci are listed in the
Appendix 2. The chromosomal positions of the microsatellite loci in the three-spined stickleback genome were
mapped with blast searches against the whole genome sequence available at http://www.ensembl.org/Gasterosteus_aculeatus/index.html (Fig. 2) (Hubbard et al. 2007). Due to
gaps in the latest version of the genome sequence assembly,
the locations of markers Stn180, 198, 205, GAest16, 17, 52 and
60 could not be identified. The previously published microsatellite markers (i.e. named as Stn and Pbbe loci; Largiader
et al. 1999; Peichel et al. 2001) were located in relation to
ESTs. This analysis revealed that 36 loci were found within
ESTs, 16 closely linked to ESTs (within a 50 kb window)
and seven loci were located in noncoding DNA. To look for
putative homologies of the candidate genes, a protein–protein
blast search was conducted on NCBI using the
protein sequence from Ensembl transcript predictions.
The biological processes of the putative homologies were
classified according to the gene ontology (GO) categories
(Harris et al. 2006).
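
The repeat-count thresholds used in the EST screen above (at least 10 repeats for dinucleotide motifs and eight for trinucleotide motifs) can be illustrated with a short regular-expression sketch. This is not the troll algorithm used in the study; the function name `find_microsatellites` and the restriction to perfect, uninterrupted repeats are assumptions made purely for illustration.

```python
import re

# Minimum repeat counts taken from the text: 10 for di-, 8 for trinucleotide motifs.
MIN_REPEATS = {2: 10, 3: 8}


def find_microsatellites(sequence):
    """Return sorted (start, motif, repeat_count) tuples for perfect repeats.

    Hypothetical helper: only uninterrupted di- and trinucleotide repeats
    are reported, which is a simplification of a real motif search.
    """
    sequence = sequence.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One repeat unit followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAA...
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    example_est = "TTG" + "CA" * 12 + "GGT"  # made-up sequence, not a real EST
    print(find_microsatellites(example_est))
```

A screen of this kind only flags candidate ESTs; redundancy checks and primer design, as described in the text, would still follow.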

DNA extraction and PCR amplification


DNA was extracted from pectoral fins using a standard
proteinase K digestion, followed by a silica fine-based
purification of nucleic acids in 96-well filter plates
(Elphinstone et al. 2003). All PCRs were carried out in
similar conditions and with the same cycling profile using
a commercial multiplex PCR kit (Qiagen). PCR consisted of
2 pmol of each primer, 1× Qiagen multiplex PCR master
mix, 0.5× Q-solution and approximately 20 ng of template
DNA in a total volume of 10 µL. The PCR cycling started
with an activation step of 15 min at 95 °C, followed by
30 cycles of 30 s at 94 °C, 90 s at 53 °C and 60 s at 72 °C and a
final extension at 72 °C for 10 min. The forward primers
were fluorescently labelled with FAM, HEX or TET dyes
to visualize the PCR products. The reverse primers were
modified from their 5′-end with a GTTT-tail to enhance the
activity of the 3′-adenylation (Brownstein et al. 1996).
Special care was taken that microsatellite loci labelled with
the same fluorescent dye in the same PCR had nonoverlapping size ranges. The PCRs were diluted 1:50 with
MQ-water and mixed with ET-ROX 550 size standard
according to the manufacturer's instructions (Amersham
Biosciences) and were resolved in a Megabace 1000 capillary
sequencer (Amersham Biosciences). Genotypes were scored
with the program fragment profiler 1.2 (Amersham
Biosciences) and were manually edited by HSM.

Data analysis
Basic population genetic parameters, such as expected heterozygosity, allelic richness and deviations from the Hardy–Weinberg
equilibrium, were calculated as implemented in
fstat 2.9.3.2 (Goudet 2001). Population differentiation
was calculated using the $\theta$ estimator (Weir & Cockerham
1984) and the 95% confidence intervals were determined
by 1000 permutations. To test whether stepwise mutations
had contributed to the population differentiation in addition
to random genetic drift, an allele permutation test was
applied as implemented in the program spagedi (Hardy
& Vekemans 2002; Hardy et al. 2003). Analysis of molecular
variance was used to partition the genetic variation into
between-group, among-population and within-population
components and was calculated with arlequin 3.1
(Schneider et al. 2000).
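
As a minimal sketch of one of the basic parameters listed above, expected heterozygosity for a single locus in a single population can be computed from allele frequencies with a small-sample correction. The function name and the genotype representation are assumptions for illustration, and the estimator shown (the 2N/(2N − 1) correction usually attributed to Nei) is not necessarily the exact estimator implemented in fstat.

```python
from collections import Counter


def expected_heterozygosity(genotypes):
    """Unbiased expected heterozygosity for one locus in one population.

    genotypes: list of (allele_a, allele_b) tuples for diploid individuals,
    e.g. microsatellite fragment sizes in base pairs (hypothetical input format).
    """
    alleles = [allele for pair in genotypes for allele in pair]
    n_copies = len(alleles)  # 2N gene copies
    if n_copies < 2:
        raise ValueError("need at least one diploid genotype")
    freqs = [count / n_copies for count in Counter(alleles).values()]
    homozygosity = sum(p * p for p in freqs)
    # Small-sample correction 2N / (2N - 1).
    return n_copies / (n_copies - 1) * (1.0 - homozygosity)


if __name__ == "__main__":
    # Four made-up genotypes with two equally frequent alleles.
    sample = [(100, 100), (100, 102), (102, 102), (100, 102)]
    print(round(expected_heterozygosity(sample), 4))
```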

Detection of loci under selection


Roughly speaking, two types of methods are available to
identify targets of natural selection from allele frequency
data (Storz 2005). The first type of test is based on the
prediction that the levels of genetic differentiation (as
measured by FST) are unusually low or high (outliers) in the
genetic markers linked to the actual selected locus when
the strength or direction of selection varies between compared populations (Lewontin & Krakauer 1973; Beaumont
& Nichols 1996). The second type of test relies on comparison
of the relative levels of genetic diversity between populations
(Kauer et al. 2003). Genetic markers linked to loci that
are subject to directional selection should show a decrease
in heterozygosity relative to neutral expectations. Here, methods
of both types were used, based on the degree of allele frequency
differentiation and on genetic diversity, with particular
emphasis on outlier loci that are supported
by several different methods. Rather than applying the
conservative Bonferroni correction, the false discovery rate
(FDR) was used to assess the statistical significance of the
outlier loci (Storey & Tibshirani 2003). Furthermore, the
detected outliers should be considered as candidates for a
more detailed genomic analysis.
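
To illustrate why a false discovery rate criterion is less conservative than a Bonferroni correction, the sketch below applies the Benjamini–Hochberg step-up procedure to a list of P-values. Note that this is a simpler stand-in and not the q-value method of Storey & Tibshirani (2003) used in the study; the function name and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values, level=0.05):
    """Return a list of booleans: True where a test passes BH at `level`.

    Stand-in illustration of FDR control, not the Storey & Tibshirani
    q-value procedure referred to in the text.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * level.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * level:
            cutoff_rank = rank
    passed = [False] * m
    for idx in order[:cutoff_rank]:
        passed[idx] = True
    return passed


if __name__ == "__main__":
    made_up_p = [0.01, 0.02, 0.03, 0.04, 0.05]
    # Bonferroni would require p <= 0.05 / 5 = 0.01 for every test.
    print(benjamini_hochberg(made_up_p))
```

Because the threshold grows with the rank of the P-value, loci with moderately small P-values can be retained as candidates, which matches the exploratory aim of flagging outliers for follow-up analysis.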
The detection of microsatellite loci showing signals of
both balancing and directional selection was carried out by
a Bayesian method as implemented in the program bayesfst
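(Beaumont & Balding 2004). In this method, FST is
modelled as $\log\left(F_{ij}/(1 - F_{ij})\right) = \alpha_i + \beta_j + \gamma_{ij}$, where $F_{ij}$ is the FST of locus $i$ in population $j$, $\alpha_i$ is a locus
effect, $\beta_j$ is a population effect and $\gamma_{ij}$ is a locus-by-population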
effect. Each parameter in the model is estimated with a
MCMC-simulation and the resulting posterior distributions
are used to identify loci under selection. In practice, the
interpretation relies mainly on the locus effects ($\alpha_i$): positive values indicate directional selection and negative
values balancing selection (Beaumont & Balding 2004). A
critical P-value (5% level) for the locus effect was adjusted
according to the guidelines in Beaumont & Balding (2004).
A locus was interpreted as being under directional selection if the 2.5% quantile of its locus effect was positive and under
balancing selection if the 97.5% quantile was negative. The
major advantage of this method is that, unlike the frequentist
method implemented in the fdist program (Beaumont &
Nichols 1996; Beaumont & Balding 2004), it does not assume
equal FST among the study populations. This is highly
relevant in the case of the three-spined sticklebacks since
previous studies have shown a contrasting pattern of
genetic differentiation among the marine and freshwater
populations (e.g. Reusch et al. 2001; Mäkinen et al. 2006). In
addition, simulation studies have shown that the Bayesian
method performs slightly better in detecting loci under
balancing selection than the frequentist method. Finally,
simulation studies indicate that the results of both methods
are similar and combining the results will lead to an

increase of false positives rather than detecting new candidate loci under selection (Beaumont & Balding 2004).
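
The logistic model and the quantile-based decision rule described above can be sketched as follows. The code does not reproduce bayesfst or its MCMC sampler; it only shows how posterior draws of a locus effect would be turned into a classification. The function names, the interpolation scheme for quantiles and the use of unadjusted 2.5% and 97.5% quantiles are assumptions for illustration.

```python
import math


def fst_from_effects(locus_effect, population_effect, interaction=0.0):
    """Invert log(F / (1 - F)) = locus + population + interaction to get F."""
    total = locus_effect + population_effect + interaction
    return 1.0 / (1.0 + math.exp(-total))


def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list (0 <= q <= 1)."""
    pos = q * (len(sorted_values) - 1)
    lower = int(math.floor(pos))
    upper = min(lower + 1, len(sorted_values) - 1)
    span = sorted_values[upper] - sorted_values[lower]
    return sorted_values[lower] + (pos - lower) * span


def classify_locus(locus_effect_draws):
    """Apply the decision rule from the text to posterior draws of a locus effect."""
    draws = sorted(locus_effect_draws)
    if quantile(draws, 0.025) > 0:
        return "directional"  # interval lies entirely above zero
    if quantile(draws, 0.975) < 0:
        return "balancing"  # interval lies entirely below zero
    return "neutral"


if __name__ == "__main__":
    made_up_draws = [0.5 + 0.01 * i for i in range(100)]  # not real MCMC output
    print(classify_locus(made_up_draws), fst_from_effects(0.0, 0.0))
```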
The Bayesian FST test was carried out across all study
populations and between populations within habitat types
(marine or freshwater; Table 1). Also, comparisons between
plate morphs (fully vs. low/partially plated; Table 1) were
conducted. Two thousand draws from the posterior probability distribution were used to summarize the parameter
estimates for the locus effects ($\alpha_i$); these summaries were calculated in
R (http://www.r.project.org/) using the functions
provided with the distribution package of bayesfst (http://www.reading.ac.uk/Statistics/genetics/software.html).
Two independent runs starting from different parameter
values were performed to check whether the MCMC simulation converged to similar parameter estimates.
To further investigate the outlier loci detected by the
Bayesian FST test an analysis of molecular variance (amova)
was carried out in arlequin 3.1 (Schneider et al. 2000). Here,
a simple working hypothesis was assumed: if the loci are
affected by directional selection, their among-population
variance component should exceed the variance observed
in neutral loci. In a similar way, if a locus is affected by
balancing selection then the variance among populations
should be lower than in the neutral loci. The analysis was
conducted separately for the putatively neutral, directional
and balancing selection loci classified a posteriori according
to the Bayesian FST test. The populations were arranged
in freshwater and marine groups to analyse the effect of
habitat type on the partition of the total variance.
The Ln RH test was used in the pairwise comparisons to
detect signatures of directional selection (Kauer et al. 2003).
In this analysis the primary focus was to get additional
evidence for directional selection for the loci identified in
the Bayesian FST test. The Ln RH test is based on the
assumption that directional selection decreases the genetic
diversity ($\theta = 4N_e\mu$) around the selected site in comparison
to the genome-wide effects of the population structure and
history (Kauer et al. 2003). Standardized (mean = 0, SD = 1)
Ln RH estimates are expected to be normally distributed
under neutrality and 95% of the values are expected to fall
between −1.96 and 1.96 (Kauer et al. 2003). Thus, loci with
Ln RH estimates outside these boundaries were considered
significant at the 5% level. The power of this test is assumed
to be higher if the compared populations differ in genetic
diversity levels, i.e. in cases where only one of the populations
is experiencing directional selection (Storz 2005). All possible
pairwise comparisons were carried out within and between habitat types (i.e. marine and freshwater). In addition,
comparisons between low and full plated populations
were conducted. In order to estimate the Ln RH statistics,
the genetic diversity ($\theta$) was calculated as implemented in
the constrained gene diversity option in the program
microsatellite analyser (Dieringer & Schlötterer 2003).
Another test statistic, Ln RV, which is based on the relative
reduction in the variance of microsatellite repeat number, might
decrease the number of false positives when used in
conjunction with Ln RH (Schlötterer & Dieringer 2005). On
the other hand, Ln RV is sensitive to insertions/deletions in
the flanking regions of a microsatellite (Schöfl & Schlötterer
2004). Since 11 of our loci showed one bp mutations indicative of insertions/deletions, only Ln RH was used in this study.
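
The Ln RH logic described above can be sketched for a single pairwise comparison. The sketch assumes that $\theta$ is obtained from gene diversity H under a stepwise mutation model as $\theta \propto (1/(1 - H))^2 - 1$, which is the form recalled from Kauer et al. (2003) and should be checked against that source; the function names are hypothetical and the handling of monomorphic loci (H = 0) is deliberately left out.

```python
import math
from statistics import mean, stdev


def ln_rh(h_pop1, h_pop2):
    """Ln RH for one locus from gene diversities of two populations.

    Assumes 0 < H < 1 in both populations; monomorphic loci would need
    special treatment that is not shown here.
    """
    theta1 = (1.0 / (1.0 - h_pop1)) ** 2 - 1.0
    theta2 = (1.0 / (1.0 - h_pop2)) ** 2 - 1.0
    return math.log(theta1 / theta2)


def standardized_outliers(values, threshold=1.96):
    """Standardize Ln RH across loci (mean 0, SD 1) and flag |z| > threshold."""
    mu, sd = mean(values), stdev(values)
    z_scores = [(v - mu) / sd for v in values]
    return [(z, abs(z) > threshold) for z in z_scores]


if __name__ == "__main__":
    # Made-up gene diversities for five loci in two populations.
    pop1 = [0.70, 0.72, 0.68, 0.71, 0.20]
    pop2 = [0.69, 0.70, 0.70, 0.72, 0.75]
    scores = [ln_rh(a, b) for a, b in zip(pop1, pop2)]
    for z, flagged in standardized_outliers(scores):
        print(round(z, 2), flagged)
```

A strongly negative standardized value indicates reduced diversity in the first population relative to the second, which is the pattern expected near a locus under directional selection in that population.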

Results
Basic population genetic structure
The level of polymorphism varied considerably among the
105 studied loci. The expected heterozygosities ranged
from 0.08 (GAest32) to 0.98 (GAest30) and the number of
alleles from two (Stn381) to 78 (GAest30). The average
expected heterozygosity was 0.79 and the number of alleles
20.3. Four loci and two populations (Merirastila and L.
Vättern) deviated from the Hardy–Weinberg expectations
showing a slight excess of homozygosity, but this effect
disappeared after correcting for multiple tests at an $\alpha$-level
of 0.05 (Table 1). A comparison between habitat types
revealed similar average allelic richness in marine (7.2) and
in freshwater populations (6.3) (two-tailed permutation
test, P = 0.44). Likewise, expected heterozygosity was similar
in marine (0.72) and in freshwater populations (0.66,
P = 0.23). Genetic differentiation (FST) among marine
populations was markedly lower (0.06) than among freshwater
populations (0.23), but the difference was again not significant (P = 0.13). The
amova analysis indicated a negligible effect of habitat type
(0.88%, 1000 permutations; P = 0.16), whereas most of
the variation was explained by the among-population
component (16.07%, P < 0.001) and the within-population
component (83.05%, P < 0.0001; Table 3). The estimate of
the FST according to Weir & Cockerham (1984) across all
populations indicated moderate genetic differentiation
(FST = 0.167, 95% CI: 0.148–0.189). There was considerable
heterogeneity in the locus-specific FSTs ranging from 0.037
(GAest32) to 0.859 (Stn381).

Detection of microsatellites under selection


The Bayesian FST method identified five loci showing
signals of directional selection (2.5% quantile) and 15 loci
under balancing selection (97.5% quantile) in the global
analysis (Table 4, Fig. 3a) at the 5% significance level. The
strongest signal of directional selection emerged from the
loci Stn365, Stn380 and Stn381, which were tightly linked
with the Eda-gene coding for the lateral plate number
(Colosimo et al. 2005). The other two directional selection
loci (Stn12 and Stn90) were not linked to the previously
known QTLs of morphological traits and could be considered as linked to the candidate genes of other ecologically
important traits (Table 4, Fig. 3a). Almost the same markers

Fig. 3 (a–e) Summary of the Bayesian FST method results depicting the comparisons between different combinations of populations. The
locus effects ($\alpha_i$) are plotted against their P-values. The solid line indicates the critical P-value for the directional selection and the dashed
line indicates the corresponding cut-off for the balancing selection loci. The P-values (x-axis) were transformed as logit(2|P − 0.5|) for a
better visualization.

were detected as being potentially affected by directional
selection in a separate analysis of freshwater populations,
but the signature of selection in the loci Stn12 and Stn90
was stronger (Table 4, Fig. 3b). Two additional markers
(GAest84 and GAest87) showed a footprint of directional
selection in the analysis between the freshwater populations (Table 4, Fig. 3b). The effect of the population structure
on the number of outlier loci was further investigated by
pooling the weakly differentiated marine populations.
However, this did not affect the results; the number and
identities of the outlier loci remained unchanged. Applying
an FDR level of 0.05 to the global and freshwater comparisons
indicated that Stn12 was not a statistically supported
outlier in either of the comparisons. Likewise, GAest84 and
GAest87 were identified as false positives in the freshwater comparison.

There was a considerable overlap in the number and identities of loci under balancing selection in the global and
freshwater comparisons. Fifteen loci were detected as balancing selection outliers in the global analysis, and 16 loci in
the analysis including only the freshwater populations.
The number of false positives was higher in the freshwater
comparison (six loci) as compared to the global comparison
(three loci, Table 4). Two of the loci were not the same as in
the global comparison (GAest36 and GAest57) and three
additional loci (Stn163, Stn208 and GAest42) were
unique to the freshwater populations (Table 4, Fig. 3a, b).
No evidence for directional selection was found within the
marine populations and only two loci (GAest30 and
GAest60) appeared to be affected by balancing selection
(Fig. 3c). Therefore, most of the signatures of selection were

Table 2 The mean degree of population differentiation (FST),
expected heterozygosity (HE) and allelic richness (AR) of putatively
neutral, directional and balancing selection loci as identified in
the Bayesian FST method in the global comparison. Numbers in
parentheses refer to the number of loci in each category

      Neutral (85)  Directional (5)  Balancing (15)
FST   0.11          0.40             0.04
HE    0.77          0.63             0.94
AR    6.3           2.8              11.7

found within the freshwater populations. In addition, the
absence of the signal of selection in loci Stn365, Stn380 and
Stn381 when populations with the same plate morph (i.e.
full and low/partial plated) were analysed showed that the
signals of selection on these loci are strongly associated
with the plate morph, as expected (Fig. 3d, e).
The levels of average genetic differentiation (FST), expected
heterozygosity (HE) and allelic richness (AR) at loci classified as neutral or under balancing and directional selection
were according to the theoretical expectations (Table 2).
Lower expected heterozygosity and allelic richness were
apparent for the directionally selected loci, whereas the
same estimates were considerably higher for the loci under
balancing selection. Likewise, the degree of genetic differentiation was higher for the directionally selected loci and
lower for the balancing selection loci as compared to the
neutral loci (Table 2).
The amova results are summarized in Table 3. The
amova analysis revealed clear differences among habitats
for the directionally selected loci: the among-group variance explained at these loci
(26.9%) exceeded that at the neutral loci or loci under balancing
selection (0.0–0.88%). Similarly, the among-population
component of variance was higher for the directionally
selected loci (37.5%) than for the neutral loci (16.6%), but
was lowest for the loci under balancing selection (6.7%).
The within-population component explained 93.4% and
83.8% of the total variance for the balancing and neutral
loci, respectively. However, for the directionally selected
loci the within-population component explained only
35.6% of the total variation. In the locus-by-locus analysis
of the loci in the intronic regions of the Eda-gene, the
among-group component explained 74.9–90.4% of the total
variation when populations were grouped according to
the plate morph. The other two directionally selected loci
(Stn12 and Stn90) accounted for a negligible proportion
(~ 0.0) of variance in the plate morph comparison suggesting
that the functional roles of these loci are not associated with
the number of lateral plates. However, habitat type explained
14.3% of the total variance of the locus Stn90 indicating that
this locus might have been relevant in adaptive divergence
of the freshwater populations from their marine ancestors.
The results of the Ln RH tests (Table 5) were not fully
congruent with the Bayesian FST test. In pairwise comparisons of the full plated vs. low plated populations, the Ln RH
statistic did not consistently detect directional selection in
the loci Stn365, Stn380 and Stn381, although the signal of
selection was evident in the Bayesian FST test. For example,
the patterns of diversity in the locus Stn365 deviated only
twice from the neutral expectations out of 12 possible low
and full plated comparisons (Table 5). However, the locus
Stn380 showed a signal of selection in five out of 12 comparisons between full and low/partial plated morphs.
Locus Stn381 deviated from neutral expectations only in a
comparison between Merirastila and L. Vättern (Table 5).
The signal of selection in locus Stn12 varied geographically,
since deviations from the neutral expectations emerged
from the populations L. Vättern in southern Sweden and
L. Pulmanki in Finnish Lapland. In locus Stn90, the signal
of selection emerged only when L. Pulmanki was involved
in the comparisons. The Ln RH test did not support the
outlier status of locus GAest87 in any of the comparisons
and GAest84 was an outlier only in the comparison between
Orrevatnet and L. Vättern (Table 5).

Discussion
This study revealed novel regions in the three-spined
stickleback genome that may have been affected by natural
selection. The majority of the selective footprints were
caused by the patterns of allelic divergence among the
freshwater populations, which is consistent with existing
knowledge about phenotypic adaptive divergence observed

Table 3 Analysis of molecular variance (AMOVA) showing the partition of microsatellite allele frequencies into the among-group (habitat type),
among-populations within groups and within-population variance components for the putatively neutral, directional and balancing
selection loci identified from the Bayesian FST test. The number of loci is given in parentheses

Loci              Among-groups (%)   FCT      Among-populations (%)   FSC       Within-populations (%)   FST
Directional (5)   26.9               0.27*    37.5                    0.51***   35.6                     0.64***
Balancing (15)    0.0                0.0      6.7                     0.07***   93.4                     0.07***
Neutral (85)      0.0                0.0      16.6                    0.17***   83.8                     0.16***
All loci (105)    0.88               0.009    16.1                    0.16***   83.1                     0.17***

*P < 0.05, ***P < 0.001.


2008 The Authors
Journal compilation 2008 Blackwell Publishing Ltd


Table 4 Summary of the outlier loci detected by the Bayesian FST method in the global and freshwater comparisons. The putative homologies of the outlier loci and the biological processes
according to the GO categories are also given. The P-values have to be interpreted as the quantiles of the posterior distribution of the locus effect (Beaumont & Balding 2004). In other words, P is
significantly positive if its 2.5% quantile is positive, and is significantly negative if its 97.5% quantile is negative. The q-values are given in parentheses. A hyphen marks a comparison in which the locus is not listed as an outlier

         Global                     Freshwater
Locus    Bayes FST  P (q-value)     Bayes FST  P (q-value)     Biological process                         Protein homology

Directional
Stn12    0.19       0.016 (0.056)   0.32       0.007 (0.032)   Unknown                                    M6Ba (Danio rerio, 77%)
Stn90    0.28       0.002 (0.011)   0.40       0.0005 (0.004)  Regulation of transcription                DRIL1 (Xenopus laevis, 67%)
Stn365   0.42       0.0003 (0.002)  0.37       0.014 (0.043)   Lateral plate number                       Ectodysplasin
Stn380   0.52       0.0003 (0.002)  0.42       0.0015 (0.009)  Lateral plate number                       Ectodysplasin
Stn381   0.56       0.0003 (0.002)  0.46       0.0025 (0.015)  Lateral plate number                       Ectodysplasin
GAest84  -          -               0.28       0.02 (0.057)    Unknown                                    MGC84485 protein (Xenopus laevis, 49%)
GAest87  -          -               0.28       0.012 (0.057)   Unknown                                    Hypothetical protein LOC431717 (Danio rerio, 74%)

Balancing
Stn122   0.050      0.996 (0.981)   0.077      0.978 (0.942)   Cell-cell signalling                       EFNA3 (Danio rerio, 85%)
Stn19    0.054      0.995 (0.977)   0.071      0.993 (0.967)   Multicellular organismal development       Plexin (Xenopus laevis, 79%)
Stn34    0.037      0.999 (0.998)   0.060      0.999 (0.993)   Protein binding                            Ubiquitin-conjugating enzyme variant 2 (Danio rerio, 87%)
Stn57    0.046      0.997 (0.985)   0.065      0.992 (0.953)   Vesicle mediated transport                 SYTL1 (Gallus gallus, 55%)
Stn67    0.044      0.994 (0.975)   0.069      0.987 (0.965)   Unknown                                    Delta isoform of regulatory subunit (Danio rerio, 88%)
Stn59    0.034      0.999 (0.998)   0.051      0.999 (0.996)   Unknown                                    Unknown
Stn81    0.038      0.999 (0.996)   0.046      0.999 (0.996)   Unknown                                    Zinc finger protein 534 (Xenopus tropicalis, 56%)
GAest8   0.042      0.998 (0.991)   0.050      0.999 (0.996)   Unknown                                    Cerebellin 2 (Bos taurus, 87%)
GAest30  0.024      0.999 (0.998)   0.038      0.999 (0.996)   Unknown                                    Adaptor related protein complex (Danio rerio, 56%)
GAest36  0.055      0.984 (0.944)   -          -               Unknown                                    Glycosyltransferase-like 1B (Danio rerio, 92%)
GAest52  0.059      0.982 (0.941)   0.055      0.999 (0.993)   Brain segmentation, hindbrain development  Radical fringe (Danio rerio, 71%)
GAest57  0.057      0.982 (0.941)   -          -               Unknown                                    Mgc 104331 protein (Danio rerio, 67%)
GAest60  0.037      0.998 (0.991)   0.079      0.986 (0.957)   Unknown                                    Arrestin red cell isoform 3 (Oncorhynchus mykiss, 89%)
GAest63  0.037      0.999 (0.998)   0.047      0.999 (0.996)   UDP-N-acetylgalactosamine metabolism       Similar to UDP-GalNAc (Danio rerio, 92%)
GAest74  0.038      0.999 (0.998)   0.052      0.999 (0.996)   Unknown                                    MGC83916 protein (Xenopus laevis, 71%)
Stn163   -          -               0.072      0.987 (0.957)   Unknown                                    Smad4 type 1 (Cyprinus carpio, 79%)
Stn208   -          -               0.078      0.978 (0.943)   Unknown                                    Unknown
GAest42  -          -               0.070      0.988           Carboxylic acid metabolism                 Multiple coagulation factor deficiency 2 (Gallus gallus, 88%)

Table 5 Pairwise Ln RH comparisons indicating significant loci and the Ln RH estimates in parentheses
Merirastila   Orrevatnet   Barents   L. Pulmanki   L. Kevo   L. Vättern   R. Neretva

Merirastila
Orrevatnet
Barents
L. Pulmanki

NA

Stn12 (2.01)
Stn90 (2.54)

NA
Stn12 (2.16)
Stn90 (2.0)

NA

L. Kevo

NA

Stn12 (2.46)
Stn365 (2.15)
Stn90 (2.33)
GAest84 (2.37)

NA

L. Vättern

Stn380 (2.40)
Stn12 (1.98)
Stn380 (2.58)

GAest84 (2.02)

Stn380 (2.27)

Stn12 (2.05)
Stn90 (2.36)
Stn365 (2.36)
Stn90 (2.73)

NA

Stn380 (2.57)

Stn380 (2.82)

Stn380 (2.73)

NA

R. Neretva

in the three-spined sticklebacks. Most of the phenotypic
differentiation has been documented among the freshwater populations, whereas the marine populations have
remained relatively uniform (Bell & Foster 1994; McKinnon
& Rundle 2002).
The number of microsatellite loci identified as outliers
was comparable to previous studies, which have reported
1.4–9.5% of the loci underlying adaptive divergence in
various taxa (Campbell & Bernatchez 2004; Vasemägi et al.
2005; Bonin et al. 2006; Kane & Rieseberg 2007; reviewed in
Stinchcombe & Hoekstra 2008). It is worth noting that due
to different cut-off levels used, the studies and figures are
not directly comparable. Only 2.8% (3/105; Stn365, 380 and
381 are considered effectively as one locus due to tight linkage) of the analysed loci showed footprints of directional
selection in the global comparison and a slightly higher
number (4.8%, 5/105) in the comparison of freshwater
populations only. The number of loci indicating balancing
selection was considerably higher in both the global comparison (14.7%, 15/105) and in the comparison of freshwater
populations only (15.2%, 16/105). In a survey of sunflower
(Helianthus annuus) populations from normal and stressful
environments, 1.5–6% of the microsatellite loci were
detected as being affected by directional selection (Kane
& Rieseberg 2007). An amplified fragment length polymorphism scan in sympatric whitefish (Coregonus clupeaformis)
ecotypes revealed 1.4–3.2% of loci underlying adaptive
divergence (Campbell & Bernatchez 2004). Vasemägi et al.
(2005) found a higher number of outlier loci (9.5%) in salt,
brackish and freshwater Scandinavian Atlantic salmon
(Salmo salar) populations. Unfortunately, the above studies
have not reported the number of loci underlying balancing
selection making the comparisons difficult. The low number
of loci detected as outliers in the genome scan studies is
not surprising if one considers the total number of genes
contained in an organism's genome. For example, the
number of Genscan gene predictions for the three-spined
stickleback is currently 44884 (http://www.ensembl.org/Gasterosteus_aculeatus/index.html), indicating that the
coverage of the estimated gene number in this study was
roughly 0.23% (105/44884). Recent simulation studies have shown
that in order to detect strong artificial selection in dog
breeds one would require a spacing of one highly polymorphic marker per 0.8 centimorgans (cM) (Pollinger et al.
2005). Assuming that one cM corresponds to one Mb (as in
humans) then c. 560 markers (446.6 Mb/0.8 Mb) would be
needed to fully cover the three-spined stickleback genome.
The relatively low number of loci detected as underlying
adaptive divergence in various organisms has convinced
some authors that the genome scan approach would not be
a powerful method for identifying genes involved in adaptive evolution (Eyre-Walker 2006). However, the number of
markers used in a typical genome scan study (100–200) still
covers a very low proportion of the genome, which might
partly explain the low number of loci detected as outliers.
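
The coverage figures in this paragraph can be checked with a few lines of arithmetic. The sketch below only restates numbers given in the text (105 loci, 44884 Genscan predictions, a 446.6 Mb genome and one marker per 0.8 cM with 1 cM taken as 1 Mb); the variable names are illustrative. Expressed as a percentage, 105 markers against 44884 predicted genes comes to about 0.23%, i.e. a proportion of about 0.0023.

```python
N_MARKERS = 105      # loci genotyped in this study
N_GENES = 44884      # Genscan gene predictions cited in the text
GENOME_MB = 446.6    # genome size used in the text
SPACING_MB = 0.8     # one marker per 0.8 cM, assuming 1 cM = 1 Mb

gene_coverage_pct = 100 * N_MARKERS / N_GENES
markers_needed = GENOME_MB / SPACING_MB

print(f"gene coverage: {gene_coverage_pct:.2f}%")
print(f"markers needed: {markers_needed:.0f}")
print(f"shortfall factor: {markers_needed / N_MARKERS:.1f}x")
```

Under these assumptions the marker set used here is roughly five times sparser than the density suggested for detecting strong selection.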

Higher number of loci shows footprints of balancing selection
Our results are in agreement with the theoretical expectation
of a predominant role of balancing over directional selection.
Balancing selection tends to maintain phenotypes close to
the population mean and remove extreme phenotypes,
which is commonly observed in the wild (e.g. Kimura
1981). Empirical examples at the molecular level of the
prevalence of balancing over directional selection are
scarce. For example, in an exhaustive search for traces of
natural selection in the human genome, the empirical
distribution of FSTs of 26 530 single nucleotide polymorphisms (SNPs) in three human populations, showed that
11% of SNPs had FST ~ 0.0 and 6% had FST 0.4 (Akey et al.
2002). The authors pointed out that, due to methodological
limitations in their approach, the difference in the relative
numbers did not reflect the underlying evolutionary forces
(Akey et al. 2002). In contrast to our findings, Storz &
Nachman (2003) found a lower proportion of balancing
than directional selection signals in allozyme data sets in
the rodent genus Peromyscus. The contention that balancing
selection is more common than directional selection in
three-spined sticklebacks should be considered as preliminary given the low genomic coverage of this study, as
well as the methodological limitations discussed below. A
more dense genomic coverage and replicate populations
would be needed to confirm the generality of this pattern.
It has also been suggested that balancing selection is
responsible for maintaining polymorphism at selected loci
(Charlesworth 2006). Some recent studies, however, have
reported a limited role of balancing selection in preserving
trans-species level polymorphisms (Asthana et al. 2005).
In humans, Bubb et al. (2006) found 16 high-diversity SNP
regions, but their distribution fell within neutral expectations. A classical example of balancing selection maintaining
high genetic diversity is the major histocompatibility
(MHC) loci, which play a key role in vertebrate immune
defence (e.g. Garrigan & Hedrick 2003). The high number
of alleles is expected to be maintained due to overdominance (heterozygote advantage) or frequency dependent
selection (rare allele advantage). An attempt to demonstrate balancing selection operating in MHC-linked
microsatellites failed in wild sheep (Ovis dalli), probably due
to recombination breaking down the linkage between the
markers and causative polymorphisms (Worley et al. 2006).
In this study, the biological and molecular function of the
majority of the genomic regions underlying balancing
selection remains unclear since no putative homologies to
known genes were found.
Detecting balancing selection with the Bayesian FST
method is not without limitations. First of all, it has a low
power to detect balancing selection from simulated data
sets, but, on the other hand, the false discovery rate is
extremely low (0.01%, Beaumont & Balding 2004). This
may indicate that when balancing selection loci are detected,
the signal is probably very strong. Secondly, FST tends to
underestimate the degree of genetic differentiation in
highly polymorphic loci (Hedrick 2005). This was also
evident in our data; there was a negative correlation between
FST and the allelic richness across different loci (results not
shown). However, the classification of loci as indicative of
balancing selection is not solely based on the allelic richness, but also on the unusually similar allele frequencies
over populations. A third explanation for balancing selection signatures can be found from the rates of mutation. If
some microsatellite loci have a higher than average mutation rate, a higher polymorphism level could be explained
without invoking balancing selection as a cause. Although
information about the mutation rates is lacking, comparing
the FST and RST estimates on the degree of population
differentiation would be informative for the impact of
stepwise mutations. However, RST values for the candidate
loci under balancing selection did not consistently exceed
FST estimates, suggesting that the signatures of balancing
selection are unlikely to be due to an extraordinarily high
mutation rate in all of these loci. From the 15 balancing
selection loci identified in the global analysis, approximately half (8/15) showed an indication of a significant
contribution of stepwise mutations to the degree of
population differentiation (Appendix 1). Thus, the evidence
for balancing selection might be weaker for the loci for
which a significant contribution of stepwise mutations was
detected. It should also be noted that several balancing
selection loci were identified as false positives.
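
The second limitation, the dependence of FST on the level of polymorphism, can be made concrete with the standardized measure of Hedrick (2005). The sketch below assumes the k-population expressions for the maximum attainable GST and for the standardized G'ST; the function names are hypothetical, and the two heterozygosity values are invented only to contrast a moderately and a highly polymorphic locus across the seven populations of this study.

```python
def gst_max(h_s, k):
    """Upper bound of GST for mean within-population heterozygosity h_s and k populations."""
    return (k - 1) * (1 - h_s) / (k - 1 + h_s)


def gst_standardized(gst, h_s, k):
    """Standardized G'ST: observed GST expressed as a fraction of its maximum."""
    return gst / gst_max(h_s, k)


K = 7  # populations sampled in this study
for h_s in (0.5, 0.9):  # hypothetical heterozygosities
    print(f"HS={h_s}: GST cannot exceed {gst_max(h_s, K):.3f}")
```

A locus with very high within-population heterozygosity is therefore constrained to a low GST even when populations share few alleles, which is why low FST alone is weak evidence for balancing selection at highly polymorphic microsatellites.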
Also, the selection of populations can and perhaps even
should influence the likelihood of detecting different
forms of selection. When the most distantly located population (R. Neretva) in our data was excluded from the
analysis, the number of balancing selection loci was reduced
from 15 to 10 indicating some degree of dependency
between the overall FST and the number of balancing
selection signals. This pattern may explain why only a few
balancing selection signals were detected among the marine
populations (FST ~ 0.05) in comparison to the freshwater
populations (FST ~ 0.17). Also, ascertainment bias might
explain the higher number of balancing selection over
directional selection signals. It has been suggested that this
kind of bias could be introduced by selecting loci originally
developed for QTL-mapping purposes (Vigouroux et al.
2002). Typically, markers with high polymorphisms are
informative in QTL-crosses and low diversity markers
are discarded. This might result in underestimation of the
significance of directional selection in genome scan studies
(Vigouroux et al. 2002). Our marker set comprised 59 loci
from QTL-mapping panels (Peichel et al. 2001; Colosimo
et al. 2005) and 46 markers developed from EST-libraries.
The occurrence of directional selection signals in the global
analysis was more common in the QTL-mapping set (3/59)
than in the EST-associated markers (0/46) indicating that
ascertainment bias probably does not explain the observed
pattern. It may be reasonable to conclude that some of
the balancing selection signals are true, but given the
limitations of the current methods in detecting balancing
selection and the possibility of underestimating genetic
differentiation with highly polymorphic loci, caution is
needed in interpreting the balancing selection signals.

Candidate loci under directional selection


Two novel candidate loci (Stn12 and Stn90) were identified
as being potentially linked to genes important for adaptive
divergence. In addition, this study demonstrates that the
three loci (Stn365, Stn380 and Stn381) located in intronic
regions of the Eda-gene have a very strong signal of
directional selection. This latter result reflects the fact that
the study populations represent different plate morphs,
and it is consistent with the patterns of sequence variation
in the Eda-gene found for the same set of populations
(Cano et al. 2006). The signal of directional selection among
the freshwater populations in the Eda-linked markers was
caused by the inclusion of an exceptional full plated L.
Kevo population in the analysis. This result contrasts with
the findings of Raeymaekers et al. (2007), who suggested that
balancing selection could explain the observed plate morph
pattern within freshwater populations in western Europe.
The results of the AMOVA analyses and Bayesian FST test
indicate that the novel putatively selected loci are not related
to plate morph divergence but a moderate proportion of
the allele frequency variation in Stn90 was related to
differences between marine and freshwater populations.
The interpretation that Stn90 may be related to adaptation
to freshwater is only tentative at this stage; denser
mapping around this locus is needed to narrow down any
potential gene and its functional role. The other candidate
locus for directional selection, Stn12, did not show any
significant relation to plate morph and showed only a weak effect of
habitat type. Even if it is linked to a selected gene, the
weak signal of Stn12 suggests that it may be quite far from
the actual target of selection. All in all, among our set of
loci, Stn90 is the more promising candidate for finding novel
genes of adaptive relevance.

Selection on microsatellite loci informative in QTL-mapping studies
Only one (the three Eda intronic markers are considered
effectively as one due to tight linkage) out of 12 (8.3%) of
the QTL-linked markers showed signals of divergent
selection. The utility of the QTL-linked markers in detecting
selection seems to be very limited if the markers are not
tightly linked to the actual genes underlying phenotypic
variation, which is also in line with previous empirical
studies (Cano et al. 2006; Raeymaekers et al. 2007). Selection
was clearly detectable in the microsatellite locus Stn365
and the indel markers Stn380 and 381 located in the
intronic regions of the Eda-gene coding for the number of
lateral plates. In contrast, microsatellite locus 4147Pbbe
showed no signal of selection although it was found to be
informative in a QTL-cross (Colosimo et al. 2004, 2005); it is
located roughly 1.5 cM away from the Eda-gene (Colosimo
et al. 2005). The decay of the footprint of selection was
evident in the decrease of average FST values. Locus
4147Pbbe had an FST of 0.09 whereas the FSTs in the intronic
loci ranged from 0.41 to 0.56. This result is also in line with
simulation studies, which suggest a decay of the signal of
selection with the increasing distance from the selected site
(Storz 2005; De Kovel 2006). Recombination between the
selected site and a marker locus tends to break down the
linkage disequilibrium as many generations pass since
the selective event. In experimental QTL-crosses, linkage
between marker loci and QTL extends over larger genetic
distances. The observed pattern suggests that, with the
current resolution in QTL maps of nonmodel organisms,
QTL-linked markers are not a good proxy to measure
adaptive divergence in natural populations.
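
The contrast between a marker 1.5 cM away and markers inside the gene can be illustrated with the textbook expectation that linkage disequilibrium decays by a factor (1 - r) per generation. The sketch below is a deliberately simplified, deterministic picture that ignores drift, ongoing selection and gene flow; the recombination fraction of the flanking marker is taken as 0.015 on the assumption that r is close to the map distance over short intervals, while the value for an intronic marker and the generation counts are hypothetical.

```python
def ld_remaining(r, generations):
    """Fraction of the initial linkage disequilibrium left after t generations: (1 - r)**t."""
    return (1.0 - r) ** generations


R_FLANKING = 0.015  # ~1.5 cM marker, assuming r ~ map distance for short intervals
R_INTRONIC = 1e-5   # hypothetical value for a marker inside the gene

for t in (10, 100, 1000):  # hypothetical numbers of generations
    print(t, f"{ld_remaining(R_FLANKING, t):.3g}", f"{ld_remaining(R_INTRONIC, t):.3g}")
```

Under this simple model the association with the flanking marker is essentially gone after a thousand generations, whereas the intragenic marker retains almost all of it, in line with the pattern observed for 4147Pbbe and the Eda intronic loci.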
An intermediate strength of selection would be detectable
only in a window of 200–400 generations of divergence of
large populations (De Kovel 2006). It is important to note
that the footprint of selection is not lost in the microsatellite
locus Stn365 and in the two indel markers at the intronic
regions of the Eda even after a substantial number of
generations of population divergence. The Scandinavian
populations were established c. 10 000 years ago (c. 5000
generations assuming a two-year generation interval). Even
in the distantly related R. Neretva population Eda-linked
hitchhiking was detectable although this population had
already diverged from its marine ancestors during the late
Pleistocene (Mäkinen et al. 2006; Mäkinen & Merilä 2008).
Furthermore, simulation studies show that empirical
genome scan studies might not detect a substantial number
of loci contributing to the adaptation due to the mode of
selection and the demographic population history (Teshima
et al. 2006), for example if selection is operating on a recessive
rather than a codominant allele. Recently, population
bottlenecks have been found to increase the number of false
positive signals of selection (e.g. Teshima et al. 2006; Wiehe
et al. 2007). However, bottleneck analysis of the same
populations indicated that a constant population size
model could not be rejected (Mäkinen et al. 2008). Finally,
and on a more general level, we note that there is still uncertainty over whether signals of selection can be
separated from the confounding effects of population
history even in very large data sets (Kelley et al. 2006).
From this perspective, the ability of the methods used in the current study
to identify the signature of selection in the Eda-gene associated loci serves as
a demonstration that the genome scan approach can work
in practice.

Performance of the neutrality tests


One problem in the identification of loci under selection is
that it typically involves a large number of statistical tests.
Here, roughly five outliers from the analysed 105 loci would
be expected by chance alone at the 5% level. However,
applying Bonferroni correction might lead to an overconservative criterion for defining statistical significance of
the potential outliers (e.g. Vasemägi et al. 2005; Narum 2006).
For instance, in our case the use of the Bonferroni-corrected significance level of
0.05 would make the directional selection candidate loci
Stn90 and Stn12 in the global analysis nonsignificant. A
common practice in genome scan studies has been to seek
confirmatory evidence from multiple analytical methods
with different assumptions (Vasemägi et al. 2005; Bonin
et al. 2006; Kane & Rieseberg 2007). Another way to reduce
the number of false positives is to genotype more marker
loci from the flanking regions of the candidates identified
in the initial analysis (Wiehe et al. 2007).
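
The multiple-testing argument above can be laid out numerically. The sketch below assumes a plain Bonferroni correction (the nominal level divided by the number of loci) and, as a simplification, treats the P-values listed for the directional candidates in the global comparison (Table 4) as if they were conventional P-values, although they are posterior quantities from the Bayesian FST method. The variable names are illustrative.

```python
ALPHA = 0.05
N_LOCI = 105

# P-values of the directional candidates in the global comparison (Table 4)
p_values = {"Stn12": 0.016, "Stn90": 0.002, "Stn365": 0.0003,
            "Stn380": 0.0003, "Stn381": 0.0003}

expected_false_positives = ALPHA * N_LOCI   # outliers expected by chance alone
bonferroni_threshold = ALPHA / N_LOCI       # per-locus level after correction

print(f"expected by chance: {expected_false_positives:.2f}")
print(f"Bonferroni threshold: {bonferroni_threshold:.5f}")
for locus, p in p_values.items():
    verdict = "significant" if p < bonferroni_threshold else "not significant"
    print(f"{locus}: P={p} -> {verdict}")
```

On this reading only the three Eda-linked markers would survive the correction, which illustrates how conservative the criterion is for a scan of this size.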
The FST-based and heterozygosity-ratio test results were
not fully congruent especially in the case of the microsatellite
locus and the two indel markers located in the intronic
regions of the Eda-gene. This is not surprising since the
overall heterozygosities were low at the Eda-linked loci in
most of the population comparisons and almost fixed for
different alleles. For example, the microsatellite locus Stn365
had only three alleles and the gene diversities ranged from
0 to 0.41 but the allele frequencies were highly skewed
(Appendix 4). In this situation the ratio of gene diversities
may not be informative but results in a higher than average
FST. Thus, rather than relying on only one test statistic, a
combination of allele frequency and gene diversity-ratio
based methods would exploit the information of genome
scan data sets effectively. It has also been suggested that
FST-based tests might detect selection over longer periods
after the selection event since mutation might restore the
genetic diversity to the levels before selection (Storz 2005).
Finally, if a large number of loci are selected in the compared
populations, then the standard deviation of the heterozygosity ratio becomes large and only extreme values
are significant (Storz 2005).
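
Why a ratio of gene diversities loses information at nearly fixed loci can be seen from the form of the statistic itself. The sketch below assumes the commonly used microsatellite estimator in which theta is obtained from gene diversity H under a stepwise mutation model as 0.5((1/(1 - H))^2 - 1), and Ln RH is the natural logarithm of the ratio of the two population estimates; this form is an assumption here rather than a restatement of the analysis settings, and the standardization of raw values across loci that normally follows is omitted. Function names are hypothetical.

```python
import math


def theta_smm(h):
    """Moment estimate of theta from gene diversity h (0 <= h < 1), stepwise mutation model."""
    return 0.5 * ((1.0 / (1.0 - h)) ** 2 - 1.0)


def ln_rh(h_pop1, h_pop2):
    """ln of the ratio of theta estimates; None when either population is monomorphic."""
    t1, t2 = theta_smm(h_pop1), theta_smm(h_pop2)
    if t1 <= 0.0 or t2 <= 0.0:
        return None  # ratio undefined: no diversity to compare
    return math.log(t1 / t2)


print(ln_rh(0.80, 0.40))  # hypothetical locus with a strong diversity contrast
print(ln_rh(0.00, 0.41))  # extremes of the gene diversity range reported for Stn365
```

When one population is fixed the ratio is undefined, and when both are close to fixation for different alleles the ratio is near one even though FST is very high, which is the situation described for the Eda-linked loci.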

Conclusions
The relatively large number of genomic regions showing
footprints of balancing rather than directional selection
might be indicative of a predominant role of balancing
over directional selection in shaping the variability in the
three-spined stickleback genome. However, despite the
conservative nature of the method used to detect balancing
selection, the low genome coverage of the markers used and
the uncertainty about possible effects of differing mutation
rates among the microsatellite loci call for further studies to
verify this conclusion. The result that the number of loci
under directional selection was considerably higher among
the freshwater than among the marine populations suggests
that colonization of freshwater environments entails a
major change in selective regime. Shifting from the original
marine environment to the freshwater probably leads
to detectable imprints of selection at the genomic level.
However, it appears that balancing selection among the
freshwater populations may maintain similar allele frequencies in a substantial number of loci. The outlier loci
detected in this study provide a good starting point for a
more fine-scale mapping of selective footprints. In particular,
the locus Stn90 seems to be a promising candidate to uncover
genes related to adaptive divergence in Fennoscandian
three-spined stickleback populations. Given the availability
of the three-spined stickleback genome sequence, it should
be possible not only to narrow down the genomic regions
affected by selection, but also to identify the actual genes
under selection. From a methodological perspective, the
strong signal emerging from the markers in the intronic
regions of the Eda-gene provides an empirical demonstration that the hitchhiking-mapping approach is able to
detect signatures of selection when the markers are tightly
linked with a selected locus. The lack of a signal of selection
for other QTL-linked markers confirms earlier findings of
decay in the selective signal with increasing distance from
the selected locus. Thus, choosing gene-associated marker
loci is an optimal strategy for future attempts to locate
targets of natural selection.

Acknowledgements
We would like to thank Tuomas Leinonen for collecting the
stickleback samples and Kaisa Välimäki for excellent laboratory
assistance. Anti Vasemägi and two anonymous referees gave
useful comments on an earlier version of this manuscript.
This study was supported by the Finnish Graduate School in
Population Genetics and the Academy of Finland.

References
Akey JM, Zhang G, Zhang K, Jin L, Shriver MD (2002) Interrogating
a high-density SNP map for signatures of natural selection.
Genome Research, 12, 1805–1814.
Asthana S, Schmidt S, Sunyaev S (2005) A limited role for balancing
selection. Trends in Genetics, 21, 30–32.
Beaumont MA, Balding DJ (2004) Identifying adaptive genetic
divergence among populations from genome scans. Molecular
Ecology, 13, 969–980.
Beaumont MA, Nichols RA (1996) Evaluating loci for use in the
genetic analysis of population structure. Proceedings of the
Royal Society of London. Series B: Biological Sciences, 263,
1619–1626.
Bell MA, Foster SA (1994) Introduction to the evolutionary biology
of the threespine stickleback. In: The Evolutionary Biology of the
Threespine Stickleback (eds Bell MA, Foster SA), pp. 1–26. Oxford
University Press, Oxford.
Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL
(2007) GenBank. Nucleic Acids Research, 35, 21–25.
Bonin A, Taberlet P, Miaud C, Pompanon F (2006) Explorative
genome scan to detect candidate loci for adaptation along a
gradient of altitude in the common frog (Rana temporaria).
Molecular Biology and Evolution, 23, 773–783.
Brownstein MJ, Carpten JD, Smith JR (1996) Modulation of non-templated nucleotide addition by Taq DNA polymerase: Primer
modifications that facilitate genotyping. Biotechniques, 20,
1008–1010.
Bubb KL, Bovee D, Buckley D et al. (2006) Scan of human genome
reveals no new loci under ancient balancing selection. Genetics,
173, 2165–2177.
Campbell D, Bernatchez L (2004) Generic scan using AFLP markers
as a means to assess the role of directional selection in the
divergence of sympatric whitefish ecotypes. Molecular Biology
and Evolution, 21, 945–956.
Cano JM, Matsuba C, Mäkinen H, Merilä J (2006) The utility of
QTL-linked markers to detect selective sweeps in natural
populations – a case study of the EDA gene and a linked marker
in threespine stickleback. Molecular Ecology, 15, 4613–4621.
Charlesworth D (2006) Balancing selection and its effects on
sequences in nearby genome regions. PLoS Genetics, 2, 379–384.
Colosimo PF, Peichel CL, Nereng K et al. (2004) The genetic
architecture of parallel armor plate reduction in threespine
sticklebacks. PLoS Biology, 2, 635–641.
Colosimo PF, Hosemann KE, Balabhadra S et al. (2005) Widespread parallel evolution in sticklebacks by repeated fixation of
ectodysplasin alleles. Science, 307, 1928–1933.
De Kovel CG (2006) The power of allele frequency comparisons to
detect the footprint of selection in natural and experimental
situations. Genetics, Selection, Evolution, 38, 3–23.
Dieringer D, Schlötterer C (2003) Microsatellite analyser (MSA): a
platform independent analysis tool for large microsatellite data
sets. Molecular Ecology Notes, 3, 167–169.
Elphinstone MS, Hinten GN, Anderson MJ, Nock CJ (2003) An
inexpensive and high-throughput procedure to extract and
purify total genomic DNA for population studies. Molecular
Ecology Notes, 3, 317–320.
Eyre-Walker A (2006) The genomic rate of adaptive evolution.
Trends in Ecology and Evolution, 21, 569–575.
Garrigan D, Hedrick PW (2003) Perspective: Detecting adaptive
molecular polymorphism: Lessons from the MHC. Evolution, 57,
1707–1722.
Goudet J (2001) FSTAT, a program to estimate and test gene
diversities and fixation indices (version 2.9.3). Available from
http://www.unil.ch/izea/softwares/fstat.html. Updated from
Goudet (1995).
Hardy OJ, Vekemans X (2002) SPAGeDi: a versatile computer
program to analyse spatial genetic structure at the individual
or population levels. Molecular Ecology Notes, 2, 618–620.
Hardy OJ, Charbonnel N, Freville H, Heuertz M (2003) Microsatellite allele sizes: a simple test to assess their significance on genetic
differentiation. Genetics, 163, 1467–1482.
Harris MA, Clark JI, Ireland A, Lomax J, Ashburner J (2006) The
gene ontology (GO) project in 2006. Nucleic Acids Research, 34,
322–326.
Hedrick PW (2005) A standardized genetic differentiation measure.
Evolution, 59, 1633–1638.
Hubbard TJ, Aken BL, Beal K et al. (2007) Ensembl 2007. Nucleic
Acids Research, 35, D610–D617.
Kane NC, Rieseberg LH (2007) Selective sweeps reveal candidate
genes for adaptation to drought and salt tolerance in common
sunflower, Helianthus annuus. Genetics, 175, 1823–1834.
Kauer MO, Dieringer D, Schlötterer C (2003) A microsatellite
variability screen for positive selection associated with the
'out of Africa' habitat expansion of Drosophila melanogaster.
Genetics, 165, 1137–1148.
Kelley JL, Madeoy J, Calhoun JC, Swanson W, Akey JM (2006)
Genomic signatures of positive selection in humans and the limits
of outlier approaches. Genome Research, 16, 980–989.
Kimmel CB, Ullmann B, Walker C et al. (2005) Evolution and
development of facial bone morphology in threespine sticklebacks. Proceedings of the National Academy of Sciences of the United
States of America, 102, 5791–5796.
Kimura M (1981) Possibility of extensive neutral evolution under
stabilizing selection with special reference to nonrandom usage
of synonymous codons. Proceedings of the National Academy of
Sciences of the United States of America, 78, 5773–5777.

Kingsley DM, Zhu B et al. (2004) New genomic tools for molecular
studies of evolutionary change in sticklebacks. Behaviour, 141,
1331–1344.
Largiader CR, Fries V, Kobler B, Bakker TC (1999) Isolation and
characterization of microsatellite loci from the three-spined stickleback (Gasterosteus aculeatus L.). Molecular Ecology, 8, 342–344.
Leinonen T, Cano JM, Mäkinen H, Merilä J (2006) Contrasting
patterns of body shape and neutral genetic divergence in
marine and lake populations of threespine sticklebacks. Journal
of Evolutionary Biology, 19, 1803–1812.
Leinonen T, Cano JM, O'Hara R, Merilä J (2008) Comparative
studies of quantitative trait and neutral marker divergence: a
meta-analysis. Journal of Evolutionary Biology, 21, 1–17.
Lewontin RC, Krakauer J (1973) Distribution of gene frequency as
a test of the theory of the selective neutrality of polymorphisms.
Genetics, 74, 175–195.
Mackay TF (2001) The genetic architecture of quantitative traits.
Annual Review of Genetics, 35, 303–339.
Mäkinen HS, Merilä J (2008) Mitochondrial DNA phylogeography
of the three-spined stickleback (Gasterosteus aculeatus) in Europe –
evidence for multiple glacial refugia. Molecular Phylogenetics and
Evolution, 46, 167–182.
Mäkinen HS, Cano JM, Merilä J (2006) Genetic relationships
among marine and freshwater populations of the European
three-spined stickleback (Gasterosteus aculeatus) revealed by
microsatellites. Molecular Ecology, 15, 1519–1534.
Mäkinen HS, Shikano T, Cano JM, Merilä J (2008) Hitchhiking mapping reveals a candidate genomic region for natural selection in
three-spined stickleback chromosome VIII. Genetics, 178, 453–465.
Martins W, de Sousa D, Proite K, Guimaraes P, Moretzsohn M,
Bertioli D (2006) New softwares for automated microsatellite
marker development. Nucleic Acids Research, 34, e31.
Maynard Smith J, Haigh J (1974) The hitchhiking effect of a
favourable gene. Genetical Research, 23, 23–35.
McKinnon JS, Rundle HD (2002) Speciation in nature: The threespine stickleback model systems. Trends in Ecology and Evolution,
17, 480–488.
Narum SR (2006) Beyond Bonferroni: Less conservative analyses
for conservation genetics. Conservation Genetics, 7, 783–787.
Nielsen R (2005) Molecular signatures of natural selection. Annual
Review of Genetics, 39, 197–218.
Orr HA (2005a) The genetic basis of reproductive isolation: Insights
from Drosophila. Proceedings of the National Academy of Sciences of the
United States of America, 102 (Supplement 1), 6522–6526.
Orr HA (2005b) The genetic theory of adaptation: a brief history.
Nature Reviews Genetics, 6, 119–127.
Østbye K, Amundsen PA, Bernatchez L et al. (2006) Parallel evolution
of ecomorphological traits in the European whitefish Coregonus
lavaretus (L.) species complex during postglacial times. Molecular
Ecology, 15, 3983–4001.
Peichel CL, Nereng KS, Ohgi KA et al. (2001) The genetic architecture
of divergence between threespine stickleback species. Nature, 414,
901–905.
Pollinger JP, Bustamante CD, Fledel-Alon A, Schmutz S, Gray
MM, Wayne RK (2005) Selective sweep mapping of genes with
large phenotypic effects. Genome Research, 15, 1809–1819.
Raeymaekers JA, Van Houdt JK, Larmuseau MH, Geldof S,
Volckaert FA (2007) Divergent selection as revealed by PST and
QTL-based FST in three-spined stickleback (Gasterosteus aculeatus)
populations along a coastal-inland gradient. Molecular Ecology,
16, 891–905.

Reusch TB, Wegner KM, Kalbe M (2001) Rapid genetic divergence
in postglacial populations of threespine stickleback (Gasterosteus
aculeatus): The role of habitat type, drainage and geographical
proximity. Molecular Ecology, 10, 2435–2445.
Rogers S, Bernatchez L (2007) The genetic architecture of ecological
speciation and the association with signatures of selection in
natural lake whitefish (Coregonus sp. Salmonidae) species pairs.
Molecular Biology and Evolution, 24, 1423–1438.
Schlötterer C (2003) Hitchhiking mapping – functional genomics
from the population genetics perspective. Trends in Genetics, 19,
32–38.
Schlötterer C, Dieringer D (2005) A novel test statistic for the
identification of local selective sweeps based on microsatellite
gene diversity. In: Selective Sweep (ed. Nurminsky D), pp. 55–64.
Landes Bioscience, Georgetown, Texas.
Schneider S, Roessli D, Excoffier L (2000) Arlequin: a Software for
Population Genetics Data Analysis. Version 2.000. Genetics and
Biometry Laboratory, Department of Anthropology, University
of Geneva, Switzerland.
Schöfl G, Schlötterer C (2004) Patterns of microsatellite variability
among X chromosomes and autosomes indicate a high frequency
of beneficial mutations in non-African D. simulans. Molecular
Biology and Evolution, 21 (7), 1384–1390.
Shapiro MD, Marks ME, Peichel CL et al. (2004) Genetic and developmental basis of evolutionary pelvic reduction in threespine
sticklebacks. Nature, 428, 717–723.
Simonsen KL, Churchill GA, Aquadro CF (1995) Properties of
statistical tests of neutrality for DNA polymorphism data.
Genetics, 141, 413–429.
Staden R, Beal KF, Bonfield JK (2000) The Staden package, 1998.
Methods in Molecular Biology, 132, 115–130.
Stinchcombe JR, Hoekstra HE (2008) Combining population
genomics and quantitative genetics: Finding the genes
underlying ecologically important traits. Heredity, 100, 158–170.
Storey JD, Tibshirani R (2003) Statistical significance for genomewide
studies. Proceedings of the National Academy of Sciences of the United
States of America, 100, 9440–9445.
Storz JF (2005) Using genome scans of DNA polymorphism to
infer adaptive population divergence. Molecular Ecology, 14,
671–688.
Storz JF, Nachman MW (2003) Natural selection on protein polymorphism
in the rodent genus Peromyscus: Evidence from
interlocus contrasts. Evolution, 57, 2628–2635.
Taylor EB, McPhail JD (2000) Historical contingency and ecological determinism interact to prime speciation in sticklebacks,
Gasterosteus. Proceedings of the Royal Society B: Biological Sciences,
267, 2375–2384.
Teshima KM, Coop G, Przeworski M (2006) How reliable are
empirical genomic scans for selective sweeps? Genome Research,
16, 702–712.
Vasemägi A, Primmer CR (2005) Challenges for identifying functionally important genetic variation: The promise of combining
complementary research strategies. Molecular Ecology, 14,
3623–3642.
Vasemägi A, Nilsson J, Primmer CR (2005) Expressed sequence
tag-linked microsatellites as a source of gene-associated polymorphisms for detecting signatures of divergent selection in
Atlantic salmon (Salmo salar L.). Molecular Biology and Evolution,
22, 1067–1076.
Vigouroux Y, McMullen M, Hittinger CT et al. (2002) Identifying
genes of agronomic importance in maize by screening microsatellites for evidence of selection during domestication.
Proceedings of the National Academy of Sciences of the United States
of America, 99, 9650–9655.
Vitalis R, Dawson K, Boursot P (2001) Interpretation of variation
across marker loci as evidence of selection. Genetics, 158, 1811–1823.
Weir BS, Cockerham CC (1984) Estimating F-statistics for the
analysis of population structure. Evolution, 38, 1358–1370.
Wiehe T, Nolte V, Zivkovic D, Schlötterer C (2007) Identification of
selective sweeps using a dynamically adjusted number of
linked microsatellites. Genetics, 175, 207–218.
Worley K, Carey J, Veitch A, Coltman DW (2006) Detecting the
signature of selection on immune genes in highly structured
populations of wild sheep (Ovis dalli). Molecular Ecology, 15,
623–637.

This study was a part of HM's thesis entitled 'Phylogeography
and adaptive divergence of three-spined stickleback populations'.
JC and JM have a broad interest in studying the relative roles of
natural selection and random genetic drift in the wild.

Appendix 1 Basic population genetic estimates for the loci identified as directional (bold) and balancing (italics) selection outliers in the
global analysis. Allele frequency differentiation (FST), differentiation when taking into account stepwise mutations (RST), pRST is the
permuted FST assuming that stepwise mutations had not contributed to the population differentiation. AR = allelic richness based on 12
individuals, HE = expected heterozygosity, % variation is the amount of variance explained in the locus-by-locus AMOVA analysis between
marine and freshwater populations

Locus     FST     RST      pRST (95% C.I.)         AR     HE     % variation
Stn12     0.408   0.148    0.386 (0.14–0.63)       7.7    0.84   6.1
Stn90     0.521   0.677    0.497 (0.14–0.72)       4.3    0.72   14.3
Stn365    0.654   0.727    0.596 (0.28–0.78)       2.9    0.58   43.5
Stn380    0.685   0.837    0.505 (0.08–0.84)       3.3    0.55   36.5
Stn381    0.856   0.855    0.855 (0.85–0.85)       2.0    0.49   51.4
Stn122    0.087   0.0306   0.084 (0.0–0.22)        13.4   0.94   1.1
Stn19     0.069   0.162    0.067 (0.0–0.19)        16.5   0.96   0.7
Stn34     0.065   0.167    0.064 (0.0–0.18)*       15.9   0.95   1.1
Stn57     0.088   0.387    0.085 (0.0–0.25)***     15.7   0.96   0.4
Stn67     0.089   0.159    0.088 (0.0–0.26)        9.7    0.89   1.0
Stn59     0.040   0.180    0.040 (0.0–0.14)**      14.4   0.94   0.4
Stn81     0.066   0.133    0.049 (0.0–0.18)        13.7   0.94   0.1
GAest8    0.080   0.358    0.056 (0.0–0.21)***     16.0   0.96   1.5
GAest30   0.047   0.462    0.046 (0.0–0.17)***     19.4   0.98   0.2
GAest36   0.069   0.067    0.068 (0.0–0.18)        11.3   0.91   0.9
GAest52   0.087   0.406    0.083 (0.0–0.21)***     14.3   0.92   1.9
GAest57   0.087   0.067    0.080 (0.0–0.19)        10.7   0.89   0.0
GAest60   0.036   0.016    0.038 (0.0–0.13)        11.6   0.92   0.7
GAest63   0.043   0.238    0.042 (0.0–0.13)***     17.3   0.97   0.9
GAest74   0.061   0.239    0.059 (0.0–0.16)***     16.0   0.95   0.4

*P < 0.05, **P < 0.01, ***P < 0.001.

Appendix 2 A list of the locus names, GenBank accession numbers and the chromosomal positions in the three-spined stickleback genome. Primer sequences
are given for the markers (GAest) that were developed from the stickleback EST library available at the National Center for Biotechnology Information. Note
that the accession number refers to the EST sequence in the case of GAest loci
Locus       GenBank Acc.   Chromosome     Primer sequences 5′–3′
Stn12       G72132         I
Stn122      G72282         V
Stn19       G72135         XII
Stn3        G72128         I
Stn34       G72243         VIII
Stn57       G72155         V
7033Pbbe*   AJ010360       XI
Stn110      G72182         IX
Stn163      G72304         XIV
Stn21       G72136         VII
Stn38       G72145         IV
Stn79       G72166         VII
1125Pbbe*   AJ010354       XX
Stn132      G72193         XI
Stn174      G72310         XVI
Stn135      G72288         XII
Stn195      G72221         XVIII
Stn46       G72150         IV
Stn130      G72286         XI
Stn26       G72240         II
Stn365      -              IV
Stn9        G72131         I
Stn96       G72176         VIII
Stn100      G72177         IX
Stn178      G72312         XVI
Stn185      G72214         XIX
Stn219      BV102497       XXI
Stn61       G72158         VI
Stn82       G72168         VII
4147Pbbe*   AJ010358       IV
Stn1        G72126         I
Stn15       G72236         I
Stn23       G72137         X
Stn380      -              IV
Stn381      -              IV
Stn70       G72164         VII
Stn30       G72241         III
Stn37       G72144         IV
Stn52       G72154         V
Stn64       G72160         VI
Stn67       G72161         VI
Stn59       G72156         V
Stn81       G72262         VII
Stn125      G72189         X
Stn90       G72173         VIII
Stn146      G72296         XII
Stn148      G72198         XIII
Stn119      G72280         X
Stn173      G72309         XV
Stn180      G72313         Scaffold_182
Stn170      G72307         XV
Stn196      G72320         XVIII
Stn198      G72222         Scaffold_128
Stn205      G72324         Scaffold_27
Stn49       G72153         IV
Stn160      -              XIV
Stn208      G72229         XXI
Stn118      G72186         XIV
Stn83       G72263         VIII
GAest1      DN712245       IX             F: TGAATGCTTCTAATTGGTGTAG  R: AGTCCATGAAAACAAACCTCTA
GAest3      DN716846       XIX            F: ATTAGAAACCAGATGTCAAAGC  R: TGCGTATACATACATATCACTCAG
GAest4      DN736839       XVII           F: TAGAAATGAATCAAAACACGAG  R: TGTCAGATGCAAATAAGTGAGT
GAest6      DN733971       IV             F: GGTTAACTTCTTTGTCAGCTTC  R: TTAGTTGGATTACAATGTGAGG
GAest7      DN735236       VIII           F: CTGAAGCAGAAAGTGCTCA  R: TGGTCTATTACTGATGCTCAAA
GAest8      DN732699       XXI            F: CCTTGGAGGTTTGTTAGTTCT  R: ATCGCAGATAGAGGAATAGAGA
GAest11     DN735398       XI             F: TCTCTTACGTTGTATGCACATT  R: TTACACTACTGAAGGACTGCTG
GAest14     DN735932       -              F: CGTTTTATGTGATTCATGGTAG  R: GAACGTACACAAACTGCTACTG

*Largiader et al. 1999.

Colosimo et al. 2005, indel markers.
Other Stn loci described in Peichel et al. 2001.
Same as Stn113.

Locus       GenBank Acc.   Chromosome     Primer sequences 5′–3′
GAest15     DN732561       II             F: CAATCATGAAACAAGTTACCAG  R: ATCTTTATAAGAGCACACGCTT
GAest16     DN716882       Scaffold_27    F: ATTCAGAAAAGAGAGAGGTGTG  R: ACAGAGTATCCATGCTTCATTC
GAest17     DN720020       Scaffold_128   F: ATTTCACAACATCATCATCATC  R: TATATTCCAGTTTGCAGAAAGA
GAest19     DN728221       VI             F: ATGAGAGAGCACATGACTGAG  R: GAAATCAACGGGAACAGATA
GAest21     DN704336       VII            F: TATCATTACGGATGACTTCAGA  R: AAGTCCTCATTTCAATGTTTG
GAest26     DN700256       -              F: AAAACACTAAAATGGTCCTTTG  R: ATTTATGGCGTTTATGGATTAG
GAest29     DN704287       III            F: TCCGTCAGTTAGTCTGTTTGTA  R: CTGGACTACTTTACTGTGCTGA
GAest30     DN685788       XII            F: AGGTTGGTCTAGTAAAAGCTGA  R: GCCAATCAGGAGAACAACT
GAest31     DN685475       XIX            F: CAAACTAAGCACAAACTAAGCA  R: GACGTTCATTCATCTCTTCTCT
GAest32     DN694579       XII            F: GTAAATATCTCTTGCCAATTC  R: ATATCAATAATGCAGTAGGTTAC
GAest34     DN682722       XXI            F: ATGACAGACATGAAATGAACAC  R: CAAGTACAAGACGAGCTACGA
GAest35     DN685500       IV             F: TGCAGTTTAGCACAAACTCTAC  R: AATGTGTAACCATCACAGAATG
GAest36     DN687963       II             F: CGTAGATCCCAAATAAACTCAT  R: TCATCTCGTCTAATTGTTTCTG
GAest41     DN666064       -              F: TTTACACAAAAGCTTCATAACG  R: AAGGGGTCCAGATAGAATATGT
GAest42     CD494453       VI             F: ATTGGCTTGAATAAATGTGG  R: CTCATTAACTGTAGGTGACACG
GAest43     CD493458       IV             F: TCTCAGAAAGCAATACAAAA  R: ACTGTTATCACGTCCACTTT
GAest47     DN705394       XIX            F: CAGCAAAGTTACAGACTGACAT  R: GTGAATTATTTTGTACTGCGAA
GAest49     DN730395       XIV            F: CTTTAACACCAGTTCATGTCAC  R: CAGATTTGTAAGAAACACATCG
GAest50     DN719394       -              F: GGTTGAATCTGTCTTACCAAAT  R: CAGTATCATGGCTACATCTCAG
GAest51     DN728002       XVI            F: TACTCATACCAGCTTATGCAAG  R: ACTTGGGTGTTTATAAGTCGTC
GAest52     DN722519       Scaffold_48    F: CAGAAGAAGAACCCCATATTTA  R: GACCCTATCTTCCCATTTATTT
GAest53     DN719509       VII            F: CATCAACATCAACAACAACTG  R: GTCAGACAGGGTCTCGTATT
GAest55     DN698980       III            F: TCTTTGTAGTACGATCCAATCA  R: ATTCCATGTCTAATCTCTCAGC
GAest56     DN683451       XI             F: GTTAGGATTGAAAAGGAAGGTT  R: TGTACGACTACGACAAGCTG
GAest57     DN707612       XX             F: GTTAGCTTCAAAGATCCAAATG  R: AGAATGAGAGCAGTTACAGAGC
GAest60     CD502157       Scaffold_132   F: CTGGAACAACAAACATTTTATC  R: TGGAGGTCTGTTATTTTATTTTC
GAest61     DN713857       XII            F: ATGAAAGGATAATGTTACCAGC  R: CAACAAAAGAAATGTGAGAATG
GAest63     DN693003       XX             F: AAAGAGATTTTACCTCTGATCG  R: ACGCTTCTTCATACGAGTTAAT
GAest64     DN655209       -              F: GTACTGTGGTTGAGTGTGTGTT  R: CGTGTTTAGATGGAAGTGTAGA
GAest66     DN683768       II             F: AAAAGTGTGAAGTCACTGGAG  R: AGTGTCTACCTCTAACCTACGTG
GAest67     DN685529       XIII           F: AGGAGGAGTGCAGATAAAGAC  R: ATGAAGATGAAATCAGACGAGT
GAest71     DN713857       XIII           F: ATGAAAGGATAATGTTACCAGC  R: CAACAAAAGAAATGTGAGAATG
GAest73     DN730681       -              F: CACCACTAACTCGTACATCCTT  R: CTCTATCCATAGTTGTTGCTCC
GAest74     DN730448       XV             F: TACAAAGCTCAGTCAGAAGTCA  R: CGACCACAATAACCAGTAGAC
GAest80     DN704459       XI             F: CTGATCACTTGTGTGTACTTTGT  R: AAAGAGATCCGAGAGTACGAC
GAest82     DN706699       VII            F: AATGTCAAATAACGTCTTCTCC  R: GAAGCCTTTTAACACGTCTAAC
GAest84     DN732607       XII            F: GTTTGAATATAGCTTCCTCTGC  R: GTTTTCTTTTGAAACATTCCTC
GAest87     DN687948       XVIII          F: AACACCTCCTTAAACAGATCAC  R: GGGTTAAAAGCAATGAGATTAC

Appendix 3 Details of the markers, which have been found to be associated with QTLs in experimental crosses and were used in this study
to identify traces of natural selection in the morphological traits. Note that group refers to the linkage groups in the original linkage map
(Peichel et al. 2001)
Markers

Group

Traits

LOD

% explained

Reference

Stn178
Stn9
Stn26
Stn96
Stn130
Stn82

XVI
I
II
VIII
XI
VII

IV

Stn21
Stn365
Stn380
Stn381
1125Pbbe

II
IV
IV
IV
XXV

Stn219
Stn185

XXVI
IX

37
21
17
22
17
19.8
3745
3745
3745
4.9
116.9
4.7
4.9
NA
NA
NA
26.3
10.9
3.512.3
7.8

14.9
1.53
1.42
2.72
2.57
22.2
43.7
c. 40
c. 40
5.8
77.6
5.6
7.6
NA
NA
NA
28.9
17.9
4.528.6
32

Peichel et al. 2001

Gac4174

# rakers
Dorsal Spine1
Dorsal Spine1
Dorsal Spine2
Dorsal Spine2
Asc. Pelvic Branch
Pelvic Spine
Pelvis Loss
Pelvic Asymmetry
Pelvic Spine
Lat. Plates
Pelvic Girdle (length)
Pelvic Spine
# lateral plates
# lateral plates
# lateral plates
Plate Width
Plate high
# lateral plates
Opercular bone shape

Shapiro et al. 2004

Colosimo et al. 2005


Colosimo et al. 2004

Kimmel et al. 2005

Appendix 4 Allele frequencies in the microsatellite locus Stn365 showing low within-population heterozygosities but skew between
population allele frequencies. This pattern might explain the differences in detecting selection with FST and gene diversity ratio based
neutrality tests
Allele (bp)

Merirastila

Orrevatnet

Barents

L. Pulmanki

L. Kevo

L. Vättern

R. Neretva

138
140
142
HE

10.5
85.5
4.0
0.26

4.2
89.6
6.3
0.19

98.0

2.0
73.0
25.0
0.41

87.0
13.0

100

71.7
28.3
0.40

2.0
0.04

0.23
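
The HE values listed for the first two populations (Merirastila and Orrevatnet) can be recovered from their allele frequencies with the standard gene diversity formula, HE = 1 − Σ p_i². The short Python sketch below illustrates this under the assumption that the uncorrected form was used (the original analysis may have applied a sample-size correction); the function name `expected_heterozygosity` is illustrative and not taken from the study.

```python
def expected_heterozygosity(freqs_percent):
    """Return 1 - sum(p_i^2) for allele frequencies given in percent."""
    total = sum(freqs_percent)
    if total <= 0:
        raise ValueError("allele frequencies must sum to a positive value")
    # Renormalise so that rounding in the published percentages
    # (e.g. a column summing to 100.1) does not bias the estimate.
    props = [f / total for f in freqs_percent]
    return 1.0 - sum(p * p for p in props)


# Stn365 allele frequencies (%) for alleles 138, 140 and 142 bp,
# taken from the first two population columns of Appendix 4.
stn365 = {
    "Merirastila": [10.5, 85.5, 4.0],
    "Orrevatnet": [4.2, 89.6, 6.3],
}

he = {pop: round(expected_heterozygosity(f), 2) for pop, f in stn365.items()}
print(he)  # {'Merirastila': 0.26, 'Orrevatnet': 0.19}
```

Both values agree with the tabulated HE (0.26 and 0.19), and a population fixed for a single allele gives HE = 0, which is the low within-population diversity the caption refers to.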
