Molecular Phylogenetics and Evolution 31 (2004) 852–864
www.elsevier.com/locate/ympev
Abstract
Molecular phylogenetic research on Selaginellaceae has focused on the plastid gene rbcL, which in this family has unusually high
substitution rates. Here we develop a molecular data set from the nuclear 26S ribosomal DNA gene with the aim of evaluating and
extending the results of previous phylogenetic research. The 26S rDNA and the rbcL regions were sequenced for a sample of 23
species, which represent the main elements of species diversity in the family. The data were analysed independently and in combination
using both maximum parsimony and Bayesian inference. Although several between-genome differences were found, the
general pattern of relationships uncovered by all analyses was very similar. The results corroborate the previous study, supporting new
groupings not previously recognised on morphological grounds. Substitution rates in the 26S rDNA were also found to be high
(26% informative) for the region analysed, but lower than for rbcL (37% informative). These data indicate that high substitution
rates might be widespread in all three genomes (i.e., plastid, mitochondrion, and nucleus).
© 2003 Elsevier Inc. All rights reserved.
Keywords: Selaginellaceae; Phylogeny; 26S rDNA; rbcL; Maximum parsimony; Bayesian inference; Long branches; Rate heterogeneity; Substitution
rate
1055-7903/$ - see front matter © 2003 Elsevier Inc. All rights reserved.
doi:10.1016/j.ympev.2003.10.014
P. Korall, P. Kenrick / Molecular Phylogenetics and Evolution 31 (2004) 852–864 853
monophyletic, but the so-called resurrection plants are polyphyletic.

One striking feature of Selaginellaceae is the extremely high substitution rate in the rbcL gene. We found that 566 of the 1299 available characters were phylogenetically informative (Korall and Kenrick, 2002). Some branches even exceed 100 characters in length. Taken as a whole, branch length variation within this family is greater than that of all other land plants, and this leads to instability in phylogenetic analysis. The positions of two species, Selaginella australiensis and Selaginella sinensis, were particularly unstable. Under certain ingroup/outgroup combinations, monophyly of Selaginellaceae would break down, and this species pair would make an enormous phylogenetic leap across the tree to group as sister to Gnetales.

Here we attempt a critical evaluation of the results of our previous study on chloroplast data by sampling a more conserved region from the nuclear genome, 26S nuclear ribosomal DNA. In plants, the 26S nuclear ribosomal DNA region is approximately 3.4 kb in length, and it is divided into rapidly evolving expansion segments and more conservative core regions (Kuzoff et al., 1998). Overall, 26S rDNA evolves at a slightly lower rate than rbcL (Kuzoff et al., 1998). We chose to focus on the 26S rDNA region because the anticipated lower evolutionary rates would help address problems that we had encountered with long branches in rbcL (Korall and Kenrick, 2002). Also, 26S rDNA provides a complementary data set from a different genome (nucleus). This has the advantage of side-stepping problems or peculiarities specific to the plastid genome. We have also chosen to analyse our data using both maximum parsimony and Bayesian inference in an attempt to correct for the effects of high substitution rates.

2. Materials and methods

2.1. Choice of taxa

A total of 23 ingroup species were chosen to represent a sample of the 62 species included in our previous rbcL analysis (Korall and Kenrick, 2002) (Table 1). Note that Selaginella peruviana was previously misidentified and was included in Korall and Kenrick (2002) under the name S. sellowii. The misidentification does not affect the results, since the same voucher and DNA extract have been used in both studies.

The choice of outgroup was based on previous morphological (Kenrick and Crane, 1997) and molecular (Korall et al., 1999; Kranz and Huss, 1996; Wikström and Kenrick, 1997) phylogenetic studies. These indicate that Isoetaceae is the sister group to Selaginellaceae. We included two species of Isoetaceae: Isoetes lacustris and Isoetes andina. In the rbcL study, Isoetes melanopoda was chosen as outgroup instead of I. andina. In the combined analysis we have united the rbcL sequence of I. melanopoda with the 26S rDNA sequence of I. andina in a single OTU, here called I. melanopoda/andina.

2.2. DNA extraction, amplification, and sequencing

With the exception of Selaginella lepidophylla, the total DNA extractions were those used by Korall and Kenrick (2002). Most of the extractions were made using the DNeasy Plant Mini Kit from Qiagen (Santa Clarita, California, USA). The total DNA extraction of S. lepidophylla was no longer available, and a new extraction was made. Total DNA of I. andina was kindly provided by Catarina Rydin (Department of Botany, Stockholm University, Sweden).

Based on the results of Kuzoff et al. (1998) we decided to amplify the first third of the 26S rDNA region, which is approximately 1200 bp. This required the synthesis of more specific primers, which were constructed in the following way. PCR amplification of the 26S rDNA was performed using the primers N-nc26S1 and 1229rev (Table 2) from Kuzoff et al. (1998) and the Ready-To-Go™ PCR beads from Amersham–Pharmacia Biotech (Uppsala, Sweden). The reactions were run in a Perkin–Elmer Thermal Cycler with one cycle of 95 °C for 5 min and 30 cycles of 95 °C for 30 s, 55 °C for 30 s, and 72 °C for 1.5 min. Since the primers were unspecific, multiple sequences were produced. The products were separated on a 4% agarose gel, and the section of the gel containing the DNA of the correct length was excised. The DNA was extracted by the “freeze and squeeze” method. The piece of gel with the correct DNA was placed in a small package of parafilm open at one end, frozen for a few minutes (−80 °C), and then the package was squeezed by hand. The resulting drop of fluid containing the DNA was collected. This was used as a template for a nested PCR using internal primers. Cycle sequencing of the PCR products was performed using an ABI kit [BigDye Terminator-kit (PE Applied Biosystems, Warrington, WA1, USA)] with the PCR primers as well as the two internal sequencing primers 641R by Kuzoff et al. (1998) and 380F constructed by Catarina Rydin (Department of Botany, Stockholm University, Sweden). The resulting fragments were separated on an ABI Prism 377 automated sequencer (PE Applied Biosystems, Warrington, WA1, USA). These were used to construct the Selaginella specific PCR and sequencing primers 60F and 1160R (Table 2). The new primers were then used for PCR amplification and sequencing using the protocol outlined above. Sequences were assembled and edited using the Staden Package (Staden, 1996) and deposited in the EMBL sequence database.
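As a reading aid, the thermal cycling profile given in Section 2.2 (one cycle of 95 °C for 5 min, then 30 cycles of 95 °C for 30 s, 55 °C for 30 s, and 72 °C for 1.5 min) can be written down as a small data structure. The sketch below is only illustrative: the names `Step`, `INITIAL`, `CYCLE`, and `programmed_seconds` are assumptions made for this example, and the total ignores ramping time between temperatures, which the text does not report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One hold in the thermal profile (hypothetical helper type)."""
    temp_c: float
    seconds: int


# Initial step: one cycle of 95 °C for 5 min (from Section 2.2).
INITIAL = Step(95.0, 5 * 60)

# One amplification cycle: 95 °C for 30 s, 55 °C for 30 s, 72 °C for 1.5 min.
CYCLE = (Step(95.0, 30), Step(55.0, 30), Step(72.0, 90))
N_CYCLES = 30


def programmed_seconds(initial, cycle, n_cycles):
    """Sum of programmed hold times; ramping between temperatures is ignored."""
    return initial.seconds + n_cycles * sum(step.seconds for step in cycle)


total = programmed_seconds(INITIAL, CYCLE, N_CYCLES)
print(f"{total} s = {total / 60:.0f} min")  # prints "4800 s = 80 min"
```

Under these assumptions the programmed hold time is 300 s plus 30 cycles of 150 s, that is 80 minutes; the real run time on the instrument would be longer.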
Table 1
Taxa included in analysis, systematic position according to Jermy (1986), geographical distribution, voucher name, and accession numbers in the EMBL sequence database

Taxa                      Subgenus/series according to Jermy (1986)   Geographical distribution   DNA source/voucher    EMBL accession numbers, rbcL/26S rDNA
INGROUP (Selaginella)
S. acanthostachys Baker   Stachygynandrum                             S America                   Korall 1996:14a (S)   AJ295884^e/AJ507611
S. bombycina Spring       Stachygynandrum                             S & C America               Korall 1997:31 (S)    AJ010848^c/AJ507600
Table 2
Primers used in amplifying and sequencing 26S rDNA

Primer      Direction  5′–3′ sequence              Reference/designed by
N-nc26S1^a  Forward    CGACCCCAGGTCAGGCG           Kuzoff et al. (1998)
60F^a       Forward    TTTAAGCATATCACTAAGCGGAGG    Petra Korall
380F        Forward    CCGCGAGGGAAAGATGAAAAGGAC    Catarina Rydin, Department of Botany, Stockholm University
1229rev^a   Reverse    ACTTCCATGACCACCGTCCT        Kuzoff et al. (1998)
1160R^a     Reverse    CCAGTTCTGCTTACCAAAAATGGCCC  Petra Korall
641rev      Reverse    TTGGTCCGTGTTTCAAGACG        Kuzoff et al. (1998)

^a Primers used both in PCR and sequencing.
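For readers who want to inspect the primers in Table 2 programmatically, the following minimal sketch computes length, GC fraction, and reverse complement for each sequence. It is not part of the original protocol: the helper names `gc_fraction` and `reverse_complement` are assumptions for this example, and only the primer names and sequences come from Table 2.

```python
# Primer sequences copied from Table 2 (5'-3' orientation).
PRIMERS = {
    "N-nc26S1": "CGACCCCAGGTCAGGCG",
    "60F": "TTTAAGCATATCACTAAGCGGAGG",
    "380F": "CCGCGAGGGAAAGATGAAAAGGAC",
    "1229rev": "ACTTCCATGACCACCGTCCT",
    "1160R": "CCAGTTCTGCTTACCAAAAATGGCCC",
    "641rev": "TTGGTCCGTGTTTCAAGACG",
}

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence (hypothetical helper)."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence (hypothetical helper)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


for name, seq in PRIMERS.items():
    print(f"{name:9s} {len(seq):2d} nt  GC {gc_fraction(seq):.0%}  "
          f"revcomp {reverse_complement(seq)}")
```

The reverse complement is useful for locating the binding site of the reverse primers on the forward strand of an alignment.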
Metropolis-coupled Markov chain Monte Carlo. In all analyses, 200,000 generations were performed and every 10th tree was saved. Stationarity of the chains was judged by examining the output files of the analyses. The first 3000 trees sampled were discarded as burn-in (corresponding to 30,000 generations, which was well beyond apparent stationarity in all analyses) and a 50% majority rule consensus tree was calculated for the remaining 17,000 trees.

Each Bayesian inference analysis was repeated three times to test for convergence. Furthermore, we investigated the variation found in the posterior probability values obtained by running 10 replicates of the 26S rDNA analysis (GTR + Γ + I). The mean and the Monte Carlo variance of the posterior probabilities were calculated.

The test for homogeneity among data sets with different origins implemented in PAUP* 4.0 (Swofford, 2002) and described by Farris et al. (1995) was performed, using 1000 heuristic searches each with 10 replicates of random addition sequence.

3. Results

The data sets were analysed separately and in combination. Parsimony analyses are presented as bootstrap trees (Figs. 1A, 2A, and 3A), and tree statistics are summarised in Table 3. The Bayesian inference analyses are presented as 50% majority rule consensus trees (Figs. 1B, 2B, and 3B). The topologies depicted here are those derived from the most complex model for each data set. For rbcL this is GTR + Γ, with each codon position treated separately (Fig. 1B). The model for the 26S rDNA data set is GTR + Γ + I (Fig. 2B), and for the combined data set GTR + Γ with four partitions treated separately (26S rDNA sequences plus three codon positions) (Fig. 3B). All differences in the results of the various analyses are summarised in Table 4. The three replicates run in each Bayesian inference analysis always produced the same topology. The Monte Carlo variance found in posterior probability values of the 26S rDNA analysis is presented in Table 5.

Names on clades follow Korall and Kenrick (2002). Unnamed clades are referred to throughout the text by the outermost (top and bottom) species as depicted in the figures. Note that the circumscriptions are dependent on how the trees are drawn and are only relevant when compared to the figures in question.

3.1. rbcL data set: chloroplast genome

laginoides (Fig. 1A). All other species fall into one of two large clades (clades A and B), which have moderate support (bv 88%/di 9 and bv 69%/di 4, respectively). Groups with moderate or high support within clade A include the subgenus Ericetorum (Selaginella uliginosa and Selaginella gracillima, bv 100%/di 45) and the so-called articulate species, excluding Selaginella exaltata (Selaginella diffusa–Selaginella kraussiana, bv 78%/di 3). Most internal nodes within the articulate group are well supported. In clade B all of the main groups are well supported. The Asian species and species groups (Selaginella brooksii–Selaginella kerstingii, Selaginella plana–Selaginella willdenovii, Selaginella stauntoniana) are paraphyletic to a clade of South and Central American species (Selaginella haematodes–Selaginella acanthostachys, bv 81%/di 3).

The result of the Bayesian inference analyses (Fig. 1B, Table 4) is broadly similar to maximum parsimony (Fig. 1A), with notable exceptions (Table 4). The most striking incongruence is the position of S. sinensis. The Bayesian inference analyses always place this species in a more crownward position within the rhizophoric clade. Depending upon the model chosen, S. sinensis either appears as sister to clade A (GTR + Γ, Table 4), or sister to clade B (GTR + Γ with three partitions, Fig. 1B, Table 4). Both results have low posterior probability. The relationships of the articulate species S. diffusa, Selaginella lingulata, Selaginella suavis, and Selaginella sericea also differ with respect to each other.

3.2. 26S rDNA data set: nuclear genome

Parsimony analysis places a single species, S. selaginoides, as sister to a clade containing all other species, the rhizophoric clade (Fig. 2A, bv 67%/di 3). Relationships among basal groups in the rhizophoric clade are unresolved. Clade B is resolved with weak support (bv 57%/di 2). There is high support for a close relationship between the problematic S. sinensis and clade B (bv 90%/di 7), with S. sinensis as either sister group to or included in clade B. Clade A is not resolved in the consensus tree (Fig. 2A). Within the rhizophoric group there is a basal polytomy, which comprises clades and species that other analyses (rbcL here, and Korall and Kenrick, 2002) place in clade A. Bayesian inference analyses yield a completely congruent but more resolved phylogeny (Fig. 2B, Table 4). Here, clade A is resolved as a monophyletic group. However, all nodes unique to this more resolved phylogeny have low posterior probability.
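The replicate comparison described in the Methods (mean and Monte Carlo variance of posterior probabilities over 10 replicate runs of the 26S rDNA analysis) amounts to a per-node summary across runs. The sketch below shows one way to compute it. It assumes the sample variance (n − 1 denominator), which the text does not specify, and the support values are hypothetical placeholders, not the values reported in Table 5.

```python
from statistics import mean, variance


def summarise_node_support(pp_by_replicate):
    """Return (mean, sample variance) of one node's posterior probability
    across replicate runs. The n - 1 denominator is an assumption."""
    return mean(pp_by_replicate), variance(pp_by_replicate)


# Hypothetical posterior probabilities (in percent) for two nodes over
# 10 replicate runs; these are illustrative and NOT values from the paper.
hypothetical_nodes = {
    "well-supported node": [99, 100, 99, 100, 100, 99, 100, 100, 99, 100],
    "weakly supported node": [52, 61, 47, 58, 66, 49, 55, 63, 51, 57],
}

for label, values in hypothetical_nodes.items():
    m, v = summarise_node_support(values)
    print(f"{label}: mean = {m:.1f}, variance = {v:.2f}")
```

With placeholder data of this kind, nodes near 100% vary little between runs while weakly supported nodes vary much more, which is the qualitative pattern the Discussion describes.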
Fig. 1. Alternative topologies for a Selaginella phylogeny based on rbcL gene sequences. S. sinensis is marked in bold to highlight its position in the
different analyses. (A) Maximum parsimony. Bootstrap consensus tree, support values above branches denote bootstrap values and below branches
decay indices; (B) Bayesian inference. Fifty percent majority rule consensus tree of 17,000 trees, support values denote posterior probabilities.
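The figure of 17,000 trees in the caption follows from the sampling scheme in the Methods (200,000 generations, every 10th tree saved, first 3000 trees discarded), and the posterior probability of a clade in a majority rule consensus is the fraction of retained trees that contain it. The sketch below illustrates both steps on a toy sample. It is not the procedure implemented in the software the authors used: trees are represented simply as sets of clades, and the taxon names A, B, and C are hypothetical.

```python
from collections import Counter


def retained_trees(generations, sample_every, burnin_trees):
    """Number of sampled trees left after discarding the burn-in."""
    return generations // sample_every - burnin_trees


def clade_frequencies(trees):
    """Fraction of trees containing each clade.

    `trees` is an iterable of trees, each given as a set of clades,
    where a clade is a frozenset of taxon names (a simplification).
    """
    trees = list(trees)
    counts = Counter(clade for tree in trees for clade in tree)
    return {clade: n / len(trees) for clade, n in counts.items()}


def majority_rule(trees, threshold=0.5):
    """Clades present in more than `threshold` of the trees, with frequencies."""
    return {c: f for c, f in clade_frequencies(trees).items() if f > threshold}


print(retained_trees(200_000, 10, 3_000))  # prints 17000

# Toy sample of four trees over hypothetical taxa A, B, C.
AB = frozenset({"A", "B"})
BC = frozenset({"B", "C"})
ABC = frozenset({"A", "B", "C"})
toy_sample = [{AB, ABC}, {AB, ABC}, {BC, ABC}, {AB, ABC}]

for clade, freq in majority_rule(toy_sample).items():
    print(sorted(clade), freq)
```

In the toy sample the clade {A, B} appears in three of four trees (0.75) and is kept, while {B, C} appears in one of four (0.25) and is dropped from the 50% consensus.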
between the two data sets (26S rDNA versus rbcL) can be rejected (P = 0.016).

3.4. rbcL and 26S rDNA: combined analyses

Parsimony analysis of the combined data sets again places S. selaginoides as sister to the rhizophoric clade (Fig. 3A, bv 52%/di 0). Clades A and B are both resolved with comparatively high support (bv 94%/di 11 and bv 86%/di 5, respectively). The problematic S. sinensis is placed within the rhizophoric clade, but its relationship to clades A and B remains unresolved. Within clade A, the basal-most nodes are rather weakly supported (bv/di 63%/2, 53%/1, and 56%/1, respectively), but three clades have stronger support: S. lepidophylla and S. peruviana (bv 88%/di 8), the subgenus Ericetorum (S. uliginosa and S. gracillima, bv 100%/di 72), and the so-called articulate series, excluding S. kraussiana and S. exaltata (S. diffusa–S. fragilis, bv 100%/di 55). In clade B all the main groups are well supported. The Asian species and species groups (S. brooksii–S. frondosa, S. plana–S. willdenovii, S. stauntoniana) are paraphyletic to a clade of South and Central American species (S. haematodes–S. acanthostachys, bv 100%/di 13).

The Bayesian inference analysis is more resolved and differs only slightly from the parsimony analysis (Fig. 3B, Table 4). The support for the rhizophoric clade is high and the basal polytomy is resolved with S. sinensis sister to clade B. The single incongruence is the position of S. exaltata, which is resolved as sister group to the subgenus Ericetorum, but this result has low posterior probability.

4. Discussion

The nuclear 26S rDNA sequence data broadly corroborate the phylogenetic conclusions that emerged from our earlier rbcL analysis of 62 species. The 26S

Fig. 2. Alternative topologies for a Selaginella phylogeny based on 26S rDNA sequences. S. sinensis is marked in bold to highlight its position in the different analyses. (A) Maximum parsimony. Bootstrap consensus tree, support values above branches denote bootstrap values and below branches decay indices; (B) Bayesian inference. Fifty percent majority rule consensus tree of 17,000 trees, support values denote posterior probabilities. Nodes are numbered, with figures to the right of the node corresponding to those in Table 5.

Fig. 3. Alternative topologies for a Selaginella phylogeny based on combined analyses of rbcL and 26S rDNA sequences. S. sinensis is marked in bold to highlight its position in the different analyses. (A) Maximum parsimony. Bootstrap consensus tree, support values above branches denote bootstrap values and below branches decay indices; (B) Bayesian inference. Fifty percent majority rule consensus tree of 17,000 trees, support values denote posterior probabilities.

Table 3
Tree statistics of the parsimony analyses

                                  rbcL      26S rDNA  Combined
No. of characters                 1299      889       2188
No. of informative characters     477       235       712
No. of most parsimonious trees    1         4         3
Tree length                       1623      910       2558
Islands                           1         2         1
CI                                0.48      0.44      0.46
RI                                0.67      0.64      0.66
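The percentages of informative characters quoted in the Abstract and Discussion (37% for rbcL, 26% for 26S rDNA) can be recomputed from the counts in Table 3. The short sketch below does this arithmetic; the dictionary layout and the function name `percent_informative` are assumptions for illustration, while the numbers are taken from Table 3.

```python
# (number of characters, number of parsimony-informative characters), from Table 3.
TABLE_3 = {
    "rbcL": (1299, 477),
    "26S rDNA": (889, 235),
    "Combined": (2188, 712),
}


def percent_informative(n_characters, n_informative):
    """Parsimony-informative characters as a percentage of all characters."""
    return 100.0 * n_informative / n_characters


for data_set, (n_chars, n_inf) in TABLE_3.items():
    print(f"{data_set}: {n_inf}/{n_chars} = "
          f"{percent_informative(n_chars, n_inf):.1f}%")
```

Rounded to whole percentages, the rbcL and 26S rDNA values agree with the 37% and 26% stated in the text, and the combined character counts are the sums of the two separate data sets.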
Table 4
Summary of all differences in results found using maximum parsimony versus Bayesian inference

Analysis^a                             Position of S. sinensis^b        Rhizophoric   Ericetorum (er),         S. diffusa (di),
                                                                        clade?        S. exaltata (ex),        S. lingulata (li),
                                                                                      Articulatae excl.        S. sericea (se),
                                                                                      S. exaltata (art)        S. suavis (su)
Combined
Parsimony (Fig. 3A)                    Trichotomy with clades A and B   Yes bv 52%    (ex(er, art)) bv 53%     (se(su(di, li))) bv 70%, 100%
Bayesian, GTR + Γ + I                  Sister to clade B pp 0.96        Yes pp 1.00   (art(er, ex)) pp 0.80    (se(su(di, li))) pp 1.00, 1.00
Bayesian, GTR + Γ, 2 parts             Sister to clade B pp 0.99        Yes pp 1.00   (art(er, ex)) pp 0.80    (se(su(di, li))) pp 1.00, 1.00
Bayesian, GTR + Γ, 4 parts (Fig. 3B)   Sister to clade B pp 0.99        Yes pp 1.00   (art(er, ex)) pp 0.51    (se(su(di, li))) pp 1.00, 1.00

Analysis^a                             (S. lepidophylla,       S. brooksii (br),       S. acanthostachys (ac),
                                       S. peruviana)?          S. kerstingii (ke),     S. bombycina (bo),
                                                               S. frondosa (fr)        S. erythropus (er),
                                                                                       S. haematodes (ha)
rbcL
Parsimony (Fig. 1A)                    Collapsed               (ke(br, fr)) bv 61%     (ac(bo(er, ha))) bv 98%, 62%
Bayesian, GTR + Γ                      Yes pp 1.00             (ke(br, fr)) pp 0.67    (ac(bo(er, ha))) pp 1.00, 0.88
Bayesian, GTR + Γ, 3 parts (Fig. 1B)   Yes pp 0.99             (ke(br, fr)) pp 0.95    (ac(bo(er, ha))) pp 1.00, 0.84
26S rDNA
Parsimony (Fig. 2A)                    Yes bv 84%              (br, ke, fr) -          (er(ac(bo, ha))) bv 74%, 76%
Bayesian, GTR + Γ + I (Fig. 2B)        Yes pp 1.00             (br(ke, fr)) pp 0.86    (er(ac(bo, ha))) pp 0.89, 0.86
Combined
Parsimony (Fig. 3A)                    Yes bv 88%              (br, ke, fr) -          (ac(er(bo, ha))) bv 74%, 67%
Bayesian, GTR + Γ + I                  Yes pp 1.00             (br(ke, fr)) pp 0.80    (ac(er(bo, ha))) pp 0.99, 0.72
Bayesian, GTR + Γ, 2 parts             Yes pp 1.00             (br(ke, fr)) pp 0.82    (ac(er(bo, ha))) pp 0.99, 0.73
Bayesian, GTR + Γ, 4 parts (Fig. 3B)   Yes pp 1.00             (br(ke, fr)) pp 0.69    (ac(er(bo, ha))) pp 0.99, 0.87

^a See text for a description of the models used in Bayesian inference analyses.
^b Stability values presented denote the weakest node involved (bv = bootstrap value, pp = posterior probability).
insights to the phylogeny and should not, in our opinion, be avoided. The null hypothesis is rejected (P = 0.016), but the test is not significant if S. sinensis is excluded (P = 0.36). These results are in line with the different topologies found, where the major differences concern the position of S. sinensis (see below).

4.2. Genome and analytical differences

Consistent differences emerged in several areas between genomes and between methods of analysis. These were generally found in parts of the tree where branch support was weak. Incongruence may therefore reflect a lack of signal or biases in analytical models rather than fundamentally different evolutionary histories of organelle genomes. There is only one case in which all plastid analyses yield one topology, whereas analyses of the nuclear gene, irrespective of method, yield another one. This concerns the internal relationships of the South and Central American clade (S. acanthostachys, S. bombycina, S. erythropus, and S. haematodes) in clade B (Figs. 1 and 2, Table 4). The combined analyses yield a third topology (Fig. 3). There was no clear rise in indices of branch support in analyses run on the combined data set. Where incongruence occurred between genomes or between analyses of the same genome, the combined analyses yielded low support. Where the separate analyses yielded the same topology, the branch support values were high. The posterior probabilities were usually slightly higher in the combined analyses.

The most conspicuous phylogenetic conflicts concerned the position of S. sinensis, which varied depending upon analytical method and gene sequence. Parsimony analysis of the rbcL gene placed S. sinensis as sister to a clade containing all other species in the family. Bayesian inference, however, moved S. sinensis to positions within the rhizophoric clade, either as sister to clade A or clade B. Both parsimony and Bayesian analyses of the 26S rDNA alone and the combined data indicated a position within the rhizophoric clade, with five out of six analyses placing S. sinensis as sister to clade B. A position within the rhizophoric clade is consistent with comparative morphology. S. sinensis possesses the distinctive rhizophores as well as the decussately arranged sporophylls that characterise the rhizophoric clade. We attribute the anomalous result obtained from maximum parsimony analysis of the rbcL gene to branch length effects (see below).

Further differences in results that are attributable to genome or analytical preference involve the position of S. brooksii with respect to the Asian species S. frondosa and S. kerstingii (clade B). In this case support is uniformly low. The position of S. exaltata is also ambiguous. All analyses resolve the Articulatae series as paraphyletic. However, none of the hypothesised relationships for S. exaltata is strongly supported. The position of S. exaltata as sister to all other articulate species (i.e., a monophyletic Articulatae) in the 62-taxon analysis is also weakly supported.

Because the cases of incongruence outlined above mainly involve nodes for which branch support is low, it remains unclear whether the differences are real or just the consequences of low phylogenetic signal masked by noisy data. In the case of differences emerging from analytical methods (e.g., relations among articulate species: S. diffusa, S. lingulata, S. sericea, and S. suavis) there may be problems with the assumptions underlying either the maximum parsimony or Bayesian models of analysis. On the other hand, the between-genome differences observed in the relationships of the South and Central American species (S. haematodes, S. bombycina, S. erythropus, and S. acanthostachys) could reflect fundamentally different evolutionary histories. These species are all closely related, and in some areas they have sympatric distributions. Among other explanations, a hybridisation event with accompanying introgression of chloroplast DNA should be considered as a possible cause of the perceived differences in phylogenetic histories.

In the Bayesian inference analysis, multiple Markov chains were performed to minimise the risk of the algorithm failing to converge. All replicates of each analysis produced the same topology, and convergence seems to have been reached. It should be noted that the posterior probabilities of the different chains vary, with lower posterior probabilities having a rather high Monte Carlo variance (Table 5). Posterior probabilities above 97%, on the other hand, are almost constant in all analyses, with a Monte Carlo variance less than 0.3.

This study, as well as most previously published studies (see e.g., Douady et al., 2003; Leaché and Reeder, 2002; Smedmark and Eriksson, 2002; Wilcox et al., 2002), shows that the posterior probabilities of the Bayesian inference analysis tend to be higher than nonparametric bootstrap values. Simulation studies indicate that the posterior probabilities tend to be overestimations of phylogenetic accuracy, whereas bootstrap values tend to be conservative estimates (Hillis and Bull, 1993; Suzuki et al., 2002). Wilcox et al. (2002), however, maintain that, based on their results, the posterior probabilities are underestimates as well, although less so than bootstrap values, and they advocate the use of posterior probabilities.

Posterior probabilities also have a tendency to yield high values for false nodes, as seen in simulation studies where the “true” phylogeny is known (Douady et al., 2003; Suzuki et al., 2002). This is especially true when the chosen model of evolution is inappropriate (Douady et al., 2003; Suzuki et al., 2002). Huelsenbeck et al. (2002) also point out the importance of choosing a correct model of evolution when using Bayesian inference for reconstructing phylogenies.

Bayesian inference of phylogeny is a rather new method in phylogenetic reconstruction, with many
questions still unanswered. The conclusion by Douady et al. (2003) seems very appropriate at this time: “Both PP and bootstrap supports are of great interest to phylogeny as potential upper and lower bound of node support, but they are surely not interchangeable and cannot be directly compared.”

4.3. Exceptional rates of molecular evolution

The results of the 26S rDNA analysis presented here indicate that the high number of parsimony informative characters previously observed in the plastid gene rbcL (Korall and Kenrick, 2002) is to an extent mirrored also in the nucleus. We found that 37% of the characters were parsimony informative for rbcL and 26% for the region of the 26S rDNA included in this study. Typically, this amount of variation would be associated with analyses that include much larger numbers of species. Phylogenetic analyses of 26S rDNA usually exhibit lower levels of variation than we have observed in Selaginellaceae (e.g., Fan and Xiang, 2001; Stefanovic et al., 1998). In an analysis of 147 species of angiosperms Nandi et al. (1998) found that 40% of rbcL sites were parsimony informative, and in a larger 357 species analysis Savolainen et al. (2000) found 52% parsimony informative characters. Another feature of the rbcL tree is that branch length is unevenly distributed: there are far more substitutions in clade A (a fast clade) than in clade B (a slow clade) (see Fig. 2 in Korall and Kenrick, 2002). Some branches are also extremely long, such as the 155 character long terminal branch leading to S. sinensis in the 62-taxon analysis (Korall and Kenrick, 2002). This extreme branch length variation is not, however, a feature of the 26S rDNA, in which the substitutions are distributed more evenly throughout the family. Both genes, therefore, have large numbers of substitutions but the imbalance in the distribution of these substitutions is found only in rbcL.

The extraordinarily large number of substitutions in Selaginellaceae is most probably due to an elevated substitution rate, and the new evidence from the nuclear 26S rDNA indicates that this is a phenomenon that is not localised to the plastid. High substitution rates have been observed in other regions as well. Unpublished data indicate that within the chloroplast, not only the rbcL gene but also atpB has a high substitution rate. Besides 26S rDNA, the nuclear 18S rDNA region seems to evolve quickly in Selaginellaceae compared to other land plants (Kranz and Huss, 1996). The high rates of substitution in Selaginellaceae are most likely not an effect of its long evolutionary history. Although the family has ancient origins dating back to the beginning of the Carboniferous Period (Thomas, 1992, 1997), high rates of substitution are not seen within and among closely related, similarly ancient groups such as Lycopodiaceae (Wikström and Kenrick, 1997) and Isoetaceae (Rydin and Wikström, 2002). Furthermore, branch length heterogeneity within the family itself (see Fig. 2 in Korall and Kenrick, 2002) cannot be explained simply by an ancient origin.

High substitution rates in plant genes are likely to have a variety of causes, none of which is very well understood (Muse, 2000). They will depend upon whether the rate differences are coupled to a specific gene, to a genome, or correlated in all three genomes (chloroplast, mitochondrion, and nucleus). Several plausible mechanisms have been proposed (e.g., accuracy of DNA replication, generation time, speciation rate, and population size (Andreasen and Baldwin, 2001; Barraclough and Savolainen, 2001; Bousquet et al., 1992; Britten, 1986; Gaut et al., 1996; Muse, 2000, and references therein)), but the extent to which these mechanisms are active individually or how they might interact to elevate rates is very poorly understood. With its elevated and heterogeneous rates of base substitution, Selaginellaceae might provide a good model to study the relationship between rate heterogeneity and gene function within and among plant genomes and plant groups.

Acknowledgments

The authors thank Catarina Rydin for providing total DNA extract of Isoetes andina, and Mari Källersjö, PO Karis, and Johan Nylander for valuable comments on the manuscript. This work was financially supported by the Swedish Natural Science Research Council (NFR research grant to Paul Kenrick and PO Karis: B 1393/1999), and the foundation “Lars Hiertas minne” (grant to Petra Korall).

References

Albert, V.A., Mishler, B.D., 1992. On the rationale and utility of weighting nucleotide sequence data. Cladistics 8, 73–83.
Andreasen, K., Baldwin, B.G., 2001. Unequal evolutionary rates between annual and perennial lineages of checker mallows (Sidalcea, Malvaceae): evidence from 18S-26S rDNA internal and external transcribed spacers. Mol. Biol. Evol. 18, 936–944.
Barraclough, T.G., Savolainen, V., 2001. Evolutionary rates and species diversity in flowering plants. Evolution 55, 677–683.
Bousquet, J., Strauss, S.H., Doerksen, A.H., Price, R.A., 1992. Extensive variation in evolutionary rate of rbcL gene sequences among seed plants. Proc. Natl. Acad. Sci. USA 89, 7844–7848.
Bremer, K., 1988. The limits of amino acid sequence data in angiosperm phylogenetic reconstruction. Evolution 42, 795–803.
Britten, R.J., 1986. Rates of DNA sequence evolution differ between taxonomic groups. Science 231, 1393–1398.
DiMichele, W.A., Skog, J.E., 1992. The Lycopsida: a symposium. Ann. Mo. Bot. Gard. 79, 447–449.
Donoghue, M.J., Olmstead, R.G., Smith, J.F., Palmer, J.D., 1992. Phylogenetic relationships of Dipsacales based on rbcL sequences. Ann. Mo. Bot. Gard. 79, 333–345.
Douady, C.J., Delsuc, F., Boucher, Y., Doolittle, W.F., Douzery, E.J.P., 2003. Comparison of Bayesian and maximum likelihood
bootstrap measures of phylogenetic reliability. Mol. Biol. Evol. 20, 248–254.
Eriksson, T., 1999. AutoDecay Hypercard Program Distributed by the Author. Bergius Foundation, Royal Swedish Academy of Sciences, Stockholm.
Fan, C., Xiang, Q.-Y., 2001. Phylogenetic relationships within Cornus (Cornaceae) based on 26S rDNA sequences. Am. J. Bot. 88, 1131–1138.
Farris, J.S., Källersjö, M., Kluge, A.G., Bult, C., 1995. Testing significance of incongruence. Cladistics 10, 315–319.
Felsenstein, J., 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39, 783–791.
Fishbein, M., Hibsch-Jetter, C., Soltis, D.E., Hufford, L., 2001. Phylogeny of Saxifragales (Angiosperms, Eudicots): analysis of a rapid, ancient radiation. Syst. Biol. 50, 817–847.
Gaut, B.S., Morton, B.R., McCaig, B.C., Clegg, M.T., 1996. Substitution rate comparisons between grasses and palms: synonymous rate differences at the nuclear gene Adh parallel rate differences at the plastid gene rbcL. Proc. Natl. Acad. Sci. USA 93, 10274–10279.
Holmgren, P.K., Holmgren, N.H., Barnett, L.C., 1990. Index Herbariorum 1: The herbaria of the world. Greuter W, Regnum Vegetabile. New York Botanical Garden, New York.
Hillis, D.M., Bull, J.J., 1993. An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Syst. Biol. 42, 182–192.
Huelsenbeck, J.P., Larget, B., Miller, R.E., Ronquist, F., 2002. Potential applications and pitfalls of Bayesian inference of phylogeny. Syst. Biol. 51, 673–688.
Huelsenbeck, J.P., Ronquist, F., 2001. MrBayes: Bayesian inference of phylogenetic trees. Bioinformatics 17, 754–755.
Jermy, A.C., 1986. Subgeneric names in Selaginella. Fern Gaz. 13, 117–118.
Johnson, K.P., Whiting, M.F., 2002. Multiple genes and the mono-
Inokuchi, H., Ozeki, H., 1986. Chloroplast gene organization deduced from the complete sequence of the liverwort Marchantia polymorpha chloroplast DNA. Nature 322, 572–574.
Posada, D., Crandall, K.A., 1998. Modeltest: testing the model of DNA substitution. Bioinformatics 14, 817–818.
Rambaut, A., 1996. Se–Al, sequence alignment editor. Version 1.0 alpha 1. Department of Zoology, University of Oxford, Oxford, UK.
Rodríguez, F., Oliver, J.L., Marín, A., Medina, J.R., 1990. The general stochastic model of nucleotide substitution. J. Theor. Biol. 142, 485–501.
Rydin, C., Wikström, N., 2002. Phylogeny of Isoetes (Lycopsida): resolving basal relationships using rbcL sequences. Taxon 51, 83–89.
Savolainen, V., Chase, M.W., Hoot, S.B., Morton, C.M., Soltis, D.E., Bayer, C., Fay, M.F., De Bruijn, A.Y., Sullivan, S., Qiu, Y.-L., 2000. Phylogenetics of flowering plants based on combined analysis of plastid atpB and rbcL gene sequences. Syst. Biol. 49, 306–362.
Smedmark, J.E.E., Eriksson, T., 2002. Phylogenetic relationships of Geum (Rosaceae) and relatives inferred from the nrITS and trnL-trnF regions. Syst. Bot. 27, 303–317.
Staden, R., 1996. The Staden sequence analysis package. Mol. Biotechnol. 5, 233–241.
Stefanovic, S., Jager, M., Deutsch, J., Broutin, J., Masselot, M., 1998. Phylogenetic relationships of conifers inferred from partial 28S rRNA gene sequences. Am. J. Bot. 85, 688–697.
Sugiura, M., Iida, Y., Oono, K., Takaiwa, F., 1985. The complete nucleotide sequence of a rice 25S rRNA gene. Gene 37, 255–259.
Suzuki, M., Glazko, G.V., Nei, M., 2002. Overcredibility of molecular phylogenies obtained by Bayesian phylogenetics. Proc. Natl. Acad. Sci. USA 99, 16138–16143.
Swofford, D.L., 2002. PAUP*: Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4. Sinauer Associates, Sunderland, MA.
Tavaré, S., 1986. Some probabilistic and statistical problems on the
phyly of Ischnocera (Insecta: Phthiraptera). Mol. Phylogenet. Evol analysis of DNA sequences. Lect. Math. Life Sci. 17, 57–86.
22, 101–110. Therrien, J.P., Haufler, C.H., 2000. Phylogeny and biogeography of
Kenrick, P., Crane, P.R., 1997. The Origin and Early Diversification of Selaginella subg. Tetragonostachys based on nuclear ribosomal ITS
Land Plants: A Cladistic Study. Smithsonian Institution Press, sequence data. Am. J. Bot. 87, 98.
Washington. Therrien, J.P., Haufler, C.H., Korall, P., 1999. Phylogeny and
Korall, P., Kenrick, P., 2002. Phylogenetic relationships in selaginell- biogeography of Selaginella subg. Tetragonostachys. In XVI
aceae based on rbcL sequences. Am. J. Bot. 89, 506–517. International Botanical Congress, abstracts. 120. St Louis, USA.
Korall, P., Kenrick, P., Therrien, J.P., 1999. Phylogeny of Selaginell- Thomas, B.A., 1992. Paleozoic herbaceous lycopsids and the begin-
aceae: evaluation of generic/subgeneric relationships based on rbcL nings of extant Lycopodium sens. lat. and Selaginella sens. lat. Ann.
gene sequences. Int. J. Plant Sci. 160, 585–594. Mo. Bot. Gard. 79, 623–631.
Kranz, H.D., Huss, V.A.R., 1996. Molecular evolution of pterido- Thomas, B.A., 1997. Upper Carboniferous herbaceous lycopsids. Rev.
phytes and their relationships to seed plants: evidence from Palaeobotan. Palynol. 95, 129–153.
complete 18S rRNA gene sequences. Plant Syst. Evol. 202, 1–11. Wikstr€ om, N., 2001. Diversification and relationships of extant
Kuzoff, R.K., Sweere, J.A., Soltis, D.E., Soltis, P.S., Zimmer, E.A., homosporous lycopods. Am. Fern J. 91, 150–165.
1998. The phylogenetic potential of entire 26S rDNA sequences in Wikstr€ om, N., Kenrick, P., 1997. Phylogeny of Lycopodiaceae
plants. Mol. Biol. Evol. 15, 251–263. (Lycopsida) and the relationships of Phylloglossum drummondii
Lanave, C., Preparata, G., Saccone, C., Serio, G., 1984. A new method Kunze based on rbcL sequences. Int. J. Plant Sci. 158, 862–871.
for calculating evolutionary substitution rates. J. Mol. Evol. 20, Wikstr€ om, N., Kenrick, P., 2000a. Phylogeny of epiphytic Huperzia
86–93. (Lycopodiaceae): paleotropical and neotropical clades corrobo-
Leache, A.D., Reeder, T.W., 2002. Molecular systematics of the eastern rated by rbcL sequences. Nordic J. Bot. 20, 165–171.
fence lizard (Sceloporus undulatus): a comparison of parsimony, Wikstr€ om, N., Kenrick, P., 2000b. Relationships of Lycopodium and
likelihood, and Bayesian approaches. Syst. Biol. 51, 44–68. Lycopodiella based on combined plastid rbcL gene and trnL intron
Manhart, J.R., 1994. Phylogenetic analysis of green plant rbcL sequence data. Syst. Bot. 25, 495–510.
sequences. Mol. Phylogenet. Evol. 3, 114–127. Wikstr€ om, N., Kenrick, P., 2001. Evolution of Lycopodiaceae (Lyc-
Muse, S.V., 2000. Examining rates and patterns of nucleotide opsida): estimating divergence times from rbcL gene sequences by
substitution in plants. Plant Mol. Biol. 42, 25–43. use of nonparametric rate smoothing. Mol. Phylogenet. Evol. 19,
Nandi, O.I., Chase, M.W., Endress, P.K., 1998. A combined cladistic 177–186.
analysis of angiosperms using rbcL and non-molecular data. Ann. Wikstr€ om, N., Kenrick, P., Chase, M., 1999. Epiphytism and
Mo. Bot. Gard. 85, 137–212. terrestrialization in tropical Huperzia (Lycopodiaceae). Plant Syst.
Nylander, J.A.A., 2002. MrModeltest. Version 1.0b. Computer pro- Evol. 218, 221–243.
gram distributed by the author. Department of Systematic Zool- Wilcox, T.P., Zwickl, D.J., Heath, T.A., Hillis, D.M., 2002. Phyloge-
ogy, Uppsala University, Uppsala, Sweden. netic relationships of the dwarf boas and a comparison of Bayesian
Ohyama, K., Fukuzawa, H., Kohchi, T., Shirai, H., Sano, T., Sano, S., and bootstrap measures of phylogenetic support. Mol. Phylogenet.
Umesono, K., Shiki, Y., Takeuchi, M., Chang, Z., Aota, S., Evol. 25, 361–371.