Syst. Biol. 50(5):689–699, 2001
JOHN J. WIENS
Section of Amphibians and Reptiles, Carnegie Museum of Natural History, Pittsburgh,
Pennsylvania 15213-4080, USA; E-mail: wiensj@carnegiemuseums.org
Good science requires clearly explained, repeatable methods. Yet, practitioners of morphological phylogenetics tend not to be explicit about their methodology, specifically, how morphological characters are selected, and how states are defined, delimited, coded, and ordered (a process I refer to as "character analysis"). The lack of methodological explanation in published morphological studies has been discussed by several authors (e.g., Pimentel and Riggins, 1987; Pogue and Mickevich, 1990; Stevens, 1991; Thiele, 1993; Wiens, 1995) and has been documented for character selection (Poe and Wiens, 2000). This is a particularly serious problem, because in contrast to analysis of DNA sequence data, in which character definition and character state delimitation are virtually automatic (the nontrivial problem of alignment notwithstanding), morphological character analysis requires considerable effort, involving many methodological decisions and implicit assumptions at every step in the process.

Many aspects of morphological character analysis are controversial, including the way in which characters are constructed (e.g., Maddison, 1993; Pleijel, 1995; Wilkinson, 1995; Hawkins et al., 1997; Lee and Bryant, 1999; Strong and Lipscomb, 1999), whether intraspecifically variable characters can be included (Pimentel and Riggins, 1987; Nixon and Wheeler, 1990; Stevens, 1991; Campbell and Frost, 1993; Thiele, 1993; Wiens, 1995, 1998; Rae, 1998), how within-species variation is coded (Archie, 1985; Campbell and Frost, 1993; Thiele, 1993; Wiens, 1995, 1999; Swiderski et al., 1998; Smith and Gutberlet, 2001), how character states are ordered (Hauser and Presch, 1991; Lipscomb, 1992; Wilkinson, 1992; Slowinski, 1993), and how different types of morphological characters are weighted relative to each other (e.g., Farris, 1990; Campbell and Frost, 1993; Wiens, 1995, 1998). Different choices and assumptions are important because they can lead to radically different trees (e.g., Wiens, 1995).

In this paper, I suggest that for many morphological characters, these problems and controversies in the selection, definition, delimitation, and ordering of characters may have a common solution. Many, if not most, morphological characters describe variation in quantitative traits (e.g., differences in size, shape, or counts of serially homologous structures), regardless of whether systematists choose to code them quantitatively or qualitatively (Stevens, 1991; Thiele, 1993). Given this, three fundamental problems of character analysis (character state definition, delimitation, and ordering) potentially can be solved by simply
SYSTEMATIC BIOLOGY VOL. 50
characters consistently increases phylogenetic accuracy relative to excluding them (Wiens and Servedio, 1997; Wiens, 1998). Third, coding quantitative variation as continuous quantitative characters (i.e., weighting character state transformations on the basis of differences in mean trait values using the method proposed in this study) may be preferable to qualitative coding because it can potentially solve three common problems in morphological phylogenetics. These problems are explained in the sections below.

Vague character definitions. The language

variation within character state ranges may be ignored. For example, given a character with state 0 consisting of 11–14 vertebrae and state 1 of 15–20 vertebrae, two hypothetical taxa with species values of 11 and 14 vertebrae would be coded as identical. Second, differences within intervals may be larger than between intervals. For the above character, a change from 11 to 14 vertebrae is ignored, but a change from 14 to 15 receives maximum weight. Third, use of cutoffs and ranges may not reflect the differences in the amount of change between character states.
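The first two problems can be made concrete with the vertebrae example above. The following Python sketch is only an illustration, not code from this study, and the function names are hypothetical. It compares the cost assigned to a change under range coding (state 0 = 11–14 vertebrae, state 1 = 15–20 vertebrae) with a cost that simply tracks the unscaled difference in trait values.

```python
def code_by_range(count):
    """Range coding from the example: state 0 = 11-14, state 1 = 15-20 vertebrae."""
    if 11 <= count <= 14:
        return 0
    if 15 <= count <= 20:
        return 1
    raise ValueError(f"{count} vertebrae falls outside both state ranges")


def range_coded_cost(a, b):
    """Cost of a change under range coding: 0 within a state, 1 between states."""
    return abs(code_by_range(a) - code_by_range(b))


def mean_difference_cost(a, b):
    """Cost proportional to the difference in trait values (unscaled)."""
    return abs(a - b)


for a, b in [(11, 14), (14, 15)]:
    # Columns: from, to, range-coded cost, difference-based cost
    print(a, b, range_coded_cost(a, b), mean_difference_cost(a, b))
# prints:
# 11 14 0 3
# 14 15 1 1
```

Under range coding the larger change (11 to 14) costs nothing and the smaller change (14 to 15) costs a full step, whereas the difference-based cost ranks the two changes in proportion to their size.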
constrained by the number of distinct states allowed by the computer software package. This makes it difficult to include very large numbers of taxa (>32 for PAUP or PAUP*) with unique trait means. If the number of taxa with unique means is too large, I recommend using Thiele's (1993) gap-weighting method. This method uses less-fine-grained information but has no limits on the number of taxa that can be coded. Second, when using step-matrix gap-weighting, the only states that are reconstructed at ancestral nodes are those that occur within terminal taxa. However, to what extent (if any) this negatively impacts tree reconstruction is unclear, and simulations and congruence studies with polymorphic characters coded with frequency-based step matrices do not suggest that this problem limits phylogenetic accuracy (Wiens and Servedio, 1997, 1998; Wiens, 1998).

Some systematists may object to step-matrix gap-weighting because it requires assumptions about evolutionary processes. However, as stated above, this method is simply a logical extension of the same assumption that is widely used by morphological systematists when they code intrinsically quantitative characters. Morphological character states typically describe ranges of trait values, regardless of whether the states are defined quantitatively (e.g., state 0 = frontal process length >50% nasal length vs. state 1 = frontal process <40% nasal length) or qualitatively (e.g., frontal process long vs. short). Thus, systematists implicitly assume that taxa sharing similar but nonidentical trait values should be more closely related than taxa sharing more dissimilar trait values. They assume that traits will generally evolve gradually, rather than leaping from low to high trait values and vice versa (i.e., they assume no a priori homoplasy in quantitative trait values). This assumption is little more than an extension of parsimony to character state definition; the minimum amount of change is assumed a priori. This assumption is also supported by the fields of empirical and theoretical quantitative genetics (Lynch and Walsh, 1998), which show that a character is generally more likely to evolve to a similar trait value (e.g., from a low mean number of ventral scales to a different low number) than to a dissimilar value (e.g., from

titative characters: between-character scaling, between-state scaling, and statistical scaling.

Between-character scaling. Various authors have recommended weighting or scaling quantitative characters to be equal to each other and to qualitative characters (e.g., Thiele, 1993). The goal is to ensure that quantitative and (binary) qualitative characters have the same maximum length, an approach I label "between-character scaling." For quantitative characters coded using step matrices with a maximum
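As a rough illustration of how a step matrix might be built from species means under between-character scaling, consider the Python sketch below. It is a minimal sketch under stated assumptions, not the implementation used in this study: the function name, the taxon labels, and the trait values are hypothetical, each unique mean is treated as its own state, and the cost of a change is taken to be the absolute difference in means divided by the observed range, so that the largest possible change costs one step (the maximum length of a binary character). The default limit of 32 states follows the software constraint mentioned above.

```python
def step_matrix(species_means, max_states=32):
    """Build a step matrix with costs proportional to differences in means.

    species_means: dict mapping taxon name -> mean trait value (hypothetical data).
    max_states: assumed software limit on the number of distinct states.
    Returns (states, costs, coding), where costs[i][j] is the cost of a
    change from states[i] to states[j], scaled so the maximum cost is 1.
    """
    states = sorted(set(species_means.values()))
    if len(states) < 2:
        raise ValueError("need at least two distinct trait means")
    if len(states) > max_states:
        raise ValueError(
            f"{len(states)} distinct means exceed the limit of {max_states} states"
        )
    span = states[-1] - states[0]
    costs = [[abs(a - b) / span for b in states] for a in states]
    coding = {taxon: states.index(mean) for taxon, mean in species_means.items()}
    return states, costs, coding


# Hypothetical taxa and mean trait values, for illustration only.
states, costs, coding = step_matrix({"taxon_A": 20.0, "taxon_B": 22.5, "taxon_C": 30.0})
print(states)       # [20.0, 22.5, 30.0]
print(costs[0][1])  # 0.25
print(costs[0][2])  # 1.0
```

When the number of unique means exceeds the limit, the sketch raises an error rather than binning values; binning into a fixed number of ordered states would be a different method and is not attempted here.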
obviously discrete binary character that would have the same weight as any other traditional qualitative character. If we also observe taxa fixed for 12 and 13 vertebrae in the same group, and we assume the character is ordered, then applying between-character scaling to this character would make the cost (weight) of going from 10 to 11 vertebrae decrease to 33% of its original weight (i.e., because the cost of going from 10 to 13 is scaled to be equal to the cost of going from 0 to 1 in a fixed character, the cost of going from 10 to 11 decreases to one-third). But if the standard

gap-coding). The statistical methods could be used to determine the number of distinct states for each character, and the number of distinct states minus one could be used as a weighting function for each step-matrix coded character. Using this method, the cost of a change between the lowest and highest mean species trait values would be equivalent to the maximum length of an ordered qualitative character; whether it was equivalent to a qualitative character with two states, four states, or more would depend on how many states were determined to be statisti-
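The arithmetic in the vertebrae example can be checked with a short Python sketch. This is an illustration under assumptions rather than a definitive implementation: the function names are hypothetical, the starting character is assumed to consist of taxa fixed for 10 and 11 vertebrae, and the number of statistically distinct states is supplied by hand because the statistical test used to find it is not specified here.

```python
from fractions import Fraction


def between_character_cost(a, b, observed):
    """Cost of a change a -> b when the full observed range is scaled to cost 1."""
    span = max(observed) - min(observed)
    if span == 0:
        raise ValueError("character is invariant; no scaling is possible")
    return Fraction(abs(a - b), span)


def statistical_scaling_cost(a, b, observed, n_distinct_states):
    """Weight the scaled cost by (number of distinct states - 1).

    n_distinct_states is an input assumed to come from a separate
    statistical analysis; it is not computed here.
    """
    return between_character_cost(a, b, observed) * (n_distinct_states - 1)


two_states = [10, 11]
four_states = [10, 11, 12, 13]

print(between_character_cost(10, 11, two_states))       # 1
print(between_character_cost(10, 11, four_states))      # 1/3
print(between_character_cost(10, 13, four_states))      # 1
print(statistical_scaling_cost(10, 13, four_states, 4)) # 3
```

With only 10 and 11 vertebrae observed, the change from 10 to 11 costs one full step; once 12 and 13 are also observed, the same change costs one-third of a step because the whole range from 10 to 13 is scaled to one step. Weighting by the number of distinct states minus one restores a maximum length equal to that of an ordered qualitative character with the same number of states.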
The uncertainty over the best scaling method, combined with the sensitivity of phylogenetic results to different scaling methods (Fig. 4), might be seen as a serious drawback of treating data quantitatively. But this is a case where quantitative analysis calls for explicit treatment of a general problem that is present but typically ignored with qualitative coding. For example, without quantitative methods for delimiting character states, a phylogenetic analysis of qualitatively coded data can be influenced strongly by a single character, if the author chooses to divide the character into a large number of states (in an analysis in which all character state transformations are given equal weight).

AN EMPIRICAL EXAMPLE OF QUANTITATIVE CODING AND SCALING

I have recently applied the step-matrix gap-weighting approach outlined in this paper to a phylogenetic analysis of morphological data in hoplocercid lizards (Wiens
the consistency index (ci; Kluge and Farris, 1969). Seven data sets were analyzed: (1) all characters (meristic characters weighted with between-state scaling), (2) all characters (meristic characters with between-character scaling), (3) meristic characters only (with between-state scaling), (4) meristic characters only (with between-character scaling), (5) fixed characters only, (6) polymorphic characters only, and (7) morphometric characters only (with between-character scaling). Each data set was randomized 100 times, by randomly shuffling states among taxa within

similar to the only other phylogenetic study of the group (Etheridge and de Queiroz, 1988), with the genus Enyalioides forming a paraphyletic series of lineages at the base of the tree leading to a clade containing the genera Morunasaurus (which is paraphyletic) and Hoplocercus. In the tree based on between-state scaling, Hoplocercus and a paraphyletic Morunasaurus are at the base, and Enyalioides is a well-supported monophyletic group. These trees also differ in numerous placements of individual species of Enyalioides, although many branches of both trees (ex-
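The randomization step described above (shuffling states among taxa, repeated 100 times per data set) can be sketched in Python as follows. This is a hedged illustration only: the function name and the small data matrix are hypothetical, the shuffling is assumed to be done independently within each character, and the subsequent tree search and consistency index calculation are not shown.

```python
import random


def shuffle_within_characters(matrix, seed=None):
    """Return a copy of matrix with states permuted among taxa, one character at a time.

    matrix: list of rows (one per taxon), each a list of character states.
    Each column keeps the same set of states; only their assignment to taxa changes.
    """
    rng = random.Random(seed)
    n_chars = len(matrix[0])
    shuffled = [list(row) for row in matrix]
    for j in range(n_chars):
        column = [row[j] for row in matrix]
        rng.shuffle(column)
        for i, state in enumerate(column):
            shuffled[i][j] = state
    return shuffled


# Hypothetical matrix: 4 taxa (rows) by 3 characters (columns).
matrix = [
    [0, 1, 2],
    [0, 0, 1],
    [1, 1, 0],
    [1, 0, 2],
]

# 100 randomized replicates, each with its own seed for reproducibility.
replicates = [shuffle_within_characters(matrix, seed=s) for s in range(100)]
```

Shuffling within columns preserves the frequency of each state for every character while removing any association between characters, which is what makes the randomized data sets a baseline for comparison.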
TABLE 2. Consistency indices of the different types of characters for the trees in Figure 4 (n = number of parsimony-informative characters of each type).

Character type   n    Between-state scaling          Between-character scaling
Fixed            17   0.439 ± 0.238 (0.167–1.000)    0.308 ± 0.128 (0.167–0.500)
Polymorphic      19   0.406 ± 0.208 (0.167–0.945)    0.316 ± 0.237 (0.142–0.945)
Meristic         8    0.490 ± 0.057 (0.417–0.599)    0.349 ± 0.059 (0.258–0.417)
Morphometric     2    0.390 ± 0.110 (0.312–0.467)    0.307 ± 0.082 (0.249–0.365)

racy of parsimony, distance, and likelihood methods for quantitative traits. Congruence analyses (e.g., Wiens, 1998), which allow phylogenetic accuracy to be addressed with empirical data sets, should be particularly useful in this area.

ACKNOWLEDGMENTS

I thank Chris Beard, Maureen Kearney, Brad Livezey, Zhexi Luo, Dick Olmstead, Steve Poe, John Rawlins, Maria Servedio, Peter Stevens, and John Wible for comments on the manuscript, and Richard Etheridge for use of our hoplocercid data set for this paper.
HAWKINS, J. A., C. E. HUGHES, AND R. W. SCOTLAND. 1997. Primary homology assessment, characters and character states. Cladistics 13:275–283.
HILLIS, D. M., AND J. J. BULL. 1993. An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Syst. Biol. 42:182–192.
HILLIS, D. M., AND J. P. HUELSENBECK. 1992. Signal, noise, and reliability in molecular phylogenetic analyses. J. Hered. 83:189–195.
KLUGE, A. G., AND J. S. FARRIS. 1969. Quantitative phyletics and the evolution of anurans. Syst. Zool. 18:1–32.
LEE, D.-C., AND H. N. BRYANT. 1999. A reconsideration of the coding of inapplicable characters: Assumptions and problems. Cladistics 15:373–378.
SCHLUTER, D. 1984. Morphological and phylogenetic relations among the Darwin's finches. Evolution 38:921–930.
SLOWINSKI, J. B. 1993. "Unordered" versus "ordered" characters. Syst. Biol. 42:155–165.
SMITH, E. N., AND R. L. GUTBERLET, JR. 2001. Generalized frequency coding: a method of preparing polymorphic multistate characters for phylogenetic analysis. Syst. Biol. 50:156–169.
STEVENS, P. F. 1991. Character states, morphological variation, and phylogenetic analysis: A review. Syst. Bot. 16:553–583.
STRAIT, D., M. MONIZ, AND P. STRAIT. 1996. Finite mixture coding: A new approach to coding continuous characters. Syst. Biol. 45:67–78.
STRONG, E. E., AND D. LIPSCOMB. 1999. Character coding