This article was downloaded by:[UT Arlington] On: 27 February 2008 Access Details: [subscription number 768512488] Publisher:

Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Systematic Biology

Publication details, including instructions for authors and subscription information: http://www.informaworld.com/smpp/title~content=t713658732

Coding Meristic Characters for Phylogenetic Analysis: A Comparison of Step-Matrix Gap-Weighting and Generalized Frequency Coding

A. Michelle Lawing a; Jesse M. Meik a; Walter E. Schargel a a Department of Biology, The University of Texas at Arlington, Arlington, Texas, USA First Published on: 01 February 2008 To cite this Article: Lawing, A. Michelle, Meik, Jesse M. and Schargel, Walter E. (2008) 'Coding Meristic Characters for Phylogenetic Analysis: A Comparison of Step-Matrix Gap-Weighting and Generalized Frequency Coding', Systematic Biology, 57:1, 167 - 173 To link to this article: DOI: 10.1080/10635150801898938 URL: http://dx.doi.org/10.1080/10635150801898938

2008

POINTS OF VIEW

167

Downloaded By: [UT Arlington] At: 23:33 27 February 2008

Syst. Biol. 57(1):167–173, 2008 Copyright © Society of Systematic Biologists ISSN: 1063-5157 print / 1076-836X online DOI: 10.1080/10635150801898938

Coding Meristic Characters for Phylogenetic Analysis: A Comparison of Step-Matrix Gap-Weighting and Generalized Frequency Coding
A. MICHELLE LAWING, JESSE M. MEIK, AND WALTER E. SCHARGEL
Department of Biology, The University of Texas at Arlington, Arlington, Texas 76019, USA; E-mail: jmeik@uta.edu (J.M.M.)

Meristic characters, or counts of discrete serially homologous structures, are a distinctive and ubiquitous class of quantitative organismal variation. Meristic characters share many properties with morphometric characters (i.e., measurements, proportions, etc.): they are readily described numerically, they usually vary within and among taxa, and they often appear to follow similar underlying frequency distributions (e.g., Burbrink, 2001; Allsteadt et al., 2006). Based on these similarities, meristic characters are generally lumped with morphometric characters into the broader category of quantitative continuous characters for phylogeny reconstruction. Meristic characters also exhibit properties that differ subtly, but perhaps importantly, from morphometric characters. It has been increasingly recognized that morphological systematists often code intrinsically quantitative characters as qualitative by artificially compartmentalizing variation into relatively few ordered states (e.g., interclavicle, median process: 0 = normal length; 1 = reduced; Etheridge and de Queiroz, 1988). This practice always produces arbitrary character states with morphometric data. In contrast, meristic characters may be viewed as discrete traits that, depending on the range of variation, show a continuum from binary transformations, through multistate polymorphic characters, to quasicontinuous variation analogous to morphometric data (Wiens, 2001). Arguments both for and against the inclusion of quantitative continuous characters are prevalent in the systematics literature and are beyond the scope of this article (see Rae, 1998; Swiderski et al., 1998; Thiele, 1993). Regardless, various authors have shown that such characters do provide substantial phylogenetic information despite the potential for increased levels of homoplasy, and thus remain relevant to empirical systematics (e.g., Campbell and Frost, 1993; Wiens, 1995; Wiens and Servedio, 1998).
Several coding methods have been developed for incorporating meristic characters in phylogenetic analysis and dealing with the problem of partially overlapping character states across taxa. For binary characters, frequency bins have most often been used, whereas polymorphic multistate characters have been analyzed using majority methods, segment coding, gap coding, gap weighting, step matrices, and various statistical similarity analyses (e.g., Colless, 1980; Mabee and Humphries, 1993; Mickevich and Johnson, 1976; Swiderski et al., 1998; Thiele, 1993; Wiens, 1993).

Wiens (1998, 2000) and Wiens and Servedio (1997, 1998) evaluated several classes of coding methods and concluded that frequency methods were generally most effective. In recent years, two methods that augment and improve on previous approaches have become widely used for coding meristic (and other quantitatively defined) characters for phylogenetic analysis: step-matrix gap-weighting (SMGW; Wiens, 2001) and generalized frequency coding (GFC; Smith and Gutberlet, 2001). SMGW is an application of step matrices (Wiens, 1995) to the gap-weighting method introduced by Thiele (1993). In gap weighting, taxa are assigned states based on range-standardized mean values of a trait, with the number of possible states scaled to the maximum number allowed by the software used to infer phylogenies (e.g., 32 states for PAUP*; Swofford, 1993). Gaps between means are weighted based on the magnitude of their differences so that larger differences in trait means between taxa translate into larger weights. A limitation of the step-matrix approach is that the number of distinct states is potentially restricted by the software used to build phylogenies (e.g., PAUP* only allows 32 states, so data sets with over 32 taxa would likely not be amenable to SMGW). The purported advantage of SMGW is that step matrices allow more fine-grained weighting than simple gap weighting by increasing the trait range from 32 states to 1000 states (the maximum cost between states in a step matrix using PAUP*). Thus, characters are treated as approximations of a continuous scale (Wiens, 2001). GFC can be viewed as a method that combines elements of both gap weighting (as implemented by Thiele, 1993) and the frequency bins method of Wiens (1993). In GFC, each quantitative character is divided into subcharacters that correspond with each character state.
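The SMGW arithmetic described above can be illustrated with a minimal sketch. It assumes that each taxon receives its own state, that range-standardized means are scaled to 0 through 1000, and that the step-matrix cost between two states is the absolute difference of their scaled scores. The function names and the example means are hypothetical, and this is not the program distributed by Wiens (2001).

```python
def gap_weighted_scores(taxon_means, max_cost=1000):
    """Range-standardize taxon means and scale them to 0..max_cost."""
    lowest = min(taxon_means.values())
    highest = max(taxon_means.values())
    span = highest - lowest
    if span == 0:
        # An invariant character carries no gaps to weight.
        return {taxon: 0 for taxon in taxon_means}
    return {
        taxon: round((mean - lowest) / span * max_cost)
        for taxon, mean in taxon_means.items()
    }


def step_matrix(scores):
    """Cost of a change between two taxon states = difference in scaled scores."""
    taxa = sorted(scores)
    return {(a, b): abs(scores[a] - scores[b]) for a in taxa for b in taxa}


# Hypothetical mean counts for one meristic character in four taxa.
means = {"A": 10.0, "B": 12.5, "C": 13.0, "D": 20.0}
scores = gap_weighted_scores(means)
costs = step_matrix(scores)
```

With these hypothetical means the scaled scores are A = 0, B = 250, C = 300, and D = 1000, so a change between B and C costs 50 steps whereas a change between A and D costs the full 1000.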
The frequency of specimens falling into a given subcharacter is described with frequency bins; the overall effect is that cumulative frequency distributions of character states per taxon are constructed for each character. A potential advantage of GFC is that frequency distributions are simply translated into phylogenetically analyzable data, maximizing information content while eliminating the need for further data manipulation (Smith and Gutberlet, 2001). The primary operational difference between these methods is that character states within taxa are coded using estimates of cumulative frequencies of

states in GFC versus estimates of the mean trait value in SMGW. Both SMGW and GFC have been recently used in empirical studies (e.g., Bonett, 2002; Doan, 2003; Gutberlet and Harvey, 2002; Wiens and Etheridge, 2003), and programs made available by their respective developers have greatly facilitated their implementation. Methodological differences aside, both approaches scale characters on the parsimonious assumption that similar values (or frequencies) for a trait imply more recent common ancestry. Neither Wiens (2001) nor Smith and Gutberlet (2001) discussed the genetic processes underlying the evolutionary variation and inheritance of meristic traits but rather considered them in a strictly operational sense (i.e., simply as counts). Here we focus on analysis of meristic characters because we note that fundamental differences between meristic and morphometric characters may be reflected in the statistical properties and relative performance of GFC and SMGW. We compared the relative performance of these two coding methods using data generated under a threshold model of inheritance of meristic characters. First, we discuss the specification and associated underlying assumptions of such a model and show that the model is consistent with the implementation of both coding methods.

THRESHOLD MODEL OF MERISTIC CHARACTER EVOLUTION AND ITS ASSUMPTIONS

As is typical of continuous characters, the expressed state of a meristic character is generally specified by polygenic architecture (Felsenstein, 2005). The number of counts in a series has been described theoretically by a model in which the expression of the character is governed by whether the sum of the underlying genetic products involved (e.g., proteins, cells, etc.) exceeds physiological or developmental thresholds (Falconer and Mackay, 1996; Felsenstein, 2004, 2005; Wright, 1934).
This idea was first proposed by Wright (1934a, 1934b) in his discussion of the inheritance of hind toe counts in guinea pigs (here, number of hind toes represented a meristic character with multiple thresholds or states). As presented by Wright, a trait is controlled by one or more continuously varying factors, which cause a change in the ultimate expression of the trait phenotype when a threshold value is crossed. We refer to the underlying continuous quantity or variable as the liability of the trait, following Falconer (1965). The liability may be thought of as the concentration of a substance or compound (e.g., a hormone), which plays a major role in the physiological and/or developmental processes involved in the expression of the meristic trait (Falconer and Mackay, 1996). It can also be considered as the rate of a developmental process or even the size of the field bearing the meristic structures (see below). The threshold model is expanded from the individual to the population level by assuming that the liability is normally distributed, with the frequency of the different discrete character states in a given population defined by the area beneath the curve that falls between two threshold values (Felsenstein

2004, 2005). Felsenstein (2004) proposed to simulate the threshold model by evolving the liability using Brownian motion. There are many empirical examples of discretely varying traits that conform to the threshold model, including litter size (in which case the liability is directly related to the concentration of gonadotrophic hormones; Falconer and Mackay, 1996), disease incidence (Falconer, 1965), digit and phalange number in salamanders (in which case the liability is related to the number of mesenchymal cells in the embryonic field; Alberch and Gale, 1983), and wing morphology in crickets (Roff and Fairbairn, 1999). Although considerable additional work is needed on the inheritance of meristic traits, and especially those traits traditionally used in systematics (e.g., scale counts), the diversity of the above examples points to the generality of the threshold model as applied to discretely varying traits. In his original paper on threshold models, Wright (1934b) described the underlying mechanisms resulting in the observed phenotype as physiological. Fixed threshold models across species assume that the physiological or developmental mechanisms responsible for the phenotype remain invariant within a group of closely related species. This assumption may be justifiable under the paradigm that physiological pathways and other aspects of development are generally conserved (Wilkins, 2002). From an energetics perspective, we suggest that equidistant threshold values are also a reasonable assumption, because adding equivalent structures would likely require similar allocations of resources. We can envision an additional mechanism that would result in multiple equidistant thresholds and would represent a general case where the number of structures comprising the meristic trait is integrated with, and dependent on, the size of the structure containing them.
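The population-level statement of the threshold model (state frequencies given by the area of the normal liability curve between adjacent thresholds) can be sketched as follows. The sketch assumes fixed, equidistant thresholds with a spacing of one liability unit, so that state k covers liabilities from k up to k + 1. The function names, the example liability mean of 20.5, and the cutoff for negligible frequencies are hypothetical choices made only for illustration.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def state_frequencies(liability_mean, liability_sd, spacing=1.0, min_freq=1e-6):
    """Expected frequency of each count under fixed, equidistant thresholds.

    State k covers liabilities in [k * spacing, (k + 1) * spacing).
    States with expected frequency below min_freq are omitted.
    """
    lowest = math.floor((liability_mean - 6.0 * liability_sd) / spacing)
    highest = math.floor((liability_mean + 6.0 * liability_sd) / spacing)
    freqs = {}
    for k in range(lowest, highest + 1):
        lower, upper = k * spacing, (k + 1) * spacing
        area = (normal_cdf(upper, liability_mean, liability_sd)
                - normal_cdf(lower, liability_mean, liability_sd))
        if area >= min_freq:
            freqs[k] = area
    return freqs


# Same hypothetical liability mean, two different standard deviations.
low_variance = state_frequencies(20.5, 0.75)
high_variance = state_frequencies(20.5, 2.0)
```

Under these assumptions a smaller standard deviation concentrates a population in fewer count states, whereas a larger one spreads it across many states and approaches quasicontinuous variation.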
Consider a hypothetical case where number of maxillary teeth (the meristic character) is a linear function of size of the field bearing them (i.e., the length of the embryonic structure that is the precursor of the maxilla) at some point in early development. This would represent a special case where the liability is indirectly observable, as we know nothing of the underlying genetic determinants of this trait. Note that this does not assume perfect correlation of tooth count and maxilla length in adults, because developmental trajectories may change later in ontogeny via genetic architecture and/or environmental interactions. As with other biological models, the threshold model described above is an oversimplified representation of evolutionary processes and the genetic basis of phenotypic traits. The complexity of the model could be increased to potentially make it more realistic; for example, the liability could be evolved under an Ornstein–Uhlenbeck process (Butler and King, 2004), or evolved characters could be intercorrelated. Both SMGW and GFC implicitly assume polygenic variation, a necessary assumption to describe phenotypic evolution with Brownian motion. Neither method necessarily assumes a threshold model, but we are unaware of any other simple

FIGURE 1. Diagram illustrating how deviations from equidistant thresholds can negatively affect the performance of both generalized frequency coding (GFC) and step-matrix gap-weighting (SMGW). The graph shows the frequency distribution of character states for three species (A, B, and C), with the threshold defining the character states depicted as vertical lines. The position of the frequency distributions for each species is relative to an arbitrarily defined underlying liability scale (not shown). The top panel of the graph represents an equidistant threshold scenario in which both GFC and SMGW will support the correct tree (depicted on the top right). The bottom panel shows that both methods would support the wrong tree (depicted on the bottom right) because A and B are more similar in their observed phenotype than B and C, yet B and C have closer liability values consistent with their evolutionary history.

and biologically meaningful alternative model of the genetics of meristic characters. We recognize that under various circumstances thresholds may be neither fixed nor equidistant; however, what is salient to our comparison is that the model as presented (with fixed, equidistant thresholds) is consistent with both GFC and SMGW methods, as is illustrated by the fact that deviations from our model settings would result in failure for both methods (Fig. 1). Although such assumptions are important to justify use and gain insights into which conditions may affect performance of a method, they have rarely been made explicit by proponents of character coding methods (but see Felsenstein 1988, 2005).

SIMULATION OF MERISTIC DATA ON A STRUCTURED PHYLOGENY

We compared the performance of SMGW and GFC by evaluating each method's ability to recover the correct topology of a known phylogeny. We used an arbitrarily structured phylogeny that was initiated with a single lineage and produced five species over 1000 time steps: topology = (A,((B,C),(D,E))). Character evolution was simulated with a multiple-threshold model with fixed, equidistant thresholds and the liability evolving under Brownian motion. To account for possible biases due to phylogenetic signal, we allowed branch lengths to vary randomly across all simulations. We varied three additional parameters in our simulated data sets: number of characters (5, 10, 20, and 30), intraspecific sample size (2, 5, 10, and 20), and standard deviation of the normal distribution fitted to the liability (SD = 2, hereafter high variance; SD = 0.75, hereafter low variance). The values for a given character in the high-variance set each had a 95% quantile range of 8 counts. For the low-variance set, the 95% quantile range of values was approximately 3 counts. Both the low- and high-variation data sets are within the normal range of intrataxonomic variation documented in various empirical data sets (e.g., supralabial scale counts in squamates, body segment counts in centipedes, pectoral fin rays in fishes, etc.; Arthur, 2000; Campbell and Lamar, 2004; Crampton et al., 2005). We generated 100 replicate data sets for each combination of the three fixed parameters (see above). This produced a total of 3200 independent data sets that were each analyzed using both GFC and SMGW coding. Each liability simulation was initiated with the same value. We obtained terminal (i.e., extant) liability means for each of the five taxa in the clade as real numbers with precision equal to 0.01. For each liability mean that was evolved, we simulated meristic data collection for the various sample sizes by drawing random values from a bivariate normal distribution with the mean equal to the mean generated from the simulated liabilities. We truncated drawn values to represent integer counts by

shearing the decimals, which set our threshold values equal to the set of natural numbers. After generating data sets, we prepared each for phylogenetic analysis using both GFC and SMGW methods. Although software has been developed for each of the respective coding methods, neither was designed to process multiple data sets simultaneously. We wrote new programs that allowed implementation of these methods concomitant with the simulation of the morphological data sets. To ensure consistency of program performance, we compared results of a small subset of data sets generated with our versions of the programs with results obtained using programs provided by the original authors. The new programs developed for these analyses are available in the supplementary material (available online at http://www.systematicbiology.org). Individual character weighting is an issue that has considerable bearing on the outcome of phylogeny reconstruction. Most empirical morphological data sets contain various types of characters; thus, methods need to be employed (and justified) for adjusting weights of state changes to be consistent across disparate character types. Our simulated data effectively ameliorate this potential issue because the characters were evolved under uniform specific parameters and therefore have comparable properties across a given data set (all characters weighted equally). In order to treat characters as ordered for GFC, we used cumulative frequencies to construct subcharacter matrices. We then assigned weights using unequal subcharacter weighting because number and distance of gaps in character frequency distributions varied in extent across taxa and simulated data sets. Phylogeny reconstruction was performed in PAUP* under the maximum parsimony criterion and trees were searched using the branch-and-bound algorithm.
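The data-generation steps described in this section can be sketched as follows. This is not the program provided in the supplementary material. It assumes a hypothetical starting liability (50.0) and Brownian rate (0.05 per time step), neither of which is given in the text, draws node ages uniformly to obtain randomly varying branch lengths, and samples individuals from a univariate normal distribution as a simplification. All function names are hypothetical.

```python
import math
import random


def simulate_tip_liabilities(rng, start=50.0, total_time=1000.0, rate=0.05):
    """Evolve one liability by Brownian motion on the tree (A,((B,C),(D,E)))."""

    def evolve(value, duration):
        # Net Brownian change over `duration` steps ~ Normal(0, rate * sqrt(duration)).
        return value + rng.gauss(0.0, rate * math.sqrt(duration))

    # Random node ages give randomly varying branch lengths; all tips end at total_time.
    t_inner = rng.uniform(0.0, total_time)     # ancestor of B, C, D, and E
    t_bc = rng.uniform(t_inner, total_time)    # ancestor of B and C
    t_de = rng.uniform(t_inner, total_time)    # ancestor of D and E

    inner = evolve(start, t_inner)
    bc = evolve(inner, t_bc - t_inner)
    de = evolve(inner, t_de - t_inner)
    tips = {
        "A": evolve(start, total_time),
        "B": evolve(bc, total_time - t_bc),
        "C": evolve(bc, total_time - t_bc),
        "D": evolve(de, total_time - t_de),
        "E": evolve(de, total_time - t_de),
    }
    # Terminal liability means are kept to a precision of 0.01, as in the text.
    return {taxon: round(value, 2) for taxon, value in tips.items()}


def sample_counts(rng, liability_mean, sd, n):
    """Draw n individual liabilities and shear the decimals to get integer counts."""
    return [math.floor(rng.gauss(liability_mean, sd)) for _ in range(n)]


def simulate_dataset(rng, n_characters, sample_size, sd):
    """Return a list of characters, each a dict mapping taxon to a list of counts."""
    dataset = []
    for _ in range(n_characters):
        means = simulate_tip_liabilities(rng)
        dataset.append(
            {taxon: sample_counts(rng, mean, sd, sample_size)
             for taxon, mean in means.items()}
        )
    return dataset


rng = random.Random(2008)  # fixed seed so the sketch is repeatable
example = simulate_dataset(rng, n_characters=5, sample_size=10, sd=0.75)
```

Flooring each drawn value places the thresholds at the integers, which corresponds to the fixed, equidistant thresholds assumed by the model.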
Accuracy of each method was assessed as the number of correct topologies recovered out of 100 simulated data sets for each parameter combination (n = 32).

COMPARISON OF SMGW AND GFC METHODS

GFC consistently recovered the correct topology with higher frequency than did SMGW (30 out of 32 parameter combinations; Fig. 2). When results were pooled across all simulated data sets, GFC performed with 26.2% greater accuracy. Considering only the high-variance simulations, GFC outperformed SMGW by an average of 4.6%. The performance differential increased to an average of 21.6% in the low-variance simulations. Furthermore, at low variance the increased accuracy of GFC was most dramatic at small sample sizes (n ≤ 5). This overall pattern appears to be a result of decreasing performance of SMGW under low character variance, as opposed to major differences in the performance of GFC. Of the 2547 correct trees obtained from our analyses, 5.5% were recovered only by SMGW and 21.7% were recovered only by GFC. The consistency in performance across virtually all parameter combinations we assessed suggests that GFC may be a generally more accurate method of coding meristic characters for phylogenetic analysis.
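To make the operational contrast between the two methods concrete, the following sketch shows what each extracts from the same sample of counts: SMGW summarizes a taxon by its mean, whereas GFC retains a binned cumulative frequency for every observed state. The sketch is not the program of Smith and Gutberlet (2001). It assumes cumulative frequencies are accumulated from the lowest state upward and placed into 25 equal-width frequency bins; both choices, the function name, and the example counts are hypothetical.

```python
from collections import Counter


def cumulative_frequency_subcharacters(counts_by_taxon, n_bins=25):
    """Code one meristic character as binned cumulative frequencies.

    Returns the sorted list of observed states (one subcharacter per state)
    and, for each taxon, the bin index (0..n_bins-1) of the cumulative
    frequency of specimens with a count at or below that state.
    """
    states = sorted({c for counts in counts_by_taxon.values() for c in counts})
    coded = {}
    for taxon, counts in counts_by_taxon.items():
        tally = Counter(counts)
        running = 0
        row = []
        for state in states:
            running += tally[state]
            # Integer arithmetic avoids floating-point edge cases at bin borders.
            row.append(min(running * n_bins // len(counts), n_bins - 1))
        coded[taxon] = row
    return states, coded


# Hypothetical counts for two taxa, five specimens each.
sample = {"A": [10, 10, 11, 11, 12], "B": [11, 12, 12, 12, 13]}
taxon_means = {taxon: sum(c) / len(c) for taxon, c in sample.items()}  # SMGW input
states, coded = cumulative_frequency_subcharacters(sample)             # GFC-style coding
```

For these hypothetical counts the observed states are 10 through 13, taxon A is coded as [10, 20, 24, 24], and taxon B as [0, 5, 20, 24]. The final subcharacter is always in the top bin for every taxon, because the cumulative frequency at the highest observed state is 1.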

The observation that the difference in performance between GFC and SMGW is greater under the low-variance simulations provides potential insights into the operational implications of these methods. Conceptually SMGW attempts to estimate only the mean of the underlying liability. Assuming finite sample sizes, the ability of this method to accurately capture the liability decreases when distance between thresholds is greater relative to the variance of the distribution of the liability (i.e., fewer thresholds are spread across a distribution as in the low-variance simulations). Conversely, through its implementation GFC seems to reconcile the fact that there are observed character states that are determined by thresholds and an underlying liability. In other words, despite the fact that the underlying genetic model was not explicitly considered in the initial development of either method, GFC appears to more effectively account for the threshold model via the way it codes meristic data. GFC retains the observed character states, which in theory are determined by thresholds, but also takes into account the gaps between character state distributions, which result from divergence in the underlying liability. An extension of this idea, but one that we cannot necessarily conclude from our simulations based on the threshold model, is that we would expect that as genetic architecture becomes simpler (less polygenic), the mean of the liability would become less informative, inasmuch as a normal distribution would not be a good fit to the data. Although this would likely be detrimental to both methods, we envision this problem to influence SMGW to a greater extent. It would be desirable to directly test the robustness of both methods to deviations from normality; however, approximations of Brownian motion are described using the density function of the normal distribution (Felsenstein 2004).
Thus, although it would be computationally simple to simulate data from any non-normal distribution, it would not be biologically justifiable as we are unaware of a model in which a continuous trait (e.g., the liability) is not normally distributed. A logical next step then would be to evaluate these coding methods using empirical data sets with various known sampling distributions. Wiens (2001) advocated that one of the major advantages of SMGW is the fine-grained weighting that is afforded through the conversion of mean trait values into a scale of 1000 possible states. Although SMGW allows fine-grained weighting, the method implicitly requires comparable precision in the quality of the original data. Simply multiplying range-standardized means by a factor of 1000 may increase precision in gaps; or conversely, it may create false precision by amplifying any sample size errors and measurement inconsistencies inherent in the raw data. For this added precision to be meaningful, there need be careful and explicit consideration of the quality of the raw data and the potential effects of translating those data across a finer scale. Here we focused exclusively on meristic characters and the conceptual agreement of GFC to the threshold model of meristic trait inheritance. It is possible

FIGURE 2. Simulation results depicting number of correct trees recovered by both GFC and SMGW (definitions as in Fig. 1). Vertical bars represent number of correct trees recovered out of 100 independent data sets under various parameter combinations (3200 total data sets).
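The design summarized in Figure 2 (32 parameter combinations with 100 replicates each) and the tallying of correct trees can be sketched as follows. The sketch assumes that estimated trees are rooted on taxon A and written as nested tuples, and that a tree counts as correct when its set of clades matches that of the true topology. The representation and all names are hypothetical.

```python
from itertools import product

N_CHARACTERS = (5, 10, 20, 30)
SAMPLE_SIZES = (2, 5, 10, 20)
LIABILITY_SDS = (2.0, 0.75)   # high variance, low variance
REPLICATES = 100

TRUE_TREE = ("A", (("B", "C"), ("D", "E")))


def clades(tree):
    """Return the non-trivial clades of a nested-tuple tree as frozensets of tips."""
    found = set()

    def tips(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(tips(child) for child in node))
        found.add(members)
        return members

    all_tips = tips(tree)
    found.discard(all_tips)   # the clade containing every tip is uninformative
    return found


def is_correct(estimated, true_tree=TRUE_TREE):
    """A rooted estimate is correct when it implies exactly the true clades."""
    return clades(estimated) == clades(true_tree)


def count_correct(estimated_trees):
    """Number of correct topologies among the trees from one parameter combination."""
    return sum(is_correct(tree) for tree in estimated_trees)


grid = list(product(N_CHARACTERS, SAMPLE_SIZES, LIABILITY_SDS))
total_data_sets = len(grid) * REPLICATES
```

The grid contains 4 × 4 × 2 = 32 combinations, which with 100 replicates gives the 3200 data sets reported in the caption. Comparing clade sets makes the check insensitive to the order in which sister groups are written.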

that under certain conditions SMGW may outperform GFC, especially when characters vary continuously, are normally distributed, and sample sizes are adequate. However, regardless of the coding method used, it is also important to note that many morphometric characters are actually controlled by simple genetic architecture that would appear as discrete states except for the influence of environmental interactions; thus it is difficult to assess when high levels of precision reflect nuances of heritable variation or merely the additive effects

of genotype/environmental interactions. It may also be argued that the use of frequency bins (as in GFC) may induce artificial signal by arbitrarily making equivalent character states that are subtly different (Wiens, 2001). Although this is an important issue, our results show that the coarseness of such an approach does not seem to hinder the relative ability of the coding method to recover the correct topology. In fact, attempts to achieve high levels of precision in coding may be illusory due to the additive noise of nonheritable variation, sample size

and associated statistical issues, variable rates of character evolution, etc., that are inherent in phenotypic data.

CONCLUSIONS

We are entrenched in an era where DNA sequences are the primary data source for reconstructing the tree of life and the major focus in theoretical systematics is on incorporating better models of molecular evolution; however, it is clear that the incorporation of morphological data remains vital to a holistic effort in systematics (Hillis and Wiens, 2000). Although the evolution of morphology remains less tractable relative to molecular data, our knowledge and conceptual understanding of morphogenesis is increasing rapidly and probabilistic models of morphological evolution are beginning to be incorporated into phylogeny estimation (e.g., Lewis, 2001). An increased appreciation of the mechanisms underlying development and evolution of morphology will certainly inject rigor into the translation of morphology into characters suitable for phylogenetic analysis. In this study, we have made an additional step towards reconciling what is understood about the evolution of meristic characters with the choices available for incorporating these into phylogenetic analyses. Interestingly, the theoretical model underpinning our consideration of meristic characters was first presented over half a century ago but has yet to be explicitly considered by systematists attempting to code and utilize such characters (but see Felsenstein, 2005). We acknowledge that reasons for the differential performance of these methods remain speculative, but it is clear from our results that GFC uses information contained within an intrataxonomic sample of meristic characters more efficiently than does SMGW, particularly when variation is small. Because GFC is operationally more consistent with the concept of an underlying threshold model than is SMGW, such a result may be expected.

ACKNOWLEDGMENTS
For comments, discussion, and providing helpful literature, we are grateful to T. Castoe, N. MacLeod, E. Martins, A. Pires da Silva, P. D. Polly, J. Sullivan, and the UTA herpetology discussion group (Spring 2007).

REFERENCES
Alberch, P., and E. Gale. 1983. Size dependency during the development of the amphibian foot: Colchicine induced digital loss and reduction. J. Embryol. Exp. Morphol. 76:177–197.
Allsteadt, J., A. H. Savitzky, C. E. Petersen, and D. N. Naik. 2006. Geographic variation in the morphology of Crotalus horridus (Serpentes: Viperidae). Herpetol. Monogr. 20:1–63.
Arthur, W. 2000. Intraspecific variation in developmental characters: The origin of evolutionary novelties. Am. Zool. 40:811–818.
Bonett, R. M. 2002. Analysis of the contact zone between the dusky salamanders Desmognathus fuscus fuscus and Desmognathus fuscus conanti. Copeia 2002:344–355.
Burbrink, F. T. 2001. Systematics of the North American rat snake complex (Elaphe obsoleta). Herpetol. Monogr. 15:1–53.
Butler, M. A., and A. A. King. 2004. Phylogenetic comparative analysis: A modeling approach for adaptive evolution. Am. Nat. 164:683–695.
Campbell, J. A., and D. R. Frost. 1993. Anguid lizards of the genus Abronia: Revisionary notes, descriptions of four new species, phylogenetic analysis, and key. Bull. Am. Mus. Nat. Hist. 216:1–121.
Campbell, J. A., and W. W. Lamar. 2004. The venomous reptiles of the Western Hemisphere. Cornell University Press, Ithaca, New York.
Crampton, W. G. R., D. H. Thorsen, and J. S. Albert. 2005. Three new species from a diverse, sympatric assemblage of the electric fish Gymnotus (Gymnotiformes: Gymnotidae) in the Lowland Amazon Basin, with notes on ecology. Copeia 1:82–99.
Colless, D. H. 1980. Congruence between morphometric and allozyme data for Menidia species: A reappraisal. Syst. Zool. 31:100–104.
Doan, T. M. 2003. A south-to-north biogeographic hypothesis for Andean speciation: Evidence from the lizard genus Proctoporus (Reptilia, Gymnophthalmidae). J. Biogeogr. 30:361–374.
Etheridge, R., and K. de Queiroz. 1988. A phylogeny of Iguanidae. Pages 283–368 in Phylogenetic relationships of lizard families: Essays commemorating Charles L. Camp (R. Estes and G. Pregill, eds.). Stanford University Press, Palo Alto, California.
Falconer, D. S. 1965. The inheritance of liability to certain diseases, estimated from the incidence among relatives. Ann. Hum. Genet. Lond. 29:51–76.
Falconer, D. S., and T. F. C. Mackay. 1996. Introduction to quantitative genetics, 4th edition. Longman, Essex, UK.
Felsenstein, J. 1988. Phylogenies and quantitative characters. Ann. Rev. Ecol. Syst. 19:445–471.
Felsenstein, J. 2004. Inferring phylogenies. Sinauer Associates, Sunderland, Massachusetts.
Felsenstein, J. 2005. Using the quantitative genetic threshold model for inference between and within species. Phil. Trans. R. Soc. B 360:1427–1434.
Gutberlet, R. L., Jr., and M. B. Harvey. 2002. Phylogenetic relationships of New World pitvipers as inferred from anatomical evidence. Pages 51–68 in Biology of the vipers (G. W. Schuett, M. Hoggren, H. W. Greene, and M. Douglas, eds.). Eagle Mountain Publishing, Eagle Mountain, Utah.
Hillis, D. M., and J. J. Wiens. 2000. Molecules versus morphology in systematics: conflicts, artifacts, and misconceptions. Pages 1–19 in Phylogenetic analysis of morphological data (J. J. Wiens, ed.). Smithsonian Institution Press, Washington, DC.
Lewis, P. O. 2001. A likelihood approach to estimating phylogeny from discrete morphological character data. Syst. Biol. 50:913–925.
Mabee, P. M., and J. Humphries. 1993. Coding polymorphic data: Examples from allozymes and ontogeny. Syst. Biol. 42:166–181.
Mickevich, M. F., and M. F. Johnson. 1976. Congruence between morphological and allozyme data. Syst. Biol. 25:260–270.
Rae, T. 1998. The logical basis for the use of continuous characters in phylogenetic systematics. Cladistics 14:221–228.
Roff, D. A., and D. J. Fairbairn. 1999. Predicting correlated responses in natural populations: changes in JHE activity in the Bermuda population of the sand cricket, Gryllus firmus. Heredity 83:440–450.
Smith, E. N., and R. L. Gutberlet, Jr. 2001. Generalized frequency coding: A method of preparing polymorphic multistate characters for phylogenetic analysis. Syst. Biol. 50:156–169.
Swiderski, D. L., M. L. Zelditch, and W. L. Fink. 1998. Why morphometrics is not special: Coding quantitative data for phylogenetic analysis. Syst. Biol. 47:508–519.
Swofford, D. L. 1993. PAUP*: Phylogenetic analysis using parsimony (*and other methods). Version 4.0b10(PPC). Sinauer Associates, Sunderland, Massachusetts.
Thiele, K. 1993. The Holy Grail of the perfect character: The cladistic treatment of morphometric data. Cladistics 9:275–304.
Wiens, J. J. 1993. Phylogenetic systematics of the tree lizards (genus Urosaurus). Herpetologica 49:399–420.
Wiens, J. J. 1995. Polymorphic characters in phylogenetic systematics. Syst. Biol. 44:482–500.
Wiens, J. J. 1998. Testing phylogenetic methods with tree congruence: Phylogenetic analysis of polymorphic morphological characters in phrynosomatid lizards. Syst. Biol. 47:411–428.
Wiens, J. J. 2000. Coding morphological variation within species and higher taxa for phylogenetic analysis. Pages 115–145 in Phylogenetic analysis of morphological data (J. J. Wiens, ed.). Smithsonian Institution Press, Washington, DC.


Wiens, J. J. 2001. Character analysis in morphological phylogenetics: Problems and solutions. Syst. Biol. 50:689–699.
Wiens, J. J., and R. E. Etheridge. 2003. Phylogenetic relationships of hoplocercid lizards: Coding and combining meristic, morphometric, and polymorphic data using step matrices. Herpetologica 59:375–398.
Wiens, J. J., and M. R. Servedio. 1997. Accuracy of phylogenetic analysis including and excluding polymorphic characters. Syst. Biol. 46:332–345.
Wiens, J. J., and M. R. Servedio. 1998. Phylogenetic analysis and intraspecific variation: Performance of parsimony, likelihood, and distance methods. Syst. Biol. 47:228–253.
Wilkins, A. S. 2002. The evolution of developmental pathways. Sinauer Associates, Sunderland, Massachusetts.
Wright, S. 1934a. An analysis of variability in the number of digits in an inbred strain of guinea pigs. Genetics 19:506–536.
Wright, S. 1934b. The results of crosses between inbred strains of guinea pigs differing in number of digits. Genetics 19:537–551.

First submitted 28 June 2007; reviews returned 10 September 2007; final acceptance 18 October 2007
Associate Editor: Norman MacLeod

Syst. Biol. 57(1):173–181, 2008
Copyright © Society of Systematic Biologists
ISSN: 1063-5157 print / 1076-836X online
DOI: 10.1080/10635150801910469

Crown Clades in Vertebrate Nomenclature: Correcting the Definition of Crocodylia


JEREMY E. MARTIN¹ AND MICHAEL J. BENTON²

¹ Université Lyon 1, UMR 5125 PEPS CNRS, 2, rue Dubois 69622 Villeurbanne, France; E-mail: jeremy.martin@pepsmail.univ-lyon1.fr
² Department of Earth Sciences, University of Bristol, Bristol, BS9 1RJ, UK; E-mail: mike.benton@bristol.ac.uk

A crown group is defined as the most recent common ancestor of at least two extant groups and all its descendants (Gauthier, 1986). Despite criticism, crown-group definitions are widely used, especially for certain clades of vertebrates. As an example, crown-group Crocodylia was established by Clark (in Benton and Clark, 1988), and there has been increasing use of crown Crocodylia rather than traditional or total Crocodylia since that date. Originally, the Crocodylia embraced forms dating from the Late Triassic to the present. These were divided into three classes, Protosuchia, Mesosuchia, and Eusuchia, the first two of which were accepted as probably or certainly paraphyletic. The new convention was cemented by Brochu (2003), who gave a new definition of crown Crocodylia according to the conventions of phylogenetic nomenclature (PN), as the last common ancestor of Gavialis gangeticus, Alligator mississippiensis, and Crocodylus niloticus, and all of its descendants. This led to an interesting reversal in the hierarchy, so that crown-clade Crocodylia is a subset of Eusuchia, rather than the other way round, as had been the case.

Reasons for redefining the boundaries of major vertebrate groups are linked to the advent of cladistics. Such nomenclatural revisions have been accelerated by the need for clarity in the application of the principles of PN (de Queiroz and Gauthier, 1992, 1994). Many proponents of crown-clade definitions assume that crown clades are a key element of PN and the PhyloCode, but this is not the case (Cantino and de Queiroz, 2004). The assumption of a linkage arose because earlier papers by architects of the PhyloCode (e.g., de Queiroz and Gauthier, 1992) included crown clades as a part of the manifesto for change, and PhyloCode supporters generally support crown clades.

This article does not aim to criticize the principles of PN (see Benton, 2000, 2007; Nixon and Carpenter, 2000; Dyke, 2002; Forey, 2002; Monsch, 2005; Rieppel, 2006) but rather expresses dissatisfaction with the increasingly common use of crown-group definitions, with a particular focus on the use of the term Crocodylia. Names should be given to stable clades for the sake of nomenclatural stability, independent of which nomenclatural system is preferred. Lee (1996) demonstrated that crown clades were as good as any other kinds of clades in terms of clarity of definition and biological usage. Our question is therefore the following: why is it necessary to redefine something already established and accepted for almost 250 years with a new definition that is no more stable and even more confusing than the previous one? Moreover, consistency with traditional taxonomy is recommended by the PhyloCode (e.g., Articles 10 and 11; Cantino and de Queiroz, 2003).

The basis for the definition of crown clades was set up by Gauthier (1986) and Gauthier et al. (1988), who argued that crown clades possess three main advantages: (a) they allow us to reconstruct soft tissues and other unfossilizable characters of extinct members; (b) they promote stability in discussion; and (c) they conform most closely to the original concept of the name. We will develop our ideas around these three points, the aim being to survey the literature in order to determine the traditional meaning of Crocodylia.

IS A SEPARATE CLASS FOR CROWN CLADES NECESSARY?

Motivations for the usage of crown-clade definitions came with the advent of cladistics in the mid-1980s. Proponents of PN may define taxa in three ways: node-based, stem-based, and apomorphy-based definitions. A crown clade is founded on a node-based definition and it is specifically bracketed by extant taxa. Crown
