Syst. Biol.

51(5):806–816, 2002
DOI: 10.1080/10635150290102483

An Optimality Criterion to Determine Areas of Endemism

CLAUDIA A. SZUMIK,1,2 FABIANA CUEZZO,2 PABLO A. GOLOBOFF,1,2 AND ADRIANA E. CHALUP2

1 Consejo Nacional de Investigaciones Científicas y Técnicas, Miguel Lillo 205, 4000 San Miguel de Tucumán, Tucumán, Argentina
2 Instituto Superior de Entomología, Miguel Lillo 205, 4000 San Miguel de Tucumán, Tucumán, Argentina; E-mail: insue@infovia.com.ar

Abstract.—A formal method was developed to determine areas of endemism. The study region is
divided into cells, and the number of species that can be considered as endemic is counted for a given
set of cells (= area). Thus, the areas with the maximum number of species considered endemic are
preferred. This is the first method for the identification of areas of endemism that implements an
optimality criterion directly based on considering the aspects of species distribution that are relevant
to endemism. The method is implemented in two computer programs, NDM and VNDM, available
from the authors. [Biogeography; endemicity; optimality criterion.]

Identification of areas of endemism is important for both historical biogeography and conservation. Although there are many formalized methods for determining the relationships between areas of endemism in vicariance biogeography (e.g., Nelson and Platnick, 1981; Brooks, 1990; Page, 1994; Nelson and Ladiges, 1996; Ronquist, 1997) and for determining conservation priorities (e.g., Vane-Wright et al., 1991; Faith, 1992; Pressey et al., 1993; Williams, 1996; Rodrigues et al., 2000), there are almost no equivalent methods for the identification of the areas of endemism themselves.

In contrast to species (which normally have discrete boundaries), areas of endemism are difficult to recognize because the basic biogeographic patterns are easily obscured by many factors (dispersal, extinction, etc.). Thus, a formalization of the criteria used for recognition of areas of endemism is clearly needed.

An explicit method to identify areas of endemism should relate relevant evidence and conclusions. With this method, an investigator should be able to evaluate a potential area independently of how the area was defined. Acceptance of those conclusions (i.e., boundaries of areas) that are best supported by available evidence requires (in principle, at least) evaluation of all possible conclusions, selecting the ones judged as optimal based on the established criterion.

Harold and Mooi (1994), Morrone (1994), and Linder (2001) have discussed identification of areas of endemism. However, an explicit criterion of optimality was either lacking (Harold and Mooi) or was used only a posteriori to select among the conclusions found by other less appropriate means (Morrone and Linder).

A method used to identify areas of endemism must consider the taxa occurring in a given area and their positions in space. This spatial component has not been included in preexisting clustering methods, and thus those methods (designed only to recover hierarchy) cannot be adopted for identification of areas of endemism. An attempt to produce such a formalization, taking into account the spatial component of endemism, is presented here.

GENERAL CONSIDERATIONS

An endemic taxon is restricted to a region and is found nowhere else. The range of distribution of a taxon is determined by both historical and current factors. Whatever the factors are, if they affect (or have affected) in a similar way different taxonomic groups, there will be congruence in the patterns of endemicity in different groups. Thus, areas that have many different groups found there and nowhere else can be defined as areas of endemism. Such a situation would, of course, indicate that the speciation processes in the different groups have been caused by common factors, but knowledge of the factors is not a prerequisite to identifying the existence of the area of endemism itself.

This notion of area of endemism has several implications as to factors to be considered when proposing a formalized identification method, particularly regarding the
limits of the area, the widely and narrowly distributed taxa, and the use of grids.

Ideally, the limits of the area of endemism would be inviolable; none of its species would be found outside the area. Additionally, under ideal conditions, all the species should be found in every part of the area of endemism (Fig. 1a). However, not all taxonomic groups will respond in exactly the same way to the factors that either cause or modify the area of endemism (e.g., not all species expand or contract their distributions in exactly the same way). A consequence is that the limits of the area will often be diffuse, with borders of areas possibly supporting some of the endemic species but lacking others (Fig. 1b).

FIGURE 1. Examples of ideal (a) and realistic (b) distributions of species endemic to an area. (a) The two species (black squares and white circles) occupy all the area. (b) Both species are confined to some sector of the area (stippled).

As has been suggested (Platnick, 1991, 1992), the taxa to be used should be those that are maximally endemic, i.e., those for which the ranges of distribution are small, compared with the study region. The range size, of course, is relative to the size of the study region. A species distributed in all of the dry Chaco is widely distributed if the study region is the Chaco but narrowly distributed if the study region is South America.

The use of grids seems unavoidable, because the series of dots that makes up the actual records for a species must be converted into ranges in some explicit way. The size of the grid cells will, obviously, affect the results, perhaps in a deterministic manner. For example, use of very small grid cells will render all distributions entirely discontinuous, and then only very small areas of endemism (or none at all) would be recognized. Alternatively, use of very large grid cells will probably cause very large areas to be recognized, with many species appearing as endemic to each area.

PROBLEMS WITH PREVIOUS PROPOSALS

Harold and Mooi (1994) stated that sympatry is not a prerequisite for the recognition of an area of endemism. Although no one would expect exact congruence in the distributional limits of two or more species at every possible scale of mapping, some extensive sympatry must exist at the relevant level (Platnick, 1991; Morrone, 1994). Harold and Mooi (1994:265) argued that “nonoverlapping distributions need not be considered separate historical entities if there is independent evidence that the areas could be considered as one.” They used as example several islands (A, B, and C; Fig. 2), with two species present in each of them; the species in islands B and C are sympatric, and the species in island A are allopatric. The two species that inhabit the island do not coexist; if anything, the distribution of the species in island A argues for recognizing two areas of endemism within island A. However, Harold and Mooi argued that in such a situation, A must be considered an area of endemism instead of a composite, and the validity of this assumption will be tested with information from other groups of organisms. The testing, however, could hardly be considered significant when all of the data are entered in this way or when A is considered (a priori) as a single unit; testing with information from other groups of organisms simply cannot correct for mistakes like these. Not surprisingly, Harold and Mooi’s approach does not provide a strict formalization. In the absence of formalization, the criteria proposed by Harold and Mooi are not operational.

FIGURE 2. Distributions of species inhabiting islands (a, b, c). Redrawn from Harold and Mooi (1994: Fig. 2).
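The gridding step discussed under General Considerations (converting locality records into per-cell presence) can be illustrated with a minimal Python sketch. The function name `grid_presence`, the record layout `(species, x, y)`, and the sample coordinates are all hypothetical, and the sketch ignores any special handling of points that fall on cell edges. It is meant only to show how the choice of cell size changes the apparent continuity of a range, not to reproduce the behavior of NDM.

```python
from collections import defaultdict
from math import floor


def grid_presence(records, origin, cell_size):
    """Map each species to the set of (row, col) cells holding at least one record.

    records   : iterable of (species, x, y) tuples (hypothetical layout)
    origin    : (x0, y0) corner of the grid
    cell_size : side length of the square cells, in the same units as x and y
    """
    x0, y0 = origin
    cells = defaultdict(set)
    for species, x, y in records:
        col = floor((x - x0) / cell_size)
        row = floor((y - y0) / cell_size)
        cells[species].add((row, col))
    return dict(cells)


# Three invented locality records for one species along a transect.
records = [("sp1", 0.5, 0.5), ("sp1", 1.5, 0.5), ("sp1", 3.5, 0.5)]

# Small cells: the range has a gap (no record falls in column 2).
fine = grid_presence(records, (0.0, 0.0), 1.0)

# Cells twice as wide: the same records form a continuous two-cell range.
coarse = grid_presence(records, (0.0, 0.0), 2.0)
```

With the smaller cells the species occupies columns 0, 1, and 3 and so appears discontinuous; with the larger cells it occupies two contiguous cells, in line with the remark that cell size affects which areas can be recognized.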
Morrone (1994) and Linder (2001) proposed more detailed operational procedures, scoring presence/absence of each species in each cell of a grid. Both authors proposed to use counts of endemic species as a criterion to evaluate possible areas, although they were not completely specific on how to decide whether or not a species can be considered endemic. Using counts of endemic species to select from among all possible conclusions presents considerable computational difficulties. Instead of using those counts to select from among all possible sets of cells, Morrone and Linder used them to select from among the sets of cells produced by a parsimony analysis or UPGMA clustering using the Jaccard similarities. As both Morrone and Linder were well aware, not all the species appearing as “synapomorphies” of a given set of cells will correspond to endemic species, because they may also be synapomorphies of many other (not closely related and geographically distant) groups. This possibility violates the main requirement for endemicity, that of being restricted to the area. Parsimony is indeed an appropriate criterion for phylogenetic reconstruction, but it cannot be adapted to a field with completely different goals and premises. Likewise, UPGMA may prefer groups of cells with no endemic species over groups with several endemics. Thus the counts of species endemic to different sets of cells should be used to select from among all possible sets, not only those sets that parsimony or UPGMA happen to produce.

AN OPTIMALITY CRITERION

A method to determine areas of endemism based on an optimality criterion must provide a way to assign a value of endemicity, or score, to a given area (= set of grid cells) regardless of how that area was found or hypothesized. For different definitions of an area, there will be different numbers of species that can be considered endemic. For example, a species will satisfy the requirement for endemicity if the area comprises the same cells where it is distributed but will not satisfy the requirement if the area comprises half those cells. Thus, for different sets of cells, there will be different numbers of species that can be considered as “endemic,” i.e., they will have different scores of endemicity. A natural criterion of optimality is thus provided by counting the species that can be considered as endemic, given the area (and the species distributions). Obviously, from among possible areas, those with the highest scores of endemicity should be preferred.

FIGURE 3. An area (including five grid cells) with score 2 under criterion 1 and score 3 under criterion 2. Under criterion 1, species X, even if occurring in each cell of the area, does not contribute to the score because it is also found in cells outside the area. Under criterion 2, species X contributes to the score. All the cells in the area have identical species composition.

To determine how many species appear as endemic, endemicity itself must be determined for each species in a formalized manner, which can be done in several ways. Four possible criteria have been examined, from a very strict or ideal concept of endemism (criteria 1 and 2, with a very high congruence required between the species distribution and the area) to less rigorous but more realistic requirements (criteria 3 and 4, which allow for some incongruence). Because each of the criteria is a relaxation of the preceding one(s), the score under each criterion will always be equal to or greater than the score under the preceding criteria. The data entry is done following the steps outlined by Morrone (1994), by plotting species localities on a map with a grid, except that the spatial location of the cell in the grid (as row, column; see Fig. 3) must also be considered. The method has been implemented in two computer programs, NDM and VNDM (Goloboff, 2001). NDM is the basic search engine, and VNDM is a program that helps viewing and diagnosing (e.g., finding out which species contribute to the score). Optionally, the data can be read as coordinates, and internally converted by the
programs to a grid of a specified size. In that case, the programs will consider a species as present in a cell if it is present in at least one point (= locality record) inside the cell; a point on the edge (or corner) of a cell indicates presence of the corresponding species in the two (or four) adjacent cells. Optionally, it is also possible to consider each point as having a “radius” equal to some (user defined) percentage of the cell width or height, so that a point very close to the edge (or corner) of a cell can be considered as also present in the adjacent cell(s).

As the criteria are defined here, they cannot be applied to disjunct areas; only areas where all cells are contiguous can be evaluated. Although it would of course be desirable to have a criterion to evaluate disjunct areas, this first approximation to the problem does not allow such an evaluation.

For a more explicit definition of the criteria, a simple notation is used:

$A$ = an area (= set of cells);
$C_N$ = nth cell that belongs to $A$;
$C_n$ = nth cell that does not belong to $A$;
$N_A$ = set of cells not adjacent to $A$;
$S_N$ = set of species present in $C_N$;
$S_n$ = set of species present in $C_n$;
$X_A$ = set of species that contribute to the score of area $A$.

In all cases, the score of an area $A$ will be the cardinality of $X_A$. The complement of a set $S$ is denoted as $\sim S$.

First Criterion ($E_1$)

This criterion assumes that the distribution of a species must adjust perfectly to the area to contribute to the score. For all $C_I$ in $A$, $S_I$ must be identical; if some $S_I \neq S_J$, then $X_A = \emptyset$; otherwise $X_A = (S_J \cap S_K \cap \ldots \cap S_N) \cap \sim (S_j \cup S_k \cup \ldots \cup S_n)$. That is, a species contributes to the score if it is found in the area and nowhere else, and each of the cells in the area has exactly the same species composition. Figure 3 is an example; the area formed by the cells 0–2, 0–3, 1–1, 1–2, and 1–3 has an endemicity score $E_1 = 2$.

Second Criterion ($E_2$)

This criterion is similar to the preceding one, but a species can contribute to the score if present in some cell outside the area as long as the cell is adjacent to the area. Thus, like before, $S_I$ must be identical for all $C_I$ in $A$; if some $S_I \neq S_J$, then $X_A = \emptyset$; otherwise, define

$B = (S_J \cap S_K \cap \ldots \cap S_N)$;
$V = (S_j \cup S_k \cup \ldots \cup S_n)$ (for all $j$, $k$, and $n$ that belong to $N_A$);
$X_A = B \cap \sim V$.

Under this criterion therefore it is not required that all the species contributing to the score have identical distributions. The example of Figure 3 will have a score $E_2 = 3$, contributed by the distributions of species X, Y, and Z; X contributes to the score because it is found outside the area but only in neighboring cells (2–1 and 2–2).

Third Criterion ($E_3$)

This criterion is similar to the preceding criterion but drops the requirement that $S_I$ must be identical for all $C_I$ in $A$. Thus, it is not required that all cells in $A$ have identical species composition. However, because $X_A$ is determined as with the previous criterion, only species occurring in each and every one of the cells in $A$ will contribute to the score. Figure 4 shows an example; the area formed by cells 0–1, 0–2, 0–3, 1–1, and 1–2 has a score $E_3 = 2$ (by species X and Y).

FIGURE 4. An area with score 2 under criterion 3. Not all cells in the area have identical species composition. Species X and Y contribute to the score; species Z does not because it is found in only some cells of the area.

Fourth Criterion ($E_4$)

Under criteria 1 through 3, a species can contribute to the score only if it is present in each and every one of the cells of the area. A more realistic criterion, however, must take into account the fact that a species may be absent from a given cell because of poor collecting effort or partial extinction (as in urban
areas). A species, therefore, should be able to contribute to the score even if absent from some cells. However, the mere number (or percentage) of cells in which the species is present (as proposed by Linder, 2001) would be a poor indicator, because the species could satisfy this requirement by being confined to some part of the area (e.g., very common in the right half of the cells, really absent in the left half), which is clearly undesirable. Some indicator of whether the species is more evenly distributed in the area is needed. One can be provided by considering that a species must satisfy three conditions for endemicity: (1) It is present in at least two of the cells that form the area, (2) it is present either in $C_I$ itself or in one of the adjacent cells that belongs to $A$ for each of the $C_I$, and (3) it is absent from no more than $Q$ (where $0 < Q < 8$) of the cells around $C_I$ that belong to $A$ ($Q = 0$ is equivalent to criterion 3). Only species that are more or less evenly distributed in the area will satisfy this requirement, and then $X_A$ will include all those species as long as they are not found in any of the cells in $N_A$ (in which case, they are widespread taxa). Additionally, each of the cells in $A$ must have at least one of the species in $X_A$; if one of the cells lacks each of the species in $X_A$ (i.e., if $X_A \cap C_I = \emptyset$, for some $I$), then make $X_A = \emptyset$. (Without this proviso, adding a strip of empty cells on the side of the area would sometimes maintain the same score and will produce areas a little larger than actually indicated by the data.) The check for empty surrounding cells can also be done for occupied cells. (Checking around occupied cells provides for a more stringent requirement and thus a score equal to or less than that obtained without checking.)

Under this criterion, not all cells in $A$ are required to have the same species composition. An example is shown in Figure 5, where cells 0–2, 0–3, 1–2, and 1–3 have $E_4 = 4$ (given by species W, X, Y, Z); each of the species contributing to the score is present in only some cells in the area but is present in at least one adjacent cell.

FIGURE 5. An area with score 4 under criterion 4. None of the species are found in every cell of the area, but all satisfy the requirement of having no more than seven empty cells around a given cell in the area and having at least one adjacent cell occupied.

SEARCHING FOR OPTIMAL AREAS

As the criteria are defined above, one can simply evaluate all possible sets of cells and select those with the highest scores. However, this approach is computationally very intensive and is intractable for even modest numbers of cells. Equally difficult problems have been posed for reserve selection criteria (reviewed by Rodrigues et al., 2000), also based on evaluating possible sets of cells. However, the reserve selection algorithms implemented so far do not take into account whether a species occurring inside the study area also occurs outside, which is a key component for evaluating endemism. The branch-and-bound implementation of those reserve selection criteria is about as time consuming as our present implementation (see Rodrigues et al., 2000, for details), and the heuristic algorithms used for that problem are not applicable in the present case.

NDM, the program used here to explore the method, uses a branch-and-bound implicit enumeration of areas. Such an approach guarantees the correctness of the results; however, although useful to explore the properties of the method and to analyze small or medium sized examples, it is not applicable to very large data sets. The strategy used by NDM to find the sets that actually maximize the score under a given criterion is detailed here. To facilitate description, an absolute numbering of the cells will be used; the absolute number of a cell with coordinates $x$ and $y$ in a grid of $C$ columns is defined as $(y \times C) + x$. Internally, NDM uses a bitwise representation of the species distributions and the sets of cells; this approach allows calculation of unions or intersections easily for 32 species or cells at a time. (Analogous procedures have already been used in several parsimony programs; Allard et al., 1999; Moilanen, 1999; see also Goloboff, 2002, for a generalization to polytomies.)
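The absolute cell numbering and the set operations behind criteria 1 through 3 can be sketched in Python with integers used as bit masks. This is an illustration under stated assumptions, not the NDM source. The helper names (`cell_number`, `adjacent_cells`, `scores`), the 3 × 5 grid, and the species data are invented for the example. Adjacency is assumed to include diagonal neighbors. Python integers have arbitrary width, so there is no 32-bit word limit as in the program described above.

```python
def cell_number(x, y, columns):
    """Absolute number of the cell at column x, row y: (y * columns) + x."""
    return y * columns + x


def adjacent_cells(area, rows, columns):
    """Cells outside `area` touching it (8-neighbor adjacency is an assumption)."""
    touching = set()
    for cell in area:
        y, x = divmod(cell, columns)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < rows and 0 <= nx < columns:
                    touching.add(cell_number(nx, ny, columns))
    return touching - set(area)


def popcount(mask):
    """Number of species (set bits) in a bit mask."""
    return bin(mask).count("1")


def scores(area, species_in_cell, rows, columns):
    """Return (E1, E2, E3) for `area`; species sets are integer bit masks."""
    area = set(area)
    all_cells = set(range(rows * columns))
    inside = [species_in_cell.get(c, 0) for c in area]

    common = inside[0]                      # B: species present in every cell of A
    for mask in inside[1:]:
        common &= mask
    identical = all(mask == inside[0] for mask in inside)

    outside = 0                             # species found anywhere outside A
    for c in all_cells - area:
        outside |= species_in_cell.get(c, 0)

    nonadjacent = 0                         # V: species found in cells of N_A
    for c in all_cells - area - adjacent_cells(area, rows, columns):
        nonadjacent |= species_in_cell.get(c, 0)

    e1 = popcount(common & ~outside) if identical else 0
    e2 = popcount(common & ~nonadjacent) if identical else 0
    e3 = popcount(common & ~nonadjacent)
    return e1, e2, e3


# Hypothetical data: four species as single bits, a 3-row by 5-column grid.
X, Y, Z, W = 1, 2, 4, 8
ROWS, COLS = 3, 5
area = [cell_number(x, y, COLS) for y in (0, 1) for x in (1, 2)]

grid = {c: X | Y | Z | W for c in area}
grid[cell_number(3, 0, COLS)] = X           # X also occurs in an adjacent cell
grid[cell_number(4, 2, COLS)] = W           # W also occurs in a nonadjacent cell
result_identical = scores(area, grid, ROWS, COLS)

grid_uneven = dict(grid)
grid_uneven[cell_number(2, 1, COLS)] = X | Y | W   # Z missing from one cell of A
result_uneven = scores(area, grid_uneven, ROWS, COLS)
```

In the first data set only Y and Z count under criterion 1, while X is added under criteria 2 and 3 because its only outside record is in an adjacent cell. In the second data set the cells of the area no longer have identical composition, so only criterion 3 gives a positive score.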
During the search stage, NDM actually examines only the areas having more than a single cell (those with single cells can be easily examined later). To enumerate all possible combinations of cells, NDM starts with an empty set. To this set, it adds first cell number 0 (upper left corner) and tries all combinations of the remaining cells together with cell 0. Then it eliminates cell 0, adds cell 1, and tries all possible combinations of the remaining cells. This procedure is repeated until the first cell included in the set is the one before the last cell in the grid (lower right corner), in which case only one two-cell set can be generated. The possible combinations of cells are always examined in the same orderly fashion.

The procedure described allows generation of all possible sets of cells. Each of the combinations must be evaluated for continuity (disjunct areas are ignored) and, if continuous, assigned a score under the criterion (or criteria) in effect. Actual examination of each possible combination in this fashion is extremely time consuming (requiring several hours even for small data sets), but many of the sets can be implicitly rejected by predicting that they will be discontinuous or that they will have a low number of endemic species.

Discontinuous sets will have gaps. For example, in a grid with eight columns, the set formed by cells 0, 1, 5, and 6 is discontinuous. The mere existence of some gap (such as 2, 3, and 4 in the example) is not enough to deduce that any resulting set will be discontinuous; e.g., adding cells 10, 11, and 12 to the original set will make it continuous. However, whenever a gap is longer than the number of columns plus 2, any resulting set produced by adding subsequent cells will be discontinuous (e.g., in an eight-column grid, the set formed by 0 and 10 is discontinuous, and no possible addition of cells beyond 10 can make it continuous). When a partial combination of cells contains such a long gap, all the sets that result from adding further cells to that partial combination are ignored.

For criteria 1 through 3, predicting which species can potentially contribute to the score is easy because these criteria require that a species be found in each and every cell of the area to count as endemic. As each cell is added to the set, the intersection of the species contained in the new cell with those previously included is calculated. If the intersection for a set of cells is empty, it follows that the intersection of any possible additional combination of cells will also be empty, and then those additional cells are never added. Actually, NDM checks whether the partial intersection has fewer members than a given minimum score; obviously, searching for areas with larger scores speeds up calculations because it interrupts calculations earlier.

For criterion 4, the calculations are more difficult because a species can contribute to the score even if it does not occur in each cell of the area. A good lower bound on the score can be obtained by calculating an enlarged distribution for each species (done before the search itself starts and stored in memory). For such enlarged distribution, a cell is considered as having the species present if the species satisfies the requirements of criterion 4 in that cell (i.e., actually present in at least one adjacent cell, absent in no more than $Q$ cells). The intersection of the species (with enlarged distributions) in the cells of a given set of areas will be a superset of the set of species actually giving a score under criterion 4 for that area. Thus, if the number of members in the intersection of the species occurring in the enlarged distribution in a set of cells is less than the minimum score, it follows that no set formed by adding further cells can have an $E_4$ equal to or greater than the minimum score. This is true as long as the distributions have been enlarged by allowing up to seven empty cells around a given cell and not checking around cells actually occupied. If the number of allowed empty cells is less, the number of species contributing to the score can be underestimated because a cell may be surrounded by some number of empty cells in the full grid but by a smaller number when an area is defined (if the area excludes some of the cells that did not have the species; only the empty cells belonging to the area are counted). Thus, some areas with a positive score (optimal or not) may be missed during the search. The likelihood of missing positive areas in a given case depends on the relative numbers of empty cells used to create the enlarged species distributions and to evaluate the areas. Allowing up to five empty cells when enlarging species distributions is unlikely to create errors if the areas are to be evaluated allowing up to two empty cells but is very likely to create errors if the areas are
to be evaluated allowing up to five empty cells.

The enlargement of species distributions allowing for fewer empty cells can find areas that are contractions of the actual optimal areas, i.e., areas that are produced by eliminating some cells from the actual optimal area. Some of these errors (not necessarily all) will be remedied if a heuristic addition of cells, one at a time, is done for each of the cells found, retaining (and submitting to the same procedure) each of the enlarged areas that has a positive score.

Additional speed can be obtained by identifying in advance species that cannot contribute to the score of a given area by virtue of occurring in nonadjacent cells. A cell that is columns + 2 positions before the first cell in a set and a cell that is columns + 2 positions beyond the last cell in the set will by necessity be discontinuous (i.e., nonadjacent) to the area. For each cell $i$, a set $F_i$ can be calculated as $F_i = F_{i-1} \cup S_i$ and a set $B_i$ as $B_i = B_{i+1} \cup S_i$ (where $S_i$ is the set of species occurring in cell $i$); this calculation is done before the search starts. Then, during the search, if the first cell in the set is in position $i$, any species occurring in the set $F_{i-(\mathrm{columns}+2)}$ cannot contribute to the score and can be eliminated from the set of species potentially contributing to the score. (As before, if fewer species than the minimum score occur in that set, there is no need to form all the areas that result from adding further cells to the present set of cells.) Before evaluating a given area, all the species in $B_{j+\mathrm{columns}+2}$ (where $j$ is the last cell of the set) can be eliminated from the set of species potentially contributing to the score. (This saves less time than checking against $F$ but still saves some time because some areas can be rejected easily without further evaluation.)

Because higher minimum scores allow for a quicker rejection of many areas, they produce faster searches. Using all the shortcuts described above, NDM can analyze data sets of medium size in reasonable times. On a 266-MHz Pentium II machine, the areas with score ≥2 for a real matrix of carabid beetles, with a grid of 10 × 15 and 33 species (actually occurring in 42 cells) can be found in 664 sec, the areas with score ≥3 in 1.97 sec, and the areas with score ≥4 in 0.69 sec. The areas with score ≥2 can be found in only 1.05 sec if the enlarged species distributions are calculated allowing for up to four empty cells, and the differences from the correct results are minimal. For larger problems, it is possible to find good solutions by constraining the search to a given region; only those sets contained within the region are evaluated. The candidate regions can be selected by analyzing the data with enlarged grid cells (e.g., reducing the number of rows and columns to a half or a third) and then constraining the search to the corresponding region of the larger data set.

FURTHER CONSIDERATIONS

A possibility that has not been discussed so far is that of conflict between the areas with a positive score under some of the criteria. It is of course possible, given conflicting distributions, that two sets of cells, where one is a subset of the other, both have positive scores (under criteria 2 through 4). The one with the largest score is the one more strongly supported by the evidence. If two partially overlapping areas have the same score, either the evidence is ambiguous regarding which of the areas is an area of endemism or both represent real phenomena. (If each is supported by the congruent distribution of many taxa, the taxa may simply be responding to different factors, such as terrestrial vs. aquatic organisms.) Another possibility is that several subsets of an optimal area will also have some positive score. This result does not really represent conflict but simply reflects the fact that some species may have their ranges further contracted. As implemented in NDM, such smaller areas will not be considered; the program eliminates them. The situation is different, of course, if the smaller area has a larger score (under some criteria), in which case both areas are saved. Ideally, the comparison should take into account whether the scores for the larger and smaller areas are given by different sets of species, and if so, it should retain both areas (this option has not yet been implemented).

Whether an area X in conflict with another area Y of higher score is reported by the program or not may depend in turn on whether area Y itself is in conflict with another area (e.g., Z) of even higher score. If so, area Y must be eliminated (because it loses against Z), and X will be retained. Thus, NDM cannot check for conflict between the areas as it finds them. If it did so, finding first Y, then X, then Z, it would miss area X; when X is found and compared to Y, it is discarded, and when
2002 SZUMIK ET AL.—OPTIMALITY CRITERION FOR ENDEMICITY 813

Z is found, it discards Y. Only finding X after both Y and Z are found would produce the correct result. To avoid this problem, NDM stores all the areas with positive score that it finds during the search, and only when the search is finished does it globally compare all the areas for conflict.

The four criteria for scoring can be used simultaneously during a search. Because each criterion is a relaxation of the preceding one(s), the criteria do not actually contradict each other but give instead complementary information.

A REAL EXAMPLE: REANALYSIS OF SCIOBIUS SCHÖNHERR

Morrone (1994) analyzed, using parsimony, a matrix of 47 species of Sciobius (Coleoptera: Curculionidae) from South Africa in a grid with 21 occupied cells. On the consensus from 289 optimal trees, Morrone (1994) proposed three areas of endemism (Fig. 6). Area 1 (cells I, J, L, and M) is defined by having five species; there are seven species as synapomorphies of this area, but Morrone indicated only five, presumably by considering that only these five were endemic. Area 2 (cells N, O, R, S, and T) is defined by having two species, and area 3 (cell P) is defined by having seven species (here, Morrone counted only the autapomorphies). The same matrix analyzed under criteria 3 and 4 with NDM (allowing for up to two empty cells around each cell in the area) obtained a total of 16 areas (in 1.17 sec running on a 266-MHz Pentium II machine), as shown in Figure 7 (the two single cell areas, 4-3 and 4-5, N and P in Morrone's grid, are not shown in that figure). The three areas of largest score are the first three in Figure 7. Area 1 completely includes areas 8 and 9 (all of lower E4, but reported by NDM because they have higher E3) and is completely included in areas 4, 5, 6, and 7 (all of lower E4). Area 2 is in conflict with area 10 and completely includes area 11 (both of lower E4). Area 3 is in conflict with area 13, is included completely in area 12, and includes area 14 (the three with lower E4). Area 1 of Morrone (1994) is equivalent to our area 1, and area 2 of Morrone (1994) is equivalent to our area 12 (which is suboptimal according to our criterion). Area 3 of Morrone is equivalent to one of our single-cell areas. Morrone's analysis did not recognize any possible equivalent of our area 2 nor any equivalent of the single cell area N.

Even for the areas that appear (identical or very similar) in the analysis of Morrone (1994), there are significant differences in the species that define the areas. Area 1 is diagnosed under criterion 4 as having 17 endemic species (see Fig. 8). Of these 17 species, only 7 (6, 7, 10, 12, 22, 23, and 46) appear as synapomorphies of the area when mapped most parsimoniously onto the consensus tree; Morrone (1994) actually showed only 5 species (he did not show 12 and 23). (Morrone [1994] mapped the characters onto the consensus tree; we consider that it is better to map the individual trees, but we use the consensus for comparability with Morrone's results.) Some of the species contributing to the score under criterion 4 do not appear as synapomorphies under parsimony because they are not found in all the cells forming the area. Species 23 (S. marshalli) appears as a synapomorphy under parsimony, but because it is also present in the nonadjacent cells A and B it seems illogical to count it as supporting endemicity. Thus, for the distribution of Sciobius, the criteria proposed here produce more reasonable results than parsimony.
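The order dependence that motivates NDM's deferred conflict check (the X, Y, Z case discussed before the Sciobius example) can be illustrated with a minimal Python sketch. This is not NDM's code. The `Area` class, both function names, the numeric scores, and the rule of resolving conflicts from best to worst score are assumptions made only for illustration; the text states only that Y displaces X, that Z displaces Y, and that X and Z do not conflict.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Area:
    name: str
    score: float  # hypothetical endemicity score


def conflicts(a, b, conflict_pairs):
    """True if the two areas are listed as conflicting (assumed given)."""
    return frozenset((a.name, b.name)) in conflict_pairs


def filter_incrementally(found_in_order, conflict_pairs):
    """Compare each area with those kept so far, as soon as it is found."""
    kept = []
    for area in found_in_order:
        rivals = [k for k in kept if conflicts(area, k, conflict_pairs)]
        if any(r.score >= area.score for r in rivals):
            continue  # a conflicting area at least as good is already kept
        kept = [k for k in kept if k not in rivals]  # new area beats its rivals
        kept.append(area)
    return {a.name for a in kept}


def filter_after_search(found_in_order, conflict_pairs):
    """Store every area first, then resolve conflicts once, best score first."""
    kept = []
    for area in sorted(found_in_order, key=lambda a: a.score, reverse=True):
        if not any(conflicts(area, k, conflict_pairs) for k in kept):
            kept.append(area)
    return {a.name for a in kept}


# Hypothetical scores: Z beats Y, Y beats X; X and Z do not conflict.
X, Y, Z = Area("X", 1.0), Area("Y", 2.0), Area("Z", 3.0)
PAIRS = {frozenset(("X", "Y")), frozenset(("Y", "Z"))}

print(sorted(filter_incrementally([Y, X, Z], PAIRS)))  # X is lost
print(sorted(filter_after_search([Y, X, Z], PAIRS)))   # X is retained
```

With the discovery order Y, X, Z the incremental filter drops X before Z has removed Y, whereas the deferred filter gives the same answer for any discovery order.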

FIGURE 6. Grid used by Morrone (1994) in his analysis of Sciobius. The areas marked are the ones selected by Morrone's method. Area 1 = cells I, J, L, and M (medium shading); area 2 = cells N, O, R, S, and T (dark shading); area 3 = cell P (light shading).

FIGURE 7. The 14 sets with positive E3 or E4 for the data of Morrone (1994).

FIGURE 8. Distributions of the 17 species endemic to area 1 in Figure 7, according to criterion 4.

CONCLUSIONS AND PROSPECTS

The method proposed here is only a first approximation of a solution to the problem of identifying areas of endemism. The method could be improved in many ways that would still reflect its general spirit and approach.

The first aspect is the continuity of the area of endemism; as the criteria are defined here, the areas of endemism resulting from habitat fragmentation (due to many possible causes) cannot be recognized as such. It would be desirable to modify the criteria in such a way that disjunct areas can be recognized. Modifications of the criteria for meaningful evaluation of disjunct areas are currently being investigated.

Another aspect that should be improved is the all-or-none aspect of the method; a given species either contributes to the score or it does not. Ideally, species that adjust well to the expectation of endemicity should contribute to the score more than species that adjust poorly (in a proportion that depends on how well the species adjust to endemicity). A possibility is to weight a species according to the proportion of cells in the area that are effectively occupied by the species or by the ratio of occupied cells inside and outside the area, or by both methods.

Aside from those possible improvements, a better insight into the properties of the method can be gained by testing the method on randomly generated distributional data. Another aspect that must be studied more closely is the effect of the grid cell size on the results (for a brief discussion, see Morrone, 1994). More detailed analyses along these lines are currently being carried out, and their results will be published elsewhere.

ACKNOWLEDGMENTS

We thank the CONICET (PIP 4974) and FONCYT (PICT 01-04347) for support. Helpful comments from James Carpenter, Jonathan Coddington, Peter Linder, Roderic Page, Norman Platnick, Martín Ramírez, and reviewers Juan Morrone, Marco van Veller, and Rino Zandee are greatly appreciated.

REFERENCES

ALLARD, M., J. FARRIS, AND J. CARPENTER. 1999. Congruence among mammalian mitochondrial genes. Cladistics 15:75–84.
BROOKS, D. R. 1990. Parsimony analysis in historical biogeography and coevolution: Methodological and theoretical update. Syst. Biol. 39:14–30.
FAITH, D. P. 1992. Conservation evaluation and phylogenetic diversity. Biol. Conserv. 61:1–10.
GOLOBOFF, P. A. 2001. NDM and VNDM: Programs for analysis of endemicity. Distributed by the author, San Miguel de Tucumán, Tucumán, Argentina.
GOLOBOFF, P. A. 2002. Optimization of polytomies: State set and parallel operations. Mol. Phylogenet. Evol. 22:269–275.
HAROLD, A. S., AND R. D. MOOI. 1994. Areas of endemism: Definition and recognition criteria. Syst. Biol. 43:261–266.
LINDER, P. 2001. On areas of endemism, with an example from the African Restionaceae. Syst. Biol. 50:892–912.
MOILANEN, A. 1999. Searching for most parsimonious trees with simulated evolutionary optimization. Cladistics 15:39–50.
MORRONE, J. J. 1994. On the identification of areas of endemism. Syst. Biol. 43:438–441.
NELSON, G., AND P. LADIGES. 1996. Paralogy in cladistic biogeography and analysis of paralogy-free subtree. Am. Mus. Novit. 3167:1–58.
NELSON, G., AND N. I. PLATNICK. 1981. Systematics and biogeography: Cladistics and vicariance. Columbia Univ. Press, New York.
PAGE, R. D. M. 1994. Maps between trees and cladistic analysis of historical associations among genes, organisms, and areas. Syst. Biol. 43:58–77.
PLATNICK, N. I. 1991. On areas of endemism. Aust. Syst. Bot. 4:xi–xii.
PLATNICK, N. I. 1992. Patterns of biodiversity. Pages 15–24 in Systematics, ecology, and the biodiversity crisis (N. Eldredge, ed.). Columbia Univ. Press, New York.
PRESSEY, R. L., C. J. HUMPHRIES, C. R. MARGULES, R. I. VANE-WRIGHT, AND P. WILLIAMS. 1993. Beyond opportunism: Key principles for systematic reserve selection. Trends Ecol. Evol. 8:124–128.
RODRIGUES, A., J. ORESTES CERDEIRA, AND K. GASTON. 2000. Flexibility, efficiency, and accountability: Adapting reserve selection algorithms to more complex conservation problems. Ecography 23:565–574.
RONQUIST, F. 1997. Dispersal–vicariance analysis: A new approach to the quantification of historical biogeography. Syst. Biol. 46:195–203.
VANE-WRIGHT, R., C. J. HUMPHRIES, AND P. H. WILLIAMS. 1991. What to protect—systematics and the agony of choice. Biol. Conserv. 55:235–254.
WILLIAMS, P. H. 1996. WORLDMAP 4: Program and documentation. Distributed by the author, www.nhm.ac.uk/science/projects/worldmap

First submitted 11 December 2001; reviews returned 11 June 2002; final acceptance 22 July 2002
Associate Editor: Roderic Page
