Tree Consensus

Systematics - Bio 615
Confidence - Assessment of the Strength of the Phylogenetic Signal - part 2
1. Consistency Index
2. g1 statistic, PTP - test
3. Consensus trees
4. Decay index (Bremer Support)
5. Bootstrapping / Jackknifing
6. Statistical hypothesis testing (frequentist)
7. Posterior probability (see lecture on Bayesian)

Derek S. Sikes University of Alaska
Multiple optimal trees
•  Many methods can yield multiple equally optimal trees
•  We can further select among these trees with additional criteria, but typically, relationships common to all the optimal trees are summarized with consensus trees

Multiple optimal trees
•  If multiple optimal trees are found we know that all of them are wrong except, possibly (hopefully), one (as species tree, not gene trees)
•  Some have argued against consensus tree methods for this reason
•  Debate over quest for true tree (point estimate) versus quantification of uncertainty

Consensus methods
•  A consensus tree is a summary of the agreement among a set of fundamental trees
•  There are many consensus methods that differ in:
  1. the kind of agreement
  2. the level of agreement
•  Consensus methods can be used with multiple trees from a single analysis or from multiple analyses

Strict consensus methods
•  Strict consensus methods require agreement across all the fundamental trees
•  They show only those relationships that are unambiguously supported by the data
•  The commonest method (strict component consensus) focuses on clades/components/full splits

Strict consensus methods
•  This method produces a consensus tree that includes all and only those full splits found in all the fundamental trees
•  Other relationships (those in which the fundamental trees disagree) are shown as unresolved polytomies
•  Can be less optimal than any of the optimal trees
•  Simplest to interpret

Strict consensus methods
[Figure: TWO FUNDAMENTAL TREES with tip orders A B C D E F G and A B C E D F G, and the STRICT CONSENSUS TREE with tips A B C D E F G]
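
The strict component consensus described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: trees are written as nested tuples, groups are treated as rooted clades rather than unrooted splits, and the two topologies are hypothetical (the slide figure gives only the tip order, not the branching). The function names are made up for this sketch and are not PAUP* commands.

```python
def clades(tree):
    """Return the non-trivial clades of a nested-tuple tree as frozensets of tip names."""
    found = set()

    def tips(node):
        if isinstance(node, str):          # a tip
            return frozenset([node])
        members = frozenset().union(*(tips(child) for child in node))
        found.add(members)                 # every internal node defines a clade
        return members

    all_tips = tips(tree)
    found.discard(all_tips)                # the clade of all taxa is uninformative
    return found


def strict_consensus_clades(trees):
    """Keep all and only those clades found in every fundamental tree."""
    return set.intersection(*(clades(t) for t in trees))


# Hypothetical fundamental trees that disagree only about D and E.
tree1 = ((("A", "B"), "C"), (("D", "E"), ("F", "G")))
tree2 = ((("A", "B"), "C"), ("D", ("E", ("F", "G"))))

strict = strict_consensus_clades([tree1, tree2])
for clade in sorted(strict, key=lambda c: (len(c), sorted(c))):
    print("".join(sorted(clade)))
```

The clades (D,E) and (E,F,G) each occur in only one tree, so they drop out and D and E join an unresolved polytomy in the consensus.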
Majority rule consensus
•  Majority-rule consensus methods require agreement across a majority of the fundamental trees
•  May include relationships that are not supported by the most parsimonious interpretation of the data
•  The commonest method focuses on clades/components/full splits

Majority rule consensus
•  This method produces a consensus tree that includes all and only those full splits found in a majority (>50%) of the fundamental trees
•  Other relationships are shown as unresolved polytomies
•  Of particular use in bootstrapping and Bayesian inference (best not to use for single searches)
•  Implemented in PAUP* and MrBayes

Majority rule consensus
[Figure: THREE FUNDAMENTAL TREES with tip orders A B C D E F G, A B C E F D G, and A B C E D F G, and the MAJORITY-RULE CONSENSUS TREE with tip order A B C E D F G. Numbers (100 and 66) indicate frequency of clades in the fundamental trees]

Majority rule consensus
Majority Rule Consensus trees are used for
1. Summarizing multiple equally optimal trees from one search (but they shouldn’t be!)
2. Summarizing the results of a bootstrapping analysis (multiple searches)
3. Summarizing the results of a Bayesian analysis
Don’t confuse these! The numbers on the branches mean very different things in each case
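
A minimal sketch of the majority-rule calculation follows, under the same kind of assumptions as any toy example: nested-tuple trees, rooted clades instead of unrooted splits, and three hypothetical topologies chosen so that one clade occurs in two of three trees. The `cutoff` parameter is an illustrative name for the ">50%" threshold, not an option of any particular program.

```python
from collections import Counter


def clades(tree):
    """Return the non-trivial clades of a nested-tuple tree as frozensets of tip names."""
    found = set()

    def tips(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(tips(child) for child in node))
        found.add(members)
        return members

    found.discard(tips(tree))              # drop the clade containing every taxon
    return found


def majority_rule_clades(trees, cutoff=0.5):
    """Map each clade found in more than `cutoff` of the trees to its frequency (%)."""
    counts = Counter()
    for tree in trees:
        counts.update(clades(tree))
    n = len(trees)
    return {c: 100 * k / n for c, k in counts.items() if k / n > cutoff}


# Hypothetical fundamental trees: (D,E) occurs in two of the three.
trees = [
    ((("A", "B"), "C"), (("D", "E"), ("F", "G"))),
    ((("A", "B"), "C"), ("D", ("E", ("F", "G")))),
    ((("A", "B"), "C"), (("D", "E"), ("F", "G"))),
]

support = majority_rule_clades(trees)
for clade, freq in sorted(support.items(), key=lambda item: sorted(item[0])):
    print("".join(sorted(clade)), int(freq))   # truncated to whole percent
```

The arithmetic is identical whether the input trees are equally optimal trees, bootstrap trees, or a Bayesian sample; only the meaning of the resulting numbers differs, which is the point of the warning above.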
Reduced consensus methods
[Figure: TWO FUNDAMENTAL TREES with tip orders A B C D E F G and A G B C D E F. The strict component consensus (A B C D E F G) is completely unresolved. The AGREEMENT SUBTREE from PAUP* (A B C D E F) is resolved; taxon G is excluded]

Consensus methods
[Figure: three fundamental trees of the ciliate taxa (Ochromonas, Symbiodinium, Prorocentrum, Loxodes, Tetrahymena, Spirostomum, Euplotes, Tracheloraphis, Gruberia) and their strict consensus, majority-rule consensus (clade frequencies 100 and 66), and agreement subtree (Euplotes excluded)]

Consensus methods
•  Use strict methods to identify those relationships unambiguously supported by parsimonious interpretation of the data
•  Use reduced methods where consensus trees are poorly resolved
•  Avoid methods which have ambiguous interpretations. Prevent possible confusion between MR consensus for an optimal tree search and a MR consensus for a bootstrapping search

Recall
•  Stochastic error vs Systematic error
•  These assessment methods help identify stochastic error
–  How repeatable are the results?
–  How strongly do the data support them?
–  This is a measure of precision (which is hopefully related to accuracy)

Accuracy and Precision
•  Accuracy
–  Accuracy is correctness. How close a measurement is to the true value. (unless we know the “true tree” in advance we cannot measure this)
•  Precision
–  Precision is reproducibility. How closely two or more measurements agree with one another. (this we can measure!)

Branch Support
•  Several methods have been proposed that attach numerical values to internal branches in trees that are intended to provide some measure of the strength of support for those branches and the corresponding groups
•  These methods include:
  - The Bootstrap (BS) and jackknife
  - Decay analyses (aka Bremer Support)
  - Bayesian Posterior Probabilities (PP or BPP)

Decay analysis
•  In parsimony analysis, a way to assess support for a group is to see if the group occurs in slightly less parsimonious trees also
•  The length difference between:
the shortest trees including the group and
the shortest trees that exclude the group
(the extra steps required to collapse a group)
is the decay index or Bremer support
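
The definition above reduces to a subtraction once the relevant tree lengths are known. The sketch below assumes the candidate trees have already been found and scored by some search (in practice this needs a constrained search, because the list must contain the shortest tree that lacks the group); the tree lengths and clades are hypothetical, and `decay_index` is an illustrative name.

```python
def decay_index(scored_trees, group):
    """Decay index (Bremer support) of `group`.

    scored_trees: list of (tree_length, set_of_clades) pairs, where each clade
    is a frozenset of taxon names. Returns the length of the shortest tree
    lacking the group minus the length of the shortest tree containing it.
    """
    group = frozenset(group)
    with_group = [length for length, cl in scored_trees if group in cl]
    without_group = [length for length, cl in scored_trees if group not in cl]
    if not with_group or not without_group:
        raise ValueError("need at least one tree with and one without the group")
    return min(without_group) - min(with_group)


AB, CD = frozenset("AB"), frozenset("CD")

# Hypothetical lengths: the shortest tree (100 steps) contains both groups.
scored = [
    (100, {AB, CD}),
    (102, {CD}),      # shortest tree found without (A,B)
    (103, {AB}),      # shortest tree found without (C,D)
]

print(decay_index(scored, "AB"))
print(decay_index(scored, "CD"))
```

Collapsing (A,B) costs two extra steps and collapsing (C,D) costs three, so (C,D) has the higher decay index in this toy case.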
Decay analysis - example
[Figure: two trees of the ciliate taxa with decay indices on the branches, one from the Ciliate SSUrDNA data and one from randomly permuted data. Values recoverable from the figure: +45, +27, +15, +10, +8, +7, +3, +1, +1]

Decay analyses - in practice
•  Decay indices for each clade can be determined by:
-  Using PAUP* to search for the shortest tree that lacks the branch of interest using reverse topological constraints
-  with the Autodecay or TreeRot programs (in conjunction with PAUP*) - MacClade 4 will also help prepare for a Decay analysis
-  An excellent use for the Parsimony Ratchet - because finding the shortest tree length is all that matters (not finding multiple shortest trees)

Decay indices - interpretation
•  Generally, the higher the decay index the better the relative support for a group
•  Like Bootstrap values (BS), decay indices may be misleading if the data are misleading
•  Magnitude of decay indices and BS generally correlated (i.e. they tend to agree)
•  Only groups found in all most parsimonious trees have decay indices > zero

Decay indices - interpretation
•  Unlike BS, decay indices are not scaled (0-100)
–  This has the advantage that the value can exceed 100 whereas BS “tops out” at 100, meaning that we cannot distinguish between the support of two branches with BS values of 100 although one might have a far greater decay index than the other
•  It is even less clear what is an acceptable decay index than a BS value…
–  Unlike the BS value very little work has examined the properties and behavior of decay indices


Decay indices - interpretation
One key study is that of DeBry (2001)
–  He showed that decay indices should be interpreted in light of branch lengths
–  That the same values, even within the same tree, do not represent the same support if the branch lengths differ
-  i.e. Decay Indices are not easily comparable as measures of branch support
-  Values < 4 should be considered weak regardless of branch length
DeBry, R.W. (2001) Improving interpretation of the Decay Index for DNA sequence data. Systematic Biology 50: 742-752.

[Figure: Decay values versus Bootstrap and Jackknife values from one empirical study]
Norén, M. & U. Jondelius. 1999. Phylogeny of the Prolecithophora (Platyhelminthes) inferred from 18S rDNA sequences. Cladistics 15: 103-112.

Bootstrapping (non-parametric)
•  Bootstrapping is a statistical technique that uses computer intensive random resampling of data to determine sampling error or confidence intervals for some estimated parameter
•  Introduced to phylogenetics by Felsenstein in 1985
•  Based on idea of Efron (1979)

Bootstrapping (non-parametric)
1. Characters are sampled with replacement to create many (100-1000) bootstrap replicate data sets (think shuffle vs random play of music)
2. Each bootstrap replicate data set is analysed (e.g. with parsimony, distance, ML)
3. Agreement among the resulting trees is summarized with a majority-rule consensus tree
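
Step 1 above can be sketched as follows. The example assumes the data matrix is held as a dict of taxon name to character-state string (the small R/Y matrix used in this lecture's bootstrapping figure); the function name and the fixed random seed are illustrative choices. Steps 2 and 3 (tree search and consensus) are not shown.

```python
import random


def bootstrap_replicate(matrix, rng):
    """Resample character columns with replacement, keeping the original size.

    matrix: dict mapping taxon name to a string of character states.
    Returns the replicate matrix and the list of sampled column indices.
    """
    n_chars = len(next(iter(matrix.values())))
    columns = [rng.randrange(n_chars) for _ in range(n_chars)]
    replicate = {
        taxon: "".join(states[i] for i in columns)
        for taxon, states in matrix.items()
    }
    return replicate, columns


matrix = {
    "A":     "RRYYYYYY",
    "B":     "RRYYYYYY",
    "C":     "YYYYYRRR",
    "D":     "YYRRRRRR",
    "Outgp": "RRRRRRRR",
}

rng = random.Random(0)   # seeded only so the example is repeatable
replicate, columns = bootstrap_replicate(matrix, rng)
print([c + 1 for c in columns])   # 1-based character numbers drawn
for taxon, states in replicate.items():
    print(f"{taxon:6s}{states}")
```

Because sampling is with replacement, some characters appear more than once in a replicate and others are left out, which is the "random play" rather than "shuffle" behaviour mentioned in step 1.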
Bootstrapping (non-parametric)
•  Frequency of occurrence of groups, bootstrap support (BS), is a measure of support for those groups
•  Additional information is given in partition tables (for groups below 50% support)
•  Can ask PAUP* to create MR con-tree of higher cut-off, e.g. 80% - all weaker branches collapse

Bootstrapping
Original data matrix
         Characters
Taxa     1 2 3 4 5 6 7 8
A        R R Y Y Y Y Y Y
B        R R Y Y Y Y Y Y
C        Y Y Y Y Y R R R
D        Y Y R R R R R R
Outgp    R R R R R R R R

Resampled data matrix
         Characters
Taxa     1 2 2 5 5 6 6 8
A        R R R Y Y Y Y Y
B        R R R Y Y Y Y Y
C        Y Y Y Y Y R R R
D        Y Y Y R R R R R
Outgp    R R R R R R R R

Randomly resample characters from the original data with replacement to build many bootstrap replicate data sets of the same size as the original - analyse each replicate data set
Summarize the results of multiple analyses with a majority-rule consensus tree
Bootstrap values (BS) are the frequencies with which groups are encountered in analyses of replicate data sets
[Figure: trees of taxa A, B, C, D and the outgroup from the original and resampled data, with supporting characters marked on the branches, and the majority-rule consensus tree with bootstrap values of 96% and 66%]

Bootstrapping - an example
Ciliate SSUrDNA - parsimony bootstrap
[Figure: majority-rule consensus tree with bootstrap values on the branches (100, 100, 100, 100, 96, 84). Taxon numbers: Ochromonas (1), Symbiodinium (2), Prorocentrum (3), Loxodes (4), Tracheloraphis (5), Spirostomum (6), Gruberia (7), Euplotes (8), Tetrahymena (9)]

Partition Table
123456789    Freq
-----------------
.**......  100.00
...**....  100.00
.....**..  100.00
...****..  100.00
...******   95.50
.......**   84.33
...****.*   11.83
...*****.    3.83
.*******.    2.50
.**....*.    1.00
.**.....*    1.00

Bootstrapping - random data
Randomly permuted data - parsimony bootstrap
[Figure: trees from the randomly permuted data, including a 50% majority-rule consensus (with minority components); branch values shown include 71, 59, 26, 21, and 16]

Partition Table
123456789    Freq
-----------------
.*****.**   71.17
..**.....   58.87
....*..*.   26.43
.*......*   25.67
.***.*.**   23.83
...*...*.   21.00
.*..**.**   18.50
.....*..*   16.00
.*...*..*   15.67
.***....*   13.17
....**.**   12.67
....**.*.   12.00
..*...*..   12.00
.**..*..*   11.00
.*...*...   10.80
.....*.**   10.50
.***.....   10.00

Bootstrapping
The probability of a character being omitted from a bootstrap sample, P = (1 - 1/N)^N, ranges from 0 to 0.368 (depending on N, the number of characters)
N    P
1    0
2    0.25
3    0.30
4    0.32
…    0.368
Rule of thumb: a branch must be supported by 3 or more characters to be recovered in >95% of bootstraps

Bootstrap - interpretation
•  Bootstrapping was introduced as a way of establishing confidence intervals for phylogenies
•  This interpretation of bootstrap values depends on the assumption that the original data is a random sample from a much larger set of independent and identically distributed data (i.i.d.)
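
As a check on the omission probabilities and the rule of thumb on the Bootstrapping slide above, the sketch below computes the chance that a given set of k characters is entirely missing from a bootstrap sample of N characters, which is (1 - k/N)^N. It rests on a simplifying assumption that is not stated on the slide: the branch is taken to be recovered whenever at least one of its supporting characters is sampled and nothing contradicts it. The function name is illustrative.

```python
import math


def p_all_omitted(n_chars, k=1):
    """Probability that k specific characters are all absent from a bootstrap
    sample of n_chars draws made with replacement from n_chars characters."""
    return (1 - k / n_chars) ** n_chars


# Single character: reproduces the N / P table, approaching 1/e.
for n in (1, 2, 3, 4, 1000):
    print(n, round(p_all_omitted(n), 3))
print("limit", round(math.exp(-1), 3))

# Rule of thumb: with many characters, the chance of losing all k supporters
# approaches exp(-k).
for k in (1, 2, 3):
    print(k, round(p_all_omitted(1000, k), 3))
```

Under that assumption, two supporting characters are both lost in roughly 13.5% of replicates, whereas three are all lost in just under 5%, which is where the ">95% with 3 or more characters" rule comes from.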
Bootstrap - interpretation
•  However, several things complicate this interpretation
-  These assumptions are often wrong - making any strict statistical interpretation of BS invalid
-  Some theoretical work indicates that BS are very conservative (too low), and may underestimate confidence intervals - problem increases with numbers of taxa
-  BS can be high for incongruent relationships in separate analyses - and can therefore be misleading (misleading data -> misleading BS) recall the Mantra: The data are the things

“…bootstrapping provides us a confidence interval within which is contained not [necessarily] the true phylogeny but the phylogeny that would be estimated on repeated sampling of many characters from the underlying pool of characters.”
Joseph Felsenstein (1985)

Bootstrap - interpretation
Huelsenbeck & Rannala (2004) list 3 common interpretations
1. Probability that a clade is correct (accuracy)
2. Robustness of the results to perturbation (repeatability / precision)
3. Probability of incorrectly rejecting a hypothesis of monophyly (1-P): probability of getting that much evidence if, in fact, the group did not exist
Huelsenbeck, J.P. and Rannala, B. (2004) Frequentist properties of Bayesian posterior probabilities of phylogenetic trees under simple and complex substitution models. Systematic Biology 53: 904-913.

Bootstrap - interpretation
•  High BS (e.g. > 85%) is indicative of strong ‘signal’ in the data (some use 70% as the cutoff, there is no consensus as to which value is best)
•  Provided we have no evidence of strong misleading signal due to violation of assumptions (e.g. base composition biases, great differences in branch lengths) high BS values are likely to reflect strong phylogenetic signal
•  In other words, although technically they are meant to be a measure of precision, they are usually thought to be at least strongly correlated with accuracy

Bootstrap - interpretation
“Be suspicious of maximum bootstrap values… they might be due to systematic error.”
Paul Lewis

Bootstrap - interpretation
•  Low BS values, however, need not mean the relationship is false, only that it is poorly supported
–  This is especially true of morphological data
–  Morphologists often use the Decay index instead
•  Bootstrapping can be viewed as a way of exploring the robustness of phylogenetic inferences to perturbations in the balance of supporting and conflicting evidence for groups

Hillis, D.M. and Bull, J.J. (1993) An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Systematic Biology, 42: 182-192.

Jackknifing
•  Jackknifing is very similar to bootstrapping and differs only in the character resampling strategy
•  Some proportion of characters (e.g. 37%, 50%) are randomly selected and deleted
•  Replicate data sets are analyzed and the results summarized with a majority-rule consensus tree
•  Jackknifing and bootstrapping tend to produce broadly similar results and have similar interpretations - Jackknifing is preferred by cladists
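
The resampling step that distinguishes the jackknife can be sketched as below. This is one of several possible schemes and is an assumption of the sketch: a fixed number of characters (the stated proportion, rounded) is deleted without replacement, whereas some implementations instead delete each character independently with the given probability. The matrix is the small R/Y example from the bootstrapping figure and the function name is illustrative.

```python
import random


def jackknife_replicate(matrix, proportion_deleted, rng):
    """Delete a random subset of character columns (no replacement, no duplicates).

    matrix: dict mapping taxon name to a string of character states.
    Returns the reduced matrix and the sorted list of retained column indices.
    """
    n_chars = len(next(iter(matrix.values())))
    n_delete = round(proportion_deleted * n_chars)
    deleted = set(rng.sample(range(n_chars), n_delete))
    kept = [i for i in range(n_chars) if i not in deleted]
    reduced = {
        taxon: "".join(states[i] for i in kept)
        for taxon, states in matrix.items()
    }
    return reduced, kept


matrix = {
    "A":     "RRYYYYYY",
    "B":     "RRYYYYYY",
    "C":     "YYYYYRRR",
    "D":     "YYRRRRRR",
    "Outgp": "RRRRRRRR",
}

rng = random.Random(1)   # seeded only so the example is repeatable
reduced, kept = jackknife_replicate(matrix, 0.5, rng)
print([i + 1 for i in kept])   # 1-based numbers of the retained characters
```

Unlike a bootstrap replicate, a jackknife replicate is smaller than the original matrix and never contains the same character twice.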
[Figure annotation: = 1,089 pseudo datasets]
Bootstrap - interpretation
Two types of precision (Hillis & Bull 1993):
Precision of bootstrap value vs repeatability of finding a branch:
- Precision of bootstrap values increases with the number of bootstrap replicates (variance among analyses decreases)
- Repeatability tells us how likely we are to find the same results using a different but similar dataset - Felsenstein’s original idea
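
The first kind of precision can be illustrated with a simple binomial approximation. This is an assumption made for the sketch, not a result quoted from Hillis & Bull: each bootstrap replicate is treated as an independent trial that recovers the branch with a fixed probability p, so the standard error of the observed BS is sqrt(p(1 - p)/n) for n replicates.

```python
import math


def bs_standard_error(p, n_reps):
    """Binomial standard error of a bootstrap proportion p estimated from n_reps replicates."""
    return math.sqrt(p * (1 - p) / n_reps)


for n_reps in (100, 1000, 10000):
    se = bs_standard_error(0.70, n_reps)
    print(n_reps, f"{100 * se:.1f} percentage points")
```

More replicates tighten the estimate of the bootstrap value itself, but they say nothing about repeatability across different datasets, which is the second kind of precision listed above.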
Hillis & Bull (1993) examined precision, repeatability, and accuracy of the bootstrap
a) 1,089 BS of 100 reps each from 1 “real” dataset
b) 100 real datasets
“Comparison of these two distributions reveals that the process of bootstrap resampling is not the same as repeated, independent sampling of data.”

Bootstrap - interpretation
BS values have been criticized for a variety of reasons:
Sanderson, M.J. (1995) Objections to Bootstrapping Phylogenies: A Critique. Systematic Biology, 44: 299-320.
But the top reason has been that they seem to be too conservative - i.e. underestimates of the probability of the branch being correct - i.e. biased downward (erratically & unpredictably)

Bootstrap - interpretation
Hillis & Bull (1993) examined precision, repeatability, and accuracy of the bootstrap
- Found that BS provide a very imprecise measure of repeatability - so imprecise as to be worthless as a measure of repeatability
- Determined that in some cases a BS as low as 70% was equivalent to a 95% probability of being true - Bias confirmed by Newton (1996)
Hillis, D.M. and Bull, J.J. (1993) An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Systematic Biology, 42: 182-192.
Newton, M.A. (1996) Bootstrapping phylogenies: Large deviations and dispersion effects. Biometrika, 83: 315-328.

Low Support
Low branch support can result from
1. Conflicting data (homoplasy)
2. Lack of data - even a dataset with no homoplasy can yield poorly resolved trees if there are branches without change
3. Use of a poorly fitting model (too complex or too simple)
4. Artifact of mid-sized clades? “This indicates that, for all support measures on trees of a given size, the largest clades and the smallest clades are supported most strongly, whereas medium sized clades receive lower support”
Pickett, K.M. and Randle, C.P. (2005) Strange Bayes indeed: uniform topological priors imply non-uniform clade priors. Molecular Phylogenetics and Evolution 34: 203-211. SEE ALSO: Brandley, M. et al. (2006) Are unequal clade priors problematic for Bayesian phylogenetics? Systematic Biology 55: 138-146.

Terms - from lecture & readings
consensus methods
consensus tree
strict consensus
splits
majority rule consensus
reduced consensus trees
agreement subtree
branch support
Decay analysis
Decay index (Bremer Support)
DeBry (2001)
Bootstrapping
resampling with replacement
repeatability
jackknifing

Study questions
Describe the difference between a strict and majority rule consensus tree.
What were the key findings of DeBry in his (2001) paper on Decay Indices?
What is the rule of thumb in bootstrapping for a branch to receive > 95% support?
What are two common but different interpretations of bootstrap values? What did Hillis & Bull (1993) conclude regarding these interpretations?
What are two common explanations for low branch support?
