
Models and Tree Reconstruction Methods

(Answer the questions on another document and submit via LearningSuite)

Modeling the evolution of DNA sequence


Nucleotide sequences of the gamma-fibrinogen gene from artiodactyls (even-toed
ungulates), cetaceans (whales), perissodactyls (odd-toed ungulates), carnivores,
and a primate (chapter 7, example 7.3 in Nei & Kumar) are available from the
examples folder in MEGA or the MEP book website (http://lifesciences.asu.edu/mep/).
Get these in MEGA format (they should be .meg files) and open them in MEGA.

1. Conduct the following using MEGA5:


a. Look at the Data in the data explorer. Notice that there are multiple
whales and other mammals. For this homework assignment, I want
you to just work with a closely-related pair (e.g., whale-whale) and
a distantly-related pair (e.g., whale-human). So deselect the
species that you are not going to include in your comparisons.
b. Estimate the evolutionary distance and associated variance
(error) for the following models:
i. p distance
ii. Jukes-Cantor (JC)
iii. Kimura's 2-parameter (K2P)
iv. Tamura 3-parameter model
v. Maximum Composite Likelihood
To do this click on Distance -> Compute pairwise distance; use
these settings:
Variation estimation method -- Bootstrap method
Model/method -- use each of the ones above, one at a time
Substitutions to include -- d: Transitions + Transversions
Gaps/Missing data treatment -- Pairwise deletion
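
To make the "Bootstrap method" setting less of a black box, here is a minimal Python sketch of how a bootstrap standard error for a pairwise distance can be obtained: resample alignment columns with replacement, recompute the distance each time, and take the spread of the replicates. The function names, the default of 500 replicates, and the fixed seed are assumptions made for illustration; MEGA's own procedure may differ in its details.

```python
import random

def simple_p_distance(seq1, seq2):
    """Proportion of differing sites, skipping any column with a non-ACGT character."""
    pairs = [(a, b) for a, b in zip(seq1, seq2) if a in "ACGT" and b in "ACGT"]
    return sum(a != b for a, b in pairs) / len(pairs)

def bootstrap_se(seq1, seq2, distance=simple_p_distance, replicates=500, seed=0):
    """Standard error of a distance estimate from resampled alignment columns."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    columns = list(zip(seq1, seq2))    # one (base1, base2) pair per site
    estimates = []
    for _ in range(replicates):
        # Draw as many columns as the alignment has, with replacement.
        sample = [rng.choice(columns) for _ in columns]
        s1 = "".join(a for a, _ in sample)
        s2 = "".join(b for _, b in sample)
        estimates.append(distance(s1, s2))
    mean = sum(estimates) / replicates
    variance = sum((d - mean) ** 2 for d in estimates) / (replicates - 1)
    return variance ** 0.5
```

Calling `bootstrap_se(seq_a, seq_b)` on two aligned sequences of equal length returns the standard error; squaring it gives the variance. The sketch assumes every resample keeps at least one comparable site, which holds for ungapped toy data but is not checked.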

Compare the estimated distance by the various models.
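
As a rough guide to what the first three models do with the same pair of sequences, the following Python sketch computes the p distance, the Jukes-Cantor distance, and the Kimura 2-parameter distance from the standard textbook formulas. It treats gaps and ambiguous bases by pairwise deletion. The function names are made up for this illustration, and the code is not MEGA's implementation.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def count_differences(seq1, seq2):
    """Return (comparable sites, transitions, transversions) for two aligned sequences."""
    n = ts = tv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # pairwise deletion: skip gaps and ambiguous bases
        n += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            ts += 1   # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            tv += 1   # purine <-> pyrimidine
    return n, ts, tv

def p_distance(seq1, seq2):
    """Observed proportion of differing sites."""
    n, ts, tv = count_differences(seq1, seq2)
    return (ts + tv) / n

def jc_distance(seq1, seq2):
    """Jukes-Cantor: d = -(3/4) ln(1 - 4p/3)."""
    p = p_distance(seq1, seq2)
    return -0.75 * math.log(1 - 4 * p / 3)

def k2p_distance(seq1, seq2):
    """Kimura 2-parameter: d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q)."""
    n, ts, tv = count_differences(seq1, seq2)
    P, Q = ts / n, tv / n  # transition and transversion proportions
    return -0.5 * math.log(1 - 2 * P - Q) - 0.25 * math.log(1 - 2 * Q)
```

For closely related sequences the three values are nearly identical; the corrections only become noticeable as the observed proportion of differences grows, which is the pattern to look for in your table.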


Question 1. Make a table (in Excel or similar) to help illustrate the differences;
just record values for a closely-related pair (e.g., whale-whale) and a distantly-
related pair (e.g., whale-human).
*Shortcut: remember to deselect unneeded taxa from the data editor.

Question 2. Why are there multiple models?

Question 3. What are some of the differences between the models?

Question 4. With what types of data would one, in general, use a simpler or a
more complex model?

Strepsiptera
Female twistedwing flies are neotenic (they retain larval characteristics; see
figure) and totally endoparasitic in their hosts; males are not neotenic (right
figure, family Stylopidae). The question remains: are the Strepsiptera more
closely related to beetles (the traditional view) or to flies?

Apples & Oranges: On comparing trees
You will be making Maximum Parsimony (MP), Neighbor Joining (NJ), and
Maximum Likelihood (ML) trees. Your long-term goal is to figure out how to
compare these different methods so that you may intelligently choose which ones
to use. One comparison that may guide which method you prefer is to
determine whether there are any topological differences.

Reconstructing evolutionary history using Maximum Parsimony


We are going to first learn how to use MP. You will use a portion of the 18S small
subunit ribosomal gene alignment for 25 hexapod taxa; note that Diplura is the
outgroup.

Using MP in MEGA:
Open up the strepsiptera.fas file in the data explorer (you could put it in the
alignment explorer as well and then execute it by using the Phylogenetic Analysis
command as you did with the alignment assignment, but it is already aligned).
Question 5. How many total sites?
Question 6. How many variable sites?
Question 7. How many parsimony informative sites?
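
To make the three site categories concrete, here is a small Python sketch that counts them in a toy alignment. A site is variable if at least two different bases occur, and parsimony informative if at least two different bases each occur in at least two sequences. The function name is an assumption for illustration, and gaps are simply ignored here, which may not match how MEGA treats them.

```python
from collections import Counter

def classify_sites(alignment):
    """Return (total, variable, parsimony_informative) for equal-length sequences."""
    total = len(alignment[0])
    variable = informative = 0
    for column in zip(*alignment):
        # Count each base in this column, ignoring gaps and ambiguity codes.
        counts = Counter(base for base in column if base in "ACGT")
        if len(counts) >= 2:
            variable += 1
            # Informative: at least two bases that each appear two or more times.
            if sum(1 for c in counts.values() if c >= 2) >= 2:
                informative += 1
    return total, variable, informative
```

In the toy alignment `["AAGA", "AAGA", "ATCA", "ATCC"]` the last column is variable but not informative, because the single C can be explained by one change on any tree.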

Now perform a Branch and Bound analysis.


Question 8. Did it finish the analysis? Why?

Deselect one taxon (not the dipluran, strepsiptera, flies or beetles) at a time until
the Branch and Bound analysis finishes in under a minute.
Question 9. How many taxa did you have to eliminate?
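
It can help to see how quickly the search space grows as taxa are added. The number of distinct unrooted bifurcating topologies for n taxa is (2n-5)!!, the product of the odd numbers from 3 up to 2n-5. The sketch below, with a function name chosen just for this illustration, evaluates that formula.

```python
def unrooted_tree_count(n_taxa):
    """Number of unrooted bifurcating topologies for n_taxa labeled tips: (2n-5)!!"""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa for an unrooted tree")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-5 (an empty product when n = 3).
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count
```

Try `unrooted_tree_count(n)` for the number of taxa you started with and for the number you ended with. Branch and Bound does not visit every topology, but the size of the space it must rule out grows at this rate.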

Now compute a Majority-rule consensus and root the topology to Diplura.
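
For intuition about what a majority-rule consensus value means, the sketch below counts how often each clade appears across a set of equally parsimonious trees and keeps those found in more than half of them. Representing each tree as a set of clades (each clade a frozenset of taxon names) is a simplification assumed for this example; it is not how MEGA stores trees.

```python
from collections import Counter

def majority_rule_clades(trees, threshold=0.5):
    """Map each clade found in more than `threshold` of the trees to its frequency."""
    # Each tree is a set of clades, so a clade is counted at most once per tree.
    counts = Counter(clade for tree in trees for clade in tree)
    n_trees = len(trees)
    return {clade: n / n_trees
            for clade, n in counts.items()
            if n / n_trees > threshold}
```

A clade present in every tree gets a value of 1.0 (100%), and a clade present in two of three trees gets about 0.67; clades at or below the threshold collapse into a polytomy in the consensus.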


Question 10. What is the supported sister group of Strepsiptera?
Question 11. What is the consensus value that unites these lineages as a
monophyletic group?

Now re-include all 25 taxa and perform a heuristic search analysis (TBR) with
1000 random additions.
Question 12. About how long did it take?
Question 13. How many trees were found?
Question 14. What is the tree length?

Now compute a Majority-rule consensus and root the topology to Diplura.


Question 15. What is the supported sister group of Strepsiptera?
Question 16. What is the consensus value that unites these lineages as a
monophyletic group?
Question 17. Were your answers for Q9-Q13 different than before? Why or why
not?

Question 18. Explain the similarities and differences between Branch & Bound
and heuristic (TBR with random additions) searches. When would you prefer to
use one or the other?
Question 20. In your own words, explain what is happening in a heuristic search.
Use the analogy I used in class and relate it to the parsimony method using
correct terms.

Finding the best Model. Run the MEGA modeltest program. Click on Models ->
find best DNA/protein models (ML). Use the automatic tree and use all sites
settings.

Question 21. Which model was selected as the best model under the AIC and the
BIC criteria?
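
The two criteria can disagree because they penalize extra parameters differently. The sketch below uses the standard definitions, AIC = 2k - 2 lnL and BIC = k ln(n) - 2 lnL, where k is the number of free parameters (branch lengths included) and n is the number of sites. The two candidate models and all of their numbers are invented for illustration, and MEGA's table may report a small-sample corrected variant (AICc) rather than plain AIC.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion: lower is better."""
    return n_params * math.log(n_sites) - 2 * log_likelihood

# Hypothetical candidates: name -> (log-likelihood, number of free parameters).
candidates = {"simpler": (-5000.0, 50), "richer": (-4990.0, 58)}
n_sites = 1000  # hypothetical alignment length

best_aic = min(candidates, key=lambda m: aic(*candidates[m]))
best_bic = min(candidates, key=lambda m: bic(*candidates[m], n_sites))
```

With these made-up numbers the richer model wins under AIC while the simpler one wins under BIC, because the BIC penalty per parameter, ln(1000), is larger than the AIC penalty of 2.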

Now run an analysis of the dataset using the AIC Model in a Maximum Likelihood
(ML) analysis with these Settings: Test of Phylogeny [none]; Model and Rates [as
indicated by selected model]; Gaps [use all sites]; ML Heuristic [Extensive SPR
level 5]; Initial Tree [automatically]; Branch swap filter [very strong]; Number of
threads [1].
Question 22. What is the LogL score of the ML tree using the AIC suggested
model?

Question 23. What is the parsimony score of this tree? To do this you need to
export this tree (.nwk format) and then perform User Tree analysis with
parsimony on the saved tree.
Question 24. How and why is it different from the MP score from the regular MP
analysis for this same dataset?
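
Scoring a fixed tree under parsimony amounts to counting the minimum number of changes each site needs on that topology. Here is a minimal Python sketch of the Fitch algorithm for that count. To stay short it assumes the tree is written by hand as nested 2-tuples of taxon names rather than parsed from a .nwk file, and it treats every character, gaps included, as an ordinary state; both are simplifications for illustration and not a description of MEGA's User Tree analysis.

```python
def fitch_site(tree, states):
    """Return (possible state set, minimum changes) for one site on a binary tree.

    tree:   a taxon name (str) or a 2-tuple of subtrees
    states: dict mapping taxon name -> character at this site
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_site(tree[0], states)
    right_set, right_cost = fitch_site(tree[1], states)
    common = left_set & right_set
    if common:
        # The children can share a state, so no extra change is needed here.
        return common, left_cost + right_cost
    # No shared state: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1

def parsimony_score(tree, alignment):
    """Sum the Fitch change counts over all sites. alignment: dict name -> sequence."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        states = {name: seq[i] for name, seq in alignment.items()}
        total += fitch_site(tree, states)[1]
    return total
```

For example, with `alignment = {"A": "AAG", "B": "AAG", "C": "ATC", "D": "ATC"}`, the tree `(("A", "B"), ("C", "D"))` needs fewer changes than `(("A", "C"), ("B", "D"))`. The same logic explains why a tree chosen by another criterion can have a parsimony score equal to or longer than the MP tree, but never shorter.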

Now run an analysis of the dataset using the BIC Model in a Maximum Likelihood
(ML) analysis with these Settings: Test of Phylogeny [none]; Model and Rates [as
indicated by selected model]; Gaps [use all sites]; ML Heuristic [Extensive SPR
level 5]; Initial Tree [automatically]; Branch swap filter [very strong]; Number of
threads [1].
Question 25. What is the LogL score of the ML tree using the BIC suggested
model?

Question 26. What is the parsimony score of this tree? To do this you need to
export this tree (.nwk format) and then perform User Tree analysis with
parsimony on the saved tree.

Question 27. How and why is it different from the MP score from the regular MP
analysis for this same dataset?

Question 28. When might one prefer to use an ML method instead of parsimony or
distance methods?

Compare and contrast the MP and ML trees with respect to topology, rooting both
trees to Diplura. Pay close attention to the placement of the Strepsiptera. Try to
explain any topological differences. (Hint: take note of the branch lengths for
groups that move around; ever heard of long branch attraction?)

Question 29. Which insect order is a likely sister group to the Strepsiptera?

Question 30. Defend your answer. You may want to refer to scientific literature
to provide support for your conclusion.
