Phylogenetic Trees
First we will do a couple of pencil exercises and then we will analyze the ten SSU rRNA sequences that everyone aligned last
time.
1. Draw all possible unrooted trees for 3 species (A, B, C). Now
draw all possible rooted trees. Note: Branches can pivot at a node
so:





[Figure: two drawings of the same tree, one pivoted at a node, shown as equal. Both still group the same species together.]
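Drawing every tree by hand stops being practical very quickly. As a rough check on how fast the count grows, here is a small Python sketch using the standard counting formulas for strictly bifurcating trees with labeled tips: (2n-5)!! unrooted and (2n-3)!! rooted. The function names are made up for this handout.

```python
def double_factorial(k):
    """Product k * (k-2) * (k-4) * ... down to 1 (returns 1 when k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def count_unrooted_trees(n):
    """Number of distinct unrooted bifurcating trees for n >= 3 labeled species."""
    return double_factorial(2 * n - 5)


def count_rooted_trees(n):
    """Number of distinct rooted bifurcating trees for n >= 2 labeled species."""
    return double_factorial(2 * n - 3)


# Compare a small case with the ten sequences used in this exercise.
for n in (5, 10):
    print(n, "species:", count_unrooted_trees(n), "unrooted,",
          count_rooted_trees(n), "rooted")
```

You can call the same functions with n = 3 and n = 4 to check your drawings for exercises 1 and 2.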
2. Now draw all possible unrooted trees for 4 species (1, 2, 3, 4).
Suppose the trees you have drawn are candidate maximum parsimony trees for
a given informative position where the nucleotides at that position are T, T, C,
and C (corresponding to 1, 2, 3, and 4, respectively). Take the trees you have drawn and label each of the 2 internal nodes with the
most likely candidate for the inferred ancestral nucleotide (there may be more
than one right answer). How many of these trees are equally parsimonious, invoking a
minimum of 1 substitution? How many invoke 2? Do any invoke
more than 2 substitutions?
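If you want to check your hand counts afterwards, the counting can be sketched in Python with Fitch's small-parsimony rule: at each internal node take the intersection of the two child sets if it is non-empty, otherwise take the union and add one substitution. This is only an illustration. The nested-tuple tree format and the function name are assumptions made for this handout, and the example deliberately uses a different site pattern (A, G, A, G) from the one in the exercise.

```python
def fitch(tree, states):
    """Return (candidate_nucleotides, substitutions) for a nested-tuple tree.

    A leaf is a species label; an internal node is a pair (left, right).
    An unrooted 4-species tree is written here rooted on its internal branch,
    which does not change the parsimony score.
    """
    if not isinstance(tree, tuple):            # leaf: its observed nucleotide
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:                                 # children agree: nothing added
        return shared, left_cost + right_cost
    # Children disagree: keep both candidates and count one substitution.
    return left_set | right_set, left_cost + right_cost + 1


# Made-up site pattern for species 1-4 (not the one from the exercise).
site = {"1": "A", "2": "G", "3": "A", "4": "G"}
topologies = [
    (("1", "2"), ("3", "4")),
    (("1", "3"), ("2", "4")),
    (("1", "4"), ("2", "3")),
]
for topology in topologies:
    candidates, cost = fitch(topology, site)
    print(topology, "substitutions:", cost)
```

Calling fitch on one of the inner pairs, for example fitch(("1", "2"), site), returns the candidate set for that internal node, which is what you are asked to write on your drawing.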
We will now look at some distance matrices and use two different methods to generate phylogenetic trees using BIOLOGY WORKBENCH.
Go to alignment tools, which is where your imported alignment should be.
3. Clustal Distance (CLUSTALDIST)
This program calculates the distance between each pair of aligned sequences and reports the results as a matrix.
Run ClustalDist on your aligned sequences using the default values (correction for multiple substitutions = NO)
Look at the row corresponding to the human 18S rRNA sequence. Going across the row, group
similar organisms together based on the calculated evolutionary distances.
Next look at the row corresponding to the E. coli 16S rRNA sequence. Going across the row, group the organisms by distance.
Do you notice any differences between these two sets of groupings?
If so, which seems to be more accurate? On your paper rough out a phylogenetic tree (don't worry about branch lengths or style, simply draw
something that shows the groupings and how the various groups are related to
each other). You don't need to hand in the matrix.
Next run ClustalDist on your aligned sequences with the correction for multiple substitutions = YES
Perform the same comparisons. (You can use the back button on the second page with the results to toggle between matrices.)
Answer the following questions:
Are there any differences in the distances between organisms?
Does this change your groupings? (If so adjust your hand-drawn tree
accordingly.)
Does it appear to be more accurate?
Why or why not?
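To see what a correction for multiple substitutions does to a distance, here is a minimal Python sketch. It computes the uncorrected distance (the fraction of aligned positions that differ) and then applies the Jukes-Cantor formula d = -3/4 ln(1 - 4p/3). Jukes-Cantor is used here only because it is the simplest correction. It is an assumption for illustration and not necessarily the formula ClustalDist applies. The two short sequences are made up.

```python
import math


def p_distance(seq_a, seq_b, gap="-"):
    """Fraction of aligned positions that differ, skipping positions with a gap."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance; only defined for p below 0.75."""
    if p >= 0.75:
        raise ValueError("sequences too divergent for this correction")
    return -0.75 * math.log(1 - 4 * p / 3)


# Hypothetical aligned fragments, not taken from the class alignment.
first = "ACGTACGTAC"
second = "ACGTTCGTAA"
p = p_distance(first, second)
print("uncorrected:", round(p, 4), "corrected:", round(jukes_cantor(p), 4))
```

The corrected value is never smaller than the uncorrected one, and the gap between them widens as the sequences become more different.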
4. Next we will draw out the trees determined by ClustalDist (a
distance-based method). Note that it is NOT necessary to run the matrix
first like we did here. You can go straight to the following two programs.
Select DrawTree to draw an unrooted tree.
Select DrawGram to draw a rooted tree (recall that the first sequence is considered the root).
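The general idea behind a distance-based tree is to join the closest pair, replace it with a single cluster, and repeat. The Python sketch below does this with UPGMA (average linkage), which is chosen only because it is short. It is not claimed to be the method the Workbench programs use, and the organism labels and distances are invented for the example.

```python
def upgma(names, matrix):
    """Cluster a symmetric distance matrix with UPGMA; return a Newick string."""
    n = len(names)
    clusters = {i: (names[i], 1) for i in range(n)}   # id -> (newick, size)
    dist = {(i, j): matrix[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        a, b = min(dist, key=dist.get)                # closest pair, a < b
        label_a, size_a = clusters.pop(a)
        label_b, size_b = clusters.pop(b)
        del dist[(a, b)]
        for c in clusters:                            # size-weighted average
            d_ac = dist.pop((min(a, c), max(a, c)))
            d_bc = dist.pop((min(b, c), max(b, c)))
            dist[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        clusters[next_id] = ("(" + label_a + "," + label_b + ")", size_a + size_b)
        next_id += 1
    return next(iter(clusters.values()))[0] + ";"


# Made-up labels and distances, for illustration only.
names = ["orgA", "orgB", "orgC", "orgD"]
matrix = [
    [0, 2, 6, 6],
    [2, 0, 6, 6],
    [6, 6, 0, 4],
    [6, 6, 4, 0],
]
print(upgma(names, matrix))
```

The printed string shows only the groupings, much like the rough tree you drew by hand from the matrix rows.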
5. DNA Parsimony (DNAPAR): maximum parsimony (a character-based
method)
In this program select Print Steps = NO, Print Sequence = NO.
The first sequence in your alignment is used as the root sequence in
deriving the tree, but this is not a rooted tree. (The tree may look illogical, but the relative order of organisms should be correct.)
Other versions of this program will give you the choice of selecting what to use
for the root sequence. Here the only way to change it is to edit your
alignment to change the order.
A tree dendrogram is given. This is the relative order of sequences by evolutionary distance, and
it shows only the groupings. The branch lengths do not represent quantitative distances.
How do this tree (or trees) and the distance-based tree compare to the tree you drew out by hand?
6. Return to your aligned sequences and edit them using "Edit aligned sequences" to
remove nucleotides at the beginnings and ends of the sequences so that all the sequences start
and end at the same position (i.e. they are all the same length). Do
NOT hand in this alignment. Redo the phylogenetic tree using one of the two methods above. Does this change the tree
at all? If you have time, go back to your sequence files under nucleic tools
and create a new alignment with a few sequences left out of the alignment.
Redraw the trees and comment on any differences you see in the shape of the
tree.
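The trimming in this step is done by hand in the Workbench editor, but the rule being applied can be sketched in a few lines of Python: find the first column where every sequence has started and the last column before any sequence has ended, and keep only that block. The function name, the sequence labels, and the tiny alignment are hypothetical, and "-" is assumed to be the gap character.

```python
def trim_ragged_ends(alignment, gap="-"):
    """Drop leading and trailing columns where any sequence is still all gaps.

    alignment maps a sequence name to its aligned string; all strings are
    assumed to have the same length.
    """
    starts, ends = [], []
    for seq in alignment.values():
        starts.append(len(seq) - len(seq.lstrip(gap)))   # first real base
        ends.append(len(seq.rstrip(gap)))                # one past last base
    start, end = max(starts), min(ends)
    return {name: seq[start:end] for name, seq in alignment.items()}


# Made-up alignment with ragged ends.
ragged = {
    "seq1": "--ACGTAC-",
    "seq2": "GAACG-ACT",
    "seq3": "-AACGTAC-",
}
for name, seq in trim_ragged_ends(ragged).items():
    print(name, seq)
```

Internal gaps are left alone; only the overhanging ends are removed.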
7. Finally we will analyze the Clustal Tree by bootstrap analysis.
Select ClustalTree
Next select Bootstrap tree and change the default from 1000 trees to 100 trees.
Now you will get bootstrap values for the different nodes of the tree.
Unfortunately this version does not draw out the tree with the values placed at
the nodes. Try to place these values on the nodes of the tree you drew by
hand. (This may not be possible if the Clustal tree was considerably
different from yours.)
Which factor would have the greater impact on the number of computations
needed to complete a bootstrap analysis: doubling the number of sequences or
doubling the length of the sequences in the alignment? Why?
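For reference, the resampling step of a bootstrap can be sketched in Python as follows. Each replicate is built by drawing alignment columns at random, with replacement, until the replicate is as long as the original. In a real analysis a full tree is rebuilt from every replicate, and the bootstrap value of a node is the percentage of replicate trees that contain it. To keep the sketch short, the tree-building step is replaced by a toy three-sequence check, which is an assumption made for illustration. All names and sequences are made up.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement, keeping the original length."""
    length = len(next(iter(alignment.values())))
    picks = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}


def mismatches(a, b):
    """Number of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def support_for_pair(alignment, first, second, outsider, replicates=100, seed=0):
    """Percent of replicates in which first is closer to second than to outsider.

    This toy check stands in for rebuilding a tree from each replicate.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(replicates):
        rep = bootstrap_alignment(alignment, rng)
        if mismatches(rep[first], rep[second]) < mismatches(rep[first], rep[outsider]):
            hits += 1
    return 100.0 * hits / replicates


# Hypothetical three-sequence alignment.
toy = {
    "seqX": "ACGTACGTACGT",
    "seqY": "ACGTACGAACGT",
    "seqZ": "ACTTAGGAACCT",
}
print(support_for_pair(toy, "seqX", "seqY", "seqZ", replicates=100))
```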