Take the practice sequences you loaded last time and run a Clustal W alignment.
Import the aligned file into your workbench file.
The aligned file will be under Alignment Tools.
Select the alignment and use BOXSHADE and TEXSHADE for a visual analysis of the aligned sequences.
Do NOT print these out.
Both programs do similar things, but the output is formatted differently.
Which do you prefer? What are some of the advantages of this visual
alignment over the initial Clustal W alignment? Try changing some of the
defaults to see how they affect the output and which settings you
prefer. What happens when you change the % threshold or similarity
threshold fraction?
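To get a feel for what a threshold fraction does, here is a minimal Python sketch. It is not the actual shading algorithm of either program, only a simplified illustration: a column is marked "*" when its most common nucleotide appears in at least the threshold fraction of the sequences. The function name shade_columns and the toy sequences are made up for this example.

```python
from collections import Counter

def shade_columns(aligned, threshold=0.5):
    """Return one marker per column: '*' if the most common residue
    (gaps ignored) is present in at least `threshold` of the sequences,
    otherwise '.'."""
    n = len(aligned)
    markers = []
    for column in zip(*aligned):
        counts = Counter(c for c in column if c != "-")
        top = counts.most_common(1)[0][1] if counts else 0
        markers.append("*" if top / n >= threshold else ".")
    return "".join(markers)

# Toy aligned sequences (invented for illustration, all the same length).
toy = ["ACGU-A", "ACGUGA", "ACCUGA", "AGCAGA"]
print(shade_columns(toy, 0.75))
print(shade_columns(toy, 1.0))
```

Raising the threshold leaves fewer columns marked, which is the same trend you should see in the shaded output when you raise the threshold setting.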
Do the sequences start and end in the same place? Why do you suppose
this is? Do you think this affects your alignment?
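If you want to check start and end points without counting by eye, a short Python sketch like the following reports the first and last alignment column that holds a nucleotide in each row. It assumes you have copied the aligned rows into strings with "-" for gaps; the name aligned_span and the toy rows are hypothetical.

```python
def aligned_span(row):
    """Return (first, last) 1-based alignment columns that contain a
    residue, i.e. the row with leading and trailing gaps excluded."""
    first = len(row) - len(row.lstrip("-")) + 1
    last = len(row.rstrip("-"))
    return first, last

# Toy rows (invented): the second sequence starts later and ends earlier.
rows = {
    "seq_full":  "ACGUACGUAC",
    "seq_short": "--GUACGU--",
}
for name, row in rows.items():
    print(name, aligned_span(row))
```

Columns outside a sequence's span carry no information for that sequence, which is worth keeping in mind when you judge conservation near the ends of the alignment.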
Scanning your alignment, you should see both variable and conserved
regions. Why are both of these features important?
The region between 1400 and 1500 (E. coli numbering) contains an area
of signature sequence that is considered universal. Find it and write down
at least 10 nt from this conserved region (assume N's are likely conserved
nt).
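Because gaps shift everything to the right, E. coli position 1400 will usually not sit in alignment column 1400. The Python sketch below shows one way to convert a position in a reference row to an alignment column by counting only non-gap characters. It assumes the E. coli row in your alignment begins at position 1 of the standard numbering; if your sequence is partial, you would need to add an offset. The name column_for_position and the toy row are hypothetical.

```python
def column_for_position(ref_row, position):
    """Return the 1-based alignment column that holds the given 1-based
    residue position of the reference row, or None if the row has fewer
    residues than `position`."""
    seen = 0
    for col, ch in enumerate(ref_row, start=1):
        if ch != "-":
            seen += 1
            if seen == position:
                return col
    return None

# Toy reference row (invented): two gaps before the third residue.
ref = "AC--GU-A"
print(column_for_position(ref, 3))
```

In the toy row the third residue (G) sits in column 5, because two gap columns come before it.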
Give the numbers (from the consensus sequence) for a couple of regions
(size doesn't matter) where Eukarya and Archaea (Methanococcus and
Pyrodictium) have sequence in common but the Bacterial sequences (E. coli
and M. scandinavica) are different. Give the numbers for a couple of
regions that Archaea and Bacteria share. Do the same for Eukarya and
Bacteria. Was the last one harder to find? Why do you suppose that's true?
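The search described above can also be sketched in Python, as a cross-check on what you find by eye. The function below lists columns where every member of one group has the same nucleotide and no member of the other group has it. The function name shared_columns, the label "Eukaryote", and all of the toy sequences are placeholders invented for this example; substitute the rows from your own alignment.

```python
def shared_columns(alignment, group_a, group_b):
    """Return 1-based columns where all of group_a share one residue
    (not a gap) and no member of group_b has that residue."""
    hits = []
    length = len(next(iter(alignment.values())))
    for i in range(length):
        a = {alignment[name][i] for name in group_a}
        b = {alignment[name][i] for name in group_b}
        if len(a) == 1 and "-" not in a and a.isdisjoint(b):
            hits.append(i + 1)
    return hits

# Toy alignment (invented sequences, for illustration only).
alignment = {
    "Eukaryote":       "ACGGAU",
    "Methanococcus":   "ACGGAU",
    "Pyrodictium":     "ACGGCU",
    "E. coli":         "ACUUAU",
    "M. scandinavica": "ACUUAU",
}
euk_arch = ["Eukaryote", "Methanococcus", "Pyrodictium"]
bacteria = ["E. coli", "M. scandinavica"]
print(shared_columns(alignment, euk_arch, bacteria))
```

Swapping which names go into each group lets you repeat the comparison for Archaea with Bacteria, or Eukarya with Bacteria, and compare how many columns each pairing returns.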