
Take Home Assignment # 2

 

Use the hierarchy browser search tool to find your assigned genus in the RDP. Set the size to >1200 nt, because you want to use only sequences that are close to full length. Please hand in a list (not a printout) of the hierarchical “phylogenetic steps” to your assigned genus, starting with the Domain, and identify what type each step is (for example, the first step is a type of Domain). Retrieve all the sequences within the genus, either from the RDP or from any of the other databases you've learned to use in this course (regardless of whether the individual species within the genus actually include your genus name as part of their name). Note: If a sequence comes from the RDP, which is an aligned database, it is important to check "remove all gaps" when downloading it. Find your genus again, only this time set the size to both. How many total sequences are in this genus, including the short partials? Note: Do not retrieve any of these shorter sequences; use only the >1200 nt sequences for the rest of the take-home. (6 pt)
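If you retrieve sequences from more than one database, it can help to check them against the same two criteria before aligning: no alignment gaps and more than 1200 nt. The following is a minimal sketch, assuming the sequences are saved as FASTA text and that gaps are written as "-" or "."; the function names are made up for this illustration and are not part of the RDP tools.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (label, sequence) pairs."""
    records = []
    label, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if label is not None:
                records.append((label, "".join(chunks)))
            label, chunks = line[1:], []
        else:
            chunks.append(line)
    if label is not None:
        records.append((label, "".join(chunks)))
    return records


def strip_gaps(seq):
    """Remove alignment gap characters (assumed here to be '-' and '.')."""
    return seq.replace("-", "").replace(".", "")


def near_full_length(records, min_len=1200):
    """Keep only sequences whose ungapped length is greater than min_len."""
    kept = []
    for label, seq in records:
        ungapped = strip_gaps(seq)
        if len(ungapped) > min_len:
            kept.append((label, ungapped))
    return kept
```

Calling near_full_length(read_fasta(text)) on the contents of a downloaded file returns only the ungapped sequences longer than 1200 nt, which is the set to use for the rest of the take-home.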

 

Align the retrieved sequences that fit the above criteria using CLUSTALW. Hand in a copy of the CLUSTALW alignment. Be sure that each sequence is clearly identified (by name, NOT by number), either by editing the original sequence labels (in the sequence text box, NOT the label line) or by handwriting the full name next to each sequence identifier. The computer will sometimes chop off long labels. For some clones the unique part of the name falls only at the end, so the chopped labels come out identical. (3 pt)
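Before running the alignment, you can check which labels would become identical if they were cut off. This is a minimal sketch; the width is a parameter because the cut-off length depends on the program and version you use, and the function name is hypothetical.

```python
from collections import defaultdict


def truncation_collisions(labels, width):
    """Group full labels that share the same first `width` characters.

    Returns a dict mapping each ambiguous truncated label to the list of
    full labels that would collapse onto it.
    """
    groups = defaultdict(list)
    for label in labels:
        groups[label[:width]].append(label)
    return {short: full for short, full in groups.items() if len(full) > 1}
```

Any label reported here is one to edit before aligning, for example by moving the clone identifier to the front of the name.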

 

Run a second alignment with E. coli 16S rRNA (or another bacterial 16S rRNA, as long as it is outside of your group) as one of the sequences. Don’t hand this in (-2 pt if you do). Comparing the consensus sequence from the first alignment to this second alignment should help you find regions that might be unique signature sequences for your genus. Briefly explain why this second alignment should help you to determine these regions. Why might you choose to use another bacterial 16S rRNA sequence instead of E. coli? If you had a picture of the secondary structure of the bacterial 16S rRNA showing which areas are highly conserved and which are highly variable (like the one I showed in lecture), which of these areas would be a better place for you to search for a signature sequence for your genus? Select 3-4 different potential signature sequences (potential probe targets; see the tips below). Highlight each of your selected signature sequences on the CLUSTALW alignment pages you are handing in and number them 1-3 or 1-4. Run each candidate through Probe Match to determine its usefulness. Hand in a copy of the Probe Match (be sure it is set to both) information on your signature sequence candidates. Be sure the printout includes the number of hits in the genus (I don't need to see which members of the genus match) and the corresponding number from the highlighted region on the alignment. On this printout, also write down the total number of hits you get for the probe when you allow for 1 mismatch (do not print this out, as it can get long; just write it on the initial printout). Rerun each signature sequence through Probe Match, restricting the search to sequences with data in the region of this signature sequence. (Pick a region from 10-20 nt before the start of your signature sequence to 10-20 nt after the end. Think: what information have you generated that will make it easy to identify where this would be on the E. coli sequence?)
If you lose hits within your genus, try widening the window by a few more nucleotides on each end until you get them back. Hand in the output from this restricted run just as you did with the initial Probe Match run; however, somewhere on each printout write down the numbers for the region you restricted the run to. Also, hand in a paragraph analyzing each of your Probe Match results. Which one of your signature sequences do you believe is the best? Explain why you consider this signature sequence to give you a better probe than each of the other options. Be sure to include in your argument the information you gained by allowing for a mismatch and by restricting the search area. The signature sequence you selected may be the best of your 3-4 candidates; however, in the real world, would you consider this probe search a success? Include an explanation as to why you do or don't believe the selected sequence will give you a useful probe for your group. (15 pt)
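To see why the hit counts change when you allow a mismatch or restrict the region, it can help to look at a toy version of the search. The sketch below is not how Probe Match is implemented; it is a simple sliding-window comparison written for illustration, and the region check uses alignment column indices rather than E. coli numbering. All function names are hypothetical.

```python
def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def has_site(seq, target, max_mismatch=0):
    """True if any window of seq matches target within max_mismatch."""
    n = len(target)
    return any(
        mismatches(seq[i:i + n], target) <= max_mismatch
        for i in range(len(seq) - n + 1)
    )


def count_hits(records, target, max_mismatch=0):
    """Number of (label, sequence) records that contain the target site."""
    return sum(1 for _, seq in records if has_site(seq, target, max_mismatch))


def covers_region(aligned_seq, start, end):
    """True if an aligned sequence has bases spanning columns start..end-1.

    A short partial sequence that stops before the region cannot match
    the probe, so it should not count against the probe either.
    """
    positions = [i for i, c in enumerate(aligned_seq) if c not in "-."]
    return bool(positions) and positions[0] <= start and positions[-1] >= end - 1
```

Comparing count_hits with max_mismatch=0 and max_mismatch=1, inside and outside your genus, mirrors the two numbers you are asked to write on the printout. Filtering with covers_region first mirrors the restricted run, in which only sequences with data in the region are searched.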

 

Construct phylogenetic trees for your assigned group of organisms using two different tree methodologies (the methods themselves must differ, not just the appearance of the tree, as with rooted versus unrooted). Hand in the trees along with the name and description of the tree methodology (not the program name) used for each. Be sure it is clear how these two methodologies differ. Please evaluate your phylogenetic results, including the following in the evaluation discussion: Did either of your methods give you multiple trees? Why would this occur? Do your two trees truly differ? Would you expect them to differ? Explain. Are the branch lengths valid for either of your trees? If so, which? Do you prefer one tree over the other? For your group, speculate as to whether selecting "correct for multiple substitutions" as one of your analysis parameters would make a difference. Explain your answer. (15 pt)
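When thinking about the "correct for multiple substitutions" question, it may help to see how much a correction changes a distance at different levels of divergence. The sketch below uses the Jukes-Cantor formula, d = -(3/4) ln(1 - 4p/3), purely as an illustration. The program you use may apply a different model for its correction, so treat the numbers as indicative only.

```python
import math


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-' or '.') are skipped.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x not in "-." and y not in "-."]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(1 for x, y in pairs if x != y) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for an observed proportion p."""
    if p >= 0.75:
        raise ValueError("correction is undefined for p >= 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)
```

For an observed difference of p = 0.01 the corrected distance is about 0.0101, and for p = 0.1 it is about 0.107, so the correction grows with divergence. How divergent the sequences within your genus are is therefore the thing to look at when you speculate.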

 

It is generally necessary to manually edit your aligned sequences prior to phylogenetic analysis or to use a mask during the analysis. While I don’t want a nucleotide-by-nucleotide edit for this assignment, there are some simple types of edits that may be required for your alignment. Please perform this/these edit(s), but do not print it out. Either mark your edits on the CLUSTALW alignment you generated above (with the edited areas circled) and give a brief explanation as to why you performed the edits, or turn in a paragraph explaining why you did not need to make any edits to your aligned sequences, if you believe that to be the case. Construct a new phylogenetic tree with the edited alignment using either of the two methodologies you used above and hand in this new tree (clearly labeled as edited) with answers to the following question(s). How is the "edited" tree different from the original tree, or is it the same? If they are different, which do you think is more valid and why? (5 pt)
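As one illustration of what a mask does, the sketch below keeps only the alignment columns in which every sequence has a base. This is just one possible kind of mask, chosen here as an assumption for the example; it is not necessarily the edit your alignment needs, and the function names are hypothetical.

```python
def gap_free_columns(aligned):
    """Indices of columns in which no sequence has a gap ('-' or '.')."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("aligned sequences must all have the same length")
    return [i for i in range(length) if all(s[i] not in "-." for s in aligned)]


def apply_mask(aligned, keep):
    """Rebuild each sequence from the kept column indices only."""
    return ["".join(s[i] for i in keep) for s in aligned]
```

Comparing the number of columns before and after masking shows how much of the alignment the mask removes, which is worth mentioning when you explain your edits.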

 

You also need to run a bootstrap analysis on your phylogenetic groupings. Please hand in a hard copy of this computer analysis (make sure all necessary information is included if you don't hand in the entire printout) and write the calculated bootstrap values at the appropriate nodes on one of the trees that you are handing in. Use whichever of the trees has the same groupings as the neighbor-joining tree that the bootstrap program generates. If they are completely different, then sketch out the bootstrap neighbor-joining tree by hand and write the bootstrap values on this tree. Be sure to state how many random trees were tested, so we know whether a value such as 89 is out of 100 trees or 1000. On the tree, note whether any of the branches (groupings) are not valid according to the bootstrap analysis. (6 pt)
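The resampling step behind a bootstrap analysis can be sketched in a few lines. Each replicate draws alignment columns at random, with replacement, until it has as many columns as the original; a tree is then built from each replicate, and a grouping's bootstrap value is the number of replicate trees in which it appears. The tree-building step is left out here, and the function names are hypothetical.

```python
import random


def bootstrap_alignment(aligned, rng):
    """Return one bootstrap replicate of an alignment.

    Columns are sampled with replacement, so some columns appear more
    than once and others not at all. Rows stay in their original order.
    """
    n = len(aligned[0])
    cols = [rng.randrange(n) for _ in range(n)]
    return ["".join(s[i] for i in cols) for s in aligned]


def support_percent(times_recovered, replicates):
    """Express a raw bootstrap count as a percentage of the replicates."""
    return 100.0 * times_recovered / replicates
```

This is why the number of replicates has to be stated: support_percent(89, 100) is 89.0, while support_percent(89, 1000) is 8.9.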

 

 

Probe Design tips

 

1.  Probes, optimally, should be about 18-25 nucleotides in length, but some are as short as 15 nucleotides and others are longer than 25.

 

2.  If most of your sequences have a particular nucleotide (say a T for example) at a site, but one or two of the sequences have an N at that site (meaning it could be any nucleotide), go ahead and design the probe with that nucleotide (the T in my example).

 

3.  If you are having considerable trouble finding consensus sequence regions long enough for probes, expand your options by first checking to see if some base uncertainties are masking consensus.  The following IUPAC abbreviations may be used within your sequence:  R for A or G, W for A or T, S for G or C, M for A or C, Y for C or T, and K for G or T.  Consider then that an R may be in consensus if the other sequences all have A or all have G at that position.

 

4.  If necessary, design the probes using an ambiguous base like R or W (only 1 ambiguous base for probes shorter than 20 nt or 2 for probes over 20 nt).

 

5.  Your phylogenetic trees may show that 1 sequence or a small group of sequences is more distantly related to the rest of the sequences.  If such sequences are causing problems in finding a consensus region for a probe, go ahead and design your probe for just the main group of sequences.  You will need to explain this in your paragraph on the probe.
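Tips 2 through 4 can be sketched as a small consensus routine. An N, or an ambiguity code that is compatible with what the other sequences show, does not break consensus (tips 2 and 3); where the sequences truly differ, the column is written with the IUPAC code covering the observed bases (tip 4). This is a minimal sketch that assumes gap-free columns; the function names are hypothetical, and treating a probe of exactly 20 nt as a longer probe is an assumption, since the tip does not cover that case.

```python
# IUPAC nucleotide codes, including the three-base codes B, D, H, and V.
EXPAND = {
    code: frozenset(bases)
    for code, bases in {
        "A": "A", "C": "C", "G": "G", "T": "T",
        "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
        "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
    }.items()
}
CODE = {bases: code for code, bases in EXPAND.items()}


def column_call(column):
    """Consensus symbol for one gap-free alignment column."""
    sets = [EXPAND[b] for b in column.upper().replace("U", "T")]
    shared = frozenset.intersection(*sets)
    if shared:
        # Every sequence is compatible with these bases (tips 2 and 3).
        return CODE[shared]
    # The sequences truly differ: cover all informative bases (tip 4).
    informative = [s for s in sets if s != EXPAND["N"]]
    return CODE[frozenset.union(*informative)]


def consensus(aligned):
    """Consensus string for a list of equal-length, gap-free sequences."""
    return "".join(column_call("".join(col)) for col in zip(*aligned))


def ambiguity_ok(probe):
    """Apply tip 4: at most 1 ambiguous base under 20 nt, otherwise 2."""
    n_ambiguous = sum(1 for b in probe.upper() if b not in "ACGT")
    limit = 1 if len(probe) < 20 else 2  # assumption: 20 nt counts as long
    return n_ambiguous <= limit
```

Running consensus over the window you highlighted and then ambiguity_ok on the result gives a quick check that a candidate stays within the limits in tip 4.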

 

 

Genus names of organisms for Take Home Assignment # 2

 

 1.  Natronorubrum
 2.  Asticcacaulis
 3.  Marinospirillum
 4.  Methanimicrococcus
 5.  Thermovibrio
 6.  Piscirickettsia
 7.  Anaerolinea
 8.  Pontibacter
 9.  Thermocladium
10.  Cardiobacterium
11.  Streptobacillus
12.  Gelidibacter
13.  Deferribacter
14.  Phaeospirillum
15.  Antarctobacter
16.  Aequorivita
17.  Methylosarcina
18.  Pelagibaca
19.  Sandarakinorhabdus
20.  Azovibrio
21.  Runella
22.  Roseivivax

 

   

 

   

 © 2002 The Board of Regents of the University of Wisconsin System.

Click here to email comments to Scott Cooper regarding this site or its links.