Web sites:
Exercise 1
A. Sequence Match
1. Which of the sequences (>1200) in the RDP most resemble the sequence given below?
   1 gacgaacgct ggcggcatgc ctaatacatg caagtcgaac gcttttgttt caccgggtgc
  61 ttgcacccac cgagacaaaa gagtggcgga cgggtgagta acacgtgggt aacctgccca
 121 taagaggggg ataacatccg gaaacggatg ctaataccgc atatttccaa ttgtctcctg
 181 acagatggaa aaaaggtggc ttcggctacc gcttatggat ggacccgcgg cgtattagct
 241 agttggtgag gtaatggctc accaaggcga tgatacgtag ccgacctgag agggtgatcg
 301 gccacactgg gactgagaca cggcccagac tcctacggga ggcagcagta gggaatcttc
 361 cgcaatggac gaaagtctga cggagcaatg ccgcgtgagt gaagaaggtt ttcggatcgt
 421 aaaactctgt tgttagagaa gaacaaggat gagagtaact gctcatcccc tgacggtatc
 481 taaccagaaa gccacggcta actacgtgcc agcagccgcg gtaatacgta ggtggcaagc
 541 gttgtccgga tttattgggc gtaaagcgag cgcaggcggt tctttaagtc tgatgtgaaa
 601 gcccccggct caaccgggga gggtcattgg aaactggaga acttgagtgc agaagaggag
 661 agtggaattc cacgtgtagc ggtgaaatgc gtagatatgt ggaggaacac cagtggcgaa
 721 ggcgactctc tggtctgtaa ctgacgctga ggctcgaaag cgtggggagc aaacaggatt
 781 agataccctg gtagtccacg ccgtaaacga tgagtgctaa gtgttggagg gtttccgccc
 841 ttcagtgctg cagctaacgc attaagcact ccgcctgggg agtacgaccg caaggttgaa
 901 actcaaagga attgacgggg acccgcacaa gcggtggagc atgtggttta attcgaagca
 961 acgcgaagaa ccttaccagg tcttgacatc ctttgaccac tctagagata gagctttccc
1021 ttcggggaca aagtgacagg tggtgcatgg ttgtcgtcag ctcgtgtcgt gagatgttgg
1081 gttaagtccc gcaacgagcg caacccctat tattagttgc cagcattcag ttgggcactc
1141 tagtgagact gccggtgata aaccggagga aggtggggat gacgtcaaat catcatgccc
1201 cttatgacct gggctacaca cgtgctacaa tggatggtac aacgagtcgc aaggtcgcga
1261 ggccaagcta atctcttaaa gccattctca gttcggattg caggctgcaa ctcgcctgca
1321 tgaagccgga atcgctagta atcgcggatc agcacgccgc ggtgaatacg ttcccgggtc
1381 ttgtacacac cgcccgtcac accacgagag tttgtaacac ccgaagtcgg tgaggtaacc
1441 cttttgggag ccagccgcct aaggtgggac agataattgg ggtg
2. Check the information (the GenBank flat file, accessed via the S00 number)
for your best match. From what habitat was the organism
isolated?
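The query sequence in A.1 is shown in GenBank style, with position numbers and spaces mixed into the bases. If a tool does not accept it in that form, the numbers and spacing can be removed first. The following is a minimal Python sketch of one way to do that; the function names and the record name "query" are made up for illustration and are not part of the RDP or GenBank.

```python
import re

def clean_genbank_sequence(text):
    """Return only the nucleotide letters from a GenBank-style block.

    Position numbers, spaces, and line breaks are dropped.
    """
    return "".join(re.findall(r"[A-Za-z]+", text)).lower()

def to_fasta(name, seq, width=60):
    """Format a sequence as a FASTA record with `width` bases per line."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"

# First 70 bases of the query from A.1, copied with their position numbers.
example = ("1 gacgaacgct ggcggcatgc ctaatacatg caagtcgaac gcttttgttt caccgggtgc "
           "61 ttgcacccac")
seq = clean_genbank_sequence(example)
print(len(seq))                 # number of bases kept
print(to_fasta("query", seq))   # "query" is a hypothetical record name
```

Paste the full sequence in place of the short example to clean the whole query.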
B. Use the Hierarchy Browser (search) to find Methylomonas scandinavica.
a. Does this organism belong to the domain Archaea or Bacteria?
b. M. scandinavica is found in which phylum?
c. M. scandinavica is found in which class?
d. M. scandinavica is found in which order?
e. M. scandinavica is found in which family?
f. Where was this organism isolated from?
g. Download the sequence to your on-line files. This will be one of the
sequences used for Labs 2.2 and 2.3.
Probe Match (set size to "both")
C. For each of the following sequences, determine:
a. For which genus, family, or order of organisms is the signature sequence or probe specific?
[That is, which of these headings encompasses (contains) the bulk of the matches
given?] How many "hits" (the query is complementary to a sequence found in the
organism) fall into this category, out of the total hits? Be sure the settings are correct for your entry; they
will differ depending on whether you are entering a signature sequence or a probe
sequence.
b. How many sequences within the genus (or family
or order) aren't complementary to the signature sequence/probe?
c. How many organisms are there in the Domain Bacteria that have zero or one
mismatch with the signature sequence/probe? How many organisms are there
within the genus, family, or order you identified that have zero or one
mismatch with the signature sequence/probe?
Possible Signature sequences (- strand, 5'-3'):
1. TAGAGTGCAGCAGAGGGG
2. GATGTGGAGCGAACCTGAGA
Possible Probes (+ strand, 5'-3'):
3. CTTCCATACTCTAGGTAC
4. ACCGAGGTACATGTACCCCGACAT
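Because the signature sequences and probes above are given on opposite strands, it can help to see what "complementary" and "mismatch" mean in concrete terms. The Python sketch below shows a reverse complement and a simple mismatch count against a short target. It is only an illustration and does not reproduce how Probe Match itself searches; the function names and the toy target strings are hypothetical.

```python
# Translation table for the four standard bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, written 5'-3'."""
    return seq.translate(COMPLEMENT)[::-1]

def min_mismatches(query, target):
    """Fewest mismatches between query and any same-length window of target.

    Returns None when the target is shorter than the query.
    """
    query, target = query.upper(), target.upper()
    n = len(query)
    if len(target) < n:
        return None
    return min(
        sum(a != b for a, b in zip(query, target[i:i + n]))
        for i in range(len(target) - n + 1)
    )

probe = "CTTCCATACTCTAGGTAC"                  # sequence 3 from the list above
print(reverse_complement(probe))              # GTACCTAGAGTATGGAAG
print(min_mismatches("ACGT", "TTACGTTT"))     # toy target, exact match: 0
print(min_mismatches("ACGT", "TTACCTTT"))     # toy target, one change: 1
```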
D. Which of the 4 sequences above would work best
as a genus-specific probe (or signature sequence)? Defend your answer,
providing at least 2 reasons for your choice.
E. When you run Probe Match, the default size setting is "both". Why do
you suppose you'd want both instead of just >1200 or <1200? When the size is set to "both", if
the results show that your probe hits 23 of the 52 sequences, does this mean that 29
organisms have sequences with mismatches to the probe in the region of the 16S rRNA that the probe targets? Defend your answer.
F. Look back at your Probe Match results for signature sequence 2 above.
Was the probe complementary to the sequence from all members of
this group (with no mismatches)? Probe Match allows you to restrict
your search by entering a region of the sequence that should contain your
signature sequence/probe target. This narrows the search to
include only the sequences in the database that contain the target region complementary to your probe.
Note: it still checks the entire sequence, not just the region
entered. Run Probe Match again, setting the region to 1240 to 1290. How do these
results compare to those you had before? Why are they different? What do these results tell you
about the utility of the probe compared to what you knew before?
(Do not hand in this next part. You will need these sequences for Lab 2.2,
so find them before you start that lab.
Be sure to use appropriate nucleic acid databases and not protein databases;
otherwise you can use any database you prefer.)
Find the following rRNA sequences (or the rRNA gene sequences) and download or copy the sequences into Biology Workbench,
a FASTA file or a text file (do not use a word processing program like Microsoft
Word).
Note: most of these aren't currently in the RDP, as the RDP is still preparing the Eukarya section, so you should use either Workbench or
Entrez to retrieve the sequences. For some of these there will be more than one option to choose from. Please select complete
or large partial sequences rather than small partial sequences if possible (about 1500 nt
for 16S; over 1400 and preferably close to 1600 for the protozoa; and at least 1700 nt for
the other 18S, although 1900 is better), and do NOT
use sequences that also contain other genes or spacers. The protozoa
files may refer to these as 16S or 16S-like, as they are much smaller than
typical 18S rRNA. We will be using these sequences to perform multiple alignments and create phylogenetic trees, and the sequences you select
can affect your results.
Mouse 18S rRNA
Methylomonas scandinavica 16S rRNA
Human (Homo sapiens) 18S rRNA
Xenopus laevis 18S rRNA
Actaea japonica 18S rRNA
Chlamydomonas nivalis 18S rRNA
Trichomonas tenax 18S rRNA
Giardia intestinalis 18S rRNA
Methanocaldococcus jannaschii 16S rRNA
Escherichia coli 16S rRNA
Pyrodictium occultum 16S rRNA
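Once the sequences are saved as FASTA text, their lengths can be checked against the length guidelines given above. The Python sketch below is one possible way to do this; the function names, the toy records, and the cutoff used in the example are all hypothetical, so substitute the minimum length that applies to each sequence type.

```python
def fasta_lengths(text):
    """Map each FASTA record name (first word after '>') to its sequence length."""
    lengths = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):
            name = line[1:].split()[0]    # record name from the header line
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)    # add the bases on this line
    return lengths

def too_short(lengths, minimum):
    """Return the names of records shorter than `minimum` nucleotides."""
    return [name for name, length in lengths.items() if length < minimum]

# Toy records for illustration only; real rRNA sequences are far longer.
toy = ">seqA demo record\nACGT\nACG\n>seqB\nAC\n"
lengths = fasta_lengths(toy)
print(lengths)                 # {'seqA': 7, 'seqB': 2}
print(too_short(lengths, 5))   # ['seqB']
```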