 Sequence Comparison

 Homology Search

In a homology search, a test sequence is compared to all of the sequences in a large database, and the database sequences with the closest match, or most homology, are reported. If you had sequenced a gene and didn't know whether it had been discovered before, you would perform this type of search. One can also search using a protein's amino acid sequence to find other homologous proteins. Homology searches are easily done over the web using the program BLAST (Basic Local Alignment Search Tool) from NCBI (National Center for Biotechnology Information, Bethesda, Maryland). The NCBI has a database called GenBank containing all of the known DNA and protein sequences from around the world. The number of submissions has increased exponentially in the past 20 years.

YEAR    BASES             SEQUENCES
1997    967,000,000       1,491,000
1998    1,622,000,000     2,356,000
1999    3,400,000,000     4,610,000
2000    10,300,000,000    9,102,000
2001    15,849,921,438    14,976,310
2002    28,507,990,166    22,318,883
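
To make the growth in the table above concrete, the short Python sketch below computes the year-to-year growth factors and an average doubling time for the number of bases. The figures are copied from the table; the name doubling_time_years and the assumption of steady exponential growth between the first and last rows are ours, for illustration only.

```python
import math

# (year, bases, sequences) copied from the table above
GENBANK = [
    (1997, 967_000_000, 1_491_000),
    (1998, 1_622_000_000, 2_356_000),
    (1999, 3_400_000_000, 4_610_000),
    (2000, 10_300_000_000, 9_102_000),
    (2001, 15_849_921_438, 14_976_310),
    (2002, 28_507_990_166, 22_318_883),
]


def doubling_time_years(first, last):
    """Average doubling time of the base count, assuming steady
    exponential growth between the two rows (an assumption)."""
    (y0, b0, _), (y1, b1, _) = first, last
    rate = math.log(b1 / b0) / (y1 - y0)  # continuous growth rate per year
    return math.log(2) / rate


# Growth factor of each year relative to the previous year
for prev, cur in zip(GENBANK, GENBANK[1:]):
    print(cur[0],
          f"bases x{cur[1] / prev[1]:.2f}",
          f"sequences x{cur[2] / prev[2]:.2f}")

print(f"average doubling time (bases): "
      f"{doubling_time_years(GENBANK[0], GENBANK[-1]):.2f} years")
```

Under that assumption, the number of bases doubled roughly once a year between 1997 and 2002.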
 

When you submit a sequence for a search, it is compared to all of these sequences, and the best matches are displayed. Each sequence is given a number called an accession number. This unique number makes it easier to keep track of individual sequences. For more information on how these programs work, visit the site "A Guide to Molecular Sequence Analysis" or see the bioinformatics lecture on this topic on this site.
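
The idea of comparing one query against every database entry and reporting the best matches can be sketched in a few lines of Python. This is only a toy: it scores each entry with an exhaustive Smith-Waterman local alignment, whereas BLAST itself uses a much faster heuristic and reports statistical significance. The accession numbers, sequences, and scoring values below are made up for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score between strings a and b.
    The scoring values are arbitrary illustrative choices."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            cur.append(score)
            best = max(best, score)
        prev = cur
    return best


def search(query, database, top=3):
    """Score the query against every entry and return the best (score, accession) pairs."""
    hits = [(local_alignment_score(query, seq), acc) for acc, seq in database.items()]
    hits.sort(reverse=True)
    return hits[:top]


# Hypothetical accession numbers and made-up sequences, not real GenBank entries
TOY_DB = {
    "TOY00001": "ATGGCGTACGTTAGC",
    "TOY00002": "TTTTTTTTTTTTTTT",
    "TOY00003": "CCATGGCGTTCGAAA",
}

for score, accession in search("GGCGTACG", TOY_DB):
    print(accession, score)
```

The entry containing an exact copy of the query is listed first, followed by the entry with one mismatch, much as the best matches appear at the top of a real results list.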

We can either use the program BLAST directly to perform a database search, or use the Biology WorkBench.

 

Using Biology WorkBench to search a sequence database.

Log onto the Biology WorkBench and either create a new session or resume an existing session. To search for amino acid sequences, select Protein Tools; to search for DNA or RNA sequences, select Nucleic Tools.


 

To use a nucleic acid sequence (NS) or a protein sequence (PS) to search a database (DB), first select the sequence you wish to use, then select either BLASTP for proteins or BLASTN for nucleic acids. BLASTX translates the nucleic acid sequence in all six reading frames and uses the resulting amino acid sequences to search a protein database. You will then have to select the appropriate database and submit the search (for example, GBPRI1 is a GenBank Primate Sequence Database).
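
The six-frame translation that BLASTX performs can be illustrated with a minimal Python sketch: three reading frames on the given strand and three on its reverse complement. The function names are ours, the standard genetic code is assumed, and this is not the code BLASTX itself uses.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; '*' marks a stop, 'X' an unknown codon."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frames(dna):
    """Return translations for frames +1, +2, +3 and -1, -2, -3."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


for name, protein in six_frames("ATGGCCTAA").items():
    print(name, protein)
```

Each of the six translations is then treated as a protein query, which is why a DNA sequence can find matching proteins even when the reading frame and strand are unknown.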


 

If you want to obtain one of the sequences that matched your sequence, click on its line and press Import to Workbench. To select multiple sequences, hold down the Control key as you select each sequence.

These sequences will be added to your stored sequences.

 

 

Using BLAST to search a sequence database.

     1.  We will use the program  BLAST (www.ncbi.nlm.nih.gov/BLAST/)  from NCBI.

     2.  Use the program BLAST 2.0. Several different searches can be performed; select Basic BLAST Search.

     3.  Type or paste your sequence into the box below the button Submit Query. Do not add any spaces or characters other than A, C, G or T. Push the Submit Query button. 

     4.  Your results will appear next. The files at the top of the list represent the best matches. Click on the blue file name to the left of the description to go to that sequence. 


 

     5.  Each sequence will be accompanied by data indicating who submitted the sequence and any journals in which it may be published. Click on the button at the top of the page labeled FASTA; this gives you just the DNA sequence.

     6.  If you click on the button labeled Protein you will get a link to the translated amino acid sequence. 

     7.  If you wish to save a sequence, highlight the DNA sequence using your mouse and copy the sequence, either by simultaneously pushing Control and C, or by selecting Copy under Edit.

     8.  You can now paste this sequence into other programs for further analysis, or into a word processing file for storage.
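
Steps 3 and 7 above involve pasting and copying raw sequence text, which often carries a FASTA header line, spaces, line breaks, or position numbers. As a hedged illustration, the Python sketch below cleans copied text down to the bare A, C, G, T string that the query box expects. The function name clean_dna and the decision to drop digits silently are our assumptions.

```python
def clean_dna(text):
    """Return an uppercase DNA string containing only A, C, G, T.

    Lines starting with '>' (FASTA headers) are dropped, and whitespace
    and digits are removed. Any other character raises ValueError rather
    than being silently discarded.
    """
    lines = [ln for ln in text.splitlines() if not ln.startswith(">")]
    seq = "".join(
        ch for ln in lines for ch in ln
        if not (ch.isspace() or ch.isdigit())
    ).upper()
    bad = sorted(set(seq) - set("ACGT"))
    if bad:
        raise ValueError(f"unexpected characters: {''.join(bad)}")
    return seq


# Made-up example text in the style of a copied FASTA record
copied = ">toy sequence\nacgt acgt\n  1 ggcc\n"
print(clean_dna(copied))
```

Rejecting unexpected characters, instead of stripping them, makes it easier to notice when a protein or RNA sequence has been pasted by mistake.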


 © 2003 The Board of Regents of the University of Wisconsin System.

Click here to email comments to Scott Cooper regarding this site or its links.