 Alignment

An alignment program compares two protein or DNA sequences to show how similar they are. These programs find the best match between the two sequences. Occasionally gaps need to be introduced so that the two sequences line up.

       seq1 > 1 ggcctctgcctaatcacacagat-ctaacaggattatttc 
                ||||||||||| || ||||| ||  |||||||| |||||| 
       seq2 > 1 ggcctctgccttattacacaaatcttaacaggactatttc

This type of analysis is useful for detecting evolutionary differences between species and for finding mutations in genes.
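To illustrate how a program can decide where a gap belongs, here is a minimal sketch of a textbook global alignment (the Needleman-Wunsch method). It is not what BLAST or CLUSTALW run internally; those tools use more elaborate methods. The function name and the scoring values (match=1, mismatch=-1, gap=-2) are assumptions chosen only for illustration.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Return (aligned_a, aligned_b, score) for a simple global alignment.

    Scoring values are illustrative assumptions, not those of any real tool.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align two letters
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Walk back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[-1][-1]


if __name__ == "__main__":
    top, bottom, total = global_align(
        "ggcctctgcctaatcacacagatctaacaggattatttc",
        "ggcctctgccttattacacaaatcttaacaggactatttc",
    )
    print(top)
    print(bottom)
    print("score:", total)
```

With a different scoring scheme the gap may land in a different column, which is one reason two programs can report slightly different alignments for the same pair of sequences.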

To align sequences, we can use the program BLAST directly (for two sequences), a Multiple Alignment Program, or the Biology WorkBench.

 

Using Biology WorkBench to align two or more sequences.

Log on to the Biology WorkBench and either create a new session or resume an existing session.  Select Nucleic Tools or Protein Tools.

Select the sequences that you wish to align, then scroll down to CLUSTALW and select it.


You will now be offered several alignment parameters that you can change.  To accept the default values, simply select Submit at the bottom of the page.

The results will show the sequences with colored letters where they agree (the consensus).  Black letters mark mismatches and dashes mark gaps.
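As a rough analogue of that display, the sketch below labels each column of an alignment as consensus, mismatch, or gap. The function name and the three labels are assumptions for illustration; this is not how the WorkBench itself chooses colors. The sample rows are the pairwise example from the top of this page.

```python
def classify_columns(aligned):
    """Label each alignment column as 'consensus', 'mismatch', or 'gap'."""
    if len({len(row) for row in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    labels = []
    for column in zip(*aligned):
        residues = {letter.upper() for letter in column}
        if "-" in residues:
            labels.append("gap")          # at least one sequence has a dash
        elif len(residues) == 1:
            labels.append("consensus")    # every sequence has the same letter
        else:
            labels.append("mismatch")
    return labels


if __name__ == "__main__":
    rows = [
        "ggcctctgcctaatcacacagat-ctaacaggattatttc",
        "ggcctctgccttattacacaaatcttaacaggactatttc",
    ]
    labels = classify_columns(rows)
    for kind in ("consensus", "mismatch", "gap"):
        print(kind, labels.count(kind))
```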


If you wish to use this alignment to create a phylogenetic tree, or just want to save the alignment, select Import Alignment(s).

 

 

 ALIGNMENT OF TWO SEQUENCES

          1.  The following program aligns two sequences with each other.  

                         http://www.ncbi.nlm.nih.gov/gorf/bl2.html

          2.  Either copy and paste or type your two sequences into the boxes labeled Sequence 1 and
               Sequence 2. Be sure that no other text or numbers are mixed in with your sequence.  

                    To give a sequence a title, type a ">" followed by the title and then press Enter.  
                    Anything on the line that begins with ">" is ignored in the alignment.

              >sequence 1  
              CCTTGGCCTCTGCCTAATCACACAGATT

          3.  If you are aligning DNA sequences, select Program: blastn. If you are aligning protein
               sequences, select Program: blastp.  

          4.  Press the Align button and wait for your results.  
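Steps 2 and 3 can be checked before pasting anything into the form. The sketch below separates the ">" title line from the sequence, rejects stray digits or spaces, and guesses which program fits. The helper names `read_entry` and `guess_program` are hypothetical, and the guess is only a heuristic: a protein made up solely of A, C, G, T, U and N would be mistaken for DNA.

```python
DNA_LETTERS = set("ACGTUN")  # assumption: nucleotides plus N for "unknown"


def read_entry(text):
    """Split one pasted entry into (title, sequence)."""
    title = ""
    parts = []
    for line in text.strip().splitlines():
        line = line.strip()
        if line.startswith(">"):
            title = line[1:].strip()   # the title line is not part of the sequence
        elif line:
            parts.append(line)
    sequence = "".join(parts).upper()
    bad = sorted({ch for ch in sequence if not ch.isalpha()})
    if bad:
        raise ValueError(f"non-letter characters in sequence: {bad}")
    return title, sequence


def guess_program(sequence):
    """Return 'blastn' for DNA-like input, otherwise 'blastp' (heuristic)."""
    return "blastn" if set(sequence.upper()) <= DNA_LETTERS else "blastp"


if __name__ == "__main__":
    title, seq = read_entry(">sequence 1\nCCTTGGCCTCTGCCTAATCACACAGATT")
    print(title, seq, guess_program(seq))
```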
 

RESULTS

If you align DNA sequences, vertical lines will indicate identical bases and "-" will indicate gaps in the alignment.  

               seq1 > 1 ggcctctgcctaatcacacagat-ctaacaggattatttc 
                        ||||||||||| || ||||| ||  |||||||| |||||| 
               seq2 > 1 ggcctctgccttattacacaaatcttaacaggactatttc
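The middle line of this display can be rebuilt from the two aligned rows, which is a handy way to count identities. This is a minimal sketch; the function name `match_line` is an assumption, not part of BLAST.

```python
def match_line(row1, row2):
    """Return a line with '|' under identical bases and ' ' elsewhere."""
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have the same length")
    return "".join(
        "|" if a.upper() == b.upper() and a != "-" else " "
        for a, b in zip(row1, row2)
    )


if __name__ == "__main__":
    seq1 = "ggcctctgcctaatcacacagat-ctaacaggattatttc"
    seq2 = "ggcctctgccttattacacaaatcttaacaggactatttc"
    middle = match_line(seq1, seq2)
    print(seq1)
    print(middle)
    print(seq2)
    print("identical positions:", middle.count("|"), "of", len(middle))
```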

If you align protein sequences the output will show the identical amino acids lined up between the two sequences. A blank will appear at non-conservative substitutions and a "+" will appear at conservative substitutions. A "-" will indicate any gaps in the alignment.  

               seq1 1 KKLYPATTA-VSSQQVV 16 
                      KKLYPA+TA VSS QVV 
               seq2 1 KKLYPASTAVVSSNQVV 17
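The protein middle line can be sketched in the same way. BLAST decides what counts as a conservative substitution from a full scoring matrix; the short `CONSERVATIVE_PAIRS` set below is only an illustrative subset assumed for this example, and the function name is hypothetical.

```python
# Assumption: a small illustrative subset of conservative pairs,
# not the full scoring matrix that BLAST uses.
CONSERVATIVE_PAIRS = {
    frozenset(pair) for pair in ("ST", "DE", "KR", "IL", "IV", "LV", "FY")
}


def protein_midline(row1, row2):
    """Return the middle line: letter for identity, '+' for a conservative
    substitution, blank for anything else (including gaps)."""
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have the same length")
    out = []
    for a, b in zip(row1.upper(), row2.upper()):
        if "-" in (a, b):
            out.append(" ")
        elif a == b:
            out.append(a)
        elif frozenset((a, b)) in CONSERVATIVE_PAIRS:
            out.append("+")
        else:
            out.append(" ")
    return "".join(out)


if __name__ == "__main__":
    print("KKLYPATTA-VSSQQVV")
    print(protein_midline("KKLYPATTA-VSSQQVV", "KKLYPASTAVVSSNQVV"))
    print("KKLYPASTAVVSSNQVV")
```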
 


 MULTIPLE ALIGNMENTS

1.  To align several sequences, the following Multiple Sequence Alignment program is useful.  All of the sequences are entered in the same data box, with a title line separating each sequence.  

                      http://dot.imgen.bcm.tmc.edu:9331/multi-align/Options/map.html

2.  Your data should be formatted as follows:  

               >seq1   
               ggcctctgcctaatcacacagatctaacaggattatttc 
               >seq2   
               ggcctctgccttattacacaaatcttaacaggactatttc 
               >seq3   
               ggcctctgccttattttctttacaggactatatc

3.  Select Perform Search.
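If your sequences are kept in a script or spreadsheet, the input layout from step 2 can be generated rather than typed. The sketch below is one way to do that; the function name `to_fasta` is an assumption.

```python
def to_fasta(named_sequences):
    """Build '>title' / sequence text from (title, sequence) pairs."""
    lines = []
    for title, sequence in named_sequences:
        sequence = "".join(sequence.split())  # drop spaces and line breaks
        if not title.strip():
            raise ValueError("every sequence needs a title")
        if not sequence.isalpha():
            raise ValueError(f"{title!r}: sequence must contain letters only")
        lines.append(">" + title.strip())
        lines.append(sequence)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_fasta([
        ("seq1", "ggcctctgcctaatcacacagatctaacaggattatttc"),
        ("seq2", "ggcctctgccttattacacaaatcttaacaggactatttc"),
        ("seq3", "ggcctctgccttattttctttacaggactatatc"),
    ]))
```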

 

RESULTS

The results of the multiple alignment are given in two formats.  The first shows the alignment in numbered blocks of 15 columns.  The second is FASTA format, which can be pasted into some evolutionary analysis programs.  A "-" indicates a gap in the alignment.
                1            15 16           30 31           45   
         1 seq3 GGCCTCTGCCTTATT T------TCTTTACA GGACTATATC   34 
         2 seq1 GGCCTCTGCCTAATC ACACAGATCT-AACA GGATTATTTC   39 
         3 seq2 GGCCTCTGCCTTATT ACACAAATCTTAACA GGACTATTTC   40 
 
              

          >seq3   
          GGCCTCTGCCTTATTT------TCTTTACAGGACTATATC 
          >seq1   
          GGCCTCTGCCTAATCACACAGATCT-AACAGGATTATTTC 
          >seq2   
          GGCCTCTGCCTTATTACACAAATCTTAACAGGACTATTTC
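The two output formats carry the same information, so the FASTA version can be turned back into the block view. The sketch below reads the aligned FASTA text, splits each row into blocks of 15 columns, and appends the number of letters left after the gaps are removed, which appears to be what the right-hand column above reports (34, 39 and 40). The function names are assumptions for illustration.

```python
def parse_aligned_fasta(text):
    """Return a list of (title, aligned_sequence) pairs."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append([line[1:].strip(), ""])
        elif records:
            records[-1][1] += line   # sequence may span several lines
    return [(title, seq) for title, seq in records]


def block_view(records, width=15):
    """Format aligned records in blocks, with the ungapped length at the end."""
    rows = []
    for index, (title, seq) in enumerate(records, start=1):
        blocks = " ".join(seq[i:i + width] for i in range(0, len(seq), width))
        ungapped = len(seq.replace("-", ""))
        rows.append(f"{index} {title} {blocks} {ungapped:4d}")
    return "\n".join(rows)


if __name__ == "__main__":
    aligned = """
    >seq3
    GGCCTCTGCCTTATTT------TCTTTACAGGACTATATC
    >seq1
    GGCCTCTGCCTAATCACACAGATCT-AACAGGATTATTTC
    >seq2
    GGCCTCTGCCTTATTACACAAATCTTAACAGGACTATTTC
    """
    print(block_view(parse_aligned_fasta(aligned)))
```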


 © 2003 The Board of Regents of the University of Wisconsin System.

Click here to email comments to Scott Cooper regarding this site or its links.