An alignment program compares two protein or DNA sequences to assess how similar they are. These programs find the best match between the two sequences, introducing gaps where needed to bring the sequences into alignment.
seq1 > 1 ggcctctgcctaatcacacagat-ctaacaggattatttc
         ||||||||||| || ||||| ||  |||||||| ||||||
seq2 > 1 ggcctctgccttattacacaaatcttaacaggactatttc
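To see how a program can find a best match and decide where a gap belongs, here is a minimal Python sketch of a classic global alignment method (Needleman-Wunsch dynamic programming). It is not the method BLAST or CLUSTALW uses. The function name global_align and the scores (match +1, mismatch -1, gap -2) are assumptions chosen only for illustration.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return the two gapped strings.

    Illustrative scoring only (assumed values, not BLAST defaults).
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


# The two example sequences, without gaps.
seq1 = "ggcctctgcctaatcacacagatctaacaggattatttc"
seq2 = "ggcctctgccttattacacaaatcttaacaggactatttc"
top, bottom = global_align(seq1, seq2)
print(top)
print(bottom)
```

When several alignments score equally well, or when different scores are used, the gap may land in a different column than in the example above.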
This type of analysis is useful for detecting evolutionary differences between species and for finding mutations in genes.
To align sequences, we can use the program BLAST directly for two sequences, a Multiple Alignment Program for several sequences, or the Biology WorkBench.
Using Biology WorkBench to align two or more sequences.
Log onto the Biology WorkBench and either create a new session or resume an existing session. Select Nucleic Tools or Protein Tools.
Select the sequences that you wish to align and then scroll down to CLUSTALW.
You will now be given some options on parameters you can change in your alignment. You can just use the default values and select Submit at the bottom of the page.
The results will show the aligned sequences with colored letters representing a consensus. Black letters indicate a mismatch and dashes represent gaps.
If you wish to use this alignment to create a phylogenetic tree, or just want to save the alignment, select Import Alignment(s).
ALIGNMENT OF TWO SEQUENCES
1. The following program allows you to align two sequences together.
http://www.ncbi.nlm.nih.gov/gorf/bl2.html
2. Either copy and paste or type your two sequences into the boxes labeled Sequence 1 and Sequence 2. Be sure you do not have any text or numbers mixed in with your sequence.
To give each sequence a title, type a ">" followed by the title and then press Enter. The information on the line with the ">" will not be considered in the alignment.
>sequence 1
CCTTGGCCTCTGCCTAATCACACAGATT
3. If you are aligning DNA sequences, select Program: blastn; if you are aligning protein sequences, select Program: blastp.
4. Press the Align button and wait for your results.
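If your sequence was copied from a source that includes position numbers or spaces, a short script can tidy it before you paste it into the boxes. The following is a minimal Python sketch; the function name clean_sequence and the sample input are made up for illustration. It keeps the ">" title separate and drops everything that is not a letter. It cannot detect stray words mixed into the sequence, so still check the result by eye.

```python
def clean_sequence(text):
    """Split pasted text into (title, sequence).

    Assumes the optional title is on a line starting with ">" and that
    every other line is sequence, possibly with numbers and spaces.
    """
    title = None
    letters = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            title = line[1:].strip()  # kept as a label, not aligned
            continue
        # Keep letters only: drops digits, spaces and punctuation.
        letters.extend(ch for ch in line if ch.isalpha())
    return title, "".join(letters).upper()


# Hypothetical pasted input with position numbers and spacing.
pasted = ">sequence 1\n 1 ccttggcctc tgcctaatca\n21 cacagatt"
title, sequence = clean_sequence(pasted)
print(title)
print(sequence)
```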
RESULTS
If you align DNA sequences, vertical lines will indicate identical bases and "-" will indicate gaps in the alignment.
seq1 > 1 ggcctctgcctaatcacacagat-ctaacaggattatttc
         ||||||||||| || ||||| ||  |||||||| ||||||
seq2 > 1 ggcctctgccttattacacaaatcttaacaggactatttc
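The line of vertical bars can be rebuilt from the two aligned sequences, which is a handy way to check an alignment you have edited by hand. This is a minimal Python sketch; the function names match_line and percent_identity are assumptions, and identity is counted here over every alignment column including gaps, which is only one of several conventions.

```python
def match_line(top, bottom):
    """Return '|' where two aligned sequences hold the same base, else ' '."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have the same length")
    return "".join(
        "|" if x == y and x != "-" else " "
        for x, y in zip(top, bottom)
    )


def percent_identity(top, bottom):
    """Identical columns as a percentage of all alignment columns."""
    line = match_line(top, bottom)
    return 100.0 * line.count("|") / len(line)


seq1 = "ggcctctgcctaatcacacagat-ctaacaggattatttc"
seq2 = "ggcctctgccttattacacaaatcttaacaggactatttc"
print(seq1)
print(match_line(seq1, seq2))
print(seq2)
print(percent_identity(seq1, seq2))
```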
If you align protein sequences the output will show the identical amino acids lined up between the two sequences. A blank will appear at non-conservative substitutions and a "+" will appear at conservative substitutions. A "-" will indicate any gaps in the alignment.
seq1 1 KKLYPATTA-VSSQQVV 16
       KKLYPA+TA VSS QVV
seq2 1 KKLYPASTAVVSSNQVV 17
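The middle line of a protein alignment can be sketched in the same way. BLAST decides which substitutions are conservative from its scoring matrix; the short list of pairs below is only a hand-picked, illustrative subset (an assumption for this example, not the full matrix), and the function name protein_midline is hypothetical.

```python
# Illustrative subset of conservative amino acid pairs (assumed for this
# sketch; a real program looks these up in a full substitution matrix).
CONSERVATIVE_PAIRS = {
    frozenset(pair) for pair in ("ST", "IV", "IL", "LV", "KR", "DE", "FY")
}


def protein_midline(top, bottom):
    """Build the middle line: letter = identical, '+' = conservative,
    blank = non-conservative substitution or gap."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have the same length")
    out = []
    for x, y in zip(top, bottom):
        if "-" in (x, y):
            out.append(" ")          # gap column
        elif x == y:
            out.append(x)            # identical amino acids
        elif frozenset((x, y)) in CONSERVATIVE_PAIRS:
            out.append("+")          # conservative substitution
        else:
            out.append(" ")          # non-conservative substitution
    return "".join(out)


seq1 = "KKLYPATTA-VSSQQVV"
seq2 = "KKLYPASTAVVSSNQVV"
print(seq1)
print(protein_midline(seq1, seq2))
print(seq2)
```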
MULTIPLE ALIGNMENTS
1. To align several sequences, the following Multiple Sequence Alignment program is useful. All of the sequences are entered in the same data box, with titles separating each sequence.
http://dot.imgen.bcm.tmc.edu:9331/multi-align/Options/map.html
2. Your data should be formatted as follows:
>seq1
ggcctctgcctaatcacacagatctaacaggattatttc
>seq2
ggcctctgccttattacacaaatcttaacaggactatttc
>seq3
ggcctctgccttattttctttacaggactatatc
3. Select Perform Search.
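If your sequences are stored in a script or spreadsheet export, the input for the data box can be generated instead of typed. This is a minimal Python sketch of the format in step 2; the function name to_fasta is an assumption.

```python
def to_fasta(records):
    """Join (title, sequence) pairs into text with one '>' title line
    followed by one sequence line per record."""
    lines = []
    for title, seq in records:
        lines.append(">" + title)
        lines.append(seq)
    return "\n".join(lines) + "\n"


records = [
    ("seq1", "ggcctctgcctaatcacacagatctaacaggattatttc"),
    ("seq2", "ggcctctgccttattacacaaatcttaacaggactatttc"),
    ("seq3", "ggcctctgccttattttctttacaggactatatc"),
]
fasta_text = to_fasta(records)
print(fasta_text)
```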
RESULTS
The results from the multiple alignment will be given in two formats. The first lists the aligned sequences in numbered blocks of 15 positions. The second is a FASTA format that can be pasted into some evolutionary programs. A "-" indicates a gap in the alignment of the sequences.

       1            15 16           30 31           45
1 seq3 GGCCTCTGCCTTATT T------TCTTTACA GGACTATATC      34
2 seq1 GGCCTCTGCCTAATC ACACAGATCT-AACA GGATTATTTC      39
3 seq2 GGCCTCTGCCTTATT ACACAAATCTTAACA GGACTATTTC      40

>seq3
GGCCTCTGCCTTATTT------TCTTTACAGGACTATATC
>seq1
GGCCTCTGCCTAATCACACAGATCT-AACAGGATTATTTC
>seq2
GGCCTCTGCCTTATTACACAAATCTTAACAGGACTATTTC
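Because the FASTA output is plain text, it is easy to read back into a script for a quick check. The following minimal Python sketch parses the aligned FASTA text and marks the columns where all sequences agree; the function names parse_fasta and conserved_columns and the "*" marker are assumptions for illustration.

```python
def parse_fasta(text):
    """Return {title: sequence} from FASTA text (insertion order kept)."""
    records = {}
    title = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            title = line[1:].strip()
            records[title] = ""
        elif title is not None:
            records[title] += line
    return records


def conserved_columns(records):
    """Return '*' for columns where every sequence has the same base
    and none has a gap, ' ' otherwise."""
    seqs = list(records.values())
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must have the same length")
    return "".join(
        "*" if len(set(col)) == 1 and "-" not in col else " "
        for col in zip(*seqs)
    )


aligned = """>seq3
GGCCTCTGCCTTATTT------TCTTTACAGGACTATATC
>seq1
GGCCTCTGCCTAATCACACAGATCT-AACAGGATTATTTC
>seq2
GGCCTCTGCCTTATTACACAAATCTTAACAGGACTATTTC
"""
records = parse_fasta(aligned)
marks = conserved_columns(records)
for name, seq in records.items():
    print(name, seq)
print("    ", marks)
```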