Sequence Editor

A sequence editor is a program that lets you manipulate DNA or protein sequences. You may wish to find a particular region in a long sequence, cut and paste different sequences together, or simply determine the length of a given region of sequence.

Unfortunately, I haven't been able to find a good web-based sequence editor that is also free. If you find one, let me know.

Instead, we use a word processing program as a sequence editor, in this case Microsoft Word.

There are a few tricks in Word that make it easier to manipulate sequences.

Action	PC	Mac
Cut	Ctrl + X	Apple + X
Copy	Ctrl + C	Apple + C
Paste	Ctrl + V	Apple + V
Select All	Ctrl + A	Apple + A
Find	Ctrl + F	Apple + F
Replace	Ctrl + H	Apple + H
Select Word	Double Click	Double Click
Select Paragraph	Triple Click	Triple Click

Let's assume that you have the two sequences shown below. You wish to find the region of overlap between them and join them into one longer sequence. Notice that these sequences have returns in them. We will remove those first.

Sequence 1

TTGAATTCCGCTTGCATGCCTNCAGGTCGACTCTAGAGGATCCCCGGGTACCGAGCTCGAATTCTGATGAAATTTTGGCT

CACTCCTAGGCCTCTGCCTTATTACACAAATCTTAACAGGACTATTTCTTGCAATACACTACAC

Sequence 2

AACAGGACTATTTCTTGCAATACACTACACAGCTGACATTTCAACA

GCCTTCTCCTCCGTCGCCCACATCTGCCGAGACGTAAACTACGGGTGACTAATCCGAAACGTCCACGCAAATGGCGCCTC

REPLACE: This saves you from manually going through a long sequence. Press Ctrl + H, select MORE, and then select SPECIAL. At the top of the window select "Paragraph Mark", which places ^p in the "Find what:" box. Leave "Replace with:" empty and select Replace All. This deletes all of the paragraph marks, leaving one continuous sequence.
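For readers who would rather script this step, here is a minimal Python sketch of the same clean-up. It is an illustration, not part of the Word workflow: the function name remove_returns is made up for this example, and unlike the Paragraph Mark replacement it also strips spaces and tabs, on the assumption that no whitespace inside a sequence is meaningful.

```python
import re


def remove_returns(raw: str) -> str:
    """Delete every whitespace character (returns, spaces, tabs) from a sequence."""
    return re.sub(r"\s+", "", raw)


# A copy of Sequence 1 that still contains a return and a stray space.
seq1_raw = """TTGAATTCCGCTTGCATGCCTNCAGGTCGACTCTAGAGGATCCCCGGGTACCGAGCTCGAATT CTGATGAAATTTTGGCT

CACTCCTAGGCCTCTGCCTTATTACACAAATCTTAACAGGACTATTTCTTGCAATACACTACAC"""

seq1 = remove_returns(seq1_raw)
print(seq1)       # one continuous sequence
print(len(seq1))  # length of the cleaned sequence in bases
```

The len() call also answers the "how long is this region" question without counting characters by hand.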

FIND: Next we want to find the overlapping region. We will use Ctrl + F to find the last 10 bp of Sequence 1 in Sequence 2. At this point I like to change the color of each sequence to make it easier to keep track of changes. Select the last 10 bp of Sequence 1 and press Ctrl + C (copy). Paste this into the "Find What:" box and then select Find Next.
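The same search can be sketched in Python with the built-in str.find method. This is only an illustration under simple assumptions: the variable names are invented for the example, the sequences have already had their returns removed, and matching is exact (an ambiguous base such as N only matches another N).

```python
# Sequence 1 and Sequence 2 from this tutorial, with returns removed.
seq1 = (
    "TTGAATTCCGCTTGCATGCCTNCAGGTCGACTCTAGAGGATCCCCGGGTACCGAGCTCGAATT"
    "CTGATGAAATTTTGGCT"
    "CACTCCTAGGCCTCTGCCTTATTACACAAATCTTAACAGGACTATTTCTTGCAATACACTACAC"
)
seq2 = (
    "AACAGGACTATTTCTTGCAATACACTACACAGCTGACATTTCAACA"
    "GCCTTCTCCTCCGTCGCCCACATCTGCCGAGACGTAAACTACGGGTGACTAATCCGAAACGTC"
    "CACGCAAATGGCGCCTC"
)

probe = seq1[-10:]        # last 10 bp of Sequence 1
start = seq2.find(probe)  # 0-based index of the first match, or -1 if absent

if start == -1:
    print("The last 10 bp of Sequence 1 do not occur in Sequence 2.")
else:
    end = start + len(probe)
    # Report 1-based positions, as you would count bases by hand.
    print(f"{probe} found in Sequence 2 at bases {start + 1}-{end}")
```

For these two sequences the probe ends at base 30 of Sequence 2, so everything up to that base is the region shared with the end of Sequence 1.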

Sequence 1

TTGAATTCCGCTTGCATGCCTNCAGGTCGACTCTAGAGGATCCCCGGGTACCGAGCTCGAATTCTGATGAAATTTTGGCTCACTCCTAGGCCTCTGCCTTATTACACAAATCTTAACAGGACTATTTCTTGCAATACACTACAC

Sequence 2

AACAGGACTATTTCTTGCAATACACTACACAGCTGACATTTCAACAGCCTTCTCCTCCGTCGCCCACATCTGCCGAGACGTAAACTACGGGTGACTAATCCGAAACGTCCACGCAAATGGCGCCTC

Now that we have identified the region of overlap between the two sequences, we wish to remove the overlapping region from one sequence and join the two together.

To join the two sequences, enter a return in Sequence 2 after the last base it has in common with Sequence 1. This ensures that you don't accidentally copy part of one sequence twice.

Sequence 1

TTGAATTCCGCTTGCATGCCTNCAGGTCGACTCTAGAGGATCCCCGGGTACCGAGCTCGAATTCTGATGAAATTTTGGCTCACTCCTAGGCCTCTGCCTTATTACACAAATCTTAACAGGACTATTTCTTGCAATACACTACAC

Sequence 2

AACAGGACTATTTCTTGCAATACACTACAC

AGCTGACATTTCAACAGCCTTCTCCTCCGTCGCCCACATCTGCCGAGACGTAAACTACGGGTGACTAATCCGAAACGTCCACGCAAATGGCGCCTC

Next, delete all of the overlapping sequence in Sequence 2:

AACAGGACTATTTCTTGCAATACACTACAC

If you are working with a very large sequence, you do not need to scroll through all of it to select it. Double click on the sequence and the entire sequence (which Word treats as one word) will be selected.

Finally, join the two fragments into one longer sequence.

The final product: Sequence 1+2

TTGAATTCCGCTTGCATGCCTNCAGGTCGACTCTAGAGGATCCCCGGGTACCGAGCTCGAATTCTGATGAAATTTTGGCTCACTCCTAGGCCTCTGCCTTATTACACAAATCTTAACAGGACTATTTCTTGCAATACACTACACAGCTGACATTTCAACAGCCTTCTCCTCCGTCGCCCACATCTGCCGAGACGTAAACTACGGGTGACTAATCCGAAACGTCCACGCAAATGGCGCCTC
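As a cross-check on the manual cut and paste, the whole join can be sketched in a few lines of Python. This is not how Word does it, only an equivalent calculation under stated assumptions: the helper names overlap_length and join_sequences and the 10 bp minimum are made up for this example, the overlap must match exactly, and Sequence 1 is assumed to come first.

```python
def overlap_length(left: str, right: str, min_len: int = 10) -> int:
    """Length of the longest end of `left` that is also the start of `right`.

    Returns 0 if no exact overlap of at least `min_len` bases exists.
    """
    # Try the longest possible overlap first and shrink until one matches.
    for n in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:n]):
            return n
    return 0


def join_sequences(left: str, right: str, min_len: int = 10) -> str:
    """Join two sequences, keeping the shared region only once."""
    n = overlap_length(left, right, min_len)
    if n == 0:
        raise ValueError("no overlap found between the two sequences")
    # Drop the overlapping bases from the start of `right`, then append.
    return left + right[n:]


seq1 = (
    "TTGAATTCCGCTTGCATGCCTNCAGGTCGACTCTAGAGGATCCCCGGGTACCGAGCTCGAATT"
    "CTGATGAAATTTTGGCT"
    "CACTCCTAGGCCTCTGCCTTATTACACAAATCTTAACAGGACTATTTCTTGCAATACACTACAC"
)
seq2 = (
    "AACAGGACTATTTCTTGCAATACACTACACAGCTGACATTTCAACA"
    "GCCTTCTCCTCCGTCGCCCACATCTGCCGAGACGTAAACTACGGGTGACTAATCCGAAACGTC"
    "CACGCAAATGGCGCCTC"
)

shared = overlap_length(seq1, seq2)
joined = join_sequences(seq1, seq2)
print(f"overlap: {shared} bp")
print(joined)
```

For the two sequences in this tutorial the overlap found this way is the same 30 bp region that was deleted from Sequence 2 by hand, and the printed result should match the final product shown above.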

Click here to email comments to Scott Cooper regarding this site or its links.