Importing Sequences from GenBank and Other Sources
DNA and protein sequences can be brought into Expression from a variety of sources, including FASTA files and GenBank. This section describes the easiest ways to import existing sequence data.
Reading Sequences in from GenBank
The primary database of DNA and protein sequences is GenBank, located at the NCBI (National Center for Biotechnology Information, USA). Expression can automatically load GenBank data, complete with sequence annotation. To do this, select NCBI Query from the Internet menu. This will launch the NCBI home page in the Internet Tool Window. From here, locate the desired sequence, typically via an Entrez or BLAST search. Expression will automatically detect when you open a sequence file from GenBank, and the Import Sequence button will appear in the bottom-right corner. Pressing this button brings the sequence, with its annotation, directly into the program for display and further analysis or manipulation.
Reading Sequences from Other Sources
The easiest way to load pure sequence data is through the clipboard: copy the sequence to the clipboard and paste it into the Sequence Editor. Expression also recognises the FASTA format for loading either single or multiple sequences. In FASTA format, each sequence starts with a header line beginning with ">" and the sequence name, followed by one or more lines of sequence data.
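For readers who want to inspect a FASTA file before loading it, the layout can be sketched in a few lines of Python. This is only an illustration of the format, not a description of how Expression reads files; the function name parse_fasta and the two sample records are invented for the example.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence data may span several lines; collect them all.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical sample data: two short made-up sequences.
SAMPLE = """>seq1 example DNA fragment
ATGGCGTACGTTAGC
TTAGGCTA
>seq2 example protein fragment
MKTAYIAKQR
"""

records = parse_fasta(SAMPLE)
for name, sequence in records:
    print(name, len(sequence))
```

Each record is a header paired with its sequence joined into a single string, which is why a multi-sequence file can be read in one pass.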
To load a FASTA format file, go to Open on the File menu and select your file. Sequences can also be saved in the FASTA format for use in other programs, using the Export function on the File menu. Note, however, that files saved as FASTA lose any annotation and other file properties, since the format cannot store them. Files must be saved as .GML (Genamics Mark-up Language) to preserve annotation and file properties.
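The reason annotation is lost can be seen in a small Python sketch of FASTA output: the format only has room for a name line and the sequence itself. The record layout used here (a dictionary with name, sequence and annotation keys) is a hypothetical stand-in chosen for illustration; it is not the GML format or Expression's internal representation.

```python
def to_fasta(records, width=60):
    """Format records as FASTA text, wrapping sequence lines at `width`."""
    lines = []
    for rec in records:
        lines.append(">" + rec["name"])
        seq = rec["sequence"]
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
        # Any other keys (such as "annotation") have no place in FASTA
        # and are silently left out.
    return "\n".join(lines) + "\n"


# Hypothetical record with one made-up annotation entry.
record = {
    "name": "seq1",
    "sequence": "ATGGCGTACG" * 7,
    "annotation": [{"label": "promoter", "start": 1, "end": 10}],
}

output = to_fasta([record])
print(output)
```

The printed text contains the name and the wrapped sequence only, so a file that needs its annotation later should be kept in a format that can hold it.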