Aligning Multiple Sequences and Generating a Consensus

Multiple DNA or protein sequences can be aligned from Expression using either the MultAlin or the ClustalW algorithm. The alignments run over the internet on remote servers, which typically return results faster than local desktop PC alignment tools. This section introduces the interface and gives a brief overview of the two algorithms.

Performing an Alignment

Open your target protein or DNA sequences, then select the desired algorithm under Multiple Alignment from the Internet menu. This opens the appropriate form in the Internet Tool Window. If you have both DNA and protein sequences open, only sequences of the same type as the currently selected sequence will be used. After adjusting the parameters, click the Submit button to run the alignment. The time before your results are returned depends on the speed of your internet connection and on the length and number of sequences to be aligned.

Multiple Alignment Algorithms

MultAlin

MultAlin is a progressive multiple sequence alignment program that aligns several biological sequences using hierarchical clustering (Corpet, 1988). The table below outlines its parameters and their functions:

Parameter	Function
Output Order	Selects whether to show the aligned sequences in input order or in aligned order
Symbol Comparison Table	Sets the matrix to use for scoring matches in the alignment
Gap Penalty Value	Sets the penalty for opening a gap in the alignment
Gap Length Value	Sets the penalty for extending an existing gap
Extremity Gap Penalty	Penalizes end gaps the same as internal gaps
One Iteration Only	Performs the alignment in a single iteration, which is faster to compute but gives a less accurate alignment
Unweighted Sequences	Gives all sequences an equal weighting (1.0)
Scoring Method	Sets how the pairwise scores are computed

For further details, see the MultAlin documentation.
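
To illustrate how the Gap Penalty Value, Gap Length Value, and Extremity Gap Penalty settings interact, here is a minimal Python sketch of an affine gap cost. It assumes the common form in which each gap pays one opening charge plus a per-position extension charge; the exact formula and default values used by the MultAlin server may differ, and the function names and numbers below are hypothetical.

```python
import re

def gap_cost(length, gap_open=5, gap_extend=1):
    """Affine cost of one gap: an opening charge plus a charge per gap position.

    Assumed convention: every position in the gap, including the first,
    pays the extension charge. Real programs differ on this detail.
    """
    if length <= 0:
        return 0
    return gap_open + gap_extend * length

def total_gap_penalty(aligned_row, gap_open=5, gap_extend=1, penalize_ends=False):
    """Sum the gap costs over one row of an alignment ('-' marks a gap).

    With penalize_ends=False, gaps touching either end of the row are free,
    which mimics leaving an "extremity gap penalty" option switched off.
    """
    total = 0
    for run in re.finditer(r"-+", aligned_row):
        touches_end = run.start() == 0 or run.end() == len(aligned_row)
        if touches_end and not penalize_ends:
            continue
        total += gap_cost(len(run.group()), gap_open, gap_extend)
    return total

row = "--AC--GT-"
internal_only = total_gap_penalty(row)                    # only the middle gap counts
with_ends = total_gap_penalty(row, penalize_ends=True)    # all three gaps count
```

Raising the opening charge relative to the extension charge favors a few long gaps over many short ones, which is usually the reason the two values are exposed separately.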

ClustalW

ClustalW is a progressive multiple sequence alignment program (Thompson et al., 1994). It proceeds in three steps. In the first step, all sequences are aligned pairwise. In the second, a dendrogram is constructed that describes the approximate grouping of the sequences by similarity. In the third, the multiple alignment is built using the dendrogram as a guide. The algorithm takes into account sequence weighting, position-specific gap penalties, and the choice of weight matrix.
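
The first two steps can be sketched in Python as follows. This is an illustration under simplifying assumptions, not ClustalW's implementation: pairwise scores come from a basic Needleman-Wunsch alignment with a linear gap penalty, the guide tree is built by plain average-linkage clustering as a stand-in for ClustalW's own tree-building method, and the third step (aligning profiles along the tree) is omitted. All function names and scoring values are hypothetical.

```python
from itertools import combinations

def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Step 1: global (Needleman-Wunsch) alignment score, linear gap penalty."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]

def distance(a, b):
    """Convert a score into a rough distance; identical sequences give 0."""
    return 1.0 - nw_score(a, b) / max(len(a), len(b))

def guide_tree(seqs):
    """Step 2: average-linkage clustering, returned as nested tuples of names."""
    names = list(seqs)
    leaf_dist = {
        frozenset(pair): distance(seqs[pair[0]], seqs[pair[1]])
        for pair in combinations(names, 2)
    }
    clusters = {name: [name] for name in names}  # tree node -> member leaves

    def average(x, y):
        pairs = [(p, q) for p in clusters[x] for q in clusters[y]]
        return sum(leaf_dist[frozenset(pq)] for pq in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two closest clusters into a new internal node.
        x, y = min(combinations(clusters, 2), key=lambda xy: average(*xy))
        clusters[(x, y)] = clusters.pop(x) + clusters.pop(y)
    return next(iter(clusters))

seqs = {"a": "ACGTACGT", "b": "ACGTACGA", "c": "TTTTGGGG"}
tree = guide_tree(seqs)  # the two most similar sequences, a and b, are joined first
```

In a full progressive aligner, the nested tuples would then be walked from the innermost pair outward, aligning sequences or groups of already-aligned sequences at each node.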

The input parameters are classified into Pairwise Alignment Parameters and Multiple Alignment Parameters. Pairwise alignment parameters control the speed and sensitivity of the initial alignments, whereas multiple alignment parameters control the gaps in the final multiple alignment. For further details, see the ClustalW documentation.

References

Corpet F (1988). Multiple sequence alignment with hierarchical clustering. Nucleic Acids Res, 16(22):10881-10890.

Thompson JD, Higgins DG, and Gibson TJ (1994). CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res, 22(22):4673-4680.
