Aligning Multiple Sequences and Generating a Consensus
Multiple DNA or protein sequences can be aligned from within Expression using either the MultAlin or the ClustalW algorithm. The alignments are computed on remote servers over the internet, which typically return results faster than local desktop PC alignment tools. This section introduces the interface and gives a brief overview of the two algorithms.
Performing an Alignment
Open the protein or DNA sequences you want to align, then select the desired algorithm under Multiple Alignment in the Internet menu. This opens the corresponding form in the Internet Tool Window. If both DNA and protein sequences are open, only sequences of the same type as the currently selected sequence are used. After adjusting the parameters, click the Submit button to run the algorithm. The time until results are returned depends on the speed of your internet connection and on the number and length of the sequences being aligned.
Multiple Alignment Algorithms
MultAlin is a progressive multiple sequence alignment program that aligns several biological sequences using hierarchical clustering (Corpet, 1988). Its parameters and their functions are outlined in the table below:
|Output Order|Selects whether the aligned sequences are shown in input order or in aligned order|
|Symbol Comparison Table|Sets the matrix used to score matches in the alignment|
|Gap Penalty Value|Sets the penalty for creating a gap in the alignment|
|Gap Length Value|Sets the penalty for extending an existing gap|
|Extremity Gap Penalty|Penalizes end gaps the same as internal gaps|
|One Iteration Only|Performs the alignment in a single iteration, which is faster to compute but less accurate|
|Unweighted Sequences|Gives all sequences an equal weighting (1.0)|
|Scoring Method|Sets how the pairwise scores are computed|
For further details, see the MultAlin documentation.
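The two gap parameters in the table are commonly combined as an affine gap cost, where opening a gap costs a fixed amount and each gapped position adds a smaller amount. The Python sketch below illustrates that idea together with the effect of the Extremity Gap Penalty option. The function names and the exact formula (penalty plus length value times gap length) are assumptions made for illustration; MultAlin's own scoring may differ in detail.

```python
def gap_cost(length, gap_penalty, gap_length_value):
    """Affine cost of one gap spanning `length` positions.

    Assumed formula (not taken from the MultAlin source):
    gap_penalty + gap_length_value * length.
    """
    if length <= 0:
        return 0
    return gap_penalty + gap_length_value * length


def row_gap_cost(aligned, gap_penalty, gap_length_value, penalize_ends=True):
    """Sum the gap costs of one aligned row, where '-' marks a gapped position.

    When penalize_ends is False, gaps touching either end of the row are free,
    mimicking an alignment run with the end-gap penalty switched off.
    """
    total = 0
    n = len(aligned)
    i = 0
    while i < n:
        if aligned[i] != "-":
            i += 1
            continue
        # Find the end of this run of gap characters.
        j = i
        while j < n and aligned[j] == "-":
            j += 1
        is_end_gap = (i == 0) or (j == n)
        if penalize_ends or not is_end_gap:
            total += gap_cost(j - i, gap_penalty, gap_length_value)
        i = j
    return total
```

With a Gap Penalty Value of 10 and a Gap Length Value of 1, the row `--AC-GT---` costs 36 under this scheme when end gaps are penalized and 11 when they are not.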
ClustalW is a progressive multiple sequence alignment program (Thompson et al., 1994). It proceeds in three steps. First, all sequences are aligned pairwise. Second, a dendrogram is constructed that describes the approximate grouping of the sequences by similarity. Third, the multiple alignment is built using the dendrogram as a guide. The algorithm takes into account sequence weighting, position-specific gap penalties, and the choice of weight matrix.
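The three steps can be outlined in a few lines of Python. This is a rough sketch under simplifying assumptions, not ClustalW itself: pairwise similarity comes from the standard library's `difflib.SequenceMatcher` rather than a real pairwise alignment, the tree is built by simple average-linkage clustering, and the third step only reports the order in which groups would be merged. All function names are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def pairwise_distances(seqs):
    """Step 1 stand-in: distance = 1 - similarity ratio for every pair of names."""
    dist = {}
    for a, b in combinations(sorted(seqs), 2):
        ratio = SequenceMatcher(None, seqs[a], seqs[b], autojunk=False).ratio()
        dist[frozenset((a, b))] = 1.0 - ratio
    return dist


def leaves(node):
    """Return the sequence names under a tree node (a name or a 2-tuple)."""
    if isinstance(node, str):
        return [node]
    return leaves(node[0]) + leaves(node[1])


def build_guide_tree(seqs):
    """Step 2 stand-in: average-linkage clustering into a nested-tuple tree."""
    dist = pairwise_distances(seqs)
    clusters = sorted(seqs)

    def average_distance(c1, c2):
        pairs = [(x, y) for x in leaves(c1) for y in leaves(c2)]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Join the two closest clusters into one new internal node.
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: average_distance(*pair))
        clusters = [c for c in clusters if c not in (c1, c2)]
        clusters.append((c1, c2))
    return clusters[0]


def merge_order(tree):
    """Step 3 outline: the order in which groups would be aligned to each other."""
    if isinstance(tree, str):
        return []
    left, right = tree
    return merge_order(left) + merge_order(right) + [(leaves(left), leaves(right))]


if __name__ == "__main__":
    example = {"a": "ACGTACGT", "b": "ACGTACGA", "c": "TTTTGGGG"}
    tree = build_guide_tree(example)
    print(tree)
    print(merge_order(tree))
```

In the example, the two near-identical sequences are joined first and the dissimilar one is added last, which is the behavior a guide tree is meant to produce: the most similar sequences are aligned before more distant ones.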
The input parameters are classified into Pairwise Alignment Parameters and Multiple Alignment Parameters. Pairwise alignment parameters control the speed and sensitivity of the initial alignments, whereas multiple alignment parameters control the gaps in the final multiple alignment. For further details, see the ClustalW documentation.
Corpet F (1988). Multiple sequence alignment with hierarchical clustering. Nucleic Acids Res, 16(22):10881-10890.
Thompson JD, Higgins DG, and Gibson TJ (1994). CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res, 22(22):4673-4680.