
Finding Patterns and Motifs in DNA or Protein Sequences

Using the powerful Pattern Finder tool, it is possible to rapidly find not only sequences identical to your query, but also near matches. In fact, the Pattern Finder is flexible enough that you can use it to identify similar genes in a genome or genome section. Additionally, the tool can automatically search all six reading frames of a DNA sequence for an amino acid query.
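
To illustrate what a six-frame search for an amino acid query involves, here is a minimal Python sketch. It is not the Pattern Finder's own code: the function names (reverse_complement, translate, six_frame_search) are hypothetical, the standard genetic code is assumed, and only exact peptide matches are reported.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an unambiguous DNA string."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_search(dna, peptide):
    """Return (strand, frame, protein_position) for each exact peptide hit."""
    dna = dna.upper()
    hits = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            protein = translate(seq[frame:])
            pos = protein.find(peptide)
            while pos != -1:
                hits.append((strand, frame + 1, pos))
                pos = protein.find(peptide, pos + 1)
    return hits
```

For example, six_frame_search("CATGAAATGGC", "MKW") finds the peptide in the second forward frame. Frames are numbered 1 to 3 on each strand here, which is a convention chosen for the sketch.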

Using the Pattern Finder

To search for a pattern, open the Pattern Finder tool and enter your query into the box. If your target sequence is short enough for the speed of your computer (less than 50,000 bp on a Pentium III 450), the Find button is automatically pushed in, and the search results are calculated in real time and displayed instantly. For longer target sequences (or slower computers), you must push the Find button to execute the search. The match slider sets the threshold percentage match, i.e. the percentage of the query sequence that must match the target. When searching for an amino acid sequence, a similarity matrix (BLOSUM62) is employed to take into account the degree of chemical similarity between the different amino acids.
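
The idea of a threshold percentage match can be sketched in a few lines of Python. This is only an illustration under simple assumptions: it slides the query along a DNA target without gaps and scores plain character identity, whereas the tool itself uses BLOSUM62 for amino acid queries. The function name find_near_matches and its parameters are hypothetical.

```python
def find_near_matches(query, target, threshold=80.0):
    """Report every ungapped window of the target whose percent
    identity with the query is at least `threshold`."""
    query, target = query.upper(), target.upper()
    n = len(query)
    hits = []
    if n == 0:
        return hits
    for start in range(len(target) - n + 1):
        window = target[start:start + n]
        # Count positions where query and target characters are identical.
        same = sum(1 for q, t in zip(query, window) if q == t)
        percent = 100.0 * same / n
        if percent >= threshold:
            hits.append((start, window, percent))
    # Best matches first, mirroring the default sort by percentage match.
    hits.sort(key=lambda hit: -hit[2])
    return hits
```

Calling find_near_matches("GTATC", "AAGTATCGGGTTTCAA", 80) returns the exact match at position 2 and the near match GTTTC at position 9 (positions are zero-based). Raising the threshold to 100 leaves only the exact match.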

The search results are sorted by percentage match by default. As with the other tools in Expression, you can sort the results by any other field by clicking on its column heading. Clicking on a record in the search results automatically highlights the corresponding region in the Sequence Editor and the Sequence Map, and aligns the sequence directly under your query.

Note that the Pattern Finder does not provide any specific DNA or protein motifs and signatures. To examine a protein sequence for known protein signatures, use the ProScan tool on the Internet menu. See the Using ProScan to Identify Protein Signatures tutorial for further details.

Searching Sequences with Ambiguous Characters

The Pattern Finder tool can analyse sequences that contain ambiguous characters. An ambiguous character in the query matches any base it represents, but an ambiguous character in the target is not treated as a match for a specific base in the query. For example, the query sequence GTATC will not match GTRTC (where R = A or G) in the target sequence; however, the query GTRTC will match GTRTC, GTATC, and GTGTC. For more details about how Expression handles ambiguity in sequences, see the Reverse Translation and Handling Ambiguous Characters tutorial.
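
One rule that reproduces the GTATC and GTRTC example above is to accept a position only when every base the target character can stand for is also allowed by the query character. The Python sketch below assumes that rule and the standard IUPAC nucleotide codes. The names char_matches and pattern_matches are hypothetical, and the tool's actual algorithm may differ.

```python
# Standard IUPAC nucleotide codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def char_matches(query_char, target_char):
    """True when the target's possible bases are a subset of the query's."""
    return set(IUPAC[target_char]) <= set(IUPAC[query_char])


def pattern_matches(query, window):
    """Compare a query with an equal-length stretch of the target."""
    query, window = query.upper(), window.upper()
    return len(query) == len(window) and all(
        char_matches(q, t) for q, t in zip(query, window)
    )
```

With these definitions, pattern_matches("GTATC", "GTRTC") is False, while pattern_matches("GTRTC", w) is True for w equal to "GTRTC", "GTATC", or "GTGTC".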

Related Articles

Reverse Translation and Handling Ambiguous Characters

Using ProScan to Identify Protein Signatures
