Pattern Matching lets you search for short (<20 residues) nucleotide or peptide sequences, or for ambiguous/degenerate patterns. It uses the same Arabidopsis dataset as TAIR's BLAST and FASTA programs. If you are searching for a sequence longer than 20 bp or aa with no degenerate positions, please use BLAST or FASTA, which are much faster. Pattern Matching allows ambiguous characters, mismatches, insertions and deletions, but it does not perform alignments and so is not a replacement for BLAST or FASTA. Currently, the maximum number of hits retrieved is 250,000 and the minimum input length is 3 residues.
Version 1.1 Release Notes
Your comments and suggestions are appreciated: Send a Message to TAIR
Supported Pattern Syntax and Examples:
Search Type | Character | Meaning | Examples |
Peptide searches | IFVLWMAGCYPTSHEDQNKR | Exact match | DQGT |
Peptide searches | J | Any hydrophobic residue (IFVLWMAGCY) | AAAAAAJJ |
Peptide searches | O | Any hydrophilic residue (TSHEDQNKR) | TTTTTTOO |
Peptide searches | B | D or N | FLGB |
Peptide searches | Z | E or Q | GLFGZ |
Peptide searches | X or . | Any amino acid | DXXXNW..VSK |
Nucleotide searches | ACTGU | Exact match | ACCGGCGTAA |
Nucleotide searches | R | Any purine base (AG) | AAGGCCGGRRRR |
Nucleotide searches | Y | Any pyrimidine base (CT) | CCCATAYYGGYY |
Nucleotide searches | S | G or C | YGGTWCAMWTGTY |
Nucleotide searches | W | A or T | |
Nucleotide searches | M | A or C | |
Nucleotide searches | K | G or T | |
Nucleotide searches | V | A or C or G | CCGG...WHW.{3,5}HWH...CCGG |
Nucleotide searches | H | A or C or T | |
Nucleotide searches | D | A or G or T | |
Nucleotide searches | B | C or G or T | |
Nucleotide searches | N or X or . | Any base | ATGCTNNNNATCG |
All searches | [ ] | A subset of elements; [TC] = T or C | [WFY]XXXDN[RK][ST] |
All searches | [^ ] | An excluded subset of elements; [^TA] = not T or A (matches nucleotides C or G) | NDBB...[VILM]Z[DE]...[^PG] |
All searches | ( ) | Specifies a sub-pattern; (YPT) = YPT | (YDXXX){2,} |
All searches | {m,n} | {m} = exactly m times; {m,} = at least m times; {,m} = 0 to m times; {m,n} = between m and n times | L{3,5}X{5}DGZ |
All searches | < | Constrains pattern to N-terminus or 5' end | <MNTD (pep); <ATGX{6,10}RTTRTT (nuc) |
All searches | > | Constrains pattern to C-terminus or 3' end | sbgz> (pep); yattrtga> (nuc) |
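To make the syntax table concrete, here is a minimal Python sketch that translates a pattern written in this syntax into a standard regular expression and scans a single sequence with it. It is an illustration only, not TAIR's implementation: the names `to_regex` and `find_hits` are hypothetical, and it assumes DNA input (U is left as a literal), that only literal residues appear inside `[ ]`, that matching is case-insensitive, and that only exact matches are wanted (no mismatches, insertions or deletions, and no reverse-strand search).

```python
import re

# Degenerate codes from the syntax table, written as regex character classes.
NUCLEOTIDE_CODES = {
    "R": "[AG]", "Y": "[CT]", "S": "[GC]", "W": "[AT]",
    "M": "[AC]", "K": "[GT]", "V": "[ACG]", "H": "[ACT]",
    "D": "[AGT]", "B": "[CGT]",
    "N": "[ACGT]", "X": "[ACGT]", ".": "[ACGT]",
}
PEPTIDE_CODES = {
    "J": "[IFVLWMAGCY]", "O": "[TSHEDQNKR]",
    "B": "[DN]", "Z": "[EQ]",
    "X": "[A-Z]", ".": "[A-Z]",  # assumption: any uppercase letter is a residue
}


def to_regex(pattern, codes):
    """Translate a table-style pattern into a Python regex string."""
    pattern = pattern.upper()  # assumption: patterns are case-insensitive
    out = []
    in_set = False  # True while inside [ ... ]
    for i, ch in enumerate(pattern):
        if ch == "[":
            in_set = True
            out.append(ch)
        elif ch == "]":
            in_set = False
            out.append(ch)
        elif in_set:
            out.append(ch)  # copy set members (and a leading ^) verbatim
        elif ch == "<" and i == 0:
            out.append("^")  # anchor to N-terminus / 5' end
        elif ch == ">" and i == len(pattern) - 1:
            out.append("$")  # anchor to C-terminus / 3' end
        else:
            # Degenerate codes expand; ( ), {m,n} and literals pass through.
            out.append(codes.get(ch, ch))
    return "".join(out)


def find_hits(pattern, sequence, codes):
    """Return (start, matched_text) for each non-overlapping exact hit."""
    regex = re.compile(to_regex(pattern, codes))
    return [(m.start(), m.group()) for m in regex.finditer(sequence.upper())]


print(to_regex("<ATGX{6,10}RTTRTT", NUCLEOTIDE_CODES))
# ^ATG[ACGT]{6,10}[AG]TT[AG]TT
print(find_hits("ATGCTNNNNATCG", "GGATGCTACGTATCGAA", NUCLEOTIDE_CODES))
# [(2, 'ATGCTACGTATCG')]
print(find_hits("[WFY]XXXDN[RK][ST]", "MAWAAADNKSL", PEPTIDE_CODES))
# [(2, 'WAAADNKS')]
```

The `(  )` and `{m,n}` constructs need no translation because Python's `re` module uses the same notation, including the `{,m}` form. Note that `re.finditer` reports non-overlapping matches only, so hit counts from this sketch may differ from those of the actual tool.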