blast2seq
> blast2seq
Finds local alignments between two sequences, using BLAST
1 : blastn (nuc with nuc)
2 : blastp (prot with prot)
3 : blastx (nuc translated with prot)
4 : tblastn (prot with nuc translated)
5 : tblastx (nuc translated with nuc translated)
Select type of alignment [2]:
Input sequence: sw:tpa_human
Second sequence: sw:urok_human
Word size [3]:
E() value cutoff [10.0]:
Output file [tpa_human.blastp2seq]: tpa_urok.blastp2seq
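The interactive session above can also be written as one non-interactive command line. The Python sketch below only assembles and prints such a command; it does not run the program. It assumes the qualifier names shown in the help output on this page and assumes that the menu option is passed by its number (2 for blastp). The helper name `build_blast2seq_command` is hypothetical.

```python
import shlex


def build_blast2seq_command(asequence, bsequence, outfile,
                            program="2", wordsize=3, expect=10.0):
    """Assemble a blast2seq argument list from the documented qualifiers.

    Assumption: the -program menu accepts the option number as its value.
    """
    return [
        "blast2seq",
        "-program", program,          # 2 = blastp (prot with prot)
        "-asequence", asequence,      # first sequence (USA)
        "-bsequence", bsequence,      # second sequence (USA)
        "-wordsize", str(wordsize),   # documented default for non-blastn types
        "-expect", str(expect),       # E() value cutoff
        "-outfile", outfile,
        "-auto",                      # general qualifier: turn off prompts
    ]


argv = build_blast2seq_command("sw:tpa_human", "sw:urok_human",
                               "tpa_urok.blastp2seq")
command_line = shlex.join(argv)
print(command_line)
```

Passing the list `argv` to a process launcher avoids shell quoting problems; the joined string is only for display.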
Standard (Mandatory) qualifiers:
-program menu [2] Alignment type : nuc. or prot. (Values:
1 (blastn (nuc with nuc)); 2 (blastp (prot
with prot)); 3 (blastx (nuc translated with
prot)); 4 (tblastn (prot with nuc
translated)); 5 (tblastx (nuc translated
with nuc translated)))
[-asequence] sequence Sequence filename and optional format, or
reference (input USA)
[-bsequence] sequence Sequence filename and optional format, or
reference (input USA)
-wordsize integer [11 for blastn, 3 for other alignment types]
Word size (Any integer value)
-expect float [10.0] E() value = number of sequences with
same or higher bit score that you expect to
find by chance. BLAST lists alignments with
an E() value lower than the cutoff. (Number
0.000 or more)
[-outfile] outfile [*.blast2seq] Output file name
Additional (Optional) qualifiers (* if not always prompted):
* -strand selection [both] Strand to search. By default BLAST
searches both strands, but for blastn and
(t)blastx you can choose to search only the
top or bottom strand of the second
respectively the first sequence.
* -match integer [1] Nucleotide match reward (Integer 0 or
more)
* -mismatch integer [-3] Nucleotide mismatch penalty (Integer up
to 0)
* -matrix selection [3] Amino acid comparison matrix
* -gappenalty integer [5 for blastn, 11 for other alignment types]
Gap penalty (Integer 0 or more)
* -gaplength integer [2 for blastn, 1 for other alignment types]
Gap length penalty. BLAST subtracts from the
similarity score for each gap a penalty of
type <Gap penalty> + <Gap length penalty> *
n. Only certain combinations of matrix and
gap penalty are allowed, see on-line manual.
(Integer 0 or more)
Advanced (Unprompted) qualifiers:
-[no]gaps toggle [Y] Make gapped alignments (is default)
-[no]seqfilter boolean [Y] Filter low complexity segments out of
first sequence (is default)
-seqcoilfilter boolean Filter coiled coils out of first sequence
-seqsoftfilter boolean Use soft filtering, that is, filter only at
initial hit searching, not at hit extension
-effdbsize float [0.000] Effective databank size for
statistical calculations (Number 0.000 or
more)
Associated qualifiers:
"-asequence" associated qualifiers
-sbegin1 integer Start of the sequence to be used
-send1 integer End of the sequence to be used
-sreverse1 boolean Reverse (if DNA)
-sask1 boolean Ask for begin/end/reverse
-snucleotide1 boolean Sequence is nucleotide
-sprotein1 boolean Sequence is protein
-slower1 boolean Make lower case
-supper1 boolean Make upper case
-sformat1 string Input sequence format
-sdbname1 string Database name
-sid1 string Entryname
-ufo1 string UFO features
-fformat1 string Features format
-fopenfile1 string Features file name
"-bsequence" associated qualifiers
-sbegin2 integer Start of the sequence to be used
-send2 integer End of the sequence to be used
-sreverse2 boolean Reverse (if DNA)
-sask2 boolean Ask for begin/end/reverse
-snucleotide2 boolean Sequence is nucleotide
-sprotein2 boolean Sequence is protein
-slower2 boolean Make lower case
-supper2 boolean Make upper case
-sformat2 string Input sequence format
-sdbname2 string Database name
-sid2 string Entryname
-ufo2 string UFO features
-fformat2 string Features format
-fopenfile2 string Features file name
"-outfile" associated qualifiers
-odirectory3 string Output directory
General qualifiers:
-auto boolean Turn off prompts
-stdout boolean Write standard output
-filter boolean Read standard input, write standard output
-options boolean Prompt for standard and additional values
-debug boolean Write debug output to program.dbg
-verbose boolean Report some/full command line options
-help boolean Report command line options. More
information on associated and general
qualifiers can be found with -help -verbose
-warning boolean Report warnings
-error boolean Report errors
-fatal boolean Report fatal errors
-die boolean Report dying program messages
| Standard (Mandatory) qualifiers | Description | Allowed values | Default |
|---|---|---|---|
| -program | Alignment type : nuc. or prot. | 1 (blastn (nuc with nuc)); 2 (blastp (prot with prot)); 3 (blastx (nuc translated with prot)); 4 (tblastn (prot with nuc translated)); 5 (tblastx (nuc translated with nuc translated)) | 2 |
| [-asequence] (Parameter 1) | Sequence filename and optional format, or reference (input USA) | Readable sequence | Required |
| [-bsequence] (Parameter 2) | Sequence filename and optional format, or reference (input USA) | Readable sequence | Required |
| -wordsize | Word size | Any integer value | 11 for blastn, 3 for other alignment types |
| -expect | E() value = number of sequences with same or higher bit score that you expect to find by chance. BLAST lists alignments with an E() value lower than the cutoff. | Number 0.000 or more | 10.0 |
| [-outfile] (Parameter 3) | Output file name | Output file | <sequence>.<program>2seq |

| Additional (Optional) qualifiers | Description | Allowed values | Default |
|---|---|---|---|
| -strand | Strand to search. By default BLAST searches both strands, but for blastn and (t)blastx you can choose to search only the top or bottom strand of the second respectively the first sequence. | Choose from selection list of values | both |
| -match | Nucleotide match reward | Integer 0 or more | 1 |
| -mismatch | Nucleotide mismatch penalty | Integer up to 0 | -3 |
| -matrix | Amino acid comparison matrix | Choose from selection list of values | BLOSUM62 |
| -gappenalty | Gap penalty | Integer 0 or more | 5 for blastn, 11 for other alignment types |
| -gaplength | Gap length penalty. BLAST subtracts from the similarity score for each gap a penalty of type <Gap penalty> + <Gap length penalty> * n. Only certain combinations of matrix and gap penalty are allowed, see on-line manual. | Integer 0 or more | 2 for blastn, 1 for other alignment types |

| Advanced (Unprompted) qualifiers | Description | Allowed values | Default |
|---|---|---|---|
| -[no]gaps | Make gapped alignments (is default) | Toggle value Yes/No | Yes |
| -[no]seqfilter | Filter low complexity segments out of first sequence (is default) | Boolean value Yes/No | Yes |
| -seqcoilfilter | Filter coiled coils out of first sequence | Boolean value Yes/No | No |
| -seqsoftfilter | Use soft filtering, that is, filter only at initial hit searching, not at hit extension | Boolean value Yes/No | No |
| -effdbsize | Effective databank size for statistical calculations | Number 0.000 or more | 0.000 |
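The gap cost described for -gaplength can be illustrated with a few lines of Python. This is a minimal sketch of the formula gap penalty + gap length penalty * n, assuming n is the number of positions in the gap; the function name `gap_cost` is hypothetical and is not part of blast2seq.

```python
def gap_cost(gap_penalty, gap_length_penalty, n):
    """Score subtracted for one gap of n positions (n assumed >= 1)."""
    if n < 1:
        raise ValueError("a gap must span at least one position")
    return gap_penalty + gap_length_penalty * n


# Documented defaults: 11 / 1 for protein alignments, 5 / 2 for blastn.
for n in (1, 3, 10):
    print(n, gap_cost(11, 1, n), gap_cost(5, 2, n))
```

With the protein defaults a gap of three residues therefore costs 11 + 1 * 3 = 14, while with the blastn defaults a gap of three bases costs 5 + 2 * 3 = 11.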
Query= TPA_HUMAN P00750 Tissue-type plasminogen activator precursor
(EC 3.4.21.68) (tPA) (t- PA) (t-plasminogen activator) (Alteplase)
(Reteplase) [Contains: Tissue-type plasminogen activator chain A;
Tissue-type plasminogen activator chain B].
(562 letters)
>UROK_HUMAN P00749 Urokinase-type plasminogen activator precursor
(EC 3.4.21.73) (uPA) (U-plasminogen activator)
[Contains: Urokinase-type plasminogen activator long
chain A; Urokinase-type plasminogen activator short
chain A; Urokinase-type plasminogen activator chain B].
Length = 431
Score = 299 bits (766), Expect = 1e-85
Identities = 162/389 (41%), Positives = 214/389 (55%), Gaps = 30/389 (7%)
Query: 189 WCYVFKAGKYSSEFCSTPACSEGNSDCYFGNGSAYRGTHSLTESGASCLPWNSMILIGKV 248
WC K K+ + C + + CY GNG YRG S G CLPWNS ++ +
Sbjct: 50 WCNCPK--KFGGQHCEI----DKSKTCYEGNGHFYRGKASTDTMGRPCLPWNSATVLQQT 103
Query: 249 YTAQNPSAQALGLGKHNYCRNPDGDAKPWCHVLKNRRLTWEYCDVPSCS----------- 297
Y A A LGLGKHNYCRNPD +PWC+V + + C V C+
Sbjct: 104 YHAHRSDALQLGLGKHNYCRNPDNRRRPWCYVQVGLKPLVQECMVHDCADGKKPSSPPEE 163
Query: 298 ---TCGLRQYSQPQFRIKGGLFADIASHPWQAAIFAKHRRSPGERFLCGGILISSCWILS 354
CG ++ +P+F+I GG F I + PW AAI+ +H R ++CGG L+S CW++S
Sbjct: 164 LKFQCG-QKTLRPRFKIIGGEFTTIENQPWFAAIYRRH-RGGSVTYVCGGSLMSPCWVIS 221
Query: 355 AAHCFQERFPPHHLTVILGRTYRVVPGEEEQKFEVEKYIVHKEFDDDT--YDNDIALLQL 412
A HCF + V LGR+ + E KFEVE I+HK++ DT + NDIALL++
Sbjct: 222 ATHCFIDYPKKEDYIVYLGRSRLNSNTQGEMKFEVENLILHKDYSADTLAHHNDIALLKI 281
Query: 413 KSDSSRCAQESSVVRTVCLPPADLQLPDWTECELSGYGKHEALSPFYSERLKEAHVRLYP 472
+S RCAQ S ++T+CLP T CE++G+GK + Y E+LK V+L
Sbjct: 282 RSKEGRCAQPSRTIQTICLPSMYNDPQFGTSCEITGFGKENSTDYLYPEQLKMTVVKLIS 341
Query: 473 SSRCTSQHLLNRTVTDNMLCAGDTRSGGPQANLHDACQGDSGGPLVCLNDGRMTLVGIIS 532
C H VT MLCA D PQ D+CQGDSGGPLVC GRMTL GI+S
Sbjct: 342 HRECQQPHYYGSEVTTKMLCAAD-----PQWKT-DSCQGDSGGPLVCSLQGRMTLTGIVS 395
Query: 533 WGLGCGQKDVPGVYTKVTNYLDWIRDNMR 561
WG GC KD PGVYT+V+++L WIR + +
Sbjct: 396 WGRGCALKDKPGVYTRVSHFLPWIRSHTK 424
Score = 130 bits (327), Expect = 1e-34
Identities = 64/141 (45%), Positives = 78/141 (55%), Gaps = 5/141 (3%)
Query: 72 NSGRAQCHSVPVKSCSEPRCFNGGTCQQALYFSDFV-CQCPEGFAGKCCEIDTRATCYED 130
+ G + H VP S C NGGTC YFS+ C CP+ F G+ CEID TCYE
Sbjct: 18 SKGSNELHQVP----SNCDCLNGGTCVSNKYFSNIHWCNCPKKFGGQHCEIDKSKTCYEG 73
Query: 131 QGISYRGTWSTAESGAECTNWNSSALAQKPYSGRRPDAIRLGLGNHNYCRNPDRDSKPWC 190
G YRG ST G C WNS+ + Q+ Y R DA++LGLG HNYCRNPD +PWC
Sbjct: 74 NGHFYRGKASTDTMGRPCLPWNSATVLQQTYHAHRSDALQLGLGKHNYCRNPDNRRRPWC 133
Query: 191 YVFKAGKYSSEFCSTPACSEG 211
YV K + C C++G
Sbjct: 134 YVQVGLKPLVQECMVHDCADG 154
Score = 14.6 bits (26), Expect = 8.3
Identities = 5/14 (35%), Positives = 7/14 (50%)
Query: 291 CDVPSCSTCGLRQY 304
CD + TC +Y
Sbjct: 31 CDCLNGGTCVSNKY 44
Lambda K H
0.321 0.136 0.453
Gapped
Lambda K H
0.267 0.0410 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 899
Number of extensions: 47
Number of successful extensions: 16
Number of sequences better than 10.0: 1
Number of HSP's gapped: 3
Number of HSP's successfully gapped: 3
Length of query: 562
Length of database: 431
Length adjustment: 34
Effective length of query: 528
Effective length of database: 397
Effective search space: 209616
Effective search space used: 209616
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 26 (14.9 bits)
S2: 26 (14.6 bits)
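The figures in this statistics block are related by simple arithmetic. The Python sketch below recomputes the effective search space, and then the bit score and E() value of the third alignment (raw score 26), from the gapped Lambda and K printed above. It assumes the standard Karlin-Altschul relations, bits = (Lambda * S - ln K) / ln 2 and E = search space * 2^(-bits), which are not stated on this page; the function names are hypothetical.

```python
import math

# Values copied from the example output above.
LAMBDA_GAPPED = 0.267
K_GAPPED = 0.0410
QUERY_LENGTH = 562
SUBJECT_LENGTH = 431
LENGTH_ADJUSTMENT = 34


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits (assumed Karlin-Altschul form)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(bits, search_space):
    """Expected number of chance alignments with at least this bit score."""
    return search_space * 2.0 ** (-bits)


effective_query = QUERY_LENGTH - LENGTH_ADJUSTMENT      # 528
effective_subject = SUBJECT_LENGTH - LENGTH_ADJUSTMENT  # 397
search_space = effective_query * effective_subject      # 209616

bits = bit_score(26, LAMBDA_GAPPED, K_GAPPED)
e_value = expect_value(bits, search_space)
print(search_space, round(bits, 1), round(e_value, 1))
```

Under these assumptions the script reproduces the reported "Score = 14.6 bits (26), Expect = 8.3". Because the printed Lambda and K are rounded, results for the higher-scoring alignments may differ slightly from the values in the report.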
| scoring matrix | gap penalty | gap length penalty | recommended |
|---|---|---|---|
| BLOSUM90 | 6 | 2 | |
| | 7 | 2 | |
| | 8 | 2 | |
| | 9 | 1 | |
| | | 2 | |
| | 10 | 1 | * |
| | 11 | 1 | |
| BLOSUM80 | 6 | 2 | |
| | 7 | 2 | |
| | 8 | 2 | |
| | 9 | 1 | |
| | | 2 | |
| | 10 | 1 | * |
| | 11 | 1 | |
| | 13 | 2 | |
| | 25 | 2 | |
| BLOSUM62 | 6 | 2 | |
| | 7 | 2 | |
| | 8 | 2 | |
| | 9 | 1 | |
| | | 2 | |
| | 10 | 1 | |
| | | 2 | |
| | 11 | 1 | * (the default) |
| | | 2 | |
| | 12 | 1 | |
| | 13 | 1 | |
| BLOSUM50 | 9 | 3 | |
| | 10 | 3 | |
| | 11 | 3 | |
| | 12 | 2 | |
| | | 3 | |
| | 13 | 2 | * |
| | | 3 | |
| | 14 | 2 | |
| | 15 | 1 | |
| | | 2 | |
| | 16 | 1 | |
| | | 2 | |
| | 17 | 1 | |
| | 18 | 1 | |
| | | 1 | |
| BLOSUM45 | 10 | 3 | |
| | 11 | 3 | |
| | 12 | 2 | |
| | | 3 | |
| | 13 | 2 | |
| | | 3 | |
| | 14 | 2 | * |
| | 15 | 2 | |
| | 16 | 1 | |
| | | 2 | |
| | 17 | 1 | |
| | 18 | 1 | |
| | 19 | 1 | |
| PAM30 | 5 | 2 | |
| | 6 | 2 | |
| | 7 | 2 | |
| | 8 | 1 | |
| | 9 | 1 | * |
| | 10 | 1 | |
| PAM70 | 6 | 2 | |
| | 7 | 2 | |
| | 8 | 2 | |
| | 9 | 1 | |
| | 10 | 1 | * |
| | 11 | 1 | |
| PAM250 | 11 | 3 | |
| | 12 | 3 | |
| | 13 | 2 | |
| | | 3 | |
| | 14 | 2 | |
| | | 3 | |
| | 15 | 2 | * |
| | | 3 | |
| | 16 | 2 | |
| | 17 | 1 | |
| | | 2 | |
| | 18 | 1 | |
| | 19 | 1 | |
| | 20 | 1 | |
| | 21 | 1 | |

Similarly, blastn supports only certain combinations of match reward, mismatch penalty and gap penalty. If both the gap penalty and the gap length penalty are above the maximum in the following table, blastn will shift to statistics for gapless alignments.

| match reward/mismatch penalty | gap penalty | gap length penalty |
|---|---|---|
| 2 / -7 | 0 | 4 |
| | 2 | 2 |
| | | 4 |
| | 4 | 2 |
| | | 4 |
| 1 / -3 | 0 | 2 |
| | 1 | 1 |
| | | 2 |
| | 2 | 1 |
| | | 2 |
| 2 / -5 | 0 | 4 |
| | 2 | 2 |
| | | 4 |
| | 4 | 2 |
| | | 4 |
| 1 / -2 | 0 | 2 |
| | 1 | 1 |
| | | 2 |
| | 2 | 1 |
| | | 2 |
| | 3 | 1 |
| 2 / -3 | 0 | 4 |
| | 2 | 2 |
| | | 4 |
| | 3 | 3 |
| | 4 | 2 |
| | | 4 |
| | 5 | 2 |
| | 6 | 2 |
| | | 4 |
| 4 / -5 | 3 | 5 |
| | 4 | 5 |
| | 5 | 5 |
| | 6 | 5 |
| | 12 | 8 |
| 1 / -1 | 0 | 2 |
| | 1 | 2 |
| | 2 | 1 |
| | | 2 |
| | 3 | 1 |
| | | 2 |
| | 4 | 1 |
| | | 2 |
| 5 / -4 | 8 | 6 |
| | 10 | 6 |
| | 25 | 10 |

| Program name | Description |
|---|---|
| blastz | Nonintersecting best local alignments, makes LAJ file |
| lfasta | Finds local alignments between two sequences, using fastA |
| matcher | Finds the best local alignments between two sequences |
| seqmatchall | All-against-all comparison of a set of sequences |
| sim_lav | Nonintersecting best local alignments, makes LALNVIEW file |
| supermatcher | Match large sequences against one or more other sequences |
| water | Smith-Waterman local alignment |
| wordfinder | Match large sequences against one or more other sequences |
| wordmatch | Finds all exact matches of a given size between 2 sequences |
| blast | BLAST search of query sequence(s) against sequence search set |
| phiblast | Search protein sequence set combining matching of pattern with local alignment of a query sequence surrounding the match |
| psiblast | Iterative BLAST search with generation of profile of protein sequence against protein sequence set |
| makeblastdb | Make BLAST format sequence database |
The program bl2seq itself was written by a team of developers working at the National Center for Biotechnology Information, Bethesda MD, U.S.A., comprising among others Stephen Altschul, David Lipman, Tom Madden, Alex Schaffer, Sergei Shavirin and Jinghui Zhang.
You can contact the BLAST development team at blast-help@ncbi.nlm.nih.gov