lfasta
By default, the lfasta wrapper produces text output. To obtain a graphic in PostScript format instead (produced by the program plfasta), use the option -psplot. If you are working under X Window, the PostScript file is opened automatically with the program ghostview; the qualifier -noghostview turns this off.
> lfasta
Finds local alignments between two sequences, using fastA
Input sequence: embl:x00066
Second sequence: embl:k00153
Word (ktup) size [6]: 4
Output file [x00066.lfasta]:
Second example, with graphical output:
> lfasta -psplot
Finds local alignments between two sequences, using fastA
Input sequence: embl:x00066
Second sequence: embl:k00153
Word (ktup) size [6]: 4
Output file [x00066.lfasta.ps]:
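The two interactive sessions can probably also be run without prompts by giving the answers as qualifiers. The sketch below only uses qualifier names listed on this page (-asequence, -bsequence, -wordsize, -outfile, -psplot, -noghostview, -auto); that the wrapper accepts them in exactly this combination is an assumption, so check it against `lfasta -help` on your installation. The commands are printed first and executed only if lfasta is found on the PATH.

```shell
#!/bin/sh
# Sketch: non-interactive equivalents of the two example sessions.
# Assumption: the qualifiers documented on this page can be combined
# on one command line, and -auto suppresses the prompts.
aseq="embl:x00066"
bseq="embl:k00153"

# Text output (first example).
text_cmd="lfasta -asequence $aseq -bsequence $bseq -wordsize 4 -outfile x00066.lfasta -auto"

# PostScript output (second example), without launching ghostview.
plot_cmd="lfasta -asequence $aseq -bsequence $bseq -wordsize 4 -psplot -noghostview -outfile x00066.lfasta.ps -auto"

echo "$text_cmd"
echo "$plot_cmd"

# Run the commands only if lfasta is actually installed.
if command -v lfasta >/dev/null 2>&1; then
    $text_cmd
    $plot_cmd
fi
```

Printing the command line before running it makes it easy to paste into a batch script once it has been verified by hand.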
Standard (Mandatory) qualifiers:
[-asequence] sequence Sequence filename and optional format, or reference (input USA)
[-bsequence] sequence Sequence filename and optional format, or reference (input USA)
-wordsize integer [2 for protein, 6 for nucleic] Word (ktup) size (Integer 1 or more)
[-outfile] outfile [*.lfasta] Output file name
Additional (Optional) qualifiers (* if not always prompted):
* -matrix menu [BL50] Amino acid comparison matrix (Values: BL50 (BLOSUM50); BL62 (BLOSUM62); 250 (PAM250))
-gapopen integer [12 for protein, 16 for nucleic] Gap opening penalty (Integer 0 or more)
-gapextend integer [2 for protein, 4 for nucleic] Gap extension penalty. fastA subtracts from the similarity score for each gap a penalty of type <Gap opening penalty> + <Gap extension penalty> * (n - 1) (Integer 0 or more)
* -format menu [0] Alignment format (Values: 0 (default); 1 (x = conservative replacements, X = non-conservative substitutions); 2 (show only residues in sequence 2 that differ from sequence 1); 10 (write alignments in parsable format))
Advanced (Unprompted) qualifiers:
-psplot boolean Make PostScript file with dotplot-like graphic instead of writing alignment
-linesize integer [60] Number of residues per line of the alignment (Integer from 10 to 200)
-[no]ghostview boolean [Y] Open PostScript file with Ghostview
Associated qualifiers:
"-asequence" associated qualifiers
-sbegin1 integer Start of the sequence to be used
-send1 integer End of the sequence to be used
-sreverse1 boolean Reverse (if DNA)
-sask1 boolean Ask for begin/end/reverse
-snucleotide1 boolean Sequence is nucleotide
-sprotein1 boolean Sequence is protein
-slower1 boolean Make lower case
-supper1 boolean Make upper case
-sformat1 string Input sequence format
-sdbname1 string Database name
-sid1 string Entryname
-ufo1 string UFO features
-fformat1 string Features format
-fopenfile1 string Features file name
"-bsequence" associated qualifiers
-sbegin2 integer Start of the sequence to be used
-send2 integer End of the sequence to be used
-sreverse2 boolean Reverse (if DNA)
-sask2 boolean Ask for begin/end/reverse
-snucleotide2 boolean Sequence is nucleotide
-sprotein2 boolean Sequence is protein
-slower2 boolean Make lower case
-supper2 boolean Make upper case
-sformat2 string Input sequence format
-sdbname2 string Database name
-sid2 string Entryname
-ufo2 string UFO features
-fformat2 string Features format
-fopenfile2 string Features file name
"-outfile" associated qualifiers
-odirectory3 string Output directory
General qualifiers:
-auto boolean Turn off prompts
-stdout boolean Write standard output
-filter boolean Read standard input, write standard output
-options boolean Prompt for standard and additional values
-debug boolean Write debug output to program.dbg
-verbose boolean Report some/full command line options
-help boolean Report command line options. More information on associated and general qualifiers can be found with -help -verbose
-warning boolean Report warnings
-error boolean Report errors
-fatal boolean Report fatal errors
-die boolean Report dying program messages
| Standard (Mandatory) qualifiers | Description | Allowed values | Default |
|---|---|---|---|
| [-asequence] (Parameter 1) | Sequence filename and optional format, or reference (input USA) | Readable sequence | Required |
| [-bsequence] (Parameter 2) | Sequence filename and optional format, or reference (input USA) | Readable sequence | Required |
| -wordsize | Word (ktup) size | Integer 1 or more | 2 for protein, 6 for nucleic |
| [-outfile] (Parameter 3) | Output file name | Output file | <sequence>.lfasta or <sequence>.lfasta.ps |

| Additional (Optional) qualifiers | Description | Allowed values | Default |
|---|---|---|---|
| -matrix | Amino acid comparison matrix | BL50 (BLOSUM50); BL62 (BLOSUM62); 250 (PAM250) | BL50 |
| -gapopen | Gap opening penalty | Integer 0 or more | 12 for protein, 16 for nucleic |
| -gapextend | Gap extension penalty. fastA subtracts from the similarity score for each gap a penalty of type <Gap opening penalty> + <Gap extension penalty> * (n - 1) | Integer 0 or more | 2 for protein, 4 for nucleic |
| -format | Alignment format | 0 (default); 1 (x = conservative replacements, X = non-conservative substitutions); 2 (show only residues in sequence 2 that differ from sequence 1); 10 (write alignments in parsable format) | 0 |

| Advanced (Unprompted) qualifiers | Description | Allowed values | Default |
|---|---|---|---|
| -psplot | Make PostScript file with dotplot-like graphic instead of writing alignment | Boolean value Yes/No | No |
| -linesize | Number of residues per line of the alignment | Integer from 10 to 200 | 60 |
| -[no]ghostview | Open PostScript file with Ghostview | Boolean value Yes/No | Yes |
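As a worked illustration of the gap penalty formula quoted in the -gapextend description, the following sketch evaluates <Gap opening penalty> + <Gap extension penalty> * (n - 1) for a gap of n residues. The function name `gap_penalty` is hypothetical and not part of lfasta; the numbers are the documented defaults.

```shell
# gap_penalty OPEN EXTEND N
# Penalty for a single gap of N residues, following the formula
# quoted in the -gapextend description: OPEN + EXTEND * (N - 1).
# (Hypothetical helper for illustration; not part of lfasta.)
gap_penalty() {
    echo $(( $1 + $2 * ($3 - 1) ))
}

gap_penalty 16 4 3   # nucleic defaults, gap of 3 residues: prints 24
gap_penalty 12 2 3   # protein defaults, gap of 3 residues: prints 16
gap_penalty 16 4 1   # a gap of 1 residue costs only the opening penalty: prints 16
```

With this form of penalty, the first gapped residue costs the opening penalty and every further residue in the same gap costs the extension penalty.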
Note that a local alignment of two identical sequences produces "mirror-image" alignments as well as a full identity alignment. lfasta reports only one half of the local alignments in the text output; in the graphical output (option -psplot) it draws all alignments, including the full diagonal.
LFASTA compares two sequences
v2.1u00 Mar, 2001
Please cite:
W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448
searching embl-id:K00153 library
Comparison of:
(A) embl-id:X00066 X00066 X00066.1 Salmonella typhimurium hisR gene - 972 nt
(B) embl-id:K00153 K00153 K00153.1 E.coli Arg-tRNA-2. - 77 nt
using matrix file DNA
74.026% identity in 77 nt overlap; init: 203, opt: 205
310 320 330 340 350 360
X00066 GCGCCCGTAGCTCAGCTGGATAGAGCGCTGCCCTCCGGAGGCAGAGGTCTCAGGTTCGAA
:: X:::::::::::::::::::: :: :: :: : :: :::: :::::::::
K00153 GCATCCGTAGCTCAGCTGGATAGAGTACTCGGCTGCGAACCGAGCGGTCGGAGGTTCGAA
10 20 30 40 50 60
370 380
X00066 TCCTGTCGGGCGTACCA
:::: ::: : :::X
K00153 TCCTCCCGGATGCACCA
70
----------
72.131% identity in 61 nt overlap; init: 152, opt: 152
670 680 690 700 710 720
X00066 GTAGCGCAGCTTGGTAGCGCAACTGGTTTGGGACCAGTGGGTCGGAGGTTCGAATCCTCT
X:::: ::::: : ::: : : :: : : ::: ::::::::::::::::::::
K00153 GTAGCTCAGCTGGATAGAGTACTCGGCTGCGAACCGAGCGGTCGGAGGTTCGAATCCTCC
10 20 30 40 50 60
X00066 C
X
K00153 C
----------
73.214% identity in 56 nt overlap; init: 67, opt: 133
450 460 470 480 490
X00066 TAGCTCAGTTGG-TAGAGCCCTGGATTGTGATTCCAGTTGTCGTGGGTTCGAATCC
:::::::: ::: ::::: :: : :: :: : :: X::: ::::::::::X
K00153 TAGCTCAGCTGGATAGAGTACTCGGCTGCGAACCGAGCGGTCGGAGGTTCGAATCC
10 20 30 40 50 60
----------
71.429% identity in 28 nt overlap; init: 62, opt: 68
600 610 620
X00066 GGGGGTTCAAGTCCCCCCCCTCGCACCA
:: X:::: : ::: ::: :::::X
K00153 GGAGGTTCGAATCCTCCCGGATGCACCA
50 60 70
----------
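Each alignment in the text output above starts with a summary line giving the identity, the overlap length, and the init and opt scores. A minimal sketch for collecting these lines into a tab-separated table follows. It assumes the summary lines keep exactly the layout shown in this example (other alignment formats, such as -format 10, will differ), and the function name `lfasta_summary` is hypothetical.

```shell
# lfasta_summary: read lfasta text output on standard input and print
# one tab-separated line per alignment: identity (%), overlap, init, opt.
# Assumption: summary lines look like
#   "74.026% identity in 77 nt overlap; init: 203, opt: 205"
lfasta_summary() {
    awk '/% identity in/ {
        id = $1;          sub(/%/, "", id)    # strip the percent sign
        init = $(NF - 2); sub(/,/, "", init)  # strip the trailing comma
        print id "\t" $4 "\t" init "\t" $NF
    }'
}

# Demo on the four summary lines from the example output.
lfasta_summary <<'EOF'
74.026% identity in 77 nt overlap; init: 203, opt: 205
72.131% identity in 61 nt overlap; init: 152, opt: 152
73.214% identity in 56 nt overlap; init: 67, opt: 133
71.429% identity in 28 nt overlap; init: 62, opt: 68
EOF
```

On a real result file the same function would be called as `lfasta_summary < x00066.lfasta`.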
| Program name | Description |
|---|---|
| blast2seq | Finds local alignments between two sequences, using BLAST |
| blastz | Nonintersecting best local alignments, makes LAJ file |
| matcher | Finds the best local alignments between two sequences |
| seqmatchall | All-against-all comparison of a set of sequences |
| sim_lav | Nonintersecting best local alignments, makes LALNVIEW file |
| supermatcher | Match large sequences against one or more other sequences |
| water | Smith-Waterman local alignment |
| wordfinder | Match large sequences against one or more other sequences |
| wordmatch | Finds all exact matches of a given size between 2 sequences |
| fasta | fastA search of query sequence(s) against sequence search set |
| fasts | Protein identification from peptides using fastA algorithm |
The programs lfasta and plfasta themselves were written by
William R. Pearson
Department of Biochemistry
Box 440, Jordan Hall
U. of Virginia
Charlottesville, VA
wrp@virginia.EDU