blastz

 

Function

Nonintersecting best local alignments, makes LAJ file

Description

blastz is an EMBOSS "wrapper" program for the BLASTZ alignment program, which is used by PipMaker and for making the whole-genome human-mouse alignments available from the UCSC browser. BLASTZ output can be browsed with the LAJ interactive alignment viewer. It is also possible to obtain static graphical overviews and traditional text-style alignments, using the PipServer software and the LAT program respectively (both are integrated behind the "wrapper"). blastz operates only on nucleic acid sequences; it takes as input one Reference Sequence (in principle a well-annotated sequence) and one or several Test Sequence(s) (in principle sequences in which the user wants to identify features by comparison with the Reference Sequence). The "wrapper" blastz uses the programs genbank2exons and genbank2repeats from the PipTools suite to create files for LAJ that record the positions of the coding sequence and of "repeat" regions in the Reference Sequence (see the input and output file formats below).

BLASTZ offers more control over the trade-off between speed and sensitivity than NCBI BLAST. The main changes in the algorithm are:

Masking uninteresting regions

Sequences can contain regions (e.g. repeat regions such as Alu) that give a huge number of uninteresting alignments. Such regions can be "masked", with the effect that blastz does not search for "hits" in them (although it can extend an alignment into a "masked" region from a "hit" outside). Under the "wrapper" blastz there are currently two alternative ways to select the regions to be "masked":

Algorithm

Besides what was already mentioned in the Description section above, BLASTZ differs from NCBI BLAST in the following ways:

Usage

Here is a sample session with blastz

> blastz -secondscore=2200 -lat=S
Nonintersecting best local alignments, makes LAJ file
Reference sequence: embl:m13792
Test sequence(s): embl:u73107
Output file name [M13792.blastz]: ada.blastz
      1 : none
      2 : PostScript
      3 : PDF
Graphic output format for dotplot and PIP files [3]:
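
The interactive session above can also be expressed as a single non-interactive command. The following is a minimal sketch, assuming the usual EMBOSS `-qualifier=value` syntax and using only qualifiers listed on this page; the helper name `build_blastz_command` is hypothetical and is not part of blastz or EMBOSS.

```python
def build_blastz_command(refseq, testseqs, outfile, secondscore=None, lat=None):
    """Assemble an argument list for a non-interactive blastz run.

    Only qualifiers documented on this page are used. The function
    builds the list; it does not run the program.
    """
    if lat is not None and lat not in ("0", "A", "S"):
        raise ValueError("lat must be one of '0', 'A' or 'S'")

    command = [
        "blastz",
        f"-refseq={refseq}",
        f"-testseqs={testseqs}",
        f"-outfile={outfile}",
    ]
    if secondscore is not None:
        command.append(f"-secondscore={secondscore}")
    if lat is not None:
        command.append(f"-lat={lat}")
    # -auto turns off prompts, so unspecified qualifiers keep their defaults.
    command.append("-auto")
    return command


# Arguments corresponding to the sample session.
sample_command = build_blastz_command(
    "embl:m13792", "embl:u73107", "ada.blastz", secondscore=2200, lat="S"
)
```

The resulting list could be passed to `subprocess.run` on a system where the wrapper is installed; that step is left out here because it depends on the local installation.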


Command line arguments

   Standard (Mandatory) qualifiers:
  [-refseq]            sequence   Reference sequence
  [-testseqs]          seqall     Test sequence(s)
  [-outfile]           string     [$(refseq.name).blastz] Output file name
                                  (Any string is accepted)
   -pip                selection  [3] Write graphic files with dotplot and
                                  Percentage Identity Plot. These provide an
                                  alternative to viewing the output with LAJ.

   Additional (Optional) qualifiers (* if not always prompted):
*  -maskrefseq         range      [(full sequence)] Regions of reference
                                  sequence that must be masked (displayed for
                                  sequence continuity but not aligned). Is not
                                  used if you set option -masklowrefseq.
   -chain              boolean    [N] Report only matching regions that have
                                  same order and orientation in both sequences
   -wordtype           menu       [1] Word type for initial search (Values: 0
                                  (simple matching words); 1 (template
                                  1110100110010101111, 1 transition allowed);
                                  2 (template 1110100110010101111); 3
                                  (template 1110101100110010101111, 1
                                  transition allowed); 4
                                  (1110101100110010101111))
   -score              integer    [3000] (K) Threshold for accepting HSP
                                  (Integer 1 or more)
   -gapscore           integer    [equal to Threshold for accepting HSP] (L)
                                  Threshold for starting HSP extension into
                                  gapped alignment (Any integer value)
   -gappenalty         integer    [400] (O) Gap penalty (Integer 0 or more)
   -gaplength          integer    [30] (E) Gap length penalty. BLASTZ
                                  subtracts from the similarity score for each
                                  gap a penalty of type <Gap penalty> + <Gap
                                  length penalty> * n (Integer 0 or more)
   -secondscore        integer    [0] (H) Threshold for accepting HSP at
                                  second pass. If you fill in a value BLASTZ
                                  will perform a more sensitive second search
                                  (using simple matching words) in regions
                                  between adjacent matches (Integer 0 or more)
*  -wordsize           integer    [8] Word size for initial search with simple
                                  matching words and for second pass search
                                  (Integer 1 or more)
   -maskbase           integer    [0] (M) Mask regions in test sequence that
                                  give at least n matches (Integer 0 or more)
   -lat                menu       [0] Write second output file with alignments
                                  in text format (Values: 0 (none); A (with
                                  ticks relative to alignment); S (with ticks
                                  relative to sequences))
*  -awidth             integer    [50] Alignment width (Integer 1 or more)

   Advanced (Unprompted) qualifiers:
   -masklowrefseq      toggle     Use lowercase to mask reference sequence
   -masklowtestseqs    boolean    Use lowercase to mask test sequence(s)
   -[no]showsequences  boolean    [Y] Show sequences in output
   -[no]showfeatures   boolean    [Y] Show exons and repeats in output
   -[no]laj            boolean    [Y] Open output with LAJ

   Associated qualifiers:

   "-refseq" associated qualifiers
   -sbegin1            integer    Start of the sequence to be used
   -send1              integer    End of the sequence to be used
   -sreverse1          boolean    Reverse (if DNA)
   -sask1              boolean    Ask for begin/end/reverse
   -snucleotide1       boolean    Sequence is nucleotide
   -sprotein1          boolean    Sequence is protein
   -slower1            boolean    Make lower case
   -supper1            boolean    Make upper case
   -sformat1           string     Input sequence format
   -sdbname1           string     Database name
   -sid1               string     Entryname
   -ufo1               string     UFO features
   -fformat1           string     Features format
   -fopenfile1         string     Features file name

   "-testseqs" associated qualifiers
   -sbegin2            integer    Start of each sequence to be used
   -send2              integer    End of each sequence to be used
   -sreverse2          boolean    Reverse (if DNA)
   -sask2              boolean    Ask for begin/end/reverse
   -snucleotide2       boolean    Sequence is nucleotide
   -sprotein2          boolean    Sequence is protein
   -slower2            boolean    Make lower case
   -supper2            boolean    Make upper case
   -sformat2           string     Input sequence format
   -sdbname2           string     Database name
   -sid2               string     Entryname
   -ufo2               string     UFO features
   -fformat2           string     Features format
   -fopenfile2         string     Features file name

   General qualifiers:
   -auto               boolean    Turn off prompts
   -stdout             boolean    Write standard output
   -filter             boolean    Read standard input, write standard output
   -options            boolean    Prompt for standard and additional values
   -debug              boolean    Write debug output to program.dbg
   -verbose            boolean    Report some/full command line options
   -help               boolean    Report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   -warning            boolean    Report warnings
   -error              boolean    Report errors
   -fatal              boolean    Report fatal errors
   -die                boolean    Report dying program messages
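
The description of -gaplength above states that each gap costs <Gap penalty> + <Gap length penalty> * n. As a small worked illustration, the sketch below computes that penalty, assuming n is the length of the gap in bases; the function name is hypothetical and the defaults are the documented defaults of -gappenalty (O) and -gaplength (E).

```python
def gap_penalty(n, gap_open=400, gap_extend=30):
    """Penalty subtracted for one gap of length n: O + E * n.

    gap_open corresponds to -gappenalty (O), gap_extend to -gaplength (E).
    Treating n as the gap length in bases is an assumption.
    """
    if n < 1:
        raise ValueError("a gap has a length of at least 1")
    return gap_open + gap_extend * n


# With the defaults, a single-base gap costs 400 + 30 * 1 and a
# five-base gap costs 400 + 30 * 5.
short_gap = gap_penalty(1)
longer_gap = gap_penalty(5)
```

Under this formula one long gap is cheaper than several short gaps of the same total length, because the opening cost O is charged once per gap.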

Standard (Mandatory) qualifiers | Description | Allowed values | Default
[-refseq] (Parameter 1) | Reference sequence | Readable sequence | Required
[-testseqs] (Parameter 2) | Test sequence(s) | Readable sequence(s) | Required
[-outfile] (Parameter 3) | Output file name | Any string is accepted | <sequence.>.blastz
-pip | Write graphic files with dotplot and Percentage Identity Plot. These provide an alternative to viewing the output with LAJ. | none; PostScript; PDF | PDF

Additional (Optional) qualifiers | Description | Allowed values | Default
-maskrefseq | Regions of reference sequence that must be masked (displayed for sequence continuity but not aligned). Is not used if you set option -masklowrefseq. | Sequence range | Not set by default
-chain | Report only matching regions that have same order and orientation in both sequences | Boolean value Yes/No | No
-wordtype | Word type for initial search | 0 (simple matching words); 1 (template 1110100110010101111, 1 transition allowed); 2 (template 1110100110010101111); 3 (template 1110101100110010101111, 1 transition allowed); 4 (1110101100110010101111) | 1
-score | (K) Threshold for accepting HSP | Integer 1 or more | 3000
-gapscore | (L) Threshold for starting HSP extension into gapped alignment | Any integer value | equal to Threshold for accepting HSP
-gappenalty | (O) Gap penalty | Integer 0 or more | 400
-gaplength | (E) Gap length penalty. BLASTZ subtracts from the similarity score for each gap a penalty of type <Gap penalty> + <Gap length penalty> * n | Integer 0 or more | 30
-secondscore | (H) Threshold for accepting HSP at second pass. If you fill in a value BLASTZ will perform a more sensitive second search (using simple matching words) in regions between adjacent matches | Integer 0 or more | 0
-wordsize | Word size for initial search with simple matching words and for second pass search | Integer 1 or more | 8
-maskbase | (M) Mask regions in test sequence that give at least n matches | Integer 0 or more | 0
-lat | Write second output file with alignments in text format | 0 (none); A (with ticks relative to alignment); S (with ticks relative to sequences) | 0
-awidth | Alignment width | Integer 1 or more | 50

Advanced (Unprompted) qualifiers | Description | Allowed values | Default
-masklowrefseq | Use lowercase to mask reference sequence | Toggle value Yes/No | No
-masklowtestseqs | Use lowercase to mask test sequence(s) | Boolean value Yes/No | No
-[no]showsequences | Show sequences in output | Boolean value Yes/No | Yes
-[no]showfeatures | Show exons and repeats in output | Boolean value Yes/No | Yes
-[no]laj | Open output with LAJ | Boolean value Yes/No | Yes
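
The -wordtype templates can be read as spaced seeds. The sketch below illustrates one plausible interpretation and is not taken from the BLASTZ source: it assumes that a "1" marks a position where the two words must agree, that a "0" marks a position that is ignored, and that a "transition" is an A/G or C/T substitution. The names `seed_match` and `TEMPLATE_1` are hypothetical.

```python
# Template used by -wordtype values 1 and 2, copied from the qualifier list.
TEMPLATE_1 = "1110100110010101111"

# Assumed definition of a transition: purine<->purine or pyrimidine<->pyrimidine.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def seed_match(word_a, word_b, template=TEMPLATE_1, max_transitions=0):
    """Return True if two words agree at every '1' position of the template.

    Up to max_transitions mismatches at '1' positions are tolerated,
    provided each of them is a transition.
    """
    if not (len(word_a) == len(word_b) == len(template)):
        raise ValueError("words and template must have the same length")

    transitions_seen = 0
    for base_a, base_b, flag in zip(word_a.upper(), word_b.upper(), template):
        if flag != "1" or base_a == base_b:
            continue  # ignored position, or the bases agree
        if (base_a, base_b) not in TRANSITIONS:
            return False  # a transversion at a required position
        transitions_seen += 1
        if transitions_seen > max_transitions:
            return False
    return True


reference_word = "ACGTACGTACGTACGTACG"
differs_at_ignored = "ACGAACGTACGTACGTACG"   # change at a '0' position
differs_by_transition = "GCGTACGTACGTACGTACG"  # A->G at a '1' position
```

Under these assumptions, `differs_at_ignored` matches `reference_word` with either word type, while `differs_by_transition` matches only when one transition is allowed, which is the difference between word types 2 and 1.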

Input file format

blastz reads any normal sequence USA. It takes as input one Reference Sequence and one or several Test Sequence(s). They must be of type DNA (RNA with U's is not allowed).

If the Reference Sequence is in EMBL or GenBank format and contains feature annotation, the "wrapper" blastz extracts the information about coding sequence and repeat regions and writes it to the <output file name>.exons and <output file name>.repeats files, which are used by LAJ to display these features graphically.

Only the features with Feature Key "CDS", "exon", "gene", "mRNA", "prim_transcript" and "repeat_region" are used. You can find more information about the exact syntax of the feature table in the DDBJ/EMBL/GenBank Feature Table document.

Both the Reference Sequence and the Test Sequences can use lowercase/uppercase coding: put the regions you want to "mask" (see Description) in lowercase, then set the optional parameter -masklowrefseq or -masklowtestseqs, respectively.
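
Preparing a sequence for lowercase masking can be done with a few lines of code. This is a minimal sketch, assuming 1-based inclusive coordinates as used in feature tables; the helper `mask_lowercase` is hypothetical and is not part of blastz.

```python
def mask_lowercase(sequence, ranges):
    """Return the sequence in uppercase with the given ranges in lowercase.

    ranges is an iterable of (start, end) pairs, 1-based and inclusive.
    """
    bases = list(sequence.upper())
    for start, end in ranges:
        if start < 1 or end > len(bases) or start > end:
            raise ValueError(f"invalid range {start}..{end}")
        for index in range(start - 1, end):
            bases[index] = bases[index].lower()
    return "".join(bases)


# Positions 3 to 6 become lowercase; everything else stays uppercase.
masked_example = mask_lowercase("ACGTACGTAC", [(3, 6)])
```

A sequence written out this way would then be supplied as the Reference Sequence together with -masklowrefseq, or as a Test Sequence together with -masklowtestseqs.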

Output file format

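blastz can create up to 8 different output files. For the usage example they are: ada.blastz, ada.blastz.refseq, ada.blastz.testseqs, ada.blastz.exons, ada.blastz.repeats, ada.blastz.dotplot.pdf, ada.blastz.pip.pdf and ada.blastz.lat. The first 5 files are all intended as input for the graphical viewer LAJ and are created by default.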

The <output file name>.dotplot.<extension> and the <output file name>.pip.<extension> files provide a static picture corresponding to what LAJ displays. They can be useful for documentation or publication and also provide an alternative for users who cannot work under X Windows. You can choose between PostScript and Adobe PDF format, or have no picture files created.

The <output file name>.lat file can be read directly by the user. Since it can be really huge, it is only created if you request it explicitly.

Output files for usage example

File: ada.blastz

#:lav
d {
  "blastz.v7 ada.blastz.refseq ada.blastz.testseqs T=1 K=3000 L=3000 O=400 E=30 H=2200 W=8
     A    C    G    T
    91 -114  -31 -123
  -114  100 -125  -31
   -31 -125  100 -114
  -123  -31 -114   91
  O = 400, E = 30, K = 3000, L = 3000, M = 0"
}
#:lav
s {
  "ada.blastz.refseq" 1 36741 0 1
  "ada.blastz.testseqs" 1 29807 0 1
}
h {
   ">HSADAG M13792.1 Human adenosine deaminase (ADA) gene, complete cds."
   ">MMU73107 U73107.1 Mus musculus adenosine deaminase (ADA) gene, complete cds."
}
a {
  s 7377
  b 17 1939
  e 898 2638
  l 17 1939 25 1947 78
  l 26 1951 36 1961 100
  l 46 1962 79 1995 74
  l 86 1996 96 2006 73
  l 123 2007 144 2028 55
  l 156 2029 206 2079 61
  l 210 2080 217 2087 63
  l 220 2088 225 2093 83
  l 227 2094 246 2113 60
  l 247 2115 251 2119 100
  l 270 2120 279 2129 60
  l 298 2130 348 2180 61
  l 365 2181 374 2190 60
  l 375 2192 414 2231 73
  l 417 2232 455 2270 51
  l 456 2272 479 2295 58
  l 482 2296 499 2313 50
  l 505 2314 532 2341 71
  l 548 2342 583 2377 67
  l 587 2378 616 2407 67
  l 635 2408 638 2411 100
  l 646 2412 681 2447 61
  l 689 2448 698 2457 60
  l 699 2459 707 2467 56
  l 729 2468 760 2499 69
  l 761 2508 790 2537 67
  l 792 2538 827 2573 58
  l 831 2574 856 2599 65
  l 859 2600 877 2618 68
  l 879 2619 898 2638 80
}
a {
  s 7957
  b 1612 2789
  e 2347 3398
  l 1612 2789 1635 2812 63
  l 1637 2813 1657 2833 67

  [Part of this file has been deleted for brevity]

  l 36671 28561 36679 28569 89
  l 36684 28570 36713 28599 73
  l 36715 28600 36736 28621 77
}
x {
  n 0
}
m {
  n 0
}
#:eof
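
The alignment stanzas ("a { ... }") in the listing above have a simple line-oriented layout that is easy to read programmatically. The sketch below is an illustration only, not a complete reader for this format. It assumes that "s" holds the score, that "b" and "e" hold the begin and end coordinates in the reference and test sequence, and that each "l" line holds start1 start2 end1 end2 followed by a percent identity; the function name `parse_alignment_stanzas` is hypothetical.

```python
def parse_alignment_stanzas(text):
    """Collect the 'a { ... }' stanzas of a blastz output file.

    Returns a list of dicts with the keys 'score', 'begin', 'end'
    and 'segments'. All other stanzas are skipped.
    """
    alignments = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line == "a {":
            current = {"score": None, "begin": None, "end": None, "segments": []}
        elif line == "}":
            if current is not None:
                alignments.append(current)
                current = None
        elif current is not None and line:
            key, *values = line.split()
            numbers = tuple(int(value) for value in values)
            if key == "s":
                current["score"] = numbers[0]
            elif key == "b":
                current["begin"] = numbers
            elif key == "e":
                current["end"] = numbers
            elif key == "l":
                current["segments"].append(numbers)
    return alignments


# The first lines of the first alignment from the listing above.
SAMPLE = """a {
  s 7377
  b 17 1939
  e 898 2638
  l 17 1939 25 1947 78
  l 26 1951 36 1961 100
}
"""
sample_alignments = parse_alignment_stanzas(SAMPLE)
```

Keeping each segment as a plain tuple makes it straightforward to, for example, sum the ungapped segment lengths or filter segments by their last field.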

File: ada.blastz.refseq

>M13792 M13792.1 Human adenosine deaminase (ADA) gene, complete cds.
GATCTGGGTAAAGGGTTTTCCAGGTGTCAGGATGGAAGTGACTAAGGTGCAGAGGCTGGA
GGGCTGGGGCAGGTAGAAGCAAGCATTCCTGTTACCTACTGCTGTGTGACAATCTCCCCC

  [Part of this file has been deleted for brevity]

TCCATATCTGCTGAAAAAAGGTTTAAAATTTTTAAAAAGTTTAAAAGTGTTTTCTAAAAA
AGGGACAAGCAGGTCTGGACC

File: ada.blastz.testseqs

>U73107 U73107.1 Mus musculus adenosine deaminase (ADA) gene, complete cds.
GCCGACTTTAGATGTTCCTAAACTACATTTCCCAGCCCATTCCACCCCTCTCTGTCTCTG
TGACCCCTGATCCAGCTCTACCCTACTACAATGACCCCTACTGACTTAAATGTGCTTCTT

  [Part of this file has been deleted for brevity]

CAGCAATGCAGTCTCAAGCCCCAGGGATCTTGGTGCCACTGAAGGTGCATGTCATTCTGC
CATTAGGCCTGTCTTTAAGTCTATGTTGAGCTCTGGGTCTGGAATTC

File: ada.blastz.exons

M13792 

# 1 genes; 12 exons; 1 cds; 0 mrna; 1 prim_transcript;

# gene ADA 4031..35664
# exon <4031..4063
# exon 19230..19291
# exon 26344..26466
# exon 28908..29051
# exon 29823..29938
# exon 31176..31303
# exon 32425..32496
# exon 32573..32674
# exon 32851..32915
# exon 34354..34483
# exon 35100..35202
# exon 35651..>35664
# cds  join(4031..4063,19230..19291,26344..26466,28908..29051,29823..29938,31176..31303,32425..32496,32573..32674,32851..32915,34354..34483,35100..35202,35651..35664)

> 4031 35664 ADA
+ 4031 35664 
4031  4063
19230  19291
26344  26466
28908  29051
29823  29938
31176  31303
32425  32496
32573  32674
32851  32915
34354  34483
35100  35202
35651  35664

# stray mRNA
# stray CDs

File: ada.blastz.repeats

%:repeats
1362 1672  Right Other
2357 2903  Right Other
4907 5227  Right Other
5606 5908  Right Other
7582 8001  Right Other
8179 8484  Right Other
10005 10204  Right Other
10257 10534  Right Other
13452 13777  Right Other
14837 15386  Right Other
15806 16106  Right Other
16913 17224  Right Other
18414 18717  Right Other
19605 19902  Right Other
22523 22829  Right Other
24481 24773  Right Other
25143 25453  Right Other
26949 27269  Right Other
28032 28333  Right Other
31460 31867  Right Other
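
The repeats file shown above is a plain list of intervals, so it can be summarised with very little code. The following sketch assumes that each data line holds a start, an end, a direction and a repeat type separated by whitespace, and that lines not starting with a number (such as the "%:repeats" header) can be skipped; the helper names are hypothetical.

```python
def parse_repeats(text):
    """Return a list of (start, end, direction, repeat_type) tuples."""
    repeats = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4 or not fields[0].isdigit():
            continue  # header or blank line
        start, end = int(fields[0]), int(fields[1])
        repeats.append((start, end, fields[2], fields[3]))
    return repeats


def total_repeat_length(repeats):
    """Sum of interval lengths, treating coordinates as inclusive."""
    return sum(end - start + 1 for start, end, _, _ in repeats)


# The first two entries of the listing above.
SAMPLE = """%:repeats
1362 1672  Right Other
2357 2903  Right Other
"""
sample_repeats = parse_repeats(SAMPLE)
sample_total = total_repeat_length(sample_repeats)
```

The same intervals could be reused as input for masking, since they are given as coordinates on the Reference Sequence.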

Graphics File: ada.blastz.dotplot.pdf

[blastz dotplot file]

Graphics File: ada.blastz.pip.pdf

[blastz PIP file]

File: ada.blastz.lat

------------------------------------------------------------
Seq 2 = ">U73107 U73107.1 Mus musculus adenosine deaminase (ADA) gene, complete cds."

Description:
	  "blastz.v7 /home/demo/ada.blastz.refseq /home/demo/ada.blastz.testseqs T=1 K=3000 L=3000 O=400 E=30 H=2200 W=8
	     A    C    G    T
	    91 -114  -31 -123
	  -114  100 -125  -31
	   -31 -125  100 -114
	  -123  -31 -114   91
	  O = 400, E = 30, K = 3000, L = 3000, M = 0"


              Local Alignment Number 1
              Similarity Score:  7377
              Match Percentage:  49 %
              Number of Matches:  446
              Number of Mismatches:  239
              Total Length of Gaps:  212
              Begins at (17,1939) and Ends at (898,2638)

                 :    .       :    .    :    .    :    .    :   
          17  TTTCCAGGT   GTCAGGATGGAAGTGACTAAGGTGCAGAGGCTGGAGGG
              |||||| :|---|||||||||||---------| ::|| |  ||| ||||
       1,939  TTTCCACATCTGGTCAGGATGGA         GTCACACATTCTGCAGGG
               :    .    :    .    :             .    :    .    

               .    :    .    :    .    :    .    :    .    :   
          64  CTGGGGCAGGTAGAAGCAAGCATTCCTGTTACCTACTGCTGTGTGACAAT
              ||| |||||||:||||------|:||||::|||-----------------
       1,980  CTGTGGCAGGTGGAAG      TCCCTGCCACC                 
              :    .    :    .          :    .                  

               .    :    .    :    .    :    .    :    .    :   
         114  CTCCCCCTAAAACACAATGGCTTAAAATAACATCCATTTCATTACATATC
              ---------|:|:||::|:||| | :| || -----------:|| |::|
       2,007           AGATACGGTAGCTGATGAGAAA           CACCTGCC
                          :    .    :    .               :    . 

               .    :    .    :    .    :    .    :    .    :   
         164  TCAATACTATAGGTCAGGAATTTGGGCTGGGCTTACTTGGGTAATTCTTC
               ||:: :|:||:|| ||| ||  :|||||||||| |:|||::|---||  
       2,037  ACAGCTTTGTAAGTAAGGTATGAAGGCTGGGCTTCCCTGGACA   CTGA
                 :    .    :    .    :    .    :    .       :   

               .    :    .    :    .    :    .     :    .    :  
         214  TGTCCCACATGGCATTGACCAAAGCCTGGTTTT CAGTGGGCAGCTGGGC
               |||--|| |||-:||| : ||:| |||::|||-|||||-----------
       2,084  AGTC  ACCTGG GTTGCTGAAGGGCTGACTTTACAGTG           
               .      :     .    :    .    :    .               

                .    :    .    :    .    :    .    :    .    :  
         263  TGGATGGCCCAACACAGCTTCGCTAACATGATTGCTGTCTTCGTAGGGAT
              -------:|::|||:||------------------||   |: |||||:|
       2,120         TCTGACATAG                  TGGGATTTTAGGGGT
                     :    .                      :    .    :    

                .    :    .    :    .    :    .    :    .    :  
         313  GGTGGAAGCCTGGGCTCAGTGGGACTGTCAACTGGAATGGCCATATGTGG
              ||: |||: |||||||::|:  ::| ::||||||||--------------
       2,145  GGCTGAAAGCTGGGCTTGGCTCAGCAACCAACTGGA              
              .    :    .    :    .    :    .    :              

                .    :     .    :    .    :    .    :    .    : 
         363  ACTCTCTTAGCA TGATGGTCTCTTCTAGAAGCTTGGGTTCCCAGAGAGA
              --|:||::| ||-|||:||| |||||||| :|||:::|||||||||| :|
       2,181    TTTCCCACCAGTGACGGTGTCTTCTAGTGGCTCAAGTTCCCAGAGCAA
                    .    :    .    :    .    :    .    :    .   

                 .    :    .    :    .    :    .    :    .     :
         412  ATGTTCAAGAGGCCCCAAAGGACACCACAAAGCTTCTTTATGAC CAAGG
              | :--|||||||:||||: ||| :::: ||:: ::|:  ||  |-| ||:
       2,229  AAA  CAAGAGGTCCCAGTGGAAGTTGAAAGAGCCCCAAATCCCTCCAGA
               :      .    :    .    :    .    :    .    :    . 

                  .    :    .    :    .    :    .    :    .    :
         461  CTCGGAAATCCAGGAAGCTTGCTCCCATCACGCTCTATTACTCCAACAAG
                 ||||:|:||| |: ||--|::|:::|||::||||  -----|:||||
       2,277  GAAGGAAGTTCAGTAGCCT  CCTCTGCCACATTCTAAG     AGCAAG
                 :    .    :    .      :    .    :         .    

                  .    :    .    :    .    :    .    :    .    :
         511  TCACTCAGGCCAGCCCAGGTCCAAGAGGAGGAAACCTAGACTCCATCTTG
              || :||||| |  |||||| | ---------------||||||||||  |
       2,320  TCCTTCAGGACTTCCCAGGGCA               AGACTCCATCAGG
              :    .    :    .    :                   .    :    

                  .    :    .    :    .    :    .    :    .    :
         561  CAATGTGAAGAATTGCAAATAATTTGTGTCACCCTTAAGCAACCAGCAAC
              ||| |  |:|| : ||||| : |---||||||::||::  :||||| || 
       2,355  CAAAGACAGGATCAGCAAAGGCT   TGTCACTTTTGGCGGACCAGGAAG
              .    :    .    :    .       :    .    :    .    : 

                  .    :    .    :    .    :    .    :    .    :
         611  TCATCTAGGTTGATTGGCATTTCAGCAATGTGGTGGGAAGTGGTGGGACT
              ||||:|------------------||||-------:|||  :|| |||: 
       2,402  TCATTT                  GCAA       AGAATGAGTTGGATG
                 .                      :           .    :    . 

                  .    :    .    :    .    :    .     :    .    
         661  GATGTTGAAGAGGGACTTGAATGTCATGAGAGGCTGGG GAGGCAATAAG
              ||: |||:|||::||:|||| -------| || |::||-  :|| |||--
       2,427  GACTTTGGAGAAAGATTTGAC       ACAGCCCAGGCCCAGCTATA  
                 :    .    :    .           :    .    :    .    

              :    .    :    .    :    .    :    .    :    .    
         710  GTGGGGAGTGAAGTTTCTCGAGTCAGATTCAAATTTAAACCCCAGTTTTG
              -------------------|||||||| |||||||:||:: ||| ::|: 
       2,468                     GAGTCAGAATCAAATTCAAGTGCCACCCTCC
                                   :    .    :    .    :    .   

              :            .    :    .    :    .    :    .    : 
         760  C        CACTTACAACCCATGAGCCAAGCAGGCTGTCTCTCTATCTG
              |--------|||| |:  :||| ||||| ||::| ||||- |:|| |: |
       2,499  CTTGCATTGCACTAATCTTCCAGGAGCCCAGTGGCCTGT GCCCTCTTGG
               :    .    :    .    :    .    :    .     :    .  

                 .    :    .    :    .    :    .    :    .    : 
         802  AACCTCAGTGTCCTCATCTGTAAAATGAGGAGAACACCTCCTACATCTGA
              ::||||:|:  |:||||:||||:| |---|| |||| :|:||:| ||  |
       2,548  GGCCTCGGCCACTTCATTTGTAGACT   GATAACAGTTTCTGCTTCACA
                :    .    :    .    :       .    :    .    :    

                 .    :    .    :    .    :    .    :    .   
         852  GGATGACTGTAAAGATGAAATGGGATGGGTGCTTATAAAGTGCTTCC
              ||:||--|: | |||| :|||||:||-|:||:|| :|||||||||||
       2,595  GGGTG  TAGACAGATTGAATGGAAT GATGTTTTCAAAGTGCTTCC
              .      :    .    :    .     :    .    :    .   


              Local Alignment Number 2
              Similarity Score:  7957
              Match Percentage:  48 %
              Number of Matches:  370
              Number of Mismatches:  217
              Total Length of Gaps:  172
              Begins at (1612,2789) and Ends at (2347,3398)

                 .    :    .    :    .    :    .    :    .    : 
       1,612  ATCCTCCTGCCTTGGCCTCCCAAAGTCCTGGGATTACAGGCATGAACCAC
              |||:|||||:||::||:|  ::||-| ||::||||||||:||::|:----
       2,789  ATCTTCCTGTCTCAGCTTGGTGAA TACTAAGATTACAGACACAAG    
               :    .    :    .    :     .    :    .    :       

                 .    :    .    :    .    :    .    :    .    : 
       1,662  TGCGCCCAGGCTCGGGTATGTCTTCATCAGTAGCATGAAAATAATGGACT
              ------------------|:|:::||||  |: | |::||:|| ::  :|
       2,834                    TATTCCCATCCCTGTCTTAGAAGTACCATTTT
                                 .    :    .    :    .    :    .

                 .    :    .    :    .    :    .    :    .    : 
       1,712  AATACAGCCACCCTCTCCCTCACTCCCACATACAACCAAACCCCAAATCC
              :||----||||||||||:|||||||||||| |||||::||:|||||||--
       2,866  GAT    CCACCCTCTCTCTCACTCCCACAAACAACTGAATCCCAAAT  
                      :    .    :    .    :    .    :    .      

                 .    :    .    :    .    :    .    :    .    : 
       1,762  AGCTGATTTTACACCCTAAATGCAGCTTGAATATGAGTTTCTCCACTTCC
              --:  |||:|:|  |:|| | |:|||  |:|| |||||:: ||| |||- 
       2,910    TGTATTCTGCTGCTTACAAGTAGCAGGGATCTGAGTCCATCCTCTT G
                :    .    :    .    :    .    :    .    :    .  

                 .    :    .    :    .    :    .    :    .    : 
       1,812  CCCACTGACATCACTATGCCCTACCCAGACCATGGCAGTTGCCTCCTTCC
              ||||  | |::||| :| :| :|||::||--------------||:||||
       2,957  CCCAGAGTCGCCACGGTCTCACACCTGGA              TCTTTCC
                 :    .    :    .    :    .                  :  

                 .    :    .    :    .    :    .    :    .    : 
       1,862  TGGTATCCTGTCCTCCCTCACCCCCGCTGGCCCCCTGTAATGCCCTCCCC
              ||||:| |::: ||::||----|| |||-----||| | || :: |||||
       2,993  TGGTGTACCACGCTTTCT    CCAGCT     CCTTTCATCTTGTCCCC
                .    :    .    :        .         :    .    :   

                 .    :    .    :          .    :    .    :    .
       1,912  TCACAGCAGGGAGCCCAGGCTT      CTCAAAGTGCCCTGTGGGTGCG
              |||:: ||||: :|||:||| |------|||||:||||-------|::|:
       3,034  TCATGTCAGGACACCCGGGCATCCCCACCTCAAGGTGC       GCACA
               .    :    .    :    .    :    .    :           . 

                  :    .    :    .    :    .    :              .
       1,956  AACCACCTGGGGGTCCTGTTTGTATAAAATACAGATTCT          A
              |:::||:|::||: |:||:|   ||| |:: | |  ||:----------|
       3,077  AGTTACTTAAGGAACTTGCTACAATATAGCCCTGCATCCCGCCCCCAAAA
                 :    .    :    .    :    .    :    .    :    . 

                  :    .    :    .    :    .    :    .    :    .
       1,996  CTTCAGTAGGTCTGGGATGGGGTCTGAAAGTCTGCATTTGTAGTCAGCTC
              :::|| :|:::||:|::|: |||:: |||---------------||||||
       3,127  TCCCACCAAACCTAGAGTATGGTTCTAAA               CAGCTC
                 :    .    :    .    :    .                   : 

                  :    .    :    .    :    .           :    .   
       2,046  CCAGGTGATGTGGGTGCTGATGATCCCTGGAT       CACACTTTCAG
               |  |||| ||----------- ||||||| |-------:| ||||  ::
       3,162  ACCTGTGAAGT           CTCCCTGGCTAAATTTCTAGACTTGGGA
                 .    :               .    :    .    :    .    :

               :    .    :    .    :    .    :    .    :    .   
       2,089  TAGCTGGAGAATATTTTTTCCAAATAAAAGGGTGATTTTGTCTCGCCTCC
              :||||||||------|||||||||:|:||| ||||| :||| ::|   ||
       3,201  CAGCTGGAG      TTTTCCAAACAGAAGTGTGATGCTGTACTGAAACC
                  .          :    .    :    .    :    .    :    

               :    .    :    .    :    .    :    .    :    .   
       2,139  ACTTAAAACACTCCACTGACTTCCTAGGAATCCCACACCATCGCTGGGTC
              ||| | ||  :|:|| |||| ||---------------------------
       3,245  ACTGACAAACTTTCAGTGACATC                           
              .    :    .    :    .                             

               :    .    :    .    :    .    :    .    :    .   
       2,189  CCACATCCCTGGCAGGATTCAGCTCCCATCAGACCTTCTAGCCCCTTGCT
              -----||:|||---------||||:|||:|  :||| ||:|||:|- |||
       3,268       TCTCTG         AGCTTCCACCTCGCCTGCTGGCCTC AGCT
                     :             .    :    .    :    .     :  

               :    .    :    .    :    .    :    .    :    .   
       2,239  CTCCACTCTCCCACTCTCTCTTTCCCCCTTGTTTATGGGTTTGTTAATTT
              |:|| |||:||: |:| ::|::|||:|:||:|||:|::||||||| | ||
       3,303  CCCCTCTCCCCTTCCCATCCCCTCCTCTTTATTTGTAAGTTTGTTTAGTT
                .    :    .    :    .    :    .    :    .    :  

               :    .    :    .    :    .    :    .    :    .   
       2,289  ATTTATGATGAAATGAAATGAAGCTACCATCCACCCCAGTACTGGAACAT
              -------------| |:||||:: |  ||||:  ||:||| ||||:|:|:
       3,353               TCAGATGAGAATTGCATCTCACCTAGTTCTGGGATAC
                             .    :    .    :    .    :    .    

               :    .  
       2,339  TATCAATAA
              :|:|| |||
       3,390  CACCATTAA
              :    .   


  [Part of this file has been deleted for brevity]


              Local Alignment Number 16
              Similarity Score:  31563
              Match Percentage:  47 %
              Number of Matches:  1622
              Number of Mismatches:  718
              Total Length of Gaps:  1066
              Begins at (33834,25779) and Ends at (36736,28621)

               .    :    .    :    .    :    .    :    .    :   
      33,834  AAGAAACAGTCAACAGTGTGAAATTCTGCTATGCAAGTCGATTATGGTCA
              || ||: | |::| ||:|| :||  ||||| | |:||-|:|  | |||::
      25,779  AATAAGAATTTGAAAGCGTTGAAAACTGCTCTCCGAG CAAGGAAGGTTG
               :    .    :    .    :    .    :    .     :    .  

               .    :    .    :    .    :    .    :    .    :   
      33,884  GAGCTAGGAAAGATCCATTAGATACAACAAGATGGTGGTCAGGGATCGTG
              |||||:  |:|:|:| :|:|::| ||||||||||| || |||||-|:|||
      25,828  GAGCTGCCAGAAACCAGTCAAGTTCAACAAGATGGAGGGCAGGG TTGTG
                :    .    :    .    :    .    :    .    :     . 

               .    :    .    :    .    :    .    :    .    :   
      33,934  CCAAGAACAGCTTCCATGGTATGTTGGAGTAGCCAGCTCCCAGTGGGACT
              |||:|||--|:||||||| |:|:|||||: ||||   |  ||:-||||||
      25,877  CCAGGAA  GTTTCCATGCTGTATTGGAAGAGCCTTGTGGCAA GGGACT
                 :      .    :    .    :    .    :    .     :   

               .    :            .    :    .                    
      33,984  GAGGAACAA        GCAGGGTAGGGTGC                   
              ||||| ::|--------| ||||||||:|||-------------------
      25,924  GAGGACTGACTAGGAGAGGAGGGTAGGATGCTAGCCATGCTCCACCACTG
               .    :    .    :    .    :    .    :    .    :   

                  :    .    :    .    :    .    :    .    :    .
      34,007   AGAGGGGAAGGCTGGAGAGGGTGGCAGCCGGAGGGGGATGTTGCTTTCT
              -:::||||||||||:||||| :|||:|||: :|||---------------
      25,974  AGAGGGGGAAGGCTAGAGAGCATGGTAGCTTAAGG               
               .    :    .    :    .    :    .                  

                  :    .    :    .    :    .    :    .    :    .
      34,056  TGGCTCCCACCCCCACGCCCCCACCGGCTGCCATTCTGCCTGGTTCCCAT
              : |:: ||::|:|| :|:| || | |||||---||||||||---------
      26,009  CTGTCACCGTCTCCCTGTCACCTCAGGCTG   TTCTGCCT         
               :    .    :    .    :    .       :    .          

                  :    .    :    .    :    .    :    .    :    .
      34,106  GTCTGGCCCCTCTGCTGCCTTTGCCCAGCTCTGGTCTTCAGGATGGGCTG
              ---: |||:||  |||| |||| |||||:|:|------------------
      26,047     CTGCCTCTGAGCTGGCTTTTCCCAGTTTT                  
                    :    .    :    .    :    .                  

                  :    .    :    .    :    .    :    .    :    .
      34,156  GATTCTGGACTTTCTGGTTACATAGACTTGAACAAGTCACCTAAGTTCTG
              ---------------------|||:|------------------------
      26,076                       ATAAA                        
                                       :                        

                  :    .    :    .    :    .    :    .    :    .
      34,206  AATTTATTTCCCCCTCTGCACAAGGATCAGATCTTTCAGATCTGTTTGAG
              ----|:|||||:|:|: ::||:||:||  :|:|:||:: :|-|||:|:||
      26,081      TGTTTCCTCTTTAATACGAGAATGCAACCCTTTGTGT TGTCTAAG
                      .    :    .    :    .    :    .     :    .

                  :    .    :    .    :    .    :    .    :    .
      34,256  GCTGCTGTGAGGATCAAAGGCGGGTGAACGTCAATGTGTTCTGACTATTT
              |:||-|:|:|:||| :|||:-||| |:  || :| | |:  |||: :||:
      26,126  GTTG TATAAAGATGGAAGA GGGAGGTGGTGGAAGGGCAGTGATGGTTC
                   :    .    :     .    :    .    :    .    :   

                  :    .    :    .    :    .    :               
      34,306  ATGTAAGAGTAAAAGGAGGCTGATTCTCTCCTCCTC              
               ||---||||:||--||||||  :||||||:::||:--------------
      26,174  TTG   GAGTGAA  GAGGCTCTCTCTCTCTCTCTTTTCTTCCTGCCTGG
               .       :      .    :    .    :    .    :    .   

                 .    :    .    :    .    :    .    :    .    : 
      34,342  CCTCTTCTGCAGGCTCAAAAATGACCAGGCTAACTACTCGCTCAACACAG
              ||:||:|: ||| :||||:|||||: ||||:||||||||:||||||||||
      26,219  CCCCTCCCCCAGCTTCAAGAATGATAAGGCCAACTACTCACTCAACACAG
               :    .    :    .    :    .    :    .    :    .   

                 .    :    .    :    .    :    .    :    .    : 
      34,392  ATGACCCGCTCATCTTCAAGTCCACCCTGGACACTGATTACCAGATGACC
              |:||||| ||||||||||||||||||||:||||||||:||||||||||||
      26,269  ACGACCCCCTCATCTTCAAGTCCACCCTAGACACTGACTACCAGATGACC
               :    .    :    .    :    .    :    .    :    .   

                 .    :    .    :    .    :    .    :    .    : 
      34,442  AAACGGGACATGGGCTTTACTGAAGAGGAGTTTAAAAGGCTGGTGAGTGG
              ||: ::|||||||||||:|||||:||||||||:||: |:|||||||||: 
      26,319  AAGAAAGACATGGGCTTCACTGAGGAGGAGTTCAAGCGACTGGTGAGTAT
               :    .    :    .    :    .    :    .    :    .   

                 .    :       .    :    .    :    .    :     .  
      34,492  GTGTGAGCCATA   CTGGCCTTGACTCGGGTTTGGGAGTATG GTATCT
              ||||||||:||:---|||:| :||:|:|:||| || | ||:||-|||| |
      26,369  GTGTGAGCTATGAGCCTGACACTGGCCCAGGTGTGTGTGTGTGTGTATAT
               :    .    :    .    :    .    :    .    :    .   

                :    .     :                                    
      34,538  ACAGGTCCA GTCCGGG                                 
              ::: || ::-|| || |---------------------------------
      26,419  GTGTGTGTGTGTGCGCGCGCGCGCGCGTGCACACACACGCACGTGAGTGC
               :    .    :    .    :    .    :    .    :    .   

                                 .    :    .    :    .    :    .
      34,554                    GCCTGGAATCTTTGGAGAGAGGGAGTGAGTCT
              ------------------||||:|||:||:||||||  ::||:||:||: 
      26,469  ATATATTGTGTGTAGATTGCCTAGAACCTCTGGAGACCAAGAATGGGTTG
               :    .    :    .    :    .    :    .    :    .   

                  :    .    :    .    :    .          :    .    
      34,586  GCCTCAACAGTCCAAGACAAGCCCAACCTAG      ACACTTTCCACAG
              |||| |------|| | |||||:||:|||||------| :|:||||||||
      26,519  GCCTGA      CATGCCAAGCTCAGCCTAGTACCAAAAGCCTTCCACAG
               :          .    :    .    :    .    :    .    :  

              :    .    :    .    :    .    :     .    :    .   
      34,630  AGAAGACATCTTTGTGTTGACGTCCTGACCTA GGACCAGGTTTTTGATC
              ||| |---:||| |:: |::|:| |||:|: |-|||| |||:  :| | |
      26,563  AGATG   CCTTGGCAGTAGCATGCTGGCTGAGGGACGAGGCAACTTAGC
                .       :    .    :    .    :    .    :    .    

               :    .    :    .    :    .    :    .    :    .   
      34,679  CTTTGCTTGGGTTGAGTGCCTTTAAAGAATCCAGTGAAAGCTGTCAACCC
              ||||||||||||||||: |||| :|||------------------| |||
      26,610  CTTTGCTTGGGTTGAGCCCCTTAGAAG                  ACCCC
              :    .    :    .    :    .                      : 

               :    .    :    .    :    .    :      .     :    .
      34,729  TCTCCCCAGAAAGGTGTGTGCAGCAGCTATGAA  GTCTTGC ACACTCT
              |||||| ||| |||| ||-------:||:||:|-- |:|||:-|||----
      26,642  TCTCCCAAGACAGGTCTG       ACTGTGGACGTTTTTGTGACA    
                 .    :    .           :    .    :    .    :    

                  :    .    :    .    :    .    :    .    :    .
      34,776  CTTCAGGTTGTTCTTAAATCCCAGGCTGAATAAGTCCATTCCTGCACGTG
              -||:|||||| |: ||::-|:|||:|--------------|:||  | ||
      26,681   TTTAGGTTGGTTGTAGG CTCAGAC              CTTGGTCTTG
                   .    :    .     :                  .    :    

                  :    .    :    .    :    .    :    .    :     
      34,826  TCTGCGAGGTGTCTCTGGCCCCCTACATGCCACCCTGTCTCTCAAAG   
              ----------| :||||:: :|||: :||   |||||-----|||:|---
      26,715            GGTTCTGATGTCCTGAGTGAGTCCCTG     CAAGGATC
                        .    :    .    :    .    :         .    

                 .    :    .    :    .    :    .    :    .    : 
      34,873   GTTTCTCCAACTTCCTTCTCACAGCCCTTTTTCATGTAATGACAAATTA
              -|||||:|| |||--------------------------||||-------
      26,750  TGTTTCCCCCACT                          ATGA       
              :    .    :                              .        

                 .    :    .    :    .    :    .    :    .    : 
      34,922  AGAACACGACCTCATGGTCTCTACTCTGGCACTTGCTGCCGTGTGACAGT
              ---------------------|:|:|| || ||||||:------------
      26,767                       TGCCCTTGCCCTTGCTA            
                                      :    .    :               

                 .    :    .    :    .    :    .    :    .    : 
      34,972  GGACAAATCCTTCCCCCTCTAAGCGTATCTGCCCATGTTGAGTGAAGAGG
              --------------------------------------------------
      26,784                                                    
                                                                

                 .    :    .    :    .    :    .    :    .    : 
      35,022  ATGGACTATCACTACATTGCTAAGAGCTGCCTTCTTTGTTCTCTGGTTCC
              -------------|||  |||----||| |||||:||||--:|||::|||
      26,784               ACAGGGCT    GCTTCCTTCCTTGT  CCTGACTCC
                            .    :        .    :    .      :    

                 .    :    .    :    .    :    .    :    .    : 
      35,072  ATGTTGTCTGCCATTCTGGCCTTTCCAGAACATCAATGCGGCCAAATCTA
              ||||| :|-----------||:|||:||||||||||:||:|| ||:|| |
      26,815  ATGTTTCC           CCCTTCTAGAACATCAACGCAGCGAAGTCAA
              .    :               .    :    .    :    .    :   

                 .    :    .    :    .    :    .    :    .    : 
      35,122  GTTTCCTCCCAGAAGATGAAAAGAGGGAGCTTCTCGACCTGCTCTATAAA
              |:|||||||||||:|| ||:||||:|||:||||| || | ||||||:|:|
      26,854  GCTTCCTCCCAGAGGAAGAGAAGAAGGAACTTCTGGAACGGCTCTACAGA
               .    :    .    :    .    :    .    :    .    :   

                 .    :    .               :    .    :    .     
      35,172  GCCTATGGGATGCCACC           TTCAGCCTCTGCAGGTAG    
              |  ||: ::  ||||||----------- |  ||:| ||||||  |----
      26,904  GAATACCAATAGCCACCACAGACTGACGGTACGCTTGTGCAGGGCGCAAT
               .    :    .    :    .    :    .    :    .    :   

                                  :          .    :             
      35,207                   GTTCCT      GTCTGGGC             
              -----------------||:|||------ |||| ||-------------
      26,954  AACCACCCCACCACACTGTCCCTCCTTAACTCTGTGCGATTGTGGCAGAA
               .    :    .    :    .    :    .    :    .    :   

                    .    :                                .    :
      35,221    TTCTGGGCAGTTG                            CCTGTCC
              --:::|||||||  |----------------------------|||:| |
      27,004  GTCCTTGGGCAGGAGCACACCTCTGCAGGGTTACAGCCACCACCCTATGC
               .    :    .    :    .    :    .    :    .    :   

                  .    :    .    :    .    :    .    :    .    :
      35,241  TGGCCCCAGTGTGGCTTTCTGTGGGACTTCTAGCAAGATGCCCTTCCATT
              || |||: -:::::||||||||| |:||| |||||::|  |::|:|||  
      27,054  TGTCCCTC CACAACTTTCTGTGTGGCTTGTAGCAGAAATCTTTCCCAGA
               .    :     .    :    .    :    .    :    .    :  

                  .     :    .    :    .    :    .    :    .    
      35,291  CTTGGG CAGCGCATGAATGTGTGATGACTCCCTGGTTTCTGGGCCCTGG
               | |||-||||-| |||| : :||: |||| :||||:||||||||----:
      27,103  GTAGGGACAGC CCTGAAAAAATGGAGACTGTCTGGCTTCTGGGC    A
                .    :     .    :    .    :    .    :    .      

              :    .    :    .    :    .    :    .    :    .    
      35,340  CTGGGAGCAGCGTCTCATTAGATCGGTTTGTTTTCTATAAAAGTTCTTGA
              | | |  |||| |||::|:||: :||:|  ||::: || |  :| |  ||
      27,148  CAGTGCTCAGCTTCTTGTCAGGGTGGCTATTTCCTGATCATTATGCAGGA
                :    .    :    .    :    .    :    .    :    .  

              :    .         :    .    :    .             :    .
      35,390  GAGGCT     GTTCTAAGGGGAGACTTTCTGAA         GCCCAGT
               |||||-----: :||:::| | ||||---||||---------|||||||
      27,198  TAGGCTCGGGCAGCCTGGAGTGTGACT   TGAACAGCAGTTGGCCCAGT
                :    .    :    .    :       .    :    .    :    

                  :    .    :    .    :    .    :    .    :    .
      35,426  CCCAAAGGTCTGGGCAGTTGGGGACACCTCCATGGCTGCCCAAAGCCAAG
              ||||:----------------------:||||||:---------------
      27,245  CCCAG                      TTCCATGA               
              .                          :    .                 

                  :    .    :    .    :    .    :    .    :    .
      35,476  GGCAGGGAGAGGGGCCCAGGCTGTTCTGCTCCTTTCTTCCTATGTGGTCT
              ---------------------------|::||||| ||:|: |:|||:||
      27,258                             GTCCCTTTATTTCCTTATGGCCT
                                           :    .    :    .    :

                  :    .                        :    .    :    .
      35,526  TGGCAAGGCA                    TCTTCTTGCCATCATAGGAA
              ||||||| ||--------------------:| ||||||::||   ||||
      27,281  TGGCAAGCCACACATTCCCTTGCTTGAAGCCCGTCTTGCTGTCTATGGAA
                  .    :    .    :    .    :    .    :    .    :

                      :    .    :    .    :    .    :    .    : 
      35,556  GGA    GTTCCTTTCTGGTTCTGGTGTTCTATGATTTTTACAACATCCT
              |||----:|| ||:|||||::| |:|-||||:|| |||:||| |:::|:|
      27,331  GGAAGTTATTACTCTCTGGCCCGGAT TTCTGTGCTTTCTACCATGCCTT
                  .    :    .    :    .     :    .    :    .    

                 .    :    .    :    .    :    .    :    .    : 
      35,602  GGGTACTACAAGTTGCCTGATCTTTTTGCTTCTCTGAACCAACGAGCAGG
              : :|:::|::||--:|||||:||||:|::|||||||| :::|| ||||||
      27,380  ACATGTCATGAG  ACCTGACCTTTCTATTTCTCTGACTTGACCAGCAGG
              :    .    :      .    :    .    :    .    :    .  

                 .    :    .          :    .    :    .    :    .
      35,652  GCAGAACCTCTGAAGAC      GCCACTCCTCCAAGCCTTCACCCTGTG
              ||:|: ||:|||||||:------||||||:|||::||||-|||:||||||
      27,428  GCGGGTCCCCTGAAGATGGCAAGGCCACTTCTCTGAGCC TCATCCTGTG
                :    .    :    .    :    .    :    .     :    . 

                      :    .    :    .    :    .    :    .    : 
      35,696  G    AGTCACCCCAACTCTGTGGGGCTGAGCAACATTTTTACATTTATT
              |----|||| :: ||||||||------------|||| || || ||:|||
      27,477  GATAAAGTCTTTACAACTCTG            ACATATTGACCTTCATT
                 :    .    :    .                :    .    :    

                 .    :    .    :    .    :    .    :    .    : 
      35,742  CCTTCCAAGAAGACCATGATCTCAATAGTCAGTTACTGATGCTCCTGAAC
              |||||||:------------------------------------------
      27,515  CCTTCCAG                                          
              .    :                                            

                 .    :    .    :    .    :    .    :    .    : 
      35,792  CCTATGTGTCCATTTCTGCACACACGTATACCTCGGCATGGCCGCGTCAC
              -----------------------------||||:|| : ||||: |||  
      27,523                               ACCTTGGAGAGGCCAGGTCTG
                                             .    :    .    :   

                 .    :    .    :    .    :    .    :       .   
      35,842  TTCTCTGATTATGTGCCCTGGCCAGGGACCAGCGCCCTTG CA  CATGG
              |:||||||||: :|::||||||:|||  |||| |  ||||-||--|||| 
      27,544  TCCTCTGATTGGATATCCTGGCTAGGTCCCAGGGGACTTGACAATCATGC
               .    :    .    :    .    :    .    :    .    :   

               :    .    :    .    :    .    :    .    :    .   
      35,889  GCATGGTTGAATCTGAAACCCTCCTTCTGTGGCAACTTGTACTGAAAATC
              :||----|||||: :|||||:|||||||: :|---------||:||| | 
      27,594  ACA    TGAATTGAAAACCTTCCTTCTAAAG         CTAAAATTA
               .        :    .    :    .    :             .    :

               :    .    :    .    :    .    :    .    :    .   
      35,939  TGGTGCTCAATAAAGAAGCCCATGGCTGGTGGCATGCAGCAGGTGGCATG
              |||||:||||||||| |||: :||:|||||: | ||||||| :|||:: :
      27,631  TGGTGTTCAATAAAGCAGCTGGTGACTGGTATCTTGCAGCACATGGTGAA
                  .    :    .    :    .    :    .    :    .    :

               :    .    :    .    :    .    :    .       :    .
      35,989  TAATTTGGTGGTCTTGGGCGGGCCGATGTGGGCAGGATG   AGCATGGA
              ||------||||||:|||   ||-----||| :||||||---|| | |||
      27,681  TA      TGGTCTCGGGGCTGC     TGGCTAGGATGCTAAGAAAGGA
                        .    :    .         :    .    :    .    

                  :    .    :    .    :    .           :    .   
      36,036  GGGAGCTGGGTCAGCCTGCTCAGCAGCAGG       GCCTGAGCCT   
              ||:: |: || |  : :||| ||:: ||||-------||| |:|:||---
      27,720  GGAGCCCTGGGCCCTACGCTGAGTGTCAGGCTGGGGAGCCAGGGTCTCTT
              :    .    :    .    :    .    :    .    :    .    

                                        :    .          :    .  
      36,076                        AAGGGTGGCTGT      GAATGCCAGG
              ---------------------- ||:| ||||||------::||||:---
      27,770  TCCTGCAGAAGCGATTCTTTCCCAGAGGGGCTGTTGGAGCAGATGCT   
              :    .    :    .    :    .    :    .    :    .    

                :    .                 :    .    :              
      36,098  CCAGAGATC             CCAATGCTGTGGGCC             
              || ||: ||-------------|||:| || |||:::-------------
      27,817  CCTGAACTCTCCGCCCCTTTAACCAGTCCTTTGGATTTATTTTTATTATT
                 :    .    :    .    :    .    :    .    :    . 

                                         .    :    .    :       
      36,122                          AAGAGGGGTCCAGAGGCTGT      
              ------------------------ | | |||| :   | |||:------
      27,867  TTTAAATATTTAATTATGTTTATGTATATGGGTGTTTTGCCTGCTTGTAT
                 :    .    :    .    :    .    :    .    :    . 

                                   .    :    .      :           
      36,142                    CCTCCTTCCAGAAG  AAATAAGG       C
              ------------------|||  | |||||:|--|:| ||||-------|
      27,917  GTATATGCATCATGTGTGCCTGGTGCCAGAGGTCAGAAAAGGGTACCACC
                 :    .    :    .    :    .    :    .    :    . 

              .    :               .    :    .    :    .        
      36,165  TTCTCTGGT           TGTTGCTCAAACATTCCCTGAACTC     
              |:|:|||||----------- || |:|| ||| : || || :|:|-----
      27,967  TCCCCTGGTACTGGAATTTAGGTAGTTCTAACCCACCATGTGCCCATGTG
                 :    .    :    .    :    .    :    .    :    . 

                          :                              .    : 
      36,199             TCAGC                          CCCTCCTA
              ----------- ||||--------------------------|||| ::|
      28,017  CCCACCAGTGGACAGCAAGTGAGAGCCGACTCTCTCTTCCTACCCTGTCA
                 :    .    :    .    :    .    :    .    :    . 

                 .    :    .                      :    .        
      36,212  ACTCTAGGTTTTAA                  GGAGTAAAGCT       
              :|:|:|||  | ||------------------| |||||: ||-------
      28,067  GCCCCAGGGATGAACTCAGGCTGCCAGGCTGGGTAGTAAGTCTCTACCCA
                 :    .    :    .    :    .    :    .    :    . 

                                  :    .    :    .    :    .    
      36,237                   TCCTTTTGGGTTCCTGAAGCTGGCAGTTGGGGT
              -----------------:|:|||::||:| : | || ||-----|||| |
      28,117  CTGAGCATCTCACAGGCCCTTTTCAGGCTGTGGCAGGTG     TGGGCT
                 :    .    :    .    :    .    :    .         : 

              :    .    :    .    :    .    :    .    :    .    
      36,270  GAGAGCAGATGAGATGGAAGAGGGCTCATCAGACACTGGCCTTGGAGG  
              | |:|::|:||||:||::::|||:||||| |||::|::--| ||::||--
      28,162  GTGGGTGGGTGAGGTGAGGAAGGACTCATGAGATGCCA  CATGAGGGGT
                 .    :    .    :    .    :    .      :    .    

                 :    .    :    .    :    .    :    .    :    . 
      36,318   GTGCTGGCCTCTGCAGAACGCCAGCATCTTCTCAGAATCGTATGTTCTA
              -|||||||:| |::|||:|-----------------------------||
      28,210  TGTGCTGGTCACCACAGGA                             TA
              :    .    :    .                                 :

                 :     .    :    .    :    .    :    .    :    .
      36,367  GAAGCCTG GGCGAAGTCCGGCTAATTGTGGACTTGGGGAAAATAAGGCC
              ||:| | |-:|:|::|:|| || :: || |||: |||||-| ::|::|||
      28,231  GAGGGCAGCAGTGGGGCCCTGCAGGGTGAGGATGTGGGG ATGCAGAGCC
                  .    :    .    :    .    :    .     :    .    

                  :    .    :    .    :    .    :    .    :    .
      36,416  CAACCCCTGTTTTTGCAAGGTTAAGGAGAAATAATCTTAAACCAGTCACA
              ||:-|: ||:  |||  || | ||--||||| :||| |||:|||||||||
      28,280  CAG CTGTGCAGTTGATAGTTAAA  AGAAAAGATCATAAGCCAGTCACA
              :     .    :    .    :      .    :    .    :    . 

                  :    .    :    .    :    .    :    .    :    .
      36,466  CAAATCATCGGCATTTATTTCCTGGGTCCTAGGTGTCACTTATCCTGGTG
              ||||:||:|:||||||||||||||||-||||:||:||| |:-||:||||:
      28,327  CAAACCACCAGCATTTATTTCCTGGG CCTAAGTATCAGTC TCTTGGTA
                 :    .    :    .    :     .    :    .     :    

                  :    .      :    .    :    .    :    .    :   
      36,516  GACAGGGCAGAGG  TGGTCAGATCGTTTTGAGCCAAAATCCCTTCCCTA
               | ||||||:||:--||||||| | ||||||||:| ||||||:|||||-|
      28,375  CAAAGGGCAAAGAATTGGTCAGTTAGTTTTGAGTCCAAATCCTTTCCC A
              .    :    .    :    .    :    .    :    .    :    

               .    :    .    :    .    :    .    :    .        
      36,564  AAAATGGATCTGTGGAGCTCCATGAGGGAACCTCAGAGATGCACA    A
              ||| ||||||||||::|||||||||||:||-||:|||| ||||:|----|
      28,424  AAACTGGATCTGTGAGGCTCCATGAGGAAA CTTAGAGCTGCATAGATCA
               .    :    .    :    .    :     .    :    .    :  

              :    .    :     .    :                            
      36,610  TGACAGTTTAGC TAAAATGGCTT                          
              | | |::| |||-| |||:|||||--------------------------
      28,473  TCAGAACTGAGCTTTAAACGGCTTTCAAAACAAAACCAAAACCAAAAACC
                .    :    .    :    .    :    .    :    .    :  

                  .    :    .    :    .    :    .    :    .    :
      36,633    AAAAAATGTGAATTGATTGTCAGCTCTCTCCATATCTGCTGAAAAAAG
              --|| |:| | :|| |||||||||||||:|::|-| |||-||||||||:-
      28,523  AAAACAGAAGAAAAGTGATTGTCAGCTCCCCTC TCTCT CTGAAAAAG 
                .    :    .    :    .    :    .     :     .     

                  .    :    .    :    .    :    .    :    .    :
      36,681  GTTTAAAATTTTTAAAAAGTTTAAAAGTGTTTTCTAAAAAAGGGACAAGC
              ---| | :||||||||||:| |||||| | :||-|||||||||| :|:| 
      28,570     TTATGTTTTTAAAAAATGTAAAAGAGGCTT TAAAAAAGGGCTAGGA
                 :    .    :    .    :    .     :    .    :    .

                  . 
      36,731  AGGTCT
              ||:|||
      28,616  AGATCT
                  : 
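
Each block of the text alignment above pairs a Reference Sequence line with a Test Sequence line, and the columns can be tallied directly from those two lines. The following is a minimal sketch, not part of blastz or LAT. It assumes the layout seen in this sample output: a position number (digits and commas), two separator spaces, then the aligned bases, with a space standing for a gap in that sequence. The function names are hypothetical.

```python
import re

# Leading whitespace followed by a position number such as "34,342".
_PREFIX = re.compile(r"^\s*[\d,]+")


def sequence_part(line):
    """Return the aligned-sequence columns of one alignment line.

    Assumption: the sequence starts two characters after the position
    number, as in the sample output shown above.
    """
    match = _PREFIX.match(line)
    if match is None:
        raise ValueError("line does not start with a position number")
    return line[match.end() + 2:]


def compare_block(ref_line, test_line):
    """Count identical, mismatched and gap columns in one block."""
    ref = sequence_part(ref_line)
    test = sequence_part(test_line)
    # Pad the shorter line so trailing gaps are still counted.
    width = max(len(ref), len(test))
    ref, test = ref.ljust(width), test.ljust(width)

    counts = {"identities": 0, "mismatches": 0, "gaps": 0}
    for r, t in zip(ref, test):
        if r == " " and t == " ":
            continue  # padding on both lines, not an alignment column
        if r == " " or t == " ":
            counts["gaps"] += 1
        elif r.upper() == t.upper():
            counts["identities"] += 1
        else:
            counts["mismatches"] += 1
    return counts
```

Applied to the block that starts at positions 34,342 and 26,219, this counts 39 identical columns, 11 mismatches and no gaps. Comparing the two sequence lines directly, rather than reading the symbol line between them, avoids relying on the meaning of the individual match symbols.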

Data files

None.

Notes

The output can be visualized with LAJ. If you are working in an X-terminal and have not set the parameter -nolaj, the "wrapper" blastz will automatically start LAJ. Users who cannot work in an X-terminal can copy the files to their own computer and visualize them with a locally installed LAJ. They can also use the static picture files with the dotplot and the PIP; blastz lets you choose between PostScript and Adobe PDF format, both of which are supported by a wide range of software. Note that under wEMBOSS these pictures do not show up in the Program Output page; you must right-click on the "click to view" link next to the file name. A file <output>.pdf can usually be opened with Acrobat Reader, and a file <output>.ps can be saved to disk.
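
A script that calls the "wrapper" from a non-graphical session may want to add -nolaj itself. The sketch below is only an illustration: it assumes that an X session is signalled by a non-empty DISPLAY environment variable (the usual X11 convention; how blastz itself detects an X-terminal is not described here), and the function name is hypothetical.

```python
import os


def laj_arguments(environ=None):
    """Return the extra arguments to pass when LAJ cannot be displayed.

    Assumption: a non-empty DISPLAY variable means an X session is
    available. Without it, -nolaj is returned so that LAJ is not started.
    """
    env = os.environ if environ is None else environ
    if env.get("DISPLAY"):
        return []
    return ["-nolaj"]
```

The returned list can be appended to whatever command line the calling script builds for blastz.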

References

Human-Mouse Alignments with BLASTZ.
Scott Schwartz, W. James Kent, Arian Smit, Zheng Zhang, Robert Baertsch, Ross C. Hardison, David Haussler, and Webb Miller.
Genome Res. 2003 Jan;13(1):103-7. Erratum in: Genome Res. 2004 Apr;14(4):786

Warnings

None.

Diagnostic Error Messages

None.

Exit status

It always exits with status 0.
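
Because the exit status is always 0, a calling script cannot use it to detect a failed run. One possible workaround, sketched here under the assumption that the caller knows which output files it asked for, is to check that those files exist and are not empty. The function name and the file names passed to it are hypothetical.

```python
import subprocess
from pathlib import Path


def run_and_check(command, expected_outputs):
    """Run a command and return the expected outputs that are missing or empty.

    The exit status is deliberately ignored, since it carries no
    information when a program always exits with 0.
    """
    subprocess.run(command, check=False)
    problems = []
    for name in expected_outputs:
        path = Path(name)
        if not path.is_file() or path.stat().st_size == 0:
            problems.append(str(path))
    return problems
```

An empty list means every expected file was produced; any other result names the files that need attention.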

Known bugs

None.

See also

Program name  Description
blast2seq     Finds local alignments between two sequences, using BLAST
lfasta        Finds local alignments between two sequences, using fastA
matcher       Finds the best local alignments between two sequences
seqmatchall   All-against-all comparison of a set of sequences
sim_lav       Nonintersecting best local alignments, makes LALNVIEW file
supermatcher  Match large sequences against one or more other sequences
water         Smith-Waterman local alignment
wordfinder    Match large sequences against one or more other sequences
wordmatch     Finds all exact matches of a given size between 2 sequences
maskseq       Mask off regions of a sequence.
est2genome    Align EST and genomic DNA sequences
sim4          Align an mRNA to a genomic DNA sequence
sim4_lav      Same as sim4, but by default makes a file for LALNVIEW

Author(s)

The wrapper application blastz was written by Guy Bottu (gbottu@vub.ac.be)
BEN, ULB, Brussels, Belgium

The program blastz itself, the Java programs LAJ and LAT, the PipTools suite (from which genbank2exons and genbank2repeats are used) and the PipServer distribution (which is used to make the PostScript/PDF output) were all developed by the group of Webb Miller (webb@bio.cse.psu.edu) at the Penn State University Center for Comparative Genomics and Bioinformatics.

History

Completed 6 January 2005
Modified 5 October 2005 - adapted to BLASTZ version of 27 December 2004
Modified 14 November 2006 - added PostScript/PDF output

Target users

This program is intended to be used by everyone and everything, from naive users to embedded scripts.