BLASTX+BEAUTY Search Results

WU-BLAST 2.0 search of the National Center for Biotechnology Information's NR Protein Database.

BEAUTY post-processing provided by the Human Genome Sequencing Center, Baylor College of Medicine.

BEAUTY Reference:
Worley KC, Culpepper P, Wiese BA, Smith RF. BEAUTY-X: enhanced BLAST searches for DNA queries. Bioinformatics 1998;14(10):890-1.

Worley KC, Wiese BA, Smith RF. BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results. Genome Res 1995 Sep;5(2):173-84.

RepeatMasker Server unavailable.

Reference:  Gish, Warren (1994-1997).  unpublished.
Gish, Warren and David J. States (1993).  Identification of protein coding
regions by database similarity search.  Nat. Genet. 3:266-72.

Notice: statistical significance is estimated under the assumption that the equivalent of one entire reading frame in the query sequence codes for protein and that significant alignments will involve only coding reading frames.

Query= B05H08.seq(1>613) (571 letters)

  Translating both strands of query sequence in all 6 reading frames

Database: nr 625,274 sequences; 197,782,623 total letters.



     Observed Numbers of Database Sequences Satisfying
    Various EXPECTation Thresholds (E parameter values)

        Histogram units:      = 2 Sequences     : less than 2 sequences

 EXPECTation Threshold
 (E parameter)
    |
    V   Observed Counts-->
  10000 481 99 |=================================================
   6310 382 45 |======================
   3980 337 87 |===========================================
   2510 250 82 |=========================================
   1580 168 67 |=================================
   1000 101 38 |===================
    631  63 24 |============
    398  39 16 |========
    251  23  5 |==
    158  18  2 |=
    100  16  3 |=
   63.1  13  2 |=
   39.8  11  1 |:
   25.1  10  1 |:
   15.8   9  0 |
 >>>>>>>>>>>>>>>>>>>>>  Expect = 10.0, Observed = 9  <<<<<<<<<<<<<<<<<
   10.0   9  0 |
   6.31   9  0 |
   3.98   9  1 |:
   2.51   8  0 |
   1.58   8  0 |
   1.00   8  0 |
   0.63   8  0 |
   0.40   8  1 |:


                                                                      Smallest
                                                                        Sum
                                                      Reading  High  Probability
Sequences producing High-scoring Segment Pairs:         Frame Score  P(N)      N
gi|6714305|gb|AAF26001.1|AC013354_20 (AC013354) F15H18... +2   645  8.3e-61   1
gi|10176982|dbj|BAB10214.1| (AB010077) contains simila... +2   518  9.6e-49   1
gi|11357884|pir||T49947 hypothetical protein F8M21.10 ... +2   461  1.1e-42   1
gi|4512685|gb|AAD21739.1| (AC006931) hypothetical prot... +2   440  1.8e-40   1
gi|9558428|dbj|BAB03364.1| (AP002486) ESTs AU069374(C6... +2   439  2.3e-40   1
gi|11358151|pir||T49150 hypothetical protein T20N10.20... +2   414  1.0e-37   1
gi|7295160|gb|AAF50485.1| (AE003556) CG7550 gene produ... +2   126  1.2e-05   1
gi|6679797|ref|NP_032042.1| fibroblast growth factor i... +1    74  0.25      1
gi|5817316|gb|AAD52701.1|AF091540_1 (AF091540) cystein... +2    83  0.93      1
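
The hit summary above is laid out so that the last four whitespace-separated fields of each row are the reading frame, the high score, P(N) and N. The following is a minimal sketch of how one such row could be split into fields. The names SUMMARY_RE and parse_summary_line are hypothetical helpers introduced only for illustration; they are not part of WU-BLAST or BEAUTY.

```python
import re

# Hypothetical pattern: free-text description, then frame, score, P(N), N.
SUMMARY_RE = re.compile(
    r"^(?P<desc>.+?)\s+(?P<frame>[+-][123])\s+(?P<score>\d+)"
    r"\s+(?P<p>\S+)\s+(?P<n>\d+)\s*$"
)

def parse_summary_line(line):
    """Return the fields of one summary row as a dict, or None if no match."""
    m = SUMMARY_RE.match(line)
    if m is None:
        return None
    return {
        "description": m.group("desc"),
        "frame": int(m.group("frame")),    # "+2" -> 2, "-1" -> -1
        "score": int(m.group("score")),
        "p_value": float(m.group("p")),    # accepts "1.2e-05" and "0.25"
        "n": int(m.group("n")),
    }

row = "gi|7295160|gb|AAF50485.1|(AE003556) CG7550 gene produ... +2   126  1.2e-05   1"
print(parse_summary_line(row))
```

Header lines and blank lines do not match the pattern and return None, so the same function can be applied to every line of the table.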



>gi|6714305|gb|AAF26001.1|AC013354_20  (AC013354) F15H18.4 [Arabidopsis
            thaliana]
            Length = 1702

Frame  2 hits (HSPs):                                               ______
                        __________________________________________________
Database sequence:     |              |              |              |     | 1702
                       0            500           1000           1500

  Plus Strand HSPs:

 Score = 645 (227.1 bits), Expect = 8.3e-61, P = 8.3e-61
 Identities = 113/177 (63%), Positives = 139/177 (78%), Frame = +2

Query:    14 YVDIHECDSFTMCIFCFPTSSVIPLHDHPGMTVFSKLLYGSLHVKAYDWVEPPCIIESKE 193
             ++DIHECD+FTMCIFCFPTSSVIPLHDHP M VFSK+LYGSLHVKAYDWVEPPCII   +
Sbjct:  1526 FLDIHECDTFTMCIFCFPTSSVIPLHDHPEMAVFSKILYGSLHVKAYDWVEPPCIITQDK 1585

Query:   194 --PGYAQVRLAKLAVDKVLNAPCDTSVLYPKHGGNLHCFTAVTPCAMLDILTPPYREEEG 367
               PG    RLAKL  DKV+    +   LYPK GGNLHCFTA+TPCA+LDIL+PPY+E  G
Sbjct:  1586 GVPGSLPARLAKLVSDKVITPQSEIPALYPKTGGNLHCFTALTPCAVLDILSPPYKESVG 1645

Query:   368 RRCTYYHDYPYSAFSVANA--PICDGEEEEYAWLTELESPSDLYMRQGVYAGPAIQL 532
             R C+YY DYP+S F++ N    + +G+E+EYAWL ++++P DL+MR G Y GP I++
Sbjct:  1646 RSCSYYMDYPFSTFALENGMKKVDEGKEDEYAWLVQIDTPDDLHMRPGSYTGPTIRV 1702
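
In the alignment above, the Query coordinates are nucleotide positions while the Sbjct coordinates are residue positions, so each aligned query residue covers three nucleotides. The sketch below shows how the reported frame and the number of codons can be recovered from the query coordinates. It assumes 1-based, inclusive coordinates and plus-strand HSPs only; both function names are hypothetical.

```python
def plus_strand_frame(query_start):
    """Reading frame (1, 2 or 3) of a plus-strand HSP from its 1-based start."""
    return (query_start - 1) % 3 + 1

def codons_spanned(query_start, query_end):
    """Number of complete codons between two inclusive nucleotide positions."""
    span = query_end - query_start + 1
    if span % 3 != 0:
        raise ValueError("span is not a whole number of codons")
    return span // 3

# First HSP: Query 14..532, reported as Frame = +2.
print(plus_strand_frame(14))     # 2
print(codons_spanned(14, 532))   # 173
```

The 173 codons plus the four gap positions in the query rows account for the alignment length of 177 given in the Identities line.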


>gi|10176982|dbj|BAB10214.1|  (AB010077) contains similarity to unknown
            protein~emb|CAB89322.1~gene_id:MYH19.8 [Arabidopsis thaliana]
            Length = 270

Frame  2 hits (HSPs):                     ________________________________
                        __________________________________________________
Database sequence:     |         |        |        |        |         |   | 270
                       0        50      100      150      200       250

  Plus Strand HSPs:

 Score = 518 (182.3 bits), Expect = 9.6e-49, P = 9.6e-49
 Identities = 94/172 (54%), Positives = 123/172 (71%), Frame = +2

Query:    14 YVDIHECDSFTMCIFCFPTSSVIPLHDHPGMTVFSKLLYGSLHVKAYDWVEPPCIIESKE 193
             Y+ I+ C  F++CIFC P S VIPLH+HP MTVFSKLL+G++H+K+YDWV  P   +S +
Sbjct:   103 YLHIYACHRFSICIFCLPPSGVIPLHNHPEMTVFSKLLFGTMHIKSYDWV--P---DSPQ 157

Query:   194 PGYAQVRLAKLAVDKVLNAPCDTSVLYPKHGGNLHCFTAVTPCAMLDILTPPYREEEGRR 373
             P  +  RLAK+ VD    APCDTS+LYP  GGN+HCFTA T CA+LD++ PPY +  GR 
Sbjct:   158 PS-SDTRLAKVKVDSDFTAPCDTSILYPADGGNMHCFTAKTACAVLDVIGPPYSDPAGRH 216

Query:   374 CTYYHDYPYSAFSVANAPICDGEEEEYAWLTELES-PSDLYMRQGVYAGPAIQ 529
             CTYY DYP+S+FSV    + + E+E YAWL E E  P DL +   +Y+GP I+
Sbjct:   217 CTYYFDYPFSSFSVDGVVVAEEEKEGYAWLKEREEKPEDLTVTALMYSGPTIK 269


>gi|11357884|pir||T49947  hypothetical protein F8M21.10 - Arabidopsis thaliana
            >gi|7671481|emb|CAB89322.1| (AL353993) putative protein
            [Arabidopsis thaliana]
            Length = 293

Frame  2 hits (HSPs):                       ______________________________
                        __________________________________________________
Database sequence:     |        |       |        |       |        |       | 293
                       0       50     100      150     200      250

  Plus Strand HSPs:

 Score = 461 (162.3 bits), Expect = 1.1e-42, P = 1.1e-42
 Identities = 87/172 (50%), Positives = 118/172 (68%), Frame = +2

Query:    14 YVDIHECDSFTMCIFCFPTSSVIPLHDHPGMTVFSKLLYGSLHVKAYDWVEPPCIIESKE 193
             Y+ +H+CD F++ IFC P S VIPLH+HPGMTVFSKLL+G++H+K+YDWV    + +SK 
Sbjct:   123 YLHLHQCDQFSIGIFCLPPSGVIPLHNHPGMTVFSKLLFGTMHIKSYDWVVDAPMRDSK- 181

Query:   194 PGYAQVRLAKLAVDKVLNAPCDTSVLYPKHGGNLHCFTAVTPCAMLDILTPPYREEEGRR 373
                   RLAKL VD    APC+ S+LYP+ GGN+H FTA+T CA+LD+L PPY   EGR 
Sbjct:   182 -----TRLAKLKVDSTFTAPCNASILYPEDGGNMHRFTAITACAVLDVLGPPYCNPEGRH 236

Query:   374 CTYYHDYPYSAFSVANAPICDGEEEE--YAWLTELE-SPSDLYMRQG-VYAGPAIQ 529
             CTY+ ++P    S  +  +   EEE+  YAWL E + +P D     G +Y GP ++
Sbjct:   237 CTYFLEFPLDKLSSEDDDVLSSEEEKEGYAWLQERDDNPEDHTNVVGALYRGPKVE 292


>gi|4512685|gb|AAD21739.1|  (AC006931) hypothetical protein [Arabidopsis
            thaliana]
            Length = 242

Frame  2 hits (HSPs):                 ____________________________________
                        __________________________________________________
Database sequence:     |          |         |         |          |        | 242
                       0         50       100       150        200

  Plus Strand HSPs:

 Score = 440 (154.9 bits), Expect = 1.8e-40, P = 1.8e-40
 Identities = 84/172 (48%), Positives = 110/172 (63%), Frame = +2

Query:    14 YVDIHECDSFTMCIFCFPTSSVIPLHDHPGMTVFSKLLYGSLHVKAYDWVEPPCIIESKE 193
             Y+ +HECDSF++ IFC P SS+IPLH+HPGMTV SKL+YGS+HVK+YDW+EP  + E ++
Sbjct:    73 YLHLHECDSFSIGIFCMPPSSMIPLHNHPGMTVLSKLVYGSMHVKSYDWLEPQ-LTEPED 131

Query:   194 PGYAQVRLAKLAVDKVLNAPCDTSVLYPKHGGNLHCFTAVTPCAMLDILTPPYREEEGRR 373
             P   + R AKL  D  + A    + LYPK GGN+HCF A+T CA+LDIL PPY  E  R 
Sbjct:   132 PSQ-EARPAKLVKDTEMTAQSPVTTLYPKSGGNIHCFKAITHCAILDILAPPYSSEHDRH 190

Query:   374 CTYYHDYPYSAFSVANAPICDGEE-EEYAWLTELESPSDLYMRQGVYAGPAIQ 529
             CTY+         +      DGE   +  WL E + P D  +R+  Y GP I+
Sbjct:   191 CTYFRKSRRE--DLPGELEVDGEVVTDVTWLEEFQPPDDFVIRRIPYRGPVIR 241


>gi|9558428|dbj|BAB03364.1|  (AP002486) ESTs AU069374(C61044),D24451(R1944),
            AU031820(R1944) correspond to a region of the predicted
            gene.~Similar to Arabidopsis thaliana DNA chromosome 5, BAC clone
            F8M21; putative protein (AL353993) [Oryza sativa]
            Length = 246

Frame  2 hits (HSPs):                  ___________________________________
                        __________________________________________________
Database sequence:     |         |          |         |         |         | 246
                       0        50        100       150       200

  Plus Strand HSPs:

 Score = 439 (154.5 bits), Expect = 2.3e-40, P = 2.3e-40
 Identities = 85/171 (49%), Positives = 112/171 (65%), Frame = +2

Query:    14 YVDIHECDSFTMCIFCFPTSSVIPLHDHPGMTVFSKLLYGSLHVKAYDWVEPPCIIESKE 193
             Y+ ++EC++F++ IFC P   VIPLH+HP MTVFSKLL+G L VK+YDW +     +S +
Sbjct:    77 YLHLYECEAFSIGIFCLPPRGVIPLHNHPNMTVFSKLLFGELRVKSYDWADASQ--DSTD 134

Query:   194 PGYAQVRLAKLAVDKVLNAPCDTSVLYPKHGGNLHCFTAVTPCAMLDILTPPYREEEGRR 373
                   RLAK+ VD  LNAPC TSVLYP+ GGNLHCFTA T CA+LD+L PPY +  GR 
Sbjct:   135 AQLQGARLAKVKVDGTLNAPCATSVLYPEDGGNLHCFTAHTACAVLDVLGPPYDDGSGRH 194

Query:   374 CTYYHDYPYSAFSVANAPICDGEEEEYAWLTELESPSDLYMRQGVYAGPAI 526
             C +Y+    SA S  ++    G++  YAWL E E P + ++    Y GP I
Sbjct:   195 CQHYN-VSSSAPSAGDSKPLPGDDG-YAWLEECEPPDNFHLVGSTYMGPRI 243


>gi|11358151|pir||T49150  hypothetical protein T20N10.20 - Arabidopsis thaliana
            >gi|7630062|emb|CAB88284.1| (AL353032) putative protein
            [Arabidopsis thaliana]
            Length = 242

Frame  2 hits (HSPs):                 ____________________________________
                        __________________________________________________
Database sequence:     |          |         |         |          |        | 242
                       0         50       100       150        200

  Plus Strand HSPs:

 Score = 414 (145.7 bits), Expect = 1.0e-37, P = 1.0e-37
 Identities = 79/172 (45%), Positives = 104/172 (60%), Frame = +2

Query:    14 YVDIHECDSFTMCIFCFPTSSVIPLHDHPGMTVFSKLLYGSLHVKAYDWVEPPCIIESKE 193
             Y+ +HECDSF++ IFC P  S+IPLH+HPGMTV SKL+YGS+HVK+YDW EP    E  +
Sbjct:    73 YLQLHECDSFSIGIFCMPPGSIIPLHNHPGMTVLSKLVYGSMHVKSYDWAEPDQS-ELDD 131

Query:   194 PGYAQVRLAKLAVDKVLNAPCDTSVLYPKHGGNLHCFTAVTPCAMLDILTPPYREEEGRR 373
             P   Q R AKL  D  + +P   + LYP  GGN+HCF A+T CA+ DIL+PPY    GR 
Sbjct:   132 P--LQARPAKLVKDIDMTSPSPATTLYPTTGGNIHCFKAITHCAIFDILSPPYSSTHGRH 189

Query:   374 CTYYHDYPYSAFSVANAPICDGEE-EEYAWLTELESPSDLYMRQGVYAGPAIQ 529
             C Y+   P          + +GE      WL E + P +  + +  Y GP I+
Sbjct:   190 CNYFRKSPMLDLP-GEIEVMNGEVISNVTWLEEYQPPDNFVIWRVPYRGPVIR 241


>gi|7295160|gb|AAF50485.1|  (AE003556) CG7550 gene product [Drosophila
            melanogaster]
            Length = 240

Frame  2 hits (HSPs):               ____________________________          
                        __________________________________________________
Database sequence:     |          |         |          |         |        | 240
                       0         50       100        150       200

  Plus Strand HSPs:

 Score = 126 (44.4 bits), Expect = 1.2e-05, P = 1.2e-05
 Identities = 41/129 (31%), Positives = 65/129 (50%), Frame = +2

Query:    11 AYVDIHECDSFTMCIFCFPTSSVIPLHDHPGMTVFSKLLYGSLHVKAYDWVEPPCIIESK 190
             +Y+ I E D F+M +F    +S IPLHDHP M    + ++G L V ++     P    + 
Sbjct:    61 SYMHIFEDDRFSMSLFIVRGASTIPLHDHPMMFGLLRCIWGQLMVDSFSHQLGPDEPLTY 120

Query:   191 EPGYAQVRLAKLAVDKVLN--APCDTSVLYPKHGGNLHCFTAVTP--CAMLDILTPPYRE 358
             +P    V++  +   K++   +PC T  L P+   N H    +     A  DIL+PPY  
Sbjct:   121 DPHQTVVKV-NVEEPKLVTPASPCAT--LTPRKR-NYHQIAQIGSGVAAFFDILSPPYDA 176

Query:   359 EE---G-RRCTYY 385
             +    G R+C +Y
Sbjct:   177 DMPTYGPRQCRFY 189


>gi|6679797|ref|NP_032042.1| fibroblast growth factor inducible 15 [Mus
            musculus] >gi|2498381|sp|Q61075|FI15_MOUSE FIBROBLAST GROWTH FACTOR
            INDUCIBLE PROTEIN 15 (FIN15) >gi|1353707|gb|AAB08866.1| (U42384)
            FIN15 gene product [Mus musculus]
            Length = 87

Frame  1 hits (HSPs):      _______________________________                
                        __________________________________________________
Database sequence:     |          |           |          |           |    | 87
                       0         20          40         60          80

  Plus Strand HSPs:

 Score = 74 (26.0 bits), Expect = 0.29, P = 0.25
 Identities = 19/54 (35%), Positives = 28/54 (51%), Frame = +1

Query:   265 SFISQTRWKSALFHSSDTLCHAR--HSHTSLQRRGRKEVYILS*LSLFSI-LSC 417
             SFI    WK+A +++   +C     H+HT LQ       Y  +  SLFS+ +SC
Sbjct:     7 SFIIHKFWKNATYYTCSFVCVCMDIHTHTVLQNELFMYTYFRTAFSLFSVKISC 60
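
For the strong hits Expect and P are printed as the same number, but for this weak hit they differ (Expect = 0.29, P = 0.25). That is what the usual Poisson relation P = 1 - exp(-E) predicts. The sketch below checks the relation against the values printed in this report; the function name p_from_expect is hypothetical, and the relation is assumed here rather than taken from the program's source.

```python
import math

def p_from_expect(expect):
    """Probability of at least one chance hit given an expected count.

    Uses P = 1 - exp(-E); expm1 keeps precision when E is very small.
    """
    return -math.expm1(-expect)

for e in (0.29, 2.7, 8.3e-61):
    print(e, p_from_expect(e))
```

Rounded to two significant figures, 0.29 gives 0.25 and 2.7 gives 0.93, which agrees with the two weakest hits in the report. For very small values P and E are numerically indistinguishable.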


>gi|5817316|gb|AAD52701.1|AF091540_1  (AF091540) cysteine dioxygenase
            [Schistosoma japonicum]
            Length = 212

Frame  2 hits (HSPs):                     _________________________       
                        __________________________________________________
Database sequence:     |           |           |           |          |   | 212
                       0          50         100         150        200

  Plus Strand HSPs:

 Score = 83 (29.2 bits), Expect = 2.7, P = 0.93
 Identities = 26/104 (25%), Positives = 43/104 (41%), Frame = +2

Query:    41 FTMCIFCFPTSSVIPLHDHPGMTVFSKLLYGSLHVKAYDWVEPPCIIESKEPGYAQVRLA 220
             + + + C+       +HDH G   F KL+ G +    ++W +    +E       Q+ L 
Sbjct:    79 YNLFLLCWSEDQGTRIHDHSGAHCFVKLIKGCIKETIFEWPKY-FTVEKSNYSINQIDLP 137

Query:   221 KLAVDKVLNA-PCDTSVLYPKHG-GNLHCFTAVTPCAMLDILTPPY 352
              L V  V    P D + ++ K G   LH  +       L +  PPY
Sbjct:   138 -LTVKSVSEMRPGDVTYMHDKIGIHRLHNPSTTETAITLHLYFPPY 182


Parameters:
  filter=none
  matrix=BLOSUM62
  V=50
  B=50
  E=10
  gi
  H=1
  sort_by_pvalue
  echofilter

  ctxfactor=5.99

  Query                        -----  As Used  -----    -----  Computed  ----
  Frame  MatID Matrix name     Lambda    K       H      Lambda    K       H
   Std.    0   BLOSUM62                                 0.318   0.135   0.401  
   +3      0   BLOSUM62        0.318   0.135   0.401    0.368   0.163   0.557  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +2      0   BLOSUM62        0.318   0.135   0.401    0.325   0.142   0.467  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +1      0   BLOSUM62        0.318   0.135   0.401    0.368   0.161   0.641  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -1      0   BLOSUM62        0.318   0.135   0.401    0.366   0.162   0.623  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -2      0   BLOSUM62        0.318   0.135   0.401    0.353   0.151   0.528  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -3      0   BLOSUM62        0.318   0.135   0.401    0.358   0.159   0.580  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
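
The bit values printed next to each raw score in the alignments appear to be consistent with bits = Lambda * S / ln 2, using the Lambda of 0.244 listed in the Q=9,R=2 rows above. This is an observation about the numbers in this report, not a statement about WU-BLAST internals. A minimal sketch, with a hypothetical function name:

```python
import math

GAPPED_LAMBDA = 0.244  # "As Used" Lambda from the Q=9,R=2 rows of this report

def bit_score(raw_score, lam=GAPPED_LAMBDA):
    """Convert a raw alignment score to bits, assuming bits = lam * S / ln 2."""
    return lam * raw_score / math.log(2)

# Raw scores taken from the HSP headers in this report.
for s in (645, 518, 126, 74):
    print(s, round(bit_score(s), 1))
```

With these inputs the rounded results are 227.1, 182.3, 44.4 and 26.0 bits, matching the values shown in the corresponding Score lines.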

  Query
  Frame  MatID  Length  Eff.Length     E    S W   T  X   E2     S2
   +3      0      189       188       10.  76 3  12 22  0.12    34
                                                    31  0.12    37
   +2      0      190       189       10.  76 3  12 22  0.12    34
                                                    31  0.12    37
   +1      0      190       189       10.  76 3  12 22  0.12    34
                                                    31  0.12    37
   -1      0      190       189       10.  76 3  12 22  0.12    34
                                                    31  0.12    37
   -2      0      190       189       10.  76 3  12 22  0.12    34
                                                    31  0.12    37
   -3      0      189       189       10.  76 3  12 22  0.12    34
                                                    31  0.12    37
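
The Length column above follows from the query size: a 571-letter nucleotide query holds 190 complete codons when read from offset 0 or 1 and 189 when read from offset 2, on either strand. The sketch below reproduces those counts; frame_lengths is a hypothetical helper, and the frame labels simply mirror the ones used in this report.

```python
def frame_lengths(query_length):
    """Map each of the six frame labels to its number of complete codons."""
    lengths = {}
    for offset in range(3):
        codons = (query_length - offset) // 3
        lengths[f"+{offset + 1}"] = codons  # forward strand
        lengths[f"-{offset + 1}"] = codons  # reverse strand, same arithmetic
    return lengths

print(frame_lengths(571))
```

For 571 letters this gives 190 for frames +1, +2, -1 and -2, and 189 for frames +3 and -3, as in the table.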


Statistics:

  Database:  /usr/local/dot5/sl_home/beauty/seqdb/blast/nr
    Title:  nr
    Release date:  unknown
    Posted date:  4:06 PM CST Feb 28, 2001
    Format:  BLAST
  # of letters in database:  197,782,623
  # of sequences in database:  625,274
  # of database sequences satisfying E:  9
  No. of states in DFA:  597 (59 KB)
  Total size of DFA:  221 KB (256 KB)
  Time to generate neighborhood:  0.01u 0.01s 0.02t  Elapsed: 00:00:00
  No. of threads or processors used:  6
  Search cpu time:  183.37u 1.06s 184.43t  Elapsed: 00:00:32
  Total cpu time:  183.39u 1.10s 184.49t  Elapsed: 00:00:32
  Start:  Sat Feb  2 05:24:35 2002   End:  Sat Feb  2 05:25:07 2002

Annotated Domains Database: March 14, 2000
Release Date: March 14, 2000