

BLASTX+BEAUTY Search Results

WU-BLAST 2.0 search of the National Center for Biotechnology Information's NR Protein Database.

BEAUTY post-processing provided by the Human Genome Sequencing Center, Baylor College of Medicine.

BEAUTY Reference:
Worley KC, Culpepper P, Wiese BA, Smith RF. BEAUTY-X: enhanced BLAST searches for DNA queries. Bioinformatics 1998;14(10):890-1.

Worley KC, Wiese BA, Smith RF. BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results. Genome Res 1995 Sep;5(2):173-84.



RepeatMasker repeats found in sequence:

No Repeats Found.

Reference:  Gish, Warren (1994-1997).  unpublished.
Gish, Warren and David J. States (1993).  Identification of protein coding
regions by database similarity search.  Nat. Genet. 3:266-72.

Notice: statistical significance is estimated under the assumption that the equivalent of one entire reading frame in the query sequence codes for protein and that significant alignments will involve only coding reading frames.

Query= A12A07_CONSENSUS (508 letters)

  Translating both strands of query sequence in all 6 reading frames
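
As a rough illustration of the six-frame translation step, the sketch below counts the complete codons available in each reading frame of a 508-letter query. The function name `frame_codon_counts` is hypothetical and is not part of WU-BLAST; it assumes the minus-strand frames follow the same arithmetic on the reverse complement.

```python
def frame_codon_counts(query_length):
    """Return the number of complete codons in each of the six reading frames.

    Frame +k starts at nucleotide k (1-based) of the query; frame -k starts at
    nucleotide k of the reverse complement, which has the same length.
    """
    counts = {}
    for offset in range(3):
        codons = (query_length - offset) // 3
        counts[f"+{offset + 1}"] = codons
        counts[f"-{offset + 1}"] = codons
    return counts


if __name__ == "__main__":
    for frame, codons in sorted(frame_codon_counts(508).items()):
        print(frame, codons)
```

For a 508-letter query this gives 169 codons for frames 1 and 2 and 168 for frame 3 on either strand, which agrees with the Length column of the per-frame table in the Parameters section of this report.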

Database: nr 625,274 sequences; 197,782,623 total letters.



     Observed Numbers of Database Sequences Satisfying
    Various EXPECTation Thresholds (E parameter values)

        Histogram units:      = 5 Sequences     : less than 5 sequences

 EXPECTation Threshold
 (E parameter)
    |
    V   Observed Counts-->
  10000 997 291 |==========================================================
   6310 706 151 |==============================
   3980 555 136 |===========================
   2510 419 118 |=======================
   1580 301 124 |========================
   1000 177  65 |=============
    631 112  45 |=========
    398  67  23 |====
    251  44  12 |==
    158  32  10 |==
    100  22   3 |:
   63.1  19   3 |:
   39.8  16   4 |:
   25.1  12   0 |
   15.8  12   1 |:
 >>>>>>>>>>>>>>>>>>>>>  Expect = 10.0, Observed = 11  <<<<<<<<<<<<<<<<<
   10.0  11   0 |
   6.31  11   0 |
   3.98  11   0 |
   2.51  11   0 |
   1.58  11   1 |:


                                                                     Smallest
                                                                       Sum
                                                     Reading  High  Probability
Sequences producing High-scoring Segment Pairs:        Frame Score  P(N)      N
gi|7460098|pir||T05076hypothetical protein T6K21.80 -... +2   425  6.9e-39   1
gi|6714272|gb|AAF25968.1|AC017118_5(AC017118) F6N18.8... +2   303  5.8e-26   1
gi|12643058|gb|AAK00447.1|AC060755_17(AC060755) unkno... +2   264  7.9e-22   1
gi|9758507|dbj|BAB08915.1|(AB016882) gene_id:MZA15.12... +2   245  8.1e-20   1
gi|4836888|gb|AAD30591.1|AC007369_1(AC007369) Unknown... +2   203  2.3e-15   1
gi|12323060|gb|AAG51520.1|AC068324_8(AC068324) hypoth... +2   199  6.1e-15   1
gi|12323979|gb|AAG51950.1|AC015450_11(AC015450) unkno... +2   190  5.5e-14   1
gi|4510423|gb|AAD21509.1|(AC006929) unknown protein [... +2   129  1.6e-07   1
gi|11280631|pir||T47878hypothetical protein T4C21.80 ... +2   125  9.5e-07   1
gi|9755373|gb|AAF98180.1|AC000107_3(AC000107) F17F8.4... +2   124  1.2e-06   1
gi|5441885|dbj|BAA82383.1|(AP000367) Similar to Arabi... +2    73  0.74      2



Locally-aligned regions (HSPs) with respect to query sequence:

Locus_ID                Frame 2 Hits
gi|7460098             |             _____________________________________
gi|6714272             |    ______________________________________________
gi|12643058            |             _____________________________________
gi|9758507             |             _____________________________________
gi|4836888             |                    ______________________________
gi|12323060            |                    ___________________________   
gi|12323979            |                      ____________________________
gi|4510423             |                      _________________________   
gi|11280631            |                     _____________________________
gi|9755373             |                      ___________________________ 
gi|5441885             |                      ___________   ______________
                        __________________________________________________
Query sequence:        |              |              |             |      | 170
                       0             50            100           150



>gi|7460098|pir||T05076  hypothetical protein T6K21.80 - Arabidopsis thaliana
            >gi|2894599|emb|CAA17133.1| (AL021889) putative protein
            [Arabidopsis thaliana] >gi|7268542|emb|CAB78792.1| (AL161547)
            putative protein [Arabidopsis thaliana]
            Length = 254

Frame  2 hits (HSPs):   _________________________                         
                        __________________________________________________
Database sequence:     |         |         |         |         |         || 254
                       0        50       100       150       200       250

  Plus Strand HSPs:

 Score = 425 (149.6 bits), Expect = 6.9e-39, P = 6.9e-39
 Identities = 78/123 (63%), Positives = 91/123 (73%), Frame = +2

Query:   137 MAIENQDNTVREIKPKNRRIMGAGGPDDDDNRWPPWLKPLLKESFFVQCKLHADSHKSEC 316
             MAIE+Q+NT+REIKPKNRRIMGAGGP++++NRWPPWLKPLLKE FFV CK H DSHKSEC
Sbjct:     1 MAIEDQENTIREIKPKNRRIMGAGGPEEEENRWPPWLKPLLKEQFFVHCKFHGDSHKSEC 60

Query:   317 NMLLLGLYEWPLCSLCLAHHKDHRAIQIXSPHTMMLIRXSXIQKVFGLTGVQPTLSTAQG 496
             NM  L     PLCSLCLAHHKDHR IQI       +IR + IQK   + G+Q  +  +  
Sbjct:    61 NMYCLDCTNGPLCSLCLAHHKDHRTIQIRRSSYHDVIRVNEIQKYLDIGGIQTYVINSAK 120

Query:   497 XVF 505
              VF
Sbjct:   121 VVF 123
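
The Query coordinates in the alignment above are nucleotide positions, while the Sbjct coordinates are residue positions. The sketch below shows one way to relate them and to reproduce the printed percentages. The helper names are hypothetical, and the use of integer truncation for the percentages is an assumption inferred from the figures printed in this report (for example 91/123 shown as 73%), not a documented WU-BLAST rule.

```python
def query_frame(nt_start):
    """Plus-strand reading frame (1, 2 or 3) for a 1-based nucleotide start."""
    return (nt_start - 1) % 3 + 1


def residues_spanned(nt_start, nt_end):
    """Number of translated residues covered by an inclusive nucleotide range."""
    return (nt_end - nt_start + 1) // 3


def percent_truncated(matches, alignment_length):
    """Percentage rounded down to a whole number (assumed reporting rule)."""
    return 100 * matches // alignment_length


if __name__ == "__main__":
    print(query_frame(137))            # frame of the first HSP
    print(residues_spanned(137, 505))  # residues covered by the first HSP
    print(percent_truncated(78, 123), percent_truncated(91, 123))
```

Applied to the first hit, nucleotides 137 to 505 fall in frame +2 and span 123 residues, matching the alignment length in the Identities line.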


>gi|6714272|gb|AAF25968.1|AC017118_5  (AC017118) F6N18.8 [Arabidopsis thaliana]
            Length = 270

Frame  2 hits (HSPs):    _____________________________                    
                        __________________________________________________
Database sequence:     |         |        |        |        |         |   | 270
                       0        50      100      150      200       250

  Plus Strand HSPs:

 Score = 303 (106.7 bits), Expect = 5.8e-26, P = 5.8e-26
 Identities = 74/152 (48%), Positives = 91/152 (59%), Frame = +2

Query:    50 FSSIFLKKQKHEKSSTLSHSLSLSLLSTNMAIE---NQDNTVREIKPKNRRIMGAGGPDD 220
             F S+    QK   S     SL   L S  +AIE   +Q+ T+REIK    +   A   ++
Sbjct:     8 FGSLNPPYQKQGLSQL--RSLGSVLGSKKLAIEENHHQEATIREIKALQHKNQHA---EE 62

Query:   221 DDNR-WPPWLKPLLKESFFVQCKLHADSHKSECNMLLLGLYEWPLCSLCLAHHKDHRAIQ 397
             + N+ +P WLKPLL+E FFVQCKLHADSHKSECNM  L     PLCSLCL+ HKDH AIQ
Sbjct:    63 ETNKTYPHWLKPLLREKFFVQCKLHADSHKSECNMYCLDCTNGPLCSLCLSFHKDHHAIQ 122

Query:   398 IXSPHTMMLIRXSXIQKVFGLTGVQPTLSTAQGXVF 505
             I       +IR S IQK   +TGVQ  +  +   VF
Sbjct:   123 IRRSSYHDVIRVSEIQKFLDITGVQTYVINSAKVVF 158


>gi|12643058|gb|AAK00447.1|AC060755_17  (AC060755) unknown protein [Oryza
            sativa]
            Length = 254

Frame  2 hits (HSPs):   ___________________________                       
                        __________________________________________________
Database sequence:     |         |         |         |         |         || 254
                       0        50       100       150       200       250

  Plus Strand HSPs:

 Score = 264 (92.9 bits), Expect = 7.9e-22, P = 7.9e-22
 Identities = 62/137 (45%), Positives = 81/137 (59%), Frame = +2

Query:   137 MAIENQDNTVREIKPKNRRIMGAGGPDDDDN-------RWPPWLKPLLKESFFVQCKLHA 295
             MAI+++ +  +E++ KNRRIMG GGP+ ++        +WP WL PLL  SFF QCK+HA
Sbjct:     1 MAIDHE-SPFKELRLKNRRIMGGGGPEPEEEEAVAHGEQWPRWLSPLLSASFFSQCKVHA 59

Query:   296 DSHKS-ECNMLLLGLYE------WPLCSLCLAH-HKDHRAIQIXSPHTMMLIRXSXIQKV 451
             DSH+S ECNM  L            LCSLCLAH H+DH  IQI       +IR S IQ+ 
Sbjct:    60 DSHRSGECNMFCLDCAADADAAAAALCSLCLAHNHRDHHTIQIRRSSYHDVIRVSDIQRF 119

Query:   452 FGLTGVQPTLSTAQGXVF 505
               + GVQ  +  +   VF
Sbjct:   120 MDIGGVQTYVINSARVVF 137


>gi|9758507|dbj|BAB08915.1|  (AB016882) gene_id:MZA15.12~pir||T05076~similar to
            unknown protein [Arabidopsis thaliana]
            Length = 226

Frame  2 hits (HSPs):   __________________________                        
                        __________________________________________________
Database sequence:     |          |          |          |           |     | 226
                       0         50        100        150         200

  Plus Strand HSPs:

 Score = 245 (86.2 bits), Expect = 8.1e-20, P = 8.1e-20
 Identities = 50/123 (40%), Positives = 69/123 (56%), Frame = +2

Query:   137 MAIENQDNTVREIKPKNRRIMGAGGPDDDDNRWPPWLKPLLKESFFVQCKLHADSHKSEC 316
             MAIE+ +N  REIKPKNRR M      + +N+WP WLKPLL + FF QCK H    ++EC
Sbjct:     1 MAIEDYENPNREIKPKNRRFM------EGENQWPIWLKPLLNQHFFAQCKFHGHLPRTEC 54

Query:   317 NMLLLGLYEWPLCSLCLAHHKDHRAIQIXSPHTMMLIRXSXIQKVFGLTGVQPTLSTAQG 496
              M  L       CSLCL+ H++HR IQI       + +   IQK   ++ +Q  +  +  
Sbjct:    55 KMYCLDCTNDSFCSLCLSEHENHRTIQIRISSYHNVTKVDEIQKYLDISSIQTYVINSSK 114

Query:   497 XVF 505
              +F
Sbjct:   115 VLF 117


>gi|4836888|gb|AAD30591.1|AC007369_1  (AC007369) Unknown protein [Arabidopsis
            thaliana]
            Length = 243

Frame  2 hits (HSPs):    ____________________                             
                        __________________________________________________
Database sequence:     |          |         |         |         |         | 243
                       0         50       100       150       200

  Plus Strand HSPs:

 Score = 203 (71.5 bits), Expect = 2.3e-15, P = 2.3e-15
 Identities = 38/97 (39%), Positives = 55/97 (56%), Frame = +2

Query:   215 DDDDNRWPPWLKPLLKESFFVQCKLHADSHKSECNMLLLGLYEWPLCSLCLAHHKDHRAI 394
             +++D   PPWL P+L+ S+FV C +H DS+K+ECN+  L       CS CL  HKDHR +
Sbjct:     6 EEEDYTSPPWLMPMLRGSYFVPCSIHVDSNKNECNLFCLDCAGNAFCSYCLVKHKDHRVV 65

Query:   395 QIXSPHTMMLIRXSXIQKVFGLTGVQPTLSTAQGXVF 505
             QI       ++R + IQK   +  VQ  +  +   VF
Sbjct:    66 QIRRSSYHNVVRVNEIQKFIDIACVQTYIINSAKIVF 102


>gi|12323060|gb|AAG51520.1|AC068324_8  (AC068324) hypothetical protein
            [Arabidopsis thaliana]
            Length = 216

Frame  2 hits (HSPs):   ____________________                              
                        __________________________________________________
Database sequence:     |           |          |           |           |   | 216
                       0          50        100         150         200

  Plus Strand HSPs:

 Score = 199 (70.1 bits), Expect = 6.1e-15, P = 6.1e-15
 Identities = 37/86 (43%), Positives = 51/86 (59%), Frame = +2

Query:   215 DDDDNRWPPWLKPLLKESFFVQCKLHADSHKSECNMLLLGLYEWPLCSLCLAHHKDHRAI 394
             ++DD   PPWL P+L+  +FV C +H+ S KSECN+  L       CS CLAHH+ HR I
Sbjct:     2 ENDDVMTPPWLTPMLRADYFVTCSIHSQSSKSECNLFCLDCSGNAFCSSCLAHHRTHRVI 61

Query:   395 QIXSPHTMMLIRXSXIQKVFGLTGVQ 472
             QI       ++R S IQK   ++ +Q
Sbjct:    62 QIRRSSYHNVVRVSEIQKHIDISCIQ 87


>gi|12323979|gb|AAG51950.1|AC015450_11  (AC015450) unknown protein; 77280-78196
            [Arabidopsis thaliana]
            Length = 242

Frame  2 hits (HSPs):     ____________________                            
                        __________________________________________________
Database sequence:     |          |         |         |          |        | 242
                       0         50       100       150        200

  Plus Strand HSPs:

 Score = 190 (66.9 bits), Expect = 5.5e-14, P = 5.5e-14
 Identities = 35/90 (38%), Positives = 53/90 (58%), Frame = +2

Query:   236 PPWLKPLLKESFFVQCKLHADSHKSECNMLLLGLYEWPLCSLCLAHHKDHRAIQIXSPHT 415
             PPWL P+L+ ++F+ C +HA S+KSECNM  L       CS CL +H++HR +QI     
Sbjct:    15 PPWLIPMLRANYFIPCSIHAASNKSECNMFCLDCSSEAFCSYCLLNHRNHRVLQIRRSSY 74

Query:   416 MMLIRXSXIQKVFGLTGVQPTLSTAQGXVF 505
               ++R + IQK   ++ VQ  +  +   VF
Sbjct:    75 HNVVRVNEIQKYIDISCVQTYIINSARIVF 104


>gi|4510423|gb|AAD21509.1|  (AC006929) unknown protein [Arabidopsis thaliana]
            Length = 135

Frame  2 hits (HSPs):    ______________________________                   
                        __________________________________________________
Database sequence:     |                  |                 |             | 135
                       0                 50               100

  Plus Strand HSPs:

 Score = 129 (45.4 bits), Expect = 1.6e-07, P = 1.6e-07
 Identities = 27/80 (33%), Positives = 40/80 (50%), Frame = +2

Query:   236 PPWLKPLLKESFFVQCKLHADSHKSECNMLLLGLYEWPLCSLCLAH-HKDHRAIQIXSPH 412
             P WL+ LL+ +FF  C  H ++ ++ECNM  L       C  C +  H DH  +QI    
Sbjct:     4 PKWLEGLLRTNFFSICPRHRETPRNECNMFCLSCQNAAFCFYCRSSFHIDHPVLQIRRSS 63

Query:   413 TMMLIRXSXIQKVFGLTGVQ 472
                ++R S I+    + GVQ
Sbjct:    64 YHDVVRVSEIENALDIRGVQ 83


>gi|11280631|pir||T47878  hypothetical protein T4C21.80 - Arabidopsis thaliana
            >gi|7329677|emb|CAB82671.1| (AL162295) putative protein
            [Arabidopsis thaliana]
            Length = 245

Frame  2 hits (HSPs):   ____________________                              
                        __________________________________________________
Database sequence:     |          |         |         |         |         | 245
                       0         50       100       150       200

  Plus Strand HSPs:

 Score = 125 (44.0 bits), Expect = 9.5e-07, P = 9.5e-07
 Identities = 29/95 (30%), Positives = 46/95 (48%), Frame = +2

Query:   221 DDNRWPPWLKPLLKESFFVQCKLHADSHKSECNMLLLGLYEWPLCSLCLAHHKDHRAIQI 400
             +   +P WL+ LLK+ FF  C  H D  K+E N+L +      +C  CL+ H  HR +QI
Sbjct:     2 ESGEFPAWLEVLLKDKFFNACLDHEDDKKNEKNILCIDCC-LTICPHCLSSHTSHRLLQI 60

Query:   401 XSPHTMMLIRXSXIQKVFGLTGVQPTLSTAQGXVF 505
                    ++R     K+   + +QP  + +   VF
Sbjct:    61 RRYVYRDVLRVEDGSKLMDCSLIQPYTTNSSKVVF 95


>gi|9755373|gb|AAF98180.1|AC000107_3  (AC000107) F17F8.4 [Arabidopsis thaliana]
            Length = 241

Frame  2 hits (HSPs):      ___________________                            
                        __________________________________________________
Database sequence:     |          |         |         |          |        | 241
                       0         50       100       150        200

  Plus Strand HSPs:

 Score = 124 (43.7 bits), Expect = 1.2e-06, P = 1.2e-06
 Identities = 29/86 (33%), Positives = 47/86 (54%), Frame = +2

Query:   236 PPWLKPLLKESFFVQCKLHADSHKSECNMLLLGLYEWPLCSLCLAHHKDHRAIQIXSPHT 415
             P WL+ L+ E+FF  C +H    KSE N+  L L    +C  CL  H+ H  +Q+     
Sbjct:    19 PAWLEGLMAETFFSSCGIHETRRKSEKNVFCL-LCCLSVCPHCLPSHRSHPLLQVRRYVY 77

Query:   416 MMLIRXSXIQKVFGLTGVQP-TLSTAQ 493
               ++R S ++K+   + VQP T++ A+
Sbjct:    78 HDVVRLSDLEKLIDCSYVQPYTINGAK 104


>gi|5441885|dbj|BAA82383.1|  (AP000367) Similar to Arabidopsis thaliana BAC
            genomic sequence. (AL021889) [Oryza sativa]
            Length = 306

Frame  2 hits (HSPs):     _______    ________                             
                        __________________________________________________
Database sequence:     |        |       |       |       |       |       | | 306
                       0       50     100     150     200     250     300

  Plus Strand HSPs:

 Score = 73 (25.7 bits), Expect = 1.4, Sum P(2) = 0.74
 Identities = 14/34 (41%), Positives = 17/34 (50%), Frame = +2

Query:   230 RWPPWLKPLLKESFFVQCKLHADSHKSECNMLLL 331
             R P WL+ LL   FF  C  H    ++ECN   L
Sbjct:    17 REPAWLRSLLGARFFEACAAHRGMSRNECNQYCL 50

 Score = 49 (17.2 bits), Expect = 1.4, Sum P(2) = 0.74
 Identities = 12/44 (27%), Positives = 21/44 (47%), Frame = +2

Query:   374 HKDHRAIQIXSPHTMMLIRXSXIQKVFGLTGVQPTLSTAQGXVF 505
             H+ HR +Q+       ++R S +++   LT VQ  +      VF
Sbjct:    85 HR-HRVVQVRRSSYHNVVRVSELERTLDLTRVQTYVINRDRVVF 127
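
Each HSP header prints both an Expect value and a P value. A minimal sketch of the usual relationship between the two follows, assuming the Poisson model conventionally used for BLAST statistics, P = 1 - exp(-E). The function name `p_from_expect` is hypothetical. For the last hit, the printed Expect of 1.4 gives roughly 0.75 rather than the reported 0.74; the small gap is plausibly due to rounding of the printed Expect, but that is an assumption.

```python
import math


def p_from_expect(expect):
    """Probability of at least one chance hit, given an expected count.

    Uses expm1 so that very small Expect values (e.g. 6.9e-39) are not
    lost to floating-point rounding.
    """
    return -math.expm1(-expect)


if __name__ == "__main__":
    for expect in (6.9e-39, 1.2e-06, 1.4):
        print(expect, p_from_expect(expect))
```

For small values the two numbers are effectively identical, which is why most hits in this report show the same figure for Expect and P.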


Parameters:
  filter=none
  matrix=BLOSUM62
  V=50
  B=50
  E=10
  gi
  H=1
  sort_by_pvalue
  echofilter

  ctxfactor=5.97

  Query                        -----  As Used  -----    -----  Computed  ----
  Frame  MatID Matrix name     Lambda    K       H      Lambda    K       H
   Std.    0   BLOSUM62                                 0.318   0.135   0.401  
   +3      0   BLOSUM62        0.318   0.135   0.401    0.339   0.140   0.444  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +2      0   BLOSUM62        0.318   0.135   0.401    0.325   0.137   0.430  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +1      0   BLOSUM62        0.318   0.135   0.401    0.352   0.152   0.565  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -1      0   BLOSUM62        0.318   0.135   0.401    0.348   0.151   0.499  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -2      0   BLOSUM62        0.318   0.135   0.401    0.346   0.150   0.525  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -3      0   BLOSUM62        0.318   0.135   0.401    0.350   0.151   0.541  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
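
The bit scores printed in the HSP headers appear to be reproducible from the raw scores and the gapped Lambda (0.244, the "As Used" value on the Q=9,R=2 rows above) as bits = Lambda * S / ln 2. The sketch below is an observation about the numbers in this report, not a statement of WU-BLAST's documented formula, and the function name `bit_score` is hypothetical.

```python
import math

# "As Used" Lambda from the Q=9,R=2 rows of the table above.
GAPPED_LAMBDA = 0.244


def bit_score(raw_score, lam=GAPPED_LAMBDA):
    """Convert a raw alignment score to bits, assuming bits = lam * S / ln 2."""
    return lam * raw_score / math.log(2)


if __name__ == "__main__":
    for raw in (425, 303, 264, 73, 49):
        print(raw, round(bit_score(raw), 1))
```

With these inputs the rounded results match the values printed beside the raw scores in this report (for instance 425 and 149.6 bits, 303 and 106.7 bits).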

  Query
  Frame  MatID  Length  Eff.Length     E    S W   T  X   E2     S2
   +3      0      168       165       10.  75 3  12 22  0.10    34
                                                    30  0.10    37
   +2      0      169       165       10.  75 3  12 22  0.10    34
                                                    30  0.10    37
   +1      0      169       165       10.  75 3  12 22  0.10    34
                                                    30  0.10    37
   -1      0      169       166       10.  75 3  12 22  0.10    34
                                                    30  0.10    37
   -2      0      169       165       10.  75 3  12 22  0.10    34
                                                    30  0.10    37
   -3      0      168       165       10.  75 3  12 22  0.10    34
                                                    30  0.10    37


Statistics:

  Database:  /usr/local/dot5/sl_home/beauty/seqdb/blast/nr
    Title:  nr
    Release date:  unknown
    Posted date:  4:06 PM CST Feb 28, 2001
    Format:  BLAST
  # of letters in database:  197,782,623
  # of sequences in database:  625,274
  # of database sequences satisfying E:  11
  No. of states in DFA:  593 (58 KB)
  Total size of DFA:  203 KB (256 KB)
  Time to generate neighborhood:  0.02u 0.00s 0.02t  Elapsed: 00:00:00
  No. of threads or processors used:  6
  Search cpu time:  163.39u 1.09s 164.48t  Elapsed: 00:00:28
  Total cpu time:  163.42u 1.11s 164.53t  Elapsed: 00:00:28
  Start:  Mon Oct  1 23:41:44 2001   End:  Mon Oct  1 23:42:12 2001

Annotated Domains Database: March 14, 2000
Release Date: March 14, 2000