BLASTX+BEAUTY Search Results

WU-BLAST 2.0 search of the National Center for Biotechnology Information's NR Protein Database.

BEAUTY post-processing provided by the Human Genome Sequencing Center, Baylor College of Medicine.

BEAUTY Reference:
Worley KC, Culpepper P, Wiese BA, Smith RF. BEAUTY-X: enhanced BLAST searches for DNA queries. Bioinformatics 1998;14(10):890-1.

Worley KC, Wiese BA, Smith RF. BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results. Genome Res 1995 Sep;5(2):173-84.



RepeatMasker repeats found in sequence:

No Repeats Found.

Reference:  Gish, Warren (1994-1997).  unpublished.
Gish, Warren and David J. States (1993).  Identification of protein coding
regions by database similarity search.  Nat. Genet. 3:266-72.

Notice: statistical significance is estimated under the assumption that the equivalent of one entire reading frame in the query sequence codes for protein and that significant alignments will involve only coding reading frames.

Query= 'E02H07_B12_04.ab1' (829 letters)

  Translating both strands of query sequence in all 6 reading frames

Database: nr 625,274 sequences; 197,782,623 total letters.



     Observed Numbers of Database Sequences Satisfying
    Various EXPECTation Thresholds (E parameter values)

        Histogram units:      = 6 Sequences     : less than 6 sequences

 EXPECTation Threshold
 (E parameter)
    |
    V   Observed Counts-->
  10000 1242 353 |==========================================================
   6310  889 186 |===============================
   3980  703 195 |================================
   2510  508 147 |========================
   1580  361 123 |====================
   1000  238  79 |=============
    631  159  41 |======
    398  118  37 |======
    251   81  26 |====
    158   55  17 |==
    100   38  12 |==
   63.1   26   7 |=
   39.8   19   2 |:
   25.1   17   5 |:
   15.8   12   3 |:
 >>>>>>>>>>>>>>>>>>>>>  Expect = 10.0, Observed = 9  <<<<<<<<<<<<<<<<<
   10.0    9   2 |:
   6.31    7   1 |:
   3.98    6   1 |:
   2.51    5   0 |
   1.58    5   1 |:
   1.00    4   2 |:
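
The three numeric columns of the histogram above appear to be related as follows: the second column is the cumulative number of database sequences at or below each E threshold, the third is the number falling between that threshold and the next lower one, and the bar draws one `=` per 6 sequences (with `:` for a non-empty bin of fewer than 6). The sketch below reproduces that relationship from the cumulative counts; the function names are illustrative and not part of WU-BLAST.

```python
# Cumulative counts copied from the histogram above (E = 10000 down to 1.00).
cumulative = [1242, 889, 703, 508, 361, 238, 159, 118, 81, 55, 38,
              26, 19, 17, 12, 9, 7, 6, 5, 5, 4]

UNIT = 6  # one '=' per 6 sequences, per the "Histogram units" legend


def bin_counts(cum):
    """Per-bin counts: difference between consecutive cumulative counts."""
    return [a - b for a, b in zip(cum, cum[1:])]


def bar(count, unit=UNIT):
    """Bar text: one '=' per full unit, ':' for a non-empty bin below one unit."""
    if count >= unit:
        return "=" * (count // unit)
    return ":" if count > 0 else ""


counts = bin_counts(cumulative)
for c in counts[:5]:
    print(f"{c:4d} |{bar(c)}")
```

The count for the last row (E = 1.00) cannot be derived this way, because it depends on the cumulative count at the next lower threshold, which the histogram does not print.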


                                                                     Smallest
                                                                       Sum
                                                     Reading  High  Probability
Sequences producing High-scoring Segment Pairs:        Frame Score  P(N)      N
gi|11357983|pir||T48025hypothetical protein T12C14.30... +1   745  8.5e-73   1
gi|6498462|dbj|BAA87851.1|(AP000816) hypothetical pro... +1   713  2.1e-69   1
gi|7517337|pir||B72581hypothetical protein APES063 - ... +3    50  0.55      2
gi|6691188|gb|AAF24526.1|AC007534_7(AC007534) F7F22.1... +1    88  0.56      1
gi|9294244|dbj|BAB02146.1|(AP000411) copia retroeleme... +1    95  0.68      1
gi|1030731|emb|CAA32198.1|(X14037) polyprotein [Droso... +1    94  0.97      1
gi|85056|pir||S02021micropia polyprotein - fruit fly ... +1    94  0.99      1
gi|11358885|pir||T48160transcription factor GT-3a - A... +1    85  0.9993    1
gi|6683623|dbj|BAA89271.1|(AB025309) Gag [Alternaria ... +1    86  0.9997    1



>gi|11357983|pir||T48025  hypothetical protein T12C14.30 - Arabidopsis thaliana
            >gi|7340704|emb|CAB82947.1| (AL162507) putative protein
            [Arabidopsis thaliana]
            Length = 479

Frame  1 hits (HSPs):   _____________________                             
                        __________________________________________________
Database sequence:     |               |               |              |   | 479
                       0             150             300            450

  Plus Strand HSPs:

 Score = 745 (262.3 bits), Expect = 8.5e-73, P = 8.5e-73
 Identities = 141/196 (71%), Positives = 163/196 (83%), Frame = +1

Query:    94 ANRPDPDIDDDFRELYKEYTGPLGTATTN-MQERAKSNK-RSNAGSDEEEEAR-DPNAVP 264
             + R DP++DDDF E+YKEYTGP    T N +Q++ K  K RS    DEEEE   DPN+VP
Sbjct:     3 STRSDPELDDDFSEIYKEYTGPASAVTNNNIQDKDKPVKQRSEERCDEEEEQLPDPNSVP 62

Query:   265 TDFTSREAKVWEAKSKATERNWKKRKEEEMICKLCGESGHFTQGCPSTLGANRKSQDFFE 444
             TDFTSREAKVWEAKSKATERNWKKRKEEEMICK+CGESGHFTQGCPSTLGANRKSQ+FFE
Sbjct:    63 TDFTSREAKVWEAKSKATERNWKKRKEEEMICKICGESGHFTQGCPSTLGANRKSQEFFE 122

Query:   445 RIPARDKNVRALFTEKVLSKIEKDVGCKIKMDEKFIIVSGKDRLILAKGVDAGHKIREEG 624
             R+PARD NVR LFTEKV+  IE++  CKIK+DEKFIIVSGKDRLIL KGVDA HK++E+G
Sbjct:   123 RVPARDNNVRVLFTEKVMESIERETSCKIKLDEKFIIVSGKDRLILRKGVDAVHKVKEDG 182

Query:   625 DQRGSSSSQMTQSRSP 672
             + + SS S  ++SRSP
Sbjct:   183 EMKSSSVSHRSRSRSP 198
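
The bit scores printed in parentheses look consistent with converting the raw score S to bits as lambda * S / ln 2, using the Lambda = 0.244 listed for Q=9,R=2 in the Parameters table of this report. This is an inference from the numbers shown here, not a statement taken from the WU-BLAST documentation; a minimal check:

```python
import math

LAMBDA_GAPPED = 0.244  # "As Used" Lambda for Q=9,R=2 in this report


def bit_score(raw_score, lam=LAMBDA_GAPPED):
    """Convert a raw score to bits as lambda * S / ln(2) (assumed relation)."""
    return lam * raw_score / math.log(2)


for raw in (745, 713, 50):
    print(raw, round(bit_score(raw), 1))
```

For the raw scores 745, 713 and 50 this gives values that round to the 262.3, 251.0 and 17.6 bits reported in this search.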


>gi|6498462|dbj|BAA87851.1|  (AP000816) hypothetical protein [Oryza sativa]
            >gi|7106535|dbj|BAA92220.1| (AP001278) hypothetical protein [Oryza
            sativa]
            Length = 460

Frame  1 hits (HSPs):   ______________________                            
                        __________________________________________________
Database sequence:     |                |               |               | | 460
                       0              150             300             450

  Plus Strand HSPs:

 Score = 713 (251.0 bits), Expect = 2.1e-69, P = 2.1e-69
 Identities = 139/199 (69%), Positives = 160/199 (80%), Frame = +1

Query:    91 MANRPDPDIDDD-FRELY-KEYTGPLGTATTNMQERAKSNKR--SNAGSDEEEEARDPNA 258
             MA  P P+IDD+ F E+Y K Y+GP+ T T N+  R    KR      SDEE+   DPNA
Sbjct:     1 MAREPSPEIDDELFNEVYGKAYSGPVATTTNNVTPRVNDEKRPLEREKSDEEDGPPDPNA 60

Query:   259 VPTDFTSREAKVWEAKSKATERNWKKRKEEEMICKLCGESGHFTQGCPSTLGANRKSQDF 438
             VPTDFTSREAKVWEAK+KATERNWKKRKEEEMICK+CGESGHFTQGCPSTLGANR++ DF
Sbjct:    61 VPTDFTSREAKVWEAKAKATERNWKKRKEEEMICKICGESGHFTQGCPSTLGANRRNADF 120

Query:   439 FERIPARDKNVRALFTEKVLSKIEKDVGCKIKMDEKFIIVSGKDRLILAKGVDAGHKIRE 618
             FER+PARDK VR LFTE+ +S+IEKDVGCKIKMDEKF+ VSGKDRLILAKGVDA HKI +
Sbjct:   121 FERVPARDKQVRDLFTERTISQIEKDVGCKIKMDEKFLFVSGKDRLILAKGVDAVHKIIQ 180

Query:   619 EGDQRGSSSS-QMTQSRSP 672
             EG  + +SSS +  + RSP
Sbjct:   181 EGKGKNTSSSPKRDRLRSP 199
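
The Identities and Positives percentages in this report appear to be truncated rather than rounded: 139/199 is about 69.8%, yet it is printed as 69%. A small sketch of that convention, using a hypothetical helper name:

```python
def percent_truncated(matches, length):
    """Integer percentage, truncated toward zero (integer arithmetic only)."""
    return (100 * matches) // length


print(percent_truncated(139, 199))  # identities in the alignment above
print(percent_truncated(160, 199))  # positives in the alignment above
```

Keep this in mind when comparing these figures with tools that round to the nearest percent.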


>gi|7517337|pir||B72581  hypothetical protein APES063 - Aeropyrum pernix (strain
            K1) >gi|5105622|dbj|BAA80935.1| (AP000062) 64aa long hypothetical
            protein [Aeropyrum pernix]
            Length = 64

Frame  3 hits (HSPs):                             _______________________ 
Frame  2 hits (HSPs):          ___________________                        
                        __________________________________________________
Database sequence:     |              |               |               |   | 64
                       0             20              40              60

  Plus Strand HSPs:

 Score = 50 (17.6 bits), Expect = 0.80, Sum P(2) = 0.55
 Identities = 12/29 (41%), Positives = 16/29 (55%), Frame = +3

Query:   678 SPVSARFHRSEPKGLILTRNTSRFTKVGR 764
             S V+AR HR+    L+L     R T+ GR
Sbjct:    35 SLVTARHHRAVNTSLLLAHTARRSTRGGR 63

 Score = 40 (14.1 bits), Expect = 0.80, Sum P(2) = 0.55
 Identities = 8/24 (33%), Positives = 15/24 (62%), Frame = +2

Query:   503 RLKRMLAAKLRWMRSLLLSVVRID 574
             R ++ L  +L W R L L++V ++
Sbjct:    11 RGRQSLKPRLSWDRGLQLALVNVE 34
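
For plus-strand HSPs, the reported frame can be recovered from the query start coordinate, and the alignment length in residues from the nucleotide span. The sketch below assumes 1-based, inclusive query coordinates as printed in the alignments; it does not cover minus-strand frames, and the function names are illustrative.

```python
def plus_frame(query_start):
    """Plus-strand frame (1, 2 or 3) from a 1-based nucleotide start."""
    return (query_start - 1) % 3 + 1


def aligned_codons(query_start, query_end):
    """Number of codons spanned by a plus-strand HSP (inclusive coordinates)."""
    span = query_end - query_start + 1
    if span % 3 != 0:
        raise ValueError("span is not a whole number of codons")
    return span // 3


# The two HSPs above: query 678..764 and query 503..574.
print(plus_frame(678), aligned_codons(678, 764))
print(plus_frame(503), aligned_codons(503, 574))
```

The results agree with the Frame = +3 and Frame = +2 labels and with the 29- and 24-residue alignment lengths given in the Identities lines.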


>gi|6691188|gb|AAF24526.1|AC007534_7  (AC007534) F7F22.12 [Arabidopsis thaliana]
            Length = 169

Frame  1 hits (HSPs):              _____________________________          
                        __________________________________________________
Database sequence:     |              |              |              |     | 169
                       0             50            100            150

  Plus Strand HSPs:

 Score = 88 (31.0 bits), Expect = 0.81, P = 0.56
 Identities = 24/93 (25%), Positives = 42/93 (45%), Frame = +1

Query:   136 LYKEYTGPLGTAT-TNMQERAKSNKRSNAGSDEEEEARDPNAVPTD-FTSREAKVWEAKS 309
             L   Y G + T   +N +E+ + N    A  D+E E    N +  +   +R   V +  +
Sbjct:    41 LPSRYDGLVETMKYSNSREKLRLNDVMVAARDKEREMSQNNRLIAEGHYARRRPVGKNNN 100

Query:   310 ---KATERNWKKRKEEEMICKLCGESGHFTQGC 399
                K   R+W K  + + +C +CG+  HF + C
Sbjct:   101 QGNKGKNRSWSKSADGKRVCWICGKEKHFNEQC 133
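
The Expect and P values reported for each HSP are related by the usual Poisson conversion P = 1 - exp(-E), which is why the two are nearly identical for very small E and diverge as E approaches 1. A minimal sketch, with an illustrative function name:

```python
import math


def p_from_expect(e_value):
    """Probability of at least one chance hit given an expected count E."""
    # -expm1(-E) equals 1 - exp(-E) but stays accurate for very small E.
    return -math.expm1(-e_value)


for e in (8.5e-73, 0.81, 3.4):
    print(e, p_from_expect(e))
```

Because the report prints E to only two significant figures, recomputed P values can differ from the printed ones in the last digit.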


>gi|9294244|dbj|BAB02146.1|  (AP000411) copia retroelement pol polyprotein-like
            [Arabidopsis thaliana]
            Length = 526

Frame  1 hits (HSPs):              _____________________                  
                        __________________________________________________
Database sequence:     |              |             |             |       | 526
                       0            150           300           450

  Plus Strand HSPs:

 Score = 95 (33.4 bits), Expect = 1.1, P = 0.68
 Identities = 52/217 (23%), Positives = 86/217 (39%), Frame = +1

Query:    43 SSHPHRVQFS*DIY-FLMANRPDPDID-DDFRELYKEYTGPLGTATTN--MQERAKSNKR 210
             +S P+R+      Y + M +    D + DDF +L  +    +G   T   ++E  K +K 
Sbjct:   122 TSLPNRIYLHLKFYTYKMTDSKSIDGNVDDFLKLVTDLNN-IGVNVTKERIKESGKLSKT 180

Query:   211 SNAGSDEEEEARDPNAVPTDFTSREAKVWEAKSKATERNWKKR---KEEEMICKLCGESG 381
              + G   E   R    +   F   + K W  +SK+  R+ K R    +    C +C   G
Sbjct:   181 QSEGLYVETRGR----LEKRFDKGKGKPWRGRSKSKGRS-KSRPNYNKNNNGCFICRREG 235

Query:   382 HFTQGCPSTLGANRKSQDFFERIPARDKNVRALFTEKVLSKIEK--DVGCKIKMDEKFII 555
             H+ + CP    +N+ S      I    K    L T    +K E   D GC       F I
Sbjct:   236 HWKRECPEK-SSNKPSSS--ANIAVEPKQPLVLTTSPQYTKEESVVDSGCS------FHI 286

Query:   556 VSGKDRLILAKGVDAGHKIREEGDQRGSSSSQMTQSRSPEEVLLVL 693
                KD     +  D G  +      R        +  +P++ +++L
Sbjct:   287 TPNKDSPFGLQEFDGGKVLMGNMTHREVKGIGKIKILNPDDYVVIL 332


>gi|1030731|emb|CAA32198.1|  (X14037) polyprotein [Drosophila melanogaster]
            Length = 1053

Frame  1 hits (HSPs):           ________                                  
                        __________________________________________________
Database sequence:     |       |      |      |      |      |      |      || 1053
                       0     150    300    450    600    750    900   1050

  Plus Strand HSPs:

 Score = 94 (33.1 bits), Expect = 3.4, P = 0.97
 Identities = 38/146 (26%), Positives = 66/146 (45%), Frame = +1

Query:   187 ERAKSNKRSNAGSDEEEEARDPNAVPTDFTSREAK-VWEAKSKATE-RNWKKRKEEEMI- 357
             ++ +  +  N G D++     P  V   F S+  + + E +SK  + R  K ++E+  + 
Sbjct:   187 DKKRHARDDNLGPDQKNRKASP--VVCHFCSKPGRRIAECRSKMRQDRRAKPQREKSNVT 244

Query:   358 CKLCGESGHFTQGCPSTLGANRKSQDFFERIPARDKNVR----ALFTEKVLSKIEKDVG- 522
             C  CG+ GHF+  CP   G   K QD  ++       V     +L     +  I  D G 
Sbjct:   245 CYRCGQPGHFSNQCPKN-GTAAK-QDVTQQKTVNQCCVTEPKGSLHQRGEIYPICFDSGA 302

Query:   523 -CKIKMDEKFIIVSGK--DRLILAKGVDAG 603
              C +  D+    +SGK  +  ++ KG+  G
Sbjct:   303 ECSLIKDDISSKLSGKRINNTVMIKGIGGG 332


>gi|85056|pir||S02021  micropia polyprotein - fruit fly (Drosophila
            melanogaster)  (fragment)
            Length = 1291

Frame  1 hits (HSPs):          ______                                     
                        __________________________________________________
Database sequence:     |                   |                  |           | 1291
                       0                 500               1000

  Plus Strand HSPs:

 Score = 94 (33.1 bits), Expect = 4.2, P = 0.99
 Identities = 38/146 (26%), Positives = 66/146 (45%), Frame = +1

Query:   187 ERAKSNKRSNAGSDEEEEARDPNAVPTDFTSREAK-VWEAKSKATE-RNWKKRKEEEMI- 357
             ++ +  +  N G D++     P  V   F S+  + + E +SK  + R  K ++E+  + 
Sbjct:   187 DKKRHARDDNLGPDQKNRKASP--VVCHFCSKPGRRIAECRSKMRQDRRAKPQREKSNVT 244

Query:   358 CKLCGESGHFTQGCPSTLGANRKSQDFFERIPARDKNVR----ALFTEKVLSKIEKDVG- 522
             C  CG+ GHF+  CP   G   K QD  ++       V     +L     +  I  D G 
Sbjct:   245 CYRCGQPGHFSNQCPKN-GTAAK-QDVTQQKTVNQCCVTEPKGSLHQRGEIYPICFDSGA 302

Query:   523 -CKIKMDEKFIIVSGK--DRLILAKGVDAG 603
              C +  D+    +SGK  +  ++ KG+  G
Sbjct:   303 ECSLIKDDISSKLSGKRINNTVMIKGIGGG 332


>gi|11358885|pir||T48160  transcription factor GT-3a - Arabidopsis thaliana
            >gi|6573264|gb|AAF17610.1|AF206715_1 (AF206715) transcription
            factor GT-3a [Arabidopsis thaliana] >gi|7320716|emb|CAB81921.1|
            (AL161746) transcription factor GT-3a [Arabidopsis thaliana]
            Length = 323

Frame  1 hits (HSPs):                          ___________________        
                        __________________________________________________
Database sequence:     |       |       |       |      |       |       |   | 323
                       0      50     100     150    200     250     300

  Plus Strand HSPs:

 Score = 85 (29.9 bits), Expect = 7.3, P = 1.0
 Identities = 33/118 (27%), Positives = 53/118 (44%), Frame = +1

Query:    34 SSSSSHPHRVQFS*DIYFLMANRPDPDIDDDFRELY---KEYTGPLGTAT-TNMQERAKS 201
             S+SS   H  QFS D      + P+ DI+++   L    K  T  + T+T TN ++RAK 
Sbjct:   152 STSSKRKHH-QFSSDDEEEEVDEPNQDINEELLSLVETQKRETEVITTSTSTNPRKRAKK 210

Query:   202 NKRSNAGSDEEEEARDPNAVPTDFTSREAKV-------WEAKS---KATERNWKKRKEE 348
              K   +G+  E        +  +F  +  K+       WE K    +  E+ W++R  E
Sbjct:   211 GKGVASGTKAETAGNTLKDILEEFMRQTVKMEKEWRDAWEMKEIEREKREKEWRRRMAE 269


>gi|6683623|dbj|BAA89271.1|  (AB025309) Gag [Alternaria alternata]
            Length = 406

Frame  1 hits (HSPs):                                        __________   
                        __________________________________________________
Database sequence:     |                  |                 |             | 406
                       0                150               300

  Plus Strand HSPs:

 Score = 86 (30.3 bits), Expect = 8.0, P = 1.0
 Identities = 24/74 (32%), Positives = 32/74 (43%), Frame = +1

Query:   202 NKRSNAGSDEEEEARDPNAVPTDFTSREAKVWEAKSKATERNWK-KRKEEEMICKLCGES 378
             N+RS A    +     P AV  +  S EA  WE    +  R  + K K   + C  CG+ 
Sbjct:   307 NQRSTAHDGAQNH---PRAVQRE-ASPEAMDWEPSKVSQARESRVKTKRAPLTCYSCGKP 362

Query:   379 GHFTQGCPSTLGANR 423
             GH  + C ST    R
Sbjct:   363 GHIARDCQSTTRVRR 377


Parameters:
  filter=none
  matrix=BLOSUM62
  V=50
  B=50
  E=10
  gi
  H=1
  sort_by_pvalue
  echofilter

  ctxfactor=5.99

  Query                        -----  As Used  -----    -----  Computed  ----
  Frame  MatID Matrix name     Lambda    K       H      Lambda    K       H
   Std.    0   BLOSUM62                                 0.318   0.135   0.401  
   +3      0   BLOSUM62        0.318   0.135   0.401    0.367   0.169   0.676  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +2      0   BLOSUM62        0.318   0.135   0.401    0.348   0.153   0.496  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +1      0   BLOSUM62        0.318   0.135   0.401    0.324   0.139   0.415  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -1      0   BLOSUM62        0.318   0.135   0.401    0.359   0.164   0.630  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -2      0   BLOSUM62        0.318   0.135   0.401    0.357   0.161   0.552  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -3      0   BLOSUM62        0.318   0.135   0.401    0.340   0.147   0.469  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a

  Query
  Frame  MatID  Length  Eff.Length     E    S W   T  X   E2     S2
   +3      0      275       275       10.  79 3  12 22  0.097   36
                                                    33  0.12    39
   +2      0      276       276       10.  79 3  12 22  0.098   36
                                                    33  0.12    39
   +1      0      276       276       10.  79 3  12 22  0.098   36
                                                    33  0.12    39
   -1      0      276       276       10.  79 3  12 22  0.098   36
                                                    33  0.12    39
   -2      0      276       276       10.  79 3  12 22  0.098   36
                                                    33  0.12    39
   -3      0      275       275       10.  79 3  12 22  0.097   36
                                                    33  0.12    39
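
The per-frame Length values in the table above follow from the 829-letter query: each frame skips 0, 1 or 2 leading nucleotides and translates the remaining whole codons. A short sketch of that arithmetic, with an illustrative function name:

```python
QUERY_LENGTH = 829  # letters in the query, from the report header


def frame_length(query_length, frame):
    """Translated length of a reading frame (+1..+3 or -1..-3)."""
    offset = abs(frame) - 1  # nucleotides skipped before the first codon
    return (query_length - offset) // 3


for frame in (+3, +2, +1, -1, -2, -3):
    print(f"{frame:+d} {frame_length(QUERY_LENGTH, frame)}")
```

This yields 276 residues for frames 1 and 2 and 275 for frame 3 on either strand, matching the Length column.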


Statistics:

  Database:  /usr/local/dot5/sl_home/beauty/seqdb/blast/nr
    Title:  nr
    Release date:  unknown
    Posted date:  4:06 PM CST Feb 28, 2001
    Format:  BLAST
  # of letters in database:  197,782,623
  # of sequences in database:  625,274
  # of database sequences satisfying E:  9
  No. of states in DFA:  595 (59 KB)
  Total size of DFA:  257 KB (320 KB)
  Time to generate neighborhood:  0.02u 0.00s 0.02t  Elapsed: 00:00:00
  No. of threads or processors used:  6
  Search cpu time:  307.93u 0.98s 308.91t  Elapsed: 00:02:21
  Total cpu time:  307.98u 0.98s 308.96t  Elapsed: 00:02:21
  Start:  Wed Jan 16 19:57:04 2002   End:  Wed Jan 16 19:59:25 2002

Annotated Domains Database: March 14, 2000
Release Date: March 14, 2000