BLASTX+BEAUTY Search Results

WU-BLAST 2.0 search of the National Center for Biotechnology Information's NR Protein Database.

BEAUTY post-processing provided by the Human Genome Sequencing Center, Baylor College of Medicine.

BEAUTY Reference:
Worley KC, Culpepper P, Wiese BA, Smith RF. BEAUTY-X: enhanced BLAST searches for DNA queries. Bioinformatics 1998;14(10):890-1.

Worley KC, Wiese BA, Smith RF. BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results. Genome Res 1995 Sep;5(2):173-84.



RepeatMasker repeats found in sequence:

No Repeats Found.

Reference:  Gish, Warren (1994-1997).  unpublished.
Gish, Warren and David J. States (1993).  Identification of protein coding
regions by database similarity search.  Nat. Genet. 3:266-72.

Notice: statistical significance is estimated under the assumption that the equivalent of one entire reading frame in the query sequence codes for protein and that significant alignments will involve only coding reading frames.

Query= 'E04C12_C12_06.ab1' (717 letters)

  Translating both strands of query sequence in all 6 reading frames

Database: nr 625,274 sequences; 197,782,623 total letters.



     Observed Numbers of Database Sequences Satisfying
    Various EXPECTation Thresholds (E parameter values)

        Histogram units:      = 3 Sequences     : less than 3 sequences

 EXPECTation Threshold
 (E parameter)
    |
    V   Observed Counts-->
  10000 830 161 |=====================================================
   6310 669 125 |=========================================
   3980 544 131 |===========================================
   2510 413 171 |=========================================================
   1580 242  84 |============================
   1000 158  53 |=================
    631 105  37 |============
    398  68  23 |=======
    251  45  13 |====
    158  32   8 |==
    100  24   8 |==
   63.1  16   2 |:
   39.8  14   2 |:
   25.1  12   2 |:
   15.8  10   6 |==
 >>>>>>>>>>>>>>>>>>>>>  Expect = 10.0, Observed = 4  <<<<<<<<<<<<<<<<<
   10.0   4   0 |
   6.31   4   0 |
   3.98   4   1 |:


                                                                      Smallest
                                                                        Sum
                                                      Reading  High  Probability
Sequences producing High-scoring Segment Pairs:         Frame Score  P(N)      N
gi|7486565|pir||T05123 hypothetical protein F7H19.100 ... +3   871  3.8e-86   1
gi|7487452|pir||T09350 hypothetical protein T26M18.120... +3   810  1.1e-79   1
gi|8778672|gb|AAF79680.1|AC022314_21 (AC022314) F9C16.... +3   490  8.9e-46   1
gi|5174693 ref|NP_005979.1| small proline-rich protein... +3    66  0.96      1


>gi|7486565|pir||T05123  hypothetical protein F7H19.100 - Arabidopsis thaliana
            >gi|3292817|emb|CAA19807.1| (AL031018) hypothetical protein
            [Arabidopsis thaliana] >gi|7269139|emb|CAB79247.1| (AL161558)
            hypothetical protein [Arabidopsis thaliana]
            Length = 268

Frame  3 hits (HSPs):       ____________________________________________  
                        __________________________________________________
Database sequence:     |         |        |        |         |        |   | 268
                       0        50      100      150       200      250

  Plus Strand HSPs:

 Score = 871 (306.6 bits), Expect = 3.8e-86, P = 3.8e-86
 Identities = 167/238 (70%), Positives = 190/238 (79%), Frame = +3

Query:     3 AGXNSLFPYCGRRVGKKNKAMVPVARLFGPAIFEASKLKVLFLGVDENKHPGNLPRTYTL 182
             +  +SLF +  RR  KKN+++VPVARLFGPAIFE+SKLKVLFLGVDE KHP  LPRTYTL
Sbjct:    26 SSSSSLF-FNNRRSKKKNQSIVPVARLFGPAIFESSKLKVLFLGVDEKKHPSTLPRTYTL 84

Query:   183 THSDITAKLTLAISQTINNSQLQGWYNRFQRDEVVAQWKKVKGRMSLHVHCHISGGHFLL 362
             THSDITAKLTLAISQ+INNSQLQGW NR  RDEVVA+WKKVKG+MSLHVHCHISGGHFLL
Sbjct:    85 THSDITAKLTLAISQSINNSQLQGWANRLYRDEVVAEWKKVKGKMSLHVHCHISGGHFLL 144

Query:   363 DILARLRYFIFCKELPVVLKAVVHGDENLFNSYPELQDALVWVYFHSNIPEFNKVECWGP 542
             D+ A+ RYFIFCKELPVVLKA VHGD NL N+YPELQ+ALVWVYFHSN+ EFNKVECWGP
Sbjct:   145 DLFAKFRYFIFCKELPVVLKAFVHGDGNLLNNYPELQEALVWVYFHSNVNEFNKVECWGP 204

Query:   543 LKEASAPTGGVQEEGLAIPQP-CQEECQCCFPPLTLSPIQWSKQVPSRHYEPCDGIGTQ 716
             L EA +P G   E    +P+  C +EC CCFP  T+S I WS  + +       G  T+
Sbjct:   205 LWEAVSPDGHKTE---TLPEARCADECSCCFP--TVSSIPWSHSLSNEGVNGYSGTQTE 258


>gi|7487452|pir||T09350  hypothetical protein T26M18.120 - Arabidopsis thaliana
            >gi|5002526|emb|CAB44329.1| (AL078606) putative protein
            [Arabidopsis thaliana] >gi|7267892|emb|CAB78234.1| (AL161533)
            putative protein [Arabidopsis thaliana]
            Length = 466

Frame  3 hits (HSPs):      _____________________________________________  
                        __________________________________________________
Database sequence:     |               |                |               | | 466
                       0             150              300             450

  Plus Strand HSPs:

 Score = 810 (285.1 bits), Expect = 1.1e-79, P = 1.1e-79
 Identities = 148/204 (72%), Positives = 173/204 (84%), Frame = +3

Query:    36 RRVGKKNKAMVPVARLFGPAIFEASKLKVLFLGVDENKHPGNLPRTYTLTHSDITAKLTL 215
             RR   KN+++VPVARLFGPAIFEASKLKVLFLGVDE KHP  LPRTYTLTHSDITAKLTL
Sbjct:    32 RRSKMKNRSIVPVARLFGPAIFEASKLKVLFLGVDEKKHPAKLPRTYTLTHSDITAKLTL 91

Query:   216 AISQTINNSQLQGWYNRFQRDEVVAQWKKVKGRMSLHVHCHISGGHFLLDILARLRYFIF 395
             AISQ+INNSQLQGW N+  RDEVV +WKKVKG+MSLHVHCHISGGHF L+++A+LRY+IF
Sbjct:    92 AISQSINNSQLQGWANKLFRDEVVGEWKKVKGKMSLHVHCHISGGHFFLNLIAKLRYYIF 151

Query:   396 CKELPVVLKAVVHGDENLFNSYPELQDALVWVYFHSNIPEFNKVECWGPLKEASAP---T 566
             CKELPVVL+A  HGDE L N++PELQ++ VWVYFHSNIPE+NKVECWGPL EA +     
Sbjct:   152 CKELPVVLEAFAHGDEYLLNNHPELQESPVWVYFHSNIPEYNKVECWGPLWEAMSQHQHD 211

Query:   567 GGVQEEGLAIPQ-PCQEECQCCFP 635
             G   ++   +P+ PC +EC+CCFP
Sbjct:   212 GRTHKKSETLPELPCPDECKCCFP 235

 Score = 777 (273.5 bits), Expect = 3.4e-76, P = 3.4e-76
 Identities = 149/212 (70%), Positives = 166/212 (78%), Frame = +3

Query:    57 KAMVP-VARLFGPAIFEASKLKVLFLGVDENKHPGNLPRTYTLTHSDITAKLTLAISQTI 233
             K   P VARLFG AIFEASKL V FLGVDE KHP NLPRTYT THSDITAKLTLAIS +I
Sbjct:   231 KCCFPTVARLFGQAIFEASKLNVKFLGVDEKKHPPNLPRTYTFTHSDITAKLTLAISHSI 290

Query:   234 NNSQLQGWYNRFQRDEVVAQWKKVKGRMSLHVHCHISGGHFLLDILARLRYFIFCKELPV 413
             NNSQLQGW NR  RDEVVA+W+KVK  MSLHVHCHISG HFLLD++A LRYFIFCKELP+
Sbjct:   291 NNSQLQGWANRLYRDEVVAEWRKVKSNMSLHVHCHISGDHFLLDLIAELRYFIFCKELPM 350

Query:   414 VLKAVVHGDENLFNSYPELQDALVWVYFHSNIPEFNKVECWGPLKEASAPTGGVQEEGLA 593
             VLKA VHGDEN+ N+YPEL +A VWVYFHSNIP+FNKVECWG L EA++  G        
Sbjct:   351 VLKAFVHGDENMLNNYPELHEAFVWVYFHSNIPKFNKVECWGRLCEATSHDGCKTPTCEI 410

Query:   594 IPQP-CQEECQCCFPPLTLSPIQWSKQVPSRHYE 692
             +P+P C ++C CCFP  T+S I WS      H E
Sbjct:   411 LPEPPCFDKCSCCFP--TVSTIPWSHSHGCSHGE 442


>gi|8778672|gb|AAF79680.1|AC022314_21  (AC022314) F9C16.20 [Arabidopsis
            thaliana]
            Length = 299

Frame  3 hits (HSPs):            _____________________________            
                        __________________________________________________
Database sequence:     |        |       |       |        |       |        | 299
                       0       50     100     150      200     250

  Plus Strand HSPs:

 Score = 490 (172.5 bits), Expect = 8.9e-46, P = 8.9e-46
 Identities = 97/169 (57%), Positives = 121/169 (71%), Frame = +3

Query:    54 NKAMVPVARLFGP-AIFEASKLKVLFLG-VDENKHPGNL--PRTYTLTHSDITAKLTLAI 221
             N  +    RL  P A F++SKLKV FLG + ENK  G +  PRTY L+H D TA LTL I
Sbjct:    57 NTLVSEAVRLLVPQANFDSSKLKVEFLGELLENKSNGGIITPRTYILSHCDFTANLTLTI 116

Query:   222 SQTINNSQLQGWYNRFQRDEVVAQWKKVKGRMSLHVHCHISGGHFLLDILARLRYFIFCK 401
             S  IN  QL+GWY   ++D+VVA+WKKV   + LH+HC +SG   L D+ A LRY IF K
Sbjct:   117 SNVINLDQLEGWY---KKDDVVAEWKKVNDELRLHIHCCVSGMSLLQDVAAELRYHIFSK 173

Query:   402 ELPVVLKAVVHGDENLFNSYPELQDALVWVYFHSNIPEFNKVECWGPLKEAS 557
             ELP+VLKAVVHGD  +F   PEL DA VWVYFHS+ P++N++ECWGPLK+A+
Sbjct:   174 ELPLVLKAVVHGDSVMFRENPELMDAYVWVYFHSSTPKYNRIECWGPLKDAA 225


>gi|5174693  ref|NP_005979.1| small proline-rich protein 2A [Homo sapiens]
            >gi|12719984 ref|XP_010605.1| small proline-rich protein 2A [Homo
            sapiens] >gi|464788|sp|P35326|SP2A_HUMAN SMALL PROLINE-RICH PROTEIN
            2A (SPR-2A) (2-1) >gi|107689|pir||S12712 small proline-rich protein
            spr2-1 - human >gi|3367693|emb|CAA37239.1| (X53064) small
            proline-rich protein [Homo sapiens]
            Length = 72

Frame  3 hits (HSPs):                               ____________________  
                        __________________________________________________
Database sequence:     |             |             |            |         | 72
                       0            20            40           60

  Plus Strand HSPs:

 Score = 66 (23.2 bits), Expect = 3.3, P = 0.96
 Identities = 14/28 (50%), Positives = 18/28 (64%), Frame = +3

Query:   597 PQPCQ-EECQCCFPPLTLSPIQWSKQVP 677
             PQPC  ++CQ  +PP+T SP   SK  P
Sbjct:    42 PQPCPPQQCQQKYPPVTPSPPCQSKYPP 69
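
Each HSP above reports both an Expect value and a P value. For a single HSP (N = 1) the two figures are consistent with the Poisson relation P = 1 - exp(-E). The sketch below assumes that relation; `p_from_expect` is a hypothetical helper name, not part of WU-BLAST or BEAUTY.

```python
import math

def p_from_expect(expect: float) -> float:
    """Probability of at least one chance HSP, assuming P = 1 - exp(-E)."""
    # -expm1(-E) equals 1 - exp(-E) but keeps precision when E is tiny.
    return -math.expm1(-expect)

# Expect values taken from the last and first alignments in this report.
for e in (3.3, 3.8e-86):
    print(f"E = {e:g}  ->  P = {p_from_expect(e):.2g}")
```

For E = 3.3 this gives a P of about 0.96, as reported for the small proline-rich protein hit. For very small E the two values are numerically indistinguishable, which is why the strong hits show identical Expect and P.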


Parameters:
  filter=none
  matrix=BLOSUM62
  V=50
  B=50
  E=10
  gi
  H=1
  sort_by_pvalue
  echofilter

  ctxfactor=5.97

  Query                        -----  As Used  -----    -----  Computed  ----
  Frame  MatID Matrix name     Lambda    K       H      Lambda    K       H
   Std.    0   BLOSUM62                                 0.318   0.135   0.401  
   +3      0   BLOSUM62        0.318   0.135   0.401    0.322   0.139   0.449  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +2      0   BLOSUM62        0.318   0.135   0.401    0.339   0.149   0.499  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +1      0   BLOSUM62        0.318   0.135   0.401    0.345   0.149   0.557  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -1      0   BLOSUM62        0.318   0.135   0.401    0.341   0.152   0.538  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -2      0   BLOSUM62        0.318   0.135   0.401    0.345   0.154   0.582  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -3      0   BLOSUM62        0.318   0.135   0.401    0.348   0.151   0.557  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
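
The bit scores printed next to each raw score in the alignment headers are consistent with bits = Lambda * S / ln 2, using the Lambda of 0.244 from the Q=9,R=2 rows above. The following is a minimal check of that observation under that assumption, not a statement of how WU-BLAST computes the value internally; `bits_from_raw` is a hypothetical helper name.

```python
import math

GAPPED_LAMBDA = 0.244  # "As Used" Lambda from the Q=9,R=2 rows of the table above

def bits_from_raw(score: int, lam: float = GAPPED_LAMBDA) -> float:
    """Convert a raw score to bits, assuming bits = lambda * S / ln 2."""
    return lam * score / math.log(2)

# Raw scores and bit scores as printed in the alignment headers of this report.
reported = {871: 306.6, 810: 285.1, 777: 273.5, 490: 172.5, 66: 23.2}
for raw, bits in reported.items():
    print(raw, round(bits_from_raw(raw), 1), bits)
```

Each computed value, rounded to one decimal place, agrees with the figure shown in the report.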

  Query
  Frame  MatID  Length  Eff.Length     E    S W   T  X   E2     S2
   +3      0      238       237       10.  78 3  12 22  0.11    35
                                                    32  0.10    39
   +2      0      238       237       10.  78 3  12 22  0.11    35
                                                    32  0.10    39
   +1      0      239       239       10.  78 3  12 22  0.12    35
                                                    32  0.10    39
   -1      0      239       238       10.  78 3  12 22  0.11    35
                                                    32  0.10    39
   -2      0      238       238       10.  78 3  12 22  0.11    35
                                                    32  0.10    39
   -3      0      238       237       10.  78 3  12 22  0.11    35
                                                    32  0.10    39
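
The Length column above follows from the 717-letter query: each frame starts 0, 1, or 2 nucleotides in from the end of its strand and is read in whole codons. A small sketch of that arithmetic, assuming simple floor division (`frame_lengths` is a hypothetical helper name):

```python
def frame_lengths(query_len: int) -> dict:
    """Number of complete codons in each of the six reading frames."""
    lengths = {}
    for strand in "+-":
        for frame in (1, 2, 3):
            offset = frame - 1  # nucleotides skipped before the first codon
            lengths[f"{strand}{frame}"] = (query_len - offset) // 3
    return lengths

print(frame_lengths(717))
```

For 717 letters this yields 239 codons in frames +1 and -1 and 238 in the other four, matching the table.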


Statistics:

  Database:  /usr/local/dot5/sl_home/beauty/seqdb/blast/nr
    Title:  nr
    Release date:  unknown
    Posted date:  4:06 PM CST Feb 28, 2001
    Format:  BLAST
  # of letters in database:  197,782,623
  # of sequences in database:  625,274
  # of database sequences satisfying E:  4
  No. of states in DFA:  598 (59 KB)
  Total size of DFA:  280 KB (320 KB)
  Time to generate neighborhood:  0.01u 0.01s 0.02t  Elapsed: 00:00:00
  No. of threads or processors used:  6
  Search cpu time:  260.43u 1.30s 261.73t  Elapsed: 00:01:00
  Total cpu time:  260.48u 1.32s 261.80t  Elapsed: 00:01:00
  Start:  Fri Jan 18 12:05:20 2002   End:  Fri Jan 18 12:06:20 2002

Annotated Domains Database: March 14, 2000
Release Date: March 14, 2000