WU-BLAST 2.0 search of the National Center for Biotechnology Information's NR Protein Database.
BEAUTY post-processing provided by the Human Genome Sequencing Center, Baylor College of Medicine.
BEAUTY Reference:
Worley KC, Culpepper P, Wiese BA, Smith RF. BEAUTY-X: enhanced BLAST searches for DNA queries. Bioinformatics 1998;14(10):890-1.
Worley KC, Wiese BA, Smith RF. BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results. Genome Res 1995 Sep;5(2):173-84.
RepeatMasker repeats found in sequence: No Repeats Found.

Reference: Gish, Warren (1994-1997). unpublished.
Gish, Warren and David J. States (1993). Identification of protein coding regions by database similarity search. Nat. Genet. 3:266-72.

Notice: statistical significance is estimated under the assumption that the equivalent of one entire reading frame in the query sequence codes for protein and that significant alignments will involve only coding reading frames.
Query= 'E06H07_P07_16.ab1' (515 letters)
Translating both strands of query sequence in all 6 reading frames

Database:  nr
           625,274 sequences; 197,782,623 total letters.

     Observed Numbers of Database Sequences Satisfying
    Various EXPECTation Thresholds (E parameter values)

        Histogram units:  = 3 Sequences   : less than 3 sequences

 EXPECTation Threshold
 (E parameter)
    |
    V   Observed Counts-->
  10000 747 141 |===============================================
   6310 606 114 |======================================
   3980 492 174 |==========================================================
   2510 318 102 |==================================
   1580 216  63 |=====================
   1000 153  57 |===================
    631  96  30 |==========
    398  66  21 |=======
    251  45  19 |======
    158  26   7 |==
    100  19   6 |==
   63.1  13   1 |:
   39.8  12   1 |:
   25.1  11   1 |:
   15.8  10   1 |:
          >>>>>>>>>>>>>>>>>>>>>  Expect = 10.0, Observed = 9  <<<<<<<<<<<<<<<<<
   10.0   9   0 |
   6.31   9   1 |:
   3.98   8   0 |
   2.51   8   1 |:
   1.58   7   1 |:
   1.00   6   0 |
   0.63   6   0 |
   0.40   6   0 |
   0.25   6   0 |
   0.16   6   0 |
   0.10   6   0 |
  0.063   6   0 |
  0.040   6   0 |
  0.025   6   0 |
  0.016   6   1 |:

                                                                      Smallest
                                                                         Sum
                                                       Reading  High Probability
Sequences producing High-scoring Segment Pairs:          Frame Score P(N)     N
gi|7487051|pir||T02581 hypothetical protein T16B24.15 ...  +2    252 1.2e-19  1
gi|3329366|gb|AAC39500.1| (AF031243) nodule-specific p...  +2    219 3.9e-16  1
gi|4063746|gb|AAC98454.1| (AC005851) nodulin-like prot...  +2    207 7.7e-15  1
gi|5882744|gb|AAD55297.1|AC008263_28 (AC008263) Strong...  +2    130 1.3e-06  1
gi|8778273|gb|AAF79282.1|AC068602_5 (AC068602) F14D16....  +2    116 0.00055  1
gi|7485467|pir||T02323 hypothetical protein F13P17.19 ...  +2    112 0.012    1
gi|4581109|gb|AAD24599.1|AC005825_6 (AC005825) nodulin...  +2     90 0.74     1
gi|7486862|pir||T10241 hypothetical protein T11I11.190...  +2     89 0.84     1
gi|6730730|gb|AAF27120.1|AC018849_8 (AC018849) nodulin...  +2     85 0.996    1
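As a rough illustration of the six-frame translation step noted above, the sketch below counts the complete codons available in each reading frame of a query and maps a residue index in a forward frame back to query nucleotide positions. The function names are hypothetical and are not part of WU-BLAST. The coordinate convention (1-based positions, frame +k starting at nucleotide k) is an assumption, chosen because it agrees with the Query coordinates and the per-frame Length of 171 printed in this report.

```python
def frame_lengths(query_len):
    """Return the number of complete codons in each of the six reading frames."""
    lengths = {}
    for offset in range(3):
        codons = (query_len - offset) // 3
        # Forward and reverse frames with the same offset hold the same number of codons.
        lengths[f"+{offset + 1}"] = codons
        lengths[f"-{offset + 1}"] = codons
    return lengths


def residue_to_nucleotides(frame, residue_index):
    """Map a 1-based residue index in forward frame 1, 2 or 3 to the
    1-based (start, end) nucleotide positions of its codon in the query."""
    if frame not in (1, 2, 3):
        raise ValueError("only forward frames 1, 2 and 3 are handled in this sketch")
    start = frame + 3 * (residue_index - 1)
    return start, start + 2


print(frame_lengths(515))
print(residue_to_nucleotides(2, 4))
```

For the 515-letter query, every frame holds 171 complete codons, and residue 4 of frame +2 covers nucleotides 11 to 13, which is where the first alignment's Query line begins.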
>gi|7487051|pir||T02581 hypothetical protein T16B24.15 - Arabidopsis thaliana
 >gi|3402684|gb|AAC28987.1| (AC004697) nodulin-like protein [Arabidopsis thaliana]
            Length = 601

  Plus Strand HSPs:

 Score = 252 (88.7 bits), Expect = 1.2e-19, P = 1.2e-19
 Identities = 47/74 (63%), Positives = 57/74 (77%), Frame = +2

Query:    11 GKQLAALGLKRIEGQELNCVGVHCFKLSFIIITAATFFGAIVSLILVARTRTFYKSDIYK 190
             GKQ  ALG  R+EGQ+LNC+G  CFKLSFIII A T FG +VS++LV RT+ FYKSDIYK
Sbjct:   502 GKQYKALGKTRVEGQDLNCIGTSCFKLSFIIIAAVTLFGVLVSMVLVIRTKKFYKSDIYK 561

Query:   191 RYRNAATESETEMA 232
             ++R  A  +E EMA
Sbjct:   562 KFREKALAAEMEMA 575

>gi|3329366|gb|AAC39500.1| (AF031243) nodule-specific protein Nlj70 [Lotus japonicus]
            Length = 575

  Plus Strand HSPs:

 Score = 219 (77.1 bits), Expect = 3.9e-16, P = 3.9e-16
 Identities = 42/72 (58%), Positives = 55/72 (76%), Frame = +2

Query:    14 KQLAALGLKRIEGQELNCVGVHCFKLSFIIITAATFFGAIVSLILVARTRTFYKSDIYKR 193
             +Q+AALGL+R  G+ELNC G  C+KL++IIITA   FGA+VS ILV RTR FYK+DIYK+
Sbjct:   498 RQMAALGLQRKPGEELNCNGSDCYKLAYIIITAVCLFGALVSFILVLRTRQFYKTDIYKK 557

Query:   194 YRNAATESETEM 229
             +      +ET+M
Sbjct:   558 FTEEPRTAETKM 569

>gi|4063746|gb|AAC98454.1| (AC005851) nodulin-like protein [Arabidopsis thaliana]
            Length = 577

  Plus Strand HSPs:

 Score = 207 (72.9 bits), Expect = 7.7e-15, P = 7.7e-15
 Identities = 43/81 (53%), Positives = 57/81 (70%), Frame = +2

Query:    14 KQLAALGLKRIEGQELNCVGVHCFKLSFIIITAATFFGAIVSLILVARTRTFYKSDIYKR 193
             KQL A GL R + ++L C+G  C+KL F+I+ A TFFGA+VSL L  RTR FYK DIYK+
Sbjct:   494 KQLTARGLTRKDVKDLTCLGSQCYKLPFLILAAVTFFGALVSLGLAIRTREFYKGDIYKK 553

Query:   194 YRNAATESETEMAEKDSKHVV 256
             +R +  ESE+E+   DS+  V
Sbjct:   554 FRESP-ESESELVP-DSRKAV 572

>gi|5882744|gb|AAD55297.1|AC008263_28 (AC008263) Strong similarity to gb|AF031243
            nodule-specific protein (Nlj70) from Lotus japonicus and is a member of the
            PF|00083 Sugar (and other) transporter family. EST gb|Z37715 comes from this
            gene. [Arabidopsis thaliana]
            Length = 533

  Plus Strand HSPs:

 Score = 130 (45.8 bits), Expect = 1.3e-06, P = 1.3e-06
 Identities = 23/52 (44%), Positives = 34/52 (65%), Frame = +2

Query:    38 KRIEGQELNCVGVHCFKLSFIIITAATFFGAIVSLILVARTRTFYKSDIYKR 193
             K   G+   C G HCF+LSFII+ +  FFG +V+++L  RT+T Y+  + KR
Sbjct:   478 KTASGEGNTCYGSHCFRLSFIIMASVAFFGFLVAIVLFFRTKTLYRQILVKR 529

>gi|8778273|gb|AAF79282.1|AC068602_5 (AC068602) F14D16.8 [Arabidopsis thaliana]
            Length = 526

  Plus Strand HSPs:

 Score = 116 (40.8 bits), Expect = 0.00055, P = 0.00055
 Identities = 20/52 (38%), Positives = 35/52 (67%), Frame = +2

Query:    38 KRIEGQELNCVGVHCFKLSFIIITAATFFGAIVSLILVARTRTFYKSDIYKR 193
             + I G+   C G HCF+L++++I +  F G +VS +LV RT+T Y+  I+++
Sbjct:   471 RTIIGEGNTCYGPHCFRLAYVVIASVAFLGFLVSCVLVFRTKTIYRQ-IFEK 521

>gi|7485467|pir||T02323 hypothetical protein F13P17.19 - Arabidopsis thaliana
 >gi|3337366|gb|AAC27411.1| (AC004481) nodulin-like protein [Arabidopsis thaliana]
            Length = 2301

  Plus Strand HSPs:

 Score = 112 (39.4 bits), Expect = 0.012, P = 0.012
 Identities = 19/44 (43%), Positives = 31/44 (70%), Frame = +2

Query:    62 NCVGVHCFKLSFIIITAATFFGAIVSLILVARTRTFYKSDIYKR 193
             +C G HCF+ SF+I+ A    G++V+L+L+ RT+ FY + + KR
Sbjct:   479 SCYGNHCFRTSFLIMAAMALLGSLVALVLLLRTKKFYATLVAKR 522

 Score = 103 (36.3 bits), Expect = 0.18, P = 0.17
 Identities = 18/44 (40%), Positives = 29/44 (65%), Frame = +2

Query:    62 NCVGVHCFKLSFIIITAATFFGAIVSLILVARTRTFYKSDIYKR 193
             +C G  CF+ SF+I+ +   FG++V+ +L  RT  FYK+ + KR
Sbjct:  1093 SCFGSQCFRTSFMIMASVALFGSLVASVLFFRTHKFYKNLVAKR 1136

>gi|4581109|gb|AAD24599.1|AC005825_6 (AC005825) nodulin-like protein [Arabidopsis thaliana]
            Length = 546

  Plus Strand HSPs:

 Score = 90 (31.7 bits), Expect = 1.4, P = 0.74
 Identities = 15/36 (41%), Positives = 23/36 (63%), Frame = +2

Query:    65 CVGVHCFKLSFIIITAATFFGAIVSLILVARTRTFY 172
             CVG HC++L FI++  A+  G  + L+L  RT+  Y
Sbjct:   494 CVGAHCYRLIFIVMALASVIGVGLDLVLAYRTKEIY 529

>gi|7486862|pir||T10241 hypothetical protein T11I11.190 - Arabidopsis thaliana
 >gi|5123712|emb|CAB45456.1| (AL079347) putative protein [Arabidopsis thaliana]
 >gi|7270446|emb|CAB80212.1| (AL161586) putative protein [Arabidopsis thaliana]
            Length = 567

  Plus Strand HSPs:

 Score = 89 (31.3 bits), Expect = 1.9, P = 0.84
 Identities = 15/36 (41%), Positives = 23/36 (63%), Frame = +2

Query:    65 CVGVHCFKLSFIIITAATFFGAIVSLILVARTRTFY 172
             CVG HCF++ FI++  A+  G  + L+L  RT+  Y
Sbjct:   515 CVGAHCFRIVFIVMAFASIIGVGLDLLLAYRTKGIY 550

>gi|6730730|gb|AAF27120.1|AC018849_8 (AC018849) nodulin-like protein; 38383-40406
            [Arabidopsis thaliana]
            Length = 561

  Plus Strand HSPs:

 Score = 85 (29.9 bits), Expect = 5.4, P = 1.0
 Identities = 12/41 (29%), Positives = 26/41 (63%), Frame = +2

Query:    56 ELNCVGVHCFKLSFIIITAATFFGAIVSLILVARTRTFYKS 178
             ++ C+G  CF+++F+++      G ++S+IL  R R  Y++
Sbjct:   505 KMTCIGPDCFRVTFLVLAGVCGLGTLLSIILTVRIRPVYQA 545

Parameters:
  filter=none
  matrix=BLOSUM62
  V=50
  B=50
  E=10
  gi
  H=1
  sort_by_pvalue
  echofilter

  ctxfactor=6.00

  Query                        -----  As Used  -----    -----  Computed  ----
  Frame  MatID Matrix name     Lambda    K       H      Lambda    K       H
   Std.    0   BLOSUM62                                 0.318   0.135   0.401
   +3      0   BLOSUM62        0.318   0.135   0.401    0.356   0.157   0.593
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +2      0   BLOSUM62        0.318   0.135   0.401    0.346   0.152   0.493
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +1      0   BLOSUM62        0.318   0.135   0.401    0.363   0.163   0.628
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -1      0   BLOSUM62        0.318   0.135   0.401    0.373   0.167   0.681
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -2      0   BLOSUM62        0.318   0.135   0.401    0.363   0.164   0.618
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -3      0   BLOSUM62        0.318   0.135   0.401    0.348   0.150   0.484
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a

  Query
  Frame  MatID  Length  Eff.Length     E     S  W   T   X    E2     S2
   +3      0      171       171       10.   75  3  12  22   0.11    34
                                                        31   0.11    37
   +2      0      171       171       10.   75  3  12  22   0.11    34
                                                        31   0.11    37
   +1      0      171       171       10.   75  3  12  22   0.11    34
                                                        31   0.11    37
   -1      0      171       171       10.   75  3  12  22   0.11    34
                                                        31   0.11    37
   -2      0      171       171       10.   75  3  12  22   0.11    34
                                                        31   0.11    37
   -3      0      171       171       10.   75  3  12  22   0.11    34
                                                        31   0.11    37

Statistics:

  Database:  /usr/local/dot5/sl_home/beauty/seqdb/blast/nr
    Title:  nr
    Release date:  unknown
    Posted date:  4:06 PM CST Feb 28, 2001
    Format:  BLAST
  # of letters in database:  197,782,623
  # of sequences in database:  625,274
  # of database sequences satisfying E:  9
  No. of states in DFA:  596 (59 KB)
  Total size of DFA:  205 KB (256 KB)
  Time to generate neighborhood:  0.02u 0.00s 0.02t  Elapsed: 00:00:00
  No. of threads or processors used:  6
  Search cpu time:  161.34u 0.92s 162.26t  Elapsed: 00:01:30
  Total cpu time:  161.37u 0.94s 162.31t  Elapsed: 00:01:30
  Start:  Fri Jan 18 16:14:03 2002   End:  Fri Jan 18 16:15:33 2002
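The bit scores and P values printed with each alignment can be approximately recomputed from the raw scores and the parameters in this report. The sketch below is an illustration only: the function names are hypothetical, the conversion bits = Lambda * S / ln 2 with the Q=9,R=2 "As Used" Lambda of 0.244 is inferred from the numbers in this report rather than taken from WU-BLAST documentation, and P = 1 - exp(-E) is the usual Poisson relation between an expected count and the probability of at least one chance hit.

```python
import math


def bit_score(raw_score, lam):
    """Convert a raw alignment score to bits, assuming bits = lambda * S / ln 2."""
    return lam * raw_score / math.log(2)


def p_from_expect(expect):
    """Poisson probability of at least one chance hit, given an expected count."""
    return 1.0 - math.exp(-expect)


# Assumption: the Q=9,R=2 "As Used" Lambda listed in the parameter table.
GAPPED_LAMBDA = 0.244

for raw in (252, 219, 207):
    print(raw, round(bit_score(raw, GAPPED_LAMBDA), 1))

print(round(p_from_expect(1.4), 2))
```

Under these assumptions the three top raw scores come out at 88.7, 77.1 and 72.9 bits, as printed in the alignments. The P value computed from Expect = 1.4 lands close to, but not exactly on, the reported 0.74, presumably because the displayed Expect value is itself rounded.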
Annotated Domains Database: March 14, 2000
Release Date: March 14, 2000