WU-BLAST 2.0 search of the National Center for Biotechnology Information's NR Protein Database.
BEAUTY post-processing provided by the Human Genome Sequencing Center, Baylor College of Medicine.
BEAUTY Reference:
Worley KC, Culpepper P, Wiese BA, Smith RF. BEAUTY-X: enhanced BLAST searches for DNA queries. Bioinformatics 1998;14(10):890-1.
Worley KC, Wiese BA, Smith RF. BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results. Genome Res 1995 Sep;5(2):173-84.
RepeatMasker repeats found in sequence: No Repeats Found.

Reference: Gish, Warren (1994-1997). unpublished.
Gish, Warren and David J. States (1993). Identification of protein coding regions by database similarity search. Nat. Genet. 3:266-72.

Notice: statistical significance is estimated under the assumption that the equivalent of one entire reading frame in the query sequence codes for protein and that significant alignments will involve only coding reading frames.
Query= 'D18H11_P23_16.ab1' (580 letters)
Translating both strands of query sequence in all 6 reading frames

Database: nr
          625,274 sequences; 197,782,623 total letters.

     Observed Numbers of Database Sequences Satisfying
    Various EXPECTation Thresholds (E parameter values)

        Histogram units: = 3 Sequences   : less than 3 sequences

 EXPECTation Threshold
 (E parameter)
    |
    V   Observed Counts-->
  10000 801 150 |==================================================
   6310 651 121 |========================================
   3980 530 115 |======================================
   2510 415 126 |==========================================
   1580 289  93 |===============================
   1000 196  70 |=======================
    631 126  30 |==========
    398  96  30 |==========
    251  66  30 |==========
    158  36   8 |==
    100  28   3 |=
   63.1  25   5 |=
   39.8  20   3 |=
   25.1  17   3 |=
   15.8  14   1 |:
 >>>>>>>>>>>>>>>>>>>>> Expect = 10.0, Observed = 13 <<<<<<<<<<<<<<<<<
   10.0  13   1 |:
   6.31  12   0 |
   3.98  12   1 |:
   2.51  11   2 |:
   1.58   9   0 |
   1.00   9   0 |
   0.63   9   0 |
   0.40   9   0 |
   0.25   9   1 |:

                                                                     Smallest
                                                                       Sum
                                                     Reading  High  Probability
Sequences producing High-scoring Segment Pairs:        Frame Score  P(N)      N

gi|7485759|pir||T02677 hypothetical protein F19D11.4 -...  +2   533  2.5e-50   1
gi|7485758|pir||T02676 hypothetical protein F19D11.3 -...  +2   518  9.6e-49   1
gi|7485757|pir||T02675 hypothetical protein F19D11.2 -...  +2   507  1.4e-47   1
gi|10177848|dbj|BAB11277.1| (AB009049) gene_id:MCD7.26...  +2   505  2.3e-47   1
gi|12597859|gb|AAG60168.1|AC084110_1 (AC084110) unknow...  +2   502  4.8e-47   1
gi|10177846|dbj|BAB11275.1| (AB009049) gene_id:MCD7.24...  +2   488  1.4e-45   1
gi|8920604|gb|AAF81326.1|AC007767_6 (AC007767) Strong ...  +2   449  2.2e-41   1
gi|11357471|pir||T48513 hypothetical protein F15N18.13...  +2   343  1.2e-29   1
gi|7431419|pir||D70989 probable oxidoreductase - Mycob...  +2    97  0.19      1
gi|11560006ref|NP_071556.1| L-gulono-gamma-lactone ox...   +2    89  0.85      1
gi|625202|pir||OXRTGU L-gulonolactone oxidase (EC 1.1....  +2    89  0.85      1
gi|12188976|emb|CAC21485.1| (AL512562) putative d-arab...  +2    87  0.97      1
gi|7482785|pir||C69152 polyferredoxin - Methanobacteri...  +2    82  0.9997    1
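The hit summary above is a fixed trailing-column layout (reading frame, high score, P(N), N), so it can be pulled apart mechanically. The following is a minimal sketch in Python, not part of WU-BLAST or BEAUTY: the names `Hit` and `parse_hit` and the `1e-5` cutoff are hypothetical choices made for illustration, and the two sample rows are copied from the summary.

```python
import re
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    """One row of the hit summary (hypothetical container, not a BLAST API)."""
    description: str
    frame: str
    score: int
    p_value: float
    n: int


# Trailing columns of a summary row: reading frame, high score, P(N), N.
_ROW = re.compile(
    r"^(?P<desc>.+?)\s+(?P<frame>[+-][123])\s+(?P<score>\d+)"
    r"\s+(?P<p>\d[\d.]*(?:[eE][+-]?\d+)?)\s+(?P<n>\d+)\s*$"
)


def parse_hit(line: str) -> Optional[Hit]:
    """Return a Hit for a summary row, or None for headers and other lines."""
    m = _ROW.match(line.strip())
    if m is None:
        return None
    return Hit(m["desc"], m["frame"], int(m["score"]), float(m["p"]), int(m["n"]))


rows = [
    "gi|7485759|pir||T02677 hypothetical protein F19D11.4 -...  +2   533  2.5e-50   1",
    "gi|7431419|pir||D70989 probable oxidoreductase - Mycob...  +2    97  0.19      1",
]
hits = [h for h in map(parse_hit, rows) if h is not None]

# Assumed cutoff for illustration only; the report itself applies E=10.
significant = [h for h in hits if h.p_value < 1e-5]
for h in significant:
    print(h.frame, h.score, h.p_value, h.description)
```

With these two rows only the Arabidopsis F19D11.4 hit passes the assumed cutoff; the Mycobacterium hit (P = 0.19) is filtered out.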
>gi|7485759|pir||T02677 hypothetical protein F19D11.4 - Arabidopsis thaliana
 >gi|3510251|gb|AAC33495.1| (AC005310) unknown protein [Arabidopsis thaliana]
            Length = 603

Frame 2 hits (HSPs): [graphic: HSP position along the database sequence, 0 to 603]

 Plus Strand HSPs:

 Score = 533 (187.6 bits), Expect = 2.5e-50, P = 2.5e-50
 Identities = 91/124 (73%), Positives = 107/124 (86%), Frame = +2

Query:    11 RSKDPLAPRLFQDILEEIEQIGLFKYGGLPHWGKNRNLAFLGAIKKYHYADKFLKVKEEY 190
             R+KDPL+PRL++D +EEIEQI LFKY  LPHWGKNRNLAF G IKKY     FLKVKE Y
Sbjct:   468 RAKDPLSPRLYEDFIEEIEQIALFKYNALPHWGKNRNLAFDGVIKKYKNVPAFLKVKESY 527

Query:   191 DPQGLFSSLWTDQMLGLQEGVTILKDGCAMEGMCICSQDSHCAPKKGYFCRPGRIYKEAR 370
             DP GLFSS WTDQ+LG++  VTI+KDGCA+EG+C+CS+D+HCAP KGYFCRPG++YKEAR
Sbjct:   528 DPMGLFSSEWTDQILGIKGNVTIIKDGCALEGLCVCSEDAHCAPTKGYFCRPGKVYKEAR 587

Query:   371 VCTR 382
             VCTR
Sbjct:   588 VCTR 591

>gi|7485758|pir||T02676 hypothetical protein F19D11.3 - Arabidopsis thaliana
 >gi|3510250|gb|AAC33494.1| (AC005310) unknown protein [Arabidopsis thaliana]
            Length = 570

Frame 2 hits (HSPs): [graphic: HSP position along the database sequence, 0 to 570]

 Plus Strand HSPs:

 Score = 518 (182.3 bits), Expect = 9.6e-49, P = 9.6e-49
 Identities = 88/124 (70%), Positives = 107/124 (86%), Frame = +2

Query:    11 RSKDPLAPRLFQDILEEIEQIGLFKYGGLPHWGKNRNLAFLGAIKKYHYADKFLKVKEEY 190
             R+KDPL PRL++D +EEIEQI LFKY  LPHWGKNRNLAF G I+KY+ A  FLKVK+ Y
Sbjct:   439 RAKDPLTPRLYEDFIEEIEQIALFKYNALPHWGKNRNLAFDGVIRKYNNAPAFLKVKDSY 498

Query:   191 DPQGLFSSLWTDQMLGLQEGVTILKDGCAMEGMCICSQDSHCAPKKGYFCRPGRIYKEAR 370
             DP+GLFSS WTDQ+LG++   +I+KDGCA+EG+CICS+D+HCAP KGY CRPG++YKEAR
Sbjct:   499 DPKGLFSSEWTDQILGIKGNASIVKDGCALEGLCICSKDAHCAPAKGYLCRPGKVYKEAR 558

Query:   371 VCTR 382
             VCTR
Sbjct:   559 VCTR 562

>gi|7485757|pir||T02675 hypothetical protein F19D11.2 - Arabidopsis thaliana
 >gi|3510249|gb|AAC33493.1| (AC005310) unknown protein [Arabidopsis thaliana]
            Length = 590

Frame 2 hits (HSPs): [graphic: HSP position along the database sequence, 0 to 590]

 Plus Strand HSPs:

 Score = 507 (178.5 bits), Expect = 1.4e-47, P = 1.4e-47
 Identities = 88/123 (71%), Positives = 101/123 (82%), Frame = +2

Query:    11 RSKDPLAPRLFQDILEEIEQIGLFKYGGLPHWGKNRNLAFLGAIKKYHYADKFLKVKEEY 190
             R+ DPL PRL++D +EEIEQI L KY  LPHWGKNRNLAF G IKKY  A  FLKVKE Y
Sbjct:   464 RANDPLTPRLYEDFIEEIEQIALLKYNALPHWGKNRNLAFDGVIKKYKNAPAFLKVKESY 523

Query:   191 DPQGLFSSLWTDQMLGLQEGVTILKDGCAMEGMCICSQDSHCAPKKGYFCRPGRIYKEAR 370
             DP GLFSS WTDQ+LG++   TI+KDGCA+EG+CICS D+HCAP KGY CRPG++YKEAR
Sbjct:   524 DPNGLFSSEWTDQILGIKGNPTIVKDGCALEGLCICSDDAHCAPSKGYLCRPGKVYKEAR 583

Query:   371 VCT 379
             VCT
Sbjct:   584 VCT 586

>gi|10177848|dbj|BAB11277.1| (AB009049) gene_id:MCD7.26~pir||T02677~strong similarity to unknown protein [Arabidopsis thaliana]
            Length = 577

Frame 2 hits (HSPs): [graphic: HSP position along the database sequence, 0 to 577]

 Plus Strand HSPs:

 Score = 505 (177.8 bits), Expect = 2.3e-47, P = 2.3e-47
 Identities = 89/125 (71%), Positives = 108/125 (86%), Frame = +2

Query:    11 RSK-DPLAPRLFQDILEEIEQIGLFKYGGLPHWGKNRNLAFLGAIKKYHYADKFLKVKEE 187
             RSK DPLAPRL++D +EEIEQ+ +FKY  LPHWGKNRNLAF GAI+KY  A+ FLKVKE+
Sbjct:   450 RSKNDPLAPRLYEDYIEEIEQMAIFKYNALPHWGKNRNLAFDGAIRKYKNANAFLKVKEK 509

Query:   188 YDPQGLFSSLWTDQMLGLQEGVTILKDGCAMEGMCICSQDSHCAPKKGYFCRPGRIYKEA 367
             +D  GLFS+ WTDQ+LGL+  VTI+K GCA+EG+CICS+DSHCAP KGY CRPG++Y+EA
Sbjct:   510 FDSLGLFSTEWTDQILGLKGNVTIVKQGCALEGLCICSEDSHCAPTKGYLCRPGKVYREA 569

Query:   368 RVCTR 382
             RVCTR
Sbjct:   570 RVCTR 574

>gi|12597859|gb|AAG60168.1|AC084110_1 (AC084110) unknown protein [Arabidopsis thaliana]
            Length = 595

Frame 2 hits (HSPs): [graphic: HSP position along the database sequence, 0 to 595]

 Plus Strand HSPs:

 Score = 502 (176.7 bits), Expect = 4.8e-47, P = 4.8e-47
 Identities = 89/134 (66%), Positives = 110/134 (82%), Frame = +2

Query:    11 RSKD-PLAPRLFQDILEEIEQIGLFKYGGLPHWGKNRNLAFLGAIKKYHYADKFLKVKEE 187
             RSKD PLAPRL++D +EEIEQ+ +FKY  LPHWGKNRNLAF G I+KY  A+ FLKVKE
Sbjct:   451 RSKDDPLAPRLYEDFIEEIEQMAIFKYNALPHWGKNRNLAFDGVIRKYKNANTFLKVKER 510

Query:   188 YDPQGLFSSLWTDQMLGLQEGVTILKDGCAMEGMCICSQDSHCAPKKGYFCRPGRIYKEA 367
             +DP GLFS+ WT+Q+LGL+  VTI+K+GCA+EG+C+CS D+HCAPKKGY CRPG++Y +A
Sbjct:   511 FDPLGLFSTEWTNQILGLKGNVTIVKEGCALEGLCVCSDDAHCAPKKGYLCRPGKVYTKA 570

Query:   368 RVCTRDVKKTKEADD 412
             RVCT  VK     DD
Sbjct:   571 RVCTH-VKSVNGYDD 584

>gi|10177846|dbj|BAB11275.1| (AB009049) gene_id:MCD7.24~pir||T02676~strong similarity to unknown protein [Arabidopsis thaliana]
            Length = 252

Frame 2 hits (HSPs): [graphic: HSP position along the database sequence, 0 to 252]

 Plus Strand HSPs:

 Score = 488 (171.8 bits), Expect = 1.4e-45, P = 1.4e-45
 Identities = 88/124 (70%), Positives = 101/124 (81%), Frame = +2

Query:    11 RSKD-PLAPRLFQDILEEIEQIGLFKYGGLPHWGKNRNLAFLGAIKKYHYADKFLKVKEE 187
             RSKD P  PRL++D +EEIEQ+ + KY  LPHWGKNRNLAF GAIKKY  A+ FLKVKE
Sbjct:   125 RSKDDPWTPRLYEDYMEEIEQMAILKYNALPHWGKNRNLAFDGAIKKYKNANTFLKVKER 184

Query:   188 YDPQGLFSSLWTDQMLGLQEGVTILKDGCAMEGMCICSQDSHCAPKKGYFCRPGRIYKEA 367
              DP GLFS+ WTDQ+LGL+  VTI+K GCA EG+CICS DSHCAP KGY CRPG++YKEA
Sbjct:   185 LDPWGLFSTEWTDQILGLKGNVTIVKQGCAPEGLCICSDDSHCAPNKGYMCRPGKVYKEA 244

Query:   368 RVCT 379
             RVCT
Sbjct:   245 RVCT 248

>gi|8920604|gb|AAF81326.1|AC007767_6 (AC007767) Strong similarity to an unknown protein F19D11.4 gi|7485759 from Arabidopsis thaliana BAC F19D11 gb|AC005310.
 EST gb|AV535485 comes from this gene
            Length = 647

Frame 2 hits (HSPs): [graphic: HSP position along the database sequence, 0 to 647]

 Plus Strand HSPs:

 Score = 449 (158.1 bits), Expect = 2.2e-41, P = 2.2e-41
 Identities = 78/113 (69%), Positives = 96/113 (84%), Frame = +2

Query:    11 RSKD-PLAPRLFQDILEEIEQIGLFKYGGLPHWGKNRNLAFLGAIKKYHYADKFLKVKEE 187
             RSKD PLAPRL++D +EEIEQ+ +FKY  LPHWGKNRNLAF G I+KY  A+ FLKVKE
Sbjct:   451 RSKDDPLAPRLYEDFIEEIEQMAIFKYNALPHWGKNRNLAFDGVIRKYKNANTFLKVKER 510

Query:   188 YDPQGLFSSLWTDQMLGLQEGVTILKDGCAMEGMCICSQDSHCAPKKGYFCRP 346
             +DP GLFS+ WT+Q+LGL+  VTI+K+GCA+EG+C+CS D+HCAPKKGY CRP
Sbjct:   511 FDPLGLFSTEWTNQILGLKGNVTIVKEGCALEGLCVCSDDAHCAPKKGYLCRP 563

>gi|11357471|pir||T48513 hypothetical protein F15N18.130 - Arabidopsis thaliana
 >gi|7573411|emb|CAB87714.1| (AL163815) putative protein [Arabidopsis thaliana]
            Length = 585

Frame 2 hits (HSPs): [graphic: HSP position along the database sequence, 0 to 585]

 Plus Strand HSPs:

 Score = 343 (120.7 bits), Expect = 1.2e-29, P = 1.2e-29
 Identities = 63/122 (51%), Positives = 85/122 (69%), Frame = +2

Query:    11 RSKDPLAPRLFQDILEEIEQIGLFKYGGLPHWGKNRNLAFLGAIKKYH-YADKFLKVKEE 187
             R+ D L PRL QD++EE+EQ+   K+G  PHWGKNR + F G  +K     DKFL+VK +
Sbjct:   455 RADDELTPRLNQDVMEEMEQMAFVKHGAKPHWGKNRKVGFFGVKQKIGPNFDKFLEVKNK 514

Query:   188 YDPQGLFSSLWTDQMLGLQEGVTILK-DGCAMEGMCICSQDSHCAPKKGYFCRPGRIYKE 364
              DP+ +FSS W+D++L    G    K DGCA+EG C+CS++ HC P KGYFC+ G +Y +
Sbjct:   515 LDPKKMFSSEWSDEIL---LGTEASKYDGCALEGNCVCSEERHCNPSKGYFCKEGLVYTQ 571

Query:   365 ARVC 376
             ARVC
Sbjct:   572 ARVC 575

>gi|7431419|pir||D70989 probable oxidoreductase - Mycobacterium tuberculosis (strain H37RV)
 >gi|3242248|emb|CAB09342.1| (Z95890) hypothetical protein Rv1771 [Mycobacterium tuberculosis]
            Length = 428

Frame 2 hits (HSPs): [graphic: HSP position along the database sequence, 0 to 428]

 Plus Strand HSPs:

 Score = 97 (34.1 bits), Expect = 0.21, P = 0.19
 Identities = 20/66 (30%), Positives = 34/66 (51%), Frame = +2

Query:    41 FQDILEEIEQIGLFKYGGLPHWGKNRNLAFLGAIKKYHYADKFLKVKEEYDPQGLFSSLW 220
             F+     +E+I +  Y G PHWGK          ++Y   D+F  V++  DP  +F + +
Sbjct:   363 FESYFRAVEEI-MDDYAGRPHWGKRHYQTAATLRERYPQWDRFAAVRDRLDPDRVFLNDY 421

Query:   221 TDQMLG 238
             T ++LG
Sbjct:   422 TRRVLG 427

>gi|11560006 ref|NP_071556.1| L-gulono-gamma-lactone oxidase [Rattus norvegicus]
 >gi|121141|sp|P10867|GGLO_RAT L-GULONOLACTONE OXIDASE (L-GULONO-GAMMA-LACTONE OXIDASE)
 >gi|204150|gb|AAA41164.1| (J03536) L-gulono-gamma-lactone oxidase precursor [Rattus norvegicus]
            Length = 440

Frame 2 hits (HSPs): [graphic: HSP position along the database sequence, 0 to 440]

 Plus Strand HSPs:

 Score = 89 (31.3 bits), Expect = 1.9, P = 0.85
 Identities = 22/67 (32%), Positives = 34/67 (50%), Frame = +2

Query:    32 PRLFQDILEEIEQIGLFKYGGLPHWGKNRNLAFLGAIKKYHYADKFLKVKEEYDPQGLFS 211
             PRL  D     E I + K+GG PHW K  N       + Y    KF  ++E+ DP G+F
Sbjct:   375 PRL--DYWLAYETI-MKKFGGRPHWAKAHNCTQKDFEEMYPTFHKFCDIREKLDPTGMFL 431

Query:   212 SLWTDQM 232
             + + +++
Sbjct:   432 NSYLEKV 438

>gi|625202|pir||OXRTGU L-gulonolactone oxidase (EC 1.1.3.8) - rat
 >gi|286224|dbj|BAA02232.1| (D12754) L-gulono-gamma-lactone oxidase [Rattus norvegicus]
            Length = 440

Frame 2 hits (HSPs): [graphic: HSP and annotated-domain positions along the database sequence, 0 to 440]

 Annotated Domains:
   PROSITE OX2_COVAL_FAD: Oxygen oxidoreductases co   21..54

 Plus Strand HSPs:

 Score = 89 (31.3 bits), Expect = 1.9, P = 0.85
 Identities = 22/67 (32%), Positives = 34/67 (50%), Frame = +2

Query:    32 PRLFQDILEEIEQIGLFKYGGLPHWGKNRNLAFLGAIKKYHYADKFLKVKEEYDPQGLFS 211
             PRL  D     E I + K+GG PHW K  N       + Y    KF  ++E+ DP G+F
Sbjct:   375 PRL--DYWLAYETI-MKKFGGRPHWAKAHNCTRKDFEEMYPTFHKFCDIREKLDPTGMFL 431

Query:   212 SLWTDQM 232
             + + +++
Sbjct:   432 NSYLEKV 438

>gi|12188976|emb|CAC21485.1| (AL512562) putative d-arabinono-1,4-lactone oxidase [Schizosaccharomyces pombe]
            Length = 461

Frame 2 hits (HSPs): [graphic: HSP position along the database sequence, 0 to 461]

 Plus Strand HSPs:

 Score = 87 (30.6 bits), Expect = 3.4, P = 0.97
 Identities = 18/64 (28%), Positives = 34/64 (53%), Frame = +2

Query:    41 FQDILEEIEQIGLFKYGGLPHWGKNRNLAFLGAIKKYHYADKFLKVKEEYDPQGLFSSLW 220
             ++   + +E I   +Y G PHW K  +L     +++Y    K+L +++  DP+G+F   W
Sbjct:   397 YKPYFKALEDIAN-QYNGKPHWAKEYSLTKEQLLERYPNLSKWLSLRKLLDPKGVF---W 452

Query:   221 TDQM 232
              D +
Sbjct:   453 NDYL 456

>gi|7482785|pir||C69152 polyferredoxin - Methanobacterium thermoautotrophicum (strain Delta H)
 >gi|2621463|gb|AAB84907.1| (AE000824) polyferredoxin [Methanothermobacter thermautotrophicus]
            Length = 337

Frame 2 hits (HSPs): [graphic: HSP position along the database sequence, 0 to 337]

 Plus Strand HSPs:

 Score = 82 (28.9 bits), Expect = 8.2, P = 1.0
 Identities = 22/54 (40%), Positives = 29/54 (53%), Frame = +2

Query:   251 VTILKDGCAMEGMC--ICSQDSHCAPKKGYFCRPGRIYKEARVCTRDVKKTKEAD 409
             V IL DGC   G C  +C  D+    ++G+  R GRIY E RV T   + + E D
Sbjct:   227 VKIL-DGCVFCGRCRGVCPVDAIEITEEGFRARDGRIYLERRVLTGPRRGSVEVD 280

Parameters:
  filter=none
  matrix=BLOSUM62
  V=50
  B=50
  E=10
  gi
  H=1
  sort_by_pvalue
  echofilter

  ctxfactor=5.96

  Query                        -----  As Used  -----    -----  Computed  ----
  Frame  MatID Matrix name     Lambda    K       H      Lambda    K       H
   Std.    0   BLOSUM62                                 0.318   0.135   0.401
   +3      0   BLOSUM62        0.318   0.135   0.401    0.352   0.156   0.530
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +2      0   BLOSUM62        0.318   0.135   0.401    0.340   0.152   0.513
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +1      0   BLOSUM62        0.318   0.135   0.401    0.361   0.164   0.606
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -1      0   BLOSUM62        0.318   0.135   0.401    0.360   0.161   0.595
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -2      0   BLOSUM62        0.318   0.135   0.401    0.347   0.151   0.510
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -3      0   BLOSUM62        0.318   0.135   0.401    0.351   0.158   0.563
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a

  Query
  Frame  MatID  Length  Eff.Length     E    S  W   T   X   E2     S2
   +3      0      192       191       10.  76  3  12  22  0.12    34
                                                      31 0.098    38
   +2      0      193       192       10.  76  3  12  22  0.12    34
                                                      31 0.098    38
   +1      0      193       192       10.  76  3  12  22  0.12    34
                                                      31 0.098    38
   -1      0      193       192       10.  76  3  12  22  0.12    34
                                                      31 0.098    38
   -2      0      193       193       10.  76  3  12  22  0.12    34
                                                      31 0.099    38
   -3      0      192       191       10.  76  3  12  22  0.12    34
                                                      31 0.098    38

Statistics:

  Database:  /usr/local/dot5/sl_home/beauty/seqdb/blast/nr
    Title:  nr
    Release date:  unknown
    Posted date:  4:06 PM CST Feb 28, 2001
    Format:  BLAST
  # of letters in database:  197,782,623
  # of sequences in database:  625,274
  # of database sequences satisfying E:  13
  No. of states in DFA:  595 (59 KB)
  Total size of DFA:  219 KB (256 KB)
  Time to generate neighborhood:  0.01u 0.00s 0.01t  Elapsed: 00:00:00
  No. of threads or processors used:  6
  Search cpu time:  189.31u 0.87s 190.18t  Elapsed: 00:01:14
  Total cpu time:  189.35u 0.89s 190.24t  Elapsed: 00:01:14
  Start:  Fri Jan 18 09:28:35 2002   End:  Fri Jan 18 09:29:49 2002
Annotated Domains Database: March 14, 2000
Release Date: March 14, 2000
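
The report prints a raw score, a bit score, an Expect value and a P value for each HSP without saying how they relate. The sketch below is an inference from the printed numbers, not something the report states: it assumes the bit score is Lambda * S / ln 2 using the "As Used" Lambda on the Q=9,R=2 line (0.244), and that P follows the usual Poisson relation P = 1 - exp(-E) for a single HSP (N = 1). The function names are hypothetical. Under these assumptions the values printed in the report are reproduced to the shown precision.

```python
import math

# Assumption: the "As Used" Lambda printed on the Q=9,R=2 line of the report.
GAPPED_LAMBDA = 0.244


def bits_from_score(score: float, lam: float = GAPPED_LAMBDA) -> float:
    """Convert a raw alignment score to bits as lam * score / ln 2."""
    return lam * score / math.log(2)


def p_from_expect(expect: float) -> float:
    """Poisson probability of at least one chance hit: 1 - exp(-E).

    expm1 keeps precision when E is tiny (for example 2.5e-50).
    """
    return -math.expm1(-expect)


# (raw score, Expect) pairs copied from HSPs in the report.
for score, expect in [(533, 2.5e-50), (97, 0.21), (89, 1.9), (87, 3.4)]:
    print(f"S={score:4d}  bits={bits_from_score(score):6.1f}  "
          f"E={expect:<8g}  P={p_from_expect(expect):.2g}")
```

For very small E the two quantities coincide, which is why the strong Arabidopsis hits show identical Expect and P values, while the weak hits (E = 0.21, 1.9, 3.4) show P visibly below E.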