WU-BLAST 2.0 search of the National Center for Biotechnology Information's NR Protein Database.
BEAUTY post-processing provided by the Human Genome Sequencing Center, Baylor College of Medicine.
BEAUTY Reference:
Worley KC, Culpepper P, Wiese BA, Smith RF. BEAUTY-X: enhanced BLAST searches for DNA queries. Bioinformatics 1998;14(10):890-1.
Worley KC, Wiese BA, Smith RF. BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results. Genome Res 1995 Sep;5(2):173-84.
RepeatMasker repeats found in sequence:
No Repeats Found.

Reference: Gish, Warren (1994-1997). unpublished.
Gish, Warren and David J. States (1993). Identification of protein coding regions by database similarity search. Nat. Genet. 3:266-72.

Notice: statistical significance is estimated under the assumption that the equivalent of one entire reading frame in the query sequence codes for protein and that significant alignments will involve only coding reading frames.
Query= 'D17F12_K24_12.ab1' (612 letters)
Translating both strands of query sequence in all 6 reading frames

Database: nr
          625,274 sequences; 197,782,623 total letters.

     Observed Numbers of Database Sequences Satisfying
    Various EXPECTation Thresholds (E parameter values)

        Histogram units:      = 3 Sequences     : less than 3 sequences

 EXPECTation Threshold
 (E parameter)
    |
    V   Observed Counts-->
  10000 854 169 |========================================================
   6310 685  98 |================================
   3980 587 148 |=================================================
   2510 439 138 |==============================================
   1580 301 103 |==================================
   1000 198  60 |====================
    631 138  33 |===========
    398 105  24 |========
    251  81  13 |====
    158  68  21 |=======
    100  47  13 |====
   63.1  34   6 |==
   39.8  28   5 |=
   25.1  23   4 |=
   15.8  19   3 |=
 >>>>>>>>>>>>>>>>>>>>>  Expect = 10.0, Observed = 16  <<<<<<<<<<<<<<<<<
   10.0  16   0 |
   6.31  16   2 |:
   3.98  14   0 |
   2.51  14   2 |:

                                                                     Smallest
                                                                       Sum
                                                      Reading  High  Probability
Sequences producing High-scoring Segment Pairs:         Frame Score  P(N)      N

gi|6682235|gb|AAF23287.1|AC016661_12(AC016661) unknow...   +3   291  1.7e-23   1
gi|9758575|dbj|BAB09188.1|(AB008264) gb|AAF23287.1~ge...   +3   288  3.7e-23   1
gi|7304087|gb|AAF59125.1|(AE003838) CG8709 gene produ...   +3   168  3.1e-10   1
gi|7662022ref|NP_055461.1| lipin 2 [Homo sapiens] >gi...   +3   151  1.5e-08   1
gi|7505003|pir||T23134hypothetical protein H37A05.1 -...   +3   150  1.8e-08   1
gi|12584972ref|NP_075021.1| lipin 3 [Mus musculus] >g...   +3   149  2.4e-08   1
gi|12584970ref|NP_075020.1| lipin 2 [Mus musculus] >g...   +3   148  3.2e-08   1
gi|7490275|pir||T37941conserved hypothetical protein ...   +3   144  3.4e-07   1
gi|2495714|sp|Q14693|Y188_HUMANHYPOTHETICAL PROTEIN K...   +3   142  1.3e-06   1
gi|7656875ref|NP_056578.1| lipin 1; fatty liver dystr...   +3   140  2.8e-06   1
gi|9581794|emb|CAC00516.1|(AL132654) dJ450M14.2 (nove...   +3   132  1.4e-05   1
gi|6323817ref|NP_013888.1| involved in respiration an...   +3   135  1.4e-05   1
gi|7340799|emb|CAB10579.2|(Z97348) PFC0150w (MAL3P1.1...   +3    94  0.85      1
gi|7494190|pir||T18423hypothetical protein C0150w - m...   +3    94  0.85      1
gi|1170404|sp|P42145|HSP1_PSECUSPERM PROTAMINE P1 >gi...   -3    64  0.99      1
gi|7672349|gb|AAF66444.1|AF132733_1(AF132733) unknown...   +1    87  0.993     1
>gi|6682235|gb|AAF23287.1|AC016661_12 (AC016661) unknown protein [Arabidopsis thaliana]
        Length = 904

  Plus Strand HSPs:

 Score = 291 (102.4 bits), Expect = 1.7e-23, P = 1.7e-23
 Identities = 54/84 (64%), Positives = 69/84 (82%), Frame = +3

Query:   357 LNL*GRMQAVGRIISQGVYTFSGPFHPFGGAVDIVVVEQQDGTFKSSPWYVRFGKFQGVL 536
             ++L GR   VG +ISQGVY+ + PFHPFGGA+D++VV+QQDG+F+S+PWYVRFGKFQGVL
Sbjct:     1 MSLVGR---VGSLISQGVYSVATPFHPFGGAIDVIVVQQQDGSFRSTPWYVRFGKFQGVL 57

Query:   537 KAREKVVDICVNGVQAGFQMHLDH 608
             K  EK V I VNG +A F M+LD+
Sbjct:    58 KGAEKFVRISVNGTEADFHMYLDN 81

>gi|9758575|dbj|BAB09188.1| (AB008264) gb|AAF23287.1~gene_id:MBD2.6~strong similarity to unknown protein [Arabidopsis thaliana]
        Length = 930

  Plus Strand HSPs:

 Score = 288 (101.4 bits), Expect = 3.7e-23, P = 3.7e-23
 Identities = 57/81 (70%), Positives = 64/81 (79%), Frame = +3

Query:   375 MQAVGRI---ISQGVYTFSGPFHPFGGAVDIVVVEQQDGTFKSSPWYVRFGKFQGVLKAR 545
             M AVGRI   I +GV T SGPFHPFGGA+DI+VVEQ DGTFKSSPWYVRFGKFQGVLK
Sbjct:     1 MNAVGRIGSYIYRGVGTVSGPFHPFGGAIDIIVVEQPDGTFKSSPWYVRFGKFQGVLKNG 60

Query:   546 EKVVDICVNGVQAGFQMHLDH 608
               ++ I VNGV +GF M+L H
Sbjct:    61 RNLIRIDVNGVDSGFNMYLAH 81

>gi|7304087|gb|AAF59125.1| (AE003838) CG8709 gene product [Drosophila melanogaster]
        Length = 1102

  Plus Strand HSPs:

 Score = 168 (59.1 bits), Expect = 3.1e-10, P = 3.1e-10
 Identities = 33/53 (62%), Positives = 43/53 (81%), Frame = +3

Query:   444 GAVDIVVVEQQDGTFKSSPWYVRFGKFQGVLKAREKVVDICVNGVQAGFQMHL 602
             GA+D++VVEQ+DG F+ SP++VRFGK  GVL++REKVVDI +NGV    QM L
Sbjct:    25 GAIDVIVVEQRDGEFQCSPFHVRFGKL-GVLRSREKVVDIEINGVPVDIQMKL 76

>gi|7662022 ref|NP_055461.1| lipin 2 [Homo sapiens]
 >gi|11425603 ref|XP_008766.1| lipin 2 [Homo sapiens]
 >gi|2495724|sp|Q92539|Y249_HUMAN HYPOTHETICAL PROTEIN KIAA0249
 >gi|1665767|dbj|BAA13380.1| (D87436) Similar to Human KIAA0188 protein [Homo sapiens]
        Length = 896

  Plus Strand HSPs:

 Score = 151 (53.2 bits), Expect = 1.5e-08, P = 1.5e-08
 Identities = 32/80 (40%), Positives = 50/80 (62%), Frame = +3

Query:   375 MQAVGRIISQGVYTFSGPFH-----PFGGAVDIVVVEQQDGTFKSSPWYVRFGKFQGVLK 539
             M  VG++  Q + T    +         G +D++VV+QQDG+++ SP++VRFGK  GVL+
Sbjct:     1 MNYVGQLAGQVIVTVKELYKGINQATLSGCIDVIVVQQQDGSYQCSPFHVRFGKL-GVLR 59

Query:   540 AREKVVDICVNGVQAGFQMHL 602
             ++EKV+DI +NG      M L
Sbjct:    60 SKEKVIDIEINGSAVDLHMKL 80

>gi|7505003|pir||T23134 hypothetical protein H37A05.1 - Caenorhabditis elegans
 >gi|3878102|emb|CAA16154.1| (AL021346) predicted using Genefinder~cDNA EST yk9a1.3 comes from this gene~cDNA EST yk9a1.5 comes from this gene~cDNA EST yk54c7.5 comes from this gene~cDNA EST yk164g10.5 comes from this gene~cDNA EST yk208b12.5 comes from this gene~cDNA EST yk209f2.>
        Length = 823

  Plus Strand HSPs:

 Score = 150 (52.8 bits), Expect = 1.8e-08, P = 1.8e-08
 Identities = 32/70 (45%), Positives = 46/70 (65%), Frame = +3

Query:   396 ISQGVYTFSGPFHP--FGGAVDIVVVEQQDGTFKSSPWYVRFGKFQGVLKAREKVVDICV 569
             + + V  F    +P    GA+D+VVVEQ +G +KS+P++VRFGK+ GV    +K VDI V
Sbjct:     7 VFKNVKYFYNSINPATLSGAIDVVVVEQPNGEYKSTPFHVRFGKY-GVFSYSDKYVDIAV 65

Query:   570 NGVQAGFQMHL 602
             NGV+   +M L
Sbjct:    66 NGVEIDLKMKL 76

>gi|12584972 ref|NP_075021.1| lipin 3 [Mus musculus]
 >gi|12330450|gb|AAG52762.1|AF286724_1 (AF286724) LPIN3 [Mus musculus]
        Length = 848

  Plus Strand HSPs:

 Score = 149 (52.5 bits), Expect = 2.4e-08, P = 2.4e-08
 Identities = 29/53 (54%), Positives = 40/53 (75%), Frame = +3

Query:   444 GAVDIVVVEQQDGTFKSSPWYVRFGKFQGVLKAREKVVDICVNGVQAGFQMHL 602
             G +D++VV Q+DG+F+ SP++VRFGK  GVL++REKVVDI +NG      M L
Sbjct:    29 GGIDVLVVRQRDGSFRCSPFHVRFGKL-GVLRSREKVVDIEINGEPVDLHMKL 80

>gi|12584970 ref|NP_075020.1| lipin 2 [Mus musculus]
 >gi|12330448|gb|AAG52761.1|AF286723_1 (AF286723) LPIN2 [Mus musculus]
        Length = 893

  Plus Strand HSPs:

 Score = 148 (52.1 bits), Expect = 3.2e-08, P = 3.2e-08
 Identities = 28/53 (52%), Positives = 40/53 (75%), Frame = +3

Query:   444 GAVDIVVVEQQDGTFKSSPWYVRFGKFQGVLKAREKVVDICVNGVQAGFQMHL 602
             G +D+VVV QQDG+++ SP++VRFGK  GVL+++EKV+DI +NG      M L
Sbjct:    29 GCIDVVVVRQQDGSYQCSPFHVRFGKL-GVLRSKEKVIDIEINGSAVDLHMKL 80

>gi|7490275|pir||T37941 conserved hypothetical protein SPAC1952.13 - fission yeast (Schizosaccharomyces pombe)
 >gi|5731946|emb|CAB52577.1| (AL109820) conserved hypothetical protein [Schizosaccharomyces pombe]
        Length = 656

  Plus Strand HSPs:

 Score = 144 (50.7 bits), Expect = 3.4e-07, P = 3.4e-07
 Identities = 34/76 (44%), Positives = 47/76 (61%), Frame = +3

Query:   375 MQAVGRIISQGVYTFSGPFHP--FGGAVDIVVVEQQDGTFKSSPWYVRFGKFQGVLKARE 548
             MQ VGR       T++   +P    GA+D++VVEQ+D T   SP++VRFGKF  +L + +
Sbjct:     1 MQYVGRAFDSVTKTWNA-INPSTLSGAIDVIVVEQEDKTLACSPFHVRFGKFSLLLPS-D 58

Query:   549 KVVDICVNGVQAGFQMHL 602
             K V+  VNG   GF M L
Sbjct:    59 KKVEFSVNGQLTGFNMKL 76

>gi|2495714|sp|Q14693|Y188_HUMAN HYPOTHETICAL PROTEIN KIAA0188
 >gi|1136436|dbj|BAA11505.1| (D80010) KIAA0188 [Homo sapiens]
        Length = 899

  Plus Strand HSPs:

 Score = 142 (50.0 bits), Expect = 1.3e-06, P = 1.3e-06
 Identities = 32/85 (37%), Positives = 49/85 (57%), Frame = +3

Query:   348 LDTLNL*GRMQAVGRIISQGVYTFSGPFHPFGGAVDIVVVEQQDGTFKSSPWYVRFGKFQ 527
             + T+N  G++     +  + +Y    P     G +DI+V+ Q +G  + SP++VRFGK
Sbjct:     7 VQTMNYVGQLAGQVFVTVKELYKGLNPA-TLSGCIDIIVIRQPNGNLQCSPFHVRFGKM- 64

Query:   528 GVLKAREKVVDICVNGVQAGFQMHL 602
             GVL++REKVVDI +NG      M L
Sbjct:    65 GVLRSREKVVDIEINGESVDLHMKL 89

>gi|7656875 ref|NP_056578.1| lipin 1; fatty liver dystrophy [Mus musculus]
 >gi|7264655|gb|AAF44296.1|AF180471_1 (AF180471) Lpin1 [Mus musculus]
        Length = 891

  Plus Strand HSPs:

 Score = 140 (49.3 bits), Expect = 2.8e-06, P = 2.8e-06
 Identities = 34/80 (42%), Positives = 49/80 (61%), Frame = +3

Query:   375 MQAVGRIISQGVYT----FSGPFHP--FGGAVDIVVVEQQDGTFKSSPWYVRFGKFQGVL 536
             M  VG++  Q   T    + G  +P    G +DI+V+ Q +G+ + SP++VRFGK  GVL
Sbjct:     1 MNYVGQLAGQVFVTVKELYKG-LNPATLSGCIDIIVIRQPNGSLQCSPFHVRFGKM-GVL 58

Query:   537 KAREKVVDICVNGVQAGFQMHL 602
             ++REKVVDI +NG      M L
Sbjct:    59 RSREKVVDIEINGESVDLHMKL 80

>gi|9581794|emb|CAC00516.1| (AL132654) dJ450M14.2 (novel protein similar to KIAA0188, KIAA0249 and yeast SMP2) [Homo sapiens]
        Length = 438

  Plus Strand HSPs:

 Score = 132 (46.5 bits), Expect = 1.4e-05, P = 1.4e-05
 Identities = 28/53 (52%), Positives = 39/53 (73%), Frame = +3

Query:   444 GAVDIVVVEQQDGTFKSSPWYVRFGKFQGVLKAREKVVDICVNGVQAGFQMHL 602
             G +D++VV+Q DG+F+ SP++VRFGK  GVL++REKV DI +NG      M L
Sbjct:    29 GGIDVLVVKQVDGSFRCSPFHVRFGKL-GVLRSREKV-DIELNGEPVDLHMKL 79

>gi|6323817 ref|NP_013888.1| involved in respiration and plasmid maintenance; Smp2p [Saccharomyces cerevisiae]
 >gi|417782|sp|P32567|SMP2_YEAST SMP2 PROTEIN
 >gi|320853|pir||S30911 SMP2 protein - yeast (Saccharomyces cerevisiae)
 >gi|218488|dbj|BAA00880.1| (D01095) Smp2 protein [Saccharomyces cerevisiae]
 >gi|825570|emb|CAA89801.1| (Z49705) Smp2p [Saccharomyces cerevisiae]
 >gi|445061|prf||1908378A SMP2 gene [Saccharomyces cerevisiae]
        Length = 862

  Plus Strand HSPs:

 Score = 135 (47.5 bits), Expect = 1.4e-05, P = 1.4e-05
 Identities = 32/76 (42%), Positives = 45/76 (59%), Frame = +3

Query:   375 MQAVGRIISQGVYTFSGPFHP--FGGAVDIVVVEQQDGTFKSSPWYVRFGKFQGVLKARE 548
             MQ VGR +     T+S   +P    GA+D++VVE  DG    SP++VRFGKFQ +LK  +
Sbjct:     1 MQYVGRALGSVSKTWSS-INPATLSGAIDVIVVEHPDGRLSCSPFHVRFGKFQ-ILKPSQ 58

Query:   549 KVVDICVNGVQAGFQMHL 602
             K V + +N   +   M L
Sbjct:    59 KKVQVFINEKLSNMPMKL 76

>gi|7340799|emb|CAB10579.2| (Z97348) PFC0150w (MAL3P1.12), Human hypothetical protein KIAA0249-related protein len: 1156 aa; Similarity to 2 human and a yeast hypothetical gene. Human hypothetical protein KIAA0249 (SW:Y249_HUMAN). BLAST score: 2185, sum P(2) = 4.3e-74, revised pr>
        Length = 1156

  Plus Strand HSPs:

 Score = 94 (33.1 bits), Expect = 1.9, P = 0.85
 Identities = 25/81 (30%), Positives = 42/81 (51%), Frame = +3

Query:   387 GRIISQGVYTFSGPFHPFGGAVDIVVVEQQ----------DGTFKSSPWYVRFGKFQGVL 536
             G+I+S              G +DI+ +E +          +  +KS+P++VRFGK + +L
Sbjct:    12 GKIVSSVSNALDFNQATLSGCIDIICIESEIENKLKNDKIEVIYKSTPFHVRFGKTK-LL 70

Query:   537 KAREKVVDICVNGVQAGFQMHL 602
             +++EK+V I VNG      M L
Sbjct:    71 RSKEKIVSILVNGKSTNLHMKL 92

>gi|7494190|pir||T18423 hypothetical protein C0150w - malaria parasite (Plasmodium falciparum)
        Length = 1169

  Plus Strand HSPs:

 Score = 94 (33.1 bits), Expect = 1.9, P = 0.85
 Identities = 25/81 (30%), Positives = 42/81 (51%), Frame = +3

Query:   387 GRIISQGVYTFSGPFHPFGGAVDIVVVEQQ----------DGTFKSSPWYVRFGKFQGVL 536
             G+I+S              G +DI+ +E +          +  +KS+P++VRFGK + +L
Sbjct:    28 GKIVSSVSNALDFNQATLSGCIDIICIESEIENKLKNDKIEVIYKSTPFHVRFGKTK-LL 86

Query:   537 KAREKVVDICVNGVQAGFQMHL 602
             +++EK+V I VNG      M L
Sbjct:    87 RSKEKIVSILVNGKSTNLHMKL 108

>gi|1170404|sp|P42145|HSP1_PSECU SPERM PROTAMINE P1
 >gi|598345|gb|AAA74602.1| (L35334) protamine P1 [Pseudochirops cupreus]
        Length = 69

  Annotated Domains:
    BLOCKS BL00048: Protamine P1 proteins.          1..27
    PFAM protamine_P1: Protamine P1                 1..66
    PRODOM PD001830: HSP1(29) VE2(21) GAG(12)       6..65
    PROSITE PROTAMINE_P1: Protamine P1 signature.   2..13

  Minus Strand HSPs:

 Score = 64 (22.5 bits), Expect = 4.3, P = 0.99
 Identities = 17/40 (42%), Positives = 24/40 (60%), Frame = -3

Query:   199 RFRAQRKKEVIKFENRY*RRRSGSVRRRSDRGRK---CLSRR 83
             R+R +R++   ++  R  RRR    RRR  RGR+   CL RR
Sbjct:    14 RYRRRRRRRRSRYRGR--RRRYRRSRRRRRRGRRRGNCLGRR 53

>gi|7672349|gb|AAF66444.1|AF132733_1 (AF132733) unknown [Homo sapiens]
        Length = 555

  Plus Strand HSPs:

 Score = 87 (30.6 bits), Expect = 5.0, P = 0.99
 Identities = 25/79 (31%), Positives = 37/79 (46%), Frame = +1

Query:    19 KQAAGSNGAKRTXESEKRAPH-PAATGTSFPDHFFVLHCRIASFNIYSQI*SLPFFFVLG 195
             KQ A  NG   T   +K A H P  T    P + F++H  I+S    S+  SL   F +
Sbjct:   151 KQEAKENGTNLTFIGDKTAMHEPLQTWQDAP-YIFIVHIGISSSKESSKENSLSNLFTMT 209

Query:   196 IETKYPFSLTVIHNSAIHIF 255
             +E K P+    + +  + IF
Sbjct:   210 VEVKGPYEYLTLEDYPLMIF 229

Parameters:
  filter=none
  matrix=BLOSUM62
  V=50
  B=50
  E=10
  gi
  H=1
  sort_by_pvalue
  echofilter

  ctxfactor=5.99

  Query                        -----  As Used  -----    -----  Computed  ----
  Frame  MatID Matrix name     Lambda    K       H      Lambda    K       H
   Std.    0   BLOSUM62                                 0.318   0.135   0.401
   +3      0   BLOSUM62        0.318   0.135   0.401    0.341   0.153   0.498
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +2      0   BLOSUM62        0.318   0.135   0.401    0.344   0.151   0.532
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +1      0   BLOSUM62        0.318   0.135   0.401    0.355   0.159   0.570
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -1      0   BLOSUM62        0.318   0.135   0.401    0.340   0.152   0.505
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -2      0   BLOSUM62        0.318   0.135   0.401    0.350   0.154   0.554
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -3      0   BLOSUM62        0.318   0.135   0.401    0.339   0.149   0.504
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a

  Query
  Frame  MatID  Length  Eff.Length     E    S  W   T   X   E2     S2
   +3      0      203       202       10.  77  3  12  22  0.095   35
                                                      31  0.11    38
   +2      0      203       203       10.  77  3  12  22  0.095   35
                                                      31  0.11    38
   +1      0      204       203       10.  77  3  12  22  0.095   35
                                                      31  0.11    38
   -1      0      204       203       10.  77  3  12  22  0.095   35
                                                      31  0.11    38
   -2      0      203       202       10.  77  3  12  22  0.095   35
                                                      31  0.11    38
   -3      0      203       202       10.  77  3  12  22  0.095   35
                                                      31  0.11    38

Statistics:

  Database:  /usr/local/dot5/sl_home/beauty/seqdb/blast/nr
    Title:  nr
    Release date:  unknown
    Posted date:  4:06 PM CST Feb 28, 2001
    Format:  BLAST
  # of letters in database:  197,782,623
  # of sequences in database:  625,274
  # of database sequences satisfying E:  16
  No. of states in DFA:  592 (58 KB)
  Total size of DFA:  223 KB (256 KB)
  Time to generate neighborhood:  0.02u 0.00s 0.02t  Elapsed: 00:00:00
  No. of threads or processors used:  6
  Search cpu time:  211.54u 1.14s 212.68t  Elapsed: 00:00:42
  Total cpu time:  211.58u 1.16s 212.74t  Elapsed: 00:00:42
  Start:  Thu Jan 17 17:17:50 2002   End:  Thu Jan 17 17:18:32 2002
Annotated Domains Database: March 14, 2000
Release Date: March 14, 2000
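
Note on reading the Expect, P and bit-score values: the report lists both an Expect value and a P value for each HSP (for example Expect = 1.9, P = 0.85) and a bit score next to each raw score (for example Score = 291 (102.4 bits)). The Python sketch below shows one way to reproduce those numbers. It assumes the usual Poisson relation P = 1 - exp(-E), and it assumes that the bit scores are computed as Lambda * S / ln 2 with the Lambda = 0.244 listed on the Q=9,R=2 rows of the parameter table. That second assumption is inferred from the values in this report, not taken from WU-BLAST documentation. The function and constant names are illustrative only.

```python
import math

# Assumption: Lambda from the "Q=9,R=2" rows of the parameter table in this
# report. It reproduces the printed bit scores, but that is an inference.
GAPPED_LAMBDA = 0.244


def p_from_expect(expect: float) -> float:
    """Probability of at least one chance hit, given the expected number of
    chance hits (Poisson assumption): P = 1 - exp(-E)."""
    # -expm1(-E) equals 1 - exp(-E) but stays accurate for very small E.
    return -math.expm1(-expect)


def bits_from_score(score: float, lam: float = GAPPED_LAMBDA) -> float:
    """Convert a raw alignment score to bits as lam * S / ln 2 (assumed form)."""
    return lam * score / math.log(2)


if __name__ == "__main__":
    # Expect values taken from the three weakest HSPs in the report.
    for expect in (1.9, 4.3, 5.0):
        print(f"Expect = {expect:<4}  P = {p_from_expect(expect):.3f}")
    # Raw scores taken from the report.
    for score in (291, 288, 168, 64):
        print(f"Score = {score:<4}  bits = {bits_from_score(score):.1f}")
```

For very small Expect values (such as 1.7e-23) the two quantities are numerically indistinguishable, which is why the report prints the same number for Expect and P on the strongest hits.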