WU-BLAST 2.0 search of the National Center for Biotechnology Information's NR Protein Database.
BEAUTY post-processing provided by the Human Genome Sequencing Center, Baylor College of Medicine.
BEAUTY Reference:
Worley KC, Culpepper P, Wiese BA, Smith RF. BEAUTY-X: enhanced BLAST searches for DNA queries. Bioinformatics 1998;14(10):890-1.
Worley KC, Wiese BA, Smith RF. BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results. Genome Res 1995 Sep;5(2):173-84.
RepeatMasker repeats found in sequence: No Repeats Found.

Reference: Gish, Warren (1994-1997). unpublished.
Gish, Warren and David J. States (1993). Identification of protein coding regions by database similarity search. Nat. Genet. 3:266-72.

Notice: statistical significance is estimated under the assumption that the equivalent of one entire reading frame in the query sequence codes for protein and that significant alignments will involve only coding reading frames.
Query= SSH7E02.SEQ(1>227) (204 letters)
Translating both strands of query sequence in all 6 reading frames

Database: nr
          505,245 sequences; 158,518,215 total letters.

     Observed Numbers of Database Sequences Satisfying
    Various EXPECTation Thresholds (E parameter values)

        Histogram units:      = 9 Sequences     : less than 9 sequences

 EXPECTation Threshold
 (E parameter)
    |
    V   Observed Counts-->
  10000 2703 533 |===========================================================
   6310 2170 460 |===================================================
   3980 1710 323 |===================================
   2510 1387 302 |=================================
   1580 1085 235 |==========================
   1000  850 222 |========================
    631  628 178 |===================
    398  450 125 |=============
    251  325  98 |==========
    158  227  98 |==========
    100  129  40 |====
   63.1   89  32 |===
   39.8   57  19 |==
   25.1   38  16 |=
   15.8   22   8 |:
 >>>>>>>>>>>>>>>>>>>>>  Expect = 10.0, Observed = 14  <<<<<<<<<<<<<<<<<
   10.0   14   2 |:
   6.31   12   3 |:
   3.98    9   2 |:
   2.51    7   1 |:
   1.58    6   3 |:
   1.00    3   1 |:

                                                                   Smallest
                                                                     Sum
                                                   Reading  High  Probability
Sequences producing High-scoring Segment Pairs:      Frame Score  P(N)      N

gi|6017108|gb|AAF01591.1|AC009895_12 (AC009895) unknow...  -1   218  1.7e-16   1
gi|7487275|pir||T06629 hypothetical protein T20K18.60 ...  -1   119  1.1e-05   1
gi|37256|emb|CAA68681.1| (Y00672) tpr gene product (53...  -1    67  0.52      1
gi|3298091|dbj|BAA31346.1| (AB010093) interleukin 2 pr...  -1    66  0.68      1
gi|125881|sp|P18487|L2CS_DROME PROTEIN CS >gi|85024|pi...  -1    48  0.79      2
gi|116723|sp|P05399|COAT_CERV PROBABLE COAT PROTEIN >g...  -1    50  0.79      2
gi|384327|prf||1905414B albumin 2S [Bertholletia excelsa]  -1    63  0.89      1
gi|7297782|gb|AAF53032.1| (AE003630) CG14069 gene prod...  -2    60  0.93      1
gi|6330102|dbj|BAA86467.1| (AB032979) KIAA1153 protein...  -1    56  0.97      2
gi|5834567|emb|CAB55274.1| (AL035461) dJ967N21.3 (nove...  -1    56  0.992     2
gi|37405|emb|CAA44719.1| (X62947) 55 kd protein [Homo ...  -1    67  0.997     1
gi|625086|gb|AAA92686.1| (U19348) tpr-met fusion prote...  -1    67  0.998     1
gi|6469527|gb|AAF13321.1| (AF112543) envelope glycopro...  -2    51  0.9995    2
gi|37258|emb|CAA44819.1| (X63105) Tpr [Homo sapiens]       -1    67  0.9999    1

Locally-aligned regions (HSPs) with respect to query sequence:
[Graphical HSP maps not recoverable. Each map plots hits along the translated query sequence, axis 0 to 68.]
  Frame -1 Hits: gi|6017108, gi|7487275, gi|37256, gi|3298091, gi|125881, gi|116723, gi|384327, gi|6330102, gi|5834567, gi|37405, gi|625086, gi|6469527, gi|37258
  Frame -2 Hits: gi|125881, gi|7297782, gi|6469527
  Frame -3 Hits: gi|6330102, gi|5834567
>gi|6017108|gb|AAF01591.1|AC009895_12 (AC009895) unknown protein [Arabidopsis thaliana]
        Length = 417

  Frame -1 hits (HSPs): [graphical map not recoverable; database sequence axis 0 to 417]

  Minus Strand HSPs:

 Score = 218 (76.7 bits), Expect = 1.7e-16, P = 1.7e-16
 Identities = 45/63 (71%), Positives = 52/63 (82%), Frame = -1

Query:   201 LLQLLRESPYSRPKAEPDILENIVCDIISQIDGDDQSGRAKKMLAEMVQVSMEQSLRHLQ 22
             LL LLR S   R + +PD +ENIV  +IS IDGDDQSG+AKKMLAEMVQVSME+SLRHLQ
Sbjct:   349 LLCLLRNSESPRSEVQPDTIENIVSSLISHIDGDDQSGKAKKMLAEMVQVSMEKSLRHLQ 408

Query:    21 QRA 13
             +RA
Sbjct:   409 ERA 411

>gi|7487275|pir||T06629 hypothetical protein T20K18.60 - Arabidopsis thaliana
 >gi|4586247|emb|CAB40988.1| (AL049640) putative protein [Arabidopsis thaliana]
 >gi|7267972|emb|CAB78313.1| (AL161534) putative protein [Arabidopsis thaliana]
        Length = 402

  Frame -1 hits (HSPs): [graphical map not recoverable; database sequence axis 0 to 402]

  Minus Strand HSPs:

 Score = 119 (41.9 bits), Expect = 1.1e-05, P = 1.1e-05
 Identities = 29/66 (43%), Positives = 42/66 (63%), Frame = -1

Query:   201 LLQLLRESPYSRPKAEPDILENIVCDIISQIDGDDQSGR-AKKMLAEMVQVSMEQSLRHL 25
             LL LLRE+P  + +  P  LE IV  I  Q+DG +++   AKK+L +MV  SME S++ +
Sbjct:   325 LLDLLRETPREK-EMTPLTLEKIVYGIAVQVDGAEKAAETAKKLLQDMVHRSMELSMKSI 383

Query:    24 QQRALVC 4
             Q +A  C
Sbjct:   384 QHKAASC 390

>gi|37256|emb|CAA68681.1| (Y00672) tpr gene product (539 is 1st base in codon) [Homo sapiens]
        Length = 142

  Frame -1 hits (HSPs): [graphical map not recoverable; database sequence axis 0 to 142]

  Minus Strand HSPs:

 Score = 67 (23.6 bits), Expect = 0.73, P = 0.52
 Identities = 22/63 (34%), Positives = 34/63 (53%), Frame = -1

Query:   201 LLQLLRESPYSR-PKAEPDILENIVCDIISQIDGDDQSGRAKKMLAEMVQ--VSMEQSLR 31
             L Q+L  +  ++ PK+  + LE  + D  S+IDG    GR +K   E  Q    +E+ L
Sbjct:     5 LQQVLERTELNKLPKSVQNKLEKFLADRQSEIDG--LKGRHEKFKVESEQQYFEIEKRLS 62

Query:    30 HLQQR 16
             H Q+R
Sbjct:    63 HSQER 67
>gi|3298091|dbj|BAA31346.1| (AB010093) interleukin 2 precursor [Cavia porcellus]
        Length = 152

  Frame -1 hits (HSPs): [graphical map not recoverable; database sequence axis 0 to 152]

  Minus Strand HSPs:

 Score = 66 (23.2 bits), Expect = 1.1, P = 0.68
 Identities = 21/63 (33%), Positives = 32/63 (50%), Frame = -1

Query:   204 TLLQLLRESPYSR-PKAEPDILENIVCDIISQIDGDDQSGRAKKMLA-EMVQVSMEQSLR 31
             TL  L   +P S  PK   D LE ++ D+ + ++G   + R  KML  ++    M   L+
Sbjct:    13 TLALLTSSAPTSSSPKQTQDRLELLLRDLQTLLEGVTSNPRLPKMLKLKLYPPKMVSELQ 72

Query:    30 HLQ 22
             HLQ
Sbjct:    73 HLQ 75

>gi|125881|sp|P18487|L2CS_DROME PROTEIN CS
 >gi|85024|pir||S01105 hypothetical protein 4 - fruit fly (Drosophila melanogaster)
 >gi|7763|emb|CAA29408.1| (X05991) Cs [Drosophila melanogaster]
        Length = 245

  Frame -1 hits (HSPs), Frame -2 hits (HSPs), Annotated Domains: [graphical map not recoverable; database sequence axis 0 to 245]

  Annotated Domains:
    PRODOM PD188322: L2CS(1) O96568(1) O96570(1)       1..217
    PRODOM PD119338: L2CS_DROME                      219..244

  Minus Strand HSPs:

 Score = 48 (16.9 bits), Expect = 1.5, Sum P(2) = 0.79
 Identities = 11/29 (37%), Positives = 18/29 (62%), Frame = -1

Query:   168 RPKAEPDILENIVCDIISQIDGDD-QSGR 85
             RP+  P  L+N+V D+I  +D    Q+G+
Sbjct:   129 RPRYVPTGLDNVVDDLIQNMDKAQLQTGK 157

 Score = 37 (13.0 bits), Expect = 1.5, Sum P(2) = 0.79
 Identities = 7/17 (41%), Positives = 8/17 (47%), Frame = -2

Query:    77 RCWLRWSKSAWSRV*DI 27
             RCWL  +   W    DI
Sbjct:   210 RCWLSRNLGLWQSPQDI 226

>gi|116723|sp|P05399|COAT_CERV PROBABLE COAT PROTEIN
 >gi|75486|pir||VCCVCE coat protein - carnation etched ring virus
 >gi|58862|emb|CAA28359.1| (X04658) pot. ORF 4 (AA 1-494) [Carnation etched ring virus]
 >gi|225355|prf||1301227D ORF 4 [Carnation etched ring virus]
        Length = 494

  Frame -1 hits (HSPs), Annotated Domains: [graphical map not recoverable; database sequence axis 0 to 494]

  Annotated Domains:
    DOMO DM03741: CAULIFLOWERMOSAICVIRUSCOATPROTE     57..478
    Entrez Zinc finger region: POTENTIAL.            420..433
    PRINTS CAULIMOCOAT1: Caulimovirus coat protein   142..155
    PRINTS CAULIMOCOAT2: Caulimovirus coat protein   163..179
    PRINTS CAULIMOCOAT3: Caulimovirus coat protein   198..209
    PRINTS CAULIMOCOAT4: Caulimovirus coat protein   222..231
    PRINTS CAULIMOCOAT5: Caulimovirus coat protein   257..273
    PRINTS CAULIMOCOAT6: Caulimovirus coat protein   355..368
    PRINTS CAULIMOCOAT7: Caulimovirus coat protein   411..423
    PRINTS C2HCZNFINGER1: C2HC-type zinc finger mot  418..427
    PRINTS CAULIMOCOAT8: Caulimovirus coat protein   423..436
    PRINTS C2HCZNFINGER2: C2HC-type zinc finger mot  427..435
    PRODOM PD133098: COAT_CERV                         1..140
    PRODOM PD005896: COAT(8)                         142..361
    PRODOM PD003789: COAT(8) POL(2)                  392..465
    PRODOM PD057383: COAT_CERV                       467..493

  Minus Strand HSPs:

 Score = 50 (17.6 bits), Expect = 1.6, Sum P(2) = 0.79
 Identities = 14/43 (32%), Positives = 22/43 (51%), Frame = -1

Query:   150 DILENIVCDIISQIDGDDQSGRAKKMLAEMVQVSMEQSLRHLQ 22
             DIL N+V  + +   G+D +G  +K L E  +      L +LQ
Sbjct:   218 DILLNVVSGLYTMFLGEDYTGNQEKTL-EQERAKASLRLINLQ 259

 Score = 42 (14.8 bits), Expect = 1.6, Sum P(2) = 0.79
 Identities = 8/13 (61%), Positives = 9/13 (69%), Frame = -1

Query:   183 ESPYSRPKAEPDI 145
             ES   RPK EPD+
Sbjct:   115 ESSNKRPKREPDL 127

>gi|384327|prf||1905414B albumin 2S [Bertholletia excelsa]
        Length = 145

  Frame -1 hits (HSPs): [graphical map not recoverable; database sequence axis 0 to 145]

  Minus Strand HSPs:

 Score = 63 (22.2 bits), Expect = 2.2, P = 0.89
 Identities = 22/63 (34%), Positives = 38/63 (60%), Frame = -1

Query:   195 QLLRESPY-SRPKA--EPDILENIVCDIISQIDGDDQSGRAK--KMLAEMVQVSMEQSLR 31
             Q+++ESPY + P+   EP + E   C+   Q++G D+S R +  +M+  M+Q   E   R
Sbjct:    58 QMMKESPYQTMPRRGMEPHMSE--CCE---QLEGMDESCRCEGLRMMMRMMQ-QQEMQPR 111

Query:    30 HLQQRALV 7
               Q R ++
Sbjct:   112 GEQMRMMM 119

>gi|7297782|gb|AAF53032.1| (AE003630) CG14069 gene product [Drosophila melanogaster]
        Length = 101

  Frame -2 hits (HSPs): [graphical map not recoverable; database sequence axis 0 to 101]

  Minus Strand HSPs:

 Score = 60 (21.1 bits), Expect = 2.6, P = 0.93
 Identities = 14/41 (34%), Positives = 20/41 (48%), Frame = -2

Query:   173 IQDLKLNLTFLRT*CVTSYHRLMGMIN--LVE-QRRCWLRW 60
             +Q  KL +    T C T     +G+    L+E Q+RCW  W
Sbjct:    50 LQQQKLRIPLQNTGCATILSNQLGVTGEMLLEIQKRCWAGW 90

>gi|6330102|dbj|BAA86467.1| (AB032979) KIAA1153 protein [Homo sapiens]
        Length = 424

  Frame -1 hits (HSPs), Frame -3 hits (HSPs): [graphical map not recoverable; database sequence axis 0 to 424]

  Minus Strand HSPs:

 Score = 56 (19.7 bits), Expect = 3.4, Sum P(2) = 0.97
 Identities = 17/61 (27%), Positives = 28/61 (45%), Frame = -1

Query:   189 LRESPYSRPKAEPDILENIVCDIISQIDGDDQSGRAKKMLAEMVQVSMEQSLRHLQQRAL 10
             + E+P S    + + +E I  D   +  G  + G  K  + E  +   EQ  RHL+  AL
Sbjct:   330 MAEAPESNHPEDQETMETISQD--PEHKGPKERGSKKDYIQEKQRRQEEQRKRHLEAAAL 387

Query:     9 V 7
             +
Sbjct:   388 L 388

 Score = 31 (10.9 bits), Expect = 3.4, Sum P(2) = 0.97
 Identities = 6/11 (54%), Positives = 7/11 (63%), Frame = -3

Query:   199 FAVTKRVSIFK 167
             F V KR  +FK
Sbjct:    49 FVVLKREDVFK 59

>gi|5834567|emb|CAB55274.1| (AL035461) dJ967N21.3 (novel protein similar to predicted worm, yeast and plant proteins) [Homo sapiens]
        Length = 497

  Frame -1 hits (HSPs), Frame -3 hits (HSPs): [graphical map not recoverable; database sequence axis 0 to 497]

  Minus Strand HSPs:

 Score = 56 (19.7 bits), Expect = 4.9, Sum P(2) = 0.99
 Identities = 17/61 (27%), Positives = 28/61 (45%), Frame = -1

Query:   189 LRESPYSRPKAEPDILENIVCDIISQIDGDDQSGRAKKMLAEMVQVSMEQSLRHLQQRAL 10
             + E+P S    + + +E I  D   +  G  + G  K  + E  +   EQ  RHL+  AL
Sbjct:   306 MAEAPESNHPEDQETMETISQD--PEHKGPKERGSKKDYIQEKQRRQEEQRKRHLEAAAL 363

Query:     9 V 7
             +
Sbjct:   364 L 364

 Score = 31 (10.9 bits), Expect = 4.9, Sum P(2) = 0.99
 Identities = 6/11 (54%), Positives = 7/11 (63%), Frame = -3

Query:   199 FAVTKRVSIFK 167
             F V KR  +FK
Sbjct:    25 FVVLKREDVFK 35

>gi|37405|emb|CAA44719.1| (X62947) 55 kd protein [Homo sapiens]
        Length = 503

  Frame -1 hits (HSPs): [graphical map not recoverable; database sequence axis 0 to 503]

  Minus Strand HSPs:

 Score = 67 (23.6 bits), Expect = 5.9, P = 1.0
 Identities = 22/63 (34%), Positives = 34/63 (53%), Frame = -1

Query:   201 LLQLLRESPYSR-PKAEPDILENIVCDIISQIDGDDQSGRAKKMLAEMVQ--VSMEQSLR 31
             L Q+L  +  ++ PK+  + LE  + D  S+IDG    GR +K   E  Q    +E+ L
Sbjct:     5 LQQVLERTELNKLPKSVQNKLEKFLADRQSEIDG--LKGRHEKFKVESEQQYFEIEKRLS 62

Query:    30 HLQQR 16
             H Q+R
Sbjct:    63 HSQER 67

>gi|625086|gb|AAA92686.1| (U19348) tpr-met fusion protein [Homo sapiens]
        Length = 523

  Frame -1 hits (HSPs): [graphical map not recoverable; database sequence axis 0 to 523]

  Minus Strand HSPs:

 Score = 67 (23.6 bits), Expect = 6.2, P = 1.0
 Identities = 22/63 (34%), Positives = 34/63 (53%), Frame = -1

Query:   201 LLQLLRESPYSR-PKAEPDILENIVCDIISQIDGDDQSGRAKKMLAEMVQ--VSMEQSLR 31
             L Q+L  +  ++ PK+  + LE  + D  S+IDG    GR +K   E  Q    +E+ L
Sbjct:     5 LQQVLERTELNKLPKSVQNKLEKFLADQQSEIDG--LKGRHEKFKVESEQQYFEIEKRLS 62

Query:    30 HLQQR 16
             H Q+R
Sbjct:    63 HSQER 67

>gi|6469527|gb|AAF13321.1| (AF112543) envelope glycoprotein [Human immunodeficiency virus type 1]
        Length = 837

  Frame -1 hits (HSPs), Frame -2 hits (HSPs): [graphical map not recoverable; database sequence axis 0 to 837]

  Minus Strand HSPs:

 Score = 51 (18.0 bits), Expect = 7.6, Sum P(2) = 1.0
 Identities = 9/19 (47%), Positives = 15/19 (78%), Frame = -2

Query:   143 LRT*CVTSYHRLMGMINLV 87
             LR+ C+ SYHRL  ++++V
Sbjct:   741 LRSLCLFSYHRLRDLLSIV 759

 Score = 39 (13.7 bits), Expect = 7.6, Sum P(2) = 1.0
 Identities = 9/21 (42%), Positives = 12/21 (57%), Frame = -1

Query:   201 LLQLLRESPYSRPKAEPDILE 139
             +  LL ES   + K E D+LE
Sbjct:   623 IYSLLEESQNQQEKNEQDLLE 643

>gi|37258|emb|CAA44819.1| (X63105) Tpr [Homo sapiens]
        Length = 726

  Frame -1 hits (HSPs): [graphical map not recoverable; database sequence axis 0 to 726]

  Minus Strand HSPs:

 Score = 67 (23.6 bits), Expect = 9.1, P = 1.0
 Identities = 22/63 (34%), Positives = 34/63 (53%), Frame = -1

Query:   201 LLQLLRESPYSR-PKAEPDILENIVCDIISQIDGDDQSGRAKKMLAEMVQ--VSMEQSLR 31
             L Q+L  +  ++ PK+  + LE  + D  S+IDG    GR +K   E  Q    +E+ L
Sbjct:     5 LQQVLERTELNKLPKSVQNKLEKFLADQQSEIDG--LKGRHEKFKVESEQQYFEIEKRLS 62

Query:    30 HLQQR 16
             H Q+R
Sbjct:    63 HSQER 67

Parameters:
  filter=none
  matrix=BLOSUM62
  V=50
  B=50
  E=10
  gi
  H=1
  sort_by_pvalue
  echofilter

  ctxfactor=5.96

  Query                        -----  As Used  -----    -----  Computed  ----
  Frame  MatID Matrix name     Lambda    K       H      Lambda    K       H
   Std.    0   BLOSUM62                                 0.318   0.135   0.401
   +3      0   BLOSUM62        0.318   0.135   0.401    0.344   0.147   0.494
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +2      0   BLOSUM62        0.318   0.135   0.401    0.335   0.137   0.467
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +1      0   BLOSUM62        0.318   0.135   0.401    0.368   0.166   0.594
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -1      0   BLOSUM62        0.318   0.135   0.401    0.319   0.133   0.365
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -2      0   BLOSUM62        0.318   0.135   0.401    0.352   0.153   0.576
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -3      0   BLOSUM62        0.318   0.135   0.401    0.384   0.170   0.740
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a

  Query
  Frame  MatID  Length  Eff.Length     E     S  W   T   X   E2      S2
   +3      0       67        66       10.   57  3  12  22  0.11     30
                                                       27  0.096    31
   +2      0       67        66       10.   57  3  12  22  0.11     30
                                                       27  0.096    31
   +1      0       68        67       10.   57  3  12  22  0.12     30
                                                       27  0.10     31
   -1      0       68        67       10.   57  3  12  22  0.12     30
                                                       27  0.10     31
   -2      0       67        67       10.   57  3  12  22  0.12     30
                                                       27  0.10     31
   -3      0       67        66       10.   57  3  12  22  0.11     30
                                                       27  0.096    31

Statistics:

  Database:  /usr/local/dot5/sl_home/beauty/seqdb/blast/nr
    Title:  nr
    Release date:  unknown
    Posted date:  8:50 PM CDT May 27, 2000
    Format:  BLAST
  # of letters in database:  158,518,215
  # of sequences in database:  505,245
  # of database sequences satisfying E:  14
  No. of states in DFA:  585 (58 KB)
  Total size of DFA:  118 KB (128 KB)
  Time to generate neighborhood:  0.01u 0.00s 0.01t  Elapsed: 00:00:00
  No. of threads or processors used:  4
  Search cpu time:  85.79u 1.19s 86.98t  Elapsed: 00:00:23
  Total cpu time:  85.83u 1.22s 87.05t  Elapsed: 00:00:23
  Start:  Thu Feb 15 00:32:03 2001   End:  Thu Feb 15 00:32:26 2001
Annotated Domains Database: March 14, 2000
Release Date: March 14, 2000
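
The Expect, P, and bit-score values reported for each HSP above are related by simple formulas, and the sketch below shows how the printed numbers can be reproduced. It rests on two assumptions that are not stated in the report itself: that P for a single HSP is the Poisson probability of at least one chance hit, P = 1 - exp(-E), and that the bit score is Lambda * S / ln 2 using the "As Used" Lambda of 0.244 from the Q=9,R=2 row of the parameters table. Both were checked only against figures in this report (for example Score = 218 gives 76.7 bits, and Expect = 0.73 gives P = 0.52). The function names are hypothetical, and entries reported as Sum P(2) combine several HSPs and are not covered.

```python
import math

# Assumption: the "As Used" Lambda for the Q=9,R=2 row of the parameters table.
GAPPED_LAMBDA = 0.244


def p_from_expect(expect: float) -> float:
    """Poisson probability of at least one chance hit, given the expected count E."""
    # expm1 keeps precision for very small E (e.g. 1.7e-16), where 1 - exp(-E) would round to 0.
    return -math.expm1(-expect)


def bits_from_raw(score: int, lam: float = GAPPED_LAMBDA) -> float:
    """Convert a raw alignment score to bits as Lambda * S / ln 2 (inferred from this report)."""
    return lam * score / math.log(2)


if __name__ == "__main__":
    # (raw score, Expect) pairs copied from single-HSP entries in the report.
    for raw, expect in [(218, 1.7e-16), (119, 1.1e-05), (67, 0.73), (63, 2.2)]:
        print(f"S={raw:4d}  bits={bits_from_raw(raw):5.1f}  "
              f"E={expect:<8g}  P={p_from_expect(expect):.2g}")
```

Because the Expect values in the report are themselves rounded, a recomputed P can differ from the printed one in the last digit.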