WU-BLAST 2.0 search of the National Center for Biotechnology Information's NR Protein Database.
BEAUTY post-processing provided by the Human Genome Sequencing Center, Baylor College of Medicine.
BEAUTY Reference:
Worley KC, Culpepper P, Wiese BA, Smith RF. BEAUTY-X: enhanced BLAST searches for DNA queries. Bioinformatics 1998;14(10):890-1.
Worley KC, Wiese BA, Smith RF. BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results. Genome Res 1995 Sep;5(2):173-84.
RepeatMasker repeats found in sequence: No Repeats Found.
Reference: Gish, Warren (1994-1997). unpublished.
Gish, Warren and David J. States (1993). Identification of protein coding regions by database similarity search. Nat. Genet. 3:266-72.
Notice: statistical significance is estimated under the assumption that the equivalent of one entire reading frame in the query sequence codes for protein and that significant alignments will involve only coding reading frames.
Query= SSH6H10.SEQ(1>175) (155 letters)
Translating both strands of query sequence in all 6 reading frames

Database:  nr
           505,245 sequences; 158,518,215 total letters.

     Observed Numbers of Database Sequences Satisfying
    Various EXPECTation Thresholds (E parameter values)

        Histogram units:      = 11 Sequences     : less than 11 sequences

 EXPECTation Threshold
 (E parameter)
    |
    V   Observed Counts-->
  10000 2733   6 |:
   6310 2727  83 |=======
   3980 2644 314 |============================
   2510 2330 547 |=================================================
   1580 1783 635 |=========================================================
   1000 1148 468 |==========================================
    631  680 270 |========================
    398  410 132 |============
    251  278 107 |=========
    158  171  61 |=====
    100  110  33 |===
   63.1   77  32 |==
   39.8   45  13 |=
   25.1   32  16 |=
   15.8   16   4 |:
          >>>>>>>>>>>>>>>>>>>>>  Expect = 10.0, Observed = 12  <<<<<<<<<<<<<<<<<
   10.0   12   1 |:
   6.31   11   2 |:
   3.98    9   2 |:
   2.51    7   3 |:
   1.58    4   2 |:
   1.00    2   0 |
   0.63    2   2 |:

                                                                      Smallest
                                                                        Sum
                                                       Reading  High  Probability
Sequences producing High-scoring Segment Pairs:          Frame Score  P(N)      N

gi|7488713|pir||T08896 Sali3-2 protein, aluminium-indu...   +2    74  0.36      1
gi|7505979|pir||T25841 hypothetical protein M03F4.6 - ...   +1    53  0.37      2
gi|7020918|dbj|BAA91319.1| (AK000675) unnamed protein ...   +1    51  0.70      2
gi|2789644|gb|AAC27444.1| (AF034626) ribosomal protein...   +3    39  0.77      2
gi|7292590|gb|AAF47990.1| (AE003484) CG1537 gene produ...   +3    45  0.84      2
gi|2769725|gb|AAC40337.1| (U92288) H88 [Human herpesvi...   +1    61  0.87      1
gi|7519366|pir||E71213 hypothetical protein PH1973 - P...   +3    61  0.87      1
gi|485515|pir||S33622 ADR6 protein - soybean >gi|29644...   +2    67  0.92      1
gi|288324|emb|CAA39349.1| (X55823) T-cell receptor alp...   -3    40  0.97      2
gi|6580252|emb|CAB63324.1| (Z83129) predicted using Ge...   +1    58  0.99      1
gi|4958931|dbj|BAA78101.1| (AB027465) Rep protein 3 [s...   +3    38  0.990     2
gi|4033371|sp|O51875|ATPD_BUCAP ATP SYNTHASE DELTA CHA...   +2    42  0.9992    2

Locally-aligned regions (HSPs) with respect to query sequence (query axis 0 to 52 residues; loci with hits listed per frame):

Frame  3 Hits:  gi|7020918, gi|2789644, gi|7292590, gi|7519366, gi|4958931, gi|4033371
Frame  2 Hits:  gi|7488713, gi|485515, gi|4033371
Frame  1 Hits:  gi|7505979, gi|7020918, gi|2789644, gi|2769725, gi|6580252
Frame -2 Hits:  gi|288324
Frame -3 Hits:  gi|288324
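The query coordinates printed in the alignments are nucleotide positions, while the aligned residues come from one of the six translated frames. The sketch below shows one way to map a residue range in a frame back to nucleotide coordinates. The function name and its arguments are hypothetical, and the minus-strand convention (frame -1 starting at the last nucleotide) is an assumption inferred from the coordinates in this report, not taken from WU-BLAST documentation. The query length of 155 comes from the "155 letters" in the Query line.

```python
def hsp_nucleotide_span(frame, first_residue, n_residues, query_len):
    """Return 1-based (start, end) nucleotide coordinates of an HSP.

    frame         : +1..+3 or -1..-3
    first_residue : 0-based index of the HSP's first residue in that frame
    n_residues    : number of query residues in the HSP (gaps not counted)
    query_len     : length of the nucleotide query

    Assumption: frame -1 begins at the last nucleotide, -2 one earlier, etc.
    """
    if frame not in (1, 2, 3, -1, -2, -3):
        raise ValueError("frame must be one of +/-1, +/-2, +/-3")
    offset = 3 * first_residue
    if frame > 0:
        start = frame + offset
        end = start + 3 * n_residues - 1
    else:
        start = query_len + frame + 1 - offset
        end = start - 3 * n_residues + 1
    return start, end


if __name__ == "__main__":
    # Frame +2 HSP against gi|7488713: 15 residues starting at residue index 1
    print(hsp_nucleotide_span(2, 1, 15, 155))
    # Frame -3 HSP against gi|288324: 30 query residues starting at index 1
    print(hsp_nucleotide_span(-3, 1, 30, 155))
```

Under these assumptions the two calls give the spans 5 to 49 and 150 to 61, which are the Query coordinates shown for those HSPs in the alignments.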
>gi|7488713|pir||T08896 Sali3-2 protein, aluminium-induced - soybean
 >gi|2317900|gb|AAB66369.1| (U89693) Sali3-2 [Glycine max]
        Length = 276

  Plus Strand HSPs:

 Score = 74 (26.0 bits), Expect = 0.45, P = 0.36
 Identities = 14/15 (93%), Positives = 15/15 (100%), Frame = +2

Query:     5 PNLSMDTAYQTNVVV 49
             PN+SMDTAYQTNVVV
Sbjct:   262 PNISMDTAYQTNVVV 276

>gi|7505979|pir||T25841 hypothetical protein M03F4.6 - Caenorhabditis elegans
 >gi|1439652|gb|AAB04577.1| (U64601) M03F4.6 gene product [Caenorhabditis elegans]
        Length = 179

  Plus Strand HSPs:

 Score = 53 (18.7 bits), Expect = 0.46, Sum P(2) = 0.37
 Identities = 10/26 (38%), Positives = 13/26 (50%), Frame = +1

Query:    10 FIYGHCLSD*RCCLISPCICILGTIR 87
             FIY  C+     C+   C+C  GT R
Sbjct:     7 FIY-RCVDPFNACIAGTCLCAPGTTR 31

 Score = 33 (11.6 bits), Expect = 0.46, Sum P(2) = 0.37
 Identities = 5/10 (50%), Positives = 6/10 (60%), Frame = +1

Query:   118 VCCARPGTSP 147
             +CC RP   P
Sbjct:   134 LCCPRPCRDP 143

>gi|7020918|dbj|BAA91319.1| (AK000675) unnamed protein product [Homo sapiens]
        Length = 272

  Plus Strand HSPs:

 Score = 51 (18.0 bits), Expect = 1.2, Sum P(2) = 0.70
 Identities = 11/22 (50%), Positives = 13/22 (59%), Frame = +1

Query:    22 HCLSD*RCCLISPCICILGTIR 87
             HC  D    LI+PC C  GT+R
Sbjct:    67 HCEGDEESPLITPCRCT-GTLR 87

 Score = 36 (12.7 bits), Expect = 1.2, Sum P(2) = 0.70
 Identities = 6/23 (26%), Positives = 11/23 (47%), Frame = +3

Query:    84 KIVYKINYGVLCMLCKTWNFTFL 152
             KI   + + V+ + C  W+   L
Sbjct:   137 KIFCSVTFHVIAITCVVWSLYVL 159

>gi|2789644|gb|AAC27444.1| (AF034626) ribosomal protein S12 [Phytomonas serpens]
        Length = 89

  Plus Strand HSPs:

 Score = 39 (13.7 bits), Expect = 1.5, Sum P(2) = 0.77
 Identities = 7/22 (31%), Positives = 14/22 (63%), Frame = +3

Query:    87 IVYKINYGVLCMLCKTWNFTFL 152
             ++Y   +G  C++C + +F FL
Sbjct:    48 VIYCFLFGC-CVICYSQSFYFL 68

 Score = 33 (11.6 bits), Expect = 1.5, Sum P(2) = 0.77
 Identities = 9/31 (29%), Positives = 15/31 (48%), Frame = +1

Query:    13 IYGHCLSD*---RCCLISPCICILGTIRLYI 96
             +YG C+       C  +SP +   G  R+Y+
Sbjct:     9 LYGFCVRFCFVFLCIYVSPRLPSSGNRRVYV 39

>gi|7292590|gb|AAF47990.1| (AE003484) CG1537 gene product [Drosophila melanogaster]
        Length = 135

  Plus Strand HSPs:

 Score = 45 (15.8 bits), Expect = 1.8, Sum P(2) = 0.84
 Identities = 8/14 (57%), Positives = 9/14 (64%), Frame = +3

Query:   105 YGVLCMLCKTWNFT 146
             YG+LC    TW FT
Sbjct:    85 YGILCNFKCTWFFT 98

 Score = 32 (11.3 bits), Expect = 1.8, Sum P(2) = 0.84
 Identities = 7/15 (46%), Positives = 8/15 (53%), Frame = +3

Query:    63 HLHLGHHKIVYKINY 107
             H H G H I   IN+
Sbjct:    46 HHHGGSHHIRRNINH 60

>gi|2769725|gb|AAC40337.1| (U92288) H88 [Human herpesvirus 6]
        Length = 40

  Plus Strand HSPs:

 Score = 61 (21.5 bits), Expect = 2.0, P = 0.87
 Identities = 13/30 (43%), Positives = 18/30 (60%), Frame = +1

Query:    43 CCLISPCICILGTIRLYIK*IMVCCVC-CAR 132
             CC+   C+C+L    L +  + VCCVC C R
Sbjct:     9 CCVCVCCVCVLC---LCV--VFVCCVCVCER 34

>gi|7519366|pir||E71213 hypothetical protein PH1973 - Pyrococcus horikoshii
 >gi|3258417|dbj|BAA31100.1| (AP000007) 107aa long hypothetical protein [Pyrococcus horikoshii]
        Length = 107

  Plus Strand HSPs:

 Score = 61 (21.5 bits), Expect = 2.0, P = 0.87
 Identities = 17/42 (40%), Positives = 23/42 (54%), Frame = +3

Query:    15 LWTLPIRLTLLFN*SMHLHLGHHKIVYKINYGVLCMLCKTWN 140
             LW L   LTLLFN    + +G    + +I+ G+ C L K WN
Sbjct:    71 LWPLSANLTLLFNFRSSILVG---FILRIS-GISCHLVK-WN 107

>gi|485515|pir||S33622 ADR6 protein - soybean
 >gi|296445|emb|CAA49340.1| (X69639) auxin down regulated [Glycine max]
 >gi|2304955|gb|AAB65592.1| (U64866) similar to ADR6 encoded by GenBank Accession Number X69639; aluminum induced [Glycine max]
        Length = 272

  Annotated Domains:
    DOMO DM02982:  18..258

  Plus Strand HSPs:

 Score = 67 (23.6 bits), Expect = 2.6, P = 0.92
 Identities = 12/14 (85%), Positives = 14/14 (100%), Frame = +2

Query:     5 PNLSMDTAYQTNVV 46
             PNLS+DTAYQTN+V
Sbjct:   258 PNLSVDTAYQTNIV 271

>gi|288324|emb|CAA39349.1| (X55823) T-cell receptor alpha-chain [Mus musculus]
        Length = 95

  Minus Strand HSPs:

 Score = 40 (14.1 bits), Expect = 3.6, Sum P(2) = 0.97
 Identities = 10/31 (32%), Positives = 16/31 (51%), Frame = -3

Query:   150 ER*SSRSC-TTYTTHHNLFYIQSYGAQDADA 61
             E+ SSR    TY      F++Q    Q++D+
Sbjct:    30 EKGSSRGFEATYNKEATSFHLQKASVQESDS 60

 Score = 29 (10.2 bits), Expect = 3.6, Sum P(2) = 0.97
 Identities = 5/10 (50%), Positives = 5/10 (50%), Frame = -2

Query:    70 CRCMD*LNNN 41
             C C    NNN
Sbjct:    63 CYCASFFNNN 72

>gi|6580252|emb|CAB63324.1| (Z83129) predicted using Genefinder [Caenorhabditis elegans]
        Length = 101

  Plus Strand HSPs:

 Score = 58 (20.4 bits), Expect = 4.3, P = 0.99
 Identities = 11/29 (37%), Positives = 16/29 (55%), Frame = +1

Query:    43 CCLISPCICILGTIRLYIK*IMVCCVCCA 129
             CC    C C+ G +R + +   +CC CCA
Sbjct:    71 CC----CSCMHGVLRFFAE---LCCNCCA 92

>gi|4958931|dbj|BAA78101.1| (AB027465) Rep protein 3 [squash leaf curl virus]
        Length = 136

  Plus Strand HSPs:

 Score = 38 (13.4 bits), Expect = 4.6, Sum P(2) = 0.99
 Identities = 10/30 (33%), Positives = 19/30 (63%), Frame = +3

Query:    30 IRLTLLFN*SMHLHLGHHK--IVYKINYGV 113
             I++ ++FN ++   LG HK  I ++I  G+
Sbjct:    48 IKVQVMFNHNLRKALGLHKCAITFQIWTGL 77

 Score = 36 (12.7 bits), Expect = 4.6, Sum P(2) = 0.99
 Identities = 4/7 (57%), Positives = 7/7 (100%), Frame = +3

Query:     9 IYLWTLP 29
             +Y+WT+P
Sbjct:    20 VYIWTVP 26

>gi|4033371|sp|O51875|ATPD_BUCAP ATP SYNTHASE DELTA CHAIN
 >gi|2827021|gb|AAC38113.1| (AF008210) ATP synthase subunit delta [Buchnera aphidicola]
        Length = 177

  Annotated Domains:
    PFAM OSCP: ATP synthase delta (OSCP) subunit  7..175
    PRODOM PD001250: ATPD(41) ATPO(6)  1..174

  Plus Strand HSPs:

 Score = 42 (14.8 bits), Expect = 7.2, Sum P(2) = 1.0
 Identities = 7/18 (38%), Positives = 13/18 (72%), Frame = +2

Query:    11 LSMDTAYQTNVVV*LVHA 64
             L ++T+YQ N ++ L+ A
Sbjct:    97 LKLETSYQGNTIIELISA 114

 Score = 33 (11.6 bits), Expect = 7.2, Sum P(2) = 1.0
 Identities = 6/17 (35%), Positives = 11/17 (64%), Frame = +3

Query:    66 LHLGHHKIVYKINYGVL 116
             + L   K +YKI++ +L
Sbjct:   132 IFLSKIKFIYKIDHQIL 148

Parameters: filter=none matrix=BLOSUM62 V=50 B=50 E=10 gi H=1 sort_by_pvalue echofilter
  ctxfactor=5.90

  Query                        -----  As Used  -----    -----  Computed  ----
  Frame  MatID Matrix name     Lambda    K       H      Lambda    K       H
   Std.    0   BLOSUM62                                 0.318   0.135   0.401
   +3      0   BLOSUM62        0.318   0.135   0.401    0.346   0.155   0.588
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +2      0   BLOSUM62        0.318   0.135   0.401    0.353   0.150   0.564
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +1      0   BLOSUM62        0.318   0.135   0.401    0.358   0.163   0.681
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -1      0   BLOSUM62        0.318   0.135   0.401    0.356   0.158   0.578
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -2      0   BLOSUM62        0.318   0.135   0.401    0.362   0.164   0.644
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -3      0   BLOSUM62        0.318   0.135   0.401    0.361   0.154   0.555
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a

  Query
  Frame  MatID  Length  Eff.Length     E    S   W   T   X   E2      S2
   +3      0      51        50        10.   57  3  12  22  0.10     29
                                                       26  0.071    29
   +2      0      51        50        10.   57  3  12  22  0.10     29
                                                       26  0.071    29
   +1      0      51        50        10.   57  3  12  22  0.10     29
                                                       26  0.071    29
   -1      0      51        51        10.   57  3  12  22  0.10     29
                                                       26  0.077    29
   -2      0      51        50        10.   57  3  12  22  0.10     29
                                                       26  0.071    29
   -3      0      51        50        10.   57  3  12  22  0.10     29
                                                       26  0.071    29

Statistics:

  Database:  /usr/local/dot5/sl_home/beauty/seqdb/blast/nr
    Title:  nr
    Release date:  unknown
    Posted date:  8:50 PM CDT May 27, 2000
    Format:  BLAST
  # of letters in database:  158,518,215
  # of sequences in database:  505,245
  # of database sequences satisfying E:  12
  No. of states in DFA:  574 (57 KB)
  Total size of DFA:  113 KB (128 KB)
  Time to generate neighborhood:  0.00u 0.01s 0.01t  Elapsed: 00:00:00
  No. of threads or processors used:  4
  Search cpu time:  73.07u 1.04s 74.11t  Elapsed: 00:00:25
  Total cpu time:  73.11u 1.08s 74.19t  Elapsed: 00:00:25
  Start:  Wed Feb 14 23:39:36 2001   End:  Wed Feb 14 23:40:01 2001
Annotated Domains Database: March 14, 2000
Release Date: March 14, 2000
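The Score, Expect and P values reported for each HSP are numerically related, and a small sketch can make that relationship easier to check by hand. Two assumptions are made here and should be read as such: P is taken as the Poisson probability 1 - exp(-E), and the bit score is taken as Lambda * S / ln 2 using the Lambda of 0.244 listed on the Q=9,R=2 rows of the parameter table. Both forms agree with the single-HSP figures printed in this report to the displayed precision (small differences can arise because Expect is itself printed rounded), but neither is quoted from WU-BLAST documentation, and the function names are hypothetical.

```python
import math

# "As Used" Lambda on the Q=9,R=2 rows of this report (assumed to be the
# value behind the printed bit scores).
GAPPED_LAMBDA = 0.244


def p_from_expect(expect):
    """Poisson probability of at least one hit when `expect` hits are expected."""
    return 1.0 - math.exp(-expect)


def bits_from_raw(score, lam=GAPPED_LAMBDA):
    """Convert a raw score to bits as lam * score / ln 2 (assumed form)."""
    return lam * score / math.log(2)


if __name__ == "__main__":
    # (raw score, Expect) pairs copied from single-HSP hits in the report
    for raw, expect in [(74, 0.45), (61, 2.0), (58, 4.3)]:
        print(raw, round(bits_from_raw(raw), 1), round(p_from_expect(expect), 2))
```

For hits reported with Sum P(2), the printed probability combines two HSPs, so this single-HSP relation is only a rough guide there.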