WU-BLAST 2.0 search of the National Center for Biotechnology Information's NR Protein Database.
BEAUTY post-processing provided by the Human Genome Sequencing Center, Baylor College of Medicine.
BEAUTY Reference:
Worley KC, Culpepper P, Wiese BA, Smith RF. BEAUTY-X: enhanced BLAST searches for DNA queries. Bioinformatics 1998;14(10):890-1.
Worley KC, Wiese BA, Smith RF. BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results. Genome Res 1995 Sep;5(2):173-84.
RepeatMasker repeats found in sequence: No Repeats Found.
Reference: Gish, Warren (1994-1997). unpublished. Gish, Warren and David J. States (1993). Identification of protein coding regions by database similarity search. Nat. Genet. 3:266-72.
Notice: statistical significance is estimated under the assumption that the equivalent of one entire reading frame in the query sequence codes for protein and that significant alignments will involve only coding reading frames.
Query= SSH4E04.SEQ(1>250) (226 letters)
Translating both strands of query sequence in all 6 reading frames

Database:  nr
           505,245 sequences; 158,518,215 total letters.

     Observed Numbers of Database Sequences Satisfying
    Various EXPECTation Thresholds (E parameter values)

        Histogram units:      = 8 Sequences     : less than 8 sequences

 EXPECTation Threshold
 (E parameter)
    |
    V   Observed Counts-->
  10000    1625  435 |======================================================
   6310    1190  243 |==============================
   3980     947  212 |==========================
   2510     735  269 |=================================
   1580     466  188 |=======================
   1000     278   93 |===========
    631     185   55 |======
    398     130   40 |=====
    251      90   26 |===
    158      64   11 |=
    100      53   17 |==
   63.1      36   18 |==
   39.8      18    3 |:
   25.1      15    8 |=
   15.8       7    1 |:
 >>>>>>>>>>>>>>>>>>>>>  Expect = 10.0, Observed = 6  <<<<<<<<<<<<<<<<<
   10.0       6    1 |:
   6.31       5    1 |:
   3.98       4    2 |:
   2.51       2    1 |:

                                                                      Smallest
                                                                        Sum
                                                      Reading  High  Probability
Sequences producing High-scoring Segment Pairs:         Frame Score  P(N)      N

gi|4220463|gb|AAD12690.1|(AC006216) This gene is cut ...   +1   142  5.4e-09   1
gi|1498560|gb|AAB06427.1|(U64495) immunoglobulin heav...   +2    46  0.83      2
gi|3820624|gb|AAC69631.1|(AF099962) factor essential ...   -1    54  0.97      2
gi|4539614|gb|AAD22132.1|U23711_1(U23711) FEMA [Staph...   -1    54  0.97      2
gi|3608190|dbj|BAA33160.1|(AB017341) AC3 [mungbean ye...   +3    58  0.998     1
gi|7485145|pir||F71412hypothetical protein - Arabidop...   +3    62  0.99990   1

Locally-aligned regions (HSPs) with respect to query sequence
[Graphical HSP map omitted; query sequence scale 0 to 76]

  Frame  3 Hits:  gi|3608190, gi|7485145
  Frame  2 Hits:  gi|1498560
  Frame  1 Hits:  gi|4220463, gi|1498560
  Frame -1 Hits:  gi|3820624, gi|4539614
  Frame -3 Hits:  gi|3820624, gi|4539614, Prosite Hits

Prosite hits:
  TYR_PHOSPHO_SITE  Tyrosine kinase phosphorylation site.  21..28
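The histogram legend above ("= 8 Sequences", ": less than 8 sequences") can be sketched in a few lines. This is only an illustration of the bar rule as it appears in this report, not the program's own code; the function name `histogram_bar` and the handling of a zero count are assumptions.

```python
def histogram_bar(count, unit=8):
    """Render one histogram bar under the legend printed in this report.

    Assumption: one '=' per full unit of 8 sequences, and ':' when the
    bin holds at least one sequence but fewer than one full unit.
    An empty bin is rendered as an empty string (not shown in the report).
    """
    if count <= 0:
        return ""
    full_units = count // unit
    return "=" * full_units if full_units else ":"


if __name__ == "__main__":
    # Per-bin counts taken from the third column of the histogram above.
    for threshold, in_bin in [("10000", 435), ("25.1", 8), ("39.8", 3)]:
        print(f"{threshold:>6} {in_bin:>4} |{histogram_bar(in_bin)}")
```

The second numeric column of the histogram is the cumulative count at each threshold and the third is the count within the bin, so only the third column drives the bar length.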
>gi|4220463|gb|AAD12690.1| (AC006216) This gene is cut off. [Arabidopsis thaliana]
        Length = 200

  [Graphical HSP map omitted: Frame 1 hits; database sequence scale 0 to 200]

  Plus Strand HSPs:

 Score = 142 (50.0 bits), Expect = 5.4e-09, P = 5.4e-09
 Identities = 24/28 (85%), Positives = 28/28 (100%), Frame = +1

Query:   142 IYMNIDDSSENYWLRLLLREWAQFCIFL 225
             IYMN+D+SS+NYWLRLL+REWAQFCIFL
Sbjct:   132 IYMNMDNSSQNYWLRLLIREWAQFCIFL 159

 Score = 86 (30.3 bits), Expect = 0.012, P = 0.011
 Identities = 23/50 (46%), Positives = 33/50 (66%), Frame = +1

Query:     7 MFKKFQGAMQIVATAETMVLYLGFSS*FINLIYINI*L*QAHQF*IYMNI 156
             MFKKFQGAMQIVA AET V+Y+   +   N  ++ + + +  QF I++ I
Sbjct:   114 MFKKFQGAMQIVAMAET-VIYMNMDNSSQNY-WLRLLIREWAQFCIFLYI 161

>gi|1498560|gb|AAB06427.1| (U64495) immunoglobulin heavy chain variable region [Homo sapiens]
        Length = 81

  [Graphical HSP map omitted: Frame 2 and Frame 1 hits; database sequence scale 0 to 81]

  Plus Strand HSPs:

 Score = 46 (16.2 bits), Expect = 1.7, Sum P(2) = 0.83
 Identities = 7/16 (43%), Positives = 11/16 (68%), Frame = +2

Query:   176 TGFVYCSENGHNSASF 223
             T   YC++ G++S SF
Sbjct:    51 TAIYYCAQTGYSSGSF 66

 Score = 40 (14.1 bits), Expect = 1.7, Sum P(2) = 0.83
 Identities = 9/24 (37%), Positives = 12/24 (50%), Frame = +1

Query:    13 KKFQGAMQIVATAETMVLYLGFSS 84
             + FQG + I A   T   Y+  SS
Sbjct:    22 ENFQGRVTITADTSTDTAYMELSS 45

>gi|3820624|gb|AAC69631.1| (AF099962) factor essential for methicillin resistance FEMA [Staphylococcus haemolyticus]
        Length = 420

  [Graphical HSP map omitted: Frame -1 and Frame -3 hits; database sequence scale 0 to 420]

  Minus Strand HSPs:

 Score = 54 (19.0 bits), Expect = 3.4, Sum P(2) = 0.97
 Identities = 10/27 (37%), Positives = 16/27 (59%), Frame = -1

Query:   187 NEASNSLKSHRCSYISKIDELVKVRYL 107
             NE +  LK H C Y+ ++D  +  +YL
Sbjct:    91 NELTKYLKQHNCLYV-RVDPYLPYQYL 116

 Score = 33 (11.6 bits), Expect = 3.4, Sum P(2) = 0.97
 Identities = 5/18 (27%), Positives = 12/18 (66%), Frame = -3

Query:   107 IYIKFINYEEKPKYNTIV 54
             + +KF++ EE P + + +
Sbjct:   191 VKVKFLSEEELPIFRSFM 208

>gi|4539614|gb|AAD22132.1|U23711_1 (U23711) FEMA [Staphylococcus haemolyticus]
        Length = 420

  [Graphical HSP map omitted: Frame -1 and Frame -3 hits; database sequence scale 0 to 420]

  Minus Strand HSPs:

 Score = 54 (19.0 bits), Expect = 3.4, Sum P(2) = 0.97
 Identities = 10/27 (37%), Positives = 16/27 (59%), Frame = -1

Query:   187 NEASNSLKSHRCSYISKIDELVKVRYL 107
             NE +  LK H C Y+ ++D  +  +YL
Sbjct:    91 NELTKYLKQHNCLYV-RVDPYLPYQYL 116

 Score = 33 (11.6 bits), Expect = 3.4, Sum P(2) = 0.97
 Identities = 5/18 (27%), Positives = 12/18 (66%), Frame = -3

Query:   107 IYIKFINYEEKPKYNTIV 54
             + +KF++ EE P + + +
Sbjct:   191 VKVKFLSEEELPIFRSFM 208

>gi|3608190|dbj|BAA33160.1| (AB017341) AC3 [mungbean yellow mosaic virus]
        Length = 126

  [Graphical HSP map omitted: Frame 3 hits; database sequence scale 0 to 126]

  Plus Strand HSPs:

 Score = 58 (20.4 bits), Expect = 6.1, P = 1.0
 Identities = 14/27 (51%), Positives = 17/27 (62%), Frame = +3

Query:    96 LDIYKYLTLTSSSILDIYEHR*LFREL 176
             L +Y YLT TS  IL I+  + LFR L
Sbjct:    69 LKLYHYLTATSGMILSIFSRQ-LFRYL 94

>gi|7485145|pir||F71412 hypothetical protein - Arabidopsis thaliana
>gi|2244850|emb|CAB10272.1| (Z97337) hypothetical protein [Arabidopsis thaliana]
>gi|7268239|emb|CAB78535.1| (AL161540) hypothetical protein [Arabidopsis thaliana]
        Length = 275

  [Graphical HSP map omitted: Frame 3 hits; database sequence scale 0 to 275]

  Plus Strand HSPs:

 Score = 62 (21.8 bits), Expect = 9.2, P = 1.0
 Identities = 14/44 (31%), Positives = 23/44 (52%), Frame = +3

Query:     6 NVQEISGCNADSSYSRNYGIIFGFLFIVYKLDIYKYLTLTSSSI 137
             NV    G N   + S +  + FGFLF  Y + +Y+ L + S ++
Sbjct:   123 NVGSNCGYNMSVNISSSVSLCFGFLFPYYVMFLYRLLGVYSGTV 166

Parameters:
  filter=none
  matrix=BLOSUM62
  V=50
  B=50
  E=10
  gi
  H=1
  sort_by_pvalue
  echofilter

  ctxfactor=5.96

  Query                        -----  As Used  -----    -----  Computed  ----
  Frame  MatID Matrix name     Lambda    K       H      Lambda    K       H
   Std.    0   BLOSUM62                                 0.318   0.135   0.401
   +3      0   BLOSUM62        0.318   0.135   0.401    0.338   0.151   0.452
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +2      0   BLOSUM62        0.318   0.135   0.401    0.371   0.163   0.651
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +1      0   BLOSUM62        0.318   0.135   0.401    0.363   0.161   0.565
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -1      0   BLOSUM62        0.318   0.135   0.401    0.363   0.163   0.544
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -2      0   BLOSUM62        0.318   0.135   0.401    0.358   0.160   0.569
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -3      0   BLOSUM62        0.318   0.135   0.401    0.363   0.158   0.529
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a

  Query
  Frame  MatID  Length  Eff.Length     E    S  W   T   X     E2   S2
   +3      0       74        74      10.   57  3  12  22  0.097   31
                                                       27  0.096   32
   +2      0       75        75      10.   57  3  12  22  0.099   31
                                                       27  0.099   32
   +1      0       75        74      10.   57  3  12  22  0.097   31
                                                       27  0.096   32
   -1      0       75        75      10.   57  3  12  22  0.099   31
                                                       27  0.099   32
   -2      0       75        75      10.   57  3  12  22  0.099   31
                                                       27  0.099   32
   -3      0       74        74      10.   57  3  12  22  0.097   31
                                                       27  0.096   32

Statistics:

  Database:  /usr/local/dot5/sl_home/beauty/seqdb/blast/nr
    Title:  nr
    Release date:  unknown
    Posted date:  8:50 PM CDT May 27, 2000
    Format:  BLAST
  # of letters in database:  158,518,215
  # of sequences in database:  505,245
  # of database sequences satisfying E:  6
  No. of states in DFA:  572 (56 KB)
  Total size of DFA:  120 KB (128 KB)
  Time to generate neighborhood:  0.01u 0.00s 0.01t  Elapsed: 00:00:00
  No. of threads or processors used:  4
  Search cpu time:  92.43u 1.10s 93.53t  Elapsed: 00:00:59
  Total cpu time:  92.47u 1.11s 93.58t  Elapsed: 00:00:59
  Start:  Wed Feb 14 18:32:34 2001   End:  Wed Feb 14 18:33:33 2001
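The bit scores and P-values in the alignments above can be related to the Parameters table with a short sketch. Two assumptions are made and should be read as such: the bit scores printed in this report appear to equal Lambda * S / ln 2 with the Lambda on the Q=9,R=2 row (0.244), which is an observation from the numbers here rather than a documented formula; and the single-HSP P-value is taken as the usual Poisson relation P = 1 - exp(-E). The function names are hypothetical, and the "Sum P(N)" values for multi-HSP hits are not modeled.

```python
import math

# Assumption: Lambda from the Q=9,R=2 row of the Parameters table.
GAPPED_LAMBDA = 0.244


def bit_score(raw_score, lam=GAPPED_LAMBDA):
    """Convert a raw score S to bits as lam * S / ln 2.

    This matches the values printed next to each Score in this report,
    e.g. Score = 142 (50.0 bits); it is a fit to the report, not a
    statement about the program's internals.
    """
    return lam * raw_score / math.log(2)


def p_from_expect(expect):
    """Poisson probability of at least one hit given an expectation E."""
    # -expm1(-E) equals 1 - exp(-E) but stays accurate for very small E.
    return -math.expm1(-expect)


if __name__ == "__main__":
    for raw in (142, 86, 62, 33):
        print(f"Score = {raw} ({bit_score(raw):.1f} bits)")
    for expect in (5.4e-09, 9.2):
        print(f"Expect = {expect:g}, P = {p_from_expect(expect):.5g}")
```

Because the report prints Expect and P to two or three significant figures, values recomputed this way may differ from the printed ones in the last digit.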
Annotated Domains Database: March 14, 2000
Release Date: March 14, 2000