WU-BLAST 2.0 search of the National Center for Biotechnology Information's NR Protein Database.
BEAUTY post-processing provided by the Human Genome Sequencing Center, Baylor College of Medicine.
BEAUTY Reference:
Worley KC, Culpepper P, Wiese BA, Smith RF. BEAUTY-X: enhanced BLAST searches for DNA queries. Bioinformatics 1998;14(10):890-1.
Worley KC, Wiese BA, Smith RF. BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results. Genome Res 1995 Sep;5(2):173-84.
RepeatMasker repeats found in sequence: No Repeats Found.

Reference: Gish, Warren (1994-1997). unpublished.
Gish, Warren and David J. States (1993). Identification of protein coding regions by database similarity search. Nat. Genet. 3:266-72.

Notice: statistical significance is estimated under the assumption that the equivalent of one entire reading frame in the query sequence codes for protein and that significant alignments will involve only coding reading frames.
Query= 'E04C12_C12_06.ab1' (717 letters)
Translating both strands of query sequence in all 6 reading frames

Database:  nr
           625,274 sequences; 197,782,623 total letters.

     Observed Numbers of Database Sequences Satisfying
    Various EXPECTation Thresholds (E parameter values)

        Histogram units:      = 3 Sequences     : less than 3 sequences

 EXPECTation Threshold
 (E parameter)
    |
    V   Observed Counts-->
 10000  830  161 |=====================================================
  6310  669  125 |=========================================
  3980  544  131 |===========================================
  2510  413  171 |=========================================================
  1580  242   84 |============================
  1000  158   53 |=================
   631  105   37 |============
   398   68   23 |=======
   251   45   13 |====
   158   32    8 |==
   100   24    8 |==
  63.1   16    2 |:
  39.8   14    2 |:
  25.1   12    2 |:
  15.8   10    6 |==
 >>>>>>>>>>>>>>>>>>>>>  Expect = 10.0, Observed = 4  <<<<<<<<<<<<<<<<<
  10.0    4    0 |
  6.31    4    0 |
  3.98    4    1 |:

                                                                       Smallest
                                                                          Sum
                                                       Reading  High  Probability
Sequences producing High-scoring Segment Pairs:          Frame Score  P(N)      N

gi|7486565|pir||T05123 hypothetical protein F7H19.100 ...  +3    871  3.8e-86   1
gi|7487452|pir||T09350 hypothetical protein T26M18.120...  +3    810  1.1e-79   1
gi|8778672|gb|AAF79680.1|AC022314_21 (AC022314) F9C16....  +3    490  8.9e-46   1
gi|5174693 ref|NP_005979.1| small proline-rich protein...  +3     66  0.96      1
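The six-frame translation step named in this report can be illustrated with a short Python sketch. This is not WU-BLAST's own code: the function names (`translate`, `six_frames`, `nucleotide_span`, `plus_frame`) are hypothetical, the standard genetic code is assumed, and any codon containing an ambiguous base is rendered as `X`. Under those assumptions, a 717-letter query yields frames of 239, 238 and 238 codons on each strand, and a 60-residue alignment row starting at nucleotide 3 ends at nucleotide 182.

```python
# Minimal six-frame translation sketch (assumes the standard genetic code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; codons with ambiguous bases become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Translate frames +1..+3 and -1..-3 of a DNA string."""
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


def nucleotide_span(start_nt, residues):
    """Nucleotide start and end (1-based) covered by `residues` codons."""
    return start_nt, start_nt + 3 * residues - 1


def plus_frame(start_nt):
    """Plus-strand reading frame (1..3) for a 1-based nucleotide start."""
    return (start_nt - 1) % 3 + 1


if __name__ == "__main__":
    # Frame lengths for a placeholder 717-letter sequence (not the real query).
    print({name: len(seq) for name, seq in six_frames("A" * 717).items()})
    print(nucleotide_span(3, 60), plus_frame(3))
```

The nucleotide numbering is why query coordinates in the alignments advance by three per aligned residue while subject coordinates advance by one.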
>gi|7486565|pir||T05123 hypothetical protein F7H19.100 - Arabidopsis thaliana
 >gi|3292817|emb|CAA19807.1| (AL031018) hypothetical protein [Arabidopsis thaliana]
 >gi|7269139|emb|CAB79247.1| (AL161558) hypothetical protein [Arabidopsis thaliana]
        Length = 268

Frame 3 hits (HSPs):      ____________________________________________
                      __________________________________________________
Database sequence:    |        |         |        |        |         |  | 268
                      0        50        100      150      200       250

  Plus Strand HSPs:

 Score = 871 (306.6 bits), Expect = 3.8e-86, P = 3.8e-86
 Identities = 167/238 (70%), Positives = 190/238 (79%), Frame = +3

Query:     3 AGXNSLFPYCGRRVGKKNKAMVPVARLFGPAIFEASKLKVLFLGVDENKHPGNLPRTYTL 182
             +  +SLF +  RR  KKN+++VPVARLFGPAIFE+SKLKVLFLGVDE KHP  LPRTYTL
Sbjct:    26 SSSSSLF-FNNRRSKKKNQSIVPVARLFGPAIFESSKLKVLFLGVDEKKHPSTLPRTYTL 84

Query:   183 THSDITAKLTLAISQTINNSQLQGWYNRFQRDEVVAQWKKVKGRMSLHVHCHISGGHFLL 362
             THSDITAKLTLAISQ+INNSQLQGW NR  RDEVVA+WKKVKG+MSLHVHCHISGGHFLL
Sbjct:    85 THSDITAKLTLAISQSINNSQLQGWANRLYRDEVVAEWKKVKGKMSLHVHCHISGGHFLL 144

Query:   363 DILARLRYFIFCKELPVVLKAVVHGDENLFNSYPELQDALVWVYFHSNIPEFNKVECWGP 542
             D+ A+ RYFIFCKELPVVLKA VHGD NL N+YPELQ+ALVWVYFHSN+ EFNKVECWGP
Sbjct:   145 DLFAKFRYFIFCKELPVVLKAFVHGDGNLLNNYPELQEALVWVYFHSNVNEFNKVECWGP 204

Query:   543 LKEASAPTGGVQEEGLAIPQP-CQEECQCCFPPLTLSPIQWSKQVPSRHYEPCDGIGTQ 716
             L EA +P G   E    +P+  C +EC CCFP  T+S I WS  + +       G  T+
Sbjct:   205 LWEAVSPDGHKTE---TLPEARCADECSCCFP--TVSSIPWSHSLSNEGVNGYSGTQTE 258

>gi|7487452|pir||T09350 hypothetical protein T26M18.120 - Arabidopsis thaliana
 >gi|5002526|emb|CAB44329.1| (AL078606) putative protein [Arabidopsis thaliana]
 >gi|7267892|emb|CAB78234.1| (AL161533) putative protein [Arabidopsis thaliana]
        Length = 466

Frame 3 hits (HSPs):     _____________________________________________
                      __________________________________________________
Database sequence:    |               |               |               | | 466
                      0               150             300             450

  Plus Strand HSPs:

 Score = 810 (285.1 bits), Expect = 1.1e-79, P = 1.1e-79
 Identities = 148/204 (72%), Positives = 173/204 (84%), Frame = +3

Query:    36 RRVGKKNKAMVPVARLFGPAIFEASKLKVLFLGVDENKHPGNLPRTYTLTHSDITAKLTL 215
             RR   KN+++VPVARLFGPAIFEASKLKVLFLGVDE KHP  LPRTYTLTHSDITAKLTL
Sbjct:    32 RRSKMKNRSIVPVARLFGPAIFEASKLKVLFLGVDEKKHPAKLPRTYTLTHSDITAKLTL 91

Query:   216 AISQTINNSQLQGWYNRFQRDEVVAQWKKVKGRMSLHVHCHISGGHFLLDILARLRYFIF 395
             AISQ+INNSQLQGW N+  RDEVV +WKKVKG+MSLHVHCHISGGHF L+++A+LRY+IF
Sbjct:    92 AISQSINNSQLQGWANKLFRDEVVGEWKKVKGKMSLHVHCHISGGHFFLNLIAKLRYYIF 151

Query:   396 CKELPVVLKAVVHGDENLFNSYPELQDALVWVYFHSNIPEFNKVECWGPLKEASAP---T 566
             CKELPVVL+A  HGDE L N++PELQ++ VWVYFHSNIPE+NKVECWGPL EA +
Sbjct:   152 CKELPVVLEAFAHGDEYLLNNHPELQESPVWVYFHSNIPEYNKVECWGPLWEAMSQHQHD 211

Query:   567 GGVQEEGLAIPQ-PCQEECQCCFP 635
             G   ++   +P+ PC +EC+CCFP
Sbjct:   212 GRTHKKSETLPELPCPDECKCCFP 235

 Score = 777 (273.5 bits), Expect = 3.4e-76, P = 3.4e-76
 Identities = 149/212 (70%), Positives = 166/212 (78%), Frame = +3

Query:    57 KAMVP-VARLFGPAIFEASKLKVLFLGVDENKHPGNLPRTYTLTHSDITAKLTLAISQTI 233
             K   P VARLFG AIFEASKL V FLGVDE KHP NLPRTYT THSDITAKLTLAIS +I
Sbjct:   231 KCCFPTVARLFGQAIFEASKLNVKFLGVDEKKHPPNLPRTYTFTHSDITAKLTLAISHSI 290

Query:   234 NNSQLQGWYNRFQRDEVVAQWKKVKGRMSLHVHCHISGGHFLLDILARLRYFIFCKELPV 413
             NNSQLQGW NR  RDEVVA+W+KVK  MSLHVHCHISG HFLLD++A LRYFIFCKELP+
Sbjct:   291 NNSQLQGWANRLYRDEVVAEWRKVKSNMSLHVHCHISGDHFLLDLIAELRYFIFCKELPM 350

Query:   414 VLKAVVHGDENLFNSYPELQDALVWVYFHSNIPEFNKVECWGPLKEASAPTGGVQEEGLA 593
             VLKA VHGDEN+ N+YPEL +A VWVYFHSNIP+FNKVECWG L EA++  G
Sbjct:   351 VLKAFVHGDENMLNNYPELHEAFVWVYFHSNIPKFNKVECWGRLCEATSHDGCKTPTCEI 410

Query:   594 IPQP-CQEECQCCFPPLTLSPIQWSKQVPSRHYE 692
             +P+P C ++C CCFP  T+S I WS      H E
Sbjct:   411 LPEPPCFDKCSCCFP--TVSTIPWSHSHGCSHGE 442

>gi|8778672|gb|AAF79680.1|AC022314_21 (AC022314) F9C16.20 [Arabidopsis thaliana]
        Length = 299

Frame 3 hits (HSPs):           _____________________________
                      __________________________________________________
Database sequence:    |       |        |       |       |        |       | 299
                      0       50       100     150     200      250

  Plus Strand HSPs:

 Score = 490 (172.5 bits), Expect = 8.9e-46, P = 8.9e-46
 Identities = 97/169 (57%), Positives = 121/169 (71%), Frame = +3

Query:    54 NKAMVPVARLFGP-AIFEASKLKVLFLG-VDENKHPGNL--PRTYTLTHSDITAKLTLAI 221
             N  +    RL  P A F++SKLKV FLG + ENK  G +  PRTY L+H D TA LTL I
Sbjct:    57 NTLVSEAVRLLVPQANFDSSKLKVEFLGELLENKSNGGIITPRTYILSHCDFTANLTLTI 116

Query:   222 SQTINNSQLQGWYNRFQRDEVVAQWKKVKGRMSLHVHCHISGGHFLLDILARLRYFIFCK 401
             S  IN  QL+GWY   ++D+VVA+WKKV   + LH+HC +SG   L D+ A LRY IF K
Sbjct:   117 SNVINLDQLEGWY---KKDDVVAEWKKVNDELRLHIHCCVSGMSLLQDVAAELRYHIFSK 173

Query:   402 ELPVVLKAVVHGDENLFNSYPELQDALVWVYFHSNIPEFNKVECWGPLKEAS 557
             ELP+VLKAVVHGD  +F   PEL DA VWVYFHS+ P++N++ECWGPLK+A+
Sbjct:   174 ELPLVLKAVVHGDSVMFRENPELMDAYVWVYFHSSTPKYNRIECWGPLKDAA 225

>gi|5174693 ref|NP_005979.1| small proline-rich protein 2A [Homo sapiens]
 >gi|12719984 ref|XP_010605.1| small proline-rich protein 2A [Homo sapiens]
 >gi|464788|sp|P35326|SP2A_HUMAN SMALL PROLINE-RICH PROTEIN 2A (SPR-2A) (2-1)
 >gi|107689|pir||S12712 small proline-rich protein spr2-1 - human
 >gi|3367693|emb|CAA37239.1| (X53064) small proline-rich protein [Homo sapiens]
        Length = 72

Frame 3 hits (HSPs):                               ____________________
                      __________________________________________________
Database sequence:    |             |             |             |       | 72
                      0             20            40            60

  Plus Strand HSPs:

 Score = 66 (23.2 bits), Expect = 3.3, P = 0.96
 Identities = 14/28 (50%), Positives = 18/28 (64%), Frame = +3

Query:   597 PQPCQ-EECQCCFPPLTLSPIQWSKQVP 677
             PQPC  ++CQ  +PP+T SP   SK  P
Sbjct:    42 PQPCPPQQCQQKYPPVTPSPPCQSKYPP 69

Parameters:
  filter=none
  matrix=BLOSUM62
  V=50
  B=50
  E=10
  gi
  H=1
  sort_by_pvalue
  echofilter

  ctxfactor=5.97

  Query                        -----  As Used  -----    -----  Computed  ----
  Frame  MatID Matrix name     Lambda    K       H      Lambda    K       H
   Std.    0   BLOSUM62                                 0.318   0.135   0.401
   +3      0   BLOSUM62        0.318   0.135   0.401    0.322   0.139   0.449
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +2      0   BLOSUM62        0.318   0.135   0.401    0.339   0.149   0.499
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +1      0   BLOSUM62        0.318   0.135   0.401    0.345   0.149   0.557
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -1      0   BLOSUM62        0.318   0.135   0.401    0.341   0.152   0.538
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -2      0   BLOSUM62        0.318   0.135   0.401    0.345   0.154   0.582
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -3      0   BLOSUM62        0.318   0.135   0.401    0.348   0.151   0.557
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a

  Query
  Frame  MatID  Length  Eff.Length     E    S  W   T   X    E2    S2
   +3      0      238       237       10.  78  3  12  22   0.11   35
                                                      32   0.10   39
   +2      0      238       237       10.  78  3  12  22   0.11   35
                                                      32   0.10   39
   +1      0      239       239       10.  78  3  12  22   0.12   35
                                                      32   0.10   39
   -1      0      239       238       10.  78  3  12  22   0.11   35
                                                      32   0.10   39
   -2      0      238       238       10.  78  3  12  22   0.11   35
                                                      32   0.10   39
   -3      0      238       237       10.  78  3  12  22   0.11   35
                                                      32   0.10   39

Statistics:

  Database:  /usr/local/dot5/sl_home/beauty/seqdb/blast/nr
    Title:  nr
    Release date:  unknown
    Posted date:  4:06 PM CST Feb 28, 2001
    Format:  BLAST
  # of letters in database:  197,782,623
  # of sequences in database:  625,274
  # of database sequences satisfying E:  4
  No. of states in DFA:  598 (59 KB)
  Total size of DFA:  280 KB (320 KB)
  Time to generate neighborhood:  0.01u 0.01s 0.02t  Elapsed: 00:00:00
  No. of threads or processors used:  6
  Search cpu time:  260.43u 1.30s 261.73t  Elapsed: 00:01:00
  Total cpu time:  260.48u 1.32s 261.80t  Elapsed: 00:01:00
  Start:  Fri Jan 18 12:05:20 2002   End:  Fri Jan 18 12:06:20 2002
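The scores in this report can be cross-checked with a small Python sketch. Two relations are assumed here and are inferred from the numbers above rather than taken from WU-BLAST documentation: the bit score appears to equal Lambda * S / ln 2 with the gapped "As Used" Lambda of 0.244 (the Q=9,R=2 rows), and P appears to follow the usual Poisson relation P = 1 - exp(-E). The function names `bit_score` and `p_from_expect` are hypothetical.

```python
import math

# Gapped "As Used" Lambda from the Q=9,R=2 rows of the parameter table.
GAPPED_LAMBDA = 0.244


def bit_score(raw_score, lam=GAPPED_LAMBDA):
    """Convert a raw alignment score to bits (assumed: lam * S / ln 2)."""
    return lam * raw_score / math.log(2)


def p_from_expect(expect):
    """Probability of at least one chance hit, given the Expect value."""
    # expm1 keeps precision when expect is tiny (P is then ~= E).
    return -math.expm1(-expect)


if __name__ == "__main__":
    for raw in (871, 810, 777, 490, 66):
        print(raw, round(bit_score(raw), 1))
    print(round(p_from_expect(3.3), 2))
```

Under these assumptions the five raw scores reproduce the reported 306.6, 285.1, 273.5, 172.5 and 23.2 bits, and Expect = 3.3 gives P = 0.96 for the small proline-rich protein hit. For the three Arabidopsis hits E is so small that P and E are indistinguishable.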
Annotated Domains Database: March 14, 2000
Release Date: March 14, 2000