WU-BLAST 2.0 search of the National Center for Biotechnology Information's NR Protein Database.
BEAUTY post-processing provided by the Human Genome Sequencing Center, Baylor College of Medicine.
BEAUTY Reference:
Worley KC, Culpepper P, Wiese BA, Smith RF. BEAUTY-X: enhanced BLAST searches for DNA queries. Bioinformatics 1998;14(10):890-1.
Worley KC, Wiese BA, Smith RF. BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results. Genome Res 1995 Sep;5(2):173-84.
RepeatMasker repeats found in sequence: No Repeats Found.
Reference: Gish, Warren (1994-1997). unpublished.
Gish, Warren and David J. States (1993). Identification of protein coding regions by database similarity search. Nat. Genet. 3:266-72.
Notice: statistical significance is estimated under the assumption that the equivalent of one entire reading frame in the query sequence codes for protein and that significant alignments will involve only coding reading frames.
Query= A05C06_CONSENSUS (632 letters)
Translating both strands of query sequence in all 6 reading frames

Database: nr
          625,274 sequences; 197,782,623 total letters.

Observed Numbers of Database Sequences Satisfying
Various EXPECTation Thresholds (E parameter values)

Histogram units: "=" is 9 sequences, ":" is less than 9 sequences

 EXPECTation Threshold
 (E parameter)
    |
    V   Observed Counts-->
  10000  2762   507 |========================================================
   6310  2255   412 |=============================================
   3980  1843   379 |==========================================
   2510  1464   428 |===============================================
   1580  1036   287 |===============================
   1000   749   253 |============================
    631   496   146 |================
    398   350    88 |=========
    251   262    81 |=========
    158   181    59 |======
    100   122    38 |====
   63.1    84    33 |===
   39.8    51    21 |==
   25.1    30     9 |=
   15.8    21     7 |:
 >>>>>>>>>>>>>>>>>>>>>  Expect = 10.0, Observed = 14  <<<<<<<<<<<<<<<<<
   10.0    14     1 |:
   6.31    13     6 |:
   3.98     7     2 |:
   2.51     5     1 |:
   1.58     4     0 |
   1.00     4     3 |:
   0.63     1     0 |
   0.40     1     1 |:

                                                         Reading  High  Smallest Sum
Sequences producing High-scoring Segment Pairs:            Frame Score  Probability P(N)  N

gi|330179|gb|AAA45802.1| (M74421) ORF2 [human herpesvi...     +1    77  0.24              1
gi|3914142|sp|P93329|NO20_MEDTR EARLY NODULIN 20 PRECU...     +1    90  0.50              1
gi|1066104|gb|AAC13878.1| (U39735) high molecular weig...     +3    91  0.53              1
gi|100212|pir||S14976 extensin class II (clones u1/u2)...     +1    70  0.57              1
gi|9827832|emb|CAB97921.1| (AL160493) possible extensi...     +1    90  0.84              1
gi|7298643|gb|AAF53859.1| (AE003664) CG16772 gene prod...     +3    85  0.97              1
gi|11862955|dbj|BAB19336.1| (AP003044) hypothetical pr...     +1    80  0.98              1
gi|2126663|pir||S60811 M protein precursor - Streptoco...     +3    68  0.99              1
gi|126466|sp|P17589|LRP2_HSV1F LATENCY-RELATED PROTEIN...     +1    72  0.99              1
gi|11034580|dbj|BAB17104.1| (AP002866) contains ESTs C...     +3    82  0.995             1
gi|10436960|dbj|BAB14941.1| (AK024637) unnamed protein...     +1    77  0.995             1
gi|533560|gb|AAA99557.1| (U11941) emml gene product [S...     +3    68  0.997             1
gi|12018147|gb|AAG45420.1|AF309494_1 (AF309494) vegeta...     +1    86  0.997             1
gi|3421391|gb|AAC32195.1| (AF082158) putative cysteine...     +3    66  0.99993           2

Locally-aligned regions (HSPs) with respect to query sequence
[HSP position diagram; query scale 0 to 211 residues]

  Frame 3 hits: gi|1066104, gi|7298643, gi|2126663, gi|11034580, gi|533560, gi|3421391
  Frame 2 hits: gi|3421391
  Frame 1 hits: gi|330179, gi|3914142, gi|100212, gi|9827832, gi|11862955, gi|126466, gi|10436960, gi|12018147
>gi|330179|gb|AAA45802.1| (M74421) ORF2 [human herpesvirus 1]
        Length = 107

  Plus Strand HSPs:

 Score = 77 (27.1 bits), Expect = 0.28, P = 0.24
 Identities = 26/88 (29%), Positives = 38/88 (43%), Frame = +1

Query:   109 TAQHSISKPSLFSPSATLRHGPAPQPPTSPRARHQRKQFPPKTAT-QQGKQNESPNRLPP 285
             T  HS + P   +P+ T  H  AP  P +P   H     PP++   +QGK  +     P
Sbjct:    12 THPHSHAPPLPRTPTPTHPHSHAPPLPRTPTPTHPHSHAPPRSIQHRQGKDTKVNLYFPI 71

Query:   286 PSXSVASYG-APTXLVRSPI*KP*AMVFP 369
              S +  S+   PT   R  +  P +  FP
Sbjct:    72 DSKNPLSFLLGPTLKTRWCV-VPVSFTFP 99

>gi|3914142|sp|P93329|NO20_MEDTR EARLY NODULIN 20 PRECURSOR (N-20)
 >gi|1771351|emb|CAA67830.1| (X99467) ENOD20 [Medicago truncatula]
        Length = 268

  Annotated Domains:
    Entrez Domain: PLASTOCYANIN-LIKE.            23..268
    Entrez Domain: POLY-PRO.                     136..145
    PRODOM PD065375: NO20_MEDTR                  1..25
    PRODOM PD003122:                             27..126
    PRODOM PD000540: H1(18) O76786(11) TONB(10)  135..240

  Plus Strand HSPs:

 Score = 90 (31.7 bits), Expect = 0.69, P = 0.50
 Identities = 28/77 (36%), Positives = 39/77 (50%), Frame = +1

Query:   100 IKRTAQHSISKPSLFSPSATLRHGPAPQPPTSPRARHQRKQFP--PKTATQQGKQ---NE 264
             I    + S+  P   SPS +    P+P P ++P   H RK+ P  P  +    K    +E
Sbjct:   151 IPHPPRRSLPSPPSPSPSPSPSPSPSPSPRSTP-IPHPRKRSPASPSPSPSLSKSPSPSE 209

Query:   265 SPNRLPPPSXSVASYGAPT 321
             SP+  P PS SVAS  AP+
Sbjct:   210 SPSLAPSPSDSVASL-APS 227

>gi|1066104|gb|AAC13878.1| (U39735) high molecular weight basic nuclear protein [Pleuronectes americanus]
        Length = 326

  Plus Strand HSPs:

 Score = 91 (32.0 bits), Expect = 0.76, P = 0.53
 Identities = 42/161 (26%), Positives = 70/161 (43%), Frame = +3

Query:    12 SETRTPPPQTLSHTNCGVSRNPRRFSFPHHKTHSTAQHLQTFPFLSFRYSAPWPSAAAPN 191
             S  R+ PP+    T    +++PRR   P  +  +     ++ P  S R     PS +
Sbjct:   149 SPKRSNPPKRSVKTPKTRAKSPRRSKSPKRRVQTPKMRAKS-PMRS-RKRPRSPSRSTSP 206

Query:   192 LSQSQTPKETIPTKNRNTTRETERVAESPSAAESQRGELRSSXXPRTESNLKTLGHGVSS 371
             + +SQ+PK  +  K +  T ++   + SPS ++S +   RS   P+T +
Sbjct:   207 M-RSQSPKRRV--KRQKMTAKSLMRSRSPSRSKSPK---RSVKTPKTRAKSPRRSKSPKR 260

Query:   372 SVENPCCAVSD*RFARGALPHLIRVPKPKVRRPXDLLKRKL 494
              V+ P   V   +  R   P   R+PKPK R P    KR++
Sbjct:   261 RVQTPKRRVQTPK-RRVQTPKS-RIPKPKRRVPTP--KRRV 297

>gi|100212|pir||S14976 extensin class II (clones u1/u2) - tomato
 >gi|1345539|emb|CAA39217.1| (X55687) extensin (class II) [Lycopersicon esculentum]
        Length = 75

  Annotated Domains:
    DOMO DM01369: PROLINE-RICHPROTEIN  1..65

  Plus Strand HSPs:

 Score = 70 (24.6 bits), Expect = 0.85, P = 0.57
 Identities = 14/40 (35%), Positives = 21/40 (52%), Frame = +1

Query:   172 PAPQPPTSPRARHQRKQFPPKTATQQGKQNESPNRLPPPS 291
             P+P PPT P   H + Q PP   T   +  ++P+   PP+
Sbjct:     4 PSPPPPT-PSYEHPQPQSPPPPPTPSYEHPKTPSHPTPPT 42

>gi|9827832|emb|CAB97921.1| (AL160493) possible extensin class ii precursor(cell wall hydroxyproline-rich glycoprotein) [Leishmania major]
        Length = 508

  Plus Strand HSPs:

 Score = 90 (31.7 bits), Expect = 1.8, P = 0.84
 Identities = 27/80 (33%), Positives = 35/80 (43%), Frame = +1

Query:    67 RETRDVSLFHTIKRTAQ-HSISKPSLF--SPSATLRHGPAPQPPTSPRARHQRKQFPPKT 237
             R T   SL H + R  + H   +P     +P A  + G  P  P +P A HQR Q PP+
Sbjct:   123 RSTSAASLLHVLVRLLRXHQRGQPPPRPRTPPAEHQRGKPPPRPRTPPAEHQRGQPPPRP 182

Query:   238 ATQQGKQNESPNRLPPPSXSVAS 306
              T   +        PPP    AS
Sbjct:   183 RTPPAEHQRGQ---PPPRPRYAS 202

>gi|7298643|gb|AAF53859.1| (AE003664) CG16772 gene product [Drosophila melanogaster]
        Length = 311

  Plus Strand HSPs:

 Score = 85 (29.9 bits), Expect = 3.5, P = 0.97
 Identities = 28/89 (31%), Positives = 42/89 (47%), Frame = +3

Query:    15 ETRTPPPQTLSHTNCGVSR-NPRRFSFPHHKTHSTAQHLQTFPFLSFRYSAPWP----SA 179
             E+ TP P+T + T    +   P +   PH   HS   H   +P+   +Y  P P    SA
Sbjct:   116 ESTTPEPETTTPTTTTTTTPKPHKHE-PHFHPHSYP-HPHPYPY-PIQYPYPHPGVIFSA 172

Query:   180 AAPNLSQSQTPKETIPTKNRNTTRETERVAESPS 281
             A P L    TP  T P   ++  +E+  ++  PS
Sbjct:   173 AGPKL----TPPSTTPPTAKDADKESPELSGYPS 202

>gi|11862955|dbj|BAB19336.1| (AP003044) hypothetical protein [Oryza sativa]
        Length = 167

  Plus Strand HSPs:

 Score = 80 (28.2 bits), Expect = 3.7, P = 0.98
 Identities = 18/48 (37%), Positives = 21/48 (43%), Frame = +1

Query:   163 RHGPAPQPPTSPRARHQRKQFPPKTATQQGKQNESPNRLPPPSXSVAS 306
             RH     PP + R  H R   PP TA Q       P   PP + +VAS
Sbjct:    67 RHADIGNPPPTHRRPHHRACLPPCTAAQAAFATAPPPHKPPDAAAVAS 114

>gi|2126663|pir||S60811 M protein precursor - Streptococcus pyogenes (serotype M36) (fragment)
        Length = 96

  Annotated Domains:
    PROSITE LEUCINE_ZIPPER: Leucine zipper pattern.  53..74
    PROSITE LEUCINE_ZIPPER: Leucine zipper pattern.  60..81
    PROSITE LEUCINE_ZIPPER: Leucine zipper pattern.  67..88
    PROSITE LEUCINE_ZIPPER: Leucine zipper pattern.  74..95

  Plus Strand HSPs:

 Score = 68 (23.9 bits), Expect = 4.4, P = 0.99
 Identities = 17/52 (32%), Positives = 27/52 (51%), Frame = +3

Query:   195 SQSQTPKETI---PTKNRNTTRETERVAESPSAAESQRGELRSSXXPRTESN 341
             S S+T ++TI     KN N T+E E++ E      S++ +L +     TE N
Sbjct:    34 SNSETARQTINDYEIKNHNLTQENEKLTEQNKELTSEKEKLTTDNGRLTEQN 85

>gi|126466|sp|P17589|LRP2_HSV1F LATENCY-RELATED PROTEIN 2
 >gi|74038|pir||WMBEL2 latency-related protein 2 - human herpesvirus 1 (strain F)
 >gi|330135|gb|AAA45800.1| (J04323) latency-related protein 2 [human herpesvirus 1]
        Length = 107

  Annotated Domains:
    Entrez Domain: 3 X 17 AA TANDEM REPEATS.  2..49
    Entrez Repetitive region: 1.              2..17
    Entrez Repetitive region: 2.              18..33
    Entrez Repetitive region: 3.              34..49
    PRODOM PD015574:                          1..106

  Plus Strand HSPs:

 Score = 72 (25.3 bits), Expect = 4.6, P = 0.99
 Identities = 25/88 (28%), Positives = 37/88 (42%), Frame = +1

Query:   109 TAQHSISKPSLFSPSATLRHGPAPQPPTSPRARHQRKQFPPKTAT-QQGKQNESPNRLPP 285
             T  HS + P   +P+    H  AP  P +P   H     PP++   +QGK  +     P
Sbjct:    12 THPHSHAPPLPRTPTPAHPHSHAPPLPRTPTPTHPHSHAPPRSIQHRQGKDTKVNLYFPI 71

Query:   286 PSXSVASYG-APTXLVRSPI*KP*AMVFP 369
              S +  S+   PT   R  +  P +  FP
Sbjct:    72 DSKNPLSFLLGPTLKTRWCV-VPVSFTFP 99

>gi|11034580|dbj|BAB17104.1| (AP002866) contains ESTs C26436(C12323),AU082913(C3020),D23568(C3020)~similar to Arabidopsis thaliana chromosome 3, F21M11.17~unknown protein [Oryza sativa]
        Length = 245

  Plus Strand HSPs:

 Score = 82 (28.9 bits), Expect = 5.2, P = 0.99
 Identities = 20/76 (26%), Positives = 34/76 (44%), Frame = +3

Query:    27 PPPQTLSHTNCGVSRNPRRFSFPHHKTHSTAQHLQTFPFLSFRYSAPWPSAAAPNLSQSQ 206
             PPP  +S T+ G++ +P +  FP     +      T PF +  + +     AA N++ S
Sbjct:   153 PPPSPVSPTDSGIAASPFKAEFPSQDQPAADTGADTTPFKA-EFPSSHEQPAADNVASSP 211

Query:   207 TPKETIPTKNRNTTRE 254
              PK     + + T  E
Sbjct:   212 PPKAEAAPQEQPTAAE 227

>gi|10436960|dbj|BAB14941.1| (AK024637) unnamed protein product [Homo sapiens]
        Length = 143

  Plus Strand HSPs:

 Score = 77 (27.1 bits), Expect = 5.3, P = 0.99
 Identities = 24/67 (35%), Positives = 32/67 (47%), Frame = +1

Query:   118 HSISKPSLFSPSATLRHGPAPQPPTS-PR-ARHQRKQFPPKTATQQGKQNESPNRLPPPS 291
             H   KP L +P A+L  GP P PP S P  A        P+     G Q  +P+  P P
Sbjct:    45 HLPGKPGLGTPCASLTLGP-PTPPASMPNLAEATLADVMPRKDEHMGHQFLTPDEAPSPP 103

Query:   292 XSVASYGAP 318
               +A+ G+P
Sbjct:   104 RLLAA-GSP 111

>gi|533560|gb|AAA99557.1| (U11941) emml gene product [Streptococcus pyogenes]
        Length = 97

  Plus Strand HSPs:

 Score = 68 (23.9 bits), Expect = 5.8, P = 1.0
 Identities = 17/52 (32%), Positives = 27/52 (51%), Frame = +3

Query:   195 SQSQTPKETI---PTKNRNTTRETERVAESPSAAESQRGELRSSXXPRTESN 341
             S S+T ++TI     KN N T+E E++ E      S++ +L +     TE N
Sbjct:    34 SNSETARQTINDYEIKNHNLTQENEKLTEQNKELTSEKEKLTTDNGRLTEQN 85

>gi|12018147|gb|AAG45420.1|AF309494_1 (AF309494) vegetative cell wall protein gp1 [Chlamydomonas reinhardtii]
        Length = 555

  Plus Strand HSPs:

 Score = 86 (30.3 bits), Expect = 5.9, P = 1.0
 Identities = 37/117 (31%), Positives = 48/117 (41%), Frame = +1

Query:   145 SPSATLRHGPAPQPPTSPRARHQRKQFPPKTATQQGKQNESPNRLPPPSXSVASYGAPT- 321
             SP + +   PAP PP SP         PP  A      + SP+  P PS S +   +P+
Sbjct:   309 SPPSPVPPSPAPVPP-SPAPPSPAPSPPPSPAPPTPSPSPSPSPSPSPSPSPSPSPSPSP 367

Query:   322 XLVRSPI*KP*AMVFPHRLKTLAVQSPTDDLQGELYRT*FGCXSRKXGDRXISX---KGN 492
               + SP  KP       +L         DDL G   R   G  SR  G+  I+    KGN
Sbjct:   368 SPIPSPSPKPSPSPVAVKLVWADDAIAFDDLNGTSTRP--GSASRMVGEPDIAGTKCKGN 425

>gi|3421391|gb|AAC32195.1| (AF082158) putative cysteine synthase [Arabidopsis thaliana]
        Length = 141

  Plus Strand HSPs:

 Score = 66 (23.2 bits), Expect = 9.5, Sum P(2) = 1.0
 Identities = 29/108 (26%), Positives = 49/108 (45%), Frame = +3

Query:    84 FSFPHHKTHSTAQHLQTFPFLSFRYSAPWPSAAAPNLSQSQTPKETIPTKNRNTTRETER 263
             FSF H  + S A  ++T P  SF   A    ++    S+S+T ++  P     T  E +
Sbjct:    34 FSFHHDSSSSLA--VRT-PVSSFVVGAISGKSSTGTKSKSKTKRKPPPPPPVTTVAEEQH 90

Query:   264 VAESPSAAESQRGELRSSXXPRTESNLKTLG--HGVSSSVEN--PCCAVSD 404
             +AES +   ++         P    N  T G    +++ +E+  PC +V D
Sbjct:    91 IAESETVNIAEDVTQLIGSTPMVYLNRVTDGCLADIAAKLESMEPCRSVKD 141

 Score = 40 (14.1 bits), Expect = 9.5, Sum P(2) = 1.0
 Identities = 9/22 (40%), Positives = 13/22 (59%), Frame = +2

Query:    29 SPANSLSHKLWRFAKPETFLFS 94
             SP   ++ KL RF+  +  LFS
Sbjct:    14 SPLGRITSKLHRFSTAKLSLFS 35

Parameters:
  filter=none
  matrix=BLOSUM62
  V=50
  B=50
  E=10
  gi
  H=1
  sort_by_pvalue
  echofilter

  ctxfactor=5.94

  Query                        -----  As Used  -----    -----  Computed  ----
  Frame  MatID Matrix name     Lambda    K       H      Lambda    K       H
   Std.    0   BLOSUM62                                 0.318   0.135   0.401
   +3      0   BLOSUM62        0.318   0.135   0.401    0.319   0.130   0.402
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +2      0   BLOSUM62        0.318   0.135   0.401    0.324   0.136   0.436
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +1      0   BLOSUM62        0.318   0.135   0.401    0.334   0.142   0.444
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -1      0   BLOSUM62        0.318   0.135   0.401    0.352   0.158   0.590
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -2      0   BLOSUM62        0.318   0.135   0.401    0.334   0.145   0.499
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -3      0   BLOSUM62        0.318   0.135   0.401    0.326   0.140   0.461
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a

  Query
  Frame  MatID  Length  Eff.Length    E     S   W   T   X   E2      S2
   +3      0      210      198       10.   76   3  12  22  0.093    35
                                                       31  0.10     38
   +2      0      210      196       10.   76   3  12  22  0.092    35
                                                       31  0.10     38
   +1      0      210      195       10.   76   3  12  22  0.091    35
                                                       31  0.10     38
   -1      0      210      194       10.   76   3  12  22  0.091    35
                                                       31  0.10     38
   -2      0      210      197       10.   76   3  12  22  0.092    35
                                                       31  0.10     38
   -3      0      210      197       10.   76   3  12  22  0.092    35
                                                       31  0.10     38

Statistics:

  Database:  /usr/local/dot5/sl_home/beauty/seqdb/blast/nr
    Title:  nr
    Release date:  unknown
    Posted date:  4:06 PM CST Feb 28, 2001
    Format:  BLAST
  # of letters in database:  197,782,623
  # of sequences in database:  625,274
  # of database sequences satisfying E:  14
  No. of states in DFA:  592 (58 KB)
  Total size of DFA:  216 KB (256 KB)
  Time to generate neighborhood:  0.00u 0.01s 0.01t  Elapsed: 00:00:00
  No. of threads or processors used:  6
  Search cpu time:  221.31u 0.96s 222.27t  Elapsed: 00:00:46
  Total cpu time:  221.34u 0.97s 222.31t  Elapsed: 00:00:46
  Start:  Mon Oct 1 19:27:42 2001   End:  Mon Oct 1 19:28:28 2001
Annotated Domains Database: March 14, 2000
Release Date: March 14, 2000
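
The Expect, P and bit-score values printed for each HSP in this report can be cross-checked with a short script. The sketch below rests on two assumptions that the report itself does not state: that P is the Poisson probability 1 - exp(-E) of seeing at least one chance hit, and that bit scores were computed as lambda * S / ln 2 using the Q=9,R=2 lambda of 0.244 from the Parameters table. Both reproduce the printed numbers for the hits checked here, but they are inferred from the numbers, not taken from WU-BLAST documentation. The function names are illustrative only.

```python
import math

# Assumption: lambda for the Q=9,R=2 row of the Parameters table.
ASSUMED_LAMBDA = 0.244


def p_from_expect(expect: float) -> float:
    """Poisson probability of at least one chance hit, given expectation E."""
    return 1.0 - math.exp(-expect)


def bits_from_raw(score: int, lam: float = ASSUMED_LAMBDA) -> float:
    """Raw score to bits as lambda * S / ln 2 (assumed form, no K term)."""
    return lam * score / math.log(2)


# (raw score, Expect) pairs copied from the first three HSPs in the report.
hsps = [(77, 0.28), (90, 0.69), (91, 0.76)]

for raw, expect in hsps:
    bits = round(bits_from_raw(raw), 1)
    p = round(p_from_expect(expect), 2)
    print(f"Score = {raw} ({bits} bits), Expect = {expect}, P = {p}")
```

Run against the three HSPs listed, this gives 27.1, 31.7 and 32.0 bits and P values of 0.24, 0.5 and 0.53, which agree with the report's figures to the printed precision.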