WU-BLAST 2.0 search of the National Center for Biotechnology Information's NR Protein Database.
BEAUTY post-processing provided by the Human Genome Sequencing Center, Baylor College of Medicine.
BEAUTY Reference:
Worley KC, Culpepper P, Wiese BA, Smith RF. BEAUTY-X: enhanced BLAST searches for DNA queries. Bioinformatics 1998;14(10):890-1.
Worley KC, Wiese BA, Smith RF. BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results. Genome Res 1995 Sep;5(2):173-84.
RepeatMasker repeats found in sequence: No Repeats Found.

Reference: Gish, Warren (1994-1997). unpublished.
Gish, Warren and David J. States (1993). Identification of protein coding regions by database similarity search. Nat. Genet. 3:266-72.

Notice: statistical significance is estimated under the assumption that the equivalent of one entire reading frame in the query sequence codes for protein and that significant alignments will involve only coding reading frames.

Query= 'E06F07_L07_12.ab1' (585 letters)
Translating both strands of query sequence in all 6 reading frames

Database: nr
          625,274 sequences; 197,782,623 total letters.

     Observed Numbers of Database Sequences Satisfying
    Various EXPECTation Thresholds (E parameter values)

    Histogram units:   = 6 Sequences   : less than 6 sequences

 EXPECTation Threshold
 (E parameter)
    |
    V   Observed Counts-->
  10000 1900  370 |=============================================================
   6310 1530  323 |=====================================================
   3980 1207  297 |=================================================
   2510  910  209 |==================================
   1580  701  208 |==================================
   1000  493  134 |======================
    631  359   91 |===============
    398  268   92 |===============
    251  176   51 |========
    158  125   38 |======
    100   87   31 |=====
   63.1   56   11 |=
   39.8   45   19 |===
   25.1   26    7 |=
   15.8   19    4 |:
 >>>>>>>>>>>>>>>>>>>>>  Expect = 10.0, Observed = 15  <<<<<<<<<<<<<<<<<
   10.0   15    2 |:
   6.31   13    0 |
   3.98   13    3 |:
   2.51   10    0 |
   1.58   10    0 |
   1.00   10    0 |
   0.63   10    0 |
   0.40   10    0 |
   0.25   10    0 |
   0.16   10    1 |:

                                                                      Smallest
                                                                        Sum
                                                      Reading  High  Probability
Sequences producing High-scoring Segment Pairs:         Frame Score  P(N)      N

gi|5103825|gb|AAD39655.1|AC007591_20 (AC007591) ESTs g...  +1   290  1.4e-24   1
gi|9294450|dbj|BAB02669.1| (AB012247) gene_id:MSL1.7~u...  +1   244  1.0e-19   1
gi|7511888|pir||T13381 hypothetical protein 115C2.12 -...  +1   194  2.1e-14   1
gi|7705431ref|NP_057017.1| hypothetical protein [Homo...   +1   185  1.9e-13   1
gi|12730068ref|XP_003302.2| hypothetical protein [Hom...   +1   185  1.9e-13   1
gi|7298038|gb|AAF53279.1| (AE003639) CG16824 gene prod...  +1   182  3.9e-13   1
gi|6841310|gb|AAF29008.1|AF161448_1 (AF161448) HSPC330...  +1   174  2.7e-12   1
gi|7503800|pir||T22415 hypothetical protein F49C12.11 ...  +1   169  9.2e-12   1
gi|6323292ref|NP_013364.1| Ylr262c-ap [Saccharomyces ...   +1   135  3.7e-08   1
gi|4494004|emb|CAB39063.1| (AL034559) F49C12.11-like p...  +1    78  0.10      1
gi|7486459|pir||T02461 hypothetical protein F4I18.16 -...  +1    65  0.95      1
gi|2183055|gb|AAB64259.1| (U97509) cytochrome oxidase ...  +3    70  0.96      1
gi|1042189|gb|AAB34942.1| T2=testis-specific pro-prota...  +2    64  0.98      1
gi|10177250|dbj|BAB10718.1| (AB007644) gene_id:K19P17....  +1    61  0.9998    1
gi|3334471|sp|P46514|LE10_HELAN 10 KDA LATE EMBRYOGENE...  +1    65  0.9999    1

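For downstream processing, each row of the hit summary above can be read from the right-hand side: the last four whitespace-separated fields are the reading frame, the high score, P(N), and N, and everything before them is the (possibly truncated) description. The following is a minimal sketch under that assumption; the `Hit` type and `parse_hit_row` function are illustrative names, not part of WU-BLAST or BEAUTY.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One row of the hit summary (hypothetical container for illustration)."""
    description: str
    frame: int
    score: int
    p_value: float
    n: int


def parse_hit_row(row: str) -> Hit:
    # Split from the right so that spaces inside the description are preserved.
    description, frame, score, p_value, n = row.rsplit(None, 4)
    # int() accepts a leading sign, so "+1" parses as 1 and "-2" as -2.
    return Hit(description, int(frame), int(score), float(p_value), int(n))


# Two rows copied from the summary above.
rows = [
    "gi|5103825|gb|AAD39655.1|AC007591_20(AC007591) ESTs g... +1 290 1.4e-24 1",
    "gi|2183055|gb|AAB64259.1|(U97509) cytochrome oxidase ... +3 70 0.96 1",
]
hits = [parse_hit_row(r) for r in rows]

for hit in hits:
    print(hit.frame, hit.score, hit.p_value, hit.description)
```

Splitting from the right avoids any dependence on column widths, which vary with the length of the sequence identifier.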
>gi|5103825|gb|AAD39655.1|AC007591_20 (AC007591) ESTs gb|AA650895, gb|AA720043 and gb|R29777 come from this gene. [Arabidopsis thaliana]
>gi|12484215|gb|AAG54006.1|AF336925_1 (AF336925) unknown protein [Arabidopsis thaliana]
        Length = 64

  Plus Strand HSPs:

 Score = 290 (102.1 bits), Expect = 1.4e-24, P = 1.4e-24
 Identities = 55/64 (85%), Positives = 62/64 (96%), Frame = +1

Query:   133 MSTKQGGKAKPLKKPKSDKKDYDEVDMANIQKKKEEEKALKELKAKAQQKGSFGGSGLKK 312
             MS+KQGGKAKPLK+PK+DKK+YDE D+ANIQKKK+EEKALKEL+AKA QKGSFGGSGLKK
Sbjct:     1 MSSKQGGKAKPLKQPKADKKEYDETDLANIQKKKDEEKALKELRAKASQKGSFGGSGLKK 60

Query:   313 SGKK 324
             SGKK
Sbjct:    61 SGKK 64

>gi|9294450|dbj|BAB02669.1| (AB012247) gene_id:MSL1.7~unknown protein [Arabidopsis thaliana]
        Length = 62

  Plus Strand HSPs:

 Score = 244 (85.9 bits), Expect = 1.0e-19, P = 1.0e-19
 Identities = 48/59 (81%), Positives = 53/59 (89%), Frame = +1

Query:   148 GGKAKPLKKPKSDKKDYDEVDMANIQKKKEEEKALKELKAKAQQKGSFGGSGLKKSGKK 324
             GGK KPLK+PKS KK+YDE DM  +QKKK+EEKALKEL+AKA QKGSFGGSGLKKSGKK
Sbjct:     4 GGKLKPLKQPKSGKKEYDEHDMELMQKKKDEEKALKELRAKASQKGSFGGSGLKKSGKK 62

>gi|7511888|pir||T13381 hypothetical protein 115C2.12 - fruit fly (Drosophila melanogaster)
>gi|4688661|emb|CAB41345.1| (AL031581) /prediction=(method:"genefinder", version:"084", score:"18.93")~/prediction=(method:"genscan", version:"1.0", score:"25.85")~/match=(desc:"F49C12.11 PROTEIN", species:"CAENORHABDITIS ELEGANS", ranges:(query:3059..3250, target>
>gi|7290062|gb|AAF45528.1| (AE003418) CG13364 gene product [Drosophila melanogaster]
        Length = 64

  Plus Strand HSPs:

 Score = 194 (68.3 bits), Expect = 2.1e-14, P = 2.1e-14
 Identities = 39/64 (60%), Positives = 47/64 (73%), Frame = +1

Query:   133 MSTKQGGKAKPLKKPKSDKKDYDEVDMANIQKKKEEEKALKELKAKAQQKGSFGGSGLKK 312
             MS ++GGK KPLK PK D KD DE DMA  QK+KE++KAL+  KA A +KG   G G+KK
Sbjct:     1 MSGREGGKKKPLKAPKKDSKDLDEDDMAFKQKQKEQQKALEAAKANASKKGPLVGGGIKK 60

Query:   313 SGKK 324
             SGKK
Sbjct:    61 SGKK 64

>gi|7705431 ref|NP_057017.1| hypothetical protein [Homo sapiens]
>gi|4679018|gb|AAD26997.1| (AF077202) HSPC016 [Homo sapiens]
>gi|12654537|gb|AAH01102.1|AAH01102 (BC001102) hypothetical protein [Homo sapiens]
        Length = 64

  Plus Strand HSPs:

 Score = 185 (65.1 bits), Expect = 1.9e-13, P = 1.9e-13
 Identities = 38/64 (59%), Positives = 46/64 (71%), Frame = +1

Query:   133 MSTKQGGKAKPLKKPKSDKKDYDEVDMANIQKKKEEEKALKELKAKAQQKGSFGGSGLKK 312
             MS ++GGK KPLK+PK   K+ DE D A  QK+KEE+K L+ELKAKA  KG     G+KK
Sbjct:     1 MSGREGGKKKPLKQPKKQAKEMDEEDKAFKQKQKEEQKKLEELKAKAAGKGPLATGGIKK 60

Query:   313 SGKK 324
             SGKK
Sbjct:    61 SGKK 64

>gi|12730068 ref|XP_003302.2| hypothetical protein [Homo sapiens]
        Length = 76

  Plus Strand HSPs:

 Score = 185 (65.1 bits), Expect = 1.9e-13, P = 1.9e-13
 Identities = 38/64 (59%), Positives = 46/64 (71%), Frame = +1

Query:   133 MSTKQGGKAKPLKKPKSDKKDYDEVDMANIQKKKEEEKALKELKAKAQQKGSFGGSGLKK 312
             MS ++GGK KPLK+PK   K+ DE D A  QK+KEE+K L+ELKAKA  KG     G+KK
Sbjct:    13 MSGREGGKKKPLKQPKKQAKEMDEEDKAFKQKQKEEQKKLEELKAKAAGKGPLATGGIKK 72

Query:   313 SGKK 324
             SGKK
Sbjct:    73 SGKK 76

>gi|7298038|gb|AAF53279.1| (AE003639) CG16824 gene product [Drosophila melanogaster]
        Length = 64

  Plus Strand HSPs:

 Score = 182 (64.1 bits), Expect = 3.9e-13, P = 3.9e-13
 Identities = 36/64 (56%), Positives = 47/64 (73%), Frame = +1

Query:   133 MSTKQGGKAKPLKKPKSDKKDYDEVDMANIQKKKEEEKALKELKAKAQQKGSFGGSGLKK 312
             M+ ++GGK KPLK PK D K+ DE DMA  QK+KE++KA++  KA A +KG   G G+KK
Sbjct:     1 MAGREGGKKKPLKAPKKDSKNLDEEDMAFKQKQKEQQKAMEAAKAGASKKGPLLGGGIKK 60

Query:   313 SGKK 324
             SGKK
Sbjct:    61 SGKK 64

>gi|6841310|gb|AAF29008.1|AF161448_1 (AF161448) HSPC330 [Homo sapiens]
        Length = 59

  Plus Strand HSPs:

 Score = 174 (61.3 bits), Expect = 2.7e-12, P = 2.7e-12
 Identities = 36/59 (61%), Positives = 42/59 (71%), Frame = +1

Query:   148 GGKAKPLKKPKSDKKDYDEVDMANIQKKKEEEKALKELKAKAQQKGSFGGSGLKKSGKK 324
             GGK KPLK+PK   K+ DE D A  QK+KEE+K L+ELKAKA  KG     G+KKSGKK
Sbjct:     1 GGKKKPLKQPKKQAKEMDEEDKAFKQKQKEEQKKLEELKAKAAGKGPLATGGIKKSGKK 59

>gi|7503800|pir||T22415 hypothetical protein F49C12.11 - Caenorhabditis elegans
>gi|3877368|emb|CAA92514.1| (Z68227) cDNA EST EMBL:T01259 comes from this gene~cDNA EST yk67d3.3 comes from this gene~cDNA EST yk67d3.5 comes from this gene~cDNA EST yk136h3.3 comes from this gene~cDNA EST yk136h3.5 comes from this gene~cDNA EST yk482b3.3 comes from this gene~cDN>
        Length = 64

  Plus Strand HSPs:

 Score = 169 (59.5 bits), Expect = 9.2e-12, P = 9.2e-12
 Identities = 35/64 (54%), Positives = 45/64 (70%), Frame = +1

Query:   133 MSTKQGGKAKPLKKPKSDKKDYDEVDMANIQKKKEEEKALKELKAKAQQKGSFGGSGLKK 312
             MS +QGGKAKPLK  K  +KD  E D+   +K++EE K +KE+ AKA Q+G   G G+KK
Sbjct:     1 MSGRQGGKAKPLKAAKKTEKDLSEEDVEFKKKQQEEAKKIKEMAAKAGQRGPLLGGGIKK 60

Query:   313 SGKK 324
             SGKK
Sbjct:    61 SGKK 64

>gi|6323292 ref|NP_013364.1| Ylr262c-ap [Saccharomyces cerevisiae]
>gi|2131800|pir||S72200 hypothetical protein YLR262c-a - yeast (Saccharomyces cerevisiae)
        Length = 64

  Plus Strand HSPs:

 Score = 135 (47.5 bits), Expect = 3.7e-08, P = 3.7e-08
 Identities = 29/64 (45%), Positives = 40/64 (62%), Frame = +1

Query:   133 MSTKQGGKAKPLKKPKSDKKDYDEVDMANIQKKKEEEKALKELKAKAQQKGSFGGSGLKK 312
             MS++QGGK KPLK+ K  ++D D  D+A  +K+K +  A K L A  +      G G+KK
Sbjct:     1 MSSRQGGKMKPLKQKKKQQQDLDPEDIAFKEKQKADAAAKKALMANMKSGKPLVGGGIKK 60

Query:   313 SGKK 324
             SGKK
Sbjct:    61 SGKK 64

>gi|4494004|emb|CAB39063.1| (AL034559) F49C12.11-like protein [Plasmodium falciparum]
        Length = 54

  Plus Strand HSPs:

 Score = 78 (27.5 bits), Expect = 0.11, P = 0.10
 Identities = 22/49 (44%), Positives = 30/49 (61%), Frame = +1

Query:   145 QGGKAKPLKKPKSDKKDYDEVDMA----NIQKKKEEEKALKELKAKAQQK 282
             QGGK KPLK  K    +  E D+A      +KKK EE+A ++L  KA++K
Sbjct:     6 QGGKKKPLKAAKKGPVELTEEDIAFKKEMAEKKKAEEEAKQKL-LKAKKK 54

>gi|7486459|pir||T02461 hypothetical protein F4I18.16 - Arabidopsis thaliana
>gi|3386608|gb|AAC28538.1| (AC004665) hypothetical protein [Arabidopsis thaliana]
        Length = 79

  Plus Strand HSPs:

 Score = 65 (22.9 bits), Expect = 3.0, P = 0.95
 Identities = 16/34 (47%), Positives = 23/34 (67%), Frame = +1

Query:   217 NIQKKKE-EEKALKELKAKAQQKGSFGGSGLKKSG 318
             N+QK+KE ++K +K+L A  + K    GSG KK G
Sbjct:    17 NLQKEKELQDKKIKKLHAN-KNKMKVDGSGKKKKG 50

>gi|2183055|gb|AAB64259.1| (U97509) cytochrome oxidase II [Muscidifurax zaraptor]
        Length = 99

  Plus Strand HSPs:

 Score = 70 (24.6 bits), Expect = 3.3, P = 0.96
 Identities = 12/40 (30%), Positives = 25/40 (62%), Frame = +3

Query:   315 WKEIRALYHDCIHPILSPVVYYHDNFCLIVAFFIALMLVL 434
             WK+I  ++ D   PI+  ++ +HD+  L++   I+L+L +
Sbjct:     4 WKQI--MFQDSNSPIMESMIMFHDHGMLVIIIIISLILYI 41

>gi|1042189|gb|AAB34942.1| T2=testis-specific pro-protamine [Loligo pealeii=squids, testis chromatin, Peptide, 79 aa]
        Length = 79

  Plus Strand HSPs:

 Score = 64 (22.5 bits), Expect = 3.9, P = 0.98
 Identities = 26/64 (40%), Positives = 31/64 (48%), Frame = +2

Query:   149 VEKLSL*RNPSLIRRITTRLTWLTFRRKRKRRRR*RS*KP-RRNRREALEVLGSRKVERN 325
             VEKL L +     RR + R      RR R+RRRR RS  P RR RR        R+  R
Sbjct:    12 VEKLDLLKGGRRRRRRSRR------RRSRRRRRRRRSRSPYRRRRRRRRRRSRRRRRYRR 65

Query:   326 KGSLS 340
             + S S
Sbjct:    66 RRSYS 70

>gi|10177250|dbj|BAB10718.1| (AB007644) gene_id:K19P17.4~unknown protein [Arabidopsis thaliana]
        Length = 66

  Plus Strand HSPs:

 Score = 61 (21.5 bits), Expect = 8.5, P = 1.0
 Identities = 28/65 (43%), Positives = 33/65 (50%), Frame = +1

Query:    91 RFCFQTKP*LRIATMSTKQGG-KAKPLKKPKSDKKDYDEVDMANIQK--KKEEEKA-LKE 258
             +F FQ         M  K G  KA    KPK +KK   EV    I+K  KKEE+K   KE
Sbjct:     2 KFLFQCPCCSCFCFMKPKPGKPKAVGDTKPKEEKKK--EVKKEEIKKEEKKEEKKEEKKE 59

Query:   259 LKA-KAQ 276
              KA KA+
Sbjct:    60 TKAEKAE 66

>gi|3334471|sp|P46514|LE10_HELAN 10 KDA LATE EMBRYOGENESIS ABUNDANT PROTEIN (DS10)
>gi|2828229|emb|CAA42220.1| (X59699) 10 kDa Lea (Late embryogenesis abundant) protein [Helianthus annuus]
>gi|3724199|emb|CAA11834.1| (AJ224116) lea group I [Helianthus annuus]
        Length = 92

  Annotated Domains:
    BLOCKS BL00431A: Small hydrophilic plant seed p      14..32
    BLOCKS BL00431B: Small hydrophilic plant seed p      36..72
    PFAM seed_protein: Small hydrophilic plant se         1..89
    PRODOM PD002246: EM1(3) L194(2)                      15..88
    PROSITE SMALL_HYDR_PLANT_SEED: Small hydrophilic     27..35

  Plus Strand HSPs:

 Score = 65 (22.9 bits), Expect = 9.1, P = 1.0
 Identities = 24/74 (32%), Positives = 37/74 (50%), Frame = +1

Query:   133 MSTKQGGKAKPLKKPKSDKKDYDE--------VDMANIQKKKEEEKALKELKAKAQQ--K 282
             M+++QG + +  K P+ +KKD D+        V      K  E ++ L E ++K  Q  K
Sbjct:     1 MASQQGQQTR--KIPEQEKKDLDQRAAKGETVVPGGTRGKSLEAQERLAEGRSKGGQTRK 58

Query:   283 GSFGGSGLKKSGKK*G 330
                G  G K+ GKK G
Sbjct:    59 DQLGTEGYKEMGKKGG 74

Parameters:
  filter=none
  matrix=BLOSUM62
  V=50
  B=50
  E=10
  gi
  H=1
  sort_by_pvalue
  echofilter

  ctxfactor=5.98

  Query                        -----  As Used  -----    -----  Computed  ----
  Frame  MatID Matrix name     Lambda    K       H      Lambda    K       H
   Std.    0   BLOSUM62                                 0.318   0.135   0.401
   +3      0   BLOSUM62        0.318   0.135   0.401    0.362   0.164   0.630
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +2      0   BLOSUM62        0.318   0.135   0.401    0.344   0.150   0.514
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +1      0   BLOSUM62        0.318   0.135   0.401    0.354   0.158   0.546
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -1      0   BLOSUM62        0.318   0.135   0.401    0.339   0.149   0.458
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -2      0   BLOSUM62        0.318   0.135   0.401    0.340   0.146   0.467
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -3      0   BLOSUM62        0.318   0.135   0.401    0.353   0.151   0.578
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a

  Query
  Frame  MatID  Length  Eff.Length     E    S  W   T   X   E2      S2
   +3      0      194       192       10.  76  3  12  22  0.12     34
                                                      31  0.098    38
   +2      0      194       193       10.  76  3  12  22  0.12     34
                                                      31  0.099    38
   +1      0      195       193       10.  76  3  12  22  0.12     34
                                                      31  0.099    38
   -1      0      195       192       10.  76  3  12  22  0.12     34
                                                      31  0.098    38
   -2      0      194       193       10.  76  3  12  22  0.12     34
                                                      31  0.099    38
   -3      0      194       192       10.  76  3  12  22  0.12     34
                                                      31  0.098    38

Statistics:

  Database:  /usr/local/dot5/sl_home/beauty/seqdb/blast/nr
    Title:  nr
    Release date:  unknown
    Posted date:  4:06 PM CST Feb 28, 2001
    Format:  BLAST
  # of letters in database:  197,782,623
  # of sequences in database:  625,274
  # of database sequences satisfying E:  15
  No. of states in DFA:  589 (58 KB)
  Total size of DFA:  212 KB (256 KB)
  Time to generate neighborhood:  0.02u 0.00s 0.02t  Elapsed: 00:00:00
  No. of threads or processors used:  6
  Search cpu time:  194.56u 0.91s 195.47t  Elapsed: 00:01:33
  Total cpu time:  194.59u 0.93s 195.52t  Elapsed: 00:01:33
  Start:  Fri Jan 18 15:40:53 2002   End:  Fri Jan 18 15:42:26 2002

Annotated Domains Database: March 14, 2000
Release Date: March 14, 2000
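
The numbers in this report can be cross-checked with two short formulas. The sketch below is inferred from the values printed above, not taken from WU-BLAST documentation. It assumes that the bit score is the raw score times Lambda divided by ln 2, using the Lambda = 0.244 listed for Q=9,R=2 in the parameter table, and that P relates to Expect through the standard Poisson relation P = 1 - exp(-E). The function names are illustrative.

```python
import math


def bit_score(raw_score: float, lam: float) -> float:
    """Convert a raw score to bits, assuming bits = lambda * S / ln 2."""
    return lam * raw_score / math.log(2)


def p_from_expect(expect: float) -> float:
    """Probability of at least one chance hit, assuming P = 1 - exp(-E)."""
    # expm1 keeps precision when E is very small (for example 1.4e-24).
    return -math.expm1(-expect)


# Lambda taken from the Q=9,R=2 rows of the parameter table (an assumption
# about which Lambda the report used; it reproduces the printed bit scores).
GAPPED_LAMBDA = 0.244

# (raw score, Expect) pairs copied from alignments in this report.
examples = [(290, 1.4e-24), (78, 0.11), (65, 3.0), (61, 8.5)]

for raw, expect in examples:
    bits = bit_score(raw, GAPPED_LAMBDA)
    print(f"S={raw:3d}  bits={bits:6.1f}  E={expect:<8g}  P={p_from_expect(expect):.4g}")
```

Under these assumptions the first alignment (Score = 290) comes out at 102.1 bits, and Expect = 3.0 gives P of about 0.95, matching the values printed for the corresponding hits. For very small Expect values, P and Expect are nearly identical, which is why the top hits show the same number for both.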