WU-BLAST 2.0 search of the National Center for Biotechnology Information's NR Protein Database.
BEAUTY post-processing provided by the Human Genome Sequencing Center, Baylor College of Medicine.
BEAUTY Reference:
Worley KC, Culpepper P, Wiese BA, Smith RF. BEAUTY-X: enhanced BLAST searches for DNA queries. Bioinformatics 1998;14(10):890-1.
Worley KC, Wiese BA, Smith RF. BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results. Genome Res 1995 Sep;5(2):173-84.
RepeatMasker repeats found in sequence: No Repeats Found.
Reference: Gish, Warren (1994-1997). unpublished.
Gish, Warren and David J. States (1993). Identification of protein coding regions by database similarity search. Nat. Genet. 3:266-72.
Notice: statistical significance is estimated under the assumption that the equivalent of one entire reading frame in the query sequence codes for protein and that significant alignments will involve only coding reading frames.
Query= 'D14B10_D10_03.ab1' (564 letters)
Translating both strands of query sequence in all 6 reading frames

Database: nr
          625,274 sequences; 197,782,623 total letters.

     Observed Numbers of Database Sequences Satisfying
    Various EXPECTation Thresholds (E parameter values)

        Histogram units:      = 6 Sequences     : less than 6 sequences

 EXPECTation Threshold
 (E parameter)
    |
    V   Observed Counts-->
  10000 1715 328 |======================================================
   6310 1387 254 |==========================================
   3980 1133 289 |================================================
   2510  844 205 |==================================
   1580  639 188 |===============================
   1000  451 100 |================
    631  351 111 |==================
    398  240  77 |============
    251  163  55 |=========
    158  108  36 |======
    100   72  14 |==
   63.1   58  16 |==
   39.8   42  19 |===
   25.1   23   5 |:
   15.8   18   3 |:
         >>>>>>>>>>>>>>>>>>>>>  Expect = 10.0, Observed = 15  <<<<<<<<<<<<<<<<<
   10.0   15   3 |:
   6.31   12   0 |
   3.98   12   2 |:
   2.51   10   0 |
   1.58   10   0 |
   1.00   10   0 |
   0.63   10   0 |
   0.40   10   0 |
   0.25   10   0 |
   0.16   10   0 |
   0.10   10   1 |:

                                                                 Smallest
                                                                   Sum
                                                    Reading  High  Probability
Sequences producing High-scoring Segment Pairs:       Frame Score  P(N)      N

gi|5103825|gb|AAD39655.1|AC007591_20(AC007591) ESTs g... +3   290  1.4e-24   1
gi|9294450|dbj|BAB02669.1|(AB012247) gene_id:MSL1.7~u... +3   244  1.0e-19   1
gi|7511888|pir||T13381hypothetical protein 115C2.12 -... +3   194  2.1e-14   1
gi|7705431ref|NP_057017.1| hypothetical protein [Homo... +3   185  1.8e-13   1
gi|12730068ref|XP_003302.2| hypothetical protein [Hom... +3   185  1.8e-13   1
gi|7298038|gb|AAF53279.1|(AE003639) CG16824 gene prod... +3   182  3.8e-13   1
gi|6841310|gb|AAF29008.1|AF161448_1(AF161448) HSPC330... +3   174  2.7e-12   1
gi|7503800|pir||T22415hypothetical protein F49C12.11 ... +3   169  9.2e-12   1
gi|6323292ref|NP_013364.1| Ylr262c-ap [Saccharomyces ... +3   135  3.7e-08   1
gi|4494004|emb|CAB39063.1|(AL034559) F49C12.11-like p... +3    78  0.090     1
gi|7486459|pir||T02461hypothetical protein F4I18.16 -... +3    65  0.94      1
gi|1042189|gb|AAB34942.1|T2=testis-specific pro-prota... +1    64  0.97      1
gi|10177250|dbj|BAB10718.1|(AB007644) gene_id:K19P17.... +3    61  0.9996    1
gi|3334471|sp|P46514|LE10_HELAN10 KDA LATE EMBRYOGENE... +3    65  0.9997    1
gi|5921900|sp|O46590|COX4_SAIUSCYTOCHROME C OXIDASE P... +3    60  0.99995   1
>gi|5103825|gb|AAD39655.1|AC007591_20 (AC007591) ESTs gb|AA650895, gb|AA720043 and gb|R29777 come from this gene. [Arabidopsis thaliana]
 >gi|12484215|gb|AAG54006.1|AF336925_1 (AF336925) unknown protein [Arabidopsis thaliana]
        Length = 64

Frame 3 hits (HSPs): database sequence positions 1..64 of 64

  Plus Strand HSPs:

 Score = 290 (102.1 bits), Expect = 1.4e-24, P = 1.4e-24
 Identities = 55/64 (85%), Positives = 62/64 (96%), Frame = +3

Query:   102 MSTKQGGKAKPLKKPKSDKKDYDEVDMANIQKKKEEEKALKELKAKAQQKGSFGGSGLKK 281
             MS+KQGGKAKPLK+PK+DKK+YDE D+ANIQKKK+EEKALKEL+AKA QKGSFGGSGLKK
Sbjct:     1 MSSKQGGKAKPLKQPKADKKEYDETDLANIQKKKDEEKALKELRAKASQKGSFGGSGLKK 60

Query:   282 SGKK 293
             SGKK
Sbjct:    61 SGKK 64

>gi|9294450|dbj|BAB02669.1| (AB012247) gene_id:MSL1.7~unknown protein [Arabidopsis thaliana]
        Length = 62

Frame 3 hits (HSPs): database sequence positions 4..62 of 62

  Plus Strand HSPs:

 Score = 244 (85.9 bits), Expect = 1.0e-19, P = 1.0e-19
 Identities = 48/59 (81%), Positives = 53/59 (89%), Frame = +3

Query:   117 GGKAKPLKKPKSDKKDYDEVDMANIQKKKEEEKALKELKAKAQQKGSFGGSGLKKSGKK 293
             GGK KPLK+PKS KK+YDE DM  +QKKK+EEKALKEL+AKA QKGSFGGSGLKKSGKK
Sbjct:     4 GGKLKPLKQPKSGKKEYDEHDMELMQKKKDEEKALKELRAKASQKGSFGGSGLKKSGKK 62

>gi|7511888|pir||T13381 hypothetical protein 115C2.12 - fruit fly (Drosophila melanogaster)
 >gi|4688661|emb|CAB41345.1| (AL031581) /prediction=(method:"genefinder", version:"084", score:"18.93")~/prediction=(method:"genscan", version:"1.0", score:"25.85")~/match=(desc:"F49C12.11 PROTEIN", species:"CAENORHABDITIS ELEGANS", ranges:(query:3059..3250, target>
 >gi|7290062|gb|AAF45528.1| (AE003418) CG13364 gene product [Drosophila melanogaster]
        Length = 64

Frame 3 hits (HSPs): database sequence positions 1..64 of 64

  Plus Strand HSPs:

 Score = 194 (68.3 bits), Expect = 2.1e-14, P = 2.1e-14
 Identities = 39/64 (60%), Positives = 47/64 (73%), Frame = +3

Query:   102 MSTKQGGKAKPLKKPKSDKKDYDEVDMANIQKKKEEEKALKELKAKAQQKGSFGGSGLKK 281
             MS ++GGK KPLK PK D KD DE DMA  QK+KE++KAL+  KA A +KG   G G+KK
Sbjct:     1 MSGREGGKKKPLKAPKKDSKDLDEDDMAFKQKQKEQQKALEAAKANASKKGPLVGGGIKK 60

Query:   282 SGKK 293
             SGKK
Sbjct:    61 SGKK 64

>gi|7705431 ref|NP_057017.1| hypothetical protein [Homo sapiens]
 >gi|4679018|gb|AAD26997.1| (AF077202) HSPC016 [Homo sapiens]
 >gi|12654537|gb|AAH01102.1|AAH01102 (BC001102) hypothetical protein [Homo sapiens]
        Length = 64

Frame 3 hits (HSPs): database sequence positions 1..64 of 64

  Plus Strand HSPs:

 Score = 185 (65.1 bits), Expect = 1.8e-13, P = 1.8e-13
 Identities = 38/64 (59%), Positives = 46/64 (71%), Frame = +3

Query:   102 MSTKQGGKAKPLKKPKSDKKDYDEVDMANIQKKKEEEKALKELKAKAQQKGSFGGSGLKK 281
             MS ++GGK KPLK+PK   K+ DE D A  QK+KEE+K L+ELKAKA  KG     G+KK
Sbjct:     1 MSGREGGKKKPLKQPKKQAKEMDEEDKAFKQKQKEEQKKLEELKAKAAGKGPLATGGIKK 60

Query:   282 SGKK 293
             SGKK
Sbjct:    61 SGKK 64

>gi|12730068 ref|XP_003302.2| hypothetical protein [Homo sapiens]
        Length = 76

Frame 3 hits (HSPs): database sequence positions 13..76 of 76

  Plus Strand HSPs:

 Score = 185 (65.1 bits), Expect = 1.8e-13, P = 1.8e-13
 Identities = 38/64 (59%), Positives = 46/64 (71%), Frame = +3

Query:   102 MSTKQGGKAKPLKKPKSDKKDYDEVDMANIQKKKEEEKALKELKAKAQQKGSFGGSGLKK 281
             MS ++GGK KPLK+PK   K+ DE D A  QK+KEE+K L+ELKAKA  KG     G+KK
Sbjct:    13 MSGREGGKKKPLKQPKKQAKEMDEEDKAFKQKQKEEQKKLEELKAKAAGKGPLATGGIKK 72

Query:   282 SGKK 293
             SGKK
Sbjct:    73 SGKK 76

>gi|7298038|gb|AAF53279.1| (AE003639) CG16824 gene product [Drosophila melanogaster]
        Length = 64

Frame 3 hits (HSPs): database sequence positions 1..64 of 64

  Plus Strand HSPs:

 Score = 182 (64.1 bits), Expect = 3.8e-13, P = 3.8e-13
 Identities = 36/64 (56%), Positives = 47/64 (73%), Frame = +3

Query:   102 MSTKQGGKAKPLKKPKSDKKDYDEVDMANIQKKKEEEKALKELKAKAQQKGSFGGSGLKK 281
             M+ ++GGK KPLK PK D K+ DE DMA  QK+KE++KA++  KA A +KG   G G+KK
Sbjct:     1 MAGREGGKKKPLKAPKKDSKNLDEEDMAFKQKQKEQQKAMEAAKAGASKKGPLLGGGIKK 60

Query:   282 SGKK 293
             SGKK
Sbjct:    61 SGKK 64

>gi|6841310|gb|AAF29008.1|AF161448_1 (AF161448) HSPC330 [Homo sapiens]
        Length = 59

Frame 3 hits (HSPs): database sequence positions 1..59 of 59

  Plus Strand HSPs:

 Score = 174 (61.3 bits), Expect = 2.7e-12, P = 2.7e-12
 Identities = 36/59 (61%), Positives = 42/59 (71%), Frame = +3

Query:   117 GGKAKPLKKPKSDKKDYDEVDMANIQKKKEEEKALKELKAKAQQKGSFGGSGLKKSGKK 293
             GGK KPLK+PK   K+ DE D A  QK+KEE+K L+ELKAKA  KG     G+KKSGKK
Sbjct:     1 GGKKKPLKQPKKQAKEMDEEDKAFKQKQKEEQKKLEELKAKAAGKGPLATGGIKKSGKK 59

>gi|7503800|pir||T22415 hypothetical protein F49C12.11 - Caenorhabditis elegans
 >gi|3877368|emb|CAA92514.1| (Z68227) cDNA EST EMBL:T01259 comes from this gene~cDNA EST yk67d3.3 comes from this gene~cDNA EST yk67d3.5 comes from this gene~cDNA EST yk136h3.3 comes from this gene~cDNA EST yk136h3.5 comes from this gene~cDNA EST yk482b3.3 comes from this gene~cDN>
        Length = 64

Frame 3 hits (HSPs): database sequence positions 1..64 of 64

  Plus Strand HSPs:

 Score = 169 (59.5 bits), Expect = 9.2e-12, P = 9.2e-12
 Identities = 35/64 (54%), Positives = 45/64 (70%), Frame = +3

Query:   102 MSTKQGGKAKPLKKPKSDKKDYDEVDMANIQKKKEEEKALKELKAKAQQKGSFGGSGLKK 281
             MS +QGGKAKPLK  K  +KD  E D+   +K++EE K +KE+ AKA Q+G   G G+KK
Sbjct:     1 MSGRQGGKAKPLKAAKKTEKDLSEEDVEFKKKQQEEAKKIKEMAAKAGQRGPLLGGGIKK 60

Query:   282 SGKK 293
             SGKK
Sbjct:    61 SGKK 64

>gi|6323292 ref|NP_013364.1| Ylr262c-ap [Saccharomyces cerevisiae]
 >gi|2131800|pir||S72200 hypothetical protein YLR262c-a - yeast (Saccharomyces cerevisiae)
        Length = 64

Frame 3 hits (HSPs): database sequence positions 1..64 of 64

  Plus Strand HSPs:

 Score = 135 (47.5 bits), Expect = 3.7e-08, P = 3.7e-08
 Identities = 29/64 (45%), Positives = 40/64 (62%), Frame = +3

Query:   102 MSTKQGGKAKPLKKPKSDKKDYDEVDMANIQKKKEEEKALKELKAKAQQKGSFGGSGLKK 281
             MS++QGGK KPLK+ K  ++D D  D+A  +K+K +  A K L A  +      G G+KK
Sbjct:     1 MSSRQGGKMKPLKQKKKQQQDLDPEDIAFKEKQKADAAAKKALMANMKSGKPLVGGGIKK 60

Query:   282 SGKK 293
             SGKK
Sbjct:    61 SGKK 64

>gi|4494004|emb|CAB39063.1| (AL034559) F49C12.11-like protein [Plasmodium falciparum]
        Length = 54

Frame 3 hits (HSPs): database sequence positions 6..54 of 54

  Plus Strand HSPs:

 Score = 78 (27.5 bits), Expect = 0.094, P = 0.090
 Identities = 22/49 (44%), Positives = 30/49 (61%), Frame = +3

Query:   114 QGGKAKPLKKPKSDKKDYDEVDMA----NIQKKKEEEKALKELKAKAQQK 251
             QGGK KPLK  K    +  E D+A      +KKK EE+A ++L  KA++K
Sbjct:     6 QGGKKKPLKAAKKGPVELTEEDIAFKKEMAEKKKAEEEAKQKL-LKAKKK 54

>gi|7486459|pir||T02461 hypothetical protein F4I18.16 - Arabidopsis thaliana
 >gi|3386608|gb|AAC28538.1| (AC004665) hypothetical protein [Arabidopsis thaliana]
        Length = 79

Frame 3 hits (HSPs): database sequence positions 17..50 of 79

  Plus Strand HSPs:

 Score = 65 (22.9 bits), Expect = 2.7, P = 0.94
 Identities = 16/34 (47%), Positives = 23/34 (67%), Frame = +3

Query:   186 NIQKKKE-EEKALKELKAKAQQKGSFGGSGLKKSG 287
             N+QK+KE ++K +K+L A  + K    GSG KK G
Sbjct:    17 NLQKEKELQDKKIKKLHAN-KNKMKVDGSGKKKKG 50

>gi|1042189|gb|AAB34942.1| T2=testis-specific pro-protamine [Loligo pealeii=squids, testis chromatin, Peptide, 79 aa]
        Length = 79

Frame 1 hits (HSPs): database sequence positions 12..70 of 79

  Plus Strand HSPs:

 Score = 64 (22.5 bits), Expect = 3.7, P = 0.97
 Identities = 26/64 (40%), Positives = 31/64 (48%), Frame = +1

Query:   118 VEKLSL*RNPSLIRRITTRLTWLTFRRKRKRRRR*RS*KP-RRNRREALEVLGSRKVERN 294
             VEKL L +     RR +  R      RR R+RRRR RS  P RR RR        R+  R
Sbjct:    12 VEKLDLLKGGRRRRRRSRR------RRSRRRRRRRRSRSPYRRRRRRRRRRSRRRRRYRR 65

Query:   295 KGSLS 309
             +  S S
Sbjct:    66 RRSYS 70

>gi|10177250|dbj|BAB10718.1| (AB007644) gene_id:K19P17.4~unknown protein [Arabidopsis thaliana]
        Length = 66

Frame 3 hits (HSPs): database sequence positions 2..66 of 66

  Plus Strand HSPs:

 Score = 61 (21.5 bits), Expect = 7.7, P = 1.0
 Identities = 28/65 (43%), Positives = 33/65 (50%), Frame = +3

Query:    60 RFCFQTKP*LRIATMSTKQGG-KAKPLKKPKSDKKDYDEVDMANIQK--KKEEEKA-LKE 227
             +F FQ         M  K G  KA    KPK +KK   EV    I+K  KKEE+K   KE
Sbjct:     2 KFLFQCPCCSCFCFMKPKPGKPKAVGDTKPKEEKKK--EVKKEEIKKEEKKEEKKEEKKE 59

Query:   228 LKA-KAQ 245
              KA KA+
Sbjct:    60 TKAEKAE 66

>gi|3334471|sp|P46514|LE10_HELAN 10 KDA LATE EMBRYOGENESIS ABUNDANT PROTEIN (DS10)
 >gi|2828229|emb|CAA42220.1| (X59699) 10 kDa Lea (Late embryogenesis abundant) protein [Helianthus annuus]
 >gi|3724199|emb|CAA11834.1| (AJ224116) lea group I [Helianthus annuus]
        Length = 92

Frame 3 hits (HSPs): database sequence positions 1..74 of 92

Annotated Domains:
  BLOCKS BL00431A: Small hydrophilic plant seed p       14..32
  BLOCKS BL00431B: Small hydrophilic plant seed p       36..72
  PFAM seed_protein: Small hydrophilic plant se         1..89
  PRODOM PD002246: EM1(3) L194(2)                       15..88
  PROSITE SMALL_HYDR_PLANT_SEED: Small hydrophilic      27..35

  Plus Strand HSPs:

 Score = 65 (22.9 bits), Expect = 8.2, P = 1.0
 Identities = 24/74 (32%), Positives = 37/74 (50%), Frame = +3

Query:   102 MSTKQGGKAKPLKKPKSDKKDYDE--------VDMANIQKKKEEEKALKELKAKAQQ--K 251
             M+++QG + +  K P+ +KKD D+        V      K  E ++ L E ++K  Q  K
Sbjct:     1 MASQQGQQTR--KIPEQEKKDLDQRAAKGETVVPGGTRGKSLEAQERLAEGRSKGGQTRK 58

Query:   252 GSFGGSGLKKSGKK*G 299
                G  G K+ GKK G
Sbjct:    59 DQLGTEGYKEMGKKGG 74

>gi|5921900|sp|O46590|COX4_SAIUS CYTOCHROME C OXIDASE POLYPEPTIDE IV
 >gi|6166027|sp||COX4_SAISC_1 [Segment 1 of 2] CYTOCHROME C OXIDASE POLYPEPTIDE IV
 >gi|2809535|gb|AAB97759.1| (AF042779) cytochrome c oxidase subunit IV [Saimiri ustus]
 >gi|2895585|gb|AAC02988.1| (AF042764) cytochrome c oxidase subunit IV [Saimiri sciureus]
        Length = 55

Frame 3 hits (HSPs): database sequence positions 4..43 of 55

  Plus Strand HSPs:

 Score = 60 (21.1 bits), Expect = 10., P = 1.0
 Identities = 15/40 (37%), Positives = 25/40 (62%), Frame = +3

Query:   123 KAKPLKKPKS-DKKDYDEVDMANIQKKKEEEKALKELKAKA 242
             K++   +P   D++DY   D+A+++     +KALKE K KA
Sbjct:     4 KSEDYARPSYVDRRDYPLPDVAHVRHLSASQKALKE-KEKA 43

Parameters:
  filter=none
  matrix=BLOSUM62
  V=50
  B=50
  E=10
  gi
  H=1
  sort_by_pvalue
  echofilter

  ctxfactor=5.94

  Query                        -----  As Used  -----    -----  Computed  ----
  Frame  MatID Matrix name     Lambda    K       H      Lambda    K       H
   Std.    0   BLOSUM62                                 0.318   0.135   0.401
   +3      0   BLOSUM62        0.318   0.135   0.401    0.357   0.160   0.549
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +2      0   BLOSUM62        0.318   0.135   0.401    0.363   0.165   0.634
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +1      0   BLOSUM62        0.318   0.135   0.401    0.349   0.153   0.533
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -1      0   BLOSUM62        0.318   0.135   0.401    0.352   0.150   0.562
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -2      0   BLOSUM62        0.318   0.135   0.401    0.344   0.152   0.469
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -3      0   BLOSUM62        0.318   0.135   0.401    0.342   0.146   0.465
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a

  Query
  Frame  MatID  Length  Eff.Length     E     S  W   T   X    E2    S2
   +3      0      187       184       10.   76  3  12  22  0.12    34
                                                       31  0.12    37
   +2      0      187       184       10.   76  3  12  22  0.12    34
                                                       31  0.12    37
   +1      0      188       187       10.   76  3  12  22  0.12    34
                                                       31  0.12    37
   -1      0      188       185       10.   76  3  12  22  0.12    34
                                                       31  0.12    37
   -2      0      187       186       10.   76  3  12  22  0.12    34
                                                       31  0.12    37
   -3      0      187       184       10.   76  3  12  22  0.12    34
                                                       31  0.12    37

Statistics:

  Database:  /usr/local/dot5/sl_home/beauty/seqdb/blast/nr
    Title:  nr
    Release date:  unknown
    Posted date:  4:06 PM CST Feb 28, 2001
    Format:  BLAST
  # of letters in database:  197,782,623
  # of sequences in database:  625,274
  # of database sequences satisfying E:  15
  No. of states in DFA:  589 (58 KB)
  Total size of DFA:  205 KB (256 KB)
  Time to generate neighborhood:  0.00u 0.02s 0.02t  Elapsed: 00:00:00
  No. of threads or processors used:  6
  Search cpu time:  181.56u 0.94s 182.50t  Elapsed: 00:00:43
  Total cpu time:  181.58u 0.99s 182.57t  Elapsed: 00:00:43
  Start:  Thu Jan 17 12:04:52 2002   End:  Thu Jan 17 12:05:35 2002
Annotated Domains Database: March 14, 2000
Release Date: March 14, 2000
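Note on reading the scores: the bit scores and P-values in the alignments above can be cross-checked against the raw scores and Expect values. The sketch below is illustrative only and is not part of the BLAST or BEAUTY output. It assumes that bits are computed as Lambda * S / ln 2 using the gapped Lambda of 0.244 from the Q=9,R=2 rows of the parameter table, and that P = 1 - exp(-Expect); both conventions are inferred from the numbers in this report rather than stated by it, and the function names are hypothetical.

```python
import math

# Assumption: gapped Lambda taken from the "Q=9,R=2" rows of the parameter table.
LAMBDA_GAPPED = 0.244


def bits_from_raw(score, lam=LAMBDA_GAPPED):
    """Convert a raw alignment score to bits, assuming bits = lambda * S / ln 2."""
    return lam * score / math.log(2)


def p_from_expect(expect):
    """Probability of at least one chance hit, assuming a Poisson model: 1 - exp(-E)."""
    return 1.0 - math.exp(-expect)


if __name__ == "__main__":
    # (raw score, bits as printed in the report)
    for raw, reported_bits in [(290, 102.1), (244, 85.9), (78, 27.5)]:
        print(raw, round(bits_from_raw(raw), 1), reported_bits)
    # Plasmodium falciparum hit: Expect = 0.094, reported P = 0.090
    print(round(p_from_expect(0.094), 3))
```

Under these assumptions the raw scores 290, 244, and 78 convert to 102.1, 85.9, and 27.5 bits, and Expect = 0.094 gives P = 0.090, in agreement with the values printed for those hits.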