WU-BLAST 2.0 search of the National Center for Biotechnology Information's NR Protein Database.
BEAUTY post-processing provided by the Human Genome Sequencing Center, Baylor College of Medicine.
BEAUTY Reference:
Worley KC, Culpepper P, Wiese BA, Smith RF. BEAUTY-X: enhanced BLAST searches for DNA queries. Bioinformatics 1998;14(10):890-1.
Worley KC, Wiese BA, Smith RF. BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results. Genome Res 1995 Sep;5(2):173-84.
RepeatMasker Server unavailable.
Reference: Gish, Warren (1994-1997). unpublished. Gish, Warren and David J. States (1993). Identification of protein coding regions by database similarity search. Nat. Genet. 3:266-72.
Notice: statistical significance is estimated under the assumption that the equivalent of one entire reading frame in the query sequence codes for protein and that significant alignments will involve only coding reading frames.
Query= B07G03.seq(1>578) (547 letters)
Translating both strands of query sequence in all 6 reading frames

Database: nr
          625,274 sequences; 197,782,623 total letters.

Observed Numbers of Database Sequences Satisfying
Various EXPECTation Thresholds (E parameter values)

Histogram units:  = 5 Sequences   : less than 5 sequences

 EXPECTation Threshold
 (E parameter)
    |
    V   Observed Counts-->
  10000    1466  306 |=============================================================
   6310    1160  204 |========================================
   3980     956  240 |================================================
   2510     716  215 |===========================================
   1580     501  108 |=====================
   1000     393  127 |=========================
    631     266   79 |===============
    398     187   51 |==========
    251     136   56 |===========
    158      80   26 |=====
    100      54   23 |====
   63.1      31    6 |=
   39.8      25   10 |==
   25.1      15    2 |:
   15.8      13    2 |:
 >>>>>>>>>>>>>>>>>>>>>  Expect = 10.0, Observed = 11  <<<<<<<<<<<<<<<<<
   10.0      11    1 |:
   6.31      10    1 |:
   3.98       9    0 |
   2.51       9    0 |
   1.58       9    0 |
   1.00       9    0 |
   0.63       9    1 |:

                                                                           Smallest Sum
                                                           Reading   High  Probability
Sequences producing High-scoring Segment Pairs:            Frame    Score  P(N)      N
gi|8894548|emb|CAB95829.1| (AJ404639) hypothetical pro...  +3         683  3.2e-66   1
gi|12323663|gb|AAG51796.1|AC067754_12 (AC067754) cytos...  +3         480  1.0e-44   1
gi|12323660|gb|AAG51793.1|AC067754_9 (AC067754) cytoso...  +3         461  1.1e-42   1
gi|7267559|emb|CAB78040.1| (AL161514) putative protein...  +3         279  6.2e-41   2
gi|6587836|gb|AAF18525.1|AC006551_11 (AC006551) Unknow...  +3         443  2.1e-40   1
gi|4587525|gb|AAD25756.1|AC007060_14 (AC007060) Contai...  +3         371  4.0e-33   1
gi|11358091|pir||T46063 hypothetical protein T18N14.50...  +3         337  1.5e-29   1
gi|4902478|emb|CAB43521.1| (AJ238803) hypothetical pro...  +3         289  1.8e-24   1
gi|1346953|sp|P49193|RALB_TODPA RETINAL-BINDING PROTEI...  +3          70  0.39      2
gi|5918866|gb|AAD56166.1| (AF154480) NADH dehydrogenas...  +1          62  0.997     1
gi|10946319|gb|AAG24854.1| (AF299064) NADH dehydrogena...  -1          75  0.99992   1
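
The report states that the 547-letter query is translated on both strands in all 6 reading frames. As a rough illustration (not part of the BLAST output), the sketch below counts the whole codons available in each frame. The function name `frame_codon_counts` is hypothetical, and the rule of dropping 0, 1 or 2 leading letters per frame is an assumption about how frames are laid out. The counts it gives (182, 182, 181 on each strand) agree with the per-frame query lengths listed in the report's parameter summary.

```python
def frame_codon_counts(query_length):
    """Whole codons per reading frame for a nucleotide query.

    Assumption: frame +k (and -k on the reverse strand) skips
    k-1 leading letters, and a trailing partial codon is dropped.
    """
    counts = {}
    for offset in range(3):
        codons = (query_length - offset) // 3
        counts["+%d" % (offset + 1)] = codons
        counts["-%d" % (offset + 1)] = codons
    return counts


if __name__ == "__main__":
    # 547 letters, as given on the "Query=" line of the report.
    for frame, codons in sorted(frame_codon_counts(547).items()):
        print(frame, codons)
```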
>gi|8894548|emb|CAB95829.1| (AJ404639) hypothetical protein [Cicer arietinum] Length = 482 Frame 3 hits (HSPs): _________________ __________________________________________________ Database sequence: | | | | | 482 0 150 300 450 Plus Strand HSPs: Score = 683 (240.4 bits), Expect = 3.2e-66, P = 3.2e-66 Identities = 131/155 (84%), Positives = 141/155 (90%), Frame = +3 Query: 6 FFTQRTKSKFVFAGPSKSADTLFRYIAPELVPVQYGGLSREAEQEFTSAYPVTEFTIKPA 185 F T RTKSKF FAGPSKSADTLF+YIAPE VPVQYGGLSRE +QEFT+A P TE TIKPA Sbjct: 328 FLTPRTKSKFFFAGPSKSADTLFKYIAPEQVPVQYGGLSREGDQEFTTADPATEVTIKPA 387 Query: 186 TKHSVEFPVSEKSHLVWEIRVVGWDVSYGAEFVPSAEDGYTVIVHKSRKIAPADETVLTN 365 TKH+VEFP+ EKS LVWE+RVVGWDVSYGAEFVPSAEDGYTVIV K+RKIAPADETV+ N Sbjct: 388 TKHAVEFPIPEKSTLVWEVRVVGWDVSYGAEFVPSAEDGYTVIVQKNRKIAPADETVINN 447 Query: 366 GFKIGEPGKIVLTIDNQTSKKKKLLYRSKTKPIAE 470 FKIGEPGK+VLTIDNQTSKKKKLLYRSKT PI+E Sbjct: 448 TFKIGEPGKVVLTIDNQTSKKKKLLYRSKTIPISE 482 >gi|12323663|gb|AAG51796.1|AC067754_12 (AC067754) cytosolic factor, putative; 19554-17768 [Arabidopsis thaliana] Length = 490 Frame 3 hits (HSPs): ________________ __________________________________________________ Database sequence: | | | | | 490 0 150 300 450 Plus Strand HSPs: Score = 480 (169.0 bits), Expect = 1.0e-44, P = 1.0e-44 Identities = 91/156 (58%), Positives = 116/156 (74%), Frame = +3 Query: 6 FFTQRTKSKFVFAGPSKSADTLFRYIAPELVPVQYGGLSRE---AEQEFTSAYPVTEFTI 176 F T R+KSK VFAGPS+SA+TLF+YI+PE VPVQYGGLS + +F+ +E T+ Sbjct: 335 FMTPRSKSKLVFAGPSRSAETLFKYISPEQVPVQYGGLSVDPCDCNPDFSLEDSASEITV 394 Query: 177 KPATKHSVEFPVSEKSHLVWEIRVVGWDVSYGAEFVPSAEDGYTVIVHKSRKIAPADETV 356 KP TK +VE + EK LVWEIRV GW+VSY AEFVP +D YTV++ K RK+ P+DE V Sbjct: 395 KPGTKQTVEIIIYEKCELVWEIRVTGWEVSYKAEFVPEEKDAYTVVIQKPRKMRPSDEPV 454 Query: 357 LTNGFKIGEPGKIVLTIDNQTSKKKKLLYRSKTKPI 464 LT+ FK+ E GK++LT+DN TSKKKKL+YR KP+ Sbjct: 455 LTHSFKVNELGKVLLTVDNPTSKKKKLVYRFNVKPL 490 >gi|12323660|gb|AAG51793.1|AC067754_9 (AC067754) cytosolic factor, putative; 12503-14597 [Arabidopsis thaliana] Length 
= 573 Frame 3 hits (HSPs): _____________ __________________________________________________ Database sequence: | | | | | 573 0 150 300 450 Plus Strand HSPs: Score = 461 (162.3 bits), Expect = 1.1e-42, P = 1.1e-42 Identities = 89/147 (60%), Positives = 112/147 (76%), Frame = +3 Query: 18 RTKSKFVFAGPSKSADTLFRYIAPELVPVQYGGLSREAEQEFTSAYPVTEFTIKPATKHS 197 RT+SK V AGPSKSADT+F+YIAPE VPV+YGGLS++ + +TE +KPA ++ Sbjct: 430 RTRSKMVLAGPSKSADTIFKYIAPEQVPVKYGGLSKDTP---LTEETITEAIVKPAANYT 486 Query: 198 VEFPVSEKSHLVWEIRVVGWDVSYGAEFVPSAEDGYTVIVHKSRKIAPADETVLTNGFKI 377 +E P SE L WE+RV+G DVSYGA+F P+ E Y VIV K+RKI DE V+T+ FK+ Sbjct: 487 IELPASEACTLSWELRVLGADVSYGAQFEPTTEGSYAVIVSKTRKIGSTDEPVITDSFKV 546 Query: 378 GEPGKIVLTIDNQTSKKKKLLYRSKTK 458 GEPGKIV+TIDNQTSKKKK+LYR KT+ Sbjct: 547 GEPGKIVITIDNQTSKKKKVLYRFKTQ 573 >gi|7267559|emb|CAB78040.1| (AL161514) putative protein [Arabidopsis thaliana] Length = 723 Frame 3 hits (HSPs): ______ ______ __________________________________________________ Database sequence: | | | | | | 723 0 150 300 450 600 Plus Strand HSPs: Score = 279 (98.2 bits), Expect = 6.2e-41, Sum P(2) = 6.2e-41 Identities = 50/84 (59%), Positives = 67/84 (79%), Frame = +3 Query: 216 EKSHLVWEIRVVGWDVSYGAEFVPSAEDGYTVIVHKSRKIAPADETVLTNGFKIGEPGKI 395 +K +VWEIRVVGW+VSYGAEFVP ++GYTVI+ K RK+ +E V+++ FK+GE G+I Sbjct: 638 QKCTIVWEIRVVGWEVSYGAEFVPENKEGYTVIIQKPRKMTAKNELVVSHSFKVGEVGRI 697 Query: 396 VLTIDNQTSKKKKLLYRSKTKPIA 467 +LT+DN TS KK L+YR K KP+A Sbjct: 698 LLTVDNPTSTKKMLIYRFKVKPLA 721 Score = 201 (70.8 bits), Expect = 6.2e-41, Sum P(2) = 6.2e-41 Identities = 43/74 (58%), Positives = 52/74 (70%), Frame = +3 Query: 6 FFTQRTKSKFVFAGPSKSADTLFRYIAPELVPVQYGGLSR---EAEQEFTSAYPVTEFTI 176 F +QR+KSK VFAGPS+SA+TL +YI+PE VPVQYGGLS E +FT TE T+ Sbjct: 510 FMSQRSKSKLVFAGPSRSAETLLKYISPEHVPVQYGGLSVDNCECNSDFTHDDIATEITV 569 Query: 177 KPATKHSVEFPVSE 218 KP TK +VE V E Sbjct: 570 KPTTKQTVEIIVYE 583 >gi|6587836|gb|AAF18525.1|AC006551_11 (AC006551) Unknown protein [Arabidopsis thaliana] Length = 683 Frame 3 
hits (HSPs): ___________ __________________________________________________ Database sequence: | | | | | | 683 0 150 300 450 600 Plus Strand HSPs: Score = 443 (155.9 bits), Expect = 2.1e-40, P = 2.1e-40 Identities = 85/147 (57%), Positives = 114/147 (77%), Frame = +3 Query: 18 RTKSKFVFAGPSKSADTLFRYIAPELVPVQYGGLSREAEQEFTSAYPVTEFTIKPATKHS 197 RT+SK V +GPSKSA+T+F+Y+APE+VPV+YGGLS+++ FT VTE +K +K++ Sbjct: 538 RTRSKMVLSGPSKSAETIFKYVAPEVVPVKYGGLSKDSP--FTVEDGVTEAVVKSTSKYT 595 Query: 198 VEFPVSEKSHLVWEIRVVGWDVSYGAEFVPSAEDGYTVIVHKSRKIAPADETVLTNGFKI 377 ++ P +E S L WE+RV+G DVSYGA+F PS E YTVIV K+RK+ DE V+T+ FK Sbjct: 596 IDLPATEGSTLSWELRVLGADVSYGAQFEPSNEASYTVIVSKNRKVGLTDEPVITDSFKA 655 Query: 378 GEPGKIVLTIDNQTSKKKKLLYRSKTK 458 E GK+V+TIDNQT KKKK+LYRSKT+ Sbjct: 656 SEAGKVVITIDNQTFKKKKVLYRSKTQ 682 >gi|4587525|gb|AAD25756.1|AC007060_14 (AC007060) Contains the PF|00650 CRAL/TRIO phosphatidyl-inositol-transfer protein domain. ESTs gb|T76582, gb|N06574 and gb|Z25700 come from this gene. [Arabidopsis thaliana] Length = 540 Frame 3 hits (HSPs): _______________ __________________________________________________ Database sequence: | | | | | 540 0 150 300 450 Plus Strand HSPs: Score = 371 (130.6 bits), Expect = 4.0e-33, P = 4.0e-33 Identities = 75/151 (49%), Positives = 103/151 (68%), Frame = +3 Query: 6 FFTQRTKSKFVFAGPSKSADTLFRYIAPELVPVQYGGLSREAEQEFTSAYPVTEFTIKPA 185 F TQRTKSKFV A P+K +TL +YI + +PVQYGG + EF++ V+E +KP Sbjct: 386 FLTQRTKSKFVVARPAKVRETLLKYIPADELPVQYGGFKTVDDTEFSNE-TVSEVVVKPG 444 Query: 186 TKHSVEFPVSE-KSHLVWEIRVVGWDVSYGAEFVPSAEDGYTVIVHKSRKIAPADETVLT 362 + ++E P E + LVW+I V+GW+V+Y EFVP+ E YTVIV K +K+ A+E + Sbjct: 445 SSETIEIPAPETEGTLVWDIAVLGWEVNYKEEFVPTEEGAYTVIVQKVKKMG-ANEGPIR 503 Query: 363 NGFKIGEPGKIVLTIDNQTSKKKKLLYRSKTK 458 N FK + GKIVLT+DN + KKKK+LYR +TK Sbjct: 504 NSFKNSQAGKIVLTVDNVSGKKKKVLYRYRTK 535 >gi|11358091|pir||T46063 hypothetical protein T18N14.50 - Arabidopsis thaliana >gi|6580149|emb|CAB63153.1| (AL132968) putative protein [Arabidopsis thaliana] Length = 409 
Frame 3 hits (HSPs): ___________________ __________________________________________________ Database sequence: | | | | 409 0 150 300 Plus Strand HSPs: Score = 337 (118.6 bits), Expect = 1.5e-29, P = 1.5e-29 Identities = 66/148 (44%), Positives = 97/148 (65%), Frame = +3 Query: 6 FFTQRTKSKFVFAGPSKSADTLFRYIAPELVPVQYGGLSREAEQEFTSAYPVTEFTIKPA 185 F TQRTKSKFV + +A+TL+++I PE +PVQYGGLSR + + P +EF+IK Sbjct: 252 FLTQRTKSKFVMSKEGNAAETLYKFIRPEDIPVQYGGLSRPTDSQNGPPKPASEFSIKGG 311 Query: 186 TKHSVEFP-VSEKSHLVWEIRVVGWDVSYGAEFVPSAEDGYTVIVHKSRKIAPADETVLT 362 K +++ + + + W+I V GWD+ Y AEFVP+AE+ Y ++V K +K+ DE V Sbjct: 312 EKVNIQIEGIEGGATITWDIVVGGWDLEYSAEFVPNAEESYAIVVEKPKKMKATDEAVC- 370 Query: 363 NGFKIGEPGKIVLTIDNQTSKKKKLL-YR 446 N F E GK++L++DN S+KKK+ YR Sbjct: 371 NSFTTVEAGKLILSVDNTLSRKKKVAAYR 399 >gi|4902478|emb|CAB43521.1| (AJ238803) hypothetical protein [Arabidopsis thaliana] Length = 147 Frame 3 hits (HSPs): _____________________________________________ __________________________________________________ Database sequence: | | | | 147 0 50 100 Plus Strand HSPs: Score = 289 (101.7 bits), Expect = 1.8e-24, P = 1.8e-24 Identities = 56/131 (42%), Positives = 86/131 (65%), Frame = +3 Query: 57 SADTLFRYIAPELVPVQYGGLSREAEQEFTSAYPVTEFTIKPATKHSVEFP-VSEKSHLV 233 +A+TL+++I PE +PVQYGGLSR + + P +EF+IK K +++ + + + Sbjct: 7 AAETLYKFIRPEDIPVQYGGLSRPTDSQNGPPKPASEFSIKGGEKVNIQIEGIEGGATIT 66 Query: 234 WEIRVVGWDVSYGAEFVPSAEDGYTVIVHKSRKIAPADETVLTNGFKIGEPGKIVLTIDN 413 W+I V GWD+ Y AEFVP+AE+ Y ++V K +K+ DE V N F E GK++L++DN Sbjct: 67 WDIVVGGWDLEYSAEFVPNAEESYAIVVEKPKKMKATDEAVC-NSFTTVEAGKLILSVDN 125 Query: 414 QTSKKKKLL-YR 446 S+KKK+ YR Sbjct: 126 TLSRKKKVAAYR 137 >gi|1346953|sp|P49193|RALB_TODPA RETINAL-BINDING PROTEIN (RALBP) >gi|627118|pir||A53057 retinal-binding protein - Japanese flying squid >gi|545383|gb|AAB29891.1| (S68871) retinal-binding protein, RALBP [Todarodes pacificus=squid, eyes, Peptide, 343 aa] Length = 343 Frame 3 hits (HSPs): ______ ________________ Annotated Domains: 
_________________________________________________ __________________________________________________ Database sequence: | | | | 343 0 150 300 __________________ Annotated Domains: DOMO DM00869: CELLULARRETINALDEHYDE-BINDINGPR 1..208 Entrez acetylation site 2 PFAM CRAL_TRIO: CRAL/TRIO domain. 1..185 PRODOM PD002025: SC14(7) CRAL(2) TTPA(2) 3..178 PRODOM PD038870: O17907(1) O76054(1) RALB(1) 183..332 __________________ Plus Strand HSPs: Score = 70 (24.6 bits), Expect = 0.49, Sum P(2) = 0.39 Identities = 27/105 (25%), Positives = 49/105 (46%), Frame = +3 Query: 108 YGG-LSREAEQEFTSAYPVTE-FTIKPATKHSVEFPV-SEKSHLVWEIRVVGWDVSYGAE 278 +GG + +E E T + E T+ K VE+ + +E +++ WE + D+ +G Sbjct: 200 HGGEVPKEFYLENTDDFETMETITVGSGDKIYVEYEIENENTYIKWEYKTEEHDIGFGL- 258 Query: 279 FVPSAEDGYTVIVHKSRKIAPADETVLT--NGFKIGEPGKIVLTIDNQTS 422 F + ++ V+ I D +++T K +PG L DN S Sbjct: 259 FRKNGDEWEEVV-----PIERTDCSIMTLDGSHKCKDPGTYALCFDNSFS 303 Score = 60 (21.1 bits), Expect = 0.49, Sum P(2) = 0.39 Identities = 16/39 (41%), Positives = 21/39 (53%), Frame = +3 Query: 24 KSK-FVFAGPSKSADTLFRYIAPELVPVQYGGLSREAEQE 140 K+K FV G K DTL YI E +P GG E +++ Sbjct: 156 KNKIFVLGGDYK--DTLLEYIDAEELPAYLGGTKSEGDEK 193 >gi|5918866|gb|AAD56166.1| (AF154480) NADH dehydrogenase subunit 2 [Toxostoma crissale] Length = 85 Frame 1 hits (HSPs): ________________________________ __________________________________________________ Database sequence: | | | | | | 85 0 20 40 60 80 Plus Strand HSPs: Score = 62 (21.8 bits), Expect = 5.8, P = 1.0 Identities = 16/54 (29%), Positives = 26/54 (48%), Frame = +1 Query: 169 LLLNPLPNILLS-SLFLRKAILFGKSEWWVGMSAMELNLCPALRMDTLS*YTRA 327 LL+NP ++ + SL L I + W + +E+N L M + S + RA Sbjct: 8 LLMNPQAKLVFTTSLLLGSTITISSNHWITAWAGLEINTLAVLPMISKSHHPRA 61 >gi|10946319|gb|AAG24854.1| (AF299064) NADH dehydrogenase subunit 2 [Gymnotus pantherinus] Length = 149 Frame -1 hits (HSPs): ___________________ __________________________________________________ Database sequence: | | | | 149 0 50 100 Minus Strand HSPs: Score 
= 75 (26.4 bits), Expect = 9.4, P = 1.0 Identities = 20/55 (36%), Positives = 30/55 (54%), Frame = -1 Query: 427 FLDVWLSMVSTILPGSPILKPLVRTVSSAGAIFLL-LC---TMTVYPSSALGTNSAP 269 F+ WL + + GSPIL ++ + G F L LC T+T+YP++ G AP Sbjct: 72 FMPKWLILQELTMQGSPILATIMALTALLGLYFYLRLCYTMTLTIYPNT--GYAPAP 126

Parameters:
  filter=none
  matrix=BLOSUM62
  V=50
  B=50
  E=10
  gi
  H=1
  sort_by_pvalue
  echofilter

  ctxfactor=5.99

  Query                        -----  As Used  -----    -----  Computed  ----
  Frame  MatID Matrix name     Lambda    K       H      Lambda    K       H
   Std.    0   BLOSUM62                                 0.318   0.135   0.401
   +3      0   BLOSUM62        0.318   0.135   0.401    0.325   0.140   0.429
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +2      0   BLOSUM62        0.318   0.135   0.401    0.356   0.158   0.664
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +1      0   BLOSUM62        0.318   0.135   0.401    0.332   0.143   0.442
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -1      0   BLOSUM62        0.318   0.135   0.401    0.352   0.153   0.511
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -2      0   BLOSUM62        0.318   0.135   0.401    0.365   0.161   0.726
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -3      0   BLOSUM62        0.318   0.135   0.401    0.323   0.140   0.414
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a

  Query
  Frame  MatID  Length  Eff.Length     E    S  W   T   X   E2     S2
   +3      0      181       180       10.  76  3  12  22  0.11    34
                                                       31  0.12    37
   +2      0      182       181       10.  76  3  12  22  0.12    34
                                                       31  0.12    37
   +1      0      182       181       10.  76  3  12  22  0.12    34
                                                       31  0.12    37
   -1      0      182       181       10.  76  3  12  22  0.12    34
                                                       31  0.12    37
   -2      0      182       181       10.  76  3  12  22  0.12    34
                                                       31  0.12    37
   -3      0      181       181       10.  76  3  12  22  0.12    34
                                                       31  0.12    37

Statistics:

  Database:  /usr/local/dot5/sl_home/beauty/seqdb/blast/nr
    Title:  nr
    Release date:  unknown
    Posted date:  4:06 PM CST Feb 28, 2001
    Format:  BLAST
  # of letters in database:  197,782,623
  # of sequences in database:  625,274
  # of database sequences satisfying E:  11
  No. of states in DFA:  594 (59 KB)
  Total size of DFA:  222 KB (256 KB)
  Time to generate neighborhood:  0.02u 0.00s 0.02t  Elapsed: 00:00:00
  No. of threads or processors used:  6
  Search cpu time:  198.98u 0.86s 199.84t  Elapsed: 00:00:47
  Total cpu time:  199.01u 0.88s 199.89t  Elapsed: 00:00:47
  Start:  Mon Feb 4 15:50:27 2002   End:  Mon Feb 4 15:51:14 2002
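
The numbers in this report can be cross-checked with two simple conversions. The sketch below is an illustration, not a description of WU-BLAST internals: the function names are hypothetical, and both formulas are assumptions chosen because they reproduce the figures printed above. The bit scores appear consistent with Lambda * S / ln 2 using the Lambda of 0.244 from the Q=9,R=2 rows of the parameter table, and the P values appear consistent with P = 1 - exp(-E).

```python
import math

# Lambda from the "Q=9,R=2" rows of the Parameters table in this report.
GAPPED_LAMBDA = 0.244


def bit_score(raw_score, lam=GAPPED_LAMBDA):
    """Convert a raw score to bits, assuming bits = lambda * S / ln 2."""
    return lam * raw_score / math.log(2)


def p_from_expect(expect):
    """Convert an Expect value to a probability, assuming P = 1 - exp(-E)."""
    return -math.expm1(-expect)


if __name__ == "__main__":
    # Raw scores and Expect values taken from the alignments above.
    for raw in (683, 480, 461, 70):
        print(raw, round(bit_score(raw), 1))
    for expect in (0.49, 5.8, 9.4):
        print(expect, p_from_expect(expect))
```

For example, a raw score of 683 converts to about 240.4 bits and an Expect of 0.49 to a P of about 0.39, matching the first alignment and the retinal-binding protein hit respectively.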
Annotated Domains Database: March 14, 2000
Release Date: March 14, 2000