WU-BLAST 2.0 search of the National Center for Biotechnology Information's NR Protein Database.
BEAUTY post-processing provided by the Human Genome Sequencing Center, Baylor College of Medicine.
BEAUTY Reference:
Worley KC, Culpepper P, Wiese BA, Smith RF. BEAUTY-X: enhanced BLAST searches for DNA queries. Bioinformatics 1998;14(10):890-1.
Worley KC, Wiese BA, Smith RF. BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results. Genome Res 1995 Sep;5(2):173-84.
RepeatMasker Server unavailable.

Reference: Gish, Warren (1994-1997). unpublished.
Gish, Warren and David J. States (1993). Identification of protein coding regions by database similarity search. Nat. Genet. 3:266-72.

Notice: statistical significance is estimated under the assumption that the equivalent of one entire reading frame in the query sequence codes for protein and that significant alignments will involve only coding reading frames.
Query= B05H08.seq(1>613) (571 letters)
Translating both strands of query sequence in all 6 reading frames

Database: nr
          625,274 sequences; 197,782,623 total letters.

     Observed Numbers of Database Sequences Satisfying
    Various EXPECTation Thresholds (E parameter values)

        Histogram units:      = 2 Sequences     : less than 2 sequences

 EXPECTation Threshold
 (E parameter)
    |
    V   Observed Counts-->
  10000 481  99 |=================================================
   6310 382  45 |======================
   3980 337  87 |===========================================
   2510 250  82 |=========================================
   1580 168  67 |=================================
   1000 101  38 |===================
    631  63  24 |============
    398  39  16 |========
    251  23   5 |==
    158  18   2 |=
    100  16   3 |=
   63.1  13   2 |=
   39.8  11   1 |:
   25.1  10   1 |:
   15.8   9   0 |
          >>>>>>>>>>>>>>>>>>>>>  Expect = 10.0, Observed = 9  <<<<<<<<<<<<<<<<<
   10.0   9   0 |
   6.31   9   0 |
   3.98   9   1 |:
   2.51   8   0 |
   1.58   8   0 |
   1.00   8   0 |
   0.63   8   0 |
   0.40   8   1 |:

                                                                      Smallest
                                                                        Sum
                                                         Reading High Probability
Sequences producing High-scoring Segment Pairs:           Frame Score  P(N)     N

gi|6714305|gb|AAF26001.1|AC013354_20(AC013354) F15H18...   +2    645  8.3e-61   1
gi|10176982|dbj|BAB10214.1|(AB010077) contains simila...   +2    518  9.6e-49   1
gi|11357884|pir||T49947hypothetical protein F8M21.10 ...   +2    461  1.1e-42   1
gi|4512685|gb|AAD21739.1|(AC006931) hypothetical prot...   +2    440  1.8e-40   1
gi|9558428|dbj|BAB03364.1|(AP002486) ESTs AU069374(C6...   +2    439  2.3e-40   1
gi|11358151|pir||T49150hypothetical protein T20N10.20...   +2    414  1.0e-37   1
gi|7295160|gb|AAF50485.1|(AE003556) CG7550 gene produ...   +2    126  1.2e-05   1
gi|6679797ref|NP_032042.1| fibroblast growth factor i...   +1     74  0.25      1
gi|5817316|gb|AAD52701.1|AF091540_1(AF091540) cystein...   +2     83  0.93      1
>gi|6714305|gb|AAF26001.1|AC013354_20 (AC013354) F15H18.4 [Arabidopsis thaliana]
        Length = 1702

  Plus Strand HSPs:

 Score = 645 (227.1 bits), Expect = 8.3e-61, P = 8.3e-61
 Identities = 113/177 (63%), Positives = 139/177 (78%), Frame = +2

Query:    14 YVDIHECDSFTMCIFCFPTSSVIPLHDHPGMTVFSKLLYGSLHVKAYDWVEPPCIIESKE 193
             ++DIHECD+FTMCIFCFPTSSVIPLHDHP M VFSK+LYGSLHVKAYDWVEPPCII +
Sbjct:  1526 FLDIHECDTFTMCIFCFPTSSVIPLHDHPEMAVFSKILYGSLHVKAYDWVEPPCIITQDK 1585

Query:   194 --PGYAQVRLAKLAVDKVLNAPCDTSVLYPKHGGNLHCFTAVTPCAMLDILTPPYREEEG 367
             PG RLAKL DKV+ + LYPK GGNLHCFTA+TPCA+LDIL+PPY+E G
Sbjct:  1586 GVPGSLPARLAKLVSDKVITPQSEIPALYPKTGGNLHCFTALTPCAVLDILSPPYKESVG 1645

Query:   368 RRCTYYHDYPYSAFSVANA--PICDGEEEEYAWLTELESPSDLYMRQGVYAGPAIQL 532
             R C+YY DYP+S F++ N + +G+E+EYAWL ++++P DL+MR G Y GP I++
Sbjct:  1646 RSCSYYMDYPFSTFALENGMKKVDEGKEDEYAWLVQIDTPDDLHMRPGSYTGPTIRV 1702

>gi|10176982|dbj|BAB10214.1| (AB010077) contains similarity to unknown protein~emb|CAB89322.1~gene_id:MYH19.8 [Arabidopsis thaliana]
        Length = 270

  Plus Strand HSPs:

 Score = 518 (182.3 bits), Expect = 9.6e-49, P = 9.6e-49
 Identities = 94/172 (54%), Positives = 123/172 (71%), Frame = +2

Query:    14 YVDIHECDSFTMCIFCFPTSSVIPLHDHPGMTVFSKLLYGSLHVKAYDWVEPPCIIESKE 193
             Y+ I+ C F++CIFC P S VIPLH+HP MTVFSKLL+G++H+K+YDWV P +S +
Sbjct:   103 YLHIYACHRFSICIFCLPPSGVIPLHNHPEMTVFSKLLFGTMHIKSYDWV--P---DSPQ 157

Query:   194 PGYAQVRLAKLAVDKVLNAPCDTSVLYPKHGGNLHCFTAVTPCAMLDILTPPYREEEGRR 373
             P + RLAK+ VD APCDTS+LYP GGN+HCFTA T CA+LD++ PPY + GR
Sbjct:   158 PS-SDTRLAKVKVDSDFTAPCDTSILYPADGGNMHCFTAKTACAVLDVIGPPYSDPAGRH 216

Query:   374 CTYYHDYPYSAFSVANAPICDGEEEEYAWLTELES-PSDLYMRQGVYAGPAIQ 529
             CTYY DYP+S+FSV + + E+E YAWL E E P DL + +Y+GP I+
Sbjct:   217 CTYYFDYPFSSFSVDGVVVAEEEKEGYAWLKEREEKPEDLTVTALMYSGPTIK 269

>gi|11357884|pir||T49947 hypothetical protein F8M21.10 - Arabidopsis thaliana
>gi|7671481|emb|CAB89322.1| (AL353993) putative protein [Arabidopsis thaliana]
        Length = 293

  Plus Strand HSPs:

 Score = 461 (162.3 bits), Expect = 1.1e-42, P = 1.1e-42
 Identities = 87/172 (50%), Positives = 118/172 (68%), Frame = +2

Query:    14 YVDIHECDSFTMCIFCFPTSSVIPLHDHPGMTVFSKLLYGSLHVKAYDWVEPPCIIESKE 193
             Y+ +H+CD F++ IFC P S VIPLH+HPGMTVFSKLL+G++H+K+YDWV + +SK
Sbjct:   123 YLHLHQCDQFSIGIFCLPPSGVIPLHNHPGMTVFSKLLFGTMHIKSYDWVVDAPMRDSK- 181

Query:   194 PGYAQVRLAKLAVDKVLNAPCDTSVLYPKHGGNLHCFTAVTPCAMLDILTPPYREEEGRR 373
             RLAKL VD APC+ S+LYP+ GGN+H FTA+T CA+LD+L PPY EGR
Sbjct:   182 -----TRLAKLKVDSTFTAPCNASILYPEDGGNMHRFTAITACAVLDVLGPPYCNPEGRH 236

Query:   374 CTYYHDYPYSAFSVANAPICDGEEEE--YAWLTELE-SPSDLYMRQG-VYAGPAIQ 529
             CTY+ ++P S + + EEE+ YAWL E + +P D G +Y GP ++
Sbjct:   237 CTYFLEFPLDKLSSEDDDVLSSEEEKEGYAWLQERDDNPEDHTNVVGALYRGPKVE 292

>gi|4512685|gb|AAD21739.1| (AC006931) hypothetical protein [Arabidopsis thaliana]
        Length = 242

  Plus Strand HSPs:

 Score = 440 (154.9 bits), Expect = 1.8e-40, P = 1.8e-40
 Identities = 84/172 (48%), Positives = 110/172 (63%), Frame = +2

Query:    14 YVDIHECDSFTMCIFCFPTSSVIPLHDHPGMTVFSKLLYGSLHVKAYDWVEPPCIIESKE 193
             Y+ +HECDSF++ IFC P SS+IPLH+HPGMTV SKL+YGS+HVK+YDW+EP + E ++
Sbjct:    73 YLHLHECDSFSIGIFCMPPSSMIPLHNHPGMTVLSKLVYGSMHVKSYDWLEPQ-LTEPED 131

Query:   194 PGYAQVRLAKLAVDKVLNAPCDTSVLYPKHGGNLHCFTAVTPCAMLDILTPPYREEEGRR 373
             P + R AKL D + A + LYPK GGN+HCF A+T CA+LDIL PPY E R
Sbjct:   132 PSQ-EARPAKLVKDTEMTAQSPVTTLYPKSGGNIHCFKAITHCAILDILAPPYSSEHDRH 190

Query:   374 CTYYHDYPYSAFSVANAPICDGEE-EEYAWLTELESPSDLYMRQGVYAGPAIQ 529
             CTY+ + DGE + WL E + P D +R+ Y GP I+
Sbjct:   191 CTYFRKSRRE--DLPGELEVDGEVVTDVTWLEEFQPPDDFVIRRIPYRGPVIR 241

>gi|9558428|dbj|BAB03364.1| (AP002486) ESTs AU069374(C61044),D24451(R1944), AU031820(R1944) correspond to a region of the predicted gene.~Similar to Arabidopsis thaliana DNA chromosome 5, BAC clone F8M21; putative protein (AL353993) [Oryza sativa]
        Length = 246

  Plus Strand HSPs:

 Score = 439 (154.5 bits), Expect = 2.3e-40, P = 2.3e-40
 Identities = 85/171 (49%), Positives = 112/171 (65%), Frame = +2

Query:    14 YVDIHECDSFTMCIFCFPTSSVIPLHDHPGMTVFSKLLYGSLHVKAYDWVEPPCIIESKE 193
             Y+ ++EC++F++ IFC P VIPLH+HP MTVFSKLL+G L VK+YDW + +S +
Sbjct:    77 YLHLYECEAFSIGIFCLPPRGVIPLHNHPNMTVFSKLLFGELRVKSYDWADASQ--DSTD 134

Query:   194 PGYAQVRLAKLAVDKVLNAPCDTSVLYPKHGGNLHCFTAVTPCAMLDILTPPYREEEGRR 373
             RLAK+ VD LNAPC TSVLYP+ GGNLHCFTA T CA+LD+L PPY + GR
Sbjct:   135 AQLQGARLAKVKVDGTLNAPCATSVLYPEDGGNLHCFTAHTACAVLDVLGPPYDDGSGRH 194

Query:   374 CTYYHDYPYSAFSVANAPICDGEEEEYAWLTELESPSDLYMRQGVYAGPAI 526
             C +Y+ SA S ++ G++ YAWL E E P + ++ Y GP I
Sbjct:   195 CQHYN-VSSSAPSAGDSKPLPGDDG-YAWLEECEPPDNFHLVGSTYMGPRI 243

>gi|11358151|pir||T49150 hypothetical protein T20N10.20 - Arabidopsis thaliana
>gi|7630062|emb|CAB88284.1| (AL353032) putative protein [Arabidopsis thaliana]
        Length = 242

  Plus Strand HSPs:

 Score = 414 (145.7 bits), Expect = 1.0e-37, P = 1.0e-37
 Identities = 79/172 (45%), Positives = 104/172 (60%), Frame = +2

Query:    14 YVDIHECDSFTMCIFCFPTSSVIPLHDHPGMTVFSKLLYGSLHVKAYDWVEPPCIIESKE 193
             Y+ +HECDSF++ IFC P S+IPLH+HPGMTV SKL+YGS+HVK+YDW EP E +
Sbjct:    73 YLQLHECDSFSIGIFCMPPGSIIPLHNHPGMTVLSKLVYGSMHVKSYDWAEPDQS-ELDD 131

Query:   194 PGYAQVRLAKLAVDKVLNAPCDTSVLYPKHGGNLHCFTAVTPCAMLDILTPPYREEEGRR 373
             P Q R AKL D + +P + LYP GGN+HCF A+T CA+ DIL+PPY GR
Sbjct:   132 P--LQARPAKLVKDIDMTSPSPATTLYPTTGGNIHCFKAITHCAIFDILSPPYSSTHGRH 189

Query:   374 CTYYHDYPYSAFSVANAPICDGEE-EEYAWLTELESPSDLYMRQGVYAGPAIQ 529
             C Y+ P + +GE WL E + P + + + Y GP I+
Sbjct:   190 CNYFRKSPMLDLP-GEIEVMNGEVISNVTWLEEYQPPDNFVIWRVPYRGPVIR 241

>gi|7295160|gb|AAF50485.1| (AE003556) CG7550 gene product [Drosophila melanogaster]
        Length = 240

  Plus Strand HSPs:

 Score = 126 (44.4 bits), Expect = 1.2e-05, P = 1.2e-05
 Identities = 41/129 (31%), Positives = 65/129 (50%), Frame = +2

Query:    11 AYVDIHECDSFTMCIFCFPTSSVIPLHDHPGMTVFSKLLYGSLHVKAYDWVEPPCIIESK 190
             +Y+ I E D F+M +F +S IPLHDHP M + ++G L V ++ P +
Sbjct:    61 SYMHIFEDDRFSMSLFIVRGASTIPLHDHPMMFGLLRCIWGQLMVDSFSHQLGPDEPLTY 120

Query:   191 EPGYAQVRLAKLAVDKVLN--APCDTSVLYPKHGGNLHCFTAVTP--CAMLDILTPPYRE 358
             +P V++ + K++ +PC T L P+ N H + A DIL+PPY
Sbjct:   121 DPHQTVVKV-NVEEPKLVTPASPCAT--LTPRKR-NYHQIAQIGSGVAAFFDILSPPYDA 176

Query:   359 EE---G-RRCTYY 385
             + G R+C +Y
Sbjct:   177 DMPTYGPRQCRFY 189

>gi|6679797 ref|NP_032042.1| fibroblast growth factor inducible 15 [Mus musculus]
>gi|2498381|sp|Q61075|FI15_MOUSE FIBROBLAST GROWTH FACTOR INDUCIBLE PROTEIN 15 (FIN15)
>gi|1353707|gb|AAB08866.1| (U42384) FIN15 gene product [Mus musculus]
        Length = 87

  Plus Strand HSPs:

 Score = 74 (26.0 bits), Expect = 0.29, P = 0.25
 Identities = 19/54 (35%), Positives = 28/54 (51%), Frame = +1

Query:   265 SFISQTRWKSALFHSSDTLCHAR--HSHTSLQRRGRKEVYILS*LSLFSI-LSC 417
             SFI WK+A +++ +C H+HT LQ Y + SLFS+ +SC
Sbjct:     7 SFIIHKFWKNATYYTCSFVCVCMDIHTHTVLQNELFMYTYFRTAFSLFSVKISC 60

>gi|5817316|gb|AAD52701.1|AF091540_1 (AF091540) cysteine dioxygenase [Schistosoma japonicum]
        Length = 212

  Plus Strand HSPs:

 Score = 83 (29.2 bits), Expect = 2.7, P = 0.93
 Identities = 26/104 (25%), Positives = 43/104 (41%), Frame = +2

Query:    41 FTMCIFCFPTSSVIPLHDHPGMTVFSKLLYGSLHVKAYDWVEPPCIIESKEPGYAQVRLA 220
             + + + C+ +HDH G F KL+ G + ++W + +E Q+ L
Sbjct:    79 YNLFLLCWSEDQGTRIHDHSGAHCFVKLIKGCIKETIFEWPKY-FTVEKSNYSINQIDLP 137

Query:   221 KLAVDKVLNA-PCDTSVLYPKHG-GNLHCFTAVTPCAMLDILTPPY 352
             L V V P D + ++ K G LH + L + PPY
Sbjct:   138 -LTVKSVSEMRPGDVTYMHDKIGIHRLHNPSTTETAITLHLYFPPY 182

Parameters:
  filter=none
  matrix=BLOSUM62
  V=50
  B=50
  E=10
  gi
  H=1
  sort_by_pvalue
  echofilter

  ctxfactor=5.99

  Query                        -----  As Used  -----    -----  Computed  ----
  Frame  MatID Matrix name     Lambda    K       H      Lambda    K       H
   Std.    0   BLOSUM62                                 0.318   0.135   0.401
   +3      0   BLOSUM62        0.318   0.135   0.401    0.368   0.163   0.557
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +2      0   BLOSUM62        0.318   0.135   0.401    0.325   0.142   0.467
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +1      0   BLOSUM62        0.318   0.135   0.401    0.368   0.161   0.641
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -1      0   BLOSUM62        0.318   0.135   0.401    0.366   0.162   0.623
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -2      0   BLOSUM62        0.318   0.135   0.401    0.353   0.151   0.528
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -3      0   BLOSUM62        0.318   0.135   0.401    0.358   0.159   0.580
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a

  Query
  Frame  MatID  Length  Eff.Length     E    S  W   T   X   E2     S2
   +3      0      189       188       10.  76  3  12  22  0.12    34
                                                      31  0.12    37
   +2      0      190       189       10.  76  3  12  22  0.12    34
                                                      31  0.12    37
   +1      0      190       189       10.  76  3  12  22  0.12    34
                                                      31  0.12    37
   -1      0      190       189       10.  76  3  12  22  0.12    34
                                                      31  0.12    37
   -2      0      190       189       10.  76  3  12  22  0.12    34
                                                      31  0.12    37
   -3      0      189       189       10.  76  3  12  22  0.12    34
                                                      31  0.12    37

Statistics:

  Database:  /usr/local/dot5/sl_home/beauty/seqdb/blast/nr
    Title:  nr
    Release date:  unknown
    Posted date:  4:06 PM CST Feb 28, 2001
    Format:  BLAST
  # of letters in database:  197,782,623
  # of sequences in database:  625,274
  # of database sequences satisfying E:  9
  No. of states in DFA:  597 (59 KB)
  Total size of DFA:  221 KB (256 KB)
  Time to generate neighborhood:  0.01u 0.01s 0.02t  Elapsed: 00:00:00
  No. of threads or processors used:  6
  Search cpu time:  183.37u 1.06s 184.43t  Elapsed: 00:00:32
  Total cpu time:  183.39u 1.10s 184.49t  Elapsed: 00:00:32
  Start:  Sat Feb  2 05:24:35 2002   End:  Sat Feb  2 05:25:07 2002
Annotated Domains Database: March 14, 2000
Release Date: March 14, 2000
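
Note on Expect and P: the Expect and P values reported for each HSP above appear consistent with the usual Poisson relation P = 1 - exp(-E) (for example, Expect = 0.29 with P = 0.25, and Expect = 2.7 with P = 0.93). The following is a minimal Python sketch of that conversion, assuming the Poisson relation holds for this report; the function names are hypothetical and are not part of WU-BLAST or BEAUTY.

```python
import math


def expect_to_p(expect: float) -> float:
    """Convert an Expect value to a P value, assuming P = 1 - exp(-E)."""
    # -expm1(-E) equals 1 - exp(-E) but stays accurate for very small E.
    return -math.expm1(-expect)


def p_to_expect(p: float) -> float:
    """Invert the relation: E = -ln(1 - P), valid for 0 <= P < 1."""
    # log1p(-P) equals ln(1 - P) but stays accurate for very small P.
    return -math.log1p(-p)


if __name__ == "__main__":
    # Expect values copied from three HSPs in the report above.
    for e in (8.3e-61, 0.29, 2.7):
        print(f"Expect = {e:g} -> P = {expect_to_p(e):.2g}")
```

For very small values the two quantities are numerically indistinguishable, which is why the strongest hits list the same number for Expect and P; they only diverge as Expect approaches and exceeds 1.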