BLASTP 2.2.17 [Aug-26-2007]
Reference:
Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schäffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Reference for composition-based statistics:
Schäffer, Alejandro A., L. Aravind, Thomas L. Madden,
Sergei Shavirin, John L. Spouge, Yuri I. Wolf,
Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids Res. 29:2994-3005.
Query= TF0253
(518 letters)
Database: nr
5,470,121 sequences; 1,894,087,724 total letters
Searching..................................................done
Sequences producing significant alignments: Score (bits) E Value
gi|114563525|ref|YP_751038.1| YD repeat protein [Shewanella... 118 1e-24
gi|37676636|ref|NP_937032.1| Rhs family protein [Vibrio vul... 109 4e-22
gi|50122336|ref|YP_051503.1| hypothetical protein ECA3412 [... 79 6e-13
gi|87308256|ref|ZP_01090397.1| hypothetical protein DSM3645... 67 3e-09
gi|134103084|ref|YP_001108745.1| hypothetical protein SACE_... 66 6e-09
gi|116694202|ref|YP_728413.1| filamentous hemagglutinin / a... 65 2e-08
gi|134099500|ref|YP_001105161.1| hypothetical protein SACE_... 64 2e-08
gi|77974229|ref|ZP_00829770.1| COG3210: Large exoproteins i... 64 4e-08
gi|77956068|ref|ZP_00820225.1| COG5444: Uncharacterized con... 63 4e-08
gi|83309335|ref|YP_419599.1| hypothetical protein amb0236 [... 62 7e-08
gi|148557344|ref|YP_001264926.1| hypothetical protein Swit_... 62 1e-07
gi|46200883|ref|ZP_00056285.2| hypothetical protein Magn030... 60 3e-07
gi|145232521|ref|XP_001399704.1| hypothetical protein An02g... 60 5e-07
gi|111221834|ref|YP_712628.1| hypothetical protein FRAAL240... 58 2e-06
gi|150017491|ref|YP_001309745.1| hypothetical protein Cbei_... 57 2e-06
gi|16799138|ref|NP_469406.1| hypothetical protein lin0059 [... 55 8e-06
>gi|114563525|ref|YP_751038.1| YD repeat protein [Shewanella frigidimarina NCIMB 400]
gi|114334818|gb|ABI72200.1| YD repeat protein [Shewanella frigidimarina NCIMB 400]
Length = 1494
Score = 118 bits (295), Expect = 1e-24, Method: Composition-based stats.
Identities = 61/163 (37%), Positives = 96/163 (58%), Gaps = 9/163 (5%)
Query: 356 RKPNFDDFVRSIDDSFPSDEIARKAFNLFENDEWGKLEELFKQYNINGGWPPNRGFASSR 415
++ ++ F ++ F S A + + ++ W +LE+L G WPPNRGF S
Sbjct: 1332 KENGWNTFQKNSKGIFSSVTQASQGYKFWKAQNWPELEDLLGP----GSWPPNRGFVSIE 1387
Query: 416 TITLSPGFEFDRYGGRINRKTGKFEDAGSFIADKETPYGYRSLPSGYEEKPLNSYRVKEP 475
T+ L+ G + DR+GG ++ G+F D G+F++ + + R+LPS + K N+Y V EP
Sbjct: 1388 TVDLNIGSKIDRFGGFLD-SNGEFRDYGTFVSPEGNSFTSRALPSSTKNKDYNAYEVIEP 1446
Query: 476 IQGVQQGEAIPWFGQEGGGIQYEIPASEGGIDGLLNSGKIERR 518
++ V G AIPWFGQ GGG QYE+P+S I+ L + GKI+ +
Sbjct: 1447 LK-VNSGPAIPWFGQPGGGTQYELPSS---IENLKDKGKIKEK 1485
>gi|37676636|ref|NP_937032.1| Rhs family protein [Vibrio vulnificus YJ016]
gi|37201179|dbj|BAC97002.1| Rhs family protein [Vibrio vulnificus YJ016]
Length = 1498
Score = 109 bits (273), Expect = 4e-22, Method: Composition-based stats.
Identities = 59/137 (43%), Positives = 75/137 (54%), Gaps = 10/137 (7%)
Query: 379 KAFNLFENDEWGKLEELFKQYNINGGWPPNRGFASSRTITLSPGFEFDRYGGRINRKTGK 438
KA++ ++ EW KLE L G WPP RGF PG FDR+GGR K G
Sbjct: 1362 KAYSAWKKQEWQKLESLLPI----GDWPPYRGFVKRSEGVFEPGMLFDRFGGRF--KNGV 1415
Query: 439 FEDAGSFIADKETPYGYRSLPSGYEEKPLNSYRVKEPIQGVQQGEAIPWFGQEGGGIQYE 498
F D GSF++ TP+ R LP P + Y+V +PI+G G AIPWF QEG G+Q+E
Sbjct: 1416 FSDGGSFVSPAGTPFTSRGLPDSTLHAPKSVYQVLKPIKG-HHGPAIPWFKQEGMGVQWE 1474
Query: 499 IPASEGGIDGLLNSGKI 515
+ ID LL +G I
Sbjct: 1475 L---NNNIDYLLKNGYI 1488
>gi|50122336|ref|YP_051503.1| hypothetical protein ECA3412 [Erwinia carotovora subsp. atroseptica
SCRI1043]
gi|49612862|emb|CAG76312.1| hypothetical protein [Erwinia carotovora subsp. atroseptica
SCRI1043]
Length = 123
Score = 79.3 bits (194), Expect = 6e-13, Method: Composition-based stats.
Identities = 50/137 (36%), Positives = 68/137 (49%), Gaps = 15/137 (10%)
Query: 383 LFENDEWGKLEELFKQYNINGGWPPNRGF-ASSRTITLSPGFEFDRYGGRINRKTGKFED 441
+ EN E + ++ + WPPNRGF S TL PG DRYG +
Sbjct: 1 MLENIEASRRARESSNFSDHNEWPPNRGFDGDSSKFTLLPGSTIDRYG----------TE 50
Query: 442 AGSFIADKETPYGYRSLPSGYEEKPLNSYRVKEPIQGVQQGEAIPWFGQEGGGIQYEIPA 501
G+F + + PY R+L G + +P N Y V +PI V G+ WFG EGGG QYE+P
Sbjct: 51 YGTFASPEGVPYRDRALKPGSDGRPYNVYEVVKPIT-VDAGDIKGWFGYEGGGTQYELPD 109
Query: 502 SEGGIDGLLNSGKIERR 518
I L+ G ++RR
Sbjct: 110 K---IINLVRDGSLKRR 123
>gi|87308256|ref|ZP_01090397.1| hypothetical protein DSM3645_11836 [Blastopirellula marina DSM
3645]
gi|87288813|gb|EAQ80706.1| hypothetical protein DSM3645_11836 [Blastopirellula marina DSM
3645]
Length = 245
Score = 67.0 bits (162), Expect = 3e-09, Method: Composition-based stats.
Identities = 42/121 (34%), Positives = 64/121 (52%), Gaps = 17/121 (14%)
Query: 400 NINGG--WPPNRGFASS-RTITLSPGFEFDRYGGRINRKTGKFEDAGSFIADKETPYGYR 456
N NG WPPN GFA + + ++ G + +RYG GSF+A P
Sbjct: 62 NPNGSIRWPPNEGFAGKPQEVKIAIGVQLERYG----------YPGGSFVAPLGEPAPEL 111
Query: 457 SLPSGYEEKPLNSYRVKEPIQGVQQGEAIPWFGQEGGGIQYEIPASEGGIDGLLNSGKIE 516
S+ G +KP + Y+V +P+ ++ G+A PWF Q GGG+QY++ S I+ L G +E
Sbjct: 112 SMAPGTIDKPYHVYKVLKPLPALE-GKAAPWFDQPGGGVQYDLVKS---IEHWLERGYLE 167
Query: 517 R 517
+
Sbjct: 168 Q 168
>gi|134103084|ref|YP_001108745.1| hypothetical protein SACE_6654 [Saccharopolyspora erythraea NRRL
2338]
gi|133915707|emb|CAM05820.1| hypothetical protein [Saccharopolyspora erythraea NRRL 2338]
Length = 904
Score = 66.2 bits (160), Expect = 6e-09, Method: Composition-based stats.
Identities = 37/100 (37%), Positives = 52/100 (52%), Gaps = 11/100 (11%)
Query: 403 GGWPPNRGFASSRTITLSPGFEFDRYGGRINRKTGKFEDAGSFIADKETPYGYRSLPSGY 462
GG PP + R + L PG + DR+G + G+ TPY RSLP +
Sbjct: 791 GGEPPLTLYRDRRHVVLQPGTDLDRFG----------DPNGNVAYAIRTPYTQRSLPPQW 840
Query: 463 EEKPLNSYRVKEPIQGVQQGEAIPWFGQEGGGIQYEIPAS 502
+ +YRV+ P+Q V +G A+PWF Q GGG Y +PA+
Sbjct: 841 ANRAYFAYRVQRPVQ-VLRGTAVPWFEQPGGGTAYVLPAA 879
>gi|116694202|ref|YP_728413.1| filamentous hemagglutinin / adhesin [Ralstonia eutropha H16]
gi|113528701|emb|CAJ95048.1| filamentous hemagglutinin / adhesin [Ralstonia eutropha H16]
Length = 2790
Score = 64.7 bits (156), Expect = 2e-08, Method: Composition-based stats.
Identities = 42/115 (36%), Positives = 54/115 (46%), Gaps = 16/115 (13%)
Query: 405 WPPNRGF--ASSRTITLSPGFEFDRYGGRINRKTGKFEDAGSFIADKETPYGYRSLPSGY 462
WPPN GF + TL PG DRYG G+F+A+ Y R+L G
Sbjct: 2688 WPPNNGFVPGTGERYTLFPGQFVDRYGSV----------QGTFVAEAGASYRSRALRPGS 2737
Query: 463 EEKPLNSYRVKEPIQGVQQGEAIPWFGQEGGGIQYEIPASEGGIDGLLNSGKIER 517
+ P + Y V PI+ V G PWFG EG G QY+ S + L+N G + R
Sbjct: 2738 DSAPYHVYEVTRPIE-VTAGPTRPWFGYEGMGTQYKFDES---VRNLINRGALRR 2788
>gi|134099500|ref|YP_001105161.1| hypothetical protein SACE_2958 [Saccharopolyspora erythraea NRRL
2338]
gi|133912123|emb|CAM02236.1| hypothetical protein [Saccharopolyspora erythraea NRRL 2338]
Length = 223
Score = 63.9 bits (154), Expect = 2e-08, Method: Composition-based stats.
Identities = 47/145 (32%), Positives = 66/145 (45%), Gaps = 14/145 (9%)
Query: 354 LFRKPNFDDFVRSIDDSFPSDEIARKAFNLFENDEWGKLEELFKQYNINGGWPPNRGF-A 412
L +P FD + P+DE+A + F + WG ++E Q +P N GF
Sbjct: 60 LSSRPPFDRLFTGYER--PTDEVACREFL----ETWGYVDE---QGRKRWTYPDNNGFEG 110
Query: 413 SSRTITLSPGFEFDRYGGRINRKTGKFEDAGSFIADKETPYGYRSLPSGYEEKPLNSYRV 472
R I + PG DR+G N G AG+ ++ P + PSG + + Y V
Sbjct: 111 EPRRIAIEPGMLLDRFG---NSTGGFLAPAGALYRERSLPPTNLNTPSGGPQHNYHVYEV 167
Query: 473 KEPIQGVQQGEAIPWFGQEGGGIQY 497
+P V G A WFGQ GGG+QY
Sbjct: 168 LQPFD-VDAGPAAAWFGQPGGGLQY 191
>gi|77974229|ref|ZP_00829770.1| COG3210: Large exoproteins involved in heme utilization or adhesion
[Yersinia frederiksenii ATCC 33641]
Length = 386
Score = 63.5 bits (153), Expect = 4e-08, Method: Composition-based stats.
Identities = 42/100 (42%), Positives = 50/100 (50%), Gaps = 14/100 (14%)
Query: 400 NINGGW--PPNRGFASSRTITLSP-GFEFDRYGGRINRKTGKFEDAGSFIADKETPYGYR 456
N NGGW P N GF T P G DRYG E GSF+A K TPY R
Sbjct: 274 NANGGWDWPKNLGFEGDPVKTTIPVGTRLDRYG----------EPNGSFLAPKGTPYEQR 323
Query: 457 SLPSGYEEKPLNSYRVKEPIQGVQQGEAIPWFGQEGGGIQ 496
+L G + + Y V +P+ + QGE P FGQ GGG+Q
Sbjct: 324 ALAPGAKAEKYYEYEVIKPLPAI-QGEIAPAFGQPGGGVQ 362
>gi|77956068|ref|ZP_00820225.1| COG5444: Uncharacterized conserved protein [Yersinia bercovieri
ATCC 43970]
Length = 302
Score = 63.2 bits (152), Expect = 4e-08, Method: Composition-based stats.
Identities = 42/100 (42%), Positives = 50/100 (50%), Gaps = 14/100 (14%)
Query: 400 NINGGW--PPNRGFASSRTITLSP-GFEFDRYGGRINRKTGKFEDAGSFIADKETPYGYR 456
N NGGW P N GF T P G DRYG E GSF+A K TPY R
Sbjct: 190 NANGGWDWPKNLGFEGDPVKTTIPVGARLDRYG----------EPNGSFLAPKGTPYEQR 239
Query: 457 SLPSGYEEKPLNSYRVKEPIQGVQQGEAIPWFGQEGGGIQ 496
+L G + + Y V +P+ + QGE P FGQ GGG+Q
Sbjct: 240 ALAPGAKAEKYYEYEVIKPLPTI-QGEIAPAFGQPGGGVQ 278
>gi|83309335|ref|YP_419599.1| hypothetical protein amb0236 [Magnetospirillum magneticum AMB-1]
gi|82944176|dbj|BAE49040.1| hypothetical protein [Magnetospirillum magneticum AMB-1]
Length = 156
Score = 62.4 bits (150), Expect = 7e-08, Method: Composition-based stats.
Identities = 40/125 (32%), Positives = 63/125 (50%), Gaps = 14/125 (11%)
Query: 396 FKQYNINGGWPPNRGFASSRT-ITLSPGFEFDRYGGRINRKTGKFEDAGSFIADKETPYG 454
+K N N WPPN G A + T + L+PG DR+G + G++ + + T +
Sbjct: 38 WKDGNGNLRWPPNDGAAGAITPVVLAPGMILDRFGC----------EGGNYFSPRGTAFA 87
Query: 455 YRSLPSGYEEKPLNSYRVKEPIQGVQQGEAIPWFGQEGGGIQYEIPASEGGI--DGLLNS 512
R+LP P +YRV P+ +A PWF Q+GG Q++ AS + DG++ +
Sbjct: 88 ARALPYVCATAPYYTYRVTRPLLA-WTAKAAPWFDQKGGATQFQTDASVAQLLADGVIEA 146
Query: 513 GKIER 517
K +R
Sbjct: 147 VKADR 151
>gi|148557344|ref|YP_001264926.1| hypothetical protein Swit_4450 [Sphingomonas wittichii RW1]
gi|148502534|gb|ABQ70788.1| hypothetical protein Swit_4450 [Sphingomonas wittichii RW1]
Length = 324
Score = 62.0 bits (149), Expect = 1e-07, Method: Composition-based stats.
Identities = 42/117 (35%), Positives = 61/117 (52%), Gaps = 14/117 (11%)
Query: 405 WPPNRGFASSRTITLSPGFEFDRYG--GRINRKTGKFEDAGSFIADKETPYGYRSLPSGY 462
WP N GF S + + FDRYG GR+ K +DAG F A TPY R++
Sbjct: 216 WPSNNGFLYSLKGEIPVKYRFDRYGDEGRLGDK----DDAGRFTAPLGTPYEQRAVGHRP 271
Query: 463 EEKPLNSYR--VKEPIQGVQQGEAIPWFGQEGGGIQYEIPASEGGIDGLLNSGKIER 517
+++ ++Y V P++G G A+PW GQ GG Q+ P + + +L +GKI R
Sbjct: 272 DDQAYHAYESMVGLPVEG---GPAVPWLGQPGGAWQFRHPEN---MTTMLANGKIRR 322
>gi|46200883|ref|ZP_00056285.2| hypothetical protein Magn03010983 [Magnetospirillum magnetotacticum
MS-1]
Length = 154
Score = 60.5 bits (145), Expect = 3e-07, Method: Composition-based stats.
Identities = 39/113 (34%), Positives = 55/113 (48%), Gaps = 15/113 (13%)
Query: 405 WPPNRGFASSRT-ITLSPGFEFDRYGGRINRKTGKFEDAGSFIADKETPYGYRSLPSGYE 463
WPPN G A + T + L+PG DRYG + G+F + + + R+LP
Sbjct: 46 WPPNDGAAGAITPVVLTPGMVIDRYGC----------EWGNFFSPRGAAFAARALPYVCA 95
Query: 464 EKPLNSYRVKEPIQGVQQGEAIPWFGQEGGGIQYEIPASEGGIDGLLNSGKIE 516
P +YRV P+ +A PWF Q+GG Q++ AS + LL G IE
Sbjct: 96 TSPYFTYRVVRPVVA-WTAKAAPWFDQKGGATQFQTDAS---VSQLLADGVIE 144
>gi|145232521|ref|XP_001399704.1| hypothetical protein An02g05680 [Aspergillus niger]
gi|134056621|emb|CAK47696.1| unnamed protein product [Aspergillus niger]
Length = 261
Score = 59.7 bits (143), Expect = 5e-07, Method: Composition-based stats.
Identities = 42/120 (35%), Positives = 60/120 (50%), Gaps = 22/120 (18%)
Query: 405 WPPNRGF-ASSRTITLSPGFEFDRYGGRINRKTGKFEDAGSFIADKETPYGYRSLPSGYE 463
+ PN GF +T+TL G DR+G + GS++A + TPY RS+ G
Sbjct: 128 YAPNNGFNGCPQTVTLPVGTLVDRFG----------TENGSYLAPEGTPYAERSIGPGSL 177
Query: 464 EKPLNS-------YRVKEPIQGVQQGEAIPWFGQEGGGIQYEIPASEGGIDGLLNSGKIE 516
K S Y V++P Q Q G +PW Q GGG QY + +GG+ L ++GK+E
Sbjct: 178 NKYNKSTEFNYWKYIVRQPFQA-QAGSILPWASQPGGGQQYYV---KGGLAPLRDAGKLE 233
>gi|111221834|ref|YP_712628.1| hypothetical protein FRAAL2406 [Frankia alni ACN14a]
gi|111149366|emb|CAJ61055.1| Hypothetical protein [Frankia alni ACN14a]
Length = 189
Score = 57.8 bits (138), Expect = 2e-06, Method: Composition-based stats.
Identities = 40/111 (36%), Positives = 50/111 (45%), Gaps = 24/111 (21%)
Query: 405 WPPNRGFASSRTIT-------LSPGFEFDRYGGRINRKTGKFEDAGSFIADKETPYGYRS 457
+PP GF T L PG E DRYG + G F+A +TPY R+
Sbjct: 66 YPPQDGFVLRTDATPQKAPTDLRPGQEIDRYGA----------EGGRFLAPDDTPYARRA 115
Query: 458 LP-SGYEEKP-----LNSYRVKEPIQGVQQGEAIPWFGQEGGGIQYEIPAS 502
+P S P + YRV P V G PWFGQ GGG+QY++ S
Sbjct: 116 IPPSNLVGVPAAACDYHEYRVLRPFT-VWGGPIAPWFGQPGGGVQYQLDGS 165
>gi|150017491|ref|YP_001309745.1| hypothetical protein Cbei_2635 [Clostridium beijerinckii NCIMB
8052]
gi|149903956|gb|ABR34789.1| hypothetical protein Cbei_2635 [Clostridium beijerinckii NCIMB
8052]
Length = 868
Score = 57.4 bits (137), Expect = 2e-06, Method: Composition-based stats.
Identities = 49/151 (32%), Positives = 65/151 (43%), Gaps = 24/151 (15%)
Query: 372 PSDEIARKAFNLFENDEWGKLEELFKQYNINGGWPPNRGFASS--RTITLSPGFEFDRYG 429
P D + K +++ N E + Q N N WP N GFA T L G DRYG
Sbjct: 724 PDDWLYLKYKDVYNN------ELYYNQANGNLNWPINNGFAEEFPGTEVLDEGMLVDRYG 777
Query: 430 GRINRKTGKFEDAGSFIADKETPYGYRSLPSGYEEKPLNSYRVKEPIQGVQQGEAIPWFG 489
E G+F A Y R+L E Y+V + ++ V G+A PWFG
Sbjct: 778 ----------ESYGNFFAPATDLYDTRALAPHSETANHYFYKVSQSVE-VTSGKAAPWFG 826
Query: 490 QEGGGIQYEIPASEGG----IDGLLNSGKIE 516
GG Q+ I E G ID L++ G +E
Sbjct: 827 SRGGARQF-IKYHENGKLYSIDELIDEGYLE 856
>gi|16799138|ref|NP_469406.1| hypothetical protein lin0059 [Listeria innocua Clip11262]
gi|16412480|emb|CAC95292.1| lin0059 [Listeria innocua]
Length = 577
Score = 55.5 bits (132), Expect = 8e-06, Method: Composition-based stats.
Identities = 43/132 (32%), Positives = 59/132 (44%), Gaps = 24/132 (18%)
Query: 369 DSFPSDEIARKAFNLFENDEWGKLEELFKQYNINGG---WPPNRGFASSRTITLSP-GFE 424
D PS E+ +K +++N K YN G WPPN GF L+ G
Sbjct: 440 DYPPSKELFKKYEEVYKNP---------KYYNQETGAINWPPNNGFIGETHERLANVGEF 490
Query: 425 FDRYGGRINRKTGKFEDAGSFIADKETPYGYRSLPSGYEEKPLNSYRVKEPIQGVQQGEA 484
FDRYG E +G F+A + R+L E YRV++P + + +GE
Sbjct: 491 FDRYG----------EPSGEFLAKSGFSFEERALAPHSETSIYYKYRVEKPFK-IIEGET 539
Query: 485 IPWFGQEGGGIQ 496
PWF Q+GG Q
Sbjct: 540 APWFDQKGGATQ 551
Database: nr
Posted date: Sep 17, 2007 11:41 AM
Number of letters in database: 999,999,834
Number of sequences in database: 2,976,859
Database: /nucleus1/users/jsaw/ncbi/db/nr.01
Posted date: Sep 17, 2007 11:48 AM
Number of letters in database: 894,087,890
Number of sequences in database: 2,493,262
Lambda K H
0.316 0.135 0.402
Gapped
Lambda K H
0.267 0.0410 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Hits to DB: 1,959,257,806
Number of Sequences: 5470121
Number of extensions: 87281930
Number of successful extensions: 223021
Number of sequences better than 1.0e-05: 16
Number of HSP's better than 0.0 without gapping: 0
Number of HSP's successfully gapped in prelim test: 16
Number of HSP's that attempted gapping in prelim test: 222988
Number of HSP's gapped (non-prelim): 20
length of query: 518
length of database: 1,894,087,724
effective HSP length: 137
effective length of query: 381
effective length of database: 1,144,681,147
effective search space: 436123517007
effective search space used: 436123517007
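The effective lengths and search space reported above can be reproduced from the query length, database size, sequence count, and effective HSP length. The Python sketch below is illustrative only: the function name effective_search_space is hypothetical and not part of BLAST, and the formulas (subtract the HSP length once from the query and once per database sequence) are an assumption that happens to match the printed values, not a transcription of the BLAST source.

```python
def effective_search_space(query_len, db_len, num_seqs, hsp_len):
    """Return (effective query length, effective database length, search space).

    Assumed length adjustment: the effective HSP length is removed once from
    the query and once from every sequence in the database.
    """
    eff_query = query_len - hsp_len
    eff_db = db_len - num_seqs * hsp_len
    return eff_query, eff_db, eff_query * eff_db


# Values taken from the statistics block of this report.
eff_query, eff_db, space = effective_search_space(
    query_len=518,
    db_len=1_894_087_724,
    num_seqs=5_470_121,
    hsp_len=137,
)
print(eff_query, eff_db, space)  # 381 1144681147 436123517007
```

The three printed numbers agree with the "effective length of query", "effective length of database", and "effective search space" lines in the report.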
T: 11
A: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 132 (55.5 bits)
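The bit values in parentheses after X1, X2, X3, S1 and S2 can be checked against the two Lambda/K pairs printed earlier in this report. The sketch below uses the standard Karlin-Altschul normalization, bits = (lambda * raw - ln K) / ln 2 for score thresholds and lambda * raw / ln 2 for drop-off values. It assumes the first pair (0.316, 0.135) is the ungapped set and the second (0.267, 0.0410) is the gapped set, and the pairing of each parameter with a set is inferred from which one reproduces the printed figure. The helper names are hypothetical.

```python
import math

UNGAPPED = (0.316, 0.135)   # first Lambda, K pair in the report (assumed ungapped)
GAPPED = (0.267, 0.0410)    # second Lambda, K pair in the report (assumed gapped)


def bit_score(raw, lam, k):
    """Convert a raw score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw - math.log(k)) / math.log(2)


def dropoff_bits(raw, lam):
    """Convert a raw X drop-off to bits: lambda * X / ln 2 (no K term)."""
    return lam * raw / math.log(2)


print("X1", round(dropoff_bits(16, UNGAPPED[0]), 1))
print("X2", round(dropoff_bits(38, GAPPED[0]), 1))
print("X3", round(dropoff_bits(64, GAPPED[0]), 1))
print("S1", round(bit_score(41, *UNGAPPED), 1))
print("S2", round(bit_score(132, *GAPPED), 1))
```

Under these assumptions the results land on the printed 7.3, 14.6, 24.7, 21.6 and 55.5 bits. Scores in the alignment section were computed with composition-based statistics, so they need not follow exactly the same constants.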