BLASTP 2.2.17 [Aug-26-2007]
Reference:
Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schäffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Reference for composition-based statistics:
Schäffer, Alejandro A., L. Aravind, Thomas L. Madden,
Sergei Shavirin, John L. Spouge, Yuri I. Wolf,
Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids Res. 29:2994-3005.
Query= PI0382
(178 letters)
Database: nr
5,470,121 sequences; 1,894,087,724 total letters
Searching..................................................done
Score E
Sequences producing significant alignments: (bits) Value
gi|89207496|ref|ZP_01186036.1| hypothetical protein BcerKBA... 94 4e-18
gi|118743698|ref|ZP_01591700.1| Cupin 2, conserved barrel [... 91 2e-17
gi|123458079|ref|XP_001316523.1| conserved hypothetical pro... 83 7e-15
gi|123493654|ref|XP_001326338.1| conserved hypothetical pro... 83 8e-15
gi|123458083|ref|XP_001316524.1| conserved hypothetical pro... 81 3e-14
gi|149175943|ref|ZP_01854560.1| hypothetical protein PM8797... 73 8e-12
gi|60683536|ref|YP_213680.1| hypothetical protein BF4112 [B... 71 2e-11
gi|53715586|ref|YP_101578.1| hypothetical protein BF4304 [B... 70 4e-11
gi|109671802|ref|ZP_01374048.1| cupin domain protein [Campy... 69 1e-10
gi|154418823|ref|XP_001582429.1| conserved hypothetical pro... 69 1e-10
gi|156862718|gb|EDO56149.1| hypothetical protein BACUNI_001... 68 2e-10
gi|153811529|ref|ZP_01964197.1| hypothetical protein RUMOBE... 66 9e-10
gi|139438123|ref|ZP_01771676.1| Hypothetical protein COLAER... 65 1e-09
gi|52550374|gb|AAU84223.1| conserved hypothetical protein [... 57 4e-07
gi|119493978|ref|ZP_01624537.1| TonB box-like protein [Lyng... 56 9e-07
gi|17231239|ref|NP_487787.1| hypothetical protein all3747 [... 54 3e-06
gi|75907801|ref|YP_322097.1| TonB box-like [Anabaena variab... 54 3e-06
gi|15669814|ref|NP_248628.1| hypothetical protein MJ1618 [M... 54 5e-06
>gi|89207496|ref|ZP_01186036.1| hypothetical protein BcerKBAB4DRAFT_1514 [Bacillus
weihenstephanensis KBAB4]
gi|89154557|gb|EAR74586.1| hypothetical protein BcerKBAB4DRAFT_1514 [Bacillus
weihenstephanensis KBAB4]
Length = 151
Score = 94.0 bits (232), Expect = 4e-18, Method: Composition-based stats.
Identities = 48/142 (33%), Positives = 83/142 (58%), Gaps = 5/142 (3%)
Query: 37 KGENFTTVNVSKLNEIKEYELAMGN-FSIPGKMFAGHALQATGAELSFQSLAAGQDYGTR 95
+G+NF+ V V K ++ +Y+ GN F PGK+F LQ TG E+S G
Sbjct: 10 EGKNFSAVQVGKWEDLLQYK--NGNPFKAPGKVFLKDELQCTGMEVSLNVFPPGISMPFY 67
Query: 96 HIHKTHEELYFILKGEGIFDVDGKRFPVSEGSIVRIAPNGKRAFKNTGSSEMLVLCVQYK 155
H H+ +EELY +KG+G F +D + + EG+++R+A G+R ++N S + +C+Q +
Sbjct: 68 HKHRENEELYIFVKGQGEFKIDDEILEIKEGTVIRVAQEGERIWRNNSSEPLYFICIQAR 127
Query: 156 ANSFSDDDEPLKDAIMLEANVK 177
AN+ + + ++D I L+ +V+
Sbjct: 128 ANTLEESN--IEDGIKLDKSVE 147
>gi|118743698|ref|ZP_01591700.1| Cupin 2, conserved barrel [Geobacter lovleyi SZ]
gi|118683441|gb|EAV89838.1| Cupin 2, conserved barrel [Geobacter lovleyi SZ]
Length = 141
Score = 91.3 bits (225), Expect = 2e-17, Method: Composition-based stats.
Identities = 43/95 (45%), Positives = 60/95 (63%), Gaps = 1/95 (1%)
Query: 66 GKMFAGHALQATGAELSFQSLAAGQDYGTRHIHKTHEELYFILKGEGIFDVDGKRFPVSE 125
GK F G L TG E+S L AG+ H HK +EELY +L+G G+F VDG FP+ E
Sbjct: 29 GKYFIGKELGLTGCEVSLNRLPAGKAMPFIHSHKKNEELYIVLRGSGMFYVDGDEFPIQE 88
Query: 126 GSIVRIAPNGKRAFKNTGSSEMLVLCVQYKANSFS 160
GS+VR+AP G+R +K G ++ +C+Q +A S +
Sbjct: 89 GSLVRVAPEGERGWK-AGDEDLYFICIQAEAGSLT 122
>gi|123458079|ref|XP_001316523.1| conserved hypothetical protein [Trichomonas vaginalis G3]
gi|121899232|gb|EAY04300.1| conserved hypothetical protein [Trichomonas vaginalis G3]
Length = 198
Score = 83.2 bits (204), Expect = 7e-15, Method: Composition-based stats.
Identities = 49/139 (35%), Positives = 76/139 (54%), Gaps = 10/139 (7%)
Query: 33 KIVEKGENFTTVNVSKLNEIKEYELAMGNFSIPGKMFAGHALQATGAELSFQSLAAGQDY 92
KIV + E +T V + K +E+++YE + K F AL + E+SF +G+
Sbjct: 3 KIVNE-EKYTAVEIGKYSELEKYEHS--------KAFLHDALNLSSMEISFTKYKSGEAI 53
Query: 93 GTRHIHKTHEELYFILKGEGIFDVDGKRFPVSEGSIVRIAPNGKRAFKNTGSSEMLVLCV 152
H HK HEE+Y ++ G G +D K P+ EG+ +R+AP R KNTG ++M+V+C
Sbjct: 54 PFFHDHKNHEEVYVVISGSGEMQLDEKIIPMKEGTSIRVAPGVSRNLKNTGETDMIVMCA 113
Query: 153 QYKANSFSDD-DEPLKDAI 170
Q + +S +E KD I
Sbjct: 114 QAEKDSLKTPLNEDYKDRI 132
>gi|123493654|ref|XP_001326338.1| conserved hypothetical protein [Trichomonas vaginalis G3]
gi|123493658|ref|XP_001326339.1| conserved hypothetical protein [Trichomonas vaginalis G3]
gi|121909251|gb|EAY14115.1| conserved hypothetical protein [Trichomonas vaginalis G3]
gi|121909252|gb|EAY14116.1| conserved hypothetical protein [Trichomonas vaginalis G3]
Length = 141
Score = 82.8 bits (203), Expect = 8e-15, Method: Composition-based stats.
Identities = 52/148 (35%), Positives = 75/148 (50%), Gaps = 12/148 (8%)
Query: 32 IKIVEKGENFTTVNVSKLNEIKEYELAMGNFSIPGKMFAGHALQATGAELSFQSLAAGQD 91
I V K +N+ V K + + ++E GK F G AL T E+S G
Sbjct: 3 IHTVAKADNYVVGEVGKFSGLDKFE--------NGKAFLGQALGLTSMEVSITKFKPGTA 54
Query: 92 YGTRHIHKTHEELYFILKGEGIFDVDGKRFPVSEGSIVRIAPNGKRAFKNTGSSEMLVLC 151
H HK HEE+Y I+ G G F ++ K VSEGS+VRI+P R KNTG ++++V+C
Sbjct: 55 VPFFHDHKNHEEVYIIVSGAGEFQLNDKVVKVSEGSVVRISPGVSRNIKNTGKTDLIVIC 114
Query: 152 VQYKANSFSDDDEPL-KDAIMLEANVKL 178
Q + +S P +D IM + + K
Sbjct: 115 AQAERDSIK---APFTEDFIMTKTDAKF 139
>gi|123458083|ref|XP_001316524.1| conserved hypothetical protein [Trichomonas vaginalis G3]
gi|121899233|gb|EAY04301.1| conserved hypothetical protein [Trichomonas vaginalis G3]
Length = 155
Score = 80.9 bits (198), Expect = 3e-14, Method: Composition-based stats.
Identities = 42/129 (32%), Positives = 70/129 (54%), Gaps = 8/129 (6%)
Query: 31 EIKIVEKGENFTTVNVSKLNEIKEYELAMGNFSIPGKMFAGHALQATGAELSFQSLAAGQ 90
E++ + E +T V + + +++YE + K F +AL T E+SF G+
Sbjct: 16 EMEKIASAEKYTAVEIGSFSGLEKYEHS--------KAFLQNALGLTSMEISFTKYKPGE 67
Query: 91 DYGTRHIHKTHEELYFILKGEGIFDVDGKRFPVSEGSIVRIAPNGKRAFKNTGSSEMLVL 150
H HK HEE+Y ++ G G +D K P+ EG+ +R+AP R KNTG+++M+V+
Sbjct: 68 AIPFFHDHKKHEEVYIVITGTGEMQLDDKIIPMKEGTSIRVAPGVSRNLKNTGNTDMIVM 127
Query: 151 CVQYKANSF 159
C Q + +S
Sbjct: 128 CAQAEVDSL 136
>gi|149175943|ref|ZP_01854560.1| hypothetical protein PM8797T_03790 [Planctomyces maris DSM 8797]
gi|148845097|gb|EDL59443.1| hypothetical protein PM8797T_03790 [Planctomyces maris DSM 8797]
Length = 164
Score = 72.8 bits (177), Expect = 8e-12, Method: Composition-based stats.
Identities = 33/93 (35%), Positives = 54/93 (58%)
Query: 66 GKMFAGHALQATGAELSFQSLAAGQDYGTRHIHKTHEELYFILKGEGIFDVDGKRFPVSE 125
GK F L + G E+S +L AG++ H HK ++E+YF+++G+G F F VSE
Sbjct: 41 GKYFLRKYLNSDGLEMSINTLPAGREMPFVHRHKENDEIYFVIQGQGQFQAGTDVFDVSE 100
Query: 126 GSIVRIAPNGKRAFKNTGSSEMLVLCVQYKANS 158
G +R++P R ++N + L +QY+A+S
Sbjct: 101 GFFIRLSPEVPRVWRNNSEEPLYYLVIQYRADS 133
>gi|60683536|ref|YP_213680.1| hypothetical protein BF4112 [Bacteroides fragilis NCTC 9343]
gi|60494970|emb|CAH09786.1| conserved hypothetical protein [Bacteroides fragilis NCTC 9343]
Length = 117
Score = 71.2 bits (173), Expect = 2e-11, Method: Composition-based stats.
Identities = 36/87 (41%), Positives = 49/87 (56%)
Query: 73 ALQATGAELSFQSLAAGQDYGTRHIHKTHEELYFILKGEGIFDVDGKRFPVSEGSIVRIA 132
+L TGAE+S L AG H HK +EE+Y IL G+G +DG++ + G +RIA
Sbjct: 20 SLALTGAEVSINHLPAGAGVPFVHSHKQNEEIYGILSGKGFITIDGEKIELQAGDWLRIA 79
Query: 133 PNGKRAFKNTGSSEMLVLCVQYKANSF 159
P+GKR S + LC+Q KA S
Sbjct: 80 PDGKRQISAASDSPIGFLCIQVKAGSL 106
>gi|53715586|ref|YP_101578.1| hypothetical protein BF4304 [Bacteroides fragilis YCH46]
gi|52218451|dbj|BAD51044.1| conserved hypothetical protein [Bacteroides fragilis YCH46]
Length = 117
Score = 70.5 bits (171), Expect = 4e-11, Method: Composition-based stats.
Identities = 35/87 (40%), Positives = 49/87 (56%)
Query: 73 ALQATGAELSFQSLAAGQDYGTRHIHKTHEELYFILKGEGIFDVDGKRFPVSEGSIVRIA 132
+L TGAE+S L AG H HK +EE+Y IL G+G +DG++ + G +RIA
Sbjct: 20 SLALTGAEVSINHLPAGAGVPFVHSHKQNEEIYGILSGKGFITIDGEKIELQAGDWLRIA 79
Query: 133 PNGKRAFKNTGSSEMLVLCVQYKANSF 159
P+GKR S + +C+Q KA S
Sbjct: 80 PDGKRQISAASDSPIGFICIQVKAGSL 106
>gi|109671802|ref|ZP_01374048.1| cupin domain protein [Campylobacter concisus 13826]
gi|112800630|gb|EAT97974.1| mate efflux family protein [Campylobacter concisus 13826]
Length = 124
Score = 68.9 bits (167), Expect = 1e-10, Method: Composition-based stats.
Identities = 38/87 (43%), Positives = 50/87 (57%), Gaps = 1/87 (1%)
Query: 73 ALQATGAELSFQSLAAGQDYGTRHIHKTHEELYFILKGEGIFDVDGKRFPVSEGSIVRIA 132
AL TG E+S LAA H HK +EELY I G+G +DG+ VS+G VRI
Sbjct: 20 ALNLTGCEVSINELAANVSVPFVHAHKQNEELYIITDGDGELFIDGEVIKVSKGDAVRID 79
Query: 133 PNGKRAFKNTGSSEMLVLCVQYKANSF 159
P+GKR FK G + + ++C+Q K S
Sbjct: 80 PDGKRCFK-AGKNGIKMICIQTKRGSL 105
>gi|154418823|ref|XP_001582429.1| conserved hypothetical protein [Trichomonas vaginalis G3]
gi|121916664|gb|EAY21443.1| conserved hypothetical protein [Trichomonas vaginalis G3]
Length = 100
Score = 68.9 bits (167), Expect = 1e-10, Method: Composition-based stats.
Identities = 31/80 (38%), Positives = 48/80 (60%)
Query: 80 ELSFQSLAAGQDYGTRHIHKTHEELYFILKGEGIFDVDGKRFPVSEGSIVRIAPNGKRAF 139
E+SF G+ H HK HEE+Y ++ G G +D P+ EG+ +R+AP R
Sbjct: 2 EISFTKYKPGEAIPFFHDHKKHEEVYIVITGSGEMQLDETIIPMKEGTSIRVAPGVSRNL 61
Query: 140 KNTGSSEMLVLCVQYKANSF 159
KNTG+++M+V+C Q +A+S
Sbjct: 62 KNTGNTDMIVMCAQAEADSL 81
>gi|156862718|gb|EDO56149.1| hypothetical protein BACUNI_00179 [Bacteroides uniformis ATCC 8492]
Length = 118
Score = 67.8 bits (164), Expect = 2e-10, Method: Composition-based stats.
Identities = 34/87 (39%), Positives = 48/87 (55%)
Query: 73 ALQATGAELSFQSLAAGQDYGTRHIHKTHEELYFILKGEGIFDVDGKRFPVSEGSIVRIA 132
+L TGAE+S +L AG H HK +EE+Y +L G G VDG+ + G +R+A
Sbjct: 21 SLNLTGAEVSISNLPAGAGVPFVHSHKQNEEIYAVLSGRGTMVVDGETVELQAGDWLRVA 80
Query: 133 PNGKRAFKNTGSSEMLVLCVQYKANSF 159
P G+R S + V+C+Q KA S
Sbjct: 81 PAGERQLSAAADSAISVICIQVKAGSL 107
>gi|153811529|ref|ZP_01964197.1| hypothetical protein RUMOBE_01921 [Ruminococcus obeum ATCC 29174]
gi|149832270|gb|EDM87355.1| hypothetical protein RUMOBE_01921 [Ruminococcus obeum ATCC 29174]
Length = 117
Score = 66.2 bits (160), Expect = 9e-10, Method: Composition-based stats.
Identities = 41/120 (34%), Positives = 58/120 (48%), Gaps = 16/120 (13%)
Query: 40 NFTTVNVSKLNEIKEYELAMGNFSIPGKMFAGHALQATGAELSFQSLAAGQDYGTRHIHK 99
N+T + K N I+ +E L TGAE+S L AG + H HK
Sbjct: 3 NYTKTTIGKENRIELHE----------------KLSLTGAEISLNELPAGANVPFVHSHK 46
Query: 100 THEELYFILKGEGIFDVDGKRFPVSEGSIVRIAPNGKRAFKNTGSSEMLVLCVQYKANSF 159
+EE+Y IL G G +DG+ +S G ++IAP KR F + S + +C+Q K NS
Sbjct: 47 ENEEIYGILSGNGKAIIDGEEISLSTGDWLKIAPAAKRQFFASDISGITYICIQVKENSL 106
>gi|139438123|ref|ZP_01771676.1| Hypothetical protein COLAER_00664 [Collinsella aerofaciens ATCC
25986]
gi|133776320|gb|EBA40140.1| Hypothetical protein COLAER_00664 [Collinsella aerofaciens ATCC
25986]
Length = 117
Score = 65.5 bits (158), Expect = 1e-09, Method: Composition-based stats.
Identities = 34/93 (36%), Positives = 52/93 (55%), Gaps = 3/93 (3%)
Query: 74 LQATGAELSFQSLAAGQDYGTRHIHKTHEELYFILKGEGIFDVDGKRFPVSEGSIVRIAP 133
L TGAE+S +L AG H+HK +EE+Y +L+G G +DG+ + G +RI+P
Sbjct: 21 LGLTGAEVSVNNLPAGAGVPFVHVHKENEEIYGVLEGTGSVTIDGEDIELGAGDWLRISP 80
Query: 134 NGKRAFKNTGSSEMLVLCVQYKA---NSFSDDD 163
R F+ S + +C+Q K N+F+ DD
Sbjct: 81 AAHRQFRAASDSGITYVCIQVKQGSLNAFTADD 113
>gi|52550374|gb|AAU84223.1| conserved hypothetical protein [uncultured archaeon GZfos3D4]
Length = 113
Score = 57.0 bits (136), Expect = 4e-07, Method: Composition-based stats.
Identities = 31/72 (43%), Positives = 41/72 (56%), Gaps = 4/72 (5%)
Query: 96 HIHKTHEELYFILKGEGIFDVDGKRFPVSEGSIVRIAPNGKRAFKNTGSSEMLVLCVQYK 155
H HKT EE+Y+ILKG+GI +++GKR VSEG V I P K T L C
Sbjct: 46 HFHKTAEEIYYILKGKGIMEIEGKRREVSEGDTVVIVPEKKHRIFATKKIRFLCFC---- 101
Query: 156 ANSFSDDDEPLK 167
+ +SD+D L+
Sbjct: 102 SPPYSDEDTVLE 113
>gi|119493978|ref|ZP_01624537.1| TonB box-like protein [Lyngbya sp. PCC 8106]
gi|119452266|gb|EAW33463.1| TonB box-like protein [Lyngbya sp. PCC 8106]
Length = 157
Score = 55.8 bits (133), Expect = 9e-07, Method: Composition-based stats.
Identities = 22/64 (34%), Positives = 37/64 (57%)
Query: 98 HKTHEELYFILKGEGIFDVDGKRFPVSEGSIVRIAPNGKRAFKNTGSSEMLVLCVQYKAN 157
H+ E++FILKGEG+ DGK+ + G+ + + P G +NTGS+ + LC+
Sbjct: 60 HQLAVEMFFILKGEGVVSCDGKQVKIKAGNSILVPPTGTHMIENTGSTRLYALCIMVPNE 119
Query: 158 SFSD 161
F++
Sbjct: 120 DFAE 123
>gi|17231239|ref|NP_487787.1| hypothetical protein all3747 [Nostoc sp. PCC 7120]
gi|17132881|dbj|BAB75446.1| all3747 [Nostoc sp. PCC 7120]
Length = 151
Score = 54.3 bits (129), Expect = 3e-06, Method: Composition-based stats.
Identities = 23/64 (35%), Positives = 36/64 (56%)
Query: 98 HKTHEELYFILKGEGIFDVDGKRFPVSEGSIVRIAPNGKRAFKNTGSSEMLVLCVQYKAN 157
H+ E++F++KGEGI DGK+ P+ G + + P G KNTGSS + L +
Sbjct: 61 HQWAVEMFFVIKGEGIAMCDGKKVPIKAGDSLLVPPTGTHLIKNTGSSRLYTLTIMVPNE 120
Query: 158 SFSD 161
F++
Sbjct: 121 DFAE 124
>gi|75907801|ref|YP_322097.1| TonB box-like [Anabaena variabilis ATCC 29413]
gi|75701526|gb|ABA21202.1| TonB box-like [Anabaena variabilis ATCC 29413]
Length = 151
Score = 54.3 bits (129), Expect = 3e-06, Method: Composition-based stats.
Identities = 23/64 (35%), Positives = 36/64 (56%)
Query: 98 HKTHEELYFILKGEGIFDVDGKRFPVSEGSIVRIAPNGKRAFKNTGSSEMLVLCVQYKAN 157
H+ E++F++KGEGI DGK+ P+ G + + P G KNTGSS + L +
Sbjct: 61 HQWAVEMFFVIKGEGIAMCDGKKVPIKAGDSLLVPPTGTHLIKNTGSSRLYTLTIMVPNE 120
Query: 158 SFSD 161
F++
Sbjct: 121 DFAE 124
>gi|15669814|ref|NP_248628.1| hypothetical protein MJ1618 [Methanocaldococcus jannaschii DSM
2661]
gi|42559937|sp|Q59013|Y1618_METJA Uncharacterized protein MJ1618
gi|1592216|gb|AAB99639.1| conserved hypothetical protein [Methanocaldococcus jannaschii DSM
2661]
Length = 125
Score = 53.5 bits (127), Expect = 5e-06, Method: Composition-based stats.
Identities = 23/68 (33%), Positives = 37/68 (54%)
Query: 96 HIHKTHEELYFILKGEGIFDVDGKRFPVSEGSIVRIAPNGKRAFKNTGSSEMLVLCVQYK 155
H H T EE+Y+IL+G G+ +D ++F V +G + I P +N G+ + +LC Y
Sbjct: 55 HKHYTSEEIYYILEGRGLMTLDNEKFEVKKGDTIYIPPKTPHKIENIGNVPLKILCCSYP 114
Query: 156 ANSFSDDD 163
S D +
Sbjct: 115 PYSHEDTE 122
Database: nr
Posted date: Sep 17, 2007 11:41 AM
Number of letters in database: 999,999,834
Number of sequences in database: 2,976,859
Database: /nucleus1/users/jsaw/ncbi/db/nr.01
Posted date: Sep 17, 2007 11:48 AM
Number of letters in database: 894,087,890
Number of sequences in database: 2,493,262
Lambda K H
0.320 0.136 0.382
Gapped
Lambda K H
0.267 0.0410 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Hits to DB: 610,550,025
Number of Sequences: 5470121
Number of extensions: 23914450
Number of successful extensions: 58275
Number of sequences better than 1.0e-05: 19
Number of HSP's better than 0.0 without gapping: 18
Number of HSP's successfully gapped in prelim test: 1
Number of HSP's that attempted gapping in prelim test: 58257
Number of HSP's gapped (non-prelim): 19
length of query: 178
length of database: 1,894,087,724
effective HSP length: 124
effective length of query: 54
effective length of database: 1,215,792,720
effective search space: 65652806880
effective search space used: 65652806880
T: 11
A: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 125 (52.8 bits)
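
The length and search-space figures in the trailer above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the BLAST output: it assumes the effective lengths are obtained by subtracting the effective HSP length once from the query and once per database sequence, and that the second Lambda/K pair (0.267, 0.0410) holds the gapped parameters used for the reported bit scores. The variable and function names are illustrative only.

```python
import math

# Values copied from the report trailer.
query_len = 178
db_letters = 1_894_087_724
db_sequences = 5_470_121
hsp_len = 124              # "effective HSP length"
lam, K = 0.267, 0.0410     # second Lambda/K pair; assumed to be the gapped values

# Effective lengths: subtract the HSP length correction
# once for the query and once for every database sequence.
eff_query = query_len - hsp_len
eff_db = db_letters - db_sequences * hsp_len
search_space = eff_query * eff_db

def bit_score(raw_score):
    """Convert a raw alignment score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(K)) / math.log(2)

print(eff_query, eff_db, search_space)
print(round(bit_score(232), 1), round(bit_score(225), 1))
```

Under these assumptions the computed lengths agree with the "effective length of query", "effective length of database", and "effective search space" lines, and the raw scores 232 and 225 convert to the 94.0 and 91.3 bits reported for the first two hits. The Expect values are not recomputed here, because composition-based statistics adjust them per subject sequence.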