MiST Information
Here you will find a brief description of the MiST database and how to use it to its full extent. Follow the links below to navigate quickly to a topic of interest.
- Starting with MiST - how to work with selecting organisms and direct querying
- Taxonomy selector - select organisms based on their taxonomy
- Organism list
- Search for organism name or GI
- Querying multiple organisms
- View organism page
- Genome summary
- Signal transduction profile (legend)
- Querying
- Table of signal transduction proteins by replicon
- View protein page
- Bioinformatic tools used in MiST
- Pfam and SMART domains used to predict signal transduction proteins
Starting with MiST
Generally, exploring MiST first involves selecting one or more organisms to compare or navigating to an organism of interest. There are three major entry points into MiST from the homepage: the taxonomy selector, the organism list, and querying.
- Taxonomy selector
This taxonomically organized approach to finding organisms of interest is best suited for performing queries across multiple organisms. The taxonomy-based selector organizes and displays a list of organisms hierarchically according to their taxonomy. The number of organisms associated with each taxonomic designation is given in parentheses beside its name. Clicking on the 'Show/Hide' link beside a taxonomic designation will reveal the organisms belonging to that particular group.
The taxonomy level represents the depth to which organisms are grouped taxonomically. The current taxonomy level defaults to phyla and is displayed in a green font. You may view the taxonomic tree at a different level by clicking on the desired taxonomic level (e.g. class, order, etc.).
There are two ways to select organisms. The most intuitive method involves displaying the organisms beneath a taxonomic designation by clicking on the 'Show/Hide' link and then clicking the organisms of interest. Once an organism is selected, a check appears in the checkbox beside its name.
The second method selects or deselects whole groups of organisms belonging to a particular taxonomic designation. Clicking on the small gray button beside a taxonomic designation selects (or deselects) all the organisms belonging to this group. For example, to select all Cyanobacteria, make sure the taxonomy level is set to phyla and then push the button beside Cyanobacteria. Clicking the 'Show/Hide' link beside Cyanobacteria should reveal that all cyanobacterial species have a check in the checkbox beside their name.
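The grouping and group-selection behavior described above can be sketched as follows. This is only an illustration, not MiST's actual implementation: the organism records, the function names, and the exact toggle rule (deselect only when every member is already selected) are assumptions.

```python
from collections import defaultdict

# Hypothetical example records: (organism name, lineage by taxonomic level).
ORGANISMS = [
    ("Escherichia coli K-12", {"phylum": "Proteobacteria", "class": "Gammaproteobacteria"}),
    ("Vibrio cholerae", {"phylum": "Proteobacteria", "class": "Gammaproteobacteria"}),
    ("Caulobacter crescentus", {"phylum": "Proteobacteria", "class": "Alphaproteobacteria"}),
    ("Bacillus subtilis", {"phylum": "Firmicutes", "class": "Bacilli"}),
]


def group_by_level(organisms, level="phylum"):
    """Group organism names under their designation at the given taxonomy level."""
    groups = defaultdict(list)
    for name, lineage in organisms:
        groups[lineage[level]].append(name)
    return dict(groups)


def render(groups):
    """Return one label per designation with its organism count in parentheses."""
    return [f"{designation} ({len(names)})" for designation, names in sorted(groups.items())]


def toggle_group(selected, members):
    """Assumed button behavior: deselect the group if fully selected, else select all."""
    members = set(members)
    if members <= selected:
        return selected - members
    return selected | members


if __name__ == "__main__":
    by_phylum = group_by_level(ORGANISMS, "phylum")
    print(render(by_phylum))
    print(sorted(toggle_group(set(), by_phylum["Proteobacteria"])))
```

Switching the `level` argument from `"phylum"` to `"class"` corresponds to clicking a different taxonomy level in the selector.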
After selecting all the organisms of interest to compare, click the button labeled 'Select Organisms' to continue the analysis. For more information about comparing these selected organisms, see the querying multiple organisms section.
- Organism list
The organism list displays the complete list of bacterial and archaeal genomes contained in the MiST database along with some basic information about each (e.g. GC content, number of genes/proteins, etc.). The list is sorted alphanumerically by organism name; however, clicking on a column header label (displayed in reddish-brown) will sort by that column. Click on the name of a given organism to view details about its signal transduction capabilities. To learn more about the details of the selected organism, see the view organism section.
- Querying MiST from the homepage
There are two means of querying MiST from the homepage: 1) by organism name, or 2) by GenBank Identification (GI) number.
To query by organism name, make sure 'Organism' is selected, type part (or all) of the organism name in the adjacent text box, and push the search button. Any organisms that match the given text will be displayed with links to their signal transduction capabilities (see the view organism section). Keywords separated by spaces are interpreted as separate queries, allowing the user to search for multiple organisms simultaneously.
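As a rough illustration of how space-separated keywords act as separate partial-name queries, consider the sketch below. The organism names and the `search_organisms` function are hypothetical, and case-insensitive substring matching is an assumption; the text above only states that part of a name is enough to match.

```python
# Hypothetical list of organism names standing in for the MiST organism table.
ORGANISM_NAMES = [
    "Escherichia coli K-12",
    "Bacillus subtilis",
    "Bacillus anthracis",
    "Vibrio cholerae",
]


def search_organisms(query, names=ORGANISM_NAMES):
    """Run one partial-name search per space-separated keyword."""
    results = {}
    for keyword in query.split():
        needle = keyword.lower()  # assumption: matching ignores case
        results[keyword] = [name for name in names if needle in name.lower()]
    return results


if __name__ == "__main__":
    for keyword, matches in search_organisms("bacillus vibrio").items():
        print(keyword, "->", matches)
```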
To search for proteins in MiST that match a given GI, make sure 'GI number' is selected, type in one or more GI numbers separated by spaces, and push the search button. If a protein within MiST matches the given GI number, a link to this protein will be displayed along with its associated organism. Frequently, the given GI number will not have a direct match in the MiST database, because many GI numbers can identify the same sequence. If this happens, an attempt will be made to find proteins within MiST whose sequence is identical to the sequence represented by the query GI number. Clicking the link to a particular protein will display the View Protein page with various information about that protein. For more information about viewing proteins, see the view protein section.
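The two-step GI lookup described above (direct match first, then identical-sequence fallback) might look like the following sketch. The dictionaries, GI numbers, sequences, and function names are all invented for illustration; how MiST actually resolves an unknown GI to a sequence is not described here.

```python
# Hypothetical stand-in for the MiST protein table, keyed by GI number.
MIST_PROTEINS = {
    1001: {"organism": "Organism A", "sequence": "MKTAYIAK"},
    1002: {"organism": "Organism B", "sequence": "MSDLLNRQ"},
}

# Hypothetical external source mapping other GI numbers to their sequences.
EXTERNAL_SEQUENCES = {2001: "MKTAYIAK", 2002: "MAAAAAAA"}


def find_by_gi(gi):
    """Return MiST GI numbers matching the query GI directly or by identical sequence."""
    if gi in MIST_PROTEINS:
        return [gi]  # direct match
    sequence = EXTERNAL_SEQUENCES.get(gi)
    if sequence is None:
        return []  # the query GI is unknown everywhere
    # Fallback: proteins in MiST with exactly the same sequence.
    return [g for g, protein in MIST_PROTEINS.items() if protein["sequence"] == sequence]


def search_gis(query):
    """Look up every space-separated GI number in the query, skipping non-numeric tokens."""
    return {int(tok): find_by_gi(int(tok)) for tok in query.split() if tok.isdigit()}


if __name__ == "__main__":
    print(search_gis("1001 2001 2002"))
```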
Querying multiple organisms
MiST makes it possible to query multiple organisms for proteins containing a particular domain(s), description, locus, GI number, or internal MiST identifier. First select two or more organisms via the taxonomy selector. The next page, the Select Analysis page, displays the selected organisms in a tree-like list according to their taxonomy. Next, select the type of search to perform (e.g. Domain, description, etc.), fill in the search terms of interest, and push the search button. The query will be executed against each organism that was selected.
Please note the following:
- All search terms are case-insensitive. Thus, the text 'RESPONSE_REG' is treated the same as 'response_reg' and 'ResPOnSe_rEg'.
- Only organisms that have checks in the checkboxes beside their name will be searched.
- Multiple keywords should be separated by spaces and are treated independently (except for domain searches, see below).
When performing domain searches:
- The boolean logic operators AND and OR may be used for more complex domain searches. For example, to search for proteins containing both the PAS domain and the conserved MCPsignal domain, type (without the quotes): 'PAS AND MCPsignal'
- By default, domain searches cover both the Pfam and SMART libraries. To limit a search to either Pfam or SMART, prefix the query with the domain library name. For example, to specifically search for the Pfam Response_reg domain, type (without the quotes): 'Pfam:Response_reg'
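To make the domain query rules concrete, here is a minimal sketch of how such a query could be evaluated against one protein's predicted domains. It is not MiST's code. The function names are hypothetical, and several behaviors are assumptions: terms must be joined explicitly by AND or OR, AND binds more tightly than OR, and a domain name must match in full.

```python
import re


def parse_term(term):
    """Split an optional 'Pfam:' or 'SMART:' prefix from a domain name (case-insensitive)."""
    library, sep, name = term.partition(":")
    if sep and library.lower() in ("pfam", "smart"):
        return library.lower(), name.lower()
    return None, term.lower()  # no prefix: search both libraries


def term_matches(term, domains):
    """True if any (library, name) pair in domains satisfies the term."""
    library, name = parse_term(term)
    return any(
        d_name.lower() == name and (library is None or d_lib.lower() == library)
        for d_lib, d_name in domains
    )


def query_matches(query, domains):
    """Evaluate 'A AND B OR C' as (A AND B) OR C against a protein's domains."""
    for group in re.split(r"\s+OR\s+", query.strip(), flags=re.IGNORECASE):
        terms = re.split(r"\s+AND\s+", group.strip(), flags=re.IGNORECASE)
        if all(term_matches(term, domains) for term in terms):
            return True
    return False


if __name__ == "__main__":
    # Hypothetical domain architectures as (library, domain name) pairs.
    chemoreceptor = [("Pfam", "PAS"), ("Pfam", "HAMP"), ("Pfam", "MCPsignal")]
    regulator = [("Pfam", "Response_reg"), ("SMART", "REC")]
    print(query_matches("PAS AND MCPsignal", chemoreceptor))
    print(query_matches("Pfam:Response_reg", regulator))
    print(query_matches("SMART:Response_reg", regulator))
```

In this sketch the library prefix narrows a term to one library, which is why the last query fails: the hypothetical regulator carries Response_reg only as a Pfam hit.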
The GI number, description, and domain architecture of each matching protein are displayed beneath its associated organism. To view more detailed information about a particular protein, click its GI number, which will take you to the view protein page for this protein. The domain name, start, stop, score, E-value, and significance for all predicted domains, including overlapping or insignificant domains, can be displayed by clicking the appropriate 'Details' link. Clicking the 'Details' link again will hide this information.
To carry out another search against this same organism set, use your browser's back button to return to the Select Analysis page and simply input your new query text.
View organism page
The View Organism page provides a general overview of an organism's genome and its signal transduction network, along with the ability to search this organism for particular keywords. Specifically, there are four primary sections to understand: 1) the Genome summary, 2) the Signal transduction profile, 3) Querying, and 4) the table of signal transduction proteins by replicon.
- Genome summary
This section contains basic information about the organism's genome and taxonomic classification, and is displayed in the upper left. Just beneath the taxonomic information is a table summarizing the predicted two-component and one-component systems. The number of two-component systems is based on the number of predicted response regulators, as these imply a particular output response. This number is an automatically determined estimate and thus should not be taken as exact or necessarily accurate. Rules for distinguishing between phosphorelays or other more complex response regulator types (hybrid histidine kinases) have not been implemented in this calculation. Thus, for best results, it is advisable to ascertain the number of two-component systems from a manual inspection of all the predicted two-component proteins.
- Signal transduction profile
This graph provides a qualitative overview of the different types of input and output characteristics of this organism and its signaling machinery (e.g. response regulators, etc.). It is important to note that this graph is based on domain counts rather than protein counts, and thus a number in the graph will not necessarily equal the number of proteins containing that domain. For example, the graph may show 60 receiver domains for an organism that contains only 55 response regulator proteins. This discrepancy arises from proteins such as hybrid sensor kinases, which contain both transmitter and receiver domains. Consequently, such a protein contributes to both the transmitter and receiver counts on the graph.
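The difference between domain counts and protein counts can be illustrated with a small sketch. The protein names and domain assignments are made up, and the rule used here to call a protein a response regulator (a receiver domain without a transmitter domain) is a simplifying assumption, not MiST's classification logic.

```python
from collections import Counter

# Hypothetical proteins mapped to the types of their predicted domains.
PROTEINS = {
    "response regulator 1": ["receiver", "output"],
    "response regulator 2": ["receiver", "output"],
    "hybrid sensor kinase": ["input", "transmitter", "receiver"],
    "sensor kinase": ["input", "transmitter"],
}


def domain_counts(proteins):
    """Count every domain occurrence, as the profile graph does."""
    counts = Counter()
    for domains in proteins.values():
        counts.update(domains)
    return counts


def count_response_regulators(proteins):
    """Count proteins with a receiver but no transmitter (assumed definition)."""
    return sum(
        1
        for domains in proteins.values()
        if "receiver" in domains and "transmitter" not in domains
    )


if __name__ == "__main__":
    print("receiver domains:", domain_counts(PROTEINS)["receiver"])
    print("response regulator proteins:", count_response_regulators(PROTEINS))
```

Here the hybrid sensor kinase adds one receiver domain to the graph without adding a response regulator protein, which is the kind of gap the 60 versus 55 example describes.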
Legend: green, histidine kinase domains; red, response regulator domains; orange, input domain types; blue, output domain types.
- Querying
Select the type of search to perform (e.g. Domain, description, etc.), fill in the search terms of interest, and push the search button. The query will be executed against each organism that was selected.
Please note the following:
- All search terms are case-insensitive. Thus, the text 'RESPONSE_REG' is treated the same as 'response_reg' and 'ResPOnSe_rEg'.
- Only organisms that have checks in the checkboxes beside their name will be searched.
- Multiple keywords should be separated by spaces and are treated independently (except for domain searches, see below).
When performing domain searches:
- The boolean logic operators AND and OR may be used for more complex domain searches. For example, to search for proteins containing both the PAS domain and the conserved MCPsignal domain, type (without the quotes): 'PAS AND MCPsignal'
- By default, domain searches cover both the Pfam and SMART libraries. To limit a search to either Pfam or SMART, prefix the query with the domain library name. For example, to specifically search for the Pfam Response_reg domain, type (without the quotes): 'Pfam:Response_reg'
The GI number, description, and domain architecture of each matching protein are displayed beneath its associated organism. To view more detailed information about a particular protein, click its GI number, which will take you to the view protein page for this protein. The domain name, start, stop, score, E-value, and significance for all predicted domains, including overlapping or insignificant domains, can be displayed by clicking the appropriate 'Details' link. Clicking the 'Details' link again will hide this information.
- Table of signal transduction proteins by replicon
This table reveals the type and number of signal transduction proteins found on each replicon.
- To view all the signal transduction proteins contained on a particular replicon, click on the replicon name.
- To view all proteins belonging to a particular class of signal transduction (e.g. One-component proteins), click on the desired class name.
- To view the proteins on a particular replicon that belong to a particular class of signal transduction, click on the desired number within the table.
Performing one of the above actions will present a list of proteins displayed similarly to the output from querying (details), except it also includes beneath the protein description a list of the input and output domains identified in this protein.
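Conceptually, the table is a count of proteins per replicon and class, and each of the three click actions is a filter on one or both of those keys. The sketch below shows that idea with invented records; the identifiers, replicon names, and function names are hypothetical.

```python
from collections import Counter

# Hypothetical signal transduction proteins with their replicon and class.
PROTEINS = [
    {"id": "p1", "replicon": "chromosome", "class": "One-component"},
    {"id": "p2", "replicon": "chromosome", "class": "Two-component"},
    {"id": "p3", "replicon": "chromosome", "class": "Two-component"},
    {"id": "p4", "replicon": "plasmid", "class": "One-component"},
]


def build_table(proteins):
    """Count proteins for every (replicon, class) cell of the table."""
    return Counter((p["replicon"], p["class"]) for p in proteins)


def select(proteins, replicon=None, sig_class=None):
    """Filter by replicon name, by class name, or by both (a single table cell)."""
    return [
        p["id"]
        for p in proteins
        if (replicon is None or p["replicon"] == replicon)
        and (sig_class is None or p["class"] == sig_class)
    ]


if __name__ == "__main__":
    print(build_table(PROTEINS))
    print(select(PROTEINS, replicon="plasmid"))
    print(select(PROTEINS, sig_class="One-component"))
    print(select(PROTEINS, replicon="chromosome", sig_class="Two-component"))
```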
View protein page
This page may conceptually be broken into three sections: 1) protein/gene sequence information, 2) domain architecture, and 3) a chromosome view.
- Protein/gene sequence information
The full RefSeq annotation is shown in this section along with various identification information. Protein information is shown on the left-hand side, and gene information is displayed on the right-hand side. Beside the italicized protein and gene labels is the MiST identifier. To view either the protein or the gene sequence, click on the appropriate 'Sequence' link. Clicking the 'Sequence' link again will hide the sequence.
- Domain architecture
This section shows the visualized Pfam and SMART domain architectures for this protein. Information about the predicted domains may be revealed by clicking the 'Details' link. Blue vertical boxes represent transmembrane regions, red boxes represent signal peptides, green boxes represent coiled-coil regions, and pink boxes represent low-complexity regions.
- Chromosome view
A graphical representation (drawn to scale) of the genes surrounding the currently selected protein/gene. The current gene is centered on the image and drawn in blue. Neighboring genes are drawn in gray and may be viewed by clicking on the gray arrow. Hovering the mouse over a particular neighboring gene will display its location and MiST identifier.
The 'Return to query screen' link returns you to the page you were on before entering the View Protein page(s).
Bioinformatic tools used in MiST
- Pfam version 19.0*
- SMART version 5.0*
- Phobius version 1.01 - signal peptide and transmembrane region prediction
- Coils - coiled-coil prediction
- Seg - low-complexity region prediction
- * Note: the HMMER software suite was used to search the Pfam and SMART domain libraries
Pfam and SMART domains used to predict signal transduction proteins
No. | Domain name | Source | Type | Function |
---|---|---|---|---|
1. | ACT | Pfam | input | Small-molecule binding |
2. | Ada_Zn_binding | Pfam | input | Small-molecule binding |
3. | AlkA_N | Pfam | input | Small-molecule binding |
4. | AraC_binding | Pfam | input | Small-molecule binding |
5. | Autoind_bind | Pfam | input | Small-molecule binding |
6. | Cache_1 | Pfam | input | Small-molecule binding |
7. | Cache_2 | Pfam | input | Small-molecule binding |
8. | CHASE | Pfam | input | Small-molecule binding |
9. | cNMP_binding | Pfam | input | Small-molecule binding |
10. | Diacid_rec | Pfam | input | Small-molecule binding |
11. | Fe_dep_repr_C | Pfam | input | Small-molecule binding |
12. | FeoA | Pfam | input | Small-molecule binding |
13. | FHA | Pfam | input | Small-molecule binding |
14. | GAF | Pfam | input | Small-molecule binding |
15. | HMA | Pfam | input | Small-molecule binding |
16. | LysR_substrate | Pfam | input | Small-molecule binding |
17. | NIT | Pfam | input | Small-molecule binding |
18. | PAS | Pfam | input | Small-molecule binding |
19. | PAS_2 | Pfam | input | Small-molecule binding |
20. | PAS_3 | Pfam | input | Small-molecule binding |
21. | PAS_4 | Pfam | input | Small-molecule binding |
22. | PAS | SMART | input | Small-molecule binding |
23. | PAC | SMART | input | Small-molecule binding |
24. | Peripla_BP_1 | Pfam | input | Small-molecule binding |
25. | Peripla_BP_2 | Pfam | input | Small-molecule binding |
26. | SBP_bac_3 | Pfam | input | Small-molecule binding |
27. | SIS | Pfam | input | Small-molecule binding |
28. | STAS | Pfam | input | Small-molecule binding |
29. | TetR_C | Pfam | input | Small-molecule binding |
30. | TOBE | Pfam | input | Small-molecule binding |
31. | V4R | Pfam | input | Small-molecule binding |
32. | Aminotran_1_2 | Pfam | input | Enzymatic |
33. | Arch_ATPase | Pfam | input | Enzymatic |
34. | Citrate_synt | Pfam | input | Enzymatic |
35. | Cyanate_lyase | Pfam | input | Enzymatic |
36. | EPSP_synthase | Pfam | input | Enzymatic |
37. | FmdA_AmdA | Pfam | input | Enzymatic |
38. | GATase_2 | Pfam | input | Enzymatic |
39. | Glucokinase | Pfam | input | Enzymatic |
40. | Glycos_trans_3N | Pfam | input | Enzymatic |
41. | Glyoxalase | Pfam | input | Enzymatic |
42. | HEAT_PBS | Pfam | input | Enzymatic |
43. | HEM4 | Pfam | input | Enzymatic |
44. | Nitroreductase | Pfam | input | Enzymatic |
45. | NTP_transf_2 | Pfam | input | Enzymatic |
46. | NUDIX | Pfam | input | Enzymatic |
47. | PALP | Pfam | input | Enzymatic |
48. | Peptidase_M23 | Pfam | input | Enzymatic |
49. | peroxidase | Pfam | input | Enzymatic |
50. | PfkB | Pfam | input | Enzymatic |
51. | Pribosyltran | Pfam | input | Enzymatic |
52. | PTS_EIIC | Pfam | input | Enzymatic |
53. | PTS-HPr | Pfam | input | Enzymatic |
54. | Pyr_redox | Pfam | input | Enzymatic |
55. | Rhodanese | Pfam | input | Enzymatic |
56. | SKI | Pfam | input | Enzymatic |
57. | CBS | Pfam | input | Protein-protein interaction |
58. | HAMP | Pfam | input | Protein-protein interaction |
59. | TPR_1 | Pfam | input | Protein-protein interaction |
60. | TPR_2 | Pfam | input | Protein-protein interaction |
61. | TPR_3 | Pfam | input | Protein-protein interaction |
62. | TPR_4 | Pfam | input | Protein-protein interaction |
63. | BLUF | Pfam | input | Cofactor binding |
64. | Fer4 | Pfam | input | Cofactor binding |
65. | FeS | Pfam | input | Cofactor binding |
66. | Hemerythrin | Pfam | input | Cofactor binding |
67. | HhH-GPD | Pfam | input | Cofactor binding |
68. | NIR_SIR | Pfam | input | Cofactor binding |
69. | NIR_SIR_ferr | Pfam | input | Cofactor binding |
70. | Nitro_FeMo-Co | Pfam | input | Cofactor binding |
71. | Phytochrome | Pfam | input | Cofactor binding |
72. | CHASE2 | Pfam | input | Unknown function |
73. | CHASE3 | Pfam | input | Unknown function |
74. | CHASE4 | Pfam | input | Unknown function |
75. | MASE1 | Pfam | input | Unknown function |
76. | MASE2 | Pfam | input | Unknown function |
77. | MHYT | Pfam | input | Unknown function |
78. | TrkA_C | Pfam | input | Unknown function |
79. | Arc | Pfam | output | DNA-binding |
80. | Arg_repressor | Pfam | output | DNA-binding |
81. | AsnC_trans_reg | Pfam | output | DNA-binding |
82. | Crp | Pfam | output | DNA-binding |
83. | CtsR | Pfam | output | DNA-binding |
84. | DeoR | Pfam | output | DNA-binding |
85. | Fe_dep_repress | Pfam | output | DNA-binding |
86. | GerE | Pfam | output | DNA-binding |
87. | GntR | Pfam | output | DNA-binding |
88. | HTH_AraC | Pfam | output | DNA-binding |
89. | HTH_1 | Pfam | output | DNA-binding |
90. | HTH_3 | Pfam | output | DNA-binding |
91. | HTH_5 | Pfam | output | DNA-binding |
92. | HTH_6 | Pfam | output | DNA-binding |
93. | HTH_7 | Pfam | output | DNA-binding |
94. | HTH_8 | Pfam | output | DNA-binding |
95. | HTH_10 | Pfam | output | DNA-binding |
96. | HTH_11 | Pfam | output | DNA-binding |
97. | HTH_12 | Pfam | output | DNA-binding |
98. | IclR | Pfam | output | DNA-binding |
99. | LacI | Pfam | output | DNA-binding |
100. | LytTR | Pfam | output | DNA-binding |
101. | MarR | Pfam | output | DNA-binding |
102. | MerR | Pfam | output | DNA-binding |
103. | PadR | Pfam | output | DNA-binding |
104. | ROS_MUCR | Pfam | output | DNA-binding |
105. | TetR_N | Pfam | output | DNA-binding |
106. | Trans_reg_C | Pfam | output | DNA-binding |
107. | EAL | Pfam | output | Di-guanylate cyclase |
108. | GGDEF | Pfam | output | Di-guanylate cyclase |
109. | ANTAR | Pfam | output | RNA-binding |
110. | CsrA | Pfam | output | RNA-binding |
111. | PP2C_SIG | SMART | output | Phosphatase |
112. | Pkinase | Pfam | output | Protein kinase |
113. | HD | Pfam | output | Hydrolase |
114. | Guanylate_cyc | Pfam | output | Other |
115. | LytR_cpsA_psr | Pfam | output | Other |
116. | Rrf2 | Pfam | output | Other |
117. | RseA_N | Pfam | output | Other |
118. | HATPase_c | Pfam | transmitter | Histidine kinase |
119. | HWE_HK | Pfam | transmitter | Histidine kinase |
120. | HisKA | Pfam | transmitter | Histidine kinase |
121. | HisKA_2 | Pfam | transmitter | Histidine kinase |
122. | HisKA_3 | Pfam | transmitter | Histidine kinase |
123. | Response_reg | Pfam | receiver | Response regulator |
124. | MCPsignal | Pfam | chemotaxis | MCP |
125. | CheB_methylest | Pfam | chemotaxis | Chemotaxis |
126. | CheC | Pfam | chemotaxis | Chemotaxis |
127. | CheD | Pfam | chemotaxis | Chemotaxis |
128. | CheW | Pfam | chemotaxis | Chemotaxis |
129. | CheR | Pfam | chemotaxis | Chemotaxis |
130. | CheR_N | Pfam | chemotaxis | Chemotaxis |
131. | CheZ | Pfam | chemotaxis | Chemotaxis |
132. | HATPase_c | SMART | transmitter | Histidine kinase |
133. | REC | SMART | receiver | Response regulator |
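
As an illustration of how this table can be used, the sketch below looks up a protein's predicted domains in a small excerpt of the table and groups them by type, similar to the list of input and output domains shown beneath each protein description. The excerpt is copied from the rows above; the function name and the example architectures are hypothetical, and this is not MiST's own classification code.

```python
# Excerpt of the table above: (source, domain name) -> (type, function).
DOMAIN_TYPES = {
    ("Pfam", "PAS"): ("input", "Small-molecule binding"),
    ("Pfam", "HAMP"): ("input", "Protein-protein interaction"),
    ("Pfam", "GGDEF"): ("output", "Di-guanylate cyclase"),
    ("Pfam", "Trans_reg_C"): ("output", "DNA-binding"),
    ("Pfam", "HisKA"): ("transmitter", "Histidine kinase"),
    ("Pfam", "HATPase_c"): ("transmitter", "Histidine kinase"),
    ("Pfam", "Response_reg"): ("receiver", "Response regulator"),
    ("SMART", "REC"): ("receiver", "Response regulator"),
    ("Pfam", "MCPsignal"): ("chemotaxis", "MCP"),
}


def group_domains_by_type(domains):
    """Group a protein's (source, name) domains by their type in the table excerpt."""
    grouped = {}
    for source, name in domains:
        entry = DOMAIN_TYPES.get((source, name))
        if entry is None:
            continue  # domain is not in this excerpt of the table
        domain_type, _function = entry
        grouped.setdefault(domain_type, []).append(name)
    return grouped


if __name__ == "__main__":
    # Hypothetical domain architectures.
    print(group_domains_by_type([("Pfam", "Response_reg"), ("Pfam", "Trans_reg_C")]))
    print(group_domains_by_type([("Pfam", "PAS"), ("Pfam", "HisKA"), ("Pfam", "HATPase_c")]))
```

The lookup is keyed by both source and name because some names, such as HATPase_c and PAS, appear in the table once for Pfam and once for SMART.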