Help for StrucTools

This set of tools is intended to provide a convenient web interface to simple, commonly used structural biology calculations with PDB files. Reasonable default parameters are provided. Users can select other values, but in that case there is no guarantee that the program will run or produce meaningful results.

Inputs

The basic input is a file of atomic coordinates in Protein Data Bank format. Only the ID code need be entered for PDB entries. Alternatively, a file of PDB-format coordinates can be uploaded.
PDB ID code: the four-character code that uniquely defines an entry in the Protein Data Bank. e.g. 1crn

Upload PDB file: any file of coordinates in PDB format can be uploaded from the desktop machine for these calculations.

Output

In most cases, the output can be drawn within the browser window (Graphics option), produced as a Postscript or PDF file (Postscript/PDF option), or produced as text within the browser window (Text option). The Text option typically returns the data in ASCII format. For example, for the Bfactor plot, the Text output is a list of the average B factors for each residue, while the Graphics or Postscript/PDF options plot these values.
Graphics is the default when possible. Thus surface, volume, Bfactor and torsion angle are plotted and displayed as a gif image in the browser. Other calculations (e.g. 'fasta format sequence') will return text output even when Graphics is selected.
Postscript/PDF format. If a plot will be printed, this is the best format to choose. It requires the Adobe Acrobat reader which is available free from Adobe's customer support page. Both Postscript and PDF files are produced when this option is selected.
Text. For surface, volume, Bfactor and torsion calculations, this choice will return a list of the actual numbers (e.g. mainchain and sidechain average B factors, or torsion angles, for each residue). For other calculations this is the default.
Raw. For surface, volume and torsion calculations, the 'Raw' output is literally the raw output from the underlying program, without any summing per residue or reformatting. This output format is most useful when debugging. For some calculations this is identical to 'text' output.

Torsion Angle Plot

also known as Ramachandran plots. The torsion angles are calculated by a slightly modified version of Andrew Martin's torsions program. The core, additional, generous and disallowed regions are drawn according to Morris et al, Proteins 12: 345-64 (1992). Residues with dual conformations are not plotted, nor are start and end residues of each chain, start and end residues of chain breaks, and prolines. Residues in generous or disallowed regions are labelled. There are no options for the Torsion Angle plot.
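As an illustration of what a torsion angle calculation involves, here is a minimal Python sketch of a signed dihedral angle from four atom positions. It is not the code of the torsions program used by StrucTools; the function name `dihedral` is hypothetical. For residue i, phi is the dihedral of C(i-1), N(i), CA(i), C(i), and psi is the dihedral of N(i), CA(i), C(i), N(i+1).

```python
import math

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four (x, y, z) points.

    Illustrative helper only (hypothetical name), not part of StrucTools.
    """
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b1 = sub(p1, p0)          # first bond vector
    b2 = sub(p2, p1)          # central bond (rotation axis)
    b3 = sub(p3, p2)          # last bond vector
    n1 = cross(b1, b2)        # normal to the plane of p0, p1, p2
    n2 = cross(b2, b3)        # normal to the plane of p1, p2, p3
    x = dot(n1, n2)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    return math.degrees(math.atan2(y, x))
```

With backbone coordinates in hand, `dihedral(c_prev, n, ca, c)` would give phi and `dihedral(n, ca, c, n_next)` would give psi for one residue.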

MSMS Surfaces

Surface areas are calculated using Michel Sanner's MSMS program. MSMS calculates the Solvent Excluded and Solvent Accessible surfaces. The default probe radius is 1.5 Å and can be set to between 1.3 and 1.6 Å. The user can also select the atoms to be used in the calculation. The default is to exclude HETATM lines from the PDB file. The user can select all ATOM + HETATM, or all atoms except waters. The excluded atoms are deleted from the input PDB file before the MSMS computation is performed.
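The atom selection step can be pictured as a filter over the lines of the PDB file. The Python sketch below is only an illustration, not the StrucTools implementation: the function name, the mode names, and the set of residue names treated as water (HOH, WAT) are assumptions. It relies on the fixed-column PDB layout, where the record name is in columns 1-6 and the residue name in columns 18-20.

```python
# Residue names treated as water: an assumption made for this sketch.
WATER_NAMES = {"HOH", "WAT"}

def select_atoms(pdb_lines, mode="no_hetatm"):
    """Return the PDB lines kept for a calculation.

    mode is a hypothetical label:
      "no_hetatm": drop every HETATM record
      "no_waters": keep ATOM and HETATM records, but drop waters
      "all":       keep every ATOM and HETATM record
    Lines that are not coordinate records are passed through unchanged.
    """
    if mode not in ("no_hetatm", "no_waters", "all"):
        raise ValueError("unknown mode: " + mode)
    kept = []
    for line in pdb_lines:
        record = line[:6].strip()
        if record not in ("ATOM", "HETATM"):
            kept.append(line)
            continue
        if mode == "no_hetatm" and record == "HETATM":
            continue
        if mode == "no_waters" and line[17:20].strip() in WATER_NAMES:
            continue
        kept.append(line)
    return kept
```

The same filter, with a different default mode, would cover the atom choices described for the accessible surface and Voronoi volume calculations.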

Accessible surface (Gerstein)

Accessible Surface is calculated using Mark Gerstein's calc-surface program. Probe size can be selected (default 1.4 Å). By default, all atoms except waters in the PDB file are included in the calculation. Users can also choose to include all atoms except HETATMs, or all atoms including HETATMs (including waters). The excluded atoms are deleted from the PDB file before the calculation is performed. More information about the calculations is available at Gerstein's page on Macromolecular Geometry.

Voronoi Volume (Gerstein)

Voronoi Volume is calculated using Mark Gerstein's calc-volume program. The default is to use Method B with Chothia radii. The user can also select 'Normal Voronoi', 'Radical Plane', or 'Modified Method B' methods. (See F M Richards (1974), J. Mol. Biol. 82: 1-14; F M Richards (1977), Annu. Rev. Biophys. Bioeng. 6: 151-176 for a discussion of these methods.) The calculation can be performed with Chothia or Richards radii. By default, no HETATMs are included in the calculation. The user can also choose to use all atoms including HETATMs, or all atoms except waters. The excluded atoms are deleted from the PDB file before the calculation is performed. More information about the calculations is available at Gerstein's Macromolecular Geometry page.

Bfactor plot

Bfactor plots plot the average Bfactors for main and side chain atoms against the residue number. Each chain in the PDB file is plotted separately.
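The per-residue averaging can be sketched in Python as follows. This is an illustration under stated assumptions, not the StrucTools code: the function name is hypothetical, mainchain atoms are taken to be N, CA, C and O, and only ATOM records are considered. The B factor is read from columns 61-66 of each record.

```python
from collections import defaultdict

# Atoms counted as mainchain: an assumption made for this sketch.
MAINCHAIN = {"N", "CA", "C", "O"}

def average_bfactors(pdb_lines):
    """Average mainchain and sidechain B factors per residue.

    Returns {(chain, residue_number): {"main": avg, "side": avg}}.
    An average is None when the residue has no atoms of that kind
    (for example, the sidechain of glycine).
    """
    values = defaultdict(lambda: {"main": [], "side": []})
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        key = (line[21], line[22:27].strip())   # chain ID, residue number (+ insertion code)
        atom_name = line[12:16].strip()
        bfactor = float(line[60:66])
        part = "main" if atom_name in MAINCHAIN else "side"
        values[key][part].append(bfactor)
    averages = {}
    for key, parts in values.items():
        averages[key] = {
            part: (sum(vals) / len(vals) if vals else None)
            for part, vals in parts.items()
        }
    return averages
```

Grouping the result by chain ID would give one series per chain, matching the one-plot-per-chain output.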

Hydrogen Bonds (Levitt)

are calculated using Mark Gerstein's find-hbonds program. Criteria for Hbonds adapted from Arthur Lesk. The N-H...O-C angle is calculated. (For backbone atoms, for instance, the angle is expected to be around 120 degrees for a hydrogen bond to a carbonyl group.) One would expect a different "ideal" angle depending on whether you were dealing with a quaternary nitrogen (as in lysine) or a tertiary nitrogen (as in the distal nitrogens in arg), because the C-N-H angle differs -- even if the actual hydrogen bond N-H ...O is linear.
Current default angle threshold = 110 degrees
Current default distance threshold = 3.5 Å

Hydrogen Bonds (Lesk)

are calculated using Mark Gerstein's find-hbonds program. Criteria for Hbonds adapted from Mike Levitt. The program uses a criterion based on distances and angles. If there are no explicit hydrogen atoms, it uses the coordinates of the acceptor (O), the donor (N) and an atom bonded to the donor (C). The conditions are d(O..N) < 3.6 Å and angle (O..N-C) between 90 and 150 degrees.
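The distance-and-angle conditions above can be written as a short geometric test. The Python sketch below is not the find-hbonds code; the function name and the choice to treat the angle bounds as inclusive are assumptions. It requires Python 3.8 or later for `math.dist`.

```python
import math

def is_hbond(acceptor_o, donor_n, donor_c,
             max_distance=3.6, min_angle=90.0, max_angle=150.0):
    """Test the no-explicit-hydrogen criterion for one O..N-C triple.

    Arguments are (x, y, z) tuples in Å. The angle is measured at the
    donor N between the acceptor O and the atom C bonded to the donor.
    Inclusive angle bounds are an assumption of this sketch.
    """
    if math.dist(acceptor_o, donor_n) >= max_distance:
        return False
    to_o = [o - n for o, n in zip(acceptor_o, donor_n)]
    to_c = [c - n for c, n in zip(donor_c, donor_n)]
    norm = math.hypot(*to_o) * math.hypot(*to_c)
    if norm == 0.0:
        return False  # coincident atoms, angle undefined
    cos_angle = sum(a * b for a, b in zip(to_o, to_c)) / norm
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    angle = math.degrees(math.acos(cos_angle))
    return min_angle <= angle <= max_angle
```

For example, an O placed 3.0 Å from N with an O..N-C angle of 120 degrees passes the test, while the same geometry at 4.0 Å does not.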

Mainchain hydrogen bonds

are calculated using Stride.

Secondary structure

is calculated using Stride.

Fasta format sequence

returns the sequence of residues in the PDB file in Fasta format. Each chain is treated separately. The chains are labelled as 'strand A', 'strand B', etc. in the Fasta output.
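A rough Python sketch of this conversion is shown below. It is an illustration only: the function name, the exact header text beyond the 'strand A' label, and the use of 'X' for non-standard residues are assumptions. Residue names are read from columns 18-20 of ATOM records and each residue is counted once.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}

def pdb_to_fasta(pdb_lines):
    """Build one Fasta record per chain from ATOM records.

    Unknown residue names become 'X' (an assumption of this sketch).
    """
    chains = {}      # chain ID -> list of one-letter codes, in file order
    seen = set()     # (chain ID, residue number + insertion code)
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        chain = line[21]
        residue_id = (chain, line[22:27])
        if residue_id in seen:
            continue  # further atoms of a residue already recorded
        seen.add(residue_id)
        code = THREE_TO_ONE.get(line[17:20].strip(), "X")
        chains.setdefault(chain, []).append(code)
    records = []
    for chain, letters in chains.items():
        records.append(">strand " + chain)
        records.append("".join(letters))
    return "\n".join(records)
```

Note that this reads only residues with coordinates, so residues missing from the model would be absent from the sequence.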

Raw sequence

returns the PDB-format data, or subsets thereof. The purpose would presumably be to save the returned output and use it as input for another program. The user can choose to see any of the following:

Spinning Molecule Movie

This option was formerly a separate web application called 'Indie'. Rasmol is used to create several views of a molecule, and Gifsicle or ImageMagick is used to create an animated gif/mpeg movie. The molecule is simply rotated by a specified angle around a specified axis; no morphing is performed.

Initial Orientation of Molecule: By default, the starting frame of the movie has the molecule in its initial Rasmol orientation (centered at the molecule's center of gravity, looking down the Z axis). You can enter an X, Y or Z angle here and the molecule will first be rotated by these angles before any movie frames are drawn. For example, if the most interesting loop of your protein appears at the back of the movie by default, enter a Y rotation angle of 180.

Display: The standard Rasmol options of displaying backbone, wireframe, ribbons or spacefilling model.

Colors: The standard Rasmol options of coloring the molecule by

- group (i.e. coloring the chain from blue at one end to red at the other)
- cpk (i.e. colored according to atom type, such as C, H, O, N and S). Note that this is a bad choice for backbone displays, since you get only a single color.
- temperature. Colored according to the temperature factor of the residue in the PDB file. Blue is 'cold' (less mobile) and red is 'hot' (more mobile).
- chain. Each chain is colored differently. The colors are assigned automatically by Rasmol.
- charge. Colored according to the charge on the residue. See the Rasmol documentation for details.
- structure. Residues in alpha helices are colored magenta, beta sheets are yellow, turns are pale blue, and all other residues are colored white. The secondary structure is either read from the PDB file (HELIX and SHEET records), if available, or determined using Kabsch and Sander's DSSP algorithm.

Background: Black or white background on the resultant gif.

Rotate around: x, y or z axes, as in the picture on the right.

Rotation angle: The total angle through which the molecule is rotated. 360 degrees produces a spinning molecule, while any other choice produces a rocking motion.

Number of frames: The number of frames used to produce the animation. More frames produce a smoother motion, but will create bigger files. On a web page, an animated gif with lots of frames will take longer to load.
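To make the relationship between rotation angle and number of frames concrete, here is a small Python sketch that lists the rotation applied at each frame. How StrucTools actually distributes and replays the frames is not documented here, so the scheme below is an assumption: a 360 degree rotation is split into equal steps that loop seamlessly, and any other angle is swept forward and then played back in reverse to give a rocking motion. The function name is hypothetical.

```python
def frame_angles(total_angle, n_frames):
    """Rotation (in degrees) of the molecule at each displayed frame.

    Assumed scheme, for illustration only:
      - 360 degrees: n_frames equal steps; the last frame stops one
        step short of 360 so the loop restarts smoothly.
      - other angles: n_frames views from 0 to total_angle, followed
        by the intermediate views in reverse order (rocking).
    """
    if n_frames < 2:
        raise ValueError("need at least two frames")
    if total_angle == 360:
        step = total_angle / n_frames
        return [i * step for i in range(n_frames)]
    step = total_angle / (n_frames - 1)
    forward = [i * step for i in range(n_frames)]
    return forward + forward[-2:0:-1]
```

Under these assumptions, four frames over 360 degrees are drawn at 0, 90, 180 and 270 degrees, while four frames over 90 degrees rock through 0, 30, 60, 90, 60 and 30 degrees.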

Size of gif: Size in pixels of the final animated gif. Larger sizes will produce bigger files. The final image may not be exactly what is requested, due to limitations in the conversion programs used. For example, if you request a gif of 300x300, you will get 302x302 so as to avoid an annoying black edge when a white-background rasmol picture is converted to a gif. Mpeg movies must have dimensions which are a multiple of 16, so the closest multiple is chosen, i.e. 300x300 will become 304x304. The output page will list the dimensions of the final image and the size of the final movie file.
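The size adjustments in the example above can be expressed as simple arithmetic. The Python sketch below reproduces the two stated cases (300 becomes 302 for a gif and 304 for an mpeg); treating the gif adjustment as a fixed 2 pixel increase for every size, and rounding to the nearest multiple of 16 with ties going up, are assumptions. The function names are hypothetical.

```python
def gif_size(requested):
    """Final gif edge length in pixels.

    Assumes the 2 pixel increase from the 300 -> 302 example applies
    to every requested size.
    """
    return requested + 2

def mpeg_size(requested):
    """Nearest multiple of 16 to the requested edge length.

    Ties are rounded up and the result is never below 16; both
    choices are assumptions of this sketch.
    """
    return max(16, ((requested + 8) // 16) * 16)
```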

Speed of movie: For animated gifs, the frame speed, i.e. the time that each frame of the movie will be displayed.


Errors/suggestions/problems to Susan Chacko (webtools@helix.nih.gov)