AMBER

                                 Version 5


                                 Volume 1

      Amber  5  is  a  collaborative effort of the research groups of
      Peter Kollman  (UCSF),  David  Case  (Scripps),  Ken  Merz(Penn
      State), Tom Darden (NIEHS), and Dave Fergunson (Minnesota).

      The authors are:


              David A. Case (The Scripps Research Institute)
                David A. Pearlman (Vertex Pharmaceuticals)
                         James W. Caldwell (UCSF)
                     Thomas E. Cheatham III (UCSF,NIH)
                           Wilson S. Ross (UCSF)
                         Carlos Simmerling (UCSF)
                            Tom Darden (NIEHS)
                       Kenneth M. Merz (Penn State)
                         Robert V. Stanton (UCSF)
                         Ailan Cheng (Penn State)
                       James J. Vincent(Penn State)
                            Mike Crowley (PSC)
                David M. Ferguson (University of Minnesota)
                           Randall Radmer (UCSF)
      George L. Seibel (for contributions to Amber version 3A while at UCSF)
      U. Chandra Singh (for contributions to Amber versions 2 and 3 while at UCSF)
      Paul Weiner (for contributions to Amber version 1 while at UCSF)
                          Peter A. Kollman (UCSF)


      All  contents  Copyright (c) 1986, 1991, 1995, 1997, University
                              of California.
                           All Rights Reserved.


                             Acknowledgements

      We acknowledge the generous cooperation  of  Wilfred  van  Gun-
      steren,  whose molecular dynamics code was used as the basis of
      the md modules in version 2.0. We are also pleased to  acknowl-
      edge Rad Olson and Bill Swope at IBM Almaden Center, whose con-
      tributions were instrumental in developing  the  better  vector
      optimized  non-bonded  routines  first  released  in version 3,
      revision A.  Research support from DARPA, the NIH, and the  NSF
      for  Peter  Kollman  is  gratefully acknowledged, as is support
      from the NIH for David Case. Use of the facilities of the  UCSF
      Computer  Graphics  Laboratory  (Thomas Ferrin, PI) is appreci-
      ated.  We thank Nelson H.F. Beebe of the University of Utah for
      permission  to  include  his ``portable namelist'' code.  Wendy
      Cornell contributed a discussion of charge  derivation  to  the
      manual  and added demos and documentation for the RESP program.
      We also thank Allison Howard and Valerie  Daggett  for  various
      helpful  discussions  and suggestions.  The pseudocontact shift
      code was provided by Ivano Bertini of the  University  of  Flo-
      rence.  Many people helped add features to various codes; these
      contributions are described in the documentation for the  indi-
      vidual  programs.  Jed Pitera, in particular, made a large con-
      tribution to the documentaion for PROFEC in Version 5.


                          Recommended Citations:

      When citing Amber Version 5 in the  literature,  the  following
      citation should be used:

           D.A.  Case,  D.A.  Pearlman,  J.W. Caldwell, T.E. Cheatham
           III, W.S. Ross, C.L. Simmerling, T.A. Darden,  K.M.  Merz,
           R.V.  Stanton,  A.L. Cheng, J.J. Vincent, M. Crowley, D.M.
           Ferguson, R.J.  Radmer,  G.L.  Seibel,  U.C.  Singh,  P.K.
           Weiner  and  P.A.  Kollman  (1997), AMBER 5, University of
           California, San Francisco.

      The basic description of the methods incorporated in  Amber  is
      in:

           D.A.  Pearlman,  D.A. Case, J.W. Caldwell, W.S. Ross, T.E.
           Cheatham, III, S. DeBolt, D. Ferguson, G. Seibel,  and  P.
           Kollman.  AMBER, a package of computer programs for apply-
           ing molecular mechanics, normal mode  analysis,  molecular
           dynamics  and  free  energy  calculations  to simulate the
           structural and energetic properties of  molecules.   Comp.
           Phys. Commun. 91, 1-41 (1995).


      1.  Introduction.

           Amber  is the collective name for a suite of programs that
      allow users to carry out molecular dynamics  simulations,  par-
      ticularly  on  biomolecules.   None  of the individual programs
      carries this name, but the various parts work  reasonably  well
      together, and provide a powerful framework for many common cal-
      culations.  The term amber is also sometimes used to  refer  to
      the  empirical force field that is implemented here.  It should
      be recognized however, that the code and force field are  sepa-
      rate:  several  other  computer  packages  have implemented the
      amber force field, and other force fields  can  be  implemented
      with  the  amber  programs.  Further, the force field is in the
      public domain,  whereas  the  codes  are  distributed  under  a
      license agreement.

           Amber  5  (1997)  represents a significant change from the
      most recent previous version, 4.1, which was released in  1995.
      Briefly, the major differences include:

       (1)   an updated and parallelized implementation of the parti-
             cle-mesh Ewald routine, and its incorporation  into  the
             free energy module;

       (2)   "locally-enhanced sampling" (LES) code that allows parts
             of the system to be present as multiple copies;

       (3)   an alternate version of Sander (ROAR) that includes  the
             ability  to  define  part  of  the  system as a quantum-
             mechanical section (QM/MM), and includes alternate inte-
             grators;

       (4)   PROFEC   (Pictorial   Representation   of   Free  Energy
             Changes), a set of tools for carrying out and displaying
             extrapolative free energy changes;

       (5)   new  and parallelized methods for NMR refinement; incor-
             poration of penalites based on pseudocontact shifts.

       (6)   updates to the functionality and stability of LEaP.


      1.1.  What to read next.

           If you are installing this package or want to  redimension
      the  code,  see installation section of this manual.  New users
      should continue with this introduction section, and  will  also
      find  the  tutorial  section  useful.   The  directories  under


      amber5/demo contain a number of systems that may serve as exam-
      ples.   You  should  familiarize yourself with the files in the
      database directory, amber5/dat, by looking  over  the  database
      section  of  the  manual.   There  is also lots of information,
      tips,   and   examples   on   the   Amber    Web    pages    at
      http://www.amber.ucsf.edu; a portion of this is included in the
      distribution tape: point your browser to amber5/Web/index.html.
      Although  Amber  may appear dauntingly complex at first, it has
      become easier to use over the past few years,  and  overall  is
      reasonably straightforward once you understand the basic archi-
      tecture and option choices.  Hundreds of people have learned to
      use Amber; don't be easily discouraged.


      1.2.  Information flow in Amber.

           Understanding where to begin in Amber is primarily a prob-
      lem of managing the flow of information in this  package.   You
      first  need  to  understand  what  information is needed by the
      energy programs (gibbs, sander, spasms, roar and  nmode).   You
      need to know where it comes from, and how it gets into the form
      that the energy programs need.  This section is meant to orient
      the  new  user, and is not a substitute for the individual pro-
      gram documentation.

      Information all the energy programs need:

       (1)   Cartesian coordinates for each atom in the system.

       (2)   "Topology":   connectivity,  atom  names,  atom   types,
             residue names, and charges.

       (3)   Force  field:  Parameters  for all of the bonds, angles,
             dihedrals, and atom types in the system.

       (4)   Commands: The user specifies the procedural options  and
             state parameters desired.

      This  information  is  provided to the energy programs in three
      files: One contains the coordinates; the  second  contains  the
      topology and parameters, and is called the "topology file"; the
      command or "input" file is the third  file.   Additional  files
      may needed for special options specified in the command file.

           Cartesian coordinates usually come from Xray crystallogra-
      phy, NMR spectroscopy, or model-building.   They should  be  in
      Brookhaven  Protein  Databank  (PDB)  format.  The program LEaP
      provides a platform for carrying out  many  of  these  modeling
      tasks, but users may wish to consider other programs as well.

           Topology  comes  from the database:  The database is found
      in the amber5/dat directory.  It is  called  db94.dat,  and  is
      described  in Chapter 3.  It contains topology for the standard


      amino acids as well as N and C-terminal  charged  amino  acids,
      DNA,  and  RNA.  The database contains default internal coordi-
      nates for these monomer units, but  coordinate  information  is
      usually  obtained  from  PDB  files.   Topology information for
      other molecules (not found in the standard database) is kept in
      user-generated "residue files".

           The  basic  force  field  parameters are also found in the
      amber5/dat directory; the database section of the  manual  con-
      tains   some  detailed  descriptions  of  various  force  field
      options.  This file may  be  used  "as  is"  for  proteins  and
      nucleic  acids,  or users may prepare their own files that con-
      tain modifications to the "standard" force fields.


         data-         seq          pdb         force
         base                                   field
           |            |             |            |
           |            |             |            |
       +++++++++++++++++++++++++++++++++++++++++++++++++
       +   |            |             |            |  prmtop
       + prep db94.datlink  lnkbin--edit edtbin--parm -+-----nmode
       +                                           |  prmcrd    |
       +++++++++++++++++++++++++++++++++++++++++++++++++        |
   (also handled by LEaP)                       prmtop          |
                                                prmcrd          |
                                                   |        nmanal,
                                    con-       sander,       lmanal
                                  straints      gibbs,
                                                spasms
                                                   |
                                                   |
                                                   |
                                       carnal      |     anal
                                                mdanal


                      Basic information flow in AMBER
      _______________________________________________________________


      1.3.  Preparatory programs.


      LEaP is the primary program to create a new system in Amber, or
           to modify old systems.  It combines the  functionality  of
           prep,  link,  edit, and parm outlined below.  These latter
           programs are retained primarily for backward compatibility
           with older versions of Amber.


      PREP creates or adds to a residue database from the appropriate
           topology/parameter information.  Required for residues not
           already  defined in the standard AMBER database.  (As sup-
           plied, the standard AMBER  database  contains  definitions
           for  the  20  standard amino acids, nucleic acids with the
           five standard bases, and a few other units).

      LINK deals only with  topology.   You  tell  LINK  the  residue
           sequence  of  your  molecule  (even  if  there is only one
           residue).  LINK will extract the topology information  for
           each  residue from the standard AMBER database or, option-
           ally, from the residue database files created  with  PREP.
           The  topology  for each residue will be linked together to
           form the topology for the system.  This is  written  to  a
           binary  file (default name = lnkbin) that is read by EDIT.

      EDIT deals mainly with coordinates.  One of  the  primary  pur-
           poses of EDIT is to read PDB coordinates and apply them to
           the system generated by LINK.  Coordinates for atoms  that
           are  missing from the PDB file (usually hydrogens) will in
           most cases be generated automatically by EDIT,  using  the
           stored  internal  coordinates  in  the  link  binary  file
           link.bin.  EDIT can be  used  to  solvate  a  molecule  in
           water,  to  add counterions to nucleic acid systems, or to
           alter coordinates in specific ways.

      PARM will determine which bonds, angles,  dihedrals,  and  atom
           types  exist  in  the  system, and extract the appropriate
           parameters for them from the force field file.  PARM  then
           writes  the  final coordinate and topology files needed by
           all other AMBER programs.  used to add simple  non-varying
           internal  coordinate restraints to the system, and to cre-
           ate a two-state topology file for use in GIBBS free energy
           perturbation  calculations.   PARM outputs two files which
           are used subsequently: The topology file (default  name  =
           prmtop);  and the coordinate file (default name = prmcrd).

      PROTONATE
           This program will add hydrogens in  appropriate  locations
           to  peptides  and  proteins  that  lack them.  It can also
           check the suitability of protons that are already present,
           and  convert from one naming system to another (e.g.  from
           IUPAC-IUB recommendations to Brookhaven format.)


      1.4.  Energy programs.


      SANDER
           is the basic energy minimizer and molecular dynamics  pro-
           gram.   This  program relaxes the structure by iteratively
           moving the atoms down the energy gradient until  a  suffi-
           ciently  low  average  gradient  is obtained.   Structures


           should usually be minimized before molecular dynamics sim-
           ulation.  The molecular dynamics portion generates config-
           urations of the system by integrating Newtonian  equations
           of motion.  MD will sample more configurational space than
           MIN, and will allow the  structure  to  cross  over  small
           potential  energy barriers.  For complicated systems MD is
           usually able to locate  lower  energy  conformations  than
           simple  energy  minimization.  Configurations may be saved
           at regular intervals during the simulation for later anal-
           ysis.

           More  elaborate  conformational  searching and modeling MD
           studies can also be carried out using the  SANDER  module.
           This  allows  a  variety of constraints to be added to the
           basic force field, and has been  designed  especially  for
           the  types  of  calculations  involved  in  NMR  structure
           refinement.

      GIBBS
           is the free energy perturbation program.  It is similar to
           SANDER,  but uses the ensemble of generated configurations
           to calculate the free energy difference between two  simi-
           lar  states through either a perturbation or thermodynamic
           integration approach.  The two states are defined  by  the
           user in LEaP or PARM.

      NMODE
           is  both  a  quasi-Newton Raphson second derivative energy
           minimizer and vibrational  analysis  program.   The  NMODE
           minimizer  is  capable  of  obtaining extremely low energy
           gradients.  NMODE can calculate the normal  modes  of  the
           system  as  well  as  numerous  thermochemical properties.
           Other features include the ability  to  compute  "Langevin
           modes"  (normal modes including viscous coupling to a con-
           tinuum solvent,) and techniques to find transitions states
           as well as minima.

      ROAR is  a  "Penn  State" version of sander, that incoporates a
           variety of features not found in sander itself.  The  most
           notable  change  is  the  incorporation  of the ability to
           define a part of the system quantum-mechanically, allowing
           it  to  interact  with  other parts of the system that are
           defined in a molecular mechanics sense.  Other features of
           ROAR  include  implementation  of  a  Nose-Hoover-chain MD
           integrator,  Ewald  summations,  and   multiple-time-scale
           integration routines.


      1.5.  Analysis programs.


      ANAL is for the analysis of structure and especially  molecular
           mechanical energy of a single configuration of  a  system.


           It  can be run on structures both before and after modifi-
           cation by the energy programs.  Running ANAL on  the  ini-
           tial  configuration of your system is a good way to locate
           errors in the structure that  result  in  large  energies.
           Anal  can  also be used for more sophisticated analyses of
           energy and structure.

      CARNAL
           is a molecular dynamics analysis program.  It is used  for
           geometrical measurements, root mean square coordinate fit-
           ting, trajectory averaging, and other structural  analyses
           of  MD  trajectories.   CARNAL executes a programming lan-
           guage for  filtering,  measuring  and  comparing  multiple
           streams of coordinate files (the language contains 44 key-
           words and uses 10 punctuation/logical characters).  As  an
           example, one can use it to build a trajectory in which the
           solute is positioned for minimum root mean square  fit  of
           residues  in  the  active site and only the first shell of
           waters is included.

      RDPARM
           is a general purpose utility that can examine  and  modify
           prmtop files created by LEaP or PARM.  It can also process
           trajectory files created from MD simulations, carrying out
           superpositions, extractions of coordinates, etc.

      NMANAL/LMANAL
           computes atomic fluctuations and various correlation func-
           tions from normal modes.


      2.  Installation of Amber 5

       (1)   Compile the basic AMBER distribution:

                  a. go to the src directory below this one.

                  b. create a link to the appropriate Machine file, e.g.
                         ln -s -f Machine/Machine.hp  MACHINE   for HP, or
                         ln -s -f Machine/Machine.sgi MACHINE   for SGI, etc.

                  c. ./Makeall >& make.errs &
                     look at the make.errs file for programs that didn't get made;
                     loader warnings (especially on SGI -- see the Machine.sgi file)
                     can generally be ignored; compiler warnings should be considered,
                     but most are innocuous.


       (2)   Compile LEaP:

                  a. go to the leap directory below this one.

                  b. xmkmf;  make World;  make install.leap;

                     [Note: on some Digital Unix machines, the second step above may
                      need to be: make "DNETLIB=-ldnet_stub" World ]

                  c. Set your LEAPROOT environment variable using the leapSetup.sh
                     or leapSetup.csh scripts in amber5/Leap (depending on whether
                     you use the Boune or C shells).


       (3)   Compile ROAR:

                  a. go to the roar/source directory below this one.

                  b. edit the Makefile, to uncomment the lines appropriate for your machine.

                  c. make install


       (4)   Make the database files:

                  a. go to the dat directory below this one.

                  b. make


       (5)   Make Interface (optional):

                  a. go to the interface directory below this one.

                  b. source install_int; source install_ambint.


       (6)   Test the basic AMBER distribution

                  a. go to the test directory below this one.

                  b. ./Run.tests >& tests.errs &

                     Examine the tests.errs file: where "possible FAILURE" messages are
                     found, go to the indicated directory under amber5/test, and look
                     at the *.dif files.  Differences should involve round-off in the
                     final digit printed, or occasional messages that differ from machine
                     to machine.

                     The "standards" to which your output will be compared are contained
                     in the demo dirctory.  The "default" values were run at UCSF on
                     an HP-735 workstation.  You should expect to see slighly different
                     detailed MD trajectories on other machines, especially in gibbs,
                     although the "statistics" (which are the real answers) should be
                     very close to our results.  The transition-state theory test may
                     give a large number of differences -- see the comments in the output
                     of that test to help interpret it.

                  c. go to the leap/test directory below this one

                  d. ./Run >& tests.errs &

                     Again, examine the tests.errs file for indications of more than
                     round-off error differences.

                  e. There is no automated test for gibbs with PME; if you are interested
                     in this option, please look at the files in
                     amber5/demo/sodium_gibbs_pme.

                  f. There is no automated test for addles; if you are interested in this
                     option, please look at the files in amber5/demo/addles.


      2.1.  Installing by hand.

           If your system is not  included  among  the  configuration
      files  supplied,  or  if you want to alter the existing file or
      are curious how these files hide machine dependencies, see  the
      file  amber5/src/Machine/0README.   If  you  are  developing or
      changing the MACHINE file, you may want to go more slowly, com-
      piling  and  testing.  You can type make or make install in any
      src/ directory to make the program(s) in that  directory,  e.g.
      when  redimensioning  arrays  or  otherwise modifying the code.
      make clean will cause all .o files  to  be  removed;  otherwise
      they stay around using significant space to conserve recompila-
      tion time.


      2.2.  Testing.

           We have installed and  tested  AMBER  5  on  a  number  of
      machines,  using  Cray, IBM, Sun, Hewlett-Packard, DEC, Convex,
      and Silicon Graphics hardware.   However,  owing  to  time  and
      access  limitations,  not  all combinations of code, compilers,
      and operating systems have been tested.  Therefore we recommend
      running the test suites.

           The  distribution  contains a validation suite that can be
      used to help verify  correctness.    The  nature  of  molecular
      dynamics,  and  to a lesser extent molecular mechanics, is such
      that the course of the calculation is  very  dependent  on  the
      order  of  arithmetical  operations  and the machine arithmetic
      implementation, i.e. the method used  for  roundoff.    Because
      each step of the calculation depends on the results of the pre-
      vious step, the slightest difference will eventually lead to  a
      divergence in trajectories.  As an initially identical dynamics
      run progresses on two different machines, the trajectories will
      eventually become completely uncorrelated.  Neither of them are
      ``wrong;'' they are just exploring different regions  of  phase
      space.   Hence,  states  at the end of long simulations are not
      very useful for verifying correctness.  Averages  are  meaning-
      ful,  provided  that  normal statistical fluctuations are taken
      into account.  ``Different machines'' in this context means any
      difference  in  floating point hardware, word size, or rounding
      modes, as well as any differences in  compilers  or  libraries.
      Differences  in  the order of arithmetic operations will affect
      roundoff behavior; (a + b) + c is not necessarily the  same  as
      a  +  (b  + c).  Different optimization levels will among other
      things affect operation order, and  may  therefore  affect  the
      course of the calculations.

           When  comparing the output from two different machines for
      purposes of verification, it is very important  that  identical
      input files be used to generate both sets of output.  The vali-
      dation  suite  uses  matched  inputs   and   outputs   in   the
      amber5/demo/  tree, which is set to read-only to help you avoid
      overwriting them with files created on your  machine.   Testing
      takes place in the amber5/test/ tree.

           All  initial values reported as integers should be identi-
      cal.  The energies and temperatures on the first  cycle  should
      be identical.  The RMS and MAX gradients reported in sander are
      often more precision sensitive than the energies, and may  vary
      by  1 in the last figure on some machines.  As is the case with
      sander, the trajectory in a Gibbs simulation will diverge,  but
      the  resulting  free energy should not if the simulation is run
      to convergence (this is not done because of the time involved).
      In minimization and dynamics calculations, it is not unusual to
      to see small divergences in behavior after as little  as  1-200
      cycles.


           In general, compiler and optimizer errors are fairly obvi-
      ous, and result in rather large changes in the output,  if  you
      get  any  output  at  all.   See  test/0README  for examples of
      acceptable output differences and discussion  of  peculiarities
      of various machines.


      2.3.  Memory Requirements.

           The  AMBER 5 programs as distributed are dimensioned for a
      fairly large system (about 10K atoms),  and  you  may  want  to
      change  their dimensions to be more appropriate for the machine
      you are using if you are running in a tight memory environment.
      See  src/0README for information on resizing. Some programs use
      local scripts called resize.csh; this script  uses  the  stream
      editor  sed,  and  employs  regular expression matching to cor-
      rectly redimension the code regardless of what  its  dimensions
      are currently set to, even if the same parameter has been inad-
      vertently set to different values in different  modules.   Here
      we  describe  dimensioning of sander as an example.  Sander can
      also be compiled the MEM_ALLOC switch set in the MACHINE  file.
      This  causes  arrays  to be allocated dynamically, based on the
      size of the molecule.  If this works on your  machine,  it  can
      eliminate most of the resizing effort described below.

           In src/sander/sizes.h, you will find the following parame-
      ters that might need to be changed:

                 parameter (MAXINT=1300000)
                 parameter (MAXPR=2500000)
                 parameter (MAXREA=340000)
                 parameter (MAXHOL=200000)
                 parameter (MAXDUP=5000)


      The actual memory requirements for  a  particular  job  can  be
      determined  from  the  output of SANDER or GIBBS.  An annotated
      example follows:


         +-------------------------------------------------------------+
         |                                                             |
         | 1.  R E S O U R C E   U S E:                                |
         |                                                             |
         |                                                             |
         |   Memory Use       Allocated      Used                      |
         |   Real              340000        19109  <-- minimum MAXREA |
         |   Integer          1300000        29904  <-- minimum MAXINT |
         |                                                             |
         |   Max Nonbonded Pairs: 1270096 packed  1 to a machine word  |
         |                                        ^                    |
         |                                     "NWDVAR"                |
         |   Duplicated   26 dihedrals                                 |
         |                                                             |
         |   Duplicated   94 dihedrals  <-- minimum MAXDUP             |
         |       .                                                     |
         |       .                                                     |
         |       .                                                     |
         |                                                             |
         |    NB-UPDATE: NPAIRS =  150395  HBPAIR =    2804            |
         |                                                             |
         +-------------------------------------------------------------+


           As is shown above, MAXREA must be at least as large as the
      number  of  Real  words  used.  It can be read directly off the
      output.  MAXDUP must be at least as large as the larger of  the
      two  'duplicated'  values  given, in this case 94.   The actual
      amount of memory controlled by MAXDUP is 10  times  its  value.
      MAXINT  is  slightly  more complicated, since it depends on the
      type of pairlist packing used.  The number of NB pair  pointers
      packed in a native integer word, NWDVAR, is printed in the out-
      put as shown.  It will be 1 or 2 on byte-oriented machines,  1,
      2,  or  4 on 64 bit machines like Cray or FPS264.   The minimum
      value of MAXINT is determined by the sum of the static  integer
      requirement  given  in  the  output.   For  gibbs,  MAXINT also
      includes the requirement for the pairlist, while in sander  the
      pairlist size is determined by MAXPR.  The pairlist requirement
      is the total number of nonbonded pairs, NPAIRS, divided by NWD-
      VAR.   Because  the number of pairs may grow or shrink during a
      run, you should include a safety  factor  of  5-10%  extra  for
      NPAIRS.   The algorithm to determine the MAXPR (sander) or MAX-
      INT pairlist component is thus:

                           (NPAIRS/NWDVAR) * 1.1

      More detailed documentation on memory use and packing  configu-
      ration  is  found  in sander/sizes.h.  For gibbs, the variables
      which define memory allocation are MAXREA, MAXINT, and  MAXCHR.
      MAXREA  and MAXINT can be set as described above.  MAXCHR allo-
      cates  character  storage,  is  typically  small,  and   scales


      linearly  with the number of atoms. These parameters are scaled
      to a single parameter, memmax.