IU MolViz User Guide
Insight II Notes: Discover
Introduction
The Discover program can minimize the energy in your
structure or can be used to perform molecular dynamics. The Discover
program is accessed from within InsightII. Click the left mouse
button the MSI logo (module icon) in the upper left corner
and then select discover. A new row of pull-down menu buttons will
appear.
While manuals are usually too verbose and poorly written,
the Discover manual (part 1) is a great review of molecular mechanics
calculations. The following is a summary of this manual as it
pertains to the Discover menu selections.
Constraints
Atoms can be constrained during the calculation through
the Constraint pull-down. Many minimizations can be accelerated
by fixing uninteresting parts of the molecule; if you minimize
a large number of NON-FIXED atoms, the calculation will take hours
to finish. Other types of constraints can force the molecule to
assume desired shapes and properties, but they ADD to the computation
time. For example, if a new inhibitor is placed in an active site,
the inhibitor and active site residues can be allowed to move
freely while the rest of the protein is fixed, which speeds the
calculation. For more sophistication, the layer of residues next
to the active site could be tethered to allow limited movement,
at the expense of computation time.
- Fix
- specified atoms will not move. It's very helpful
to use the Color_Constraint option to color the
parts of your molecule that are fixed, after the
fixed parts are defined. While you can ADD atoms & residues to
the fixed atom list, DELETEing atoms & residues from this list
doesn't work unless you delete exactly what you added.
Therefore, you can't ADD all residues and then delete selected residues.
- Generic Dis
- The distance between specified atoms
will be forced to a specified range. This is used in NMR
calculations, where the NOE measurement can determine
which pairs of hydrogens must be separated by1.8-5 Å.
This can also be used to define a hydrogen bond, by forcing the
distance between a hydrogen and a hydrogen-bond acceptor
to be within 1-1.5 Å, and by forcing the hydrogen bond
donor and acceptor to be within 2-3 Å.
The potential energy well defined by this menu has a lower distance and an upper distance.
If the two atoms lie between these distances, there is no energy penalty.
The steepness of the parabolic walls of this potential energy well are defined by the
upper force constant and the lower force constant. A maximum force value is also defined, so that
the energy doesn't approach infinity when the atoms are really far apart or really close together.
- Rotor
- used to rotate molecular segments during minimizations.
For instance, a side chain can be rotated
by 60 degree increments, and after each rotation,
a minimization can be performed to search with angle with minimum
energy. MSI/Search_Compare is similar to this (and more robust)
and MSI/Biopolymer//Residue/Auto_Rotamer performs this automatically
for amino acids.
- Template Force
- forces the conformation of one molecule
to look like another molecule. The molecules
can have different types and numbers of atoms, but
you must specify which atom pairs must attempt to superimpose.
The Viewer/Transform/superimpose will only superimpose molecules
(or parts of molecules) with identical numbers and types of atoms.
- Tether
- forces atoms to stay within a specified range
from their original positions.
- Torsion Force
- forces torsion angles to take specified
values. For instance, can be used to constrain the bond
angle of a proposed hydrogen bond.
- Warp
- used to set up free energy calculations.
Setting up the Minimization
The potential energy of a biomolecule can be plotted
as a multi-dimensional grid, which can be considered more simply
as a two-dimensional topographic map. When you minimize the molecule's
potential energy, your are trying to reach the nearest minimum,
or well, on this map. Minimization algorithms calculate the derivative
of the current point on the map, and then determine which way to "move"
(i.e., move the atoms) to reach the minimum. Minimization parameters
are set up with Parameter/Minimize pull-down menu.
Select 1 of the four algorithms:
- Steepest
- The steepest descents algorithm (with
line searches) is the most basic algorithm. It's slow, but it
never encounters problems. Using this algorithm first,
especially if your molecule is far from the minimum and until
the derivative <10. Then switch to another algorithm.
- Conjugate
- The conjugate gradients algorithm is
fairly fast, but it fails if there are saddle-points on the
potential energy surface. Use this to finish the
minimization of molecules with more than a few
hundred non-fixed atoms.
- VA09A
- This is a quasi-Newton-Raphson method, which
only used the first derivatives of a Hession
matrix. Not as accurate as Newton, but EXTREMELY
fast. Fails if you are far from the
minimum. No limit on the number of atoms, but try
this only for 200 non-fixed atoms AT MOST
(a 1,000 atom system approaches the limits of a Cray-XMP supercomputer).
- Newton
- The Newton-Raphson method used the full
Hession matrix. Fast, but fails if you are far from the
minimum. No limit on the number of atoms, but try
this only for 200 non-fixed atoms AT
MOST. Use VA09A instead unless you need extreme
accuracy (for instance, for calculating bond
vibrational modes).
Set the number of iterations to 10-1000 for the Steepest
algorithm and 100-5000 for the other algorithms. You may want
to set the number of iterations to a very small value, just to
make sure that the calculation and constraints will work as you
desire. Then increase the number of iterations and do the full
minimization. If the minimization doesn't "converge"
(see below), you can do another minimization of the structure
on the screen (the ending structure of the previous round of minimization).
For biomolecules, you rarely calculate an absolute
minimum. Eventually, you reach a flat "floor" in the
"well" of the potential energy surface, and the algorithms
can wander around this floor forever, hunting for the absolute
minimum between very tiny bumps in the floor. The derivative determines
when you are satisfied that the minimization has reached the floor
of the potential energy well. The units of the derivative are
kcal mol-1Å-1. When the MAXIMUM derivative
of all energy terms for all atoms is below the derivative parameter,
the minimization has "converged" and the calculation
will stop. The derivative reported on the screen during the minimization
in the AVERAGE derivative of all terms for all atoms, which can
be very different than the MAXIMUM derivative. Every molecule
is different, but follow these guesses:
- for calculating bond vibrations set derivative = 0.001
- for comparing energy values, set derivative < 0.01
- for comparing two conformations of the same structure, set derivative = 0.01 to 0.1
- for comparing two different structures, set derivative = 0.05 to 0.5
- for minimizing before doing dynamics, set derivative = 1.0
The ONLY way to tell if you have reached an acceptable
minimum is to save the structure after minimizing to a derivative
of 1, 0.5, 0.1, 0.05,
, and then compare the structures
to see when the structure stops moving to your satisfaction. You
should also plot the energy and derivative vs. Number of iterations.
However, this is a lot of work.
Toggle on Charges, Cross or Morse to modify the forcefield
terms.
- Charges toggles Coulombic charge terms on/off.
- Cross will
include cross-terms in the minimization. Cross-terms are combinations
of the "bond" energy terms (see below), such as the
combination of bond stretch and dihedral angle terms. It has been
shown that cross-terms are necessary if you need good extreme
accuracy (e.g., when the center bond stretches, it affects the
dihedral angle). Adding cross-terms will slow down the calculation.
Don't add cross-terms if you are far from a minimum.
- Morse substitutes
the Morse potential energy function for bond stretches instead
of using the harmonic well. Don't use Morse when far
from equilibrium. Morse is more accurate than the harmonic well,
but calculation will a little slower.
If ALL non-bond interactions were evaluated in every
iteration, the calculation would be very slow. A non-bond
cutoff is used to neglect non--bond interactions beyond a specified
distance. After a number of iterations, a matrix is
generated which lists atom pairs which are within the specified
distance, and this matrix is used for calculating non-bond energy
terms until a new matrix is created.
Also, each atom in a "group"
must evaluate non-bond interactions will all other atoms in the
same "group". If the "group" is longer than
8 Å, then the cutoff values must be increased. This is
a common problem for nucleotides, since nucleotide base is a "group"
that is wider than 8 Å, and the cutoff values must be
increased to 12 Å. Carbohydrates or large side chains of unnatural
amino acids also have this problem. Also, it has been shown that
non-bond interactions are significant up to 15 Å. The Parameters/Value menu is used to modify non-bond cutoffs:
- Timtmp: used to determine when a new matrix is created
- Cutoff: non-bonded interactions between atom pairs separated by less than this distance will be evaluated.
- Cutof2: used to set up double cutoffs.
- Dseed: used to determine when a new matrix is created
- Cutdis: usually 1-1.5 Å less than cutoff. All non-bond interactions
are zero if atom pairs are separated by this distance or greater.
- Swtdis: usually 1-1.5 Å. Non-bond interactions between atom pairs separated by (cutdis - swtdis) start to
be rounded to zero.
The dielectric can be modified with the Parameters/Set pull-down menu. If you have turned off charges in Parameters/Minimize,
the dielectric is not important. If you have turned charges on,
set dielectric = 1.0 or choose a distance-dependent dielectric.
Most rough calculations are done in a vacuum, while
biopolymers usually exist in solution. We should include a large
number of water molecules around the biopolymer, but the calculation
would be very big. As a compromise, many people use a distance-dependent dielectric.
A distance-dependent dielectric is used to crudely simulate solvent.
If two atoms are separated by a small distance, the probability that
the space between the atoms is occupied by a high-dielectric solvent is small; if
two atoms are separated by a large distance, the probability that
the space between the atoms is occupied by a high-dielectric solvent is large.
If the dielectric is multiplied by R (the distance between the atoms), this
dielectric crudely simulates this probability.
Electrostatics is not well-defined for biopolymers, especially
because the dielectric medium between two atoms inside the core
of a protein is different for every atom pair. DelPhi will calculate
the dielectric field of a protein's interior and exterior, but
the calculation is too laborious to include it in a minimization
calculation. There is just no good way to handle electrostatics at this time.
The individual terms of the potential energy function
describe:
- bond stretches and compressions
- angle rotations
- torsion angle rotations
- planar deformations
- van der Waals repulsion and attraction
- electrostatic repulsion and attraction
- (for AMBER only) hydrogen bonds
These terms can be scaled using the Parameters/Scale_Terms. For example, non-bond interactions can
be turned off (scale = 0) during an initial minimization of an
extended peptide chain with NMR restraints. This allows atoms
to pass through each other (no VDW radius) while maintaining the
bonding skeleton and while searching for a secondary structure
that satisfies the NMR data. Also, the Coulombic terms could be
turned off during the initial minimization of a new sketch of
a highly charged molecule. The charges would not influence the
"clean-up" of the bonds and bond angles. Terms that
are turned off should be turned on gradually:
- First minimization term = 0
- Second minimization term = 0.1
- Third minimization term = 0.25
- Fourth minimization term = 0.6
- Fifth minimization term = 1.0
WARNING!!! If you are using the AMBER forcefield, you must
scale the p1-4 term (the last term in this menu) by 0.5.
Running the Minimization
The normal output supplies you with more than enough information.
However, to change the output on the screen and the output
of new files to your directory, use Run/Report and Fun/Files.
By default, all output is displayed on the screen and all files
are saved. See also the Description of Discover Files.
To run the minimization, click on Run/run. Make sure
that you have selected the local machine, the interactive computation
mode, the correct molecule, add_auto, run_minimization, , and
reduce_output (for better output control, see run/files) and run_interactive.
It's better to run interactively, make sure that the calculation
starts, and then detach the calculation (see below) rather than
running directly in "batch" mode. Advanced users can
set up and run a command file which contains a customized setup
of minimization parameters. Then click on execute. After preparing
for the calculation, the program will show text in the top center
of the screen.
To stop minimization, click on run/stop. Some minimizations
may take hours, and you can detach from the calculation so that
it continues to run while you do something else.
Click on Run/Detach. To return
to the calculation, click on molecule/get and get the Biosym-format
.car file, and then click on Run/Attach.
If the minimization has finished, you cannot attach to the finished
calculation. Go to a UNIX shell and look at the file with the
.out extension to determine if the minimization met the derivative
threshold, or if the minimization ran out of steps. Look at the
file with the .pek extension to "peek" at the calculation
while it is running.
If Discover reports an error while setting up the
calculation, of if the minimizations ends and if your are not
sure the derivative was met, go to a UNIX shell (Viewer/Session/Unix
or just type unix and hit return TWICE) and type is to display the files that Discover
generated. Discover used the name of your molecule and appends
it with a number (i.e., first minimization of Crambin is Crambin1,
second minimization is Crambin2, then Crambin3, etc. If I do anything
to the structure after two minimizations runs, the name of the
structure will automatically change to Crambin2, and subsequent
minimizations will be Crambin21, Crambin22, Crambin23, etc.).
There are many files generated by Discover. The most important
file has the suffix .out (e.g., Crambin1.out). To see the last
lines of Crambin1.out, type tail Crambin1.out. To read the entire
file (it's pretty large)., type more Crambin1.out. If the minimization
ran out of steps before minimizing below the derivative, you will
see a message about an "abnormal end to minimization".
Continue the minimization by starting another run. Continue minimizing
until the convergence criterion is met. Many other errors can
be difficult to understand or fix.
Back to | Insight II Notes | Application
Guide | MolViz
Home |
Send comments to chemvis@indiana.edu
Last updated: 01/23/2001