IU MolViz User Guide

Insight II Notes: Discover


Introduction

The Discover program can minimize the energy in your structure or can be used to perform molecular dynamics. The Discover program is accessed from within InsightII. Click the left mouse button the MSI logo (module icon) in the upper left corner and then select discover. A new row of pull-down menu buttons will appear.

While manuals are usually too verbose and poorly written, the Discover manual (part 1) is a great review of molecular mechanics calculations. The following is a summary of this manual as it pertains to the Discover menu selections.


Constraints

Atoms can be constrained during the calculation through the Constraint pull-down. Many minimizations can be accelerated by fixing uninteresting parts of the molecule; if you minimize a large number of NON-FIXED atoms, the calculation will take hours to finish. Other types of constraints can force the molecule to assume desired shapes and properties, but they ADD to the computation time. For example, if a new inhibitor is placed in an active site, the inhibitor and active site residues can be allowed to move freely while the rest of the protein is fixed, which speeds the calculation. For more sophistication, the layer of residues next to the active site could be tethered to allow limited movement, at the expense of computation time.
Fix
specified atoms will not move. It's very helpful to use the Color_Constraint option to color the parts of your molecule that are fixed, after the fixed parts are defined. While you can ADD atoms & residues to the fixed atom list, DELETEing atoms & residues from this list doesn't work unless you delete exactly what you added. Therefore, you can't ADD all residues and then delete selected residues.

Generic Dis
The distance between specified atoms will be forced to a specified range. This is used in NMR calculations, where the NOE measurement can determine which pairs of hydrogens must be separated by1.8-5 Å. This can also be used to define a hydrogen bond, by forcing the distance between a hydrogen and a hydrogen-bond acceptor to be within 1-1.5 Å, and by forcing the hydrogen bond donor and acceptor to be within 2-3 Å.

The potential energy well defined by this menu has a lower distance and an upper distance. If the two atoms lie between these distances, there is no energy penalty. The steepness of the parabolic walls of this potential energy well are defined by the upper force constant and the lower force constant. A maximum force value is also defined, so that the energy doesn't approach infinity when the atoms are really far apart or really close together.

Rotor
used to rotate molecular segments during minimizations. For instance, a side chain can be rotated by 60 degree increments, and after each rotation, a minimization can be performed to search with angle with minimum energy. MSI/Search_Compare is similar to this (and more robust) and MSI/Biopolymer//Residue/Auto_Rotamer performs this automatically for amino acids.

Template Force
forces the conformation of one molecule to look like another molecule. The molecules can have different types and numbers of atoms, but you must specify which atom pairs must attempt to superimpose. The Viewer/Transform/superimpose will only superimpose molecules (or parts of molecules) with identical numbers and types of atoms.

Tether
forces atoms to stay within a specified range from their original positions.

Torsion Force
forces torsion angles to take specified values. For instance, can be used to constrain the bond angle of a proposed hydrogen bond.

Warp
used to set up free energy calculations.


Setting up the Minimization

The potential energy of a biomolecule can be plotted as a multi-dimensional grid, which can be considered more simply as a two-dimensional topographic map. When you minimize the molecule's potential energy, your are trying to reach the nearest minimum, or well, on this map. Minimization algorithms calculate the derivative of the current point on the map, and then determine which way to "move" (i.e., move the atoms) to reach the minimum. Minimization parameters are set up with Parameter/Minimize pull-down menu.

Select 1 of the four algorithms:

Steepest
The steepest descents algorithm (with line searches) is the most basic algorithm. It's slow, but it never encounters problems. Using this algorithm first, especially if your molecule is far from the minimum and until the derivative <10. Then switch to another algorithm.

Conjugate
The conjugate gradients algorithm is fairly fast, but it fails if there are saddle-points on the potential energy surface. Use this to finish the minimization of molecules with more than a few hundred non-fixed atoms.

VA09A
This is a quasi-Newton-Raphson method, which only used the first derivatives of a Hession matrix. Not as accurate as Newton, but EXTREMELY fast. Fails if you are far from the minimum. No limit on the number of atoms, but try this only for 200 non-fixed atoms AT MOST (a 1,000 atom system approaches the limits of a Cray-XMP supercomputer).

Newton
The Newton-Raphson method used the full Hession matrix. Fast, but fails if you are far from the minimum. No limit on the number of atoms, but try this only for 200 non-fixed atoms AT MOST. Use VA09A instead unless you need extreme accuracy (for instance, for calculating bond vibrational modes).
Set the number of iterations to 10-1000 for the Steepest algorithm and 100-5000 for the other algorithms. You may want to set the number of iterations to a very small value, just to make sure that the calculation and constraints will work as you desire. Then increase the number of iterations and do the full minimization. If the minimization doesn't "converge" (see below), you can do another minimization of the structure on the screen (the ending structure of the previous round of minimization).

For biomolecules, you rarely calculate an absolute minimum. Eventually, you reach a flat "floor" in the "well" of the potential energy surface, and the algorithms can wander around this floor forever, hunting for the absolute minimum between very tiny bumps in the floor. The derivative determines when you are satisfied that the minimization has reached the floor of the potential energy well. The units of the derivative are kcal mol-1Å-1. When the MAXIMUM derivative of all energy terms for all atoms is below the derivative parameter, the minimization has "converged" and the calculation will stop. The derivative reported on the screen during the minimization in the AVERAGE derivative of all terms for all atoms, which can be very different than the MAXIMUM derivative. Every molecule is different, but follow these guesses:

The ONLY way to tell if you have reached an acceptable minimum is to save the structure after minimizing to a derivative of 1, 0.5, 0.1, 0.05,……, and then compare the structures to see when the structure stops moving to your satisfaction. You should also plot the energy and derivative vs. Number of iterations. However, this is a lot of work.

Toggle on Charges, Cross or Morse to modify the forcefield terms.

If ALL non-bond interactions were evaluated in every iteration, the calculation would be very slow. A non-bond cutoff is used to neglect non--bond interactions beyond a specified distance. After a number of iterations, a matrix is generated which lists atom pairs which are within the specified distance, and this matrix is used for calculating non-bond energy terms until a new matrix is created.

Also, each atom in a "group" must evaluate non-bond interactions will all other atoms in the same "group". If the "group" is longer than 8 Å, then the cutoff values must be increased. This is a common problem for nucleotides, since nucleotide base is a "group" that is wider than 8 Å, and the cutoff values must be increased to 12 Å. Carbohydrates or large side chains of unnatural amino acids also have this problem. Also, it has been shown that non-bond interactions are significant up to 15 Å. The Parameters/Value menu is used to modify non-bond cutoffs:

The dielectric can be modified with the Parameters/Set pull-down menu. If you have turned off charges in Parameters/Minimize, the dielectric is not important. If you have turned charges on, set dielectric = 1.0 or choose a distance-dependent dielectric.

Most rough calculations are done in a vacuum, while biopolymers usually exist in solution. We should include a large number of water molecules around the biopolymer, but the calculation would be very big. As a compromise, many people use a distance-dependent dielectric.

A distance-dependent dielectric is used to crudely simulate solvent. If two atoms are separated by a small distance, the probability that the space between the atoms is occupied by a high-dielectric solvent is small; if two atoms are separated by a large distance, the probability that the space between the atoms is occupied by a high-dielectric solvent is large. If the dielectric is multiplied by R (the distance between the atoms), this dielectric crudely simulates this probability.

Electrostatics is not well-defined for biopolymers, especially because the dielectric medium between two atoms inside the core of a protein is different for every atom pair. DelPhi will calculate the dielectric field of a protein's interior and exterior, but the calculation is too laborious to include it in a minimization calculation. There is just no good way to handle electrostatics at this time.

The individual terms of the potential energy function describe:

These terms can be scaled using the Parameters/Scale_Terms. For example, non-bond interactions can be turned off (scale = 0) during an initial minimization of an extended peptide chain with NMR restraints. This allows atoms to pass through each other (no VDW radius) while maintaining the bonding skeleton and while searching for a secondary structure that satisfies the NMR data. Also, the Coulombic terms could be turned off during the initial minimization of a new sketch of a highly charged molecule. The charges would not influence the "clean-up" of the bonds and bond angles. Terms that are turned off should be turned on gradually:

WARNING!!! If you are using the AMBER forcefield, you must scale the p1-4 term (the last term in this menu) by 0.5.


Running the Minimization

The normal output supplies you with more than enough information. However, to change the output on the screen and the output of new files to your directory, use Run/Report and Fun/Files. By default, all output is displayed on the screen and all files are saved. See also the Description of Discover Files.

To run the minimization, click on Run/run. Make sure that you have selected the local machine, the interactive computation mode, the correct molecule, add_auto, run_minimization, , and reduce_output (for better output control, see run/files) and run_interactive. It's better to run interactively, make sure that the calculation starts, and then detach the calculation (see below) rather than running directly in "batch" mode. Advanced users can set up and run a command file which contains a customized setup of minimization parameters. Then click on execute. After preparing for the calculation, the program will show text in the top center of the screen.

To stop minimization, click on run/stop. Some minimizations may take hours, and you can detach from the calculation so that it continues to run while you do something else. Click on Run/Detach. To return to the calculation, click on molecule/get and get the Biosym-format .car file, and then click on Run/Attach. If the minimization has finished, you cannot attach to the finished calculation. Go to a UNIX shell and look at the file with the .out extension to determine if the minimization met the derivative threshold, or if the minimization ran out of steps. Look at the file with the .pek extension to "peek" at the calculation while it is running.

If Discover reports an error while setting up the calculation, of if the minimizations ends and if your are not sure the derivative was met, go to a UNIX shell (Viewer/Session/Unix or just type unix and hit return TWICE) and type is to display the files that Discover generated. Discover used the name of your molecule and appends it with a number (i.e., first minimization of Crambin is Crambin1, second minimization is Crambin2, then Crambin3, etc. If I do anything to the structure after two minimizations runs, the name of the structure will automatically change to Crambin2, and subsequent minimizations will be Crambin21, Crambin22, Crambin23, etc.). There are many files generated by Discover. The most important file has the suffix .out (e.g., Crambin1.out). To see the last lines of Crambin1.out, type tail Crambin1.out. To read the entire file (it's pretty large)., type more Crambin1.out. If the minimization ran out of steps before minimizing below the derivative, you will see a message about an "abnormal end to minimization". Continue the minimization by starting another run. Continue minimizing until the convergence criterion is met. Many other errors can be difficult to understand or fix.


Back to  |  Insight II Notes  |  Application Guide  |  MolViz Home  |
Send comments to chemvis@indiana.edu
Last updated: 01/23/2001