Introducing DOT

A key problem in molecular biology is the detection and analysis of interactions between biological macromolecules. Applications include examining cellular metabolism, finding the most stable relative orientations between two proteins, studying protein subunit aggregation, performing computer-aided drug design, and solving problems of cellular signaling and expression.

DOT, or Daughter of Turnip, is based on the program TURNIP, developed by Victoria Roberts at The Scripps Research Institute for the study of macromolecular docking. DOT improves on TURNIP in three ways: it scales to solve large problems on large networks of computers, it uses geometric fit information instead of the closest distance between the molecules, and it uses full Poisson-Boltzmann electrostatic potentials instead of the Coulombic electrostatic model used by TURNIP.

The primary goal of DOT is to obtain a short list of candidate positions in a relatively short time. A "position" in this context refers to the placement of one molecule relative to the other, defined by a rotation and a translation in three-dimensional space. DOT holds one of the molecules fixed and takes a list of rotations to apply to the second (moving) molecule. For each rotation, a convolution operation computes the correlation between the potential field of the fixed molecule and the rotated charge distribution of the moving molecule. From the result, DOT computes the electrostatic interaction energies and the van der Waals interactions. Each convolution operation computes the energy over the entire spatial grid for a fixed rotation. DOT thus creates a list of promising relative positions between the two molecules. The resulting positions can be refined further with other programs that perform energy minimization via molecular dynamics or free energy simulation, or by visual inspection. Because DOT surveys the whole search space, its global view of molecular interactions can give additional insight into the binding process compared to local sampling and optimization methods.
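The search described above can be sketched on a toy problem. The following is a minimal illustration, not DOT's implementation: it evaluates the correlation directly, one translation at a time, on a small cyclic grid, whereas a production code would use a much faster convolution. The function names, the 4x4x4 grid, and the two-charge "molecule" are all hypothetical.

```python
from itertools import product

def correlate(potential, charges, n):
    """Energy for every cyclic translation t on an n*n*n grid.

    E(t) = sum over charges of q * potential[charge position + t]
    """
    energies = {}
    for t in product(range(n), repeat=3):
        total = 0.0
        for (x, y, z), q in charges.items():
            # A charge at (x, y, z) on the moving molecule lands on
            # grid point (x, y, z) + t, wrapped around the grid.
            p = ((x + t[0]) % n, (y + t[1]) % n, (z + t[2]) % n)
            total += q * potential[p]
        energies[t] = total
    return energies

def best_positions(potential, rotated_charge_sets, n, keep=3):
    """Scan every rotation and translation; return the lowest-energy positions."""
    scored = []
    for rot_index, charges in enumerate(rotated_charge_sets):
        for t, energy in correlate(potential, charges, n).items():
            scored.append((energy, rot_index, t))
    scored.sort()  # lowest (most favorable) energy first
    return scored[:keep]

# Toy potential field of the fixed molecule (assumed values).
n = 4
potential = {p: 0.0 for p in product(range(n), repeat=3)}
potential[(2, 1, 0)] = -5.0
potential[(3, 1, 0)] = 2.0

# Two pre-rotated copies of a hypothetical two-charge moving molecule.
rotations = [
    {(0, 0, 0): 1.0, (1, 0, 0): -1.0},  # rotation 0: charges along x
    {(0, 0, 0): 1.0, (0, 1, 0): -1.0},  # rotation 1: charges along y
]

top = best_positions(potential, rotations, n, keep=1)
print(top)  # [(-7.0, 0, (2, 1, 0))]
```

Each entry in the result pairs an energy with a rotation index and a translation, which together make up one candidate position in the sense defined above.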

At first glance, it may seem that DOT will run slowly because it performs an exhaustive search through six degrees of freedom (three translations and three rotations). For a typical protein-protein docking problem, this is on the order of 30 billion configurations. However, the calculation is efficient for several reasons:

It can use several workstations in a single run. When the number of systems is doubled, a single job can complete in about half the time.
The algorithms chosen are extremely efficient. Given a rotation, a single convolution operation calculates the energies for all translational possibilities at once.
The type of arithmetic performed by DOT is well suited to today's microprocessors and to big-iron computers such as CRAY, IBM SP, and Convex systems.
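To make the size of the search and the benefit of several workstations concrete, here is a rough sketch. The grid size (128 points per axis) and the rotation count (14,000) are illustrative assumptions chosen to land near the 30 billion figure; they are not values taken from DOT. The round-robin split is likewise only one plausible way to divide a rotation list, since each rotation can be processed independently, and is not a description of how DOT schedules its work.

```python
def partition_rotations(rotations, worker_count):
    """Deal rotations round-robin so each worker gets a near-equal share."""
    return [rotations[i::worker_count] for i in range(worker_count)]

# Assumed problem size, for illustration only.
grid_points_per_axis = 128
rotation_count = 14_000

translations = grid_points_per_axis ** 3       # one per grid point
configurations = translations * rotation_count  # rotations x translations
print(f"{configurations:,} configurations")     # 29,360,128,000 configurations

# Split the rotation list across four hypothetical workstations.
shares = partition_rotations(list(range(rotation_count)), 4)
share_sizes = [len(share) for share in shares]
print(share_sizes)  # [3500, 3500, 3500, 3500]
```

With equal shares and negligible communication cost, each workstation handles a quarter of the rotations, which is the idealized case behind the halving claim above.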

Although the calculation is efficient, it still requires substantial resources. DOT saves both the best energy and the collision values for each translation, resulting in two 3-D grids of data requiring the same storage as the potential field of the fixed molecule. There is also a per-orientation statistics level that can be set to save all computed energy values; at this level, DOT generates about 90 megabytes of output. (??? Processor and memory resources?)
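A back-of-the-envelope estimate of the storage for the two result grids might look like the sketch below. It assumes one value per grid point in each grid, 4-byte values, and a 128-point grid; none of these figures come from DOT, so substitute the numbers for your own problem.

```python
def result_grid_bytes(points_per_axis, bytes_per_value=4, grid_count=2):
    """Bytes needed for `grid_count` cubic grids with one value per grid point."""
    return grid_count * points_per_axis ** 3 * bytes_per_value

# Assumed grid size, for illustration only.
total_bytes = result_grid_bytes(128)
print(total_bytes / 2**20, "MiB")  # 16.0 MiB
```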

Pre-computed lists of rotations are provided with DOT for use in your calculations. A tool for generating your own rotation lists is also provided, but we do not expect that everyone will need it.