DOCK 6 is written in C++ and is functionally separated into independent components, which gives the program a high degree of flexibility. The accessory programs are written in C and Fortran 77. The workflow of DOCK 6 is shown in the figure. The first step is to prepare the input from the geometric coordinates of the receptor and the ligand; docking follows. Three programs are involved in receptor preparation: Sphgen, Grid and DOCK. The process is shown in Figure 2. Sphgen identifies the site of interest and generates the sphere centers that fill the site. Grid generates the scoring grids. DOCK then matches the spheres (generated by Sphgen) with ligand atoms and uses the scoring grids (from Grid) to evaluate the orientations of the ligand. DOCK can also minimize energy-based scores.
Figure 1 Receptor preparation process
For docking, DOCK has to handle two things:
(1) Determine the position of the ligand relative to the protein,
(2) Score the resulting orientation of the ligand.
Both of these parts of DOCK are replaceable: either one can be swapped for the method you need.
(1) Identifying the binding site. DOCK 6.9 can identify potential sites automatically, and every point in such a site is treated as a possible position for a ligand atom. As an alternative to this automatic method, the site can be chosen from the known active site of the protein; a grid can also be generated over that site, with each grid point treated as a sphere center. The guiding principle is to capture the shape features of interest with as few points as possible, without biasing them toward the way known ligands have been found to bind. (This step generates the points on the protein side.)
This still feels a bit abstract. The example later shows how the files are processed to achieve this. Choosing the grid points means weighing not only the bias from known ligands but also the amount of computation... Sure enough, poverty limits what the machine can imagine.
(2) On the ligand side, a number of sphere centers are paired with ligand atoms to position the ligand. Many sets of atom-sphere pairs are produced, and each set contains only a few pairs. The number of sets is limited with a longest-distance heuristic: the distance between two spheres must be approximately equal to the distance between the corresponding ligand atoms. Each set of atom-sphere pairs is used to calculate one orientation of the ligand in the site, and such a set is usually called a match. For each match, the translation vector and rotation matrix that minimize the RMSD between the (transformed) matched ligand atoms and the sphere centers of the set are calculated, and this transformation is then used to place the entire ligand in the active site.
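The distance criterion for atom-sphere pairs can be illustrated with a small brute-force sketch. This is not how DOCK implements its matching (its search is far more efficient); the function name `find_matches` and the parameters `n_pairs` and `tol` are hypothetical and chosen only for illustration.

```python
from itertools import combinations, permutations
from math import dist


def find_matches(spheres, atoms, n_pairs=3, tol=0.5):
    """Enumerate sets of sphere-atom pairs whose internal distances agree.

    spheres, atoms: lists of (x, y, z) tuples.
    Returns a list of matches; each match is a list of
    (sphere_index, atom_index) pairs in which every sphere-sphere distance
    is within `tol` of the corresponding atom-atom distance.
    Illustrative brute force only, not the DOCK algorithm.
    """
    matches = []
    for s_idx in combinations(range(len(spheres)), n_pairs):
        for a_idx in permutations(range(len(atoms)), n_pairs):
            compatible = all(
                abs(dist(spheres[s_idx[i]], spheres[s_idx[j]])
                    - dist(atoms[a_idx[i]], atoms[a_idx[j]])) <= tol
                for i, j in combinations(range(n_pairs), 2)
            )
            if compatible:
                matches.append(list(zip(s_idx, a_idx)))
    return matches


# Toy data: the "ligand" is the sphere triangle shifted by (1, 1, 1).
spheres = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
atoms = [(1.0, 1.0, 1.0), (4.0, 1.0, 1.0), (1.0, 5.0, 1.0)]
print(find_matches(spheres, atoms, n_pairs=3, tol=0.1))
# -> [[(0, 0), (1, 1), (2, 2)]]
```

Because the toy triangle has sides 3, 4 and 5, only one assignment of atoms to spheres keeps all three distances consistent, so a single match is returned.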
What do orientation, translation vector and rotation matrix mean here? What is the mathematical expression?
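One way to answer this, using the standard rigid-body formulation (an assumption here, not something taken from the DOCK source code): an orientation is a rotation matrix R together with a translation vector t, applied identically to every ligand atom. The symbols a_k (matched ligand atom positions), s_k (paired sphere centers) and N (number of pairs in the match) are introduced only for this sketch.

```latex
% Rigid-body transformation of ligand atom i
\mathbf{x}_i' = R\,\mathbf{x}_i + \mathbf{t},
\qquad R^{\mathsf{T}} R = I,\quad \det R = 1

% R and t are chosen to minimize the RMSD over the N pairs of a match
(R, \mathbf{t}) = \arg\min_{R,\,\mathbf{t}}
\sqrt{\frac{1}{N} \sum_{k=1}^{N}
\left\lVert R\,\mathbf{a}_k + \mathbf{t} - \mathbf{s}_k \right\rVert^{2}}

% For a fixed R, the optimal translation aligns the two centroids
\mathbf{t} = \bar{\mathbf{s}} - R\,\bar{\mathbf{a}},
\qquad
\bar{\mathbf{a}} = \frac{1}{N}\sum_{k=1}^{N}\mathbf{a}_k,\quad
\bar{\mathbf{s}} = \frac{1}{N}\sum_{k=1}^{N}\mathbf{s}_k
```

The rotation moves the centered ligand atoms onto the centered sphere centers as closely as possible, and the translation then carries the ligand centroid onto the sphere centroid. Once R and t are known from the few matched atoms, the same transformation is applied to all atoms of the ligand.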
(3) The orientation of the ligand is evaluated with a shape scoring function and/or a function that approximates the ligand-receptor binding energy. Most evaluations are performed on (scoring) grids, with the goal of minimizing the total calculation time. At each grid point, the contribution of the receptor (enzyme) to the score is stored. In other words, the receptor contributions, which would be repetitive and time-consuming to recompute, are calculated only once; afterwards, the appropriate terms are simply fetched from memory.
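The idea of computing the receptor contribution once and then looking it up can be sketched as follows. This is a minimal illustration, not the Grid program: the function names, the nearest-grid-point lookup, and the toy potential are all assumptions made for the example.

```python
from math import dist


def build_grid(receptor_atoms, origin, spacing, shape, potential):
    """Precompute the receptor contribution at every grid point (done once)."""
    nx, ny, nz = shape
    grid = {}
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                point = (origin[0] + i * spacing,
                         origin[1] + j * spacing,
                         origin[2] + k * spacing)
                grid[(i, j, k)] = sum(potential(dist(point, atom))
                                      for atom in receptor_atoms)
    return grid


def nearest_index(point, origin, spacing, shape):
    """Index of the grid point closest to `point`, clamped to the grid.

    Clamping is a simplification of this sketch; a real program would
    treat atoms outside the grid differently.
    """
    index = []
    for coord, start, n in zip(point, origin, shape):
        i = round((coord - start) / spacing)
        index.append(min(max(i, 0), n - 1))
    return tuple(index)


def score_pose(ligand_atoms, grid, origin, spacing, shape):
    """Score a pose by table lookup only; no receptor atoms are visited."""
    return sum(grid[nearest_index(atom, origin, spacing, shape)]
               for atom in ligand_atoms)


def toy_potential(r):
    """Hypothetical smooth potential, used only to keep the example finite."""
    return 1.0 / (1.0 + r)


origin, spacing, shape = (0.0, 0.0, 0.0), 1.0, (3, 3, 3)
grid = build_grid([(0.0, 0.0, 0.0)], origin, spacing, shape, toy_potential)
pose = [(0.1, 0.0, 0.0), (0.9, 0.2, 0.0)]
print(score_pose(pose, grid, origin, spacing, shape))  # -> 1.5
```

Building the grid costs one pass over grid points times receptor atoms, but every pose evaluated afterwards costs only one lookup per ligand atom, which is what makes scoring many orientations affordable.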
The binding energy is approximated as the sum of van der Waals attractive, van der Waals dispersive and Coulombic electrostatic terms. The ligand-receptor energy is obtained by combining the ligand atom terms with the receptor terms stored at the grid points.
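As a sketch of what such an energy can look like, the following assumes a 12-6 Lennard-Jones form for the van der Waals part and a Coulomb term with dielectric D; the exponents, the geometric-mean combining rule and the symbol names are assumptions for illustration, and the actual parameters depend on how the grid is set up.

```latex
% Pairwise force-field energy between ligand atoms i and receptor atoms j
% (332 converts to kcal/mol for charges in units of e and distances in Angstrom)
E = \sum_{i}^{\mathrm{lig}} \sum_{j}^{\mathrm{rec}}
\left(
\frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}}
+ 332\,\frac{q_i q_j}{D\, r_{ij}}
\right)

% With A_{ij} = \sqrt{A_{ii}}\sqrt{A_{jj}} and B_{ij} = \sqrt{B_{ii}}\sqrt{B_{jj}},
% the receptor sums separate from the ligand factors:
E = \sum_{i}^{\mathrm{lig}}
\left[
\sqrt{A_{ii}} \sum_{j}^{\mathrm{rec}} \frac{\sqrt{A_{jj}}}{r_{ij}^{12}}
- \sqrt{B_{ii}} \sum_{j}^{\mathrm{rec}} \frac{\sqrt{B_{jj}}}{r_{ij}^{6}}
+ 332\, q_i \sum_{j}^{\mathrm{rec}} \frac{q_j}{D\, r_{ij}}
\right]
```

In the second form, the three inner sums depend only on the receptor and on the position at which they are evaluated, so they can be stored at each grid point. Scoring a ligand atom then reduces to multiplying its own factors by the three stored values at its location.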