Optimization Technique In Molecular Biology

As far I remember, our honorable teacher Dr.Babul Hassan just came back after his PhD from New Zealand, we were students of 4th year on that time, facing some new courses, one was Non-Linear Programming and another one was Financial Mathematics under the direction of Kutub Uddin Sir. I don’t remember how others of my friends took that situation, but indeed I was afraid a lot !!! just because I was in my mind that, this is the last year and I have to use this year with full of concentration and I have to be laborious a lot !!! But, Dr. Babul Hasan Sir and also Kutub Sir, their teaching was so nice and easy to understand that made me free from certain disquiet after taking couple of lectures that gave Sir during class hours, one of his good strategy I had seen that, sir always described one method by two columns, one side was method and another was each step of an algorithm is applicable for a problem to solve. So, probably that’s why I still remember those methods and now I am using in my research work. Two of them are Steepest Descent (SD) and Conjugate Gradient (CG) Method. Today I am going to show how Optimization technique is useful in describing the behavior of Molecular Dynamics (MD).

I used these two Optimization Techniques that proceed to make energy minimization. But question is which energy? There are several energies, but in MD we eventually consider Potential Energy (PE) of each changing situation of dynamics due to attraction and repulsion by the atomic charge of molecules, and PE plays exactly same rule to measure energy during this kind of situation, because PE is an energy due to position, composition, or arrangement. Any object that is lifted from its resting position has stored energy therefore it is called potential energy because it has a potential to do work when released.

For this case then I read energy to get graphical representation of energy converging rate by using SD and CG. Here I want to note that I will discuss some answers for  some natural questions that will be easily arise while one reading this article, and those are going to be the same in the sense that they are the basic properties of optimization algorithms so I will write general answers and note for ones that generals answers don’t hold for each of molecule, first I am going to choose one molecule Glycine that is look like:

Glycine-3D-balls

Before begin answering question let’s recap or introduce SD and CG method briefly:

Steepest Descent(SD): An algorithm for finding the nearest local minimum of a function which presupposes that the gradient of the function can be computed. The method of steepest descent, also called the gradient descent method, starts at a point x_oand, as many times as needed, moves from x_ito x_{i+1}by minimizing along the line extending from x_iin the direction of -\nabla f(x_i),the local downhill gradient.

For one dimensional function f(x)case, SD will be: x_i = x_{i-1}-\epsilon f'(x_{i-1})

from a starting point x_o for some small  \epsilon>0until a fixed point is reached. Now question is when it will be converge, for us like those we are concerning numerical approximation, we know, it depends on adaptive step size, usually we take big step size for the faster convergence, just because if we wanna try finite difference technique, we know the more we refine mesh means increase step size, the more possibility to find solution near to the exact solution.

Conjugate Gradient (CG): The nonlinear conjugate gradient method is generally used to find also the local minimum of a nonlinear function using its gradient alone. It works when the function is approximately quadratic near the minimum, which is the case when the function is twice differentiable at the minimum.

[For whom are more interested to know about SD and CG and theirs convergence I recommend, please have a look on Bazaraa’s non-linear Optimization Book]

Now I am going to write some answers for some simple questions to show how Optimization is useful for researching Molecular Dynamics:

  • Starting from the same initial configuration, do the two algorithms give different results? In either case, why?

Answer: Algorithms will give same results, just conjugate gradient converges faster which is expected by the nature of conjugate gradient. Because they have a parameter (emtol) which they need to satisfy in order to stop execution algorithm, means?? yes, friends, each step ask yourself means??” it’s a strong word to understand. Okay, don’t worry, I am telling: emtol: [kJ mol-1 nm-1] is a simple a value to assign as a tolerance to minimize energy, the minimization is converged when the maximum force is smaller than this value. For shell MD (as for example take a molecule from a tissue) this value should be 1.0 at most, but since the variable is used for energy minimization, so it’s better to take much bigger to run energy as much as they want until converged, so the default value is suggested to take is 10.0. By using VMD (Visualize Molecular Dynamic Software) I calculate RMDS for optimizing structures of Glycine (Say: Gly16) and Distorted Glycine (Say: Gly_dist) aligned with CONFOUT (which is already minimized and RMDS is 0 using to test) and we get: 0.228049938 angstrom and 0.26675909348 angstrom respectively and we can see that the structures are almost the same with Gly_dist being very distorted so it has a higher value which is still negligible. I will visualize them by plotting the energies and show which convergent is faster.

  •  What do you expect now if, fixed the tolerance, increase or decrease the step size? And what about the convergence as a function of the threshold emtol?

Answer: If the tolerance is the same and you increase step size you expect more steps but in you cases it will not be that much for example: for emstep = 0.001 you will get 2089 steps and for emstep = 0.01 we get 2062 steps for the steepest descent convergence situation. Here you should pay attention not to decrease tolerance to much because steepest descent can get in never ending loop because it has a property to get stuck in the low energy regions and run forever there. So, what to do?? Simply use your mathematical brain here, your tolarance is fixed, initial step doesn’t matter, left is what?? Think numerical point of view…recap what I said just earlier “we know, it depends on adaptive step size”, so you have an option to play with number of steps (say: nsteps), because in numerical if you use nsteps = 1000, then you will not get convergent, because steepest descent will reach maximum number of steps before maximum force that you assign as a tolerance namely “emtol” which is 10. Also as the very distorted structures conjugate gradient can lose more time in the beginning. With increasing the tolerance we expect smaller number of steps, theoretically it has to be like that, so you may test and for emtol = 2, you will get 2388 steps comparable with 2062 steps when the emtol = 10 using steepest descent. As noted here you need to pay attention how small step size we use since our steepest descent algorithm can get into an in finite loop (See, Bazarra)

  • I said, play with number of step, now naturally we should think is the optimized structure (and the related energy) very sensitive to the input parameters? In either case, why? {Be a Mathematician 😛 }

Answer: In optimization methods conjugate gradient will be more eefficient close to the minimum energy structure but will perform worse when is far away from the the minimum that already we know as from our lesson. In this case, if our input data is a ected by a big noise so that initially CG says we will be far away from the local minimum and will have slower convergence. But, we note that conjugate gradient will be faster that steepest descent.

  • Starting from different initial points, do the two algorithms (steepest descent and conjugate gradient methods) produce the same converged structure (and energy)? In either case, why?

Answer: This was explained in fi rst question.  How ?? Just think simply if you take two different molecule their dynamical model will be different, so, initial point could different, depends on your face portraits of the dynamics. Also, I already said, when you will measure bond residue that is RMDS for optimized structures of Gly16 and Gly_dist you will get: 0.228049938 and 0.26675909348 respectively. So you can see that the structures are almost the same with Gly_dist being very distorted so it has a higher value which is still negligible.

Now very stupid type question LOL 🙂

  • Do the the two algorithms reach the same minimum? And what about a comparison with the other structures?

Answer: Yes algorithms reach same minimum. Unfavorable structures don’t reach same minimum as better structures which is obvious from presented plots without stating precise numbers. Now I am giving some calculated RMSD using VMD for each structure of glycine separately.

borgomul
Now I am showing some plots with following parameters:

                                                                 Steepest descent method         Conjugate gradient method
                                                                           emstep = 0.001                           emstep = 0.001
                                                                           emtol    = 10.0                             emtol    = 10.0
                                                                           nsteps  = 1000                            nsteps  = 1000

chow_dist-1000

Energy convergent with CG for Gly_dist

chow_gly16_1000nsepts

Energy convergent with SD for Gly16

Then you can see steps finished before energy start will be converged. So, you have to increase nsteps, so, I choose infi nite number of steps, then we will see the following converged structure for both two methods, where we can easily see that conjugate gradient (CG) method need much less steps than steepest decent (SD) method. So that, CG method converge more faster way than SD method:

1gly16-st

(1)

2gly16_cg

(2)  

Figure 1: Energy convergence for Gly16 : 1) with steepest decent and 2) with conjugate gradient.

3gly_dist-st

(1)

4gly_dist_cg

(2)

Figure 2: Energy convergence for Gly_dist : 1) with steepest decent and 2) with conjugate gradient.

9tyr51-st

(1)

10tyr51-cg

(2)

Figure 3: Energy convergence for tyr51 : 1) with steepest decent and 2) with conjugate gradient

“I am not sure I can able to promote the importance to learn Mathematics with a clear understanding concept or not. But if one single guy will be inspired to study Mathematics with a full of enthusiasm after reading this, that could be my happiness.”

———————————————————————————————————– S. M. Atiqur Rahman Chowdhury.

Permanent link to this article: https://www.borgomul.com/leatique/3123/


মন্তব্য করুন আপনার ফেসবুক প্রোফাইল ব্যবহার করে

মন্তব্য করুন

Discover more from বর্গমূল | Borgomul

Subscribe now to keep reading and get access to the full archive.

Continue reading