Welcome, everybody, to the last webinar of this spring, which is, to be exact, BioExcel webinar number 71. It's a pleasure for me to have here Sudarshan Behera from the Max Planck Institute for Multidisciplinary Sciences in Göttingen. He will speak about GROMACS and PMX for accurate estimation of free energy differences. I'm Alessandra Villa, and I'm hosting this webinar together with Otto Andersson from the Finnish IT Center for Science. The webinar is recorded; that is just for your information. During the webinar, you can use the Q&A function that you find at the bottom of the Zoom application; depending on which operating system you have, you will see one of these two symbols. Just type your question there so that we can see it. After the webinar, we will read out the question if you write "no microphone". If you don't write "no microphone", we will try to unmute you so that you can ask the speaker directly. After all the questions are done, we will activate the raise-hand function if there is time. For any other question about the webinar, you can always go to ask.bioexcel.eu and post your question. Now a few words about today's presenter. Sudarshan Behera is a postdoctoral researcher in the Computational Biomolecular Dynamics group at the Max Planck Institute for Multidisciplinary Sciences in Göttingen. He is involved in developing methods for identifying and tackling convergence issues in non-equilibrium alchemical free energy calculations, as well as in the development of the PMX software that he will tell us about today. Prior to that, he did his PhD in Bangalore, India, where he investigated the structure and dynamics of different types of enzymes using molecular dynamics simulations and enhanced sampling methods. So I'm curious to see what he will tell us about PMX. Now I will stop sharing and give you the chance to share your screen. Please go ahead.
Okay, so can you see my screen? Yes, perfect. Okay. Thank you, Alessandra, for your kind introduction. And welcome, everyone, once again to our BioExcel webinar on accurate estimation of free energy differences using open source software. I will be sharing how you can use GROMACS in combination with PMX to calculate free energy differences across various systems. I'm Sudarshan Behera, today's presenter. In today's webinar, I'll briefly introduce you to molecular dynamics and alchemical free energy calculations so that all of us are on the same page and it will be easier for us to discuss the rest of the slides. After that, I'll also talk a bit about the PMX software and web server, and applying them we will then look into three different test cases: free energy change due to protein mutation, absolute protein-ligand binding free energy, and relative protein-ligand binding free energy. At the end, we'll conclude the webinar and talk about what we are planning to develop in the future. Okay. So with that, let us start and go a bit deeper into the MD setup. How do we usually run an MD simulation? We try to model the system as close to experiment as possible by taking the crystal structure of the protein, taking the right salt concentration, taking the right ligand pose if a ligand pose is available, then dissolving it in water, and so on. Then we bring the system into the thermodynamic state of interest by using temperature and pressure coupling with a thermostat and a barostat. Then, by applying a force field, we evolve the system using Newton's equations of motion. From the resulting trajectory, which is nothing but the evolution of this particular system from the initial state, you can calculate various interesting properties. One of the most interesting properties that we can calculate using molecular dynamics is the free energy. And why is free energy such an essential property to us? Why is it so important to the community?
Let us discuss that a bit before moving further, and let us take the example of the ligand binding process. Free energy drives various physical and biochemical processes, and you can understand many biochemical processes, like ligand binding, using the free energy surface. The free energy of binding, delta G of binding, is nothing but the difference between the free energy of the bound state and that of the unbound state. If you have knowledge about the overall free energy surface, then you can talk about the mechanism of ligand binding. If you have just the information about delta G of binding, then you can talk about how strongly a ligand binds to a protein. And if you know that, you can tune the ligand or the protein so that you have a better or stronger binder. Similarly, for the protein folding problem, if you have information about the free energy change due to folding, or the entire free energy surface, then you can talk about how to change the protein in a way that makes its folded state more stable. So what all these phenomena basically tell us is that free energy is a very essential property, and if you have knowledge about it, then you can do various things, including rational drug design, better protein engineering and so on. Okay. So with that motivation for why we should study free energy, how do we calculate it using molecular dynamics simulations? There are various ways, out of which I'm listing here three major, widely known classes of methods. One of them is alchemical methods, such as free energy perturbation and thermodynamic integration, which we are going to talk about in future slides. You can also use biasing techniques such as umbrella sampling, metadynamics, etc., or solvation-based methods like MM-PBSA and MM-GBSA. Of all these methods, this webinar is focused entirely on the alchemical methods. So let us discuss alchemical methods in a bit more detail before going into results.
And let's talk about this particular interesting case where we want to study protein-ligand binding. Suppose you devise an experiment, or suppose you devise a computational protocol, to calculate the binding affinity or binding free energy of ligand A to this particular protein using molecular dynamics. What you can do is study the entire process of binding and unbinding. Say this is the free energy surface; then what you are interested in is sampling the entire free energy surface. As you know, sampling such an entire free energy surface is computationally extremely costly, because you have to spend a lot of time crossing this particular barrier rather than just sampling the end states. Suppose you start from the ligand-bound state; it is really difficult in a simulation to see ligand unbinding because of this large barrier. But what we are actually interested in is just the difference between the ligand-unbound and bound states. We are not interested in sampling the entire free energy surface. Now we can add to this complexity. Suppose we have another ligand, ligand B, and we want to know which of these two ligands, ligand A or ligand B, binds to the target with a stronger affinity. That means calculating the relative binding free energy, delta delta G. How we can do that computationally using a traditional method is to calculate delta G of binding for ligand A using a very expensive computational method, then do the same for ligand B, and then calculate delta delta G as just the difference between these two. But following this traditional method is very expensive, as I have mentioned, because you have to sample the entire free energy surface. Can we devise a way in which we can calculate the same property, delta delta G, at a much cheaper price? Yes, we can do that.
Suppose we somehow have the ability to change ligand A to ligand B in water itself through unphysical alchemical states. Similarly, suppose we have the ability to change ligand A to ligand B when it is bound to the protein, alchemically, in an unphysical way, and in a fast and cheap way. Then, since this is a closed thermodynamic cycle, delta G B minus delta G A is the same as delta G PL minus delta G L, where delta G PL is the free energy of changing A to B in the protein-ligand complex and delta G L is that of changing A to B in water. So you can calculate delta delta G as the difference between the two alchemical legs instead of the two binding legs. Okay, so now that we have understood the problem, we have a solution: we can calculate delta delta G using alchemical methods, going through unphysical states. But how does one do it in a practical application? There are many ways of doing that. One of the most popular ones is equilibrium FEP, the equilibrium free energy perturbation method, where both ligand states are defined and coupled with a coupling parameter known as lambda. When lambda is equal to 0, you are in the ligand A state; suppose ligand A is methane. When lambda is equal to 1, you are in the ligand B state; suppose that is methanol, so you want to go from methane to methanol. In equilibrium FEP, you change the coupling parameter in a discrete way. At various lambda values, say lambda equal to 0.1, 0.2, 0.3, up to 0.9, you run equilibrium MD in the range of nanoseconds to microseconds. Then you collect the delta G value for each and every lambda window, and the sum of them will give you the change in free energy due to the ligand mutation, that is, going from methane to methanol. One more way of doing the same is non-equilibrium thermodynamic integration, in which you don't divide the entire lambda space into various discrete lambda windows. Rather, you run equilibrium MD only at the lambda equal to 0 and lambda equal to 1 states, in, say, the nanosecond to microsecond range.
And then from the lambda equal to 0 state, you start many non-equilibrium transitions in just the picosecond range, say hundreds of picoseconds. You run, say, hundreds of such non-equilibrium transitions, going from lambda equal to 0 to lambda equal to 1 in a very fast, non-equilibrium way. Those transitions are called forward transitions because you are going from 0 to 1. Similarly, you can start from the lambda equal to 1 state, go to 0, and run many non-equilibrium transitions. Let us call those reverse transitions. So you have, say, 100 forward transitions and 100 reverse transitions. From these non-equilibrium forward and reverse transitions, you can collect the work values associated with them, that is, the work done on the system. Then you plot the distributions of those work values. This is what the work distributions look like in a typical schematic representation: this is the distribution for the forward transitions, and this is the distribution for the reverse transitions. What the Crooks fluctuation theorem states is that, if you have such distributions, the point of intersection between the two is nothing but the change in free energy for that transformation. So that's how you can calculate the change in free energy, and from it the delta delta G, using non-equilibrium thermodynamic integration as well. Throughout this webinar, all the test cases that I'm going to present are based on non-equilibrium thermodynamic integration. Then the question that comes up is, why non-equilibrium TI? As you might have already noticed, in non-equilibrium TI the equilibrium sampling is focused only on the real physical end states, unlike in equilibrium FEP, where we spend a lot of time sampling unphysical intermediate states.
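The intersection idea described above can be sketched in a few lines of Python. This is a minimal illustration and not the PMX implementation: it assumes both work distributions are roughly Gaussian, and it assumes the reverse work values are recorded as work done on the system when going from lambda 1 to 0, so they are negated before the comparison. The function name `crooks_gaussian_intersection` and the work values are made up for illustration.

```python
import math
import statistics


def crooks_gaussian_intersection(work_forward, work_reverse):
    """Estimate delta G as the crossing point of Gaussian fits to the
    forward work distribution P_f(W) and the negated reverse work
    distribution P_r(-W). Units follow the input (for example kJ/mol)."""
    neg_reverse = [-w for w in work_reverse]  # assumed sign convention
    mf, sf = statistics.mean(work_forward), statistics.stdev(work_forward)
    mr, sr = statistics.mean(neg_reverse), statistics.stdev(neg_reverse)

    # Equal widths: the two Gaussians cross exactly halfway between the means.
    if math.isclose(sf, sr, rel_tol=1e-9):
        return 0.5 * (mf + mr)

    # Otherwise equate the two log-densities and solve a*x^2 + b*x + c = 0.
    a = 1.0 / sf**2 - 1.0 / sr**2
    b = -2.0 * (mf / sf**2 - mr / sr**2)
    c = mf**2 / sf**2 - mr**2 / sr**2 + 2.0 * math.log(sf / sr)
    disc = math.sqrt(max(b * b - 4.0 * a * c, 0.0))
    roots = [(-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)]

    # Keep the crossing closest to the midpoint of the two means.
    midpoint = 0.5 * (mf + mr)
    return min(roots, key=lambda x: abs(x - midpoint))


# Hypothetical work values in kJ/mol, for illustration only.
forward = [11.0, 12.0, 13.0]   # transitions lambda 0 -> 1
reverse = [-7.0, -8.0, -9.0]   # transitions lambda 1 -> 0
print(crooks_gaussian_intersection(forward, reverse))
```

When the forward and reverse distributions barely overlap, the fitted crossing point is poorly determined, so an estimate like this one should not be trusted without first looking at the overlap.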
Also, suppose you have the true end states available from experiments. Take the example of the binding free energy of a ligand to a protein: the two end states are the holo state and the apo state of the protein. If both states are available experimentally, that is, if their crystal structures are available, then you can incorporate both of them in our non-equilibrium method, because you are running equilibrium MD only at the lambda end states. That is something you cannot easily do in equilibrium FEP. And we'll see in future slides that doing so leads to much improved accuracy compared to not taking both true end states. Similarly, since you have hundreds of short non-equilibrium transitions, say hundreds of 100-picosecond transitions, you can parallelize them in a better way compared to having just a few intermediate lambda windows. There are reports showing that non-equilibrium TI converges to the free energy estimate in much less computational time than equilibrium FEP. But one also has to be careful while doing non-equilibrium TI because of the work overlap. In many cases, when you have a really large perturbation, or when you are perturbing, say, the entire ligand, what happens is that the forward work and the reverse work distributions don't overlap. In such a scenario, the delta delta G value that you calculate might not be correct. So we have to be careful when doing non-equilibrium TI, especially by looking at the work distribution overlap. For all these simulations, the equilibrium MD and the non-equilibrium transitions, you can use the open source software GROMACS, which is also part of BioExcel. Great. So if all the simulations can be done using GROMACS, then what is the problem? Where does the problem come into the picture? The problem is the system setup. It is not easy to set up the system, especially creating the structure and the topology for the alchemical transformation.
Say we take the example of a ligand perturbation, where we are calculating the delta delta G of changing a ligand with chlorine in the ortho position to one with chlorine in the meta position. In the PDB file, or structure file, what we need is information about both chlorine positions. Also, in the topology file, what we need is information about both sets of ligand parameters: ligand A parameters, where chlorine is in the ortho position, and ligand B parameters, where chlorine is in the meta position. How cumbersome it is to create and manipulate these PDB and topology files manually, or by writing a script yourself, I think most of you know if you have ever tweaked a topology or PDB file for GROMACS. Then add the complexity of doing the same for a large number of ligands, or a large number of protein mutations, or a large number of nucleic acid mutations, and think about how error-prone such a calculation would be if you did it manually. To resolve that particular problem, that's where our tool PMX comes into the picture, which was developed in the de Groot lab. What it does is create the hybrid structure and topology files in an automated manner, without much intervention from the user and without the errors of manual editing. You can use this tool to study various interesting properties. What various reports from Bert de Groot's group have shown is that you can use PMX and GROMACS for protein-ligand binding free energy, protein-nucleic acid binding free energy, thermostability change due to mutation, and protein-protein binding free energy. We are going to talk about a few of them in the future slides. Also, from the same group, there is a PMX web server, which was designed just for generating hybrid structures and topologies for amino acid mutations for free energy calculations. So I would suggest you have a look at this web server if you are interested in calculating free energy changes due to protein mutation. Great.
So with that introduction to MD, alchemical methods, and the PMX software and web server, let us look into the application of PMX and GROMACS and go through various test cases. For the first test case, we are going to look into protein mutations. Here I'm going to present a paper from the de Groot lab, published in 2016, where they did large-scale free energy calculations for protein mutations. So let us try to understand the problem: what they are trying to do and why. Let us again devise an experiment, or say a computational method, to calculate the delta G of folding, that is, the free energy difference between the unfolded state and the folded state, for the wild type. We also want to know how much this delta G of folding changes when you mutate a particular residue of the protein. How would you calculate that? Say we start from the unfolded state of the wild-type protein, do a folding experiment, and calculate delta G of folding for the wild type. Then you do the same for the mutant and take the difference, which is the delta delta G of folding due to the mutation. That tells you how much the thermostability, or folding free energy, of the protein changes due to the mutation. But as you might have already noticed, doing such a folding simulation for a relatively large peptide is almost impossible. The same goes for the experiment; it is really expensive. So can you again devise an alchemical protocol to calculate this delta delta G of folding due to mutation at a cheaper rate? Yes, you can. What you can do is make the mutation in the unfolded state itself and calculate the delta G of mutation in the unfolded state. Similarly, you can calculate the delta G of mutation in the folded state and then take the difference, which is the same as the delta delta G of folding due to the mutation.
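The cycle argument above amounts to one subtraction. The sketch below, with hypothetical function names and made-up free energies, checks that the alchemical route (mutating in the folded and in the unfolded state) gives the same delta delta G as the physical route (folding the wild type and the mutant), as it must for a closed cycle.

```python
def ddg_folding_alchemical(dg_mut_folded, dg_mut_unfolded):
    """delta delta G of folding from the two alchemical legs:
    wild type -> mutant in the folded state minus the same in the unfolded state."""
    return dg_mut_folded - dg_mut_unfolded


def ddg_folding_physical(dg_fold_mutant, dg_fold_wildtype):
    """delta delta G of folding from the two (expensive) folding legs."""
    return dg_fold_mutant - dg_fold_wildtype


# Made-up free energies of the four corner states in kJ/mol, illustration only.
g = {"unfolded_wt": 0.0, "folded_wt": -30.0,
     "unfolded_mut": 12.0, "folded_mut": -14.0}

physical = ddg_folding_physical(
    g["folded_mut"] - g["unfolded_mut"],   # folding of the mutant
    g["folded_wt"] - g["unfolded_wt"],     # folding of the wild type
)
alchemical = ddg_folding_alchemical(
    g["folded_mut"] - g["folded_wt"],      # mutation in the folded state
    g["unfolded_mut"] - g["unfolded_wt"],  # mutation in the unfolded state
)
print(physical, alchemical)  # both routes give the same number
```

With this sign convention, a positive value means the mutation makes the folded state less stable relative to the wild type.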
And this alchemical transformation and calculation you can again do using GROMACS and PMX. Great. So now we have understood what the authors are planning to do in this particular work. Then the question is why we want to do it, why it is essential for us to understand; I think most of you know. Suppose you have the power to know which mutation has how much impact on the delta G of folding, which mutation will give you a more stable folded state, or which mutation will give you a folded state that is more thermodynamically stable. Then you can engineer the protein to become more thermostable, so that you can use it in really harsh conditions in an industrial setting, way above 100 degrees centigrade. And if you have the ability to go beyond 100 degrees centigrade, or to a higher temperature in general, you can increase the catalytic activity of the enzyme, because, as you know from the Arrhenius equation, enzyme activity increases exponentially with temperature. Similarly, if you have an idea of how much the folding free energy changes due to a protein mutation, or how protein-protein binding changes due to a protein mutation, you can engineer better protein-protein interfaces, which are good drug targets as well. Okay. So now we also have the motivation to study such processes. Let us look into the significant results of this particular paper. The authors took various proteins in this work. One of them is barnase, where they made 119 mutations at 55 positions, and for all these mutations the experimental delta delta G value is available. As you can see here, the mutated positions are marked as red residues.
So you can see how well the calculated delta delta G correlates with the experimental delta delta G. Most of the points, as you can see, fall on a nice straight line, and the overall error, the average unsigned error, which is nothing but how much the calculation deviates from experiment, is well within 1 kcal per mole, just 3.8 kJ per mole. More than 60% of the data points fall within 1 kcal per mole, which I think is a really remarkable achievement, given that we are using a cheaper alchemical method to get this delta delta G. Great. So these are the overall results, but let us dig slightly deeper and look at some more results for the same barnase system. In this work, the authors also used six different force fields to calculate the same quantity and compare across force fields. What they found is that CHARMM performs better than the other force fields, but overall all the force fields perform relatively similarly to each other, because the overall error is between 4 and 5 kJ per mole, around 4.5 kJ per mole, for the various force fields on average. What is interesting to see is that the consensus approach, by which I mean combining the results from different force fields and taking the average, performs better than any individual force field. So if you have CHARMM results and Amber results, and you combine them, the average result is better than the individual CHARMM or Amber results, which is really fascinating to know. So what the authors basically suggest here is that, if you are doing protein mutation free energy calculations, do the same using multiple force fields and take the average result, because that will give you better agreement with experiment.
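As a small illustration of the consensus idea, the sketch below averages per-mutation predictions from two force fields and compares average unsigned errors. All numbers and names here are made up for illustration and are not taken from the paper. Averaging can never be worse than the mean of the individual errors, but it is only guaranteed to beat the best single force field when the individual errors partly cancel, as they do in this toy data.

```python
def average_unsigned_error(calculated, experimental):
    """Mean absolute deviation of calculated values from experiment."""
    return sum(abs(c - e) for c, e in zip(calculated, experimental)) / len(experimental)


def consensus(*predictions):
    """Per-mutation average over several force-field predictions."""
    return [sum(values) / len(values) for values in zip(*predictions)]


# Made-up delta delta G values in kJ/mol, for illustration only.
experiment = [1.0, -2.0, 3.0, 0.5]
force_field_a = [2.0, -1.0, 4.5, 0.0]
force_field_b = [0.5, -3.5, 2.0, 1.5]
combined = consensus(force_field_a, force_field_b)

for name, values in [("A", force_field_a), ("B", force_field_b), ("consensus", combined)]:
    print(name, average_unsigned_error(values, experiment))
```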
Also, what the authors found is that when you have charge-conserving mutations, say going from alanine to phenylalanine in the protein, the error is much lower compared to charge-changing mutations. When you have a charge-changing mutation, say going from alanine to aspartate, you have errors in the range of 4 to 7 kJ per mole, but if you have a charge-conserving mutation, the error is within 4.5 kJ per mole. This also makes sense, because charge-changing mutations are much larger perturbations, since you are changing the charge. So what it says overall is that, whenever you are trying to predict the free energy change due to a charge-changing mutation, you should not expect the same accuracy as for charge-conserving mutations. Okay, so these were the results for the barnase system, but the authors also went ahead and applied the method to various other protein systems. One of them is SNase, where they did 24 mutations, again because experimental data is available for all of these mutations. As you can see, the results are very similar to what we saw in the previous slides for barnase, with the consensus approach performing better than the single force field approach. Other than comparing the calculated delta delta G with experiment, they also compared, for various protein systems, the calculated delta delta G with the experimentally available delta Tm. What is delta Tm? It is the change in melting temperature due to mutation, which one would expect to correlate well with delta delta G, the free energy change due to mutation. And as you can see, for various protein systems the calculated delta delta G correlates really nicely with delta Tm, with a correlation coefficient of 0.86.
And what it suggests is that this particular approach of using non-equilibrium alchemical methods with GROMACS and PMX predicts the delta delta G accurately for various protein systems. Okay, so in summary, the non-equilibrium alchemical methods give results close to experiment, but at a much cheaper price. The authors also show that the consensus approach, the combination of various force fields, performs better than a single force field approach. Charge-conserving mutations are easier to predict than charge-changing mutations. And the calculated delta delta G is shown to correlate well with delta Tm, the change in melting temperature of the protein due to mutation. Great, so that was the overall discussion of test case one, where we looked into the change in protein free energy due to a mutation. Now let us go to the second test case, where we calculate the absolute protein-ligand binding free energy. When I say absolute protein-ligand binding free energy, what I mean is simply the binding free energy of a ligand to a protein, nothing else. Okay. Here, again, I'm going to present a work from the de Groot lab, which was published in 2021 in Chemical Science; you can go and have a look at this paper. We have already looked at the thermodynamic cycle, but let us discuss it again a bit. We have seen how difficult it is to calculate this with a traditional computational approach. Instead, you can devise an alchemical approach where you decouple the ligand in water as well as in the complex and calculate the delta G of binding as delta G PL minus delta G L. But why would one prefer to do such a calculation? Where are its applications? There are lots of applications, as we discussed initially, especially in understanding how a ligand binds to a protein. Other than that, the most interesting and fascinating application is in drug discovery.
Say we are talking about a simple drug discovery pipeline, where we first identify a target and validate it. When I say target, I mean, say, a protein. Then we identify a drug, or a ligand, which binds to the target very strongly; that is known as lead identification. Then we modify the functional groups of the ligand and optimize it so that we get a better binder; that is lead optimization. Then come various tests and finally approval. The absolute binding free energy comes into play in the third stage of the drug discovery pipeline, where we want to screen through millions or billions of ligands and find out which one is the best binder, or which few ligands bind to the protein with a stronger affinity than the others. In a traditional setting, a docking score is usually used, which is much less accurate. But if you have a highly accurate absolute binding free energy approach, and I'm going to show in these slides how accurate the results are, then you can incorporate it in lead identification and get much more accurate results. Great. So now we have a very good motivation to do absolute binding free energy calculations. Let us look into the overall performance of the absolute binding free energy protocol that we have devised using non-equilibrium alchemy. Here the authors took seven different targets, seven different proteins, like CDK2, c-Met, galectin and others, and 128 ligand systems. This is the overall plot of calculated delta G versus experimental delta G. As you can see, it is again really accurate. Most of the points fall within 1 kcal per mole, and the overall error, the average unsigned error, is around 5 kJ per mole. If you look into individual cases like JNK1 or p38 alpha, you can see that the error is much lower, around 3 kJ per mole. Okay.
But there are also cases like TYK2, where the error is much larger, around 11 kJ per mole. Why is the error so high for TYK2 in particular, around 11 kJ per mole, whereas everywhere else it is within about 5 kJ per mole? The authors speculate that it is because of the poor representation of the apo state: a crystal structure of the apo state of TYK2 is not available from experiment in the literature, whereas it is available for the other six target systems. We'll see in the next slide how the real apo state versus the modeled apo state affects the overall accuracy of our delta G calculations. So this is the plot again. Here we have the apo state from X-ray available for six different targets, all except TYK2. In the left plot, the apo state is modeled just by removing the ligand from the X-ray holo state. So that is not the real apo state, just a modeled apo state. There you get an error of around 7 kJ per mole, which is relatively high. But if you take both the true apo state and the true holo state into account and do the same calculations, what you get is an error within 5 kJ per mole, just 4.4 kJ per mole. This significant improvement shows the power of taking the apo state into account in our simulations, which non-equilibrium alchemy using GROMACS and PMX allows. You cannot do the same as easily in equilibrium FEP; of course you can do it, but it will come at a much higher computational expense than with non-equilibrium alchemy. Again, if we look into individual cases, there is a significant improvement, especially in the case of p38 alpha, where, if you just take a modeled apo state, you have an error of around 11 kJ per mole, but if you take the true apo state, the error is around 3 kJ per mole. So there is an improvement of around 8 kJ per mole just by taking the true apo state into account, if it is available from experiment. And yes, for p38 alpha it is available.
Then the authors thought maybe there is something more interesting going on specifically for p38 alpha, because such an improvement is not seen in the other cases. So they dug much deeper into this particular case, p38 alpha, and what they found is that it is a single rotameric flip of one residue, threonine 106, that leads to such an improvement in accuracy. Let us look at that result. This is the holo structure, and in the simulation the T106 residue is in one particular rotameric state. If you just remove the ligand from here and model the apo state, the error you get is around 11 kJ per mole, which I have already shown earlier. But if you take the real apo state from the available crystal structure and do the simulation, threonine 106 is mostly in a different rotameric state than in the holo structure. That leads to a great improvement, and the error is within 3 kJ per mole. I don't know whether you are able to see it or not; I hope it is showing. Okay, if you are not able to see the numbers, the error is 3 kJ per mole. So you see the improvement from 11 to 3 kJ per mole just by taking the true apo state into account, where the difference is only a rotameric state. Of course, there could be many other differences that we don't know about. So how would you confirm that this particular rotameric flip is the only contributing factor for such an improvement in accuracy? For that, what the authors did is take the holo state and change nothing in constructing, or modeling, the apo state, except the rotameric state of this particular residue. And why is it not moving? Sorry, there is some problem with the slides. Okay, let me share again. So we were asking how you can say that it is only the rotameric flip that contributes to such an improvement in accuracy.
So for that, what the authors did is take the holo structure, remove the ligand, and construct the apo state structure just like in the first case. This time they also flipped the rotamer of T106; they changed nothing other than the rotameric state of T106. What they found, again in case you are not able to see the numbers, is an error of around 3 kJ per mole, which basically tells us that it is this particular rotamer that contributes to such an improvement in accuracy, or conversely that it is a sampling problem if you are not taking the true apo state into account. So in summary, non-equilibrium alchemy can again give you very accurate absolute binding free energies, very close to experimental measurements but at a much cheaper price. Also, if you have experimental structures of the true holo and apo states available, it is advised that you take both into account in your simulations so that you get much improved accuracy, as we have seen for six different target systems. We also saw that a single rotameric flip can have a significant impact on the accuracy, so it's better to sample all possible rotameric states of residues near the active site. Right, so that was the second test case. Let us move to our third and last test case, where we look into relative protein-ligand binding free energy. When I say relative, what I mean is that we have two different ligands, say methane and methanol, and we want to know which one binds more strongly, that is, how much the free energy of binding changes in going from one ligand to the other, from methane to methanol. That is what I mean by relative binding free energy. Great. Here again I'm presenting a paper from the same lab, the de Groot lab, which was published in 2020 in Chemical Science. Here also we have the same thermodynamic cycle that we have discussed many times in these slides, so I'm not going into detail again.
You can just construct one cycle closure, in which the difference between the horizontal legs equals the difference between the vertical legs, so delta delta G can be calculated as delta G PL minus delta G L. And what are the applications of relative binding free energy? There are many, but one of the most important again comes in the drug discovery pipeline. We have seen how an accurate way of calculating absolute binding free energy leads to accurate identification of leads. Similarly, relative binding free energy contributes to lead optimization, where you already have a ligand and you want to know what kind of modification to the ligand leads to a better binder, to a stronger binding affinity. In that case you can use relative binding free energy, so in the drug discovery pipeline you can see that both absolute and relative binding free energies contribute. With that motivation, let us look at the results. Here the authors went for a slightly larger number of systems. They looked into 13 different targets, 13 different proteins, and 482 ligand A to ligand B conversions. You can also call them edges, so they looked at 482 edges. They calculated the delta delta G, the relative binding free energy, using the commercial software FEP+ and also using GROMACS and PMX. When they used GROMACS and PMX, they used the GAFF as well as the CGenFF force fields for the ligands. While using FEP+, they used OPLS3, which is the gold standard for FEP+. And as you can see, if you look slightly closer at the results, GAFF performs slightly better than CGenFF, with an error of 3.9 kJ per mole. But the combination of these two results, the consensus approach, where you take the results and average them, is better than the individual force fields: it is around 3.6 kJ per mole.
And with the consensus approach, you see that the error you get is very close to what you can get from a commercial software. So what the authors are basically saying here is that with open source software like GROMACS and PMX, you can use various force fields, and the consensus approach will give you results as accurate as a commercial software like FEP+. And remember, the consensus approach again performs better here than a single force field approach. Okay, on the individual protein targets you can see that the performance is similar to the overall case: the consensus approach, the blue squares, performs very close to the commercial FEP+ approach, the dark maroon squares. You can also notice that with GROMACS and PMX using non-equilibrium alchemy we spend less computational time compared to FEP+. Great. Then the question is why the consensus approach performs better than a single force field approach. For that, we look into one particular target, c-Met, and its dataset of 25 inhibitors. What you can see here is that the results from GAFF and CGenFF deviate from experiment in opposite directions, especially in the cases marked with an X. Say we look at this particular case. The yellow bar is from experiment; the blue bar is more positive than the yellow bar, and the CGenFF results are more negative than the yellow bar, and hence if you take an average it will be closer to experiment. This happens in 14 out of 25 cases for the c-Met inhibitor dataset, and hence overall there is an improvement in accuracy if you combine results from various force fields.
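The consensus idea described above, averaging the per-edge predictions of two force fields so that errors of opposite sign partly cancel, can be sketched in a few lines of Python. This is only an illustration: the function names and all numbers below are hypothetical and are not taken from the paper or from PMX.

```python
from statistics import mean


def consensus(ddg_a, ddg_b):
    """Average two lists of per-edge ddG predictions (same edge order, same units)."""
    if len(ddg_a) != len(ddg_b):
        raise ValueError("both force fields must cover the same edges")
    return [(a + b) / 2.0 for a, b in zip(ddg_a, ddg_b)]


def average_unsigned_error(predicted, experiment):
    """Mean absolute deviation between predicted and experimental ddG values."""
    return mean(abs(p - e) for p, e in zip(predicted, experiment))


# Hypothetical ddG values for two edges, in kJ/mol (illustrative only).
experiment = [0.2, -1.0]
gaff = [1.0, -2.0]     # here both values deviate from experiment in one direction...
cgenff = [-1.0, 0.0]   # ...and these deviate in the opposite direction

if __name__ == "__main__":
    combined = consensus(gaff, cgenff)
    for name, values in (("GAFF", gaff), ("CGenFF", cgenff), ("consensus", combined)):
        print(name, round(average_unsigned_error(values, experiment), 3))
```

In this made-up example the two force fields err on opposite sides of experiment for both edges, so the averaged prediction has a smaller error than either one. When both force fields err in the same direction, averaging gives an error between the two individual ones.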
And with that, let me summarize the results from the third test case. What we saw here is that non-equilibrium alchemical methods using GROMACS and PMX perform on par with commercial FEP+ on various protein-ligand systems. The overall error with respect to experiment is less than 1 kcal per mole, which is really remarkable. We then looked into why the consensus approach performs better: in some cases the GAFF and CGenFF results deviate from experiment in opposite directions, and hence if you take an average it ends up close to experiment. So that is where we conclude our third test case as well. Now let me provide an overall conclusion and summarize in one slide everything we looked into. First of all, what I'm trying to convince you of is that you can use open source molecular dynamics software like GROMACS, together with the open source topology and structure building software PMX developed by the de Groot lab, and calculate delta G or delta delta G: the free energy change due to mutation, ligand binding, nucleic acid binding, and various other things. The results you get are very close to experiment and as accurate as what is possible with commercial software. We saw the evidence in three different test cases in this webinar: the protein mutation case, then the absolute binding free energy and relative binding free energy cases. But there are many other test cases in the literature using PMX and GROMACS. You can find them on Professor Bert de Groot's Google Scholar, where they have shown the same approach working fine for protein-protein binding and for protein-nucleic acid binding free energy calculations. And you can find tutorials for all of these on the GitHub page of Professor Bert de Groot: you go to GitHub, then to deGrootLab, then to pmx, and you can find many tutorials.
Also, there is a website hosted by the Max Planck Institute, pmx.mpibpc.mpg.de, and there you can find many tutorials on how to do protein mutation free energy calculations as well as relative binding free energy calculations. Nice. So with that, let me turn to what we are working on right now for the future development of PMX. One thing is post-translational modification. What I mean by that is that we are trying to devise a protocol where you can calculate the free energy change due to methylation, acetylation, or many other post-translational modifications of a protein. We are also working on detecting and resolving convergence problems. As I mentioned initially, if you don't have an overlap of the work distributions, in some cases it might lead to poor predictions. How would you detect such issues, and how do you resolve them? Those are the two things we are working on now. And if you want to contribute, or if you want to suggest what other things we can work on and how to prioritize our development of PMX, I would suggest you go to the PMX user survey, which will not take you more than two minutes, just a few clicks. It is on the BioExcel website: you just go to BioExcel and look for the PMX user survey. Even if you have never used PMX, you have listened to this webinar and know what PMX does, so you can still suggest how we can improve or what we can work on in the future development of PMX. So I would request you to please go to BioExcel and do this user survey. With that I would like to thank the PMX developers, Vytas, Daniel, Matteo, Servaas, Yuriy, and Professor Bert de Groot, as well as various funding agencies like BioExcel, which PMX is a part of, the Max Planck Institute, Janssen, and Boehringer for all the support.
And thank you all for your kind attention. I would be happy to take any questions, suggestions, or comments. Thank you.
Thank you very much. There was a misunderstanding, so I'm sorry. I know that Bert was answering the question in the chat, but that is not how it was supposed to work. It is difficult for me to see the order of the people now, but I will try to unmute the person who asked. Then we will answer the question, because the idea, as I explained before, is not that the chat is used to answer. So we will unmute Josie so he can ask, if he's still online. Josie, you are allowed to speak now, if you want to ask your question directly, even if it was already answered, because not everybody sees the answer. Okay, if you cannot speak, the question is on slide 20: in some cases the results show lower accuracy with the apo state. Why might that happen?
So the question is why, in some cases, we see lower accuracy when explicitly taking the apo state into account. I guess that refers to the galectin case. What I think might have happened is this. First of all, if you look, the values are well within the error bars, so they do not differ much. But the reason this slight reduction in accuracy might have happened is that when you start from the true apo state, you run equilibrium MD before doing the non-equilibrium simulations. If there is some issue with the force field, it might slowly accumulate along the trajectory, which in some cases might lead to a slight change in the conformation of the apo state, so that what you have at the end of the equilibrium simulation may not be the exact apo state. That might have happened, but overall, as you can see, the accuracy is comparable.
Okay, thank you.
We have another question, from Volodymyr. I have allowed you to speak if you want. He doesn't react, so the question is: I have a question about restraints. The ligand is usually restrained in some way when computing ABFEs in order to avoid it floating freely in the decoupled state. Which approach do you use for restraining? Is it similar for equilibrium and non-equilibrium free energy calculations?
Yes, so we use the same very standard and well-known way of restraining the ligand that is used in equilibrium calculations. In both cases, equilibrium as well as non-equilibrium alchemy, we currently use the same approach to restrain the ligand, known as the Boresch-style restraint, developed by Boresch and Karplus, where you restrain six degrees of freedom. You take three atoms from the protein and three atoms from the ligand, define one distance, two angles, and three dihedrals, and restrain these six degrees of freedom. If you restrain these six degrees of freedom, there is an analytical solution for how much the restraint contributes to the free energy value, and you can subtract that contribution at the end. So yes, both the equilibrium and non-equilibrium methods use the same Boresch-style restraint.
Okay, thank you. Now we have a question from another participant. I have allowed him to talk if he wants. No, he is not reacting, so the question is about slide 19: I understand why the comparison with experiment is poor for TYK2, but why does PDE2 also have a large error compared with the others?
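The analytical contribution of the six Boresch-style restraints mentioned in the answer on ligand restraints can be sketched as follows. This is a minimal Python version of the commonly quoted closed-form expression for harmonic restraints on one distance, two angles, and three dihedrals, including the standard-state volume. The function name, the unit choices (kJ/mol, nm, radians), and the example force constants are assumptions made for illustration, and whether the value is added or subtracted depends on how the thermodynamic cycle is drawn.

```python
import math

K_B = 0.0083144626  # Boltzmann constant times Avogadro's number, kJ/(mol K)
V0 = 1.6605         # standard-state volume per molecule at 1 mol/L, nm^3


def boresch_restraint_free_energy(r0, theta_a, theta_b,
                                  k_r, k_theta_a, k_theta_b,
                                  k_phi_a, k_phi_b, k_phi_c,
                                  temperature=298.15):
    """Analytical free energy (kJ/mol) for six harmonic Boresch-style restraints.

    r0: reference distance in nm; theta_a, theta_b: reference angles in radians.
    k_r in kJ/(mol nm^2); angle and dihedral force constants in kJ/(mol rad^2).
    The reference dihedral values do not enter the expression.
    """
    kt = K_B * temperature
    force_term = math.sqrt(k_r * k_theta_a * k_theta_b * k_phi_a * k_phi_b * k_phi_c)
    numerator = 8.0 * math.pi ** 2 * V0 * force_term
    denominator = (r0 ** 2 * math.sin(theta_a) * math.sin(theta_b)
                   * (2.0 * math.pi * kt) ** 3)
    return -kt * math.log(numerator / denominator)


if __name__ == "__main__":
    # Hypothetical restraint: 0.5 nm distance, 90 degree angles,
    # 4184 kJ/(mol nm^2) and 41.84 kJ/(mol rad^2) force constants.
    dg = boresch_restraint_free_energy(0.5, math.pi / 2, math.pi / 2,
                                       4184.0, 41.84, 41.84, 41.84, 41.84, 41.84)
    print(f"{dg:.2f} kJ/mol")
```

The expression only needs the reference distance, the two reference angles, and the six force constants, which is why the correction can be evaluated once and applied at the end of the calculation.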
Okay, so again, figuring out exactly what contributes to the error in free energy calculations is not a trivial task. There are many things that contribute: sampling, meaning how long you have run the simulations, and how accurate your force fields are. Then, when you do non-equilibrium transitions for absolute binding free energies, you have a large perturbation, because you are decoupling a ligand with 30 to 40 atoms. With such a large perturbation you usually have poor convergence: the work distributions of the forward and reverse transitions do not overlap properly. So it could be because of all these factors. I cannot pinpoint one particular reason for the PDE2 error, but it might be a combination of all these things, like poor convergence and poor sampling in the equilibrium run.
Okay, thank you. Now we have another question. I don't know if he wants to speak. No, I think he doesn't. Oh yeah, go ahead, please.
Yeah, hi, hello. I wanted to ask: have you ever given thought to using replica exchange with solute tempering along with alchemical transformations, to implementing this approach in PMX? I have seen many papers where they use replica exchange along with alchemical transformations to enhance the sampling and get a better result. In the case of charge transformations in ligands, I'm having particular problems with convergence. The calculations are very difficult to converge.
Yeah. So, when you talk about incorporating replica exchange into the transformation itself, if you mean the non-equilibrium transitions, then we still do not have an idea of how to do it, because the non-equilibrium transitions that we do are on the order of 80 picoseconds or so.
Having replica exchange within 80 picoseconds is, you know, tough. And if you want to increase the transition time and do replica exchange, it becomes really expensive. But what one can do is use replica exchange in the equilibrium part: generate the equilibrium trajectories using replica exchange, and then start the 80 picosecond transitions from those replica exchange equilibrium trajectories. That might lead to an improvement. Of course we have thought about it, but we haven't worked in that direction. At least the preliminary results on some systems, from people in our lab, showed that the accuracy doesn't improve much when you use replica exchange in the equilibrium MD, but we haven't done a systematic study yet. It is still on our mind. As you mentioned, there are various reports where people do alchemical transformations with replica exchange, but those are done in equilibrium: you have various lambda windows and you run tens of nanoseconds in each lambda window, so you have enough time to do replica exchange across the lambda values. Here our transitions are just 80 picoseconds, so there is no scope for incorporating replica exchange in the non-equilibrium transition itself. But I would suggest that for your case you give replica exchange a try for generating the equilibrium ensemble, and see if it improves the accuracy. Or you can also run the transitions longer, which might also lead to improved accuracy.
Or, if you want to work more deeply on the problem, you can find out where the convergence problem comes from. Is it the large perturbation when you change the charge? Is there some problem around the mutation site, say the hydration is not proper? Or is there a certain conformational change that leads to the convergence issue? Then you can try to solve that problem. These investigations can be very involved, but they are things you can try, I guess.
Thank you. Now I take the last question. You can speak if you want. No, he doesn't react. He has multiple questions. He wants to know about the effect of using AlphaFold structures to model the apo state for ABFE.
Okay. Yes, if you're asking for my thoughts on that: if you don't have an experimental crystal structure, then for comparison's sake I think you can take the AlphaFold apo structure. But do we have proof that it gives accurate results with non-equilibrium alchemy? I don't think so. I remember reading some papers which show that you can model the system using AlphaFold and that it gives pretty accurate results, comparable to experiment. So I guess one can give it a try with non-equilibrium alchemy and see how it works. But indeed, there are reports showing that it performs well.
He also says that he would like to know to what extent we can account for differences between the two ligands in RBFE. I mean, how many atoms of difference can we have between the two ligands?
So, yeah, okay. That is one of the advantages of non-equilibrium alchemy: even if you have a larger difference, 20 atoms, 30 atoms, 40 atoms, you are doing the transformation in both directions, forward as well as backward.
And you have distributions of work from the forward as well as the reverse direction. So the free energy is always somewhere in between these two distributions, and we have shown that in the ABFE case that I'm showing here on this slide. Here the change in the number of atoms is really large; it even goes up to 30 or 40 atoms. But we still have results pretty close to experiment, overall within 5 kJ per mole. So in the RBFE case you can also have a perturbation of a large number of atoms. But you will have a problem with convergence, so the uncertainty of the prediction would be really high. The mean value, though, as per our experience from ABFE, would still be pretty close to experiment. Another issue to take into account is that when you have such a large perturbation and you still use the same approach for the RBFE calculation, you are assuming that both ligands bind to the protein with the same ligand pose and the same protein conformation, which may not be the case. That might have an effect on the overall accuracy of the calculation. But to answer your question directly, to what extent you can have a perturbation: I would say up to 30 to 40 atoms. It will have poorer convergence, but in our experience still pretty accurate calculated values.
Okay, thank you very much. So I close this session. I wish everybody a nice summer break. We will come back at the end of August or the beginning of September with a presentation from AstraZeneca about applications of AI. And now I thank you all for being here, and Sudarshan too for the talk. Thank you, I close this webinar session.
Thank you, everyone.
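The statement in the last answer that the free energy lies between the forward and reverse work distributions can be illustrated with the Bennett acceptance ratio (BAR), one estimator commonly applied to bidirectional non-equilibrium work values. The Python sketch below solves the BAR equation by bisection. It is a minimal illustration under stated assumptions: the function names are hypothetical, work values and kT are in kJ/mol, reverse work is the work measured along the reverse transition, and this is not presented as the exact analysis used in the studies shown in the talk.

```python
import math


def _fermi(x):
    """Numerically stable evaluation of 1 / (1 + exp(x))."""
    if x > 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def bar_free_energy(work_forward, work_reverse, kt=2.479, tol=1e-9):
    """Estimate dG (state A to B) from forward and reverse work values via BAR.

    work_forward: work values of A -> B transitions.
    work_reverse: work values of B -> A transitions (not sign-flipped).
    kt: thermal energy in the same units as the work values.
    """
    n_f, n_r = len(work_forward), len(work_reverse)
    m = math.log(n_f / n_r)

    def imbalance(dg):
        # Increases monotonically with dg; the BAR estimate is its root.
        lhs = sum(_fermi(m + (w - dg) / kt) for w in work_forward)
        rhs = sum(_fermi(-m + (w + dg) / kt) for w in work_reverse)
        return lhs - rhs

    # Bracket the root generously around all observed work values.
    lo = min(min(work_forward), -max(work_reverse)) - 50.0 * kt
    hi = max(max(work_forward), -min(work_reverse)) + 50.0 * kt
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if imbalance(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical work values in kJ/mol: forward work above dG,
    # negated reverse work below dG.
    print(bar_free_energy([6.0, 6.0], [-4.0, -4.0]))
```

With forward work values of 6 and reverse work values of -4, the estimate falls at the midpoint between 6 and the negated reverse work of 4. When the two distributions barely overlap, the estimate still falls between them, but its statistical uncertainty grows, which matches the convergence caveat given in the answer.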