Hello everyone, and welcome to the next edition of the BioExcel webinar series. My name is Rossen Apostolov and I will be the host of today's talk. Today we have Vania Calandrini, who will tell us about hybrid molecular mechanics/coarse-grained approaches to the modeling of proteins. Before we start, I have to let you know that this webinar is being recorded. We will publish the recording on the BioExcel website and on our YouTube channel, where you can watch it later at your convenience, and you are welcome to share it with your colleagues or friends.

First, a very brief overview of BioExcel, the European Centre of Excellence for Computational Biomolecular Research and the organizer of this series, for those of you who are not familiar with it. We work with several very important applications in the area of molecular modeling and simulation: GROMACS for MD simulations, HADDOCK for integrative modeling and docking, and CPMD for hybrid QM/MM methodology for enzymatic reactions. We also work a lot on usability, with the development of different workflows and the associated data integration. We use several platforms, and we work on a number of use cases and pilot projects that make it very easy for people to build on; you can read more about them on our website. We also put a lot of effort into training and consultancy. We have an extensive training program with various workshops, hands-on sessions, bring-your-own-workflow sessions and so on, which we believe will be of interest to the community. So have a look at our website for the upcoming events. Next year we have quite a few workshops that you might be interested in attending. One more activity is that we run a number of interest groups, which focus on specific areas in which we have expertise.
We have, for example, integrative modeling, focused on docking around HADDOCK, and free energy calculations, which are also very important nowadays for drug design. What is most probably of interest to you is the group on hybrid methods for biomolecular systems, where we look at the things you will hear about today. So visit our website, where you can subscribe to some of these interest groups. We also have forums, code repositories and a chat channel, which you are very welcome to use. One more thing: at the end of today's presentation we will have a question and answer session where you can speak directly with Vania and ask any questions you have regarding the material. I encourage you to write your questions in the questions section of the control panel on the right. Type your question there during the talk, and at the end I will give you the microphone to speak up. If you don't have a microphone, or if you have problems with the audio, I will read the question on your behalf. Later you can join the discussions on our forums at ask.bioexcel.eu.

And now it's my pleasure to introduce today's presenter, Dr. Vania Calandrini, who is a researcher at the Computational Biomedicine Institute at the Jülich Research Center in Germany. After she got her habilitation in physics at the University of Orléans in France, she continued working on the physicochemical processes shaping signaling at the sub-neuronal level. Now she is working on the development of models and computational methods describing internal protein dynamics and transport phenomena. So it's my pleasure to hand over to Vania to continue with the presentation.

Okay, so thanks, Rossen, for the introduction, and thanks also for this invitation.
So today I will talk about a hybrid molecular mechanics/coarse-grained (MM/CG) scheme, which has proved to be especially useful for characterizing ligand binding to proteins with unknown 3D structure and low sequence identity. This is the case for many big families of membrane proteins, such as, for instance, G-protein-coupled receptors. Actually, hybrid methods are in general the method of choice when both fast processes on a small spatial scale and slow processes on larger spatial scales are relevant, and you want to keep the number of degrees of freedom of the problem to a minimum. For instance, in the case of ligand-protein binding, coarse-grained approaches could a priori be suitable, but if you are interested in the intrinsic atomistic features of the recognition process, you need to maintain atomistic resolution at least in the binding site. So, exploiting the fact that the important high-resolution details are spatially localized, you can couple different resolutions while keeping the number of degrees of freedom to a minimum. But there is something more on top of this reduction of degrees of freedom. In this webinar I will try to show you that in some biologically relevant cases, such as membrane proteins, or GPCRs, which are a big family of membrane proteins, hybrid methods are really strategic, if not unique, because of the intrinsic characteristics of the problem, and not only because of the simple reduction of degrees of freedom. To target this important class of membrane proteins, our group started to develop a molecular mechanics/coarse-grained approach several years ago, and this is also the reason why I put here so many people who contributed to this evolution. I will describe the method in its current implementation, and I will also describe some recent developments that further improve the method in order to allow reliable estimation of ligand-protein binding energetics.
So if the structural determinants of a pharmaceutical target, a receptor or an enzyme, are experimentally known, or if you can make reliable structural predictions, you can usually get highly accurate ligand poses by using standard bioinformatics-based approaches: homology modeling and docking. However, when these conditions are not met, the application of these approaches is not so straightforward. This is the case, for instance, for membrane proteins, which represent as much as 60% of the proteins targeted by FDA-approved drugs. So why is it so difficult to apply standard approaches in the case of membrane proteins, and why can a hybrid molecular mechanics/coarse-grained approach be used to circumvent the problem? Actually, for membrane proteins we know the 3D structure in about 2.5% of the cases, and for GPCRs in particular in around 5%. Moreover, the average sequence identity between members of large families of these membrane proteins, such as G-protein-coupled receptors or voltage-gated ion channels, is often below 20%. This means that template selection and alignment for homology modeling are far from trivial. The probability of ending up with inaccurate side-chain orientations is quite high, and so docking protocols are very difficult to use in these conditions. Even running extensive plain MD simulations or enhanced sampling methods on different initial models may require a very long time to allow the side chains to relax to a correct free energy minimum. Moreover, this may again end up with wrong orientations, which affects the predictive power of the docking procedures. An alternative is represented by coarse-grained simulations, but again they cannot describe the atomistic features of the recognition between proteins and their ligands.
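To make the "below 20% sequence identity" figure concrete, the following is a minimal Python sketch of how a pairwise percent identity can be computed from two already-aligned sequences. It is an illustration only, not the procedure used in the talk: the function name `percent_identity`, the toy sequences, and the choice of denominator (columns where neither sequence has a gap) are assumptions, and other identity definitions normalize differently.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two aligned sequences.

    Assumption: gaps are written as '-' and columns containing a gap in
    either sequence are excluded from the comparison.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Toy alignment, made up for illustration (not a real receptor pair).
    target = "ACDE-"
    template = "ACDFG"
    print(f"identity: {percent_identity(target, template):.1f}%")
```

With such a measure, a target/template pair scoring under roughly 20% falls in the regime the talk describes as problematic for template selection and alignment.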
So this was for us the main motivation to develop a hybrid molecular mechanics/coarse-grained approach, specifically conceived to predict ligand poses in membrane proteins with low-resolution models. In this scheme, the atomistic molecular mechanics (MM) region corresponds to the region of interest, for instance the receptor binding site along with the ligand and the solvent around it. In the coarse-grained (CG) region, far from the binding site, each residue is represented by a single CG bead centered on the C-alpha atom. In order to ensure backbone integrity, there is an intermediate region (I) that couples the two regions at different resolutions. In this way, we can get rid of potentially wrong information coming from the atomistic modeling of side chains. This is the potential energy function of the protein frame. E_MM is the potential energy function of the molecular mechanics region. E_CG is the potential term for the coarse-grained region. E_I is that of the interface bridging the two resolutions. E_MM/I and E_CG/I describe the interaction energies between the interface and the molecular mechanics region, and between the interface and the coarse-grained region. E_MM, E_I and E_MM/I are currently represented by the GROMOS96 force field. E_CG and E_CG/I are represented in terms of a Go-like model, let's say. Specifically, you see there is this term that represents the bonded interaction between consecutive beads, and this part represents the non-bonded interaction in terms of a Morse potential. Then you see here the E_CG/I bonded term that ensures the integrity of the protein backbone. Here you have the bonded interactions between CG beads and the consecutive C-alpha atoms in the interface, as well as the non-bonded interactions between the CG beads and the C-alpha and C-beta atoms in the interface. In the first implementation, we tested this potential on two cytosolic proteins, the HIV type 1 aspartic protease.
And the human beta-secretase BACE. So before moving to receptors, to membrane proteins, we first tested this scheme on cytosolic proteins, and we tuned the parameters, the cut-off and the thickness of the interface region, in order to reproduce the root-mean-square fluctuations calculated with plain MD simulations. We further tested the accuracy of the model by calculating the covariance matrix of the C-alpha position fluctuations. You see here that if you calculate the eigenvectors from the plain MD simulations and from the MM/CG simulations, the most relevant eigenvectors almost coincide, which means that with MM/CG we are essentially able to recover the main features of the dynamics. And then, okay, we are very happy with this, but in order to properly model membrane proteins we need something more. Specifically, we have to take into account the solvation around the binding site, and we also have to take into account the membrane-protein interactions. So we added five walls: two hemispheres at the extracellular and cytoplasmic ends of the protein to prevent water evaporation (P3 and P4 in the sketch), two planar walls at the lipid heads, and a wall enveloping the transmembrane portion of the protein to mimic the membrane. Specifically, water evaporation from the atomistic region around the binding site is prevented by a softened Lennard-Jones-like potential whose minimum is at a distance r_p from the hemispherical boundary of the water droplet. Water penetration into the transmembrane space is prevented by a repulsive potential proportional to 1/d, where d is the distance from the planar walls. And the protein-membrane interaction is mimicked by a softened Lennard-Jones-like potential with its minimum at a distance r_p from the virtual wall enveloping the transmembrane portion of the protein.
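The validation described above (root-mean-square fluctuations plus a comparison of the leading eigenvectors of the C-alpha covariance matrix) can be sketched in Python with NumPy. This is a hedged illustration, not the analysis code behind the talk: the array layout `(n_frames, n_atoms, 3)`, the assumption that frames are already superposed on a common reference, and the use of the root-mean-square inner product (RMSIP) as the measure of how well two sets of eigenvectors "almost coincide" are all assumptions made here.

```python
import numpy as np


def rmsf(traj: np.ndarray) -> np.ndarray:
    """Per-atom root-mean-square fluctuation.

    traj has shape (n_frames, n_atoms, 3) and is assumed to be already
    superposed on a common reference structure.
    """
    mean = traj.mean(axis=0)
    return np.sqrt(((traj - mean) ** 2).sum(axis=2).mean(axis=0))


def principal_modes(traj: np.ndarray, n_modes: int):
    """Leading eigenvalues/eigenvectors of the positional covariance matrix."""
    n_frames = traj.shape[0]
    x = traj.reshape(n_frames, -1)
    x = x - x.mean(axis=0)
    cov = x.T @ x / (n_frames - 1)
    evals, evecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:n_modes]
    return evals[order], evecs[:, order]


def subspace_overlap(modes_a: np.ndarray, modes_b: np.ndarray) -> float:
    """Root-mean-square inner product between two sets of column eigenvectors.

    Returns 1.0 when the two sets span the same subspace, 0.0 when orthogonal.
    """
    inner = modes_a.T @ modes_b
    return float(np.sqrt((inner ** 2).sum() / modes_a.shape[1]))


if __name__ == "__main__":
    # Synthetic stand-ins for C-alpha trajectories from two simulations.
    rng = np.random.default_rng(0)
    traj_md = rng.normal(size=(200, 10, 3))
    traj_mmcg = traj_md + 0.1 * rng.normal(size=traj_md.shape)
    _, modes_md = principal_modes(traj_md, 3)
    _, modes_mmcg = principal_modes(traj_mmcg, 3)
    print("RMSF (first atoms):", rmsf(traj_md)[:3])
    print("overlap of leading modes:", subspace_overlap(modes_md, modes_mmcg))
```

In practice the two trajectories would have to contain the same set of C-alpha atoms in the same order for the eigenvectors to be comparable.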
In this framework, we tested the model on the human beta-2 adrenergic receptor, which is a GPCR, a G-protein-coupled receptor, in complex with its inverse agonist S-carazolol, for which the X-ray structure of the complex and MD simulations are available. You see here the root-mean-square fluctuations and the comparison between plain MD simulations and MM/CG simulations. In gray are the residues corresponding to the molecular mechanics region, in blue the MD and in black the MM/CG, and you see there is a fair agreement, at least for the atoms in the molecular mechanics part. Still, there are some differences, which can be expected given the different force fields applied: Amber99 in the case of the MD simulation and GROMOS in the case of the molecular mechanics/coarse-grained simulation. You see here a snapshot of the binding site obtained from the MM/CG trajectory of the complex, together with the superimposed positions of the ligand along the trajectory. The structural determinants of the active site are well maintained. In this panel you see the distribution of the distances between the ligand and some key residues involved in binding it. Apart from this specific hydrogen bond, the agreement between plain MD and MM/CG is quite good. After the calibration on the beta-2 adrenergic receptor, we also applied our MM/CG scheme to other GPCRs, in particular the bitter taste receptors 38 and 46, and again the aim was to determine the binding poses of their agonists. This is a very challenging application because for these GPCRs we don't have any structural information and the sequence identity with the template is less than 20%.
Nevertheless, our MM/CG models were consistent with almost all the site-directed mutagenesis experiments available, and we were also able to find an intermediate binding site in addition to the standard binding site, which again is compatible with experiments. So let's say that, at least so far, we are quite confident that with this method we can determine at least the main structural features of the binding of ligands to these membrane proteins. But despite all the improvements to the system due to the addition of the hydration water, the membrane and so on, some artifacts may still be introduced in the solvent dynamics because of the boundary potential added to prevent water evaporation from the binding site. A sound alternative could be the implementation of dual-resolution modeling for the solvent, such as the so-called Hamiltonian adaptive resolution scheme (H-AdResS), which has recently been proposed by Raffaello Potestio and Kurt Kremer. Within this scheme, water is described with full atomistic resolution in the molecular mechanics region and with a coarse-grained representation outside it. The interface between these two subdomains again consists of a hybrid-resolution region where water molecules change their resolution on the fly, so they freely diffuse in and out of the molecular mechanics region, removing in this way possible artifacts introduced by the confining potential. To couple the atomistic and the coarse-grained potentials at the interface, H-AdResS interpolates the Hamiltonians of the two regions. We are very happy with this, because H-AdResS preserves the Hamiltonian framework by construction, which a priori allows rigorous ligand binding free energy calculations. So, as a step toward the implementation of this scheme in our MM/CG method, in collaboration with Raffaello Potestio we have applied the H-AdResS scheme for the first time to cytosolic proteins in water.
For this first application the protein is fully atomistic, so only the solvent is treated with the dual-resolution scheme based on H-AdResS. The aim of this work was to demonstrate the applicability of the method to water solvating complex biological systems, because so far the method had basically been tested only for water in water. For this we compared the structural and dynamical properties of proteins in solvent computed with both H-AdResS simulations and fully atomistic simulations. As test cases we used atox1 and cyclophilin, two globular proteins with some differences: the fold is a bit different for these two proteins. You see here in the scheme that d_AT indicates the radius of the atomistic region. We varied the thickness of the atomistic region in order to determine the optimal thickness of the molecular mechanics part, the one that correctly reproduces all the dynamical and structural properties of the protein and of the solvent as well. You see here the Hamiltonian of the system. You have the kinetic energy; then you have this term for the all-atom bonded and non-bonded interactions internal to the protein and the all-atom non-bonded protein-water interactions; and here you have this term that represents the hybrid non-bonded potential energy contribution of each water molecule. V^MM represents the SPC force field, so the atomistic force field, and V^CG the coarse-grained potential, which is derived from an independent all-atom simulation of pure water through an iterative Boltzmann inversion procedure. Then you have this correction term, called the Gibbs free energy compensation term. Before moving to the correction term, you see here in the hybrid part that there is this lambda function that smoothly couples the two levels of description, the molecular mechanics part and the CG part. You see here the function going from one in the molecular mechanics part to zero in the CG part.
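The role of the lambda function in the hybrid term can be sketched as follows for the Hamiltonian adaptive resolution scheme (H-AdResS). This is a minimal illustration under stated assumptions, not the implementation used in the talk: a spherical atomistic region of radius `r_at`, a hybrid shell of thickness `d_hy`, and a cosine-squared switching profile are choices made here for the example, and `hybrid_energy` only shows the per-molecule interpolation of an atomistic and a coarse-grained energy. The names `switching`, `hybrid_energy`, `r_at` and `d_hy` are hypothetical.

```python
import math


def switching(r: float, r_at: float, d_hy: float) -> float:
    """Resolution parameter lambda for a molecule at distance r from the center.

    Assumed profile: 1 inside the atomistic sphere (r <= r_at), 0 beyond the
    hybrid shell (r >= r_at + d_hy), smooth cos^2 decay in between.
    """
    if r <= r_at:
        return 1.0
    if r >= r_at + d_hy:
        return 0.0
    return math.cos(math.pi * (r - r_at) / (2.0 * d_hy)) ** 2


def hybrid_energy(lam: float, v_at: float, v_cg: float) -> float:
    """Interpolated non-bonded energy of one molecule: lam*V_AT + (1-lam)*V_CG."""
    return lam * v_at + (1.0 - lam) * v_cg


if __name__ == "__main__":
    r_at, d_hy = 1.6, 0.5  # nm; illustrative values only
    for r in (1.0, 1.6, 1.85, 2.1, 3.0):
        lam = switching(r, r_at, d_hy)
        # v_at and v_cg are placeholder energies for the demonstration.
        print(f"r = {r:4.2f} nm  lambda = {lam:5.3f}  "
              f"E = {hybrid_energy(lam, -40.0, -35.0):7.2f}")
```

The compensation term mentioned in the talk is not included in this sketch; it would be an additional lambda-dependent contribution per molecule.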
So this correction term contains basically two contributions: one that compensates the so-called drift force, which arises from the coupling of the coarse-grained and the MM Hamiltonians, and a second one that compensates for the so-called thermodynamic force. You see here some comparisons between all-atom simulations and H-AdResS simulations. The secondary structure elements are maintained, and if you look at the root-mean-square fluctuations there is a very good agreement. Also, if you look for instance at the power spectrum of the dipole-dipole time correlation function, again the agreement between all-atom and H-AdResS simulations is very good. Now, if you look at the solvent, you see here the radial density profile of water from the center of the atomistic region to the coarse-grained region. For the H-AdResS simulations the density profile is almost flat, so you have basically the same density in all regions of the system. And if you look at the mean square displacement of water, both for the bulk and for the hydration layers, again H-AdResS is in good agreement with all-atom MD simulations. So in general we do not see an important variation in the protein dynamics, flexibility or structure when reducing the radius of the atomistic, molecular mechanics part. The same is true for the structural properties of water and for its translational diffusion, whilst the reorientational dynamics of water is quite sensitive to the atomistic thickness. For instance, you see in this plot that for a thickness of the atomistic part of the order of 1.6 nanometers you have a discrepancy between MD and H-AdResS of the order of 10%, and of course if you decrease the thickness of the atomistic part this discrepancy increases more and more.
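The radial density profile of water mentioned above can be estimated from a single configuration by counting molecules in concentric spherical shells and dividing by the shell volume. The following Python sketch is an assumption-laden illustration rather than the analysis used in the talk: the function name `radial_density`, the use of one position per water molecule, and the absence of periodic boundary handling are all choices made here. A flat profile across the atomistic, hybrid and coarse-grained regions is the behaviour the talk reports for the dual-resolution water.

```python
import numpy as np


def radial_density(positions: np.ndarray, center: np.ndarray,
                   r_max: float, n_bins: int):
    """Number density in concentric spherical shells around `center`.

    positions: (n_molecules, 3) array, one point per water molecule
    (for example the oxygen or the center of mass; an assumption here).
    Returns the shell midpoints and the number density in each shell.
    Molecules farther than r_max from the center are ignored.
    """
    dist = np.linalg.norm(positions - center, axis=1)
    edges = np.linspace(0.0, r_max, n_bins + 1)
    counts, _ = np.histogram(dist, bins=edges)
    shell_volume = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    midpoints = 0.5 * (edges[1:] + edges[:-1])
    return midpoints, counts / shell_volume


if __name__ == "__main__":
    # Synthetic, uniformly distributed points standing in for water molecules.
    rng = np.random.default_rng(1)
    box = 6.0  # nm, illustrative
    pos = rng.uniform(0.0, box, size=(20000, 3))
    mid, rho = radial_density(pos, np.full(3, box / 2), r_max=3.0, n_bins=6)
    for r, d in zip(mid, rho):
        print(f"r = {r:4.2f} nm  density = {d:6.1f} nm^-3")
```

For a production analysis the profile would be averaged over many frames, and the periodic box would have to be taken into account.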
So the next step for us is to implement this H-AdResS scheme together with our MM/CG potential for membrane proteins. This is part of the work of one of our PhD students, Thomas Tarenzi, and it will be done in collaboration with Raffaello Potestio. So thank you for your attention, and I think now I have to move to the question and answer session.

Yes, thank you, Vania, for the very nice presentation. Now we start our question and answer session with the listeners. So everyone, please use the control panel: there is a section where you can type your questions, and you can also raise your hand to ask your question directly. The first question we have is from Luca Maghi, and I will now see if we can hear each other. Luca, could you say something? I'm not sure whether your audio works.

Okay, can you hear me?

There you are, yes. Your microphone is on now, so you can ask your question directly.

Hi Vania, thank you for your presentation. I was wondering if you are planning to use a different force field in the MM/CG scheme.

Yeah, this is one of the future developments. Specifically, we are working on implementing the Amber force field. So yes.

Yeah, okay. I found that with ligands it is sometimes very difficult to use GROMOS because you have problems.

Yeah, absolutely. Amber seems to be the standard, at least for this kind of problem, and this is for us the main motivation to implement it. Actually, we are already working on this and we are doing preliminary tests right now. And of course all the parameters of the force field have to be tuned for the coarse-grained part in order to couple it in the proper way with the atomistic region, let's say.

Okay, thank you.

Thank you, Luca. Our next question is from Alberto Duipietre. Unfortunately his audio doesn't work, so I will read the question on his behalf. He would like to know if you could comment on the correction for the thermodynamic force.
Yes, so let's say that with the first correction you remove the drift force on average. But you cannot, let's say, remove the density imbalance originating from the different virial pressures in the CG and MM subsystems. In terms of the grand potential, pV, this means that for identical volumes the grand potential is different in the CG and in the MM part, while the compressibility, for instance, remains unchanged, because you tune the CG potential to match the radial distribution function of the atomistic water. So by construction the compressibilities are unchanged, but this is not true for the virial pressure. In order to enforce a uniform density you therefore have to introduce a compensating force, this second term. Basically you can show that the integral of this force across the hybrid region, so the work performed by this force on a molecule crossing the hybrid interface, can be approximated by the ratio between the local pressure profile and the reference target density. So this is the origin of this second correction term and of its form.

Thank you. Alberto, if you have further comments, please write them in the question box. Our next question is from Anna Pociccio, if I pronounce it correctly. So Anna, let's see. Now you will be able to ask your question directly.

Hello, can you hear me?

Yes.

Okay. So thank you for the nice talk. I have a curiosity, let's say. When you tested the MM/CG on the beta-2 adrenergic receptor, you started from the X-ray structure, if I'm not mistaken.

Right.

So the ligand is already in the good binding pose. But did you test what might happen when you start from a wrong binding pose or conformation?

Yes. We also did some tests in this direction. We started with, let's say, wrong binding poses, with the ligand very far from the binding site, and on average after 100 to 200 nanoseconds the ligand migrates to the good binding pose.
So let's say that we are quite confident that, even starting from a quite wrong configuration, we are more or less able to recover the correct binding poses.

Okay. Thank you.

Thank you. And we have a question from John. Okay, so you can speak up. Can you hear us? Okay, maybe there is some problem with the audio. I will read the question on your behalf. John would like to learn more about the main applications of this hybrid method. In other words, which biological processes does it allow us to study, in comparison with molecular mechanics codes alone?

This is especially interesting. In general, as I said, hybrid molecular mechanics/coarse-grained methods are interesting, for instance, to reduce the number of degrees of freedom if you need high resolution only in a specific region of your system. Assuming that you have a good way to couple the high-resolution region with the coarse-resolution region, this is a good way, let's say, to reduce the degrees of freedom. So a priori you could increase the size or the complexity of your system, just because on the other side you are reducing the degrees of freedom of the system. For us, in the case of GPCRs, the interest in using these hybrid methods, in particular the molecular mechanics/coarse-grained one, is intrinsic to the system. Because, as I said, for most of these proteins we don't know the 3D structure, and if you try homology modeling, the sequence identity is very low. So the models that you can obtain with standard bioinformatics approaches are, let's say, not reliable. For instance, in our institute we also tried to run standard plain MD simulations starting from these models. But it is almost impossible to infer any information, because you may experience unfolding or very strange behavior, since the orientation of the side chains may be wrong.
And so it's really difficult to converge to the good orientation, to some meaningful free energy minimum. But on top of this, in most cases you really see unfolding, so very strange behavior. Instead, with this MM/CG approach, let's say that we remove all the potentially wrong information that is not essential to the problem. So we keep the minimum amount of information. For instance, in the part far from the binding site we use a coarse-grained representation just to have the essential flexibility and to transmit this flexibility to the binding site, but no more than this, because basically this is all we need. At least so far this has been the best strategy, let's say, in our experience, to address ligand binding problems for GPCRs or membrane proteins with unknown 3D structures and low sequence identity.

Thank you, Vania. I hope this was sufficient. If you have additional questions, please write them in the box. Vania, I have a question myself: what are the maximum sizes of the systems that you have worked with? What is the limit of the method, given the computers that we have at the moment?

So far we don't have any special limitation, in the sense that this MM/CG approach is embedded in GROMACS, so the engine is GROMACS. So it performs quite well, let's say, with at least the same performance as GROMACS. The size of our systems, at least so far, is quite small, because we started with the binding of small ligands to GPCRs. But in principle you can imagine increasing the size or the complexity of your systems along the same lines, adding other proteins, or larger ligands, or whatever.

Thanks. One of our participants, Dikeos Mario, is raising his hand, so can we hear each other? Would you like to ask a question?

Hello. Can you hear me?

Yes. Yes, very good.

Oh, that's great. It's the first time I use this, so I'm a little bit confused.
Well, thank you very much for this very interesting talk. I finally see, after decades, that people realize what I was doing in the 80s, namely the coarse-grained business. Because you can't be the world policeman that follows all the motions of all the particles in the world. You have to represent the solvent, especially in the biggest part of the system, by coarse-graining. And I introduced the PMF approach for that, which explains things that have not been explained by MD, by all-atom MD, up to date. So in '84 I predicted, with a very rough structural model for ions and water, the transition between right-handed DNA and left-handed DNA, the B-Z transition. Nobody working in the area of the brute-force simulations that are pervading all science nowadays has ever been able to predict this, because their DNAs disappear. They are polyions and they dissociate. You cannot hold them together. But they keep supporting the brute-force methods ad infinitum and forget everything that has been done in the last 30 years, and now fortunately young people come in and they start to refine this business. And a final comment I have is that you don't need the coupling with the grand canonical ensemble and the in-between region and all the stuff. You don't need all this if, a big if, the relaxation times involved in relaxing the water, which are obtainable from nuclear spin resonance or whatever, are fast compared to your motions. Then you just need my PMF approach, nothing else, for the water, for the ions, for whatever you have. I'm retired, I don't have a group, I don't have funding, but I am still doing active science. If you are interested in collaborating with me, please contact me.

Yes, yes, of course. You can leave it in the chat, or maybe, no, you also have my email address, so you can send me your article, and of course you can contact me.
No, it's not an article, it's 60 articles. You know, it's impossible. People do not know the literature from before the introduction of scanning and all the stuff. If you want to collaborate with me, I would very much like to do this, because I see that finally my concepts are being realized, at least partly, but in a very obscure and awkward way. You don't need all this. You will gain a lot by contacting me, believe me. My name is Dikeos Mario Soumpasis, Soumpasis, S-O-U-M-P-A-S-I-S. You can find me on ResearchGate, you can find me on Google, you can find me in some places, but I'm not connected with any groups. I'm just doing my work alone now because I'm retired.

Okay, it would be a pleasure. Could you please repeat the spelling of your name?

My name is S, S like Siena: S-O-U-M-P-A-S-I-S, S-O-U-M-P-A-S-I-S.

I have the name Dikeos Mario here.

Oh, correct.

I can send you both an email to put you in touch.

Thank you very much for the talk, though, because this question of the relaxation times is very important for me for other reasons too. You will try to do this with the grand canonical ensemble; I know the stuff, you know. So if you have from your simulations some data on the times involved: what are the characteristic times of your ligand, of your protein sites, compared to the water distribution relaxation times?

At least for the hydration water, I think that they are not so different, because for the hydration water you have these sub-diffusive phenomena, so the timescale can be very, very slow. But at least so far we are mainly testing, let's say, the structural properties of the system. Of course the next step is to characterize the dynamics and to compare in a quantitative way.
But of course I get the point. Of course, with a big if, if the timescales are well separated, let's say, you don't need all this relation with different ensembles.

That's it, you know. So this was the strategy I introduced, as I told you, at the beginning, in 1984, for all biopolymers, whether they are polyions, whether they are DNA, proteins or whatever you have. You can use it in docking, you can use it for protein interactions, everywhere, and it gives you very good results compared to experiments.

Yeah, so I don't know this method, I have to admit, so I have to go to the literature.

Please take a little bit of time to look at it, and then you can contact me on ResearchGate or LinkedIn.

Thank you, this is a very interesting discussion. I'm afraid we are approaching the hour, with which we will finish our webinar. I want to remind you of our forums at ask.bioexcel.eu, where you are very welcome to post any questions and continue with topics of your interest. And yes, thank you again, Vania, for the great talk.

Thanks for the invitation.

Yes, and everybody have a great time over the upcoming holidays. Next year we will prepare a bunch of very exciting talks in our webinar series, so stay tuned. We will add you to our newsletter mailing list so that you can learn about the new talks that might be very interesting for you and your colleagues. So with this we are finishing, and we will get together again next year. Have a good evening, everybody.

Thank you.

Thanks. Good evening.